Bug 1114113 - the latest LVM2 can overwrite extents beyond metadata area
Classification: openSUSE
Product: openSUSE Tumbleweed
Component: Basesystem
Hardware: Other All
Priority: P3 - Medium, Severity: Normal
Assigned To: Gang He
Reported: 2018-10-31 07:49 UTC by Gang He
Modified: 2018-12-22 07:36 UTC (History)
Description Gang He 2018-10-31 07:49:46 UTC
I received a bug report from the upstream developer. According to him, in the latest versions LVM2 can overwrite extents beyond the metadata area:

A critical bug has been found in LVM which can cause data corruption in rare cases.  Avoid using LVM commands that change Volume Group metadata (e.g. lvcreate, lvextend) while LVs in the VG are being used.

An I/O bug can cause LVM commands to read and write back the first 128KB of data immediately following the LVM metadata at the start of the disk.  If these blocks are being modified by another program (or file system) at the same time as the LVM command, then that program's changes could be lost.
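The lost update described above can be illustrated with a small in-memory simulation. This is only a sketch of the read-then-write-back race, not LVM code: the fake disk layout and the 1 MiB metadata area size are assumptions made for the example, and only the 128 KiB figure comes from the report.

```python
# Illustrative simulation only; it does not touch any real device.
METADATA_AREA = 1024 * 1024   # assumed metadata area size (hypothetical layout)
OVERRUN = 128 * 1024          # 128 KiB following the metadata area (from the report)


def simulate_lost_update():
    """Return the bytes another program finds after the stale write-back."""
    # Fake disk: the metadata area followed by the start of the data extents.
    disk = bytearray(METADATA_AREA + OVERRUN)
    data_start = METADATA_AREA

    # Step 1: the buggy metadata I/O also reads the region past the metadata area.
    stale_copy = bytes(disk[data_start:data_start + OVERRUN])

    # Step 2: meanwhile another program (or file system) updates that region.
    disk[data_start:data_start + 4] = b"NEW!"

    # Step 3: the metadata I/O writes its stale copy back, undoing step 2.
    disk[data_start:data_start + OVERRUN] = stale_copy

    return bytes(disk[data_start:data_start + 4])


if __name__ == "__main__":
    # The update made in step 2 is gone; the region holds the old contents.
    print(simulate_lost_update())
```

If nothing modifies the region between the read and the write-back, the stale copy matches what is on disk and no data is lost, which is consistent with the advice to avoid metadata-changing commands while LVs in the VG are in use.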

A fix is being evaluated and will be provided as soon as possible.

More details can be found at the link 
Comment 1 Gang He 2018-10-31 07:51:01 UTC
The reproducible LV corruption I was seeing was from the following:

Set up and start sanlock and lvmlockd.

vgcreate --shared --metadatasize 1m foo /dev/sdg

(Note that this creates an internal "lvmlock" LV that sanlock uses to store leases.)

Use lvcreate to create 500 inactive LVs in foo.

vgremove foo

During the vgremove step, sanlock notices that its updates to the internal "lvmlock" LV are periodically lost.  It's because when vgremove writes metadata at the end of the metadata area, it also clobbers PEs that were allocated to the lvmlock LV.  (sanlock reads/writes blocks to the lvmlock LV and notices if data changes out from under it.)

It should be straightforward to reproduce this same issue without lvmlockd and sanlock.  Create an ordinary VG, create an initial small LV (that uses the first PEs in the VG), and start a script or program that reads/writes data to that LV and verifies that what it wrote comes back again.  Then start creating 500 other LVs in the VG, and removing those 500 LVs.  This causes the LV metadata to grow large and wrap around the end of the metadata area.  When LVM writes to the end of the metadata area, it will clobber data that the test program wrote, and the test program should eventually notice that its last write is missing.
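A minimal sketch of such a write-and-verify test program might look like the following. It is an assumption-laden illustration, not the program used upstream: the function name, the `pause` hook, and the choice to check only the first 128 KiB of the target are all hypothetical. On a real block device, reads may be served from the page cache, so direct I/O would probably be needed to observe the corruption reliably; the sketch omits that for simplicity.

```python
import os

REGION = 128 * 1024  # bytes checked at the start of the target (size taken from the report)


def write_and_verify(path, rounds=1, region=REGION, pause=None):
    """Write a random pattern to the start of `path`, read it back, and compare.

    Returns the index of the first round in which the data read back differs
    from the data written, or None if every round verified correctly.
    `pause` is an optional callable run between the write and the read, for
    example a short sleep that gives other commands time to run.
    """
    fd = os.open(path, os.O_RDWR)
    try:
        for n in range(rounds):
            pattern = os.urandom(region)
            written = os.pwrite(fd, pattern, 0)
            if written != region:
                raise OSError("short write: %d of %d bytes" % (written, region))
            os.fsync(fd)  # push the write out before checking it
            if pause is not None:
                pause()
            if os.pread(fd, region, 0) != pattern:
                return n  # our last write is missing or was altered
        return None
    finally:
        os.close(fd)
```

It could be pointed at the small initial LV (for example a hypothetical `/dev/foo/testlv`) with a large `rounds` value and `pause=lambda: time.sleep(0.1)` while the 500 other LVs are created and removed in another terminal.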
Comment 2 Gang He 2018-10-31 07:56:36 UTC
Based on the reproduction description, I think this bug is triggered in the sanlock + lvmlockd case. So far we do not claim to support sanlock, so the priority of this bug is not very high, but I will backport the patch soon.

Comment 3 Gang He 2018-11-05 03:23:56 UTC
The final fix is available here: https://sourceware.org/git/?p=lvm2.git;a=commit;h=ab27d5dc2a5c3bf23ab8fed438f1542015dc723d

I will backport it to SLE12SP4 and Tumbleweed.
Comment 6 Gang He 2018-11-06 02:48:00 UTC
I have submitted the code change to open::factory, sle12sp4update and sle15sp1 code branches.
Comment 10 Gang He 2018-12-14 08:17:46 UTC
The patch is now in the related code branches; closing this bug.
Comment 11 Swamp Workflow Management 2018-12-21 20:15:18 UTC
SUSE-RU-2018:4226-1: An update that has two recommended fixes can now be installed.

Category: recommended (moderate)
Bug References: 1110872,1114113
CVE References: 
Sources used:
SUSE Linux Enterprise Software Development Kit 12-SP4 (src):    lvm2-2.02.180-9.4.2
SUSE Linux Enterprise Server 12-SP4 (src):    lvm2-2.02.180-9.4.2
SUSE Linux Enterprise High Availability 12-SP4 (src):    lvm2-2.02.180-9.4.2
SUSE Linux Enterprise Desktop 12-SP4 (src):    lvm2-2.02.180-9.4.2