Bug 1080485 - dracut: device mapper cannot be disassembled on shutdown

Status: RESOLVED UPSTREAM
Duplicates: 1116154
Classification: openSUSE
Product: openSUSE Tumbleweed
Component: Basesystem
Version: Current
Hardware/OS: Other / Other
Priority: P5 - None
Severity: Normal (2 votes)
Assigned To: heming zhao
Blocks: 1172006
Reported: 2018-02-11 17:37 UTC by Achim Gratz
Modified: 2020-07-04 21:21 UTC
CC: 16 users

Attachments
  /proc/cmdline (138 bytes, text/plain) - 2018-05-11 12:46 UTC, Michiel Janssens
  fstab (1.80 KB, text/plain) - 2018-05-11 12:47 UTC, Michiel Janssens
  rdsosreport_preshutdown (16.61 KB, text/plain) - 2018-05-11 12:47 UTC, Michiel Janssens
  rdsosreport_shutdown (16.60 KB, text/plain) - 2018-05-11 12:48 UTC, Michiel Janssens
  systemd_debug_shutdown-log (1008.85 KB, text/plain) - 2018-05-11 12:48 UTC, Michiel Janssens
  Screenshot showing the error messages (25.81 KB, image/png) - 2018-08-22 15:32 UTC, Neil Rickert
  serial console log (16.87 KB, text/plain) - 2018-08-24 14:28 UTC, Neil Rickert
  Serial console log (second try) (135.42 KB, text/plain) - 2018-08-25 13:18 UTC, Neil Rickert

Description Achim Gratz 2018-02-11 17:37:13 UTC
For a few months now, dracut has been spewing about two pages of errors/warnings on
shutdown.  Yesterday I halted the system instead of powering off so I could
read them, and it seems that the unmount of /oldroot fails and then
the disassembly of the device-mapper devices errors out (there may have been
other errors before that which had already rolled off the screen).

Interestingly, I sometimes get far fewer of these errors when a kernel update has just happened beforehand.
Comment 1 Neil Rickert 2018-02-11 21:18:31 UTC
I am probably seeing the same errors.  I haven't reported them, because they go past too quickly to grab a copy.  I've tried to reproduce this in a KVM virtual machine, but without success.

I don't worry much about it, because the file systems check out as clean.  So nothing is being corrupted.  Maybe it is trying to disassemble something that does not need to be disassembled (i.e. could just be abandoned).

I get this on both Tumbleweed and Leap 15.0.  And occasionally I see a clean shutdown after updates.
Comment 2 zhen ren 2018-02-23 07:10:55 UTC
Hi,

Thanks for your report and input. I am just back from holiday.

@Achim, can you reproduce this problem?
Comment 3 zhen ren 2018-02-23 07:12:16 UTC
@Achim, can you provide any screen snapshots or log messages?
Comment 4 zhen ren 2018-02-23 07:14:48 UTC
Hi Martin,

If you don't have objections, I tend to always add you to the CC list for device-mapper problems :)
Comment 5 Martin Wilck 2018-02-26 16:47:01 UTC
Can we get some more details please? Maybe serial console output, or logs retrieved via "debugging dracut on shutdown"?

(see https://www.kernel.org/pub/linux/utils/boot/dracut/dracut.html)
Comment 6 Achim Gratz 2018-02-26 19:06:09 UTC
I can't easily use a serial console on that machine right now as the port is used by a GPS (for timing).  I've tried to use the netconsole before, but the output cuts off when the system tears down the network (I think), so it's no use in this case.  The system gets its IP over DHCP, so it's not entirely unexpected that it would fall off the network when that gets shut down.

As I wrote on the factory mailing list in July last year, here's what I gleaned from shutting down into the dracut shell.  The last few lines from dracut are:

dmsetup ls --tree
  system-root (254:0)
  `- (8:2)

...which probably tells me the same thing as dracut not being able to
unmount /oldroot.  I am also seeing it being unable to unmount /var/log (that error is still in the journal).
Comment 7 Neil Rickert 2018-03-30 13:10:23 UTC
I'll note that comment 5 of bug 1083392 seems relevant.
Comment 8 Martin Wilck 2018-04-04 15:18:23 UTC
I don't think we can do a lot here unless we get more information.

Could you repeat the step from comment 0 and send screenshots of the messages you observe?

Otherwise I think I'll have to close this.

Wrt bug 1083392, that's almost certainly something else, and bug 1083392, comment 5 (although more closely related to this bug here than the rest of that bug) adds very little useful information.
Comment 9 Ronnie Bailey 2018-04-05 11:57:00 UTC
There are 2 primary errors on my system. I had to shoot a video to be able to catch and read them. There may be 60 or more repeats over a half second time period. I should note that this happens 80% of the time. So there are times when shutdown is without errors. 

Error 1: device-mapper: remove ioctl on system-root failed: Device or resource busy  (system-root is my encrypted LVM Luks root partition)

Error 2: device-mapper: remove ioctl on cr_scsi_hard_drive_partition_containing_root

My /home is also a partition within the same LVM, but never produces errors.

SLES 11 SP4 had a problem that produced the same error in 2016, but it's possible that the cause is not identical. The SUSE document is 7017390.
Comment 10 Ronnie Bailey 2018-04-05 12:00:21 UTC
I had a typo in my previous post.

Error 2 should be: Error 2: device-mapper: remove ioctl on cr_scsi_hard_drive_partition_containing_root failed: Device or Resource busy
Comment 11 Martin Wilck 2018-04-05 15:04:45 UTC
Could you try this:

(https://mirrors.edge.kernel.org/pub/linux/utils/boot/dracut/dracut.html#debugging-dracut)

# mkdir -p /run/initramfs/etc/cmdline.d
# echo "rd.debug rd.break=pre-shutdown rd.break=shutdown" > /run/initramfs/etc/cmdline.d/debug.conf
# touch /run/initramfs/.need_shutdown

You should now be able to watch better what's going on.
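At each break point you should land in a dracut shell where you can inspect the state, for example (just a sketch; the prompt is illustrative):

shutdown:/# rdsosreport
shutdown:/# dmsetup ls --tree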

> remove ioctl on system-root failed

Once we're back in the initramfs, shutting down the root LV should work - unless it's still in use by some process. This is only a real problem if the situation persists (e.g. because a file system hasn't been unmounted).

bug 1083392 is quite interesting in this respect. It seems to be quite common that such "remove ioctl" errors occur transiently, and succeed after a few retries. The error messages are printed nonetheless. If it's just that, we may want to simply silence the error message and only print a message after the last retry has failed.
Comment 12 Achim Gratz 2018-04-08 13:13:18 UTC
Well, I don't know how to make screenshots on a system that has no connection to the outside world anymore.  Besides, one problem is (as said in the initial mailing list thread) that there are so many messages that the trigger has scrolled off the screen and I can't scroll back on the dracut console (or at least not far enough).

So if anybody can tell me how to create a netconsole that is active throughout the boot/shutdown process I can debug things from a different box, otherwise that'll have to wait until I can repurpose the serial temporarily.
Comment 13 Martin Wilck 2018-04-09 08:37:12 UTC
Please try setting STARTMODE=nfsroot for the network connection that is used for netconsole.
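For example (a sketch; the interface name eth0 and the DHCP setup are assumptions, adjust them to your system):

/etc/sysconfig/network/ifcfg-eth0:
  BOOTPROTO='dhcp'
  STARTMODE='nfsroot'

With STARTMODE='nfsroot', wicked should not take the interface down during shutdown, so netconsole output should keep flowing.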
Comment 14 Ronnie Bailey 2018-04-09 08:49:37 UTC
  In my case, the last suggestion in comment 11 may work. I used to get shutdown time-outs of 90 seconds while the system was waiting for TeamViewer (3rd party) to shut down. I no longer get that error (possibly due to updates), but it is possible that TeamViewer is what is causing the delay in releasing root. 
  In another forum, someone suggested allowing the first device-mapper error to post, then remaining silent unless there is a total failure in unmounting.
  Or if there were a way to add a 2 second delay to allow more time to complete the release of root, that may help.
Comment 15 Martin Wilck 2018-04-19 09:08:08 UTC
Affected people, please check if /var/run and /var/lock on your systems are directories (bind mounts) or symlinks. The former may be the case if the system is an old installation with a long history of being updated. The latter would be the case on new installations.

If they are directories, please move them away (this will require an umount -l first) and create symlinks:

/var/run -> /run
/var/lock -> /run/lock

I've recently been debugging a similar case where this workaround helped.
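A minimal sketch of that check and workaround (the *.old names are placeholders, pick whatever you like):

# ls -ld /var/run /var/lock
# umount -l /var/run /var/lock
# mv /var/run /var/run.old
# mv /var/lock /var/lock.old
# ln -s /run /var/run
# ln -s /run/lock /var/lock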
Comment 16 Christoph Obexer 2018-04-20 07:01:41 UTC
(In reply to Martin Wilck from comment #15)
> Affected people, please check if /var/run and /var/lock on your systems are
> directories (bind mounts) or symlinks.
Both are symlinks on my system but I still experience the disassembly problem. My Symlinks are apparently from 28. June 2017.
Comment 17 Martin Wilck 2018-04-23 08:33:29 UTC
(In reply to Christoph Obexer from comment #16)
> Both are symlinks on my system but I still experience the disassembly
> problem. My Symlinks are apparently from 28. June 2017.

Ok, so my guess was wrong.

Maybe we can make progress if someone could gather logs as described under
"Shutdown Completes Eventually" on https://freedesktop.org/wiki/Software/systemd/Debugging/.
Comment 18 Michiel Janssens 2018-05-11 12:46:34 UTC
Created attachment 769904 [details]
/proc/cmdline
Comment 19 Michiel Janssens 2018-05-11 12:47:18 UTC
Created attachment 769906 [details]
fstab
Comment 20 Michiel Janssens 2018-05-11 12:47:49 UTC
Created attachment 769907 [details]
rdsosreport_preshutdown
Comment 21 Michiel Janssens 2018-05-11 12:48:12 UTC
Created attachment 769908 [details]
rdsosreport_shutdown
Comment 22 Michiel Janssens 2018-05-11 12:48:46 UTC
Created attachment 769909 [details]
systemd_debug_shutdown-log
Comment 23 Michiel Janssens 2018-05-11 12:57:19 UTC
As I was also getting these shutdown messages, I tried to get some logging info as requested; see the attached files.
I hope this will be helpful in determining whether the messages should be suppressed or not.
Comment 24 Ronnie Bailey 2018-07-31 12:38:41 UTC
  I had opened a thread in the openSUSE Tumbleweed forum regarding this problem. A user has found a workaround that may at least give a hint as to what is going on. 
 It was posted that: "Adding "plymouth.enable=0" to the kernel parameters makes the "device-mapper: remove ioctl on XXXXX failed: Device or resource busy" messages disappear."
 I have no idea if Plymouth is the problem or just a direction to look. I can verify that adding the above kernel parameter has stopped the error in my Tumbleweed machine. My original thread may be found at:

https://forums.opensuse.org/showthread.php/530530-device-mapper-remove-ioctl-on-system-root-failed
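For reference, one way to make that parameter persistent on a GRUB2-based install (a sketch; "..." stands for whatever is already in that variable on your system):

In /etc/default/grub:
  GRUB_CMDLINE_LINUX_DEFAULT="... plymouth.enable=0"
Then regenerate the config:
  # grub2-mkconfig -o /boot/grub2/grub.cfg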
Comment 25 José Díaz 2018-07-31 13:23:54 UTC
Same as Comment 24, I confirm the error messages disappear when disabling plymouth in a laptop running Leap 15.
Comment 26 Martin Wilck 2018-08-22 08:25:11 UTC
My guess would be that disabling plymouth only hides the messages.
Comment 27 Neil Rickert 2018-08-22 15:32:48 UTC
Created attachment 780445 [details]
Screenshot showing the error messages

Something changed with snapshot 20180818

I am now seeing these errors on every shutdown.  This is with a KVM virtual machine, where I previously could never reproduce the problem.  And I do have "plymouth.enable=0".

Responding to c#26
>My guess would be that disabling plymouth only hides the messages.

I disagree with that.  My guess is that plymouth is using something from the root file system, perhaps a dynamic ".so" library.

But now it is happening even with plymouth disabled.  So some other component is now using something from the root file system.

NOTE: I did check.  I booted to rescue mode (with installer DVD), and an "fsck" on all file systems showed them to be clean.

My personal opinion:  this is all a mistake.  The system should not try to disassemble the device-mapper setup.  It should just reset the system and abandon the current setup.  What makes up the device-mapper setup is all in volatile storage, so it goes away on reset.  No need to tear it down.  What's important is that everything has been remounted read-only (or has been unmounted).
Comment 28 Martin Wilck 2018-08-22 16:34:37 UTC
(In reply to Neil Rickert from comment #27)
> I am now seeing these errors on every shutdown.  This is with a KVM virtual
> machine, where I previously could never reproduce the problem.  And I do have
> "plymouth.enable=0".

IIUC you see these messages for a while, then they stop and the reboot succeeds (how long? how many messages?)

If actually some library from the root FS was used by some process running outside of it, then the error should persist and the remove ioctl should consistently fail, causing the shutdown process to time out.

> My personal opinion:  this is all a mistake.  The system should not try to
> disassemble device mapper setup.

I'm not sure where this happens. I guess it's the stop job of "blk-availability.service". Could you try to mask that service and see if the problem disappears?

Wrt this being a "mistake", the blkdeactivate script was introduced to LVM by Red Hat in 2012. I didn't find an explicit rationale, but as they're unlikely to have written that code without a purpose, I guess they were seeing data corruption of some sort if it's not done.
Comment 29 Neil Rickert 2018-08-22 18:05:08 UTC
>(how long? how many messages)?

They go past so fast, that it is very hard to tell.  I would guess that it is around 100 messages.

As soon as I could see the messages, I paused the VM so that I could take a screenshot of the paused virtual machine.  Otherwise they would go past too fast to capture.

>Could you try to mask that service and see if the problem disappears?

How and when should I do that?
Comment 30 Martin Wilck 2018-08-22 20:17:44 UTC
(In reply to Neil Rickert from comment #29)
> >(how long? how many messages)?
> 
> They go past so fast, that it is very hard to tell. 

So it's really a transient condition, and the problem is mainly the irritating log messages, no real damage done. Might be solvable by adding a short sleep somewhere (but I can't say where :-/ ).

> >Could you try to mask that service and see if the problem disappears?
> 
> How and when should I do that?

When the system is up, run "systemctl mask blk-availability.service".
You probably need to reboot after that.

But wait - first check whether that service is enabled at all ("systemctl status blk-availability.service"). If it isn't, we have to find another culprit.
Comment 31 Neil Rickert 2018-08-22 22:35:55 UTC
I agree that no real damage is done.  It seems to be a harmless bug.

 ----
# systemctl status blk-availability.service
● blk-availability.service - Availability of block devices
   Loaded: loaded (/usr/lib/systemd/system/blk-availability.service; disabled; >
   Active: inactive (dead)
 ---
I guess the service is not running.
Comment 32 Martin Wilck 2018-08-23 08:18:34 UTC
> I agree that no real damage is done. 

OK, good. If you're willing to invest some more work nonetheless:

As you're seeing it on a VM, maybe you could provide a serial console log taken with "systemd.log_target=console systemd.journald.forward_to_console=1 console=ttyS0,57600n8"? In libvirt, you can connect e.g. "screen" to the serial port of the VM (which appears as a /dev/pts/$X device on the host; you can see it e.g. with "virsh dumpxml"), and log the screen session. qemu has various methods to redirect a virtual serial console to a file.

The reason I'm asking is to identify at which point exactly in the shutdown sequence the messages occur.
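A rough sketch of the capture side (the domain name and pts number are only examples; check the actual pty with virsh dumpxml):

# virsh dumpxml tw-test | grep -A2 "serial type"
# screen -L /dev/pts/3

screen -L writes the session to ./screenlog.0; with plain qemu you can instead pass something like "-serial file:/tmp/guest-serial.log".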
Comment 33 Neil Rickert 2018-08-24 13:33:00 UTC
Yes, I'm willing to spend more time on this.

Note, however, that I am relatively new to using VMs.  So I might need a little time to get up to speed in setting up a serial console log.
Comment 34 Neil Rickert 2018-08-24 14:28:54 UTC
Created attachment 780732 [details]
serial console log

Well, that was easier than expected.  I have attached the serial console log.
Comment 35 Neil Rickert 2018-08-25 13:18:41 UTC
Created attachment 780777 [details]
Serial console log (second try)

I see that I missed some of the instructions (the part about the journal).  So I have repeated this, and now there is a lot more output.
Comment 36 Achim Gratz 2018-11-12 18:41:11 UTC
The problem has stopped appearing somewhere around the 4.18.5 kernel update on Tumbleweed and has not returned so far.
Comment 37 Neil Rickert 2018-11-13 05:15:06 UTC
>The problem has stopped appearing somewhere around the 4.18.5 kernel update on Tumbleweed and has not returned so far.

I am still seeing the problem.

I have just updated Tumbleweed on three systems.  And then I rebooted twice (to make sure that this was with everything fully updated to 20181110).  I checked on the second reboot.

(1) A virtual machine under KVM:  Still seeing the problem.
(2) A Lenovo ThinkServer: Still seeing the problem.
(3) A Dell laptop: I am no longer seeing the problem on this system.

I guess that means there's a hardware dependency on whether this happens.
Comment 38 Ronnie Bailey 2018-11-13 06:22:46 UTC
  I also continue to see the messages, although they disappear during shutdown immediately after a kernel update. After the following boot, the problem is back.          
  I might note that my practice is to always remove the existing kernel at the same time I update to a newer kernel and not wait for the system to do it at another time.
Comment 39 Andrei Dziahel 2018-11-15 10:05:30 UTC
*** Bug 1116154 has been marked as a duplicate of this bug. ***
Comment 40 - - 2018-11-19 17:16:16 UTC
https://openqa.opensuse.org/tests/799295#step/shutdown/6


On my system:

With 'rd.break=shutdown' (after the error messages) /oldroot remains mounted with the 'ro' option, but one time it was 'rw' (because something related to a btrfs balance or fs sync timed out).

It seems that dracut doesn't succeed even after many retries and eventually gives up.

(Adding the 'rd.debug' option results in a complete freeze of my system every time)
Comment 41 - - 2018-11-19 21:15:32 UTC
There are 252 (42 * 6) lines of device-mapper error messages in Neil's log.

Dracut tries 42 times and gives up:
    _cnt=0
    while [ $_cnt -le 40 ]; do
        _check_shutdown && break
        _cnt=$(($_cnt+1))
    done
    [ $_cnt -ge 40 ] && _check_shutdown final

https://github.com/dracutdevs/dracut/blob/master/modules.d/99shutdown/shutdown.sh
Comment 42 - - 2018-11-21 17:05:50 UTC
shutdown:/# /oldroot/usr/bin/lsof -R /oldroot
COMMAND   PID PPID USER  FD   TYPE DEVICE SIZE/OFF   NODE NAME
none     1454    2 root mem    REG   0,44  2078080 181864 /oldroot/lib64/libc-2.27.so
none     1454    2 root mem    REG   0,44   181048 181856 /oldroot/lib64/ld-2.27.so
lsof     2891 2889 root txt    REG   0,44   168376  11130 /oldroot/usr/bin/lsof
lsof     2892 2891 root txt    REG   0,44   168376  11130 /oldroot/usr/bin/lsof
Comment 43 - - 2018-11-22 00:28:58 UTC
After running `kill -s TERM 1454` I could unmount /oldroot with no error.

I also found a line identifying the mysterious kthread named [none] in the logs:
>kernel: bpfilter: Loaded bpfilter_umh pid 1454
Comment 44 Martin Wilck 2019-01-11 11:46:02 UTC
Similar bug in SLE (non-public) is bug 1111392.
Comment 45 Martin Wilck 2019-01-11 12:04:35 UTC
(In reply to - - from comment #43)
> After running `kill -s TERM 1454` I could unmount /oldroot with no error.
> 
> I also found a line identifying the mysterious kthread named [none] in the logs:
> >kernel: bpfilter: Loaded bpfilter_umh pid 1454

If this is really the culprit, try running "rmmod bpfilter" before shutdown.
Comment 46 Martin Wilck 2019-01-11 12:35:04 UTC
Here's something of interest:

https://lkml.org/lkml/2018/10/20/259
Comment 47 Martin Wilck 2019-01-11 12:42:15 UTC
This is:

876dcf2 umh: Add command line to user mode helpers
4b78030 net: bpfilter: Set user mode helper's command line

Contained in 4.20-rc1.

Could you test if the problem is gone with the 4.20.1 kernel from Kernel:stable?

http://download.opensuse.org/repositories/Kernel:/stable/standard
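For example (a sketch; the repository alias is arbitrary):

# zypper ar -f http://download.opensuse.org/repositories/Kernel:/stable/standard kernel-stable
# zypper refresh
# zypper install kernel-default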
Comment 48 Neil Rickert 2019-01-11 21:43:19 UTC
I have installed kernel 4.20.1-1.g5978cc8-default.  After that, I have rebooted once and  then shutdown once.  Both reboot and shutdown were clean with this kernel.  They used to be very noisy with those "remove ioctl" fails.

This was in a KVM virtual machine.  I have not tested on real hardware.
Comment 49 - - 2019-01-12 01:28:11 UTC
First of all, I have to correct my last comment - the 'none' process wasn't a kthread, because it lacked the PF_KTHREAD flag.

(In reply to Martin Wilck from comment #46)
> Here's something of interest:
> 
> https://lkml.org/lkml/2018/10/20/259

There is also a patch in systemd 240: https://github.com/systemd/systemd/commit/e45154c770567507cc51eb78b28a1fae1fcdf396
Comment 50 Martin Wilck 2019-01-14 15:09:05 UTC
(In reply to - - from comment #49)
> First of all, I have to correct my last comment - the 'none' process wasn't
> a kthread, because it lacked the PF_KTHREAD flag.
 
Indeed, the bpfilter_umh thread doesn't have this flag set. bpfilter_umh is not a kernel thread, it's a "user mode helper", started by the "fork_usermode_blob" function.

This makes sense: a kernel thread wouldn't cause trouble unmounting file systems, but a user mode helper runs in user mode and thus has a CWD which may block the umount.

Could you please test with 4.20.1 as requested in comment 47?

> There is also a patch in systemd 240:
> https://github.com/systemd/systemd/commit/
> e45154c770567507cc51eb78b28a1fae1fcdf396

That one looks orthogonal to what we are discussing here. It deals with _not killing_ kernel threads, whereas we must ensure that systemd _does kill_ this user mode helper.
Comment 51 Andrei Dziahel 2019-01-14 21:40:45 UTC
> Could you please test with 4.20.1 as requested in comment 47?

Worked like a charm for me.
Comment 52 - - 2019-01-15 23:37:45 UTC
(In reply to Martin Wilck from comment #50)
> Could you please test with 4.20.1 as requested in comment 47?

I'm sorry, but I would prefer to wait for the release in the official repo. (Won't it appear any day now?)
I recently had even more issues with the rd.break=shutdown option, so I'll try to give you a detailed answer then, but I cannot guarantee it.

Maybe you could consider adding an extra test to openQA, which could show you most reliably whether the patches have had the desired effect? It would also be useful for detecting similar issues in the future.

> > There is also a patch in systemd 240:
> > https://github.com/systemd/systemd/commit/
> > e45154c770567507cc51eb78b28a1fae1fcdf396
> 
> That one looks orthogonal with what we are discussing here. It deals with
> _not killing_ kernel threads, where we must ensure that systemd _does kill_
> this user mode helper.

It corrects how kthreads are distinguished, to ensure that user mode helpers do get killed.
Checking the PF_KTHREAD flag is the right solution IMO, and dracut should implement it too.
The old hacky way - checking whether /proc/{PID}/cmdline is empty - isn't reliable enough, as we have all seen ;)
I consider the kernel patches a workaround for the problem.
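For illustration, a rough shell sketch of the two checks (hypothetical helper code, not actual dracut code; assumes bash and a $pid to test):

# old heuristic: an empty /proc/$pid/cmdline means "kernel thread" - fooled by bpfilter_umh
[ -z "$(tr -d '\0' < /proc/$pid/cmdline)" ] && echo "old heuristic says: kernel thread"
# PF_KTHREAD check: the flags value is the 9th field of /proc/$pid/stat, PF_KTHREAD = 0x00200000
# (strip the leading "pid (comm) " first, so the flags end up as field 7 of the remainder)
flags=$(sed 's/^.*) //' /proc/$pid/stat | awk '{print $7}')
(( flags & 0x00200000 )) && echo "PF_KTHREAD set: real kernel thread"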
Comment 53 Martin Wilck 2019-01-16 13:01:28 UTC
(In reply to - - from comment #52)

> Maybe could you consider adding an extra test to openqa, which could show
> you most reliably if the patches have had the desired effect? It would also
> be useful for detecting similar issues in the future.

I'm not aware of an OpenQA test that reproduces this behavior. Anyway, that's really not my area of expertise.

@coolo, do you see a chance to do this?
@all, do we have a fool-proof setup procedure for a VM to reproduce this problem?

> I consider the kernel patches as a workaround of the problem.

Not sure what you meant to say, but judging from the other testers' feedback, it looks as if these patches were actually the solution to the problem described in this bug. I wouldn't call this a workaround.
Comment 54 - - 2019-01-17 02:28:30 UTC
(In reply to Martin Wilck from comment #53)
> > I consider the kernel patches as a workaround of the problem.
> 
> Not sure what you meant to say, but judging from the other testers'
> feedback, it looks as if these patches were actually the solution to the
> problem described in this bug. I wouldn't call this a workaround.

Workarounds are solutions by definition. Poor solutions.

Anyway, I can confirm that the issue has been solved in the Tumbleweed 20190115 snapshot:
kernel-default-4.20.0-1.5.x86_64
systemd-239-3.1.x86_64
dracut-044.1-22.1.x86_64

The error messages have stopped appearing.

/oldroot and /oldroot/var mounts: 
 - both ro at the pre-shutdown breakpoint;
 - not listed at the shutdown breakpoint.

The cmdline entry in /proc/<pid> for the bpfilter umh process contains "bpfilter_umh" followed by a null byte. The dir isn't available at the shutdown breakpoint (or - in other words - the process has been killed earlier, as expected).
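For reference, that check on a running system looks roughly like this (finding the pid via pgrep is just one way to do it):

# pid=$(pgrep -f bpfilter_umh)
# tr '\0' ' ' < /proc/$pid/cmdline; echo
bpfilter_umh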

> @all, do we have a fool-proof setup procedure for a VM to reproduce this problem?

I suspect that the issue affects every system where bpfilter_umh is started, but verbose errors appear only if the dracut's shutdown hook is enabled.
Comment 55 Martin Wilck 2019-01-17 10:50:20 UTC
(In reply to - - from comment #54)
> Workarounds are solutions by definition. Poor solutions.

So you consider the addition of the two kernel patches above a poor solution? Sorry to hear that, I thought it was quite a good one.

> Anyway, I can confirm that the issue has been solved in the Tumbleweed
> 20190115 snapshot:
> kernel-default-4.20.0-1.5.x86_64
> systemd-239-3.1.x86_64
> dracut-044.1-22.1.x86_64
> 
> The error messages have stopped appearing.

OK, great.

> > @all, do we have a fool-proof setup procedure for a VM to reproduce this problem?
> 
> I suspect that the issue affects every system where bpfilter_umh is started,
> but verbose errors appear only if the dracut's shutdown hook is enabled.

I haven't yet figured out which service starts the bpfilter_umh process. Have you? I suspected firewalld, but I've seen bpfilter_umh running on systems with firewalld disabled.

Closing the bug per comment 48, 51, and 54.
Comment 56 Ronnie Bailey 2019-01-17 16:56:27 UTC
I upgraded to kernel-default-4-20 this morning. After reboot into the new kernel, and shut-down, the problem still exists.
Comment 57 Martin Wilck 2019-01-17 19:33:07 UTC
(In reply to Ronnie Bailey from comment #56)
> I upgraded to kernel-default-4-20 this morning. After reboot into the new
> kernel, and shut-down, the problem still exists.

Sorry to hear that. Your problem must then be different from what the other reporters were seeing. Could you please try the debugging steps described in comment 40 and comment 42?
Comment 58 - - 2019-01-17 21:37:35 UTC
(In reply to Martin Wilck from comment #55)
> > > @all, do we have a fool-proof setup procedure for a VM to reproduce this problem?
> > 
> > I suspect that the issue affects every system where bpfilter_umh is started,
> > but verbose errors appear only if the dracut's shutdown hook is enabled.
> 
> I haven't yet figured out which service starts the bpfilter_umh process.
> Have you? I suspected firewalld, but I've seen bpfilter_umh running on
> systems with firewalld disabled.

I suspect NetworkManager (on the basis of the order of log entries and boot messages).
Comment 59 Ronnie Bailey 2019-01-19 18:20:25 UTC
I set network management to "Wicked" via Yast. NetworkManager closed. I still got the messages on shutdown. Something curious though. On reboot, NetworkManager was running. I went to Yast and it still showed Wicked as the service to use. 

Earlier, I removed the bpfilter module. lsmod had shown it being used by 0. When I initiated shutdown, the first console message I got was "starting bpfilter". But during that shutdown there were no messages.

I am not sure if it would make a difference, but I have never migrated to btrfs. My volumes are formatted as ext4. 

Shutdowns on my system typically take 90 seconds or so, which seems quite long to me. One of the final messages I get before the long delay is: "Stopping monitoring of LVM2 partitions" or something to that effect. After a very long delay, I get the error messages which flash by quickly, just before shutdown.
Comment 60 Martin Wilck 2019-01-28 10:29:00 UTC
(In reply to Ronnie Bailey from comment #59)
> I set network management to "Wicked" via Yast. NetworkManager closed. I
> still got the messages on shutdown. Something curious though. On reboot,
> NetworkManager was running. I went to Yast and it still showed Wicked as the
> service to use. 

Different issue, please open a separate bug.
 
> Earlier, I removed the bpfilter module. lsmod had showed it being used by 0.
> When I initiated shutdown, the first console message I got was "starting
> bpfilter". But during that shutdown there were no messages. 

Could you try that after activating systemd debugging ("systemd-analyze log-level debug")? It might show us which service loads the module.

> Shutdowns on my system typically take 90 seconds or so, which seems quite
> long to me. 

Yes indeed. 90s is the standard systemd timeout for jobs that don't finish.
Comment 61 - - 2019-01-29 00:54:30 UTC
(In reply to Martin Wilck from comment #55)
> I haven't yet figured out which service starts the bpfilter_umh process.
> Have you? I suspected firewalld, but I've seen bpfilter_umh running on
> systems with firewalld disabled.

After a little research I found that it can be anything that calls `(set|get)sockopt`.
https://github.com/torvalds/linux/commit/97adaddaa6db7a8af81b9b11e30cbe3628cd6700

However, I agree with you that Ronnie's problem is different (probably completely unrelated to bpfilter_umh IMO) and debugging at the shutdown breakpoint would be helpful.

Ronnie, first steps are described in https://bugzilla.opensuse.org/show_bug.cgi?id=1080485#c11

In my experience, I would also suggest:
- enabling the magic sysrq key (temporarily),
- setting
`[Manager]
ShutdownWatchdogSec=0`
  in /run/systemd/system.conf.d/99-shutdown-watchdog.conf
- not using the 'rd.debug' option,
- trying to run commands, even if typed chars are not displayed.
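Concretely, the first two items could look like this (a sketch; the drop-in file name is the one from the list above):

# echo 1 > /proc/sys/kernel/sysrq
# mkdir -p /run/systemd/system.conf.d
# printf '[Manager]\nShutdownWatchdogSec=0\n' > /run/systemd/system.conf.d/99-shutdown-watchdog.conf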
Comment 62 Martin Wilck 2019-01-29 11:58:11 UTC
(In reply to - - from comment #61)

> After a little research I found that it can be anything that calls
> `(set|get)sockopt`.

Thanks for digging that up. More precisely, getsockopt() or setsockopt() with level == IPPROTO_IP and optname supported by bpfilter (https://elixir.bootlin.com/linux/v5.0-rc4/source/include/uapi/linux/bpfilter.h) automagically loads the bpfilter module and starts the UMH. This happens when iptables initializes itself: it calls getsockopt(IPPROTO_IP, IPT_SO_GET_INFO), and IPT_SO_GET_INFO has the same value as BPFILTER_IPT_SO_GET_INFO (=64).
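That makes it easy to watch (a sketch; assumes a kernel built with bpfilter support and iptables installed):

# lsmod | grep bpfilter
(nothing yet)
# iptables -L -n > /dev/null
(iptables initialization issues the getsockopt described above)
# lsmod | grep bpfilter
(the module should now be loaded)
# ps -eo pid,ppid,comm | grep bpfilter

Depending on the kernel, the helper shows up as "bpfilter_umh" or as the oddly named process discussed earlier in this bug.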
Comment 63 filippos Filippos 2019-06-27 11:46:11 UTC
Same here until today
Comment 64 heming zhao 2019-11-12 07:54:34 UTC
For comment #59

Hello Ronnie Bailey, 

For the LVM 90s freeze issue, please check bug #1096241.
I fixed it last week.

--------
For comment #63

Hello filippos Filippos, 

Can you provide your reproduction steps or your machine configuration?
What are the kernel version and the Tumbleweed/Leap version?

I plan to set up a KVM/QEMU environment to reproduce this issue.
Comment 65 heming zhao 2019-11-12 12:07:16 UTC
For comment #63:

Could you execute the debug commands from comment #11 (mkdir -p xx, echo xx, touch xx), then execute the command from comment #42 (lsof -R /oldroot)?
Comment 66 Ronnie Bailey 2020-01-15 17:28:42 UTC
I added a repository to be able to upgrade to lvm2 2.03 and can verify that the problem is fixed with that patch. According to other threads, it will be released to openSUSE 15.1 sometime in the near future.
Comment 67 Ronnie Bailey 2020-01-15 17:30:07 UTC
lvm 2.03 fixes the problem.