Bug 1149364 - Sometimes "tumbleweed update" freezes at kernel-default update
Sometimes "tumbleweed update" freezes at kernel-default update
Status: RESOLVED WONTFIX
Classification: openSUSE
Product: openSUSE Tumbleweed
Classification: openSUSE
Component: Kernel
Current
x86-64 Other
: P5 - None : Normal (vote)
: ---
Assigned To: E-mail List
E-mail List
:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2019-09-04 10:40 UTC by Michael Käck
Modified: 2020-03-30 12:28 UTC (History)
5 users (show)

See Also:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Michael Käck 2019-09-04 10:40:06 UTC
Hello!

Sometimes "tumbleweed update" freeze at kernel-default update on an VMware-VM with NO btrfs Filesystem (i think because of btrfs-defrag-pl and snapper.py)!


=================================================================
# df -hT | grep -v tmpfs
Dateisystem    Typ      Größe Benutzt Verf. Verw% Eingehängt auf
/dev/sda1      ext4       30G    8,0G   21G   28% /

=================================================================
Update Freeze:
....
(45/49) Installieren: kernel-default-5.2.11-1.2.x86_64 ....................................................................................................................................................<100%>[/]
Freeze

=================================================================
# pstree
systemd─┬─VGAuthService
        ├─agetty
        ├─atd
        ├─auditd───{auditd}
        ├─avahi-daemon
        ├─cron
        ├─dbus-daemon
        ├─gpg-agent
        ├─haveged
        ├─irqbalance───{irqbalance}
        ├─iscsid
        ├─lvmetad
        ├─master─┬─pickup
        │        └─qmgr
        ├─nscd───10*[{nscd}]
        ├─ntpd───ntpd
        ├─snmpd
        ├─sshd─┬─sshd───bash───bash───sudo───zypper─┬─btrfs-defrag-pl
        │      │                                    ├─rpm───sh───bootloader_entr───config───config───grub2-mkconfig───30_os-prober───30_os-prober─┬─os-prober───os-prober─┬─blkid
        │      │                                    │                                                                                             │                       ├─cut
        │      │                                    │                                                                                             │                       └─grep
        │      │                                    │                                                                                             ├─paste
        │      │                                    │                                                                                             └─tr
        │      │                                    ├─snapper.py
        │      │                                    └─systemd-inhibit───cat
        │      └─sshd───bash───pstree
        ├─systemd───(sd-pam)
..........

# ps aux | grep -e btrfs-defrag-pl -e snapper.py 
root      1954  0.0  0.0      0     0 pts/0    Z+   12:15   0:00 [btrfs-defrag-pl] <defunct>
root      1961  0.0  0.0      0     0 pts/0    Z+   12:15   0:00 [snapper.py] <defunct>


FG Michael Käck
Comment 1 Takashi Iwai 2019-09-05 09:21:44 UTC
Any other relevant information, e.g. some suspicious kernel logs, etc?

Also, once when the stall happens and still you have some control on VM, take a look at the stack entries of /proc/$PID of the process that got stuck.
Comment 2 Michael Käck 2019-09-05 10:02:04 UTC
At last Freeze I start a Reboot from another Console.
Shutdown wait for that hanging root-session then reboot.
zypper up install's then the remaining packages.

I didn't look at the log files.

At next hanging kernel-default update i have a look at /proc/$PID and journalctl!
Comment 3 Michael Käck 2019-10-14 07:25:32 UTC
System stalled at Update from:
target   : 20191003
to
latest   : 20191011
with no BTRFS-Filesystem!
--------------------------------

.....
(539/540) Installieren: samba-ad-dc-4.10.8+git.124.a2010fbd0de-1.2.x86_64 ..................................................................................................................................[fertig]
(540/540) Installieren: samba-test-4.10.8+git.124.a2010fbd0de-1.2.x86_64 ...................................................................................................................................[fertig]
%posttrans script 'plymouth-branding-openSUSE-84.87.20191004-1.1.noarch.rpm' wird ausgeführt --------------------------------------------------------------------------------------------------------------------[/]
STALLED!
--------------------------------

journalctl -k -n 100 --no-pager
.........
Okt 14 08:49:46 suse-tumbleweed-64bit-template kernel: Loading iSCSI transport class v2.0-870.
Okt 14 08:56:21 suse-tumbleweed-64bit-template kernel: printk: systemd: 65 output lines suppressed due to ratelimiting
Okt 14 08:56:21 suse-tumbleweed-64bit-template systemd[1]: systemd +suse.36.g9e41d7ec35 running in system mode. (+PAM -AUDIT +SELINUX -IMA +APPARMOR -SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD -IDN2 -IDN +PCRE2 default-hierarchy=hybrid)
Okt 14 08:56:21 suse-tumbleweed-64bit-template systemd[1]: Detected virtualization vmware.
Okt 14 08:56:21 suse-tumbleweed-64bit-template systemd[1]: Detected architecture x86-64.
Okt 14 08:56:21 suse-tumbleweed-64bit-template systemd[1]: /usr/lib/systemd/system/auditd.service:12: PIDFile= references a path below legacy directory /var/run/, updating /var/run/auditd.pid → /run/auditd.pid; please update the unit file accordingly.
Okt 14 08:56:22 suse-tumbleweed-64bit-template systemd[1]: /usr/lib/systemd/system/display-manager.service:12: PIDFile= references a path below legacy directory /var/run/, updating /var/run/displaymanager.pid → /run/displaymanager.pid; please update the unit file accordingly.
Okt 14 08:56:22 suse-tumbleweed-64bit-template systemd[1]: /usr/lib/systemd/system/vmtoolsd.service:12: PIDFile= references a path below legacy directory /var/run/, updating /var/run/vmtoolsd.pid → /run/vmtoolsd.pid; please update the unit file accordingly.
Okt 14 08:56:22 suse-tumbleweed-64bit-template systemd[1]: /usr/lib/systemd/system/ntpd.service:15: PIDFile= references a path below legacy directory /var/run/, updating /var/run/ntp/ntpd.pid → /run/ntp/ntpd.pid; please update the unit file accordingly.
Okt 14 08:56:22 suse-tumbleweed-64bit-template systemd[1]: /usr/lib/systemd/system/pcscd.socket:5: ListenStream= references a path below legacy directory /var/run/, updating /var/run/pcscd/pcscd.comm → /run/pcscd/pcscd.comm; please update the unit file accordingly.
Okt 14 08:56:22 suse-tumbleweed-64bit-template systemd[1]: Stopping Journal Service...
Okt 14 08:56:22 suse-tumbleweed-64bit-template systemd-journald[474]: Received SIGTERM from PID 1 (systemd).
Okt 14 08:56:22 suse-tumbleweed-64bit-template systemd[1]: systemd-journald.service: Succeeded.
Okt 14 08:57:25 suse-tumbleweed-64bit-template kernel: blk_update_request: I/O error, dev fd0, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
Okt 14 08:57:25 suse-tumbleweed-64bit-template kernel: floppy: error 10 while reading block 0
Okt 14 08:57:27 suse-tumbleweed-64bit-template kernel: blk_update_request: I/O error, dev fd0, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
Okt 14 08:57:27 suse-tumbleweed-64bit-template kernel: floppy: error 10 while reading block 0
Okt 14 08:58:38 suse-tumbleweed-64bit-template kernel: floppy0: floppy_queue_rq: timeout handler died.  old request running

-----------------------

suse-tumbleweed-64bit-template:~ # pstree -p            
systemd(1)─┬─VGAuthService(5346)
           ├─agetty(1621)
           ├─atd(1626)
           ├─auditd(759)───{auditd}(760)
           ├─avahi-daemon(3801)
           ├─cron(12012)
           ├─dbus-daemon(1068)
           ├─gpg-agent(1973)
           ├─haveged(473)
           ├─irqbalance(1069)───{irqbalance}(1091)
           ├─iscsid(1595)
           ├─master(4805)─┬─pickup(4806)
           │              └─qmgr(4807)
           ├─nscd(4907)─┬─{nscd}(4913)
           │            ├─{nscd}(4914)
           │            ├─{nscd}(4915)
           │            ├─{nscd}(4916)
           │            ├─{nscd}(4917)
           │            ├─{nscd}(4918)
           │            ├─{nscd}(4919)
           │            ├─{nscd}(4920)
           │            ├─{nscd}(4921)
           │            └─{nscd}(4922)
           ├─ntpd(1615)───ntpd(1624)
           ├─snapperd(1991)───{snapperd}(1998)
           ├─snmpd(1587)
           ├─sshd(1614)─┬─sshd(1742)───bash(1767)───bash(1823)───sudo(1955)───zypper(1956)─┬─btrfs-defrag-pl(1983)
           │            │                                                                  ├─plymouth-brandi(12528)───bash(12529)───regenerate-init(12530)───mkinitrd(12533)───update-bootload(22954)───config(2295+
           │            │                                                                  ├─snapper.py(1989)
           │            │                                                                  └─systemd-inhibit(1981)───cat(1982)
           │            └─sshd(23883)───bash(23890)───pstree(24085)
           ├─systemd(1756)───(sd-pam)(1762)
           ├─systemd-journal(4154)
           ├─systemd-logind(1155)
           ├─systemd-udevd(4330)
           ├─vmtoolsd(5350)───{vmtoolsd}(5351)
           ├─vmware-vmblock-(11505)─┬─{vmware-vmblock-}(11509)
           │                        └─{vmware-vmblock-}(11511)
           ├─wickedd(5998)
           ├─wickedd-auto4(5995)
           ├─wickedd-dhcp4(5996)
           ├─wickedd-dhcp6(5997)
           └─wickedd-nanny(6000)


suse-tumbleweed-64bit-template:~ # cat /proc/1991/stack 
[<0>] do_sys_poll+0x467/0x540
[<0>] __x64_sys_poll+0x37/0x130
[<0>] do_syscall_64+0x6e/0x1e0
[<0>] entry_SYSCALL_64_after_hwframe+0x49/0xbe


suse-tumbleweed-64bit-template:~ # cat /proc/198
1981/ 1982/ 1983/ 1989/ 
suse-tumbleweed-64bit-template:~ # cat /proc/1983/stack 
[<0>] pipe_read+0x206/0x2d0
[<0>] new_sync_read+0x12a/0x1c0
[<0>] vfs_read+0x9d/0x150
[<0>] ksys_read+0x5f/0xe0
[<0>] do_syscall_64+0x6e/0x1e0
[<0>] entry_SYSCALL_64_after_hwframe+0x49/0xbe


suse-tumbleweed-64bit-template:~ # cat /proc/2295
22954/ 22955/ 22956/ 22957/ 

suse-tumbleweed-64bit-template:~ # cat /proc/2295*
cat: /proc/22954: Ist ein Verzeichnis
cat: /proc/22955: Ist ein Verzeichnis
cat: /proc/22956: Ist ein Verzeichnis
cat: /proc/22957: Ist ein Verzeichnis
suse-tumbleweed-64bit-template:~ # cat /proc/2295*/stack
[<0>] pipe_read+0x206/0x2d0
[<0>] new_sync_read+0x12a/0x1c0
[<0>] vfs_read+0x9d/0x150
[<0>] ksys_read+0x5f/0xe0
[<0>] do_syscall_64+0x6e/0x1e0
[<0>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[<0>] kernel_wait4+0xa6/0x140
[<0>] __do_sys_wait4+0x85/0x90
[<0>] do_syscall_64+0x6e/0x1e0
[<0>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[<0>] kernel_wait4+0xa6/0x140
[<0>] __do_sys_wait4+0x85/0x90
[<0>] do_syscall_64+0x6e/0x1e0
[<0>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[<0>] kernel_wait4+0xa6/0x140
[<0>] __do_sys_wait4+0x85/0x90
[<0>] do_syscall_64+0x6e/0x1e0
[<0>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
suse-tumbleweed-64bit-template:~ # 
suse-tumbleweed-64bit-template:~ # cat /proc/1989/stack 
[<0>] pipe_read+0x206/0x2d0
[<0>] new_sync_read+0x12a/0x1c0
[<0>] vfs_read+0x9d/0x150
[<0>] ksys_read+0x5f/0xe0
[<0>] do_syscall_64+0x6e/0x1e0
[<0>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
suse-tumbleweed-64bit-template:~ # cat /proc/1982/stack 
[<0>] pipe_read+0x206/0x2d0
[<0>] new_sync_read+0x12a/0x1c0
[<0>] vfs_read+0x9d/0x150
[<0>] ksys_read+0x5f/0xe0
[<0>] do_syscall_64+0x6e/0x1e0
[<0>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
suse-tumbleweed-64bit-template:~ #

HTH, FG Michael
Comment 4 Jiri Slaby 2019-10-16 05:58:31 UTC
Do you have a floppy? Does this happen if you blacklist or delete the floppy module?

The traces indicate that all the processes wait for some data on a pipe. I.e. their parent (zypper or dracut) should send something to them, but it never happens.
Comment 5 Michael Käck 2019-10-16 06:27:21 UTC
It is a VMware VM without BTRFS file system:
# df -hT | grep -v tmpfs
File system Type Size Used Verf. Verw% Mounted on
/dev/sda1 ext4 30G 6.8G 22G 24% /

No floppy drive but activated floppy controller -> I have deactivated the Ctrl in the BIOS.
I don't know the name of the floppy module but I think the name starts with "f" and see now:
# lsmod | grep -i '^f.*'
fuse 139264 3
fb_sys_fops 16384 1 drm_kms_helper

Then I made an update from 20191011 to 20191012 -> Everything OK!

I'll keep watching the whole thing!
Comment 6 Jiri Slaby 2019-10-16 06:57:05 UTC
(In reply to Michael Käck from comment #5)
> No floppy drive but activated floppy controller -> I have deactivated the
> Ctrl in the BIOS.
> I don't know the name of the floppy module but I think the name starts with
> "f" and see now:

Sure, the floppy module is named floppy. And it causes more bad than good nowadays.
Comment 7 Michael Käck 2019-11-13 08:01:14 UTC
Feedback:
No update-freezing since I disabled the floppy controller.
Comment 8 Miroslav Beneš 2020-03-30 12:28:23 UTC
I guess we should close then. As Jiri said, floppy module does more harm than good, so if you don't need it, disable it and all should be fine.