Bug 1079416 - Kernel 4.15 causing lock-up at shutdown
Status: RESOLVED FIXED
Classification: openSUSE
Product: openSUSE Tumbleweed
Component: Kernel
Version: Current
Hardware: x86-64 Other
Priority: P5 - None
Severity: Critical (5 votes)
Target Milestone: ---
Assigned To: E-mail List
QA Contact: E-mail List
Reported: 2018-02-05 17:27 UTC by Ryan Nunya
Modified: 2018-08-14 06:36 UTC
CC List: 7 users

Attachments
Photo capture of OOPS (353.95 KB, image/png)
2018-06-20 03:52 UTC, Boian Berberov

Description Ryan Nunya 2018-02-05 17:27:00 UTC
User-Agent:       Mozilla/5.0 (X11; Linux x86_64; rv:57.0) Gecko/20100101 Firefox/57.0
Build Identifier: 

After upgrading to kernel version 4.15, my computer can no longer shut down properly.

Reproducible: Always

Steps to Reproduce:
1. running kernel 4.14
2. sudo zypper dup (upgrade to 4.15)
3. reboot into 4.15
4. shutdown/reboot
Actual Results:  
Computer freezes on tty screen during shutdown process

Expected Results:  
Shutdown normally

After I initiate shutdown, the session logs out and I am shown the tty login screen. Normally that lasts a few seconds, but now the system freezes on the tty screen, the computer gets very hot, and the fans speed up considerably. I waited an hour to rule out that it was just doing something like rebuilding a kernel module.
Booting the 4.14 kernel from GRUB does not show this issue.

About half the time after attempting to reboot with the 4.15 kernel my UEFI settings in BIOS change. Unsure if related.

I am the user _LeBronse, and I describe what I have done in more detail here:
https://www.reddit.com/r/openSUSE/comments/7veel7/tumbleweed_support_computer_locks_up_on_shutdown/
Comment 1 Jiri Slaby 2018-02-07 16:26:59 UTC
In case it still works at that phase, could you dump CPU traces by sysrq-p?
Comment 2 Cin Abby 2018-02-11 08:16:33 UTC
Same issue after I upgraded the system to 20180206.
The current system version is 20180208, and it is still not fixed.

CPU i5 6300HQ
Comment 3 Boian Berberov 2018-02-14 03:53:27 UTC
Same issue, on a legacy BIOS boot system.
Comment 4 Sergey Kosenkov 2018-02-14 06:33:17 UTC
After sleep mode, power-off and reboot work OK.
Current system version is 20180212.
CPU i7-6700HQ
Comment 5 Boian Berberov 2018-02-14 15:54:07 UTC
Just updated to 20180212.

Still happens with 4.15.2-1, not with 4.14.15-2.

I tried SysRq-P but nothing happens.  On the laptop I have to use the Fn key, so maybe not definitive.
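
For anyone else who sees no reaction to SysRq-P: one thing worth ruling out is that the kernel's SysRq setting disallows it from the keyboard. Below is a minimal Python sketch, assuming the documented meaning of /proc/sys/kernel/sysrq (0 disables all functions, 1 enables all, any other value is a bitmask in which the value 8 covers debugging dumps such as SysRq-P). The function name is hypothetical, and this does not address the laptop Fn-key question.

```python
from pathlib import Path

SYSRQ_ENABLE_DUMP = 0x8  # bitmask value that permits debugging dumps (e.g. SysRq-P)


def sysrq_dump_allowed(value: int) -> bool:
    """Return True if a kernel.sysrq value permits dump keys such as 'p'."""
    # A value of exactly 1 means "all functions enabled";
    # every other value is interpreted as a bitmask.
    return value == 1 or bool(value & SYSRQ_ENABLE_DUMP)


if __name__ == "__main__":
    path = Path("/proc/sys/kernel/sysrq")
    if path.exists():
        current = int(path.read_text().strip())
        print(f"kernel.sysrq={current}, SysRq-P allowed: {sysrq_dump_allowed(current)}")
```

According to the kernel's SysRq documentation, this setting only restricts keyboard invocation; a root user writing `p` to /proc/sysrq-trigger is always allowed, which may help as long as a shell is still responsive.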
Comment 6 Sergey Kosenkov 2018-02-17 09:38:47 UTC
After removing all old unused kernels, poweroff works normally.
Comment 7 Boian Berberov 2018-02-18 02:59:21 UTC
Just updated to 20180212, kernel 4.15.2-1.  Error persists.

I cannot remove the old kernels because they are no longer in the repo and I cannot revert.  Did anything specific to reboot change?

I also have to use these kernel command line options, which work with 4.14:
nouveau.config=NvMSI=0 nouveau.nofbaccel=1 nouveau.runpm=0 reboot=warm,pci
Comment 8 Sergey Kosenkov 2018-02-18 17:32:42 UTC
Try this kernel parameter: nouveau.modeset=0
Comment 9 Boian Berberov 2018-02-18 21:27:35 UTC
Tried with just:
reboot=warm,pci

So far everything seems to work with no side effects.  Might have been some changes in nouveau, at least in my case.
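
To make explicit which options were dropped between the command line in comment 7 and the one tried here, a small Python sketch follows. It is an illustration only: the helper names are hypothetical, and the parser naively splits on whitespace, so it ignores quoted values that a real kernel command line may contain.

```python
from typing import Dict, Optional


def parse_cmdline(cmdline: str) -> Dict[str, Optional[str]]:
    """Split a kernel command line into {option: value}; flags map to None."""
    params: Dict[str, Optional[str]] = {}
    for token in cmdline.split():
        # Split at the first '=' only, so "nouveau.config=NvMSI=0"
        # becomes key "nouveau.config" with value "NvMSI=0".
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def dropped_options(before: str, after: str) -> Dict[str, Optional[str]]:
    """Return the options present in `before` but absent from `after`."""
    old, new = parse_cmdline(before), parse_cmdline(after)
    return {key: value for key, value in old.items() if key not in new}


# Option strings copied from comments 7 and 9 of this report.
COMMENT_7 = "nouveau.config=NvMSI=0 nouveau.nofbaccel=1 nouveau.runpm=0 reboot=warm,pci"
COMMENT_9 = "reboot=warm,pci"

if __name__ == "__main__":
    for key, value in dropped_options(COMMENT_7, COMMENT_9).items():
        print(f"{key}={value}")
```

Running it lists the three nouveau options, which are the only difference between the two boots.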
Comment 10 Boian Berberov 2018-02-23 00:05:37 UTC
Still occasional lock-ups, but less frequent now.

I'm trying logging out of X/KDE first and then rebooting or shutting down.  That seems to succeed more often than a direct reboot from X/KDE, but I need a bigger sample.
Comment 11 Ben Steel 2018-02-23 17:57:54 UTC
Original problem (freeze at the TTY login and significant heat generation during shutdown from KDE) and workaround (logout prior to shutdown) both confirmed on a Lenovo Thinkpad T510 with i5 processor and Nvidia Quadro NVS 3100M with 512MB.

Thank you, Mr. Berberov.
Comment 12 Boian Berberov 2018-03-01 03:55:45 UTC
It's locking up about once a day now, in the middle of a session too.  With longer sessions it's more prevalent, but it locked up ~5min after login once.

Is there any idea what this could be?  Is it nouveau/Optimus related?

Is there a way to debug, dump some error messages on screen, maybe some kernel debug boot options I don't know about?

I could go back to running 4.14 for a week and see what that looks like.  Suggestions?
Comment 13 Martin Kincl 2018-03-01 09:14:59 UTC
I can confirm this issue on an HP ProBook 4510s with a Core 2 Duo T5870 and Radeon HD4330, except that all the TTYs close. Only tty7 shows the remains of the tribar Plymouth progress bar, and tty10 still shows the log. No excessive heat on this machine. One update before it started doing this, the system did eventually shut down about 10 seconds after Plymouth disappeared, but in a way that apparently cut power to the hard drive abruptly in the middle of an I/O operation.
Comment 14 Boian Berberov 2018-03-14 08:09:25 UTC
I was able to capture these errors after a crash.

======================================================================
Mar 13 22:23:42 linux-qjvs systemd-logind[1410]: Power key pressed.
Mar 13 22:23:42 linux-qjvs systemd-logind[1410]: Powering Off...
Mar 13 22:23:42 linux-qjvs systemd-logind[1410]: System is powering down.
...
Mar 13 22:23:42 linux-qjvs systemd[1]: Stopping Getty on tty1...
...
Mar 13 22:23:42 linux-qjvs sddm[1673]: Error from greeter session: "Process crashed"
Mar 13 22:23:42 linux-qjvs sddm[1673]: Auth: sddm-helper crashed (exit code 15)
Mar 13 22:23:42 linux-qjvs sddm[1673]: Error from greeter session: "Process crashed"
Mar 13 22:23:42 linux-qjvs sddm[1673]: Auth: sddm-helper exited with 15
Mar 13 22:23:42 linux-qjvs sddm[1673]: Greeter stopped.
Mar 13 22:23:42 linux-qjvs systemd[1]: Stopping Session 1 of user sddm.
...
Mar 13 22:23:42 linux-qjvs systemd[1916]: Reached target Shutdown.
Mar 13 22:23:42 linux-qjvs systemd[1916]: Starting Exit the Session...
Mar 13 22:23:42 linux-qjvs systemd[1]: Stopped Session 1 of user sddm.
Mar 13 22:23:42 linux-qjvs systemd[1]: Stopped Modem Manager.
Mar 13 22:23:42 linux-qjvs systemd[1]: Stopped The nginx HTTP and reverse proxy server.
Mar 13 22:23:42 linux-qjvs systemd-logind[1410]: Removed session 1.
Mar 13 22:23:42 linux-qjvs systemd[1916]: Received SIGRTMIN+24 from PID 1984 (kill).
Mar 13 22:23:43 linux-qjvs systemd[1917]: pam_unix(systemd-user:session): session closed for user sddm
Mar 13 22:23:43 linux-qjvs systemd[1]: Stopped User Manager for UID 473.
Mar 13 22:23:43 linux-qjvs systemd[1]: Removed slice User Slice of sddm.
Mar 13 22:23:43 linux-qjvs systemd[1]: Unmounted /var/run/user/473.
Mar 13 22:23:43 linux-qjvs systemd[1]: Unmounted /run/user/473.
Mar 13 22:23:43 linux-qjvs postfix/postfix-script[1991]: stopping the Postfix mail system
Mar 13 22:23:43 linux-qjvs postfix/master[1761]: terminating on signal 15
Mar 13 22:23:43 linux-qjvs systemd[1]: Stopped Postfix Mail Transport Agent.
Mar 13 22:23:43 linux-qjvs systemd[1]: Stopped target Host and Network Name Lookups.
Mar 13 22:24:47 linux-qjvs systemd[1]: Stopped Getty on tty1.
Mar 13 22:24:47 linux-qjvs systemd[1]: Removed slice system-getty.slice.
Mar 13 22:24:47 linux-qjvs kernel: ACPI: \_SB_.PCI0.PEG0.VID_: failed to evaluate _DSM
Mar 13 22:24:47 linux-qjvs kernel: [drm:drm_atomic_helper_wait_for_flip_done [drm_kms_helper]] *ERROR* [CRTC:37:pipe A] flip_done timed out
Mar 13 22:24:47 linux-qjvs kernel: [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [CRTC:37:pipe A] flip_done timed out
Mar 13 22:24:47 linux-qjvs kernel: [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [PLANE:28:primary A] flip_done timed out
Mar 13 22:24:47 linux-qjvs kernel: [drm:drm_atomic_helper_wait_for_flip_done [drm_kms_helper]] *ERROR* [CRTC:37:pipe A] flip_done timed out
Mar 13 22:24:47 linux-qjvs kernel: [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [CRTC:37:pipe A] flip_done timed out
Mar 13 22:24:47 linux-qjvs kernel: [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [PLANE:28:primary A] flip_done timed out
Mar 13 22:24:47 linux-qjvs kernel: [drm:drm_atomic_helper_wait_for_flip_done [drm_kms_helper]] *ERROR* [CRTC:37:pipe A] flip_done timed out
Mar 13 22:24:47 linux-qjvs display-manager[1976]: /usr/lib/X11/display-manager: line 132: type: console_vars: not found
Mar 13 22:24:47 linux-qjvs display-manager[1976]: /usr/lib/X11/display-manager: line 132: type: default-displaymanager_vars: not found
Mar 13 22:24:47 linux-qjvs sddm[1673]: Signal received: SIGTERM
Mar 13 22:24:47 linux-qjvs sddm[1673]: Socket server stopping...
Mar 13 22:24:47 linux-qjvs sddm[1673]: Socket server stopped.
Mar 13 22:24:47 linux-qjvs sddm[1673]: Display server stopping...
Mar 13 22:24:52 linux-qjvs sddm[1673]: Display server stopping...
Mar 13 22:24:52 linux-qjvs display-manager[1976]: Shutting down service sddm..done
Mar 13 22:24:53 linux-qjvs kernel: ACPI: \_SB_.PCI0.PEG0.VID_: failed to evaluate _DSM
Mar 13 22:24:57 linux-qjvs sddm[1673]: QProcess: Destroyed while process ("/usr/bin/X") is still running.
Mar 13 22:24:58 linux-qjvs kernel: [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [CRTC:37:pipe A] flip_done timed out
Mar 13 22:25:08 linux-qjvs kernel: [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [PLANE:28:primary A] flip_done timed out
Mar 13 22:25:18 linux-qjvs kernel: [drm:drm_atomic_helper_wait_for_flip_done [drm_kms_helper]] *ERROR* [CRTC:37:pipe A] flip_done timed out
Mar 13 22:25:18 linux-qjvs systemd[1]: Stopped X Display Manager.
Mar 13 22:25:18 linux-qjvs systemd[1]: Starting Show Plymouth Power Off Screen...
...
Mar 13 22:25:21 linux-qjvs systemd[1]: Reached target Shutdown.
Mar 13 22:25:21 linux-qjvs systemd[1]: Reached target Final Step.
Mar 13 22:25:21 linux-qjvs systemd[1]: Starting Power-Off...
Mar 13 22:25:21 linux-qjvs systemd[1]: Shutting down.
Mar 13 22:25:22 linux-qjvs systemd-shutdown[1]: Sending SIGTERM to remaining processes...
Mar 13 22:25:22 linux-qjvs haveged[408]: haveged: Stopping due to signal 15
Mar 13 22:25:22 linux-qjvs haveged[408]: haveged starting up
Mar 13 22:25:22 linux-qjvs systemd-journald[397]: Journal stopped
-- Reboot --
======================================================================

If I switch to VT1 before reboot/shutdown, I can capture logs when the power button is responsive.

Is getty supposed to shut down before sddm?
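
A rough way to locate where the shutdown stalls in a journal excerpt like the one above is to look for large gaps between consecutive timestamps. The Python sketch below assumes syslog-style lines that start with "Mon DD HH:MM:SS"; the function name, the 30-second threshold, and the year default are assumptions made for illustration.

```python
from datetime import datetime


def find_stalls(lines, min_gap_s=30, year=2018):
    """Yield (gap_seconds, previous_line, next_line) for large timestamp gaps."""
    prev_time = prev_line = None
    for line in lines:
        try:
            # Syslog timestamps carry no year, so one is supplied explicitly.
            stamp = datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
        except ValueError:
            continue  # skip "..." markers and other non-log lines
        if prev_time is not None:
            gap = (stamp - prev_time).total_seconds()
            if gap >= min_gap_s:
                yield gap, prev_line, line
        prev_time, prev_line = stamp, line


# Three consecutive lines copied from the log above.
SAMPLE = [
    "Mar 13 22:23:43 linux-qjvs systemd[1]: Stopped Postfix Mail Transport Agent.",
    "Mar 13 22:23:43 linux-qjvs systemd[1]: Stopped target Host and Network Name Lookups.",
    "Mar 13 22:24:47 linux-qjvs systemd[1]: Stopped Getty on tty1.",
]

if __name__ == "__main__":
    for gap, before, after in find_stalls(SAMPLE):
        print(f"{gap:.0f}s stall before: {after}")
```

On these lines it reports a single 64-second gap, between 22:23:43 and 22:24:47, which is immediately followed in the log by the first flip_done timeouts.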
Comment 15 Cin Abby 2018-03-25 14:29:49 UTC
It is still not fixed in kernel-default-4.15.10-1.6.x86_64.

I guess it may be caused by a Meltdown/Spectre patch.
Comment 16 Boian Berberov 2018-04-22 20:02:46 UTC
Kernel 4.16.2-1 crashes significantly more often than 4.16.0-1 for me.  I'm about to update to revision 4.16.2-1.7.

Could you make 4.14 available again so we can at least use our computers normally?  According to Kernel.org, it's `longterm`.
Comment 17 Cin Abby 2018-04-23 14:39:47 UTC
I have an old kernel 4.14.1 taken from an old Tumbleweed ISO file. I have uploaded it to my VPS: https://isliberty.me/kernel-4.14.tgz

It contains the three packages listed below:

kernel-default-4.14.1-1.4.x86_64.rpm
kernel-default-devel-4.14.1-1.4.x86_64.rpm
kernel-syms-4.14.1-1.4.x86_64.rpm

In case someone needs these packages.

As for me, I just compiled linux-4.14.1 from source with localmodconfig, also compiled VirtualBox from source, and then added locks for kernel-default and kernel-firmware :(

You can try Leap 15's kernel too, which is kernel-4.12.x.
Comment 18 Jiri Slaby 2018-06-16 12:33:41 UTC
Does it still happen with 4.17?
Comment 19 Boian Berberov 2018-06-17 19:48:51 UTC
I will find out later this week.  So far it still occurs with 4.16.12-2-default.

============================================================
Jun 17 13:59:13 linux-qjvs sddm[1916]: Error from greeter session: "Process crashed"
Jun 17 13:59:13 linux-qjvs sddm[1916]: Auth: sddm-helper crashed (exit code 15)
Jun 17 13:59:13 linux-qjvs sddm[1916]: Error from greeter session: "Process crashed"
Jun 17 13:59:13 linux-qjvs sddm[1916]: Auth: sddm-helper exited with 15
Jun 17 13:59:13 linux-qjvs sddm[1916]: Greeter stopped.
...
Jun 17 14:00:36 linux-qjvs kernel: [drm:drm_atomic_helper_wait_for_flip_done [drm_kms_helper]] *ERROR* [CRTC:37:pipe A] flip_done timed out
Jun 17 14:00:36 linux-qjvs kernel: [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [CRTC:37:pipe A] flip_done timed out
Jun 17 14:00:36 linux-qjvs kernel: [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [CONNECTOR:48:LVDS-1] flip_done timed out
Jun 17 14:00:36 linux-qjvs kernel: [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [PLANE:28:primary A] flip_done timed out
Jun 17 14:00:36 linux-qjvs kernel: [drm:drm_atomic_helper_wait_for_flip_done [drm_kms_helper]] *ERROR* [CRTC:37:pipe A] flip_done timed out
Jun 17 14:00:36 linux-qjvs kernel: [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [CRTC:37:pipe A] flip_done timed out
Jun 17 14:00:36 linux-qjvs kernel: [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [CONNECTOR:48:LVDS-1] flip_done timed out
Jun 17 14:00:36 linux-qjvs kernel: [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [PLANE:28:primary A] flip_done timed out
Jun 17 14:00:36 linux-qjvs kernel: [drm:drm_atomic_helper_wait_for_flip_done [drm_kms_helper]] *ERROR* [CRTC:37:pipe A] flip_done timed out
...
Jun 17 14:00:36 linux-qjvs sddm[1916]: Signal received: SIGTERM
Jun 17 14:00:36 linux-qjvs sddm[1916]: Socket server stopping...
Jun 17 14:00:36 linux-qjvs sddm[1916]: Socket server stopped.
Jun 17 14:00:36 linux-qjvs sddm[1916]: Display server stopping...
...
Jun 17 14:00:46 linux-qjvs sddm[1916]: QProcess: Destroyed while process ("/usr/bin/X") is still running.
Jun 17 14:00:47 linux-qjvs kernel: [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [CRTC:37:pipe A] flip_done timed out
Jun 17 14:00:57 linux-qjvs kernel: [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [CONNECTOR:48:LVDS-1] flip_done timed out
Jun 17 14:01:08 linux-qjvs kernel: [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [PLANE:28:primary A] flip_done timed out
============================================================

I see Leap 15.0 has kernel 4.12.14.  I'll see if I can install that on Tumbleweed side-by-side.
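
To compare excerpts like the one above across kernel versions, it can help to tally the flip_done timeouts per DRM object. This is a minimal Python sketch; the function name is hypothetical, and the regular expression is derived only from the message format visible in these logs.

```python
import re
from collections import Counter

# Matches e.g. "[CRTC:37:pipe A] flip_done timed out" as seen in the logs above.
FLIP_DONE = re.compile(r"\[(CRTC|PLANE|CONNECTOR):(\d+):([^\]]+)\] flip_done timed out")


def tally_flip_done(lines):
    """Count flip_done timeouts per (object type, id, name)."""
    counts = Counter()
    for line in lines:
        match = FLIP_DONE.search(line)
        if match:
            kind, obj_id, name = match.groups()
            counts[(kind, int(obj_id), name)] += 1
    return counts


# Three lines copied from the log above.
SAMPLE = [
    "Jun 17 14:00:47 linux-qjvs kernel: [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [CRTC:37:pipe A] flip_done timed out",
    "Jun 17 14:00:57 linux-qjvs kernel: [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [CONNECTOR:48:LVDS-1] flip_done timed out",
    "Jun 17 14:01:08 linux-qjvs kernel: [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [PLANE:28:primary A] flip_done timed out",
]

if __name__ == "__main__":
    for (kind, obj_id, name), count in tally_flip_done(SAMPLE).items():
        print(f"{kind}:{obj_id}:{name} -> {count}")
```

Feeding it the full output of a shutdown's journal instead of the three sample lines would show whether the same CRTC, plane, and connector are involved each time.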
Comment 20 Boian Berberov 2018-06-20 03:52:28 UTC
Created attachment 774593 [details]
Photo capture of OOPS

I've been experiencing much more frequent crashes with 4.17.1-1-default.  I'd estimate 3/4 reboots/shutdowns do not succeed.  I've gone back to using 4.16.12-2-default where it's less than 1/4.

I was able to capture this OOPS.  Steps:

1. Fresh boot with all defaults (no user interaction)
2. sddm shows up
3. Switch to VT1 with Ctrl + F1
4. Press Ctrl + Alt + Del
5. OOPS shows on screen

I've enhanced the image a little for readability.

I'll try to reproduce it, but so far I haven't been able to.  The screen simply freezes during lock-up with no message.
Comment 21 Boian Berberov 2018-07-25 21:31:46 UTC
Do you expect to make any progress on this issue this quarter (by October)?  It's still NEW, not CONFIRMED.  If it's not resolvable, will there be an official workaround, alternative kernels or something?  I'd like to help, but I have no idea what or how.
Comment 22 Ben Steel 2018-08-03 19:18:53 UTC
For me, this bug (as originally described by Mr. Nunya) was resolved by the Tumbleweed update that I applied (zypper dup) on 30 July 2018 at 16:32 UTC. This update happened to include the kernel 4.17.9-1-default #1 SMP PREEMPT Sun Jul 22 19:18:23 UTC 2018 (059e5b8) x86_64 x86_64 x86_64 GNU/Linux.

It had not been resolved by "my" previous update on 24 July 2018 at 20:55 UTC, which included kernel 4.17.7-1-default. There likely were updates offered in between that I didn't see or install.

For future reference, the workaround of logging out of the KDE workspace before hitting the shutdown button in the graphical login screen also required waiting a second or two, during which the hard drive LED would briefly light. If you hit shutdown too soon, the system would freeze, just like before.

I hope this helps.
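
For a quick check of where a given kernel release falls relative to the versions mentioned in this report, a small Python sketch follows. The version boundaries are assumptions drawn only from the comments here (4.14.x reported working, 4.15 through 4.17.7 reported locking up, 4.17.9 reported fixed by one user, 4.17.8 not tested by anyone in this thread), not from an identified fix commit, and the function name is hypothetical.

```python
import re


def classify_kernel(release: str) -> str:
    """Classify a `uname -r` style string against versions reported in this bug."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", release)
    if not match:
        return "unknown"
    version = tuple(int(part) for part in match.groups())
    if version >= (4, 17, 9):
        return "reported fixed"       # per one user's report in comment 22
    if (4, 15, 0) <= version <= (4, 17, 7):
        return "reported affected"    # range covered by reports in this thread
    if (4, 14, 0) <= version < (4, 15, 0):
        return "reported unaffected"  # 4.14.x worked for the reporters
    return "unknown"                  # e.g. 4.17.8 or 4.12.x: no data here


if __name__ == "__main__":
    for release in ("4.14.15-2-default", "4.16.12-2-default", "4.17.9-1-default"):
        print(release, "->", classify_kernel(release))
```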
Comment 23 Jiri Slaby 2018-08-14 06:36:25 UTC
So this is hopefully fixed...