Bug 1170339 - Realtek r8169 Ethernet driver cause 'systemctl suspend' to hang
Status: RESOLVED WONTFIX
Classification: openSUSE
Product: openSUSE Tumbleweed
Component: Kernel
Version: Current
Hardware: Other Other
Priority: P5 - None Severity: Normal
Target Milestone: ---
Assigned To: openSUSE Kernel Bugs
Reported: 2020-04-23 13:02 UTC by Michael Pujos
Modified: 2020-09-10 09:27 UTC (History)
tiwai: needinfo? (pujos.michael)


Attachments
log file with suspend getting stuck (183.59 KB, text/plain)
2020-04-27 18:11 UTC, Michael Pujos
output of hwinfo --netcard (2.59 KB, text/plain)
2020-04-27 18:12 UTC, Michael Pujos

Description Michael Pujos 2020-04-23 13:02:53 UTC
Kernel 5.6.4


After trying to understand why my PC would hang on suspend (fan still running, PC not going to sleep, hard reboot required), I found that it is caused by the r8169 kernel module, which manages this hardware:

01:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 15)
	Subsystem: Realtek Semiconductor Co., Ltd. Device 0123
	Flags: bus master, fast devsel, latency 0, IRQ 16
	I/O ports at e000 [size=256]
	Memory at df204000 (64-bit, non-prefetchable) [size=4K]
	Memory at df200000 (64-bit, non-prefetchable) [size=16K]
	Capabilities: [40] Power Management version 3
	Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
	Capabilities: [70] Express Endpoint, MSI 01
	Capabilities: [b0] MSI-X: Enable+ Count=4 Masked-
	Capabilities: [100] Advanced Error Reporting
	Capabilities: [140] Virtual Channel
	Capabilities: [160] Device Serial Number 01-00-00-00-68-4c-e0-00
	Capabilities: [170] Latency Tolerance Reporting
	Capabilities: [178] L1 PM Substates
	Kernel driver in use: r8169
	Kernel modules: r8169



If I remove that module before suspend, it works just fine.

Is there a known workaround for this (a kernel parameter, perhaps)? If not, what is the procedure for providing additional information to identify the cause in that module, so that it can eventually be fixed?
Comment 1 Michael Pujos 2020-04-23 13:34:30 UTC
For the time being, this service works:


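[Unit]
Description=Hack for suspend r8169 failure
# Run before the system goes to sleep
Before=sleep.target
# Stop this unit (which runs ExecStop) once sleep.target is no longer active, i.e. after resume
StopWhenUnneeded=yes

[Service]
User=root
Type=oneshot
RemainAfterExit=yes
# The leading "-" tells systemd to ignore a non-zero exit status
# Unload the driver before suspend
ExecStart=-/usr/sbin/modprobe -r r8169
# Reload the driver after resume
ExecStop=-/usr/sbin/modprobe r8169

[Install]
WantedBy=sleep.target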
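
A minimal sketch of how a unit like the one above could be installed. The file name r8169-suspend.service and the helper install_unit are hypothetical (the comment does not name the file); /etc/systemd/system is the usual location for locally added units.

```shell
#!/bin/sh
# install_unit is a hypothetical helper, not part of systemd.
# $1: path to the unit file; $2: target directory (defaults to /etc/systemd/system).
install_unit() {
    unit_src=$1
    unit_dir=${2:-/etc/systemd/system}
    # Copy the unit file into the target directory with world-readable permissions
    install -m 0644 "$unit_src" "$unit_dir/"
}
```

To use it, run as root: `install_unit ./r8169-suspend.service`, then `systemctl daemon-reload` and `systemctl enable r8169-suspend.service`. Enabling the unit is what hooks it into sleep.target through the WantedBy= line.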
Comment 2 Takashi Iwai 2020-04-27 17:23:59 UTC
When it hangs, are there any kernel Oops or similar messages left?
Also, is it a regression, i.e. did it work in earlier versions?

The r8169 driver supports a very wide range of devices, so it's hard to identify the problem from the driver name alone.

In any case, please provide the hwinfo output.
Comment 3 Michael Pujos 2020-04-27 18:11:39 UTC
Created attachment 836880 [details]
log file with suspend getting stuck

Output of journalctl:

- booted into multi-user target runlevel
- logged on console
- did a 'systemctl suspend'. It starts at 19:53:27. To my surprise, this suspend worked fine
- I shortly resumed, and it worked fine
- verified that network was up with ping
- then did a new 'systemctl suspend' and it got stuck (PC not going to sleep, fans still on). Starts at 19:55:25, at the very end of the log

So this log shows a working suspend (which I had never seen before) followed by one that fails.
Comment 4 Michael Pujos 2020-04-27 18:12:31 UTC
Created attachment 836881 [details]
output of hwinfo --netcard

If the full hwinfo is needed, let me know.
Comment 5 Michael Pujos 2020-04-27 18:15:40 UTC
There is no Kernel oops.
hwinfo and journalctl output attached.
Googling this issue did not give any results.

Interestingly, the r8168 driver works fine with suspend, but there is no network on resume.
Comment 6 Michael Pujos 2020-04-27 18:26:13 UTC
I cannot tell if this is a regression as I used this hardware for the first time with kernel 5.6.4.
Comment 7 Michael Pujos 2020-04-28 10:19:52 UTC
I have updated to kernel 5.6.6 and the problem persists, which is not a surprise.

However, I noticed that the first suspend is always successful and only the second one gets stuck, with the last lines of the journal always being:


Apr 28 12:10:25 p72 systemd-sleep[14059]: INFO: Skip running /usr/lib/systemd/system-sleep/grub2.sleep for suspend
Apr 28 12:10:25 p72 systemd-sleep[14055]: Suspending system...


If I can add more logging to understand this issue, let me know.

I could also verify that with the r8169 module removed, I can successfully suspend/resume many times (tried 10 times in a row).
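
A small sketch for checking from the journal whether the last suspend attempt ever came back. It assumes that systemd-sleep logs "System resumed." after a successful resume (an assumption to verify on the systemd version in use) and that the journal is persistent, so the boot that hung is still readable after the hard reset. The function name last_suspend_state is hypothetical.

```shell
#!/bin/sh
# Reads journal lines on stdin and prints:
#   hung     if the last "Suspending system..." has no later "System resumed."
#   resumed  if the last suspend was followed by a resume message
#   none     if no suspend or resume message was seen at all
last_suspend_state() {
    awk '
        /Suspending system\.\.\./ { state = "hung" }
        /System resumed\./        { state = "resumed" }
        END { print (state == "" ? "none" : state) }
    '
}
```

After rebooting from a hang, the previous boot can be checked with `journalctl -b -1 -t systemd-sleep | last_suspend_state`.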
Comment 8 Takashi Iwai 2020-05-17 07:55:40 UTC
Could you report this upstream, e.g. at bugzilla.kernel.org?
Feel free to Cc me (tiwai@suse.de) in case assistance is needed from the distro side.  Thanks.

BTW, you might see more messages when you add the "no_console_suspend" boot option.
Boot with that option, go to VT1 via ctrl-alt-F1, login there, and do suspend/resume manually via "systemctl suspend".

Also, some basic techniques for debugging suspend/resume are described in
  https://www.kernel.org/doc/html/latest/power/basic-pm-debugging.html
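
One of the techniques in that document is the /sys/power/pm_test file, available when the kernel is built with CONFIG_PM_DEBUG. A minimal sketch, assuming root privileges; run_pm_test and the POWER_DIR override are hypothetical names, the latter only there so the function can be tried against a scratch directory instead of the real sysfs:

```shell
#!/bin/sh
# run_pm_test LEVEL
# LEVEL is one of the pm_test modes: freezer, devices, platform, processors, core.
# POWER_DIR (hypothetical override) defaults to the real /sys/power.
run_pm_test() {
    level=$1
    power_dir=${POWER_DIR:-/sys/power}
    # Select how far the suspend sequence should go before backing out
    echo "$level" > "$power_dir/pm_test" || return 1
    # Start the suspend; in test mode the kernel stops at the selected
    # level, waits a few seconds, and then resumes on its own
    echo mem > "$power_dir/state" || return 1
    # Switch test mode off again so later suspends are real ones
    echo none > "$power_dir/pm_test"
}
```

For example, `run_pm_test devices` exercises only process freezing and the device suspend/resume callbacks. If that already hangs with r8169 loaded but not with it removed, it would point at the driver's suspend path rather than at the platform firmware.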
Comment 9 Michael Pujos 2020-05-17 10:44:10 UTC
I no longer use the PC with this hardware.
Comment 10 Miroslav Beneš 2020-09-10 09:27:18 UTC
Per comment 9.