Bugzilla – Bug 1168661
[Build 20200404] openQA test fails in await_install of installer, fails to reboot on SMP systems
Last modified: 2020-11-04 21:15:07 UTC
openQA test in scenario opensuse-Tumbleweed-DVD-x86_64-install_only@smp_64 fails in await_install:
fails to reboot on SMP systems
Fails reproducibly since Build 20200404 (https://openqa.opensuse.org/tests/1224485), the current build
## Expected result
Last good: (https://openqa.opensuse.org/tests/1223880), previous build
## Further details
Always latest result in this scenario: [latest](https://openqa.opensuse.org/tests/latest?arch=x86_64&distri=opensuse&flavor=DVD&machine=smp_64&test=install_only&version=Tumbleweed)
See https://openqa.opensuse.org/snapshot-changes/opensuse/Tumbleweed/diff/20200404 for changes of this Tumbleweed snapshot.
Judging by the serial log, this looks like a dracut crash:
begin 644 dracut-install.core.pid_13445.sig_11.time_1586122805
I reverted dracut to version 049.1+suse.138.g9068a629 for the time being
CC fbui: I was told systemd 245 would require the dracut update
No, v245 doesn't depend on dracut 050. We just thought it would be a good idea to test both updates in the same staging.
I'll check dracut on factory. Ignore the latest submission.
I can reproduce a segfault in dracut-install, which is used during initramfs creation.
According to git bisect, this upstream commit is the cause:
Author: Böszörményi Zoltán <email@example.com>
Date: Thu Oct 24 11:28:55 2019 +0200
Allow running on a cross-compiled rootfs
Stack exhaustion through recursion.
uas.ko -> usb_storage.ko
#116266 0x0000000000405ddc in install_dependent_modules (modlist=0x643fb0) at install/dracut-install.c:1469
#116267 0x0000000000405e40 in install_dependent_modules (modlist=modlist@entry=0x616ab0) at install/dracut-install.c:1473
#116268 0x0000000000406297 in install_module (mod=mod@entry=0x65aa50) at install/dracut-install.c:1531
#116269 0x00000000004068ba in install_modules (argc=argc@entry=1, argv=argv@entry=0x7fffffffdd30) at install/dracut-install.c:1827
#116270 0x0000000000402ef9 in main (argc=<optimized out>, argv=0x7fffffffdce8) at install/dracut-install.c:2017
caused by these lines in /etc/modprobe.d/00-system.conf:
# uas devices can be unpredictably a fallback for both drivers must be present
softdep usb_storage pre: uas
softdep uas pre: usb_storage
(These lines create a questionable circular dependency, but dracut has handled such cycles so far and should continue to do so.) Can anyone make sense of the comment, though?
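To illustrate the failure mode, here is a minimal Python sketch of a dependency walk over the two softdep lines above. It is not dracut's code: the table `PRE_DEPS` and the functions `collect_naive`, `collect_guarded` and `naive_overflows` are hypothetical names made up for this example. The unguarded walk recurses until the interpreter gives up (the Python counterpart of the stack exhaustion in the backtrace), while remembering the modules already visited makes the same walk terminate.

```python
# Hypothetical dependency table mirroring the two softdep lines above.
PRE_DEPS = {
    "usb_storage": ["uas"],
    "uas": ["usb_storage"],
}


def collect_naive(module, deps, out):
    """Walk dependencies with no cycle guard; never returns on a cycle."""
    out.append(module)
    for dep in deps.get(module, []):
        collect_naive(dep, deps, out)


def collect_guarded(module, deps, seen=None):
    """Same walk, but every module is visited at most once."""
    if seen is None:
        seen = []
    if module in seen:
        return seen
    seen.append(module)
    for dep in deps.get(module, []):
        collect_guarded(dep, deps, seen)
    return seen


def naive_overflows(module, deps):
    """Return True if the unguarded walk exhausts the recursion limit."""
    try:
        collect_naive(module, deps, [])
    except RecursionError:
        return True
    return False


if __name__ == "__main__":
    print("naive walk overflows:", naive_overflows("uas", PRE_DEPS))
    print("guarded walk visits:", collect_guarded("uas", PRE_DEPS))
```

In C there is no RecursionError to catch, so the equivalent unguarded recursion ends in a segfault once the stack is used up.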
The generic modaliases
are supported by both uas and usb-storage. Theoretically "ip62" is UAS (https://superuser.com/questions/928741/how-can-i-check-whether-usb3-0-uasp-usb-attached-scsi-protocol-mode-is-enabled), but I guess there are lots of broken disks around which pretend to speak UAS but can't, so there must be a fallback to usb-storage. There's also the "usb-storage.quirks" module parameter.
The softdeps were introduced in bug 862397 to make sure both drivers are loaded. They don't seem to be necessary any more on modern SUSE systems. uas has a hard dependency on usb-storage, so it's indeed non-obvious why the softdep uas->usb-storage was ever needed.
Wrt the reverse dependency, the worst thing that can happen when we remove it is that some disks perform sub-optimally with the usb-storage driver, AFAICS.
I'll remove the uas->usb-storage dependency in suse-module-tools, and convert the usb-storage->uas dep into a "softdep post". That should eliminate the circular dependency in this case (there may be other similar cases though).
dracut ignores "softdep post" dependencies currently. So if a customer generates an initrd with a USB storage device connected via usb-storage, uas will not be packaged in the initrd. If the system is booted with this initrd later, and another, UAS-capable disk is attached without explicitly loading uas, it *might* happen that uas is not auto-loaded and the new disk may perform worse than expected. But this is a corner case, so this potential regression is justified, as the change eliminates an inconsistent configuration with a circular dependency.
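Since there may be other similar cases, a quick check for circular "pre" softdeps could look roughly like the Python sketch below. It assumes the `softdep <module> pre: ... post: ...` syntax from modprobe.d(5) and treats `-` and `_` in module names as equivalent. The helper names (`normalize`, `parse_softdeps`, `has_pre_cycle`) are hypothetical, and the `PROPOSED` line is only my reading of the change described above, not the text that was actually committed to suse-module-tools.

```python
ORIGINAL = """
# uas devices can be unpredictably a fallback for both drivers must be present
softdep usb_storage pre: uas
softdep uas pre: usb_storage
"""

# Assumed form of the proposed replacement (not copied from the real package).
PROPOSED = """
softdep usb-storage post: uas
"""


def normalize(name):
    """modprobe treats '-' and '_' in module names as the same character."""
    return name.replace("-", "_")


def parse_softdeps(text):
    """Return {module: {"pre": [...], "post": [...]}} for all softdep lines."""
    result = {}
    for raw in text.splitlines():
        tokens = raw.split("#", 1)[0].split()  # drop comments, then tokenize
        if len(tokens) < 2 or tokens[0] != "softdep":
            continue
        entry = result.setdefault(normalize(tokens[1]), {"pre": [], "post": []})
        kind = None
        for tok in tokens[2:]:
            if tok in ("pre:", "post:"):
                kind = tok[:-1]
            elif kind is not None:
                entry[kind].append(normalize(tok))
    return result


def has_pre_cycle(softdeps):
    """Depth-first search for a cycle among the 'pre' dependencies only."""
    graph = {mod: entry["pre"] for mod, entry in softdeps.items()}
    state = {}  # "active" while on the current path, "done" afterwards

    def visit(mod):
        if state.get(mod) == "active":
            return True
        if state.get(mod) == "done":
            return False
        state[mod] = "active"
        found = any(visit(dep) for dep in graph.get(mod, []))
        state[mod] = "done"
        return found

    return any(visit(mod) for mod in list(graph))


if __name__ == "__main__":
    print("original has pre cycle:", has_pre_cycle(parse_softdeps(ORIGINAL)))
    print("proposed has pre cycle:", has_pre_cycle(parse_softdeps(PROPOSED)))
```

Only "pre" edges are followed here because, as noted above, dracut currently ignores "softdep post" entries; a stricter check could walk both kinds.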
This is an autogenerated message for OBS integration:
This bug (1168661) was mentioned in
https://build.opensuse.org/request/show/794962 Factory / suse-module-tools
Fixed in suse-module-tools. Closing.
Can we reinstate the test? I'll reopen for now.
Test is succeeding again. Closing.
Why was the test disabled even though the last run succeeded?
(In reply to Daniel Molkentin from comment #14)
> Why was the test disabled even though the last run succeeded?
I do not understand why you think the test would be disabled. The description mentions the "Always latest result in this scenario" which currently links to https://openqa.opensuse.org/tests/1282620 from 3 days ago. The test scenario is enabled on Tumbleweed and triggered on every new Tumbleweed snapshot. Please keep in mind that old job results are deleted after some time.