Bug 1167632 - GCC 10: kernel-default and other flavors are miscompiled
Status: RESOLVED FIXED
Classification: openSUSE
Product: openSUSE Tumbleweed
Component: Kernel
Version: Current
Hardware: Other Other
Priority: P5 - None
Severity: Normal
Target Milestone: ---
Assigned To: Borislav Petkov
Reported: 2020-03-25 09:10 UTC by Martin Liška
Modified: 2020-06-02 22:16 UTC (History)
Description Martin Liška 2020-03-25 09:10:35 UTC
The root cause is -fstack-protector, which results in the following panic during early boot:

[    6s] ### VM INTERACTION START ###
[    6s] /usr/bin/qemu-kvm -nodefaults -no-reboot -nographic -vga none -cpu host -object rng-random,filename=/dev/random,id=rng0 -device virtio-rng-pci,rng=rng0 -runas qemu -net none -kernel /var/cache/obs/worker/root_7/.mount/boot/kernel -initrd /var/cache/obs/worker/root_7/.mount/boot/initrd -append root=/dev/disk/by-id/virtio-0 rootfstype=ext4 rootflags=noatime ext4.allow_unsupported=1 kpti=off pti=off spectre_v2=off panic=1 quiet no-kvmclock elevator=noop nmi_watchdog=0 rw rd.driver.pre=binfmt_misc console=ttyS0 init=/.build/build -m 8192 -drive file=/var/cache/obs/worker/root_7/root,format=raw,if=none,id=disk,cache=unsafe -device virtio-blk-pci,drive=disk,serial=0 -drive file=/var/cache/obs/worker/root_7/swap,format=raw,if=none,id=swap,cache=unsafe -device virtio-blk-pci,drive=swap,serial=1 -serial stdio -chardev socket,id=monitor,server,nowait,path=/var/cache/obs/worker/root_7/root.qemu/monitor -mon chardev=monitor,mode=readline -smp 8
[    7s] SeaBIOS (version rel-1.12.0-0-ga698c89-rebuilt.opensuse.org)
[    8s] Booting from ROM..
[    8s] [    0.002473] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: start_secondary+0x12b/0x130
[    8s] [    0.002473] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.5.11-5-default #1 openSUSE Tumbleweed (unreleased)
[    8s] [    0.002473] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-0-ga698c89-rebuilt.opensuse.org 04/01/2014
[    8s] [    0.002473] Call Trace:
[    8s] [    0.002473]  dump_stack+0x8b/0xc8
[    8s] [    0.002473]  panic+0x106/0x2ed
[    8s] [    0.002473]  ? start_secondary+0x12b/0x130
[    8s] [    0.002473]  __stack_chk_fail+0x15/0x20
[    8s] [    0.002473]  start_secondary+0x12b/0x130
[    8s] [    0.002473]  secondary_startup_64+0xb6/0xc0
[    8s] [    0.002473] Rebooting in 1 seconds..

Can be seen for example here:
https://build.opensuse.org/package/live_build_log/openSUSE:Factory:Staging:Gcc7/bind/standard/x86_64
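
To check whether other build logs show the same signature, a minimal sketch along the following lines may help. It only matches the panic message format seen in the log above; the function name find_stack_protector_panics and the regular expression are illustrative assumptions, not part of any OBS or kernel tooling.

```python
import re

# Matches the panic line format seen in the log above, e.g.
# "Kernel panic - not syncing: stack-protector: Kernel stack is
#  corrupted in: start_secondary+0x12b/0x130"
PANIC_RE = re.compile(
    r"Kernel panic - not syncing: stack-protector: "
    r"Kernel stack is corrupted in: "
    r"(?P<symbol>[\w.]+)\+(?P<offset>0x[0-9a-fA-F]+)/(?P<size>0x[0-9a-fA-F]+)"
)


def find_stack_protector_panics(log_text):
    """Return (symbol, offset, size) for every stack-protector panic found.

    Hypothetical helper: offset and size are returned as integers parsed
    from the hexadecimal values in the panic message.
    """
    return [
        (m["symbol"], int(m["offset"], 16), int(m["size"], 16))
        for m in PANIC_RE.finditer(log_text)
    ]
```

Fed the text of a build log, it returns one tuple per panic line, so an empty list means the signature was not found.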

Apparently it's a known issue and some basic analysis can be seen here:
https://bugzilla.redhat.com/show_bug.cgi?id=1796780
Comment 1 Martin Liška 2020-03-25 09:32:55 UTC
It's a known issue and there's a patch candidate for it:
https://lkml.org/lkml/2020/3/14/186
Comment 2 Jiri Slaby 2020-03-26 12:37:40 UTC
Boris, could you ping me (clearing needinfo is enough) once something is merged to -tip re "x86: fix early boot crash on gcc-10"?
Comment 3 Borislav Petkov 2020-03-26 13:04:48 UTC
Yah, lemme take over this one. It's the easiest this way.
Comment 4 Martin Liška 2020-04-22 04:25:23 UTC
@Boris: Do you have an estimate of when the fix will land in the openSUSE:Factory kernel (likely as a backport)? We'll need it in order to switch to gcc10 as the Factory default compiler.
Comment 5 Borislav Petkov 2020-04-22 10:29:21 UTC
I don't know whether you're following the upstream thread but the situation got hairy. Lemme ping them and expedite a solution.
Comment 6 Martin Liška 2020-04-22 12:15:56 UTC
(In reply to Borislav Petkov from comment #5)
> I don't know whether you're following the upstream thread but the situation
> got hairy. Lemme ping them and expedite a solution.

I'm aware of an i586-related issue, but it's hard to follow the entire mailing list discussion. The LKML.org web interface provides a poor listing of an email thread.
Comment 7 Martin Liška 2020-05-12 06:57:20 UTC
Hi Boris. Any estimate of when we can get the patch into our openSUSE:Factory/kernel?
Comment 8 Borislav Petkov 2020-05-12 07:53:18 UTC
It is queued here:

https://git.kernel.org/tip/f670269a42bfdd2c83a1118cc3d1b475547eac22

Want me to backport it?
Comment 9 Martin Liška 2020-05-12 08:22:06 UTC
(In reply to Borislav Petkov from comment #8)
> It is queued here:
> 
> https://git.kernel.org/tip/f670269a42bfdd2c83a1118cc3d1b475547eac22
> 
> want me to backport it?

Yes, please do so.
We can get gcc10 in Factory (as system compiler) quite soon (in a week or two from now).
Comment 10 Borislav Petkov 2020-05-12 16:42:41 UTC
Pushed to users/bpetkov/stable/for-next

Closing.
Comment 11 Jiri Slaby 2020-05-12 18:11:47 UTC
https://build.opensuse.org/request/show/803399
Comment 12 Borislav Petkov 2020-05-14 13:48:40 UTC
We will need to redo that - Linus doesn't like this fix, see this thread:

https://lkml.kernel.org/r/CAK8P3a3KpM91%2Bjv6%2B7KSKFRpwLqf38Lz1wbGhkFFyfDb9oahgA@mail.gmail.com

New version I just did is:

https://lore.kernel.org/linux-wireless/20200514132706.GB9266@zn.tnic/

If all goes well, we'll submit on the weekend and it will be in 5.7. It is CCed stable too.
Comment 13 Borislav Petkov 2020-05-18 07:49:11 UTC
And here's the upstream fix:

 3afa06dc1aaf..1adb3637dffc  HEAD -> users/bpetkov/stable/for-next
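
For readers who have not followed the upstream thread: as far as the upstream changelog describes it, start_secondary() re-initialises the stack canary in the middle of the function, and GCC 10 turns its final call to cpu_startup_entry() into a tail call, so the canary check is emitted before the jump and compares the old value with the new one. The Python model below is only a toy illustration of that sequence, not kernel code; the names mirror the kernel functions, but the canary dictionary, the StackProtectorPanic exception and the tail_call_optimized flag are assumptions made for this sketch.

```python
import secrets


class StackProtectorPanic(Exception):
    """Stands in for __stack_chk_fail() ending in panic()."""


# Hypothetical stand-in for the per-CPU stack canary; 0 models the value
# in place before the secondary CPU sets up its own canary.
canary = {"value": 0}


def boot_init_stack_canary():
    # Install a fresh random canary; "| 1" keeps it nonzero so it always
    # differs from the initial value in this model.
    canary["value"] = secrets.randbits(64) | 1


def cpu_startup_entry():
    # The real function never returns; the model just reports arrival.
    return "idle loop entered"


def start_secondary(tail_call_optimized):
    saved = canary["value"]      # prologue: canary copied into the frame
    boot_init_stack_canary()     # canary changes in mid-function
    if tail_call_optimized:
        # With a tail call, the epilogue check runs before the jump and
        # sees a saved value that no longer matches the current canary.
        if saved != canary["value"]:
            raise StackProtectorPanic("Kernel stack is corrupted in: start_secondary")
        return cpu_startup_entry()
    # With a regular call, cpu_startup_entry() never returns in the real
    # kernel, so the epilogue check is never reached.
    return cpu_startup_entry()
```

In this model, start_secondary(tail_call_optimized=False) reaches the idle loop, while start_secondary(tail_call_optimized=True) raises StackProtectorPanic, which mirrors the panic in the description. The upstream fix, as far as I recall it, prevents the tail call in the affected functions rather than changing the canary handling.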
Comment 14 Borislav Petkov 2020-06-02 22:16:29 UTC
Ok, this should be finally done now. Closing.