Bug 1104833 - Initramfs unpacking failed: junk in compressed archive
Status: RESOLVED FIXED
Duplicates: 1101194
Classification: openSUSE
Product: openSUSE Tumbleweed
Component: Kernel
Version: Current
Hardware: armv7 openSUSE Factory
Priority: P5 - None
Severity: Normal
Assigned To: Michal Hocko
Blocks: 1122614 1145646
Reported: 2018-08-14 16:49 UTC by Matwey Kornilov
Modified: 2022-07-21 17:24 UTC (History)

Description Matwey Kornilov 2018-08-14 16:49:17 UTC
Hello,

I am trying to run openSUSE-Tumbleweed-ARM-JeOS-beaglebone.armv7l-2018.08.13-Build1.1 on a BeagleBone Black board.

I see the following after Grub2:

Loading kernel...
Loading initrd...
[    0.005152] timer_probe: no matching timers found
[    0.424134] Initramfs unpacking failed: junk in compressed archive
[    0.757548] Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
[    0.765875] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.17.13-1-default #1 openSUSE Tumbleweed (unreleased)
[    0.775657] Hardware name: Generic AM33XX (Flattened Device Tree)
[    0.781830] [<c0321000>] (unwind_backtrace) from [<c0319640>] (show_stack+0x20/0x28)
[    0.789618] [<c0319640>] (show_stack) from [<c0de3998>] (dump_stack+0xb8/0xe4)
[    0.796886] [<c0de3998>] (dump_stack) from [<c036c1a0>] (panic+0xf0/0x28c)
[    0.803806] [<c036c1a0>] (panic) from [<c15018ec>] (mount_block_root+0x26c/0x31c)
[    0.811328] [<c15018ec>] (mount_block_root) from [<c1501a30>] (mount_root+0x94/0x98)
[    0.819111] [<c1501a30>] (mount_root) from [<c1501b94>] (prepare_namespace+0x160/0x1a8)
[    0.827155] [<c1501b94>] (prepare_namespace) from [<c1501420>] (kernel_init_freeable+0x42c/0x440)
[    0.836075] [<c1501420>] (kernel_init_freeable) from [<c0df8a44>] (kernel_init+0x18/0x128)
[    0.844382] [<c0df8a44>] (kernel_init) from [<c03010ac>] (ret_from_fork+0x14/0x28)
[    0.851984] Exception stack(0xdb137fb0 to 0xdb137ff8)
[    0.857057] 7fa0:                                     00000000 00000000 00000000 00000000
[    0.865272] 7fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[    0.873487] 7fe0: 00000000 00000000 00000000 00000000 00000013 00000000
[    0.880154] Rebooting in 90 seconds..
Comment 1 Andreas Färber 2018-08-15 12:18:28 UTC
Sounds like the dracut initrd is too large for the BB's RAM?
Comment 2 Matwey Kornilov 2018-08-15 15:22:11 UTC
I would not say so.

I took an openSUSE Leap 15.0 image and installed the 4.17.14 kernel from RPM, so the KIWI part of dracut was omitted.
The initrd size is about 14 MB:

13749120 Aug 14 20:31 initrd-4.17.14-1.gdf1fc0f-default
Comment 3 Guillaume GARDET 2018-08-16 09:26:28 UTC
I reproduced this problem on the BeagleBone, as you did, and also on the BeagleBoard xM (openSUSE-Tumbleweed-ARM-JeOS-beagle.armv7l-2018.08.13-Build1.1.raw).

Raspberrypi2 (openSUSE-Tumbleweed-ARM-JeOS-raspberrypi2.armv7l-2018.08.13-Build1.1.raw) and Arndale (openSUSE-Tumbleweed-ARM-JeOS-arndale.armv7l-2018.08.13-Build1.1.raw, with a little fix on u-boot to get bootefi command) are fine.
Comment 4 Alexander Graf 2018-08-16 11:27:06 UTC
That sounds to me as if grub puts the initrd in a place that by accident gets overwritten by kernel contents later.

I think the next steps would be to

  a) figure out all addresses the initrd and kernel get written to by grub
  b) check the initrd integrity in RAM at various points, see what it got overwritten with

I think (a) can be done by just enabling debug output in grub (set debug=all on the grub cmdline, IIRC). For (b) we probably need a JTAG adapter that allows us to read the physical memory at various points.
Comment 5 Andreas Färber 2018-08-16 11:40:26 UTC
I also ran into such symptoms on my Udoo Neo some weeks back (512 MiB RAM only).

Could these 32-bit platforms be lacking some memory reservation at U-Boot level?
Comment 6 Alexander Graf 2018-08-16 11:49:09 UTC
I think the most likely thing happening is that something goes wrong with the switch from EFI land to legacy Linux boot. During EFI (U-Boot, grub) we have full awareness of which memory is in use. When the actual Linux boot does get triggered though, we're going back to the legacy Linux boot protocol which has no such awareness.
Comment 7 Guillaume GARDET 2018-08-16 12:24:40 UTC
(In reply to Alexander Graf from comment #4)
> That sounds to me as if grub puts the initrd in a place that by accident
> gets overwritten by kernel contents later.
> 
> I think the next steps would be to
> 
>   a) figure out all addresses the initrd and kernel get written to by grub

By adding 'set debug=all' to grub, I got the wanted addresses on BeagleBone Black:
  ??, er/arm/linux.c:238: atag: 0x87ff1000, e0, edfe0dd0, ?
  loader/arm/linux.c:246: Kernel at: 0x80008000
  loader/arm/linux.c:184: linux_args:
  'BOOT_IMAGE=(hd0,gpt2)/boot/zImage-4.17.13-1-default loglevel=3 splash=silent
  plymouth.enable=0 console=ttyS0,115200n8
  root=UUID=7f80b027-9e79-459b-8ed5-52c8cc8aef29 rw'
  loader/arm/linux.c:199: Initrd @ 0x82008000-0x8369fe48
  loader/arm/linux.c:215: FDT updated for Linux boot
  loader/arm/linux.c:255: FDT @ 0x0x87ff1000
Comment 8 Guillaume GARDET 2018-08-27 15:52:16 UTC
Sabrelite (1G of RAM) is also affected.

Tested with openSUSE-Tumbleweed-ARM-LXQT-sabrelite.armv7l-2018.08.13-Snapshot20180822.raw
Comment 9 Guillaume GARDET 2018-08-27 16:43:13 UTC
It looks like kernel-lpae is working fine whereas kernel-default is broken.
Comment 10 Guillaume GARDET 2018-09-05 15:21:17 UTC
More tests on Raspberry Pi 2 (armv7):
* kernel-lpae 4.17.14 and 4.18.5 are booting fine
* kernel-default 4.17.14 and 4.18.5 are broken
* kernel-default 4.12.14 (taken from leap) does boot

So, this is definitely kernel related.
Comment 11 Guillaume GARDET 2018-09-10 16:36:25 UTC
More tests.

kernel-default 4.18.7 from OBS (Kernel:stable commit 952d850f3777feaeb7b647ccad11f8f525fd8e8c) is broken.

But the same kernel, with the same config (from kernel and kernel-source git repos, stable branch), cross-compiled with [1], does boot!

[1]: arm-suse-linux-gnueabi-gcc (SUSE Linux) 8.2.1 20180817 [gcc-8-branch revision 263612] (same gcc version used inside OBS, in native compilation)
Comment 12 Andreas Färber 2018-09-10 17:20:50 UTC
(In reply to Guillaume GARDET from comment #11)
> More tests.
> 
> kernel-default 4.18.7 from OBS (Kernel:stable commit
> 952d850f3777feaeb7b647ccad11f8f525fd8e8c) is broken.
> 
> But the same kernel, with the same config (from kernel and kernel-source git
> repos, stable branch), cross-compiled with [1], does boot!
> 
> [1]: arm-suse-linux-gnueabi-gcc (SUSE Linux) 8.2.1 20180817 [gcc-8-branch
> revision 263612] (same gcc version used inside OBS, in native compilation)

That compiler defaults to ARMv6 whereas the native armv7hl gcc defaults to ARMv7.

Are you sure there are no differences in initrd sizes or other non-kernel factors? I'd expect kernel-lpae('s initrd) to be smaller than kernel-default('s initrd).
Comment 13 Matwey Kornilov 2018-09-10 17:30:58 UTC
Guillaume,

Since the issue is reproduced with qemu-system-arm, could you try to use gdb server provided by qemu?
Comment 14 Guillaume GARDET 2018-09-11 06:41:56 UTC
(In reply to Andreas Färber from comment #12)
> 
> That compiler defaults to ARMv6 whereas the native armv7hl gcc defaults to
> ARMv7.

Interesting.


> Are you sure there's no differences in initrd sizes or other non-kernel
> factors? I'd expect kernel-lpae('s initrd) to be smaller than
> kernel-default('s initrd).

No factor other than the kernel, because I only updated the zImage kernel on my SD card. So the initrd remains the same.
Comment 15 Guillaume GARDET 2018-10-10 15:44:32 UTC
With more debugging, the problem turns out to be a bad magic number read from the first 2 bytes of the initrd:
  https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/lib/decompress.c?h=v4.18.13#n69

We do not get 0xFD 0x37 as expected for XZ, so no decompressor matches and the initrd cannot be used.

The pointer to initrd is consistent across working and broken kernel: 0xC2008000 (on Beagleboard xM).

I do not know why the read is broken.
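
For illustration, here is a minimal Python sketch of the two-byte magic check described above. It is modelled on the format table in lib/decompress.c, but it is not the kernel code; the names MAGIC_TABLE and detect_format are made up for this example.

```python
# Two-byte magic numbers, modelled on the compressed_formats[] table in
# lib/decompress.c. The Python names here are illustrative only.
MAGIC_TABLE = [
    (b"\x1f\x8b", "gzip"),
    (b"\x1f\x9e", "gzip"),
    (b"\x42\x5a", "bzip2"),
    (b"\x5d\x00", "lzma"),
    (b"\xfd\x37", "xz"),
    (b"\x89\x4c", "lzo"),
    (b"\x02\x21", "lz4"),
]


def detect_format(buf):
    """Return the compression name for buf, or None if no magic matches."""
    if len(buf) < 2:
        return None
    for magic, name in MAGIC_TABLE:
        if buf[:2] == magic:
            return name
    return None


print(detect_format(bytes.fromhex("fd377a585a00")))  # start of an XZ stream
print(detect_format(bytes.fromhex("0000")))          # no known magic
```

When the first two bytes are anything other than a known magic, the lookup returns nothing, which matches the "junk in compressed archive" symptom seen here.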
Comment 16 Guillaume GARDET 2018-10-11 13:24:48 UTC
As a workaround, we can use the following config change:
  -CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE=y
  -# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set
  +# CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE is not set
  +CONFIG_CC_OPTIMIZE_FOR_SIZE=y

With this change, the system handles the initrd properly on armv7 with kernel-default (and kernel-vanilla).

The current GCC8 from Tumbleweed (8.2.1+r264010) probably has some wrong optimizations for armv7.
Comment 17 Andreas Färber 2018-10-11 13:31:02 UTC
You say the pointer is the same for the initrd, but doesn't the size optimization reduce the size of the kernel? Can you compare the addresses and numbers there? I'd still assume something overwrites the initrd, not miscompiles the kernel.
Comment 18 Matwey Kornilov 2018-10-11 17:03:40 UTC
It can be checked by running kernel and initrd directly as qemu arguments:

qemu -kernel ... -initrd ...
Comment 19 Guillaume GARDET 2018-10-12 19:43:00 UTC
On a broken kernel (zImage-4.18.13-0.gc434d5c-default) on the Beagleboard xM, I dumped what is at initrd place (phys: 0x82008000, virt: 0xC2008000) and I get:
  5F C3 77 C8 1D 12 5F 0A C7 8C 9A 0B 32 B8 3A 97 
  BC 07 F5 DE 40 EE 93 A0 17 FC DB A3 46 B1 2A 35 
  53 C5 01 53 2D B5 56 38 3E 2B 17 69 B9 C7 42 EC 

Which is a part of the kernel (zImage):
  00787ec0  5f c3 77 c8 1d 12 5f 0a  c7 8c 9a 0b 32 b8 3a 97  |_.w..._.....2.:.|
  00787ed0  bc 07 f5 de 40 ee 93 a0  17 fc db a3 46 b1 2a 35  |....@.......F.*5|
  00787ee0  53 c5 01 53 2d b5 56 38  3e 2b 17 69 b9 c7 42 ec  |S..S-.V8>+.i..B.|

Kernels of size 8616888 bytes (zImage-4.17.13-1-default) and 8736648 bytes (zImage-4.18.13-0.gc434d5c-default) are both broken.

But a cross-compiled kernel (see comment #11), 8713240 bytes long, works fine!
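
The comparison above can be automated. The following Python sketch searches a binary image for the bytes read back from the initrd address; the helper name find_all and the file name in the commented usage are assumptions, not something used in this report.

```python
def find_all(haystack, needle):
    """Return every offset at which needle occurs in haystack."""
    offsets = []
    pos = haystack.find(needle)
    while pos != -1:
        offsets.append(pos)
        pos = haystack.find(needle, pos + 1)
    return offsets


# First 16 bytes read back from the initrd address (taken from the dump above).
dumped = bytes.fromhex("5F C3 77 C8 1D 12 5F 0A C7 8C 9A 0B 32 B8 3A 97")

# Self-contained demo on synthetic data: the pattern sits 32 bytes into the image.
image = bytes(32) + dumped + bytes(8)
print([hex(offset) for offset in find_all(image, dumped)])

# Hypothetical use against a real file (the path is an assumption):
# with open("zImage", "rb") as f:
#     print([hex(offset) for offset in find_all(f.read(), dumped)])
```

A hit inside the zImage, as in the hexdump above, indicates that the initrd region now holds kernel image data rather than the XZ stream.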
Comment 20 Andreas Färber 2018-10-12 20:29:32 UTC
The native compiler will likely enable PIE (gcc-PIE package confirmed installed), whereas a cross-compiler will probably still default to non-PIE mode?
Comment 21 Guillaume GARDET 2018-10-15 07:47:14 UTC
(In reply to Andreas Färber from comment #20)
> The native compiler will likely enable PIE (gcc-PIE package confirmed
> installed), whereas a cross-compiler will probably still default to non-PIE
> mode?

Apparently, the kernel disables PIE with '-fno-PIE': https://kernel.opensuse.org/cgit/kernel/tree/Makefile?h=stable#n522
Comment 22 Andreas Färber 2018-10-15 10:37:32 UTC
(In reply to Guillaume GARDET from comment #21)
> Apparently, Kernel disable PIE usage with '-fno-PIE':
> https://kernel.opensuse.org/cgit/kernel/tree/Makefile?h=stable#n522

And there have been problems before, cf. bug #1092456. Did you check that it is actually used for e.g. the decompressor before the actual kernel?

Only other thing I could think of was KASLR, but I couldn't find an option in armv6hl or armv7hl.
Comment 23 Andreas Färber 2018-10-15 10:44:46 UTC
Could this stack protector fix be related?
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7bbaf27d9c83
Comment 24 Guillaume GARDET 2018-10-15 15:47:04 UTC
(In reply to Andreas Färber from comment #23)
> Could this stack protector fix be related?
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/
> ?id=7bbaf27d9c83

I tried reverting this patch, but the behavior is the same, except for the data values.
Read values:
  FE 02 BC 56 7B C3 C2 BC F2 D7 C5 3A AC FE 79 0C 
  13 9C 67 FC 22 5C EB DA 47 8C C2 EB EF B7 F3 46 
  7A 6E 32 CC 7F CE CC 14 CA BB DA B2 27 82 4C 3A 

Partial hexdump of zImage:
  00787ec0  fe 02 bc 56 7b c3 c2 bc  f2 d7 c5 3a ac fe 79 0c  |...V{......:..y.|
  00787ed0  13 9c 67 fc 22 5c eb da  47 8c c2 eb ef b7 f3 46  |..g."\..G......F|
  00787ee0  7a 6e 32 cc 7f ce cc 14  ca bb da b2 27 82 4c 3a  |zn2.........'.L:|
Comment 25 Guillaume GARDET 2018-10-17 06:45:28 UTC
(In reply to Andreas Färber from comment #22)
> (In reply to Guillaume GARDET from comment #21)
> > Apparently, Kernel disable PIE usage with '-fno-PIE':
> > https://kernel.opensuse.org/cgit/kernel/tree/Makefile?h=stable#n522
> 
> And there have been problems before, cf. bug #1092456. Did you check that it
> is actually used for e.g. the decompressor before the actual kernel?

I tried to remove the gcc-PIE package from the OBS build, but the problem remains the same.
Comment 26 Guillaume GARDET 2018-10-17 16:17:26 UTC
More testing on Beagleboard xM.
Grub debug output:
  ??, er/arm/linux.c:238: atag: 0x88000000, 90, edfe0dd0, ?
  loader/arm/linux.c:246: Kernel at: 0x80008000
  loader/arm/linux.c:184: linux_args: 'BOOT_IMAGE=(hd0,gpt2)/boot/zImage-ko
  loglevel=7 splash=silent plymouth.enable=0 console=ttyS2,115200n8 vram=16M
  root=UUID=4bc2d416-72ae-40a1-a5f5-68f2786cb2c3 rw'
  loader/arm/linux.c:199: Initrd @ 0x82008000-0x83618d20
  loader/arm/linux.c:215: FDT updated for Linux boot
  loader/arm/linux.c:255: FDT @ 0x0x88000000
  loader/arm/linux.c:267: Jumping to Linux...

So Linux agrees on the initrd addresses, finds the FDT, and boots the kernel, but fails only on the (overwritten) initrd:
  [    0.000000] OF: fdt: Machine model: TI OMAP3 BeagleBoard xM
  [    0.000000] Looking for initrd properties... 
  [    0.000000] initrd_start=0x82008000  initrd_end=0x83618d20

Moreover, if I reset the board while Grub is copying the initrd (so that it never jumps to Linux, but after Grub has already loaded the zImage), 'md.b 0x82008000' from U-Boot shows that the initrd is in place:
  82008000: fd 37 7a 58 5a 00 00 01 69 22 de 36 04 c0 e4 f9    .7zXZ...i".6....
  82008010: 4f 80 80 c0 01 21 01 10 00 00 00 00 79 51 f7 52    O....!......yQ.R
  82008020: e2 70 03 ef fe 5d 00 18 0d dd 04 63 9c 04 52 84    .p...].....c..R.
  82008030: bc 50 ed 27 41 23 55 a4 28 26 2a ff 95 96 5a 82    .P.'A#U.(&*...Z

So something copies the zImage kernel after this point, overwriting the initrd.
Comment 27 Matwey Kornilov 2018-10-17 16:31:22 UTC
Could you try to use u-boot to load linux and initrd directly instead of running Grub? Just to check.
Comment 28 Guillaume GARDET 2018-10-17 19:11:12 UTC
(In reply to Matwey Kornilov from comment #27)
> Could you try to use u-boot to load linux and initrd directly instead of
> running Grub? Just to check.

Yes, it does work.
On Beagleboard xM, I use in U-Boot:
  setenv bootargs "loglevel=7 debug plymouth.enable=0 console=ttyS2,115200n8 vram=16M root=UUID=5117ba96-be32-46a9-b3d3-d0d77f3f506d rw"
  load mmc 0:2 0x80008000 /boot/zImage
  load mmc 0:2 0x88000000 boot/dtb/$fdtfile
  load mmc 0:2 0x82008000 /boot/initrd
  setenv rd_filesize 0x${filesize}
  bootz 0x80008000 0x82008000:${rd_filesize} ${fdtaddr}

And I get:
  ## Flattened Device Tree blob at 88000000
     Booting using the fdt blob at 0x88000000
     Loading Ramdisk to 8c815000, end 8deed160 ... OK
     Using Device Tree in place at 88000000, end 8801c52b

  Starting kernel ...

The boot then proceeds all the way to the login prompt.
Comment 29 Guillaume GARDET 2018-10-17 19:28:38 UTC
(In reply to Guillaume GARDET from comment #28)
>      Loading Ramdisk to 8c815000, end 8deed160 ... OK

Note that standalone U-Boot relocates the ramdisk.
The kernel confirms this new address:
  [    0.000000] Looking for initrd properties... 
  [    0.000000] initrd_start=0x8c815000  initrd_end=0x8deed160
Comment 30 Guillaume GARDET 2018-10-17 19:35:09 UTC
(In reply to Guillaume GARDET from comment #29)
> (In reply to Guillaume GARDET from comment #28)
> >      Loading Ramdisk to 8c815000, end 8deed160 ... OK
> 
> Note that standalone U-Boot move the Ramdisk place.
> The kernel confirms this new address:
>   [    0.000000] Looking for initrd properties... 
>   [    0.000000] initrd_start=0x8c815000  initrd_end=0x8deed160

So, if I set initrd_high to 0xffffffff in U-Boot to avoid relocating the initrd, I get exactly the same behavior as with Grub, and the boot fails.
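
To make the difference between the two placements concrete, this Python sketch computes how far each initrd start address lies from the kernel load address, using only addresses quoted in this report. The constant and function names are illustrative.

```python
KERNEL_LOAD = 0x80008000       # zImage load address (Grub log and U-Boot commands)
INITRD_FIXED = 0x82008000      # initrd address with Grub, or with initrd_high=0xffffffff
INITRD_RELOCATED = 0x8C815000  # initrd address chosen by standalone U-Boot

MIB = 1024 * 1024


def gap_mib(initrd_start, kernel_start=KERNEL_LOAD):
    """Distance in MiB from the kernel load address to the initrd start."""
    return (initrd_start - kernel_start) / MIB


print(f"fixed placement:     {gap_mib(INITRD_FIXED):.2f} MiB")
print(f"relocated placement: {gap_mib(INITRD_RELOCATED):.2f} MiB")
```

The fixed placement leaves 32 MiB between the kernel load address and the initrd, while the relocated one is roughly 200 MiB away, so the failing and working cases differ only in how close the initrd sits to the kernel.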
Comment 31 Guillaume GARDET 2018-10-19 15:57:19 UTC
I tested a few kernel versions and 4.17-rc6 was working whereas 4.17-rc7 was broken. So, I did a git bisect which pointed to commit d883c6cf3b39f1f42506e82ad2779fb88004acf3 (which is a revert):
  Revert "mm/cma: manage the memory of the CMA area by using the ZONE_MOVABLE"
  https://kernel.opensuse.org/cgit/kernel/commit/?h=stable&id=d883c6cf3b39f1f42506e82ad2779fb88004acf3

as the first bad commit.
Comment 32 Matwey Kornilov 2018-10-19 16:04:44 UTC
Thanks, Guillaume!
Comment 33 Andreas Färber 2018-10-19 18:20:25 UTC
(In reply to Guillaume GARDET from comment #31)
> I tested a few kernel versions and 4.17-rc6 was working whereas 4.17-rc7 was
> broken. So, I did a git bisect which pointed to commit
> d883c6cf3b39f1f42506e82ad2779fb88004acf3 (which is a revert):
>   Revert "mm/cma: manage the memory of the CMA area by using the
> ZONE_MOVABLE"
>  
> https://kernel.opensuse.org/cgit/kernel/commit/
> ?h=stable&id=d883c6cf3b39f1f42506e82ad2779fb88004acf3
> 
> as the first bad commit.

Thanks for bisecting this, Guillaume!

Michal, can you take a look please why the commit could break 32-bit arm kernels? Might this affect aarch64 too under some circumstances?
Comment 34 Michal Hocko 2018-10-22 07:36:33 UTC
(In reply to Andreas Färber from comment #33)
> (In reply to Guillaume GARDET from comment #31)
> > I tested a few kernel versions and 4.17-rc6 was working whereas 4.17-rc7 was
> > broken. So, I did a git bisect which pointed to commit
> > d883c6cf3b39f1f42506e82ad2779fb88004acf3 (which is a revert):
> >   Revert "mm/cma: manage the memory of the CMA area by using the
> > ZONE_MOVABLE"
> >  
> > https://kernel.opensuse.org/cgit/kernel/commit/
> > ?h=stable&id=d883c6cf3b39f1f42506e82ad2779fb88004acf3
> > 
> > as the first bad commit.
> 
> Thanks for bisecting this, Guillaume!
> 
> Michal, can you take a look please why the commit could break 32-bit arm
> kernels? Might this affect aarch64 too under some circumstances?

I have a very vague recollection there was some strange fallout from the revert but I cannot find it right now. Is the issue reproducible with the current upstream code?
Comment 35 Guillaume GARDET 2018-10-23 12:22:40 UTC
(In reply to Michal Hocko from comment #34)
> I have a very vague recollection there was some strange fallout from the
> revert but I cannot find it right now. Is the issue reproducible with the
> current upstream code?

I have just tested kernel-default 4.19.0 from Kernel:HEAD project in OBS and the problem is still there.
Comment 36 Michal Hocko 2018-10-23 12:30:43 UTC
(In reply to Guillaume GARDET from comment #35)
> (In reply to Michal Hocko from comment #34)
> > I have a very vague recollection there was some strange fallout from the
> > revert but I cannot find it right now. Is the issue reproducible with the
> > current upstream code?
> 
> I have just tested kernel-default 4.19.0 from Kernel:HEAD project in OBS and
> the problem is still there.

Could you report upstream then, please? Feel free to CC me. Having Joonsoo Kim, Laura Abbott and arch maintainers would be a good start as well.

Thanks!
Comment 37 Guillaume GARDET 2018-10-23 13:27:36 UTC
(In reply to Michal Hocko from comment #36)
> Could you report upstream then, please? Feel free to CC me. Having Joonsoo
> Kim, Laura Abbott and arch maintainers would be a good start as well.
> 
> Thanks!

I reported it upstream: https://bugzilla.kernel.org/show_bug.cgi?id=201495
Comment 38 Michal Hocko 2018-10-23 17:20:18 UTC
(In reply to Guillaume GARDET from comment #37)
> (In reply to Michal Hocko from comment #36)
> > Could you report upstream then, please? Feel free to CC me. Having Joonsoo
> > Kim, Laura Abbott and arch maintainers would be a good start as well.
> > 
> > Thanks!
> 
> I reported it upstream: https://bugzilla.kernel.org/show_bug.cgi?id=201495

Can we switch to email, please (CC linux-mm, linux-kernel and the arch list)? In my experience this has a better chance of getting attention, at least for MM issues. But maybe arm works differently in that regard.
Comment 39 Guillaume GARDET 2018-10-26 08:57:57 UTC
Just to be sure, I built latest Kernel:stable (4.19.0) for Leap 15.0 (which uses GCC7). The kernel is broken in the same way.
Comment 40 Guillaume GARDET 2018-10-26 13:44:20 UTC
I think I have found why the armv7l kernel now fails: the kernel is simply too big.
IIUC, the zImage is relocated and decompressed after the initial zImage space.

Currently there are 32 MB between the start of the zImage and the start of the initrd. So the size of the zImage + the size of the uncompressed image (vmlinux) + a little bit of malloc space must fit in 32 MB.

For Leap 15.0, kernel-default is 7.7 + 24.2 = 31.9 MB, so it is fine.
For kernel-default-4.18.15-1.2.armv7hl (current TW version) we have 8.3 + 25.9 = 34.2 MB, which is too large.
For kernel-default 4.19.0 from Kernel:stable it comes to 8.5 + 26.4 = 34.9 MB.

The solutions are to:
* get a smaller kernel
* or patch Grub (or U-Boot, depending on the board) to load the initrd at a higher address

I think it would be better to get a smaller kernel. Optimizing for size instead of performance on armv7 would be the quickest way to get one.
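
The size budget above can be written down as a small Python check. This is a simplified sketch: it ignores the extra malloc space and the exact decompressor layout, and the names GAP_MB, KERNELS and headroom_mb are made up for the example. The sizes are the ones quoted in this comment.

```python
GAP_MB = 32.0  # distance from the start of the zImage to the start of the initrd

# (zImage size, uncompressed vmlinux size) in MB, as quoted above.
KERNELS = {
    "Leap 15.0 kernel-default": (7.7, 24.2),
    "kernel-default 4.18.15-1.2 (TW)": (8.3, 25.9),
    "kernel-default 4.19.0 (Kernel:stable)": (8.5, 26.4),
}


def headroom_mb(zimage_mb, vmlinux_mb, gap_mb=GAP_MB):
    """Space left below the initrd; a negative value means the initrd is overwritten."""
    return gap_mb - (zimage_mb + vmlinux_mb)


for name, (zimage_mb, vmlinux_mb) in KERNELS.items():
    room = headroom_mb(zimage_mb, vmlinux_mb)
    verdict = "fits" if room >= 0 else "overlaps the initrd"
    print(f"{name}: {room:+.1f} MB ({verdict})")
```

Only the Leap 15.0 kernel stays within the 32 MB window; both newer kernels exceed it by more than 2 MB even before the malloc space is counted.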
Comment 41 Matwey Kornilov 2018-10-26 16:28:25 UTC
...or review the existing default config for armv7l and turn some built-in options (=y) into modules (=m).
Comment 42 Guillaume GARDET 2018-11-30 14:03:48 UTC
I fixed the problem (for -stable branch) with the patch posted to opensuse-kernel ML: https://lists.opensuse.org/opensuse-kernel/2018-11/msg00000.html

Kernel 4.19.5 with this config patch boots properly on Pandaboard.
Comment 43 Andreas Färber 2018-12-11 17:12:30 UTC
(In reply to Guillaume GARDET from comment #42)
> I fixed the problem (for -stable branch) with the patch posted to
> opensuse-kernel ML:
> https://lists.opensuse.org/opensuse-kernel/2018-11/msg00000.html
> 
> With kernel 4.19.5 with this config patch, kernel boots properly on
> Pandaboard.

This 4.19.x patch has now built here:
https://build.opensuse.org/project/show/Kernel:stable

Please also help test today's new 4.20-rc6 kernels:
https://build.opensuse.org/project/show/Kernel:HEAD

For 4.20 I've enabled EFI support for armv6hl and armv7hl, but it may require a GRUB update to 2.03 (or backport) for efistub to get used there.
Comment 44 Michael Chang 2018-12-12 09:48:00 UTC

(In reply to Andreas Färber from comment #43)
> (In reply to Guillaume GARDET from comment #42)
> > I fixed the problem (for -stable branch) with the patch posted to
> > opensuse-kernel ML:
> > https://lists.opensuse.org/opensuse-kernel/2018-11/msg00000.html
> > 
> > With kernel 4.19.5 with this config patch, kernel boots properly on
> > Pandaboard.
> 
> This 4.19.x patch has now built here:
> https://build.opensuse.org/project/show/Kernel:stable
> 
> Please also help test today's new 4.20-rc6 kernels:
> https://build.opensuse.org/project/show/Kernel:HEAD
> 
> For 4.20 I've enabled EFI support for armv6hl and armv7hl, but it may
> require a GRUB update to 2.03 (or backport) for efistub to get used there.

Is this the upstream commit?

> http://git.savannah.gnu.org/cgit/grub.git/commit/?id=d0c070179d4d78c297364e41ece54fd7755c4b58

There's a warning in the commit message.

"This *WILL* stop non-efistub Linux kernels from booting on arm-efi."

Are we fine with dropping support for non-efistub Linux kernels (for 32-bit arm) right now? (In other words, with this update, if parallel kernel installation is enabled, some of the old kernels may stop booting.)
Comment 45 Guillaume GARDET 2018-12-20 14:27:53 UTC
I tested Tumbleweed images with kernel-default 4.20-rc7 (from kernel:HEAD) successfully on Pandaboard and on BeagleBoneBlack.

kernel 4.19.x update (from kernel:Stable) is on the way to Factory: https://build.opensuse.org/request/show/660240
Comment 46 Alexander Graf 2018-12-20 15:55:18 UTC
Maybe a Conflicts on older kernel versions for %arm would help then? That way we ideally don't break setups by accident. Apart from missing rollback of course.
Comment 47 Matwey Kornilov 2018-12-20 16:01:13 UTC
Ideally, we need QA for armv7l.
This particular issue could be caught by qemu setup.
Comment 48 Andreas Färber 2018-12-21 23:21:54 UTC
(In reply to Matwey Kornilov from comment #47)
> Ideally, we need QA for armv7l.
> This particular issue could be caught by qemu setup.

Ideally yes. But SUSE doesn't do armv7l, begging the question who would QA it:

Are _you_ willing to monitor openQA runs and report/associate Bugzilla tickets? openQA requires humans to keep it running, to update its needles and to act on the failures it reports, so that ultimately things can get fixed. Guillaume has been helping out for aarch64 (with big improvements to snapshot frequency).

Not to mention that the more architectures, releases and tests we want to cover, the more hardware resources will be needed to run those tests in time. And many new arm servers no longer support AArch32 mode needed for armv7l/armv6l.

Also I've been made aware that no .iso is being built for armv7l/armv6l. That would be needed as input to openQA.
https://github.com/openSUSE/software-o-o/pull/420#issuecomment-442782046
Comment 49 Andreas Färber 2018-12-21 23:36:59 UTC
(In reply to Alexander Graf from comment #46)
> Maybe a Conflicts on older kernel versions for %arm would help then? That
> way we ideally don't break setups by accident. Apart from missing rollback
> of course.

4.20 is not yet final and therefore not yet submitted to openSUSE:Factory, so we can only do such breaking grub2 changes afterwards.

I'm surprised that there would be no backwards compatibility though - can't that be implemented by detecting whether the loaded kernel is an EFI binary?
Comment 50 Michael Chang 2018-12-24 04:31:17 UTC
(In reply to Andreas Färber from comment #49)
> (In reply to Alexander Graf from comment #46)

> I'm surprised that there would be no backwards compatibility though - can't
> that be implemented by detecting whether the loaded kernel is an EFI binary?

Yes, it can, probably together with other logic to determine the secure boot status and use efi-handover in that situation, if I'm not mistaken about Alex's idea in fate#326541.
Thanks.
Comment 51 Swamp Workflow Management 2019-01-08 08:20:08 UTC
This is an autogenerated message for OBS integration:
This bug (1104833) was mentioned in
https://build.opensuse.org/request/show/663572 Factory:ARM:Live / JeOS
Comment 52 Matwey Kornilov 2019-01-14 20:43:47 UTC
It seems that arm is the only Grub platform that uses a fixed memory address for loading the initrd.

Compare grub_cmd_initrd() implementation from

  grub/grub-core/loader/arm64/linux.c 

with

  grub/grub-core/loader/arm/linux.c
Comment 53 Guillaume GARDET 2019-01-21 10:23:01 UTC
Current armv7 images are booting again with kernel-default.
Tested on Pandaboard.
Comment 54 Guillaume GARDET 2019-02-13 15:53:41 UTC
*** Bug 1101194 has been marked as a duplicate of this bug. ***