Bug 1175908 - encrypted rootfs does not boot anymore
Status: RESOLVED FIXED
Classification: openSUSE
Product: openSUSE Tumbleweed
Component: Basesystem
Version: Current
Hardware: x86-64 openSUSE Tumbleweed
Priority: P5 - None
Severity: Normal
Assigned To: dracut maintainers
Reported: 2020-08-28 23:44 UTC by Harald Koenig
Modified: 2021-12-02 10:28 UTC (History)


Description Harald Koenig 2020-08-28 23:44:27 UTC
for quite a while now, initrds for newer kernels have failed to boot/mount my encrypted rootfs.

I'm still running kernel 5.3.12-2-default.
first time I tried to boot new kernel and failed was with kernel-default-5.5.7-1.1,
then later with kernel-default-5.7.1-1.2 and now with kernel-default-5.8.0-1.1

these are the messages from journalctl for the failed boot of 5.8.0:

Aug 28 21:13:23 hl systemd[1]: Found device Samsung_SSD_850_EVO_2TB 5.
Aug 28 21:13:23 hl systemd[1]: Found device Samsung_SSD_850_EVO_2TB 3.
Aug 28 21:13:23 hl systemd[1]: Reloading.
Aug 28 21:13:23 hl systemd[1]: Starting Cryptography Setup for cr_sda5...
Aug 28 21:13:23 hl systemd[1]: Started Forward Password Requests to Plymouth.
Aug 28 21:13:32 hl systemd-cryptsetup[433]: Set cipher aes, mode cbc-essiv:sha256, key size 256 bits for device /dev/sda5.
Aug 28 21:13:32 hl systemd-cryptsetup[433]: device-mapper: reload ioctl on   failed: No such file or directory
Aug 28 21:13:32 hl systemd-cryptsetup[433]: Failed to activate with specified passphrase: Invalid argument
Aug 28 21:13:32 hl systemd[1]: systemd-cryptsetup@cr_sda5.service: Main process exited, code=exited, status=1/FAILURE
Aug 28 21:13:32 hl systemd[1]: systemd-cryptsetup@cr_sda5.service: Failed with result 'exit-code'.
Aug 28 21:13:32 hl systemd[1]: Failed to start Cryptography Setup for cr_sda5.
Aug 28 21:13:32 hl systemd[1]: Dependency failed for Local Encrypted Volumes.
Aug 28 21:13:32 hl systemd[1]: cryptsetup.target: Job cryptsetup.target/start failed with result 'dependency'.

and here the msgs for successfully booting into 5.3.12-2 again:

Aug 28 23:29:19 hl systemd[1]: Found device Samsung_SSD_850_EVO_2TB 5.
Aug 28 23:29:19 hl systemd[1]: Found device Samsung_SSD_850_EVO_2TB 3.
Aug 28 23:29:19 hl systemd[1]: Starting Cryptography Setup for cr_sda5...
Aug 28 23:29:19 hl systemd[1]: Reloading.
Aug 28 23:29:19 hl systemd[1]: Started Forward Password Requests to Plymouth.
Aug 28 23:29:25 hl systemd-cryptsetup[476]: Set cipher aes, mode cbc-essiv:sha256, key size 256 bits for device /dev/sda5.
Aug 28 23:29:26 hl systemd[1]: Finished Cryptography Setup for cr_sda5.
Aug 28 23:29:26 hl systemd[1]: Reached target Local Encrypted Volumes.


both initrd are freshly created with up-to-date dracut:
-rw------- 1 root root 18584060 Aug 28 23:10 /boot/initrd-5.8.0-1-default
-rw------- 1 root root 17643032 Aug 28 23:11 /boot/initrd-5.3.12-2-default


what is "missing" in this msg in between "on" and "failed:" (between the two extra spaces)?

Aug 28 21:13:32 hl systemd-cryptsetup[433]: device-mapper: reload ioctl on   failed: No such file or directory
                                                                          ^^^
Comment 1 Harald Koenig 2020-08-31 06:34:28 UTC
is there an archive where I can get all kernel packages 
between version 5.3.12-2-default and 5.5.7-1.1-default ?

then I could bisect which kernel version introduced this problem,
maybe this helps understanding what's going wrong ?
Comment 2 Takashi Iwai 2020-08-31 07:23:01 UTC
You can find some old kernels in my OBS home:tiwai:kernel:* project (like home:tiwai:kernel:5.4).  Each one contains only the last stable version.
Comment 3 Harald Koenig 2020-08-31 10:50:21 UTC
thanks! I've downloaded/installed/booted 

kernel-default-5.3.12-1.1.g60a2268.x86_64.rpm
kernel-default-5.4.14-1.1.gfc4ea7a.x86_64.rpm
kernel-default-5.5.13-1.1.g0af205d.x86_64.rpm
kernel-default-5.6.15-1.1.gbfa465b.x86_64.rpm
kernel-default-5.7.12-1.1.g9c98feb.x86_64.rpm

and only 5.3.12-1.1.g60a2268 successfully booted -- all other kernels are stuck in the same "Failed to start Cryptography Setup for cr_sda5" error.

how to proceed ?

what are the magic spells for systemd to get more info at boot time when failing?


AND what's really strange (for me): 

after testing/booting all your kernel pkgs and then booting into my working 5.3.12-2-default again, ALL  *your* kernel packages are gone again!!

before the (re)boot tests, "rpm -q/-qf" showed all your kernel packages,
all your versions showed up in grub2 'Advanced options for openSUSE Tumbleweed' menu, and in grub I've seen the msg "loading kernel-5.4..." etc. (and the git hashes in version)


$ rpm -qf /boot/vmlinu* | sort -u
kernel-default-5.3.12-2.2.x86_64
kernel-default-5.8.2-1.2.x86_64

$ rpm -q kernel-default
kernel-default-5.3.12-2.2.x86_64
kernel-default-5.8.2-1.2.x86_64

in /etc/zypp/zypp.conf I'm using:

  multiversion = provides:multiversion(kernel)
  multiversion.kernels = latest,latest-10,running


is there any "magic" removing any non-booting (or non-official-suse) kernel RPMs at boot time ????

I'm using all ext4fs, so no btrfs issues are possible. 

and zypper history shows that I'm not dreaming:

$ egrep '(install|remove ).kernel-default' /var/log/zypp/history
2020-08-31 11:34:58|install|kernel-default|5.3.12-1.1.g60a2268|x86_64|root@hl.fritz.box|_tmpRPMcache_||
2020-08-31 11:35:40|install|kernel-default|5.4.14-1.1.gfc4ea7a|x86_64|root@hl.fritz.box|_tmpRPMcache_||
2020-08-31 11:36:25|install|kernel-default|5.5.13-1.1.g0af205d|x86_64|root@hl.fritz.box|_tmpRPMcache_||
2020-08-31 11:37:09|install|kernel-default|5.6.15-1.1.gbfa465b|x86_64|root@hl.fritz.box|_tmpRPMcache_||
2020-08-31 11:46:34|remove |kernel-default|5.7.12-1.1.g9c98feb|x86_64|root@hl.fritz.box|
2020-08-31 11:46:39|remove |kernel-default|5.3.12-1.1.g60a2268|x86_64|root@hl.fritz.box|
2020-08-31 11:46:47|remove |kernel-default|5.4.14-1.1.gfc4ea7a|x86_64|root@hl.fritz.box|
2020-08-31 11:46:51|remove |kernel-default|5.5.13-1.1.g0af205d|x86_64|root@hl.fritz.box|
2020-08-31 11:46:55|remove |kernel-default|5.6.15-1.1.gbfa465b|x86_64|root@hl.fritz.box|

so, what's going on here?
new ticket needed?
Comment 4 Harald Koenig 2020-08-31 11:05:02 UTC
I found this report for a very similar problem with Fedora,
unfortunately it's very unspecific about its solution
("removing all software which had been installed from standalone rpm files" is not really an option for my system;)

https://unix.stackexchange.com/questions/576828/fedora-31-kernels-5-5-10-and-5-5-11-fail-when-trying-to-decrypt-luks-root-filesy

BUT it'd limit the problems to somewhere between 5.5.8 and 5.5.10-200.fc31

where can I find kernel RPMs with these versions for tumbleweed?
Comment 5 Takashi Iwai 2020-08-31 13:32:33 UTC
(In reply to Harald Koenig from comment #3)
> in /etc/zypp/zypp.conf I'm using:
> 
>   multiversion = provides:multiversion(kernel)
>   multiversion.kernels = latest,latest-10,running

It means that you keep *only* the latest one, the running one and latest-10, so at most three kernels :)  You'll need to list each item explicitly, like latest-2, latest-3, etc., to keep all of those.

Also, the purge-kernel stuff still has some issue with the wild kernels from the git snapshot.  It was fixed very recently in zypper.
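
To make the multiversion point concrete, here is a minimal sketch that builds a multiversion.kernels value naming every slot explicitly. The helper name kernels_to_keep is hypothetical and not part of zypper; its output is meant to be pasted as the right-hand side of the multiversion.kernels line in /etc/zypp/zypp.conf.

```shell
#!/bin/sh
# Hypothetical helper: print a multiversion.kernels value that keeps the
# newest kernel, the N kernels before it, and the running kernel.
kernels_to_keep() {
    n=$1
    list="latest"
    i=1
    while [ "$i" -le "$n" ]; do
        list="$list,latest-$i"   # each older slot has to be named explicitly
        i=$((i + 1))
    done
    printf '%s,running\n' "$list"
}

# Example: keep the newest kernel plus the three before it.
kernels_to_keep 3
```

With an argument of 3 this prints latest,latest-1,latest-2,latest-3,running, which keeps a contiguous range instead of only three isolated kernels.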
Comment 6 Takashi Iwai 2020-08-31 13:47:39 UTC
(In reply to Harald Koenig from comment #4)
> I found this report for a very similar problem with Fedora,
> unfortunately it's very unspecific about its solution
> ("removing all software which had been installed from standalone rpm files"
> is not really an option for my system;)
> 
> https://unix.stackexchange.com/questions/576828/fedora-31-kernels-5-5-10-and-5-5-11-fail-when-trying-to-decrypt-luks-root-filesy
> 
> BUT it'd limit the problems to somewhere between 5.5.8 and 5.5.10-200.fc31

Well, it said the problem was between 5.5.8 and 5.5.10, while you verified that 5.4.14 also failed.  So it's not directly relevant, though it hits a similar problem.

As the error message shows, something odd is happening in systemd-cryptsetup, so I've now reassigned this to the systemd guys for diagnosis.  It might be a kernel issue in the end, but let's analyze from the user-space side first.
Comment 7 Franck Bui 2020-08-31 19:34:18 UTC
(In reply to Harald Koenig from comment #0)
> Aug 28 21:13:32 hl systemd-cryptsetup[433]: device-mapper: reload ioctl on  
> failed: No such file or directory

That's probably a message emitted by libcryptsetup.

Can you show the content of /etc/crypttab ?

Can you try to attach your encrypted root device with cryptsetup(8) directly for the failing case (assuming that the emergency shell is started) ?
Comment 8 Harald Koenig 2020-08-31 19:53:41 UTC
(In reply to Franck Bui from comment #7)
> (In reply to Harald Koenig from comment #0)
> > Aug 28 21:13:32 hl systemd-cryptsetup[433]: device-mapper: reload ioctl on  
> > failed: No such file or directory
> 
> That's probably a message emitted by libcryptsetup.
> 
> Can you show the content of /etc/crypttab ?

cr_sda5 /dev/sda5 none       none


> Can you try to attach your encrypted root device with cryptsetup(8) directly
> for the failing case (assuming that the emergency shell is started) ?

there is only systemd-cryptsetup in the initrd.
will reboot and reply later...
Comment 9 Harald Koenig 2020-08-31 20:46:08 UTC
(In reply to Harald Koenig from comment #8)
> > Can you try to attach your encrypted root device with cryptsetup(8) directly
> > for the failing case (assuming that the emergency shell is started) ?
> 
> there is only systemd-cryptsetup in the initrd.
> will reboot and reply later...

$ strace -fo O -s999 /usr/lib/systemd/systemd-cryptsetup attach cr /dev/sda5
Please enter passphrase for disk Samsung_SSD_850_EVO_2TB (cr):
Set cipher aes, mode cbc-essiv:sha256, key size 256 bits for device /dev/sda5.
[  377.736756] device-mapper: table: 254:0: crypt: Error allocating crypto tfm
device-mapper: reload ioctl on cr (254:0) failed: No such file or directory
Failed to activate with specified passphrase: Invalid argument


dmesg output:
[    7.611008] Console: switching to colour frame buffer device 240x67
[    7.634358] i915 0000:00:02.0: fb0: i915drmfb frame buffer device
[   16.169035] device-mapper: table: 254:0: crypt: Error allocating crypto tfm
[   16.169054] device-mapper: ioctl: error adding target to table
[  245.056425] EXT4-fs (sda3): mounted filesystem with ordered data mode. Opts: (null)
[  377.736756] device-mapper: table: 254:0: crypt: Error allocating crypto tfm
[  377.736780] device-mapper: ioctl: error adding target to table


strace shows that /etc/crypttab is not accessed/opened at all

strace output:
2774  openat(AT_FDCWD, "/etc/fstab", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
2774  ioctl(0, TCGETS, {B38400 opost isig icanon echo ...}) = 0
2774  ioctl(0, TCGETS, {B38400 opost isig icanon echo ...}) = 0
2774  inotify_init1(IN_NONBLOCK|IN_CLOEXEC) = 7
2774  request_key("user", "cryptsetup", NULL, 0) = -1 ENOKEY (Required key not available)
2774  inotify_add_watch(7, "/run/systemd/ask-password", IN_ATTRIB) = 1
2774  openat(AT_FDCWD, "/dev/tty", O_RDWR|O_NOCTTY|O_CLOEXEC) = 8
2774  ioctl(8, TCGETS, {B38400 opost isig icanon echo ...}) = 0
2774  ioctl(1, TCGETS, {B38400 opost isig icanon echo ...}) = 0
2774  ioctl(2, TCGETS, {B38400 opost isig icanon echo ...}) = 0
2774  write(8, "\33[0;1;39m", 9)        = 9
2774  write(8, "Please enter passphrase for disk Samsung_SSD_850_EVO_2TB (cr):", 62) = 62
2774  write(8, " ", 1)                  = 1
2774  write(8, "\33[0;38;5;245m", 13)   = 13
2774  write(8, "(press TAB for no echo) ", 24) = 24
2774  write(8, "\33[0m", 4)             = 4
...
2774  read(8, "\n", 1)                  = 1
2774  request_key("user", "cryptsetup", NULL, 0) = -1 ENOKEY (Required key not available)
2774  add_key("user", "cryptsetup", "Dim2vP4r", 8, KEY_SPEC_USER_KEYRING) = 581660211
2774  keyctl(KEYCTL_SET_TIMEOUT, 581660211, 150) = 0
2774  openat(AT_FDCWD, "/run/systemd/ask-password", O_RDONLY|O_NOFOLLOW|O_CLOEXEC|O_PATH) = 9
2774  fstat(9, {st_mode=S_IFDIR|0755, st_size=40, ...}) = 0
2774  utimensat(AT_FDCWD, "/proc/self/fd/9", NULL, 0) = 0
2774  close(9)                          = 0
2774  write(8, "\n", 1)                 = 1
2774  ioctl(8, SNDCTL_TMR_STOP or TCSETSW, {B38400 opost isig icanon echo ...}) = 0
2774  close(7)                          = 0
2774  close(8)                          = 0
2774  writev(2, [{iov_base="Set cipher aes, mode cbc-essiv:sha256, key size 256 bits for device /dev/sda5.", iov_len=78}, {iov_base="\n", iov_len=1}], 2) = 79
2774  ioctl(3, DM_LIST_VERSIONS, {version=4.1.0, data_size=16384, data_start=312, flags=DM_EXISTS_FLAG} => {version=4.42.0, data_size=488, data_start=312, flags=DM_EXISTS_FLAG, ...}) = 0
2774  ioctl(3, DM_TABLE_STATUS, {version=4.0.0, data_size=16384, data_start=312, name="cr", flags=DM_EXISTS_FLAG|DM_NOFLUSH_FLAG} => {version=4.42.0, data_size=16384, data_start=312, name="cr", flags=DM_EXISTS_FLAG|DM_NOFLUSH_FLAG}) = -1 ENXIO (No such device or address)
2774  futex(0x7f622a3e1e00, FUTEX_WAKE_PRIVATE, 2147483647) = 0
2774  lseek(6, 4096, SEEK_SET)          = 4096
2774  read(6, "H'\367a=\21"..., 128000) = 128000
2774  ioctl(3, DM_LIST_VERSIONS, {version=4.1.0, data_size=16384, data_start=312, flags=DM_EXISTS_FLAG} => {version=4.42.0, data_size=488, data_start=312, flags=DM_EXISTS_FLAG, ...}) = 0
2774  ioctl(3, DM_TABLE_STATUS, {version=4.0.0, data_size=16384, data_start=312, name="cr", flags=DM_EXISTS_FLAG|DM_NOFLUSH_FLAG} => {version=4.42.0, data_size=16384, data_start=312, name="cr", flags=DM_EXISTS_FLAG|DM_NOFLUSH_FLAG}) = -1 ENXIO (No such device or address)
2774  stat("/dev/sda5", {st_mode=S_IFBLK|0660, st_rdev=makedev(0x8, 0x5), ...}) = 0
2774  openat(AT_FDCWD, "/dev/sda5", O_RDWR|O_EXCL) = 7
2774  ioctl(7, BLKROGET, [0])           = 0
2774  ioctl(7, BLKGETSIZE64, [536870912000]) = 0
2774  close(7)                          = 0
2774  openat(AT_FDCWD, "/dev/sda5", O_RDONLY) = 7
2774  ioctl(7, BLKRAGET, [1024])        = 0
2774  close(7)                          = 0
2774  openat(AT_FDCWD, "/dev/urandom", O_RDONLY) = 7
2774  read(7, "\356\344", 2)            = 2
2774  semget(0xd4de4ee, 1, IPC_CREAT|IPC_EXCL|0600) = 1
2774  semctl(1, 0, SETVAL, 0x1)         = 0
2774  semctl(1, 0, GETVAL, NULL)        = 1
2774  close(7)                          = 0
2774  semtimedop(1, [{0, 1, 0}], 1, NULL) = 0
2774  semctl(1, 0, GETVAL, NULL)        = 2
2774  ioctl(3, DM_DEV_CREATE, {version=4.0.0, data_size=16384, name="cr", uuid="CRYPT-LUKS1-00000000000000000000000000000000-cr", flags=DM_EXISTS_FLAG} => {version=4.42.0, data_size=305, dev=makedev(0xfe, 0), name="cr", uuid="CRYPT-LUKS1-00000000000000000000000000000000-cr", target_count=0, open_count=0, event_nr=0, flags=DM_EXISTS_FLAG}) = 0
2774  ioctl(3, DM_TABLE_LOAD, {version=4.0.0, data_size=16384, data_start=312, dev=makedev(0xfe, 0), target_count=1, flags=DM_EXISTS_FLAG|DM_PERSISTENT_DEV_FLAG|DM_SECURE_DATA_FLAG, ...}, 0x557a323af760) = -1 ENOENT (No such file or directory)
2774  writev(2, [{iov_base="device-mapper: reload ioctl on cr (254:0) failed: No such file or directory", iov_len=75}, {iov_base="\n", iov_len=1}], 2) = 76
2774  semget(0xd4de4ee, 1, 000)         = 1
2774  semctl(1, 0, GETVAL, NULL)        = 2
2774  semtimedop(1, [{0, -1, IPC_NOWAIT}], 1, NULL) = 0
2774  semget(0xd4de4ee, 1, 000)         = 1
2774  semtimedop(1, [{0, 1, 0}], 1, NULL) = 0
2774  semctl(1, 0, GETVAL, NULL)        = 2
2774  ioctl(3, DM_DEV_REMOVE, {version=4.0.0, data_size=16384, name="cr", event_nr=6350062, flags=DM_EXISTS_FLAG|DM_SECURE_DATA_FLAG} => {version=4.42.0, data_size=305, name="cr", uuid="CRYPT-LUKS1-00000000000000000000000000000000-cr", flags=DM_EXISTS_FLAG|DM_UEVENT_GENERATED_FLAG}) = 0
2774  semget(0xd4de4ee, 1, 000)         = 1
2774  semctl(1, 0, GETVAL, NULL)        = 2
2774  semtimedop(1, [{0, -1, IPC_NOWAIT}], 1, NULL) = 0
2774  semtimedop(1, [{0, 0, 0}], 1, NULL) = 0
2774  semctl(1, 0, IPC_RMID, NULL)      = 0
2774  writev(2, [{iov_base="Failed to activate with specified passphrase: Invalid argument", iov_len=62}, {iov_base="\n", iov_len=1}], 2) = 63
2774  close(3)                          = 0
2774  close(6)                          = 0
2774  close(5)                          = 0
2774  close(4)                          = 0
2774  exit_group(1)                     = ?
Comment 10 Harald Koenig 2020-08-31 22:58:15 UTC
(In reply to Harald Koenig from comment #8)
> > Can you show the content of /etc/crypttab ?
> 
> cr_sda5 /dev/sda5 none       none

I've tried to change this to

cr_sda5 UUID=0a1f2bc2-d119-454e-bbbb-bef01272cc8b none luks,discard


and created a new initrd for 5.8.4-1-default, does not help

$ ls -l /dev/disk/by-uuid/0a1f2bc2-d119-454e-bbbb-bef01272cc8b
lrwxrwxrwx 1 root root 10 Sep  1 00:44 /dev/disk/by-uuid/0a1f2bc2-d119-454e-bbbb-bef01272cc8b -> ../../sda5
Comment 11 Franck Bui 2020-09-01 07:28:43 UTC
(In reply to Harald Koenig from comment #8)
> there is only systemd-cryptsetup in the initrd.
> will reboot and reply later...

You need to explicitly ask for it to be included in initrd.

Try something like: dracut --install "/sbin/cryptsetup"
Comment 12 Franck Bui 2020-09-01 07:33:07 UTC
(In reply to Harald Koenig from comment #9)
> dmesg output:
> [    7.611008] Console: switching to colour frame buffer device 240x67
> [    7.634358] i915 0000:00:02.0: fb0: i915drmfb frame buffer device
> [   16.169035] device-mapper: table: 254:0: crypt: Error allocating crypto
> tfm
> [   16.169054] device-mapper: ioctl: error adding target to table
> [  245.056425] EXT4-fs (sda3): mounted filesystem with ordered data mode.
> Opts: (null)
> [  377.736756] device-mapper: table: 254:0: crypt: Error allocating crypto
> tfm
> [  377.736780] device-mapper: ioctl: error adding target to table
> 

@Takashi, it's failing in the kernel (after libcryptsetup did the ioctl).

Can you please tell us what went wrong exactly ?

> 
> strace shows that /etc/crypttab is not accessed/opened at all
> 

systemd-cryptsetup doesn't access crypttab itself; it's systemd-cryptsetup-generator which does.
Comment 13 Takashi Iwai 2020-09-01 09:31:03 UTC
IIRC, the error indicates that some crypto module is missing.
Comment 14 Franck Bui 2020-09-01 09:38:59 UTC
Can you tell us which one exactly ?
Comment 15 Takashi Iwai 2020-09-01 09:42:38 UTC
I dunno, it depends on how it was encrypted.  We may check the crypt entries in the working case.
Comment 16 Franck Bui 2020-09-01 10:16:26 UTC
@Harald, can you do that and find which kernel module is missing please ?

Thanks.
Comment 17 Harald Koenig 2020-09-01 19:39:43 UTC
(In reply to Franck Bui from comment #11)
> > there is only systemd-cryptsetup in the initrd.
> > will reboot and reply later...
> 
> You need to explicitly ask for it to be included in initrd.
> 
> Try something like: dracut --install "/sbin/cryptsetup"

is there any additional information to be gained if I try the real cryptsetup again in the initrd with the new kernel when boot fails ?
Comment 18 Harald Koenig 2020-09-01 19:55:05 UTC
(In reply to Takashi Iwai from comment #15)
> I dunno, it depends on how it was encrypted.  We may check the crypt entries
> in the working case.

how can I provide this info?  

$ lsmod | egrep -i 'aes|crypt'
dm_crypt               49152  1
aesni_intel           372736  14
aes_x86_64             20480  1 aesni_intel
glue_helper            16384  1 aesni_intel
crypto_simd            16384  1 aesni_intel
cryptd                 24576  4 crypto_simd,ghash_clmulni_intel
dm_mod                155648  27 dm_crypt,dm_thin_pool,dm_multipath,dm_log,dm_mirror,dm_bufio

$ dmesg  | grep -i crypt
[    4.045430] Key type encrypted registered
[    4.959582] Freeing unused decrypted memory: 2040K
[    4.970846] systemd[1]: systemd 245.7+suse.49.g6d6d92930a running in system mode. (+PAM -AUDIT +SELINUX -IMA +APPARMOR -SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD +IDN2 -IDN +PCRE2 default-hierarchy=hybrid)
[    5.055040] systemd[1]: Created slice system-systemd\x2dcryptsetup.slice.
[    5.993860] cryptd: max_cpu_qlen set to 1000
[   17.342901] systemd[1]: systemd 245.7+suse.49.g6d6d92930a running in system mode. (+PAM -AUDIT +SELINUX -IMA +APPARMOR -SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD +IDN2 -IDN +PCRE2 default-hierarchy=hybrid)
[   17.670956] systemd[1]: Reached target Local Encrypted Volumes.

comparing the lists of all kernel mods in both initrds (fully extracted, including the compressed cpio in directory cpio-root-19000/);
left column: only in 5.3, right column: only in 5.8:

$ comm -3 <( find _initrd-5.3.12-2-default.extracted/ -name \*.ko\* |  sed 's,.*/,,' | sort -u ) <( find _initrd-5.8.4-1-default.extracted/ -name \*.ko\* |  sed 's,.*/,,' | sort -u )
aes-x86_64.ko.xz
        battery.ko.xz
        cec.ko.xz
        fsl_linflexuart.ko.xz
        hid-creative-sb0540.ko.xz
        hid-glorious.ko.xz
        hid-lg-g15.ko.xz
        hid-mcp2221.ko.xz
input-polldev.ko.xz
        iqs62x-keys.ko.xz
        iqs62x.ko.xz
        lantiq.ko.xz
        pci-hyperv-intf.ko.xz
        pinctrl-da9062.ko.xz
        pinctrl-jasperlake.ko.xz
        pinctrl-lynxpoint.ko.xz
        pinctrl-tigerlake.ko.xz
        regmap-spi.ko.xz
        sprd_serial.ko.xz
xen-blkfront.ko.xz
xen-netfront.ko.xz
        xhci-pci-renesas.ko.xz
        xhci-plat-hcd.ko.xz
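
The comparison above can be wrapped in a small reusable function. This is only a sketch in plain POSIX sh, using temporary files instead of bash process substitution. The function names module_names and compare_modules are hypothetical, and both arguments are assumed to be directories holding fully extracted initrd trees.

```shell
#!/bin/sh
# List module file names (basename only) found anywhere under a directory tree.
module_names() {
    find "$1" -name '*.ko*' | sed 's,.*/,,' | sort -u
}

# Print modules found only in the first tree ("< name") or only in the
# second tree ("> name"); modules present in both are not printed.
compare_modules() {
    a=$(mktemp) && b=$(mktemp) || return 1
    module_names "$1" > "$a"
    module_names "$2" > "$b"
    comm -23 "$a" "$b" | sed 's/^/< /'
    comm -13 "$a" "$b" | sed 's/^/> /'
    rm -f "$a" "$b"
}

# Typical use (paths as in this report):
#   compare_modules _initrd-5.3.12-2-default.extracted _initrd-5.8.4-1-default.extracted
```

Note that such a diff only shows what changed between two initrds. A module that is new in the later kernel but missing from its initrd will not appear in either column.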


could it be aes-x86_64.ko.xz ???

$ find -name aes-x86_64.ko.xz
./_initrd-5.3.12-2-default.extracted/cpio-root-19000/lib/modules/5.3.12-2-default/kernel/arch/x86/crypto/aes-x86_64.ko.xz

this module is *not* available in 5.8 anymore:

$ find /lib/modules/5.8.4-1-default/  -name aes-x86_64.ko.xz
$ find /lib/modules/5.8.4-1-default/  -name aes\*.ko.xz
/lib/modules/5.8.4-1-default/kernel/arch/x86/crypto/aesni-intel.ko.xz
/lib/modules/5.8.4-1-default/kernel/crypto/aes_ti.ko.xz


for next reboot I'll try an initrd created with

dracut --kver 5.8.4-1-default --force  --install "/sbin/cryptsetup" --install  /lib/modules/5.8.4-1-default/kernel/crypto/aes_ti.ko.xz --install /lib/modules/5.8.4-1-default/kernel/arch/x86/crypto/aesni-intel.ko.xz

cross fingers... ;-)
Comment 19 Harald Koenig 2020-09-01 20:13:52 UTC
(In reply to Harald Koenig from comment #18)
> for next reboot I'll try an initrd created with
> 
> dracut --kver 5.8.4-1-default --force  --install "/sbin/cryptsetup"
> --install  /lib/modules/5.8.4-1-default/kernel/crypto/aes_ti.ko.xz --install
> /lib/modules/5.8.4-1-default/kernel/arch/x86/crypto/aesni-intel.ko.xz
> 
> cross fingers... ;-)

update: didn't work either:-(

native cryptsetup showed the same error.

aesni_intel was loaded, aes_ti was not.

after manually loading aes_ti, cryptsetup still shows same error.
back to 5.3.12....

how can I find/provide the missing kernel mod?

which other tests do you suggest, which infos may I provide?


thanks for your help and ideas!!
Comment 20 Harald Koenig 2020-09-01 20:20:06 UTC
(In reply to Takashi Iwai from comment #15)
> I dunno, it depends on how it was encrypted.  We may check the crypt entries
> in the working case.

how can I find out how my partition is encrypted ?
maybe this can help (running 5.3.12): 

# cryptsetup status cr_sda5 
/dev/mapper/cr_sda5 is active and is in use.
  type:    LUKS1
  cipher:  aes-cbc-essiv:sha256
  keysize: 256 bits
  key location: dm-crypt
  device:  /dev/sda5
  sector size:  512
  offset:  4096 sectors
  size:    1048571904 sectors
  mode:    read/write
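
The cipher: line is the useful part of this output, because it names the separate crypto components that dm-crypt has to find at activation time. A minimal sketch, assuming the common cipher-mode-ivgenerator[:ivoption] layout of dm-crypt cipher specs; the function name split_cipher_spec is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: split a dm-crypt cipher spec such as
# "aes-cbc-essiv:sha256" into its named components.
split_cipher_spec() {
    spec=$1
    cipher=${spec%%-*}        # text before the first "-"   (e.g. aes)
    rest=${spec#*-}           # text after the first "-"    (e.g. cbc-essiv:sha256)
    mode=${rest%%-*}          # chaining mode               (e.g. cbc)
    iv=${rest#*-}             # IV generator plus option    (e.g. essiv:sha256)
    ivgen=${iv%%:*}           # IV generator name           (e.g. essiv)
    case "$iv" in
        *:*) ivhash=${iv#*:} ;;   # option after ":"        (e.g. sha256)
        *)   ivhash= ;;           # no option given         (e.g. plain64)
    esac
    printf 'cipher=%s mode=%s ivgen=%s ivhash=%s\n' "$cipher" "$mode" "$ivgen" "$ivhash"
}

split_cipher_spec "aes-cbc-essiv:sha256"
```

Each name it prints is a candidate to check against the modules available in a working and in a failing initrd.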
Comment 21 Franck Bui 2020-09-02 06:32:11 UTC
(In reply to Harald Koenig from comment #19)
> native cryptsetup showed the same error.
> 

Ok then it's not specific to systemd-cryptsetup.

> 
> how can I find/provide the missing kernel mod?
> 

You can try to dump the content of 2 initrds, one which fails to boot and the other one which is ok, and compare the outputs.

You can dump the relevant content of an initrd with the following command:

>  $ lsinitrd --kver <kernel-version-ko> | sed -ne 's,.* \(lib/modules/.*.ko\),\1,p'

Please make sure to restore the initrds you modified previously (by adding cryptsetup and other kern modules).

Thanks.
Comment 22 Harald Koenig 2020-09-02 10:19:17 UTC
(In reply to Franck Bui from comment #21)
> You can try to dump the content of 2 initrds, one which fails to boot and
> the other one which is ok, and compare the outputs.
> 
> You can dump the relevant content of an initrd with the following command:

that's what I did with 'binwalk': I've fully extracted trees of both initrd.

the differences are shown in comment 18 with this command:

$ comm -3 <( find _initrd-5.3.12-2-default.extracted/ -name \*.ko\* |  sed 's,.*/,,' | sort -u ) <( find _initrd-5.8.4-1-default.extracted/ -name \*.ko\* |  sed 's,.*/,,' | sort -u )


> >  $ lsinitrd --kver <kernel-version-ko> | sed -ne 's,.* \(lib/modules/.*.ko\),\1,p'

didn't know lsinitrd, thanks!

but I notice that lsinitrd does not show the *.ko.xz in the next cpio block *after* the large compressed cpio block: there are more files (firmware for i915 and 23 more kernel mods which will be extracted with "binwalk -e ..." into cpio-root-0/...)
Comment 23 Franck Bui 2020-09-02 12:57:00 UTC
(In reply to Harald Koenig from comment #22)
> but I notice that lsinitrd does not show the *.ko.xz

That's due to the sed expression which only shows *.ko modules so it needs to be adapted to match *.ko.xz too.
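
One way to make the filter explicit about compressed modules is sketched below. It reads lsinitrd output on stdin and prints only module paths, with or without a compression suffix such as .xz. The function name list_modules is hypothetical, and the sketch only concerns the sed filter, not the question of which cpio segment lsinitrd inspects.

```shell
#!/bin/sh
# Hypothetical helper: keep only kernel module paths from "lsinitrd" output.
# Matches names ending in ".ko" or ".ko.<suffix>" (for example ".ko.xz").
list_modules() {
    sed -n 's#.* \(lib/modules/.*\.ko\(\.[a-z0-9][a-z0-9]*\)\{0,1\}\)$#\1#p'
}

# Typical use:
#   lsinitrd --kver 5.8.4-1-default | list_modules
```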
Comment 24 Franck Bui 2020-09-02 13:00:03 UTC
(In reply to Harald Koenig from comment #18)
> could it be aes-x86_64.ko.xz ???

Might be.

So Takashi, can you please have a look and tell us why this module is missing and, if that's expected, which module is supposed to replace it ?
Comment 25 Takashi Iwai 2020-09-02 14:29:53 UTC
The driver module got dropped in the recent upstream, commit 1d2c3279311e4f03fcf164e1366f2fda9f4bfccf
    crypto: x86/aes - drop scalar assembler implementations
    
    The AES assembler code for x86 isn't actually faster than code
    generated by the compiler from aes_generic.c, and considering
    the disproportionate maintenance burden of assembler code on
    x86, it is better just to drop it entirely. Modern x86 systems
    will use AES-NI anyway, and given that the modules being removed
    have a dependency on aes_generic already, we can remove them
    without running the risk of regressions.

So this sounds like the missing aes_generic in initrd?
Comment 26 Franck Bui 2020-09-02 14:46:32 UTC
(In reply to Takashi Iwai from comment #25)
> So this sounds like the missing aes_generic in initrd?

Harald, can you check that and if it's missing, can you please add this module in  initrd and see if it helps ?
Comment 27 Harald Koenig 2020-09-02 17:52:37 UTC
(In reply to Franck Bui from comment #26)
> (In reply to Takashi Iwai from comment #25)
> > So this sounds like the missing aes_generic in initrd?
> 
> Harald, can you check that and if it's missing, can you please add this
> module in  initrd and see if it helps ?

hmmm, there is no such thing as aes_generic, neither in 5.3.12 nor in 5.8.4:

$ find /lib/modules/ -name aes\*
/lib/modules/5.8.4-1-default/kernel/arch/x86/crypto/aesni-intel.ko.xz
/lib/modules/5.8.4-1-default/kernel/crypto/aes_ti.ko.xz
/lib/modules/5.3.12-2-default/kernel/arch/x86/crypto/aesni-intel.ko.xz
/lib/modules/5.3.12-2-default/kernel/arch/x86/crypto/aes-x86_64.ko.xz
/lib/modules/5.3.12-2-default/kernel/crypto/aes_ti.ko.xz

where can I get aes_generic ?

or is it the "aes-x86_64" which only exists for 5.3 kernel ?

reading the kernel config and makefile for 5.8.4 suggests to me that aes_generic should be compiled into the kernel image, so it can't be missing ?!

$ grep aes_generic  /usr/src/linux-5.8.4-1/crypto/Makefile 
obj-$(CONFIG_CRYPTO_AES) += aes_generic.o
CFLAGS_aes_generic.o := $(call cc-option,-fno-code-hoisting) # https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83356

$ grep -r CONFIG_CRYPTO_AES=  /usr/src/linux-5.8.4-1* | grep x86_64/default
/usr/src/linux-5.8.4-1-obj/x86_64/default/.config.old:CONFIG_CRYPTO_AES=y
/usr/src/linux-5.8.4-1-obj/x86_64/default/include/config/auto.conf:CONFIG_CRYPTO_AES=y
/usr/src/linux-5.8.4-1-obj/x86_64/default/.config:CONFIG_CRYPTO_AES=y


and looking for symbol crypto_ft_tab found in linux-5.8.4-1/crypto/aes_generic.c, it's available in old and new kernel:

grep " crypto_ft_tab" /boot/System.map-5.*
/boot/System.map-5.3.12-2-default:ffffffff81e88540 D crypto_ft_tab
/boot/System.map-5.8.4-1-default:ffffffff82089f40 R crypto_ft_tab

so I guess this cannot be the root cause ?!
Comment 28 Harald Koenig 2020-09-02 18:02:43 UTC
(In reply to Franck Bui from comment #23)
> (In reply to Harald Koenig from comment #22)
> > but I notice that lsinitrd does not show the *.ko.xz
> 
> That's due to the sed expression which only shows *.ko modules so it needs
> to be adapted to match *.ko.xz too.

I checked the raw output of lsinitrd, no sed....

in my initrd-5.8.4-1-default I find 2 copies of e.g. dm-crypt.ko.xz with binwalk,
one in the compressed cpio block starting at 0x19000 (manually extracted into cpio-root-19000/)
and a 2nd copy in the next uncompressed cpio block which follows (extracted with "binwalk -e" into cpio-root-0/)
at offset 0x3A40CE :

binwalk /boot/initrd-5.8.4-1-default | grep dm-crypt.ko.xz
3817678       0x3A40CE        ASCII cpio archive (SVR4 with no CRC), file name: "lib/modules/5.8.4-1-default/kernel/drivers/md/dm-crypt.ko.xz", file name length: "0x0000003D", file size: "0x000056D0"


stuff extracted with "binwalk -e" and "mkdir cpio-root-19000 ; cd cpio-root-19000 ; cpio -id < ../19000"

$ find /root/I1/_initrd-5.8.4-1-default.extracted/ -name dm-crypt.ko.xz
/root/I1/_initrd-5.8.4-1-default.extracted/cpio-root-19000/lib/modules/5.8.4-1-default/kernel/drivers/md/dm-crypt.ko.xz
/root/I1/_initrd-5.8.4-1-default.extracted/cpio-root-0/lib/modules/5.8.4-1-default/kernel/drivers/md/dm-crypt.ko.xz


whereas lsinitrd only lists one copy -- maybe the output is "optimized" ?
lsinitrd output:

$ lsinitrd | grep dm-crypt.ko.xz
-rw-r--r--   1 root     root        19604 Dec 19  2019 lib/modules/5.3.12-2-default/kernel/drivers/md/dm-crypt.ko.xz
Comment 29 Franck Bui 2020-09-02 18:30:57 UTC
@Takashi, at this point it seems that there's something wrong with the kernel modules in initrd. Both systemd-cryptsetup and cryptsetup fail when issuing an ioctl which ends up loading a module (according to you) which seems to be missing.

Can you please help Harald as there's no point for me to serve as a proxy.

Thanks.
Comment 30 Takashi Iwai 2020-09-02 19:37:15 UTC
Well, I'm no expert in this area at all, so not the best person to ask...

Anyway, Harald, could you test the recent TW live image or a rescue image, and verify whether you can mount the partition manually from there?  If it works, we can compare what's missing.
Comment 31 Harald Koenig 2020-09-02 22:08:14 UTC
(In reply to Takashi Iwai from comment #30)
> Well, I'm no expert in this area at all, so not the best person to ask...
> 
> Anyway, Harald, could you test the recent TW live image or a rescue
> image, and verify whether you can mount the partition manually from there? 
> If it works, we can compare what's missing.

VERY good idea, thanks!

openSUSE-Tumbleweed-XFCE-Live-x86_64-Snapshot20200831-Media.iso works, can open encrypted fs.

lsmod shows 188 modules. I've created an initrd with all 188 modules with dracut --install mod1 ...
this still does not work. but provides lots of files for more testing in initrd ;)

loading all kernel/crypto/* with insmod  works.

rebooting and loading those modules one by one shows that essiv.ko.xz is needed.
but just "modprobe essiv" complains about missing symbol.
only after "depmod -a" this modprobe works and cryptsetup open too.

now I've created a new initrd with

     dracut  --add-drivers  essiv   --kver 5.8.4-1-default --force

and boot works!!!


questions remaining: 

why did this happen ?
what's the right way to fix this for the future ?


a big THANKS for your help and input!
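
On the "right way to fix this for the future" question, one option is a dracut configuration drop-in, so that every regenerated initrd includes the module without passing --add-drivers by hand. This is a sketch and not a fix confirmed in this report: add_drivers is a standard dracut.conf setting, while the function name persist_essiv and the file name 10-essiv.conf are hypothetical choices.

```shell
#!/bin/sh
# Hypothetical helper: write a dracut drop-in that always adds the essiv module.
# The target directory defaults to /etc/dracut.conf.d (writing there needs root).
persist_essiv() {
    conf_dir=${1:-/etc/dracut.conf.d}
    # dracut expects the module list padded with spaces inside the quotes
    printf 'add_drivers+=" essiv "\n' > "$conf_dir/10-essiv.conf"
}

# Typical use, as root (commented out so the sketch has no side effects):
#   persist_essiv
#   dracut --force --kver 5.8.4-1-default
#   lsinitrd --kver 5.8.4-1-default | grep essiv
```

The last commented command is a quick check that the module really ended up in the regenerated initrd before rebooting.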
Comment 32 Takashi Iwai 2020-10-01 08:37:28 UTC
Somewhat overlooked during my vacation, sorry.

About the missing kernel module in dracut: maybe dracut people have a better clue.
Comment 33 Harald Koenig 2020-10-01 18:20:43 UTC
(In reply to Takashi Iwai from comment #32)
> Somewhat overlooked during my vacation, sorry.
> 
> About the missing kernel module in dracut: maybe dracut people have a better
> clue.

I guess this was a one-time-only problem in the transition from older kernel versions which did not yet have essiv.ko, that dracut did not "automatically" add that module for newer kernels at the first time.

now that I've manually created a first initrd with new kernel and essiv.ko,
after reboot and running new kernel (and module essiv loaded), dracut does the right thing and again creates working (for me) initrd for updated kernel.

but still it would be good to add that magic knowledge for users who run opensuse 15.2 with 5.3.* kernels, so that they don't run into this issue once they upgrade from 15.2 to newer distros...
Comment 34 Antonio Feijoo 2021-12-02 10:28:50 UTC
Sorry for the lack of response here. We are cleaning up old bug entries.

> openSUSE-Tumbleweed-XFCE-Live-x86_64-Snapshot20200831-Media.iso works, can open encrypted fs.

The issue disappeared with a new TW snapshot, so we are going to close it as fixed.

> I guess this was a one-time-only problem in the transition from older kernel
> versions which did not yet have essiv.ko, that dracut did not "automatically"
> add that module for newer kernels at the first time.
> 
> now that I've manually created a first initrd with new kernel and essiv.ko,
> after reboot and running new kernel (and module essiv loaded), dracut does the
> right thing and again creates working (for me) initrd for updated kernel.
> 
> but still it would be good to add that magic knowledge for users who run opensuse
> 15.2 with 5.3.* kernels to avoid that they run into that issue once they
> upgrade from 15.2 to newer distros...

I think this is not easy to deduce from dracut. We will keep this report in mind when testing an upgrade from a 5.3 kernel to a much higher version.