Bug 724912 - systemd-cryptsetup: crypt_init() failed: Block device required
Status: RESOLVED FIXED
Duplicates: 728543 728674
Classification: openSUSE
Product: openSUSE 12.1
Component: Basesystem
Version: Factory
Hardware: Other Other
Importance: P5 - None, Critical
Assigned To: Frederic Crozat
Blocks: 696902
Reported: 2011-10-18 15:26 UTC by Christian Boltz
Modified: 2012-02-22 04:43 UTC (History)
Found By: Beta-Customer


Attachments
/etc/fstab (1.44 KB, text/plain), 2011-10-18 18:01 UTC, Christian Boltz
dmesg with systemd debugging enabled (153.78 KB, text/plain), 2011-10-19 00:08 UTC, Christian Boltz
output from various systemctl commands on working system (11.96 KB, application/octet-stream), 2011-10-26 17:57 UTC, Christian Boltz
systemctl show dev-md3.device output (902 bytes, text/plain), 2011-10-27 16:45 UTC, Christian Boltz
screen photo from shutdown (156.01 KB, image/jpeg), 2011-11-02 12:34 UTC, Christian Boltz
patch which works for me. (861 bytes, patch), 2011-11-06 13:56 UTC, Christian Volkmann
boot.md output during an error. (2.50 KB, text/plain), 2011-11-06 23:02 UTC, Christian Volkmann
dmesg with systemd.log_level=debug systemd.log_target=kmsg (165.32 KB, text/plain), 2011-11-09 21:45 UTC, Christian Volkmann
My configuration crypttab,fstab,mdadm.conf (880 bytes, application/x-bzip2), 2011-11-09 21:48 UTC, Christian Volkmann
systemctl show fsck\@dev-md2.service (2.67 KB, application/octet-stream), 2011-11-10 17:30 UTC, Christian Volkmann
systemd-loglevel=debug for chasing systemd raid bug (133.49 KB, text/plain), 2011-11-11 06:36 UTC, andreas hoffmann
dmesg systemd-37-293.1.x86_64 situation with md-trouble (170.45 KB, text/plain), 2011-11-11 21:34 UTC, Christian Volkmann
systemctl show systemd-37-293.1.x86_64 situation with md-trouble (2.67 KB, text/plain), 2011-11-11 21:36 UTC, Christian Volkmann
no error situration dmesg | grep -e md2 -e sd[ab]6 > dmesg.noerr (1.98 KB, text/plain), 2011-11-11 21:49 UTC, Christian Volkmann
dmesg with proper boot, but also error message of boot.md (237.44 KB, text/plain), 2011-11-17 19:25 UTC, Christian Volkmann

Description Christian Boltz 2011-10-18 15:26:22 UTC
I just zypper dup'ed my system (previous update was 2011-10-12), and one of the changes was that /sbin/init was switched from sysvinit to systemd.

Unfortunately systemd fails to mount my encrypted /home partition.
I see the following error message:
systemd-cryptsetup [801]: crypt_init() failed: Block device required

I'm not even asked for the password for my home partition.

# cat /etc/crypttab:
cr_home          /dev/md1             none       none

# grep home /etc/fstab
/dev/mapper/cr_home    /home     ext3    noatime,acl,user_xattr,nofail 0 0

Booting with sysvinit works without problems.
Comment 1 Frederic Crozat 2011-10-18 15:50:59 UTC
please give rpm -q systemd

and attach /etc/fstab

also, try to boot with systemd.log_level=debug systemd.log_target=kmsg

and after booting, attach dmesg output to this bug report.
Comment 2 Christian Boltz 2011-10-18 18:01:58 UTC
Created attachment 457357 [details]
/etc/fstab

/etc/fstab (some /data/* nfs mounts renamed for privacy reasons)
Comment 3 Christian Boltz 2011-10-18 18:03:49 UTC
# rpm -q systemd
systemd-37-2.1.x86_64

I can't reboot at the moment, so the dmesg debug output will need some hours...
Comment 4 Christian Boltz 2011-10-19 00:08:44 UTC
Created attachment 457433 [details]
dmesg with systemd debugging enabled

BTW: After saving the dmesg output, "systemctl start cryptsetup@cr_home.service" worked without problems. Maybe the raid devices are not ready yet when systemd tries to mount them?
Comment 5 Frederic Crozat 2011-10-19 09:05:32 UTC
could you describe your system disk layout ?

It looks like a dup from bnc#724238 although your disk layout is probably different
Comment 6 Christian Boltz 2011-10-19 11:47:57 UTC
(In reply to comment #5)
> could you describe your system disk layout ?

Two harddisks, identical partitioning scheme with 6 partitions on both. Each pair of partitions is used as RAID1 (mirroring) raid.

# cat /proc/mdstat 
Personalities : [raid1] [raid0] [raid10] [raid6] [raid5] [raid4] 
md0 : active raid1 sdb2[1] sda2[0]
      200800 blocks super 1.0 [2/2] [UU]
      bitmap: 0/7 pages [0KB], 16KB chunk

md1 : active raid1 sdb3[1] sda3[0]
      133162712 blocks super 1.0 [2/2] [UU]
      bitmap: 7/254 pages [28KB], 256KB chunk

md4 : active (auto-read-only) raid1 sdb7[1] sda7[0]
      1839396 blocks super 1.0 [2/2] [UU]
      bitmap: 0/8 pages [0KB], 128KB chunk

md3 : active raid1 sdb6[1] sda6[0]
      10482308 blocks super 1.0 [2/2] [UU]
      bitmap: 2/160 pages [8KB], 32KB chunk

md2 : active raid1 sdb5[1] sda5[0]
      10482308 blocks super 1.0 [2/2] [UU]
      bitmap: 0/160 pages [0KB], 32KB chunk

unused devices: <none>

> It looks like a dup from bnc#724238 although your disk layout is probably
> different

There are probably more differences - I'm not even asked for the password at boot, while bug 724238 sounds like Ludwig entered the password. OTOH, I see an error message "Block device required" which Ludwig doesn't mention.

Hmmm, are all /dev/md* devices already set up at the time of (the attempt of) mounting my encrypted home? Or only those that are listed in /etc/mdadm.conf of the initrd (which is only the / partition AFAIK, because mkinitrd considers the other partitions as superfluous at boot)?
Comment 7 Frederic Crozat 2011-10-19 12:02:23 UTC
could you try to copy /run/systemd/generator/crypsetup@cr_home.service to /etc/systemd/systemd

and in it, in the section [Unit], add :
After=dmraid.service
Comment 8 Christian Boltz 2011-10-20 13:09:56 UTC
(In reply to comment #7)
> could you try to copy /run/systemd/generator/crypsetup@cr_home.service to
> /etc/systemd/systemd

I only have a directory /etc/systemd/system (without d) and copied the file there:
    /etc/systemd/system/cryptsetup@cr_home.service
Is this the correct location?

> and in it, in the section [Unit], add :
> After=dmraid.service

This doesn't seem to change anything :-(
I did another experiment on the next boot: I added an additional
    ExecStart=/bin/cat /proc/mdstat
and hoped to get the /proc/mdstat output - but it didn't appear on the screen.

# cat  /etc/systemd/system/cryptsetup@cr_home.service
[Unit]
Description=Cryptography Setup for %I
Conflicts=umount.target
DefaultDependencies=no
BindTo=dev-md1.device dev-mapper-%i.device
After=systemd-readahead-collect.service systemd-readahead-replay.service dev-md1.device
After=dmraid.service
Before=umount.target
Before=local-fs.target
Before=cryptsetup.target

[Service]
Type=oneshot
RemainAfterExit=yes
TimeoutSec=0
ExecStart=/bin/cat /proc/mdstat
ExecStart=/lib/systemd/systemd-cryptsetup attach 'cr_home' '/dev/md1' 'none' 'none'
ExecStop=/lib/systemd/systemd-cryptsetup detach 'cr_home'
Comment 9 Frederic Crozat 2011-10-20 13:20:31 UTC
yes, it is the correct location.

try using /bin/cp /proc/mdstat /tmp/mdstat instead of cat
Comment 10 Christian Boltz 2011-10-21 10:39:38 UTC
(In reply to comment #9)
> try using /bin/cp /proc/mdstat /tmp/mdstat instead of cat

That was a very good idea :-)
The md* devices are there, but only the / partition is assembled.

# cat /mdstat 
Personalities : [raid1] [raid0] [raid10] [raid6] [raid5] [raid4] 
md3 : inactive sda6[0](S)
      10482308 blocks super 1.0
       
md4 : inactive sda7[0](S)
      1839396 blocks super 1.0
       
md1 : inactive sda3[0](S)
      133162712 blocks super 1.0
       
md0 : inactive sda2[0](S)
      200800 blocks super 1.0
       
md2 : active raid1 sda5[0] sdb5[1]
      10482308 blocks super 1.0 [2/2] [UU]
      bitmap: 4/160 pages [16KB], 32KB chunk

unused devices: <none>
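
The manual check above can be scripted. The following is a minimal sketch, not taken from the original report: it assumes the /proc/mdstat layout shown in this comment, and the function name list_inactive_md is made up for illustration.

```shell
# List md arrays that an mdstat-formatted file reports as inactive.
# $1: file to read (defaults to /proc/mdstat).
# Relies on array lines having the form "<name> : <state> ...".
list_inactive_md() {
    awk '$2 == ":" && $3 == "inactive" { print $1 }' "${1:-/proc/mdstat}"
}
```

Run against the output quoted in this comment, it would print md3, md4, md1 and md0 (one per line), leaving out the assembled md2.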
Comment 11 Frederic Crozat 2011-10-21 11:50:34 UTC
after discussing with kay, could you try replacing :
After=dmraid.service
with
After=device-mapper.service

since your RAID might be provided by dm
Comment 12 Christian Boltz 2011-10-22 09:44:58 UTC
(In reply to comment #11)
> with
> After=device-mapper.service

The result doesn't look too different :-( - the only difference is that md4 (swap, commented out in /etc/fstab) does not exist.

# cat /mdstat 
Personalities : [raid1] [raid0] [raid10] [raid6] [raid5] [raid4] 
md3 : inactive sda6[0](S)
      10482308 blocks super 1.0

md1 : inactive sda3[0](S)
      133162712 blocks super 1.0

md0 : inactive sda2[0](S)
      200800 blocks super 1.0

md2 : active raid1 sdb5[1] sda5[0]
      10482308 blocks super 1.0 [2/2] [UU]
      bitmap: 4/160 pages [16KB], 32KB chunk

unused devices: <none>
Comment 13 Christian Boltz 2011-10-22 18:02:56 UTC
After checking this again, I'm not really surprised. The reason is simple:

# ls -l /lib/systemd/system/device-mapper.service 
lrwxrwxrwx   [...]    /lib/systemd/system/device-mapper.service -> /dev/null

There doesn't seem to be a device-mapper.service in /etc or /run.
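
A unit file that is a symlink to /dev/null is how systemd marks a unit as masked. As a small sketch (not from the original thread; the helper name is_masked_unit is hypothetical), the ls -l check above could be written as:

```shell
# Succeed if the given unit file path is a symlink pointing at /dev/null,
# i.e. the unit is masked at that location.
# $1: full path to a unit file, e.g. /lib/systemd/system/device-mapper.service
is_masked_unit() {
    [ -L "$1" ] && [ "$(readlink "$1")" = "/dev/null" ]
}
```

This only inspects the one path it is given; it does not search the other unit directories (/etc/systemd/system, /run/systemd/system).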
Comment 14 Frederic Crozat 2011-10-24 14:29:40 UTC
my bad, try with
After=md.service
Comment 15 Christian Boltz 2011-10-25 17:58:35 UTC
also doesn't change anything :-(

Well, to be exact, the incomplete raid arrays now contain sdb* instead of sda*...

The interesting thing is that I see messages about a failed fsck for /dev/md3 (that's /testroot) - I'm asked to enter the root password to login to emergency mode afterwards.

Screen content at this time:

Waiting for device /dev/md2 to appear:  ok
fsck from util-linux 2.20
[/sbin/fsck.ext3 (1) -- /] fsck.ext3 -a -C0 /dev/md2
/dev/md2: clean, 305143/655360 files, 2310028/2620577 blocks
fsck succeeded. Mounting root device read-write.
Mounting root /dev/md2
mount -o rw,noatime,acl,user_xattr -t ext3 /dev/md2 /root
systemd-fsck[788]: fsck.ext3: Invalid argument while trying to open /dev/md3
systemd-fsck[788]: /dev/md3:
systemd-fsck[788]: The superblock could not be read or does not describe a correct ext2
systemd-fsck[788]: filesystem.  If the device is valid and it really contains an ext2
systemd-fsck[788]: filesystem (and not swap or ufs or something else), then the superblock
systemd-fsck[788]: is corrupt, and you might try running e2fsck with an alternate superblock:
systemd-fsck[788]: e2fsck -b 8193 <device>
[    9.650065] systemd-fsck[788]: fsck failed with error code 8.
[    9.697610] EXT3-fs (md3): error: unable to read superblock
[    9.907186] udevd[421]: '/sbin/blkid -o udev -p /dev/md0' [807] terminated by signal 15 (Terminated)
[    9.933196] udevd[433]: '/sbin/mdadm --incremental /dev/sdb7' [796] terminated by signal 15 (Terminated)
[    9.933207] udevd[422]: '/sbin/mdadm --detail --export /dev/md3' [835] terminated by signal 15 (Terminated)
Welcome to emergency mode. Use "systemctl default" or ^D to activate default mode.
Give root password for login:



# cat /mdstat
Personalities : [raid1] [raid0] [raid10] [raid6] [raid5] [raid4] 
md1 : inactive sdb3[1](S)
      133162712 blocks super 1.0
       
md4 : inactive sdb7[1](S)
      1839396 blocks super 1.0
       
md3 : inactive sdb6[1](S)
      10482308 blocks super 1.0
       
md0 : inactive sdb2[1](S)
      200800 blocks super 1.0
       
md2 : active raid1 sda5[0] sdb5[1]
      10482308 blocks super 1.0 [2/2] [UU]
      bitmap: 5/160 pages [20KB], 32KB chunk

unused devices: <none>
Comment 16 Frederic Crozat 2011-10-26 07:57:51 UTC
did you try to increase MDADM_DEVICE_TIMEOUT in /etc/sysconfig/mdadm ?

default value is 60 (seconds), so your system should wait for 60s before asking for crypto password.
Comment 17 Christian Boltz 2011-10-26 16:08:52 UTC
I found a better solution and am now using bugzilla on a systemd-booted installation the first time :-)

What I did: I copied fsck@.service to /etc/systemd/system/ and added
    After=md.service

It looks like this did the trick.
Note: I still have the modified cryptsetup@cr_home.service in place, so maybe both are needed. I'll test without the modified cryptsetup@cr_home.service when I boot next time and report back.
Comment 18 Frederic Crozat 2011-10-26 16:25:59 UTC
If it is working, I'd be interested in the output of systemctl

and systemctl show fsck@*.service

where you replace fsck@*.service by each of the fsck@* service you found in the systemctl output.
Comment 19 Christian Boltz 2011-10-26 17:57:22 UTC
Created attachment 459011 [details]
output from various systemctl commands on working system

"systemctl" didn't list anything related to fsck, but "systemctl --all" did. Therefore I used --all as base. The output is included in this tarball, the filenames should speak for themselves ;-)  (Interestingly, there doesn't seem to be a fsck for my encrypted partition - or is it "hidden" in cryptsetup@cr_home.service?)
I also included my /etc/systemd in case you need to check something there.

Note: I didn't reboot yet, so the modified cryptsetup@cr_home.service is still used.
Comment 20 Christian Boltz 2011-10-27 11:23:42 UTC
Without the modified cryptsetup@cr_home.service, I get
    systemd-cryptsetup[823]: crypt_init() failed: Block device required.

This means the "After=md.service" is needed in cryptsetup@cr_home.service _and_ fsck@.service.
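
The workaround described in comments 17 and 20 can be sketched as a small helper. This is an illustration under assumptions, not the fix that was eventually packaged: the function name add_after_dependency is made up, and it is meant to be run on a copy of the unit file under /etc/systemd/system, not on the packaged file.

```shell
# Add an extra "After=<unit>" ordering line to the [Unit] section of a unit file.
# $1: unit file to edit in place (a local copy, e.g. under /etc/systemd/system)
# $2: unit to order after, e.g. md.service
add_after_dependency() {
    unit_file=$1
    dep=$2
    # Nothing to do if the dependency already appears on an After= line.
    if grep -Eq "^After=(.*[[:space:]])?${dep}([[:space:]].*)?\$" "$unit_file"; then
        return 0
    fi
    tmp_unit=$(mktemp) || return 1
    # Print every line unchanged and add the new line right after the [Unit] header.
    awk -v dep="$dep" '
        { print }
        /^\[Unit\]/ { print "After=" dep }
    ' "$unit_file" > "$tmp_unit" && cat "$tmp_unit" > "$unit_file"
    status=$?
    rm -f "$tmp_unit"
    return $status
}
```

Following the steps in this thread, one would first copy fsck@.service (and the generated cryptsetup@cr_home.service) to /etc/systemd/system/ and then call, for example, add_after_dependency /etc/systemd/system/fsck@.service md.service.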
Comment 21 Frederic Crozat 2011-10-27 12:01:37 UTC
hmm, could you attach:

systemctl show dev-md3.device 

I hope there is a way to detect if a device is a "md" one (from systemd) so we could add dependency on md.service on the fly.
Comment 22 Christian Boltz 2011-10-27 16:45:06 UTC
Created attachment 459201 [details]
systemctl show dev-md3.device output
Comment 23 Christian Boltz 2011-11-02 12:34:20 UTC
Created attachment 459924 [details]
screen photo from shutdown

Frederic, I have a probably related problem at shutdown: systemd tries to stop systemd-cryptsetup before umounting my encrypted /home partition. Needless to say that this obviously fails ;-)

The system is only halted (not powered off) - is this caused by the previous failure or another bug?

This screen photo contains all messages that are displayed on tty1.
Comment 24 Frederic Crozat 2011-11-02 12:46:57 UTC
Check man halt => you should use halt -p (SUSE has an "extension" to transform halt into poweroff on x86/x86-64).
Comment 25 Frederic Crozat 2011-11-04 13:15:08 UTC
could you try replacing After=md.service with After=lvm.service in both fsck@.service and cryptsetup@cr_home.service ?

since boot.lvm has a dependency on boot.md, it could be enough to just add this one.
Comment 26 Christian Boltz 2011-11-05 12:36:15 UTC
(In reply to comment #25)
> could you try replacing After=md.service with After=lvm.service in both
> fsck@.service and cryptsetup@cr_home.service ?

That gives me the same /proc/mdstat as in Comment 10 :-( - in other words: depending on just lvm.service doesn't work.
Comment 27 Christian Volkmann 2011-11-06 13:35:39 UTC
I have the same problem. The home directory is located on an md device.

For me the following fix worked:
- Add boot.md as required for /etc/init.d/boot.crypto / boot.crypto-early

- Changed for me:
boot.crypto:
# Required-Start:    boot.localfs boot.md

boot.crypto-early:
# Required-Start:    boot.udev boot.md


I personally would set this bug to P1 and add the fix to Goldmaster.
Everybody with an encrypted home partition on a RAID device might have
serious trouble booting the system.

Christian
Comment 28 Christian Volkmann 2011-11-06 13:56:31 UTC
Created attachment 460622 [details]
patch which works for me.

I expect boot.lvm should also be started before boot.crypto.
The shutdown problem should be fixed with the Required-Stop
change.

Christian
Comment 29 Christian Boltz 2011-11-06 15:18:56 UTC
@Christian V: while your problem might be similar, it is still totally different - this bugreport is about systemd, but you are talking about the sysvinit initscripts.
Please open a new bugreport, assign it to lnussel@suse.com and add a short note with the bug number here.
Comment 30 Christian Volkmann 2011-11-06 22:02:37 UTC
Christian B is right. My change did not help. With systemd, the system
seems to boot successfully only at random for me.

What I can see from my dmesg logs:
- md.service is started early
- calls somehow boot.md
  /etc/init.d/boot.md fails for me: (code=exited, status=1/FAILURE)
  md.service fails due to this.

I would expect that "cryptsetup" should depend on md.service. Also, md.service (as
replacement for the old boot.md) should be successful.

Potential error reasons for me can be:
- md.service starts boot.md, => "PPID != 1"(/etc/rc.status redirect) =>
   boot.md fails due to this, like for a manual call?

- wrong error detection "md.service" cause "/" ( as md-device) has partially
  started md-services and start of another device ( for /home) only reports
  a partial ok.
Comment 31 Christian Volkmann 2011-11-06 23:02:35 UTC
Created attachment 460633 [details]
boot.md output during an error.

The return code "3" of /etc/init.d/boot.md confuses the md.service start.
The md-devices seem to appear from multiple places.

As a test to skip boot.md I did "ln -s /dev/null /lib/systemd/system/md.service".
The system also started the md-devices. My conclusion: the error is that boot.md
(mdadm) is started and two mechanisms try to activate the devices => problems.

I have attached a small log file to show the output of /proc/mdstat
and boot.md in the error situation.

I suppose the fix is to add the following link to the systemd package:
/lib/systemd/system/md.service => /dev/null
Comment 32 Christian Volkmann 2011-11-06 23:06:51 UTC
(In reply to comment #23)
> Frederic, I have a probably related problem at shutdown: systemd tries to stop
> systemd-cryptsetup before umounting my encrypted /home partition. Needless to
> say that this obviously fails ;-)
> 
> The system is only halted (not powered off) - is this caused by the previous
> failure or another bug?

My "md-problem system" does not have this problem. "Not powered off" seems to be
another story. My wife's system also has this problem, but it uses neither
md nor encrypted partitions.
Comment 33 Frederic Crozat 2011-11-07 14:31:56 UTC
*** Bug 728543 has been marked as a duplicate of this bug. ***
Comment 34 Frederic Crozat 2011-11-07 16:07:03 UTC
*** Bug 728674 has been marked as a duplicate of this bug. ***
Comment 35 Christian Volkmann 2011-11-07 20:57:42 UTC
Did anybody try this as a fix?

"ln -s /dev/null /lib/systemd/system/md.service"

I expect it should fix the problem and should be included
for 12.1 before goldmaster!

Christian
Comment 36 Frederic Crozat 2011-11-08 08:32:54 UTC
What makes you think disabling boot.md will fix anything? It will not make systemd services wait until the md partitions are ready before trying to use them.
Comment 37 Christian Volkmann 2011-11-08 08:44:25 UTC
It looks to me like concurrent access by two mechanisms.

Depending on the timing one might disturb/block the other.
"boot.md" vs. "systemd detection"
The "systemd detection" works for me without need of boot.md
I expect it's triggered by the kernel internal detection(?)

The error message "mdadm: /dev/md1 is already in use.", see attachment
<https://bugzillafiles.novell.org/attachment.cgi?id=460633>, also points
to this.
Comment 38 Frederic Crozat 2011-11-08 09:04:02 UTC
It is probably already working for you because your md device is started from the initrd (which might not be the case for all setups).
Comment 39 Christian Volkmann 2011-11-08 09:25:45 UTC
(In reply to comment #38)
> it is probably already working for you because your md device is started from
> initrd (which might not be the case for all setup).
Hmm, I expect only "/" as one of the three md should be detected by initrd.
Then it seems to trigger the md-assembly of my "crypted /home/cv" and the third
md-partition.

So the final solution has to take care about:
- avoid concurrent access ( systemd vs. boot.md), (was the error for me).
- boot.md to accept/ignore already active devices
- work also in case initrd does not trigger the md-device assembly.

I expect the tests to cover should be:
1.
  / as md
2.
  / as md
  /other as md
3.
  / as md
  /other (encrypted) as md
4.
  / as md
  /other (encrypted) as md
  /others (multiple md)
5.
  /other as md
6.
  /other (encrypted) as md
7.
  /other (encrypted) as md
  /others (multiple md)

The tests should be done multiple times to find concurrent/timing problems.
Comment 40 Olivier Nicolas 2011-11-08 21:16:35 UTC
1. root fs is not an MD device

2. Adding /lib/systemd/system/md.service -> /dev/null  does not solve the problem (Boot fails most of the time)

3. I don't have the problem for LVM over DM partitions !
Comment 41 Christian Volkmann 2011-11-08 22:08:11 UTC
(In reply to comment #40)
> 2. Adding /lib/systemd/system/md.service -> /dev/null  does not solve..

I have double checked my configuration and my changes.
As first try I did add the UUID-devices to /etc/fstab and /etc/crypttab.
This change was still active before I did the /dev/null link.

Both changes together boot fine. A fallback to /dev/md? in fstab/crypttab
brought the error also back on my system.

So as a workaround both are needed:
-  The /dev/null link and
- A change of /etc/fstab and /etc/crypttab to use /dev/disk/by-id/md-uuid-...
  instead of /dev/md.. ( md? relation to uuid can be found in /etc/mdadm.conf )
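
The /dev/md? to UUID lookup mentioned above can be sketched as follows. This is an illustration only, not from the original thread: it assumes ARRAY lines of the form "ARRAY /dev/mdN ... UUID=<uuid>" in /etc/mdadm.conf and that the UUID string there matches the one used in the /dev/disk/by-id/md-uuid-... symlink; the function name md_uuid_paths is made up. The result should be checked against ls -l /dev/disk/by-id/ before editing fstab or crypttab.

```shell
# Print "<md device> <by-id path>" for each ARRAY line of an mdadm.conf-style file.
# $1: configuration file to read (defaults to /etc/mdadm.conf).
md_uuid_paths() {
    awk '$1 == "ARRAY" {
        uuid = ""
        # Look for the UUID=... keyword among the remaining fields.
        for (i = 3; i <= NF; i++)
            if ($i ~ /^UUID=/) uuid = substr($i, 6)
        if (uuid != "") print $2, "/dev/disk/by-id/md-uuid-" uuid
    }' "${1:-/etc/mdadm.conf}"
}
```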
Comment 42 Christian Boltz 2011-11-08 22:28:17 UTC
(In reply to comment #41)
> (In reply to comment #40)
> > 2. Adding /lib/systemd/system/md.service -> /dev/null  does not solve..
> So as workaround both is needed:
> -  The /dev/null link and
> - A change of /etc/fstab and /etc/crypttab to use /dev/disk/by-id/md-uuid-...
>   instead of /dev/md.. ( md? relation to uuid can be found in /etc/mdadm.conf )

That sounds like the additional After=md.service in fsck@.service and cryptsetup@*.service (see earlier comments for the details) are the better solution ;-)
Comment 43 andreas hoffmann 2011-11-09 06:22:00 UTC
Confirmed: I've also had some md boot issues after upgrading to RC2, but adding md.service at the end of the After= line in fsck@.service did the trick for me.

I followed comment 17 exactly and that resolved the problem. Now it always boots fine, and all three md devices are working properly, even when using /dev/mdX in fstab. I don't really like the UUID approach; I prefer to address the devices by the old-style /dev/mdX name.
Comment 44 Frederic Crozat 2011-11-09 15:10:37 UTC
I have committed a fix in home:fcrozat:systemd / systemd, which should be available in systemd-37-291.1 (or more) later today, when OBS has rebuilt it.

please test both cryptsetup and non-cryptsetup case.

if you had modified cryptsetup / fsck files in /etc/systemd/system, please remove them (or move them in a different location) and any other changes you might have done to your systemd setup.
Comment 45 Olivier Nicolas 2011-11-09 20:08:21 UTC
I have successfully tested systemd-37-292.1.x86_64.rpm.
No problem mounting /home after multiple reboot.
Comment 46 Christian Volkmann 2011-11-09 21:41:39 UTC
No success with systemd-37-292.1.x86_64.
Comment 47 Christian Volkmann 2011-11-09 21:45:04 UTC
Created attachment 461278 [details]
dmesg with systemd.log_level=debug systemd.log_target=kmsg
Comment 48 Christian Volkmann 2011-11-09 21:48:16 UTC
Created attachment 461279 [details]
My configuration crypttab,fstab,mdadm.conf
Comment 49 Frederic Crozat 2011-11-10 10:32:37 UTC
christian, could you attach systemctl show fsck\@dev-md2.service output ? it looks like this fsck is not waiting for boot.md to finish its job.
Comment 50 andreas hoffmann 2011-11-10 12:27:39 UTC
(In reply to comment #44)
No success with systemd-37-292.1.x86_64. It still hangs on normal boot. After booting to maintenance mode and pressing reset it boots normally. Now the trick with /etc/systemd/system/fsck@.service After= ... md.service is not working anymore.

Error Message on screen is same as before with the unpatched systemd.
Comment 51 Frederic Crozat 2011-11-10 12:37:11 UTC
andreas : dmesg output after booting with systemd.log_level=debug systemd.log_target=kmsg please.
Comment 52 Christian Boltz 2011-11-10 17:05:33 UTC
I also can't boot with your test package :-(  I'll attach dmesg with debug output tomorrow, but there's a possible bug I noticed:

--- AWAY_system/fsck@.service (my modified, working fsck@.service)
+++ /lib/systemd/system/fsck@.service (from systemd-37-292.1.x86_64)
-After=[...] %i.device
-After=md.service
-#After=lvm.service
+After=[....]  %i.device lvm.service dm.service mdraid.service

Are "dm.service" and "mdraid.service" correct or did you mean "md.service" with one of them?
Comment 53 Christian Volkmann 2011-11-10 17:30:42 UTC
Created attachment 461488 [details]
systemctl show fsck\@dev-md2.service
Comment 54 andreas hoffmann 2011-11-11 06:36:39 UTC
Created attachment 461600 [details]
systemd-loglevel=debug for chasing systemd raid bug
Comment 55 andreas hoffmann 2011-11-11 06:38:15 UTC
(In reply to comment #51)
> andreas : dmesg output after booting with systemd.log_level=debug
> systemd.log_target=kmsg please.

Here you can find my systemd log from booting with error with the new systemd version:

rpm -q systemd
systemd-37-292.1.x86_64
Comment 56 andreas hoffmann 2011-11-11 06:46:43 UTC
Now I've tried something. I checked the new /lib/systemd/system/fsck@.service file and discovered that the After= line contains mdraid.service.

I changed this to md.service, as in comment #17, in the /etc/systemd/system/fsck@.service file, and now it boots with the new version too. Maybe it's just a spelling mistake and it should have been md.service from the beginning and not mdraid.service. But I'm not that familiar with systemd yet, so I don't know whether mdraid.service should be OK or not. At least it works for me with md.service in /lib/systemd/system/fsck@.service.
Comment 57 andreas hoffmann 2011-11-11 09:38:14 UTC
I've now installed systemd-37-293.1.x86_64 and it always boots fine without any tinkering with the fsck@.service file. The file now references md.service from the start. It just takes some time after the system is up until systemd has started all the network services (up to 5 minutes), but I think that's a different kettle of fish.
Comment 58 Christian Volkmann 2011-11-11 20:14:23 UTC
systemd-37-293.1.x86_64 works fine for me also.
I had trouble on the first boot, but was not able to reproduce it
on the next 10 boots. Good errors will come again ;-)
Comment 59 andreas hoffmann 2011-11-11 20:23:25 UTC
@Christian Volkmann I also had problems one time. I figured out it was because one of the RAIDs had lost its second device (sda3). I re-added it and then it was fine again. As long as the RAID was missing one drive, 12.1 refused to boot with only one copy left in this RAID. Why the drive went missing, I have no clue...
Running mdadm --manage /dev/mdX --add /dev/sdXY fixed the problem, and the system booted fine afterwards at least 10 times.
So maybe your problem was also related to some disturbance in the RAID. You had better check /proc/mdstat to see whether everything looks OK (all devices there and all in sync).
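
The /proc/mdstat check suggested here can be sketched as a script. This is a minimal illustration, not from the original thread: it assumes the mdstat layout quoted earlier in this report, where a missing member shows up as "_" in the trailing status field (for example [U_]), and the function name list_degraded_md is made up.

```shell
# List md arrays whose member status field contains "_" (a missing member).
# $1: mdstat-formatted file to read (defaults to /proc/mdstat).
list_degraded_md() {
    awk '
        # Remember the array name from lines such as "md1 : active raid1 ...".
        $2 == ":" && $1 ~ /^md/ { name = $1 }
        # The "blocks" line ends in a status field such as [UU] or [U_].
        / blocks / && $NF ~ /^\[[U_]*_[U_]*\]$/ { print name }
    ' "${1:-/proc/mdstat}"
}
```

Arrays that are listed as inactive have no such status field and are not reported by this check.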
Comment 60 Christian Volkmann 2011-11-11 21:12:59 UTC
@Andreas, thanks for the hint. The 3 md-devices are fine without any manual
action. But I did <ctrl>-<alt>-<del> without regular booting. So the raid
should have gotten no chance to get corrupt. I will change to "permanent systemd.log_level=debug systemd.log_target=kmsg" to get a trace in case it
happens again.
Comment 61 Christian Volkmann 2011-11-11 21:34:48 UTC
Created attachment 461790 [details]
dmesg systemd-37-293.1.x86_64 situation with md-trouble
Comment 62 Christian Volkmann 2011-11-11 21:36:09 UTC
Created attachment 461791 [details]
systemctl show  systemd-37-293.1.x86_64 situation with md-trouble
Comment 63 Christian Volkmann 2011-11-11 21:49:19 UTC
Created attachment 461792 [details]
no error situration dmesg | grep -e md2 -e sd[ab]6   > dmesg.noerr
Comment 64 Christian Boltz 2011-11-11 22:22:10 UTC
Good news also from me: with systemd-37-293.1.x86_64 I don't need any changes in fsck@.service and cryptsetup@*.service :-)
Comment 65 Frederic Crozat 2011-11-14 14:02:48 UTC
latest package in my home repo contains the fix for the typo spotted by Christian.

could everybody involved in this bug reply with :

your_name : FIXED / NOT_FIXED

to get a better view on whether the issue is fixed for everybody or not ;)
Comment 66 Christian Volkmann 2011-11-14 14:23:20 UTC
Christian Volkmann: NOT_FIXED

runs sometimes. 
error situation: see comment 61 and comment 62
non error situation: see comment 63
Comment 67 Christian Boltz 2011-11-14 20:12:29 UTC
Christian Boltz: FIXED

(With the exception from comment 23 - my system only halts, but doesn't switch off automatically. I should probably open another bugreport for it, this one is long enough already ;-)
Comment 68 Olivier Nicolas 2011-11-14 22:38:49 UTC
Olivier Nicolas: FIXED

Currently systemd-37-293.1.x86_64 fixed since systemd-37-292.1.x86_64.rpm

Simple RAID1
Comment 69 andreas hoffmann 2011-11-17 12:01:08 UTC
andreas hoffmann : FIXED
Comment 70 Frederic Crozat 2011-11-17 14:21:29 UTC
@christian Boltz: use halt -p, not halt (see manpage, our initscript had a workaround to cause halt to become halt -p on x86*, which is not really "standard" and not available under systemd).

@christian Volkmann: it looks like boot.md is returning an error when you fail to boot. it would be better to attach full dmesg trace when everything is working properly, just grepping dmesg is not enough
Comment 71 Christian Volkmann 2011-11-17 19:25:38 UTC
Created attachment 462728 [details]
dmesg with proper boot, but also error message of boot.md

There is still an error message of boot.md.
See comment 37 for my idea about this error.
( "set -x" within boot.md on an earlier session)
Comment 72 Christian Boltz 2011-11-17 21:20:25 UTC
(In reply to comment #70)
> @christian Boltz: use halt -p, not halt

Can you give me a hint what file I need to change, please? IIRC I tried with a modified halt.service (two weeks ago), but it didn't change anything. So either I modified the wrong file or my modification was wrong...
Comment 73 Frederic Crozat 2011-11-30 09:09:20 UTC
*** Bug 731230 has been marked as a duplicate of this bug. ***
Comment 74 Frederic Crozat 2011-11-30 11:51:23 UTC
*** Bug 728674 has been marked as a duplicate of this bug. ***
Comment 75 Frederic Crozat 2011-12-09 14:40:59 UTC
@christian : you can't change any file.

sr 96122 pushed to openSUSE:12.1:Update:Test
requesting maintenance update for 12.1
Comment 76 Bernhard Wiedemann 2011-12-09 15:00:21 UTC
This is an autogenerated message for OBS integration:
This bug (724912) was mentioned in
https://build.opensuse.org/request/show/96122 12.1 / systemd
https://build.opensuse.org/request/show/96125 Factory / systemd
Comment 77 Bernhard Wiedemann 2011-12-09 18:00:22 UTC
This is an autogenerated message for OBS integration:
This bug (724912) was mentioned in
https://build.opensuse.org/request/show/96193 Factory / systemd
Comment 78 Bernhard Wiedemann 2011-12-12 17:00:27 UTC
This is an autogenerated message for OBS integration:
This bug (724912) was mentioned in
https://build.opensuse.org/request/show/96377 12.1 / systemd
Comment 80 Cristian Rodríguez 2011-12-17 14:40:49 UTC
Fixes are submitted to 12.1 , factory.
Comment 81 Christian Volkmann 2011-12-17 14:58:53 UTC
(In reply to comment #66)
> Christian Volkmann: NOT_FIXED
> 
> runs sometimes. 
> error situation: see comment 61 and comment 62
> non error situation: see comment 63
To clarify: comment 63 shows part of a situation in which the
error does not happen.

(In reply to comment #80)
> Fixes are submitted to 12.1 , factory.

Hmm, still NOT_FIXED for me. Only the basic fix has been submitted.
Without UUID-based device names, boot still fails sometimes for me.
Comment 82 Frederic Crozat 2011-12-19 10:04:10 UTC
Christian, could you open a new bug with the relevant data and configuration?(please avoid using tarball for configuration files, if possible), it will be easier to track.

Thanks
Comment 83 Frederic Crozat 2012-01-04 09:24:30 UTC
Maintenance update has been released for 12.1, closing as fixed.

Christian, please open a separate bug to track your issue.
Comment 84 Jiri Stavinoha 2012-01-27 02:19:05 UTC
Christian, please take a look to #743717
Comment 85 Steve Revilak 2012-02-22 04:40:51 UTC
I have also been affected by this issue, although in my case, the
problem manifested itself exactly as described in
https://bugzilla.novell.com/show_bug.cgi?id=724912.

I have an encrypted /home partition (/dev/mapper/cr_md1), where the
underlying volume is a mirrored raid partition (/dev/md1).

In about 1/3 of boots, I'm not asked for a password to unlock cr_md1,
and systemd-cryptsetup fails with

[   22.030435] systemd-cryptsetup[868]: crypt_init() failed: Block device required

When this occurs, `cat /proc/mdstat' shows md1 as "inactive".

In this state, `cryptsetup isLuks /dev/md1' exits with status 4.
Normally, this command exits with status 0.

I will attach /etc/fstab,  /etc/crypttab, and dmesg.

For now, I have worked around this by replacing
systemd-sysvinit-37-3.6.1.x86_64 with
sysvinit-init-2.88+-66.58.2.x86_64.
Comment 86 Steve Revilak 2012-02-22 04:43:59 UTC
Oops - meant to add Comment 85 to #743717