Bug 960669 - udev 228: after update from 13.2/leap 42.1, the network is not enabled
Status: RESOLVED FIXED
Classification: openSUSE
Product: openSUSE Tumbleweed
Component: Basesystem
Version: Current
Hardware: Other Other
Priority: P5 - None
Severity: Normal
Target Milestone: ---
Assigned To: systemd maintainers
Reported: 2016-01-05 08:50 UTC by Dominique Leuenberger
Modified: 2021-03-02 16:46 UTC (History)
fbui: needinfo? (daniel)


Attachments
revert predictable name support for virtio net dev (1.54 KB, patch)
2016-01-28 09:15 UTC, Franck Bui
Description Dominique Leuenberger 2016-01-05 08:50:01 UTC
+++ This bug was initially created as a clone of Bug #959764 +++

systemd 228 [1] has been submitted to Factory and is currently in Staging Project Staging I [2].

The upgrade test from openSUSE 13.2 to current TW [3] shows that no network is active or configured after the update.

[1] https://build.opensuse.org/request/show/351003
[2] https://build.opensuse.org/project/show/openSUSE:Factory:Staging:I
[3] https://openqa.opensuse.org/tests/110901
Comment 1 Jan Engelhardt 2016-01-08 15:26:28 UTC
The openQA test fails because the network did not come up in the just-upgraded system, which in turn is caused by a missing /etc/sysconfig/network/ifcfg-enp0s3 file.

So, I checked the pristine state of openSUSE-13.2-x86_64.qcow2. This I converted to VDI/Virtualbox, and I booted the DVD rescue system to rerun mkinitrd (because Virtualbox has different harddisk controllers and therefore kernel modules in use). After that, booting openSUSE-13.2-x86_64 succeeds, and presents to me: enp0s3, which means that 13.2 already had no network.

I conclude: This is not a regression caused by systemd-228.
Comment 2 Dominique Leuenberger 2016-01-08 16:11:45 UTC
Not changing the hardware configuration might be a good idea...

I can reproduce having network with the 13.2 image if I boot the machine the same way openQA does:

/usr/bin/qemu-kvm -m 1024 -netdev user,id=qanet0 -device virtio-net,netdev=qanet0,mac=52:54:00:12:34:56 -drive file=openSUSE-13.2-x86_64.qcow2
(I left out the options that did not make a difference.)
Comment 3 Jan Engelhardt 2016-01-08 22:21:46 UTC
So qemu uses virtio-net (at least the way you invoked it), which was not supported by udev 220's "net_id" function, but is with 228, judging from the commit logs. This is why it remains at eth0 in 220 and becomes enp1234 in udev-228.

Normally, Werner's persistent naming feature should have caught this, but fails to activate in 13.2, probably because of this line in 75-persistent.*rules:

  ENV{net.ifnames}!="0", GOTO="persistent_net_generator_end"

Note how, in Tumbleweed, the line reads ENV{net.ifnames}=="1" instead.
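
The difference between the two guard lines can be sketched in shell. This is only an illustration, not the actual rules file: the function names are made up, and it assumes that an unset net.ifnames compares like an empty string in a udev match.

```sh
#!/bin/sh
# Sketch only: emulate the guard in front of the persistent-net generator.
# $1 = value of net.ifnames ("" when it is not set on the kernel cmdline).
# Prints "yes" if the generator would run, "no" if the guard jumps past it.

generator_runs_13_2() {
    # 13.2: ENV{net.ifnames}!="0", GOTO="persistent_net_generator_end"
    if [ "$1" != "0" ]; then echo no; else echo yes; fi
}

generator_runs_tw() {
    # Tumbleweed: ENV{net.ifnames}=="1", GOTO="persistent_net_generator_end"
    if [ "$1" = "1" ]; then echo no; else echo yes; fi
}

for v in "" 0 1; do
    printf 'net.ifnames=%-5s 13.2: %-3s TW: %s\n' \
        "${v:-unset}" "$(generator_runs_13_2 "$v")" "$(generator_runs_tw "$v")"
done
```

With net.ifnames unset, the 13.2 guard skips the generator while the Tumbleweed guard lets it run, which matches the observation that 13.2 never writes a persistent rule on its own.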
Comment 4 Jan Engelhardt 2016-01-09 16:34:38 UTC
Another problem is that dracut behaves differently in QEMU than on normal systems.

http://www.spinics.net/lists/linux-initramfs/msg04208.html
Comment 5 Jan Engelhardt 2016-01-17 11:19:38 UTC
dracut report:
http://www.spinics.net/lists/linux-initramfs/msg04208.html

Basically, we somehow have to ensure the persistent name rule file (if any) is copied into the initramfs.
Comment 6 Ludwig Nussel 2016-01-25 11:00:01 UTC
in 13.2 or TW?
Comment 7 Jan Engelhardt 2016-01-25 11:10:48 UTC
Actions for 13.2:
* switch udev script from net.ifnames!=0 to net.ifnames==1 so that 75-persistent-net.rules is generated
* make sure 75-persistent-net.rules, if present, ends up in the initramfs

Actions for TW:
* make sure 75-persistent-net.rules, if present, ends up in the initramfs

Actions for OpenQA:
* make sure that the 13.2 QCOW image is actually populated with all updates (that did not seem to be the case last time I checked in December - it lacked the systemd-210.+biggitnumberhere+)
Comment 8 Fabian Vogt 2016-01-25 11:22:49 UTC
(In reply to Jan Engelhardt from comment #7)
> * make sure 75-persistent-net.rules, if present, ends up in the initramfs

I can do that, but:
- Why has it worked before?
- Are we sure that it doesn't break anything? I'd say it'll break some setups which are currently working fine.
Comment 9 Jan Engelhardt 2016-01-25 11:30:23 UTC
"work" is coincidental.

systemd 210 does not support virtio-net. systemd 228 does. Change in default behavior.

13.2 systems rely on defaults and fail to store the "custom parts" that openQA actually depends on.
Comment 10 Fabian Vogt 2016-01-25 11:43:30 UTC
So after the update to TW, udev grabs the network interface during the initrd already and we have to avoid that with 75-persistent-net.rules, right?
If that's correct it should be enough to include the rule in TW only, so it won't unnecessarily break 13.2 setups.
Comment 11 Ludwig Nussel 2016-01-25 13:04:29 UTC
So the problem is very similar to
https://doc.opensuse.org/release-notes/x86_64/openSUSE/Leap/42.1/#idm140096374852304

Maybe we should enable the name generator mechanism always, no matter what the actual naming policy is (the generator can decide that), just to make sure we always get the udev rule so network interfaces never change names.
Comment 12 Jan Engelhardt 2016-01-25 13:20:37 UTC
>So after the update to TW, udev grabs the network interface during the initrd already

It always grabbed the name during the initrd (in case of virtio-net), because virtio-net.ko gets loaded in the initrd (and e1000.ko is not!).

>it should be enough to include the rule in TW only,

But you have to make sure 75-p exists and mkinitrd was run, both before you reboot into the TW systemd. This is why I am saying that 13.2 needs to record its naming scheme first before it can be upgraded to TW.

At least, inside openqa... regular systems without virtio-net have no issue.
Comment 13 Fabian Vogt 2016-01-25 13:30:43 UTC
(In reply to Jan Engelhardt from comment #12)
> >So after the update to TW, udev grabs the network interface during the initrd already
> 
> It always grabbed the name during the initrd (in case of virtio-net),
> because virtio-net.ko gets loaded in the initrd (and e1000.ko is not!).

Yeah, but that's not a big issue. It's a tiny bug that should be fixed, but it should not change behaviour in any way. If it does, that's a different bug.

> >it should be enough to include the rule in TW only,
> 
> But you have to make sure 75-p exists and mkinitrd was run, both before you
> reboot into the TW systemd. This is why I am saying that 13.2 needs to
> record its naming scheme first before it can be upgraded to TW.
I'm not sure whether it is even possible (without doing something weird) to boot tumbleweed with a 13.2 generated initrd.
Comment 14 Franck Bui 2016-01-25 16:31:26 UTC
(In reply to Jan Engelhardt from comment #7)
> Actions for 13.2:
> * switch udev script from net.ifnames!=0 to net.ifnames==1 so that
> 75-persistent-net.rules is generated

That's correct. I'm not sure what the value of "net.ifnames" is when this option is not set on the kernel cmdline, but using "net.ifnames==1" makes a difference.

Does that mean that persistent naming has never worked so far?

> * make sure 75-persistent-net.rules, if present, ends up in the initramfs

I don't see why this needs to be included in the initramfs for normal systems. It might be needed for systems using nfsroot, but for systems that don't use the network within the initramfs...
Comment 15 Jan Engelhardt 2016-01-25 17:19:08 UTC
Correct, persistent naming never worked because of this. You will find that 13.2 gives you enp0s3 and wwp0s2u24 or something in that style, and will also not generate /etc/udev/rules.d/7*-persistent-net* on its own.
Comment 16 Ludwig Nussel 2016-01-26 09:52:46 UTC
For a while there was a patch in dracut that added 70-persistent-net.rules as a fix for bug 868375. The patch was removed by trenn a few months later without a bug reference. Thomas, any idea why? Looks like we need it.
Comment 17 Fabian Vogt 2016-01-26 10:01:45 UTC
He removed it as "merged mainline" although it was not.
I hesitate to add it back in as it will cause breakage in VMs running TW and also machines with / on network.
Comment 18 Ludwig Nussel 2016-01-26 10:05:56 UTC
no, the patch was actually reverted. there were two patches, one that adds it and one that reverts it, ie noop. With the version upgrade to 44 both were removed.
Comment 19 Ludwig Nussel 2016-01-26 10:07:38 UTC
why do you fear breakage? TW doesn't use persistent names so the file in question doesn't exist unless it was an upgrade from e.g. leap.
Comment 20 Ludwig Nussel 2016-01-26 10:13:42 UTC
confirmed that making dracut include 70-persistent-net.rules in the initrd correctly renames the virtio net device as used by openQA.
Comment 21 Franck Bui 2016-01-26 10:15:49 UTC
(In reply to Ludwig Nussel from comment #20)
> confirmed that making dracut include 70-persistent-net.rules in the initrd
> correctly renames the virtio net device as used by openQA.

I don't see why including this rule in initramfs helps if it's already present in /etc/udev/rules.d/

Could anybody explain ?
Comment 22 Fabian Vogt 2016-01-26 10:25:52 UTC
(Calm down! I need some time writing my comments, I had to confirm "Save changes" twice now...)

(In reply to Ludwig Nussel from comment #18)
> no, the patch was actually reverted. there were two patches, one that adds
> it and one that reverts it, ie noop. With the version upgrade to 44 both
> were removed.

Which one is the reverting patch? I can't see any patch related to this in the changes for 44. I remember removing some pointless patches, but not network related.

It says in dracut.changes:

> Patches merged in the git tracking repository:
> 95udev-rules-add-persistent-network-rule

(In reply to Ludwig Nussel from comment #19)
> why do you fear breakage? TW doesn't use persistent names so the file in
> question doesn't exist unless it was an upgrade from e.g. leap.

Has this changed at some point in TW? I have an old Tumbleweed system here that uses "ens*".

(In reply to Franck Bui from comment #21)
> (In reply to Ludwig Nussel from comment #20)
> > confirmed that making dracut include 70-persistent-net.rules in the initrd
> > correctly renames the virtio net device as used by openQA.
> 
> I don't see why including this rule in initramfs helps if it's already
> present in /etc/udev/rules.d/
> 
> Could anybody explain ?

It's only triggered on device recognition, e.g. "rmmod virtio-net; modprobe virtio-net" renames the device.

(In reply to Ludwig Nussel from comment #20)
> confirmed that making dracut include 70-persistent-net.rules in the initrd
> correctly renames the virtio net device as used by openQA.

Ok. I'll fix both bugs then.
Comment 23 Franck Bui 2016-01-26 10:30:48 UTC
(In reply to Fabian Vogt from comment #22)
> > 
> > I don't see why including this rule in initramfs helps if it's already
> > present in /etc/udev/rules.d/
> > 
> > Could anybody explain ?
> 
> It's only triggered on device recognition, e.g. "rmmod virtio-net; modprobe
> virtio-net" renames the device.

udev is restarted once the system has switched to the real rootfs and it should apply *all* rules at this point, including renaming the net interface.

Did anybody actually try including the rule in /etc/udev/rules.d but not in the initramfs?

Here it simply works as expected.
Comment 24 Dr. Werner Fink 2016-01-26 10:36:04 UTC
I'd like to receive comments only once per commit :)
Comment 25 Ludwig Nussel 2016-01-26 10:36:48 UTC
(In reply to Fabian Vogt from comment #22)
> (In reply to Ludwig Nussel from comment #18)
> > no, the patch was actually reverted. there were two patches, one that adds
> > it and one that reverts it, ie noop. With the version upgrade to 44 both
> > were removed.
> 
> Which one is the reverting patch? I can't see any patch related to this in
> the changes for 44. I remember removing some pointless patches, but not
> network related.

$ osc ls -R 69 openSUSE:Factory dracut|grep pers
0022-95udev-rules-add-persistent-network-rule.patch
0136-Revert-95udev-rules-add-persistent-network-rule.patch

>> why do you fear breakage? TW doesn't use persistent names so the file in
>> question doesn't exist unless it was an upgrade from e.g. leap.
>
> Has this changed during some point in TW?

The change from persistent to predictable was in 13.1 already AFAICT,
so very long ago.

> I have a old Tumbleweed system here that uses "ens*".

Which is expected as that is a predictable name :-) That system
doesn't have 70-persistent-net.rules then, right?

Just to summarize, we are talking about upgrade problems here and
the problems are different depending on base distro:

1. upgrade from leap: leap uses persistent names (eth0), just as SLE,
  so an upgrade to TW must honor 70-persistent-net.rules
2. upgrade from 13.2: 13.2 does use predictable names but udev on
  13.2 apparently didn't support predictable names for virtio
  devices. So in openQA all machines still used eth0.

1) will be immediately fixed when dracut includes
70-persistent-net.rules in initrd again.

2) requires an additional workaround in 13.2 to create 70-persistent-net.rules, we may do that in openQA or just don't test upgrades from 13.2 anymore :-)
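
The workaround for 2) could look roughly like the following. It is a sketch under assumptions: the helper name is hypothetical, and the rule layout simply copies the line that the Leap generator is reported to emit later in this report.

```sh
#!/bin/sh
# Sketch only: print a persistent-net rule that pins a MAC address to a name.
# $1 = MAC address, $2 = desired interface name.
# The rule layout mirrors the generator output quoted later in this report.
make_persistent_rule() {
    printf 'SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="%s", ATTR{type}=="1", KERNEL=="eth*", NAME="%s"\n' \
        "$1" "$2"
}

# Example: the MAC address used by the openQA qemu invocation.
make_persistent_rule 52:54:00:12:34:56 eth0
```

On a 13.2 image the output would have to be written to /etc/udev/rules.d/70-persistent-net.rules before the upgrade. Whether the initrd also needs to be regenerated afterwards is exactly what the rest of this thread debates.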
Comment 26 Ludwig Nussel 2016-01-26 10:44:12 UTC
(In reply to Franck Bui from comment #23)
> (In reply to Fabian Vogt from comment #22)
> > > 
> > > I don't see why including this rule in initramfs helps if it's already
> > > present in /etc/udev/rules.d/
> > > 
> > > Could anybody explain ?
> > 
> > It's only triggered on device recognition, e.g. "rmmod virtio-net; modprobe
> > virtio-net" renames the device.
> 
> udev is restarted once the system has switched to the real rootfs and it
> should apply *all* rules at this point, including renaming the net interface.
> 
> Did anybody actually tried to include the rule in /etc/udev/rules.d but not
> in initramfs ?

Yes, openQA does that when trying to upgrade from Leap:
https://openqa.opensuse.org/tests/115742

I can give you access to a disk image that was created by openQA that way if
you want to take a look yourself.
Comment 27 Jan Engelhardt 2016-01-26 10:47:22 UTC
> (In reply to Fabian Vogt from comment #22)
> 
> udev is restarted once the system has switched to the real rootfs and it
> should apply *all* rules at this point, including renaming the net interface.

Except that renaming the net interface is NOT done when it already WAS renamed at some point in the past.

^ Which is what happens with virtio-net inside openqa.
Comment 28 Franck Bui 2016-01-26 10:51:59 UTC
(In reply to Ludwig Nussel from comment #26)
> 
> Yes, openQA does that when trying to upgrade from Leap:
> https://openqa.opensuse.org/tests/115742
> 

The initial issue was about upgrading from 13.2 to TW, so was my testing.

> I can give you access to a disk image that was created by openQA that way if
> you want to take a look yourself.

yes please.
Comment 29 Fabian Vogt 2016-01-26 10:52:58 UTC
(In reply to Ludwig Nussel from comment #25)
> >> why do you fear breakage? TW doesn't use persistent names so the file in
> >> question doesn't exist unless it was an upgrade from e.g. leap.
> >
> > Has this changed during some point in TW?
> 
> The change from persistent to predictable was in 13.1 already AFAICT,
> so very long ago.
> 
> > I have a old Tumbleweed system here that uses "ens*".
> 
> Which is expected as that is a predictable name :-) That system
> doesn't have 70-persistent-net.rules then, right?

I don't know, I only have the config files for it (ifcfg-ens1).

> Just to summarize, we are talking about upgrade problems here and
> the problems are different depending on base distro:
> 
> 1. upgrade from leap: leap uses persistent names (eth0), just as SLE,
>   so an upgrade to TW must honor 70-persistent-net.rules
> 2. upgrade from 13.2: 13.2 does use predictable names but udev on
>   13.2 apparently didn't support predictable names for virtio
>   devices. So in openQA all machines still used eth0.
> 
> 1) will be immediately fixed when dracut includes
> 70-persistent-net.rules in initrd again.
> 
> 2) requires an additional workaround in 13.2 to create
> 70-persistent-net.rules, we may do that in openQA or just don't test
> upgrades from 13.2 anymore :-)

I talked to trenn, the cause for the (chaotic) revert is bug 886669.
Basically, 70-persistent-net.rules does not work correctly in the initrd
due to various reasons and bugs. It also conflicts with dracut's network
naming scheme (but that can be fixed).

I'll only commit the removal of forced virtio-net in QEMU, that should
fix it as well.
Comment 30 Franck Bui 2016-01-26 11:09:25 UTC
(In reply to Jan Engelhardt from comment #27)
> > (In reply to Fabian Vogt from comment #22)
> > 
> > udev is restarted once the system has switched to the real rootfs and it
> > should apply *all* rules at this point, including renaming the net interface.
> 
> Except that renaming the net interface is NOT done when it already WAS
> renamed at some point in the past.
> 
> ^ Which is what happens with virtio-net inside openqa.

That's not what I'm seeing here:

$ journalctl -b
...
Jan 26 12:03:36 linux-nypb kernel: virtio_net virtio0 ens3: renamed from eth0
Jan 26 12:03:36 linux-nypb systemd-udevd[321]: renamed network interface eth0 to ens3
...
Jan 26 12:03:42 linux-nypb kernel: virtio_net virtio0 eth0: renamed from ens3
Jan 26 12:03:42 linux-nypb systemd-udevd[655]: renamed network interface ens3 to eth0
...

$ ip link show eth0
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
    link/ether 52:54:00:12:34:56 brd ff:ff:ff:ff:ff:ff

$ lsinitrd | grep 70-persistent-net.rules
$

$ lsinitrd | grep virtio_net
-rw-r--r--   1 root     root        56818 Jan 18 12:09 lib/modules/4.4.0-1-default/kernel/drivers/net/virtio_net.ko
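
The two lsinitrd checks above can be wrapped in a small helper. This is a sketch: the function name is hypothetical, and it only inspects a file listing passed on stdin, so it makes no claim about lsinitrd's exact output format beyond the path appearing at the end of each line.

```sh
#!/bin/sh
# Sketch only: read an initrd file listing on stdin and succeed if
# 70-persistent-net.rules is part of it.
has_persistent_net_rule() {
    grep -q -E '(^|/)70-persistent-net\.rules$'
}

# Intended use on a real system (not executed here):
#   lsinitrd | has_persistent_net_rule && echo included || echo missing
```

A missing rule combined with a present virtio_net.ko, as in the listing above, means the NIC can be named inside the initrd before any persistent rule is read.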
Comment 31 Jan Engelhardt 2016-01-26 11:29:50 UTC
(In reply to Franck Bui from comment #30)
> > Except that renaming the net interface is NOT done when it already WAS
> > renamed at some point in the past.
> > 
> > ^ Which is what happens with virtio-net inside openqa.
> 
> That's not what I'm seeing here:
> 
> $ journalctl -b
> ...
> Jan 26 12:03:36 linux-nypb kernel: virtio_net virtio0 ens3: renamed from eth0
> Jan 26 12:03:36 linux-nypb systemd-udevd[321]: renamed network interface
> eth0 to ens3
> ...
> Jan 26 12:03:42 linux-nypb kernel: virtio_net virtio0 eth0: renamed from ens3
> Jan 26 12:03:42 linux-nypb systemd-udevd[655]: renamed network interface
> ens3 to eth0
> ...

That's not what the openqa test showed. If it was rerenamed back to eth0, then the openqa test would not have failed. Something seems wrong with your system.

In 75-persistent-net-rules-generator, you find

# ignore the interface if a name has already been set
NAME=="?*", GOTO="persistent_net_generator_end"

# device name whitelist
KERNEL!="eth*|ath*|wlan*[0-9]|msh*|ra*|sta*|ctc*|lcs*|hsi*", GOTO="persistent_net_generator_end"

So by definition, you even have a bug because ens must not be rerenamed.
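
The whitelist check can be emulated with a shell case statement, since the pattern is an alternation of globs. This is a sketch with a made-up function name, and it assumes KERNEL is matched against the current interface name.

```sh
#!/bin/sh
# Sketch only: emulate the generator's device name whitelist
#   KERNEL!="eth*|ath*|wlan*[0-9]|msh*|ra*|sta*|ctc*|lcs*|hsi*"
# Returns 0 if the generator would consider the name, 1 if it would skip it.
in_generator_whitelist() {
    case "$1" in
        eth*|ath*|wlan*[0-9]|msh*|ra*|sta*|ctc*|lcs*|hsi*) return 0 ;;
        *) return 1 ;;
    esac
}

for name in eth0 ens3 wlan0; do
    if in_generator_whitelist "$name"; then
        echo "$name: considered by the generator"
    else
        echo "$name: skipped by the generator"
    fi
done
```

An interface that already carries a name such as ens3 falls outside the whitelist, so the generator itself would not touch it.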
Comment 32 Fabian Vogt 2016-01-26 12:07:53 UTC
I made a patch so that virtio_net is not needlessly included, but apparently the default initrd contains the whole network stack. I'll track this down, but it'll take a while.

In the end, the proper fix is to include the rules file, but currently that does not seem to work reliably.
Comment 33 Franck Bui 2016-01-26 13:50:29 UTC
(In reply to Jan Engelhardt from comment #31)
> That's not what the openqa test showed. If it was rerenamed back to eth0,
> then the openqa test would not have failed. Something seems wrong with your
> system.

Initially this bug was about an upgrade from 13.2 to TW.

In the case of 13.2, /etc/udev/rules.d/70-persistent-net.rules is simply missing since, as you already know, 75-persistent-net-rules-generator is bogus. And if the rule is missing, obviously no renaming happens.

However, if you manually add a persistent rule in /etc/udev/rules.d to make sure that the virtio net interface is named "eth0", then it works just fine.

For some reason the persistent net rules generator was fixed on SLE12-SP1/Leap. That is why the behavior is different there and why adding the rule to the initramfs makes a difference.

And here is what is generated by the generator:
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="52:54:00:12:34:56", ATTR{type}=="1", KERNEL=="eth*", NAME="eth0"

Please note the KERNEL=="eth*" condition.

Right after the system switched to the final rootfs, the interface is named "ens3" (since 70-persistent-net.rules is not included). When udev is started (again), it reads 70-persistent-net.rules but the rule doesn't apply since the above condition is not met.

So if you remove this condition, the renaming will work as expected even if systemd already renamed it "ens3" from within the initramfs.
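
Removing the condition can be sketched with sed. This only illustrates the edit on the rule text quoted above; the function name is hypothetical and nothing here is taken from an actual patch.

```sh
#!/bin/sh
# Sketch only: drop the KERNEL=="eth*" match from a generated rule so that it
# also applies to an interface that was already renamed (e.g. to ens3).
strip_kernel_match() {
    sed 's/KERNEL=="eth\*", //'
}

rule='SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="52:54:00:12:34:56", ATTR{type}=="1", KERNEL=="eth*", NAME="eth0"'
printf '%s\n' "$rule" | strip_kernel_match
```

The resulting rule identifies the device by MAC address and type only, so the name given inside the initramfs no longer matters for the match.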

Now I have another question: why 80-net-setup-link.rules is included in the initramfs ? It's the rule which enables the predictable naming scheme done by udev for eth0, if I'm understanding correctly: if it's removed then the NIC name will be "eth0" as usual and the generated rule (if present) will work as expected.
Comment 34 Fabian Vogt 2016-01-26 13:57:10 UTC
(In reply to Franck Bui from comment #33)
> Now I have another question: why 80-net-setup-link.rules is included in the
> initramfs ? It's the rule which enables the predictable naming scheme done
> by udev for eth0, if I'm understanding correctly: if it's removed then the
> NIC name will be "eth0" as usual and the generated rule (if present) will
> work as expected.

It's been in upstream dracut since 37 (basically since we use it).
It is important that NICs are called the same in the initrd and the full system, as the configuration files for wicked are just "cp -R"ed over.
Comment 35 Franck Bui 2016-01-26 15:07:37 UTC
(In reply to Fabian Vogt from comment #34)
> It's been in upstream dracut since 37 (basically since we use it).
> It is important that NICs are called the same in the initrd and the full
> system, as the configuration files for wicked are just "cp -R"ed over.

Could you explain how the config files for wicked are used inside the initramfs ? I don't think you meant that wicked (and all of its deps) can be included in the initramfs...

If you don't want NICs to be renamed after switching to the new rootfs, I'm afraid you'll have to include all rules in /etc/udev/rules.d/
Comment 36 Fabian Vogt 2016-01-26 15:19:21 UTC
(In reply to Franck Bui from comment #35)
> (In reply to Fabian Vogt from comment #34)
> > It's been in upstream dracut since 37 (basically since we use it).
> > It is important that NICs are called the same in the initrd and the full
> > system, as the configuration files for wicked are just "cp -R"ed over.
> 
> Could you explain how the config files for wicked are used inside the
> initramfs ? I don't think you meant that wicked (and all of its deps) can be
> included in the initramfs...

It is, with most parts stripped. (No DBus for example, as mentioned in the last meeting).
One example: https://build.opensuse.org/package/view_file/Base:System/dracut/0015-40network-replace-dhclient-with-wickedd-dhcp-supplic.patch?expand=1

> If you don't want NICs to be renamed after switching to the new rootfs, I'm
> afraid you'll have to include all rules in /etc/udev/rules.d/

I totally agree with this, but it's not possible right now, I'm afraid.
Comment 37 Franck Bui 2016-01-27 08:19:21 UTC
(In reply to Fabian Vogt from comment #36)
> (In reply to Franck Bui from comment #35)
> > (In reply to Fabian Vogt from comment #34)
> > > It's been in upstream dracut since 37 (basically since we use it).
> > > It is important that NICs are called the same in the initrd and the full
> > > system, as the configuration files for wicked are just "cp -R"ed over.
> > 
> > Could you explain how the config files for wicked are used inside the
> > initramfs ? I don't think you meant that wicked (and all of its deps) can be
> > included in the initramfs...
> 
> It is, with most parts stripped. (No DBus for example, as mentioned in the
> last meeting).
> One example:
> https://build.opensuse.org/package/view_file/Base:System/dracut/0015-40network-replace-dhclient-with-wickedd-dhcp-supplic.patch?expand=1
> 
> > If you don't want NICs to be renamed after switching to the new rootfs, I'm
> > afraid you'll have to include all rules in /etc/udev/rules.d/
> 
> I totally agree with this, but it's not possible right now, I'm afraid.

The support of the network in dracut seems to have some issues (missing udev rules, now virtio_net driver is going to be removed, ...).

Is a bug report already opened to track those shortcomings ? if not shouldn't it be created ?
Comment 38 Fabian Vogt 2016-01-27 08:25:21 UTC
(In reply to Franck Bui from comment #37)
> (In reply to Fabian Vogt from comment #36)
> > (In reply to Franck Bui from comment #35)
> > > (In reply to Fabian Vogt from comment #34)
> > > > It's been in upstream dracut since 37 (basically since we use it).
> > > > It is important that NICs are called the same in the initrd and the full
> > > > system, as the configuration files for wicked are just "cp -R"ed over.
> > > 
> > > Could you explain how the config files for wicked are used inside the
> > > initramfs ? I don't think you meant that wicked (and all of its deps) can be
> > > included in the initramfs...
> > 
> > It is, with most parts stripped. (No DBus for example, as mentioned in the
> > last meeting).
> > One example:
> > https://build.opensuse.org/package/view_file/Base:System/dracut/0015-40network-replace-dhclient-with-wickedd-dhcp-supplic.patch?expand=1
> > 
> > > If you don't want NICs to be renamed after switching to the new rootfs, I'm
> > > afraid you'll have to include all rules in /etc/udev/rules.d/
> > 
> > I totally agree with this, but it's not possible right now, I'm afraid.
> 
> The support of the network in dracut seems to have some issues (missing udev
> rules, now virtio_net driver is going to be removed, ...).
> 
> Is a bug report already opened to track those shortcomings ? if not
> shouldn't it be created ?

It's only a single bug, that is that the udev rule is missing for various reasons. The virtio_net driver is not going to be removed, it's only added when necessary (instead of always on QEMU).

I'll open a bug to track the support for the udev rule.
Comment 39 Ludwig Nussel 2016-01-27 08:29:43 UTC
Can you estimate when to expect a fix/workaround for Factory? I'm just asking as I need to evaluate whether it's worth investing time into adding a temporary hack to the openQA tests to make them pass. systemd is stuck in stagings far too long already unfortunately.
Comment 40 Fabian Vogt 2016-01-27 08:35:43 UTC
(In reply to Ludwig Nussel from comment #39)
> Can you estimate when to expect a fix/workaround for Factory? I'm just
> asking as I need to evaluate whether it's worth investing time into adding a
> temporary hack to the openQA tests to make them pass. systemd is stuck in
> stagings far too long already unfortunately.

A proper fix takes a bit longer, but that is not needed as fixing the other two bugs (one of which is currently in SR #356093, but it's likely not sufficient) is enough: Not forcibly include network stuff in the initrd. It shouldn't take too long to fix, I already talked to pwieczorkiewicz yesterday and he confirmed that network support isn't needed by default.
If nothing goes horribly wrong (if it does, I'll let you know) it should be done in a few hours.
Comment 41 Franck Bui 2016-01-27 09:58:08 UTC
(In reply to Fabian Vogt from comment #40)
> A proper fix takes a bit longer, but that is not needed as fixing the other
> two bugs (one of which is currently in SR #356093, but it's likely not
> sufficient) is enough: Not forcibly include network stuff in the initrd. It
> shouldn't take too long to fix, I already talked to pwieczorkiewicz
> yesterday and he confirmed that network support isn't needed by default.
> If nothing goes horribly wrong (if it does, I'll let you know) it should be
> done in a few hours.

I don't see how removing network stuff from initramfs will help for the initial test case which is:

  - updating systemd from a version lower than v226 to v228
  - virtio-net is used
  - no persistent rule has been generated previously

This basically happens when:

  - updating TW
  - upgrading 13.2 -> TW
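
The three conditions of this test case can be written down as a check. This is a sketch: the function name and its arguments are hypothetical, the driver name virtio_net is taken from the kernel log earlier in this report, and the version threshold is the one stated above.

```sh
#!/bin/sh
# Sketch only: decide whether a system matches the test case described above.
# $1 = systemd/udev version before the update (integer, e.g. 210 or 224)
# $2 = kernel driver of the NIC (e.g. virtio_net, e1000)
# $3 = path where a persistent rule file would live
is_affected() {
    [ "$1" -lt 226 ] && [ "$2" = "virtio_net" ] && [ ! -e "$3" ]
}

if is_affected 210 virtio_net /etc/udev/rules.d/70-persistent-net.rules; then
    echo "interface name is expected to change after the update to v228"
fi
```

Both scenarios listed above (updating an older TW snapshot, and upgrading 13.2 with its systemd 210) satisfy the version condition, which is why they are treated as the same case.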
Comment 42 Fabian Vogt 2016-01-27 10:02:22 UTC
(In reply to Franck Bui from comment #41)
> (In reply to Fabian Vogt from comment #40)
> > A proper fix takes a bit longer, but that is not needed as fixing the other
> > two bugs (one of which is currently in SR #356093, but it's likely not
> > sufficient) is enough: Not forcibly include network stuff in the initrd. It
> > shouldn't take too long to fix, I already talked to pwieczorkiewicz
> > yesterday and he confirmed that network support isn't needed by default.
> > If nothing goes horribly wrong (if it does, I'll let you know) it should be
> > done in a few hours.
> 
> I don't see how removing network stuff from initramfs will help for the
> initial test case which is:
> 
>   - updating systemd from  version lesser than v226 to v228
>   - virtio-net is used
>   - no persistent rule has been generated previously
> 
> This basically happens when:
> 
>   - updating TW
>   - upgrading 13.2 -> TW

This says otherwise:

(In reply to Ludwig Nussel from comment #20)
> confirmed that making dracut include 70-persistent-net.rules in the initrd
> correctly renames the virtio net device as used by openQA.

Not including network stuff at all is basically the same.
Comment 43 Franck Bui 2016-01-27 10:11:13 UTC
(In reply to Fabian Vogt from comment #42)
> (In reply to Franck Bui from comment #41)
> > (In reply to Fabian Vogt from comment #40)
> > > A proper fix takes a bit longer, but that is not needed as fixing the other
> > > two bugs (one of which is currently in SR #356093, but it's likely not
> > > sufficient) is enough: Not forcibly include network stuff in the initrd. It
> > > shouldn't take too long to fix, I already talked to pwieczorkiewicz
> > > yesterday and he confirmed that network support isn't needed by default.
> > > If nothing goes horribly wrong (if it does, I'll let you know) it should be
> > > done in a few hours.
> > 
> > I don't see how removing network stuff from initramfs will help for the
> > initial test case which is:
> > 
> >   - updating systemd from  version lesser than v226 to v228
> >   - virtio-net is used
> >   - no persistent rule has been generated previously
> > 
> > This basically happens when:
> > 
> >   - updating TW
> >   - upgrading 13.2 -> TW
> 
> This says otherwise:
> 
> (In reply to Ludwig Nussel from comment #20)
> > confirmed that making dracut include 70-persistent-net.rules in the initrd
> > correctly renames the virtio net device as used by openQA.
> 
> Not including network stuff at all is basically the same.

I guess this was for Leap...

Ludwig ?

OTOH it's easy to give it a test.
Comment 44 Franck Bui 2016-01-27 10:16:30 UTC
(In reply to Franck Bui from comment #43)
> (In reply to Fabian Vogt from comment #42)
> > 
> > (In reply to Ludwig Nussel from comment #20)
> > > confirmed that making dracut include 70-persistent-net.rules in the initrd
> > > correctly renames the virtio net device as used by openQA.
> > 
> > Not including network stuff at all is basically the same.
> 

and as I described in my test case, 70-persistent-net.rules is not present at all.
Comment 46 Dominique Leuenberger 2016-01-27 10:20:10 UTC
(In reply to Franck Bui from comment #43)

> I guess this was for Leap...
> 
> Ludwig ?
> 
> OTOH it's easy to give it a test.

https://openqa.opensuse.org/tests/116375/modules/consoletest_setup/steps/20

latest openQA run of Staging I, containing systemd/udev 228 and the recent dracut submission (with this bug mentioned)

The test itself updates a Leap 42.1 installation to a TW snapshot - no network available after the update from DVD
Comment 47 Ludwig Nussel 2016-01-27 10:22:08 UTC
(In reply to Franck Bui from comment #43)
> > This says otherwise:
> > 
> > (In reply to Ludwig Nussel from comment #20)
> > > confirmed that making dracut include 70-persistent-net.rules in the initrd
> > > correctly renames the virtio net device as used by openQA.
> > 
> > Not including network stuff at all is basically the same.
> 
> I guess this was for Leap...

Yes, it was Leap as 13.2 doesn't create 70-persistent-net.rules.

If we can get the Leap upgrade working the 13.2 one isn't that important anymore.
Comment 49 Franck Bui 2016-01-27 11:23:11 UTC
(In reply to Ludwig Nussel from comment #47)
> If we can get the Leap upgrade working the 13.2 one isn't that important
> anymore.

It is important: as I said earlier it's the same case as updating TW from v224 to v228.
Comment 52 Ludwig Nussel 2016-01-27 12:08:29 UTC
(In reply to Franck Bui from comment #49)
> (In reply to Ludwig Nussel from comment #47)
> > If we can get the Leap upgrade working the 13.2 one isn't that important
> > anymore.
> 
> It is important: as I said earlier it's the same case as updating TW from
> v224 to v228.

Ok, that sucks but I'm not sure this problem should prevent systemd from passing stagings. It's at least not a problem for the development process as we unfortunately don't have upgrade tests that zypper dup between TW snapshots yet. If we don't find a fix for that problem we can warn about it before a snapshot with 228 gets released.
Comment 53 Dominique Leuenberger 2016-01-27 12:14:10 UTC
(In reply to Ludwig Nussel from comment #52)

> Ok, that sucks but I'm not sure this problem should prevent systemd from
> passing stagings. It's at least not a problem for the development process as
> we unfortunately don't have upgrade tests that zypper dup between TW
> snapshots yet. If we don't find a fix for that problem we can warn about it
> before a snapshot with 228 gets released.

And as usual forget about it down the line until SLE comes up with a new systemd version and then remember again that we had that problem long ago but decided to 'speed up things' instead of fixing it.. I for one prefer a fix when it's time to fix it - nothing else is blocked for not having the latest version of this package in TW for now.... I don't understand the hard push to circumvent all the processes we have in place.
Comment 54 Franck Bui 2016-01-27 12:23:14 UTC
(In reply to Dominique Leuenberger from comment #53)
> (In reply to Ludwig Nussel from comment #52)
> 
> > Ok, that sucks but I'm not sure this problem should prevent systemd from
> > passing stagings. It's at least not a problem for the development process as
> > we unfortunately don't have upgrade tests that zypper dup between TW
> > snapshots yet. If we don't find a fix for that problem we can warn about it
> > before a snapshot with 228 gets released.
> 
> And as usual forget about it down the line until SLE comes up with a new
> systemd version and then remember again that we had that problem long ago
> but decided to 'speed up things' instead of fixing it.. I for one prefer a
> fix when it's time to fix it - nothing else is blocked for not having the
> latest version of this package in TW for now.... I don't understand the hard
> push to circumvent all the processes we have in place.

Totally agreed.
Comment 55 Ludwig Nussel 2016-01-27 13:23:54 UTC
(In reply to Dominique Leuenberger from comment #53)
> (In reply to Ludwig Nussel from comment #52)
> 
> > Ok, that sucks but I'm not sure this problem should prevent systemd from
> > passing stagings. It's at least not a problem for the development process as
> > we unfortunately don't have upgrade tests that zypper dup between TW
> > snapshots yet. If we don't find a fix for that problem we can warn about it
> > before a snapshot with 228 gets released.
> 
> And as usual forget about it down the line until SLE comes up with a new
> systemd version and then remember again that we had that problem long ago

I'd assume that the openQA SP1->SP2 upgrade tests would also reveal that. And we have Bugzilla to track issues also in SLE:
https://bugzilla.suse.com/enter_bug.cgi?product=SUSE%20Linux%20Enterprise%20Server%2012%20SP2

> but decided to 'speed up things' instead of fixing it.. I for one prefer a
> fix when it's time to fix it - nothing else is blocked for not having the
> latest version of this package in TW for now.... I don't understand the hard
> push to circumvent all the processes we have in place.

No process is circumvented here. I thought we agreed to have at least one upgrade case working and that is what the dracut submission is meant to achieve. That doesn't need to be the ultimate perfect final fix yet. For now we don't even know what other surprises are hiding as we only exposed 228 to the limited test set of staging. To me this issue just doesn't seem to justify letting systemd rot any longer in staging.
Comment 56 Jiri Bohac 2016-01-27 14:01:53 UTC
Whatever you guys do, just don't think of putting the 70-p file into initrd.
The problems described in bsc#886669 are really severe and cannot be fixed in a sane way. We spent weeks on that when developing SLE12.

You should not be needing networking inside initramfs at all, except for diskless clients. I don't think we have any way to set up these using our installer or YaST, so it's up to the administrator to make such a config work manually, right?

Then the administrator should not have a problem with either
setting up the persistent names himself (e.g. if he knows that the client will only ever have one network interface, he can create a udev rule that will rename any interface to eth0).
Or he should be able to use the systemd-provided predictable names.
Or he can use the ifname= command line argument.

In all other situations, please try to prevent dracut from doing anything to the network - it will make our lives much simpler! :)
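
For illustration, the udev rule and the ifname= argument mentioned above could look roughly like what the following sketch prints. It is not taken from any SUSE package: the helper names udev_rename_rule and dracut_ifname_arg are hypothetical, the MAC address is a made-up example, and restricting the catch-all rule to KERNEL=="eth*" is an assumption so that lo and non-Ethernet devices are left alone. The ifname=<interface>:<MAC> form is the one described in dracut.cmdline(7).

```python
from typing import Optional


def udev_rename_rule(name: str, mac: Optional[str] = None) -> str:
    """Build a udev rule line that assigns `name` to a network interface.

    With a MAC address the rule matches that one device; without it, the
    rule matches any kernel-named eth* device (single-NIC assumption).
    """
    matches = ['SUBSYSTEM=="net"', 'ACTION=="add"']
    if mac is not None:
        matches.append('ATTR{address}=="%s"' % mac.lower())
    else:
        matches.append('KERNEL=="eth*"')
    return ", ".join(matches + ['NAME="%s"' % name])


def dracut_ifname_arg(name: str, mac: str) -> str:
    """Build the dracut kernel command line argument ifname=<interface>:<MAC>."""
    return "ifname=%s:%s" % (name, mac.lower())


if __name__ == "__main__":
    print(udev_rename_rule("eth0"))
    print(udev_rename_rule("eth0", "52:54:00:12:34:56"))
    print(dracut_ifname_arg("eth0", "52:54:00:12:34:56"))
```

The first printed line is the catch-all variant for a client that will only ever have one interface; the other two pin the name to a specific MAC address.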
Comment 57 Fabian Vogt 2016-01-27 14:13:05 UTC
(In reply to Jiri Bohac from comment #56)
> Whatever you guys do, just don't think of putting the 70-p file into initrd.
> The problems described in bsc#886669 are really severe and it cannot be
> fixed in a sane way. We spent weeks on that when developing SLE12.
> 
> You should not be needing networking inside initramfs at all

That's what triggered the bug.

> , except for
> diskless clients. I don't think we have any way to set up these using our
> installer or YaST, so it's up to the administrator to make such a config
> work manually, right?

YaST can do that, iSCSI and NFS.

> Then the administrator should not have a problem with either
> setting up the persistent names himself (e.g. if he knows that the client will
> only ever have one network interface, he can create a udev rule that will
> rename any interface to eth0).
> Or he should be able to use the systemd-provided predictable names.
> Or he can use the ifname= command line argument.
> 
> In all other situations, please try to prevent dracut from doing anything
> to the network - it will make our lives much simpler! :)

Yeah, it really is a mess :-/
Comment 58 Dominique Leuenberger 2016-01-27 16:13:24 UTC
We injected the current package from
> home:favogt:branches:Base:System
into Staging:I (no submitrequest, just for testing) and the latest test run performed in this setup showed that the Leap 42.1 instance keeps a working network setup after the upgrade.

Test run for reference: https://openqa.opensuse.org/tests/116730
Comment 59 Franck Bui 2016-01-28 09:14:40 UTC
I can't really work on this one currently but will in a couple of days. (This must be solved for SLE12-SP2 anyway).

If you really want to release v228 quickly, I would suggest temporarily reverting the commit that introduced support for virtio buses in net_id (commit 54683f0).
Comment 60 Franck Bui 2016-01-28 09:15:34 UTC
Created attachment 663549 [details]
revert predictable name support for virtio net dev
Comment 61 Marcus Rückert 2016-02-08 10:37:05 UTC
2016-02-03 17:01:29|install|openSUSE-release-ftp|20160130-1.2|x86_64||repo-oss|d6f806eec9879899ceffb8ad471e4429e2385186|
2016-02-06 14:18:00|install|openSUSE-release-ftp|20160205-1.1|x86_64||repo-oss|9016720b7312590e54fa8f2eeb165f125dcfaed4|


On the 2nd upgrade it also renamed my virtio-net device.
Comment 63 Jiri Bohac 2017-10-05 15:49:00 UTC
There is no plan for SLE15 to switch to predictable names. Thanks for pointing this out; the fact that SLE15alpha5 uses them is a bug (bsc#1061883). With this in mind, please don't put the 70-persistent-net.rules file in the initrd (details in bsc#886669)
Comment 64 Franck Bui 2017-10-05 16:00:58 UTC
Of course there's one; see FATE#323174.

Also, could you be so kind as to actually answer the questions in my last comment?
Comment 65 Jiri Bohac 2017-10-05 16:54:51 UTC
Ok, I wasn't aware of FATE#323174. However, whatever the default is, if the admin chooses to use the old naming scheme, you get the problem with initrd just as before. We can't just break it, even if we decide not to use it by default.
Comment 66 Franck Bui 2017-10-17 12:43:39 UTC
Since this bug is for TW, which (fortunately) has used the predictable naming scheme for some time now, there's no point in preventing the net rules or the link files from being included in the initramfs. Otherwise, interface renaming doesn't work at all, whether it is done through systemd link files or udev rules.

However for backward compat reasons and for those who can't type anything else than "eth0" we should make sure that udev is prepared for naming clashes (actually no one else has complained again so far so I'm not sure about that point).

That way, migration from old distros to TW works as expected, and we won't need to suggest that users resort to the dracut options to get the network and interface renaming working from inside the initramfs.

And yes, any attempt to rename interfaces will require regenerating the initrd, but that seems much better and easier than having to append the very limited dracut options to the kernel command line (which also requires running grub2-mkconfig)...
Comment 67 Franck Bui 2017-10-20 13:38:43 UTC
@Daniel, could you please include /etc/udev/rules.d/70-persistent-net.rules in the initramfs (if it's present)?

Also, we should make sure that custom link files end up in the initramfs too. The same happened to Debian a couple of years ago; see https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=793374, which might give you some relevant information.

Currently the link files don't seem to be considered by dracut (tested with TW).

Just make sure this is done for Factory *only* as all SLE versions (including SLE15) will continue to use the obsolete persistent naming scheme. And yes if customers decide to use the predictable names then renaming won't work for them... oh well.
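
To make the request concrete, here is a rough sketch of the selection logic: pick up 70-persistent-net.rules only if it exists, plus any custom *.link files. This is not dracut code (the real modules are shell scripts); the function name net_naming_files and the root parameter are hypothetical, and looking for custom link files under /etc/systemd/network is an assumption based on where systemd.link(5) expects administrator-provided files.

```python
from pathlib import Path
from typing import List

# Paths are relative so that they can be resolved against any root directory.
PERSISTENT_RULES = "etc/udev/rules.d/70-persistent-net.rules"
LINK_DIR = "etc/systemd/network"


def net_naming_files(root: str = "/") -> List[Path]:
    """List the interface-naming files that would need to go into the initramfs."""
    root_path = Path(root)
    found: List[Path] = []

    # The persistent rules file is optional: include it only if it is present.
    rules = root_path / PERSISTENT_RULES
    if rules.is_file():
        found.append(rules)

    # Custom systemd link files, in a stable (sorted) order.
    link_dir = root_path / LINK_DIR
    if link_dir.is_dir():
        found.extend(sorted(link_dir.glob("*.link")))

    return found


if __name__ == "__main__":
    for path in net_naming_files():
        print(path)
```

Taking the root directory as a parameter keeps the sketch usable against a mounted image or a test tree rather than only the running system.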
Comment 68 Daniel Molkentin 2017-10-23 11:18:03 UTC
Added to factory, pending review. Also, the *.link files are already being collected.
Comment 69 Bernhard Wiedemann 2017-10-23 14:01:53 UTC
This is an autogenerated message for OBS integration:
This bug (960669) was mentioned in
https://build.opensuse.org/request/show/535999 Factory / dracut
Comment 70 Jiri Bohac 2017-10-24 09:36:56 UTC
(In reply to Franck Bui from comment #67)
> Just make sure this is done for Factory *only* as all SLE versions
> (including SLE15) will continue to use the obsolete persistent naming
> scheme. And yes if customers decide to use the predictable names then
> renaming won't work for them... oh well.

I think there is no reason to include the .link files in dracut. This should go to SLE15 as well. Unlike the 70-persistent-net.rules file.
Comment 71 Jiri Bohac 2017-10-24 09:38:51 UTC
(In reply to Jiri Bohac from comment #70)
> I think there is no reason to include the .link files

Sorry, typo ... there is no reason _NOT_ to include. We should include the .link files in dracut.
Comment 72 Daniel Molkentin 2017-10-24 09:51:11 UTC
The *.link files already do get included in SLE15. In fact, also in SLE12 SP2 and SP3.
Comment 73 Franck Bui 2017-10-26 12:47:10 UTC
I'm closing this one since the changes have been submitted by Daniel.
Comment 76 Swamp Workflow Management 2017-12-22 17:10:48 UTC
SUSE-RU-2017:3412-1: An update that has 7 recommended fixes can now be installed.

Category: recommended (important)
Bug References: 1011554,1019938,1048551,1052840,1067279,1072424,960669
CVE References: 
Sources used:
SUSE Linux Enterprise Server 12-SP3 (src):    dracut-044.1-114.17.1
SUSE Linux Enterprise Desktop 12-SP3 (src):    dracut-044.1-114.17.1
SUSE Container as a Service Platform ALL (src):    dracut-044.1-114.17.1
Comment 77 Swamp Workflow Management 2017-12-22 17:12:20 UTC
SUSE-RU-2017:3413-1: An update that has 8 recommended fixes can now be installed.

Category: recommended (important)
Bug References: 1011554,1019938,1036323,1048551,1052840,1067279,1072424,960669
CVE References: 
Sources used:
SUSE Linux Enterprise Server for Raspberry Pi 12-SP2 (src):    dracut-044.1-109.26.1
SUSE Linux Enterprise Server 12-SP2 (src):    dracut-044.1-109.26.1
SUSE Linux Enterprise Desktop 12-SP2 (src):    dracut-044.1-109.26.1
OpenStack Cloud Magnum Orchestration 7 (src):    dracut-044.1-109.26.1
Comment 78 Swamp Workflow Management 2017-12-22 23:09:31 UTC
openSUSE-RU-2017:3429-1: An update that has 8 recommended fixes can now be installed.

Category: recommended (important)
Bug References: 1011554,1019938,1036323,1048551,1052840,1067279,1072424,960669
CVE References: 
Sources used:
openSUSE Leap 42.2 (src):    dracut-044.1-16.15.1
Comment 79 Swamp Workflow Management 2017-12-22 23:10:36 UTC
openSUSE-RU-2017:3430-1: An update that has 7 recommended fixes can now be installed.

Category: recommended (important)
Bug References: 1011554,1019938,1048551,1052840,1067279,1072424,960669
CVE References: 
Sources used:
openSUSE Leap 42.3 (src):    dracut-044.1-29.1