Create custom systemd-networkd-wait-online.service override to wait on individual interfaces. (LP: #2060311) #456
PPA for testing can be found in: https://launchpad.net/~slyon/+archive/ubuntu/lp2060311/+packages
6482ac5 to 6e37de0
This seems to work well. The new I did find a case though where the behavior changes: when you have the loopback interface in your yaml. Apparently
Yeah, either ignore it or we would have to set a different operational state range, e.g.
The new behavior matches what we have discussed. I did some testing with various combinations of optional: for eth0 and wlan0 and things work as expected in all cases.
While I'm not seeing issues specifically related to cloud-init, this doesn't seem to be working correctly, or I'm not understanding how it is supposed to work. After installing the PPA, if I reboot, regardless of what I have under
and nothing else.
@TheRealFalcon As long as you have any network definition in /etc/netplan/ that uses networkd as a renderer and does NOT use "optional: true" you should see the corresponding interface listed in that file after calling "netplan apply" (or reboot).
Does it matter that I'm in an LXD container?
It worked in an LXD container for me earlier today. But it could have some impact on the macaddress matching. Can you try with a different matching condition, like name or driver? Looks like it doesn't find the interface here. I need to double-check that.
@slyon, doh, sorry. This was after I tore down a previous container and forgot to update my network configuration. Once I corrected the MAC address, it worked for me.
Looks like cloud-init is still having issues on first boot. During our init-local timeframe, we write out netplan config, then call This can be simulated in a container by running After boot:
Another thing I've found is that
98d7f8a to 931c634
Thanks for all of your testing and comments! I put all fixes into follow-up commits, so it can be reviewed more easily. @daniloegea PTAL at the most recent commits.
@TheRealFalcon I addressed the
I was doing some testing based on your comments and I noticed that dhcp is not considered when setting the interface to degraded. So if I disable link-local and enable dhcp4, it will be set to carrier. Is that expected?
@@ -1460,29 +1474,41 @@ _netplan_netdef_write_networkd(
gboolean
_netplan_networkd_write_wait_online(const NetplanState* np_state, const char* rootdir)
{
    // Set of all current network interfaces, potentially non yet renamed
    GHashTable* system_interfaces = g_hash_table_new_full(g_str_hash, g_str_equal, g_free, NULL);
    _netplan_query_system_interfaces(system_interfaces);
I'm not certain here about the impact of ignoring devices not currently present in /sys/class/net during systemd generator timeframe. By only considering interfaces already present in /sys/class/net, are we exposed to scenarios where the device drivers for the specified device haven't loaded yet (in initramfs-less environments without the proper drivers compiled into the kernel)? This reminds me of this cloud-init issue canonical/cloud-init#4451 for minimal images. In the event that we don't have device drivers loaded (from initramfs or statically compiled into the kernel), is it possible that netplan's generator (netplan generate) may not see the device in /sys/class/net via if_nameindex, and systemd-networkd-wait-online will be disabled? Maybe in this case, we prescribe that initramfs/kernel must have the desired device drivers in order to expect systemd-networkd-wait-online to block properly.
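To make the concern above concrete, here is a minimal sketch of an interface presence check built on POSIX if_nameindex(). The function name iface_present is hypothetical and is not taken from Netplan's source; it only illustrates that an interface whose driver has not loaded yet is simply absent from the kernel's list at the moment of the call.

```c
#include <assert.h>
#include <net/if.h>
#include <string.h>

/*
 * Hypothetical helper (not Netplan's actual code): report whether an
 * interface with the given name is currently known to the kernel.
 * Returns 1 if present, 0 if absent, -1 if the list could not be read.
 */
int iface_present(const char *name)
{
    assert(name != NULL);

    /* if_nameindex() returns a heap-allocated array that is terminated
     * by an entry with if_index == 0 and if_name == NULL. */
    struct if_nameindex *list = if_nameindex();
    if (list == NULL)
        return -1;

    int found = 0;
    for (struct if_nameindex *it = list;
         it->if_index != 0 && it->if_name != NULL; it++) {
        if (strcmp(it->if_name, name) == 0) {
            found = 1;
            break;
        }
    }

    if_freenameindex(list);
    return found;
}
```

A device whose driver loads after this check would return 0 here, which is the race being described: the result reflects a single point in time, not the eventual state of the system.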
A similar (the same?) issue will happen with virtual devices. By the time the generator runs they will not be created yet so they will not be in the list.
James corrected me, this isn't systemd generator timeframe, but the point in time when netplan generate is called. For cloud-init init-local boot stage, I think we may still be exposed to this race where cloud-init local gets in and invokes netplan generate possibly before supplemental device drivers have loaded. So this may still be a concern.
Oh! That's a very interesting case. I just pushed more commits to not require virtual devices (bridges, bonds, dummys, ...) to be available in /sys/class/net just yet. Those will be created later.
For physical devices that are not yet loaded, I don't know. I don't think there's anything that can be done as part of this PR and such interfaces might be ignored by systemd-networkd-wait-online.
A longer-term solution might be splitting up Netplan into a systemd-generator and a systemd-service (running Before=network-pre.target, similar to systemd-network-generator.service), which we are tracking as spec "FO165 - Netplan generator architecture".
Do you think this should be a blocker?
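The selection rule discussed in this thread could be sketched roughly as follows. This is an illustration only: the struct netdef_hint type and the should_wait_for function are hypothetical names, and the logic is inferred from the comments above (skip optional interfaces, always keep virtual devices, keep physical devices only if they are already present), not copied from the PR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified view of one interface definition. */
struct netdef_hint {
    const char *name;
    bool optional;    /* "optional: true" in the YAML */
    bool is_virtual;  /* bridge, bond, dummy, ...: created later */
};

/* Linear search of the names currently listed by the system. */
static bool name_in_list(const char *name,
                         const char *const *present, size_t n_present)
{
    for (size_t i = 0; i < n_present; i++) {
        if (strcmp(present[i], name) == 0)
            return true;
    }
    return false;
}

/*
 * Decide whether wait-online should be asked to wait for this interface.
 * Assumption: physical devices that are missing right now are skipped,
 * while virtual devices are trusted to appear later.
 */
bool should_wait_for(const struct netdef_hint *nd,
                     const char *const *present, size_t n_present)
{
    assert(nd != NULL && nd->name != NULL);

    if (nd->optional)
        return false;
    if (nd->is_virtual)
        return true;
    return name_in_list(nd->name, present, n_present);
}
```

With this rule, a physical NIC whose driver loads late is silently dropped from the wait list, which is the remaining gap raised in the review.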
I think this retains the same gap/issue cloud-init and systemd-networkd-wait-online are already exposed to, so I don't think this needs to be fixed in this PR. The stance we've already taken for the Ubuntu minimal case was to recompile the kernel to include the module built-ins that are required to support virtio-net, to ensure they are loaded in time for network generation and network-online for "initrdless" boots. It feels "ok" to state that if you have a critical device driver needed in early boot for your cloud image and you are booting without initrd, you would either need to statically compile the necessary device drivers into that kernel, or use initrd and provide any supplemental drivers needed earlier in boot. This may just be something we take into consideration for future behavior though, and thinking of how to approach this type of concern later in FO165 probably makes the most sense at the moment.
Added more commits fixing (uploaded as
Thanks, Lukas. The cases I've found seem to be fixed now.
Added a tiny fix to recognize that we do not configure link-local IPs on bridge/bond members. (and of course, a codestyle typo as a follow-up ;-))
…aces only
Skip s-n-wait-online if we don't have any non-optional interfaces, using a "ConditionPathIsSymbolicLink=" check on Netplan's s-n-wait-online.service enablement symlink. This is used in preference to RequiredForOnline=yes, as upstream (pure) systemd-networkd-wait-online.service is not meant to be used in this way. With "RequiredForOnline=no", sd-networkd-wait-online will fully ignore the corresponding interface, and it will block/delay network-online.target if no interfaces are "RequiredForOnline=yes" at all. FR-7246
…nal state for interfaces without IP configuration
Description
Skip s-n-wait-online if we don't have any non-optional interfaces, using a "ConditionPathIsSymbolicLink=" check on Netplan's s-n-wait-online.service enablement symlink.
This is used in preference to RequiredForOnline=yes, as upstream (pure) systemd-networkd-wait-online.service is not meant to be used in this way.
With "RequiredForOnline=no", sd-networkd-wait-online will fully ignore the corresponding interface, and it will block/delay network-online.target if no interfaces are "RequiredForOnline=yes" at all.
FR-7246
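As a rough illustration of the per-interface approach, the sketch below builds the "-i NAME:OPERSTATE" arguments that systemd-networkd-wait-online accepts, one per non-optional interface. The struct iface_cfg type and the build_wait_online_args function are hypothetical, and the choice of "degraded" when link-local addressing is enabled versus "carrier" otherwise is an assumption inferred from the review discussion, not a statement of what the final code does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified view of one interface definition. */
struct iface_cfg {
    const char *name;
    bool optional;    /* "optional: true" in the YAML */
    bool link_local;  /* a link-local address will be configured */
};

/*
 * Write "-i NAME:OPERSTATE" arguments for every non-optional interface
 * into buf, separated by single spaces.
 * Returns the number of interfaces to wait for, or -1 if buf is too small.
 * A return value of 0 means there is nothing to wait for, i.e. the case
 * in which the wait-online service would be skipped entirely.
 */
int build_wait_online_args(const struct iface_cfg *ifs, size_t n,
                           char *buf, size_t buflen)
{
    assert(buf != NULL && buflen > 0);

    size_t used = 0;
    int count = 0;
    buf[0] = '\0';

    for (size_t i = 0; i < n; i++) {
        if (ifs[i].optional)
            continue;

        /* Assumed mapping: without any IP configuration an interface
         * can only ever reach "carrier", never "degraded". */
        const char *state = ifs[i].link_local ? "degraded" : "carrier";

        int w = snprintf(buf + used, buflen - used, "%s-i %s:%s",
                         count > 0 ? " " : "", ifs[i].name, state);
        if (w < 0 || (size_t)w >= buflen - used)
            return -1;

        used += (size_t)w;
        count++;
    }
    return count;
}
```

A caller could use the returned count to decide whether to create the enablement symlink at all, which mirrors the "ConditionPathIsSymbolicLink=" idea in the description.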
Note: This is a replacement for #455 that keeps compatibility with cloud-init, so it can still sort After=systemd-networkd-wait-online.service AND Before=network-online.target.
Checklist
make check successfully.
make check-coverage).