Introduce a proper enum, and don't pass around string ids anymore. This
simplifies things quite a bit, and makes virtualization detection more
similar to architecture detection.
Commit 0339cd770 changed libsystemd-network's error code for missing DHCP lease
data from ENOENT to ENODATA. Adjust networkd accordingly.
This fixes interfaces being stuck in "degraded/configuring" mode forever.
https://github.com/systemd/systemd/issues/1147
When handing out DHCP leases, try to propagate DNS/NTP server
information from "uplink". The "uplink" is automatically determined as
the network interface with the highest priority default route on it.
Some places invoked fflush() directly with their own manual error
checking, let's unify all that by using fflush_and_check().
This also unifies the general error paths of fflush()+rename() file
writers.
In 5a8bcb674f, IPForwarding was introduced
to set forwarding flags on interfaces in .network files. networkd sets
forwarding options regardless of the previous setting, even if it was
set by e.g. sysctl. This commit creates a new option for IPForwarding,
"kernel", that preserves the sysctl settings rather than always setting
them.
See https://bugs.freedesktop.org/show_bug.cgi?id=89509 for the initial
bug report.
This should simplify the prototype a bit. The bus parameter is redundant
in most cases, and in the few where it matters it can be derived from
the message via sd_bus_message_get_bus().
This patch removes includes that are not used. The removals were found with
include-what-you-use which checks if any of the symbols from a header is
in use.
We will be woken up on rtnl or dbus activity, so let's just quit if some time has passed and that is the only thing that can happen.
Note that we will always stay around if we expect network activity (e.g. DHCP is enabled), as we are not restarted on that.
Only the very basics, more to come.
For now:
$ busctl tree org.freedesktop.network1
└─/org/freedesktop/network1
└─/org/freedesktop/network1/link
├─/org/freedesktop/network1/link/1
├─/org/freedesktop/network1/link/2
├─/org/freedesktop/network1/link/3
├─/org/freedesktop/network1/link/4
├─/org/freedesktop/network1/link/5
├─/org/freedesktop/network1/link/6
├─/org/freedesktop/network1/link/7
├─/org/freedesktop/network1/link/8
└─/org/freedesktop/network1/link/9
$ busctl introspect org.freedesktop.network1 /org/freedesktop/network1
NAME TYPE SIGNATURE RESULT/VALUE FLAGS
org.freedesktop.network1.Manager interface - - -
.OperationalState property s "carrier" emits-change
$ busctl introspect org.freedesktop.network1 /org/freedesktop/network1/link/1
NAME TYPE SIGNATURE RESULT/VALUE FLAGS
org.freedesktop.network1.Link interface - - -
.AdministrativeState property s "unmanaged" emits-change
.OperationalState property s "carrier" emits-change
If we get a NEWLINK + NEWADDR between enumerating the links and enumerating the addresses, we
would get a warning that the link corresponding to the address does not exist. This is a false
warning as both the NEWLINK and NEWADDR would be processed after enumerating completed, so drop
it.
This introduces am AddressFamilyBoolean type that works more or less
like a booleaan, but can optionally turn on/off things for ipv4 and ipv6
independently. THis also ports the DHCP field over to it.
Using:
find . -name '*.[ch]' | while read f; do perl -i.mmm -e \
'local $/;
local $_=<>;
s/(if\s*\([^\n]+\))\s*{\n(\s*)(log_[a-z_]*_errno\(\s*([->a-zA-Z_]+)\s*,[^;]+);\s*return\s+\g4;\s+}/\1\n\2return \3;/msg;
print;'
$f
done
And a couple of manual whitespace fixups.
As a followup to 086891e5c1 "log: add an "error" parameter to all
low-level logging calls and intrdouce log_error_errno() as log calls
that take error numbers", use sed to convert the simple cases to use
the new macros:
find . -name '*.[ch]' | xargs sed -r -i -e \
's/log_(debug|info|notice|warning|error|emergency)\("(.*)%s"(.*), strerror\(-([a-zA-Z_]+)\)\);/log_\1_errno(-\4, "\2%m"\3);/'
Multi-line log_*() invocations are not covered.
And we also should add log_unit_*_errno().
We got the following error when running systemd on a device with many ports:
"rtnl: kernel receive buffer overrun
Event source 'rtnl-receive-message' returned error, disabling: No buffer space
available"
I think the kernel socket receive buffer queue should be increased. The default
value is taken from:
"/proc/sys/net/core/rmem_default", but we can overwrite it using SO_RCVBUF
socket option.
This is already done in networkd for other sockets.
For example, the bus socket (sd-bus/bus-socket.c) has a receive queue of 8MB.
In our case, the default is 208KB.
Increasing the buffer receive queue for manager socket to 512KB should be enough
to get rid of the above error.
[tomegun: bump the limit even higher to 8M]
It is redundant to store 'hash' and 'compare' function pointers in
struct Hashmap separately. The functions always comprise a pair.
Store a single pointer to struct hash_ops instead.
systemd keeps hundreds of hashmaps, so this saves a little bit of
memory.
Let's settle on a single type for all address family values, even if
UNIX is very inconsitent on the precise type otherwise. Given that
socket() is the primary entrypoint for the sockets API, and that uses
"int", and "int" is relatively simple and generic, we settle on "int"
for this.
We failed to take a ref when waiting for udev synchronization. Fix that and also
make unreffing in callbacks simpler throughout by using _cleanup_ macros.
Fixes <https://bugs.freedesktop.org/show_bug.cgi?id=80556>.
When an address is configured to be all zeroes, networkd will now
automatically find a locally unused network of the right size from a
list of pre-configured pools. Currently those pools are 10.0.0.0/8,
172.16.0.0/12, 192.168.0.0/16 and fc00::/7, i.e. the network ranges for
private networks. They are compiled in, but should be configurable
eventually.
This allows applying the same configuration to a large number of
interfaces with each time a different IP range block, and management of
these IP ranges is fully automatic.
When allocating an address range from the pool it is made sure the range
is not used otherwise.
Configuration will be in
root:root /run/systemd/network
and state will be in
systemd-network:systemd-network /run/systemd/netif
This matches what we do for logind's seat/session state.
Rely on modules being built-in or autoloaded on-demand.
As networkd is a network facing service, we want to limits its capabilities,
as much as possible. Also, we may not have CAP_SYS_MODULE in a container,
and we want networkd to work the same there.
Module autoloading does not always work, but should be fixed by the kernel
patch f98f89a0104454f35a: 'net: tunnels - enable module autoloading', which
is currently in net-next and which people may consider backporting if they
want tunneling support without compiling in the modules.
Early adopters may also use a module-load.d snippet and order
systemd-modules-load.service before networkd to force the module
loading of tunneling modules.
This sholud fix the various build issues people have reported.
The bitmask is deprecated in the kernel, so move to the new interface. At the moment
this does not make a difference for us, but it avoids having to change the API in the future.
This essentially swaps the roles of rtnl and udev in networkd. After this
change libudev is only used for waiting for udev to initialize devices and
to get udev-specific information needed for some [Match] attributes.
This in particular simplifies the code in containers where udev is not really
useful, but also simplifies things and reduces round-trips in the non-container
case.
Pass the mac address on to ipv4ll and dhcp clients so they always have
up-to-date information, and may react appropriately to the change.
Also drop setting the mac address from uevent, and only log when the
address actually changes.
Previously the returned object of constructor functions where sometimes
returned as last, sometimes as first and sometimes as second parameter.
Let's clean this up a bit. Here are the new rules:
1. The object the new object is derived from is put first, if there is any
2. The object we are creating will be returned in the next arguments
3. This is followed by any additional arguments
Rationale:
For functions that operate on an object we always put that object first.
Constructors should probably not be too different in this regard. Also,
if the additional parameters might want to use varargs which suggests to
put them last.
Note that this new scheme only applies to constructor functions, not to
all other functions. We do give a lot of freedom for those.
Note that this commit only changes the order of the new functions we
added, for old ones we accept the wrong order and leave it like that.
Don't set set **ret when returning r < 0, as matching on the errno may easily
give false positives in the future leading to null pointer dereference.
Reported-by: David Herrmann <dh.herrmann@gmail.com>
Udev does not run in containers, so instead of relying on it to tell us when a
network device is ready to be used by networkd, we simply assume that any
device was fully initialized before being added to the container.
This allows us users of the library to keep copies of old leases. This is
used by networkd to know what addresses to drop (if any) when the lease
expires.
In the future this may be used by DNAv4 and sd-dhcp-server.
When creating a new link, the kernel will not inform us about the new ifindex
in its ack. We have to listen for newly created devices and deduce the new
ifindex by matching on the ifname.
We used to do this by waiting for a new device from libudev, but that is asking
for trouble, as udev will happily rename the device before handing it to us.
Listen on rtnl instead, the chance of the name being changed before reaching us
is much smaller (if not nil).
Kernel patch in the works to make this unneccessary.
We may not have a dbus daemon in the initrd (until we can rely on kdbus). In
this case, simply ignore any attempts at using the bus. There is only one user
for now, but surely more to come.
In order to work reliably in the real root without kdbus, but at the same time
don't delay boot when kdbus is in use, order ourselves after dbus.service.
This adds support to generate a basic resolv.conf in /run/systemd/network.
This file will not take any effect unless a symlink is created from
/etc/resolv.conf.
Nameservers received over DHCP takes precedence over statically configured ones.
Note: /etc/resolv.conf is severely limited, so in the future we will likely
rather provide a much more powerfull nss plugin (or something to that effect),
but this should allow current users to function without any loss of
functionality.