From bce67bbee3, systemd-networkd always shows
```
rtnl: received address with invalid family type 32, ignoring.
```
during boot-up. In the code, there are log_warning() and log_debug() for the
same situation, and the log_debug() is never called. So, let's lower the
log level and remove never called function.
Routing Policy rule manipulates rules in the routing policy database control the
route selection algorithm.
This work supports to configure Rule
```
[RoutingPolicyRule]
TypeOfService=0x08
Table=7
From= 192.168.100.18
```
```
ip rule show
0: from all lookup local
0: from 192.168.100.18 tos 0x08 lookup 7
```
V2 changes:
1. Added logic to handle duplicate rules.
2. If rules are changed or deleted and networkd restarted
then those are deleted when networkd restarts next time
V3:
1. Add parse_fwmark_fwmask
As a follow-up for db3f45e2d2 let's do the
same for all other cases where we create a FILE* with local scope and
know that no other threads hence can have access to it.
For most cases this shouldn't change much really, but this should speed
dbus introspection and calender time formatting up a bit.
This adds a modified version of dhcp6_option_parse_domainname() that is
able to parse compressed domain names, borrowing the idea from
dns_packet_read_name(). It also adds pieces in networkd-link and
networkd-manager to properly save/load the added option field.
Resolves#2710.
This will allow us to have several managers sharing an event loop
and running in parallel, as if they were running in separate processes.
The long term-aim is to allow networkd to be split into separate
processes, so restructure the code to make this simpler.
For now we drop the exit-on-idle logic, as this was anyway severely
restricted at the moment. Once split, we will revisit this as it may
then make more sense again.
If setting the received timezone or transient hostname fails because D-Bus is
not (yet) up, store the data in the Manager object and try again after
connecting to D-Bus.
DNS servers must be specified as IP addresses, hence let's store them as that
internally, so that they are guaranteed to be fully normalized always, and
invalid data cannot be stored.
It is just an alias for route_free which requires that route is not null,
but it was only used in one place where it was checked that route is not
null anyway. Let's just call route_free instead.
Separate fields are replaced with a struct.
Second second duid type field is removed. The first field was used to carry
the result of DUIDType= configuration, and the second was either a copy of
this, or contained the type extracted from DuidRawData. The semantics are changed
so that the type specified in DUIDType is always used. DUIDRawData= no longer
overrides the type setting.
The networkd code is now more constrained than the sd-dhcp code:
DUIDRawData cannot have 0 length, length 0 is treated the same as unsetting.
Likewise, it is not possible to set a DUIDType=0. If it ever becomes necessary
to set type=0 or a zero-length duid, the code can be changed to support that.
Nevertheless, I think that's unlikely.
This addresses #3127 § 1 and 3.
v2:
- rename DUID.duid, DUID.duid_len to DUID.raw_data, DUID.raw_data_len
Both versions of the code are changed to allow the caller to override
DUID using simple rules: duid type and value may be specified, in
which case the caller is responsible to providing the contents,
or just duid type may be specified as DUID_TYPE_EN, in which case we
we fill in the values. In the future more support for other types may
be added, e.g. DUID_TYPE_LLT.
There still remains and ugly discrepancy between dhcp4 and dhcp6 code:
dhcp6 has sd_dhcp6_client_set_duid and sd_dhcp6_client_set_iaid and
requires client->state to be DHCP6_STATE_STOPPED, while dhcp4 has
sd_dhcp_client_set_iaid_duid and will reconfigure the client if it
is not stopped. This commit doesn't touch that part.
This addresses #3127 § 2.
This patch makes networkd stay around as long as there is more than just a
loopback interface around, or the loopback device isn't fully probed yet, or
the loopback device has a .network file attached.
In essence, this means networkd stays around now continously as it should,
unless it is running in some (container?) environment that really has no
interface except a loopback device.
Fixes#2577.
The call combines outputing a string with prefixing it with a space, optionally. This is useful to shorten the logic
for outputing lists of strings, that are space separated.
This changes the UseDomains= setting of .network files to take an optional third value "route", in addition to the
boolean values. If set, the passed domain information is used for routing rules only, but not for the search path
logic.
All booleans called dhcp_xyz are now called ".dhcp_use_xyz", to match their respective configuration file settings. This
should clarify things a bit, in particular as there is a DHCP hostname that was previously called just ".hostname"
because ".dhcp_hostname" was already existing as a bool. Since this confusion is removed now because the bool is called
".dhcp_use_hostname", the string field is now renamed to ".dhcp_hostname".
When we collect the domain names of the various links and other sources in one ordered set, make sure to use proper DNS
name comparison to filter out duplicates.
For the search domain logic the order is highly relevant, hence make sure when collecting the various search domains to
add them to an ordered set, so that the order between search domains of a specific link is retained.
Previously, .network files only knew a vaguely defined "Domains=" concept, for which the documentation declared it was
the "DNS domain" for the network connection, without specifying what that means.
With this the Domains setting is reworked, so that there are now "routing" domains and "search" domains. The former are
to be used by resolved to route DNS request to specific network interfaces, the latter is to be used for searching
single-label hostnames with (in addition to being used for routing). Both settings are configured in the "Domains="
setting. Normal domain names listed in it are now considered search domains (for compatibility with existing setups),
while those prefixed with "~" are considered routing domains only. To route all lookups to a specific interface the
routing domain "." may be used, referring to the root domain. An alternative syntax for this is the "*", as was already
implemented before using the "wildcard" domain concept.
This commit adds proper parsers for this new logic, and exposes this via the sd-network API. This information is not
used by resolved yet, this will be added in a later commit.
GLIB has recently started to officially support the gcc cleanup
attribute in its public API, hence let's do the same for our APIs.
With this patch we'll define an xyz_unrefp() call for each public
xyz_unref() call, to make it easy to use inside a
__attribute__((cleanup())) expression. Then, all code is ported over to
make use of this.
The new calls are also documented in the man pages, with examples how to
use them (well, I only added docs where the _unref() call itself already
had docs, and the examples, only cover sd_bus_unrefp() and
sd_event_unrefp()).
This also renames sd_lldp_free() to sd_lldp_unref(), since that's how we
tend to call our destructors these days.
Note that this defines no public macro that wraps gcc's attribute and
makes it easier to use. While I think it's our duty in the library to
make our stuff easy to use, I figure it's not our duty to make gcc's own
features easy to use on its own. Most likely, client code which wants to
make use of this should define its own:
#define _cleanup_(function) __attribute__((cleanup(function)))
Or similar, to make the gcc feature easier to use.
Making this logic public has the benefit that we can remove three header
files whose only purpose was to define these functions internally.
See #2008.
The previous behavior:
When DHCPv6 was enabled, router discover was performed first, and then DHCPv6 was
enabled only if the relevant flags were passed in the Router Advertisement message.
Moreover, router discovery was performed even if AcceptRouterAdvertisements=false,
moreover, even if router advertisements were accepted (by the kernel) the flags
indicating that DHCPv6 should be performed were ignored.
New behavior:
If RouterAdvertisements are accepted, and either no routers are found, or an
advertisement is received indicating DHCPv6 should be performed, the DHCPv6
client is started. Moreover, the DHCP option now truly enables the DHCPv6
client regardless of router discovery (though it will probably not be
very useful to get a lease withotu any routes, this seems the more consistent
approach).
The recommended default setting should be to set DHCP=ipv4 and to leave
IPv6AcceptRouterAdvertisements unset.
There are more than enough calls doing string manipulations to deserve
its own files, hence do something about it.
This patch also sorts the #include blocks of all files that needed to be
updated, according to the sorting suggestions from CODING_STYLE. Since
pretty much every file needs our string manipulation functions this
effectively means that most files have sorted #include blocks now.
Also touches a few unrelated include files.
We only keep the addresses that we added ourselves in link->addresses, and
introduce a new set link->addresses_foreign to keep addresses of unknown
origin.
Only functional change is that "foreign" addresses no longer prevent a link
from entering "configured" state.
Introduce a proper enum, and don't pass around string ids anymore. This
simplifies things quite a bit, and makes virtualization detection more
similar to architecture detection.
Commit 0339cd770 changed libsystemd-network's error code for missing DHCP lease
data from ENOENT to ENODATA. Adjust networkd accordingly.
This fixes interfaces being stuck in "degraded/configuring" mode forever.
https://github.com/systemd/systemd/issues/1147
When handing out DHCP leases, try to propagate DNS/NTP server
information from "uplink". The "uplink" is automatically determined as
the network interface with the highest priority default route on it.
Some places invoked fflush() directly with their own manual error
checking, let's unify all that by using fflush_and_check().
This also unifies the general error paths of fflush()+rename() file
writers.
In 5a8bcb674f, IPForwarding was introduced
to set forwarding flags on interfaces in .network files. networkd sets
forwarding options regardless of the previous setting, even if it was
set by e.g. sysctl. This commit creates a new option for IPForwarding,
"kernel", that preserves the sysctl settings rather than always setting
them.
See https://bugs.freedesktop.org/show_bug.cgi?id=89509 for the initial
bug report.
This should simplify the prototype a bit. The bus parameter is redundant
in most cases, and in the few where it matters it can be derived from
the message via sd_bus_message_get_bus().
This patch removes includes that are not used. The removals were found with
include-what-you-use which checks if any of the symbols from a header is
in use.
We will be woken up on rtnl or dbus activity, so let's just quit if some time has passed and that is the only thing that can happen.
Note that we will always stay around if we expect network activity (e.g. DHCP is enabled), as we are not restarted on that.
Only the very basics, more to come.
For now:
$ busctl tree org.freedesktop.network1
└─/org/freedesktop/network1
└─/org/freedesktop/network1/link
├─/org/freedesktop/network1/link/1
├─/org/freedesktop/network1/link/2
├─/org/freedesktop/network1/link/3
├─/org/freedesktop/network1/link/4
├─/org/freedesktop/network1/link/5
├─/org/freedesktop/network1/link/6
├─/org/freedesktop/network1/link/7
├─/org/freedesktop/network1/link/8
└─/org/freedesktop/network1/link/9
$ busctl introspect org.freedesktop.network1 /org/freedesktop/network1
NAME TYPE SIGNATURE RESULT/VALUE FLAGS
org.freedesktop.network1.Manager interface - - -
.OperationalState property s "carrier" emits-change
$ busctl introspect org.freedesktop.network1 /org/freedesktop/network1/link/1
NAME TYPE SIGNATURE RESULT/VALUE FLAGS
org.freedesktop.network1.Link interface - - -
.AdministrativeState property s "unmanaged" emits-change
.OperationalState property s "carrier" emits-change
If we get a NEWLINK + NEWADDR between enumerating the links and enumerating the addresses, we
would get a warning that the link corresponding to the address does not exist. This is a false
warning as both the NEWLINK and NEWADDR would be processed after enumerating completed, so drop
it.
This introduces am AddressFamilyBoolean type that works more or less
like a booleaan, but can optionally turn on/off things for ipv4 and ipv6
independently. THis also ports the DHCP field over to it.
Using:
find . -name '*.[ch]' | while read f; do perl -i.mmm -e \
'local $/;
local $_=<>;
s/(if\s*\([^\n]+\))\s*{\n(\s*)(log_[a-z_]*_errno\(\s*([->a-zA-Z_]+)\s*,[^;]+);\s*return\s+\g4;\s+}/\1\n\2return \3;/msg;
print;'
$f
done
And a couple of manual whitespace fixups.
As a followup to 086891e5c1 "log: add an "error" parameter to all
low-level logging calls and intrdouce log_error_errno() as log calls
that take error numbers", use sed to convert the simple cases to use
the new macros:
find . -name '*.[ch]' | xargs sed -r -i -e \
's/log_(debug|info|notice|warning|error|emergency)\("(.*)%s"(.*), strerror\(-([a-zA-Z_]+)\)\);/log_\1_errno(-\4, "\2%m"\3);/'
Multi-line log_*() invocations are not covered.
And we also should add log_unit_*_errno().
We got the following error when running systemd on a device with many ports:
"rtnl: kernel receive buffer overrun
Event source 'rtnl-receive-message' returned error, disabling: No buffer space
available"
I think the kernel socket receive buffer queue should be increased. The default
value is taken from:
"/proc/sys/net/core/rmem_default", but we can overwrite it using SO_RCVBUF
socket option.
This is already done in networkd for other sockets.
For example, the bus socket (sd-bus/bus-socket.c) has a receive queue of 8MB.
In our case, the default is 208KB.
Increasing the buffer receive queue for manager socket to 512KB should be enough
to get rid of the above error.
[tomegun: bump the limit even higher to 8M]
It is redundant to store 'hash' and 'compare' function pointers in
struct Hashmap separately. The functions always comprise a pair.
Store a single pointer to struct hash_ops instead.
systemd keeps hundreds of hashmaps, so this saves a little bit of
memory.
Let's settle on a single type for all address family values, even if
UNIX is very inconsitent on the precise type otherwise. Given that
socket() is the primary entrypoint for the sockets API, and that uses
"int", and "int" is relatively simple and generic, we settle on "int"
for this.
We failed to take a ref when waiting for udev synchronization. Fix that and also
make unreffing in callbacks simpler throughout by using _cleanup_ macros.
Fixes <https://bugs.freedesktop.org/show_bug.cgi?id=80556>.
When an address is configured to be all zeroes, networkd will now
automatically find a locally unused network of the right size from a
list of pre-configured pools. Currently those pools are 10.0.0.0/8,
172.16.0.0/12, 192.168.0.0/16 and fc00::/7, i.e. the network ranges for
private networks. They are compiled in, but should be configurable
eventually.
This allows applying the same configuration to a large number of
interfaces with each time a different IP range block, and management of
these IP ranges is fully automatic.
When allocating an address range from the pool it is made sure the range
is not used otherwise.
Configuration will be in
root:root /run/systemd/network
and state will be in
systemd-network:systemd-network /run/systemd/netif
This matches what we do for logind's seat/session state.
Rely on modules being built-in or autoloaded on-demand.
As networkd is a network facing service, we want to limits its capabilities,
as much as possible. Also, we may not have CAP_SYS_MODULE in a container,
and we want networkd to work the same there.
Module autoloading does not always work, but should be fixed by the kernel
patch f98f89a0104454f35a: 'net: tunnels - enable module autoloading', which
is currently in net-next and which people may consider backporting if they
want tunneling support without compiling in the modules.
Early adopters may also use a module-load.d snippet and order
systemd-modules-load.service before networkd to force the module
loading of tunneling modules.
This sholud fix the various build issues people have reported.
The bitmask is deprecated in the kernel, so move to the new interface. At the moment
this does not make a difference for us, but it avoids having to change the API in the future.
This essentially swaps the roles of rtnl and udev in networkd. After this
change libudev is only used for waiting for udev to initialize devices and
to get udev-specific information needed for some [Match] attributes.
This in particular simplifies the code in containers where udev is not really
useful, but also simplifies things and reduces round-trips in the non-container
case.
Pass the mac address on to ipv4ll and dhcp clients so they always have
up-to-date information, and may react appropriately to the change.
Also drop setting the mac address from uevent, and only log when the
address actually changes.
Previously the returned object of constructor functions where sometimes
returned as last, sometimes as first and sometimes as second parameter.
Let's clean this up a bit. Here are the new rules:
1. The object the new object is derived from is put first, if there is any
2. The object we are creating will be returned in the next arguments
3. This is followed by any additional arguments
Rationale:
For functions that operate on an object we always put that object first.
Constructors should probably not be too different in this regard. Also,
if the additional parameters might want to use varargs which suggests to
put them last.
Note that this new scheme only applies to constructor functions, not to
all other functions. We do give a lot of freedom for those.
Note that this commit only changes the order of the new functions we
added, for old ones we accept the wrong order and leave it like that.
Don't set set **ret when returning r < 0, as matching on the errno may easily
give false positives in the future leading to null pointer dereference.
Reported-by: David Herrmann <dh.herrmann@gmail.com>
Udev does not run in containers, so instead of relying on it to tell us when a
network device is ready to be used by networkd, we simply assume that any
device was fully initialized before being added to the container.
This allows us users of the library to keep copies of old leases. This is
used by networkd to know what addresses to drop (if any) when the lease
expires.
In the future this may be used by DNAv4 and sd-dhcp-server.
When creating a new link, the kernel will not inform us about the new ifindex
in its ack. We have to listen for newly created devices and deduce the new
ifindex by matching on the ifname.
We used to do this by waiting for a new device from libudev, but that is asking
for trouble, as udev will happily rename the device before handing it to us.
Listen on rtnl instead, the chance of the name being changed before reaching us
is much smaller (if not nil).
Kernel patch in the works to make this unneccessary.
We may not have a dbus daemon in the initrd (until we can rely on kdbus). In
this case, simply ignore any attempts at using the bus. There is only one user
for now, but surely more to come.
In order to work reliably in the real root without kdbus, but at the same time
don't delay boot when kdbus is in use, order ourselves after dbus.service.
This adds support to generate a basic resolv.conf in /run/systemd/network.
This file will not take any effect unless a symlink is created from
/etc/resolv.conf.
Nameservers received over DHCP takes precedence over statically configured ones.
Note: /etc/resolv.conf is severely limited, so in the future we will likely
rather provide a much more powerfull nss plugin (or something to that effect),
but this should allow current users to function without any loss of
functionality.
Adds a new call sd_event_set_watchdog() that can be used to hook up the
event loop with the watchdog supervision logic of systemd. If enabled
and $WATCHDOG_USEC is set the event loop will ping the invoking systemd
daemon right after coming back from epoll_wait() but not more often than
$WATCHDOG_USEC/4. The epoll_wait() will sleep no longer than
$WATCHDOG_USEC/4*3, to make sure the service manager is called in time.
This means that setting WatchdogSec= in a .service file and calling
sd_event_set_watchdog() in your daemon is enough to hook it up with the
watchdog logic.
This listens to rtnetlink for changes to IFF_UP and IFF_LOWER_UP (link sense). The latter
is simply logged at the moment, but will be useful once we add dhcp support.
A bridge is specified in a .netdev file with a section [Bridge]
and at least the entry Name=.
A link may be joined to a bridge if the .network applied to it has
a Bridge= entry giving the name of the bridge in its [Network] section.
We eagerly create all bridges on startup, and links are added to
bridges as soon as they both appear.
This removed the requirement for devices to be tagged with
'systemd-networkd' before they will be visible to networkd.
Still, as by default we don't ship any .network files, network
devices will simply be tracked, but not touched, unless the
admin configures things explicitly.
Try to emphasize a bit that there should be a mapping between event
loops and threads, hence introduce a logic that there's one "default"
event loop for each thread, that can be queried via
"sd_event_default()".
This daemon listens for and configures network devices tagged with
'systemd-networkd'. By default, no devices are tagged so this daemon
can safely run in parallel with existing network daemons/scripts.
Networks are configured in /etc/systemd/network/*.network. The first .network
file that matches a given link is applied. The matching logic is similar to
the one for .link files, but additionally supports matching on interface name.
The mid-term aim is to provide an alternative to ad-hoc scripts currently used
in initrd's and for wired setups that don't change much (e.g., as seen on
servers/and some embedded systems).
Currently, static addresses and a gateway can be configured.
Example .network file:
[Match]
Name=wlp2s0
[Network]
Description=My Network
Gateway=192.168.1.1
Address=192.168.1.23/24
Address=fe80::9aee:94ff:fe3f:c618/64