This reverts commit d4e9e574ea.
(systemd.conf.m4 part was already reverted in 5b5d82615011b9827466b7cd5756da35627a1608.)
Together those reverts should "fix" #10025 and #10011. ("fix" is in quotes
because this doesn't really fix the underlying issue, which is combining
DynamicUser= with strict container sandbox, but it avoids the problem by not
using that feature in our default installation.)
Dynamic users don't work well if the service requires matching configuration in
other places, for example dbus policy. This is true for those three services.
In effect, distros create the user statically [1, 2]. Dynamic users make more
sense for "add-on" services where not creating the user, or more precisely,
creating the user lazily, can save resources. For "basic" services, if we are
going to create the user on package installation anyway, setting DynamicUser=
just creates unneeded confusion. The only case where it is actually used is
when somebody forgets to do system configuration. But it's better to have the
service fail cleanly in this case too. If we want to turn on some side-effect
of DynamicUser=yes for those services, we should just do that directly through
fine-grained options. By not using DynamicUser= we also avoid the need to
restart dbus.
[1] bd9bf30727
[2] 48ac1cebde/f/systemd.spec (_473)
(Fedora does not create systemd-timesync user.)
In order to shut down networkd properly, the delegated routes added
need to be removed properly, and as error reporting is wanted, the
network link is needed in the debug output.
Solve this by calling manager_dhcp6_prefix_remove_all(), which will
remove each prefix stored in the Manager structure, and while doing
that reference each link so that it isn't freed before the route
removal callback is called. This in turn causes the network link to
be referenced once more, and an explicit hashmap_remove() must be
called to remove the network link from the m->links hashmap.
Also, since the registered callback is not called when the DHCPv6
client is stopped with sd_dhcp6_client_stop(), an explicit call
to dhcp6_lease_pd_prefix_lost() needs to be made to clean up any
unreachable routes set up for the delegated prefixes.
When networkd has not connected and setting hostname/timezone is
requested, the operation is delayed, not canceled. So, logging in
debug level is sufficient for the corresponding log message.
Closes#9699.
This part of the copyright blurb stems from the GPL use recommendations:
https://www.gnu.org/licenses/gpl-howto.en.html
The concept appears to originate in times where version control was per
file, instead of per tree, and was a way to glue the files together.
Ultimately, we nowadays don't live in that world anymore, and this
information is entirely useless anyway, as people are very welcome to
copy these files into any projects they like, and they shouldn't have to
change bits that are part of our copyright header for that.
hence, let's just get rid of this old cruft, and shorten our codebase a
bit.
This drops a good number of type-specific _cleanup_ macros, and patches
all users to just use the generic ones.
In most recent code we abstained from defining type-specific macros, and
this basically removes all those added already, with the exception of
the really low-level ones.
Having explicit macros for this is not too useful, as the expression
without the extra macro is generally just 2ch wider. We should generally
emphesize generic code, unless there are really good reasons for
specific code, hence let's follow this in this case too.
Note that _cleanup_free_ and similar really low-level, libc'ish, Linux
API'ish macros continue to be defined, only the really high-level OO
ones are dropped. From now on this should really be the rule: for really
low-level stuff, such as memory allocation, fd handling and so one, go
ahead and define explicit per-type macros, but for high-level, specific
program code, just use the generic _cleanup_() macro directly, in order
to keep things simple and as readable as possible for the uninitiated.
Note that before this patch some of the APIs (notable libudev ones) were
already used with the high-level macros at some places and with the
generic _cleanup_ macro at others. With this patch we hence unify on the
latter.
Files which are installed as-is (any .service and other unit files, .conf
files, .policy files, etc), are left as is. My assumption is that SPDX
identifiers are not yet that well known, so it's better to retain the
extended header to avoid any doubt.
I also kept any copyright lines. We can probably remove them, but it'd nice to
obtain explicit acks from all involved authors before doing that.
Found by the following warning by gcc.
```
../src/network/networkd-manager.c: In function 'dhcp6_prefixes_compare_func':
../src/network/networkd-manager.c:1383:16: warning: 'memcmp' reading 16 bytes from a region of size 8 [-Wstringop-overflow=]
return memcmp(&a, &b, sizeof(*a));
^
```
This also adds the ability to incorporate arrays into netlink messages
and to determine when a netlink message is too big, used by some generic
netlink protocols.
The changes both networkd and resolved to make use of the watch_bind
feature of sd-bus to connect to the system bus. This way, both daemons
can be started during early boot, and automatically and instantly
connect to the system bus as it becomes available.
This replaces prior code that used a time-based retry logic to connect
to the bus.
Let's remove a number of synchronization points from our service
startups: let's drop synchronous match installation, and let's opt for
asynchronous instead.
Also, let's use sd_bus_match_signal() instead of sd_bus_add_match()
where we can.
Add a hashmap to the Manager struct that stores the association
between an IPv6 prefix and the network Link it is assigned to.
This is added in order to keep assigning the same prefixes with
the same links even though they are delegated at different times
or by different DHCPv6 clients.
Init rule variable iif oif and to, from
While foreign rules are added the network part is not attached.
attach manager to rules and use it in routing_policy_rule_free.
Let's replace usage of fputc_unlocked() and friends by __fsetlocking(f,
FSETLOCKING_BYCALLER). This turns off locking for the entire FILE*,
instead of doing individual per-call decision whether to use normal
calls or _unlocked() calls.
This has various benefits:
1. It's easier to read and easier not to forget
2. It's more comprehensive, as fprintf() and friends are covered too
(as these functions have no _unlocked() counterpart)
3. Philosophically, it's a bit more correct, because it's more a
property of the file handle really whether we ever pass it on to another
thread, not of the operations we then apply to it.
This patch reworks all pieces of codes that so far used fxyz_unlocked()
calls to use __fsetlocking() instead. It also reworks all places that
use open_memstream(), i.e. use stdio FILE* for string manipulations.
Note that this in some way a revert of 4b61c87511.
../src/network/networkd-link.c:3577:84: warning: format specifies type 'unsigned char' but the argument has type 'uint32_t' (aka 'unsigned int') [-Wformat]
route->dst_prefixlen, route->tos, route->priority, route->table, route->lifetime);
^~~~~~~~~~~~
../src/network/networkd-manager.c:1146:132: warning: format specifies type 'unsigned char' but the argument has type 'uint32_t' (aka 'unsigned int') [-Wformat]
rule->from_prefixlen, space ? " " : "", to_str, rule->to_prefixlen, rule->tos, rule->fwmark, rule->fwmask, rule->table);
^~~~~~~~~~~
Also add some line breaks to make it easier to see which argument is for which
part of the format string.
The advantage is that is the name is mispellt, cpp will warn us.
$ git grep -Ee "conf.set\('(HAVE|ENABLE)_" -l|xargs sed -r -i "s/conf.set\('(HAVE|ENABLE)_/conf.set10('\1_/"
$ git grep -Ee '#ifn?def (HAVE|ENABLE)' -l|xargs sed -r -i 's/#ifdef (HAVE|ENABLE)/#if \1/; s/#ifndef (HAVE|ENABLE)/#if ! \1/;'
$ git grep -Ee 'if.*defined\(HAVE' -l|xargs sed -i -r 's/defined\((HAVE_[A-Z0-9_]*)\)/\1/g'
$ git grep -Ee 'if.*defined\(ENABLE' -l|xargs sed -i -r 's/defined\((ENABLE_[A-Z0-9_]*)\)/\1/g'
+ manual changes to meson.build
squash! build-sys: use #if Y instead of #ifdef Y everywhere
v2:
- fix incorrect setting of HAVE_LIBIDN2
From bce67bbee3, systemd-networkd always shows
```
rtnl: received address with invalid family type 32, ignoring.
```
during boot-up. In the code, there are log_warning() and log_debug() for the
same situation, and the log_debug() is never called. So, let's lower the
log level and remove never called function.
Routing Policy rule manipulates rules in the routing policy database control the
route selection algorithm.
This work supports to configure Rule
```
[RoutingPolicyRule]
TypeOfService=0x08
Table=7
From= 192.168.100.18
```
```
ip rule show
0: from all lookup local
0: from 192.168.100.18 tos 0x08 lookup 7
```
V2 changes:
1. Added logic to handle duplicate rules.
2. If rules are changed or deleted and networkd restarted
then those are deleted when networkd restarts next time
V3:
1. Add parse_fwmark_fwmask
As a follow-up for db3f45e2d2 let's do the
same for all other cases where we create a FILE* with local scope and
know that no other threads hence can have access to it.
For most cases this shouldn't change much really, but this should speed
dbus introspection and calender time formatting up a bit.
This adds a modified version of dhcp6_option_parse_domainname() that is
able to parse compressed domain names, borrowing the idea from
dns_packet_read_name(). It also adds pieces in networkd-link and
networkd-manager to properly save/load the added option field.
Resolves#2710.
This will allow us to have several managers sharing an event loop
and running in parallel, as if they were running in separate processes.
The long term-aim is to allow networkd to be split into separate
processes, so restructure the code to make this simpler.
For now we drop the exit-on-idle logic, as this was anyway severely
restricted at the moment. Once split, we will revisit this as it may
then make more sense again.
If setting the received timezone or transient hostname fails because D-Bus is
not (yet) up, store the data in the Manager object and try again after
connecting to D-Bus.
DNS servers must be specified as IP addresses, hence let's store them as that
internally, so that they are guaranteed to be fully normalized always, and
invalid data cannot be stored.
It is just an alias for route_free which requires that route is not null,
but it was only used in one place where it was checked that route is not
null anyway. Let's just call route_free instead.
Separate fields are replaced with a struct.
Second second duid type field is removed. The first field was used to carry
the result of DUIDType= configuration, and the second was either a copy of
this, or contained the type extracted from DuidRawData. The semantics are changed
so that the type specified in DUIDType is always used. DUIDRawData= no longer
overrides the type setting.
The networkd code is now more constrained than the sd-dhcp code:
DUIDRawData cannot have 0 length, length 0 is treated the same as unsetting.
Likewise, it is not possible to set a DUIDType=0. If it ever becomes necessary
to set type=0 or a zero-length duid, the code can be changed to support that.
Nevertheless, I think that's unlikely.
This addresses #3127 § 1 and 3.
v2:
- rename DUID.duid, DUID.duid_len to DUID.raw_data, DUID.raw_data_len
Both versions of the code are changed to allow the caller to override
DUID using simple rules: duid type and value may be specified, in
which case the caller is responsible to providing the contents,
or just duid type may be specified as DUID_TYPE_EN, in which case we
we fill in the values. In the future more support for other types may
be added, e.g. DUID_TYPE_LLT.
There still remains and ugly discrepancy between dhcp4 and dhcp6 code:
dhcp6 has sd_dhcp6_client_set_duid and sd_dhcp6_client_set_iaid and
requires client->state to be DHCP6_STATE_STOPPED, while dhcp4 has
sd_dhcp_client_set_iaid_duid and will reconfigure the client if it
is not stopped. This commit doesn't touch that part.
This addresses #3127 § 2.
This patch makes networkd stay around as long as there is more than just a
loopback interface around, or the loopback device isn't fully probed yet, or
the loopback device has a .network file attached.
In essence, this means networkd stays around now continously as it should,
unless it is running in some (container?) environment that really has no
interface except a loopback device.
Fixes#2577.
The call combines outputing a string with prefixing it with a space, optionally. This is useful to shorten the logic
for outputing lists of strings, that are space separated.
This changes the UseDomains= setting of .network files to take an optional third value "route", in addition to the
boolean values. If set, the passed domain information is used for routing rules only, but not for the search path
logic.
All booleans called dhcp_xyz are now called ".dhcp_use_xyz", to match their respective configuration file settings. This
should clarify things a bit, in particular as there is a DHCP hostname that was previously called just ".hostname"
because ".dhcp_hostname" was already existing as a bool. Since this confusion is removed now because the bool is called
".dhcp_use_hostname", the string field is now renamed to ".dhcp_hostname".
When we collect the domain names of the various links and other sources in one ordered set, make sure to use proper DNS
name comparison to filter out duplicates.
For the search domain logic the order is highly relevant, hence make sure when collecting the various search domains to
add them to an ordered set, so that the order between search domains of a specific link is retained.
Previously, .network files only knew a vaguely defined "Domains=" concept, for which the documentation declared it was
the "DNS domain" for the network connection, without specifying what that means.
With this the Domains setting is reworked, so that there are now "routing" domains and "search" domains. The former are
to be used by resolved to route DNS request to specific network interfaces, the latter is to be used for searching
single-label hostnames with (in addition to being used for routing). Both settings are configured in the "Domains="
setting. Normal domain names listed in it are now considered search domains (for compatibility with existing setups),
while those prefixed with "~" are considered routing domains only. To route all lookups to a specific interface the
routing domain "." may be used, referring to the root domain. An alternative syntax for this is the "*", as was already
implemented before using the "wildcard" domain concept.
This commit adds proper parsers for this new logic, and exposes this via the sd-network API. This information is not
used by resolved yet, this will be added in a later commit.
GLIB has recently started to officially support the gcc cleanup
attribute in its public API, hence let's do the same for our APIs.
With this patch we'll define an xyz_unrefp() call for each public
xyz_unref() call, to make it easy to use inside a
__attribute__((cleanup())) expression. Then, all code is ported over to
make use of this.
The new calls are also documented in the man pages, with examples how to
use them (well, I only added docs where the _unref() call itself already
had docs, and the examples, only cover sd_bus_unrefp() and
sd_event_unrefp()).
This also renames sd_lldp_free() to sd_lldp_unref(), since that's how we
tend to call our destructors these days.
Note that this defines no public macro that wraps gcc's attribute and
makes it easier to use. While I think it's our duty in the library to
make our stuff easy to use, I figure it's not our duty to make gcc's own
features easy to use on its own. Most likely, client code which wants to
make use of this should define its own:
#define _cleanup_(function) __attribute__((cleanup(function)))
Or similar, to make the gcc feature easier to use.
Making this logic public has the benefit that we can remove three header
files whose only purpose was to define these functions internally.
See #2008.
The previous behavior:
When DHCPv6 was enabled, router discover was performed first, and then DHCPv6 was
enabled only if the relevant flags were passed in the Router Advertisement message.
Moreover, router discovery was performed even if AcceptRouterAdvertisements=false,
moreover, even if router advertisements were accepted (by the kernel) the flags
indicating that DHCPv6 should be performed were ignored.
New behavior:
If RouterAdvertisements are accepted, and either no routers are found, or an
advertisement is received indicating DHCPv6 should be performed, the DHCPv6
client is started. Moreover, the DHCP option now truly enables the DHCPv6
client regardless of router discovery (though it will probably not be
very useful to get a lease withotu any routes, this seems the more consistent
approach).
The recommended default setting should be to set DHCP=ipv4 and to leave
IPv6AcceptRouterAdvertisements unset.
There are more than enough calls doing string manipulations to deserve
its own files, hence do something about it.
This patch also sorts the #include blocks of all files that needed to be
updated, according to the sorting suggestions from CODING_STYLE. Since
pretty much every file needs our string manipulation functions this
effectively means that most files have sorted #include blocks now.
Also touches a few unrelated include files.
We only keep the addresses that we added ourselves in link->addresses, and
introduce a new set link->addresses_foreign to keep addresses of unknown
origin.
Only functional change is that "foreign" addresses no longer prevent a link
from entering "configured" state.
Introduce a proper enum, and don't pass around string ids anymore. This
simplifies things quite a bit, and makes virtualization detection more
similar to architecture detection.
Commit 0339cd770 changed libsystemd-network's error code for missing DHCP lease
data from ENOENT to ENODATA. Adjust networkd accordingly.
This fixes interfaces being stuck in "degraded/configuring" mode forever.
https://github.com/systemd/systemd/issues/1147
When handing out DHCP leases, try to propagate DNS/NTP server
information from "uplink". The "uplink" is automatically determined as
the network interface with the highest priority default route on it.
Some places invoked fflush() directly with their own manual error
checking, let's unify all that by using fflush_and_check().
This also unifies the general error paths of fflush()+rename() file
writers.
In 5a8bcb674f, IPForwarding was introduced
to set forwarding flags on interfaces in .network files. networkd sets
forwarding options regardless of the previous setting, even if it was
set by e.g. sysctl. This commit creates a new option for IPForwarding,
"kernel", that preserves the sysctl settings rather than always setting
them.
See https://bugs.freedesktop.org/show_bug.cgi?id=89509 for the initial
bug report.
This should simplify the prototype a bit. The bus parameter is redundant
in most cases, and in the few where it matters it can be derived from
the message via sd_bus_message_get_bus().
This patch removes includes that are not used. The removals were found with
include-what-you-use which checks if any of the symbols from a header is
in use.
We will be woken up on rtnl or dbus activity, so let's just quit if some time has passed and that is the only thing that can happen.
Note that we will always stay around if we expect network activity (e.g. DHCP is enabled), as we are not restarted on that.
Only the very basics, more to come.
For now:
$ busctl tree org.freedesktop.network1
└─/org/freedesktop/network1
└─/org/freedesktop/network1/link
├─/org/freedesktop/network1/link/1
├─/org/freedesktop/network1/link/2
├─/org/freedesktop/network1/link/3
├─/org/freedesktop/network1/link/4
├─/org/freedesktop/network1/link/5
├─/org/freedesktop/network1/link/6
├─/org/freedesktop/network1/link/7
├─/org/freedesktop/network1/link/8
└─/org/freedesktop/network1/link/9
$ busctl introspect org.freedesktop.network1 /org/freedesktop/network1
NAME TYPE SIGNATURE RESULT/VALUE FLAGS
org.freedesktop.network1.Manager interface - - -
.OperationalState property s "carrier" emits-change
$ busctl introspect org.freedesktop.network1 /org/freedesktop/network1/link/1
NAME TYPE SIGNATURE RESULT/VALUE FLAGS
org.freedesktop.network1.Link interface - - -
.AdministrativeState property s "unmanaged" emits-change
.OperationalState property s "carrier" emits-change
If we get a NEWLINK + NEWADDR between enumerating the links and enumerating the addresses, we
would get a warning that the link corresponding to the address does not exist. This is a false
warning as both the NEWLINK and NEWADDR would be processed after enumerating completed, so drop
it.
This introduces am AddressFamilyBoolean type that works more or less
like a booleaan, but can optionally turn on/off things for ipv4 and ipv6
independently. THis also ports the DHCP field over to it.
Using:
find . -name '*.[ch]' | while read f; do perl -i.mmm -e \
'local $/;
local $_=<>;
s/(if\s*\([^\n]+\))\s*{\n(\s*)(log_[a-z_]*_errno\(\s*([->a-zA-Z_]+)\s*,[^;]+);\s*return\s+\g4;\s+}/\1\n\2return \3;/msg;
print;'
$f
done
And a couple of manual whitespace fixups.
As a followup to 086891e5c1 "log: add an "error" parameter to all
low-level logging calls and intrdouce log_error_errno() as log calls
that take error numbers", use sed to convert the simple cases to use
the new macros:
find . -name '*.[ch]' | xargs sed -r -i -e \
's/log_(debug|info|notice|warning|error|emergency)\("(.*)%s"(.*), strerror\(-([a-zA-Z_]+)\)\);/log_\1_errno(-\4, "\2%m"\3);/'
Multi-line log_*() invocations are not covered.
And we also should add log_unit_*_errno().
We got the following error when running systemd on a device with many ports:
"rtnl: kernel receive buffer overrun
Event source 'rtnl-receive-message' returned error, disabling: No buffer space
available"
I think the kernel socket receive buffer queue should be increased. The default
value is taken from:
"/proc/sys/net/core/rmem_default", but we can overwrite it using SO_RCVBUF
socket option.
This is already done in networkd for other sockets.
For example, the bus socket (sd-bus/bus-socket.c) has a receive queue of 8MB.
In our case, the default is 208KB.
Increasing the buffer receive queue for manager socket to 512KB should be enough
to get rid of the above error.
[tomegun: bump the limit even higher to 8M]
It is redundant to store 'hash' and 'compare' function pointers in
struct Hashmap separately. The functions always comprise a pair.
Store a single pointer to struct hash_ops instead.
systemd keeps hundreds of hashmaps, so this saves a little bit of
memory.
Let's settle on a single type for all address family values, even if
UNIX is very inconsitent on the precise type otherwise. Given that
socket() is the primary entrypoint for the sockets API, and that uses
"int", and "int" is relatively simple and generic, we settle on "int"
for this.
We failed to take a ref when waiting for udev synchronization. Fix that and also
make unreffing in callbacks simpler throughout by using _cleanup_ macros.
Fixes <https://bugs.freedesktop.org/show_bug.cgi?id=80556>.
When an address is configured to be all zeroes, networkd will now
automatically find a locally unused network of the right size from a
list of pre-configured pools. Currently those pools are 10.0.0.0/8,
172.16.0.0/12, 192.168.0.0/16 and fc00::/7, i.e. the network ranges for
private networks. They are compiled in, but should be configurable
eventually.
This allows applying the same configuration to a large number of
interfaces with each time a different IP range block, and management of
these IP ranges is fully automatic.
When allocating an address range from the pool it is made sure the range
is not used otherwise.
Configuration will be in
root:root /run/systemd/network
and state will be in
systemd-network:systemd-network /run/systemd/netif
This matches what we do for logind's seat/session state.
Rely on modules being built-in or autoloaded on-demand.
As networkd is a network facing service, we want to limits its capabilities,
as much as possible. Also, we may not have CAP_SYS_MODULE in a container,
and we want networkd to work the same there.
Module autoloading does not always work, but should be fixed by the kernel
patch f98f89a0104454f35a: 'net: tunnels - enable module autoloading', which
is currently in net-next and which people may consider backporting if they
want tunneling support without compiling in the modules.
Early adopters may also use a module-load.d snippet and order
systemd-modules-load.service before networkd to force the module
loading of tunneling modules.
This sholud fix the various build issues people have reported.
The bitmask is deprecated in the kernel, so move to the new interface. At the moment
this does not make a difference for us, but it avoids having to change the API in the future.
This essentially swaps the roles of rtnl and udev in networkd. After this
change libudev is only used for waiting for udev to initialize devices and
to get udev-specific information needed for some [Match] attributes.
This in particular simplifies the code in containers where udev is not really
useful, but also simplifies things and reduces round-trips in the non-container
case.
Pass the mac address on to ipv4ll and dhcp clients so they always have
up-to-date information, and may react appropriately to the change.
Also drop setting the mac address from uevent, and only log when the
address actually changes.
Previously the returned object of constructor functions where sometimes
returned as last, sometimes as first and sometimes as second parameter.
Let's clean this up a bit. Here are the new rules:
1. The object the new object is derived from is put first, if there is any
2. The object we are creating will be returned in the next arguments
3. This is followed by any additional arguments
Rationale:
For functions that operate on an object we always put that object first.
Constructors should probably not be too different in this regard. Also,
if the additional parameters might want to use varargs which suggests to
put them last.
Note that this new scheme only applies to constructor functions, not to
all other functions. We do give a lot of freedom for those.
Note that this commit only changes the order of the new functions we
added, for old ones we accept the wrong order and leave it like that.
Don't set set **ret when returning r < 0, as matching on the errno may easily
give false positives in the future leading to null pointer dereference.
Reported-by: David Herrmann <dh.herrmann@gmail.com>
Udev does not run in containers, so instead of relying on it to tell us when a
network device is ready to be used by networkd, we simply assume that any
device was fully initialized before being added to the container.
This allows us users of the library to keep copies of old leases. This is
used by networkd to know what addresses to drop (if any) when the lease
expires.
In the future this may be used by DNAv4 and sd-dhcp-server.
When creating a new link, the kernel will not inform us about the new ifindex
in its ack. We have to listen for newly created devices and deduce the new
ifindex by matching on the ifname.
We used to do this by waiting for a new device from libudev, but that is asking
for trouble, as udev will happily rename the device before handing it to us.
Listen on rtnl instead, the chance of the name being changed before reaching us
is much smaller (if not nil).
Kernel patch in the works to make this unneccessary.
We may not have a dbus daemon in the initrd (until we can rely on kdbus). In
this case, simply ignore any attempts at using the bus. There is only one user
for now, but surely more to come.
In order to work reliably in the real root without kdbus, but at the same time
don't delay boot when kdbus is in use, order ourselves after dbus.service.
This adds support to generate a basic resolv.conf in /run/systemd/network.
This file will not take any effect unless a symlink is created from
/etc/resolv.conf.
Nameservers received over DHCP takes precedence over statically configured ones.
Note: /etc/resolv.conf is severely limited, so in the future we will likely
rather provide a much more powerfull nss plugin (or something to that effect),
but this should allow current users to function without any loss of
functionality.