The public function and the implementation were split into two for
no particular reason.
We would assert() on the internal state of the client. This should not be done
in a function that is directly called from a public function. (I.e., we should
not crash if the public function is called at the wrong time.)
assert() is changed to assert_return().
And before anyone asks: I put the assert_returns() *above* the internal
variables on purpose. This makes it easier to see that the assert_returns()
are about the state that is passed in, and if they are not satisfied, the
function returns immediately. The compiler doesn't care either way, so
the ordering that is clearest to the reader should be chosen.
IPServiceType set to CS6 (network control) causes problems on some old
network setups that continue to interpret the field as IP TOS.
Make DHCP work on such networks by allowing this field to be set to
CS4 (Realtime) instead, as this maps to IPTOS_LOWDELAY.
Signed-off-by: Siddharth Chandrasekaran <csiddharth@vmware.com>
The RFC states that lifetime (AdvDefaultLifetime) must be at least
MaxRtrAdvInterval (which more or less corresponds to SD_RADV_DEFAULT_MAX_TIMEOUT_USEC
in systemd).
To fulfill this limit, virtually lower MaxRtrAdvInterval and MinRtrAdvInterval
accordingly.
Also check that min is not lower than 3s and max is not lower than 4s.
This reverts commit 8a07b4033e.
The tests are kept. test-networkd-conf is adjusted to pass.
This fixes#13276. I think current rules are extremely confusing, as the
case in test-networkd-conf shows. We apply some kinds of unescaping (relating
to quoting), but not others (related to escaping of special characters).
But fixing this is hard, because people have adjusted quoting to match
our rules, and if we make the rules "better", things might break in unexpected
places.
Comparisons are done in the normal order (if (need > available), not if (available < need)),
variables have reduced scope and are renamed for clarity.
The only functional change is that if we return -ENAMETOOLONG, we do that
without modifying the options[] array.
I also added an explanatory comment. The use of one offset to point into three
buffers is not obvious.
Coverity (in CID#1402354) says that sname might be accessed at bad offset, but
I cannot see this happening. We check for available space before writing anything.
It's hard to even say what exactly this combination means. Escaping is
necessary when quoting to have quotes within the string. So the escaping of
quote characters is inherently tied to quoting. When unquoting, it seems
natural to remove escaping which was done for the quoting purposes. But with
both flags we would be expected to re-add this escaping after unqouting? Or
maybe keep the escaping which is not necessary for quoting but otherwise
present? This all seems too complicated, let's just forbid such usage and
always fully unescape when unquoting.
This is for 6d36464065. It turns out that this is causing more problems than
expected. Let's retroactively introduce naming scheme v241 to conditionalize
this change.
Follow-up for #12792 and 6d36464065. See also
https://bugzilla.suse.com/show_bug.cgi?id=1136600.
$ SYSTEMD_LOG_LEVEL=debug NET_NAMING_SCHEME=v240 build/udevadm test-builtin net_setup_link /sys/class/net/br11
$ SYSTEMD_LOG_LEVEL=debug NET_NAMING_SCHEME=v241 build/udevadm test-builtin net_setup_link /sys/class/net/br11
...
@@ -20,11 +20,13 @@
link_config: could not set ethtool features for br11
Could not set offload features of br11: Operation not permitted
br11: Device has name_assign_type=3
-Using interface naming scheme 'v240'.
+Using interface naming scheme 'v241'.
br11: Policy *keep*: keeping existing userspace name
br11: Device has addr_assign_type=1
-br11: No stable identifying information found
-br11: Could not generate persistent MAC: No data available
+br11: Using "br11" as stable identifying information
+br11: Using generated persistent MAC address
+Could not set Alias=, MACAddress= or MTU= on br11: Operation not permitted
+br11: Could not apply link config, ignoring: Operation not permitted
Unload module index
Unloaded link configuration context.
ID_NET_DRIVER=bridge
This reflect its role better.
(I didn't use …_persistent_name(), because which name is actually used
depends on the policy. So it's better not to make this sound like it returns
*the* persistent name.)
This is partially a refactoring, but also makes many more places use
unlocked operations implicitly, i.e. all users of fopen_temporary().
AFAICT, the uses are always for short-lived files which are not shared
externally, and are just used within the same context. Locking is not
necessary.
When the link goes down, DHCP client_receive_message*() functions return an
error and the related I/O source is removed from the main loop. With the
current implementation of systemd-networkd this doesn't matter because the DHCP
client is always stopped on carrier down and restarted on carrier up. However
it seems wrong to have the DHCP client crippled (because no packet can be
received anymore) once the link goes temporarily down.
Change the receive functions to ignore a ENETDOWN event so that the client will
be able to receive packets again after the link comes back.
inet_ntop() is not documented to be thread-safe, so it should not
be used in the DHCP library. Arguably, glibc uses a thread local
buffer, so indeed there is no problem with a suitable libc. Anyway,
just avoid it.
The DHCP client should not pre-filter addresses beyond what RFC
requires. If a client's user (like networkd) wishes to skip/filter
certain addresses, it's their responsibility.
The point of this is that the DHCP library does not hide/abstract
information that might be relevant for certain users. For example,
NetworkManager exposes DHCP options in its API. When doing that, the
options should be close to the actual lease.
This is related to commit d9ec2e632d
(dhcp4: filter bogus DNS/NTP server addresses silently).
The Router DHCP option may contain a list of one or more
routers ([1]). Extend the API of sd_dhcp_lease to return a
list instead of only the first.
Note that networkd still only uses the first router (if present).
Aside from extending the internal API of the DHCP client, there
is almost no change in behavior. The only visible difference in
behavior is that the "ROUTER" variable in the lease file is now a
list of addresses.
Note how RFC 2132 does not define certain IP addresses as invalid for the
router option. Still, previously sd_dhcp_lease_get_router() would never
return a "0.0.0.0" address. In fact, the previous API could not
differenciate whether no router option was present, whether it
was invalid, or whether its first router was "0.0.0.0". No longer let
the DHCP client library impose additional restrictions that are not
part of RFC. Instead, the caller should handle this. The patch does
that, and networkd only consideres the first router entry if it is not
"0.0.0.0".
[1] https://tools.ietf.org/html/rfc2132#section-3.5
deserialize_in_addrs() allocates the buffer before trying to parse
the IP address. Since a parsing error is silently ignored, the returned
size might be zero. In such a case we shouldn't return any buffer.
Anyway, there was no leak, because there are only two callers like
r = deserialize_in_addrs(&lease->dns, dns);
which both keep the unused buffer and later release it.
Note that deserialize_in_addrs() doesn't free the pointer before
reassigning the new output. The caller must take care to to pass
"ret" with an allocated buffer that would be leaked when returning
the result.
The "chaddr" field is 16 bytes long, with "hlen" being the
length of the address.
https://tools.ietf.org/html/rfc2131#section-4.3.1 says:
The server MUST return to the client:
...
o Any parameters specific to this client (as identified by
the contents of 'chaddr' or 'client identifier' in the DHCPDISCOVER
or DHCPREQUEST message), e.g., as configured by the network
administrator,
It's not clear, whether only the first 'hlen' bytes of 'chaddr'
must correspond or all 16 bytes.
Note that https://tools.ietf.org/html/rfc4390#section-2.1 says for IPoIB
"chaddr" (client hardware address) field MUST be zeroed.
with having "hlen" zero. This indicates that at least in this case, the
bytes after "hlen" would matter.
As the DHCP client always sets the trailing bytes to zero, we would expect
that the server also replies as such and we could just compare all 16 bytes.
However, let's be liberal and accept any padding here.
This in practice only changes behavior for infiniband, where we
previously would enforce that the first ETH_ALEN bytes are zero.
That seems arbitrary for IPoIB. We should either check all bytes or
none of them. Let's do the latter and don't enforce RFC 4390 in this
regard.
gcc-9 warns whenever the elements of a structure defined with _packed_ are used:
../src/network/networkd-dhcp6.c: In function ‘dhcp6_pd_prefix_assign’:
../src/network/networkd-dhcp6.c:92:53: warning: taking address of packed member of ‘struct <anonymous>’ may result in an unaligned pointer value [-Waddress-of-packed-member]
92 | r = manager_dhcp6_prefix_add(link->manager, &p->opt.in6_addr, link);
| ^~~~~~~~~~~~~~~~
And the compiler is right, because in principle the alignment could be wrong.
In this particular case it is not, because the structure is carefully defined
not to have holes. Let's remove _packed_ and use compile-time asserts to verify
that the offsets are not changed.
Fixes#3374. The problem is that we set MACPolicy=persistent (i.e. we would
like to generate persistent MAC addresses for interfaces which don't have a
fixed MAC address), but various virtual interfaces including bridges, tun/tap,
bonds, etc., do not not have the necessary ID_NET_NAME_* attributes and udev
would not assing the address and warn:
Could not generate persistent MAC address for $name: No such file or directory
Basic requirements which I think a solution for this needs to satisfy:
1. No changes to MAC address generation for those cases which are currently
handled successfully. This means that net_get_unique_predictable_data() must
keep returning the same answer, which in turn means net_get_name() must keep
returning the same answer. We can only add more things we look at with lower
priority so that we start to cover cases which were not covered before.
2. Like 1, but for IPvLL seed and DHCP IAD. This is less important, but "nice
to have".
3. Keep MACPolicy=persistent. If people don't want it, they can always apply
local configuration, but in general stable MACs are a good thing. I have never
seen anyone complain about that.
== Various approaches that have been proposed
=== https://github.com/systemd/systemd/issues/3374#issuecomment-223753264 (tomty89)
if !ID_BUS and INTERFACE, use INTERFACE
I think this almost does the good thing, but I don't see the reason to reject ID_BUS
(i.e. physical hardware). Stable MACs are very useful for physical hardware that has
no physical MAC.
=== https://github.com/systemd/systemd/issues/3374#issuecomment-224733069 (teg)
if (should_rename(device, true))
This means looking at name_assign_type. In particular for
NET_NAME_USER should_rename(..., true) returns true. It only returns false
for NET_NAME_PREDICTABLE. So this would cover stuff like br0, bond0, etc,
but would not cover lo and other devices with predictable names. That doesn't
make much sense.
But did teg mean should_rename() or !should_rename()?
=== https://github.com/systemd/systemd/issues/3374#issuecomment-234628502 (tomty89):
+ if (!should_rename(device, true))
+ return udev_device_get_sysname(device)
This covers only devices with NET_NAME_PREDICTABLE. Since the problem applies as
much to bridges and such, this isn't neough.
=== https://github.com/systemd/systemd/issues/3374#issuecomment-281745967 (grafi-tt)
+ /* if the machine doesn't provide data about the device, use the ifname specified by userspace
+ * (this is the case when the device is virtual, e.g., bridge or bond) */
+ s = udev_device_get_sysattr_value(device, "name_assign_type");
+ if (s && safe_atou(s, &type) >= 0 && type == NET_NAME_USER)
+ return udev_device_get_sysname(device);
This does not cover bond0, vnet0, tun/tap and similar.
grafi-tt also proposes patching the kernel, but *not* setting name_assign_type
seems intentional in those cases, because the device name is a result of
enumeration, not set by the userspace.
=== https://github.com/systemd/systemd/issues/3374#issuecomment-288882355 (tomty89)
(also PR #11372)
- MACAddressPolicy=persistent
This break requirement 3. above. It would solve the immediate problem, but I
think the disruption is too big.
=== This patch
This patch means that we will set a "stable" MAC for pretty much any virtual
device by default, where "stable" means keyed off the machine-id and interface
name.
It seems like a big change, but we already did this for most physical devices.
Doing it also for virtual devices doesn't seem like a big issue. It will make
the setup and monitoring of virtualized networks slightly nicer. I don't think
anyone is depending on having the MAC address changed when those devices are
destoryed and recreated. If they do, they'd have to change MACAddressPolicy=.
== Implementation
net_get_name() is called from dhcp_ident_set_iaid() so I didn't change
net_get_name() like in grafi-tt's patch, but net_get_unique_predictable_data().
net_get_unique_predictable_data() is called from get_mac() in link-config.c
and sd_ipv4ll_set_address_seed(), so both of those code paths are affected
and will now get data in some cases where they errored out previously.
The return code is changed to -ENODATA since that gives a nicer error string.
Found by inspecting results of running this small program:
int main(int argc, const char **argv) {
for (int i = 1; i < argc; i++) {
FILE *f;
char line[1024], prev[1024], *r;
int lineno;
prev[0] = '\0';
lineno = 1;
f = fopen(argv[i], "r");
if (!f)
exit(1);
do {
r = fgets(line, sizeof(line), f);
if (!r)
break;
if (strcmp(line, prev) == 0)
printf("%s:%d: error: dup %s", argv[i], lineno, line);
lineno++;
strcpy(prev, line);
} while (!feof(f));
fclose(f);
}
}
It seems quite useful to provide this additional information in public exported
functions.
This is a c99 feature, not supported in C++. Without the check in _sd-common.h:
FAILED: test-bus-vtable-cc@exe/src_libsystemd_sd-bus_test-bus-vtable-cc.cc.o
...
In file included from ../src/libsystemd/sd-bus/test-bus-vtable-cc.cc:9:
In file included from ../src/systemd/sd-bus-vtable.h:26:
In file included from ../src/systemd/sd-bus.h:26:
../src/systemd/sd-id128.h:38:47: error: static array size is a C99 feature, not permitted in C++
char *sd_id128_to_string(sd_id128_t id, char s[static SD_ID128_STRING_MAX]);
^
In .c files, I opted to use the define for consistency, even though we don't support
compilation with a C++ compiler, so the unconditional keyword would work too.
There are various functions to set the DUID of a DHCPv6 client.
However, none of them allows to set arbitrary data. The closest is
sd_dhcp6_client_set_duid(), which would still do validation of the
DUID's content via dhcp_validate_duid_len().
Relax the validation and only log a debug message if the DUID
does not validate.
Note that dhcp_validate_duid_len() already is not very strict. For example
with DUID_TYPE_LLT it only ensures that the length is suitable to contain
hwtype and time. It does not further check that the length of hwaddr is non-zero
or suitable for hwtype. Also, non-well-known DUID types are accepted for
extensibility. Why reject certain DUIDs but allowing clearly wrong formats
otherwise?
The validation and failure should happen earlier, when accepting the
unsuitable DUID. At that point, there is more context of what is wrong,
and a better failure reason (or warning) can be reported to the user. Rejecting
the DUID when setting up the DHCPv6 client seems not optimal, in particular
because the DHCPv6 client does not care about actual content of the
DUID and treats it as opaque blob.
Also, NetworkManager (which uses this code) allows to configure the entire
binary DUID in binary. It intentionally does not validate the binary
content any further. Hence, it needs to be able to set _invalid_ DUIDs,
provided that some basic constraints are satisfied (like the maximum length).
sd_dhcp6_client_set_duid() has two callers: both set the DUID obtained
from link_get_duid(), which comes from configuration.
`man networkd.conf` says: "The configured DHCP DUID should conform to
the specification in RFC 3315, RFC 6355.". It does not not state that
it MUST conform.
Note that dhcp_validate_duid_len() has another caller: DHCPv4's
dhcp_client_set_iaid_duid_internal(). In this case, continue with
strict validation, as the callers are more controlled. Also, there is
already sd_dhcp_client_set_client_id() which can be used to bypass
this check and set arbitrary client identifiers.
sd_dhcp_client_set_client_id() is the only API for setting a raw client-id.
All other setters are more restricted and only allow to set a type 255 DUID.
Also, dhcp4_set_client_identifier() is the only caller, which already
does:
r = sd_dhcp_client_set_client_id(link->dhcp_client,
ARPHRD_ETHER,
(const uint8_t *) &link->mac,
sizeof(link->mac));
and hence ensures that the data length is indeed ETH_ALEN.
Drop additional input validation from sd_dhcp_client_set_client_id(). The client-id
is an opaque blob, and if a caller wishes to set type 1 (ethernet) or type 32
(infiniband) with unexpected address length, it should be allowed. The actual
client-id is not relevant to the DHCP client, and it's the responsibility of the
caller to generate a suitable client-id.
For example, in NetworkManager you can configure all the bytes of the
client-id, including such _invalid_ settings. I think it makes sense,
to allow the user to fully configure the identifier. Even if such configuration
would be rejected, it would be the responsibility of the higher layers (including
a sensible error message to the user) and not fail later during
sd_dhcp_client_set_client_id().
Still log a debug message if the length is unexpected.
Infiniband addresses are 20 bytes (INFINIBAND_ALEN), but only the last
8 bytes are suitable for putting into the client-id.
This bug had no effect for networkd, because sd_dhcp_client_set_client_id()
has only one caller which always uses ARPHRD_ETHER type.
I was unable to find good references for why this is correct ([1]). Fedora/RHEL
has patches for ISC dhclient that also only use the last 8 bytes ([2], [3]).
RFC 4390 (Dynamic Host Configuration Protocol (DHCP) over InfiniBand) [4] does
not discuss the content of the client-id either.
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1658057#c29
[2] https://bugzilla.redhat.com/show_bug.cgi?id=660681
[3] 3ccf3c8d81/f/dhcp-lpf-ib.patch
[4] https://tools.ietf.org/html/rfc4390
In particular, check that the order of the results is consistent.
This test coverage will be useful in order to refactor the compare_func
used while sorting the results.
When introduced, this test also uncovered a memory leak in sd_lldp_stop(),
which was then fixed by a separate commit using a specialized function
as destructor of the LLDP Hashmap.
Tested:
$ ninja -C build/ test
$ valgrind --leak-check=full build/test-lldp
This reverts commit dd102e4d0c.
That test case exposed a memory leak and breaks CI, so let's revert it until
the original issue is fixed, to prevent disruption of automated testing.
The ?: operator is very useful for chaining comparison functions
(strcmp, memcmp, CMP), since its behavior is to return the result
of the comparison function call if non-zero, or continue evaluating
the chain of comparison functions.
This simplifies the code in that using a temporary `r` variable
to store the function results is no longer necessary and the checks
for non-zero to return are no longer needed either, resulting in a
typical three-fold reduction to the number of lines in the code.
Introduce a new memcmp_nn() to compare two memory buffers in
lexicographic order, taking length in consideration.
Tested: $ ninja -C build/ test
All test cases pass. In particular, test_multiple_neighbors_sorted()
in test-lldp would catch regressions introduced by this commit.
In particular, check that the order of the results is consistent.
This test coverage will be useful in order to refactor the compare_func
used while sorting the results.
Tested: ninja -C build/ test
An earlier commit 0e408b82b (dhcp6-client: handle IAID with value zero)
introduced a flag to sd_dhcp6_client to distinguish between an unset
IAID and a value set to zero.
However, that was not sufficient and broke leaving the setting
uninitialized in networkd configuration. The configuration parsing
also must distinguish between the default, unset value and an
explict zero configuration.
Fixes: 0e408b82b8
https://tools.ietf.org/html/rfc1035#section-2.3.1 says (approximately)
that only letters, numbers, and non-leading non-trailing dashes are allowed
(for entries with A/AAAA records). We set no restrictions.
hosts(5) says:
> Host names may contain only alphanumeric characters, minus signs ("-"), and
> periods ("."). They must begin with an alphabetic character and end with an
> alphanumeric character.
nss-files follows those rules, and will ignore names in /etc/hosts that do not
follow this rule.
Let's follow the documented rules for /etc/hosts. In particular, this makes us
consitent with nss-files, reducing surprises for the user.
I'm pretty sure we should apply stricter filtering to names received over DNS
and LLMNR and MDNS, but it's a bigger project, because the rules differ
depepending on which level the label appears (rules for top-level names are
stricter), and this patch takes the minimalistic approach and only changes
behaviour for /etc/hosts.
Escape syntax is also disallowed in /etc/hosts, even if the resulting character
would be allowed. Other tools that parse /etc/hosts do not support this, and
there is no need to use it because no allowed characters benefit from escaping.
This splits out a bunch of functions from fileio.c that have to do with
temporary files. Simply to make the header files a bit shorter, and to
group things more nicely.
No code changes, just some rearranging of source files.
Since sd_dhcp_lease_get_routes() returns the list of all routes,
the caller may need to differenciate whether the route was option
33 (static-routes) or 121 (classless-static-route).
Add an accessor for the internal field.
config_parse_iaid(), dhcp_identifier_set_iaid() and sd_dhcp6_client_set_iaid() all
allow for the IAID to be zero. Also, RFC 3315 makes no mention that zero
would be invalid.
However, client_ensure_iaid() would take an IAID of zero as a sign that
the values was unset. Fix that by keeping track whether IAID is
initialized.
Ideally, coccinelle would strip unnecessary braces too. But I do not see any
option in coccinelle for this, so instead, I edited the patch text using
search&replace to remove the braces. Unfortunately this is not fully automatic,
in particular it didn't deal well with if-else-if-else blocks and ifdefs, so
there is an increased likelikehood be some bugs in such spots.
I also removed part of the patch that coccinelle generated for udev, where we
returns -1 for failure. This should be fixed independently.
In certain cases the timeouts may not have been unref'ed before they
need to be re-added. Add the appropriate unref calls to ensure we don't
register the timeout multiple times.
This fixes possible cases where timeouts are triggered multiple times
and even on destroyed DHCPv6 clients.
https://gitlab.freedesktop.org/NetworkManager/NetworkManager/issues/73Fixes#10749.
Now that we don't (mis-)use the env file parser to parse kernel command
lines there's no need anymore to override the used newline character
set. Let's hence drop the argument and just "\n\r" always. This nicely
simplifies our code.
The previous code did htole64() followed by unaligned_write_be32() (the
XOR and shift in between is endianness agnostic). That means, on every
architeture there is always exactly one byte swap and the iaid is
dependent on endianness.
Since dhcp_identifier_set_iaid() is part of the DUID generation
algorithm, this cannot be fixed without changing the client-id.
In particular, as the client-id already depends on the machine-id (and
is thus inherrently host-specific), it is better to stick to the current
behavior.
However, add a parameter to switch between old and new behaviour.
Since the new behavior is unused, the only real purpose of this
change is to self-document the oddity of the function.
Fixes: 933f9caeeb
This doesn't change anything in the generated source, but I think makes
semantically more sense, as these structures have undefined size, and we
only want to know the size up to the data field in these cases.
All over the place we define local variables for the various sockopts
that take a bool-like "int" value. Sometimes they are const, sometimes
static, sometimes both, sometimes neither.
Let's clean this up, introduce a common const variable "const_int_one"
(as well as one matching "const_int_zero") and use it everywhere, all
acorss the codebase.
Arguably, libsystemd-network is (still) entirely internal API.
However there is the aim of maybe exposing it as public API.
For that reason, it cannot include internal headers from
"src/basic/".
Note how files "src/systemd/sd-*.h" don't include any systemd
headers which don't themself have an "sd-" prefix.
Fixes: d89a400ed6
According to our CODING_STYLE our library code should generally not log
beyond LOG_DEBUG. Let's hence get rid of log_radv_warning_errno() and
just use log_radv_errno() instead.
The code handling the errors was originally part of ndisc_recv, which,
being an event handler, would be simply turned off if it returned a negative
error code. It's no longer necessary. Plus, it helps avoid passing
an uninitialized value to radv_send.
Closes https://github.com/systemd/systemd/issues/10223.
Since we now have the possibility to request prefixes to be delegated
without corresponding IPv6 addresses, it does not make sense to store
lease T1 and T2 timeouts in the otherwise unused IA_NA structure.
Therefore lease timeouts T1 and T2 are moved to the DHCPv6 client
structure, as there will be only one set of stateful timeouts required
by RFC 7550, Section 4.3.
Select T1 and T2 timeouts based on whether addresses or prefixes were
requested and what the server offered. The address and prefix timeouts
values have been computed earlier when the relevant DHCPv6 options were
parsed.
Add function to fetch the IAID for the delegated IA_PD prefix. In
order to keep things simple in the implemntation, the same IAID
is used with IA_NA addresses and IA_PD prefixes. But the DHCPv6
server can choose to return only IA_PD prefixes, and the client
can nowadays omit requesting of IA_NA addresses. Since the function
fetching said IAID from the lease looks only for IA_NA ones, it
will return an empty IAID, which of course does not match the one
set for prefixes.
Fix this by adding a function returning the IAID for the prefix.
Add function to enable/disable IA_NA address requests. Internally
handle the request as a bit mask and add IA_PD prefix delegation
to the same bit mask instead of having a separate boolean. Thus
the calling code can set requests for prefix and address delegation
separately. This is handy when supporting RFC 7084.
Add a check in the code that at least something is requested from
the server in Managed mode. By default request IA_NA addresses from
the DHCPv6 server. Although a value has been defined for IA_TA,
temporay IA_TA addresses are not yet requested.