Let's clean up hostname_is_valid() a bit: let's turn the second boolean
argument into a more explanatory flags field, and add a flag that
accepts the special name ".host" as valid. This is useful for the
container logic, where the special hostname ".host" refers to the "root
container", i.e. the host system itself, and can be specified at various
places.
let's also get rid of machine_name_is_valid(). It was just an alias,
which is confusing and even more so now that we have the flags param.
This changes the retransmission timeout algorithm for requests
other than RENEW and REBIND. Previously, the retransmission timeout
started at 2 seconds, then doubling each retransmission up to a max
of 64 seconds. This is changed to match what RFC2131 section 4.1 describes,
which skips the initial 2 second timeout and starts with a 4 second timeout
instead. Note that -1 to +1 seconds of random 'fuzz' is added to each
timeout, in previous and current behavior.
This change is therefore slightly slower than the previous behavior in
attempting retransmissions when no server response is received, since the
first transmission times out in 4 seconds instead of 2.
Since TRANSIENT_FAILURE_ATTEMPTS is set to 3, the previous length of time
before a transient failure was reported back to systemd-networkd was
2 + 4 + 8 = 14 seconds, plus, on average, 3 seconds of random 'fuzz' for
a transient failure timeout between 11 and 17 seconds. Now, since the
first timeout starts at 4, the transient failure will be reported at
4 + 8 + 16 = 28 seconds, again plus 3 random seconds for a transient
failure timeout between 25 and 31 seconds.
Additionally, if MaxAttempts= is set, it will take slightly longer to
reach than with previous behavior.
Use the request timeout algorithm specified in RFC2131 section 4.4.5 for
handling timed out RENEW and REBIND requests.
This changes behavior, as previously only 2 RENEW and 2 REBIND requests
were sent, no matter how long the lease lifetime. Now, requests are
send according to the RFC, which results in starting with a timeout
of 1/2 the t1 or t2 period, and halving the timeout for each retry
down to a minimum of 60 seconds.
Fixes: #17909
The parsing of the dhcpv4 lease lifetime, as well as the t1/t2
times, is simplified by this commit.
This differs from previous behavior; previously, the lease lifetime and
t1/t2 values were modified by random 'fuzz' by subtracting 3, then adding
a random number between 0 and (slightly over) 2 seconds. The resulting
values were therefore always between 1-3 seconds shorter than the value
provided by the server (or the default, in case of t1/t2). Now, as
described in RFC2131, the random 'fuzz' is between -1 and +1 seconds,
meaning the actual t1 and t2 value will be up to 1 second earlier or
later than the server-provided (or default) t1/t2 value.
This also differs in handling the lease lifetime, as described above it
previously was adjusted by the random 'fuzz', but the RFC does not state
that the lease expiration time should be adjusted, so now the code uses
exactly the lease lifetime as provided by the server with no adjustment.
RFC2131, providing the details for dhcpv4, has specific retransmission
intervals that it outlines. This adds functions to compute the timeouts
as the RFC describes.
So far we only reported major state transitions like failure to acquire
the message. Let's report the initial failure after a few timeouts in
a new event type.
The number of timeouts is hardcoded as 3, since Windows seems to be using
that. I don't think we need to make this configurable out of the box. A
reasonable default may be enough.
With these patches applied, networkd is successfully able to get an
address from a DHCP server on an IPoIB interface.
1)
Makes networkd pass the actual interface type to the dhcp client,
instead of hardcoding it to Ethernet.
2)
Fixes some issues in handling the larger (20 Byte) IB MAC addresses in
the dhcp code.
3)
Add a new field to networkds Link struct, which holds the interface
broadcast address.
3.1)
Modify the DHCP code to also expect the broadcast address as parameter.
On an Ethernet-Interface the Broadcast address never changes and is always
all 6 bytes set to 0xFF.
On an IB one however it is not neccesarily always the same, thus
fetching the actual address from the interface is neccesary.
4)
Only the last 8 bytes of an IB MAC are stable, so when using an IB MAC to
generate a client ID, only pass those 8 bytes.
This effectively reverts the commit 22fc2420b2.
The function `asynchronous_close()` confuses valgrind. Before this commit,
valgrind may report the following:
```
HEAP SUMMARY:
in use at exit: 384 bytes in 1 blocks
total heap usage: 4,787 allocs, 4,786 frees, 1,379,191 bytes allocated
384 bytes in 1 blocks are possibly lost in loss record 1 of 1
at 0x483CAE9: calloc (vg_replace_malloc.c:760)
by 0x401456A: _dl_allocate_tls (in /usr/lib64/ld-2.31.so)
by 0x4BD212E: pthread_create@@GLIBC_2.2.5 (in /usr/lib64/libpthread-2.31.so)
by 0x499B662: asynchronous_job (async.c:47)
by 0x499B7DC: asynchronous_close (async.c:102)
by 0x4CFA8B: client_initialize (sd-dhcp-client.c:696)
by 0x4CFC5E: client_stop (sd-dhcp-client.c:725)
by 0x4D4589: sd_dhcp_client_stop (sd-dhcp-client.c:2134)
by 0x493C2F: link_stop_clients (networkd-link.c:620)
by 0x4126DB: manager_free (networkd-manager.c:867)
by 0x40D193: manager_freep (networkd-manager.h:97)
by 0x40DAFC: run (networkd.c:20)
LEAK SUMMARY:
definitely lost: 0 bytes in 0 blocks
indirectly lost: 0 bytes in 0 blocks
possibly lost: 384 bytes in 1 blocks
still reachable: 0 bytes in 0 blocks
suppressed: 0 bytes in 0 blocks
For lists of detected and suppressed errors, rerun with: -s
ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
```
We always need to make them unions with a "struct cmsghdr" in them, so
that things properly aligned. Otherwise we might end up at an unaligned
address and the counting goes all wrong, possibly making the kernel
refuse our buffers.
Also, let's make sure we initialize the control buffers to zero when
sending, but leave them uninitialized when reading.
Both the alignment and the initialization thing is mentioned in the
cmsg(3) man page.
We need to use the CMSG_SPACE() macro to size the control buffers, not
CMSG_LEN(). The former is rounded up to next alignment boundary, the
latter is not. The former should be used for allocations, the latter for
encoding how much of it is actually initialized. See cmsg(3) man page
for details about this.
Given how confusing this is, I guess we don't have to be too ashamed
here, in most cases we actually did get this right.
Split out of #15457, let's see if this is the culprit of the CI failure.
(also setting green label here, since @keszybz already greenlit it in that other PR)
When specifying `DHCPv4.SendOption=`, it is used by systemd-networkd to
set the value of that option within the DHCP request that is sent out.
This differs to setting `DHCPServer.SendOption=`, which will place all
the options together as suboptions into the vendor-specific information
(code 43) option.
This commit adds two new config options, `DHCPv4.SendVendorOption=` and
`DHCPServer.SendVendorOption=`. These both have the behaviour of the old
`DHCPServer.SendOption=` flag, and set the value of the suboption in the
vendor-specific information option.
The behaviour of `DHCPServer.SendOption=` is then changed to reflect
that of `DHCPv4.SendOption=`. It will set the value of the corresponding
option in the DHCP request.
The public function and the implementation were split into two for
no particular reason.
We would assert() on the internal state of the client. This should not be done
in a function that is directly called from a public function. (I.e., we should
not crash if the public function is called at the wrong time.)
assert() is changed to assert_return().
And before anyone asks: I put the assert_returns() *above* the internal
variables on purpose. This makes it easier to see that the assert_returns()
are about the state that is passed in, and if they are not satisfied, the
function returns immediately. The compiler doesn't care either way, so
the ordering that is clearest to the reader should be chosen.
IPServiceType set to CS6 (network control) causes problems on some old
network setups that continue to interpret the field as IP TOS.
Make DHCP work on such networks by allowing this field to be set to
CS4 (Realtime) instead, as this maps to IPTOS_LOWDELAY.
Signed-off-by: Siddharth Chandrasekaran <csiddharth@vmware.com>