assert_return() should only be used to validate user-facing parameters and
state, assert() should be used for checking our own internal state and
parameters.
A field "index" is not particularly precise and also might conflict with libc's
index() function definition. Also, pretty much everywhere else we call this
concept "ifindex", including in networkd, the primary user of these libraries.
Hence, let's fix this up and call this "ifindex" everywhere here too.
According to recv(2) these should be the same, but that is not true.
Passing a buffer of length 0 to read is defined to be a noop according
to read(2), but passing a buffer of length 0 to recv will discard the
pending pacet.
We can easily hit this as we allocate our buffer size depending on
the size of the incoming packet (using FIONREAD). As pointed out in
issue #3299 simply sending an empty UDP packet to the DHCP client
port will trigger a busy loop in networkd as we are polling on the
socket but never discarding the empty packet.
This reverts ad5ae47a0d but fixes the
same issue.
Both versions of the code are changed to allow the caller to override
DUID using simple rules: duid type and value may be specified, in
which case the caller is responsible to providing the contents,
or just duid type may be specified as DUID_TYPE_EN, in which case we
we fill in the values. In the future more support for other types may
be added, e.g. DUID_TYPE_LLT.
There still remains and ugly discrepancy between dhcp4 and dhcp6 code:
dhcp6 has sd_dhcp6_client_set_duid and sd_dhcp6_client_set_iaid and
requires client->state to be DHCP6_STATE_STOPPED, while dhcp4 has
sd_dhcp_client_set_iaid_duid and will reconfigure the client if it
is not stopped. This commit doesn't touch that part.
This addresses #3127 § 2.
Initialize auto variables with cleanup attribute, otherwise we
get a compiler warning with -fexceptions.
./configure CFLAGS='-Wmaybe-uninitialized -fexceptions -O2'
The server might answer to a DHCPREQUEST with a NAK and currently the
client restarts the configuration process immediately. It was
observed that this can easily generate loops in which the network is
flooded with DISCOVER,OFFER,REQUEST,NAK sequences.
RFC 2131 only states that "if the client receives a DHCPNAK message,
the client restarts the configuration process" without further
details.
Add a delay with exponential backoff between retries after NAKs to
limit the number of requests and cap the delay to 30 minutes.
libsystemd-network provides the public function
sd_dhcp_client_set_request_option() to enable the request of a given
DHCP option. However the enum defining such options is defined in the
internal header dhcp-protocol.h. Move the enum definition to the
public header sd-dhcp-client.h and properly namespace values.
GLIB has recently started to officially support the gcc cleanup
attribute in its public API, hence let's do the same for our APIs.
With this patch we'll define an xyz_unrefp() call for each public
xyz_unref() call, to make it easy to use inside a
__attribute__((cleanup())) expression. Then, all code is ported over to
make use of this.
The new calls are also documented in the man pages, with examples how to
use them (well, I only added docs where the _unref() call itself already
had docs, and the examples, only cover sd_bus_unrefp() and
sd_event_unrefp()).
This also renames sd_lldp_free() to sd_lldp_unref(), since that's how we
tend to call our destructors these days.
Note that this defines no public macro that wraps gcc's attribute and
makes it easier to use. While I think it's our duty in the library to
make our stuff easy to use, I figure it's not our duty to make gcc's own
features easy to use on its own. Most likely, client code which wants to
make use of this should define its own:
#define _cleanup_(function) __attribute__((cleanup(function)))
Or similar, to make the gcc feature easier to use.
Making this logic public has the benefit that we can remove three header
files whose only purpose was to define these functions internally.
See #2008.
Let's change the return value to bool. If we encounter an error while
parsing, return "false" instead of the actual parsing error, after all
the specified hostname does not qualify for what the function is
supposed to test.
Dealing with the additional error codes was always cumbersome, and
easily misused, like for example in the DHCP code.
Let's also rename the functions from dns_name_root() to
dns_name_is_root(), to indicate that this function checks something and
returns a bool. Similar for dns_name_is_signal_label().
If a client sends a DECLINE or a server sends a NAK, they can include
a string with a message to explain the error. Parse this and print it
at debug level.
This adds support for the Client Fully Qualified Domain Name (FQDN)
option [RFC 4702] to libsystemd-network. The option can be used to
exchange information about a DHCPv4 client's fully qualified domain
name and about responsibility for updating the DNS RR related to the
client's address assignment.
Other popular DHCP clients (dhclient, dhcpcd) support this option and
it would be useful to have it in networkd too.
There are more than enough calls doing string manipulations to deserve
its own files, hence do something about it.
This patch also sorts the #include blocks of all files that needed to be
updated, according to the sorting suggestions from CODING_STYLE. Since
pretty much every file needs our string manipulation functions this
effectively means that most files have sorted #include blocks now.
Also touches a few unrelated include files.
a) drop handling of obsolete or unused DHCP options time_offset,
mtu_aging_timeout, policy filter, mdr, ttl, ip forwarding settings.
Should this become useful one day we can readd support for this.
b) For subnet mask and broadcast it is not always clear whether 0 or
255.255.255.255 might be valid, hence maintain a boolean indicating
validity next to it.
c) serialize/deserialize broadcast address, lifetime, T1 and T2 together
with the rest of the fields in dhcp_lease_save() and
dhcp_lease_load().
d) consistently return ENODATA from getter functions for data that is
missing in the lease.
e) add missing getter calls for broadcast, lifetime, T1, T2.
f) when decoding DHCP options, generate debug messages on parse
failures, but try to proceed if possible.
g) Similar, when deserializing a lease in dhcp_lease_load(), make sure
we deal nicely with unparsable fields, to provide upgrade compat.
h) fix some memory allocations
In our API design, getter-functions don't ref objects. Calls like
foo_get_bar() will not ref 'bar'. We never do that and there is no real
reason to do it in single threaded APIs. If you need a ref-count, you
better take it yourself *BEFORE* doing anything else on the parent object
(as this might invalidate your pointer).
Right now, sd_dhcp?_get_lease() refs the lease it returns. A lot of
code-paths in systemd do not expect this and thus leak the lease
reference. Fix this by changing the API to not ref returned objects.
This patch removes includes that are not used. The removals were found with
include-what-you-use which checks if any of the symbols from a header is
in use.
In addition to the benefits listed in the RFC, this allows DHCP to work also in
case several interfaces share the same MAC address on the same link (IPVLAN).
Note that this will make the ClientID (so probably the assigned IP address)
change on upgrades. If it is desired to avoid that we would have to remember and
write back the ID (which the library supports, but networkd currently does not).
The client identifier can be in many different formats, not just
the one that systemd creates from the Ethernet MAC address. Non-
ethernet interfaces may have different client IDs formats. Users
may also have custom client IDs that the wish to use to preserve
lease options delivered by servers configured with the existing
client ID.
client->secs wasn't getting set in the REBOOT state, causing
an assertion. REBOOT should work the same way as INIT, per
RFC 2131:
secs 2 Filled in by client, seconds elapsed since client
began address acquisition or renewal process.
REBOOT is necessary because some DHCP servers (eg on
home routers) do not hand back the same IP address unless the
'ciaddr' field is filled with that address, which DISCOVER
cannot do per the RFCs. This leads to multiple leases
on machine reboot or DHCP client restart.
The raw socket sd_event_source used for DHCP server solicitations
was simply dropped on the floor when creating the new UDP socket
after a lease has been acquired. Clean it up properly so we're
not still listening and responding to events on it.
Like Infiniband. See RFC 4390 section 2.1 for details on DHCP
and Infiniband; chaddr is zeroed, hlen is set to 0, and htype
is set to ARPHRD_INFINIBAND because IB hardware addresses
are 20 bytes in length.
The timeouts in the networking library (DHCP lease timeouts and similar) should not be affected
by suspend. In the cases where CLOCK_BOOTTIME is not implemented, it is still safe to fallback to
CLOCK_MONOTONIC, as the consumers of the library (i.e., networkd) _should_ renew the leases when
coming out of suspend.
It appears there is no good way to decide whether or not broadcasts should be enabled,
there is hardware that must have broadcast, and there are networks that only allow
unicast. So we give up and make this configurable.
By default, unicast is used, but if the kernel were to inform us abotu certain
interfaces requiring broadcast, we could change this to opt-in by default in
those cases.
Vendor Class Identifier be used by DHCP clients to identify
their vendor type and configuration. When using this option,
vendors can define their own specific identifier values, such
as to convey a particular hardware or operating system
configuration or other identifying information.
Vendor-specified DHCP options—features that let administrators assign
separate options to clients with similar configuration requirements.
For example, if DHCP-aware clients for example we want to separate
different gateway and option for different set of people
(dev/test/hr/finance) in a org or devices for example web/database
servers or let's say in a embedded device etc and require a different
default gateway or DNS server than the rest of clients.
Check that received DHCP packets actually include our MAC address in
chaddr field. BPF interpreter has 32 bit wide registers but MAC address
is 48 bits long so we have to do check in two steps.
Send hostname (option 12) in DISCOVER and REQUEST messages so the
DHCP server could use it to register with dynamic DNS and such.
To opt-out of this behaviour set SendHostname to false in [DHCP]
section of .network file
[tomegun: rebased, made sure a failing set_hostname is a noop and moved
config from DHCPv4 to DHCP]
Even if we cannot renew the lease at T1, we will likely succeed at T2, so warn and ignore the failure.
This could happen if for whatever reason the received address is not yet configured, or it has
been lost.
Let's keep this behavior consistent across our libraries.
In order to keep the refcounting working, a DONT_DESTROY macro similar
to the one in sd-bus was introduced.
static int client_send_request(...) in
./src/libsystemd-network/sd-dhcp-client.c tries to initialize
"request" by calling client_message_init(...), which has atleast
5 error cases where it can return without that happening.
This leads to the function finishing without "request" being initialized.
On systems which cannot receive unicast packets until its IP stack has been configured
we need to request broadcast packets. We are currently not able to reliably detect when
this is necessary, so set it unconditionally for now.
This is set on all packets, but the DHCP server will only broadcast the packets that are
necessary, and unicast the rest.
For more information please refer to this thread in CoreOS: https://github.com/coreos/bugs/issues/12
[tomegun: rephrased commit message]
Store a pointer to the options in the DHCPMessage struct, and pass
this together with an offset around, rather than a uint8_t**.
This avoids us having to (re)compute the pointer; and changes
dhcp_option_append from adjusting both the pointer to the next
option and the remaining size of the options, to just adjusting
the current offset.
This makes the code a bit simpler to follow IMHO, but there should
be no functional change.
close() is a blocking call, which may slow things down measurably when running many dhcp
clients in the same single-threaded main loop. Let's just use the asynchronous version
instead to avoid the problem.
- Also only allow positive ifindex on both dhcp and ipv4ll
[tomegun: the kernel always sets a positive ifindex, but some APIs accept
ifindex=0 with various meanings, so we should protect against
accidentally passing ifindex=0 along.]
Also reshuffle some code to make the correspondence with the RFC a bit more
obvious.
Small functional change: fail if we try to send a message from the wrong state.
Add an explicit stop state for the DHCP client so that the library
user can issue a stop at any time the callback has been called.
When returning from the callback, check also the stop state and
stop any further DHCP processing.
The DHCP library user can decide to free the DHCP client any time
the callback is called. After the callback has been called, other
computations may still be needed - the best example being a full
restart of the DHCP procedure in case of lease expiry.
Fix this by introducing proper reference counting. Properly handle
a returned NULL from the notify and stop functions if the DHCP
client was freed.
Try a bit harder to make the kernel drop packets not for us. This should reduce
the number of wakeups from n^2 to n in the number of dhcp clients, which admittedly
only makes a differenc in very extreme cases.
If they are too small to fit the IP+UDP+DHCP headers they can be of no use, so
don't waste resources parsing them. This is at the cost of losing some verbosity
in the logging.
Also move the checking of it to the main message handler, rather than the
options parser.
Fix a bug, so we now drop the packet if any of the magic bytes don't match.
Before we used to only drop the packet if they were all wrong.
If necessary, restart the clients to deal with a changing mac address
at runtime. This will solve the problem of starting clients on bridges
before they have received their final MAC address.
The DHCP RFC does not require the DHCP server to send a subnet mask, so if it
is missing, let's try to use the default subnet masks based on address class.
In case the class the address belongs to does not have a default subnet mask,
we fail as before.
Also improve logging when handling invalid dhcp messages, and simply ignore them
rather than stop the whole dhcp client.
Accept any lease lifetime greater than one second. Server should not
hand out extremely short leases, but let's not be the ones to fail.
Do not fail when arming a timer in the past, but also only arm one such
timer.
Avoid rounding errors when computing the default timeouts, this may be
an issue if we are handed a very short lease.
Also, don't pass 'time_now' around, as that can be found in the event
object when needed.
Even though client identifiers SHOULD be treated as opaque objects by
DHCP servers, follow the recommendation of a hardware type field with
value 0x01 (ethernet) followed by the hardware address as described in
RFC 2132.
Init-Reboot is tried if a client IP address has been given when
the DHCP client is started. In Init-Reboot, start by sending a
broadcast DHCP Request including the supplied client IP address
but without the server identifier. After sending the request,
enter Reboot state.
If a DHCP Ack is received, proceed to Bound state as usual. If a
DHCP Nak is received or the first timeout triggers, start the
address acquisition over from DHCP Init state.
See RFC 2131, sections 4.3.2, 4.4, 4.4.1 and 4.4.2 for details.
safe_close() automatically becomes a NOP when a negative fd is passed,
and returns -1 unconditionally. This makes it easy to write lines like
this:
fd = safe_close(fd);
Which will close an fd if it is open, and reset the fd variable
correctly.
By making use of this new scheme we can drop a > 200 lines of code that
was required to test for non-negative fds or to reset the closed fd
variable afterwards.
The default slack caused there to be a delay before timers fired. Solve it
by setting timers that should trigger immediately to trigger far in the past.
This brings down the ideal-case dhcp lease acquisition time from about 500ms to
about 50ms (over a veth pair, so no network latency involved).
All the rest of the time (except for ~0.5ms) is spent in the bind() call in,
dhcp_network_bind_raw_socket(). I don't know if there is anything to be done
about that though...