Previous commits changed the dhcpv4 retransmission algorithm to be
slightly slower, changing the amount of time it takes to notify
systemd-networkd that the dhcpv4 configuration has (transiently)
failed from around 14 seconds to around 28 seconds.
Since the test_dhcp_client_with_ipv4ll_without_dhcp_server test
configures an interface to use dhcpv4 without any operating dhcpv4
server running, it must increase the amount of time it waits for
the test interface to reach degraded state.
This changes the retransmission timeout algorithm for requests
other than RENEW and REBIND. Previously, the retransmission timeout
started at 2 seconds and doubled on each retransmission, up to a max
of 64 seconds. This is changed to match what RFC2131 section 4.1 describes,
which skips the initial 2 second timeout and starts with a 4 second timeout
instead. Note that -1 to +1 seconds of random 'fuzz' is added to each
timeout, in both the previous and the current behavior.
This change is therefore slightly slower than the previous behavior in
attempting retransmissions when no server response is received, since the
first transmission times out in 4 seconds instead of 2.
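
As a rough illustration of the schedule described above (a sketch only, not
the code in sd-dhcp-client; the function name and parameters are made up for
this example), the timeouts can be modelled like this:

```python
import random

def retransmission_timeouts(attempts, first=4, cap=64, fuzz=True):
    """Return the wait in seconds before each retransmission.

    Assumptions for this sketch: the base delay starts at `first`,
    doubles after every attempt and is capped at `cap`; a uniform
    fuzz in [-1, +1] seconds is added to each delay.
    """
    timeouts = []
    base = first
    for _ in range(attempts):
        jitter = random.uniform(-1.0, 1.0) if fuzz else 0.0
        timeouts.append(base + jitter)
        base = min(base * 2, cap)
    return timeouts

# Fuzz disabled so that the two schedules are easy to compare.
new = retransmission_timeouts(6, first=4, fuzz=False)
old = retransmission_timeouts(6, first=2, fuzz=False)
print(new)  # [4.0, 8.0, 16.0, 32.0, 64.0, 64.0]
print(old)  # [2.0, 4.0, 8.0, 16.0, 32.0, 64.0]
```

With first=2 the same helper reproduces the previous schedule, which makes the
one-step shift between the two behaviors visible.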
Since TRANSIENT_FAILURE_ATTEMPTS is set to 3, the previous length of time
before a transient failure was reported back to systemd-networkd was
2 + 4 + 8 = 14 seconds, plus or minus up to 3 seconds of random 'fuzz', for
a transient failure timeout between 11 and 17 seconds. Now, since the
first timeout starts at 4, the transient failure will be reported at
4 + 8 + 16 = 28 seconds, again plus or minus up to 3 seconds of 'fuzz', for
a transient failure timeout between 25 and 31 seconds.
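
The arithmetic can be checked with a small helper. This is only a sketch; the
function name is hypothetical, and it assumes 3 attempts with up to 1 second
of fuzz in either direction per timeout, as described above:

```python
def transient_failure_window(first_timeout, attempts=3, fuzz=1):
    """Return (earliest, nominal, latest) seconds until the failure is reported."""
    # Sum of the doubling timeouts, e.g. 4 + 8 + 16 for first_timeout=4.
    nominal = sum(first_timeout * 2**i for i in range(attempts))
    # Each timeout may be shortened or lengthened by up to `fuzz` seconds.
    return nominal - attempts * fuzz, nominal, nominal + attempts * fuzz

print(transient_failure_window(2))  # (11, 14, 17)
print(transient_failure_window(4))  # (25, 28, 31)
```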
Additionally, if MaxAttempts= is set, it will take slightly longer to
reach than with previous behavior.
Use the request timeout algorithm specified in RFC2131 section 4.4.5 for
handling timed out RENEW and REBIND requests.
This changes behavior, as previously only 2 RENEW and 2 REBIND requests
were sent, no matter how long the lease lifetime. Now, requests are
sent according to the RFC, which results in starting with a timeout
of 1/2 the t1 or t2 period, and halving the timeout for each retry
down to a minimum of 60 seconds.
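
A minimal sketch of that schedule follows. It is not the actual
implementation: the function name is made up, the random fuzz is left out,
and `deadline` stands for the end of the t1 or t2 period.

```python
def renew_rebind_schedule(now, deadline, minimum=60):
    """Return the times (in seconds) at which a request would be retransmitted.

    Per RFC 2131 section 4.4.5, the client waits one-half of the time
    remaining until the deadline, but never less than `minimum` seconds.
    """
    times = []
    t = now
    while t < deadline:
        wait = max((deadline - t) / 2, minimum)
        t += wait
        if t < deadline:
            times.append(t)
    return times

# A period of 960 seconds yields four retransmissions.
print(renew_rebind_schedule(0, 960))  # [480.0, 720.0, 840.0, 900.0]
```

The number of requests now grows with the length of the period, instead of
being fixed at 2.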
Fixes: #17909
The parsing of the dhcpv4 lease lifetime, as well as the t1/t2
times, is simplified by this commit.
This differs from previous behavior; previously, the lease lifetime and
t1/t2 values were modified by random 'fuzz' by subtracting 3, then adding
a random number between 0 and (slightly over) 2 seconds. The resulting
values were therefore always between 1 and 3 seconds shorter than the value
provided by the server (or the default, in case of t1/t2). Now, as
described in RFC2131, the random 'fuzz' is between -1 and +1 seconds,
meaning the actual t1 and t2 value will be up to 1 second earlier or
later than the server-provided (or default) t1/t2 value.
This also differs in handling the lease lifetime. As described above, it
previously was adjusted by the random 'fuzz', but the RFC does not state
that the lease expiration time should be adjusted, so now the code uses
exactly the lease lifetime as provided by the server with no adjustment.
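
To make the difference concrete, here is a simplified model of the two
behaviors. The function names are hypothetical, and the old upper bound of
"slightly over" 2 seconds is rounded to exactly 2 for this sketch:

```python
import random

def old_fuzzed(value):
    """Previous behavior: subtract 3, then add a random 0..2 seconds."""
    return value - 3 + random.uniform(0.0, 2.0)

def new_fuzzed(value):
    """Current behavior for t1/t2: add a random -1..+1 seconds."""
    return value + random.uniform(-1.0, 1.0)

def lease_expiry(lifetime):
    """Current behavior for the lease lifetime: used exactly as provided."""
    return lifetime

t1 = 1800  # example server-provided t1, in seconds
print(t1 - 3 <= old_fuzzed(t1) <= t1 - 1)  # True: always 1 to 3 seconds early
print(t1 - 1 <= new_fuzzed(t1) <= t1 + 1)  # True: up to 1 second either way
```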
RFC2131, providing the details for dhcpv4, has specific retransmission
intervals that it outlines. This adds functions to compute the timeouts
as the RFC describes.
Commit 6f3ac0d517 dropped the prefix and suffix in the TAGS= property.
But several existing rules use matches like `TAGS=="*:tag:*"`.
So, the property must always be prefixed and suffixed with ":".
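
The effect on such rules can be illustrated with Python's fnmatch as a
stand-in for udev's glob matching (an approximation; the helper name and the
example tags are made up):

```python
from fnmatch import fnmatchcase

def tags_property(tags):
    """Join tags with ':' and add a leading and trailing ':'."""
    return ":" + ":".join(tags) + ":"

tags = ["systemd", "seat"]
with_colons = tags_property(tags)   # ":systemd:seat:"
without_colons = ":".join(tags)     # "systemd:seat"

# With the prefix and suffix, every tag is surrounded by colons and matches.
print(fnmatchcase(with_colons, "*:seat:*"))     # True
# Without them, the last tag has no trailing colon and the pattern fails.
print(fnmatchcase(without_colons, "*:seat:*"))  # False
```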
Fixes #17930.
The ret_size result is a bit of an awkward optimization that in a
sense enables bypassing the mmap-cache API, while encouraging
duplication of logic it already implements.
It's only utilized in one place; journal_file_move_to_object(),
apparently to avoid the overhead of remapping the whole object
again once its header, and thus its actual size, is known.
With mmap-cache's context cache, the overhead of simply
re-getting the object with the now known size should already be
negligible. So it's not clear what benefit this brings, unless
avoiding some function calls that do very little in the hot
context-cache hit case is of such a priority.
There's value in having all object-sized gets pass through
mmap_cache_get(), as it provides a single entrypoint for
instrumentation in profiling/statistics gathering. When
journal_file_move_to_object() bypasses getting the full object
size, you don't capture the full picture on the mmap-cache side
in terms of object sizes explicitly loaded from a journal file.
I'd like to see additional accounting in mmap_cache_get() in a
future commit, taking advantage of this change.
Quoting Andy Lutomirski:
> The upcoming Linux SGX driver has a device node /dev/sgx. User code opens
> it, does various setup things, mmaps it, and needs to be able to create
> PROT_EXEC mappings. This gets quite awkward if /dev is mounted noexec.
We already didn't use noexec in spawn, and this extends this behaviour to other
systems.
Afaik, the kernel would refuse execve() on a character or block device
anyway. Thus noexec on /dev matters only for actual binaries copied to /dev,
which requires root privileges in the first place.
We don't do noexec on either /tmp or /dev/shm (because that causes immediate
problems with stuff like Java and cffi). And if you have those two at your
disposal anyway, having noexec on /dev doesn't seem important. So the 'noexec'
attribute on /dev doesn't really mean much, since there are multiple other
similar directories which don't require root privileges to write to.
Cf. 33c10ef43b.
When an interface gains carrier but udev has not yet initialized the
interface, or link_initialized_handler() has not been called yet,
link_configure will be called twice. The LLDP client is then
configured twice, which triggers an assertion.
Fixes #17929.
When the .so module is loaded, it gets a separate copy of stuff in src/basic,
including the log level variables. So any logging settings are unaffected by
the loading program calling log_parse_environment() or such. Let's also parse
the environment here so that we can have nice logging.
Initialization is done from each exported function, and pthread_once_t is used
to avoid duplicate initialization. I didn't merge PROTECT_ERRNO into
NSS_ENTRYPOINT_BEGIN because UNPROTECT_ERRNO is called in a bunch of places
and it would feel strange to have PROTECT_ERRNO hidden, but not UNPROTECT_ERRNO.
The most interesting stuff in this module is the varlink messages, and any
potential errors in json. So let's enable json logging when debug messages are
enabled.
With those changes, figuring out the issue in
https://github.com/systemd/systemd/pull/17823 is trivial:
$ LD_LIBRARY_PATH=build/ SYSTEMD_LOG_COLOR=1 SYSTEMD_LOG_LOCATION=1 SYSTEMD_LOG_LEVEL=debug getent hosts mirrors.fedoraproject.org
src/shared/varlink.c:237: n/a: varlink: setting state idle-client
src/shared/varlink.c:1240: n/a: Sending message: {"method":"io.systemd.Resolve.ResolveHostname","parameters":{"name":"mirrors.fedoraproject.org","family":10}}
src/shared/varlink.c:240: n/a: varlink: changing state idle-client → calling
src/shared/varlink.c:588: n/a: New incoming message: {"parameters":{"addresses":[{"ifindex":0,"family":10,"address":[42,5,208,20,0,16,120,3,247,116,77,124,226,119,164,87]},{"ifindex":0,"family":10,"address":[42,5,208,28,12,106,204,3,38,58,132,9,185,97,126,2]},{"ifindex":0,"family":10,"address":[38,32,0,82,0,3,0,1,222,173,190,239,202,254,254,215]},{"ifindex":0,"family":10,"address":[38,5,188,128,48,16,6,0,222,173,190,239,202,254,254,217]},{"ifindex":0,"family":10,"address":[38,4,21,128,254,0,0,0,222,173,190,239,202,254,254,209]},{"ifindex":0,"family":10,"address":[38,32,0,82,0,3,0,1,222,173,190,239,202,254,254,214]},{"ifindex":0,"family":10,"address":[38,16,0,40,48,144,48,1,222,173,190,239,202,254,254,211]},{"ifindex":0,"family":10,"address":[32,1,65,120,0,2,18,105,0,0,0,0,0,0,254,210]}],"name":"wildcard.fedoraproject.org","flags":1}}
src/shared/varlink.c:240: n/a: varlink: changing state calling → called
src/shared/varlink.c:240: n/a: varlink: changing state called → idle-client
src/nss-resolve/nss-resolve.c:84: (string):1:40: JSON field 'ifindex' is out of bounds for an interface index.
Normally, the udev rules operate on "change" events. But when
coldplugging, there's an "add" event present. The udev rules have to
recognize this and do some actions in this particular situation, too.
Also, we don't want the nodes to be created prematurely on "add"
events while not coldplugging. The udev rules will check
DM_UDEV_PRIMARY_SOURCE_FLAG to see if the device was activated
correctly before and, if not, they ignore the "add" event entirely.
This way the udev rules can support udev triggers generating "add"
events (e.g. "udevadm trigger --action=add" or
"echo add > /sys/block/<dm_device>/uevent").
In this case, the udevd service is started after
systemd-cryptsetup@config.service is started, which causes the udevd
service to miss the "change" uevent with the DM_UDEV_PRIMARY_SOURCE_FLAG
flag generated by systemd-cryptsetup@config.service. To solve this
issue, we let the cryptsetup service be started after the udevd
service.
Back in 5248e7e1f1 (July 2017) we moved over to
"_gateway", with the old name declared to be a temporary measure. Since we're
doing a bunch of changes to resolved now, it seems to be a good moment to make
this simplification and not add support for the compat name in new code.
When seccomp_restrict_archs is called, architectures that are blocked
are replaced by the SECCOMP_LOCAL_ARCH_BLOCKED marker so that they are
not disabled again and filters are not installed for them.
This can make services that use SystemCallArchitectures= and
SystemCallFilter= start faster.
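
The idea can be sketched as follows. This is a simplified model rather than
the C code: the marker value, the helper names and the architecture list are
all hypothetical.

```python
# Hypothetical stand-in for the marker value used in the C code.
SECCOMP_LOCAL_ARCH_BLOCKED = "blocked"

def restrict_archs(local_archs, allowed):
    """Replace every architecture that is not allowed with the blocked marker."""
    return [arch if arch in allowed else SECCOMP_LOCAL_ARCH_BLOCKED
            for arch in local_archs]

def archs_needing_filters(local_archs):
    """Later passes skip marked entries, so no filter is built for them."""
    return [arch for arch in local_archs if arch != SECCOMP_LOCAL_ARCH_BLOCKED]

local = ["x86-64", "x86", "x32"]
local = restrict_archs(local, allowed={"x86-64"})
print(local)                         # ['x86-64', 'blocked', 'blocked']
print(archs_needing_filters(local))  # ['x86-64']
```

Fewer architectures to iterate over means fewer filters to build and load,
which is where the faster start would come from.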
There are no mmap_cache_get() users that actually deviate prot
from the JournalFile's f->prot.
So there's no point in making this a separate parameter to
mmap_cache_get(), nor is there any need to store it in
JournalFile's f->prot.
Instead just pass it to mmap_cache_add_fd() at MMapFileDescriptor
creation, storing it in there for the mmap() callers, which
already receive MMapFileDescriptor *.
For functions receiving both an MMapFileDescriptor * and prot,
the prot argument has been simply removed and call sites updated.
Formalizing this fd:prot binding at the public API also enables
discarding the prot check in window_matches(), which is a hot
function on long window lists, so a minor CPU efficiency gain
should be had there as seen with the past removal of the fd
check. Unnoticeable for uncached journals, but maybe a little
runtime improvement when cached in specific circumstances.
window_matches_fd() has also been simplified to treat the
MMapFileDescriptor * as equivalent to its fd and prot.