Fixes#5659.
Currently, if Persistent=true and the machine is off at the scheduled time of the timer unit, the timer
will be triggered immediately at the next boot even if RandomizedDelaySec= is specified.
As a result, if multiple timers meet that condition, they will be triggered at the same time and too
much CPU/IO work makes boot slow down.
With this commit, if the scheduled time of the persistent timer has already elapsed at boot,
set the time when systemd first started as the scheduled time and RandomizedDelaySec= is applied to it.
We were already using OrderedSets in the manager object, but strvs in the
configuration parsing code. Using sets gives us better scaling when many
domains are used.
In oss-fuzz #13059 the attached reproducer takes approximately 30.5 s to be
parsed. Converting to sets makes this go down to 10s. This is not _vastly_
faster, but using sets seems like a nicer approach anyway. In particular, we
avoid the quadratic de-unification operation after each addition.
After debugging the issue with gdb, I found that the following change
94ddb08 "cgtop: Still try to get CPU statistics if controller-free"
has introduced a bug, which prevents process(..) method processing
memory and io controllers when cpu_accounting_is_cheap() is true.
The obvious fix is to move this branch to be the last one, keeping
the intended behavior of the above change, without having a negative
effect on the other controllers.
Fixes#11773 [systemd-cgtop no longer shows memory (and io) usage]
The second name was used in documentation, and the first in the code that
generated the unit. 'systemd-makefs' is the name we want, for example for
consistency with the systemd-makefs executable.
In principle this breaks compatibility, but in practice this is unlikely to be
noticeable. Each instance of the unit is created by writing out a full
definition, so the template was never defined. So the name could only be used
for ordering, and there is not reason to order things against this unit from
the outside: the ordering would rather be against the final mount unit.
Fixes#11769.
The current implementation copied the *complete* header to boot_params,
thus making the kernel ignore many of the fields.
As mentioned in the code comment for the sentinel variable in
bootparam.h a bootloader should only copy the setup_header, set some
fields in boot_params and zero out anything else.
This change makes systemd-boot (mostly) compliant with the Linux Boot
Protocol and the EFI Handover Protocol described in bootparam.h and
Documentation/boot.txt to fix various issues:
- Secure boot not being detected corretly by Linux (#11717)
- tboot error message / warning on boot (#11717)
- Strange purple text color when booting in qemu with OVMF
- Hopefully even more ...
This reverts commit b4f9f2a62f.
Revert this because a) the quiet bug is fixed in linux and b)
Documentation/boot.txt says "All other fields should be zero."
adds a fully safe way how apps can pin files into /tmp temporarily, excepting them from the tmpfiles aging algorithm, based on BSD file locks on dirs we descend into
Let services use a private UTS namespace. In addition, a seccomp filter is
installed on set{host,domain}name and a ro bind mounts on
/proc/sys/kernel/{host,domain}name.
Since commit 0722b35934, the root mountpoint is
unconditionnally turned to slave which breaks units that are using explicitly
MountFlags=shared (and no other options that would implicitly require a slave
root mountpoint).
Here is a test case:
$ systemctl cat test-shared-mount-flag.service
# /etc/systemd/system/test-shared-mount-flag.service
[Service]
Type=simple
ExecStartPre=/usr/bin/mkdir -p /mnt/tmp
ExecStart=/bin/sh -c "/usr/bin/mount -t tmpfs -o size=10M none /mnt/tmp && sleep infinity"
ExecStop=-/bin/sh -c "/usr/bin/umount /mnt/tmp"
MountFlags=shared
$ systemctl start test-shared-mount-flag.service
$ findmnt /mnt/tmp
$
Mount on /mnt/tmp is not visible from the host although MountFlags=shared was
used.
This patch fixes that and turns the root mountpoint to slave when it's really
required.
Some settings cannot set simultaneously. Let's warn and drop
incompatible settings.
Currently, it is not comprehensive. But this may be a good first step.
When the link goes down, DHCP client_receive_message*() functions return an
error and the related I/O source is removed from the main loop. With the
current implementation of systemd-networkd this doesn't matter because the DHCP
client is always stopped on carrier down and restarted on carrier up. However
it seems wrong to have the DHCP client crippled (because no packet can be
received anymore) once the link goes temporarily down.
Change the receive functions to ignore a ENETDOWN event so that the client will
be able to receive packets again after the link comes back.
Paths are limited to BUS_PATH_SIZE_MAX but the maximum size is anyway too big
to be allocated on the stack, so let's switch to the heap where there is a
clear way to understand if the allocation fails.
Even though the dbus specification does not enforce any length limit on the
path of a dbus message, having to analyze too long strings in PID1 may be
time-consuming and it may have security impacts.
In any case, the limit is set so high that real-life applications should not
have a problem with it.
* Current logic:
For each NSEC RR find the common suffix between the owner name and
the next name, append asterisk to that suffix and check that
generated wildcard is covered by the NSEC RR in question.
* New logic:
Find NSEC RR covering queried name, generate wildcard as
<asterisk>.<closest encloser> using this RR, then check if any
of the NSEC RRs covers generated wildcard.
This adds a new bitfield to `execute_directories()` which allows to
configure whether to ignore non-zero exit statuses of binaries run and
whether to allow parallel execution of commands.
In case errors are not ignored, the exit status of the failed script
will now be returned for error reposrting purposes or other further
future use.
FRA_TUN_ID is a 64 big endian integer. Fix the policy.
FRA_TUN_ID is unused by networkd, hence I think this bug
has no actual consequences.
Fixes: bce67bbee3
inet_ntop() is not documented to be thread-safe, so it should not
be used in the DHCP library. Arguably, glibc uses a thread local
buffer, so indeed there is no problem with a suitable libc. Anyway,
just avoid it.
The DHCP client should not pre-filter addresses beyond what RFC
requires. If a client's user (like networkd) wishes to skip/filter
certain addresses, it's their responsibility.
The point of this is that the DHCP library does not hide/abstract
information that might be relevant for certain users. For example,
NetworkManager exposes DHCP options in its API. When doing that, the
options should be close to the actual lease.
This is related to commit d9ec2e632d
(dhcp4: filter bogus DNS/NTP server addresses silently).
The Router DHCP option may contain a list of one or more
routers ([1]). Extend the API of sd_dhcp_lease to return a
list instead of only the first.
Note that networkd still only uses the first router (if present).
Aside from extending the internal API of the DHCP client, there
is almost no change in behavior. The only visible difference in
behavior is that the "ROUTER" variable in the lease file is now a
list of addresses.
Note how RFC 2132 does not define certain IP addresses as invalid for the
router option. Still, previously sd_dhcp_lease_get_router() would never
return a "0.0.0.0" address. In fact, the previous API could not
differenciate whether no router option was present, whether it
was invalid, or whether its first router was "0.0.0.0". No longer let
the DHCP client library impose additional restrictions that are not
part of RFC. Instead, the caller should handle this. The patch does
that, and networkd only consideres the first router entry if it is not
"0.0.0.0".
[1] https://tools.ietf.org/html/rfc2132#section-3.5
deserialize_in_addrs() allocates the buffer before trying to parse
the IP address. Since a parsing error is silently ignored, the returned
size might be zero. In such a case we shouldn't return any buffer.
Anyway, there was no leak, because there are only two callers like
r = deserialize_in_addrs(&lease->dns, dns);
which both keep the unused buffer and later release it.
Note that deserialize_in_addrs() doesn't free the pointer before
reassigning the new output. The caller must take care to to pass
"ret" with an allocated buffer that would be leaked when returning
the result.
- RTA_OIF has no business in the routing-rule policy. It is numerical
identical to FRA_GOTO. Fix using the correct enum value. Note that
RTA_OIF/FRA_GOTO was not used by networkd, and the type was already
correct at uint32. So, there is no change in behavior.
- RTA_GATEWAY also does not belong to the routing-rules. It is numerical
identical to FRA_UNUSED2. Obviously, that value is unused as well,
so there is no actual change in behavior either. In particular
that is because:
- kernel would not send messages with FRA_UNUSED2 attribute.
- networkd would not try to parse/send RTA_GATEWAY/FRA_UNUSED2
attributes.
Fixes: bce67bbee3
Initially, the check was that /usr is not a separate fs, and was later relaxed
to allow /usr to be mounted in the initramfs. Documentation was updated in 9e93f6f092,
but this comment wasn't. Let's update it too.
Fixes#10256.
What works:
systemd-analyze cat-config systemd/system-preset
systemd-analyze cat-config systemd/user-preset
systemd-analyze cat-config tmpfiles.d
systemd-analyze cat-config sysusers.d
systemd-analyze cat-config systemd/sleep.conf
systemd-analyze cat-config systemd/user.conf
systemd-analyze cat-config systemd/system.conf
systemd-analyze cat-config udev/udev.conf
(and other .conf files)
systemd-analyze cat-config udev/rules.d
systemd-analyze cat-config environment.d
systemd-analyze cat-config environment
Directories may be specified with the trailing dash or not.
The caveat is that for user configuration, systemd and other tools also look
at ~/.config/. It would be nice to support this, but this patch doesn't.
"cat-config --user" is rejected, and we may allow it in the future and then
extend the search path with directories under ~/.config.
What doesn't work (and probably shouldn't because those files cannot be
meaningfully concatenated):
systemd-analyze cat-config systemd/system (.service, .slice, .socket, ...)
systemd-analyze cat-config systemd/user
systemd-analyze cat-config systemd/network (.network, .link, and .dnssd)
The hardcoding of information about paths in this manner is a bit ugly, but
OTOH, it is not too onerous, and at least we have one place where all the
schemes are "documented" through code. It'll make us think twice before adding
yet another slightly different scheme.
This adds /usr/local/lib/udev/rules.d to the search path on non-split-usr systems.
On split-usr systems, the paths with /usr/-prefixes are added too.
In the past, on split-usr systems, it made sense to only load rules from
/lib/udev/rules.d, because /usr could be mounted late. But we don't support running
without /usr since 80758717a6, so in practice it doesn't matter whether the
rules files are in /lib/udev/rules.d or /usr/lib/udev/rules.d. Distributions
that maintain the illusion of functional split-usr are welcome to simply not put any
files in /usr/lib/udev/rules.d/.
In practice this doesn't change much, but it makes udev more consistent with the
rest of the systemd suite.
This makes the code match documentation (opposite compared to previous
commits). This is user configuration, hence it should not depend on stuff
in /lib/ even when split-usr is used.
/usr/local/lib/systemd/dnssd is now also included in the search path. This
path is of limited usefulness, but it makes sense to be consistent.
Documentation is updated to match. Outdated advice against drop-ins in /usr
is removed.
CID #139583: plen + 1 is evaluated as int, and could in principle overflow.
So cast to ssize_t and add an additional check that our overflow calculation
doesn't overflow itself.
Follow-up for a3ce813697 and
5ce41697bd.
Before a3ce813697, all properties in
src->properties and src->properties_db are mixed and copied to
dst->properties_db by device_copy_properties().
So, it is not necessary to store data from udev database file to
sd_device::properties_db before copying properties.
But now, properties are not mixed. So, the read data need to be
stored to also ::properties_db.
Fixes#11721.
This also set lower log level for the messages.
6e2efb6c73 introduces the log messages.
But udevd may be started with --resolve-names=never, and the behavior
is expected.
Fixes#11720.
We potentially might descent into quite deep directory trees. Let's
hence make sure we can allocate a lot of fds.
(This reflects the fact that glibc nftw() and friends have some logic in
place to reduce fd usage while descending into directory trees. Doing so
is a bit nasty I think, and given that fds are basically free now, if we
ask for them, lte's just protect ourselves and make use of that)
(No, I am not aware of a real-world case where this was necessary, but
let's better be safe than sorry)
Let's add a fully safe way to exclude certain directories from aging, by
taking a BSD file lock on them before aging them. This is useful for
clients that untar tarballs into /tmp or /var/tmp, which might have
really old timestamps, and to which the aging logic would be very harsh:
they can simply take a BSD file lock on any directory they like and thus
exclude it from automatic aging, and thus need not to be afraid of
untarring stuff below it.
Previously, similar functionality was already available through the
sticky bit on non-directories, but it's problematic, since as soon as
the bit is set no clean-up is done for it at all anymore, forever. Also,
it is not suitable for untarring stuff, since the sticky bit after all
is a concept denoted in the tarball itself. BSD file locking semantics
are much much nicer there, as they are automatically released when the
application that has them dies, and they are entirely orthogonal to data
encoded in tarballs.
This patch takes BSD file locks only on *directories* while descending
down the tree, not on regular files. Moreover, it will do so in
non-blocking mode only, i.e. if anyone else has a lock the aging for a
dir and everything below it is immediately skipped for the current
clean-up iteration.
Of course applications might take BSD file locks for other reasons than
just prevent aging (i.e for their own reasons), but that should be
entirely OK, as in that case tmpfiles should step away from those files
anyway too: it's a good idea to stay away from any such locked file
anyway since it's apparently curretnly being manipulated.
This allows us to fix bugs like this:
https://github.com/systemd/mkosi/issues/252