This adds -Dnss-resolve= and -Dnss-mymachines= meson options.
By using this option, e.g., resolved can be built without nss-resolve.
When no nss modules are built, then test-nss is neither built.
Also, This changes the option name -Dmyhostname= to -Dnss-myhostname=
for consistency to other nss related options.
Closes#9596.
Usecase is to allow changing the final kill from SIGKILL to SIGQUIT which
should create a core dump useful for debugging why the service didn't stop
with the SIGTERM
We often open the parent directory of a path. Let's add a common helper
for that, that shortens our code a bit and adds some extra safety
checks, for example it will fail if used on the root directory (which
doesn't really have a parent).
The helper is actually generalized from a function in btrfs-util.[ch]
which already existed for this purpose.
the service manager serializes ExecStop= execution data after
ExecStart=, like it makes sense and how it should be expected. However,
systemctl previously would reverse them when deserializing them locally,
and thus show ExecStop= results before ExecStart= results. And that's
confusing. Let's fix that.
Whenever a unit is started fresh we should flush out any runtime data
from the previous cycle. We are pretty good at that already, but what so
far we missed was the ExecStart=/ExecStop=/… command exit status data.
Let's fix that, and properly flush out that stuff too.
Consider this service:
[Service]
ExecStart=/bin/sleep infinity
ExecStop=/bin/false
When this service is started, then stopped and then started again
"systemctl status" would show the ExecStop= results of the previous run
along with the ExecStart= results of the current one, which is very
confusing. With this patch this is corrected: the data is kept right
until the moment the new service cycle starts, and then flushed out.
Hence "systemctl status" in that case will only show the ExecStart=
data, but no ExecStop= data, like it should be.
This should fix part of the confusion of #9588
We always initialize it from the same field in ExecCommand anyway, hence
there's no point in passing it separately to exec_spawn(), after all we
already pass the ExecCommand structure itself anyway.
No change in behaviour.
That call to mount was added as a safeguard against a kernel bug which was fixed in
torvalds/linux@bbd5192.
In principle, the error could be ignored because
* normally everything mounted on /proc/PID should disappear as soon as the PID has gone away
* test-mount-util that had been confused by those phantom entries in /proc/self/mountinfo was
taught to ignore them in 112cc3b.
On the other hand, in practice, if the mount fails, then the next one is extremely unlikely to
succeed, so it seems to be reasonable to just skip the rest of `test_get_process_cmdline_harder`
if that happens.
Closes https://github.com/systemd/systemd/issues/9649.
Currently to set the flag to reboot into the firmware setup an
authentication by an administrative user is required. Since we are
already enabling active users to reboot the system, it is advisable to
let the user decide if he wants to boot into the firmware setup without
any more hassle.
Currently, mount_sysfs() only creates /sys/fs/cgroup if cg_ns_supported().
The comment explains that we need to "Create mountpoint for
cgroups. Otherwise we are not allowed since we remount /sys read-only.";
that is: that we need to do it now, rather than later. However, the
comment doesn't do anything to explain why we only need to do this if
cg_ns_supported(); shouldn't we _always_ need to do it?
The answer is that if !use_cgns, then this was already done by the outer
child, so mount_sysfs() only needs to do it if use_cgns. Now,
mount_sysfs() doesn't know whether use_cgns, but !cg_ns_supported() implies
!use_cgns, so we can optimize" the case where we _know_ !use_cgns, and deal
with a no-op mkdir_p() in the false-positive where cgns_supported() but
!use_cgns.
But is it really much of an optimization? We're potentially spending an
access(2) (cg_ns_supported() could be cached from a previous call) to
potentially save an lstat(2) and mkdir(2); and all of them are on virtual
fileystems, so they should all be pretty cheap.
So, simplify and drop the conditional. It's a dubious optimization that
requires more text to explain than it's worth.
Remove "arbitrary named hierarchies" from the list of things that
cg_kernel_controllers() might return, and clarify that "name="
pseudo-controllers are not included in the returned list.
/proc/cgroups does not contain "name=" pseudo-controllers, and
cg_kernel_controllers() makes no effort to enumerate them via a different
mechanism.
One of the things that tmpfs_patch_options does is take an (optional) UID,
and insert "uid=${UID},gid=${UID}" into the options string. So we need a
uid_t argument, and a way of telling if we should use it. Fortunately,
that is built in to the uid_t type by having UID_INVALID as a possible
value.
So this is really a feature that requires one argument. Yet, it is somehow
taking 4! That is absurd. Simplify it to only take one argument, and have
that trickle all the way up to mount_all()'s usage.
Now, in may of the uses, the argument becomes
uid_shift == 0 ? UID_INVALID : uid_shift
because it used to treat uid_shift=0 as invalid unless the patch_ids flag
was also set. This keeps the behavior the same. Note that in all cases
where it is invoked, if !use_userns (sometimes called !userns), then
uid_shift is 0; we don't have to add any checks for that.
That said, I'm pretty sure that "uid=0" and not setting "uid=" are the
same, but Christian Brauner seemed to not think so when implementing the
cgns support. https://github.com/systemd/systemd/pull/3589
One of the things that mkdir_userns{,_p}() does is take an (optional) UID,
and chown the directory to that. So we need a uid_t argument, and a way of
telling if we should use that uid_t argument. Fortunately, that is built
in to the uid_t type by having UID_INVALID as a possible value.
However, currently mkdir_userns() also takes a MountSettingsMask and checks
a couple of bits in it to decide if it should perform the chown.
Drop the mask argument, and instead have the caller pass UID_INVALID if it
shouldn't chown.
When we open our own little namespace for running our tests in, let's
turn off mount propagation only one way, rather than both ways. This is
better as this means we don't pin host mounts unnecessarily long in our
namespace, even though the host already got rid of them. This is because
MS_SLAVE in contrast to MS_PRIVATE allows umount events to propagate
from the host into our environment.
Looking at a recent Bad Day, my log contains over 100 lines of
systemd[23895]: Failed to connect to API bus: Connection refused
It is due to "systemd --user" retrying to connect to an API bus.[*] I
would prefer to avoid spamming the logs. I don't think it is good for us
to retry so much like this.
systemd was mislead by something setting DBUS_SESSION_BUS_ADDRESS. My best
guess is an unfortunate series of events caused gdm to set this. gdm has
code to start a session dbus if there is not a bus available already (and
in this case it exports the environment variable). I believe it does not
normally do this when running under systemd, because "systemd --user" and
hence "dbus.service" would already have been started by pam_systemd.
I see two possibilities
1. Rip out the check for DBUS_SESSION_BUS_ADDRESS entirely.
2. Only check for DBUS_SESSION_BUS_ADDRESS on startup. Not in the
"recheck" logic.
The justification for 2), is that the recheck is called from unit_notify(),
this is used to check whether the service just started (or stopped) was
"dbus.service". This reason for rechecking does not apply if we think
the session bus was started outside our logic.
But I think we can justify 1). dbus-daemon ships a statically-enabled
/usr/lib/systemd/user/dbus.service, which would conflict with an attempt to
use an external dbus. Also "systemd --user" is started from user@.service;
if you try to start it manually so that it inherits an environment
variable, it will conflict if user@.service was started by pam_systemd
(or loginctl enable-linger).
This allows aliases to be used for the basic modules we load from pid1 before
udev is started. In #9501 the kernel renamed autofs4 to autofs, with "autofs4"
as alias, but we wouldn't load the module, because we didn't follow aliases.
The kernel change was reverted, but it's probably better to support aliases.
These custom macros make the expression go through a function, in order
to prevent ASSERT_SIDE_EFFECT false positives on our macros such as
assert_se() and assert_return() that cannot be disabled and will always
evaluate their expressions.
This technique has been described and recommended in:
https://community.synopsys.com/s/question/0D534000046Yuzb/suppressing-assertsideeffect-for-functions-that-allow-for-sideeffects
Tested by doing a local cov-build and uploading the resulting tarball to
scan.coverity.com, confirmed that the ASSERT_SIDE_EFFECT false positives
were gone.
This makes bus_slot_disconnect() unref the slot object from bus when
`unref == true` and it is floating, as the function removes the
reference from the relevant bus object.
This reverts 20d4ee2cbc, as it
introduces #9604.
Fixes#9604.
key_serial_t is defined in keyutil.h, which wasn't included in the header list
in the test, so the test always failed. We were always compiling stuff with
!HAVE_KEY_SERIAL_T.
We could try to add keyutil.h to the test, but then we'd have to first check if
it is available, which just doesn't seem worth the trouble.
key_serial_t should always be defined as int32_t. Let's keep the uncoditional
define, since repeated compatible typedefs are not a problem, and it allows us
to compile even if the header file is missing. If there's ever a change in the
definition, we'll have to adjust the code for the different type anyway, and
our compiler will tell us.
Using _GNU_SOURCE is better because that's how we include the headers in the
actual build, and some headers define different stuff when it is defined.
sys/stat.h for example defines 'struct statx' conditionally.
The switch to memory_startswith() changed the logic to only look for a space or
NUL byte after the matched word, but matching the full size should also be
acceptable.
This changed the behavior of parsing of "AUTH\r\n", where m will be set to 4,
since even though the word will match, the check for it being followed by ' '
or NUL will make line_begins() return false.
Tested:
- Using netcat to connect to the private socket directly:
$ echo -ne '\0AUTH\r\n' | sudo nc -U /run/systemd/private
REJECTED EXTERNAL ANONYMOUS
- Running the Ignition blackbox test:
$ sudo sh -c 'PATH=$PWD/bin/amd64:$PATH ./tests.test'
PASS
Fixes: d27b725abf
The current CLI does not support a way to clear these lists, since without any
additional arguments, the command will list the current values.
Introduce a new way to clear the lists by passing a single '' argument to these
subcommands.
Update the man page to document this.
Tested:
$ build/resolvectl domain eth1
Link 3 (eth1): ~.
$ build/resolvectl domain eth1 ''
$ build/resolvectl domain eth1
Link 3 (eth1):
$ build/resolvectl domain eth1 '~.' '~example.com'
$ build/resolvectl domain eth1
Link 3 (eth1): ~. ~example.com
$ build/resolvectl domain eth1 ''
$ build/resolvectl domain eth1
Link 3 (eth1):
$ build/resolvectl domain eth1 '~.'
$ build/resolvectl domain eth1
Link 3 (eth1): ~.
And similar for "dns" and "nta".
Check if the fd is a folder before setting default acls
Tested:
Ubuntu 18.04.
test.conf: A+ /tmp/test - - - - u:user2:rw,d:u:user1:rwx
The folder /tmp/test looks like
/tmp/test/file1
/tmp/test/folder2
start systemd-tmpfiles manually
Fixes: #9545
This got broken in 9d9dd746d4, because a template
is not a valid unit, so the check for being masked failed. Avoid this by
handling templates specially. Fixes#9554.
Also, this improves 'cat' with masked units:
(before) $ systemctl cat foofoofoo@.service
Failed to derive unit name prefix from unit name: Invalid argument
Failed to derive unit name prefix from unit name: Invalid argument
Failed to derive unit name prefix from unit name: Invalid argument
Failed to derive unit name prefix from unit name: Invalid argument
Failed to derive unit name prefix from unit name: Invalid argument
Failed to derive unit name prefix from unit name: Invalid argument
Failed to derive unit name prefix from unit name: Invalid argument
Failed to derive unit name prefix from unit name: Invalid argument
Failed to derive unit name prefix from unit name: Invalid argument
Failed to derive unit name prefix from unit name: Invalid argument
(after) $ build/systemctl cat foofoofoo@.service
In check_triggering_units(), the call to unit_is_masked() is replaced with an
open-coded check. This is a bit unfortunate, but unit_is_masked() now requires
LookupPaths to be initialized, which we don't have or need in this case, so it
seems easiest to just accept this tiny code duplication.
This adds sd_bus_{get,set}_method_call_timeout().
If the timeout is not set or set to 0, then the timeout value is
parsed from $SYSTEMD_BUS_TIMEOUT= environment variable. If the
environment variable is not set, then built-in timeout is used.
Unfortunately this needs libshared to link to libkmod. Before it was linked
into systemd-udevd, udevadm, and systemd each seperately. On most systems this
doesn't make much difference, because at least systemd would be installed, but
it might not be in small chroots. It is a small library, so I hope this is not
a big issue.
Back in 2012 the project was renamed, see the release notes for v 0.105
[https://cgit.freedesktop.org/polkit/tree/NEWS#n754]. Let's update our
documentation and comments to do the same. Referring to PolicyKit is confusing
to users because at the time the polkit api changed too, and we support the new
version. I updated NEWS too, since all the references to PolicyKit there were
added after the rename.
"PolicyKit" is unchanged in various URLs and method call names.
Starting with glibc 2.27.9000-36.fc29, include file sys/stat.h will have a
definition for struct statx, in which case include file linux/stat.h should be
avoided, in order to prevent a duplicate definition.
In file included from ../src/basic/missing.h:18,
from ../src/basic/util.h:28,
from ../src/basic/hashmap.h:10,
from ../src/shared/bus-util.h:12,
from ../src/libsystemd/sd-bus/bus-creds.c:11:
/usr/include/linux/stat.h:99:8: error: redefinition of ‘struct statx’
struct statx {
^~~~~
In file included from /usr/include/sys/stat.h:446,
from ../src/basic/util.h:19,
from ../src/basic/hashmap.h:10,
from ../src/shared/bus-util.h:12,
from ../src/libsystemd/sd-bus/bus-creds.c:11:
/usr/include/bits/statx.h:36:8: note: originally defined here
struct statx
^~~~~
Extend our meson.build to look for struct statx when only sys/stat.h is
included and, in that case, do not include linux/stat.h anymore.
Tested that systemd builds correctly when using a glibc version that includes a
definition for struct statx.
glibc Fedora RPM update:
28cb5d31fc
glibc upstream commit:
https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=fd70af45528d59a00eb3190ef6706cb299488fcd
When unmounting user runtime directory, only UID is necessary,
and the corresponding user may not exist anymore.
This makes first try to parse the input by parse_uid(), and only if it
fails, prase the input by get_user_creds().
Fixes#9541.
When a slot is disconnected, then slot->match_callback.install_slot
is also disconnected. So, bus_slot_disconnect() removes the install_slot
from the list of slots in bus, although it is a floating object.
This makes install_slot unreffed from bus when it is disconnected.
Fixes#9505 and #9510.
The kernel added support for a new cgroup memory controller knob memory.min in
bf8d5d52ffe8 ("memcg: introduce memory.min") which was merged during v4.18
merge window.
Add MemoryMin to support memory.min.
This fixes the following valgrind warning:
```
Syscall param sendmsg(msg.msg_name) points to uninitialised byte(s)
at 0x6189CC1: sendmsg (in /usr/lib64/libpthread-2.27.so)
by 0x153082: dns_stream_writev (resolved-dns-stream.c:235)
by 0x153343: dns_stream_tls_writev (resolved-dns-stream.c:299)
by 0x5B30343: ??? (in /usr/lib64/libgnutls.so.30.20.2)
by 0x5B3158F: ??? (in /usr/lib64/libgnutls.so.30.20.2)
by 0x5B33190: ??? (in /usr/lib64/libgnutls.so.30.20.2)
by 0x5B36307: ??? (in /usr/lib64/libgnutls.so.30.20.2)
by 0x5B37D47: gnutls_handshake (in /usr/lib64/libgnutls.so.30.20.2)
by 0x154591: dns_stream_connect_tls (resolved-dns-stream.c:596)
by 0x13A889: dns_transaction_emit_tcp (resolved-dns-transaction.c:676)
by 0x13D901: dns_transaction_go (resolved-dns-transaction.c:1761)
by 0x1330C8: dns_query_candidate_go (resolved-dns-query.c:156)
Address 0xa9ac268 is 312 bytes inside a block of size 592 alloc'd
at 0x4C30B06: calloc (vg_replace_malloc.c:711)
by 0x1541F8: dns_stream_new (resolved-dns-stream.c:545)
by 0x13A662: dns_transaction_emit_tcp (resolved-dns-transaction.c:642)
by 0x13D901: dns_transaction_go (resolved-dns-transaction.c:1761)
by 0x1330C8: dns_query_candidate_go (resolved-dns-query.c:156)
by 0x134E16: dns_query_go (resolved-dns-query.c:757)
by 0x11F3FB: bus_method_resolve_hostname (resolved-bus.c:353)
by 0x4F947A7: method_callbacks_run (bus-objects.c:402)
by 0x4F97266: object_find_and_run (bus-objects.c:1260)
by 0x4F978B1: bus_process_object (bus-objects.c:1376)
by 0x4FAF82C: process_message (sd-bus.c:2661)
by 0x4FAFA1B: process_running (sd-bus.c:2703)
```
If --dev-kvm-mode is set to something different then 0666, which we
explicitly support, it makes sense to still apply the uaccess tag to
/dev/kvm. For distros which opt to use the default 0666, this change is
a nop.
This partially reverts commit b8fd3d8220.
Prior to this commit, a .link file with a [Match] section containing
MACAddress= would match any device without a MAC. This restores the
matching logic prior to e90d037.
This is useful if someone wants to recreate the original syslog datagram. We
already include timestamp information as _SOURCE_REALTIME_TIMESTAMP=, and in
normal use that timestamp, converted back to the form used by syslog
(Mth dd HH:MM:SS) would usually give the value. But there are various
circumstances where this might not be true. Most obviously, if the datagram is
sent a bit later after being prepared, the time is rounded to the nearest
second, and it might be off. This is especially bad around New Year when the
syslog timestamp wraps around. Then the same timezone and locale need to be
used to recreate the original timestamp. In the end doing this reliably is
complicated, and it seems much easier to just unconditionally include the
original timestamp.
If the original timestamp cannot be located, we store the full log line.
This way, it should be always possible to recreate the original input.
Example:
MESSAGE=x
SYSLOG_TIMESTAMP=Sep 15 15:07:58
SYSLOG_RAW
^]^@^@^@^@^@^@^@<13>Sep 15 15:07:58 HOST: x^@y
_PID=3318
_SOURCE_REALTIME_TIMESTAMP=1530743976393553
Fixes#2398.
There are some modern programming languages use userspace context switches
to implement coroutine features. PowerPC (32-bit) needs syscall "swapcontext" to get
contexts or switch between contexts, which is special.
Adding this rule should fix#9485.
The D-Bus library supplies a va_list variant of
`sd_bus_message_append()` called `sd_bus_message_appendv()`,
but failed to provide a va_list variant of its opposite,
`sd_bus_message_read()`. This commit publicizes a previously static
function as `sd_bus_message_readv()`.
This makes OBJECT_PATH_FOREACH_PREFIX consistent with PATH_FOREACH_PREFIX
and also fixes 7 alerts reported by LGTM at
ac0a087003/files/src/libsystemd/sd-bus/bus-objects.c?sort=name&dir=ASC&mode=heatmap&showExcluded=true#V1383
nspawn as it is now is a generally useful tool, hence let's drop the
comments about it being useful for debug and so on only.
The new wording just makes the first sentence of the main page also the
summary.
Use PRIu64 constant to get the format right on LP-64 architectures,
cast to (uint64_t) to solve incompatibility of __u64.
This was missed in ad4bc33522, so fix it
with this follow up.
Also use compat_main() when called as `resolvconf`, since the interface
is closer to that of `systemd-resolve`.
Use a heap allocated string to set arg_ifname, since a stack allocated
one would be lost after the function returns. (This last one broke the
case where an interface name was suffixed with a dot, such as in
`resolvconf -a tap0.dhcp`.)
Tested:
$ build/resolvconf -a nonexistent.abc </etc/resolv.conf
Unknown interface 'nonexistent': No such device
Fixes#9423.
Use PRIu64 and PRIu32 constants to also get the format right on LP-64
architectures.
For the 64-bit fields, we need a cast to (uint64_t), since __u64 is
defined as a `long long unsigned` and PRIu64 expects a `long unsigned`.
In practice, both are the same, so the cast should be OK.
gmtime() and localtime() operate on a static buffer. let's avoid this,
as we never know whether some library might use these calls in some
backrgound thread.
Discovered by lgtm:
https://lgtm.com/projects/g/systemd/systemd/
Using an assertion is fine, since calls to job_merge_into_installed()
are protected by a check for job_type_is_conflicting().
Uncovered by Coverity, fixes CID 996307.
We add n_installed_jobs and n_failed_jobs to our inner state after
deserialization. This is fine during daemon-reexec when we start with clear
Manager (and some jobs possibly queued before deserialization), however,
daemon-reload works with the same manager and adding the values would
effectively double the counters. Reset the counters before we deserialize and
add their values again.
It does not make sense for udev to even open DRBD block devices
(/dev/drbdX). It is on one hand not necessary as DRBD is controlled by
something else in the stack (e.g., pacemaker), and it even can get
cumbersome in various scenarios (e.g., DRBD9 auto-promote).
Closes: #9371
Signed-off-by: Roland Kammerer <roland.kammerer@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
This compensates for the unsynchronized reload cycles of systemd and
udev: we manually trigger the deps listed in SYSTEMD_WANTS properties if
they change for device units that are already up. That way all deps
defined that way will be triggered at least once: the first time the
unit goes up by the usual dependency logic, and if it already is up by
the device.c specific logic.
Fixes: #9323
Before this patch:
# systemctl --runtime mask abuild.mount does-not-exist.mount does-also-not-exist.mount
Unit abuild.mount does not exist, proceeding anyway.
Unit abuild.mount does not exist, proceeding anyway.
Unit abuild.mount does not exist, proceeding anyway.
Created symlink /run/systemd/system/abuild.mount → /dev/null.
Created symlink /run/systemd/system/does-not-exist.mount → /dev/null.
Created symlink /run/systemd/system/does-also-not-exist.mount → /dev/null.
After this patch:
# systemctl --runtime mask abuild.mount does-not-exist.mount does-also-not-exist.mount
Unit abuild.mount does not exist, proceeding anyway.
Unit does-not-exist.mount does not exist, proceeding anyway.
Unit does-also-not-exist.mount does not exist, proceeding anyway.
Created symlink /run/systemd/system/abuild.mount → /dev/null.
Created symlink /run/systemd/system/does-not-exist.mount → /dev/null.
Created symlink /run/systemd/system/does-also-not-exist.mount → /dev/null.
This commit adds the stop alias to the output of `machinectl --help`.
In the past we only mention this in the man page. It's nice to mention
this in the output `machinectl --help` as well.
Now the setting MTU is embedded into the link_up message which makes it
incapable of setting MTU if link is up. MTU can be set while Link is up.
Closes#9254
The prefix for EMC Symmetrix pre-SPC VPD inquiry reply
is always SCSI_ID_NAA, so we need to hardcode it to
avoid false values here.
Signed-off-by: Hannes Reinecke <hare@suse.com>
The method already uses a boolean argument to determine whether it is in
whitelist mode or not. The code that will parse the string of filters
does not expect the ~, since it already has the boolean argument. Thus,
it will fail to parse the list of filters.
The functions protect_{home,system}_from_string() are not used
except for defining protect_{home,system}_or_bool_from_string().
This makes protect_{home,system}_from_string() support boolean
strings, and drops protect_{home,system}_or_bool_from_string().
During the transition from system functions using errno to our own read and write functions with negative return codes some errors where introduced. This patch correctly convert errno to negative return codes for read and write and fix checks still using errno instead of the return code.
Closes#9283
Currently we employ mostly system call blacklisting for our system
services. Let's add a new system call filter group @system-service that
helps turning this around into a whitelist by default.
The new group is very similar to nspawn's default filter list, but in
some ways more restricted (as sethostname() and suchlike shouldn't be
available to most system services just like that) and in others more
relaxed (for example @keyring is blocked in nspawn since it's not
properly virtualized yet in the kernel, but is fine for regular system
services).
$ git grep -e 'This program is free software' -l |grep -v LICENSE | \
xargs perl -i -0pe 's/ \* This program.*?for more details.\s*\*\n( \* You should have.*licenses.>.\n)?//gms'
For some reason they were missed previously. All those files seem to
have proper SDPX tags.
1) mv /var/tmp /var/tmp.old
2) mkdir /tmp/varrr
3) ln -s /tmp/varrr /var/tmp
Now, when a service has PrivateTmp=yes, during namespace setup,
/tmp is first mounted over with a new mount. Then, when /var/tmp
is being resolved, it points to /tmp/varrr, which by then doesn't
exist, because it had already been obscured.
These lines are generally out-of-date, incomplete and unnecessary. With
SPDX and git repository much more accurate and fine grained information
about licensing and authorship is available, hence let's drop the
per-file copyright notice. Of course, removing copyright lines of others
is problematic, hence this commit only removes my own lines and leaves
all others untouched. It might be nicer if sooner or later those could
go away too, making git the only and accurate source of authorship
information.
This part of the copyright blurb stems from the GPL use recommendations:
https://www.gnu.org/licenses/gpl-howto.en.html
The concept appears to originate in times where version control was per
file, instead of per tree, and was a way to glue the files together.
Ultimately, we nowadays don't live in that world anymore, and this
information is entirely useless anyway, as people are very welcome to
copy these files into any projects they like, and they shouldn't have to
change bits that are part of our copyright header for that.
hence, let's just get rid of this old cruft, and shorten our codebase a
bit.
On overlayfs, FTW_MOUNT causes nftw to not list *any* files because the
condition used by glibc to verify that it's on the same mountpoint doesn't work
on overlayfs, see https://bugzilla.suse.com/show_bug.cgi?id=1096807 for the
details.
However using FTW_MOUNT doesn't seem to be really needed when walking through
the keymap directorie tree. So until the glibc or the kernel is fixed (which
might take some time), let's make localectl works with overlayfs.
There's a small side effect here, by which regular (non-directory) files with
bind mounts will be parsed while they were skipped by the previous logic.
To make debugging easier, this patches allows one to change the log target and
do reload/reexec without modifying configuration permanently, which makes
debugging easier.
Indeed if one changed the log target at runtime (via the bus or via signals),
the change was lost on the next reload/reexecution.
In order to restore back the default value (set via system.conf, environment
variables or any other means ), the empty string in the "LogTarget" property is
now supported as well as sending SIGTRMIN+26 signal.
To make debugging easier, this patches allows one to change the log level and
do reload/reexec without modifying configuration permanently, which makes
debugging easier.
Indeed if one changed the log max level at runtime (via the bus or via
signals), the change was lost on the next daemon reload/reexecution.
In order to restore the original value back (set via system.conf, environment
variables or any other means), the empty string in the "LogLevel" property is
now supported as well as sending SIGRTMIN+23 signal.
Let's add "const" where we don't change structures passed.
Also, we generally use "unsigned char" for IP prefix length values, do
so here too. Previously different parts of the sd-radv.h API used
different types for this.
sd_radv_stop is called from two places. if sd_radv_stop is alrady
success then just don't try to close it .
```
systemd-networkd[604]: RADV: Stopping IPv6 Router Advertisement daemon
systemd-networkd[604]: RADV: Unable to send last Router Advertisement with router lifetime set to zero: Bad file descriptor <==================HERE
systemd-networkd[604]: RADV: Updated prefix 2a0a:*:*:fc::/64 preferred 1h valid 2h
systemd-networkd[604]: RADV: Started IPv6 Router Advertisement daemon
```
Closes one of the issue #8960
C++03: "An rvalue of arithmetic, enumeration, pointer, or pointer to member
type can be converted to an rvalue of type bool. A zero value, null pointer
value, or null member pointer value is converted to false; any other value is
converted to true"
C should behave the same because pointers are scalars in C, but let's verify
that.
They are not needed, because anything that is non-zero is converted
to true.
C11:
> 6.3.1.2: When any scalar value is converted to _Bool, the result is 0 if the
> value compares equal to 0; otherwise, the result is 1.
https://stackoverflow.com/questions/31551888/casting-int-to-bool-in-c-c
DNS queries need timeout values to detect whether a DNS server is
unresponsive or, if the query is sent over UDP, whether a DNS message
was lost and has to be resent. The total time that it takes to answer a
query to arrive is t + RTT, where t is the maximum time that the DNS
server that is being queried needs to answer the query.
An authoritative server stores a copy of the zone that it serves in main
memory or secondary storage, so t is very small and therefore the time
that it takes to answer a query is almost entirely determined by the
RTT. Modern authoritative server software keeps its zones in main memory
and, for example, Knot DNS and NSD are able to answer in less than
100 µs [1]. So iterative resolvers continuously measure the RTT to
optimize their query timeouts and to resend queries more quickly if they
are lost.
systemd-resolved is a stub resolver: it forwards DNS queries to an
upstream resolver and waits for an answer. So the time that it takes for
systemd-resolved to answer a query is determined by the RTT and the time
that it takes the upstream resolver to answer the query.
It seems common for iterative resolver software to set a total timeout
for the query. Such total timeout subsumes the timeout of all queries
that the iterative has to make to answer a query. For example, BIND
seems to use a default timeout of 10 s.
At the moment systemd-resolved derives its query timeout entirely from
the RTT and does not consider the query timeout of the upstream
resolver. Therefore it often mistakenly degrades the feature set of its
upstream resolvers if it takes them longer than usual to answer a query.
It has been reported to be a considerable problem in practice, in
particular if DNSSEC=yes. So the query timeout systemd-resolved should
be derived from the timeout of the upstream resolved and the RTT to the
upstream resolver.
At the moment systemd-resolved measures the RTT as the time that it
takes the upstream resolver to answer a query. This clearly leads to
incorrect measurements. In order to correctly measure the RTT
systemd-resolved would have to measure RTT separately and continuously,
for example with a query with an empty question section or a query for
the SOA RR of the root zone so that the upstream resolver would be able
to answer to query without querying another server. However, this
requires significant changes to systemd-resolved. So it seems best to
postpone them until other issues have been addressed and to set the
resend timeout to a fixed value for now.
As mentioned, BIND seems to use a timeout of 10 s, so perhaps 12 s is a
reasonable value that also accounts for common RTT values. If we assume
that the we are going to retry, it could be less. So it should be enough
to set the resend timeout to DNS_TIMEOUT_MAX_USEC as
DNS_SERVER_FEATURE_RETRY_ATTEMPTS * DNS_TIMEOUT_MAX_USEC = 15 s.
However, this will not solve the incorrect feature set degradation and
should be seen as a temporary change until systemd-resolved does
probe the feature set of an upstream resolver independently from the
actual queries.
[1] https://www.knot-dns.cz/benchmark/
Let's always write "1 << 0", "1 << 1" and so on, except where we need
more than 31 flag bits, where we write "UINT64(1) << 0", and so on to force
64bit values.
This new setting is supposed to be useful in most cases where
"MountFlags=slave" is currently used, i.e. as an explicit way to run a
service in its own mount namespace and decouple propagation from all
mounts of the new mount namespace towards the host.
The effect of MountFlags=slave and PrivateMounts=yes is mostly the same,
as both cause a CLONE_NEWNS namespace to be opened, and both will result
in all mounts within it to be mounted MS_SLAVE. The difference is mostly
on the conceptual/philosophical level: configuring the propagation mode
is nothing people should have to think about, in particular as the
matter is not precisely easyto grok. Moreover, MountFlags= allows configuration
of "private" and "slave" modes which don't really make much sense to use
in real-life and are quite confusing. In particular PrivateMounts=private means
mounts made on the host stay pinned for good by the service which is
particularly nasty for removable media mount. And PrivateMounts=shared
is in most ways a NOP when used a alone...
The main technical difference between setting only MountFlags=slave or
only PrivateMounts=yes in a unit file is that the former remounts all
mounts to MS_SLAVE and leaves them there, while that latter remounts
them to MS_SHARED again right after. The latter is generally a nicer
approach, since it disables propagation, while MS_SHARED is afterwards
in effect, which is really nice as that means further namespacing down
the tree will get MS_SHARED logic by default and we unify how
applications see our mounts as we always pass them as MS_SHARED
regardless whether any mount namespacing is used or not.
The effect of PrivateMounts=yes was implied already by all the other
mount namespacing options. With this new option we add an explicit knob
for it, to request it without any other option used as well.
See: #4393
This drops needless safety checks that ensure we only reference block
devices for blockio/io settings. The backing code was already able to
accept regular file system paths too, in which case the backing device
node of that file system would be used. Hence, let's drop the artificial
restrictions and open up this underlying functionality.
We document the rule that return values >= 0 of functions are supposed
to indicate success, and that in case of success all return parameters
should be initialized. Let's actually do so.
Just a tiny coding style fix-up.
Jun 11 14:29:12 krowka systemd[1]: /etc/systemd/system/workingdir.service:6: = path is not normalizedWorkingDirectory: /../../etc
↓
Jun 11 14:32:12 krowka systemd[1]: /etc/systemd/system/workingdir.service:6: WorkingDirectory= path is not normalized: /../../etc
Since bb28e68477 parsing failures of
certain unit file settings will result in load failures of units. This
introduces a new load state "bad-setting" that is entered in precisely
this case.
With this addition error messages on bad settings should be a lot more
explicit, as we don't have to show some generic "errno" error in that
case, but can explicitly say that a bad setting is at fault.
Internally this unit load state is entered as soon as any configuration
loader call returns ENOEXEC. Hence: config parser calls should return
ENOEXEC now for such essential unit file settings. Turns out, they
generally already do.
Fixes: #9107
The load_error is only valid in some load_state cases, lets generate
prettier messages for other cases too, by reusing the
bus_unit_validate_load_state() call which does jus that.
Clients (such as systemctl) ignored LoadError unles LoadState was
"error" before. With this change they could even show LoadError in other
cases and it would show a useful name.
Let's use a switch() statement, cover more cases with pretty messages.
Also let's rename it to "validate", as that's more specific that
"check", as it implies checking for a "valid"/"good" state, which is
what this function does.
This makes two changes: first of all we will now explicitly check
whether a domain to test against an NSEC record is actually below the
signer's name. This is relevant for NSEC records that chain up the end
and the beginning of a zone: we shouldn't alow that NSEC record to match
against domains outside of the zone.
This also fixes how we handle NSEC checks for domains that are prefixes
of the NSEC RR domain itself, fixing #8164 which triggers this specific
case. The non-wildcard NSEC check is simplified for that, we can
directly make our between check, there's no need to find the "Next
Closer" first, as the between check should not be affected by additional
prefixes. For the wild card NSEC check we'll prepend the asterisk in
this case to the NSEC RR itself to make a correct check.
Fixes: #8164
oss-fuzz flags this as:
==1==WARNING: MemorySanitizer: use-of-uninitialized-value
0. 0x7fce77519ca5 in ascii_is_valid systemd/src/basic/utf8.c:252:9
1. 0x7fce774d203c in ellipsize_mem systemd/src/basic/string-util.c:544:13
2. 0x7fce7730a299 in print_multiline systemd/src/shared/logs-show.c:244:37
3. 0x7fce772ffdf3 in output_short systemd/src/shared/logs-show.c:495:25
4. 0x7fce772f5a27 in show_journal_entry systemd/src/shared/logs-show.c:1077:15
5. 0x7fce772f66ad in show_journal systemd/src/shared/logs-show.c:1164:29
6. 0x4a2fa0 in LLVMFuzzerTestOneInput systemd/src/fuzz/fuzz-journal-remote.c:64:21
...
I didn't reproduce the issue, but this looks like an obvious error: the length
is specified, so we shouldn't use the string with any functions for normal
C-strings.
This introduces a has_data boolean field in struct unit_files which can
be used to detect the end of the array.
Use a _cleanup_ for struct unit_files in acquire_time_data and its
callers. Code for acquire_time_data is also simplified by replacing
goto's with straight returns.
Tested: By running the commands below, also checking them under valgrind.
- build/systemd-analyze blame
- build/systemd-analyze critical-chain
- build/systemd-analyze plot
Fixes: Coverity finding CID 996464.