The CI occasionally fail in test-path with a timeout. test-path loads
units from the filesystem, and this conceivably might take more than
the default limit of 3 s. Increase the timeout substantially to see if
this helps.
The test binary has two modes: in the default argument-less mode, it
just checks that "root" can be resolved. When invoked manually, a root
prefix and user/group names can be specified.
The test is failing in koji, and the line from printf() does not end up
in the logs for some reason. log_info() works fine, so let's just use
that here too.
Read-only /var/tmp is more likely, because it's backed by a real device. /tmp
is (by default) backed by tmpfs, but it doesn't have to be. In both cases the
same consideration applies.
If we boot with read-only /var/tmp, any unit with PrivateTmp=yes would fail
because we cannot create the subdir under /var/tmp to mount the private directory.
But many services actually don't require /var/tmp (either because they only use
it occasionally, or because they only use /tmp, or even because they don't use the
temporary directories at all, and PrivateTmp=yes is used to isolate them from
the rest of the system).
To handle both cases let's create a read-only directory under /run/systemd and
mount it as the private /tmp or /var/tmp. (Read-only to not fool the service into
dumping too much data in /run.)
$ sudo systemd-run -t -p PrivateTmp=yes bash
Running as unit: run-u14.service
Press ^] three times within 1s to disconnect TTY.
[root@workstation /]# ls -l /tmp/
total 0
[root@workstation /]# ls -l /var/tmp/
total 0
[root@workstation /]# touch /tmp/f
[root@workstation /]# touch /var/tmp/f
touch: cannot touch '/var/tmp/f': Read-only file system
This commit has more changes than I like to put in one commit, but it's touching all
the same paths so it's hard to split.
exec_runtime_make() was using the wrong cleanup function, so the directory would be
left behind on error.
We never return anything higher than 63, so using "long unsigned"
as the type only confused the reader. (We can still use "long unsigned"
and safe_atolu() to parse the kernel file.)
We would refuse to print capabilities which were didn't have a name
for. The kernel adds new capabilities from time to time, most recently
cap_bpf. 'systmectl show -p CapabilityBoundingSet ...' would fail with
"Failed to parse bus message: Invalid argument" because
capability_set_to_string_alloc() would fail with -EINVAL. So let's
print such capabilities in hexadecimal:
CapabilityBoundingSet=cap_chown cap_dac_override cap_dac_read_search
cap_fowner cap_fsetid cap_kill cap_setgid cap_setuid cap_setpcap
cap_linux_immutable cap_net_bind_service cap_net_broadcast cap_net_admin
cap_net_raw cap_ipc_lock cap_ipc_owner 0x10 0x11 0x12 0x13 0x14 0x15 0x16
0x17 0x18 0x19 0x1a ...
For symmetry, also allow capabilities that we don't know to be specified.
Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1853736.
The desktop file specification allows entries like ";;;;;;", full of empty strings.
But looking at the actual list of supported keys [1], empty entries are meaningless
(unless we would allow e.g. the desktop name to be the empty string. But that doesn't
seem very useful either). So let's just simplify our life and skip any empty substrings
entirely.
This would also resolve the fuzzer case:
$ valgrind build/fuzz-xdg-desktop test/fuzz/fuzz-xdg-desktop/oss-fuzz-22812
test/fuzz/fuzz-xdg-desktop/oss-fuzz-22812... ok
==2899241== HEAP SUMMARY:
==2899241== in use at exit: 0 bytes in 0 blocks
==2899241== total heap usage: 484,385 allocs, 484,385 frees, 12,411,330 bytes allocated
↓
==2899650== HEAP SUMMARY:
==2899650== in use at exit: 0 bytes in 0 blocks
==2899650== total heap usage: 1,325 allocs, 1,325 frees, 1,463,602 bytes allocated
This partially undoes the parent commit. We follow the symlink and
if it appears to be a symlink to /dev/null, even if /dev/null is not
present, we treat it as such. The addition of creation of /dev/null
in the test is reverted.
There's some inconsistency in the what is considered a masked unit:
some places (i.e. load-fragment.c) use `null_or_empty()` while others
check if the file path is symlinked to "/dev/null". Since the latter
doesn't account for things like non-absolute symlinks to "/dev/null",
this commit switches the check for "/dev/null" to use `null_or_empty_path()`
Since bc8d57f290 mac_selinux_init() is checked and considered fatal
tree-wide.
Coverity complains about it not being checked in the test code.
Follow-up of: #16223
Fixes: CID 1429975
https://tools.ietf.org/html/draft-knodel-terminology-02https://lwn.net/Articles/823224/
This gets rid of most but not occasions of these loaded terms:
1. scsi_id and friends are something that is supposed to be removed from
our tree (see #7594)
2. The test suite defines an API used by the ubuntu CI. We can remove
this too later, but this needs to be done in sync with the ubuntu CI.
3. In some cases the terms are part of APIs we call or where we expose
concepts the kernel names the way it names them. (In particular all
remaining uses of the word "slave" in our codebase are like this,
it's used by the POSIX PTY layer, by the network subsystem, the mount
API and the block device subsystem). Getting rid of the term in these
contexts would mean doing some major fixes of the kernel ABI first.
Regarding the replacements: when whitelist/blacklist is used as noun we
replace with with allow list/deny list, and when used as verb with
allow-list/deny-list.
Presently, CLI utilities such as systemctl will check whether they have a tty
attached or not to decide whether to parse /proc/cmdline or EFI variable
SystemdOptions looking for systemd.log_* entries.
But this check will be misleading if these tools are being launched by a
daemon, such as a monitoring daemon or automation service that runs in
background.
Make log handling of CLI tools uniform by never checking /proc/cmdline or EFI
variables to determine the logging level.
Furthermore, introduce a new log_setup_cli() shortcut to set up common options
used by most command-line utilities.
This combines set_ensure_allocated() with set_consume(). The cool thing is that
because we know the hash ops, we can correctly free the item if appropriate.
Similarly to set_consume(), the goal is to simplify handling of the case where
the item needs to be freed on error and if already present in the set.
It's such a common operation to allocate the set and put an item in it,
that it deserves a helper. set_ensure_put() has the same return values
as set_put().
Comes with tests!
Let's make sure $XDG_RUNTIME_DIR for the user instance and /run for the
system instance is always organized the same way: the "inaccessible"
device nodes should be placed in a subdir of either called "systemd" and
a subdir of that called "inaccessible".
This way we can emphasize the common behaviour, and only differ where
really necessary.
Follow-up for #13823
dm-verity support in dissect-image at the moment is restricted to GPT
volumes.
If the image a single-filesystem type without a partition table (eg: squashfs)
and a roothash/verity file are passed, set the verity flag and mark as
read-only.
This is a follow-up for 9f83091e3c.
Instead of reading the mtime off the configuration files after reading,
let's do so before reading, but with the fd we read the data from. This
is not only cleaner (as it allows us to save one stat()), but also has
the benefit that we'll detect changes that happen while we read the
files.
This also reworks unit file drop-ins to use the common code for
determining drop-in mtime, instead of reading system clock for that.
We would parse numbers with base prefixes as user identifiers. For example,
"0x2b3bfa0" would be interpreted as UID==45334432 and "01750" would be
interpreted as UID==1000. This parsing was used also in cases where either a
user/group name or number may be specified. This means that names like
0x2b3bfa0 would be ambiguous: they are a valid user name according to our
documented relaxed rules, but they would also be parsed as numeric uids.
This behaviour is definitely not expected by users, since tools generally only
accept decimal numbers (e.g. id, getent passwd), while other tools only accept
user names and thus will interpret such strings as user names without even
attempting to convert them to numbers (su, ssh). So let's follow suit and only
accept numbers in decimal notation. Effectively this means that we will reject
such strings as a username/uid/groupname/gid where strict mode is used, and try
to look up a user/group with such a name in relaxed mode.
Since the function changed is fairly low-level and fairly widely used, this
affects multiple tools: loginctl show-user/enable-linger/disable-linger foo',
the third argument in sysusers.d, fourth and fifth arguments in tmpfiles.d,
etc.
Fixes#15985.
Prompted by the discussions in #15180.
This is a bit more complex than I hoped, since for PID 1 we need to pass
in the synethetic environment block in we generate on demand.
It's easy to add, and should be pretty useful, in particular as in
AssertPathIsEncrypted= as it can be used for checking that
some path is encrypted before some service is invoked that might want to
place secure material there.
kernel 5.6 added support for a new flag for getrandom(): GRND_INSECURE.
If we set it we can get some random data out of the kernel random pool,
even if it is not yet initializated. This is great for us to initialize
hash table seeds and such, where it is OK if they are crap initially. We
used RDRAND for these cases so far, but RDRAND is only available on
newer CPUs and some archs. Let's now use GRND_INSECURE for these cases
as well, which means we won't needlessly delay boot anymore even on
archs/CPUs that do not have RDRAND.
Of course we never set this flag when generating crypto keys or uuids.
Which makes it different from RDRAND for us (and is the reason I think
we should keep explicit RDRAND support in): RDRAND we don't trust enough
for crypto keys. But we do trust it enough for UUIDs.
As described in #15603, it is a fairly common setup to use a fqdn as the
configured hostname. But it is often convenient to use just the actual
hostname, i.e. until the first dot. This adds support in tmpfiles, sysusers,
and unit files for %l which expands to that.
Fixes#15603.
Ideally, assert_cc() would be used for this, so that it is not possible to even
compile systemd with something like '-Dfallback-hostname=.foo'. But to do a
proper check we need to call hostname_is_valid(), and we cannot depend on being
able to run code (e.g. during cross-compilation). So let's do a very superficial
check in meson, and a proper on in test-util.
The sets are such basic functionality that it is convenient to be able to
build test-set without all the machinery in shared, and to test it without
the mempool to validate memory accesses easier.
For units which are aliases of other units, reporting preset status as
"enabled" is rather misleading. For example, dbus.service is an alias of
dbus-broker.service. In list-unit-files we'd show both as "enabled". In
particular, systemctl preset ignores aliases, so showing any preset status at
all is always going to be misleading. Let's introduce a new state "alias" and
use that for all aliases.
I was trying to avoid adding a new state, to keep compatibility with previous
behaviour, but for alias unit files it simply doesn't seem very useful to show
any of the existing states. It seems that the clearly showing that those are
aliases for other units will be easiest to understand for users.
This changes the calendarspec parser to allow expressions such as
"00:05..05", i.e. a range where start and end is the same. It also
allows expressions such as "00:1-2/3", i.e. where the repetition value
does not fit even once in the specified range. With this patch both
cases will now be optimized away, i.e. the range is removed and a fixed
value is used, which is functionally equivalent.
See #15030 for an issue where the inability to parse such expressions
caused confusion.
I think it's probably better to accept these gracefully and optimizing
them away instead of refusing them with a plain EINVAL. With a tool such
as "systemd-analyze" calendar it should be easy to figure out the
normalized form with the redundant bits optimized away.
As documented in systemd.path(5):
When a service unit triggered by a path unit terminates (regardless
whether it exited successfully or failed), monitored paths are
checked immediately again, and the service accordingly restarted
instantly.
This commit implements this behaviour for PathExists=, PathExistsGlob=,
and DirectoryNotEmpty=. These predicates are essentially
"level-triggered": the service should be activated whenever the
predicate is true. PathChanged= and PathModified=, on the other hand,
are "edge-triggered": the service should only be activated when the
predicate *becomes* true.
The behaviour has been broken since at least as far back as commit
8fca6944c2 ("path: stop watching path specs once we triggered the target
unit"). This commit had systemd stop monitoring inotify whenever the
triggered unit was activated. Unfortunately this meant it never updated
the ->inotify_triggered flag, so it never rechecked the path specs when
the triggered unit deactivated.
With this commit, systemd rechecks all paths specs whenever the
triggered unit deactivates. If any PathExists=, PathExistsGlob= or
DirectoryNotEmpty= predicate passes, the triggered unit is reactivated.
If the target unit is activated by something outside of the path unit,
the path unit immediately transitions to a running state. This ensures
the path unit stops monitoring inotify in this situation.
With this change in place, commit d7cf8c24d4 ("core/path: fix spurious
triggering of PathExists= on restart/reload") is no longer necessary.
The path unit (and its triggered unit) is now always active whenever
the PathExists= predicate passes, so there is no spurious restart when
systemd is reloaded or restarted.
Callers of cg_get_keyed_attribute_full() can now specify via the flag whether the
missing keyes in cgroup attribute file are OK or not. Also the wrappers for both
strict and graceful version are provided.
Don't assume that 4MB can be allocated from stack since there could be smaller
DefaultLimitSTACK= in force, so let's use malloc(). NUL terminate the huge
strings by hand, also ensure termination in test_lz4_decompress_partial() and
optimize the memset() for the string.
Some items in /proc and /etc may not be accessible to poor unprivileged users
due to e.g. SELinux, BOFH or both, so check for EACCES and EPERM.
/var/tmp may be a symlink to /tmp and then path_compare() will always fail, so
let's stick to /tmp like elsewhere.
/tmp may be mounted with noexec option and then trying to execute scripts from
there would fail.
Detect and warn if seccomp is already in use, which could make seccomp test
fail if the syscalls are already blocked.
Unset $TMPDIR so it will not break specifier tests where %T is assumed to be
/tmp and %V /var/tmp.
It's not always mounted, e.g. during the build-time tests, it's running inside
a chroot (that's how Debian/Ubuntu build packages, in chroots) so this test
always fails because /sys/fs/cgroup isn't mounted.
We provide a way via the '-' symbol to ignore errors when nonexistent
executable files are passed to Exec* parameters & so on. In such a case,
the flag `EXEC_COMMAND_IGNORE_FAILURE` is set and we go on happily with
our life if that happens. However, `systemd-analyze verify` complained
about missing executables even in such a case. In such a case it is not
an error for this to happen so check if the flag is set before checking
if the file is accessible and executable.
Add some small tests to check this condition.
Closes#15218.
This reworks the user validation infrastructure. There are now two
modes. In regular mode we are strict and test against a strict set of
valid chars. And in "relaxed" mode we just filter out some really
obvious, dangerous stuff. i.e. strict is whitelisting what is OK, but
"relaxed" is blacklisting what is really not OK.
The idea is that we use strict mode whenver we allocate a new user
(i.e. in sysusers.d or homed), while "relaxed" mode is when we process
users registered elsewhere, (i.e. userdb, logind, …)
The requirements on user name validity vary wildly. SSSD thinks its fine
to embedd "@" for example, while the suggested NAME_REGEX field on
Debian does not even allow uppercase chars…
This effectively liberaralizes a lot what we expect from usernames.
The code that warns about questionnable user names is now optional and
only used at places such as unit file parsing, so that it doesn't show
up on every userdb query, but only when processing configuration files
that know better.
Fixes: #15149#15090
split() and FOREACH_WORD really should die, and everything be moved to
extract_first_word() and friends, but let's at least make sure that for
the remaining code using it we can't deadlock by not progressing in the
word iteration.
Fixes: #15305
The kernel does not sanitize /proc/cmdline. E.g. when running under qemu, it is
easy to pass a string with newline by mistake. We use read_one_line_file(), so
we would read only the first list of the file, and
write_string_file(WRITE_STRING_FILE_VERIFY_ON_FAILURE) would fail because the
target file is obviously different. Change to a kernel-generated file to avoid
the issue.
v2:
- use /proc/version instead of /proc/uptime for attempted writes, so the test
test passes even if test_write_string_file_verify() takes more than 10 ms ;]
I think the two names were both pretty bad. They did not give a proper hint
what the difference between the two functions is, and sd_path_home sounds like
it is somehow related to /home or home directories or whatever, when in fact
both functions return the same set of paths as either a colon-delimited string
or a strv. "_strv" suffix is used by various functions in sd-bus, so let's
reuse that.
Those functions are not public yet, so let's rename.
In 1a29610f5f the change inadvertedly
disabled names with digit as the first character. This follow-up change
allows a digit as the first character in compat mode.
Fixes: #15141
It returns 32 bits, unsigned on amd64, so it's probably similar everywhere
with glibc. But let's make the code generic, without assuming specific size
or signedness.
The man pages state that the '+' prefix in Exec* directives should
ignore filesystem namespacing options such as PrivateTmp. Now it does.
This is very similar to #8842, just with PrivateTmp instead of
PrivateDevices.
Fix the following issues in test-cgroup:
1, commit 65be7e0652 ("pid1: do not reset subtree_control on
already-existing units with delegation") changed the return value
of cg_create () as follows:
"Returns 0 if the group already existed, 1 on success, negative
otherwise."
So we need to modify the test cases related to cg_create ().
2, commit efdb02375b ("core: unified cgroup hierarchy support")
changed cg_delete () to cg_rmdir (), so the test cases also need
to be adjusted a bit.
3. There is no cleanup of "test-a". If we execute test-cgroup
multiple times, we will encounter an error.
Signed-off-by: Wen Yang <wenyang@linux.alibaba.com>
For #8495: it is arguably useful to not show the length of the password
in public spaces. It is possible to press TAB or BS to cancel the asterisks,
but this is not very discoverable. Let's make it discoverable by showing
a message (in gray). The message is "erased" after the first character
is entered.
test-ask-password-api would crash if ^D was pressed.
If think the callers generally expect a non-empty strv as reply. Let's
return an error if we have nothing to return.
Also modernize test-ask-password-api a bit.
We use those strings as hash keys. While writing "a...b" looks strange,
"a///b" does not look so strange. Both syntaxes would actually result in the
value being correctly written to the file, but they would confuse our
de-deplication over keys. So let's normalize. Output also becomes nicer.
Add test.
This way we can use libxcrypt specific functionality such as
crypt_gensalt() and thus take benefit of the newer algorithms libxcrypt
implements. (Also adds support for a new env var $SYSTEMD_CRYPT_PREFIX
which may be used to select the hash algorithm to use for libxcrypt.)
Also, let's move the weird crypt.h inclusion into libcrypt.h so that
there's a single place for it.