The idea is that we have strvs like list of server names or addresses, where
the majority of strings is rather short, but some are long and there can
potentially be many strings. So formattting them either all on one line or all
in separate lines leads to output that is either hard to read or uses way too
many rows. We want to wrap them, but relying on the pager to do the wrapping is
not nice. Normal text has a lot of redundancy, so when the pager wraps a line
in the middle of a word the read can understand what is going on without any
trouble. But for a high-density zero-redundancy text like an IP address it is
much nicer to wrap between words. This also makes c&p easier.
This adds a variant of TABLE_STRV which is wrapped on output (with line breaks
inserted between different strv entries).
The change table_print() is quite ugly. A second pass is added to re-calculate
column widths. Since column size is now "soft", i.e. it can adjust based on
available columns, we need to two passes:
- first we figure out how much space we want
- in the second pass we figure out what the actual wrapped columns
widths will be.
To avoid unnessary work, the second pass is only done when we actually have
wrappable fields.
A test is added in test-format-table.
Failed to enter shared memory directory multipath: Permission denied
→
Failed to enter shared memory directory /dev/shm/multipath: Permission denied
When looking at nested directories, we will print only the final two elements
of the path. That is still more useful than just the last component of the
path. To print the full path, we'd have to allocate the string, and since the
error occurs so very rarely, I think the current best-effort approach is
enough.
By making them unsigned comparing them with other sizes is less likely
to trigger compiler warnings regarding signed/unsigned comparisons.
After all sizes (i.e. size_t) are generally assumed to be unsigned, so
these should be too.
Prompted-by: https://github.com/systemd/systemd/pull/17345#issuecomment-709402332
If the first boot was aborted, /etc/machine-id might read as
"uninitialized" in some cases. Add a separate case for this
instead of printing a confusing error message.
p itself is never null. Because of this, we would always
call sd_notify() in cleanup, even though the intention was to only
call it if notify_start() was executed.
I can't think of any real vulnerability about this, but it still feels
better to check a variable with "secure" in its name with
secure_getenv() rather than plain getenv().
Paranoia FTW!
We would return ENOENT, which is extremely confusing. Strace is not helpful because
no *file* is actually missing. So let's add some logs at debug level and also use
a custom return code. Let all user-facing utilities print a custom error message
in that case.
The variable is renamed to SYSTEMD_PAGERSECURE (because it's not just about
less now), and we automatically enable secure mode in certain cases, but not
otherwise.
This approach is more nuanced, but should provide a better experience for
users:
- Previusly we would set LESSSECURE=1 and trust the pager to make use of
it. But this has an effect only on less. We need to not start pagers which
are insecure when in secure mode. In particular more is like that and is a
very popular pager.
- We don't enable secure mode always, which means that those other pagers can
reasonably used.
- We do the right thing by default, but the user has ultimate control by
setting SYSTEMD_PAGERSECURE.
Fixes#5666.
v2:
- also check $PKEXEC_UID
v3:
- use 'sd_pid_get_owner_uid() != geteuid()' as the condition
While a server is in the VARLINK_PENDING_METHOD or VARLINK_PENDING_METHOD_MORE
states and its write end is disconnected and it gets a POLLHUP, we
should disconnect since it can't write anymore.
In the case of systemd-oomd disconnecting while pid1 was pending-more, this
condition left pid1 in a state where it started throttling from
continually getting POLLHUP.
It's an auxiliary function to the UnitInfo structures, and very generic.
Let's hence move it over to the other code operating with UnitInfo, even
if it's not used by code outside of systemctl (yet).
Some extra safety when invoked via "sudo". With this we address a
genuine design flaw of sudo, and we shouldn't need to deal with this.
But it's still a good idea to disable this surface given how exotic it
is.
Prompted by #5666
If we set O_NONBLOCK on stdin/stdout directly this means the flag is
left set when we abort abnormally, as we don't get the chance to reset
it again on exit. This might confuse progrms invoking us. Moreover, if
programs invoking us continue to write to the stdout passed to us, they
might be confused by non-blocking mode too.
Hence, let's avoid this if possible: let's reopen stdin/stdout and set
O_NONBLOCK only on the reopend fds, leaving the original fds as they
are.
Prompted-by: https://github.com/systemd/systemd/pull/17070#issuecomment-702304802
Also, even if login.defs are not present, don't start allocating at 1, but at
SYSTEM_UID_MIN.
Fixes#9769.
The test is adjusted. Actually, it was busted before, because sysusers would
never use SYSTEM_GID_MIN, so if SYSTEM_GID_MIN was different than
SYSTEM_UID_MIN, the tests would fail. On all "normal" systems the two are
equal, so we didn't notice. Since sysusers now always uses the minimum of the
two, we only need to substitute one value.
We don't (and shouldn't I think) look at them when determining the type of the
user, but they should be used during user/group allocation. (For example, an
admin may specify SYS_UID_MIN==200 to allow statically numbered users that are
shared with other systems in the range 1–199.)
It makes little sense to make the boundary between systemd and user guids
configurable. Nevertheless, a completely fixed compile-time define is not
enough in two scenarios:
- the systemd_uid_max boundary has moved over time. The default used to be
500 for a long time. Systems which are upgraded over time might have users
in the wrong range, but changing existing systems is complicated and
expensive (offline disks, backups, remote systems, read-only media, etc.)
- systems are used in a heterogenous enviornment, where some vendors pick
one value and others another.
So let's make this boundary overridable using /etc/login.defs.
Fixes#3855, #10184.
This is like membarrier() I guess and basically just exposes CPU
functionality via kernel syscall on some archs. Let's whitelist it for
everyone.
Fixes: #17197
After https://github.com/systemd/systemd/pull/16981 only the presence of crypt_gensalt_ra
is checked, but there are cases where that function is available but crypt_preferred_method
is not, and they are used in the same ifdef.
Add a check for the latter as well.
Fixes#17035. We use "," as the separator between arguments in fstab and crypttab
options field, but the kernel started using "," within arguments. Users will need
to escape those nested commas.
If the whole call is simple and we don't need to look at the return value
apart from the conditional, let's use a form without assignment of the return
value. When the function call is more complicated, it still makes sense to
use a temporary variable.
Let's make umount_verbose() more like mount_verbose_xyz(), i.e. take log
level and flags param. In particular the latter matters, since we
typically don't actually want to follow symlinks when unmounting.
Let's suppress the secondary arch data, since we never ever want to
mount it if we found the primary arch.
Previously we only suppressed in the Verity case, but there's little
reason to entertain the idea of a secondary arch in non-Verity
environments either, we are not going to use them, and should not do
decryption or anything like that.
Uppercase first char of log message, and indicate correct program name.
Reindent comment table at one place.
Use correct, specific, enum type at one more place.
Just some refactoring: let's place the various verity related parameters
in a common structure, and pass that around instead of the individual
parameters.
Also, let's load the PKCS#7 signature data when finding metadata
right-away, instead of delaying this until we need it. In all cases we
call this there's not much time difference between the metdata finding
and the loading, hence this simplifies things and makes sure root hash
data and its signature is now always acquired together.
With new directive SystemCallLog= it's possible to list system calls to be
logged. This can be used for auditing or temporarily when constructing system
call filters.
---
v5: drop intermediary, update HASHMAP_FOREACH_KEY() use
v4: skip useless debug messages, actually parse directive
v3: don't declare unused variables with old libseccomp
v2: fix build without seccomp or old libseccomp
Define explicit action "kill" for SystemCallErrorNumber=.
In addition to errno code, allow specifying "kill" as action for
SystemCallFilter=.
---
v7: seccomp_parse_errno_or_action() returns -EINVAL if !HAVE_SECCOMP
v6: use streq_ptr(), let errno_to_name() handle bad values, kill processes,
init syscall_errno
v5: actually use seccomp_errno_or_action_to_string(), don't fail bus unit
parsing without seccomp
v4: fix build without seccomp
v3: drop log action
v2: action -> number
Since the loop to check various xcrypt functions is already in place,
adding one more is cheap. And it is nicer to check for the function
directly. People like to backport things, so we might get lucky even
without having libxcrypt.
They are only used under src/home/, but I want to add tests in test-libcrypt-util.c.
And the functions are almost trivial, so I think it is OK to move them to shared.
This lets the libc/xcrypt allocate as much storage area as it needs.
Should fix#16965:
testsuite-46.sh[74]: ==74==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7f3e972e1080 at pc 0x7f3e9be8deed bp 0x7ffce4f28530 sp 0x7ffce4f27ce0
testsuite-46.sh[74]: WRITE of size 131232 at 0x7f3e972e1080 thread T0
testsuite-46.sh[74]: #0 0x7f3e9be8deec (/usr/lib/clang/10.0.1/lib/linux/libclang_rt.asan-x86_64.so+0x9feec)
testsuite-46.sh[74]: #1 0x559cd05a6412 in user_record_make_hashed_password /systemd-meson-build/../build/src/home/user-record-util.c:818:21
testsuite-46.sh[74]: #2 0x559cd058fb03 in create_home /systemd-meson-build/../build/src/home/homectl.c:1112:29
testsuite-46.sh[74]: #3 0x7f3e9b5b3058 in dispatch_verb /systemd-meson-build/../build/src/shared/verbs.c:103:24
testsuite-46.sh[74]: #4 0x559cd058c101 in run /systemd-meson-build/../build/src/home/homectl.c:3325:16
testsuite-46.sh[74]: #5 0x559cd058c00a in main /systemd-meson-build/../build/src/home/homectl.c:3328:1
testsuite-46.sh[74]: #6 0x7f3e9a88b151 in __libc_start_main (/usr/lib/libc.so.6+0x28151)
testsuite-46.sh[74]: #7 0x559cd0583e7d in _start (/usr/bin/homectl+0x24e7d)
testsuite-46.sh[74]: Address 0x7f3e972e1080 is located in stack of thread T0 at offset 32896 in frame
testsuite-46.sh[74]: #0 0x559cd05a60df in user_record_make_hashed_password /systemd-meson-build/../build/src/home/user-record-util.c:789
testsuite-46.sh[74]: This frame has 6 object(s):
testsuite-46.sh[74]: [32, 40) 'priv' (line 790)
testsuite-46.sh[74]: [64, 72) 'np' (line 791)
testsuite-46.sh[74]: [96, 104) 'salt' (line 809)
testsuite-46.sh[74]: [128, 32896) 'cd' (line 810)
testsuite-46.sh[74]: [33152, 33168) '.compoundliteral' <== Memory access at offset 32896 partially underflows this variable
testsuite-46.sh[74]: [33184, 33192) 'new_array' (line 832) <== Memory access at offset 32896 partially underflows this variable
testsuite-46.sh[74]: HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
testsuite-46.sh[74]: (longjmp and C++ exceptions *are* supported)
testsuite-46.sh[74]: SUMMARY: AddressSanitizer: stack-buffer-overflow (/usr/lib/clang/10.0.1/lib/linux/libclang_rt.asan-x86_64.so+0x9feec)
It seems 'struct crypt_data' is 32896 bytes, but libclang_rt wants more, at least 33168?
Two functional changes:
- "/" is now refused. The test is adjusted.
- The trailing NUL is *not* included in the returned size for abstract size. The
comments in sockaddr_un_set_path() indicate that this is the right thing to do,
and the code in socket_address_parse() wasn't doing that.
Closes#12624.
The formatting in systemd.socket.xml is updated a bit.
Currently in_addr_port_ifindex_name_to_string() always prints the ifindex
numerically. This is not super useful since the interface numbers are
semi-random. Should we use interface names in preference?
We would set .type to a fake value. All real callers (outside of tests)
immediately overwrite .type with a proper value after calling
socket_address_parse(). So let's not set it and adjust the few places
that relied on it being set to the fake value.
socket_address_parse() is modernized to only set the output argument on
success.
One special syntax is not supported anymore: "iface:port" would be parsed as an
interface name plus numerical port, equivalent to "[::]%iface:port". This was
added in 542563babd, but was undocumented, and we had no tests for it. It seems
that this actually wasn't doing anything useful, because the kernel only uses the
scope identifier for link-local addresses.
We don't try to resolve invalid ifnames as all. A different return
code is used. This difference will be verified later in test_socket_address_parse()
when socket_address_parse() is converted to use
in_addr_port_ifindex_name_from_string_auto().
The tricky part here is that the function is not allowed to fail in this code
path. Initially, I wanted to change the return value to allow it to fail, but
this cascades through all the places where fstab_test_option() and friends are
used; updating all those sites would be a lot of work. And since quoting is not
allowed here, a simple loop with strcspn() is easy to do.
Let's make sure we keep a reference to the event source
(Note that this code is currently not used, which is why this was never
used: in all cases we do not add listener fds after the event is
attached, but before. In that case this code is not called.)
We support read-only ptyfwd options, and on those the input event source
won't be allocated. Deal with that and don't invoke a function on it
that will then instantly fail.
The new methods work as the unflavoured ones, but takes flags as a
single uint64_t DBUS parameters instead of different booleans, so
that it can be extended without breaking backward compatibility.
Add new flag to allow adding/removing symlinks in
[/etc|/run]/systemd/system.attached so that portable services
configuration files can be self-contained in those directories, without
affecting the system services directories.
Use the new methods and flags from portablectl --enable.
Useful in case /etc is read-only, with only the portable services
directories being mounted read-write.
Let's make libcryptsetup a dlopen() style dep for PID 1 (i.e. for
RootImage= and stuff), systemd-growfs and systemd-repart. (But leave to
be a regulra dep in systemd-cryptsetup, systemd-veritysetup and
systemd-homed since for them the libcryptsetup support is not auxiliary
but pretty much at the core of what they do.)
This should be useful for container images that want systemd in the
payload but don't care for the cryptsetup logic since dm-crypt and stuff
isn't available in containers anyway.
Fixes: #8249
"crypt-util.c" is such a generic name, let's avoid that, in particular
as libc's/libcrypt's crypt() function is so generically named too that
one might thing this is about that. Let's hence be more precise, and
make clear that this is about cryptsetup, and nothing else.
We already had cryptsetup-util.[ch] in src/cryptsetup/ doing keyfile
management. To avoid the needless confusion, let's rename that file to
cryptsetup-keyfile.[ch].
strv_extend_strv_utf8_only() uses a temporary buffer to make the implementation
conscise. Otherwise we'd have to rewrite all of strv_extend_strv() which didn't
seem worth the trouble for this one use outside of a hot path.
If the data is not serializable, we just pretend it doesn't exists.
This fixes#16683 and https://bugs.gentoo.org/735072 in a second way.
They both are both short and contain similar parts and various helper will be
shared between both parts of the code so it's easier to use a single file.
JSON strings must be utf-8-clean. We also verify this in json_parse_string()
so we would reject a message with invalid utf-8 anyway.
It would probably be slightly cheaper to detect non-conformaning strings in
serialization, but then we'd have to fail serialization. By doing this early,
we give the caller a chance to handle the error nicely.
The test is adjusted to contain a valid utf-8 string after decoding of the
utf-32 encoding in json ("विवेकख्यातिरविप्लवा हानोपायः।", something about the
cessation of ignorance).
Upon reception of a message which fails in json_parse(), we would proceed to
parse it again from a deferred callback and hang. Once we have realized that
the message is invalid, let's move the pointer in the buffer even if the
message is invalid. We don't want to look at this data again.
(before) $ build-rawhide/userdbctl --output=json user test.user
n/a: varlink: setting state idle-client
/run/systemd/userdb/io.systemd.Multiplexer: Sending message: {"method":"io.systemd.UserDatabase.GetUserRecord","parameters":{"userName":"test.user","service":"io.systemd.Multiplexer"}}
/run/systemd/userdb/io.systemd.Multiplexer: varlink: changing state idle-client → awaiting-reply
/run/systemd/userdb/io.systemd.Multiplexer: New incoming message: {...}
/run/systemd/userdb/io.systemd.Multiplexer: varlink: changing state awaiting-reply → pending-disconnect
/run/systemd/userdb/io.systemd.Multiplexer: New incoming message: {...}
/run/systemd/userdb/io.systemd.Multiplexer: varlink: changing state pending-disconnect → disconnected
^C
(after) $ n/a: varlink: setting state idle-client
/run/systemd/userdb/io.systemd.Multiplexer: Sending message: {"method":"io.systemd.UserDatabase.GetUserRecord","parameters":{"userName":"test.user","service":"io.systemd.Multiplexer"}}
/run/systemd/userdb/io.systemd.Multiplexer: varlink: changing state idle-client → awaiting-reply
/run/systemd/userdb/io.systemd.Multiplexer: New incoming message: {...}
/run/systemd/userdb/io.systemd.Multiplexer: Failed to parse JSON: Invalid argument
/run/systemd/userdb/io.systemd.Multiplexer: varlink: changing state awaiting-reply → pending-disconnect
/run/systemd/userdb/io.systemd.Multiplexer: varlink: changing state pending-disconnect → processing-disconnect
Got lookup error: io.systemd.Disconnected
/run/systemd/userdb/io.systemd.Multiplexer: varlink: changing state processing-disconnect → disconnected
Failed to find user test.user: Input/output error
This should fix#16683 and https://bugs.gentoo.org/735072.
We would reject various passwords that glibc accepts, for example ""
or any descrypted password. Accounts with empty password are definitely
useful, for example for testing or in scenarios where a password is not
needed. Also, using weak encryption methods is probably not a good idea,
it's not the job of our nss helpers to decide that: they should just
faithfully forward whatever data is there.
Also rename the function to make it more obvious that the returned answer
is not in any way certain.
Instead of assuming that more-recently modified directories have higher mtime,
just look for any mtime changes, up or down. Since we don't want to remember
individual mtimes, hash them to obtain a single value.
This should help us behave properly in the case when the time jumps backwards
during boot: various files might have mtimes that in the future, but we won't
care. This fixes the following scenario:
We have /etc/systemd/system with T1. T1 is initially far in the past.
We have /run/systemd/generator with time T2.
The time is adjusted backwards, so T2 will be always in the future for a while.
Now the user writes new files to /etc/systemd/system, and T1 is updated to T1'.
Nevertheless, T1 < T1' << T2.
We would consider our cache to be up-to-date, falsely.
N_DEVICE_NODE_LIST_ATTEMPTS is unconditionally used since version 246 and
ac1f3ad05f
However, this variable is only defined if HAVE_BLKID is set resulting in
the following build failure if cryptsetup is enabled but not libblkid:
../src/shared/dissect-image.c:1336:34: error: 'N_DEVICE_NODE_LIST_ATTEMPTS' undeclared (first use in this function)
1336 | for (unsigned i = 0; i < N_DEVICE_NODE_LIST_ATTEMPTS; i++) {
|
Fixes:
- http://autobuild.buildroot.org/results/67782c225c08387c1bbcbea9eee3ca12bc6577cd
On systems without an RTC, systemd currently sets the clock to a
compile-time epoch value, derived from the NEWS file in the
repository. This is not ideal as the initial clock hence depends
on the last time systemd was built, not when the image was compiled.
Let's provide a different way here and look at `/usr/lib/clock-epoch`.
If that file exists, it's timestamp for the last modification will be
used instead of the compile-time default.
I find this version much more readable.
Add replacement defines so that when acl/libacl.h is not available, the
ACL_{READ,WRITE,EXECUTE} constants are also defined. Those constants were
declared in the kernel headers already in 1da177e4c3f41524e886b7f1b8a0c1f,
so they should be the same pretty much everywhere.
It appears LOOP_CONFIGURE in 5.8 is even more broken than initially
thought: it doesn't properly propgate lo_sizelimit to the block device
layer. :-(
Let's hence check the block device size immediately after issuing
LOOP_CONFIGURE, and if it doesn't match what we just set let's fallback
to the old ioctls.
This means LOOP_CONFIGURE currently works correctly only for the most
simply case: no partition table logic and no size limit. Sad!
(Kernel people should really be told about the concepts of tests and
even CI, one day!)
This matches how we handle things everywhere else, i.e. in .mount units,
and similar: when a mount point dir is missing, we create it, let's do
so too when dealing with disk images.
This makes things a lot simpler, more robust, and systematic.
fat is a bit more limited in volume name length and UUID support. Let's
add some special support for it.
This is particularly useful to generate EFI system partitions.
Kernel 5.8 gained a hidepid= implementation that is truly per procfs,
which allows us to mount a distinct once into every unit, with
individual hidepid= settings. Let's expose this via two new settings:
ProtectProc= (wrapping hidpid=) and ProcSubset= (wrapping subset=).
Replaces: #11670
We return BUS_ERROR_NO_SUCH_UNIT a.k.a. org.freedesktop.systemd1.NoSuchUnit
in various places. In #16813:
Aug 22 06:14:48 core sudo[2769199]: pam_systemd_home(sudo:account): Failed to query user record: Unit dbus-org.freedesktop.home1.service not found.
Aug 22 06:14:48 core dbus-daemon[5311]: [system] Activation via systemd failed for unit 'dbus-org.freedesktop.home1.service': Unit dbus-org.freedesktop.home1.service not found.
Aug 22 06:14:48 core dbus-daemon[5311]: [system] Activating via systemd: service name='org.freedesktop.home1' unit='dbus-org.freedesktop.home1.service' requested by ':1.6564' (uid=0 pid=2769199 comm="sudo su ")
This particular error comes from bus_unit_validate_load_state() in pid1:
case UNIT_NOT_FOUND:
return sd_bus_error_setf(error, BUS_ERROR_NO_SUCH_UNIT, "Unit %s not found.", u->id);
It seems possible that we should return a different error, but it doesn't really
matter: if we change pid1 to return a different error, we still need to handle
BUS_ERROR_NO_SUCH_UNIT as in this patch to handle pid1 with current code.