Pretty much all intel cpus have had RDRAND in a long time. While
CPU-internal RNG are widely not trusted, for seeding hash tables it's
perfectly OK to use: we don't high quality entropy in that case, hence
let's use it.
This is only hooked up with 'high_quality_required' is false. If we
require high quality entropy the kernel is the only source we should
use.
This makes two changes to the namespacing code:
1. We'll only gracefully skip service namespacing on access failure if
exclusively sandboxing options where selected, and not mount-related
options that result in a very different view of the world. For example,
ignoring RootDirectory=, RootImage= or Bind= is really probablematic,
but ReadOnlyPaths= is just a weaker sandbox.
2. The namespacing code will now return a clearly recognizable error
code when it cannot enforce its namespacing, so that we cannot
confuse EPERM errors from mount() with those from unshare(). Only the
errors from the first unshare() are now taken as hint to gracefully
disable namespacing.
Fixes: #9844#9835
The fstab entry may contain comment/application-specific options, like
for example x-systemd.automount or x-initrd.mount.
With the recent switch to libmount, the mount options during remount are
now gathered via mnt_fs_get_options(), which returns the merged fstab
options with the effective options in mountinfo.
Unfortunately if one of these application-specific options are set in
fstab, the remount will fail with -EINVAL.
In systemd 238:
Remounting '/test-x-initrd-mount' read-only in with options
'errors=continue,user_xattr,acl'.
In systemd 239:
Remounting '/test-x-initrd-mount' read-only in with options
'errors=continue,user_xattr,acl,x-initrd.mount'.
Failed to remount '/test-x-initrd-mount' read-only: Invalid argument
So instead of using mnt_fs_get_options(), we're now using both
mnt_fs_get_fs_options() and mnt_fs_get_vfs_options() and merging the
results together so we don't get any non-relevant options from fstab.
Signed-off-by: aszlig <aszlig@nix.build>
A follow-up for commit 9d874aec45.
This patch makes "path" parameter mandatory in fd_set_*() helpers removing the
need to use fd_get_path() when NULL was passed. The caller is supposed to pass
the fd anyway so assuming that it also knows the path should be safe.
Actually, the only case where this was useful (or used) was when we were
walking through directory trees (in item_do()). But even in those cases the
paths could be constructed trivially, which is still better than relying on
fd_get_path() (which is an ugly API).
A very succinct test case is also added for 'z/Z' operators so the code dealing
with recursive operators is tested minimally.
Let's fold get_user_creds_clean() into get_user_creds(), and introduce a
flags argument for it to select "clean" behaviour. This flags parameter
also learns to other new flags:
- USER_CREDS_SYNTHESIZE_FALLBACK: in this mode the user records for
root/nobody are only synthesized as fallback. Normally, the synthesized
records take precedence over what is in the user database. With this
flag set this is reversed, and the user database takes precedence, and
the synthesized records are only used if they are missing there. This
flag should be set in cases where doing NSS is deemed safe, and where
there's interest in knowing the correct shell, for example if the
admin changed root's shell to zsh or suchlike.
- USER_CREDS_ALLOW_MISSING: if set, and a UID/GID is specified by
numeric value, and there's no user/group record for it accept it
anyway. This allows us to fix#9767
This then also ports all users to set the most appropriate flags.
Fixes: #9767
[zj: remove one isempty() call]
On the host these symlinks are created by udev, and we consider them API
and make use of them ourselves at various places. Hence when running a
private /dev, also create these symlinks so that lookups by major/minor
work in such an environment, too.
This is some extra protection for sloppy "golden master" systems, where
images are duplicated many times but the random seed is not
deleted (or reset for each copy). That golden master systems have to
reset /etc/machine-id is better known, and easier to notice (as having
the same ID will result in address conflicts and suchlike quite often).
Hence let's write the machine ID into /dev/urandom, in case it has been
initialized and unlikely the stored random seed has been provisioned
differently on each image.
Note that we don't credit the entropy either way, hence in the case
there's a cycle of a) generating the machine-id early at boot and b)
writing it back into /dev/urandom late at boot it shouldn't matter. It's
never going to make things worse, just in a few cases better.
When stdin/stdout/stderr is initialized from an fd, let's read the tty
name of it if we can, and pass that to PAM.
This makes sure that "machinectl shell" sessions have proper TTY fields
initialized that "loginctl" then shows.
This is a bit like the info link in most of GNU's --help texts, but we
don't do info but man pages, and we make them properly clickable on
terminal supporting that, because awesome.
I think it's generally advisable to link up our (brief) --help texts and
our (more comprehensive) man pages a bit, so this should be an easy and
straight-forward way to do it.
This fixes a memory leak:
```
d5070e2f67ededca022f81f2941900606b16f3196b2268e856295f59._openpgpkey.gmail.com: resolve call failed: 'd5070e2f67ededca022f81f2941900606b16f3196b2268e856295f59._openpgpkey.gmail.com' not found
=================================================================
==224==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 65 byte(s) in 1 object(s) allocated from:
#0 0x7f71b0878850 in malloc (/usr/lib64/libasan.so.4+0xde850)
#1 0x7f71afaf69b0 in malloc_multiply ../src/basic/alloc-util.h:63
#2 0x7f71afaf6c95 in hexmem ../src/basic/hexdecoct.c:62
#3 0x7f71afbb574b in string_hashsum ../src/basic/gcrypt-util.c:45
#4 0x56201333e0b9 in string_hashsum_sha256 ../src/basic/gcrypt-util.h:30
#5 0x562013347b63 in resolve_openpgp ../src/resolve/resolvectl.c:908
#6 0x562013348b9f in verb_openpgp ../src/resolve/resolvectl.c:944
#7 0x7f71afbae0b0 in dispatch_verb ../src/basic/verbs.c:119
#8 0x56201335790b in native_main ../src/resolve/resolvectl.c:2947
#9 0x56201335880d in main ../src/resolve/resolvectl.c:3087
#10 0x7f71ad8fcf29 in __libc_start_main (/lib64/libc.so.6+0x20f29)
SUMMARY: AddressSanitizer: 65 byte(s) leaked in 1 allocation(s).
```
The references to the dns_server are now setup after the tls connection is setup.
This ensures that the stream got fully stopped when the initial tls setup failed
instead of having the unref being blocked by the reference to the stream by the server.
Therefore on_stream_io would no longer be called with a half setup encrypted connection.
Fixes the issue reported in #9838.
Previously, we'd act immediately on StopWhenUnneeded= when a unit state
changes. With this rework we'll maintain a queue instead: whenever
there's the chance that StopWhenUneeded= might have an effect we enqueue
the unit, and process it later when we have nothing better to do.
This should make the implementation a bit more reliable, as the unit notify event
cannot immediately enqueue tons of side-effect jobs that might
contradict each other, but we do so only in a strictly ordered fashion,
from the main event loop.
This slightly changes the check when to consider a unit "unneeded".
Previously, we'd assume that a unit in "deactivating" state could also
be cleaned up. With this new logic we'll only consider units unneeded
that are fully up and have no job queued. This means that whenever
there's something pending for a unit we won't clean it up.
The function replaces a couple commas, a semicolon and the final newline with
zero bytes in the string passed to it. The 'const' seems to have been added
by accident during a bulk edit (more specifically 3b3154df7e).
The qgroup logic (types 'q' and 'Q') only has an effect if there's no previous
setup at all, and any explicitly configured subvolumes with their qgroups are
left entirely unmodified.
The idea is that if users want a different logic than the one we set up by
default, then by all means they should do that before hand, and tmpfiles won't
override their logic.
tmpfiles now passes an O_PATH fd to btrfs_subvol_make_fd() under the
assumption it will accept it like mkdirat() does. So far this assumption
was wrong, let's correct that.
Without that tmpfiles' on btrfs file systems failed systematically...
fd_get_path() is an ugly API, as it creates ambiguities related to the
" (deleted)" suffix /proc/$PID/fd/$FD shows. Let's use it a bit less
excessively, and whenever we have a good valid path already, let's
simply pass that along, instead of forgetting it in one stackframe and
reacquiring it in the next.
Even if built without gcrypt, show the relevant options in help message.
Otherwise, the help message diverges from the man page or suggestions
by the shell completion.
The "features" fields is parsed as a tristate value. The values
are thus not of type NetDevFeature enum but int. The NetDevFeature
enum is instead the index for the features array.
Adjust the type. In practice, this had no impact because NetDevFeature
enum commonly has size of int.
Also, don't use memset() 0xFF to initilize the int with -1. While
it works correctly in practice, it feels ugly.
This makes it possible to wait until boot is finished without having to poll
for this command repeatedly, instead using the syntax:
$ systemctl is-system-running --wait
Waiting is implemented by waiting for the StartupFinished signal to be posted
on the bus.
Register the matcher before checking for the property to avoid race conditions.
Tested by artificially delaying startup with a oneshot service and calling this
command, checked that it emitted `running` and exited with a 0 return code as
soon as the delay service completed startup.
Also tested that booting to degraded state unblocks the command.
Inserted a delay between getting the property and waiting for the signal and
confirmed this seems to work free of race conditions.
Updated the --help text (under --wait) and the man page to document the new
feature.
This function doesn't really implement ordering, but CMP() is still fine to use
there. Keep the comment in place, just update it slightly to indicate that.
Looked for definitions of functions using the *_compare_func() suffix.
Tested:
- Unit tests passed (ninja -C build/ test)
- Installed this build and booted with it.
Macro returns -1, 0, 1 depending on whether a < b, a == b or a > b.
It's safe to use on unsigned types.
Add tests to confirm corner cases are properly covered.
Drop __extension__, since we don't use gcc -Wpedantic or -ansi.
Reformat code for spacing. Add spaces after commas almost everywhere.
Reindent code blocks in macro definitions, for consistency.
--machine has been missing for a while in systemd-stdio-bridge
this syntax can be switched to be more standard.
v2: Support the old syntax too.
timedatectl -H server1.myhostingcompany.com:5555/container1
Closes: #8071
Previously, we'd only ever read 512 byte from the random seed file,
under the assumption we won't need more. With this change we'll read the
full file, even if it is larger.
The idea behind htis change is that people can dump additional data into the
random seed file offline if they like, and it can be low quality, and
we'll seed the pool with it anyway. Moreover, if people are paranoid and
want us to save/restore a bigger seed, it's easy to do: just truncate
the file to the right size and we'll save/restore as much in the future.
This also reworks the file a bit, introducing two clear if blocks that
load and that save the random seed, and that each are conditionalized
more carefully.
On target boards without RTC, `t->kernel_time` is 0 or 1 usec.
`systemd-analyze` reads this value over D-Bus from
`org.freedesktop.systemd1.Manager`, property `KernelTimestamp`.
The issue is: if `t->kernel_time` is 0, `systemd-analyze` does not print
the kernel time:
~~~~
$ systemd-analyze
Startup finished in 1.860s (userspace) = 5.957s
~~~~
This commit fixes the misbehaviour:
~~~~
$ systemd-analyze
Startup finished in 3.866s (kernel) + 2.015s (userspace) = 5.881s
~~~~
Fixes#7721.
v2: fixes one more condition (by Yu Watanabe <watanabe.yu+github@gmail.com>)
v3: fixes one more condition (by Kirill Marinushkin <kmarinushkin@de.adit-jv.com>)
RootImage= may require the following settings
```
DeviceAllow=/dev/loop-control rw
DeviceAllow=block-loop rwm
DeviceAllow=block-blkext rwm
```
This adds the following settings implicitly when RootImage= is
specified.
Fixes#9737.
When clients don't follow protocol and use the same object from
different threads, then we previously would silently corrupt memory.
With this assert we'll fail with an assert(). This doesn't fix anything
but certainly makes mis-uses easier to detect and debug.
Triggered by https://bugzilla.redhat.com/show_bug.cgi?id=1609349
As the comments already say it might be quite likely that
$XDG_RUNTIME_DIR is not set up as mount, and we shouldn't complain about
that.
Moreover, let's make this idempotent, so that a runtime dir that is
already gone and is removed again doesn't cause failure.
Test it when sending an FD without any contents, or an FD and some contents,
or only contents and no FD (using a bare send().)
Also fix the previous test which forked but was missing an _exit() at the
end of the child execution code.
These take a struct iovec to send data together with the passed FD.
The receive function returns the FD through an output argument. In case data is
received, but no FD is passed, the receive function will set the output
argument to -1 explicitly.
Update code in dynamic-user to use the new helpers.
We would verify destination e.g. in sd_bus_message_new_call, but allow setting
any value later on with sd_bus_message_set_destination. I assume this check was
omitted not on purpose.
The choice what errors to ignore is left to the caller, and the caller is
changed to ignore all errors.
On error, previously read data is kept. So if e.g. an oom error happens, we
will continue to return slightly stale data instead of pretending we have no
entries for the given address. I think that's better, for example when
/etc/hosts contains some important overrides that external DNS should not be
queried for.
We'd store every 0.0.0.0 and ::0 entry as a structure without any addresses
allocated. This is a somewhat common use case, let's optimize it a bit.
This gives some memory savings and a bit faster response time too:
'time build/test-resolved-etc-hosts hosts' goes from 7.7s to 5.6s, and
memory use as reported by valgrind for ~10000 hosts is reduced
==18097== total heap usage: 29,902 allocs, 29,902 frees, 2,136,437 bytes allocated
==18240== total heap usage: 19,955 allocs, 19,955 frees, 1,556,021 bytes allocated
Also rename 'suppress' to 'found' (with reverse meaning). I think this makes
the intent clearer.