There are two ways in swich sd_login_* functions acquire data:
some are derived from the cgroup path, but others use the data serialized
by logind.
When the tests are executed under Fedora's mock, without systemd-spawn
but instead in a traditional chroot, test-login gets confused:
the "outside" cgroup path is visible, so sd_pid_get_unit() and
sd_pid_get_session() work, but sd_session_is_active() and other functions
that need logind data fail.
Such a buildroot setup is fairly bad, but it can be encountered in the wild, so
let's just skip the tests in that case.
/* Information printed is from the live system */
sd_pid_get_unit(0, …) → "session-237.scope"
sd_pid_get_user_unit(0, …) → "n/a"
sd_pid_get_slice(0, …) → "user-1000.slice"
sd_pid_get_session(0, …) → "237"
sd_pid_get_owner_uid(0, …) → 1000
sd_pid_get_cgroup(0, …) → "/user.slice/user-1000.slice/session-237.scope"
sd_uid_get_display(1000, …) → "(null)"
sd_uid_get_sessions(1000, …) → [0] ""
sd_uid_get_seats(1000, …) → [0] ""
Assertion 'r >= 0' failed at src/libsystemd/sd-login/test-login.c:104, function test_login(). Aborting.
Will be used by nftables nfnetlink backend.
It sends a series of netlink messages that form a nftables
update transaction.
The transaction will then generate a series of ack messages
(or an error).
This function will be used to read these acks.
nftables uses a transaction-based netlink model: one netlink write
comes with multiple messages.
A 'BEGIN' message to tell nf_tables/kernel that a new transaction starts.
Then, one more messages to add/delete tables/chains/rules etc.
Lastly, an END message that commits all changes.
This function will be used to send all the individual messages that should
make up a single transaction as a single write.
add nfnetlink_nftables helper functions to:
* open a new nfnetlink socket to kernel
* add tables, chains, rules, sets and maps
* delete/flush table
* add and delete elements from sets/maps
This is unfortunately harder to implement than it sounds. The user's bus
is bound a to the user's lifecycle after all (i.e. only exists as long
as the user has at least one PAM session), and the path dynamically (at
least theoretically, in practice it's going to be the same always)
generated via $XDG_RUNTIME_DIR in /run/.
To fix this properly, we'll thus go through PAM before connecting to a
user bus. Which is hard since we cannot just link against libpam in the
container, since the container might have been compiled entirely
differently. So our way out is to use systemd-run from outside, which
invokes a transient unit that does PAM from outside, doing so via D-Bus.
Inside the transient unit we then invoke systemd-stdio-bridge which
forwards D-Bus from the user bus to us. The systemd-stdio-bridge makes
up the PAM session and thus we can sure tht the bus exists at least as
long as the bus connection is kept.
Or so say this differently: if you use "systemctl -M lennart@foobar"
now, the bus connection works like this:
1. sd-bus on the host forks off:
systemd-run -M foobar -PGq --wait -pUser=lennart -pPAMName=login systemd-stdio-bridge
2. systemd-run gets a connection to the "foobar" container's
system bus, and invokes the "systemd-stdio-bridge" binary as
transient service inside a PAM session for the user "lennart"
3. The systemd-stdio-bridge then proxies our D-Bus traffic to
the user bus.
sd-bus (on host) → systemd-run (on host) → systemd-stdio-bridge (in container)
Complicated? Well, to some point yes, but otoh it's actually nice in
various other ways, primarily as it makes the -H and -M codepaths more
alike. In the -H case (i.e. connect to remote host via SSH) a very
similar three steps are used. The only difference is that instead of
"systemd-run" the "ssh" binary is used to invoke the stdio bridge in a
PAM session of some other system. Thus we get similar implementation and
isolation for similar operations.
Fixes: #14580
So far when asked for augmented bus credentials and the process was
already gone we'd fail fatally. Let's make this graceful instead, and
never allow augmenting fail due to PID having vanished — unless the
augmenting is the explicit and only purpose of the requested operation.
This should be safe as clients have to explicitly query the acquired
creds anyway and handle if they couldn't be acquired. Moreover we
already handle permission problems gracefully, thus clients must be
ready to deal with missing creds.
This is useful to make selinux authorization work for short-lived client
proceses. PReviously we'd augment creds to have more info to log about
(the selinux decision would not be based on augmented data however,
because that'd be unsafe), and would fail if we couldn't get it. Now,
we'll try to acquire the data, but if we cannot acquire it, we'll still
do the selinux check, except that logging will be more limited.
Let's clean up hostname_is_valid() a bit: let's turn the second boolean
argument into a more explanatory flags field, and add a flag that
accepts the special name ".host" as valid. This is useful for the
container logic, where the special hostname ".host" refers to the "root
container", i.e. the host system itself, and can be specified at various
places.
let's also get rid of machine_name_is_valid(). It was just an alias,
which is confusing and even more so now that we have the flags param.
The commit 6f3ac0d517 drops the prefix and
suffix in TAGS= property. But there exists several rules that have like
`TAGS=="*:tag:*"`. So, the property must be always prefixed and suffixed
with ":".
Fixes#17930.
Let's a concept of "rate limiting" to event sources: if specific event
sources fire too often in some time interval temporarily take them
offline, and take them back online once the interval passed.
This is a simple scheme of avoiding starvation of event sources if some
event source fires too often.
This introduces the new conceptual states of "offline" and "online" for
event sources: an event source is "online" only when enabled *and* not
ratelimited, and offline in all other cases. An event source that is
online hence has its fds registered in the epoll, its signals in the
signalfd and so on.
So far we used these fields to organize the earliest/latest timer event
priority queue. In a follow-up commit we want to introduce ratelimiting
to event sources, at which point we want any kind of event source to be
able to trigger time wakeups, and hence they all need to be included in
the earliest/latest prioqs. Thus, in preparation let's make this
generic.
No change in behaviour, just some shifting around of struct members from
the type-specific to the generic part.
sd_event_prepare() invokes callbacks that might drop the last user ref
on our event loop. Let's make sure we keep an explicit ref around it, so
that we won't end up with an invalid pointer. Similar in sd_event_run().
Basically, any function that is publically callable that might end up
invoking callbacks should ref the relevant objects to be protected
against callbacks destroying these objects while we still want to access
them. We did this correctly in sd_event_dispatch() and sd_event_loop(),
but these are not the only ones which are callable from the outside.
If /sys/class/OOO node is created and destroyed during booting (kernle driver initialization fails),
systemd-udev-trigger.service fails due to race condition.
***** race condition ***********************************************************************************
1. kernel driver create /sys/class/OOO
2. systemd-udev-trigger.service execues "/usr/bin/udevadm trigger --type=devices --action=add"
3. device_enumerator_scan_devices() => enumerator_scan_devices_all() => enumerator_scan_dir("class") =>
opendir("/sys/class") and iterate all subdirs ==> enumerator_scan_dir_and_add_devices("/sys/class/OOO")
4. kernel driver fails and destroy /sys/class/OOO
5. enumerator_scan_dir_and_add_devices("/sys/class/OOO") fails in opendir("/sys/class/OOO")
6. "systemd-udev-trigger.service" fails
7. udev coldplug fails and some device units not ready
8. mount units asociated with device units fail
9. local-fs.target fails
10. enters emergency mode
********************************************************************************************************
***** status of systemd-udev-trigger.service unit ******************************************************
$ systemctl status systemd-udev-trigger.service
systemd-udev-trigger.service - udev Coldplug all Devices
Loaded: loaded (/usr/lib/systemd/system/systemd-udev-trigger.service; static; vendor preset: enabled)
Active: failed (Result: exit-code) since Thu 2020-01-02 13:16:54 KST; 22min ago
Docs: man:udev(7)
man:systemd-udevd.service(8)
Process: 2162 ExecStart=/usr/bin/udevadm trigger --type=subsystems --action=add (code=exited, status=0/SUCCESS)
Process: 2554 ExecStart=/usr/bin/udevadm trigger --type=devices --action=add (code=exited, status=1/FAILURE)
Main PID: 2554 (code=exited, status=1/FAILURE)
Jan 02 13:16:54 localhost udevadm[2554]: Failed to scan devices: No such file or directory
Jan 02 13:16:54 localhost systemd[1]: systemd-udev-trigger.service: Main process exited, code=exited, status=1/FAILURE
Jan 02 13:16:54 localhost systemd[1]: systemd-udev-trigger.service: Failed with result 'exit-code'.
Jan 02 13:16:54 localhost systemd[1]: Failed to start udev Coldplug all Devices.
*******************************************************************************************************
***** journal log with Environment=SYSTEMD_LOG_LEVEL=debug in systemd-udev-trigger.service ***********
Jan 01 21:57:20 localhost udevadm[2039]: sd-device-enumerator: Scanning /sys/bus
Jan 01 21:57:20 localhost udevadm[2522]: sd-device-enumerator: Scan all dirs
Jan 01 21:57:20 localhost udevadm[2522]: sd-device-enumerator: Scanning /sys/bus
Jan 01 21:57:21 localhost udevadm[2522]: sd-device-enumerator: Scanning /sys/class
Jan 01 21:57:21 localhost udevadm[2522]: sd-device-enumerator: Failed to scan /sys/class: No such file or directory
Jan 01 21:57:21 localhost udevadm[2522]: Failed to scan devices: No such file or directory
*******************************************************************************************************
... when called with a valid environment variable name. This means that
any time we call it with a fixed string, it is guaranteed to return 0.
(Also when the variable is not present in the environment block.)
Coverity in CID#1435966 was complaining that s->enabled is not "restored" in
all cases. But the code was actually correct, since it should only be
"restored" in the error paths. But let's still make this prettier by not setting
the state before all operations that may fail are done.
We need to set .enabled for the prioq reshuffling operations, so move those down.
No functional change intended.