We already refuse sd_event_add_signal() if the specified signal is not
blocked, let's do this also for sd_event_add_child(), since we might
need signalfd() to implement this, and this means the signal needs to be
blocked.
This adds support for watching for process exits via Linux new pidfd
concept. This makes watching processes and killing them race-free if
properly used, fixing a long-standing UNIX misdesign.
This patch adds implicit and explicit pidfd support to sd-event: if a
process shall be watched and is specified by PID we will now internally
create a pidfd for it and use that, if available. Alternatively a new
constructor for child process event sources is added that takes pidfds
as input.
Besides mere watching of child processes via pidfd two additional
features are added:
→ sd_event_source_send_child_signal() allows sending a signal to the
process being watched in the safest way possible (wrapping
the new pidfd_send_signal() syscall).
→ sd_event_source_set_child_process_own() allows marking a process
watched for destruction as soon as the event source is freed. This is
currently implemented in userspace, but hopefully will become a kernel
feature eventually.
Altogether this means an sd_event_source object is now a safe and stable
concept for referencing processes in race-free way, with automatic
fallback to pre-pidfd kernels.
Note that this patch adds support for this only to sd-event, not to PID
1. That's because PID 1 needs to use waitid(P_ALL) for reaping any
process that might get reparented to it. This currently semantically
conflicts with pidfd use for watching processes since we P_ALL is
undirected and thus might reap process earlier than the pidfd notifies
process end, which is hard to handle. The kernel will likely gain a
concept for excluding specific pidfds from P_ALL watching, as soon as
that is around we can start making use of this in PID 1 too.
This is not a new system call at all (since kernel 2.2), however it's
not exposed in glibc (a wrapper is exposed however in sigqueue(), but it
substantially simplifies the system call). Since we want a nice fallback
for sending signals on non-pidfd systems for pidfd_send_signal() let's
wrap rt_sigqueueinfo() since it takes the same siginfo_t parameter.
When running the program with udev_event_spawn it is possible to miss
output in stdout when the program exits causing the result to be empty
which can cause rules using the result to not function correctly.
This is due to the on_spawn_sigchld callback being processed while IO is
still pending and causing the event loop to exit.
To correct this the sigchld event source is made a lower priority than
the other event sources to ensure it is processed after IO. This
requires changing the IO event source to oneshot and re-enabling it when
valid data is read but not for EOF, this prevents the empty pipes
constantly generating IO events.
Reloading the SELinux label cache here enables a light-wight follow-up of a SELinux policy change, e.g. adding a label for a RuntimeDirectory.
Closes: #13363
From v4.12 the kernel appends some attributes to netlink acks
containing a textual description of the error and other fields.
This makes sd-netlink parse the attributes.
This makes the code do what the documentation says. The code had no inkling
about initrd.target, so I think this change is fairly risky. As a fallback,
default.target will be loaded, so initramfses which relied on current behaviour
will still work, as along as they don't have a different initrd.target.
In an initramfs created with recent dracut:
$ ls -l usr/lib/systemd/system/{default.target,initrd.target}
lrwxrwxrwx. usr/lib/systemd/system/default.target -> initrd.target
-rw-r--r--. usr/lib/systemd/system/initrd.target
So at least for dracut, there should be no difference.
Also avoid a pointless allocation.
We can break if KEYCTL_READ return value is equal to our buffer size.
From keyctl(2):
On a successful return, the return value is always the total size of
the payload data. To determine whether the buffer was of sufficient
size, check to see that the return value is less than or equal to the
value supplied in arg4.
The function capability_ambient_set_apply() now drops capabilities not
in the capability_ambient_set(), so it is necessary to call it when
the ambient set is empty.
Fixes#13163
Change test_set_ambient_caps() to test_apply_ambient_caps(), since the
function capability_ambient_set_apply() not only sets ambient
capabilities, but clears inherited capabilities that are not explicitly
requested by the caller.
Follow-up for 0d588deae2.
In that commit the output got moved a 2 chars to the right, hence make
sure to also shift the cgroup tree to the right, so that it gets
properly aligned under the cgroup path again.
cryptsetup introduced optional locking scheme that should serialize
unlocking keyslots which use memory hard key derivation
function (argon2). Using the serialization should prevent OOM situation
in early boot while unlocking encrypted volumes.
This partially reverts a07a7324ad.
We have two pieces of information: the value and a boolean.
config_parse_timeout_abort() added in the reverted commit would write
the boolean to the usec_t value, making a mess.
The code is reworked to have just one implementation and two wrappers
which pass two pointers.
The conf-parser machinery already removed whitespace before and after "=", no
need to repeat this step.
The test is adjusted to pass. It was testing an code path that doesn't happen
normally, no point in doing that.