Initially I thought this is a good idea, but when reviewing a different PR
(https://github.com/systemd/systemd/pull/13862#discussion_r340604313) I changed
my mind about this. At some point we probably should start warning about the
old option name, and yet later remove it. But it'll make it easier for people
to transition to the new option name if there's a period of support for both
names without any fuss. There's nothing particularly wrong about the old name,
and there is no support cost.
Fixes#13919 (by avoiding the issue completely).
If udevd receives an exit signal, it releases its reference on the udev
monitor in manager_exit(). If at this time a worker is hanging, and if
the event timeout for this worker expires before udevd exits, udevd
crashes in on_sigchld()->udev_monitor_send_device(), because the monitor
has already been freed.
Fix this by releasing the main process's monitor ref later, in
manager_free().
This patch is a new attempt to fix the race originally described in issue #9754.
The initial fix (commit ad96887a12) consisted in
spawning a sub process that became the controlling process of the VT and hence
kicked the old controlling process off to make sure that the VT wouldn't have
entered in HUP state while logind restored the VT.
But it introduced a regression (see issue #11269) and thus was reverted. But
unlike it was described in the revert commit message, commit
adb8688b3f alone doen't fix the initial race.
This patch fixes the race in a simpler way by trying to restore the VT a second
time after making sure to re-open it if the first attempt fails.
Indeed if the old controlling process dies before or during the first attempt,
logind will fail to restore the VT. At this point the VT is in HUP state but
we're sure that it won't enter in a HUP state a second time. Therefore we will
retry by re-opening the VT to clear the HUP state and by restoring the VT a
second time, which should be safe this time.
Fixes: #9754Fixes: #13241
On some systems with lots of devices, device probing for certain drivers can
take a very long time. If systemd-udevd detects a timeout and kills the worker
running modprobe using SIGKILL, some devices will not be probed, or end up in
unusable state. The --event-timeout option can be used to modify the maximum
time spent in an uevent handler. But if systemd-udevd exits, it uses a
different timeout, hard-coded to 30s, and exits when this timeout expires,
causing all workers to be KILLed by systemd afterwards. In practice, this may
lead to workers being killed after significantly less time than specified with
the event-timeout. This is particularly significant during initrd processing:
systemd-udevd will be stopped by systemd when initrd-switch-root.target is
about to be isolated, which usually happens quickly after finding and mounting
the root FS.
If systemd-udevd is started by PID 1 (i.e. basically always), systemd will
kill both udevd and the workers after expiry of TimeoutStopSec. This is
actually better than the built-in udevd timeout, because it's more transparent
and configurable for users. This way users can avoid the mentioned boot problem
by simply increasing StopTimeoutSec= in systemd-udevd.service.
If udevd is not started by systemd (standalone), this is still an
improvement. udevd will kill hanging workers when the event timeout is
reached, which is configurable via the udev.event_timeout= kernel
command line parameter. Before this patch, udevd would simply exit with
workers still running, which would then become zombie processes.
With the timeout removed, the sd_event_now() assertion in manager_exit() can be
dropped.
This makes it easier to see what unit_name_is_valid() returns at a glance.
The output is not whitespace clean, but I think it's good enough for a test.
With meson-0.52.0-1.module_f31+6771+f5d842eb.noarch I get:
src/test/meson.build:19: WARNING: Overriding previous value of environment variable 'PATH' with a new one
When we're using *prepend*, the whole point is to modify an existing variable,
so meson shouldn't warn. But let's set avoid the warning and shorten things by
setting the final value immediately.
We already handle it specially in get_timezones(), hence we should OK it
here too, even if the timezone file doesn't actually exist.
Prompted by:
https://serverfault.com/questions/991172/invalid-time-zone-utc
(Yes, Ubuntu should install the UTC timezone data unconditionally: it
should not be an option, even if all other timezone data is excluded,
but since it's our business to validate user input but not out business
to validate distros, let's just accept "UTC" unconditionally, it's magic
after all)
We were looking at the wrong variable, and would always crash if this
comparison was reached. Fixes#13965.
Also, fix crash (_cleanup_ called on uninitialized variable) if we failed in
error path.
While at it, let's shorten some messages.
When autoclose is set (kernel default but many distributions reverse the
setting) opening a CD-rom device causes the tray to close.
The function of blkid is to report the current state of the device and
not to change it. Hence it should use O_NONBLOCK when opening the
device to avoid closing a CD-rom tray.
blkid is used liberally in scripts so it can potentially interfere with
the user operating the CD-rom hardware.
[kzak@redhat.com: add O_NONBLOCK also to:
- wipefs
- blkid_new_probe_from_filename()
- blkid_evaluate_tag()]
Signed-off-by: Michal Suchanek <msuchanek@suse.de>
Signed-off-by: Karel Zak <kzak@redhat.com>
(cherry picked from commit 39f5af25982d8b0244000e92a9d0e0e6557d0e17)
f5947a5e92 dropped missing.h and
replaced with the more specific headers but did not add
missing_fcntl.h in places that use O_TMPFILE. This is needed for
some older versions of glibc.
Discussed in #13743, the -.service semantic conflicts with the
existing root mount and slice names, making this feature not
uniformly extensible to all types. Change the name to be
<type>.d instead.
Updating to this format also extends the top-level dropin to
unit types.
For all units that aren't timers, if it is activated by another unit,
add the triggering unit under the "TriggeredBy:" header. If a unit can
trigger other units, print the units it triggers other the "Triggers:"
header.
journald assumes that getsockopt(SO_PEERCRED) correctly identifies the
process on the remote end of the socket. However, this is incorrect
according to man 7 socket:
The returned credentials are those that were in effect at the
time of the call to connect(2) or socketpair(2).
This becomes a problem when a new process inherits the stdout stream
from a parent. First, log messages from the child process will
be attributed to the parent. Second, the struct ucred used by journald
becomes invalid as soon as the parent exits. Further sendmsg calls then
fail with ENOENT. Logs for the child process then vanish from the journal.
Fix this by using recvmsg on the stdout stream, and refreshing the cached
struct ucred if SCM_CREDENTIALS indicate a new process.
Fixes#13708