If udevd receives an exit signal, it releases its reference on the udev
monitor in manager_exit(). If at this time a worker is hanging, and if
the event timeout for this worker expires before udevd exits, udevd
crashes in on_sigchld()->udev_monitor_send_device(), because the monitor
has already been freed.
Fix this by releasing the main process's monitor ref later, in
manager_free().
This patch is a new attempt to fix the race originally described in issue #9754.
The initial fix (commit ad96887a12) consisted in
spawning a sub process that became the controlling process of the VT and hence
kicked the old controlling process off to make sure that the VT wouldn't have
entered in HUP state while logind restored the VT.
But it introduced a regression (see issue #11269) and thus was reverted. But
unlike it was described in the revert commit message, commit
adb8688b3f alone doen't fix the initial race.
This patch fixes the race in a simpler way by trying to restore the VT a second
time after making sure to re-open it if the first attempt fails.
Indeed if the old controlling process dies before or during the first attempt,
logind will fail to restore the VT. At this point the VT is in HUP state but
we're sure that it won't enter in a HUP state a second time. Therefore we will
retry by re-opening the VT to clear the HUP state and by restoring the VT a
second time, which should be safe this time.
Fixes: #9754Fixes: #13241
On some systems with lots of devices, device probing for certain drivers can
take a very long time. If systemd-udevd detects a timeout and kills the worker
running modprobe using SIGKILL, some devices will not be probed, or end up in
unusable state. The --event-timeout option can be used to modify the maximum
time spent in an uevent handler. But if systemd-udevd exits, it uses a
different timeout, hard-coded to 30s, and exits when this timeout expires,
causing all workers to be KILLed by systemd afterwards. In practice, this may
lead to workers being killed after significantly less time than specified with
the event-timeout. This is particularly significant during initrd processing:
systemd-udevd will be stopped by systemd when initrd-switch-root.target is
about to be isolated, which usually happens quickly after finding and mounting
the root FS.
If systemd-udevd is started by PID 1 (i.e. basically always), systemd will
kill both udevd and the workers after expiry of TimeoutStopSec. This is
actually better than the built-in udevd timeout, because it's more transparent
and configurable for users. This way users can avoid the mentioned boot problem
by simply increasing StopTimeoutSec= in systemd-udevd.service.
If udevd is not started by systemd (standalone), this is still an
improvement. udevd will kill hanging workers when the event timeout is
reached, which is configurable via the udev.event_timeout= kernel
command line parameter. Before this patch, udevd would simply exit with
workers still running, which would then become zombie processes.
With the timeout removed, the sd_event_now() assertion in manager_exit() can be
dropped.
This makes it easier to see what unit_name_is_valid() returns at a glance.
The output is not whitespace clean, but I think it's good enough for a test.
We compile some c++ code for tests. We would simply use the default options for
those. When the previous commit raised the default warning level, we started
getting warnings from c++ code. Let's add the most important options to the c++
command, so that we get a compilation without any warnings again.
I don't think it makes sense to add *all* the options that we add for c to the
c++ flags, because testing them takes quite a while, and the c++ compilations
are for small amounts of code, mostly to check that the headers have compatible
syntax.
Let's bump up the warning level, and not add by -Wextra by hand. This is the
approach recommended by meson. The idea is that all projects should be as
similar as possible to make it easier for users to switch between projects.
With meson-0.52.0-1.module_f31+6771+f5d842eb.noarch I get:
src/test/meson.build:19: WARNING: Overriding previous value of environment variable 'PATH' with a new one
When we're using *prepend*, the whole point is to modify an existing variable,
so meson shouldn't warn. But let's set avoid the warning and shorten things by
setting the final value immediately.
We already handle it specially in get_timezones(), hence we should OK it
here too, even if the timezone file doesn't actually exist.
Prompted by:
https://serverfault.com/questions/991172/invalid-time-zone-utc
(Yes, Ubuntu should install the UTC timezone data unconditionally: it
should not be an option, even if all other timezone data is excluded,
but since it's our business to validate user input but not out business
to validate distros, let's just accept "UTC" unconditionally, it's magic
after all)
We were looking at the wrong variable, and would always crash if this
comparison was reached. Fixes#13965.
Also, fix crash (_cleanup_ called on uninitialized variable) if we failed in
error path.
While at it, let's shorten some messages.
When autoclose is set (kernel default but many distributions reverse the
setting) opening a CD-rom device causes the tray to close.
The function of blkid is to report the current state of the device and
not to change it. Hence it should use O_NONBLOCK when opening the
device to avoid closing a CD-rom tray.
blkid is used liberally in scripts so it can potentially interfere with
the user operating the CD-rom hardware.
[kzak@redhat.com: add O_NONBLOCK also to:
- wipefs
- blkid_new_probe_from_filename()
- blkid_evaluate_tag()]
Signed-off-by: Michal Suchanek <msuchanek@suse.de>
Signed-off-by: Karel Zak <kzak@redhat.com>
(cherry picked from commit 39f5af25982d8b0244000e92a9d0e0e6557d0e17)
f5947a5e92 dropped missing.h and
replaced with the more specific headers but did not add
missing_fcntl.h in places that use O_TMPFILE. This is needed for
some older versions of glibc.
Discussed in #13743, the -.service semantic conflicts with the
existing root mount and slice names, making this feature not
uniformly extensible to all types. Change the name to be
<type>.d instead.
Updating to this format also extends the top-level dropin to
unit types.
We want users to use Wants, but we'd describe Requires first and ask users to
look for Wants instead. While at it, let's split the wall of text into sensible
paragraphs: syntax first, followed by semantics and longer description, and
finally hints and comparison to other configuration items last.
The value is obviously bogus, but didn't seem to cause problems so far.
With meson-0.52.0, it causes a hang. The number of aliases is always rather
small (usually just one or two, possibly up to a dozen in a few cases), so
even if this causes some looping, it is strange that it has such a huge impact.
But let's just remove it.
Fixes#13742.
Tested with meson-0.52.0-1.module_f31+6771+f5d842eb.noarch,
meson-0.51.1-1.fc29.noarch.