Commit graph

380 commits

Author SHA1 Message Date
Martin Wilck 030f457167 udevd: don't use monitor after manager_exit()
If udevd receives an exit signal, it releases its reference on the udev
monitor in manager_exit(). If at this time a worker is hanging, and if
the event timeout for this worker expires before udevd exits, udevd
crashes in on_sigchld()->udev_monitor_send_device(), because the monitor
has already been freed.

Fix this by testing the validity of manager->monitor in on_sigchld().
2019-11-26 20:38:28 +01:00
Martin Wilck 3cabdc2345 Revert "udevd: fix crash when workers time out after exit is signal caught"
This reverts commit 5db454b8031c58a743cc4ee3d5d1dd01dcff17e8.
See https://github.com/systemd/systemd/issues/14128
2019-11-26 20:38:28 +01:00
Martin Wilck 7b6596d748 udevd: fix crash when workers time out after exit is signal caught
If udevd receives an exit signal, it releases its reference on the udev
monitor in manager_exit(). If at this time a worker is hanging, and if
the event timeout for this worker expires before udevd exits, udevd
crashes in on_sigchld()->udev_monitor_send_device(), because the monitor
has already been freed.

Fix this by releasing the main process's monitor ref later, in
manager_free().
2019-11-12 16:43:42 +01:00
Martin Wilck bfde9421af udevd: wait for workers to finish when exiting
On some systems with lots of devices, device probing for certain drivers can
take a very long time. If systemd-udevd detects a timeout and kills the worker
running modprobe using SIGKILL, some devices will not be probed, or end up in
unusable state. The --event-timeout option can be used to modify the maximum
time spent in an uevent handler. But if systemd-udevd exits, it uses a
different timeout, hard-coded to 30s, and exits when this timeout expires,
causing all workers to be KILLed by systemd afterwards. In practice, this may
lead to workers being killed after significantly less time than specified with
the event-timeout. This is particularly significant during initrd processing:
systemd-udevd will be stopped by systemd when initrd-switch-root.target is
about to be isolated, which usually happens quickly after finding and mounting
the root FS.

If systemd-udevd is started by PID 1 (i.e. basically always), systemd will
kill both udevd and the workers after expiry of TimeoutStopSec. This is
actually better than the built-in udevd timeout, because it's more transparent
and configurable for users. This way users can avoid the mentioned boot problem
by simply increasing StopTimeoutSec= in systemd-udevd.service.

If udevd is not started by systemd (standalone), this is still an
improvement. udevd will kill hanging workers when the event timeout is
reached, which is configurable via the udev.event_timeout= kernel
command line parameter. Before this patch, udevd would simply exit with
workers still running, which would then become zombie processes.

With the timeout removed, the sd_event_now() assertion in manager_exit() can be
dropped.
2019-11-12 12:20:20 +01:00
Yu Watanabe 021cdf8330 tree-wide: drop signal.h when signal-util.h is included 2019-11-04 00:30:32 +09:00
Yu Watanabe 92133d2878 tree-wide: drop socket.h when socket-util.h is included 2019-11-04 00:30:32 +09:00
Yu Watanabe 455fa9610c tree-wide: drop string.h when string-util.h or friends are included 2019-11-04 00:30:32 +09:00
Zbigniew Jędrzejewski-Szmek 0584b17a8c udevd: add helper with error handling to synthesize "change" events
Coverity was unhappy that we ignore the return value from write_string_file().
We should at least warn. CID#1302373.
2019-07-12 00:17:47 +02:00
Zbigniew Jędrzejewski-Szmek 6b413782df udevd: open log output early
We'd log to the "console", losing structured logs during configuration file parsing.
Let's be nice to journalctl users, and log to the journal immediately.
2019-07-04 10:36:30 +02:00
Lubomir Rintel f053fc337e udevd: fix a reversed conditional on global property set
# udevadm control --property=HELLO=WORLD
  Received udev control message (ENV), unsetting 'HELLO'
  # udevadm control --property=HELLO=
  Received udev control message (ENV), setting 'HELLO='

Oh no, it's busted. Let's try removing this one little negation real quick
to see if it helps...

  # udevadm control --property=HELLO=WORLD
  Received udev control message (ENV), setting 'HELLO=WORLD'
  # udevadm control --property=HELLO=
  Received udev control message (ENV), unsetting 'HELLO'

Feels much better now.
2019-06-25 10:20:48 +09:00
Yu Watanabe 99058cd66a udev: propagate errors in udev_event_execute_rules()
And do not set initialized flag to the device.
2019-06-05 10:44:17 +02:00
Yu Watanabe 25de7aa7b9 udev: modernize udev-rules.c
This does the following:
- rename enum udev_builtin_cmd -> UdevBuiltinCmd
- rename struct udev_builtin -> UdevBuiltin
- move type definitions to udev-rules.h
- move prototypes of functions defined in udev-rules.c to udev-rules.h
- drop to use strbuf
- propagate critical errors in applying rules,
- drop limitation for number of tokens per line.
2019-06-02 14:15:26 +09:00
Franck Bui 88bd5a32e8 udevd: change the default value of udev.children-max (again)
Follow-up for faae64fa3d, which increased the
default number of udev workers per cpu regardless of how big the system is.

It's not really clear from the commit message if the new number of workers
improved the overall time for the boot process or only reduced the number of
times the max number of children limit was reached (and in this case
5406c36844 commit might have been more appropriate in the first place).

But systems with ~1000 CPUs are not rare these days and the worker numbers get
quite large with CPU factor of 8. Spawning more than 2000 workers can't be
healthy on any system, no matter how big.

Indeed the main mistake is the belief that udev is CPU-intensive, and thus the
number of allowed workers has to increase with the number of CPUs. It is not,
at probably has never been. It's I/O bound, and sometimes, bound by resources
such as locks.

This is an argument to:

 - scale only weakly with the number of CPUs, and the rationale to switch back
   to a scale factor C=2 but with a higher offset number which should affect
   systems with a small number of CPUs only. With this patch applied the offset
   is increased from O=8 to O=16.

 - put an absolute maximum limit to make sure no more than 2048 workers are
   spawned no matter how big the system is.

This still provides more workers for the laptop cases (where the number of CPUs
is limited), while avoiding sky-rocketing numbers for big systems.

Note that on most desktop systems, the memory limit will kick in. The following
table collects numbers about children-max. For each scenario, the first column
is the "cpu_limit" limit, and the second number is the minimum amount of memory
for the "cpu_limit" limit to become relevant (with less RAM, memory will limit
the number of children thus "mem_limit" will become the active limit).

       |    > v240    |    < v240     |  this patch   |
 CPUs  | C = 8, O = 8 | C = 2, O = 8  | C = 2, O = 16 |
-------------------------------------------------------
   1   |   16      2  |   10    1.3   |   18       2  |
   2   |   24      3  |   12    1.5   |   20       2  |
   4   |   40      5  |   16      2   |   24       3  |
   8   |   72      9  |   24      3   |   32       4  |
  16   |  136     17  |   40      5   |   48       5  |
  64   |  520     65  |  136     17   |  144      18  |
1024   | 8200   1025  | 2056    263   | 2048     256  |
2048   |16392   2049  | 4104    513   | 2048     256  |

This patch is mainly based on Martin Wilck's analyze and comments.
2019-05-16 23:09:41 +02:00
Franck Bui 5406c36844 udevd: notify when max number value of children is reached only once per batch of events
When booting with "udev.log-priority=debug" for example, the output might be
spammed with messages like this:

    systemd-udevd[23545]: maximum number (248) of children reached
    systemd-udevd[23545]: maximum number (248) of children reached
    systemd-udevd[23545]: maximum number (248) of children reached
    systemd-udevd[23545]: maximum number (248) of children reached
    systemd-udevd[23545]: maximum number (248) of children reached
    systemd-udevd[23545]: maximum number (248) of children reached
    systemd-udevd[23545]: maximum number (248) of children reached

While the message itself is useful, printing it per batch of events should be
enough.
2019-04-26 09:58:12 +02:00
Yu Watanabe 76e62a4d22 udev: move udev_ctrl_cleanup() into manager_free() 2019-04-01 15:20:03 +02:00
Lennart Poettering eefc66aa8f util: split out some stuff into a new file limits-util.[ch] 2019-03-13 12:16:43 +01:00
Yu Watanabe d4d690facd udevd: use device_get_action() and device_get_seqnum() 2019-03-12 03:49:53 +09:00
Yu Watanabe d02c6f5461 udev-ctrl: use sd_event and introduce udev_ctrl_start()
Now the new callback function get enum udev_ctrl_msg_type.
So, this commit also drops udev_ctrl_connection and udev_ctrl_msg.
2019-02-20 06:17:42 +09:00
Yu Watanabe 204e9c3e29 udev: drop unused Manager::uevent_event 2019-02-20 06:17:42 +09:00
Yu Watanabe 100bc5bf98 udev-ctrl: make udev_ctrl_new() return negative errno on failure 2019-02-20 06:17:42 +09:00
Lennart Poettering bbd8598f8b
Merge pull request #11589 from yuwata/udevd-is-device-busy
udevd: refactoring is_device_busy()
2019-02-15 12:16:37 +01:00
Yu Watanabe 31cbd20253 udev: check whether systemd is running, and do not use cg_kill() if not
Fixes #11645.
2019-02-13 17:07:56 +01:00
Yu Watanabe 8671411668 udevd: save the result of devnum or ifindex blocker 2019-01-29 14:07:21 +01:00
Yu Watanabe baa461fc05 udevd: always block follwoing events with same devpath
Originally commented as "devices names might have changed/swapped in the meantime",
but may not. For safety, let's block the following events with same
devpath.

This may fix #6514.
2019-01-29 14:04:43 +01:00
Yu Watanabe 0bd0407efc udevd: trivial refactoring of is_device_busy() 2019-01-29 13:53:19 +01:00
Yu Watanabe d30f43eec5 udevd: rename control command SYNC -> PING in log message 2019-01-18 16:10:01 +01:00
Yu Watanabe c809f33719 udev: also update log level for LOG_REALM_SYSTEMD
Now, not a few udevd debug logs come from sd-device or sd-hwdb.
Only setting LOG_REALM_UDEV may not sufficient to debug.
We have already similar code in main() and udevadm.
2019-01-13 07:37:49 +09:00
Yu Watanabe e0b7a5d151 udevd: refuse devices which do not have ACTION property 2019-01-12 09:32:20 +09:00
Yu Watanabe 33ad742a84 udevd: drop unnecessary brackets 2019-01-12 09:32:20 +09:00
Yu Watanabe c0ff3d6cbc udevd: make worker also log ACTION property 2019-01-12 09:32:20 +09:00
Yu Watanabe 25d4f5b071 udevd: reject devices which do not have SEQNUM 2019-01-12 09:32:20 +09:00
Yu Watanabe 956833b417 udevd: provide worker_hash_ops and drop manager_workers_free() 2019-01-12 09:32:20 +09:00
Yu Watanabe d40534643b udevd: use structured initializer at one more place 2019-01-12 09:32:20 +09:00
Yu Watanabe 1f3f6bd007 udevd: use worker_free() on failure in worker_new()
Otherwise, worker_monitor may not unrefed correctly.
2019-01-12 09:32:20 +09:00
Zbigniew Jędrzejewski-Szmek b5af8c8cdf udev: open control and netlink sockets before daemonization
c4b69e990f effectively moved the initalization of socket.
Before that commit:
run → listen_fds → udev_ctrl_new → udev_ctrl_new_from_fd → socket()
After:
run → main_loop → manager_new → udev_ctrl_new_from_fd → socket()

The problem is that main_loop was called after daemonization. Move manager_new
out of main_loop and before daemonization.

Fixes #11314 (hopefully ;)).

v2: Yu Watanabe
sd_event is initialized in main_loop().
2019-01-10 14:09:15 +09:00
Zbigniew Jędrzejewski-Szmek 44dcf454b6 udevd: drop redundant call to sd_event_get_exit_code
sd_event_loop returns the same thing anyway.
2019-01-10 14:09:01 +09:00
Zbigniew Jędrzejewski-Szmek 88514e7bae
Merge pull request #11244 from yuwata/revert-udev-changes
udev: revert bind/unbind patch and one more
2019-01-03 16:28:30 +01:00
Yu Watanabe ff86c92e30 Revert "udevd: configure a child process name for worker processes"
This reverts commit 49f3ee7e74.
2019-01-02 04:01:07 +09:00
Zbigniew Jędrzejewski-Szmek 681bd2c524 meson: generate version tag from git
$ build/systemctl --version
systemd 239-3555-g6178cbb5b5
+PAM +AUDIT +SELINUX +IMA -APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD -IDN2 +IDN +PCRE2 default-hierarchy=hybrid
$ git tag v240 -m 'v240'
$ ninja -C build
ninja: Entering directory `build'
[76/76] Linking target fuzz-unit-file.
$ build/systemctl --version
systemd 240
+PAM +AUDIT +SELINUX +IMA -APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD -IDN2 +IDN +PCRE2 default-hierarchy=hybrid

This is very useful during development, because a precise version string is
embedded in the build product and displayed during boot, so we don't have to
guess answers for questions like "did I just boot the latest version or the one
from before?".

This change creates an overhead for "noop" builds. On my laptop, 'ninja -C
build' that does nothing goes from 0.1 to 0.5 s. It would be nice to avoid
this, but I think that <1 s is still acceptable.

Fixes #7183.

PACKAGE_VERSION is renamed to GIT_VERSION, to make it obvious that this is the
more dynamically changing version string.

Why save to a file? It would be easy to generate the version tag using
run_command(), but we want to go through a file so that stuff gets rebuilt when
this file changes. If we just defined an variable in meson, ninja wouldn't know
it needs to rebuild things.
2018-12-21 13:43:20 +01:00
Lennart Poettering 95cde1ed24
Merge pull request #11159 from keszybz/udev-typedef
Udev typedef and normal error reporting
2018-12-17 16:19:10 +01:00
Zbigniew Jędrzejewski-Szmek 2e08871534 udev: use typedef for struct udev_event 2018-12-17 09:27:24 +01:00
Zbigniew Jędrzejewski-Szmek 9a07157dd5 udev: use typedef for struct udev_rules 2018-12-17 09:27:21 +01:00
Zbigniew Jędrzejewski-Szmek 1d79128121 udev: make udev_rules_new() return a proper error code 2018-12-14 10:20:43 +01:00
Yu Watanabe 903893237a sd-device: do not change buffer size if the socket is already bound
From the results of CIs in #11076, changing buffer size may cause
issue #10754. So, let's prohibit to change the size if it is already
bound.

This also reverts commit 986ab0d2dc.
2018-12-14 09:33:06 +09:00
Lennart Poettering 49f3ee7e74 udevd: configure a child process name for worker processes 2018-12-01 12:50:45 +01:00
Lennart Poettering ece0fe12ad tree-wide: (void)ify some setsid() and related calls 2018-12-01 12:50:45 +01:00
Lennart Poettering a0ee3d93bb
Merge pull request #10919 from yuwata/sd-device-monitor-fixes
sd-device-monitor: several fixes
2018-11-26 18:08:41 +01:00
Lennart Poettering 49fe5c0996 tree-wide: port various places over to STARTSWITH_SET() 2018-11-26 14:08:46 +01:00
Yu Watanabe 6acf1ccba5 udevd: do not call udev_ctrl_enable_receiving() if the socket is passed from pid1
Before c4b69e990f, if the socket fd is
passed from pid1, `udev_ctrl_enable_receiving()` was not called.
Let's preserve the original logic.
2018-11-26 11:57:48 +09:00
Yu Watanabe 986ab0d2dc udevd: do not set buffer size if the socket is passed from pid1
Before c4b69e990f, if the socket fd is
passed from pid1, `udev_monitor_set_receive_buffer_size()` (now it is
a wrapper of `sd_device_monitor_set_receive_buffer_size()`) was not
called. Let's preserve the original logic.
2018-11-25 20:17:25 +09:00