CID#996458. Coverity warns that we trust desc->bLength as read in
the input data to adjust our position in the buffer. This value could
be anything, leading to overflow. It's unlikely that the kernel feeds
us invalid data, but let's be more careful.
If any error is encountered, more logs are given.
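
The kind of check this calls for can be sketched as follows. This is a minimal illustration, not the code from the patch: struct desc_header and count_descriptors() are hypothetical names, and the only assumption about the data is that each descriptor starts with its own length in bLength.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical minimal descriptor header: each descriptor starts with its
 * own length, followed by its type. */
struct desc_header {
        uint8_t bLength;
        uint8_t bDescriptorType;
};

/* Count the descriptors in buf without trusting bLength blindly.
 * Returns the number of descriptors, or -1 if the data is malformed. */
int count_descriptors(const uint8_t *buf, size_t size) {
        size_t pos = 0;
        int n = 0;

        assert(buf);

        while (pos < size) {
                /* The two-byte header itself must fit in what is left. */
                if (size - pos < sizeof(struct desc_header))
                        return -1;

                uint8_t len = buf[pos];

                /* A length shorter than the header would stall the walk;
                 * a length past the end of the buffer would overflow it. */
                if (len < sizeof(struct desc_header) || len > size - pos)
                        return -1;

                pos += len;
                n++;
        }

        return n;
}
```

Both bounds matter: a bLength of 0 would loop forever, and an oversized one would move the position past the end of the buffer.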
This adds /usr/local/lib/udev/rules.d to the search path on non-split-usr systems.
On split-usr systems, the paths with /usr/-prefixes are added too.
In the past, on split-usr systems, it made sense to only load rules from
/lib/udev/rules.d, because /usr could be mounted late. But we don't support running
without /usr since 80758717a6, so in practice it doesn't matter whether the
rules files are in /lib/udev/rules.d or /usr/lib/udev/rules.d. Distributions
that maintain the illusion of functional split-usr are welcome to simply not put any
files in /usr/lib/udev/rules.d/.
In practice this doesn't change much, but it makes udev more consistent with the
rest of the systemd suite.
This also sets a lower log level for the messages.
6e2efb6c73 introduced the log messages.
But udevd may be started with --resolve-names=never, and the behavior
is expected.
Fixes #11720.
According to the specification[1] the 'capabilities' describe the physical
device as a whole and the 'device_caps' describe the current device node.
The existence of 'device_caps' is indicated by the V4L2_CAP_DEVICE_CAPS
capability flag.
Use the 'device_caps' if available to generate the correct
ID_V4L_CAPABILITIES for the current device node.
This is relevant for UVC devices with current kernels: Two /dev/videoX
devices exist for those. One for video and one for metadata. The
V4L2_CAP_VIDEO_CAPTURE flag is present in the 'capabilities' for both
device nodes but only in the 'device_caps' of the video device node.
Without this, the ID_V4L_CAPABILITIES of the metadata device node
incorrectly contains 'capture'.
[1] https://www.linuxtv.org/downloads/v4l-dvb-apis-new/uapi/v4l/vidioc-querycap.html
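
A minimal sketch of the selection logic described above. The SKETCH_* constants are local stand-ins that are meant to mirror the flag values in <linux/videodev2.h> (an assumption, so that the example builds without kernel headers), and struct sketch_capability is a hypothetical subset of struct v4l2_capability.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins for the V4L2_CAP_* flags; assumed to match the kernel header. */
#define SKETCH_CAP_VIDEO_CAPTURE 0x00000001u
#define SKETCH_CAP_META_CAPTURE  0x00800000u
#define SKETCH_CAP_DEVICE_CAPS   0x80000000u

/* Hypothetical subset of struct v4l2_capability. */
struct sketch_capability {
        uint32_t capabilities; /* the physical device as a whole */
        uint32_t device_caps;  /* the current device node only */
};

/* Return the capability bits that describe the current device node. */
uint32_t node_capabilities(const struct sketch_capability *c) {
        assert(c);

        /* device_caps is only meaningful if the driver says it filled it in. */
        if (c->capabilities & SKETCH_CAP_DEVICE_CAPS)
                return c->device_caps;

        return c->capabilities;
}
```

For a UVC metadata node, 'capabilities' still carries the capture flag of the whole device, but the value returned here does not.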
The original comment said "devices names might have changed/swapped in the meantime",
but that may not be the case. For safety, let's block the following events with the same
devpath.
This may fix #6514.
Fixes #3374. The problem is that we set MACPolicy=persistent (i.e. we would
like to generate persistent MAC addresses for interfaces which don't have a
fixed MAC address), but various virtual interfaces including bridges, tun/tap,
bonds, etc., do not have the necessary ID_NET_NAME_* attributes and udev
would not assign the address and warn:
Could not generate persistent MAC address for $name: No such file or directory
Basic requirements which I think a solution for this needs to satisfy:
1. No changes to MAC address generation for those cases which are currently
handled successfully. This means that net_get_unique_predictable_data() must
keep returning the same answer, which in turn means net_get_name() must keep
returning the same answer. We can only add more things we look at with lower
priority so that we start to cover cases which were not covered before.
2. Like 1, but for IPvLL seed and DHCP IAID. This is less important, but "nice
to have".
3. Keep MACPolicy=persistent. If people don't want it, they can always apply
local configuration, but in general stable MACs are a good thing. I have never
seen anyone complain about that.
== Various approaches that have been proposed
=== https://github.com/systemd/systemd/issues/3374#issuecomment-223753264 (tomty89)
if !ID_BUS and INTERFACE, use INTERFACE
I think this almost does the good thing, but I don't see the reason to reject ID_BUS
(i.e. physical hardware). Stable MACs are very useful for physical hardware that has
no physical MAC.
=== https://github.com/systemd/systemd/issues/3374#issuecomment-224733069 (teg)
if (should_rename(device, true))
This means looking at name_assign_type. In particular for
NET_NAME_USER should_rename(..., true) returns true. It only returns false
for NET_NAME_PREDICTABLE. So this would cover stuff like br0, bond0, etc,
but would not cover lo and other devices with predictable names. That doesn't
make much sense.
But did teg mean should_rename() or !should_rename()?
=== https://github.com/systemd/systemd/issues/3374#issuecomment-234628502 (tomty89):
+ if (!should_rename(device, true))
+ return udev_device_get_sysname(device)
This covers only devices with NET_NAME_PREDICTABLE. Since the problem applies as
much to bridges and such, this isn't enough.
=== https://github.com/systemd/systemd/issues/3374#issuecomment-281745967 (grafi-tt)
+ /* if the machine doesn't provide data about the device, use the ifname specified by userspace
+ * (this is the case when the device is virtual, e.g., bridge or bond) */
+ s = udev_device_get_sysattr_value(device, "name_assign_type");
+ if (s && safe_atou(s, &type) >= 0 && type == NET_NAME_USER)
+ return udev_device_get_sysname(device);
This does not cover bond0, vnet0, tun/tap and similar.
grafi-tt also proposes patching the kernel, but *not* setting name_assign_type
seems intentional in those cases, because the device name is a result of
enumeration, not set by the userspace.
=== https://github.com/systemd/systemd/issues/3374#issuecomment-288882355 (tomty89)
(also PR #11372)
- MACAddressPolicy=persistent
This breaks requirement 3 above. It would solve the immediate problem, but I
think the disruption is too big.
=== This patch
This patch means that we will set a "stable" MAC for pretty much any virtual
device by default, where "stable" means keyed off the machine-id and interface
name.
It seems like a big change, but we already did this for most physical devices.
Doing it also for virtual devices doesn't seem like a big issue. It will make
the setup and monitoring of virtualized networks slightly nicer. I don't think
anyone is depending on having the MAC address changed when those devices are
destroyed and recreated. If they do, they'd have to change MACAddressPolicy=.
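
To illustrate what "keyed off the machine-id and interface name" means, here is a rough sketch. It is not the code from this patch: FNV-1a is used only as a stand-in hash, and stable_mac() is a hypothetical name. The point is that the same two inputs always give the same address, and that the result is marked as a locally administered unicast address.

```c
#include <assert.h>
#include <stdint.h>

/* FNV-1a over a string, used here purely as a stand-in hash. */
static uint64_t fnv1a(uint64_t h, const char *s) {
        for (; *s; s++) {
                h ^= (uint8_t) *s;
                h *= 0x100000001b3ULL;
        }
        return h;
}

/* Derive a MAC address from the machine ID and the interface name. */
void stable_mac(const char *machine_id, const char *ifname, uint8_t mac[6]) {
        uint64_t h;

        assert(machine_id);
        assert(ifname);
        assert(mac);

        h = fnv1a(0xcbf29ce484222325ULL, machine_id);
        h = fnv1a(h, "/"); /* separator between the two inputs */
        h = fnv1a(h, ifname);

        for (int i = 0; i < 6; i++)
                mac[i] = (uint8_t) (h >> (8 * i));

        mac[0] &= 0xfe; /* clear the multicast bit */
        mac[0] |= 0x02; /* set the locally administered bit */
}
```

Destroying and recreating br0 on the same machine gives the same address again, which is the behaviour change discussed above.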
== Implementation
net_get_name() is called from dhcp_ident_set_iaid() so I didn't change
net_get_name() like in grafi-tt's patch, but net_get_unique_predictable_data().
net_get_unique_predictable_data() is called from get_mac() in link-config.c
and sd_ipv4ll_set_address_seed(), so both of those code paths are affected
and will now get data in some cases where they errored out previously.
The return code is changed to -ENODATA since that gives a nicer error string.
If "keep" policy is specified, and the interface has a name that is
NET_NAME_USER or NET_NAME_RENAMED, we stop processing rules. "keep" should
probably be specified either first or last depending on the preference.
This partially reimplements 55b6530baa, in the
sense that if the "keep" policy is not specified, and if the interface has
a NamingPolicy, it will be renamed, even if it had a name previously.
So this breaks backwards compatibility in this case, but that's more in line
with what users expect.
Closes #9006.
Which policy we decide to use is rather important, but this bit of information
wasn't logged. Let's always do that.
The code was also written in a confusing way, which probably contributed to the
unintended effects of 55b6530baa and other commits.
We would loop over all policies, and note if "kernel" was specified, and then
possibly unset the result at the end. Let's immediately log the result and cut
to the end if we can figure out the answer.
No functional change intended, except for the new log lines.
Using goto is not very elegant, but we can't use break because of the switch,
and there are multiple conditions to break the loop, so using goto is cleanest.
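
The control flow argument can be shown with a small sketch. The policy names and the two conditions are hypothetical and much simpler than the real code; what matters is that "break" inside the switch only leaves the switch, so leaving the loop from several places needs a goto.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical policy list; the real enum has different members. */
typedef enum {
        POLICY_KEEP,
        POLICY_KERNEL,
        POLICY_PATH,
} policy_t;

/* Return the first policy that yields a name, or -1 if none does. */
int pick_policy(const policy_t *policies, size_t n, int name_is_final, int has_path_name) {
        int chosen = -1;

        assert(policies || n == 0);

        for (size_t i = 0; i < n; i++)
                switch (policies[i]) {
                case POLICY_KEEP:
                case POLICY_KERNEL:
                        if (name_is_final) {
                                chosen = policies[i];
                                /* "break" would only leave the switch, not the loop. */
                                goto finish;
                        }
                        break;
                case POLICY_PATH:
                        if (has_path_name) {
                                chosen = policies[i];
                                goto finish;
                        }
                        break;
                }

finish:
        /* A single place where the decision can be logged. */
        return chosen;
}
```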
This reverts commit 55b6530baa.
This commit description says "Always rename an interface to its name specified
in config if no NamePolicy= is specified", but it does much more:
1. It completely changes the meaning of NamePolicy=kernel. Before, it meant that an interface
with type==NAMEPOLICY_KERNEL would not be renamed. After, the kernel name only works as
a fallback, if no policy matches.
2. The "if no NamePolicy= is specified" part is not true at all, the interface will be renamed
according to the specified NamePolicy=.
After 55b6530baa, the should_rename() function is named very misleadingly: it is only used
to mean "respect kernel predictable name if no naming policy matches".
Let's revert, and start with a clean slate. This fixes #11436.
Before:
IMPORT builtin 'hwdb' fails: No such file or directory
After:
IMPORT builtin 'hwdb' fails: No data available
The previous log message is confusing and may be understood to mean that the hwdb file does not exist.
Nowadays quite a few udevd debug logs come from sd-device or sd-hwdb, so
setting only LOG_REALM_UDEV may not be sufficient for debugging.
We already have similar code in main() and udevadm.
Found by inspecting results of running this small program:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, const char **argv) {
        for (int i = 1; i < argc; i++) {
                FILE *f;
                char line[1024], prev[1024], *r;
                int lineno;

                prev[0] = '\0';
                lineno = 1;

                f = fopen(argv[i], "r");
                if (!f)
                        exit(1);

                /* Report every line that is identical to the one before it. */
                do {
                        r = fgets(line, sizeof(line), f);
                        if (!r)
                                break;
                        if (strcmp(line, prev) == 0)
                                printf("%s:%d: error: dup %s", argv[i], lineno, line);
                        lineno++;
                        strcpy(prev, line);
                } while (!feof(f));
                fclose(f);
        }
}
c4b69e990f effectively moved the initialization of the socket.
Before that commit:
run → listen_fds → udev_ctrl_new → udev_ctrl_new_from_fd → socket()
After:
run → main_loop → manager_new → udev_ctrl_new_from_fd → socket()
The problem is that main_loop was called after daemonization. Move manager_new
out of main_loop and before daemonization.
Fixes #11314 (hopefully ;)).
v2: Yu Watanabe
sd_event is initialized in main_loop().
When running PROGRAM="...", we would log
systemd-udevd[447]: Failed to wait spawned command '...': Input/output error
no matter why the program actually failed, at error level.
The code wouldn't distinguish between an internal failure and a failure in the
program being called and run sd_event_exit(..., -EIO) on any kind of error. EIO
is rather misleading here, because it suggests a serious error.
on_spawn_sigchld is updated to set the return code to distinguish failure to
spawn, including the program being killed by a signal (a negative return value),
and the program failing (positive return value).
The logging levels are adjusted, so that for PROGRAM= calls, which are
essentially "if" statements, we only log at debug level (unless we get a
timeout or segfault or another unexpected error).
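
The return code convention and the log levels can be sketched roughly like this. It is an illustration under assumptions, not the code from the patch: child_end_t stands in for the si_code field that the handler reads from siginfo_t, both function names are hypothetical, and the exact levels chosen for the non-PROGRAM= cases are a guess.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <syslog.h>

/* Hypothetical summary of how the child ended. */
typedef enum {
        CHILD_EXITED, /* ran to completion, exit status available */
        CHILD_KILLED, /* terminated by a signal */
        CHILD_DUMPED, /* terminated by a signal and dumped core */
} child_end_t;

/* 0: success, > 0: the program itself failed, < 0: it did not run to completion. */
int spawn_result(child_end_t how, int exit_status) {
        assert(exit_status >= 0);

        if (how == CHILD_EXITED)
                return exit_status;

        return -EIO;
}

/* A failing PROGRAM= is an ordinary "false" condition, so only the
 * unexpected outcomes deserve more than a debug message. */
int spawn_log_level(int result, bool is_program_key) {
        if (result < 0)
                return LOG_ERR;
        if (result > 0 && !is_program_key)
                return LOG_WARNING;
        return LOG_DEBUG;
}
```

With this split, a caller can tell "the program said no" (positive) from "something went wrong while running it" (negative) without parsing log output.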
The idea was that those vars could be configured to 'no' to not install the .pc
files, or they could be set to '', and then they would be built but not
installed. This was inherited from the autoconf build system. This couldn't
work because '' is replaced by the default value. Also, having this level of
control doesn't seem necessary, since creating those files is very
quick. Skipping with 'no' was implemented only for systemd.pc and not the other
.pc files. Let's simplify things and skip installation if the target dir
is configured as 'no' for all .pc files.
$ build/systemctl --version
systemd 239-3555-g6178cbb5b5
+PAM +AUDIT +SELINUX +IMA -APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD -IDN2 +IDN +PCRE2 default-hierarchy=hybrid
$ git tag v240 -m 'v240'
$ ninja -C build
ninja: Entering directory `build'
[76/76] Linking target fuzz-unit-file.
$ build/systemctl --version
systemd 240
+PAM +AUDIT +SELINUX +IMA -APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD -IDN2 +IDN +PCRE2 default-hierarchy=hybrid
This is very useful during development, because a precise version string is
embedded in the build product and displayed during boot, so we don't have to
guess answers for questions like "did I just boot the latest version or the one
from before?".
This change creates an overhead for "noop" builds. On my laptop, 'ninja -C
build' that does nothing goes from 0.1 to 0.5 s. It would be nice to avoid
this, but I think that <1 s is still acceptable.
Fixes #7183.
PACKAGE_VERSION is renamed to GIT_VERSION, to make it obvious that this is the
more dynamically changing version string.
Why save to a file? It would be easy to generate the version tag using
run_command(), but we want to go through a file so that stuff gets rebuilt when
this file changes. If we just defined a variable in meson, ninja wouldn't know
it needs to rebuild things.
PROJECT_VERSION is used in preparation for future changes. Let's simplify the
code by using structured initialization. If the string written to .version ever
becomes too long, the compiler will truncate it and tell us:
../src/udev/udev-ctrl.c: In function ‘ctrl_send’:
../src/udev/udev-ctrl.c:221:28: warning: initializer-string for array of chars is too long
.version = "udev-" STRINGIFY(R_VERSION),
^~~~~~~
../src/udev/udev-ctrl.c:221:28: note: (near initialization for ‘ctrl_msg_wire.version’)
No functional change.
Let's not use atoi() if we can simply provide the project version as a number.
In C code, this is the numerical project version. In substitutions in other
files, this is just the bare substitution.
The "PACKAGE_" prefix is from autotools, and is strange. We call systemd a
"project", and "package" is something that distros build. Let's rename.
PACKAGE_URL is renamed to PROJECT_URL for the same reasons and for consistency.
(This leaves PACKAGE_VERSION as the stringified define for C code.)
_c is misleading because .h files should be included in those lists too
(this tells meson that the build outputs should be rebuilt if the header
files change).
Follow-up for 1437822638.
We had two very similar functions: device_read_db_aux and device_read_db,
and a number of wrappers for them:
device_read_db_aux
← device_read_db (in sd-device.c)
← all functions in sd-device.c, including sd_device_is_initialized
← device_read_db_force
← event_execute_rules_on_remove (in udev-event.c)
device_read_db (in device-private.c)
← functions in device_private.c (but not device_read_db_force):
device_get_devnode_{mode,uid,gid}
device_get_devlink_priority
device_get_watch_handle
device_clone_with_db
← called from udevadm, udev-{node,event,watch}.c
Before 7141e4f62c (sd-device: don't retry loading
uevent/db files more than once), the two implementations were the same. In that
commit, device_read_db_aux was changed. Those changes were reverted in the parent
commit, so the two implementations are now again the same except for superficial
differences. This commit removes device_read_db (in sd-device.c), and renames
device_read_db_aux to device_read_db_internal and makes everyone use this one
implementation. There should be no functional change.
Judging from the CI results in #11076, changing the buffer size may cause
issue #10754. So let's prohibit changing the size if it is already
bound.
This also reverts commit 986ab0d2dc.
This is useful for distributions, where the stability of interface names should
be preserved after an upgrade of systemd. So when some specific release of the
distro is made available, systemd defaults to the latest & greatest naming
scheme, and subsequent updates set the same default. This default may still
be overridden through the kernel and env var options.
A special value "latest" is also allowed. Without a specific name, it is harder
to override from meson. In case of 'combo' options, meson reads the default
during the initial configuration, and "remembers" this choice. When systemd is
updated, old build/ directories could keep the old default, which would be
annoying. Hence, "latest" is introduced to make it explicit, yet follow the
upstream. This is actually useful for the user too, because it may be used
as an override, without having to actually specify a version.
With this we can stabilize how naming works for network interfaces. A
user can request through a kernel cmdline option or an env var which
scheme to follow. The idea is that installers use this to set into stone
(a very soft stone though) the scheme used during installation so that
interface naming doesn't change afterwards anymore.
Why use env vars and kernel cmdline options, and not a config file of
its own?
Well, first of all there's no obvious existing one to use. But more
importantly: I have the feeling that this logic is kind of an incomplete
hack, and I simply don't want to advertise this as a perfectly
working solution. So far we used env vars for the not-so-official
options and proper config files for the official stuff. Given how
incomplete this logic is (i.e. the big variable for naming remains the
kernel, which might expose sysfs attributes in newer versions that we
check for and didn't exist in older versions — and other problems like
this), I am simply not confident in giving this first-class exposure in
a primary configuration file.
Fixes: #10448
This is convenient when working with device units in systemd. Instead of
converting the systemd unit name to a path to feed to udevadm, udevadm
info|trigger can be called directly on the unit name.
The man page is reworked a bit to describe the modern syntax with positional
arguments first. It's just simpler to use than the positional options.
udevadm would dump help() output, instead of printing a message about what is
wrong. That's just bad UX. Let's use a different message if the argument is
missing, and a different one if it is invalid.
Also, rework the code to separate the business logic from argument parsing.
Let's not use "default:" in switch statements. This way, the compiler will warn
us if we miss one of the cases.
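
As an illustration of the pattern (the enum and its members are hypothetical, not the ones touched by this commit): when every enumerator has its own case and there is no "default:" label, GCC and Clang with -Wswitch (enabled by -Wall) report any enumerator that is added later without a matching case.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical action enum, for illustration only. */
typedef enum {
        ACTION_QUERY,
        ACTION_ATTRIBUTE_WALK,
        ACTION_DEVICE_ID_FILE,
} action_t;

const char *action_to_string(action_t a) {
        /* No "default:" label, so a new enumerator without a case
         * triggers a compiler warning here. */
        switch (a) {
        case ACTION_QUERY:
                return "query";
        case ACTION_ATTRIBUTE_WALK:
                return "attribute-walk";
        case ACTION_DEVICE_ID_FILE:
                return "device-id-file";
        }

        /* Only reached for values outside the enum. */
        return NULL;
}
```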
Whenever we invoke external, foreign code from code that has
RLIMIT_NOFILE's soft limit bumped to high values, revert it to 1024
first. This is a safety precaution for compatibility with programs using
select() which cannot operate with fds > 1024.
This commit adds the call to rlimit_nofile_safe() to all invocations of
exec{v,ve,l}() and friends that either are in code that we know runs
with RLIMIT_NOFILE bumped up (which is PID 1 and all journal code for
starters) or that is part of shared code that might end up there.
The calls are placed as early as we can in processes invoking a flavour
of execve(), but after the last time we do fd manipulations, so that we
can still take benefit of the high fd limits for that.
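
A minimal sketch of what such a helper could look like. This is not the implementation of rlimit_nofile_safe(); sketch_rlimit_nofile_safe() is a hypothetical name, and the sketch assumes that only the soft limit needs to be lowered while the hard limit stays as it is.

```c
#include <assert.h>
#include <sys/resource.h>

/* Highest soft limit that select()-based programs can cope with. */
#define SKETCH_NOFILE_SAFE 1024

/* Lower the soft RLIMIT_NOFILE to 1024 if it is higher; leave the hard
 * limit alone. Returns 0 on success, -1 with errno set on failure. */
int sketch_rlimit_nofile_safe(void) {
        struct rlimit rl;

        if (getrlimit(RLIMIT_NOFILE, &rl) < 0)
                return -1;

        assert(rl.rlim_cur <= rl.rlim_max);

        /* Nothing to do if the soft limit is already low enough. */
        if (rl.rlim_cur <= SKETCH_NOFILE_SAFE)
                return 0;

        rl.rlim_cur = SKETCH_NOFILE_SAFE;
        return setrlimit(RLIMIT_NOFILE, &rl);
}
```

Called in the child right before execve(), after the last fd manipulation, this keeps the high limit for our own code and hands the conservative one to the foreign program. Because the hard limit is untouched, that program can still raise its soft limit again if it knows how to deal with high fds.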
Before c4b69e990f, if the socket fd is
passed from pid1, `udev_monitor_set_receive_buffer_size()` (now it is
a wrapper of `sd_device_monitor_set_receive_buffer_size()`) was not
called. Let's preserve the original logic.
Ideally, coccinelle would strip unnecessary braces too. But I do not see any
option in coccinelle for this, so instead, I edited the patch text using
search&replace to remove the braces. Unfortunately this is not fully automatic,
in particular it didn't deal well with if-else-if-else blocks and ifdefs, so
there is an increased likelihood of bugs in such spots.
I also removed part of the patch that coccinelle generated for udev, where we
return -1 for failure. This should be fixed independently.
This removes the call to log_close(), and refactors how fork() is done. Now
the parent also goes through normal cleanup. This isn't necessary to use the
macro, but it feels cleaner this way.
This removes a call to log_close(). I don't think this should matter.
The call to mac_selinux_init() is moved after parse_argv(). We probably
don't need selinux when printing help().
It was only used in one place, where we don't actually need it, and
it is too easy to forget to update it when adding new items to the table.
Let's just drop it.
All users of the macro (except for one, in serialize.c), use the macro in
connection with read_line(), so they must include fileio.h. Let's not play
libc games and require multiple header files to be included for the most common
use of a function.
The removal of def.h includes is not exact. I mostly went over the commits that
switch over to use read_line() and add def.h at the same time and reverted the
addition of def.h in those files.
My logs are full of:
systemd-udevd[6586]: seq 13515 queued, 'add' 'block'
systemd-udevd[6586]: seq 13516 queued, 'change' 'block'
systemd-udevd[6586]: seq 13517 queued, 'change' 'block'
systemd-udevd[6586]: seq 13518 queued, 'remove' 'bdi'
systemd-udevd[6586]: seq 13519 queued, 'remove' 'block'
systemd-udevd[9865]: seq 13514 processed
systemd-udevd[9865]: seq 13515 running
systemd-udevd[9865]: GROUP 6 /usr/lib/udev/rules.d/50-udev-default.rules:59
systemd-udevd[9865]: IMPORT builtin 'blkid' /usr/lib/udev/rules.d/60-persistent-storage.rules:95
systemd-udevd[9865]: IMPORT builtin 'blkid' fails: No such file or directory
systemd-udevd[9865]: loop4: Failed to add device '/dev/loop4' to watch: No such file or directory
(the last line is at error level).
If we are too slow to set up a watch and the device is already gone by the time
we try, this is not an error.
Rebooting to change the kernel command line in order to set some udev parameters is
inconvenient. Let's allow setting more stuff in the config file.
Also drop quotes from around "info" in udev.conf. We need to accept them for
compatibility, but there is no reason to use them.
It was always set to one third of timeout_usec, so let's simplify things by
calculating it using a helper function right before it is used.
Before 9d9264ba39, udevd.c would avoid setting
timeout_warn_usec to 0, using 1 instead. This wasn't necessary, because when
timeout_warn_usec is finally used in spawn_wait(), it is ignored if
timeout_usec is 0 or timeout_warn_usec is 0. So there was no need to handle
this case specially.
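
A sketch of the idea, with hypothetical names (sketch_warn_timeout() and arm_warn_timer() are not taken from the patch) and plain integer division assumed for "one third":

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t usec_t;

/* Derive the warning timeout on demand instead of storing it. */
usec_t sketch_warn_timeout(usec_t timeout_usec) {
        return timeout_usec / 3;
}

/* Store the warning timeout in *ret_warn_usec and say whether a warning
 * timer should be armed at all: a zero timeout or a zero warning timeout
 * both mean "no warning". */
bool arm_warn_timer(usec_t timeout_usec, usec_t *ret_warn_usec) {
        assert(ret_warn_usec);

        *ret_warn_usec = sketch_warn_timeout(timeout_usec);
        return timeout_usec > 0 && *ret_warn_usec > 0;
}
```

Since the consumer already ignores a zero value, the caller does not need to special-case it by substituting 1.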
The function util_log_priority() is almost the same as
log_level_from_string(). The only difference between them is that
util_log_priority() also accepts strings such as '3 hogehoge'.
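
The difference can be illustrated with two small parsers. These are hypothetical stand-ins, not the two functions named above, and they only handle the numeric form (symbolic names such as "info" are left out).

```c
#include <assert.h>
#include <stdlib.h>

/* Lenient: parse the leading number and ignore whatever follows.
 * Returns -1 if the string does not start with a number in 0..7. */
int parse_level_lenient(const char *s) {
        char *end;
        long v;

        assert(s);

        v = strtol(s, &end, 10);
        if (end == s || v < 0 || v > 7)
                return -1;

        return (int) v;
}

/* Strict: the whole string must be the number. */
int parse_level_strict(const char *s) {
        char *end;
        long v;

        assert(s);

        v = strtol(s, &end, 10);
        if (end == s || *end != '\0' || v < 0 || v > 7)
                return -1;

        return (int) v;
}
```

Only the lenient variant accepts '3 hogehoge'; the strict one rejects the trailing garbage.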
The uevent handling in udevd is not CPU hungry.
So, let's increase the default number of workers per CPU.
This decreases the number of queued uevents.
Without this commit (children_max is 16 on my laptop)
```
$ journalctl -b -u systemd-udevd.service | grep reached | wc -l
1544
```
With this commit (children_max is 30 on my laptop)
```
$ journalctl -b -u systemd-udevd.service | grep reached | wc -l
7
```
During the boot process, many worker processes are forked and killed.
To reduce the cycles of forking and killing workers, let's wait
3 seconds before killing workers. If a new uevent or inotify event
arrives within the delay, the killing process is cancelled.