Previously, .device units generated by SYSTEMD_ALIAS= udev properties
are not supported to specify devices for e.g. 'udevadm info'.
Before:
```
$ udevadm info sys-subsystem-net-devices-enp0s31f6.device
Unknown device "sys-subsystem-net-devices-enp0s31f6.device": No such device
```
After:
```
$ ./udevadm info sys-subsystem-net-devices-enp0s31f6.device
P: /devices/pci0000:00/0000:00:1f.6/net/enp0s31f6
L: 0
E: DEVPATH=/devices/pci0000:00/0000:00:1f.6/net/enp0s31f6
E: INTERFACE=enp0s31f6
E: IFINDEX=2
E: SUBSYSTEM=net
E: USEC_INITIALIZED=25317523
E: ID_NET_NAMING_SCHEME=v245
(snip)
```
Negative value means there is no match between a PCI device and any of
the slots. In the following commit we will extend this and value of 0
will indicate that there is a match between some slot and PCI device,
but that device is a PCI bridge.
This fixes a race where a block device that pops up and immediately is
locked (such as a loopback device in preparation) might result in
udev never run any rules for it, and thus never turn on inotify watching
for it (as inotify watching is controlled via an option set via udev
rules), thus not noticing when the device is unlocked/closed again
(which is noticed via IN_CLOSE_WRITE inotify events).
This changes two things:
1. Whenever we encounter a locked block device we'll now inotify watch
it, so that it is guaranteed we'll notice when the BSD lock fd is
closed again, and will reprobe.
2. We'll now turn off inotify watching again once we realise the
udev rules don't actually want that. Previously, once watching a
device was enabled via a udev rule, it would be watched forever until
the device disappeared, even if the option was dropped by the rules
for later events.
Together this will make sure that we'll watch the device via inotify
in both of the following cases:
a) The block device has been BSD locked when udev wanted to look at it
b) The udev rules run for the last seen event for the device say so
In all other cases inotify is off for block devices.
This new behaviour both fixes the race, but also makes the most sense,
as the rules (when they are run) actually really control the watch state
now. And if someone BSD locks a block device then it should be OK to
inotify watch it briefly until the lock is released again as the user
this way more or less opts into the locking protocol.
This effectively reverts commit 67acde4869.
After commits 569ad251ad and
67acde4869, -EACCES errors are ignored,
and thus 'udevadm trigger' succeeds even when it is invoked by non-root
users. Moreover, on -EACCES error, log messages are shown in debug
level, so usually we see no message, and users are easily confused
why uevents for devices are not triggered.
Let's open the device node to modify with O_PATH, and then adjust it
only after verifying everything is in order. This fixes a race where the
a device appears, disappears and quickly reappers, while we are still
running the rules for the first appearance: when going by path we'd
possibly adjust half of the old and half of the new node. By O_PATH we
can pin the node while we operate on it, thus removing the race.
Previously, we'd do a superficial racey check if the device node changed
undearneath us, and would propagate EEXIST in that case, failing the
rule set. With this change we'll instead gracefully handle this, exactly
like in the pre-existing case when the device node disappeared in the
meantime.
If the whole call is simple and we don't need to look at the return value
apart from the conditional, let's use a form without assignment of the return
value. When the function call is more complicated, it still makes sense to
use a temporary variable.
Graphics tablet devices comprise multiple event nodes, usually a Pen, Finger
and Pad node (that's how the kernel postfixes them). Pen and Pad are labeled
as ID_INPUT_TABLET but the pad doesn't actually send stylus events - it
doesn't usually have BTN_TOOL_PEN, merely BTN_STYLUS.
For the last several years, libwacom has set ID_INPUT_TABLET_PAD for all pad
devices known to it based on vid/pid and a "* Pad" name match. That does not
cover devices not in libwacom. libinput relies on ID_INPUT_TABLET_PAD to
initialize the pad backend.
We can't drop ID_INPUT_TABLET without breaking userspace, but we can add
ID_INPUT_TABLET_PAD ourselves - where a device has BTN_0 in addition to
BTN_STYLUS, let's add it as a pad.
There are some devices (notably: bamboos) that use BTN_LEFT instead of BTN_0
but they are relatively rare and there's a risk of mislabeling those devices,
so let's just stick with BTN_0 only.
Previous commit makes drop ID_NET_DRIVER=, ID_NET_LINK_FILE=, and
ID_NET_NAME= properties for network interfaces on 'move' uevent.
ID_NET_DRIVER= and ID_NET_LINK_FILE= properties are used by networkctl.
ID_NET_NAME= may be used by end-user rules or programs. So, let's
re-assign them on 'move' uevent. (Note that strictly speaking, this
makes them re-assigned on all but 'remove' uevent.)
This reverts commit b081b27e14.
If a network interface get a 'move' event, then previously SYSTEMD_ALIAS=
property still contains an old alias, and the old alias .device unit
will not be removed.
This makes all properties cleared on 'move' event, and then old alias
.device unit will be removed by pid1.
Fixes#16967.
Behaviour is not identical, as shown by the tests in test-strv.
The combination of EXTRACT_UNQUOTE without EXTRACT_RELAX only appears in
the test, so it doesn't seem particularly important. OTOH, the difference
in handling of squished parameters could make a difference. New behaviour
is what both bash and python do, so I think we can ignore this corner case.
This change has the following advantages:
- the duplication of code paths that do a very similar thing is removed
- extract_one_word() / strv_split_extract() return a proper error code.
This tries to address the "bind"/"unbind" uevent kernel API breakage, by
changing the semantics of device tags.
Previously, tags would be applied on uevents (and the database entries
they result in) only depending on the immediate context. This means that
if one uevent causes the tag to be set and the next to be unset, this
would immediately effect what apps would see and the database entries
would contain each time. This is problematic however, as tags are a
filtering concept, and if tags vanish then clients won't hence notice
when a device stops being relevant to them since not only the tags
disappear but immediately also the uevents for it are filtered including
the one necessary for the app to notice that the device lost its tag and
hence relevance.
With this change tags become "sticky". If a tag is applied is once
applied to a device it will stay in place forever, until the device is
removed. Tags can never be removed again. This means that an app
watching a specific set of devices by filtering for a tag is guaranteed
to not only see the events where the tag is set but also all follow-up
events where the tags might be removed again.
This change of behaviour is unfortunate, but is required due to the
kernel introducing new "bind" and "unbind" uevents that generally have
the effect that tags and properties disappear and apps hence don't
notice when a device looses relevance to it. "bind"/"unbind" events were
introduced in kernel 4.12, and are now used in more and more subsystems.
The introduction broke userspace widely, and this commit is an attempt
to provide a way for apps to deal with it.
While tags are now "sticky" a new automatic device property
CURRENT_TAGS is introduced (matching the existing TAGS property) that
always reflects the precise set of tags applied on the most recent
events. Thus, when subscribing to devices through tags, all devices that
ever had the tag put on them will be be seen, and by CURRENT_TAGS it may
be checked whether the device right at the moment matches the tag
requirements.
See: #7587#7018#8221
The parent process may not perform any label operation, so the
database might not get updated on a SELinux policy change on its own.
Reload the label database once on a policy change, instead of n times
in every started child.
Similarly to "setup" vs. "set up", "fallback" is a noun, and "fall back"
is the verb. (This is pretty clear when we construct a sentence in the
present continous: "we are falling back" not "we are fallbacking").
Fixed below systemd codesonar warning.
isprint() is invoked here with an argument of signed
type char, but only has defined behavior for int arguments that are
either representable as unsigned char or equal to the value
of macro EOF(-1).
As per codesonar report, in a number of libc implementations, isprint()
function implemented using lookup tables (arrays): passing in a
negative value can result in a read underrun.
When the kernel does not provide a modalias, we generate our own for usb devices.
For some reason, we generated the expected usb:vXXXXpYYYY string, suffixed by "*".
It was added that way already in 796b06c21b, but I
think that was a mistake, and Kay was thinking about the match pattern instead
of the matched string.
For example, for a qemu device:
old: "usb:v0627p0001*"
new: "usb:v0627p0001:QEMU USB Tablet"
On the match side, all hwdb files in the wild seem to be using match patterns
with "*" at the end. So we can add more stuff to our generated modalias with
impunity.
This will allow more obvious and more certain matches on USB devices. In
principle the vendor+product id should be unique, but it's only 8 digits, and
there's a high chance of people getting this wrong. And matching the wrong
device would be quite problematic. By including the name in the match string we
make a mismatch much less likely.
E.g. udevadm test prints "Invalid inotify descriptor." which is
meaningless without any context. I think it should be OK to call udev_watch_end()
from a cleanup path without any warning (even at debug level).
There is no reason to consider this wrong. In fact one could argue that +=
is more appropriate, because we always add to options, and not replace previous
assignments. If we output a debug message, we implicitly ask people to "fix" this,
and we shouldn't.
Also, all our rules use += right now.
Let's add a catalog entry explaining further details.
Most importantly though: talk to PID 1 directly, via the private D-Bus
socket, so that this actually works correctly during early boot, where
D-Bus is not around.
All devices behind a SPI controller have the same udev ID_PATH property.
This is a problem for predicable network names for CAN controllers.
CAN controllers, in contrast to Ethernet controllers, don't have a MAC
Address, so there's no way to tell two CAN controllers on the same SPI
host controller apart:
$ udevadm info /sys/class/net/can0
P: /devices/platform/soc/fe204000.spi/spi_master/spi0/spi0.1/net/can0
L: 0
E: DEVPATH=/devices/platform/soc/fe204000.spi/spi_master/spi0/spi0.1/net/can0
E: INTERFACE=can0
E: IFINDEX=3
E: SUBSYSTEM=net
E: USEC_INITIALIZED=11187199
E: ID_PATH=platform-fe204000.spi
E: ID_PATH_TAG=platform-fe204000_spi
E: SYSTEMD_ALIAS=/sys/subsystem/net/devices/can0
E: TAGS=:systemd:
$ udevadm info /sys/class/net/can1
P: /devices/platform/soc/fe204000.spi/spi_master/spi0/spi0.0/net/can1
L: 0
E: DEVPATH=/devices/platform/soc/fe204000.spi/spi_master/spi0/spi0.0/net/can1
E: INTERFACE=can1
E: IFINDEX=4
E: SUBSYSTEM=net
E: USEC_INITIALIZED=11192211
E: ID_PATH=platform-fe204000.spi
E: ID_PATH_TAG=platform-fe204000_spi
E: SYSTEMD_ALIAS=/sys/subsystem/net/devices/can1
E: TAGS=:systemd:
With this the chip select number is added to the ID_PATH, to make
predictable network names possible.
$ sudo udevadm info /sys/class/net/can0
P: /devices/platform/soc/fe204000.spi/spi_master/spi0/spi0.1/net/can0
L: 0
E: DEVPATH=/devices/platform/soc/fe204000.spi/spi_master/spi0/spi0.1/net/can0
E: INTERFACE=can0
E: IFINDEX=3
E: SUBSYSTEM=net
E: USEC_INITIALIZED=11187199
E: ID_PATH=platform-fe204000.spi-cs-1
E: ID_PATH_TAG=platform-fe204000_spi-cs-1
E: SYSTEMD_ALIAS=/sys/subsystem/net/devices/can0
E: TAGS=:systemd:
$ sudo udevadm info /sys/class/net/can1
P: /devices/platform/soc/fe204000.spi/spi_master/spi0/spi0.0/net/can1
L: 0
E: DEVPATH=/devices/platform/soc/fe204000.spi/spi_master/spi0/spi0.0/net/can1
E: INTERFACE=can1
E: IFINDEX=4
E: SUBSYSTEM=net
E: USEC_INITIALIZED=11192211
E: ID_PATH=platform-fe204000.spi-cs-0
E: ID_PATH_TAG=platform-fe204000_spi-cs-0
E: SYSTEMD_ALIAS=/sys/subsystem/net/devices/can1
E: TAGS=:systemd:
Prompted by the discussion on #16110, let's migrate more code to
fd_wait_for_event().
This only leaves 7 places where we call into poll()/poll() directly in
our entire codebase. (one of which is fd_wait_for_event() itself)
poll() sets POLLNVAL inside of the poll structures if an invalid fd is
passed. So far we generally didn't check for that, thus not taking
notice of the error. Given that this specific kind of error is generally
indication of a programming error, and given that our code is embedded
into our projects via NSS or because people link against our library,
let's explicitly check for this and convert it to EBADF.
(I ran into a busy loop because of this missing check when some of my
test code accidentally closed an fd it shouldn't close, so this is a
real thing)
On systemd systems we generally don't need to chdir() to root, we don't
need to setup /dev/ ourselves (as PID 1 does that during earliest boot),
and we don't need to set the OOM adjustment values, as that's done via
unit files.
Hence, drop this. if people want to use udev from other init systems
they should do this on their own, I am very sure it's a good thing to do
it from outside of udevd, so that fewer privileges are required by udevd. In
particular the dev_setup() stuff is something that people who build
their own non-systemd distros want to set up themselves anyway, in
particular as they already have to mount devtmpfs themselves anyway.
Note that this only drops stuff that isn't really necessary for testing
stuff, i.e. process properties and settings that don't matter if you
quickly want to invoke udev from a terminal session to test something.
To make Driver= in [Match] section work in containers.
Note that ID_NET_DRIVER= property in udev database is set with the
result of the ethtool. So, this should not change anything for
non-container cases.
Closes#15678.
This is a follow-up for 9f83091e3c.
Instead of reading the mtime off the configuration files after reading,
let's do so before reading, but with the fd we read the data from. This
is not only cleaner (as it allows us to save one stat()), but also has
the benefit that we'll detect changes that happen while we read the
files.
This also reworks unit file drop-ins to use the common code for
determining drop-in mtime, instead of reading system clock for that.
Since the separate binaries contain mostly the same code,
this almost halves the size of the installation.
before:
398K /bin/udevadm
391K /lib/systemd/systemd-udevd
after:
431K /bin/udevadm
0 /lib/systemd/systemd-udevd -> ../../bin/udevadm
Fixes: #14200
if someone implements https://systemd.io/BLOCK_DEVICE_LOCKING/ then we
shouldn't loudly complain about that.
This reverts back to the original behaviour from
3ebdb81ef088afd3b4c72b516beb5610f8c93a0d: when the lock is taken we
silently skip processing the device and sending out the messages for it.
ATA devices should use the ATA ids like port number and (possibly)
master/slave or multiplier id, not the generic SCSI ID.
Currently only port number is included in the link. With this patch
the link would be able to support more cases, which are a) when the
device is behind a port multiplexer b) the distinction between master
and slave (distinguished by target id).
I tried to verify scenario a) with this patch, but I failed to find a
machine with PMP SATA devices attached. But the link below
https://github.com/systemd/systemd/issues/3943
could show what's the difference. Here is my test for scenario b)
Current version:
linux-ql21:~ # ll /sys/class/block/sd[ab]
lrwxrwxrwx 1 root root 0 May 8 20:46 /sys/class/block/sda ->
../../devices/pci0000:00/0000:00:01.1/ata4/host3/target3:0:0/3:0:0:0/block/sda
lrwxrwxrwx 1 root root 0 May 8 20:46 /sys/class/block/sdb ->
../../devices/pci0000:00/0000:00:01.1/ata4/host3/target3:0:1/3:0:1:0/block/sdb
linux-ql21:~ # ll /dev/disk/by-path/pci-0000\:00\:01.1-ata-1
lrwxrwxrwx 1 root root 9 May 8 20:44
/dev/disk/by-path/pci-0000:00:01.1-ata-1 -> ../../sdb
linux-ql21:~ # udevadm info /sys/class/block/sda |grep by-path
S: disk/by-path/pci-0000:00:01.1-ata-1
E: DEVLINKS=/dev/disk/by-id/ata-VBOX_HARDDISK_VB3649e885-3e0cdd64
/dev/disk/by-id/scsi-0ATA_VBOX_HARDDISK_VB3649e885-3e0cdd64
/dev/disk/by-id/scsi-1ATA_VBOX_HARDDISK_VB3649e885-3e0cdd64
/dev/disk/by-path/pci-0000:00:01.1-ata-1
/dev/disk/by-id/scsi-SATA_VBOX_HARDDISK_VB3649e885-3e0cdd64
linux-ql21:~ # udevadm info /sys/class/block/sdb |grep by-path
S: disk/by-path/pci-0000:00:01.1-ata-1
E: DEVLINKS=/dev/disk/by-id/ata-VBOX_HARDDISK_VBc53b2498-d84ae8de
/dev/disk/by-id/scsi-SATA_VBOX_HARDDISK_VBc53b2498-d84ae8de
/dev/disk/by-id/scsi-1ATA_VBOX_HARDDISK_VBc53b2498-d84ae8de
/dev/disk/by-id/scsi-0ATA_VBOX_HARDDISK_VBc53b2498-d84ae8de
/dev/disk/by-path/pci-0000:00:01.1-ata-1
After patch applied:
linux-ql21:~ # ll /sys/class/block/sd[ab]
lrwxrwxrwx 1 root root 0 May 8 21:07 /sys/class/block/sda ->
../../devices/pci0000:00/0000:00:01.1/ata4/host3/target3:0:0/3:0:0:0/block/sda
lrwxrwxrwx 1 root root 0 May 8 21:07 /sys/class/block/sdb ->
../../devices/pci0000:00/0000:00:01.1/ata4/host3/target3:0:1/3:0:1:0/block/sdb
linux-ql21:~ # ll /dev/disk/by-path/pci-0000\:00\:01.1-ata-*
lrwxrwxrwx 1 root root 9 May 8 21:07
/dev/disk/by-path/pci-0000:00:01.1-ata-1.0 -> ../../sda
lrwxrwxrwx 1 root root 9 May 8 21:07
/dev/disk/by-path/pci-0000:00:01.1-ata-1.1 -> ../../sdb
linux-ql21:~ # udevadm info /sys/class/block/sda |grep by-path
S: disk/by-path/pci-0000:00:01.1-ata-1.0
E: DEVLINKS=/dev/disk/by-id/scsi-1ATA_VBOX_HARDDISK_VB3649e885-3e0cdd64
/dev/disk/by-id/scsi-0ATA_VBOX_HARDDISK_VB3649e885-3e0cdd64
/dev/disk/by-id/ata-VBOX_HARDDISK_VB3649e885-3e0cdd64
/dev/disk/by-path/pci-0000:00:01.1-ata-1.0
/dev/disk/by-id/scsi-SATA_VBOX_HARDDISK_VB3649e885-3e0cdd64
linux-ql21:~ # udevadm info /sys/class/block/sdb |grep by-path
S: disk/by-path/pci-0000:00:01.1-ata-1.1
E: DEVLINKS=/dev/disk/by-id/scsi-0ATA_VBOX_HARDDISK_VBc53b2498-d84ae8de
/dev/disk/by-id/ata-VBOX_HARDDISK_VBc53b2498-d84ae8de
/dev/disk/by-id/scsi-1ATA_VBOX_HARDDISK_VBc53b2498-d84ae8de
/dev/disk/by-id/scsi-SATA_VBOX_HARDDISK_VBc53b2498-d84ae8de
/dev/disk/by-path/pci-0000:00:01.1-ata-1.1
Changelog:
v5: add another parameter compat_link in handle_scsi()
v4: comment for ID_PATH_ATA_COMPAT
get string length with pointer difference
(suggested by Franck Bui<fbui@suse.com>)
v3: creating compatible link from env
variables type change
v2: remove udev rules modification for compatible link
setup a test scenario of master/slave ATA devices
v1: initial patch
And do not use it in the IMPORT{cmdline} udev code. Wherever we expose
direct interfaces to check the kernel cmdline, let's not consult our
systemd-specific EFI variable, but strictly use the actual kernel
variable, because that's what we claim we do. i.e. it's fine to use the
EFI variable for our own settings, but for the generic APIs to the
kernel cmdline we should not use it.
Specifically, this applies to IMPORT{cmdline} and
ConditionKernelCommandLine=. In the latter case we weren#t checking the
EFI variable anyway, hence let's do the same for the udev case, too.
Fixes: #15739
This is a security feature, and we thus shouldn't derive the random MACs
from a potentially guessable source. MAC addresses are after all facing
to the outside, and can be interacted with from untrusted environments.
Hence, let's generate them the same way as we generate UUIDs: from
getrandom() or /dev/urandom, and optionally with RDRAND if that's
supported.
RDRAND should be fine, since this is not cryptographic key material, but
ultimately public information. We just want to make sure conflicts are
not likely.
Previously we'd generate the MACs via rand(), which means given the
short seed they are a little bit too guessable, making collisions too
likely. See #14355 in particular.
Fixes: #14355
(Note that #14355 was already fixed by
a0f11d1d11, but I think we should do
better even, and not rely on rand() and uninitialized random pools)
When setting flow control attributes of an interface we first acquire
the current settings and then add in the new settings before applying
them again. This only works on interfaces that implement the ethtool
ioctls. on others we'll see an ugly "Could not set flow control of"
message, simply because we issue the SIOCETHTOOL ioctl once, for getting
the data. In particular we'll get it for the "lo" interface all the
time, which sucks hard. Let's get rid of it.
Prompted by the discussions in #15180.
This is a bit more complex than I hoped, since for PID 1 we need to pass
in the synethetic environment block in we generate on demand.
We always need to make them unions with a "struct cmsghdr" in them, so
that things properly aligned. Otherwise we might end up at an unaligned
address and the counting goes all wrong, possibly making the kernel
refuse our buffers.
Also, let's make sure we initialize the control buffers to zero when
sending, but leave them uninitialized when reading.
Both the alignment and the initialization thing is mentioned in the
cmsg(3) man page.
If we're using a set with _put_strdup(), most of the time we want to use
string hash ops on the set, and free the strings when done. This defines
the appropriate a new string_hash_ops_free structure to automatically free
the keys when removing the set, and makes set_put_strdup() and set_put_strdupv()
instantiate the set with those hash ops.
hashmap_put_strdup() was already doing something similar.
(It is OK to instantiate the set earlier, possibly with a different hash ops
structure. set_put_strdup() will then use the existing set. It is also OK
to call set_free_free() instead of set_free() on a set with
string_hash_ops_free, the effect is the same, we're just overriding the
override of the cleanup function.)
No functional change intended.