The commit 6f3ac0d517 drops the prefix and
suffix in TAGS= property. But there exists several rules that have like
`TAGS=="*:tag:*"`. So, the property must be always prefixed and suffixed
with ":".
Fixes#17930.
If /sys/class/OOO node is created and destroyed during booting (kernle driver initialization fails),
systemd-udev-trigger.service fails due to race condition.
***** race condition ***********************************************************************************
1. kernel driver create /sys/class/OOO
2. systemd-udev-trigger.service execues "/usr/bin/udevadm trigger --type=devices --action=add"
3. device_enumerator_scan_devices() => enumerator_scan_devices_all() => enumerator_scan_dir("class") =>
opendir("/sys/class") and iterate all subdirs ==> enumerator_scan_dir_and_add_devices("/sys/class/OOO")
4. kernel driver fails and destroy /sys/class/OOO
5. enumerator_scan_dir_and_add_devices("/sys/class/OOO") fails in opendir("/sys/class/OOO")
6. "systemd-udev-trigger.service" fails
7. udev coldplug fails and some device units not ready
8. mount units asociated with device units fail
9. local-fs.target fails
10. enters emergency mode
********************************************************************************************************
***** status of systemd-udev-trigger.service unit ******************************************************
$ systemctl status systemd-udev-trigger.service
systemd-udev-trigger.service - udev Coldplug all Devices
Loaded: loaded (/usr/lib/systemd/system/systemd-udev-trigger.service; static; vendor preset: enabled)
Active: failed (Result: exit-code) since Thu 2020-01-02 13:16:54 KST; 22min ago
Docs: man:udev(7)
man:systemd-udevd.service(8)
Process: 2162 ExecStart=/usr/bin/udevadm trigger --type=subsystems --action=add (code=exited, status=0/SUCCESS)
Process: 2554 ExecStart=/usr/bin/udevadm trigger --type=devices --action=add (code=exited, status=1/FAILURE)
Main PID: 2554 (code=exited, status=1/FAILURE)
Jan 02 13:16:54 localhost udevadm[2554]: Failed to scan devices: No such file or directory
Jan 02 13:16:54 localhost systemd[1]: systemd-udev-trigger.service: Main process exited, code=exited, status=1/FAILURE
Jan 02 13:16:54 localhost systemd[1]: systemd-udev-trigger.service: Failed with result 'exit-code'.
Jan 02 13:16:54 localhost systemd[1]: Failed to start udev Coldplug all Devices.
*******************************************************************************************************
***** journal log with Environment=SYSTEMD_LOG_LEVEL=debug in systemd-udev-trigger.service ***********
Jan 01 21:57:20 localhost udevadm[2039]: sd-device-enumerator: Scanning /sys/bus
Jan 01 21:57:20 localhost udevadm[2522]: sd-device-enumerator: Scan all dirs
Jan 01 21:57:20 localhost udevadm[2522]: sd-device-enumerator: Scanning /sys/bus
Jan 01 21:57:21 localhost udevadm[2522]: sd-device-enumerator: Scanning /sys/class
Jan 01 21:57:21 localhost udevadm[2522]: sd-device-enumerator: Failed to scan /sys/class: No such file or directory
Jan 01 21:57:21 localhost udevadm[2522]: Failed to scan devices: No such file or directory
*******************************************************************************************************
This tries to address the "bind"/"unbind" uevent kernel API breakage, by
changing the semantics of device tags.
Previously, tags would be applied on uevents (and the database entries
they result in) only depending on the immediate context. This means that
if one uevent causes the tag to be set and the next to be unset, this
would immediately effect what apps would see and the database entries
would contain each time. This is problematic however, as tags are a
filtering concept, and if tags vanish then clients won't hence notice
when a device stops being relevant to them since not only the tags
disappear but immediately also the uevents for it are filtered including
the one necessary for the app to notice that the device lost its tag and
hence relevance.
With this change tags become "sticky". If a tag is applied is once
applied to a device it will stay in place forever, until the device is
removed. Tags can never be removed again. This means that an app
watching a specific set of devices by filtering for a tag is guaranteed
to not only see the events where the tag is set but also all follow-up
events where the tags might be removed again.
This change of behaviour is unfortunate, but is required due to the
kernel introducing new "bind" and "unbind" uevents that generally have
the effect that tags and properties disappear and apps hence don't
notice when a device looses relevance to it. "bind"/"unbind" events were
introduced in kernel 4.12, and are now used in more and more subsystems.
The introduction broke userspace widely, and this commit is an attempt
to provide a way for apps to deal with it.
While tags are now "sticky" a new automatic device property
CURRENT_TAGS is introduced (matching the existing TAGS property) that
always reflects the precise set of tags applied on the most recent
events. Thus, when subscribing to devices through tags, all devices that
ever had the tag put on them will be be seen, and by CURRENT_TAGS it may
be checked whether the device right at the moment matches the tag
requirements.
See: #7587#7018#8221
We use LOG_PRI() in all log_*() functions, so let's do that here too for
consistency. Effectively this doesn't change anything since we only use
LOG_{INFO,DEBUG,...} as the argument.
Tracking down #15931 confused the hell out of me, since running homed in
gdb from the command line worked fine, but doing so as a service failed.
Let's make this more debuggable and check if we live in the host netns
when allocating a new udev monitor.
This is just debug stuff, so that if things don't work, a quick debug
run will reveal what is going on.
That said, while we are at it, also fix unexpected closing of passed in
fd when failing.
We always need to make them unions with a "struct cmsghdr" in them, so
that things properly aligned. Otherwise we might end up at an unaligned
address and the counting goes all wrong, possibly making the kernel
refuse our buffers.
Also, let's make sure we initialize the control buffers to zero when
sending, but leave them uninitialized when reading.
Both the alignment and the initialization thing is mentioned in the
cmsg(3) man page.
Our hashmap and set helpers return a different code whenever an entry
already exists, so let's use this to avoid unsetting scan_uptodate when
not necessary.
Thus, the return convention for
sd_device_enumerator_add_match_subsystem,
sd_device_enumerator_add_match_sysattr,
sd_device_enumerator_add_match_property,
sd_device_enumerator_add_match_sysname,
sd_device_enumerator_add_match_tag,
device_enumerator_add_match_parent_incremental,
sd_device_enumerator_add_match_parent,
sd_device_enumerator_allow_uninitialized,
device_enumerator_add_match_is_initialized
is that "1" is returned if action was taken, and "0" on noop.
If we're using a set with _put_strdup(), most of the time we want to use
string hash ops on the set, and free the strings when done. This defines
the appropriate a new string_hash_ops_free structure to automatically free
the keys when removing the set, and makes set_put_strdup() and set_put_strdupv()
instantiate the set with those hash ops.
hashmap_put_strdup() was already doing something similar.
(It is OK to instantiate the set earlier, possibly with a different hash ops
structure. set_put_strdup() will then use the existing set. It is also OK
to call set_free_free() instead of set_free() on a set with
string_hash_ops_free, the effect is the same, we're just overriding the
override of the cleanup function.)
No functional change intended.
We don't need a seperate output parameter that is of type int. glibc() says
that the type is "unsigned", but the kernel thinks it's "int". And the
"alternative names" interface also uses ints. So let's standarize on ints,
since it's clearly not realisitic to have interface numbers in the upper half
of unsigned int range.
Virtual filesystems such as sysfs or procfs use kernfs, and kernfs can work
with two sorts of virtual files.
One sort uses "seq_file", and the results of the first read are buffered for
the second read. The other sort uses "raw" reads which always go direct to the
device.
In the later case, the content of the virtual file must be retrieved with a
single read otherwise subsequent read might get the new value instead of
finding EOF immediately. That's the reason why the usage of fread(3) is
prohibited in this case as it always performs a second call to read(2) looking
for EOF which is subject to the race described previously.
Fixes: #13585.
chase_symlinks() would return negative on error, and either a non-negative status
or a non-negative fd when CHASE_OPEN was given. This made the interface quite
complicated, because dependning on the flags used, we would get two different
"types" of return object. Coverity was always confused by this, and flagged
every use of chase_symlinks() without CHASE_OPEN as a resource leak (because it
would this that an fd is returned). This patch uses a saparate output parameter,
so there is no confusion.
(I think it is OK to have functions which return either an error or an fd. It's
only returning *either* an fd or a non-fd that is confusing.)
Because it's not a device path and (slightly) bad things happen if it
gets confused with one:
$ udevadm info /sys/
Assertion 'device->devpath[0] == '/'' failed at
../src/libsystemd/sd-device/sd-device.c:958,
function sd_device_get_devpath(). Aborting.
Aborted (core dumped)