Negative value means there is no match between a PCI device and any of
the slots. In the following commit we will extend this and value of 0
will indicate that there is a match between some slot and PCI device,
but that device is a PCI bridge.
chase_symlinks() would return negative on error, and either a non-negative status
or a non-negative fd when CHASE_OPEN was given. This made the interface quite
complicated, because dependning on the flags used, we would get two different
"types" of return object. Coverity was always confused by this, and flagged
every use of chase_symlinks() without CHASE_OPEN as a resource leak (because it
would this that an fd is returned). This patch uses a saparate output parameter,
so there is no confusion.
(I think it is OK to have functions which return either an error or an fd. It's
only returning *either* an fd or a non-fd that is confusing.)
This does the following:
- rename enum udev_builtin_cmd -> UdevBuiltinCmd
- rename struct udev_builtin -> UdevBuiltin
- move type definitions to udev-rules.h
- move prototypes of functions defined in udev-rules.c to udev-rules.h
- drop to use strbuf
- propagate critical errors in applying rules,
- drop limitation for number of tokens per line.
The comment in udev-builtin-net_id.c (removed in grandparent commit) showed the
property without the prefix. I assume that was always the intent, because it
doesn't make much sense to concatenate anything to an arbitrary user-specified
field.
I decided to make this a separate man page because it is freakin' long.
This content could equally well go in systemd-udevd.service(8), systemd.link(5),
or a new man page for the net_id builtin.
v2:
- rename to systemd.net-naming-scheme
- add udevadm test-builtin net_id example
In order to properly and predictably name netdevsim netdevices,
introduce a separate implementation, as the netdevices reside on a
specific netdevsim bus. Note that this applies only to netdevsim devices
created using sysfs, because those expose phys_port_name attribute.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
This is useful for distributions, where the stability of interface names should
be preseved after an upgrade of systemd. So when some specific release of the
distro is made available, systemd defaults to the latest & greatest naming
scheme, and subsequent updates set the same default. This default may still
be overriden through the kernel and env var options.
A special value "latest" is also allowed. Without a specific name, it is harder
to verride from meson. In case of 'combo' options, meson reads the default
during the initial configuration, and "remembers" this choice. When systemd is
updated, old build/ directories could keep the old default, which would be
annoying. Hence, "latest" is introduced to make it explicit, yet follow the
upstream. This is actually useful for the user too, because it may be used
as an override, without having to actually specify a version.
With this we can stabilize how naming works for network interfaces. A
user can request through a kernel cmdline option or an env var which
scheme to follow. The idea is that installers use this to set into stone
(a very soft stone though) the scheme used during installation so that
interface naming doesn't change afterwards anymore.
Why use env vars and kernel cmdline options, and not a config file of
its own?
Well, first of all there's no obvious existing one to use. But more
importantly: I have the feeling that this logic is kind of an incomplete
hack, and I simply don't want to do advertise this as a perfectly
working solution. So far we used env vars for the non-so-official
options and proper config files for the official stuff. Given how
incomplete this logic is (i.e. the big variable for naming remains the
kernel, which might expose sysfs attributes in newer versions that we
check for and didn't exist in older versions — and other problems like
this), I am simply not confident in giving this first-class exposure in
a primary configuration file.
Fixes: #10448
0 can be a valid index returned by the BIOS, so allow that by using the
parsing function safe_atolu() to check for errors without excluding the
valid value "0".
Signed-off-by: Joe Hershberger <joe.hershberger@ni.com>
We use strtoul() which returns an "unsigned long", but then assign this
to int or unsigned in, i.e. drop 32bit silently on 64bit systems. Let's
clean this up a bit, and retain the right types.
In older kernels IPoIB network devices expose the port number via
the sysfs attribute 'dev_id', which is not intended to be used this way.
Let's support both options for a while.
An InfiniBand network address is 20 bytes long. Only the least
significant 8 bytes can be interpreted as a persistent hardware unit
identifier; the other 12 are transiently derived at runtime from metadata
specific to the protocol stack.
However, since the network interface name length is hard-capped by
IFNAMSIZ at 16 chars and the 2-byte type prefix with '\0' at the end
leave us only at 13, we cannot squeeze a descriptive representation of a
HW address into an interface name. Thus, it makes the most sense to drop
the scheme for IPoIB interfaces entirely.
Currently udev just gets confused and does what it has been taught
to do: fetches the first six bytes and puts them into a permanent
device attribute.
We've long neglected IP-over-InfiniBand network interfaces, let's treat
them the same way we treat anyone else.
IPoIB interfaces will retain the 'ib' prefix; otherwise the naming scheme
is the same one we use for other network interfaces. E.g. a IPoIB network
device provided by a PCI card at bus 21 slot 0 function 6 will be named
'ibp21s0f6'.
This part of the copyright blurb stems from the GPL use recommendations:
https://www.gnu.org/licenses/gpl-howto.en.html
The concept appears to originate in times where version control was per
file, instead of per tree, and was a way to glue the files together.
Ultimately, we nowadays don't live in that world anymore, and this
information is entirely useless anyway, as people are very welcome to
copy these files into any projects they like, and they shouldn't have to
change bits that are part of our copyright header for that.
hence, let's just get rid of this old cruft, and shorten our codebase a
bit.
This drops a good number of type-specific _cleanup_ macros, and patches
all users to just use the generic ones.
In most recent code we abstained from defining type-specific macros, and
this basically removes all those added already, with the exception of
the really low-level ones.
Having explicit macros for this is not too useful, as the expression
without the extra macro is generally just 2ch wider. We should generally
emphesize generic code, unless there are really good reasons for
specific code, hence let's follow this in this case too.
Note that _cleanup_free_ and similar really low-level, libc'ish, Linux
API'ish macros continue to be defined, only the really high-level OO
ones are dropped. From now on this should really be the rule: for really
low-level stuff, such as memory allocation, fd handling and so one, go
ahead and define explicit per-type macros, but for high-level, specific
program code, just use the generic _cleanup_() macro directly, in order
to keep things simple and as readable as possible for the uninitiated.
Note that before this patch some of the APIs (notable libudev ones) were
already used with the high-level macros at some places and with the
generic _cleanup_ macro at others. With this patch we hence unify on the
latter.
Files which are installed as-is (any .service and other unit files, .conf
files, .policy files, etc), are left as is. My assumption is that SPDX
identifiers are not yet that well known, so it's better to retain the
extended header to avoid any doubt.
I also kept any copyright lines. We can probably remove them, but it'd nice to
obtain explicit acks from all involved authors before doing that.
NPAR is a technology that allows a single network interface to
be divided into number of partitions. The partitions show up
as functions on the same PCI device... when there are more than
8 functions, ARI (alternative routing-ID interpretation) is
used. With ARI is enabled, the 8 bit field that normally has 5
bits for the PCI device and 3 bits for the PCI function is instead
interpreted as (implicit) device 0, with 8 bits for the function
number.
Because the linux kernel exposes the PCI device/function numbers
to userspace the same regardless of whether ARI is enabled,
systemd predictable device naming can generate unpredictable
names in this case, because network names using the PCI slot use
the function number, but not the device number, causing systemd
to generate the same name for mulitple network devices (so some
will revert to the "ethX" names).
With this patch, device naming code checks if ARI is enabled for
a PCI network device, and uses the full 8-bit function number
for naming to avoid this situation. This should improve
readability and predictability of device names.
Here is an example of how this change would affect naming:
before patch | after patch
-----------------------------
ens2f0 | ens2f0 NPAR partition 0 (in PCI slot 2)
ens2f1 | ens2f1 NPAR partition 1
...
ens2f7 | ens2f7 NPAR partition 7
eth1 | ens2f8 NPAR partition 8
eth2 | ens2f9 NPAR partition 9
With PCI SR-IOV, a number of virtual network devices can be enabled,
all of which share the same physical network device. Currently,
udev generates names for SR-IOV virtual functions as if they were
independent network devices.
With this change, the predictable network device naming code will
check if a network device is an SR-IOV virtual device, and will
generate a name based on the physical PCI device plus a "v%u"
suffix. This should improve readability and predictability of
device names.
Here is an example of how this change would affect naming:
before patch | after patch
-----------------------------
eno1 | eno1 onboard NIC, physical function
enp101s0f0 | eno1v0 onboard NIC, SR-IOV virtual func 0
enp101s0f1 | eno1v1 onboard NIC, SR-IOV virtual func 1
To generate predictable network device names, the code in
udev-builting-net_id.c tries to match the PCI device address
of the network device to the entries in /sys/bus/pci/slots.
However, sometimes the slot number is not associated the
network controller PCI device itself, but rather with one of
its parents.
This change will try to find a match in /sys/bus/pci/slots for
the parents of the PCI network device, if it doesn't find a
match for the device itself.