Catch up with latest changes in kdbus.ko:
* Signals can be sent as unicast now, hence they need to be marked as
such with the KDBUS_MSG_SIGNAL in the message flags.
* Follow ioctl number change for KDBUS_CMD_FREE
The kernel provides capabilities as a u32 array, sd-bus uses an u8 array.
This works fine on little-endian as both are encoded the same way.
However, this fails on big-endian if we do not perform sufficient
byte-swapping on each u32 entry.
This patch makes sd-bus use u32, too. We avoid changing any kernel
provided data so we can keep pointing into kdbus pool buffers which
contain u32 arrays.
The number of available caps can be read from
/proc/sys/kernel/cap_last_cap during runtime. Our helper cap_last_cap()
does that, so there's no reason to remember the size of any capability
cache. We can just pre-allocate arrays with a suitable size for all
available caps and reject any higher caps.
The kernel capability API uses u32 as base so make sure we do the same.
Note that this is specified by POSIX, so it's unlikely to change.
Sync up with recent kdbus changed:
* several ioctls gained .size and .items members (but still unused)
* CMD_SEND gained its own ioctl structure
* several members of kdbus_msg were dropped as they were only used during
SEND, not during RECV etc.
* CMD_RECV and CMD_SEND now share a kdbus_reply member which contains the
offset and size of the returned message.
kdbus-git gained two new features:
* memfd offsets: This allows to specify a 'start' offset in kdbus_memfd
so you can send partial memfd hunks instead of always
the full memfd
* KDBUS_HELLO_UNPRIVILEGED: If passed during HELLO, the client will be
treated as unprivileged.
Implement a recent change in the kdbus pool logic:
PAYLOAD_VEC_OFF items are now referencing offsets relative to the
connection's pool, not to the item itself. Follow this change in
sd-bus.
Let's do this right from the beginning, to prepare ground for udev
messages that most likely want to store list of strings (for device
tags) in messages, and filter on them.
As a followup to 086891e5c1 "log: add an "error" parameter to all
low-level logging calls and intrdouce log_error_errno() as log calls
that take error numbers", use sed to convert the simple cases to use
the new macros:
find . -name '*.[ch]' | xargs sed -r -i -e \
's/log_(debug|info|notice|warning|error|emergency)\("(.*)%s"(.*), strerror\(-([a-zA-Z_]+)\)\);/log_\1_errno(-\4, "\2%m"\3);/'
Multi-line log_*() invocations are not covered.
And we also should add log_unit_*_errno().
The kdbus module will later get a policy that endpoint-names are
restricted to "<uid>-<name>" just like bus-names. Make sure that systemd
is already compatible to that.
On the system and user busses we create it's the receiver that chooses
which metadata is attched, not the sender, hence set the requirement
mask to ANY, to allow any current of future credential bit to be
attached.
Also:
- adds support for euid, suid, fsuid, egid, sgid, fsgid fields.
- makes augmentation of creds with data from /proc explicitly
controllable to give apps better control over this, given that this is
racy.
- enables augmentation for kdbus connections (previously we only did it
for dbus1). This is useful since with recent kdbus versions it is
possible for clients to control the metadata they want to send.
- changes sd_bus_query_sender_privilege() to take the euid of the client
into consideration, if known
- when we don't have permissions to read augmentation data from /proc,
don't fail, just don't add the data in
kdbus has seen a larger update than expected lately, most notably with
kdbusfs, a file system to expose the kdbus control files:
* Each time a file system of this type is mounted, a new kdbus
domain is created.
* The layout inside each mount point is the same as before, except
that domains are not hierarchically nested anymore.
* Domains are therefore also unnamed now.
* Unmounting a kdbusfs will automatically also detroy the
associated domain.
* Hence, the action of creating a kdbus domain is now as
privileged as mounting a filesystem.
* This way, we can get around creating dev nodes for everything,
which is last but not least something that is not limited by
20-bit minor numbers.
The kdbus specific bits in nspawn have all been dropped now, as nspawn
can rely on the container OS to set up its own kdbus domain, simply by
mounting a new instance.
A new set of mounts has been added to mount things *after* the kernel
modules have been loaded. For now, only kdbus is in this set, which is
invoked with mount_setup_late().
kdbus learned to accept both a numerical destination ID as well as a
well-known-name. In that case, kdbus makes sure that the numerical ID is in
fact the owner of the provided name and fails otherwise.
This allows for race-free assertion of a bus name owner while sending a
message, which is a requirement for bus-proxyd.
Add two new fields to sd_bus_message, and set the numerical ID to
verify_destination_id if bus_message_setup_kmsg() is called for a
message with a well-known name.
Also, set the destination's name in the kdbus item to .destination_ptr
if it is non-NULL.
Normal users should not touch these fields, and they're not publicy
accessible.
The metadata logic in kdbus has seen a rework, and the only mandatory
change we have to follow for now is that attach_flags in kdbus_cmd_hello
is now split into two parts, attach_flags_send and attach_flags_recv.
Catch up with some changes in kdbus.h:
* KDBUS_{ITEM,ATTACH}_CONN_NAME were renamed to
KDBUS_{ITEM,ATTACH}_CONN_DESCRIPTION, so the term 'name' is not
overloaded as much.
* The item types were re-ordered a little so they are lined up to the
order of the corresponding KDBUS_ATTACH flags
* A new item type KDBUS_ITEM_OWNED_NAME was introduced, designated to
store a struct kdbus_name in item->name. KDBUS_ITEM_NAME soley
stores data in item->str now
* Some kerneldoc fixes
kdbus_msg.timeout_ns now takes an absolute value, based on CLOCK_MONOTONIC,
in order to eventually support automatically restarted syscalls.
Signed-off-by: Daniel Mack <daniel@zonque.org>
In kdbus.h, the following details changed:
* All commands gained a 'kernel_flags' field to report the flags supported
by the driver. Before, this was done in the 'flags' field in a
bidirectional way, which turned out to be a problem for the code in
sd-bus, as many parts of it reuse the same ioctl struct more than once
and consider them to be owned by userspace.
* Name listings are now returned by a new struct instead of reusing struct
kdbus_cmd_name for that matter. This way, we don't add more unneeded
fields to it and make the API cleaner.
* 'conn_flags' was renamed to 'flags' in struct kdbus_cmd_hello to make
the API a bit more unified.
kdbus.h now has KDBUS_ATTACH_COMM split into KDBUS_ATTACH_TID_COMM and
KDBUS_ATTACH_PID_COMM. The items were split already, so the change in
systemd is easy.
In kdbus.h, the 'features' field has been dropped again. Instead of
negotiating features that way, we decided to make the kernel return the
set of supported flags in each ioctl struct's .flags field, in both the
success and error cases.
Sync kdbus.h once again. Two thing have changed:
a) KDBUS_CMD_EP_* was renamed to KDBUS_CMD_ENDPOINT_*
b) struct kdbus_cmd_hello and struct kdbus_cmd_make gained a
'features' bitfield (which is currently unused)
Walk the items first, then free the memory of the message.
Also, while at it, make coverity happy with an explicit (void) prefix.
We intentionally ignore the return value here.
The KDBUS_CMD_FREE ioctl now uses a struct rather than a direct pointer
to the offset to free.
The KDBUS_CMD_MSG_CANCEL ioctl has also changes, but there's no user of
it yet in systemd.
All kdbus ioctl arguments must be 8byte aligned. Make sure we use
alloca_align() and _alignas_(8) in all situations where gcc doesn't
guarantee 8-byte alignment.
Note that objects on the stack are always 8byte aligned as we put
_alignas_(8) into the structure definition in kdbus.h.
Custom endpoints are alternative connection points to a bus, allowing
specific policy to be uploaded.
Add two functions to bus-kernel. One to create such endpoints, and another
one for setting a policy for them.
On systems without properly setup systemd, cg_get_root_path returns
-ENOENT. This means that busctl doesn't display much information.
busctl monitor also fails whenever it intercepts messages.
This fix fakes creates a fake "/" root cgroup which lets busctl work
on such systems.
Add a new config 'Activating' directive which denotes whether a busname
is actually registered on the bus. It defaults to 'yes'.
If set to 'no', the .busname unit only uploads policy, which will remain
active as long as the unit is running.
AcceptFD= defaults to true, thus making sure that by default fd passing
is enabled for all activatable names. Since for normal bus connections
fd passing is enabled too by default this makes sure fd passing works
correctly regardless whether a service is already activated or not.
Making this configurable on both busname units and in bus connections is
messy, but unavoidable since busnames are established and may queue
messages before the connection feature negotiation is done by the
service eventually activated. Conversely, feature negotiation on bus
connections takes place before the connection acquires its names.
Of course, this means developers really should make sure to keep the
settings in .busname units in sync with what they later intend to
negotiate.
safe_close() automatically becomes a NOP when a negative fd is passed,
and returns -1 unconditionally. This makes it easy to write lines like
this:
fd = safe_close(fd);
Which will close an fd if it is open, and reset the fd variable
correctly.
By making use of this new scheme we can drop a > 200 lines of code that
was required to test for non-negative fds or to reset the closed fd
variable afterwards.
GCC optimizes strlen("string constant") to a constant, even with -O0.
Thus, replace patterns like sizeof("string constant")-1 with
strlen("string constant") where possible, for clarity. In particular,
for expressions intended to add up the lengths of components going into
a string, this often makes it clearer that the expression counts the
trailing '\0' exactly once, by putting the +1 for the '\0' at the end of
the expression, rather than hidden in a sizeof in the middle of the
expression.
This broke hashtable lookups for the message cookies on s390x, which is
a 64bit BE machine where accessing 32bit values as 64bit and vice versa
will explode.
Also, while we are at it, be a bit more careful when dealing with the
64bit cookies we expose and the 32bit serial numbers dbus uses in its
payload.
Problem identified by Fridrich Strba.
The new calls work similarly, but enforce a that a common, fixed bus
path prefix is used.
This follows discussions with Simon McVittie on IRC that it should be a
good idea to make sure that people don't use the escaping applied here
too wildly as anything other than the last label of a bus path.
Activator connections may upload policy when registering to the bus.
This patch contains code to translate between BusNamePolicy objects and
the kdbus specific items.
Bring kdbus.h in sync with recent policy related changes. Most notably,
KDBUS_MAKE_POLICY_OPEN is now gone. The rest doesn't matter for systemd
at this point, as the policy functionality was not yet used.
Previously the returned object of constructor functions where sometimes
returned as last, sometimes as first and sometimes as second parameter.
Let's clean this up a bit. Here are the new rules:
1. The object the new object is derived from is put first, if there is any
2. The object we are creating will be returned in the next arguments
3. This is followed by any additional arguments
Rationale:
For functions that operate on an object we always put that object first.
Constructors should probably not be too different in this regard. Also,
if the additional parameters might want to use varargs which suggests to
put them last.
Note that this new scheme only applies to constructor functions, not to
all other functions. We do give a lot of freedom for those.
Note that this commit only changes the order of the new functions we
added, for old ones we accept the wrong order and leave it like that.
In trying to track down a stupid linker bug, I noticed a bunch of
memset() calls that should be using memzero() to make it more "obvious"
that the options are correct (i.e. 0 is not the length, but the data to
set). So fix up all current calls to memset(foo, 0, length) to
memzero(foo, length).
sizes and numbers of hash functions
In order to make the bloom filter logic more future proof communicate
bloom filter parameters from the original bus creator to the clients,
and allow them to be variable within certain ranges.