This adds two new calls to get the list of all journal fields names currently in use.
This is the low-level support to implement the feature requested in #2176 in a more optimized way.
This should simplify handling of time events in clients and is in-line with the USEC_INFINITY macro we already have.
This way setting a timeout to 0 indicates "elapse immediately", and a timeout of USEC_INFINITY "elapse never".
Previously, .network files only knew a vaguely defined "Domains=" concept, for which the documentation declared it was
the "DNS domain" for the network connection, without specifying what that means.
With this the Domains setting is reworked, so that there are now "routing" domains and "search" domains. The former are
to be used by resolved to route DNS request to specific network interfaces, the latter is to be used for searching
single-label hostnames with (in addition to being used for routing). Both settings are configured in the "Domains="
setting. Normal domain names listed in it are now considered search domains (for compatibility with existing setups),
while those prefixed with "~" are considered routing domains only. To route all lookups to a specific interface the
routing domain "." may be used, referring to the root domain. An alternative syntax for this is the "*", as was already
implemented before using the "wildcard" domain concept.
This commit adds proper parsers for this new logic, and exposes this via the sd-network API. This information is not
used by resolved yet, this will be added in a later commit.
This is useful for alternative network management solutions (such as NetworkManager) to push DNS configuration data
into resolved.
The calls will fail should networkd already have taken possesion of a link, so that the bus API is only available if
we don't get the data from networkd.
Setting of dst_id was based on interplay of two booleans,
making the logic hard to follow (for humans and compilers alike).
gcc was confused and emmitted a warning about an uninitialized
variable. Rework the code to make it obvious that dst_id is
set properly.
sd_event_now() is a public function, so we must check all
arguments for validity. Update man page and add tests.
Sample debug message:
Assertion 'IN_SET(clock, CLOCK_REALTIME, CLOCK_REALTIME_ALARM, CLOCK_MONOTONIC, CLOCK_BOOTTIME, CLOCK_BOOTTIME_ALARM)' failed at src/libsystemd/sd-event/sd-event.c:2719, function sd_event_now(). Ignoring.
Go over the entries in the map and check that they make sense.
Tests are added. In the future we might want to do additional
checks, e.g. verifying that the error names are in the expected
format.
errno_from_name used an unusual return convention where 0 meant
"not found". This tripped up config_parse_syscall_errno(),
which would treat that as success. Return -EINVAL instead,
and adjust bus_error_name_to_errno() for the new convention.
Also remove a goto which was used as a simple if and clean
up surroudning code a bit.
Set SD_EVENT_PROFILE_DELAYS to activate accounting and periodic logging
of the distribution of delays between sd_event_run() calls.
Time spent in dispatching as well as time spent outside of
sd_event_run() is measured and accounted for. Every 5 seconds a
logarithmic histogram loop iteration delays since 5 seconds previous is
logged.
This is useful in identifying the frequency and magnitude of latencies
affecting the event loop, which should be kept to a minimum.
If we already degraded the feature level below DO don't bother with sending requests for DS, DNSKEY, RRSIG, NSEC, NSEC3
or NSEC3PARAM RRs. After all, we cannot do DNSSEC validation then anyway, and we better not press a legacy server like
this with such modern concepts.
This also has the benefit that when we try to validate a response we received using DNSSEC, and we detect a limited
server support level while doing so, all further auxiliary DNSSEC queries will fail right-away.
Fixes:
$ make valgrind-tests TESTS=test-bus-cleanup
==6363== 9 bytes in 1 blocks are possibly lost in loss record 1 of 28
==6363== at 0x4C2BBCF: malloc (vg_replace_malloc.c:299)
==6363== by 0x197D12: hexmem (hexdecoct.c:79)
==6363== by 0x183083: bus_socket_start_auth_client (bus-socket.c:639)
==6363== by 0x1832A0: bus_socket_start_auth (bus-socket.c:678)
==6363== by 0x183438: bus_socket_connect (bus-socket.c:705)
==6363== by 0x14B0F2: bus_start_address (sd-bus.c:1053)
==6363== by 0x14B592: sd_bus_start (sd-bus.c:1134)
==6363== by 0x14B95E: sd_bus_open_system (sd-bus.c:1235)
==6363== by 0x1127E2: test_bus_open (test-bus-cleanup.c:42)
==6363== by 0x112AAE: main (test-bus-cleanup.c:87)
==6363==
...
$ ./libtool --mode=execute valgrind ./test-bus-cleanup
==6584== LEAK SUMMARY:
...
==6584== possibly lost: 10,566 bytes in 27 blocks
Since we honour RFC5011 revoked keys it might happen we end up with an
empty trust anchor, or one where there's no entry for the root left.
With this patch the logic is changed what to do in this case.
Before this patch we'd end up requesting the root DS, which returns with
NODATA but a signed NSEC we cannot verify, since the trust anchor is
empty after all. Thus we'd return a DNSSEC result of "missing-key", as
we lack a verified version of the key.
With this patch in place, look-ups for the root DS are explicitly
recognized, and not passed on to the DNS servers. Instead, if
downgrade-ok mode is on an unsigned NODATA response is synthesized, so
that the validator code continues under the assumption the root zone was
unsigned. If downgrade-ok mode is off a new transaction failure is
generated, that makes this case recognizable.
Fixes:
```
$ ./configure ... --enable-dbus
$ make
$ make valgrind-tests TESTS=test-bus-marshal
...
==25301== 51 bytes in 1 blocks are definitely lost in loss record 7 of 18
==25301== at 0x4C2DD9F: realloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==25301== by 0x5496B8C: ??? (in /lib/x86_64-linux-gnu/libdbus-1.so.3.14.3)
==25301== by 0x54973E3: _dbus_string_append_printf_valist (in /lib/x86_64-linux-gnu/libdbus-1.so.3.14.3)
==25301== by 0x547E5C2: _dbus_set_error_valist (in /lib/x86_64-linux-gnu/libdbus-1.so.3.14.3)
==25301== by 0x547E73E: dbus_set_error (in /lib/x86_64-linux-gnu/libdbus-1.so.3.14.3)
==25301== by 0x548969A: dbus_message_demarshal (in /lib/x86_64-linux-gnu/libdbus-1.so.3.14.3)
==25301== by 0x115C1A: main (test-bus-marshal.c:244)
==25301==
```
Previously, if we couldn't reach a server via UDP we'd generate an
MAX_ATTEMPTS transaction result, but if we couldn't reach it via TCP
we'd generate a RESOURCES transaction result. While it is OK to generate
two different errors I think, "RESOURCES" is certainly a misnomer.
Introduce a new transaction result "CONNECTION_FAILURE" instead.
Printing the pointer variable really doesn't help, so drop that.
Instead, add a string lookup table for the EventSourceType enum, and print
the type of event source in case of errors.
We need to check the same thing in multiple tests. Use a shared
macro to make it easier to update the list of errnos.
Change the errno code for "unitialized cgroup fs" for ENOMEDIUM.
Exec format error looks like something more serious.
This fixes test-execute invocation in mock.
Let's distuingish the cases where our code takes an active role in
selinux management, or just passively reports whatever selinux
properties are set.
mac_selinux_have() now checks whether selinux is around for the passive
stuff, and mac_selinux_use() for the active stuff. The latter checks the
former, plus also checks UID == 0, under the assumption that only when
we run priviliged selinux management really makes sense.
Fixes: #1941
GLIB has recently started to officially support the gcc cleanup
attribute in its public API, hence let's do the same for our APIs.
With this patch we'll define an xyz_unrefp() call for each public
xyz_unref() call, to make it easy to use inside a
__attribute__((cleanup())) expression. Then, all code is ported over to
make use of this.
The new calls are also documented in the man pages, with examples how to
use them (well, I only added docs where the _unref() call itself already
had docs, and the examples, only cover sd_bus_unrefp() and
sd_event_unrefp()).
This also renames sd_lldp_free() to sd_lldp_unref(), since that's how we
tend to call our destructors these days.
Note that this defines no public macro that wraps gcc's attribute and
makes it easier to use. While I think it's our duty in the library to
make our stuff easy to use, I figure it's not our duty to make gcc's own
features easy to use on its own. Most likely, client code which wants to
make use of this should define its own:
#define _cleanup_(function) __attribute__((cleanup(function)))
Or similar, to make the gcc feature easier to use.
Making this logic public has the benefit that we can remove three header
files whose only purpose was to define these functions internally.
See #2008.
We already have a state RUNNING and EXITING when we dispatch regular and
exit callbacks. Let's introduce a new state called PREPARING that is
active while we invoke preparation callbacks. This way we have a state
each for all three kinds of event handlers.
The states are currently not documented, hence let's add a new state to
the end, before we start documenting this.
Let's make _ref() calls happy when NULL is passed to them, and simply
return NULL without any assertion logic. This makes them nicely
symmetric to the _unref() calls which also are happy to take NULL and
become NOPs then.
Rather than passing a pointer to return the result, return it directly
from the function calls.
Also, return the result in native endianess, and let the callers care
about the conversion. For hash tables and bloom filters, we don't care,
but in order to keep MAC addresses and DHCP client IDs stable, we
explicitly convert to LE.
Change the "out" parameter from uint8_t[8] to uint64_t. On architectures which
enforce pointer alignment this fixes crashes when we previously cast an
unaligned array to uint64_t*, and on others this should at least improve
performance as the compiler now aligns these properly.
This also simplifies the code in most cases by getting rid of typecasts. The
only place which we can't change is struct duid's en.id, as that is _packed_
and public API, so we can't enforce alignment of the "id" field and have to
use memcpy instead.
This way we do not rely on the size MAX* constants from the kernel headers, as these will
be out-of-sync in case we have old headers and new defines in missing.h.
We already filter out 0, and as -1 is usually special (meaning infinity,
as in USEC_INFINITY) we should better not accept it either. Better safe
than sorry...
Let's make sure we don't start blocking on sd_notify() earlier than
necessary, let's bump the socket buffer sizes to 8M.
We already do something similar for our logging socket buffers, hence
apply a similar bump here.
The files are named too generically, so that they might conflict with
the upstream project headers. Hence, let's add a "-util" suffix, to
clarify that this are just our utility headers and not any official
upstream headers.
There are more than enough calls doing string manipulations to deserve
its own files, hence do something about it.
This patch also sorts the #include blocks of all files that needed to be
updated, according to the sorting suggestions from CODING_STYLE. Since
pretty much every file needs our string manipulation functions this
effectively means that most files have sorted #include blocks now.
Also touches a few unrelated include files.
pthread APIs (unlike the rest of libc) return their errors as positive
error codes directly from the functions, rather than using errno. Let's
make sure we always handle things that way.
Track the number of matches installed for a given multicast group, and leave the
group once no matches depend on it.
In order to handle passed-in sockets that are already members of multicast groups
we initialize the refcount based on the membership once we take over the socket.
This way we will leave the socket in the state we found it once we finish with
it.
On kernels that do not fully support reading out the multicast group membership
we fall back to never leaving any groups (as before).
CMSG_NXTHDR() checks for cmsg->cmsg_len *after* it increased the pointer.
While this makes sense for parsing received messages, that's a pitfall
for code crafting messages with this macro.
Wipe out the allocated memory to fix this.
This adds support for naming file descriptors passed using socket
activation. The names are passed in a new $LISTEN_FDNAMES= environment
variable, that matches the existign $LISTEN_FDS= one and contains a
colon-separated list of names.
This also adds support for naming fds submitted to the per-service fd
store using FDNAME= in the sd_notify() message.
This also adds a new FileDescriptorName= setting for socket unit files
to set the name for fds created by socket units.
This also adds a new call sd_listen_fds_with_names(), that is similar to
sd_listen_fds(), but also returns the names of the fds.
systemd-activate gained the new --fdname= switch to specify a name for
testing socket activation.
This is based on #1247 by Maciej Wereski.
Fixes#1247.
The kernel replaces '/' in device names with '!', we translate that back
to '/' in sysname, when taking sysname as input, we should translate it
back again.
Make sure all variable-length inputs are properly terminated or that
their length is encoded in some way. This avoids ambiguity of
adjacent inputs.
E.g., in case of a hash function taking two strings, compressing "ab"
followed by "c" is now distinct from "a" followed by "bc".
All our hash functions are based on siphash24(), factor out
siphash_init() and siphash24_finalize() and pass the siphash
state to the hash functions rather than the hash key.
This simplifies the hash functions, and in particular makes
composition simpler as calling siphash24_compress() repeatedly
on separate chunks of input has the same effect as first
concatenating the input and then calling siphash23_compress()
on the result.
By default we set as NLM_F_CREATE | NLM_F_EXCL in
sd_rtnl_message_new_link
But incase of bridge we need to set NLM_F_REQUEST | NLM_F_ACK.
If NLM_F_EXCL is set then we are unable to set the parameters. As bridge
supports setting properties after creation not during creation.
Use %m where previously %s was used together with strerrno().
Fixes: e53fc357a9 "tree-wide: remove a number of invocations of
strerror() and replace by %m"
A few of the recent conversions to log_*_errno() were missing the errno
value arguments.
Fixes: e53fc357a9 "tree-wide: remove a number of invocations of
strerror() and replace by %m"
mq_getattr returns -1/EBADF for file descriptors which are not mq.
But we should return 0 in this case.
We first check that fd is a valid fd, so we can assume that if
mq_getattr returns EBADF, it is simply a non-mq fd. There is a slight
race, but there doesn't seem to be a nice way to fix it.
We can just use access() to check whether /run/system/system/ is a
directory, no need to involve stat(). The trick is to suffix the path
name with a dash.
This also allows us to drop build.h from a ton of files, hence do so.
Since we touched the #includes of those files, let's order them properly
according to CODING_STYLE.
Also, make it slightly more powerful, by accepting a flags argument, and
make it safe for handling if more than one cmsg attribute happens to be
attached.
Currently, we guarantee that if two event-sources with the same priority
fire at the same time, they're always dispatched in the same order. While
this might sound nice in theory, there's is little benefit in providing
stability on that level. We have no control over the order the events are
reported, hence, we cannot guarantee that we get notified about both at
the same time.
By dropping the stability guarantee, we loose roughly 10% Heap swaps in
the prioq on a desktop cold-boot. Krzysztof Kotlenga even reported up to
20% on his tests. This sounds worth optimizing, so drop the stability
guarantee.
This introduces two new helpers alongside sd_bus_path_{encode,decode}(),
which work similarly to their counterparts, but accept a format-string as
input. This allows encoding and decoding multiple labels of a format
string at the same time.
Add a 'destination' match rule for every SERVICE argument in addition to
the 'sender' rule. This is consistent with busctl(1), which documents
monitor as dumping "messages to or from this peer".
Let's underline the header line of the table shown by cgtop, how it is
customary for tables. In order to do this, let's introduce new ANSI
underline macros, and clean up the existing ones as side effect.
If code enqueues a message on one of the default busses, but doesn't
sync on it, and immediately drops the reference to the bus again, it
will stay queued and consume memory. Intrdouce a new call
sd_bus_default_flush_close() that can be invoked at the end of programs
(or threads) and flushes out all unsent messages on any of the default
busses.
The size of the allocated array for received file descriptors was
incorrectly calculated. This did not matter when a single file
descriptor was received, but for more descriptors the allocation was
insufficient.
off_t is a really weird type as it is usually 64bit these days (at least
in sane programs), but could theoretically be 32bit. We don't support
off_t as 32bit builds though, but still constantly deal with safely
converting from off_t to other types and back for no point.
Hence, never use the type anymore. Always use uint64_t instead. This has
various benefits, including that we can expose these values directly as
D-Bus properties, and also that the values parse the same in all cases.
This seems to be an oversight from:
707b66c663
We have to return ENODATA instead of ENOENT if a requested entry is
non-present. Also fix the call-site in udev to check for these errors.
When forking of a child process for connecting to a container, pass
the preicse connection error to the calling process.
We already did this correctly for kdbus busses, let's do so for dbus1
busses, too.
We should never access the "signal" part of the event source unless the
event source is actually for a signal. In this case it's a child pid
handler however, hence make sure to use the right signal.
This is a fix for PR #1177, which in turn was a fix for
9da4cb2be2.
Currently, our introspection data looks like this:
<node>
<interface name="org.freedesktop.DBus.Peer">
...
</interface>
<interface name="org.freedesktop.DBus.Introspectable">
...
</interface>
<interface name="org.freedesktop.DBus.Properties">
...
</interface>
<node name="org"/>
<node name="org/freedesktop"/>
<node name="org/freedesktop/login1"/>
<node name="org/freedesktop/login1/user"/>
<node name="org/freedesktop/login1/user/self"/>
<node name="org/freedesktop/login1/user/_1000"/>
<node name="org/freedesktop/login1/seat"/>
<node name="org/freedesktop/login1/seat/self"/>
<node name="org/freedesktop/login1/seat/seat0"/>
<node name="org/freedesktop/login1/session"/>
<node name="org/freedesktop/login1/session/self"/>
<node name="org/freedesktop/login1/session/c1"/>
</node>
(ordered alphabetically for better visibility)
This is grossly incorrect. The spec says that we're allowed to return
non-directed children, however, it does not allow us to return data
recursively in multiple parents. If we return "org", then we must not
return anything else that starts with "org/".
It is unclear, whether we can include child-nodes as a tree. Moreover, it
is usually not what the caller wants. Hence, this patch changes sd-bus to
never return introspection data recursively. Instead, only a single
child-layer is returned.
This patch relies on enumerators to never return hierarchies. If someone
registers an enumerator via sd_bus_add_enumerator, they better register
sub-enumerators if they support *TRUE* hierarchies. Each enumerator is
treated as a single layer and not filtered.
Enumerators are still allowed to return nested data. However, that data
is still required to be a single hierarchy. For instance, returning
"/org/foo" and "/com/bar" is fine, but including "/com" or "/org" in that
dataset is not.
This should be the default for enumerators and I see no reason to filter
in sd-bus. Moreover, filtering that data-set would require to sort the
strv by path and then do prefix-filtering. This is O(n log n), which
would be fine, but still better to avoid.
Fixes#664.
Whenever we run in a user context, sd_bus_{default_user,open_user}() and
friends should always connect to the user-bus of the current context,
instead of deriving the uid from getuid(). This allows us running
programs via sudo/su, without the nasty side-effect of accidentally
connecting to the root user-bus.
This patch enforces the idea of making su/sudo *not* opening sessions by
default. That is, all they do is raising privileges, but keeping
everything set as before. You can still use su/sudo to open real sessions
by requesting a login-session (or loading pam_systemd otherwise).
However, in this case XDG_RUNTIME_DIR= will not be set (as usual in these
cases), hence, you will not be able to connect to *any* user-bus.
Long story short: With this patch applied, both:
- ./busctl --user
- sudo ./busctl --user
..will successfully connect to the user-bus of the local user.
Fixes#390.
This adds a new sd_pid_get_cgroup() call to sd-login which may be used
to query the control path of a process. This is useful for programs when
making use of delegation units, in order to figure out which subtree has
been delegated.
In light of the unified control group hierarchy this is finally safe to
do, hence let's add a proper API for it, to make it easier to use this.
Commit efdb023 ("core: unified cgroup hierarchy support") introduced a new
error ENOEXEC in cg_unified() if /sys/fs/cgroup/ is not available. Adjust the
"skip" checks in various tests accordingly.
Add a corresponding "skip" check to test-bus-creds as well, as
sd_bus_creds_new_from_pid() now calls cg_unified() as well.
This re-fixes "make check" in build chroots without /sys/fs/cgroup.
https://github.com/systemd/systemd/issues/1132
Makre sure we always return sensible errors for the various, following
the same rules, and document them in a comment in sd-login.c. Also,
update all relevant man pages accordingly.
RT signals operate in a queue, and we should be careful to never merge
two queued signals into one. Hence, makes sure we only ever dequeue a
single signal at a time and leave the remaining ones queued in the
signalfd. In order to implement correct priorities for the signals
introduce one signalfd per priority, so that we only process the highest
priority signal at a time.
This simply factors out the uid validation checks from parse_uid() and
uses them everywhere. This simply verifies that the passed UID is
neither 64bit -1 nor 32bit -1.
We should never connect to the host bus as fallback if connecting to a
container failed via one method. Otherwise connecting to a dbus1
container will always result in a connection to the host.
We rely on the correct error used when opening the kdbus device node,
hence let's make sure we pass it up from the namespaced child process to
the process which actually wants to connect.
let's return ENXIO whenever we don't know something rather than ENOENT.
ENOENT suggests this was really about a file or directory, while ENXIO
is a more generic "not found" indicator.
Querying low-level DNS RRs should be done via resolved now, not via
glibc's awful res_query() API anymore. Let's not introduce an async
wrapper for it hence.
We should not fall back to dbus-1 and connect to the proxy when kdbus
returns an error that indicates that kdbus is running but just does not
accept new connections because of quota limits or something similar.
Based on a patch by Kay.
This reverts commit d4d00020d6. The idea of
the commit is broken and needs to be reworked. We really cannot reduce
the bus-addresses to a single address. We always will have systemd with
native clients and legacy clients at the same time, so we also need both
addresses at the same time.
We use dashes in our bloom-tags. Make sure the newly introduced arg0has
tag uses the same style.
Note that the external dbus-tags don't use dashes, though. They are
defined in the spec and we need to keep compatibility there.
Previously, sd-bus inofficially already supported bus matches that
tested a string against an array of strings ("as"). This was done via an
enhanced way to interpret "arg0=" matches. This is problematic however,
since clients have no way to determine if their respective
implementation understood strv matches or not, thus allowing invalid
matches to be installed without a way to detect that.
This patch changes the logic to only allow such matches with a new
"arg0has=" syntax. This has the benefit that non-conforming
implementations will return a parse error and a client application may
thus efficiently detect support for the match type.
Matches of this type are useful for "udev"-like systems that "tag" objects
with a number of strings, and clients need to be able to match against
any of these "tags".
The name "has" takes inspiration from Python's ".has_key()" construct.
This allows marking properties as "explicit". Properties marked like
this are included in the introspection, but are avoided in GetAll()
property queries, PropertiesChanged() signals and in in GetManaged()
object manager calls and InterfacesAdded() signals.
Expensive properties may be marked that way, and they will be
retrievable when explicitly being requested, but never in "blanket"
all-property queries and signals.
This flag may be combined with the flags for "const" and
"emit-validation" properties, but not with "emit-validation", as that
is only useful for properties whose value shall be sent in "blanket"
all-property signals.
The "explicit" flag is also exposed in the introspection data via a new
annotation.
When enumerating machines from /run, and when accepting machine names
for operations, be more strict and always validate.
Note that these checks are strictly speaking unnecessary, since
enumeration happens only on the trusted /run...
As it turns out machine_name_is_valid() does the exact same thing as
hostname_is_valid() these days, as it just invoked that and checked the
name length was < 64. However, hostname_is_valid() checks the length
against HOST_NAME_MAX anyway (which is 64 on Linux), hence any
additional check is redundant.
We hence replace machine_name_is_valid() by a macro that simply maps it
to hostname_is_valid() but sets the allow_trailing_dot parameter to
false. We also move this this call to hostname-util.h, to the same place
as the hostname_is_valid() declaration.
If a connection passed KDBUS_HELLO_ACTIVATOR, it cannot do I/O on the
bus. Hence, we should not treat it as proper peer. To actually query it,
you have to explicitly ask for activators.
This makes kdbus in-line with what dbus-daemon does.
This reverts commit 92d16a53e3. As it turns
out, this is not how ObjectManager is supposed to work. It is just a
special behavior of BlueZ, but no-one else implements it this way.
Revert the patch as discussed on github, and as such revert to the
previous behavior (as described in the spec).
Prior to commit c32eb440ba, libudev's
function udev_enumerate_scan_devices() had behaved differently. If
parent match was added with udev_enumerate_add_match_parent(),
udev_enumerate_scan_devices() did not return error if some child devices
had no subsystem symlink in sysfs. An example of such devices is USB
endpoints /sys/bus/usb/devices/*/ep_*. If there was a parent match
against USB device, old implementation of udev_enumerate_scan_devices()
did not treat ep_* device directories without subsystem symlink as error
and just ignored them, but new implementation returns -ENOENT (also
ignoring these devices) though correctly enumerates all other matching
devices.
To compare, you could look at 96df036fe3,
in src/libudev/libudev-enumerate.c, function parent_add_child():
if (!match_subsystem(enumerate, udev_device_get_subsystem(dev)))
goto nomatch;
udev_device_get_subsystem() was returning NULL, match_subsystem() was
returning false, and USB endpoint device was ignored.
New parent_add_child() from src/libsystemd/sd-device/device-enumerator.c
checks return value of sd_device_get_subsystem() and fails if subsystem
was not found. Absence of subsystem symlink should not be really treated
as error because all enumerations of children of USB devices will fail
with -ENOENT. This new behavior also breaks system-config-printer.
So restore old behavior and treat absence of subsystem symlink as no
match.
To be able to use `systemd-run` or `machinectl login` on a container
that is in a private user namespace, the sub-process must have entered
the user namespace before connecting to the container's D-Bus, otherwise
the UID and GID in the peer credentials are garbage.
So we extend namespace_open and namespace_enter to support UID namespaces,
and we enter the UID namespace in bus_container_connect_{socket,kernel}.
namespace_open will degrade to a no-op if user namespaces are not enabled
in the kernel.
Special handling is required for the setns call in namespace_enter with
a user namespace, since transitioning to your own namespace is forbidden,
as it would result in re-entering your user namespace as root.
Arguably it may be valid to check this at the call site, rather than
inside namespace_enter, but it is less code to do it inside, and if the
intention of calling namespace_enter is to *be* in the target namespace,
rather than to transition to the target namespace, it is a reasonable
approach.
The check for whether the user namespace is the same must happen before
entering namespaces, as we may not be able to access /proc during the
intermediate transition stage.
We can't instead attempt to enter the user namespace and then ignore
the failure from it being the same namespace, since the error code is
not distinct, and we can't compare namespaces while mid-transition.
The following functions return immediately if a null pointer was passed.
* calendar_spec_free
* link_address_free
* manager_free
* sd_bus_unref
* sd_journal_close
* udev_monitor_unref
* udev_unref
It is therefore not needed that a function caller repeats a corresponding check.
This issue was fixed by using the software Coccinelle 1.0.1.
Whenever one of our calls is invoked with a non-NULL, writable
sd_bus_error parameter, let's fill in some valid error on failure. We
previously only filled in remote errors, but never local errors, which is
hard to handle by users. Hence, let's clean this up to always fill in
the error.
This introduces a new bus_assert_return() macro that works like
assert_return() but optionally also initializes a bus_error struct.
Fixes#224.
Based on a patch by Umut Tezduyar.
We should not fall back to dbus-1 and connect to the proxy when kdbus
returns an error that indicates that kdbus is running but just does not
accept new connections because of quota limits or something similar.
Using is_kdbus_available() in libsystemd/ requires it to move from
shared/ to libsystemd/.
Based on a patch from David Herrmann:
https://github.com/systemd/systemd/pull/886
Previously, if the event loop never ran before sd_event_now() would
fail. With this change it will instead fall back to invoking now(). This
way, the function cannot fail anymore, except for programming error when
invoking it with wrong parameters.
This takes into account the fact that many callers did not handle the
error condition correctly, and if the callers did, then they kept simply
invoking now() as fall back on their own. Hence let's shorten the code
using this call, and make things more robust, and let's just fall back
to now() internally.
Whether now() is used or the cache timestamp may still be detected via
the return value of sd_event_now(). If > 0 is returned, then the fall
back to now() was used, if == 0 is returned, then the cached value was
returned.
This patch also simplifies many of the invocations of sd_event_now():
the manual fall back to now() can be removed. Also, in cases where the
call is invoked withing void functions we can now protect the invocation
via assert_se(), acknowledging the fact that the call cannot fail
anymore except for programming errors with the parameters.
This change is inspired by #841.
Using --size option triggers an assert failure below because
parse_size() requires the second argument, base, being either 1000 or
1024. As it's for a packet size, it'd be better using IEC binary
suffix (base 1024) IMHO.
$ busctl --size 2048
Assertion 'base == 1000 || base == 1024' failed at src/basic/util.c:2222,
function parse_size(). Aborting.
Aborted (core dumped)
In member_compare_func(), it compares interface, type and name of
members. But as it can contain NULL pointer, it needs to check them
before calling strcmp(). So make it as a separate strcmp_ptr
function (named after streq_ptr) so that it can be used by others.
Also let streq_ptr() to use it in order to make the code simpler.
We *must not* assume that an entry returned by KDBUS_CMD_LIST only
carries a single KDBUS_ITEM_OWNED_NAME. Similarly, we already parse
multiple such items for message-metadata, so make sure we support the
same on KDBUS_CMD_LIST.
By relying on the kernel to return all names separately, we limit the
kernel API significantly. Stop this and let the kernel decide how to
return its data.
Some places invoked fflush() directly with their own manual error
checking, let's unify all that by using fflush_and_check().
This also unifies the general error paths of fflush()+rename() file
writers.
The gvariant root container contains a 'variant' at the end, which embeds
the whole message body. This variant *must* contain a structure so we are
compatible to dbus1. Otherwise, it could encode at most 1 type, instead
of a full signature.
Our gvariant message parser already parses the variant-content as a
structure, so we're mostly good. However, it does *not* include the
opening and closing parantheses, nor does it parse them.
This patch fixes the decoder to verify a message contains the
parantheses, and also make the encoder add those parantheses into the
marshaled message.
If c->item_size is 0, the next item to parse in a structure is empty.
However, this also implies that the signature must be empty. The latter
case is already handled just fine by enter_struct_or_dict_entry() so
there is no reason to handle the same case in the caller.
Right now sd_bus_message_skip() will abort execution if passed a
signature of the unary type "()". Regardless whether this should be
supported or not, we really must not abort. Drop the incorrect assertion
and add a test-case for this.
Each signal of the ObjectManager interface carries the path of the object
in question as an argument. Therefore, a caller will deduce the object
this signal is generated for, by parsing the _argument_. A caller will
*not* use the object-path of the message itself (i.e., message->path).
This is done on purpose, so the caller can rely on message->path to be
the path of the actual object-manager that generated this signal, instead
of the path of the object that triggered this signal.
This commit fixes all InterfacesAdded/Removed signals to use the path of
the closest object-manager as message->path. 'closest' in this case means
closest parent with at least one object-manager registered.
This fix raises the question what happens if we stack object-managers in
a hierarchy. Two implementations are possible: First, we report each
object only on the nearest object-manager. Second, we report it on each
parent object-manager. This patch chooses the former. This is compatible
with other existing ObjectManager implementations, which are required to
call GetManagedObjects() recursively on each object they find, which
implements the ObjectManager interface.
In bus_kernel_translate_message(), we print a DEBUG message on unknown
items. But right now, we also print this message for KDBUS_ITEM_TIMESTAMP
despite parsing it properly. Fix this!
This adds test-bus-proxy which should be used to test correct behavior of
systemd-bus-proxyd. The first test that was added is to verify we actually
receive NameAcquired signals for ourselves on bus-connect.
If the caller does not specify arg1 for NameOwnerChanged matches, we
really must take the ID from arg2 or arg3, if provided. They are
guaranteed to be identical to arg1 if either is supplied, but there is no
strict requiredment that arg1 is supplied. Hence, make sure to always
take the more restrictive match. Otherwise, we install rather wide
matches without anyone requiring them.
Make sure we don't install NameOwnerChanged matches if the caller passed
a destination='' match (except if it is the broadcast address). Per spec,
all NameOwnerChanged signals are broadcasts.
Only the NameLost/NameAcquired signals are unicasts, but those are never
received through sd-bus. Instead, the bus-proxy synthesizes them and it
already installs proper matches for them.
In gvariant, all fixed-size objects need to be sized a multiple of their
alignment. If a structure has only fixed-size members, it is required to
be fixed size itself. If you imagine a structure like (ty), you have an
8-byte member followed by an 1-byte member. Hence, the overall inner-size
is 9. The alignment of the object is 8, though. Therefore, the specs
mandates final padding after fixed-size structures, to make sure it's
sized a multiple of its alignment (=> 16).
On the gvariant decoder side, we already account for this in
bus_gvariant_get_size(), as we apply overall padding to the size of the
structure. Therefore, our decoder correctly skips such final padding when
parsing fixed-size structure.
On the gvariant encoder side, however, we don't account for this final
padding. This patch fixes the structure and dict-entry encoders to
properly place such padding at the end of non-uniform fixed-size
structures.
The problem can be easily seen by running:
$ busctl --user monitor
and
$ busctl call --user org.freedesktop.systemd1 / org.foobar foobar "(ty)" 777 8
The monitor will fail to parse the message and print an error. With this
patch applied, everything works fine again.
This patch also adds a bunch of test-cases to force non-uniform
structures with non-pre-aligned positions.
Thanks to Jan Alexander Steffens <jan.steffens@gmail.com> for spotting
this and narrowing it down to non-uniform gvariant structures. Fixes#597.
So right now our object-tree is limited to 2 levels at most
('/' and '/foo/...../bar'). We never link any intermediate levels, even
though that was clearly the plan. Fix the bus_node_allocate() helper to
actually link all intermediate nodes, too, not just the root node.
This fixes a simple inverse ptr-diff bug.
The downside of this fix is that we clearly never tested (nor used) the
object tree in any way. The only reason that the introspection works is
that our enumerators shortcut the object tree.
Lets see whether that code actually works..
Thanks to: Nathaniel McCallum <nathaniel@themccallums.org>
..for reporting this. See #524 for an actual example code.
It is highly confusing if a getter function returns 0, but the value is
set to NULL. This, right now, triggers assertions as code relies on the
returned values to be non-NULL.
Like with sd-bus-creds and friends, return 0 only if a value is actually
available.
Discussed with Tom, and actually fixes real bugs as in #512.
We were ignoring failures from unhexchar, which meant that invalid
hex characters were being turned into garbage rather than the string
rejected.
Fix this by making unhexmem return an error code, also change the API
slightly, to return the size of the returned memory, reflecting the
fact that the memory is a binary blob,and not a string.
For convenience, still append a trailing NULL byte to the returned
memory (not included in the returned size), allowing callers to
treat it as a string without doing a second copy.
Given a container "foo", that maps user id $UID to container user, using
user namespaces, this NSS module extenstion will now map the $UID to a
name "vu-foo-$TUID" for the translated UID $UID.
Similar, userns groups are mapped to "vg-foo-$TGID" for translated GIDs
of $GID.
This simple change should make userns users more discoverable. Also,
given that many tools like "adduser" check NSS before allocating a UID,
should lower the chance of UID range conflicts between tools.
If GetManagedObjects is called on /foo/bar, then it should also include
the object /foo/bar, if it exists. Right now, we only include objects
underneath /foo/bar/.
This follows the behavior of existing dbus implementations.
Obsoletes #527 and fixes#525. Reported by: Nathaniel McCallum
All other *_get_description() functions use 'const char**', so make sure
sd_bus_slot_get_description() does the same.
This changes API, but ABI stays stable. I think this is fine, but I
wouldn't mind bumping SONAME.
Reported in #528.
Right now, if you're already in a session and call CreateSession, we
return information about the current session of yours. This is highy
confusing and a nasty hack. Avoid that, and instead return a commonly
known error, so the caller can detect that.
This has the side-effect, that we no longer override XDG_VTNR and XDG_SEAT
in pam_systemd, if you're already in a session. But this sounds like the
right thing to do, anyway.