Commit graph

46467 commits

Author SHA1 Message Date
Lennart Poettering 278fdd064d man: document the new libudev APIs 2020-09-01 17:40:13 +02:00
Lennart Poettering bf6e5c574b NEWS: explain the "bind"/"unbind" situation a bit 2020-09-01 17:40:13 +02:00
Lennart Poettering 643bb92408 logind: always check current tag list before using a device 2020-09-01 17:40:13 +02:00
Lennart Poettering fccb48b286 logind: only apply ACLs for device currently tagged with "uaccess"
This is about security, hence let's be particularly careful here: only
devices currenlty tagged with "uaccess" will get ACL management, and
it's not sufficient if they once were (though that is used for
filtering).
2020-09-01 17:40:12 +02:00
Lennart Poettering 242c1c075a core: make sure to recheck current udev tag "systemd" before considering a device ready
Let's ensure that a device once tagged can become active/inactive simply
by toggling the current tag.

Note that this means that a device once tagged with "systemd" will
always have a matching .device unit. However, the active/inactive state
of the unit reflects whether it is currently tagged that way (and
doesn't have SYSTEMD_READY=0 set).

Fixes: #7587
2020-09-01 17:40:12 +02:00
Lennart Poettering 31abedbb03 test: add test for new "sticky" tags logic 2020-09-01 17:40:12 +02:00
Lennart Poettering 3b684be04b libudev: also expose API to check for current tags in libudev 2020-09-01 17:40:12 +02:00
Lennart Poettering e77b146f82 udev: make tags "sticky"
This tries to address the "bind"/"unbind" uevent kernel API breakage, by
changing the semantics of device tags.

Previously, tags would be applied on uevents (and the database entries
they result in) only depending on the immediate context. This means that
if one uevent causes the tag to be set and the next to be unset, this
would immediately effect what apps would see and the database entries
would contain each time. This is problematic however, as tags are a
filtering concept, and if tags vanish then clients won't hence notice
when a device stops being relevant to them since not only the tags
disappear but immediately also the uevents for it are filtered including
the one necessary for the app to notice that the device lost its tag and
hence relevance.

With this change tags become "sticky". If a tag is applied is once
applied to a device it will stay in place forever, until the device is
removed. Tags can never be removed again. This means that an app
watching a specific set of devices by filtering for a tag is guaranteed
to not only see the events where the tag is set but also all follow-up
events where the tags might be removed again.

This change of behaviour is unfortunate, but is required due to the
kernel introducing new "bind" and "unbind" uevents that generally have
the effect that tags and properties disappear and apps hence don't
notice when a device looses relevance to it. "bind"/"unbind" events were
introduced in kernel 4.12, and are now used in more and more subsystems.
The introduction broke userspace widely, and this commit is an attempt
to provide a way for apps to deal with it.

While tags are now "sticky" a new automatic device property
CURRENT_TAGS is introduced (matching the existing TAGS property) that
always reflects the precise set of tags applied on the most recent
events. Thus, when subscribing to devices through tags, all devices that
ever had the tag put on them will be be seen, and by CURRENT_TAGS it may
be checked whether the device right at the moment matches the tag
requirements.

See: #7587 #7018 #8221
2020-09-01 17:40:12 +02:00
Lennart Poettering 73484ecff9 test-functions: make sure we test our own libudev instead of the host libudev
When invoking "ldd" to find dependency libraries we already set
$LD_LIBRARY_PATH to point to our own build tree, so that our libraries
are checked, not the host libraries. This is not sufficient howeever, as
libudev is built in a subdir. Add that, too.
2020-09-01 17:40:12 +02:00
Lennart Poettering 895abf3fdd
Merge pull request #16727 from wusto/core-fix-securebits
core: fix securebits setting
2020-09-01 17:21:48 +02:00
Renaud Métrich 3e5f04bf64 socket: New option 'FlushPending' (boolean) to flush socket before entering listening state
Disabled by default. When Enabled, before listening on the socket, flush the content.
Applies when Accept=no only.
2020-09-01 17:20:23 +02:00
Zbigniew Jędrzejewski-Szmek f77d6ec953
Merge pull request #16877 from poettering/tmpfiles-statx
tmpfiles: use statx()
2020-09-01 16:32:50 +02:00
Lennart Poettering df2f58176d doc: cross link sd_listen_fd() docs a bit
Let's make sure the sd_listen_fd() docs are really found from the
.socket file documentation as well as the FileDescriptorStoreMax=
documentation.

Let's also emphasize that that's where the order in which the fds are
passed are documented.

Fixes: #16647
2020-09-01 16:20:34 +02:00
Tobias Kaufmann dbdc4098f6 core: fix securebits setting
Desired functionality:
Set securebits for services started as non-root user.

Failure:
The starting of the service fails if no ambient capability shall be
raised.
... systemd[217941]: ...: Failed to set process secure bits: Operation
not permitted
... systemd[217941]: ...: Failed at step SECUREBITS spawning
/usr/bin/abc.service: Operation not permitted
... systemd[1]: abc.service: Failed with result 'exit-code'.

Reason:
For setting securebits the capability CAP_SETPCAP is required. However
the securebits (if no ambient capability shall be raised) are set after
setresuid.
When setresuid is invoked all capabilities are dropped from the
permitted, effective and ambient capability set. If the securebit
SECBIT_KEEP_CAPS is set the permitted capability set is retained, but
the effective and the ambient set are cleared.

If ambient capabilities shall be set, the securebit SECBIT_KEEP_CAPS is
added to the securebits configured in the service file and set together
with the securebits from the service file before setresuid is executed
(in enforce_user).
Before setresuid is executed the capabilities are the same as for pid1.
This means that all capabilities in the effective, permitted and
bounding set are set. Thus the capability CAP_SETPCAP is in the
effective set and the prctl(PR_SET_SECUREBITS, ...) succeeds.
However, if the secure bits aren't set before setresuid is invoked they
shall be set shortly after the uid change in enforce_user.
This fails as SECBIT_KEEP_CAPS wasn't set before setresuid and in
consequence the effective and permitted set was cleared, hence
CAP_SETPCAP is not set in the effective set (and cannot be raised any
longer) and prctl(PR_SET_SECUREBITS, ...) failes with EPERM.

Proposed solution:
The proposed solution consists of three parts
1. Check in enforce_user, if securebits are configured in the service
   file. If securebits are configured, set SECBIT_KEEP_CAPS
   before invoking setresuid.
2. Don't set any other securebits than SECBIT_KEEP_CAPS in enforce_user,
   but set all requested ones after enforce_user.
   This has the advantage that securebits are set at the same place for
   root and non-root services.
3. Raise CAP_SETPCAP to the effective set (if not already set) before
   setting the securebits to avoid EPERM during the prctl syscall.

For gaining CAP_SETPCAP the function capability_bounding_set_drop is
splitted into two functions:
- The first one raises CAP_SETPCAP (required for dropping bounding
  capabilities)
- The second drops the bounding capabilities

Why are ambient capabilities not affected by this change?
Ambient capabilities get cleared during setresuid, no matter if
SECBIT_KEEP_CAPS is set or not.
For raising ambient capabilities for a user different to root, the
requested capability has to be raised in the inheritable set first. Then
the SECBIT_KEEP_CAPS securebit needs to be set before setresuid is
invoked. Afterwards the ambient capability can be raised, because it is
in the inheritable and permitted set.

Security considerations:
Although the manpage is ambiguous SECBIT_KEEP_CAPS is cleared during
execve no matter if SECBIT_KEEP_CAPS_LOCKED is set or not. If both are
set only SECBIT_KEEP_CAPS_LOCKED is set after execve.
Setting SECBIT_KEEP_CAPS in enforce_user for being able to set
securebits is no security risk, as the effective and permitted set are
set to the value of the ambient set during execve (if the executed file
has no file capabilities. For details check man 7 capabilities).

Remark:
In capability-util.c is a comment complaining about the missing
capability CAP_SETPCAP in the effective set, after the kernel executed
/sbin/init. Thus it is checked there if this capability has to be raised
in the effective set before dropping capabilities from the bounding set.
If this were true all the time, ambient capabilities couldn't be set
without dropping at least one capability from the bounding set, as the
capability CAP_SETPCAP would miss and setting SECBIT_KEEP_CAPS would
fail with EPERM.
2020-09-01 10:53:26 +02:00
Tobias Kaufmann 57d4d284c9 capability-util: add new function for raising setpcap
Up to now the capability CAP_SETPCAP was raised implicitly in the
function capability_bounding_set_drop.

This functionality is moved into a new function
(capability_gain_cap_setpcap).

The new function optionally provides the capability set as it was
before raisining CAP_SETPCAP.
2020-09-01 10:53:26 +02:00
Anita Zhang 0419dae715
Merge pull request #16885 from keszybz/rework-cache-timestamps
Rework cache timestamps
2020-08-31 23:21:12 -07:00
Anita Zhang 026a72c175
Merge pull request #16917 from poettering/contrib-rfe
CONTRIBUTING: be clearer about versions and RFE process
2020-08-31 17:25:36 -07:00
Lennart Poettering c4bc2e9343 CONTRIBUTING: be clearer about versions and RFE process
Fixes: #16550
2020-08-31 23:23:56 +02:00
Lennart Poettering 4dc286658d dissect-image: generate nice error when we can't detect any file system 2020-08-31 22:11:53 +02:00
Zbigniew Jędrzejewski-Szmek c2911d48ff Rework how we cache mtime to figure out if units changed
Instead of assuming that more-recently modified directories have higher mtime,
just look for any mtime changes, up or down. Since we don't want to remember
individual mtimes, hash them to obtain a single value.

This should help us behave properly in the case when the time jumps backwards
during boot: various files might have mtimes that in the future, but we won't
care. This fixes the following scenario:

We have /etc/systemd/system with T1. T1 is initially far in the past.
We have /run/systemd/generator with time T2.
The time is adjusted backwards, so T2 will be always in the future for a while.
Now the user writes new files to /etc/systemd/system, and T1 is updated to T1'.
Nevertheless, T1 < T1' << T2.
We would consider our cache to be up-to-date, falsely.
2020-08-31 20:53:38 +02:00
Zbigniew Jędrzejewski-Szmek 02103e5716 core: always try to reload not-found unit
This check was added in d904afc730. It would only
apply in the case where the cache hasn't been loaded yet. I think we pretty
much always have the cache loaded when we reach this point, but even if we
didn't, it seems better to try to reload the unit. So let's drop this check.
2020-08-31 20:53:38 +02:00
Zbigniew Jędrzejewski-Szmek c149d2b491 pid1: use the cache mtime not clock to "mark" load attempts
We really only care if the cache has been reloaded between the time when we
last attempted to load this unit and now. So instead of recording the actual
time we try to load the unit, just store the timestamp of the cache. This has
the advantage that we'll notice if the cache mtime jumps forward or backward.

Also rename fragment_loadtime to fragment_not_found_time. It only gets set when
we failed to load the unit and the old name was suggesting it is always set.

In https://bugzilla.redhat.com/show_bug.cgi?id=1871327
(and most likely https://bugzilla.redhat.com/show_bug.cgi?id=1867930
and most likely https://bugzilla.redhat.com/show_bug.cgi?id=1872068) we try
to load a non-existent unit over and over from transaction_add_job_and_dependencies().
My understanding is that the clock was in the future during inital boot,
so cache_mtime is always in the future (since we don't touch the fs after initial boot),
so no matter how many times we try to load the unit and set
fragment_loadtime / fragment_not_found_time, it is always higher than cache_mtime,
so manager_unit_cache_should_retry_load() always returns true.
2020-08-31 20:53:38 +02:00
Zbigniew Jędrzejewski-Szmek 81be23886d core: rename manager_unit_file_maybe_loadable_from_cache()
The name is misleading, since we aren't really loading the unit from cache — if
this function returns true, we'll try to load the unit from disk, updating the
cache in the process.
2020-08-31 20:53:38 +02:00
Lennart Poettering b519529104
Merge pull request #16841 from keszybz/acl-util-bitmask
Use a bitmask in fd_add_uid_acl_permission()
2020-08-31 16:45:13 +02:00
Lennart Poettering e2b40db616 man: document fd ownership for sd-bus fd marshalling
Fixes: #8003
2020-08-30 18:09:39 +01:00
Daan De Meyer a667ddf132
Merge pull request #16898 from poettering/resolved-errno
make sure we have `answer_errno` always properly initialized before we end a transaction with ERRNO cause
2020-08-30 18:05:38 +01:00
Florian Klink 43269733ef homed: fix log message to honor real homework path
This seems to be overridable by setting the SYSTEMD_HOMEWORK_PATH env
variable, but the error message always printed the SYSTEMD_HOMEWORK_PATH
constant.
2020-08-30 17:04:01 +02:00
fangxiuning 6d95e7d9b2
tree-wide: drop pointless zero initialization (#16900) 2020-08-30 06:21:20 +09:00
Fabrice Fontaine 28e2641a1a
src/shared/dissect-image.c: fix build without blkdid (#16901)
N_DEVICE_NODE_LIST_ATTEMPTS is unconditionally used since version 246 and
ac1f3ad05f

However, this variable is only defined if HAVE_BLKID is set resulting in
the following build failure if cryptsetup is enabled but not libblkid:

../src/shared/dissect-image.c:1336:34: error: 'N_DEVICE_NODE_LIST_ATTEMPTS' undeclared (first use in this function)
 1336 |         for (unsigned i = 0; i < N_DEVICE_NODE_LIST_ATTEMPTS; i++) {
      |

Fixes:
 - http://autobuild.buildroot.org/results/67782c225c08387c1bbcbea9eee3ca12bc6577cd
2020-08-30 06:17:18 +09:00
Michael Biebl 9654645b62
Merge pull request #16850 from mbiebl/networkd-socket-activation
networkd: use socket activation when starting networkd
2020-08-29 16:12:46 +02:00
Lennart Poettering ca19dedb56 sym: fix bad symbol file addition
The next release is 247, not 248.

A follow-up for 2b07ec316a
2020-08-29 12:35:00 +02:00
Lennart Poettering a75cb4e20a resolved: remove superfluous ;; 2020-08-28 22:59:26 +02:00
Lennart Poettering 95d2155aeb tree-wide: no need to negate argument to ERROR_IS_XYZ()
These macros call abs() internally, hence let's simplify invocations.
2020-08-28 22:58:58 +02:00
Lennart Poettering fd8a301703 resolved: make sure we initialize t->answer_errno before completing the transaction
We must have the error number around when completing the transaction.
Let's hence make sure we always initialize it *first* (we accidentally
did it once after).

Fixes: #11626
2020-08-28 22:44:57 +02:00
Michael Biebl 3aa645f0c0 test-network: stop networkd and its socket
With the changes from 2c0dffe82d, starting
systemd-networkd.service will also activate systemd-networkd.socket.
When tearing down a test, we need to stop the socket as well, to make
sure networkd can't be activated accidentally with the wrong
configuration.
2020-08-28 21:03:09 +02:00
Lennart Poettering 5895a9d600
Merge pull request #16874 from poettering/analyze-cap
introduce "systemd-analyze capability" command for dumping locally defined caps
2020-08-28 21:01:56 +02:00
Daniel Mack 5170afbc55 clock-util: read timestamp from /usr/lib/clock-epoch
On systems without an RTC, systemd currently sets the clock to a
compile-time epoch value, derived from the NEWS file in the
repository. This is not ideal as the initial clock hence depends
on the last time systemd was built, not when the image was compiled.

Let's provide a different way here and look at `/usr/lib/clock-epoch`.
If that file exists, it's timestamp for the last modification will be
used instead of the compile-time default.
2020-08-28 18:58:22 +02:00
Kyle Russell dd05042039 units: add missing usb-gadget.target 2020-08-28 18:57:58 +02:00
Lennart Poettering b2af819b22 analyze: add 'capability' verb for dumping all known and unknown caps 2020-08-28 18:52:48 +02:00
Lennart Poettering 0f849d0af9 analyze: fix error handling in one case 2020-08-28 18:14:53 +02:00
Lennart Poettering 524bdc95e7 basic: make sure we include inttypes.h when we use its types 2020-08-28 18:14:35 +02:00
Lennart Poettering 3c719357dc man: extend on the usec/sec discrepancy
Let's document the discrepancy between the Sec and USec suffixing of
unit files and D-Bus properties at three places: in "systemctl show"
(where it already was briefly mentioned), in the D-Bus interface
description (at one place at least, i.e. the most prominent of
properties that encapsulate time values, there are many more) and in the
general man page explaining time values.

By documenting this at all three places I think we now do as much as we
can do about this highlighting the discrepancy of the naming and the
reasons behind it.

Fixes: #2047
2020-08-28 18:01:17 +02:00
fangxiuning c53aafb7b5
tree-wide: drop pointless zero initialization (#16884)
tree-wide: drop pointless zero initialization
2020-08-28 17:45:54 +02:00
Lennart Poettering cd17bb6e71 networkd: consider any uevent other than "remove" sufficient for the network device to be ready 2020-08-28 17:45:05 +02:00
Ikey Doherty 97207ac85c login/logind: Include sys/stat.h for struct stat usage
We need to include `<sys/stat.h>` for usage of the `struct stat` in
the Manager struct, much as we already include `<stdbool.h>` for C99
booleans.

This helps alleviate another minor build failure on non-glibc systems.
2020-08-28 17:44:39 +02:00
Lennart Poettering ddb439b8f9 tmpfiles: use statx() when aging files
This allows us to properly detect mount points, for free. (Also, allows
us to respect btimes that are newer than the cutoff, which should be
useful when people untar file trees in /var/tmp)

Fixes: #16848
2020-08-28 15:45:37 +02:00
Lennart Poettering cd1361e203 time-util: add timespec_store_nsec()
timespec_store_nsec() is to timespec_store() what timespec_load_nsec()
is to timespec_load(), i.e. the nsec version of the usual usec API
2020-08-28 14:22:43 +02:00
Plan C b23728ec9f
Add simple usage message to systemd-backlight (#16709) 2020-08-28 13:50:35 +02:00
Ikey Doherty 677bb0555a partition/makefs: Include missing sys/file.h header
This file must be included on non-glibc systems to ensure
the `LOCK_EX` definition is available.

Signed-off-by: Ikey Doherty <ikey.doherty@lispysnake.com>
2020-08-28 13:32:09 +02:00
Lennart Poettering ae6ad21e0b device: propagate reload events from devices on everything but "add", and "remove"
Any uevent other then the initial and the last uevent we see for a
device (which is "add" and "remove") should result in a reload being
triggered, including "bind" and "unbind". Hence, let's fix up the check.

("move" is kinda a combined "remove" + "add", hence cover that too)
2020-08-28 13:30:13 +02:00