Currently, mount_sysfs() only creates /sys/fs/cgroup if cg_ns_supported().
The comment explains that we need to "Create mountpoint for
cgroups. Otherwise we are not allowed since we remount /sys read-only.";
that is: that we need to do it now, rather than later. However, the
comment doesn't do anything to explain why we only need to do this if
cg_ns_supported(); shouldn't we _always_ need to do it?
The answer is that if !use_cgns, then this was already done by the outer
child, so mount_sysfs() only needs to do it if use_cgns. Now,
mount_sysfs() doesn't know whether use_cgns, but !cg_ns_supported() implies
!use_cgns, so we can "optimize" the case where we _know_ !use_cgns, and deal
with a no-op mkdir_p() in the false-positive where cg_ns_supported() but
!use_cgns.
But is it really much of an optimization? We're potentially spending an
access(2) (cg_ns_supported() could be cached from a previous call) to
potentially save an lstat(2) and mkdir(2); and all of them are on virtual
filesystems, so they should all be pretty cheap.
So, simplify and drop the conditional. It's a dubious optimization that
requires more text to explain than it's worth.
Remove "arbitrary named hierarchies" from the list of things that
cg_kernel_controllers() might return, and clarify that "name="
pseudo-controllers are not included in the returned list.
/proc/cgroups does not contain "name=" pseudo-controllers, and
cg_kernel_controllers() makes no effort to enumerate them via a different
mechanism.
One of the things that tmpfs_patch_options does is take an (optional) UID,
and insert "uid=${UID},gid=${UID}" into the options string. So we need a
uid_t argument, and a way of telling if we should use it. Fortunately,
that is built in to the uid_t type by having UID_INVALID as a possible
value.
So this is really a feature that requires one argument. Yet, it is somehow
taking 4! That is absurd. Simplify it to only take one argument, and have
that trickle all the way up to mount_all()'s usage.
Now, in many of the uses, the argument becomes
uid_shift == 0 ? UID_INVALID : uid_shift
because it used to treat uid_shift=0 as invalid unless the patch_ids flag
was also set. This keeps the behavior the same. Note that in all cases
where it is invoked, if !use_userns (sometimes called !userns), then
uid_shift is 0; we don't have to add any checks for that.
That said, I'm pretty sure that "uid=0" and not setting "uid=" are the
same, but Christian Brauner seemed to not think so when implementing the
cgns support. https://github.com/systemd/systemd/pull/3589
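As a rough illustration of the single-argument shape described above, here is a minimal standalone sketch. It is not the nspawn code: SKETCH_UID_INVALID stands in for UID_INVALID, and sketch_uid_from_shift() and sketch_patch_options() are hypothetical names made up for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Assumption: stand-in for systemd's UID_INVALID sentinel. */
#define SKETCH_UID_INVALID ((uid_t) -1)

/* Hypothetical helper: map a uid_shift to the single uid_t argument,
 * treating a shift of 0 as "do not set uid=/gid=". */
uid_t sketch_uid_from_shift(uid_t uid_shift) {
        return uid_shift == 0 ? SKETCH_UID_INVALID : uid_shift;
}

/* Hypothetical helper: returns a newly allocated options string with
 * "uid=...,gid=..." appended when uid is valid, or NULL on allocation
 * failure. The caller frees the result. */
char *sketch_patch_options(const char *options, uid_t uid) {
        size_t n;
        char *buf;

        assert(options);

        n = strlen(options) + 64; /* room for two decimal IDs and separators */
        buf = malloc(n);
        if (!buf)
                return NULL;

        if (uid == SKETCH_UID_INVALID)
                snprintf(buf, n, "%s", options);
        else if (*options)
                snprintf(buf, n, "%s,uid=%lu,gid=%lu", options,
                         (unsigned long) uid, (unsigned long) uid);
        else
                snprintf(buf, n, "uid=%lu,gid=%lu",
                         (unsigned long) uid, (unsigned long) uid);

        return buf;
}
```

The sentinel carries the "should we use it" decision, so no extra flag or mask argument is needed.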
One of the things that mkdir_userns{,_p}() does is take an (optional) UID,
and chown the directory to that. So we need a uid_t argument, and a way of
telling if we should use that uid_t argument. Fortunately, that is built
in to the uid_t type by having UID_INVALID as a possible value.
However, currently mkdir_userns() also takes a MountSettingsMask and checks
a couple of bits in it to decide if it should perform the chown.
Drop the mask argument, and instead have the caller pass UID_INVALID if it
shouldn't chown.
When we open our own little namespace for running our tests in, let's
turn off mount propagation only one way, rather than both ways. This is
better as this means we don't pin host mounts unnecessarily long in our
namespace, even though the host already got rid of them. This is because
MS_SLAVE in contrast to MS_PRIVATE allows umount events to propagate
from the host into our environment.
Looking at a recent Bad Day, my log contains over 100 lines of
systemd[23895]: Failed to connect to API bus: Connection refused
It is due to "systemd --user" retrying to connect to an API bus.[*] I
would prefer to avoid spamming the logs. I don't think it is good for us
to retry so much like this.
systemd was misled by something setting DBUS_SESSION_BUS_ADDRESS. My best
guess is an unfortunate series of events caused gdm to set this. gdm has
code to start a session dbus if there is not a bus available already (and
in this case it exports the environment variable). I believe it does not
normally do this when running under systemd, because "systemd --user" and
hence "dbus.service" would already have been started by pam_systemd.
I see two possibilities:
1. Rip out the check for DBUS_SESSION_BUS_ADDRESS entirely.
2. Only check for DBUS_SESSION_BUS_ADDRESS on startup. Not in the
"recheck" logic.
The justification for 2) is that the recheck is called from unit_notify(),
where it is used to check whether the service that just started (or stopped)
was "dbus.service". This reason for rechecking does not apply if we think
the session bus was started outside our logic.
But I think we can justify 1). dbus-daemon ships a statically-enabled
/usr/lib/systemd/user/dbus.service, which would conflict with an attempt to
use an external dbus. Also "systemd --user" is started from user@.service;
if you try to start it manually so that it inherits an environment
variable, it will conflict if user@.service was started by pam_systemd
(or loginctl enable-linger).
This allows aliases to be used for the basic modules we load from pid1 before
udev is started. In #9501 the kernel renamed autofs4 to autofs, with "autofs4"
as alias, but we wouldn't load the module, because we didn't follow aliases.
The kernel change was reverted, but it's probably better to support aliases.
These custom macros make the expression go through a function, in order
to prevent ASSERT_SIDE_EFFECT false positives on our macros such as
assert_se() and assert_return() that cannot be disabled and will always
evaluate their expressions.
This technique has been described and recommended in:
https://community.synopsys.com/s/question/0D534000046Yuzb/suppressing-assertsideeffect-for-functions-that-allow-for-sideeffects
Tested by doing a local cov-build and uploading the resulting tarball to
scan.coverity.com, confirmed that the ASSERT_SIDE_EFFECT false positives
were gone.
This makes bus_slot_disconnect() unref the slot object from bus when
`unref == true` and it is floating, as the function removes the
reference from the relevant bus object.
This reverts 20d4ee2cbc, as it
introduces #9604.
Fixes #9604.
key_serial_t is defined in keyutil.h, which wasn't included in the header list
in the test, so the test always failed. We were always compiling stuff with
!HAVE_KEY_SERIAL_T.
We could try to add keyutil.h to the test, but then we'd have to first check if
it is available, which just doesn't seem worth the trouble.
key_serial_t should always be defined as int32_t. Let's keep the unconditional
define, since repeated compatible typedefs are not a problem, and it allows us
to compile even if the header file is missing. If there's ever a change in the
definition, we'll have to adjust the code for the different type anyway, and
our compiler will tell us.
Using _GNU_SOURCE is better because that's how we include the headers in the
actual build, and some headers define different stuff when it is defined.
sys/stat.h for example defines 'struct statx' conditionally.
The switch to memory_startswith() changed the logic to only look for a space or
NUL byte after the matched word, but matching the full size should also be
acceptable.
This changed the behavior of parsing of "AUTH\r\n", where m will be set to 4,
since even though the word will match, the check for it being followed by ' '
or NUL will make line_begins() return false.
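A minimal sketch of the relaxed check, assuming the line length m excludes the trailing "\r\n". sketch_line_begins() is a hypothetical re-implementation for illustration, not the actual sd-bus helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical re-implementation: does the line s (m bytes long) begin
 * with the given word? */
bool sketch_line_begins(const char *s, size_t m, const char *word) {
        size_t l;

        assert(s);
        assert(word);

        l = strlen(word);
        if (m < l || memcmp(s, word, l) != 0)
                return false;

        /* Accept an exact full-size match, or a word followed by ' ' or NUL. */
        return m == l || s[l] == ' ' || s[l] == '\0';
}
```

With this, "AUTH\r\n" with m set to 4 matches "AUTH" again, while "AUTHX" still does not.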
Tested:
- Using netcat to connect to the private socket directly:
$ echo -ne '\0AUTH\r\n' | sudo nc -U /run/systemd/private
REJECTED EXTERNAL ANONYMOUS
- Running the Ignition blackbox test:
$ sudo sh -c 'PATH=$PWD/bin/amd64:$PATH ./tests.test'
PASS
Fixes: d27b725abf
The current CLI does not support a way to clear these lists, since without any
additional arguments, the command will list the current values.
Introduce a new way to clear the lists by passing a single '' argument to these
subcommands.
Update the man page to document this.
Tested:
$ build/resolvectl domain eth1
Link 3 (eth1): ~.
$ build/resolvectl domain eth1 ''
$ build/resolvectl domain eth1
Link 3 (eth1):
$ build/resolvectl domain eth1 '~.' '~example.com'
$ build/resolvectl domain eth1
Link 3 (eth1): ~. ~example.com
$ build/resolvectl domain eth1 ''
$ build/resolvectl domain eth1
Link 3 (eth1):
$ build/resolvectl domain eth1 '~.'
$ build/resolvectl domain eth1
Link 3 (eth1): ~.
And similar for "dns" and "nta".
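The argument handling described above can be sketched roughly as follows. The enum and sketch_classify() are hypothetical names for this example, not the resolvectl code.

```c
#include <assert.h>
#include <stddef.h>

typedef enum {
        SKETCH_LIST,   /* no arguments: show current values */
        SKETCH_CLEAR,  /* a single '' argument: clear the list */
        SKETCH_SET     /* anything else: set the list */
} SketchAction;

/* n_args/args: the arguments following the interface name. */
SketchAction sketch_classify(size_t n_args, const char *const *args) {
        if (n_args == 0)
                return SKETCH_LIST;

        assert(args);

        if (n_args == 1 && args[0][0] == '\0')
                return SKETCH_CLEAR;

        return SKETCH_SET;
}
```

The empty string is distinguishable from "no arguments", which is what lets listing keep its existing meaning.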
Check if the fd is a folder before setting default acls
Tested:
Ubuntu 18.04.
test.conf: A+ /tmp/test - - - - u:user2:rw,d:u:user1:rwx
The folder /tmp/test looks like
/tmp/test/file1
/tmp/test/folder2
start systemd-tmpfiles manually
Fixes: #9545
This got broken in 9d9dd746d4, because a template
is not a valid unit, so the check for being masked failed. Avoid this by
handling templates specially. Fixes #9554.
Also, this improves 'cat' with masked units:
(before) $ systemctl cat foofoofoo@.service
Failed to derive unit name prefix from unit name: Invalid argument
Failed to derive unit name prefix from unit name: Invalid argument
Failed to derive unit name prefix from unit name: Invalid argument
Failed to derive unit name prefix from unit name: Invalid argument
Failed to derive unit name prefix from unit name: Invalid argument
Failed to derive unit name prefix from unit name: Invalid argument
Failed to derive unit name prefix from unit name: Invalid argument
Failed to derive unit name prefix from unit name: Invalid argument
Failed to derive unit name prefix from unit name: Invalid argument
Failed to derive unit name prefix from unit name: Invalid argument
(after) $ build/systemctl cat foofoofoo@.service
In check_triggering_units(), the call to unit_is_masked() is replaced with an
open-coded check. This is a bit unfortunate, but unit_is_masked() now requires
LookupPaths to be initialized, which we don't have or need in this case, so it
seems easiest to just accept this tiny code duplication.
This adds sd_bus_{get,set}_method_call_timeout().
If the timeout is not set or set to 0, then the timeout value is
parsed from the $SYSTEMD_BUS_TIMEOUT environment variable. If the
environment variable is not set, then the built-in timeout is used.
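The fallback order can be sketched like this. This is an illustration only: sketch_effective_timeout() is a hypothetical name, the 25s default is a placeholder assumption, and the environment variable is parsed as plain integer seconds here for simplicity, which may differ from what sd-bus accepts.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_USEC_PER_SEC 1000000ULL
/* Assumption: placeholder for the built-in default timeout. */
#define SKETCH_DEFAULT_TIMEOUT_USEC (25 * SKETCH_USEC_PER_SEC)

/* Resolve the timeout: explicit value, then environment, then built-in. */
uint64_t sketch_effective_timeout(uint64_t configured_usec) {
        const char *e;
        char *end;
        unsigned long long sec;

        if (configured_usec != 0)
                return configured_usec;

        e = getenv("SYSTEMD_BUS_TIMEOUT");
        if (e && e[0] >= '0' && e[0] <= '9') {
                errno = 0;
                sec = strtoull(e, &end, 10);
                if (errno == 0 && *end == '\0' && sec > 0 &&
                    sec <= UINT64_MAX / SKETCH_USEC_PER_SEC)
                        return sec * SKETCH_USEC_PER_SEC;
        }

        return SKETCH_DEFAULT_TIMEOUT_USEC;
}
```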
Unfortunately this needs libshared to link to libkmod. Before it was linked
into systemd-udevd, udevadm, and systemd each separately. On most systems this
doesn't make much difference, because at least systemd would be installed, but
it might not be in small chroots. It is a small library, so I hope this is not
a big issue.
Back in 2012 the project was renamed, see the release notes for v 0.105
[https://cgit.freedesktop.org/polkit/tree/NEWS#n754]. Let's update our
documentation and comments to do the same. Referring to PolicyKit is confusing
to users because at the time the polkit api changed too, and we support the new
version. I updated NEWS too, since all the references to PolicyKit there were
added after the rename.
"PolicyKit" is unchanged in various URLs and method call names.
Starting with glibc 2.27.9000-36.fc29, include file sys/stat.h will have a
definition for struct statx, in which case include file linux/stat.h should be
avoided, in order to prevent a duplicate definition.
In file included from ../src/basic/missing.h:18,
from ../src/basic/util.h:28,
from ../src/basic/hashmap.h:10,
from ../src/shared/bus-util.h:12,
from ../src/libsystemd/sd-bus/bus-creds.c:11:
/usr/include/linux/stat.h:99:8: error: redefinition of ‘struct statx’
struct statx {
^~~~~
In file included from /usr/include/sys/stat.h:446,
from ../src/basic/util.h:19,
from ../src/basic/hashmap.h:10,
from ../src/shared/bus-util.h:12,
from ../src/libsystemd/sd-bus/bus-creds.c:11:
/usr/include/bits/statx.h:36:8: note: originally defined here
struct statx
^~~~~
Extend our meson.build to look for struct statx when only sys/stat.h is
included and, in that case, do not include linux/stat.h anymore.
Tested that systemd builds correctly when using a glibc version that includes a
definition for struct statx.
glibc Fedora RPM update:
28cb5d31fc
glibc upstream commit:
https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=fd70af45528d59a00eb3190ef6706cb299488fcd
When unmounting user runtime directory, only UID is necessary,
and the corresponding user may not exist anymore.
This first tries to parse the input with parse_uid(), and only if that
fails, parses the input with get_user_creds().
Fixes #9541.
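A standalone sketch of the "numeric first, name lookup second" order, using plain libc calls in place of parse_uid() and get_user_creds(). sketch_resolve_uid() is a hypothetical name for this example.

```c
#include <assert.h>
#include <errno.h>
#include <pwd.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper: returns 0 and stores the UID in *ret on success,
 * or a negative errno-style code on failure. */
int sketch_resolve_uid(const char *s, uid_t *ret) {
        char *end;
        unsigned long v;
        struct passwd *pw;

        assert(s);
        assert(ret);

        /* First try: a plain decimal UID. This works even if the
         * corresponding user no longer exists. */
        if (*s >= '0' && *s <= '9') {
                errno = 0;
                v = strtoul(s, &end, 10);
                if (errno == 0 && *end == '\0' && v < 0xFFFFFFFFUL) {
                        *ret = (uid_t) v;
                        return 0;
                }
        }

        /* Fallback: look the name up in the user database. */
        pw = getpwnam(s);
        if (!pw)
                return -ENOENT;

        *ret = pw->pw_uid;
        return 0;
}
```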
When a slot is disconnected, then slot->match_callback.install_slot
is also disconnected. So, bus_slot_disconnect() removes the install_slot
from the list of slots in bus, although it is a floating object.
This makes install_slot unreffed from bus when it is disconnected.
Fixes #9505 and #9510.
The kernel added support for a new cgroup memory controller knob memory.min in
bf8d5d52ffe8 ("memcg: introduce memory.min") which was merged during v4.18
merge window.
Add MemoryMin to support memory.min.
This fixes the following valgrind warning:
```
Syscall param sendmsg(msg.msg_name) points to uninitialised byte(s)
at 0x6189CC1: sendmsg (in /usr/lib64/libpthread-2.27.so)
by 0x153082: dns_stream_writev (resolved-dns-stream.c:235)
by 0x153343: dns_stream_tls_writev (resolved-dns-stream.c:299)
by 0x5B30343: ??? (in /usr/lib64/libgnutls.so.30.20.2)
by 0x5B3158F: ??? (in /usr/lib64/libgnutls.so.30.20.2)
by 0x5B33190: ??? (in /usr/lib64/libgnutls.so.30.20.2)
by 0x5B36307: ??? (in /usr/lib64/libgnutls.so.30.20.2)
by 0x5B37D47: gnutls_handshake (in /usr/lib64/libgnutls.so.30.20.2)
by 0x154591: dns_stream_connect_tls (resolved-dns-stream.c:596)
by 0x13A889: dns_transaction_emit_tcp (resolved-dns-transaction.c:676)
by 0x13D901: dns_transaction_go (resolved-dns-transaction.c:1761)
by 0x1330C8: dns_query_candidate_go (resolved-dns-query.c:156)
Address 0xa9ac268 is 312 bytes inside a block of size 592 alloc'd
at 0x4C30B06: calloc (vg_replace_malloc.c:711)
by 0x1541F8: dns_stream_new (resolved-dns-stream.c:545)
by 0x13A662: dns_transaction_emit_tcp (resolved-dns-transaction.c:642)
by 0x13D901: dns_transaction_go (resolved-dns-transaction.c:1761)
by 0x1330C8: dns_query_candidate_go (resolved-dns-query.c:156)
by 0x134E16: dns_query_go (resolved-dns-query.c:757)
by 0x11F3FB: bus_method_resolve_hostname (resolved-bus.c:353)
by 0x4F947A7: method_callbacks_run (bus-objects.c:402)
by 0x4F97266: object_find_and_run (bus-objects.c:1260)
by 0x4F978B1: bus_process_object (bus-objects.c:1376)
by 0x4FAF82C: process_message (sd-bus.c:2661)
by 0x4FAFA1B: process_running (sd-bus.c:2703)
```
If --dev-kvm-mode is set to something different than 0666, which we
explicitly support, it makes sense to still apply the uaccess tag to
/dev/kvm. For distros which opt to use the default 0666, this change is
a nop.
This partially reverts commit b8fd3d8220.
Prior to this commit, a .link file with a [Match] section containing
MACAddress= would match any device without a MAC. This restores the
matching logic prior to e90d037.
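The intended matching rule can be sketched as below. This is an illustration with hypothetical names, not the actual .link matching code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_ETH_ALEN 6

/* match_mac: MACAddress= from [Match], or NULL if unset.
 * dev_mac: the device's address, or NULL if the device has none. */
bool sketch_mac_matches(const unsigned char *match_mac,
                        const unsigned char *dev_mac) {
        if (!match_mac)
                return true;   /* no MACAddress= condition: nothing to check */

        if (!dev_mac)
                return false;  /* condition set, device has no MAC: no match */

        return memcmp(match_mac, dev_mac, SKETCH_ETH_ALEN) == 0;
}
```

The second branch is the case that regressed: a device without a MAC must not satisfy a MACAddress= condition.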
This is useful if someone wants to recreate the original syslog datagram. We
already include timestamp information as _SOURCE_REALTIME_TIMESTAMP=, and in
normal use that timestamp, converted back to the form used by syslog
(Mth dd HH:MM:SS) would usually give the value. But there are various
circumstances where this might not be true. Most obviously, if the datagram is
sent a bit later after being prepared, the time is rounded to the nearest
second, and it might be off. This is especially bad around New Year when the
syslog timestamp wraps around. Then the same timezone and locale need to be
used to recreate the original timestamp. In the end doing this reliably is
complicated, and it seems much easier to just unconditionally include the
original timestamp.
If the original timestamp cannot be located, we store the full log line.
This way, it should be always possible to recreate the original input.
Example:
MESSAGE=x
SYSLOG_TIMESTAMP=Sep 15 15:07:58
SYSLOG_RAW
^]^@^@^@^@^@^@^@<13>Sep 15 15:07:58 HOST: x^@y
_PID=3318
_SOURCE_REALTIME_TIMESTAMP=1530743976393553
Fixes #2398.
Some modern programming languages use userspace context switches
to implement coroutine features. On PowerPC (32-bit), getting contexts or
switching between contexts requires the "swapcontext" syscall, which is unusual.
Adding this rule should fix #9485.
The D-Bus library supplies a va_list variant of
`sd_bus_message_append()` called `sd_bus_message_appendv()`,
but failed to provide a va_list variant of its opposite,
`sd_bus_message_read()`. This commit publicizes a previously static
function as `sd_bus_message_readv()`.
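To show why a public va_list variant matters, here is a generic sketch of the pattern with a toy message type. None of these names are sd-bus API: SketchMessage and the sketch_* functions are hypothetical stand-ins.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical stand-in for a message: a cursor over an int array. */
typedef struct {
        const int *values;
        size_t n, pos;
} SketchMessage;

/* va_list variant: for each 'i' in types, store the next value
 * through an int* taken from ap. Returns 0 or -1. */
int sketch_message_readv(SketchMessage *m, const char *types, va_list ap) {
        assert(m);
        assert(types);

        for (; *types; types++) {
                int *out;

                if (*types != 'i' || m->pos >= m->n)
                        return -1;

                out = va_arg(ap, int *);
                *out = m->values[m->pos++];
        }

        return 0;
}

/* Variadic front end, implemented on top of the va_list variant. */
int sketch_message_read(SketchMessage *m, const char *types, ...) {
        va_list ap;
        int r;

        va_start(ap, types);
        r = sketch_message_readv(m, types, ap);
        va_end(ap);
        return r;
}

/* A caller-defined variadic wrapper. It can forward its arguments only
 * because the va_list variant is available to it. */
int sketch_read_counted(SketchMessage *m, unsigned *n_calls,
                        const char *types, ...) {
        va_list ap;
        int r;

        assert(n_calls);
        (*n_calls)++;

        va_start(ap, types);
        r = sketch_message_readv(m, types, ap);
        va_end(ap);
        return r;
}
```

A variadic function cannot pass its "..." on to another variadic function, so wrappers like the last one need the va_list entry point.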
This makes OBJECT_PATH_FOREACH_PREFIX consistent with PATH_FOREACH_PREFIX
and also fixes 7 alerts reported by LGTM at
ac0a087003/files/src/libsystemd/sd-bus/bus-objects.c?sort=name&dir=ASC&mode=heatmap&showExcluded=true#V1383
nspawn as it is now is a generally useful tool, hence let's drop the
comments about it being useful for debug and so on only.
The new wording just makes the first sentence of the main page also the
summary.
Use PRIu64 constant to get the format right on LP-64 architectures,
cast to (uint64_t) to solve incompatibility of __u64.
This was missed in ad4bc33522, so fix it
with this follow up.
Also use compat_main() when called as `resolvconf`, since the interface
is closer to that of `systemd-resolve`.
Use a heap allocated string to set arg_ifname, since a stack allocated
one would be lost after the function returns. (This last one broke the
case where an interface name was suffixed with a dot, such as in
`resolvconf -a tap0.dhcp`.)
Tested:
$ build/resolvconf -a nonexistent.abc </etc/resolv.conf
Unknown interface 'nonexistent': No such device
Fixes #9423.
Use PRIu64 and PRIu32 constants to also get the format right on LP-64
architectures.
For the 64-bit fields, we need a cast to (uint64_t), since __u64 is
defined as a `long long unsigned` and PRIu64 expects a `long unsigned`.
In practice, both are the same, so the cast should be OK.
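A small sketch of the format-plus-cast pattern. The sketch_u64 typedef is a stand-in for the kernel's __u64 so that the example builds without kernel headers, and sketch_format_u64() is a hypothetical helper.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Assumption: stand-in for the kernel's __u64 (`unsigned long long`). */
typedef unsigned long long sketch_u64;

/* Formats v as decimal; returns what snprintf() returns. */
int sketch_format_u64(char *buf, size_t size, sketch_u64 v) {
        assert(buf);

        /* PRIu64 matches uint64_t, so cast explicitly rather than rely on
         * the two types happening to share a representation. */
        return snprintf(buf, size, "%" PRIu64, (uint64_t) v);
}
```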
gmtime() and localtime() operate on a static buffer. Let's avoid this,
as we never know whether some library might use these calls in some
background thread.
Discovered by lgtm:
https://lgtm.com/projects/g/systemd/systemd/
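For illustration, a minimal sketch of the reentrant replacement. sketch_format_utc() is a hypothetical helper, not a function from the tree.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical helper: format t as UTC into buf.
 * Returns 0 on success, -1 on failure. */
int sketch_format_utc(time_t t, char *buf, size_t size) {
        struct tm tm;

        assert(buf);

        /* gmtime_r() fills our own struct instead of a shared static buffer. */
        if (!gmtime_r(&t, &tm))
                return -1;

        if (strftime(buf, size, "%Y-%m-%d %H:%M:%S", &tm) == 0)
                return -1;

        return 0;
}
```

localtime_r() is used the same way when local time is wanted.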
Using an assertion is fine, since calls to job_merge_into_installed()
are protected by a check for job_type_is_conflicting().
Uncovered by Coverity, fixes CID 996307.
We add n_installed_jobs and n_failed_jobs to our inner state after
deserialization. This is fine during daemon-reexec when we start with clear
Manager (and some jobs possibly queued before deserialization), however,
daemon-reload works with the same manager and adding the values would
effectively double the counters. Reset the counters before we deserialize and
add their values again.
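The double counting and its fix can be sketched with a toy manager. The struct and function names here are hypothetical and only model the two counters.

```c
#include <assert.h>

typedef struct {
        unsigned n_installed_jobs;
        unsigned n_failed_jobs;
} SketchManager;

/* Deserialization adds the serialized counters to whatever is there. */
void sketch_deserialize(SketchManager *m, unsigned installed, unsigned failed) {
        assert(m);

        m->n_installed_jobs += installed;
        m->n_failed_jobs += failed;
}

/* Reload on the same manager object: serialize, reset, deserialize. */
void sketch_reload(SketchManager *m) {
        unsigned installed, failed;

        assert(m);

        installed = m->n_installed_jobs; /* "serialize" */
        failed = m->n_failed_jobs;

        m->n_installed_jobs = 0;         /* reset, so the add does not double */
        m->n_failed_jobs = 0;

        sketch_deserialize(m, installed, failed);
}
```

Without the reset, each reload would leave the counters at twice their previous values.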
It does not make sense for udev to even open DRBD block devices
(/dev/drbdX). It is on one hand not necessary as DRBD is controlled by
something else in the stack (e.g., pacemaker), and it even can get
cumbersome in various scenarios (e.g., DRBD9 auto-promote).
Closes: #9371
Signed-off-by: Roland Kammerer <roland.kammerer@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
This compensates for the unsynchronized reload cycles of systemd and
udev: we manually trigger the deps listed in SYSTEMD_WANTS properties if
they change for device units that are already up. That way all deps
defined that way will be triggered at least once: the first time the
unit goes up by the usual dependency logic, and if it already is up by
the device.c specific logic.
Fixes: #9323
Before this patch:
# systemctl --runtime mask abuild.mount does-not-exist.mount does-also-not-exist.mount
Unit abuild.mount does not exist, proceeding anyway.
Unit abuild.mount does not exist, proceeding anyway.
Unit abuild.mount does not exist, proceeding anyway.
Created symlink /run/systemd/system/abuild.mount → /dev/null.
Created symlink /run/systemd/system/does-not-exist.mount → /dev/null.
Created symlink /run/systemd/system/does-also-not-exist.mount → /dev/null.
After this patch:
# systemctl --runtime mask abuild.mount does-not-exist.mount does-also-not-exist.mount
Unit abuild.mount does not exist, proceeding anyway.
Unit does-not-exist.mount does not exist, proceeding anyway.
Unit does-also-not-exist.mount does not exist, proceeding anyway.
Created symlink /run/systemd/system/abuild.mount → /dev/null.
Created symlink /run/systemd/system/does-not-exist.mount → /dev/null.
Created symlink /run/systemd/system/does-also-not-exist.mount → /dev/null.
This commit adds the stop alias to the output of `machinectl --help`.
In the past we only mentioned this in the man page. It's nice to mention
this in the output of `machinectl --help` as well.
Currently, setting the MTU is embedded into the link_up message, which makes it
impossible to set the MTU if the link is already up. However, the MTU can be set
while the link is up.
Closes #9254
The prefix for EMC Symmetrix pre-SPC VPD inquiry reply
is always SCSI_ID_NAA, so we need to hardcode it to
avoid false values here.
Signed-off-by: Hannes Reinecke <hare@suse.com>
The method already uses a boolean argument to determine whether it is in
whitelist mode or not. The code that will parse the string of filters
does not expect the ~, since it already has the boolean argument. Thus,
it will fail to parse the list of filters.
The functions protect_{home,system}_from_string() are not used
except for defining protect_{home,system}_or_bool_from_string().
This makes protect_{home,system}_from_string() support boolean
strings, and drops protect_{home,system}_or_bool_from_string().
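A sketch of a from_string function that accepts boolean strings directly, shown for the protect_home case. The names are hypothetical, and the boolean parser is simplified (case-sensitive, fixed spellings) compared to what the real parser may accept.

```c
#include <assert.h>
#include <string.h>

typedef enum {
        SKETCH_PROTECT_HOME_NO,
        SKETCH_PROTECT_HOME_YES,
        SKETCH_PROTECT_HOME_READ_ONLY,
        SKETCH_PROTECT_HOME_TMPFS,
        SKETCH_PROTECT_HOME_INVALID = -1
} SketchProtectHome;

/* Simplified boolean parser: 1 for true, 0 for false, -1 otherwise. */
int sketch_parse_boolean(const char *s) {
        static const char *const yes[] = { "1", "yes", "y", "true", "t", "on" };
        static const char *const no[] = { "0", "no", "n", "false", "f", "off" };
        size_t i;

        assert(s);

        for (i = 0; i < sizeof(yes) / sizeof(yes[0]); i++)
                if (strcmp(s, yes[i]) == 0)
                        return 1;
        for (i = 0; i < sizeof(no) / sizeof(no[0]); i++)
                if (strcmp(s, no[i]) == 0)
                        return 0;

        return -1;
}

/* Booleans are tried first, then the remaining enum names. */
SketchProtectHome sketch_protect_home_from_string(const char *s) {
        int b;

        assert(s);

        b = sketch_parse_boolean(s);
        if (b >= 0)
                return b ? SKETCH_PROTECT_HOME_YES : SKETCH_PROTECT_HOME_NO;

        if (strcmp(s, "read-only") == 0)
                return SKETCH_PROTECT_HOME_READ_ONLY;
        if (strcmp(s, "tmpfs") == 0)
                return SKETCH_PROTECT_HOME_TMPFS;

        return SKETCH_PROTECT_HOME_INVALID;
}
```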
During the transition from system functions using errno to our own read and write functions with negative return codes, some errors were introduced. This patch correctly converts errno to negative return codes for read and write, and fixes checks that still used errno instead of the return code.
Closes #9283
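The convention in question, sketched with a hypothetical wrapper around write(2). sketch_write() is not a function from the tree.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical wrapper: returns the number of bytes written, or a
 * negative errno-style code on failure. */
ssize_t sketch_write(int fd, const void *buf, size_t n) {
        ssize_t k;

        assert(buf);

        k = write(fd, buf, n);
        if (k < 0)
                return -errno; /* convert once, at the boundary */

        return k;
}
```

Callers of such a wrapper then test the return value (for example r == -EAGAIN) and never look at errno.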
Currently we employ mostly system call blacklisting for our system
services. Let's add a new system call filter group @system-service that
helps turning this around into a whitelist by default.
The new group is very similar to nspawn's default filter list, but in
some ways more restricted (as sethostname() and suchlike shouldn't be
available to most system services just like that) and in others more
relaxed (for example @keyring is blocked in nspawn since it's not
properly virtualized yet in the kernel, but is fine for regular system
services).
$ git grep -e 'This program is free software' -l |grep -v LICENSE | \
xargs perl -i -0pe 's/ \* This program.*?for more details.\s*\*\n( \* You should have.*licenses.>.\n)?//gms'
For some reason they were missed previously. All those files seem to
have proper SPDX tags.
1) mv /var/tmp /var/tmp.old
2) mkdir /tmp/varrr
3) ln -s /tmp/varrr /var/tmp
Now, when a service has PrivateTmp=yes, during namespace setup,
/tmp is first mounted over with a new mount. Then, when /var/tmp
is being resolved, it points to /tmp/varrr, which by then doesn't
exist, because it had already been obscured.
These lines are generally out-of-date, incomplete and unnecessary. With
SPDX and git repository much more accurate and fine grained information
about licensing and authorship is available, hence let's drop the
per-file copyright notice. Of course, removing copyright lines of others
is problematic, hence this commit only removes my own lines and leaves
all others untouched. It might be nicer if sooner or later those could
go away too, making git the only and accurate source of authorship
information.
This part of the copyright blurb stems from the GPL use recommendations:
https://www.gnu.org/licenses/gpl-howto.en.html
The concept appears to originate in times where version control was per
file, instead of per tree, and was a way to glue the files together.
Ultimately, we nowadays don't live in that world anymore, and this
information is entirely useless anyway, as people are very welcome to
copy these files into any projects they like, and they shouldn't have to
change bits that are part of our copyright header for that.
Hence, let's just get rid of this old cruft, and shorten our codebase a
bit.
On overlayfs, FTW_MOUNT causes nftw to not list *any* files because the
condition used by glibc to verify that it's on the same mountpoint doesn't work
on overlayfs, see https://bugzilla.suse.com/show_bug.cgi?id=1096807 for the
details.
However using FTW_MOUNT doesn't seem to be really needed when walking through
the keymap directory tree. So until glibc or the kernel is fixed (which
might take some time), let's make localectl work with overlayfs.
There's a small side effect here, by which regular (non-directory) files with
bind mounts will be parsed while they were skipped by the previous logic.
To make debugging easier, this patch allows one to change the log target and
do reload/reexec without modifying configuration permanently.
Indeed if one changed the log target at runtime (via the bus or via signals),
the change was lost on the next reload/reexecution.
In order to restore the default value (set via system.conf, environment
variables or any other means), the empty string in the "LogTarget" property is
now supported as well as sending the SIGRTMIN+26 signal.
To make debugging easier, this patch allows one to change the log level and
do reload/reexec without modifying configuration permanently.
Indeed if one changed the log max level at runtime (via the bus or via
signals), the change was lost on the next daemon reload/reexecution.
In order to restore the original value (set via system.conf, environment
variables or any other means), the empty string in the "LogLevel" property is
now supported as well as sending the SIGRTMIN+23 signal.
Let's add "const" where we don't change structures passed.
Also, we generally use "unsigned char" for IP prefix length values, do
so here too. Previously different parts of the sd-radv.h API used
different types for this.
sd_radv_stop() is called from two places. If a previous sd_radv_stop() call
already succeeded, don't try to stop and close it again.
```
systemd-networkd[604]: RADV: Stopping IPv6 Router Advertisement daemon
systemd-networkd[604]: RADV: Unable to send last Router Advertisement with router lifetime set to zero: Bad file descriptor <==================HERE
systemd-networkd[604]: RADV: Updated prefix 2a0a:*:*:fc::/64 preferred 1h valid 2h
systemd-networkd[604]: RADV: Started IPv6 Router Advertisement daemon
```
Closes one of the issues in #8960
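One way to make a stop function safe to call twice is an early return on the stopped state. The sketch below uses a toy struct with hypothetical names and only models the idea, it is not the sd-radv code.

```c
#include <assert.h>

typedef enum {
        SKETCH_RADV_IDLE,
        SKETCH_RADV_ADVERTISING
} SketchRadvState;

typedef struct {
        SketchRadvState state;
        int fd;
        unsigned n_final_adverts; /* counts "last advertisement" attempts */
} SketchRadv;

int sketch_radv_stop(SketchRadv *ra) {
        assert(ra);

        if (ra->state == SKETCH_RADV_IDLE)
                return 0;          /* already stopped: nothing to send or close */

        ra->n_final_adverts++;     /* stands in for the zero-lifetime advertisement */
        ra->fd = -1;               /* stands in for closing the socket */
        ra->state = SKETCH_RADV_IDLE;

        return 0;
}
```

The second call returns before touching the already closed descriptor, which is what avoids the "Bad file descriptor" message in the log above.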