The test is failing in koji, and the line from printf() does not end up
in the logs for some reason. log_info() works fine, so let's just use
that here too.
Read-only /var/tmp is more likely, because it's backed by a real device. /tmp
is (by default) backed by tmpfs, but it doesn't have to be. In both cases the
same consideration applies.
If we boot with read-only /var/tmp, any unit with PrivateTmp=yes would fail
because we cannot create the subdir under /var/tmp to mount the private directory.
But many services actually don't require /var/tmp (either because they only use
it occasionally, or because they only use /tmp, or even because they don't use the
temporary directories at all, and PrivateTmp=yes is used to isolate them from
the rest of the system).
To handle both cases let's create a read-only directory under /run/systemd and
mount it as the private /tmp or /var/tmp. (Read-only to not fool the service into
dumping too much data in /run.)
$ sudo systemd-run -t -p PrivateTmp=yes bash
Running as unit: run-u14.service
Press ^] three times within 1s to disconnect TTY.
[root@workstation /]# ls -l /tmp/
total 0
[root@workstation /]# ls -l /var/tmp/
total 0
[root@workstation /]# touch /tmp/f
[root@workstation /]# touch /var/tmp/f
touch: cannot touch '/var/tmp/f': Read-only file system
This commit has more changes than I like to put in one commit, but it's touching all
the same paths so it's hard to split.
exec_runtime_make() was using the wrong cleanup function, so the directory would be
left behind on error.
We never return anything higher than 63, so using "long unsigned"
as the type only confused the reader. (We can still use "long unsigned"
and safe_atolu() to parse the kernel file.)
We would refuse to print capabilities which were didn't have a name
for. The kernel adds new capabilities from time to time, most recently
cap_bpf. 'systmectl show -p CapabilityBoundingSet ...' would fail with
"Failed to parse bus message: Invalid argument" because
capability_set_to_string_alloc() would fail with -EINVAL. So let's
print such capabilities in hexadecimal:
CapabilityBoundingSet=cap_chown cap_dac_override cap_dac_read_search
cap_fowner cap_fsetid cap_kill cap_setgid cap_setuid cap_setpcap
cap_linux_immutable cap_net_bind_service cap_net_broadcast cap_net_admin
cap_net_raw cap_ipc_lock cap_ipc_owner 0x10 0x11 0x12 0x13 0x14 0x15 0x16
0x17 0x18 0x19 0x1a ...
For symmetry, also allow capabilities that we don't know to be specified.
Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1853736.
The desktop file specification allows entries like ";;;;;;", full of empty strings.
But looking at the actual list of supported keys [1], empty entries are meaningless
(unless we would allow e.g. the desktop name to be the empty string. But that doesn't
seem very useful either). So let's just simplify our life and skip any empty substrings
entirely.
This would also resolve the fuzzer case:
$ valgrind build/fuzz-xdg-desktop test/fuzz/fuzz-xdg-desktop/oss-fuzz-22812
test/fuzz/fuzz-xdg-desktop/oss-fuzz-22812... ok
==2899241== HEAP SUMMARY:
==2899241== in use at exit: 0 bytes in 0 blocks
==2899241== total heap usage: 484,385 allocs, 484,385 frees, 12,411,330 bytes allocated
↓
==2899650== HEAP SUMMARY:
==2899650== in use at exit: 0 bytes in 0 blocks
==2899650== total heap usage: 1,325 allocs, 1,325 frees, 1,463,602 bytes allocated
This partially undoes the parent commit. We follow the symlink and
if it appears to be a symlink to /dev/null, even if /dev/null is not
present, we treat it as such. The addition of creation of /dev/null
in the test is reverted.
There's some inconsistency in the what is considered a masked unit:
some places (i.e. load-fragment.c) use `null_or_empty()` while others
check if the file path is symlinked to "/dev/null". Since the latter
doesn't account for things like non-absolute symlinks to "/dev/null",
this commit switches the check for "/dev/null" to use `null_or_empty_path()`
Since bc8d57f290 mac_selinux_init() is checked and considered fatal
tree-wide.
Coverity complains about it not being checked in the test code.
Follow-up of: #16223
Fixes: CID 1429975
https://tools.ietf.org/html/draft-knodel-terminology-02https://lwn.net/Articles/823224/
This gets rid of most but not occasions of these loaded terms:
1. scsi_id and friends are something that is supposed to be removed from
our tree (see #7594)
2. The test suite defines an API used by the ubuntu CI. We can remove
this too later, but this needs to be done in sync with the ubuntu CI.
3. In some cases the terms are part of APIs we call or where we expose
concepts the kernel names the way it names them. (In particular all
remaining uses of the word "slave" in our codebase are like this,
it's used by the POSIX PTY layer, by the network subsystem, the mount
API and the block device subsystem). Getting rid of the term in these
contexts would mean doing some major fixes of the kernel ABI first.
Regarding the replacements: when whitelist/blacklist is used as noun we
replace with with allow list/deny list, and when used as verb with
allow-list/deny-list.
Presently, CLI utilities such as systemctl will check whether they have a tty
attached or not to decide whether to parse /proc/cmdline or EFI variable
SystemdOptions looking for systemd.log_* entries.
But this check will be misleading if these tools are being launched by a
daemon, such as a monitoring daemon or automation service that runs in
background.
Make log handling of CLI tools uniform by never checking /proc/cmdline or EFI
variables to determine the logging level.
Furthermore, introduce a new log_setup_cli() shortcut to set up common options
used by most command-line utilities.
This combines set_ensure_allocated() with set_consume(). The cool thing is that
because we know the hash ops, we can correctly free the item if appropriate.
Similarly to set_consume(), the goal is to simplify handling of the case where
the item needs to be freed on error and if already present in the set.
It's such a common operation to allocate the set and put an item in it,
that it deserves a helper. set_ensure_put() has the same return values
as set_put().
Comes with tests!
Let's make sure $XDG_RUNTIME_DIR for the user instance and /run for the
system instance is always organized the same way: the "inaccessible"
device nodes should be placed in a subdir of either called "systemd" and
a subdir of that called "inaccessible".
This way we can emphasize the common behaviour, and only differ where
really necessary.
Follow-up for #13823
dm-verity support in dissect-image at the moment is restricted to GPT
volumes.
If the image a single-filesystem type without a partition table (eg: squashfs)
and a roothash/verity file are passed, set the verity flag and mark as
read-only.
This is a follow-up for 9f83091e3c.
Instead of reading the mtime off the configuration files after reading,
let's do so before reading, but with the fd we read the data from. This
is not only cleaner (as it allows us to save one stat()), but also has
the benefit that we'll detect changes that happen while we read the
files.
This also reworks unit file drop-ins to use the common code for
determining drop-in mtime, instead of reading system clock for that.
We would parse numbers with base prefixes as user identifiers. For example,
"0x2b3bfa0" would be interpreted as UID==45334432 and "01750" would be
interpreted as UID==1000. This parsing was used also in cases where either a
user/group name or number may be specified. This means that names like
0x2b3bfa0 would be ambiguous: they are a valid user name according to our
documented relaxed rules, but they would also be parsed as numeric uids.
This behaviour is definitely not expected by users, since tools generally only
accept decimal numbers (e.g. id, getent passwd), while other tools only accept
user names and thus will interpret such strings as user names without even
attempting to convert them to numbers (su, ssh). So let's follow suit and only
accept numbers in decimal notation. Effectively this means that we will reject
such strings as a username/uid/groupname/gid where strict mode is used, and try
to look up a user/group with such a name in relaxed mode.
Since the function changed is fairly low-level and fairly widely used, this
affects multiple tools: loginctl show-user/enable-linger/disable-linger foo',
the third argument in sysusers.d, fourth and fifth arguments in tmpfiles.d,
etc.
Fixes#15985.
Prompted by the discussions in #15180.
This is a bit more complex than I hoped, since for PID 1 we need to pass
in the synethetic environment block in we generate on demand.
It's easy to add, and should be pretty useful, in particular as in
AssertPathIsEncrypted= as it can be used for checking that
some path is encrypted before some service is invoked that might want to
place secure material there.
kernel 5.6 added support for a new flag for getrandom(): GRND_INSECURE.
If we set it we can get some random data out of the kernel random pool,
even if it is not yet initializated. This is great for us to initialize
hash table seeds and such, where it is OK if they are crap initially. We
used RDRAND for these cases so far, but RDRAND is only available on
newer CPUs and some archs. Let's now use GRND_INSECURE for these cases
as well, which means we won't needlessly delay boot anymore even on
archs/CPUs that do not have RDRAND.
Of course we never set this flag when generating crypto keys or uuids.
Which makes it different from RDRAND for us (and is the reason I think
we should keep explicit RDRAND support in): RDRAND we don't trust enough
for crypto keys. But we do trust it enough for UUIDs.
As described in #15603, it is a fairly common setup to use a fqdn as the
configured hostname. But it is often convenient to use just the actual
hostname, i.e. until the first dot. This adds support in tmpfiles, sysusers,
and unit files for %l which expands to that.
Fixes#15603.
Ideally, assert_cc() would be used for this, so that it is not possible to even
compile systemd with something like '-Dfallback-hostname=.foo'. But to do a
proper check we need to call hostname_is_valid(), and we cannot depend on being
able to run code (e.g. during cross-compilation). So let's do a very superficial
check in meson, and a proper on in test-util.
The sets are such basic functionality that it is convenient to be able to
build test-set without all the machinery in shared, and to test it without
the mempool to validate memory accesses easier.
For units which are aliases of other units, reporting preset status as
"enabled" is rather misleading. For example, dbus.service is an alias of
dbus-broker.service. In list-unit-files we'd show both as "enabled". In
particular, systemctl preset ignores aliases, so showing any preset status at
all is always going to be misleading. Let's introduce a new state "alias" and
use that for all aliases.
I was trying to avoid adding a new state, to keep compatibility with previous
behaviour, but for alias unit files it simply doesn't seem very useful to show
any of the existing states. It seems that the clearly showing that those are
aliases for other units will be easiest to understand for users.
This changes the calendarspec parser to allow expressions such as
"00:05..05", i.e. a range where start and end is the same. It also
allows expressions such as "00:1-2/3", i.e. where the repetition value
does not fit even once in the specified range. With this patch both
cases will now be optimized away, i.e. the range is removed and a fixed
value is used, which is functionally equivalent.
See #15030 for an issue where the inability to parse such expressions
caused confusion.
I think it's probably better to accept these gracefully and optimizing
them away instead of refusing them with a plain EINVAL. With a tool such
as "systemd-analyze" calendar it should be easy to figure out the
normalized form with the redundant bits optimized away.
As documented in systemd.path(5):
When a service unit triggered by a path unit terminates (regardless
whether it exited successfully or failed), monitored paths are
checked immediately again, and the service accordingly restarted
instantly.
This commit implements this behaviour for PathExists=, PathExistsGlob=,
and DirectoryNotEmpty=. These predicates are essentially
"level-triggered": the service should be activated whenever the
predicate is true. PathChanged= and PathModified=, on the other hand,
are "edge-triggered": the service should only be activated when the
predicate *becomes* true.
The behaviour has been broken since at least as far back as commit
8fca6944c2 ("path: stop watching path specs once we triggered the target
unit"). This commit had systemd stop monitoring inotify whenever the
triggered unit was activated. Unfortunately this meant it never updated
the ->inotify_triggered flag, so it never rechecked the path specs when
the triggered unit deactivated.
With this commit, systemd rechecks all paths specs whenever the
triggered unit deactivates. If any PathExists=, PathExistsGlob= or
DirectoryNotEmpty= predicate passes, the triggered unit is reactivated.
If the target unit is activated by something outside of the path unit,
the path unit immediately transitions to a running state. This ensures
the path unit stops monitoring inotify in this situation.
With this change in place, commit d7cf8c24d4 ("core/path: fix spurious
triggering of PathExists= on restart/reload") is no longer necessary.
The path unit (and its triggered unit) is now always active whenever
the PathExists= predicate passes, so there is no spurious restart when
systemd is reloaded or restarted.
Callers of cg_get_keyed_attribute_full() can now specify via the flag whether the
missing keyes in cgroup attribute file are OK or not. Also the wrappers for both
strict and graceful version are provided.
Don't assume that 4MB can be allocated from stack since there could be smaller
DefaultLimitSTACK= in force, so let's use malloc(). NUL terminate the huge
strings by hand, also ensure termination in test_lz4_decompress_partial() and
optimize the memset() for the string.
Some items in /proc and /etc may not be accessible to poor unprivileged users
due to e.g. SELinux, BOFH or both, so check for EACCES and EPERM.
/var/tmp may be a symlink to /tmp and then path_compare() will always fail, so
let's stick to /tmp like elsewhere.
/tmp may be mounted with noexec option and then trying to execute scripts from
there would fail.
Detect and warn if seccomp is already in use, which could make seccomp test
fail if the syscalls are already blocked.
Unset $TMPDIR so it will not break specifier tests where %T is assumed to be
/tmp and %V /var/tmp.
It's not always mounted, e.g. during the build-time tests, it's running inside
a chroot (that's how Debian/Ubuntu build packages, in chroots) so this test
always fails because /sys/fs/cgroup isn't mounted.
We provide a way via the '-' symbol to ignore errors when nonexistent
executable files are passed to Exec* parameters & so on. In such a case,
the flag `EXEC_COMMAND_IGNORE_FAILURE` is set and we go on happily with
our life if that happens. However, `systemd-analyze verify` complained
about missing executables even in such a case. In such a case it is not
an error for this to happen so check if the flag is set before checking
if the file is accessible and executable.
Add some small tests to check this condition.
Closes#15218.
This reworks the user validation infrastructure. There are now two
modes. In regular mode we are strict and test against a strict set of
valid chars. And in "relaxed" mode we just filter out some really
obvious, dangerous stuff. i.e. strict is whitelisting what is OK, but
"relaxed" is blacklisting what is really not OK.
The idea is that we use strict mode whenver we allocate a new user
(i.e. in sysusers.d or homed), while "relaxed" mode is when we process
users registered elsewhere, (i.e. userdb, logind, …)
The requirements on user name validity vary wildly. SSSD thinks its fine
to embedd "@" for example, while the suggested NAME_REGEX field on
Debian does not even allow uppercase chars…
This effectively liberaralizes a lot what we expect from usernames.
The code that warns about questionnable user names is now optional and
only used at places such as unit file parsing, so that it doesn't show
up on every userdb query, but only when processing configuration files
that know better.
Fixes: #15149#15090
split() and FOREACH_WORD really should die, and everything be moved to
extract_first_word() and friends, but let's at least make sure that for
the remaining code using it we can't deadlock by not progressing in the
word iteration.
Fixes: #15305
The kernel does not sanitize /proc/cmdline. E.g. when running under qemu, it is
easy to pass a string with newline by mistake. We use read_one_line_file(), so
we would read only the first list of the file, and
write_string_file(WRITE_STRING_FILE_VERIFY_ON_FAILURE) would fail because the
target file is obviously different. Change to a kernel-generated file to avoid
the issue.
v2:
- use /proc/version instead of /proc/uptime for attempted writes, so the test
test passes even if test_write_string_file_verify() takes more than 10 ms ;]
I think the two names were both pretty bad. They did not give a proper hint
what the difference between the two functions is, and sd_path_home sounds like
it is somehow related to /home or home directories or whatever, when in fact
both functions return the same set of paths as either a colon-delimited string
or a strv. "_strv" suffix is used by various functions in sd-bus, so let's
reuse that.
Those functions are not public yet, so let's rename.
In 1a29610f5f the change inadvertedly
disabled names with digit as the first character. This follow-up change
allows a digit as the first character in compat mode.
Fixes: #15141
It returns 32 bits, unsigned on amd64, so it's probably similar everywhere
with glibc. But let's make the code generic, without assuming specific size
or signedness.
The man pages state that the '+' prefix in Exec* directives should
ignore filesystem namespacing options such as PrivateTmp. Now it does.
This is very similar to #8842, just with PrivateTmp instead of
PrivateDevices.
Fix the following issues in test-cgroup:
1, commit 65be7e0652 ("pid1: do not reset subtree_control on
already-existing units with delegation") changed the return value
of cg_create () as follows:
"Returns 0 if the group already existed, 1 on success, negative
otherwise."
So we need to modify the test cases related to cg_create ().
2, commit efdb02375b ("core: unified cgroup hierarchy support")
changed cg_delete () to cg_rmdir (), so the test cases also need
to be adjusted a bit.
3. There is no cleanup of "test-a". If we execute test-cgroup
multiple times, we will encounter an error.
Signed-off-by: Wen Yang <wenyang@linux.alibaba.com>
For #8495: it is arguably useful to not show the length of the password
in public spaces. It is possible to press TAB or BS to cancel the asterisks,
but this is not very discoverable. Let's make it discoverable by showing
a message (in gray). The message is "erased" after the first character
is entered.
test-ask-password-api would crash if ^D was pressed.
If think the callers generally expect a non-empty strv as reply. Let's
return an error if we have nothing to return.
Also modernize test-ask-password-api a bit.
We use those strings as hash keys. While writing "a...b" looks strange,
"a///b" does not look so strange. Both syntaxes would actually result in the
value being correctly written to the file, but they would confuse our
de-deplication over keys. So let's normalize. Output also becomes nicer.
Add test.
This way we can use libxcrypt specific functionality such as
crypt_gensalt() and thus take benefit of the newer algorithms libxcrypt
implements. (Also adds support for a new env var $SYSTEMD_CRYPT_PREFIX
which may be used to select the hash algorithm to use for libxcrypt.)
Also, let's move the weird crypt.h inclusion into libcrypt.h so that
there's a single place for it.
We'd just print nothing and exit with 0. If the user gave an explicit
name, we should fail. If a pattern didn't match, we should at least warn.
$ networkctl status enx54ee75cb1dc0a* --no-pager && echo $?
No interfaces matched.
0
$ networkctl status enx54ee75cb1dc0a --no-pager
Interface "enx54ee75cb1dc0a" not found.
1
In subsequent commits, calls to if_nametoindex() will be replaced by a wrapper
that falls back to alternative name resolution over netlink. netlink support
requires libsystemd (for sd-netlink), and we don't want to add any functions
that require netlink in basic/. So stuff that calls if_nametoindex() for user
supplied interface names, and everything that depends on that, needs to be
moved.
If we have a template unit template@.service, it should be allowed to specify a
dependency on a unit without an instance, bar@.service. When the unit is created,
the instance will be propagated into the target, so template@inst.service will
depend on bar@inst.service.
This commit changes unit_dependency_name_compatible(), which makes the manager
accept links like that, and unit_file_verify_alias(), so that the installation
function will agree to create a symlink like that, and finally the tests are
adjusted to pass.
This mostly reuses existing checkers used by pid1, so handling of aliases
should be consistent. Hopefully, with the test it'll be clearer what it
happening.
Support for .wants/.requires "aliases" is restored. Those are still used in the
wild quite a bit, so we need to support them.
See https://github.com/systemd/systemd/pull/13119 for a discussion of aliases
with an instance that point to a different template: this is allowed.
Let's add the actual unicode names of the glyphs we use. Let's also add
in comments what the width expectations of these glyphs are on the
console.
Also, remove the "mdash" definition. First of all it wasn't used, but
what's worse the glyph encoded was actually an "ndash"...
Fixes: #14075
This has been requested many times before. Let's add it finally.
GPT auto-discovery for /var is a bit more complex than for other
partition types: the other partitions can to some degree be shared
between multiple OS installations on the same disk (think: swap, /home,
/srv). However, /var is inherently something bound to an installation,
i.e. specific to its identity, or actually *is* its identity, and hence
something that cannot be shared.
To deal with this this new code is particularly careful when it comes to
/var: it will not mount things blindly, but insist that the UUID of the
partition matches a hashed version of the machine-id of the
installation, so that each installation has a very specific /var
associated with it, and would never use any other. (We actually use
HMAC-SHA256 on the GPT partition type for /var, keyed by the machine-id,
since machine-id is something we want to keep somewhat private).
Setting the right UUID for installations takes extra care. To make
things a bit simpler to set up, we avoid this safety check for nspawn
and RootImage= in unit files, under the assumption that such container
and service images unlikely will have multiple installations on them.
The check is hence only required when booting full machines, i.e. in
in systemd-gpt-auto-generator.
To help with putting together images for full machines, PR #14368
introduces a repartition tool that can automatically fill in correctly
calculated UUIDs on first boot if images have the var partition UUID
initialized to all zeroes. With that in place systems can be put
together in a way that on first boot the machine ID is determined and
the partition table automatically adjusted to have the /var partition
with the right UUID.
To support ProtectHome=y in a user namespace (which mounts the inaccessible
nodes), the nodes need to be accessible by the user. Create these paths and
devices in the user runtime directory so they can be used later if needed.
Like with shmat already the actual results of the test
test_memory_deny_write_execute_mmap depend on kernel/libseccomp/glibc
of the platform it is running on.
There are known-good platforms, but on the others do not assert success
(which implies test has actually failed as no seccomp blocking was achieved),
but instead make the check dependent to the success of the mmap call
on that platforms.
Finally the assert of the munmap on that valid pointer should return ==0,
so that is what the check should be for in case of p != MAP_FAILED.
Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Let's add a concept of normalization: as preparation for signing json
records let's add a mechanism to bring JSON records into a well-defined
order so that we can safely validate JSON records.
This adds two booleans to each JsonVariant object: "sorted" and
"normalized". The latter indicates whether a variant is fully sorted
(i.e. all keys of objects listed in alphabetical order) recursively down
the tree. The former is a weaker property: it only checks whether the
keys of the object itself are sorted. All variants which are
"normalized" are also "sorted", but not vice versa.
The knowledge of the "sorted" property is then used to optimize
searching for keys in the variant by using bisection.
Both properties are determined at the moment the variants are allocated.
Since our objects are immutable this is safe.
This will call json_variant_sensitive() internally while parsing for
each allocated sub-variant. This is better than calling it a posteriori
at the end, because partially parsed variants will always be properly
erased from memory this way.
Change test_set_ambient_caps() to test_apply_ambient_caps(), since the
function capability_ambient_set_apply() not only sets ambient
capabilities, but clears inherited capabilities that are not explicitly
requested by the caller.
The conf-parser machinery already removed whitespace before and after "=", no
need to repeat this step.
The test is adjusted to pass. It was testing an code path that doesn't happen
normally, no point in doing that.
EOF is defined to -1, hence on platforms that have "char" unsigned we
can't compare it as-is, except if we accept an implicit cast. let's make
it an explicit cast, acknowledging the issue.
Fixes: #14118
If we ignore any uknown section, we will not be able to show any
warning if a typo in a section name is made. Let's reverse our
approach, and explicitly list sections to ignore instead.
I opted to make use the same section list for this, instead of adding a second
list, because this list is passed through to many functions and adding yet
another parameter to the long signature would be very noisy.
TasksMax= and DefaultTasksMax= can be specified as percentages. We don't
actually document of what the percentage is relative to, but the implementation
uses the smallest of /proc/sys/kernel/pid_max, /proc/sys/kernel/threads-max,
and /sys/fs/cgroup/pids.max (when present). When the value is a percentage,
we immediately convert it to an absolute value. If the limit later changes
(which can happen e.g. when systemd-sysctl runs), the absolute value becomes
outdated.
So let's store either the percentage or absolute value, whatever was specified,
and only convert to an absolute value when the value is used. For example, when
starting a unit, the absolute value will be calculated when the cgroup for
the unit is created.
Fixes#13419.
This partially reverts db11487d10 (the logic to
calculate the correct value is removed, we always use the same setting as for
the system manager). Distributions have an easy mechanism to override this if
they wish.
I think making this configurable is better, because different distros clearly
want different defaults here, and making this configurable is nice and clean.
If we don't make it configurable, distros which either have to carry patches,
or what would be worse, rely on some other configuration mechanism, like
/etc/profile. Those other solutions do not apply everywhere (they usually
require the shell to be used at some point), so it is better if we provide
a nice way to override the default.
Fixes #13469.
This makes it easier to see what unit_name_is_valid() returns at a glance.
The output is not whitespace clean, but I think it's good enough for a test.
With meson-0.52.0-1.module_f31+6771+f5d842eb.noarch I get:
src/test/meson.build:19: WARNING: Overriding previous value of environment variable 'PATH' with a new one
When we're using *prepend*, the whole point is to modify an existing variable,
so meson shouldn't warn. But let's set avoid the warning and shorten things by
setting the final value immediately.
The code in cgroup.c has support for all hierarchies, but the test,
as written, will only work on unified. Since the test is really about
bpf code, and not the legacy devices controller, let's just skip
the test.
It turns out that the kernel verifier would reject a program we would build
if there was a whitelist, but no entries in the whitelist matched.
The program would approximately like this:
0: (61) r2 = *(u32 *)(r1 +0)
1: (54) w2 &= 65535
2: (61) r3 = *(u32 *)(r1 +0)
3: (74) w3 >>= 16
4: (61) r4 = *(u32 *)(r1 +4)
5: (61) r5 = *(u32 *)(r1 +8)
48: (b7) r0 = 0
49: (05) goto pc+1
50: (b7) r0 = 1
51: (95) exit
and insn 50 is unreachable, which is illegal. We would then either keep a
previous version of the program or allow everything. Make sure we build a
valid program that simply rejects everything.
64MB is not that much, but let's not be greedy, esp. because we may run
many things in parallel.
Also, rlim_cur should never be higher than rlim_max, so let's simplify our
code.
Discussed in #13743, the -.service semantic conflicts with the
existing root mount and slice names, making this feature not
uniformly extensible to all types. Change the name to be
<type>.d instead.
Updating to this format also extends the top-level dropin to
unit types.
When showing logs from a container, we would fail to show various lines:
Oct 29 09:50:51 krowka systemd-nspawn[61376]: Detected architecture x86-64.
Oct 29 09:50:51 krowka systemd-nspawn[61376]: [1B blob data]
Oct 29 09:50:51 krowka systemd-nspawn[61376]: Welcome to Fedora 32 (Rawhide)!
Oct 29 09:50:51 krowka systemd-nspawn[61376]: [1B blob data]
Those are only harmless \r characters that trail the line. We already replace
tabs and strip various ansi characters that we deem inconsequential, so let's
also strip trailing carriage returns. Non-trailing ones are different, because
they change what would be displayed.
chase_symlinks() would return negative on error, and either a non-negative status
or a non-negative fd when CHASE_OPEN was given. This made the interface quite
complicated, because dependning on the flags used, we would get two different
"types" of return object. Coverity was always confused by this, and flagged
every use of chase_symlinks() without CHASE_OPEN as a resource leak (because it
would this that an fd is returned). This patch uses a saparate output parameter,
so there is no confusion.
(I think it is OK to have functions which return either an error or an fd. It's
only returning *either* an fd or a non-fd that is confusing.)
.sun_path has 108 bytes, and we'd write a string of 108 bytes + NUL.
I added this test, but I don't know what it was supposed to test. Let's
just remove.
Fixes#13713. CID#1405854.
ARPHRD_NETROM was excluded, most likely just because it is protocol No. 0,
and ARPHRD_CISCO was reported under its alias name "HDLC". Let's just
allow defined aliases under the main name.
Our biggest object in libsystemd was a table full of zeros, for the arphdr
names. Let's use a switch (which gcc nicely optimizes for us), instead a
table with a gap between 826 and 65534:
$ ls -l build{,2}/src/basic/a6ba3eb@@basic@sta/arphrd-list.c.o
-rw-rw-r--. 1 zbyszek zbyszek 540232 Sep 22 00:29 build/src/basic/a6ba3eb\@\@basic\@sta/arphrd-list.c.o
-rw-rw-r--. 1 zbyszek zbyszek 20512 Sep 25 11:56 build2/src/basic/a6ba3eb\@\@basic\@sta/arphrd-list.c.o
$ ls -l build{,2}/src/shared/libsystemd-shared-243.so
-rwxrwxr-x. 1 zbyszek zbyszek 6774368 Sep 22 00:29 build/src/shared/libsystemd-shared-243.so
-rwxrwxr-x. 1 zbyszek zbyszek 6254808 Sep 25 12:16 build2/src/shared/libsystemd-shared-243.so
No functional change.
Introduce support for configuring cpus and mems for processes using
cgroup v2 CPUSET controller. This allows users to limit which cpus
and memory NUMA nodes can be used by processes to better utilize
system resources.
The cgroup v2 interfaces to control it are cpuset.cpus and cpuset.mems
where the requested configuration is written. However, it doesn't mean
that the requested configuration will be actually used as parent cgroup
may limit the cpus or mems as well. In order to reflect the real
configuration cgroup v2 provides read-only files cpuset.cpus.effective
and cpuset.mems.effective which are exported to users as well.
In various circumstances, overriding the kernel commandline can be inconvenient.
People have different bootloaders, and e.g. the grub config can be pretty scary.
grubby helps, but it isn't always available.
This option adds an alternative mechanism that can quite convenient on EFI
systems. cmdline settings have higher priority, because they can be (usually)
changed on the bootloader prompt.
$SYSTEMD_EFI_OPTIONS can be used to override, same as $SYSTEMD_PROC_CMDLINE.