For low memory machines (256MB), 10% of RAM for /run may not be enough for
re-exec of PID1 because 16MB of free space is required and /run may already
contain something.
Fixes: oss-fuzz#22208
```
test/fuzz/fuzz-calendarspec/oss-fuzz-22208... ../src/shared/calendarspec.c:666:48: runtime error: signed integer overflow: 2147000000 + 1000000 cannot be represented in type 'int'
#0 0x7f0b9f6cc56a in prepend_component ../src/shared/calendarspec.c:666
#1 0x7f0b9f6cd03a in parse_chain ../src/shared/calendarspec.c:718
#2 0x7f0b9f6cea1c in parse_calendar_time ../src/shared/calendarspec.c:845
#3 0x7f0b9f6d1397 in calendar_spec_from_string ../src/shared/calendarspec.c:1084
#4 0x401570 in LLVMFuzzerTestOneInput ../src/fuzz/fuzz-calendarspec.c:17
#5 0x401ae0 in main ../src/fuzz/fuzz-main.c:39
#6 0x7f0b9e31b1a2 in __libc_start_main (/lib64/libc.so.6+0x271a2)
#7 0x40122d in _start (/home/fsumsal/repos/systemd/build/fuzz-calendarspec+0x40122d)
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../src/shared/calendarspec.c:666:48 in
```
Prompted by the discussions in #15180.
This is a bit more complex than I hoped, since for PID 1 we need to pass
in the synethetic environment block in we generate on demand.
It's easy to add, and should be pretty useful, in particular as in
AssertPathIsEncrypted= as it can be used for checking that
some path is encrypted before some service is invoked that might want to
place secure material there.
Limit size of various tmpfs mounts to 10% of RAM, except volatile root and /var
to 25%. Another exception is made for /dev (also /devs for PrivateDevices) and
/sys/fs/cgroup since no (or very few) regular files are expected to be used.
In addition, since directories, symbolic links, device specials and xattrs are
not counted towards the size= limit, number of inodes is also limited
correspondingly: 4MB size translates to 1k of inodes (assuming 4k each), 10% of
RAM (using 16GB of RAM as baseline) translates to 400k and 25% to 1M inodes.
Because nr_inodes option can't use ratios like size option, there's an
unfortunate side effect that with small memory systems the limit may be on the
too large side. Also, on an extremely small device with only 256MB of RAM, 10%
of RAM for /run may not be enough for re-exec of PID1 because 16MB of free
space is required.
[127/1355] Compiling C object 'src/shared/5afaae1@@systemd-shared-245@sta/ethtool-util.c.o'
../src/shared/ethtool-util.c: In function ‘ethtool_get_permanent_macaddr’:
../src/shared/ethtool-util.c:260:60: warning: array subscript 5 is outside the bounds of an interior zero-length array ‘__u8[0]’ {aka ‘unsigned char[]’} [-Wzero-length-bounds]
260 | ret->ether_addr_octet[i] = epaddr.addr.data[i];
| ~~~~~~~~~~~~~~~~^~~
In file included from ../src/shared/ethtool-util.c:5:
../src/shared/linux/ethtool.h:704:7: note: while referencing ‘data’
704 | __u8 data[0];
| ^~~~
../src/shared/ethtool-util.c: In function ‘ethtool_set_features’:
../src/shared/ethtool-util.c:488:31: warning: array subscript 0 is outside the bounds of an interior zero-length array ‘__u32[0]’ {aka ‘unsigned int[]’} [-Wzero-length-bounds]
488 | len = buffer.info.data[0];
| ~~~~~~~~~~~~~~~~^~~
In file included from ../src/shared/ethtool-util.c:5:
../src/shared/linux/ethtool.h:631:8: note: while referencing ‘data’
631 | __u32 data[0];
| ^~~~
The kernel should not define the length of the array, but it does. We can't fix
that, so let's use a cast to avoid the warning.
For https://github.com/systemd/systemd/issues/6119#issuecomment-626073743.
v2:
- use #pragma instead of a cast. It seems the cast only works in some cases, and
gcc is "smart" enough to see beyond the cast. Unfortunately clang does not support
this warning, so we need to do a config check whether to try to suppress.
As described in #15603, it is a fairly common setup to use a fqdn as the
configured hostname. But it is often convenient to use just the actual
hostname, i.e. until the first dot. This adds support in tmpfiles, sysusers,
and unit files for %l which expands to that.
Fixes#15603.
We make this entirely independent of the regular discard field, i.e. the
one that controls discard behaviour when the home directory is online.
Not all combinations make a ridiculous amount of sense, but most do.
Specifically:
online-discard = yes, offline-discard = yes
→ Discard when activating explicitly, and during runtime using
the "discard" mount option, and discard explicitly when logging
out again.
online-discard = no, offline-discard = yes
→ The new default: when logging in allocate the full backing
store, and use no discard while active. When loging out discard
everything. This provides nice behaviour: we take minimal storage
when offline but provide allocation guarantees while online.
online-discard = no, offline-discard = no
→ Never, ever discard, always operate with fully allocated
backing store. The extra safe mode.
We always need to make them unions with a "struct cmsghdr" in them, so
that things properly aligned. Otherwise we might end up at an unaligned
address and the counting goes all wrong, possibly making the kernel
refuse our buffers.
Also, let's make sure we initialize the control buffers to zero when
sending, but leave them uninitialized when reading.
Both the alignment and the initialization thing is mentioned in the
cmsg(3) man page.
If we're using a set with _put_strdup(), most of the time we want to use
string hash ops on the set, and free the strings when done. This defines
the appropriate a new string_hash_ops_free structure to automatically free
the keys when removing the set, and makes set_put_strdup() and set_put_strdupv()
instantiate the set with those hash ops.
hashmap_put_strdup() was already doing something similar.
(It is OK to instantiate the set earlier, possibly with a different hash ops
structure. set_put_strdup() will then use the existing set. It is also OK
to call set_free_free() instead of set_free() on a set with
string_hash_ops_free, the effect is the same, we're just overriding the
override of the cleanup function.)
No functional change intended.
The idea is to have a static table that defines the dbus API. The vtable is
defined right next to the interface name and path because they are logically
connected.
For units which are aliases of other units, reporting preset status as
"enabled" is rather misleading. For example, dbus.service is an alias of
dbus-broker.service. In list-unit-files we'd show both as "enabled". In
particular, systemctl preset ignores aliases, so showing any preset status at
all is always going to be misleading. Let's introduce a new state "alias" and
use that for all aliases.
I was trying to avoid adding a new state, to keep compatibility with previous
behaviour, but for alias unit files it simply doesn't seem very useful to show
any of the existing states. It seems that the clearly showing that those are
aliases for other units will be easiest to understand for users.