Possibly fixes#15220. (There might be another leak. I'm still investigating.)
The leak would occur when the path cache was rebuilt. So in normal circumstances
it wouldn't be too bad, since usually the path cache is not rebuilt too often. But
the case in #15220, where new unit files are created in a loop and started, the leak
occurs once for each unit file:
$ for i in {1..300}; do cp ~/.config/systemd/user/test0001.service ~/.config/systemd/user/test$(printf %04d $i).service; systemctl --user start test$(printf %04d $i).service;done
sd-boot uses rdtsc to set those timestamps. There is no guarantee that the tsc
has any particular absolute value.
On my VM:
$ head /sys/firmware/efi/efivars/LoaderTime*
==> /sys/firmware/efi/efivars/LoaderTimeExecUSec-4a67b082-0a4c-41cf-b6c7-440b29bb8c4f <==
4397904074
==> /sys/firmware/efi/efivars/LoaderTimeInitUSec-4a67b082-0a4c-41cf-b6c7-440b29bb8c4f <==
4396386839
==> /sys/firmware/efi/efivars/LoaderTimeMenuUSec-4a67b082-0a4c-41cf-b6c7-440b29bb8c4f <==
4396392521
$ build/test-boot-timestamps
...
LoaderTimeExecUSec=4396386839 too large, refusing.
Failed to read EFI loader data: Input/output error
Assertion 'q >= 0' failed at src/test/test-boot-timestamps.c:84, function main(). Aborting.
(with patch)
$ build/test-boot-timestamps
...
EFI Loader: start=1h 13min 16.386s exit=1h 13min 17.904s duration=1.517s
Firmware began 1h 13min 17.904074s before kernel.
Loader began 1.517235s before kernel.
Firmware began Tue 2020-05-26 11:04:13 CEST.
Loader began Tue 2020-05-26 12:17:30 CEST.
Kernel began Tue 2020-05-26 12:17:31 CEST.
Units created with "systemctl edit --full" from generator units aren't
removed with "systemctl revert" because units in the generator,
generator_early, and generator_late paths aren't considered vendor
units.
Alter path_is_vendor() to return true if the given path is a
generator unit, which in turn causes "systemctl revert" to remove units
created with "systemctl edit --full" as they now have vendor versions.
Fixes#10053
Much like systemd.condition-needs-update= this new switch allows
overriding of a unit file condition, but this time its
ConditionFirstBoot=.
Usecase is also primarily debugging, but could be useful for other
schemes too.
Condition checks shouldn't log loudly, since they run all the time.
Let's make things debuggable, by keeping the messages in LOG_DEBUG in,
but don't make more noise than necessary.
We should do this check first since it is done on the string itself
without any conditioning of system state otherwise. It is a weird to do
this test only if /etc is read-only.
For low memory machines (256MB), 10% of RAM for /run may not be enough for
re-exec of PID1 because 16MB of free space is required and /run may already
contain something.
Fixes: oss-fuzz#22208
```
test/fuzz/fuzz-calendarspec/oss-fuzz-22208... ../src/shared/calendarspec.c:666:48: runtime error: signed integer overflow: 2147000000 + 1000000 cannot be represented in type 'int'
#0 0x7f0b9f6cc56a in prepend_component ../src/shared/calendarspec.c:666
#1 0x7f0b9f6cd03a in parse_chain ../src/shared/calendarspec.c:718
#2 0x7f0b9f6cea1c in parse_calendar_time ../src/shared/calendarspec.c:845
#3 0x7f0b9f6d1397 in calendar_spec_from_string ../src/shared/calendarspec.c:1084
#4 0x401570 in LLVMFuzzerTestOneInput ../src/fuzz/fuzz-calendarspec.c:17
#5 0x401ae0 in main ../src/fuzz/fuzz-main.c:39
#6 0x7f0b9e31b1a2 in __libc_start_main (/lib64/libc.so.6+0x271a2)
#7 0x40122d in _start (/home/fsumsal/repos/systemd/build/fuzz-calendarspec+0x40122d)
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../src/shared/calendarspec.c:666:48 in
```
Prompted by the discussions in #15180.
This is a bit more complex than I hoped, since for PID 1 we need to pass
in the synethetic environment block in we generate on demand.
It's easy to add, and should be pretty useful, in particular as in
AssertPathIsEncrypted= as it can be used for checking that
some path is encrypted before some service is invoked that might want to
place secure material there.
Limit size of various tmpfs mounts to 10% of RAM, except volatile root and /var
to 25%. Another exception is made for /dev (also /devs for PrivateDevices) and
/sys/fs/cgroup since no (or very few) regular files are expected to be used.
In addition, since directories, symbolic links, device specials and xattrs are
not counted towards the size= limit, number of inodes is also limited
correspondingly: 4MB size translates to 1k of inodes (assuming 4k each), 10% of
RAM (using 16GB of RAM as baseline) translates to 400k and 25% to 1M inodes.
Because nr_inodes option can't use ratios like size option, there's an
unfortunate side effect that with small memory systems the limit may be on the
too large side. Also, on an extremely small device with only 256MB of RAM, 10%
of RAM for /run may not be enough for re-exec of PID1 because 16MB of free
space is required.
[127/1355] Compiling C object 'src/shared/5afaae1@@systemd-shared-245@sta/ethtool-util.c.o'
../src/shared/ethtool-util.c: In function ‘ethtool_get_permanent_macaddr’:
../src/shared/ethtool-util.c:260:60: warning: array subscript 5 is outside the bounds of an interior zero-length array ‘__u8[0]’ {aka ‘unsigned char[]’} [-Wzero-length-bounds]
260 | ret->ether_addr_octet[i] = epaddr.addr.data[i];
| ~~~~~~~~~~~~~~~~^~~
In file included from ../src/shared/ethtool-util.c:5:
../src/shared/linux/ethtool.h:704:7: note: while referencing ‘data’
704 | __u8 data[0];
| ^~~~
../src/shared/ethtool-util.c: In function ‘ethtool_set_features’:
../src/shared/ethtool-util.c:488:31: warning: array subscript 0 is outside the bounds of an interior zero-length array ‘__u32[0]’ {aka ‘unsigned int[]’} [-Wzero-length-bounds]
488 | len = buffer.info.data[0];
| ~~~~~~~~~~~~~~~~^~~
In file included from ../src/shared/ethtool-util.c:5:
../src/shared/linux/ethtool.h:631:8: note: while referencing ‘data’
631 | __u32 data[0];
| ^~~~
The kernel should not define the length of the array, but it does. We can't fix
that, so let's use a cast to avoid the warning.
For https://github.com/systemd/systemd/issues/6119#issuecomment-626073743.
v2:
- use #pragma instead of a cast. It seems the cast only works in some cases, and
gcc is "smart" enough to see beyond the cast. Unfortunately clang does not support
this warning, so we need to do a config check whether to try to suppress.