These helper calls are potentially called often, and allocate FILE*
objects internally for a very short period of time, let's turn off
locking for them too.
Let's replace usage of fputc_unlocked() and friends by __fsetlocking(f,
FSETLOCKING_BYCALLER). This turns off locking for the entire FILE*,
instead of doing individual per-call decision whether to use normal
calls or _unlocked() calls.
This has various benefits:
1. It's easier to read and easier not to forget
2. It's more comprehensive, as fprintf() and friends are covered too
(as these functions have no _unlocked() counterpart)
3. Philosophically, it's a bit more correct, because it's more a
property of the file handle really whether we ever pass it on to another
thread, not of the operations we then apply to it.
This patch reworks all pieces of codes that so far used fxyz_unlocked()
calls to use __fsetlocking() instead. It also reworks all places that
use open_memstream(), i.e. use stdio FILE* for string manipulations.
Note that this in some way a revert of 4b61c87511.
fputs() writes only first 2048 bytes and fails
to write to /proc when values are larger than that.
This patch adds a new flag to WriteStringFileFlags
that make it possible to disable the buffer under
specific cases.
This commit updates networkd behavior to check if the hostname option
received via DHCP is too long for Linux limit, and in case shorten it.
An overlong hostname will be truncated to the first dot or to
`HOST_MAX_LEN`, whatever comes earlier.
Using strlen() to declare a buffer results in a variable-length array,
even if the compiler likely optimizes it to be a compile time constant.
When building with -Wvla, certain versions of gcc complain about such
buffers. Compiling with -Wvla has the advantage of preventing variably
length array, which defeat static asserts that are implemented by
declaring an array of negative length.
While the compiler likely optimizes strlen(x) for string literals,
it is not a constant expression.
Hence,
char buffer[strlen("OPTION_000") + 1];
declares a variable-length array. STRLEN() can be used instead
when a constant espression is needed.
It's not entirely identical to strlen(), as STRLEN("a\0") counts 2.
Also, it only works with string literals and the macro enforces
that the argument is a literal.
Our CODING_STYLE suggests not comparing with NULL, but relying on C's
downgrade-to-bool feature for that. Fix up some code to match these
guidelines. (This is not comprehensive, the coccinelle output for this
is unfortunately kinda borked)
The detection of ConditionVirtualisation= relies on the presence of
/proc/xen/capabilities. If the file exists and contains the string
"control_d", the running system is a dom0 and VIRTUALIZATION_NONE should
be set. In case /proc/xen exists, or some sysfs files indicate "xen",
VIRTUALIZATION_XEN should be set to indicate the system is a domU.
With an (old) xenlinux based kernel, /proc/xen/capabilities is always
available and the detection described above works always. But with a
pvops based kernel, xenfs must be mounted on /proc/xen to get
"capabilities". This is done by a proc-xen.mount unit, which is part of
xen.git. Since the mounting happens "late", other units may be scheduled
before "proc-xen.mount". If these other units make use of
"ConditionVirtualisation=", the virtualization detection returns
incorect results. detect_vm() will set VIRTUALIZATION_XEN because "xen"
is found in sysfs. This value will be cached. Once xenfs is mounted, the
next process that runs detect_vm() will get VIRTUALIZATION_NONE.
This misdetection can be fixed by using
/sys/hypervisor/properties/features, which exports the value returned by
the "XENVER_get_features" hypercall. If the bit XENFEAT_dom0 is set, the
domain is the "hardware domain". It is supposed to have permissions to
access all hardware. The used sysfs file is available since v2.6.31.
The commonly used term "dom0" refers to the control domain which runs
the toolstack and has access to all hardware. But the virtualization
host may be configured such that one dedicated domain becomes the
"hardware domain", and another one the "toolstack domain".
Update detect_vm_xen_dom0 to propagate errors in case reading
/proc/xen/capabilites fails. This does not fix any bugs, it just makes
it consistent with other functions called by detect_vm.
The file /proc/xen/capabilities is only available if xenfs is mounted.
With a classic xenlinux based kernel that file is available
unconditionally. But with a modern pvops based kernel, xenfs must be
mounted before the "capabilities" may appear. xenfs is mounted very late
via .services files provided by the Xen toolstack. Other units may be
scheduled before xenfs is mounted, which will confuse the detection of
VIRTUALIZATION_XEN.
In all Xen enabled kernels, and if that kernel is actually running on
the Xen hypervisor, the "/proc/xen" directory is the reliable indicator
that this instance runs in a "Xen guest".
Adjust the code to check for /proc/xen instead of
/proc/xen/capabilities.
Fixes commit 3f61278b5 ("basic: Bugfix Detect XEN Dom0 as no virtualization")
We already synthesize records for both "root" and "nobody" in
nss-systemd. Let's do the same in our own NSS wrappers that are supposed
to bypass NSS if possible. Previously this was done for "root" only, but
let's clean this up, and do the same for "nobody" too, so that we
synthesize records the same way everywhere, regardless whether in NSS or
internally.
Instead of contacting PID 1 for dynamic UID/GID lookups for all
UIDs/GIDs that do not qualify as "system" do the more precise check
instead: check if they actually qualify for the "dynamic" range.
This adds uid_is_system() and gid_is_system(), similar in style to
uid_is_dynamic(). That a helper like this is useful is illustrated by
the fact that test-condition.c didn't get the check right so far, which
this patch fixes.
The loop preparation and part of the loop contents are actually the
same, let's merge this.
Also, it's so much fun tweaking around in the name_to_handle_at() code,
let's do more of it with this patch!
(This also adds two NULL assignments, that aren't strictly necessary.
However, I figured its safer to place them in there, just in case the
for() condition is changed later. After all the freeing of the handle
and the invalidation of the cleanup-controller pointer to it are
otherwise really far away from each other...)
In -U mode we might need to re-chown() all files and directories to
match the UID shift we want for the image. That's problematic on fat
partitions, such as the ESP (and which is generated by mkosi's
--bootable switch), because fat of course knows no UID/GID file
ownership natively.
With this change we take benefit of the uid= and gid= mount options FAT
knows: instead of chown()ing all files and directories we can just
specify the right UID/GID to use at mount time.
This beefs up the image dissection logic in two ways:
1. First of all support for mounting relevant file systems with
uid=/gid= is added: when a UID is specified during mount it is used for
all applicable file systems.
2. Secondly, two new mount flags are added:
DISSECT_IMAGE_MOUNT_ROOT_ONLY and DISSECT_IMAGE_MOUNT_NON_ROOT_ONLY.
If one is specified the mount routine will either only mount the root
partition of an image, or all partitions except the root partition.
This is used by nspawn: first the root partition is mounted, so that
we can determine the UID shift in use so far, based on ownership of
the image's root directory. Then, we mount the remaining partitions
in a second go, this time with the right UID/GID information.
This adds a new flavour of strextend(), called
strextend_with_separator(), which takes an optional separator string. If
specified, the separator is inserted between each appended string, as
well as before the first one, but only if the original string was
non-empty.
This new call is particularly useful when appending new options to mount
option strings and suchlike, which need to be comma-separated, and
initially start out from an empty string.
Let's optimize things a bit, and instead of having to strip whitespace
first before decoding base64, let's do that implicitly while doing so.
Given that base64 was designed the way it was designed specifically to
be tolerant to whitespace changes, it's a good idea to do this
automatically and implicitly.
If we'd use the system header's version of MAX_HANDLE_SZ then our code
would break on older kernels as soon as the value is increased, as old
kernels refuse larger buffers with EINVAL.
Apparently, the kernel returns EINVAL on NFS4 sometimes, even if we do
everything right, let's fallback in that case and find a different
approach to determine if something's a mount point.
See discussion at:
https://github.com/systemd/systemd/issues/7082#issuecomment-348001289
Now the function returns an empty string when given an empty string.
Not sure if this is the best option (maybe this should be an error?),
but at least the behaviour is well defined.
Also do not include libcryptsetup.h directly, but only through crypt-util.h.
This way we do not have to repeat the define in every file where it is used.
The kernel will reply with -ENOTDIR when we try to access a non-directory under
a name which ends with a slash. But our functions would strip the trailing slash
under various circumstances. Keep the trailing slash, so that
path_is_mount_point("/path/to/file/") return -ENOTDIR when /path/to/file/ is a file.
Tests are added for this change in behaviour.
Also, when called with a trailing slash, path_is_mount_point() would get
"" from basename(), and call name_to_handle_at(3, "", ...), and always
return -ENOENT. Now it'll return -ENOTDIR if the mount point is a file, and
true if it is a directory and a mount point.
v2:
- use strip_trailing_chars()
v3:
- instead of stripping trailing chars(), do the opposite — preserve them.