All users of the macro (except for one, in serialize.c), use the macro in
connection with read_line(), so they must include fileio.h. Let's not play
libc games and require multiple header file to be included for the most common
use of a function.
The removal of def.h includes is not exact. I mostly went over the commits that
switch over to use read_line() and add def.h at the same time and reverted the
addition of def.h in those files.
"because blacklisted" is not grammatical, it should be something like
"becuase it is blacklisted", which in turn in too verbose. Let's use
a shorter form.
These lines are generally out-of-date, incomplete and unnecessary. With
SPDX and git repository much more accurate and fine grained information
about licensing and authorship is available, hence let's drop the
per-file copyright notice. Of course, removing copyright lines of others
is problematic, hence this commit only removes my own lines and leaves
all others untouched. It might be nicer if sooner or later those could
go away too, making git the only and accurate source of authorship
information.
This part of the copyright blurb stems from the GPL use recommendations:
https://www.gnu.org/licenses/gpl-howto.en.html
The concept appears to originate in times where version control was per
file, instead of per tree, and was a way to glue the files together.
Ultimately, we nowadays don't live in that world anymore, and this
information is entirely useless anyway, as people are very welcome to
copy these files into any projects they like, and they shouldn't have to
change bits that are part of our copyright header for that.
hence, let's just get rid of this old cruft, and shorten our codebase a
bit.
The function is similar to path_kill_slashes() but also removes
initial './', trailing '/.', and '/./' in the path.
When the second argument of path_simplify() is false, then it
behaves as the same as path_kill_slashes(). Hence, this also
replaces path_kill_slashes() with path_simplify().
if we lack privs to create device nodes that's fine, and creating
/run/systemd/inaccessible/chr or /run/systemd/inaccessible/blk won't
work then. Document this in longer comments.
Fixes: #4484
Double newlines (i.e. one empty lines) are great to structure code. But
let's avoid triple newlines (i.e. two empty lines), quadruple newlines,
quintuple newlines, …, that's just spurious whitespace.
It's an easy way to drop 121 lines of code, and keeps the coding style
of our sources a bit tigther.
Files which are installed as-is (any .service and other unit files, .conf
files, .policy files, etc), are left as is. My assumption is that SPDX
identifiers are not yet that well known, so it's better to retain the
extended header to avoid any doubt.
I also kept any copyright lines. We can probably remove them, but it'd nice to
obtain explicit acks from all involved authors before doing that.
This macro will read a pointer of any type, return it, and set the
pointer to NULL. This is useful as an explicit concept of passing
ownership of a memory area between pointers.
This takes inspiration from Rust:
https://doc.rust-lang.org/std/option/enum.Option.html#method.take
and was suggested by Alan Jenkins (@sourcejedi).
It drops ~160 lines of code from our codebase, which makes me like it.
Also, I think it clarifies passing of ownership, and thus helps
readability a bit (at least for the initiated who know the new macro)
This rearranges chase_symlinks() a bit: if no special flags are
specified it will now revert to behaviour before
b12d25a8d6. However, if the new
CHASE_TRAIL_SLASH flag is specified it will follow the behaviour
introduced by that commit.
I wasn't sure which one to make the beaviour that requires specification
of a flag to enable. I opted to make the "append trailing slash"
behaviour the one to enable by a flag, following the thinking that the
function should primarily be used to generate a normalized path, and I
am pretty sure a path without trailing slash is the more "normalized"
one, as the trailing slash is not really a part of it, but merely a
"decorator" that tells various system calls to generate ENOTDIR if the
path doesn't refer to a path.
Or to say this differently: if the slash was part of normalization then
we really should add it in all cases when the final path is a directory,
not just when the user originally specified it.
Fixes: #8544
Replaces: #8545
These helper calls are potentially called often, and allocate FILE*
objects internally for a very short period of time, let's turn off
locking for them too.
The loop preparation and part of the loop contents are actually the
same, let's merge this.
Also, it's so much fun tweaking around in the name_to_handle_at() code,
let's do more of it with this patch!
(This also adds two NULL assignments, that aren't strictly necessary.
However, I figured its safer to place them in there, just in case the
for() condition is changed later. After all the freeing of the handle
and the invalidation of the cleanup-controller pointer to it are
otherwise really far away from each other...)
In -U mode we might need to re-chown() all files and directories to
match the UID shift we want for the image. That's problematic on fat
partitions, such as the ESP (and which is generated by mkosi's
--bootable switch), because fat of course knows no UID/GID file
ownership natively.
With this change we take benefit of the uid= and gid= mount options FAT
knows: instead of chown()ing all files and directories we can just
specify the right UID/GID to use at mount time.
This beefs up the image dissection logic in two ways:
1. First of all support for mounting relevant file systems with
uid=/gid= is added: when a UID is specified during mount it is used for
all applicable file systems.
2. Secondly, two new mount flags are added:
DISSECT_IMAGE_MOUNT_ROOT_ONLY and DISSECT_IMAGE_MOUNT_NON_ROOT_ONLY.
If one is specified the mount routine will either only mount the root
partition of an image, or all partitions except the root partition.
This is used by nspawn: first the root partition is mounted, so that
we can determine the UID shift in use so far, based on ownership of
the image's root directory. Then, we mount the remaining partitions
in a second go, this time with the right UID/GID information.
If we'd use the system header's version of MAX_HANDLE_SZ then our code
would break on older kernels as soon as the value is increased, as old
kernels refuse larger buffers with EINVAL.
Apparently, the kernel returns EINVAL on NFS4 sometimes, even if we do
everything right, let's fallback in that case and find a different
approach to determine if something's a mount point.
See discussion at:
https://github.com/systemd/systemd/issues/7082#issuecomment-348001289
The kernel will reply with -ENOTDIR when we try to access a non-directory under
a name which ends with a slash. But our functions would strip the trailing slash
under various circumstances. Keep the trailing slash, so that
path_is_mount_point("/path/to/file/") return -ENOTDIR when /path/to/file/ is a file.
Tests are added for this change in behaviour.
Also, when called with a trailing slash, path_is_mount_point() would get
"" from basename(), and call name_to_handle_at(3, "", ...), and always
return -ENOENT. Now it'll return -ENOTDIR if the mount point is a file, and
true if it is a directory and a mount point.
v2:
- use strip_trailing_chars()
v3:
- instead of stripping trailing chars(), do the opposite — preserve them.
So, it appears name_to_handle_at() always returns the right buffer size
on EOVERFLOW, when it's returned due to a too small buffer. Let's rely
on that exclusively for sizing the buffer, and let's drop the
exponential buffer growing.
The new logic is now: if we see EOVERFLOW and the returned size has
increased, resize our buffer and try again. But if it didn't increase,
then propagate the EOVERFLOW as it likely has other causes.
This is a simple wrapper around name_to_handle_at_loop() and
fd_fdinfo_mnt_id() to query the mnt ID of a path. It uses
name_to_handle_at() where it can, and falls back to to
fd_fdinfo_mnt_id() where that doesn't work.
This is a best-effort thing of course, since neither name_to_handle_at()
nor the fdinfo logic work on all kernels.
As it turns out MAX_HANDLE_SZ is a lie, the handle buffer we pass into
name_to_handle_at() might need to be larger than MAX_HANDLE_SZ, and we
thus need to invoke name_to_handle_at() in a loop, growing the buffer as
needed.
This adds a new wrapper name_to_handle_at_loop() around
name_to_handle_at() that does the necessary looping, and ports over all
users.
Fixes: #7082
STR_IN_SET is a newer approach which is easier to write and read, and which
seems to result in space savings too:
before:
4949848 build/src/shared/libsystemd-shared-234.so
350704 build/systemctl
4967184 build/systemd
826216 build/udevadm
after:
4949848 build/src/shared/libsystemd-shared-234.so
350704 build/systemctl
4966888 build/systemd
826168 build/udevadm
Enable masking the /proc folder using the 'InaccessiblePaths' unit
option.
This also slightly simplify mounts setup as the bind_remount_recursive
function will only open /proc/self/mountinfo once.
This is based on the suggestion at:
https://lists.freedesktop.org/archives/systemd-devel/2017-April/038634.html
Container managers frequently block name_to_handle_at(), returning
EACCES or EPERM when this is issued. Accept that, and simply fall back
to to fdinfo-based checks.
Note that we accept either EACCES or EPERM here, as container managers
can choose the error code and aren't very good on agreeing on just one.
(note that this is a non-issue with nspawn, as we permit
name_to_handle_at() there, only block open_by_handle_at(), which should
be sufficiently safe).
Let's remove chase_symlinks_prefix() and instead introduce a flags parameter to
chase_symlinks(), with a flag CHASE_PREFIX_ROOT that exposes the behaviour of
chase_symlinks_prefix().
Let's use chase_symlinks() everywhere, and stop using GNU
canonicalize_file_name() everywhere. For most cases this should not change
behaviour, however increase exposure of our function to get better tested. Most
importantly in a few cases (most notably nspawn) it can take the correct root
directory into account when chasing symlinks.