Commit graph

220 commits

Author SHA1 Message Date
Yu Watanabe f5947a5e92 tree-wide: drop missing.h 2019-10-31 17:57:03 +09:00
Zbigniew Jędrzejewski-Szmek a5648b8094 basic/fs-util: change CHASE_OPEN flag into a separate output parameter
chase_symlinks() would return negative on error, and either a non-negative status
or a non-negative fd when CHASE_OPEN was given. This made the interface quite
complicated, because dependning on the flags used, we would get two different
"types" of return object. Coverity was always confused by this, and flagged
every use of chase_symlinks() without CHASE_OPEN as a resource leak (because it
would this that an fd is returned). This patch uses a saparate output parameter,
so there is no confusion.

(I think it is OK to have functions which return either an error or an fd. It's
only returning *either* an fd or a non-fd that is confusing.)
2019-10-24 22:44:24 +09:00
Lennart Poettering 2caa38e99f tree-wide: some more [static] related fixes
let's add [static] where it was missing so far

Drop [static] on parameters that can be NULL.

Add an assert() around parameters that have [static] and can't be NULL
hence.

Add some "const" where it was forgotten.
2019-07-12 16:40:10 +02:00
Lennart Poettering c6134d3e2f path-util: get rid of prefix_root()
prefix_root() is equivalent to path_join() in almost all ways, hence
let's remove it.

There are subtle differences though: prefix_root() will try shorten
multiple "/" before and after the prefix. path_join() doesn't do that.
This means prefix_root() might return a string shorter than both its
inputs combined, while path_join() never does that. I like the
path_join() semantics better, hence I think dropping prefix_root() is
totally OK. In the end the strings generated by both functon should
always be identical in terms of path_equal() if not streq().

This leaves prefix_roota() in place. Ideally we'd have path_joina(), but
I don't think we can reasonably implement that as a macro. or maybe we
can? (if so, sounds like something for a later PR)

Also add in a few missing OOM checks
2019-06-21 08:42:55 +09:00
Zbigniew Jędrzejewski-Szmek 7cc5ef5f18 pid1: improve message when setting up namespace fails
I covered the most obvious paths: those where there's a clear problem
with a path specified by the user.

Prints something like this (at error level):
May 21 20:00:01.040418 systemd[125871]: bad-workdir.service: Failed to set up mount namespacing: /run/systemd/unit-root/etc/tomcat9/Catalina: No such file or directory
May 21 20:00:01.040456 systemd[125871]: bad-workdir.service: Failed at step NAMESPACE spawning /bin/true: No such file or directory

Fixes #10972.
2019-05-22 16:28:02 +02:00
Lennart Poettering 6990fb6bc6 tree-wide: (void)ify a few unlink() and rmdir()
Let's be helpful to static analyzers which care about whether we
knowingly ignore return values. We do in these cases, since they are
usually part of error paths.
2019-03-27 18:09:56 +01:00
Lennart Poettering 9ce4e4b0f6 namespace: when DynamicUser=1 is set, mount StateDirectory= bind mounts "nosuid"
Add even more suid/sgid protection to DynamicUser= envionments: the
state directories we bind mount from the host will now have the nosuid
flag set, to disable the effect of nosuid on them.
2019-03-25 19:57:15 +01:00
Lennart Poettering 64e82c1976 mount-util: beef up bind_remount_recursive() to be able to toggle more than MS_RDONLY
The function is otherwise generic enough to toggle other bind mount
flags beyond MS_RDONLY (for example: MS_NOSUID or MS_NODEV), hence let's
beef it up slightly to support that too.
2019-03-25 19:33:55 +01:00
Lennart Poettering 867189b545 namespace: get rid of {} around single-line if blocks 2019-03-25 19:33:55 +01:00
Lennart Poettering 39e91a2777 namespace: get rid of local variable 2019-03-25 19:33:55 +01:00
Lennart Poettering 1019a48f40 namespace: (void)ify a number of syscalls 2019-03-25 19:33:55 +01:00
Lennart Poettering 5f7a690aaa namespace: replace one case of stack allocation with heap allocation
The list of mounts might grow quite large, let's avoid the stack for
this. Better safe than sorry.
2019-03-25 19:33:55 +01:00
Lennart Poettering d8b4d14df4 util: split out nulstr related stuff to nulstr-util.[ch] 2019-03-14 13:25:52 +01:00
Lennart Poettering 760877e90c util: split out sorting related calls to new sort-util.[ch] 2019-03-13 12:16:43 +01:00
Lennart Poettering 0cb8e3d118 util: split out namespace related stuff into a new namespace-util.[ch] pair
Just some minor reorganiztion.
2019-03-13 12:16:38 +01:00
Yu Watanabe 5beb8688e0 core/namespace: logs mount mode when the entry is dropped 2019-03-13 11:53:22 +09:00
Yu Watanabe 1e05071d27 core/namespace: introduce new mount mode READWRITE_IMPLICIT
ProtectSystem=strict or ProtectKernelTunable=yes create implicit
read-write mounts, but they are not overridable by TemporaryFileSystem=.
This makes such implicit read-write mounts use the new mount mode.
So, they can be override by TemproraryFileSystem= now.
A typical usecase is that ProtectSystem=strict and ProtectHome=tmpfs.

Fixes #11276.
2019-03-13 11:51:09 +09:00
Lennart Poettering 51af7fb230 core: add open_netns_path() helper
The new call allows us to open a netns from the file system, and store
it in a "storage fd pair". It's supposed to work with setup_netns() and
allows pre-population of the netns used with one opened from the file
system.
2019-03-07 16:55:23 +01:00
Lennart Poettering 44ffcbaea4 execute: (void)ify more 2019-03-07 16:53:45 +01:00
Topi Miettinen aecd5ac621 core: ProtectHostname= feature
Let services use a private UTS namespace. In addition, a seccomp filter is
installed on set{host,domain}name and a ro bind mounts on
/proc/sys/kernel/{host,domain}name.
2019-02-20 10:50:44 +02:00
Zbigniew Jędrzejewski-Szmek 3042bbebdd tree-wide: use c99 static for array size declarations
https://hamberg.no/erlend/posts/2013-02-18-static-array-indices.html

This only works with clang, unfortunately gcc doesn't seem to implement the check
(tested with gcc-8.2.1-5.fc29.x86_64).

Simulated error:
[2/3] Compiling C object 'systemd-nspawn@exe/src_nspawn_nspawn.c.o'.
../src/nspawn/nspawn.c:3179:45: warning: array argument is too small; contains 15 elements, callee requires at least 16 [-Warray-bounds]
                        candidate = (uid_t) siphash24(arg_machine, strlen(arg_machine), hash_key);
                                            ^                                           ~~~~~~~~
../src/basic/siphash24.h:24:64: note: callee declares array parameter as static here
uint64_t siphash24(const void *in, size_t inlen, const uint8_t k[static 16]);
                                                               ^~~~~~~~~~~~
2019-01-04 12:37:25 +01:00
Zbigniew Jędrzejewski-Szmek 049af8ad0c Split out part of mount-util.c into mountpoint-util.c
The idea is that anything which is related to actually manipulating mounts is
in mount-util.c, but functions for mountpoint introspection are moved to the
new file. Anything which requires libmount must be in mount-util.c.

This was supposed to be a preparation for further changes, with no functional
difference, but it results in a significant change in linkage:

$ ldd build/libnss_*.so.2
(before)
build/libnss_myhostname.so.2:
	linux-vdso.so.1 (0x00007fff77bf5000)
	librt.so.1 => /lib64/librt.so.1 (0x00007f4bbb7b2000)
	libmount.so.1 => /lib64/libmount.so.1 (0x00007f4bbb755000)
	libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f4bbb734000)
	libc.so.6 => /lib64/libc.so.6 (0x00007f4bbb56e000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f4bbb8c1000)
	libblkid.so.1 => /lib64/libblkid.so.1 (0x00007f4bbb51b000)
	libuuid.so.1 => /lib64/libuuid.so.1 (0x00007f4bbb512000)
	libselinux.so.1 => /lib64/libselinux.so.1 (0x00007f4bbb4e3000)
	libpcre2-8.so.0 => /lib64/libpcre2-8.so.0 (0x00007f4bbb45e000)
	libdl.so.2 => /lib64/libdl.so.2 (0x00007f4bbb458000)
build/libnss_mymachines.so.2:
	linux-vdso.so.1 (0x00007ffc19cc0000)
	librt.so.1 => /lib64/librt.so.1 (0x00007fdecb74b000)
	libcap.so.2 => /lib64/libcap.so.2 (0x00007fdecb744000)
	libmount.so.1 => /lib64/libmount.so.1 (0x00007fdecb6e7000)
	libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fdecb6c6000)
	libc.so.6 => /lib64/libc.so.6 (0x00007fdecb500000)
	/lib64/ld-linux-x86-64.so.2 (0x00007fdecb8a9000)
	libblkid.so.1 => /lib64/libblkid.so.1 (0x00007fdecb4ad000)
	libuuid.so.1 => /lib64/libuuid.so.1 (0x00007fdecb4a2000)
	libselinux.so.1 => /lib64/libselinux.so.1 (0x00007fdecb475000)
	libpcre2-8.so.0 => /lib64/libpcre2-8.so.0 (0x00007fdecb3f0000)
	libdl.so.2 => /lib64/libdl.so.2 (0x00007fdecb3ea000)
build/libnss_resolve.so.2:
	linux-vdso.so.1 (0x00007ffe8ef8e000)
	librt.so.1 => /lib64/librt.so.1 (0x00007fcf314bd000)
	libcap.so.2 => /lib64/libcap.so.2 (0x00007fcf314b6000)
	libmount.so.1 => /lib64/libmount.so.1 (0x00007fcf31459000)
	libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fcf31438000)
	libc.so.6 => /lib64/libc.so.6 (0x00007fcf31272000)
	/lib64/ld-linux-x86-64.so.2 (0x00007fcf31615000)
	libblkid.so.1 => /lib64/libblkid.so.1 (0x00007fcf3121f000)
	libuuid.so.1 => /lib64/libuuid.so.1 (0x00007fcf31214000)
	libselinux.so.1 => /lib64/libselinux.so.1 (0x00007fcf311e7000)
	libpcre2-8.so.0 => /lib64/libpcre2-8.so.0 (0x00007fcf31162000)
	libdl.so.2 => /lib64/libdl.so.2 (0x00007fcf3115c000)
build/libnss_systemd.so.2:
	linux-vdso.so.1 (0x00007ffda6d17000)
	librt.so.1 => /lib64/librt.so.1 (0x00007f610b83c000)
	libcap.so.2 => /lib64/libcap.so.2 (0x00007f610b835000)
	libmount.so.1 => /lib64/libmount.so.1 (0x00007f610b7d8000)
	libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f610b7b7000)
	libc.so.6 => /lib64/libc.so.6 (0x00007f610b5f1000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f610b995000)
	libblkid.so.1 => /lib64/libblkid.so.1 (0x00007f610b59e000)
	libuuid.so.1 => /lib64/libuuid.so.1 (0x00007f610b593000)
	libselinux.so.1 => /lib64/libselinux.so.1 (0x00007f610b566000)
	libpcre2-8.so.0 => /lib64/libpcre2-8.so.0 (0x00007f610b4e1000)
        libdl.so.2 => /lib64/libdl.so.2 (0x00007f610b4db000)

(after)
build/libnss_myhostname.so.2:
	linux-vdso.so.1 (0x00007fff0b5e2000)
	librt.so.1 => /lib64/librt.so.1 (0x00007fde0c328000)
	libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fde0c307000)
	libc.so.6 => /lib64/libc.so.6 (0x00007fde0c141000)
	/lib64/ld-linux-x86-64.so.2 (0x00007fde0c435000)
build/libnss_mymachines.so.2:
	linux-vdso.so.1 (0x00007ffdc30a7000)
	librt.so.1 => /lib64/librt.so.1 (0x00007f06ecabb000)
	libcap.so.2 => /lib64/libcap.so.2 (0x00007f06ecab4000)
	libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f06eca93000)
	libc.so.6 => /lib64/libc.so.6 (0x00007f06ec8cd000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f06ecc15000)
build/libnss_resolve.so.2:
	linux-vdso.so.1 (0x00007ffe95747000)
	librt.so.1 => /lib64/librt.so.1 (0x00007fa56a80f000)
	libcap.so.2 => /lib64/libcap.so.2 (0x00007fa56a808000)
	libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fa56a7e7000)
	libc.so.6 => /lib64/libc.so.6 (0x00007fa56a621000)
	/lib64/ld-linux-x86-64.so.2 (0x00007fa56a964000)
build/libnss_systemd.so.2:
	linux-vdso.so.1 (0x00007ffe67b51000)
	librt.so.1 => /lib64/librt.so.1 (0x00007ffb32113000)
	libcap.so.2 => /lib64/libcap.so.2 (0x00007ffb3210c000)
	libpthread.so.0 => /lib64/libpthread.so.0 (0x00007ffb320eb000)
	libc.so.6 => /lib64/libc.so.6 (0x00007ffb31f25000)
	/lib64/ld-linux-x86-64.so.2 (0x00007ffb3226a000)

I don't quite understand what is going on here, but let's not be too picky.
2018-11-29 21:03:44 +01:00
Zbigniew Jędrzejewski-Szmek baaa35ad70 coccinelle: make use of SYNTHETIC_ERRNO
Ideally, coccinelle would strip unnecessary braces too. But I do not see any
option in coccinelle for this, so instead, I edited the patch text using
search&replace to remove the braces. Unfortunately this is not fully automatic,
in particular it didn't deal well with if-else-if-else blocks and ifdefs, so
there is an increased likelikehood be some bugs in such spots.

I also removed part of the patch that coccinelle generated for udev, where we
returns -1 for failure. This should be fixed independently.
2018-11-22 10:54:38 +01:00
Yu Watanabe 93bab28895 tree-wide: use typesafe_qsort() 2018-09-19 08:02:52 +09:00
Yu Watanabe 2e4a4faea8 core/namespace: add more log messages 2018-09-18 14:31:09 +09:00
Alan Jenkins fcac12d150 namespace: remove redundant .has_prefix=false
The MountEntry's added for EMPTY_DIR work very similarly to the TMPFS ones.
In both cases, .has_prefix is false.  In fact, .has_prefix is false in
*all* the MountEntry's we add except for the access mounts (READONLY etc).

But EMPTY_DIR stuck out by explicitly setting .has_prefix = false.
Let's remove that.
2018-09-01 17:23:01 +09:00
Alan Jenkins 4a756839e6 namespace: we always use a root_directory now
We changed to always setup the new namespace in a separate directory
(commit 0722b35)
2018-09-01 17:23:01 +09:00
Alan Jenkins ad8e66dcc4 namespace: fix mode for TemporaryFileSystem=
... when no mount options are passed.

Change the code, to avoid the following failure in the newly added tests:

exec-temporaryfilesystem-rw.service: Executing: /usr/bin/sh -x -c
'[ "$(stat -c %a /var)" == 755 ]'
++ stat -c %a /var
+ '[' 1777 == 755 ']'
Received SIGCHLD from PID 30364 (sh).
Child 30364 (sh) died (code=exited, status=1/FAILURE)

(And I spotted an opportunity to use TAKE_PTR() at the end).
2018-09-01 17:22:14 +09:00
Alan Jenkins 69338c3dfb namespace: don't try to remount superblocks
We can't remount the underlying superblocks, if we are inside a user
namespace and running Linux <= 4.17.  We can only change the per-mount
flags (MS_REMOUNT | MS_BIND).

This type of mount() call can only change the per-mount flags, so we
don't have to worry about passing the right string options now.

Fixes #9914 ("Since 1beab8b was merged, systemd has been failing to start
systemd-resolved inside unprivileged containers" ... "Failed to re-mount
'/run/systemd/unit-root/dev' read-only: Operation not permitted").

> It's basically my fault :-). I pointed out we could remount read-only
> without MS_BIND when reviewing the PR that added TemporaryFilesystem=,
> and poettering suggested to change PrivateDevices= at the same time.
> I think it's safe to change back, and I don't expect anyone will notice
> a difference in behaviour.
>
> It just surprised me to realize that
> `TemporaryFilesystem=/tmp:size=10M,ro,nosuid` would not apply `ro` to the
> superblock (underlying filesystem), like mount -osize=10M,ro,nosuid does.
> Maybe a comment could note the kernel version (v4.18), that lets you
> remount without MS_BIND inside a user namespace.

This makes the code longer and I guess this function is still ugly, sorry.
One obstacle to cleaning it up is the interaction between
`PrivateDevices=yes` and `ReadOnlyPaths=/dev`.  I've added a test for the
existing behaviour, which I think is now the correct behaviour.
2018-08-30 11:17:16 +01:00
Yu Watanabe 52e4d62550
Merge pull request #9852 from poettering/namespace-errno
namespace: be more careful when handling namespacing failures
2018-08-22 11:16:29 +09:00
Lennart Poettering 1beab8b0d0 namespace: be more careful when handling namespacing failures gracefully
This makes two changes to the namespacing code:

1. We'll only gracefully skip service namespacing on access failure if
   exclusively sandboxing options where selected, and not mount-related
   options that result in a very different view of the world. For example,
   ignoring RootDirectory=, RootImage= or Bind= is really probablematic,
   but ReadOnlyPaths= is just a weaker sandbox.

2. The namespacing code will now return a clearly recognizable error
   code when it cannot enforce its namespacing, so that we cannot
   confuse EPERM errors from mount() with those from unshare(). Only the
   errors from the first unshare() are now taken as hint to gracefully
   disable namespacing.

Fixes: #9844 #9835
2018-08-21 20:00:33 +02:00
Zbigniew Jędrzejewski-Szmek 7692fed98b
Merge pull request #9783 from poettering/get-user-creds-flags
beef up get_user_creds() a bit and other improvements
2018-08-21 10:09:33 +02:00
Lennart Poettering b2a60844c4 namespace: when creating device nodes, also create /dev/char/* symlinks
On the host these symlinks are created by udev, and we consider them API
and make use of them ourselves at various places. Hence when running a
private /dev, also create these symlinks so that lookups by major/minor
work in such an environment, too.
2018-08-20 15:58:11 +02:00
Yu Watanabe 763a260ae7 core/namespace: add more log messages
Suggested by #9835.
2018-08-10 14:30:35 +09:00
Yu Watanabe 839f187753 core/namespace: drop mount points outside of root even if RootDirectory= is not set 2018-08-06 12:51:33 +09:00
Yu Watanabe 9b68367b3a core/namespace: drop conditions depends on root is empty or not
After 0722b35934, the variable `root`
is always set.
2018-08-06 12:51:33 +09:00
Chris Lamb 3fe910794b Correct a number of trivial typos. 2018-06-18 22:44:44 +02:00
Yu Watanabe 1e8c7bd55c namespace: drop protect_{home,system}_or_bool_from_string()
The functions protect_{home,system}_from_string() are not used
except for defining protect_{home,system}_or_bool_from_string().
This makes protect_{home,system}_from_string() support boolean
strings, and drops protect_{home,system}_or_bool_from_string().
2018-06-15 11:32:27 +02:00
Zbigniew Jędrzejewski-Szmek b0450864f1
Merge pull request #9274 from poettering/comment-header-cleanup
drop "this file is part of systemd" and lennart's copyright from header
2018-06-14 11:26:50 +02:00
Jan Synacek 0722b35934 namespace: always use a root directory when setting up namespace
1) mv /var/tmp /var/tmp.old
2) mkdir /tmp/varrr
3) ln -s /tmp/varrr /var/tmp

Now, when a service has PrivateTmp=yes, during namespace setup,
/tmp is first mounted over with a new mount. Then, when /var/tmp
is being resolved, it points to /tmp/varrr, which by then doesn't
exist, because it had already been obscured.
2018-06-14 10:25:16 +02:00
Lennart Poettering 0c69794138 tree-wide: remove Lennart's copyright lines
These lines are generally out-of-date, incomplete and unnecessary. With
SPDX and git repository much more accurate and fine grained information
about licensing and authorship is available, hence let's drop the
per-file copyright notice. Of course, removing copyright lines of others
is problematic, hence this commit only removes my own lines and leaves
all others untouched. It might be nicer if sooner or later those could
go away too, making git the only and accurate source of authorship
information.
2018-06-14 10:20:20 +02:00
Lennart Poettering 818bf54632 tree-wide: drop 'This file is part of systemd' blurb
This part of the copyright blurb stems from the GPL use recommendations:

https://www.gnu.org/licenses/gpl-howto.en.html

The concept appears to originate in times where version control was per
file, instead of per tree, and was a way to glue the files together.
Ultimately, we nowadays don't live in that world anymore, and this
information is entirely useless anyway, as people are very welcome to
copy these files into any projects they like, and they shouldn't have to
change bits that are part of our copyright header for that.

hence, let's just get rid of this old cruft, and shorten our codebase a
bit.
2018-06-14 10:20:20 +02:00
Zbigniew Jędrzejewski-Szmek 5d904a6aaa tree-wide: drop !! casts to booleans
They are not needed, because anything that is non-zero is converted
to true.

C11:
> 6.3.1.2: When any scalar value is converted to _Bool, the result is 0 if the
> value compares equal to 0; otherwise, the result is 1.

https://stackoverflow.com/questions/31551888/casting-int-to-bool-in-c-c
2018-06-13 10:52:40 +02:00
Lennart Poettering 228af36fff core: add new PrivateMounts= unit setting
This new setting is supposed to be useful in most cases where
"MountFlags=slave" is currently used, i.e. as an explicit way to run a
service in its own mount namespace and decouple propagation from all
mounts of the new mount namespace towards the host.

The effect of MountFlags=slave and PrivateMounts=yes is mostly the same,
as both cause a CLONE_NEWNS namespace to be opened, and both will result
in all mounts within it to be mounted MS_SLAVE. The difference is mostly
on the conceptual/philosophical level: configuring the propagation mode
is nothing people should have to think about, in particular as the
matter is not precisely easyto grok. Moreover, MountFlags= allows configuration
of "private" and "slave" modes which don't really make much sense to use
in real-life and are quite confusing. In particular PrivateMounts=private means
mounts made on the host stay pinned for good by the service which is
particularly nasty for removable media mount. And PrivateMounts=shared
is in most ways a NOP when used a alone...

The main technical difference between setting only MountFlags=slave or
only PrivateMounts=yes in a unit file is that the former remounts all
mounts to MS_SLAVE and leaves them there, while that latter remounts
them to MS_SHARED again right after. The latter is generally a nicer
approach, since it disables propagation, while MS_SHARED is afterwards
in effect, which is really nice as that means further namespacing down
the tree will get MS_SHARED logic by default and we unify how
applications see our mounts as we always pass them as MS_SHARED
regardless whether any mount namespacing is used or not.

The effect of PrivateMounts=yes was implied already by all the other
mount namespacing options. With this new option we add an explicit knob
for it, to request it without any other option used as well.

See: #4393
2018-06-12 16:12:10 +02:00
Yu Watanabe fa65c28176 namespace: rename parse_protect_{home,system}_or_bool() to protect_{home,system}_or_bool_to_string()
Hence, we can define config_parse_protect_{home,system}() by using
DEFINE_CONFIG_PARSE_ENUM() macro.
2018-05-31 11:09:41 +09:00
Lennart Poettering 4e2c0a227e namespace: extend list of masked files by ProtectKernelTunables=
This adds a number of entries nspawn already applies to regular service
namespacing too. Most importantly let's mask /proc/kcore and
/proc/kallsyms too.
2018-05-03 17:46:31 +02:00
Lennart Poettering da6053d0a7 tree-wide: be more careful with the type of array sizes
Previously we were a bit sloppy with the index and size types of arrays,
we'd regularly use unsigned. While I don't think this ever resulted in
real issues I think we should be more careful there and follow a
stricter regime: unless there's a strong reason not to use size_t for
array sizes and indexes, size_t it should be. Any allocations we do
ultimately will use size_t anyway, and converting forth and back between
unsigned and size_t will always be a source of problems.

Note that on 32bit machines "unsigned" and "size_t" are equivalent, and
on 64bit machines our arrays shouldn't grow that large anyway, and if
they do we have a problem, however that kind of overly large allocation
we have protections for usually, but for overflows we do not have that
so much, hence let's add it.

So yeah, it's a story of the current code being already "good enough",
but I think some extra type hygiene is better.

This patch tries to be comprehensive, but it probably isn't and I missed
a few cases. But I guess we can cover that later as we notice it. Among
smaller fixes, this changes:

1. strv_length()' return type becomes size_t

2. the unit file changes array size becomes size_t

3. DNS answer and query array sizes become size_t

Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=76745
2018-04-27 14:29:06 +02:00
Lennart Poettering 088696fe29 namespace: rework how we resolve symlinks in mount points
Before this patch we'd resolve all symlinks of bind mounts and other
mount points to establish for a service in advance, and only then start
mounting them. This is problematic, if symlink chains jump around
between directories in a namespace tree, so that to resolve a specific
symlink chain we need to establish another mount already. A typical case
where this happens is if /etc/resolv.conf is a symlink to some file in
/run: in that case we'd normally resolve and mount /etc/resolv.conf
early on, but that's broken, as to do this properly we'd need to resolve
/etc/resolv.conf first, then figure out that /run needs to be mounted
before we can proceed, and thus reorder the order in which we apply
mounts dynamically.

With this change, whenever we are about to apply a mount, we'll do a
single step of the symlink normalization process, patch the mount entry
accordingly, and then sort the list of mounts to establish again, taking
the new path into account. This means that we can correctly deal with
the example above: we might start with wanting to mount /etc/resolv.conf
early, but after resolving it to the path in /run/ we'd push it to the
end of the list, ensuring that /run is mounted first.

(Note that this also fixes another bug: we were following symlinks on
the bind mount source relative to the root directory of the service,
rather than of the host. That's wrong though as we explicitly document
tha the source of bind mounts is always on the host.)
2018-04-18 14:17:50 +02:00
Lennart Poettering e871786273 namespace: improve logging when creating mount source nodes 2018-04-18 14:15:48 +02:00
Lennart Poettering f8b64b5723 namespace: split out calls to normalize mount entry list into new function 2018-04-18 14:15:48 +02:00