Commit graph

1099 commits

Author SHA1 Message Date
Lennart Poettering 7bf011e355 nspawn: make use of SIGINT handling when copying files
Fixes: #13079
2019-07-17 11:14:11 +02:00
Yu Watanabe 8cec0a5c32 tree-wide: drop duplicated blank lines
```
$ for i in */*.[ch] */*/*.[ch]; do sed -e '/^$/ {N; s/\n$//g}' -i $i; done
$ git checkout HEAD -- basic/linux shared/linux
```
2019-07-15 18:41:27 +02:00
Lennart Poettering b910cc72c0 tree-wide: get rid of strappend()
It's a special case of strjoin(), so no need to keep both. In particular
as typing strjoin() is even shoert than strappend().
2019-07-12 14:31:12 +09:00
Zbigniew Jędrzejewski-Szmek cd132992bb nspawn: fix abort when we cannot execve
If execve failed, we would die in safe_close(), because master was already
closed by fdset_close_others() on line 3123. IIUC, we don't need to keep the
fd open after sending it, so let's just close it immediately.

Reproducer:
sudo build/systemd-nspawn -M rawhide fooooooo

Fixup for 3acc84ebd9.
2019-07-09 01:24:20 +02:00
Yu Watanabe 2d9b74ba87 tree-wide: replace strjoin() with path_join() 2019-06-24 23:59:38 +09:00
Lennart Poettering cee97d5768
Merge pull request #12836 from yuwata/tree-wide-replace-strjoin
tree-wide: replace strjoin() with path_join()
2019-06-22 20:02:46 +02:00
Lennart Poettering c6134d3e2f path-util: get rid of prefix_root()
prefix_root() is equivalent to path_join() in almost all ways, hence
let's remove it.

There are subtle differences though: prefix_root() will try shorten
multiple "/" before and after the prefix. path_join() doesn't do that.
This means prefix_root() might return a string shorter than both its
inputs combined, while path_join() never does that. I like the
path_join() semantics better, hence I think dropping prefix_root() is
totally OK. In the end the strings generated by both functon should
always be identical in terms of path_equal() if not streq().

This leaves prefix_roota() in place. Ideally we'd have path_joina(), but
I don't think we can reasonably implement that as a macro. or maybe we
can? (if so, sounds like something for a later PR)

Also add in a few missing OOM checks
2019-06-21 08:42:55 +09:00
Anita Zhang f66ad46066 nspawn: don't hard fail when setting capabilities
The OCI changes in #9762 broke a use case in which we use nspawn from
inside a container that has dropped capabilities from the bounding set
that nspawn expected to retain. In an attempt to keep OCI compliance
and support our use case, I made hard failing on setting capabilities
not in the bounding set optional (hard fail if using OCI and log only
if using nspawn cmdline).

Fixes #12539
2019-06-20 21:46:36 +02:00
Yu Watanabe 657ee2d82b tree-wide: replace strjoin() with path_join() 2019-06-21 03:26:16 +09:00
Franck Bui dc98caea32 nspawn: make use of openpt_allocate() 2019-06-18 09:27:06 +02:00
Franck Bui 3acc84ebd9 nspawn: allocate the pty used for /dev/console within the container
The console tty is now allocated from within the container so it's not
necessary anymore to allocate it from the host and bind mount the pty slave
into the container. The pty master is sent to the host.

/dev/console is now a symlink pointing to the pty slave.

This might also be less confusing for applications running inside the container
and the overall result looks cleaner (we don't need to apply manually the
passed selinux context, if any, to the allocated pty for instance).
2019-06-18 08:17:34 +02:00
Franck Bui ba72801d66 nspawn: use correct error variable when logging errors returned by send_one_fd() 2019-06-18 07:54:51 +02:00
Michal Sekletar 3f09629c22
Merge pull request #12628 from keszybz/dbus-execute
Rework cpu affinity parsing
2019-05-30 12:32:53 +02:00
Yu Watanabe a0267b30f8 nspawn: also support ifindex when specifying network interface 2019-05-30 11:04:05 +02:00
Zbigniew Jędrzejewski-Szmek 0985c7c4e2 Rework cpu affinity parsing
The CPU_SET_S api is pretty bad. In particular, it has a parameter for the size
of the array, but operations which take two (CPU_EQUAL_S) or even three arrays
(CPU_{AND,OR,XOR}_S) still take just one size. This means that all arrays must
be of the same size, or buffer overruns will occur. This is exactly what our
code would do, if it received an array of unexpected size over the network.
("Unexpected" here means anything different from what cpu_set_malloc() detects
as the "right" size.)

Let's rework this, and store the size in bytes of the allocated storage area.

The code will now parse any number up to 8191, independently of what the current
kernel supports. This matches the kernel maximum setting for any architecture,
to make things more portable.

Fixes #12605.
2019-05-29 10:20:42 +02:00
Zbigniew Jędrzejewski-Szmek 127c167cdb
Merge pull request #12390 from poettering/string-file-mkdir
fileio: add a WRITE_STRING_FILE_MKDIR_0755 flag to write_string_file() that creates parent directories if needed
2019-05-28 14:42:55 +02:00
Lennart Poettering f9a3d8e2f3 nspawn: expose the new seccomp actions in the OCI logic 2019-05-24 10:48:28 +02:00
Lennart Poettering e82e549fb2 tree-wide: make use of the new WRITE_STRING_FILE_MKDIR_0755 flag 2019-05-08 06:36:20 -04:00
Franck Bui 9f3f596477 meson: make source files including nspawn-settings.h depend on libseccomp
Since nspawn-settings.h includes seccomp.h, any file that includes
nspawn-settings.h should depend on libseccomp so the correct header path where
seccomp.h lives is added to the header search paths.

It's especially important for distros such as openSUSE where seccomp.h is not
shipped in /usr/include but /usr/include/libseccomp.

This patch is similar to 8238423095.
2019-04-30 19:31:22 +02:00
Ben Boeckel 5238e95759 codespell: fix spelling errors 2019-04-29 16:47:18 +02:00
Ben Boeckel 8f8dfb9552 nspawn-expose-ports: fix a typo in error message 2019-04-26 23:42:55 +02:00
Dominick Grift 8f1ed04ad6 nspawn: Fix volatile SELinux label
nspawn should associate the specified nspawn container apifs object label instead of the nspawn container process label with the volatile tmpfs
2019-04-13 12:03:02 +02:00
Lennart Poettering 33d60b8d57 json: simplify JSON_VARIANT_OBJECT_FOREACH() macro a bit
There's no point in returning the "key" within each loop iteration as
JsonVariant object. Let's simplify things and return it as string. That
simplifies usage (since the caller doesn't have to convert the object to
the string anymore) and is safe since we already validate that keys are
strings when an object JsonVariant is allocated.
2019-04-12 13:11:11 +02:00
Anita Zhang 7bc5e0b12b seccomp: check more error codes from seccomp_load()
We noticed in our tests that occasionally SystemCallFilter= would
fail to set and the service would run with no syscall filtering.
Most of the time the same tests would apply the filter and fail
the service as expected. While it's not totally clear why this happens,
we noticed seccomp_load() in the systemd code base would fail open for
all errors except EPERM and EACCES.

ENOMEM, EINVAL, and EFAULT seem like reasonable values to add to the
error set based on what I gather from libseccomp code and man pages:

-ENOMEM: out of memory, failed to allocate space for a libseccomp structure, or would exceed a defined constant
-EINVAL: kernel isn't configured to support the operations, args are invalid (to seccomp_load(), seccomp(), or prctl())
-EFAULT: addresses passed as args are invalid
2019-04-12 10:23:07 +02:00
Lennart Poettering 1eacc47062 nspawn: create boot_id and kmsg files for overmounting in /run, not /tmp
/tmp might not be mounted at all yet (given that we support
SYSTEMD_NSPAWN_TMPFS_TMP=0 to turn this off), and /tmp is a dir systemd
usually tries to unmount during shutdown (unlike /run), and we shouldn't
keep it busy. Hence let's just move these deleted files to /run so that
we don't keep /tmp needlessly busy.
2019-04-07 08:55:31 +02:00
Lennart Poettering c614711386 tree-wide: use SYNTHETIC_ERRNO() where appropriate 2019-04-02 14:54:42 +02:00
Lennart Poettering 8a016c746e util-lib: when copying files make sure to apply some chattrs early, some late
Some chattrs only work sensible if you set them right after opening a
file for create (think: FS_NOCOW_FL). Others only work when they are
applied when the file is fully written (think: FS_IMMUTABLE_FL). Let's
take that into account when copying files and applying a chattr to them.
2019-03-28 18:43:04 +01:00
Zbigniew Jędrzejewski-Szmek ca78ad1de9 headers: remove unneeded includes from util.h
This means we need to include many more headers in various files that simply
included util.h before, but it seems cleaner to do it this way.
2019-03-27 11:53:12 +01:00
Zbigniew Jędrzejewski-Szmek e1af3bc62a
Merge pull request #12106 from poettering/nosuidns
add "nosuid" flag to exec directory mounts of DynamicUser=1 services
2019-03-26 08:58:00 +01:00
Zbigniew Jędrzejewski-Szmek 99f57a4fea
Merge pull request #12105 from poettering/api-vfs-mount-flags
some API VFS mount flag tweaks
2019-03-26 08:32:53 +01:00
Lennart Poettering 25e68fd397 nspawn: minor improvements to --help text 2019-03-26 08:06:00 +01:00
Lennart Poettering 849b9b85b8 nspawn: mount mqueue with nodev,noexec,nosuid, too
The host mounts it like that, nspawn hence should do too.

Moreover, mount the file system after doing CLONEW_NEWIPC so that it
actually reflects the right mqueues. Finally, mount it wthout
considering it fatal, since POSIX mqueue support is little used and it
should be fine not to support it in the kernel.
2019-03-25 19:53:05 +01:00
Lennart Poettering 64e82c1976 mount-util: beef up bind_remount_recursive() to be able to toggle more than MS_RDONLY
The function is otherwise generic enough to toggle other bind mount
flags beyond MS_RDONLY (for example: MS_NOSUID or MS_NODEV), hence let's
beef it up slightly to support that too.
2019-03-25 19:33:55 +01:00
Lennart Poettering 83276695c6
Merge pull request #12079 from keszybz/fuzz-nspawn-oci
Add fuzzer for nspawn-oci
2019-03-22 21:06:17 +01:00
Lennart Poettering e4077ff6f3 nspawn: don't free "fds" twice
Previously both run() and run_container() would free 'fds'. Let's fix
that, and let run() free it but make run_container() already remove all
fds from it, because that's what we actually want to do.

Fixes: #12073
2019-03-22 18:11:27 +01:00
Zbigniew Jędrzejewski-Szmek b2645747b7 nspawn-oci: fix double free
Also rename function to make it clear that it also frees the array
object itself.
2019-03-22 17:39:12 +01:00
Zbigniew Jędrzejewski-Szmek 094eecd29d
Merge pull request #12055 from poettering/save-argc-argv
main-func.h and systemctl argc/argv improvements
2019-03-22 16:58:18 +01:00
Zbigniew Jędrzejewski-Szmek b1f13b0e75 nspawn-oci: mount source is optional 2019-03-22 12:04:32 +01:00
Zbigniew Jędrzejewski-Szmek b2e07b1a02 nspawn-oci: use _cleanup_ in one more place 2019-03-22 11:51:21 +01:00
Lennart Poettering ae408d77a9 nspawn: conditionalize libseccomp use
We support compilation without libseccomp, hence don't rely on its
symbols.
2019-03-22 11:07:03 +01:00
Lennart Poettering 60ffa37a65 main-func: implicitly save argc/argv in DEFINE_MAIN_FUNCTION() functions
Let's remove the risk of forgetting to save argc/argv if
DEFINE_MAIN_FUNCTION() is used.
2019-03-21 18:10:06 +01:00
Lennart Poettering 36fea15565 util: introduce save_argc_argv() helper 2019-03-21 18:08:56 +01:00
Lennart Poettering c82cfae00b
Merge pull request #12062 from poettering/nspawn-main-func
nspawn: port to DEFINE_MAIN_FUNCTION()
2019-03-21 18:08:27 +01:00
Zbigniew Jędrzejewski-Szmek bb068de080 nspawn: add --no-pager switch
It only matters for --help.
2019-03-21 17:42:43 +01:00
Lennart Poettering 04f590a4a4 nspawn: voidify sd_notify() calls 2019-03-21 16:32:46 +01:00
Lennart Poettering 6145bb4f78 nspawn: port to static destructors 2019-03-21 16:32:46 +01:00
Lennart Poettering 44dbef90f1 nspawn: port to main-func.h logic 2019-03-21 16:32:46 +01:00
Zbigniew Jędrzejewski-Szmek fa28e4e377
Merge pull request #12059 from poettering/nspawn-typos
some typo and other fixes result of the OCI nspawn merge
2019-03-21 15:14:11 +01:00
Lennart Poettering c3d13d2ad5
Merge pull request #12058 from keszybz/oci-simplifications
Follow-ups for nspawn-oci review
2019-03-21 13:55:09 +01:00
Lennart Poettering f4e803c809 nspawn: add a few missing flags from --help text 2019-03-21 13:31:09 +01:00
Lennart Poettering 2514865391 nspawn: reorder --help text, and add section
The list is so long, let's add a bit of structure and order things a
bit.
2019-03-21 13:27:19 +01:00
Lennart Poettering 2c9b7a7e62 mount: when we fail to establish an inaccessible mount gracefully, undo the mount 2019-03-21 12:41:02 +01:00
Zbigniew Jędrzejewski-Szmek 6757a01356 util-lib: get rid of a helper variable 2019-03-21 11:08:58 +01:00
Zbigniew Jędrzejewski-Szmek f1531db5af nspawn-oci: add helper function for free_and_strdup with oom check 2019-03-21 11:08:58 +01:00
Zbigniew Jędrzejewski-Szmek d0b6a10c00
Merge pull request #9762 from poettering/nspawn-oci
OCI runtime support for nspawn
2019-03-21 11:01:53 +01:00
Zbigniew Jędrzejewski-Szmek 19130626a0 nspawn-oci: use SYNTHETIC_ERRNO 2019-03-21 10:51:43 +01:00
Topi Miettinen ebcf697685 tree-wide: fix false search hits with ppp (typos) 2019-03-18 14:25:56 +01:00
Lennart Poettering 95658673a0
Merge pull request #12016 from yuwata/fix-two-memleaks-found-by-oss-fuzz
Fix two memleaks found by oss fuzz
2019-03-15 17:33:48 +01:00
Yu Watanabe 1d0c1146ea nspawn: fix memleak
Fixes oss-fuzz#13691.
2019-03-15 23:53:05 +09:00
Zbigniew Jędrzejewski-Szmek 7acf581a58 Handle or voidify all calls to close_all_fds()
In activate, it is important that we close the fds. In other cases, meh.
2019-03-15 15:46:41 +01:00
Lennart Poettering a3fc6b55ac nspawn: mask out CAP_NET_ADMIN again if settings file turns off private networking
Fixes: #11755
2019-03-15 15:42:21 +01:00
Lennart Poettering bd4b15f274 nspawn: use right constant for shifting for uint64_t caps 2019-03-15 15:42:20 +01:00
Lennart Poettering de40a3037a nspawn: add support for executing OCI runtime bundles with nspawn
This is a pretty large patch, and adds support for OCI runtime bundles
to nspawn. A new switch --oci-bundle= is added that takes a path to an
OCI bundle. The JSON file included therein is read similar to a .nspawn
settings files, however with a different feature set.

Implementation-wise this mostly extends the pre-existing Settings object
to carry additional properties for OCI. However, OCI supports some
concepts .nspawn files did not support yet, which this patch also adds:

1. Support for "masking" files and directories. This functionatly is now
   also available via the new --inaccesible= cmdline command, and
   Inaccessible= in .nspawn files.

2. Support for mounting arbitrary file systems. (not exposed through
   nspawn cmdline nor .nspawn files, because probably not a good idea)

3. Ability to configure the console settings for a container. This
   functionality is now also available on the nspawn cmdline in the new
   --console= switch (not added to .nspawn for now, as it is something
   specific to the invocation really, not a property of the container)

4. Console width/height configuration. Not exposed through
   .nspawn/cmdline, but this may be controlled through $COLUMNS and
   $LINES like in most other UNIX tools.

5. UID/GID configuration by raw numbers. (not exposed in .nspawn and on
   the cmdline, since containers likely have different user tables, and
   the existing --user= switch appears to be the better option)

6. OCI hook commands (no exposed in .nspawn/cmdline, as very specific to
   OCI)

7. Creation of additional devices nodes in /dev. Most likely not a good
   idea, hence not exposed in .nspawn/cmdline. There's already --bind=
   to achieve the same, which is the better alternative.

8. Explicit syscall filters. This is not a good idea, due to the skewed
   arch support, hence not exposed through .nspawn/cmdline.

9. Configuration of some sysctls on a whitelist. Questionnable, not
   supported in .nspawn/cmdline for now.

10. Configuration of all 5 types of capabilities. Not a useful concept,
    since the kernel will reduce the caps on execve() anyway. Not
    exposed through .nspawn/cmdline as this is not very useful hence.

Note that this only implements the OCI runtime logic itself. It does not
provide a runc-compatible command line tool. This is left for a later
PR. Only with that in place tools such as "buildah" can use the OCI
support in nspawn as drop-in replacement.

Currently still missing is OCI hook support, but it's already parsed and
everything, and should be easy to add. Other than that it's OCI is
implemented pretty comprehensively.

There's a list of incompatibilities in the nspawn-oci.c file. In a later
PR I'd like to convert this into proper markdown and add it to the
documentation directory.
2019-03-15 15:41:28 +01:00
Lennart Poettering 5ef4cb7ad0 nspawn: (void)ify more stuff 2019-03-15 15:33:09 +01:00
Lennart Poettering 61b4443361 nspawn: refactor setuid code a bit
Let's separate out the raw uid_t/gid_t handling from the username
handling. This is useful later on.

Also, let's use the right gid_t type for group types wherever
appropriate.
2019-03-15 15:33:09 +01:00
Lennart Poettering d8b4d14df4 util: split out nulstr related stuff to nulstr-util.[ch] 2019-03-14 13:25:52 +01:00
Lennart Poettering e45c81b8bc shared: split out code to wait for jobs to complet into its own source file
It's complex enough and quite a few functions. Let's hence split this
out.

No code change, just some rearranging of source files.
2019-03-13 17:39:24 +01:00
Lennart Poettering 760877e90c util: split out sorting related calls to new sort-util.[ch] 2019-03-13 12:16:43 +01:00
Lennart Poettering 0cb8e3d118 util: split out namespace related stuff into a new namespace-util.[ch] pair
Just some minor reorganiztion.
2019-03-13 12:16:38 +01:00
Zbigniew Jędrzejewski-Szmek 0e636bf51a nspawn: fix memleak uncovered by fuzzer
Also use TAKE_PTR as appropriate.
2019-03-11 14:29:30 +01:00
Lennart Poettering 27da7ef0d0 nspawn: move payload to sub-cgroup first, then sync cgroup trees
if we sync the legacy and unified trees before moving to the right
subcgroup then ultimately the cgroup paths in the hierarchies will be
out-of-sync... Hence, let's move the payload first, and sync then.

Addresses: https://github.com/systemd/systemd/pull/9762#issuecomment-441187979
2019-03-07 11:26:17 +01:00
Lennart Poettering adc6f43b14 copy: don't synthesize a 'user.crtime_usec' xattr on copy unless explicitly requested
Previously, when we'd copy an individual file we'd synthesize a
user.crtime_usec xattr with the source's creation time if we can
determine it. As the creation/birth time was until recently not
queriable form userspace this effectively just propagated the same xattr
on the source to the same xattr on the destination. However, current
kernels now allow to query the birthtime using statx() and we do make
use of that now. Which means that suddenly we started synthesizing these
xattrs much more regularly.

Doing this actually does make sense, but only in very few cases:
not for the typical regular files we copy, but certainly when dealing
with disk images. Hence, let's keep this kind of propagation, but let's
make it a flag and default to off. Then turn it on whenever we deal with
disk images, and leave it off otherwise.

This is particularly relevant as overlayfs combining a real fs, and a
tmpfs on top will result in EOPNOTSUPP when it is attempted to open a
file with xattrs for writing, as tmpfs does not support xattrs, and
hence the copy-up cannot work. Hence, let's avoid synthesizing this
needlessly, to increase compat with overlayfs.
2019-03-01 14:11:07 +01:00
Lennart Poettering e5a4bb0d4e nspawn: rework how arg_read_only is initialized in --volatile= mode
Previously, we'd refuse the combination, and claimed we'd imply it, but
actually didn't. Let's allow the combination and imply read-only from
--volatile=, because that's what's documented, what we claim we do, and
what makes sense.
2019-03-01 14:11:07 +01:00
Lennart Poettering 83205269c0 nspawn: refactor how we determine whether it's OK to write to /etc 2019-03-01 14:11:07 +01:00
Lennart Poettering e50cd82f68 nspawn: no need to make top-level directory a bind mount if we just dissected an image 2019-03-01 14:11:07 +01:00
Lennart Poettering 7d0ecdd62d nspawn: slightly reorder mount logic
Let's first setup the volatile logic, and only then mount secondary
partitions of the image in.
2019-03-01 14:11:07 +01:00
Lennart Poettering 6c610acaaa nspawn: add --volatile=overlay support
Fixes: #11054 #3847
2019-03-01 14:11:06 +01:00
Lennart Poettering c55d0ae764 nspawn: fix an error path 2019-03-01 14:11:06 +01:00
Lennart Poettering e5b43a04b6 nspawn: add volatile mode multiplexer call setup_volatile_mode()
Just some refactoring, no change in behaviour.
2019-03-01 14:11:06 +01:00
Lennart Poettering 0646d3c3dd nspawn: explicitly refuse mounts over /
Previously this would fail later on, but let's filter this out at the
time of parsing.
2019-03-01 14:11:06 +01:00
Lennart Poettering 6e9417f5b4 tree-wide: use newa() instead of alloca() wherever we can
Typesafety is nice. And this way we can take benefit of the new size
assert() the previous commit added.
2019-01-26 16:17:04 +01:00
Lennart Poettering 2949ff2691 nspawn: ignore SIGPIPE for nspawn itself
Let's not abort due to a dead stdout.

Fixes: #11533
2019-01-26 13:54:44 +01:00
Lennart Poettering b2238e380e test,systemctl,nspawn: use "const char*" instead of "char*" as iterator for FOREACH_STRING()
The macro iterates through literal strings (i.e. constant strings),
hence it's more correct to have the iterator const too.
2019-01-16 12:29:30 +01:00
Chris Down e92aaed30e tree-wide: Remove O_CLOEXEC from fdopen
fdopen doesn't accept "e", it's ignored. Let's not mislead people into
believing that it actually sets O_CLOEXEC.

From `man 3 fdopen`:

> e (since glibc 2.7):
> Open the file with the O_CLOEXEC flag. See open(2) for more information. This flag is ignored for fdopen()

As mentioned by @jlebon in #11131.
2018-12-12 20:47:40 +01:00
Zbigniew Jędrzejewski-Szmek 489fae526d nspawn: check cg_ns_supported() just once
cg_ns_supported() caches, so the condition was really checked just once, but
it looks weird to assign the return value to arg_use_cgns (if the variable is not present),
because then the other checks are effectively equivalent to
  if (cg_ns_supported() && cg_ns_supported()) { ...
and later
  if (!cg_ns_supported() || !cg_ns_supported()) { ...
2018-12-11 13:37:41 +00:00
Lennart Poettering 60f1ec13ed nspawn: move most validation checks and configuration mangling into verify_arguments()
That's what the function is for after all, and only if it's done there
we can verify the effect of .nspawn files correctly too: after all we
should not just validate that everything configured on the command line
makes sense, but the stuff configured in the .nspawn files, too.
2018-12-10 12:54:56 +01:00
Lennart Poettering d5455d2f98 nspawn: split out code parsing env vars into a function of its own
This then let's us to ensure it's called after we parsed the cmdline,
and after we loaded the settings file, so that it these env var settings
override everything loaded from there.
2018-12-10 12:54:56 +01:00
Lennart Poettering 5eee829043 nspawn: move cg_unified_flush() invocation out of parse_argv()
It has nothing to do with argument parsing, and hence shouldn't be
there.
2018-12-10 12:54:56 +01:00
Zbigniew Jędrzejewski-Szmek 871fa294ff Merge pull request #10935 from poettering/rlimit-nofile-safe
Merged by hand to resolve a trivial conflict in TODO.
2018-12-06 17:19:21 +01:00
Yu Watanabe e93672eeac tree-wide: drop missing.h from headers and use relevant missing_*.h 2018-12-06 13:31:16 +01:00
Yu Watanabe 204f52e32d lockfile: drop unnecessary headers from lockfile-util.h 2018-12-06 13:31:16 +01:00
Yu Watanabe 503f480f8e missing: move fs or mount related definitions to missing_fs.h
This also fixes errnous definition MS_REC -> MS_SLAVE.
2018-12-06 13:30:43 +01:00
Yu Watanabe 36dd5ffd5d util: drop missing.h from util.h 2018-12-04 10:00:34 +01:00
Lennart Poettering e4de72876e util-lib: split out all temporary file related calls into tmpfiles-util.c
This splits out a bunch of functions from fileio.c that have to do with
temporary files. Simply to make the header files a bit shorter, and to
group things more nicely.

No code changes, just some rearranging of source files.
2018-12-02 13:22:29 +01:00
Lennart Poettering 5dd9527883 tree-wide: remove various unused functions
All found with "cppcheck --enable=unusedFunction".
2018-12-02 13:35:34 +09:00
Lennart Poettering 595225af7a tree-wide: invoke rlimit_nofile_safe() before various exec{v,ve,l}() invocations
Whenever we invoke external, foreign code from code that has
RLIMIT_NOFILE's soft limit bumped to high values, revert it to 1024
first. This is a safety precaution for compatibility with programs using
select() which cannot operate with fds > 1024.

This commit adds the call to rlimit_nofile_safe() to all invocations of
exec{v,ve,l}() and friends that either are in code that we know runs
with RLIMIT_NOFILE bumped up (which is PID 1 and all journal code for
starters) or that is part of shared code that might end up there.

The calls are placed as early as we can in processes invoking a flavour
of execve(), but after the last time we do fd manipulations, so that we
can still take benefit of the high fd limits for that.
2018-12-01 12:50:45 +01:00
Zbigniew Jędrzejewski-Szmek b2ac2b01c8
Merge pull request #10996 from poettering/oci-prep
Preparation for the nspawn-OCI work
2018-11-30 10:09:00 +01:00
Zbigniew Jędrzejewski-Szmek 049af8ad0c Split out part of mount-util.c into mountpoint-util.c
The idea is that anything which is related to actually manipulating mounts is
in mount-util.c, but functions for mountpoint introspection are moved to the
new file. Anything which requires libmount must be in mount-util.c.

This was supposed to be a preparation for further changes, with no functional
difference, but it results in a significant change in linkage:

$ ldd build/libnss_*.so.2
(before)
build/libnss_myhostname.so.2:
	linux-vdso.so.1 (0x00007fff77bf5000)
	librt.so.1 => /lib64/librt.so.1 (0x00007f4bbb7b2000)
	libmount.so.1 => /lib64/libmount.so.1 (0x00007f4bbb755000)
	libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f4bbb734000)
	libc.so.6 => /lib64/libc.so.6 (0x00007f4bbb56e000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f4bbb8c1000)
	libblkid.so.1 => /lib64/libblkid.so.1 (0x00007f4bbb51b000)
	libuuid.so.1 => /lib64/libuuid.so.1 (0x00007f4bbb512000)
	libselinux.so.1 => /lib64/libselinux.so.1 (0x00007f4bbb4e3000)
	libpcre2-8.so.0 => /lib64/libpcre2-8.so.0 (0x00007f4bbb45e000)
	libdl.so.2 => /lib64/libdl.so.2 (0x00007f4bbb458000)
build/libnss_mymachines.so.2:
	linux-vdso.so.1 (0x00007ffc19cc0000)
	librt.so.1 => /lib64/librt.so.1 (0x00007fdecb74b000)
	libcap.so.2 => /lib64/libcap.so.2 (0x00007fdecb744000)
	libmount.so.1 => /lib64/libmount.so.1 (0x00007fdecb6e7000)
	libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fdecb6c6000)
	libc.so.6 => /lib64/libc.so.6 (0x00007fdecb500000)
	/lib64/ld-linux-x86-64.so.2 (0x00007fdecb8a9000)
	libblkid.so.1 => /lib64/libblkid.so.1 (0x00007fdecb4ad000)
	libuuid.so.1 => /lib64/libuuid.so.1 (0x00007fdecb4a2000)
	libselinux.so.1 => /lib64/libselinux.so.1 (0x00007fdecb475000)
	libpcre2-8.so.0 => /lib64/libpcre2-8.so.0 (0x00007fdecb3f0000)
	libdl.so.2 => /lib64/libdl.so.2 (0x00007fdecb3ea000)
build/libnss_resolve.so.2:
	linux-vdso.so.1 (0x00007ffe8ef8e000)
	librt.so.1 => /lib64/librt.so.1 (0x00007fcf314bd000)
	libcap.so.2 => /lib64/libcap.so.2 (0x00007fcf314b6000)
	libmount.so.1 => /lib64/libmount.so.1 (0x00007fcf31459000)
	libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fcf31438000)
	libc.so.6 => /lib64/libc.so.6 (0x00007fcf31272000)
	/lib64/ld-linux-x86-64.so.2 (0x00007fcf31615000)
	libblkid.so.1 => /lib64/libblkid.so.1 (0x00007fcf3121f000)
	libuuid.so.1 => /lib64/libuuid.so.1 (0x00007fcf31214000)
	libselinux.so.1 => /lib64/libselinux.so.1 (0x00007fcf311e7000)
	libpcre2-8.so.0 => /lib64/libpcre2-8.so.0 (0x00007fcf31162000)
	libdl.so.2 => /lib64/libdl.so.2 (0x00007fcf3115c000)
build/libnss_systemd.so.2:
	linux-vdso.so.1 (0x00007ffda6d17000)
	librt.so.1 => /lib64/librt.so.1 (0x00007f610b83c000)
	libcap.so.2 => /lib64/libcap.so.2 (0x00007f610b835000)
	libmount.so.1 => /lib64/libmount.so.1 (0x00007f610b7d8000)
	libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f610b7b7000)
	libc.so.6 => /lib64/libc.so.6 (0x00007f610b5f1000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f610b995000)
	libblkid.so.1 => /lib64/libblkid.so.1 (0x00007f610b59e000)
	libuuid.so.1 => /lib64/libuuid.so.1 (0x00007f610b593000)
	libselinux.so.1 => /lib64/libselinux.so.1 (0x00007f610b566000)
	libpcre2-8.so.0 => /lib64/libpcre2-8.so.0 (0x00007f610b4e1000)
        libdl.so.2 => /lib64/libdl.so.2 (0x00007f610b4db000)

(after)
build/libnss_myhostname.so.2:
	linux-vdso.so.1 (0x00007fff0b5e2000)
	librt.so.1 => /lib64/librt.so.1 (0x00007fde0c328000)
	libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fde0c307000)
	libc.so.6 => /lib64/libc.so.6 (0x00007fde0c141000)
	/lib64/ld-linux-x86-64.so.2 (0x00007fde0c435000)
build/libnss_mymachines.so.2:
	linux-vdso.so.1 (0x00007ffdc30a7000)
	librt.so.1 => /lib64/librt.so.1 (0x00007f06ecabb000)
	libcap.so.2 => /lib64/libcap.so.2 (0x00007f06ecab4000)
	libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f06eca93000)
	libc.so.6 => /lib64/libc.so.6 (0x00007f06ec8cd000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f06ecc15000)
build/libnss_resolve.so.2:
	linux-vdso.so.1 (0x00007ffe95747000)
	librt.so.1 => /lib64/librt.so.1 (0x00007fa56a80f000)
	libcap.so.2 => /lib64/libcap.so.2 (0x00007fa56a808000)
	libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fa56a7e7000)
	libc.so.6 => /lib64/libc.so.6 (0x00007fa56a621000)
	/lib64/ld-linux-x86-64.so.2 (0x00007fa56a964000)
build/libnss_systemd.so.2:
	linux-vdso.so.1 (0x00007ffe67b51000)
	librt.so.1 => /lib64/librt.so.1 (0x00007ffb32113000)
	libcap.so.2 => /lib64/libcap.so.2 (0x00007ffb3210c000)
	libpthread.so.0 => /lib64/libpthread.so.0 (0x00007ffb320eb000)
	libc.so.6 => /lib64/libc.so.6 (0x00007ffb31f25000)
	/lib64/ld-linux-x86-64.so.2 (0x00007ffb3226a000)

I don't quite understand what is going on here, but let's not be too picky.
2018-11-29 21:03:44 +01:00
Lennart Poettering 17c58ba97b nspawn: let's also pre-mount /dev/mqueue 2018-11-29 20:21:40 +01:00
Yu Watanabe acf4d15893 util: make *_from_name() returns negative errno on error 2018-11-28 20:20:50 +09:00
Yu Watanabe 938dbb292a
Merge pull request #10901 from poettering/startswith-list
add new STARTSWITH_SET() macro
2018-11-26 22:40:51 +09:00
Lennart Poettering da9fc98ded tree-wide: port more code over to PATH_STARTSWITH_SET() 2018-11-26 14:08:46 +01:00
Lennart Poettering 27adcc9737 cgroup: be more careful with which controllers we can enable/disable on a cgroup
This changes cg_enable_everywhere() to return which controllers are
enabled for the specified cgroup. This information is then used to
correctly track the enablement mask currently in effect for a unit.
Moreover, when we try to turn off a controller, and this works, then
this is indicates that the parent unit might succesfully turn it off
now, too as our unit might have kept it busy.

So far, when realizing cgroups, i.e. when syncing up the kernel
representation of relevant cgroups with our own idea we would strictly
work from the root to the leaves. This is generally a good approach, as
when controllers are enabled this has to happen in root-to-leaves order.
However, when controllers are disabled this has to happen in the
opposite order: in leaves-to-root order (this is because controllers can
only be enabled in a child if it is already enabled in the parent, and
if it shall be disabled in the parent then it has to be disabled in the
child first, otherwise it is considered busy when it is attempted to
remove it in the parent).

To make things complicated when invalidating a unit's cgroup membershup
systemd can actually turn off some controllers previously turned on at
the very same time as it turns on other controllers previously turned
off. In such a case we have to work up leaves-to-root *and*
root-to-leaves right after each other. With this patch this is
implemented: we still generally operate root-to-leaves, but as soon as
we noticed we successfully turned off a controller previously turned on
for a cgroup we'll re-enqueue the cgroup realization for all parents of
a unit, thus implementing leaves-to-root where necessary.
2018-11-23 13:41:37 +01:00
Zbigniew Jędrzejewski-Szmek baaa35ad70 coccinelle: make use of SYNTHETIC_ERRNO
Ideally, coccinelle would strip unnecessary braces too. But I do not see any
option in coccinelle for this, so instead, I edited the patch text using
search&replace to remove the braces. Unfortunately this is not fully automatic,
in particular it didn't deal well with if-else-if-else blocks and ifdefs, so
there is an increased likelikehood be some bugs in such spots.

I also removed part of the patch that coccinelle generated for udev, where we
returns -1 for failure. This should be fixed independently.
2018-11-22 10:54:38 +01:00
Lennart Poettering 818623aca5
Merge pull request #10860 from keszybz/more-cleanup-2
Do more stuff from main macros
2018-11-21 11:07:31 +01:00
Zbigniew Jędrzejewski-Szmek 294bf0c34a Split out pretty-print.c and move pager.c and main-func.h to shared/
This is high-level functionality, and fits better in shared/ (which is for
our executables), than in basic/ (which is also for libraries).
2018-11-20 18:40:02 +01:00
Lennart Poettering f2fb2ec942 nspawn: use EXIT_EXCEPTION where appropriate 2018-11-20 17:04:07 +01:00
Lennart Poettering 042cad5737
Merge pull request #10753 from keszybz/pager-no-interrupt
Add mode in journalctl where ^C is handled by the pager
2018-11-14 20:09:39 +01:00
Zbigniew Jędrzejewski-Szmek 0221d68a13 basic/pager: convert the pager options to a flags argument
Pretty much everything uses just the first argument, and this doesn't make this
common pattern more complicated, but makes it simpler to pass multiple options.
2018-11-14 16:25:11 +01:00
Zbigniew Jędrzejewski-Szmek bd897e729a nspawn: add a hint to the message we emit when a child dies
From #10526:

$ sudo systemd-nspawn -i image
Spawning container image on /home/zbyszek/src/mkosi/image.
Press ^] three times within 1s to kill container.
Short read while reading cgroup mode.
2018-11-13 11:58:44 +01:00
Lennart Poettering 1d78fea2d6 nspawn: rework how we allocate/kill scopes
Fixes: #6347
2018-11-09 17:08:59 +01:00
Lennart Poettering df61bc5e4a nspawn: merge two variable declaration lines 2018-11-09 17:08:59 +01:00
Lennart Poettering 11d81e506e nspawn: simplify machine terminate bus call
We have the machine name anyway, let's use TerminateMachine() on
machined's Manager object directly with it. That way it's a single
method call only, instead of two, to terminate the machine.
2018-11-09 17:08:59 +01:00
Lennart Poettering e5a2d8b5b5 nspawn: make use of the new sd_bus_set_close_on_exit() call in nspawn 2018-11-09 17:08:59 +01:00
Yu Watanabe 57512c893e tree-wide: set WRITE_STRING_FILE_DISABLE_BUFFER flag when we write files under /proc or /sys 2018-11-06 21:24:03 +09:00
Lennart Poettering 6619ad889d nspawn: beef up netns checking a bit, for compat with old kernels
Fixes: #10544
2018-10-31 21:42:45 +03:00
Lennart Poettering e2d39e549f nspawn: add proper error message if setns() on network namespace fd fails
Addresses: https://github.com/systemd/systemd/pull/10589#issuecomment-434670595
2018-10-31 18:07:30 +01:00
Yu Watanabe 5a937ea2f6 sd-device: make sd_device_get_is_initialized() returns is_initialized by return value 2018-10-29 17:33:33 +09:00
Jiuyang liu a2f577fca0 add ephemeral to nspawn-settings. 2018-10-24 10:22:20 +02:00
Zbigniew Jędrzejewski-Szmek 369ca6dab1 systemd-nspawn: do not crash on /var/log/journal creation if not required
When running a read-only file system, we might not be able to create
/var/log/journal. Do not fail on this, unless actually requested by the
--link-journal options.

$ systemd-nspawn --image=image.squashfs ...
2018-10-22 15:07:08 +02:00
Yu Watanabe c65ac075ef nspawn: do not include '%m' in log message if errno is zero 2018-10-20 02:01:15 +09:00
Yu Watanabe b0b8c9a5a4
Merge pull request #10389 from poettering/nspawn-path-fix
nspawn $PATH execvpe() fix
2018-10-19 08:48:37 +09:00
Lennart Poettering 2ff48e981e tree-wide: introduce setsockopt_int() helper and make use of it everywhere
As suggested by @heftig:

6d5e65f645 (commitcomment-30938667)
2018-10-18 19:50:29 +02:00
Lennart Poettering c0815ca93d
Merge pull request #10407 from yuwata/netlink-slot
sd-netlink: introduce sd_netlink_slot object and relevant functions
2018-10-18 18:05:58 +02:00
Lennart Poettering b6b180b77b nspawn: use container $PATH (not host $PATH) when searching for PID 1 binaries to execute
Fixes: #10377
2018-10-18 16:40:12 +02:00
Yu Watanabe 8190a388a6 sd-netlink: make sd_netlink_slot take its description 2018-10-16 18:42:23 +09:00
Lennart Poettering 271f518f35 nspawn: TAKE_FD() is your friend 2018-10-15 19:45:37 +02:00
Lennart Poettering fbda85b078 tree-wide: use sockaddr_un_unlink() at two more places where appropriate 2018-10-15 19:44:34 +02:00
Lennart Poettering 6d5e65f645 tree-wide: add a single version of "static const int one = 1"
All over the place we define local variables for the various sockopts
that take a bool-like "int" value. Sometimes they are const, sometimes
static, sometimes both, sometimes neither.

Let's clean this up, introduce a common const variable "const_int_one"
(as well as one matching "const_int_zero") and use it everywhere, all
acorss the codebase.
2018-10-15 19:40:51 +02:00
Lennart Poettering 44ed5214ad tree-wide: use structured initialization for sockaddr_un 2018-10-15 19:35:00 +02:00
Yu Watanabe ee38400bba sd-netlink: introduce sd_netlink_slot 2018-10-15 18:10:04 +09:00
David Tardon f369f47c26 be consistent about sun_path length
Most places use the whole buffer for name, without leaving extra space
for the trailing NUL.
2018-10-12 12:38:49 +02:00
Lennart Poettering b37469d7d1 nspawn: add comments explaining the namespacing situation and the inner/outer children 2018-10-09 10:52:17 +02:00
Lennart Poettering 1099ceebce nspawn: optionally don't mount a tmpfs over /tmp (#10294)
nspawn: optionally, don't mount a tmpfs on /tmp

Fixes: #10260
2018-10-08 18:32:03 +02:00
Lennart Poettering ff6c6cc117 nspawn: when --quiet is passed, simply downgrade log messages to LOG_DEBUG (#10181)
With this change almost all log messages that are suppressed through
--quiet are not actually suppressed anymore, but simply downgraded to
LOG_DEBUG. Previously we did it this way for some log messages and fully
suppressed them for others. With this it's pretty much systematic.

Inspired by #10122.
2018-09-26 23:40:39 +02:00
Evgeny Vereshchagin 89f180201c nspawn: chown() the legacy hierarchy when it's used in a container
This is a follow-up to 720f0a2f3c.

Closes https://github.com/systemd/systemd/issues/10026
Closes https://github.com/systemd/systemd/issues/9563
2018-09-26 17:29:17 +02:00
Lennart Poettering ee8d493cbd
Merge pull request #10158 from keszybz/seccomp-log-tightening
Seccomp log tightening
2018-09-26 15:56:32 +02:00
Yu Watanabe 6c9c51e5e2 fs-util: make symlink_idempotent() optionally create relative link 2018-09-24 18:52:53 +03:00
Zbigniew Jędrzejewski-Szmek 7e86bd73a4 seccomp: tighten checking of seccomp filter creation
In seccomp code, the code is changed to propagate errors which are about
anything other than unknown/unimplemented syscalls. I *think* such errors
should not happen in normal usage, but so far we would summarilly ignore all
errors, so that part is uncertain. If it turns out that other errors occur and
should be ignored, this should be added later.

In nspawn, we would count the number of added filters, but didn't use this for
anything. Drop that part.

The comments suggested that seccomp_add_syscall_filter_item() returned negative
if the syscall is unknown, but this wasn't true: it returns 0.

The error at this point can only be if the syscall was known but couldn't be
added. If the error comes from our internal whitelist in nspawn, treat this as
error, because it means that our internal table is wrong. If the error comes
from user arguments, warn and ignore. (If some syscall is not known at current
architecture, it is still silently ignored.)
2018-09-24 17:21:09 +02:00
Zbigniew Jędrzejewski-Szmek b54f36c604 seccomp: reduce logging about failure to add syscall to seccomp
Our logs are full of:
Sep 19 09:22:10 autopkgtest systemd[690]: Failed to add rule for system call oldstat() / -10037, ignoring: Numerical argument out of domain
Sep 19 09:22:10 autopkgtest systemd[690]: Failed to add rule for system call get_thread_area() / -10076, ignoring: Numerical argument out of domain
Sep 19 09:22:10 autopkgtest systemd[690]: Failed to add rule for system call set_thread_area() / -10079, ignoring: Numerical argument out of domain
Sep 19 09:22:10 autopkgtest systemd[690]: Failed to add rule for system call oldfstat() / -10034, ignoring: Numerical argument out of domain
Sep 19 09:22:10 autopkgtest systemd[690]: Failed to add rule for system call oldolduname() / -10036, ignoring: Numerical argument out of domain
Sep 19 09:22:10 autopkgtest systemd[690]: Failed to add rule for system call oldlstat() / -10035, ignoring: Numerical argument out of domain
Sep 19 09:22:10 autopkgtest systemd[690]: Failed to add rule for system call waitpid() / -10073, ignoring: Numerical argument out of domain
...
This is pointless and makes debug logs hard to read. Let's keep the logs
in test code, but disable it in nspawn and pid1. This is done through a function
parameter because those functions operate recursively and it's not possible to
make the caller to log meaningfully.


There should be no functional change, except the skipped debug logs.
2018-09-24 17:21:09 +02:00
Yu Watanabe cf37f937ee nspawn: suppress one more log message when --quiet is passed
Fixes #10119.
2018-09-19 08:42:17 +02:00
Yu Watanabe 93bab28895 tree-wide: use typesafe_qsort() 2018-09-19 08:02:52 +09:00
Zbigniew Jędrzejewski-Szmek 6d7c403324 tests: use a helper function to parse environment and open logging
The advantages are that we save a few lines, and that we can override
logging using environment variables in more test executables.
2018-09-14 09:29:57 +02:00
afg 27b620b7db nspawn: use copy-static if systemd-resolved is up and image is writable 2018-09-12 20:48:21 +02:00
Franck Bui 03d0f4b58e nspawn: always use mode 555 for /sys
When a network namespace is needed, /sys is mounted as tmpfs (see commit
d8fc6a000f for details).

But in this case mode 755 was used as initial permissions for /sys whereas the
default mode for sysfs is 555.

In practice using 755 doesn't have any impact because /sys is mounted read-only
too but for consistency, let's use the correct mode.

Fixes: #10050
2018-09-11 00:34:00 +02:00
Yu Watanabe f55b0d3fd6 nspawn: replace udev_device by sd_device 2018-08-23 04:57:39 +09:00
Zbigniew Jędrzejewski-Szmek 7692fed98b
Merge pull request #9783 from poettering/get-user-creds-flags
beef up get_user_creds() a bit and other improvements
2018-08-21 10:09:33 +02:00
Lennart Poettering 8967f29169 nspawn: add two missing OOM checks 2018-08-20 15:58:11 +02:00
Lennart Poettering 8dfce114ab nspawn: make sure to create /dev/char/x:y symlinks in nspawn containers too
On the host udev creates these, but they are useful API, hence create
them in nspawn containers too.
2018-08-20 15:58:11 +02:00
Lennart Poettering 37ec0fdd34 tree-wide: add clickable man page link to all --help texts
This is a bit like the info link in most of GNU's --help texts, but we
don't do info but man pages, and we make them properly clickable on
terminal supporting that, because awesome.

I think it's generally advisable to link up our (brief) --help texts and
our (more comprehensive) man pages a bit, so this should be an easy and
straight-forward way to do it.
2018-08-20 11:33:04 +02:00
Yu Watanabe 4ae25393f3 tree-wide: shorten error logging a bit
Continuation of 4027f96aa0.
2018-08-07 10:14:33 +09:00
Luke Shumaker 677a72cd3e nspawn: mount_sysfs(): Unconditionally mkdir /sys/fs/cgroup
Currently, mount_sysfs() only creates /sys/fs/cgroup if cg_ns_supported().
The comment explains that we need to "Create mountpoint for
cgroups. Otherwise we are not allowed since we remount /sys read-only.";
that is: that we need to do it now, rather than later.  However, the
comment doesn't do anything to explain why we only need to do this if
cg_ns_supported(); shouldn't we _always_ need to do it?

The answer is that if !use_cgns, then this was already done by the outer
child, so mount_sysfs() only needs to do it if use_cgns.  Now,
mount_sysfs() doesn't know whether use_cgns, but !cg_ns_supported() implies
!use_cgns, so we can optimize" the case where we _know_ !use_cgns, and deal
with a no-op mkdir_p() in the false-positive where cgns_supported() but
!use_cgns.

But is it really much of an optimization?  We're potentially spending an
access(2) (cg_ns_supported() could be cached from a previous call) to
potentially save an lstat(2) and mkdir(2); and all of them are on virtual
fileystems, so they should all be pretty cheap.

So, simplify and drop the conditional.  It's a dubious optimization that
requires more text to explain than it's worth.
2018-07-20 12:12:03 -04:00
Luke Shumaker 93dbdf6cb1 nspawn: sync_cgroup(): Rename arg_uid_shift -> uid_shift
Naming it arg_uid_shift is confusing because of the global arg_uid_shift in
nspawn.c
2018-07-20 12:12:02 -04:00
Luke Shumaker 0402948206 nspawn: Move cgroup mount stuff from nspawn-mount.c to nspawn-cgroup.c 2018-07-20 12:12:02 -04:00
Luke Shumaker 2fa017f169 nspawn: Simplify tmpfs_patch_options() usage, and trickle that up
One of the things that tmpfs_patch_options does is take an (optional) UID,
and insert "uid=${UID},gid=${UID}" into the options string.  So we need a
uid_t argument, and a way of telling if we should use it.  Fortunately,
that is built in to the uid_t type by having UID_INVALID as a possible
value.

So this is really a feature that requires one argument.  Yet, it is somehow
taking 4!  That is absurd.  Simplify it to only take one argument, and have
that trickle all the way up to mount_all()'s usage.

Now, in may of the uses, the argument becomes

    uid_shift == 0 ? UID_INVALID : uid_shift

because it used to treat uid_shift=0 as invalid unless the patch_ids flag
was also set.  This keeps the behavior the same.  Note that in all cases
where it is invoked, if !use_userns (sometimes called !userns), then
uid_shift is 0; we don't have to add any checks for that.

That said, I'm pretty sure that "uid=0" and not setting "uid=" are the
same, but Christian Brauner seemed to not think so when implementing the
cgns support.  https://github.com/systemd/systemd/pull/3589
2018-07-20 12:12:02 -04:00
Luke Shumaker 9c0fad5fb5 nspawn: Simplify mkdir_userns() usage, and trickle that up
One of the things that mkdir_userns{,_p}() does is take an (optional) UID,
and chown the directory to that.  So we need a uid_t argument, and a way of
telling if we should use that uid_t argument.  Fortunately, that is built
in to the uid_t type by having UID_INVALID as a possible value.

However, currently mkdir_userns() also takes a MountSettingsMask and checks
a couple of bits in it to decide if it should perform the chown.

Drop the mask argument, and instead have the caller pass UID_INVALID if it
shouldn't chown.
2018-07-20 12:12:02 -04:00
Lennart Poettering a7e2e50d35 summary: update nspawn description string a bit
nspawn as it is now is a generally useful tool, hence let's drop the
comments about it being useful for debug and so on only.

The new wording just makes the first sentence of the main page also the
summary.
2018-06-28 11:55:44 +09:00
Zbigniew Jędrzejewski-Szmek 0cd41d4dff Drop my copyright headers
perl -i -0pe 's/\s*Copyright © .... Zbigniew Jędrzejewski.*?\n/\n/gms' man/*xml
git grep -e 'Copyright.*Jędrzejewski' -l | xargs perl -i -0pe 's/(#\n)?# +Copyright © [0-9, -]+ Zbigniew Jędrzejewski.*?\n//gms'
git grep -e 'Copyright.*Jędrzejewski' -l | xargs perl -i -0pe 's/\s*\/\*\*\*\s+Copyright © [0-9, -]+ Zbigniew Jędrzejewski[^\n]*?\s*\*\*\*\/\s*/\n\n/gms'
git grep -e 'Copyright.*Jędrzejewski' -l | xargs perl -i -0pe 's/\s+Copyright © [0-9, -]+ Zbigniew Jędrzejewski[^\n]*//gms'
2018-06-14 13:03:20 +02:00
Lennart Poettering 96b2fb93c5 tree-wide: beautify remaining copyright statements
Let's unify an beautify our remaining copyright statements, with a
unicode ©. This means our copyright statements are now always formatted
the same way. Yay.
2018-06-14 10:20:21 +02:00
Lennart Poettering 0c69794138 tree-wide: remove Lennart's copyright lines
These lines are generally out-of-date, incomplete and unnecessary. With
SPDX and git repository much more accurate and fine grained information
about licensing and authorship is available, hence let's drop the
per-file copyright notice. Of course, removing copyright lines of others
is problematic, hence this commit only removes my own lines and leaves
all others untouched. It might be nicer if sooner or later those could
go away too, making git the only and accurate source of authorship
information.
2018-06-14 10:20:20 +02:00
Lennart Poettering 818bf54632 tree-wide: drop 'This file is part of systemd' blurb
This part of the copyright blurb stems from the GPL use recommendations:

https://www.gnu.org/licenses/gpl-howto.en.html

The concept appears to originate in times where version control was per
file, instead of per tree, and was a way to glue the files together.
Ultimately, we nowadays don't live in that world anymore, and this
information is entirely useless anyway, as people are very welcome to
copy these files into any projects they like, and they shouldn't have to
change bits that are part of our copyright header for that.

hence, let's just get rid of this old cruft, and shorten our codebase a
bit.
2018-06-14 10:20:20 +02:00
Lennart Poettering df1fac6dea nspawn: free global variables before exiting
This doesn't really matter much, but is prettier for valgrind
2018-06-13 17:51:40 +02:00
Lennart Poettering 2f14e52f08 nspawn: drop unused parameter from one call 2018-06-13 17:42:16 +02:00
Lennart Poettering ef31828d06 tree-wide: unify how we define bit mak enums
Let's always write "1 << 0", "1 << 1" and so on, except where we need
more than 31 flag bits, where we write "UINT64(1) << 0", and so on to force
64bit values.
2018-06-12 21:44:00 +02:00
Lennart Poettering b8b846d7b4 tree-wide: fix a number of log calls that use %m but have no errno set
This is mostly fall-out from d1a1f0aaf0,
however some cases are older bugs.

There might be more issues lurking, this was a simple grep for "%m"
across the tree, with all lines removed that mention "errno" at all.
2018-06-07 15:29:17 +02:00
Lennart Poettering 669fc4e5c5 tree-wide: some O_NDELAY → O_NONBLOCK fixes
Somehow the coccinelle script misses these, hence fix them manually.
2018-05-31 12:04:39 +02:00
Lennart Poettering d32d473d66
Merge pull request #9103 from keszybz/more-tables-tests
More tables tests
2018-05-28 14:24:19 +02:00
Zbigniew Jędrzejewski-Szmek 83e803a9ef nspawn: reset umask early
Fixes #8911.
2018-05-28 11:01:43 +02:00
Zbigniew Jędrzejewski-Szmek 667c1baff5 nspawn: remove some vertical whitespace
Sometimes an empty line is good for readability, but here I think
they all can be removed without any loss.
2018-05-28 11:01:43 +02:00
Zbigniew Jędrzejewski-Szmek 8514095fe6 test-nspawn-tables: add another "tables" test 2018-05-28 10:40:00 +02:00
Zbigniew Jędrzejewski-Szmek 97d9061563 meson: use a convenience static library for nspawn core
This makes it easier to link the nspawn implementation to the tests.
Right now this just means that nspawn-patch-uid.c is not compiled
twice, which is nice, but results in test-patch-uid being slightly bigger,
which is not nice. But in general, we should use convenience libs to
compile everything just once, as far as possible. Otherwise, once we
start compiling a few files here twice, and a few file there thrice, we
soon end up in a state where we are doing hundreds of extra compilations.
So let's do the "right" thing, even if is might not be more efficient.
2018-05-28 10:40:00 +02:00
Lennart Poettering 3a6ce860ac machine-image: rework error handling
Let's rework error handling a bit in image_find() and friends: when we
can't find an image, return -ENOENT rather than 0. That's better as
before we violated the usual rule in our codebase that return parameters
are initialized when the return value is >= 0 and otherwise not touched.

This also makes enumeration and validation a bit more strict: we'll only
accept ".raw" as suffix for regular files, and filter out this suffix
handling on directories/subvolumes, where it makes no sense.
2018-05-24 17:01:57 +02:00
Lennart Poettering 5ef46e5f65 machine-image: introduce two different classes of images
This distuingishes two different classes of images, one for the purpose
of npsawn-like containers, i.e. "machines", and one for portable
services.

This distinction is mostly about search paths. We look for machine
images in /var/lib/machines and for portable images in
/var/lib/portables.
2018-05-24 17:01:57 +02:00
Lennart Poettering d58ad743f9 os-util: add helpers for finding /etc/os-release
Place this new helpers in a new source file os-util.[ch], and move the
existing and related call path_is_os_tree() to it as well.
2018-05-24 17:01:57 +02:00
Lennart Poettering 03bcb6d408 dissect: optionally, validate that the image we dissect is a valid OS image
We already do this kind of validation in nspawn when we operate on a
plain directory, let's also do this on raw images under the same
condition: that we are about too boot the image. Also, do this when we
are about to read OS metadata from it.
2018-05-24 17:01:57 +02:00
Zbigniew Jędrzejewski-Szmek 17c1b9a93f
Merge pull request #9024 from poettering/nspawn-attrs-more
make even more nspawn concepts configurable
2018-05-24 16:27:27 +02:00
Zbigniew Jędrzejewski-Szmek 7cd92e2e9d
Merge pull request #9068 from poettering/nspawn-pty-deadlock
nspawn logging deadlock fix
2018-05-24 16:25:22 +02:00
Zbigniew Jędrzejewski-Szmek 14d0afb94d
Merge pull request #9065 from poettering/fixup-tab-double-newline
tree-wide: fix some TABs and double newlines
2018-05-22 17:14:48 +02:00
Lennart Poettering 17cac366ae nspawn: make sure our container PID 1 keeps logging to the original stderr as long as possible
If we log to the pty that is configured as stdin/stdout/stderr of the
container too early we risk filling it up in full before we start
processing the pty from the parent process, resulting in deadlocks.
Let's hence keep a copy of the original tty we were started on before
setting up stdin/stdout/stderr, so that we can log to it, and keep using
it as long as we can.

Since the kernel's pty internal buffer is pretty small this actually
triggered deadlocks when we debug logged at lot from nspawn's child
processes, see: https://github.com/systemd/systemd/pull/9024#issuecomment-390403674

With this change we won't use the pty at all, only the actual payload we
start will, and hence we won't deadlock on it, ever.
2018-05-22 16:52:50 +02:00
Lennart Poettering 8ca082b49a nspawn: make use of log_set_open_when_needed() in nspawn too
Let's make use of log_set_open_when_needed() in nspawn too, i.e. at the
point where we close logging because we are about to rearrange fds,
let's automatically reopen the logging fds when we need them, the same
way as we do that in the service manager. This makes things simpler and
more robust.
2018-05-22 16:51:28 +02:00
Lennart Poettering f728ab1724 nspawn: let's rename _FORCE_ENUM_WIDTH → _SETTING_FORCE_ENUM_WIDTH
Just some preparation in case we need a similar hack in another enum one
day.
2018-05-22 16:21:26 +02:00
Lennart Poettering 1688841f46 nspawn: similar to the previous patches, also make /etc/localtime handling more configurable
Fixes: #9009
2018-05-22 16:21:26 +02:00
Lennart Poettering 63d1c29ffa nspawn: complain if people still use --share-system 2018-05-22 16:20:08 +02:00
Lennart Poettering 4e1d6aa983 nspawn: make --link-journal= configurable through .nspawn files, too 2018-05-22 16:20:08 +02:00
Lennart Poettering b8ea7a6e12 nspawn: add a bit of debug logging to resolved_listening() 2018-05-22 16:19:26 +02:00
Lennart Poettering 09d423e921 nspawn: add greater control over how /etc/resolv.conf is handled
Fixes: #8014 #1781
2018-05-22 16:19:26 +02:00
Lennart Poettering 8904ab86b0
Merge pull request #9062 from poettering/parse-conf-macro
add new CONFIG_PARSER_PROTOTYPE() macro
2018-05-22 16:14:49 +02:00
Lennart Poettering a5201ed6ce tree-wide: fix a couple of TABs 2018-05-22 16:13:45 +02:00
Arnaud Rebillout c9fe05e07d nspawn: support pivot-root option during directory validation
Signed-off-by: Arnaud Rebillout <arnaud.rebillout@collabora.com>
2018-05-22 14:42:10 +02:00
Lennart Poettering c0d7a4f0cd
Merge pull request #9061 from poettering/dump-string-table
add new DUMP_STRING_TABLE() macro and make use of it everywhere
2018-05-22 14:28:38 +02:00
Lennart Poettering a210692525 tree-wide: port over all code to the new CONFIG_PARSER_PROTOTYPE() macro
This makes most header files easier to look at. Also Emacs gets really
slow when browsing through large sections of overly long prototypes,
which is much improved by this macro.

We should probably not do something similar with too many other cases,
as macros like this might help readability for some, but make it worse
for others. But I think given the complexity of this specific prototype
and how often we use it, it's worth doing.
2018-05-22 13:18:44 +02:00
Lennart Poettering 5c828e66b5 tree-wide: port various bits of the tree over to the new DUMP_STRING_TABLE() macro 2018-05-22 13:14:18 +02:00
Zbigniew Jędrzejewski-Szmek b49c6ca089 systemd-nspawn: make SettingsMask 64 bit wide
The use of UINT64_C() in the SettingsMask enum definition is misleading:
it does not mean that individual fields have this width. E.g., with
enum {
   FOO = UINT64_C(1)
}
sizeof(FOO) gives 4. It only means that the shift is done properly. So
1 << 35 is undefined, but UINT64_C(1) << 35 is the expected 64 bit
constant. Thus, the use UINT64_C() is useful, because we know that the shifts
are done properly, no matter what the value of _RLIMIT_MAX is, but when those
fields are used in expressions, we don't know what size they will be
(probably 4). Let's add a define which "hides" the enum definition behind a
define which gives the same value but is actually 64 bit. I think this is a
nicer solution than requiring all users to cast SETTING_RLIMIT_FIRST before
use.

Fixes #9035.
2018-05-22 10:51:49 +02:00
Lennart Poettering 919f5ae0c7 nspawn: voidify more things 2018-05-17 20:48:55 +02:00
Lennart Poettering 5d9614077d nspawn: split out merging of settings object
Let's separate the loading of the settings object and the merging into
our arg_xyz fields into two.

This will become particularly useful when we eventually are able to load
settings from OCI runtime files in addition to .nspawn files.
2018-05-17 20:48:55 +02:00
Lennart Poettering d107bb7d63 nspawn: add a new --cpu-affinity= switch
Similar as the other options added before, this is primarily useful to
provide comprehensive OCI runtime compatbility, but might be useful
otherwise, too.
2018-05-17 20:48:54 +02:00
Lennart Poettering 50ebcf6cb7 nspawn: show --help text in a pager
The text is long enough now, and we do auto-paging for systemctl
already, hence let's do it here too.
2018-05-17 20:48:13 +02:00
Lennart Poettering 81f345dfed nspawn: add a new --oom-score-adjust= command line switch
This is primarily useful in order to provide comprehensive OCI runtime
compatibility with nspawn, but might have uses outside of it.
2018-05-17 20:48:12 +02:00
Lennart Poettering c818eef1cd nspawn: properly handle and log about hostname setting errors 2018-05-17 20:47:21 +02:00
Lennart Poettering 66edd96310 nspawn: add a new --no-new-privileges= cmdline option to nspawn
This simply controls the PR_SET_NO_NEW_PRIVS flag for the container.
This too is primarily relevant to provide OCI runtime compaitiblity, but
might have other uses too, in particular as it nicely complements the
existing --capability= and --drop-capability= flags.
2018-05-17 20:47:20 +02:00