Commit Graph

777 Commits

Author SHA1 Message Date
Lennart Poettering 19ac32cdd6 docs: import discoverable partitions spec
This was previously available here:

https://www.freedesktop.org/wiki/Specifications/DiscoverablePartitionsSpec/

Let's pull it into our repository.
2019-12-23 14:44:33 +01:00
Lennart Poettering d4dffb8533 dissect: introduce new recognizable partition types for /var and /var/tmp
This has been requested many times before. Let's add it finally.

GPT auto-discovery for /var is a bit more complex than for other
partition types: the other partitions can to some degree be shared
between multiple OS installations on the same disk (think: swap, /home,
/srv). However, /var is inherently something bound to an installation,
i.e. specific to its identity, or actually *is* its identity, and hence
something that cannot be shared.

To deal with this this new code is particularly careful when it comes to
/var: it will not mount things blindly, but insist that the UUID of the
partition matches a hashed version of the machine-id of the
installation, so that each installation has a very specific /var
associated with it, and would never use any other. (We actually use
HMAC-SHA256 on the GPT partition type for /var, keyed by the machine-id,
since machine-id is something we want to keep somewhat private).

Setting the right UUID for installations takes extra care. To make
things a bit simpler to set up, we avoid this safety check for nspawn
and RootImage= in unit files, under the assumption that such container
and service images unlikely will have multiple installations on them.
The check is hence only required when booting full machines, i.e. in
in systemd-gpt-auto-generator.

To help with putting together images for full machines, PR #14368
introduces a repartition tool that can automatically fill in correctly
calculated UUIDs on first boot if images have the var partition UUID
initialized to all zeroes. With that in place systems can be put
together in a way that on first boot the machine ID is determined and
the partition table automatically adjusted to have the /var partition
with the right UUID.
2019-12-23 14:43:59 +01:00
Anita Zhang e5f10cafe0 core: create inaccessible nodes for users when making runtime dirs
To support ProtectHome=y in a user namespace (which mounts the inaccessible
nodes), the nodes need to be accessible by the user. Create these paths and
devices in the user runtime directory so they can be used later if needed.
2019-12-18 11:09:30 -08:00
Lennart Poettering a724732208
Merge pull request #14269 from DaanDeMeyer/enable-mounts-on-root
nspawn: Enable specifying root as the mount target directory.
2019-12-13 00:05:38 +01:00
Daan De Meyer 5530dc87f2 nspawn: Only bind-mount directory when necessary. 2019-12-12 20:15:10 +01:00
Daan De Meyer e091a5dfd1 nspawn-mount: Remove unused parameters 2019-12-12 20:15:10 +01:00
Daan De Meyer 5f0a6347ac nspawn: Enable specifying root as the mount target directory.
Fixes #3847.
2019-12-12 20:15:03 +01:00
Shengjing Zhu 679ecd3616 nspawn: allow combination of private-network and network-namespace-path
Fixes: #14289
2019-12-12 19:26:32 +01:00
Lennart Poettering 5905d7cf5b tree-wide: use SD_ID128_STRING_MAX where appropriate 2019-12-10 11:56:18 +01:00
Lennart Poettering b5ea030d65 id128: introduce ID128_UUID_STRING_MAX for sizing UUID buffers 2019-12-10 11:56:18 +01:00
Lennart Poettering e08f94acf5 loop-util: accept loopback flags when creating loopback device
This way callers can choose if they want partition scanning or not.
2019-12-02 10:05:09 +01:00
Lennart Poettering 37a92352d6 nspawn: highlight description string in --help text
We do so in most tools now, do so here, too.
2019-11-28 11:41:24 +01:00
Zbigniew Jędrzejewski-Szmek 8a99bd0c46 nspawn: dump capability list with --capabilities=help 2019-11-22 10:15:46 +01:00
Torsten Hilbrich 7be830c6e8 nspawn: Allow Capability= to overrule private network setting
The commit:

a3fc6b55ac nspawn: mask out CAP_NET_ADMIN again if settings file turns off private networking

turned off the CAP_NET_ADMIN capability whenever no private networking
feature was enabled. This broke configurations where the CAP_NET_ADMIN
capability was explicitly requested in the configuration.

Changing the order of evalution here to allow the Capability= setting
to overrule this implicit setting:

Order of evaluation:

1. if no private network setting is enabled, CAP_NET_ADMIN is removed
2. if a private network setting is enabled, CAP_NET_ADMIN is added
3. the settings of Capability= are added
4. the settings of DropCapability= are removed

This allows the fix for #11755 to be retained and to still allow the
admin to specify CAP_NET_ADMIN as additional capability.

Fixes: a3fc6b55ac
Fixes: #13995
2019-11-15 10:13:51 +01:00
Zbigniew Jędrzejewski-Szmek d5fc5b2f8d nspawn: do not emit any warning when $UNIFIED_CGROUP_HIERARCHY is used
Initially I thought this is a good idea, but when reviewing a different PR
(https://github.com/systemd/systemd/pull/13862#discussion_r340604313) I changed
my mind about this. At some point we probably should start warning about the
old option name, and yet later remove it. But it'll make it easier for people
to transition to the new option name if there's a period of support for both
names without any fuss. There's nothing particularly wrong about the old name,
and there is no support cost.

Fixes #13919 (by avoiding the issue completely).
2019-11-13 12:21:18 +01:00
Yu Watanabe 1405cb653a tree-wide: drop stdio.h when stdio-util.h is included 2019-11-04 00:30:32 +09:00
Yu Watanabe 021cdf8330 tree-wide: drop signal.h when signal-util.h is included 2019-11-04 00:30:32 +09:00
Yu Watanabe adb29d588e tree-wide: drop blkid.h when blkid-util.h is included 2019-11-04 00:30:32 +09:00
Yu Watanabe 927d2351d7 tree-wide: drop pwd.h and grp.h when user-util.h is included 2019-11-04 00:30:32 +09:00
Yu Watanabe df26692947 tree-wide: drop sched.h when missing_sched.h is included 2019-11-04 00:30:32 +09:00
Yu Watanabe 455fa9610c tree-wide: drop string.h when string-util.h or friends are included 2019-11-04 00:30:32 +09:00
Justin Trudell 0ccdaa79ca nspawn: respect quiet on capabilities warning 2019-11-03 22:05:48 +09:00
Lennart Poettering 43c3fb4680 nspawn: mangle slice name
It's user-facing, parsed from the command line and we typically mangle
in these cases, let's do so here too. (In particular as the identical
switch for systemd-run already does it.)
2019-11-03 21:32:56 +09:00
Yu Watanabe f5947a5e92 tree-wide: drop missing.h 2019-10-31 17:57:03 +09:00
Lennart Poettering a93503e86f
Merge pull request #13866 from keszybz/nspawn-restarts
Make 'machinectl reboot' functional
2019-10-30 10:53:28 +01:00
Zbigniew Jędrzejewski-Szmek 0bb0a9faa7 nspawn: when stopping the machine, just deregister the machine
We already shut the machine down ourselves (and pid1 will also do
cleanup for us after we exit if anything was left behind). No need for
systemd-machined to try to stop the unit too.

(This calls the new machined method. If we are running against an older
machined, we will not deregister the machine. If we are simply exiting,
machined should notice that the unit is gone on its own. If we are restarting,
we will fail to register the machine after restart and fail. But this case
was already broken, because machined would create a stop job, breaking the
restart. So not doing anything with old machined should not make anything
more broken than it already is.)

Fixes #13766.
2019-10-29 10:54:45 +01:00
Zbigniew Jędrzejewski-Szmek a5648b8094 basic/fs-util: change CHASE_OPEN flag into a separate output parameter
chase_symlinks() would return negative on error, and either a non-negative status
or a non-negative fd when CHASE_OPEN was given. This made the interface quite
complicated, because dependning on the flags used, we would get two different
"types" of return object. Coverity was always confused by this, and flagged
every use of chase_symlinks() without CHASE_OPEN as a resource leak (because it
would this that an fd is returned). This patch uses a saparate output parameter,
so there is no confusion.

(I think it is OK to have functions which return either an error or an fd. It's
only returning *either* an fd or a non-fd that is confusing.)
2019-10-24 22:44:24 +09:00
Zbigniew Jędrzejewski-Szmek dce66ffedb nspawn: fix handling of --console=help
We shouldn't continue to run the container after printing help.
2019-10-23 09:35:36 +02:00
Zbigniew Jędrzejewski-Szmek 86e94d95d0
Merge pull request #13246 from keszybz/add-SystemdOptions-efi-variable
Add efi variable to augment /proc/cmdline
2019-10-03 12:19:44 +02:00
Zbigniew Jędrzejewski-Szmek c78c095b1e nspawn: rename UNIFIED_CGROUP_HIERARCHY to SYSTEMD_NSPAWN_UNIFIED_HIERARCHY
We should never have used an unprefixed environment variable name.
All other systemd-nspawn variables have the "SYSTEMD_NSPAWN_" prefix,
and all other systemd variables have the "SYSTEMD_" prefix.

The new variable name takes precedence, but we fall back to checking the
old one. If only the old one is found, a warning is emitted.

In addition, SYSTEMD_NSPAWN_UNIFIED_HIERARCHY="" is accepted as an override
to avoid looking for the old variable name.

We have a variable with the same name ($UNIFIED_CGROUP_HIERARCHY) in tests,
which governs both systemd-nspawn and qemu behaviour. It is not renamed.
2019-10-01 10:21:13 -07:00
Zbigniew Jędrzejewski-Szmek 490486842b nspawn: consistenly fail if parsing the environment fails
We would parse the environment twice (to re-apply settings after reading
config from disk), but we would not check the return code first time.
This means that for some settings we would ignore invalid values, while
for others, we'd fail at some point.

Let's just consistently fail. Those environment variables define important
aspects of behaviour, and it is better for the user if we ignore invalid
values. (Unknown settings are still ignored, so forward compatibility is
maintained.)
2019-10-01 10:21:13 -07:00
Zbigniew Jędrzejewski-Szmek 75b0d8b89d nspawn: default to unified hierarchy if --as-pid2 is used
See comment added in the patch.

Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1756143.
2019-10-01 10:21:13 -07:00
Zbigniew Jędrzejewski-Szmek d4d99bc6e4 basic/cgroup-util: let cgroup_unified_flush() return the detected hierarchy
This avoid the use of the global variable.

Also rename cgroup_unified_update() to cgroup_unified_cached() and
cgroup_unified_flush() to cgroup_unified() to better reflect their new roles.
2019-09-16 18:06:20 +02:00
Lennart Poettering 6f83d3d149 nspawn: when operating on the host image, let's move the root to a different directory first, via a bind mount 2019-07-29 09:57:04 +02:00
Lennart Poettering 6992459c12 nspawn: always take exclusive locks of ephemeral OS tree copies 2019-07-29 09:52:02 +02:00
Lennart Poettering cd6e3914f7 nspawn: don't look for .nspawn file above the top-level directory, it makes no sense 2019-07-29 09:52:02 +02:00
Lennart Poettering b35ca61ae2 nspawn: allow --volatile=yes instances of -D / 2019-07-29 09:52:02 +02:00
Lennart Poettering 76c887fdaa
Merge pull request #13092 from keszybz/coverity-fixes
Coverity fixes
2019-07-17 14:18:49 +02:00
Zbigniew Jędrzejewski-Szmek 622ecfa869 nspawn: fix memleak in argument parsing
Coverity CID#1402297.
2019-07-17 11:35:04 +02:00
Lennart Poettering 7bf011e355 nspawn: make use of SIGINT handling when copying files
Fixes: #13079
2019-07-17 11:14:11 +02:00
Lennart Poettering b910cc72c0 tree-wide: get rid of strappend()
It's a special case of strjoin(), so no need to keep both. In particular
as typing strjoin() is even shoert than strappend().
2019-07-12 14:31:12 +09:00
Zbigniew Jędrzejewski-Szmek cd132992bb nspawn: fix abort when we cannot execve
If execve failed, we would die in safe_close(), because master was already
closed by fdset_close_others() on line 3123. IIUC, we don't need to keep the
fd open after sending it, so let's just close it immediately.

Reproducer:
sudo build/systemd-nspawn -M rawhide fooooooo

Fixup for 3acc84ebd9.
2019-07-09 01:24:20 +02:00
Yu Watanabe 2d9b74ba87 tree-wide: replace strjoin() with path_join() 2019-06-24 23:59:38 +09:00
Lennart Poettering cee97d5768
Merge pull request #12836 from yuwata/tree-wide-replace-strjoin
tree-wide: replace strjoin() with path_join()
2019-06-22 20:02:46 +02:00
Lennart Poettering c6134d3e2f path-util: get rid of prefix_root()
prefix_root() is equivalent to path_join() in almost all ways, hence
let's remove it.

There are subtle differences though: prefix_root() will try shorten
multiple "/" before and after the prefix. path_join() doesn't do that.
This means prefix_root() might return a string shorter than both its
inputs combined, while path_join() never does that. I like the
path_join() semantics better, hence I think dropping prefix_root() is
totally OK. In the end the strings generated by both functon should
always be identical in terms of path_equal() if not streq().

This leaves prefix_roota() in place. Ideally we'd have path_joina(), but
I don't think we can reasonably implement that as a macro. or maybe we
can? (if so, sounds like something for a later PR)

Also add in a few missing OOM checks
2019-06-21 08:42:55 +09:00
Anita Zhang f66ad46066 nspawn: don't hard fail when setting capabilities
The OCI changes in #9762 broke a use case in which we use nspawn from
inside a container that has dropped capabilities from the bounding set
that nspawn expected to retain. In an attempt to keep OCI compliance
and support our use case, I made hard failing on setting capabilities
not in the bounding set optional (hard fail if using OCI and log only
if using nspawn cmdline).

Fixes #12539
2019-06-20 21:46:36 +02:00
Yu Watanabe 657ee2d82b tree-wide: replace strjoin() with path_join() 2019-06-21 03:26:16 +09:00
Franck Bui dc98caea32 nspawn: make use of openpt_allocate() 2019-06-18 09:27:06 +02:00
Franck Bui 3acc84ebd9 nspawn: allocate the pty used for /dev/console within the container
The console tty is now allocated from within the container so it's not
necessary anymore to allocate it from the host and bind mount the pty slave
into the container. The pty master is sent to the host.

/dev/console is now a symlink pointing to the pty slave.

This might also be less confusing for applications running inside the container
and the overall result looks cleaner (we don't need to apply manually the
passed selinux context, if any, to the allocated pty for instance).
2019-06-18 08:17:34 +02:00
Franck Bui ba72801d66 nspawn: use correct error variable when logging errors returned by send_one_fd() 2019-06-18 07:54:51 +02:00