Commit graph

1059 commits

Author SHA1 Message Date
Zbigniew Jędrzejewski-Szmek 55aacd502b
Merge pull request #15891 from bluca/host_os_release
Container Interface: expose the host's os-release metadata to nspawn and portable guests
2020-07-08 23:52:13 +02:00
Lennart Poettering 9b71e4ab90 shared: actually move all BusLocator related calls to bus-locator.c 2020-06-30 15:09:19 +02:00
Luca Boccassi c2923fdcd7 dissect/nspawn: add support for dm-verity root hash signature
Since cryptsetup 2.3.0 a new API to verify dm-verity volumes by a
pkcs7 signature, with the public key in the kernel keyring,
is available. Use it if libcryptsetup supports it.
2020-06-25 08:45:21 +01:00
Lennart Poettering 6b000af4f2 tree-wide: avoid some loaded terms
https://tools.ietf.org/html/draft-knodel-terminology-02
https://lwn.net/Articles/823224/

This gets rid of most but not occasions of these loaded terms:

1. scsi_id and friends are something that is supposed to be removed from
   our tree (see #7594)

2. The test suite defines an API used by the ubuntu CI. We can remove
   this too later, but this needs to be done in sync with the ubuntu CI.

3. In some cases the terms are part of APIs we call or where we expose
   concepts the kernel names the way it names them. (In particular all
   remaining uses of the word "slave" in our codebase are like this,
   it's used by the POSIX PTY layer, by the network subsystem, the mount
   API and the block device subsystem). Getting rid of the term in these
   contexts would mean doing some major fixes of the kernel ABI first.

Regarding the replacements: when whitelist/blacklist is used as noun we
replace with with allow list/deny list, and when used as verb with
allow-list/deny-list.
2020-06-25 09:00:19 +02:00
Luca Boccassi e1bb4b0d1d nspawn: implement container host os-release interface 2020-06-23 12:58:21 +01:00
Luca Boccassi b3b1a08a56 nspawn: use mkdir_p_safe instead of homegrown version 2020-06-23 12:57:05 +01:00
Luca Boccassi 0389f4fa81 core: add RootHash and RootVerity service parameters
Allow to explicitly pass root hash (explicitly or as a file) and verity
device/file as unit options. Take precedence over implicit checks.
2020-06-23 10:50:09 +02:00
Lennart Poettering 6fe01ced0e nspawn: mkdir selinux mount point once, but not twice
Since #15533 we didn't create the mount point for selinuxfs anymore.

Before it we created it twice because we mount selinuxfs twice: once the
superblock, and once we remount its bind mound read-only. The second
mkdir would mean we'd chown() the host version of selinuxfs (since
there's only one selinuxfs superblock kernel-wide).

The right time to create mount point point is once: before we mount the
selinuxfs. But not a second time for the remount.

Fixes: #16032
2020-06-23 10:17:36 +02:00
Zbigniew Jędrzejewski-Szmek 9664be199a
Merge pull request #16118 from poettering/inaccessible-fixlets
move $XDG_RUNTIME_DIR/inaccessible/ to $XDG_RUNTIME_DIR/systemd/inaccessible
2020-06-10 10:23:13 +02:00
Lennart Poettering 48b747fa03 inaccessible: move inaccessible file nodes to /systemd/ subdir in runtime dir always
Let's make sure $XDG_RUNTIME_DIR for the user instance and /run for the
system instance is always organized the same way: the "inaccessible"
device nodes should be placed in a subdir of either called "systemd" and
a subdir of that called "inaccessible".

This way we can emphasize the common behaviour, and only differ where
really necessary.

Follow-up for #13823
2020-06-09 16:23:56 +02:00
Luca Boccassi e7cbe5cb9e dissect: support single-filesystem verity images with external verity hash
dm-verity support in dissect-image at the moment is restricted to GPT
volumes.
If the image a single-filesystem type without a partition table (eg: squashfs)
and a roothash/verity file are passed, set the verity flag and mark as
read-only.
2020-06-09 12:19:21 +01:00
Lennart Poettering 4f9ff96a55 conf-parser: return mtime in config_parse() and friends
This is a follow-up for 9f83091e3c.

Instead of reading the mtime off the configuration files after reading,
let's do so before reading, but with the fd we read the data from. This
is not only cleaner (as it allows us to save one stat()), but also has
the benefit that we'll detect changes that happen while we read the
files.

This also reworks unit file drop-ins to use the common code for
determining drop-in mtime, instead of reading system clock for that.
2020-06-02 19:32:20 +02:00
Tobias Hunger 129635333d repart: Add UUID option to config files
Add a option to provide a UUID for the partition that will get
created and document that.
2020-05-25 15:48:59 +02:00
Topi Miettinen 7d85383edb tree-wide: add size limits for tmpfs mounts
Limit size of various tmpfs mounts to 10% of RAM, except volatile root and /var
to 25%. Another exception is made for /dev (also /devs for PrivateDevices) and
/sys/fs/cgroup since no (or very few) regular files are expected to be used.

In addition, since directories, symbolic links, device specials and xattrs are
not counted towards the size= limit, number of inodes is also limited
correspondingly: 4MB size translates to 1k of inodes (assuming 4k each), 10% of
RAM (using 16GB of RAM as baseline) translates to 400k and 25% to 1M inodes.

Because nr_inodes option can't use ratios like size option, there's an
unfortunate side effect that with small memory systems the limit may be on the
too large side. Also, on an extremely small device with only 256MB of RAM, 10%
of RAM for /run may not be enough for re-exec of PID1 because 16MB of free
space is required.
2020-05-13 00:37:18 +02:00
Zbigniew Jędrzejewski-Szmek 8acb7780df
Merge pull request #15623 from poettering/cmsg-cleanup
various CMSG_xyz clean-ups, split out of #15571
2020-05-08 11:05:06 +02:00
Vito Caputo 5e55340ad4
Merge pull request #15681 from vcaputo/buslocator
*: switch to BusLocator-oriented helpers
2020-05-07 09:46:01 -07:00
Vito Caputo 1ecaac5c30 nspawn: switch to BusLocator-oriented helpers
Mechanical substitution reducing some verbosity
2020-05-07 08:46:44 -07:00
Lennart Poettering fb29cdbef2 tree-wide: make sure our control buffers are properly aligned
We always need to make them unions with a "struct cmsghdr" in them, so
that things properly aligned. Otherwise we might end up at an unaligned
address and the counting goes all wrong, possibly making the kernel
refuse our buffers.

Also, let's make sure we initialize the control buffers to zero when
sending, but leave them uninitialized when reading.

Both the alignment and the initialization thing is mentioned in the
cmsg(3) man page.
2020-05-07 14:39:44 +02:00
Zbigniew Jędrzejewski-Szmek be32732168 basic/set: let set_put_strdup() create the set with string hash ops
If we're using a set with _put_strdup(), most of the time we want to use
string hash ops on the set, and free the strings when done. This defines
the appropriate a new string_hash_ops_free structure to automatically free
the keys when removing the set, and makes set_put_strdup() and set_put_strdupv()
instantiate the set with those hash ops.

hashmap_put_strdup() was already doing something similar.

(It is OK to instantiate the set earlier, possibly with a different hash ops
structure. set_put_strdup() will then use the existing set. It is also OK
to call set_free_free() instead of set_free() on a set with
string_hash_ops_free, the effect is the same, we're just overriding the
override of the cleanup function.)

No functional change intended.
2020-05-06 16:54:06 +02:00
Motiejus Jakštys 5c4deb9a5c nspawn: mount custom paths before writing to /etc
Consider such configuration:

    $ systemd-nspawn --read-only --timezone=copy --resolv-conf=copy-host \
        --overlay="+/etc::/etc" <...>

Assuming one wants `/` to be read-only, DNS and `/etc/localtime` to
work. One way to do it is to create an overlay filesystem in `/etc/`.
However, systemd-nspawn tries to create `/etc/resolv.conf` and
`/etc/localtime` before mounting the custom paths, while `/` (and, by
extension, `/etc`) is read-only. Thus it fails to create those files.

Mounting custom paths before modifying anything in `/etc/` makes this
possible.

Full example:

```
$ debootstrap buster /var/lib/machines/t1 http://deb.debian.org/debian
$ systemd-nspawn --private-users=false --timezone=copy --resolv-conf=copy-host --read-only --tmpfs=/var --tmpfs=/run --overlay="+/etc::/etc" -D /var/lib/machines/t1 ping -c 1 example.com
Spawning container t1 on /var/lib/machines/t1.
Press ^] three times within 1s to kill container.
ping: example.com: Temporary failure in name resolution
Container t1 failed with error code 130.
```

With the patch:

```
$ sudo ./build/systemd-nspawn --private-users=false --timezone=copy --resolv-conf=copy-host --read-only --tmpfs=/var --tmpfs=/run --overlay="+/etc::/etc" -D /var/lib/machines/t1 ping -qc 1 example.com
Spawning container t1 on /var/lib/machines/t1.
Press ^] three times within 1s to kill container.
PING example.com (93.184.216.34) 56(84) bytes of data.

--- example.org ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 110.912/110.912/110.912/0.000 ms
Container t1 exited successfully.
```
2020-05-05 09:02:57 +02:00
Lennart Poettering dcff2fa5d1 nspawn: be more careful with creating/chowning directories to overmount
We should never re-chown selinuxfs.

Fixes: #15475
2020-04-28 19:40:46 +02:00
Lennart Poettering 371d72e05b socket-util: introduce type-safe, dereferencing wrapper CMSG_FIND_DATA around cmsg_find()
let's take this once step further, and add type-safety to cmsg_find(),
and imply the CMSG_DATA() macro for finding the cmsg payload.
2020-04-23 19:41:15 +02:00
Lennart Poettering 0f4a141744
Merge pull request #15504 from poettering/cmsg-find-pure
just the recvmsg_safe() stuff from #15457
2020-04-23 17:28:19 +02:00
Lennart Poettering 3691bcf3c5 tree-wide: use recvmsg_safe() at various places
Let's be extra careful whenever we return from recvmsg() and see
MSG_CTRUNC set. This generally means we ran into a programming error, as
we didn't size the control buffer large enough. It's an error condition
we should at least log about, or propagate up. Hence do that.

This is particularly important when receiving fds, since for those the
control data can be of any size. In particular on stream sockets that's
nasty, because if we miss an fd because of control data truncation we
cannot recover, we might not even realize that we are one off.

(Also, when failing early, if there's any chance the socket might be
AF_UNIX let's close all received fds, all the time. We got this right
most of the time, but there were a few cases missing. God, UNIX is hard
to use)
2020-04-23 09:41:47 +02:00
Lennart Poettering 287b737693 nspawn: refuse politely when we are run in the non-host netns in combination with --image=
Strictly speaking this doesn't really fix #15079, but it at least means
we won't hang anymore.

Fixes: #15079
2020-04-23 09:18:43 +02:00
Lennart Poettering 1433e0f212 nspawn: minor simplification 2020-04-23 09:18:05 +02:00
Zbigniew Jędrzejewski-Szmek 4ee40eefce
Merge pull request #15516 from poettering/nspawn-resolv-conf
beef up --resolv-conf= options of systemd-nspawn
2020-04-23 08:01:46 +02:00
Lennart Poettering 81d2fe53fc nspawn: some minor modernizations 2020-04-23 07:59:26 +02:00
Lennart Poettering 86775e3524 nspawn: beef up --resolve-conf= modes
Let's add flavours for copying stub/uplink resolv.conf versions.

Let's add a more brutal "replace" mode, where we'll replace any existing
destination file.

Let's also change what "auto" means: instead of copying the static file,
let's use the stub file, so that DNS search info is copied over.

Fixes: #15340
2020-04-22 19:38:04 +02:00
Frantisek Sumsal 86b52a3958 tree-wide: fix spelling errors
Based on a report from Fossies.org using Codespell.

Followup to #15436
2020-04-21 23:21:08 +02:00
Zbigniew Jędrzejewski-Szmek 162392b75a tree-wide: spellcheck using codespell
Fixes #15436.
2020-04-16 18:00:40 +02:00
Vito Caputo 9f81a592c1 *: convert amenable fdopendir() calls to take_fdopendir()
Some fdopendir() calls remain where safe_close() is manually
performed, those could be simplified as well by converting to
use the _cleanup_close_ machinery, but makes things less trivial
to review so left for a future cleanup.
2020-03-31 06:48:03 -07:00
Vito Caputo 4fa744a35c *: convert amenable fdopen calls to take_fdopen
Mechanical change to eliminate some cruft by using the
new take_fdopen{_unlocked}() wrappers where trivial.
2020-03-31 06:48:03 -07:00
Yu Watanabe df883de98a pid1, nspawn: voidify loopback_setup() 2020-03-04 14:18:55 +01:00
Zbigniew Jędrzejewski-Szmek 105a1a36cd tree-wide: fix spelling of lookup and setup verbs
"set up" and "look up" are the verbs, "setup" and "lookup" are the nouns.
2020-03-03 15:02:53 +01:00
Yu Watanabe 9610210d32 nspawn: voidify umount_verbose()
Fixes CID#1415122.
2020-01-31 23:10:29 +09:00
Lennart Poettering 4fcb96ce25 nspawn: fsck all images when mounting things
Also, start logging about mount errors, things are hard to debug
otherwise.
2020-01-29 19:29:55 +01:00
Zbigniew Jędrzejewski-Szmek ea7fe1d1c2
Merge pull request #14390 from poettering/gpt-var-tmp
introduce GPT partition types for /var and /var/tmp and support them for auto-discovery
2020-01-14 15:37:53 +01:00
Lennart Poettering 04d8507f68
Merge pull request #14381 from keszybz/ifindex-cleanup
Resolve alternative names
2020-01-13 17:57:59 +01:00
Zbigniew Jędrzejewski-Szmek d308bb99d2 Resolve alternative ifnames wherever we would resolve an interface name
To keep the names manageable, "ifname_or_ifindex" is replaced by "interface".
2020-01-12 11:24:35 +01:00
Zbigniew Jędrzejewski-Szmek 597da51bae tree-wide: make parse_ifindex simply return the index
We don't need a seperate output parameter that is of type int.  glibc() says
that the type is "unsigned", but the kernel thinks it's "int".  And the
"alternative names" interface also uses ints. So let's standarize on ints,
since it's clearly not realisitic to have interface numbers in the upper half
of unsigned int range.
2020-01-11 12:06:08 +01:00
rhn bcc0fe635d nspawn: Correct "container" to "host" MAC setting message 2020-01-11 12:21:18 +09:00
Yu Watanabe 6b50cb5ca9 nspawn: set original ifname as alternative if it is truncated 2020-01-07 15:15:59 +01:00
Daan De Meyer 2436ea761b nspawn: Make a custom mount on root imply --read-only. 2020-01-03 14:06:38 +01:00
Daan De Meyer bbd407ea2b nspawn: Don't mount read-only if we have a custom mount on root. 2020-01-03 14:06:38 +01:00
Lennart Poettering 12da859a3f
Merge pull request #14401 from DaanDeMeyer/nspawn-move-veth-back-to-host
nspawn: move virtual interfaces added with --network-interface back to the host
2020-01-03 12:47:03 +01:00
Kai Krakow bc5ea049f2 nspawn: Generate unique short veth names
This commit lowers the chance of having veth name conflicts for machines
created with similar names.

Replaces: #12865
Fixes: #13417
2020-01-02 20:05:42 +01:00
Daan De Meyer 5b4855ab73 nspawn: Move --network-interface interfaces back to the host. 2020-01-02 14:13:03 +01:00
Daan De Meyer b390f17892 nspawn-network: Split off udev checking from parse_interface. 2019-12-23 18:47:36 +01:00
Lennart Poettering 19ac32cdd6 docs: import discoverable partitions spec
This was previously available here:

https://www.freedesktop.org/wiki/Specifications/DiscoverablePartitionsSpec/

Let's pull it into our repository.
2019-12-23 14:44:33 +01:00