Commit Graph

425 Commits

Author SHA1 Message Date
Michal Sekletar e01ff70a77 nspawn: always setup machine id
We check /etc/machine-id of the container and if it is already populated
we use value from there, possibly ignoring value of --uuid option from
the command line. When dealing with R/O image we setup transient machine
id.

Once we determined machine id of the container, we use this value for
registration with systemd-machined and we also export it via
container_uuid environment variable.

As registration with systemd-machined is done by the main nspawn process
we communicate container machine id established by setup_machine_id from
outer child to the main process by unix domain socket. Similarly to PID
of inner child.
2016-04-11 16:43:16 +02:00
Evgeny Vereshchagin 1c1ea21735 nspawn: don't run nspawn --port=... without libiptc support
We get
$ systemd-nspawn --image /dev/loop1 --port 8080:80 -n -b 3
--port= is not supported, compiled without libiptc support.

instead of a ping-nc-iptables debugging session
2016-03-17 21:07:11 +00:00
Dan Walsh 68b020494d /dev/console must be labeled with SELinux label
If the user specifies an selinux_apifs_context all content created in
the container including /dev/console should use this label.

Currently when this uses the default label it gets labeled user_devpts_t,
which would require us to write a policy allowing container processes to
manage user_devpts_t.  This means that an escaped process would be allowed
to attack all users terminals as well as other container terminals.  Changing
the label to match the apifs_context, means the processes would only be allowed
to manage their specific tty.

This change fixes a problem preventing RKT containers from working with systemd-nspawn.
2016-03-09 11:19:45 -05:00
Vito Caputo 9ed794a32d tree-wide: minor formatting inconsistency cleanups 2016-02-23 14:20:34 -08:00
Vito Caputo 313cefa1d9 tree-wide: make ++/-- usage consistent WRT spacing
Throughout the tree there's spurious use of spaces separating ++ and --
operators from their respective operands.  Make ++ and -- operator
consistent with the majority of existing uses; discard the spaces.
2016-02-22 20:32:04 -08:00
Lennart Poettering 91ba5ac7d0 Merge pull request #2589 from keszybz/resolve-tool-2
Better support of OPENPGPKEY, CAA, TLSA packets and tests
2016-02-13 11:15:41 +01:00
Zbigniew Jędrzejewski-Szmek 75f32f047c Add memcpy_safe
ISO/IEC 9899:1999 §7.21.1/2 says:
Where an argument declared as size_t n specifies the length of the array
for a function, n can have the value zero on a call to that
function. Unless explicitly stated otherwise in the description of a
particular function in this subclause, pointer arguments on such a call
shall still have valid values, as described in 7.1.4.

In base64_append_width memcpy was called as memcpy(x, NULL, 0).  GCC 4.9
started making use of this and assumes This worked fine under -O0, but
does something strange under -O3.

This patch fixes a bug in base64_append_width(), fixes a possible bug in
journal_file_append_entry_internal(), and makes use of the new function
to simplify the code in other places.
2016-02-11 13:07:02 -05:00
Daniel Mack b26fa1a2fb tree-wide: remove Emacs lines from all files
This should be handled fine now by .dir-locals.el, so need to carry that
stuff in every file.
2016-02-10 13:41:57 +01:00
Lennart Poettering 2b26a72816 nspawn: make sure --help fits it 79ch 2016-02-03 23:58:25 +01:00
Lennart Poettering 7732f92bad nspawn: optionally run a stub init process as PID 1
This adds a new switch --as-pid2, which allows running commands as PID 2, while a stub init process is run as PID 1.
This is useful in order to run arbitrary commands in a container, as PID1's semantics are different from all other
processes regarding reaping of unknown children or signal handling.
2016-02-03 23:58:24 +01:00
Lennart Poettering 5f932eb9af nspawn: add new --chdir= switch
Fixes: #2192
2016-02-03 23:58:24 +01:00
Lennart Poettering ba8e6c4d0e nspawn: make sure --link-journal=host may be used twice in a row
Fixes #2186

This fixes fall-out from 574edc9006.
2016-01-28 20:24:28 +01:00
Lennart Poettering 8054d749c4 nspawn: make journal linking non-fatal in try and auto modes
Fixes #2091
2016-01-28 20:16:44 +01:00
Michal Sekletar 61e741ed3d nspawn: fix memory leak 2016-01-25 12:06:38 +01:00
Ismo Puustinen a103496ca5 capabilities: keep bounding set in non-inverted format.
Change the capability bounding set parser and logic so that the bounding
set is kept as a positive set internally. This means that the set
reflects those capabilities that we want to keep instead of drop.
2016-01-12 12:14:50 +02:00
Lennart Poettering 4afd3348c7 tree-wide: expose "p"-suffix unref calls in public APIs to make gcc cleanup easy
GLIB has recently started to officially support the gcc cleanup
attribute in its public API, hence let's do the same for our APIs.

With this patch we'll define an xyz_unrefp() call for each public
xyz_unref() call, to make it easy to use inside a
__attribute__((cleanup())) expression. Then, all code is ported over to
make use of this.

The new calls are also documented in the man pages, with examples how to
use them (well, I only added docs where the _unref() call itself already
had docs, and the examples, only cover sd_bus_unrefp() and
sd_event_unrefp()).

This also renames sd_lldp_free() to sd_lldp_unref(), since that's how we
tend to call our destructors these days.

Note that this defines no public macro that wraps gcc's attribute and
makes it easier to use. While I think it's our duty in the library to
make our stuff easy to use, I figure it's not our duty to make gcc's own
features easy to use on its own. Most likely, client code which wants to
make use of this should define its own:

       #define _cleanup_(function) __attribute__((cleanup(function)))

Or similar, to make the gcc feature easier to use.

Making this logic public has the benefit that we can remove three header
files whose only purpose was to define these functions internally.

See #2008.
2015-11-27 19:19:36 +01:00
Lennart Poettering 4a0b58c4a3 tree-wide: use right cast macros for UIDs, GIDs and PIDs 2015-11-17 00:52:10 +01:00
Lennart Poettering f6d6bad146 nspawn: add new --network-veth-extra= switch for defining additional veth links
The new switch operates like --network-veth, but may be specified
multiple times (to define multiple link pairs) and allows flexible
definition of the interface names.

This is an independent reimplementation of #1678, but defines different
semantics, keeping the behaviour completely independent of
--network-veth. It also comes will full hook-up for .nspawn files, and
the matching documentation.
2015-11-12 22:04:49 +01:00
Daniel Mack b0bc8dbd73 Merge pull request #1820 from michich/errno-v2
[v2] treewide: treatment of errno and other cleanups
2015-11-09 21:56:49 +01:00
Michal Schmidt e1427b138f treewide: apply errno.cocci
with small manual cleanups for style.
2015-11-09 20:01:06 +01:00
Lennart Poettering 6c9e781eba Merge pull request #1799 from jengelh/doc
doc: typo and ortho fixes
2015-11-09 18:16:21 +01:00
Iago López Galeiras 6aadfa4c52 nspawn: support custom container service name
We were hardcoding "systemd-nspawn" as the value of the $container env
variable and "nspawn" as the service string in machined registration.

This commit allows the user to configure it by setting the
$SYSTEMD_NSPAWN_CONTAINER_SERVICE env variable when calling
systemd-nspawn.

If $SYSTEMD_NSPAWN_CONTAINER_SERVICE is not set, we use the string
"systemd-nspawn" for both, fixing the previous inconsistency.
2015-11-09 16:40:05 +01:00
Jan Engelhardt a8eaaee72a doc: correct orthography, word forms and missing/extraneous words 2015-11-06 13:45:21 +01:00
Jan Engelhardt b938cb902c doc: correct punctuation and improve typography in documentation 2015-11-06 13:00:02 +01:00
Michal Schmidt 35607a8d1c nspawn: save errno before reopening log after exec failure 2015-11-05 13:44:12 +01:00
Michal Schmidt 070edd97f3 nspawn: no fake errno
The S_ISREG test does not set errno, so don't use it in the error
message.
2015-11-05 13:44:11 +01:00
Michal Schmidt 4314d33f51 nspawn: simplify error returns
Use the "return log_error_errno(...)" idiom to have fewer curly braces.

The last hunk also fixes the return value of setup_journal(), but the
fix has no practical effect.
2015-11-05 13:44:10 +01:00
Michal Schmidt 709f6e46a3 treewide: use the negative error codes returned by our functions
Our functions return negative error codes.
Do not rely on errno being set after calling our own functions.
2015-11-05 13:44:06 +01:00
Lennart Poettering 97044145b4 core,nspawn: minor coding style fixes 2015-10-31 19:09:20 +01:00
Susant Sahani 6cbe4ed1e1 nspwan: port to extract_first_word 2015-10-28 22:59:01 +05:30
Lennart Poettering b5efdb8af4 util-lib: split out allocation calls into alloc-util.[ch] 2015-10-27 13:45:53 +01:00
Lennart Poettering 15a5e95075 util-lib: split out printf() helpers to stdio-util.h 2015-10-27 13:25:57 +01:00
Lennart Poettering 430f0182b7 src/basic: rename audit.[ch] → audit-util.[ch] and capability.[ch] → capability-util.[ch]
The files are named too generically, so that they might conflict with
the upstream project headers. Hence, let's add a "-util" suffix, to
clarify that this are just our utility headers and not any official
upstream headers.
2015-10-27 13:25:57 +01:00
Lennart Poettering affb60b1ef util-lib: split out umask-related code to umask-util.h 2015-10-27 13:25:56 +01:00
Lennart Poettering 8fcde01280 util-lib: split stat()/statfs()/stavfs() related calls into stat-util.[ch] 2015-10-27 13:25:56 +01:00
Lennart Poettering f4f15635ec util-lib: move a number of fs operations into fs-util.[ch] 2015-10-27 13:25:56 +01:00
Lennart Poettering 4349cd7c1d util-lib: move mount related utility calls to mount-util.[ch] 2015-10-27 13:25:55 +01:00
Lennart Poettering 6bedfcbb29 util-lib: split string parsing related calls from util.[ch] into parse-util.[ch] 2015-10-27 13:25:55 +01:00
Lennart Poettering 2583fbea8e socket-util: move remaining socket-related calls from util.[ch] to socket-util.[ch] 2015-10-26 01:24:39 +01:00
Lennart Poettering b1d4f8e154 util-lib: split out user/group/uid/gid calls into user-util.[ch] 2015-10-26 01:24:38 +01:00
Lennart Poettering 3ffd4af220 util-lib: split out fd-related operations into fd-util.[ch]
There are more than enough to deserve their own .c file, hence move them
over.
2015-10-25 13:19:18 +01:00
Lennart Poettering 07630cea1f util-lib: split our string related calls from util.[ch] into its own file string-util.[ch]
There are more than enough calls doing string manipulations to deserve
its own files, hence do something about it.

This patch also sorts the #include blocks of all files that needed to be
updated, according to the sorting suggestions from CODING_STYLE. Since
pretty much every file needs our string manipulation functions this
effectively means that most files have sorted #include blocks now.

Also touches a few unrelated include files.
2015-10-24 23:05:02 +02:00
Lennart Poettering 0f03c2a4c0 path-util: unify how we process paths specified on the command line
Let's introduce a common function that makes relative paths absolute and
warns about any errors while doing so.
2015-10-24 23:03:49 +02:00
Lennart Poettering 0f47436510 util-lib: get_current_dir_name() can return errors other than ENOMEM
get_current_dir_name() can return a variety of errors, not just ENOMEM,
hence don't blindly turn its errors to ENOMEM, but return correct errors
in path_make_absolute_cwd().

This trickles down into a couple of other functions, some of which
receive unrelated minor fixes too with this commit.
2015-10-24 23:03:49 +02:00
Lennart Poettering 16fb773ee3 nspawn: don't try to resolve passed binary before entering namespace
Othewise we might follow the symlinks on the host, instead of the
container.

Fixes #1400
2015-10-22 01:59:25 +02:00
Lennart Poettering 0e2656744f nspawn: rework how we determine private networking settings
Make sure we acquire CAP_NET_ADMIN if we require virtual networking.

Make sure we imply virtual ethernet correctly when bridge is request.

Fixes: #1511
Fixes: #1554
Fixes: #1590
2015-10-22 01:59:25 +02:00
Lennart Poettering 5bcd08db28 btrfs: beef-up btrfs support with a limited understanding of quota
With this change we understand more than just leaf quota groups for
btrfs file systems. Specifically:

- When we create a subvolume we can now optionally add the new subvolume
  to all qgroups its parent subvolume was member of too. Alternatively
  it is also possible to insert an intermediary quota group between the
  parent's qgroups and the subvolume's leaf qgroup, which is useful for
  a concept of "subtree" qgroups, that contain a subvolume and all its
  children.

- The remove logic for subvolumes has been updated to optionally remove
  any leaf qgroups or "subtree" qgroups, following the logic above.

- The snapshot logic for subvolumes has been updated to replicate the
  original qgroup setup of the source, if it follows the "subtree"
  design described above. It will not cover qgroup setups that introduce
  arbitrary qgroups, especially those orthogonal to the subvolume
  hierarchy.

This also tries to be more graceful when setting up /var/lib/machines as
btrfs. For example, if mkfs.btrfs is missing we don't even try to set it
up as loopback device.

Fixes #1559
Fixes #1129
2015-10-22 01:59:25 +02:00
Iago López Galeiras d167824896 nspawn: skip /sys-as-tmpfs if we don't use private-network
Since v3.11/7dc5dbc ("sysfs: Restrict mounting sysfs"), the kernel
doesn't allow mounting sysfs if you don't have CAP_SYS_ADMIN rights over
the network namespace.

So the mounting /sys as a tmpfs code introduced in
d8fc6a000f doesn't work with user
namespaces if we don't use private-net. The reason is that we mount
sysfs inside the container and we're in the network namespace of the host
but we don't have CAP_SYS_ADMIN over that namespace.

To fix that, we mount /sys as a sysfs (instead of tmpfs) if we don't use
private network and ignore the /sys-as-a-tmpfs code if we find that /sys
is already mounted as sysfs.

Fixes #1555
2015-10-20 10:19:23 +02:00
Lennart Poettering ae3dde8012 machinectl: fix race when opening new shells with "machinectl shell"
Previously, we'd allocate the TTY, spawn a service on it, but
immediately start processing the TTY and forwarding it to whatever the
commnd was started on. This is however problematic, as the TTY might get
actually opened only much later by the service. We'll hence first get
EIOs on the master as the other side is still closed, and hence
considered it hung up and terminated the session.

With this change we add a flag to the pty forwarding logic:
PTY_FORWARD_IGNORE_INITIAL_VHANGUP. If set, we'll ignore all hangups
(i.e. EIOs) on the master PTY until the first byte is successfully read.
From that point on we consider a hangup/EIO a regular connection termination. This
way, we handle the race: when we get EIO initially we'll ignore it,
until the connection is properly set up, at which time we start
honouring it.
2015-10-07 20:10:48 +02:00
Lennart Poettering d8fc6a000f nspawn: mount /sys as tmpfs, and then mount only select subdirs of the real sysfs below it
This way we can hide things like /sys/firmware or /sys/hypervisor from
the container, while keeping the device tree around.

While this is a security benefit in itself it also allows us to fix
issue #1277.

Previously we'd mount /sys before creating the user namespace, in order
to be able to mount /sys/fs/cgroup/* beneath it (which resides in it),
which we can only mount outside of the user namespace. To ensure that
the user namespace owns the network namespace we'd set up the network
namespace at the same time as the user namespace. Thus, we'd still see
the /sys/class/net/ from the originating network namespace, even though
we are in our own network namespace now. With this patch, /sys is
mounted before transitioning into the user namespace as tmpfs, so that
we can also mount /sys/fs/cgroup/* into it this early. The directories
such as /sys/class/ are then later added in from the real sysfs from
inside the network and user namespace so that they actually show whatis
available in it.

Fixes #1277
2015-09-30 15:19:33 +02:00