Commit Graph

47 Commits

Author SHA1 Message Date
Lennart Poettering 52ef5dd798 hostname-util: flagsify hostname_is_valid(), drop machine_name_is_valid()
Let's clean up hostname_is_valid() a bit: let's turn the second boolean
argument into a more explanatory flags field, and add a flag that
accepts the special name ".host" as valid. This is useful for the
container logic, where the special hostname ".host" refers to the "root
container", i.e. the host system itself, and can be specified at various
places.

let's also get rid of machine_name_is_valid(). It was just an alias,
which is confusing and even more so now that we have the flags param.
2020-12-15 17:59:48 +01:00
Yu Watanabe db9ecf0501 license: LGPL-2.1+ -> LGPL-2.1-or-later 2020-11-09 13:23:58 +09:00
Yu Watanabe a864170745 nspawn: downgrade log level if the error will be ignored 2020-09-10 15:16:14 +09:00
Lennart Poettering 58cf204730 nspawn: let's make LinkJournal an extended boolean
Let's accept the usual boolean parameters for LinkJournal. It's
confusing otherwise.

Previously we'd accept "no" but not the other values we typically accept
for "false". We'd not accept any values for "true".

With this change we'll accept all true and false values and will do
something somewhat reasonable: any false value is treated like "no"
previously was reated. And any true value is now treated like "auto".

We don't document the new values, since this logic is mostly redundant,
and it's probably better if people consider this an enum rather than a
bool.

Fixes: #16888
2020-09-02 08:57:44 +02:00
Lennart Poettering 6b000af4f2 tree-wide: avoid some loaded terms
https://tools.ietf.org/html/draft-knodel-terminology-02
https://lwn.net/Articles/823224/

This gets rid of most but not occasions of these loaded terms:

1. scsi_id and friends are something that is supposed to be removed from
   our tree (see #7594)

2. The test suite defines an API used by the ubuntu CI. We can remove
   this too later, but this needs to be done in sync with the ubuntu CI.

3. In some cases the terms are part of APIs we call or where we expose
   concepts the kernel names the way it names them. (In particular all
   remaining uses of the word "slave" in our codebase are like this,
   it's used by the POSIX PTY layer, by the network subsystem, the mount
   API and the block device subsystem). Getting rid of the term in these
   contexts would mean doing some major fixes of the kernel ABI first.

Regarding the replacements: when whitelist/blacklist is used as noun we
replace with with allow list/deny list, and when used as verb with
allow-list/deny-list.
2020-06-25 09:00:19 +02:00
Lennart Poettering 4f9ff96a55 conf-parser: return mtime in config_parse() and friends
This is a follow-up for 9f83091e3c.

Instead of reading the mtime off the configuration files after reading,
let's do so before reading, but with the fd we read the data from. This
is not only cleaner (as it allows us to save one stat()), but also has
the benefit that we'll detect changes that happen while we read the
files.

This also reworks unit file drop-ins to use the common code for
determining drop-in mtime, instead of reading system clock for that.
2020-06-02 19:32:20 +02:00
Tobias Hunger 129635333d repart: Add UUID option to config files
Add a option to provide a UUID for the partition that will get
created and document that.
2020-05-25 15:48:59 +02:00
Lennart Poettering 86775e3524 nspawn: beef up --resolve-conf= modes
Let's add flavours for copying stub/uplink resolv.conf versions.

Let's add a more brutal "replace" mode, where we'll replace any existing
destination file.

Let's also change what "auto" means: instead of copying the static file,
let's use the stub file, so that DNS search info is copied over.

Fixes: #15340
2020-04-22 19:38:04 +02:00
afg c152a2ba54 nspawn: allow Capability=all in systemd.nspawn [EXEC] section
Just like --capability=all is allowed in the systemd-nspawn
command line.
2019-11-29 14:42:27 +01:00
Lennart Poettering b910cc72c0 tree-wide: get rid of strappend()
It's a special case of strjoin(), so no need to keep both. In particular
as typing strjoin() is even shoert than strappend().
2019-07-12 14:31:12 +09:00
Zbigniew Jędrzejewski-Szmek 0985c7c4e2 Rework cpu affinity parsing
The CPU_SET_S api is pretty bad. In particular, it has a parameter for the size
of the array, but operations which take two (CPU_EQUAL_S) or even three arrays
(CPU_{AND,OR,XOR}_S) still take just one size. This means that all arrays must
be of the same size, or buffer overruns will occur. This is exactly what our
code would do, if it received an array of unexpected size over the network.
("Unexpected" here means anything different from what cpu_set_malloc() detects
as the "right" size.)

Let's rework this, and store the size in bytes of the allocated storage area.

The code will now parse any number up to 8191, independently of what the current
kernel supports. This matches the kernel maximum setting for any architecture,
to make things more portable.

Fixes #12605.
2019-05-29 10:20:42 +02:00
Zbigniew Jędrzejewski-Szmek b2645747b7 nspawn-oci: fix double free
Also rename function to make it clear that it also frees the array
object itself.
2019-03-22 17:39:12 +01:00
Lennart Poettering de40a3037a nspawn: add support for executing OCI runtime bundles with nspawn
This is a pretty large patch, and adds support for OCI runtime bundles
to nspawn. A new switch --oci-bundle= is added that takes a path to an
OCI bundle. The JSON file included therein is read similar to a .nspawn
settings files, however with a different feature set.

Implementation-wise this mostly extends the pre-existing Settings object
to carry additional properties for OCI. However, OCI supports some
concepts .nspawn files did not support yet, which this patch also adds:

1. Support for "masking" files and directories. This functionatly is now
   also available via the new --inaccesible= cmdline command, and
   Inaccessible= in .nspawn files.

2. Support for mounting arbitrary file systems. (not exposed through
   nspawn cmdline nor .nspawn files, because probably not a good idea)

3. Ability to configure the console settings for a container. This
   functionality is now also available on the nspawn cmdline in the new
   --console= switch (not added to .nspawn for now, as it is something
   specific to the invocation really, not a property of the container)

4. Console width/height configuration. Not exposed through
   .nspawn/cmdline, but this may be controlled through $COLUMNS and
   $LINES like in most other UNIX tools.

5. UID/GID configuration by raw numbers. (not exposed in .nspawn and on
   the cmdline, since containers likely have different user tables, and
   the existing --user= switch appears to be the better option)

6. OCI hook commands (no exposed in .nspawn/cmdline, as very specific to
   OCI)

7. Creation of additional devices nodes in /dev. Most likely not a good
   idea, hence not exposed in .nspawn/cmdline. There's already --bind=
   to achieve the same, which is the better alternative.

8. Explicit syscall filters. This is not a good idea, due to the skewed
   arch support, hence not exposed through .nspawn/cmdline.

9. Configuration of some sysctls on a whitelist. Questionnable, not
   supported in .nspawn/cmdline for now.

10. Configuration of all 5 types of capabilities. Not a useful concept,
    since the kernel will reduce the caps on execve() anyway. Not
    exposed through .nspawn/cmdline as this is not very useful hence.

Note that this only implements the OCI runtime logic itself. It does not
provide a runc-compatible command line tool. This is left for a later
PR. Only with that in place tools such as "buildah" can use the OCI
support in nspawn as drop-in replacement.

Currently still missing is OCI hook support, but it's already parsed and
everything, and should be easy to add. Other than that it's OCI is
implemented pretty comprehensively.

There's a list of incompatibilities in the nspawn-oci.c file. In a later
PR I'd like to convert this into proper markdown and add it to the
documentation directory.
2019-03-15 15:41:28 +01:00
Yu Watanabe acf4d15893 util: make *_from_name() returns negative errno on error 2018-11-28 20:20:50 +09:00
Yu Watanabe c65ac075ef nspawn: do not include '%m' in log message if errno is zero 2018-10-20 02:01:15 +09:00
Lennart Poettering 0c69794138 tree-wide: remove Lennart's copyright lines
These lines are generally out-of-date, incomplete and unnecessary. With
SPDX and git repository much more accurate and fine grained information
about licensing and authorship is available, hence let's drop the
per-file copyright notice. Of course, removing copyright lines of others
is problematic, hence this commit only removes my own lines and leaves
all others untouched. It might be nicer if sooner or later those could
go away too, making git the only and accurate source of authorship
information.
2018-06-14 10:20:20 +02:00
Lennart Poettering 818bf54632 tree-wide: drop 'This file is part of systemd' blurb
This part of the copyright blurb stems from the GPL use recommendations:

https://www.gnu.org/licenses/gpl-howto.en.html

The concept appears to originate in times where version control was per
file, instead of per tree, and was a way to glue the files together.
Ultimately, we nowadays don't live in that world anymore, and this
information is entirely useless anyway, as people are very welcome to
copy these files into any projects they like, and they shouldn't have to
change bits that are part of our copyright header for that.

hence, let's just get rid of this old cruft, and shorten our codebase a
bit.
2018-06-14 10:20:20 +02:00
Lennart Poettering 1688841f46 nspawn: similar to the previous patches, also make /etc/localtime handling more configurable
Fixes: #9009
2018-05-22 16:21:26 +02:00
Lennart Poettering 4e1d6aa983 nspawn: make --link-journal= configurable through .nspawn files, too 2018-05-22 16:20:08 +02:00
Lennart Poettering 09d423e921 nspawn: add greater control over how /etc/resolv.conf is handled
Fixes: #8014 #1781
2018-05-22 16:19:26 +02:00
Lennart Poettering d107bb7d63 nspawn: add a new --cpu-affinity= switch
Similar as the other options added before, this is primarily useful to
provide comprehensive OCI runtime compatbility, but might be useful
otherwise, too.
2018-05-17 20:48:54 +02:00
Lennart Poettering 81f345dfed nspawn: add a new --oom-score-adjust= command line switch
This is primarily useful in order to provide comprehensive OCI runtime
compatibility with nspawn, but might have uses outside of it.
2018-05-17 20:48:12 +02:00
Lennart Poettering 66edd96310 nspawn: add a new --no-new-privileges= cmdline option to nspawn
This simply controls the PR_SET_NO_NEW_PRIVS flag for the container.
This too is primarily relevant to provide OCI runtime compaitiblity, but
might have other uses too, in particular as it nicely complements the
existing --capability= and --drop-capability= flags.
2018-05-17 20:47:20 +02:00
Lennart Poettering 3a9530e5f1 nspawn: make the hostname of the container explicitly configurable with a new --hostname= switch
Previously, the container's hostname was exclusively initialized from
the machine name configured with --machine=, i.e. the internal name and
the external name used for and by the container was synchronized. This
adds a new option --hostname= that optionally allows the internal name
to deviate from the external name.

This new option is mainly useful to ultimately implement the OCI runtime
spec directly in nspawn, but it might be useful on its own for some
other usecases too.
2018-05-17 20:46:45 +02:00
Lennart Poettering bf428efb07 nspawn: add new --rlimit= switch, and always set resource limits explicitly for our container payloads
This ensures we set the various resource limits of our container
explicitly on each invocation so that we inherit less from our callers
into the payload.

By default resource limits are now set to the same values Linux
generally passes to the host PID 1, thus minimizing needless differences
between host and container environments.

The limits are now also configurable using a new --rlimit= switch. This
is preparation for teaching nspawn native OCI runtime support as OCI
permits setting resource limits for container payloads, and it hence
probably makes sense if we do too.
2018-05-17 20:45:54 +02:00
Zbigniew Jędrzejewski-Szmek 11a1589223 tree-wide: drop license boilerplate
Files which are installed as-is (any .service and other unit files, .conf
files, .policy files, etc), are left as is. My assumption is that SPDX
identifiers are not yet that well known, so it's better to retain the
extended header to avoid any doubt.

I also kept any copyright lines. We can probably remove them, but it'd nice to
obtain explicit acks from all involved authors before doing that.
2018-04-06 18:58:55 +02:00
Yu Watanabe 1cc6c93a95 tree-wide: use TAKE_PTR() and TAKE_FD() macros 2018-04-05 14:26:26 +09:00
Daniel Lockyer f9ecfd3bbe Replace free and reassignment with free_and_replace 2017-11-24 10:33:41 +00:00
Zbigniew Jędrzejewski-Szmek 53e1b68390 Add SPDX license identifiers to source files under the LGPL
This follows what the kernel is doing, c.f.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5fd54ace4721fc5ce2bb5aef6318fcf17f421460.
2017-11-19 19:08:15 +01:00
Lennart Poettering bcde742e78 conf-parser: turn three bool function params into a flags fields
This makes things more readable and fixes some issues with incorrect
flag propagation between the various flavours of config_parse().
2017-11-13 10:24:03 +01:00
myrkr 1898e5f9a3 nspawn: Fix calculation of capabilities for configuration file (#7087)
The current code shifting an integer 1 failed for capabilities like
CAP_MAC_ADMIN (numerical value 33). This caused issues when specifying
them in the nspawn configuration file. Using an uint64_t 1 instead.

The similar code for processing the --capability command line option
was already correctly working.
2017-10-24 09:56:40 +02:00
Lennart Poettering 960e4569e1 nspawn: implement configurable syscall whitelisting/blacklisting
Now that we have ported nspawn's seccomp code to the generic code in
seccomp-util, let's extend it to support whitelisting and blacklisting
of specific additional syscalls.

This uses similar syntax as PID1's support for system call filtering,
but in contrast to that always implements a blacklist (and not a
whitelist), as we prepopulate the filter with a blacklist, and the
unit's system call filter logic does not come with anything
prepopulated.

(Later on we might actually want to invert the logic here, and
whitelist rather than blacklist things, but at this point let's not do
that. In case we switch this over later, the syscall add/remove logic of
this commit should be compatible conceptually.)

Fixes: #5163

Replaces: #5944
2017-09-12 14:06:21 +02:00
Philip Withnall b53ede699c nspawn: Add support for sysroot pivoting (#5258)
Add a new --pivot-root argument to systemd-nspawn, which specifies a
directory to pivot to / inside the container; while the original / is
pivoted to another specified directory (if provided). This adds
support for booting container images which may contain several bootable
sysroots, as is common with OSTree disk images. When these disk images
are booted on real hardware, ostree-prepare-root is run in conjunction
with sysroot.mount in the initramfs to achieve the same results.
2017-02-08 16:54:31 +01:00
Lennart Poettering 7b4318b6a5 nspawn: add ability to configure overlay mounts to .nspawn files
Fixes: #4634
2016-12-01 12:41:17 +01:00
Zbigniew Jędrzejewski-Szmek 6b430fdb7c tree-wide: use mfree more 2016-10-16 23:35:39 -04:00
Lennart Poettering 22b28dfdc7 nspawn: add new --network-zone= switch for automatically managed bridge devices
This adds a new concept of network "zones", which are little more than bridge
devices that are automatically managed by nspawn: when the first container
referencing a bridge is started, the bridge device is created, when the last
container referencing it is removed the bridge device is removed again. Besides
this logic --network-zone= is pretty much identical to --network-bridge=.

The usecase for this is to make it easy to run multiple related containers
(think MySQL in one and Apache in another) in a common, named virtual Ethernet
broadcast zone, that only exists as long as one of them is running, and fully
automatically managed otherwise.
2016-05-09 15:45:31 +02:00
Lennart Poettering 0de7accea9 nspawn: allow configuration of user namespaces in .nspawn files
In order to implement this we change the bool arg_userns into an enum
UserNamespaceMode, which can take one of NO, PICK or FIXED, and replace the
arg_uid_range_pick bool with it.
2016-04-25 12:16:02 +02:00
Daniel Mack b26fa1a2fb tree-wide: remove Emacs lines from all files
This should be handled fine now by .dir-locals.el, so need to carry that
stuff in every file.
2016-02-10 13:41:57 +01:00
Lennart Poettering 7732f92bad nspawn: optionally run a stub init process as PID 1
This adds a new switch --as-pid2, which allows running commands as PID 2, while a stub init process is run as PID 1.
This is useful in order to run arbitrary commands in a container, as PID1's semantics are different from all other
processes regarding reaping of unknown children or signal handling.
2016-02-03 23:58:24 +01:00
Lennart Poettering 5f932eb9af nspawn: add new --chdir= switch
Fixes: #2192
2016-02-03 23:58:24 +01:00
Lennart Poettering f6d6bad146 nspawn: add new --network-veth-extra= switch for defining additional veth links
The new switch operates like --network-veth, but may be specified
multiple times (to define multiple link pairs) and allows flexible
definition of the interface names.

This is an independent reimplementation of #1678, but defines different
semantics, keeping the behaviour completely independent of
--network-veth. It also comes will full hook-up for .nspawn files, and
the matching documentation.
2015-11-12 22:04:49 +01:00
Lennart Poettering 7b3e062cb6 process-util: move a couple of process-related calls over 2015-10-27 14:24:58 +01:00
Lennart Poettering b5efdb8af4 util-lib: split out allocation calls into alloc-util.[ch] 2015-10-27 13:45:53 +01:00
Lennart Poettering 0e2656744f nspawn: rework how we determine private networking settings
Make sure we acquire CAP_NET_ADMIN if we require virtual networking.

Make sure we imply virtual ethernet correctly when bridge is request.

Fixes: #1511
Fixes: #1554
Fixes: #1590
2015-10-22 01:59:25 +02:00
Lennart Poettering 12ca818ffd tree-wide: clean up log_syntax() usage
- Rely everywhere that we use abs() on the error code passed in anyway,
  thus don't need to explicitly negate what we pass in

- Never attach synthetic error number information to log messages. Only
  log about errors we *receive* with the error number we got there,
  don't log any synthetic error, that don#t even propagate, but just eat
  up.

- Be more careful with attaching exactly the error we get, instead of
  errno or unrelated errors randomly.

- Fix one occasion where the error number and line number got swapped.

- Make sure we never tape over OOM issues, or inability to resolve
  specifiers
2015-09-30 22:26:16 +02:00
Lennart Poettering 2b5c04d59c nspawn: remove nspawn.h, it's empty now 2015-09-07 18:47:34 +02:00
Lennart Poettering f757855e81 nspawn: add new .nspawn files for container settings
.nspawn fiels are simple settings files that may accompany container
images and directories and contain settings otherwise passed on the
nspawn command line. This provides an efficient way to attach execution
data directly to containers.
2015-09-06 01:49:06 +02:00