Systemd

Author	SHA1	Message	Date
Christian Brauner	0996ef00fb	nspawn: handle cgroup namespaces (NOTE: Cgroup namespaces work with legacy and unified hierarchies: "This is completely backward compatible and will be completely invisible to any existing cgroup users (except for those running inside a cgroup namespace and looking at /proc/pid/cgroup of tasks outside their namespace.)" (https://lists.linuxfoundation.org/pipermail/containers/2016-January/036582.html) So there is no need to special case unified.) If cgroup namespaces are supported we skip mount_cgroups() in the outer_child(). Instead, we unshare(CLONE_NEWCGROUP) in the inner_child() and only then do we call mount_cgroups(). The clean way to handle cgroup namespaces would be to delegate mounting of cgroups completely to the init system in the container. However, this would likely break backward compatibility with the UNIFIED_CGROUP_HIERARCHY flag of systemd-nspawn. Also no cgroupfs would be mounted whenever the user simply requests a shell and no init is available to mount cgroups. Hence, we introduce mount_legacy_cgns_supported(). After calling unshare(CLONE_NEWCGROUP) it parses /proc/self/cgroup to find the mounted controllers and mounts them inside the new cgroup namespace. This should preserve backward compatibility with the UNIFIED_CGROUP_HIERARCHY flag and mount a cgroupfs when no init in the container is running.	2016-07-09 06:34:11 +02:00
Lennart Poettering	50b52222f2	nspawn: order caps to retain alphabetically	2016-06-13 16:25:54 +02:00
Alessandro Puccetti	9c1e04d0fa	nspawn: introduce --notify-ready=[no|yes] (#3474) This patch implements a notification mechanism from the init process in the container to systemd-nspawn. The switch --notify-ready=yes configures systemd-nspawn to wait for the "READY=1" message from the init process in the container before sending its own to systemd. --notify-ready=no is equivalent to the behavior before this patch: systemd-nspawn notifies systemd with a "READY=1" message when the container is created. This notification mechanism uses a socket file with a path relative to the container, "/run/systemd/nspawn/notify". The default value is --notify-ready=no. It is also possible to configure this mechanism from the .nspawn files using NotifyReady. This parameter takes the same options as the command line switch. Before this patch, systemd-nspawn notified "ready" after the inner child was created, regardless of the status of the service running inside it. Now, with --notify-ready=yes, systemd-nspawn notifies when the service is ready. This is really useful when there are dependencies between different containers. Fixes https://github.com/systemd/systemd/issues/1369 Based on the work from https://github.com/systemd/systemd/pull/3022 Testing: Boot an OS inside a container with systemd-nspawn. Note: modify the commands according to your filesystem. 1. Create a filesystem where you can boot an OS. 2. sudo systemd-nspawn -D ${HOME}/distros/fedora-23/ sh 2.1. Create the unit file /etc/systemd/system/sleep.service inside the container (You can use the example below) 2.2. systemctl enable sleep 2.3. exit 3. sudo systemd-run --service-type=notify --unit=notify-test ${HOME}/systemd/systemd-nspawn --notify-ready=yes -D ${HOME}/distros/fedora-23/ -b 4. In a different shell run "systemctl status notify-test" When using --notify-ready=yes the service status is "activating" for 20 seconds before being set to "active (running)". Instead, using --notify-ready=no the service status is marked "active (running)" quickly, without waiting for the 20 seconds. This patch was also tested with --private-users=yes; you can test it by adding it at the end of the command in step 3. ------ sleep.service ------ [Unit] Description=sleep After=network.target [Service] Type=oneshot ExecStart=/bin/sleep 20 [Install] WantedBy=multi-user.target ------------ end ------------	2016-06-10 13:09:06 +02:00
Michael Karcher	8869a0b40b	util-lib: Add sparc64 support for process creation (#3348) The current raw_clone function takes two arguments, the cloning flags and a pointer to the stack for the cloned child. The raw cloning without passing a "thread main" function does not make sense if a new stack is specified, as it returns in both the parent and the child, which will fail in the child as the stack is virgin. All uses of raw_clone indeed pass NULL for the stack pointer which indicates that both processes should share the stack address (so you had better not pass CLONE_VM). This commit refactors the code to not require the caller to pass the stack address, as NULL is the only sensible option. It also adds the magic code needed to make raw_clone work on sparc64, which does not return 0 in %o0 for the child, but indicates the child process by setting %o1 to non-zero. This refactoring is not purely aesthetic, because non-NULL stack addresses need to get mangled before being passed to the clone syscall (you have to apply STACK_BIAS), whereas NULL must not be mangled. Implementing the conditional mangling of the stack address would needlessly complicate the code. raw_clone is moved to a separate header, because the burden of including the assert machinery and sched.h shouldn't be applied to every user of missing_syscalls.h	2016-05-29 20:03:51 -04:00
Djalal Harouni	520e0d541f	nspawn: rename arg_retain to arg_caps_retain The argument is about capabilities.	2016-05-26 22:43:34 +02:00
Djalal Harouni	f011b0b87a	nspawn: split out seccomp call into nspawn-seccomp.[ch] Split seccomp into nspawn-seccomp.[ch]. Currently there are no changes, but this will make it easy in the future to share or use the seccomp logic from systemd core.	2016-05-26 22:42:29 +02:00
Zbigniew Jędrzejewski-Szmek	b5a2179b10	nspawn: remove unreachable return statement (#3320)	2016-05-22 13:02:41 +02:00
Lennart Poettering	2099b3e993	nspawn: drop spurious newline	2016-05-12 20:14:58 +02:00
Lennart Poettering	7513c5b89f	nspawn: only remove veth links we created ourselves Let's make sure we don't remove veth links that existed before nspawn was invoked. https://github.com/systemd/systemd/pull/3209#discussion_r62439999	2016-05-09 15:45:31 +02:00
Lennart Poettering	22b28dfdc7	nspawn: add new --network-zone= switch for automatically managed bridge devices This adds a new concept of network "zones", which are little more than bridge devices that are automatically managed by nspawn: when the first container referencing a bridge is started, the bridge device is created, when the last container referencing it is removed the bridge device is removed again. Besides this logic --network-zone= is pretty much identical to --network-bridge=. The usecase for this is to make it easy to run multiple related containers (think MySQL in one and Apache in another) in a common, named virtual Ethernet broadcast zone, that only exists as long as one of them is running, and fully automatically managed otherwise.	2016-05-09 15:45:31 +02:00
Lennart Poettering	ef76dff225	util-lib: add new ifname_valid() call that validates interface names Make use of this in nspawn at a couple of places. A later commit should port more code over to this, including networkd.	2016-05-09 15:45:31 +02:00
Zbigniew Jędrzejewski-Szmek	5ab1cef0db	Merge pull request #3111 from poettering/nspawn-remove-veth	2016-05-03 13:53:00 -04:00
Zbigniew Jędrzejewski-Szmek	c29f959b44	Revert "nspawn: explicitly remove veth links after use (#3111)" This reverts commit `d2773e59de`. Merge got squashed by mistake.	2016-05-03 13:53:00 -04:00
Evgeny Vereshchagin	e192a2815e	nspawn: convert uuid to string (#3146) Fixes: cp /etc/machine-id /var/tmp/systemd-test.HccKPa/nspawn-root/etc systemd-nspawn -D /var/tmp/systemd-test.HccKPa/nspawn-root --link-journal host -b ... Host and machine ids are equal (P�S!V): refusing to link journals	2016-04-29 10:38:35 +02:00
Evgeny Vereshchagin	5aa3eba50c	nspawn: initialize the veth_name (#3141) Fixes: $ systemd-nspawn -h ... Failed to remove veth interface ��: Operation not permitted This is a follow-up for `d2773e59de`	2016-04-28 19:48:17 +02:00
Lennart Poettering	d7fe83bbc2	Merge pull request #3093 from poettering/nspawn-userns-magic nspawn automatic user namespaces	2016-04-26 14:57:04 +02:00
Lennart Poettering	d2773e59de	nspawn: explicitly remove veth links after use (#3111) * sd-netlink: permit RTM_DELLINK messages with no ifindex This is useful for removing network interfaces by name. * nspawn: explicitly remove veth links we created after use Sometimes the kernel keeps veth links pinned after the namespace they have been joined to died. Let's hence explicitly remove veth links after use. Fixes: #2173	2016-04-25 17:36:51 +02:00
Lennart Poettering	ef3b2aa7a1	nspawn: explicitly remove veth links we created after use Sometimes the kernel keeps veth links pinned after the namespace they have been joined to died. Let's hence explicitly remove veth links after use. Fixes: #2173	2016-04-25 13:44:24 +02:00
Lennart Poettering	ccabee0d64	nspawn: make -U a tiny bit smarter With this change -U will turn on user namespacing only if the kernel actually supports it and otherwise gracefully degrade to non-userns mode.	2016-04-25 12:16:02 +02:00
Lennart Poettering	0de7accea9	nspawn: allow configuration of user namespaces in .nspawn files In order to implement this we change the bool arg_userns into an enum UserNamespaceMode, which can take one of NO, PICK or FIXED, and replace the arg_uid_range_pick bool with it.	2016-04-25 12:16:02 +02:00
Lennart Poettering	19aac838fc	nspawn: add -U as shortcut for --private-users=pick Given that user namespacing is pretty useful now, let's add a shortcut command line switch for the logic.	2016-04-25 12:16:02 +02:00
Lennart Poettering	0e7ac7515f	nspawn: optionally, automatically allocate a UID/GID range for userns containers This adds the new value "pick" to --private-users=. When specified a new UID/GID range of 65536 users is automatically and randomly allocated from the host range 0x00080000-0xDFFF0000 and used for the container. The setting implies --private-users-chown, so that container directory is recursively chown()ed to the newly allocated UID/GID range, if that's necessary. As an optimization before picking a randomized UID/GID the UID of the container's root directory is used as starting point and used if currently not used otherwise. To protect against using the same UID/GID range multiple times a few mechanisms are in place: - The first and the last UID and GID of the range are checked with getpwuid() and getgrgid(). If an entry already exists a different range is picked. Note that by "last" UID the user 65534 is used, as 65535 is the 16bit (uid_t) -1. - A lock file for the range is taken in /run/systemd/nspawn-uid/. Since the ranges are taken in a non-overlapping fashion, and always start on 64K boundaries this allows us to maintain a single lock file for each range that can be randomly picked. This protects nspawn from picking the same range in two parallel instances. - If possible the /etc/passwd lock file is taken while a new range is selected until the container is up. This means adduser/addgroup should safely avoid the range as long as nss-mymachines is used, since the allocated range will then show up in the user database. The UID/GID range nspawn picks from is compiled in and not configurable at the moment. That should probably stay that way, since we already provide ways how users can pick their own ranges manually if they don't like the automatic logic. The new --private-users=pick logic makes user namespacing pretty useful now, as it relieves the user from managing UID/GID ranges.	2016-04-25 12:16:02 +02:00
Lennart Poettering	7336138eed	nspawn: optionally fix up OS tree uid/gids for userns This adds a new --private-users-chown switch that may be used in combination with --private-users. If it is passed, a recursive chown() operation is run on the OS tree, fixing all file owner UID/GIDs to the right ranges. This should make user namespacing pretty workable, as the OS trees don't need to be prepared manually anymore.	2016-04-25 12:15:57 +02:00
Thomas H. P. Andersen	0f5e13822d	tree-wide: remove unused variables (#3098)	2016-04-22 20:49:07 -04:00
Zbigniew Jędrzejewski-Szmek	ccddd104fc	tree-wide: use mdash instead of two minuses	2016-04-21 23:00:13 -04:00
Zbigniew Jędrzejewski-Szmek	a5f1cb3bad	nspawn: add -E as alias for --setenv v2: - "=" is required, so remove the <optional> tags that v1 added	2016-04-20 09:00:39 -04:00
Lennart Poettering	70a399c43a	Merge pull request #3014 from msekletar/nspawn-empty-machine-id-v3 nspawn: always setup machine id (v3)	2016-04-11 17:27:11 +02:00
Michal Sekletar	e01ff70a77	nspawn: always setup machine id We check /etc/machine-id of the container and if it is already populated we use value from there, possibly ignoring value of --uuid option from the command line. When dealing with R/O image we setup transient machine id. Once we determined machine id of the container, we use this value for registration with systemd-machined and we also export it via container_uuid environment variable. As registration with systemd-machined is done by the main nspawn process we communicate container machine id established by setup_machine_id from outer child to the main process by unix domain socket. Similarly to PID of inner child.	2016-04-11 16:43:16 +02:00
Zbigniew Jędrzejewski-Szmek	d929b0f98b	nspawn: ignore failure to chdir CID #1322380.	2016-04-08 21:09:06 -04:00
Evgeny Vereshchagin	1c1ea21735	nspawn: don't run nspawn --port=... without libiptc support We get $ systemd-nspawn --image /dev/loop1 --port 8080:80 -n -b 3 --port= is not supported, compiled without libiptc support. instead of a ping-nc-iptables debugging session	2016-03-17 21:07:11 +00:00
Dan Walsh	68b020494d	/dev/console must be labeled with SELinux label If the user specifies an selinux_apifs_context, all content created in the container including /dev/console should use this label. Currently when this uses the default label it gets labeled user_devpts_t, which would require us to write a policy allowing container processes to manage user_devpts_t. This means that an escaped process would be allowed to attack all users' terminals as well as other container terminals. Changing the label to match the apifs_context means the processes would only be allowed to manage their specific tty. This change fixes a problem preventing RKT containers from working with systemd-nspawn.	2016-03-09 11:19:45 -05:00
Vito Caputo	9ed794a32d	tree-wide: minor formatting inconsistency cleanups	2016-02-23 14:20:34 -08:00
Vito Caputo	313cefa1d9	tree-wide: make ++/-- usage consistent WRT spacing Throughout the tree there's spurious use of spaces separating ++ and -- operators from their respective operands. Make ++ and -- operator consistent with the majority of existing uses; discard the spaces.	2016-02-22 20:32:04 -08:00
Lennart Poettering	91ba5ac7d0	Merge pull request #2589 from keszybz/resolve-tool-2 Better support of OPENPGPKEY, CAA, TLSA packets and tests	2016-02-13 11:15:41 +01:00
Zbigniew Jędrzejewski-Szmek	75f32f047c	Add memcpy_safe ISO/IEC 9899:1999 §7.21.1/2 says: Where an argument declared as size_t n specifies the length of the array for a function, n can have the value zero on a call to that function. Unless explicitly stated otherwise in the description of a particular function in this subclause, pointer arguments on such a call shall still have valid values, as described in 7.1.4. In base64_append_width memcpy was called as memcpy(x, NULL, 0). GCC 4.9 started making use of this: the call worked fine under -O0, but does something strange under -O3. This patch fixes a bug in base64_append_width(), fixes a possible bug in journal_file_append_entry_internal(), and makes use of the new function to simplify the code in other places.	2016-02-11 13:07:02 -05:00
Daniel Mack	b26fa1a2fb	tree-wide: remove Emacs lines from all files This should be handled fine now by .dir-locals.el, so no need to carry that stuff in every file.	2016-02-10 13:41:57 +01:00
Lennart Poettering	2b26a72816	nspawn: make sure --help fits in 79ch	2016-02-03 23:58:25 +01:00
Lennart Poettering	7732f92bad	nspawn: optionally run a stub init process as PID 1 This adds a new switch --as-pid2, which allows running commands as PID 2, while a stub init process is run as PID 1. This is useful in order to run arbitrary commands in a container, as PID1's semantics are different from all other processes regarding reaping of unknown children or signal handling.	2016-02-03 23:58:24 +01:00
Lennart Poettering	5f932eb9af	nspawn: add new --chdir= switch Fixes: #2192	2016-02-03 23:58:24 +01:00
Lennart Poettering	ba8e6c4d0e	nspawn: make sure --link-journal=host may be used twice in a row Fixes #2186 This fixes fall-out from `574edc9006`.	2016-01-28 20:24:28 +01:00
Lennart Poettering	8054d749c4	nspawn: make journal linking non-fatal in try and auto modes Fixes #2091	2016-01-28 20:16:44 +01:00
Michal Sekletar	61e741ed3d	nspawn: fix memory leak	2016-01-25 12:06:38 +01:00
Ismo Puustinen	a103496ca5	capabilities: keep bounding set in non-inverted format. Change the capability bounding set parser and logic so that the bounding set is kept as a positive set internally. This means that the set reflects those capabilities that we want to keep instead of drop.	2016-01-12 12:14:50 +02:00
Lennart Poettering	4afd3348c7	tree-wide: expose "p"-suffix unref calls in public APIs to make gcc cleanup easy GLIB has recently started to officially support the gcc cleanup attribute in its public API, hence let's do the same for our APIs. With this patch we'll define an xyz_unrefp() call for each public xyz_unref() call, to make it easy to use inside a __attribute__((cleanup())) expression. Then, all code is ported over to make use of this. The new calls are also documented in the man pages, with examples how to use them (well, I only added docs where the _unref() call itself already had docs, and the examples, only cover sd_bus_unrefp() and sd_event_unrefp()). This also renames sd_lldp_free() to sd_lldp_unref(), since that's how we tend to call our destructors these days. Note that this defines no public macro that wraps gcc's attribute and makes it easier to use. While I think it's our duty in the library to make our stuff easy to use, I figure it's not our duty to make gcc's own features easy to use on its own. Most likely, client code which wants to make use of this should define its own: #define _cleanup_(function) __attribute__((cleanup(function))) Or similar, to make the gcc feature easier to use. Making this logic public has the benefit that we can remove three header files whose only purpose was to define these functions internally. See #2008.	2015-11-27 19:19:36 +01:00
Lennart Poettering	4a0b58c4a3	tree-wide: use right cast macros for UIDs, GIDs and PIDs	2015-11-17 00:52:10 +01:00
Lennart Poettering	f6d6bad146	nspawn: add new --network-veth-extra= switch for defining additional veth links The new switch operates like --network-veth, but may be specified multiple times (to define multiple link pairs) and allows flexible definition of the interface names. This is an independent reimplementation of #1678, but defines different semantics, keeping the behaviour completely independent of --network-veth. It also comes with full hook-up for .nspawn files, and the matching documentation.	2015-11-12 22:04:49 +01:00
Daniel Mack	b0bc8dbd73	Merge pull request #1820 from michich/errno-v2 [v2] treewide: treatment of errno and other cleanups	2015-11-09 21:56:49 +01:00
Michal Schmidt	e1427b138f	treewide: apply errno.cocci with small manual cleanups for style.	2015-11-09 20:01:06 +01:00
Lennart Poettering	6c9e781eba	Merge pull request #1799 from jengelh/doc doc: typo and ortho fixes	2015-11-09 18:16:21 +01:00
Iago López Galeiras	6aadfa4c52	nspawn: support custom container service name We were hardcoding "systemd-nspawn" as the value of the $container env variable and "nspawn" as the service string in machined registration. This commit allows the user to configure it by setting the $SYSTEMD_NSPAWN_CONTAINER_SERVICE env variable when calling systemd-nspawn. If $SYSTEMD_NSPAWN_CONTAINER_SERVICE is not set, we use the string "systemd-nspawn" for both, fixing the previous inconsistency.	2015-11-09 16:40:05 +01:00
Jan Engelhardt	a8eaaee72a	doc: correct orthography, word forms and missing/extraneous words	2015-11-06 13:45:21 +01:00
Jan Engelhardt	b938cb902c	doc: correct punctuation and improve typography in documentation	2015-11-06 13:00:02 +01:00
Michal Schmidt	35607a8d1c	nspawn: save errno before reopening log after exec failure	2015-11-05 13:44:12 +01:00
Michal Schmidt	070edd97f3	nspawn: no fake errno The S_ISREG test does not set errno, so don't use it in the error message.	2015-11-05 13:44:11 +01:00
Michal Schmidt	4314d33f51	nspawn: simplify error returns Use the "return log_error_errno(...)" idiom to have fewer curly braces. The last hunk also fixes the return value of setup_journal(), but the fix has no practical effect.	2015-11-05 13:44:10 +01:00
Michal Schmidt	709f6e46a3	treewide: use the negative error codes returned by our functions Our functions return negative error codes. Do not rely on errno being set after calling our own functions.	2015-11-05 13:44:06 +01:00
Lennart Poettering	97044145b4	core,nspawn: minor coding style fixes	2015-10-31 19:09:20 +01:00
Susant Sahani	6cbe4ed1e1	nspawn: port to extract_first_word	2015-10-28 22:59:01 +05:30
Lennart Poettering	b5efdb8af4	util-lib: split out allocation calls into alloc-util.[ch]	2015-10-27 13:45:53 +01:00
Lennart Poettering	15a5e95075	util-lib: split out printf() helpers to stdio-util.h	2015-10-27 13:25:57 +01:00
Lennart Poettering	430f0182b7	src/basic: rename audit.[ch] → audit-util.[ch] and capability.[ch] → capability-util.[ch] The files are named too generically, so that they might conflict with the upstream project headers. Hence, let's add a "-util" suffix, to clarify that these are just our utility headers and not any official upstream headers.	2015-10-27 13:25:57 +01:00
Lennart Poettering	affb60b1ef	util-lib: split out umask-related code to umask-util.h	2015-10-27 13:25:56 +01:00
Lennart Poettering	8fcde01280	util-lib: split stat()/statfs()/statvfs() related calls into stat-util.[ch]	2015-10-27 13:25:56 +01:00
Lennart Poettering	f4f15635ec	util-lib: move a number of fs operations into fs-util.[ch]	2015-10-27 13:25:56 +01:00
Lennart Poettering	4349cd7c1d	util-lib: move mount related utility calls to mount-util.[ch]	2015-10-27 13:25:55 +01:00
Lennart Poettering	6bedfcbb29	util-lib: split string parsing related calls from util.[ch] into parse-util.[ch]	2015-10-27 13:25:55 +01:00
Lennart Poettering	2583fbea8e	socket-util: move remaining socket-related calls from util.[ch] to socket-util.[ch]	2015-10-26 01:24:39 +01:00
Lennart Poettering	b1d4f8e154	util-lib: split out user/group/uid/gid calls into user-util.[ch]	2015-10-26 01:24:38 +01:00
Lennart Poettering	3ffd4af220	util-lib: split out fd-related operations into fd-util.[ch] There are more than enough to deserve their own .c file, hence move them over.	2015-10-25 13:19:18 +01:00
Lennart Poettering	07630cea1f	util-lib: split out string related calls from util.[ch] into their own files string-util.[ch] There are more than enough calls doing string manipulations to deserve their own files, hence do something about it. This patch also sorts the #include blocks of all files that needed to be updated, according to the sorting suggestions from CODING_STYLE. Since pretty much every file needs our string manipulation functions this effectively means that most files have sorted #include blocks now. Also touches a few unrelated include files.	2015-10-24 23:05:02 +02:00
Lennart Poettering	0f03c2a4c0	path-util: unify how we process paths specified on the command line Let's introduce a common function that makes relative paths absolute and warns about any errors while doing so.	2015-10-24 23:03:49 +02:00
Lennart Poettering	0f47436510	util-lib: get_current_dir_name() can return errors other than ENOMEM get_current_dir_name() can return a variety of errors, not just ENOMEM, hence don't blindly turn its errors to ENOMEM, but return correct errors in path_make_absolute_cwd(). This trickles down into a couple of other functions, some of which receive unrelated minor fixes too with this commit.	2015-10-24 23:03:49 +02:00
Lennart Poettering	16fb773ee3	nspawn: don't try to resolve passed binary before entering namespace Otherwise we might follow the symlinks on the host, instead of the container. Fixes #1400	2015-10-22 01:59:25 +02:00
Lennart Poettering	0e2656744f	nspawn: rework how we determine private networking settings Make sure we acquire CAP_NET_ADMIN if we require virtual networking. Make sure we imply virtual ethernet correctly when a bridge is requested. Fixes: #1511 Fixes: #1554 Fixes: #1590	2015-10-22 01:59:25 +02:00
Lennart Poettering	5bcd08db28	btrfs: beef-up btrfs support with a limited understanding of quota With this change we understand more than just leaf quota groups for btrfs file systems. Specifically: - When we create a subvolume we can now optionally add the new subvolume to all qgroups its parent subvolume was member of too. Alternatively it is also possible to insert an intermediary quota group between the parent's qgroups and the subvolume's leaf qgroup, which is useful for a concept of "subtree" qgroups, that contain a subvolume and all its children. - The remove logic for subvolumes has been updated to optionally remove any leaf qgroups or "subtree" qgroups, following the logic above. - The snapshot logic for subvolumes has been updated to replicate the original qgroup setup of the source, if it follows the "subtree" design described above. It will not cover qgroup setups that introduce arbitrary qgroups, especially those orthogonal to the subvolume hierarchy. This also tries to be more graceful when setting up /var/lib/machines as btrfs. For example, if mkfs.btrfs is missing we don't even try to set it up as loopback device. Fixes #1559 Fixes #1129	2015-10-22 01:59:25 +02:00
Iago López Galeiras	d167824896	nspawn: skip /sys-as-tmpfs if we don't use private-network Since v3.11/7dc5dbc ("sysfs: Restrict mounting sysfs"), the kernel doesn't allow mounting sysfs if you don't have CAP_SYS_ADMIN rights over the network namespace. So the mounting /sys as a tmpfs code introduced in `d8fc6a000f` doesn't work with user namespaces if we don't use private-net. The reason is that we mount sysfs inside the container and we're in the network namespace of the host but we don't have CAP_SYS_ADMIN over that namespace. To fix that, we mount /sys as a sysfs (instead of tmpfs) if we don't use private network and ignore the /sys-as-a-tmpfs code if we find that /sys is already mounted as sysfs. Fixes #1555	2015-10-20 10:19:23 +02:00
Lennart Poettering	ae3dde8012	machinectl: fix race when opening new shells with "machinectl shell" Previously, we'd allocate the TTY, spawn a service on it, but immediately start processing the TTY and forwarding it to whatever the command was started on. This is however problematic, as the TTY might get actually opened only much later by the service. We'd hence first get EIOs on the master while the other side was still closed, consider it hung up, and terminate the session. With this change we add a flag to the pty forwarding logic: PTY_FORWARD_IGNORE_INITIAL_VHANGUP. If set, we'll ignore all hangups (i.e. EIOs) on the master PTY until the first byte is successfully read. From that point on we consider a hangup/EIO a regular connection termination. This way, we handle the race: when we get EIO initially we'll ignore it, until the connection is properly set up, at which time we start honouring it.	2015-10-07 20:10:48 +02:00
Lennart Poettering	d8fc6a000f	nspawn: mount /sys as tmpfs, and then mount only select subdirs of the real sysfs below it This way we can hide things like /sys/firmware or /sys/hypervisor from the container, while keeping the device tree around. While this is a security benefit in itself it also allows us to fix issue #1277. Previously we'd mount /sys before creating the user namespace, in order to be able to mount /sys/fs/cgroup/* beneath it (which resides in it), which we can only mount outside of the user namespace. To ensure that the user namespace owns the network namespace we'd set up the network namespace at the same time as the user namespace. Thus, we'd still see the /sys/class/net/ from the originating network namespace, even though we are in our own network namespace now. With this patch, /sys is mounted before transitioning into the user namespace as tmpfs, so that we can also mount /sys/fs/cgroup/* into it this early. The directories such as /sys/class/ are then later added in from the real sysfs from inside the network and user namespace so that they actually show what is available in it. Fixes #1277	2015-09-30 15:19:33 +02:00
Lennart Poettering	403af78c80	nspawn: fix user namespace support We didn't actually pass ownership of /run to the UID in the container since some releases, let's fix that.	2015-09-30 12:48:17 +02:00
Lennart Poettering	db3b1dedb2	nspawn: order includes	2015-09-30 12:24:06 +02:00
Lennart Poettering	3f6fd1ba65	util: introduce common version() implementation and use it everywhere This also allows us to drop build.h from a ton of files, hence do so. Since we touched the #includes of those files, let's order them properly according to CODING_STYLE.	2015-09-29 21:08:37 +02:00
Lennart Poettering	189d5bac5c	util: unify implementation of NOP signal handler This is highly complex code after all, we really should make sure to only keep one implementation of this extremely difficult function around.	2015-09-29 21:08:37 +02:00
Lennart Poettering	2feceb5eb9	tree-wide: take benefit of the fact that fdset_free() returns NULL	2015-09-29 21:08:37 +02:00
Lennart Poettering	3ee897d6c2	tree-wide: port more code to use send_one_fd() and receive_one_fd() Also, make it slightly more powerful, by accepting a flags argument, and make it safe for handling if more than one cmsg attribute happens to be attached.	2015-09-29 21:08:37 +02:00
Krzesimir Nowak	c0ffce2bd1	nspawn, machined: fix comments and error messages A bunch of "Client -> Child" fixes and one barrier-enumerator fix. (David: rebased on master)	2015-09-22 14:17:03 +02:00
Krzesimir Nowak	327e26d689	nspawn: close unneeded sockets in outer child (David: Note, this is just a cleanup and doesn't fix any bugs)	2015-09-22 14:11:44 +02:00
David Herrmann	d960371482	util: introduce {send,receive}_one_fd() Introduce two new helpers that send/receive a single fd via a unix transport. Also make nspawn use them instead of hard-coding it. Based on a patch by Krzesimir Nowak.	2015-09-22 14:09:54 +02:00
Lennart Poettering	59f448cf15	tree-wide: never use the off_t unless glibc makes us use it off_t is a really weird type as it is usually 64bit these days (at least in sane programs), but could theoretically be 32bit. We don't support off_t as 32bit builds though, but still constantly deal with safely converting from off_t to other types and back for no point. Hence, never use the type anymore. Always use uint64_t instead. This has various benefits, including that we can expose these values directly as D-Bus properties, and also that the values parse the same in all cases.	2015-09-10 18:16:18 +02:00
Lennart Poettering	82116c4329	nspawn: also close uid shift socket in the parent We should really close all parent sides of our child/parent socket pairs.	2015-09-08 01:22:46 +02:00
Lennart Poettering	76d448820e	nspawn: short reads do not set errno, hence don't try to print it	2015-09-08 01:22:26 +02:00
Lennart Poettering	4610de5022	inspawn: switch from SOCK_DGRAM to SOCK_SEQPACKET for internal socketpairs SOCK_DGRAM and SOCK_SEQPACKET have very similar semantics when used with socketpair(). However, SOCK_SEQPACKET has the advantage of knowing a hangup concept, since it is inherently connection-oriented. Since we use socket pairs to communicate between the nspawn main process and the nspawn child process, where the child might die abnormally it's interesting to us to learn about this via hangups if the child side of the pair is closed. Hence, let's switch to SOCK_SEQPACKET for these internal communication sockets. Fixes #956.	2015-09-08 01:17:47 +02:00
Lennart Poettering	07fa00f9d9	nspawn: properly propagate errors when we fail to set something up	2015-09-08 01:17:15 +02:00
Lennart Poettering	8fe0087ede	nspawn: sort and clean up included header list Let's remove unnecessary inclusions, and order the list alphabetically as suggested in CODING_STYLE now.	2015-09-07 18:56:54 +02:00
Lennart Poettering	2b5c04d59c	nspawn: remove nspawn.h, it's empty now	2015-09-07 18:47:34 +02:00
Lennart Poettering	ee64508006	nspawn: split out --uid= logic into nspawn-setuid.[ch]	2015-09-07 18:44:31 +02:00
Lennart Poettering	b7103bc5f4	nspawn: split out machined registration code to nspawn-register.[ch]	2015-09-07 18:44:31 +02:00
Lennart Poettering	34829a324b	nspawn: split out cgroup related calls into nspawn-cgroup.[ch]	2015-09-07 18:44:30 +02:00
Lennart Poettering	9a2a5625bf	nspawn: split out network related code to nspawn-network.[ch]	2015-09-07 18:44:30 +02:00
Lennart Poettering	7a8f63251d	nspawn: split all port exposure code into nspawn-expose-port.[ch]	2015-09-07 18:44:30 +02:00
Lennart Poettering	e83bebeff7	nspawn: split out mount related functions into a new nspawn-mount.c file	2015-09-07 18:44:30 +02:00
Lennart Poettering	f757855e81	nspawn: add new .nspawn files for container settings .nspawn files are simple settings files that may accompany container images and directories and contain settings otherwise passed on the nspawn command line. This provides an efficient way to attach execution data directly to containers.	2015-09-06 01:49:06 +02:00
Lennart Poettering	98e4d8d763	nspawn: enable all controllers we can for the "payload" subcgroup we create In the unified hierarchy delegating controller access is safe, hence make sure to enable all controllers for the "payload" subcgroup if we create it, so that the container will have all controllers enabled the nspawn service itself has.	2015-09-04 09:07:31 +02:00
Lennart Poettering	efdb02375b	core: unified cgroup hierarchy support This patch set adds full support for the new unified cgroup hierarchy logic of modern kernels. A new kernel command line option "systemd.unified_cgroup_hierarchy=1" is added. If specified the unified hierarchy is mounted to /sys/fs/cgroup instead of a tmpfs. No further hierarchies are mounted. The kernel command line option defaults to off. We can turn it on by default as soon as the kernel's APIs regarding this are stabilized (but even then downstream distros might want to turn this off, as this will break any tools that access cgroupfs directly). It is possible to choose for each boot individually whether the unified or the legacy hierarchy is used. nspawn will by default provide the legacy hierarchy to containers if the host is using it, and the unified otherwise. However it is possible to run containers with the unified hierarchy on a legacy host and vice versa, by setting the $UNIFIED_CGROUP_HIERARCHY environment variable for nspawn to 1 or 0, respectively. The unified hierarchy provides reliable cgroup empty notifications for the first time, via inotify. To make use of this we maintain one manager-wide inotify fd, and add each cgroup to it. This patch also removes cg_delete() which is unused now. On kernel 4.2 only the "memory" controller is compatible with the unified hierarchy, hence that's the only controller systemd exposes when booted in unified hierarchy mode. This introduces a new enum for enumerating supported controllers, plus a related enum for the mask bits mapping to it. The core is changed to make use of this everywhere. This moves PID 1 into a new "init.scope" implicit scope unit in the root slice. This is necessary since on the unified hierarchy cgroups may either contain subgroups or processes but not both. 
PID 1 hence has to move out of the root cgroup (strictly speaking the root cgroup is the only one where processes and subgroups are still allowed, but in order to support containers nicely, we move PID 1 into the new scope in all cases.) This new unit is also used on legacy hierarchy setups. It's actually pretty useful on all systems, as it can then be used to filter journal messages coming from PID 1, and so on. The root slice ("-.slice") is now implicitly created and started (and does not require a unit file on disk anymore), since that's where "init.scope" is located and the slice needs to be started before the scope can. To check whether we are in unified or legacy hierarchy mode we use statfs() on /sys/fs/cgroup. If the .f_type field reports tmpfs we are in legacy mode, if it reports cgroupfs we are in unified mode. This patch set carefully makes sure that cgls and cgtop continue to work as desired. When invoking nspawn as a service it will implicitly create two subcgroups in the cgroup it is using, one to move the nspawn process into, the other to move the actual container processes into. This is done because of the requirement that cgroups may either contain processes or other subgroups.	2015-09-01 23:52:27 +02:00
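The statfs() check described in the commit above might look roughly like the following sketch. It is not the code from the patch: the enum and the function names classify_cgroup_fs() and detect_cgroup_mode() are hypothetical, the magic numbers are the values published in the kernel's linux/magic.h, and treating both the cgroup and cgroup2 magic values as "cgroupfs" is an assumption made so the sketch covers newer kernels too.

```c
#include <assert.h>
#include <sys/vfs.h>

/* Filesystem magic numbers as listed in linux/magic.h; defined here as
 * fallbacks so the sketch does not depend on kernel header versions. */
#ifndef TMPFS_MAGIC
#define TMPFS_MAGIC 0x01021994
#endif
#ifndef CGROUP_SUPER_MAGIC
#define CGROUP_SUPER_MAGIC 0x27e0eb
#endif
#ifndef CGROUP2_SUPER_MAGIC
#define CGROUP2_SUPER_MAGIC 0x63677270
#endif

/* Hypothetical result type, not taken from systemd. */
enum cgroup_mode {
    CGROUP_MODE_UNKNOWN = -1,
    CGROUP_MODE_LEGACY = 0,
    CGROUP_MODE_UNIFIED = 1
};

/* tmpfs at the mount point means legacy mode, cgroupfs means unified mode.
 * Assumption: either cgroup magic value counts as "cgroupfs" here. */
enum cgroup_mode classify_cgroup_fs(long f_type) {
    if (f_type == TMPFS_MAGIC)
        return CGROUP_MODE_LEGACY;
    if (f_type == CGROUP_SUPER_MAGIC || f_type == CGROUP2_SUPER_MAGIC)
        return CGROUP_MODE_UNIFIED;
    return CGROUP_MODE_UNKNOWN;
}

/* Run statfs() on the given path (normally /sys/fs/cgroup) and classify it. */
enum cgroup_mode detect_cgroup_mode(const char *path) {
    struct statfs fs;

    if (statfs(path, &fs) < 0)
        return CGROUP_MODE_UNKNOWN;

    return classify_cgroup_fs((long) fs.f_type);
}
```

A caller would pass "/sys/fs/cgroup" to detect_cgroup_mode(); the classification is split out so it can be checked without depending on how the host is booted.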
Lennart Poettering	a19222e1d3	nspawn: don't try to extract quotes from option string, glibc doesn't do that either Follow-up regarding #649.	2015-08-29 19:43:48 +02:00
Eugene Yakubovich	5e5bfa6e1c	nspawn: add (no)rbind option to --bind and --bind-ro --bind and --bind-ro perform the bind mount non-recursively. It is sometimes (often?) desirable to do a recursive mount. This patch adds an optional set of bind mount options in the form of: --bind=src-path:dst-path:options options are comma separated and currently only "rbind" and "norbind" are allowed. Default value is "rbind".	2015-08-28 18:06:05 -07:00
Lennart Poettering	c1521918b4	nspawn: make sure --template= and --machine= may be combined Fixes #1018. Based on a patch from Seth Jennings.	2015-08-25 20:28:31 +02:00
Thomas Hindoe Paaboel Andersen	62f176068c	remove unused variables	2015-08-21 22:19:10 +02:00
Richard Maw	62f9f39a45	nspawn: Allow : characters in overlay paths : characters can be entered with the \: escape sequence.	2015-08-07 15:50:43 +00:00
Richard Maw	872d0dbdc3	nspawn: escape paths in overlay mount options Overlayfs uses , as an option separator and : as a list separator. These characters are both valid in file paths, so overlayfs allows file paths which contain these characters to backslash escape these values.	2015-08-07 15:50:43 +00:00
Richard Maw	e4a5d9edee	nspawn: Allow : characters in nspawn --bind paths : characters in bind paths can be entered as the \: escape sequence.	2015-08-07 15:50:43 +00:00
Richard Maw	6330ee1083	nspawn: Allow : characters in --tmpfs path This now accepts : characters with the \: escape sequence. Other escape sequences are also interpreted, but having a \ in your file path is less likely than :, so this shouldn't break anyone's existing tools.	2015-08-07 15:50:42 +00:00
Zbigniew Jędrzejewski-Szmek	73974f6768	Merge branch 'hostnamectl-dot-v2' Manual merge of https://github.com/systemd/systemd/pull/751.	2015-08-05 21:02:41 -04:00
Zbigniew Jędrzejewski-Szmek	ae691c1d93	hostname-util: get rid of unused parameter of hostname_cleanup() All users are now setting lowercase=false.	2015-08-05 20:49:21 -04:00
David Herrmann	97b11eedff	tree-wide: introduce mfree() Pretty trivial helper which wraps free() but returns NULL, so we can simplify this: free(foobar); foobar = NULL; to this: foobar = mfree(foobar);	2015-07-31 19:56:38 +02:00
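The mfree() helper from the commit above is small enough to sketch in full. The version below is a minimal reading of the description (wrap free(), return NULL), not a copy of the systemd source; demo_mfree() is a hypothetical caller added only to show the one-line idiom.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Sketch of the helper as described: free the memory and hand back NULL,
 * so freeing and resetting a pointer fits in one assignment. */
void *mfree(void *memory) {
    free(memory);
    return NULL;
}

/* Hypothetical demo: returns 1 if the pointer is NULL after mfree(),
 * -1 if the allocation failed. */
int demo_mfree(void) {
    char *foobar = malloc(16);

    if (!foobar)
        return -1;
    strcpy(foobar, "hello");

    /* Replaces the two-step "free(foobar); foobar = NULL;". */
    foobar = mfree(foobar);

    return foobar == NULL;
}
```

Because free(NULL) is a no-op, mfree(NULL) is safe as well, which keeps cleanup paths free of extra NULL checks.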
Daniel Mack	2fc09a9cdd	tree-wide: use free_and_strdup() Use free_and_strdup() where appropriate and replace equivalent, open-coded versions.	2015-07-30 13:09:01 +02:00
Mike Gilbert	3dce891505	nspawn: Don't pass uid mount option for devpts Mounting devpts with a uid breaks pty allocation with recent glibc versions, which expect that the kernel will set the correct owner for user-allocated ptys. The kernel seems to be smart enough to use the correct uid for root when we switch to a user namespace. This resolves #337.	2015-07-22 22:34:57 -04:00
Lennart Poettering	1434eb3838	Merge pull request #500 from zonque/fileio fileio: consolidate write_string_file*()	2015-07-08 17:13:53 -03:00
Zbigniew Jędrzejewski-Szmek	af86c44038	Remove repeated 'the's	2015-07-07 07:40:53 -04:00
Daniel Mack	ad118bda15	tree-wide: fix write_string_file() user that should not create files The latest consolidation cleanup of write_string_file() revealed some users of that helper which should have used write_string_file_no_create() in the past but didn't. Basically, all existing users that write to files in /sys and /proc should not expect to write to a file which is not yet existent.	2015-07-06 19:27:20 -04:00
Daniel Mack	4c1fc3e404	fileio: consolidate write_string_file*() Merge write_string_file(), write_string_file_no_create() and write_string_file_atomic() into write_string_file() and provide a flags mask that allows combinations of atomic writing, newline appending and automatic file creation. Change all users accordingly.	2015-07-06 19:19:25 -04:00
Lennart Poettering	eff8efe671	Merge pull request #492 from richardmaw-codethink/nspawn-automatic-uid-shift-fix-v2 nspawn: Communicate determined UID shift to parent version 2	2015-07-06 20:53:56 +02:00
Richard Maw	825d5287d7	nspawn: Communicate determined UID shift to parent There is logic to determine the UID shift from the file-system, rather than having it be explicitly passed in. However, this needs to happen in the child process that sets up the mounts, as what's important is the UID of the mounted root, rather than the mount-point. Setting up the UID map needs to happen in the parent because the inner child needs to have been started, and the outer child is no longer able to access the uid_map file, since it lost access to it when setting up the mounts for the inner child. So we need to communicate the uid shift back out, along with the PID of the inner child process. Failing to communicate this means that the invalid UID shift, which is the value used to specify "this needs to be determined from the file system" is left invalid, so setting up the user namespace's UID shift fails.	2015-07-06 13:23:19 +01:00
Lennart Poettering	dbb60d6944	nspawn: fix indenting	2015-07-06 12:35:51 +02:00
David Herrmann	6acc94b621	Merge pull request #485 from poettering/sd-bus-flush-close-unref sd-bus: introduce new sd_bus_flush_close_unref() call	2015-07-04 12:41:01 +02:00
Lennart Poettering	03976f7b4a	sd-bus: introduce new sd_bus_flush_close_unref() call sd_bus_flush_close_unref() is a call that simply combines sd_bus_flush() (which writes all unwritten messages out) + sd_bus_close() (which terminates the connection, releasing all unread messages) + sd_bus_unref() (which frees the connection). The combination of this call is used pretty frequently in systemd tools right before exiting, and should also be relevant for most external clients, and is hence useful to cover in a call of its own. Previously the combination of the three calls was already done in the _cleanup_bus_close_unref_ macro, but this was only available internally. Also see #327	2015-07-03 19:49:03 +02:00
Lennart Poettering	391567f479	Revert "nspawn: determine_uid_shift before forking"	2015-07-03 12:30:53 +02:00
Tom Gundersen	b7a049dba5	Merge pull request #429 from richardmaw-codethink/nspawn-userns-uid-shift-autodetection-fix nspawn: determine_uid_shift before forking	2015-06-30 18:24:14 +02:00
Richard Maw	7fe2bb84c4	nspawn: determine_uid_shift before forking It is needed in one branch of the fork, but calculated in another branch. Failing to do this means using --private-users without specifying a uid shift always fails because it tries to shift the uid to UID_INVALID.	2015-06-30 14:05:58 +00:00
Richard Maw	3c59d4f21f	nspawn: Don't remount with fewer options When we do a MS_BIND mount, it inherits the flags of its parent mount. When we do a remount, it sets the flags to exactly what is specified. If we are in a user namespace then these mount points have their flags locked, so you can't reduce the protection. As a consequence, the default setup of mount_all doesn't work with user namespaces. However if we ensure we add the mount flags of the parent mount when remounting, then we aren't removing mount options, so we aren't trying to unlock an option that we aren't allowed to.	2015-06-30 14:05:03 +00:00
Lennart Poettering	68a313c592	nspawn: suppress warning when /etc/resolv.conf is a valid symlink In such a case let's suppress the warning (downgrade to LOG_DEBUG), under the assumption that the user has no config file to update in its place, but a symlink that points to something like resolved's automatically managed resolve.conf file. While we are at it, also stop complaining if we cannot write /etc/resolv.conf due to a read-only disk, given that there's little we could do about it.	2015-06-18 19:45:18 +02:00
Lennart Poettering	503546da7c	nspawn: when exiting, flush all remaining bytes from the pty to stdout This is a simpler fix for #210, it simply uses copy_bytes() for the copying.	2015-06-17 20:54:45 +02:00
Djalal Harouni	b774fb7f00	nspawn: check if kernel supports userns as early as possible If the kernel does not support user namespaces then one of the children created by nspawn parent will fail at clone(CLONE_NEWUSER) with the generic error EINVAL and without logging the error. At the same time the parent may also try to setup the user namespace and will fail with another error. To improve this, check if the kernel supports user namespaces as early as possible.	2015-06-16 17:30:45 +01:00
Lennart Poettering	86b85cf440	Merge pull request #214 from poettering/signal-rework-2 everywhere: port everything to sigprocmask_many() and friends	2015-06-15 20:35:18 +02:00
Lennart Poettering	72c0a2c255	everywhere: port everything to sigprocmask_many() and friends This ports a lot of manual code over to sigprocmask_many() and friends. Also, we now consistently check for sigprocmask() failures with assert_se(), since the call cannot realistically fail unless there's a programming error. Also encloses a few sd_event_add_signal() calls with (void) when we ignore the return values for it knowingly.	2015-06-15 20:13:23 +02:00
Lennart Poettering	770b5ce4fc	tmpfiles: automatically remove old machine snapshots at boot Remove old temporary snapshots, but only at boot. Ideally we'd have "self-destroying" btrfs snapshots that go away if the last reference to it does. To mimic a scheme like this at least remove the old snapshots on fresh boots, where we know they cannot be referenced anymore. Note that we actually remove all temporary files in /var/lib/machines/ at boot, which should be safe since the directory has defined semantics. In the root directory (where systemd-nspawn --ephemeral places snapshots) we are more strict, to avoid removing unrelated temporary files. This also splits out nspawn/container related tmpfiles bits into a new tmpfiles snippet to systemd-nspawn.conf	2015-06-15 19:28:55 +02:00
Lennart Poettering	14bcf25c8b	util: when creating temporary file names, allow including extra id string in it This adds a "char *extra" parameter to tempfn_xxxxxx(), tempfn_random(), tempfn_random_child(). If non-NULL this string is included in the middle of the newly created file name. This is useful for being able to distinguish the kind of temporary file when we see one. This also adds tests for the three calls. For now, we don't make use of this at all, but port all users over.	2015-06-15 19:28:55 +02:00
Daniel Mack	12c2884c55	firewall: rename fw-util.[ch] → firewall-util.[ch] The names fw-util.[ch] are too ambiguous, better rename the files to firewall-util.[ch]. Also rename the test accordingly.	2015-06-15 14:08:02 +02:00
Lennart Poettering	5feece76fb	Merge pull request #205 from endocode/iaguis/seccomp-v2 nspawn: make seccomp loading errors non-fatal	2015-06-15 11:45:48 +02:00
Iago López Galeiras	9b1cbdc6e1	nspawn: make seccomp loading errors non-fatal seccomp_load returns -EINVAL when seccomp support is not enabled in the kernel [1]. This should be a debug log, not an error that interrupts nspawn. If the seccomp filter can't be set and audit is enabled, the user will get an error message anyway. [1]: http://man7.org/linux/man-pages/man2/prctl.2.html	2015-06-15 10:55:31 +02:00
Tom Gundersen	1c4baffc18	sd-netlink: rename from sd-rtnl	2015-06-13 19:52:54 +02:00
Tom Gundersen	31710be527	sd-rtnl: make joining broadcast groups implicit	2015-06-11 17:47:40 +02:00
Lennart Poettering	ce30c8dcb4	tree-wide: whenever we fork off a foreign child process reset signal mask/handlers Also, when the child is potentially long-running make sure to set a death signal. Also, ignore the result of the reset operations explicitly by casting them to (void).	2015-06-10 01:28:58 +02:00
Lennart Poettering	24882e06c1	util: split out signal-util.[ch] from util.[ch] No functional changes.	2015-05-29 20:14:11 +02:00
Martin Pitt	e26d6ce517	path-util: Change path_is_mount_point() symlink arg from bool to flags This makes path_is_mount_point() consistent with fd_is_mount_point() wrt. flags.	2015-05-29 17:42:44 +02:00
Tom Gundersen	cc9fce6554	nspawn: fix memleak This was a typo, swapping prefix_root() in place of prefix_roota(). Fixes CID 1299640.	2015-05-25 23:01:50 +02:00
Tom Gundersen	2371271c2a	nspawn: avoid memleak Simplify the code a bit, at the cost of potentially duplicating some memory unnecessarily. Fixes CID 1299641.	2015-05-25 22:58:26 +02:00
Tom Gundersen	4b53a9d21b	nspawn: drop some debugging code These have no effect. Fixes CID 1299643.	2015-05-25 22:49:14 +02:00
Tom Gundersen	f001a83522	nspawn: make coverity happy Rather than checking the return of asprintf() we are checking if buf gets allocated, make it clear that it is ok to ignore the return value. Fixes CID 1299644.	2015-05-25 22:27:29 +02:00
Umut Tezduyar Lindskog	637aa8a36c	nspawn: be verbose about interface names Allowed interface name length is relatively small. Let's not make users go into the source code to figure out what happened. --machine=debian-tree conflicts with --machine=debian-tree2 ex: Failed to add new veth \ interfaces (host0, vb-debian-tree): File exists	2015-05-24 22:39:09 +02:00
Lennart Poettering	5ba7a26847	nspawn: prohibit access to the kernel log buffer by default Unless CAP_SYSLOG is explicitly passed block all access to kmsg	2015-05-21 20:49:24 +02:00
Lennart Poettering	050f727728	util: introduce PERSONALITY_INVALID as macro for 0xffffffffLU	2015-05-21 19:48:49 +02:00
Lennart Poettering	03cfe0d514	nspawn: finish user namespace support	2015-05-21 16:32:01 +02:00
Lennart Poettering	6458ec20b5	core,nspawn: unify code that moves the root dir	2015-05-20 14:38:12 +02:00
Alban Crequy	6b7d2e9ea4	nspawn: close extra fds before execing init When systemd-nspawn gets exec*()ed, it inherits the following file descriptors: - 0, 1, 2: stdin, stdout, stderr - SD_LISTEN_FDS_START, ... SD_LISTEN_FDS_START+LISTEN_FDS: file descriptors passed by the system manager (useful for socket activation). They are passed to the child process (process leader). - extra lock fd: rkt passes a locked directory as an extra fd, so the directory remains locked as long as the container is alive. systemd-nspawn used to close all open fds except 0, 1, 2 and the SD_LISTEN_FDS_START..SD_LISTEN_FDS_START+LISTEN_FDS. This patch delays the close just before the exec so the nspawn process (parent) keeps the extra fds open. This patch supersedes the previous attempt ("cloexec extraneous fds"): http://lists.freedesktop.org/archives/systemd-devel/2015-May/031608.html	2015-05-18 22:24:15 +02:00
Lennart Poettering	958b66ea16	util: split all hostname related calls into hostname-util.c	2015-05-18 17:10:07 +02:00
Stefan Junker	ce5b3ad450	nspawn: allow access to device nodes listed in --bind= and --bind-ro= switches https://bugs.freedesktop.org/show_bug.cgi?id=90385	2015-05-14 22:51:05 +02:00
Iago López Galeiras	875e1014dd	nspawn: skip symlink to a combined cgroup hierarchy if it already exists If a symlink to a combined cgroup hierarchy already exists and points to the right path, skip it. This avoids an error when the cgroups are set manually before calling nspawn.	2015-05-13 16:03:07 +02:00
Iago López Galeiras	54b4755f15	nspawn: only mount the cgroup root if it's not already mounted This allows the user to set the cgroups manually before calling nspawn.	2015-05-13 15:56:59 +02:00
Lennart Poettering	5a8af538ae	nspawn: rework custom mount point order, and add support for overlayfs Previously all bind mounts were applied in the order specified, followed by all tmpfs mounts in the order specified. This is problematic, if bind mounts shall be placed within tmpfs mounts. This patch hence reworks the custom mount point logic, and always applies them in strict prefix-first order. This means the order of mounts specified on the command line becomes irrelevant, the right operation will always be executed. While we are at it this commit also adds native support for overlayfs mounts, as supported by recent kernels.	2015-05-13 14:07:26 +02:00
Lennart Poettering	27023c0ef5	nspawn: pass on kill signal setting to container scope Let's just pass on what the user set for us.	2015-05-11 22:10:36 +02:00
Lennart Poettering	1a2399e57d	nspawn: when run as a service, don't ask machined for termination of ourselves	2015-04-28 21:34:23 +02:00
Lennart Poettering	773ce3d89c	nspawn: make sure we install the device policy if nspawn is run as unit as on the command line	2015-04-28 21:34:23 +02:00
Lennart Poettering	aee327b816	nspawn: don't inherit read-only flag from disk image if --ephemeral is used When --ephemeral is used there's no need to keep the image read-only, so let's not do that then.	2015-04-22 16:56:51 +02:00
Lennart Poettering	10a8700606	tree-wide: get rid of more strerror() calls	2015-04-21 18:05:44 +02:00
Ronny Chevalier	288a74cce5	shared: add terminal-util.[ch]	2015-04-11 00:34:02 +02:00
Ronny Chevalier	3df3e884ae	shared: add random-util.[ch]	2015-04-11 00:11:13 +02:00
Ronny Chevalier	0b452006de	shared: add process-util.[ch]	2015-04-10 23:54:49 +02:00
Ronny Chevalier	6482f6269c	shared: add formats-util.h	2015-04-10 23:54:48 +02:00
Lennart Poettering	da00518b3f	path-util: fix more path_is_mount `e792e890f` fallout	2015-04-07 16:03:45 +02:00
Lennart Poettering	f70a17f8d4	btrfs: add support for recursive btrfs snapshotting	2015-04-06 15:26:59 +02:00
Lennart Poettering	e9bc1871b9	btrfs: make btrfs_subvol_snapshot() parameters a flags field	2015-04-06 14:54:58 +02:00
Lennart Poettering	d9e2daaf3d	btrfs: support recursively removing btrfs snapshots	2015-04-06 11:28:16 +02:00
Lennart Poettering	c687863750	util: rework rm_rf() logic - Move to its own file rm-rf.c - Change parameters into a single flags parameter - Remove "honour sticky" logic, it's unused these days	2015-04-06 10:57:53 +02:00
Alban Crequy	81f5049b7c	nspawn: fallback on bind mount when mknod fails Some systems abusively restrict mknod, even when the device node already exists in /dev. This is unfortunate because it prevents systemd-nspawn from creating the basic devices in /dev in the container. This patch implements a workaround: when mknod fails, fallback on bind mounts. Additionally, /dev/console was created with a mknod with the same major/minor as /dev/null before bind mounting a pts on it. This patch removes the mknod and creates an empty regular file instead. In order to test this patch, I used the following configuration, which I think should replicate the system with the abusive restriction on mknod: # grep devices /proc/self/cgroup 4:devices:/user.slice/restrict # cat /sys/fs/cgroup/devices/user.slice/restrict/devices.list c 1:9 r c 5:2 rw c 136:* rw # systemd-nspawn --register=false -D . v2: - remove "bind", it is not needed since there is already MS_BIND v3: - fix error management when calling touch() - fix lowercase in error message	2015-03-31 17:21:03 +02:00
Lennart Poettering	4f923a1984	nspawn: drop sd_booted() check We have no such check in any of the other tools, hence don't have one in nspawn either. (This should make things nicer for Rocket, among other things) Note: removing this check does not mean that we support running nspawn on non-systemd. We explicitly don't. It just means that we remove the check for running it like that. You are still on your own if you do...	2015-03-31 15:36:53 +02:00
Iago López Galeiras	4543768d13	nspawn: change filesystem type from "bind" to NULL in mount() syscalls Try to keep syscalls as minimal as possible.	2015-03-31 15:36:53 +02:00
Zbigniew Jędrzejewski-Szmek	48861960ac	nspawn: tell coverity that we ignore return value CID #1271353.	2015-03-13 23:42:16 -04:00
David Herrmann	15411c0cb1	tree-wide: there is no ENOTSUP on linux Replace ENOTSUP by EOPNOTSUPP as this is what linux actually uses.	2015-03-13 14:10:39 +01:00
Zbigniew Jędrzejewski-Szmek	8a16a7b4e7	nspawn: fix use-after-free and leak in error paths CID #1257765.	2015-03-07 14:19:20 -05:00
Jay Faulkner	9a71b1122c	nspawn: Map all seccomp filters to capabilities This change makes it so all seccomp filters are mapped to the appropriate capability and are only added if that capability was not requested when running the container. This unbreaks the remaining use cases broken by the addition of seccomp filters without respecting requested capabilities. Co-Authored-By: Clif Houck <me@clifhouck.com> [zj: - adapt to our coding style, make struct anonymous]	2015-03-04 23:18:09 -05:00
Lennart Poettering	c6c8f6e218	nspawn: make kill signal to use for PID 1 configurable	2015-02-25 22:06:54 +01:00
Thomas Hindoe Paaboel Andersen	2eec67acbb	remove unused includes This patch removes includes that are not used. The removals were found with include-what-you-use which checks if any of the symbols from a header is in use.	2015-02-23 23:53:42 +01:00
Jan Synacek	4aab5d0cbd	nspawn: fix whitespace and typo in partition table blurb	2015-02-23 15:26:58 +01:00
Lennart Poettering	6278cf6048	nspawn: chown basic device nodes to userns root	2015-02-19 12:03:39 +01:00
Lennart Poettering	d15d65a01f	nspawn: fix build on non-selinux systems	2015-02-19 12:03:12 +01:00
Lennart Poettering	6dac160c0a	nspawn: add basic user namespacing support (This is incomplete, /proc and /sys are still owned by root from outside the container, not inside)	2015-02-19 11:31:08 +01:00
Lennart Poettering	9c857b9d16	nspawn: when connected to pipes for stdin/stdout, pass them as-is to PID 1 Previously we always invoked the container PID 1 on /dev/console of the container. With this change we do so only if nspawn was invoked interactively (i.e. its stdin/stdout was connected to a TTY). In all other cases we directly pass through the fds unmodified. This has the benefit that nspawn can be added into shell pipelines. https://bugs.freedesktop.org/show_bug.cgi?id=87732	2015-02-18 23:36:20 +01:00
Lennart Poettering	f36933fef6	nspawn: add support for --property= to set scope properties This is similar to systemd-run's --property= setting.	2015-02-18 19:42:24 +01:00
Jay Faulkner	d0a0ccf3fe	nspawn: Allow module loading if CAP_SYS_MODULE is requested nspawn containers currently block module loading in all cases, with no option to disable it. This allows an admin, specifically setting capability=CAP_SYS_MODULE or capability=all to load modules.	2015-02-04 13:34:46 +01:00
Lennart Poettering	63c372cb9d	util: rework strappenda(), and rename it strjoina() After all it is now much more like strjoin() than strappend(). At the same time, add support for NULL sentinels, even if they are normally not necessary.	2015-02-03 02:05:59 +01:00
Thomas Hindoe Paaboel Andersen	fed6df828d	remove unused variables	2015-02-02 22:58:06 +01:00
Lennart Poettering	c0534580ac	nspawn: when mounting the cgroup hierarchies, use the exact same mount options for the superblock as the host Otherwise we'll generate kernel runtime warnings about non-matching mount options.	2015-01-23 01:43:16 +01:00
Lennart Poettering	bbb99c30d0	nspawn: mount /tmp in the container, don't leave this to the container's init We really want /tmp to be properly mounted, especially in containers that lack CAP_SYS_ADMIN or that are not fully booted up and only get a shell, hence let's do so in nspawn already.	2015-01-23 01:27:06 +01:00
Alban Crequy	05e7da5afa	nspawn: allow bind-mounting char and block files	2015-01-23 01:22:55 +01:00
Lennart Poettering	c09ef2e4e8	nspawn: work around kernel bug with partition table probing on loopback devices When we set up a loopback device with partition probing, the udev "change" event about the configured device is first passed on to userspace, only then the in-kernel partition prober is started. Since partition probing fails with EBUSY when somebody has the device open, the probing frequently fails since udev starts probing/opening the device as soon as it gets the notification about it, and it might do so earlier than the kernel probing. This patch adds a (hopefully temporary) work-around for this, that compares the number of probed partitions of the kernel with those of blkid and synchronously asks for reprobing until the numbers are in sync. This really deserves a proper kernel fix.	2015-01-20 20:40:45 +01:00
Tom Gundersen	4bbfe7ad22	nspawn: add ipvlan support	2015-01-20 00:46:13 +01:00
Lennart Poettering	f6c51a8136	nspawn: support dissecting GPT images that contain only a single generic linux partition This should allow running Ubuntu UEFI GPT Images with nspawn, unmodified.	2015-01-19 20:24:10 +01:00
Lennart Poettering	2fbe4296c5	nspawn: wait until udev has probed a loopback device before making use of it	2015-01-19 20:24:10 +01:00
Jonathan Boulle	835214146b	nspawn: fix log typos	2015-01-15 08:19:30 +01:00
Lennart Poettering	aceac2f0b6	import: rename "gpt" disk image type to "raw" After all, nspawn can now dissect MBR partition levels, too, hence ".gpt" appears a misnomer. Moreover, the .raw suffix for these files is already pretty popular (the Fedora disk images use it for example), hence sounds like an OK scheme to adopt.	2015-01-15 01:47:21 +01:00

603 commits