Systemd

Author	SHA1	Message	Date
Zbigniew Jędrzejewski-Szmek	cd132992bb	nspawn: fix abort when we cannot execve If execve failed, we would die in safe_close(), because master was already closed by fdset_close_others() on line 3123. IIUC, we don't need to keep the fd open after sending it, so let's just close it immediately. Reproducer: sudo build/systemd-nspawn -M rawhide fooooooo Fixup for `3acc84ebd9`.	2019-07-09 01:24:20 +02:00
Yu Watanabe	2d9b74ba87	tree-wide: replace strjoin() with path_join()	2019-06-24 23:59:38 +09:00
Lennart Poettering	cee97d5768	Merge pull request #12836 from yuwata/tree-wide-replace-strjoin tree-wide: replace strjoin() with path_join()	2019-06-22 20:02:46 +02:00
Lennart Poettering	c6134d3e2f	path-util: get rid of prefix_root() prefix_root() is equivalent to path_join() in almost all ways, hence let's remove it. There are subtle differences though: prefix_root() will try to shorten multiple "/" before and after the prefix. path_join() doesn't do that. This means prefix_root() might return a string shorter than both its inputs combined, while path_join() never does that. I like the path_join() semantics better, hence I think dropping prefix_root() is totally OK. In the end the strings generated by both functions should always be identical in terms of path_equal() if not streq(). This leaves prefix_roota() in place. Ideally we'd have path_joina(), but I don't think we can reasonably implement that as a macro. or maybe we can? (if so, sounds like something for a later PR) Also add in a few missing OOM checks	2019-06-21 08:42:55 +09:00
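The difference between the two joining behaviours described in c6134d3e2f can be sketched as follows. This is a minimal illustration and not systemd's implementation: `sketch_path_join()` and `sketch_prefix_root()` are hypothetical two-argument helpers written only for this example, assuming the semantics stated in the commit message (the first never drops characters, the second collapses slashes around the joint).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not the systemd function): joins two components and
 * inserts a "/" only when neither side already provides one. It never
 * removes characters, so the result is at least as long as both inputs. */
static char *sketch_path_join(const char *a, const char *b) {
        assert(a);
        assert(b);

        size_t la = strlen(a), lb = strlen(b);
        int need_slash = la > 0 && lb > 0 && a[la - 1] != '/' && b[0] != '/';

        char *r = malloc(la + lb + 2);
        if (!r)
                return NULL;

        memcpy(r, a, la);
        if (need_slash)
                r[la++] = '/';
        memcpy(r + la, b, lb + 1); /* copies the trailing NUL as well */
        return r;
}

/* Hypothetical helper (not the systemd function): strips trailing slashes
 * from the root and leading slashes from the path, then joins them with
 * exactly one "/". The result can be shorter than the inputs combined. */
static char *sketch_prefix_root(const char *root, const char *path) {
        assert(root);
        assert(path);

        size_t lr = strlen(root);
        while (lr > 0 && root[lr - 1] == '/')
                lr--;
        while (*path == '/')
                path++;

        size_t lp = strlen(path);
        char *r = malloc(lr + lp + 2);
        if (!r)
                return NULL;

        memcpy(r, root, lr);
        r[lr] = '/';
        memcpy(r + lr + 1, path, lp + 1);
        return r;
}
```

With the inputs "/foo//" and "//bar", the first helper returns "/foo////bar" and the second returns "/foo/bar": different strings that name the same path, which is the path_equal() versus streq() distinction the message draws.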
Anita Zhang	f66ad46066	nspawn: don't hard fail when setting capabilities The OCI changes in #9762 broke a use case in which we use nspawn from inside a container that has dropped capabilities from the bounding set that nspawn expected to retain. In an attempt to keep OCI compliance and support our use case, I made hard failing on setting capabilities not in the bounding set optional (hard fail if using OCI and log only if using nspawn cmdline). Fixes #12539	2019-06-20 21:46:36 +02:00
Yu Watanabe	657ee2d82b	tree-wide: replace strjoin() with path_join()	2019-06-21 03:26:16 +09:00
Franck Bui	dc98caea32	nspawn: make use of openpt_allocate()	2019-06-18 09:27:06 +02:00
Franck Bui	3acc84ebd9	nspawn: allocate the pty used for /dev/console within the container The console tty is now allocated from within the container so it's not necessary anymore to allocate it from the host and bind mount the pty slave into the container. The pty master is sent to the host. /dev/console is now a symlink pointing to the pty slave. This might also be less confusing for applications running inside the container and the overall result looks cleaner (we don't need to apply manually the passed selinux context, if any, to the allocated pty for instance).	2019-06-18 08:17:34 +02:00
Franck Bui	ba72801d66	nspawn: use correct error variable when logging errors returned by send_one_fd()	2019-06-18 07:54:51 +02:00
Michal Sekletar	3f09629c22	Merge pull request #12628 from keszybz/dbus-execute Rework cpu affinity parsing	2019-05-30 12:32:53 +02:00
Yu Watanabe	a0267b30f8	nspawn: also support ifindex when specifying network interface	2019-05-30 11:04:05 +02:00
Zbigniew Jędrzejewski-Szmek	0985c7c4e2	Rework cpu affinity parsing The CPU_SET_S api is pretty bad. In particular, it has a parameter for the size of the array, but operations which take two (CPU_EQUAL_S) or even three arrays (CPU_{AND,OR,XOR}_S) still take just one size. This means that all arrays must be of the same size, or buffer overruns will occur. This is exactly what our code would do, if it received an array of unexpected size over the network. ("Unexpected" here means anything different from what cpu_set_malloc() detects as the "right" size.) Let's rework this, and store the size in bytes of the allocated storage area. The code will now parse any number up to 8191, independently of what the current kernel supports. This matches the kernel maximum setting for any architecture, to make things more portable. Fixes #12605.	2019-05-29 10:20:42 +02:00
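The idea in 0985c7c4e2 of keeping the allocated size next to the CPU set can be sketched like this. It is only an illustration under stated assumptions: `struct sketch_cpu_set` and both helpers are hypothetical names for this example, the error codes are arbitrary choices, and the 8192 bound simply mirrors the "up to 8191" figure in the commit message. Only the glibc `CPU_ALLOC_SIZE`, `CPU_SET_S` and `CPU_ISSET_S` macros are real API.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container: the set travels together with its size in bytes,
 * so no caller has to guess which size to pass to the CPU_*_S macros. */
struct sketch_cpu_set {
        cpu_set_t *set;
        size_t allocated; /* in bytes */
};

/* Add one CPU index, growing the storage if needed. */
static int sketch_cpu_set_add(struct sketch_cpu_set *c, unsigned cpu) {
        assert(c);

        if (cpu >= 8192) /* assumed upper bound, see lead-in */
                return -ERANGE;

        size_t need = CPU_ALLOC_SIZE(cpu + 1);
        if (need > c->allocated) {
                cpu_set_t *t = realloc(c->set, need);
                if (!t)
                        return -ENOMEM;

                /* Zero only the newly added tail; existing bits are kept. */
                memset((char *) t + c->allocated, 0, need - c->allocated);
                c->set = t;
                c->allocated = need;
        }

        CPU_SET_S(cpu, c->allocated, c->set);
        return 0;
}

/* Query one CPU index without ever reading past the allocation. */
static int sketch_cpu_set_contains(const struct sketch_cpu_set *c, unsigned cpu) {
        assert(c);

        if (CPU_ALLOC_SIZE(cpu + 1) > c->allocated)
                return 0;

        return CPU_ISSET_S(cpu, c->allocated, c->set) != 0;
}
```

Because every operation takes the size from the struct itself, two sets of different sizes can no longer be combined by accident with a single mismatched size argument.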
Zbigniew Jędrzejewski-Szmek	127c167cdb	Merge pull request #12390 from poettering/string-file-mkdir fileio: add a WRITE_STRING_FILE_MKDIR_0755 flag to write_string_file() that creates parent directories if needed	2019-05-28 14:42:55 +02:00
Lennart Poettering	f9a3d8e2f3	nspawn: expose the new seccomp actions in the OCI logic	2019-05-24 10:48:28 +02:00
Lennart Poettering	e82e549fb2	tree-wide: make use of the new WRITE_STRING_FILE_MKDIR_0755 flag	2019-05-08 06:36:20 -04:00
Franck Bui	9f3f596477	meson: make source files including nspawn-settings.h depend on libseccomp Since nspawn-settings.h includes seccomp.h, any file that includes nspawn-settings.h should depend on libseccomp so the correct header path where seccomp.h lives is added to the header search paths. It's especially important for distros such as openSUSE where seccomp.h is not shipped in /usr/include but /usr/include/libseccomp. This patch is similar to `8238423095`.	2019-04-30 19:31:22 +02:00
Ben Boeckel	5238e95759	codespell: fix spelling errors	2019-04-29 16:47:18 +02:00
Ben Boeckel	8f8dfb9552	nspawn-expose-ports: fix a typo in error message	2019-04-26 23:42:55 +02:00
Dominick Grift	8f1ed04ad6	nspawn: Fix volatile SELinux label nspawn should associate the specified nspawn container apifs object label instead of the nspawn container process label with the volatile tmpfs	2019-04-13 12:03:02 +02:00
Lennart Poettering	33d60b8d57	json: simplify JSON_VARIANT_OBJECT_FOREACH() macro a bit There's no point in returning the "key" within each loop iteration as JsonVariant object. Let's simplify things and return it as string. That simplifies usage (since the caller doesn't have to convert the object to the string anymore) and is safe since we already validate that keys are strings when an object JsonVariant is allocated.	2019-04-12 13:11:11 +02:00
Anita Zhang	7bc5e0b12b	seccomp: check more error codes from seccomp_load() We noticed in our tests that occasionally SystemCallFilter= would fail to set and the service would run with no syscall filtering. Most of the time the same tests would apply the filter and fail the service as expected. While it's not totally clear why this happens, we noticed seccomp_load() in the systemd code base would fail open for all errors except EPERM and EACCES. ENOMEM, EINVAL, and EFAULT seem like reasonable values to add to the error set based on what I gather from libseccomp code and man pages: -ENOMEM: out of memory, failed to allocate space for a libseccomp structure, or would exceed a defined constant -EINVAL: kernel isn't configured to support the operations, args are invalid (to seccomp_load(), seccomp(), or prctl()) -EFAULT: addresses passed as args are invalid	2019-04-12 10:23:07 +02:00
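The error classification described in 7bc5e0b12b can be sketched as a small predicate. This is an assumption-laden illustration, not the code from the commit: `sketch_seccomp_load_error_is_fatal()` is a hypothetical name, and the set of codes is taken directly from the commit message (EPERM and EACCES, plus the newly added ENOMEM, EINVAL and EFAULT).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical helper: decides whether a return value from a filter-loading
 * call should abort the operation instead of being ignored ("fail open").
 * Accepts both positive and negative errno-style values. */
static bool sketch_seccomp_load_error_is_fatal(int r) {
        if (r < 0)
                r = -r;

        switch (r) {
        case EPERM:
        case EACCES:
        case ENOMEM:
        case EINVAL:
        case EFAULT:
                return true;
        default:
                return false;
        }
}
```

A caller would check the return value of the load call with this predicate and refuse to continue when it returns true, for example `assert(sketch_seccomp_load_error_is_fatal(-ENOMEM))` holds while a success value of 0 is not treated as fatal.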
Lennart Poettering	1eacc47062	nspawn: create boot_id and kmsg files for overmounting in /run, not /tmp /tmp might not be mounted at all yet (given that we support SYSTEMD_NSPAWN_TMPFS_TMP=0 to turn this off), and /tmp is a dir systemd usually tries to unmount during shutdown (unlike /run), and we shouldn't keep it busy. Hence let's just move these deleted files to /run so that we don't keep /tmp needlessly busy.	2019-04-07 08:55:31 +02:00
Lennart Poettering	c614711386	tree-wide: use SYNTHETIC_ERRNO() where appropriate	2019-04-02 14:54:42 +02:00
Lennart Poettering	8a016c746e	util-lib: when copying files make sure to apply some chattrs early, some late Some chattrs only work sensibly if you set them right after opening a file for create (think: FS_NOCOW_FL). Others only work when they are applied when the file is fully written (think: FS_IMMUTABLE_FL). Let's take that into account when copying files and applying a chattr to them.	2019-03-28 18:43:04 +01:00
Zbigniew Jędrzejewski-Szmek	ca78ad1de9	headers: remove unneeded includes from util.h This means we need to include many more headers in various files that simply included util.h before, but it seems cleaner to do it this way.	2019-03-27 11:53:12 +01:00
Zbigniew Jędrzejewski-Szmek	e1af3bc62a	Merge pull request #12106 from poettering/nosuidns add "nosuid" flag to exec directory mounts of DynamicUser=1 services	2019-03-26 08:58:00 +01:00
Zbigniew Jędrzejewski-Szmek	99f57a4fea	Merge pull request #12105 from poettering/api-vfs-mount-flags some API VFS mount flag tweaks	2019-03-26 08:32:53 +01:00
Lennart Poettering	25e68fd397	nspawn: minor improvements to --help text	2019-03-26 08:06:00 +01:00
Lennart Poettering	849b9b85b8	nspawn: mount mqueue with nodev,noexec,nosuid, too The host mounts it like that, nspawn hence should do too. Moreover, mount the file system after doing CLONE_NEWIPC so that it actually reflects the right mqueues. Finally, mount it without considering it fatal, since POSIX mqueue support is little used and it should be fine not to support it in the kernel.	2019-03-25 19:53:05 +01:00
Lennart Poettering	64e82c1976	mount-util: beef up bind_remount_recursive() to be able to toggle more than MS_RDONLY The function is otherwise generic enough to toggle other bind mount flags beyond MS_RDONLY (for example: MS_NOSUID or MS_NODEV), hence let's beef it up slightly to support that too.	2019-03-25 19:33:55 +01:00
Lennart Poettering	83276695c6	Merge pull request #12079 from keszybz/fuzz-nspawn-oci Add fuzzer for nspawn-oci	2019-03-22 21:06:17 +01:00
Lennart Poettering	e4077ff6f3	nspawn: don't free "fds" twice Previously both run() and run_container() would free 'fds'. Let's fix that, and let run() free it but make run_container() already remove all fds from it, because that's what we actually want to do. Fixes: #12073	2019-03-22 18:11:27 +01:00
Zbigniew Jędrzejewski-Szmek	b2645747b7	nspawn-oci: fix double free Also rename function to make it clear that it also frees the array object itself.	2019-03-22 17:39:12 +01:00
Zbigniew Jędrzejewski-Szmek	094eecd29d	Merge pull request #12055 from poettering/save-argc-argv main-func.h and systemctl argc/argv improvements	2019-03-22 16:58:18 +01:00
Zbigniew Jędrzejewski-Szmek	b1f13b0e75	nspawn-oci: mount source is optional	2019-03-22 12:04:32 +01:00
Zbigniew Jędrzejewski-Szmek	b2e07b1a02	nspawn-oci: use _cleanup_ in one more place	2019-03-22 11:51:21 +01:00
Lennart Poettering	ae408d77a9	nspawn: conditionalize libseccomp use We support compilation without libseccomp, hence don't rely on its symbols.	2019-03-22 11:07:03 +01:00
Lennart Poettering	60ffa37a65	main-func: implicitly save argc/argv in DEFINE_MAIN_FUNCTION() functions Let's remove the risk of forgetting to save argc/argv if DEFINE_MAIN_FUNCTION() is used.	2019-03-21 18:10:06 +01:00
Lennart Poettering	36fea15565	util: introduce save_argc_argv() helper	2019-03-21 18:08:56 +01:00
Lennart Poettering	c82cfae00b	Merge pull request #12062 from poettering/nspawn-main-func nspawn: port to DEFINE_MAIN_FUNCTION()	2019-03-21 18:08:27 +01:00
Zbigniew Jędrzejewski-Szmek	bb068de080	nspawn: add --no-pager switch It only matters for --help.	2019-03-21 17:42:43 +01:00
Lennart Poettering	04f590a4a4	nspawn: voidify sd_notify() calls	2019-03-21 16:32:46 +01:00
Lennart Poettering	6145bb4f78	nspawn: port to static destructors	2019-03-21 16:32:46 +01:00
Lennart Poettering	44dbef90f1	nspawn: port to main-func.h logic	2019-03-21 16:32:46 +01:00
Zbigniew Jędrzejewski-Szmek	fa28e4e377	Merge pull request #12059 from poettering/nspawn-typos some typo and other fixes result of the OCI nspawn merge	2019-03-21 15:14:11 +01:00
Lennart Poettering	c3d13d2ad5	Merge pull request #12058 from keszybz/oci-simplifications Follow-ups for nspawn-oci review	2019-03-21 13:55:09 +01:00
Lennart Poettering	f4e803c809	nspawn: add a few missing flags from --help text	2019-03-21 13:31:09 +01:00
Lennart Poettering	2514865391	nspawn: reorder --help text, and add section The list is so long, let's add a bit of structure and order things a bit.	2019-03-21 13:27:19 +01:00
Lennart Poettering	2c9b7a7e62	mount: when we fail to establish an inaccessible mount gracefully, undo the mount	2019-03-21 12:41:02 +01:00
Zbigniew Jędrzejewski-Szmek	6757a01356	util-lib: get rid of a helper variable	2019-03-21 11:08:58 +01:00
Zbigniew Jędrzejewski-Szmek	f1531db5af	nspawn-oci: add helper function for free_and_strdup with oom check	2019-03-21 11:08:58 +01:00
Zbigniew Jędrzejewski-Szmek	d0b6a10c00	Merge pull request #9762 from poettering/nspawn-oci OCI runtime support for nspawn	2019-03-21 11:01:53 +01:00
Zbigniew Jędrzejewski-Szmek	19130626a0	nspawn-oci: use SYNTHETIC_ERRNO	2019-03-21 10:51:43 +01:00
Topi Miettinen	ebcf697685	tree-wide: fix false search hits with ppp (typos)	2019-03-18 14:25:56 +01:00
Lennart Poettering	95658673a0	Merge pull request #12016 from yuwata/fix-two-memleaks-found-by-oss-fuzz Fix two memleaks found by oss fuzz	2019-03-15 17:33:48 +01:00
Yu Watanabe	1d0c1146ea	nspawn: fix memleak Fixes oss-fuzz#13691.	2019-03-15 23:53:05 +09:00
Zbigniew Jędrzejewski-Szmek	7acf581a58	Handle or voidify all calls to close_all_fds() In activate, it is important that we close the fds. In other cases, meh.	2019-03-15 15:46:41 +01:00
Lennart Poettering	a3fc6b55ac	nspawn: mask out CAP_NET_ADMIN again if settings file turns off private networking Fixes: #11755	2019-03-15 15:42:21 +01:00
Lennart Poettering	bd4b15f274	nspawn: use right constant for shifting for uint64_t caps	2019-03-15 15:42:20 +01:00
Lennart Poettering	de40a3037a	nspawn: add support for executing OCI runtime bundles with nspawn This is a pretty large patch, and adds support for OCI runtime bundles to nspawn. A new switch --oci-bundle= is added that takes a path to an OCI bundle. The JSON file included therein is read similar to a .nspawn settings file, however with a different feature set. Implementation-wise this mostly extends the pre-existing Settings object to carry additional properties for OCI. However, OCI supports some concepts .nspawn files did not support yet, which this patch also adds: 1. Support for "masking" files and directories. This functionality is now also available via the new --inaccessible= cmdline command, and Inaccessible= in .nspawn files. 2. Support for mounting arbitrary file systems. (not exposed through nspawn cmdline nor .nspawn files, because probably not a good idea) 3. Ability to configure the console settings for a container. This functionality is now also available on the nspawn cmdline in the new --console= switch (not added to .nspawn for now, as it is something specific to the invocation really, not a property of the container) 4. Console width/height configuration. Not exposed through .nspawn/cmdline, but this may be controlled through $COLUMNS and $LINES like in most other UNIX tools. 5. UID/GID configuration by raw numbers. (not exposed in .nspawn and on the cmdline, since containers likely have different user tables, and the existing --user= switch appears to be the better option) 6. OCI hook commands (not exposed in .nspawn/cmdline, as very specific to OCI) 7. Creation of additional device nodes in /dev. Most likely not a good idea, hence not exposed in .nspawn/cmdline. There's already --bind= to achieve the same, which is the better alternative. 8. Explicit syscall filters. This is not a good idea, due to the skewed arch support, hence not exposed through .nspawn/cmdline. 9. Configuration of some sysctls on a whitelist. Questionable, not supported in .nspawn/cmdline for now. 10. Configuration of all 5 types of capabilities. Not a useful concept, since the kernel will reduce the caps on execve() anyway. Not exposed through .nspawn/cmdline as this is not very useful hence. Note that this only implements the OCI runtime logic itself. It does not provide a runc-compatible command line tool. This is left for a later PR. Only with that in place tools such as "buildah" can use the OCI support in nspawn as drop-in replacement. Currently still missing is OCI hook support, but it's already parsed and everything, and should be easy to add. Other than that OCI is implemented pretty comprehensively. There's a list of incompatibilities in the nspawn-oci.c file. In a later PR I'd like to convert this into proper markdown and add it to the documentation directory.	2019-03-15 15:41:28 +01:00
Lennart Poettering	5ef4cb7ad0	nspawn: (void)ify more stuff	2019-03-15 15:33:09 +01:00
Lennart Poettering	61b4443361	nspawn: refactor setuid code a bit Let's separate out the raw uid_t/gid_t handling from the username handling. This is useful later on. Also, let's use the right gid_t type for group types wherever appropriate.	2019-03-15 15:33:09 +01:00
Lennart Poettering	d8b4d14df4	util: split out nulstr related stuff to nulstr-util.[ch]	2019-03-14 13:25:52 +01:00
Lennart Poettering	e45c81b8bc	shared: split out code to wait for jobs to complete into its own source file It's complex enough and quite a few functions. Let's hence split this out. No code change, just some rearranging of source files.	2019-03-13 17:39:24 +01:00
Lennart Poettering	760877e90c	util: split out sorting related calls to new sort-util.[ch]	2019-03-13 12:16:43 +01:00
Lennart Poettering	0cb8e3d118	util: split out namespace related stuff into a new namespace-util.[ch] pair Just some minor reorganization.	2019-03-13 12:16:38 +01:00
Zbigniew Jędrzejewski-Szmek	0e636bf51a	nspawn: fix memleak uncovered by fuzzer Also use TAKE_PTR as appropriate.	2019-03-11 14:29:30 +01:00
Lennart Poettering	27da7ef0d0	nspawn: move payload to sub-cgroup first, then sync cgroup trees if we sync the legacy and unified trees before moving to the right subcgroup then ultimately the cgroup paths in the hierarchies will be out-of-sync... Hence, let's move the payload first, and sync then. Addresses: https://github.com/systemd/systemd/pull/9762#issuecomment-441187979	2019-03-07 11:26:17 +01:00
Lennart Poettering	adc6f43b14	copy: don't synthesize a 'user.crtime_usec' xattr on copy unless explicitly requested Previously, when we'd copy an individual file we'd synthesize a user.crtime_usec xattr with the source's creation time if we can determine it. As the creation/birth time was until recently not queryable from userspace this effectively just propagated the same xattr on the source to the same xattr on the destination. However, current kernels now allow querying the birthtime using statx() and we do make use of that now. Which means that suddenly we started synthesizing these xattrs much more regularly. Doing this actually does make sense, but only in very few cases: not for the typical regular files we copy, but certainly when dealing with disk images. Hence, let's keep this kind of propagation, but let's make it a flag and default to off. Then turn it on whenever we deal with disk images, and leave it off otherwise. This is particularly relevant as overlayfs combining a real fs, and a tmpfs on top will result in EOPNOTSUPP when it is attempted to open a file with xattrs for writing, as tmpfs does not support xattrs, and hence the copy-up cannot work. Hence, let's avoid synthesizing this needlessly, to increase compat with overlayfs.	2019-03-01 14:11:07 +01:00
Lennart Poettering	e5a4bb0d4e	nspawn: rework how arg_read_only is initialized in --volatile= mode Previously, we'd refuse the combination, and claimed we'd imply it, but actually didn't. Let's allow the combination and imply read-only from --volatile=, because that's what's documented, what we claim we do, and what makes sense.	2019-03-01 14:11:07 +01:00
Lennart Poettering	83205269c0	nspawn: refactor how we determine whether it's OK to write to /etc	2019-03-01 14:11:07 +01:00
Lennart Poettering	e50cd82f68	nspawn: no need to make top-level directory a bind mount if we just dissected an image	2019-03-01 14:11:07 +01:00
Lennart Poettering	7d0ecdd62d	nspawn: slightly reorder mount logic Let's first setup the volatile logic, and only then mount secondary partitions of the image in.	2019-03-01 14:11:07 +01:00
Lennart Poettering	6c610acaaa	nspawn: add --volatile=overlay support Fixes: #11054 #3847	2019-03-01 14:11:06 +01:00
Lennart Poettering	c55d0ae764	nspawn: fix an error path	2019-03-01 14:11:06 +01:00
Lennart Poettering	e5b43a04b6	nspawn: add volatile mode multiplexer call setup_volatile_mode() Just some refactoring, no change in behaviour.	2019-03-01 14:11:06 +01:00
Lennart Poettering	0646d3c3dd	nspawn: explicitly refuse mounts over / Previously this would fail later on, but let's filter this out at the time of parsing.	2019-03-01 14:11:06 +01:00
Lennart Poettering	6e9417f5b4	tree-wide: use newa() instead of alloca() wherever we can Typesafety is nice. And this way we can take benefit of the new size assert() the previous commit added.	2019-01-26 16:17:04 +01:00
Lennart Poettering	2949ff2691	nspawn: ignore SIGPIPE for nspawn itself Let's not abort due to a dead stdout. Fixes: #11533	2019-01-26 13:54:44 +01:00
Lennart Poettering	b2238e380e	test,systemctl,nspawn: use "const char" instead of "char" as iterator for FOREACH_STRING() The macro iterates through literal strings (i.e. constant strings), hence it's more correct to have the iterator const too.	2019-01-16 12:29:30 +01:00
Chris Down	e92aaed30e	tree-wide: Remove O_CLOEXEC from fdopen fdopen doesn't accept "e", it's ignored. Let's not mislead people into believing that it actually sets O_CLOEXEC. From `man 3 fdopen`: > e (since glibc 2.7): > Open the file with the O_CLOEXEC flag. See open(2) for more information. This flag is ignored for fdopen() As mentioned by @jlebon in #11131.	2018-12-12 20:47:40 +01:00
Zbigniew Jędrzejewski-Szmek	489fae526d	nspawn: check cg_ns_supported() just once cg_ns_supported() caches, so the condition was really checked just once, but it looks weird to assign the return value to arg_use_cgns (if the variable is not present), because then the other checks are effectively equivalent to if (cg_ns_supported() && cg_ns_supported()) { ... and later if (!cg_ns_supported() \|\| !cg_ns_supported()) { ...	2018-12-11 13:37:41 +00:00
Lennart Poettering	60f1ec13ed	nspawn: move most validation checks and configuration mangling into verify_arguments() That's what the function is for after all, and only if it's done there we can verify the effect of .nspawn files correctly too: after all we should not just validate that everything configured on the command line makes sense, but the stuff configured in the .nspawn files, too.	2018-12-10 12:54:56 +01:00
Lennart Poettering	d5455d2f98	nspawn: split out code parsing env vars into a function of its own This then lets us ensure it's called after we parsed the cmdline, and after we loaded the settings file, so that these env var settings override everything loaded from there.	2018-12-10 12:54:56 +01:00
Lennart Poettering	5eee829043	nspawn: move cg_unified_flush() invocation out of parse_argv() It has nothing to do with argument parsing, and hence shouldn't be there.	2018-12-10 12:54:56 +01:00
Zbigniew Jędrzejewski-Szmek	871fa294ff	Merge pull request #10935 from poettering/rlimit-nofile-safe Merged by hand to resolve a trivial conflict in TODO.	2018-12-06 17:19:21 +01:00
Yu Watanabe	e93672eeac	tree-wide: drop missing.h from headers and use relevant missing_*.h	2018-12-06 13:31:16 +01:00
Yu Watanabe	204f52e32d	lockfile: drop unnecessary headers from lockfile-util.h	2018-12-06 13:31:16 +01:00
Yu Watanabe	503f480f8e	missing: move fs or mount related definitions to missing_fs.h This also fixes erroneous definition MS_REC -> MS_SLAVE.	2018-12-06 13:30:43 +01:00
Yu Watanabe	36dd5ffd5d	util: drop missing.h from util.h	2018-12-04 10:00:34 +01:00
Lennart Poettering	e4de72876e	util-lib: split out all temporary file related calls into tmpfiles-util.c This splits out a bunch of functions from fileio.c that have to do with temporary files. Simply to make the header files a bit shorter, and to group things more nicely. No code changes, just some rearranging of source files.	2018-12-02 13:22:29 +01:00
Lennart Poettering	5dd9527883	tree-wide: remove various unused functions All found with "cppcheck --enable=unusedFunction".	2018-12-02 13:35:34 +09:00
Lennart Poettering	595225af7a	tree-wide: invoke rlimit_nofile_safe() before various exec{v,ve,l}() invocations Whenever we invoke external, foreign code from code that has RLIMIT_NOFILE's soft limit bumped to high values, revert it to 1024 first. This is a safety precaution for compatibility with programs using select() which cannot operate with fds > 1024. This commit adds the call to rlimit_nofile_safe() to all invocations of exec{v,ve,l}() and friends that either are in code that we know runs with RLIMIT_NOFILE bumped up (which is PID 1 and all journal code for starters) or that is part of shared code that might end up there. The calls are placed as early as we can in processes invoking a flavour of execve(), but after the last time we do fd manipulations, so that we can still take benefit of the high fd limits for that.	2018-12-01 12:50:45 +01:00
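The precaution described in 595225af7a, lowering the RLIMIT_NOFILE soft limit before executing foreign code, can be sketched with the standard getrlimit()/setrlimit() calls. This is an illustration only: `sketch_rlimit_nofile_safe()` and `SKETCH_SAFE_NOFILE` are hypothetical names, the return convention is an assumption, and the real rlimit_nofile_safe() may differ in detail.

```c
#include <assert.h>
#include <errno.h>
#include <sys/resource.h>

/* The value the commit message names as safe for select() users. */
#define SKETCH_SAFE_NOFILE 1024

/* Hypothetical helper: lower the soft RLIMIT_NOFILE to 1024 if it is
 * currently higher, leaving the hard limit untouched.
 * Returns 1 if the limit was lowered, 0 if nothing had to be done,
 * and a negative errno-style value on failure. */
static int sketch_rlimit_nofile_safe(void) {
        struct rlimit rl;

        if (getrlimit(RLIMIT_NOFILE, &rl) < 0)
                return -errno;

        if (rl.rlim_cur <= SKETCH_SAFE_NOFILE)
                return 0;

        rl.rlim_cur = SKETCH_SAFE_NOFILE;
        if (setrlimit(RLIMIT_NOFILE, &rl) < 0)
                return -errno;

        return 1;
}
```

Following the ordering in the commit message, such a call would sit after the last fd manipulation in the child and immediately before the execve() family call, so that the process itself can still use the high limit up to that point.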
Zbigniew Jędrzejewski-Szmek	b2ac2b01c8	Merge pull request #10996 from poettering/oci-prep Preparation for the nspawn-OCI work	2018-11-30 10:09:00 +01:00
Zbigniew Jędrzejewski-Szmek	049af8ad0c	Split out part of mount-util.c into mountpoint-util.c The idea is that anything which is related to actually manipulating mounts is in mount-util.c, but functions for mountpoint introspection are moved to the new file. Anything which requires libmount must be in mount-util.c. This was supposed to be a preparation for further changes, with no functional difference, but it results in a significant change in linkage: $ ldd build/libnss_*.so.2 (before) build/libnss_myhostname.so.2: linux-vdso.so.1 (0x00007fff77bf5000) librt.so.1 => /lib64/librt.so.1 (0x00007f4bbb7b2000) libmount.so.1 => /lib64/libmount.so.1 (0x00007f4bbb755000) libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f4bbb734000) libc.so.6 => /lib64/libc.so.6 (0x00007f4bbb56e000) /lib64/ld-linux-x86-64.so.2 (0x00007f4bbb8c1000) libblkid.so.1 => /lib64/libblkid.so.1 (0x00007f4bbb51b000) libuuid.so.1 => /lib64/libuuid.so.1 (0x00007f4bbb512000) libselinux.so.1 => /lib64/libselinux.so.1 (0x00007f4bbb4e3000) libpcre2-8.so.0 => /lib64/libpcre2-8.so.0 (0x00007f4bbb45e000) libdl.so.2 => /lib64/libdl.so.2 (0x00007f4bbb458000) build/libnss_mymachines.so.2: linux-vdso.so.1 (0x00007ffc19cc0000) librt.so.1 => /lib64/librt.so.1 (0x00007fdecb74b000) libcap.so.2 => /lib64/libcap.so.2 (0x00007fdecb744000) libmount.so.1 => /lib64/libmount.so.1 (0x00007fdecb6e7000) libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fdecb6c6000) libc.so.6 => /lib64/libc.so.6 (0x00007fdecb500000) /lib64/ld-linux-x86-64.so.2 (0x00007fdecb8a9000) libblkid.so.1 => /lib64/libblkid.so.1 (0x00007fdecb4ad000) libuuid.so.1 => /lib64/libuuid.so.1 (0x00007fdecb4a2000) libselinux.so.1 => /lib64/libselinux.so.1 (0x00007fdecb475000) libpcre2-8.so.0 => /lib64/libpcre2-8.so.0 (0x00007fdecb3f0000) libdl.so.2 => /lib64/libdl.so.2 (0x00007fdecb3ea000) build/libnss_resolve.so.2: linux-vdso.so.1 (0x00007ffe8ef8e000) librt.so.1 => /lib64/librt.so.1 (0x00007fcf314bd000) libcap.so.2 => /lib64/libcap.so.2 (0x00007fcf314b6000) 
libmount.so.1 => /lib64/libmount.so.1 (0x00007fcf31459000) libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fcf31438000) libc.so.6 => /lib64/libc.so.6 (0x00007fcf31272000) /lib64/ld-linux-x86-64.so.2 (0x00007fcf31615000) libblkid.so.1 => /lib64/libblkid.so.1 (0x00007fcf3121f000) libuuid.so.1 => /lib64/libuuid.so.1 (0x00007fcf31214000) libselinux.so.1 => /lib64/libselinux.so.1 (0x00007fcf311e7000) libpcre2-8.so.0 => /lib64/libpcre2-8.so.0 (0x00007fcf31162000) libdl.so.2 => /lib64/libdl.so.2 (0x00007fcf3115c000) build/libnss_systemd.so.2: linux-vdso.so.1 (0x00007ffda6d17000) librt.so.1 => /lib64/librt.so.1 (0x00007f610b83c000) libcap.so.2 => /lib64/libcap.so.2 (0x00007f610b835000) libmount.so.1 => /lib64/libmount.so.1 (0x00007f610b7d8000) libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f610b7b7000) libc.so.6 => /lib64/libc.so.6 (0x00007f610b5f1000) /lib64/ld-linux-x86-64.so.2 (0x00007f610b995000) libblkid.so.1 => /lib64/libblkid.so.1 (0x00007f610b59e000) libuuid.so.1 => /lib64/libuuid.so.1 (0x00007f610b593000) libselinux.so.1 => /lib64/libselinux.so.1 (0x00007f610b566000) libpcre2-8.so.0 => /lib64/libpcre2-8.so.0 (0x00007f610b4e1000) libdl.so.2 => /lib64/libdl.so.2 (0x00007f610b4db000) (after) build/libnss_myhostname.so.2: linux-vdso.so.1 (0x00007fff0b5e2000) librt.so.1 => /lib64/librt.so.1 (0x00007fde0c328000) libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fde0c307000) libc.so.6 => /lib64/libc.so.6 (0x00007fde0c141000) /lib64/ld-linux-x86-64.so.2 (0x00007fde0c435000) build/libnss_mymachines.so.2: linux-vdso.so.1 (0x00007ffdc30a7000) librt.so.1 => /lib64/librt.so.1 (0x00007f06ecabb000) libcap.so.2 => /lib64/libcap.so.2 (0x00007f06ecab4000) libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f06eca93000) libc.so.6 => /lib64/libc.so.6 (0x00007f06ec8cd000) /lib64/ld-linux-x86-64.so.2 (0x00007f06ecc15000) build/libnss_resolve.so.2: linux-vdso.so.1 (0x00007ffe95747000) librt.so.1 => /lib64/librt.so.1 (0x00007fa56a80f000) libcap.so.2 => /lib64/libcap.so.2 
(0x00007fa56a808000) libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fa56a7e7000) libc.so.6 => /lib64/libc.so.6 (0x00007fa56a621000) /lib64/ld-linux-x86-64.so.2 (0x00007fa56a964000) build/libnss_systemd.so.2: linux-vdso.so.1 (0x00007ffe67b51000) librt.so.1 => /lib64/librt.so.1 (0x00007ffb32113000) libcap.so.2 => /lib64/libcap.so.2 (0x00007ffb3210c000) libpthread.so.0 => /lib64/libpthread.so.0 (0x00007ffb320eb000) libc.so.6 => /lib64/libc.so.6 (0x00007ffb31f25000) /lib64/ld-linux-x86-64.so.2 (0x00007ffb3226a000) I don't quite understand what is going on here, but let's not be too picky.	2018-11-29 21:03:44 +01:00
Lennart Poettering	17c58ba97b	nspawn: let's also pre-mount /dev/mqueue	2018-11-29 20:21:40 +01:00
Yu Watanabe	acf4d15893	util: make *_from_name() return negative errno on error	2018-11-28 20:20:50 +09:00
Yu Watanabe	938dbb292a	Merge pull request #10901 from poettering/startswith-list add new STARTSWITH_SET() macro	2018-11-26 22:40:51 +09:00
Lennart Poettering	da9fc98ded	tree-wide: port more code over to PATH_STARTSWITH_SET()	2018-11-26 14:08:46 +01:00
Lennart Poettering	27adcc9737	cgroup: be more careful with which controllers we can enable/disable on a cgroup This changes cg_enable_everywhere() to return which controllers are enabled for the specified cgroup. This information is then used to correctly track the enablement mask currently in effect for a unit. Moreover, when we try to turn off a controller, and this works, then this indicates that the parent unit might successfully turn it off now, too, as our unit might have kept it busy. So far, when realizing cgroups, i.e. when syncing up the kernel representation of relevant cgroups with our own idea we would strictly work from the root to the leaves. This is generally a good approach, as when controllers are enabled this has to happen in root-to-leaves order. However, when controllers are disabled this has to happen in the opposite order: in leaves-to-root order (this is because controllers can only be enabled in a child if it is already enabled in the parent, and if it shall be disabled in the parent then it has to be disabled in the child first, otherwise it is considered busy when it is attempted to remove it in the parent). To make things complicated when invalidating a unit's cgroup membership systemd can actually turn off some controllers previously turned on at the very same time as it turns on other controllers previously turned off. In such a case we have to work up leaves-to-root and root-to-leaves right after each other. With this patch this is implemented: we still generally operate root-to-leaves, but as soon as we noticed we successfully turned off a controller previously turned on for a cgroup we'll re-enqueue the cgroup realization for all parents of a unit, thus implementing leaves-to-root where necessary.	2018-11-23 13:41:37 +01:00
Zbigniew Jędrzejewski-Szmek	baaa35ad70	coccinelle: make use of SYNTHETIC_ERRNO Ideally, coccinelle would strip unnecessary braces too. But I do not see any option in coccinelle for this, so instead, I edited the patch text using search&replace to remove the braces. Unfortunately this is not fully automatic, in particular it didn't deal well with if-else-if-else blocks and ifdefs, so there is an increased likelihood of some bugs in such spots. I also removed part of the patch that coccinelle generated for udev, where we return -1 for failure. This should be fixed independently.	2018-11-22 10:54:38 +01:00
Lennart Poettering	818623aca5	Merge pull request #10860 from keszybz/more-cleanup-2 Do more stuff from main macros	2018-11-21 11:07:31 +01:00
Zbigniew Jędrzejewski-Szmek	294bf0c34a	Split out pretty-print.c and move pager.c and main-func.h to shared/ This is high-level functionality, and fits better in shared/ (which is for our executables), than in basic/ (which is also for libraries).	2018-11-20 18:40:02 +01:00
Lennart Poettering	f2fb2ec942	nspawn: use EXIT_EXCEPTION where appropriate	2018-11-20 17:04:07 +01:00
Lennart Poettering	042cad5737	Merge pull request #10753 from keszybz/pager-no-interrupt Add mode in journalctl where ^C is handled by the pager	2018-11-14 20:09:39 +01:00
Zbigniew Jędrzejewski-Szmek	0221d68a13	basic/pager: convert the pager options to a flags argument Pretty much everything uses just the first argument, and this doesn't make this common pattern more complicated, but makes it simpler to pass multiple options.	2018-11-14 16:25:11 +01:00
Zbigniew Jędrzejewski-Szmek	bd897e729a	nspawn: add a hint to the message we emit when a child dies From #10526: $ sudo systemd-nspawn -i image Spawning container image on /home/zbyszek/src/mkosi/image. Press ^] three times within 1s to kill container. Short read while reading cgroup mode.	2018-11-13 11:58:44 +01:00
Lennart Poettering	1d78fea2d6	nspawn: rework how we allocate/kill scopes Fixes: #6347	2018-11-09 17:08:59 +01:00
Lennart Poettering	df61bc5e4a	nspawn: merge two variable declaration lines	2018-11-09 17:08:59 +01:00
Lennart Poettering	11d81e506e	nspawn: simplify machine terminate bus call We have the machine name anyway, let's use TerminateMachine() on machined's Manager object directly with it. That way it's a single method call only, instead of two, to terminate the machine.	2018-11-09 17:08:59 +01:00
Lennart Poettering	e5a2d8b5b5	nspawn: make use of the new sd_bus_set_close_on_exit() call in nspawn	2018-11-09 17:08:59 +01:00
Yu Watanabe	57512c893e	tree-wide: set WRITE_STRING_FILE_DISABLE_BUFFER flag when we write files under /proc or /sys	2018-11-06 21:24:03 +09:00
Lennart Poettering	6619ad889d	nspawn: beef up netns checking a bit, for compat with old kernels Fixes: #10544	2018-10-31 21:42:45 +03:00
Lennart Poettering	e2d39e549f	nspawn: add proper error message if setns() on network namespace fd fails Addresses: https://github.com/systemd/systemd/pull/10589#issuecomment-434670595	2018-10-31 18:07:30 +01:00
Yu Watanabe	5a937ea2f6	sd-device: make sd_device_get_is_initialized() return is_initialized by return value	2018-10-29 17:33:33 +09:00
Jiuyang liu	a2f577fca0	add ephemeral to nspawn-settings.	2018-10-24 10:22:20 +02:00
Zbigniew Jędrzejewski-Szmek	369ca6dab1	systemd-nspawn: do not crash on /var/log/journal creation if not required When running a read-only file system, we might not be able to create /var/log/journal. Do not fail on this, unless actually requested by the --link-journal options. $ systemd-nspawn --image=image.squashfs ...	2018-10-22 15:07:08 +02:00
Yu Watanabe	c65ac075ef	nspawn: do not include '%m' in log message if errno is zero	2018-10-20 02:01:15 +09:00
Yu Watanabe	b0b8c9a5a4	Merge pull request #10389 from poettering/nspawn-path-fix nspawn $PATH execvpe() fix	2018-10-19 08:48:37 +09:00
Lennart Poettering	2ff48e981e	tree-wide: introduce setsockopt_int() helper and make use of it everywhere As suggested by @heftig: `6d5e65f645 (commitcomment-30938667)`	2018-10-18 19:50:29 +02:00
Lennart Poettering	c0815ca93d	Merge pull request #10407 from yuwata/netlink-slot sd-netlink: introduce sd_netlink_slot object and relevant functions	2018-10-18 18:05:58 +02:00
Lennart Poettering	b6b180b77b	nspawn: use container $PATH (not host $PATH) when searching for PID 1 binaries to execute Fixes: #10377	2018-10-18 16:40:12 +02:00
Yu Watanabe	8190a388a6	sd-netlink: make sd_netlink_slot take its description	2018-10-16 18:42:23 +09:00
Lennart Poettering	271f518f35	nspawn: TAKE_FD() is your friend	2018-10-15 19:45:37 +02:00
Lennart Poettering	fbda85b078	tree-wide: use sockaddr_un_unlink() at two more places where appropriate	2018-10-15 19:44:34 +02:00
Lennart Poettering	6d5e65f645	tree-wide: add a single version of "static const int one = 1" All over the place we define local variables for the various sockopts that take a bool-like "int" value. Sometimes they are const, sometimes static, sometimes both, sometimes neither. Let's clean this up, introduce a common const variable "const_int_one" (as well as one matching "const_int_zero") and use it everywhere, all across the codebase.	2018-10-15 19:40:51 +02:00
Lennart Poettering	44ed5214ad	tree-wide: use structured initialization for sockaddr_un	2018-10-15 19:35:00 +02:00
Yu Watanabe	ee38400bba	sd-netlink: introduce sd_netlink_slot	2018-10-15 18:10:04 +09:00
David Tardon	f369f47c26	be consistent about sun_path length Most places use the whole buffer for name, without leaving extra space for the trailing NUL.	2018-10-12 12:38:49 +02:00
Lennart Poettering	b37469d7d1	nspawn: add comments explaining the namespacing situation and the inner/outer children	2018-10-09 10:52:17 +02:00
Lennart Poettering	1099ceebce	nspawn: optionally don't mount a tmpfs over /tmp (#10294 ) nspawn: optionally, don't mount a tmpfs on /tmp Fixes: #10260	2018-10-08 18:32:03 +02:00
Lennart Poettering	ff6c6cc117	nspawn: when --quiet is passed, simply downgrade log messages to LOG_DEBUG (#10181 ) With this change almost all log messages that are suppressed through --quiet are not actually suppressed anymore, but simply downgraded to LOG_DEBUG. Previously we did it this way for some log messages and fully suppressed them for others. With this it's pretty much systematic. Inspired by #10122.	2018-09-26 23:40:39 +02:00
Evgeny Vereshchagin	89f180201c	nspawn: chown() the legacy hierarchy when it's used in a container This is a follow-up to `720f0a2f3c`. Closes https://github.com/systemd/systemd/issues/10026 Closes https://github.com/systemd/systemd/issues/9563	2018-09-26 17:29:17 +02:00
Lennart Poettering	ee8d493cbd	Merge pull request #10158 from keszybz/seccomp-log-tightening Seccomp log tightening	2018-09-26 15:56:32 +02:00
Yu Watanabe	6c9c51e5e2	fs-util: make symlink_idempotent() optionally create relative link	2018-09-24 18:52:53 +03:00
Zbigniew Jędrzejewski-Szmek	7e86bd73a4	seccomp: tighten checking of seccomp filter creation In seccomp code, the code is changed to propagate errors which are about anything other than unknown/unimplemented syscalls. I think such errors should not happen in normal usage, but so far we would summarily ignore all errors, so that part is uncertain. If it turns out that other errors occur and should be ignored, this should be added later. In nspawn, we would count the number of added filters, but didn't use this for anything. Drop that part. The comments suggested that seccomp_add_syscall_filter_item() returned negative if the syscall is unknown, but this wasn't true: it returns 0. The error at this point can only be if the syscall was known but couldn't be added. If the error comes from our internal whitelist in nspawn, treat this as error, because it means that our internal table is wrong. If the error comes from user arguments, warn and ignore. (If some syscall is not known at current architecture, it is still silently ignored.)	2018-09-24 17:21:09 +02:00
Zbigniew Jędrzejewski-Szmek	b54f36c604	seccomp: reduce logging about failure to add syscall to seccomp Our logs are full of: Sep 19 09:22:10 autopkgtest systemd[690]: Failed to add rule for system call oldstat() / -10037, ignoring: Numerical argument out of domain Sep 19 09:22:10 autopkgtest systemd[690]: Failed to add rule for system call get_thread_area() / -10076, ignoring: Numerical argument out of domain Sep 19 09:22:10 autopkgtest systemd[690]: Failed to add rule for system call set_thread_area() / -10079, ignoring: Numerical argument out of domain Sep 19 09:22:10 autopkgtest systemd[690]: Failed to add rule for system call oldfstat() / -10034, ignoring: Numerical argument out of domain Sep 19 09:22:10 autopkgtest systemd[690]: Failed to add rule for system call oldolduname() / -10036, ignoring: Numerical argument out of domain Sep 19 09:22:10 autopkgtest systemd[690]: Failed to add rule for system call oldlstat() / -10035, ignoring: Numerical argument out of domain Sep 19 09:22:10 autopkgtest systemd[690]: Failed to add rule for system call waitpid() / -10073, ignoring: Numerical argument out of domain ... This is pointless and makes debug logs hard to read. Let's keep the logs in test code, but disable it in nspawn and pid1. This is done through a function parameter because those functions operate recursively and it's not possible to make the caller to log meaningfully. There should be no functional change, except the skipped debug logs.	2018-09-24 17:21:09 +02:00
Yu Watanabe	cf37f937ee	nspawn: suppress one more log message when --quiet is passed Fixes #10119.	2018-09-19 08:42:17 +02:00
Yu Watanabe	93bab28895	tree-wide: use typesafe_qsort()	2018-09-19 08:02:52 +09:00
Zbigniew Jędrzejewski-Szmek	6d7c403324	tests: use a helper function to parse environment and open logging The advantages are that we save a few lines, and that we can override logging using environment variables in more test executables.	2018-09-14 09:29:57 +02:00
afg	27b620b7db	nspawn: use copy-static if systemd-resolved is up and image is writable	2018-09-12 20:48:21 +02:00
Franck Bui	03d0f4b58e	nspawn: always use mode 555 for /sys When a network namespace is needed, /sys is mounted as tmpfs (see commit `d8fc6a000f` for details). But in this case mode 755 was used as initial permissions for /sys whereas the default mode for sysfs is 555. In practice using 755 doesn't have any impact because /sys is mounted read-only too but for consistency, let's use the correct mode. Fixes: #10050	2018-09-11 00:34:00 +02:00
Yu Watanabe	f55b0d3fd6	nspawn: replace udev_device by sd_device	2018-08-23 04:57:39 +09:00
Zbigniew Jędrzejewski-Szmek	7692fed98b	Merge pull request #9783 from poettering/get-user-creds-flags beef up get_user_creds() a bit and other improvements	2018-08-21 10:09:33 +02:00
Lennart Poettering	8967f29169	nspawn: add two missing OOM checks	2018-08-20 15:58:11 +02:00
Lennart Poettering	8dfce114ab	nspawn: make sure to create /dev/char/x:y symlinks in nspawn containers too On the host udev creates these, but they are useful API, hence create them in nspawn containers too.	2018-08-20 15:58:11 +02:00
Lennart Poettering	37ec0fdd34	tree-wide: add clickable man page link to all --help texts This is a bit like the info link in most of GNU's --help texts, but we don't do info but man pages, and we make them properly clickable on terminal supporting that, because awesome. I think it's generally advisable to link up our (brief) --help texts and our (more comprehensive) man pages a bit, so this should be an easy and straight-forward way to do it.	2018-08-20 11:33:04 +02:00
Yu Watanabe	4ae25393f3	tree-wide: shorten error logging a bit Continuation of `4027f96aa0`.	2018-08-07 10:14:33 +09:00
Luke Shumaker	677a72cd3e	nspawn: mount_sysfs(): Unconditionally mkdir /sys/fs/cgroup Currently, mount_sysfs() only creates /sys/fs/cgroup if cg_ns_supported(). The comment explains that we need to "Create mountpoint for cgroups. Otherwise we are not allowed since we remount /sys read-only."; that is: that we need to do it now, rather than later. However, the comment doesn't do anything to explain why we only need to do this if cg_ns_supported(); shouldn't we _always_ need to do it? The answer is that if !use_cgns, then this was already done by the outer child, so mount_sysfs() only needs to do it if use_cgns. Now, mount_sysfs() doesn't know whether use_cgns, but !cg_ns_supported() implies !use_cgns, so we can "optimize" the case where we _know_ !use_cgns, and deal with a no-op mkdir_p() in the false-positive where cg_ns_supported() but !use_cgns. But is it really much of an optimization? We're potentially spending an access(2) (cg_ns_supported() could be cached from a previous call) to potentially save an lstat(2) and mkdir(2); and all of them are on virtual filesystems, so they should all be pretty cheap. So, simplify and drop the conditional. It's a dubious optimization that requires more text to explain than it's worth.	2018-07-20 12:12:03 -04:00
Luke Shumaker	93dbdf6cb1	nspawn: sync_cgroup(): Rename arg_uid_shift -> uid_shift Naming it arg_uid_shift is confusing because of the global arg_uid_shift in nspawn.c	2018-07-20 12:12:02 -04:00

1046 commits