Systemd

Author	SHA1	Message	Date
Yu Watanabe	fa65c28176	namespace: rename parse_protect_{home,system}_or_bool() to protect_{home,system}_or_bool_to_string() Hence, we can define config_parse_protect_{home,system}() by using DEFINE_CONFIG_PARSE_ENUM() macro.	2018-05-31 11:09:41 +09:00
Lennart Poettering	4e2c0a227e	namespace: extend list of masked files by ProtectKernelTunables= This adds a number of entries nspawn already applies to regular service namespacing too. Most importantly let's mask /proc/kcore and /proc/kallsyms too.	2018-05-03 17:46:31 +02:00
Lennart Poettering	da6053d0a7	tree-wide: be more careful with the type of array sizes Previously we were a bit sloppy with the index and size types of arrays, we'd regularly use unsigned. While I don't think this ever resulted in real issues I think we should be more careful there and follow a stricter regime: unless there's a strong reason not to use size_t for array sizes and indexes, size_t it should be. Any allocations we do ultimately will use size_t anyway, and converting forth and back between unsigned and size_t will always be a source of problems. Note that on 32bit machines "unsigned" and "size_t" are equivalent, and on 64bit machines our arrays shouldn't grow that large anyway, and if they do we have a problem, however that kind of overly large allocation we have protections for usually, but for overflows we do not have that so much, hence let's add it. So yeah, it's a story of the current code being already "good enough", but I think some extra type hygiene is better. This patch tries to be comprehensive, but it probably isn't and I missed a few cases. But I guess we can cover that later as we notice it. Among smaller fixes, this changes: 1. strv_length()'s return type becomes size_t 2. the unit file changes array size becomes size_t 3. DNS answer and query array sizes become size_t Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=76745	2018-04-27 14:29:06 +02:00
Lennart Poettering	088696fe29	namespace: rework how we resolve symlinks in mount points Before this patch we'd resolve all symlinks of bind mounts and other mount points to establish for a service in advance, and only then start mounting them. This is problematic, if symlink chains jump around between directories in a namespace tree, so that to resolve a specific symlink chain we need to establish another mount already. A typical case where this happens is if /etc/resolv.conf is a symlink to some file in /run: in that case we'd normally resolve and mount /etc/resolv.conf early on, but that's broken, as to do this properly we'd need to resolve /etc/resolv.conf first, then figure out that /run needs to be mounted before we can proceed, and thus reorder the order in which we apply mounts dynamically. With this change, whenever we are about to apply a mount, we'll do a single step of the symlink normalization process, patch the mount entry accordingly, and then sort the list of mounts to establish again, taking the new path into account. This means that we can correctly deal with the example above: we might start with wanting to mount /etc/resolv.conf early, but after resolving it to the path in /run/ we'd push it to the end of the list, ensuring that /run is mounted first. (Note that this also fixes another bug: we were following symlinks on the bind mount source relative to the root directory of the service, rather than of the host. That's wrong though as we explicitly document that the source of bind mounts is always on the host.)	2018-04-18 14:17:50 +02:00
Lennart Poettering	e871786273	namespace: improve logging when creating mount source nodes	2018-04-18 14:15:48 +02:00
Lennart Poettering	f8b64b5723	namespace: split out calls to normalize mount entry list into new function	2018-04-18 14:15:48 +02:00
Lennart Poettering	c9ef8573be	namespace: don't consider raw image read-only if /home in it is writable	2018-04-18 14:15:48 +02:00
Lennart Poettering	12777909c9	Merge pull request #8417 from brauner/2018-03-09/add_bind_mount_fallback_to_private_devices core: fall back to bind-mounts for PrivateDevices= execution environments	2018-04-18 11:56:56 +02:00
Zbigniew Jędrzejewski-Szmek	af984e137e	core/namespace: rework the return semantics of clone_device_node yet again Returning 0 on not-found/wrong-type is confusing. Let's return -ENXIO in that case instead, and explicitly ignore it in the call site where we want to do that. I think this is clearer and less likely to be used erroneously in case another call site is added. C.f. `152c475f95` and `98b1d2b8d9`.	2018-04-12 18:15:33 +02:00
Christian Brauner	1649861744	core: fall back to bind-mounts for PrivateDevices= execution environments In environments where CAP_MKNOD is not available or inside user namespaces it is still desirable to enable services to use PrivateDevices=. So fall back to using bind-mounts on EPERM.	2018-04-12 18:15:12 +02:00
Zbigniew Jędrzejewski-Szmek	11a1589223	tree-wide: drop license boilerplate Files which are installed as-is (any .service and other unit files, .conf files, .policy files, etc), are left as is. My assumption is that SPDX identifiers are not yet that well known, so it's better to retain the extended header to avoid any doubt. I also kept any copyright lines. We can probably remove them, but it'd be nice to obtain explicit acks from all involved authors before doing that.	2018-04-06 18:58:55 +02:00
Yu Watanabe	1cc6c93a95	tree-wide: use TAKE_PTR() and TAKE_FD() macros	2018-04-05 14:26:26 +09:00
Lennart Poettering	62570f6f03	fs-util: add new CHASE_TRAIL_SLASH flag for chase_symlinks() This rearranges chase_symlinks() a bit: if no special flags are specified it will now revert to behaviour before `b12d25a8d6`. However, if the new CHASE_TRAIL_SLASH flag is specified it will follow the behaviour introduced by that commit. I wasn't sure which one to make the behaviour that requires specification of a flag to enable. I opted to make the "append trailing slash" behaviour the one to enable by a flag, following the thinking that the function should primarily be used to generate a normalized path, and I am pretty sure a path without trailing slash is the more "normalized" one, as the trailing slash is not really a part of it, but merely a "decorator" that tells various system calls to generate ENOTDIR if the path doesn't refer to a path. Or to say this differently: if the slash was part of normalization then we really should add it in all cases when the final path is a directory, not just when the user originally specified it. Fixes: #8544 Replaces: #8545	2018-03-22 19:54:24 +01:00
Zbigniew Jędrzejewski-Szmek	671f0f8de0	Remove /sbin from paths if split-bin is false (#8324) Follow-up for `157baa87e4`.	2018-03-01 21:48:36 +01:00
Ansgar Burchardt	7486f305cd	Include additional directories in ProtectSystem	2018-02-27 18:56:19 -03:00
Zbigniew Jędrzejewski-Szmek	aa484f3561	tree-wide: use reallocarray instead of our home-grown realloc_multiply (#8279) There isn't much difference, but in general we prefer to use the standard functions. glibc provides reallocarray since version 2.26. I moved the explicit_bzero configure test to the bottom, so that the two stdlib functions are at the bottom.	2018-02-26 21:20:00 +01:00
Lennart Poettering	13a141f046	namespace: protect bpf file system as part of ProtectKernelTunables= It also exposes kernel objects, let's better include this in ProtectKernelTunables=.	2018-02-21 16:43:36 +01:00
Yu Watanabe	e4da7d8c79	core: add new option 'tmpfs' to ProtectHome= This makes the ProtectHome= setting accept 'tmpfs'. This is mostly equivalent to `TemporaryFileSystem=/home /run/user /root`.	2018-02-21 09:18:17 +09:00
Yu Watanabe	2abd4e388a	core: add new setting TemporaryFileSystem= This introduces a new setting TemporaryFileSystem=. This is useful to hide files not relevant to the processes invoked by unit, while necessary files or directories can be still accessed by combining with Bind{,ReadOnly}Paths=.	2018-02-21 09:17:52 +09:00
Yu Watanabe	4ca763a902	core/namespace: make '-' prefix in Bind{,ReadOnly}Paths= work Each path in `Bind{,ReadOnly}Paths=` accepts a '-' prefix. However, the prefix is completely ignored. This makes it work as expected.	2018-02-21 09:07:56 +09:00
Yu Watanabe	f5c52a7724	core/namespace: remove unused argument	2018-02-21 09:05:30 +09:00
Yu Watanabe	e282f51f57	core/namespace: use free_and_replace()	2018-02-21 09:05:21 +09:00
Yu Watanabe	55fe743273	core/namespace: fix comment	2018-02-21 09:05:18 +09:00
Yu Watanabe	89bd586cd3	core/namespace: merge PRIVATE_VAR_TMP into PRIVATE_TMP	2018-02-21 09:05:16 +09:00
Yu Watanabe	2a2969fd5d	core/namespace: make arguments const if possible	2018-02-21 09:05:14 +09:00
Zbigniew Jędrzejewski-Szmek	f863b1c6fa	core: move very long argument to a separate statement I like compact, but this was a bit too much.	2018-02-15 10:10:01 +01:00
Lennart Poettering	152c475f95	namespace: fix error handling when clone_device_node() returns 0 Before this patch, we'd treat clone_device_node() returning 0 (as opposed to 1) as error, but then propagate this non-error result in confusion. This makes sure that if ptmx isn't around we propagate that as -ENXIO. This is a follow-up for `98b1d2b8d9`	2018-01-23 19:50:32 +01:00
Lennart Poettering	36ce7110b0	namespace: use is_symlink() helper We have this pretty little helper, let's use it, it makes things a tiny bit more readable.	2018-01-23 19:36:55 +01:00
Lennart Poettering	6f7f3a3351	namespace: use stack allocation for paths, where we can	2018-01-23 19:36:36 +01:00
Alan Jenkins	68f7480b7e	Merge pull request #7913 from sourcejedi/devpts 3 nitpicks from core/namespace.c	2018-01-18 21:56:26 +00:00
Alan Jenkins	225874dc9c	core: clone_device_node(): add debug message For people who use debug messages, maybe it is helpful to know that PrivateDevices= failed due to mknod(), and which device node. (The other (un-logged) failures could be while mounting filesystems e.g. no CAP_SYS_ADMIN which is the common case, or missing /dev/shm or /dev/pts, or missing /dev/ptmx).	2018-01-18 13:58:13 +00:00
Alan Jenkins	8d95368210	core: namespace: remove unnecessary mode on /dev/shm mount target This should have no behavioural effect; it just confused me. All the other mount directories in this function are created as 0755. Some of the mounts are allowed to fail - mqueue and hugepages. If the /dev/mqueue mount target was created with the permissive mode 01777, to match the filesystem we're trying to mount there, then a mount failure would allow unprivileged users to write to the /dev filesystem, e.g. to exhaust the available space. There is no reason to allow this. (Allowing the user read access (0755) seems a reasonable idea though, e.g. for quicker troubleshooting.) We do not allow failure of the /dev/shm mount, so it doesn't matter that it is created as 01777. But on the same grounds, we have no reason to create it as any specific mode. 0755 is equally fine. This function will be clearer by using 0755 throughout, to avoid unintentionally implying some connection between the mode of the mount target, and the mode of the mounted filesystem.	2018-01-17 18:04:34 +00:00
Alan Jenkins	98b1d2b8d9	core: namespace: nitpick /dev/ptmx error handling If /dev/tty did not exist, or had st_rdev == 0, we ignored it. And the same is true for null, zero, full, random, urandom. If /dev/ptmx did not exist, we treated this as a failure. If /dev/ptmx had st_rdev == 0, we ignored it. This was a very recent change, but there was no reason for ptmx creation specifically to treat st_rdev == 0 differently from non-existence. This confuses me when reading it. Change the creation of /dev/ptmx so that st_rdev == 0 is treated as failure. This still leaves /dev/ptmx as a special case with stricter handling. However it is consistent with the immediately preceding creation of /dev/pts/, which is treated as essential, and is directly related to ptmx. I don't know why we check st_rdev. But I'd prefer to have only one unanswered question here, and not to have a second unanswered question added on top.	2018-01-17 13:28:32 +00:00
Дамјан Георгиевски	414b304ba2	namespace: only make the symlink /dev/ptmx if it was already a symlink …otherwise try to clone it as a device node On most contemporary distros /dev/ptmx is a device node, and /dev/pts/ptmx has 000 inaccessible permissions. In those cases the symlink /dev/ptmx -> /dev/pts/ptmx breaks the pseudo tty support. In that case we better clone the device node. OTOH, in nspawn containers (and possibly others), /dev/pts/ptmx has normal permissions, and /dev/ptmx is a symlink. In that case make the same symlink. fixes #7878	2018-01-17 01:19:46 +01:00
Дамјан Георгиевски	b5e99f23ed	namespace: extract clone_device_node function from mount_private_dev	2018-01-16 21:41:10 +01:00
Yu Watanabe	03c791aa24	namespace: introduce parse_protect_system_or_bool()	2018-01-02 02:23:13 +09:00
Yu Watanabe	5e1c61544c	namespace: introduce parse_protect_home_or_bool()	2018-01-02 02:23:05 +09:00
Lennart Poettering	2d3a5a73e0	nspawn: make sure images containing an ESP are compatible with userns -U mode In -U mode we might need to re-chown() all files and directories to match the UID shift we want for the image. That's problematic on fat partitions, such as the ESP (and which is generated by mkosi's --bootable switch), because fat of course knows no UID/GID file ownership natively. With this change we take benefit of the uid= and gid= mount options FAT knows: instead of chown()ing all files and directories we can just specify the right UID/GID to use at mount time. This beefs up the image dissection logic in two ways: 1. First of all support for mounting relevant file systems with uid=/gid= is added: when a UID is specified during mount it is used for all applicable file systems. 2. Secondly, two new mount flags are added: DISSECT_IMAGE_MOUNT_ROOT_ONLY and DISSECT_IMAGE_MOUNT_NON_ROOT_ONLY. If one is specified the mount routine will either only mount the root partition of an image, or all partitions except the root partition. This is used by nspawn: first the root partition is mounted, so that we can determine the UID shift in use so far, based on ownership of the image's root directory. Then, we mount the remaining partitions in a second go, this time with the right UID/GID information.	2017-12-05 13:49:12 +01:00
Shawn Landden	4831981d89	tree-wide: adjust fall through comments so that gcc is happy Distcc removes comments, making the comment silencing not work. I know there was a decision against a macro in commit `ec251fe7d5`	2017-11-20 13:06:25 -08:00
Zbigniew Jędrzejewski-Szmek	53e1b68390	Add SPDX license identifiers to source files under the LGPL This follows what the kernel is doing, c.f. https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5fd54ace4721fc5ce2bb5aef6318fcf17f421460.	2017-11-19 19:08:15 +01:00
Lennart Poettering	4e0c20de97	namespace: set up OS hierarchy only after mounting the new root, not before Otherwise it's a pointless exercise, as we'll set up an empty directory tree that's never going to be used. Hence, let's move this around a bit, so that we do the basesystem initialization exactly when RootImage= or RootDirectory= are used, but not otherwise.	2017-11-13 10:22:36 +01:00
Yu Watanabe	d18aff0422	core: ReadWritePaths= and friends assume '+' prefix when BindPaths= or friends are set When at least one of BindPaths=, BindReadOnlyPaths=, RootImage=, RuntimeDirectory= or their friends are set, systemd prepares a namespace under /run/systemd/unit-root. Thus, ReadWritePaths= or their friends without '+' prefix is completely meaningless. So, let's assume '+' prefix when one of them is set. Fixes #7070 and #7080.	2017-11-08 15:48:01 +09:00
Lennart Poettering	0fa5b8312a	namespace: make ns_type_supported() a tiny bit shorter namespace_type_to_string() already validates the type parameter, we can use that, and shorten the function a bit.	2017-10-10 09:52:08 +02:00
Lennart Poettering	bb0ff3fb1b	namespace: change NameSpace → Namespace We generally use the casing "Namespace" for the word, and that's visible in a number of user-facing interfaces, including "RestrictNamespace=" or "JoinsNamespaceOf=". Let's make sure to use the same casing internally too. As discussed in #7024	2017-10-10 09:51:58 +02:00
Michal Sekletar	6e2d7c4f13	namespace: fall back gracefully when kernel doesn't support network namespaces (#7024)	2017-10-10 09:46:13 +02:00
Zbigniew Jędrzejewski-Szmek	349cc4a507	build-sys: use #if Y instead of #ifdef Y everywhere The advantage is that if the name is misspelled, cpp will warn us. $ git grep -Ee "conf.set\('(HAVE|ENABLE)_" -l|xargs sed -r -i "s/conf.set\('(HAVE|ENABLE)_/conf.set10('\1_/" $ git grep -Ee '#ifn?def (HAVE|ENABLE)' -l|xargs sed -r -i 's/#ifdef (HAVE|ENABLE)/#if \1/; s/#ifndef (HAVE|ENABLE)/#if ! \1/;' $ git grep -Ee 'if.*defined\(HAVE' -l|xargs sed -i -r 's/defined\((HAVE_[A-Z0-9_]*)\)/\1/g' $ git grep -Ee 'if.*defined\(ENABLE' -l|xargs sed -i -r 's/defined\((ENABLE_[A-Z0-9_]*)\)/\1/g' + manual changes to meson.build squash! build-sys: use #if Y instead of #ifdef Y everywhere v2: - fix incorrect setting of HAVE_LIBIDN2	2017-10-04 12:09:29 +02:00
Lennart Poettering	6c47cd7d3b	execute: make StateDirectory= and friends compatible with DynamicUser=1 and RootDirectory=/RootImage= Let's clean up the interaction of StateDirectory= (and friends) to DynamicUser=1: instead of creating these directories directly below /var/lib, place them in /var/lib/private instead if DynamicUser=1 is set, making that directory 0700 and owned by root:root. This way, if a dynamic UID is later reused, access to the old run's state directory is prohibited for that user. Then, use file system namespacing inside the service to make /var/lib/private a readable tmpfs, hiding all state directories that are not listed in StateDirectory=, and making access to the actual state directory possible. Mount all directories listed in StateDirectory= to the same places inside the service (which means they'll now be mounted into the tmpfs instance). Finally, add a symlink from the state directory name in /var/lib/ to the one in /var/lib/private, so that both the host and the service can access the path under the same location. Here's an example: let's say a service runs with StateDirectory=foo. When DynamicUser=0 is set, it will get the following setup, and no difference between what the unit and what the host sees: /var/lib/foo (created as directory) Now, if DynamicUser=1 is set, we'll instead get this on the host: /var/lib/private (created as directory with mode 0700, root:root) /var/lib/private/foo (created as directory) /var/lib/foo → private/foo (created as symlink) And from inside the unit: /var/lib/private (a tmpfs mount with mode 0755, root:root) /var/lib/private/foo (bind mounted from the host) /var/lib/foo → private/foo (the same symlink as above) This takes inspiration from how container trees are protected below /var/lib/machines: they generally reuse UIDs/GIDs of the host, but because /var/lib/machines itself is set to 0700 host users cannot access files in the container tree even if the UIDs/GIDs are reused. 
However, for this commit we add one further trick: inside and outside of the unit /var/lib/private is a different thing: outside it is a plain, inaccessible directory, and inside it is a world-readable tmpfs mount with only the whitelisted subdirs below it, bind mounted in. This means, from the outside the dir acts as an access barrier, but from the inside it does not. And the symlink created in /var/lib/foo itself points across the barrier in both cases, so that root and the unit's user always have access to these dirs without knowing the details of this mounting magic. This logic resolves a major shortcoming of DynamicUser=1 units: previously they couldn't safely store persistent data. With this change they can have their own private state, log and data directories, which they can write to, but which are protected from UID recycling. With this change, if RootDirectory= or RootImage= are used it is ensured that the specified state/log/cache directories are always mounted in from the host. This change of semantics I think is much preferable since this means the root directory/image logic can be used easily for read-only resource bundling (as all writable data resides outside of the image). Note that this is a change of behaviour, but given that we haven't released any systemd version with StateDirectory= and friends implemented this should be a safe change to make (in particular as previously it wasn't clear what would actually happen when used in combination). Moreover, by making this change we can later add a "+" modifier to these settings too working similar to the same modifier in ReadOnlyPaths= and friends, making specified paths relative to the container itself.	2017-10-02 17:41:44 +02:00
Lennart Poettering	a227a4be48	namespace: if we can create the destination of bind and PrivateTmp= mounts When putting together the namespace, always create the file or directory we are supposed to bind mount on, the same way we do it for most other stuff, for example mount units or systemd-nspawn's --bind= option. This has the big benefit that we can use namespace bind mounts on dirs in /tmp or /var/tmp even in conjunction with PrivateTmp=.	2017-10-02 17:41:43 +02:00
Lennart Poettering	e908468b5b	namespace: properly handle bind mounts from the host Before this patch we had an ordering problem: if we have no namespacing enabled except for two bind mounts that intend to swap /a and /b via bind mounts, then we'd execute the bind mount binding /b to /a, followed by the bind mount from /a to /b, thus having the effect that /b is now visible in both /a and /b, which was not intended. With this change, as soon as any bind mount is configured we'll put together the service mount namespace in a temporary directory instead of operating directly in the root. This solves the problem in a straightforward fashion: the source of bind mounts will always refer to the host, and thus be unaffected from the bind mounts we already created.	2017-10-02 17:41:43 +02:00
Lennart Poettering	645767d6b5	namespace: create /dev, /proc, /sys when needed We already create /dev implicitly if PrivateTmp=yes is on, if it is missing. Do so too for the other two API VFS, as well as for /dev if PrivateTmp=yes is off but MountAPIVFS=yes is on (i.e. when /dev is bind mounted from the host).	2017-10-02 17:41:43 +02:00
Topi Miettinen	07ce74074d	namespace: avoid assertion failure (#6649) If the root image is not decrypted, it must not be relinquished.	2017-08-29 17:31:24 +02:00
Nicolas Iooss	3a0bf6d6aa	namespace: keep selinuxfs mounted read-write with ProtectKernelTunables (#5741) When a service unit uses "ProtectKernelTunables=yes", it currently remounts /sys/fs/selinux read-only. This makes libselinux report SELinux state as "disabled", because most SELinux features are not usable. For example it is not possible to validate security contexts (with security_check_context_raw() or /sys/fs/selinux/context). This behavior of libselinux has been described in http://danwalsh.livejournal.com/73099.html and confirmed in a recent email, https://marc.info/?l=selinux&m=149220233032594&w=2 . Since commit `0c28d51ac8` ("units: further lock down our long-running services"), systemd-localed unit uses ProtectKernelTunables=yes. Nevertheless this service needs to use libselinux API in order to create /etc/vconsole.conf, /etc/locale.conf... with the right SELinux contexts. This is broken when /sys/fs/selinux is mounted read-only in the mount namespace of the service. Make SELinux-aware systemd services work again when they are using ProtectKernelTunables=yes by keeping selinuxfs mounted read-write.	2017-07-31 17:45:33 +02:00
Timothée Ravier	ac9de0b379	core: open /proc/self/mountinfo early to allow mounts over /proc (#5985) Enable masking the /proc folder using the 'InaccessiblePaths' unit option. This also slightly simplifies mount setup, as the bind_remount_recursive function will only open /proc/self/mountinfo once. This is based on the suggestion at: https://lists.freedesktop.org/archives/systemd-devel/2017-April/038634.html	2017-05-19 14:38:40 +02:00
Djalal Harouni	9c988f934b	namespace: Apply MountAPIVFS= only when a Root directory is set The MountAPIVFS= documentation says that this option has no effect unless used in conjunction with RootDirectory= or RootImage=. Let's fix this and avoid creating private mount namespaces where they are not needed.	2017-03-05 21:39:43 +01:00
Djalal Harouni	10404d52e3	namespace: create base-filesystem directories if RootImage= or RootDirectory= are set When a service is started with its own file system image, always try to create the base-filesystem directories that are needed. This implicitly covers the directories handled by MountAPIVFS= {/proc|/sys|/dev}. Mount protections or MountAPIVFS= mounts were never applied if we changed the root directory and the related paths were not present under the new root. The mounts were silently. Fix this by creating those directories if they are missing. Closes https://github.com/systemd/systemd/issues/5488	2017-03-05 21:19:29 +01:00
AsciiWolf	13e785f7a0	Fix missing space in comments (#5439)	2017-02-24 18:14:02 +01:00
Lennart Poettering	78ebe98061	core,nspawn,dissect: make nspawn's .roothash file search reusable This makes nspawn's logic of automatically discovering the root hash of an image file generic, and then reuses it in systemd-dissect and in PID1's RootImage= logic, so that verity is automatically set up whenever we can.	2017-02-07 12:21:28 +01:00
Lennart Poettering	915e6d1676	core: add RootImage= setting for using a specific image file as root directory for a service This is similar to RootDirectory= but mounts the root file system from a block device or loopback file instead of another directory. This reuses the image dissector code now used by nspawn and gpt-auto-discovery.	2017-02-07 12:19:42 +01:00
Lennart Poettering	5d997827e2	core: add a per-unit setting MountAPIVFS= for mounting /dev, /proc, /sys in conjunction with RootDirectory= This adds a boolean unit file setting MountAPIVFS=. If set, the three main API VFS mounts will be mounted for the service. This only has an effect on RootDirectory=, which it makes a ton times more useful. (This is basically the /dev + /proc + /sys mounting code posted in the original #4727, but rebased on current git, and with the automatic logic replaced by explicit logic controlled by a unit file setting)	2017-02-07 11:22:05 +01:00
Lennart Poettering	1eb7e08e20	core: fix minor memleak in namespace.c The source_malloc field wants to be freed, too.	2017-02-07 11:22:05 +01:00
Lennart Poettering	d2d6c096f6	core: add ability to define arbitrary bind mounts for services This adds two new settings BindPaths= and BindReadOnlyPaths=. They allow defining arbitrary bind mounts specific to particular services. This is particularly useful for services with RootDirectory= set as this permits making specific bits of the host directory available to chrooted services. The two new settings follow the concepts nspawn already possess in --bind= and --bind-ro=, as well as the .nspawn settings Bind= and BindReadOnly= (and these latter options should probably be renamed to BindPaths= and BindReadOnlyPaths= too). Fixes: #3439	2016-12-14 00:54:10 +01:00
Lennart Poettering	8fceda937f	namespace: instead of chasing mount symlinks a priori, do so as-we-go This is relevant as many of the mounts we try to establish only can be followed when some other prior mount that is a prefix of it is established. Hence: move the symlink chasing into the actual mount functions, so that we do it as late as possible but as early as necessary. Fixes: #4588	2016-12-14 00:51:37 +01:00
Lennart Poettering	34de407a4f	core: rename BindMount structure → MountEntry After all, these don't strictly encapsulate bind mounts anymore, and we are preparing this for adding arbitrary user-defined bind mounts in a later commit, at which point this would become really confusing. Let's clean this up, rename the BindMount structure to MountEntry, so that it is clear that it can contain information about any kind of mount.	2016-12-14 00:48:52 +01:00
Lennart Poettering	cfbeb4ef8d	namespace: add explicit read-only flag This reworks handling of the read-only management for mount points. This will become handy as soon as we add arbitrary bind mount support (which comes in a later commit).	2016-12-14 00:42:01 +01:00
Lennart Poettering	ddbe041277	namespace: reindent protect_system_strict_table[] as well All other tables got reindented, but one was forgotten. Fix that.	2016-12-13 21:22:13 +01:00
Lennart Poettering	c4f4fce79e	fs-util: add flags parameter to chase_symlinks() Let's remove chase_symlinks_prefix() and instead introduce a flags parameter to chase_symlinks(), with a flag CHASE_PREFIX_ROOT that exposes the behaviour of chase_symlinks_prefix().	2016-12-01 00:25:51 +01:00
Lennart Poettering	e187369587	tree-wide: stop using canonicalize_file_name(), use chase_symlinks() instead Let's use chase_symlinks() everywhere, and stop using GNU canonicalize_file_name() everywhere. For most cases this should not change behaviour, however increase exposure of our function to get better tested. Most importantly in a few cases (most notably nspawn) it can take the correct root directory into account when chasing symlinks.	2016-12-01 00:25:51 +01:00
Lennart Poettering	aa70f38b5c	namespace: clarify that /proc/apm is obsolete, but leave it blocked	2016-11-17 18:10:30 +01:00
Lennart Poettering	c6232fb0e9	namespace: reindent namespace tables Let's align all our BindMount tables, let's use the same column widths in all of them, and let's make them not any wider than necessary. This only changes whitespace, not contents of any of the tables.	2016-11-17 18:09:16 +01:00
Lennart Poettering	5327c910d2	namespace: simplify, optimize and extend handling of mounts for namespace This changes a couple of things in the namespace handling: It merges the BindMount and TargetMount structures. They are mostly the same, hence let's just use the same structure, and rely on C's implicit zero initialization of partially initialized structures for the unneeded fields. This reworks memory management of each entry a bit. It now contains one "const" and one "malloc" path. We use the former whenever we can, but use the latter when we have to, which is the case when we have to chase symlinks or prefix a root directory. This means in the common case we don't actually need to allocate any dynamic memory. To make this easy to use we add an accessor function bind_mount_path() which retrieves the right path string from a BindMount structure. While we are at it, also permit "+" as prefix for dirs configured with ReadOnlyPaths= and friends: if specified the root directory of the unit is implicitly prefixed. This also drops set_bind_mount() and uses C99 structure initialization instead, which I think is more readable and clarifies what is being done. This drops append_protect_kernel_tunables() and append_protect_kernel_modules() as append_static_mounts() is now simple enough to be called directly. Prefixing with the root dir is now done in an explicit step in prefix_where_needed(). It will prepend the root directory on each entry that doesn't have it prefixed yet. The latter is determined depending on an extra bit in the BindMount structure.	2016-11-17 18:08:32 +01:00
Djalal Harouni	1d54cd5d25	core:namespace: count and free failed paths inside chase_all_symlinks() (#4619) This certainly fixes a bug that was introduced by PR https://github.com/systemd/systemd/pull/4594 that intended to fix https://github.com/systemd/systemd/issues/4567. The fix was not complete. This patch makes sure that we count and free all paths that fail inside chase_all_symlinks(). Fixes https://github.com/systemd/systemd/issues/4567	2016-11-10 12:11:37 -05:00
Djalal Harouni	af964954c6	core: on DynamicUser= make sure that protecting sensitive paths is enforced (#4596) This adds a variable that is always set to false to make sure that protect paths inside sandbox are always enforced and not ignored. The only case when it is set to true is on DynamicUser=no and RootDirectory=/chroot is set. This allows users to use more of our sandbox features inside RootDirectory= The only exception is ProtectSystem=full|strict and when DynamicUser=yes is implied. Currently RootDirectory= is not fully compatible with these due to two reasons: * /chroot/usr|etc has to be present on ProtectSystem=full * /chroot// has to be a mount point on ProtectSystem=strict.	2016-11-08 21:57:32 -05:00
Zbigniew Jędrzejewski-Szmek	46c3230dd0	nspawn: slight simplification	2016-11-07 08:57:30 -05:00
Zbigniew Jędrzejewski-Szmek	49fedb4094	nspawn: avoid one strdup by using free_and_replace	2016-11-07 08:54:47 -05:00
Djalal Harouni	f0a4feb0a5	core: make RootDirectory= and ProtectKernelModules= work Instead of having two fields inside BindMount struct where one is stack based and the other one is heap, use one field to store the full path and updated it when we chase symlinks. This way we avoid dealing with both at the same time. This makes RootDirectory= work with ProtectHome= and ProtectKernelModules=yes Fixes: https://github.com/systemd/systemd/issues/4567	2016-11-07 12:34:52 +01:00
Zbigniew Jędrzejewski-Szmek	605405c6cc	tree-wide: drop NULL sentinel from strjoin This makes strjoin and strjoina more similar and avoids the useless final argument. spatch -I . -I ./src -I ./src/basic -I ./src/basic -I ./src/shared -I ./src/shared -I ./src/network -I ./src/locale -I ./src/login -I ./src/journal -I ./src/journal -I ./src/timedate -I ./src/timesync -I ./src/nspawn -I ./src/resolve -I ./src/resolve -I ./src/systemd -I ./src/core -I ./src/core -I ./src/libudev -I ./src/udev -I ./src/udev/net -I ./src/udev -I ./src/libsystemd/sd-bus -I ./src/libsystemd/sd-event -I ./src/libsystemd/sd-login -I ./src/libsystemd/sd-netlink -I ./src/libsystemd/sd-network -I ./src/libsystemd/sd-hwdb -I ./src/libsystemd/sd-device -I ./src/libsystemd/sd-id128 -I ./src/libsystemd-network --sp-file coccinelle/strjoin.cocci --in-place $(git ls-files src/*.c) git grep -e '\bstrjoin\b.*NULL' -l\|xargs sed -i -r 's/strjoin\((.*), NULL\)/strjoin(\1)/' This might have missed a few cases (spatch has a really hard time dealing with _cleanup_ macros), but that's no big issue, they can always be fixed later.	2016-10-23 11:43:27 -04:00
Djalal Harouni	c575770b75	core:sandbox: let's make /lib/modules/ inaccessible on ProtectKernelModules= Let's go further and make /lib/modules/ inaccessible for services that do not have business with modules, this is a minor improvement but it may help on setups with custom modules and they are limited... in regard of kernel auto-load feature. This change introduces NameSpaceInfo struct which we may embed later inside ExecContext but for now let's just reduce the argument number to setup_namespace() and merge ProtectKernelModules feature.	2016-10-12 14:11:16 +02:00
Djalal Harouni	b6c432ca7e	core:namespace: simplify ProtectHome= implementation As with previous patch simplify ProtectHome and don't care about duplicates, they will be sorted by most restrictive mode and cleaned.	2016-09-25 12:41:16 +02:00
Djalal Harouni	f471b2afa1	core: simplify ProtectSystem= implementation ProtectSystem= with all its different modes and other options like PrivateDevices= + ProtectKernelTunables= + ProtectHome= are orthogonal, however currently it's a bit hard to parse that from the implementation view. Simplify it by giving each mode its own table with all paths and references to other Protect options. With this change some entries are duplicated, but we do not care since duplicate mounts are first sorted by the most restrictive mode then cleaned.	2016-09-25 12:21:25 +02:00
Djalal Harouni	49accde7bd	core:sandbox: add more /proc/* entries to ProtectKernelTunables= Make ALSA entries, latency interface, mtrr, apm/acpi, suspend interface, filesystems configuration and IRQ tuning readonly. Most of these interfaces nowadays should be in /sys but they are still available through /proc, so just protect them. This patch does not touch /proc/net/...	2016-09-25 11:30:11 +02:00
Djalal Harouni	2652c6c103	core:namespace: simplify mount calculation Move out mount calculation on its own function. Actually the logic is smart enough to later drop nop and duplicates mounts, this change improves code readability. --- src/core/namespace.c \| 47 ++++++++++++++++++++++++++++++++++++----------- 1 file changed, 36 insertions(+), 11 deletions(-)	2016-09-25 11:25:00 +02:00
Djalal Harouni	11a30cec2a	core:namespace: put paths protected by ProtectKernelTunables= in Instead of having all these paths everywhere, put the ones that are protected by ProtectKernelTunables= into their own table. This way it is easy to add paths and track which ones are protected.	2016-09-25 11:16:44 +02:00
Djalal Harouni	9c94d52e09	core:namespace: minor improvements to append_mounts()	2016-09-25 11:03:21 +02:00
Lennart Poettering	cd2902c954	namespace: drop all mounts outside of the new root directory There's no point in mounting these, if they are outside of the root directory we'll move to.	2016-09-25 10:52:57 +02:00
Lennart Poettering	8f1ad200f0	namespace: don't make the root directory of a namespace a mount if it already is one Let's not stack mounts needlessly.	2016-09-25 10:42:18 +02:00
Lennart Poettering	d944dc9553	namespace: chase symlinks for mounts to set up in userspace This adds logic to chase symlinks for all mount points that shall be created in a namespace environment in userspace, instead of leaving this to the kernel. This has the advantage that we can correctly handle absolute symlinks that shall be taken relative to a specific root directory. Moreover, we can properly handle mounts created on symlinked files or directories as we can merge their mounts as necessary. (This also drops the "done" flag in the namespace logic, which was never actually working, but was supposed to permit a partial rollback of the namespace logic, which however is only mildly useful as it wasn't clear in which case it would or would not be able to roll back.) Fixes: #3867	2016-09-25 10:42:18 +02:00
Lennart Poettering	1e4e94c881	namespace: invoke unshare() only after checking all parameters Let's create the new namespace only after we validated and processed all parameters, right before we start with actually mounting things. This way, the window where we can roll back is larger (not that it matters IRL...)	2016-09-25 10:42:18 +02:00
Lennart Poettering	3f815163ff	core: introduce ProtectSystem=strict Let's tighten our sandbox a bit more: with this change ProtectSystem= gains a new setting "strict". If set, the entire directory tree of the system is mounted read-only, but the API file systems /proc, /dev, /sys are excluded (they may be managed with PrivateDevices= and ProtectKernelTunables=). Also, /home and /root are excluded as those are left for ProtectHome= to manage. In this mode, all "real" file systems (i.e. non-API file systems) are mounted read-only, and specific directories may only be excluded via ReadWriteDirectories=, thus implementing an effective whitelist instead of blacklist of writable directories. While we are at, also add /efi to the list of paths always affected by ProtectSystem=. This is a follow-up for `b52a109ad3` which added /efi as alternative for /boot. Our namespacing logic should respect that too.	2016-09-25 10:42:18 +02:00
Lennart Poettering	160cfdbed3	namespace: add some debug logging when enforcing InaccessiblePaths=	2016-09-25 10:42:18 +02:00
Lennart Poettering	6b7c9f8bce	namespace: rework how ReadWritePaths= is applied Previously, if ReadWritePaths= was nested inside a ReadOnlyPaths= specification, then we'd first recursively apply the ReadOnlyPaths= paths, and make everything below read-only, only in order to then flip the read-only bit again for the subdirs listed in ReadWritePaths= below it. This is not only ugly (as for the dirs in question we first turn on the RO bit, only to turn it off again immediately after), but also problematic in containers, where a container manager might have marked a set of dirs read-only and this code will undo this if ReadWritePaths= is set for any. With this patch behaviour in this regard is altered: ReadOnlyPaths= will not be applied to the children listed in ReadWritePaths= in the first place, so that we do not need to turn off the RO bit for those after all. This means that ReadWritePaths=/ReadOnlyPaths= may only be used to turn on the RO bit, but never to turn it off again. Or to say this differently: if some dirs are marked read-only via some external tool, then ReadWritePaths= will not undo it. This is not only the safer option, but also more in-line with what the man page currently claims: "Entries (files or directories) listed in ReadWritePaths= are accessible from within the namespace with the same access rights as from outside." To implement this change bind_remount_recursive() gained a new "blacklist" string list parameter, which when passed may contain subdirs that shall be excluded from the read-only mounting. A number of functions are updated to add more debug logging to make this more digestible.	2016-09-25 10:40:51 +02:00
Lennart Poettering	7648a565d1	namespace: when enforcing fs namespace restrictions suppress redundant mounts If /foo is marked to be read-only, and /foo/bar too, then the latter may be suppressed as it has no effect.	2016-09-25 10:19:15 +02:00
Lennart Poettering	6ee1a919cf	namespace: simplify mount_path_compare() a bit	2016-09-25 10:19:10 +02:00
Lennart Poettering	fe3c2583be	namespace: make sure InaccessibleDirectories= masks all mounts further down If a dir is marked to be inaccessible then everything below it should be masked by it.	2016-09-25 10:18:51 +02:00
Lennart Poettering	59eeb84ba6	core: add two new service settings ProtectKernelTunables= and ProtectControlGroups= If enabled, these will block write access to /sys, /proc/sys and /sys/fs/cgroup.	2016-09-25 10:18:48 +02:00
Martin Pitt	5c3c778014	Merge pull request #3764 from poettering/assorted-stuff-2 Assorted fixes	2016-07-22 09:10:04 +02:00
Topi Miettinen	176e51b710	namespace: fix wrong return value from mount(2) (#3758) Fix bug introduced by #3263: mount(2) return value is 0 or -1, not errno. Thanks to Evgeny Vereshchagin (@evverx) for reporting.	2016-07-20 17:43:21 +03:00
Lennart Poettering	fe048ce56a	namespace: add a (void) cast	2016-07-20 14:53:15 +02:00
Lennart Poettering	5fd7cf6fe2	namespace: minor improvements We generally try to avoid strerror(), due to its threads-unsafety, let's do this here, too. Also, let's be tiny bit more explanatory with the log messages, and let's shorten a few things.	2016-07-20 08:57:25 +02:00
Alessandro Puccetti	2a624c36e6	doc,core: Read{Write,Only}Paths= and InaccessiblePaths= This patch renames Read{Write,Only}Directories= and InaccessibleDirectories= to Read{Write,Only}Paths= and InaccessiblePaths=, previous names are kept as aliases but they are not advertised in the documentation. Renamed variables: `read_write_dirs` --> `read_write_paths` `read_only_dirs` --> `read_only_paths` `inaccessible_dirs` --> `inaccessible_paths`	2016-07-19 17:22:02 +02:00
Alessandro Puccetti	c4b4170746	namespace: unify limit behavior on non-directory paths Despite the name, `Read{Write,Only}Directories=` already allows for regular file paths to be masked. This commit adds the same behavior to `InaccessibleDirectories=` and makes it explicit in the doc. This patch introduces `/run/systemd/inaccessible/{reg,dir,chr,blk,fifo,sock}` {file,device}nodes and mounts on the appropriate one the paths specified in `InaccessibleDirectories=`. Based on Luca's patch from https://github.com/systemd/systemd/pull/3327	2016-07-19 17:22:02 +02:00
topimiettinen	737ba3c82c	namespace: Make private /dev noexec and readonly (#3263) Private /dev will not be managed by udev or others, so we can make it noexec and readonly after we have made all device nodes. As /dev/shm needs to be writable, we can't use bind_remount_recursive().	2016-05-15 22:34:05 -04:00
topimiettinen	9e5f825280	namespace: unmount old /dev under our new private /dev (#3254) Drop all dangling old /dev mounts before mounting a new private /dev tree.	2016-05-14 12:46:23 -04:00
Daniel Mack	9ca6ff50ab	Remove kdbus custom endpoint support This feature will not be used anytime soon, so remove a bit of cruft. The BusPolicy= config directive will stay around as compat noop.	2016-02-11 22:12:04 +01:00
Daniel Mack	b26fa1a2fb	tree-wide: remove Emacs lines from all files This should be handled fine now by .dir-locals.el, so no need to carry that stuff in every file.	2016-02-10 13:41:57 +01:00
Lennart Poettering	b5efdb8af4	util-lib: split out allocation calls into alloc-util.[ch]	2015-10-27 13:45:53 +01:00
Lennart Poettering	ee104e11e3	user-util: move UID/GID related macros from macro.h to user-util.h	2015-10-27 13:25:57 +01:00
Lennart Poettering	affb60b1ef	util-lib: split out umask-related code to umask-util.h	2015-10-27 13:25:56 +01:00
Lennart Poettering	8b43440b7e	util-lib: move string table stuff into its own string-table.[ch]	2015-10-27 13:25:56 +01:00
Lennart Poettering	4349cd7c1d	util-lib: move mount related utility calls to mount-util.[ch]	2015-10-27 13:25:55 +01:00
Lennart Poettering	2583fbea8e	socket-util: move remaining socket-related calls from util.[ch] to socket-util.[ch]	2015-10-26 01:24:39 +01:00
Lennart Poettering	3ffd4af220	util-lib: split out fd-related operations into fd-util.[ch] There are more than enough to deserve their own .c file, hence move them over.	2015-10-25 13:19:18 +01:00
Lennart Poettering	07630cea1f	util-lib: split our string related calls from util.[ch] into its own file string-util.[ch] There are more than enough calls doing string manipulations to deserve its own files, hence do something about it. This patch also sorts the #include blocks of all files that needed to be updated, according to the sorting suggestions from CODING_STYLE. Since pretty much every file needs our string manipulation functions this effectively means that most files have sorted #include blocks now. Also touches a few unrelated include files.	2015-10-24 23:05:02 +02:00
Lennart Poettering	3ee897d6c2	tree-wide: port more code to use send_one_fd() and receive_one_fd() Also, make it slightly more powerful, by accepting a flags argument, and make it safe for handling if more than one cmsg attribute happens to be attached.	2015-09-29 21:08:37 +02:00
Lennart Poettering	1f6b411372	tree-wide: update empty-if coccinelle script to cover empty-while and more Let's also clean up single-line while and for blocks.	2015-09-09 14:59:51 +02:00
Lennart Poettering	94c156cd45	tree-wide: make use of log_error_errno() return value in more cases The previous coccinelle semantic patch that improved usage of log_error_errno()'s return value, only looked for log_error_errno() invocations with a single parameter after the error parameter. Update the patch to handle arbitrary numbers of additional arguments.	2015-09-09 14:58:26 +02:00
Lennart Poettering	76ef789d26	tree-wide: make use of log_error_errno() return value Turns this: r = -errno; log_error_errno(errno, "foo"); into this: r = log_error_errno(errno, "foo"); and this: r = log_error_errno(errno, "foo"); return r; into this: return log_error_errno(errno, "foo");	2015-09-09 08:20:20 +02:00
Lennart Poettering	2a1288ff89	util: introduce CMSG_FOREACH() macro and make use of it everywhere It's only marginally shorter than the usual for() loop, but certainly more readable.	2015-06-10 19:29:47 +02:00
Jason Pleau	d38e01dc96	core/namespace: Protect /usr instead of /home with ProtectSystem=yes A small typo in `ee818b8` caused /home to be put in read-only instead of /usr when ProtectSystem was enabled (ie: not set to "no").	2015-05-31 20:29:36 +02:00
Lennart Poettering	03cfe0d514	nspawn: finish user namespace support	2015-05-21 16:32:01 +02:00
Lennart Poettering	6458ec20b5	core,nspawn: unify code that moves the root dir	2015-05-20 14:38:12 +02:00
Alban Crequy	ee818b89f4	core: Private/Protect options with RootDirectory When a service is chrooted with the option RootDirectory=/opt/..., then the options PrivateDevices, PrivateTmp, ProtectHome, ProtectSystem must mount the directories under $RootDirectory/{dev,tmp,home,usr,boot}. The test-ns tool can test setup_namespace() with and without chroot: $ sudo TEST_NS_PROJECTS=/home/lennart/projects ./test-ns $ sudo TEST_NS_CHROOT=/home/alban/debian-tree TEST_NS_PROJECTS=/home/alban/debian-tree/home/alban/Documents ./test-ns	2015-05-18 18:47:45 +02:00
Lennart Poettering	5a8af538ae	nspawn: rework custom mount point order, and add support for overlayfs Previously all bind mounts were applied in the order specified, followed by all tmpfs mounts in the order specified. This is problematic, if bind mounts shall be placed within tmpfs mounts. This patch hence reworks the custom mount point logic, and always applies them in strict prefix-first order. This means the order of mounts specified on the command line becomes irrelevant, the right operation will always be executed. While we are at it this commit also adds native support for overlayfs mounts, as supported by recent kernels.	2015-05-13 14:07:26 +02:00
Iago López Galeiras	4543768d13	nspawn: change filesystem type from "bind" to NULL in mount() syscalls Try to keep syscalls as minimal as possible.	2015-03-31 15:36:53 +02:00
Michal Schmidt	a0827e2b12	core/namespace: fix path sorting The comparison function we use for qsorting paths is overly indifferent. Consider these 3 paths for sorting: /foo /bar /foo/foo qsort() may compare: "/foo" with "/bar" => 0, indifference "/bar" with "/foo/foo" => 0, indifference and assume transitively that "/foo" and "/foo/foo" are also indifferent. But this is wrong, we want "/foo" sorted before "/foo/foo". The comparison function must be transitive. Use path_compare(), which behaves properly. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1184016	2015-03-16 22:17:15 +01:00
Zbigniew Jędrzejewski-Szmek	42b1b9907d	core: explicitly ignore failure during cleanup CID #1237550.	2015-03-13 23:42:17 -04:00
Zbigniew Jędrzejewski-Szmek	3164e3cbc5	core: either ignore or handle mount failures /dev/pts/ptmx is as important as /dev/pts, so error out if that fails. Others seem less important, since the namespace is usable without them, so ignore failures. CID #123755, #123754.	2015-03-13 23:42:17 -04:00
Zbigniew Jędrzejewski-Szmek	dc75168823	Use space after a silencing (void) We were using a space more often than not, and this way is codified in CODING_STYLE.	2015-03-13 23:42:17 -04:00
Thomas Hindoe Paaboel Andersen	2eec67acbb	remove unused includes This patch removes includes that are not used. The removals were found with include-what-you-use which checks if any of the symbols from a header is in use.	2015-02-23 23:53:42 +01:00
Lennart Poettering	63c372cb9d	util: rework strappenda(), and rename it strjoina() After all it is now much more like strjoin() than strappend(). At the same time, add support for NULL sentinels, even if they are normally not necessary.	2015-02-03 02:05:59 +01:00
Topi Miettinen	e65476622d	Type of mount(2) flags is unsigned long	2015-01-01 14:39:17 -05:00
Lennart Poettering	d7b8eec7dc	tmpfiles: add new line type 'v' for creating btrfs subvolumes	2014-12-28 02:08:40 +01:00
Michal Schmidt	4a62c710b6	treewide: another round of simplifications Using the same scripts as in `f647962d64` "treewide: yet more log_*_errno + return simplifications".	2014-11-28 19:57:32 +01:00
Michal Schmidt	56f64d9576	treewide: use log_*_errno whenever %m is in the format string If the format string contains %m, clearly errno must have a meaningful value, so we might as well use log_*_errno to have ERRNO= logged. Using: find . -name '*.[ch]' \| xargs sed -r -i -e \ 's/log_(debug\|info\|notice\|warning\|error\|emergency)\((".*%m.*")/log_\1_errno(errno, \2/' Plus some whitespace, linewrap, and indent adjustments.	2014-11-28 19:49:27 +01:00
Susant Sahani	b77acbcf7d	namespace: unchecked return value from library fix: CID 1237553 (#1 of 6): Unchecked return value from library (CHECKED_RETURN) CID 1237553 (#3 of 6): Unchecked return value from library (CHECKED_RETURN) CID 1237553 (#4 of 6): Unchecked return value from library (CHECKED_RETURN) CID 1237553 (#5 of 6): Unchecked return value from library (CHECKED_RETURN) CID 1237553 (#6 of 6): Unchecked return value from library (CHECKED_RETURN)	2014-11-17 12:06:40 +01:00
Daniel Mack	63cc4c3138	sd-bus: sync with kdbus upstream (ABI break) kdbus has seen a larger update than expected lately, most notably with kdbusfs, a file system to expose the kdbus control files: * Each time a file system of this type is mounted, a new kdbus domain is created. * The layout inside each mount point is the same as before, except that domains are not hierarchically nested anymore. * Domains are therefore also unnamed now. * Unmounting a kdbusfs will automatically also destroy the associated domain. * Hence, the action of creating a kdbus domain is now as privileged as mounting a filesystem. * This way, we can get around creating dev nodes for everything, which is last but not least something that is not limited by 20-bit minor numbers. The kdbus specific bits in nspawn have all been dropped now, as nspawn can rely on the container OS to set up its own kdbus domain, simply by mounting a new instance. A new set of mounts has been added to mount things after the kernel modules have been loaded. For now, only kdbus is in this set, which is invoked with mount_setup_late().	2014-11-13 20:41:52 +01:00
Lennart Poettering	ecabcf8b6e	selinux: clean up selinux label function naming	2014-10-23 21:36:56 +02:00
WaLyong Cho	cc56fafeeb	mac: rename apis with mac_{selinux/smack}_ prefix	2014-10-23 17:13:15 +02:00
Lennart Poettering	a004cb4cb2	namespace: add missing 'const' to parameters	2014-10-17 13:49:08 +02:00
Zbigniew Jędrzejewski-Szmek	d267c5aa3d	core/namespace: remove invalid check dir cannot be NULL here, because it was allocated with alloca. CID #1237768.	2014-10-03 20:42:09 -04:00
Zbigniew Jędrzejewski-Szmek	1775f1ebc4	core/namespace: remove invalid check root cannot be NULL here, because it was allocated with alloca. CID #1237769.	2014-10-03 20:42:09 -04:00
Thomas Hindoe Paaboel Andersen	120d578e5f	namespace: avoid possible use of uninitialized variable	2014-09-08 22:09:41 +02:00
Daniel Mack	a610cc4f18	namespace: add support for custom kdbus endpoint If a path to a previously created custom kdbus endpoint is passed in, bind-mount a new devtmpfs that contains a 'bus' node, which in turn is bind-mounted with the custom endpoint. This tmpfs is then mounted over the kdbus subtree that refers to the current bus. This way, we can fake the bus node in order to lock down services with a kdbus custom endpoint policy.	2014-09-08 14:12:56 +02:00
Ansgar Burchardt	e2d7c1a075	drop_duplicates: copy full BindMount struct At least t->ignore = f->ignore; is missing here. Just copy the full struct to be sure.	2014-07-27 15:15:11 -04:00
Lennart Poettering	664064d60c	namespace: make sure /tmp, /var/tmp and /dev are writable in namespaces we set up	2014-07-03 16:28:26 +02:00
Lennart Poettering	002b226843	namespace: fix uninitialized memory access	2014-07-03 16:28:26 +02:00
Lennart Poettering	dd078a1ef8	namespace: properly label device nodes we create https://bugzilla.redhat.com/show_bug.cgi?id=1081429	2014-06-18 00:09:46 +02:00
Lennart Poettering	051be1f71c	namespace: cover /boot with ProtectSystem= again Now that we properly exclude autofs mounts from ProtectSystem= we can include it in the effect of ProtectSystem= again.	2014-06-06 14:48:51 +02:00
Lennart Poettering	d6797c920e	namespace: beef up read-only bind mount logic Instead of blindly creating another bind mount for read-only mounts, check if there's already one we can use, and if so, use it. Also, recursively mark all submounts read-only too. Also, ignore autofs mounts when remounting read-only unless they are already triggered.	2014-06-06 14:37:40 +02:00
Lennart Poettering	c8835999c3	namespace: also include /root in ProtectHome= /root can't really be autofs, and is also a home directory, so cover it with ProtectHome=.	2014-06-05 21:55:06 +02:00
Lennart Poettering	6d313367d9	namespace: when setting up an inaccessible mount point, unmount everything below This has the benefit of not triggering any autofs mount points unnecessarily.	2014-06-05 21:35:35 +02:00

276 commits