namespace: be more careful when handling namespacing failures gracefully

This makes two changes to the namespacing code:

1. We'll only gracefully skip service namespacing on access failure if
   exclusively sandboxing options were selected, and not mount-related
   options that result in a very different view of the world. For example,
   ignoring RootDirectory=, RootImage= or Bind= is really problematic,
   but ReadOnlyPaths= is just a weaker sandbox.

2. The namespacing code will now return a clearly recognizable error
   code when it cannot enforce its namespacing, so that we cannot
   confuse EPERM errors from mount() with those from unshare(). Only the
   errors from the first unshare() are now taken as a hint to gracefully
   disable namespacing.
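
The first rule above can be read as a small predicate on the caller side. The following is only a sketch under stated assumptions: `struct sandbox_request` and `may_skip_namespacing()` are hypothetical names invented for illustration, and the fields merely mirror the conditions checked in the diff. It is not the actual systemd data structure.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the settings the caller inspects. */
struct sandbox_request {
        size_t n_bind_mounts;            /* number of requested bind mounts */
        size_t n_temporary_filesystems;  /* number of requested temporary file systems */
        const char *root_dir;            /* RootDirectory=, or NULL if unset */
        const char *root_image;          /* RootImage=, or NULL if unset */
        bool dynamic_user;               /* whether a dynamic user was requested */
};

/* Returns true only if the failure is the recognizable "cannot namespace"
 * error AND nothing was requested that changes the view of the file system.
 * Any other error, or any such option, means the failure must be reported. */
static bool may_skip_namespacing(int r, const struct sandbox_request *req) {
        if (r != -ENOANO)
                return false;

        return req->n_bind_mounts == 0 &&
               req->n_temporary_filesystems == 0 &&
               !req->root_dir &&
               !req->root_image &&
               !req->dynamic_user;
}
```

With this shape, a plain -EPERM (for example from a later mount() call) is never treated as a reason to continue, and neither is -ENOANO when a root directory or bind mount was configured.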

Fixes: #9844 #9835
Lennart Poettering 2018-08-10 15:07:14 +02:00 committed by Zbigniew Jędrzejewski-Szmek
parent b257b19e6b
commit 1beab8b0d0
2 changed files with 18 additions and 4 deletions


@@ -2386,10 +2386,15 @@ static int apply_mount_namespace(
         bind_mount_free_many(bind_mounts, n_bind_mounts);
-        /* If we couldn't set up the namespace this is probably due to a
-         * missing capability. In this case, silently proceeed. */
-        if (IN_SET(r, -EPERM, -EACCES)) {
-                log_unit_debug_errno(u, r, "Failed to set up namespace, assuming containerized execution, ignoring: %m");
+        /* If we couldn't set up the namespace this is probably due to a missing capability. setup_namespace() reports
+         * that with a special, recognizable error ENOANO. In this case, silently proceeed, but only if exclusively
+         * sandboxing options were used, i.e. nothing such as RootDirectory= or BindMount= that would result in a
+         * completely different execution environment. */
+        if (r == -ENOANO &&
+            n_bind_mounts == 0 && context->n_temporary_filesystems == 0 &&
+            !root_dir && !root_image &&
+            !context->dynamic_user) {
+                log_unit_debug(u, "Failed to set up namespace, assuming containerized execution and ignoring.");
                 return 0;
         }


@@ -1290,8 +1290,17 @@ int setup_namespace(
                 normalize_mounts(root, mounts, &n_mounts);
         }
+        /* All above is just preparation, figuring out what to do. Let's now actually start doing something. */
         if (unshare(CLONE_NEWNS) < 0) {
+                r = log_debug_errno(errno, "Failed to unshare the mount namespace: %m");
+                if (IN_SET(r, -EACCES, -EPERM, -EOPNOTSUPP, -ENOSYS))
+                        /* If the kernel doesn't support namespaces, or when there's a MAC or seccomp filter in place
+                         * that doesn't allow us to create namespaces (or a missing cap), then propagate a recognizable
+                         * error back, which the caller can use to detect this case (and only this) and optionally
+                         * continue without namespacing applied. */
+                        r = -ENOANO;
                 goto finish;
         }
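
The error translation in this hunk can be isolated into a standalone function for illustration. This is a minimal sketch, not the code from the commit: `classify_unshare_errno()` is a hypothetical helper name, and it assumes a Linux `<errno.h>` that defines ENOANO.

```c
#include <errno.h>

/* Hypothetical helper: map the errno of the first unshare() call to the
 * value the namespacing code would return. Errors that suggest namespaces
 * are unavailable or forbidden collapse into the single sentinel -ENOANO;
 * every other error is passed through as a plain negative errno. */
static int classify_unshare_errno(int err) {
        switch (err) {
        case EACCES:
        case EPERM:
        case EOPNOTSUPP:
        case ENOSYS:
                return -ENOANO;
        default:
                return -err;
        }
}
```

Because only this one call site produces the sentinel, a later -EPERM from mount() keeps its original value and cannot be mistaken for "namespacing unavailable".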