To support ProtectHome=y in a user namespace (which mounts the inaccessible
nodes), the nodes need to be accessible by the user. Create these paths and
devices in the user runtime directory so they can be used later if needed.
Don't try to show top level drop-in for non-existent units or when trying to
instantiate non-instantiated units:
$ systemctl cat nonexistent@.service
Assertion 'name' failed at src/shared/dropin.c:143, function unit_file_find_dirs(). Aborting.
$ systemctl cat systemd-journald@.service
Assertion 'name' failed at src/shared/dropin.c:143, function unit_file_find_dirs(). Aborting.
Ideally, we would want to report this over back over dbus. But that is pretty hard,
because the unitfile parsing logic doesn't provide any feedback.
systemd-analyze verify also doesn't notice the issue, because it doesn't look
at the [Install] section at all. Let's print a message in the logs at least.
If we call LOOP_CLR_FD and LOOP_CTL_REMOVE too rapidly, the kernel cannot deal
with that (5.3.13-300.fc31.x86_64 running on dual core
Intel(R) Core(TM) i7-6500U CPU @ 2.50GHz).
$ sudo strace -eioctl build/test-dissect-image /tmp/foobar3.img
ioctl(3, TCGETS, 0x7ffcee47de20) = -1 ENOTTY (Inappropriate ioctl for device)
ioctl(4, LOOP_CTL_GET_FREE) = 9
ioctl(5, LOOP_SET_FD, 3) = 0
ioctl(5, LOOP_SET_STATUS64, {lo_offset=0, lo_number=0, lo_flags=LO_FLAGS_READ_ONLY|LO_FLAGS_AUTOCLEAR|LO_FLAGS_PARTSCAN, lo_file_name="", ...}) = 0
ioctl(5, BLKGETSIZE64, [299999744]) = 0
ioctl(5, CDROM_GET_CAPABILITY, 0) = -1 EINVAL (Invalid argument)
ioctl(5, BLKSSZGET, [512]) = 0
Waiting for device (parent + 0 partitions) to appear...
Found root partition, writable of type btrfs at #-1 (/dev/block/7:9)
ioctl(5, LOOP_CLR_FD) = 0
ioctl(3, LOOP_CTL_REMOVE, 9) = -1 EBUSY (Device or resource busy)
Failed to remove loop device: Device or resource busy
This seems to be clear race condition, and attaching strace is generally enough
to "win" the race. But even with strace attached, we will fail occasionally.
Let's wait a bit and retry. With the wait, on my machine, the second attempt
always succeeds:
...
Found root partition, writable of type btrfs at #-1 (/dev/block/7:9)
ioctl(5, LOOP_CLR_FD) = 0
ioctl(3, LOOP_CTL_REMOVE, 9) = -1 EBUSY (Device or resource busy)
ioctl(3, LOOP_CTL_REMOVE, 9) = 9
+++ exited with 0 +++
Without the wait, all 64 attempts will occasionally fail.
The function no longer returns the fd. This complicated semantics, because it
wasn't clear what holds the ownership: the return value or the output
parameter. There were no users of the fd in the return value, so let's
simplify things conceptually and only return the fd once.
Reduce the scope of variables.
LOOP_CLR_FD was called on the wrong fd. Let's use a cleanup function to make
this automatic and reduce chances of a mixup in the future.
CID 1408498.
CID 1409488.
This code was added in 903659e7b2. The change
that is done here is a simple fix to avoid use of a
unitialized/wrongly-initialized variable, but the bigger issue is that nothing
looks at the returned result to distinguish between 0 and a positive return
value.
At the beginning of seccomp_memory_deny_write_execute architectures
can set individual filter_syscall, block_syscall, shmat_syscall values.
The former two are then used in the call to add_seccomp_syscall_filter
but shmat_syscall is not.
Right now all shmat_syscall values are the same, so the change is a
no-op, but if ever an architecture is added/modified this would be a
subtle source for a mistake so fix it by using shmat_syscall later.
Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
If seccomp_memory_deny_write_execute was fatally failing to load rules it
already returned a bad retval.
But if any adding filters failed it skipped the subsequent seccomp_load and
always returned an rc of 0 even if no rule was loaded at all.
Lets fix this requiring to (non fatally-failing) load at least one rule set.
Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
So far we silently convert negative return values from run() as
EXIT_FAILURE, which is how UNIX expects it. In many cases it would be
very useful for the caller to retrieve the actual error number we exit
with. Let's generically return that via sd_notify()'s ERRNO= attribute.
This means callers can set $NOTIFY_SOCKET and get the actual error
number delivered at their doorstep just like that.
The code was using timeout=0 as the default option string. This option string
was ultimately passed to generator_write_timeouts(), which only looks for
comment=systemd.device-timeout= or x-systemd.device-timeout=, i.e. the whole
call path was bogus. Let's rework this: generator_write_timeouts() now writes
any timeouts if configured by the user. create_disk() writes out it's own
timeout, but with lower priority. Since the code path that was calling
timeout=0 was not effective, the only change is that we stop overwriting the
timeout if explicitly configured by the user.
In both code paths, ignore failure to write.