Commit graph

387 commits

Author SHA1 Message Date
Lennart Poettering 21022b9dde util-lib: wrap personality() to fix up broken glibc error handling (#6766)
glibc appears to propagate different errors in different ways, let's fix
this up, so that our own code doesn't get confused by this.

See #6752 + #6737 for details.

Fixes: #6755
2017-09-08 17:16:29 +03:00
Lennart Poettering aac8c0c382 execute: minor ExecOutput handling beautification (#6711)
Let's clean up the checking for the various ExecOutput values a bit,
let's use IN_SET everywhere, and the same concepts for all three bools
we pass to dprintf().
2017-09-01 09:04:27 +09:00
Lennart Poettering e8132d63fe seccomp: default to something resembling the current personality when locking it
Let's lock the personality to the currently set one, if nothing is
specifically specified. But do so with a grain of salt, and never
default to any exotic personality here, but only PER_LINUX or
PER_LINUX32.
2017-08-29 15:56:57 +02:00
Topi Miettinen 78e864e5b3 seccomp: LockPersonality boolean (#6193)
Add LockPersonality boolean to allow locking down personality(2)
system call so that the execution domain can't be changed.
This may be useful to improve security because odd emulations
may be poorly tested and source of vulnerabilities, while
system services shouldn't need any weird personalities.
2017-08-29 15:54:50 +02:00
Yu Watanabe 5ce96b141a Merge pull request #6582 from poettering/logind-tty
various tty path parsing fixes
2017-08-26 22:12:48 +09:00
Lennart Poettering 165a31c0db core: add two new special ExecStart= character prefixes
This patch adds two new special character prefixes to ExecStart= and
friends, in addition to the existing "-", "@" and "+":

"!"  → much like "+", except with a much reduced effect as it only
       disables the actual setresuid()/setresgid()/setgroups() calls, but
       leaves all other security features on, including namespace
       options. This is very useful in combination with
       RuntimeDirectory= or DynamicUser= and similar option, as a user
       is still allocated and used for the runtime directory, but the
       actual UID/GID dropping is left to the daemon process itself.
       This should make RuntimeDirectory= a lot more useful for daemons
       which insist on doing their own privilege dropping.

"!!" → Similar to "!", but on systems supporting ambient caps this
       becomes a NOP. This makes it relatively straightforward to write
       unit files that make use of ambient capabilities to let systemd
       drop all privs while retaining compatibility with systems that
       lack ambient caps, where priv dropping is the left to the daemon
       codes themselves.

This is an alternative approach to #6564 and related PRs.
2017-08-10 15:04:32 +02:00
Lennart Poettering 43b1f7092d execute: needs_{selinux,apparmor,smack} → use_{selinux,apparmor,smack}
These booleans simply store whether selinux/apparmor/smack are supposed
ot be used, and chache the various mac_xyz_use() calls before we
transition into the namespace, hence let's use the same verb for the
variables and the functions: "use"
2017-08-10 15:02:50 +02:00
Lennart Poettering 9f6444eb92 execute: make use of IN_SET() where we can 2017-08-10 15:02:50 +02:00
Lennart Poettering 937ccce94c execute: simplify needs_sandboxing checking
Let's merge three if blocks that shall only run when sandboxing is applied
into one.

Note that this changes behaviour in one corner case: PrivateUsers=1 is
now honours both PermissionsStartOnly= and the "+" modifier in
ExecStart=, and not just the former, as before. This was an oversight,
so let's fix this now, at a point in time the option isn't used much
yet.
2017-08-10 15:02:50 +02:00
Lennart Poettering 1703fa41a7 core: rename EXEC_APPLY_PERMISSIONS → EXEC_APPLY_SANDBOXING
"Permissions" was a bit of a misnomer, as it suggests that UNIX file
permission bits are adjusted, which aren't really changed here. Instead,
this is about UNIX credentials such as users or groups, as well as
namespacing, hence let's use a more generic term here, without any
misleading reference to UNIX file permissions: "sandboxing", which shall
refer to all kinds of sandboxing technologies, including UID/GID
dropping, selinux relabelling, namespacing, seccomp, and so on.
2017-08-10 15:02:50 +02:00
Lennart Poettering 584b8688d1 execute: also fold the cgroup delegate bit into ExecFlags 2017-08-10 15:02:50 +02:00
Lennart Poettering ac6479781e execute: also control the SYSTEMD_NSS_BYPASS_BUS through an ExecFlags field
Also, correct the logic while we are at it: the variable is only
required for system services, not user services.
2017-08-10 15:02:49 +02:00
Lennart Poettering c71b2eb77e core: don't chown() the configuration directory
The configuration directory is commonly not owned by a service, but
remains root-owned, hence don't change the owner automatically for it.
2017-08-10 15:02:49 +02:00
Lennart Poettering 8679efde21 execute: add one more ExecFlags flag, for controlling unconditional directory chowning
Let's decouple the Manager object from the execution logic a bit more
here too, and simply pass along the fact whether we should
unconditionally chown the runtime/... directories via the ExecFlags
field too.
2017-08-10 14:44:58 +02:00
Lennart Poettering af635cf377 execute: let's decouple execute.c a bit from the unit logic
Let's try to decouple the execution engine a bit from the Unit/Manager
concept, and hence pass one more flag as part of the ExecParameters flags
field.
2017-08-10 14:44:58 +02:00
Lennart Poettering 3ed0cd26ea execute: replace command flag bools by a flags field
This way, we can extend it later on in an easier way, and can pass it
along nicely.
2017-08-10 14:44:58 +02:00
Lennart Poettering a119ec7c82 util-lib: add a new skip_dev_prefix() helper
This new helper removes a leading /dev if there is one. We have code
doing this all over the place, let's unify this, and correct it while
we are at it, by using path_startswith() rather than startswith() to
drop the prefix.
2017-08-09 19:01:18 +02:00
Yu Watanabe 07d46372fe securebits-util: add secure_bits_{from_string,to_string_alloc}() 2017-08-07 23:40:25 +09:00
Yu Watanabe dd1f5bd0aa cap-list: add capability_set_{from_string,to_string_alloc}() 2017-08-07 23:25:11 +09:00
Yu Watanabe 837df14040 core: do not ignore returned values 2017-08-06 23:34:55 +09:00
Yu Watanabe ecfbc84f1c core: define variables only when they are required
Follow-up for 7f18ef0a55.
2017-08-06 13:08:34 +09:00
Fabio Kung 7f18ef0a55 core: check which MACs to use before a new mount ns is created (#6498)
/sys is not guaranteed to exist when a new mount namespace is created.
It is only mounted under conditions specified by
`namespace_info_mount_apivfs`.

Checking if the three available MAC LSMs are enabled requires a sysfs
mounted at /sys, so the checks are moved to before a new mount ns is
created.
2017-08-01 09:15:18 +02:00
Lennart Poettering c867611e0a execute: don't pass unit ID in --user mode to journald for stream logging
When we create a log stream connection to journald, we pass along the
unit ID. With this change we do this only when we run as system
instance, not as user instance, to remove the ambiguity whether a user
or system unit is specified. The effect of this change is minor:
journald ignores the field anyway from clients with UID != 0. This patch
hence only fixes the unit attribution for the --user instance of the
root user.
2017-07-31 18:01:42 +02:00
Lennart Poettering 92a17af991 execute: make some code shorter
Let's simplify some lines to make it shorter.
2017-07-31 18:01:42 +02:00
Lennart Poettering cad93f2996 core, sd-bus, logind: make use of uid_is_valid() in more places 2017-07-31 18:01:42 +02:00
Lennart Poettering df0ff12775 tree-wide: make use of getpid_cached() wherever we can
This moves pretty much all uses of getpid() over to getpid_raw(). I
didn't specifically check whether the optimization is worth it for each
replacement, but in order to keep things simple and systematic I
switched over everything at once.
2017-07-20 20:27:24 +02:00
Yu Watanabe 3536f49e8f core: add {State,Cache,Log,Configuration}Directory= (#6384)
This introduces {State,Cache,Log,Configuration}Directory= those are
similar to RuntimeDirectory=. They create the directories under
/var/lib, /var/cache/, /var/log, or /etc, respectively, with the mode
specified in {State,Cache,Log,Configuration}DirectoryMode=.

This also fixes #6391.
2017-07-18 14:34:52 +02:00
Lennart Poettering 688230d3a7 Merge pull request #6354 from walyong/smack_process_label_free
core: modify resource leak and missed security context dump
2017-07-17 10:04:12 +02:00
Yu Watanabe 23a7448efa core: support subdirectories in RuntimeDirectory= option 2017-07-17 16:30:53 +09:00
Yu Watanabe 53f47dfc7b core: allow preserving contents of RuntimeDirectory= over process restart
This introduces RuntimeDirectoryPreserve= option which takes a boolean
argument or 'restart'.

Closes #6087.
2017-07-17 16:22:25 +09:00
WaLyong Cho 80c21aea11 core: dump also missed security context 2017-07-13 13:12:24 +09:00
WaLyong Cho 5b8e1b7755 core: modify resource leak by SmackProcessLabel= 2017-07-13 13:12:15 +09:00
Lennart Poettering 782c925f7f Revert "core: link user keyring to session keyring (#6275)" (#6342)
This reverts commit 437a85112e.

The outcome of this isn't that clear, let's revert this for now, see
discussion on #6286.
2017-07-12 10:00:43 -04:00
Christian Hesse 437a85112e core: link user keyring to session keyring (#6275)
Commit  74dd6b515f (core: run each system
service with a fresh session keyring) broke adding keys to user keyring.
Added keys could not be accessed with error message:

keyctl_read_alloc: Permission denied

So link the user keyring to our session keyring.
2017-07-04 09:38:31 +02:00
Lennart Poettering 7f452159b8 core: make IOSchedulingClass= and IOSchedulingPriority= settable for transient units
This patch is a bit more complex thant I hoped. In particular the single
IOScheduling= property exposed on the bus is split up into
IOSchedulingClass= and IOSchedulingPriority= (though compat is
retained). Otherwise the asymmetry between setting props and getting
them is a bit too nasty.

Fixes #5613
2017-06-26 17:43:18 +02:00
Zbigniew Jędrzejewski-Szmek 7e867138f5 Merge pull request #5600 from fbuihuu/make-logind-restartable
Make logind restartable.
2017-06-24 18:58:36 -04:00
Franck Bui 4c47affcf1 core: remove the redundancy of 'n_fds' and 'n_storage_fds' in ExecParameters struct
'n_fds' field in the ExecParameters structure was counting the total number of
file descriptors to be passed to a unit.

This counter also includes the number of passed socket fds which is counted by
'n_socket_fds' already.

This patch removes that redundancy by replacing 'n_fds' with
'n_storage_fds'. The new field only counts the fds passed via the storage store
mechanism.  That way each fd is counted at one place only.

Subsequently the patch makes sure to fix code that used 'n_fds' and also wanted
to iterate through all of them by explicitly adding 'n_socket_fds' + 'n_storage_fds'.

Suggested by Lennart.
2017-06-08 16:21:35 +02:00
Franck Bui 9b1419111a core: only apply NonBlocking= to fds passed via socket activation
Make sure to only apply the O_NONBLOCK flag to the fds passed via socket
activation.

Previously the flag was also applied to the fds which came from the fd store
but this was incorrect since services, after being restarted, expect that these
passed fds have their flags unchanged and can be reused as before.

The documentation was a bit unclear about this so clarify it.
2017-06-06 22:42:50 +02:00
Zbigniew Jędrzejewski-Szmek 52511fae7b core: fix warning about unsigned variable (#5935)
Fixup for d8c92e8bc7.
2017-05-11 08:15:28 +02:00
Lennart Poettering 4e168f4606 Merge pull request #5420 from OpenDZ/tixxdz/namespace-fixes-v2
Namespace: RootImage= RootDirectory= and MountAPIVFS fixes
2017-05-09 20:42:32 +02:00
Aggelos Avgerinos 488ab41cb8 execute: Properly log errors considering socket fds (#5910)
Till now if the params->n_fds was 0, systemd was logging that there were
more than one sockets.

Thanks @gregoryp and @VFXcode who did the most work debugging this.
2017-05-08 19:09:22 -04:00
Zbigniew Jędrzejewski-Szmek d8c92e8bc7 execute: filter out "." for ".." in EnvironmentFile= globs too
This doesn't really matter much, only in case somebody would use
something strange like

  EnvironmentFile=/etc/something/.*

Make sure that "." and ".." is not returned by that glob. This makes
all our globbing patterns behave the same.
2017-04-27 13:21:08 -04:00
Djalal Harouni 74e941c022 Merge pull request #5774 from keszybz/printf-annotations
Printf annotation improvements
2017-04-23 01:03:42 +02:00
Zbigniew Jędrzejewski-Szmek ba360bb05c tree-wide: mark log_struct with _printf_ and fix fallout
log_struct takes multiple format strings, each one followed by arguments.
The _printf_ annotation is not sufficiently flexible to express this,
but we can still annotate the first format string, though not its
arguments (because their number is unknown).

With the annotation, the places which specified the message id or similar
as the first pattern cause a warning from -Wformat-nonliteral. This can
be trivially fixed by putting the MESSAGE= first.

This change will help find issues where a non-literal is erroneously used
as the pattern.
2017-04-21 13:37:04 -04:00
Yu Watanabe 4d8b0f0f7a core: downgrade error message if command is prefixed with - and the command is not found
Fixes #5621
2017-04-03 15:38:37 +09:00
Djalal Harouni 9c988f934b namespace: Apply MountAPIVFS= only when a Root directory is set
The MountAPIVFS= documentation says that this options has no effect
unless used in conjunction with RootDirectory= or RootImage= ,lets fix
this and avoid to create private mount namespaces where it is not
needed.
2017-03-05 21:39:43 +01:00
Zbigniew Jędrzejewski-Szmek 643f4706b0 core/execute: add (void)
CID #778045.
2017-02-20 16:02:18 -05:00
Zbigniew Jędrzejewski-Szmek 2b0445262a tree-wide: add SD_ID128_MAKE_STR, remove LOG_MESSAGE_ID
Embedding sd_id128_t's in constant strings was rather cumbersome. We had
SD_ID128_CONST_STR which returned a const char[], but it had two problems:
- it wasn't possible to statically concatanate this array with a normal string
- gcc wasn't really able to optimize this, and generated code to perform the
  "conversion" at runtime.
Because of this, even our own code in coredumpctl wasn't using
SD_ID128_CONST_STR.

Add a new macro to generate a constant string: SD_ID128_MAKE_STR.
It is not as elegant as SD_ID128_CONST_STR, because it requires a repetition
of the numbers, but in practice it is more convenient to use, and allows gcc
to generate smarter code:

$ size .libs/systemd{,-logind,-journald}{.old,}
   text	   data	    bss	    dec	    hex	filename
1265204	 149564	   4808	1419576	 15a938	.libs/systemd.old
1260268	 149564	   4808	1414640	 1595f0	.libs/systemd
 246805	  13852	    209	 260866	  3fb02	.libs/systemd-logind.old
 240973	  13852	    209	 255034	  3e43a	.libs/systemd-logind
 146839	   4984	     34	 151857	  25131	.libs/systemd-journald.old
 146391	   4984	     34	 151409	  24f71	.libs/systemd-journald

It is also much easier to check if a certain binary uses a certain MESSAGE_ID:

$ strings .libs/systemd.old|grep MESSAGE_ID
MESSAGE_ID=%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x
MESSAGE_ID=%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x
MESSAGE_ID=%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x
MESSAGE_ID=%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x

$ strings .libs/systemd|grep MESSAGE_ID
MESSAGE_ID=c7a787079b354eaaa9e77b371893cd27
MESSAGE_ID=b07a249cd024414a82dd00cd181378ff
MESSAGE_ID=641257651c1b4ec9a8624d7a40a9e1e7
MESSAGE_ID=de5b426a63be47a7b6ac3eaac82e2f6f
MESSAGE_ID=d34d037fff1847e6ae669a370e694725
MESSAGE_ID=7d4958e842da4a758f6c1cdc7b36dcc5
MESSAGE_ID=1dee0369c7fc4736b7099b38ecb46ee7
MESSAGE_ID=39f53479d3a045ac8e11786248231fbf
MESSAGE_ID=be02cf6855d2428ba40df7e9d022f03d
MESSAGE_ID=7b05ebc668384222baa8881179cfda54
MESSAGE_ID=9d1aaa27d60140bd96365438aad20286
2017-02-15 00:45:12 -05:00
Lennart Poettering 6818c54ca6 core: skip ReadOnlyPaths= and other permission-related mounts on PermissionsStartOnly= (#5309)
ReadOnlyPaths=, ProtectHome=, InaccessiblePaths= and ProtectSystem= are
about restricting access and little more, hence they should be disabled
if PermissionsStartOnly= is used or ExecStart= lines are prefixed with a
"+". Do that.

(Note that we will still create namespaces and stuff, since that's about
a lot more than just permissions. We'll simply disable the effect of
the four options mentioned above, but nothing else mount related.)

This also adds a test for this, to ensure this works as intended.

No documentation updates, as the documentation are already vague enough
to support the new behaviour ("If true, the permission-related execution
options…"). We could clarify this further, but I think we might want to
extend the switches' behaviour a bit more in future, hence leave it at
this for now.

Fixes: #5308
2017-02-12 00:44:46 -05:00
Lennart Poettering 376fecf670 execute: set the right exit status for CHDIR vs. CHROOT
Fixes: #5125
2017-02-09 13:18:35 +01:00