Commit graph

127 commits

Author SHA1 Message Date
Lennart Poettering 6baa7db008 mac: also rename use_{smack,selinux,apparmor}() calls so that they share the new mac_{smack,selinux,apparmor}_xyz() convention 2014-10-23 17:34:30 +02:00
WaLyong Cho cc56fafeeb mac: rename apis with mac_{selinux/smack}_ prefix 2014-10-23 17:13:15 +02:00
Lukas Nykryn 7491ccf2cb environment: append unit_id to error messages regarding EnvironmentFile 2014-10-17 16:05:57 +02:00
Lennart Poettering 8fa6cbe1a9 execute: downgrade namespace error to "warning"
Also, extend the printed warning a bit, explaining the situation more
verbosely.
2014-10-17 13:54:27 +02:00
Michal Sekletar 0015ebf3fa execute: don't fail child when we don't have privileges to setup namespaces
If we don't have privileges to setup the namespaces then we are most likely
running inside some sort of unprivileged container, hence not being able to
create namespace is not a problem because spawned service can't access host
system anyway.
2014-10-17 11:51:46 +02:00
Michael Scherer 5482192e57 Report aa_change_onexec error code
Since aa_change_onexec return the error code in errno, and return
-1, the current code do not give any useful information when
something fail. This make apparmor easier to debug, as seen on
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=760526
2014-10-11 12:04:47 +02:00
Tom Gundersen e63ff941ea core: execute - don't leak strv 2014-09-30 11:34:01 +02:00
Jan Synacek 86b23b07c9 swap: introduce Discard property
Process possible "discard" values from /etc/fstab.
2014-09-29 11:08:12 -04:00
Michal Sekletar 16115b0a7b socket: introduce SELinuxContextFromNet option
This makes possible to spawn service instances triggered by socket with
MLS/MCS SELinux labels which are created based on information provided by
connected peer.

Implementation of label_get_child_mls_label derived from xinetd.

Reviewed-by: Paul Moore <pmoore@redhat.com>
2014-09-19 12:32:06 +02:00
Thomas Hindoe Paaboel Andersen 822a59607c execute: silence warnings
Mark two function parameters as const
2014-09-08 22:10:36 +02:00
Daniel Mack e44da745d1 service: hook up custom endpoint logic
If BusPolicy= was passed, the parser function will have created
an ExecContext->bus_endpoint object, along with policy information.

In that case, create a kdbus endpoint, and pass its path name to the
namespace logic, to it will be mounted over the actual 'bus' node.

At endpoint creation time, no policy is updloaded. That is done after
fork(), through a separate call. This is necessary because we don't
know the real uid of the process earlier than that.
2014-09-08 14:15:02 +02:00
Daniel Mack a610cc4f18 namespace: add support for custom kdbus endpoint
If a path to a previously created custom kdbus endpoint is passed in,
bind-mount a new devtmpfs that contains a 'bus' node, which in turn in
bind-mounted with the custom endpoint. This tmpfs then mounted over the
kdbus subtree that refers to the current bus.

This way, we can fake the bus node in order to lock down services with
a kdbus custom endpoint policy.
2014-09-08 14:12:56 +02:00
Daniel Mack bb7dd0b04a bus: add kdbus endpoint types
Add types to describe endpoints and associated policy entries,
and add a BusEndpoint instace to ExecContext.
2014-09-08 11:06:45 +02:00
Daniel Mack d35fbf6bdf exec: move code executed after fork into exec_child()
This factors out one conditional branch that has grown way too big, and
makes the code more readable by using return statements rather than jump
labels.
2014-09-05 12:19:39 +02:00
Daniel Mack 9fa95f8539 exec: factor out most function arguments of exec_spawn() to ExecParameters
This way, the list of arguments to that function gets more comprehensive,
and we can get around passing lots of NULL and 0 arguments from socket.c,
swap.c and mount.c.

It also allows for splitting up the code in exec_spawn().

While at it, make ExecContext const in execute.c.
2014-09-05 12:18:57 +02:00
Lennart Poettering 1b6d7fa742 util: make use of newly added reset_signal_mask() call wherever appropriate 2014-08-26 21:12:54 +02:00
Lennart Poettering f461c8073d execute: explain in a comment, why close_all_fds() is invoked the second time differently 2014-08-21 17:35:19 +02:00
Lennart Poettering 4c94096027 core: unify how we generate the prefix string when dumping unit state 2014-08-21 17:24:21 +02:00
Lennart Poettering 3bb07b7680 Revert "socket: introduce SELinuxLabelViaNet option"
This reverts commit cf8bd44339.

Needs more discussion on the mailing list.
2014-08-19 19:16:08 +02:00
Michal Sekletar cf8bd44339 socket: introduce SELinuxLabelViaNet option
This makes possible to spawn service instances triggered by socket with
MLS/MCS SELinux labels which are created based on information provided by
connected peer.

Implementation of label_get_child_label derived from xinetd.

Reviewed-by: Paul Moore <pmoore@redhat.com>
2014-08-19 18:57:12 +02:00
Kay Sievers 3a43da2832 time-util: add and use USEC/NSEC_INFINIY 2014-07-29 13:20:20 +02:00
Lennart Poettering 418b9be500 firstboot: add new component to query basic system settings on first boot, or when creating OS images offline
A new tool "systemd-firstboot" can be used either interactively on boot,
where it will query basic locale, timezone, hostname, root password
information and set it. Or it can be used non-interactively from the
command line when prepareing disk images for booting. When used
non-inertactively the tool can either copy settings from the host, or
take settings on the command line.

$ systemd-firstboot --root=/path/to/my/new/root --copy-locale --copy-root-password --hostname=waldi

The tool will be automatically invoked (interactively) now on first boot
if /etc is found unpopulated.

This also creates the infrastructure for generators to be notified via
an environment variable whether they are running on the first boot, or
not.
2014-07-07 15:25:55 +02:00
Lennart Poettering 717603e391 machinectl: show /etc/os-release information of container in status output 2014-07-03 17:54:24 +02:00
Ronny Chevalier e1d758033d use more _cleanup_ macro 2014-06-24 19:09:57 +02:00
Lennart Poettering 1b8689f949 core: rename ReadOnlySystem= to ProtectSystem= and add a third value for also mounting /etc read-only
Also, rename ProtectedHome= to ProtectHome=, to simplify things a bit.

With this in place we now have two neat options ProtectSystem= and
ProtectHome= for protecting the OS itself (and optionally its
configuration), and for protecting the user's data.
2014-06-04 18:12:55 +02:00
Lennart Poettering 417116f234 core: add new ReadOnlySystem= and ProtectedHome= settings for service units
ReadOnlySystem= uses fs namespaces to mount /usr and /boot read-only for
a service.

ProtectedHome= uses fs namespaces to mount /home and /run/user
inaccessible or read-only for a service.

This patch also enables these settings for all our long-running services.

Together they should be good building block for a minimal service
sandbox, removing the ability for services to modify the operating
system or access the user's private data.
2014-06-03 23:57:51 +02:00
Zbigniew Jędrzejewski-Szmek de0671ee7f Remove unnecessary casts in printfs
No functional change expected :)
2014-05-15 15:29:58 +02:00
Lennart Poettering 7f8aa67131 core: remove tcpwrap support
tcpwrap is legacy code, that is barely maintained upstream. It's APIs
are awful, and the feature set it exposes (such as DNS and IDENT
access control) questionnable. We should not support this natively in
systemd.

Hence, let's remove the code. If people want to continue making use of
this, they can do so by plugging in "tcpd" for the processes they start.
With that scheme things are as well or badly supported as they were from
traditional inetd, hence no functionality is really lost.
2014-03-24 20:07:42 +01:00
Lennart Poettering 3d94f76c99 util: replace close_pipe() with new safe_close_pair()
safe_close_pair() is more like safe_close(), except that it handles
pairs of fds, and doesn't make and misleading allusion, as it works
similarly well for socketpairs() as for pipe()s...
2014-03-24 03:22:44 +01:00
Lennart Poettering 03e334a1c7 util: replace close_nointr_nofail() by a more useful safe_close()
safe_close() automatically becomes a NOP when a negative fd is passed,
and returns -1 unconditionally. This makes it easy to write lines like
this:

        fd = safe_close(fd);

Which will close an fd if it is open, and reset the fd variable
correctly.

By making use of this new scheme we can drop a > 200 lines of code that
was required to test for non-negative fds or to reset the closed fd
variable afterwards.
2014-03-18 19:31:34 +01:00
Lennart Poettering 517d56b1d0 missing: if RLIMIT_RTTIME is not defined by the libc, then we need a new define for the max number of rlimits, too 2014-03-05 02:31:09 +01:00
Lennart Poettering e66cf1a3f9 core: introduce new RuntimeDirectory= and RuntimeDirectoryMode= unit settings
As discussed on the ML these are useful to manage runtime directories
below /run for services.
2014-03-03 17:55:32 +01:00
Lennart Poettering 98b47d54ce execute: free directory path if we fail to remove it because we cannot allocate a thread 2014-03-03 17:55:32 +01:00
Lennart Poettering f513e420c8 exec: imply NoNewPriviliges= only when seccomp filters are used in user mode 2014-02-26 02:28:52 +01:00
Lennart Poettering 4298d0b512 core: add new RestrictAddressFamilies= switch
This new unit settings allows restricting which address families are
available to processes. This is an effective way to minimize the attack
surface of services, by turning off entire network stacks for them.

This is based on seccomp, and does not work on x86-32, since seccomp
cannot filter socketcall() syscalls on that platform.
2014-02-26 02:19:28 +01:00
Lennart Poettering 7c66bae2ff seccomp: we should control NO_NEW_PRIVS on our own, not let seccomp do this for us 2014-02-26 02:19:28 +01:00
Michael Scherer eef65bf3ee core: Add AppArmor profile switching
This permit to switch to a specific apparmor profile when starting a daemon. This
will result in a non operation if apparmor is disabled.
It also add a new build requirement on libapparmor for using this feature.
2014-02-21 03:44:20 +01:00
Lennart Poettering 1756a0118e execute: modernizations 2014-02-19 17:53:50 +01:00
Lennart Poettering ac45f971a1 core: add Personality= option for units to set the personality for spawned processes 2014-02-19 03:27:03 +01:00
Lennart Poettering e9642be2cc seccomp: add helper call to add all secondary archs to a seccomp filter
And make use of it where appropriate for executing services and for
nspawn.
2014-02-18 22:14:00 +01:00
Lennart Poettering 5f8640fb62 core: store and expose SELinuxContext field normalized as bool + string 2014-02-17 16:52:52 +01:00
Lennart Poettering 57183d117a core: add SystemCallArchitectures= unit setting to allow disabling of non-native
architecture support for system calls

Also, turn system call filter bus properties into complex types instead
of concatenated strings.
2014-02-13 00:24:00 +01:00
Lennart Poettering 351a19b17d core: fix build without libseccomp 2014-02-12 18:44:40 +01:00
Lennart Poettering 17df7223be core: rework syscall filter
- Allow configuration of an errno error to return from blacklisted
  syscalls, instead of immediately terminating a process.

- Fix parsing logic when libseccomp support is turned off

- Only keep the actual syscall set in the ExecContext, and generate the
  string version only on demand.
2014-02-12 18:30:36 +01:00
Ronny Chevalier c0467cf387 syscallfilter: port to libseccomp 2014-02-12 18:30:36 +01:00
Lennart Poettering 82adf6af7c nspawn,man: use a common vocabulary when referring to selinux security contexts
Let's always call the security labels the same way:

  SMACK: "Smack Label"
  SELINUX: "SELinux Security Context"

And the low-level encapsulation is called "seclabel". Now let's hope we
stick to this vocabulary in future, too, and don't mix "label"s and
"security contexts" and so on wildly.
2014-02-10 13:18:16 +01:00
Michael Scherer 0d3f7bb3a5 exec: Add support for ignoring errors on SELinuxContext by prefixing it with -, like for others settings.
Also remove call to security_check_context, as this doesn't serve anything, since
setexeccon will fail anyway.
2014-02-10 13:18:16 +01:00
Michael Scherer 5c56a259e0 exec: Ignore the setting SELinuxContext if selinux is not enabled 2014-02-10 13:18:16 +01:00
Michael Scherer 7b52a628f8 exec: Add SELinuxContext configuration item
This permit to let system administrators decide of the domain of a service.
This can be used with templated units to have each service in a différent
domain ( for example, a per customer database, using MLS or anything ),
or can be used to force a non selinux enabled system (jvm, erlang, etc)
to start in a different domain for each service.
2014-02-10 13:18:16 +01:00
Lennart Poettering 7f112f50fe exec: introduce PrivateDevices= switch to provide services with a private /dev
Similar to PrivateNetwork=, PrivateTmp= introduce PrivateDevices= that
sets up a private /dev with only the API pseudo-devices like /dev/null,
/dev/zero, /dev/random, but not any physical devices in them.
2014-01-20 21:28:37 +01:00