12 Commits

Author SHA1 Message Date
Zbigniew Jędrzejewski-Szmek 349cc4a507 build-sys: use #if Y instead of #ifdef Y everywhere
The advantage is that if the name is misspelt, cpp will warn us.

$ git grep -Ee "conf.set\('(HAVE|ENABLE)_" -l|xargs sed -r -i "s/conf.set\('(HAVE|ENABLE)_/conf.set10('\1_/"
$ git grep -Ee '#ifn?def (HAVE|ENABLE)' -l|xargs sed -r -i 's/#ifdef (HAVE|ENABLE)/#if \1/; s/#ifndef (HAVE|ENABLE)/#if ! \1/;'
$ git grep -Ee 'if.*defined\(HAVE' -l|xargs sed -i -r 's/defined\((HAVE_[A-Z0-9_]*)\)/\1/g'
$ git grep -Ee 'if.*defined\(ENABLE' -l|xargs sed -i -r 's/defined\((ENABLE_[A-Z0-9_]*)\)/\1/g'
+ manual changes to meson.build

squash! build-sys: use #if Y instead of #ifdef Y everywhere

v2:
- fix incorrect setting of HAVE_LIBIDN2
2017-10-04 12:09:29 +02:00
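To illustrate the difference this commit relies on, here is a minimal sketch. It assumes the build system defines every feature macro to either 0 or 1, as meson's `conf.set10()` does. `HAVE_EXAMPLE_FEATURE`, `ENABLE_EXAMPLE_OPTION`, and the function names are hypothetical and do not come from the systemd tree.

```c
#include <assert.h>

/* Hypothetical feature macros, always defined to 0 or 1
 * (the convention conf.set10() produces). */
#define HAVE_EXAMPLE_FEATURE 1
#define ENABLE_EXAMPLE_OPTION 0

/* "#if X" tests the value of the macro. */
int example_feature_with_if(void) {
#if HAVE_EXAMPLE_FEATURE
        return 1;
#else
        return 0;
#endif
}

/* A macro defined to 0 is correctly treated as disabled by "#if". */
int example_option_with_if(void) {
#if ENABLE_EXAMPLE_OPTION
        return 1;
#else
        return 0;
#endif
}

/* "#ifdef X" only tests whether the macro exists, so a macro defined
 * to 0 still counts as enabled here. This is why the sources and the
 * build system have to be converted together. */
int example_option_with_ifdef(void) {
#ifdef ENABLE_EXAMPLE_OPTION
        return 1;
#else
        return 0;
#endif
}
```

A misspelt name in `#ifdef` is silently treated as "not defined". In `#if` it evaluates to 0, but the preprocessor can report it when a warning such as GCC's `-Wundef` is enabled.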
Djalal Harouni 09d3020b0a seccomp: remove '@credentials' syscall set (#6958)
This removes the '@credentials' syscall set that was added in commit
v234-468-gcd0ddf6f75.

Most of these syscalls are so simple that we do not want to filter them.
They work on the current calling process, perform only read operations,
and do not have a deep kernel path.

The only problematic one may be the 'capget' syscall, since it can query
arbitrary processes and so can be used to discover processes. However,
sending signal 0 to arbitrary processes can also be used to discover
whether a process exists. It is unfortunate that Linux allows querying
processes of different users. Let's put it in the '@process' syscall set for now, and later we may add
it to a new '@basic-process' set that allows most basic process
operations.
2017-10-03 07:20:05 +02:00
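The signal 0 technique mentioned in this message can be sketched as follows. This is an illustration of the POSIX `kill()` behaviour only, not code from the commit. The helper name `example_process_exists` is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L

#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 1 if a process with the given PID exists, 0 if it does not,
 * and -1 for invalid input or an unexpected error. */
int example_process_exists(pid_t pid) {
        /* PIDs <= 0 address process groups in kill(), so reject them. */
        if (pid <= 0)
                return -1;

        /* Signal 0 performs the existence and permission checks
         * without delivering any signal. */
        if (kill(pid, 0) == 0)
                return 1;

        /* EPERM: the process exists, but we may not signal it. */
        if (errno == EPERM)
                return 1;

        /* ESRCH: no such process. */
        if (errno == ESRCH)
                return 0;

        return -1;
}
```

For example, `example_process_exists(getpid())` returns 1, since a process can always signal itself.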
Lennart Poettering 96bedbe2e5 nspawn: replace syscall blacklist by a whitelist
Let's lock things down a bit, and maintain a list of what's permitted
rather than a list of what's prohibited in nspawn (also to make things a
bit more like Docker and friends).

Note that this slightly alters the effect of --system-call-filter=, as
the negative list now takes precedence over the positive list.
However, given that the option is just a few days old and not included
in any released version it should be fine to change it at this point in
time.

Note that the whitelist is a good chunk more restrictive than the
previous blacklist. Specifically:

- fanotify is not permitted (given the buffer size issues it's
  problematic in containers)
- nfsservctl is not permitted (NFS server support is not virtualized)
- pkey_xyz stuff is not permitted (really new stuff I don't grok)
- @cpu-emulation is prohibited (untested legacy stuff mostly, and if
  people really want to run dosemu in nspawn, they should use
  --system-call-filter=@cpu-emulation and all should be good)
2017-09-14 15:45:21 +02:00
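The precedence rule described in this message (an explicit deny wins over any allow) can be sketched roughly as below. This is not the nspawn implementation: the function names, the example lists, and the use of plain string arrays instead of real seccomp filters are all assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical example lists, each terminated by NULL. */
const char *const example_default_allow[] = { "read", "write", "openat", NULL };
const char *const example_extra_allow[]   = { "vm86", "keyctl", NULL };
const char *const example_extra_deny[]    = { "keyctl", "write", NULL };

/* Returns true if name appears in the NULL-terminated list. */
static bool example_list_contains(const char *const *list, const char *name) {
        for (; list && *list; list++)
                if (strcmp(*list, name) == 0)
                        return true;
        return false;
}

/* Whitelist semantics: a syscall is permitted only if it is listed,
 * and the negative list takes precedence over both positive lists. */
bool example_syscall_permitted(const char *name,
                               const char *const *default_allow,
                               const char *const *extra_allow,
                               const char *const *extra_deny) {
        if (example_list_contains(extra_deny, name))
                return false;

        return example_list_contains(default_allow, name) ||
               example_list_contains(extra_allow, name);
}
```

With the example lists above, `keyctl` is refused even though it is also listed as an extra allowed call, and anything that appears on no list (such as `fanotify_init`) is refused by default.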
Lennart Poettering 960e4569e1 nspawn: implement configurable syscall whitelisting/blacklisting
Now that we have ported nspawn's seccomp code to the generic code in
seccomp-util, let's extend it to support whitelisting and blacklisting
of specific additional syscalls.

This uses similar syntax as PID1's support for system call filtering,
but in contrast to that always implements a blacklist (and not a
whitelist), as we prepopulate the filter with a blacklist, and the
unit's system call filter logic does not come with anything
prepopulated.

(Later on we might actually want to invert the logic here, and
whitelist rather than blacklist things, but at this point let's not do
that. In case we switch this over later, the syscall add/remove logic of
this commit should be compatible conceptually.)

Fixes: #5163

Replaces: #5944
2017-09-12 14:06:21 +02:00
Lennart Poettering 7609340e2f nspawn: replace homegrown seccomp filter table largely with references to the existing syscall groups
Let's shorten the table, now that we are hooked up to the syscall group
system.
2017-09-11 18:00:07 +02:00
Lennart Poettering 402530d91e nspawn: port over seccomp code to use seccomp_add_syscall_filter_item()
Let's unify a bit of the code here.
2017-09-11 18:00:07 +02:00
Lennart Poettering 469830d142 seccomp: rework seccomp code, to improve compat with some archs
This substantially reworks the seccomp code, to ensure better
compatibility with some architectures, including i386.

So far we relied on libseccomp's internal handling of the multiple
syscall ABIs supported on Linux. This is problematic however, as it does
not define clear semantics if an ABI is not able to support specific
seccomp rules we install.

This rework hence changes a couple of things:

- We no longer use seccomp_rule_add(), but only
  seccomp_rule_add_exact(), and fail the installation of a filter if the
  architecture doesn't support it.

- We no longer rely on adding multiple syscall architectures to a single filter,
  but instead install a separate filter for each syscall architecture
  supported. This way, we can install a strict filter for x86-64, while
  permitting a less strict filter for i386.

- All high-level filter additions are now moved from execute.c to
  seccomp-util.c, so that we can test them independently of the service
  execution logic.

- Tests have been added for all types of our seccomp filters.

- SystemCallFilters= and SystemCallArchitectures= are now implemented in
  independent filters and installation logic, as they semantically are
  very much independent of each other.

Fixes: #4575
2017-01-17 22:14:27 -05:00
Lennart Poettering 8d7b0c8fd7 seccomp: add new seccomp_init_conservative() helper
This adds a new seccomp_init_conservative() helper call that is mostly just a
wrapper around seccomp_init(), but turns off NNP and adds in all secondary
archs, for best compatibility with everything else.

Pretty much all of our code used the very same constructs for these three
steps, hence unifying this in one small function makes things a lot shorter.

This also changes incorrect usage of the "scmp_filter_ctx" type at various
places. libseccomp defines it as typedef to "void*", i.e. it is a pointer type
(pretty poor choice already!) that casts implicitly to and from all other
pointer types (even poorer choice: you defined a confusing type now, and don't
even gain any bit of type safety through it...). A lot of the code assumed the
type would refer to a structure, and hence added additional "*" here and there.
Remove that.
2016-10-24 17:32:50 +02:00
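The type-safety complaint above can be reproduced without libseccomp. The sketch below uses a stand-in typedef, `example_ctx`, that mirrors the `void *` definition described in the message. All names are hypothetical.

```c
#include <stddef.h>

/* Stand-in for a handle type declared as a typedef to "void *". */
typedef void *example_ctx;

/* Takes the handle by value, as a library function would. */
int example_use(example_ctx ctx) {
        return ctx != NULL;
}

/* Returns the number of calls that the compiler accepted and that
 * saw a non-NULL argument. */
int example_demo(void) {
        int backing = 0;
        example_ctx ctx = &backing;      /* the actual handle */
        example_ctx *wrong = &ctx;       /* pointer to the handle: one "*" too many */

        /* Both calls compile without a diagnostic, because any object
         * pointer converts implicitly to "void *". The second call
         * passes the wrong thing and nothing catches it. */
        return example_use(ctx) + example_use(wrong);
}
```

Had the handle been a pointer to an incomplete struct type, the second call would have produced a compiler diagnostic.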
Felipe Sateler 1cec406d62 nspawn: detect SECCOMP availability, skip audit filter if unavailable
Fail hard if SECCOMP was detected but could not be installed
2016-09-06 20:25:49 -03:00
Zbigniew Jędrzejewski-Szmek d710aaf7a5 Use "return log_error_errno" in more places 2016-07-22 21:25:09 -04:00
Lennart Poettering 54a17e01de nspawn: lock down system call filter a bit
Let's block access to the kernel keyring and a number of obsolete system calls.
Also, update list of syscalls that may alter the system clock, and do raw IO
access. Filter ptrace() if CAP_SYS_PTRACE is not passed to the container and
acct() if CAP_SYS_PACCT is not passed.

This also changes things so that kexec(), some profiling calls, the swap calls
and quotactl() are never available to containers, not even if CAP_SYS_ADMIN is
passed. After all we currently permit CAP_SYS_ADMIN to containers by default,
but these calls should not be available, even then.
2016-06-13 16:25:54 +02:00
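One way to picture the policy in this message is a table that pairs each blocked syscall with the capability that unlocks it, where 0 means "never unlocked". The sketch below is an illustration under that assumption. The struct, macro, table contents, and function name are hypothetical, and `kexec_load` is used as the concrete syscall name for what the message calls kexec().

```c
#include <linux/capability.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EXAMPLE_CAP_BIT(c) (UINT64_C(1) << (c))

struct example_rule {
        uint64_t capability;   /* bit that unlocks the call, 0 = always blocked */
        const char *name;      /* syscall name */
};

static const struct example_rule example_rules[] = {
        { EXAMPLE_CAP_BIT(CAP_SYS_PTRACE), "ptrace"     },
        { EXAMPLE_CAP_BIT(CAP_SYS_PACCT),  "acct"       },
        { 0,                               "kexec_load" },
        { 0,                               "swapon"     },
        { 0,                               "swapoff"    },
        { 0,                               "quotactl"   },
};

/* Returns true if the syscall should be blocked for a container that
 * retains the capabilities in retained_caps (a bit mask). */
bool example_syscall_blocked(const char *name, uint64_t retained_caps) {
        for (size_t i = 0; i < sizeof(example_rules) / sizeof(example_rules[0]); i++) {
                if (strcmp(example_rules[i].name, name) != 0)
                        continue;

                /* Unconditional entries stay blocked even with CAP_SYS_ADMIN. */
                if (example_rules[i].capability == 0)
                        return true;

                return (retained_caps & example_rules[i].capability) == 0;
        }

        /* Not in the table: this sketch leaves it unfiltered. */
        return false;
}
```

In this sketch `ptrace` is blocked unless the CAP_SYS_PTRACE bit is set, while `swapon` stays blocked even when the CAP_SYS_ADMIN bit is set.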
Djalal Harouni f011b0b87a nspawn: split out seccomp call into nspawn-seccomp.[ch]
Split seccomp into nspawn-seccomp.[ch]. Currently there are no changes,
but this will make it easy in the future to share or use the seccomp logic
from systemd core.
2016-05-26 22:42:29 +02:00