Commit graph

54 commits

Author SHA1 Message Date
Zbigniew Jędrzejewski-Szmek 604b163a31 test-seccomp: minor simpification 2020-08-05 10:49:46 +02:00
Lennart Poettering 6b000af4f2 tree-wide: avoid some loaded terms
https://tools.ietf.org/html/draft-knodel-terminology-02
https://lwn.net/Articles/823224/

This gets rid of most but not occasions of these loaded terms:

1. scsi_id and friends are something that is supposed to be removed from
   our tree (see #7594)

2. The test suite defines an API used by the ubuntu CI. We can remove
   this too later, but this needs to be done in sync with the ubuntu CI.

3. In some cases the terms are part of APIs we call or where we expose
   concepts the kernel names the way it names them. (In particular all
   remaining uses of the word "slave" in our codebase are like this,
   it's used by the POSIX PTY layer, by the network subsystem, the mount
   API and the block device subsystem). Getting rid of the term in these
   contexts would mean doing some major fixes of the kernel ABI first.

Regarding the replacements: when whitelist/blacklist is used as noun we
replace with with allow list/deny list, and when used as verb with
allow-list/deny-list.
2020-06-25 09:00:19 +02:00
Topi Miettinen 3c14dc61f7 tests: various small fixes for strict systems
Don't assume that 4MB can be allocated from stack since there could be smaller
DefaultLimitSTACK= in force, so let's use malloc(). NUL terminate the huge
strings by hand, also ensure termination in test_lz4_decompress_partial() and
optimize the memset() for the string.

Some items in /proc and /etc may not be accessible to poor unprivileged users
due to e.g. SELinux, BOFH or both, so check for EACCES and EPERM.

/var/tmp may be a symlink to /tmp and then path_compare() will always fail, so
let's stick to /tmp like elsewhere.

/tmp may be mounted with noexec option and then trying to execute scripts from
there would fail.

Detect and warn if seccomp is already in use, which could make seccomp test
fail if the syscalls are already blocked.

Unset $TMPDIR so it will not break specifier tests where %T is assumed to be
/tmp and %V /var/tmp.
2020-04-26 20:18:48 +02:00
Yu Watanabe dd0395b565 make namespace_flags_to_string() not return empty string
This improves the following debug log.

Before:
systemd[1162]: Restricting namespace to: .

After:
systemd[1162]: Restricting namespace to: n/a.
2020-03-03 21:17:38 +01:00
Mike Gilbert fb4b0465ab seccomp: real syscall numbers are >= 0
Real syscall numbers start at 0. The fake seccomp values seem to be
strictly less than 0.

Fixes: 4df8fe8415
2019-12-09 11:29:06 +01:00
Christian Ehrhardt 49219b5c2a
seccomp: mmap test results depend on kernel/libseccomp/glibc
Like with shmat already the actual results of the test
test_memory_deny_write_execute_mmap depend on kernel/libseccomp/glibc
of the platform it is running on.

There are known-good platforms, but on the others do not assert success
(which implies test has actually failed as no seccomp blocking was achieved),
but instead make the check dependent to the success of the mmap call
on that platforms.

Finally the assert of the munmap on that valid pointer should return ==0,
so that is what the check should be for in case of p != MAP_FAILED.

Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
2019-12-05 07:19:12 +01:00
Lennart Poettering 8af381679d
Merge pull request #13940 from keur/protect_kernel_logs
Add ProtectKernelLogs to systemd.exec
2019-11-15 16:26:10 +01:00
Lennart Poettering 4df8fe8415 seccomp: more comprehensive protection against libseccomp's __NR_xyz namespace invasion
A follow-up for 59b657296a, adding the
same conditioning for all cases of our __NR_xyz use.

Fixes: #14031
2019-11-15 08:13:36 +01:00
Kevin Kuehler 97d05f3b70 test/test-seccomp: add test_protect_syslog 2019-11-14 13:31:03 -08:00
Yu Watanabe df26692947 tree-wide: drop sched.h when missing_sched.h is included 2019-11-04 00:30:32 +09:00
Yu Watanabe f5947a5e92 tree-wide: drop missing.h 2019-10-31 17:57:03 +09:00
Lennart Poettering 7bbc229cf7 test: use the new action in our tests
This way, we know that it works as intended.
2019-05-24 10:48:28 +02:00
Zbigniew Jędrzejewski-Szmek dff6c6295b test-seccomp: fix compilation on arm64
It has no open().
2019-04-03 13:24:43 +02:00
Lennart Poettering 167fc10cb3 test: add test case for restrict_suid_sgid() 2019-04-02 16:56:48 +02:00
Zbigniew Jędrzejewski-Szmek 67fb5f338f seccomp: allow shmat to be a separate syscall on architectures which use a multiplexer
After
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0d6040d46817,
those syscalls have their separate numbers and we can block them.
But glibc might still use the old ones. So let's just do a best-effort
block and not assume anything about how effective it is.
2019-03-15 15:46:41 +01:00
Zbigniew Jędrzejewski-Szmek e55bdf9b6c seccomp: shm{get,at,dt} now have their own numbers everywhere
E.g. on i686:

(previously)
arch x86: SCMP_SYS(mmap) = 90
arch x86: SCMP_SYS(mmap2) = 192
arch x86: SCMP_SYS(shmat) = -221
arch x86: SCMP_SYS(shmat) = -221
arch x86: SCMP_SYS(shmdt) = -222

(now)
arch x86: SCMP_SYS(mmap) = 90
arch x86: SCMP_SYS(mmap2) = 192
arch x86: SCMP_SYS(shmat) = 397
arch x86: SCMP_SYS(shmat) = 397
arch x86: SCMP_SYS(shmdt) = 398

The relevant commit seems to be
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0d6040d46817.
2019-03-15 15:28:43 +01:00
Lennart Poettering d8b4d14df4 util: split out nulstr related stuff to nulstr-util.[ch] 2019-03-14 13:25:52 +01:00
Lennart Poettering 0a9707187b util: split out memcmp()/memset() related calls into memory-util.[ch]
Just some source rearranging.
2019-03-13 12:16:43 +01:00
Lennart Poettering 5f00dc4df6 test: skip various tests if namespacing is not available
Apparently on Debian LXC/AppArmor doesn't allow namespacing to container
payloads. Deal with it.

Fixes: #9700
2018-10-24 19:40:24 +02:00
Zbigniew Jędrzejewski-Szmek b54f36c604 seccomp: reduce logging about failure to add syscall to seccomp
Our logs are full of:
Sep 19 09:22:10 autopkgtest systemd[690]: Failed to add rule for system call oldstat() / -10037, ignoring: Numerical argument out of domain
Sep 19 09:22:10 autopkgtest systemd[690]: Failed to add rule for system call get_thread_area() / -10076, ignoring: Numerical argument out of domain
Sep 19 09:22:10 autopkgtest systemd[690]: Failed to add rule for system call set_thread_area() / -10079, ignoring: Numerical argument out of domain
Sep 19 09:22:10 autopkgtest systemd[690]: Failed to add rule for system call oldfstat() / -10034, ignoring: Numerical argument out of domain
Sep 19 09:22:10 autopkgtest systemd[690]: Failed to add rule for system call oldolduname() / -10036, ignoring: Numerical argument out of domain
Sep 19 09:22:10 autopkgtest systemd[690]: Failed to add rule for system call oldlstat() / -10035, ignoring: Numerical argument out of domain
Sep 19 09:22:10 autopkgtest systemd[690]: Failed to add rule for system call waitpid() / -10073, ignoring: Numerical argument out of domain
...
This is pointless and makes debug logs hard to read. Let's keep the logs
in test code, but disable it in nspawn and pid1. This is done through a function
parameter because those functions operate recursively and it's not possible to
make the caller to log meaningfully.


There should be no functional change, except the skipped debug logs.
2018-09-24 17:21:09 +02:00
Zbigniew Jędrzejewski-Szmek f09da7ccbc test-seccomp: log function names
Various tests produce similar output, and the function names make it
easier to see where the output is generated.
2018-09-24 17:21:09 +02:00
Zbigniew Jędrzejewski-Szmek 23e12f8e6c test-seccomp: move two similar tests closer 2018-09-24 17:19:11 +02:00
Yu Watanabe cd90ec7544 test-seccomp: add log messages when skipping tests 2018-09-21 00:32:44 +09:00
Zbigniew Jędrzejewski-Szmek 6d7c403324 tests: use a helper function to parse environment and open logging
The advantages are that we save a few lines, and that we can override
logging using environment variables in more test executables.
2018-09-14 09:29:57 +02:00
Lennart Poettering 705268414f seccomp: add new system call filter, suitable as default whitelist for system services
Currently we employ mostly system call blacklisting for our system
services. Let's add a new system call filter group @system-service that
helps turning this around into a whitelist by default.

The new group is very similar to nspawn's default filter list, but in
some ways more restricted (as sethostname() and suchlike shouldn't be
available to most system services just like that) and in others more
relaxed (for example @keyring is blocked in nspawn since it's not
properly virtualized yet in the kernel, but is fine for regular system
services).
2018-06-14 17:44:20 +02:00
Lennart Poettering 0c69794138 tree-wide: remove Lennart's copyright lines
These lines are generally out-of-date, incomplete and unnecessary. With
SPDX and git repository much more accurate and fine grained information
about licensing and authorship is available, hence let's drop the
per-file copyright notice. Of course, removing copyright lines of others
is problematic, hence this commit only removes my own lines and leaves
all others untouched. It might be nicer if sooner or later those could
go away too, making git the only and accurate source of authorship
information.
2018-06-14 10:20:20 +02:00
Lennart Poettering 818bf54632 tree-wide: drop 'This file is part of systemd' blurb
This part of the copyright blurb stems from the GPL use recommendations:

https://www.gnu.org/licenses/gpl-howto.en.html

The concept appears to originate in times where version control was per
file, instead of per tree, and was a way to glue the files together.
Ultimately, we nowadays don't live in that world anymore, and this
information is entirely useless anyway, as people are very welcome to
copy these files into any projects they like, and they shouldn't have to
change bits that are part of our copyright header for that.

hence, let's just get rid of this old cruft, and shorten our codebase a
bit.
2018-06-14 10:20:20 +02:00
Yu Watanabe 86c2a9f1c2 nsflsgs: drop namespace_flag_{from,to}_string()
This also drops namespace_flag_to_string_many_with_check(), and
renames namespace_flag_{from,to}_string_many() to
namespace_flags_{from,to}_string().
2018-05-05 11:07:37 +09:00
Zbigniew Jędrzejewski-Szmek 11a1589223 tree-wide: drop license boilerplate
Files which are installed as-is (any .service and other unit files, .conf
files, .policy files, etc), are left as is. My assumption is that SPDX
identifiers are not yet that well known, so it's better to retain the
extended header to avoid any doubt.

I also kept any copyright lines. We can probably remove them, but it'd nice to
obtain explicit acks from all involved authors before doing that.
2018-04-06 18:58:55 +02:00
Lennart Poettering 7d4904fe7a process-util: rework wait_for_terminate_and_warn() to take a flags parameter
This renames wait_for_terminate_and_warn() to
wait_for_terminate_and_check(), and adds a flags parameter, that
controls how much to log: there's one flag that means we log about
abnormal stuff, and another one that controls whether we log about
non-zero exit codes. Finally, there's a shortcut flag value for logging
in both cases, as that's what we usually use.

All callers are accordingly updated. At three occasions duplicate logging
is removed, i.e. where the old function was called but logged in the
caller, too.
2018-01-04 13:27:27 +01:00
Zbigniew Jędrzejewski-Szmek 53e1b68390 Add SPDX license identifiers to source files under the LGPL
This follows what the kernel is doing, c.f.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5fd54ace4721fc5ce2bb5aef6318fcf17f421460.
2017-11-19 19:08:15 +01:00
Yu Watanabe b4891260b9 test: add tests for syscall:errno style in SystemCallFilter= 2017-11-11 21:54:20 +09:00
Matija Skala d7e454ba9c fix includes
sys/wait.h is needed for WEXITED macro

poll.h is more portable than sys/poll.h
2017-10-30 10:32:45 +01:00
Lennart Poettering 25e94f8c75 tests: let's make sure the seccomp filter lists remain properly ordered
It's too easy to corrupt the order, hence let's check for the right
order automatically as part of testing.
2017-09-14 15:45:21 +02:00
Lennart Poettering 21022b9dde util-lib: wrap personality() to fix up broken glibc error handling (#6766)
glibc appears to propagate different errors in different ways, let's fix
this up, so that our own code doesn't get confused by this.

See #6752 + #6737 for details.

Fixes: #6755
2017-09-08 17:16:29 +03:00
Evgeny Vereshchagin 48fa42d4ef tests: check the return value of personality when errno is not set (#6752)
The `personality` wrapper might not set errno, so in that case the return value
should be checked instead.

For details, see
https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=e0043e17dfc52fe1702746543127cb4a87232bcd.

Closes #6737.
2017-09-06 06:08:04 +02:00
Lennart Poettering 72eafe7159 seccomp: rework seccomp_lock_personality() to apply filter to all archs 2017-08-29 15:58:13 +02:00
Lennart Poettering e8132d63fe seccomp: default to something resembling the current personality when locking it
Let's lock the personality to the currently set one, if nothing is
specifically specified. But do so with a grain of salt, and never
default to any exotic personality here, but only PER_LINUX or
PER_LINUX32.
2017-08-29 15:56:57 +02:00
Topi Miettinen 78e864e5b3 seccomp: LockPersonality boolean (#6193)
Add LockPersonality boolean to allow locking down personality(2)
system call so that the execution domain can't be changed.
This may be useful to improve security because odd emulations
may be poorly tested and source of vulnerabilities, while
system services shouldn't need any weird personalities.
2017-08-29 15:54:50 +02:00
Zbigniew Jędrzejewski-Szmek f60a865a49 test-seccomp: arm64 does not have access() and poll()
glibc uses faccessat and ppoll, so just add a filters for that.

(cherry picked from commit abc0213839fef92e2e2b98a434914f22ece48490)
2017-07-15 17:18:22 -04:00
Zbigniew Jędrzejewski-Szmek 2e64e8f46d seccomp: arm64/x32 do not have _sysctl
So don't even try to added the filter to reduce noise.
The test is updated to skip calling _sysctl because the kernel prints
an oops-like message that is confusing and unhelpful:

Jul 15 21:07:01 rpi3 kernel: test-seccomp[8448]: syscall -10080
Jul 15 21:07:01 rpi3 kernel: Code: aa0503e4 aa0603e5 aa0703e6 d4000001 (b13ffc1f)
Jul 15 21:07:01 rpi3 kernel: CPU: 3 PID: 8448 Comm: test-seccomp Tainted: G        W       4.11.8-300.fc26.aarch64 #1
Jul 15 21:07:01 rpi3 kernel: Hardware name: raspberrypi rpi/rpi, BIOS 2017.05 06/24/2017
Jul 15 21:07:01 rpi3 kernel: task: ffff80002bb0bb00 task.stack: ffff800036354000
Jul 15 21:07:01 rpi3 kernel: PC is at 0xffff8669c7c4
Jul 15 21:07:01 rpi3 kernel: LR is at 0xaaaac64b6750
Jul 15 21:07:01 rpi3 kernel: pc : [<0000ffff8669c7c4>] lr : [<0000aaaac64b6750>] pstate: 60000000
Jul 15 21:07:01 rpi3 kernel: sp : 0000ffffdc640fd0
Jul 15 21:07:01 rpi3 kernel: x29: 0000ffffdc640fd0 x28: 0000000000000000
Jul 15 21:07:01 rpi3 kernel: x27: 0000000000000000 x26: 0000000000000000
Jul 15 21:07:01 rpi3 kernel: x25: 0000000000000000 x24: 0000000000000000
Jul 15 21:07:01 rpi3 kernel: x23: 0000000000000000 x22: 0000000000000000
Jul 15 21:07:01 rpi3 kernel: x21: 0000aaaac64b4940 x20: 0000000000000000
Jul 15 21:07:01 rpi3 kernel: x19: 0000aaaac64b88f8 x18: 0000000000000020
Jul 15 21:07:01 rpi3 kernel: x17: 0000ffff8669c7a0 x16: 0000aaaac64d2ee0
Jul 15 21:07:01 rpi3 kernel: x15: 0000000000000000 x14: 0000000000000000
Jul 15 21:07:01 rpi3 kernel: x13: 203a657275746365 x12: 0000000000000000
Jul 15 21:07:01 rpi3 kernel: x11: 0000ffffdc640418 x10: 0000000000000000
Jul 15 21:07:01 rpi3 kernel: x9 : 0000000000000005 x8 : 00000000ffffd8a0
Jul 15 21:07:01 rpi3 kernel: x7 : 7f7f7f7f7f7f7f7f x6 : 7f7f7f7f7f7f7f7f
Jul 15 21:07:01 rpi3 kernel: x5 : 65736d68716f7277 x4 : 0000000000000000
Jul 15 21:07:01 rpi3 kernel: x3 : 0000000000000008 x2 : 0000000000000000
Jul 15 21:07:01 rpi3 kernel: x1 : 0000000000000000 x0 : 0000000000000000
Jul 15 21:07:01 rpi3 kernel:

(cherry picked from commit 1e20e640132c700c23494bb9e2619afb83878380)
2017-07-15 17:18:22 -04:00
Zbigniew Jędrzejewski-Szmek da1921a5c3 seccomp: enable RestrictAddressFamilies on ppc64, autodetect SECCOMP_RESTRICT_ADDRESS_FAMILIES_BROKEN
We expect that if socket() syscall is available, seccomp works for that
architecture.  So instead of explicitly listing all architectures where we know
it is not available, just assume it is broken if the number is not defined.
This should have the same effect, except that other architectures where it is
also broken will pass tests without further changes. (Architectures where the
filter should work, but does not work because of missing entries in
seccomp-util.c, will still fail.)

i386, s390, s390x are the exception — setting the filter fails, even though
socket() is available, so it needs to be special-cased
(https://github.com/systemd/systemd/issues/5215#issuecomment-277241488).

This remove the last define in seccomp-util.h that was only used in test-seccomp.c. Porting
the seccomp filter to new architectures should be simpler because now only two places need
to be modified.

RestrictAddressFamilies seems to work on ppc64[bl]e, so enable it (the tests pass).
2017-05-10 09:21:16 -04:00
Zbigniew Jędrzejewski-Szmek 511ceb1f8d seccomp: assume clone() arg order is known on all architectures
While adding the defines for arm, I realized that we have pretty much all
known architectures covered, so SECCOMP_RESTRICT_NAMESPACES_BROKEN is not
necessary anymore. clone(2) is adamant that the order of the first two
arguments is only reversed on s390/s390x. So let's simplify things and remove
the #if.
2017-05-07 20:01:04 -04:00
Zbigniew Jędrzejewski-Szmek 4278d1f531 seccomp: add mmap/shmat defines for arm and arm64 2017-05-07 20:01:04 -04:00
Zbigniew Jędrzejewski-Szmek 2a8d6e6395 seccomp: add mmap/shmat defines for ppc64 2017-05-07 20:01:04 -04:00
Zbigniew Jędrzejewski-Szmek 2a65bd94e4 seccomp: drop SECCOMP_MEMORY_DENY_WRITE_EXECUTE_BROKEN, add test for shmat
SECCOMP_MEMORY_DENY_WRITE_EXECUTE_BROKEN was conflating two separate things:
1. whether shmat/shmdt/shmget can be filtered (if ipc multiplexer is used, they can not)
2. whether we know this for the current architecture

For i386, shmat is implemented as ipc, so seccomp filter is "broken" for shmat,
but not for mmap, and SECCOMP_MEMORY_DENY_WRITE_EXECUTE_BROKEN cannot be used
to cover both cases. The define was only used for tests — not in the implementation
in seccomp-util.c. So let's get rid of SECCOMP_MEMORY_DENY_WRITE_EXECUTE_BROKEN
and encode the right condition directly in tests.
2017-05-07 18:59:37 -04:00
Zbigniew Jędrzejewski-Szmek dce0e62046 test-seccomp: limit the code under #ifdef
Try to make the paths for supported and unsupported architectures as
similar as possible.
2017-05-03 19:50:39 +00:00
Lennart Poettering ae9d60ce4e seccomp: on s390 the clone() parameters are reversed
Add a bit of code that tries to get the right parameter order in place
for some of the better known architectures, and skips
restrict_namespaces for other archs.

This also bypasses the test on archs where we don't know the right
order.

In this case I didn't bother with testing the case where no filter is
applied, since that is hopefully just an issue for now, as there's
nothing stopping us from supporting more archs, we just need to know
which order is right.

Fixes: #5241
2017-02-08 22:21:27 +01:00
Lennart Poettering 8a50cf6957 seccomp: MemoryDenyWriteExecute= should affect both mmap() and mmap2() (#5254)
On i386 we block the old mmap() call entirely, since we cannot properly
filter it. Thankfully it hasn't been used by glibc since quite some
time.

Fixes: #5240
2017-02-08 15:14:02 +01:00
Lennart Poettering ad8f1479b4 seccomp: RestrictAddressFamilies= is not supported on i386/s390/s390x, make it a NOP
See: #5215
2017-02-06 14:17:12 +01:00