Systemd

Author	SHA1	Message	Date
Greg Depoire--Ferrer	6597686865	seccomp: don't install filters for archs that can't use syscalls When seccomp_restrict_archs is called, architectures that are blocked are replaced by the SECCOMP_LOCAL_ARCH_BLOCKED marker so that they are not disabled again and filters are not installed for them. This can make some services that use SystemCallArchitecture= and SystemCallFilter= start faster.	2020-12-10 16:13:02 +01:00
Zbigniew Jędrzejewski-Szmek	d5923e38bc	shared/seccomp-util: address family filtering is broken on ppc This reverts the gist of `da1921a5c3` and `0d9fca76bb` (for ppc). Quoting #17559: > libseccomp 2.5 added socket syscall multiplexing on ppc64(el): > https://github.com/seccomp/libseccomp/pull/229 > > Like with i386, s390 and s390x this breaks socket argument filtering, so > RestrictAddressFamilies doesn't work. > > This causes the unit test to fail: > /* test_restrict_address_families */ > Operating on architecture: ppc > Failed to install socket family rules for architecture ppc, skipping: Operation canceled > Operating on architecture: ppc64 > Failed to add socket() rule for architecture ppc64, skipping: Invalid argument > Operating on architecture: ppc64-le > Failed to add socket() rule for architecture ppc64-le, skipping: Invalid argument > Assertion 'fd < 0' failed at src/test/test-seccomp.c:424, function test_restrict_address_families(). Aborting. > > The socket filters can't be added so `socket(AF_UNIX, SOCK_DGRAM, 0);` still > works, triggering the assertion. Fixes #17559.	2020-11-26 14:23:15 +01:00
Yu Watanabe	11b9105dfd	seccomp: also move munmap into @default syscall filter set Follow-up for `5abede3247`.	2020-11-24 16:18:34 +01:00
Lennart Poettering	5abede3247	seccomp: move brk+mmap+mmap2 into @default syscall filter set These three syscalls are internally used by libc's memory allocation logic, i.e. ultimately back malloc(). Allocating a bit of memory is so basic, it should just be in the default set. This fixes a couple of issues with asan/msan and the seccomp tests: when asan/msan is used some additional, large memory allocations take place in the background, and unless mmap/mmap2/brk are allowlisted these will fail, aborting the test prematurely.	2020-11-19 16:44:50 +01:00
Yu Watanabe	db9ecf0501	license: LGPL-2.1+ -> LGPL-2.1-or-later	2020-11-09 13:23:58 +09:00
Lennart Poettering	ce8f6d478e	seccomp: allow turning off of seccomp filtering via env var Fixes: #17504 (While we are at it, also move $SYSTEMD_SECCOMP_LOG= env var description into the right document section) Also suggested in: https://github.com/systemd/systemd/issues/17245#issuecomment-704773603	2020-11-05 20:22:19 +01:00
Topi Miettinen	ae5e9bf46f	shared/seccomp-util: move stime() to @obsolete Quoting the manual page of stime(2): "Starting with glibc 2.31, this function is no longer available to newly linked applications and is no longer declared in <time.h>."	2020-11-04 09:48:33 +01:00
Lennart Poettering	6ea0d25c57	seccomp: allowlist close_range() by default in @basic-io	2020-10-14 10:40:06 +02:00
Frantisek Sumsal	d7a0f1f4f9	tree-wide: assorted coccinelle fixes	2020-10-09 15:02:23 +02:00
Samanta Navarro	7b121df640	seccomp-util: fix typo in help message	2020-10-03 11:56:40 +00:00
Lennart Poettering	8e24b1d23f	seccomp-util: add cacheflush() syscall to @default syscall set This is like membarrier() I guess and basically just exposes CPU functionality via kernel syscall on some archs. Let's whitelist it for everyone. Fixes: #17197	2020-09-30 10:08:15 +02:00
Topi Miettinen	9df2cdd8ec	exec: SystemCallLog= directive With new directive SystemCallLog= it's possible to list system calls to be logged. This can be used for auditing or temporarily when constructing system call filters. --- v5: drop intermediary, update HASHMAP_FOREACH_KEY() use v4: skip useless debug messages, actually parse directive v3: don't declare unused variables with old libseccomp v2: fix build without seccomp or old libseccomp	2020-09-15 12:54:17 +03:00
Topi Miettinen	005bfaf118	exec: Add kill action to system call filters Define explicit action "kill" for SystemCallErrorNumber=. In addition to errno code, allow specifying "kill" as action for SystemCallFilter=. --- v7: seccomp_parse_errno_or_action() returns -EINVAL if !HAVE_SECCOMP v6: use streq_ptr(), let errno_to_name() handle bad values, kill processes, init syscall_errno v5: actually use seccomp_errno_or_action_to_string(), don't fail bus unit parsing without seccomp v4: fix build without seccomp v3: drop log action v2: action -> number	2020-09-15 12:54:17 +03:00
Zbigniew Jędrzejewski-Szmek	90e74a66e6	tree-wide: define iterator inside of the macro	2020-09-08 12:14:05 +02:00
fangxiuning	6d95e7d9b2	tree-wide: drop pointless zero initialization (#16900)	2020-08-30 06:21:20 +09:00
Zbigniew Jędrzejewski-Szmek	9f56c88aeb	Merge pull request #16819 from keszybz/seccomp-enosys Return ENOSYS in nspawn for "unknown" syscalls	2020-08-25 09:18:46 +02:00
Zbigniew Jędrzejewski-Szmek	000c05207d	shared/seccomp-util: added functionality to make list of filtered syscalls While at it, start removing the "seccomp_" prefix from our own functions. It is used by libseccomp.	2020-08-24 20:05:09 +02:00
Zbigniew Jędrzejewski-Szmek	077e8fc0ca	shared/seccomp: reduce scope of indexing variables	2020-08-24 20:04:54 +02:00
Zbigniew Jędrzejewski-Szmek	95aac01259	shared: add @known syscall list	2020-08-24 20:04:17 +02:00
Steve Dodd	44aaddad06	Request seccomp logging if SYSTEMD_LOG_SECCOMP environment variable is set.	2020-08-21 11:24:53 +02:00
Aurelien Jarno	f9252236c8	seccomp: add support for riscv64 This patch adds seccomp support to the riscv64 architecture. seccomp support is available in the riscv64 kernel since version 5.5, and it has just been added to the libseccomp library. riscv64 uses generic syscalls like aarch64, so I used that architecture as a reference to find which code has to be modified. With this patch, the testsuite passes successfully, including the test-seccomp test. The system boots and works fine with kernel 5.4 (i.e. without seccomp support) and kernel 5.5 (i.e. with seccomp support). I have also verified that the "SystemCallFilter=~socket" option prevents a service from using the ping utility when running on kernel 5.5.	2020-08-21 10:10:29 +02:00
Zbigniew Jędrzejewski-Szmek	b4eaa6cc99	shared/seccomp: use _cleanup_ in one more place (cherry picked from commit 27605d6a836d85563faf41db9f7a72883d44c0ff)	2020-08-19 10:57:30 +02:00
Zbigniew Jędrzejewski-Szmek	6da432fd54	shared/seccomp: do not use ifdef guards around textual syscall names It is possible that we will be running with an upgraded libseccomp, in which case libseccomp might know the syscall name, even if the number is not known at the time when systemd is being compiled. The guard only serves to break such upgrades, by requiring that we also recompile systemd. For s390-specific syscalls, use a define to exclude them, so that that we don't try to filter them on other arches. (cherry picked from commit 6cf852e79eb0eced2f77653941f9c75c3bd79386)	2020-08-19 10:57:18 +02:00
Michael Scherer	bcf08acbff	Newer glibc uses faccessat2 to implement faccessat, cf. https://repo.or.cz/glibc.git/commit/3d3ab573a5f3071992cbc4f57d50d1d29d55bde2 This causes breakage on Fedora Rawhide: https://bugzilla.redhat.com/show_bug.cgi?id=1869030	2020-08-16 15:10:13 +02:00
Lennart Poettering	6b000af4f2	tree-wide: avoid some loaded terms https://tools.ietf.org/html/draft-knodel-terminology-02 https://lwn.net/Articles/823224/ This gets rid of most but not all occasions of these loaded terms: 1. scsi_id and friends are something that is supposed to be removed from our tree (see #7594) 2. The test suite defines an API used by the ubuntu CI. We can remove this too later, but this needs to be done in sync with the ubuntu CI. 3. In some cases the terms are part of APIs we call or where we expose concepts the kernel names the way it names them. (In particular all remaining uses of the word "slave" in our codebase are like this, it's used by the POSIX PTY layer, by the network subsystem, the mount API and the block device subsystem). Getting rid of the term in these contexts would mean doing some major fixes of the kernel ABI first. Regarding the replacements: when whitelist/blacklist is used as noun we replace it with allow list/deny list, and when used as verb with allow-list/deny-list.	2020-06-25 09:00:19 +02:00
Zbigniew Jędrzejewski-Szmek	de7fef4b6e	tree-wide: use set_ensure_put() Patch contains a coccinelle script, but it only works in some cases. Many parts were converted by hand. Note: I did not fix errors in return value handling. This will be done separately to keep the patch comprehensible. No functional change is intended in this patch.	2020-06-22 16:32:37 +02:00
Lennart Poettering	ecc04067f9	seccomp: filter openat2() entirely in seccomp_restrict_sxid()	2020-06-03 18:26:34 +02:00
Benjamin Robin	b9c54c4665	tree-wide: Initialize _cleanup_ variables if needed	2020-05-13 22:56:42 +02:00
Lennart Poettering	8270e3d8ed	seccomp-util: add new syscalls from kernel 5.6 to syscall filter table	2020-05-11 06:24:02 +00:00
Zbigniew Jędrzejewski-Szmek	b069c2a3f2	shared/seccomp: avoid possibly writing bogus errno code in debug log CID 1409488. This code was added in `903659e7b2`. The change that is done here is a simple fix to avoid use of an uninitialized/wrongly-initialized variable, but the bigger issue is that nothing looks at the returned result to distinguish between 0 and a positive return value.	2019-12-06 15:12:40 +01:00
Christian Ehrhardt	5ef3ed97e3	seccomp: use per arch shmat_syscall At the beginning of seccomp_memory_deny_write_execute architectures can set individual filter_syscall, block_syscall, shmat_syscall values. The former two are then used in the call to add_seccomp_syscall_filter but shmat_syscall is not. Right now all shmat_syscall values are the same, so the change is a no-op, but if ever an architecture is added/modified this would be a subtle source for a mistake so fix it by using shmat_syscall later. Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>	2019-12-05 07:19:12 +01:00
Christian Ehrhardt	903659e7b2	seccomp: ensure rules are loaded in seccomp_memory_deny_write_execute If seccomp_memory_deny_write_execute was fatally failing to load rules it already returned a bad retval. But if any adding filters failed it skipped the subsequent seccomp_load and always returned an rc of 0 even if no rule was loaded at all. Lets fix this requiring to (non fatally-failing) load at least one rule set. Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>	2019-12-05 07:19:12 +01:00
Christian Ehrhardt	bed4668d1d	seccomp: fix multiplexed system calls Since libseccomp 2.4.2 more architectures have shmat handled as multiplexed call. Those will fail to be added due to seccomp_rule_add_exact failing on them since they'd need to add multiple rules [1]. See the discussion at https://github.com/seccomp/libseccomp/issues/193 After discussions about the options rejected [2][3] the initial thought of a fallback to the non '_exact' version of the seccomp rule adding the next option is to handle those now affected (i386, s390, s390x) the same way as ppc which ignores and does not block shmat. [1]: https://github.com/seccomp/libseccomp/issues/193 [2]: https://github.com/systemd/systemd/pull/14167#issuecomment-559136906 [3]: https://github.com/systemd/systemd/commit/469830d1	2019-12-05 07:19:07 +01:00
Kevin Kuehler	620dbdd248	shared: Add ProtectKernelLogs property Add seccomp_protect_syslog, which adds a filter rule for the syslog system call.	2019-11-11 12:11:56 -08:00
Zbigniew Jędrzejewski-Szmek	9493b16871	Add @pkey syscall group Inspired by https://bugzilla.redhat.com/show_bug.cgi?id=1769299. This change doesn't solve the issue, but makes it easier to whitelist the syscall group.	2019-11-08 14:41:22 +01:00
Zbigniew Jędrzejewski-Szmek	6ca6771069	seccomp: add all time64 syscalls From https://bugzilla.redhat.com/show_bug.cgi?id=1770154: > utime is an obsolete system call. The current kernel interface is > utimensat_time64. New 32-bit architectures do not even provide the utime > system call. Also add all other time64 syscalls listed in https://fedora.juszkiewicz.com.pl/syscalls.html.	2019-11-08 14:40:49 +01:00
Lennart Poettering	9e48626571	seccomp: add new Linux 5.3 syscalls to syscall filter lists Many syscalls added and all fit nicely into existing groups, hence lets add them there.	2019-10-30 15:42:49 +01:00
Zbigniew Jędrzejewski-Szmek	a8fb09f573	shared/seccomp: add sync_file_range2 Some architectures need the arguments to be reordered because of alignment issues. Otherwise, it's the same as sync_file_range.	2019-08-19 11:10:40 +02:00
Dan Streetman	57311925aa	src/shared/seccomp-util.c: Add mmap definitions for s390	2019-08-13 15:40:36 -04:00
Lennart Poettering	46fcf95dbe	seccomp: add new 5.1 syscall pidfd_send_signal() to filter set list	2019-05-28 17:01:05 +02:00
Lennart Poettering	915fb32438	seccomp: add scmp_act_kill_process() helper that returns SCMP_ACT_KILL_PROCESS if supported	2019-05-24 10:48:28 +02:00
Anita Zhang	7bc5e0b12b	seccomp: check more error codes from seccomp_load() We noticed in our tests that occasionally SystemCallFilter= would fail to set and the service would run with no syscall filtering. Most of the time the same tests would apply the filter and fail the service as expected. While it's not totally clear why this happens, we noticed seccomp_load() in the systemd code base would fail open for all errors except EPERM and EACCES. ENOMEM, EINVAL, and EFAULT seem like reasonable values to add to the error set based on what I gather from libseccomp code and man pages: -ENOMEM: out of memory, failed to allocate space for a libseccomp structure, or would exceed a defined constant -EINVAL: kernel isn't configured to support the operations, args are invalid (to seccomp_load(), seccomp(), or prctl()) -EFAULT: addresses passed as args are invalid	2019-04-12 10:23:07 +02:00
Zbigniew Jędrzejewski-Szmek	b3e8032bb4	Merge pull request #12198 from keszybz/seccomp-parsing-logging Seccomp parsing logging cleanup	2019-04-03 17:19:14 +02:00
Zbigniew Jędrzejewski-Szmek	da4dc9a674	seccomp: rework how the S[UG]ID filter is installed If we know that a syscall is undefined on the given architecture, don't even try to add it. Try to install the filter even if some syscalls fail. Also use a helper function to make the whole a bit less magic. This allows the S[UG]ID test to pass on arm64.	2019-04-03 13:33:06 +02:00
Zbigniew Jędrzejewski-Szmek	58f6ab4454	pid1: pass unit name to seccomp parser when we have no file location Building on previous commit, let's pass the unit name when parsing dbus message or builtin whitelist, which is better than nothing. seccomp_parse_syscall_filter() is not needed anymore, so it is removed, and seccomp_parse_syscall_filter_full() is renamed to take its place.	2019-04-03 09:17:42 +02:00
Lennart Poettering	3c27973b13	seccomp: introduce seccomp_restrict_suid_sgid() for blocking chmod() for suid/sgid files	2019-04-02 16:56:48 +02:00
Lennart Poettering	9e6e543c17	seccomp: add debug messages to seccomp_protect_hostname()	2019-04-02 16:56:48 +02:00
Lennart Poettering	6fee3be0b4	seccomp: add rseq() to default list of syscalls to whitelist Apparently glibc is going to call this implicitly soon, hence let's whitelist this by default. Fixes: #12127	2019-03-28 12:09:38 +01:00
Zbigniew Jędrzejewski-Szmek	67fb5f338f	seccomp: allow shmat to be a separate syscall on architectures which use a multiplexer After https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0d6040d46817, those syscalls have their separate numbers and we can block them. But glibc might still use the old ones. So let's just do a best-effort block and not assume anything about how effective it is.	2019-03-15 15:46:41 +01:00
Zbigniew Jędrzejewski-Szmek	e55bdf9b6c	seccomp: shm{get,at,dt} now have their own numbers everywhere E.g. on i686: (previously) arch x86: SCMP_SYS(mmap) = 90 arch x86: SCMP_SYS(mmap2) = 192 arch x86: SCMP_SYS(shmat) = -221 arch x86: SCMP_SYS(shmat) = -221 arch x86: SCMP_SYS(shmdt) = -222 (now) arch x86: SCMP_SYS(mmap) = 90 arch x86: SCMP_SYS(mmap2) = 192 arch x86: SCMP_SYS(shmat) = 397 arch x86: SCMP_SYS(shmat) = 397 arch x86: SCMP_SYS(shmdt) = 398 The relevant commit seems to be https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0d6040d46817.	2019-03-15 15:28:43 +01:00


162 commits