Merge pull request #8766 from poettering/syscall-filter-service

add a new `@system-service` syscall group that is good as a starting point for whitelisting syscalls
Zbigniew Jędrzejewski-Szmek 2018-06-15 11:34:44 +02:00 committed by GitHub
commit 88f375b8c2
21 changed files with 196 additions and 13 deletions

.mkosi/mkosi.ubuntu Normal file
@@ -0,0 +1,72 @@
+# SPDX-License-Identifier: LGPL-2.1+
+#
+# This file is part of systemd.
+#
+# Copyright 2016 Daniel Rusek
+
+# This is a settings file for OS image generation using mkosi (https://github.com/systemd/mkosi).
+# Simply invoke "mkosi" in the project directory to build an OS image.
+
+[Distribution]
+Distribution=ubuntu
+Release=bionic
+Repositories=main,universe
+
+[Output]
+Format=raw_btrfs
+Bootable=no
+
+[Partitions]
+RootSize=2G
+
+[Packages]
+BuildPackages=
+        acl
+        docbook-xml
+        docbook-xsl
+        gcc
+        gettext
+        git
+        gnu-efi
+        gperf
+        iptables-dev
+        libacl1-dev
+        libaudit-dev
+        libblkid-dev
+        libbz2-dev
+        libcap-dev
+        libcryptsetup-dev
+        libcurl4-gnutls-dev
+        libdbus-1-dev
+        libdw-dev
+        libfdisk-dev
+        libgcrypt20-dev
+        libgnutls28-dev
+        libidn2-0-dev
+        libkmod-dev
+        liblz4-dev
+        liblz4-tool
+        liblzma-dev
+        libmicrohttpd-dev
+        libmount-dev
+        libpam0g-dev
+        libqrencode-dev
+        libseccomp-dev
+        libsmartcols-dev
+        libtool
+        libxkbcommon-dev
+        m4
+        meson
+        pkg-config
+        python3
+        python3-lxml
+        tree
+        tzdata
+        uuid-dev
+        xsltproc
+        xz-utils
+
+Packages=
+        libqrencode3
+        locales
+        libidn2-0

NEWS
@@ -59,6 +59,15 @@ CHANGES WITH 239 in spe:
           both runtime and persistent enablement/masking, i.e. it will remove
           any relevant symlinks both in /run and /etc.

+        * Note that all long-running system services shipped with systemd will
+          now default to a system call whitelist (rather than a blacklist, as
+          before). In particular, systemd-udevd will now enforce one too. For
+          most cases this should be safe, however downstream distributions
+          which disabled sandboxing of systemd-udevd (specifically the
+          MountFlags= setting), might want to disable this security feature
+          too, as the default whitelisting will prohibit all mount, swap,
+          reboot and clock changing operations from udev rules.
+
         * sd-boot acquired new loader configuration settings to optionally turn
           off Windows and MacOS boot partition discovery as well as
           reboot-into-firmware menu items. It is also able to pick a better

@@ -1490,6 +1490,10 @@ RestrictNamespaces=~cgroup net</programlisting>
<entry>@sync</entry>
<entry>Synchronizing files and memory to disk: (<citerefentry project='man-pages'><refentrytitle>fsync</refentrytitle><manvolnum>2</manvolnum></citerefentry>, <citerefentry project='man-pages'><refentrytitle>msync</refentrytitle><manvolnum>2</manvolnum></citerefentry>, and related calls)</entry>
</row>
+<row>
+<entry>@system-service</entry>
+<entry>A reasonable set of system calls used by common system services, excluding any special purpose calls. This is the recommended starting point for whitelisting system calls for system services, as it contains what is typically needed by system services, but excludes overly specific interfaces. For example, the following APIs are excluded: <literal>@clock</literal>, <literal>@mount</literal>, <literal>@swap</literal>, <literal>@reboot</literal>.</entry>
+</row>
<row>
<entry>@timer</entry>
<entry>System calls for scheduling operations by time (<citerefentry project='man-pages'><refentrytitle>alarm</refentrytitle><manvolnum>2</manvolnum></citerefentry>, <citerefentry project='man-pages'><refentrytitle>timer_create</refentrytitle><manvolnum>2</manvolnum></citerefentry>, …)</entry>
@@ -1504,6 +1508,14 @@ RestrictNamespaces=~cgroup net</programlisting>
<command>systemd-analyze syscall-filter</command> to list the actual list of system calls in each
filter.</para>

+<para>Generally, whitelisting system calls (rather than blacklisting) is the safer mode of operation. It is
+recommended to enforce system call whitelists for all long-running system services. Specifically, the
+following lines are a relatively safe basic choice for the majority of system services:</para>
+
+<programlisting>[Service]
+SystemCallFilter=@system-service
+SystemCallErrorNumber=EPERM</programlisting>
+
<para>It is recommended to combine the file system namespacing related options with
<varname>SystemCallFilter=~@mount</varname>, in order to prohibit the unit's processes to undo the
mappings. Specifically these are the options <varname>PrivateTmp=</varname>,

@@ -27,4 +27,6 @@ LockPersonality=yes
MemoryDenyWriteExecute=yes
RestrictRealtime=yes
RestrictNamespaces=yes
+SystemCallFilter=@system-service
+SystemCallErrorNumber=EPERM
SystemCallArchitectures=native

@@ -25,6 +25,8 @@ LockPersonality=yes
MemoryDenyWriteExecute=yes
RestrictRealtime=yes
RestrictNamespaces=yes
+SystemCallFilter=@system-service
+SystemCallErrorNumber=EPERM
SystemCallArchitectures=native
PrivateNetwork=yes
IPAddressDeny=any

@@ -23,6 +23,8 @@ NoNewPrivileges=yes
MemoryDenyWriteExecute=yes
RestrictRealtime=yes
RestrictNamespaces=yes
+SystemCallFilter=@system-service
+SystemCallErrorNumber=EPERM
SystemCallArchitectures=native
PrivateNetwork=yes
IPAddressDeny=any

@@ -632,7 +632,7 @@ const SyscallFilterSet syscall_filter_sets[_SYSCALL_FILTER_SET_MAX] = {
                 "setresuid32\0"
                 "setreuid\0"
                 "setreuid32\0"
-                "setuid\0"
+                "setuid\0" /* We list the explicit system calls here, as @setuid also includes setgid() which is not necessarily privileged */
                 "setuid32\0"
                 "vhangup\0"
         },
@@ -756,6 +756,75 @@ const SyscallFilterSet syscall_filter_sets[_SYSCALL_FILTER_SET_MAX] = {
                 "sync_file_range\0"
                 "syncfs\0"
         },
+        [SYSCALL_FILTER_SET_SYSTEM_SERVICE] = {
+                .name = "@system-service",
+                .help = "General system service operations",
+                .value =
+                "@aio\0"
+                "@basic-io\0"
+                "@chown\0"
+                "@default\0"
+                "@file-system\0"
+                "@io-event\0"
+                "@ipc\0"
+                "@keyring\0"
+                "@memlock\0"
+                "@network-io\0"
+                "@process\0"
+                "@resources\0"
+                "@setuid\0"
+                "@signal\0"
+                "@sync\0"
+                "@timer\0"
+                "brk\0"
+                "capget\0"
+                "capset\0"
+                "copy_file_range\0"
+                "fadvise64\0"
+                "fadvise64_64\0"
+                "flock\0"
+                "get_mempolicy\0"
+                "getcpu\0"
+                "getpriority\0"
+                "getrandom\0"
+                "ioctl\0"
+                "ioprio_get\0"
+                "kcmp\0"
+                "madvise\0"
+                "mincore\0"
+                "mprotect\0"
+                "mremap\0"
+                "name_to_handle_at\0"
+                "oldolduname\0"
+                "olduname\0"
+                "personality\0"
+                "readahead\0"
+                "readdir\0"
+                "remap_file_pages\0"
+                "sched_get_priority_max\0"
+                "sched_get_priority_min\0"
+                "sched_getaffinity\0"
+                "sched_getattr\0"
+                "sched_getparam\0"
+                "sched_getscheduler\0"
+                "sched_rr_get_interval\0"
+                "sched_yield\0"
+                "sendfile\0"
+                "sendfile64\0"
+                "setfsgid\0"
+                "setfsgid32\0"
+                "setfsuid\0"
+                "setfsuid32\0"
+                "setpgid\0"
+                "setsid\0"
+                "splice\0"
+                "sysinfo\0"
+                "tee\0"
+                "umask\0"
+                "uname\0"
+                "userfaultfd\0"
+                "vmsplice\0"
+        },
         [SYSCALL_FILTER_SET_TIMER] = {
                 .name = "@timer",
                 .help = "Schedule operations by time",
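
The `.value` field in the hunk above is a single string literal in which every entry ends in `\0`, and entries beginning with `@` name other sets. The following is a minimal sketch of how such a list can be walked and nested groups followed. It is not systemd's implementation: `DemoFilterSet`, `demo_sets`, `demo_find_set` and `demo_set_contains` are hypothetical names, the two toy sets are invented for illustration, and the sketch assumes that sets never reference each other in a cycle.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, cut-down stand-in for SyscallFilterSet: only name and value. */
typedef struct DemoFilterSet {
        const char *name;
        const char *value; /* entries separated by "\0"; the literal's implicit NUL ends the list */
} DemoFilterSet;

/* Two toy sets, invented for this sketch; the real tables are much longer. */
static const DemoFilterSet demo_sets[] = {
        { .name = "@sync",
          .value = "fsync\0"
                   "msync\0" },
        { .name = "@demo-service",
          .value = "@sync\0"
                   "brk\0"
                   "uname\0" },
};

#define DEMO_N_SETS (sizeof(demo_sets) / sizeof(demo_sets[0]))

/* Looks up a set by its "@name"; returns NULL if there is no such set. */
const DemoFilterSet *demo_find_set(const char *name) {
        for (size_t i = 0; i < DEMO_N_SETS; i++)
                if (strcmp(demo_sets[i].name, name) == 0)
                        return demo_sets + i;
        return NULL;
}

/* Returns true if 'syscall' is listed in 'set', following nested "@group" entries.
 * Assumes the sets contain no reference cycles. */
bool demo_set_contains(const DemoFilterSet *set, const char *syscall) {
        /* Each entry ends in NUL; an empty entry (two NULs in a row) ends the list. */
        for (const char *p = set->value; *p != '\0'; p += strlen(p) + 1) {
                if (p[0] == '@') {
                        const DemoFilterSet *nested = demo_find_set(p);
                        if (nested && demo_set_contains(nested, syscall))
                                return true;
                } else if (strcmp(p, syscall) == 0)
                        return true;
        }
        return false;
}
```

Under these assumptions, `demo_set_contains(demo_find_set("@demo-service"), "fsync")` is true because `@demo-service` pulls in `@sync`, in the same way `@system-service` pulls in the sixteen groups listed at the top of its `.value`.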

@@ -47,6 +47,7 @@ enum {
         SYSCALL_FILTER_SET_SIGNAL,
         SYSCALL_FILTER_SET_SWAP,
         SYSCALL_FILTER_SET_SYNC,
+        SYSCALL_FILTER_SET_SYSTEM_SERVICE,
         SYSCALL_FILTER_SET_TIMER,
         _SYSCALL_FILTER_SET_MAX
 };

@@ -104,7 +104,8 @@ static void test_filter_sets(void) {
                 if (pid == 0) { /* Child? */
                         int fd;

-                        if (i == SYSCALL_FILTER_SET_DEFAULT) /* if we look at the default set, whitelist instead of blacklist */
+                        /* if we look at the default set (or one that includes it), whitelist instead of blacklist */
+                        if (IN_SET(i, SYSCALL_FILTER_SET_DEFAULT, SYSCALL_FILTER_SET_SYSTEM_SERVICE))
                                 r = seccomp_load_syscall_filter_set(SCMP_ACT_ERRNO(EUCLEAN), syscall_filter_sets + i, SCMP_ACT_ALLOW);
                         else
                                 r = seccomp_load_syscall_filter_set(SCMP_ACT_ALLOW, syscall_filter_sets + i, SCMP_ACT_ERRNO(EUCLEAN));
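
The two calls in the hunk above differ only in which action is the default and which applies to the listed system calls. The sketch below models that decision in plain C without libseccomp, so it does not install a real filter. `DemoAction` and `demo_filter_decide` are hypothetical names, and the argument order merely mirrors the calls in the diff. The likely reason for the special case is an assumption here: a set that contains `@default` covers calls the child process itself needs, so blacklisting it would presumably stop the test child from running at all.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the two seccomp actions used by the test. */
typedef enum DemoAction { DEMO_ALLOW, DEMO_ERRNO } DemoAction;

/* Outcome for one system call: calls named in the NULL-terminated 'listed' array
 * get 'listed_action', every other call gets 'default_action'. */
DemoAction demo_filter_decide(DemoAction default_action,
                              const char *const *listed,
                              DemoAction listed_action,
                              const char *syscall) {
        for (size_t i = 0; listed[i] != NULL; i++)
                if (strcmp(listed[i], syscall) == 0)
                        return listed_action;
        return default_action;
}
```

With `default_action` set to `DEMO_ERRNO` and `listed_action` set to `DEMO_ALLOW` the set acts as a whitelist; swapping the two turns the same set into a blacklist.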

@@ -33,7 +33,8 @@ MemoryDenyWriteExecute=yes
RestrictRealtime=yes
RestrictNamespaces=yes
RestrictAddressFamilies=AF_UNIX
-SystemCallFilter=~@clock @cpu-emulation @debug @keyring @module @mount @obsolete @raw-io @reboot @swap
+SystemCallFilter=@system-service
+SystemCallErrorNumber=EPERM
SystemCallArchitectures=native
LockPersonality=yes
IPAddressDeny=any

@@ -29,7 +29,8 @@ MemoryDenyWriteExecute=yes
RestrictRealtime=yes
RestrictNamespaces=yes
RestrictAddressFamilies=AF_UNIX
-SystemCallFilter=~@clock @cpu-emulation @debug @keyring @module @mount @obsolete @raw-io @reboot @swap
+SystemCallFilter=@system-service sethostname
+SystemCallErrorNumber=EPERM
SystemCallArchitectures=native
LockPersonality=yes
IPAddressDeny=any

@@ -23,6 +23,7 @@ MemoryDenyWriteExecute=yes
RestrictRealtime=yes
RestrictNamespaces=net
RestrictAddressFamilies=AF_UNIX AF_INET AF_INET6
-SystemCallFilter=~@clock @cpu-emulation @debug @keyring @module @obsolete @raw-io @reboot @swap
+SystemCallFilter=@system-service @mount
+SystemCallErrorNumber=EPERM
SystemCallArchitectures=native
LockPersonality=yes

@@ -29,7 +29,8 @@ MemoryDenyWriteExecute=yes
RestrictRealtime=yes
RestrictNamespaces=yes
RestrictAddressFamilies=AF_UNIX AF_NETLINK
-SystemCallFilter=~@clock @cpu-emulation @debug @keyring @module @mount @obsolete @raw-io @reboot @swap
+SystemCallFilter=@system-service
+SystemCallErrorNumber=EPERM
SystemCallArchitectures=native
LockPersonality=yes
IPAddressDeny=any

@@ -29,7 +29,8 @@ MemoryDenyWriteExecute=yes
RestrictRealtime=yes
RestrictNamespaces=yes
RestrictAddressFamilies=AF_UNIX
-SystemCallFilter=~@clock @cpu-emulation @debug @keyring @module @mount @obsolete @raw-io @reboot @swap
+SystemCallFilter=@system-service
+SystemCallErrorNumber=EPERM
SystemCallArchitectures=native
LockPersonality=yes
IPAddressDeny=any

@@ -30,7 +30,8 @@ MemoryDenyWriteExecute=yes
RestrictRealtime=yes
RestrictNamespaces=yes
RestrictAddressFamilies=AF_UNIX AF_NETLINK
-SystemCallFilter=~@clock @cpu-emulation @debug @keyring @module @mount @obsolete @raw-io @reboot @swap
+SystemCallFilter=@system-service
+SystemCallErrorNumber=EPERM
SystemCallArchitectures=native
LockPersonality=yes
IPAddressDeny=any

@@ -23,7 +23,8 @@ CapabilityBoundingSet=CAP_KILL CAP_SYS_PTRACE CAP_SYS_ADMIN CAP_SETGID CAP_SYS_C
MemoryDenyWriteExecute=yes
RestrictRealtime=yes
RestrictAddressFamilies=AF_UNIX AF_NETLINK AF_INET AF_INET6
-SystemCallFilter=~@clock @cpu-emulation @debug @keyring @module @obsolete @raw-io @reboot @swap
+SystemCallFilter=@system-service @mount
+SystemCallErrorNumber=EPERM
SystemCallArchitectures=native
LockPersonality=yes
IPAddressDeny=any

@@ -35,7 +35,8 @@ MemoryDenyWriteExecute=yes
RestrictRealtime=yes
RestrictNamespaces=yes
RestrictAddressFamilies=AF_UNIX AF_NETLINK AF_INET AF_INET6 AF_PACKET
-SystemCallFilter=~@clock @cpu-emulation @debug @keyring @module @mount @obsolete @raw-io @reboot @swap
+SystemCallFilter=@system-service
+SystemCallErrorNumber=EPERM
SystemCallArchitectures=native
LockPersonality=yes
RuntimeDirectory=systemd/netif

@@ -38,7 +38,8 @@ MemoryDenyWriteExecute=yes
RestrictRealtime=yes
RestrictNamespaces=yes
RestrictAddressFamilies=AF_UNIX AF_NETLINK AF_INET AF_INET6
-SystemCallFilter=~@clock @cpu-emulation @debug @keyring @module @mount @obsolete @raw-io @reboot @swap
+SystemCallFilter=@system-service
+SystemCallErrorNumber=EPERM
SystemCallArchitectures=native
LockPersonality=yes
RuntimeDirectory=systemd/resolve

@@ -27,7 +27,8 @@ MemoryDenyWriteExecute=yes
RestrictRealtime=yes
RestrictNamespaces=yes
RestrictAddressFamilies=AF_UNIX
-SystemCallFilter=~@cpu-emulation @debug @keyring @module @mount @obsolete @raw-io @reboot @swap
+SystemCallFilter=@system-service @clock
+SystemCallErrorNumber=EPERM
SystemCallArchitectures=native
LockPersonality=yes
IPAddressDeny=any

@@ -38,7 +38,8 @@ RestrictRealtime=yes
RestrictNamespaces=yes
RestrictAddressFamilies=AF_UNIX AF_INET AF_INET6
RuntimeDirectory=systemd/timesync
-SystemCallFilter=~@cpu-emulation @debug @keyring @module @mount @obsolete @raw-io @reboot @swap
+SystemCallFilter=@system-service @clock
+SystemCallErrorNumber=EPERM
SystemCallArchitectures=native
LockPersonality=yes
StateDirectory=systemd/timesync

@@ -29,6 +29,8 @@ PrivateMounts=yes
MemoryDenyWriteExecute=yes
RestrictRealtime=yes
RestrictAddressFamilies=AF_UNIX AF_NETLINK AF_INET AF_INET6
+SystemCallFilter=@system-service @module @raw-io
+SystemCallErrorNumber=EPERM
SystemCallArchitectures=native
LockPersonality=yes
IPAddressDeny=any