Systemd/src
Florian Westphal 715a70e721 firewall-util: add nftables backend
The idea is to use a static ruleset, added when the first attempt to
add a masquerade or dnat rule is made.

The alternative would be to add the ruleset when the init function is called.
The disadvantage is that this enables connection tracking and NAT in the kernel
(as the ruleset needs this to work), which comes with some overhead that might
not be needed (no nspawn usage and no IPMasquerade option set).

There is no additional dependency on the 'nft' userspace binary or other libraries.
sd-netlink's nfnetlink backend is used to modify the nftables ruleset.

The commit message/comments still use nft syntax since that is what
users will see when they use the nft tool to list the ruleset.

The added initial skeleton (added on first fw_add_masquerade/local_dnat
call) looks like this:

table ip io.systemd.nat {
        set masq_saddr {
                type ipv4_addr
                flags interval
                elements = { 192.168.59.160/28 }
        }

        map map_port_ipport {
                type inet_proto . inet_service : ipv4_addr . inet_service
                elements = { tcp . 2222 : 192.168.59.169 . 22 }
        }

        chain prerouting {
                type nat hook prerouting priority dstnat + 1; policy accept;
                fib daddr type local dnat ip addr . port to meta l4proto . th dport map @map_port_ipport
        }

        chain output {
                type nat hook output priority -99; policy accept;
                ip daddr != 127.0.0.0/8 oif "lo" dnat ip addr . port to meta l4proto . th dport map @map_port_ipport
        }

        chain postrouting {
                type nat hook postrouting priority srcnat + 1; policy accept;
                ip saddr @masq_saddr masquerade
        }
}

Next calls to fw_add_masquerade/add_local_dnat will then only add/delete the
element/mapping to masq_saddr and map_port_ipport, i.e. the ruleset doesn't
change -- only the set/map content does.
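
The "install the skeleton once, then only touch set/map elements" behaviour
described above can be sketched roughly as follows. This is a minimal
illustration, not the code from this commit: struct sketch_fw_ctx and the
sketch_* functions are hypothetical names, and the counters stand in for the
netlink messages a real backend would send.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the backend state; not the systemd structure. */
struct sketch_fw_ctx {
    bool skeleton_installed;     /* has the static table been added yet? */
    unsigned skeleton_installs;  /* how often the table/chains/rules were sent */
    unsigned element_ops;        /* how many element add/delete requests were sent */
    int elements;                /* elements currently in the (pretend) set */
};

/* Pretend to send the static ruleset (table, chains, set, map, rules). */
static void sketch_install_skeleton(struct sketch_fw_ctx *ctx) {
    ctx->skeleton_installed = true;
    ctx->skeleton_installs++;
}

/* Add or delete one masquerade element. The skeleton is installed lazily,
 * on the first call only; later calls change set content, not the ruleset. */
static int sketch_add_masquerade(struct sketch_fw_ctx *ctx, bool add) {
    assert(ctx != NULL);

    if (!ctx->skeleton_installed)
        sketch_install_skeleton(ctx);

    ctx->element_ops++;
    ctx->elements += add ? 1 : -1;
    return 0;
}
```

With a zero-initialized context, two adds followed by one delete leave
skeleton_installs at 1 and element_ops at 3, which matches the shape of the
'nft monitor' trace: one burst of table/chain/rule additions, then element
changes only.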

Running test-firewall-util with this backend gives the following output
in a parallel 'nft monitor':

$ nft monitor
add table ip io.systemd.nat
add chain ip io.systemd.nat prerouting { type nat hook prerouting priority dstnat + 1; policy accept; }
add chain ip io.systemd.nat output { type nat hook output priority -99; policy accept; }
add chain ip io.systemd.nat postrouting { type nat hook postrouting priority srcnat + 1; policy accept; }
add set ip io.systemd.nat masq_saddr { type ipv4_addr; flags interval; }
add map ip io.systemd.nat map_port_ipport { type inet_proto . inet_service : ipv4_addr . inet_service; }
add rule ip io.systemd.nat prerouting fib daddr type local dnat ip addr . port to meta l4proto . th dport map @map_port_ipport
add rule ip io.systemd.nat output ip daddr != 127.0.0.0/8 fib daddr type local dnat ip addr . port to meta l4proto . th dport map @map_port_ipport
add rule ip io.systemd.nat postrouting ip saddr @masq_saddr masquerade
add element ip io.systemd.nat masq_saddr { 10.1.2.3 }
add element ip io.systemd.nat masq_saddr { 10.0.2.0/28 }
delete element ip io.systemd.nat masq_saddr { 10.0.2.0/28 }
delete element ip io.systemd.nat masq_saddr { 10.1.2.3 }
add element ip io.systemd.nat map_port_ipport { tcp . 4711 : 1.2.3.4 . 815 }
delete element ip io.systemd.nat map_port_ipport { tcp . 4711 : 1.2.3.4 . 815 }
add element ip io.systemd.nat map_port_ipport { tcp . 4711 : 1.2.3.5 . 815 }
delete element ip io.systemd.nat map_port_ipport { tcp . 4711 : 1.2.3.5 . 815 }
CTRL-C

Things not implemented/supported:
1. Change monitoring.  The kernel allows userspace to learn about changes
   made by other clients (using nfnetlink notifications). It would be
   possible to detect when e.g. someone removes the systemd nat table.
   This would need more work.  It's also not clear how to react to
   external changes -- it doesn't seem like a good idea to just auto-undo
   everything.
2. 'set masq_saddr' doesn't handle overlaps.
   Example:

   fw_add_masquerade(true, AF_INET, "10.0.0.0" , 16);
   fw_add_masquerade(true, AF_INET, "10.0.0.0" , 8); /* fails */

With the iptables backend the second call works, as it adds an
independent iptables rule.

With the nftables backend, the range 10.0.0.0-10.255.255.255 clashes with
the existing range of 10.0.0.0-10.0.255.255, so the second add gets rejected
by the kernel.
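
To make the clash concrete, here is a small sketch of how a prefix maps to an
address range and how two ranges can be tested for overlap. The names
(struct ipv4_range, prefix_to_range, ranges_overlap) are made up for this
illustration and addresses are taken in host byte order; this is not how the
kernel's interval set is implemented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Closed interval of IPv4 addresses in host byte order (illustrative type). */
struct ipv4_range {
    uint32_t first;
    uint32_t last;
};

/* Expand address/prefixlen into the first and last address it covers. */
static struct ipv4_range prefix_to_range(uint32_t addr, unsigned prefixlen) {
    assert(prefixlen <= 32);

    /* Shifting a 32-bit value by 32 is undefined, so handle /0 separately. */
    uint32_t mask = prefixlen == 0 ? 0 : (uint32_t) (UINT32_MAX << (32 - prefixlen));
    struct ipv4_range r = { addr & mask, (addr & mask) | (uint32_t) ~mask };
    return r;
}

/* Two closed intervals overlap if each one starts before the other ends. */
static bool ranges_overlap(struct ipv4_range a, struct ipv4_range b) {
    return a.first <= b.last && b.first <= a.last;
}
```

For 10.0.0.0/16 this yields 10.0.0.0-10.0.255.255 and for 10.0.0.0/8 it
yields 10.0.0.0-10.255.255.255; ranges_overlap() reports true for that pair,
which is the situation in which the second add is rejected.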

This will generate an error message from networkd ("Could not enable IP
masquerading: File exists").

To resolve this, it would be necessary to keep track of the added elements
and perform range merging when overlaps are detected.
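
A rough sketch of what such tracking could look like in userspace, assuming
a small fixed-size list of disjoint ranges. Nothing here is part of this
commit; struct range_set, range_set_add and SKETCH_MAX_RANGES are
hypothetical, and a real implementation would also have to delete the
absorbed elements from the kernel set before adding the merged one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_RANGES 16

/* Closed interval of IPv4 addresses in host byte order (illustrative type). */
struct ipv4_range {
    uint32_t first;
    uint32_t last;
};

/* Unordered list of pairwise disjoint ranges. */
struct range_set {
    struct ipv4_range items[SKETCH_MAX_RANGES];
    size_t n;
};

/* Add r to the set, absorbing every stored range that overlaps it.
 * Returns 0 on success, -1 if the set is full (set is left unchanged). */
static int range_set_add(struct range_set *s, struct ipv4_range r) {
    assert(s != NULL);
    assert(r.first <= r.last);

    size_t i = 0;
    while (i < s->n) {
        struct ipv4_range *cur = &s->items[i];

        if (cur->first <= r.last && r.first <= cur->last) {
            /* Overlap: grow r to cover cur, then drop cur from the list. */
            if (cur->first < r.first)
                r.first = cur->first;
            if (cur->last > r.last)
                r.last = cur->last;
            s->items[i] = s->items[--s->n];
            /* Do not advance: slot i now holds an unchecked element. */
        } else
            i++;
    }

    /* Only reachable with a full list if nothing was absorbed above. */
    if (s->n == SKETCH_MAX_RANGES)
        return -1;

    s->items[s->n++] = r;
    return 0;
}
```

Because the stored ranges are kept disjoint, a single pass is enough: a range
that overlaps neither the new range nor an absorbed one cannot overlap their
union. Adding 10.0.0.0-10.0.255.255 and then 10.0.0.0-10.255.255.255 leaves
one entry covering the larger range instead of failing.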

However, the add requests are done using the configured network on a
device, so no overlaps should occur in normal setups.

IPv6 support is added in an extra changeset.

Fixes: #13307
2020-12-16 01:07:08 +01:00
..
ac-power license: LGPL-2.1+ -> LGPL-2.1-or-later 2020-11-09 13:23:58 +09:00
activate license: LGPL-2.1+ -> LGPL-2.1-or-later 2020-11-09 13:23:58 +09:00
analyze license: LGPL-2.1+ -> LGPL-2.1-or-later 2020-11-09 13:23:58 +09:00
ask-password license: LGPL-2.1+ -> LGPL-2.1-or-later 2020-11-09 13:23:58 +09:00
backlight backlight: add several debug logs and adjust log level of non critical error 2020-11-23 14:35:14 +01:00
basic linux: import nf_tables and nfnetlink headers from Linux 5.8 2020-12-16 00:35:56 +01:00
binfmt license: LGPL-2.1+ -> LGPL-2.1-or-later 2020-11-09 13:23:58 +09:00
boot efi: Only use arm flags if supported 2020-12-15 12:06:52 +09:00
busctl Merge pull request #17967 from poettering/connect-user-bus 2020-12-15 21:14:01 +01:00
cgls license: LGPL-2.1+ -> LGPL-2.1-or-later 2020-11-09 13:23:58 +09:00
cgroups-agent license: LGPL-2.1+ -> LGPL-2.1-or-later 2020-11-09 13:23:58 +09:00
cgtop license: LGPL-2.1+ -> LGPL-2.1-or-later 2020-11-09 13:23:58 +09:00
core hostname-setup: clarify that failures reading /etc/hostname are ignored 2020-12-15 17:59:58 +01:00
coredump meson: add option to skip installing to $sysconfdir 2020-11-12 11:21:46 +01:00
cryptsetup cryptsetup: give command line parameters proper names 2020-12-04 17:49:34 +01:00
debug-generator license: LGPL-2.1+ -> LGPL-2.1-or-later 2020-11-09 13:23:58 +09:00
delta license: LGPL-2.1+ -> LGPL-2.1-or-later 2020-11-09 13:23:58 +09:00
detect-virt license: LGPL-2.1+ -> LGPL-2.1-or-later 2020-11-09 13:23:58 +09:00
dissect license: LGPL-2.1+ -> LGPL-2.1-or-later 2020-11-09 13:23:58 +09:00
environment-d-generator license: LGPL-2.1+ -> LGPL-2.1-or-later 2020-11-09 13:23:58 +09:00
escape license: LGPL-2.1+ -> LGPL-2.1-or-later 2020-11-09 13:23:58 +09:00
firstboot firstboot: clean-up the copied hostname, not argv[] directly, as that's ugly 2020-12-15 18:00:02 +01:00
fsck license: LGPL-2.1+ -> LGPL-2.1-or-later 2020-11-09 13:23:58 +09:00
fstab-generator license: LGPL-2.1+ -> LGPL-2.1-or-later 2020-11-09 13:23:58 +09:00
fuzz Merge pull request #17478 from yuwata/split-network-internal 2020-11-27 09:04:19 +09:00
getty-generator license: LGPL-2.1+ -> LGPL-2.1-or-later 2020-11-09 13:23:58 +09:00
gpt-auto-generator license: LGPL-2.1+ -> LGPL-2.1-or-later 2020-11-09 13:23:58 +09:00
hibernate-resume license: LGPL-2.1+ -> LGPL-2.1-or-later 2020-11-09 13:23:58 +09:00
home tree-wide: fix typo 2020-12-14 12:05:55 +00:00
hostname hostname-util: flagsify hostname_is_valid(), drop machine_name_is_valid() 2020-12-15 17:59:48 +01:00
hwdb license: LGPL-2.1+ -> LGPL-2.1-or-later 2020-11-09 13:23:58 +09:00
id128 license: LGPL-2.1+ -> LGPL-2.1-or-later 2020-11-09 13:23:58 +09:00
import hostname-util: flagsify hostname_is_valid(), drop machine_name_is_valid() 2020-12-15 17:59:48 +01:00
initctl license: LGPL-2.1+ -> LGPL-2.1-or-later 2020-11-09 13:23:58 +09:00
journal Merge pull request #17967 from poettering/connect-user-bus 2020-12-15 21:14:01 +01:00
journal-remote fileio: teach read_full_file_full() to read from offset/with maximum size 2020-12-01 14:17:47 +01:00
kernel-install kernel-install: drop redundant "/" 2020-11-24 11:18:40 +01:00
libsystemd sd-netlink: add a read function 2020-12-16 00:35:56 +01:00
libsystemd-network Merge pull request #17967 from poettering/connect-user-bus 2020-12-15 21:14:01 +01:00
libudev license: LGPL-2.1+ -> LGPL-2.1-or-later 2020-11-09 13:23:58 +09:00
locale license: LGPL-2.1+ -> LGPL-2.1-or-later 2020-11-09 13:23:58 +09:00
login logind: use free_and_strdup() where appropriate 2020-12-09 10:09:11 +01:00
machine hostname-util: flagsify hostname_is_valid(), drop machine_name_is_valid() 2020-12-15 17:59:48 +01:00
machine-id-setup license: LGPL-2.1+ -> LGPL-2.1-or-later 2020-11-09 13:23:58 +09:00
modules-load license: LGPL-2.1+ -> LGPL-2.1-or-later 2020-11-09 13:23:58 +09:00
mount license: LGPL-2.1+ -> LGPL-2.1-or-later 2020-11-09 13:23:58 +09:00
network firewall-util: introduce context structure 2020-12-16 00:35:56 +01:00
notify license: LGPL-2.1+ -> LGPL-2.1-or-later 2020-11-09 13:23:58 +09:00
nspawn firewall-util: add nftables backend 2020-12-16 01:07:08 +01:00
nss-myhostname license: LGPL-2.1+ -> LGPL-2.1-or-later 2020-11-09 13:23:58 +09:00
nss-mymachines nss-mymachines: initialize logging 2020-12-10 20:52:03 +01:00
nss-resolve nss-resolve: initialize logging, log json errors 2020-12-10 20:50:01 +01:00
nss-systemd nss-systemd: initialize logging 2020-12-10 20:51:59 +01:00
oom test: use for(;;) instead of while(true) 2020-11-27 14:35:20 +09:00
partition fileio: teach read_full_file_full() to read from offset/with maximum size 2020-12-01 14:17:47 +01:00
path license: LGPL-2.1+ -> LGPL-2.1-or-later 2020-11-09 13:23:58 +09:00
portable license: LGPL-2.1+ -> LGPL-2.1-or-later 2020-11-09 13:23:58 +09:00
pstore pstore: use log_oom() 2020-11-20 02:59:02 +09:00
quotacheck license: LGPL-2.1+ -> LGPL-2.1-or-later 2020-11-09 13:23:58 +09:00
random-seed license: LGPL-2.1+ -> LGPL-2.1-or-later 2020-11-09 13:23:58 +09:00
rc-local-generator license: LGPL-2.1+ -> LGPL-2.1-or-later 2020-11-09 13:23:58 +09:00
remount-fs license: LGPL-2.1+ -> LGPL-2.1-or-later 2020-11-09 13:23:58 +09:00
reply-password license: LGPL-2.1+ -> LGPL-2.1-or-later 2020-11-09 13:23:58 +09:00
resolve resolved: create stub-resolv.conf symlink with correct security label 2020-12-15 14:12:08 +01:00
rfkill license: LGPL-2.1+ -> LGPL-2.1-or-later 2020-11-09 13:23:58 +09:00
run license: LGPL-2.1+ -> LGPL-2.1-or-later 2020-11-09 13:23:58 +09:00
run-generator license: LGPL-2.1+ -> LGPL-2.1-or-later 2020-11-09 13:23:58 +09:00
shared firewall-util: add nftables backend 2020-12-16 01:07:08 +01:00
shutdown tree-wide: update "that that" 2020-11-18 17:23:00 +09:00
sleep license: LGPL-2.1+ -> LGPL-2.1-or-later 2020-11-09 13:23:58 +09:00
socket-proxy license: LGPL-2.1+ -> LGPL-2.1-or-later 2020-11-09 13:23:58 +09:00
stdio-bridge stdio-bridge: add support for --system and --user 2020-12-15 18:01:20 +01:00
sulogin-shell license: LGPL-2.1+ -> LGPL-2.1-or-later 2020-11-09 13:23:58 +09:00
sysctl license: LGPL-2.1+ -> LGPL-2.1-or-later 2020-11-09 13:23:58 +09:00
system-update-generator license: LGPL-2.1+ -> LGPL-2.1-or-later 2020-11-09 13:23:58 +09:00
systemctl sd-bus: add API for connecting to a specific user's user bus of a specific container 2020-12-15 18:00:15 +01:00
systemd sd-netlink: add a read function 2020-12-16 00:35:56 +01:00
sysusers specifiers: introduce common macros for generating specifier tables 2020-11-25 14:13:02 +09:00
sysv-generator license: LGPL-2.1+ -> LGPL-2.1-or-later 2020-11-09 13:23:58 +09:00
test firewall-util: add nftables backend 2020-12-16 01:07:08 +01:00
time-wait-sync license: LGPL-2.1+ -> LGPL-2.1-or-later 2020-11-09 13:23:58 +09:00
timedate timedate: use localtime_or_gmtime_r() and mktime_or_timegm() 2020-11-21 04:28:45 +09:00
timesync timesync: Make delaying attempts to contact servers configurable 2020-12-15 08:52:51 +09:00
tmpfiles tree-wide: sort specifiers and move common comments to specifier.h 2020-11-25 15:12:27 +09:00
tty-ask-password-agent license: LGPL-2.1+ -> LGPL-2.1-or-later 2020-11-09 13:23:58 +09:00
udev tree-wide: fix typo 2020-12-14 12:05:55 +00:00
update-done license: LGPL-2.1+ -> LGPL-2.1-or-later 2020-11-09 13:23:58 +09:00
update-utmp tree-wide: use return value of log_xxx_errno() 2020-11-20 02:58:27 +09:00
user-sessions license: LGPL-2.1+ -> LGPL-2.1-or-later 2020-11-09 13:23:58 +09:00
userdb tree-wide: unsetenv cannot fail 2020-11-10 15:52:32 +01:00
vconsole license: LGPL-2.1+ -> LGPL-2.1-or-later 2020-11-09 13:23:58 +09:00
veritysetup veritysetup: also place udev socket dep 2020-12-10 20:44:12 +01:00
version
volatile-root license: LGPL-2.1+ -> LGPL-2.1-or-later 2020-11-09 13:23:58 +09:00
xdg-autostart-generator license: LGPL-2.1+ -> LGPL-2.1-or-later 2020-11-09 13:23:58 +09:00