Systemd

Commit Graph

Author	SHA1	Message	Date
Yu Watanabe	bcdb3b7d50	core: detect_container() may return negative errno	2020-12-14 19:35:11 +01:00
Yu Watanabe	d85ff94477	core: use SYNTHETIC_ERRNO() macro	2020-11-27 14:35:20 +09:00
Yu Watanabe	db9ecf0501	license: LGPL-2.1+ -> LGPL-2.1-or-later	2020-11-09 13:23:58 +09:00
Yu Watanabe	8ed6f81ba3	core: make log_unit_error() or friends return void	2020-09-09 02:34:38 +09:00
Zbigniew Jędrzejewski-Szmek	90e74a66e6	tree-wide: define iterator inside of the macro	2020-09-08 12:14:05 +02:00
Zbigniew Jędrzejewski-Szmek	37b22b3b47	tree-wide: "the the" and other trivial grammar fixes	2020-07-02 09:51:38 +02:00
Zbigniew Jędrzejewski-Szmek	d02fd8b1c6	core/bpf-firewall: use the correct cleanup function On error, we'd just free the object, and not close the fd. While at it, let's use set_ensure_consume() to make sure we don't leak the object if it was already in the set. I'm not sure if that condition can be achieved.	2020-06-24 10:38:15 +02:00
Zbigniew Jędrzejewski-Szmek	de7fef4b6e	tree-wide: use set_ensure_put() Patch contains a coccinelle script, but it only works in some cases. Many parts were converted by hand. Note: I did not fix errors in return value handling. This will be done separately to keep the patch comprehensible. No functional change is intended in this patch.	2020-06-22 16:32:37 +02:00
Zbigniew Jędrzejewski-Szmek	2899aac46a	core: constify bpf program arrays In cases where the programs were modified after being initially declared, reorder operations so that the declaration is already in final form.	2019-11-10 23:22:14 +01:00
Yu Watanabe	455fa9610c	tree-wide: drop string.h when string-util.h or friends are included	2019-11-04 00:30:32 +09:00
Zbigniew Jędrzejewski-Szmek	5cfa33e0bc	Create src/shared/unit-file.[ch] for unit-file related ops So far we put such functions in install.[ch], but that is tied too closely to enable/disable. Let's start moving things to a place with a better name.	2019-07-19 16:51:14 +02:00
Kai Lüke	fab347489f	bpf-firewall: custom BPF programs through IP(Ingress|Egress)FilterPath= Takes a single /sys/fs/bpf/pinned_prog string as argument, but may be specified multiple times. An empty assignment resets all previous filters. Closes https://github.com/systemd/systemd/issues/10227	2019-06-25 09:56:16 +02:00
Anita Zhang	4c1567f29a	bpf-firewall: optimization for IPAddressXYZ="any" (and unprivileged users) This is a workaround to make IPAddressDeny=any/IPAddressAllow=any work for non-root users that have CAP_NET_ADMIN. "any" was chosen since all-or-nothing network access is one of the most common use cases for isolation. Allocating BPF LPM TRIE maps requires CAP_SYS_ADMIN while BPF_PROG_TYPE_CGROUP_SKB only needs CAP_NET_ADMIN. In the case of IPAddressXYZ="any" we can just consistently return false/true to avoid allocating the map and limit the user to having CAP_NET_ADMIN.	2019-06-22 19:56:06 +02:00
Lennart Poettering	1e59b5455e	bpf: use more TAKE_FD()	2019-06-21 03:28:24 +09:00
Zbigniew Jędrzejewski-Szmek	f140ed02f7	Silence warning about BPF firewall in containers We'd get a warning on every nspawn invocation: dev-hugepages.mount: unit configures an IP firewall, but the local system does not support BPF/cgroup firewalling. (This warning is only shown for the first unit using IP firewalling.) Before the previous commit, I'd generally get a warning about systemd-udev.service, even though that service is not started in containers. But there are still many other units that declare a firewall, which is currently unsupported in containers. Let's stop warning about this. The warning is still emitted e.g. if legacy cgroups are used. This is something that can be configured, so it makes more sense to emit the warning.	2019-06-04 17:22:37 +02:00
Zbigniew Jędrzejewski-Szmek	84d2744bc5	Move warning about unsupported BPF firewall right before the firewall would be created There's no need to warn about the firewall when parsing, because the unit might not be started at all. Let's warn only when we're actually preparing to start the firewall. This changes behaviour: - the warning is printed just once for all unit types, and not once for normal units and once for transient units. - on repeat warnings, the message is not printed at all. There's already detailed debug info from bpf_firewall_compile(), so we don't need to repeat ourselves. - when we are not root, let's say precisely that, not "lack of necessary privileges" and "the local system does not support BPF/cgroup firewalling". Fixes #12673.	2019-06-04 17:22:37 +02:00
Yu Watanabe	01234e1fe7	tree-wide: drop several missing_*.h and import relevant headers from kernel-5.0	2019-04-11 19:00:37 +02:00
Lennart Poettering	0a9707187b	util: split out memcmp()/memset() related calls into memory-util.[ch] Just some source rearranging.	2019-03-13 12:16:43 +01:00
Yu Watanabe	e93672eeac	tree-wide: drop missing.h from headers and use relevant missing_*.h	2018-12-06 13:31:16 +01:00
Lennart Poettering	13711093ef	bpf-firewall: always use log_unit_xyz() instead of log_xyz() That way it's easier to figure out what the various messages belong to	2018-10-09 21:11:41 +02:00
Alexander Filippov	047de7e1b1	core: fix the check if CONFIG_CGROUP_BPF is on Since the commit torvalds/linux@fdb5c4531c the syscall BPF_PROG_ATTACH returns EBADF when CONFIG_CGROUP_BPF is turned off and as a result bpf_firewall_supported() returns the incorrect value. This commit replaces the syscall BPF_PROG_ATTACH with BPF_PROG_DETACH, which still works as expected. Resolves openbmc/linux#159 See also systemd/systemd#7054 Signed-off-by: Alexander Filippov <a.filippov@yadro.com>	2018-09-18 16:19:51 +02:00
Yu Watanabe	f330408d62	tree-wide: drop empty lines in comments	2018-07-23 08:44:24 +02:00
Zbigniew Jędrzejewski-Szmek	d9b02e1697	tree-wide: drop copyright headers from frequent contributors Fixes #9320. for p in Shapovalov Chevalier Rozhkov Sievers Mack Herrmann Schmidt Rudenberg Sahani Landden Andersen Watanabe; do git grep -e 'Copyright.'$p -l|xargs perl -i -0pe 's|/([][])?[]\s+([#]\s+)?Copyright[^\n]'$p'[^\n]\s[]([][])?/\n|\n|gms; s|\s+([#]\s+)?Copyright[^\n]'$p'[^\n]\n|\n|gms' done	2018-06-20 11:58:53 +02:00
Lennart Poettering	96b2fb93c5	tree-wide: beautify remaining copyright statements Let's unify and beautify our remaining copyright statements, with a unicode ©. This means our copyright statements are now always formatted the same way. Yay.	2018-06-14 10:20:21 +02:00
Lennart Poettering	818bf54632	tree-wide: drop 'This file is part of systemd' blurb This part of the copyright blurb stems from the GPL use recommendations: https://www.gnu.org/licenses/gpl-howto.en.html The concept appears to originate in times where version control was per file, instead of per tree, and was a way to glue the files together. Ultimately, we nowadays don't live in that world anymore, and this information is entirely useless anyway, as people are very welcome to copy these files into any projects they like, and they shouldn't have to change bits that are part of our copyright header for that. Hence, let's just get rid of this old cruft, and shorten our codebase a bit.	2018-06-14 10:20:20 +02:00
Zbigniew Jędrzejewski-Szmek	4355f1c9da	Fix three uses of bogus errno value in logs (and returned value in one case)	2018-04-24 14:10:27 +02:00
Zbigniew Jędrzejewski-Szmek	b1c05b98bf	tree-wide: avoid assignment of r just to use in a comparison This changes r = ...; if (r < 0) to if (... < 0) when r will not be used again.	2018-04-24 14:10:27 +02:00
Zbigniew Jędrzejewski-Szmek	11a1589223	tree-wide: drop license boilerplate Files which are installed as-is (any .service and other unit files, .conf files, .policy files, etc), are left as is. My assumption is that SPDX identifiers are not yet that well known, so it's better to retain the extended header to avoid any doubt. I also kept any copyright lines. We can probably remove them, but it'd be nice to obtain explicit acks from all involved authors before doing that.	2018-04-06 18:58:55 +02:00
Yu Watanabe	1cc6c93a95	tree-wide: use TAKE_PTR() and TAKE_FD() macros	2018-04-05 14:26:26 +09:00
Lennart Poettering	5128346127	bpf: reset "extra" IP accounting counters when turning off IP accounting for a unit We maintain an "extra" set of IP accounting counters that are used when systemd is reloaded to carry over the counters from the previous run. Let's reset these to zero whenever IP accounting is turned off. If we don't do this then turning off IP accounting and back on later wouldn't reset the counters, which is quite surprising and different from how our CPU time counting works.	2018-02-21 16:43:36 +01:00
Lennart Poettering	aa2b6f1d2b	bpf: rework how we keep track and attach cgroup bpf programs So, the kernel's management of cgroup/BPF programs is a bit misdesigned: if you attach a BPF program to a cgroup and close the fd for it, it will stay pinned to the cgroup with no chance of ever removing it again (or otherwise getting ahold of it again), because the fd is used for selecting which BPF program to detach. The only way to get rid of the program again is to destroy the cgroup itself. This is particularly bad for the root cgroup (and in fact any other cgroup that we cannot realistically remove during runtime, such as /system.slice, /init.scope or /system.slice/dbus.service) as getting rid of the program only works by rebooting the system. To counter this let's closely keep track to which cgroup a BPF program is attached and let's implicitly detach the BPF program when we are about to close the BPF fd. This hence changes the bpf_program_cgroup_attach() function to track where we attached the program and changes bpf_program_cgroup_detach() to use this information. Moreover bpf_program_unref() will now implicitly call bpf_program_cgroup_detach(). In order to simplify things, bpf_program_cgroup_attach() will now implicitly invoke bpf_program_load_kernel() when necessary, simplifying the caller's side. Finally, this adds proper reference counting to BPF programs. This is useful for working with two BPF programs in parallel: the BPF program we are preparing for installation and the BPF program we so far installed, shortening the window when we detach the old one and reattach the new one.	2018-02-21 16:43:36 +01:00
Lennart Poettering	acf7f253de	bpf: use BPF_F_ALLOW_MULTI flag if it is available This new kernel 4.15 flag permits that multiple BPF programs can be executed for each packet processed: multiple per cgroup plus all programs defined up the tree on all parent cgroups. We can use this for two features: 1. Finally provide per-slice IP accounting (which was previously unavailable) 2. Permit delegation of BPF programs to services (i.e. leaf nodes). This patch beefs up PID1's handling of BPF to enable both. Note two special items to keep in mind: a. Our inner-node BPF programs (i.e. the ones we attach to slices) do not enforce IP access lists, that's done exclusively in the leaf-node BPF programs. That's a good thing, since that way rules in leaf nodes can cancel out rules further up (i.e. for example to implement a logic of "disallow everything except httpd.service"). Inner node BPF programs do accounting however if that's requested. This is beneficial for performance reasons: it means in order to provide per-slice IP accounting we don't have to add up all child unit's data. b. When this code is run on pre-4.15 kernel (i.e. where BPF_F_ALLOW_MULTI is not available) we'll make IP accounting on slice units unavailable (i.e. revert to behaviour from before this commit). For leaf nodes we'll fallback to non-ALLOW_MULTI mode however, which means that BPF delegation is not available there at all, if IP fw/acct is turned on for the unit. This is a change from earlier behaviour, where we use the BPF_F_ALLOW_OVERRIDE flag, so that our fw/acct would lose its effect as soon as delegation was turned on and some client made use of that. I think the new behaviour is the safer choice in this case, as silent bypassing of our fw rules is not possible anymore. And if people want proper delegation then the way out is a more modern kernel or turning off IP firewalling/acct for the unit altogether.	2018-02-21 16:43:36 +01:00
Lennart Poettering	9b3c189786	bpf-program: optionally take fd of program to detach This is useful for BPF_F_ALLOW_MULTI programs, where the kernel requires us to specify the fd.	2018-02-21 16:43:36 +01:00
Lennart Poettering	2ae7ee58fa	bpf: beef up bpf detection, check if BPF_F_ALLOW_MULTI is supported This improves the BPF/cgroup detection logic, and looks whether BPF_F_ALLOW_MULTI is supported. This flag allows execution of multiple BPF filters in a recursive fashion for a whole cgroup tree. It enables us to properly report IP accounting for slice units, as well as delegation of BPF support to units without breaking our own IP accounting.	2018-02-21 16:43:36 +01:00
Lennart Poettering	418cdd69d1	bpf-firewall: fix warning text I figure saying "systemd" here was a typo, and it should have been "system". (Yes, it becomes very hard after a while typing "system" correctly if you type "systemd" so often.) That said, "systemd" in some ways is actually more correct, since BPF might be available for the system instance but not in the user instance. Either way, talking of "this systemd" is weird, let's reword this to be "this manager", to emphasize that it's the local instance of systemd where BPF is not available, but that it might be available otherwise.	2018-02-12 11:34:00 +01:00
Lennart Poettering	1d9cc8768f	cgroup: add a new "can_delegate" flag to the unit vtable, and set it for scope and service units only Currently we allowed delegation for all units with cgroup backing except for slices. Let's make this a bit more strict for now, and only allow this in service and scope units. Let's also add a generic accessor unit_cgroup_delegate() for checking whether a unit has delegation turned on that checks the new bool first. Also, when doing transient units, let's explicitly refuse turning on delegation for unit types that don't support it. This is mostly cosmetical as we wouldn't act on the delegation request anyway, but certainly helpful for debugging.	2018-02-12 11:34:00 +01:00
Lennart Poettering	e583759bd1	bpf-firewall: actually invoke BPF_PROG_ATTACH to check whether cgroup/bpf is available Apparently that's the only way to really know whether the kernel has CONFIG_CGROUP_BPF turned on. Fixes: #7054	2017-11-29 20:15:23 +01:00
Zbigniew Jędrzejewski-Szmek	53e1b68390	Add SPDX license identifiers to source files under the LGPL This follows what the kernel is doing, c.f. https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5fd54ace4721fc5ce2bb5aef6318fcf17f421460.	2017-11-19 19:08:15 +01:00
Lennart Poettering	93e93da5cc	bpf-firewall: properly handle kernels where BPF cgroup is disabled but TRIE maps are enabled (#7298) So far, we assumed that kernels where TRIE was on also supported BPF/cgroup stuff. That's not a correct assumption to make, hence check for both features separately. Fixes: #7054	2017-11-13 10:56:43 +01:00
Lennart Poettering	9f2e6892a2	bpf: set BPF_F_ALLOW_OVERRIDE when attaching a cgroup program if Delegate=yes is set Let's permit installing BPF programs in cgroup subtrees if Delegate=yes. Let's not document this precise behaviour for now though, as most likely the logic here should become recursive, but that's only going to happen if the kernel starts supporting that. Until then, support this in a non-recursive fashion.	2017-09-22 15:28:05 +02:00
Daniel Mack	1988a9d120	Add firewall eBPF compiler	2017-09-22 15:24:55 +02:00

41 Commits