Commit Graph

19 Commits

Author SHA1 Message Date
Zbigniew Jędrzejewski-Szmek d9b02e1697 tree-wide: drop copyright headers from frequent contributors
Fixes #9320.

for p in Shapovalov Chevalier Rozhkov Sievers Mack Herrmann Schmidt Rudenberg Sahani Landden Andersen Watanabe; do
  git grep -e 'Copyright.*'$p -l|xargs perl -i -0pe 's|/([*][*])?[*]\s+([*#]\s+)?Copyright[^\n]*'$p'[^\n]*\s*[*]([*][*])?/\n*|\n|gms; s|\s+([*#]\s+)?Copyright[^\n]*'$p'[^\n]*\n*|\n|gms'
done
2018-06-20 11:58:53 +02:00
Lennart Poettering 96b2fb93c5 tree-wide: beautify remaining copyright statements
Let's unify an beautify our remaining copyright statements, with a
unicode ©. This means our copyright statements are now always formatted
the same way. Yay.
2018-06-14 10:20:21 +02:00
Lennart Poettering 818bf54632 tree-wide: drop 'This file is part of systemd' blurb
This part of the copyright blurb stems from the GPL use recommendations:

https://www.gnu.org/licenses/gpl-howto.en.html

The concept appears to originate in times where version control was per
file, instead of per tree, and was a way to glue the files together.
Ultimately, we nowadays don't live in that world anymore, and this
information is entirely useless anyway, as people are very welcome to
copy these files into any projects they like, and they shouldn't have to
change bits that are part of our copyright header for that.

hence, let's just get rid of this old cruft, and shorten our codebase a
bit.
2018-06-14 10:20:20 +02:00
Zbigniew Jędrzejewski-Szmek 4355f1c9da Fix three uses of bogus errno value in logs (and returned value in one case) 2018-04-24 14:10:27 +02:00
Zbigniew Jędrzejewski-Szmek b1c05b98bf tree-wide: avoid assignment of r just to use in a comparison
This changes
  r = ...;
  if (r < 0)
to
  if (... < 0)
when r will not be used again.
2018-04-24 14:10:27 +02:00
Zbigniew Jędrzejewski-Szmek 11a1589223 tree-wide: drop license boilerplate
Files which are installed as-is (any .service and other unit files, .conf
files, .policy files, etc), are left as is. My assumption is that SPDX
identifiers are not yet that well known, so it's better to retain the
extended header to avoid any doubt.

I also kept any copyright lines. We can probably remove them, but it'd nice to
obtain explicit acks from all involved authors before doing that.
2018-04-06 18:58:55 +02:00
Yu Watanabe 1cc6c93a95 tree-wide: use TAKE_PTR() and TAKE_FD() macros 2018-04-05 14:26:26 +09:00
Lennart Poettering 5128346127 bpf: reset "extra" IP accounting counters when turning off IP accounting for a unit
We maintain an "extra" set of IP accounting counters that are used when
we systemd is reloaded to carry over the counters from the previous run.
Let's reset these to zero whenever IP accounting is turned off. If we
don't do this then turning off IP accounting and back on later wouldn't
reset the counters, which is quite surprising and different from how our
CPU time counting works.
2018-02-21 16:43:36 +01:00
Lennart Poettering aa2b6f1d2b bpf: rework how we keep track and attach cgroup bpf programs
So, the kernel's management of cgroup/BPF programs is a bit misdesigned:
if you attach a BPF program to a cgroup and close the fd for it it will
stay pinned to the cgroup with no chance of ever removing it again (or
otherwise getting ahold of it again), because the fd is used for
selecting which BPF program to detach. The only way to get rid of the
program again is to destroy the cgroup itself.

This is particularly bad for root the cgroup (and in fact any other
cgroup that we cannot realistically remove during runtime, such as
/system.slice, /init.scope or /system.slice/dbus.service) as getting rid
of the program only works by rebooting the system.

To counter this let's closely keep track to which cgroup a BPF program
is attached and let's implicitly detach the BPF program when we are
about to close the BPF fd.

This hence changes the bpf_program_cgroup_attach() function to track
where we attached the program and changes bpf_program_cgroup_detach() to
use this information. Moreover bpf_program_unref() will now implicitly
call bpf_program_cgroup_detach().

In order to simplify things, bpf_program_cgroup_attach() will now
implicitly invoke bpf_program_load_kernel() when necessary, simplifying
the caller's side.

Finally, this adds proper reference counting to BPF programs. This
is useful for working with two BPF programs in parallel: the BPF program
we are preparing for installation and the BPF program we so far
installed, shortening the window when we detach the old one and reattach
the new one.
2018-02-21 16:43:36 +01:00
Lennart Poettering acf7f253de bpf: use BPF_F_ALLOW_MULTI flag if it is available
This new kernel 4.15 flag permits that multiple BPF programs can be
executed for each packet processed: multiple per cgroup plus all
programs defined up the tree on all parent cgroups.

We can use this for two features:

1. Finally provide per-slice IP accounting (which was previously
   unavailable)

2. Permit delegation of BPF programs to services (i.e. leaf nodes).

This patch beefs up PID1's handling of BPF to enable both.

Note two special items to keep in mind:

a. Our inner-node BPF programs (i.e. the ones we attach to slices) do
   not enforce IP access lists, that's done exclsuively in the leaf-node
   BPF programs. That's a good thing, since that way rules in leaf nodes
   can cancel out rules further up (i.e. for example to implement a
   logic of "disallow everything except httpd.service"). Inner node BPF
   programs to accounting however if that's requested. This is
   beneficial for performance reasons: it means in order to provide
   per-slice IP accounting we don't have to add up all child unit's
   data.

b. When this code is run on pre-4.15 kernel (i.e. where
   BPF_F_ALLOW_MULTI is not available) we'll make IP acocunting on slice
   units unavailable (i.e. revert to behaviour from before this commit).
   For leaf nodes we'll fallback to non-ALLOW_MULTI mode however, which
   means that BPF delegation is not available there at all, if IP
   fw/acct is turned on for the unit. This is a change from earlier
   behaviour, where we use the BPF_F_ALLOW_OVERRIDE flag, so that our
   fw/acct would lose its effect as soon as delegation was turned on and
   some client made use of that. I think the new behaviour is the safer
   choice in this case, as silent bypassing of our fw rules is not
   possible anymore. And if people want proper delegation then the way
   out is a more modern kernel or turning off IP firewalling/acct for
   the unit algother.
2018-02-21 16:43:36 +01:00
Lennart Poettering 9b3c189786 bpf-program: optionally take fd of program to detach
This is useful for BPF_F_ALLOW_MULTI programs, where the kernel requires
us to specify the fd.
2018-02-21 16:43:36 +01:00
Lennart Poettering 2ae7ee58fa bpf: beef up bpf detection, check if BPF_F_ALLOW_MULTI is supported
This improves the BPF/cgroup detection logic, and looks whether
BPF_ALLOW_MULTI is supported. This flag allows execution of multiple
BPF filters in a recursive fashion for a whole cgroup tree. It enables
us to properly report IP accounting for slice units, as well as
delegation of BPF support to units without breaking our own IP
accounting.
2018-02-21 16:43:36 +01:00
Lennart Poettering 418cdd69d1 bpf-firewall: fix warning text
I figure saying "systemd" here was a typo, and it should have been
"system". (Yes, it becomes very hard after a while typing "system"
correctly if you type "systemd" so often.) That said, "systemd" in some
ways is actually more correct, since BPF might be available for the
system instance but not in the user instance.

Either way, talking of "this systemd" is weird, let's reword this to be
"this manager", to emphasize that it's the local instance of systemd
where BPF is not available, but that it might be available otherwise.
2018-02-12 11:34:00 +01:00
Lennart Poettering 1d9cc8768f cgroup: add a new "can_delegate" flag to the unit vtable, and set it for scope and service units only
Currently we allowed delegation for alluntis with cgroup backing
except for slices. Let's make this a bit more strict for now, and only
allow this in service and scope units.

Let's also add a generic accessor unit_cgroup_delegate() for checking
whether a unit has delegation turned on that checks the new bool first.

Also, when doing transient units, let's explcitly refuse turning on
delegation for unit types that don#t support it. This is mostly
cosmetical as we wouldn't act on the delegation request anyway, but
certainly helpful for debugging.
2018-02-12 11:34:00 +01:00
Lennart Poettering e583759bd1 bpf-firewall: actually invoke BPF_PROG_ATTACH to check whether cgroup/bpf is available
Apparently that's the only way to really know whether the kernel has
CONFIG_CGROUP_BPF turned on.

Fixes: #7054
2017-11-29 20:15:23 +01:00
Zbigniew Jędrzejewski-Szmek 53e1b68390 Add SPDX license identifiers to source files under the LGPL
This follows what the kernel is doing, c.f.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5fd54ace4721fc5ce2bb5aef6318fcf17f421460.
2017-11-19 19:08:15 +01:00
Lennart Poettering 93e93da5cc
bpf-firewall: properly handle kernels where BPF cgroup is disabled but TRIE maps are enabled (#7298)
So far, we assumed that kernels where TRIE was on also supported
BPF/cgroup stuff. That's not a correct assumption to make, hence check
for both features separately.

Fixes: #7054
2017-11-13 10:56:43 +01:00
Lennart Poettering 9f2e6892a2 bpf: set BPF_F_ALLOW_OVERRIDE when attaching a cgroup program if Delegate=yes is set
Let's permit installing BPF programs in cgroup subtrees if
Delegeate=yes. Let's not document this precise behaviour for now though,
as most likely the logic here should become recursive, but that's only
going to happen if the kernel starts supporting that. Until then,
support this in a non-recursive fashion.
2017-09-22 15:28:05 +02:00
Daniel Mack 1988a9d120 Add firewall eBPF compiler 2017-09-22 15:24:55 +02:00