Both SO_SNDBUFFORCE and SO_RCVBUFFORCE requires capability 'net_admin'.
If this capability is not granted to the service the first attempt to increase
the recv/snd buffers (via sd_notify()) with SO_RCVBUFFORCE/SO_SNDBUFFORCE will
fail, even if the requested size is lower than the limit enforced by the
kernel.
If apparmor is used, the DENIED logs for net_admin will show up. These log
entries are seen as red warning light, because they could indicate that a
program has been hacked and tries to compromise the system.
It would be nicer if they can be avoided without giving services (relying on
sd_notify) net_admin capability or dropping DENIED logs for all such services
via their apparmor profile.
I'm not sure if sd_notify really needs to forcibly increase the buffer sizes,
but at least if the requested size is below the kernel limit, the capability
(hence the log entries) should be avoided.
Hence let's first ask politely for increasing the buffers and only if it fails
then ignore the kernel limit if we have sufficient privileges.
These take a struct iovec to send data together with the passed FD.
The receive function returns the FD through an output argument. In case data is
received, but no FD is passed, the receive function will set the output
argument to -1 explicitly.
Update code in dynamic-user to use the new helpers.
These lines are generally out-of-date, incomplete and unnecessary. With
SPDX and git repository much more accurate and fine grained information
about licensing and authorship is available, hence let's drop the
per-file copyright notice. Of course, removing copyright lines of others
is problematic, hence this commit only removes my own lines and leaves
all others untouched. It might be nicer if sooner or later those could
go away too, making git the only and accurate source of authorship
information.
This part of the copyright blurb stems from the GPL use recommendations:
https://www.gnu.org/licenses/gpl-howto.en.html
The concept appears to originate in times where version control was per
file, instead of per tree, and was a way to glue the files together.
Ultimately, we nowadays don't live in that world anymore, and this
information is entirely useless anyway, as people are very welcome to
copy these files into any projects they like, and they shouldn't have to
change bits that are part of our copyright header for that.
hence, let's just get rid of this old cruft, and shorten our codebase a
bit.
Double newlines (i.e. one empty lines) are great to structure code. But
let's avoid triple newlines (i.e. two empty lines), quadruple newlines,
quintuple newlines, …, that's just spurious whitespace.
It's an easy way to drop 121 lines of code, and keeps the coding style
of our sources a bit tigther.
Files which are installed as-is (any .service and other unit files, .conf
files, .policy files, etc), are left as is. My assumption is that SPDX
identifiers are not yet that well known, so it's better to retain the
extended header to avoid any doubt.
I also kept any copyright lines. We can probably remove them, but it'd nice to
obtain explicit acks from all involved authors before doing that.
This macro will read a pointer of any type, return it, and set the
pointer to NULL. This is useful as an explicit concept of passing
ownership of a memory area between pointers.
This takes inspiration from Rust:
https://doc.rust-lang.org/std/option/enum.Option.html#method.take
and was suggested by Alan Jenkins (@sourcejedi).
It drops ~160 lines of code from our codebase, which makes me like it.
Also, I think it clarifies passing of ownership, and thus helps
readability a bit (at least for the initiated who know the new macro)
log.h really should only include the bare minimum of other headers, as
it is really pulled into pretty much everything else and already in
itself one of the most basic pieces of code we have.
Let's hence drop inclusion of:
1. sd-id128.h because it's entirely unneeded in current log.h
2. errno.h, dito.
3. sys/signalfd.h which we can replace by a simple struct forward
declaration
4. process-util.h which was needed for getpid_cached() which we now hide
in a funciton log_emergency_level() instead, which nicely abstracts
the details away.
5. sys/socket.h which was needed for struct iovec, but a simple struct
forward declaration suffices for that too.
Ultimately this actually makes our source tree larger (since users of
the functionality above must now include it themselves, log.h won't do
that for them), but I think it helps to untangle our web of includes a
tiny bit.
(Background: I'd like to isolate the generic bits of src/basic/ enough
so that we can do a git submodule import into casync for it)
Let's call getsockopt() in a loop, so that we can deal correctly with
the label changing while we are trying to read it.
(also, while we are at it, let's make sure that there's always one
trailing NUL byte at the end of the buffer, after all SO_PEERSEC has
zero documentation, and multiple implementing backends, hence let's
better be safe than sorry)
Also, drop UID/GID validity checks from getpeercred() as the kernel will
never pass us invalid UID/GID on userns, but the overflow UID/GID
instead. Add a comment about this.
The advantage is that is the name is mispellt, cpp will warn us.
$ git grep -Ee "conf.set\('(HAVE|ENABLE)_" -l|xargs sed -r -i "s/conf.set\('(HAVE|ENABLE)_/conf.set10('\1_/"
$ git grep -Ee '#ifn?def (HAVE|ENABLE)' -l|xargs sed -r -i 's/#ifdef (HAVE|ENABLE)/#if \1/; s/#ifndef (HAVE|ENABLE)/#if ! \1/;'
$ git grep -Ee 'if.*defined\(HAVE' -l|xargs sed -i -r 's/defined\((HAVE_[A-Z0-9_]*)\)/\1/g'
$ git grep -Ee 'if.*defined\(ENABLE' -l|xargs sed -i -r 's/defined\((ENABLE_[A-Z0-9_]*)\)/\1/g'
+ manual changes to meson.build
squash! build-sys: use #if Y instead of #ifdef Y everywhere
v2:
- fix incorrect setting of HAVE_LIBIDN2
The AF_VSOCK address family facilitates guest<->host communication on
VMware and KVM (virtio-vsock). Adding support to systemd allows guest
agents to be launched through .socket unit files. Today guest agents
are stand-alone daemons running inside guests that do not take advantage
of systemd socket activation.
sockaddr_port() either returns a >= 0 port number or a negative errno.
This works for AF_INET and AF_INET6 because port ranges are only 16-bit.
In AF_VSOCK ports are 32-bit so an int cannot represent all port number
and negative errnos. Separate the port and the return code.
We don't have plural in the name of any other -util files and this
inconsistency trips me up every time I try to type this file name
from memory. "formats-util" is even hard to pronounce.
As suggested here:
https://github.com/systemd/systemd/pull/4296#issuecomment-251911349
Let's try AF_INET first as socket, but let's fall back to AF_NETLINK, so that
we can use a protocol-independent socket here if possible. This has the benefit
that our code will still work even if AF_INET/AF_INET6 is made unavailable (for
exmple via seccomp), at least on current kernels.
This adds a new boolean setting DynamicUser= to service files. If set, a new
user will be allocated dynamically when the unit is started, and released when
it is stopped. The user ID is allocated from the range 61184..65519. The user
will not be added to /etc/passwd (but an NSS module to be added later should
make it show up in getent passwd).
For now, care should be taken that the service writes no files to disk, since
this might result in files owned by UIDs that might get assigned dynamically to
a different service later on. Later patches will tighten sandboxing in order to
ensure that this cannot happen, except for a few selected directories.
A simple way to test this is:
systemd-run -p DynamicUser=1 /bin/sleep 99999
Calling recv with a NULL buffer returns EFAULT instead of EOPNOTSUPP on
older kernels (3.14).
Fixes#3407
Signed-off-by: Kai Ruhnau <kai.ruhnau@target-sg.com>
Previously, we'd simply close and reopen the socket file descriptors. This is
problematic however, as we won't transition through the SOCKET_CHOWN state
then, and thus the file ownership won't be correct for the sockets.
Rework the flushing logic, and actually read any queued data from the sockets
for flushing, and accept any queued messages and disconnect them.