A variety of sockopts exist both for IPv4 and IPv6 but require a
different pair of sockopt level/option number. Let's add helpers for
these that internally determine the right sockopt to call.
This should shorten code that generically wants to support both ipv4 +
ipv6 and for the first time adds correct support for some cases where we
only called the ipv4 versions, and not the ipv6 options.
We would set .type to a fake value. All real callers (outside of tests)
immediately overwrite .type with a proper value after calling
socket_address_parse(). So let's not set it and adjust the few places
that relied on it being set to the fake value.
socket_address_parse() is modernized to only set the output argument on
success.
If the interface scope is specified, this changes the meaning of the address
quite significantly. Let's show the IPv6 scope_id if present.
Sadly we don't even have a test for sockaddr_pretty() output :(
This will be implicitly tested through socket_address_parse() later on.
The commit 10ce2e0681 inverts the order of
SO_{RCV,SND}BUFFORCE and SO_{RCV,SND}BUF. However, setting buffer size with
SO_{RCV,SND}BUF does not fail even if the requested size is larger than
the kernel limit. Hence, SO_{RCV,SND}BUFFORCE will not use anymore and
the buffer size is always limited by the kernel limit even if we have
the priviledge to ignore the limit.
This makes the buffer size is checked after configuring it with
SO_{RCV,SND}BUF, and if it is still not sufficient, then try to set it
with FORCE command. With this commit, if we have enough priviledge, the
requested buffer size is correctly set.
Hopefully fixes#14417.
Prompted by the discussion on #16110, let's migrate more code to
fd_wait_for_event().
This only leaves 7 places where we call into poll()/poll() directly in
our entire codebase. (one of which is fd_wait_for_event() itself)
poll() sets POLLNVAL inside of the poll structures if an invalid fd is
passed. So far we generally didn't check for that, thus not taking
notice of the error. Given that this specific kind of error is generally
indication of a programming error, and given that our code is embedded
into our projects via NSS or because people link against our library,
let's explicitly check for this and convert it to EBADF.
(I ran into a busy loop because of this missing check when some of my
test code accidentally closed an fd it shouldn't close, so this is a
real thing)
We always need to make them unions with a "struct cmsghdr" in them, so
that things properly aligned. Otherwise we might end up at an unaligned
address and the counting goes all wrong, possibly making the kernel
refuse our buffers.
Also, let's make sure we initialize the control buffers to zero when
sending, but leave them uninitialized when reading.
Both the alignment and the initialization thing is mentioned in the
cmsg(3) man page.
Let's be extra careful whenever we return from recvmsg() and see
MSG_CTRUNC set. This generally means we ran into a programming error, as
we didn't size the control buffer large enough. It's an error condition
we should at least log about, or propagate up. Hence do that.
This is particularly important when receiving fds, since for those the
control data can be of any size. In particular on stream sockets that's
nasty, because if we miss an fd because of control data truncation we
cannot recover, we might not even realize that we are one off.
(Also, when failing early, if there's any chance the socket might be
AF_UNIX let's close all received fds, all the time. We got this right
most of the time, but there were a few cases missing. God, UNIX is hard
to use)
In subsequent commits, calls to if_nametoindex() will be replaced by a wrapper
that falls back to alternative name resolution over netlink. netlink support
requires libsystemd (for sd-netlink), and we don't want to add any functions
that require netlink in basic/. So stuff that calls if_nametoindex() for user
supplied interface names, and everything that depends on that, needs to be
moved.
So apparently there are two reasons why accept() can return EOPNOTSUPP:
because the socket is not a listening stream socket (or similar), or
because the incoming TCP connection for some reason wasn't acceptable to
the host. THe latter should be a transient error, as suggested on
accept(2). The former however should be considered fatal for
flush_accept(). Let's fix this by explicitly checking whether the socket
is a listening socket beforehand.
socket_bind_to_ifindex() uses the the SO_BINDTOIFINDEX sockopt of kernel
5.0, with a fallback to SO_BINDTODEVICE on older kernels.
socket_bind_to_ifname() is a trivial wrapper around SO_BINDTODEVICE, the
only benefit of using it instead of SO_BINDTODEVICE directly is that it
determines the size of the interface name properly so that it also works
for unbinding. Moreover, it's an attempt to unify our invocations of the
sockopt with a size of strlen(ifname) rather than strlen(ifname)+1...
Linux is stupid and sometimes returns a "struct sockaddr_un" that is
longer than its fields, as it NUL terminates .sun_path[] even if it has
full length. ubsan detects this, rightfully. Since this is a Linux
misdesign let's trick out ubsan a bit.
Fixes: #11024
We are pretty careful to reject abstract sockets that are too long to fit in
the address structure as a NUL-terminated string. And since we parse sockets as
strings, it is not possible to embed a NUL in the the address either. But we
might receive an external socket (abstract or not), and we want to be able to
print its address in all cases. We would call socket_address_verify() and
refuse to print various sockets that the kernel considers legit.
Let's do the strict verification only in case of socket addresses we parse and
open ourselves, and do less strict verification when printing addresses of
existing sockets, and use c-escaping to print embedded NULs and such.
More tests are added.
This should make LGTM happier because on FIXME comment is removed.
The helper is supposed to properly handle cases where .sun_path does not
contain a NUL byte, and thus copies out the path suffix a NUL as
necessary.
This also reworks the more specific socket_address_unlink() to be a
wrapper around the more generic sockaddr_un_unlink()