https://tools.ietf.org/html/draft-knodel-terminology-02https://lwn.net/Articles/823224/
This gets rid of most but not occasions of these loaded terms:
1. scsi_id and friends are something that is supposed to be removed from
our tree (see #7594)
2. The test suite defines an API used by the ubuntu CI. We can remove
this too later, but this needs to be done in sync with the ubuntu CI.
3. In some cases the terms are part of APIs we call or where we expose
concepts the kernel names the way it names them. (In particular all
remaining uses of the word "slave" in our codebase are like this,
it's used by the POSIX PTY layer, by the network subsystem, the mount
API and the block device subsystem). Getting rid of the term in these
contexts would mean doing some major fixes of the kernel ABI first.
Regarding the replacements: when whitelist/blacklist is used as noun we
replace with with allow list/deny list, and when used as verb with
allow-list/deny-list.
Presently, CLI utilities such as systemctl will check whether they have a tty
attached or not to decide whether to parse /proc/cmdline or EFI variable
SystemdOptions looking for systemd.log_* entries.
But this check will be misleading if these tools are being launched by a
daemon, such as a monitoring daemon or automation service that runs in
background.
Make log handling of CLI tools uniform by never checking /proc/cmdline or EFI
variables to determine the logging level.
Furthermore, introduce a new log_setup_cli() shortcut to set up common options
used by most command-line utilities.
Patch contains a coccinelle script, but it only works in some cases. Many
parts were converted by hand.
Note: I did not fix errors in return value handing. This will be done separate
to keep the patch comprehensible. No functional change is intended in this
patch.
Access to bit fields is less efficient, and since the Manager is a singleton,
a byte or two of space in the structure doesn't matter at all. (And in this
particular case, because of alignment issues, we wouldn't save anything
anyway.)
This makes the output more predictable. Also, interesting interfaces
are often the low-numbered ones (actual hardware links, not virtual
devices stacked on top), and this makes them more visible.
Those lists are very long and use up a significant chunk of screen real estate.
But the contents are mostly static (usually they just reflect built-in
configuration). Let's just not show them in 'status' output. They can still
be viewed with 'nta' verb.
This is a follow-up for 9f83091e3c.
Instead of reading the mtime off the configuration files after reading,
let's do so before reading, but with the fd we read the data from. This
is not only cleaner (as it allows us to save one stat()), but also has
the benefit that we'll detect changes that happen while we read the
files.
This also reworks unit file drop-ins to use the common code for
determining drop-in mtime, instead of reading system clock for that.
We ask for the TTL, then have enough space for it.
We probably can drop the extra cmsg space now, but let's figure that out
another time, since the extra cmsg space is used elsewhere in resolved
as well.
Resolved can't reliably determine on whether "it makes sense" to query
AAAA records when not explicitly specifying it in the request, so we
shouldn't remove them.
After having done the resolving, applications can use RFC6724 to
determine whether that address is reachable.
We can't know whether an address is reachable before having resolved it
and inspecting the routing table, and not resolving AAAA just because
there's no IPv6 default route on the main interface link them breaks
various setups, including IPv6-providing wireguard tunnels on a
non-dualstacked environment.
Fixes#5782Fixes#5915Fixes#8017
This also makes sure the control buffer is properly aligned. This
matters, as otherwise the control buffer might not be aligned and the
cmsg buffer counting might be off. The incorrect alignment is becoming
visible by using recvmsg_safe() as we suddenly notice the MSG_CTRUNC bit
set because of this.
That said, apparently this isn't enough to make this work on all
kernels. Since I couldn't figure this out, we now add 1K to the buffer
to be sure. We do this once already, also for a pktinfo structure
(though an IPv4/IPv6) one. I am puzzled by this, but this shouldn't
matter much. it works locally just fine, except for those ubuntu CI
kernels...
While we are at it, make some other changes too, to simplify and
modernize the function.
We always need to make them unions with a "struct cmsghdr" in them, so
that things properly aligned. Otherwise we might end up at an unaligned
address and the counting goes all wrong, possibly making the kernel
refuse our buffers.
Also, let's make sure we initialize the control buffers to zero when
sending, but leave them uninitialized when reading.
Both the alignment and the initialization thing is mentioned in the
cmsg(3) man page.
We need to use the CMSG_SPACE() macro to size the control buffers, not
CMSG_LEN(). The former is rounded up to next alignment boundary, the
latter is not. The former should be used for allocations, the latter for
encoding how much of it is actually initialized. See cmsg(3) man page
for details about this.
Given how confusing this is, I guess we don't have to be too ashamed
here, in most cases we actually did get this right.
If we're using a set with _put_strdup(), most of the time we want to use
string hash ops on the set, and free the strings when done. This defines
the appropriate a new string_hash_ops_free structure to automatically free
the keys when removing the set, and makes set_put_strdup() and set_put_strdupv()
instantiate the set with those hash ops.
hashmap_put_strdup() was already doing something similar.
(It is OK to instantiate the set earlier, possibly with a different hash ops
structure. set_put_strdup() will then use the existing set. It is also OK
to call set_free_free() instead of set_free() on a set with
string_hash_ops_free, the effect is the same, we're just overriding the
override of the cleanup function.)
No functional change intended.
Let's be extra careful whenever we return from recvmsg() and see
MSG_CTRUNC set. This generally means we ran into a programming error, as
we didn't size the control buffer large enough. It's an error condition
we should at least log about, or propagate up. Hence do that.
This is particularly important when receiving fds, since for those the
control data can be of any size. In particular on stream sockets that's
nasty, because if we miss an fd because of control data truncation we
cannot recover, we might not even realize that we are one off.
(Also, when failing early, if there's any chance the socket might be
AF_UNIX let's close all received fds, all the time. We got this right
most of the time, but there were a few cases missing. God, UNIX is hard
to use)
It's not that I think that "hostname" is vastly superior to "host name". Quite
the opposite — the difference is small, and in some context the two-word version
does fit better. But in the tree, there are ~200 occurrences of the first, and
>1600 of the other, and consistent spelling is more important than any particular
spelling choice.
When the stub listener is disabled, stub-resolv.conf is useless. Instead of
warning about this, let's just make stub-resolv.conf point to the private
resolv.conf file. (The original bug report asked for "mirroring", but I think
a symlink is nicer than a copy because it is easier to see that a redirection
was made.)
Fixes#14700.
All callers ignore the return value.
This is almost entirely theoretical, since writing to /run is unlikely to
fail..., but the user is almost certainly better off keeping the old copy
around and having working dns resolution with an out-of-date dns server list
than having having a dangling /etc/resolv.conf symlink.
An error with a full path is immediately clear. OTOH, a user might not be
familiar with concenpt like "private resolv.conf".
I opted to use %s-formatting for the path, because the code is much easier to
read this way. Any difference in t speed of execution is not important.
This is useful to raise the log level for a single transaction or a few,
without affecting other state of the resolved as a restart would.
The log level can only be set, I didn't bother with having the ability
to restore the original as in pid1.
There are legitimate reasons to access the file directly, as currently
discussed on fedora-devel. Hence tone things down from "must" to "should
typically not".
Also, let's use fputs() instead of fputs_unlocked() here,
fopen_temporary_label() turns off stdio locking anyway for the whole
FILE*, hence no need to do this manually each time.
On certain distributions such as NixOS the mtime of `/etc/hosts` is
locked to a fixed value. In such cases, only checking the last mtime of
`/etc/hosts` is not enough - we also need to check if the st_ino/st_dev
match up. Thus, let's make sure make sure that systemd-resolved also
rereads `/etc/hosts` if the inode or the device containing `/etc/hosts` changes.
Test script:
```bash
hosts="/etc/hosts"
echo "127.0.0.1 testpr" > "hosts_new"
mv "hosts_new" "$hosts"
resolvectl query testpr || exit 1
mtime="$(stat -c %y "$hosts")"
echo "127.0.0.1 newhost" > "hosts_tmp"
touch -d "$mtime" "hosts_tmp"
install -p "hosts_tmp" "$hosts"
sleep 10
resolvectl query newhost || exit 1
rm -f "hosts_tmp"
```
Closes#14456.
Widely accepted certificates for IP addresses are expensive and only
affordable for larger organizations. Therefore if the user provides
the hostname in the DNS= option, we should use it instead of the IP
address.
Section 6.2 of RFC4034 requires that "all uppercase US-ASCII letters in
the DNS names contained within the RDATA are replaced by the corresponding
lowercase US-ASCII letters" for a long list of RR types.
Fixes#14891
In subsequent commits, calls to if_nametoindex() will be replaced by a wrapper
that falls back to alternative name resolution over netlink. netlink support
requires libsystemd (for sd-netlink), and we don't want to add any functions
that require netlink in basic/. So stuff that calls if_nametoindex() for user
supplied interface names, and everything that depends on that, needs to be
moved.
We don't need a seperate output parameter that is of type int. glibc() says
that the type is "unsigned", but the kernel thinks it's "int". And the
"alternative names" interface also uses ints. So let's standarize on ints,
since it's clearly not realisitic to have interface numbers in the upper half
of unsigned int range.
If a daemon is not started as root, most likely it also can't create its
directory and let's not try to resolve the user in that case either.
Create /run/systemd/netif/lldp with tmpfiles.d like other netif directories.
This is also very helpful for preparing a RootImage for the daemons as NSS crud
is not needed.
This cleans up and unifies the outut of --help texts a bit:
1. Highlight the human friendly description string, not the command
line via ANSI sequences. Previously both this description string and
the brief command line summary was marked with the same ANSI
highlight sequence, but given we auto-page to less and less does not
honour multi-line highlights only the command line summary was
affectively highlighted. Rationale: for highlighting the description
instead of the command line: the command line summary is relatively
boring, and mostly the same for out tools, the description on the
other hand is pregnant, important and captions the whole thing and
hence deserves highlighting.
2. Always suffix "Options" with ":" in the help text
3. Rename "Flags" → "Options" in one case
4. Move commands to the top in a few cases
5. add coloring to many more help pages
6. Unify on COMMAND instead of {COMMAND} in the command line summary.
Some tools did it one way, others the other way. I am not sure what
precisely {} is supposed to mean, that uppercasing doesn't, hence
let's simplify and stick to the {}-less syntax
And minor other tweaks.
Validate the IP address in the certificate for DNS-over-TLS in strict mode when GnuTLS is used. As this is not yet the case in contrast to the documentation.
Increase the required version to ensure TLS 1.3 is always supported when using GnuTLS for DNS-over-TLS and allow further changes to use recent API additions.
Notifications are only sent for the top object, and not for individual
links. This should be enough for the most obvious cases where somebody
just cares about the effective set of servers.
Fixes#13721.
The DnsStreamType was added to track different types of DNS TCP streams,
instead of refcounting all of them together. However, the stream type was
not actually set into the stream->type field, so while the reference count
was correctly incremented per-stream-type, the reference count was always
decremented in the cleanup function for stream type 0, leading to
underflow for the type 0 stream (unsigned) refcount, and preventing new
type 0 streams from being created.
Since type 0 is DNS_STREAM_LOOKUP, which is used to communicate with
upstream nameservers, once the refcount underflows the stub resolver
no longer is able to successfully fall back to TCP upstream lookups
for any truncated UDP packets.
This was found because lookups of A records with a large number of
addresses, too much to fit into a single 512 byte DNS UDP reply,
were causing getaddrinfo() to fall back to TCP and trigger this bug,
which then caused the TCP fallback for later large record lookups
to fail with 'connection timed out; no servers could be reached'.
The stream type was introduced in commit:
652ba568c6
Prefer TLS 1.3 before TLS 1.2 for DNS-over-TLS support, otherwise
servers compliant with RFC 8446 might end up agreeing TLS 1.2 plus a
downgrade signal which is not expected by GnuTLS clients. This manifests
in the following error:
Failed to invoke gnutls_handshake: An illegal parameter has been received.
Fixes: #13528
Fixes: v242-962-g9c0624dcdb ("resolved: support TLS 1.3 when using GnuTLS for DNS-over-TLS")
For executables which take a verb, we should list the verbs first, and
then options which modify those verbs second. The general layout of
the man page is from general description to specific details, usually
Overview, Commands, Options, Return Value, Examples, References.
"reset" is more understandable. The verb is "revert", but it might actually be
better to have a description which uses different words instead of duplicating
the name of the command.
379158684a (commitcomment-34992552)
This matches what is done in networkd very closely. In fact even the
policy descriptions are all identical (with s/network/resolve), except
for the last one:
resolved has org.freedesktop.resolve1.revert while
networkd has org.freedesktop.network1.revert-ntp and
org.freedesktop.network1.revert-dns so the description is a bit different.
This doesn't matter much, but let's just do the loop once and allocate
the populate the result set on the fly. If we find an error, it'll get
cleaned up automatically.
Change the resolved.conf Cache option to a tri-state "no, no-negative, yes" values.
If a lookup returns SERVFAIL systemd-resolved will cache the result for 30s (See 201d995),
however, there are several use cases on which this condition is not acceptable (See systemd#5552 comments)
and the only workaround would be to disable cache entirely or flush it , which isn't optimal.
This change adds the 'no-negative' option when set it avoids putting in cache
negative answers but still works the same heuristics for positive answers.
Signed-off-by: Jorge Niedbalski <jnr@metaklass.org>
Whenever I see EXTRACT_QUOTES, I'm always confused whether it means to
leave the quotes in or to take them out. Let's say "unquote", like we
say "cunescape".
Instead of having a context and/or trusted CA list per server this is now moved to the server. Ensures future TLS configuration options are global instead of per server.
This reverts commit 18bddeaaf2.
Revert this because it does not take the OpenSSL internal read pointer
into considoration. Resulting in padding in packetdata and therefore
broken SSL connections.
When emitting the calendarspec warning we want to see some color.
Follow-up for 04220fda5c.
Exceptions:
- systemctl, because it has a lot hand-crafted coloring
- tmpfiles, sysusers, stdio-bridge, etc, because they are also used in
services and I'm not sure if this wouldn't mess up something.
This is partially a refactoring, but also makes many more places use
unlocked operations implicitly, i.e. all users of fopen_temporary().
AFAICT, the uses are always for short-lived files which are not shared
externally, and are just used within the same context. Locking is not
necessary.
We'd call dns_resource_record_equal(), which calls dns_resource_key_equal()
internally, and then dns_resource_key_equal() a second time. Let's be
a bit smarter, and call dns_resource_key_equal() only once.
(before)
dns_resource_key_hash_func_count=514
dns_resource_key_compare_func_count=275
dns_resource_key_equal_count=62371
4.13s user 0.01s system 99% cpu 4.153 total
(after)
dns_resource_key_hash_func_count=514
dns_resource_key_compare_func_count=276
dns_resource_key_equal_count=31337
2.13s user 0.01s system 99% cpu 2.139 total
This doesn't necessarily make things faster, because we still spend more time
in dns_answer_add(), but it improves the compuational complexity of this part.
If we even make dns_resource_key_equal_faster, this will become worthwhile.
* Current logic:
For each NSEC RR find the common suffix between the owner name and
the next name, append asterisk to that suffix and check that
generated wildcard is covered by the NSEC RR in question.
* New logic:
Find NSEC RR covering queried name, generate wildcard as
<asterisk>.<closest encloser> using this RR, then check if any
of the NSEC RRs covers generated wildcard.
/usr/local/lib/systemd/dnssd is now also included in the search path. This
path is of limited usefulness, but it makes sense to be consistent.
Documentation is updated to match. Outdated advice against drop-ins in /usr
is removed.
In all other cases (i.e. classic DNS connection towards an upstream
server, or incoming stub connection, or incoming LMMNR connection) we
want long-running connections, hence keep the connection open for good.
Only in the LLMNR client case let's close the stream as soon as we are
done.
Previously we'd start the timeout once when we allocated the stream.
However, we'd now like to emphasize long-running connections hence let's
rework the timeout logic, and restart it whenever we see action ont the
stream. Thus, idle streams are eventually closed down, but those where
we read or write from are not.
We use stream objects in four different cases: let's track them.
This in particular allows us to make sure the limit on outgoing streams
cannot be exhausted by having incoming streams as this means we can
neatly separate the counters for all four types.
In all three cases, sd_event_loop() will return the exit code anyway.
If sd_event_loop() returns negative, failure is logged and results in an
immediate return. Otherwise, we don't care if sd_event_loop() returns 0
or positive, because the return value feeds into DEFINE_MAIN_FUNCTION(), which
doesn't make the distinction.
Previously, we'd use a link as "default" route depending on whether
there are route-only domains defined on it or not. (If there are, it
would not be used as default route, if there aren't it would.)
Let's make this explicit and add a link variable controlling this. The
variable is not changeable from the outside yet, but subsequent commits
are supposed to add that.
Note that making this configurable adds a certain amount of redundancy,
as there are now two ways to ensure a link does not receive "default"
lookup (i.e. DNS queries matching no configured route):
1. By ensuring that at least one other link configures a route on it
(for example by add "." to its search list)
2. By setting this new boolean to false.
But this is exactly what is intended with this patch: that there is an
explicit way to configure on the link itself whether it receives
'default' traffic, rather than require this to be configured on other
links.
The variable added is a tri-state: if true, the link is suitable for
recieving "default" traffic. If false, the link is not suitable for it.
If unset (i.e. negative) the original logic of "has this route-only
routes" is used, to ensure compatibility with the status quo ante.
The function dns_server_limited_domains() was very strange as it
enumerate the domains associated with a DnsScope object to determine
whether any "route-only" domains, but did so as a function associated
with a DnsServer object.
Let's clear this up, and replace it by a function associated with a
DnsScope instead. This makes more sense philosphically and allows us to
reduce the loops through which we need to jump to determine whether a
scope is suitable for default routing a bit.
Previously, we'd return DNS_SCOPE_MAYBE for all domain lookups matching
LLMNR or mDNS. Let's upgrade this to DNS_SCOPE_YES, to make the binding
stronger.
The effect of this is that even if "local" is defined as routing domain
on some iface, we'll still lookup domains in local via mDNS — if mDNS is
turned on. This should not be limiting, as people who don't want such
lookups should turn off mDNS altogether, as it is useless if nothing is
routed to it.
This also has the nice benefit that mDNS/LLMR continue to work if people
use "~." as routing domain on some interface.
Similar for LLMNR and single label names.
Similar also for the link local IPv4 and IPv6 reverse lookups.
Fixes: #10125
There's no value in authenticating SOA RRs that are neither answer to
our question nor parent of our question (the latter being relevant so
that we have a TTL from the SOA field for negative caching of the actual
query).
By being to eager here, and trying to authenticate too much we run the
risk of creating cyclic deps between our transactions which then causes
the over-all authentication to fail.
Fixes: #9771
This appears to be necessary for client software to ensure the reponse data
is validated with DNSSEC. For example, `ssh -v -o VerifyHostKeyDNS=yes -o
StrictHostKeyChecking=yes redpilllinpro01.ring.nlnog.net` fails if EDNS0 is
not enabled. The debugging output reveals that the `SSHFP` records were
found in DNS, but were considered insecure.
Note that the patch intentionally does *not* enable EDNS0 in the
`/run/systemd/resolve/resolv.conf` file (the one that contains `nameserver`
entries for the upstream DNS servers), as it is impossible to know for
certain that all the upstream DNS servers handles EDNS0 correctly.
Apparently people want more of these (as #11175 shows). Since this is
merely a safety limit for us, let's just bump all values substantially.
Fixes: #11175
RFC7766 section 4 states that in the absence of EDNS0, a response that
is too large for a 512-byte UDP packet will have the 'truncated' bit
set. The client is expected to retry the query over TCP.
Fixes#10264.
It would be very wrong if any of the specfier printf calls modified
any of the objects or data being printed. Let's mark all arguments as const
(primarily to make it easier for the reader to see where modifications cannot
occur).