We use stream objects in four different cases: let's track them.
This in particular allows us to make sure the limit on outgoing streams
cannot be exhausted by having incoming streams as this means we can
neatly separate the counters for all four types.
There's no value in authenticating SOA RRs that are neither answer to
our question nor parent of our question (the latter being relevant so
that we have a TTL from the SOA field for negative caching of the actual
query).
By being to eager here, and trying to authenticate too much we run the
risk of creating cyclic deps between our transactions which then causes
the over-all authentication to fail.
Fixes: #9771
DnsStream and DnsServer have a symbiotic relationship: one DnsStream is
the current "default" stream of the server (and thus reffed by it), but
each stream also refs the server it is connected to. This cyclic
dependency can result in weird situations: when one is
destroyed/unlinked/stopped it needs to unregister itself from the other,
but doing this will trigger unregistration of the other. Hence, let's
make sure we unregister the stream from the server before destroying it,
to break this cycle.
Most likely fixes: #10725
We register an on_packet() handler anyway, which is called first.
There's hence no need to check in on_stream_complete() again, as it is
already taken by that time.
Ideally, coccinelle would strip unnecessary braces too. But I do not see any
option in coccinelle for this, so instead, I edited the patch text using
search&replace to remove the braces. Unfortunately this is not fully automatic,
in particular it didn't deal well with if-else-if-else blocks and ifdefs, so
there is an increased likelikehood be some bugs in such spots.
I also removed part of the patch that coccinelle generated for udev, where we
returns -1 for failure. This should be fixed independently.
The references to the dns_server are now setup after the tls connection is setup.
This ensures that the stream got fully stopped when the initial tls setup failed
instead of having the unref being blocked by the reference to the stream by the server.
Therefore on_stream_io would no longer be called with a half setup encrypted connection.
Fixes the issue reported in #9838.
These lines are generally out-of-date, incomplete and unnecessary. With
SPDX and git repository much more accurate and fine grained information
about licensing and authorship is available, hence let's drop the
per-file copyright notice. Of course, removing copyright lines of others
is problematic, hence this commit only removes my own lines and leaves
all others untouched. It might be nicer if sooner or later those could
go away too, making git the only and accurate source of authorship
information.
This part of the copyright blurb stems from the GPL use recommendations:
https://www.gnu.org/licenses/gpl-howto.en.html
The concept appears to originate in times where version control was per
file, instead of per tree, and was a way to glue the files together.
Ultimately, we nowadays don't live in that world anymore, and this
information is entirely useless anyway, as people are very welcome to
copy these files into any projects they like, and they shouldn't have to
change bits that are part of our copyright header for that.
hence, let's just get rid of this old cruft, and shorten our codebase a
bit.
DNS queries need timeout values to detect whether a DNS server is
unresponsive or, if the query is sent over UDP, whether a DNS message
was lost and has to be resent. The total time that it takes to answer a
query to arrive is t + RTT, where t is the maximum time that the DNS
server that is being queried needs to answer the query.
An authoritative server stores a copy of the zone that it serves in main
memory or secondary storage, so t is very small and therefore the time
that it takes to answer a query is almost entirely determined by the
RTT. Modern authoritative server software keeps its zones in main memory
and, for example, Knot DNS and NSD are able to answer in less than
100 µs [1]. So iterative resolvers continuously measure the RTT to
optimize their query timeouts and to resend queries more quickly if they
are lost.
systemd-resolved is a stub resolver: it forwards DNS queries to an
upstream resolver and waits for an answer. So the time that it takes for
systemd-resolved to answer a query is determined by the RTT and the time
that it takes the upstream resolver to answer the query.
It seems common for iterative resolver software to set a total timeout
for the query. Such total timeout subsumes the timeout of all queries
that the iterative has to make to answer a query. For example, BIND
seems to use a default timeout of 10 s.
At the moment systemd-resolved derives its query timeout entirely from
the RTT and does not consider the query timeout of the upstream
resolver. Therefore it often mistakenly degrades the feature set of its
upstream resolvers if it takes them longer than usual to answer a query.
It has been reported to be a considerable problem in practice, in
particular if DNSSEC=yes. So the query timeout systemd-resolved should
be derived from the timeout of the upstream resolved and the RTT to the
upstream resolver.
At the moment systemd-resolved measures the RTT as the time that it
takes the upstream resolver to answer a query. This clearly leads to
incorrect measurements. In order to correctly measure the RTT
systemd-resolved would have to measure RTT separately and continuously,
for example with a query with an empty question section or a query for
the SOA RR of the root zone so that the upstream resolver would be able
to answer to query without querying another server. However, this
requires significant changes to systemd-resolved. So it seems best to
postpone them until other issues have been addressed and to set the
resend timeout to a fixed value for now.
As mentioned, BIND seems to use a timeout of 10 s, so perhaps 12 s is a
reasonable value that also accounts for common RTT values. If we assume
that the we are going to retry, it could be less. So it should be enough
to set the resend timeout to DNS_TIMEOUT_MAX_USEC as
DNS_SERVER_FEATURE_RETRY_ATTEMPTS * DNS_TIMEOUT_MAX_USEC = 15 s.
However, this will not solve the incorrect feature set degradation and
should be seen as a temporary change until systemd-resolved does
probe the feature set of an upstream resolver independently from the
actual queries.
[1] https://www.knot-dns.cz/benchmark/
Double newlines (i.e. one empty lines) are great to structure code. But
let's avoid triple newlines (i.e. two empty lines), quadruple newlines,
quintuple newlines, …, that's just spurious whitespace.
It's an easy way to drop 121 lines of code, and keeps the coding style
of our sources a bit tigther.
Files which are installed as-is (any .service and other unit files, .conf
files, .policy files, etc), are left as is. My assumption is that SPDX
identifiers are not yet that well known, so it's better to retain the
extended header to avoid any doubt.
I also kept any copyright lines. We can probably remove them, but it'd nice to
obtain explicit acks from all involved authors before doing that.
Currently, we accept SERVFAIL after downgrading fully, cache it and move
on. Let's extend this a bit: after downgrading fully, if the SERVFAIL
logic continues to be an issue, then use a different DNS server if there
are any.
Fixes: #7147
As discussed in RFC 6762, Section 8.2 a race condition may
happen when two hosts are probing for the same name simultaniously.
Detect and handle such race conditions.
According to RFC 6762 Section 8.2 "Simultaneous Probe Tiebreaking"
probing queries' Authority Section is populated with proposed
resource records in order to resolve possible race conditions.
This is our own header, we should include use the local-include syntax
("" not <>), to make it clear we are including the one from the build tree.
All other includes of files from src/systemd/ use this scheme.
For caching negative replies we need the SOA TTL information. Hence,
let's authenticate all auxiliary SOA RRs through DS requests on all
negative requests.
This is the most important piece of information of replies, hence show
this in the first log message about it.
(Wireshark shows it too in the short summary, hence this definitely
makes sense...)
Retrying a transaction via TCP is a good approach for mitigating
packet loss. However, it's not a good away way to fix a bad RCODE if we
already downgraded to UDP level for it. Hence, don't do this.
This is a small tweak only, but shortens the time we spend on
downgrading when a specific domain continously returns a bad rcode.
When we are doing a TCP transaction the kernel will automatically resend
all packets for us, there's no need to do that ourselves. Hence:
increase the timeout for TCP transactions substantially, to give the
kernel enough time to connect to the peer, without interrupting it when
we become impatient.
Embedding sd_id128_t's in constant strings was rather cumbersome. We had
SD_ID128_CONST_STR which returned a const char[], but it had two problems:
- it wasn't possible to statically concatanate this array with a normal string
- gcc wasn't really able to optimize this, and generated code to perform the
"conversion" at runtime.
Because of this, even our own code in coredumpctl wasn't using
SD_ID128_CONST_STR.
Add a new macro to generate a constant string: SD_ID128_MAKE_STR.
It is not as elegant as SD_ID128_CONST_STR, because it requires a repetition
of the numbers, but in practice it is more convenient to use, and allows gcc
to generate smarter code:
$ size .libs/systemd{,-logind,-journald}{.old,}
text data bss dec hex filename
1265204 149564 4808 1419576 15a938 .libs/systemd.old
1260268 149564 4808 1414640 1595f0 .libs/systemd
246805 13852 209 260866 3fb02 .libs/systemd-logind.old
240973 13852 209 255034 3e43a .libs/systemd-logind
146839 4984 34 151857 25131 .libs/systemd-journald.old
146391 4984 34 151409 24f71 .libs/systemd-journald
It is also much easier to check if a certain binary uses a certain MESSAGE_ID:
$ strings .libs/systemd.old|grep MESSAGE_ID
MESSAGE_ID=%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x
MESSAGE_ID=%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x
MESSAGE_ID=%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x
MESSAGE_ID=%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x
$ strings .libs/systemd|grep MESSAGE_ID
MESSAGE_ID=c7a787079b354eaaa9e77b371893cd27
MESSAGE_ID=b07a249cd024414a82dd00cd181378ff
MESSAGE_ID=641257651c1b4ec9a8624d7a40a9e1e7
MESSAGE_ID=de5b426a63be47a7b6ac3eaac82e2f6f
MESSAGE_ID=d34d037fff1847e6ae669a370e694725
MESSAGE_ID=7d4958e842da4a758f6c1cdc7b36dcc5
MESSAGE_ID=1dee0369c7fc4736b7099b38ecb46ee7
MESSAGE_ID=39f53479d3a045ac8e11786248231fbf
MESSAGE_ID=be02cf6855d2428ba40df7e9d022f03d
MESSAGE_ID=7b05ebc668384222baa8881179cfda54
MESSAGE_ID=9d1aaa27d60140bd96365438aad20286
We don't actually make use of the return value for now, but it matches
our coding style elsewhere, and it actually shortens our code quite a
bit.
Also, add a missing OOM check after dns_answer_new().