Commit graph

89 commits

Author SHA1 Message Date
Iwan Timmer 4310bfc20b resolved: add strict mode for DNS-over-TLS
Add strict mode for DNS-over-TLS, which will require TLS support from the server. Closes #10755
2019-06-19 13:10:44 +02:00
Iwan Timmer e22c5b2064 resolved: move TLS data shared by all servers to manager
Instead of having a context and/or trusted CA list per server this is now moved to the server. Ensures future TLS configuration options are global instead of per server.
2019-06-18 19:16:36 +02:00
Ben Boeckel 5238e95759 codespell: fix spelling errors 2019-04-29 16:47:18 +02:00
Yu Watanabe 6ff79f7640 resolve: rename Link.name -> Link.ifname
This also changes the type from char[IF_NAMESIZE] to char*.
By changing the type, now resolved-link.h can drop the dependency to
the header net/if.h.
2019-04-13 17:51:59 +09:00
Lennart Poettering f76fa08899 resolved: rework dns_server_limited_domains(), replace by dns_scope_has_route_only_domains()
The function dns_server_limited_domains() was very strange as it
enumerate the domains associated with a DnsScope object to determine
whether any "route-only" domains, but did so as a function associated
with a DnsServer object.

Let's clear this up, and replace it by a function associated with a
DnsScope instead. This makes more sense philosphically and allows us to
reduce the loops through which we need to jump to determine whether a
scope is suitable for default routing a bit.
2018-12-21 12:09:00 +01:00
Lennart Poettering 904dcaf9d4 resolved: take particular care when detaching DnsServer from its default stream
DnsStream and DnsServer have a symbiotic relationship: one DnsStream is
the current "default" stream of the server (and thus reffed by it), but
each stream also refs the server it is connected to. This cyclic
dependency can result in weird situations: when one is
destroyed/unlinked/stopped it needs to unregister itself from the other,
but doing this will trigger unregistration of the other. Hence, let's
make sure we unregister the stream from the server before destroying it,
to break this cycle.

Most likely fixes: #10725
2018-12-07 17:16:29 +01:00
Lennart Poettering 65b0179a25 resolved: use structured initialization for DnsServer allocation 2018-12-07 17:16:29 +01:00
Yu Watanabe 7a08d314f2 tree-wide: make hash_ops typesafe 2018-12-02 07:53:27 +01:00
Yu Watanabe 8301aa0bf1 tree-wide: use DEFINE_TRIVIAL_REF_UNREF_FUNC() macro or friends where applicable 2018-08-27 14:01:46 +09:00
Filipe Brandenburger a0edd02e43 tree-wide: Convert compare_func's to use CMP() macro wherever possible.
Looked for definitions of functions using the *_compare_func() suffix.

Tested:
- Unit tests passed (ninja -C build/ test)
- Installed this build and booted with it.
2018-08-06 19:26:35 -07:00
Iwan Timmer 6016fcb0ea resolved: refactor GnuTLS specific code in separate source file
This is a first step towards supporting alternative TLS implementations for DNS-over-TLS.

Co-authored-by: Filipe Brandenburger <filbranden@google.com>
2018-07-27 21:23:17 +01:00
Yu Watanabe df0fbad0d4 resolve: add assert_not_reached()
Follow-up for 3fe30d85e3.
2018-07-24 13:00:11 +02:00
Yu Watanabe 56ddbf1009 meson: make DNS-over-TLS support optional
This adds dns-over-tls option to meson. If set to 'false',
systemd-resolved is not linked with libgnutls.
2018-06-20 22:28:01 +02:00
Lennart Poettering 0c69794138 tree-wide: remove Lennart's copyright lines
These lines are generally out-of-date, incomplete and unnecessary. With
SPDX and git repository much more accurate and fine grained information
about licensing and authorship is available, hence let's drop the
per-file copyright notice. Of course, removing copyright lines of others
is problematic, hence this commit only removes my own lines and leaves
all others untouched. It might be nicer if sooner or later those could
go away too, making git the only and accurate source of authorship
information.
2018-06-14 10:20:20 +02:00
Lennart Poettering 818bf54632 tree-wide: drop 'This file is part of systemd' blurb
This part of the copyright blurb stems from the GPL use recommendations:

https://www.gnu.org/licenses/gpl-howto.en.html

The concept appears to originate in times where version control was per
file, instead of per tree, and was a way to glue the files together.
Ultimately, we nowadays don't live in that world anymore, and this
information is entirely useless anyway, as people are very welcome to
copy these files into any projects they like, and they shouldn't have to
change bits that are part of our copyright header for that.

hence, let's just get rid of this old cruft, and shorten our codebase a
bit.
2018-06-14 10:20:20 +02:00
Iwan Timmer c9299be2f5 resolve: rename PrivateDNS to DNSOverTLS
PrivateDNS is not considered a good name for this option, so rename it to DNSOverTLS
2018-06-14 09:57:56 +02:00
Yu Watanabe 3da3cdd592 resolve: drop unused argument of dns_server_packet_lost() 2018-06-13 13:20:23 +09:00
Yu Watanabe 3c0dcbcf4f resolve: fix log message 2018-06-13 12:21:54 +09:00
Matthias-Christian Ott dbc4661a2c resolve: do not derive query timeout from RTT
DNS queries need timeout values to detect whether a DNS server is
unresponsive or, if the query is sent over UDP, whether a DNS message
was lost and has to be resent. The total time that it takes to answer a
query to arrive is t + RTT, where t is the maximum time that the DNS
server that is being queried needs to answer the query.

An authoritative server stores a copy of the zone that it serves in main
memory or secondary storage, so t is very small and therefore the time
that it takes to answer a query is almost entirely determined by the
RTT. Modern authoritative server software keeps its zones in main memory
and, for example, Knot DNS and NSD are able to answer in less than
100 µs [1]. So iterative resolvers continuously measure the RTT to
optimize their query timeouts and to resend queries more quickly if they
are lost.

systemd-resolved is a stub resolver: it forwards DNS queries to an
upstream resolver and waits for an answer. So the time that it takes for
systemd-resolved to answer a query is determined by the RTT and the time
that it takes the upstream resolver to answer the query.

It seems common for iterative resolver software to set a total timeout
for the query. Such total timeout subsumes the timeout of all queries
that the iterative has to make to answer a query. For example, BIND
seems to use a default timeout of 10 s.

At the moment systemd-resolved derives its query timeout entirely from
the RTT and does not consider the query timeout of the upstream
resolver. Therefore it often mistakenly degrades the feature set of its
upstream resolvers if it takes them longer than usual to answer a query.
It has been reported to be a considerable problem in practice, in
particular if DNSSEC=yes. So the query timeout systemd-resolved should
be derived from the timeout of the upstream resolved and the RTT to the
upstream resolver.

At the moment systemd-resolved measures the RTT as the time that it
takes the upstream resolver to answer a query. This clearly leads to
incorrect measurements. In order to correctly measure the RTT
systemd-resolved would have to measure RTT separately and continuously,
for example with a query with an empty question section or a query for
the SOA RR of the root zone so that the upstream resolver would be able
to answer to query without querying another server. However, this
requires significant changes to systemd-resolved. So it seems best to
postpone them until other issues have been addressed and to set the
resend timeout to a fixed value for now.

As mentioned, BIND seems to use a timeout of 10 s, so perhaps 12 s is a
reasonable value that also accounts for common RTT values. If we assume
that the we are going to retry, it could be less. So it should be enough
to set the resend timeout to DNS_TIMEOUT_MAX_USEC as
DNS_SERVER_FEATURE_RETRY_ATTEMPTS * DNS_TIMEOUT_MAX_USEC = 15 s.
However, this will not solve the incorrect feature set degradation and
should be seen as a temporary change until systemd-resolved does
probe the feature set of an upstream resolver independently from the
actual queries.

[1] https://www.knot-dns.cz/benchmark/
2018-06-12 23:21:18 +02:00
Iwan Timmer d050561ac3 resolve: make PrivateDNS configurable per link
Like with DNSSec, make PrivateDNS configurable per link, so you can have trusted and untrusted links.
2018-06-11 21:35:58 +02:00
Iwan Timmer 5d67a7ae74 resolved: support for DNS-over-TLS
Add support for DNS-over-TLS using GnuTLS. To reduce latency also TLS False Start and TLS session resumption is supported.
2018-06-11 21:35:58 +02:00
Iwan Timmer 98767d75d7 resolved: longlived TCP connections
Keep DNS over TCP connection open until it's closed by the server or after a timeout.
2018-06-11 20:17:51 +02:00
Zbigniew Jędrzejewski-Szmek f4cf1d66f7 Remove NULL terminator from two log_struct calls
Fixup for a1230ff972. I forgot to press "save" ;(
2018-06-06 14:44:34 +02:00
Zbigniew Jędrzejewski-Szmek 11a1589223 tree-wide: drop license boilerplate
Files which are installed as-is (any .service and other unit files, .conf
files, .policy files, etc), are left as is. My assumption is that SPDX
identifiers are not yet that well known, so it's better to retain the
extended header to avoid any doubt.

I also kept any copyright lines. We can probably remove them, but it'd nice to
obtain explicit acks from all involved authors before doing that.
2018-04-06 18:58:55 +02:00
ott 4cbfd62b46 resolve: Adjust and unify D-Bus call timeout (#7847)
DNS queries have a timeout of DNS_TRANSACTION_ATTEMPTS_MAX *
DNS_TIMEOUT_MAX_USEC = 120 s. Calls to the ResolveHostname method of
the org.freedesktop.resolve1.Manager interface have various call
timeouts that are smaller than 120 s. So it seems correct to adjust
the call timeout to the maximum query timeout and to unify the call
timeout among all callers.

A timeout of 120 s might seem large, in particular since BIND does seem
to have a query timeout of 10 s. However, it seems match the timeout
value of 120 s of Unbound. Moreover, the query and timeout handling of
resolve have problems and might be improved in the future, so this
change is at best an interim solution.
2018-01-23 09:53:31 +09:00
Zbigniew Jędrzejewski-Szmek 53e1b68390 Add SPDX license identifiers to source files under the LGPL
This follows what the kernel is doing, c.f.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5fd54ace4721fc5ce2bb5aef6318fcf17f421460.
2017-11-19 19:08:15 +01:00
Lennart Poettering cf84484a56 resolved: include DNS server feature level info in SIGUSR1 status dump
let's make the status dump more useful for tracking down server issues.
2017-10-05 17:02:25 +02:00
Lennart Poettering 59c0fd0e17 resolved: automatically forget all learnt DNS server information when the network configuration changes
When the network configuration changes we should relearn everything
there is to know about the configured DNS servers, because we might talk
to the same addresses, but there might be different servers behind them.
2017-10-05 16:22:22 +02:00
Lennart Poettering db8e1324b8 resolved: downgrade log messages about switching DNS servers
As suggested in:

496ae8c84b (commitcomment-22819483)

Let's drop some noise from the logs, as switching between DNS servers is
definitely useful for debugging, but shouldn't get more attention that
that.
2017-07-03 11:20:04 +02:00
Kai Krakow 496ae8c84b resolved: Recover from slow DNS responses
When DNS is unreliable temporarily, the current implementation will
never improve resend behavior again and switch DNS servers only late
(current maximum timeout is 5 seconds).

We can improve this by biasing the resend_timeout back to the current
RTT when a successful response was received. Next time, a timeout is hit
on this server, it will switch to the next server faster.

Fixes: #5953
2017-06-27 22:04:16 +02:00
Zbigniew Jędrzejewski-Szmek 62cc1c55cb bus: include sd-{bus,messages}.h the same as other systemd headers
This is our own header, we should include use the local-include syntax
("" not <>), to make it clear we are including the one from the build tree.
All other includes of files from src/systemd/ use this scheme.
2017-04-21 12:05:55 -04:00
Lennart Poettering 74a3ed7408 resolved: extend various timeouts
Let's increase a number of timeouts as they apparently are too short for
some real-world lookups.

See:

https://github.com/systemd/systemd/issues/4003#issuecomment-279842616

In particular we change the following timeouts:

1) The first UDP retry we increase 500ms → 750ms. This is a good idea,
   since some servers need relatively long responses for trivial lookups,
   and giving up our first attempt also has the effect of trying a
   different server for the next attempt which has the side effect that
   we'll run two down-grade iterations in parallel, on both servers.
   Hence, let's give servers a bit more time in the first iteration.

2) Permit 24 retries instead of just 16 per transactions. If we end up
   downgrading all the way down to UDP for a lookup we already need 5
   iterations for that. If we want permit a couple of lost packages for
   each (let's say 4), then we already need 20 iterations.

3) Increase the overall query timeout on the service side to 60s (from
   45s), simply because very long and slow DNSSEC + CNAME chains (such as
   us.ynuf.alipay.com) hit this boundary too easily. The client side
   timeout for the bus method call is increased to 90s, in order to have
   room for the dbus reply to go through
2017-02-17 10:25:16 +01:00
Lennart Poettering 97277567b8 resolved: when DNSSEC mode is disabled, don't go beyond EDNS0 feature level
There's no point in talking to a server in DNSSEC mode when we don't
actually want to verify anything.

See: #5352
2017-02-17 10:25:15 +01:00
Lennart Poettering ce7c8b20df resolved: when the dns server feature level grace period elapses, flush caches
The cache might contain all kinds of unauthenticated data that we really
shouldn't be using if we upgrade our feature level and suddenly are able
to get authenticated data again.

Might fix: #4866
2017-02-17 10:25:15 +01:00
Zbigniew Jędrzejewski-Szmek 2b0445262a tree-wide: add SD_ID128_MAKE_STR, remove LOG_MESSAGE_ID
Embedding sd_id128_t's in constant strings was rather cumbersome. We had
SD_ID128_CONST_STR which returned a const char[], but it had two problems:
- it wasn't possible to statically concatanate this array with a normal string
- gcc wasn't really able to optimize this, and generated code to perform the
  "conversion" at runtime.
Because of this, even our own code in coredumpctl wasn't using
SD_ID128_CONST_STR.

Add a new macro to generate a constant string: SD_ID128_MAKE_STR.
It is not as elegant as SD_ID128_CONST_STR, because it requires a repetition
of the numbers, but in practice it is more convenient to use, and allows gcc
to generate smarter code:

$ size .libs/systemd{,-logind,-journald}{.old,}
   text	   data	    bss	    dec	    hex	filename
1265204	 149564	   4808	1419576	 15a938	.libs/systemd.old
1260268	 149564	   4808	1414640	 1595f0	.libs/systemd
 246805	  13852	    209	 260866	  3fb02	.libs/systemd-logind.old
 240973	  13852	    209	 255034	  3e43a	.libs/systemd-logind
 146839	   4984	     34	 151857	  25131	.libs/systemd-journald.old
 146391	   4984	     34	 151409	  24f71	.libs/systemd-journald

It is also much easier to check if a certain binary uses a certain MESSAGE_ID:

$ strings .libs/systemd.old|grep MESSAGE_ID
MESSAGE_ID=%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x
MESSAGE_ID=%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x
MESSAGE_ID=%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x
MESSAGE_ID=%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x%02x

$ strings .libs/systemd|grep MESSAGE_ID
MESSAGE_ID=c7a787079b354eaaa9e77b371893cd27
MESSAGE_ID=b07a249cd024414a82dd00cd181378ff
MESSAGE_ID=641257651c1b4ec9a8624d7a40a9e1e7
MESSAGE_ID=de5b426a63be47a7b6ac3eaac82e2f6f
MESSAGE_ID=d34d037fff1847e6ae669a370e694725
MESSAGE_ID=7d4958e842da4a758f6c1cdc7b36dcc5
MESSAGE_ID=1dee0369c7fc4736b7099b38ecb46ee7
MESSAGE_ID=39f53479d3a045ac8e11786248231fbf
MESSAGE_ID=be02cf6855d2428ba40df7e9d022f03d
MESSAGE_ID=7b05ebc668384222baa8881179cfda54
MESSAGE_ID=9d1aaa27d60140bd96365438aad20286
2017-02-15 00:45:12 -05:00
Lennart Poettering 12bf233175 resolved: if strict DNSSEC mode is selected never downgrade below DNSSEC server feature level due to packet loss
Fixes: #4315
2017-02-09 16:13:07 +01:00
Lennart Poettering 413b05ccac resolved: properly check for the root domain
Fix-up for #4164
2016-10-24 19:04:43 +02:00
Zbigniew Jędrzejewski-Szmek 6b430fdb7c tree-wide: use mfree more 2016-10-16 23:35:39 -04:00
Martin Pitt b9fe94cad9 resolved: don't query domain-limited DNS servers for other domains (#3621)
DNS servers which have route-only domains should only be used for
the specified domains. Routing queries about other domains there is a privacy
violation, prone to fail (as that DNS server was not meant to be used for other
domains), and puts unnecessary load onto that server.

Introduce a new helper function dns_server_limited_domains() that checks if the
DNS server should only be used for some selected domains, i. e. has some
route-only domains without "~.". Use that when determining whether to query it
in the scope, and when writing resolv.conf.

Extend the test_route_only_dns() case to ensure that the DNS server limited to
~company does not appear in resolv.conf. Add test_route_only_dns_all_domains()
to ensure that a server that also has ~. does appear in resolv.conf as global
name server. These reproduce #3420.

Add a new test_resolved_domain_restricted_dns() test case that verifies that
domain-limited DNS servers are only being used for those domains. This
reproduces #3421.

Clarify what a "routing domain" is in the manpage.

Fixes #3420
Fixes #3421
2016-09-30 09:30:08 +02:00
Martin Pitt 54522e941d Merge pull request #3594 from poettering/resolved-servfail
resolved fixes for handling SERVFAIL errors from servers
2016-06-24 08:01:49 +02:00
Lennart Poettering d001e0a3af resolved: rework SERVFAIL handling
There might be two reasons why we get a SERVFAIL response from our selected DNS
server: because this DNS server itself is bad, or because the DNS server
actually serving the zone upstream is bad. So far we immediately downgraded our
server feature level when getting SERVFAIL, under the assumption that the first
case is the only possible case. However, this meant we'd downgrade immediately
even if we encountered the second case described above.

With this commit handling of SERVFAIL is reworked. As soon as we get a SERVFAIL
on a transaction we retry the transaction with a lower feature level, without
changing the feature level tracked for the DNS server itself. If that fails
too, we downgrade further, and so on. If during this downgrading the SERVFAIL
goes away we assume that the DNS server we are talking to is bad, but the zone
is fine and propagate the detected feature level to the information we track
about the DNS server. Should the SERVFAIL not go away this way we let the
transaction fail and accept the SERVFAIL.
2016-06-23 23:24:38 +02:00
Lennart Poettering b30bf55d5c resolved: respond to local resolver requests on 127.0.0.53:53
In order to improve compatibility with local clients that speak DNS directly
(and do not use NSS or our bus API) listen locally on 127.0.0.53:53 and process
any queries made that way.

Note that resolved does not implement a full DNS server on this port, but
simply enough to allow normal, local clients to resolve RRs through resolved.
Specifically it does not implement queries without the RD bit set (these are
requests where recursive lookups are explicitly disabled), and neither queries
with DNSSEC DO set in combination with DNSSEC CD (i.e. DNSSEC lookups with
validation turned off). It also refuses zone transfers and obsolete RR types.
All lookups done this way will be rejected with a clean error code, so that the
client side can repeat the query with a reduced feature set.

The code will set the DNSSEC AD flag however, depending on whether the data
resolved has been validated (or comes from a local, trusted source).

Lookups made via this mechanisms are propagated to LLMNR and mDNS as necessary,
but this is only partially useful as DNS packets cannot carry IP scope data
(i.e. the ifindex), and hence link-local addresses returned cannot be used
properly (and given that LLMNR/mDNS are mostly about link-local communication
this is quite a limitation). Also, given that DNS tends to use IDNA for
non-ASCII names, while LLMNR/mDNS uses UTF-8 lookups cannot be mapped 1:1.

In general this should improve compatibility with clients bypassing NSS but
it is highly recommended for clients to instead use NSS or our native bus API.

This patch also beefs up the DnsStream logic, as it reuses the code for local
TCP listening. DnsStream now provides proper reference counting for its
objects.

In order to avoid feedback loops resolved will no silently ignore 127.0.0.53
specified as DNS server when reading configuration.

resolved listens on 127.0.0.53:53 instead of 127.0.0.1:53 in order to leave
the latter free for local, external DNS servers or forwarders.

This also changes the "etc.conf" tmpfiles snippet to create a symlink from
/etc/resolv.conf to /usr/lib/systemd/resolv.conf by default, thus making this
stub the default mode of operation if /etc is not populated.
2016-06-21 14:15:23 +02:00
Lennart Poettering f2ed4c696a resolved: extend dns_packet_append_opt() so that it can set the extended rcode
We don't make use of this yet, but later work will.
2016-06-21 13:20:48 +02:00
Lennart Poettering 2817157bb7 resolved: support IPv6 DNS servers on the local link
Make sure we can parse DNS server addresses that use the "zone id" syntax for
local link addresses, i.e. "fe80::c256:27ff:febb:12f%wlp3s0", when reading
/etc/resolv.conf.

Also make sure we spit this out correctly again when writing /etc/resolv.conf
and via the bus.

Fixes: #3359
2016-06-06 19:17:38 +02:00
Zbigniew Jędrzejewski-Szmek 0e3e1aefeb resolved: fix accounting of dns serves on a link (#3291)
After a few link up/down events I got this warning:
May 17 22:05:10 laptop systemd-resolved[2983]: Failed to read DNS servers for interface wlp3s0, ignoring: Argument list too long
2016-05-20 15:11:58 +02:00
Vito Caputo 313cefa1d9 tree-wide: make ++/-- usage consistent WRT spacing
Throughout the tree there's spurious use of spaces separating ++ and --
operators from their respective operands.  Make ++ and -- operator
consistent with the majority of existing uses; discard the spaces.
2016-02-22 20:32:04 -08:00
Daniel Mack b26fa1a2fb tree-wide: remove Emacs lines from all files
This should be handled fine now by .dir-locals.el, so need to carry that
stuff in every file.
2016-02-10 13:41:57 +01:00
Zbigniew Jędrzejewski-Szmek e3309036cd resolved: log server type when switching servers
I'm not defining _DNS_SERVER_TYPE_MAX/INVALID as usual in the enum,
because it wouldn't be used, and then gcc would complain that
various enums don't test for _DNS_SERVER_TYPE_MAX. It seems better
to define the macro rather than add assert_not_reached() in multiple
places.
2016-01-29 12:24:15 -05:00
Lennart Poettering 1e02e182f1 resolved: log recognizably about DNSSEC downgrades
If we downgrade from DNSSEC to non-DNSSEC mode, let's log about this in a recognizable way (i.e. with a message ID),
after all, this is of major importance.
2016-01-25 17:19:19 +01:00
Lennart Poettering cc450722a0 resolved: don't forget about lost OPT and RRSIG when downgrading a feature level
Certain Belkin routers appear to implement a broken DNS cache for A RRs and some others, but implement a pass-thru for
AAAA RRs. This has the effect that we quickly recognize the broken logic of the router when we do an A lookup, but for
AAAA everything works fine until we actually try to validate the request. Given that the validation will necessarily
fail ultimately let's make sure we remember even when downgrading a feature level that OPT or RRSIG was missing.
2016-01-19 00:51:26 +01:00