Commit Graph

266 Commits

Author SHA1 Message Date
Daniel Mack 1effe96568 resolved: split dns_transaction_go()
Split some code out of dns_transaction_go() so we can re-use it later from
different context. The new function dns_transaction_prepare_next_attempt()
takes care of preparing everything so that a new packet can conditionally
be formulated for a transaction.

This patch shouldn't cause any functional change.
2015-12-08 16:51:41 +01:00
Daniel Mack 547493c5ad resolved: handle more mDNS protocol details 2015-12-08 16:51:41 +01:00
Daniel Mack a20b959217 resolved: fix debug message 2015-12-08 16:51:40 +01:00
Daniel Mack 11a27c2ec1 resolved: handle mDNS timeouts per transaction
mDNS packet timeouts need to be handled per transaction, not per link.
Re-use the n_attempts field for this purpose, as packets timeouts should be
determined by starting at 1 second, and doubling the value on each try.
2015-12-08 16:51:40 +01:00
Daniel Mack ef7ce6df4d resolved: short-cut jitter callbacks for LLMNR and mDNS
When a jitter callback is issued instead of sending a DNS packet directly,
on_transaction_timeout() is invoked to 'retry' the transaction. However,
this function has side effects. For once, it increases the packet loss
counter on the scope, and it also unrefs/refs the server instances.

Fix this by tracking the jitter with two bool variables. One saying that
the initial jitter has been scheduled in the first place, and one that
tells us the delay packet has been sent.
2015-12-08 16:51:40 +01:00
Daniel Mack ea12bcc789 resolved: add mDNS initial jitter
The logic is to kick off mDNS packets in a delayed way is mostly identical
to what LLMNR needs, except that the constants are different.
2015-12-08 16:41:45 +01:00
Daniel Mack 4e5bf5e158 resolved: add packet header details for mDNS
Validate mDNS queries and responses by looking at some header fields,
add mDNS flags.
2015-12-08 16:41:45 +01:00
Lennart Poettering 931851e8e4 resolved: add a concept of "authenticated" responses
This adds a new SD_RESOLVED_AUTHENTICATED flag for responses we return
on the bus. When set, then the data has been authenticated. For now this
mostly reflects the DNSSEC AD bit, if DNSSEC=trust is set. As soon as
the client-side validation is complete it will be hooked up to this flag
too.

We also set this bit whenver we generated the data ourselves, for
example, because it originates in our local LLMNR zone, or from the
built-in trust anchor database.

The "systemd-resolve-host" tool has been updated to show the flag state
for the data it shows.
2015-12-03 21:17:49 +01:00
Lennart Poettering 24710c48ed resolved: introduce a dnssec_mode setting per scope
The setting controls which kind of DNSSEC validation is done: none at
all, trusting the AD bit, or client-side validation.

For now, no validation is implemented, hence the setting doesn't do much
yet, except of toggling the CD bit in the generated messages if full
client-side validation is requested.
2015-12-03 21:17:49 +01:00
Lennart Poettering 0d2cd47617 resolved: add a simple trust anchor database as additional RR source
When doing DNSSEC lookups we need to know one or more DS or DNSKEY RRs
as trust anchors to validate lookups. With this change we add a
compiled-in trust anchor database, serving the root DS key as of today,
retrieved from:

https://data.iana.org/root-anchors/root-anchors.xml

The interface is kept generic, so that additional DS or DNSKEY RRs may
be served via the same interface, for example by provisioning them
locally in external files to support "islands" of security.

The trust anchor database becomes the fourth source of RRs we maintain,
besides, the network, the local cache, and the local zone.
2015-12-03 21:17:49 +01:00
Tom Gundersen d74fb368b1 resolved: announce support for large UDP packets
This is often needed for proper DNSSEC support, and even to handle AAAA records
without falling back to TCP.

If the path between the client and server is fully compliant, this should always
work, however, that is not the case, and overlarge packets will get mysteriously
lost in some cases.

For that reason, we use a similar fallback mechanism as we do for palin EDNS0,
EDNS0+DO, etc.:

The large UDP size feature is different from the other supported feature, as we
cannot simply verify that it works based on receiving a reply (as the server
will usually send us much smaller packets than what we claim to support, so
simply receiving a reply does not mean much).

For that reason, we keep track of the largest UDP packet we ever received, as this
is the smallest known good size (defaulting to the standard 512 bytes). If
announcing the default large size of 4096 fails (in the same way as the other
features), we fall back to the known good size. The same logic of retrying after a
grace-period applies.
2015-11-27 01:35:47 +01:00
Tom Gundersen 9c5e12a431 resolved: implement minimal EDNS0 support
This is a minimal implementation of RFC6891. Only default values
are used, so in reality this will be a noop.

EDNS0 support is dependent on the current server's feature level,
so appending the OPT pseudo RR is done when the packet is emitted,
rather than when it is assembled. To handle different feature
levels on retransmission, we strip off the OPT RR again after
sending the packet.

Similarly, to how we fall back to TCP if UDP fails, we fall back
to plain UDP if EDNS0 fails (but if EDNS0 ever succeeded we never
fall back again, and after a timeout we will retry EDNS0).
2015-11-27 01:35:34 +01:00
Tom Gundersen 4e0b8b17a7 resolved: degrade the feature level on explicit failure
Previously, we would only degrade on packet loss, but when adding EDNS0 support,
we also have to handle the case where the server replies with an explicit error.
2015-11-27 01:35:33 +01:00
Tom Gundersen be808ea083 resolved: fallback to TCP if UDP fails
This is inspired by the logic in BIND [0], follow-up patches
will implement the reset of that scheme.

If we get a server error back, or if after several attempts we don't
get a reply at all, we switch from UDP to TCP for the given
server for the current and all subsequent requests. However, if
we ever successfully received a reply over UDP, we never fall
back to TCP, and once a grace-period has passed, we try to upgrade
again to using UDP. The grace-period starts off at five minutes
after the current feature level was verified and then grows
exponentially to six hours. This is to mitigate problems due
to temporary lack of network connectivity, but at the same time
avoid flooding the network with retries when the feature attempted
feature level genuinely does not work.

Note that UDP is likely much more commonly supported than TCP,
but depending on the path between the client and the server, we
may have more luck with TCP in case something is wrong. We really
do prefer UDP though, as that is much more lightweight, that is
why TCP is only the last resort.

[0]: <https://kb.isc.org/article/AA-01219/0/Refinements-to-EDNS-fallback-behavior-can-cause-different-outcomes-in-Recursive-Servers.html>
2015-11-27 01:35:33 +01:00
Lennart Poettering d830ebbdf6 resolved: never cache RRs originating from localhost
After all, this is likely a local DNS forwarder that caches anyway,
hence there's no point in caching twice.

Fixes #2038.
2015-11-27 00:46:51 +01:00
Lennart Poettering f9ebb22ab4 resolved: handle properly if there are multiple transactions for the same key per scope
When the zone probing code looks for a transaction to reuse it will
refuse to look at transactions that have been answered from cache or the
zone itself, but insist on the network. This has the effect that there
might be multiple transactions around for the same key on the same
scope. Previously we'd track all transactions in a hashmap, indexed by
the key, which implied that there would be only one transaction per key,
per scope. With this change the hashmap will only store the most recent
transaction per key, and a linked list will be used to track all
transactions per scope, allowing multiple per-key per-scope.

Note that the linked list fields for this actually already existed in
the DnsTransaction structure, but were previously unused.
2015-11-27 00:03:39 +01:00
Lennart Poettering c3bc53e624 resolved: for a transaction, keep track where the answer data came from
Let's track where the data came from: from the network, the cache or the
local zone. This is not only useful for debugging purposes, but is also
useful when the zone probing wants to ensure it's not reusing
transactions that were answered from the cache or the zone itself.
2015-11-27 00:03:39 +01:00
Lennart Poettering ae6a4bbf31 resolved: store just the DnsAnswer instead of a DnsPacket as answer in DnsTransaction objects
Previously we'd only store the DnsPacket in the DnsTransaction, and the
DnsQuery would then take the DnsPacket's DnsAnswer and return it. With
this change we already pull the DnsAnswer out inside the transaction.

We still store the DnsPacket in the transaction, if we have it, since we
still need to determine from which peer a response originates, to
implement caching properly. However, the DnsQuery logic doesn't care
anymore for the packet, it now only looks at answers and rcodes from the
successfuly candidate.

This also has the benefit of unifying how we propagate incoming packets,
data from the local zone or the local cache.
2015-11-27 00:03:39 +01:00
Lennart Poettering 801ad6a6a9 resolved: fully support DNS search domains
This adds support for searching single-label hostnames in a set of
configured search domains.

A new object DnsQueryCandidate is added that links queries to scopes.
It keeps track of the search domain last used for a query on a specific
link. Whenever a host name was unsuccessfuly resolved on a scope all its
transactions are flushed out and replaced by a new set, with the next
search domain appended.

This also adds a new flag SD_RESOLVED_NO_SEARCH to disable search domain
behaviour. The "systemd-resolve-host" tool is updated to make this
configurable via --search=.

Fixes #1697
2015-11-25 21:59:16 +01:00
Lennart Poettering d746bb3eb2 resolved: shortcut lookups names in the local zone
Previously, we'd always generate a packet on the wire, even for names
that are within our local zone. Shortcut this, and always check the
local zone first. This should minimize generated traffic and improve
security.
2015-11-18 17:07:11 +01:00
Lennart Poettering b5efdb8af4 util-lib: split out allocation calls into alloc-util.[ch] 2015-10-27 13:45:53 +01:00
Lennart Poettering 8752c5752f util-lib: move more locale-related calls to locale-util.[ch] 2015-10-27 13:25:56 +01:00
Lennart Poettering 8b43440b7e util-lib: move string table stuff into its own string-table.[ch] 2015-10-27 13:25:56 +01:00
Lennart Poettering 3ffd4af220 util-lib: split out fd-related operations into fd-util.[ch]
There are more than enough to deserve their own .c file, hence move them
over.
2015-10-25 13:19:18 +01:00
Tom Gundersen 8e427d9be9 resolved: cache - only allow putting a single question key at a time
Only one key is allowed per transaction now, so let's simplify things and only allow putting
one question key into the cache at a time.
2015-09-16 17:03:17 +02:00
Lennart Poettering 4667e00a61 resolved: rename DNS UDP socket to 'dns_udp_fd'
This hopefully makes this a bit more expressive and clarifies that the
fd is not used for the DNS TCP socket. This also mimics how the LLMNR
UDP fd is named in the manager object.
2015-08-25 18:51:23 +02:00
Daniel Mack 9c56a6f3e2 resolved: move assertion
Make a scope with invalid protocol state fail as soon as possible.
2015-08-25 14:25:58 +02:00
Daniel Mack 106784ebb7 resolved: use switch-case statements for protocol details
With more protocols to come, switch repetitive if-else blocks with a
switch-case statements.
2015-08-25 14:25:56 +02:00
Daniel Mack 8326c7f789 resolved: remove runtime check for previously asserted condition 2015-08-25 10:18:45 +02:00
Lennart Poettering 9318cdd374 resolved: change error code when trying to resolve direct LLMNR PTR RRs
If we try to resoolve an LLMNR PTR RR we shall connect via TCP directly
to the specified IP address. We already refuse to do this if the address
to resolve is of a different address family as the transaction's scope.
The error returned was EAFNOSUPPORT. Let's change this to ESRCH which is
how we indicate "not server available" when connecting for LLMNR or DNS,
since that's what this really is: we have no server we could connect to
in this address family.

This allows us to ensure that no server errors are always handled the same
way.
2015-08-24 23:47:28 +02:00
Lennart Poettering da0c630e14 resolved: replace transaction list by hashmap
Right now we keep track of ongoing transactions in a linked listed for
each scope. Replace this by a hashmap that is indexed by the RR key.
Given that all ongoing transactions will be placed in pretty much the
same scopes usually this should optimize behaviour.

We used to require a list here, since we wanted to do "superset" query
checks, but this became obsolete since transactions are now single-key
instead of multi-key.
2015-08-24 23:15:51 +02:00
Lennart Poettering f52e61da04 resolved: only maintain one question RR key per transaction
Let's simplify things and only maintain a single RR key per transaction
object, instead of a full DnsQuestion. Unicast DNS and LLMNR don't
support multiple questions per packet anway, and Multicast DNS suggests
coalescing questions beyond a single dns query, across the whole system.
2015-08-21 22:55:01 +02:00
Lennart Poettering 9e08a6e0ce resolved: add extra check for family when doing LLMNR TCP connections
It shouldn't happen that we try to resolve IPv4 addresses via LLMNR on
IPv6 and vice versa, but let's explicitly verify that we don't turn an
IPv4 LLMNR lookup into an IPv6 TCP connection.
2015-08-21 22:51:05 +02:00
Lennart Poettering a8f6397f53 resolved: minor typo comment fix 2015-08-21 12:41:08 +02:00
Tom Gundersen 6b34a6c995 resolved: cache - add more detailed cache debug logging 2015-08-17 07:18:30 +02:00
Lennart Poettering 38a03f06a7 sd-event: make sure sd_event_now() cannot fail
Previously, if the event loop never ran before sd_event_now() would
fail. With this change it will instead fall back to invoking now(). This
way, the function cannot fail anymore, except for programming error when
invoking it with wrong parameters.

This takes into account the fact that many callers did not handle the
error condition correctly, and if the callers did, then they kept simply
invoking now() as fall back on their own. Hence let's shorten the code
using this call, and make things more robust, and let's just fall back
to now() internally.

Whether now() is used or the cache timestamp may still be detected via
the return value of sd_event_now(). If > 0 is returned, then the fall
back to now() was used, if == 0 is returned, then the cached value was
returned.

This patch also simplifies many of the invocations of sd_event_now():
the manual fall back to now() can be removed. Also, in cases where the
call is invoked withing void functions we can now protect the invocation
via assert_se(), acknowledging the fact that the call cannot fail
anymore except for programming errors with the parameters.

This change is inspired by #841.
2015-08-03 17:34:49 +02:00
Tom Gundersen 9df3ba6c6c resolved: transaction - exponentially increase retry timeouts
Rather than fixing this to 5s for unicast DNS and 1s for LLMNR, start
at a tenth of those values and increase exponentially until the old
values are reached. For LLMNR the recommended timeout for IEEE802
networks (which basically means all of the ones we care about) is 100ms,
so that should be uncontroversial. For unicast DNS I have found no
recommended value. However, it seems vastly more likely that hitting a
500ms timeout is casued by a packet loss, rather than the RTT genuinely
being greater than 500ms, so taking this as a startnig value seems
reasonable to me.

In the common case this greatly reduces the latency due to normal packet
loss. Moreover, once we get support for probing for features, this means
that we can send more packets before degrading the feature level whilst
still allowing us to settle on the correct feature level in a reasonable
timeframe.

The timeouts are tracked per server (or per scope for the multicast
protocols), and once a server (or scope) receives a successfull package
the timeout is reset. We also track the largest RTT for the given
server/scope, and always start our timouts at twice the largest
observed RTT.
2015-08-03 14:06:58 +02:00
Lennart Poettering 1086182d83 resolved: compare dns question arrays properly
Let's optimize things a bit and properly compare DNS question arrays,
instead of checking if they are mutual supersets. This also makes ANY
query handling more accurate.
2015-07-28 18:38:54 +02:00
Tom Gundersen c73ee39d10 resolved: transaction - don't explicitly verify packet source
This is handled by the kernel now that the socket is connect()ed.
2015-07-27 20:34:28 +02:00
Tom Gundersen 088480faf1 resolved: transaction - don't unref server when creating TCP socket
This was a bug.
2015-07-27 20:34:15 +02:00
Tom Gundersen 471d40d92f resolved: transaction - introduce dns_transaction_emit()
This function emits the UDP packet via the scope, but first it will
determine the current server (and connect to it) and store the
server in the transaction.

This should not change the behavior, but simplifies the code.
2015-07-27 20:30:54 +02:00
Tom Gundersen c19ffd9fbf resolved: transaction - move a couple of functions
No functional change, but makes follow-up patch clearer.
2015-07-27 20:18:43 +02:00
Tom Gundersen 0db643664c resolved: transaction - move DNS UDP socket creation to the scope
With access to the server when creating the socket, we can connect()
to the server and hence simplify message sending and receiving in
follow-up patches.
2015-07-27 20:13:11 +02:00
Tom Gundersen 647f6aa8fc resolved: transaction - close socket when changing server
Close the socket when changing the server in a transaction, in
order for it to be reopened with the right server when we send
the next packet.

This fixes a regression where we could get stuck with a failing
server.
2015-07-27 20:01:07 +02:00
Tom Gundersen 86ad4cd709 resolved: transaction - don't request PKTINFO for unicast DNS
This was only ever used by LLMNR, so don't request this for unicast DNS packets.
2015-07-27 19:56:45 +02:00
Tom Gundersen 0eb99d0a6a resloved: transaction - unify IPv4 and IPv6 sockets
A transaction can only have one socket at a time, so no need to distinguish these.
2015-07-27 19:52:48 +02:00
Tom Gundersen 6709eb94f9 resolve: transaction - stop processing packet when found to be invalid
We were stopping the transaction, but we need to stop processing the packet alltogether.
2015-07-23 18:06:50 +02:00
Tom Gundersen d20b1667db resolved: use one UDP socket per transaction
We used to have one global socket, use one per transaction instead. This
has the side-effect of giving us a random UDP port per transaction, and
hence increasing the entropy and making cache poisoining significantly
harder to achieve.

We still reuse the same port number for packets belonging to the same
transaction (resent packets).
2015-07-14 18:50:57 +02:00
Tom Gundersen 29815b6c60 resolved: implement RFC5452
This improves the resilience against cache poisoning by being stricter
about only accepting responses that match precisely the requst they
are in reply to.

It should be noted that we still only use one port (which is picked
at random), rather than one port for each transaction. Port
randomization would improve things further, but is not required by
the RFC.
2015-07-14 18:50:57 +02:00
Tom Gundersen 8300ba218e resolved: pin the server used in a transaction
We want to discover information about the server and use that in when crafting
packets to be resent.
2015-07-14 18:50:53 +02:00
Daniel Mack 8b757a3861 resolved: separate LLMNR specific header bits
The C and T bits in the DNS packet header definitions are specific to LLMNR.
In regular DNS, they are called AA and RD instead. Reflect that by calling
the macros accordingly, and alias LLMNR specific macros.

While at it, define RA, AD and CD getters as well.
2015-07-13 11:28:29 -04:00
Daniel Mack 22a37591ed resolved: use a #define for LLMNR port
De-duplicate some magic numbers.
2015-07-13 11:28:29 -04:00
Ronny Chevalier 3df3e884ae shared: add random-util.[ch] 2015-04-11 00:11:13 +02:00
Harald Hoyer a7f7d1bde4 fix gcc warnings about uninitialized variables
like:

src/shared/install.c: In function ‘unit_file_lookup_state’:
src/shared/install.c:1861:16: warning: ‘r’ may be used uninitialized in
this function [-Wmaybe-uninitialized]
         return r < 0 ? r : state;
                ^
src/shared/install.c:1796:13: note: ‘r’ was declared here
         int r;
             ^
2015-03-27 14:57:38 +01:00
Michal Schmidt d5099efc47 hashmap: introduce hash_ops to make struct Hashmap smaller
It is redundant to store 'hash' and 'compare' function pointers in
struct Hashmap separately. The functions always comprise a pair.
Store a single pointer to struct hash_ops instead.

systemd keeps hundreds of hashmaps, so this saves a little bit of
memory.
2014-09-15 16:08:50 +02:00
Lennart Poettering 4d91eec42d resolved: actually, the peer with the lower IP address wins conflicts 2014-08-11 15:06:22 +02:00
Lennart Poettering 3ef64445cd resolved: make sure we don't mark the wrong zone RRs conflicting 2014-08-11 15:06:22 +02:00
Lennart Poettering 2fb3034cb2 resolved: be a bit more communicative about conflicts 2014-08-11 15:06:22 +02:00
Lennart Poettering a407657425 resolved: implement full LLMNR conflict detection logic 2014-08-11 15:06:22 +02:00
Lennart Poettering e56187ca4a resolved: don't abort if a transaction is aborted because its scope is removed 2014-08-05 17:02:46 +02:00
Lennart Poettering 6e06847294 resolved: add 100ms initial jitter to all LLMNR requests 2014-08-05 17:02:46 +02:00
Lennart Poettering 13b551acb6 resolved: when sending fails, don't try connecting to the next DNS server if we actually use LLMNR as protocol 2014-08-05 04:15:45 +02:00
Lennart Poettering 4d926a69bc resolved: bypass local cache when we issue a transaction for verification purposes 2014-08-05 01:52:24 +02:00
Lennart Poettering 2c27fbca2d resolved: flush cache each time we change to a different DNS server 2014-08-01 18:10:01 +02:00
Lennart Poettering 9a015429b3 resolved: use CLOCK_BOOTTIME instead of CLOCK_MONOTONIC when aging caches and timeing out transactions
That way the cache doens't get confused when the system is suspended.
2014-08-01 00:58:12 +02:00
Lennart Poettering ec2c5e4398 resolved: implement LLMNR uniqueness verification 2014-07-31 17:47:19 +02:00