Systemd/src
Matthias-Christian Ott dbc4661a2c resolve: do not derive query timeout from RTT
DNS queries need timeout values to detect whether a DNS server is
unresponsive or, if the query is sent over UDP, whether a DNS message
was lost and has to be resent. The total time that it takes for the
answer to a query to arrive is t + RTT, where t is the maximum time that
the queried DNS server needs to answer the query.

An authoritative server stores a copy of the zone that it serves in main
memory or secondary storage, so t is very small and therefore the time
that it takes to answer a query is almost entirely determined by the
RTT. Modern authoritative server software keeps its zones in main memory
and, for example, Knot DNS and NSD are able to answer in less than
100 µs [1]. So iterative resolvers continuously measure the RTT to
optimize their query timeouts and to resend queries more quickly if they
are lost.

systemd-resolved is a stub resolver: it forwards DNS queries to an
upstream resolver and waits for an answer. So the time that it takes for
systemd-resolved to answer a query is determined by the RTT and the time
that it takes the upstream resolver to answer the query.

It seems common for iterative resolver software to set a total timeout
for the query. Such a total timeout subsumes the timeouts of all queries
that the iterative resolver has to make to answer a query. For example,
BIND seems to use a default timeout of 10 s.

At the moment systemd-resolved derives its query timeout entirely from
the RTT and does not consider the query timeout of the upstream
resolver. Therefore it often mistakenly degrades the feature set of its
upstream resolvers if it takes them longer than usual to answer a query.
It has been reported to be a considerable problem in practice, in
particular if DNSSEC=yes. So the query timeout of systemd-resolved
should be derived from the timeout of the upstream resolver and the RTT
to the upstream resolver.
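
The relationship described above can be sketched as follows. This is
only an illustration, not code from systemd-resolved: the helper name
stub_query_timeout() is hypothetical, and usec_t and USEC_PER_SEC are
redefined locally so that the fragment stands on its own.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins so the sketch is self-contained (assumption:
 * timeouts are expressed in microseconds). */
typedef uint64_t usec_t;
#define USEC_PER_SEC ((usec_t) 1000000ULL)

/* Hypothetical helper: the stub resolver has to wait for the upstream
 * resolver's own total timeout plus one round trip to that resolver. */
static usec_t stub_query_timeout(usec_t upstream_timeout, usec_t rtt) {
        /* Saturate instead of wrapping around on overflow. */
        if (rtt > UINT64_MAX - upstream_timeout)
                return UINT64_MAX;

        return upstream_timeout + rtt;
}
```

With an upstream timeout of 10 s and an RTT of 50 ms, the helper yields
10.05 s.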

At the moment systemd-resolved measures the RTT as the time that it
takes the upstream resolver to answer a query. This clearly leads to
incorrect measurements. In order to correctly measure the RTT
systemd-resolved would have to measure RTT separately and continuously,
for example with a query with an empty question section or a query for
the SOA RR of the root zone so that the upstream resolver would be able
to answer the query without querying another server. However, this
requires significant changes to systemd-resolved. So it seems best to
postpone them until other issues have been addressed and to set the
resend timeout to a fixed value for now.
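
To make the idea of a dedicated RTT probe concrete, here is a minimal
sketch of the wire format of a query for the SOA RR of the root zone.
It is not part of this change and not how systemd-resolved builds
packets; build_root_soa_probe() and PROBE_QUERY_SIZE are hypothetical
names, and the layout follows the standard DNS message format (12-byte
header followed by one question).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DNS_TYPE_SOA 6       /* QTYPE of the SOA RR */
#define DNS_CLASS_IN 1       /* QCLASS of the Internet class */
#define PROBE_QUERY_SIZE 17  /* 12-byte header + 1-byte root name + 2 + 2 */

/* Hypothetical helper: writes a query for ". IN SOA" into buf and
 * returns its length, or 0 if the buffer is too small. */
static size_t build_root_soa_probe(uint16_t id, uint8_t *buf, size_t size) {
        assert(buf);

        if (size < PROBE_QUERY_SIZE)
                return 0;

        memset(buf, 0, PROBE_QUERY_SIZE);

        buf[0] = (uint8_t) (id >> 8);   /* ID, high byte */
        buf[1] = (uint8_t) (id & 0xff); /* ID, low byte */
        buf[2] = 0x01;                  /* flags: RD (recursion desired) */
        buf[5] = 1;                     /* QDCOUNT = 1 */
        buf[12] = 0;                    /* QNAME: root, a single zero-length label */
        buf[14] = DNS_TYPE_SOA;         /* QTYPE, low byte */
        buf[16] = DNS_CLASS_IN;         /* QCLASS, low byte */

        return PROBE_QUERY_SIZE;
}
```

The time between sending such a packet and receiving its answer would
approximate the RTT, provided the upstream resolver can answer it
without querying another server.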

As mentioned, BIND seems to use a timeout of 10 s, so perhaps 12 s is a
reasonable value that also accounts for common RTT values. If we assume
that we are going to retry, it could be less. So it should be enough
to set the resend timeout to DNS_TIMEOUT_MAX_USEC, as
DNS_SERVER_FEATURE_RETRY_ATTEMPTS * DNS_TIMEOUT_MAX_USEC = 15 s.
However, this will not solve the incorrect feature set degradation and
should be seen as a temporary change until systemd-resolved probes the
feature set of an upstream resolver independently of the actual
queries.
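
The arithmetic above can be checked with a short sketch. The values of
the two constants are assumptions (5 s and 3 attempts), chosen because
they are consistent with the 15 s product stated above; usec_t,
USEC_PER_SEC and the helper total_retry_budget() are defined locally
for illustration only.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t usec_t;
#define USEC_PER_SEC ((usec_t) 1000000ULL)

/* Assumed values, consistent with the 15 s total given in the text. */
#define DNS_TIMEOUT_MAX_USEC (5 * USEC_PER_SEC)
#define DNS_SERVER_FEATURE_RETRY_ATTEMPTS 3

/* Hypothetical helper: total time spent on one query if every attempt
 * runs into the fixed resend timeout. */
static usec_t total_retry_budget(void) {
        return DNS_SERVER_FEATURE_RETRY_ATTEMPTS * DNS_TIMEOUT_MAX_USEC;
}
```

Under these assumptions the total budget of 15 s exceeds the 12 s
estimate derived from BIND's 10 s timeout, even though a single attempt
times out after 5 s.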

[1] https://www.knot-dns.cz/benchmark/
2018-06-12 23:21:18 +02:00
ac-power tree-wide: drop license boilerplate 2018-04-06 18:58:55 +02:00
activate tree-wide: be more careful with the type of array sizes 2018-04-27 14:29:06 +02:00
analyze analyze: use _cleanup_ for struct unit_times 2018-06-08 15:46:07 +02:00
ask-password tree-wide: drop license boilerplate 2018-04-06 18:58:55 +02:00
backlight tree-wide: drop redundant _cleanup_ macros (#8810) 2018-04-25 12:31:45 +02:00
basic tree-wide: unify how we define bit mak enums 2018-06-12 21:44:00 +02:00
binfmt Eliminate config_dirs vars which hold a static strv 2018-05-07 18:17:36 +02:00
boot efi: explicitly cast physical address to UINTN when converting to/from pointers 2018-05-31 16:10:46 +02:00
busctl string-util: rename strdash_if_empty() to empty_to_dash() 2018-05-11 01:55:46 +09:00
cgls path-util: introduce path_simplify() 2018-06-03 23:39:26 +09:00
cgroups-agent tree-wide: drop license boilerplate 2018-04-06 18:58:55 +02:00
cgtop path-util: introduce path_simplify() 2018-06-03 23:39:26 +09:00
core core: when applying io/blkio per-device rules, don't remove them if they fail 2018-06-12 22:52:36 +02:00
coredump basic/log: add the log_struct terminator to macro 2018-06-04 13:46:03 +02:00
cryptsetup tree-wide: drop spurious newlines (#8764) 2018-04-19 12:13:23 +02:00
debug-generator tree-wide: drop license boilerplate 2018-04-06 18:58:55 +02:00
delta tree-wide: unify how we define bit mak enums 2018-06-12 21:44:00 +02:00
detect-virt detect-virt: add new --list command for showing all currently known VM/container envs 2018-05-22 13:14:18 +02:00
dissect tree-wide: drop license boilerplate 2018-04-06 18:58:55 +02:00
environment-d-generator tree-wide: drop license boilerplate 2018-04-06 18:58:55 +02:00
escape tree-wide: drop license boilerplate 2018-04-06 18:58:55 +02:00
firstboot os-util: add helpers for finding /etc/os-release 2018-05-24 17:01:57 +02:00
fsck tree-wide: drop license boilerplate 2018-04-06 18:58:55 +02:00
fstab-generator path-util: introduce path_simplify() 2018-06-03 23:39:26 +09:00
fuzz fuzz-journal-remote: write to /dev/null not stdout 2018-05-31 14:30:23 +02:00
getty-generator tree-wide: drop license boilerplate 2018-04-06 18:58:55 +02:00
gpt-auto-generator Merge pull request #8812 from keszybz/gpt-auto-memleak 2018-04-25 15:46:57 +02:00
hibernate-resume tree-wide: drop license boilerplate 2018-04-06 18:58:55 +02:00
hostname os-util: add helpers for finding /etc/os-release 2018-05-24 17:01:57 +02:00
hwdb systemd-hwdb: reflow help() to avoid a line break 2018-04-24 12:11:10 +02:00
import Add macro for checking if some flags are set 2018-06-04 11:50:44 +02:00
initctl tree-wide: drop license boilerplate 2018-04-06 18:58:55 +02:00
journal journal: forward messages from /dev/log unmodified to syslog.socket 2018-06-11 21:26:22 +02:00
journal-remote journal-remote: do not send _BOOT_ID twice 2018-05-31 14:33:41 +02:00
kernel-install tree-wide: drop license boilerplate 2018-04-06 18:58:55 +02:00
libsystemd tree-wide: unify how we define bit mak enums 2018-06-12 21:44:00 +02:00
libsystemd-network resolved: fix typo in macro name 2018-06-08 16:05:18 +02:00
libudev tree-wide: remove some double newlines in headers, too 2018-05-22 16:13:45 +02:00
locale locale: add _unused_ attribute for dummy variable 2018-06-06 12:27:52 +02:00
login tree-wide: drop trailing whitespace 2018-06-12 13:05:38 +02:00
machine basic/log: add the log_struct terminator to macro 2018-06-04 13:46:03 +02:00
machine-id-setup tree-wide: drop license boilerplate 2018-04-06 18:58:55 +02:00
modules-load tree-wide: drop license boilerplate 2018-04-06 18:58:55 +02:00
mount path-util: introduce path_simplify() 2018-06-03 23:39:26 +09:00
network resolve: make PrivateDNS configurable per link 2018-06-11 21:35:58 +02:00
notify tree-wide: drop license boilerplate 2018-04-06 18:58:55 +02:00
nspawn tree-wide: unify how we define bit mak enums 2018-06-12 21:44:00 +02:00
nss-myhostname tree-wide: drop license boilerplate 2018-04-06 18:58:55 +02:00
nss-mymachines mymachines: fix getgrnam() 2018-06-08 17:52:18 +02:00
nss-resolve tree-wide: drop license boilerplate 2018-04-06 18:58:55 +02:00
nss-systemd tree-wide: drop license boilerplate 2018-04-06 18:58:55 +02:00
partition tree-wide: drop license boilerplate 2018-04-06 18:58:55 +02:00
path tree-wide: drop spurious newlines (#8764) 2018-04-19 12:13:23 +02:00
portable tree-wide: unify how we define bit mak enums 2018-06-12 21:44:00 +02:00
quotacheck tree-wide: drop license boilerplate 2018-04-06 18:58:55 +02:00
random-seed tree-wide: drop license boilerplate 2018-04-06 18:58:55 +02:00
rc-local-generator tree-wide: drop license boilerplate 2018-04-06 18:58:55 +02:00
remount-fs tree-wide: drop license boilerplate 2018-04-06 18:58:55 +02:00
reply-password tree-wide: drop license boilerplate 2018-04-06 18:58:55 +02:00
resolve resolve: do not derive query timeout from RTT 2018-06-12 23:21:18 +02:00
rfkill tree-wide: drop redundant _cleanup_ macros (#8810) 2018-04-25 12:31:45 +02:00
run tree-wide: some O_NDELAY → O_NONBLOCK fixes 2018-05-31 12:04:39 +02:00
shared tree-wide: unify how we define bit mak enums 2018-06-12 21:44:00 +02:00
sleep basic/log: add the log_struct terminator to macro 2018-06-04 13:46:03 +02:00
socket-proxy tree-wide: drop license boilerplate 2018-04-06 18:58:55 +02:00
stdio-bridge tree-wide: drop license boilerplate 2018-04-06 18:58:55 +02:00
sulogin-shell tree-wide: drop license boilerplate 2018-04-06 18:58:55 +02:00
sysctl Eliminate config_dirs vars which hold a static strv 2018-05-07 18:17:36 +02:00
system-update-generator tree-wide: drop license boilerplate 2018-04-06 18:58:55 +02:00
systemctl Merge pull request #9157 from poettering/unit-config-load-error 2018-06-11 14:37:10 +02:00
systemd resolve: make PrivateDNS configurable per link 2018-06-11 21:35:58 +02:00
sysusers path-util: introduce path_simplify() 2018-06-03 23:39:26 +09:00
sysv-generator tree-wide: drop redundant _cleanup_ macros (#8810) 2018-04-25 12:31:45 +02:00
test Merge pull request #9185 from marckleinebudde/can 2018-06-11 12:58:55 +02:00
time-wait-sync time-util: introduce common implementation of TFD_TIMER_CANCEL_ON_SET client code 2018-06-06 10:55:45 +02:00
timedate basic/log: add the log_struct terminator to macro 2018-06-04 13:46:03 +02:00
timesync time-util: introduce common implementation of TFD_TIMER_CANCEL_ON_SET client code 2018-06-06 10:55:45 +02:00
tmpfiles path-util: introduce path_simplify() 2018-06-03 23:39:26 +09:00
tty-ask-password-agent tree-wide: drop license boilerplate 2018-04-06 18:58:55 +02:00
udev scsi_id: use _cleanup_free_ on buffer allocated by get_file_options 2018-06-08 15:15:02 +02:00
update-done tree-wide: drop license boilerplate 2018-04-06 18:58:55 +02:00
update-utmp tree-wide: drop license boilerplate 2018-04-06 18:58:55 +02:00
user-sessions tree-wide: drop license boilerplate 2018-04-06 18:58:55 +02:00
vconsole fileio: accept FILE* in addition to path in parse_env_file() 2018-05-24 17:01:57 +02:00
veritysetup tree-wide: drop license boilerplate 2018-04-06 18:58:55 +02:00
volatile-root tree-wide: drop license boilerplate 2018-04-06 18:58:55 +02:00