Apparently 30s is a bit too long for some cases, see #5552. But not
caching SERVFAIL at all also breaks stuff, see explanation in
201d99584e.
Let's try to find some middle ground, by lowering the cache timeout to
10s. This should be ample for the problem
201d99584e attackes, but not as long as
half a miute, as #5552 complains.
Fixes: #5552
Change the resolved.conf Cache option to a tri-state "no, no-negative, yes" values.
If a lookup returns SERVFAIL systemd-resolved will cache the result for 30s (See 201d995),
however, there are several use cases on which this condition is not acceptable (See systemd#5552 comments)
and the only workaround would be to disable cache entirely or flush it , which isn't optimal.
This change adds the 'no-negative' option when set it avoids putting in cache
negative answers but still works the same heuristics for positive answers.
Signed-off-by: Jorge Niedbalski <jnr@metaklass.org>
Looked for definitions of functions using the *_compare_func() suffix.
Tested:
- Unit tests passed (ninja -C build/ test)
- Installed this build and booted with it.
These lines are generally out-of-date, incomplete and unnecessary. With
SPDX and git repository much more accurate and fine grained information
about licensing and authorship is available, hence let's drop the
per-file copyright notice. Of course, removing copyright lines of others
is problematic, hence this commit only removes my own lines and leaves
all others untouched. It might be nicer if sooner or later those could
go away too, making git the only and accurate source of authorship
information.
This part of the copyright blurb stems from the GPL use recommendations:
https://www.gnu.org/licenses/gpl-howto.en.html
The concept appears to originate in times where version control was per
file, instead of per tree, and was a way to glue the files together.
Ultimately, we nowadays don't live in that world anymore, and this
information is entirely useless anyway, as people are very welcome to
copy these files into any projects they like, and they shouldn't have to
change bits that are part of our copyright header for that.
hence, let's just get rid of this old cruft, and shorten our codebase a
bit.
Files which are installed as-is (any .service and other unit files, .conf
files, .policy files, etc), are left as is. My assumption is that SPDX
identifiers are not yet that well known, so it's better to retain the
extended header to avoid any doubt.
I also kept any copyright lines. We can probably remove them, but it'd nice to
obtain explicit acks from all involved authors before doing that.
This makes things a bit easier to read I think, and also makes sure we
always use the _unlikely_ wrapper around it, which so far we used
sometimes and other times we didn't. Let's clean that up.
Following our coding style on success we should initialize all return
parameters of a function. We missed to cases for dns_cache_lookup() (but
covered all others), fix them too.
Some domains (such as us.ynuf.alipay.com) almost appear as if they actively
want to sabotage our DNSSEC work. Specifically, they unconditionally
return SERVFAIL on SOA lookups and always only after a 1s delay (at
least). This is pretty bad for our validation logic, as we use SOA
lookups to distuingish zones from non-terminal names. Moreover, SERVFAIL
is an error that is typically returned if we send requests a server
doesn't grok, and thus is reason for us to downgrade our protocol and
try again. In case of these zones this means we'll accept the SERVFAIL
response only after a full iterative downgrade to our lowest feature
level: TCP. In combination with the 1s delays this has the effect of
making us hit our transaction timeout way to easily.
As first attempt to improve the situation: let's start caching SERVFAIL
responses in our cache, after the full downgrade for a short period of
time.
Conceptually this is exposed as "weird rcode" caching, but for now we
only consider SERVFAIL a "weird rcode" worthy of caching. Later on we
might want to add more.
When we return the full RR wire data, let's make sure the TTL included in it is
adjusted by the time the RR sat in the cache.
As an optimization we do this only for ResolveRecord() and not for
ResolveHostname() and friends, since adjusting the TTL means copying the RR
object, and we don#t want to do that if there's no reason to.
(ResolveHostname() and friends don't return the TTL hence there's no reason to
in that case)
Throughout the tree there's spurious use of spaces separating ++ and --
operators from their respective operands. Make ++ and -- operator
consistent with the majority of existing uses; discard the spaces.
When the buffer is allocated on the stack we do not have to check for
failure everywhere. This is especially useful in debug statements, because
we can put dns_resource_key_to_string() call in the debug statement, and
we do not need a seperate if (log_level >= LOG_DEBUG) for the conversion.
dns_resource_key_to_string() is changed not to provide any whitespace
padding. Most callers were stripping the whitespace with strstrip(),
and it did not look to well anyway. systemd-resolve output is not column
aligned anymore.
The result of the conversion is not stored in DnsTransaction object
anymore. It is used only for debugging, so it seems fine to generate it
when needed.
Various debug statements are extended to provide more information.
When using NSEC/NSEC3 RRs from the cache to derive existance of arbitrary RRs, we should not get confused by the fact
that NSEC/NSEC3 RRs exist twice at zone cuts: once in the parent zone, and once in the child zone. For most RR types we
should only consult the latter since that's where the beef is. However, for DS lookups we have to check the former.
This change makes sure we never cache NSEC/NSEC3 RRs from any parent zone of a zone-cut. It also makes sure that when
we look for a DS RR in the cache we never consider any cached NSEC RR, as those are now always from the child zone.
Quite often we read the same RR key multiple times from the same message. Try to replace them by a single object when
we notice this. Do so again when we add things to the cache.
This should reduce memory consumption a tiny bit.
When storing negative responses, clamp the SOA minimum TTL (as suggested
by RFC2308) to the TTL of the NSEC/NSEC3 RRs we used to prove
non-existance, if it there is any.
This is necessary since otherwise an attacker might put together a faked
negative response for one of our question including a high-ttl SOA RR
for any parent zone, and we'd use trust the TTL.
When we verified a signature, fix up the RR's TTL to the original TTL
mentioned in the signature, and store the signature expiry information
in the RR, too. Then, use that when adding RRs to the cache.
This collects statistical data about transactions, dnssec verifications
and the cache, and exposes it over the bus. The systemd-resolve-host
tool learns new options to query these statistics and reset them.
Let's simplify usage and memory management of DnsResourceRecord's
dns_resource_record_to_string() call: cache the formatted string as
part of the object, and return it on subsequent calls, freeing it when
the DnsResourceRecord itself is freed.
This adds a new DnsAnswer item flag "DNS_ANSWER_SHARED_OWNER" which is
set for mDNS RRs that lack the cache-flush bit. The cache-flush bit is
removed from the DnsResourceRecord object in favour of this.
This also splits out the code that removes previous entries when adding
new positive ones into a new separate call dns_cache_remove_previous().
Let's use dns_cache_remove() rather than
dns_cache_item_remove_and_free() to destroy the cache, since the former
requires far fewer hash table lookups.
When we receieve a TTL=0 RR, then let's only flush that specific RR and
not the whole RRset.
On mDNS with RRsets that a shared-owner this is how specific RRs are
removed from the set, hence support this. And on non-mDNS the whole
RRset will already be removed much earlier in dns_cache_put() hence
there's no reason remove it again.