This makes two changes: first of all we will now explicitly check
whether a domain to test against an NSEC record is actually below the
signer's name. This is relevant for NSEC records that chain up the end
and the beginning of a zone: we shouldn't alow that NSEC record to match
against domains outside of the zone.
This also fixes how we handle NSEC checks for domains that are prefixes
of the NSEC RR domain itself, fixing #8164 which triggers this specific
case. The non-wildcard NSEC check is simplified for that, we can
directly make our between check, there's no need to find the "Next
Closer" first, as the between check should not be affected by additional
prefixes. For the wild card NSEC check we'll prepend the asterisk in
this case to the NSEC RR itself to make a correct check.
Fixes: #8164
Let's add some protection for split horizon setups, where different
zones are visible on the same global DNS servers depending on where you
come from.
Fixes: #9196
LOG_MESSAGE is just a wrapper, but it keeps the arguments indented together
with the format string, so put the argument inside of the macro invocation.
(No functional change.)
Also use lowercase for "trust anchor" — it should either be all capitaled or not
at all, and it's not a proper name, so let's make it all lowercase.
Also, add a newline, to make the string more readable. "%s" can expand to
something that is quite long.
Most our other parsing functions do this, let's do this here too,
internally we accept that anyway. Also, the closely related
load_env_file() and load_env_file_pairs() also do this, so let's be
systematic.
This makes most header files easier to look at. Also Emacs gets really
slow when browsing through large sections of overly long prototypes,
which is much improved by this macro.
We should probably not do something similar with too many other cases,
as macros like this might help readability for some, but make it worse
for others. But I think given the complexity of this specific prototype
and how often we use it, it's worth doing.
When I see "test", I have to think three times what the return value
means. With "below" this is immediately clear. ratelimit_below(&limit)
sounds almost like English and is imho immediately obvious.
(I also considered ratelimit_ok, but this strongly implies that being under the
limit is somehow better. Most of the times this is true, but then we use the
ratelimit to detect triple-c-a-d, and "ok" doesn't fit so well there.)
C.f. a1bcaa07.
Previously we were a bit sloppy with the index and size types of arrays,
we'd regularly use unsigned. While I don't think this ever resulted in
real issues I think we should be more careful there and follow a
stricter regime: unless there's a strong reason not to use size_t for
array sizes and indexes, size_t it should be. Any allocations we do
ultimately will use size_t anyway, and converting forth and back between
unsigned and size_t will always be a source of problems.
Note that on 32bit machines "unsigned" and "size_t" are equivalent, and
on 64bit machines our arrays shouldn't grow that large anyway, and if
they do we have a problem, however that kind of overly large allocation
we have protections for usually, but for overflows we do not have that
so much, hence let's add it.
So yeah, it's a story of the current code being already "good enough",
but I think some extra type hygiene is better.
This patch tries to be comprehensive, but it probably isn't and I missed
a few cases. But I guess we can cover that later as we notice it. Among
smaller fixes, this changes:
1. strv_length()' return type becomes size_t
2. the unit file changes array size becomes size_t
3. DNS answer and query array sizes become size_t
Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=76745
Double newlines (i.e. one empty lines) are great to structure code. But
let's avoid triple newlines (i.e. two empty lines), quadruple newlines,
quintuple newlines, …, that's just spurious whitespace.
It's an easy way to drop 121 lines of code, and keeps the coding style
of our sources a bit tigther.
This makes `resolvectl` use the verb style command line, e.g.,
`resolvectl status` or `resolvectl tlsa tcp fedoraproject.org:443`.
For compatibility, if the invocation name is `systemd-resolve`,
then it accepts the old syntax, e.g. `systemd-resolve --status`.
Files which are installed as-is (any .service and other unit files, .conf
files, .policy files, etc), are left as is. My assumption is that SPDX
identifiers are not yet that well known, so it's better to retain the
extended header to avoid any doubt.
I also kept any copyright lines. We can probably remove them, but it'd nice to
obtain explicit acks from all involved authors before doing that.
This adds flags BUS_MAP_STRDUP and BUS_MAP_BOOLEAN_AS_BOOL.
If BUS_MAP_STRDUP is set, then each "s" message is duplicated.
If BUS_MAP_BOOLEAN_AS_BOOL is set, then each "b" message is
written to a bool pointer.
Follow-up for #8488.
See https://github.com/systemd/systemd/pull/8488#discussion_r175816270.
When we are attempting to create directory somewhere in the bowels of /var/lib
and get an error that it already exists, it can be quite hard to diagnose what
is wrong (especially for a user who is not aware that the directory must have
the specified owner, and permissions not looser than what was requested). Let's
print a warning in most cases. A warning is appropriate, because such state is
usually a sign of borked installation and needs to be resolved by the adminstrator.
$ build/test-fs-util
Path "/tmp/test-readlink_and_make_absolute" already exists and is not a directory, refusing.
(or)
Directory "/tmp/test-readlink_and_make_absolute" already exists, but has mode 0775 that is too permissive (0755 was requested), refusing.
(or)
Directory "/tmp/test-readlink_and_make_absolute" already exists, but is owned by 1001:1000 (1000:1000 was requested), refusing.
Assertion 'mkdir_safe(tempdir, 0755, getuid(), getgid(), MKDIR_WARN_MODE) >= 0' failed at ../src/test/test-fs-util.c:320, function test_readlink_and_make_absolute(). Aborting.
No functional change except for the new log lines.
This is similar to TAKE_PTR() but operates on file descriptors, and thus
assigns -1 to the fd parameter after returning it.
Removes 60 lines from our codebase. Pretty good too I think.
This macro will read a pointer of any type, return it, and set the
pointer to NULL. This is useful as an explicit concept of passing
ownership of a memory area between pointers.
This takes inspiration from Rust:
https://doc.rust-lang.org/std/option/enum.Option.html#method.take
and was suggested by Alan Jenkins (@sourcejedi).
It drops ~160 lines of code from our codebase, which makes me like it.
Also, I think it clarifies passing of ownership, and thus helps
readability a bit (at least for the initiated who know the new macro)
Even if pager_open() fails, in general, we should continue the operations.
All erroneous cases in pager_open() show log message in the function.
So, it is not necessary to check the returned value.
This turns resolve-tool into a multi-call binary. When invoked as
"resolvconf" it provides minimal compatibility with the resolvconf(8)
tool of various distributions (and FreeBSD as it appears).
This new interface understands to varying degrees features of the two
major implementations of resolvconf(8): Debian's original one and
"openresolv". Specifically:
Fully supported:
-a -d (supported by all implementations)
-f (introduced by openresolv)
Somewhat supported:
-x (introduced by openresolv, mapped to a '~.' domain entry)
Unsupported and ignored:
-m -p (introduced by openresolv, not really necessary for us)
Unsupported and resulting in failure:
-u (supported by all other implementations)
-I -i -l -R -r -v -V
(all introduced by openresolv)
--enable-updates --disable-updates --updates-are-enabled
(specific to Debian's implementation)
Of course, resolvconf(8) is a tool with multiple backends, in our
implementation systemd-resolved is the only backend.
Fixes: #7202
Those files don't contain any @variables@, so the configuration step was just
copying them to build/. Let's avoid that, and fix their suffixes while at it.
Apparently O_NONBLOCK is the modern name used in most documentation and
for most cases in our sources. Let's hence replace the old alias
O_NDELAY and stick to O_NONBLOCK everywhere.
DNS queries have a timeout of DNS_TRANSACTION_ATTEMPTS_MAX *
DNS_TIMEOUT_MAX_USEC = 120 s. Calls to the ResolveHostname method of
the org.freedesktop.resolve1.Manager interface have various call
timeouts that are smaller than 120 s. So it seems correct to adjust
the call timeout to the maximum query timeout and to unify the call
timeout among all callers.
A timeout of 120 s might seem large, in particular since BIND does seem
to have a query timeout of 10 s. However, it seems match the timeout
value of 120 s of Unbound. Moreover, the query and timeout handling of
resolve have problems and might be improved in the future, so this
change is at best an interim solution.
log.h really should only include the bare minimum of other headers, as
it is really pulled into pretty much everything else and already in
itself one of the most basic pieces of code we have.
Let's hence drop inclusion of:
1. sd-id128.h because it's entirely unneeded in current log.h
2. errno.h, dito.
3. sys/signalfd.h which we can replace by a simple struct forward
declaration
4. process-util.h which was needed for getpid_cached() which we now hide
in a funciton log_emergency_level() instead, which nicely abstracts
the details away.
5. sys/socket.h which was needed for struct iovec, but a simple struct
forward declaration suffices for that too.
Ultimately this actually makes our source tree larger (since users of
the functionality above must now include it themselves, log.h won't do
that for them), but I think it helps to untangle our web of includes a
tiny bit.
(Background: I'd like to isolate the generic bits of src/basic/ enough
so that we can do a git submodule import into casync for it)
The changes both networkd and resolved to make use of the watch_bind
feature of sd-bus to connect to the system bus. This way, both daemons
can be started during early boot, and automatically and instantly
connect to the system bus as it becomes available.
This replaces prior code that used a time-based retry logic to connect
to the bus.
Let's remove a number of synchronization points from our service
startups: let's drop synchronous match installation, and let's opt for
asynchronous instead.
Also, let's use sd_bus_match_signal() instead of sd_bus_add_match()
where we can.
Enumerating DNS-SD PTR resource records are a special case and
are supposed to have non-unique keys pointing to services of the
same type running on different hosts. There's no need for them
to be checked for conflicts.
Thus don't check for conflicts such RRs.
Refcounting for a RR's key is done separately from refcounting
for the RR itself, but in dns_scope_notify_conflict() we don't
do that. This may lead to a situation when a RR key put in the
conflict_queue hash as a value's key gets freed upon
cache reduction when it's still referenced by the hash.
Thus increase refcount for the key when putting it into the hash
and unreference it upon removing from the hash.
Closes#6456
This reduces the man=false meson target count from 1281 to 1253.
--
A fully scientific test:
git grep _sources, :/*.build|cut -d: -f2|tr -d ' '|sort|uniq -c
reveals that libudev_sources is the only source list now reused twice. There's
some ugly circular dependency between libudev and libshared, and anyway I'm not
sure if we don't want to use different compilation options (LOG_REALM_…) in
those two cases, so I'm leaving that alone for now.
This makes things a bit easier to read I think, and also makes sure we
always use the _unlikely_ wrapper around it, which so far we used
sometimes and other times we didn't. Let's clean that up.
This is useful to debug things, but also to hook up external post-up
scripts with resolved.
Eventually this code might be useful to implement a
resolvconf(8)-compatible interface for compatibility purposes. Since the
semantics don't map entirely cleanly as first step we add a native
interface for pushing DNS configuration into resolved, that exposes the
correct semantics, before adding any compatibility interface.
See: #7202
Let's replace usage of fputc_unlocked() and friends by __fsetlocking(f,
FSETLOCKING_BYCALLER). This turns off locking for the entire FILE*,
instead of doing individual per-call decision whether to use normal
calls or _unlocked() calls.
This has various benefits:
1. It's easier to read and easier not to forget
2. It's more comprehensive, as fprintf() and friends are covered too
(as these functions have no _unlocked() counterpart)
3. Philosophically, it's a bit more correct, because it's more a
property of the file handle really whether we ever pass it on to another
thread, not of the operations we then apply to it.
This patch reworks all pieces of codes that so far used fxyz_unlocked()
calls to use __fsetlocking() instead. It also reworks all places that
use open_memstream(), i.e. use stdio FILE* for string manipulations.
Note that this in some way a revert of 4b61c87511.
RFC 8080 describes how to use EdDSA keys and signatures in DNSSEC. It
uses the curves Ed25519 and Ed448. Libgcrypt 1.8.1 does not support
Ed448, so only the Ed25519 is supported at the moment. Once Libgcrypt
supports Ed448, support for it can be trivially added to resolve.
Currently, we accept SERVFAIL after downgrading fully, cache it and move
on. Let's extend this a bit: after downgrading fully, if the SERVFAIL
logic continues to be an issue, then use a different DNS server if there
are any.
Fixes: #7147
This makes sure that a classic DNS scope that has no DNS servers
assigned is never considered for routing requests to even if it has
matching search/routing domains associated.
This is inspired by #7544, where lookup requests are refused since a
scope with no DNS server is configured. This change does not deliver
what the reporter intended, but is generally useful in general, as it
makes us mor robust to misconfiguration.
The user might replace a foreign /etc/resolv.conf with a symlink to one
of ours between the time we did stat() and open the file. Hence, let's
check the fstat() data right after opening the file, a second time.
It might happen that a DNS-SD service doesn't include local host's
name in its RR keys and still conflicts with a remote service.
In this case try to resolve the conflict by changing name for
this particular service.
As discussed in RFC 6762, Section 8.2 a race condition may
happen when two hosts are probing for the same name simultaniously.
Detect and handle such race conditions.
According to RFC 6762 Section 8.2 "Simultaneous Probe Tiebreaking"
probing queries' Authority Section is populated with proposed
resource records in order to resolve possible race conditions.
From RFC 6762, Section 10.2
"They (the rules about when to set the cache-flush bit) apply to
startup announcements as described in Section 8.3, "Announcing",
and to responses generated as a result of receiving query messages."
So, set the cache-flush bit for mDNS answers except for DNS-SD
service enumerattion PTRs described in RFC 6763, Section 4.1.
According to p5.3 of RFC6762 (Multicast DNS) one mDNS query message
can contain more than one question sections.
Generate answers for all found questions and put them to a reply
message.
This might very well happen due to races between joining multicast
groups and network configuration and such, let's not complain, but just
drop the messages at debug level.
Fixes: #7527
So far I avoided adding license headers to meson files, but they are pretty
big and important and should carry license headers like everything else.
I added my own copyright, even though other people modified those files too.
But this is mostly symbolic, so I hope that's OK.
This creates a second private resolve.conf file which lists the stub resolver
and the resolved acquired search domains.
This runtime file should be used as a symlink target for /etc/resolv.conf such
that non-nss based applications can resolve search domains.
Fixes: #7009
This adds "systemd-resolve --reset-server-features" for explicitly
forgetting what we learnt. This might be useful for debugging
purposes, and to force systemd-resolved to restart its learning logic
for all DNS servers.
When the network configuration changes we should relearn everything
there is to know about the configured DNS servers, because we might talk
to the same addresses, but there might be different servers behind them.
When we a reply message gets longer than the client supports we need to
truncate the response and set the TC bit, and we already do that.
However, we are not supposed to send incomplete RRs in that case, but
instead truncate right at a record boundary. Do that.
This fixes the "Message parser reports malformed message packet."
warning the venerable "host" tool outputs when a very large response is
requested.
See: #6520
The configuration option was called -Dresolve, but the internal define
was …RESOLVED. This options governs more than just resolved itself, so
let's settle on the version without "d".
The advantage is that is the name is mispellt, cpp will warn us.
$ git grep -Ee "conf.set\('(HAVE|ENABLE)_" -l|xargs sed -r -i "s/conf.set\('(HAVE|ENABLE)_/conf.set10('\1_/"
$ git grep -Ee '#ifn?def (HAVE|ENABLE)' -l|xargs sed -r -i 's/#ifdef (HAVE|ENABLE)/#if \1/; s/#ifndef (HAVE|ENABLE)/#if ! \1/;'
$ git grep -Ee 'if.*defined\(HAVE' -l|xargs sed -i -r 's/defined\((HAVE_[A-Z0-9_]*)\)/\1/g'
$ git grep -Ee 'if.*defined\(ENABLE' -l|xargs sed -i -r 's/defined\((ENABLE_[A-Z0-9_]*)\)/\1/g'
+ manual changes to meson.build
squash! build-sys: use #if Y instead of #ifdef Y everywhere
v2:
- fix incorrect setting of HAVE_LIBIDN2
Previously, if a PTR query is seen for a non-existing record, we'd
generate an empty response (but not NXDOMAIN or so). Fix that. If we
have no data about an IP address, then let's say so, so that the
original error is returned, instead of anything synthesized.
Fixes: #6543
Let's ensure that "no gateway" translates to "no domain", instead of an
empty reply. This is in line with what nss-myhostname does in the same
case, hence let's unify behaviour here of nss-myhostname and resolved.
This changes the symbolic name for the default gateway from "gateway" to
"_gateway". A new configuration option -Dcompat-gateway-hostname=true|false
is added. If it is set, the old name is also supported, but the new name
is used as the canonical name in either case. This is intended as a temporary
measure to make the transition easier, and the option should be removed
after a few releases, at which point only the new name will be used.
The old "gateway" name mostly works OK, but hasn't gained widespread acceptance
because of the following (potential) conflicts:
- it is completely legal to have a host called "gateway"
- there is no guarantee that "gateway" will not be registered as a TLD, even
though this currently seems unlikely. (Even then, there would be no
conflict except for the case when the top-level domain itself was being resolved.
The "gateway" or "_gateway" labels have only special meaning when the
whole name consists of a single label, so resolution of any subdomain
of the hypothetical gateway. TLD would still work OK. )
Moving to "_gateway" avoids those issues because underscores are not allowed
in host names (RFC 1123, §2.1) and avoids potential conflicts with local or
global names.
v2:
- simplify the logic to hardcode "_gateway" and allow
-Dcompat-gateway-hostname=true as a temporary measure.
As a follow-up for db3f45e2d2 let's do the
same for all other cases where we create a FILE* with local scope and
know that no other threads hence can have access to it.
For most cases this shouldn't change much really, but this should speed
dbus introspection and calender time formatting up a bit.