Commit graph

16691 commits

Author SHA1 Message Date
Ondřej Surý
5f068a307f Remove isc_mem_strndup()
The function had a single caller, the HTTP/2 request-path handling in
the network manager, and its semantics (strlen() of the source clamped
to the requested size) amounted to an obscured bounded string copy.
Replace the only use with a plain allocation and strlcpy(), and drop the
function.
2026-06-12 17:40:05 +02:00
Aram Sargsyan
fb27599b58 Fix a bug in xfrin.c:xfrin_connect_done()
When the connect callback's result is ISC_R_SUCCESS and the callback
changes the result because of some condition, the 'xfr' should not
be detached, because it now belongs to the receive callback.

Detach the reference only if the callback's result is non-success.
2026-06-12 14:59:47 +00:00
Colin Vidal
ebd4e26f46 Fix delegdb dump buffer overflow
A buffer used to dump a DNS name in the delegdb dump flow was using the
wrong size: it was using `DNS_NAME_MAXWIRE` which is the actual max
length of a DNS name on the wire instead of using `DNS_NAME_FORMATSIZE`
which is the maximum length of a textual representation of a DNS name
(which can be way longer than `DNS_NAME_MAXWIRE` if using the master
file escape sequence format) plus 1 (end of string byte). This could
lead to a buffer overflow. This is now fixed.
2026-06-12 15:45:22 +02:00
Aram Sargsyan
8448136b16 Fix 'geoip' ACL matching bug
The geoip2.c:match_string() function can incorrectly return 'true'
when matching strings of different lengths (i.e. it matches a
substring). Return 'false' when the lengths of the matched strings
are different.
2026-06-11 13:28:21 +00:00
Matthijs Mekking
008ad8ade2 ISC_ATTR_UNUSED in favor of UNUSED() 2026-06-11 10:53:56 +00:00
Matthijs Mekking
97b9c9c823 Keep our key on update removing DNSKEY RRset
When a Dynamic Update is received that removes the DNSKEY (or CDNSKEY,
or CDS) RRset, remove all records except the ones that are in use
for signing for the zone (with dnssec-policy).
2026-06-11 10:53:56 +00:00
Ondřej Surý
8b2c490811 Enforce strict RSA DNSKEY shape during DNSSEC validation
A resolver that validated DNSSEC accepted RSA DNSKEYs of any modulus
size up to OpenSSL's compile-time ceiling, and accepted any public
exponent the wire format could carry.  RSA verification cost grows
sharply with the modulus length, so an authoritative server could
publish an oversized DNSKEY to make each signature check on the
resolver many times more expensive than for a normally sized key.

The intended verify-time cap had no effect because the helper it called
returned the public-exponent bit length rather than the modulus bit
length, so the test was always satisfied.  Replace it with an honest
modulus-range check and a stricter exponent check that accepts only odd
exponents in the closed range [3, 2^32 + 1] (covering every Fermat
prime up to F5 and the odd intermediate values seen in deployed keys),
reject anything outside those bounds at every RSA key load path so an
invalid key never reaches the verifier, and keep the same checks at the
verifier as a backstop against future load paths.
2026-06-08 18:00:27 +02:00
Ondřej Surý
b11bf7a45e
Use sequentially consistent ordering in the adaptive rwlock handshake
The adaptive isc_rwlock (the modified C-RW-WP variant) synchronizes a
reader against a writer through a store-buffer handshake across two
independent atomic objects: the reader publishes its arrival in
readers_ingress and then reads writers_lock, while the writer publishes
its lock in writers_lock and then reads the reader indicator. With the
acquire/release ordering introduced by the 2021 simplification, neither
side is forced to observe the other's publish store before its own check
load, so on weak-memory targets a reader could see writers_lock unlocked
while the writer sees the indicator empty, and both would enter their
critical sections at once.

Restore the sequentially consistent ordering the original algorithm
specifies on the handshake atomics. The single total order over the
seq_cst operations is what forbids the overlap; targeting individual
fences is both more fragile and, on x86, more expensive. On x86 this
ordering is free (seq_cst loads remain plain loads and the RMWs remain
lock-prefixed); the added cost falls only on the weak-memory targets that
actually need it.
2026-06-08 17:51:17 +02:00
Ondřej Surý
9771df0aca Disallow configuration of user-defined non-IN class views
Only class IN is allowed for user-defined views; the internally
generated `_bind` view stays in the CH class. Both `named` and the
shared checker in `lib/isccfg/check.c` now reject non-IN views, so a
config can no longer pass `named-checkconf` yet fail to start in
`named`.

Tests, configs, and catalog zones using CH or arbitrary classes
(e.g. `class10`) are removed accordingly.
2026-06-08 17:50:15 +02:00
Ondřej Surý
04da917e35 Drop support for the HS class
HS (Hesiod) class is no longer supported and is treated same as
any other unknown class.
2026-06-08 17:50:15 +02:00
Ondřej Surý
43a85a5bca Use variable size struct for zonecut ndata to avoid allocation
Previously, the node_deleg_t would do double allocation, one for the
struct itself and one for the zonecut.  This has been changed to use
variable sized struct with the zonecut .ndata buffer attached to the end
of node_deleg_t structure.
2026-06-08 17:48:02 +02:00
Michal Nowak
ddfe8a1bdc Preserve the request buffer across async SIG(0) processing
For SIG(0)-signed requests, view matching is offloaded and the request
is finished asynchronously from ns_client_request_continue(), which
passes client->inner.buffer to dns_dt_send().  That buffer aliases the
network manager's receive buffer, only valid during the read callback,
so it may already be freed and reused, producing garbage dnstap frames
(e.g. the "upforwd" sig0-over-DoT test fails with UQ=0).

When the request is offloaded (ns_client_setup_view() returns
DNS_R_WAIT) and dnstap is enabled, copy the request buffer and point
client->inner.buffer at the copy so it survives the asynchronous hop;
free it in ns__client_reset_cb().  When dnstap is disabled there is no
async consumer of the buffer, so detach it from the receive buffer
instead.

Assisted-by: Claude:claude-opus-4-8
2026-06-08 11:30:28 +00:00
Ondřej Surý
bb43ecaf51
Fix use-after-free when destroying the bad and unreachable caches
Eviction of an entry owned by another loop was bounced to that loop via
isc_async_run(), so a queued list removal could run after the cache had
freed its LRU lists.  Use a single mutex-guarded LRU list instead, removing
entries synchronously under the lock, and let each entry hold its own
memory-context reference so the RCU free never touches a gone loop.
2026-06-04 15:54:21 +02:00
Colin Vidal
844eee509b Add DTrace support for resolver queries
When `fctx_query()` is called, a DTrace probe (if enabled) prints the
fetch context address, the upstream server address and port, and the
latest known SRTT for the server.
2026-06-04 13:55:56 +02:00
Ondřej Surý
c50e96f8d6
Simplify the delegation database LRU to a single shared SIEVE
The delegation database kept one SIEVE LRU list per loop so that node
eviction could run lock-free on each node's owning loop; this required
every node to hold a loop reference and to defer its own destruction to
that loop via isc_async_run().  Move the SIEVE unlink into the QP write
transaction, taking the evicted node directly from dns_qp_deletename(),
which serialises every list mutation under the qpmulti writer lock and
lets a single shared list replace the per-loop arrays.  Node and database
teardown are now synchronous.

The QP trie and the SIEVE list are wrapped in a reference-counted holder.
Each node keeps a reference to the holder so it (and its memory context)
stays valid until the node is destroyed, while shutdown drains the SIEVE
and destroys the trie from an RCU callback and frees the holder once the
last node drops its reference.  Reuse across a reconfiguration now moves
ownership of the holder to the new view instead of sharing it through a
separate owners counter, so dns_delegdb_reuse() is removed.
2026-06-04 12:52:30 +02:00
Ondřej Surý
a570e37c06
Only update the global tid_count once
Normally, the tid_count is initialized only once at the beginning of the
application.  The only exception is the pattern in the unit test where
isc_loopmgr is repeatedly created and torn down and each creation of
isc_loopmgr_t calls isc__tid_initcount() with the previous value.

ThreadSanitizer sees that as write operation on unprotected memory are
reports this as data race even though the value has not really changed.

This has been fixed by skipping the tid_count value update on repeated
calls.
2026-06-04 12:03:29 +02:00
Evan Hunt
07c8c1d242 Add "warn_unused_result" to attributes.h
A new ISC_ATTR_WARN_UNUSED_RESULT macro now defines
__attribute__((warn_unused_result)) and is used for
dns_rdata_fromstruct().
2026-06-04 08:20:33 +00:00
Evan Hunt
175d418e28 Check dns_rdata_fromstruct() return values
The in some functions implementing RFC 5011 key maintenance, the
results of dns_rdata_fromstruct() were not checked. This has been
fixed.
2026-06-04 08:20:33 +00:00
Colin Vidal
0811c2deb2 Fix isc__tid_initcount() REQUIRE
`isc__tid_initcount()` was checking that the current number of thread
didn't exceed `ISC_TID_MAX`, not the newly assigned number. This is now
fixed.
2026-06-02 16:16:12 +02:00
Alessio Podda
19a84c78d0 Fix stdc_count_zeros/stdc_count_ones polyfill mismatch
A previous commit introduced a latent bug where the wrong popcount
definition was used when overriding the compilation mode to C23.
This commit fixes it.
2026-06-02 10:18:13 +02:00
Michal Nowak
36fe37f6b4 Fix spelling typos in comments and code
Assisted-by: Mistral Vibe:mistral-medium-3.5
Assisted-by: Claude:claude-opus-4-7
2026-06-01 16:58:16 +02:00
Ondřej Surý
dc3e15d7a6 Funnel the validator's DS fetches through a single helper
The validator starts a DS fetch from three places while building or
proving a trust chain.  Only validate_dnskey() handed the resolver the
parent zone cut as a delegation hint; the other two started the fetch
with no hint at all.

Factor the shared setup into create_ds_fetch() and route all three
through it, so every validator DS fetch is created identically and
carries the parent zone cut.  DS is an at-parent type, so the resolver
already anchors such a query at the parent on its own; supplying the
zone cut explicitly also lets the resolver's fetch loop detection match
the fetch by domain, which it cannot do for a fetch with no hint.

Assisted-by: Claude:claude-opus-4-8
2026-05-30 05:59:31 +02:00
Ondřej Surý
938b58a809
Fail promptly on an RRSIG answer with no usable record
A query for an RRSIG is handled as a subset of ANY, so rctx_answer_any()
filters out records that do not match the queried type. When every
record was filtered out (an answer carrying only unrelated types), the
function still returned success with nothing cached, and the fetch then
waited for a validator that was never started until the backstop fetch
timer fired ~12s later. Treat an all-filtered answer as a broken
response, matching how non-meta types already reject a reply with no
usable record.
2026-05-29 22:01:29 +02:00
Ondřej Surý
d0c6219d66
Detect non-advancing alias chains in the validator
The resolver turned a CNAME response to an RRSIG or NSEC query into
FORMERR inside rctx_answer_cname().  That is redundant -- every caller
already copes with a DNS_R_CNAME or DNS_R_DNAME result -- and it is the
wrong layer, because the resolver cannot tell a legitimate alias from a
broken one.  Drop it; a CNAME for one of these types now flows back as
an ordinary alias.

The case that must be stopped lives in the validator.  While proving an
unsigned CNAME insecure, proveunsecure() fetches the DS for the CNAME's
own name; because fetches are shared, that fetch re-enters and stalls on
the in-flight fetch the validator is waiting for, deadlocking for about
twelve seconds (GL#5878).  Unlike the resolver, the validator knows it
is validating an alias, so check_chaining() now aborts a fetch whose
name matches the chaining rdataset's owner: it cannot advance the chain
and would only self-join.
2026-05-29 22:01:29 +02:00
Ondřej Surý
0e04671b65
Detect resolver fetch loops only when joining an in-flight fetch
dns_resolver_createfetch() guarded against fetch loops by comparing the
raw request name/type/domain before any fetch context existed. Move the
check after the context is obtained and run it against the context
itself, and only when we joined an already in-flight context
(!new_fctx) that is also an ancestor in the parent chain. That is the
real loop condition: the new fetch would block waiting on a fetch that
is itself waiting on us. A newly created context waits on nothing, so it
proceeds, bounded by the fetch depth limit and the complementary ADB
loop detection.
2026-05-29 17:08:55 +02:00
Alessio Podda
7502e73333
Remove name boundary optimization
In MR !9740, we introduced an optimization that reduces memory usage
by processing rdatas in batches during AXFR.

The maximum batch size is 128, but the batch size was allowed to grow
beyond that limit if all rdatas in a batch were for the same name, as
that allows a more efficient optimization.

This optimization could theoretically allow the batch size arbitrarily
for a sufficient large zone transfer. Since synthetic tests don't show
any performance improvement from the optimization, this MR removes it.
2026-05-29 09:50:11 +02:00
Matthijs Mekking
21791160b3 Check options in templates that must be non-zero
Without this, named-checkconf falsely accepts badly configured zones,
and rndc addzone would abort on adding a zone referring to such
templates.
2026-05-28 14:25:48 +00:00
Colin Vidal
ad92081c5b Add system test for delegdb size preservation across rndc flush
Test that flushing the delegdb via `rndc flush` preserves its
configured size limit.  The test checks delegdb watermarks after
`named` startup, flushes caches, and verifies that the delegdb
watermarks are correctly restored afterwards.

To distinguish between the previous `delegdb` memory contexts and the
new ones, we need to know exactly when the previous `delegdb` memory
contexts are removed (this is not immediate, since those are removed
during RCU reclamation phase). A trace is therefore added when a memory
context is destroyed, if `ISC_MEM_DEBUGTRACE` is set.
2026-05-28 13:59:38 +02:00
Colin Vidal
3610fd2df9 Add delegdb configuration struct
Instead of having independent APIs to configure various aspects of the
delegdb (i.e. cache size, other settings that may come up later), a
single configuration struct is passed to `dns_delegdb_setconfig()`, which
internally does all the plumbing. To avoid relying on
atomics/synchronization, `dns_delegdb_setconfig()` must be called from
exclusive mode (for now).

The configuration can be retrieved at any time (not necessarily from
exclusive mode) using `dns_delegdb_getconfig()`. This is useful, for
instance, to flush the delegdb without losing its parameters.
2026-05-28 13:59:38 +02:00
Ondřej Surý
b9c1b90b50 Drop RFC 2535 special-casing of the KEY record type
After SIG and NXT lost their special handling, KEY remained the only
RFC 2535-era type still receiving coexistence allowances: KEY
alongside CNAME at the same owner, KEY answered from the parent side
of a zone cut, KEY kept across CNAME eviction in the cache.  RFC 3755
retains type 25 only for SIG(0) and TKEY transaction signatures, and
neither relies on those allowances in practice.  The in-tree comment
that flagged the RFC 3007 parent-side carve-out as "unclear" predicted
this cleanup.

Zones that publish CNAME and KEY at the same owner — already invalid
under RFC 2181 — now fail to load.  System test fixtures are updated
accordingly, and a new test asserts that SIG, NXT, and KEY records
pick up covering RRSIGs when their zone is signed.
2026-05-28 13:21:00 +02:00
Ondřej Surý
2de202a6b7 Stop treating SIG and NXT records specially
RFC 3755 retired SIG and NXT in favour of RRSIG and NSEC.  BIND still
warned about them at zone load, refused them in dynamic updates,
parsed SIG with a non-zero "type covered" field as a signature on an
RRset, and tracked them via dns_rdatatype_issig().  Those carve-outs
were the sole path that made the GL#5818 crash class reachable.

Treat both types as ordinary unknown rdata: they load, transfer, sign
and answer like any other record, and dynamic updates carry them
through the generic path.  SIG(0) is unaffected; its message-parsing
carve-out is preserved.
2026-05-28 13:21:00 +02:00
Ondřej Surý
4bfd18d08d Skip DNS64 synthesis when answering a redirected response
redirect2() swaps qctx->db to the redirect zone before
query_nodata() runs. The DNS64 fallback there issues an A lookup
for the original query name, which is out of zone for the
redirect db, and the resulting query_notfound() trips
INSIST(!is_zone). The cached NCACHENXRRSET variant trips a
REQUIRE in dns_rdataset_first() on a disassociated rdataset.
The synth-from-dnssec entry reaches the same fallback via
query_coveringnsec(). Guarding the fallback with
!qctx->redirected leaves the nxdomain-redirect NXRRSET answer to
be served as-is.
2026-05-28 11:11:17 +02:00
Matthijs Mekking
77649ea8e2 Fix startup crash on bump in the wire signer
A secondary server that is configured as a bump in the wire signer
with inline-signing implicitly enabled via dnssec-policy requires
a 'file' entry.
2026-05-27 09:45:11 +00:00
Michal Nowak
5ebf17741f Generate Ed25519/Ed448 keys via PKCS#11 when a label is set
When a dst_key_t carries a PKCS#11 URI in key->label (as named
does for dnssec-policy zones backed by a key-store "hsm"), key
generation must happen inside the HSM, not in software.
opensslecdsa_generate already branches on key->label and calls
the matching pkcs11 wrapper; the EDDSA generator silently ignored
the label and produced a software key, which named then wrote to
the .private file with both a Label: line and the raw PrivateKey:
bytes -- a corrupt hybrid record that prevented zone signing.

Add the missing wrapper:

  - lib/isc/ossl_wrap/ossl3.c gains generate_pkcs11_eddsa_key()
    and the public isc_ossl_wrap_generate_pkcs11_ed25519_key() /
    isc_ossl_wrap_generate_pkcs11_ed448_key() entry points.  They
    use EVP_PKEY_CTX_new_from_name(NULL, "ED25519" or "ED448",
    "provider=pkcs11") with the pkcs11_uri and pkcs11_key_usage
    parameters, mirroring the existing EC wrapper.
  - lib/isc/ossl_wrap/ossl1_1.c provides stubs returning
    ISC_R_NOTIMPLEMENTED for the new EDDSA wrappers; the
    pkcs11-provider stack requires OpenSSL 3.  The pre-existing
    isc_ossl_wrap_generate_pkcs11_rsa_key() stub used to silently
    delegate to software keygen -- that hid the same "HSM label
    on a software key" hazard for RSA on OpenSSL 1.1 builds, so
    align it with the EDDSA stubs and return ISC_R_NOTIMPLEMENTED
    too.
  - lib/isc/include/isc/ossl_wrap.h declares the new wrappers.
  - lib/dns/openssleddsa_link.c routes openssleddsa_generate()
    through the new wrappers when key->label is non-NULL, leaving
    the existing EVP_PKEY_keygen() path untouched for software
    keys.  The Ed448 case is guarded by HAVE_OPENSSL_ED448 to
    match the surrounding code.

Assisted-by: Claude:claude-opus-4-7
2026-05-26 11:18:56 +00:00
Michal Nowak
945838e621 Tolerate non-extractable Ed25519/Ed448 private keys in tofile
openssleddsa_tofile() called EVP_PKEY_get_raw_private_key()
unconditionally whenever the dst_key_t had a private EVP_PKEY
attached and aborted with ISC_R_FAILURE on any error.  That is
wrong for keys whose private material lives in a hardware token
(PKCS#11): the provider deliberately refuses to export the raw
bytes, but the keypair is still valid and the .private file
should be written containing only the PKCS#11 label, with no raw
key material.  Without this, "dnssec-keyfromlabel -a ed25519 -l
pkcs11:..." fails with "failed to write key ...: failure" even
though pkcs11-tool has generated a valid Ed25519 key in SoftHSM.

Mirror the behaviour already implemented in opensslecdsa_tofile():
if the raw private key cannot be retrieved AND the key has a
PKCS#11 label to fall back on, clear the OpenSSL error queue and
fall through to writing just the Label element.

If extraction fails and there is no label to fall back on, return
the OpenSSL failure rather than silently producing a .private
file with neither raw key material nor a label, which would be
unusable on the next load.

Consolidate buffer cleanup into a single cleanup: path, freeing
with the original allocation size (alginfo->key_size) rather than
the potentially-modified len output parameter.

Assisted-by: Claude:claude-opus-4-7
2026-05-26 11:18:56 +00:00
Michal Nowak
2ef6f0b4f2 Fix stack corruption in copy_initfile() on 32-bit
copy_initfile() declared a size_t local variable to receive the size of
the initial file and passed it to isc_file_getsizefd() with an explicit
(off_t *) cast. On 32-bit platforms with _FILE_OFFSET_BITS=64, off_t is
8 bytes while size_t is only 4 bytes, so isc_file_getsizefd()'s
"*size = stats.st_size;" writes 8 bytes into the 4-byte slot and
clobbers the adjacent "output" FILE * on the stack. The next iteration
of the read/write loop then calls clearerr() through a NULL pointer and
named crashes with SIGSEGV.

This is triggered whenever a zone with an initial-file (e.g. one
configured via a template) is loaded for the first time, so on 32-bit
the addzone and masterfile system tests crash named in ns2 with cores.

Declare "len" as off_t to match the API and drop the unsafe cast.

Assisted-by: Claude:claude-opus-4-7
2026-05-21 16:56:46 +02:00
Colin Vidal
3594b0ff7a renamed cfg_map_addclone() into cfg_map_add()
Since there is no `cfg_map_add()` anymore, and `cfg_map_addclone()`
wasn't quite clear, let's rename to `cfg_map_addclone()` into
`cfg_map_add()`, as this is fundamentally what this function is doing.
2026-05-21 16:04:01 +02:00
Colin Vidal
d1c55d78c6 fix possible NULL dereference in cfg_map_findclause()
`cfg_map_findclause()` did not check whether a clause existed before
dereferencing it, which could lead to a NULL dereference. Add the
missing check to prevent this.

In practice, this was not triggering any known bug, since
`cfg_map_findclause()` is only called in contexts where the clause is
known to exist.
2026-05-21 16:04:01 +02:00
Colin Vidal
b0f7bb3495 remove unused cfg_map_add() function
Function `cfg_map_add()` was unused, it is now removed.
2026-05-21 16:04:01 +02:00
Michal Nowak
075305dcc7 Move last_purge declaration under the same #ifdef as its user
The static atomic last_purge is only read and written from mem_purge(),
which is compiled only when JEMALLOC_API_SUPPORTED or __GLIBC__ is
defined. This used to fail on OpenBSD:

    ../lib/isc/mem.c:405:31: error: unused variable 'last_purge' [-Werror,-Wunused-variable]
      405 | static _Atomic(isc_stdtime_t) last_purge = 0;
          |                               ^~~~~~~~~~
2026-05-21 15:10:20 +02:00
Evan Hunt
3ff0018308
Clear REDIRECT flag when it isn't needed
The NS_QUERYATTR_REDIRECT flag is set when processing a recursive
NXDOMAIN redirection lookup, so that if that lookup also returns
NXDOMAIN we don't end up looping.

Previously, the flag was left active after use, but if the
same client triggered a subsequent recursive lookup (for example,
in the filter-aaaa plugin), then the wrong branch could be reached
in query_resume(), potentially leading to an assertion failure.  This
has been fixed.
2026-05-21 14:58:39 +02:00
Andoni Duarte
6cae1d10ca Merge tag 'v9.21.22'
Some checks are pending
CodeQL / Analyze (push) Waiting to run
SonarCloud / Build and analyze (push) Waiting to run
2026-05-20 10:26:28 +00:00
Ondřej Surý
965995c66a
Properly handle the return value of BN_num_bits()
BN_num_bits() returns 0 when passed NULL and a negative value on
internal error.  The OpenSSL wrappers stored the result in a size_t,
so a 0 return falsely satisfied the bit-length check and a negative
return wrapped to a huge value.  Capture the int return, reject
non-positive values, then compare against the limit.
2026-05-19 19:21:49 +02:00
Ondřej Surý
c28ba9c3c6
Reject malformed RRSIG records
A signature cannot cover a meta-type (NONE, ANY, AXFR, IXFR, MAILB,
MAILA, OPT, TSIG, TKEY); previously such records were cached by the
recursive resolver and collided with negative-cache entries on the
same owner name, corrupting the QP-trie cache.

Assisted-by: Claude:claude-opus-4-7
2026-05-19 13:21:48 +02:00
Matthijs Mekking
1abd977f43
Don't remove corresponding RRSIG in the same loop
The dns_db_deleterdataset() removing the corresponding signature
within the iterator is wrong, because it mutates an rdataset
that is not the current one.
2026-05-19 11:19:47 +02:00
Ondřej Surý
08295d004e
Skip EDNS UDP-size hint on TCP retries
The hint feeds the EDNS OPT UDP-size field, which has no effect on TCP
transport.  Avoid the dns_adb_getudpsize() lookup when the query is
already pinned to TCP.

Assisted-by: Claude:claude-opus-4-7
2026-05-19 11:18:30 +02:00
Ondřej Surý
a9283c08c2
Emit EDE 22 when the resolver runs out of usable addresses
Two exits from fctx_try() landed at DNS_R_SERVFAIL without attaching
DNS_EDE_NOREACHABLEAUTH: when fctx_getaddresses() returned a non-success,
non-wait status, and when every candidate addrinfo was unusable
(over-quota or filtered) after a restart.

With the new TCP fallback actually firing, those paths are now reached
by serve-stale and similar scenarios in which the auth is unreachable.
Attach the EDE so SERVFAIL responses keep carrying the same operator
signal that the timeout-based exit paths already produce.

Co-authored-by: Evan Hunt <each@isc.org>
Assisted-by: Claude:claude-opus-4-7
2026-05-19 11:18:30 +02:00
Ondřej Surý
1af37e24b2
Open the stale-refresh-time window on any resolver failure
The TCP-fallback fix in the previous commits means a query that would
previously have timed out on UDP now actually escalates to TCP, and a
TCP-side failure surfaces a non-ISC_R_TIMEDOUT result code to
query_usestale().  The trigger for DNS_DBFIND_STALESTART was previously
narrowed to ISC_R_TIMEDOUT, so the stale-refresh-time window stopped
opening for those clients.

Broaden the condition to any failure that has already cleared the
upstream DUPLICATE/DROP filtering in query_usestale() — the spirit of
the window is "the resolver tried and could not get a fresh answer",
not "the resolver timed out specifically".

Co-authored-by: Evan Hunt <each@isc.org>
Assisted-by: Claude:claude-opus-4-7
2026-05-19 11:18:30 +02:00
Ondřej Surý
59c00a6f31
Force TCP after repeated UDP timeouts to the same authoritative
Make the decision in fctx_query() before the dispatch is bound so the
chosen transport and the DNS_FETCHOPT_TCP flag agree.  The previous
location in resquery_send() ran after the UDP dispatch had already been
attached, so the flag flip had no effect on the wire.

Moving the decision earlier also means FCTX_ADDRINFO_NOEDNS0 servers,
previously exempt, now escalate to TCP too.  TCP works regardless of
EDNS state, so this is the intended behaviour.

Assisted-by: Claude:claude-opus-4-7
2026-05-19 11:18:30 +02:00
Ondřej Surý
01523a078a
Temporarily remove TCP fallback after UDP timeouts
The retry path in resquery_send() that flipped DNS_FETCHOPT_TCP on a
query whose dispatch had already been bound as UDP in fctx_query() had
no effect on the transport actually used, but did leave a stale TCP
bit visible to downstream consumers (dnstap framing, cookie checks,
the AUTHORITY-NS spoofability guard).

The ineffective code has been removed from resquery_send().  The
TCP fallback functionality will be corrected and restored in the next
commit.

Assisted-by: Claude:claude-opus-4-7
2026-05-19 11:18:16 +02:00