Commit graph

9846 commits

Author SHA1 Message Date
Andoni Duarte
3d293e2a1e Merge tag 'v9.20.23' into bind-9.20 2026-05-20 10:18:28 +00:00
Ondřej Surý
f756907f24 Guard parent-NS walk against running off the root
Once the walk reaches the root, splitting one more label off would
trip an internal assertion and abort named.  Stop cleanly with
ISC_R_NOTFOUND so the dispatcher cancels the fetch.  Only reachable
through misconfiguration (root configured as a primary with parental
agents, or a parent zone that NODATAs its own NS).

Assisted-by: Claude:claude-opus-4-7

Manually edited to resolve conflicts.

(cherry picked from commit 141e8110f7)
2026-05-19 15:30:34 +02:00
Ondřej Surý
6b02ecb3f1
Remove TCP fallback after UDP timeouts
The retry path in resquery_send() that flipped DNS_FETCHOPT_TCP on a
query whose dispatch had already been bound as UDP in fctx_query() had
no effect on the transport actually used, but did leave a stale TCP
bit visible to downstream consumers (dnstap framing, cookie checks,
the AUTHORITY-NS spoofability guard).

The ineffective code has been removed from resquery_send().  The
TCP fallback functionality will be corrected and restored in the
BIND 9.22.

Assisted-by: Claude:claude-opus-4-7
(cherry picked from commit 01523a078a)
2026-05-19 12:26:04 +02:00
Ondřej Surý
38dd0e0ccc Switch UDP fetches to TCP on the first response with a wrong query id
Until now, the dispatcher silently dropped UDP responses from the
expected peer that carried the wrong DNS message id and kept listening
for the correct id to arrive within the read timeout.  An off-path
attacker who knows the destination address and source port of an
outgoing fetch could exploit that quiet retry window to flood the
resolver with guessed responses; with a gigabit link the per-query
success probability grows linearly with the number of guesses that
arrive before the legitimate answer or the timeout.

Treat any such mismatch as a possible spoofing attempt and let the
resolver immediately retry the same query over TCP, the same control
path the truncation handler already uses.

Add a resolver statistics counter - exposed as 'queries retried over TCP
after a response with mismatched query id' in rndc stats and
'MismatchTCP' in the statistics channel

Assisted-by: Claude:claude-opus-4-7
(cherry picked from commit 11bca1051f)
2026-05-15 08:49:19 +02:00
Ondřej Surý
fbf5ecb5a8 Fix data race in async master dump/load context publication
Bouncing the offload itself to the target loop let the after-work
callback fire on the target thread and run the user's done callback
before the calling thread had published *dctxp / *lctxp.  Enqueue on
the calling loop and bounce only the done callback instead, so the
publish is sequenced before the cross-thread hand-off by construction
and cannot be reintroduced by reordering the entry-point body.

(cherry picked from commit 8ae464d552)
2026-05-14 06:53:23 +00:00
Ondřej Surý
ddf6274ba2
Cap glue records cached from a referral
The resolver marked every NS RR's glue from a referral for caching with
no aggregate bound, so a parent server returning many NS RRs and many
glue addresses per NS could inflate cache memory long beyond what
resolution can ever use.

Truncate each glue rdataset to DELEG_MAX_GLUES_PER_NS (20) A and 20 AAAA
records before marking it for caching.  The NS RRset itself is still
cached in full, bounded by max-records-per-type.
2026-05-12 16:18:40 +02:00
Ondřej Surý
c4d2675f48
Remove the disabled CHECK_FOR_GLUE_IN_ANSWER code
The CHECK_FOR_GLUE_IN_ANSWER macro defaulted to 0 and was never enabled
by the build system, leaving check_answer() and the answer-section glue
scan in rctx_referral() as dead code.  Drop them so the surrounding
referral-cache path is easier to reason about.
2026-05-12 16:18:16 +02:00
Evan Hunt
ec1404cc4c
Skip "deny-answer-address" for non-IN addresses
Ensure that we don't attempt an ACL match for answer addresses
when handling a class-CHAOS zone. This is an additional line of
defense for YWH-PGM40640-74.

(cherry picked from commit e62673c765b52307c800e86f0185fe52b573c145)
2026-05-07 13:09:18 +02:00
Mark Andrews
a40325c6a2
Reject meta-classes in UPDATE and NOTIFY messages
NOTIFY and UPDATE messages must specify a data class in the
QUESTION/ZONE section.  NONE and ANY are meta-classes and not
appropriate here.  Return FORMERR if either is used.

Rejecting messages with a query class of NONE addresses YWH-PGM40640-72,
YWH-PGM40640-82, and YWH-PGM40640-83.  Rejecting messages with a query
class of ANY addresses YWH-PGM40640-87, YWH-PGM40640-88, and
YWH-PGM40640-117.

Fixes: isc-projects/bind9#5778
Fixes: isc-projects/bind9#5782
Fixes: isc-projects/bind9#5783
Fixes: isc-projects/bind9#5797
Fixes: isc-projects/bind9#5798
Fixes: isc-projects/bind9#5853

(cherry picked from commit c66a1b1e1bfd6c79d7b9bc8d4a59e69f4faa1563)
2026-05-07 13:09:18 +02:00
Evan Hunt
134706912f
Disable UPDATE and NOTIFY for non-IN classes
Return NOTIMP for UPDATE and NOTIFY requests received for views with a
class other than IN.  Only QUERY is now supported for non-IN views such
as CHAOS.

When running dns dns_rdata_tostruct() with types that are only defined
for class IN, ensure that the class is correct before proceeding.

Add an assertion that any zone being updated is of class IN. (Note
that previously, a DLZ zone could have its class value set incorrectly
to NONE; this has been fixed.)

This addresses YWH-PGM40640-70 and YWH-PGM40640-73 (as well as any
similar problems that might have occurred in the future) by minimizing
the code paths that can be reached by rdata classes other than IN, so it
is safe for the implementation to assume that rdatatypes that are only
defined for class IN, such as SVCB or WKS, have been parsed and
validated, and not accepted as unknown/opaque data.

Fixes: isc-projects/bind9#5777
Fixes: isc-projects/bind9#5779

(cherry picked from commit a6d8e330ed6cf0021bff3f00aa1dc7a296f5aec0)
2026-05-07 13:09:18 +02:00
Ondřej Surý
11276087a7
Fix output token and GSS context leaks in TKEY/GSS-API error paths
In dst_gssapi_acceptctx(), rename outtoken to outtokenp (matching BIND
convention for output pointer parameters) and free the allocated output
token buffer on error in the cleanup path.

In process_gsstkey(), route the empty-principal error path through
cleanup via CLEANUP() instead of returning early, so that the output
token, GSS context, and TSIG key are all freed consistently by the
existing cleanup block.

(cherry picked from commit 6c46c85d02849fb659584275313529794039f433)
2026-05-07 13:09:18 +02:00
Ondřej Surý
9bdace3fa3
Fix GSS-API context leak in TKEY negotiation
Reject multi-round GSS-API negotiation (GSS_S_CONTINUE_NEEDED) in
dst_gssapi_acceptctx().  Each call to gss_accept_sec_context()
allocates a context inside the GSS library; without this fix, the
context handle was passed back to process_gsstkey() which did not
store it persistently, leaking it on every incomplete negotiation.

An unauthenticated attacker could exhaust server memory by sending
repeated TKEY queries with GSSAPI tokens, each leaking one GSS
context.  The leaked memory is allocated by the GSS library via
malloc(), bypassing BIND's memory accounting.

In practice, Kerberos/SPNEGO (the only mechanism used with BIND)
completes in a single round, so rejecting continuation does not
affect real-world deployments.  See RFC 3645 Section 4.1.3.

(cherry picked from commit 3d8e0d068f08694282c5ecd3bd6c332de6c75485)
2026-05-07 13:09:18 +02:00
Ondřej Surý
1222a2aa05
Fix use-after-free in resolver SIG(0) async verification path
When a SIG(0)-signed response triggers async ECDSA verification via
dns_message_checksig_async(), the respctx_t holds a raw pointer to
the resquery_t. If the fetch context is shut down while verification
is in flight (e.g. due to recursive-clients quota exhaustion), the
query is destroyed and the callback dereferences a dangling pointer.

Take a reference on the resquery_t when initializing the respctx_t,
and release it in both cleanup paths. The query's own reference to
the fetch context keeps the fctx alive transitively.

(cherry picked from commit 5b58caf5a2cd39d57a51b7b0373bfbc4877a96f9)
2026-05-07 13:09:18 +02:00
Colin Vidal
b456007e2d
Remove duplicate addresses from the resolver SLIST
The SLIST (essentially `fctx->finds`, forwarders and dual-stack
alternatives aside) can have duplicate server addresses when multiple
in-domain nameservers share the same IP addresses:

  sub.example.          NS      ns1.sub.example.
  sub.example.          NS      ns2.sub.example.
  ns1.sub.example.      A       1.2.3.4
  ns1.sub.example.      A       5.6.7.8
  ns2.sub.example.      A       1.2.3.4
  ns2.sub.example.      A       5.6.7.8

If both 1.2.3.4 and 5.6.7.8 fail to return a valid answer, the resolver
would query each address twice.

The problem is fixed by replacing the two-phase server selection (sort
each find list by SRTT, sort finds by head SRTT) with a single linear
scan in nextaddress() that finds the lowest-SRTT unmarked, non-duplicate
address across all find lists.

The old approach had a correctness bug: after sorting, the resolver
picked the next address from the "current" find list rather than
globally.  For example, with find lists [1, 15, 26] and [3, 4, 5], the
second pick would be SRTT 15 instead of the correct SRTT 3.

The new approach is both simpler and correct: each call to nextaddress()
walks all addresses, skips marked and duplicate entries, and returns the
one with the lowest SRTT.  While this walk is repeated for each server
attempt, it operates on a small bounded list and is negligible compared
to the network I/O of querying the server.

(cherry picked from commit b1c5856a3764b4025e93f8baf06c45c8fa029752)
2026-05-07 13:09:18 +02:00
Colin Vidal
1bedd7f244
Limit the number of addresses returned per ADB find
Add a hard limit on the number of addresses that ADB returns from a
single NS lookup (dns_adbfind_t).  This mitigates a flood attack
where an attacker controls a zone with many addresses for a
nameserver, each returning an invalid response.  The global
max-query count (default 50) also limits this, but significant harm
can be done before that limit is reached.

The default limit is now 6 (v4 and/or v6) addresses for an ADB find (so,
ADB looking up for A/AAAA addresses of a name server name). It can be
overridden for testing via 'named -T adbaddrslimit=N'.

(cherry picked from commit 3ec37fc69356ee682bee7f67940613ac31d93d7b)
2026-05-07 13:09:18 +02:00
Colin Vidal
ac3ea4ecd4
rctx_resend() increment query counters
Calls to `rctx_resend()` are done internally within the resolver, in
flow which are not supposed to happens more than once. For instance,
if some query fails, and a specific flag "F" wasn't set, then set the
flag and try again. This wouldn't occur more than once because if the
query fails the next attempt, the flag "F" would be set already, so the
resolver would move to the next server (or give up).

However, a subtle bug missing checking a flag, for instance, could lead
to an unbounded loop re-trying to query the same server. This is now
impossible as `rctx_resend()` also increment the query counters (so if
such case occurs, it would stop once the maximum limit is reached).

The dns_resstatscounter_retry are also only incremented if the
`fctx_query()` succeeds, similar to as is done in `fctx_try()`.

(cherry picked from commit f3e74304889a2e8b69c8e88fc9a383589decda32)
2026-05-07 13:09:18 +02:00
Colin Vidal
ae554715ae
Refactor incrementing query counters
Move the logic incrementing the query counter and the global query
counter into a dedicated helper function.

(cherry picked from commit 05d6da2de54c093689e675e81ae898ee41220666)
2026-05-07 13:09:18 +02:00
Evan Hunt
c665472b10 check for val->name == NULL when adding EDE text
When a validator is being shut down, the associated name
`val->name` is set to NULL.  This could cause a crash if a worker
thread subsequently added an EDE code to the response containing
val->name in the extra text.

`validator_addede()` now checks whether the name is NULL before
trying to add it to the extra text.

(cherry picked from commit 2c60870527)
2026-05-07 11:23:17 +10:00
Aram Sargsyan
b4cab10461 Fix a bug in catz_process_apl()
The allow-transfer/allow-query catalog zone custom properties support
only APL RRtypes. All other types are correctly rejected by the
catz_process_apl() function. However, when an APL RRtype is processed
by that function, and another (non-APL) RRtype is then attempted to be
processed, there is an assertion failure happening in the prologue
of the function because `*aclbp != NULL` (i.e. an APL has been already
processed). Move the code to do type checking before the affected
REQUIRE assertion.

(cherry picked from commit 67e0090371)
2026-05-06 19:37:12 +00:00
Aram Sargsyan
83cd5b52b5 Fix a memory leak issue in catz_process_primaries()
Free the old version of the keyname (if it exists) before setting
the new one.

(cherry picked from commit 4576a67a93)
2026-05-06 18:34:29 +00:00
Ondřej Surý
5588f37baa Include disptype and transport in dispatch hash key
Move disptype and transport into dispatch_hash() and dispatch_match()
so that the match function is the single source of truth for whether
two TCP dispatches are interchangeable.  This replaces the post-loop
disptype filter in dispatch_gettcp() and makes the disptype field in
struct dispatch_key actually used.

(cherry picked from commit 4654796683)
2026-05-06 15:05:48 +02:00
Ondřej Surý
abe201bac3 Do not reuse shared TCP dispatches for zone transfers
Zone transfers (XFRIN) need a dedicated TCP connection because they
are long-lived and stream the entire zone.

(cherry picked from commit 6e78094ebd)
2026-05-06 15:05:48 +02:00
Ondřej Surý
4f54328eaf Use sequential per-dispatch message IDs for TCP
TCP dispentries no longer use the global QID hash table at all.
Responses are matched by scanning disp->active, and sequential
per-dispatch IDs (bounded by the pipelining limit) are unique
within a single dispatch by construction.  Since TCP delivers
only data we asked for on a specific connection, the per-peer
uniqueness that the global table enforced was never actually
needed for TCP.

DNS_DISPATCHOPT_FIXEDID is plumbed through dns_request_createraw
-> get_dispatch -> dns_dispatch_createtcp so FIXEDID TCP requests
always get a fresh isolated dispatch — the caller-supplied ID
then cannot collide with any other in-flight query either.

(cherry picked from commit 3e364aec2b)
2026-05-06 15:05:48 +02:00
Ondřej Surý
a3ba799eae Limit TCP pipelining per shared dispatch
Cap the number of in-flight queries on a single shared TCP dispatch.
When the limit is reached, the dispatch is removed from the hash
table so subsequent queries get a fresh connection.  The existing
dispatch continues serving its queries until they complete.

This bounds the blast radius of a connection drop: at most N queries
fail simultaneously instead of all queries to that server.

The default limit is 256.  It can be overridden for testing via
'named -T tcppipelining=N'.

(cherry picked from commit 385ceabe8f)
2026-05-06 15:05:48 +02:00
Ondřej Surý
d35bc843c5 Implement seamless TCP connection reuse in dns_dispatch
Previously, the user of dns_dispatch API had to first call
dns_dispatch_gettcp() and if that failed create a new TCP dispatch with
dns_dispatch_createtcp().  This has been changed and the TCP connection
reuse happens transparently inside dns_dispatch_createtcp().  There are
separate buckets for dns_resolver, dns_request and dns_xfrin units, so
these don't get mixed together.

(cherry picked from commit d5ee86b799)
2026-05-06 15:05:48 +02:00
Evan Hunt
3db5d47ed4 Hold a reference to the NTA table for the lifetime of each NTA
Each dns__nta_t now references its parent ntatable in nta_create() and
releases it in dns__nta_destroy().  This avoids a use-after-free in
fetch_done() and other callbacks that dereference nta->ntatable: the
ntatable could otherwise be released by view destruction while an
in-flight resolver fetch still holds a reference to the NTA.

(cherry picked from commit 26c895cc92)
2026-05-06 09:33:11 +02:00
Evan Hunt
8e31dc5353 Fix a stack use-after-free in qpzone
In previous_closest_nsec(), a new qpreader was opened to search the NSEC
tree. It was possible for that to be used to update a QP iterator object
owned by the caller, and then be destroyed when the function returned.

This has been addressed by having the caller open the NSEC qpreader
instead.
2026-05-05 16:17:25 -07:00
Ondřej Surý
2bbbd60de3
Reject oversized RRsets at slab construction
dns_rdataslab_fromrdataset(), dns_rdataslab_merge() and
dns_rdataslab_subtract() summed per-record storage into an
unsigned int with no upper-bound check.  An RRset whose total
encoded size exceeds DNS_RDATA_MAXLENGTH cannot fit in a DNS
message and is unservable; building its in-memory representation
only burns memory on data that will fail at response time, and at
the upper bound the running sum could in theory wrap.

Cap the running total at DNS_RDATA_MAXLENGTH and return ISC_R_NOSPACE
when exceeded.  Update the qpdb cache memory-purge test to use a
record size that fits within the new limit.

Assisted-by: Claude:claude-opus-4-7
(cherry picked from commit f9d24b1b85)
2026-05-05 19:24:29 +02:00
Evan Hunt
0af1b07aad
Remove DNS_KEYFLAG_EXTENDED
The DNS_KEYFLAG_EXTENDED flag was only legitimate for type KEY
and was eliminated by RFC 3445. Dropping the extended-flags
handling in pub_compare() also fixes a possible crash when
signing a zone whose journal contains a crafted DNSKEY: a
6-byte record with the EXTENDED bit set produced a memmove()
length that underflowed and ran off a stack buffer.

(cherry picked from commit 9c06f0a41d)
2026-05-05 11:07:32 +02:00
Ondřej Surý
19f44a0aa3
Tidy up cleanup path in check_signer()
The cloned signature rdataset was not disassociated on the early
return taken when dns_dnssec_keyfromrdata() fails to parse the DNSKEY
public-key data.  In every current caller val->sigrdataset reaches
check_signer() rdatalist-backed, so dns_rdataset_clone() copies the
struct without taking any reference and dns_rdataset_disassociate()
is a no-op -- no memory is actually leaked today.  Hoist the key
parse out of the per-RRSIG loop and let the function fall through
to a single cleanup path, so the parse and the iteration cannot
diverge again.

Assisted-by: Claude:claude-opus-4-7
2026-05-05 07:43:15 +02:00
Ondřej Surý
0f821104e0
Use a keyed hash for the RRL bucket table
The previous hash_key() was a deterministic, unkeyed (<<1) + add over the
key words.  An off-path attacker could invert it offline and submit
queries whose source /24, qname hash, and qtype map to a single bucket;
under chaining this turns every lookup into an O(N) walk under
rrl->lock and starves legitimate query processing on the very feature
deployed to mitigate DoS.

Replace it with isc_hash32(), which is HalfSipHash-2-4 keyed by a
per-process random seed, so collision sets cannot be precomputed.

Assisted-by: Claude:claude-opus-4-7
(cherry picked from commit a6b7ce29c4)
2026-05-04 16:15:58 +02:00
Ondřej Surý
2b18aa9d59 Reject RSA DNSKEYs with oversize public exponents at parse time
The wire-format RSA DNSKEY parser was the only key path with no upper
bound on the public exponent — opensslrsa_parse and opensslrsa_fromlabel
already cap at RSA_MAX_PUBEXP_BITS.  An attacker-controlled DNSKEY could
therefore force a validator to compute s^e mod n with e up to ~|n| bits,
amplifying every verify by ~120x for typical 2048-bit moduli (OpenSSL
itself only caps the exponent for moduli above 3072 bits).  Apply the
same bit-count cap to wire-format keys.

Assisted-by: Claude:claude-opus-4-7
(cherry picked from commit ab8c1a77e0)
2026-04-30 13:16:30 +02:00
Ondřej Surý
45ea92ea4b
Size HMAC key generation buffers to the maximum block size
hmac_generate() declared its on-stack nonce buffer as
unsigned char data[ISC_MAX_MD_SIZE], i.e. 64 bytes. That is the maximum
digest size, but the buffer is filled up to the algorithm's HMAC block
size, which is 128 bytes for SHA-384 and SHA-512. Asking rndc-confgen
for an HMAC-SHA-384 or HMAC-SHA-512 key with -b > 512 (the documented
range allows up to 1024) wrote past the end of the stack buffer; on
hardened builds this aborted with a stack-smash detector firing
instead of producing a key.

Use the existing ISC_MAX_BLOCK_SIZE (128) for the buffer so the full
1..1024 range advertised by -A hmac-sha{384,512} works as documented.
The matching key_rawsecret[64] in confgen's generate_key() is enlarged
the same way so the generated key fits when dumped to the buffer.

Add a system test that exercises rndc-confgen across the previously
overflowing keysizes; with -Db_sanitize=address it caught the abort
before the fix.

Assisted-by: Claude:claude-opus-4-7
(cherry picked from commit 46f6bb6364)
2026-04-30 06:00:07 +02:00
Ondřej Surý
2c2533774c Fix dropped covers field for SIG records in dns_diff_apply
rdata_covers() in lib/dns/diff.c discriminated only on
dns_rdatatype_rrsig (46) and returned 0 for the legacy SIG (24), so
the covered-type field was silently discarded on the dynamic-update
and IXFR paths.  Every SIG rdataset was then filed in the zone DB
under typepair (SIG, 0) instead of (SIG, covered_type); a second SIG
add with a different covers but a different TTL collided at that
bucket, tripped DNS_DBADD_EXACTTTL in qpzone, returned
DNS_R_NOTEXACT, and came back to the client as SERVFAIL.

Use dns_rdatatype_issig() here so both SIG and RRSIG carry their
covers through the diff, matching the helper pattern already used in
lib/dns/master.c, lib/ns/xfrout.c, lib/dns/qpcache.c, and the
dns__db_findrdataset() REQUIRE that the surrounding merge request
just relaxed.

(cherry picked from commit 0a5ba57116)
2026-04-17 19:24:13 +02:00
Mark Andrews
62920a981f Fix assertion failure in dns_db_findrdataset() for SIG records
dns__db_findrdataset() had a REQUIRE() that only accepted
dns_rdatatype_rrsig when the covers parameter was set.  A dynamic
update containing a SIG record (type 24) would trigger this
assertion, crashing named.  Use dns_rdatatype_issig() to accept
both SIG and RRSIG.

(cherry picked from commit 03edeccaa1)
2026-04-17 19:24:13 +02:00
Mark Andrews
35a5e29800 Remove unnecessary dns_name_free call
When processing a catalog zone member's primaries definition and
there is a TXT record containing an invalid name TSIG key name,
dns_name_free was incorrectly called triggering an assertion.
This has been fixed.

(cherry picked from commit 9f411c93c4)
2026-04-15 12:30:22 +10:00
Mark Andrews
6d38c398c8 Use the correct maximal compressed bit map buffer size
There are up to 256 windows in a NSEC/NSEC3 compressed bit
map of 32 + 2 octets each.

(cherry picked from commit e43e4bd20a)
2026-04-10 06:23:53 +00:00
Matthijs Mekking
f58554d05a Rename isdelegation() to is_insecure_referral()
The name 'isdelegation()' was confusing. This function is not checking
whether this message is a delegation, but whether the denial of
existence proofs in this message is a proof of a referral to an
unsigned zone.

The name 'is_unsecure_referral()' is more appropriate.

(cherry picked from commit e0f09bb374)
2026-04-07 09:44:30 +02:00
Matthijs Mekking
bd852b1f97 Revert isdelegation() to return boolean value again
The isdelegation() was changed to return an isc_result_t because the
idea was to have a separate return value DNS_R_NSEC3ITERRANGE to signal
to the caller we could not verify the proof because of too many
iterations in the NSEC3 record, or perhaps ISC_R_UNEXPECTED for a more
generic cause that verification was not done.

But this would make error handling more fragile and all we care about
is whether we can reliably say the NS bit was not set.

If we can not reliably say so, we have to treat it as an insecure
referrral.

Since the answer is either yes or no, we can revert back to returning
a boolean value.

(cherry picked from commit 3ac1bb1c39)
2026-04-07 09:44:19 +02:00
Aram Sargsyan
913f290e75 Fix a race condition in xfrin_recv_done() when calling xfrin_reset()
When the xfrin_recv_done() function decides to retry the transfer
using AXFR because of a previous error, it calls the xfrin_reset()
function which calls dns_db_closeversion() on 'xfr->ver'. The problem
is that the ixfr processing of a previous message could be still
in process in a worker thread, which then can use freed 'xfr->ver'.

If there is an ongoing worker thread delay the AXFR retry until after
the worker thread has finished its work.

(cherry picked from commit 141ff7bfa7)
2026-04-03 12:05:44 +00:00
Ondřej Surý
bd3c7d8014
Count temporal problems with DNSSEC validation as attempts
After KeyTrap, the temporal DNSSEC were originally hard errors that
caused validation failures even if the records had another valid
signature.  This has been changed and the RRSIGs outside of the
inception and expiration time are not counted as hard errors.  However,
these errors are not even counted as validation attempts, so excessive
number of expired RRSIGs would cause some non-cryptograhic extra work
for the validator.  This has been fixed and the temporal errors are
correctly counted as validation attempts.

(cherry picked from commit 6ba57a1f0f)
2026-03-30 13:07:15 +02:00
Mark Andrews
8a3408c3b1 Allow the dns_rdata_in_apl structure to be walked twice
The offset value should be set prior to calculating the length.

(cherry picked from commit f2fd54f4b2)
2026-03-27 12:38:01 +00:00
Aram Sargsyan
dbd86cb6d2 Allow empty APL records
Allow empty APL records because RFC 3123 (Section 4) says "zero or
more items". This fixes processing of a catalog zone ACL (which is
based on APL records) when the zone contains an empty APL record or
when a zone update arrives which creates an empty APL record.

(cherry picked from commit 35b8af229e)
2026-03-27 12:38:00 +00:00
Michał Kępień
b040b566fe Merge tag 'v9.20.21' into bind-9.20 2026-03-25 14:24:13 +00:00
Ondřej Surý
d3965a91b6 Replace existing NTA instead of reusing it in dns_ntatable_add()
When an NTA already exists for a name, the old code retrieved
and reused the existing NTA object, then reset its timer via
settimer().  This is incorrect because isc_timer_start() and
isc_timer_stop() require the timer to be manipulated from its
owning loop (enforced by REQUIRE(timer->loop == isc_loop()) in
lib/isc/timer.c), and the caller may be running on a different
loop than the one that created the original NTA.

Instead, delete the old NTA (shutting down its timer on the
correct loop) and insert a fresh one that is owned by the
current loop.
2026-03-23 08:31:32 +00:00
Ondřej Surý
33d219bfe1 SKIP cache flush ordering on NTA expiry
dns_view_flushnode() was called in the delete_expired() async
callback, which runs after the query that detected the NTA expiry.
This created a race: the query would proceed with stale cached data
from the NTA period before the flush had a chance to run, resulting
in transient SERVFAIL with EDE 22 (No Reachable Authority).

Skip dns_view_flushnode() in the older branches as the solutions for
older branches are too complicated and this was not a critical bug.

(cherry picked from commit da8e1c956a)
2026-03-23 08:31:32 +00:00
Aram Sargsyan
567c170b0c Flush the node when NTA expires
When NTA expires the name's node should be flushed from the view's
cache as it's done when the NTA is manually removed using a rndc
command.

(cherry picked from commit 1899a3318c)
2026-03-20 03:24:56 +01:00
Ondřej Surý
993002cebf Fix data race on fctx->vresult in validated()
Move the write to fctx->vresult after LOCK(&fctx->lock).  The field was
being set before acquiring the lock, but dns_resolver_logfetch() reads
it under the same lock from another thread.

(cherry picked from commit a2bd833909)
2026-03-20 03:22:53 +01:00
Ondřej Surý
fb8a9e73cc Fix non-atomic read-modify-write on entry->srtt in adjustsrtt()
The SRTT update loaded the old value, computed a new one, and stored it
back as separate operations.  Two concurrent callers could each read the
same old value and one update would be silently lost.

Use a CAS loop for the read-modify-write on entry->srtt.  For the aging
path, also CAS on entry->lastage to prevent multiple threads from aging
the same entry in the same second.

(cherry picked from commit 4d15494b94)
2026-03-20 01:06:56 +00:00
Aram Sargsyan
99b583592e Take 'env' reference before async calling perform_reopen()
The 'env' pointer is passed to an async function without taking
a reference first, which can potentially cause a use-after-free
error. Take a reference, then detach in the async function.

(cherry picked from commit 48d7401f0d)
2026-03-18 17:04:56 +00:00