Commit graph

15922 commits

Author SHA1 Message Date
Ondřej Surý
98f5caea05 Fix INSIST copy-paste error checking RADIX_V4 instead of RADIX_V6
The INSIST in isc_radix_insert() checks node->data[RADIX_V4] and
node->node_num[RADIX_V4] twice due to a copy-paste error, never
verifying the RADIX_V6 fields.

Fix the second pair to check RADIX_V6.

(cherry picked from commit 3f15f2d9e5)
2026-03-30 19:01:29 +02:00
Ondřej Surý
bd3c7d8014
Count temporal problems with DNSSEC validation as attempts
After KeyTrap, the temporal DNSSEC were originally hard errors that
caused validation failures even if the records had another valid
signature.  This has been changed and the RRSIGs outside of the
inception and expiration time are not counted as hard errors.  However,
these errors are not even counted as validation attempts, so excessive
number of expired RRSIGs would cause some non-cryptograhic extra work
for the validator.  This has been fixed and the temporal errors are
correctly counted as validation attempts.

(cherry picked from commit 6ba57a1f0f)
2026-03-30 13:07:15 +02:00
Mark Andrews
8a3408c3b1 Allow the dns_rdata_in_apl structure to be walked twice
The offset value should be set prior to calculating the length.

(cherry picked from commit f2fd54f4b2)
2026-03-27 12:38:01 +00:00
Aram Sargsyan
dbd86cb6d2 Allow empty APL records
Allow empty APL records because RFC 3123 (Section 4) says "zero or
more items". This fixes processing of a catalog zone ACL (which is
based on APL records) when the zone contains an empty APL record or
when a zone update arrives which creates an empty APL record.

(cherry picked from commit 35b8af229e)
2026-03-27 12:38:00 +00:00
Ondřej Surý
feb5dc7f98 Add regression test for TOCTOU race in DNS UPDATE SSU handling
Race rndc reconfig (toggling between allow-update and update-policy)
against a stream of DNS UPDATEs for 5 seconds and verify that named
does not crash.

Before the fix, the race between send_update() and update_action()
reading the SSU table independently could trigger an assertion
failure (INSIST) when the zone's update policy changed between the
two reads.

(cherry picked from commit c503b6eee8)
2026-03-25 16:16:22 +01:00
Ondřej Surý
c409b9a939 Fix TOCTOU race in DNS UPDATE SSU table handling
Pass the SSU table through the update event struct from
send_update() to update_action() instead of reading it from the
zone twice.  If rndc reconfig changed the zone's update policy
between the two reads (e.g., from allow-update to update-policy),
send_update() would skip the maxbytype allocation but
update_action() would see a non-NULL ssutable, triggering
INSIST(ssutable == NULL || maxbytype != NULL) and crashing named.

The ssutable reference is now taken once in send_update() and
transferred to update_action() via the event struct, ensuring
both functions see the same value.

(cherry picked from commit c172416559)
2026-03-25 16:16:22 +01:00
Michał Kępień
b040b566fe Merge tag 'v9.20.21' into bind-9.20 2026-03-25 14:24:13 +00:00
Ondřej Surý
8feefb9874 Add MOVE_OWNERSHIP() macro for transferring pointer ownership
A helper macro that returns the current value of a pointer and sets
it to NULL in one expression, useful for transferring ownership in
designated initializers.

(cherry picked from commit 0f3be0beb8)
2026-03-23 12:05:30 +01:00
Ondřej Surý
d3965a91b6 Replace existing NTA instead of reusing it in dns_ntatable_add()
When an NTA already exists for a name, the old code retrieved
and reused the existing NTA object, then reset its timer via
settimer().  This is incorrect because isc_timer_start() and
isc_timer_stop() require the timer to be manipulated from its
owning loop (enforced by REQUIRE(timer->loop == isc_loop()) in
lib/isc/timer.c), and the caller may be running on a different
loop than the one that created the original NTA.

Instead, delete the old NTA (shutting down its timer on the
correct loop) and insert a fresh one that is owned by the
current loop.
2026-03-23 08:31:32 +00:00
Ondřej Surý
33d219bfe1 SKIP cache flush ordering on NTA expiry
dns_view_flushnode() was called in the delete_expired() async
callback, which runs after the query that detected the NTA expiry.
This created a race: the query would proceed with stale cached data
from the NTA period before the flush had a chance to run, resulting
in transient SERVFAIL with EDE 22 (No Reachable Authority).

Skip dns_view_flushnode() in the older branches as the solutions for
older branches are too complicated and this was not a critical bug.

(cherry picked from commit da8e1c956a)
2026-03-23 08:31:32 +00:00
Aram Sargsyan
567c170b0c Flush the node when NTA expires
When NTA expires the name's node should be flushed from the view's
cache as it's done when the NTA is manually removed using a rndc
command.

(cherry picked from commit 1899a3318c)
2026-03-20 03:24:56 +01:00
Ondřej Surý
993002cebf Fix data race on fctx->vresult in validated()
Move the write to fctx->vresult after LOCK(&fctx->lock).  The field was
being set before acquiring the lock, but dns_resolver_logfetch() reads
it under the same lock from another thread.

(cherry picked from commit a2bd833909)
2026-03-20 03:22:53 +01:00
Ondřej Surý
fb8a9e73cc Fix non-atomic read-modify-write on entry->srtt in adjustsrtt()
The SRTT update loaded the old value, computed a new one, and stored it
back as separate operations.  Two concurrent callers could each read the
same old value and one update would be silently lost.

Use a CAS loop for the read-modify-write on entry->srtt.  For the aging
path, also CAS on entry->lastage to prevent multiple threads from aging
the same entry in the same second.

(cherry picked from commit 4d15494b94)
2026-03-20 01:06:56 +00:00
Aram Sargsyan
99b583592e Take 'env' reference before async calling perform_reopen()
The 'env' pointer is passed to an async function without taking
a reference first, which can potentially cause a use-after-free
error. Take a reference, then detach in the async function.

(cherry picked from commit 48d7401f0d)
2026-03-18 17:04:56 +00:00
Aram Sargsyan
77d60acb86 Convert dns_dtenv_t reference counting to standard macors
Use standard reference counting macros for dns_dtenv_t instead of
custom attach/detach functions.

(cherry picked from commit 4ac3a6520e)
2026-03-18 17:04:56 +00:00
Ondřej Surý
d685d448e2 Fix isc_buffer_init capacity mismatch in DoH data chunk callback
isc_buffer_init() is given MAX_DNS_MESSAGE_SIZE (65535) as capacity but
only h2->content_length bytes are allocated.  This makes the buffer
believe it has more space than actually allocated.  A secondary bounds
check (new_bufsize <= h2->content_length) prevents actual overflow, but
the buffer invariant is violated.

Pass h2->content_length as the capacity to match the allocation.

(cherry picked from commit 8e240bbb5f)
2026-03-18 10:39:38 +00:00
Ondřej Surý
b0e8f5ec65
Fix missing mutex destroy and ede invalidate on fctx_create() error paths
The error cleanup in fctx_create() was missing isc_mutex_destroy() and
dns_ede_invalidate() calls. When error paths (cleanup_nameservers,
cleanup_fcount, cleanup_qmessage, cleanup_adb) were taken after the
mutex and edectx were initialized, the fctx memory was freed without
properly destroying these resources first.

(cherry picked from commit 5b1750f15f)
2026-03-17 23:26:28 +01:00
Ondřej Surý
0b9f45c0ff
Fix rwlock type mismatch in delete_ds() error path
The lock is acquired for reading but the error path from
dns_rdata_fromstruct() incorrectly unlocks it as a write lock.

(cherry picked from commit 96a22451d7)
2026-03-17 23:25:21 +01:00
Matthijs Mekking
e65298edb2 Fast fail a validator deadlock
We return DNS_R_NOVALIDSIG if we detected a deadlock. Then in
'validate_async_done()', this result value is used to check if we
need to fall back to insecure. As part of that we create a new fetch
but that fails because of the detected deadlock. This results in a loop
of deadlock detected, fallback to insecure, deadlock detected, ...

Add a new result value, ISC_R_DEADLOCK, and return this instead when
we have detected a deadlock. This will be treated as a generic error,
as there is no special handling for this result value.

(cherry picked from commit bc1d177cc2)
2026-03-17 14:39:48 +00:00
Mark Andrews
0fb1e5c865 Clear errno before calling strtol
The previous code was incorrectly clearing errno after calling
strtol but before testing the result rather than clearing it and
then calling strtol so that changes to errno can be correctly
determined.

(cherry picked from commit d3ffa1f007)
2026-03-17 00:28:07 +00:00
Ondřej Surý
07e1042108 Fix use-after-free in xfrin_recv_done
Move the LIBDNS_XFRIN_RECV_DONE probe execution before dns_xfrin_detach
in xfrin_recv_done.

Previously, dns_xfrin_detach was called before the trace probe, which
could free the xfr object.  Because the accessed member xfr->info is an
embedded array, the expression evaluates via pointer arithmetic rather
than a direct memory dereference.  Although this prevents a reliable
crash in practice, it technically remains a use-after-free issue.
Reorder the statements to ensure the transfer context is fully valid
when the probe executes.

(cherry picked from commit e57245ee81)
2026-03-16 12:00:04 +01:00
Aram Sargsyan
9a7ae56032 OpenSSL 4 compatibility fix
Starting from OpenSSL 4 the the X509_get_subject_name() function
returns a 'const' pointer to a name instead of a regular pointer.
Duplicate the name before operating on it, then free it.

(cherry picked from commit 336c523b79)
2026-03-16 10:56:17 +00:00
Ondřej Surý
338961bf7e
Fix KASP key leaks on keystore lookup failure
In both cfg_kasp_fromconfig() and cfg_kasp_builtinconfig(), the
newly allocated KASP key was not destroyed when the keystore
lookup failed.

(cherry picked from commit df1993611b)
2026-03-16 11:05:03 +01:00
Ondřej Surý
769dff3e6f
Fix missing server socket detach in TLS accept error path
When TLS creation fails in tlslisten_acceptcb(), tlssock->server
was not detached before detaching tlssock itself.

(cherry picked from commit 2ab3d7c075)
2026-03-16 11:05:03 +01:00
Ondřej Surý
f907b229d3
Simplify checkds_create() to return void
Since memory allocation never fails in BIND 9, checkds_create() cannot
fail.  Change it to return void and use designated initializers,
removing error handling at all call sites.

(cherry picked from commit 63d3c1f58a)
2026-03-16 11:04:58 +01:00
Ondřej Surý
dc419568d1
Fix cb_args memory leak in ns_query() error path
Initialize cb_args to NULL and free it in the cleanup path so it
is not leaked when the function fails after allocation.

(cherry picked from commit d7e1013741)
2026-03-16 10:50:22 +01:00
Ondřej Surý
109d100495
Fix TSIG key and transport leaks in zone_notify() error paths
Two 'goto next' paths in zone_notify() skipped detaching the TSIG
key and transport, leaking them on TLS configuration failure and
when the destination address is disabled.

(cherry picked from commit 1505cb1c24)
2026-03-16 10:50:22 +01:00
Ondřej Surý
644c012c0d
Fix memory leak in ixfr_commit() error path
The 'data' allocation was not freed when reaching the cleanup
label with an error result.

(cherry picked from commit 80fae7a4b7)
2026-03-16 10:50:22 +01:00
Ondřej Surý
12ba43a2a3
Fix memory context leak in dns_client_resolve() error path
Use isc_mem_putanddetach() instead of isc_mem_put() to properly
detach the attached memory context stored in resarg->mctx.

(cherry picked from commit d0165070c7)
2026-03-16 10:50:22 +01:00
Ondřej Surý
a32a0d771b
Fix resquery reference imbalance on TCP connect failure
In fctx_query(), resquery_ref(query) is called before
dns_dispatch_connect() in anticipation of the resquery_connected()
callback consuming the reference.

When dns_dispatch_connect() fails synchronously on TCP (e.g. from
dns_transport_get_tlsctx() failing in tcp_dispatch_connect()), the
connect callback is never scheduled, so the extra reference is never
consumed.  The error path then tears down the query via manual cleanup
(isc_mem_put) without going through the refcount destructor, leaving
the reference imbalanced.

Fix by dropping the extra reference on the error path, just after
dns_dispatch_done() which cleans up the dispatch entry.

(cherry picked from commit 2da669490c)
2026-03-15 03:13:00 +01:00
Ondřej Surý
d55d914cb7
Fix copy-paste typos in dns_dispatchmgr comments
The v6ports and nv6ports fields are documented as "available ports
for IPv4" instead of "IPv6".

(cherry picked from commit 0d28e1bed2)
2026-03-15 03:13:00 +01:00
Aram Sargsyan
439144bcaa Fix a bug in rpz.c:del_name()
When the dns_qp_getname() call returns an error the del_name() function
just returns without cleaning up the trasnaction.

Instead of returning, jump to a new label 'done:' similar to the code
written in the add_nm() function.

(cherry picked from commit 4df5b9ac32)
2026-03-14 12:43:37 +00:00
Ondřej Surý
930d9042a1 Fix memory leak in dns_catz_options_setdefault() for zonedir
When defaults->zonedir is set, opts->zonedir is unconditionally
overwritten without freeing the previous value. This leaks memory
on every catalog zone update when zonedir defaults are configured.

Free the existing opts->zonedir before replacing it.

(cherry picked from commit 5cd17c8adc)
2026-03-14 09:11:05 +00:00
Ondřej Surý
51a0151113
Dispatch async work jobs from the correct loop
Refactor dns_loadctx_t and dns_dumpctx_t to use standard
ISC_REFCOUNT_DECL and ISC_REFCOUNT_IMPL macros, retiring the
redundant manual attach and detach implementations.

Introduce dns_loadctx_enqueue() and dns_dumpctx_enqueue() to
ensure compliance with the new strict loop affinity in
isc_work_enqueue(). If the current loop does not match the
target loop, the enqueue operation is safely bounced to the
correct thread via isc_async_run().

(cherry picked from commit e7c550730a)
2026-03-14 07:52:59 +01:00
Ondřej Surý
d4b96af062
Enforce isc_work enqueue loop affinity
Add a REQUIRE(isc_loop() == loop) assertion to isc_work_enqueue()
to strictly enforce that work is enqueued from the loop it is
assigned to. This loudly prohibits cross-thread queue manipulation
before it inevitably turns into a concurrency debugging nightmare.

(cherry picked from commit f1311d2d19)
2026-03-14 07:52:56 +01:00
Aram Sargsyan
163db61ebd
Fix a bug in dns_tkey_processquery()
The 'keyname' variable could be used in the add_rdata_to_list()
call without being initialized. Make sure that 'keyname' is non-NULL
for all the cases that do not jump to the 'cleanup:' label.

(cherry picked from commit 172f5496ba)
2026-03-13 13:39:38 +01:00
Ondřej Surý
5f15df5c53
Fix memory leak in QPcache addnoqname/addclosest mechanism
The attacker that controls DNSSEC-signed zone can trigger a memory leak
in the addnoqname() and/or addclosest() by creating more than
max-records-per-type RRSIG for any NSEC records.  The memory leaks have
been fixed.

(cherry picked from commit a854a5c83d)
2026-03-13 13:22:23 +01:00
Matthijs Mekking
ef01ff31db
Check RRset trust in validate_neg_rrset()
In many places we only create a validator if the RRset has too low
trust (the RRset is pending validation, or could not be validated
before). This check was missing prior to validating negative response
data.

(cherry picked from commit 6ca67f65cd)
2026-03-13 13:06:38 +01:00
Matthijs Mekking
1fb2c667fd
Combine validator_log and marksecure
When we mark RRsets as secure, we most of the time also log a debug
message. Combine this the same way as 'markanswer()' does.

(cherry picked from commit d4c7c83a70)
2026-03-13 13:06:38 +01:00
Matthijs Mekking
4ba3caff98
Don't verify already trusted rdatasets
If we already marked an rdataset as secure (or it has even stronger
trust), there is no need to cryptographically verify it again.

(cherry picked from commit 0ec08c2120)
2026-03-13 13:06:38 +01:00
Matthijs Mekking
cd1f2fed56
Check iterations in isdelegation()
When looking up an NSEC3 as part of an insecurity proof, check the
number of iterations. If this is too high, treat the answer as insecure
by marking the answer with trust level "answer", indicating that they
did not validate, but could be cached as insecure.

(cherry picked from commit 988040a5e0)
2026-03-13 13:06:38 +01:00
Mark Andrews
f5acdbb783 Set length in dns_rdata_in_dhcid structure
tostruct_in_dhcid was not setting the length field in the
dns_rdata_in_dhcid structure.

(cherry picked from commit cfa21d1e8b)
2026-03-12 09:26:01 +00:00
Michal Nowak
82991c7881
Use clang-format-22 to update formatting
(cherry picked from commit 239464f276)
2026-03-04 12:18:27 +01:00
Colin Vidal
867a85713e checkconf: check key existence in views
Commit `2956e4fc45b3c2142a3351682d4200647448f193` hardened the `key`
name check when used in `primaries` to reject the configuration if
the key was not defined, rather than simply checking whether the
key name was correctly formed.

However, the key name check didn't include the view configuration,
causing keys not to be recognized if they were defined inside the
view and not at the global level.  This regression is now fixed.

(cherry picked from commit b90399ebdc)
2026-03-01 13:41:53 +01:00
Ondřej Surý
8ddab7f0b8
Implement Fisher-Yates shuffle for nameserver selection
Replace the two-pass "random start index and wrap around" logic in
fctx_getaddresses_nameservers() with a statistically sound Fisher-Yates
shuffle.

The previous implementation picked a random starting node and did two
passes over the linked list to find query candidates.  The new logic
extracts the available nameservers into a bounded, stack-allocated array
of dns_rdata_t structures.

This array is then randomized in-place using a Fisher-Yates shuffle.
Finally, the shuffled array is traversed sequentially to launch fetches
until the dynamic quota (fctx->pending_running >= fetches_allowed) is
reached.

This guarantees a fair random distribution for outbound queries while
properly respecting dynamic query limits, entirely within O(1) memory
and without the overhead of linked-list pointer shuffling or multiple
dataset traversals.

(cherry picked from commit 3c33e7d937)
2026-02-26 08:17:23 +01:00
Matthijs Mekking
038a9ae46a Fix log level bug in keystore
A debug message that logs a PKCS#11 object has been generated was
erroneously logged at error level. This has been fixed.

(cherry picked from commit 5bd6322739)
2026-02-25 16:27:29 +00:00
Ondřej Surý
6ac20d5099 Clear serve-stale flags when following the CNAME chains
A stale answer or SERVFAIL could have been served in case of multiple
upstream failures when following the CNAME chains. This has been fixed.

(cherry picked from commit d46277b398)
2026-02-25 11:30:34 +01:00
Mark Andrews
80316efdd2 Remove determinist selection of nameserver
When selecting nameserver addresses to be looked up we where
always selecting them in dnssec name order from the start of
the nameserver rrset.  This could lead to resolution failure
despite there being address that could be resolved for the
other names.  Use a random starting point when selecting which
names to lookup.

(cherry picked from commit b78052119a)
2026-02-25 10:18:46 +01:00
Ondřej Surý
25006e2f17 Importing invalid SKR file might overflow the stack buffer
If an invalid SKR file is imported, reading the time from the token
buffer might overflow the buffer on the local stack.  This has been
fixed by removing the intermediate buffer and parsing the lexer token
directly.

(cherry picked from commit 8ab4827a0c)
2026-02-24 18:45:41 +00:00
Mark Andrews
f4ea445c66 Remove invalid REQUIRE in NSEC3 fromstruct method
The NSEC3 fromstruct method only worked for hash type 1
when it should work for all hash types.

(cherry picked from commit f030bc6756)
2026-02-24 17:10:52 +01:00