10228 commits

Author SHA1 Message Date
Evan Hunt
dc6202479f remove find_deepest_zonecut() from qpcache
because the cache no longer stores delegation (parent-side) NS rrsets,
and authoritative (child-side) NS rrsets don't affect recursion,
it no longer makes sense for qpcache_find() to look for NS rrsets
and return DNS_R_DELEGATION. that code has been removed.

the cache still does search for covering DNAME records. the
check_zonecut() function has been renamed to check_dname() for clarity.

related changes:
- one test case has been removed from the mirror system test, because it
  tested the behavior of a cached delegation.
- query_checkrrl() and rpz_rrset_find() have been updated so they no
  longer expect cache responses to have DNS_R_DELEGATION response codes.
2026-03-30 20:41:13 +02:00
Ondřej Surý
792d8a74ab Add invariant check for delegset in rctx_nextserver()
The get_nameservers path in rctx_nextserver() is only reachable from
rctx_referral(), which already detaches fctx->delegset.  Assert that
it is NULL rather than redundantly detaching it, since
dns_view_bestzonecut() requires *delegsetp == NULL.
2026-03-30 20:41:13 +02:00
Ondřej Surý
3a339cfca4 Clean up frdataset in resume_dslookup() on shutdown
When resume_dslookup() receives ISC_R_SHUTTINGDOWN or ISC_R_CANCELED,
frdataset (fctx->nsrrset) was not disassociated.  While fctx__destroy()
eventually cleans it up, leaving it associated keeps the underlying DB
node referenced longer than necessary.
2026-03-30 20:41:13 +02:00
Evan Hunt
cd4a7a2d72 Fix fetchlimit test failure
When a referral lookup is triggered by a QMIN query, it should be
exempt from the fetches-per-zone limit just as the QMIN query itself
is.

Also restart the test server between the fetches-per-server and
fetches-per-zone tests so that leftover statistics from the former
do not pollute the latter.

Finally, zone spills and general query drops are no longer in a strict
>= relation (on a parent-centric resolver), so check that both counters
are non-zero instead.
2026-03-30 20:41:13 +02:00
Colin Vidal
f2f9a97526 Do not cache NS from referral in negative responses
Stop storing the NS referral into the main cache when processing a
negative response.  These records are already cached in the delegation
database and are not needed elsewhere.

Update dnssec tests that relied on parent-side NS RRsets being
returned in recursive query responses.
2026-03-30 20:41:13 +02:00
Colin Vidal
bc8f0b3a79 Clean up mark_related()
Since the FCTX_ATTR_GLUING flag is never set anymore, the code in
mark_related() that handled it has been removed.
2026-03-30 20:41:13 +02:00
Colin Vidal
6ed7a8a723 Resolver is parent-centric
The resolver now uses glue addresses from `dns_deleg_t` objects stored
in the delegation database.  The main cache is still used for ADB A/AAAA
lookups when no glue is available for a nameserver name.

The resolver's `fctx_getaddresses()` is refactored so that, for each
delegation in the delegation set, it tries to get the address-based finds
first, then the nameserver name lookups. (Later, the logic to handle DELEG
`include-delegparam=` will be hooked in there too.)
2026-03-30 20:41:13 +02:00
Colin Vidal
cfac5f3974 Add dns_adb_createaddrinfosfind() for address-based lookups
Add a new ADB API function that creates a find from a list of addresses
rather than by looking up nameserver names.  This enables the resolver
to handle address-based delegations (NS-based with glue or DELEG with
addresses) and name-based delegations uniformly (i.e. the list of finds
from ADB is handled the same way no matter the type of the delegation).
2026-03-30 20:41:13 +02:00
Evan Hunt
a9883483ef Remove dns_db_findzonecut()
This function is no longer used and has been removed, along with its
implementation in qpcache.
2026-03-30 20:41:13 +02:00
Colin Vidal
de8bc44dc8 Use delegation DB for bestzonecut lookups
Function `dns_view_bestzonecut()` now uses the delegation DB instead of
the main cache for the cache part of its lookup.

As a result, replace `dns_rdataset_t` (representing an NS RRset) with
`dns_delegset_t` in `dns_view_bestzonecut()` and
`dns_resolver_createfetch()` APIs. The resolver and query processing now
use the delegation DB instead of the cache for zonecut lookups.

When the delegation lives in the local database, the locally found
`rdataset` is internally converted into a `dns_delegset_t` object.
From the caller's point of view nothing changes: a delegation set is a
read-only object which can be used for as long as needed and must be
detached once the caller is done with it.
2026-03-30 20:41:13 +02:00
Colin Vidal
c7b75f448f Populate the delegation DB from referral answers
The resolver now caches NS records and their A/AAAA glue from referral
answers into the delegation database.

A new `cache_delegns()` function extracts NS names and associated glue
addresses from the authority/additional sections of a referral answer
and uses that information to build a delegation set, which is then
inserted into the delegation database.

The created delegation set contains one delegation per NS RR. If the NS RR
has a matching A/AAAA RR, the delegation stores only the addresses and not
the name. (Note that it is technically possible to group all NS RRs that
have no glue into a single delegation, and the implementation may
change in that direction in the future.)

Each view has its own instance of the delegation database (they are
never shared between views), but a server restart/reload preserves the
delegation database state.
2026-03-30 20:41:13 +02:00
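The per-NS construction rule described in c7b75f448f can be sketched as follows. This is a minimal illustration under stated assumptions: `ns_input_t`, `deleg_sketch_t`, `build_delegset()` and `MAX_ADDRS` are hypothetical stand-ins, not the BIND 9 types or the real `cache_delegns()`, and addresses are kept as plain strings for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins; not the BIND 9 types. */
#define MAX_ADDRS 4

typedef struct {
	const char *nsname;           /* NS target from the authority section */
	const char *addrs[MAX_ADDRS]; /* glue from the additional section */
	size_t naddrs;
} ns_input_t;

typedef struct {
	const char *name;             /* set only when no glue was found */
	const char *addrs[MAX_ADDRS]; /* set only when glue was found */
	size_t naddrs;
} deleg_sketch_t;

/* Build one delegation per NS record; returns the number built. */
static size_t
build_delegset(const ns_input_t *ns, size_t count, deleg_sketch_t *out) {
	assert(ns != NULL && out != NULL);
	for (size_t i = 0; i < count; i++) {
		assert(ns[i].naddrs <= MAX_ADDRS);
		memset(&out[i], 0, sizeof(out[i]));
		if (ns[i].naddrs > 0) {
			/* Glue present: keep the addresses, drop the name. */
			out[i].naddrs = ns[i].naddrs;
			for (size_t j = 0; j < ns[i].naddrs; j++) {
				out[i].addrs[j] = ns[i].addrs[j];
			}
		} else {
			/* No glue: keep the name for a later lookup. */
			out[i].name = ns[i].nsname;
		}
	}
	return count;
}
```

An NS with glue therefore yields an address-only delegation, while a glueless NS yields a name-only delegation.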
Colin Vidal
1b5f757084 Introduce a delegation database
Add `dns_delegdb_t`, a qpmulti-based database that maps a zonecut name
(`dns_name_t`) to a delegation set (`dns_delegset_t`). A delegation set
object essentially contains an expiration time and a list of delegations
(`dns_deleg_t`). A delegation can be either:

- A list of IP addresses (`isc_netaddrlist_t`), for NS-based delegation
  providing glue or DELEG-based delegation using `server-ipv4=` or
  `server-ipv6=`;
- Or a list of nameserver names, for NS-based delegation without glue,
  or DELEG-based delegation using `server-name=`;
- Or a list of nameserver names, for DELEG-based delegation using
  `include-delegparam=`.

The delegation database API provides lookup by closest zonecut,
builders for delegations and delegation sets, insertion of newly built
delegation sets, dumping to a `FILE *`, conversion from an NS
rdataset to a delegation set, and deletion of either a specific zonecut
or the whole sub-tree below a given zonecut.

A memory context is used internally by the delegation database and
can be constrained to a maximum size. When it gets close to that maximum
and a new delegation set is inserted into the database, an internal
reclamation pass removes the least recently used entries.

Once they have been inserted into the database, delegation sets and
delegations are read-only objects. Thus, the caller can use them
without concurrency or locking concerns, and must detach them once it
is done with them.
2026-03-30 20:41:13 +02:00
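The "lookup by closest zonecut" operation mentioned in 1b5f757084 can be illustrated with a small sketch. It assumes absolute, lower-case names held in a flat array and walks up the name one label at a time; the real database is a qpmulti trie, and `entry_t` and `closest_zonecut()` are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical entry: a zonecut name mapped to some payload id. */
typedef struct {
	const char *zonecut; /* absolute name, e.g. "example.com." */
	int id;
} entry_t;

/*
 * Return the entry whose zonecut is the longest suffix of 'qname'
 * on a label boundary, or NULL if none matches.
 */
static const entry_t *
closest_zonecut(const entry_t *db, size_t n, const char *qname) {
	assert(db != NULL && qname != NULL);
	const char *p = qname;
	for (;;) {
		for (size_t i = 0; i < n; i++) {
			if (strcmp(db[i].zonecut, p) == 0) {
				return &db[i];
			}
		}
		if (strcmp(p, ".") == 0) {
			return NULL; /* the root was the last candidate */
		}
		const char *dot = strchr(p, '.');
		if (dot == NULL) {
			return NULL; /* not an absolute name */
		}
		/* Strip the leftmost label; fall back to the root last. */
		p = (dot[1] == '\0') ? "." : dot + 1;
	}
}
```

With entries for `com.` and `example.com.`, a query for `www.example.com.` resolves to the `example.com.` entry, and `other.com.` resolves to `com.`.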
Aram Sargsyan
357331f886 Revert NTA flush on expire
Flushing the name when an NTA expires causes problems for the ongoing
resolving process. Do not flush the name from the cache. Instead,
the resolver should do the flushing (this is planned to be merged
next).
2026-03-30 18:27:35 +00:00
Ondřej Surý
6ba57a1f0f Count temporal problems with DNSSEC validation as attempts
After KeyTrap, temporal DNSSEC problems were originally hard errors that
caused validation failures even if the records had another valid
signature.  This was later changed so that RRSIGs outside of their
inception and expiration time are not counted as hard errors.  However,
these errors were not even counted as validation attempts, so an excessive
number of expired RRSIGs would cause some non-cryptographic extra work
for the validator.  This has been fixed and the temporal errors are
now correctly counted as validation attempts.
2026-03-30 11:16:13 +02:00
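The accounting change in 6ba57a1f0f can be sketched as a loop with an attempt budget. This is an assumption-laden illustration, not the validator's code: `sig_sketch_t`, `vresult_t` and `validate_sketch()` are hypothetical, timestamps are compared as plain integers rather than with serial-number arithmetic, and `crypto_ok` stands in for the real signature check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
	uint32_t inception;
	uint32_t expiration;
	bool crypto_ok; /* stand-in for the signature verification outcome */
} sig_sketch_t;

typedef enum { V_SECURE, V_NOVALIDSIG, V_QUOTA } vresult_t;

static vresult_t
validate_sketch(const sig_sketch_t *sigs, size_t n, uint32_t now,
		unsigned int max_attempts, unsigned int *attempts) {
	assert(sigs != NULL && attempts != NULL);
	*attempts = 0;
	for (size_t i = 0; i < n; i++) {
		if (*attempts >= max_attempts) {
			return V_QUOTA; /* budget exhausted */
		}
		/* Count the attempt before the temporal check. */
		(*attempts)++;
		if (now < sigs[i].inception || now > sigs[i].expiration) {
			continue; /* not a hard error, try the next RRSIG */
		}
		if (sigs[i].crypto_ok) {
			return V_SECURE;
		}
	}
	return V_NOVALIDSIG;
}
```

Because expired signatures consume budget, a long run of them hits the quota instead of being skipped for free.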
Mark Andrews
f2fd54f4b2 Allow the dns_rdata_in_apl structure to be walked twice
The offset value should be set prior to calculating the length.
2026-03-27 12:00:22 +00:00
Aram Sargsyan
35b8af229e Allow empty APL records
Allow empty APL records because RFC 3123 (Section 4) says "zero or
more items". This fixes processing of a catalog zone ACL (which is
based on APL records) when the zone contains an empty APL record or
when a zone update arrives which creates an empty APL record.
2026-03-27 12:36:50 +11:00
Alessio Podda
ed0ecb62e4 Add low contention stats counter
In the current statistics counter implementation, the statistics are
backed by an array of counters, which are updated via atomic operations.
This leads to contention, especially on high core count
machines.

This commit introduces a new isc_statsmulti_t counter that keeps a
separate array per thread. These counters are then aggregated only when
statistics are queried, shifting work off the critical path.

These changes lead to a ~2% improvement in perflab.
2026-03-26 10:19:25 +01:00
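The idea behind ed0ecb62e4 (one counter array per thread, summed only on read) can be sketched with C11 atomics. The names `statsmulti_sketch_t`, `stats_increment()` and `stats_get()`, the fixed `NTHREADS`, and the 64-byte alignment are assumptions for illustration; the actual `isc_statsmulti_t` layout and API are not shown in the commit message.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define NTHREADS  4
#define NCOUNTERS 3

typedef struct {
	/* One row per thread, aligned so rows do not share a cache line. */
	_Alignas(64) atomic_uint_fast64_t c[NCOUNTERS];
} row_t;

typedef struct {
	row_t rows[NTHREADS];
} statsmulti_sketch_t;

static void
stats_init(statsmulti_sketch_t *s) {
	assert(s != NULL);
	for (size_t t = 0; t < NTHREADS; t++) {
		for (size_t i = 0; i < NCOUNTERS; i++) {
			atomic_init(&s->rows[t].c[i], 0);
		}
	}
}

/* Hot path: each thread touches only its own row. */
static void
stats_increment(statsmulti_sketch_t *s, size_t tid, size_t counter) {
	assert(tid < NTHREADS && counter < NCOUNTERS);
	atomic_fetch_add_explicit(&s->rows[tid].c[counter], 1,
				  memory_order_relaxed);
}

/* Cold path: aggregate across all rows when statistics are queried. */
static uint64_t
stats_get(statsmulti_sketch_t *s, size_t counter) {
	assert(counter < NCOUNTERS);
	uint64_t sum = 0;
	for (size_t t = 0; t < NTHREADS; t++) {
		sum += atomic_load_explicit(&s->rows[t].c[counter],
					    memory_order_relaxed);
	}
	return sum;
}
```

The trade-off is more memory and a slower read in exchange for increments that do not contend on a shared cache line.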
Mark Andrews
ed15b6cb26 Add switch to disable cookie checking in delv
This adds the switch +[no]cookie to delv to control the sending of
DNS COOKIE options when sending requests.  The default is to send
DNS COOKIE options.
2026-03-26 11:18:26 +11:00
Michał Kępień
b0fc0e31c5 Merge tag 'v9.21.20'
2026-03-25 14:23:41 +00:00
Ondřej Surý
da8e1c956a Fix cache flush ordering on NTA expiry
dns_view_flushnode() was called in the delete_expired() async
callback, which runs after the query that detected the NTA expiry.
This created a race: the query would proceed with stale cached data
from the NTA period before the flush had a chance to run, resulting
in transient SERVFAIL with EDE 22 (No Reachable Authority).

Move dns_view_flushnode() into dns_ntatable_covered() so the cache
is flushed synchronously when the expiry is detected, before the
query continues.

Also simplify the expiry comparison in delete_expired() to a direct
pointer comparison (nta == pval) instead of comparing expiry
timestamps.
2026-03-20 14:35:11 +01:00
Ondřej Surý
4d15494b94 Fix non-atomic read-modify-write on entry->srtt in adjustsrtt()
The SRTT update loaded the old value, computed a new one, and stored it
back as separate operations.  Two concurrent callers could each read the
same old value and one update would be silently lost.

Use a CAS loop for the read-modify-write on entry->srtt.  For the aging
path, also CAS on entry->lastage to prevent multiple threads from aging
the same entry in the same second.
2026-03-20 02:06:21 +01:00
Ondřej Surý
a2bd833909 Fix data race on fctx->vresult in validated()
Move the write to fctx->vresult after LOCK(&fctx->lock).  The field was
being set before acquiring the lock, but dns_resolver_logfetch() reads
it under the same lock from another thread.
2026-03-20 00:56:19 +01:00
Ondřej Surý
44bb3cd2a7 Fix data race on nta->expiry
Use CMM_LOAD_SHARED/CMM_STORE_SHARED for nta->expiry, which is
written from the NTA's owning loop but read from any loop (validator,
rndc status, rndc nta -dump).

Also dispatch delete_expired to the NTA's owning loop rather than
the caller's loop.
2026-03-19 01:44:37 +01:00
Ondřej Surý
fae6c6eead Refactor NTA to use RCU instead of rwlock
Replace the ntatable rwlock with RCU read-side critical sections.
The QP multi trie already provides its own concurrency control for
reads and writes, making the rwlock redundant. NTA fields like
expiry are only accessed from the NTA's own event loop thread, so
no additional synchronization is needed.

The table shutdown is now deferred via call_rcu to ensure all
read-side critical sections have completed before iterating and
shutting down individual NTAs.
2026-03-19 01:44:37 +01:00
Aram Sargsyan
1899a3318c Flush the node when NTA expires
When an NTA expires, the name's node should be flushed from the view's
cache, as is done when the NTA is manually removed using an rndc
command.
2026-03-19 00:12:59 +01:00
Aram Sargsyan
48d7401f0d Take 'env' reference before async calling perform_reopen()
The 'env' pointer is passed to an async function without taking
a reference first, which can potentially cause a use-after-free
error. Take a reference, then detach in the async function.
2026-03-18 16:10:07 +00:00
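The pattern applied in 48d7401f0d (take a reference on behalf of the async callback, release it inside the callback) can be sketched as follows. Everything here is hypothetical: `env_sketch_t`, `job_t` and `schedule_reopen()` only model the ownership rule and are not the `dns_dtenv_t` or loop APIs.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

typedef struct {
	atomic_uint refs;
	bool *freed; /* test hook: set when the object is destroyed */
} env_sketch_t;

static env_sketch_t *
env_create(bool *freed) {
	env_sketch_t *env = malloc(sizeof(*env));
	assert(env != NULL && freed != NULL);
	atomic_init(&env->refs, 1);
	env->freed = freed;
	*freed = false;
	return env;
}

static void
env_attach(env_sketch_t *env) {
	atomic_fetch_add(&env->refs, 1);
}

static void
env_detach(env_sketch_t **envp) {
	env_sketch_t *env = *envp;
	*envp = NULL;
	if (atomic_fetch_sub(&env->refs, 1) == 1) {
		*env->freed = true;
		free(env);
	}
}

/* Async callback: owns one reference and releases it when done. */
static void
reopen_cb(void *arg) {
	env_sketch_t *env = arg;
	/* ... env is guaranteed valid here ... */
	env_detach(&env);
}

/* Stand-in for a job queued on an event loop. */
typedef struct {
	void (*cb)(void *);
	void *arg;
} job_t;

static job_t
schedule_reopen(env_sketch_t *env) {
	env_attach(env); /* reference held on behalf of the callback */
	return (job_t){ .cb = reopen_cb, .arg = env };
}
```

Even if the caller detaches before the job runs, the object survives until the callback drops the last reference.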
Aram Sargsyan
4ac3a6520e Convert dns_dtenv_t reference counting to standard macros
Use standard reference counting macros for dns_dtenv_t instead of
custom attach/detach functions.
2026-03-18 16:10:07 +00:00
Ondřej Surý
7f8b972a3d Remove NZF support, make LMDB required for new zone storage
Drop the NZF (New Zone File) fallback for persisting runtime zone
configurations, making LMDB (NZD) the only storage backend. This
removes all #ifdef HAVE_LMDB conditionals, the meson 'lmdb' option,
and the NZF-related functions. LMDB is now a mandatory build
dependency.

The named-nzd2nzf tool is now always built.
2026-03-18 11:02:33 +01:00
Ondřej Surý
5b1750f15f Fix missing mutex destroy and ede invalidate on fctx_create() error paths
The error cleanup in fctx_create() was missing isc_mutex_destroy() and
dns_ede_invalidate() calls. When error paths (cleanup_nameservers,
cleanup_fcount, cleanup_qmessage, cleanup_adb) were taken after the
mutex and edectx were initialized, the fctx memory was freed without
properly destroying these resources first.
2026-03-17 16:05:11 +01:00
Ondřej Surý
96a22451d7 Fix rwlock type mismatch in delete_ds() error path
The lock is acquired for reading but the error path from
dns_rdata_fromstruct() incorrectly unlocks it as a write lock.
2026-03-17 16:05:11 +01:00
Matthijs Mekking
bc1d177cc2 Fast fail a validator deadlock
We return DNS_R_NOVALIDSIG if we detected a deadlock. Then in
'validate_async_done()', this result value is used to check if we
need to fall back to insecure. As part of that we create a new fetch
but that fails because of the detected deadlock. This results in a loop
of deadlock detected, fallback to insecure, deadlock detected, ...

Add a new result value, ISC_R_DEADLOCK, and return this instead when
we have detected a deadlock. This will be treated as a generic error,
as there is no special handling for this result value.
2026-03-16 16:46:51 +00:00
Ondřej Surý
6e286beaa6 Cleanup weird syntax defining struct dns_ixfr
The struct dns_ixfr was defined as part of struct dns_xfrin, probably
because at some point it was an anonymous struct and then it was changed
to a named struct with a typedef at the top.  Move the definition out of
struct dns_xfrin and fold it into the typedef ... dns_ixfr_t.
2026-03-16 12:17:06 +01:00
Ondřej Surý
f4b4f030c4 Cleanup the duplicate logic and comments around add into NSEC tree
After merging the NORMAL, NSEC and NSEC3 trees into a single QP tree, some
comments still referred to the auxiliary NSEC tree.  These have been
cleaned up, and the logic for passing the QP tree (write transaction) to
qpzone_addrdataset_inner() now makes it more obvious that this is
needed only when adding NSEC records.
2026-03-16 12:17:06 +01:00
Ondřej Surý
e57245ee81 Fix use-after-free in xfrin_recv_done
Move the LIBDNS_XFRIN_RECV_DONE probe execution before dns_xfrin_detach
in xfrin_recv_done.

Previously, dns_xfrin_detach was called before the trace probe, which
could free the xfr object.  Because the accessed member xfr->info is an
embedded array, the expression evaluates via pointer arithmetic rather
than a direct memory dereference.  Although this prevents a reliable
crash in practice, it technically remains a use-after-free issue.
Reorder the statements to ensure the transfer context is fully valid
when the probe executes.
2026-03-16 11:06:06 +01:00
Ondřej Surý
63d3c1f58a Simplify checkds_create() to return void
Since memory allocation never fails in BIND 9, checkds_create() cannot
fail.  Change it to return void and use designated initializers,
removing error handling at all call sites.
2026-03-14 13:58:26 +01:00
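The shape of the simplification in 63d3c1f58a (a constructor that cannot fail, returns void, and fills the object with a designated initializer) might look like the sketch below. The struct fields, `must_alloc()` and `checkds_create_sketch()` are hypothetical and do not reflect the real checkds structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct {
	void *zone;
	int flags;
	unsigned int sent;
} checkds_sketch_t;

/* Allocation that aborts instead of returning NULL. */
static void *
must_alloc(size_t size) {
	void *p = malloc(size);
	if (p == NULL) {
		abort();
	}
	return p;
}

/* No result code: there is no failure for callers to handle. */
static void
checkds_create_sketch(int flags, checkds_sketch_t **checkdsp) {
	assert(checkdsp != NULL && *checkdsp == NULL);
	checkds_sketch_t *checkds = must_alloc(sizeof(*checkds));
	/* Fields not named in the initializer are zero-initialized. */
	*checkds = (checkds_sketch_t){ .flags = flags };
	*checkdsp = checkds;
}
```

Call sites shrink to a single call with no error branch to test or clean up after.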
Ondřej Surý
d7e1013741 Fix cb_args memory leak in ns_query() error path
Initialize cb_args to NULL and free it in the cleanup path so it
is not leaked when the function fails after allocation.
2026-03-14 13:48:08 +01:00
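The fix in d7e1013741 uses the common "initialize to NULL, free in cleanup" idiom, sketched here. The function, the struct and the `live_allocs` counter are hypothetical test scaffolding and are not taken from ns_query().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

static int live_allocs = 0; /* test hook: outstanding allocations */

static void *
xalloc(size_t size) {
	void *p = malloc(size);
	assert(p != NULL);
	live_allocs++;
	return p;
}

static void
xfree(void *p) {
	if (p != NULL) {
		live_allocs--;
		free(p);
	}
}

typedef struct {
	int unused;
} cb_args_sketch_t;

static int
start_query_sketch(bool fail_late, cb_args_sketch_t **out) {
	int result = -1;
	cb_args_sketch_t *cb_args = NULL; /* safe to free on any path */
	assert(out != NULL && *out == NULL);

	cb_args = xalloc(sizeof(*cb_args));
	if (fail_late) {
		goto cleanup; /* failure after the allocation */
	}

	*out = cb_args; /* ownership transferred to the caller */
	cb_args = NULL;
	result = 0;

cleanup:
	xfree(cb_args); /* no-op on success, frees on failure */
	return result;
}
```

Setting the local pointer to NULL once ownership moves lets a single cleanup label serve both the success and failure paths.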
Ondřej Surý
1505cb1c24 Fix TSIG key and transport leaks in zone_notify() error paths
Two 'goto next' paths in zone_notify() skipped detaching the TSIG
key and transport, leaking them on TLS configuration failure and
when the destination address is disabled.
2026-03-14 13:48:08 +01:00
Ondřej Surý
80fae7a4b7 Fix memory leak in ixfr_commit() error path
The 'data' allocation was not freed when reaching the cleanup
label with an error result.
2026-03-14 13:48:08 +01:00
Ondřej Surý
d0165070c7 Fix memory context leak in dns_client_resolve() error path
Use isc_mem_putanddetach() instead of isc_mem_put() to properly
detach the attached memory context stored in resarg->mctx.
2026-03-14 13:47:48 +01:00
Aram Sargsyan
4df5b9ac32 Fix a bug in rpz.c:del_name()
When the dns_qp_getname() call returns an error the del_name() function
just returns without cleaning up the transaction.

Instead of returning, jump to a new label 'done:' similar to the code
written in the add_nm() function.
2026-03-14 13:01:55 +01:00
Ondřej Surý
5cd17c8adc Fix memory leak in dns_catz_options_setdefault() for zonedir
When defaults->zonedir is set, opts->zonedir is unconditionally
overwritten without freeing the previous value. This leaks memory
on every catalog zone update when zonedir defaults are configured.

Free the existing opts->zonedir before replacing it.
2026-03-14 07:57:00 +01:00
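The leak fixed in 5cd17c8adc comes down to freeing the old string before overwriting the pointer, as in this sketch. `opts_sketch_t`, `dup_string()` and `set_default_zonedir()` are hypothetical and use plain malloc/free rather than BIND's memory contexts.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
	char *zonedir; /* owned, may be NULL */
} opts_sketch_t;

static char *
dup_string(const char *s) {
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);
	assert(copy != NULL);
	memcpy(copy, s, len);
	return copy;
}

static void
set_default_zonedir(const opts_sketch_t *defaults, opts_sketch_t *opts) {
	assert(defaults != NULL && opts != NULL);
	if (defaults->zonedir != NULL) {
		/* Release the previous value first; free(NULL) is a no-op. */
		free(opts->zonedir);
		opts->zonedir = dup_string(defaults->zonedir);
	}
}
```

Without the `free()` call, every invocation with a configured default would orphan the previous copy.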
Ondřej Surý
e7c550730a Dispatch async work jobs from the correct loop
Refactor dns_loadctx_t and dns_dumpctx_t to use standard
ISC_REFCOUNT_DECL and ISC_REFCOUNT_IMPL macros, retiring the
redundant manual attach and detach implementations.

Introduce dns_loadctx_enqueue() and dns_dumpctx_enqueue() to
ensure compliance with the new strict loop affinity in
isc_work_enqueue(). If the current loop does not match the
target loop, the enqueue operation is safely bounced to the
correct thread via isc_async_run().
2026-03-14 06:32:54 +01:00
Aram Sargsyan
172f5496ba Fix a bug in dns_tkey_processquery()
The 'keyname' variable could be used in the add_rdata_to_list()
call without being initialized. Make sure that 'keyname' is non-NULL
for all the cases that do not jump to the 'cleanup:' label.
2026-03-13 13:38:07 +01:00
Ondřej Surý
a854a5c83d Fix memory leak in QPcache addnoqname/addclosest mechanism
An attacker who controls a DNSSEC-signed zone can trigger a memory leak
in addnoqname() and/or addclosest() by creating more than
max-records-per-type RRSIGs for any NSEC record.  The memory leaks have
been fixed.
2026-03-13 13:18:48 +01:00
Matthijs Mekking
6ca67f65cd Check RRset trust in validate_neg_rrset()
In many places we only create a validator if the RRset has too low
trust (the RRset is pending validation, or could not be validated
before). This check was missing prior to validating negative response
data.
2026-03-13 13:03:33 +01:00
Matthijs Mekking
d4c7c83a70 Combine validator_log and marksecure
When we mark RRsets as secure, we usually also log a debug
message. Combine the two, the same way 'markanswer()' does.
2026-03-13 13:03:33 +01:00
Matthijs Mekking
0ec08c2120 Don't verify already trusted rdatasets
If we already marked an rdataset as secure (or it has even stronger
trust), there is no need to cryptographically verify it again.
2026-03-13 13:03:33 +01:00
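The short-circuit described in 0ec08c2120 can be sketched as a trust-level comparison ahead of the expensive check. The enum values, `crypto_verify()` and `ensure_secure()` are hypothetical and simplified; BIND's real trust levels are more fine-grained.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, ordered from least to most trusted. */
typedef enum {
	TRUST_NONE = 0,
	TRUST_PENDING,
	TRUST_ANSWER,
	TRUST_SECURE,
	TRUST_ULTIMATE
} trust_sketch_t;

typedef struct {
	trust_sketch_t trust;
} rdataset_sketch_t;

static unsigned int verify_calls = 0; /* test hook */

/* Stand-in for an expensive cryptographic verification. */
static bool
crypto_verify(const rdataset_sketch_t *rdataset) {
	(void)rdataset;
	verify_calls++;
	return true;
}

static bool
ensure_secure(rdataset_sketch_t *rdataset) {
	assert(rdataset != NULL);
	if (rdataset->trust >= TRUST_SECURE) {
		return true; /* already secure or stronger: skip the crypto */
	}
	if (!crypto_verify(rdataset)) {
		return false;
	}
	rdataset->trust = TRUST_SECURE;
	return true;
}
```

A second call on the same rdataset returns immediately, and a stronger trust level is never downgraded to "secure".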
Matthijs Mekking
988040a5e0 Check iterations in isdelegation()
When looking up an NSEC3 as part of an insecurity proof, check the
number of iterations. If this is too high, treat the answer as insecure
by marking it with trust level "answer", indicating that it
did not validate, but can be cached as insecure.
2026-03-13 13:03:33 +01:00
Mark Andrews
cfa21d1e8b Set length in dns_rdata_in_dhcid structure
tostruct_in_dhcid was not setting the length field in the
dns_rdata_in_dhcid structure.
2026-03-12 14:08:32 +11:00
Ondřej Surý
2da669490c Fix resquery reference imbalance on TCP connect failure
In fctx_query(), resquery_ref(query) is called before
dns_dispatch_connect() in anticipation of the resquery_connected()
callback consuming the reference.

When dns_dispatch_connect() fails synchronously on TCP (e.g. from
dns_transport_get_tlsctx() failing in tcp_dispatch_connect()), the
connect callback is never scheduled, so the extra reference is never
consumed.  The error path then tears down the query via manual cleanup
(isc_mem_put) without going through the refcount destructor, leaving
the reference imbalanced.

Fix by dropping the extra reference on the error path, just after
dns_dispatch_done() which cleans up the dispatch entry.
2026-03-10 17:58:43 +01:00
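The reference balance described in 2da669490c can be modelled with a small sketch: a reference is taken for the pending callback, and must be dropped by the caller if the operation fails synchronously and the callback will never run. `query_sketch_t`, `connect_sketch()` and `start_query()` are hypothetical and only model the counting, not the dispatch API.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
	atomic_uint refs;
} query_sketch_t;

/* Returns true if the connect callback was scheduled. */
static bool
connect_sketch(bool fail_sync) {
	return !fail_sync;
}

/* Callback side: consumes the reference taken in start_query(). */
static void
connected_cb(query_sketch_t *query) {
	atomic_fetch_sub(&query->refs, 1);
}

static bool
start_query(query_sketch_t *query, bool fail_sync) {
	assert(query != NULL);
	atomic_fetch_add(&query->refs, 1); /* for the connect callback */
	if (!connect_sketch(fail_sync)) {
		/* The callback will never run: drop its reference here. */
		atomic_fetch_sub(&query->refs, 1);
		return false;
	}
	return true;
}
```

After a synchronous failure the count is back where it started, so the remaining teardown sees a balanced reference count.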