Commit graph

16558 commits

Author SHA1 Message Date
Matthijs Mekking
d36d775f0f Rename private zone functions
Rename functions that are defined in the private header file to start
with 'dns__zone_'.
2026-04-08 14:24:17 +02:00
Matthijs Mekking
1a0b419991 Lock zone when incrementing statistics
dns__zone_stats_increment() requires the zone to be locked. This was
not always the case. This commit fixes that.
2026-04-08 14:24:17 +02:00
Matthijs Mekking
d3eba4e78f Replace static functions with private functions
Replace 'inc_stats()' with 'dns__zone_stats_increment()'.

Replace 'get_request_transport_type()' with
'dns_zone_getrequesttransporttype()'.
2026-04-08 14:24:17 +02:00
Matthijs Mekking
080e849eaa Move zonemgr to own source file
In order to make zone.c more readable, we are splitting it up into
separate source files. This moves the zonemgr to its own file
("zonemgr.c").

Since this code accesses the zone structure directly, move the
'struct dns_zonemgr' and its prerequisites to "zone_p.h".

The helper functions 'forward_cancel()', 'zone_xfrdone()',
'zmgr_start_xfrin_ifquota()', and 'zmgr_resume_xfrs()' need to be
internally accessible to both source files.

Note: This commit does not compile.
2026-04-08 14:24:17 +02:00
Matthijs Mekking
e0f09bb374 Rename isdelegation() to is_insecure_referral()
The name 'isdelegation()' was confusing. This function is not checking
whether this message is a delegation, but whether the denial of
existence proofs in this message is a proof of a referral to an
unsigned zone.

The name 'is_insecure_referral()' is more appropriate.
2026-04-07 08:38:57 +02:00
Matthijs Mekking
3ac1bb1c39 Revert isdelegation() to return boolean value again
The isdelegation() was changed to return an isc_result_t because the
idea was to have a separate return value DNS_R_NSEC3ITERRANGE to signal
to the caller we could not verify the proof because of too many
iterations in the NSEC3 record, or perhaps ISC_R_UNEXPECTED for a more
generic cause that verification was not done.

But this would make error handling more fragile and all we care about
is whether we can reliably say the NS bit was not set.

If we can not reliably say so, we have to treat it as an insecure
referral.

Since the answer is either yes or no, we can revert back to returning
a boolean value.
2026-04-07 08:38:46 +02:00
Aydın Mercan
e16a3d7a8e embed default sanitizer flags in executables
Replicating CI failures requires the developer to piece together the
sanitizer flags by hand, reducing ergonomics.

Fix this problem by embedding the relevant settings in the executables.
Symbol resolution still needs manual intervention by setting the env
variable `*SAN_SYMBOLIZER_PATH`. However, this doesn't affect any behavior.
2026-04-05 12:46:38 +03:00
Aram Sargsyan
141ff7bfa7 Fix a race condition in xfrin_recv_done() when calling xfrin_reset()
When the xfrin_recv_done() function decides to retry the transfer
using AXFR because of a previous error, it calls the xfrin_reset()
function which calls dns_db_closeversion() on 'xfr->ver'. The problem
is that the IXFR processing of a previous message could still be
in progress in a worker thread, which can then use the freed 'xfr->ver'.

If there is an ongoing worker thread, delay the AXFR retry until after
the worker thread has finished its work.
2026-04-03 11:01:34 +00:00
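The deferral described in 141ff7bfa7 can be sketched as follows. This is a minimal single-threaded model, not the actual xfrin code: the `xfr_model_t` type, its flags, and all three functions are hypothetical names introduced here, and a counter stands in for the real reset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a transfer; the real structure differs. */
typedef struct xfr_model {
    bool worker_busy;   /* a worker is still processing a previous message */
    bool retry_pending; /* an AXFR retry was requested while the worker ran */
    int resets;         /* counts how often the stand-in reset has run */
} xfr_model_t;

/* Stand-in for the reset that closes the version the worker may be using. */
void model_reset(xfr_model_t *xfr) {
    assert(xfr != NULL && !xfr->worker_busy);
    xfr->resets++;
}

/* Request an AXFR retry: reset now if idle, otherwise remember the request. */
void model_request_retry(xfr_model_t *xfr) {
    assert(xfr != NULL);
    if (xfr->worker_busy) {
        xfr->retry_pending = true;
    } else {
        model_reset(xfr);
    }
}

/* Called when the worker finishes: run the deferred retry, if any. */
void model_worker_done(xfr_model_t *xfr) {
    assert(xfr != NULL);
    xfr->worker_busy = false;
    if (xfr->retry_pending) {
        xfr->retry_pending = false;
        model_reset(xfr);
    }
}
```

The point of the model is that the reset never runs while `worker_busy` is set, so the worker cannot observe a version that has already been closed.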
Ondřej Surý
3fe3751c22 Fix wrong NSEC proof for empty non-terminals after IXFR
When receiving NSEC records via IXFR, the node was not marked with
havensec because the condition checked the uninitialized output
rdataset type instead of the input rdataset type.  This caused
queries for empty non-terminal names in NSEC-signed zones received
via IXFR to return the zone apex NSEC instead of the correct
covering NSEC record.

The bug was introduced in f4b4f030.
2026-04-03 06:33:31 +02:00
Ondřej Surý
14cebe4d61 Change NSEC3 and NSEC3PARAM struct fields to use isc_region_t
Replace the separate pointer+length field pairs in dns_rdata_nsec3_t
(salt/salt_length, next/next_length, typebits/len) and
dns_rdata_nsec3param_t (salt/salt_length) with isc_region_t.  This
makes the structs self-describing and eliminates a class of
length-mismatch bugs.

The dns_zone_setnsec3param() signature is updated to take
isc_region_t *salt instead of separate saltlen and salt arguments.

Function signatures for dns_nsec3_addnsec3, dns_db_getnsec3parameters,
and related internal functions still use separate pointer+length pairs
and should be updated in a follow-up.
2026-04-02 16:53:18 +02:00
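To illustrate the idea behind 14cebe4d61, here is a minimal sketch of replacing a pointer+length pair with a single region value. The `region_t` type below is a local stand-in with the same base-plus-length shape as isc_region_t; the two `nsec3param_*` structs and `region_equal()` are hypothetical and far smaller than the real rdata structs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Local stand-in: a base pointer bundled with its length. */
typedef struct region {
    unsigned char *base;
    unsigned int length;
} region_t;

/* Before (hypothetical): pointer and length are separate fields that
 * every caller has to keep in sync by hand. */
typedef struct nsec3param_before {
    unsigned char *salt;
    unsigned char salt_length;
} nsec3param_before_t;

/* After (hypothetical): the salt describes its own extent. */
typedef struct nsec3param_after {
    region_t salt;
} nsec3param_after_t;

/* Compare two regions; the length check cannot be forgotten because the
 * length always travels with the pointer. Returns 1 if equal, else 0. */
int region_equal(const region_t *a, const region_t *b) {
    assert(a != NULL && b != NULL);
    if (a->length != b->length) {
        return 0;
    }
    return a->length == 0 || memcmp(a->base, b->base, a->length) == 0;
}
```

A helper such as `region_equal()` takes one argument per buffer, which is what removes the opportunity to pass a length that belongs to a different pointer.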
Matthijs Mekking
93376b8f67 Return void in functions that cannot fail
dns_zone_getloadtime(), dns_zone_getexpiretime(),
dns_zone_getrefreshtime(), and dns_zone_getrefreshkeytime()
cannot fail, so return void instead of ISC_R_SUCCESS.
2026-04-02 15:50:09 +02:00
Matthijs Mekking
f015170ec7 Fix some documentation issues
dns_zone_setorigin() does not return ISC_R_SUCCESS, but void.

dns_zone_setalsonotify() referred to non-existing
dns_zone_alsonotifywithkeys().
2026-04-02 15:50:09 +02:00
Matthijs Mekking
f04590d006 Remove unnecessary functions
Remove dns_zone_getmem() (duplicate of dns_zone_getmctx()).
Remove dns_zone_getrdclass() (duplicate of dns_zone_getclass()).
Remove obsoleted dns_zone_getstatscounters().
Remove obsoleted dns_zone_setstatistics().
2026-04-02 15:50:09 +02:00
Matthijs Mekking
af505bc44f Small refactor 'dns_zone_set*acl()'
The various 'dns_zone_set*acl()' functions can be refactored to
call 'dns_zone_clear*acl()', to avoid code duplication.
2026-04-02 15:50:09 +02:00
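The refactoring in af505bc44f follows a common pattern: a setter first calls the matching clear function instead of repeating its detach logic. A minimal sketch, with hypothetical `acl_model_t` and `zone_model_t` types and a plain counter in place of real reference counting:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real ACL and zone types are much larger. */
typedef struct acl_model {
    unsigned int references;
} acl_model_t;

typedef struct zone_model {
    acl_model_t *query_acl;
} zone_model_t;

/* Drop the zone's reference to its current ACL, if it has one. */
void model_clearqueryacl(zone_model_t *zone) {
    assert(zone != NULL);
    if (zone->query_acl != NULL) {
        zone->query_acl->references--;
        zone->query_acl = NULL;
    }
}

/* Install a new ACL. The old one is released by reusing the clear
 * function, so the detach logic exists in one place only. */
void model_setqueryacl(zone_model_t *zone, acl_model_t *acl) {
    assert(zone != NULL && acl != NULL);
    model_clearqueryacl(zone);
    acl->references++;
    zone->query_acl = acl;
}
```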
Matthijs Mekking
6c69fd16d0 Lock zone when checking for inline raw/secure
The caller is supposed to hold the zone lock for 'inline_raw()' and
'inline_secure()', but when adding 'REQUIRE(LOCKED_ZONE(zone));' to
these functions it turned out to be not always the case.
2026-04-02 15:50:09 +02:00
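The precondition check mentioned in 6c69fd16d0 can be modeled as below. This is only a sketch under stated assumptions: a boolean flag stands in for the real zone lock, `assert()` stands in for `REQUIRE(LOCKED_ZONE(zone))`, and every type and function name here is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical zone model: 'locked' mimics the state of the zone lock. */
typedef struct lockzone_model {
    bool locked;
    bool raw; /* illustrative inline-signing state */
} lockzone_model_t;

void model_lock(lockzone_model_t *zone) {
    assert(zone != NULL && !zone->locked);
    zone->locked = true;
}

void model_unlock(lockzone_model_t *zone) {
    assert(zone != NULL && zone->locked);
    zone->locked = false;
}

/* Callee: documents and enforces that the caller holds the lock. */
bool model_inline_raw(const lockzone_model_t *zone) {
    assert(zone != NULL && zone->locked);
    return zone->raw;
}

/* Caller: takes the lock around the check, which is the shape of fix the
 * commit message describes for call sites that did not hold it. */
bool model_is_inline_raw(lockzone_model_t *zone) {
    bool result;

    model_lock(zone);
    result = model_inline_raw(zone);
    model_unlock(zone);
    return result;
}
```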
Matthijs Mekking
94788446db Rename private zone functions
Rename functions that are defined in the private header file to start
with 'dns__zone_'.
2026-04-02 15:50:09 +02:00
Matthijs Mekking
2893e128a7 Move zone set/get properties to own source file
In order to make zone.c more readable, we are splitting it up into
separate source files. This moves the set and get functions to their
own file ("zoneproperties.c").

Since this code accesses the zone structure directly, move the
'struct dns_zone' and its prerequisites to "zone_p.h".

The helper functions 'inline_raw()', 'inline_secure()',
'dns_zone_setview_helper()', 'zone_settimer()', 'set_resigntime()', and
'zone_freedbargs()' need to be internally accessible to both source
files.

A few set/get functions remain in zone.c for now:
- dns_zone_getserial
- dns_zone_getversion
- dns_zone_setviewcommit
- dns_zone_setviewrevert
- dns_zone_get_rpz_num
- dns_zone_set_parentcatz
- dns_zone_get_parentcatz
- dns_zone_setrawdata
- dns_zone_setskr
- dns_zone_getskrbundle
- dns_zone_setnsec3param
- dns_zone_setoption
- dns_zone_getoptions
- dns_zone_getrequesttransporttype
- dns_zone_getredirecttype
- dns__zone_getnotifyctx
- dns_zone_getgluecachestats
- dns_zone_setplugins
- dns_zone_setserial
- dns_zone_getxfr
- dns_zone_getkeystores
2026-04-02 15:50:07 +02:00
Matthijs Mekking
e1bd1a4003 Introduce zone functions dns_zone_(get|set)modded
Introduce new functions to set and get whether the zone configuration
has been modified with 'rndc modzone'.
2026-04-02 12:35:54 +00:00
Colin Vidal
fc03a876ae fix NULL dereference in dns_view_bestzonecut()
When `dns_view_bestzonecut()` is called with a NULL `delegsetp`, it
calls `bestzonecut_zone()` with a NULL `rdataset` pointer but there is a
non-guarded de-reference of the `rdataset` pointer in
`bestzonecut_zone()`.

In practice, the only current situation where `dns_view_bestzonecut()`
is called with NULL `delegsetp` is from a case of `seek_ds()` _and_ the
non-guarded dereference occurs only if there is a static-stub local
zone matching the zonecut `seek_ds()` is looking for. It's unclear if
such flow is actually possible.

The `rdataset` is now always valid inside `dns_view_bestzonecut()`. (It
was initially set only if `delegsetp` was set, to avoid extra work in
the qpzone, which can be skipped when `rdataset` is NULL, but this
doesn't really make a difference: we are already on a slow path, since
the result wasn't found in this case.)
2026-04-02 12:16:12 +02:00
Colin Vidal
50a2fce68f remove dead code in query_addbestns()
The local variable `zfname` was released in the cleanup part of the
function if not NULL, but it turns out it is now always NULL at that
point.

The flow can get to that part only in two cases: either `zfname` is not
NULL, and then its ownership is moved to a different variable (thus, it
is now NULL), or `zfname` is already NULL.

Remove the bit of dead code releasing it.
2026-04-02 11:51:12 +02:00
Ondřej Surý
c0a6f3bf65 Fix GSS context leak on error paths in process_gsstkey()
After gss_accept_sec_context() succeeds, the GSS context is passed
to dst_key_fromgssapi() which transfers ownership to the dst_key.
If a subsequent operation fails (dst_key_fromgssapi itself,
dns_tsigkey_createfromkey, or dns_tsigkeyring_add), the cleanup
label frees the dst_key but only if it was created.  If the failure
happened before dst_key_fromgssapi, the GSS context was orphaned.

Delete the GSS context in the cleanup path when it was not
transferred to a dst_key.
2026-04-01 07:04:40 +02:00
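The ownership rule behind c0a6f3bf65 can be sketched as follows: the cleanup path deletes the context only if it was never handed over to a key. All types and functions below are hypothetical stand-ins; setting a `deleted` flag takes the place of a real gss_delete_sec_context() call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for a GSS context and the key that may own it. */
typedef struct gssctx_model {
    bool deleted;
} gssctx_model_t;

typedef struct gsskey_model {
    gssctx_model_t *ctx;
} gsskey_model_t;

/* On success the key takes ownership and the caller's pointer is cleared,
 * so the caller can tell later whether it still owns the context. */
bool model_key_fromctx(gsskey_model_t *key, gssctx_model_t **ctxp) {
    assert(key != NULL && ctxp != NULL && *ctxp != NULL);
    key->ctx = *ctxp;
    *ctxp = NULL;
    return true;
}

/* 'fail_early' simulates an error that happens before the hand-over. */
bool model_process(gssctx_model_t *ctx, gsskey_model_t *key, bool fail_early) {
    bool ok = false;

    if (fail_early) {
        goto cleanup;
    }
    if (!model_key_fromctx(key, &ctx)) {
        goto cleanup;
    }
    ok = true;

cleanup:
    if (ctx != NULL) {
        /* Still owned here: delete it instead of leaking it. */
        ctx->deleted = true;
    }
    return ok;
}
```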
Ondřej Surý
5305679633 Fix GSS context leak when principal name is empty
When gss_accept_sec_context() completes successfully but
gss_display_name() returns an empty principal, the GSS context
was leaked — it was neither stored in a key nor deleted.

Delete the context and reject with BADKEY in this case.  This
should only occur due to a GSS library bug, since a completed
context should always have a valid principal.
2026-04-01 07:04:39 +02:00
Ondřej Surý
8c1fe179e3 Fix off-by-one in TSIG generated key eviction
Use pre-increment (++ring->generated) instead of post-increment
(ring->generated++) so the comparison against DNS_TSIG_MAXGENERATEDKEYS
happens after counting the new key.  With post-increment, one extra key
beyond the limit was allowed before eviction kicked in.
2026-04-01 07:04:39 +02:00
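The off-by-one in 8c1fe179e3 is easy to reproduce in isolation. The sketch below assumes the comparison uses `>` and picks a small hypothetical limit (`MODEL_MAXGENERATED`) in place of DNS_TSIG_MAXGENERATEDKEYS; the ring type and helper names are also invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAXGENERATED 3 /* hypothetical small limit */

typedef struct ring_model {
    unsigned int generated;
} ring_model_t;

/* Post-increment: the comparison sees the count BEFORE the new key.
 * Returns true when eviction should kick in. */
bool model_add_postinc(ring_model_t *ring) {
    assert(ring != NULL);
    return ring->generated++ > MODEL_MAXGENERATED;
}

/* Pre-increment: the comparison sees the count AFTER the new key. */
bool model_add_preinc(ring_model_t *ring) {
    assert(ring != NULL);
    return ++ring->generated > MODEL_MAXGENERATED;
}

/* Count how many keys are admitted before the first eviction. */
unsigned int model_admitted(bool (*add_key)(ring_model_t *)) {
    ring_model_t ring = { 0 };
    unsigned int admitted = 0;

    while (!add_key(&ring)) {
        admitted++;
    }
    return admitted;
}
```

With a limit of 3, the post-increment variant admits 4 keys before evicting, while the pre-increment variant admits exactly 3.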
Ondřej Surý
5e10fdc295 Implement RFC 3645 Section 4.1.1 key expiry check in TKEY
Check for existing non-expired TSIG keys before accepting a new
GSS-API negotiation.  Per RFC 3645 Section 4.1.1:

- If a key exists and has not expired, reject with BADNAME
- If a key exists but has expired, delete it and start fresh

Previously, an expired GSS key would permanently block
re-negotiation for that name until the server was restarted.

Use BADKEY rather than BADNAME to avoid creating an oracle for
key name enumeration by unauthenticated attackers.
2026-04-01 07:04:39 +02:00
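The decision described in 5e10fdc295 reduces to three cases, sketched below. The `tkeycheck_model_t` type, the action enum, and the function name are hypothetical; the real code works on TSIG key objects in a keyring rather than a flag and a timestamp.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

typedef enum {
    MODEL_NEGOTIATE,    /* proceed with a fresh GSS-API negotiation */
    MODEL_REJECT_BADKEY /* refuse, without revealing that the name exists */
} model_action_t;

/* Hypothetical view of "is there already a key for this name?". */
typedef struct tkeycheck_model {
    bool present;
    time_t expire;
} tkeycheck_model_t;

model_action_t model_check_existing(tkeycheck_model_t *key, time_t now) {
    assert(key != NULL);
    if (!key->present) {
        return MODEL_NEGOTIATE; /* no key yet */
    }
    if (now < key->expire) {
        return MODEL_REJECT_BADKEY; /* key exists and is still valid */
    }
    /* Key exists but has expired: delete it and start fresh, so it
     * no longer blocks re-negotiation for that name. */
    key->present = false;
    return MODEL_NEGOTIATE;
}
```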
Alessio Podda
9fe3809ccf Fix benign race condition
The dns_rdatavec_subtractrdataset function would copy the old header
using memmove but the old header includes fields such as trust and
reference counts that are atomic.

While the values of those fields were never used, it did cause a benign
race condition. This commit refactors dns_rdatavec_subtractrdataset and
dns_rdatavec_merge not to use memmove.
2026-03-31 16:25:33 +02:00
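The commit message for 9fe3809ccf does not say what replaced memmove, so the following is only one plausible shape of such a fix: copy the plain fields one by one and initialize the atomic field of the new header instead of byte-copying it. The `header_model_t` layout and its field names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header: plain data next to a concurrently updated counter. */
typedef struct header_model {
    uint16_t type;
    uint32_t ttl;
    atomic_uint_fast32_t references;
} header_model_t;

/* Copy without memmove(): a byte-wise copy would also read the atomic
 * field non-atomically while another thread may be updating it. */
void model_header_copy(header_model_t *dst, const header_model_t *src) {
    assert(dst != NULL && src != NULL);
    dst->type = src->type;
    dst->ttl = src->ttl;
    /* The new header starts with its own count (assumed to be 1 here)
     * rather than inheriting the old value. */
    atomic_init(&dst->references, 1);
}
```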
Alessio Podda
6202492509 Remove node and db pointer from dns_rdataset_t.vec
Now that we track the references at the vecheader level, binding an
rdataset is no longer guaranteed to keep its node alive. Therefore
remove the node pointer from the rdataset, and instead decide whether
glue is required by explicitly passing the owner name to addglue.
2026-03-31 16:22:56 +02:00
Alessio Podda
3521900ecd Add hashmap to qpz_heap
This commit adds a level of indirection to the signing operations.
Instead of being intrusive, the qpz_heap will keep track of which
headers must be resigned through a hashmap.
The intent is to make dns_vecheader_t entirely self-contained. In
particular, the ownership structure between the heap and the headers is
flipped. Before, the headers would "own" the heap, now the heap owns
the header.
2026-03-31 16:22:56 +02:00
Alessio Podda
1aa0768151 Refactor setsigningtime
Change setsigningtime to take the node of the header being changed.
Done to facilitate further refactoring that will remove the header
pointer from vecheader.
2026-03-31 16:22:56 +02:00
Alessio Podda
0683d76025 Extract heap deregistration
This commit changes the deregistration of vecheaders from the heap to
go through a private api instead of the dyndb public one. This is safe
since vecheader is only used by qpzone.

This is done in order to facilitate further refactoring.
2026-03-31 16:22:56 +02:00
Aydın Mercan
2a62cd449f include <sys/endian.h> based on a meson check
The <sys/endian.h> header has existed in macOS since around version 26. This
causes the `htobeNN`/`htoleNN` macros to be redefined in <isc/endian.h>
in terms of <libkern/OSByteOrder.h> when other system headers include
<sys/endian.h>.

Fix this issue by checking for the existence of <sys/endian.h> in
meson and including it according to the probe result.
2026-03-31 16:06:37 +03:00
Ondřej Surý
9e40f6508c Remove the dead dns_expire_ttl code path and deletettl stats counter
Now that TTL-based cleaning has been removed, the dns_expire_ttl enum
value, its switch case in expireheader(), and the deletettl stats counter
(text, XML, JSON) are all dead code.  Remove them so the stats channel
no longer reports a permanently-zero counter.
2026-03-30 21:46:44 +02:00
Ondřej Surý
03ff80a1f7 Move ADB TTL-based cleanup into dump_adb()
Instead of doing a full sweep of all names and entries before dumping,
expire stale entries lazily as they are encountered during the dump
iteration.  This aligns with the QPcache approach of avoiding separate
TTL-based cleaning passes.

dns_adb_flush() retains its explicit full sweep since it needs to
force-expire everything.
2026-03-30 21:46:44 +02:00
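The lazy expiry described in 03ff80a1f7 can be sketched as a single pass that expires stale entries as it meets them. The entry type and function below are hypothetical, and a counter stands in for actually printing each entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical ADB-like entry. */
typedef struct adbentry_model {
    time_t expires;
    bool expired;
} adbentry_model_t;

/* Dump all live entries in one pass. Stale entries are marked expired
 * when encountered, so no separate sweep is needed before the dump.
 * Returns the number of entries that would have been printed. */
size_t model_dump(adbentry_model_t *entries, size_t count, time_t now) {
    size_t dumped = 0;

    assert(entries != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (entries[i].expired) {
            continue;
        }
        if (entries[i].expires <= now) {
            entries[i].expired = true; /* expire lazily */
            continue;
        }
        dumped++; /* real code would write the entry out here */
    }
    return dumped;
}
```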
Ondřej Surý
dc9564f14d Raise the minimum cache size to 8 MB, warn below 256 MB
Raise the hard floor for max-cache-size from 2 MB to 8 MB to support
resource-constrained environments (e.g. CPE devices) while remaining
safe for LRU-only eviction.
2026-03-30 21:46:44 +02:00
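Taken together with the fallback described in d7c99c14fc, the resulting size selection might look like the sketch below. It assumes that values under the floor are raised to the floor rather than rejected, and that 'unlimited' reaches this code as 0; both are assumptions, as are the macro and function names, which do not match the real helpers.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_CACHE_MINSIZE (8ULL * 1024 * 1024)    /* hard floor: 8 MB */
#define MODEL_CACHE_WARNSIZE (256ULL * 1024 * 1024) /* warn below 256 MB */

/* Pick the effective cache size from the configured value.
 * 0 (also standing in for 'unlimited') selects the default of
 * 90% of physical memory. */
uint64_t model_effective_cache_size(uint64_t configured,
                                    uint64_t physical_memory) {
    uint64_t size = configured;

    assert(physical_memory > 0);
    if (size == 0) {
        size = physical_memory / 10 * 9;
    }
    if (size < MODEL_CACHE_MINSIZE) {
        size = MODEL_CACHE_MINSIZE; /* assumed: clamp up to the floor */
    }
    return size;
}

/* Returns 1 if the size is small enough to warrant a warning. */
int model_cache_size_warns(uint64_t size) {
    return size < MODEL_CACHE_WARNSIZE;
}
```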
Ondřej Surý
6fa415f71f Refactor the 'max-cache-size' configuration handling
Extract the inline max-cache-size logic from configure_view() into
reusable helpers: configure_max_cache_size(), default_max_cache_size(),
max_cache_size_as_percent(), and sanitized_max_cache_size().

Move DNS_CACHE_MINSIZE and DNS_ADB_MINADBSIZE to public headers and
remove the SIZE_AS_PERCENT sentinel.
2026-03-30 21:46:44 +02:00
Ondřej Surý
d7c99c14fc Remove 'unlimited' setting for the max-cache-size
Since TTL-based cache cleaning has been removed, an unlimited
max-cache-size would eventually exhaust system memory.

Both 'max-cache-size unlimited;' and 'max-cache-size 0;' now fall
back to the default value (90% of physical memory for recursive
views).
2026-03-30 21:46:44 +02:00
Ondřej Surý
4891a6b14f Remove the heap memory context from QPcache
The heaps have been removed, so the separate heap memory context
(hmctx) is no longer needed.  Remove it from both dns_cache and
dns_qpcache, along with the HeapMemInUse statistics.
2026-03-30 21:46:44 +02:00
Ondřej Surý
602f5b73e6 Remove TTL-based cleaning from the QPcache
The experiments show that the SIEVE-LRU based mechanism is good enough
as the only mechanism for cleaning up the expired entries from the
cache.

This simplifies the internal logic and memory usage of the cache.

The disadvantage is that the cache use will organically grow until it
hits the overmem cleaning mechanism.

The advantage is that the measurements show that BIND 9 is well behaved
even with 512 MB cache under heavy load.
2026-03-30 21:46:44 +02:00
Ondřej Surý
08a33a9cc9 Remove useless .expire initialization from rdataslab
dns_rdataslab_fromrdataset() set .expire to rdataset->ttl, but the
only consumer (qpcache_addrdataset) immediately overwrote it with
now + rdataset->ttl.  Remove the redundant initialization and set the
expire time only once.
2026-03-30 21:46:44 +02:00
Colin Vidal
ea2cb4e9df test for auth+res server and glues in delegation
When a resolver+auth server has a delegation on a local zone and has a
glue, the glue can only be for in-domain NS.

In this case, when the resolver is looking at the zonecut,
`dns_view_bestzonecut()` synthesizes a delegset from an NS rdataset
found in the local zone (the delegation inside auth zone), and ignores
the glues if any.

As a result, the delegset will contain a single delegation of type
DNS_DELEGTYPE_NS_NAMES, which leads to an ADB fetch. But it's actually an
in-memory fetch, because in this case, the fetch will immediately find
the A/AAAA glues from the local zone.

An alternative approach (not chosen here) would be to make
`dns_view_bestzonecut()`, when converting an NS rdataset into a
`dns_deleg_t`, check for glues for the delegation in the auth zone, and
add those in the `dns_deleg_t`. The delegation would be of type
DNS_DELEGTYPE_NS_GLUES which would avoid the ADB name lookup.

However, that's extra code, extra logic and complexities, for a lookup
that will be done in memory anyway, just a bit later. So for now, this
is not implemented that way.

The test is added, however, to confirm that there is no attempt from the
resolver to get the NS from the child zone.
2026-03-30 20:41:13 +02:00
Evan Hunt
dc6202479f remove find_deepest_zonecut() from qpcache
because the cache no longer stores delegation (parent-side) NS rrsets,
and authoritative (child-side) NS rrsets don't affect recursion,
it no longer makes sense for qpcache_find() to look for NS rrsets
and return DNS_R_DELEGATION. that code has been removed.

the cache still does search for covering DNAME records. the
check_zonecut() function has been renamed to check_dname() for clarity.

related changes:
- one test case has been removed from the mirror system test, because it
  tested the behavior of a cached delegation.
- query_checkrrl() and rpz_rrset_find() have been updated so they no
  longer expect cache responses to have DNS_R_DELEGATION response codes.
2026-03-30 20:41:13 +02:00
Ondřej Surý
792d8a74ab Add invariant check for delegset in rctx_nextserver()
The get_nameservers path in rctx_nextserver() is only reachable from
rctx_referral(), which already detaches fctx->delegset.  Assert that
it is NULL rather than redundantly detaching it, since
dns_view_bestzonecut() requires *delegsetp == NULL.
2026-03-30 20:41:13 +02:00
Ondřej Surý
a1cb966944 Guard against NULL delegset in query_delegation_recurse()
If both dns_view_bestzonecut() and dns_deleg_fromrdataset() fail,
delegset stays NULL.  Passing it to ns_query_recurse() would crash
on the REQUIRE(DNS_DELEGSET_VALID(delegset)) in createfetch().

Return ISC_R_NOTFOUND instead, which lets the caller handle the
failure gracefully.
2026-03-30 20:41:13 +02:00
Ondřej Surý
3a339cfca4 Clean up frdataset in resume_dslookup() on shutdown
When resume_dslookup() receives ISC_R_SHUTTINGDOWN or ISC_R_CANCELED,
frdataset (fctx->nsrrset) was not disassociated.  While fctx__destroy()
eventually cleans it up, leaving it associated keeps the underlying DB
node referenced longer than necessary.
2026-03-30 20:41:13 +02:00
Evan Hunt
cd4a7a2d72 Fix fetchlimit test failure
When a referral lookup is triggered by a QMIN query, it should be
exempt from the fetches-per-zone limit just as the QMIN query itself
is.

Also restart the test server between the fetches-per-server and
fetches-per-zone tests so that leftover statistics from the former
do not pollute the latter.

Additionally, zone spills and general query drops are no longer in a
strict >= relation (on a parent-centric resolver), so check that both
counters are non-zero instead.
2026-03-30 20:41:13 +02:00
Colin Vidal
f2f9a97526 Do not cache NS from referral in negative responses
Stop storing the NS referral into the main cache when processing a
negative response.  These records are already cached in the delegation
database and are not needed elsewhere.

Update dnssec tests that relied on parent-side NS RRsets being
returned in recursive query responses.
2026-03-30 20:41:13 +02:00
Colin Vidal
bc8f0b3a79 Cleans up mark_related()
Cleans up mark_related(): since the FCTX_ATTR_GLUING flag is never set
anymore, the code that handled it has been removed.
2026-03-30 20:41:13 +02:00
Colin Vidal
883478bc6a Use delegdb for lookup in query_delegation_recurse()
When `query.c` finds a zonecut in the main cache (e.g. from stale NS
records), it must still use the correct delegation for recursion. Look
up the delegation DB via `dns_view_bestzonecut()` first; fall back to
`dns_deleg_fromrdataset()` only if no delegation is found.

This might also be done inside `query_lookup()` instead, with the `qctx`
holding a delegset property, but that approach needs further work to
avoid breakage, and it is not yet clear whether there would be other use
cases for it. The current approach is simpler for now.
2026-03-30 20:41:13 +02:00
Colin Vidal
6ed7a8a723 Resolver is parent-centric
The resolver now uses glue addresses from `dns_deleg_t` objects stored
in the delegation database.  The main cache is still used for ADB A/AAAA
lookups when no glue is available for a nameserver name.

The resolver's `fctx_getaddresses()` is refactored to, for each
delegation of the delegation set, try to get the address-based finds,
then nameserver name lookups. (Later, the logic to handle DELEG
`include-delegparm=` will be hooked there too.)
2026-03-30 20:41:13 +02:00
Colin Vidal
cfac5f3974 Add dns_adb_createaddrinfosfind() for address-based lookups
Add a new ADB API function that creates a find from a list of addresses
rather than by looking up nameserver names.  This enables the resolver
to handle address-based delegations (NS-based with glues or DELEG with
addresses) and name-based delegations uniformly (i.e. the list of finds
from ADB is handled the same way no matter the type of the delegation).
2026-03-30 20:41:13 +02:00
Evan Hunt
a9883483ef Remove dns_db_findzonecut()
This function is no longer used and has been removed, along with its
implementation in qpcache.
2026-03-30 20:41:13 +02:00