Commit graph

1696 commits

Author SHA1 Message Date
Alessio Podda
ed0ecb62e4 Add low contention stats counter
In the current statistics counter implementation, the statistics are
backed by an array of counters, which are updated via atomic operations.
This leads to contention, especially on high core count
machines.

This commit introduces a new isc_statsmulti_t counter that keeps a
separate array per thread. These counters are then aggregated only when
statistics are queried, shifting work off the critical path.

These changes lead to a ~2% improvement in perflab.
2026-03-26 10:19:25 +01:00
Michał Kępień
b0fc0e31c5 Merge tag 'v9.21.20' 2026-03-25 14:23:41 +00:00
Aram Sargsyan
4ac3a6520e Convert dns_dtenv_t reference counting to standard macors
Use standard reference counting macros for dns_dtenv_t instead of
custom attach/detach functions.
2026-03-18 16:10:07 +00:00
Matthijs Mekking
81dca80877
Update documentation now that LMDB is required
Remove references to viewname.nzf, and no longer use "if LMDB is used".
2026-03-18 11:02:33 +01:00
Ondřej Surý
8ae0828e15
Split NZD functions into a separate compilation unit
Move all LMDB-based new zone database functions from server.c into
nzd.c to reduce the size of server.c and isolate the NZD/LMDB
interface. Rename load_nzf() to nzd_load_nzf() to match the nzd_
namespace.
2026-03-18 11:02:33 +01:00
Ondřej Surý
f203f6e77a
Remove dead NZF writer parameter and simplify newzone locking
Now that NZF write support is gone, remove the unused nzfwriter_t
typedef and nzfwriter parameter from delete_zoneconf().  Remove the
bool locked parameter and simplify the locking in do_modzone() and
rmzone() to unconditional lock/unlock pairs.
2026-03-18 11:02:33 +01:00
Ondřej Surý
7f8b972a3d
Remove NZF support, make LMDB required for new zone storage
Drop the NZF (New Zone File) fallback for persisting runtime zone
configurations, making LMDB (NZD) the only storage backend. This
removes all #ifdef HAVE_LMDB conditionals, the meson 'lmdb' option,
and the NZF-related functions. LMDB is now a mandatory build
dependency.

The named-nzd2nzf tool is now always built.
2026-03-18 11:02:33 +01:00
Matthijs Mekking
780872e07e Don't call dns_zone_setadded() on modify
If we are modifiying the zone, the zone must have been added before.
Don't overwrite this value on modifications.

Also it feels cleaner to pass added=false to configure_zone() in
do_modzone().
2026-03-16 15:18:39 +01:00
Matthijs Mekking
71587b0816 Only lock view->newzone.lock if not already locked
Some code paths try to lock an already locked view->newzone.lock.

For example, do_modzone() aqcuires the lock and then calls
delete_zoneconf(), that wants to acquire the same lock.

Add a parameter to delete_zoneconf() that informs the function if the
lock has already been acquired.
2026-03-16 15:18:39 +01:00
Ondřej Surý
66ce33603b
Fix port validation rejecting valid port 65535
A few port validation checks use >= UINT16_MAX instead of > UINT16_MAX,
incorrectly rejecting port 65535 as out of range.  Port 65535 is a valid
TCP/UDP port number.  Other port checks in the same file already use the
correct > comparison.
2026-03-14 10:11:55 +01:00
Ondřej Surý
b4b81deed9
Fix stack Use-After-Return in SIG(0) handling
The asynchronous SIG(0) handling improperly used srcaddr, and dstaddr
from the caller's stack and didn't attach to aclenv.  This could
possibly lead to ACL bypass as an invalid srcaddr could be matched or
possible assertion failure if the ACL environment would change between
the initial call and the SIG(0) processing due to the server
reconfiguration.  This has been fixed.
2026-03-13 13:47:17 +01:00
Ondřej Surý
c1ba80169c
Introduce max-delegation-servers configuration option
Make the maximum number of processed delegation nameservers configurable
via the new 'max-delegation-servers' option (default: 13), replacing the
hardcoded NS_PROCESSING_LIMIT (20).

The default is reduced to 13 to precisely match the maximum number of
root servers that can fit into a classic 512-byte UDP payload.  This
provides a natural, historically sound cap that mitigates resource
exhaustion and amplification attacks from artificially inflated or
misconfigured delegations.

The configuration option is strictly bounded between 1 and 100 to ensure
resolver stability.
2026-03-04 16:13:49 +01:00
Aram Sargsyan
e41fbea843 Replace the outgoing queries RTT histogram code with isc_histomulti
The granularity of the simple histogram with fixed number of ranges
sometimes isn't good enough. As there's a need to implement a new
histogram statistics for the incoming query times (RTT), it was decided
to also update the existing RTT statistics of the outgoing queries
so that they look similar and use common code.

Remove the old histogram code from the resolver and from the statistics
channel. Reimplement the outgoing queries RTT histogram using the
isc_histomulti module, and prepare the necessary base for implementing
the incoming queries RTT histogram. The statistics channel will be
updated to expose the new histograms in an upcoming commit.
2026-02-26 14:00:10 +00:00
Ondřej Surý
295139f8ca
Rename isc_net_getudpportrange() to isc_net_getportrange()
This better reflects the true nature of the function as we are reading
the ephemeral port range which is not related to UDP at all.
2026-02-20 14:06:23 +01:00
Ondřej Surý
04c81b55d2
Implement IP_LOCAL_PORT_RANGE socket option for Linux
For Linux >= 6.8:

Since 2023, Linux has introduced a change to the IP_LOCAL_PORT_RANGE
socket option that eliminates the need for the random window
shifting (implemented as a fallback in the next commit).

By setting IP_LOCAL_PORT_RANGE option, we tell the kernel to use better
approach to the source port selection.

For Linux << 6.8:

This implement selecting port by random shifting range leveraging the
IP_LOCAL_PORT_RANGE socket option.  The network manager is initialized
with the ephemeral port range (on startup and on reconfig) and then for
every outgoing TCP connection, we define a custom port range (1000
ports) and then randomly shift the custom range within the system range.

This helps the kernel to reduce the search space to the custom window
between <random_offset, random_offset + 1000>.

Reference:
https://blog.cloudflare.com/linux-transport-protocol-port-selection-performance/#kernel
2026-02-20 14:06:23 +01:00
Colin Vidal
e62cafd3c7 rename fetch response db field to cache
As the `dns_fetchresponse_t` `db` field can only be attached to the
resolver cache database, rename it into `cache` to avoid ambiguities.
2026-02-10 08:50:16 +01:00
Aydın Mercan
5ae9b4d14c
cleanup unused header in isc/md.h
Use `isc/crypto.h` whenever needed instead.
2026-02-02 11:50:14 +03:00
Colin Vidal
e8b0d4749c rename dns_view_findzonecut() into dns_view_bestzonecut()
`dns_view_findzonecut()` is used only in the context where the closest
name servers for a name need to be queried.  In the future, this API
will also return the glues (if known) for those name servers, as well
as (exclusively, if both NS and DELEG exist) the DELEG record.

To avoid ambiguities with other code flows using `dns_db_findzonecut()`,
`dns_view_findzonecut()` has been renamed into `dns_view_bestzonecut()`.
2026-01-16 07:52:56 +01:00
Colin Vidal
18d6b94c1f remove sigrdataset from dns_view_findzonecut()
Since the `sigrdataset` "output" parameter of `dns_view_findzonecut()`
is never used (always called with NULL), it is now removed.

Also, since the resolver is moving towards a parent-centric direction,
there is no point having a signature for the NS record (which is not
authoritative in the parent, so never signed) in the contextes where
`dns_view_findzonecut()` is called.
2026-01-15 19:48:30 -08:00
Colin Vidal
e0d7bddc6c simplify usage of dns_view_findzonecut()
As `dns_view_findzonecut()` only returns either ISC_R_SUCCESS or
DNS_R_NXDOMAIN, and since it automatically disassociates the rdatasets
in case of failure, some call sites are simplified.
2026-01-08 20:26:32 +01:00
Alessio Podda
78588981df Remove rrset-order cyclic from the default config, with shim
Currently we add an rrset-order cyclic statement to the default config.
Since the rrset-order allows matching a subset of all names, it must
be implemented with a string comparison against a wildcard, and since
the statement applies per rrset, this can result in millions of
comparisons per second on a busy authoritative server.

This commit removes rrset-order from the default config, but adds back
a code shim in query_setorder to preserve the previous behaviour.
2026-01-08 14:43:04 +01:00
Colin Vidal
a67487a4ad remove implicit bounds fixes in server config
Now that the configuration options `edns-version`, `edns-udp-size`,
`max-udp-size`, `no-cookie-udp-size` and `padding` have strict boundaries
(configuration failing if they are not respected), remove configuration
loading code which implicitely raises or lowers them.
2026-01-07 07:01:59 +00:00
Matthijs Mekking
6554a5f9f7 Add new 'notify-cds' configuration option
Add a new configuration option to enable/disable sending NOTIFY(CDS)
messages.
2025-12-19 14:08:15 +01:00
Aram Sargsyan
aed9cafd5c Lock the catalog zone when reconfiguring it
A catalog zone is updated in an offloaded thread, which is not
stopped during a reconfiguration in an exclusive mode, and so
can cause a race condition with it.

Waiting for the offloaded threads to complete their work before
entering into the exclusive mode can potentially cause unwanted
delays, because offloaded threads are generally "allowed" to take
a longer amount of time before they complete.

Add a dns_catz_zone_prereconfig()/dns_catz_zone_postreconfig() pair
of functions which currently just lock the catalog zone when
reconfiguring it. The change should eliminate the race.

As a side note, there was already a similar pair of functions,
dns_catz_prereconfig() and dns_catz_postreconfig() which are called
before and after reconfiguring a 'dns_catz_zones_t' object.

Below are the stack traces of the reconfiguration thread which has
asserted, and a catalog zone update thread which was caught in the
middle of its work despite the fact that the exclusive mode is
turned on.

                Stack trace of thread 23859:
                #0  0x00007f80e7b8e52f raise (libc.so.6)
                #1  0x00007f80e7b61e65 abort (libc.so.6)
                #2  0x0000000000422558 assertion_failed (named)
                #3  0x00007f80eaa6799e isc_assertion_failed (libisc-9.18.41.so)
                #4  0x00007f80ea5bc788 dns_catz_entry_getname (libdns-9.18.41.so)
                #5  0x000000000042ce0e catz_reconfigure (named)
                #6  0x000000000042d3c5 configure_catz_zone (named)
                #7  0x000000000042d7a4 configure_catz (named)
                #8  0x0000000000430645 configure_view (named)
                #9  0x000000000043d998 load_configuration (named)
                #10 0x000000000044184f loadconfig (named)
                #11 0x0000000000442525 named_server_reconfigcommand (named)
                #12 0x000000000041b277 named_control_docommand (named)
                #13 0x000000000041c74a control_command (named)
                #14 0x00007f80eaa912ae task_run (libisc-9.18.41.so)
                #15 0x00007f80eaa914cd isc_task_run (libisc-9.18.41.so)
                #16 0x00007f80eaa46435 isc__nm_async_task (libisc-9.18.41.so)
                #17 0x00007f80eaa467aa process_netievent (libisc-9.18.41.so)
                #18 0x00007f80eaa475a6 process_queue (libisc-9.18.41.so)
                #19 0x00007f80eaa46227 process_all_queues (libisc-9.18.41.so)
                #20 0x00007f80eaa462a1 async_cb (libisc-9.18.41.so)
                #21 0x00007f80e8d01893 uv__async_io.part.3 (libuv.so.1)
                #22 0x00007f80e8d13ac4 uv__io_poll (libuv.so.1)
                #23 0x00007f80e8d023fb uv_run (libuv.so.1)
                #24 0x00007f80eaa45ced nm_thread (libisc-9.18.41.so)
                #25 0x00007f80eaa9bda3 isc__trampoline_run (libisc-9.18.41.so)
                #26 0x00007f80e7f1e1ca start_thread (libpthread.so.0)
                #27 0x00007f80e7b798d3 __clone (libc.so.6)
    ...
    ...
                Stack trace of thread 23912:
                #0  0x00007f80ea5bc2da dns_catz_options_setdefault (libdns-9.18.41.so)
                #1  0x00007f80ea5bd411 dns__catz_zones_merge (libdns-9.18.41.so)
                #2  0x00007f80ea5c3c2f dns__catz_update_cb (libdns-9.18.41.so)
                #3  0x00007f80eaa4fee9 isc__nm_work_run (libisc-9.18.41.so)
                #4  0x00007f80eaa9bda3 isc__trampoline_run (libisc-9.18.41.so)
                #5  0x00007f80eaa4ff48 isc__nm_work_cb (libisc-9.18.41.so)
                #6  0x00007f80e8cfc75e worker (libuv.so.1)
                #7  0x00007f80e7f1e1ca start_thread (libpthread.so.0)
                #8  0x00007f80e7b798d3 __clone (libc.so.6)
2025-12-17 14:54:49 +00:00
Ondřej Surý
8320faf64b
Apply the dns_rdataset_cleanup patch through the codebase
Add a semantic patch to turn the conditional rdataset disassociate into
dns_rdataset_cleanup() call and run it.
2025-12-17 15:19:55 +01:00
Mark Andrews
066847af25 log failing buffer 2025-12-09 18:09:45 +11:00
Colin Vidal
77e0104cf4 shrunk cfgobj down to 48bytes
Make all non-scalar properties of `cfg_obj_t` allocated values, which
ensures the union size is the width of one pointer. Also reorder the
fields inside `cfg_obj_t` to avoid alignment padding that would increase
the size. As a result, a `cfg_obj_t` instance is now 48 bytes on a
64-bit platform.

Add a static assertion to avoid increasing the size of the struct by
mistake.

The function `parse_sockaddrsub` was taking advantage of the fact that
both sockaddr and sockaddrtls were in the same position, and used to
initialize the sockaddr field independently if this was a -tls one or
not. This doesn't work anymore now that all fields are allocated,
so it has been slightly rewritten to take both cases into account
separately.
2025-12-05 08:59:53 +01:00
Colin Vidal
f7b64e2e87 cfg_parse_ API doesn't need memory context
Because the parser now uses global memory context, the cfg_parse_* API
doesn't take a memory context anymore.
2025-12-04 16:09:40 +01:00
Evan Hunt
d4ebea1037 use a standard CLEANUP macro
CLEANUP is a macro similar to CHECK but unconditional, jumping
to cleanup even if the result is ISC_R_SUCCESS. It is now used
in place of DST_RET, CLEANUP_WITH, and CHECK(<non-success constant>).
2025-12-03 13:45:43 -08:00
Evan Hunt
6b33b7fc77 switch to RETERR where it wasn't being used
replace all instances of the pattern:

        result = <statement>
        if (result != ISC_R_SUCCESS) {
                return result;
        }

with:

        RETERR(<statement>);
2025-12-03 13:45:43 -08:00
Evan Hunt
38e94cc7da switch to CHECK where it wasn't being used
replace all instances of the pattern:

        result = <statement>
        if (result != ISC_R_SUCCESS) {
                goto cleanup;
        }

with:

        CHECK(<statement>);
2025-12-03 13:45:42 -08:00
Evan Hunt
52bba5cc34 standardize CHECK and RETERR macros
previously, there were over 40 separate definitions of CHECK macros, of
which most used "goto cleanup", and the rest "goto failure" or "goto
out". there were another 10 definitions of RETERR, of which most were
identical to CHECK, but some simply returned a result code instead of
jumping to a cleanup label.

this has now been standardized throughout the code base: RETERR is for
returning an error code in the case of an error, and CHECK is for jumping
to a cleanup tag, which is now always called "cleanup". both macros are
defined in isc/util.h.
2025-12-03 13:26:28 -08:00
Mark Andrews
99c848e4a4 Fix mislocated break 2025-12-02 14:24:25 +11:00
Evan Hunt
76b6fb3802 pass isc_buffer_t pointers when applicable
In commit aea251f3bc, `isc_buffer_reserve()` was changed to
take a simple `isc_buffer_t *` instead of `isc_buffer_t **`.
A number of functions calling it have now been similarly
modified.
2025-11-28 18:47:49 +00:00
Matthijs Mekking
0941b5754c Change output of rndc dnssec -status
Wrap 'dns_keymgr_status()' in 'dns_zone_dnssecstatus()' so we can easily
retrieve the zone string name and refresh key time value.

In addition to the current time, output when the next key event is
expected.

Don't log keys that are completely hidden unless verbose is set.
Don't log key state values unless verbose is set, or they are in a
weird state.

For expected key states, log a more useful message of the stage of
the rollover. If we are in the middle of a key rollover, don't log
when the next key rollover is scheduled.

Condense the output for better readability.
2025-11-28 15:32:17 +01:00
Matthijs Mekking
0ff66f2924 Add verbose option to rndc dnssec -status
This can be used to hide noisy details such as key states, and keys that
have been fully retired.
2025-11-28 15:32:17 +01:00
Ondřej Surý
4d307ac67a
Detect resolution loops between fetches
Maintain the relationship between the parent and child fetch and when
creating a new child fetch, properly check the resolution loops that
would lead to a new fetch would join one of the parent's fetch contexts.
2025-11-27 17:34:25 +01:00
Colin Vidal
68fda6a035 do not log "no root hints for view '_bind'"
The "no root hints for view X" message must not be shown for the default
_bind/CH view. However, it is shown since 27c4f68dcc (part of effective
configuration changes).

The reason is that since 27c4f68dcc, `configure_views()` now processes
a single list of views, which contains both builtin and user views as
they are both part of the effective configuration. Those changes omitted
the `need_hints` bool that disabled the warning for the builtin view.
This commit silences the log message again.
2025-11-21 14:21:44 -08:00
Evan Hunt
f798feda40 fix ACL settings when merging views
when merging view objects into the effective configuration, add
allow-query-cache, allow-recursion, allow-query-cache-on and
allow-recursion-on ACLs as needed to reflect the way those
options inherit from each other.

this means the effective configuration is now correct for each
view.  ACLs no longer need to be corrected when applying the
configuration, and the actual effective ACL values will be
displayed in "rndc showconf" and "named-checkconf -pe".
2025-11-20 11:24:11 -08:00
Evan Hunt
1a77ae2a7a fix allow-recursion/allow-query-cache inheritance
the merging of options and defaults into the effective configuration
broke the mutual inheritance of the allow-recursion, allow-query, and
allow-query-cache ACLs, and of the allow-recursion-on and
allow-query-cache-on ACLs.

this has been corrected by adding a 'cloned' flag to the cfg_obj
structure to indicate whether it was configured explicitly or
cloned from the defaults during parsing. we can then adjust the
ACLs while configuring a view, favoring user-configured values
when they're available over cloned defaults.

currently the adjustments to the ACLs are done in configure_view();
later they'll be moved into the effective configuration and this
special handling can be removed.
2025-11-20 11:24:11 -08:00
Colin Vidal
4b566599a6 refactor detection of zone DB load completion
Because the asynchronous loading logic expected all jobs to be scheduled
then to be run (because it used to be scheduled during the exclusive
mode) and because all jobs are scheduled on various threads, there were
random situations where load_zones() would return after the scheduled
DB zone loading actually ran. In such cases, the zl->refs ref counter
in view_loaded() wouldn't go down to 0 and the remaining task to do
once all zones were loaded was never called. In particular,
server->reload_status kept the NAMED_RELOAD_PENDING state.

This problem is fixed by handling zoneload_t as a ref-counted object,
shared between load_zones() and each instance of scheduled zone DB
loading. Its destructor function is actually the content of
view_loaded() in the case the zt->refs went to 0. This ensures a
correct post-loading routine to be called once the last load is done.
2025-11-18 12:16:14 +01:00
Colin Vidal
19cec37d5e set reload_status to fail before logging it
The `reload_status` is set to `NAMED_RELOAD_FAILED` after the log line is
printed about this change. Update `reload_status` first, to avoid
(unlikely) case where a test waiting for this log line would attempt a
RNDC reload query but it would be processed by `named` before the status
is updated.
2025-11-18 12:16:14 +01:00
Colin Vidal
e8e879c008 remove exclusive mode when scheduling zone load
Remove the exclusive mode when scheduling the zone load right after
(re)loading `named` configuration, as there is no reason anymore to
schedule zone loading while the exclusive lock is held. Data which can
be read or written by multiple threads are locked or atomic.
2025-11-18 12:16:14 +01:00
Colin Vidal
fd49c95070 enforces that catalog-zone can't be used in non IN views
Catalog-zones can't be used in view which are not from the IN class.
This is now enforced as the server won't load (instead of loading
without the catalog-zone). This configuration error is now also caught
by `named-checkconf`.
2025-11-18 10:08:42 +01:00
Colin Vidal
6b5f714e53 remove need_hints parameters to configure_view
The `configure_view()` `need_hints` is removed as it this function was
always called with the value `true`.

The `need_hints` wasn't even used in the function. The only thing it was
actually used was to throw a warning which can be done simply in an
`else` condition branch.

Moreoever, in the case of catalog zones and response-policy, it fixes a
possible bug that would affect root zones, as those wouldn't be reverted
back to their previous version in case of the view fails to load
(during a server reconfiguration).
2025-11-18 10:08:42 +01:00
Colin Vidal
fb64fac3f3 no effective config as text if allow-new-zones is yes
Do not save the textual version of the effective configuration when
`allow-new-zones` is enabled, as it can be printed on-demand. This
enable to reduce the memory footprint of ~70MB on huge configurations
(1M zones).
2025-11-17 11:45:28 +01:00
Evan Hunt
f9922eb65a save effective configuration as text
the effective configuration tree is now detached if allow-new-zones
or catalog-zones aren't enabled in any views. this reduces memory
consumption while still allowing "rndc showconf -effective" to work.
2025-11-12 11:36:07 +01:00
Evan Hunt
6a57c6e8f6 save zone configuration as text
as previously mentioned in commit c65b2868ab, a cfg_obj_t
configuration tree structure takes up considerably more space than
the canonical text. since the zone configuration saved in the zone
object using dns_zone_setcfg() is only currently used for "rndc
showzone", it can be saved as text more efficiently than as an
object tree. (and, if a tree were needed, the text could be
re-parsed quickly; zone configuration text is generally small.)
2025-11-12 11:36:07 +01:00
Colin Vidal
51bc6e7dd8 don't retain the default configuration
The built-in configuration is actually used in two cases: first, when
the server is loaded (or reloaded), and second when
'rndc showconf -builtin' is called.

Considering the parsing of the builtin configuration is quick and does
not occur during exclusive mode, but the configuration tree takes
considerable memory space, the built-in configuration is no longer kept
in memory once it has been used; instead it is re-parsed on demand.
2025-10-31 08:02:17 +01:00
Evan Hunt
c65b2868ab save userconfig as text instead of a cfg_obj tree
once the user configuration has been merged into the effective
configuration, it no longer needs to be accessed as a configuration
tree, but we still want to be able to show it with 'rndc showconf -user'.

because the recursive strucure of cfg_obj objects is fairly large, the
canonical text form is a fraction of the size of the configuration
tree, so we now save it in that form instead.
2025-10-30 22:55:31 +00:00