Commit graph

236 commits

Author SHA1 Message Date
Alessio Podda
70b65648ac Move ns_highwater_recursclients to highwater stats
Since it is impossible to increase an isc_statsmulti counter and
retrieve the new counter atomically, and we need the output of
recursclients in order to compute ns_highwater_recursive, we change the
recursclients counter to an isc_stats one.
2026-03-26 10:19:25 +01:00
Alessio Podda
ed0ecb62e4 Add low contention stats counter
In the current statistics counter implementation, the statistics are
backed by an array of counters, which are updated via atomic operations.
This leads to contention, especially on high core count
machines.

This commit introduces a new isc_statsmulti_t counter that keeps a
separate array per thread. These counters are then aggregated only when
statistics are queried, shifting work off the critical path.

These changes lead to a ~2% improvement in perflab.
2026-03-26 10:19:25 +01:00
Michal Nowak
239464f276
Use clang-format-22 to update formatting 2026-03-04 10:56:41 +01:00
Ondřej Surý
e94a31a666
Split qctx_destroy() into qctx_deinit() and qctx_destroy()
The qctx_destroy() only needs to be called on allocated memory and
qctx_deinit() needs to be called always.  Also remove .allocated member
from the query_ctx_t structure.
2025-11-27 10:37:58 +01:00
Evan Hunt
d5e4684b3d remove dns_message_buildopt
now that the EDNS state is stored within dns_message_t, it's no longer
necessary to have a public API call to build an opt rdataset; we can
just have dns_message_setopt() build the opt record internally.
2025-11-21 11:13:21 -08:00
Colin Vidal
62002cfa9c rename ns_pluginregister_ctx_t into ns_pluginctx_t
The type `ns_pluginregister_ctx_t` was initially added to pass plugin
contextual data when the plugin is registered, but this is also now
passed into `plugin_check`. Furthermore, those various data are not
specific to the registration in particular. Rename the type into
`ns_pluginctx_t` for clarity.
2025-10-01 20:20:48 +02:00
Colin Vidal
25e258fb0b provide a context structure for plugin_register()
This commit introduces a new type, ns_pluginregister_ctx_t,
which is passed to plugin_check() and plugin_register() in place of the
'source' parameter. The source value is now just part of the structure,
which also holds a pointer to the zone origin if the plugin is loaded at
a zone level.

This provides more contextual information, enabling the plugin to make
specific configuration decisions based on the name of the zone for which
it is loaded.

It's also flexible if more contextual data are needed in the future:
add a new field to ns_pluginregister_ctx_t, and new plugins can use
it without affecting compatibility with existing plugins.
2025-10-01 11:11:00 +02:00
Evan Hunt
92cefc52bc check plugin config before registering
In named_config_parsefile(), when checking the validity of
named.conf, the checking of plugin correctness was deliberately
postponed until the plugin is loaded and registered. However,
when the plugin was registered, the checking was never actually
done: the plugin_register() implementation was called, but
plugin_check() was not.

This made it necessary to duplicate the correctness checking in both
functions, so that both named-checkconf and named could catch errors.
That should not be required.

ns_plugin_register() now calls the check function before the register
function, and aborts if either one fails.  ns_plugin_check() calls only
the check function.  ns_plugin_check() is used by named-checkconf, and
ns_plugin_register() is used by named. (Note: this design has a
side effect that a call to ns_plugin_register() will result in the
plugin parameters being parsed twice at registration time.)

ns_plugin_check() now takes an additional argument for the hook
source: zone or view.
2025-09-30 15:42:26 -07:00
Colin Vidal
cecb03d6db fix hookasyncctx renaming
The field `ns_hookasync_t` was initially named `hook_actx` and wrongly
renamed `hook_aclctx` during a mass-renaming of various names for the
config acl context into a consistent `aclctx` name (see !11003). Of
course this is wrong as `ns_hookasync_t` has nothing to do with ACL but
about _async_ context. This commit fixes the mistake by renaming this
field `hookasyncctx`
2025-09-28 22:41:32 +02:00
Colin Vidal
36a05c81b4 rename cfg_aclconfctx_t variables to aclctx
ACL configuration context variables are inconsistently named as `actx`,
`ac`, or `aclconfctx`, which caused confusion during code reviews. This
commit renames all `cfg_aclconfctx_t` variables to `aclctx`, which is
short, consistent, and unambiguous.
2025-09-24 20:14:49 +02:00
Evan Hunt
0cdcc8a8f4 rename NS_QUERY_RESET to NS_QUERY_CLEANUP
query_reset() is called during query initialization, but the only
time the NS_QUERY_SETUP hook runs is when it's called from
query_cleanup().  it makes more sense to move the hook point to
there and rename it to NS_QUERY_CLEANUP.

this change caused a crash in the unit tests due to the view being
unnecessarily detached before ns__client_reset_cb() was called.
this has also been fixed.
2025-09-10 17:46:53 -07:00
Evan Hunt
637e8d01d2 minimize calls to dns_zone_gethooktable per qctx
add a 'zhooks' member to the query_ctx structure, so that we only
need to look up the hook table for the zone once when iniitalizing
a qctx, and not once for every hook point.
2025-09-10 14:05:42 -07:00
Colin Vidal
d676ce8085 remove query_ctx_t detach_client property
Since the removal of NS_QUERY_QCTX_DESTROYED hook, there is no need for
the `qctx->detach_client` object anymore, as this was designed to tell
the plugin whether the client object is about to be, or is already,
freed from memory.  This is not needed anymore, as NS_QUERY_RESET is
called _always_ when the client object is about to be freed from memory.

Remove `detach_client` and tidy up the code a bit by including the
freeing of the qctx object (when allocated) inside the qctx_destroy
function instead of requiring extra calls.
2025-09-09 10:02:32 +02:00
Colin Vidal
95c71c2739 replace QCTX_INIT/_DESTROY hooks with QUERY_SETUP/_RESET
The hook NS_QUERY_QCTX_DESTROY is problematic with zone plugins because
it can be called in some contexts where `qctx->client` is invalid (the
pointer is dangling); which would lead to a use-after-free (spotted by
TSAN build) as `qctx->client` is used to get the zone hooktable, to find
out whether there is an authoritive zone which would have
NS_QUERY_QCTX_DESTROY registered.

This can't easily be fixed, because there is no easy way to know from
query.c code if `client` is still a valid object: `client->reqhandle`,
representing the request from a client, is refcounted, and the `client`
object is freed from memory once its refcounter gets to 0. While
`reqhandle` is attached from query.c code, it can be attached more than
once from asynchronous code and there is no clear path where detaching
it would lead to a client free. Hence, there is no way to know for sure
when to set `qctx->client = NULL` (this is why the pointer remains
dangling).

Back to the original problem; this is why the NS_QUERY_QCTX_DESTROY hook
is incompatible with zone plugins. `qctx->detach_client`, which is used
to tell a plugin that the `client` object is either free or about to be
free can't be use either, because in some cases the client is still
there, and should be used.

Code issue aside, the `qctx` object is really just an aggregate of
various data to pass easily in the various functions and callbacks,
initially stored on the stack, but allocated in some cases (for some
asynchronous flow, when recursion is needed), so the point it gets
created/"destroyed" is really just an implementation "detail", and
providing a higher level hook for the plugin would be beneficial. Hence,
NS_QUERY_RESET and NS_QUERY_INIT are removed, and instead, the existing
NS_QUERY_SETUP can be used as well as the newly introduced
NS_QUERY_RESET (which replaces NS_QUERY_QCTX_DESTROY). The advanage is
that NS_QUERY_RESET is called _only_ when the client object is _always_
about to be freed, which avoids usage of the extra `qctx->detach_client`
usage from the plugin.

The way NS_QUERY_RESET works is that when the `client` is freed, a
callback (from `query.c`) is called. This callback creates a transient
qctx object on the stack with a pointer to the view, and uses that
to call the hook.
2025-09-09 09:42:34 +02:00
Colin Vidal
260bbc24c9 add plugin_register param telling the source
The plugin `plugin_register` API has a new parameter `source` indicating
whether the plugin is loaded from a view or a zone.

This extra parameter enables the plugin to fail early during
initialization if a plugin written to be used in a zone exclusively
is loaded at a view level, or vice versa.
2025-09-09 09:42:34 +02:00
Colin Vidal
1566634fae add NS_QUERY_AUTHZONE_ATTACHED hook
Add a new query hook called `NS_QUERY_AUTHZONE_ATTACHED`. This hook is
called whenever an authoritative zone is found and attached during a
query answer.

From code level, this hook is called when `qctx->client->query->authzone`
is attached during a query.  This enables zone-specific plugins to
initialize specific states whenever a local zone is found that can
answer a query.
2025-09-09 09:42:34 +02:00
Colin Vidal
91cd7b865c update hook developer documentation
Attempt to add zone plugin specificities into the hook developer
documentation. In particular about the hook call order and hookpoint
which can't be called on a zone plugin.
2025-09-09 09:42:34 +02:00
Colin Vidal
5893770cd9 add zone-specific plugin instance
The zone object now has its own hooktable and plugins, which are
initialized during zone initialization.
2025-09-09 09:42:34 +02:00
Aram Sargsyan
5e718dd220 Implement '-T slowrpz' named testing option
When used, named processes RPZ zones slowly. Useful for system tests.
2025-08-22 16:31:17 +00:00
Ondřej Surý
f6aed602f0
Refactor the network manager to be a singleton
There is only a single network manager running on top of the loop
manager (except for tests).  Refactor the network manager to be a
singleton (a single instance) and change the unit tests, so that the
shorter read timeouts apply only to a specific handle, not the whole
extra 'connect_nm' network manager instance.
2025-07-23 22:45:38 +02:00
Ondřej Surý
b8d00e2e18
Change the loopmgr to be singleton
All the applications built on top of the loop manager were required to
create just a single instance of the loop manager.  Refactor the loop
manager to not expose this instance to the callers and keep the loop
manager object internal to the isc_loop compilation unit.

This significantly simplifies a number of data structures and calls to
the isc_loop API.
2025-07-23 22:44:16 +02:00
Matthijs Mekking
a66b04c8d4 Make serve-stale refresh behave as prefetch
A serve-stale refresh is similar to a prefetch, the only difference
is when it triggers. Where a prefetch is done when an RRset is about
to expire, a serve-stale refresh is done when the RRset is already
stale.

This means that the check for the stale-refresh window needs to
move into query_stale_refresh(). We need to clear the
DNS_DBFIND_STALEENABLED option at the same places as where we clear
DNS_DBFIND_STALETIMEOUT.

Now that serve-stale refresh acts the same as prefetch, there is no
worry that the same rdataset is added to the message twice. This makes
some code obsolete, specifically where we need to clear rdatasets from
the message.
2025-07-23 07:18:48 +00:00
Alessio Podda
e84704bd55 Improve efficiency of ns_client_t reset
The ns_client_t struct is reset and zero-ed out on every query,
but some fields (query, message, manager) are preserved.

We observe two things:
 - The sendbuf field is going to be overwritten anyway, there's
   no need to zero it out.
 - The fields are copied out when the struct is zero-ed out, and
   then copied back in. For the query field (which is 896 bytes)
   this is very inefficient.

This commit makes the reset more efficient avoiding to unnecessary
zero-ing and copy.
2025-07-10 07:19:47 +02:00
Ondřej Surý
1032681af0 Convert the isc/tid.h to use own signed integer isc_tid_t type
Change the internal type used for isc_tid unit to isc_tid_t to hide the
specific integer type being used for the 'tid'.  Internally, the signed
integer type is being used.  This allows us to have negatively indexed
arrays that works both for threads with assigned tid and the threads
with unassigned tid.  This should be used only in specific situations.
2025-06-28 13:32:12 +02:00
Mark Andrews
e9a87f0389 Return EDNS ZONEVERSION if requested
If there was an EDNS ZONEVERSION option in the DNS request and the
answer was from a zone, return the zone's serial and number of
labels excluding the root label with the type set to 0 (ZONE-SERIAL).
2025-03-24 22:16:09 +00:00
Mark Andrews
998b93acca Add EDNS ZONEVERSION option counter 2025-03-24 22:16:09 +00:00
Aram Sargsyan
807ef8545d Implement -T cookiealwaysvalid
When -T cookiealwaysvalid is passed to named, DNS cookie checks for
the incoming queries always pass, given they are structurally correct.
2025-03-17 10:42:47 +00:00
Evan Hunt
c2e4358267 prevent a reference leak from the ns_query_done hooks
if the NS_QUERY_DONE_BEGIN or NS_QUERY_DONE_SEND hook is
used in a plugin and returns NS_HOOK_RETURN, some of the
cleanup in ns_query_done() can be skipped over, leading
to reference leaks that can cause named to hang on shut
down.

this has been addressed by adding more housekeeping
code after the cleanup: tag in ns_query_done().
2025-02-25 22:40:48 +00:00
Ondřej Surý
2f8e0edf3b Split and simplify the use of EDE list implementation
Instead of mixing the dns_resolver and dns_validator units directly with
the EDE code, split-out the dns_ede functionality into own separate
compilation unit and hide the implementation details behind abstraction.

Additionally, the EDE codes are directly copied into the ns_client
buffers by passing the EDE context to dns_resolver_createfetch().

This makes the dns_ede implementation simpler to use, although sligtly
more complicated on the inside.

Co-authored-by: Colin Vidal <colin@isc.org>
Co-authored-by: Ondřej Surý <ondrej@isc.org>
2025-01-30 11:52:53 +01:00
Evan Hunt
10accd6260 clean up uses of ISC_R_NOMEMORY
the isc_mem allocation functions can no longer fail; as a result,
ISC_R_NOMEMORY is now rarely used: only when an external library
such as libjson-c or libfstrm could return NULL. (even in
these cases, arguably we should assert rather than returning
ISC_R_NOMEMORY.)

code and comments that mentioned ISC_R_NOMEMORY have been
cleaned up, and the following functions have been changed to
type void, since (in most cases) the only value they could
return was ISC_R_SUCCESS:

- dns_dns64_create()
- dns_dyndb_create()
- dns_ipkeylist_resize()
- dns_kasp_create()
- dns_kasp_key_create()
- dns_keystore_create()
- dns_order_create()
- dns_order_add()
- dns_peerlist_new()
- dns_tkeyctx_create()
- dns_view_create()
- dns_zone_setorigin()
- dns_zone_setfile()
- dns_zone_setstream()
- dns_zone_getdbtype()
- dns_zone_setjournal()
- dns_zone_setkeydirectory()
- isc_lex_openstream()
- isc_portset_create()
- isc_symtab_create()

(the exception is dns_view_create(), which could have returned
other error codes in the event of a crypto library failure when
calling isc_file_sanitize(), but that should be a RUNTIME_CHECK
anyway.)
2025-01-23 15:54:57 -08:00
Colin Vidal
4096f27130 add support for multiple EDE
Extended DNS error mechanism (EDE) enables to have several EDE raised
during a DNS resolution (typically, a DNSSEC query will do multiple
fetches which each of them can have an error). Add support to up to 3
EDE errors in an DNS response. If duplicates occur (two EDEs with the
same code, the extra text is not compared), only the first one will be
part of the DNS answer.

Because the maximum number of EDE is statically fixed, `ns_client_t`
object own a static vector of `DNS_DE_MAX_ERRORS` (instead of a linked
list, for instance). The array can be fully filled (all slots point to
an allocated `dns_ednsopt_t` object) or partially filled (or
empty). In such case, the first NULL slot means there is no more EDE
objects.
2025-01-22 21:07:44 +01:00
Evan Hunt
3394aa9c25 remove "sortlist"
this commit removes the deprecated "sortlist" option. the option
is now marked as ancient; it is a fatal error to use it in
named.conf.

the sortlist system test has been removed, and other tests that
referenced the option have been modified.

the enabling functions, dns_message_setsortorder() and
dns_rdataset_towiresorted(), have also been removed.
2024-12-11 15:09:24 -08:00
Matthijs Mekking
16b3bd1cc7 Implement global limit for outgoing queries
This global limit is not reset on query restarts and is a hard limit
for any client request.
2024-12-05 14:17:07 +01:00
Petr Menšík
c5ebe5eb0a Remove ns_listenlist_default()
It is not used anywhere in named and is no longer necessary
there.  It was called in some unit tests, but was not actually
needed by them.
2024-11-26 15:22:30 -08:00
Aydın Mercan
d987e2d745
add separate query counters for new protocols
Add query counters for DoT, DoH, unencrypted DoH and their proxied
counterparts. The protocols don't increment TCP/UDP counters anymore
since they aren't the same as plain DNS-over-53.
2024-11-25 13:07:29 +03:00
Mark Andrews
c676fd2566 Allow send-report-channel to be set at the zone level
If send-report-channel is set at the zone level, it will
be stored in the zone object and used instead of the
view-level agent-domain when constructing the EDNS
Report-Channel option.
2024-10-23 21:29:32 +00:00
Mark Andrews
ac1c60d87e Add send-report-channel option
This commit adds support for the EDNS Report-Channel option,
which is returned in authoritative responses when EDNS is in use.

"send-report-channel" sets the Agent-Domain value that will be
included in EDNS Report-Channel options.  This is configurable at
the options/view level; the value is a DNS name. Setting the
Agent-Domain to the root zone (".") disables the option.

When this value has been set, incoming queries matchng the form
_er.<qtype>.<qname>.<extended-error-code>._er.<agent-domain>/TXT
will be logged to the dns-reporting-agent channel at INFO level.

(Note: error reporting queries will only be accepted if sent via
TCP or with a good server cookie.  If neither is present, named
returns BADCOOKIE to complete the DNS COOKIE handshake, or TC=1
to switch the client to TCP.)
2024-10-23 21:29:32 +00:00
Aram Sargsyan
b8c068835e Clean up 'nodetach' in ns_client
The 'nodetach' member is a leftover from the times when non-zero
'stale-answer-client-timeout' values were supported, and currently
is always 'false'. Clean up the member and its usage.
2024-10-09 08:03:13 +00:00
Mark Andrews
5fad79c92f Log the rcode returned to for a query
Log to the querylog the rcode of a previous query using
the identifier 'response:' to diffenciate queries from
responses.
2024-09-19 21:44:06 +00:00
Ondřej Surý
091d738c72 Convert all categories and modules into static lists
Remove the complicated mechanism that could be (in theory) used by
external libraries to register new categories and modules with
statically defined lists in <isc/log.h>.  This is similar to what we
have done for <isc/result.h> result codes.  All the libraries are now
internal to BIND 9, so we don't need to provide a mechanism to register
extra categories and modules.
2024-08-20 12:50:39 +00:00
Ondřej Surý
8506102216 Remove logging context (isc_log_t) from the public namespace
Now that the logging uses single global context, remove the isc_log_t
from the public namespace.
2024-08-20 12:50:39 +00:00
Ondřej Surý
b2dda86254 Replace isc_log_create/destroy with isc_logconfig_get()
Add isc_logconfig_get() function to get the current logconfig and use
the getter to replace most of the little dancing around setting up
logging in the tools. Thus:

    isc_log_create(mctx, &lctx, &logconfig);
    isc_log_setcontext(lctx);
    dns_log_setcontext(lctx);
    ...
    ...use lcfg...
    ...
    isc_log_destroy();

is now only:

    logconfig = isc_logconfig_get(lctx);
    ...use lcfg...

For thread-safety, isc_logconfig_get() should be surrounded by RCU read
lock, but since we never use isc_logconfig_get() in threaded context,
the only place where it is actually used (but not really needed) is
named_log_init().
2024-08-20 12:50:39 +00:00
Evan Hunt
c5588babaf make "max_restarts" a configurable value
MAX_RESTARTS is no longer hard-coded; ns_server_setmaxrestarts()
and dns_client_setmaxrestarts() can now be used to modify the
max-restarts value at runtime. in both cases, the default is 11.
2024-08-07 13:03:08 -07:00
Ondřej Surý
b26079fdaf Don't open route socket if we don't need it
When automatic-interface-scan is disabled, the route socket was still
being opened.  Add new API to connect / disconnect from the route socket
only as needed.

Additionally, move the block that disables periodic interface rescans to
a place where it actually have access to the configuration values.
Previously, the values were being checked before the configuration was
loaded.
2024-08-05 07:31:02 +00:00
Aram Sargsyan
cb5238cc62 Replace #define DNS_GETDB_ with struct of bools
This makes it easier to pretty-print the attributes in a debugger.
2024-07-31 11:52:52 +00:00
Ondřej Surý
1c0564d715
Remove ns_query_init() cannot fail, remove the error paths
As ns_query_init() cannot fail now, remove the error paths, especially
in ns__client_setup() where we now don't have to care what to do with
the connection if setting up the client could fail.  It couldn't fail
even before, but now it's formal.
2024-07-03 09:05:51 +02:00
Aram Sargsyan
ad489c44df
Remove sig0checks-quota-maxwait-ms support
Waiting for a quota to appear complicates things and wastes
rosources on timer management. Just answer with REFUSE if
there is no quota.
2024-06-10 17:33:11 +02:00
Aram Sargsyan
f0cde05e06
Implement asynchronous view matching for SIG(0)-signed queries
View matching on an incoming query checks the query's signature,
which can be a CPU-heavy task for a SIG(0)-signed message. Implement
an asynchronous mode of the view matching function which uses the
offloaded signature checking facilities, and use it for the incoming
queries.
2024-06-10 17:33:10 +02:00
Aram Sargsyan
c7f79a0353
Add a quota for SIG(0) signature checks
In order to protect from a malicious DNS client that sends many
queries with a SIG(0)-signed message, add a quota of simultaneously
running SIG(0) checks.

This protection can only help when named is using more than one worker
threads. For example, if named is running with the '-n 4' option, and
'sig0checks-quota 2;' is used, then named will make sure to not use
more than 2 workers for the SIG(0) signature checks in parallel, thus
leaving the other workers to serve the remaining clients which do not
use SIG(0)-signed messages.

That limitation is going to change when SIG(0) signature checks are
offloaded to "slow" threads in a future commit.

The 'sig0checks-quota-exempt' ACL option can be used to exempt certain
clients from the quota requirements using their IP or network addresses.

The 'sig0checks-quota-maxwait-ms' option is used to define a maximum
amount of time for named to wait for a quota to appear. If during that
time no new quota becomes available, named will answer to the client
with DNS_R_REFUSED.
2024-06-10 17:33:08 +02:00
Ondřej Surý
e28266bfbc
Remove the extra memory context with own arena for sending
The changes in this MR prevent the memory used for sending the outgoing
TCP requests to spike so much.  That strictly remove the extra need for
own memory context, and thus since we generally prefer simplicity,
remove the extra memory context with own jemalloc arenas just for the
outgoing send buffers.
2024-06-10 16:48:54 +02:00