Commit graph

5159 commits

Author SHA1 Message Date
Colin Vidal
b4abc63dfa Add ISC_LIST support for isc_netaddr_t
Add an `isc_netaddrlink_t` type wrapping an `isc_netaddr_t` and an
`ISC_LINK`. This enable to build list of `isc_netaddr_t` without
increasing the memory footprint of existing usages of `isc_netaddr_t`
(which doesn't require to be linked).
2026-03-30 20:41:13 +02:00
Alessio Podda
80be99d3ac Convert isc_statsmulti to use ISC_REFCOUNT_IMPL
Instead of using hand-rolled attach and detach function, this commit
declares the same functions through the ISC_REFCOUNT_IMPL macro.
2026-03-26 10:19:25 +01:00
Alessio Podda
ed0ecb62e4 Add low contention stats counter
In the current statistics counter implementation, the statistics are
backed by an array of counters, which are updated via atomic operations.
This leads to contention, especially on high core count
machines.

This commit introduces a new isc_statsmulti_t counter that keeps a
separate array per thread. These counters are then aggregated only when
statistics are queried, shifting work off the critical path.

These changes lead to a ~2% improvement in perflab.
2026-03-26 10:19:25 +01:00
Ondřej Surý
24951b703e Move ISC_NONSTRING from util.h to attributes.h
ISC_NONSTRING is a compiler attribute macro and belongs alongside
the other attribute definitions in attributes.h, not in util.h.
2026-03-23 11:06:28 +01:00
Ondřej Surý
0f3be0beb8 Add MOVE_OWNERSHIP() macro for transferring pointer ownership
A helper macro that returns the current value of a pointer and sets
it to NULL in one expression, useful for transferring ownership in
designated initializers.
2026-03-23 11:06:28 +01:00
Ondřej Surý
9457f4f8c5
Fix data race in RCU pointer exchange operations
The liburcu rcu_cmpxchg_pointer() uses CMM_RELAXED ordering on the CAS
failure path.  When a thread loses the CAS and gets another thread's
pointer back, reading fields through that pointer is a data race on
weakly-ordered architectures (ARM, POWER) because the failing load has
no acquire semantics.

Override rcu_cmpxchg_pointer() and rcu_xchg_pointer() to use standard
__atomic builtins with __ATOMIC_ACQ_REL (success) and __ATOMIC_ACQUIRE
(failure) ordering.  This fixes the race on all architectures and is
natively visible to ThreadSanitizer.
2026-03-19 08:10:22 +01:00
Ondřej Surý
8e240bbb5f Fix isc_buffer_init capacity mismatch in DoH data chunk callback
isc_buffer_init() is given MAX_DNS_MESSAGE_SIZE (65535) as capacity but
only h2->content_length bytes are allocated.  This makes the buffer
believe it has more space than actually allocated.  A secondary bounds
check (new_bufsize <= h2->content_length) prevents actual overflow, but
the buffer invariant is violated.

Pass h2->content_length as the capacity to match the allocation.
2026-03-18 11:39:01 +01:00
Mark Andrews
d3ffa1f007 Clear errno before calling strtol
The previous code was incorrectly clearing errno after calling
strtol but before testing the result rather than clearing it and
then calling strtol so that changes to errno can be correctly
determined.
2026-03-17 10:51:37 +11:00
Matthijs Mekking
bc1d177cc2 Fast fail a validator deadlock
We return DNS_R_NOVALIDSIG if we detected a deadlock. Then in
'validate_async_done()', this result value is used to check if we
need to fall back to insecure. As part of that we create a new fetch
but that fails because of the detected deadlock. This results in a loop
of deadlock detected, fallback to insecure, deadlock detected, ...

Add a new result value, ISC_R_DEADLOCK, and return this instead when
we have detected a deadlock. This will be treated as a generic error,
as there is no special handling for this result value.
2026-03-16 16:46:51 +00:00
Aram Sargsyan
336c523b79 OpenSSL 4 compatibility fix
Starting from OpenSSL 4 the the X509_get_subject_name() function
returns a 'const' pointer to a name instead of a regular pointer.
Duplicate the name before operating on it, then free it.
2026-03-16 10:01:18 +00:00
Ondřej Surý
2ab3d7c075
Fix missing server socket detach in TLS accept error path
When TLS creation fails in tlslisten_acceptcb(), tlssock->server
was not detached before detaching tlssock itself.
2026-03-14 13:58:32 +01:00
Ondřej Surý
3f15f2d9e5
Fix INSIST copy-paste error checking RADIX_V4 instead of RADIX_V6
The INSIST in isc_radix_insert() checks node->data[RADIX_V4] and
node->node_num[RADIX_V4] twice due to a copy-paste error, never
verifying the RADIX_V6 fields.

Fix the second pair to check RADIX_V6.
2026-03-14 11:03:31 +01:00
Ondřej Surý
f1311d2d19
Enforce isc_work enqueue loop affinity
Add a REQUIRE(isc_loop() == loop) assertion to isc_work_enqueue()
to strictly enforce that work is enqueued from the loop it is
assigned to. This loudly prohibits cross-thread queue manipulation
before it inevitably turns into a concurrency debugging nightmare.
2026-03-14 06:32:50 +01:00
Michal Nowak
239464f276
Use clang-format-22 to update formatting 2026-03-04 10:56:41 +01:00
Aram Sargsyan
7f5608206e Use standard reference counting for isc_histomulti
Use reference counting for isc_histomulti module so that it's
possible to attach/detach to/from the objects when used in the
statistics channel in the coming commits.
2026-02-26 14:00:10 +00:00
Mark Andrews
3801d0ebbf
Enforce NSEC3 record consistency
NSEC3 hashes are required to fit within a single DNS label.  Since there
are 5 bits per label byte without pad characters, the maximum hash size
is floor(63*5/8) (39 bytes).

This patch enforces this maximum length for unknown algorithms, while
strictly enforcing the exact expected digest length for known algorithms
like SHA-1.
2026-02-24 14:57:22 +01:00
Ondřej Surý
10270f6b42
Cleanup setting netmgr ports from isc_managers_create()
This is now duplicate as the default ports are already set in
isc_netmgr_create().
2026-02-20 16:37:44 +01:00
Ondřej Surý
295139f8ca
Rename isc_net_getudpportrange() to isc_net_getportrange()
This better reflects the true nature of the function as we are reading
the ephemeral port range which is not related to UDP at all.
2026-02-20 14:06:23 +01:00
Ondřej Surý
04c81b55d2
Implement IP_LOCAL_PORT_RANGE socket option for Linux
For Linux >= 6.8:

Since 2023, Linux has introduced a change to the IP_LOCAL_PORT_RANGE
socket option that eliminates the need for the random window
shifting (implemented as a fallback in the next commit).

By setting IP_LOCAL_PORT_RANGE option, we tell the kernel to use better
approach to the source port selection.

For Linux << 6.8:

This implement selecting port by random shifting range leveraging the
IP_LOCAL_PORT_RANGE socket option.  The network manager is initialized
with the ephemeral port range (on startup and on reconfig) and then for
every outgoing TCP connection, we define a custom port range (1000
ports) and then randomly shift the custom range within the system range.

This helps the kernel to reduce the search space to the custom window
between <random_offset, random_offset + 1000>.

Reference:
https://blog.cloudflare.com/linux-transport-protocol-port-selection-performance/#kernel
2026-02-20 14:06:23 +01:00
Ondřej Surý
2c48fcaeed
Improve the source port selection on Linux
Since 2015, Linux has introduced a new socket option to overcome TCP
limitations: When an application needs to force a source IP on an active
TCP socket it has to use bind(IP, port=x).  As most applications do not
want to deal with already used ports, x is often set to 0, meaning the
kernel is in charge to find an available port.  But kernel does not know
yet if this socket is going to be a listener or be connected. This
IP_BIND_ADDRESS_NO_PORT socket option ask the kernel to ignore the 0
port provided by application in bind(IP, port=0) and only remember the
given IP address. The port will be automatically chosen at connect()
time, in a way that allows sharing a source port as long as the 4-tuples
are unique.

Enable IP_BIND_ADDRESS_NO_PORT on the outgoing TCP sockets to overcome
this TCP limitation.
2026-02-20 14:06:23 +01:00
Aydın Mercan
a531f00a75
wipe hmac keys correctly pre-3.0 libcrypto
A lingering `sizeof` from the prototype era of !11094 caused the
key-wipe in `isc_hmac_key_destroy` to use `sizeof(key->len)` instead of
`key->len` for the length argument of `isc_safe_memwipe`.

This results in a buffer overflow of zero bytes in HMAC keys that are
less than 4 bytes. As such, the overflow can only be visibile in keys
that are less than 32-bits, which is beyond broken and creating such
keys are only possible in testing.

Therefore, this change is *not* a security fix since the conditions are
never reachable in any imaginable deployment scenario.

Builds that use OpenSSL >=3.0 are unaffected as the `sizeof` was only
remaining in pre-3.0 builds.
2026-02-06 14:14:43 +03:00
Aydın Mercan
19c9053a6b
use isc_ossl_wrap to generate epheremal tls keys 2026-02-02 11:50:14 +03:00
Aydın Mercan
b748651bb0
explicitly set ec points properties in pre-3.0 openssl
Generating a P-256 key in pre-3.0 wasn't explicitly using uncompressed
named curves in DNSSEC but was when generating an epheremal TLS key.
2026-02-02 11:50:14 +03:00
Aydın Mercan
251af02fe7
make generate_pkcs11_ec_key consistent with others 2026-02-02 11:50:14 +03:00
Aydın Mercan
c2f3a23a3e
expose isc__crypto_md in isc/ossl_wrap.h
This is a bit of a namespace convention violation but it fits the spirit of
this header since it is exposing OpenSSL-isms to others.

Further work is needed to make sure the exposed EVP_MD isn't needed
anymore.
2026-02-02 11:50:14 +03:00
Aydın Mercan
21f80a2bd7
make isc_ossl_wrap_ecdsa_set_deterministic consistent with style 2026-02-02 11:50:14 +03:00
Aydın Mercan
8c69fedc7c
switch away from ossl_param builders from ecdsa functions 2026-02-02 11:50:14 +03:00
Aydın Mercan
fe617aa830
set parameters in batch for rsa keygen
On top on improving readability, doing so allows us to use a uint32_t
for setting the e value, getting rid of allocating an unneccessary
BIGNUM.
2026-02-02 11:50:14 +03:00
Aydın Mercan
3bd3754994
remove libcrypto version specific code in opensslecdsa_link
Using `EVP_SIGNATURE` explicit algoritms for signatures have been added
in OpenSSL 3.4 and so is skipped for the initial OpenSSL version
specific code splitting.
2026-02-02 11:50:14 +03:00
Aydın Mercan
f4d88404e2
remove libcrypto version specific code in opensslrsa_link
Using `EVP_SIGNATURE` explicit algoritms for signatures have been added
in OpenSSL 3.4 and so is skipped for the initial OpenSSL version
specific code splitting.
2026-02-02 11:50:14 +03:00
Aydın Mercan
f21d237374
move openssl error reporting to isc/ossl_wrap
While being the best place at the time, the tlserr2result doesn't belong
inside TLS code since it is generic to OpenSSL and mostly used in the
dst interface. The newly created ossl_wrap interface is the idea place
for flushing the OpenSSL thread error queue.
2026-02-02 11:50:14 +03:00
Aydın Mercan
c4a25e633c
add openssl_wrap
The isc_ossl_wrap API is intended to separate OpenSSL version specific
code that needs to expose the libcrypto internals and keep isc_crypto
clean.
2026-02-02 11:50:14 +03:00
Aydın Mercan
5ae9b4d14c
cleanup unused header in isc/md.h
Use `isc/crypto.h` whenever needed instead.
2026-02-02 11:50:14 +03:00
Aydın Mercan
8f106f2b66
Separate isc_hmac between pre and post OpenSSL 3.0
Instead of the `EVP_MD_CTX` based functions, use either the new
`EVP_MAC` or the old `HMAC_CTX` based functions.

`EVP_MAC` is the recommended way using using MAC functions in post-3.0
while `HMAC_CTX` is used internally by `EVP_MD_CTX`, making the latter
redundant.
2026-02-02 11:50:14 +03:00
Aydın Mercan
f9ec4a1cdf
switch isc_md_type_t to a proper enum
Get rid of the OpenSSL-isms that plague the codebase where the hash type
is `EVP_MD *`

By using a proper enum, alongside the cleanup, we also get the ability
to use constants for known hash sizes instead of having a function call
every time.

`EVP_MD_CTX_get0_md` has been removed instead of being adapted since it
wasn't used anymore.
2026-02-02 11:12:55 +03:00
Aydın Mercan
35eeefb437
initial openssl version splitting
Dealing with OpenSSL has been rapidly turning into an unwieldy situation
as post-3.0 changes turn the library into a different beast.

Start treating pre and post-3.0 versions differently for easier
maintenance.
2026-02-02 11:12:53 +03:00
Mark Andrews
07610f8566 Add enum for use with isc_base64_tobuffer and isc_hex_tobuffer
This adds the following enum isc_one_or_more and isc_zero_or_more
which specify if one or more or zeror or more bytes are required
when reading the unbounded base64 / hex encoded data.
2026-01-27 23:57:34 +11:00
Mark Andrews
af379e10cc Use const pointer with strchr of const pointer
C23 now has qualifier preserving standard functions for strchr,
bsearch, strpbrk, strrchr, strstr, memchr.  There where a few places
where the return value was not assigned to a const qualified pointer.
These have been fixed.
2026-01-20 16:23:58 +11:00
Giulio Benetti
0e43f62c12 Fix building on uclibc
While building on uclibc this error is thrown:
In file included from ./include/dns/log.h:20,
                 from callbacks.c:19:
../../lib/isc/include/isc/log.h:141:9: error: unknown type name ‘off_t’
  141 |         off_t maximum_size;
      |         ^~~~~

This is due to missing include unistd.h, so let's add it on top of
isc/log.h

Signed-off-by: Giulio Benetti <giulio.benetti@benettiengineering.com>
2026-01-04 15:14:10 +01:00
Matthijs Mekking
c8253a0a7a Implement NOTIFY(CDS) logic
When the CDS/CDNSKEY RRset gets updated, schedule a NOTIFY(CDS) to be
sent to the parental agent. The parental agent is published in the
parent zone as a DSYNC RRset, so first we need to figure out the
parent owner name. This is done by finding the zonecut (querying for
NS RRset until we find a postive answer).

In nsfetch_dsync, we then schedule a zone fetch for the DSYNC record
at <child-labels>._dsync.<parent-labels>. Then we queue the notify
for each target in the DSYNC records that matches the NOTIFY scheme
and CDS RRtype.
2025-12-19 14:08:15 +01:00
Alessio Podda
f1d8c3059c Fix formatting 2025-12-10 12:18:34 +01:00
Alessio Podda
04fdf242a8 Add slist.h
Add a macro-based singly-linked list implementation to the codebase,
inspired by the doubly-linked list in list.h.
2025-12-10 12:18:34 +01:00
Colin Vidal
c3b7b56dd0 document usage of BIND9 constructors/destructors
Document the way `__attribute__((__constructor__))` and
`__attribute__((__destructor__))` must be used in BIND9 libraries in
order to avoid unexpected behaviors with other third-party libraries.
2025-12-04 16:09:40 +01:00
Evan Hunt
d4ebea1037 use a standard CLEANUP macro
CLEANUP is a macro similar to CHECK but unconditional, jumping
to cleanup even if the result is ISC_R_SUCCESS. It is now used
in place of DST_RET, CLEANUP_WITH, and CHECK(<non-success constant>).
2025-12-03 13:45:43 -08:00
Evan Hunt
6b33b7fc77 switch to RETERR where it wasn't being used
replace all instances of the pattern:

        result = <statement>
        if (result != ISC_R_SUCCESS) {
                return result;
        }

with:

        RETERR(<statement>);
2025-12-03 13:45:43 -08:00
Evan Hunt
38e94cc7da switch to CHECK where it wasn't being used
replace all instances of the pattern:

        result = <statement>
        if (result != ISC_R_SUCCESS) {
                goto cleanup;
        }

with:

        CHECK(<statement>);
2025-12-03 13:45:42 -08:00
Evan Hunt
52bba5cc34 standardize CHECK and RETERR macros
previously, there were over 40 separate definitions of CHECK macros, of
which most used "goto cleanup", and the rest "goto failure" or "goto
out". there were another 10 definitions of RETERR, of which most were
identical to CHECK, but some simply returned a result code instead of
jumping to a cleanup label.

this has now been standardized throughout the code base: RETERR is for
returning an error code in the case of an error, and CHECK is for jumping
to a cleanup tag, which is now always called "cleanup". both macros are
defined in isc/util.h.
2025-12-03 13:26:28 -08:00
Ondřej Surý
b0194004d9
Provide more information when the memory allocation fails
Instead of just crashing when memory allocation fails, also print a
message saying "Out of memory!", the size of the allocation that failed,
total allocated memory from all memory contexts and value of errno.
2025-11-28 14:42:21 +01:00
Ondřej Surý
4d307ac67a
Detect resolution loops between fetches
Maintain the relationship between the parent and child fetch and when
creating a new child fetch, properly check the resolution loops that
would lead to a new fetch would join one of the parent's fetch contexts.
2025-11-27 17:34:25 +01:00
Ondřej Surý
d6e2bf2b3d
Use malloc_usable_size()/malloc_size() for memory accounting
Restore usage of malloc_usable_size()/malloc_size(), but this time only
for memory accounting and statistics purposes.  This should reduce the
memory footprint in case of compilation without jemalloc as we don't
have to keep track of the allocated memory size ourselves.
2025-11-27 11:07:55 +01:00