For Linux >= 6.8:
Since 2023, Linux has introduced a change to the IP_LOCAL_PORT_RANGE
socket option that eliminates the need for the random window
shifting (implemented as a fallback in the next commit).
By setting IP_LOCAL_PORT_RANGE option, we tell the kernel to use better
approach to the source port selection.
For Linux << 6.8:
This implement selecting port by random shifting range leveraging the
IP_LOCAL_PORT_RANGE socket option. The network manager is initialized
with the ephemeral port range (on startup and on reconfig) and then for
every outgoing TCP connection, we define a custom port range (1000
ports) and then randomly shift the custom range within the system range.
This helps the kernel to reduce the search space to the custom window
between <random_offset, random_offset + 1000>.
Reference:
https://blog.cloudflare.com/linux-transport-protocol-port-selection-performance/#kernel
Since 2015, Linux has introduced a new socket option to overcome TCP
limitations: When an application needs to force a source IP on an active
TCP socket it has to use bind(IP, port=x). As most applications do not
want to deal with already used ports, x is often set to 0, meaning the
kernel is in charge to find an available port. But kernel does not know
yet if this socket is going to be a listener or be connected. This
IP_BIND_ADDRESS_NO_PORT socket option ask the kernel to ignore the 0
port provided by application in bind(IP, port=0) and only remember the
given IP address. The port will be automatically chosen at connect()
time, in a way that allows sharing a source port as long as the 4-tuples
are unique.
Enable IP_BIND_ADDRESS_NO_PORT on the outgoing TCP sockets to overcome
this TCP limitation.
A lingering `sizeof` from the prototype era of !11094 caused the
key-wipe in `isc_hmac_key_destroy` to use `sizeof(key->len)` instead of
`key->len` for the length argument of `isc_safe_memwipe`.
This results in a buffer overflow of zero bytes in HMAC keys that are
less than 4 bytes. As such, the overflow can only be visibile in keys
that are less than 32-bits, which is beyond broken and creating such
keys are only possible in testing.
Therefore, this change is *not* a security fix since the conditions are
never reachable in any imaginable deployment scenario.
Builds that use OpenSSL >=3.0 are unaffected as the `sizeof` was only
remaining in pre-3.0 builds.
This is a bit of a namespace convention violation but it fits the spirit of
this header since it is exposing OpenSSL-isms to others.
Further work is needed to make sure the exposed EVP_MD isn't needed
anymore.
Using `EVP_SIGNATURE` explicit algoritms for signatures have been added
in OpenSSL 3.4 and so is skipped for the initial OpenSSL version
specific code splitting.
Using `EVP_SIGNATURE` explicit algoritms for signatures have been added
in OpenSSL 3.4 and so is skipped for the initial OpenSSL version
specific code splitting.
While being the best place at the time, the tlserr2result doesn't belong
inside TLS code since it is generic to OpenSSL and mostly used in the
dst interface. The newly created ossl_wrap interface is the idea place
for flushing the OpenSSL thread error queue.
Instead of the `EVP_MD_CTX` based functions, use either the new
`EVP_MAC` or the old `HMAC_CTX` based functions.
`EVP_MAC` is the recommended way using using MAC functions in post-3.0
while `HMAC_CTX` is used internally by `EVP_MD_CTX`, making the latter
redundant.
Get rid of the OpenSSL-isms that plague the codebase where the hash type
is `EVP_MD *`
By using a proper enum, alongside the cleanup, we also get the ability
to use constants for known hash sizes instead of having a function call
every time.
`EVP_MD_CTX_get0_md` has been removed instead of being adapted since it
wasn't used anymore.
Dealing with OpenSSL has been rapidly turning into an unwieldy situation
as post-3.0 changes turn the library into a different beast.
Start treating pre and post-3.0 versions differently for easier
maintenance.
This adds the following enum isc_one_or_more and isc_zero_or_more
which specify if one or more or zeror or more bytes are required
when reading the unbounded base64 / hex encoded data.
C23 now has qualifier preserving standard functions for strchr,
bsearch, strpbrk, strrchr, strstr, memchr. There where a few places
where the return value was not assigned to a const qualified pointer.
These have been fixed.
While building on uclibc this error is thrown:
In file included from ./include/dns/log.h:20,
from callbacks.c:19:
../../lib/isc/include/isc/log.h:141:9: error: unknown type name ‘off_t’
141 | off_t maximum_size;
| ^~~~~
This is due to missing include unistd.h, so let's add it on top of
isc/log.h
Signed-off-by: Giulio Benetti <giulio.benetti@benettiengineering.com>
When the CDS/CDNSKEY RRset gets updated, schedule a NOTIFY(CDS) to be
sent to the parental agent. The parental agent is published in the
parent zone as a DSYNC RRset, so first we need to figure out the
parent owner name. This is done by finding the zonecut (querying for
NS RRset until we find a postive answer).
In nsfetch_dsync, we then schedule a zone fetch for the DSYNC record
at <child-labels>._dsync.<parent-labels>. Then we queue the notify
for each target in the DSYNC records that matches the NOTIFY scheme
and CDS RRtype.
Document the way `__attribute__((__constructor__))` and
`__attribute__((__destructor__))` must be used in BIND9 libraries in
order to avoid unexpected behaviors with other third-party libraries.
CLEANUP is a macro similar to CHECK but unconditional, jumping
to cleanup even if the result is ISC_R_SUCCESS. It is now used
in place of DST_RET, CLEANUP_WITH, and CHECK(<non-success constant>).
previously, there were over 40 separate definitions of CHECK macros, of
which most used "goto cleanup", and the rest "goto failure" or "goto
out". there were another 10 definitions of RETERR, of which most were
identical to CHECK, but some simply returned a result code instead of
jumping to a cleanup label.
this has now been standardized throughout the code base: RETERR is for
returning an error code in the case of an error, and CHECK is for jumping
to a cleanup tag, which is now always called "cleanup". both macros are
defined in isc/util.h.
Instead of just crashing when memory allocation fails, also print a
message saying "Out of memory!", the size of the allocation that failed,
total allocated memory from all memory contexts and value of errno.
Maintain the relationship between the parent and child fetch and when
creating a new child fetch, properly check the resolution loops that
would lead to a new fetch would join one of the parent's fetch contexts.
Restore usage of malloc_usable_size()/malloc_size(), but this time only
for memory accounting and statistics purposes. This should reduce the
memory footprint in case of compilation without jemalloc as we don't
have to keep track of the allocated memory size ourselves.
Upstream has removed the atomics implementation of CMM_LOAD_SHARED and
CMM_STORE_SHARED as these can be used also with non-stdatomics types.
As we only use the CMM api with stdatomics types, we can restore the
previous behaviour to prevent ThreadSanitizer warnings.
Call to `streamdns_resume_processing` is asynchronous but the socket
passed as argument is not attached when scheduling the call.
While there is no reproducible way (so far) to make the socket reference
number down to 0 before `streamdns_resume_processing` is called, attach
the socket before scheduling the call. This guard against an hypothetic
case where, for some reasons, the socket refcount would reach 0, and be
freed from memory when `streamdns_resume_processing` is called.
Under the overmem conditions, the header could get unlinked from the
SIEVE LRU using a different path. This could lead to double-unlink
which causes assertion failure. Add a guard to ISC_SIEVE_UNLINK() to
unlink only still linked headers.
Changes introduced by 72862c2abc moved the
default configuration from within `bin/named` to a central place
`bin/includes`.
The default configuration is conditioned by several compile-time macro.
While for most of them it's fine because they are defined in the global
`config.h` file included by default to all binaries (by meson), one
specific is not defined here. `HAVE_SO_REUSEPORT_LB` was defined in
`lib/isc/include/isc/netmgr.h` which is of course not included in
`bin/includes/defaultconfig.h`.
As a result, reuseport was disabled for all platform by default, even
the supported ones. This fixes the problem by checking if reuseport is
available on the platform from meson `config.h` generation directly,
which makes `HAVE_SO_REUSEPORT_LB` available everywhere.
The sun_path field is not used anymore, and consumes over a hundred
bytes for every isc_netaddr_t object. Remove it.
As isc_netaddr_t is used in cfg_obj_t, in some huge configuration trees
(e.g., a million zones), the gain is almost 1GB of resident memory.
When the arc4random_uniform() is called on NetBSD with upper_bound that
makes no sense statistically (0 or 1), the call crashes the calling
program. Fix this by returning 0 when upper bound is < 2 as does Linux,
FreeBSD and NetBSD. (Hint: System CSPRNG should never crash.)
Add a magic number check to ensure the memory context validity before
destorying it.
This check is needed now as it was done before implicitly when
isc_mem_inuse was called, but isc_mem_inuse is now called later (to be
able to dump the outstanding allocations).
When a memory context is destroyed, if the `checkfree` property is set,
the program assert there is no remaining allocation. If there are and
assertions are enabled, the program immediately stops.
However, if memory trace/record debug is enabled, the dump of
outstanding allocation won't be printed as it is done after the
no remaining allocation assertion check.
This moves the no remaining allocation assertion check after the dump of
outstanding allocations, so it is still possible to figure out what's
still allocated by this memory context.
if a zone reload is already in progress when 'rndc reload <zone>' is
run, currently the message returned in "zone reload queued", which
is correct, but it's identical to the message returned when a reload
was *not* in progress, so the user can't easily tell what happened.
a user could reload a zone twice and not realize that only one
reload actually took place.
this has been addressed by changing the message returned to
"zone reload was already queued".
a new result code ISC_R_LOADING has been added to signal this
condition, taking the place of ISC_R_RELOAD, which was obsolete
and has been removed.
Use arc4random on platforms where available. arc4random() provides high
quality cryptographically-secure pseudo-random numbers and is generally
recommended for application use.
The uv_random() call unfortunately uses getentropy() on platforms like
MacOS, OpenBSD or NetBSD which is not recommended for application use.