Commit graph

15162 commits

Author SHA1 Message Date
Aydın Mercan
197de93bdc
Forward declare mallocx in isc/mem.h
cmocka.h and jemalloc.h/malloc_np.h has conflicting macro definitions.
While fixing them with push_macro for only malloc is done below, we only
need the non-standard mallocx interface which is easy to just define by
ourselves.
2024-01-18 09:34:36 +01:00
Ondřej Surý
41a0ee1071
Add workaround for jemalloc linking order
Because we don't use jemalloc functions directly, but only via the
libisc library, the dynamic linker might pull the jemalloc library
too late when memory has been already allocated via standard libc
allocator.

Add a workaround round isc_mem_create() that makes the dynamic linker
to pull jemalloc earlier than libc.
2024-01-18 09:34:36 +01:00
Artem Boldariev
20d5a805e2
TLS: improve framing by assembling DNS message in one buffer
This commit improves TLS messages framing by avoiding an extra call to
SSL_write_ex(). Before that we would use an extra SSL_write_ex() call
to pass DNS message length to OpenSSL. That could create an extra TLS
frame, increasing number of bytes sent due to frame header and
padding.

This commit fixes that by making the code pass both DNS message length
and data at once, just like old TLS code did.

It should improve compatibility with some buggy clients that expect
both DNS message length and data to be in one TLS frame.

Older TLS DNS code worked like this, too.
2024-01-17 17:09:41 +02:00
Aydın Mercan
2690dc48d3
Expose the TCP client count in statistics channel
The statistics channel does not expose the current number of TCP clients
connected, only the highwater. Therefore, users did not have an easy
means to collect statistics about TCP clients served over time. This
information could only be measured as a seperate mechanism via rndc by
looking at the TCP quota filled.

In order to expose the exact current count of connected TCP clients
(tracked by the "tcp-clients" quota) as a statistics counter, an
extra, dedicated Network Manager callback would need to be
implemented for that purpose (a counterpart of ns__client_tcpconn()
that would be run when a TCP connection is torn down), which is
inefficient. Instead, track the number of currently-connected TCP
clients separately for IPv4 and IPv6, as Network Manager statistics.
2024-01-17 11:11:12 +03:00
Artem Boldariev
dffb11f2c0 TCP: remove wrong INSIST(csock->recv_cb != NULL)
This commit removes wrong INSIST() condition as the assumption that if
'csock->recv_cb != NULL' iff 'csock->statichandle != NULL' is wrong.

There is no direct relation between 'csock->statichandle' and
'csock->recv_cb', as 'csock->statichandle' gets set when allocating a
handle regardless of 'csock->recv_cb' not being NULL, as it is
possible to attach to the handle without starting a read operation (at
the very least, it is correct to start writing before reading).

That condition made `cipher-suites` system test fail with crash on
some platforms in FIPS mode (namely, Oracle Linux 9) despite not being
related to FIPS at all.
2024-01-16 15:01:26 +02:00
Artem Boldariev
8ae661048d Fix flawed logic when detecting same listener type
The older version of the code was reporting that listeners are going
to be of the same type after reconfiguration when switching from DoT
to HTTPS listener, making BIND abort its executions.

That was happening due to the flaw in logic due to which the code
could consider a current listener and a configuration for the new one
to be of the same type (DoT) even when the new listener entry is
explicitly marked as HTTP.

The checks for PROXY in between the configuration were masking that
behaviour, but when porting it to 9.18 (when there is no PROXY
support), the behaviour was exposed.

Now the code mirrors the logic in 'interface_setup()' closely (as it
was meant to).
2024-01-12 17:59:53 +02:00
Mark Andrews
2cf6cf967d Report the type being filtered from an UPDATE
When processing UPDATE request DNSKEY, CDNSKEY and CDS record that
are managed by named are filtered out.  The log message has been
updated to report the actual type rather that just DNSKEY.
2024-01-12 14:06:58 +00:00
Artem Boldariev
d59cf5e0ce Recreate listeners on DNS transport change
This commit ensures that listeners are recreated on reconfiguration in
the case when their type changes (or when PROXY protocol type changes,
too).

Previously, if a "listen-on" statement was modified to represent a
different transport, BIND would not pick-up the change on
reconfiguration if listener type changes (e.g. DoH -> DoT) for a given
interface address and port combination. This commit fixes that by
recreating the listener.

Initially, that worked for most of the new transports as we would
recreate listeners on each reconfiguration for DoH and DoT. But at
some point we changed that in such a way that listeners were not
recreated to avoid rebinding a port as on some platforms only root can
do that for port numbers <1000, making some ports binding possible
only on start-up. We chose to asynchronously update listener socket
settings (like TLS contexts, HTTP settings) instead.

Now, we both avoid recreating the sockets if unnecessary and recreate
listeners when listener type changes.
2024-01-12 14:55:12 +02:00
Artem Boldariev
eb924e460b Integrate TLS cipher suites support into BIND
This commit makes BIND use the new 'cipher-suites' option from the
'tls' statement.
2024-01-12 13:27:59 +02:00
Artem Boldariev
3818c58bf6 Add TLS cipher suites configuration option to BIND
This commit extends the 'tls' statement with 'cipher-suites' option.
2024-01-12 13:27:59 +02:00
Artem Boldariev
9d052522a0 Add TLS cipher-suites related low-level functionality
This commits adds low-level wrappers on top of
'SSL_CTX_set_ciphersuites()'. These are going to be a foundation
behind the 'cipher-suites' option of the 'tls' statement.
2024-01-12 13:27:59 +02:00
Mark Andrews
d5103b742b
Defer control channel message invalidation
The conn_shutdown() function is called whenever a control channel
connection is supposed to be closed, e.g. after a response to the client
is sent or when named is being shut down.  That function calls
isccc_ccmsg_invalidate(), which resets the magic number in the structure
holding the messages exchanged over a given control channel connection
(isccc_ccmsg_t).  The expectation here is that all operations related to
the given control channel connection will have been completed by the
time the connection needs to be shut down.

However, if named shutdown is initiated while a control channel message
is still in flight, some netmgr callbacks might still be pending when
conn_shutdown() is called and isccc_ccmsg_t invalidated.  This causes
the REQUIRE assertion checking the magic number in ccmsg_senddone() to
fail when the latter function is eventually called, resulting in a
crash.

Fix by splitting up isccc_ccmsg_invalidate() into two separate
functions:

  - isccc_ccmsg_disconnect(), which initiates TCP connection shutdown,
  - isccc_ccmsg_invalidate(), which cleans up magic number and buffer,

and then:

  - replacing all existing uses of isccc_ccmsg_invalidate() with calls
    to isccc_ccmsg_disconnect(),

  - only calling isccc_ccmsg_invalidate() when all netmgr callbacks are
    guaranteed to have been run.

Adjust function comments accordingly.
2024-01-10 15:48:25 +01:00
Aydın Mercan
ca9a05f9ce Check for atomic operations consistency in checklibs.sh
isc/atomic.h and its defined macros should be preferred over
stdatomic.h and explicit atomic operations.

Fix the redundant stdatomic.h header in histo.c found by the introduced
check.
2024-01-03 17:04:31 +00:00
Aydın Mercan
294329da3a Use <isc/atomic.h> instead of <stdatomic.h> directly in <isc/types.h> 2024-01-03 17:04:31 +00:00
Matthijs Mekking
b770740b44 Write new DNSKEY TTL to key file
When the current DNSKEY TTL does not match the one from the policy,
write the new TTL to disk.
2024-01-03 12:09:11 +11:00
Mark Andrews
27e74b2e4b Only create private records for DNSKEYs that have changed
We don't need to create private records for DNSKEY records that
have only had their TTL's changed.
2024-01-03 12:09:11 +11:00
Mark Andrews
d601a90ea3 sync_secure_db failed to handle some TTL changes
If the DNSKEY, CDNSKEY or CDS RRset had different TTLs then the
filtering of these RRset resulted in dns_diff_apply failing with
"not exact". Identify tuple pairs that are just TTL changes and
allow them through the filter.
2024-01-03 12:09:11 +11:00
Mark Andrews
21be35c54e Use the current CDS and CDNSKEY TTLs
When adding new CDS and CDNSKEY records use the existing RRset
TTL if they already exist.
2024-01-03 12:09:11 +11:00
Mark Andrews
dcb7799061 Update the DNSKEY, CDNSKEY and CDS TTLs to match dnskey-ttl
If the TTLs of the DNSKEY, CDNSKEY and CDS do not match the
dnskey-ttl update them by removing all records and re-adding
them with the correct TTL.
2024-01-03 12:09:11 +11:00
Michał Kępień
9cf1f39b54
Silence a scan-build warning in dns_rbt_addname()
Clang Static Analyzer is unable to grasp that when dns_rbt_addnode()
returns ISC_R_EXISTS, it always sets the pointer passed to it via its
'nodep' parameter to a non-NULL value.  Add an extra safety check in the
conditional expression used in dns_rbt_addname() to silence that
warning.
2023-12-22 19:27:37 +01:00
Evan Hunt
ea9a8cb392 prevent an infinite loop in fix_iterator()
it was possible for fix_iterator() to get stuck in a loop while
trying to find the predecessor of a missing node. this has been
fixed and a regression test has been added.
2023-12-21 09:18:30 -08:00
Evan Hunt
84f79cd164 fix_iterator() could produce incoherent iterator stacks
the fix_iterator() function moves an iterator so that it points
to the predecessor of the searched-for name when that name doesn't
exist in the database. the tests only checked the correctness of
the top of the stack, however, and missed some cases where interior
branches in the stack could be missing or duplicated. in these
cases, the iterator would produce inconsistent results when walked.

the predecessors test case in qp_test has been updated to walk
each iterator to the end and ensure that the expected number of
nodes are found.
2023-12-21 09:18:30 -08:00
Mark Andrews
0509351e92 Update the NSEC3PARAM TTL to match the SOA minimum
When building NSEC3 chains update the NSEC3PARAM TTL to match
the SOA minimum.  Delete all records using the old TTL then
re-add them using the new TTL.
2023-12-21 20:12:09 +11:00
Mark Andrews
f3ae88d84e Don't delete the NSEC3PARAM immediately
Wait until the new NSEC or NSEC3 chain is generated then it should
be deleted.
2023-12-21 20:12:09 +11:00
Mark Andrews
a3d0476d17 Don't look for KSK status here and squash memory leak
Just remove the key from  consideration as it is being removed.

The old code could leak a key reference as dst_free_key was not
called every time we continued. This simplification will address
this as well.
2023-12-21 09:18:45 +11:00
Mark Andrews
6ccb93884d dns_request_cancel needs to be callable from any thread
Check the tid and cancel the request immediately or pass it to the
appropriate loop for processing.  Call request->cb directly from
req_sendevent as it is now always called with the correct tid.
2023-12-21 08:11:59 +11:00
Michał Kępień
efcba4dd23
Do not destroy IXFR journal in xfrin_end()
The xfrin_end() function is run when a zone transfer is finished or
canceled.  One of the actions it takes for incremental transfers (IXFR)
is calling dns_journal_destroy() on the zone journal structure that is
stored in the relevant zone transfer context (xfr->ixfr.journal).  That
immediately invalidates that structure as it is not reference-counted.
However, since the changes present in the IXFR stream are applied to the
journal asynchronously (via isc_work_enqueue()), it is possible that
some zone changes may still be in the process of being written to the
journal by the time xfrin_end() destroys the relevant structure.  Such a
scenario leads to crashes.

Fix by not destroying the zone journal structure until the entire zone
transfer context is destroyed.  xfrin_destroy() already conditionally
calls dns_journal_destroy() and when the former is called, all
asynchronous work for a given zone transfer process is guaranteed to be
complete.
2023-12-20 17:21:14 +01:00
Matthijs Mekking
16f2c811e3 Revert "Remove kasp mutex lock"
This reverts commit 634c80ea12.
2023-12-20 08:30:44 +00:00
Mark Andrews
7ab4e1537a Obtain a client->handle reference when calling async_restart
otherwise client may be freed before async_restart is called.
2023-12-20 02:50:48 +11:00
Mark Andrews
c896e07277 Log what change generated a 'not exact' error 2023-12-20 01:56:38 +11:00
Matthijs Mekking
634c80ea12 Remove kasp mutex lock
Multiple zones should be able to read the same key and signing policy
at the same time. Since writing the kasp lock only happens during
reconfiguration, and the complete kasp list is being replaced, there
is actually no need for a lock. Reference counting ensures that a kasp
structure is not destroyed when still being attached to one or more
zones.

This significantly improves the load configuration time.
2023-12-19 14:53:51 +01:00
Mark Andrews
6066e41948 Use 'now' rather than 'inception' in 'add_sigs'
When kasp support was added 'inception' was used as a proxy for
'now' and resulted in signatures not being generated or the wrong
signatures being generated.  'inception' is the time to be set
in the signatures being generated and is usually in the past to
allow for clock skew.  'now' determines what keys are to be used
for signing.
2023-12-19 11:21:46 +11:00
Michał Kępień
b1baf7af3a
"trust-anchor-telemetry" is no longer experimental
Remove the CFG_CLAUSEFLAG_EXPERIMENTAL flag from the
"trust-anchor-telemetry" statement as the behavior of the latter has not
been changed since its initial implementation and there are currently no
plans to do so.  This silences a relevant log message that was emitted
even when the feature was explicitly disabled.
2023-12-18 15:11:39 +01:00
Michał Kępień
2a3b6d1406
Fix reference counting in do_nsfetch()
Each function queuing a do_nsfetch() call using isc_async_run() is
expected to increase the given zone's internal reference count
(zone->irefs), which is then correspondingly decreased in either
do_nsfetch() itself (when the dns_resolver_createfetch() fails) or in
nsfetch_done() (when recursion is finished).

However, do_nsfetch() can also return early if either the zone itself or
the relevant view's resolver object is being shut down.  In that case,
do_nsfetch() simply returns without decreasing the internal reference
count for the zone.  This leaves a dangling zone reference around, which
leads to hangs during named shutdown.

Fix by executing the same cleanup code for early returns from
do_nsfetch() as for a failed dns_resolver_createfetch() call in that
function as the reference count will not be decreased in nsfetch_done()
in any of these cases.
2023-12-18 11:33:43 +01:00
Aram Sargsyan
791a046cc7 Use atomic store operations instead of atomic initialize
The atomic_init() function makes sense to use with structure's
members when creating a new instance of a strucutre. In other
places, use atomic store operations instead, in order to avoid
data races.
2023-12-15 09:56:44 +00:00
Aydın Mercan
9c4dd863a6 Move atomic statscounter next to the non-atomic definition 2023-12-14 09:11:48 +01:00
Aydın Mercan
bb96142a17 Use a non-atomic counter when passing to stats dumper 2023-12-14 09:11:48 +01:00
Petr Špaček
7b0115e331 Avoid overflow during statistics dump
Related: !1493
Fixes: #4467
2023-12-14 09:11:02 +01:00
Mark Andrews
fd077c2661 NetBSD has added 'hmac' to libc so rename out uses of hmac 2023-12-13 22:27:38 +00:00
Matthijs Mekking
21867f200a Refactor getpred code
Move the code to find the predecessor into one function, as it is shares
quite some similarities: In both cases we first need to find the
immediate predecessor/successor, then we need to find the immediate
predecessor if the iterator is not already pointing at it.
2023-12-11 21:01:29 +00:00
Matthijs Mekking
ab8a0c4b5a and fix yet another dns_qp_lookup() iterator bug
This one is similar to the bug when searching for a key, reaching a
dead-end branch that doesn't match, because the branch offset point
is after the point where the search key differs.

This fixes the case where we are multiple levels deep. In other
words, we had a more-than-one matches *after* the point where the
search key differs.

For example, consider the following qp-trie:

branch: "[e]", "[m]":
 - leaf: "a.b.c.d.e"
 - branch: "moo[g]", "moo[k]", "moo[n]":
   - leaf: "moog"
   - branch: "mook[e]", "mook[o]"
     - leaf: "mooker"
     - leaf: "mooko"
   - leaf: "moon"

If searching for a key "monky", we would reach the branch with
twigs "moo[k]" and "moo[n]". The key matches on the 'k' on offset=4,
and reaches the branch with twigs "mook[e]" and "mook[o]". This time
we cannot find a twig that matches our key at offset=5, there is no
twig for 'y'. The closest name we found was "mooker".

Note that on a branch it can't detect it is on a dead branch because the
key is not encapsulated in a branch node.

In the previous code we considered "mooker" to be the successor of
"monky" and so we needed to the predecessor of "mooker" to find the
predecessor for "monky". However, since the search key alread differed
before entering this branch, this is not enough. We would be left with
"moog" as the predecessor of "monky", while in this example "a.b.c.d.e"
is the actual predecessor.

Instead, we need to go up a level, find the predecessor and check
again if we are on the right branch, and repeat the process until we
are.

Unit tests to cover the scenario are now added.
2023-12-11 21:01:29 +00:00
Matthijs Mekking
276bdcf5cf and fix another dns_qp_lookup() iterator bug
There was yet another edge case in which an iterator could be
positioned at the wrong node after dns_qp_lookup(). When searching for
a key, it's possible to reach a leaf that matches at the given offset,
but because the offset point is *after* the point where the search key
differs from the leaf's contents, we are now at the wrong leaf.

In other words, the bug fixed the previous commit for dead-end branches
must also be applied on matched leaves.

For example, if searching for the key "monpop", we could reach a branch
containing "moop" and "moor". the branch offset point - i.e., the point
after which the branch's leaves differ from each other - is the
fourth character ("p" or "r"). The search key matches the fourth
character "p", and takes that twig to the next node (which can be
a branch for names starting with "moop", or could be a leaf node for
"moop").

The old code failed to detect this condition, and would have
incorrectly left the iterator pointing at some successor, and not
at the predecessor of the "moop".

To find the right predecessor in this case, we need to get to the
previous branch and get the previous from there.

This has been fixed and the unit test now includes several new
scenarios for testing search names that match and unmatch on the
offset but have a different character before the offset.
2023-12-11 21:01:29 +00:00
Tom Krizek
059a63793a
Remove obsolete check for resolver-nonbackoff-tries
With the resolver-nonbackoff-tries statement being removed in #4405,
this check can no longer be reached and can be safely removed.
2023-12-07 13:10:58 +01:00
Mark Andrews
7e462c2b26 Also cleanup the space for the rbt nodes
As we are in overmem state we want to free more memory than we are
adding so we need to add in an allowance for the rbtnodes that may
have been added and the names stored with them.  There is the node
for the owner name and a possible ENT node if there was a node split.
2023-12-07 02:59:04 +00:00
Mark Andrews
5e8f0e9ceb Process the combined LRU lists in LRU order
Only cleanup headers that are less than equal to the rbt's last_used
time.  Adjust the rbt's last_used time when the target cleaning was
not achieved to the oldest value of the remaining set of headers.

When updating delegating NS and glue records last_used was not being
updated when it should have been.

When adding zero TTL records to the tail of the LRU lists set
last_used to rbtdb->last_used + 1 rather than now.  This appoximately
preserves the lists LRU order.
2023-12-07 02:59:04 +00:00
Evan Hunt
7d05590a6f clean up client.c
- make dns_client_startresolve() static since it's only used internally
- remove outdated comments
2023-12-06 17:31:38 -08:00
Evan Hunt
50dd6aad34 remove unused functions in dns_master
dns_master_dumpnode() and dns_master_dumpnodetostream() were
never used and can be removed.
2023-12-06 17:31:38 -08:00
Evan Hunt
66496d550b remove resolver-retry-interval and resolver-nonbackoff-tries
fully remove these options and mark them as ancient.
2023-12-06 11:54:59 -08:00
Evan Hunt
4aaa4f7dca deprecate resolver-retry-interval and resolver-nonbackoff-tries
these options control default timing of retries in the resolver
for experimental purposes; they are not known to useful in production
environments.  they will be removed in the future; for now, we
only log a warning if they are used.
2023-12-06 11:51:22 -08:00
Evan Hunt
60a33ae6bb fix another dns_qp_lookup() iterator bug
there was another edge case in which an iterator could be positioned at
the wrong node after dns_qp_lookup().  when searching for a key, it's
possible to reach a dead-end branch that doesn't match, because the
branch offset point is *after* the point where the search key differs
from the branch's contents.

for example, if searching for the key "mop", we could reach a branch
containing "moon" and "moor". the branch offset point - i.e., the
point after which the branch's leaves differ from each other - is the
fourth character ("n" or "r"). however, both leaves differ from the
search key at position *three* ("o" or "p"). the old code failed to
detect this condition, and would have incorrectly left the iterator
pointing at some lower value and not at "moor".

this has been fixed and the unit test now includes this scenario.
2023-12-06 11:03:30 -08:00