Commit graph

14800 commits

Author SHA1 Message Date
Aram Sargsyan
04648d7c2f Add ClientQuota statistics channel counter
This counter indicates the number of the resolver's spilled
queries due to reaching the clients per query quota.
2023-05-31 09:08:58 +00:00
Evan Hunt
26b4acde16 remove win2k gss-tsig hacks
Remove the code implementing nonstardard behaviors that were formerly
needed to allow GSS-TSIG to work with Windows 2000, which passed
End-of-Life in 2010.

Deprecate the "oldgsstsig" command and "-o" command line option
to nsupdate; these are now treated as synonyms for "gsstsig" and "-g"
respectively.
2023-05-30 15:36:01 -07:00
Matthijs Mekking
74d30879ba Extend serve-stale logging
Print the database lookup result in serve-stale logs for debugging
potential future serve-stale issues.
2023-05-30 11:58:19 +02:00
Matthijs Mekking
bbd163acf6 Fix serve-stale bug when cache has no data
We recently fixed a bug where in some cases (when following an
expired CNAME for example), named could return SERVFAIL if the target
record is still valid (see isc-projects/bind9#3678, and
isc-projects/bind9!7096). We fixed this by considering non-stale
RRsets as well during the stale lookup.

However, this triggered a new bug because despite the answer from
cache not being stale, the lookup may be triggered by serve-stale.
If the answer from database is not stale, the fix in
isc-projects/bind9!7096 erroneously skips the serve-stale logic.

Add 'answer_found' checks to the serve-stale logic to fix this issue.
2023-05-30 11:58:19 +02:00
Mark Andrews
ac2e0bc3ff Move isc_mem_put to after node is checked for equality
isc_mem_put NULL's the pointer to the memory being freed.  The
equality test 'parent->r == node' was accidentally being turned
into a test against NULL.
2023-05-29 01:40:57 +00:00
Evan Hunt
512e5e786b don't set SHUTTINGDOWN until after calling the request callbacks
if we set ISC_HTTPDMGR_SHUTTINGDOWN in the http manager before
calling the pending request callbacks, it can trigger an assertion.
2023-05-27 00:41:37 +00:00
Artem Boldariev
0b95cf74ff
ZMGR: TLS contexts cache - properly synchronise access
This commit ensures that access to the TLS context cache within zone
manager is properly synchronised.

Previously there was a possibility for it to get unexpectedly
NULLified for a brief moment by a call to
dns_zonemgr_set_tlsctx_cache() from one thread, while being accessed
from another (e.g. from got_transfer_quota()). This behaviour could
lead to server abort()ing on configuration reload (under very rare
circumstances).

That behaviour has been fixed.
2023-05-26 14:18:03 +03:00
Evan Hunt
0e800467ee fix handling of TCP timeouts
when a TCP dispatch times out, we call tcp_recv() with a result
value of ISC_R_TIMEDOUT; this cancels the oldest dispatch
entry in the dispatch's active queue, plus any additional entries
that have waited longer than their configured timeouts. if, at
that point, there were more dispatch entries still on the active
queue, it resumes reading, but until now it failed to restart
the timer.

this has been corrected: we now calculate a new timeout
based on the oldest dispatch entry still remaining.  this
requires us to initialize the start time of each dispatch entry
when it's first added to the queue.

in order to ensure that the handling of timed-out requests is
consistent, we now calculate the runtime of each dispatch
entry based on the same value for 'now'.

incidentally also fixed a compile error that turned up when
DNS_DISPATCH_TRACE was turned on.
2023-05-26 00:41:01 -07:00
Evan Hunt
e436d84408 prevent TSIG keys from being added to multiple rings
it was possible to add a TSIG key to more than one TSIG
keyring at a time, and this was in fact happening with the
session key, which was generated once and then added to the
keyrings for each view as it was configured.

this has been corrected and a REQUIRE added to dns_tsigkeyring_add()
to prevent it from happening again.
2023-05-25 11:59:02 -07:00
Matthijs Mekking
ef58f2444f Add new dns_rdatatype_iskeymaterial() function
The following code block repeats quite often:

    if (rdata.type == dns_rdatatype_dnskey ||
        rdata.type == dns_rdatatype_cdnskey ||
        rdata.type == dns_rdatatype_cds)

Introduce a new function to reduce the repetition.
2023-05-23 08:53:23 +02:00
Matthijs Mekking
81cb18b8a2 Make make_dnskey() a public funcion
It can be used to compare DNSKEY, CDNSKEY, and CDS records with
signing keys.
2023-05-23 08:53:23 +02:00
Mark Andrews
d24297343f Don't sign the raw zone
The raw zone is not supposed to be signed.  DNSKEY records in a raw zone
should not trigger zone signing.  The update code needs to be able to
identify when it is working on a raw zone.  Add dns_zone_israw() and
dns_zone_issecure() enable it to do this. Also, we need to check the
case for 'auto-dnssec maintain'.
2023-05-23 08:53:23 +02:00
Matthijs Mekking
b493c8505e Fix dns_zone_getkasp() function
For inline-signing zones, sometimes kasp was not detected because
the function was called on the raw (unsigned) version of the zone,
but the kasp is only set on the secure (signed) version of the zone.

Fix the dns_zone_getkasp() function to check whether the zone
structure is inline_raw(), and if so, use the kasp from the
secure version.

In zone.c we can access the kasp pointer directly.
2023-05-23 08:52:01 +02:00
Matthijs Mekking
b0b3e2f12e Allow DNSKEY when syncing secure journal/db
When synchronizing the journal or database from the unsigned version of
the zone to the secure version of the zone, allow DNSKEY records to be
synced, because these may be added by the user with the sole intent to
publish the record (not used for signing). This may be the case for
example in the multisigner model 2 (RFC 8901).

Additional code needs to be added to ensure that we do not remove DNSKEY
records that are under our control. Keys under our control are keys that
are used for signing the zone and thus that we have key files for.

Same counts for CDNSKEY and CDS (records that are derived from keys).
2023-05-23 08:52:01 +02:00
Matthijs Mekking
3b6e9a5fa7 Add function to check if a DNSKEY record is in use
Add a function that checks whether a DNSKEY, CDNSKEY, or CDS record
belongs to a key that is being used for signing.
2023-05-23 08:52:01 +02:00
Mark Andrews
c48c72343d Silence Coverity USE_AFTER_FREE warning
Use current used pointer - 16 instead of a saved pointer as Coverity
thinks the memory may be freed between assignment and use of 'cp'.
isc_buffer_put{mem,uint{8,16,32}} can theoretically free the memory
if there is a dynamic buffer in use but that is not the case here.
2023-05-23 02:13:28 +00:00
Tony Finch
b754c6628f Acquire qpmulti->mutex during destruction
Thread sanitizer warns that parts of the qp-trie are accessed
both with and without the mutex; the unlocked accesses happen during
destruction, so they should be benign, but there's no harm locking
anyway to convince tsan it is clean.

Also, ensure .tsan-suppress and .tsan-suppress-extra are in sync.
2023-05-20 07:26:21 +00:00
Michal Nowak
1fe5c008d6
Ensure "wrap" variable is non-NULL
RUNTIME_CHECK on the "wrap" variable avoids possible NULL dereference:

    thread.c: In function 'thread_wrap':
    thread.c:60:15: error: dereference of possibly-NULL 'wrap' [CWE-690] [-Werror=analyzer-possible-null-dereference]
       60 |         *wrap = (struct thread_wrap){

The RUNTIME_CHECK was there before
7d1ceaf35d.
2023-05-19 11:02:59 +02:00
Michał Kępień
6029010dd2
Remove <isc/cmocka.h>
The last use of the cmocka_add_test_byname() helper macro was removed in
commit 63fe9312ff.  Remove the
<isc/cmocka.h> header that defines it.
2023-05-18 15:12:23 +02:00
Mark Andrews
864cd08052 Properly process extra nameserver lines in resolv.conf
The whole line needs to be read rather than just the token "nameserver"
otherwise the next line in resolv.conf is not properly processed.
2023-05-16 02:04:55 +00:00
Tony Finch
c319ccd4c9 Fixes for liburcu-qsbr
Move registration and deregistration of the main thread from
`isc_loopmgr_run()` into `isc__initialize()` / `isc__shutdown()`:
liburcu-qsbr fails an assertion if we try to use it from an
unregistered thread, and we need to be able to use it when the
event loops are not running.

Use `rcu_assign_pointer()` and `rcu_dereference()` in qp-trie
transactions so that they properly mark threads as online. The
RCU-protected pointer is no longer declared atomic because
liburcu does not (yet) use standard C atomics.

Fix the definition of `isc_qsbr_rcu_dereference()` to return
the referenced value, and to call the right function inside
liburcu.

Change the thread sanitizer suppressions to match any variant of
`rcu_*_barrier()`
2023-05-15 20:49:42 +00:00
Tony Finch
afae41aa40
Check the return value from uv_async_send()
An omission pointed out by the following report from Coverity:

    /lib/isc/loop.c: 483 in isc_loopmgr_pause()
    >>>     CID 455002:  Error handling issues  (CHECKED_RETURN)
    >>>     Calling "uv_async_send" without checking return value (as is done elsewhere 5 out of 6 times).
    483     		uv_async_send(&loop->pause_trigger);
2023-05-15 18:52:04 +01:00
Evan Hunt
b4ac7faee9 allow streamdns read to resume after timeout
when reading on a streamdns socket failed due to timeout, but
the dispatch was still waiting for other responses, it would
resume reading by calling isc_nm_read() again. this caused
an assertion because the socket was already reading.

we now check that either the socket is reading, or that it was
already reading on the same handle.
2023-05-13 23:31:45 -07:00
Tony Finch
fc770a8bd0
Remove the now-unused ISC_STACK
We are using the liburcu concurrent data structures instead.
2023-05-12 20:49:43 +01:00
Tony Finch
f11cc83142
Use per-CPU RCU helper threads
Create and free per-CPU helper threads from the main thread and tell
thread sanitizer to suppress leaking threads. (We are not leaking
threads ourselves and we can safely ignore the Userspace-RCU thread
leaks.)
2023-05-12 20:48:31 +01:00
Tony Finch
c377e0a9e3
Help thread sanitizer to cope with liburcu
All the places the qp-trie code was using `call_rcu()` needed
`__tsan_release()` and `__tsan_acquire()` annotations, so
add a couple of wrappers to encapsulate this pattern.

With these wrappers, the tests run almost clean under thread
sanitizer. The remaining problems are due to `rcu_barrier()`
which can be suppressed using `.tsan-suppress`. It does not
suppress the whole of `liburcu`, because we would like thread
sanitizer to detect problems in `call_rcu()` callbacks, which
are called from `liburcu`.

The CI jobs have been updated to use `.tsan-suppress` by
default, except for a special-case job that needs the
additional suppressions in `.tsan-suppress-extra`.

We might be able to get rid of some of this after liburcu gains
support for thread sanitizer.

Note: the `rcu_barrier()` suppression is not entirely effective:
tsan sometimes reports races that originate inside `rcu_barrier()`
but tsan has discarded the stack so it does not have the
information required to suppress the report. These "races" can
be made much easier to reproduce by adding `atexit_sleep_ms=1000`
to `TSAN_OPTIONS`. The problem with tsan's short memory can be
addressed by increasing `history_size`: when it is large enough
(6 or 7) the `rcu_barrier()` stack usually survives long enough
for suppression to work.
2023-05-12 20:48:31 +01:00
Tony Finch
2bce998b2b
Avoid using the zone timer after its loop has gone
Shutdown and cleanup of zones is more asynchronous with the qp-trie
zone table. As a result it's possible that some activity is delayed
until after a zone has been released from its zonemanager.

Previously, the dns_zone code was not very strict in the way it
refers to the loop it is running on: The loop pointer was stashed when
dns_zonemgr_managezone() was called and never cleared. Now, zones
properly attach to and detach from their loops.

The zone timer depends on its loop. The shutdown crashes occurred
when asynchronous calls tried to modify the zone timer after
dns_zonemgr_releasezone() has been called and the loop was
invalidated. In these cases the attempt to set the timer is now
ignored, with a debug log message.
2023-05-12 20:48:31 +01:00
Tony Finch
9882a6ef90
The zone table no longer depends on the loop manager
This reverts some of the changes in commit b171cacf4f
because now it isn't necessary to pass the loopmgr around.
2023-05-12 20:48:31 +01:00
Tony Finch
6217e434b5
Refactor the core qp-trie code to use liburcu
A `dns_qmpulti_t` no longer needs to know about its loopmgr. We no
longer keep a linked list of `dns_qpmulti_t` that have reclamation
work, and we no longer mark chunks with the phase in which they are to
be reclaimed. Instead, empty chunks are listed in an array in a
`qp_rcu_t`, which is passed to call_rcu().
2023-05-12 20:48:31 +01:00
Tony Finch
05ca11e122
Remove isc_qsbr (we are using liburcu instead)
This commit breaks the qp-trie code.
2023-05-12 20:48:31 +01:00
Tony Finch
cd0795beea
Slightly more sanitary thread dispatch
Tell thread sanitizer that the thread wrapper is released before
passing it to a new thread.
2023-05-12 20:48:31 +01:00
Tony Finch
2e0c954806
Wait for RCU to finish before destroying a memory context
Memory reclamation by `call_rcu()` is asynchronous, so during shutdown
it can lose a race with the destruction of its memory context. When we
defer memory reclamation, we need to attach to the memory context to
indicate that it is still in use, but that is not enough to delay its
destruction. So, call `rcu_barrier()` in `isc_mem_destroy()` to wait
for pending RCU work to finish before proceeding to destroy the memory
context.
2023-05-12 20:48:31 +01:00
Tony Finch
4f97a679f0
A macro for the size of a struct with a flexible array member
It can be fairly long-winded to allocate space for a struct with a
flexible array member: in general we need the size of the struct, the
size of the member, and the number of elements. Wrap them all up in a
STRUCT_FLEX_SIZE() macro, and use the new macro for the flexible
arrays in isc_ht and dns_qp.
2023-05-12 20:48:31 +01:00
Aram Sargsyan
fae0930eb8 Check whether zone->db is a valid pointer before attaching
The zone_resigninc() function does not check the validity of
'zone->db', which can crash named if the zone was unloaded earlier,
for example with "rndc delete".

Check that 'zone->db' is not 'NULL' before attaching to it, like
it is done in zone_sign() and zone_nsec3chain() functions, which
can similarly be called by zone maintenance.
2023-05-12 13:37:27 +00:00
Ondřej Surý
fd3522c37b
Add Userspace-RCU to global CFLAGS and LIBS
The Userspace-RCU headers are now needed for more parts of the libisc
and libdns, thus we need to add it globally to prevent compilation
failures on systems with non-standard Userspace-RCU installation path.
2023-05-12 14:16:25 +02:00
Ondřej Surý
00f1823366
Change the isc_quota API to use cds_wfcqueue internally
The isc_quota API was using locked list of isc_job_t objects to keep the
waiting TCP accepts.  Change the isc_quota implementation to use
cds_wfcqueue internally - the enqueue is wait-free and only dequeue
needs to be locked.
2023-05-12 14:16:25 +02:00
Ondřej Surý
7b1d985de2
Change the isc_async API to use cds_wfcqueue internally
The isc_async API was using lock-free stack (where enqueue operation was
not wait-free).  Change the isc_async to use cds_wfcqueue internally -
enqueue and splice (move the queue members from one list to another) is
nonblocking and wait-free.
2023-05-12 14:16:25 +02:00
Ondřej Surý
7220851f67
Replace glue_cache hashtable with direct link in rdatasetheader
Instead of having a global hashtable with a global rwlock for the GLUE
cache, move the glue_list directly into rdatasetheader and use
Userspace-RCU to update the pointer when the glue_list is empty.

Additionally, the cached glue_lists needs to be stored in the RBTDB
version for early cleaning, otherwise the circular dependencies between
nodes and glue_lists will prevent nodes to be ever cleaned up.
2023-05-12 13:25:39 +02:00
Matthijs Mekking
2c7d93d431 Read from kasp whether to publish CDNSKEY
Check the policy and feed 'dns_dnssec_syncupdate() the right value
to enable/disable CDSNKEY publication.
2023-05-11 17:07:51 +02:00
Matthijs Mekking
8be61d1845 Add configuration option 'cdnskey'
Add the 'cdnskey' configuration option to 'dnssec-policy'.
2023-05-11 17:07:51 +02:00
Matthijs Mekking
7960afcc0f Add functions to set CDNSKEY publication
Add kasp API functions to enable/disable publication of CDNSKEY records.
2023-05-11 17:07:51 +02:00
Michal Nowak
31935a3537
Disable ASAN in nsupdate for fatal cases
Clang 16 LeakSanitizer reports a memory leak when dns_request_create()
returned a TLS error in the nsupdate system test. While technically a
memory leak on error handling, it's not a problem because the program is
immediately terminated; nsupdate is not expected to run for a prolonged
time.
2023-05-11 13:39:51 +02:00
Mark Andrews
f3b24ba789 Handle FORMERR on unknown EDNS option that are echoed
If the resolver received a FORMERR response to a request with
an DNS COOKIE option present that echoes the option back, resend
the request without an DNS COOKIE option present.
2023-05-11 09:32:02 +10:00
Ondřej Surý
b3c6ee7b9a
Fix a logical flaw that would skip logging notify success
The notify_done() would never log a success as the logging part was
always skipped.  Fix the code flow in the function.
2023-05-03 21:51:20 +02:00
Mark Andrews
9fcd42c672 Re-write remove_old_tsversions and greatest_version
Stop deliberately breaking const rules by copying file->name into
dirbuf and truncating it there.  Handle files located in the root
directory properly. Use unlinkat() from POSIX 200809.
2023-05-03 09:12:34 +02:00
Matthijs Mekking
70629d73da Fix purging old log files with absolute file path
Removing old timestamp or increment versions of log backup files did
not work when the file is an absolute path: only the entry name was
provided to the file remove function.

The dirname was also bogus, since the file separater was put back too
soon.

Fix these issues to make log file rotation work when the file is
configured to be an absolute path.
2023-05-03 09:12:11 +02:00
Tony Finch
7d1ceaf35d
Move per-thread RCU setup into isc_thread
All the per-loop `libuv` setup remains in `isc_loop`, but the per-thread
RCU setup is moved to `isc_thread` alongside the other per-thread setup.
This avoids repeating the per-thread setup for `call_rcu()` helpers,
and explains a little better why some parts of the per-thread setup
is missing for `call_rcu()` helpers.

This also removes the per-loop `call_rcu()` helpers as we refactored the
isc__random_initialize() in the previous commit.
2023-04-27 12:38:53 +02:00
Ondřej Surý
65021dbf52
Move the isc_random API initialization to the thread_local variable
Instead of writing complicated wrappers for every thread, move the
initialization back to isc_random unit and check whether the random seed
was initialized with a thread_local variable.

Ensure that isc_entropy_get() returns a non-zero seed.

This avoids problems with thread sanitizer tests getting stuck in an
infinite loop.
2023-04-27 12:38:53 +02:00
Tony Finch
e0248bf60f
Simplify isc_thread a little
Remove the `isc_threadarg_t` and `isc_threadresult_t`
typedefs which were unhelpful disguises for `void *`,
and free the dummy jemalloc allocation sooner.
2023-04-27 12:38:53 +02:00
Tony Finch
06f534fa69
Avoid spurious compilation failures in liburcu headers
When liburcu is not installed from a system package, its headers are
not treated as system headers by the compiler, so BIND's -Werror and
other warning options take effect. The liburcu headers have a lot
of inline functions, some of which do not use all their arguments,
which BIND's build treats as an error.
2023-04-27 12:38:53 +02:00