Stop deliberately breaking const rules by copying file->name into
dirbuf and truncating it there. Handle files located in the root
directory properly. Use unlinkat() from POSIX 200809.
Removing old timestamp or increment versions of log backup files did
not work when the file is an absolute path: only the entry name was
provided to the file remove function.
The dirname was also bogus, since the file separater was put back too
soon.
Fix these issues to make log file rotation work when the file is
configured to be an absolute path.
All the per-loop `libuv` setup remains in `isc_loop`, but the per-thread
RCU setup is moved to `isc_thread` alongside the other per-thread setup.
This avoids repeating the per-thread setup for `call_rcu()` helpers,
and explains a little better why some parts of the per-thread setup
is missing for `call_rcu()` helpers.
This also removes the per-loop `call_rcu()` helpers as we refactored the
isc__random_initialize() in the previous commit.
Instead of writing complicated wrappers for every thread, move the
initialization back to isc_random unit and check whether the random seed
was initialized with a thread_local variable.
Ensure that isc_entropy_get() returns a non-zero seed.
This avoids problems with thread sanitizer tests getting stuck in an
infinite loop.
Remove the `isc_threadarg_t` and `isc_threadresult_t`
typedefs which were unhelpful disguises for `void *`,
and free the dummy jemalloc allocation sooner.
When liburcu is not installed from a system package, its headers are
not treated as system headers by the compiler, so BIND's -Werror and
other warning options take effect. The liburcu headers have a lot
of inline functions, some of which do not use all their arguments,
which BIND's build treats as an error.
This commit allows BIND 9 to be compiled with different flavours of
Userspace RCU, and improves the integration between Userspace RCU and
our event loop:
- In the RCU QSBR, the thread is put offline when polling and online
when rcu_dereference, rcu_assign_pointer (or friends) are called.
- In other RCU modes, we check that we are not reading when reaching the
quiescent callback in the event loop.
- We register the thread before uv_work_run() callback is called and
after it has finished. The rcu_(un)register_thread() has a large
overhead, but that's fine in this case.
There's a recurring pattern walking the ISC_LISTs that just repeats over
and over. Add two macros:
* ISC_LIST_FOREACH(list, elt, link) - walk the static list
* ISC_LIST_FOREACH_SAFE(list, elt, link, next) - walk the list in
a manner that's safe against list member deletions
When OPTOUT was in use we didn't ensure that NSEC3 records
for orphaned empty-non-terminals where removed. Check if
there are orphaned empty-non-terminal NSEC3 even if there
wasn't an NSEC3 RRset to be removed in dns_nsec3_delnsec3.
'-T transferinsecs' makes named interpret the max-transfer-time-out,
max-transfer-idle-out, max-transfer-time-in and max-transfer-idle-in
configuration options as seconds instead of minutes.
'-T transferslowly' makes named to sleep for one second for every
xfrout message.
'-T transferstuck' makes named to sleep for one minute for every
xfrout message.
After the dns_xfrin was changed to use network manager, the maximum
global (max-transfer-time-in) and idle (max-transfer-idle-in) times for
incoming transfers were turned inoperational because of missing
implementation.
Restore this functionality by implementing the timers for the incoming
transfers.
The spinlock is small (atomic_uint_fast32_t at most), lightweight
synchronization primitive and should only be used for short-lived and
most of the time a isc_mutex should be used.
Add a isc_spinlock unit which is either (most of the time) a think
wrapper around pthread_spin API or an efficient shim implementation of
the simple spinlock.
When shutting down TCP sockets, the read callback calling logic was
flawed, it would call either one less callback or one extra. Fix the
logic in the way:
1. When isc_nm_read() has been called but isc_nm_read_stop() hasn't on
the handle, the read callback will be called with ISC_R_CANCELED to
cancel active reading from the socket/handle.
2. When isc_nm_read() has been called and isc_nm_read_stop() has been
called on the on the handle, the read callback will be called with
ISC_R_SHUTTINGDOWN to signal that the dormant (not-reading) socket
is being shut down.
3. The .reading and .recv_read flags are little bit tricky. The
.reading flag indicates if the outer layer is reading the data (that
would be uv_tcp_t for TCP and isc_nmsocket_t (TCP) for TLSStream),
the .recv_read flag indicates whether somebody is interested in the
data read from the socket.
Usually, you would expect that the .reading should be false when
.recv_read is false, but it gets even more tricky with TLSStream as
the TLS protocol might need to read from the socket even when sending
data.
Fix the usage of the .recv_read and .reading flags in the TLSStream
to their true meaning - which mostly consist of using .recv_read
everywhere and then wrapping isc_nm_read() and isc_nm_read_stop()
with the .reading flag.
4. The TLS failed read helper has been modified to resemble the TCP code
as much as possible, clearing and re-setting the .recv_read flag in
the TCP timeout code has been fixed and .recv_read is now cleared
when isc_nm_read_stop() has been called on the streaming socket.
5. The use of Network Manager in the named_controlconf, isccc_ccmsg, and
isc_httpd units have been greatly simplified due to the improved design.
6. More unit tests for TCP and TLS testing the shutdown conditions have
been added.
Co-authored-by: Ondřej Surý <ondrej@isc.org>
Co-authored-by: Artem Boldariev <artem@isc.org>
When retrying in the DNS dispatch, the local port would be forgotten on
ISC_R_ADDRINUSE, keep the configured source-port even when retrying.
Additionally, treat ISC_R_NOPERM same as ISC_R_ADDRINUSE.
Closes: #3986
When dns_request_create() failed in notify_send_toaddr(), sending the
notify would silently fail. When notify_done() failed, the error would
be logged on the DEBUG(2) level.
This commit remedies the situation by:
* Promoting several messages related to notifies to INFO level and add
a "success" log message at the INFO level
* Adding a TCP fallback - when sending the notify over UDP fails, named
will retry sending notify over TCP and log the information on the
NOTICE level
* When sending the notify over TCP fails, it will be logged on the
WARNING level
Closes: #4001, #4002
If the zone forwards are canceled from dns_zonemgr_shutdown(), the
forward_cancel() would get called from the main loop, which is wrong.
It needs to be called from the matching zone->loop.
Run the dns_request_cancel() via isc_async_run() on the loop associated
with the zone instead of calling the dns_request_cancel() directly from
the main loop.
By inspecting the code, it was discovered that .sendbuf member of the
isc__nm_networker_t was unused and just consuming ~64k per worker.
Remove the member and the association allocation/deallocation.
In e185412872, the TCP accept quota code
became broken in a subtle way - the quota would get initialized on the
first accept for the server socket and then deleted from the server
socket, so it would never get applied again.
Properly fixing this required a bigger refactoring of the isc_quota API
code to make it much simpler. The new code decouples the ownership of
the quota and acquiring/releasing the quota limit.
After (during) the refactoring it became more clear that we need to use
the callback from the child side of the accepted connection, and not the
server side.
Instead of using isc_async_run() when closing StreamDNS handle, add
isc_job_t member to the isc_nmhandle_t structure and use isc_job_run()
to avoid allocation/deallocation on the StreamDNS hot-path.
If the quota callback is called on a thread matching the socket, call
the TCP accept function directly instead of using isc_async_run() which
allocates-deallocates memory.
The isc_tid() function is often called on the hot-path and it's the only
function is to return thread_local variable, make the isc_tid() function
a header-only to save several function calls during query-response
processing.
It should be floor(DNS_NAME_MAXWIRE / 2) + 1 == 128
The mistake was introduced in c6bf51492d because:
* I was refactoring an existing `DNS_MAX_LABELS` defined as 127
* There was a longstanding bug in `dns_name_isvalid()` which
checked the number of labels against 127U instead of 128
* I mistakenly thought `dns_name_isvalid()` was correct and
`dns_name_countlabels()` was incorrect, but the reverse was true.
After this commit, occurrances of `DNS_NAME_MAXLABELS` with value
128 are consistent with the use of 127 or 128 before commit
c6bf51492d except for the mistake in `dns_name_isvalid()`.
This commit adds a test case that checks the MAXLABELS case
in `dns_name_fromtext()` and `dns_name_isvalid()`.
This change makes the zone table lock-free for reads. Previously, the
zone table used a red-black tree, which is not thread safe, so the hot
read path acquired both the per-view mutex and the per-zonetable
rwlock. (The double locking was to fix to cleanup races on shutdown.)
One visible difference is that zones are not necessarily shut down
promptly: it depends on when the qp-trie garbage collector cleans up
the zone table. The `catz` system test checks several times that zones
have been deleted; the test now checks for zones to be removed from
the server configuration, instead of being fully shut down. The catz
test does not churn through enough zones to trigger a gc, so the zones
are not fully detached until the server exits.
After this change, it is still possible to improve the way we handle
changes to the zone table, for instance, batching changes, or better
compaction heuristics.
Revert refcount debug tracing (commit a8b29f0365), there are better
ways to do it.
Use the dns_qpmethods_t typedef where appropriate.
Some stylistic improvements.
Commit 0858514ae8 enriched dns_qp_compact() to give callers more
control over how thoroughly the trie should be compacted.
In the DNS_QPGC_ALL case, if the trie is small it might be compacted
to a new position in the same memory chunk. In this situation it will
still be holding references to old leaf objects which have been
removed from the trie but will not be completely detached until the
chunk containing the references is freed.
This change resets the qp-trie allocator to a fresh chunk before a
DNS_QPGC_ALL compaction, so all the old memory chunks will be
evacuated and old leaf objects can be detached sooner.
It is sometimes necessary to access a qp-trie outside an isc_loop,
such as in tests or an isc_work callback. The best option was to use
a `dns_qpmulti_write()` transaction, but that has overheads that are
not necessary for read-only access, such as committing a new version
of the trie even when nothing changed.
So this commit adds a `dns_qpmulti_read()` transaction, which is
nearly as lightweight as a query transaction, but it takes the mutex
like a write transaction.
This is the first of the "fancy" searches that know how the DNS
namespace maps on to the structure of a qp-trie. For example, it will
find the closest enclosing zone in the zone tree.
In the check_algorithm() function openssleddsa_alg_info() is
called with two known variants of the 'algorithm' argument, and
both are expected to return a non-NULL value.
Add an INSIST to suppress the following GCC 12 analyzer report:
openssleddsa_link.c: In function 'raw_key_to_ossl':
openssleddsa_link.c:92:13: error: dereference of NULL 'alginfo' [CWE-476] [-Werror=analyzer-null-dereference]
92 | int pkey_type = alginfo->pkey_type;
| ^~~~~~~~~
The dns_adbentry_overquota() was violating the layers accessing the
adbentry struct members directly. Change it to dns_adb_overquota() to
match the dns_adb API.
Instead of explicitly adding a reference to catzs (catalog zones) when
calling the update callback, attach the catzs to the catz (catalog zone)
object to keep it referenced for the whole time the catz exists.
As we are now using dispatch instead of netmgr for XFR TCP connection,
the xfrin_recv_done() will be called when cancelling the dispatch with
ISC_R_CANCELED. This could lead to double detach from the dns_xfrin_t,
one in the xfrin_recv_done() and one in the dns_xfrin_shutdown().
Remove the extra detach from the dns_xfrin_shutdown() and rely on the
dispatch read callback to be always called.
The xfrin_connect_done() had several problems:
- it would not add the server to unreachable table in case of the
failure coming from the dispatch [GL #3989]
- if dns_dispatch_checkperm() disallowed the connection, the xfr would
be left undetached
- if xfrin_send_request() failed to send the request, the xfr would be
left undetached
All of these have been fixed in this commit.
use the ISC_REFCOUNT implementation for dns_zone_attach() and
_detach(). (this applies only to external zone references, not
to dns_zone_iattach() and dns_zone_idetach().)
use dns_zone_ref() where previously a dummy zone object had been
used to increment the reference count.
In selfsigned_dnskey only call dns_dnssec_verify if the signature's
key id matches a revoked key, the trust is pending and the key
matches a trust anchor. Previously named was calling dns_dnssec_verify
unconditionally resulted in busy work.