The dns_zone_catz_enable_db() and dns_zone_catz_disable_db()
functions can race with similar operations in the catz module
because there is no synchronization between the threads.
Add catz functions which use the view's catalog zones' lock
when registering/unregistering the database update notify callback,
and use those functions in the dns_zone module, instead of doing it
directly.
(cherry picked from commit 6f1f5fc307)
in the past there was overlap between the fields used
as resolver fetch options and ADB addrinfo flags. this has
mostly been eliminated; now we can clean up the rest of
it and remove some confusing comments.
(cherry picked from commit 0955cf1af5)
In HTTP/1.0 and HTTP/1.1, RFC 9112 section 9.6 says the last response
in a connection should include a `Connection: close` header, but the
statschannel server omitted it.
In an HTTP/1.0 response, the statschannel server can sometimes send a
`Connection: keep-alive` header when it is about to close the
connection. There are two ways:
If the first request on a connection is keep-alive and the second
request is not, then _both_ responses have `Connection: keep-alive`
but the connection is (correctly) closed after the second response.
If a single request contains
Connection: close
Connection: keep-alive
then RFC 9112 section 9.3 says the keep-alive header is ignored, but
the statschannel sends a spurious keep-alive in its response, though
it correctly closes the connection.
To fix these bugs, make it more clear that the `httpd->flags` are part
of the per-request-response state. The Connection: flags are now
described in terms of the effect they have instead of what causes them
to be set.
(manually picked from commit e18ca83a3b)
The ability to read legacy HMAC-MD5 K* keyfile pairs using algorithm
number 157 was accidentally lost when the algorithm numbers were
consolidated into a single block, in commit
09f7e0607a.
The assumption was that these algorithm numbers were only known
internally, but they were also used in key files. But since HMAC-MD5
got renumbered from 157 to 160, legacy HMAC-MD5 key files no longer
work.
Move HMAC-MD5 back to 157 and GSSAPI back to 160. Add exception for
GSSAPI to list_hmac_algorithms.
(cherry picked from commit 3f93d3f757)
When DNS_FETCHOPT_NOFOLLOW is set DNS_R_DELEGATION needs to be
returned to restart the resolution process rather than converting
it to ISC_R_SUCCESS.
(cherry picked from commit ea11650376)
If we know that the NS RRset for an intermediate label doesn't exist
on cache contents don't query using that name when looking for a
referral.
(cherry picked from commit 80bc0ee075)
There is no harm in aquiring an additional reference to the resolver
after it has started shutting down. All the REQUIRE was doing was
introducing a point of failure when shutting down the server.
If the resolver received a FORMERR response to a request with
an DNS COOKIE option present that echoes the option back, resend
the request without an DNS COOKIE option present.
(cherry picked from commit f3b24ba789)
Tasks can block for a long time, especially when used by tools in
interactive mode. Update the event loop's time to avoid unexpected
errors when processing later events during the same callback.
For example, newly started timers can fire too early, because the
current time was stale. See the note about uv_update_time() in the
https://docs.libuv.org/en/v1.x/timer.html#c.uv_timer_start page.
The isc_result_t enum was to sparse when each library code would skip to
next << 16 as a base. Remove the huge holes in the isc_result_t enum to
make the isc_result tables more compact.
This change required a rewrite how we map dns_rcode_t to isc_result_t
and back, so we don't ever return neither isc_result_t value nor
dns_rcode_t out of defined range.
(cherry picked from commit a8e6c3b8f7)
The mapping functions between isc_result_t and dns_rcode_t could return
both isc_result_t values not defined in the header and dns_rcode_t
values not defined in the header because it blindly maps anything
withing full 12-bits defined for RCODEs to isc_result_t and back.
Refactor the dns_result_{from,to}rcode() functions to always return
valid isc_result_t and dns_rcode_t values by explicitly mapping the
values to each other and returning DNS_R_SERVFAIL (dns_rcode_servfail)
when encountering value out of the defined range.
(cherry picked from commit b53d1d7069)
Report "permission denied" instead of "unexpected error"
when trying to update a zone file on a read-only file system.
(cherry picked from commit dd6acc1cac)
When a catalog zone is updated using AXFR, the zone database is changed,
so it is required to unregister the update notification callback from
the old database, and register it for the new one.
Currently, here is the order of the steps happening in such scenario:
1. The zone.c:zone_startload() function registers the notify callback
on the new database using dns_zone_catz_enable_db()
2. The callback, when called, notices that the new 'db' is different
than 'catz->db', and unregisters the old callback for 'catz->db',
marks that it's unregistered by setting 'catz->db_registered' to
false, then it schedules an update if it isn't already scheduled.
3. The offloaded update process, after completing its job, notices that
'catz->db_registered' is false, and (re)registers the update callback
for the current database it is working on. There is no harm here even
if it was registered also on step 1, and we can't skip it, because
this function can also be called "artificially" during a
reconfiguration, and in that case the registration step is required
here.
A problem arises when before step 1 an update process was already
in a running state, operating on the old database, and finishing its
work only after step 2. As described in step 3, dns__catz_update_cb()
notices that 'catz->db_registered' is false and registers the callback
on the current database it is working on, which, at that state, is
already obsolete and unused by the zone. When it detaches the database,
the function which is responsible for its cleanup (e.g. free_rbtdb())
asserts because there is a registered update notify callback there.
To fix the problem, instead of delaying the (re)registration to step 3,
make sure that the new callback is registered and 'catz->db_registered'
is accordingly marked on step 2.
(cherry picked from commit 998765fea5)
The 'refresh_rrset' variable is used to determine if we can detach from
the client. This can cause a hang on shutdown. To fix this, move setting
of the 'nodetach' variable up to where 'refresh_rrset' is set (in
query_lookup(), and thus not in ns_query_done()), and set it to false
when actually refreshing the RRset, so that when this lookup is
completed, the client will be detached.
When a query was aborted because of the recursion quota being exceeded,
but triggered a stale answer response and a stale data refresh query,
it could cause named to loop back where we are iterating and following
a delegation. Having no good answer in cache, we would fall back to
using serve-stale again, use the stale data, try to refresh the RRset,
and loop back again, without ever terminating until crashing due to
stack overflow.
This happens because in the functions 'query_notfound()' and
'query_delegation_recurse()', we check whether we can fall back to
serving stale data. We shouldn't do so if we are already refreshing
an RRset due to having prioritized stale data in cache.
In other words, we need to add an extra check to 'query_usestale()' to
disallow serving stale data if we are currently refreshing a stale
RRset.
As an additional mitigation to prevent looping, we now use the result
code ISC_R_ALREADYRUNNING rather than ISC_R_FAILURE when a recursion
loop is encountered, and we check for that condition in
'query_usestale()' as well.
When cache memory usage is over the configured cache size (overmem) and
we are cleaning unused entries, it might not be enough to clean just two
entries if the entries to be expired are smaller than the newly added
rdata. This could be abused by an attacker to cause a remote Denial of
Service by possibly running out of the operating system memory.
Currently, the addrdataset() tries to do a single TTL-based cleaning
considering the serve-stale TTL and then optionally moves to overmem
cleaning if we are in that condition. Then the overmem_purge() tries to
do another single TTL based cleaning from the TTL heap and then continue
with LRU-based cleaning up to 2 entries cleaned.
Squash the TTL-cleaning mechanism into single call from addrdataset(),
but ignore the serve-stale TTL if we are currently overmem.
Then instead of having a fixed number of entries to clean, pass the size
of newly added rdatasetheader to the overmem_purge() function and
cleanup at least the size of the newly added data. This prevents the
cache going over the configured memory limit (`max-cache-size`).
Additionally, refactor the overmem_purge() function to reduce for-loop
nesting for readability.
The number of clients per query is calculated using the pending
fetch responses in the list. The dns_resolver_createfetch() function
includes every item in the list when deciding whether the limit is
reached (i.e. fctx->spilled is true). Then, when the limit is reached,
there is another calculation in fctx_sendevents(), when deciding
whether it is needed to increase the limit, but this time the TRYSTALE
responses are not included in the calculation (because of early break
from the loop), and because of that the limit is never increased.
A single client can have more than one associated response/event in the
list (currently max. two), and calculating them as separate "clients"
is unexpected. E.g. if 'stale-answer-enable' is enabled and
'stale-answer-client-timeout' is enabled and is larger than 0, then
each client will have two events, which will effectively halve the
clients-per-query limit.
Fix the dns_resolver_createfetch() function to calculate only the
regular FETCHDONE responses/events.
Change the fctx_sendevents() function to also calculate only FETCHDONE
responses/events. Currently, this second change doesn't have any impact,
because the TRYSTALE events were already skipped, but having the same
condition in both places will help prevent similar bugs in the future
if a new type of response/event is ever added.
(cherry picked from commit 2ae5c4a674)
This commit changes send buffers allocation strategy for stream based
transports. Before that change we would allocate a dynamic buffers
sized at 64Kb even when we do not need that much. That could lead to
high memory usage on server. Now we resize the send buffer to match
the size of the actual data, freeing the memory at the end of the
buffer for being reused later.
(cherry picked from commit d8a5feb556)
There is a lock-order-inversion (potential deadlock) in resolver.c,
because in dns_resolver_shutdown() a resolver bucket lock is locked
while the resolver lock itself is already locked, while in
fctx_sendevents() the resolver lock is locked while a bucket lock
is locked before calling that function in fctx__done_detach().
The resolver lock/unlock in dns_resolver_shutdown() was added back in
the 317e36d47e commit to make sure that
the function is finished before the resolver object is destroyed.
Since res->exiting is atomic, it should be possible to remove the
resolver locking in dns_resolver_shutdown() and add it to the
send_shutdown_events() function which requires it.
Also, since 'res->exiting' is now set while unlocked, the 'INSIST'
in spillattimer_countdown() is wrong, and is removed.
This counter indicates the number of the resolver's spilled
queries due to reaching the clients per query quota.
(cherry picked from commit 04648d7c2f)
We recently fixed a bug where in some cases (when following an
expired CNAME for example), named could return SERVFAIL if the target
record is still valid (see isc-projects/bind9#3678, and
isc-projects/bind9!7096). We fixed this by considering non-stale
RRsets as well during the stale lookup.
However, this triggered a new bug because despite the answer from
cache not being stale, the lookup may be triggered by serve-stale.
If the answer from database is not stale, the fix in
isc-projects/bind9!7096 erroneously skips the serve-stale logic.
Add 'answer_found' checks to the serve-stale logic to fix this issue.
(cherry picked from commit bbd163acf6)
isc_mem_put NULL's the pointer to the memory being freed. The
equality test 'parent->r == node' was accidentally being turned
into a test against NULL.
(cherry picked from commit ac2e0bc3ff)
This commit ensures that access to the TLS context cache within zone
manager is properly synchronised.
Previously there was a possibility for it to get unexpectedly
NULLified for a brief moment by a call to
dns_zonemgr_set_tlsctx_cache() from one thread, while being accessed
from another (e.g. from got_transfer_quota()). This behaviour could
lead to server abort()ing on configuration reload (under very rare
circumstances).
That behaviour has been fixed.
(cherry picked from commit 0b95cf74ff)
when a TCP dispatch times out, we call tcp_recv() with a result
value of ISC_R_TIMEDOUT; this cancels the oldest dispatch
entry in the dispatch's active queue, plus any additional entries
that have waited longer than their configured timeouts. if, at
that point, there were more dispatch entries still on the active
queue, it resumes reading, but until now it failed to restart
the timer.
this has been corrected: we now calculate a new timeout
based on the oldest dispatch entry still remaining. this
requires us to initialize the start time of each dispatch entry
when it's first added to the queue.
in order to ensure that the handling of timed-out requests is
consistent, we now calculate the runtime of each dispatch
entry based on the same value for 'now'.
incidentally also fixed a compile error that turned up when
DNS_DISPATCH_TRACE was turned on.
(cherry picked from commit 0e800467ee)
it was possible to add a TSIG key to more than one TSIG
keyring at a time, and this was in fact happening with the
session key, which was generated once and then added to the
keyrings for each view as it was configured.
this has been corrected and a REQUIRE added to dns_tsigkeyring_add()
to prevent it from happening again.
Seemingly by omission, sockstop netievent used by multi-layer sockets
was not a high priority event, like it should be (similarly to other
socket types).
In particular, that could make BIND stuck on reconfiguration after a
DoH-listener is removed from the configuration.
This commit fixes that.
The intention behind 'isc__nmsocket_stop()' was that the function
sends notifications on every worker thread, making them synchronise on
the barrier, then the initiating thread waits on it, too. This way we
ensure than no other operation will start when we shutting down the
listener.
However, it seems that due to mistake we have been passing the wrong
worker pointer into isc__nm_async_sockstop() from within the context
of an worker thread which has initiated shutting down. While
effectively we have not been using the pointer in this case, it could
cause maintenance issues later. This commit fixes that.
The whole line needs to be read rather than just the token "nameserver"
otherwise the next line in resolv.conf is not properly processed.
(cherry picked from commit 864cd08052)
The zone_resigninc() function does not check the validity of
'zone->db', which can crash named if the zone was unloaded earlier,
for example with "rndc delete".
Check that 'zone->db' is not 'NULL' before attaching to it, like
it is done in zone_sign() and zone_nsec3chain() functions, which
can similarly be called by zone maintenance.
(cherry picked from commit fae0930eb8)
When retrying in the DNS dispatch, the local port would be forgotten on
ISC_R_ADDRINUSE, keep the configured source-port even when retrying.
Additionally, treat ISC_R_NOPERM same as ISC_R_ADDRINUSE.
Closes: #3986
(cherry picked from commit c8e8ccd026)
Stop deliberately breaking const rules by copying file->name into
dirbuf and truncating it there. Handle files located in the root
directory properly. Use unlinkat() from POSIX 200809.
(cherry picked from commit 9fcd42c672)
Removing old timestamp or increment versions of log backup files did
not work when the file is an absolute path: only the entry name was
provided to the file remove function.
The dirname was also bogus, since the file separater was put back too
soon.
Fix these issues to make log file rotation work when the file is
configured to be an absolute path.
(cherry picked from commit 70629d73da)
When OPTOUT was in use we didn't ensure that NSEC3 records
for orphaned empty-non-terminals where removed. Check if
there are orphaned empty-non-terminal NSEC3 even if there
wasn't an NSEC3 RRset to be removed in dns_nsec3_delnsec3.
(cherry picked from commit 27160c137f)
'-T transferinsecs' makes named interpret the max-transfer-time-out,
max-transfer-idle-out, max-transfer-time-in and max-transfer-idle-in
configuration options as seconds instead of minutes.
'-T transferslowly' makes named to sleep for one second for every
xfrout message.
'-T transferstuck' makes named to sleep for one minute for every
xfrout message.
(cherry picked from commit dfaecfd752)
After the dns_xfrin was changed to use network manager, the maximum
global (max-transfer-time-in) and idle (max-transfer-idle-in) times for
incoming transfers were turned inoperational because of missing
implementation.
Restore this functionality by implementing the timers for the incoming
transfers.
(cherry picked from commit d2377f8e04)
This reverts commit 6cdeb5b046 which added
wrapper around all the unit tests that would run the unit test in the
forked process.
This makes any debugging of the unit tests too hard. Futures attempts to
fix#3980 (closed) should add a custom automake test harness (log
driver) that would kill the unit test after configured timeout.
In write_public_key() and write_key_state(), there were left-over checks
for result, that were effectively dead code after the last refactoring.
Remove those.
(cherry picked from commit 766366e934)
Instead of explicitly adding a reference to catzs (catalog zones) when
calling the update callback, attach the catzs to the catz (catalog zone)
object to keep it referenced for the whole time the catz exists.
(cherry picked from commit 2ded876db2)
The xfrin_connect_done() had several problems:
- it would not add the server to unreachable table in case of the
failure coming from the dispatch [GL #3989]
- if dns_dispatch_checkperm() disallowed the connection, the xfr would
be left undetached
- if xfrin_send_request() failed to send the request, the xfr would be
left undetached
All of these have been fixed in this commit.
(cherry picked from commit 536e439c79)
The req_response() function is using 'udpcount' variable to resend
the request 'udpcount' times on timeout even for TCP requests,
which does not make sense, as it would use the same connection.
Add a condition to use the resend logic only for UDP requests.
(cherry picked from commit edcdb881da)
The dns_request_createraw() function, unlike dns_request_create(), when
calculating the UDP timeout value, doesn't check that 'udpretries' is
not zero, and that is the more logical behavior, because the calculation
formula uses division to 'udpretries + 1', where '1' is the first try.
Change the dns_request_create() function to remove the 'udpretries != 0'
condition.
Add a 'REQUIRE(udpretries != UINT_MAX)' check to protect from a division
by zero.
Make the 'request->udpcount' field to represent the number of tries,
instead of the number of retries.
(cherry picked from commit 643abfbba7)
dns_rdata_fromstruct in dns_keytable_deletekey can potentially
fail with ISC_R_NOSPACE. Handle the error condition.
(cherry picked from commit b5df9b8591)
In selfsigned_dnskey only call dns_dnssec_verify if the signature's
key id matches a revoked key, the trust is pending and the key
matches a trust anchor. Previously named was calling dns_dnssec_verify
unconditionally resulted in busy work.
(cherry picked from commit e68fecbdaa)
The CI doesn't provide useful forensics when a system test locks
up. Fork the process and kill it with ABRT if it is still running
after 20 minutes. Pass the exit status to the caller.
(cherry picked from commit 3d5c7cd46c)
The isc_fsaccess API was created to hide the implementation details
between POSIX and Windows APIs. As we are not supporting the Windows
APIs anymore, it's better to drop this API used in the DST part.
Moreover, the isc_fsaccess was setting the permissions in an insecure
manner - it operated on the filename, and not on the file descriptor
which can lead to all kind of attacks if unpriviledged user has read (or
even worse write) access to key directory.
Replace the code that operates on the private keys with code that uses
mkstemp(), fchmod() and atomic rename() at the end, so at no time the
private key files have insecure permissions.
(cherry picked from commit 263d232c79)
As it's impossible to get the current umask without modifying it at the
same time, initialize the current umask at the program start and keep
the loaded value internally. Add isc_os_umask() function to access the
starttime umask.
(cherry picked from commit aca7dd3961)
This commit contains the backport of the behaviour for handling TLS
connect callbacks when wrapping up.
The current behaviour have not caused any problems to us, yet, but we
are changing it to remain on the safer side.
The dns__catz_update_cb() function was earlier updated (see
d2ecff3c4a) to use a separate
'dns_db_t' object ('catz->updb' instead of 'catz->db') to
avoid a race between the 'dns__catz_update_cb()' and
'dns_catz_dbupdate_callback()' functions, but the 'REQUIRE'
check there still checks the validity of the 'catz->db' object.
Fix the omission.
(cherry picked from commit a2817541b3)
These options and zone type were created to address the
SiteFinder controversy, in which certain TLD's redirected queries
rather than returning NXDOMAIN. since TLD's are now DNSSEC-signed,
this is no longer likely to be a problem.
The deprecation message for 'type delegation-only' is issued from
the configuration checker rather than the parser. therefore,
isccfg_check_namedconf() has been modified to take a 'nodeprecate'
parameter to suppress the warning when named-checkconf is used with
the command-line option to ignore warnings on deprecated options (-i).
(cherry picked from commit 2399556bee)
When resquery_response() was called with ISC_R_SHUTTINDOWN, the region
argument would be NULL, but rctx_respinit() would try to pass
region->base and region->len to the isc_buffer_init() leading to
a NULL pointer dereference. Properly handle non-ISC_R_SUCCESS by
ignoring the provided region.
(cherry picked from commit 93259812dd)
This should delay the catalog zone from being destroyed during
shutdown, if the update process is still running.
Doing this should not introduce significant shutdown delays, as
the update function constantly checks the 'shuttingdown' flag
and cancels the process if it is set.
(cherry picked from commit dc2b8bb1c9)
This commit ensures that 'sock->tls.pending_req' is not getting
nullified during TLS connection timeout callback as it prevents the
connection callback being called when connecting was not successful.
We expect 'isc__nm_failed_connect_cb() to be called from
'isc__nm_tlsdns_shutdown()' when establishing connections was
successful, but with 'sock->tls.pending_req' nullified that will not
happen.
The code removed most likely was required in older iterations of the
NM, but to me it seems that now it does only harm. One of the well
know pronounced effects is leading to irrecoverable zone transfer
hangs via TLS.
If the zone already has existing NSEC/NSEC3 chains then zone_sign
needs to continue to use them. If there are no chains then use
kasp setting otherwise generate an NSEC chain.
(cherry picked from commit 4b55201459)
dns_dnssec_updatekeys's 'report' could be called with invalid arguments
which the compiler should be be able to detect.
(cherry picked from commit 7a0a2fc3e4)
As it is done in the RPZ module, use 'db' and 'dbversion' for the
database we are going to update to, and 'updb' and 'updbversion' for
the database we are working on.
Doing this should avoid a race between the 'dns__catz_update_cb()' and
'dns_catz_dbupdate_callback()' functions.
(cherry picked from commit d2ecff3c4a)
This reverts commit a719647023.
The commit introduced a data race, because dns_db_endload() is called
after unfreezing the zone.
(not cherry picked from commit 593dea871a)
There are leftovers from the previous refactoring effort, which left
some function declarations and comments in the header file unchanged.
Finish the renaming.
(cherry picked from commit 580ef2e18f)
When detaching from the previous version of the database, make sure
that the update-notify callback is unregistered, otherwise there is
an INSIST check which can generate an assertion failure in free_rbtdb(),
which checks that there are no outstanding update listeners in the list.
There is a similar code already in place for RPZ.
(cherry picked from commit cf79692a66)
The zone_postload() function can fail and unregister the callbacks.
Call dns_db_endload() only after calling zone_postload() to make
sure that the registered update-notify callbacks are not called
when the zone loading has failed during zone_postload().
Also, don't ignore the return value of zone_postload().
(cherry picked from commit ed268b46f1)
The dbiterator read-locks the whole zone and it stayed locked during
whole processing time when catz is being read. Pause the iterator, so
the updates to catz zone are not being blocked while processing the catz
update.
(cherry picked from commit 4e7187601f)
Instead of holding the catzs->lock the whole time we process the catz
update, only hold it for hash table lookup and then release it. This
should unblock any other threads that might be processing updates to
catzs triggered by extra incoming transfer.
(cherry picked from commit b1cd4a066a)
Offload catalog zone processing so that the network manager threads
are not interrupted by a large catalog zone update.
Introduce a new 'updaterunning' state alongside with 'updatepending',
like it is done in the RPZ module.
Note that the dns__catz_update_cb() function currently holds the
catzs->lock during the whole process, which is far from being optimal,
but the issue is going to be addressed separately.
(cherry picked from commit 0b96c9234f)
This change should make sure that catalog zone update processing
doesn't happen when the catalog zone is being shut down. This
should help avoid races when offloading the catalog zone updates
in the follow-up commit.
(cherry picked from commit 246b7084d6)
* Change 'dns_catz_new_zones()' function's prototype (the order of the
arguments) to synchronize it with the similar function in rpz.c.
* Rename 'refs' to 'references' in preparation of ISC_REFCOUNT_*
macros usage for reference tracking.
* Unify dns_catz_zone_t naming to catz, and dns_catz_zones_t naming to
catzs, following the logic of similar changes in rpz.c.
* Use C compound literals for structure initialization.
* Synchronize the "new zone version came too soon" log message with the
one in rpz.c.
* Use more of 'sizeof(*ptr)' style instead of the 'sizeof(type_t)' style
expressions when allocating or freeing memory for 'ptr'.
(cherry picked from commit 8cb79fec9d)
The kasp pointers in dns_zone_t should consistently be changed by
dns_kasp_attach and dns_kasp_detach so the usage is balanced.
(cherry picked from commit b41882cc75)
this function was just a front-end for gethostname(). it was
needed when we supported windows, which has a different function
for looking up the hostname; it's not needed any longer.
(cherry picked from commit 197334464e)
A dns_rpz_unref_rpzs() call is missing when taking the 'goto unlock;'
path on shutdown, in order to compensate for the earlier
dns_rpz_ref_rpzs() call.
Move the dns_rpz_ref_rpzs() call after the shutdown check.
(cherry picked from commit afbe63565f)
When shutting down, or when dns_dbiterator_current() fails, 'node'
shouldn't be detached, because it is NULL at that point.
(cherry picked from commit d36728e42f)
When shutting down, the cleanup path should not try to destroy
'newnodes', because it is NULL at that point.
Introduce another label for the "shuttingdown" scenario.
(cherry picked from commit 975d16230b)
The dns_rpz_zones structure was using .refs and .irefs for strong and
weak reference counting. Rewrite the unit to use just a single
reference counting + shutdown sequence (dns_rpz_destroy_rpzs) that must
be called by the creator of the dns_rpz_zones_t object. Remove the
reference counting from the dns_rpz_zone structure as it is not needed
because the zone objects are fully embedded into the dns_rpz_zones
structure and dns_rpz_zones_t object must never be destroyed before all
dns_rpz_zone_t objects.
The dns_rps_zones_t reference counting uses the new ISC_REFCOUNT_TRACE
capability - enable by defining DNS_RPZ_TRACE in the dns/rpz.h header.
Additionally, add magic numbers to the dns_rpz_zone and dns_rpz_zones
structures.
(cherry picked from commit 77659e7392)
When there are multiple managed trust anchors we need to know the
name of the trust anchor that is failing. Extend the error message
to include the trust anchor name.
(cherry picked from commit fb7b7ac495)
Previously, the RPZ updates ran quantized on the main nm_worker loops.
As the quantum was set to 1024, this might lead to service
interruptions when large RPZ update was processed.
Change the RPZ update process to run as the offloaded work. The update
and cleanup loops were refactored to do as little locking of the
maintenance lock as possible for the shortest periods of time and the db
iterator is being paused for every iteration, so we don't hold the rbtdb
tree lock for prolonged periods of time.
(cherry picked from commit f106d0ed2b)
Previously dns_rpz_add() were passed dns_rpz_zones_t and index to .zones
array. Because we actually attach to dns_rpz_zone_t, we should be using
the local pointer instead of passing the index and "finding" the
dns_rpz_zone_t again.
Additionally, dns_rpz_add() and dns_rpz_delete() were used only inside
rpz.c, so make them static.
(cherry picked from commit b6e885c97f)
Do a general cleanup of lib/dns/rpz.c style:
* Removed deprecated and unused functions
* Unified dns_rpz_zone_t naming to rpz
* Unified dns_rpz_zones_t naming to rpzs
* Add and use rpz_attach() and rpz_attach_rpzs() functions
* Shuffled variables to be more local (cppcheck cleanup)
(cherry picked from commit 840179a247)
libuv support for receiving multiple UDP messages in a single system
call (recvmmsg()) has been tweaked several times between libuv versions
1.35.0 and 1.40.0. Mixing and matching libuv versions within that span
may lead to assertion failures and is therefore considered harmful, so
try to limit potential damage be preventing users from mixing libuv
versions with distinct sets of recvmmsg()-related flags.
(cherry picked from commit 735d09bffe)
The implementation of UDP recvmmsg in libuv 1.35 and 1.36 is
incomplete and could cause assertion failure under certain
circumstances.
Modify the configure and runtime checks to report a fatal error when
trying to compile or run with the affected versions.
(cherry picked from commit 251f411fc3)
isc_bind9 was a global bool used to indicate whether the library
was being used internally by BIND or by an external caller. external
use is no longer supported, but the variable was retained for use
by dyndb, which needed it only when being built without libtool.
building without libtool is *also* no longer supported, so the variable
can go away.
(cherry picked from commit 935879ed11)
Instead of using an extra rarely-used paramater to dns_clientinfo_init()
to set ECS information for a client, this commit adds a function
dns_clientinfo_setecs() which can be called only when ECS is needed.
(cherry picked from commit ff3fdaa424)
when generating a key, if a DH key already existed for the same
name, a spurious warning message was generated saying "bad key
type". this is fixed.
(cherry picked from commit 82503bec99)
the optional 'port' option, when used with notify-source,
transfer-source, etc, is used to set up UDP dispatches with a
particular source port, but when the actual UDP connection was
established the port would be overridden with a random one. this
has been fixed.
(configuring source ports is deprecated in 9.20 and slated for
removal in 9.22, but should still work correctly until then.)
(cherry picked from commit 4d50c912ba)
it was possible for a managed trust anchor needing to send a key
refresh query to be unable to do so because an authoritative zone
was not yet loaded. this has been corrected by delaying the
synchronization of managed-keys zones until after all zones are
loaded.
(cherry-picked from commit bafbbd2465)
Deprecate the use of "port" when configuring query-source(-v6),
transfer-source(-v6), notify-source(-v6), parental-source(-v6),
etc. Also deprecate use-{v4,v6}-udp-ports and avoid-{v4,v6}udp-ports.
(cherry picked from commit 470ccbc8ed)
Commit 6f998bbe51 added a new parameter,
'options', to the prototype of the 'allrdatasets' function pointer in
struct dns_dbmethods. Handle this new parameter accordingly in
rpsdb_allrdatasets().
(cherry picked from commit f3def4e4ed)
The ADB hashmaps are stored in extra memory contexts, so the hash
tables are excluded from the overmem accounting. The new memory
context was unnamed, give it a proper name.
Same thing has happened with extra memory context used for named
global log context - give the extra memory context a proper name.
(cherry picked from commit 3cda9f9f14)
Set the DS state after issuing 'rndc dnssec -checkds'. If the DS
was published, it should go in RUMOURED state, regardless whether it
is already safe to do so according to the state machine.
Leaving it in HIDDEN (or if it was magically already in OMNIPRESENT or
UNRETENTIVE) would allow for easy shoot in the foot situations.
Similar, if the DS was withdrawn, the state should be set to
UNRETENTIVE. Leaving it in OMNIPRESENT (or RUMOURED/HIDDEN)
would also allow for easy shoot in the foot situations.
(cherry picked from commit ee42f66fbe)
This commit ensures that BIND and supplementary tools still can be
built on newer versions of DragonFly BSD. It used to be the case, but
somewhere between versions 6.2 and 6.4 the OS developers rearranged
headers and moved some function definitions around.
Before that the fact that it worked was more like a coincidence, this
time we, at least, looked at the related man pages included with the
OS.
No in depth testing has been done on this OS as we do not really
support this platform - so it is more like a goodwill act. We can,
however, use this platform for testing purposes, too. Also, we know
that the OS users do use BIND, as it is included in its ports
directory.
Building with './configure' and './configure --without-jemalloc' have
been fixed and are known to work at the time the commit is made.
(cherry picked from commit 942569a1bb)
It is allowed to point parental-agents to a resolver. Therefore, the
RD bit should be set on requests.
Upon receiving a DS response, ensure that the message has either the
AA or the RA bit set.
(cherry picked from commit e34722ed43)
The write node lock needs to be held when setting node->wild in
add_wildcard_magic except when being called from loading_addrdataset
which is used to load the zone without locking during its initial
load.
(cherry picked from commit 81c24b8da2)
Return 'isc_result_t' type value instead of 'bool' to indicate
the actual failure. Rename the function to something not suggesting
a boolean type result. Make changes in the places where the API
function is being used to check for the result code instead of
a boolean value.
(cherry picked from commit 41dc48bfd7)
Detaching the views in the zone_shutdown() could lead to
lock-order-inversion between adb->namelocks[bucket], adb->lock,
view->lock and zone->lock. Detach the views outside of the section that
zone-locked.
(cherry picked from commit 978a0ef84c)
If the OpenSSL SHA1_{Init,Update,Final} API is still available, use it.
The API has been deprecated in OpenSSL 3.0, but it is significantly
faster than EVP_MD API, so make an exception here and keep using it
until we can't.
(cherry picked from commit 25db8d0103)
Instead of going through another layer, use OpenSSL EVP_MD API directly
in the isc_iterated_hash() implementation. This shaves off couple of
microseconds in the microbenchmark.
(cherry picked from commit 36654df732)
as far as I can determine the order of operations is not important.
*** CID 351372: Concurrent data access violations (ATOMICITY)
/lib/isc/timer.c: 227 in timer_purge()
221 LOCK(&timer->lock);
222 if (!purged) {
223 /*
224 * The event has already been executed, but not
225 * yet destroyed.
226 */
>>> CID 351372: Concurrent data access violations (ATOMICITY)
>>> Using an unreliable value of "event" inside the second locked section. If the data that "event" depends on was changed by another thread, this use might be incorrect.
227 timerevent_unlink(timer, event);
228 }
229 }
230 }
231
232 void
(cherry picked from commit 98718b3b4b)
The reference counting and isc_timer_attach()/isc_timer_detach()
semantic are actually misleading because it cannot be used under normal
conditions. The usual conditions under which is timer used uses the
object where timer is used as argument to the "timer" itself. This
means that when the caller is using `isc_timer_detach()` it needs the
timer to stop and the isc_timer_detach() does that only if this would be
the last reference. Unfortunately, this also means that if the timer is
attached elsewhere and the timer is fired it will most likely be
use-after-free, because the object used in the timer no longer exists.
Remove the reference counting from the isc_timer unit, remove
isc_timer_attach() function and rename isc_timer_detach() to
isc_timer_destroy() to better reflect how the API needs to be used.
The only caveat is that the already executed event must be destroyed
before the isc_timer_destroy() is called because the timer is no longet
attached to .ev_destroy_arg.
(cherry picked from commit ae01ec2823)
The isc_task_purge() and isc_task_purgerange() were now unused, so sweep
the task.c file. Additionally remove unused ISC_EVENTATTR_NOPURGE event
attribute.
(cherry picked from commit c17eee034b)
When we are loading the zones, set the quantum to UINT_MAX, which makes
task_run process all tasks at once. After the zone loading is finished
the quantum will be dropped to 1 to not block server when we are loading
new zones after reconfiguration.
(cherry picked from commit 87c4c24cde)
Add isc_task_setquantum() function that modifies quantum for the future
isc_task_run() invocations.
NOTE: The current isc_task_run() caches the task->quantum into a local
variable and therefore the current event loop is not affected by any
quantum change.
(cherry picked from commit 15ea6f002f)
Instead of searching for the events to purge, keep the list of scheduled
events on the timer list and purge the events that we have scheduled.
(cherry picked from commit 3f8024b4a2f12fcd28a9dd813b6f1f3f11d506f2)
The isc_task_purgerange() was walking through all events on the task to
find a matching task. Instead use the ISC_LINK_LINKED to find whether
the event is active.
Cleanup the related isc_task_unsend() and isc_task_unsendrange()
functions that were not used anywhere.
(cherry picked from commit 17aed2f895)
The .view (and possibly .prev_view) would be kept attached to the
removed zone until the zone is fully removed from the memory in
zone_free(). If this process is delayed because server is busy
something else like doing constant `rndc reconfig`, it could take
seconds to detach the view, possibly keeping multiple dead views in the
memory. This could quickly lead to a massive memory bloat.
Release the views early in the zone_shutdown() call, and don't wait
until the zone is freed.
(cherry picked from commit 13bb821280)
During XoT it is important to check for "dot" ALPN tag to be
negotiated (according to the RFC 9103). We were doing that, however, the
situation was not handled properly, leading to non-cancelled zone
transfers that would crash (abort()) BIND on shutdown.
In this particular case 'result' might equal 'ISC_R_SUCCESS'. When
this is the case, the part of the code supposed to handle failures
will not cancel the zone transfer.
This situation cannot happen when BIND is a secondary of other BIND
instance. Only primaries following the RFC not closely enough could
trigger such a behaviour.
(cherry picked from commit 34a1aab1cb)
Include isc_rwlocktype_t type definition in zt.h
See merge request isc-projects/bind9!7376
(cherry picked from commit d7bcdf8bd6)
395d6fca Include isc_rwlocktype_t type definition in zt.h
Although 'dns_fetch_t' fetch can have two associated events, one for
each of 'DNS_EVENT_FETCHDONE' and 'DNS_EVENT_TRYSTALE' types, the
dns_resolver_cancelfetch() function is designed in a way that it
expects only one existing event, which it must cancel, and when it
happens so that 'stale-answer-client-timeout' is enabled and there
are two events, only one of them is canceled, and it results in an
assertion in dns_resolver_destroyfetch(), when it finds a dangling
event.
Change the logic of dns_resolver_cancelfetch() function so that it
cancels both the events (if they exist), and in the right order.
(cherry picked from commit ec2098ca35)
dns_db_findext() asserts if RRSIG is passed to it and
query_lookup_stale() failed to map RRSIG to ANY to prevent this. To
avoid cases like this in the future, move the mapping of SIG and RRSIG
to ANY for qctx->type to qctx_init().
(cherry picked from commit 56eae06418)
check allow-update, update-policy, and allow-update-forwarding before
consuming quota slots, so that unauthorized clients can't fill the
quota.
(this moves the access check before the prerequisite check, which
violates the precise wording of RFC 2136. however, RFC co-author Paul
Vixie has stated that the RFC is mistaken on this point; it should have
said that access checking must happen *no later than* the completion of
prerequisite checks, not that it must happen exactly then.)
(cherry picked from commit 964f559edb)
limit the number of simultaneous DNS UPDATE events that can be
processed by adding a quota for update and update forwarding.
this quota currently, arbitrarily, defaults to 100.
also add a statistics counter to record when the update quota
has been exceeded.
(cherry picked from commit 7c47254a14)
Previously, an incremental hash table resizing was implemented for the
dns_rbt_t hash table implementation. Using that as a base, also
implement the incremental hash table resizing also for isc_ht API
hashtables:
1. During the resize, allocate the new hash table, but keep the old
table unchanged.
2. In each lookup, delete, or iterator operation, check both tables.
3. Perform insertion operations only in the new table.
4. At each insertion also move <r> elements from the old table to
the new table.
5. When all elements are removed from the old table, deallocate it.
To ensure that the old table is completely copied over before the new
table itself needs to be enlarged, it is necessary to increase the
size of the table by a factor of at least (<r> + 1)/<r> during resizing.
In our implementation <r> is equal to 1.
The downside of this approach is that the old table and the new table
could stay in memory for longer when there are no new insertions into
the hash table for prolonged periods of time as the incremental
rehashing happens only during the insertions.
(cherry picked from commit e42cb1f198)
As shown in the previous commit, using sizeof(type_t) is a little
bit more error-prone when copy-pasting code, so extracting the
size information from the pointer which is being dealt with seems
like a better alternative.
(cherry picked from commit cf4003fa58)
Free 'sizeof(dns_forwarder_t)' bytes of memory instead of
'sizeof(dns_sockaddr_t)' bytes, because `fwd` is a pointer
to a 'dns_forwarder_t' type structure.
(cherry picked from commit 0cc1b06d98)
The dns_zonemgr_releasezone() function makes a decision to destroy
'zmgr' (based on its references count, after decreasing it) inside
a lock, and then destroys the object outside of the lock.
This causes a race with dns_zonemgr_detach(), which could destroy
the object in the meantime.
Change dns_zonemgr_releasezone() to detach from 'zmgr' and destroy
the object (if needed) using dns_zonemgr_detach(), outside of the
lock.
(cherry picked from commit c1fc212253)
Prefer the pthread_barrier implementation on platforms where it is
available over uv_barrier implementation. This also solves the problem
with thread sanitizer builds on macOS that doesn't have pthread barrier.
(cherry picked from commit d07c4a98da)
This reverts commit f17f5e831b that made
following change:
> The TLSDNS transport was not honouring the single read callback for
> TLSDNS client. It would call the read callbacks repeatedly in case the
> single TLS read would result in multiple DNS messages in the decoded
> buffer.
Turns out that this change broke XoT, so we are reverting the change
until we figure out a proper fix that will keep the design promise and
not break XoT at the same time.
The ns_client_aclchecksilent is used to check multiple ACLs before
the decision is made that a query is denied. It is also used to
determine if recursion is available. In those cases we should not
set the extended DNS error "Prohibited".
(cherry picked from commit 798c8f57d4)
Arthimetic on NULL pointers is undefined. Avoid arithmetic operations
when 'in' is NULL and require 'in' to be non-NULL if 'inlen' is not zero.
(cherry picked from commit 349c23dbb7)
DSCP has not been fully working since the network manager was
introduced in 9.16, and has been completely broken since 9.18.
This seems to have caused very few difficulties for anyone,
so we have now marked it as obsolete and removed the
implementation.
To ensure that old config files don't fail, the code to parse
dscp key-value pairs is still present, but a warning is logged
that the feature is obsolete and should not be used. Nothing is
done with configured values, and there is no longer any
range checking.
(cherry picked from commit 916ea26ead)
Commit 9ffb4a7ba1 causes Clang Static
Analyzer to flag a potential NULL dereference in query_nxdomain():
query.c:9394:26: warning: Dereference of null pointer [core.NullDereference]
if (!qctx->nxrewrite || qctx->rpz_st->m.rpz->addsoa) {
^~~~~~~~~~~~~~~~~~~
1 warning generated.
The warning above is for qctx->rpz_st potentially being a NULL pointer
when query_nxdomain() is called from query_resume(). This is a false
positive because none of the database lookup result codes currently
causing query_nxdomain() to be called (DNS_R_EMPTYWILD, DNS_R_NXDOMAIN)
can be returned by a database lookup following a recursive resolution
attempt. Add a NULL check nevertheless in order to future-proof the
code and silence Clang Static Analyzer.
(cherry picked from commit 07592d1315)
(cherry picked from commit a4547a1093)
With 'stale-answer-enable yes;' and 'stale-answer-client-timeout off;',
consider the following situation:
A CNAME record and its target record are in the cache, then the CNAME
record expires, but the target record is still valid.
When a new query for the CNAME record arrives, and the query fails,
the stale record is used, and then the query "restarts" to follow
the CNAME target. The problem is that the query's multiple stale
options (like DNS_DBFIND_STALEOK) are not reset, so 'query_lookup()'
treats the restarted query as a lookup following a failed lookup,
and returns a SERVFAIL answer when there is no stale data found in the
cache, even if there is valid non-stale data there available.
With this change, query_lookup() now considers non-stale data in the
cache in the first place, and returns it if it is available.
(cherry picked from commit 91a1a8efc5)
Previously, dns_dispatch_gettcp() could pick a TCP connection created by
different thread - this breaks our contractual promise to DNS dispatch
by using the TCP connection on a different thread than it was created.
Add .tid member to the dns_dispatch_t struct and skip the dispatches
from other threads when looking up a TCP dispatch that we can reuse in
dns_request.
NOTE: This is going to be properly refactored, but this change could be
also backported to 9.18 for better stability and thread-affinity.
(cherry picked from commit 1a999353cd)
If the isc_task_create_bound() fails in the middle of buckets
initialization - the most common case would be shutdown initialized
during reload, not all tasks would be initialized, but the cleanup
code would try to cleanup all buckets.
Make sure that we cleanup only the initialized buckets by setting
ntasks to the number of already initialized tasks on the error path.
The 'localaddr' pointer can be NULL, which causes an assertion failure.
Use '&disp->local' instead when printing a debug log message.
(cherry picked from commit 41ca9d419e)
Additionally to renaming, it changes the function definition so that
it accepts a pointer to pointer instead of returning a pointer to the
new object.
It is mostly done to make it in line with other functions in the
module.
(cherry picked from commit 7962e7f575)
Additionally to renaming, it changes the function definition so that
it accepts a pointer to pointer instead of returning a pointer to the
new object.
It is mostly done to make it in line with other functions in the
module.
(cherry picked from commit f102df96b8)
Normally, when a 'resquery_t' object is created in fctx_query(),
we call dns_adb_beginudpfetch() (which increases the ADB quota)
only if it's a UDP query. Then, in fctx_cancelquery(), we call
dns_adb_endudpfetch() to decreases back the ADB quota, again only
if it's a UDP query.
The problem is that a UDP query can become a TCP query, preventing
the quota from adjusting back in fctx_cancelquery() later.
Call dns_adb_beginudpfetch() also when switching the query type
from UDP to TCP.
(cherry picked from commit 53afe1f978)
The dns_request code is very sensitive about calling the connected and
deadlocks when the timing is "right" in several places. Move the call
to the connected callback to the (udp|tcp)_connected() functions, so
they are called asynchronously instead of directly from
the (udp|tcp)_dispentry_cancel() functions.
(cherry picked from commit 9dd8deaf01)
The TCP dispatches are removed from the dispatchmgr->list in the
dispatch_destroy() and there's a brief period of time where
dns_dispatch_gettcp() can find a dispatch in connected state that's
being destroyed.
Set the dispatch state to DNS_DISPATCHSTATE_NONE in the TCP connection
callback if there are no responses waiting, and ignore TCP dispatches
with zero references in dns_dispatch_gettcp().
(cherry picked from commit 3fac4ca57e)
In tcp_connected() a typo has turned a DbC check into an assignment
breaking the state machine and making the dns_dispatch_gettcp() try to
attach to dispatch in process of destruction.
The TCP dispatches in DNS_DISPATCHSTATE_NONE could be either very
fresh or those could be dispatches that failed connecting to the
destination. Ignore them when trying to connect to an existing
TCP dispatch via dns_dispatch_gettcp().
The dispatches are not thread-bound, and used freely between various
threads (see the dns_resolver and dns_request units for details).
This refactoring make sure that all non-const dns_dispatch_t and
dns_dispentry_t members are accessed under a lock, and both object now
track their internal state (NONE, CONNECTING, CONNECTED, CANCELED)
instead of guessing the state from the state of various struct members.
During the refactoring, the artificial limit DNS_DISPATCH_SOCKSQUOTA on
UDP sockets per dispatch was removed as the limiting needs to happen and
happens on in dns_resolver and limiting the number of UDP sockets
artificially in dispatch could lead to unpredictable behaviour in case
one dispatch has the limit exhausted by others are idle.
The TCP artificial limit of DNS_DISPATCH_MAXREQUESTS makes even less
sense as the TCP connections are only reused in the dns_request API
that's not a heavy user of the outgoing connections.
As a side note, the fact that UDP and TCP dispatch pretends to be same
thing, but in fact the connected UDP is handled from dns_dispentry_t and
dns_dispatch_t acts as a broker, but connected TCP is handled from
dns_dispatch_t and dns_dispatchmgr_t acts as a broker doesn't really
help the clarity of this unit.
This refactoring kept to API almost same - only dns_dispatch_cancel()
and dns_dispatch_done() were merged into dns_dispatch_done() as we need
to cancel active netmgr handles in any case to not leave dangling
connections around. The functions handling UDP and TCP have been mostly
split to their matching counterparts and the dns_dispatch_<function>
functions are now thing wrappers that call <udp|tcp>_dispatch_<function>
based on the socket type.
More debugging-level logging was added to the unit to accomodate for
this fact.
(cherry picked from commit 6f317f27ea)
Backport macros that can be used to implement generic attach, detach,
ref, and unref functions, so they don't have to be repeated over and
over in each unit that uses reference counting.
The overmem cleaning in ADB could become overzealous and clean fresh ADB
names and entries. Add a safety check to not clean any ADB names and
entries that are below ADB_CACHE_MINIMUM threshold.
(cherry picked from commit 0b661b6f95)
The ADB overmem accounting would include the memory used by hashtables
thus vastly reducing the space that can be used for ADB names and
entries when the hashtables would grow. Create own memory context for
the ADB names and entries hash tables.
(cherry picked from commit 59dee0b078)
The dns_catz_update_from_db() function prints serial number as a signed
number (with "%d" in the format string), but the `vers` variable's type
is 'uint32_t'. This breaks serials bigger than 2^31.
Use PRIu32 instead of "d" in the format string.
(cherry picked from commit 72b1760ea6)
This commit fixes TLS session resumption via session IDs when
client certificates are used. To do so it makes sure that session ID
contexts are set within server TLS contexts. See OpenSSL documentation
for 'SSL_CTX_set_session_id_context()', the "Warnings" section.
(cherry picked from commit 837fef78b1)
Instead of relying on hash table search when using the keys, implement a
proper reference counting in dns_keyfileio_t objects, and attach/detach
the objects to the zone.
(cherry picked from commit 79115a0c3b)
Due to off-by-one error in zonemgr_keymgmt_delete, unused key file IO
lock objects were never freed and they were kept until the server
shutdown. Adjust the returned value by -1 to accomodate the fact that
the atomic_fetch_*() functions return the value before the operation and
not current value after the operation.
(cherry picked from commit fb1acd6736)
Zero TTL handling does not need to be different for 'rdatasetiter_first'
and 'rdatasetiter_next' and it interacts badly with 'bind_rdatadataset'
which makes different determinations.
(cherry picked from commit 1a39328feb)
'DNS_DB_STALEOK' returns stale rdatasets as well as current rdatasets.
'DNS_DB_EXPIREDOK' returns expired rdatasets as well as current
rdatasets. This option is currently only set when DNS_DB_STALEOK is
also set.
(cherry picked from commit 85048ddeee)
Add an options parameter to control what rdatasets are returned when
iteratating over the node. Specific modes will be added later.
(cherry picked from commit 7695c36a5d)
Send the ns_query_cancel() on the recursing clients when we initiate the
named shutdown for faster shutdown.
When we are shutting down the resolver, we cancel all the outstanding
fetches, and the ISC_R_CANCEL events doesn't propagate to the ns_client
callback.
In the future, the better solution how to fix this would be to look at
the shutdown paths and let them all propagate from bottom (loopmgr) to
top (f.e. ns_client).
(cherry picked from commit d861d403bb9a7912e29a06aba6caf6d502839f1b)
when serve-stale is enabled, NXDOMAIN cache entries are no longer
preserved after the normal negative cache TTL, in order to reduce
unnecessary cache memory consumption.
(cherry picked from commit f1485ca145)
dns_db_updatenotify_unregister needed to be called earlier to ensure
that listener->onupdate_arg always points to a valid object. The
existing lazy cleanup in rbtdb_free did not ensure that.
(cherry picked from commit 35839e91d8)
Duplicate dns_db_updatenotify_register registrations need to be
suppressed to ensure that dns_db_updatenotify_unregister is successful.
(cherry picked from commit f13e71e551)
Commit 9ffb4a7ba1 causes Clang Static
Analyzer to flag a potential NULL dereference in query_nxdomain():
query.c:9394:26: warning: Dereference of null pointer [core.NullDereference]
if (!qctx->nxrewrite || qctx->rpz_st->m.rpz->addsoa) {
^~~~~~~~~~~~~~~~~~~
1 warning generated.
The warning above is for qctx->rpz_st potentially being a NULL pointer
when query_nxdomain() is called from query_resume(). This is a false
positive because none of the database lookup result codes currently
causing query_nxdomain() to be called (DNS_R_EMPTYWILD, DNS_R_NXDOMAIN)
can be returned by a database lookup following a recursive resolution
attempt. Add a NULL check nevertheless in order to future-proof the
code and silence Clang Static Analyzer.
(cherry picked from commit 07592d1315)
With 'stale-answer-enable yes;' and 'stale-answer-client-timeout off;',
consider the following situation:
A CNAME record and its target record are in the cache, then the CNAME
record expires, but the target record is still valid.
When a new query for the CNAME record arrives, and the query fails,
the stale record is used, and then the query "restarts" to follow
the CNAME target. The problem is that the query's multiple stale
options (like DNS_DBFIND_STALEOK) are not reset, so 'query_lookup()'
treats the restarted query as a lookup following a failed lookup,
and returns a SERVFAIL answer when there is no stale data found in the
cache, even if there is valid non-stale data there available.
With this change, query_lookup() now considers non-stale data in the
cache in the first place, and returns it if it is available.
(cherry picked from commit 86a80e723f)
This commit ensures that send callbacks are always called from within
the context of its worker thread even in the case of
shuttigdown/inactive socket, just like TCP transport does and with
which TLS attempts to be as compatible as possible.
(cherry picked from commit 2bfc079946)
This commit changes ISC_R_NOTCONNECTED error code to ISC_R_CANCELLED
when attempting to start reading data on the shutting down socket in
order to make its behaviour compatible with that of TCP and not break
the common code in the unit tests.
(cherry picked from commit ef659365ce)
This commit adds a check if 'sock->recv_cb' might have been nullified
during the call to 'sock->recv_cb'. That could happen, e.g. by an
indirect call to 'isc_nmhandle_close()' from within the callback when
wrapping up.
In this case, let's close the TLS connection.
(cherry picked from commit bed5e2bb08)
The TLSDNS transport was not honouring the single read callback for
TLSDNS client. It would call the read callbacks repeatedly in case the
single TLS read would result in multiple DNS messages in the decoded
buffer.
(cherry picked from commit e3c628d562)
The "final reference detached" message was meant to be DEBUG(1), but was
instead kept at INFO level. Move it to the DEBUG(1) logging level, so
it's not printed under normal operations.
(cherry picked from commit 1816244725)
Handle the situation in the read and send callbacks when the
httpd->readhandle has been already detached; f.e. from the previous
iteration of the read callback.
Don't detach the httpd->readcallback from the isc_httpdmgr_shutdown()
and the send callback, but rather call isc_nm_cancelread() to shutdown
the http request from the read callback.
The various factors like NS_PER_MS are now defined in a single place
and the names are no longer inconsistent. I chose the _PER_SEC names
rather than _PER_S because it is slightly more clear in isolation;
but the smaller units are always NS, US, and MS.
(cherry picked from commit 00307fe318)
Extract the tlss values if present from the ipkeylist entry and add
the resulting tls setting to the constructed configuration for the
primary.
When comparing catalog zone entries for reuse also check the
masters.tlss values for equality.
(cherry picked from commit 65f2512315)
It was possible to set operating system limits (RLIMIT_DATA,
RLIMIT_STACK, RLIMIT_CORE and RLIMIT_NOFILE) from named.conf. It's
better to leave these untouched as setting these is responsibility of
the operating system and/or supervisor.
Deprecate the configuration options and remove them in future BIND 9
release.
(cherry picked from commit 379929e052)
The aim is to do less work per byte:
* Check the bounds for each label, instead of checking the
bounds for each character.
* Instead of copying one character at a time from the wire to
the name, copy entire runs of sequential labels using memmove()
to make the most of its fast loop.
* To remember where the name ends, we only need to set the end
marker when we see a compression pointer or when we reach the
root label. There is no need to check if we jumped back and
conditionally update the counter for every character.
* To parse a compression pointer, we no longer take a diversion
around the outer loop in between reading the upper byte of the
pointer and the lower byte.
* The parser state machine is now implicit in the instruction
pointer, instead of being an explicit variable. Similarly,
when we reach the root label we break directly out of the loop
instead of setting a second state machine variable.
* DNS_NAME_DOWNCASE is never used with dns_name_fromwire() so
that option is no longer supported.
I have removed this comment which dated from January 1999 when
dns_name_fromwire() was first introduced:
/*
* Note: The following code is not optimized for speed, but
* rather for correctness. Speed will be addressed in the future.
*/
No functional change, apart from removing support for the unused
DNS_NAME_DOWNCASE option. The new code is about 2x faster than the
old code: best case 11x faster, worst case 1.4x faster.
When using dual-stack-servers the covering namespace to check whether
answers are in scope or not should be fctx->domain. To do this we need
to be able to distingish forwarding due to forwarders clauses and
dual-stack-servers. A new flag FCTX_ADDRINFO_DUALSTACK has been added
to signal this.
(cherry picked from commit dfbffd77f9)
There were a number of places where the zone table should have been
locked, but wasn't, when dns_zt_apply was called.
Added a isc_rwlocktype_t type parameter to dns_zt_apply and adjusted
all calls to using it. Removed locks in callers.
(cherry picked from commit f053d5b414)
Firefox 90+ apparently sends more than 10 headers, so we need to bump
the number to some higher number. Bump it to 100 just to be on a save
side, this is for internal use only anyway.
(cherry picked from commit e4654d1a6a)
The zone_refreshkeys() could run before the zone_shutdown(), but after
the last .erefs has been "detached" causing assertion failure when doing
dns_zone_attach(). Remove the use of .erefs (dns_zone_attach/detach)
and replace it with using the .irefs and additional checks whether the
zone is exiting in the callbacks.
(cherry picked from commit 80e66fbd2d)
There was an exception for dnssec-policy that allowed DNSSEC in the
unsigned version of the zone. This however causes a crash if the
zone switches from dynamic to inline-signing in the case of NSEC3,
because we are now trying to add an NSEC3 record to a non-NSEC3 node.
This is because BIND expects none of the records in the unsigned
version of the zone to be NSEC3.
Remove the exception for dnssec-policy when copying non DNSSEC
records, but do allow for DNSKEY as this may be a published DNSKEY
from a different provider.
(cherry picked from commit 332b98ae49)