The name 'isdelegation()' was confusing. This function is not checking
whether this message is a delegation, but whether the denial of
existence proofs in this message is a proof of a referral to an
unsigned zone.
The name 'is_unsecure_referral()' is more appropriate.
(cherry picked from commit e0f09bb374)
The isdelegation() was changed to return an isc_result_t because the
idea was to have a separate return value DNS_R_NSEC3ITERRANGE to signal
to the caller we could not verify the proof because of too many
iterations in the NSEC3 record, or perhaps ISC_R_UNEXPECTED for a more
generic cause that verification was not done.
But this would make error handling more fragile and all we care about
is whether we can reliably say the NS bit was not set.
If we can not reliably say so, we have to treat it as an insecure
referrral.
Since the answer is either yes or no, we can revert back to returning
a boolean value.
(cherry picked from commit 3ac1bb1c39)
When the xfrin_recv_done() function decides to retry the transfer
using AXFR because of a previous error, it calls the xfrin_reset()
function which calls dns_db_closeversion() on 'xfr->ver'. The problem
is that the ixfr processing of a previous message could be still
in process in a worker thread, which then can use freed 'xfr->ver'.
If there is an ongoing worker thread delay the AXFR retry until after
the worker thread has finished its work.
(cherry picked from commit 141ff7bfa7)
After KeyTrap, the temporal DNSSEC were originally hard errors that
caused validation failures even if the records had another valid
signature. This has been changed and the RRSIGs outside of the
inception and expiration time are not counted as hard errors. However,
these errors are not even counted as validation attempts, so excessive
number of expired RRSIGs would cause some non-cryptograhic extra work
for the validator. This has been fixed and the temporal errors are
correctly counted as validation attempts.
(cherry picked from commit 6ba57a1f0f)
Allow empty APL records because RFC 3123 (Section 4) says "zero or
more items". This fixes processing of a catalog zone ACL (which is
based on APL records) when the zone contains an empty APL record or
when a zone update arrives which creates an empty APL record.
(cherry picked from commit 35b8af229e)
When an NTA already exists for a name, the old code retrieved
and reused the existing NTA object, then reset its timer via
settimer(). This is incorrect because isc_timer_start() and
isc_timer_stop() require the timer to be manipulated from its
owning loop (enforced by REQUIRE(timer->loop == isc_loop()) in
lib/isc/timer.c), and the caller may be running on a different
loop than the one that created the original NTA.
Instead, delete the old NTA (shutting down its timer on the
correct loop) and insert a fresh one that is owned by the
current loop.
dns_view_flushnode() was called in the delete_expired() async
callback, which runs after the query that detected the NTA expiry.
This created a race: the query would proceed with stale cached data
from the NTA period before the flush had a chance to run, resulting
in transient SERVFAIL with EDE 22 (No Reachable Authority).
Skip dns_view_flushnode() in the older branches as the solutions for
older branches are too complicated and this was not a critical bug.
(cherry picked from commit da8e1c956a)
When NTA expires the name's node should be flushed from the view's
cache as it's done when the NTA is manually removed using a rndc
command.
(cherry picked from commit 1899a3318c)
Move the write to fctx->vresult after LOCK(&fctx->lock). The field was
being set before acquiring the lock, but dns_resolver_logfetch() reads
it under the same lock from another thread.
(cherry picked from commit a2bd833909)
The SRTT update loaded the old value, computed a new one, and stored it
back as separate operations. Two concurrent callers could each read the
same old value and one update would be silently lost.
Use a CAS loop for the read-modify-write on entry->srtt. For the aging
path, also CAS on entry->lastage to prevent multiple threads from aging
the same entry in the same second.
(cherry picked from commit 4d15494b94)
The 'env' pointer is passed to an async function without taking
a reference first, which can potentially cause a use-after-free
error. Take a reference, then detach in the async function.
(cherry picked from commit 48d7401f0d)
The error cleanup in fctx_create() was missing isc_mutex_destroy() and
dns_ede_invalidate() calls. When error paths (cleanup_nameservers,
cleanup_fcount, cleanup_qmessage, cleanup_adb) were taken after the
mutex and edectx were initialized, the fctx memory was freed without
properly destroying these resources first.
(cherry picked from commit 5b1750f15f)
The lock is acquired for reading but the error path from
dns_rdata_fromstruct() incorrectly unlocks it as a write lock.
(cherry picked from commit 96a22451d7)
We return DNS_R_NOVALIDSIG if we detected a deadlock. Then in
'validate_async_done()', this result value is used to check if we
need to fall back to insecure. As part of that we create a new fetch
but that fails because of the detected deadlock. This results in a loop
of deadlock detected, fallback to insecure, deadlock detected, ...
Add a new result value, ISC_R_DEADLOCK, and return this instead when
we have detected a deadlock. This will be treated as a generic error,
as there is no special handling for this result value.
(cherry picked from commit bc1d177cc2)
Move the LIBDNS_XFRIN_RECV_DONE probe execution before dns_xfrin_detach
in xfrin_recv_done.
Previously, dns_xfrin_detach was called before the trace probe, which
could free the xfr object. Because the accessed member xfr->info is an
embedded array, the expression evaluates via pointer arithmetic rather
than a direct memory dereference. Although this prevents a reliable
crash in practice, it technically remains a use-after-free issue.
Reorder the statements to ensure the transfer context is fully valid
when the probe executes.
(cherry picked from commit e57245ee81)
Since memory allocation never fails in BIND 9, checkds_create() cannot
fail. Change it to return void and use designated initializers,
removing error handling at all call sites.
(cherry picked from commit 63d3c1f58a)
Initialize cb_args to NULL and free it in the cleanup path so it
is not leaked when the function fails after allocation.
(cherry picked from commit d7e1013741)
Two 'goto next' paths in zone_notify() skipped detaching the TSIG
key and transport, leaking them on TLS configuration failure and
when the destination address is disabled.
(cherry picked from commit 1505cb1c24)
Use isc_mem_putanddetach() instead of isc_mem_put() to properly
detach the attached memory context stored in resarg->mctx.
(cherry picked from commit d0165070c7)
In fctx_query(), resquery_ref(query) is called before
dns_dispatch_connect() in anticipation of the resquery_connected()
callback consuming the reference.
When dns_dispatch_connect() fails synchronously on TCP (e.g. from
dns_transport_get_tlsctx() failing in tcp_dispatch_connect()), the
connect callback is never scheduled, so the extra reference is never
consumed. The error path then tears down the query via manual cleanup
(isc_mem_put) without going through the refcount destructor, leaving
the reference imbalanced.
Fix by dropping the extra reference on the error path, just after
dns_dispatch_done() which cleans up the dispatch entry.
(cherry picked from commit 2da669490c)
When the dns_qp_getname() call returns an error the del_name() function
just returns without cleaning up the trasnaction.
Instead of returning, jump to a new label 'done:' similar to the code
written in the add_nm() function.
(cherry picked from commit 4df5b9ac32)
When defaults->zonedir is set, opts->zonedir is unconditionally
overwritten without freeing the previous value. This leaks memory
on every catalog zone update when zonedir defaults are configured.
Free the existing opts->zonedir before replacing it.
(cherry picked from commit 5cd17c8adc)
Refactor dns_loadctx_t and dns_dumpctx_t to use standard
ISC_REFCOUNT_DECL and ISC_REFCOUNT_IMPL macros, retiring the
redundant manual attach and detach implementations.
Introduce dns_loadctx_enqueue() and dns_dumpctx_enqueue() to
ensure compliance with the new strict loop affinity in
isc_work_enqueue(). If the current loop does not match the
target loop, the enqueue operation is safely bounced to the
correct thread via isc_async_run().
(cherry picked from commit e7c550730a)
The 'keyname' variable could be used in the add_rdata_to_list()
call without being initialized. Make sure that 'keyname' is non-NULL
for all the cases that do not jump to the 'cleanup:' label.
(cherry picked from commit 172f5496ba)
The attacker that controls DNSSEC-signed zone can trigger a memory leak
in the addnoqname() and/or addclosest() by creating more than
max-records-per-type RRSIG for any NSEC records. The memory leaks have
been fixed.
(cherry picked from commit a854a5c83d)
In many places we only create a validator if the RRset has too low
trust (the RRset is pending validation, or could not be validated
before). This check was missing prior to validating negative response
data.
(cherry picked from commit 6ca67f65cd)
When we mark RRsets as secure, we most of the time also log a debug
message. Combine this the same way as 'markanswer()' does.
(cherry picked from commit d4c7c83a70)
If we already marked an rdataset as secure (or it has even stronger
trust), there is no need to cryptographically verify it again.
(cherry picked from commit 0ec08c2120)
When looking up an NSEC3 as part of an insecurity proof, check the
number of iterations. If this is too high, treat the answer as insecure
by marking the answer with trust level "answer", indicating that they
did not validate, but could be cached as insecure.
(cherry picked from commit 988040a5e0)
Replace the two-pass "random start index and wrap around" logic in
fctx_getaddresses_nameservers() with a statistically sound Fisher-Yates
shuffle.
The previous implementation picked a random starting node and did two
passes over the linked list to find query candidates. The new logic
extracts the available nameservers into a bounded, stack-allocated array
of dns_rdata_t structures.
This array is then randomized in-place using a Fisher-Yates shuffle.
Finally, the shuffled array is traversed sequentially to launch fetches
until the dynamic quota (fctx->pending_running >= fetches_allowed) is
reached.
This guarantees a fair random distribution for outbound queries while
properly respecting dynamic query limits, entirely within O(1) memory
and without the overhead of linked-list pointer shuffling or multiple
dataset traversals.
(cherry picked from commit 3c33e7d937)
A debug message that logs a PKCS#11 object has been generated was
erroneously logged at error level. This has been fixed.
(cherry picked from commit 5bd6322739)
When selecting nameserver addresses to be looked up we where
always selecting them in dnssec name order from the start of
the nameserver rrset. This could lead to resolution failure
despite there being address that could be resolved for the
other names. Use a random starting point when selecting which
names to lookup.
(cherry picked from commit b78052119a)
If an invalid SKR file is imported, reading the time from the token
buffer might overflow the buffer on the local stack. This has been
fixed by removing the intermediate buffer and parsing the lexer token
directly.
(cherry picked from commit 8ab4827a0c)
NSEC3 hashes are required to fit within a single DNS label. Since there
are 5 bits per label byte without pad characters, the maximum hash size
is floor(63*5/8) (39 bytes).
This patch enforces this maximum length for unknown algorithms, while
strictly enforcing the exact expected digest length for known algorithms
like SHA-1.
(cherry picked from commit 3801d0ebbf)
When .next_length is longer than NSEC3_MAX_HASH_LENGTH, it causes a
harmless out-of-bound read of the isdelegation() stack. This patch
fixes the issue by skipping NSEC3 records with an oversized hash length
during validation.
(cherry picked from commit 67b4fb56e4)
A regression was introduced when adding the EDE code for unsupported
DNSKEY and DS algorithms. When the parent has both supported and
unsupported algorithm in the DS record, the validator would treat the
supported DS algorithm as insecure when validating DNSKEY records
instead of BOGUS. This has not security impact as the rest of the child
zone correctly ends with BOGUS status, but it is incorrect and thus the
regression has been fixed.
(cherry picked from commit f983a64152)
An attacker controlling a malicious DNS server returns a DNAME record,
and the we stores a pointer to resp->foundname, frees the response
structure, then uses the dangling pointer in dns_name_fullcompare()
possibly causing invalid match. Only the `delv`is affected. This has
been fixed.
(cherry picked from commit 9135b71a7a)
The fetch loop detection occured in two places: when
`dns_resolver_createfetch()` is invoked (looking up through the parent
fetches chain and stops the fetch if a parent fetch is the same qname and
qtype) and right after calling `dns_adb_findname()` in the resolver
(stops the fetch if the current fetch is the same name from the ADB
lookup, and ADB lookup needs to fetch it).
Regarding fetch loop detection at the `dns_resulver_createfetch()`
entry, there are case where both qname and qtype are similar but the
zonecut is different. This will then query different name servers and
get different responses. For instance, the following delegation
parent-side (both for `foo.example.` and `dnshost.example.`):
foo.example. 3600 NS ns.dnshost.example.
dnshost.example. 3600 NS ns.dnshost.example.
ns.dnshost.example. 3600 A 1.2.3.4
Then the child-side of `dnshost.example.`:
dnshost.example. 300 NS ns.dnshost.example.
ns.dnshost.example. 300 A 1.2.3.4
Then the child-side of `foo.example.`:
foo.example 3600 NS ns.dnshost.example.
a.foo.example 300 A 5.6.7.8
Obviously, there is a misconfiguration between the parent-side and the
child-side of `dnshost.example` (the mismatch of the TTL), but, this
happens...
Because the resolver is currently child-centric, the parent-side
delegation's glue of `dnshost.example.` will be overriden by the
child-side of the delegation. Once both A records will expires, the
resolver will attempt to find out the A RRs but will start from the
`foo.example.` zonecut, as the delegation itself is still valid.
Then the resolver will attempt to resolve `ns.dnshost.example.`, still
using the `foo.example.` zonecut, which will immediately trigger another
attempt to resolve `ns.foo.example.` (because the A RR is expired). This
is, however _not_ a loop, because the second attempt will have
`dnshost.example.` zonecut. And this changes everything, because the
resolver detects the A name is in-domain, and pass a flag to ADB so
`dns_view_find()` won't use the cache. As a result, the zonecut will be
`.`, and the hints (root servers) will be queried instead.
From that point, they'll return the parent-side delegation, which
includes the glue for `ns.dnshost.example/A`, and the resolution can
continue. Previously, this wouldn't be possible because a loop would be
detected from the second attempt to looking `ns.foo.example/A` and would
result in a SERVFAIL.
Now, the loop detection is relaxed as the loop is detected if the qname,
qtype _and_ zonecut are equals.
This commit also changes the way the loop detection post
`dns_adb_createfind()` works. From the same example above, there would
be two ADB fetches with the same name, but with two different ADB flags
(the first one without DNS_ADB_STARTATZONE, the second one with that
flag). It means that there will be two fetches out of those two ADB
lookups, both legit, and not a loop (i.e. it won't be stuck). To
differenciate between a find which has a pending fetch (which could be
from another find the current find has been attached to), a new find
option `DNS_ADBFIND_STARTEDFETCH` is introduced, which tells that the
current has did started a fetch.
That way, if a find doesn't have `DNS_ADBFIND_STARTEDFETCH` option but
has pending fetches, we know this is a find attached to a similar find
so this is a loop. Otherwise, with `DNS_ADBFIND_STARTEDFETCH`, we know
that even if there is a pending fetch, this is not a loop as the fetch
has just been started
(cherry picked from commit f623ab1fb3)
ADB entry window and ADB min cache time can be tweaked using `named -T
adbentrywindow=<unsigned int>` and `named -T adbmincache=<unsigned
int>`.
While those values doesn't needs to be exposed to the operator, this can
be needed to be able to system test ADB behaviors without having to wait
as long as those values are by default.
(cherry picked from commit e5f963262a)
This better reflects the true nature of the function as we are reading
the ephemeral port range which is not related to UDP at all.
(cherry picked from commit 295139f8ca)
The function was already marked as never failing, always returning
ISC_R_SUCCESS, so there was a lot of dead code around checking whether
the result would be ISC_R_SUCCESS. This has been cleaned up.
(cherry picked from commit c3ec414d88)