Commit graph

15699 commits

Author SHA1 Message Date
Mark Andrews
e09eb2ff21 Fix OID check for PRIVATEOID keys and signatures
We were failing to account for the length byte before the OID.
See RFC 4034.

   Algorithm number 254 is reserved for private use and will never be
   assigned to a specific algorithm.  The public key area in the DNSKEY
   RR and the signature area in the RRSIG RR begin with an unsigned
   length byte followed by a BER encoded Object Identifier (ISO OID) of
   that length.  The OID indicates the private algorithm in use, and the
   remainder of the area is whatever is required by that algorithm.
   Entities should only use OIDs they control to designate their private
   algorithms.

(cherry picked from commit ca7355b7d0)
2025-04-09 20:07:31 +00:00
Ondřej Surý
81468fca59 Don't copy EDE codes if source is same as destination
If the nested DNS validator ends up in the same fetch because of the
loops, the code could be copying the EDE codes from the same source EDE
context as the destination EDE context.  Skip copying the EDE codes if
the source and the destination is the same.

(cherry picked from commit 2988ebae21)
2025-04-02 16:42:23 +00:00
Ondřej Surý
01a579d126 Don't pass edectx from fetch_and_forget
Pass NULL as edectx for the fetch_and_forget() fetches as nobody
is reading the EDE contexts and it can mess the main client buffer.

(cherry picked from commit fe48290140)
2025-04-02 16:42:23 +00:00
Ondřej Surý
17d4d178b9 Add static ede context into each validator layer
Instead of passing the edectx from the fetchctx into all subvalidators,
make the ede context ownership explict for dns_resolver_createfetch()
callers, and copy the ede result codes from the children validators to
the parent when finishing the validation process.

(cherry picked from commit d7593196a1)
2025-04-02 16:42:23 +00:00
Artem Boldariev
8459d99ec2 Dispatch: carefully check if the server name for SNI is a hostname
Previously the code would not check if the string intended to be used
for SNI is a hostname.

(cherry picked from commit 2592e309c7)
2025-03-31 15:07:55 +03:00
Artem Boldariev
634625be07 Add isc_tls_valid_sni_hostname()
Add a function that checks if a 'hostname' is not a valid IPv4 or IPv6
address. Returns 'true' if the hostname is likely a domain name, and
'false' if it represents an IP address.

(cherry picked from commit 1f199ee606)
2025-03-31 15:06:59 +03:00
Colin Vidal
c1352b79ca copy __FILE__ when allocating memory
When allocating memory under -m trace|record, the __FILE__ pointer is
stored, so it can be printed out later in order to figure out in which
file an allocation leaked. (among others, like the line number).

However named crashes when called with -m record and using a plugin
leaking memory. The reason is that plugins are unloaded earlier than
when the leaked allocations are dumped (obviously, as it's done as late
as possible). In such circumstances, __FILE__ is dangling because the
dynamically loaded library (the plugin) is not in memory anymore.

Fix the crash by systematically copying the __FILE__ string
instead of copying the pointer. Of course, this make each allocation to
consume a bit more memory (and longer, as it needs to calculate the
length of __FILE__) but this occurs only under -m trace|record debugging
flags.

In term of unit test, because grepping in C is not fun, and because the
whole "syntax" of the dump output is tested in other tests, this simply
search for a substring in the whole buffer to make sure the expected
allocations are found.

(cherry picked from commit 4eb2cd364a)
2025-03-27 14:21:00 +01:00
Evan Hunt
3dd7691650
fix the fetchresponse result for CNAME/DNAME
the fix in commit 1edbbc32b4 was incomplete; the wrong
event result could also be set in cache_name() and validated().

(cherry picked from commit 9ebeb60174)
2025-03-26 11:59:48 +01:00
Evan Hunt
8f84f8293a
set eresult based on the type in ncache_adderesult()
when the caching of a negative record failed because of the
presence of a positive one, ncache_adderesult() could override
this to ISC_R_SUCCESS. this could cause CNAME and DNAME responses
to be handled incorrectly.  ncache_adderesult() now sets the result
code correctly in such cases.

(cherry picked from commit 1edbbc32b4)
2025-03-26 11:59:37 +01:00
Ondřej Surý
817a0a8e8e Fix invalid cache-line padding for qpcache buckets
The isc_queue_t was missing in the calculation of the required
padding size inside the qpcache bucket structure.

(cherry picked from commit 3ef9b09620)
2025-03-25 09:59:02 +00:00
Evan Hunt
080299bf49 Don't check DNS_KEYFLAG_NOAUTH
All DNSKEY keys are able to authenticate. The DNS_KEYTYPE_NOAUTH
(and DNS_KEYTYPE_NOCONF) flags were defined for the KEY rdata type,
and are not applicable to DNSKEY.

Previously, because the DNSKEY implementation was built on top of
KEY, the NOAUTH flag prevented authentication in DNSKEYs as well.
This has been corrected.

(cherry picked from commit 5c21576f82)
2025-03-24 23:52:02 -07:00
Evan Hunt
dc1ddd3e8a Tidy up keyvalue.h definitions
Use enums for DNS_KEYFLAG_, DNS_KEYTYPE_, DNS_KEYOWNER_, DNS_KEYALG_,
and DNS_KEYPROTO_ values.

Remove values that are never used.

Eliminate the obsolete DNS_KEYFLAG_SIGNATORYMASK. Instead, add three
more RESERVED bits for the key flag values that it covered but which
were never used.

(cherry picked from commit fee1ba40df)
2025-03-25 06:40:49 +00:00
Matthijs Mekking
c0e92c6df9 Remove dns_qpmulti_lockedread declaration
This function was removed in 6217e434b5
but not from the header file.

(cherry picked from commit 2c52aea3dc)
2025-03-25 06:02:17 +00:00
Mark Andrews
db113bc5ad Fix gaining adbname reference
Call dns_adbname_ref before calling dns_resolver_createfetch to
ensure adbname->name remains stable for the life of the fetch.

(cherry picked from commit 8e7229f641)
2025-03-21 00:29:45 +00:00
Matthijs Mekking
5cb7c19c23 Update Retired and Removed if we update lifetime
If we are updating the lifetime, and it was not set before, also
set/update the Retired and Removed timing metadata.

(cherry picked from commit 3e836a87e6)
2025-03-20 13:57:45 +00:00
Matthijs Mekking
3de8fa8709 Fix keymgr bug wrt setting the next time
Only set the next time the keymgr should run if the value is non zero.
Otherwise we default back to one hour. This may happen if there is one
or more key with an unlimited lifetime.

(cherry picked from commit 6c6b8796d3)
2025-03-20 13:57:45 +00:00
Matthijs Mekking
ac8efcbf14 keymgr: also set DeleteCDS when setting PublishCDS
The keymgr never set the expected timing metadata when CDS/CDNSKEY
records for the corresponding key will be removed from the zone. This
is not troublesome, as key states dictate when this happens, but with
the new pytest we use the timing metadata to determine if the CDS and/or
CDNSKEY for the given key needs to be published.

(cherry picked from commit 8c9d2eb2bf)
2025-03-20 13:57:45 +00:00
Matthijs Mekking
04054bcb9a Fix wrong usage of safety intervals in keymgr
There are a couple of cases where the safety intervals are added
inappropriately:

1. When setting the PublishCDS/SyncPublish timing metadata, we don't
   need to add the publish-safety value if we are calculating the time
   when the zone is completely signed for the first time. This value
   is for when the DNSKEY has been published and we add a safety
   interval before considering the DNSKEY omnipresent.

2. The retire-safety value should only be added to ZSK rollovers if
   there is an actual rollover happening, similar to adding the sign
   delay.

3. The retire-safety value should only be added to KSK rollovers if
   there is an actual rollover happening. We consider the new DS
   omnipresent a bit later, so that we are forced to keep the old DS
   a bit longer.

(cherry picked from commit 63edc4435f)
2025-03-20 13:57:45 +00:00
Matthijs Mekking
147ab68dc1 Fix a small keymgr bug
While converting the kasp system test to pytest, I encountered a small
bug in the keymgr code. We retire keys when there is more than one
key matching a 'keys' line from the dnssec-policy. But if there are
multiple identical 'keys' lines, as is the case for the test zone
'checkds-doubleksk.kasp', we retire one of the two keys that have the
same properties.

Fix this by checking if there are double matches. This is not fool proof
because there may be many keys for a few identical 'keys' lines, but it
is good enough for now. In practice it makes no sense to have a policy
that dictates multiple keys with identical properties.

(cherry picked from commit ef671919d5)
2025-03-20 13:57:45 +00:00
Mark Andrews
1f136f24a4 Use reference counted counters for nfail and nvalidations
The fetch context that held these values could be freed while there
were still active pointers to the memory.  Using a reference counted
pointer avoids this.

(cherry picked from commit bfbaacc9a0)
2025-03-20 01:30:43 +00:00
Aram Sargsyan
afef69cab0 Fix the resolvers RTT-ranged responses statistics counters
When a response times out the fctx_cancelquery() function
incorrectly calculates it in the 'dns_resstatscounter_queryrtt5'
counter (i.e. >=1600 ms). To avoid this, the rctx_timedout()
function should make sure that 'rctx->finish' is NULL. And in order
to adjust the RTT values for the timed out server, 'rctx->no_response'
should be true. Update the rctx_timedout() function to make those
changes.

(cherry picked from commit 830e548111)
2025-03-19 09:51:33 +00:00
Aram Sargsyan
2a4bbf1d2e Fix resolver responses statistics counter
The resquery_response() function increases the response counter without
checking if the response was successful. Increase the counter only when
the result indicates success.

(cherry picked from commit 12e7dfa397)
2025-03-19 09:51:33 +00:00
Aram Sargsyan
70c0074043 Implement -T cookiealwaysvalid
When -T cookiealwaysvalid is passed to named, DNS cookie checks for
the incoming queries always pass, given they are structurally correct.

(cherry picked from commit 807ef8545d)
2025-03-17 11:39:16 +00:00
Evan Hunt
1f4ba71f56 detect when closest-encloser name is too long
there was a database bug in which dns_db_find() could get a partial
match for the query name, but still set foundname to match the full
query name.  this triggered an assertion when query_addwildcardproof()
assumed that foundname would be shorter.

the database bug has been fixed, but in case it happens again, we
can just copy the name instead of splitting it. we will also log a
warning that the closest-encloser name was invalid.
2025-03-17 09:27:09 +00:00
Evan Hunt
5da31b753a dns_nsec3_addnsec3() can fail when iterating back
when adding a new NSEC3 record, dns_nsec3_addnsec3() uses a
dbiterator to seek to the newly created node and then find its
predecessor.  dbiterators in the qpzone use snapshots, so changes
to the database are not reflected in an already-existing iterator.
consequently, when we add a new node, we have to create a new iterator
before we can seek to it.
2025-03-17 09:27:09 +00:00
Evan Hunt
2025ba8f7a rbtdb zone find() function could set foundname incorrectly
when an empty node was found, the result was treated as a partial match,
but foundname could still contain the name of the empty node instead of
its parent.
2025-03-17 09:27:09 +00:00
Evan Hunt
dd1050e938 qpzone find() function could set foundname incorrectly
when a requested name is found in the QP trie during a lookup, but its
records have been marked as nonexistent by a previous deletion, then
it's treated as a partial match, but the foundname could be left
pointing to the original qname rather than the parent. this could
lead to an assertion failure in query_findclosestnsec3().
2025-03-17 09:27:09 +00:00
Mark Andrews
ae718fab53 Use a binary search to find the NSEC3 closest encloser
maxlabels is the suffix length that corresponds to the latest
NXDOMAIN response.  minlabels is the suffix length that corresponds
to longest found existing name.

(cherry picked from commit 67f31c5046)
2025-03-17 09:27:09 +00:00
Mark Andrews
0f0b143b35 Add missing locks when returning addresses
Add missing locks in dns_zone_getxfrsource4 et al. Addresses CID
468706, 468708, 468741, 468742, 468785 and 468778.

Cleanup dns_zone_setxfrsource4 et al to now return void.

Remove double copies with dns_zone_getprimaryaddr and dns_zone_getsourceaddr.

(cherry picked from commit d0a59277fb)
2025-03-15 06:07:55 +00:00
Evan Hunt
bfa5dd8991 qpzone.c:step() could ignore rollbacks
the step() function (used for stepping to the prececessor or
successor of a database node) could overlook a node because
there was an rdataset marked IGNORE because it had been rolled
back, covering an active rdataset under it.

(cherry picked from commit 24eaff7adc)
2025-03-14 23:22:59 +00:00
Evan Hunt
8afb5566c9 fix handling of revoked keys
when a key is revoked its key ID changes, due to the inclusion
of the "revoke" flag. a collision between this changed key ID and
that of an unrelated public-only key could cause a crash in
dnssec-signzone.

(cherry picked from commit 9cfe9f5eb7)
2025-03-14 22:29:50 +00:00
Colin Vidal
c8cb75d7b1 add support for EDE 20 (Not Authoritative)
Extended DNS Error message EDE 20 (Not Authoritative) is now sent when
client request recursion (RD) but the server has recursion disabled.

RFC 8914 mention EDE 20 should also be returned if the client doesn't
have the RD bit set (and recursion is needed) but it doesn't apply for
BIND as BIND would try to resolve from the "deepest" referral in
AUTHORITY section. For example, if the client asks for "www.isc.org/A"
but the server only knows the root domain, it will returns NOERROR but
no answer for "www.isc.og/A", just the list of other servers to ask.

(cherry picked from commit 24ffbdcfea)
2025-03-13 11:57:21 +00:00
Colin Vidal
870c5ce8bf add support for EDE 7 and 8
Extended DNS Error messages EDE 7 (expired key) and EDE 8 (validity
period of the key not yet started) are now sent in case of such DNSSEC
validation failures.

Refactor the existing validator extended error APIs in order to make it
easy to have a consisdent extra info (with domain/type) in the various
use case (i.e. when the EDE depends on validator state,
validate_extendederror or when the EDE doesn't depend of any state but
can be called directly in a specific flow).

(cherry picked from commit 334ea1269f)
2025-03-13 10:14:23 +00:00
Ondřej Surý
614f8c1ef1 Acquire the database reference before possibly last node release
Acquire the database refernce in the detachnode() to prevent the last
reference to be release while the NODE_LOCK being locked.  The NODE_LOCK
is locked/unlocked inside the RCU critical section, thus it is most
probably this should not pose a problem as the database uses call_rcu
memory reclamation, but this it is still safer to acquire the reference
before releasing the node.

(cherry picked from commit d1ef6a93c1)
2025-03-06 10:39:17 +00:00
Ondřej Surý
ee6e64df21 Revert "fix: dev: Delete dead nodes when committing a new version"
This reverts commit 67255da4b3, reversing
changes made to 74c9ff384e.

(cherry picked from commit 1e4695510a)
2025-03-05 17:28:44 +00:00
Aram Sargsyan
0561936272 Fix a bug in get_request_transport_type()
When dns_remote_done() is true, calling dns_remote_curraddr() asserts.
Add a dns_remote_curraddr() check before calling dns_remote_curraddr().

(cherry picked from commit 6cd9e4f67c)
2025-03-05 13:18:09 +00:00
Mark Andrews
c0197077aa Implement digest_sig and digest_rrsig for ZONEMD
ZONEMD needs to be able to digest SIG and RRSIG records.  The signer
field can be compressed in SIG so we need to call dns_name_digest().
While for RRSIG the records the signer field is not compressed the
canonical form has the signer field downcased (RFC 4034, 6.2).  This
also implies that compare_rrsig needs to downcase the signer field
during comparison.

(cherry picked from commit 006c5990ce)
2025-03-05 10:33:53 +00:00
Mark Andrews
cbf416a284 Call isc__iterated_hash_initialize
The iterated hash implementation needs to be initialised
on the worker thread.  Also clean it up after we are done.

(cherry picked from commit 988dc57c8c)
2025-03-04 13:49:38 +00:00
Aram Sargsyan
079a4176bb Fix a bug in dns_zone_getprimaryaddr()
When all the addresses were already iterated over, the
dns_remote_curraddr() function asserts. So before calling it,
dns_zone_getprimaryaddr() now checks the address list using the
dns_remote_done() function. This also means that instead of
returning 'isc_sockaddr_t' it now returns 'isc_result_t' and
writes the primary's address into the provided pointer only when
returning success.

(cherry picked from commit 7293cb0612)
2025-03-03 12:23:23 +00:00
Artem Boldariev
9977c7e5fa DoH: Bump the active streams processing limit
This commit bumps the total number of active streams (= the opened
streams for which a request is received, but response is not ready) to
60% of the total streams limit.

The previous limit turned out to be too tight as revealed by
longer (≥1h) runs of "stress:long:rpz:doh+udp:linux:*" tests.

(cherry picked from commit eaad0aefe6)
2025-03-03 10:12:27 +00:00
Artem Boldariev
b1ca1b3abc DoH: remove obsolete INSIST() check
The check, while not active by default, is not valid since the commit
8b8f4d500d.

See 'if (total == 0) { ...' below branch to understand why.

(cherry picked from commit 217a1ebd79)
2025-03-03 10:12:27 +00:00
Artem Boldariev
0bc12d0deb DoH: Flush HTTP write buffer on an outgoing DNS message
Previously, the code would try to avoid sending any data regardless of
what it is unless:

a) The flush limit is reached;
b) There are no sends in flight.

This strategy is used to avoid too numerous send requests with little
amount of data. However, it has been proven to be too aggressive and,
in fact, harms performance in some cases (e.g., on longer (≥1h) runs
of "stress:long:rpz:doh+udp:linux:*").

Now, additionally to the listed cases, we also:

c) Flush the buffer and perform a send operation when there is an
outgoing DNS message passed to the code (which is indicated by the
presence of a send callback).

That helps improve performance for "stress:long:rpz:doh+udp:linux:*"
tests.

(cherry picked from commit c5f7968856)
2025-03-03 10:12:27 +00:00
Artem Boldariev
30226c749f DoH: Limit the number of delayed IO processing requests
Previously, a function for continuing IO processing on the next UV
tick was introduced (http_do_bio_async()). The intention behind this
function was to ensure that http_do_bio() is eventually called at
least once in the future. However, the current implementation allows
queueing multiple such delayed requests needlessly. There is currently
no need for these excessive requests as http_do_bio() can requeue them
if needed. At the same time, each such request can lead to a memory
allocation, particularly in BIND 9.18.

This commit ensures that the number of enqueued delayed IO processing
requests never exceeds one in order to avoid potentially bombarding IO
threads with the delayed requests needlessly.

(cherry picked from commit 0e1b02868a)
2025-03-03 10:12:27 +00:00
Artem Boldariev
515d84e1f6 DoH: Simplify http_do_bio()
This commit significantly simplifies the code flow in the
http_do_bio() function, which is responsible for processing incoming
and outgoing HTTP/2 data. It seems that the way it was structured
before was indirectly caused by the presence of the missing callback
calls bug, fixed in 8b8f4d500d.

The change introduced by this commit is known to remove a bottleneck
and allows reproducible and measurable performance improvement for
long runs (>= 1h) of "stress:long:rpz:doh+udp:linux:*" tests.

Additionally, it fixes a similar issue with potentially missing send
callback calls processing and hardens the code against use-after-free
errors related to the session object (they can potentially occur).

(cherry picked from commit 0956fb9b9e)
2025-03-03 10:12:27 +00:00
Aram Sargsyan
2d48cb33e3 Fix TTL issue with ANY queries processed through RPZ "passthru"
Answers to an "ANY" query which are processed by the RPZ "passthru"
policy have the response-policy's 'max-policy-ttl' value unexpectedly
applied. Do not change the records' TTL when RPZ uses a policy which
does not alter the answer.

(cherry picked from commit 5633dc90d3)
2025-02-27 09:22:01 +00:00
Mark Andrews
14bd113b8f Fix dual-stack-servers
Named was stopping nameserver address resolution attempts too soon
when dual stack servers are configured.  Dual stack servers are
used when there are *not* addresses for the server in a particular
address family so find->status == DNS_ADB_NOMOREADDRESSES is not a
sufficient stopping condition when dual stack servers are available.
Call fctx_try to see if the alternate servers can be used.

(cherry picked from commit f98a8331aa)
2025-02-26 01:04:59 +00:00
Evan Hunt
4f1f958d6d prevent a reference leak from the ns_query_done hooks
if the NS_QUERY_DONE_BEGIN or NS_QUERY_DONE_SEND hook is
used in a plugin and returns NS_HOOK_RETURN, some of the
cleanup in ns_query_done() can be skipped over, leading
to reference leaks that can cause named to hang on shut
down.

this has been addressed by adding more housekeeping
code after the cleanup: tag in ns_query_done().

(cherry picked from commit c2e4358267)
2025-02-26 00:55:51 +00:00
Mark Andrews
a0dae15cd1 Relax private DNSKEY and RRSIG constraints
DNSKEY, KEY, RRSIG and SIG constraints have been relaxed to allow
empty key and signature material after the algorithm identifier for
PRIVATEOID and PRIVATEDNS. It is arguable whether this falls within
the expected use of these types as no key material is shared and
the signatures are ineffective but these are private algorithms and
they can be totally insecure.

(cherry picked from commit b048190e23)
2025-02-25 23:40:38 +00:00
Ondřej Surý
7682d63bd4 Destroy the hashmap iterator inside the rwlock
Previously, the hashmap iterator for fetches-per-zone was destroy
outside the rwlock.  This could lead to an assertion failure due to a
timing race with the internal rehashing of the hashmap table as the
rehashing process requires no iterators to be running when rehashing the
hashmap table.  This has been fixed by moving the destruction of the
iterator inside the read locked section.

(cherry picked from commit 1e4fb53c61)
2025-02-25 15:41:30 +00:00
Evan Hunt
16a80f401a Fix a logic error in cache_name()
A change in 6aba56ae8 (checking whether a rejected RRset was identical
to the data it would have replaced, so that we could still cache a
signature) inadvertently introduced cases where processing of a
response would continue when previously it would have been skipped.

(cherry picked from commit d0fd9cbe3b)
2025-02-24 23:42:25 +00:00