860 commits

Author SHA1 Message Date
Ondřej Surý
005e151c5b
Pass empty string instead of NULL to ns_client_dumpmessage()
The two new call sites added by the CLASS-validation work passed NULL
as the reason, but ns_client_dumpmessage() bails out early on a NULL
reason — so the message dump never happened. The intent was to dump
the message and let the follow-up ns_client_log() carry the reason
text, so pass "" to suppress the prefix without short-circuiting the
dump.

(cherry picked from commit 3401cbd16f44b4ecb8b57dc9d1951037db6d0e32)
2026-05-07 13:09:18 +02:00
Ondřej Surý
39a4ad2330
Validate DNS message CLASS early in request processing
Reject requests with unsupported or misused CLASS values before
further processing.  Only IN, CH, HS, RESERVED0 (for DNS Cookies),
ANY (for TKEY negotiation), and NONE (for DNS UPDATE) are accepted;
all other classes return NOTIMP.  Misuse of NONE or ANY outside
their allowed contexts returns FORMERR.

This adds further protection against bugs of the same general class
as YWH-PGM40640-70 and YWH-PGM40640-73.

(cherry picked from commit 0a687451505037e9f9a850c9cb113aed4995b03f)
2026-05-07 13:09:18 +02:00
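The class rules in "Validate DNS message CLASS early in request processing" can be sketched roughly as below. This is a simplified illustration, not the code from the commit: `request_ctx_t`, `validate_class()`, and the two context flags are hypothetical names, and the real check may use different criteria for what counts as an allowed context. The class and RCODE numbers are the IANA-assigned values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* DNS class and RCODE values as assigned by IANA. */
enum {
	CLASS_RESERVED0 = 0,
	CLASS_IN = 1,
	CLASS_CH = 3,
	CLASS_HS = 4,
	CLASS_NONE = 254,
	CLASS_ANY = 255
};
enum { RCODE_NOERROR = 0, RCODE_FORMERR = 1, RCODE_NOTIMP = 4 };

/* Hypothetical summary of the request context; not a BIND structure. */
typedef struct {
	bool is_tkey;   /* assumption: request is a TKEY negotiation */
	bool is_update; /* assumption: request opcode is UPDATE */
} request_ctx_t;

/* Return the RCODE to answer with, or RCODE_NOERROR to keep processing. */
static int
validate_class(uint16_t rdclass, const request_ctx_t *ctx) {
	assert(ctx != NULL);

	switch (rdclass) {
	case CLASS_IN:
	case CLASS_CH:
	case CLASS_HS:
		return RCODE_NOERROR;
	case CLASS_RESERVED0:
		/* Accepted for DNS Cookies; any further checks are omitted. */
		return RCODE_NOERROR;
	case CLASS_ANY:
		/* Only meaningful for TKEY negotiation. */
		return ctx->is_tkey ? RCODE_NOERROR : RCODE_FORMERR;
	case CLASS_NONE:
		/* Only meaningful for DNS UPDATE. */
		return ctx->is_update ? RCODE_NOERROR : RCODE_FORMERR;
	default:
		/* Every other class is unsupported. */
		return RCODE_NOTIMP;
	}
}
```

Under these assumptions, an ordinary query with class ANY yields FORMERR, while a query with an unassigned class such as 2 yields NOTIMP.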
Evan Hunt
134706912f
Disable UPDATE and NOTIFY for non-IN classes
Return NOTIMP for UPDATE and NOTIFY requests received for views with a
class other than IN.  Only QUERY is now supported for non-IN views such
as CHAOS.

When running dns_rdata_tostruct() with types that are only defined
for class IN, ensure that the class is correct before proceeding.

Add an assertion that any zone being updated is of class IN. (Note
that previously, a DLZ zone could have its class value set incorrectly
to NONE; this has been fixed.)

This addresses YWH-PGM40640-70 and YWH-PGM40640-73 (as well as any
similar problems that might have occurred in the future) by minimizing
the code paths that can be reached by rdata classes other than IN, so it
is safe for the implementation to assume that rdatatypes that are only
defined for class IN, such as SVCB or WKS, have been parsed and
validated, and not accepted as unknown/opaque data.

Fixes: isc-projects/bind9#5777
Fixes: isc-projects/bind9#5779

(cherry picked from commit a6d8e330ed6cf0021bff3f00aa1dc7a296f5aec0)
2026-05-07 13:09:18 +02:00
Aram Sargsyan
69fb85d994
Apply XFR-out quota after ACL is checked
Unauthorized clients can consume XFR-out quota and block authorized
XFR clients. Apply the quota after ACL is checked.

(cherry picked from commit 5615e6c47a2cd00d82d48b568cc55a4b89daa330)
2026-05-07 13:09:18 +02:00
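The ordering change in "Apply XFR-out quota after ACL is checked" might look like the following minimal sketch. The types and function names (`quota_t`, `xfr_start()`, and so on) are hypothetical stand-ins, and the ACL decision is reduced to a boolean; the point is only that a refused client returns before the quota is touched.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the BIND quota or XFR types. */
typedef struct {
	unsigned int used;
	unsigned int max;
} quota_t;

typedef enum { XFR_OK, XFR_REFUSED, XFR_QUOTA } xfr_result_t;

/* Take one quota slot if any is free. */
static bool
quota_acquire(quota_t *quota) {
	if (quota->used >= quota->max) {
		return false;
	}
	quota->used++;
	return true;
}

static xfr_result_t
xfr_start(quota_t *quota, bool acl_allows) {
	assert(quota != NULL);

	/* Check the ACL first, so refused clients never consume quota. */
	if (!acl_allows) {
		return XFR_REFUSED;
	}
	/* Only authorized clients compete for the limited slots. */
	if (!quota_acquire(quota)) {
		return XFR_QUOTA;
	}
	return XFR_OK;
}
```

With a quota of one slot, an unauthorized request leaves `used` at 0, so the next authorized transfer still succeeds.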
Evan Hunt
d42b3e7b91 Clear dns64_aaaaok immediately after use
The DNS64 state information stored in client->query.dns64_aaaaok
could cause an assertion failure in query_respond() if the server
was configured in such a way as to trigger a new recursion before
the query had been reset - for example, by using the filter-aaaa
plugin, which may need to recurse to find out whether an A record
exists.

This has been addressed by clearing DNS64 state information
immediately after the call to query_filter64().

(cherry picked from commit 7213b038f0)
2026-05-06 04:47:07 +00:00
Ondřej Surý
36897a0872 Fix swapped arguments in redirect2() single-label branch
For a query whose qname is the root, the labels==1 branch in
redirect2() called dns_name_copy(redirectname, view->redirectzone)
with arguments reversed, overwriting the view-global
nxdomain-redirect target with the empty redirectname rather than
copying the configured target into the per-query lookup name.  After
the corruption, view->redirectzone names the root, so
dns_name_issubdomain() makes redirect2() short-circuit for every
subsequent query and the nxdomain-redirect feature stops working
until named is restarted.

Triggering this needs the resolver to receive an NXDOMAIN for the
root from upstream, which does not happen in normal DNS operation.

Swap the arguments to match the dns_name_copy(source, dest)
signature.  Add a system test that issues a root query through the
nxdomain-redirect resolver and verifies the redirect feature still
works for a normal NXDOMAIN-producing query afterwards.

Assisted-by: Claude:claude-opus-4-7
(cherry picked from commit c62f24f7ee)
2026-04-30 07:38:57 +02:00
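As an illustration of the (source, dest) argument order discussed in "Fix swapped arguments in redirect2() single-label branch", here is a small sketch. `name_t`, `name_copy()`, and `redirect_lookup_name()` are hypothetical stand-ins and do not reproduce the dns_name API. Declaring the configured target as `const` is a design choice of this sketch: a swapped call would then draw a compiler diagnostic instead of silently overwriting the shared value.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a DNS name; not dns_name_t. */
typedef struct {
	char text[256];
} name_t;

/* Copy in (source, dest) order, as described in the commit message. */
static void
name_copy(const name_t *source, name_t *dest) {
	assert(source != NULL && dest != NULL);
	memcpy(dest, source, sizeof(*dest));
}

/*
 * Root-qname case: the per-query lookup name becomes the configured
 * redirect target. The view-global target is const here, so passing
 * it as the destination would not compile cleanly.
 */
static void
redirect_lookup_name(const name_t *redirectzone, name_t *redirectname) {
	name_copy(redirectzone, redirectname);
}
```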
Ondřej Surý
1a93486cde Refuse SIG and NXT records in dynamic updates
SIG (24) and NXT (30) are obsolete DNSSEC record types, superseded by
RRSIG and NSEC in RFC 3755.  Allowing them through dynamic update
exposes two distinct bugs that the surrounding GL#5818 work already
fixes as defense-in-depth:

  - dns__db_findrdataset() used to REQUIRE that (covers == 0 ||
    type == RRSIG), which aborts named when a SIG update reaches the
    prescan foreach_rr() call.  Fixed to accept dns_rdatatype_issig().
  - diff.c rdata_covers() used to test only RRSIG, dropping the
    covered-type field for SIG rdatas; the zone DB then filed every
    SIG rdataset under typepair (SIG, 0) instead of
    (SIG, covered_type) and follow-up adds collided at that bucket.
    Fixed to use dns_rdatatype_issig().

Both underlying bugs are still reachable via inbound zone transfer
(diff.c rdata_covers() runs from both dns_diff_apply on the IXFR path
and dns_diff_load on the AXFR path), so the type-helper fixes above
remain necessary.  For the dynamic-update path, the simplest and
safest posture is to refuse SIG and NXT outright at the front door in
ns/update.c, alongside the existing NSEC/NSEC3/non-apex-RRSIG
refusals.  KEY remains permitted because it is still used to carry
public keys for SIG(0) transaction authentication.

The existing tcp-self SIG regression test is repointed to assert
REFUSED on the SIG add, a symmetric NXT test is added, and the
SIG-via-dyn-update covers-bucket test is removed because it is no
longer reachable through this entry point; AXFR-based coverage of
diff.c rdata_covers() follows in a separate commit.

(cherry picked from commit 3a44a13232)
2026-04-17 19:24:13 +02:00
Ondřej Surý
feb5dc7f98 Add regression test for TOCTOU race in DNS UPDATE SSU handling
Race rndc reconfig (toggling between allow-update and update-policy)
against a stream of DNS UPDATEs for 5 seconds and verify that named
does not crash.

Before the fix, the race between send_update() and update_action()
reading the SSU table independently could trigger an assertion
failure (INSIST) when the zone's update policy changed between the
two reads.

(cherry picked from commit c503b6eee8)
2026-03-25 16:16:22 +01:00
Ondřej Surý
c409b9a939 Fix TOCTOU race in DNS UPDATE SSU table handling
Pass the SSU table through the update event struct from
send_update() to update_action() instead of reading it from the
zone twice.  If rndc reconfig changed the zone's update policy
between the two reads (e.g., from allow-update to update-policy),
send_update() would skip the maxbytype allocation but
update_action() would see a non-NULL ssutable, triggering
INSIST(ssutable == NULL || maxbytype != NULL) and crashing named.

The ssutable reference is now taken once in send_update() and
transferred to update_action() via the event struct, ensuring
both functions see the same value.

(cherry picked from commit c172416559)
2026-03-25 16:16:22 +01:00
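The idea behind "Fix TOCTOU race in DNS UPDATE SSU table handling" can be sketched as follows: read the zone's table once, carry that snapshot in the event, and let the second stage use only the snapshot. All types here are hypothetical stand-ins (a plain counter replaces reference counting, a boolean replaces the maxbytype allocation), and the sketch is single-threaded, so it shows the data flow rather than the race itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the BIND types. */
typedef struct {
	int refs;
} ssutable_t;

typedef struct {
	ssutable_t *ssutable; /* may be replaced by a reconfiguration */
} zone_t;

typedef struct {
	ssutable_t *ssutable; /* snapshot taken once in send_update() */
	bool have_maxbytype;  /* stands in for the maxbytype allocation */
} update_event_t;

static update_event_t
send_update(zone_t *zone) {
	update_event_t ev;

	assert(zone != NULL);
	ev.ssutable = zone->ssutable; /* the only read of the zone's table */
	if (ev.ssutable != NULL) {
		ev.ssutable->refs++; /* reference travels with the event */
	}
	ev.have_maxbytype = (ev.ssutable != NULL);
	return ev;
}

/* Returns whether the invariant checked by the INSIST holds. */
static bool
update_action(update_event_t *ev) {
	bool ok;

	assert(ev != NULL);
	/* Uses the snapshot, never the zone, so both stages agree. */
	ok = (ev->ssutable == NULL || ev->have_maxbytype);
	if (ev->ssutable != NULL) {
		ev->ssutable->refs--; /* release the transferred reference */
		ev->ssutable = NULL;
	}
	return ok;
}
```

Even if the zone's table pointer changes between the two calls, `update_action()` sees the same value that `send_update()` used when deciding whether to allocate.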
Michal Nowak
82991c7881
Use clang-format-22 to update formatting
(cherry picked from commit 239464f276)
2026-03-04 12:18:27 +01:00
Ondřej Surý
6ac20d5099 Clear serve-stale flags when following the CNAME chains
A stale answer or SERVFAIL could have been served in case of multiple
upstream failures when following the CNAME chains. This has been fixed.

(cherry picked from commit d46277b398)
2026-02-25 11:30:34 +01:00
Mark Andrews
32f802f4ed Return FORMERR for ECS family 0
RFC 7871 only defines family 1 (IPv4) and 2 (IPv6). Additionally
it requires FORMERR to be returned for all unknown families.

(cherry picked from commit 757e503536)
2026-02-19 22:42:26 +11:00
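A minimal sketch of the family check described in "Return FORMERR for ECS family 0": only families 1 and 2 are accepted and anything else, including 0, maps to FORMERR. The function name and the `maxbits` output parameter are hypothetical; the real option parser performs more validation than this.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { RCODE_NOERROR = 0, RCODE_FORMERR = 1 };
enum { ECS_FAMILY_IPV4 = 1, ECS_FAMILY_IPV6 = 2 };

/*
 * Hypothetical helper: validate the ECS FAMILY field and report the
 * address width in bits for the accepted families.
 */
static int
ecs_family_check(uint16_t family, unsigned int *maxbits) {
	assert(maxbits != NULL);

	switch (family) {
	case ECS_FAMILY_IPV4:
		*maxbits = 32;
		return RCODE_NOERROR;
	case ECS_FAMILY_IPV6:
		*maxbits = 128;
		return RCODE_NOERROR;
	default:
		/* Family 0 and every other unknown family. */
		*maxbits = 0;
		return RCODE_FORMERR;
	}
}
```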
Evan Hunt
aa13e62355 allow glue in delegations with QTYPE=ANY
when a query for type ANY triggers a delegation response, all
additional data was omitted from the response, including
mandatory glue. this has been corrected.
2025-12-11 10:36:09 -08:00
Matthijs Mekking
45c7008ecd Log serial when IXFR version not in journal
It may be useful to know which version (begin serial) is missing when
the IXFR version cannot be found.

(cherry picked from commit a4e6fef81c)
2025-12-10 15:25:23 +00:00
Evan Hunt
25c9fb54da standardize CHECK and RETERR macros
previously, there were over 40 separate definitions of CHECK macros, of
which most used "goto cleanup", and the rest "goto failure" or "goto
out". there were another 10 definitions of RETERR, of which most were
identical to CHECK, but some simply returned a result code instead of
jumping to a cleanup label.

this has now been standardized throughout the code base: RETERR is for
returning an error code in the case of an error, and CHECK is for jumping
to a cleanup tag, which is now always called "cleanup". both macros are
defined in isc/util.h.

(cherry picked from commit 52bba5cc34)
2025-12-03 19:17:20 -08:00
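The two macro roles described in "standardize CHECK and RETERR macros" could look roughly like this. It is a sketch under assumptions: `result_t` and the `R_*` codes stand in for the real result type, and the actual definitions in isc/util.h may differ in detail. RETERR returns immediately, while CHECK stores the code in a local `result` and jumps to the `cleanup` label.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef int result_t; /* stand-in for the real result type */
enum { R_SUCCESS = 0, R_FAILURE = 1 };

/* Return the error code directly; for functions with nothing to free. */
#define RETERR(x)                      \
	do {                           \
		result_t _r = (x);     \
		if (_r != R_SUCCESS) { \
			return (_r);   \
		}                      \
	} while (0)

/* Record the error in 'result' and jump to the 'cleanup' label. */
#define CHECK(x)                           \
	do {                               \
		result = (x);              \
		if (result != R_SUCCESS) { \
			goto cleanup;      \
		}                          \
	} while (0)

static int live_buffers = 0; /* allocations not yet freed */

static result_t
step(int ok) {
	return ok ? R_SUCCESS : R_FAILURE;
}

static result_t
with_reterr(int ok) {
	RETERR(step(ok)); /* nothing allocated yet, so just return */
	return R_SUCCESS;
}

static result_t
with_check(int ok) {
	result_t result;
	char *buf = malloc(16);

	if (buf == NULL) {
		return R_FAILURE;
	}
	live_buffers++;

	CHECK(step(ok)); /* on failure, skip ahead to cleanup */

cleanup:
	/* Runs on both the success and the failure path. */
	assert(live_buffers > 0);
	free(buf);
	live_buffers--;
	return result;
}
```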
Ondřej Surý
5cd69a3dcf
Detect resolution loops between fetches
Maintain the relationship between the parent and child fetches and, when
creating a new child fetch, properly check for resolution loops in which
the new fetch would join one of the parent's fetch contexts.

(cherry picked from commit 4d307ac67a)
2025-11-28 09:32:53 +01:00
Colin Vidal
5a98141a00 check plugin config before registering
In named_config_parsefile(), when checking the validity of
named.conf, the checking of plugin correctness was deliberately
postponed until the plugin is loaded and registered. However,
when the plugin was registered, the checking was never actually
done: the plugin_register() implementation was called, but
plugin_check() was not.

This made it necessary to duplicate the correctness checking in both
functions, so that both named-checkconf and named could catch errors.
That should not be required.

ns_plugin_register() now calls the check function before the register
function, and aborts if either one fails.  ns_plugin_check() calls only
the check function.  ns_plugin_check() is used by named-checkconf, and
ns_plugin_register() is used by named. (Note: this design has a
side effect that a call to ns_plugin_register() will result in the
plugin parameters being parsed twice at registration time.)

Partial backport of !11031
2025-10-01 11:16:11 +02:00
Mark Andrews
2554a724d4 Use signer name when disabling DNSSEC algorithms
When disabling algorithms, use the signer name to determine if the
algorithm is disabled or not.  This allows for algorithms to be
cleanly disabled on a zone level basis.  Previously, when just the
record's owner name was used, "disable-algorithms" could impact resolution
of names that were not disabled.  This does now mean that
"disable-algorithms" cannot be used to disable part of a zone anymore.

(cherry picked from commit a0945f6337)
2025-09-29 11:16:24 +10:00
Aram Sargsyan
36ef759164 Log the servfail-until-ready message not faster than once per second
Since the log level has been raised, busy servers can "explode" from
the amount of log messages. Use the usual practice of logging "every
once in a while".

(cherry picked from commit 1962857ac4)
2025-09-03 15:43:37 +02:00
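One way to log "every once in a while", as in "Log the servfail-until-ready message not faster than once per second", is to remember when the message was last emitted and suppress repeats inside the same second. The sketch below is an assumption about the general technique, not the code from the commit; `log_limiter_t` and `log_allowed()` are hypothetical names, and the caller passes in the current time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical limiter state; zero-initialize before first use. */
typedef struct {
	time_t last;      /* when the message was last logged */
	bool logged_once; /* false until the first message is logged */
} log_limiter_t;

/* Return true if the caller may log now, at most once per second. */
static bool
log_allowed(log_limiter_t *lim, time_t now) {
	assert(lim != NULL);

	if (lim->logged_once && difftime(now, lim->last) < 1.0) {
		return false; /* already logged within this second */
	}
	lim->last = now;
	lim->logged_once = true;
	return true;
}
```

A caller would wrap the log statement in `if (log_allowed(&lim, time(NULL)))`, so a busy server emits one line per second instead of one per query.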
Aram Sargsyan
25e08a0cfe Change the "RPZ not ready yet" message and its log level
The "RPZ not ready yet" message is logged at debug 3 level. Use the
info level instead for better visibility.

After raising the log level, the rpz_log_fail_helper() function starts
appending " failed: " to the message. Change the log message so it
makes more sense.

(cherry picked from commit 49356ce944)
2025-09-03 15:43:37 +02:00
Aram Sargsyan
cf687c0bda RPZ 'servfail-until-ready': skip updating SERVFAIL cache
In order to not pollute the SERVFAIL cache with the configured
SERVFAIL answers while RPZ is loading, set the NS_CLIENTATTR_NOSETFC
attribute for the client.

(cherry picked from commit d9b5f6c502)
2025-09-03 15:43:37 +02:00
Aram Sargsyan
369a350e04 Resolve false positive compilation warning from some GCC versions
The compiler claims that 'qresult_type' may be used uninitialized,
though all the cases inside the switch either set the variable
or return from the function, and the warning is generated on a line
after the switch-case block.

Slightly modify the code to set a default value for the variable when
declaring it.

    In function 'rpz_rewrite',
        inlined from 'query_checkrpz' at query.c:7288:12,
        inlined from 'query_gotanswer' at query.c:7724:12:
    query.c:4693:14: error: 'qresult_type' may be used uninitialized [-Werror=maybe-uninitialized]
     4693 |             !dnsrps_set_p(&emsg, client, st, qtype, &rdataset,
          |              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
     4694 |                           qresult_type != qresult_type_recurse))
          |                           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    query.c: In function 'query_gotanswer':
    query.c:4268:24: note: 'qresult_type' was declared here
     4268 |         qresult_type_t qresult_type;
          |                        ^~~~~~~~~~~~
    cc1: all warnings being treated as errors
2025-08-27 10:00:45 +00:00
Aram Sargsyan
ec2c3db702 Implement '-T slowrpz' named testing option
When used, named processes RPZ zones slowly. Useful for system tests.

(cherry picked from commit 5e718dd220)
2025-08-27 10:00:45 +00:00
Aram Sargsyan
ee29e133ac Add a new 'servfail-until-ready' configuration option for RPZ
By default, when named is started it may start answering
queries before the response policy zones are completely loaded
and processed. This new feature gives users the option to
tell named that incoming requests should result in SERVFAIL answers
until all the response policy zones are processed and ready.

(cherry picked from commit 41387b8d30)
2025-08-27 10:00:45 +00:00
Matthijs Mekking
1936303158 Add ede for zone with rpz cname override policy
When the zone is configured with a CNAME override policy, also add the
configured EDE code.

When the zone contains a wildcard CNAME, also add the configured
EDE code.

(cherry picked from commit 2f70a0ef12)
2025-08-05 12:13:15 +02:00
Matthijs Mekking
03eb9aabe1 Special case refresh stale ncache data
When refreshing stale ncache data, the qctx->rdataset is NULL and
requires special processing.

(cherry picked from commit 7774f16ed5)
2025-07-23 12:12:51 +00:00
Matthijs Mekking
667d81b52b Make serve-stale refresh behave as prefetch
A serve-stale refresh is similar to a prefetch, the only difference
is when it triggers. Where a prefetch is done when an RRset is about
to expire, a serve-stale refresh is done when the RRset is already
stale.

This means that the check for the stale-refresh window needs to
move into query_stale_refresh(). We need to clear the
DNS_DBFIND_STALEENABLED option at the same places as where we clear
DNS_DBFIND_STALETIMEOUT.

Now that serve-stale refresh acts the same as prefetch, there is no
worry that the same rdataset is added to the message twice. This makes
some code obsolete, specifically where we need to clear rdatasets from
the message.

(cherry picked from commit a66b04c8d4)
2025-07-23 12:12:51 +00:00
Andoni Duarte Pintado
4255d6d80a Merge tag 'v9.20.11' into bind-9.20 2025-07-16 17:20:09 +02:00
Aram Sargsyan
6a33125d7a Log dropped or slipped responses in the query-errors category
As mentioned in the comments block before the changed code block,
the dropped or slipped responses should be logged in the query
category (or rather query-errors category as done in lib/ns/client.c),
so that requests are not silently lost.

Also fix a couple of errors/typos in the code comments.

(cherry picked from commit 27e7961479)
2025-07-10 08:57:27 +00:00
Aram Sargsyan
9c7a63142d
Reset DNS_DBFIND_STALETIMEOUT in query_lookup()
If ns__query_start() is called because of a chained query (e.g.
after encountering a CNAME), a previously set DNS_DBFIND_STALETIMEOUT
flag on the query's 'dboptions' field can cause an assertion
failure if the new query's 'stalefirst' value is not true (e.g. if the
target qname is an authoritative zone for the server). Reset the
DNS_DBFIND_STALETIMEOUT flag in the query_lookup() function before
evaluating the 'stalefirst' value, and make sure to assign a fresh
value to the 'stalefirst' flag instead of conditionally assigning it
only if the value is 'true'.

(cherry picked from commit 3d8bd8bbf1)
2025-07-03 13:54:41 +02:00
Mark Andrews
53738b0e5e Use clang-format-20 to update formatting
(cherry picked from commit 422b9118e8)
2025-06-25 13:32:08 +10:00
Mark Andrews
2ecac031ba Silence tainted scalar in client.c
Coverity detected that 'optlen' was not being checked in 'process_opt'.
This is actually already done when the OPT record was initially
parsed.  Add an INSIST to silence Coverity as is done in message.c.

(cherry picked from commit 72cd6e8591)
2025-05-29 07:01:00 +00:00
Aram Sargsyan
a90e3b9e6f Implement a new 'notify-defer' configuration option
This new option sets the delay, in seconds, to wait before sending
a set of NOTIFY messages for a zone. Whenever a NOTIFY message is
ready to be sent, sending will be deferred for this duration.

(cherry picked from commit e42d6b4810)
2025-05-16 09:58:48 +00:00
Michał Kępień
5df876e968
Revert "Use a binary search to find the NSEC3 closest encloser"
This reverts commit ae718fab53.
2025-05-06 09:14:18 +02:00
Michał Kępień
8ea0c1d92b
Revert "detect when closest-encloser name is too long"
This reverts commit 1f4ba71f56.
2025-05-06 09:14:18 +02:00
Aram Sargsyan
7d652d9994 Fix a serve-stale issue with a delegated zone
When 'stale-answer-client-timeout' is 0, named is allowed to return
a stale answer immediately, while also initiating a new query to get
the real answer. This mode is activated in ns__query_start() by setting
the 'qctx->options.stalefirst' option to 'true' before calling the
query_lookup() function, but not when the zone is known to be
authoritative to the server. When the zone is authoritative, and
query_lookup() finds out that the requested name is a delegation,
then before proceeding with the query, named tries to look it up
in the cache first. The issue is that named doesn't consider
enabling 'qctx->options.stalefirst' in this case, and so the
'stale-answer-client-timeout 0' setting doesn't work for those
delegated zones: instead of immediately returning the stale answer
(if it exists), named tries to resolve it.

Fix this issue by enabling 'qctx->options.stalefirst' in the
query_zone_delegation() function just before named looks up the name
in the cache using a new query_lookup() call. Also, if nothing was
found in the cache, don't initiate another query_lookup() from inside
query_notfound(), and let query_notfound() do its work, i.e. it will
call query_delegation() for further processing.

(cherry picked from commit 412aa881f2)
2025-04-23 12:59:41 +00:00
Mark Andrews
71875eb25a Process NSID and DNS COOKIE options when returning BADVERS
This will help identify the broken server if we happen to break
EDNS version negotiation.  It will also help protect the client
from spoofed BADVERSION responses.

(cherry picked from commit 0d9cab1555)
2025-04-15 03:13:20 +00:00
Ondřej Surý
01a579d126 Don't pass edectx from fetch_and_forget
Pass NULL as edectx for the fetch_and_forget() fetches as nobody
is reading the EDE contexts and it can mess up the main client buffer.

(cherry picked from commit fe48290140)
2025-04-02 16:42:23 +00:00
Aram Sargsyan
70c0074043 Implement -T cookiealwaysvalid
When -T cookiealwaysvalid is passed to named, DNS cookie checks for
the incoming queries always pass, given they are structurally correct.

(cherry picked from commit 807ef8545d)
2025-03-17 11:39:16 +00:00
Evan Hunt
1f4ba71f56 detect when closest-encloser name is too long
there was a database bug in which dns_db_find() could get a partial
match for the query name, but still set foundname to match the full
query name.  this triggered an assertion when query_addwildcardproof()
assumed that foundname would be shorter.

the database bug has been fixed, but in case it happens again, we
can just copy the name instead of splitting it. we will also log a
warning that the closest-encloser name was invalid.
2025-03-17 09:27:09 +00:00
Mark Andrews
ae718fab53 Use a binary search to find the NSEC3 closest encloser
maxlabels is the suffix length that corresponds to the latest
NXDOMAIN response.  minlabels is the suffix length that corresponds
to the longest existing name found.

(cherry picked from commit 67f31c5046)
2025-03-17 09:27:09 +00:00
Colin Vidal
c8cb75d7b1 add support for EDE 20 (Not Authoritative)
Extended DNS Error message EDE 20 (Not Authoritative) is now sent when
the client requests recursion (RD) but the server has recursion disabled.

RFC 8914 mentions that EDE 20 should also be returned if the client doesn't
have the RD bit set (and recursion is needed), but that doesn't apply to
BIND, as BIND would try to resolve from the "deepest" referral in the
AUTHORITY section. For example, if the client asks for "www.isc.org/A"
but the server only knows the root domain, it will return NOERROR but
no answer for "www.isc.org/A", just the list of other servers to ask.

(cherry picked from commit 24ffbdcfea)
2025-03-13 11:57:21 +00:00
Aram Sargsyan
2d48cb33e3 Fix TTL issue with ANY queries processed through RPZ "passthru"
Answers to an "ANY" query which are processed by the RPZ "passthru"
policy have the response-policy's 'max-policy-ttl' value unexpectedly
applied. Do not change the records' TTL when RPZ uses a policy which
does not alter the answer.

(cherry picked from commit 5633dc90d3)
2025-02-27 09:22:01 +00:00
Evan Hunt
4f1f958d6d prevent a reference leak from the ns_query_done hooks
if the NS_QUERY_DONE_BEGIN or NS_QUERY_DONE_SEND hook is
used in a plugin and returns NS_HOOK_RETURN, some of the
cleanup in ns_query_done() can be skipped over, leading
to reference leaks that can cause named to hang on shut
down.

this has been addressed by adding more housekeeping
code after the cleanup: tag in ns_query_done().

(cherry picked from commit c2e4358267)
2025-02-26 00:55:51 +00:00
Aram Sargsyan
0add37862e Fix RPZ bug when resuming a query during a reconfiguration
After a reconfiguration the old view can be left without a valid
'rpzs' member: when the RPZ is not changed during the named
reconfiguration, 'rpzs' "migrates" from the old view into the new
view. When a query resumes, it can therefore find that 'qctx->view->rpzs'
is NULL, which query_resume() currently doesn't expect to happen if
it's recursing and 'qctx->rpz_st' is not NULL.

Fix the issue by adding a NULL-check. In order to not split the log
message to two different log messages depending on whether
'qctx->view->rpzs' is NULL or not, change the message to not log
the RPZ policy's "version" which is just a runtime counter and is
most likely not very useful for the users.

(cherry picked from commit 3ea2fbc238)
2025-02-21 11:45:45 +00:00
Colin Vidal
ccafa27b44 Use DNS_EDE_OTHER instead of its literal value
(cherry picked from commit 7c5678bb03)
2025-01-30 12:37:55 +00:00
Ondřej Surý
1ffb67a135 Split and simplify the use of EDE list implementation
Instead of mixing the dns_resolver and dns_validator units directly with
the EDE code, split out the dns_ede functionality into its own separate
compilation unit and hide the implementation details behind abstraction.

Additionally, the EDE codes are directly copied into the ns_client
buffers by passing the EDE context to dns_resolver_createfetch().

This makes the dns_ede implementation simpler to use, although slightly
more complicated on the inside.

Co-authored-by: Colin Vidal <colin@isc.org>
Co-authored-by: Ondřej Surý <ondrej@isc.org>
(cherry picked from commit 2f8e0edf3b)
2025-01-30 12:37:55 +00:00
Andoni Duarte Pintado
2d0323e006 Merge tag 'v9.20.5' into bind-9.20 2025-01-29 17:21:44 +01:00
Colin Vidal
6c65d70ce5 add support for EDE code 1 and 2
Add support for EDE codes 1 (Unsupported DNSKEY Algorithm) and 2
(Unsupported DS Digest Type), which might occur during DNSSEC
validation in case of an unsupported DNSKEY algorithm or DS digest type.

Because DNSSEC internally kicks off various fetches, we need to copy
all encountered extended errors from fetch responses to the fetch
context. Upon an event, the errors from the fetch context are copied
to the client response.

(cherry picked from commit 46a58acdf5)
2025-01-24 14:27:16 +01:00
Colin Vidal
e685443c74 add support for multiple EDE
The Extended DNS Error (EDE) mechanism allows several EDEs to be raised
during a DNS resolution (typically, a DNSSEC query will do multiple
fetches, each of which can produce an error). Add support for up to 3
EDEs in a DNS response. If duplicates occur (two EDEs with the
same code; the extra text is not compared), only the first one will be
part of the DNS answer.

Because the maximum number of EDEs is statically fixed, the `ns_client_t`
object owns a static vector of `DNS_DE_MAX_ERRORS` (instead of a linked
list, for instance). The array can be fully filled (all slots point to
an allocated `dns_ednsopt_t` object) or partially filled (or
empty). In the latter case, the first NULL slot means there are no more EDE
objects.

(cherry picked from commit 4096f27130)
2025-01-23 13:12:53 +00:00
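The fixed-size array described in "add support for multiple EDE" can be illustrated with the sketch below: three slots, the first NULL slot marks the end, and a second EDE with an already-present code is dropped without comparing the extra text. `MAX_EDE`, `ede_t`, `ede_list_t`, and the helper functions are hypothetical names for illustration and do not reproduce the BIND structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_EDE 3 /* hypothetical name; the commit allows up to 3 EDEs */

typedef struct {
	uint16_t code;
	const char *text; /* optional extra text, not compared */
} ede_t;

typedef struct {
	ede_t *slots[MAX_EDE]; /* filled from index 0; first NULL ends it */
} ede_list_t;

/* Add an EDE; returns false on duplicate code, full list, or no memory. */
static bool
ede_add(ede_list_t *list, uint16_t code, const char *text) {
	assert(list != NULL);

	for (size_t i = 0; i < MAX_EDE; i++) {
		if (list->slots[i] == NULL) {
			ede_t *ede = malloc(sizeof(*ede));
			if (ede == NULL) {
				return false;
			}
			ede->code = code;
			ede->text = text;
			list->slots[i] = ede;
			return true;
		}
		if (list->slots[i]->code == code) {
			return false; /* duplicate code: keep the first one */
		}
	}
	return false; /* all slots already in use */
}

/* Count entries by walking up to the first NULL slot. */
static size_t
ede_count(const ede_list_t *list) {
	size_t n = 0;

	assert(list != NULL);
	while (n < MAX_EDE && list->slots[n] != NULL) {
		n++;
	}
	return n;
}

/* Free every entry and leave the list empty. */
static void
ede_clear(ede_list_t *list) {
	assert(list != NULL);
	for (size_t i = 0; i < MAX_EDE; i++) {
		free(list->slots[i]);
		list->slots[i] = NULL;
	}
}
```

A fixed array keeps the per-client cost constant and makes the "list is full" case a simple loop exit, at the price of a hard cap on the number of EDEs per response.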