when a query for type ANY triggers a delegation response, all
additional data was omitted from the response, including
mandatory glue. this has been corrected.
previously, there were over 40 separate definitions of CHECK macros, of
which most used "goto cleanup", and the rest "goto failure" or "goto
out". there were another 10 definitions of RETERR, of which most were
identical to CHECK, but some simply returned a result code instead of
jumping to a cleanup label.
this has now been standardized throughout the code base: RETERR is for
returning an error code in the case of an error, and CHECK is for jumping
to a cleanup tag, which is now always called "cleanup". both macros are
defined in isc/util.h.
(cherry picked from commit 52bba5cc34)
Maintain the relationship between the parent and child fetch and when
creating a new child fetch, properly check the resolution loops that
would lead to a new fetch would join one of the parent's fetch contexts.
(cherry picked from commit 4d307ac67a)
In named_config_parsefile(), when checking the validity of
named.conf, the checking of plugin correctness was deliberately
postponed until the plugin is loaded and registered. However,
when the plugin was registered, the checking was never actually
done: the plugin_register() implementation was called, but
plugin_check() was not.
This made it necessary to duplicate the correctness checking in both
functions, so that both named-checkconf and named could catch errors.
That should not be required.
ns_plugin_register() now calls the check function before the register
function, and aborts if either one fails. ns_plugin_check() calls only
the check function. ns_plugin_check() is used by named-checkconf, and
ns_plugin_register() is used by named. (Note: this design has a
side effect that a call to ns_plugin_register() will result in the
plugin parameters being parsed twice at registration time.)
Partial backport of !11031
When disabling algorithms, use the signer name to determine if the
algorithm is disabled or not. This allows for algorithms to be
cleanly disabled on a zone level basis. Previously, just using the
records owner name, "disable-algorithms" could impact resolution of
names that where not disabled. This does now mean that
"disable-algorithms" can not be used to disable part of a zone anymore.
(cherry picked from commit a0945f6337)
Since the log level has been raised, busy servers can "explode" from
the amount of log messages. Use the usual practice of logging "every
once in a while".
(cherry picked from commit 1962857ac4)
The "RPZ not ready yet" message is logged at debug 3 level. Use the
info level instead for better visibility.
After raising the log level, the rpz_log_fail_helper() function starts
appending " failed: " the the message. Change the log message so it
makes more sense.
(cherry picked from commit 49356ce944)
In order to not pollute the SERVFAIL cache with the configured
SERVFAIL answers while RPZ is loading, set the NS_CLIENTATTR_NOSETFC
attribute for the client.
(cherry picked from commit d9b5f6c502)
The complier claims that 'qresult_type' may be used uninitialized,
though all the cases inside the switch either set the variable
or return from the function, and the warning is generated on a line
after the switch-case block.
Slightly modify the code to set a default value for the variable when
declaring it.
In function 'rpz_rewrite',
inlined from 'query_checkrpz' at query.c:7288:12,
inlined from 'query_gotanswer' at query.c:7724:12:
query.c:4693:14: error: 'qresult_type' may be used uninitialized [-Werror=maybe-uninitialized]
4693 | !dnsrps_set_p(&emsg, client, st, qtype, &rdataset,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
4694 | qresult_type != qresult_type_recurse))
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
query.c: In function 'query_gotanswer':
query.c:4268:24: note: 'qresult_type' was declared here
4268 | qresult_type_t qresult_type;
| ^~~~~~~~~~~~
cc1: all warnings being treated as errors
By default, when named is started it may start answering to
queries before the response policy zones are completely loaded
and processed. This new feature gives an option to the users to
tell named that incoming requests should result in SERVFAIL anwser
until all the response policy zones are procesed and ready.
(cherry picked from commit 41387b8d30)
When the zone is configured with a CNAME override policy, also add the
configured EDE code.
When the zone is contains a wildcard CNAME, also add the configured
EDE code.
(cherry picked from commit 2f70a0ef12)
A serve-stale refresh is similar to a prefetch, the only difference
is when it triggers. Where a prefetch is done when an RRset is about
to expire, a serve-stale refresh is done when the RRset is already
stale.
This means that the check for the stale-refresh window needs to
move into query_stale_refresh(). We need to clear the
DNS_DBFIND_STALEENABLED option at the same places as where we clear
DNS_DBFIND_STALETIMEOUT.
Now that serve-stale refresh acts the same as prefetch, there is no
worry that the same rdataset is added to the message twice. This makes
some code obsolete, specifically where we need to clear rdatasets from
the message.
(cherry picked from commit a66b04c8d4)
As mentioned in the comments block before the changed code block,
the dropped or slipped responses should be logged in the query
category (or rather query-errors category as done in lib/ns/client.c),
so that requests are not silently lost.
Also fix a couple of errors/typos in the code comments.
(cherry picked from commit 27e7961479)
If ns__query_start() is called because of a chained query (e.g.
after encountering a CNAME), a previously set DNS_DBFIND_STALETIMEOUT
flag on the query's 'dboptions' field can cause an assertion
failure if the new query's 'stalefirst' value is not true (e.g. if the
target qname is an authoritative zone for the server). Reset the
DNS_DBFIND_STALETIMEOUT flag in the query_lookup() function before
evaluating the 'stalefirst' value, and make sure to assign a fresh
value to the `stalefirst' flag instead of conditionally assigning it
only if the value is 'true'.
(cherry picked from commit 3d8bd8bbf1)
Coverity detected that 'optlen' was not being checked in 'process_opt'.
This is actually already done when the OPT record was initially
parsed. Add an INSIST to silence Coverity as is done in message.c.
(cherry picked from commit 72cd6e8591)
This new option sets the delay, in seconds, to wait before sending
a set of NOTIFY messages for a zone. Whenever a NOTIFY message is
ready to be sent, sending will be deferred for this duration.
(cherry picked from commit e42d6b4810)
When 'stale-answer-client-timeout' is 0, named is allowed to return
a stale answer immediately, while also initiating a new query to get
the real answer. This mode is activated in ns__query_start() by setting
the 'qctx->options.stalefirst' optoin to 'true' before calling the
query_lookup() function, but not when the zone is known to be
authoritative to the server. When the zone is authoritative, and
query_looup() finds out that the requested name is a delegation,
then before proceeding with the query, named tries to look it up
in the cache first. Here comes the issue that it doesn't consider
enabling 'qctx->options.stalefirst' in this case, and so the
'stale-answer-client-timeout 0' setting doesn't work for those
delegated zones - instead of immediately returning the stale answer
(if it exists), named tries to resolve it.
Fix this issue by enabling 'qctx->options.stalefirst' in the
query_zone_delegation() function just before named looks up the name
in the cache using a new query_lookup() call. Also, if nothing was
found in the cache, don't initiate another query_lookup() from inside
query_notfound(), and let query_notfound() do its work, i.e. it will
call query_delegation() for further processing.
(cherry picked from commit 412aa881f2)
This will help identify the broken server if we happen to break
EDNS version negotiation. It will also help protect the client
from spoofed BADVERSION responses.
(cherry picked from commit 0d9cab1555)
Pass NULL as edectx for the fetch_and_forget() fetches as nobody
is reading the EDE contexts and it can mess the main client buffer.
(cherry picked from commit fe48290140)
When -T cookiealwaysvalid is passed to named, DNS cookie checks for
the incoming queries always pass, given they are structurally correct.
(cherry picked from commit 807ef8545d)
there was a database bug in which dns_db_find() could get a partial
match for the query name, but still set foundname to match the full
query name. this triggered an assertion when query_addwildcardproof()
assumed that foundname would be shorter.
the database bug has been fixed, but in case it happens again, we
can just copy the name instead of splitting it. we will also log a
warning that the closest-encloser name was invalid.
maxlabels is the suffix length that corresponds to the latest
NXDOMAIN response. minlabels is the suffix length that corresponds
to longest found existing name.
(cherry picked from commit 67f31c5046)
Extended DNS Error message EDE 20 (Not Authoritative) is now sent when
client request recursion (RD) but the server has recursion disabled.
RFC 8914 mention EDE 20 should also be returned if the client doesn't
have the RD bit set (and recursion is needed) but it doesn't apply for
BIND as BIND would try to resolve from the "deepest" referral in
AUTHORITY section. For example, if the client asks for "www.isc.org/A"
but the server only knows the root domain, it will returns NOERROR but
no answer for "www.isc.og/A", just the list of other servers to ask.
(cherry picked from commit 24ffbdcfea)
Answers to an "ANY" query which are processed by the RPZ "passthru"
policy have the response-policy's 'max-policy-ttl' value unexpectedly
applied. Do not change the records' TTL when RPZ uses a policy which
does not alter the answer.
(cherry picked from commit 5633dc90d3)
if the NS_QUERY_DONE_BEGIN or NS_QUERY_DONE_SEND hook is
used in a plugin and returns NS_HOOK_RETURN, some of the
cleanup in ns_query_done() can be skipped over, leading
to reference leaks that can cause named to hang on shut
down.
this has been addressed by adding more housekeeping
code after the cleanup: tag in ns_query_done().
(cherry picked from commit c2e4358267)
After a reconfiguration the old view can be left without a valid
'rpzs' member, because when the RPZ is not changed during the named
reconfiguration 'rpzs' "migrate" from the old view into the new
view, so when a query resumes it can find that 'qctx->view->rpzs'
is NULL which query_resume() currently doesn't expect to happen if
it's recursing and 'qctx->rpz_st' is not NULL.
Fix the issue by adding a NULL-check. In order to not split the log
message to two different log messages depending on whether
'qctx->view->rpzs' is NULL or not, change the message to not log
the RPZ policy's "version" which is just a runtime counter and is
most likely not very useful for the users.
(cherry picked from commit 3ea2fbc238)
Instead of mixing the dns_resolver and dns_validator units directly with
the EDE code, split-out the dns_ede functionality into own separate
compilation unit and hide the implementation details behind abstraction.
Additionally, the EDE codes are directly copied into the ns_client
buffers by passing the EDE context to dns_resolver_createfetch().
This makes the dns_ede implementation simpler to use, although sligtly
more complicated on the inside.
Co-authored-by: Colin Vidal <colin@isc.org>
Co-authored-by: Ondřej Surý <ondrej@isc.org>
(cherry picked from commit 2f8e0edf3b)
Add support for EDE codes 1 (Unsupported DNSKEY Algorithm) and 2
(Unsupported DS Digest Type) which might occurs during DNSSEC
validation in case of unsupported DNSKEY algorithm or DS digest type.
Because DNSSEC internally kicks off various fetches, we need to copy
all encountered extended errors from fetch responses to the fetch
context. Upon an event, the errors from the fetch context are copied
to the client response.
(cherry picked from commit 46a58acdf5)
Extended DNS error mechanism (EDE) enables to have several EDE raised
during a DNS resolution (typically, a DNSSEC query will do multiple
fetches which each of them can have an error). Add support to up to 3
EDE errors in an DNS response. If duplicates occur (two EDEs with the
same code, the extra text is not compared), only the first one will be
part of the DNS answer.
Because the maximum number of EDE is statically fixed, `ns_client_t`
object own a static vector of `DNS_DE_MAX_ERRORS` (instead of a linked
list, for instance). The array can be fully filled (all slots point to
an allocated `dns_ednsopt_t` object) or partially filled (or
empty). In such case, the first NULL slot means there is no more EDE
objects.
(cherry picked from commit 4096f27130)
When answering queries, don't add data to the additional section if
the answer has more than 13 names in the RDATA. This limits the
number of lookups into the database(s) during a single client query,
reducing query processing load.
Also, don't append any additional data to type=ANY queries. The
answer to ANY is already big enough.
(cherry picked from commit a1982cf1bb)
In dns_zone_getdnssecsignstats, dns_zone_getrcvquerystats and
dns_zone_getrequeststats attach to the statistics structure.
(cherry picked from commit fb50a71159)
Add support for Extended DNS Errors (EDE) error 22: No reachable
authority. This occurs when after a timeout delay when the resolver is
trying to query an authority server.
(cherry picked from commit d13e94b930)
Commit amended in order to fix usage of isc_log_write (adding dns_lctx
parameter)
Previously, the update policy rules check was moved earlier in the
sequence, and the keep rule match pointers were kept to maintain the
ability to verify maximum records by type.
However, these pointers can become invalid if server reloading
or reconfiguration occurs before update completion. To prevent
this issue, extract the maximum records by type value immediately
during processing and only keep the copy of the values instead of the
full ssurule.
(cherry picked from commit 44a54a29d8)
Instead of cleaning the dns_badcache opportunistically, add per-loop
LRU, so each thread-loop can clean the expired entries. This also
allows removal of the atomic operations as the badcache entries are now
immutable, instead of updating the badcache entry in place, the old
entry is now deleted from the hashtable and the LRU list, and the new
entry is inserted in the LRU.
(cherry picked from commit 2cb5a6210f)
Re-split format strings that had been poorly split by multiple
clang-format runs using different versions of clang-format.
(cherry picked from commit a24d6e1654)