This re-do a previously existing EDE 22 system test as well as add
another one making sure the timed out flow detection works also on UDP
when the resolver is contacting the authoritative server. (the existing
test was using TCP to contact the authoritative servers).
Extended DNS error 22 (No reachable authority) was previously detected
when `fctx_expired` fired. It turns out this function is used as a
"safety net" and the timeout detection should be caught earlier.
It was working though, because of another issue fixed by !9927. Since
this change, the recursive request timed out detection occurs before
`fctx_expired` so EDE 22 is not added to the response message anymore.
The fix of the problem is to add the EDE 22 code in two situations:
- When the dispatch code timed out (rctx_timedout) the resolver code
checks various properties to figure out if it needs to make another
fetch attempt. One of the paramters if the fetch expiration time. If
it expires, the whole recursion is canceled, so it now adds the EDE 22
code.
- If the fetch expiration time doesn't expires in the case above (and
other parameters allows it) a new fetch attempt is made (fctx_query).
But before the new request is actually made, the fetch expiration time
is re-checked. It might then has elapsed, and the whole recursion is
canceled. So it now also adds the EDE 22 code here as well.
The new command is `rndc memprof`. The memory profiling status is also
reported inside `rndc status`. The status also shows whether named can
toggle memory profiling or not and if the server is built with jemalloc.
Closes#4759
Merge branch '4759-add-a-trigger-to-dump-jeprof-data-or-memory-statistics' into 'main'
See merge request isc-projects/bind9!9370
The new command is `rndc memprof`. The memory profiling status is also
reported inside `rndc status`. The status also shows whether named can
toggle memory profiling or not and if the server is built with jemalloc.
The changelog job is supposed to test that the text from GitLab MR
title&description is valid rst syntax and can be built with sphinx. In
49128fc1, the way gitchangelog generates entries was changed - it no
longer writes to the changelog file, but generates output on stdout
instead. Ensure the generated notes is actually written to (some)
rendered file which is part of the docs so that the subsequent sphinx
build attempts to render the note.
Merge branch 'nicki/ci-fix-changelog-job' into 'main'
See merge request isc-projects/bind9!9804
The changelog job is supposed to test that the text from GitLab MR
title&description is valid rst syntax and can be built with sphinx. In
49128fc1, the way gitchangelog generates entries was changed - it no
longer writes to the changelog file, but generates output on stdout
instead. Ensure the generated notes is actually written to (some)
rendered file which is part of the docs so that the subsequent sphinx
build attempts to render the note.
Add support for EDE codes 1 & 2 which might occurs during DNSSEC validation in case of unsupported RRSIG algorithm or DNSKEY digest.
See #2715
Merge branch '2715-ede-unsupported-digest-alg' into 'main'
See merge request isc-projects/bind9!9948
A DNSSEC validation can fail in the case where multiple DNSKEY are
available for a zone and none of them are supported, but for different
reasons: one has a DS record in the parent zone using an unsupported
digest while the other one uses an unsupported encryption algorithm.
Add a specific test case covering this flow and making sure that two
extended DNS error are provided: code 1 and 2, each of them highlighting
unsupported algorithm and digest.
Add support for EDE codes 1 (Unsupported DNSKEY Algorithm) and 2
(Unsupported DS Digest Type) which might occurs during DNSSEC
validation in case of unsupported DNSKEY algorithm or DS digest type.
Because DNSSEC internally kicks off various fetches, we need to copy
all encountered extended errors from fetch responses to the fetch
context. Upon an event, the errors from the fetch context are copied
to the client response.
The minimal dnspython version to run this test is 2.5.0.
Merge branch 'mnowak/pytest_rewrite_cipher-suites' into 'main'
See merge request isc-projects/bind9!8662
The servers are setup and torn down once per each test module. All the
logs and server state persists between individual tests within the same
module. The servers fixture representing these servers should be
module-wide as well.
The base image tends to have a rather old dnspython version and when
used with pylint and mypy it produces errors about newer dnspython
features the old version does not know about.
$ mypy "bin/tests/system/isctest/"
bin/tests/system/isctest/query.py:55: error: Unexpected keyword argument "verify" for "tls" [call-arg]
/usr/lib/python3/dist-packages/dns/query.py:958: note: "tls" defined here
$ pylint --rcfile $CI_PROJECT_DIR/.pylintrc --disable=wrong-import-position $(git ls-files 'bin/tests/system/*.py' | grep -vE 'ans\.py')
************* Module isctest.query
bin/tests/system/isctest/query.py:55:11: E1123: Unexpected keyword argument 'verify' in function call (unexpected-keyword-arg)
When explicitly set to True, the "verify" argument lets dnspython verify
certificates used for the connection. As most certificates in the system
test will inevitably be self-signed, the "verify" argument defaults to
False.
The "verify" argument is present in dnspython since the version 2.5.0.
A number of result codes are obsolete and can be removed. Others, including `ISC_R_NOMEMORY`, are still checked in various places even though they can't occur any longer. These have been cleaned up.
Merge branch 'each-cleanup-results' into 'main'
See merge request isc-projects/bind9!9942
ISCCC_R_SYNTAX, ISCCC_R_EXPIRED, and ISCCC_R_CLOCKSKEW have the
same usage and text formats as DNS_R_SYNTAX, DNS_R_EXPIRED and
DNS_R_CLOCKSCREW respectively. this was originally done because
result codes were defined in separate libraries, and some tool
might be linked with libisccc but not libdns. as the result codes
are now defined in only one place, there's no need to retain the
duplicates.
building BIND without crypto support is no longer possible.
consequently this result code is never sent, and therefore we
don't need code in calling functions to handle it.
the isc_mem allocation functions can no longer fail; as a result,
ISC_R_NOMEMORY is now rarely used: only when an external library
such as libjson-c or libfstrm could return NULL. (even in
these cases, arguably we should assert rather than returning
ISC_R_NOMEMORY.)
code and comments that mentioned ISC_R_NOMEMORY have been
cleaned up, and the following functions have been changed to
type void, since (in most cases) the only value they could
return was ISC_R_SUCCESS:
- dns_dns64_create()
- dns_dyndb_create()
- dns_ipkeylist_resize()
- dns_kasp_create()
- dns_kasp_key_create()
- dns_keystore_create()
- dns_order_create()
- dns_order_add()
- dns_peerlist_new()
- dns_tkeyctx_create()
- dns_view_create()
- dns_zone_setorigin()
- dns_zone_setfile()
- dns_zone_setstream()
- dns_zone_getdbtype()
- dns_zone_setjournal()
- dns_zone_setkeydirectory()
- isc_lex_openstream()
- isc_portset_create()
- isc_symtab_create()
(the exception is dns_view_create(), which could have returned
other error codes in the event of a crypto library failure when
calling isc_file_sanitize(), but that should be a RUNTIME_CHECK
anyway.)
Adjust the limit of maximum disagreements in respdiff results based on
recent pipeline results.
The respdiff and respdiff:asan seem to have almost identical results,
typically around 0.07 % of differences with ocassional spikes up to
around 0.11 %. Similar results are for respdiff:tsan, perhaps with more
common spikes with values up to around 0.12 %. Set the limit to 0.15 %
to allow for some tolerance due to network conditions, time of day etc.
The respdiff:third-party has a slightly higher disagreements average,
with typical values being around 0.12 %. Set the limit to 0.2 %.
Exceeding either of those values should be quite clear indication that
some resolution behaviour has changed, since the values appear to be
very stable within the newly configured limits.
Merge branch 'nicki/ci-respdiff-limits' into 'main'
See merge request isc-projects/bind9!9950
Adjust the limit of maximum disagreements in respdiff results based on
recent pipeline results.
The respdiff and respdiff:asan seem to have almost identical results,
typically around 0.07 % of differences with ocassional spikes up to
around 0.11 %. Similar results are for respdiff:tsan, perhaps with more
common spikes with values up to around 0.12 %. Set the limit to 0.15 %
to allow for some tolerance due to network conditions, time of day etc.
The respdiff:third-party has a slightly higher disagreements average,
with typical values being around 0.12 %. Set the limit to 0.2 %.
Exceeding either of those values should be quite clear indication that
some resolution behaviour has changed, since the values appear to be
very stable within the newly configured limits.
There was confusion about whether the interval was calculated from
the validity period provided on the command line (with -s and -e),
or from the signature being replaced.
Add text to clarify that the interval is calculated from the new
validity period.
Closes#5128
Merge branch '5128-clarify-dnssec-signzone-interval' into 'main'
See merge request isc-projects/bind9!9955
There was confusion about whether the interval was calculated from
the validity period provided on the command line (with -s and -e),
or from the signature being replaced.
Add text to clarify that the interval is calculated from the new
validity period.
In the case when `dnssec-signzone` is called on an already signed zone, and the private key file is unavailable, a signature that needs to be refreshed may be dropped without being able to generate a replacement. This has been fixed.
Closes#5126
Merge branch '5126-dnssec-signzone-retain-rrsig-if-key-is-offline' into 'main'
See merge request isc-projects/bind9!9951
Track inside the dns_dnsseckey structure whether we have seen the
private key, or if this key only has a public key file.
If the key only has a public key file, or a DNSKEY reference in the
zone, mark the key 'pubkey'. In dnssec-signzone, if the key only
has a public key available, consider the key to be offline. Any
signatures that should be refreshed for which the key is not available,
retain the signature.
So in the code, 'expired' becomes 'refresh', and the new 'expired'
is only used to determine whether we need to keep the signature if
the corresponding key is not available (retaining the signature if
it is not expired).
In the 'keysthatsigned' function, we can remove:
- key->force_publish = false;
- key->force_sign = false;
because they are redundant ('dns_dnsseckey_create' already sets these
values to false).
Add a test case for the scenario below.
There is a case when signing a zone with dnssec-signzone where the
private key file is moved outside the key directory (for offline
ksk purposes), and then the zone is resigned. The signature of the
DNSKEY needs refreshing, but is not expired.
Rather than removing the signature without having a valid replacement,
leave the signature in the zone (despite it needs to be refreshed).
If the generated status output exceeds 4096 it was silently truncated, now we output that the status was truncated.
Closes#4180
Merge branch '4180-possible-truncation-in-dns_keymgr_status' into 'main'
See merge request isc-projects/bind9!9905
Extended DNS error mechanism (EDE) may have several errors raised during a DNS resolution. `named` is now able to add up to three EDE codes in a DNS response. In the case of duplicate error codes, only the first one will be part of the DNS response.
Closes#5085
Merge branch '5085-multiple-ede' into 'main'
See merge request isc-projects/bind9!9952
Extended DNS error mechanism (EDE) enables to have several EDE raised
during a DNS resolution (typically, a DNSSEC query will do multiple
fetches which each of them can have an error). Add support to up to 3
EDE errors in an DNS response. If duplicates occur (two EDEs with the
same code, the extra text is not compared), only the first one will be
part of the DNS answer.
Because the maximum number of EDE is statically fixed, `ns_client_t`
object own a static vector of `DNS_DE_MAX_ERRORS` (instead of a linked
list, for instance). The array can be fully filled (all slots point to
an allocated `dns_ednsopt_t` object) or partially filled (or
empty). In such case, the first NULL slot means there is no more EDE
objects.
When 'ISC_R_TIMEDOUT' is received in 'tcp_recv()', it times out the
oldest response in the active responses queue, and only after that it
checks whether other active responses have also timed out. So when
setting a timeout value for a read operation after a successful
connection, it makes sense to take the timeout value from the oldest
response in the active queue too, because, theoretically, the responses
can have different timeout values, e.g. when the TCP dispatch is shared.
Currently 'resp' is always NULL. Previously when connect and read timeouts
were not separated in dispatch this affected only logging, but now since
we are setting a new timeout after a successful connection, we need to
choose a suitable response from the active queue.
Merge branch 'aram/dispatch-tcp_connected-fix' into 'main'
See merge request isc-projects/bind9!9927
Since the read timeout now works, the resolver time outs from the
dispatch level instead of from the "hung fetch" timer, and so the
EDE value in 'fctx_expired()' is not being set. Remove the expected
EDE value from the test.
When TCP is used, 'fctx_query()' adds one second to the rtt
(round-trip time) value, but there's a bug when the decision
about using TCP is made already after the calculation. Move the
block of the code which looks up the peers list to decide
whether to use TCP into a place that's before the rtt calculation
is performed. This commit doesn't add or remove any code, it just
moves the code and adds a comment block.
When 'ISC_R_TIMEDOUT' is received in 'tcp_recv()', it times out the
oldest response in the active responses queue, and only after that it
checks whether other active responses have also timed out. So when
setting a timeout value for a read operation after a successful
connection, it makes sense to take the timeout value from the oldest
response in the active queue too, because, theoretically, the responses
can have different timeout values, e.g. when the TCP dispatch is shared.
Currently 'resp' is always NULL. Previously when connect and read
timeouts were not separated in dispatch this affected only logging, but
now since we are setting a new timeout after a successful connection,
we need to choose a suitable response from the active queue.
Prevent lock contention among many worker threads referring to the same database node at the same time. This would improve zone and cache database performance for the heavily contended database nodes.
Closes#5130
Merge branch '5130-reduce-lock-contention-in-decrement-reference' into 'main'
See merge request isc-projects/bind9!9963
Previously, this function always acquires a node write lock if it
might need node cleanup in case the reference decrements to 0. In
fact, the lock is unnecessary if the reference is larger than 1 and it
can be optimized as an "easy" case. This optimization could even be
"necessary". In some extreme cases, many worker threads could repeat
acquring and releasing the reference on the same node, resulting in
severe lock contention for nothing (as the ref wouldn't decrement to 0
in most cases). This change would prevent noticeable performance
drop like query timeout for such cases.
Co-authored-by: JINMEI Tatuya <jtatuya@infoblox.com>
Co-authored-by: Ondřej Surý <ondrej@isc.org>