The `max-rsa-exponent-size` could limit the exponents of the RSA
public keys during the DNSSEC verification. Instead of providing
a cryptic (not cryptographic) knob, hardcode the max exponent to
be 4096 (the theoretical maximum for DNSSEC).
The DST API has been cleaned up, duplicate functions has been squashed
into single call (verify and verify2 functions), and couple of unused
functions have been completely removed (createctx2, computesecret,
paramcompare, and cleanup).
This new option sets a delay (in seconds) to wait before sending
a set of NOTIFY messages for a zone. Whenever a NOTIFY message is
ready to be sent, sending will be deferred for this duration. This
option is not to be confused with the :any:`notify-delay` option.
The default is 0 seconds.
Closes#5259
Merge branch '5259-implement-zone-notify-defer' into 'main'
See merge request isc-projects/bind9!10419
This new option sets the delay, in seconds, to wait before sending
a set of NOTIFY messages for a zone. Whenever a NOTIFY message is
ready to be sent, sending will be deferred for this duration.
The test_idle_timeout check in the "timeouts" system test has been
failing often on FreeBSD 13 AWS hosts. Adding timestamped debug logging
shows that the time.sleep() calls used in that check are returning
significantly later than asked to on that platform (e.g. after 4 seconds
when just 1 second is requested), breaking the test's timing assumptions
and triggering false positives. These failures are not an indication of
a bug in named and have not been observed on any other platform. Mark
the problematic check as flaky, but only on FreeBSD 13, so that other
failure modes are caught appropriately.
Merge branch 'michal/mark-test_idle_timeout-as-flaky-on-freebsd-13' into 'main'
See merge request isc-projects/bind9!10459
The test_idle_timeout check in the "timeouts" system test has been
failing often on FreeBSD 13 AWS hosts. Adding timestamped debug logging
shows that the time.sleep() calls used in that check are returning
significantly later than asked to on that platform (e.g. after 4 seconds
when just 1 second is requested), breaking the test's timing assumptions
and triggering false positives. These failures are not an indication of
a bug in named and have not been observed on any other platform. Mark
the problematic check as flaky, but only on FreeBSD 13, so that other
failure modes are caught appropriately.
The debug level (set with the `-d` option) was ignored when running `named` with the `-g` and `-u` options.
Merge branch 'each-fix-debug-level' into 'main'
See merge request isc-projects/bind9!10453
In commit cc167266aa, the -g option was changed so it sets both
named_g_logstderr and also named_g_logflags to use ISO style timestamps
with tzinfo. Together with an error in named_log_setsafechannels(), that
change could cause the debugging level to be ignored.
The wrong NSEC3 records were sometimes returned as proof that the QNAME
did not exist. This has been fixed.
Closes#5292
Merge branch '5292-wrong-nsec3-chosen-for-no-qname-proof' into 'main'
See merge request isc-projects/bind9!10447
When we optimised the closest encloser NSEC3 discovery the maxlabels
variable was used in the binary search. The updated value was later
used to add the NO QNAME NSEC3 but that block of code needed the
original value. This resulted in the wrong NSEC3 sometimes being
chosen to perform this role.
Some domains tested by linkchecker may think that we connect to them too
often and will refuse connection or reply with an error code, which makes
this job fail. Let's check links only on Wednesdays.
Merge branch 'mnowak/run-linkchecker-only-sometimes' into 'main'
See merge request isc-projects/bind9!10439
Some domains tested by linkchecker may think that we connect to them too
often and will refuse connection or reply with and error code, which
makes this job fail. Let's check links only on Wednesdays.
The check fails with the following error for some time:
broken https://www.gnu.org/software/libidn/#libidn2 - HTTPSConnectionPool(host='www.gnu.org', port=443): Max retries exceeded with url: /software/libidn/ (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f5bd4c14590>: Failed to establish a new connection: [Errno 111] Connection refused'))
Merge branch 'mnowak/linkcheck-disable-www-gnu-org' into 'main'
See merge request isc-projects/bind9!10436
The check fails with the following error for some time:
broken https://www.gnu.org/software/libidn/#libidn2 - HTTPSConnectionPool(host='www.gnu.org', port=443): Max retries exceeded with url: /software/libidn/ (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f5bd4c14590>: Failed to establish a new connection: [Errno 111] Connection refused'))
The two-tone ksr subtest (test_ksr_twotone) depended on the dnssec-policy keys algorithm values in named.conf being entered in numerical order. As the algorithms used in the test can be selected randomly this does not always happen. Sort the dnssec-policy keys by algorithm when adding them to the key list from named.conf.
Closes#5286
Merge branch '5286-ksr-two-tone-test-only-work-by-luck' into 'main'
See merge request isc-projects/bind9!10395
Extract each section of the bundle and check that the expected
records are there. The old code was assuming that the records in
each section where in a particular order which didn't happen in
practice.
The return value was sometimes being ignored when it shouldn't
have been.
Closes#5301
Merge branch '5301-cid-550216-remove-dead-code' into 'main'
See merge request isc-projects/bind9!10432
Over the past few years, some of the initial decisions made about which
GitLab CI jobs to run for all merge requests and which of them to run
just for scheduled/web-triggered pipelines turned out to be less than
ideal in practice: test coverage was found to be too lax in some areas
and on the other hand unnecessarily repetitive in others. For example,
compilation failures for certain build types that are not exercised for
every merge request (e.g. FIPS-enabled builds) turned out to be much
more common in practice than e.g. test failures happening only on a
subset of releases of a given Linux distribution.
To limit excessive resource use while retaining broad test coverage,
adjust GitLab CI job triggering rules for merge request pipelines as
follows:
- run all possible build jobs for every merge request; compilation
failures triggered for build flavors that were only tested in
scheduled pipelines turned out to be surprisingly commonplace and
became a nuisance over time, particularly given that the run times
of build jobs are much lower than those of test jobs,
- for every merge request, run at least one system & unit test job for
each build flavor (e.g. sanitizer-enabled, FIPS-enabled,
out-of-tree, tarball-based, etc.),
- limit the amount of test jobs run for each distinct operating
system; for example, only run system & unit test jobs for Ubuntu
24.04 Noble Numbat in merge request pipelines, skipping those for
Ubuntu 22.04 Jammy Jellyfish and Ubuntu 20.04 Focal Fossa (while
still running them in other pipeline types, e.g. in scheduled
pipelines),
- ensure every merge request is tested on Oracle Linux 8, which is the
operating system with the oldest package versions out of the systems
that are still supported by this BIND 9 branch,
- decrease the number of test jobs run with sanitizers enabled while
still testing with both ASAN and TSAN and both GCC and Clang for
every merge request.
These changes do not affect the set of jobs created for any other
pipeline type (triggered by a schedule, by a GitLab API call, by the web
interface, etc.); only merge request pipelines are affected.
With the ongoing process of moving CI workloads to AWS, OpenBSD poses a
challenge, as there is no OpenBSD AMI image in the AWS catalog. Building
our image from scratch is disproportionately complicated, given that
OpenBSD is not a common deployment platform for BIND 9. Otherwise,
OpenBSD stays at the "Best-Effort" level of support.
Merge branch 'mnowak/drop-openbsd-from-ci' into 'main'
See merge request isc-projects/bind9!10375
With the ongoing process of moving CI workloads to AWS, OpenBSD poses a
challenge, as there is no OpenBSD AMI image in the AWS catalog. Building
our image from scratch is disproportionately complicated, given that
OpenBSD is not a common deployment platform for BIND 9. Otherwise,
OpenBSD stays at the "Best-Effort" level of support.
If a call_rcu thread is running, there is a possible race condition
where the destructors run before all call_rcu callbacks have finished
running. This can happen, for example, if the call_rcu callback tries to
log something after the logging context has been torn down.
In !10394, we tried to counter this by explicitely creating a call_rcu
thread an shutting it down before running the destructors, but it is
possible for things to "slip" and end up on the default call_rcu thread.
As a quickfix, this commit moves an rcu_barrier() that was in the mem
context destructor earlier, so that it "protects" all libisc
destructors.
Closes#5296
Merge branch '5296-join-rcu-thread-on-shutdown' into 'main'
See merge request isc-projects/bind9!10423
If a call_rcu thread is running, there is a possible race condition
where the destructors run before all call_rcu callbacks have finished
running. This can happen, for example, if the call_rcu callback tries to
log something after the logging context has been torn down.
In !10394, we tried to counter this by explicitely creating a call_rcu
thread an shutting it down before running the destructors, but it is
possible for things to "slip" and end up on the default call_rcu thread.
As a quickfix, this commit moves an rcu_barrier() that was in the mem
context destructor earlier, so that it "protects" all libisc
destructors.
These tests do not easily fit in the standard test case framework, so they go into their own suite.
- zsk retired case
- checkds cases
- reload/restart
- inheritance tests
Merge branch 'matthijs-pytest-rewrite-kasp-system-test-4' into 'main'
See merge request isc-projects/bind9!10278
These tests ensure that if dnssec-policy is set on a higher level, the
zone is still signed (or unsigned) as expected. Or if a higher level
has an override, the new policy is honored as expected.
The new `tcp-primaries-timeout` configuration option works the same way
as the older `tcp-initial-timeout` option, but applies only to the TCP
connections made to the primary servers, so that the timeout value can
be set separately for them. By default, it's set to 150, which is 15
seconds.
Closes#3649
Merge branch '3649-configurable-xfr-tcp-timeouts' into 'main'
See merge request isc-projects/bind9!9376
The isc_nm_getinitialtimeout() function (and also the previously used
isc_nm_gettimeouts() function) returns timeout value(s) in milliseconds,
while the dns_request_create() function expects timeout values in
seconds. Fix the bug by dividing the timeout value by MS_PER_SEC.
There is no added test, because it turns out delv doesn't support
setting custom timeout values (as opposed to what is suggested in
its man page). Tests should be added later when the '+timeout=T'
option is implemented.
Previously all kinds of TCP timeouts had a single getter and setter
functions. Separate each timeout to its own getter/setter functions,
because in majority of cases only one is required at a time, and it's
not optimal expanding those functions every time a new timeout value
is implemented.
Since notify messages now use the configured 'tcp-initial-timeout'
connect timeout value, the existing "checking notify retries expire
within 30 seconds" check in the "notify" system test is failing. Set
the 'tcp-initial-timeout' option for ns3 to the previously hardcoded
value of 15 seconds for the test to pass successfully.
The checkds_send_toaddr() function uses hardcoded timeout values
for both UDP and TCP, however, with TCP named has configurable
timeout values. Slightly refactor the timeouts calculation part
and use the configured 'tcp-initial-timeout' value as the connect
timeout.
The notify_send_toaddr() function uses hardcoded timeout values
for both UDP and TCP, however, with TCP named has configurable
timeout values. Slightly refactor the timeouts calculation part
and use the configured 'tcp-initial-timeout' value as the connect
timeout.
The new 'tcp-primaries-timeout' configuration option works the same way
as the existing 'tcp-initial-timeout' option, but applies only to the
TCP connections made to the primary servers, so that the timeout value
can be set separately for them. The default is 15 seconds.
Also, while accommodating zone.c's code to support the new option, make
a light refactoring with the way UDP timeouts are calculated by using
definitions instead of hardcoded values.
Write python-based tests for the many test cases from the kasp system test with the same pattern.
Merge branch 'matthijs-pytest-rewrite-kasp-system-test-3' into 'main'
See merge request isc-projects/bind9!10268
For 'keystore.kasp', a setting 'key-directories' is used. If set, this
will expect a list of two directories, the first one is where the KSKs
will be stored, the second in the list is the ZSK key directory. This
may be expanded in the future to test more complex key storage cases.
The 'rumoured.kasp' zone is weird, the key timings can never match
those key states. But it is a regression test for an early day bug,
so we convert it, but skip the expected key times check.