dns_dnssec_updatekeys's 'report' could be called with invalid arguments
which the compiler should be be able to detect.
(cherry picked from commit 7a0a2fc3e4)
The dnstap system test fails intermittently, and it appears to be
a timing issue - adding a short delay after running 'fstrm_capture',
and before running 'dnstap -reopen' improves the situation from
50% failures (5 out of 10 times) to 0% failures (0 out of 20 times),
tested locally.
The reason is that 'fstrm_capture' is executed in the background,
and due to OS scheduling and other factors, the listener socket
may not be ready when the following command runs and tells 'named'
to (re)open it.
(cherry picked from commit fa686fcea5)
BUILD_PARALLEL_JOBS environmental variable is set to 6, which does not
align well with 4 and 8 CPU core systems dedicated to CI "stress" tests.
When multiple parallel jobs run on the host, they compete for resources
with an undesirable result: 6 compiler processes of one job may starve
named, resulting in lower-than-expected throughput and minutes-long
query response latency spikes.
Better drop the build parallelism of BIND-under-test. About 1-2 minutes
are added to the 60-65 minutes long job duration.
(cherry picked from commit 3fd7e7c81f)
Dumping of the freshly transferred zone file can take some time.
Retry 5 times before failing.
The log excerpt below shows such a case, when dumping lasted more than
two seconds.
06-Mar-2023 09:32:09.973 zone example6/IN: Transfer started.
06-Mar-2023 09:32:10.301 zone example6/IN: zone transfer finished: success
06-Mar-2023 09:32:10.301 zone_dump: zone example6/IN: enter
06-Mar-2023 09:32:11.789 client @0x7fe9ab435d68 10.53.0.10#44113 (example6): AXFR request
06-Mar-2023 09:32:11.801 client @0x7fe9ab435d68 10.53.0.10#44113 (example6): transfer of 'example6/IN': AXFR ended: 5 messages, 2676 records, 55815 bytes, 0.011 secs (5074090 bytes/sec) (serial 1397051952)
06-Mar-2023 09:32:12.409 zone_gotwritehandle: zone example6/IN: enter
06-Mar-2023 09:32:12.421 dump_done: zone example6/IN: enter
06-Mar-2023 09:32:12.421 zone_journal_compact: zone example6/IN: target journal size 53044
(cherry picked from commit 5d5d4b523b)
There can be comments in dig output for a zone transfer only in case
of an error, so we should print those errors not when wait_for_tls_xfer
succeeds, but when it fails.
Also, there is no point in printing those comments when a failure was
indeed expected.
(cherry picked from commit 9672b6be57)
If wait_for_tls_xfer succeeds, while a failure was being expected,
set ret=1 to fail without further checking if the zone file exists.
(cherry picked from commit 2fdf01573c)
By omission, BIND was not built with common CFLAGS in the stress test
jobs. Building with common CFLAGS and -Og should help GDB produce a
backtrace with more information.
(cherry picked from commit d33bdd36b4)
The serve-stale system test was intermittently failing due to a timing
issue:
I:serve-stale:check stale data.example TXT was refreshed...
I:serve-stale:failed
The RRset is refreshed, however, it first checks for an expected log
line, prior checking that the stale data.example TXT was refreshed
(using dig). This log line is there to ensure the record is actually
refreshed before we start querying again. Alternatively we could just
retry_quiet 10 <wait for dig output matches expectations>. It would
lower the chances for intermittent test failures, since there is no
longer a "check for log line, sleep one second if check fails, check
for log line, ...", prior to the check.
(cherry picked from commit 0bf36da305)
Named now logs both compile time and run time UV versions when
starting up. This is useful information to have when debugging
network issues involving named.
(cherry picked from commit 5fd2cd8018)
As it is done in the RPZ module, use 'db' and 'dbversion' for the
database we are going to update to, and 'updb' and 'updbversion' for
the database we are working on.
Doing this should avoid a race between the 'dns__catz_update_cb()' and
'dns_catz_dbupdate_callback()' functions.
(cherry picked from commit d2ecff3c4a)
Remove the part which is no longer true after reverting the commit
in question.
The CHANGES entry was never part of a released BIND 9 version.
(cherry picked from commit e1627e1289)
This reverts commit a719647023.
The commit introduced a data race, because dns_db_endload() is called
after unfreezing the zone.
(not cherry picked from commit 593dea871a)
During reconfiguration, the configure_view() function reverts the
configured zones to the previous view in case if there is an error.
It uses the 'zones_configured' boolean variable to decide whether
it is required to revert the zones, i.e. the error happened after
all the zones were successfully configured.
The problem is that it does not account for the case when an error
happens during the configuration of one of the zones (not the first),
in which case there are zones that are already configured for the
new view (and they need to be reverted), and there are zones that
are not (starting from the failed one).
Since 'zones_configured' remains 'false', the configured zones are
not reverted.
Replace the 'zones_configured' variable with a pointer to the latest
successfully configured zone configuration element, and when reverting,
revert up to and including that zone.
(cherry picked from commit 84c235a4b0)
The trick is to configure a duplicate zone, which comes after the
catalog zone, where the duplicate zone is an existing member zone.
In that scenario, all the zones which come before the "faulty" zone
in the configuration file will fail to be reverted to the previous
version of the view after a reconfiguration error, and in this
particular case that will result in an assertion failure when the
catalog zone update is initiated, because it will be still tied to
the new version of the view, which was dismissed.
(cherry picked from commit 93c4f382f4)
In older GitLab versions, the regular expression used for extracting
test coverage statistics from the output of GitLab CI jobs was
configured in the project's settings, using GitLab's web interface.
That changed in recent GitLab versions [1]; the previous configuration
method was removed from the web interface altogether as of GitLab 15.0.
The relevant regular expression is now supposed to be set in the
relevant job's definition in .gitlab-ci.yml.
Set the regular expression used for extracting test coverage
statistics in the definition of the "gcov" GitLab CI job. Use the
regular expression suggested in GitLab's documentation [2].
[1] https://docs.gitlab.com/ee/update/deprecations.html#test-coverage-project-cicd-setting
[2] https://docs.gitlab.com/ee/ci/pipelines/settings.html#test-coverage-examples
(cherry picked from commit db7af9fcc1)
There are leftovers from the previous refactoring effort, which left
some function declarations and comments in the header file unchanged.
Finish the renaming.
(cherry picked from commit 580ef2e18f)
When detaching from the previous version of the database, make sure
that the update-notify callback is unregistered, otherwise there is
an INSIST check which can generate an assertion failure in free_rbtdb(),
which checks that there are no outstanding update listeners in the list.
There is a similar code already in place for RPZ.
(cherry picked from commit cf79692a66)
The zone_postload() function can fail and unregister the callbacks.
Call dns_db_endload() only after calling zone_postload() to make
sure that the registered update-notify callbacks are not called
when the zone loading has failed during zone_postload().
Also, don't ignore the return value of zone_postload().
(cherry picked from commit ed268b46f1)
Add the 'ixfr-from-differences yes;' option to trigger a failed
zone postload operation when a zone is updated but the serial
number is not updated, then issue two successive 'rndc reload'
commands to trigger the bug, which causes an assertion failure.
(cherry picked from commit a73b67456e)
This commit increases server start timeout from 60 to 90 seconds in
order to avoid system test failures on some platforms due to inability
to initialise TLS contexts in time.
(cherry picked from commit 705f0d1ed1)
The test was setting a minimum count for recursive clients which
was not always being met (e.g. 91 instead of 100) producing a false
positive. Lower the lower bound on recursive clients for this
test to 1.
(cherry picked from commit af47090d99)
DNSRPS-enabled builds have recently been silently broken a few times due
to that feature not being tested in regular CI pipelines. Add the
--enable-dnsrps --enable-dnsrps-dl switches to the ./configure
invocation in one of the CI jobs run for all merge requests so that
DNSRPS-related build issues can be detected in advance.
It is important to note that this change by itself does NOT enable
actual testing of the DNSRPS feature as doing that requires a DNSRPS
provider library to be present on the test host.
(cherry picked from commit a4d6f5f6fd)