Commit graph

10538 commits

Author SHA1 Message Date
Ondřej Surý
eba7fb5f9f
Create a second pruning task for rbtdb with unlimited quantum
Previously, rbtdb->task had quantum of 1 because it was originally used
just for freeing RBTDB contents, which can happen on a "best effort"
basis (does not need to be prioritized).  However, when tree pruning was
implemented, it also started sending events to that task, enabling the
latter to become clogged up with a significant event backlog because it
only pruned a single RBTDB node per event.

To prioritize tree pruning (as it is necessary for enforcing the
configured memory use limit for the cache memory context), create a
second task with a virtually unlimited quantum (UINT_MAX) and send the
tree-pruning events to this new task, to ensure that all nodes scheduled
for pruning will be processed before further nodes are queued in a
similar fashion.

This change enables dropping the prunenodes list and restoring the
originally-used logic that allocates and sends a separate event for each
node to prune.

(cherry picked from commit 540a5b5a2c)
2024-03-06 19:17:32 +01:00
Ondřej Surý
b9c10a194d
Add a system test for mixed-case data for the same owner
We were missing a test where a single owner name would have multiple
types with a different case.  The generated RRSIGs and NSEC records will
then have different case than the signed records and message parser have
to cope with that and treat everything as the same owner.

(cherry picked from commit a114042059)
2024-02-11 11:57:58 +01:00
Michał Kępień
1237d73cd1
Fix map offsets in the "masterformat" system test
The "masterformat" system test attempts to check named-checkzone
behavior when it is fed corrupt map-format zone files.  However, despite
the RBTDB and RBT structures having evolved over the years, the offsets
at which a valid map-format zone file is malformed by the "masterformat"
test have not been updated accordingly, causing the relevant checks to
introduce a different type of corruption than they were originally meant
to cause:

  - the "bad node header" check originally mangled the 'type' member of
    the rdatasetheader_t structure for cname.example.nil,

  - the "bad node data" check originally mangled the 'serial' and
    'rdh_ttl' members of the rdatasetheader_t structure for
    aaaa.example.nil.

Update the offsets at which the map-format zone file is malformed at by
the "masterformat" system test so that the relevant checks fulfill their
original purpose again.
2024-01-05 12:40:50 +01:00
Ondřej Surý
a4baf32415
Backport isc_ht API changes from BIND 9.18
To prevent allocating large hashtable in dns_message, we need to
backport the improvements to isc_ht API from BIND 9.18+ that includes
support for case insensitive keys and incremental rehashing of the
hashtables.
2024-01-05 11:52:05 +01:00
Mark Andrews
16f3d79052 Support Net::DNS::Nameserver 1.42
In Net::DNS 1.42 $ns->main_loop no longer loops.  Use current methods
for starting the server, wait for SIGTERM then cleanup child processes
using $ns->stop_server(), then remove the pid file.

(cherry picked from commit c2c59dea60)
2024-01-03 12:01:14 +11:00
Mark Andrews
ea7b92a348 The NSEC3 -> NSEC private record may be added later
Check each delta for the NSEC3 -> NSEC private record addition
as it may be added in the second delta.

(cherry picked from commit 80a4dff986)
2023-12-20 11:13:01 +11:00
Mark Andrews
ba706a170d Regression check for missing RRSIGs
When transitioning from NSEC3 to NSEC the added records where not
being signed because the wrong time was being used to determine if
a key should be used or not.  Check that these records are actually
signed.

(cherry picked from commit bdb42d3838)
2023-12-19 12:56:57 +11:00
Aram Sargsyan
13dab06f60 Fix a statschannel system test zone loadtime issue
The check_loaded() function compares the zone's loadtime value and
an expected loadtime value, which is based on the zone file's mtime
extracted from the filesystem.

For the secondary zones there may be cases, when the zone file isn't
ready yet before the zone transfer is complete and the zone file is
dumped to the disk, so a so zero value mtime is retrieved.

In such cases wait one second and retry until timeout. Also modify
the affected check to allow a possible difference of the same amount
of seconds as the chosen timeout value.

(cherry picked from commit 4e94ff2541)
2023-12-18 09:39:11 +00:00
Mark Andrews
c9147530fd Adjust message buffer sizes in test code
(cherry picked from commit cbfcdbc199)
2023-12-06 09:06:31 +11:00
Michał Kępień
57f7abdd9f
Do not daemonize named instances with custom args
This enables the "logfileconfig" and "rpzextra" system tests to pass
when named is started under the supervision of rr (USE_RR=1).

(cherry picked from commit 422286e9c2)
2023-12-04 20:10:17 +01:00
Michal Nowak
5bcce417b7
Add support for recording named runtime with rr
The traces of the named process are stored in the directory
$system_test/nsX/named-Y/.

(cherry picked from commit e088e8a992)
2023-12-04 20:08:18 +01:00
Ondřej Surý
7099e60b1d
Remove support for running system tests under Valgrind
Valgrind support has been scarcely used.

(cherry picked from commit 658d62a6f4)
2023-12-04 20:05:19 +01:00
Evan Hunt
12c60e9a26 set loadtime during initial transfer of a secondary zone
when transferring in a non-inline-signing secondary for the first time,
we previously never set the value of zone->loadtime, so it remained
zero. this caused a test failure in the statschannel system test,
and that test case was temporarily disabled.  the value is now set
correctly and the test case has been reinstated.

(cherry picked from commit 9643281453)
2023-11-20 09:56:50 -08:00
Mark Andrews
8924adca61 Update b.root-servers.net IP addresses
This covers both root hints and the default primaries for the root
zone mirror.  The official change date is Nov 27, 2023.

(cherry picked from commit 2ca2f7e985)
2023-11-03 03:44:43 +11:00
Michał Kępień
4d4b209abd
Revert GL !8447
This reverts commit bd572bb5af
(c02925763e,
3aeac8e2a9, and
57d8e2949d), reversing changes made to
28c92c9b26.
2023-11-01 18:26:33 +01:00
Matthijs Mekking
c02925763e Test case for issue #4355
Add a test case where serve-stale is enabled on a server that also
servers a local authoritative zone.

The particular case tests a lame delegation and checks if falling
back to serving stale data does not attempt to retrieve the query
by recursing from the root down.

(cherry picked from commit e196ba6168)
2023-10-31 15:04:28 +01:00
Tom Krizek
794c99f4c8
Fix incorrect conversion in rpz system test
Backslashes weren't handled properly when formatting the code with
shfmt.

Related https://github.com/mvdan/sh/issues/1041
2023-10-26 15:25:38 +02:00
Tom Krizek
ce014dbf4e
Reformat shell scripts with shfmt
All changes in this commit were automated using the command:

  shfmt -w -i 2 -ci -bn bin/tests/system/ util/ $(find bin/tests/system/ -name "*.sh.in")

By default, only *.sh and files without extension are checked, so
*.sh.in files have to be added additionally. (See mvdan/sh#944)

(manually replayed commit 4cb8b13987)
2023-10-26 13:24:51 +02:00
Michal Nowak
43947e7198
Add test for CVE-2023-3341
(cherry picked from commit 7d1834b250)
2023-10-20 16:26:49 +02:00
Michal Nowak
531c96b8ed
Update the source code formatting using clang-format-17 2023-10-17 17:56:31 +02:00
Mark Andrews
652847b06e Document that reloading happens asynchronously
(cherry picked from commit e33dbd0cbd)
2023-09-27 10:14:35 +10:00
Mark Andrews
0e55d94653 Wait for the test zone to finish re-loading
'rndc thaw' initiates asynchrous loading of all the zones
similar to 'rndc load'.  Wait for the test zone's load to
complete before testing that it is updatable again.

(cherry picked from commit 5b3238aa85)
2023-09-26 14:12:36 +10:00
Mark Andrews
50bdf976c7 Check RRSIG covered type in negative cache entry
The covered type previously displayed as TYPE0 when it should
have reflected the records that was actually covered.

(cherry picked from commit 8ce359652a)
2023-09-18 16:40:55 +10:00
Tom Krizek
d941dd151c
Skip checkds test on Python<3.7
checkds test requires the capture_output argument for subprocess.run()
which was added in Python 3.7.

(cherry picked from commit 0361233b3d)
2023-08-23 14:51:25 +02:00
Michal Nowak
2d951a900b
Mark test_send_timeout as flaky
In some cases, BIND is not fast enough to fill the send buffer and
manages to answer all queries, contrary to what the test expects.
Repeat the check up to 3 times to limit this test instability.

(cherry picked from commit 681b23c398)
2023-08-22 08:55:36 +02:00
Tom Krizek
f84e0f4ad0
Add custom flaky decorator to handle unstable tests
If the flaky plugin for pytest is available, use its decorator to
support re-running unstable tests. In case the package is missing,
execute the test as usual without attempts to re-run it in case of
failure.

This is mostly intended to increase the test stability in CI. Using a
custom decorator enables us to keep the flaky package as an optional
dependency.

(cherry picked from commit 5b703de733)
2023-08-22 08:55:36 +02:00
Mark Andrews
a9ab5c215f Add sleeps so that the modification time changes
The mkeys system test could fail because root zone was resigned
within the same second as it was previously signed causing reloads
to fail.  Add delays to the test to prevent this.

(cherry picked from commit 40e3529379)
2023-08-15 09:39:53 +10:00
Mark Andrews
71c2bb46e5 Set ret=1 if _wait_for_stats does not succeed
Errors getting transfer statistics from named.run where not detected
as ret was not set to one if there hadn't been a success after looping
for a while.

(cherry picked from commit 287a1ac09b)
2023-08-08 23:42:23 +00:00
Michał Kępień
5698a6a905
Convert setup.pl into static configurations
The setup.pl script has been replaced with static BIND configurations,
and in the course of this change, the unused ns1 server was removed.
This enhancement has greatly improved the overall test's readability.

(cherry picked from commit 08a8906cfc)
2023-08-08 18:01:17 +02:00
Michal Nowak
b939741cfd
Rewrite stress test to pytest
The shell version of the test was completed only after all DNS zone
updates were sent, even if the BIND server crashed while processing
them, leading to prolonged execution and potential hang in the CI
environment. The Python rewrite of the test ensures that DNS update
tasks finish within five minutes of starting, irrespective of a BIND
crash possibility or DNS zone updates not finishing in time.

(cherry picked from commit ecd7b30d0a)
2023-08-08 18:01:17 +02:00
Michał Kępień
268b4392ba
Wait until fstrm_capture is ready
The fstrm_capture utility is started in the background during the
"dnstap" system test.  Consequently, "rndc dnstap-reopen" and similar
commands may be executed before fstrm_capture starts listening on the
Unix domain socket it is configured to receive dnstap data on.  This
results in the dnstap data sent to that socket in the meantime to be
lost; while the fstrm writer thread is able to recover from such a
scenario within a couple of seconds (by reopening the configured dnstap
destination itself), only one write attempt is made for data
successfully queued to the writer thread, so dnstap frames can still be
lost in the process.  This may happen during the "dnstap" system test,
leading to the dnstap output file being empty, which in turn causes the
test to fail.

Fix by waiting until fstrm_capture starts listening on the Unix domain
socket it is configured to use before asking named to reopen the
configured dnstap destination.  Since various fstrm_capture versions log
different messages when the listening socket is set up, wait for a
common string that works for all fstrm_capture versions released to
date.  Add a few extra debug messages indicating test progress and make
the test fail if the expected fstrm_capture log message is not generated
within 10 seconds.

(cherry picked from commit 26d3d97f12)
2023-08-07 14:01:51 +02:00
Michał Kępień
c630efb2f1
Capture all fstrm_capture output
The fstrm_capture.out file is overwritten when the fstrm_capture utility
is restarted during the "dnstap" system test.  Use a separate output
file for each fstrm_capture instance to ensure all output produced by
that tool during the "dnstap" system test is preserved for forensic
purposes.

(cherry picked from commit bd2941fc72)
2023-08-07 14:01:51 +02:00
Mark Andrews
dc2ea03ea2
Use sub shell to isolate enviroment changes
'HOME=value command' should only change HOME for command but on
some platforms this occasionally sets HOME for the rest of the
test. Explicitly isolate the enviroment change using a sub shell.

(cherry picked from commit 96f75bba18)
2023-08-02 10:47:36 +02:00
Štěpán Balážik
fa8d48ee86
Fix ecdsa256 check in ecdsa system test setup
Probably by copy-paste mistake, ecdsa384 was checked twice.

(cherry picked from commit 10194baa07)
2023-07-28 09:15:17 +02:00
Tom Krizek
9abdcb23a2
Disable resolve checks under TSAN
The resolve binary is affected by GL#4119 which occassionally makes it
hand during system tests when running with TSAN. This is a workaround to
avoid wasting resources caused by a CI timeout for the system test tsan
jobs.

(cherry picked from commit 774b9bc629)
2023-07-26 10:08:29 +02:00
Tom Krizek
3c30f4a408
Reproducer for CVE-2023-2911
The conditions that trigger the crash:
- a stale record is in cache
- stale-answer-client-timeout is 0
- multiple clients query for the stale record, enough of them to exceed
  the recursive-clients quota
- the response from the authoritative is sufficiently delayed so that
  recursive-clients quota is exceeded first

The reproducer attempts to simulate this situation. However, it hasn't
proven to be 100 % reproducible, especially in CI. When reproducing
locally, the priming query also seems to sometimes interfere and prevent
the crash. When the reproducer is ran twice, it appears to be more
reliable in reproducing the issue.

(cherry picked from commit f617512d37)
2023-07-25 10:35:09 +02:00
Tom Krizek
90e33052d2
Configure pytest to properly locate conftest.py
In pytest 7.4.0, there were some changes to how the configuration file
for pytest is located. In our case, this resulted in a failure to find
the conftest.py with the needed fixtures which then prevented our python
tests from being executed successfully.

Configure the --confcutdir to ensure it points to the system test
directory, where our conftest.py is located.

Related https://github.com/pytest-dev/pytest/pull/11043
2023-07-20 13:27:03 +02:00
Aram Sargsyan
3a807e554f Fix a bug in an utility script for the statschannel system test
Because of a typo, the fetch.pl script tries to extract the server
address from the input parameter 'a' instead of 's'. Fix the typo.

(cherry picked from commit aa7538fd38)
2023-07-19 13:27:54 +00:00
Mark Andrews
ce17cdf9cb Use absolute path to locate run.gdb
(cherry picked from commit 3f7723cdff)
2023-07-19 12:53:43 +10:00
Matthijs Mekking
80a20c9643 Add test for "three is a crowd" bug (GL #2375)
Add this test scenario for a bug fixed a while ago. When a third key is
introduced while the previous rollover hasn't finished yet, the keymgr
could decide to remove the first two keys, because it was not checking
for an indirect dependency on the keys.

In other words, the previous bug behavior was that the first two keys
were removed from the zone too soon.

This test case checks that all three keys stay in the zone, and no keys
are removed premature after another new key has been introduced.

(cherry picked from commit 9c40cf0566)
2023-07-06 10:30:53 +02:00
Matthijs Mekking
83dd0c85a2 Check all keys despite early failure
In the kasp script, if one expected key is not found, continue checking
the other key ids, even if there is no match for the first one.  This
provides a bit more information which keys mismatch and makes for
easier debugging test failures.

(cherry picked from commit 674249f66a)
2023-07-06 10:28:41 +02:00
Tom Krizek
4efef8cb54
Check for unset variables only after conf.sh is loaded
Make the cds/setup.sh compatible with the workaround which relies on
testing the TSAN_OPTIONS variable which may not be set.

(cherry picked from commit 76d9873ef6)
2023-06-29 14:40:09 +02:00
Tom Krizek
2020ce2010
Fix checking for executables in shell conditions in tests
Surround the variables which are checked whether they're executable in
double quotes. Without them, empty paths won't be properly interpreted
as not executable.

(manually picked from commit 06056c44a7)
2023-06-29 13:19:47 +02:00
Tom Krizek
bd9dabc0c3
Only use delv if available in mkeys test
Check that $DELV is an executable before using it in a test.

(cherry picked from commit 384339dbba)
2023-06-29 13:16:50 +02:00
Tom Krizek
a904cd9a0e
Disable delv tests under TSAN
Since delv can occasionally hang in system tests when running with TSAN
(see GL#4119), disable these tests as a workaround. Otherwise, the hung
delv process will just waste CI resources and prevent any meaningful
output from the rest of the test suite.

(cherry picked from commit fbcf37f914)
2023-06-29 13:16:46 +02:00
Tom Krizek
0374c27fc5
Check for proper file size output in dnstap test
Previously, the first check silently failed, as 450 is apparently (in
the CI) the minimum output size for the dnstap output, rather than
470 which the test was expecting. Effectively, the check served as a 5
second sleep rather than waiting for the proper file size.

Additionally, check the expected file sizes and fail if expectations
aren't met.

(manually picked from commit 5f809e50b6)

On main, the minimum file size seems to 454 bytes, while on some
platforms in our CI setup for the 9.16 branch, it appears to be 450
instead.
2023-06-26 14:33:43 +02:00
Tom Krizek
9cfc8da487
Check for proper log message in kasp test
The log message is supposed to contain the zone name which was
erroneously omitted, but didn't pop up during tests, since return code
was silently ignored.

Now it actually waits for the proper log message rather than being an
equivalent of 3 second sleep (which was also sufficient to make the test
pass, thus we detected no failure).

(cherry picked from commit 1dd4c2b9e2)
2023-06-26 13:08:09 +02:00
Michał Kępień
731a736a91
Add a tool for reproducing ISC SPNEGO bugs
Extend the "tsiggss" system test with reproducers for CVE-2020-8625 and
CVE-2021-25216.

(cherry picked from commit a47dc810f7)
2023-06-19 10:36:25 +02:00
Tom Krizek
328d0a1d0a
Avoid false positive in serve-stale system test check
The purpose of the check is to verify the server has survived the
previous barrage of queries. This is done by sending a query and
checking we get a NOERROR response back.

Previously, that query could've been affected by a servfail cache - the
server would return a SERVFAIL answer, thus failing the check, despite
being up and running. Use version.bind txt ch query to avoid the
interference of servfail cache.

(cherry picked from commit dd7bcd2855)
2023-06-13 14:16:44 +02:00
Michal Nowak
ca57ddf53e
Disable minimal update check with no keys on Windows
The $t1 value equals $t2 due to the time elapsed between "rndc
managed-keys status" calls being equal to the normal active refresh
period (as calculated per rules listed in RFC 5011 section 2.3) minus an
"hour" (as set using -T mkeytimers). This value equality is expected to
happen on really slow machines. On our Windows CI runner, it happens
very often.
2023-05-31 14:25:02 +02:00