Commit graph

10559 commits

Author SHA1 Message Date
Michal Nowak
e97ed8d9b6
Clean up test.output.* references
test.output.* files are no longer created by the system test framework.
Remove all references to these files from the source tree.
2022-01-27 15:32:28 +01:00
Michal Nowak
f6b996f6fc
Drop systests.output references from system test
Since "runall.sh" script removal systests.output file is not being
created and its references are useless.
2022-01-27 15:32:28 +01:00
Michal Nowak
8109e924b5
Drop support for sequential system tests
System test used to have sequential system tests, which can't run in
parallel with the rest of system tests. As there are no such tests
anymore the underlying infrastructure can be dropped.
2022-01-27 15:32:28 +01:00
Michal Nowak
9d398572f0
Drop bin/tests/system/parallel.sh
"parallel.sh" script was used on Windows to run system tests in
parallel. Since Windows support was removed from BIND 9, the script is
not needed anymore.
2022-01-27 15:32:28 +01:00
Michal Nowak
986b364fe6
Drop testsummary.sh
testsummary.sh was not updated after build system rewrite to Autotools,
and needs to be fixed to produce test summary and core dump, assertion
failures, and ThreadSanitizer reports.

Given that all of this is provided by Autotools and run.sh already,
there's little use to testsummary.sh script and should be dropped.
2022-01-27 15:32:27 +01:00
Michal Nowak
5d2dd94cf8
Drop runall.sh
runall.sh was mainly used on Windows and as it's support was removed
from the "main" branch the script is not needed anymore.

Also, remove bin/tests/system/README text on running multiple system
test suites simultaneously with runall.sh as that support was not
present in the script anyway.
2022-01-27 11:58:17 +01:00
Michal Nowak
b983df403a
Drop unused @DNSTAP@ label in conf.sh.in
@DNSTAP@ label does not have adjacent AC_SUBST() call and is therefore
unused.
2022-01-27 11:57:27 +01:00
Michal Nowak
67092442d6
rrsetorder should use stop_server() in tests.sh 2022-01-27 11:57:27 +01:00
Michal Nowak
7ba786dedb
Drop bin/tests/system/setup.sh
bin/tests/system/setup.sh just executes setup.sh script of a particular
system test in the directory of the system test. This does not seems to
be useful enough to maintain it.
2022-01-27 11:57:27 +01:00
Michal Nowak
4c03d814ed
Drop stopall.sh
stopall.sh script takes almost 2 minutes to go thru all test
subdirectories (due to a sleep in stop.pl) and does not seems to be
efficient way to stop manually started tests.
2022-01-27 11:57:26 +01:00
Matthijs Mekking
0af8bbd49b Create keys with pkcs11-tool --id
The keyfromlabel system ECDSA tests sometimes fail. When this happens
the ZSK and KSK key id values differ by 1, which is an indication that
the same key is used for both DNSKEY records.

When the private key is retrieved with 'ENGINE_load_private_key()', the
public key is already set. But sometimes that key differs from the key
which was retrieved with 'ENGINE_load_public_key()'.

The libp11 source code uses id to find the key and without IDs all the
keys are "equal", so it is returning the first key in the array of the
enumerated keys instead of the matching key. In our test we didn't use
'--id', just '--label'. With this change, the system test should no
longer fail intermittently.

Note this is only an issue for ECDSA keys, not RSA keys.
2022-01-27 10:49:47 +01:00
Matthijs Mekking
eba66665a5 Add system test for dnssec-keyfromlabel
Add missing system test for dnssec-keyfromlabel. Test for various
algorithms that we can generate key files from a key that is stored in a
HSM, and that those keys can be used for signing with dnssec-signzone.
2022-01-27 10:49:46 +01:00
Matthijs Mekking
221e1bc2a3 Update .gitlab-ci.yml with openssl setup
GitLab CI needs to know about some environment variables that will
tell where OpenSSL and SoftHSM2 is installed. This is done in the
image, making the prepare-softhsm2.sh script obsolete.

The SoftHSM2 module location is system specific.
2022-01-27 10:46:58 +01:00
Matthijs Mekking
0725fcad38 Remove prepare-softhsm2.sh from runtime test
This script is obsoleted because SoftHSM2 is now installed in the
image.
2022-01-27 10:46:58 +01:00
Evan Hunt
1d706f328c
Remove leftover test code for Windows
- Removed all code that only runs under CYGWIN, and made all
  code that doesn't run under CYGWIN non-optional.
- Removed the $TP variable which was used to add optional
  trailing dots to filenames; they're no longer optional.
- Removed references to pssuspend and dos2unix.
- No need to use environment variables for diff and kill.
- Removed uses of "tr -d '\r'"; this was a workaround for
  a cygwin regex bug that is no longer needed.
2022-01-27 09:08:29 +01:00
Michał Kępień
a938db2170 Fix waiting for lock file removal upon exit
Commit c787a539d2 fixed a certain class of
intermittent system test failures caused by named instances unable to
restart.  The root cause was bin/tests/system/stop.pl returning without
waiting for a named instance to remove its lock file.

Later on, it turned out that the above change causes other issues on
Windows due to the way named handles signals on that platform.  Commit
761ba4514f intended to address those
issues by making the server_lock_file() subroutine in
bin/tests/system/stop.pl return an empty value on Windows, in order to
prevent the script for waiting for lock file cleanup on that platform.
Note, however, that Windows detection in that subroutine is limited to
checking whether the CYGWIN environment variable is set.

While that environment variable was not set on Unix-like systems before
commit 761ba4514f, another commit
(a33237f070, merged a few weeks later)
changed that by setting the CYGWIN environment variable to an empty
value on Unix-like systems.  This made the defined($ENV{'CYGWIN'}) check
in server_lock_file() return true, inadvertently preventing
bin/tests/system/stop.pl from waiting for lock file removal before
exiting on Unix-like systems and therefore reintroducing the original
issue.

Fix by making server_lock_file() only return an empty value when the
CYGWIN environment variable is set to a non-empty value (which is what
bin/tests/system/conf.sh.win32 does).  Adjust a similar check in the
pid_file_exists() subroutine in the same way for consistency.
2022-01-26 15:18:43 +01:00
Michał Kępień
fb87022115 Do not strip leading whitespace from test output
The echo_*() and cat_*() functions in bin/tests/system/conf.sh.common
call the "read" builtin command without specifying the field separator
to use.  This results in leading whitespace getting stripped from each
line of the texts passed to those functions, which mangles e.g. pytest
output, hindering test failure troubleshooting.

Address by setting IFS to an empty value for the "read" calls used in
the aforementioned helper functions.
2022-01-26 15:18:43 +01:00
Michał Kępień
65abbca79b Retain all named.run files from each test run
The bin/tests/system/start.pl script truncates the named.run file for a
given named instance unless it is invoked with the --restart
command-line option.  Ever since Python-based tests were introduced,
bin/tests/system/run.sh may start named instances used by a given system
test multiple times within a single run, causing the
bin/tests/system/start.pl script to truncate some of the log files
written during the test.  This makes troubleshooting certain test
failures hard or even impossible.

Fix by calling bin/tests/system/start.pl with the --restart command-line
option for every start_servers() invocation except the first one.
2022-01-26 15:18:43 +01:00
Aram Sargsyan
5f9d4b5db4 Fix invalid control port number in the catz system test
When failure is expected, the `rndc` command in the catz system test
is being called directly instead of using a function, i.e.:

    $RNDC -c ../common/rndc.conf -s 10.53.0.2 -p 9953 reconfig \
        > /dev/null 2>&1 && ret=1

... instead of:

    rndccmd 10.53.0.2 reconfig && ret=1

This is done to suppress messages like "lt-rndc: 'reconfig' failed:
failure" appearing in the message log of the test, because failure
is actually expected, and the appearance of that message can be
confusing.

The port value used in this case is not correct, making the
`rndc reload` command to fail.  This error was not detected earlier
only because the failure of the command is actually expected, but
the failure happens for a "wrong" reason, and the test still passes.

Fix the error by using the existing variable instead of the fixed
number.
2022-01-25 08:21:50 +00:00
Aram Sargsyan
62337d433f Add a system test for view reverting after a failed reconfiguration
Test the view reverting code by introducing a faulty dlz configuration
in named.conf and using `rndc reconfig` to check if named handles the
situation correctly.

We use "dlz" because the dlz processing code is located in an ideal
place in the view configuration function for the test to cover the
view reverting code.

This test is specifically added to the catz system test to additionally
cover the catz reconfiguration during the mentioned failed
reconfiguration attempt.
2022-01-25 08:21:50 +00:00
Aram Sargsyan
3697560f04 Improve the view configuration error handling and reverting logic
If a view configuration error occurs during a named reconfiguration
procedure, BIND can end up having twin views (old and new), with some
zones and internal structures attached to the old one, and others
attached to the new one, which essentially creates chaos.

Implement some additional view reverting mechanisms to avoid the
situation described above:

 1. Revert rpz configuration.

 2. Revert catz configuration.

 3. Revert zones to view attachments.
2022-01-25 08:20:52 +00:00
Michał Kępień
18db2269bf Fix spelling of "DNS over HTTPS" & "DNS over TLS"
The terms "DNS over HTTPS" and "DNS over TLS" should be hyphenated when
they are used as adjectives and non-hyphenated otherwise.  Ensure all
occurrences of these terms in the source tree follow the above rule.
(CHANGES and release notes are intentionally left intact.)

Tweak a related ARM snippet, fixing a typo in the process.
2022-01-20 15:40:37 +01:00
Michał Kępień
d1d721aae1 rndc: prevent crashing after receiving a signal
If isc_app_run() gets interrupted by a signal, the global 'rndc_task'
variable may already be detached from (set to NULL) by the time the
outstanding netmgr callbacks are run.  This triggers an assertion
failure in isc_task_shutdown().  However, explicitly calling
isc_task_shutdown() from rndc code is redundant because it does not use
isc_task_onshutdown() and the task_shutdown() function gets
automatically called anyway when the task manager gets destroyed (after
isc_app_run() returns).  Remove the redundant isc_task_shutdown() calls
to prevent crashes after receiving a signal.
2022-01-19 14:30:17 +01:00
Evan Hunt
289c1d33ee rndc: sync ISC_R_CANCELED handling in callbacks
rndc_recvdone() is not treating the ISC_R_CANCELED result code as a
request to stop data processing, which may cause a crash when trying to
dereference ccmsg->buffer.  Fix by ensuring ISC_R_CANCELED results in an
early exit from rndc_recvdone().

Make sure the logic for handling ISC_R_CANCELED in rndc_recvnonce()
matches the one present in rndc_recvdone() to ensure consistent behavior
between these two sibling functions.
2022-01-19 14:30:17 +01:00
Artem Boldariev
d3e7c0e647 doth test: fix failure after reconfig
Sometimes the serving a query or two might fail in the test due to the
listeners not being reinitialised on time. This commit makes the test
suite to wait for reconfiguration message in the log file to detect
the time when the reconfiguration request completed.
2022-01-18 14:25:43 +02:00
Michał Kępień
29961bd741 Reimplement the gnutls-cli check in Python
gnutls-cli is tricky to script around as it immediately closes the
server connection when its standard input is closed.  This prevents
simple shell-based I/O redirection from being used for capturing the DNS
response sent over a TLS connection and the workarounds for this issue
employ non-standard utilities like "timeout".

Instead of resorting to clever shell hacks, reimplement the relevant
check in Python.  Exit immediately upon receiving a valid DNS response
or when gnutls-cli exits in order to decrease the test's run time.
Employ dnspython to avoid the need for storing DNS queries in binary
files and to improve test readability.  Capture more diagnostic output
to facilitate troubleshooting.  Use a pytest fixture instead of an
Autoconf macro to keep test requirements localized.
2022-01-18 11:00:46 +01:00
Ondřej Surý
7267c39323 Remove +mapped option from dig
The network manager doesn't have support for IPv4-mapped IPv6 addresses,
thus we are removing the +mapped option from dig command.
2022-01-17 22:16:27 +01:00
Artem Boldariev
4884ab0340 doth test: use extended reg. expression to check for HTTP/2 support
Using extended regular expressions to check for HTTP/2 support in curl
appears to be a more portable option, which also works on
e.g. OpenBSD.
2022-01-17 16:36:27 +02:00
Ondřej Surý
aaa31962d2 Add missing backtick to host.rst
The missing backtick was causing formatting problems in the host
manpage.
2022-01-16 07:56:17 +01:00
Ondřej Surý
d3b975abb6 Increase the timeout to 15 seconds for the resolver test
1. 10 seconds is an unfortunate pick because that reintroduces the
   problem described in commit 5307bf64 (for an earlier check).

   Change the +tries=3 +timeout=10 to +tries=2 +time=15, so that we
   minimize the risk of dig missing any responses sent by the server in
   the first 15 seconds while also increasing our chances of the
   response arriving in time on machines under heavy load and allowing
   it a single retry in case things go awry.

2. The comment about TCP above was misleading: as painfully proven by
   GitLab CI, using TCP is no guarantee of receiving a response in a
   timely manner.  It may help a bit, but it is certainly not a 100%
   reliable solution.

   Change the dig invocation to just use UDP like in the two prior
   tests for consistency (and revise that comment accordingly).
2022-01-14 13:00:56 +01:00
Ondřej Surý
29b9c8e7f5 Increase the dig timeout in resolver test to 10 seconds
The resolver system tests was exhibiting often intermitten failures,
increase the timeout from default 5 second to 10 seconds to give the dig
more leeway for providing an answer.
2022-01-14 11:13:26 +01:00
Ondřej Surý
6d9afd4cc0 Make resolver system test shellcheck clean
The resolver system test shell scripts were using legacy syntax.
Convert the script into POSIX shell syntax and make them shellcheck
clean.
2022-01-14 11:13:26 +01:00
Aram Sargsyan
daf11421df Add a test to query DoT using gnutls-cli
Add a test to check BIND's DoT (DNS-over-TLS) implementation using
gnutls-cli to confirm that it is compatibe with the GnuTLS library.
2022-01-13 12:28:11 +00:00
Ondřej Surý
dbd9c31354 Use ISC_R_SHUTTINGDOWN to detect netmgr shutting down
When the dispatch code was refactored in libdns, the netmgr was changed
to return ISC_R_SHUTTINGDOWN when the netmgr is shutting down, and the
ISC_R_CANCELED is now reserved only for situation where the callback was
canceled by the caller.

This change wasn't reflected in the controlconf.c channel which was
still looking for ISC_R_CANCELED as the shutdown event.
2022-01-13 09:14:12 +01:00
Ondřej Surý
58bd26b6cf Update the copyright information in all files in the repository
This commit converts the license handling to adhere to the REUSE
specification.  It specifically:

1. Adds used licnses to LICENSES/ directory

2. Add "isc" template for adding the copyright boilerplate

3. Changes all source files to include copyright and SPDX license
   header, this includes all the C sources, documentation, zone files,
   configuration files.  There are notes in the doc/dev/copyrights file
   on how to add correct headers to the new files.

4. Handle the rest that can't be modified via .reuse/dep5 file.  The
   binary (or otherwise unmodifiable) files could have license places
   next to them in <foo>.license file, but this would lead to cluttered
   repository and most of the files handled in the .reuse/dep5 file are
   system test files.
2022-01-11 09:05:02 +01:00
Matthijs Mekking
6e9fed2d24 Replace RSASHA1 in autosign test with default alg
Change RSASHA1 to $DEFAULT_ALGORITHM to be FIPS compliant.

There is one RSASHA1 occurence left, to test that dynamically adding an
NSEC3PARAM record to an NSEC-only zone fails.
2022-01-06 09:33:36 +01:00
Matthijs Mekking
fbd559ad0d Update autosign test
Update the autosign system test with new expected behavior.

The 'nozsk.example' zone should have its expired zone signatures
deleted and replaced with signatures generated with the KSK.

The 'inaczsk.example' zone should have its expired zone signatures
deleted and replaced with signatures generated with the KSK.

In both scenarios, signatures are deleted, not retained, so the
"retaining signatures" warning should not be logged.

Furthermore, thsi commit fixex a test bug where the 'awk' command
always returned 0.

Finally, this commit adds a test case for an offline KSK, for the zone
'noksk.example'. In this case the expired signatures should be retained
(despite the zone being bogus, but resigning the DNSKEY RRset with the
ZSK won't help here).
2022-01-06 09:32:32 +01:00
Michał Kępień
ab49205af3 Check unsigned serial number in signed zone files
All signed zone files present in bin/tests/system/inline/ns8 should
contain the unsigned serial number in the raw-format header.  Add a
check to ensure that is the case.  Extend the dnssec-signzone command
line in ns8/sign.sh with the -L option to allow the zones initially
signed there to pass the newly added check.  Add another zone to the
configuration for the ns8 named instance to ensure the check also passes
when multiple zones are inline-signed by a single named instance.
2022-01-05 17:53:49 +01:00
Evan Hunt
973ac1d891 Prevent a shutdown race in catz_create_chg_task()
If a catz event is scheduled while the task manager was being
shut down, task-exclusive mode is unavailable. This needs to be
handled as an error rather than triggering an assertion.
2022-01-05 12:48:40 +01:00
Mark Andrews
b8845454c8 Report duplicate dnssec-policy names
Duplicate dnssec-policy names were detected as an error condition
but were not logged.
2022-01-03 11:48:26 -08:00
Artem Boldariev
64f7c55662 Use the TLS context cache for client-side contexts (XoT)
This commit enables client-side TLS contexts re-use for zone transfers
over TLS. That, in turn, makes it possible to use the internal session
cache associated with the contexts, allowing the TLS connections to be
established faster and requiring fewer resources by not going through
the full TLS handshake procedure.

Previously that would recreate the context on every connection, making
TLS session resumption impossible.

Also, this change lays down a foundation for Strict TLS (when the
client validates a server certificate), as the TLS context cache can
be extended to store additional data required for validation (like
intermediates CA chain).
2021-12-29 10:25:15 +02:00
Artem Boldariev
5b7d4341fe Use the TLS context cache for server-side contexts
Using the TLS context cache for server-side contexts could reduce the
number of contexts to initialise in the configurations when e.g. the
same 'tls' entry is used in multiple 'listen-on' statements for the
same DNS transport, binding to multiple IP addresses.

In such a case, only one TLS context will be created, instead of a
context per IP address, which could reduce the initialisation time, as
initialising even a non-ephemeral TLS context introduces some delay,
which can be *visually* noticeable by log activity.

Also, this change lays down a foundation for Mutual TLS (when the
server validates a client certificate, additionally to a client
validating the server), as the TLS context cache can be extended to
store additional data required for validation (like intermediates CA
chain).

Additionally to the above, the change ensures that the contexts are
not being changed after initialisation, as such a practice is frowned
upon. Previously we would set the supported ALPN tags within
isc_nm_listenhttp() and isc_nm_listentlsdns(). We do not do that for
client-side contexts, so that appears to be an overlook. Now we set
the supported ALPN tags right after server-side contexts creation,
similarly how we do for client-side ones.
2021-12-29 10:25:14 +02:00
Michał Kępień
fc678b19d9 Fix rare control channel socket reference leak
Commit 9ee60e7a17 enabled netmgr shutdown
to cause read callbacks for active control channel sockets to be invoked
with the ISC_R_SHUTTINGDOWN result code.  However, control channel code
only recognizes ISC_R_CANCELED as an indicator of an in-progress netmgr
shutdown (which was correct before the above commit).  This discrepancy
enables the following scenario to happen in rare cases:

 1. A control channel request is received and responded to.  libuv
    manages to write the response to the TCP socket, but the completion
    callback (control_senddone()) is yet to be invoked.

 2. Server shutdown is initiated.  All TCP sockets are shut down, which
    i.a. causes control_recvmessage() to be invoked with the
    ISC_R_SHUTTINGDOWN result code.  As the result code is not
    ISC_R_CANCELED, control_recvmessage() does not set
    listener->controls->shuttingdown to 'true'.

 3. control_senddone() is called with the ISC_R_SUCCESS result code.  As
    neither listener->controls->shuttingdown is 'true' nor is the result
    code ISC_R_CANCELED, reading is resumed on the control channel
    socket.  However, this read can never be completed because the read
    callback on that socket was cleared when the TCP socket was shut
    down.  This causes a reference on the socket's handle to be held
    indefinitely, leading to a hang upon shutdown.

Ensure listener->controls->shuttingdown is also set to 'true' when
control_recvmessage() is invoked with the ISC_R_SHUTTINGDOWN result
code.  This ensures the send completion callback does not resume reading
after the control channel socket is shut down.
2021-12-28 08:36:01 +01:00
Mark Andrews
dc8595936c remove broken-nsec and reject-000-label options 2021-12-23 15:13:46 +11:00
Michał Kępień
9e81903171 Set up default logging for SSLKEYLOGFILE
A customary method of exporting TLS pre-master secrets used by a piece
of software (for debugging purposes, e.g. to examine decrypted traffic
in a packet sniffer) is to set the SSLKEYLOGFILE environment variable to
the path to the file in which this data should be logged.

In order to enable writing any data to a file using the logging
framework provided by libisc, a logging channel needs to be defined and
the relevant logging category needs to be associated with it.  Since the
SSLKEYLOGFILE variable is only expected to contain a path, some defaults
for the logging channel need to be assumed.  Add a new function,
named_log_setdefaultsslkeylogfile(), for setting up those implicit
defaults, which are equivalent to the following logging configuration:

    channel default_sslkeylogfile {
        file "${SSLKEYLOGFILE}" versions 10 size 100m suffix timestamp;
    };

    category sslkeylog {
    	default_sslkeylogfile;
    };

This ensures TLS pre-master secrets do not use up more than about 1 GB
of disk space, which should be enough to hold debugging data for the
most recent 1 million TLS connections.

As these values are arguably not universally appropriate for all
deployment environments, a way for overriding them needs to exist.
Suppress creation of the default logging channel for TLS pre-master
secrets when the SSLKEYLOGFILE variable is set to the string "config".
This enables providing custom logging configuration for the relevant
category via the "logging" stanza.  (Note that it would have been
simpler to only skip setting up the default logging channel for TLS
pre-master secrets if the SSLKEYLOGFILE environment variable is not set
at all.  However, libisc only logs pre-master secrets if that variable
is set.  Detecting a "magic" string enables the SSLKEYLOGFILE
environment variable to serve as a single control for both enabling TLS
pre-master secret collection and potentially also indicating where and
how they should be exported.)
2021-12-22 18:17:26 +01:00
Artem Boldariev
84b2141e69 doth system test: reduce number of contexts in ns3
This commit removes unused listen-on statements from the ns3 instance
in order to reduce the startup time. That should help with occasional
system test initialisation hiccups in the CI which happen because the
required instances cannot initialise in time.
2021-12-20 14:28:53 +02:00
Artem Boldariev
2e5f9a0df5 Fix flakiness in the doth reconfig test
Due to the fact that the primary nameserver creates a lot of TLS
contexts, its reconfiguration could take too much time on the CI,
leading to spurious test failures, while in reality it works just
fine.

This commit adds a separate instance for this test which does not use
ephemeral keys (these are costly to generate) and creates minimal
amount of TLS contexts.
2021-12-20 14:28:53 +02:00
Michal Nowak
9c013f37d0
Drop cppcheck workarounds
As cppcheck was removed from the CI, associated workarounds and
suppressions are not required anymore.
2021-12-14 15:03:56 +01:00
Aram Sargsyan
f595a75cd6 Recreate HTTPS and TLS interfaces only during reconfiguration
The 850e9e59bf commit intended to recreate
the HTTPS and TLS interfaces during reconfiguration, but they are being
recreated also during regular interface re-scans.

Make sure the HTTPS and TLS interfaces are being recreated only during
reconfiguration.
2021-12-14 09:28:01 +00:00
Aram Sargsyan
1bc60caaa0 Add system test for checking TLS interfaces after a reconfiguration 2021-12-13 10:19:57 +00:00