Commit graph

10952 commits

Author SHA1 Message Date
Ondřej Surý
cad2706cce Replace the statschannel truncated tests with two new tests
Now that the artificial limit on the recv buffer has been removed, the
current system test always fails because it tests if the truncation has
happened.

Add test that sending more than 10 headers makes the connection to
closed; and add test that sending huge HTTP request makes the connection
to be closed.
2022-10-14 11:26:54 +02:00
Ondřej Surý
beecde7120 Rewrite isc_httpd using picohttpparser and isc_url_parse
Rewrite the isc_httpd to be more robust.

1. Replace the hand-crafted HTTP request parser with picohttpparser for
   parsing the whole HTTP/1.0 and HTTP/1.1 requests.  Limit the number
   of allowed headers to 10 (arbitrary number).

2. Replace the hand-crafted URL parser with isc_url_parse for parsing
   the URL from the HTTP request.

3. Increase the receive buffer to match the isc_netmgr buffers, so we
   can at least receive two full isc_nm_read()s.  This makes the
   truncation processing much simpler.

4. Process the received buffer from single isc_nm_read() in a single
   loop and schedule the sends to be independent of each other.

The first two changes makes the code simpler and rely on already
existing libraries that we already had (isc_url based on nodejs) or are
used elsewhere (picohttpparser).

The second two changes remove the artificial "truncation" limit on
parsing multiple request.  Now only a request that has too many
headers (currently 10) or is too big (so, the receive buffer fills up
without reaching end of the request) will end the connection.

We can be benevolent here with the limites, because the statschannel
channel is by definition private and access must be allowed only to
administrators of the server.  There are no timers, no rate-limiting, no
upper limit on the number of requests that can be served, etc.
2022-10-14 11:26:54 +02:00
Petr Špaček
53b3ceacd4
Replace #define DNS_NAMEATTR_ with struct of bools
sizeof(dns_name_t) did not change but the boolean attributes are now
separated as one-bit structure members. This allows debuggers to
pretty-print dns_name_t attributes without any special hacks, plus we
got rid of manual bit manipulation code.
2022-10-13 17:04:02 +02:00
Artem Boldariev
95a551de7b doth system test: increase transfers-in/out limits
Sometimes doth test could intermittently fail shortly after start due
to inability to complete a zone transfer in time. As it turned out, it
could happen due to transfers-in/out limits. Initially the defaults
were fine, but over time, especially when adding Strict/Mutual TLS, we
added more than 10 zones so it became possible to hit the limits.

This commit takes care of that by bumping the limits.
2022-10-12 21:52:52 +03:00
Artem Boldariev
354494cd10 doth system test - decrease HTTP listener quota size
This commit reduces the size of HTTP listener quota from 300 (default)
to 100 so that it would make hitting any global limits in case of
running multiple tests in parallel in multiple containers unlikely.

This way the need in opening many file descriptors of different
kinds (e.g. client side connections and pipes) gets significantly
reduced while the required code paths are still verified.
2022-10-12 21:46:39 +03:00
Michał Kępień
18e20f95f6 Fix startup detection after restart in start.pl
The bin/tests/system/start.pl script waits until a "running" message is
logged by a given name server instance before attempting to send a
version.bind/CH/TXT query to it.  The idea behind this was to make the
script wait until named loads all the zones it is configured to serve
before telling the system test framework that a given server is ready to
use; this prevents the need to add boilerplate code that waits for a
specific zone to be loaded to each test expecting that.

The problem is that when it looks for "running" messages, the
bin/tests/system/start.pl script assumes that the existence of any such
message in the named.run file indicates that a given named instance has
already finished loading all zones.  Meanwhile, some system tests
restart all the named instances they use throughout their lifetime (some
even do that a few times), for example to run Python-based tests.  The
bin/tests/system/start.pl script handles such a scenario incorrectly: as
soon as it finds any "running" message in the named.run file it inspects
and it gets a response to a version.bind/CH/TXT query, it tells the
system test framework that a given server is ready to use, which might
not be true - it is possible that only the "version.bind" zone is loaded
at that point and the "running" message found was logged by a
previously-shutdown named instance. This triggers intermittent failures
for Python-based tests.

Fix by improving the logic that the bin/tests/system/start.pl script
uses to detect server startup: check how many "running" lines are
present in a given named.run file before attempting to start a named
instance and only proceed with version.bind/CH/TXT queries when the
number of "running" lines found in that named.run file increases after
the server is started.
2022-10-11 11:54:57 +02:00
Michał Kępień
9146b956ae Do not truncate ns2 logs in the "rrsetorder" test
In the "rrsetorder" system test, the ns2 named instance is restarted
without passing the --restart option to bin/tests/system/start.pl.  This
causes the log file for that named instance to be needlessly truncated.
Prevent this from happening by restarting the affected named instance
in the same way as all the other named instances used in system tests.
2022-10-11 11:54:57 +02:00
Petr Špaček
058c1744ba
Clarify error message about missing inline-signing & dnssec-policy 2022-10-06 10:26:30 +02:00
Mark Andrews
491a8cfe96 Add sleeps to ixfr system test
ensure that at least a second has passed since a zone was last loaded
to prevent it accidentally being skipped as up to date.
2022-10-06 08:18:03 +11:00
Ondřej Surý
0dcbc6274b Record the 'edns-udp-size' in the view, not in the resolver
Getting the recorded value of 'edns-udp-size' from the resolver requires
strong attach to the dns_view because we are accessing `view->resolver`.
This is not the case in places (f.e. dns_zone unit) where `.udpsize` is
accessed.  By moving the .udpsize field from `struct dns_resolver` to
`struct dns_view`, we can access the value directly even with weakly
attached dns_view without the need to lock the view because `.udpsize`
can be accessed after the dns_view object has been shut down.
2022-10-05 11:59:36 -07:00
Michal Nowak
f5d9fa6ea4
Drop flake8 ignore lists
flake8 is not used in BIND 9 CI and inline ignore lists are not needed
anymore.
2022-10-05 17:56:24 +02:00
Ondřej Surý
e18b6fb6a6
Use isc_mem_regetx() when appropriate
While refactoring the isc_mem_getx(...) usage, couple places were
identified where the memory was resized manually.  Use the
isc_mem_reget(...) that was introduced in [GL !5440] to resize the
arrays via function rather than a custom code.
2022-10-05 16:44:05 +02:00
Ondřej Surý
c0598d404c
Use designated initializers instead of memset()/MEM_ZERO for structs
In several places, the structures were cleaned with memset(...)) and
thus the semantic patch converted the isc_mem_get(...) to
isc_mem_getx(..., ISC_MEM_ZERO).  Use the designated initializer to
initialized the structures instead of zeroing the memory with
ISC_MEM_ZERO flag as this better matches the intended purpose.
2022-10-05 16:44:05 +02:00
Ondřej Surý
c1d26b53eb
Add and use semantic patch to replace isc_mem_get/allocate+memset
Add new semantic patch to replace the straightfoward uses of:

  ptr = isc_mem_{get,allocate}(..., size);
  memset(ptr, 0, size);

with the new API call:

  ptr = isc_mem_{get,allocate}x(..., size, ISC_MEM_ZERO);
2022-10-05 16:44:05 +02:00
Mark Andrews
285351d4b2 Add additional forensics to zero system test 2022-10-05 07:46:01 +00:00
Matthijs Mekking
0681b15225 If refresh stale RRset times out, start stale-refresh-time
The previous commit failed some tests because we expect that if a
fetch fails and we have stale candidates in cache, the
stale-refresh-time window is started. This means that if we hit a stale
entry in cache and answering stale data is allowed, we don't bother
resolving it again for as long we are within the stale-refresh-time
window.

This is useful for two reasons:
- If we failed to fetch the RRset that we are looking for, we are not
  hammering the authoritative servers.

- Successor clients don't need to wait for stale-answer-client-timeout
  to get their DNS response, only the first one to query will take
  the latency penalty.

The latter is not useful when stale-answer-client-timeout is 0 though.

So this exception code only to make sure we don't try to refresh the
RRset again if it failed to do so recently.
2022-10-05 08:20:48 +02:00
Mark Andrews
6d561d3886 Add support for 'dohpath' to SVCB (and HTTPS)
dohpath is specfied in draft-ietf-add-svcb-dns and has a value
of 7.  It must be a relative path (start with a /), be encoded
as UTF8 and contain the variable dns ({?dns}).
2022-10-04 14:21:41 +11:00
Ondřej Surý
d971472321 Be more patient when stopping servers in the system tests
When the TCP test is run on the busy server, the server might take a
while to wind the server down because it might still be processing all
that 300k invalid XFR requests.

Increate the rncd wait time to 120 seconds, the SIGTERM time to 300
seconds, and reduce the time to wait for ans servers from 1200 second
to just 120 seconds.
2022-09-30 17:12:44 +02:00
Ondřej Surý
477eb22c12
Refactor isc_ratelimiter API
Because the dns_zonemgr_create() was run before the loopmgr was started,
the isc_ratelimiter API was more complicated that it had to be.  Move
the dns_zonemgr_create() to run_server() task which is run on the main
loop, and simplify the isc_ratelimiter API implementation.

The isc_timer is now created in the isc_ratelimiter_create() and
starting the timer is now separate async task as is destroying the timer
in case it's not launched from the loop it was created on.  The
ratelimiter tick now doesn't have to create and destroy timer logic and
just stops the timer when there's no more work to do.

This should also solve all the races that were causing the
isc_ratelimiter to be left dangling because the timer was stopped before
the last reference would be detached.
2022-09-30 10:36:30 +02:00
Ondřej Surý
36cdeb7656 Remove debugging fprintf from run_server()
In the loopmgr branch, we forgot the scissors^Hdebugging output in the
patient^Hnamed, remove it.
2022-09-29 14:22:58 +02:00
Aram Sargsyan
ae4296729c Test dynamic update forwarding when using a TLS-enabled primary
Add several test cases in the 'upforwd' system test to make sure
that different scenarios of Dynamic DNS update forwarding are
tested, in particular when both the original and forwarded requests
are over Do53, or DoT, or they use different transports.
2022-09-28 09:36:24 +00:00
Mark Andrews
432064f63c Suffix may be used before it is assigned a value
CID 350722 (#5 of 7): Bad use of null-like value (FORWARD_NULL)
        12. invalid_operation: Invalid operation on null-like value suffix.
    145        r.authority.append(
    146            dns.rrset.from_text(
    147                "icky.ptang.zoop.boing." + suffix,
    148                1,
    149                IN,
    150                NS,
    151                "a.bit.longer.ns.name." + suffix,
    152            )
    153        )
2022-09-27 23:47:12 +00:00
Ondřej Surý
3b31f7f563
Add autoconf option to enable memory leak detection in libraries
There's a known memory leak in the engine_pkcs11 at the time of writing
this and it interferes with the named ability to check for memory leaks
in the OpenSSL memory context by default.

Add an autoconf option to explicitly enable the memory leak detection,
and use it in the CI except for pkcs11 enabled builds.  When this gets
fixed in the engine_pkc11, the option can be enabled by default.
2022-09-27 17:53:04 +02:00
Ondřej Surý
d1cc847ab0
Check the libuv, OpenSSL and libxml2 memory context on exit
As we can't check the deallocations done in the library memory contexts
by default because it would always fail on non-clean exit (that happens
on error or by calling exit() early), we just want to enable the checks
to be done on normal exit.
2022-09-27 17:10:42 +02:00
Ondřej Surý
e537fea861
Use custom isc_mem based allocator for libxml2
The libxml2 library provides a way to replace the default allocator with
user supplied allocator (malloc, realloc, strdup and free).

Create a memory context specifically for libxml2 to allow tracking the
memory usage that has originated from within libxml2.  This will provide
a separate memory context for libxml2 to track the allocations and when
shutting down the application it will check that all libxml2 allocations
were returned to the allocator.

Additionally, move the xmlInitParser() and xmlCleanupParser() calls from
bin/named/main.c to library constructor/destructor in libisc library.
2022-09-27 17:10:42 +02:00
Petr Špaček
c648e280e4
Document list of crypto algorithms in named -V output 2022-09-27 16:54:39 +02:00
Mark Andrews
d34ecdb366
Deduplicate string formating 2022-09-27 16:54:39 +02:00
Mark Andrews
3156d36495
silence scan-build false positive 2022-09-27 16:54:39 +02:00
Mark Andrews
cb1515e71f
Report algorithms supported by named at startup 2022-09-27 16:54:39 +02:00
Mark Andrews
b308f866c0
Have 'named -V' report supported algorithms
These cover DNSSEC, DS, HMAC and TKEY algorithms.
2022-09-27 16:54:39 +02:00
Mark Andrews
151cc2fff9
Replace alg_totext with dst_hmac_algorithm_totext
The new library function will be reused by subsequent commits.
2022-09-27 16:54:39 +02:00
Mark Andrews
09f7e0607a
Convert DST_ALG defines to enum and group HMAC algorithms
The HMACs and GSSAPI are just using unallocated values.
Moving them around shouldn't cause issues.
Only the dnssec system test knew the internal number in use for hmacmd5.
2022-09-27 16:54:36 +02:00
Aram Sargsyan
4509c4f1bd Use the return value of isc_task_create()
Improve the error handling by checking the isc_task_create()
function's return value.

CID 356329:

    /bin/dnssec/dnssec-signzone.c: 3732 in main()
    3726     	if (directory == NULL) {
    3727     		directory = ".";
    3728     	}
    3729
    3730     	isc_managers_create(&mctx, ntasks, &loopmgr, &netmgr, &taskmgr);
    3731
    >>>     CID 356329:  Error handling issues  (CHECKED_RETURN)
    >>>     Calling "isc__task_create" without checking return value (as is done elsewhere 16 out of 18 times).
    3732     	isc_task_create(taskmgr, &write_task, 0);
    3733
    3734     	result = dst_lib_init(mctx, engine);
    3735     	if (result != ISC_R_SUCCESS) {
    3736     		fatal("could not initialize dst: %s",
    3737
2022-09-27 12:22:34 +00:00
Mark Andrews
176e172210 Check that changing the TSIG key is successful
Switch the primary to require 'next_key' for zone transfers then
update the catalog zone to say to use 'next_key'.  Next update the
zones contents then check that those changes are seen on the
secondary.
2022-09-27 21:54:02 +10:00
Mark Andrews
805e2ba31d
Add the ability to dig to specify the signing time 2022-09-26 16:28:23 +02:00
Mark Andrews
4d248ee78e
Allow dig to SIG(0) sign a message 2022-09-26 16:28:23 +02:00
Aram Sargsyan
bd8299d7b5 Document nsupdate options related to DoT
Add documentation for the newly implemented DoT feature of the
nsupdate program.
2022-09-23 13:27:44 +00:00
Aram Sargsyan
f2bb80d6ae Extend the nsupdate system test with DoT-related checks
Add a simple test PKI based on the existing one in the doth test.

Check ephemeral, forward-secrecy, and forward-secrecy-mutual-tls
TLS configurations with different scenarios.
2022-09-23 13:23:49 +00:00
Aram Sargsyan
60f1a73754 Fix a typo in doth system test's CA.cfg
The comments in CA.cfg file serve as a good tutorial for setting up
a simple PKI for a system test. There is a typo in one of the presented
commands, which results in openssl not exiting with an error message
instead of generating a certificate.

Fix the typo.
2022-09-23 13:23:49 +00:00
Aram Sargsyan
13000c28c2 Implement DoT support for nsupdate
Implement DNS-over-TLS support for nsupdate. Use DiG's DoT
implementation as a model for the newly added features.
2022-09-23 13:23:49 +00:00
Petr Špaček
c46ad4aec2
Fix JUnit test status generator for out-of-tree system tests
- Use separate paths for tests results and test script
- For tarball tests include the conversion script in the `make dist`
2022-09-22 15:20:23 +02:00
Michał Kępień
1814349374 Add tests for broken glueless referrals
If an NS RRset at the parent side of a delegation point only contains
in-bailiwick NS records, at least one glue record should be included in
every referral response sent for such a delegation point or else clients
will need to send follow-up queries in order to determine name server
addresses.  In certain edge cases (when the total size of a referral
response without glue records was just below to the UDP packet size
limit), named failed to adhere to that rule by sending non-truncated,
glueless referral responses.

Add tests attempting to trigger that bug in several different scenarios,
covering all possible combinations of the following factors:

  - type of zone (signed, unsigned),
  - glue record type (A, AAAA, both).
2022-09-22 14:03:17 +02:00
Michał Kępień
791d26b99e Clean up the "glue" system test
Bring the "glue" system test up to speed with other system tests: add
check numbering, ensure test artifacts are preserved upon failure,
improve error reporting, make the test fail upon unexpected errors,
address ShellCheck warnings.
2022-09-22 14:03:17 +02:00
Ondřej Surý
6797ca49cf
Wait for the telemetry check to finish
Instead of expecting that telemetry check has already finished,
wait for it for maximum of three seconds, because named is run with
-tat=3, so the telemetry check must happen with 3 second window.

Co-authored-by: Evan Hunt <each@isc.org>
2022-09-22 09:45:54 +02:00
Aram Sargsyan
90959f6166 Implement TLS transport support for dns_request and dns_dispatch
This change prepares ground for sending DNS requests using DoT,
which, in particular, will be used for forwarding dynamic updates
to TLS-enabled primaries.
2022-09-19 16:36:28 +00:00
Ondřej Surý
f6e4f620b3
Use the semantic patch to do the unsigned -> unsigned int change
Apply the semantic patch on the whole code base to get rid of 'unsigned'
usage in favor of explicit 'unsigned int'.
2022-09-19 15:56:02 +02:00
Michal Nowak
658cae9fad
Bump socket.create_connection() timeout to 10 seconds
The tcp Pytest on OpenBSD fairly reliably fails when receive_tcp()
on a socket is attempted:

    >           (response, rtime) = dns.query.receive_tcp(sock, timeout())

    tests-tcp.py:50:
    _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
    /usr/local/lib/python3.9/site-packages/dns/query.py:659: in receive_tcp
        ldata = _net_read(sock, 2, expiration)
    _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

    sock = <socket.socket [closed] fd=-1, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6>
    count = 2, expiration = 1662719959.8106785

        def _net_read(sock, count, expiration):
            """Read the specified number of bytes from sock.  Keep trying until we
            either get the desired amount, or we hit EOF.
            A Timeout exception will be raised if the operation is not completed
            by the expiration time.
            """
            s = b''
            while count > 0:
                try:
    >               n = sock.recv(count)
    E               socket.timeout: timed out

This is because the socket is already closed.

Bump the socket connection timeout to 10 seconds.
2022-09-15 11:13:36 +02:00
Ondřej Surý
52b62b7890
Add support for reporting status via sd_notify()
sd_notify() may be called by a service to notify the service manager
about state changes. It can be used to send arbitrary information,
encoded in an environment-block-like string. Most importantly, it can be
used for start-up completion notification.

Add libsystemd check to autoconf script and when the library is detected
add calls to sd_notify() around the server->reload_status changes.

Co-authored-by: Petr Špaček <pspacek@isc.org>
2022-09-15 10:12:15 +02:00
Evan Hunt
a2bbe578bf
Add tests for the new log messages with refusal reason
Update the allow-query test to check for the new log messages.
2022-09-15 06:50:57 +02:00
Mark Andrews
b1ef1ded69 Emit key algorithm + key id in dnssec signing statsistics
If there was a collision of key id across algorithms it was not
possible to determine where counter applies to which algorithm for
xml statistics while for json only one of the values was emitted.
The key names are now "<algorithm-number>+<id>" (e.g. "8+54274").
2022-09-15 08:42:45 +10:00