These tests rely on somewhat precise timing, as they test that answers
arrive in a particular latency bucket within the statschannel stats.
These tests are affected by various timing and network issues on our
FreeBSD CI runners and the results are very unstable. Skip these on
FreeBSD entirely.
Some dns message modifications like TSIG happen only after .to_wire() is
called on the message. To ensure there isn't a discrepancy between what
has been logged and what has been sent, log the query after
dns.query.udp() is executed (which calls .to_wire() on the message).
Co-Authored-By: Štěpán Balážik <stepan@isc.org>
Make the maximum number of processed delegation nameservers configurable
via the new 'max-delegation-servers' option (default: 13), replacing the
hardcoded NS_PROCESSING_LIMIT (20).
The default is reduced to 13 to precisely match the maximum number of
root servers that can fit into a classic 512-byte UDP payload. This
provides a natural, historically sound cap that mitigates resource
exhaustion and amplification attacks from artificially inflated or
misconfigured delegations.
The configuration option is strictly bounded between 1 and 100 to ensure
resolver stability.
Add a resolver instance "ns4" in the statschannel test and a "ans5"
instance which adds latency to the queries delegeated to it from the
resolver.
Make queries which add latency, and compare the expected values to
the values received from the statistics channel.
The granularity of the simple histogram with fixed number of ranges
sometimes isn't good enough. As there's a need to implement a new
histogram statistics for the incoming query times (RTT), it was decided
to also update the existing RTT statistics of the outgoing queries
so that they look similar and use common code.
Remove the old histogram code from the resolver and from the statistics
channel. Reimplement the outgoing queries RTT histogram using the
isc_histomulti module, and prepare the necessary base for implementing
the incoming queries RTT histogram. The statistics channel will be
updated to expose the new histograms in an upcoming commit.
Introduce a new system test (nsprocessinglimit) to verify that the
resolver strictly respects outgoing network fetch quotas when presented
with heavily delegated, unresponsive zones.
This test acts as a regression check for the recent Fisher-Yates nameserver
selection refactor. It sets up an authoritative server delegating a zone
to 23 distinct nameservers (all pointing to unresponsive loopback IPs).
Using dnstap, the test forces a resolution failure and verifies that:
1. The resolver successfully traverses the zone delegation path.
2. The resolver caps the outgoing network queries to the delegated
nameservers exactly at the processing limit (20 fetches), ensuring
array boundaries and dynamic fetch quotas are strictly enforced without
crashing or hanging.
Add randomizens system test which ensures that NS are randomly selected.
The test relies of the fact that `getaddresses_allowed()` logic won't
allow to query more than 3 NS at the top-level. The `example.` zone has
4 NS and the 3 formers are lame. As a result, if the resolved doesn't
randomize the NS selection, it will only quiery the 3 formers, which
won't give an answer, and fails. With randomization enabled, there is a
chance that the resolver queries the fourth NS, and gets the result.
If an invalid SKR file is imported, reading the time from the token
buffer might overflow the buffer on the local stack. This has been
fixed by removing the intermediate buffer and parsing the lexer token
directly.
Adds a static system test that fails to load an NSEC3 record with an
invalid next part length. Additionally, introduces a dynamic test using
a crafted authoritative DNS proxy to inject invalid NSEC3 records on the
fly to test runtime behavior.
NSEC3 hashes are required to fit within a single DNS label. Since there
are 5 bits per label byte without pad characters, the maximum hash size
is floor(63*5/8) (39 bytes).
This patch enforces this maximum length for unknown algorithms, while
strictly enforcing the exact expected digest length for known algorithms
like SHA-1.
Add a system test that has one invalid DS record with supported
algorithm and one unsupported DS record. Both DNSKEY and A queries must
fail with SERVFAIL.
A regression was introduced when adding the EDE code for unsupported
DNSKEY and DS algorithms. When the parent has both supported and
unsupported algorithm in the DS record, the validator would treat the
supported DS algorithm as insecure when validating DNSKEY records
instead of BOGUS. This has not security impact as the rest of the child
zone correctly ends with BOGUS status, but it is incorrect and thus the
regression has been fixed.
Three variants of YWH-PGM40640-56: Stale/Wrong DNS Data Served via
CNAME Flag Leak (DNS_DBFIND_STALEOK persistence) are presented in
GitLab issue #5751. All these variants have been converted to system
tests.
Variant 1 forwards source.stale to another server, that provides a
CNAME record, while the resolver is authoritative for target.stale.
The CNAME points to a non-existing name. A stale CNAME record should
result in a stale NXDOMAIN (instead of SERVFAIL).
Variant 2 forwards both source.stale and target.stale to other servers.
This time the CNAME points to an A RRset. If the source.stale server
is not available (and stale-answer-client-timeout is off), the cached
CNAME should be followed and pick up the fresh RRset (instead of the
stale A RRset).
Variant 3 is similar to variant 2, but this time the CNAME points to
a non-existing name again. After flushing the target, BIND should
return a stale NXDOMAIN (instead of SERVFAIL).
The goal here is to help new or infrequent users figure out the most
basic ways to use dig.
Notes on the choice of examples:
* I wrote examples that users can copy and paste exactly as is, without
having to come up with an appropriate IP address or domain name to use.
The one exception is the `dig -x` example which uses an IP from the
example range.
* `dig +noall +answer` here is because learning about `+noall +answer`
was lifechanging for me when I learned about it, I've heard from
others that they find it helpful too, and it's pretty hard to infer
from the man page as is that it might be useful
* I thought about adding `+trace` but left it out because 5 examples was
already starting to feel like a lot.
Add a pylint plugin that enforces:
- There is no bare `import dns` statement.
- All `dns.<module>` used are explicitly imported.
- There are no unused `dns.<module>` imports.
Fix all the imports to conform with this check.
In Python 3.10 strings don't support the | operator, so ruff doesn't
attempt to fix these. Quote the entire type specification to avoid the
typing.Optional import.
Alternatives I considered:
- leaving it as is (only use of Optional in the code base)
- using `from future import __annotations__` (replacing one import with
another one)
Importing pytest fixture trips up static analysis tools, so move
default_algorithm to conftest.py and use it instead of os.environ
accesses in various system tests.
For use outside test function, use Algorithm.default().
For Linux >= 6.8:
Since 2023, Linux has introduced a change to the IP_LOCAL_PORT_RANGE
socket option that eliminates the need for the random window
shifting (implemented as a fallback in the next commit).
By setting IP_LOCAL_PORT_RANGE option, we tell the kernel to use better
approach to the source port selection.
For Linux << 6.8:
This implement selecting port by random shifting range leveraging the
IP_LOCAL_PORT_RANGE socket option. The network manager is initialized
with the ephemeral port range (on startup and on reconfig) and then for
every outgoing TCP connection, we define a custom port range (1000
ports) and then randomly shift the custom range within the system range.
This helps the kernel to reduce the search space to the custom window
between <random_offset, random_offset + 1000>.
Reference:
https://blog.cloudflare.com/linux-transport-protocol-port-selection-performance/#kernel
The function was already marked as never failing, always returning
ISC_R_SUCCESS, so there was a lot of dead code around checking whether
the result would be ISC_R_SUCCESS. This has been cleaned up.
named was asserting when the notify source address was not available
and TSIG was being used. Check this scenario by adding a nameserver
to the zone which is configured to uses a non-existent source address
and a blackholed destination address and a TSIG using a server clause
for that destination address.