Commit graph

36756 commits

Author SHA1 Message Date
Michał Kępień
b2c0b56d64 Merge branch 'michal/run-a-short-respdiff-test-for-all-merge-requests' into 'main'
Run a short respdiff test for all merge requests

See merge request isc-projects/bind9!6585
2022-07-18 13:16:01 +00:00
Michał Kępień
ea8a0bde8e Run a short respdiff test for all merge requests
Now that the respdiff tests can detect memory leaks, it is worth running
them for every merge request.  However, the existing respdiff-based
tests take a while to complete (about half an hour with our current CI
infrastructure), which does not make them a good fit for this purpose.
Add a new GitLab CI job, "respdiff-short", which uses a smaller query
set that gets processed within a couple of minutes on our current CI
infrastructure.  Rename the existing respdiff-based jobs to make
distinguishing them easier.
2022-07-18 14:39:02 +02:00
Michał Kępień
004a7bb376 Extract respdiff job definition to a YAML anchor
Ensure the common parts of all jobs using respdiff are available in the
form of a reusable YAML anchor, to reduce code duplication and to
simplify adding more respdiff-based jobs to GitLab CI.
2022-07-18 14:39:02 +02:00
Michał Kępień
a7e89a0712 Use a pre-built executable as the reference named
The "respdiff" GitLab CI job compares DNS responses produced by the
current version of named with those produced by a reference version.
The latter is built from source in each "respdiff" job, despite the fact
that the reference version changes very rarely.  Use a pre-built named
executable as the reference version instead, assuming it is available in
the OS image used for "respdiff" tests.
2022-07-18 14:39:02 +02:00
Ondřej Surý
ed8f8bc09c Merge branch 'ossl-fixes' into 'main'
Clean up OpenSSL usage a bit

See merge request isc-projects/bind9!6436
2022-07-18 12:14:34 +00:00
David Benjamin
e507ea2c85 Remove DH_clear_flags call
These calls have not been needed since OpenSSL 0.9.7h.

This dates to commit 704d6eeab1, "Work
around non-reentrancy in openssl by disabling precomputation in keys".
This was in the bundled OpenSSL 0.9.3a era and made two changes. First,
it registered a locking callback because, in those days, OpenSSL needed
a callback to support locks. Second, it set flags to disable various
bits of cached state on DH, DSA, and RSA objects.

Looking back in OpenSSL 0.9.3a, that cached state was not protected by a
lock:
https://github.com/openssl/openssl/blob/OpenSSL_0_9_3a/crypto/rsa/rsa_eay.c#L137-L142

However, this was fixed in OpenSSL 0.9.7h:
6ec8e63af6

The other flags (DSA and RSA) have since fallen away, DSA with the
removal of DSA altogether (3994b1f9c2) and
RSA with 3a8d4a316e, "openssl 0.9.6a and
higher don't have the RSA locking bug [...] other algorithms still don't
do locking when performing precomputation [...]".

That seems to be referring to this OpenSSL change, which indeed fixed it
for RSA but not others:
bb617a9646

The 0.9.7h change above fixed it across the board, but there was never a
similar update to the workaround for DSA and DH. With such OpenSSL
versions long since out of support, the last remains of this workaround
can finally be removed.
2022-07-18 13:38:47 +02:00
David Benjamin
723f5a0769 Simplify BN_GENCB handling
When callback was NULL, bind9 would use BN_GENCB_set_old to set a NULL
callback because OpenSSL happened to allow a NULL "old" callback, but
not a NULL "new" callback. Instead, the way to turn off the callback is
to pass a NULL BN_GENCB itself.

Switch to doing that.
2022-07-18 13:38:44 +02:00
Ondřej Surý
99dd5d34a0 Merge branch '3453-cope-with-too-small-BUFSIZ' into 'main'
Increase the BUFSIZ-long buffers

Closes #3453

See merge request isc-projects/bind9!6579
2022-07-15 17:30:03 +00:00
Ondřej Surý
b35861f1eb Increase the BUFSIZ-long buffers
The BUFSIZ value varies between platforms: it could be 8K on Linux and
512 bytes on mingw.  Make sure the buffers are always big enough for the
output data to prevent truncation of the output by appropriately
enlarging or sizing the buffers.
2022-07-15 10:33:46 +00:00
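The truncation risk described in the commit above can be illustrated with a minimal sketch. It assumes a hypothetical helper, format_pair(), which is not part of BIND 9; the point is only that the caller sizes the buffer for the data and checks the snprintf() return value instead of trusting BUFSIZ to be large enough.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not from BIND 9): format "key=value" into 'buf'
 * and report whether the whole string fit.
 */
bool
format_pair(char *buf, size_t buflen, const char *key, const char *value) {
    int n;

    assert(buf != NULL && buflen > 0);

    /* snprintf() returns the length it needed, excluding the NUL. */
    n = snprintf(buf, buflen, "%s=%s", key, value);

    /* A result >= buflen means the output was truncated. */
    return (n >= 0 && (size_t)n < buflen);
}
```

A caller that gets false back knows the buffer was too small for the output, whatever BUFSIZ happens to be on that platform.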
Michał Kępień
f7066bb71a Merge branch '3443-memory-related-cleanups' into 'main'
Memory-related cleanups

Closes #3443

See merge request isc-projects/bind9!6567
2022-07-15 08:31:23 +00:00
Michał Kępień
fc8e6a4cb2 Update documentation for named's -M option
Remove "external" from the list of legal values for the -M command-line
option as it has not been allowed since the internal memory allocator
was removed by commit 55ace5d3aa.

Make the style of the relevant paragraph more in line with the next one
and split its contents up into an unordered list of options for improved
readability.
2022-07-15 10:23:03 +02:00
Evan Hunt
dde669f546 Merge branch '3456-dispatch-connect-race' into 'main'
remove unnecessary assertion in dns_dispatch_connect()

Closes #3456

See merge request isc-projects/bind9!6573
2022-07-15 02:26:41 +00:00
Evan Hunt
e1c81f9b1b remove unnecessary assertion in dns_dispatch_connect()
When a thread calls dns_dispatch_connect() on an unconnected TCP socket
it sets `tcpstate` from `DNS_DISPATCHSTATE_NONE` to `_CONNECTING`.
Previously, it then INSISTed that there were no pending connections
before calling isc_nm_tcpdnsconnect().

If a second thread called dns_dispatch_connect() during that window
of time, it could add a pending connection to the list, and trigger
an assertion failure.

This commit removes the INSIST since the condition is actually
harmless.
2022-07-14 16:31:01 -07:00
Ondřej Surý
84c353b5cf Merge branch 'ondrej-fix-timing-error-in-statistics-system-test' into 'main'
Wait for TCP connection refused in the statistics system test

See merge request isc-projects/bind9!6580
2022-07-14 20:33:16 +00:00
Ondřej Surý
d1433da524 Wait for TCP connection refused in the statistics system test
The statistics system test makes a query to foo.info to check for
pending connections, because ans4 doesn't respond to the query.

This might or might not (depending on exact timing) increment the failed
TCP connection counter when the query is retried over TCP, because ans4
doesn't listen on TCP.

Wait for 'connection refused' in the ns3 log file to be able to
count exactly 1 failed TCP connection.
2022-07-14 13:08:29 -07:00
Ondřej Surý
6b15eb45df Merge branch '3451-handle-transient-TCP-connect-EADDRINUSE-on-BSDs' into 'main'
Handle the transient TCP connect() failures on FreeBSD

Closes #3451 and #3452

See merge request isc-projects/bind9!6562
2022-07-14 19:38:33 +00:00
Ondřej Surý
6e71fd2a88 Add CHANGES note for [GL #3451] 2022-07-14 14:34:53 +02:00
Ondřej Surý
3e10d3b45f Cleanup the STATID_CONNECT and STATID_CONNECTFAIL stat counters
The STATID_CONNECT and STATID_CONNECTFAIL statistics were used
incorrectly. The STATID_CONNECT was incremented twice (once in
the *_connect_direct() and once in the callback) and STATID_CONNECTFAIL
would not be incremented at all if the failure happened in the callback.

Closes: #3452
2022-07-14 14:34:53 +02:00
Ondřej Surý
a280855f7b Handle the transient TCP connect() failures on FreeBSD
On FreeBSD (and perhaps other *BSD) systems, the TCP connect() call (via
uv_tcp_connect()) can fail with a transient UV_EADDRINUSE error.  The UDP
code already handles this by trying three times (is a charm) before
giving up.  Add code for the TCP, TCPDNS and TLSDNS layers to also try
three times before giving up, by calling uv_tcp_connect() from the
callback two more times on a UV_EADDRINUSE error.

Additionally, stop the timer only if we succeed or on hard error via
isc__nm_failed_connect_cb().
2022-07-14 14:20:10 +02:00
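The retry policy described in the commit above can be sketched as follows. This is a synchronous simplification under stated assumptions: the real code retries asynchronously from the libuv connect callback, whereas here a hypothetical function pointer (connect_fn) stands in for one connection attempt and plain EADDRINUSE stands in for UV_EADDRINUSE. None of these names come from BIND 9.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_CONNECT_ATTEMPTS 3 /* "three times", per the commit message */

/* Hypothetical stand-in for a single uv_tcp_connect() attempt. */
typedef int (*connect_fn)(void *arg);

/*
 * Call 'attempt' until it succeeds, fails with an error other than
 * EADDRINUSE, or the attempt budget is used up.  Returns 0 on success
 * or the last error code otherwise.
 */
int
connect_with_retry(connect_fn attempt, void *arg) {
    int result = EADDRINUSE;

    assert(attempt != NULL);

    for (int i = 0; i < MAX_CONNECT_ATTEMPTS && result == EADDRINUSE; i++) {
        result = attempt(arg);
    }
    return (result);
}

/* Test double: fails with EADDRINUSE '*remaining' times, then succeeds. */
int
flaky_connect(void *arg) {
    int *remaining = arg;

    if (*remaining > 0) {
        (*remaining)--;
        return (EADDRINUSE);
    }
    return (0);
}
```

With two transient failures the third attempt succeeds; with three, the caller sees EADDRINUSE and gives up, which is the point at which a hard-failure path would take over.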
Mark Andrews
08af14a475 Merge branch '3448-redundant-assignment-of-clistenon-in-bin-named-server-c' into 'main'
Resolve "Redundant assignment of clistenon in bin/named/server.c"

Closes #3448

See merge request isc-projects/bind9!6560
2022-07-14 01:14:14 +00:00
Mark Andrews
ee9ec0052e Remove redundant assignment of 'clistenon = NULL' 2022-07-14 00:24:37 +00:00
Evan Hunt
965279479e Merge branch '3454-check-putstr' into 'main'
check putstr return values

Closes #3454

See merge request isc-projects/bind9!6574
2022-07-14 00:23:17 +00:00
Evan Hunt
9372baac27 check putstr return values
The calls to putstr() in named_server_fetchlimit() were not checked
for failure.
2022-07-14 00:04:39 +00:00
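As an illustration of the kind of fix described in the commit above, here is a minimal sketch of checking every append to a bounded text buffer. The textbuf_t type, buf_putstr(), write_report() and the local CHECK() macro are all hypothetical stand-ins, not the BIND 9 definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical bounded output buffer. */
typedef struct {
    char *base;
    size_t used;
    size_t size;
} textbuf_t;

/* Append 's' to the buffer; return -1 if it does not fit. */
int
buf_putstr(textbuf_t *b, const char *s) {
    size_t len = strlen(s);

    assert(b != NULL);
    if (len > b->size - b->used) {
        return (-1);
    }
    memcpy(b->base + b->used, s, len);
    b->used += len;
    return (0);
}

/* Illustrative macro: bail out on the first failed operation. */
#define CHECK(op)              \
    do {                       \
        result = (op);         \
        if (result != 0) {     \
            goto cleanup;      \
        }                      \
    } while (0)

/* Write "name: value", propagating any append failure to the caller. */
int
write_report(textbuf_t *b, const char *name, const char *value) {
    int result;

    CHECK(buf_putstr(b, name));
    CHECK(buf_putstr(b, ": "));
    CHECK(buf_putstr(b, value));
cleanup:
    return (result);
}
```

Without the checks, a failed append would go unnoticed and the caller would report success for incomplete output.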
Mark Andrews
132b62614b Merge branch '3447-lib-dns-tkey-c-free_namelist-should-be-disassociating-associated-rdatatsets' into 'main'
Resolve "lib/dns/tkey.c:free_namelist should be disassociating associated rdatatsets"

Closes #3447

See merge request isc-projects/bind9!6556
2022-07-14 00:03:38 +00:00
Mark Andrews
b38a5d895f disassociate rdatasets when cleaning up
free_namelist could be passed names with associated rdatasets
when handling errors.  These need to be disassociated before
calling dns_message_puttemprdataset.
2022-07-13 23:43:39 +00:00
Mark Andrews
f80b7e0006 Merge branch '3449-kasp-system-test-failed-to-log-some-zones-during-setup' into 'main'
Resolve "kasp system test failed to log some zones during setup"

Closes #3449

See merge request isc-projects/bind9!6561
2022-07-13 23:42:38 +00:00
Mark Andrews
eb5e5edf82 kasp: add missing logging during setup
Some zones were not being logged when just DNSSEC keys were being
generated in system test setup phase.  Add logging for these zones.
2022-07-13 23:19:39 +00:00
Michał Kępień
204ead4c31 Merge branch '3054-3-improve-reporting-for-pthreads-errors' into 'main'
[3/3] Improve reporting for pthreads errors

Closes #3054

See merge request isc-projects/bind9!6572
2022-07-13 13:05:11 +00:00
Michał Kępień
34a804d956 Merge branch '3054-2-enable-tracking-pthreads-objects' into 'main'
[2/3] Enable tracking pthreads objects

See merge request isc-projects/bind9!6571
2022-07-13 13:03:59 +00:00
Michał Kępień
3ea3af37a4 Merge branch '3054-1-misc-pthreads-cleanups' into 'main'
[1/3] Miscellaneous pthreads cleanups

See merge request isc-projects/bind9!6570
2022-07-13 13:01:32 +00:00
Michał Kępień
b67ff4728f Improve reporting for barrier errors
uv_barrier_init() errors are currently ignored.  Use UV_RUNTIME_CHECK()
to catch them and to improve error reporting for any uv_barrier_init()
run-time failures (by augmenting error messages with file/line
information and the error string corresponding to the value returned).
2022-07-13 13:19:32 +02:00
Michał Kępień
5bc7ce41b7 Detect pthreads object leaks during respdiff tests
Set the ISC_TRACK_PTHREADS_OBJECTS preprocessor macro when preparing a
build of BIND 9 for respdiff testing and pass the -m command-line option
to respdiff.sh in order to enable automatic identification of memory
leaks during respdiff tests.
2022-07-13 13:19:32 +02:00
Ondřej Surý
deae974366 Directly cause assertion failure on pthreads primitives failure
Instead of returning error values from isc_rwlock_*(), isc_mutex_*(),
and isc_condition_*() macros/functions and subsequently carrying out
runtime assertion checks on the return values in the calling code,
trigger assertion failures directly in those macros/functions whenever
any pthread function returns an error, as there is no point in
continuing execution in such a case anyway.
2022-07-13 13:19:32 +02:00
Michał Kępień
7009f9d270 Improve reporting for read-write lock errors
Replace direct uses of implementation-specific rwlock functions in
lib/isc/include/isc/rwlock.h with preprocessor macros that use
ERRNO_CHECK(), in order to augment rwlock-related error messages with
file/line/caller information and the error string corresponding to
errno.  Adjust the implementation-specific functions for pthreads-based
rwlocks so that they return any errors encountered to the caller instead
of aborting execution immediately using RUNTIME_CHECK().

To keep code modifications simple, make the non-pthreads-based
implementation-specific rwlock functions always return 0; these
functions continue to handle errors using less verbose run-time
assertions as they do not set errno anyway.
2022-07-13 13:19:32 +02:00
Michał Kępień
77aead5ab6 Enable tracking of pthreads barriers
Some POSIX threads implementations (e.g. FreeBSD's libthr) allocate
memory on the heap when pthread_barrier_init() is called.  Every call to
that function must be accompanied by a corresponding call to
pthread_barrier_destroy() or else the memory allocated for the barrier
will leak.

jemalloc can be used for detecting memory allocations which are not
released by a process when it exits.  Unfortunately, since jemalloc is
also the system allocator on FreeBSD and a special (profiling-enabled)
build of jemalloc is required for memory leak detection, this method
cannot be used for detecting leaked memory allocated by libthr on a
stock FreeBSD installation.

However, libthr's behavior can be emulated on any platform by
implementing alternative versions of libisc functions for creating and
destroying barriers that allocate memory using malloc() and release it
using free().  This enables using jemalloc for detecting missing
pthread_barrier_destroy() calls on any platform on which it works
reliably.

When the newly introduced ISC_TRACK_PTHREADS_OBJECTS preprocessor macro
is set, allocate isc_barrier_t structures on the heap in
isc_barrier_init() and free them in isc_barrier_destroy().  Reuse
existing barrier macros (after renaming them appropriately) for other
operations.
2022-07-13 13:19:32 +02:00
Ondřej Surý
8e5e0fa522 Use library constructor to create default mutex attr once
Instead of using isc_once_do() on every isc_mutex_init() call, use the
global library constructor to initialize the default mutex attr
object (optionally with PTHREAD_MUTEX_ADAPTIVE_NP if supported) just
once when the library is loaded.
2022-07-13 13:19:32 +02:00
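The approach in the commit above can be sketched roughly as follows, assuming a GCC- or Clang-compatible compiler that supports __attribute__((constructor)). The names default_attr, init_default_mutexattr() and sketch_mutex_init() are hypothetical, and the optional PTHREAD_MUTEX_ADAPTIVE_NP setting is left out to keep the sketch portable.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Shared default attribute object, initialized once at load time. */
static pthread_mutexattr_t default_attr;
static bool default_attr_ready = false;

/* Runs before main() when the program or library is loaded. */
__attribute__((constructor)) static void
init_default_mutexattr(void) {
    if (pthread_mutexattr_init(&default_attr) == 0) {
        /* A real build might select an adaptive mutex type here. */
        default_attr_ready = true;
    }
}

/* Every mutex reuses the pre-built attribute; no per-call "once" check. */
int
sketch_mutex_init(pthread_mutex_t *m) {
    assert(m != NULL);
    return (pthread_mutex_init(m, default_attr_ready ? &default_attr : NULL));
}
```

The design trade-off is that initialization cost moves from every mutex creation to a single step at load time.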
Michał Kępień
badeeff0ac Improve reporting for condition variable errors
Replace all uses of RUNTIME_CHECK() in lib/isc/include/isc/condition.h
with ERRNO_CHECK(), in order to improve error reporting for any
condition-variable-related run-time failures (by augmenting error
messages with file/line/caller information and the error string
corresponding to errno).
2022-07-13 13:19:32 +02:00
Ondřej Surý
e4606da2c6 Enable tracking of pthreads rwlocks
Some POSIX threads implementations (e.g. FreeBSD's libthr) allocate
memory on the heap when pthread_rwlock_init() is called.  Every call to
that function must be accompanied by a corresponding call to
pthread_rwlock_destroy() or else the memory allocated for the rwlock
will leak.

jemalloc can be used for detecting memory allocations which are not
released by a process when it exits.  Unfortunately, since jemalloc is
also the system allocator on FreeBSD and a special (profiling-enabled)
build of jemalloc is required for memory leak detection, this method
cannot be used for detecting leaked memory allocated by libthr on a
stock FreeBSD installation.

However, libthr's behavior can be emulated on any platform by
implementing alternative versions of libisc functions for creating and
destroying rwlocks that allocate memory using malloc() and release it
using free().  This enables using jemalloc for detecting missing
pthread_rwlock_destroy() calls on any platform on which it works
reliably.

When the newly introduced ISC_TRACK_PTHREADS_OBJECTS preprocessor macro
is set (and --enable-pthread-rwlock is used), allocate isc_rwlock_t
structures on the heap in isc_rwlock_init() and free them in
isc_rwlock_destroy().  Reuse existing functions defined in
lib/isc/rwlock.c for other operations, but rename them first, so that
they contain triple underscores (to indicate that these functions are
implementation-specific, unlike their mutex and condition variable
counterparts, which always use the pthreads implementation).  Define the
isc__rwlock_init() macro so that it is a logical counterpart of
isc__mutex_init() and isc__condition_init(); adjust isc___rwlock_init()
accordingly.  Remove a redundant function prototype for
isc__rwlock_lock() and rename that (static) function to rwlock_lock() in
order to avoid having to use quadruple underscores.
2022-07-13 13:19:32 +02:00
Michał Kępień
5759ace07f Handle pthread_*_init() failures consistently
isc_rwlock_init() currently detects pthread_rwlock_init() failures using
a REQUIRE() assertion.  Use the ERRNO_CHECK() macro for that purpose
instead, so that read-write lock initialization failures are handled
the same way as condition variable (pthread_cond_init()) and mutex
(pthread_mutex_init()) initialization failures.
2022-07-13 13:19:32 +02:00
Michał Kępień
f352a834a7 Improve reporting for mutex errors
Replace all uses of RUNTIME_CHECK() in lib/isc/include/isc/mutex.h with
ERRNO_CHECK(), in order to improve error reporting for any mutex-related
run-time failures (by augmenting error messages with file/line/caller
information and the error string corresponding to errno).
2022-07-13 13:19:32 +02:00
Ondřej Surý
8dfdb95a20 Enable tracking of pthreads condition variables
Some POSIX threads implementations (e.g. FreeBSD's libthr) allocate
memory on the heap when pthread_cond_init() is called.  Every call to
that function must be accompanied by a corresponding call to
pthread_cond_destroy() or else the memory allocated for the condition
variable will leak.

jemalloc can be used for detecting memory allocations which are not
released by a process when it exits.  Unfortunately, since jemalloc is
also the system allocator on FreeBSD and a special (profiling-enabled)
build of jemalloc is required for memory leak detection, this method
cannot be used for detecting leaked memory allocated by libthr on a
stock FreeBSD installation.

However, libthr's behavior can be emulated on any platform by
implementing alternative versions of libisc functions for creating and
destroying condition variables that allocate memory using malloc() and
release it using free().  This enables using jemalloc for detecting
missing pthread_cond_destroy() calls on any platform on which it works
reliably.

When the newly introduced ISC_TRACK_PTHREADS_OBJECTS preprocessor macro
is set, allocate isc_condition_t structures on the heap in
isc_condition_init() and free them in isc_condition_destroy().  Reuse
existing condition variable macros (after renaming them appropriately)
for other operations.
2022-07-13 13:19:32 +02:00
Michał Kępień
365b47caee Add an ERRNO_CHECK() preprocessor macro
In a number of situations in pthreads-related code, a common sequence of
steps is taken: if the value returned by a library function is not 0,
pass errno to strerror_r(), log the string returned by the latter, and
immediately abort execution.  Add an ERRNO_CHECK() preprocessor macro
which takes those exact steps and use it wherever (conveniently)
possible.

Notes:

 1. The "log the return value of strerror_r() and abort" pattern is used
    in a number of other places that this commit does not touch; only
    "!= 0" checks followed by isc_error_fatal() calls with
    non-customized error messages are replaced here.

 2. This change temporarily breaks file name & line number reporting for
    isc__mutex_init() errors, to prevent breaking the build.  This issue
    will be rectified in a subsequent change.
2022-07-13 13:19:32 +02:00
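A macro of the kind described in the commit above might look roughly like this. It is a sketch, not the BIND 9 definition: ERRNO_CHECK_SKETCH() and format_failure() are hypothetical names, the message layout is invented, and strerror() is used in place of strerror_r() only to sidestep the two incompatible strerror_r() variants.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: build the message that would be logged. */
int
format_failure(char *buf, size_t len, const char *file, int line,
               const char *func, int err) {
    assert(buf != NULL && len > 0);
    return (snprintf(buf, len, "%s:%d: %s(): %s", file, line, func,
                     strerror(err)));
}

/*
 * If 'ret' is non-zero, log file/line, the failing function's name and
 * the error string for errno, then abort.
 */
#define ERRNO_CHECK_SKETCH(func, ret)                                \
    do {                                                             \
        if ((ret) != 0) {                                            \
            char msg_[256];                                          \
            format_failure(msg_, sizeof(msg_), __FILE__, __LINE__,   \
                           #func, errno);                            \
            fprintf(stderr, "%s\n", msg_);                           \
            abort();                                                 \
        }                                                            \
    } while (0)
```

A call site would then read along the lines of `ERRNO_CHECK_SKETCH(pthread_mutex_lock, pthread_mutex_lock(&m));`, keeping the check on one line while still reporting where the failure happened.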
Ondřej Surý
ebcfb16576 Enable tracking of pthreads mutexes
Some POSIX threads implementations (e.g. FreeBSD's libthr) allocate
memory on the heap when pthread_mutex_init() is called.  Every call to
that function must be accompanied by a corresponding call to
pthread_mutex_destroy() or else the memory allocated for the mutex will
leak.

jemalloc can be used for detecting memory allocations which are not
released by a process when it exits.  Unfortunately, since jemalloc is
also the system allocator on FreeBSD and a special (profiling-enabled)
build of jemalloc is required for memory leak detection, this method
cannot be used for detecting leaked memory allocated by libthr on a
stock FreeBSD installation.

However, libthr's behavior can be emulated on any platform by
implementing alternative versions of libisc functions for creating and
destroying mutexes that allocate memory using malloc() and release it
using free().  This enables using jemalloc for detecting missing
pthread_mutex_destroy() calls on any platform on which it works
reliably.

Introduce a new ISC_TRACK_PTHREADS_OBJECTS preprocessor macro, which
causes isc_mutex_t structures to be allocated on the heap by
isc_mutex_init() and freed by isc_mutex_destroy().  Reuse existing mutex
macros (after renaming them appropriately) for other operations.
2022-07-13 13:19:32 +02:00
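The heap-allocation idea shared by the four "Enable tracking of pthreads ..." commits can be sketched for mutexes as follows. The names tracked_mutex_t, tracked_mutex_init() and tracked_mutex_destroy() are hypothetical, and treating the public type as a pointer to a heap-allocated pthread_mutex_t is an assumption about how such tracking could be done, not a copy of the libisc code.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>

/* With tracking enabled, the handle is a pointer to a heap object. */
typedef pthread_mutex_t *tracked_mutex_t;

/* Allocate and initialize; a leak checker now sees every live mutex. */
int
tracked_mutex_init(tracked_mutex_t *mp) {
    pthread_mutex_t *m;
    int ret;

    assert(mp != NULL);

    m = malloc(sizeof(*m));
    if (m == NULL) {
        return (ENOMEM);
    }
    ret = pthread_mutex_init(m, NULL);
    if (ret != 0) {
        free(m);
        return (ret);
    }
    *mp = m;
    return (0);
}

/* Destroy and free; skipping this call shows up as leaked memory. */
int
tracked_mutex_destroy(tracked_mutex_t *mp) {
    int ret;

    assert(mp != NULL && *mp != NULL);

    ret = pthread_mutex_destroy(*mp);
    free(*mp);
    *mp = NULL;
    return (ret);
}
```

Because each init now has a matching malloc(), a missing destroy call becomes visible to any allocator-level leak detector, even on platforms whose pthreads implementation allocates nothing itself.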
Ondřej Surý
9968a6292d Merge branch 'ondrej-update-dir-locals-for-libtest' into 'main'
Update the .dir-locals.el for libtest

See merge request isc-projects/bind9!6565
2022-07-13 10:21:35 +00:00
Ondřej Surý
80fbd849d5 Update the .dir-locals.el for libtest
The tests/libtest directory is missing from the .dir-locals.el, so the
emacs flycheck would not work for the unit tests.  Add it to the
configuration.
2022-07-13 12:17:34 +02:00
Michał Kępień
5415ecbd7c Merge branch '3439-stop-resolving-invalid-names-in-resume_dslookup' into 'main'
Stop resolving invalid names in resume_dslookup()

Closes #3439

See merge request isc-projects/bind9!6563
2022-07-13 08:59:30 +00:00
Michał Kępień
cfa398ad37 Add CHANGES entry and release note for GL #3439 2022-07-13 10:31:16 +02:00
Michał Kępień
1a79aeab44 Stop resolving invalid names in resume_dslookup()
Commit 7b2ea97e46 introduced a logic bug
in resume_dslookup(): that function now only conditionally checks
whether DS chasing can still make progress.  Specifically, that check is
only performed when the previous resume_dslookup() call invokes
dns_resolver_createfetch() with the 'nameservers' argument set to
something other than NULL, which may not always be the case.  Failing to
perform that check may trigger assertion failures as a result of
dns_resolver_createfetch() attempting to resolve an invalid name.

Example scenario that leads to such an outcome:

 1. A validating resolver is configured to forward all queries to
    another resolver.  The latter returns broken DS responses that
    trigger DS chasing.

 2. rctx_chaseds() calls dns_resolver_createfetch() with the
    'nameservers' argument set to NULL.

 3. The fetch fails, so resume_dslookup() is called.  Due to
    fevent->result being set to e.g. DNS_R_SERVFAIL, the default branch
    is taken in the switch statement.

 4. Since 'nameservers' was set to NULL for the fetch which caused the
    resume_dslookup() callback to be invoked
    (fctx->nsfetch->private->nameservers), resume_dslookup() chops
    one label off fctx->nsname and calls dns_resolver_createfetch()
    again, for a name containing one label less than before.

 5. Steps 3-4 are repeated (i.e. all attempts to find the name servers
    authoritative for the DS RRset being chased fail) until fctx->nsname
    becomes stripped down to the root name.

 6. Since resume_dslookup() does not check whether DS chasing can still
    make progress, it strips a label off the root name and continues
    its attempts at finding the name servers authoritative for the DS
    RRset being chased, passing an invalid name to
    dns_resolver_createfetch().

Fix by ensuring resume_dslookup() always checks whether DS chasing can
still make progress when a name server fetch fails.  Update code
comments to ensure the purpose of the relevant dns_name_equal() check is
clear.
2022-07-13 10:31:16 +02:00
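The failure mode in the scenario above, and the shape of the fix, can be illustrated with a simplified sketch. It assumes names are absolute, dot-terminated strings, with "." as the root; the real code works on dns_name_t objects, and is_root(), strip_label() and count_fetches() are hypothetical helpers invented for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* "." stands for the root name in this sketch. */
bool
is_root(const char *name) {
    return (strcmp(name, ".") == 0);
}

/*
 * Drop the leftmost label: "example.com." -> "com.", "com." -> ".".
 * Stripping a label off the root is the invalid step, so it is asserted
 * against here.
 */
const char *
strip_label(const char *name) {
    const char *dot;

    assert(!is_root(name));

    dot = strchr(name, '.');
    assert(dot != NULL);
    return (dot[1] == '\0' ? "." : dot + 1);
}

/*
 * Model every fetch failing: keep moving to the parent name, but always
 * check first whether any progress is still possible.
 */
int
count_fetches(const char *nsname) {
    int fetches = 0;

    while (!is_root(nsname)) { /* the check that must never be skipped */
        nsname = strip_label(nsname);
        fetches++;
    }
    return (fetches);
}
```

If the loop condition were evaluated only on some code paths, as in the bug described, strip_label() would eventually be called on the root name and trip its assertion.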
Mark Andrews
75027bc6ce Merge branch '3446-autosign-s-checking-revoked-key-with-duplicate-key-id-test-was-incomplete' into 'main'
Resolve "Autosign's 'checking revoked key with duplicate key ID' test was incomplete"

Closes #3446

See merge request isc-projects/bind9!6555
2022-07-13 00:48:09 +00:00
Mark Andrews
513cb24b55 Make "checking revoked key with duplicate key ID" work
There should be 2 keys with the same key id after the numerically
lower one is revoked (serial space arithmetic).  The DS points
at the non-revoked key so validation should still succeed.
2022-07-13 00:47:49 +00:00