Previously, netmgr, taskmgr, timermgr and socketmgr all had their own
isc_<*>mgr_create() and isc_<*>mgr_destroy() functions. The new
isc_managers_create() and isc_managers_destroy() fold all four into a
single function and makes sure the objects are created and destroy in
correct order.
Especially now, when taskmgr runs on top of netmgr, the correct order is
important and when the code was duplicated at many places it's easy to
make mistake.
The former isc_<*>mgr_create() and isc_<*>mgr_destroy() functions were
made private and a single call to isc_managers_create() and
isc_managers_destroy() is required at the program startup / shutdown.
The system test framework starts all named instances with the "-d 99"
command line option (unless it is overridden by a named.args file in a
given instance's working directory). This causes a lot of log messages
to be written to named.run files - currently over 5 million lines for a
single test suite run. While debugging information preserved in the log
files is essential for troubleshooting intermittent test failures, some
system tests involve sending hundreds or even thousands of queries,
which causes the relevant log files to explode in size. When multiple
tests (or even multiple test suites) are run in parallel, excessive
logging contributes considerably to the I/O load on the test host,
increasing the odds of intermittent test failures getting triggered.
Decrease the debug level for the seven most verbose named instances:
- use "-d 3" for ns2 in the "cacheclean" system test (it is the lowest
logging level at which the test still passes without the need to
apply any changes to tests.sh),
- use "-d 1" for the other six named instances.
This roughly halves the number of lines logged by each test suite run
while still leaving enough information in the logs to allow at least
basic troubleshooting in case of test failures.
This approach was chosen as it results in a greater decrease in the
number of lines logged than running all named instances with "-d 3",
without causing any test failures.
The test spawns 4 parallel workers that keep adding, modifying and
deleting zones, the main thread repeatedly checks wheter rndc
status responds within a reasonable period.
While environment and timing issues may affect the test, in most
test cases the deadlock that was taking place before the fix used to
trigger in less than 7 seconds in a machine with at least 2 cores.
Four named instances in the "nsupdate" system test have GSS-TSIG support
enabled. All of them currently use "tkey-gssapi-keytab". Configure two
of them with "tkey-gssapi-credential" to test that option.
As "tkey-gssapi-keytab" and "tkey-gssapi-credential" both provide the
same functionality, no test modifications are required. The difference
between the two options is that the value of "tkey-gssapi-keytab" is an
explicit path to the keytab file to acquire credentials from, while the
value of "tkey-gssapi-credential" is the name of the principal whose
credentials should be used; those credentials are looked up in the
keytab file expected by the Kerberos library, i.e. /etc/krb5.keytab by
default. The path to the default keytab file can be overridden using by
setting the KRB5_KTNAME environment variable. Utilize that variable to
use existing keytab files with the "tkey-gssapi-credential" option.
The KRB5_KTNAME environment variable should not interfere with the
"tkey-gssapi-keytab" option. Nevertheless, rename one of the keytab
files used with "tkey-gssapi-keytab" to something else than the contents
of the KRB5_KTNAME environment variable in order to make sure that both
"tkey-gssapi-keytab" and "tkey-gssapi-credential" are actually tested.
This commit changes the taskmgr to run the individual tasks on the
netmgr internal workers. While an effort has been put into keeping the
taskmgr interface intact, couple of changes have been made:
* The taskmgr has no concept of universal privileged mode - rather the
tasks are either privileged or unprivileged (normal). The privileged
tasks are run as a first thing when the netmgr is unpaused. There
are now four different queues in in the netmgr:
1. priority queue - netievent on the priority queue are run even when
the taskmgr enter exclusive mode and netmgr is paused. This is
needed to properly start listening on the interfaces, free
resources and resume.
2. privileged task queue - only privileged tasks are queued here and
this is the first queue that gets processed when network manager
is unpaused using isc_nm_resume(). All netmgr workers need to
clean the privileged task queue before they all proceed normal
operation. Both task queues are processed when the workers are
finished.
3. task queue - only (traditional) task are scheduled here and this
queue along with privileged task queues are process when the
netmgr workers are finishing. This is needed to process the task
shutdown events.
4. normal queue - this is the queue with netmgr events, e.g. reading,
sending, callbacks and pretty much everything is processed here.
* The isc_taskmgr_create() now requires initialized netmgr (isc_nm_t)
object.
* The isc_nm_destroy() function now waits for indefinite time, but it
will print out the active objects when in tracing mode
(-DNETMGR_TRACE=1 and -DNETMGR_TRACE_VERBOSE=1), the netmgr has been
made a little bit more asynchronous and it might take longer time to
shutdown all the active networking connections.
* Previously, the isc_nm_stoplistening() was a synchronous operation.
This has been changed and the isc_nm_stoplistening() just schedules
the child sockets to stop listening and exits. This was needed to
prevent a deadlock as the the (traditional) tasks are now executed on
the netmgr threads.
* The socket selection logic in isc__nm_udp_send() was flawed, but
fortunatelly, it was broken, so we never hit the problem where we
created uvreq_t on a socket from nmhandle_t, but then a different
socket could be picked up and then we were trying to run the send
callback on a socket that had different threadid than currently
running.
"resolve" is used by the resolver system tests, and I'm not
certain whether delv exercises the same code, so rather than
remove it, I moved it to bin/tests/system.
sample code for export libraries is no longer needed and
this code is not used for any internal tests. also, sample-gai.c
had already been removed but there were some dangling references.
the libdns client API is no longer being maintained for
external use, we can remove the code that isn't being used
internally, as well as the related tests.
This commit merges TLS tests into the common Network Manager unit
tests suite and extends the unit test framework to include support for
additional "ping-pong" style tests where all data could be sent via
lesser number of connections (the behaviour of the old test
suite). The tests for TCP and TLS were extended to make use of the new
mode, as this mode better translates to how the code is used in DoH.
Both TLS and TCP tests now share most of the unit tests' code, as they
are expected to function similarly from a users's perspective anyway.
Additionally to the above, the TLS test suite was extended to include
TLS tests using the connections quota facility.
The fromhex.pl script needs to be copied from the source directory to
the build directory before any test is run, otherwise the out-of-tree
fails to find it. Given that the script is used only in system test,
move it to bin/tests/system/.
This MR introduces a new system test 'keymgr2kasp' to test
migration to 'dnssec-policy'. It moves some existing tests from
the 'kasp' system test to here.
Also a common script 'kasp.sh', to be used in kasp specific tests,
is introduced.
The named-checkzone tool can also be invoked as named-compilezone. Make
sure a man page is installed for that alias. Move and rename the
"man_named-checkzone" label to prevent a Sphinx duplicate label warning
from being raised (see commit 84862e96c1
for more information).
The netmgr unit tests were designed to push the system limits to maximum
by sending as many queries as possible in the busy loop from multiple
threads. This mostly works with UDP, but in the stateful protocol where
establishing the connection takes more time, it failed quite often in
the CI. On FreeBSD, this happened more often, because the socket() call
would fail spuriosly making the problem even worse.
This commit does several things to improve reliability:
* return value of isc_nm_<proto>connect() is always checked and retried
when scheduling the connection fails
* The busy while loop has been slowed down with usleep(1000); so the
netmgr threads could schedule the work and get executed.
* The isc_thread_yield() was replaced with usleep(1000); also to allow
the other threads to do any work.
* Instead of waiting on just one variable, we wait for multiple
variables to reach the final value
* We are wrapping the netmgr operations (connects, reads, writes,
accepts) with reference counting and waiting for all the callbacks to
be accounted for.
This has two effects:
a) the isc_nm_t is always clean of active sockets and handles when
destroyed, so it will prevent the spurious INSIST(references == 1)
from isc_nm_destroy()
b) the unit test now ensures that all the callbacks are always called
when they should be called, so any stuck test means that there was
a missing callback call and it is always a real bug
These changes allows us to remove the workaround that would not run
certain tests on systems without port load-balancing.
The system tests were missing a test that would test tcp-initial-timeout
and tcp-idle-timeout.
This commit adds new "timeouts" system test that adds:
* Test that waits longer than tcp-initial-timeout and then checks
whether the socket was closed
* Test that sends and receives DNS message then waits longer than
tcp-initial-timeout but shorter time than tcp-idle-timeout than
sends DNS message again than waits longer than tcp-idle-timeout
and checks whether the socket was closed
* Similar test, but bursting 25 DNS messages than waiting longer than
tcp-initial-timeout and shorter than tcp-idle-timeout than do second
25 DNS message burst
* Check whether transfer longer than tcp-initial-timeout succeeds
The transport should also be detached when we skip a master, otherwise
named will crash when sending a SOA query to the next master over TLS,
because the transport must be NULL when we enter
'dns_view_gettransport'.
- rename dot to doth, as it now covers both dot and doh.
- merge xot into doth as it's closely related.
- added long-lived key and cert files (expiring 2121).
- add tests with https-get, https-post, http-plain, alternate
endpoints, and both static and ephemeral TLS configuration.
- incidentally fixed a memory leak in dig that occurred if +https
was specified more than once.
tests that version 1 journal files containing version 1 transaction
headers are rolled forward correctly on server startup, then updated
into version 2 journals. also checks journal file consistency and
'max-journal-size' behavior.
Call the libisc isc__initialize() constructor and isc__shutdown()
destructor from DllMain instead of having duplicate code between
those and DllMain() code.
The current isc_hp API uses internal tid_v variable that gets
incremented for each new thread using hazard pointers. This tid_v
variable is then used as a index to global shared table with hazard
pointers state. Since the tid_v is only incremented and never
decremented the table could overflow very quickly if we create set of
threads for short period of time, they finish the work and cease to
exist. Then we create identical set of threads and so on and so on.
This is not a problem for a normal `named` operation as the set of
threads is stable, but the problematic place are the unit tests where we
test network manager or other APIs (task, timer) that create threads.
This commits adds a thin wrapper around any function called from
isc_thread_create() that adds unique-but-reusable small digit thread id
that can be used as index to f.e. hazard pointer tables. The trampoline
wrapper ensures that the thread ids will be reused, so the highest
thread_id number doesn't grow indefinitely when threads are created and
destroyed and then created again. This fixes the hazard pointer table
overflow on machines with many cores. [GL #2396]
Removing stderr from the pict tool serves no purpose and drops valuable
information, we might use when debugging failed pairwise CI job, such
as:
Input Error: A parameter names must be unique
Instead of calling isc_tls_initialize()/isc_tls_destroy() explicitly use
gcc/clang attributes on POSIX and DLLMain on Windows to initialize and
shutdown OpenSSL library.
This resolves the issue when isc_nm_create() / isc_nm_destroy() was
called multiple times and it would call OpenSSL library destructors from
isc_nm_destroy().
At the same time, since we now have introduced the ctor/dtor for libisc,
this commit moves the isc_mem API initialization (the list of the
contexts) and changes the isc_mem_checkdestroyed() to schedule the
checking of memory context on library unload instead of executing the
code immediately.
The <isc/readline.h> header provided a compatibility shim to use when
other non-GNU readline libraries are in use. The two places where
readline library is being used is nslookup and nsupdate, so the header
file has been moved to bin/dig directory and it's directly included from
bin/nsupdate.
This also conceals any readline headers exposed from the libisc headers.
The --enable-option-checking=fatal option prevents ./configure from
proceeding when an unknown option is used in the ./configure step in CI.
This change will avoid adding unsupported ./configure options or options
with typo or typo in pairwise testing "# [pairwise: ...]" marker.
removed the isc_cfg_http_t and isc_cfg_tls_t structures
and the functions that loaded and accessed them; this can
be done using normal config parser functions.
This commit contains fixes to unit tests to make them work well on
various platforms (in particular ones shipping old versions of
OpenSSL) and for different configurations.
It also updates the generated manpage to include DoH configuration
options.
This commit completes the support for DNS-over-HTTP(S) built on top of
nghttp2 and plugs it into the BIND. Support for both GET and POST
requests is present, as required by RFC8484.
Both encrypted (via TLS) and unencrypted HTTP/2 connections are
supported. The latter are mostly there for debugging/troubleshooting
purposes and for the means of encryption offloading to third-party
software (as might be desirable in some environments to simplify TLS
certificates management).
This commit includes work-in-progress implementation of
DNS-over-HTTP(S).
Server-side code remains mostly untested, and there is only support
for POST requests.
This commit resurrects the old TLS code from
8f73c70d23.
It also includes numerous stability fixes and support for
isc_nm_cancelread() for the TLS layer.
The code was resurrected to be used for DoH.
Add support for a "tls" key/value pair for zone primaries, referencing
either a "tls" configuration statement or "ephemeral". If set to use
TLS, zones will send SOA and AXFR/IXFR queries over a TLS channel.
The BIND 9 libraries are considered to be internal only and hence the
API and ABI changes a lot. Keeping track of the API/ABI changes takes
time and it's a complicated matter as the safest way to make everything
stable would be to bump any library in the dependency chain as in theory
if libns links with libdns, and a binary links with both, and we bump
the libdns SOVERSION, but not the libns SOVERSION, the old libns might
be loaded by binary pulling old libdns together with new libdns loaded
by the binary. The situation gets even more complicated with loading
the plugins that have been compiled with few versions old BIND 9
libraries and then dynamically loaded into the named.
We are picking the safest option possible and usable for internal
libraries - instead of using -version-info that has only a weak link to
BIND 9 version number, we are using -release libtool option that will
embed the corresponding BIND 9 version number into the library name.
That means that instead of libisc.so.1701 (as an example) the library
will now be named libisc-9.17.10.so.
* Following the example set in 634bdfb16d, the tlsdns netmgr
module now uses libuv and SSL primitives directly, rather than
opening a TLS socket which opens a TCP socket, as the previous
model was difficult to debug. Closes#2335.
* Remove the netmgr tls layer (we will have to re-add it for DoH)
* Add isc_tls API to wrap the OpenSSL SSL_CTX object into libisc
library; move the OpenSSL initialization/deinitialization from dstapi
needed for OpenSSL 1.0.x to the isc_tls_{initialize,destroy}()
* Add couple of new shims needed for OpenSSL 1.0.x
* When LibreSSL is used, require at least version 2.7.0 that
has the best OpenSSL 1.1.x compatibility and auto init/deinit
* Enforce OpenSSL 1.1.x usage on Windows
* Added a TLSDNS unit test and implemented a simple TLSDNS echo
server and client.
When we were in nmthread, the isc__nm_async_<proto>connect() function
executes in the same thread as the isc__nm_<proto>connect() and on a
failure, it would block indefinitely because the failure branch was
setting sock->active to false before the condition around the wait had a
chance to skip the WAIT().
This also fixes the zero system test being stuck on FreeBSD 11, so we
re-enable the test in the commit.
On FreeBSD, the stack is destroyed more aggressively than on Linux and
that revealed a bug where we were allocating the 16-bit len for the
TCPDNS message on the stack and the buffer got garbled before the
uv_write() sendback was executed. Now, the len is part of the uvreq, so
we can safely pass it to the uv_write() as the req gets destroyed after
the sendcb is executed.
Due to the platform differences, on non-Linux platforms, the xfer and
ixfr tests fails and zero test gets stuck.
This commit will get reverted when we add support for netmgr
multi-threading.
This is a part of the works that intends to make the netmgr stable,
testable, maintainable and tested. It contains a numerous changes to
the netmgr code and unfortunately, it was not possible to split this
into smaller chunks as the work here needs to be committed as a complete
works.
NOTE: There's a quite a lot of duplicated code between udp.c, tcp.c and
tcpdns.c and it should be a subject to refactoring in the future.
The changes that are included in this commit are listed here
(extensively, but not exclusively):
* The netmgr_test unit test was split into individual tests (udp_test,
tcp_test, tcpdns_test and newly added tcp_quota_test)
* The udp_test and tcp_test has been extended to allow programatic
failures from the libuv API. Unfortunately, we can't use cmocka
mock() and will_return(), so we emulate the behaviour with #define and
including the netmgr/{udp,tcp}.c source file directly.
* The netievents that we put on the nm queue have variable number of
members, out of these the isc_nmsocket_t and isc_nmhandle_t always
needs to be attached before enqueueing the netievent_<foo> and
detached after we have called the isc_nm_async_<foo> to ensure that
the socket (handle) doesn't disappear between scheduling the event and
actually executing the event.
* Cancelling the in-flight TCP connection using libuv requires to call
uv_close() on the original uv_tcp_t handle which just breaks too many
assumptions we have in the netmgr code. Instead of using uv_timer for
TCP connection timeouts, we use platform specific socket option.
* Fix the synchronization between {nm,async}_{listentcp,tcpconnect}
When isc_nm_listentcp() or isc_nm_tcpconnect() is called it was
waiting for socket to either end up with error (that path was fine) or
to be listening or connected using condition variable and mutex.
Several things could happen:
0. everything is ok
1. the waiting thread would miss the SIGNAL() - because the enqueued
event would be processed faster than we could start WAIT()ing.
In case the operation would end up with error, it would be ok, as
the error variable would be unchanged.
2. the waiting thread miss the sock->{connected,listening} = `true`
would be set to `false` in the tcp_{listen,connect}close_cb() as
the connection would be so short lived that the socket would be
closed before we could even start WAIT()ing
* The tcpdns has been converted to using libuv directly. Previously,
the tcpdns protocol used tcp protocol from netmgr, this proved to be
very complicated to understand, fix and make changes to. The new
tcpdns protocol is modeled in a similar way how tcp netmgr protocol.
Closes: #2194, #2283, #2318, #2266, #2034, #1920
* The tcp and tcpdns is now not using isc_uv_import/isc_uv_export to
pass accepted TCP sockets between netthreads, but instead (similar to
UDP) uses per netthread uv_loop listener. This greatly reduces the
complexity as the socket is always run in the associated nm and uv
loops, and we are also not touching the libuv internals.
There's an unfortunate side effect though, the new code requires
support for load-balanced sockets from the operating system for both
UDP and TCP (see #2137). If the operating system doesn't support the
load balanced sockets (either SO_REUSEPORT on Linux or SO_REUSEPORT_LB
on FreeBSD 12+), the number of netthreads is limited to 1.
* The netmgr has now two debugging #ifdefs:
1. Already existing NETMGR_TRACE prints any dangling nmsockets and
nmhandles before triggering assertion failure. This options would
reduce performance when enabled, but in theory, it could be enabled
on low-performance systems.
2. New NETMGR_TRACE_VERBOSE option has been added that enables
extensive netmgr logging that allows the software engineer to
precisely track any attach/detach operations on the nmsockets and
nmhandles. This is not suitable for any kind of production
machine, only for debugging.
* The tlsdns netmgr protocol has been split from the tcpdns and it still
uses the old method of stacking the netmgr boxes on top of each other.
We will have to refactor the tlsdns netmgr protocol to use the same
approach - build the stack using only libuv and openssl.
* Limit but not assert the tcp buffer size in tcp_alloc_cb
Closes: #2061
The bin/tests/headerdep_test.sh script has not been updated since it was
first created and it cannot be used as-is with the current BIND source
code. Better tools (e.g. "include-what-you-use") emerged since the
script was committed back in 2000, so instead of trying to bring it up
to date, remove it from the source repository.
Add unit test to ensure the right NSEC3PARAM event is scheduled in
'dns_zone_setnsec3param()'. To avoid scheduling and managing actual
tasks, split up the 'dns_zone_setnsec3param()' function in two parts:
1. 'dns__zone_lookup_nsec3param()' that will check if the requested
NSEC3 parameters already exist, and if a new salt needs to be
generated.
2. The actual scheduling of the new NSEC3PARAM event (if needed).
Implement support for NSEC3 in dnssec-policy. Store the configuration
in kasp objects. When configuring a zone, call 'dns_zone_setnsec3param'
to queue an nsec3param event. This will ensure that any previous
chains will be removed and a chain according to the dnssec-policy is
created.
Add tests for dnssec-policy zones that uses the new 'nsec3param'
option, as well as changing to new values, changing to NSEC, and
changing from NSEC.
the test-async plugin uses ns_query_hookasync() at the
NS_QUERY_DONE_SEND hook point to call an asynchronous function.
the only effect is to change the query response code to "NOTIMP",
so we can confirm that the hook ran and resumed correctly.
previously query plugins were strictly synchrounous - the query
process would be interrupted at some point, data would be looked
up or a change would be made, and then the query processing would
resume immediately.
this commit enables query plugins to initiate asynchronous processes
and resume on a completion event, as with recursion.
Add server-side TLS support to netmgr - that includes moving some of the
isc_nm_ functions from tcp.c to a wrapper in netmgr.c calling a proper
tcp or tls function, and a new isc_nm_listentls() function.
Add DoT support to tcpdns - isc_nm_listentlsdns().
tests of UDP and TCP cases including:
- sending and receiving
- closure sockets without reading or sending
- closure of sockets at various points while sending and receiving
- since the teste is multithreaded, cmocka now aborts tests on the
first failure, so that failures in subthreads are caught and
reported correctly.
ans10 simulates a local anycast server which has both signed and
unsigned instances of a zone. 'A' queries get answered from the
signed instance. Everything else gets answered from the unsigned
instance. The resulting answer should be insecure.
While libltdl is a feature-rich library, BIND 9 code only uses its basic
capabilities, which are also provided by libuv and which BIND 9 already
uses for other purposes. As libuv's cross-platform shared library
handling interface is modeled after the POSIX dlopen() interface,
converting code using the latter to the former is simple. Replace
libltdl function calls with their libuv counterparts, refactoring the
code as necessary. Remove all use of libltdl from the BIND 9 source
tree.
As currently used in the BIND source tree, the --wrap linker option is
redundant because:
- static builds are no longer supported,
- there is no need to wrap around existing functions - what is
actually required (at least for now) is to replace them altogether
in unit tests,
- only functions exposed by shared libraries linked into unit test
binaries are currently being replaced.
Given the above, providing the alternative implementations of functions
to be overridden in lib/ns/tests/nstest.c is a much simpler alternative
to using the --wrap linker option. Drop the code detecting support for
the latter from configure.ac, simplify the relevant Makefile.am, and
remove lib/ns/tests/wrap.c, updating lib/ns/tests/nstest.c accordingly
(it is harmless for unit tests which are not calling the overridden
functions).
The size of the log generated by each GitLab CI job is limited to 4 MB
by default. While this limit is configurable, it makes little sense to
print build logs to standard output if they are being captured to files
anyway. Limit use of "tee" in util/pairwise-testing.sh to printing the
combination of configure switches used for a given build. This way the
job should never exceed the default 4 MB log size limit, yet it will
still indicate its progress in a concise way.
Pairwise testing is a test case generation technique based on the
observation that most faults are caused by interactions of at most two
factors. For BIND, its configure options can be thought of as such
factors.
Process BIND configure options into a model that is subsequently
processed by the PICT tool in order to find an effective test vector.
That test vector is then used for configuring and building BIND using
various combinations of configure options.
Previously, the bin/system/wire_test.c was optionally used as a fuzzer,
this commit extracts the parts relevant to the fuzzing into a
specialized fuzzer that can be used in oss-fuzz project.
The fuzzer parses the input as UDP DNS message, then prints parsed DNS
message, then renders the DNS message and then prints the rendered DNS
message. No part of the code should cause a assertion failure.
This commit updates and simplifies the checks for the readline support
in nslookup and nsupdate:
* Change the autoconf checks to pkg-config only, all supported
libraries have accompanying .pc files now.
* Add editline support in addition to libedit and GNU readline
* Add isc/readline.h shim header that defines dummy readline()
function when no readline library is available
In this commit, the simple fuzzing tests for the isc_lex_gettoken() and
isc_lex_getmastertoken() functions have been added.
As part of this commit, the initialization has been moved from fuzz.h
constructor/destructor to LLVMFuzzerInitialize() in each fuzz test. The
main.c of no-fuzzing and AFL modes have been modified to run the
LLVMFuzzerInitialize() at the start of the main() function mimicking
the libfuzzer mode of operation.
The fuzzing tests were temporarily disabled when the build system has been
converted to automake. This commit restores the functionality to run the
fuzzing tests as part of the `make check`. When the afl or libfuzzer
is enabled via ./configure, it uses a custom LOG_DRIVER (fuzz/<fuzzer.sh>).
Currently only libfuzzer.sh has been implemented that runs each fuzz
test for 5 seconds each.
There were some missing bits in the other rst files and Makefile.am(s)
that didn't reflect the rename of the main document. Also add
ddns-confgen.8 manpage.
This test ensures that named will correctly shutdown
when receiving multiple control connections after processing
of either "rncd stop" or "kill -SIGTERM" commands.
Before the fix, named was crashing due to a race condition happening
between two threads, one running shutdown logic in named/server.c
and other handling control logic in controlconf.c.
This test tries to reproduce the above scenario by issuing multiple
queries to a target named instance, issuing either rndc stop or kill
-SIGTERM command to the same named instance, then starting multiple rndc
status connections to ensure it is not crashing anymore.
Merge lib/isc/unix/ifiter_getifaddrs.c into lib/isc/unix/interfaceiter.c
and lib/isc/xoshiro128starstar.c into lib/isc/random.c. This avoids the
need for extra Automake directives required to process the "helper" *.c
files properly and makes the code more localized.
Turn the static check_bad_bits() function used by both Unix and Windows
systems into a "private" function and extract the "private" parts of
lib/isc/fsaccess.c to lib/isc/fsaccess_common_p.h. Instead of including
lib/isc/fsaccess.c from lib/isc/{unix,win32}/fsaccess.c, make the former
an independent C source file.
Rename lib/isc/fsaccess.c to lib/isc/fsaccess_common.c to prevent build
issues on Windows caused by multiple source files (lib/isc/fsaccess.c,
lib/isc/win32/fsaccess.c) being compiled into the same object file.
These changes improve consistency with the way "private" functions and
macros are treated elsewhere in the source tree.
Certain rules of the BIND development process are not codified anywhere
and/or are used inconsistently. In an attempt to improve this
situation, add a GitLab CI job which uses Danger Python to add comments
to merge requests when certain expectations are not met. Two categories
of feedback are used, only one of which - fail() - causes the GitLab CI
job to fail. Exclude dangerfile.py from Python QA checks as the way the
contents of that file are evaluated triggers a lot of Flake8 and PyLint
warnings.
The code in util/bindkeys.pl was overly complicated and it could not be
reused on Windows because redirecting stdin and stdout at the same time
from perl is overly complicated.
Now the util/bindkeys.pl accepts the input file as the first and only
argument and prints the header file to stdout. This allows the same
utility to be used from automake and win32/Configure script.
wire_test is not only used by the dnstap system test, but also in
fuzz testing. it doesn't need to be installed, but it's useful to have it
built when BIND is. this commit moves it back from bin/tests/system to
bin/tests, as a noinst_PROGRAM so that it's built by "make all" but
not installed.
Although in util/api-checker.sh we create textual reports, we don't
preserve them in job artifacts, but we should.
We don't want to keep all HTML pages present in the project root, but
just those produced by ABI checker.
The ARM and the manpages have been converted into Sphinx documentation
format.
Sphinx uses reStructuredText as its markup language, and many of its
strengths come from the power and straightforwardness of
reStructuredText and its parsing and translating suite, the Docutils.
The introduction of netmgr doubled the number of threads from which
dnstap data may be logged: previously, it could only happen from within
taskmgr worker threads; with netmgr, it can happen both from taskmgr
worker threads and from network threads. Since the argument passed to
fstrm_iothr_options_set_num_input_queues() was not updated to reflect
this change, some calls to fstrm_iothr_get_input_queue() can now return
NULL, effectively preventing some dnstap data from being logged.
Whether this bug is triggered or not depends on thread scheduling order
and packet distribution between network threads, but will almost
certainly be triggered on any recursive resolver sooner or later. Fix
by requesting the correct number of dnstap input queues to be allocated.
The current script used ephemeral port range which clashed with the
ports used by the tools (dig, ...), and the range always started with
the first port and there was 100 ports allocated for each system test.
In this commit, the first port has been randomized, the get_ports.sh
script outputs the variables (the output has to be eval'ed from run.sh)
and there's less waste in the port range.
There are several improvements over the default/previous behaviour of
the test log driver and log compiler:
* The system-test-driver.sh was dropped (it was used incorrectly)
* The run.sh script is now both log compiler and cli script to run
individual tests
* The custom-test-driver was added as extended version of the automake
test-driver with capability to tee the test output to stdout when
`--verbose yes` is passed to it (you can use LOG_DRIVER_FLAGS to
add the option by default)
* Makefile.am has been extended to honor V=1 for the system tests
test-driver (e.g. V=1 adds `--verbose yes` to AM_LOG_DRIVER_FLAGS)
The bin/tests/wire_test helper program is currently not included in any
Makefile.am file. Move its source code to bin/tests/system and build it
along other helper tools when dnstap support is requested as the
"dnstap" system test needs this tool in order to pass.
The libirs contained own re-implementations of the getaddrinfo,
getnameinfo and gai_strerror + irs_context and irs_dnsconf API that was
unused anywhere in the BIND 9.
Keep just the irs_resonf API that is being extensively used to parse
/etc/resolv.conf by several of BIND 9 tools.
The 'ephemeral' database implementation was used to provide a
lightweight database implemenation that doesn't cache results, and the
only place where it was really use is "samples" because delv is
overriding this to use "rbtdb" instead. Otherwise it was completely
unused.
* The 'ephemeral' cache DB (ecdb) implementation. An ecdb just provides
* temporary storage for ongoing name resolution with the common DB interfaces.
* It actually doesn't cache anything. The implementation expects any stored
* data is released within a short period, and does not care about the
* scalability in terms of the number of nodes.
Right before the release API version (LIBINTERFACE, LIBREVISION, LIBAGE)
for older and newer libraries tends to be the same. Given that, commit
hash can't be the determining factor here, Unix time of the commit
should suit us better and is placed after the API version. The commit
hash is preserved as it's useful to see it in the actual report.
'-nosymtbl' versions of libraries are not produced in Automake builds.
The rewrite of BIND 9 build system is a large work and cannot be reasonable
split into separate merge requests. Addition of the automake has a positive
effect on the readability and maintainability of the build system as it is more
declarative, it allows conditional and we are able to drop all of the custom
make code that BIND 9 developed over the years to overcome the deficiencies of
autoconf + custom Makefile.in files.
This squashed commit contains following changes:
- conversion (or rather fresh rewrite) of all Makefile.in files to Makefile.am
by using automake
- the libtool is now properly integrated with automake (the way we used it
was rather hackish as the only official way how to use libtool is via
automake
- the dynamic module loading was rewritten from a custom patchwork to libtool's
libltdl (which includes the patchwork to support module loading on different
systems internally)
- conversion of the unit test executor from kyua to automake parallel driver
- conversion of the system test executor from custom make/shell to automake
parallel driver
- The GSSAPI has been refactored, the custom SPNEGO on the basis that
all major KRB5/GSSAPI (mit-krb5, heimdal and Windows) implementations
support SPNEGO mechanism.
- The various defunct tests from bin/tests have been removed:
bin/tests/optional and bin/tests/pkcs11
- The text files generated from the MD files have been removed, the
MarkDown has been designed to be readable by both humans and computers
- The xsl header is now generated by a simple sed command instead of
perl helper
- The <irs/platform.h> header has been removed
- cleanups of configure.ac script to make it more simpler, addition of multiple
macros (there's still work to be done though)
- the tarball can now be prepared with `make dist`
- the system tests are partially able to run in oot build
Here's a list of unfinished work that needs to be completed in subsequent merge
requests:
- `make distcheck` doesn't yet work (because of system tests oot run is not yet
finished)
- documentation is not yet built, there's a different merge request with docbook
to sphinx-build rst conversion that needs to be rebased and adapted on top of
the automake
- msvc build is non functional yet and we need to decide whether we will just
cross-compile bind9 using mingw-w64 or fix the msvc build
- contributed dlz modules are not included neither in the autoconf nor automake
With the introduction of dnssec-policy, the aforementioned tools were
either rendered obsolete, or they will be replaced with dnssec-policy
based tools. Remove the tools and the requirement to have Python
installed. Python 3 is still being used for tests, so keep the autoconf
test, but make it much simpler.
Our python code didn't adhere to any coding standard. In this commit, we add
flame8 (https://pypi.org/project/flake8/), and pylint (https://www.pylint.org/).
There's couple of exceptions:
- ans.py scripts are not checked, nor fixed as part of this MR
- pylint's missing-*-docstring and duplicate-code checks have
been disabled via .pylintrc
Both exceptions should be removed in due time.
We introduce a isc_quota_attach_cb function - if ISC_R_QUOTA is returned
at the time the function is called, then a callback will be called when
there's quota available (with quota already attached). The callbacks are
organized as a LIFO queue in the quota structure.
It's needed for TCP client quota - with old networking code we had one
single place where tcp clients quota was processed so we could resume
accepting when the we had spare slots, but it's gone with netmgr - now
we need to notify the listener/accepter that there's quota available so
that it can resume accepting.
Remove unused isc_quota_force() function.
The isc_quote_reserve and isc_quota_release were used only internally
from the quota.c and the tests. We should not expose API we are not
using.
The two "functions" that isc/safe.h declared before were actually simple
defines to matching OpenSSL functions. The downside of the approach was
enforcing all users of the libisc library to explicitly list the include
path to OpenSSL and link with -lcrypto. By hiding the specific
implementation into the private namespace changing the defines into
simple functions, we no longer enforce this. In the long run, this
might also allow us to switch cryptographic library implementation
without affecting the downstream users.