The Unix Domain Sockets support in BIND 9 has been completely disabled
since BIND 9.18 and it has been a fatal error since then. Cleanup the
code and the documentation that suggest that Unix Domain Sockets are
supported.
Allow an arbitrary TCP timeout value to be specified when running
rndc, so that commands that take a long time to execute (for example,
reloading a very large configuration) can be given time to do so.
While the connect timeout was set to 60 seconds in rndc, the
idle read timeout was left at the default value of 30 seconds.
This commit sets it back to 60, to match the behavior in 9.16
and earlier.
When shutting down TCP sockets, the read callback calling logic was
flawed, it would call either one less callback or one extra. Fix the
logic in the way:
1. When isc_nm_read() has been called but isc_nm_read_stop() hasn't on
the handle, the read callback will be called with ISC_R_CANCELED to
cancel active reading from the socket/handle.
2. When isc_nm_read() has been called and isc_nm_read_stop() has been
called on the on the handle, the read callback will be called with
ISC_R_SHUTTINGDOWN to signal that the dormant (not-reading) socket
is being shut down.
3. The .reading and .recv_read flags are little bit tricky. The
.reading flag indicates if the outer layer is reading the data (that
would be uv_tcp_t for TCP and isc_nmsocket_t (TCP) for TLSStream),
the .recv_read flag indicates whether somebody is interested in the
data read from the socket.
Usually, you would expect that the .reading should be false when
.recv_read is false, but it gets even more tricky with TLSStream as
the TLS protocol might need to read from the socket even when sending
data.
Fix the usage of the .recv_read and .reading flags in the TLSStream
to their true meaning - which mostly consist of using .recv_read
everywhere and then wrapping isc_nm_read() and isc_nm_read_stop()
with the .reading flag.
4. The TLS failed read helper has been modified to resemble the TCP code
as much as possible, clearing and re-setting the .recv_read flag in
the TCP timeout code has been fixed and .recv_read is now cleared
when isc_nm_read_stop() has been called on the streaming socket.
5. The use of Network Manager in the named_controlconf, isccc_ccmsg, and
isc_httpd units have been greatly simplified due to the improved design.
6. More unit tests for TCP and TLS testing the shutdown conditions have
been added.
Co-authored-by: Ondřej Surý <ondrej@isc.org>
Co-authored-by: Artem Boldariev <artem@isc.org>
When fatal is called we may be holding memory allocated by OpenSSL.
This may result in the reference count for the FIPS provider not
going to zero and the shared library not being unloaded during
OPENSSL_cleanup. When the shared library is ultimately unloaded,
when all remaining dynamically loaded libraries are freed, we have
already destroyed the memory context we where using to track memory
leaks / late frees resulting in INSIST being called.
Disable triggering the INSIST when fatal has being called.
This is a simple replacement using the semantic patch from the previous
commit and as added bonus, one removal of previously undetected unused
variable in named/server.c.
without diffie-hellman TKEY negotiation, some other code is
now effectively dead or unnecessary, and can be cleaned up:
- the rndc tsig-list and tsig-delete commands.
- a nonoperational command-line option to dnssec-keygen that
was documented as being specific to DH.
- the section of the ARM that discussed TKEY/DH.
- the functions dns_tkey_builddeletequery(), processdeleteresponse(),
and tkey_processgssresponse(), which are unused.
as there is no further use of isc_task in BIND, this commit removes
it, along with isc_taskmgr, isc_event, and all other related types.
functions that accepted taskmgr as a parameter have been cleaned up.
as a result of this change, some functions can no longer fail, so
they've been changed to type void, and their callers have been
updated accordingly.
the tasks table has been removed from the statistics channel and
the stats version has been updated. dns_dyndbctx has been changed
to reference the loopmgr instead of taskmgr, and DNS_DYNDB_VERSION
has been udpated as well.
In rndc_recvdone(), if 'sends' was not 0, then 'recvs' was not
decremented, in which case isc_loopmgr_shutdown() was never reached,
which could cause a hang. (This has not been observed to happen, but
the code was incorrect on examination.)
Previously:
* applications were using isc_app as the base unit for running the
application and signal handling.
* networking was handled in the netmgr layer, which would start a
number of threads, each with a uv_loop event loop.
* task/event handling was done in the isc_task unit, which used
netmgr event loops to run the isc_event calls.
In this refactoring:
* the network manager now uses isc_loop instead of maintaining its
own worker threads and event loops.
* the taskmgr that manages isc_task instances now also uses isc_loopmgr,
and every isc_task runs on a specific isc_loop bound to the specific
thread.
* applications have been updated as necessary to use the new API.
* new ISC_LOOP_TEST macros have been added to enable unit tests to
run isc_loop event loops. unit tests have been updated to use this
where needed.
* isc_timer was rewritten using the uv_timer, and isc_timermgr_t was
completely removed; isc_timer objects are now directly created on the
isc_loop event loops.
* the isc_timer API has been simplified. the "inactive" timer type has
been removed; timers are now stopped by calling isc_timer_stop()
instead of resetting to inactive.
* isc_manager now creates a loop manager rather than a timer manager.
* modules and applications using isc_timer have been updated to use the
new API.
"rndc fetchlimit" now also prints a list of domain names that are
currently rate-limited by "fetches-per-zone".
The "fetchlimit" system test has been updated to use this feature
to check that domain limits are applied correctly.
this command runs dns_adb_dumpquota() to display all servers
in the ADB that are being actively fetchlimited by the
fetches-per-server controls (i.e, servers with a nonzero average
timeout ratio or with the quota having been reduced from the
default value).
the "fetchlimit" system test has been updated to use the
new command to check quota values instead of "rndc dumpdb".
Previously, tasks could be created either unbound or bound to a specific
thread (worker loop). The unbound tasks would be assigned to a random
thread every time isc_task_send() was called. Because there's no logic
that would assign the task to the least busy worker, this just creates
unpredictability. Instead of random assignment, bind all the previously
unbound tasks to worker 0, which is guaranteed to exist.
Sphinx "standard domain" provides directive types ".. program::" and
".. option::" to create link anchor for a program name + option combination.
These can be referenced using :ref:`program option` syntax.
The problem is that Sphinx 1.8.5 (e.g. in Ubuntu 18.04) generates
conflicting link targets if a page contains two option directives
starting with the same word, e.g.:
.. program:: dnssec-settime
.. option:: -P date
.. option:: -P ds date
The reason is that option directive consumes only first word as "option
name" (-P) and all the rest is considered "option argument" (date, ds
date). Newer versions of Sphinx (e.g. 4.5.0) handle this by creating
numbered link anchors, but older versions warn and BIND build system
turns the warning into a hard error.
To handle that we use method recommended by Sphinx maintainer:
https://github.com/sphinx-doc/sphinx/issues/10218#issuecomment-1059925508
As a bonus it provides more accurate link anchors for sub-options.
Alternatives considered:
- Replacing standard domain definition of .. option - causes more
problems, see BIND issue #3294.
- Removing hyperlinks for options - that would be a step back.
Fixes: #3295
Previously, it was possible to assign a bit of memory space in the
nmhandle to store the client data. This was complicated and prevents
further refactoring of isc_nmhandle_t caching (future work).
Instead of caching the data in the nmhandle, allocate the hot-path
ns_client_t objects from per-thread clientmgr memory context and just
assign it to the isc_nmhandle_t via isc_nmhandle_set().
C11 has builtin support for _Noreturn function specifier with
convenience noreturn macro defined in <stdnoreturn.h> header.
Replace ISC_NORETURN macro by C11 noreturn with fallback to
__attribute__((noreturn)) if the C11 support is not complete.
Previously, the unreachable code paths would have to be tagged with:
INSIST(0);
ISC_UNREACHABLE();
There was also older parts of the code that used comment annotation:
/* NOTREACHED */
Unify the handling of unreachable code paths to just use:
UNREACHABLE();
The UNREACHABLE() macro now asserts when reached and also uses
__builtin_unreachable(); when such builtin is available in the compiler.
Gcc 7+ and Clang 10+ have implemented __attribute__((fallthrough)) which
is explicit version of the /* FALLTHROUGH */ comment we are currently
using.
Add and apply FALLTHROUGH macro that uses the attribute if available,
but does nothing on older compilers.
In one case (lib/dns/zone.c), using the macro revealed that we were
using the /* FALLTHROUGH */ comment in wrong place, remove that comment.
Replace :manpage: with :iscman: to generate internal hyperlinks. That
way reader can use links even when offline, and jumps to man pages
for the same version.
Formerly HTML version of man pages did not have links in See Also
section because :manpage: role in Sphinx can generate only external
hyperlinks - and we do not have that enabled.
Enabling the Sphinx :manpage: linking could reliably create hyperlinks
only to external URLs, but that would take users to another version
of docs.
Generated by:
find bin -name '*.rst' | xargs sed -i -e 's/:manpage:`\([^(]\+\)(\([0-9]\))`/:iscman:`\1(\2) <\1>`/g'
+ hand-edit to revert change for mmencode reference which is
not provided in our source tree.
Use the new role :iscman: to replace all occurences or ``binary``
with :iscman:`binary`, creating a hyperlink to the manual page.
Generated using:
find bin -name *.rst | xargs fgrep --files-with-matches '.. iscman' | xargs -I{} -n1 basename {} .rst > /tmp/progs
for PROG in $(cat /tmp/progs); do find -name '*.rst' | xargs sed -i -e "s/\`\`$PROG\`\`/:iscman:\`$PROG\`/g"; done
Additional hand-edits were done mainly around filter-aaaa and
filter-a which are program names and and option names at the
same time. Couple more edits was neede to fix .rst syntax broken by
automatic replacement.
Sphinx has it's own :program: syntax for refering to program names.
Use it for self-references in manual pages. These self-references are
not clickable and not as eye-cathing as links, which is a good thing.
There is no point in attracting attention to ``dig`` several times on a
single page dedicated to dig itself.
Substituted automatically using:
find bin -name *.rst | xargs fgrep --files-with-matches '.. program' | xargs -n1 bash /tmp/repl.sh
With /tmp/repl.sh being:
BASE=$(basename "$1" .rst)
sed -i -e "s/\`\`$BASE\`\`/:program:\`$BASE\`/g" "$1"
The new directive and role "iscman" allow to tag & reference man pages in
our source tree. Essentially it is just namespacing for ISC man pages,
but it comes with couple benefits.
Differences from .. _man_program label we formerly used:
- Does not expand :ref:`man_program` into full text of the page header.
- Generates index entry with category "manual page".
- Rendering style is closer to ubiquitous to the one produced
by ``named`` syntax.
Differences from Sphinx built-in :manpage: role:
- Supports all builders with support for cross-references.
- Generates internal links (unlike :manpage: which generates external
URLs).
- Checks that target exists withing our source tree.
Side-effect of hyperlinking is that typos in program and option names
are now detected by Sphinx.
Candidate -options were detected using:
find -name *.rst | xargs grep '``-[^`]'
and then modified from ``-o`` to :option:`-o` using regex
s/``\(-[^`]\+\)``/:option:`\1`/
+ manual modifications where necessary.
Non-hyphenated options were detected by looking at context around
program names:
find bin -name *.rst | xargs -I{} -n1 basename {} .rst | sort -u
and grepping for program name with trailing whitespace.
Stand-alone program names like ``named`` are not hyperlinked in this
commit.
The markup allows referencing individual options, and also makes them
more legible (no more thin red text on gray background).
Most of the work was done using regexes:
s/^``-\(.*\)``$/.. option:: -\1\r/
s/^``+\(.*\)``$/.. option:: +\1\r/
on bin/**/*.rst files along with visual inspection and hand-edits,
mostly for positional arguments.
Regex for rndc.rst:
s/^``\(.*\)``/.. option:: \1\r/
+ hand edits to remove extra asterisk and whitespace here and there.
The C17 standard deprecated ATOMIC_VAR_INIT() macro (see [1]). Follow
the suite and remove the ATOMIC_VAR_INIT() usage in favor of simple
assignment of the value as this is what all supported stdatomic.h
implementations do anyway:
* MacOSX.plaform: #define ATOMIC_VAR_INIT(__v) {__v}
* Gcc stdatomic.h: #define ATOMIC_VAR_INIT(VALUE) (VALUE)
1. http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p1138r0.pdf
Replace the hard-coded paths for various BIND 9 files (configuration,
pid, etc.) in the man pages and ARM with compile-time values using the
sphinx-build replace system.
This is more complicated, because the restructured text specification
doesn't allow |substitions| inside ``code-blocks``, so for each specific
file we had to create own substition which is sub-optimal, but it is
only way how to do this without adding Sphinx extension.
If isc_app_run() gets interrupted by a signal, the global 'rndc_task'
variable may already be detached from (set to NULL) by the time the
outstanding netmgr callbacks are run. This triggers an assertion
failure in isc_task_shutdown(). However, explicitly calling
isc_task_shutdown() from rndc code is redundant because it does not use
isc_task_onshutdown() and the task_shutdown() function gets
automatically called anyway when the task manager gets destroyed (after
isc_app_run() returns). Remove the redundant isc_task_shutdown() calls
to prevent crashes after receiving a signal.
rndc_recvdone() is not treating the ISC_R_CANCELED result code as a
request to stop data processing, which may cause a crash when trying to
dereference ccmsg->buffer. Fix by ensuring ISC_R_CANCELED results in an
early exit from rndc_recvdone().
Make sure the logic for handling ISC_R_CANCELED in rndc_recvnonce()
matches the one present in rndc_recvdone() to ensure consistent behavior
between these two sibling functions.
This commit converts the license handling to adhere to the REUSE
specification. It specifically:
1. Adds used licnses to LICENSES/ directory
2. Add "isc" template for adding the copyright boilerplate
3. Changes all source files to include copyright and SPDX license
header, this includes all the C sources, documentation, zone files,
configuration files. There are notes in the doc/dev/copyrights file
on how to add correct headers to the new files.
4. Handle the rest that can't be modified via .reuse/dep5 file. The
binary (or otherwise unmodifiable) files could have license places
next to them in <foo>.license file, but this would lead to cluttered
repository and most of the files handled in the .reuse/dep5 file are
system test files.
Unify the header guard style and replace the inconsistent include guards
with #pragma once.
The #pragma once is widely and very well supported in all compilers that
BIND 9 supports, and #pragma once was already in use in several new or
refactored headers.
Using simpler method will also allow us to automate header guard checks
as this is simpler to programatically check.
For reference, here are the reasons for the change taken from
Wikipedia[1]:
> In the C and C++ programming languages, #pragma once is a non-standard
> but widely supported preprocessor directive designed to cause the
> current source file to be included only once in a single compilation.
>
> Thus, #pragma once serves the same purpose as include guards, but with
> several advantages, including: less code, avoidance of name clashes,
> and sometimes improvement in compilation speed. On the other hand,
> #pragma once is not necessarily available in all compilers and its
> implementation is tricky and might not always be reliable.
1. https://en.wikipedia.org/wiki/Pragma_once