When automatic-interface-scan is disabled, the route socket was still
being opened. Add new API to connect / disconnect from the route socket
only as needed.
Additionally, move the block that disables periodic interface rescans to
a place where it actually have access to the configuration values.
Previously, the values were being checked before the configuration was
loaded.
Some external log file rotation programs use signals to tell programs
to close log files. SIGHUP can be used to do this but it also does
a full reconfiguration. Configure named to accept SIGUSR1 as a
signal to close log files.
View matching on an incoming query checks the query's signature,
which can be a CPU-heavy task for a SIG(0)-signed message. Implement
an asynchronous mode of the view matching function which uses the
offloaded signature checking facilities, and use it for the incoming
queries.
This is a tiny helper function which is used only once and can be
replaced with two function calls instead. Removing this makes
supporting asynchronous signature checking less complicated.
In order to protect from a malicious DNS client that sends many
queries with a SIG(0)-signed message, add a quota of simultaneously
running SIG(0) checks.
This protection can only help when named is using more than one worker
threads. For example, if named is running with the '-n 4' option, and
'sig0checks-quota 2;' is used, then named will make sure to not use
more than 2 workers for the SIG(0) signature checks in parallel, thus
leaving the other workers to serve the remaining clients which do not
use SIG(0)-signed messages.
That limitation is going to change when SIG(0) signature checks are
offloaded to "slow" threads in a future commit.
The 'sig0checks-quota-exempt' ACL option can be used to exempt certain
clients from the quota requirements using their IP or network addresses.
The 'sig0checks-quota-maxwait-ms' option is used to define a maximum
amount of time for named to wait for a quota to appear. If during that
time no new quota becomes available, named will answer to the client
with DNS_R_REFUSED.
Previously, the number of RR types for a single owner name was limited
only by the maximum number of the types (64k). As the data structure
that holds the RR types for the database node is just a linked list, and
there are places where we just walk through the whole list (again and
again), adding a large number of RR types for a single owner named with
would slow down processing of such name (database node).
Add a configurable limit to cap the number of the RR types for a single
owner. This is enforced at the database (rbtdb, qpzone, qpcache) level
and configured with new max-types-per-name configuration option that
can be configured globally, per-view and per-zone.
Previously, the number of RRs in the RRSets were internally unlimited.
As the data structure that holds the RRs is just a linked list, and
there are places where we just walk through all of the RRs, adding an
RRSet with huge number of RRs inside would slow down processing of said
RRSets.
Add a configurable limit to cap the number of the RRs in a single RRSet.
This is enforced at the database (rbtdb, qpzone, qpcache) level and
configured with new max-records-per-type configuration option that can
be configured globally, per-view and per-zone.
Changed the default value for 'allow-transfer' to 'none'; zone
transfers now require explicit authorization.
Updated all system tests to specify an allow-transfer ACL when needed.
Revised the ARM to specify that the default is 'none'.
The list isn't exactly maintained but it helped with some BIND history
tracking and is basically harmless so it might be worth holding onto it.
I have adapted the name to ASCII so IDN support won't be necessary.
The high-water allows administrators to better tune the recursive
clients limit without having to to poll the statistics channel in high
rates to get this number.
the previous commit introduced a possible race in getsigningtime()
where the rdataset header could change between being found on the
heap and being bound.
getsigningtime() now looks at the first element of the heap, gathers the
locknum, locks the respective lock, and retrieves the header from the
heap again. If the locknum has changed, it will rinse and repeat.
Theoretically, this could spin forever, but practically, it almost never
will as the heap changes on the zone are very rare.
we simplify matters further by changing the dns_db_getsigningtime()
API call. instead of passing back a bound rdataset, we pass back the
information the caller actually needed: the resigning time, owner name
and type of the rdataset that was first on the heap.
if we had a method to get the running loop, similar to how
isc_tid() gets the current thread ID, we can simplify loop
and loopmgr initialization.
remove most uses of isc_loop_current() in favor of isc_loop().
in some places where that was the only reason to pass loopmgr,
remove loopmgr from the function parameters.
by default, QPDB is the database used by named and all tools and
unit tests. the old default of RBTDB can now be restored by using
"configure --with-zonedb=rbt --with-cachedb=rbt".
some tests have been fixed so they will work correctly with either
database.
CHANGES and release notes have been updated to reflect this change.
replace the string "rbt" throughout BIND with "qp" so that
qpdb databases will be used by default instead of rbtdb.
rbtdb databases can still be used by specifying "database rbt;"
in a zone statement.
With _exit() instead of exit() in place, we don't need
isc__tls_setfatalmode() mechanism as the atexit() calls will not be
executed including OpenSSL atexit hooks.
Since the fatal() isn't a correct but rather abrupt termination of the
program, we want to skip the various atexit() calls because not all
memory might be freed during fatal() call, etc. Using _exit() instead
of exit() has this effect - the program will end, but no destructors or
atexit routines will be called.
Expose the newly added 'first refresh' flag in the information
provided by the 'rndc staus' command, by showing the number of
zones, which are not yet fully ready, and their first refresh
is pending or is in-progress.
Add a new zone flag to indicate that a secondary type zone is
not yet fully ready, and a first time refresh is pending or is
in progress.
Expose this new flag in the statistics channel's "Incoming Zone
Transfers" section.
Instead of running all the cryptographic validation in a tight loop,
spread it out into multiple event loop "ticks", but moving every single
validation into own isc_async_run() asynchronous event. Move the
cryptographic operations - both verification and DNSKEY selection - to
the offloaded threads (isc_work_enqueue), this further limits the time
we spend doing expensive operations on the event loops that should be
fast.
Limit the impact of invalid or malicious RRSets that contain crafted
records causing the dns_validator to do many validations per single
fetch by adding a cap on the maximum number of validations and maximum
number of validation failures that can happen before the resolving
fails.
This is now the default way to implement attaching to/detaching from
a pointer.
Also update cfg_keystore_fromconfig() to allow NULL value for the
keystore pointer. In most cases we detach it immediately after the
function call.
Move dns_dnssec_findzonekeys from the dnssec.{c,h} source code to
zone.{c,h} (the header file already commented that this should be done
inside dns_zone_t).
Alter the function in such a way, that keys are searched for in the
key stores if a 'dnssec-policy' (kasp) is attached to the zone,
otherwise keep using the zone's key-directory.
When writing key files to disk, use the internally stored directory.
Add an access function 'dst_key_directory()'.
Most calls to keymgr functions no longer need to provide the
key-directory value. Only 'dns_keymgr_run' still needs access to
the zone's key-directory in case the key-store is set to the built-in
key-directory.
Refactor dns_dnssec_findmatchingkeys and dns_dnssec_keylistfromrdataset
to take into account the key store directories in case the zone is using
dnssec-policy (kasp). Add 'kasp' and 'keystores' parameters.
This requires the keystorelist to be stored inside the zone structure.
The calls to these functions in the DNSSEC tools can use NULL as the
kasp value, as dnssec-signzone does not (yet) support dnssec-policy,
and key collision is checked inside the directory where it is created.
When creating the kasp structure, instead of storing the name of the
key store on keys, store a reference to the key store object instead.
This requires to build the keystore list prior to creating the kasp
structures, in the dnssec tools, the check code and the server code.
We will create a builtin keystore called "key-directory" which means
use the zone's key-directory as the key store.
The check code changes, because now the keystore is looked up before
creating the kasp structure (and if the keystore is not found, this
is an error). Instead of looking up the keystore after all
'dnssec-policy' clauses have been read.
The statistics channel does not expose the current number of TCP clients
connected, only the highwater. Therefore, users did not have an easy
means to collect statistics about TCP clients served over time. This
information could only be measured as a seperate mechanism via rndc by
looking at the TCP quota filled.
In order to expose the exact current count of connected TCP clients
(tracked by the "tcp-clients" quota) as a statistics counter, an
extra, dedicated Network Manager callback would need to be
implemented for that purpose (a counterpart of ns__client_tcpconn()
that would be run when a TCP connection is torn down), which is
inefficient. Instead, track the number of currently-connected TCP
clients separately for IPv4 and IPv6, as Network Manager statistics.
The conn_shutdown() function is called whenever a control channel
connection is supposed to be closed, e.g. after a response to the client
is sent or when named is being shut down. That function calls
isccc_ccmsg_invalidate(), which resets the magic number in the structure
holding the messages exchanged over a given control channel connection
(isccc_ccmsg_t). The expectation here is that all operations related to
the given control channel connection will have been completed by the
time the connection needs to be shut down.
However, if named shutdown is initiated while a control channel message
is still in flight, some netmgr callbacks might still be pending when
conn_shutdown() is called and isccc_ccmsg_t invalidated. This causes
the REQUIRE assertion checking the magic number in ccmsg_senddone() to
fail when the latter function is eventually called, resulting in a
crash.
Fix by splitting up isccc_ccmsg_invalidate() into two separate
functions:
- isccc_ccmsg_disconnect(), which initiates TCP connection shutdown,
- isccc_ccmsg_invalidate(), which cleans up magic number and buffer,
and then:
- replacing all existing uses of isccc_ccmsg_invalidate() with calls
to isccc_ccmsg_disconnect(),
- only calling isccc_ccmsg_invalidate() when all netmgr callbacks are
guaranteed to have been run.
Adjust function comments accordingly.
Multiple zones should be able to read the same key and signing policy
at the same time. Since writing the kasp lock only happens during
reconfiguration, and the complete kasp list is being replaced, there
is actually no need for a lock. Reference counting ensures that a kasp
structure is not destroyed when still being attached to one or more
zones.
This significantly improves the load configuration time.
The loop in shutdown_listener() assumes that the reference count for
every controlconnection_t object on the listener->connections linked
list will drop down to zero after the conn_shutdown() call in the loop's
body. However, when the timing is just right, some netmgr callbacks for
a given control connection may still be awaiting processing by the same
event loop that executes shutdown_listener() when the latter is run.
Since these netmgr callbacks must be run in order for the reference
count for the relevant controlconnection_t objects to drop to zero, when
the scenario described above happens, shutdown_listener() runs into an
infinite loop due to one of the controlconnection_t objects on the
listener->connections linked list never going away from the head of that
list.
Fix by safely iterating through the listener->connections list and
initiating shutdown for all controlconnection_t objects found. This
allows any pending netmgr callbacks to be run by the same event loop in
due course, i.e. after shutdown_listener() returns.
The main intention of PROXY protocol is to pass endpoints information
to a back-end server (in our case - BIND). That means that it is a
valid way to spoof endpoints information, as the addresses and ports
extracted from PROXYv2 headers, from the point of view of BIND, are
used instead of the real connection addresses.
Of course, an ability to easily spoof endpoints information can be
considered a security issue when used uncontrollably. To resolve that,
we introduce 'allow-proxy' and 'allow-proxy-on' ACL options. These are
the only ACL options in BIND that work with real PROXY connections
addresses, allowing a DNS server operator to specify from what clients
and on which interfaces he or she is willing to accept PROXY
headers. By default, for security reasons we do not allow to accept
them.
This commit extends "listen-on" statement with "proxy" options that
allows one to enable PROXYv2 support on a dedicated listener. It can
have the following values:
- "plain" to send PROXYv2 headers without encryption, even in the case
of encrypted transports.
- "encrypted" to send PROXYv2 headers encrypted right after the TLS
handshake.
It was possible that controlconf connections could be shutdown twice
when shutting down the server, because they would receive the
signal (ISC_R_SHUTTINGDOWN result) from netmgr and then the shutdown
procedure would be called second time via controls_shutdown().
Split the shutdown procedure from control_recvmessage(), so we can call
it independently from netmgr callbacks and make sure it will be called
only once. Do the similar thing for the listeners.
The AES algorithm for DNS cookies was being kept for legacy reasons, and
it can be safely removed in the next major release. Remove both the AES
usage for DNS cookies and the AES implementation itself.
Previously, the isccc_ccmsg_invalidate() was called from conn_free() and
this could lead to netmgr calling control_recvmessage() after we
detached the reading controlconnection_t reference, but it wouldn't be
the last reference because controlconnection_t is also attached/detached
when sending response or running command asynchronously.
Instead, move the isccc_ccmsg_invalidate() call to control_recvmessage()
error handling path to make sure that control_recvmessage() won't be
ever called again from the netmgr.
The controlconf channel runs single-threaded on the main thread.
Replace the listener->connections locking with check that we are still
running on the thread with TID 0.
The lock-file configuration (both from configuration file and -X
argument to named) has better alternatives nowadays. Modern process
supervisor should be used to ensure that a single named process is
running on a given configuration.
Alternatively, it's possible to wrap the named with flock(1).
This is first step in removing the lock-file configuration option, it
marks both the `lock-file` configuration directive and -X option to
named as deprecated.
When -X is used the 'lock-file' option change detection condition
is invalid, because it compares the 'lock-file' option's value to
the '-X' argument's value instead of the older 'lock-file' option
value (which was ignored because of '-X').
Don't warn about changing 'lock-file' option if '-X' is used.
It is obvious that the '!cfg_obj_asstring(obj)' check should be
'cfg_obj_asstring(obj)' instead, because it is an AND logic chain
which further uses 'obj' as a string.
Fix the error.
When 'lock-file <lockfile>' is used in configuration at the same time
as using '-X none' in 'named' invocation, there is an invalid
logic that would lead to a isc_mem_strdup() call on a NULL value.
Also, contradicting to ARM, 'lock-file none' is overriding the '-X'
argument.
Fix the overall logic, and make sure that the '-X' takes precedence to
'lock-file'.
When 'lock-file <lockfile1>' was used in configuration at the same time
as using `-X <lockfile2>` in `named` invocation, there was an invalid
logic that would lead to a double isc_mem_strdup() call on the
<lockfile2> value.
Skip the second allocation if `lock-file` is being used in
configuration, so the <lockfile2> is used only single time.
The lock file was being removed when we hadn't successfully locked
it which defeated the purpose of the lockfile. Adjust cleanup_lockfile
such that it only unlinks the lockfile if we have successfully locked
the lockfile and it is still active (lockfile != NULL).
When using sd_notify(3) to send a message to the service manager
about named being reloaded, systemd also requires the MONOTONIC_USEC
field to be set to the current monotonic time in microseconds,
otherwise the 'systemctl reload' command fails.
Add the MONOTONIC_USEC field to the message.
See 'man 5 systemd.service' for more information.
Ignore the option 'inline-signing' unless there is a 'dnssec-policy'
configured for the zone. Having inline signing enabled while the zone
is not DNSSEC signed does not make sense.
If there is a 'dnssec-policy' the 'inline-signing' zone-only option
can be used to override the value for the given zone.
In units that support detailed reference tracing via ISC_REFCOUNT
macros, we were doing:
/* Define to 1 for detailed reference tracing */
#undef <unit>_TRACE
This would prevent using -D<unit>_TRACE=1 in the CFLAGS.
Convert the above mentioned snippet with just a comment how to enable
the detailed reference tracing:
/* Add -D<unit>_TRACE=1 to CFLAGS for detailed reference tracing */
now that we have the QP chain mechanism, we can convert the
RPZ summary database to use a QP trie instead of an RBT.
also revised comments throughout the file accordingly, and
incidentally cleaned up calls to new_node(), which can no
longer fail.
The TRY0 macro doesn't set the 'result' variable, so the error
log message is never printed. Remove the 'result' variable and
modify the function's control flow to be similar to the the
zone_xmlrender() function, with a separate error returning path.
The "Needs Refresh" flag is exposed in two places in the statistics
channel: first - there is a state called "Needs Refresh", when the
process hasn't started yet, but the zone needs a refresh, and second
- there there is a field called "Additional Refresh Queued", when the
process is ongoing, but another refresh is queued for the same zone.
The DNS_ZONEFLG_NEEDREFRESH flag, however, is set only when there is
an ongoing zone transfer and a new notify is received. That is, the
flag is not set for the first case above.
In order to fix the issue, use the DNS_ZONEFLG_NEEDREFRESH flag only
when the zone transfer is running, otherwise, decide whether a zone
needs a refresh using its refresh and expire times.
Currently in the statsistics channel's incoming zone transfers list
the local and remote addresses are shown only when the zone transfer
is already running. Since we have now introduced the "Refresh SOA"
state, which shows the state of the SOA query before the zone transfer
is started, this commit implements a feature to show the local and
remote addresses for the SOA query, when the state is "Refresh SOA".
Improve the "Duration (s)" field, so that it can show the duration of
all the major states of an incoming zone transfer process, while they
are taking place. In particular, it will now show the duration of the
"Pending", "Refresh SOA" and "Deferred" states too, before the actual
zone transfer starts.
With adding this state to the statistics channel, it can now show
the zone transfer in this state instead of as "Pending" when the
zone.c module is performing a refresh SOA request, before actually
starting the transfer process. This will help to understand
whether the process is waiting because of the rate limiter (i.e.
"Pending"), or the rate limiter is passed and it is now waiting for
the refresh SOA query to complete or time out.
Add a new field in the incoming zone transfers section of the
statistics channel to show the transport used for the SOA request.
When the transfer is started beginning from the XFRST_SOAQUERY state,
it means that the SOA query will be performed by xfrin itself, using
the same transport. Otherwise, it means that the SOA query was already
performed by other means (e.g. by zone.c:soa_query()), and, in that
case, we use the SOA query transport type information passed by the
'soa_transport_type' argument, when the xfrin object was created.
The Unix Domain Sockets support in BIND 9 has been completely disabled
since BIND 9.18 and it has been a fatal error since then. Cleanup the
code and the documentation that suggest that Unix Domain Sockets are
supported.
Reusing TCP connections with dns_dispatch_gettcp() used linear linked
list to lookup existing outgoing TCP connections that could be reused.
Replace the linked list with per-loop cds_lfht hashtable to speedup the
lookups. We use cds_lfht because it allows non-unique node insertion
that we need to check for dispatches in different connection states.
Instead of high number of dispatches (4 * named_g_udpdisp)[1], make the
dispatches bound to threads and make dns_dispatchset_t create a dispatch
for each thread (event loop).
This required couple of other changes:
1. The dns_dispatch_createudp() must be called on loop, so the isc_tid()
is already initialized - changes to nsupdate and mdig were required.
2. The dns_requestmgr had only a single dispatch per v4 and v6. Instead
of using single dispatch, use dns_dispatchset_t for each protocol -
this is same as dns_resolver.
Add a configuration option, resolver-use-dns64, which when true
will cause named to map IPv4 address to IPv6 addresses using the
view's DNS64 mapping rules when making iterative queries.
There are libraries which are reported in printversion(), but not
reported in setup(). Synchronize the functions, so that the log
file could have the same information as reported by the 'named -V'
command execution.
The autoconf and named -V now prints used version of jemalloc. This
doesn't work with system supplied jemalloc, so in it prints `system`
instead in the autoconf and nothing in named -V output.
instead of allowing a NULL nametree in dns_nametree_covered(),
require nametree to exist, and ensure that the nametrees defined
for view and resolver objects are always created.
name trees can now hold either boolean values or bit fields. the
type is selected when the name tree is created.
the behavior of dns_nametree_add() differs slightly beteween the types:
in a boolean tree adding an existing name will return ISC_R_EXISTS,
but in a bitfield tree it simply sets the specified bit in the bitfield
and returns ISC_R_SUCCESS.
replace the use of RBTs for deny-answer-aliases, the exclude
lists for deny-answer-aliases and deny-answer-addresses, and
dnssec-must-be-secure, with name trees.
This adds support for User Statically Defined Tracing (USDT). On
Linux, this uses the header from SystemTap and dtrace utility, but the
support is universal as long as dtrace is available.
Also add the required infrastructure to add probes to libisc, libdns and
libns libraries, where most of the probes will be.
The dns_dispatchmgr object was only set in the dns_view object making it
prone to use-after-free in the dns_xfrin unit when shutting down named.
Remove dns_view_setdispatchmgr() and optionally pass the dispatchmgr
directly to dns_view_create() when it is attached and not just assigned,
so the dns_dispatchmgr doesn't cease to exist too early.
The dns_view_getdnsdispatchmgr() is now protected by the RCU lock, the
dispatchmgr reference is incremented, so the caller needs to detach from
it, and the function can return NULL in case the dns_view has been
already shut down.
Instead of an RBT for the forwarders table, use a QP trie.
We now use reference counting for dns_forwarders_t. When a forwarders
object is retrieved by dns_fwdtable_find(), it must now be explicitly
detached by the caller afterward.
QP tries require stored objects to include their names, so the
the forwarders object now has that. This obviates the need to
pass back a separate 'foundname' value from dns_fwdtable_find().
replace the red-black tree used by the negative trust anchor table
with a QP trie.
because of this change, dns_ntatable_init() can no longer fail, and
neither can dns_view_initntatable(). these functions have both been
changed to type void.
Allow larger TTL values in zones that go insecure. This is necessary
because otherwise the zone will not be loaded due to the max-zone-ttl
of P1D that is part of the current insecure policy.
In the keymgr.c code, default back to P1D if the max-zone-ttl is set
to zero.
Add an option to enable/disable inline-signing inside the
dnssec-policy clause. The existing inline-signing option that is
set in the zone clause takes priority, but if it is omitted, then the
value that is set in dnssec-policy is taken.
The built-in policies use inline-signing.
This means that if you want to use the default policy without
inline-signing you either have to set it explicitly in the zone
clause:
zone "example" {
...
dnssec-policy default;
inline-signing no;
};
Or create a new policy, only overriding the inline-signing option:
dnssec-policy "default-dynamic" {
inline-signing no;
};
zone "example" {
...
dnssec-policy default-dynamic;
};
This also means that if you are going insecure with a dynamic zone,
the built-in "insecure" policy needs to be accompanied with
"inline-signing no;".
The isc_stats_create() can no longer return anything else than
ISC_R_SUCCESS. Refactor isc_stats_create() and its variants in libdns,
libns and named to just return void.
1. Change the _new, _add and _copy functions to return the new object
instead of returning 'void' (or always ISC_R_SUCCESS)
2. Cleanup the isc_ht_find() + isc_ht_add() usage - the code is always
locked with catzs->lock (mutex), so when isc_ht_find() returns
ISC_R_NOTFOUND, the isc_ht_add() must always succeed.
3. Instead of returning direct iterator for the catalog zone entries,
add dns_catz_zone_for_each_entry2() function that calls callback
for each catalog zone entry and passes two extra arguments to the
callback. This will allow changing the internal storage for the
catalog zone entries.
4. Cleanup the naming - dns_catz_<fn>_<obj> -> dns_catz_<obj>_<fn>, as an
example dns_catz_new_zone() gets renamed to dns_catz_zone_new().
These two configuration options worked in conjunction with 'auto-dnssec'
to determine KSK usage, and thus are now obsoleted.
However, in the code we keep KSK processing so that when a zone is
reconfigured from using 'dnssec-policy' immediately to 'none' (without
going through 'insecure'), the zone is not immediately made bogus.
Add one more test case for going straight to none, now with a dynamic
zone (no inline-signing).
Some 'rndc signing' commands can still be used in conjunction with
'dnssec-policy' because it shows the progress of signing and
private type records can be cleaned up. Allow these commands to be
executed.
However, setting NSEC3 parameters is incompatible with dnssec-policy.
BIND's rdataset structure is a view of some DNS records. It is
polymorphic, so the details of how the records are stored can vary.
For instance, the records can be held in an rdatalist, or in an
rdataslab in the rbtdb.
The dns_rdataset structure previously had a number of fields called
`private1` up to `private7`, which were used by the various rdataset
implementations. It was not at all clear what these fields were for,
without reading the code and working it out from context.
This change makes the rdataset inheritance hierarchy more clear. The
polymorphic part of a `struct dns_rdataset` is now a union of structs,
each of which is named for the class of implementation using it. The
fields of these structs replace the old `privateN` fields. (Note: the
term "inheritance hierarchy" refers to the fact that the builtin and
SDLZ implementations are based on and inherit from the rdatalist
implementation, which in turn inherits from the generic rdataset.
Most of this change is mechanical, but there are a few extras.
In keynode.c there were a number of REQUIRE()ments that were not
necessary: they had already been checked by the rdataset method
dispatch code. On the other hand, In ncache.c there was a public
function which needed to REQUIRE() that an rdataset was valid.
I have removed lots of "reset iterator state" comments, because it
should now be clear from `target->iter = NULL` where before
`target->private5 = NULL` could have been doing anything.
Initialization is a bit neater in a few places, using C structure
literals where appropriate.
The pointer arithmetic for translating between an rdataslab header and
its raw contents is now fractionally safer.
The ability to read legacy HMAC-MD5 K* keyfile pairs using algorithm
number 157 was accidentally lost when the algorithm numbers were
consolidated into a single block, in commit
09f7e0607a.
The assumption was that these algorithm numbers were only known
internally, but they were also used in key files. But since HMAC-MD5
got renumbered from 157 to 160, legacy HMAC-MD5 key files no longer
work.
Move HMAC-MD5 back to 157 and GSSAPI back to 160. Add exception for
GSSAPI to list_hmac_algorithms.
view->adb may be referenced while the view is shutting down as the
zone uses a weak reference to the view and examines view->adb but
dns_view_detach call dns_adb_detach to clear view->adb.
since it is not necessary to find partial matches when looking
up names in a TSIG keyring, we can use a hash table instead of
an RBT to store them.
the tsigkey object now stores the key name as a dns_fixedname
rather than allocating memory for it.
the `name` parameter to dns_tsigkeyring_add() has been removed;
it was unneeded since the tsigkey object already contains a copy
of the name.
the opportunistic cleanup_ring() function has been removed;
it was only slowing down lookups.
the prior practice of passing a dns_name containing the
expanded name of an algorithm to dns_tsigkey_create() and
dns_tsigkey_createfromkey() is unnecessarily cumbersome;
we can now pass the algorithm number instead.
use the ISC_REFCOUNT attach/detach implementation in dns/tsig.c
so that detailed tracing can be used during refactoring.
dns_tsig_keyring_t has been renamed dns_tsigkeyring_t so the type
and the attach/detach function names will match.
- style cleanups.
- simplify the function parameters to dns_tsigkey_create():
+ remove 'restored' and 'generated', they're only ever set to false.
+ remove 'creator' because it's only ever set to NULL.
+ remove 'inception' and 'expiry' because they're only ever set to
(0, 0) or (now, now), and either way, this means "never expire".
+ remove 'ring' because we can just use dns_tsigkeyring_add() instead.
- rename dns_keyring_restore() to dns_tsigkeyring_restore() to match the
rest of the functions operating on dns_tsigkeyring objects.
it was possible to add a TSIG key to more than one TSIG
keyring at a time, and this was in fact happening with the
session key, which was generated once and then added to the
keyrings for each view as it was configured.
this has been corrected and a REQUIRE added to dns_tsigkeyring_add()
to prevent it from happening again.
The 'named_g_server->interfacemgr' pointer is saved in the zone
structure using dns_zone_setisself(), as a void* argument to be
passed to the isself() callback, so there is no attach/detach,
and when shutting down, the interface manager can be destroyed
by the shutdown_server(), running in exclusive mode, and causing
isself() to crash when trying to use the pointer.
Instead of keeping the interface manager pointer in the zone
structure, just check and use the 'named_g_server->interfacemgr'
itself, as it was implemented originally in the
3aca8e5bf3 commit. Later, in the
8eb88aafee commit, the code was
changed to pass the interface manager pointer using the additional
void* argument, but the commit message doesn't mention if there
was any practical reason for that.
Additionally, don't pass the interfacemgr pointer to the
ns_interfacemgr_getaclenv() function before it is checked
against NULL.
'-T transferinsecs' makes named interpret the max-transfer-time-out,
max-transfer-idle-out, max-transfer-time-in and max-transfer-idle-in
configuration options as seconds instead of minutes.
'-T transferslowly' makes named to sleep for one second for every
xfrout message.
'-T transferstuck' makes named to sleep for one minute for every
xfrout message.
When shutting down TCP sockets, the read callback calling logic was
flawed, it would call either one less callback or one extra. Fix the
logic in the way:
1. When isc_nm_read() has been called but isc_nm_read_stop() hasn't on
the handle, the read callback will be called with ISC_R_CANCELED to
cancel active reading from the socket/handle.
2. When isc_nm_read() has been called and isc_nm_read_stop() has been
called on the on the handle, the read callback will be called with
ISC_R_SHUTTINGDOWN to signal that the dormant (not-reading) socket
is being shut down.
3. The .reading and .recv_read flags are little bit tricky. The
.reading flag indicates if the outer layer is reading the data (that
would be uv_tcp_t for TCP and isc_nmsocket_t (TCP) for TLSStream),
the .recv_read flag indicates whether somebody is interested in the
data read from the socket.
Usually, you would expect that the .reading should be false when
.recv_read is false, but it gets even more tricky with TLSStream as
the TLS protocol might need to read from the socket even when sending
data.
Fix the usage of the .recv_read and .reading flags in the TLSStream
to their true meaning - which mostly consist of using .recv_read
everywhere and then wrapping isc_nm_read() and isc_nm_read_stop()
with the .reading flag.
4. The TLS failed read helper has been modified to resemble the TCP code
as much as possible, clearing and re-setting the .recv_read flag in
the TCP timeout code has been fixed and .recv_read is now cleared
when isc_nm_read_stop() has been called on the streaming socket.
5. The use of Network Manager in the named_controlconf, isccc_ccmsg, and
isc_httpd units have been greatly simplified due to the improved design.
6. More unit tests for TCP and TLS testing the shutdown conditions have
been added.
Co-authored-by: Ondřej Surý <ondrej@isc.org>
Co-authored-by: Artem Boldariev <artem@isc.org>
isc_loopmgr_pause can't be called before isc_loopmgr_run is
called as the thread ids are not yet valid. If there is a
fatal error before isc_loopmgr_run is run then don't call
isc_loopmgr_pause.
This change makes the zone table lock-free for reads. Previously, the
zone table used a red-black tree, which is not thread safe, so the hot
read path acquired both the per-view mutex and the per-zonetable
rwlock. (The double locking was to fix to cleanup races on shutdown.)
One visible difference is that zones are not necessarily shut down
promptly: it depends on when the qp-trie garbage collector cleans up
the zone table. The `catz` system test checks several times that zones
have been deleted; the test now checks for zones to be removed from
the server configuration, instead of being fully shut down. The catz
test does not churn through enough zones to trigger a gc, so the zones
are not fully detached until the server exits.
After this change, it is still possible to improve the way we handle
changes to the zone table, for instance, batching changes, or better
compaction heuristics.
If the 'checkds' option is not explicitly set, check if there are
'parental-agents' for the zone configured. If so, default to "explicit",
otherwise default to "yes".
Add a new configuration option to set how the checkds method should
work. Acceptable values are 'yes', 'no', and 'explicit'.
When set to 'yes', the checkds method is to lookup the parental agents
by querying the NS records of the parent zone.
When set to 'no', no checkds method is enabled. Users should run
the 'rndc checkds' command to signal that DS records are published and
withdrawn.
When set to 'explicit', the parental agents are explicitly configured
with the 'parental-agents' configuration option.
This should have no functional effects.
The message size stats are specified by RSSAC002 so it's best not
to mess around with how they appear in the statschannel. But it's
worth changing the implementation to use general-purpose histograms,
to reduce code size and benefit from sharded counters.
Cleanup the remnants of MS Compiler bits from <isc/refcount.h>, printing
the information in named/main.c, and cleanup some comments about Windows
that no longer apply.
The bits in picohttpparser.{h,c} were left out, because it's not our
code.
When fatal is called we may be holding memory allocated by OpenSSL.
This may result in the reference count for the FIPS provider not
going to zero and the shared library not being unloaded during
OPENSSL_cleanup. When the shared library is ultimately unloaded,
when all remaining dynamically loaded libraries are freed, we have
already destroyed the memory context we where using to track memory
leaks / late frees resulting in INSIST being called.
Disable triggering the INSIST when fatal has being called.
There are times where you want named-checkconf to check whether the
dnssec-policies should be constrained by the cryptographic algorithms
supported by the operation system or to just accept all possible
algorithms. This provides a mechanism to make that selection.
The isc_time_now() and isc_time_now_hires() were used inconsistently
through the code - either with status check, or without status check,
or via TIME_NOW() macro with RUNTIME_CHECK() on failure.
Refactor the isc_time_now() and isc_time_now_hires() to always fail when
getting current time has failed, and return the isc_time_t value as
return value instead of passing the pointer to result in the argument.
This is a simple replacement using the semantic patch from the previous
commit and as added bonus, one removal of previously undetected unused
variable in named/server.c.
It's sometimes helpful to get a quick idea of the call stack when
debugging. This change factors out the backtrace logging from named's
fatal error handler so that it's easy to use in other places too.
enable DNSRPS in the continuous integration tests
this triggered a build failure in OpenBSD; building with DNSRPS
causes arpa/nameser.h to be included, which defines the value
STATUS. that value was then reused in server.c renaming the
value to STAT corrects the error.
for testing purposes, we need to be able to specify a library path from
which to load the dnsrps implementation. this can now be done with the
"dnsrps-library" option.
DNSRPS can now be enabled in configure regardless of whether librpz.so
is currently installed on the system.