The main intention of PROXY protocol is to pass endpoints information
to a back-end server (in our case - BIND). That means that it is a
valid way to spoof endpoints information, as the addresses and ports
extracted from PROXYv2 headers, from the point of view of BIND, are
used instead of the real connection addresses.
Of course, an ability to easily spoof endpoints information can be
considered a security issue when used uncontrollably. To resolve that,
we introduce 'allow-proxy' and 'allow-proxy-on' ACL options. These are
the only ACL options in BIND that work with real PROXY connections
addresses, allowing a DNS server operator to specify from what clients
and on which interfaces he or she is willing to accept PROXY
headers. By default, for security reasons we do not allow to accept
them.
This commit extends "listen-on" statement with "proxy" options that
allows one to enable PROXYv2 support on a dedicated listener. It can
have the following values:
- "plain" to send PROXYv2 headers without encryption, even in the case
of encrypted transports.
- "encrypted" to send PROXYv2 headers encrypted right after the TLS
handshake.
It was possible that controlconf connections could be shutdown twice
when shutting down the server, because they would receive the
signal (ISC_R_SHUTTINGDOWN result) from netmgr and then the shutdown
procedure would be called second time via controls_shutdown().
Split the shutdown procedure from control_recvmessage(), so we can call
it independently from netmgr callbacks and make sure it will be called
only once. Do the similar thing for the listeners.
The AES algorithm for DNS cookies was being kept for legacy reasons, and
it can be safely removed in the next major release. Remove both the AES
usage for DNS cookies and the AES implementation itself.
Previously, the isccc_ccmsg_invalidate() was called from conn_free() and
this could lead to netmgr calling control_recvmessage() after we
detached the reading controlconnection_t reference, but it wouldn't be
the last reference because controlconnection_t is also attached/detached
when sending response or running command asynchronously.
Instead, move the isccc_ccmsg_invalidate() call to control_recvmessage()
error handling path to make sure that control_recvmessage() won't be
ever called again from the netmgr.
The controlconf channel runs single-threaded on the main thread.
Replace the listener->connections locking with check that we are still
running on the thread with TID 0.
The lock-file configuration (both from configuration file and -X
argument to named) has better alternatives nowadays. Modern process
supervisor should be used to ensure that a single named process is
running on a given configuration.
Alternatively, it's possible to wrap the named with flock(1).
This is first step in removing the lock-file configuration option, it
marks both the `lock-file` configuration directive and -X option to
named as deprecated.
When -X is used the 'lock-file' option change detection condition
is invalid, because it compares the 'lock-file' option's value to
the '-X' argument's value instead of the older 'lock-file' option
value (which was ignored because of '-X').
Don't warn about changing 'lock-file' option if '-X' is used.
It is obvious that the '!cfg_obj_asstring(obj)' check should be
'cfg_obj_asstring(obj)' instead, because it is an AND logic chain
which further uses 'obj' as a string.
Fix the error.
When 'lock-file <lockfile>' is used in configuration at the same time
as using '-X none' in 'named' invocation, there is an invalid
logic that would lead to a isc_mem_strdup() call on a NULL value.
Also, contradicting to ARM, 'lock-file none' is overriding the '-X'
argument.
Fix the overall logic, and make sure that the '-X' takes precedence to
'lock-file'.
When 'lock-file <lockfile1>' was used in configuration at the same time
as using `-X <lockfile2>` in `named` invocation, there was an invalid
logic that would lead to a double isc_mem_strdup() call on the
<lockfile2> value.
Skip the second allocation if `lock-file` is being used in
configuration, so the <lockfile2> is used only single time.
The lock file was being removed when we hadn't successfully locked
it which defeated the purpose of the lockfile. Adjust cleanup_lockfile
such that it only unlinks the lockfile if we have successfully locked
the lockfile and it is still active (lockfile != NULL).
When using sd_notify(3) to send a message to the service manager
about named being reloaded, systemd also requires the MONOTONIC_USEC
field to be set to the current monotonic time in microseconds,
otherwise the 'systemctl reload' command fails.
Add the MONOTONIC_USEC field to the message.
See 'man 5 systemd.service' for more information.
Ignore the option 'inline-signing' unless there is a 'dnssec-policy'
configured for the zone. Having inline signing enabled while the zone
is not DNSSEC signed does not make sense.
If there is a 'dnssec-policy' the 'inline-signing' zone-only option
can be used to override the value for the given zone.
In units that support detailed reference tracing via ISC_REFCOUNT
macros, we were doing:
/* Define to 1 for detailed reference tracing */
#undef <unit>_TRACE
This would prevent using -D<unit>_TRACE=1 in the CFLAGS.
Convert the above mentioned snippet with just a comment how to enable
the detailed reference tracing:
/* Add -D<unit>_TRACE=1 to CFLAGS for detailed reference tracing */
now that we have the QP chain mechanism, we can convert the
RPZ summary database to use a QP trie instead of an RBT.
also revised comments throughout the file accordingly, and
incidentally cleaned up calls to new_node(), which can no
longer fail.
The TRY0 macro doesn't set the 'result' variable, so the error
log message is never printed. Remove the 'result' variable and
modify the function's control flow to be similar to the the
zone_xmlrender() function, with a separate error returning path.
The "Needs Refresh" flag is exposed in two places in the statistics
channel: first - there is a state called "Needs Refresh", when the
process hasn't started yet, but the zone needs a refresh, and second
- there there is a field called "Additional Refresh Queued", when the
process is ongoing, but another refresh is queued for the same zone.
The DNS_ZONEFLG_NEEDREFRESH flag, however, is set only when there is
an ongoing zone transfer and a new notify is received. That is, the
flag is not set for the first case above.
In order to fix the issue, use the DNS_ZONEFLG_NEEDREFRESH flag only
when the zone transfer is running, otherwise, decide whether a zone
needs a refresh using its refresh and expire times.
Currently in the statsistics channel's incoming zone transfers list
the local and remote addresses are shown only when the zone transfer
is already running. Since we have now introduced the "Refresh SOA"
state, which shows the state of the SOA query before the zone transfer
is started, this commit implements a feature to show the local and
remote addresses for the SOA query, when the state is "Refresh SOA".
Improve the "Duration (s)" field, so that it can show the duration of
all the major states of an incoming zone transfer process, while they
are taking place. In particular, it will now show the duration of the
"Pending", "Refresh SOA" and "Deferred" states too, before the actual
zone transfer starts.
With adding this state to the statistics channel, it can now show
the zone transfer in this state instead of as "Pending" when the
zone.c module is performing a refresh SOA request, before actually
starting the transfer process. This will help to understand
whether the process is waiting because of the rate limiter (i.e.
"Pending"), or the rate limiter is passed and it is now waiting for
the refresh SOA query to complete or time out.
Add a new field in the incoming zone transfers section of the
statistics channel to show the transport used for the SOA request.
When the transfer is started beginning from the XFRST_SOAQUERY state,
it means that the SOA query will be performed by xfrin itself, using
the same transport. Otherwise, it means that the SOA query was already
performed by other means (e.g. by zone.c:soa_query()), and, in that
case, we use the SOA query transport type information passed by the
'soa_transport_type' argument, when the xfrin object was created.
The Unix Domain Sockets support in BIND 9 has been completely disabled
since BIND 9.18 and it has been a fatal error since then. Cleanup the
code and the documentation that suggest that Unix Domain Sockets are
supported.
Reusing TCP connections with dns_dispatch_gettcp() used linear linked
list to lookup existing outgoing TCP connections that could be reused.
Replace the linked list with per-loop cds_lfht hashtable to speedup the
lookups. We use cds_lfht because it allows non-unique node insertion
that we need to check for dispatches in different connection states.
Instead of high number of dispatches (4 * named_g_udpdisp)[1], make the
dispatches bound to threads and make dns_dispatchset_t create a dispatch
for each thread (event loop).
This required couple of other changes:
1. The dns_dispatch_createudp() must be called on loop, so the isc_tid()
is already initialized - changes to nsupdate and mdig were required.
2. The dns_requestmgr had only a single dispatch per v4 and v6. Instead
of using single dispatch, use dns_dispatchset_t for each protocol -
this is same as dns_resolver.
Add a configuration option, resolver-use-dns64, which when true
will cause named to map IPv4 address to IPv6 addresses using the
view's DNS64 mapping rules when making iterative queries.
There are libraries which are reported in printversion(), but not
reported in setup(). Synchronize the functions, so that the log
file could have the same information as reported by the 'named -V'
command execution.
The autoconf and named -V now prints used version of jemalloc. This
doesn't work with system supplied jemalloc, so in it prints `system`
instead in the autoconf and nothing in named -V output.
instead of allowing a NULL nametree in dns_nametree_covered(),
require nametree to exist, and ensure that the nametrees defined
for view and resolver objects are always created.
name trees can now hold either boolean values or bit fields. the
type is selected when the name tree is created.
the behavior of dns_nametree_add() differs slightly beteween the types:
in a boolean tree adding an existing name will return ISC_R_EXISTS,
but in a bitfield tree it simply sets the specified bit in the bitfield
and returns ISC_R_SUCCESS.
replace the use of RBTs for deny-answer-aliases, the exclude
lists for deny-answer-aliases and deny-answer-addresses, and
dnssec-must-be-secure, with name trees.
This adds support for User Statically Defined Tracing (USDT). On
Linux, this uses the header from SystemTap and dtrace utility, but the
support is universal as long as dtrace is available.
Also add the required infrastructure to add probes to libisc, libdns and
libns libraries, where most of the probes will be.
The dns_dispatchmgr object was only set in the dns_view object making it
prone to use-after-free in the dns_xfrin unit when shutting down named.
Remove dns_view_setdispatchmgr() and optionally pass the dispatchmgr
directly to dns_view_create() when it is attached and not just assigned,
so the dns_dispatchmgr doesn't cease to exist too early.
The dns_view_getdnsdispatchmgr() is now protected by the RCU lock, the
dispatchmgr reference is incremented, so the caller needs to detach from
it, and the function can return NULL in case the dns_view has been
already shut down.
Instead of an RBT for the forwarders table, use a QP trie.
We now use reference counting for dns_forwarders_t. When a forwarders
object is retrieved by dns_fwdtable_find(), it must now be explicitly
detached by the caller afterward.
QP tries require stored objects to include their names, so the
the forwarders object now has that. This obviates the need to
pass back a separate 'foundname' value from dns_fwdtable_find().
replace the red-black tree used by the negative trust anchor table
with a QP trie.
because of this change, dns_ntatable_init() can no longer fail, and
neither can dns_view_initntatable(). these functions have both been
changed to type void.
Allow larger TTL values in zones that go insecure. This is necessary
because otherwise the zone will not be loaded due to the max-zone-ttl
of P1D that is part of the current insecure policy.
In the keymgr.c code, default back to P1D if the max-zone-ttl is set
to zero.
Add an option to enable/disable inline-signing inside the
dnssec-policy clause. The existing inline-signing option that is
set in the zone clause takes priority, but if it is omitted, then the
value that is set in dnssec-policy is taken.
The built-in policies use inline-signing.
This means that if you want to use the default policy without
inline-signing you either have to set it explicitly in the zone
clause:
zone "example" {
...
dnssec-policy default;
inline-signing no;
};
Or create a new policy, only overriding the inline-signing option:
dnssec-policy "default-dynamic" {
inline-signing no;
};
zone "example" {
...
dnssec-policy default-dynamic;
};
This also means that if you are going insecure with a dynamic zone,
the built-in "insecure" policy needs to be accompanied with
"inline-signing no;".
The isc_stats_create() can no longer return anything else than
ISC_R_SUCCESS. Refactor isc_stats_create() and its variants in libdns,
libns and named to just return void.
1. Change the _new, _add and _copy functions to return the new object
instead of returning 'void' (or always ISC_R_SUCCESS)
2. Cleanup the isc_ht_find() + isc_ht_add() usage - the code is always
locked with catzs->lock (mutex), so when isc_ht_find() returns
ISC_R_NOTFOUND, the isc_ht_add() must always succeed.
3. Instead of returning direct iterator for the catalog zone entries,
add dns_catz_zone_for_each_entry2() function that calls callback
for each catalog zone entry and passes two extra arguments to the
callback. This will allow changing the internal storage for the
catalog zone entries.
4. Cleanup the naming - dns_catz_<fn>_<obj> -> dns_catz_<obj>_<fn>, as an
example dns_catz_new_zone() gets renamed to dns_catz_zone_new().
These two configuration options worked in conjunction with 'auto-dnssec'
to determine KSK usage, and thus are now obsoleted.
However, in the code we keep KSK processing so that when a zone is
reconfigured from using 'dnssec-policy' immediately to 'none' (without
going through 'insecure'), the zone is not immediately made bogus.
Add one more test case for going straight to none, now with a dynamic
zone (no inline-signing).
Some 'rndc signing' commands can still be used in conjunction with
'dnssec-policy' because it shows the progress of signing and
private type records can be cleaned up. Allow these commands to be
executed.
However, setting NSEC3 parameters is incompatible with dnssec-policy.
BIND's rdataset structure is a view of some DNS records. It is
polymorphic, so the details of how the records are stored can vary.
For instance, the records can be held in an rdatalist, or in an
rdataslab in the rbtdb.
The dns_rdataset structure previously had a number of fields called
`private1` up to `private7`, which were used by the various rdataset
implementations. It was not at all clear what these fields were for,
without reading the code and working it out from context.
This change makes the rdataset inheritance hierarchy more clear. The
polymorphic part of a `struct dns_rdataset` is now a union of structs,
each of which is named for the class of implementation using it. The
fields of these structs replace the old `privateN` fields. (Note: the
term "inheritance hierarchy" refers to the fact that the builtin and
SDLZ implementations are based on and inherit from the rdatalist
implementation, which in turn inherits from the generic rdataset.
Most of this change is mechanical, but there are a few extras.
In keynode.c there were a number of REQUIRE()ments that were not
necessary: they had already been checked by the rdataset method
dispatch code. On the other hand, In ncache.c there was a public
function which needed to REQUIRE() that an rdataset was valid.
I have removed lots of "reset iterator state" comments, because it
should now be clear from `target->iter = NULL` where before
`target->private5 = NULL` could have been doing anything.
Initialization is a bit neater in a few places, using C structure
literals where appropriate.
The pointer arithmetic for translating between an rdataslab header and
its raw contents is now fractionally safer.
The ability to read legacy HMAC-MD5 K* keyfile pairs using algorithm
number 157 was accidentally lost when the algorithm numbers were
consolidated into a single block, in commit
09f7e0607a.
The assumption was that these algorithm numbers were only known
internally, but they were also used in key files. But since HMAC-MD5
got renumbered from 157 to 160, legacy HMAC-MD5 key files no longer
work.
Move HMAC-MD5 back to 157 and GSSAPI back to 160. Add exception for
GSSAPI to list_hmac_algorithms.
view->adb may be referenced while the view is shutting down as the
zone uses a weak reference to the view and examines view->adb but
dns_view_detach call dns_adb_detach to clear view->adb.
since it is not necessary to find partial matches when looking
up names in a TSIG keyring, we can use a hash table instead of
an RBT to store them.
the tsigkey object now stores the key name as a dns_fixedname
rather than allocating memory for it.
the `name` parameter to dns_tsigkeyring_add() has been removed;
it was unneeded since the tsigkey object already contains a copy
of the name.
the opportunistic cleanup_ring() function has been removed;
it was only slowing down lookups.
the prior practice of passing a dns_name containing the
expanded name of an algorithm to dns_tsigkey_create() and
dns_tsigkey_createfromkey() is unnecessarily cumbersome;
we can now pass the algorithm number instead.
use the ISC_REFCOUNT attach/detach implementation in dns/tsig.c
so that detailed tracing can be used during refactoring.
dns_tsig_keyring_t has been renamed dns_tsigkeyring_t so the type
and the attach/detach function names will match.
- style cleanups.
- simplify the function parameters to dns_tsigkey_create():
+ remove 'restored' and 'generated', they're only ever set to false.
+ remove 'creator' because it's only ever set to NULL.
+ remove 'inception' and 'expiry' because they're only ever set to
(0, 0) or (now, now), and either way, this means "never expire".
+ remove 'ring' because we can just use dns_tsigkeyring_add() instead.
- rename dns_keyring_restore() to dns_tsigkeyring_restore() to match the
rest of the functions operating on dns_tsigkeyring objects.
it was possible to add a TSIG key to more than one TSIG
keyring at a time, and this was in fact happening with the
session key, which was generated once and then added to the
keyrings for each view as it was configured.
this has been corrected and a REQUIRE added to dns_tsigkeyring_add()
to prevent it from happening again.
The 'named_g_server->interfacemgr' pointer is saved in the zone
structure using dns_zone_setisself(), as a void* argument to be
passed to the isself() callback, so there is no attach/detach,
and when shutting down, the interface manager can be destroyed
by the shutdown_server(), running in exclusive mode, and causing
isself() to crash when trying to use the pointer.
Instead of keeping the interface manager pointer in the zone
structure, just check and use the 'named_g_server->interfacemgr'
itself, as it was implemented originally in the
3aca8e5bf3 commit. Later, in the
8eb88aafee commit, the code was
changed to pass the interface manager pointer using the additional
void* argument, but the commit message doesn't mention if there
was any practical reason for that.
Additionally, don't pass the interfacemgr pointer to the
ns_interfacemgr_getaclenv() function before it is checked
against NULL.
'-T transferinsecs' makes named interpret the max-transfer-time-out,
max-transfer-idle-out, max-transfer-time-in and max-transfer-idle-in
configuration options as seconds instead of minutes.
'-T transferslowly' makes named to sleep for one second for every
xfrout message.
'-T transferstuck' makes named to sleep for one minute for every
xfrout message.
When shutting down TCP sockets, the read callback calling logic was
flawed, it would call either one less callback or one extra. Fix the
logic in the way:
1. When isc_nm_read() has been called but isc_nm_read_stop() hasn't on
the handle, the read callback will be called with ISC_R_CANCELED to
cancel active reading from the socket/handle.
2. When isc_nm_read() has been called and isc_nm_read_stop() has been
called on the on the handle, the read callback will be called with
ISC_R_SHUTTINGDOWN to signal that the dormant (not-reading) socket
is being shut down.
3. The .reading and .recv_read flags are little bit tricky. The
.reading flag indicates if the outer layer is reading the data (that
would be uv_tcp_t for TCP and isc_nmsocket_t (TCP) for TLSStream),
the .recv_read flag indicates whether somebody is interested in the
data read from the socket.
Usually, you would expect that the .reading should be false when
.recv_read is false, but it gets even more tricky with TLSStream as
the TLS protocol might need to read from the socket even when sending
data.
Fix the usage of the .recv_read and .reading flags in the TLSStream
to their true meaning - which mostly consist of using .recv_read
everywhere and then wrapping isc_nm_read() and isc_nm_read_stop()
with the .reading flag.
4. The TLS failed read helper has been modified to resemble the TCP code
as much as possible, clearing and re-setting the .recv_read flag in
the TCP timeout code has been fixed and .recv_read is now cleared
when isc_nm_read_stop() has been called on the streaming socket.
5. The use of Network Manager in the named_controlconf, isccc_ccmsg, and
isc_httpd units have been greatly simplified due to the improved design.
6. More unit tests for TCP and TLS testing the shutdown conditions have
been added.
Co-authored-by: Ondřej Surý <ondrej@isc.org>
Co-authored-by: Artem Boldariev <artem@isc.org>
isc_loopmgr_pause can't be called before isc_loopmgr_run is
called as the thread ids are not yet valid. If there is a
fatal error before isc_loopmgr_run is run then don't call
isc_loopmgr_pause.
This change makes the zone table lock-free for reads. Previously, the
zone table used a red-black tree, which is not thread safe, so the hot
read path acquired both the per-view mutex and the per-zonetable
rwlock. (The double locking was to fix to cleanup races on shutdown.)
One visible difference is that zones are not necessarily shut down
promptly: it depends on when the qp-trie garbage collector cleans up
the zone table. The `catz` system test checks several times that zones
have been deleted; the test now checks for zones to be removed from
the server configuration, instead of being fully shut down. The catz
test does not churn through enough zones to trigger a gc, so the zones
are not fully detached until the server exits.
After this change, it is still possible to improve the way we handle
changes to the zone table, for instance, batching changes, or better
compaction heuristics.
If the 'checkds' option is not explicitly set, check if there are
'parental-agents' for the zone configured. If so, default to "explicit",
otherwise default to "yes".
Add a new configuration option to set how the checkds method should
work. Acceptable values are 'yes', 'no', and 'explicit'.
When set to 'yes', the checkds method is to lookup the parental agents
by querying the NS records of the parent zone.
When set to 'no', no checkds method is enabled. Users should run
the 'rndc checkds' command to signal that DS records are published and
withdrawn.
When set to 'explicit', the parental agents are explicitly configured
with the 'parental-agents' configuration option.
This should have no functional effects.
The message size stats are specified by RSSAC002 so it's best not
to mess around with how they appear in the statschannel. But it's
worth changing the implementation to use general-purpose histograms,
to reduce code size and benefit from sharded counters.
Cleanup the remnants of MS Compiler bits from <isc/refcount.h>, printing
the information in named/main.c, and cleanup some comments about Windows
that no longer apply.
The bits in picohttpparser.{h,c} were left out, because it's not our
code.
When fatal is called we may be holding memory allocated by OpenSSL.
This may result in the reference count for the FIPS provider not
going to zero and the shared library not being unloaded during
OPENSSL_cleanup. When the shared library is ultimately unloaded,
when all remaining dynamically loaded libraries are freed, we have
already destroyed the memory context we where using to track memory
leaks / late frees resulting in INSIST being called.
Disable triggering the INSIST when fatal has being called.
There are times where you want named-checkconf to check whether the
dnssec-policies should be constrained by the cryptographic algorithms
supported by the operation system or to just accept all possible
algorithms. This provides a mechanism to make that selection.
The isc_time_now() and isc_time_now_hires() were used inconsistently
through the code - either with status check, or without status check,
or via TIME_NOW() macro with RUNTIME_CHECK() on failure.
Refactor the isc_time_now() and isc_time_now_hires() to always fail when
getting current time has failed, and return the isc_time_t value as
return value instead of passing the pointer to result in the argument.
This is a simple replacement using the semantic patch from the previous
commit and as added bonus, one removal of previously undetected unused
variable in named/server.c.
It's sometimes helpful to get a quick idea of the call stack when
debugging. This change factors out the backtrace logging from named's
fatal error handler so that it's easy to use in other places too.
enable DNSRPS in the continuous integration tests
this triggered a build failure in OpenBSD; building with DNSRPS
causes arpa/nameser.h to be included, which defines the value
STATUS. that value was then reused in server.c renaming the
value to STAT corrects the error.
for testing purposes, we need to be able to specify a library path from
which to load the dnsrps implementation. this can now be done with the
"dnsrps-library" option.
DNSRPS can now be enabled in configure regardless of whether librpz.so
is currently installed on the system.
These options and zone type were created to address the
SiteFinder controversy, in which certain TLD's redirected queries
rather than returning NXDOMAIN. since TLD's are now DNSSEC-signed,
this is no longer likely to be a problem.
The deprecation message for 'type delegation-only' is issued from
the configuration checker rather than the parser. therefore,
isccfg_check_namedconf() has been modified to take a 'nodeprecate'
parameter to suppress the warning when named-checkconf is used with
the command-line option to ignore warnings on deprecated options (-i).
stop and restart the server in the 'tsiggss' test, in order
to confirm that GSS negotiated TSIG keys are saved and restored
when named loads.
added logging to dns_tsigkey_createfromkey() to indicate whether
a key has been statically configured, generated via GSS negotiation,
or restored from a file.
without diffie-hellman TKEY negotiation, some other code is
now effectively dead or unnecessary, and can be cleaned up:
- the rndc tsig-list and tsig-delete commands.
- a nonoperational command-line option to dnssec-keygen that
was documented as being specific to DH.
- the section of the ARM that discussed TKEY/DH.
- the functions dns_tkey_builddeletequery(), processdeleteresponse(),
and tkey_processgssresponse(), which are unused.
Completely remove the TKEY Mode 2 (Diffie-Hellman Exchanged Keying) from
BIND 9 (from named, named.conf and all the tools). The TKEY usage is
fringe at best and in all known cases, GSSAPI is being used as it should.
The draft-eastlake-dnsop-rfc2930bis-tkey specifies that:
4.2 Diffie-Hellman Exchanged Keying (Deprecated)
The use of this mode (#2) is NOT RECOMMENDED for the following two
reasons but the specification is still included in Appendix A in case
an implementation is needed for compatibility with old TKEY
implementations. See Section 4.6 on ECDH Exchanged Keying.
The mixing function used does not meet current cryptographic
standards because it uses MD5 [RFC6151].
RSA keys must be excessively long to achieve levels of security
required by current standards.
We might optionally implement Elliptic Curve Diffie-Hellman (ECDH) key
exchange mode 6 if the draft ever reaches the RFC status. Meanwhile the
insecure DH mode needs to be removed.
Named now logs both compile time and run time UV versions when
starting up. This is useful information to have when debugging
network issues involving named.
During reconfiguration, the configure_view() function reverts the
configured zones to the previous view in case if there is an error.
It uses the 'zones_configured' boolean variable to decide whether
it is required to revert the zones, i.e. the error happened after
all the zones were successfully configured.
The problem is that it does not account for the case when an error
happens during the configuration of one of the zones (not the first),
in which case there are zones that are already configured for the
new view (and they need to be reverted), and there are zones that
are not (starting from the failed one).
Since 'zones_configured' remains 'false', the configured zones are
not reverted.
Replace the 'zones_configured' variable with a pointer to the latest
successfully configured zone configuration element, and when reverting,
revert up to and including that zone.
This implements node reference tracing that passes all the internal
layers from dns_db API (and friends) to increment_reference() and
decrement_reference().
It can be enabled by #defining DNS_DB_NODETRACE in <dns/trace.h> header.
The output then looks like this:
incr:node:check_address_records:rootns.c:409:0x7f67f5a55a40->references = 1
decr:node:check_address_records:rootns.c:449:0x7f67f5a55a40->references = 0
incr:nodelock:check_address_records:rootns.c:409:0x7f67f5a55a40:0x7f68304d7040->references = 1
decr:nodelock:check_address_records:rootns.c:449:0x7f67f5a55a40:0x7f68304d7040->references = 0
There's associated python script to find the missing detach located at:
https://gitlab.isc.org/isc-projects/bind9/-/snippets/1038
The configure_catz() function creates the catalog zones structure
for the view even when it is not needed, in which case it then
discards it (by detaching) later.
Instead, call dns_catz_new_zones() only when it is needed, i.e. when
there is no existing "previous" view with an existing 'catzs', that
is going to be reused.
* Change 'dns_catz_new_zones()' function's prototype (the order of the
arguments) to synchronize it with the similar function in rpz.c.
* Rename 'refs' to 'references' in preparation of ISC_REFCOUNT_*
macros usage for reference tracking.
* Unify dns_catz_zone_t naming to catz, and dns_catz_zones_t naming to
catzs, following the logic of similar changes in rpz.c.
* Use C compound literals for structure initialization.
* Synchronize the "new zone version came too soon" log message with the
one in rpz.c.
* Use more of 'sizeof(*ptr)' style instead of the 'sizeof(type_t)' style
expressions when allocating or freeing memory for 'ptr'.
the 'dispatchmgr' member of the resolver object is used by both
the dns_resolver and dns_request modules, and may in the future
be used by others such as dns_xfrin. it doesn't make sense for it
to live in the resolver object; this commit moves it into dns_view.
instead of using the SDB API as a wrapper to register and
unregister and provide a call framework for the builtin databases,
this commit flattens things so that the builtin databases implement
dns_db directly.
move all dns_sdb code into bin/named/builtin.c, which is the
only place from which it's called.
(note this is temporary: later we'll refactor builtin so that it's a
standalone dns_db implementation on its own instead of using SDB
as a wrapper.)
SDB is currently (and foreseeably) only used by the named
builtin databases, so it only needs as much of its API as
those databases use.
- removed three flags defined for the SDB API that were always
set the same by builtin databases.
- there were two different types of lookup functions defined for
SDB, using slightly different function signatures. since backward
compatibility is no longer a concern, we can eliminate the 'lookup'
entry point and rename 'lookup2' to 'lookup'.
- removed the 'allnodes' entry point and all database iterator
implementation code
- removed dns_sdb_putnamedrr() and dns_sdb_putnamedrdata() since
they were never used.
initialize dns_dbmethods, dns_sdbmethods and dns_rdatasetmethods
using explicit struct member names, so we don't have to keep track
of NULLs for unimplemented functions any longer.
When switching to a new view during a reconfiguration (or reverting
to the old view), detach the 'rpzs' and 'catzs' from the previuos view.
The 'catzs' case was earlier solved slightly differently, by detaching
from the new view when reverting to the old view, but we can not solve
this the same way for 'rpzs', because now in BIND 9.19 and BIND 9.18
a dns_rpz_shutdown_rpzs() call was added in view's destroy() function
before detaching the 'rpzs', so we can not leave the 'rpzs' attached to
the previous view and let it be shut down when we intend to continue
using it with the new view.
Instead, "re-fix" the issue for the 'catzs' pointer the same way as
for 'rpzs' for consistency, and also because a similar shutdown call
is likely to be implemented for 'catzs' in the near future.
this function was just a front-end for gethostname(). it was
needed when we supported windows, which has a different function
for looking up the hostname; it's not needed any longer.
as there is no further use of isc_task in BIND, this commit removes
it, along with isc_taskmgr, isc_event, and all other related types.
functions that accepted taskmgr as a parameter have been cleaned up.
as a result of this change, some functions can no longer fail, so
they've been changed to type void, and their callers have been
updated accordingly.
the tasks table has been removed from the statistics channel and
the stats version has been updated. dns_dyndbctx has been changed
to reference the loopmgr instead of taskmgr, and DNS_DYNDB_VERSION
has been udpated as well.
change functions using isc_taskmgr_beginexclusive() to use
isc_loopmgr_pause() instead.
also, removed an unnecessary use of exclusive mode in
named_server_tcptimeouts().
most functions that were implemented as task events because they needed
to be running in a task to use exclusive mode have now been changed
into loop callbacks instead. (the exception is catz, which is being
changed in a separate commit because it's a particularly complex change.)
callback events from dns_resolver_createfetch() are now posted
using isc_async_run.
other modules which called the resolver and maintained task/taskmgr
objects for this purpose have been cleaned up.
- removed documentation of -S option from named man page
- removed documentation of reserved-sockets from ARM
- simplified documentation of dnssec-secure-to-insecure - it
now just says it's obsolete rather than describing what it
doesn't do anymore
- marked three formerly obsolete options as ancient:
parent-registration-delay, reserved-sockets, and
suppress-initial-notify
the built-in trust anchors in named and delv are sufficent for
validation. named still needs to be able to load trust anchors from
a bind.keys file for testing purposes, but it doesn't need to be
the default behavior.
we now only load trust anchors from a file if explicitly specified
via the "bindkeys-file" option in named or the "-a" command line
argument to delv. documentation has been cleaned up to remove references
to /etc/bind.keys.
Closes#3850.
it was possible for a managed trust anchor needing to send a key
refresh query to be unable to do so because an authoritative zone
was not yet loaded. this has been corrected by delaying the
synchronization of managed-keys zones until after all zones are
loaded.
GL !7412 updated the set of counters exposed via the XML & JSON
statistics channels. Apply a corresponding version bump, which was
not included in that merge request.
Commit b69e783164 changed the scope of the
local 'view' variable in load_configuration(), but the code section
guarded by the #ifdef USE_DNSRPS directive was not adjusted accordingly,
causing build errors for DNSRPS-enabled builds. Fix the latter by
declaring the 'view' variable inside the loop in the DNSRPS-specific
block of code.
The ADB hashmaps are stored in extra memory contexts, so the hash
tables are excluded from the overmem accounting. The new memory
context was unnamed, give it a proper name.
Same thing has happened with extra memory context used for named
global log context - give the extra memory context a proper name.
The total memory counter had again little or no meaning when we removed
the internal memory allocator. It was just a monotonic counter that
would count add the allocation sizes but never subtracted anything, so
it would be just a "big number".
The maxinuse memory counter indicated the highest amount of
memory allocated in the past. Checking and updating this high-
water mark value every time memory was allocated had an impact
on server performance, so it has been removed. Memory size can
be monitored more efficiently via an external tool logging RSS.
The malloced and maxmalloced memory counters were mostly useless since
we removed the internal allocator blocks - it would only differ from
inuse by the memory context size itself.
Add support for loading and validating the 'tls' parameter from
the forwarders' configuration.
This prepares ground for adding support to forward queries to
DoT-enabled upstream servers.
named formerly reserved a set of dispatch objects for use when
sending requests from user-specified source ports. this objects
are no longer used and have been removed.
When we change the view in the view->managed_keys, we never commit the
change, keeping the previous view possibly attached forever.
Call the dns_zone_setviewcommit() immediately after changing the view as
we are detaching the previous view anyway and there's no way to recover
from that.
limit the number of simultaneous DNS UPDATE events that can be
processed by adding a quota for update and update forwarding.
this quota currently, arbitrarily, defaults to 100.
also add a statistics counter to record when the update quota
has been exceeded.
DSCP has not been fully working since the network manager was
introduced in 9.16, and has been completely broken since 9.18.
This seems to have caused very few difficulties for anyone,
so we have now marked it as obsolete and removed the
implementation.
To ensure that old config files don't fail, the code to parse
dscp key-value pairs is still present, but a warning is logged
that the feature is obsolete and should not be used. Nothing is
done with configured values, and there is no longer any
range checking.
This bug was masked in the tests because the `catz` test script did an
`rndc addzone` before an `rndc delzone`. The `addzone` autovivified
the NZF config, so `delzone` worked OK.
This commit swaps the order of two sections of the `catz` test script
so that it uses `delzone` before `addzone`, which provokes a crash
when `delzone` requires a non-NULL NZF config.
To fix the crash, we now try to remove the zone from the NZF config
only if it was dynamically added but not by a catalog zone.
Remove parsing the configuration options 'alt-transfer-source',
'alt-transfer-source-v6', and 'use-alt-transfer-source', and remove
the corresponding code that implements the feature.
Additionally to renaming, it changes the function definition so that
it accepts a pointer to pointer instead of returning a pointer to the
new object.
It is mostly done to make it in line with other functions in the
module.
The isc_buffer_reserve() would be passed a reference to the buffer
pointer, which was unnecessary as the pointer would never be changed
in the current implementation. Remove the extra dereference.
The only function left in the isc_resource API was setting the file
limit. Replace the whole unit with a simple getrlimit to check the
maximum value of RLIMIT_NOFILE and set the maximum back to rlimit_cur.
This is more compatible than trying to set RLIMIT_UNLIMITED on the
RLIMIT_NOFILE as it doesn't work on Linux (see man 5 proc on
/proc/sys/fs/nr_open), neither it does on Darwin kernel (see man 2
getrlimit).
The only place where the maximum value could be raised under privileged
user would be BSDs, but the `named_os_adjustnofile()` were not called
there before. We would apply the increased limits only on Linux and Sun
platforms.
After deprecating the operating system limits settings (coresize,
datasize, files and stacksize), mark them as ancient and remove the code
that sets the values from config.
The mechanism for associating a worker task to a database now
uses loops rather than tasks.
For this reason, the parameters to dns_cache_create() have been
updated to take a loop manager rather than a task manager.
The dns_rpz_zones structure was using .refs and .irefs for strong and
weak reference counting. Rewrite the unit to use just a single
reference counting + shutdown sequence (dns_rpz_destroy_rpzs) that must
be called by the creator of the dns_rpz_zones_t object. Remove the
reference counting from the dns_rpz_zone structure as it is not needed
because the zone objects are fully embedded into the dns_rpz_zones
structure and dns_rpz_zones_t object must never be destroyed before all
dns_rpz_zone_t objects.
The dns_rps_zones_t reference counting uses the new ISC_REFCOUNT_TRACE
capability - enable by defining DNS_RPZ_TRACE in the dns/rpz.h header.
Additionally, add magic numbers to the dns_rpz_zone and dns_rpz_zones
structures.
The dns_cache API contained a cache cleaning mechanism that would be
disabled for 'rbt' based cache. As named doesn't have any other cache
implementations, remove the cache cleaning mechanism from dns_cache API.
There were a number of places where the zone table should have been
locked, but wasn't, when dns_zt_apply was called.
Added a isc_rwlocktype_t type parameter to dns_zt_apply and adjusted
all calls to using it. Removed locks in callers.
If after a reconfig a zone is not reusable because inline-signing
was turned on/off, trigger a full resign. This is necessary because
otherwise the zone maintenance may decide to only apply the changes
in the journal, leaving the zone in an inconsistent DNSSEC state.
On Linux, the libcap is now mandatory. It makes things simpler for us.
System without {set,get}res{uid,gid} now have compatibility shim using
setreuid/setregid or seteuid/setegid to setup effective UID/GID, so the
same code can be called all the time (including on Linux).
The ns_interfacemgr_scan() now requires the loopmgr to be running, so we
need to end exclusive mode for the rescan and then begin it again.
This is relatively safe operation (because the scan happens on the timer
anyway), but we need to ensure that we won't load the configuration from
different threads. This is already the case because the initial load
happens on the main thread and the control channel also listens just on
the main loop.
There are three levels there for the port value, with increasing
priority:
1. The default ports, defined by 'port' and 'tls-port' config options.
2. The primaries-level default port: primaries port <number> { ... };
3. The primaries element-level port: primaries { <address> port
<number>; ... };"
In 'named_config_getipandkeylist()', the 'def_port' and 'def_tlsport'
variables are extracted from level 1. The 'port' variable is extracted
from the level 2. Currently if that is unset, it defaults to the
default port ('def_port' or 'def_tlsport' depending on the transport
used), but overrides the level 2 port setting for the next primaries in
the list.
Update the code such that we inherit the port only if the level 3 port
is not set, and inherit from the default ports if the level 2 port is
also not set.
The "prefetch" setting is in "defaultconf" so it cannot fail, use
INSIST to confirm that.
The 'trigger' and 'eligible' variables are now prefixed with
'prefetch_' and their declaration moved to an upper level, because
there is no more additional code block after this change.
This commit fixes a startup issue on Solaris systems with
many (reportedly > 510) CPUs by bumping RLIMIT_NOFILE. This appears to
be a regression from 9.11.
I.e. print the name of the function in BIND that called the system
function that returned an error. Since it was useful for pthreads
code, it seems worthwhile doing so everywhere.
Mostly generated automatically with the following semantic patch,
except where coccinelle was confused by #ifdef in lib/isc/net.c
@@ expression list args; @@
- UNEXPECTED_ERROR(__FILE__, __LINE__, args)
+ UNEXPECTED_ERROR(args)
@@ expression list args; @@
- FATAL_ERROR(__FILE__, __LINE__, args)
+ FATAL_ERROR(args)
Replace all uses of RUNTIME_CHECK() in lib/isc/include/isc/once.h with
PTHEADS_RUNTIME_CHECK(), in order to improve error reporting for any
once-related run-time failures (by augmenting error messages with
file/line/caller information and the error string corresponding to
errno).
Rewrite the isc_httpd to be more robust.
1. Replace the hand-crafted HTTP request parser with picohttpparser for
parsing the whole HTTP/1.0 and HTTP/1.1 requests. Limit the number
of allowed headers to 10 (arbitrary number).
2. Replace the hand-crafted URL parser with isc_url_parse for parsing
the URL from the HTTP request.
3. Increase the receive buffer to match the isc_netmgr buffers, so we
can at least receive two full isc_nm_read()s. This makes the
truncation processing much simpler.
4. Process the received buffer from single isc_nm_read() in a single
loop and schedule the sends to be independent of each other.
The first two changes makes the code simpler and rely on already
existing libraries that we already had (isc_url based on nodejs) or are
used elsewhere (picohttpparser).
The second two changes remove the artificial "truncation" limit on
parsing multiple request. Now only a request that has too many
headers (currently 10) or is too big (so, the receive buffer fills up
without reaching end of the request) will end the connection.
We can be benevolent here with the limites, because the statschannel
channel is by definition private and access must be allowed only to
administrators of the server. There are no timers, no rate-limiting, no
upper limit on the number of requests that can be served, etc.
Getting the recorded value of 'edns-udp-size' from the resolver requires
strong attach to the dns_view because we are accessing `view->resolver`.
This is not the case in places (f.e. dns_zone unit) where `.udpsize` is
accessed. By moving the .udpsize field from `struct dns_resolver` to
`struct dns_view`, we can access the value directly even with weakly
attached dns_view without the need to lock the view because `.udpsize`
can be accessed after the dns_view object has been shut down.
While refactoring the isc_mem_getx(...) usage, couple places were
identified where the memory was resized manually. Use the
isc_mem_reget(...) that was introduced in [GL !5440] to resize the
arrays via function rather than a custom code.
In several places, the structures were cleaned with memset(...)) and
thus the semantic patch converted the isc_mem_get(...) to
isc_mem_getx(..., ISC_MEM_ZERO). Use the designated initializer to
initialized the structures instead of zeroing the memory with
ISC_MEM_ZERO flag as this better matches the intended purpose.
Add new semantic patch to replace the straightfoward uses of:
ptr = isc_mem_{get,allocate}(..., size);
memset(ptr, 0, size);
with the new API call:
ptr = isc_mem_{get,allocate}x(..., size, ISC_MEM_ZERO);
Because the dns_zonemgr_create() was run before the loopmgr was started,
the isc_ratelimiter API was more complicated that it had to be. Move
the dns_zonemgr_create() to run_server() task which is run on the main
loop, and simplify the isc_ratelimiter API implementation.
The isc_timer is now created in the isc_ratelimiter_create() and
starting the timer is now separate async task as is destroying the timer
in case it's not launched from the loop it was created on. The
ratelimiter tick now doesn't have to create and destroy timer logic and
just stops the timer when there's no more work to do.
This should also solve all the races that were causing the
isc_ratelimiter to be left dangling because the timer was stopped before
the last reference would be detached.
There's a known memory leak in the engine_pkcs11 at the time of writing
this and it interferes with the named ability to check for memory leaks
in the OpenSSL memory context by default.
Add an autoconf option to explicitly enable the memory leak detection,
and use it in the CI except for pkcs11 enabled builds. When this gets
fixed in the engine_pkc11, the option can be enabled by default.
As we can't check the deallocations done in the library memory contexts
by default because it would always fail on non-clean exit (that happens
on error or by calling exit() early), we just want to enable the checks
to be done on normal exit.
The libxml2 library provides a way to replace the default allocator with
user supplied allocator (malloc, realloc, strdup and free).
Create a memory context specifically for libxml2 to allow tracking the
memory usage that has originated from within libxml2. This will provide
a separate memory context for libxml2 to track the allocations and when
shutting down the application it will check that all libxml2 allocations
were returned to the allocator.
Additionally, move the xmlInitParser() and xmlCleanupParser() calls from
bin/named/main.c to library constructor/destructor in libisc library.