Instead of mixing the dns_resolver and dns_validator units directly with
the EDE code, split-out the dns_ede functionality into own separate
compilation unit and hide the implementation details behind abstraction.
Additionally, the EDE codes are directly copied into the ns_client
buffers by passing the EDE context to dns_resolver_createfetch().
This makes the dns_ede implementation simpler to use, although sligtly
more complicated on the inside.
Co-authored-by: Colin Vidal <colin@isc.org>
Co-authored-by: Ondřej Surý <ondrej@isc.org>
Address Database (ADB) shares the memory for the short lived ADB
objects (finds, fetches, addrinfo) and the long lived ADB
objects (names, entries, namehooks). This could lead to a situation
where the resolver-heavy load would force evict ADB objects from the
database to point where ADB is completely empty, leading to even more
resolver-heavy load.
Make the short lived ADB objects use the other memory context that we
already created for the hashmaps. This makes the ADB overmem condition
to not be triggered by the ongoing resolver fetches.
Add support for Extended DNS Errors (EDE) error 22: No reachable
authority. This occurs when after a timeout delay when the resolver is
trying to query an authority server.
When canceling the ADB find, the lock on the find gets released for
a brief period of time to be locked again inside adbname lock. During
the brief period that the ADB find is unlocked, it can get canceled by
other means removing it from the adbname list which in turn causes
assertion failure due to a double removal from the adbname list.
Recheck if the find->adbname is still valid after acquiring the lock
again and if not just skip the double removal. Additionally, attach to
the adbname as in the worst case, the adbname might also cease to exist
if the scheduler would block this particular thread for a longer period
of time invalidating the lock we are going to acquire and release.
Static-stub address and addresses from other sources where being
mixed together resulting in static-stub queries going to addresses
not specified in the configuration or alternatively static-stub
addresses being used instead of the real addresses.
Give prefetches a free pass through the quota so that the cache
entries for popular zones could be updated successfully even if the
quota for is already reached.
Remove the complicated mechanism that could be (in theory) used by
external libraries to register new categories and modules with
statically defined lists in <isc/log.h>. This is similar to what we
have done for <isc/result.h> result codes. All the libraries are now
internal to BIND 9, so we don't need to provide a mechanism to register
extra categories and modules.
The previous value of 30 minutes used to cache the ADB names and entries
was quite long. Change the value to 60 seconds for faster recovery
after cached intermittent failure of the remote nameservers.
The algorithm from the previous commit[1] is now used to calculate all
the expiration values through the code (ncache results, cname/dname
targets).
1. ISC_MIN(cur, ISC_MAX(now + ADB_ENTRY_WINDOW, now + rdataset->ttl))
Correct the logic to set the expiration period of expire_{v4,v6} as
follows:
1. If the trust is ultimate (local entry), immediately set the entry as
expired, so the changes to the local zones have immediate effect.
3. If the expiration is already set and smaller than the new value, then
leave the expiration value as it is.
2. Otherwise pick larger of `now + ADB_ENTRY_WINDOW` and `now + TTL` as
the new expiration value.
When ADB entry was created it was set to never expire. If we never
called any of the functions that adjust the expiration, it could linger
in the ADB forever.
Set the expiration (.expires) to now + ADB_ENTRY_WINDOW when creating
the new ADB entry to ensure the ADB entry will always expire.
if we had a method to get the running loop, similar to how
isc_tid() gets the current thread ID, we can simplify loop
and loopmgr initialization.
remove most uses of isc_loop_current() in favor of isc_loop().
in some places where that was the only reason to pass loopmgr,
remove loopmgr from the function parameters.
Previously, there were two methods of working with the overmem
condition:
1. hi/lo water callback - when the overmem condition was reached
for the first time, the water callback was called with HIWATER
mark and .is_overmem boolean was set internally. Similarly,
when the used memory went below the lo water mark, the water
callback would be called with LOWATER mark and .is_overmem
was reset. This check would be called **every** time memory
was allocated or freed.
2. isc_mem_isovermem() - a simple getter for the internal
.is_overmem flag
This commit refactors removes the first method and move the hi/lo water
checks to the isc_mem_isovermem() function, thus we now have only a
single method of checking overmem condition and the check for hi/lo
water is removed from the hot path for memory contexts that doesn't use
overmem checks.
If a child zone is served by the same servers as a parent zone and
a NS query is made for the zone name then the addresses of the
nameservers are returned in the additional section are tagged as
trust additional.
There was a microoptimization for smoothing srtt with bitshifts. Revert
the code to use * 98 / 100, it doesn't really make that difference on
modern CPUs, for comparison here:
muldiv:
imul eax, edi, 98
imul rax, rax, 1374389535
shr rax, 37
ret
shift:
mov eax, edi
sal eax, 9
sub eax, edi
shr eax, 9
ret
If factor == DNS_ADB_RTTADJAGE and addr->entry->lastage == now we would
load value into new_srtt and then immediatelly store it back which
triggers the synchronization between threads using .srtt values.
Use atomics on couple of ADB entry members (.srtt, .flags, .expires, and
.lastage) to remove ADB entry locking from couple of hot spots. The
most prominent place is copy_namehook_lists() that gets called under ADB
name lock and if the namehook list is long it acquires-releases quite a
few ADB entry locks. Changing those ADB entry members to atomics
allowed us to new_adbaddrinfo() not require locked ADB entry and since
adbentry_overquota() already used atomics and handling lame information
was dropped in the previous commit, we could not make the
copy_namehook_lists() lockless.
The other hotspot is dns_adb_adjustsrtt() and dns_adb_agesrtt() that can
now use atomics because .srtt is already atomic_uint.
And the last place that could now use atomics is dns_adb_changeflags().
Keeping the information about lame server in the ADB was done in !322 to
fix following security issue:
[CVE-2021-25219] Disable "lame-ttl" cache
The handling of the lame servers needs to be redesigned and it is not
going to be enabled any time soon, and the current code is just dead
code that takes up space, code and stands in the way of making ADB work
faster.
Remove all the internals needed for handling the lame servers in the ADB
for now. It might get reintroduced later if and when we redesign ADB.
Refactor isc_hashmap to allow custom matching functions. This allows us
to have better tailored keys that don't require fixed uint8_t arrays,
but can be composed of more fields from the stored data structure.
The isc_stats_create() can no longer return anything else than
ISC_R_SUCCESS. Refactor isc_stats_create() and its variants in libdns,
libns and named to just return void.
in the past there was overlap between the fields used
as resolver fetch options and ADB addrinfo flags. this has
mostly been eliminated; now we can clean up the rest of
it and remove some confusing comments.
The dns_adbentry_overquota() was violating the layers accessing the
adbentry struct members directly. Change it to dns_adb_overquota() to
match the dns_adb API.
This is a simple replacement using the semantic patch from the previous
commit and as added bonus, one removal of previously undetected unused
variable in named/server.c.
There was a code flow error that would remove the expired ADB entry from
the LRU list and then a check in the expire_entry() would cause
assertion error because it expect the ADB entry to be linked.
Additionally, the expire mechanism would loop for cases when we would
held only a read rwlock; in such case we need to upgrade the lock and
try again, not just try again.
as there is no further use of isc_task in BIND, this commit removes
it, along with isc_taskmgr, isc_event, and all other related types.
functions that accepted taskmgr as a parameter have been cleaned up.
as a result of this change, some functions can no longer fail, so
they've been changed to type void, and their callers have been
updated accordingly.
the tasks table has been removed from the statistics channel and
the stats version has been updated. dns_dyndbctx has been changed
to reference the loopmgr instead of taskmgr, and DNS_DYNDB_VERSION
has been udpated as well.
callback events from dns_resolver_createfetch() are now posted
using isc_async_run.
other modules which called the resolver and maintained task/taskmgr
objects for this purpose have been cleaned up.
The callbacks from dns_abd_createfind() are now posted using
isc_async_run() instead of isc_task_send(). ADB event types
have been replaced with a new dns_adbstatus_t type which is
included as find->status.
(The ADB still uses a task for dns_resolver_createfetch().)
Replace the isc_mutex in the dns_adb unit with isc_rwlock for better
performance. Both ADB names and ADB entries hashtables and LRU are now
using isc_rwlock.
The ADB hashmaps are stored in extra memory contexts, so the hash
tables are excluded from the overmem accounting. The new memory
context was unnamed, give it a proper name.
Same thing has happened with extra memory context used for named
global log context - give the extra memory context a proper name.
DSCP has not been fully working since the network manager was
introduced in 9.16, and has been completely broken since 9.18.
This seems to have caused very few difficulties for anyone,
so we have now marked it as obsolete and removed the
implementation.
To ensure that old config files don't fail, the code to parse
dscp key-value pairs is still present, but a warning is logged
that the feature is obsolete and should not be used. Nothing is
done with configured values, and there is no longer any
range checking.
The clean_namehooks() function does't hold the 'adb->entries_lock'
lock, so calling maybe_expire_entry() is not thread-safe.
Instead of adding a lock/unlock, leave the expiration to later,
e.g. by the get_attached_and_locked_entry() function.
Also fix a couple of comment typos.
The isc_buffer_reserve() would be passed a reference to the buffer
pointer, which was unnecessary as the pointer would never be changed
in the current implementation. Remove the extra dereference.
When get_attached_entry() encounters entry that would be expired, it
needs to get reference to the entry before calling maybe_expire_entry(),
so the ADB entry doesn't get destroyed inside the its own lock.
This creeped into the code base again during review, so I am adding
an extra comment to prevent this.
The overmem cleaning in ADB could become overzealous and clean fresh ADB
names and entries. Add a safety check to not clean any ADB names and
entries that are below ADB_CACHE_MINIMUM threshold.
The ADB overmem accounting would include the memory used by hashtables
thus vastly reducing the space that can be used for ADB names and
entries when the hashtables would grow. Create own memory context for
the ADB names and entries hash tables.
There was a datarace that could expire a freshly created ADB names and
entries between the return from get_attached_{name,entry} and locking it
again. Lock the ADB name and ADB entry inside the hash table lock, so
they won't get expired until the full initialization has been complete.