There's a mismatch between the atomic and non-atomic types that could
potentialy lead to a rwlock deadlock (after two billion 2^32) writes.
Use int_fast32_t when loading the atomic_int_fast32_t types in the
isc_rwlock unit.
(cherry picked from commit 86673ee67a)
When -T cookiealwaysvalid is passed to named, DNS cookie checks for
the incoming queries always pass, given they are structurally correct.
(cherry picked from commit 807ef8545d)
When answering queries, don't add data to the additional section if
the answer has more than 13 names in the RDATA. This limits the
number of lookups into the database(s) during a single client query,
reducing query processing load.
Also, don't append any additional data to type=ANY queries. The
answer to ANY is already big enough.
(cherry picked from commit a1982cf1bb)
When isc_rwlock_trylock() fails to get a read lock because another
writer was faster, it should wake up other waiting writers in case
there are no other readers, but the current code forgets about
the currently active writer when evaluating 'cntflag'.
Unset the WRITER_ACTIVE bit in 'cntflag' before checking to see if
there are other readers, otherwise the waiting writers, if they exist,
might not wake up.
(cherry picked from commit 73b6d9e9e5)
implement, document, and test the 'max-query-restarts' option
which specifies the query restart limit - the number of times
we can follow CNAMEs before terminating resolution.
(cherry picked from commit 104f3b82fb)
(cherry picked from commit 2e04f0380c)
the number of iterative queries that can be sent to resolve a
name now defaults to 32 rather than 100.
(cherry picked from commit 7e3b425dc2)
(cherry picked from commit a11367ade3)
MAX_RESTARTS is no longer hard-coded; ns_server_setmaxrestarts()
and dns_client_setmaxrestarts() can now be used to modify the
max-restarts value at runtime. in both cases, the default is 11.
(cherry picked from commit c5588babaf)
(cherry picked from commit bfbc6a6c84)
the number of steps that can be followed in a CNAME chain
before terminating the lookup has been reduced from 16 to 11.
(this is a hard-coded value, but will be made configurable later.)
(cherry picked from commit 05d78671bb)
(cherry picked from commit dd88a4cdfc)
fctx_create() now logs at debug level 9 when the fctx attaches
to an existing counter or creates a new one.
(cherry picked from commit 825f3d68c5)
(cherry picked from commit 14bce7e275)
previously, validator queries for DNSKEY and DS records were
not counted toward the quota for max-recursion-queries; they
are now.
(cherry picked from commit af7db89513)
(cherry picked from commit 18e39d989f)
there were cases in resolver.c when queries for NS records were
started without passing a pointer to the parent fetch's query counter;
as a result, the max-recursion-queries quota for those queries started
counting from zero, instead of sharing the limit for the parent fetch,
making the quota ineffective in some cases.
(cherry picked from commit d3b7e92783)
(cherry picked from commit 5ab4cae4ed)
When fuzzing it is useful for all signing operations to happen
at a specific time for reproducability. Add two variables to
the message structure (fuzzing and fuzztime) to specify if a
fixed time should be used and the value of that time.
(cherry picked from commit 3e85d8c3d6)
The new "too many records" error can make an update fail without the
error being logged. This commit fixes that.
(cherry picked from commit 558923e5405894cf976d102f0d246a28bdbb400c)
Instead of outright refusing to add new RR types to the cache, be a bit
smarter:
1. If the new header type is in our priority list, we always add either
positive or negative entry at the beginning of the list.
2. If the new header type is negative entry, and we are over the limit,
we mark it as ancient immediately, so it gets evicted from the cache
as soon as possible.
3. Otherwise add the new header after the priority headers (or at the
head of the list).
4. If we are over the limit, evict the last entry on the normal header
list.
(cherry picked from commit 57cd34441a)
when signatures were not added because of too many types already
existing at a node, the diff was not being cleaned up; this led to
a memory leak being reported at shutdown.
(cherry picked from commit 2825bdb1ae5be801e7ed603ba2455ed9a308f1f7)
Previously, the number of RR types for a single owner name was limited
only by the maximum number of the types (64k). As the data structure
that holds the RR types for the database node is just a linked list, and
there are places where we just walk through the whole list (again and
again), adding a large number of RR types for a single owner named with
would slow down processing of such name (database node).
Add a configurable limit to cap the number of the RR types for a single
owner. This is enforced at the database (rbtdb, qpzone, qpcache) level
and configured with new max-types-per-name configuration option that
can be configured globally, per-view and per-zone.
(cherry picked from commit 00d16211d6368b99f070c1182d8c76b3798ca1db)
Previously, the number of RRs in the RRSets were internally unlimited.
As the data structure that holds the RRs is just a linked list, and
there are places where we just walk through all of the RRs, adding an
RRSet with huge number of RRs inside would slow down processing of said
RRSets.
Add a configurable limit to cap the number of the RRs in a single RRSet.
This is enforced at the database (rbtdb, qpzone, qpcache) level and
configured with new max-records-per-type configuration option that can
be configured globally, per-view and per-zone.
(cherry picked from commit 3fbd21f69a1bcbd26c4c00920e7b0a419e8762fc)
Clear qctx->zversion when clearing qctx->zrdataset et al in
lib/ns/query.c:qctx_freedata. The uncleared pointer could lead to
an assertion failure if zone data needed to be re-saved which could
happen with stale data support enabled.
(cherry picked from commit 179fb3532ab8d4898ab070b2db54c0ce872ef709)
Instead of outright refusing to add new RR types to the cache, be a bit
smarter:
1. If the new header type is in our priority list, we always add either
positive or negative entry at the beginning of the list.
2. If the new header type is negative entry, and we are over the limit,
we mark it as ancient immediately, so it gets evicted from the cache
as soon as possible.
3. Otherwise add the new header after the priority headers (or at the
head of the list).
4. If we are over the limit, evict the last entry on the normal header
list.
(cherry picked from commit 57cd34441a)
Add HTTPS, SVCB, SRV, PTR, NAPTR, DNSKEY and TXT records to the list of
the priority types that are put at the beginning of the slabheader list
for faster access and to avoid eviction when there are more types than
the max-types-per-name limit.
(cherry picked from commit b27c6bcce8)
Previously, the number of RR types for a single owner name was limited
only by the maximum number of the types (64k). As the data structure
that holds the RR types for the database node is just a linked list, and
there are places where we just walk through the whole list (again and
again), adding a large number of RR types for a single owner named with
would slow down processing of such name (database node).
Add a hard-coded limit (100) to cap the number of the RR types for a single
owner. The limit can be changed at the compile time by adding following
define to CFLAGS:
-DDNS_RBTDB_MAX_RTYPES=<limit>
Previously, the number of RRs in the RRSets were internally unlimited.
As the data structure that holds the RRs is just a linked list, and
there are places where we just walk through all of the RRs, adding an
RRSet with huge number of RRs inside would slow down processing of said
RRSets.
The fix for end-of-life branches make the limit compile-time only for
simplicity and the limit can be changed at the compile time by adding
following define to CFLAGS:
-DDNS_RDATASET_MAX_RECORDS=<limit>
(cherry picked from commit c5c4d00c38530390c9e1ae4c98b65fbbadfe9e5e)
When calling dns_resolver_createfetch in resolver.c with a callback
of resume_dslookup, clear DNS_FETCHOPT_TRYSTALE_ONTIMEOUT from
options as DNS_EVENT_TRYSTALE is not an expected event type and
triggers a REQUIRE.
(cherry picked from commit 6faea6da3d646557d234d63ddd5d524d222e8082)
The fuzzing harness operates on dns_message_t in non-standard ways
and if 'sig0name' is non-NULL when msgresetsigs() and
dns_message_renderreset() are called it should be cleaned up.
(cherry picked from commit 450fab92b1)
Commit eba7fb5f9f modified the definition
of struct dns_rbtnode. Doing that changes the layout of map-format zone
files. Bump MAPAPI and update the offsets used in map-format zone file
checks in the "masterformat" system test, as these changes were
inadvertently omitted from the aforementioned change.
(cherry picked from commit 52fe0b6be7)
Commit 540a5b5a2c modified the definition
of struct dns_rbtnode. Doing that changes the layout of map-format zone
files. Bump MAPAPI and update the offsets used in map-format zone file
checks in the "masterformat" system test, as these changes were
inadvertently omitted from the aforementioned change.
The dns_cache_flush() drops the old database and creates a new one, but
it forgets to create the task(s) that runs the node pruning and cleaning
the rbtdb when flushing it next time. This causes the cleaning to skip
cleaning the parent nodes (with .down == NULL) leading to increased
memory usage over time until the database is unable to keep up and just
stays overmem all the time.
(cherry picked from commit d4bc4e5cc6)
Previously, rbtdb->task had quantum of 1 because it was originally used
just for freeing RBTDB contents, which can happen on a "best effort"
basis (does not need to be prioritized). However, when tree pruning was
implemented, it also started sending events to that task, enabling the
latter to become clogged up with a significant event backlog because it
only pruned a single RBTDB node per event.
To prioritize tree pruning (as it is necessary for enforcing the
configured memory use limit for the cache memory context), create a
second task with a virtually unlimited quantum (UINT_MAX) and send the
tree-pruning events to this new task, to ensure that all nodes scheduled
for pruning will be processed before further nodes are queued in a
similar fashion.
This change enables dropping the prunenodes list and restoring the
originally-used logic that allocates and sends a separate event for each
node to prune.
(cherry picked from commit 540a5b5a2c)
Reconstruct the variant of the prune_tree() parent cleaning to consider
all elibible parents in a single loop as we were doing before all the
changes that led to this commit.
Update code comments so that they more precisely describe what the
relevant bits of code actually do.
(cherry picked from commit 12c42a6c07)
The dns_cache_flush() drops the old database and creates a new one, but
it forgets to create the task(s) that runs the node pruning and cleaning
the rbtdb when flushing it next time. This causes the cleaning to skip
cleaning the parent nodes (with .down == NULL) leading to increased
memory usage over time until the database is unable to keep up and just
stays overmem all the time.
(cherry picked from commit 79040a669c)
Previously, rbtdb->task had quantum of 1 because it was originally used
just for freeing RBTDB contents, which can happen on a "best effort"
basis (does not need to be prioritized). However, when tree pruning was
implemented, it also started sending events to that task, enabling the
latter to become clogged up with a significant event backlog because it
only pruned a single RBTDB node per event.
To prioritize tree pruning (as it is necessary for enforcing the
configured memory use limit for the cache memory context), create a
second task with a virtually unlimited quantum (UINT_MAX) and send the
tree-pruning events to this new task, to ensure that all nodes scheduled
for pruning will be processed before further nodes are queued in a
similar fashion.
This change enables dropping the prunenodes list and restoring the
originally-used logic that allocates and sends a separate event for each
node to prune.
(cherry picked from commit 231b2375e5)
Reconstruct the variant of the prune_tree() parent cleaning to consider
all elibible parents in a single loop as we were doing before all the
changes that led to this commit.
Update code comments so that they more precisely describe what the
relevant bits of code actually do.
(cherry picked from commit 454c75a33a)
Commit 37101c7c8a checks the prunelink
member of the node that was just pruned, not its parent node that was
intended to be examined. Fix by checking the prunelink member of the
parent node, so that adding the latter to its relevant prunenodes list
twice is properly guarded against.
(cherry picked from commit 7d9be24bb1)
Commit 4b6fc97af6 checks the prunelink
member of the node that was just pruned, not its parent node that was
intended to be examined. Fix by checking the prunelink member of the
parent node, so that adding the latter to its relevant prunenodes list
twice is properly guarded against.
If a node cleaned up by prune_tree() happens to belong to the same node
bucket as its parent, the latter is directly appended to the prunenodes
list currently processed by prune_tree(). However, the relevant code
branch does not account for the fact that the parent might already be on
the list it is trying to append it to. Fix by only calling
ISC_LIST_APPEND() for parent nodes not yet added to their relevant
prunenodes list.
(cherry picked from commit 4b6fc97af6)
If a node cleaned up by prune_tree() happens to belong to the same node
bucket as its parent, the latter is directly appended to the prunenodes
list currently processed by prune_tree(). However, the relevant code
branch does not account for the fact that the parent might already be on
the list it is trying to append it to. Fix by only calling
ISC_LIST_APPEND() for parent nodes not yet added to their relevant
prunenodes list.
Commit 801e888d03 made the prune_tree()
function use send_to_prune_tree() for triggering pruning of deleted leaf
nodes' parents. This enabled the following sequence of events to
happen:
1. Node A, which is a leaf node, is passed to send_to_prune_tree() and
its pruning is queued.
2. Node B is added to the RBTDB as a child of node A before the latter
gets pruned.
3. Node B, which is now a leaf node itself (and is likely to belong to
a different node bucket than node A), is passed to
send_to_prune_tree() and its pruning gets queued.
4. Node B gets pruned. Its parent, node A, now becomes a leaf again
and therefore the prune_tree() call that handled node B calls
send_to_prune_tree() for node A.
5. Since node A was already queued for pruning in step 1 (but not yet
pruned), the INSIST(!ISC_LINK_LINKED(node, prunelink)); assertion
fails for node A in send_to_prune_tree().
The above sequence of events is not a sign of pathological behavior.
Replace the assertion check with a conditional early return from
send_to_prune_tree().
(cherry picked from commit f6289ad931)
Commit 2df147cb12 made the prune_tree()
function use send_to_prune_tree() for triggering pruning of deleted leaf
nodes' parents. This enabled the following sequence of events to
happen:
1. Node A, which is a leaf node, is passed to send_to_prune_tree() and
its pruning is queued.
2. Node B is added to the RBTDB as a child of node A before the latter
gets pruned.
3. Node B, which is now a leaf node itself (and is likely to belong to
a different node bucket than node A), is passed to
send_to_prune_tree() and its pruning gets queued.
4. Node B gets pruned. Its parent, node A, now becomes a leaf again
and therefore the prune_tree() call that handled node B calls
send_to_prune_tree() for node A.
5. Since node A was already queued for pruning in step 1 (but not yet
pruned), the INSIST(!ISC_LINK_LINKED(node, prunelink)); assertion
fails for node A in send_to_prune_tree().
The above sequence of events is not a sign of pathological behavior.
Replace the assertion check with a conditional early return from
send_to_prune_tree().
It was discovered that the TTL-based cleaning could build up
a significant backlog of the rdataset headers during the periods where
the top of the TTL heap isn't expired yet. Make the TTL-based cleaning
more aggressive by cleaning more headers from the heap when we are
adding new header into the RBTDB.
(cherry picked from commit d8220ca4ca)
(cherry picked from commit 496fe6bc60)
It was discovered that an expired header could sit on top of the heap
a little longer than desireable. Remove expired headers (headers with
rdh_ttl set to 0) from the heap completely, so they don't block the next
TTL-based cleaning.
(cherry picked from commit a9383e4b95)
(cherry picked from commit abe080d16e)
It was discovered that the TTL-based cleaning could build up
a significant backlog of the rdataset headers during the periods where
the top of the TTL heap isn't expired yet. Make the TTL-based cleaning
more aggressive by cleaning more headers from the heap when we are
adding new header into the RBTDB.
(cherry picked from commit d8220ca4ca)
It was discovered that an expired header could sit on top of the heap
a little longer than desireable. Remove expired headers (headers with
rdh_ttl set to 0) from the heap completely, so they don't block the next
TTL-based cleaning.
(cherry picked from commit a9383e4b95)