bind9

mirror of https://github.com/isc-projects/bind9.git synced 2026-06-04 15:52:03 -04:00

Author	SHA1	Message	Date
Ondřej Surý	5f15df5c53	Fix memory leak in QPcache addnoqname/addclosest mechanism The attacker that controls DNSSEC-signed zone can trigger a memory leak in the addnoqname() and/or addclosest() by creating more than max-records-per-type RRSIG for any NSEC records. The memory leaks have been fixed. (cherry picked from commit `a854a5c83d`)	2026-03-13 13:22:23 +01:00
Matthijs Mekking	63262fd0f4	Implement dns_dbiterator_seek3 This is a new seek function for dbiterator that is meant to find an NSEC3 node in a zone database. The difference with dns_dbiterator_seek is that if the node does not exist, this seek function will point the iterator to the next NSEC3 name. (cherry picked from commit `41159e9062`)	2025-12-11 13:53:25 +01:00
Mark Andrews	b677d31fca	In dbiterator_prev, dereference_iter_node was being called too soon dns_rbtnodechain_prev requires the current node to still be valid which was not always the case after dereference_iter_node was called. Move the call to dereference_iter_node to after the dns_rbtnodechain_prev to preserve the node.	2025-12-08 10:25:17 +01:00
Evan Hunt	25c9fb54da	standardize CHECK and RETERR macros previously, there were over 40 separate definitions of CHECK macros, of which most used "goto cleanup", and the rest "goto failure" or "goto out". there were another 10 definitions of RETERR, of which most were identical to CHECK, but some simply returned a result code instead of jumping to a cleanup label. this has now been standardized throughout the code base: RETERR is for returning an error code in the case of an error, and CHECK is for jumping to a cleanup tag, which is now always called "cleanup". both macros are defined in isc/util.h. (cherry picked from commit `52bba5cc34`)	2025-12-03 19:17:20 -08:00
Mark Andrews	f8cafb9756	Fix missing RRSIGs for "glue" lookups with CD=1 The code to test whether to store the RRSIGs on DNS_R_UNCHANGED with CD=1 was failing because the comparison methods of the two rdataset instances were not compatible. Move the testing into dns_db_addrdataset(), and request it by setting the DNS_ADD_EQUALOK option. If the option is set and the old and new rrsets compare as equal, dns_db_addrdataset() returns ISC_R_SUCCESS instead of DNS_R_UNCHANGED. (cherry picked from commit `b954a1df43`)	2025-09-10 17:08:52 +10:00
Ondřej Surý	08328a9cce	Don't preserve cache entries if new TTL is smaller than existing Under certain circumstances, cache entries with equivalent rdataset might not get replaced. Previously such entry would get preserved regardless of the new TTL and expire time on the existing header would get updated when the expire time was less than the expire time on the existing header. Change the logic to preserve the existing header only if the new expire time is larger than the existing one and replace the existing cache entry when the new expire time is less than the existing one. Co-authored-by: Jinmei Tatuya <jtatuya@infoblox.com> (cherry picked from commit `9f7ba584cf`)	2025-08-26 21:13:25 +02:00
Ondřej Surý	06e3d996c1	Preserve ZEROTTL attribute when replacing NS RRset Previously, BIND 9 would drop the ZEROTTL attribute when updating previously cached NS entry with ZEROTTL attribute set. Co-authored-by: Jinmei Tatuya <jtatuya@infoblox.com> (cherry picked from commit `982ca161c2`)	2025-08-26 21:12:21 +02:00
Mark Andrews	ae3e67717c	Fix "CNAME and other data" detection prio_type was being used in the wrong place to optimize cname_and_other. We have to first exclude any accepted types and we also have to determine that the record exists before we can check if we are at a point where a later CNAME cannot appear. (cherry picked from commit `5e49a9e4ae`)	2025-02-14 13:41:11 +11:00
Ondřej Surý	8229d9cdfa	Print the expiration time of the stale records (not ancient) In #1870, the expiration time of ANCIENT records was printed, but actually the ancient records are very short lived, and the information carries little value. Instead of printing the expiration of ANCIENT records, print the expiration time of STALE records. (cherry picked from commit `355fc48472`)	2025-02-04 18:07:59 +01:00
Ondřej Surý	302aca809d	Expand the usage of mark_ancient() helper functions When the mark_ancient() helper function was introduced, a couple of places with duplicate (or almost duplicate) code were missed. Move the mark_ancient() function closer to the top of the file, and correctly use it in places that mark the header as ANCIENT. (cherry picked from commit `58179e6a19`)	2025-02-03 15:53:34 +01:00
Ondřej Surý	4b114838de	Add better ZEROTTL handling in bindrdataset() If we know that the header has ZEROTTL set, the server should never send stale records for it and the TTL should never be anything else than 0. The comment was already there, but the code was not matching the comment. (cherry picked from commit `cfee6aa565`)	2025-02-03 15:53:34 +01:00
Ondřej Surý	b32512a232	In cache, set rdataset TTL to 0 when the header is not active When the header has been marked as ANCIENT, but the ttl hasn't been reset (this happens in a couple of places), the rdataset TTL would be set to the header timestamp instead of a reasonable TTL value. Since this header has been already expired (ANCIENT is set), set the rdataset TTL to 0 and don't reuse this field to print the expiration time when dumping the cache. Instead of printing the time, we now just print 'expired (awaiting cleanup)'. (cherry picked from commit `1bbb57f81b`)	2025-02-03 15:53:34 +01:00
Ondřej Surý	857225aeb6	Clarify reference counting in RBTDB database Change the names of the node reference counting functions and add comments to make the mechanism easier to understand: - dns__rbtdb_newref() and dns__rbtdb_decref() are now called dns__rbtnode_acquire() and dns__rbtnode_release() respectively; this reflects the fact that they modify both the internal and external reference counters for a node. - rbtnode_newref() and rbtnode_decref are now called rbtnode_erefs_increment() and rbtnode_erefs_decrement(), to reflect that they only increase and decrease the node's external reference counters, not internal.	2025-01-31 06:07:48 +01:00
Ondřej Surý	9c45de9473	Refactor node reference counting in rbtdb.c Refactor the newref() and decref() functions in rbtdb.c so that they follow the same pattern we already have for QPDB.	2025-01-31 05:52:13 +01:00
JINMEI Tatuya	da0453b1d5	Optimize database decref by avoiding locking with refs > 1 Previously, this function always acquired a node write lock in case it needed to clean up the node when the reference count decremented to 0. In fact, the lock is unnecessary if the reference count is larger than 1, and that can be optimized as an "easy" case. This optimization could even be "necessary". In some extreme cases, many worker threads could repeatedly acquire and release the reference on the same node, resulting in severe lock contention for nothing (as the ref wouldn't decrement to 0 in most cases). This change would prevent a noticeable performance drop, such as query timeouts, in such cases. Co-authored-by: JINMEI Tatuya <jtatuya@infoblox.com> Co-authored-by: Ondřej Surý <ondrej@isc.org> (cherry picked from commit `7f4471594d`)	2025-01-22 14:29:30 +01:00
Ondřej Surý	547f376f21	Rewrite the GLUE cache in QP zone database This is a second attempt to rewrite the GLUE cache to not use per database version hash table. Instead of keeping a hash table indexed by the node, use a directly linked list of GLUE records for each slabheader. This was attempted before, but there was a data race caused by the fact that the thread cleaning the GLUE records could be slower than accessing the slab headers again and reinitializing the wait-free stack. The improved design builds on the previous design, but adds a new dns_gluelist structure that has a pointer to the database version. If a dns_gluelist belonging to a different (old) version is detected, it is just detached from the slabheader and left for the closeversion() to clean it up later. (cherry picked from commit `29bde687b5`)	2025-01-06 14:00:47 +01:00
Ondřej Surý	ad952ffee6	Revert "Fix the glue table in the QP and RBT zone databases" This reverts commit `46cfebac58`. (cherry picked from commit `759d59801b`)	2025-01-06 14:00:43 +01:00
Ondřej Surý	db5803a0ec	Use attach()/detach() functions instead of touching .references In rbtdb.c, there were places where the code touched .references directly instead of using the helper functions. Use the helper functions instead.	2024-11-27 21:16:22 +01:00
JINMEI Tatuya	08122316a7	emit more helpful log for exceeding max-records-per-type The new log message is emitted when adding or updating an RRset fails due to exceeding the max-records-per-type limit. The log includes the owner name and type, corresponding zone name, and the limit value. It will be emitted on loading a zone file, inbound zone transfer (both AXFR and IXFR), handling a DDNS update, or updating a cache DB. It's especially helpful in the case of zone transfer, since the secondary side doesn't have direct access to the offending zone data. It could also be used for max-types-per-name, but this change doesn't implement it yet as it's much less likely to happen in practice. (cherry picked from commit `4156995431`)	2024-11-27 11:17:34 +11:00
Ondřej Surý	58a15d38c2	Remove redundant parentheses from the return statement (cherry picked from commit `0258850f20`)	2024-11-19 14:26:52 +01:00
Ondřej Surý	46cfebac58	Fix the glue table in the QP and RBT zone databases When adding glue to the header, we add header to the wait-free stack to be cleaned up later which sets wfc_node->next to non-NULL value. When the actual cleaning happens we would only cleanup the .glue_list, but since the database isn't locked for the time being, the headers could be reused while cleaning the existing glue entries, which creates a data race between database versions. Revert the code back to use per-database-version hashtable where keys are the node pointers. This allows each database version to have independent glue cache table that doesn't affect nodes or headers that could already "belong" to the future database version. (cherry picked from commit `5beae5faf9`)	2024-08-05 14:43:18 +00:00
Ondřej Surý	57cd34441a	Be smarter about refusing to add many RR types to the database Instead of outright refusing to add new RR types to the cache, be a bit smarter: 1. If the new header type is in our priority list, we always add either positive or negative entry at the beginning of the list. 2. If the new header type is negative entry, and we are over the limit, we mark it as ancient immediately, so it gets evicted from the cache as soon as possible. 3. Otherwise add the new header after the priority headers (or at the head of the list). 4. If we are over the limit, evict the last entry on the normal header list.	2024-07-01 12:48:51 +02:00
Ondřej Surý	b27c6bcce8	Expand the list of the priority types and move it to db_p.h Add HTTPS, SVCB, SRV, PTR, NAPTR, DNSKEY and TXT records to the list of the priority types that are put at the beginning of the slabheader list for faster access and to avoid eviction when there are more types than the max-types-per-name limit.	2024-07-01 12:47:30 +02:00
Ondřej Surý	52b3d86ef0	Add a limit to the number of RR types for single name Previously, the number of RR types for a single owner name was limited only by the maximum number of the types (64k). As the data structure that holds the RR types for the database node is just a linked list, and there are places where we just walk through the whole list (again and again), adding a large number of RR types for a single owner name would slow down processing of such name (database node). Add a configurable limit to cap the number of the RR types for a single owner. This is enforced at the database (rbtdb, qpzone, qpcache) level and configured with new max-types-per-name configuration option that can be configured globally, per-view and per-zone.	2024-06-10 16:55:09 +02:00
Ondřej Surý	32af7299eb	Add a limit to the number of RRs in RRSets Previously, the number of RRs in the RRSets were internally unlimited. As the data structure that holds the RRs is just a linked list, and there are places where we just walk through all of the RRs, adding an RRSet with huge number of RRs inside would slow down processing of said RRSets. Add a configurable limit to cap the number of the RRs in a single RRSet. This is enforced at the database (rbtdb, qpzone, qpcache) level and configured with new max-records-per-type configuration option that can be configured globally, per-view and per-zone.	2024-06-10 16:55:07 +02:00
Evan Hunt	2c88946590	dns_name_dupwithoffsets() cannot fail this function now always returns success; change it to void and clean up its callers.	2024-04-10 22:51:07 -04:00
Evan Hunt	b3c8b5cfb2	remove dead code in rbtdb.c dns_db_addrdataset() enforces a requirement that version can only be NULL for a cache database. code that checks for zone semantics and version == NULL can never be reached.	2024-03-13 17:15:18 -07:00
Ondřej Surý	454c75a33a	Restore the parent cleaning logic in prune_tree() Reconstruct the variant of the prune_tree() parent cleaning to consider all eligible parents in a single loop as we were doing before all the changes that led to this commit. Update code comments so that they more precisely describe what the relevant bits of code actually do.	2024-03-06 13:03:17 +01:00
Evan Hunt	845f832308	rename dns_rbtdb to dns_qpdb this commit renames all variables and macros with the string "rbtdb" or "RBTDB" to "qpdb" or "QPDB".	2024-03-06 09:57:24 +01:00
Ondřej Surý	d8220ca4ca	Make the TTL-based cleaning more aggressive It was discovered that the TTL-based cleaning could build up a significant backlog of the rdataset headers during the periods where the top of the TTL heap isn't expired yet. Make the TTL-based cleaning more aggressive by cleaning more headers from the heap when we are adding new header into the RBTDB.	2024-02-29 12:57:06 +01:00
Ondřej Surý	a9383e4b95	Remove expired rdataset headers from the heap It was discovered that an expired header could sit on top of the heap a little longer than desirable. Remove expired headers (headers with rdh_ttl set to 0) from the heap completely, so they don't block the next TTL-based cleaning.	2024-02-29 12:56:36 +01:00
Ondřej Surý	0b32d323e0	Simplify the parent cleaning in the prune_tree() mechanism Instead of juggling with node locks in a cycle, clean up the node we are just pruning and send any parent that's also subject to pruning to the prune tree in the normal way (e.g. enqueue pruning on the parent). This simplifies the code and also spreads the pruning load across more event loop ticks which is better for lock contention as less things run in a tight loop.	2024-02-29 11:23:03 +01:00
Ondřej Surý	eed17611d8	Reduce lock contention during RBTDB tree pruning The log message for commit `24381cc36d` explained: In some older BIND 9 branches, the extra queuing overhead eliminated by this change could be remotely exploited to cause excessive memory use. Due to architectural shift, this branch is not vulnerable to that issue, but applying the fix to the latter is nevertheless deemed prudent for consistency and to make the code future-proof. However, it turned out that having a single queue for the nodes to be pruned increased lock contention to a level where cleaning up nodes from the RBTDB took too long, causing the amount of memory used by the cache to grow indefinitely over time. This commit reverts the change to the pruning mechanism introduced by commit `24381cc36d` as BIND branches newer than 9.16 were not affected by the excessive event queueing overhead issue mentioned in the log message for the above commit.	2024-02-29 11:23:03 +01:00
Evan Hunt	e40fd4ed06	fix several bugs in the RBTDB dbiterator implementation - the DNS_DB_NSEC3ONLY and DNS_DB_NONSEC3 flags are mutually exclusive; it never made sense to set both at the same time. to enforce this, it is now a fatal error to do so. the dbiterator implementation has been cleaned up to remove code that treated the two as independent: if nonsec3 is true, we can be certain nsec3only is false, and vice versa. - previously, iterating a database backwards omitted NSEC3 records even if DNS_DB_NONSEC3 had not been set. this has been corrected. - when an iterator reaches the origin node of the NSEC3 tree, we need to skip over it and go to the next node in the sequence. the NSEC3 origin node is there for housekeeping purposes and never contains data. - the dbiterator_test unit test has been expanded, several incorrect expectations have been fixed. (for example, the expected number of iterations has been reduced by one; we were previously counting the NSEC3 origin node and we should not have been doing so.)	2024-02-15 10:15:50 -08:00
Michał Kępień	8610799317	Merge tag 'v9.19.21' BIND 9.19.21	2024-02-14 13:24:56 +01:00
Evan Hunt	ac9bd03a0d	clean up dns_rbt - create_node() in rbt.c cannot fail - the dns_rbt_*name() functions, which are wrappers around dns_rbt_[add\|find\|delete]node(), were never used except in tests. this change isn't really necessary since RBT is likely to go away eventually anyway. but keeping the API as simple as possible while it persists is a good thing, and may reduce confusion while QPDB is being developed from RBTDB code.	2024-02-14 01:36:44 -08:00
Evan Hunt	78d173b548	move DNS_RBT_NSEC_* to db.h these values pertain to whether a node is in the main, nsec, or nsec3 tree of an RBTDB. they need to be moved to a more generic location so they can also be used by QPDB. (this is in db.h rather than db_p.h because rbt.c needs access to it. technically, that's a layer violation, but it's a long-existing one; refactoring to get rid of it would be a large hassle, and eventually we expect to remove rbt.c anyway.)	2024-02-14 01:13:44 -08:00
Evan Hunt	27c862d953	separate generic DB helpers into db_p.h when the QPDB is implemented, we will need to have both qpdb_p.h and rbtdb_p.h. in order to prevent name collisions or code duplication, this commit adds a generic private header file, db_p.h, containing structures and macros that will be used by both databases. some functions and structs have been renamed to more specifically refer to the RBT database, in order to avoid namespace collision with similar things that will be needed by the QPDB later.	2024-02-14 09:00:27 +01:00
Ondřej Surý	3f774c2a8a	Optimize cname_and_other_data to stop as early as possible Stop the cname_and_other_data processing if we already know that the result is true. Also, we know that CNAME will be placed in the priority headers, so we can stop looking for CNAME if we haven't found CNAME and we are past the priority headers.	2024-02-08 08:33:36 +01:00
Ondřej Surý	3ac482be7f	Optimize the slabheader placement for certain RRTypes Mark the infrastructure RRTypes as "priority" types and place them at the beginning of the rdataslab header data graph. The non-priority types go right after the priority types (if any).	2024-02-08 08:33:36 +01:00
Michał Kępień	24381cc36d	Limit isc_async_run() overhead for tree pruning Instead of issuing a separate isc_async_run() call for every RBTDB node that triggers tree pruning, maintain a list of nodes from which tree pruning can be started and only issue an isc_async_run() call if pruning has not yet been triggered by another RBTDB node. In some older BIND 9 branches, the extra queuing overhead eliminated by this change could be remotely exploited to cause excessive memory use. Due to architectural shift, this branch is not vulnerable to that issue, but applying the fix to the latter is nevertheless deemed prudent for consistency and to make the code future-proof.	2024-01-05 12:33:14 +01:00
Mark Andrews	5e8f0e9ceb	Process the combined LRU lists in LRU order Only clean up headers that are less than or equal to the rbt's last_used time. Adjust the rbt's last_used time when the target cleaning was not achieved to the oldest value of the remaining set of headers. When updating delegating NS and glue records last_used was not being updated when it should have been. When adding zero TTL records to the tail of the LRU lists set last_used to rbtdb->last_used + 1 rather than now. This approximately preserves the lists' LRU order.	2023-12-07 02:59:04 +00:00
Ondřej Surý	89fcb6f897	Apply the isc_mem_cget semantic patch	2023-08-31 22:08:35 +02:00
Ondřej Surý	8c4cf5b1de	fixup! Use cds_lfht for updatenotify mechanism in dns_db unit	2023-07-31 18:11:34 +02:00
Ondřej Surý	a1afa31a5a	Use cds_lfht for updatenotify mechanism in dns_db unit The updatenotify mechanism in dns_db relied on unlocked ISC_LIST for adding and removing the "listeners". The mechanism relied on the exclusive mode - it should have been updated only during reconfiguration of the server. This turned out not to be true anymore in the dns_catz - the updatenotify list could have been updated during offloaded work as the offloaded threads are not subject to the exclusive mode. Change the update_listeners to be cds_lfht (lock-free hash-table), and slightly refactor how the callbacks are registered and unregistered - the calls are now idempotent (the register call already was and the return value of the unregister function was mostly ignored by the callers).	2023-07-31 18:11:34 +02:00
Ondřej Surý	b6b0d81a36	Cleanup the __tsan_acquire/__tsan_release With ThreadSanitizer support added to the Userspace RCU, we no longer need to wrap the call_rcu and caa_container_of with __tsan_{acquire,release} hints. Remove the direct calls to __tsan_{acquire,release} and the isc_urcu_{container,cleanup} macros.	2023-07-28 08:59:08 +02:00
Ondřej Surý	5321c474ea	Refactor isc_stats_create() and its downstream users to return void The isc_stats_create() can no longer return anything else than ISC_R_SUCCESS. Refactor isc_stats_create() and its variants in libdns, libns and named to just return void.	2023-07-27 11:37:44 +02:00
Evan Hunt	5a85135c1e	split out cache-specific functions move cache-specific functions from rbtdb.c to rbt-cachedb.c.	2023-07-17 14:50:25 +02:00
Evan Hunt	9a1a1293c0	split out zone-specific functions move zone-specific functions from rbtdb.c to rbt-zonedb.c.	2023-07-17 14:50:25 +02:00
Evan Hunt	445ef1d033	move slab rdataset implementation to rdataslab.c ultimately we want the slab implementation of dns_rdataset to be usable by more database implementations than just rbtdb. this commit moves rdataset_methods to rdataslab.c, renamed dns_rdataslab_rdatasetmethods. new database methods have been added: locknode, unlocknode, addglue, expiredata, and deletedata, allowing external functions to perform functions that previously required internal access to the database implementation. database and heap pointers are now stored in the dns_slabheader object so that header is the only thing that needs to be passed to some functions; this will simplify moving functions that process slabheaders out of rbtdb.c so they can be used by other database implementations.	2023-07-17 14:50:25 +02:00
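
The CHECK/RETERR convention described in commit `25c9fb54da` above can be illustrated with a minimal, self-contained sketch. It is not the code from isc/util.h: the real macros work on isc_result_t and ISC_R_SUCCESS and may differ in detail. Every name below (`result_t`, `R_SUCCESS`, `step()`, `validate()`, `process()`, `live_buffers`) is a hypothetical stand-in chosen for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for isc_result_t and its result codes. */
typedef enum { R_SUCCESS = 0, R_NOMEMORY = 1, R_FAILURE = 2 } result_t;

/* RETERR: on error, return the result code straight away (nothing to clean up). */
#define RETERR(expr) \
	do { \
		result_t _r = (expr); \
		if (_r != R_SUCCESS) { \
			return _r; \
		} \
	} while (0)

/* CHECK: on error, store the code in 'result' and jump to the 'cleanup' label. */
#define CHECK(expr) \
	do { \
		result = (expr); \
		if (result != R_SUCCESS) { \
			goto cleanup; \
		} \
	} while (0)

/* Counts buffers that are currently allocated, so leaks are observable. */
static int live_buffers = 0;

/* Hypothetical operation that either succeeds or fails. */
static result_t step(int ok) {
	return ok ? R_SUCCESS : R_FAILURE;
}

/* Holds no resources, so RETERR is enough. */
static result_t validate(int ok) {
	RETERR(step(ok));
	return R_SUCCESS;
}

/* Owns a buffer, so every failure path must pass through 'cleanup'. */
static result_t process(int ok_first, int ok_second) {
	result_t result = R_SUCCESS;
	char *buf = malloc(64);

	if (buf == NULL) {
		return R_NOMEMORY;
	}
	live_buffers++;

	CHECK(step(ok_first));
	CHECK(step(ok_second));

cleanup:
	assert(live_buffers > 0);
	free(buf);
	live_buffers--;
	return result;
}
```

In this sketch, `validate()` can use RETERR because it has nothing to release, while `process()` uses CHECK so that the buffer is freed on both the success path and every error path through the single `cleanup` label.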


716 commits