bind9

mirror of https://github.com/isc-projects/bind9.git synced 2026-06-04 15:32:04 -04:00

Author	SHA1	Message	Date
Evan Hunt	8e31dc5353	Fix a stack use-after-free in qpzone In previous_closest_nsec(), a new qpreader was opened to search the NSEC tree. It was possible for that to be used to update a QP iterator object owned by the caller, and then be destroyed when the function returned. This has been addressed by having the caller open the NSEC qpreader instead.	2026-05-05 16:17:25 -07:00
Alessio Podda	04c52148bb	Do not update the case on unchanged rdatasets Fix an assertion failure on unchanged rdataset during IXFR.	2026-02-24 13:04:19 +01:00
Alessio Podda	97f2816947	Fix formatting Cleanup formatting after IXFR changes. (cherry picked from commit `ad0a382092`)	2026-02-02 10:32:38 +01:00
Alessio Podda	0a5e27deef	Implement qpzone specific update path This commit implements a batch update function for qpzone. The main reason for this is speed: using addrdataset would cause a qp transaction per rrdataset added, leading to a substantial slowdown compared to RBTDB. The new API results in a qp transaction per applied diff. (cherry picked from commit `da53708dcb`)	2026-02-02 10:32:38 +01:00
Alessio Podda	1d4ad50e81	Abstract updates into a vtable This commit adds a layer of indirection to the apply_diff logic used by IXFR and resigning by having the database updates go through a vtable. We do this in three steps: - We extend dns_rdatacallbacks_t vtable to allow subtraction and resigning. - We add a new set of api (begin\|commit\|abort)update to the dbmethods vtable, that model an incremental update that can be aborted. - We extract the core logic of diff_apply into a function that satisfies the new interface. - We make diff_apply use this new function, and log the results. The intent of this commit is to allow databases to expose a batch incremental update implementation, just like they expose a custom batch creation implementation through (begin\|end)load. (cherry picked from commit `e36dc0ca76`)	2026-01-29 09:13:02 +01:00
Matthijs Mekking	63262fd0f4	Implement dns_dbiterator_seek3 This is a new seek function for dbiterator that is meant to find an NSEC3 node in a zone database. The difference with dns_dbiterator_seek is that if the node does not exist, this seek function will point the iterator to the next NSEC3 name. (cherry picked from commit `41159e9062`)	2025-12-11 13:53:25 +01:00
Ondřej Surý	89478d95c3	In dns_qpiter_{prev,next}, defer dereference_iter_node call dns_qpiter_{prev,next} requires the current iterator node to still be valid, which might not always be the case after dereference_iter_node was called. Currently, this is ensured via the closeversion() mechanism, but it is not guaranteed to be true in the future. Move the call to dereference_iter_node to after the dns_qpiter_prev() and dns_qpiter_next() to prevent a possible use-after-free of the current iterator node. (cherry picked from commit `9914bd383e`)	2025-12-08 10:25:05 +01:00
Evan Hunt	25c9fb54da	standardize CHECK and RETERR macros previously, there were over 40 separate definitions of CHECK macros, of which most used "goto cleanup", and the rest "goto failure" or "goto out". there were another 10 definitions of RETERR, of which most were identical to CHECK, but some simply returned a result code instead of jumping to a cleanup label. this has now been standardized throughout the code base: RETERR is for returning an error code in the case of an error, and CHECK is for jumping to a cleanup tag, which is now always called "cleanup". both macros are defined in isc/util.h. (cherry picked from commit `52bba5cc34`)	2025-12-03 19:17:20 -08:00
Michał Kępień	d0e0706797	Revert "qpzone find() function could set foundname incorrectly" This reverts commit `dd1050e938`.	2025-05-06 09:14:18 +02:00
Evan Hunt	dd1050e938	qpzone find() function could set foundname incorrectly when a requested name is found in the QP trie during a lookup, but its records have been marked as nonexistent by a previous deletion, then it's treated as a partial match, but the foundname could be left pointing to the original qname rather than the parent. this could lead to an assertion failure in query_findclosestnsec3().	2025-03-17 09:27:09 +00:00
Evan Hunt	bfa5dd8991	qpzone.c:step() could ignore rollbacks the step() function (used for stepping to the predecessor or successor of a database node) could overlook a node because there was an rdataset marked IGNORE because it had been rolled back, covering an active rdataset under it. (cherry picked from commit `24eaff7adc`)	2025-03-14 23:22:59 +00:00
Ondřej Surý	614f8c1ef1	Acquire the database reference before possibly last node release Acquire the database reference in detachnode() to prevent the last reference from being released while the NODE_LOCK is locked. The NODE_LOCK is locked/unlocked inside the RCU critical section, so this most probably should not pose a problem, as the database uses call_rcu memory reclamation, but it is still safer to acquire the reference before releasing the node. (cherry picked from commit `d1ef6a93c1`)	2025-03-06 10:39:17 +00:00
Ondřej Surý	ee6e64df21	Revert "fix: dev: Delete dead nodes when committing a new version" This reverts commit `67255da4b3`, reversing changes made to `74c9ff384e`. (cherry picked from commit `1e4695510a`)	2025-03-05 17:28:44 +00:00
Evan Hunt	e35e701c2c	when committing a new qpzone version, delete dead nodes if all data has been deleted from a node in the qpzone database, delete the node too. (cherry picked from commit `e58ce19cf2`)	2025-02-18 22:55:20 +00:00
Evan Hunt	a21168a221	fix dns_qp_insert() checks in qpzone in some places there were checks for failures of dns_qp_insert() after dns_qp_getname(). such failures could only happen if another thread inserted a node between the two calls, and that can't happen because the calls are serialized with dns_qpmulti_write(). we can simplify the code and just add an INSIST. (cherry picked from commit `fffa150df3`)	2025-02-18 05:55:02 +00:00
Mark Andrews	ae3e67717c	Fix "CNAME and other data" detection prio_type was being used in the wrong place to optimize cname_and_other. We have to first exclude and accepted types and we also have to determine that the record exists before we can check if we are at a point where a later CNAME cannot appear. (cherry picked from commit `5e49a9e4ae`)	2025-02-14 13:41:11 +11:00
Ondřej Surý	db2bce1c6f	Switch the locknum generation for qpznode to random Instead of using a hash of the name modulo the number of buckets, assign the locknum randomly with isc_random_uniform(). This makes the locknum assignment aligned with qpcache and allows the bucket number to be non-prime in the future. (cherry picked from commit `732fc338a9`)	2025-02-04 23:28:53 +01:00
Ondřej Surý	d4e8a92977	Rely on call_rcu() to destroy the qpzone outside of locks Reduce the number of qpzone_ref() and qpzone_unref() calls in qpzone_detachnode() by relying on the call_rcu to delay the destruction of the lock buckets. (cherry picked from commit `1fa5219fdf`)	2025-02-04 23:28:53 +01:00
Ondřej Surý	c6c03a6b11	Reduce false sharing in dns_qpzone Instead of having many node_lock_count * sizeof(<member>) arrays, pack all the members into a qpzone_bucket_t that is cacheline aligned and have a single array of those. (cherry picked from commit `6dcc398726`)	2025-02-04 23:28:50 +01:00
Evan Hunt	5300eebc9e	Clarify reference counting in QP databases Change the names of the node reference counting functions and add comments to make the mechanism easier to understand: - newref() and decref() are now called qpcnode_acquire()/ qpznode_acquire() and qpcnode_release()/qpznode_release() respectively; this reflects the fact that they modify both the internal and external reference counters for a node. - qpcnode_newref() and qpznode_newref() are now called qpcnode_erefs_increment() and qpznode_erefs_increment(), and qpcnode_decref() and qpznode_decref() are now called qpcnode_erefs_decrement() and qpznode_erefs_decrement(), to reflect that they only increase and decrease the node's external reference counters, not internal. (cherry picked from commit `d4f791793e`)	2025-01-31 05:52:13 +01:00
Ondřej Surý	7dab6cdfbc	Remove db_nodelock_t in favor of reference counted qpdb This removes the db_nodelock_t structure and changes the node_locks array to be composed only of isc_rwlock_t pointers. The .reference member has been moved to qpdb->references in addition to common.references that's external to dns_db API users. The .exiting members has been completely removed as it has no use when the reference counting is used correctly. (cherry picked from commit `431513d8b3`)	2025-01-31 05:49:36 +01:00
Ondřej Surý	d1d444d2ab	Refactor decref() in both qpcache.c and qpzone.c Cleanup the pattern in the decref() functions in both qpcache.c and qpzone.c, so it follows a similar pattern to the one we already have in the newref() function. (cherry picked from commit `814b87da64`)	2025-01-31 05:49:12 +01:00
JINMEI Tatuya	da0453b1d5	Optimize database decref by avoiding locking with refs > 1 Previously, this function always acquires a node write lock if it might need node cleanup in case the reference decrements to 0. In fact, the lock is unnecessary if the reference is larger than 1 and it can be optimized as an "easy" case. This optimization could even be "necessary". In some extreme cases, many worker threads could repeat acquiring and releasing the reference on the same node, resulting in severe lock contention for nothing (as the ref wouldn't decrement to 0 in most cases). This change would prevent noticeable performance drop like query timeout for such cases. Co-authored-by: JINMEI Tatuya <jtatuya@infoblox.com> Co-authored-by: Ondřej Surý <ondrej@isc.org> (cherry picked from commit `7f4471594d`)	2025-01-22 14:29:30 +01:00
Ondřej Surý	547f376f21	Rewrite the GLUE cache in QP zone database This is a second attempt to rewrite the GLUE cache to not use per database version hash table. Instead of keeping a hash table indexed by the node, use a directly linked list of GLUE records for each slabheader. This was attempted before, but there was a data race caused by the fact that the thread cleaning the GLUE records could be slower than accessing the slab headers again and reinitializing the wait-free stack. The improved design builds on the previous design, but adds a new dns_gluelist structure that has a pointer to the database version. If a dns_gluelist belonging to a different (old) version is detected, it is just detached from the slabheader and left for the closeversion() to clean it up later. (cherry picked from commit `29bde687b5`)	2025-01-06 14:00:47 +01:00
Ondřej Surý	ad952ffee6	Revert "Fix the glue table in the QP and RBT zone databases" This reverts commit `46cfebac58`. (cherry picked from commit `759d59801b`)	2025-01-06 14:00:43 +01:00
Alessio Podda	1edf405add	Optimize memory layout of core structs Reduce memory footprint by: - Reordering struct fields to minimize padding. - Using exact-sized atomic types instead of _least/_fast variants - Downsizing integer fields where possible Affected structs: - dns_name_t - dns_slabheader_t - dns_rdata_t - qpcnode_t - qpznode_t (cherry picked from commit `32c7060bd2`)	2024-12-09 09:04:28 +01:00
JINMEI Tatuya	08122316a7	emit more helpful log for exceeding max-records-per-type The new log message is emitted when adding or updating an RRset fails due to exceeding the max-records-per-type limit. The log includes the owner name and type, corresponding zone name, and the limit value. It will be emitted on loading a zone file, inbound zone transfer (both AXFR and IXFR), handling a DDNS update, or updating a cache DB. It's especially helpful in the case of zone transfer, since the secondary side doesn't have direct access to the offending zone data. It could also be used for max-types-per-name, but this change doesn't implement it yet as it's much less likely to happen in practice. (cherry picked from commit `4156995431`)	2024-11-27 11:17:34 +11:00
Ondřej Surý	58a15d38c2	Remove redundant parentheses from the return statement (cherry picked from commit `0258850f20`)	2024-11-19 14:26:52 +01:00
Matthijs Mekking	d768dd1f5d	Revert "fix: chg: Improve performance when looking for the closest encloser when returning NSEC3 proofs" This reverts merge request !9436 (cherry picked from commit `0396bf98ee`)	2024-10-10 09:29:52 +00:00
Mark Andrews	b30bff7dee	Return partial match when requested Return partial match from dns_db_find/dns_db_find when requested to short circuit the closest encloser discovery process. Most of the time this will be the actual closest encloser, but it may not be when there are yet to be committed / cleaned up versions of the zone with names below the actual closest encloser. (cherry picked from commit `d42ea08f16`)	2024-08-29 21:40:16 +00:00
Ondřej Surý	46cfebac58	Fix the glue table in the QP and RBT zone databases When adding glue to the header, we add header to the wait-free stack to be cleaned up later which sets wfc_node->next to non-NULL value. When the actual cleaning happens we would only cleanup the .glue_list, but since the database isn't locked for the time being, the headers could be reused while cleaning the existing glue entries, which creates a data race between database versions. Revert the code back to use per-database-version hashtable where keys are the node pointers. This allows each database version to have independent glue cache table that doesn't affect nodes or headers that could already "belong" to the future database version. (cherry picked from commit `5beae5faf9`)	2024-08-05 14:43:18 +00:00
Ondřej Surý	b27c6bcce8	Expand the list of the priority types and move it to db_p.h Add HTTPS, SVCB, SRV, PTR, NAPTR, DNSKEY and TXT records to the list of the priority types that are put at the beginning of the slabheader list for faster access and to avoid eviction when there are more types than the max-types-per-name limit.	2024-07-01 12:47:30 +02:00
Ondřej Surý	52b3d86ef0	Add a limit to the number of RR types for single name Previously, the number of RR types for a single owner name was limited only by the maximum number of the types (64k). As the data structure that holds the RR types for the database node is just a linked list, and there are places where we just walk through the whole list (again and again), adding a large number of RR types for a single owner name would slow down processing of such name (database node). Add a configurable limit to cap the number of the RR types for a single owner. This is enforced at the database (rbtdb, qpzone, qpcache) level and configured with new max-types-per-name configuration option that can be configured globally, per-view and per-zone.	2024-06-10 16:55:09 +02:00
Ondřej Surý	32af7299eb	Add a limit to the number of RRs in RRSets Previously, the number of RRs in the RRSets were internally unlimited. As the data structure that holds the RRs is just a linked list, and there are places where we just walk through all of the RRs, adding an RRSet with huge number of RRs inside would slow down processing of said RRSets. Add a configurable limit to cap the number of the RRs in a single RRSet. This is enforced at the database (rbtdb, qpzone, qpcache) level and configured with new max-records-per-type configuration option that can be configured globally, per-view and per-zone.	2024-06-10 16:55:07 +02:00
Evan Hunt	9c882f1e69	replace qpzone node attributes with atomics there were TSAN error reports because of conflicting uses of node->dirty and node->nsec, which were in the same qword. this could be resolved by separating them, but we could also make them into atomic values and remove some node locking.	2024-05-17 00:33:35 +00:00
Evan Hunt	4b02246130	fix more ambiguous struct names there were some structure names used in qpcache.c and qpzone.c that were too similar to each other and could be confusing when debugging. they have been changed as follows: in qpcache.c: - changed_t was unused, and has been removed - search_t -> qpc_search_t - qpdb_rdatasetiter_t -> qpc_rditer_t - qpdb_dbiterator_t -> qpc_dbiter_t in qpzone.c: - qpdb_changed_t -> qpz_changed_t - qpdb_changedlist_t -> qpz_changedlist_t - qpdb_version_t -> qpz_version_t - qpdb_versionlist_t -> qpz_versionlist_t - qpdb_search_t -> qpz_search_t - qpdb_load_t -> qpz_load_t	2024-04-30 12:50:01 -07:00
Evan Hunt	2789e58473	get foundname from the node when calling dns_qp_lookup() from qpcache, instead of passing 'foundname' so that a name would be constructed from the QP key, we now just use the name field in the node data. this makes dns_qp_lookup() run faster. the same optimization has also been added to qpzone. the documentation for dns_qp_lookup() has been updated to discuss this performance consideration.	2024-04-30 12:50:01 -07:00
Evan Hunt	85ab92b6e0	more cleanups in qpcache.c - remove unneeded struct members and misleading comments. - remove unused parameters for static functions. - rename 'find_callback' to 'delegating' for consistency with qpzone; the find callback mechanism is not used in QP databases.	2024-04-30 12:42:31 -07:00
Evan Hunt	3acab71d46	rename QPDB_HEADERNODE to HEADERNODE this makes the macro consistent between qpcache.c and qpzone.c. also removed a redundant definition of HEADERNODE in qpzone.c.	2024-04-30 12:42:31 -07:00
Evan Hunt	46d40b3dca	fix structure names in qpcache.c and qpzone.c - change dns_qpdata_t to qpcnode_t (QP cache node), and dns_qpdb_t to qpcache_t, as these types are only accessed locally. - also change qpdata_t in qpzone.c to qpznode_t (QP zone node), for consistency. - make the refcount declarations for qpcnode_t and qpznode_t static, using the new ISC_REFCOUNT_STATIC macros.	2024-04-30 12:42:07 -07:00
Ondřej Surý	6c54337f52	avoid a race in the qpzone getsigningtime() implementation the previous commit introduced a possible race in getsigningtime() where the rdataset header could change between being found on the heap and being bound. getsigningtime() now looks at the first element of the heap, gathers the locknum, locks the respective lock, and retrieves the header from the heap again. If the locknum has changed, it will rinse and repeat. Theoretically, this could spin forever, but practically, it almost never will as the heap changes on the zone are very rare. we simplify matters further by changing the dns_db_getsigningtime() API call. instead of passing back a bound rdataset, we pass back the information the caller actually needed: the resigning time, owner name and type of the rdataset that was first on the heap.	2024-04-25 15:48:43 -07:00
Evan Hunt	7e6be9f1b5	simplify qpzone database by using only one heap for resigning in RBTDB, the heap was used by zone databases for resigning, and by the cache for TTL-based cache cleaning. the cache use case required very frequent updates, so there was a separate heap for each of the node lock buckets. qpzone is for zones only, so it doesn't need to support the cache use case; the heap will only be touched when the zone is updated or incrementally signed. we can simplify the code by using only a single heap.	2024-04-25 15:41:39 -07:00
Evan Hunt	ea6659a5e9	update foundname when detecting a zonecut above qname an assertion could be triggered in the QPDB cache if a DNAME was found above a queried NS, because the 'foundname' value was not correctly updated to point to the zone cut. the same mistake existed in qpzone and has been fixed there as well.	2024-04-02 10:00:03 +02:00
Mark Andrews	4d2d80f534	Remove remnants of cache support from qpzone.c These were leading to Coverity errors being reported.	2024-03-19 22:04:10 +00:00
Evan Hunt	f908d358c4	reduce memory consumption of qpzone database every node of a QP database contains a copy of the nodename, which is used as the key for the QP-trie. previously, the name was stored as a dns_fixedname object, which has room for up to 255 characters. we can reduce the space consumed by dynamically allocating a dns_name object that's just long enough for the name to be stored.	2024-03-14 10:20:52 -07:00
Matthijs Mekking	ad33a73f83	Fix Coverity CID 487882: Error handling issues The dns_qpiter_next() was called without checking the return value. If we cannot move the iterator forward, there is no use in calling the step() function. /lib/dns/qpzone.c: 2804 in activeempty() 2798 * of the name we were searching for. Step the iterator 2799 * forward, then step() will continue forward until it 2800 * finds a node with active data. If that node is a 2801 * subdomain of the one we were looking for, then we're 2802 * at an active empty nonterminal node. 2803 */ >>> CID 487882: Error handling issues (CHECKED_RETURN) >>> Calling "dns_qpiter_next" without checking return value (as is done elsewhere 26 out of 27 times). 2804 dns_qpiter_next(it, NULL, NULL, NULL); 2805 return (step(search, it, FORWARD, next) && 2806 dns_name_issubdomain(next, current)); 2807 }	2024-03-14 14:01:23 +01:00
Evan Hunt	f0b164430a	remove dead code in qpzone.c qpzone does not support cache semantics, so dns_db_addrdataset(), _deleterdataset() and _subtractrdataset() can't be run with version == NULL; there's no need to check for it. we can also clean up free_qpdb() a bit since current_version is always non-NULL.	2024-03-13 17:15:18 -07:00
Evan Hunt	ac2c454f4f	add a nodefullname implementation for the qpzone database this enables the 'dyndb' system test to use a qpzone database.	2024-03-08 15:36:56 -08:00
Evan Hunt	3512cf5654	add setup/commit functions to rdatacallbacks because dns_qpmulti_commit() can be time consuming, it's inefficient to open and commit a qpmulti transaction for each rdataset being loaded into a database. we can improve load time by opening a qpmulti transaction before adding a group of rdatasets and then committing it afterward. this commit adds 'setup' and 'commit' functions to dns_rdatacallbacks_t, which can be called before and after the loops in which 'add' is called in dns_master_load() and axfr_apply().	2024-03-08 15:36:56 -08:00
Evan Hunt	55f38e34dc	improve node reference counting QP database node data is not reference counted the same way RBT nodes were: in the RBT, node->references could be zero if the node was in the tree but was not in use by any caller, whereas in the QP trie, the database itself uses reference counting of nodes internally. this caused some subtle errors. in RBTDB, when the newref() function is called and the node reference count was zero, the node lock reference counter would also be incremented. in the QP trie, this can never happen - because as long as the node is in the database its reference count cannot be zero - and so the node lock reference counter was never incremented. this has been addressed by maintaining a separate "erefs" counter for external references to the node. this is the same approach used in the "qpdb-lite" database in commit `e91fbd8dea`. while troubleshooting this issue, some compile errors were discovered when building with DNS_DB_NODETRACE; those have also been fixed.	2024-03-08 15:36:56 -08:00
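
The CHECK and RETERR convention described in commit `25c9fb54da` can be illustrated with a minimal sketch. This is not the code from isc/util.h: `result_t`, `R_SUCCESS`, `R_FAILURE`, `step()`, `with_check()`, `with_reterr()` and `cleanups_run` are hypothetical stand-ins (BIND 9 itself uses `isc_result_t` and `ISC_R_SUCCESS`), and the macro bodies are an assumption based only on the behaviour the commit message describes.

```c
/* Stand-in result type; hypothetical, not BIND's isc_result_t. */
typedef enum { R_SUCCESS = 0, R_FAILURE = 1 } result_t;

/* CHECK: on error, store the code in a local 'result' and jump to 'cleanup'. */
#define CHECK(op)                          \
	do {                               \
		result = (op);             \
		if (result != R_SUCCESS) { \
			goto cleanup;      \
		}                          \
	} while (0)

/* RETERR: on error, return the code immediately, with no cleanup label. */
#define RETERR(op)                     \
	do {                           \
		result_t _r = (op);    \
		if (_r != R_SUCCESS) { \
			return _r;     \
		}                      \
	} while (0)

/* Counts how many times the cleanup path has run. */
static int cleanups_run = 0;

/* Hypothetical operation that fails when asked to. */
static result_t step(int fail) {
	return fail ? R_FAILURE : R_SUCCESS;
}

/* Uses CHECK: the cleanup code runs on both success and failure. */
static result_t with_check(int fail_first) {
	result_t result;

	CHECK(step(fail_first));
	CHECK(step(0));

cleanup:
	cleanups_run++;
	return result;
}

/* Uses RETERR: nothing to release, so an early return is enough. */
static result_t with_reterr(int fail_first) {
	RETERR(step(fail_first));
	RETERR(step(0));
	return R_SUCCESS;
}
```

In this sketch, `with_check()` always passes through the `cleanup` label, whether or not a step fails, while `with_reterr()` leaves the function at the first failing step. That matches the split the commit message draws between the two macros.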


55 commits