bind9

mirror of https://github.com/isc-projects/bind9.git synced 2026-05-27 12:13:20 -04:00

Author	SHA1	Message	Date
Ondřej Surý	86673ee67a	Unify the int32_t vs int_fast32_t when working with atomic types There's a mismatch between the atomic and non-atomic types that could potentially lead to a rwlock deadlock (after two billion 2^32) writes. Use int_fast32_t when loading the atomic_int_fast32_t types in the isc_rwlock unit.	2025-04-14 18:56:54 +02:00
Aram Sargsyan	df0fe93fa5	Fix the resolver's RTT-ranged responses statistics counters When a response times out, the fctx_cancelquery() function incorrectly counts it in the 'dns_resstatscounter_queryrtt5' counter (i.e. >=1600 ms). To avoid this, the rctx_timedout() function should make sure that 'rctx->finish' is NULL. And in order to adjust the RTT values for the timed out server, 'rctx->no_response' should be true. Update the rctx_timedout() function to make those changes. (cherry picked from commit `830e548111`)	2025-03-31 18:35:45 +00:00
Aram Sargsyan	9a3ba4856a	Fix resolver responses statistics counter The resquery_response() function increases the response counter without checking if the response was successful. Increase the counter only when the result indicates success. (cherry picked from commit `12e7dfa397`)	2025-03-31 18:35:45 +00:00
Evan Hunt	04d18f31ce	fix the fetchresponse result for CNAME/DNAME the fix in commit `1edbbc32b4` was incomplete; the wrong event result could also be set in cache_name() and validated(). (cherry picked from commit `9ebeb60174`)	2025-03-26 12:04:07 +01:00
Evan Hunt	46e793a3b4	set eresult based on the type in ncache_adderesult() when the caching of a negative record failed because of the presence of a positive one, ncache_adderesult() could override this to ISC_R_SUCCESS. this could cause CNAME and DNAME responses to be handled incorrectly. ncache_adderesult() now sets the result code correctly in such cases. (cherry picked from commit `1edbbc32b4`)	2025-03-26 12:01:59 +01:00
Evan Hunt	4b81333567	Don't check DNS_KEYFLAG_NOAUTH All DNSKEY keys are able to authenticate. The DNS_KEYTYPE_NOAUTH (and DNS_KEYTYPE_NOCONF) flags were defined for the KEY rdata type, and are not applicable to DNSKEY. Previously, because the DNSKEY implementation was built on top of KEY, the NOAUTH flag prevented authentication in DNSKEYs as well. This has been corrected. (cherry picked from commit `5c21576f82`)	2025-03-25 00:04:34 -07:00
Evan Hunt	07b599748f	Tidy up keyvalue.h definitions Use enums for DNS_KEYFLAG_, DNS_KEYTYPE_, DNS_KEYOWNER_, DNS_KEYALG_, and DNS_KEYPROTO_ values. Remove values that are never used. Eliminate the obsolete DNS_KEYFLAG_SIGNATORYMASK. Instead, add three more RESERVED bits for the key flag values that it covered but which were never used. (cherry picked from commit `fee1ba40df`)	2025-03-25 00:04:31 -07:00
Artem Boldariev	d0907a3a1f	TLS DNS: Simplify tls_cycle_input() This commit simplifies code flow in the tls_cycle_input() and makes the incoming data processing similar to that in TCP DNS. In particular, now we decipher all the incoming data before making a single isc__nm_process_sock_buffer() call. Previously we would try to decipher data bit-by-bit before trying to process the deciphered bit via isc__nm_process_sock_buffer(). The previous approach made the code much less predictable, in particular in areas such as pausing or resuming reading. The newer approach also allowed us to get rid of some old kludges.	2025-03-24 09:49:38 +02:00
Aram Sargsyan	ac15d3dede	Implement -T cookiealwaysvalid When -T cookiealwaysvalid is passed to named, DNS cookie checks for the incoming queries always pass, given they are structurally correct. (cherry picked from commit `807ef8545d`)	2025-03-17 12:01:42 +00:00
Mark Andrews	89e76eec70	Add missing locks when returning addresses Add missing locks in dns_zone_getxfrsource4 et al. Addresses CID 468706, 468708, 468741, 468742, 468785 and 468778. Cleanup dns_zone_setxfrsource4 et al to now return void. Remove double copies with dns_zone_getprimaryaddr and dns_zone_getsourceaddr. (cherry picked from commit `d0a59277fb`)	2025-03-16 15:26:12 +11:00
Mark Andrews	54c89f75f3	Implement digest_sig and digest_rrsig for ZONEMD ZONEMD needs to be able to digest SIG and RRSIG records. The signer field can be compressed in SIG so we need to call dns_name_digest(). While for RRSIG records the signer field is not compressed, the canonical form has the signer field downcased (RFC 4034, 6.2). This also implies that compare_rrsig needs to downcase the signer field during comparison. (cherry picked from commit `006c5990ce`)	2025-03-05 10:34:52 +00:00
Aram Sargsyan	df373d7d99	Fix memory ordering for operations with quota->used and quota->waiting Change all the non-locked operations on 'quota->used' and 'quota->waiting' to "acq/rel" for inter-thread synchronization. Some loads are left as "relaxed", because they are under a locked mutex which also provides protection.	2025-03-04 09:57:34 +00:00
Aram Sargsyan	80d7d11f37	Use relaxed memory ordering for quota->max and quota->soft These variables are not critical for memory ordering issues and we can use the relaxed memory ordering, as done in the main branch.	2025-03-04 09:57:34 +00:00
Artem Boldariev	94bcd8c253	DoH: Bump the active streams processing limit This commit bumps the total number of active streams (= the opened streams for which a request is received, but response is not ready) to 60% of the total streams limit. The previous limit turned out to be too tight as revealed by longer (≥1h) runs of "stress:long:rpz:doh+udp:linux:*" tests. (cherry picked from commit `eaad0aefe6`)	2025-03-03 12:08:15 +02:00
Artem Boldariev	aa6fd85b0b	DoH: remove obsolete INSIST() check The check, while not active by default, is not valid since the commit `8b8f4d500d`. See the 'if (total == 0) { ...' branch below to understand why. (cherry picked from commit `217a1ebd79`)	2025-03-03 12:07:48 +02:00
Artem Boldariev	d9928ccb62	DoH: Flush HTTP write buffer on an outgoing DNS message Previously, the code would try to avoid sending any data regardless of what it is unless: a) The flush limit is reached; b) There are no sends in flight. This strategy is used to avoid too numerous send requests with little amount of data. However, it has been proven to be too aggressive and, in fact, harms performance in some cases (e.g., on longer (≥1h) runs of "stress:long:rpz:doh+udp:linux:"). Now, additionally to the listed cases, we also: c) Flush the buffer and perform a send operation when there is an outgoing DNS message passed to the code (which is indicated by the presence of a send callback). That helps improve performance for "stress:long:rpz:doh+udp:linux:" tests. (cherry picked from commit `c5f7968856`)	2025-03-03 12:07:15 +02:00
Artem Boldariev	b4e8089694	DoH: Limit the number of delayed IO processing requests Previously, a function for continuing IO processing on the next UV tick was introduced (http_do_bio_async()). The intention behind this function was to ensure that http_do_bio() is eventually called at least once in the future. However, the current implementation allows queueing multiple such delayed requests needlessly. There is currently no need for these excessive requests as http_do_bio() can requeue them if needed. At the same time, each such request can lead to a memory allocation, particularly in BIND 9.18. This commit ensures that the number of enqueued delayed IO processing requests never exceeds one in order to avoid potentially bombarding IO threads with the delayed requests needlessly. (cherry picked from commit `0e1b02868a`)	2025-03-03 12:06:44 +02:00
Artem Boldariev	e525029b89	DoH: Simplify http_do_bio() This commit significantly simplifies the code flow in the http_do_bio() function, which is responsible for processing incoming and outgoing HTTP/2 data. It seems that the way it was structured before was indirectly caused by the presence of the missing callback calls bug, fixed in `8b8f4d500d`. The change introduced by this commit is known to remove a bottleneck and allows reproducible and measurable performance improvement for long runs (>= 1h) of "stress:long:rpz:doh+udp:linux:*" tests. Additionally, it fixes a similar issue with potentially missing send callback calls processing and hardens the code against use-after-free errors related to the session object (they can potentially occur). (cherry picked from commit `0956fb9b9e`)	2025-03-03 12:06:05 +02:00
Evan Hunt	90989bfdfb	prevent a reference leak from the ns_query_done hooks if the NS_QUERY_DONE_BEGIN or NS_QUERY_DONE_SEND hook is used in a plugin and returns NS_HOOK_RETURN, some of the cleanup in ns_query_done() can be skipped over, leading to reference leaks that can cause named to hang on shut down. this has been addressed by adding more housekeeping code after the cleanup: tag in ns_query_done(). (cherry picked from commit `c2e4358267`)	2025-02-25 22:41:27 +00:00
Evan Hunt	9bec99ce7d	Fix a logic error in cache_name() A change in `6aba56ae8` (checking whether a rejected RRset was identical to the data it would have replaced, so that we could still cache a signature) inadvertently introduced cases where processing of a response would continue when previously it would have been skipped. (cherry picked from commit `d0fd9cbe3b`)	2025-02-24 23:42:30 +00:00
Aram Sargsyan	f1ec774f9a	Fix RPZ bug when resuming a query during a reconfiguration After a reconfiguration the old view can be left without a valid 'rpzs' member, because when the RPZ is not changed during the named reconfiguration 'rpzs' "migrate" from the old view into the new view, so when a query resumes it can find that 'qctx->view->rpzs' is NULL which query_resume() currently doesn't expect to happen if it's recursing and 'qctx->rpz_st' is not NULL. Fix the issue by adding a NULL-check. In order to not split the log message to two different log messages depending on whether 'qctx->view->rpzs' is NULL or not, change the message to not log the RPZ policy's "version" which is just a runtime counter and is most likely not very useful for the users. (cherry picked from commit `3ea2fbc238`)	2025-02-21 11:45:52 +00:00
Mark Andrews	2c42324e26	Remove check for missing RRSIG records from getsection Checking whether the authority section is properly signed should be left to the validator. Checking in getsection (dns_message_parse) was way too early and resulted in resolution failures of lookups that should have otherwise succeeded. (cherry picked from commit `83159d0a54`)	2025-02-21 14:20:54 +11:00
Evan Hunt	b2e11b1ad3	Check whether a rejected rrset is different Add a new dns_rdataset_equals() function to check whether two rdatasets are equal in DNSSEC terms. When an rdataset being cached is rejected because its trust level is lower than the existing rdataset, we now check to see whether the rejected data was identical to the existing data. This allows us to cache a potentially useful RRSIG when handling CD=1 queries, while still rejecting RRSIGs that would definitely have resulted in a validation failure. (cherry picked from commit `6aba56ae89`)	2025-02-19 18:55:01 -08:00
Artem Boldariev	66bdddc51a	DoH: http_send_outgoing() return value is not used The value returned by http_send_outgoing() is not used anywhere, so we make it not return anything (void). Probably it is an omission from older times. (cherry picked from commit `2adabe835a`)	2025-02-19 19:42:15 +02:00
Artem Boldariev	0b9e8e6063	DoH: Fix missing send callback calls When handling outgoing data, there were a couple of rarely executed code paths that would not take into account that the callback MUST be called. It could lead to potential memory leaks and consequent shutdown hangs. (cherry picked from commit `8b8f4d500d`)	2025-02-19 19:42:15 +02:00
Artem Boldariev	f9aa7a298d	DoH: change how the active streams number is calculated This commit changes how the number of active HTTP streams is calculated and allows it to scale with the maximum number of streams per connection, instead of effectively capping it at STREAM_CLIENTS_PER_CONN. The original limit is intended to define the pipelining limit for TCP/DoT. However, it appeared to be too restrictive for DoH, as it works quite differently and implements pipelining at the protocol level by multiplexing multiple streams. That makes each stream effectively a separate connection from the point of view of the rest of the codebase. (cherry picked from commit `a22bc2d7d4`)	2025-02-19 19:42:15 +02:00
Artem Boldariev	3c49824589	DoH: Track the amount of in flight outgoing data Previously we would limit the amount of incoming data to process based solely on the presence of incomplete send requests. That worked; however, it was found to severely degrade performance in certain cases, as was revealed during extended testing. Now we keep track of how much data is in flight (or ready to be in flight) and limit the amount of processed incoming data when the amount of in flight data surpasses the given threshold, similar to what we do in other transports. (cherry picked from commit `05e8a50818`)	2025-02-19 19:42:15 +02:00
Mark Andrews	86e65f317a	Re-fetch pending records that failed validation If a deferred validation on data that was originally queried with CD=1 fails, we now repeat the query, since the zone data may have changed in the meantime. (cherry picked from commit `04b1484ed8`)	2025-02-18 23:59:10 +00:00
Mark Andrews	48b32e64c4	Complete the deferred validation if there are no RRSIGs When a query is made with CD=1, we store the result in the cache marked pending so that it can be validated later, at which time it will either be accepted as an answer or removed from the cache as invalid. Deferred validation was not attempted when there were no cached RRSIGs for DNSKEY and DS. We now complete the deferred validation in this scenario. (cherry picked from commit `8b900d1808`)	2025-02-18 23:59:10 +00:00
Aram Sargsyan	fc24cfd71d	Fix a race issue in dns_view_addzone() Views use two types of reference counting - regular and weak, and when there are no more regular references, the view_flushanddetach() function destroys or detaches some parts of the view, including 'view->zonetable', while other parts are freed by destroy() when the last weak reference is detached. Since catalog zones use weak references to attach a view, it's currently possible that during shutdown catalog zone processing will try to add a new zone into an otherwise unused view (because it's shutting down) which doesn't have an attached zonetable any more. This could cause an assertion failure. Fix this issue by modifying the dns_view_addzone() function to expect that 'view->zonetable' can be NULL, and in that case just return ISC_R_SHUTTINGDOWN.	2025-02-17 17:21:38 +00:00
Mark Andrews	7111f5e4c2	Fix "CNAME and other data" detection prio_type was being used in the wrong place to optimize cname_and_other. We have to first exclude and accepted types and we also have to determine that the record exists before we can check if we are at a point where a later CNAME cannot appear. (cherry picked from commit `5e49a9e4ae`)	2025-02-14 13:44:47 +11:00
Ondřej Surý	c9288ea3d7	Print the expiration time of the stale records (not ancient) In #1870, the expiration time of ANCIENT records was printed, but the ancient records are very short lived, so the information carries little value. Instead of printing the expiration of ANCIENT records, print the expiration time of STALE records.	2025-02-04 18:07:30 +01:00
Ondřej Surý	0c064cfde4	Expand the usage of set_ttl() before mark_header_ancient() When the mark_header_ancient() helper function was introduced, a couple of places with duplicate (or almost duplicate) code were missed. Add missing set_ttl() calls before mark_header_ancient(), so the handling of expiring headers is the same in all places. (concept cherry picked from commit `58179e6a19`)	2025-02-03 15:12:59 +01:00
Ondřej Surý	63e8af9270	Add better ZEROTTL handling in bindrdataset() If we know that the header has ZEROTTL set, the server should never send stale records for it and the TTL should never be anything else than 0. The comment was already there, but the code was not matching the comment. (cherry picked from commit `cfee6aa565`)	2025-02-03 15:04:36 +01:00
Ondřej Surý	9a8483bece	In cache, set rdataset TTL to 0 when the header is not active When the header has been marked as ANCIENT, but the TTL hasn't been reset (this happens in a couple of places), the rdataset TTL would be set to the header timestamp instead of a reasonable TTL value. Since this header has already expired (ANCIENT is set), set the rdataset TTL to 0 and don't reuse this field to print the expiration time when dumping the cache. Instead of printing the time, we now just print 'expired (awaiting cleanup)'. (cherry picked from commit `1bbb57f81b`)	2025-02-03 15:04:36 +01:00
Evan Hunt	291d0d8d90	fix the cache findzonecut implementation the search for the deepest known zone cut in the cache could improperly reject a node containing stale data, even if the NS rdataset wasn't the data that was stale. this change also improves the efficiency of the search by stopping it when both NS and RRSIG(NS) have been found. (cherry picked from commit `1f095b902c`)	2025-02-02 13:22:32 -08:00
Ondřej Surý	2974bbba80	Clarify reference counting in RBTDB database Change the names of the node reference counting functions and add comments to make the mechanism easier to understand: - new_reference() and decrement_references() are now called dns__rbtnode_acquire() and dns__rbtnode_release() respectively; this reflects the fact that they modify both the internal and external reference counters for a node. - rbtnode_newref() and rbtnode_decref are now called rbtnode_erefs_increment() and rbtnode_erefs_decrement(), to reflect that they only increase and decrease the node's external reference counters, not internal. (cherry picked from commit `857225aeb6`)	2025-01-31 06:15:13 +01:00
Ondřej Surý	8465e4516f	Refactor node reference counting in rbtdb.c Refactor the newref() and decref() functions in rbtdb.c so that they follow a pattern similar to the one we already have for QPDB. (cherry picked from commit `9c45de9473`)	2025-01-31 06:01:29 +01:00
Michał Kępień	64367010f2	Fix "rndc flushname" for longer name server names dns_adb_flushname() calls dns_name_hash() to determine the ADB bucket number to search for the given name. Meanwhile, all other functions in lib/dns/adb.c call dns_name_fullhash() for determining the bucket number instead. This discrepancy causes dns_adb_flushname() to have virtually no chances of actually removing the given name from the ADB if the name is longer than 16 bytes (since dns_name_hash() only hashes the first 16 bytes of the name provided to it) - more specifically, the probability of success for names longer than 16 bytes is inversely proportional to the number of ADB buckets in use, i.e. 1:1021 at best. Fix by using dns_name_fullhash() instead of dns_name_hash() in dns_adb_flushname(), so that the logic for determining the bucket number that a given name belongs to is consistent throughout lib/dns/adb.c.	2025-01-30 07:44:18 +01:00
Andoni Duarte Pintado	73997c8161	Merge tag 'v9.18.33' into bind-9.18	2025-01-29 17:23:11 +01:00
Ondřej Surý	b14df7d459	Stop the timer when shutting down the fetch context When canceling the last fetch, we also need to stop the fctx_expired timer from possibly firing between the fctx_shutdown() call and the fetch being actually destroyed along with the timer. As there are multiple places where fctx_shutdown() is being called without stopping the timer, move the fctx_stoptimer() to fctx_shutdown() and clean up the explicit usage.	2025-01-23 17:46:37 +01:00
Mark Andrews	8790d5cd22	Terminate yaml string after negative comment (cherry picked from commit `89afc11389`)	2025-01-22 23:58:54 +00:00
Ondřej Surý	239f4104da	Remove memory limit on ADB finds and fetches Address Database (ADB) shares the memory for the short lived ADB objects (finds, fetches, addrinfo) and the long lived ADB objects (names, entries, namehooks). This could lead to a situation where a resolver-heavy load would force-evict ADB objects from the database to the point where ADB is completely empty, leading to even more resolver-heavy load. Make the short lived ADB objects use the other memory context that we already created for the hashmaps. This prevents the ADB overmem condition from being triggered by the ongoing resolver fetches. (cherry picked from commit `05faff6d53`)	2025-01-22 15:29:27 +01:00
Ondřej Surý	4cc1160e4d	Replace linked lists with the hashtables to hold fetch contexts When the recursive-clients value is too large, the linked lists holding the fetch contexts can also grow large and since the algorithm to merge outgoing queries is quadratic, named can get slow. Replace the linked list with hashtable for faster lookups. This also allows us to reduce the number of tasks (buckets) in the resolver.	2025-01-22 15:06:04 +01:00
JINMEI Tatuya	065ffb2eb8	Optimize database decref by avoiding locking with refs > 1 Previously, this function always acquired a node write lock in case the reference decremented to 0 and node cleanup was needed. In fact, the lock is unnecessary if the reference is larger than 1 and it can be optimized as an "easy" case. This optimization could even be "necessary". In some extreme cases, many worker threads could repeat acquiring and releasing the reference on the same node, resulting in severe lock contention for nothing (as the ref wouldn't decrement to 0 in most cases). This change would prevent a noticeable performance drop, such as query timeouts, in such cases. Co-authored-by: JINMEI Tatuya <jtatuya@infoblox.com> Co-authored-by: Ondřej Surý <ondrej@isc.org> (cherry picked from commit `7f4471594d`)	2025-01-22 14:31:09 +01:00
Ondřej Surý	8bf311c769	Shutdown the fetch context after canceling the last fetch Currently, the fetch context will continue running even when the last fetch (response) has been removed from the context, so named can process and cache the answer. This can lead to a situation where the number of outgoing recursing clients exceeds the configured number for recursive-clients. Be more stringent about the recursive-clients limit and shut down the fetch context immediately after the last fetch has been canceled from that particular fetch context.	2025-01-22 14:21:51 +01:00
Ondřej Surý	1b9d949534	Remove --with-tuning=small/large configuration option The last remaining tuning value was RESOLVER_NTASKS. Instead of having a variable number of tasks per CPU, both in named and in dns_client, set the number of the resolver tasks to 523 (number taken from the dns_client unit) to accommodate most of the recursive-clients values.	2025-01-22 14:16:40 +01:00
Ondřej Surý	d8206a939c	Reduce struct isc__nm_uvreq size from 1560 to 560 bytes The uv_req union member of struct isc__nm_uvreq contained libuv request types that we don't use. Turns out that uv_getnameinfo_t is 1000 bytes big and unnecessarily enlarged the whole structure. Remove all the unused members from the uv_req union.	2025-01-22 14:12:38 +01:00
Ondřej Surý	a7630c2c62	Reduce sizeof isc_sockaddr from 152 to 48 bytes After removing sockaddr_unix from isc_sockaddr, we can also remove sockaddr_storage and reduce the isc_sockaddr size from 152 bytes to just 48 bytes needed to hold IPv6 addresses. (cherry picked from commit `2367b6a2e1`)	2025-01-22 14:12:38 +01:00
Artem Boldariev	550b692343	DoH: reduce excessive bad request logging We started using isc_nm_bad_request() more actively throughout codebase. In the case of HTTP/2 it can lead to a large count of useless "Bad Request" messages in the BIND log, as often we attempt to send such request over effectively finished HTTP/2 sessions. This commit fixes that. (cherry picked from commit `937b5f8349`)	2025-01-15 16:50:13 +01:00
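
The bucket mismatch described in commit 64367010f2 (Fix "rndc flushname" for longer name server names) can be illustrated with a small sketch. This is not BIND code: SHA-256 stands in for the real hash functions, and bucket_prefix(), bucket_full() and count_agreements() are hypothetical names. Only the 16-byte prefix and the figure of 1021 buckets are taken from the commit message.

```python
import hashlib

NBUCKETS = 1021   # bucket count quoted in the commit message ("1:1021 at best")
PREFIX_LEN = 16   # bytes covered by the prefix-only hash, per the commit message


def _digest(data: bytes) -> int:
    """Stand-in hash (SHA-256); NOT the hash function BIND uses."""
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")


def bucket_prefix(name: bytes) -> int:
    """Bucket chosen when only the first PREFIX_LEN bytes are hashed."""
    return _digest(name[:PREFIX_LEN]) % NBUCKETS


def bucket_full(name: bytes) -> int:
    """Bucket chosen when the whole name is hashed."""
    return _digest(name) % NBUCKETS


def count_agreements(names) -> int:
    """Number of names for which both schemes pick the same bucket."""
    return sum(1 for n in names if bucket_prefix(n) == bucket_full(n))


# Names of at most 16 bytes: both schemes hash identical input.
short_names = [f"ns{i}.example.".encode() for i in range(100)]
# Names longer than 16 bytes: the schemes hash different input.
long_names = [f"ns1.a-rather-long-zone-{i}.example.".encode() for i in range(2000)]

print("short names agreeing:", count_agreements(short_names), "of", len(short_names))
print("long names agreeing: ", count_agreements(long_names), "of", len(long_names))
```

For names that fit in the prefix, the two schemes always agree. For longer names they agree only by chance, which is why a lookup that uses one scheme rarely finds an entry stored under the other.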

14526 commits