bind9

mirror of https://github.com/isc-projects/bind9.git synced 2026-05-28 04:34:54 -04:00

Author	SHA1	Message	Date
Artem Boldariev	d0907a3a1f	TLS DNS: Simplify tls_cycle_input() This commit simplifies code flow in the tls_cycle_input() and makes the incoming data processing similar to that in TCP DNS. In particular, now we decipher all the incoming data before making a single isc__nm_process_sock_buffer() call. Previously we would try to decipher data bit-by-bit before trying to process the deciphered bit via isc__nm_process_sock_buffer(). The previous approach made the code much less predictable, in particular in areas such as pausing or resuming reading. The newer approach also allowed us to get rid of some old kludges.	2025-03-24 09:49:38 +02:00
Aram Sargsyan	ac15d3dede	Implement -T cookiealwaysvalid When -T cookiealwaysvalid is passed to named, DNS cookie checks for the incoming queries always pass, given they are structurally correct. (cherry picked from commit `807ef8545d`)	2025-03-17 12:01:42 +00:00
Mark Andrews	89e76eec70	Add missing locks when returning addresses Add missing locks in dns_zone_getxfrsource4 et al. Addresses CID 468706, 468708, 468741, 468742, 468785 and 468778. Cleanup dns_zone_setxfrsource4 et al to now return void. Remove double copies with dns_zone_getprimaryaddr and dns_zone_getsourceaddr. (cherry picked from commit `d0a59277fb`)	2025-03-16 15:26:12 +11:00
Mark Andrews	54c89f75f3	Implement digest_sig and digest_rrsig for ZONEMD ZONEMD needs to be able to digest SIG and RRSIG records. The signer field can be compressed in SIG so we need to call dns_name_digest(). While for RRSIG records the signer field is not compressed, the canonical form has the signer field downcased (RFC 4034, 6.2). This also implies that compare_rrsig needs to downcase the signer field during comparison. (cherry picked from commit `006c5990ce`)	2025-03-05 10:34:52 +00:00
Aram Sargsyan	df373d7d99	Fix memory ordering for operations with quota->used and quota->waiting Change all the non-locked operations on 'quota->used' and 'quota->waiting' to "acq/rel" for inter-thread synchronization. Some loads are left as "relaxed", because they are under a locked mutex which also provides protection.	2025-03-04 09:57:34 +00:00
Aram Sargsyan	80d7d11f37	Use relaxed memory ordering for quota->max and quota->soft These variables are not critical for memory ordering issues and we can use the relaxed memory ordering, as done in the main branch.	2025-03-04 09:57:34 +00:00
Artem Boldariev	94bcd8c253	DoH: Bump the active streams processing limit This commit bumps the total number of active streams (= the opened streams for which a request is received, but response is not ready) to 60% of the total streams limit. The previous limit turned out to be too tight as revealed by longer (≥1h) runs of "stress:long:rpz:doh+udp:linux:*" tests. (cherry picked from commit `eaad0aefe6`)	2025-03-03 12:08:15 +02:00
Artem Boldariev	aa6fd85b0b	DoH: remove obsolete INSIST() check The check, while not active by default, is not valid since the commit `8b8f4d500d`. See 'if (total == 0) { ...' below branch to understand why. (cherry picked from commit `217a1ebd79`)	2025-03-03 12:07:48 +02:00
Artem Boldariev	d9928ccb62	DoH: Flush HTTP write buffer on an outgoing DNS message Previously, the code would try to avoid sending any data regardless of what it is unless: a) The flush limit is reached; b) There are no sends in flight. This strategy is used to avoid too numerous send requests with little amount of data. However, it has been proven to be too aggressive and, in fact, harms performance in some cases (e.g., on longer (≥1h) runs of "stress:long:rpz:doh+udp:linux:"). Now, additionally to the listed cases, we also: c) Flush the buffer and perform a send operation when there is an outgoing DNS message passed to the code (which is indicated by the presence of a send callback). That helps improve performance for "stress:long:rpz:doh+udp:linux:" tests. (cherry picked from commit `c5f7968856`)	2025-03-03 12:07:15 +02:00
Artem Boldariev	b4e8089694	DoH: Limit the number of delayed IO processing requests Previously, a function for continuing IO processing on the next UV tick was introduced (http_do_bio_async()). The intention behind this function was to ensure that http_do_bio() is eventually called at least once in the future. However, the current implementation allows queueing multiple such delayed requests needlessly. There is currently no need for these excessive requests as http_do_bio() can requeue them if needed. At the same time, each such request can lead to a memory allocation, particularly in BIND 9.18. This commit ensures that the number of enqueued delayed IO processing requests never exceeds one in order to avoid potentially bombarding IO threads with the delayed requests needlessly. (cherry picked from commit `0e1b02868a`)	2025-03-03 12:06:44 +02:00
Artem Boldariev	e525029b89	DoH: Simplify http_do_bio() This commit significantly simplifies the code flow in the http_do_bio() function, which is responsible for processing incoming and outgoing HTTP/2 data. It seems that the way it was structured before was indirectly caused by the presence of the missing callback calls bug, fixed in `8b8f4d500d`. The change introduced by this commit is known to remove a bottleneck and allows reproducible and measurable performance improvement for long runs (>= 1h) of "stress:long:rpz:doh+udp:linux:*" tests. Additionally, it fixes a similar issue with potentially missing send callback calls processing and hardens the code against use-after-free errors related to the session object (they can potentially occur). (cherry picked from commit `0956fb9b9e`)	2025-03-03 12:06:05 +02:00
Evan Hunt	90989bfdfb	prevent a reference leak from the ns_query_done hooks if the NS_QUERY_DONE_BEGIN or NS_QUERY_DONE_SEND hook is used in a plugin and returns NS_HOOK_RETURN, some of the cleanup in ns_query_done() can be skipped over, leading to reference leaks that can cause named to hang on shut down. this has been addressed by adding more housekeeping code after the cleanup: tag in ns_query_done(). (cherry picked from commit `c2e4358267`)	2025-02-25 22:41:27 +00:00
Evan Hunt	9bec99ce7d	Fix a logic error in cache_name() A change in `6aba56ae8` (checking whether a rejected RRset was identical to the data it would have replaced, so that we could still cache a signature) inadvertently introduced cases where processing of a response would continue when previously it would have been skipped. (cherry picked from commit `d0fd9cbe3b`)	2025-02-24 23:42:30 +00:00
Aram Sargsyan	f1ec774f9a	Fix RPZ bug when resuming a query during a reconfiguration After a reconfiguration the old view can be left without a valid 'rpzs' member, because when the RPZ is not changed during the named reconfiguration 'rpzs' "migrate" from the old view into the new view, so when a query resumes it can find that 'qctx->view->rpzs' is NULL which query_resume() currently doesn't expect to happen if it's recursing and 'qctx->rpz_st' is not NULL. Fix the issue by adding a NULL-check. In order to not split the log message to two different log messages depending on whether 'qctx->view->rpzs' is NULL or not, change the message to not log the RPZ policy's "version" which is just a runtime counter and is most likely not very useful for the users. (cherry picked from commit `3ea2fbc238`)	2025-02-21 11:45:52 +00:00
Mark Andrews	2c42324e26	Remove check for missing RRSIG records from getsection Checking whether the authority section is properly signed should be left to the validator. Checking in getsection (dns_message_parse) was way too early and resulted in resolution failures of lookups that should have otherwise succeeded. (cherry picked from commit `83159d0a54`)	2025-02-21 14:20:54 +11:00
Evan Hunt	b2e11b1ad3	Check whether a rejected rrset is different Add a new dns_rdataset_equals() function to check whether two rdatasets are equal in DNSSEC terms. When an rdataset being cached is rejected because its trust level is lower than the existing rdataset, we now check to see whether the rejected data was identical to the existing data. This allows us to cache a potentially useful RRSIG when handling CD=1 queries, while still rejecting RRSIGs that would definitely have resulted in a validation failure. (cherry picked from commit `6aba56ae89`)	2025-02-19 18:55:01 -08:00
Artem Boldariev	66bdddc51a	DoH: http_send_outgoing() return value is not used The value returned by http_send_outgoing() is not used anywhere, so we make it not return anything (void). Probably it is an omission from older times. (cherry picked from commit `2adabe835a`)	2025-02-19 19:42:15 +02:00
Artem Boldariev	0b9e8e6063	DoH: Fix missing send callback calls When handling outgoing data, there were a couple of rarely executed code paths that would not take into account that the callback MUST be called. It could lead to potential memory leaks and consequent shutdown hangs. (cherry picked from commit `8b8f4d500d`)	2025-02-19 19:42:15 +02:00
Artem Boldariev	f9aa7a298d	DoH: change how the active streams number is calculated This commit changes the way the number of active HTTP streams is calculated and allows it to scale with the values of the maximum amount of streams per connection, instead of effectively capping at STREAM_CLIENTS_PER_CONN. The original limit is intended to define the pipelining limit for TCP/DoT. However, it appeared to be too restrictive for DoH, as it works quite differently and implements pipelining at protocol level by the means of multiplexing multiple streams. That renders each stream to be effectively a separate connection from the point of view of the rest of the codebase. (cherry picked from commit `a22bc2d7d4`)	2025-02-19 19:42:15 +02:00
Artem Boldariev	3c49824589	DoH: Track the amount of in flight outgoing data Previously we would limit the amount of incoming data to process based solely on the presence of not completed send requests. That worked, however, it was found to severely degrade performance in certain cases, as was revealed during extended testing. Now we switch to keeping track of how much data is in flight (or ready to be in flight) and limit the amount of processed incoming data when the amount of in flight data surpasses the given threshold, similarly to like we do in other transports. (cherry picked from commit `05e8a50818`)	2025-02-19 19:42:15 +02:00
Mark Andrews	86e65f317a	Re-fetch pending records that failed validation If a deferred validation on data that was originally queried with CD=1 fails, we now repeat the query, since the zone data may have changed in the meantime. (cherry picked from commit `04b1484ed8`)	2025-02-18 23:59:10 +00:00
Mark Andrews	48b32e64c4	Complete the deferred validation if there are no RRSIGs When a query is made with CD=1, we store the result in the cache marked pending so that it can be validated later, at which time it will either be accepted as an answer or removed from the cache as invalid. Deferred validation was not attempted when there were no cached RRSIGs for DNSKEY and DS. We now complete the deferred validation in this scenario. (cherry picked from commit `8b900d1808`)	2025-02-18 23:59:10 +00:00
Aram Sargsyan	fc24cfd71d	Fix a race issue in dns_view_addzone() Views use two types of reference counting - regular and weak, and when there are no more regular references, the view_flushanddetach() function destroys or detaches some parts of the view, including 'view->zonetable', while other parts are freed by destroy() when the last weak reference is detached. Since catalog zones use weak references to attach a view, it's currently possible that during shutdown catalog zone processing will try to add a new zone into an otherwise unused view (because it's shutting down) which doesn't have an attached zonetable any more. This could cause an assertion failure. Fix this issue by modifying the dns_view_addzone() function to expect that 'view->zonetable' can be NULL, and in that case just return ISC_R_SHUTTINGDOWN.	2025-02-17 17:21:38 +00:00
Mark Andrews	7111f5e4c2	Fix "CNAME and other data" detection prio_type was being used in the wrong place to optimize cname_and_other. We have to first exclude and accepted types and we also have to determine that the record exists before we can check if we are at a point where a later CNAME cannot appear. (cherry picked from commit `5e49a9e4ae`)	2025-02-14 13:44:47 +11:00
Ondřej Surý	c9288ea3d7	Print the expiration time of the stale records (not ancient) In #1870, the expiration time of ANCIENT records was printed, but the ancient records are actually very short lived, and the information carries little value. Instead of printing the expiration of ANCIENT records, print the expiration time of STALE records.	2025-02-04 18:07:30 +01:00
Ondřej Surý	0c064cfde4	Expand the usage of set_ttl() before mark_header_ancient() When the mark_header_ancient() helper function was introduced, a couple of places with duplicate (or almost duplicate) code were missed. Add missing set_ttl() calls before mark_header_ancient(), so the handling of expiring headers is the same in all places. (concept cherry picked from commit `58179e6a19`)	2025-02-03 15:12:59 +01:00
Ondřej Surý	63e8af9270	Add better ZEROTTL handling in bindrdataset() If we know that the header has ZEROTTL set, the server should never send stale records for it and the TTL should never be anything else than 0. The comment was already there, but the code was not matching the comment. (cherry picked from commit `cfee6aa565`)	2025-02-03 15:04:36 +01:00
Ondřej Surý	9a8483bece	In cache, set rdataset TTL to 0 when the header is not active When the header has been marked as ANCIENT, but the ttl hasn't been reset (this happens in a couple of places), the rdataset TTL would be set to the header timestamp instead of a reasonable TTL value. Since this header has already expired (ANCIENT is set), set the rdataset TTL to 0 and don't reuse this field to print the expiration time when dumping the cache. Instead of printing the time, we now just print 'expired (awaiting cleanup)'. (cherry picked from commit `1bbb57f81b`)	2025-02-03 15:04:36 +01:00
Evan Hunt	291d0d8d90	fix the cache findzonecut implementation the search for the deepest known zone cut in the cache could improperly reject a node containing stale data, even if the NS rdataset wasn't the data that was stale. this change also improves the efficiency of the search by stopping it when both NS and RRSIG(NS) have been found. (cherry picked from commit `1f095b902c`)	2025-02-02 13:22:32 -08:00
Ondřej Surý	2974bbba80	Clarify reference counting in RBTDB database Change the names of the node reference counting functions and add comments to make the mechanism easier to understand: - new_reference() and decrement_references() are now called dns__rbtnode_acquire() and dns__rbtnode_release() respectively; this reflects the fact that they modify both the internal and external reference counters for a node. - rbtnode_newref() and rbtnode_decref are now called rbtnode_erefs_increment() and rbtnode_erefs_decrement(), to reflect that they only increase and decrease the node's external reference counters, not internal. (cherry picked from commit `857225aeb6`)	2025-01-31 06:15:13 +01:00
Ondřej Surý	8465e4516f	Refactor node reference counting in rbtdb.c Refactor the newref() and decref() functions in rbtdb.c so that they follow a pattern similar to the one we already have for QPDB. (cherry picked from commit `9c45de9473`)	2025-01-31 06:01:29 +01:00
Michał Kępień	64367010f2	Fix "rndc flushname" for longer name server names dns_adb_flushname() calls dns_name_hash() to determine the ADB bucket number to search for the given name. Meanwhile, all other functions in lib/dns/adb.c call dns_name_fullhash() for determining the bucket number instead. This discrepancy causes dns_adb_flushname() to have virtually no chances of actually removing the given name from the ADB if the name is longer than 16 bytes (since dns_name_hash() only hashes the first 16 bytes of the name provided to it) - more specifically, the probability of success for names longer than 16 bytes is inversely proportional to the number of ADB buckets in use, i.e. 1:1021 at best. Fix by using dns_name_fullhash() instead of dns_name_hash() in dns_adb_flushname(), so that the logic for determining the bucket number that a given name belongs to is consistent throughout lib/dns/adb.c.	2025-01-30 07:44:18 +01:00
Andoni Duarte Pintado	73997c8161	Merge tag 'v9.18.33' into bind-9.18	2025-01-29 17:23:11 +01:00
Ondřej Surý	b14df7d459	Stop the timer when shutting down the fetch context When canceling the last fetch, we also need to stop the fctx_expired timer from possibly firing between the fctx_shutdown() call and the fetch being actually destroyed along with the timer. As there are multiple places where fctx_shutdown() is being called without stopping the timer, move the fctx_stoptimer() to fctx_shutdown() and clean up the explicit usage.	2025-01-23 17:46:37 +01:00
Mark Andrews	8790d5cd22	Terminate yaml string after negative comment (cherry picked from commit `89afc11389`)	2025-01-22 23:58:54 +00:00
Ondřej Surý	239f4104da	Remove memory limit on ADB finds and fetches Address Database (ADB) shares the memory for the short lived ADB objects (finds, fetches, addrinfo) and the long lived ADB objects (names, entries, namehooks). This could lead to a situation where the resolver-heavy load would force evict ADB objects from the database to point where ADB is completely empty, leading to even more resolver-heavy load. Make the short lived ADB objects use the other memory context that we already created for the hashmaps. This makes the ADB overmem condition to not be triggered by the ongoing resolver fetches. (cherry picked from commit `05faff6d53`)	2025-01-22 15:29:27 +01:00
Ondřej Surý	4cc1160e4d	Replace linked lists with the hashtables to hold fetch contexts When the recursive-clients value is too large, the linked lists holding the fetch contexts can also grow large and since the algorithm to merge outgoing queries is quadratic, named can get slow. Replace the linked list with hashtable for faster lookups. This also allows us to reduce the number of tasks (buckets) in the resolver.	2025-01-22 15:06:04 +01:00
JINMEI Tatuya	065ffb2eb8	Optimize database decref by avoiding locking with refs > 1 Previously, this function always acquired a node write lock in case it needed to clean up the node when the reference count dropped to 0. In fact, the lock is unnecessary if the reference is larger than 1 and it can be optimized as an "easy" case. This optimization could even be "necessary". In some extreme cases, many worker threads could repeat acquiring and releasing the reference on the same node, resulting in severe lock contention for nothing (as the ref wouldn't decrement to 0 in most cases). This change would prevent noticeable performance drop like query timeout for such cases. Co-authored-by: JINMEI Tatuya <jtatuya@infoblox.com> Co-authored-by: Ondřej Surý <ondrej@isc.org> (cherry picked from commit `7f4471594d`)	2025-01-22 14:31:09 +01:00
Ondřej Surý	8bf311c769	Shutdown the fetch context after canceling the last fetch Currently, the fetch context will continue running even when the last fetch (response) has been removed from the context, so named can process and cache the answer. This can lead to a situation where the number of outgoing recursing clients exceeds the configured number for recursive-clients. Be more stringent about the recursive-clients limit and shutdown the fetch context immediately after the last fetch has been canceled from that particular fetch context.	2025-01-22 14:21:51 +01:00
Ondřej Surý	1b9d949534	Remove --with-tuning=small/large configuration option The last remaining tuning value was RESOLVER_NTASKS and instead of having variable number of the tasks per-cpu and in named and in dns_client, set the number of the resolver tasks to 523 (number taken from dns_client unit) to accommodate most of the recursive-clients values.	2025-01-22 14:16:40 +01:00
Ondřej Surý	d8206a939c	Reduce struct isc__nm_uvreq size from 1560 to 560 bytes The uv_req union member of struct isc__nm_uvreq contained libuv request types that we don't use. Turns out that uv_getnameinfo_t is 1000 bytes big and unnecessarily enlarged the whole structure. Remove all the unused members from the uv_req union.	2025-01-22 14:12:38 +01:00
Ondřej Surý	a7630c2c62	Reduce sizeof isc_sockaddr from 152 to 48 bytes After removing sockaddr_unix from isc_sockaddr, we can also remove sockaddr_storage and reduce the isc_sockaddr size from 152 bytes to just 48 bytes needed to hold IPv6 addresses. (cherry picked from commit `2367b6a2e1`)	2025-01-22 14:12:38 +01:00
Artem Boldariev	550b692343	DoH: reduce excessive bad request logging We started using isc_nm_bad_request() more actively throughout codebase. In the case of HTTP/2 it can lead to a large count of useless "Bad Request" messages in the BIND log, as often we attempt to send such request over effectively finished HTTP/2 sessions. This commit fixes that. (cherry picked from commit `937b5f8349`)	2025-01-15 16:50:13 +01:00
Artem Boldariev	796708775d	DoH: introduce manual read timer control This commit introduces manual read timer control as used by StreamDNS and its underlying transports. Before that, DoH code would rely on the timer control provided by TCP, which would reset the timer any time some data arrived. Now, the timer is restarted only when a full DNS message is processed in line with other DNS transports. That change is required because we should not stop the timer when reading from the network is paused due to throttling. We need a way to drop timed-out clients, particularly those who refuse to read the data we send. (cherry picked from commit `609a41517b`)	2025-01-15 16:49:32 +01:00
Artem Boldariev	ee42514be2	DoH: flooding clients detection This commit adds logic to make code better protected against clients that send valid HTTP/2 data that is useless from a DNS server perspective. Firstly, it adds logic that protects against clients who send too little useful (=DNS) data. We achieve that by adding a check that eventually detects such clients with an unfavorable useful to processed data ratio after the initial grace period. The grace period is limited to processing 128 KiB of data, which should be enough for sending the largest possible DNS message in a GET request and then some. This is the main safety belt that would detect even flooding clients that initially behave well in order to fool the server's checks. Secondly, in addition to the above, we introduce additional checks to detect outright misbehaving clients earlier: The code will treat clients that open too many streams (50) without sending any data for processing as flooding ones; The clients that managed to send 1.5 KiB of data without opening a single stream or submitting at least some DNS data will be treated as flooding ones. Of course, the behaviour described above is nothing else but heuristic checks, so they can never be perfect. At the same time, they should be reasonable enough not to drop any valid clients, relatively easy to implement, and have negligible computational overhead. (cherry picked from commit `3425e4b1d0`)	2025-01-15 16:49:23 +01:00
Artem Boldariev	11a2956dce	DoH: process data chunk by chunk instead of all at once Initially, our DNS-over-HTTP(S) implementation would try to process as much incoming data from the network as possible. However, that might be undesirable as we might create too many streams (each effectively backed by a ns_client_t object). That is too forgiving as it might overwhelm the server and trash its memory allocator, causing high CPU and memory usage. Instead of doing that, we resort to processing incoming data using a chunk-by-chunk processing strategy. That is, we split data into small chunks (currently 256 bytes) and process each of them asynchronously. However, we can process more than one chunk at once (up to 4 currently), given that the number of HTTP/2 streams has not increased while processing a chunk. That alone is not enough, though. In addition to the above, we should limit the number of active streams: these streams for which we have received a request and started processing it (the ones for which a read callback was called), as it is perfectly fine to have more opened streams than active ones. In the case we have reached or surpassed the limit of active streams, we stop reading AND processing the data from the remote peer. The number of active streams is effectively decreased only when responses associated with the active streams are sent to the remote peer. Overall, this strategy is very similar to the one used for other stream-based DNS transports like TCP and TLS. (cherry picked from commit `9846f395ad`)	2025-01-15 16:47:21 +01:00
Artem Boldariev	125bfd71d3	Add isc__nm_async_run() This commit adds isc__nm_async_run() which is very similar to isc_async_run() in newer versions of BIND: it allows calling a callback asynchronously. Potentially, it can be used to replace some other async operations in other networking code, in particular the delayed I/O calls in TLS and TCP DNS transports to name a few, and remove quite a lot of code, but we are unlikely to do that for the strictly maintenance only branch, so it is protected with DoH-related #ifdefs. It is implemented in a "universal" way mainly because doing it in the specific code requires the same amount of code and is not simpler.	2025-01-15 16:43:47 +01:00
Artem Boldariev	13d521fa5f	Implement TLS manual read timer control functionality This commit adds a manual TLS read timer control mode which is supposed to override automatic resetting of the timer when any data is received. It both depends and complements similar functionality in TCP.	2025-01-15 15:34:43 +00:00
Artem Boldariev	a67b325542	Implement TCP manual read timer control functionality This commit adds a manual TCP read timer control mode which is supposed to override automatic resetting of the timer when any data is received. That can be accomplished by `isc__nmhandle_set_manual_timer()`. This functionality is supposed to be used by multilevel networking transports which require finer grained control over the read timer (TLS Stream, DoH). The commit is essentially an implementation of the functionality from newer versions of BIND.	2025-01-15 15:34:43 +00:00
Ondřej Surý	fa7b7973e3	Limit the additional processing for large RDATA sets When answering queries, don't add data to the additional section if the answer has more than 13 names in the RDATA. This limits the number of lookups into the database(s) during a single client query, reducing query processing load. Also, don't append any additional data to type=ANY queries. The answer to ANY is already big enough. (cherry picked from commit `a1982cf1bb`)	2025-01-15 14:13:45 +01:00
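
The lock-avoidance idea described in commit 065ffb2eb8 (skip the node lock while the reference count stays above 1) can be sketched roughly as follows. This is a minimal illustration using C11 atomics and a pthread mutex, not the rbtdb code: `node_t`, `node_decref()`, and the `cleaned` flag are hypothetical names introduced only for this example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical node type for illustration; not the rbtdb node structure. */
typedef struct node {
	atomic_uint_fast32_t refs; /* reference counter */
	pthread_mutex_t lock;	   /* stands in for the node lock */
	bool cleaned;		   /* set when the last reference is dropped */
} node_t;

/*
 * Drop one reference. Returns true when the lock-free "easy" path was
 * taken, false when the lock had to be acquired.
 */
static bool
node_decref(node_t *node) {
	uint_fast32_t refs =
		atomic_load_explicit(&node->refs, memory_order_acquire);
	assert(refs > 0);

	/* Easy case: the counter cannot reach 0, so no lock is needed. */
	while (refs > 1) {
		if (atomic_compare_exchange_weak_explicit(
			    &node->refs, &refs, refs - 1,
			    memory_order_acq_rel, memory_order_acquire))
		{
			return true;
		}
		/* On failure 'refs' was reloaded; re-check the condition. */
	}

	/* Possibly the last reference: take the lock before cleaning up. */
	pthread_mutex_lock(&node->lock);
	if (atomic_fetch_sub_explicit(&node->refs, 1,
				      memory_order_acq_rel) == 1)
	{
		node->cleaned = true; /* placeholder for real node cleanup */
	}
	pthread_mutex_unlock(&node->lock);
	return false;
}
```

The compare-and-exchange loop only ever decrements from a value greater than 1, so a decrement to 0 can happen solely under the lock. Threads that repeatedly acquire and release references on a busy node therefore never contend on the mutex.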


14519 commits
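
The bucket mismatch described in commit 64367010f2 ("rndc flushname" for names longer than 16 bytes) can be illustrated with a small sketch. The hash below is a plain FNV-1a stand-in, not BIND's `dns_name_hash()` or `dns_name_fullhash()`; `bucket_full()`, `bucket_prefix()`, and `NBUCKETS` are hypothetical names, with 1021 taken from the "1:1021" ratio quoted in the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NBUCKETS 1021u /* illustrative bucket count */

/* FNV-1a over 'len' bytes; a stand-in hash, not BIND's implementation. */
static uint32_t
fnv1a(const char *data, size_t len) {
	uint32_t h = 2166136261u;
	for (size_t i = 0; i < len; i++) {
		h ^= (unsigned char)data[i];
		h *= 16777619u;
	}
	return h;
}

/* Bucket chosen from the whole name (how entries are stored). */
static unsigned int
bucket_full(const char *name) {
	assert(name != NULL);
	return fnv1a(name, strlen(name)) % NBUCKETS;
}

/* Bucket chosen from at most the first 16 bytes (the buggy lookup). */
static unsigned int
bucket_prefix(const char *name) {
	assert(name != NULL);
	size_t len = strlen(name);
	if (len > 16) {
		len = 16;
	}
	return fnv1a(name, len) % NBUCKETS;
}
```

For names of 16 bytes or fewer the two functions agree, so a lookup finds the stored entry. For longer names the prefix-based bucket ignores everything after byte 16, so it matches the bucket used at insertion time only by chance. Using the same full-name hash on both paths removes the discrepancy.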