bind9

mirror of https://github.com/isc-projects/bind9.git synced 2026-05-26 19:34:04 -04:00

Author	SHA1	Message	Date
Mark Andrews	fd76b90126	Add enum for use with isc_base64_tobuffer and isc_hex_tobuffer This adds the following enum isc_one_or_more and isc_zero_or_more which specify if one or more or zero or more bytes are required when reading the unbounded base64 / hex encoded data. (cherry picked from commit `07610f8566`)	2026-01-28 08:02:00 +11:00
Mark Andrews	31bdd01227	Use const pointer with strchr of const pointer C23 now has qualifier preserving standard functions for strchr, bsearch, strpbrk, strrchr, strstr, memchr. There were a few places where the return value was not assigned to a const qualified pointer. These have been fixed. (cherry picked from commit `af379e10cc`)	2026-01-20 06:00:50 +00:00
Giulio Benetti	ad25f0c514	Fix building on uclibc While building on uclibc this error is thrown: In file included from ./include/dns/log.h:20, from callbacks.c:19: ../../lib/isc/include/isc/log.h:141:9: error: unknown type name ‘off_t’ 141 \| off_t maximum_size; \| ^~~~~ This is due to missing include unistd.h, so let's add it on top of isc/log.h Signed-off-by: Giulio Benetti <giulio.benetti@benettiengineering.com> (cherry picked from commit `0e43f62c12`)	2026-01-04 20:47:47 +00:00
Evan Hunt	25c9fb54da	standardize CHECK and RETERR macros previously, there were over 40 separate definitions of CHECK macros, of which most used "goto cleanup", and the rest "goto failure" or "goto out". there were another 10 definitions of RETERR, of which most were identical to CHECK, but some simply returned a result code instead of jumping to a cleanup label. this has now been standardized throughout the code base: RETERR is for returning an error code in the case of an error, and CHECK is for jumping to a cleanup tag, which is now always called "cleanup". both macros are defined in isc/util.h. (cherry picked from commit `52bba5cc34`)	2025-12-03 19:17:20 -08:00
Ondřej Surý	95cc515e20	Provide more information when the memory allocation fails Instead of just crashing when memory allocation fails, also print a message saying "Out of memory!", the size of the allocation that failed, total allocated memory from all memory contexts and value of errno. (cherry picked from commit `b0194004d9`)	2025-11-28 16:45:08 +01:00
Ondřej Surý	5cd69a3dcf	Detect resolution loops between fetches Maintain the relationship between the parent and child fetch and, when creating a new child fetch, properly check for resolution loops in which the new fetch would join one of the parent's fetch contexts. (cherry picked from commit `4d307ac67a`)	2025-11-28 09:32:53 +01:00
Ondřej Surý	42d59c2ee4	Use atomics for CMM_{LOAD,STORE}_SHARED with ThreadSanitizer Upstream has removed the atomics implementation of CMM_LOAD_SHARED and CMM_STORE_SHARED as these can be used also with non-stdatomics types. As we only use the CMM api with stdatomics types, we can restore the previous behaviour to prevent ThreadSanitizer warnings. (cherry picked from commit `539be61b68`)	2025-11-27 09:32:36 +00:00
Colin Vidal	8cdb1d71ad	attach socket before async streamdns_resume_processing Call to `streamdns_resume_processing` is asynchronous but the socket passed as argument is not attached when scheduling the call. While there is no reproducible way (so far) to make the socket reference count drop to 0 before `streamdns_resume_processing` is called, attach the socket before scheduling the call. This guards against a hypothetical case where, for some reason, the socket refcount would reach 0 and the socket would be freed from memory by the time `streamdns_resume_processing` is called. (cherry picked from commit `7c8b517d56`)	2025-11-20 17:55:00 +00:00
Ondřej Surý	2c2cb31394	Drop the unit test for testing randomness Since we are using system routines for randomness, there's no point in spending time running the statistical suite for testing PRNG. (cherry picked from commit `90b3def5e9`)	2025-11-04 20:51:22 +01:00
Ondřej Surý	97487d1abb	Fix assertion failure from arc4random_uniform with invalid limit When the arc4random_uniform() is called on NetBSD with upper_bound that makes no sense statistically (0 or 1), the call crashes the calling program. Fix this by returning 0 when upper bound is < 2 as does Linux, FreeBSD and NetBSD. (Hint: System CSPRNG should never crash.) (cherry picked from commit `871bce312b`)	2025-10-24 20:23:32 +00:00
Michał Kępień	b35d6513d8	Merge tag 'v9.20.15' into bind-9.20	2025-10-22 16:16:59 +00:00
Michal Nowak	184cb00814	Use SET_IF_NOT_NULL in isc__nm_base64* (cherry picked from commit `d91e8ed575`)	2025-10-22 11:30:33 +00:00
Ondřej Surý	26c77915d5	Use arc4random for CSPRNG when available Use arc4random on platforms where available. arc4random() provides high quality cryptographically-secure pseudo-random numbers and is generally recommended for application use. The uv_random() call unfortunately uses getentropy() on platforms like MacOS, OpenBSD or NetBSD which is not recommended for application use. (cherry picked from commit `4db9e5d90e`)	2025-10-02 13:49:33 +02:00
Ondřej Surý	2924910eee	Use cryptographically-secure pseudo-random generator everywhere It was discovered in an upcoming academic paper that a xoshiro128** internal state can be recovered by an external 3rd party allowing to predict UDP ports and DNS IDs in the outgoing queries. This could lead to an attacker spoofing the DNS answers with great efficiency and poisoning the DNS cache. Change the internal random generator to system CSPRNG with buffering to avoid excessive syscalls. Thanks Omer Ben Simhon and Amit Klein of Hebrew University of Jerusalem for responsibly reporting this to us. Very cool research! (cherry picked from commit `cffcab9d5f`)	2025-10-02 13:49:33 +02:00
Ondřej Surý	a5d7c8f7db	Add and use __attribute__((nonnull)) in dnssec-signzone.c Clang 20 is complaining about passing NULL to an argument with 'nonnull' attribute. Mark these two functions with the same attribute to ensure that these two functions also don't accept NULL as an argument. (cherry picked from commit `9e350c1774`)	2025-08-28 14:24:48 +00:00
Thomas Abraham	add7cd3640	ensure file descriptors 0-2 are in use before using libuv libuv expects file descriptors <= STDERR_FILENO are in use. otherwise, it may abort when closing a file descriptor it opened. See https://github.com/libuv/libuv/pull/4559 Closes #5226 (cherry picked from commit `5cfdbeba72`)	2025-08-28 08:57:12 +00:00
Ondřej Surý	8f8fb10232	Use ControlStatementsExceptControlMacros for SpaceBeforeParens > Put a space before opening parentheses only after control statement > keywords (for/if/while...) except this option doesn’t apply to ForEach > and If macros. This is useful in projects where ForEach/If macros are > treated as function calls instead of control statements. (cherry picked from commit `42496f3f4a`)	2025-08-19 08:08:23 +02:00
Ondřej Surý	58791b5cfe	Add and apply InsertBraces statement > Insert braces after control statements (if, else, for, do, and while) > in C++ unless the control statements are inside macro definitions or > the braces would enclose preprocessor directives. (cherry picked from commit `d051e1e8f8`)	2025-08-19 08:07:41 +02:00
alessio	d21f63884a	Adaptive memory allocation strategy for qp-tries qp-tries allocate their nodes (twigs) in chunks to reduce allocator pressure and improve memory locality. The choice of chunk size presents a tradeoff: larger chunks benefit qp-tries with many values (as seen in large zones and resolvers) but waste memory in smaller use cases. Previously, our fixed chunk size of 2^10 twigs meant that even an empty qp-trie would consume 12KB of memory, while reducing this size would negatively impact resolver performance. This commit implements an adaptive chunking strategy that: - Tracks the size of the most recently allocated chunk. - Doubles the chunk size for each new allocation until reaching a predefined maximum. This approach effectively balances memory efficiency for small tries while maintaining the performance benefits of larger chunk sizes for bigger data structures. This commit also splits the callback freeing qpmultis into two phases, one that frees the underlying qptree, and one that reclaims the qpmulti memory. In order to prevent races between the qpmulti destructor and chunk garbage collection jobs, the second phase is protected by reference counting. (cherry picked from commit `70b1777d8a`)	2025-08-05 12:48:19 +02:00
Mark Andrews	53738b0e5e	Use clang-format-20 to update formatting (cherry picked from commit `422b9118e8`)	2025-06-25 13:32:08 +10:00
Ondřej Surý	1945fbc0dc	Set name for all the isc_mem context The memory context for managers and dlz_dlopen_driver units had no name and that was causing trouble with the statistics channel output. Set the name for the two memory context that were missing a proper name. (cherry picked from commit `5d264b3329`)	2025-05-29 05:45:12 +02:00
Ondřej Surý	9ac22fb152	Disable own memory context for libxml2 on macOS 15.4 Sequoia The custom allocation API for libxml2 is deprecated starting in macOS Sequoia 15.4, iOS 18.4, tvOS 18.4, and visionOS 2.4. Disable the memory function override for libxml2 when LIBXML_HAS_DEPRECATED_MEMORY_ALLOCATION_FUNCTIONS is defined as Apple broke the system-wide libxml2 starting with macOS Sequoia 15.4. (cherry picked from commit `bf1b8824ac`)	2025-04-18 21:00:52 +02:00
Artem Boldariev	634625be07	Add isc_tls_valid_sni_hostname() Add a function that checks if a 'hostname' is not a valid IPv4 or IPv6 address. Returns 'true' if the hostname is likely a domain name, and 'false' if it represents an IP address. (cherry picked from commit `1f199ee606`)	2025-03-31 15:06:59 +03:00
Colin Vidal	c1352b79ca	copy __FILE__ when allocating memory When allocating memory under -m trace\|record, the __FILE__ pointer is stored, so it can be printed out later in order to figure out in which file an allocation leaked (among other details, like the line number). However, named crashes when called with -m record and using a plugin leaking memory. The reason is that plugins are unloaded earlier than when the leaked allocations are dumped (obviously, as it's done as late as possible). In such circumstances, __FILE__ is dangling because the dynamically loaded library (the plugin) is not in memory anymore. Fix the crash by systematically copying the __FILE__ string instead of copying the pointer. Of course, this makes each allocation consume a bit more memory (and take longer, as it needs to calculate the length of __FILE__), but this occurs only under the -m trace\|record debugging flags. In terms of unit tests, because grepping in C is not fun, and because the whole "syntax" of the dump output is tested in other tests, this simply searches for a substring in the whole buffer to make sure the expected allocations are found. (cherry picked from commit `4eb2cd364a`)	2025-03-27 14:21:00 +01:00
Mark Andrews	cbf416a284	Call isc__iterated_hash_initialize The iterated hash implementation needs to be initialised on the worker thread. Also clean it up after we are done. (cherry picked from commit `988dc57c8c`)	2025-03-04 13:49:38 +00:00
Artem Boldariev	9977c7e5fa	DoH: Bump the active streams processing limit This commit bumps the total number of active streams (= the opened streams for which a request is received, but response is not ready) to 60% of the total streams limit. The previous limit turned out to be too tight as revealed by longer (≥1h) runs of "stress:long:rpz:doh+udp:linux:*" tests. (cherry picked from commit `eaad0aefe6`)	2025-03-03 10:12:27 +00:00
Artem Boldariev	b1ca1b3abc	DoH: remove obsolete INSIST() check The check, while not active by default, is not valid since the commit `8b8f4d500d`. See 'if (total == 0) { ...' below branch to understand why. (cherry picked from commit `217a1ebd79`)	2025-03-03 10:12:27 +00:00
Artem Boldariev	0bc12d0deb	DoH: Flush HTTP write buffer on an outgoing DNS message Previously, the code would try to avoid sending any data regardless of what it is unless: a) The flush limit is reached; b) There are no sends in flight. This strategy is used to avoid too numerous send requests with little amount of data. However, it has been proven to be too aggressive and, in fact, harms performance in some cases (e.g., on longer (≥1h) runs of "stress:long:rpz:doh+udp:linux:"). Now, additionally to the listed cases, we also: c) Flush the buffer and perform a send operation when there is an outgoing DNS message passed to the code (which is indicated by the presence of a send callback). That helps improve performance for "stress:long:rpz:doh+udp:linux:" tests. (cherry picked from commit `c5f7968856`)	2025-03-03 10:12:27 +00:00
Artem Boldariev	30226c749f	DoH: Limit the number of delayed IO processing requests Previously, a function for continuing IO processing on the next UV tick was introduced (http_do_bio_async()). The intention behind this function was to ensure that http_do_bio() is eventually called at least once in the future. However, the current implementation allows queueing multiple such delayed requests needlessly. There is currently no need for these excessive requests as http_do_bio() can requeue them if needed. At the same time, each such request can lead to a memory allocation, particularly in BIND 9.18. This commit ensures that the number of enqueued delayed IO processing requests never exceeds one in order to avoid potentially bombarding IO threads with the delayed requests needlessly. (cherry picked from commit `0e1b02868a`)	2025-03-03 10:12:27 +00:00
Artem Boldariev	515d84e1f6	DoH: Simplify http_do_bio() This commit significantly simplifies the code flow in the http_do_bio() function, which is responsible for processing incoming and outgoing HTTP/2 data. It seems that the way it was structured before was indirectly caused by the presence of the missing callback calls bug, fixed in `8b8f4d500d`. The change introduced by this commit is known to remove a bottleneck and allows reproducible and measurable performance improvement for long runs (>= 1h) of "stress:long:rpz:doh+udp:linux:*" tests. Additionally, it fixes a similar issue with potentially missing send callback calls processing and hardens the code against use-after-free errors related to the session object (they can potentially occur). (cherry picked from commit `0956fb9b9e`)	2025-03-03 10:12:27 +00:00
Ondřej Surý	ace7c879a8	Add isc_timer_running() function to check status of timer In the next commit, we need to know whether the timer has been started or stopped. Add isc_timer_running() function that returns true if the timer has been started. (cherry picked from commit `b9e3cd5d2a`)	2025-02-21 22:27:25 +01:00
Aram Sargsyan	18fbc3f735	Fix isc_quota bug Running jobs which were entered into the isc_quota queue is the responsibility of the isc_quota_release() function, which, when releasing a previously acquired quota, checks whether the queue is empty, and if it's not, it runs a job from the queue without touching the 'quota->used' counter. This mechanism is susceptible to a possible hangup of a newly queued job in the case when, between the time a decision has been made to queue it (because used >= max) and the time it was actually queued, the last quota was released. Since there are no more quotas to be released (unless arriving in the future), the newly entered job will be stuck in the queue. Fix the wrong memory ordering for 'quota->used', as the relaxed ordering doesn't ensure that data modifications made by one thread are visible in other threads. Add checks in both isc_quota_release() and isc_quota_acquire_cb() to make sure that the described hangup does not happen. Also see code comments. (cherry picked from commit `c6529891bb`)	2025-02-20 12:20:25 +00:00
Artem Boldariev	788e925261	DoH: http_send_outgoing() return value is not used The value returned by http_send_outgoing() is not used anywhere, so we make it not return anything (void). Probably it is an omission from older times. (cherry picked from commit `2adabe835a`)	2025-02-19 20:34:29 +02:00
Artem Boldariev	47e9b47742	DoH: Fix missing send callback calls When handling outgoing data, there were a couple of rarely executed code paths that would not take into account that the callback MUST be called. It could lead to potential memory leaks and consequent shutdown hangs. (cherry picked from commit `8b8f4d500d`)	2025-02-19 20:34:29 +02:00
Artem Boldariev	6b9387e2ee	DoH: change how the active streams number is calculated This commit changes how the number of active HTTP streams is calculated and allows it to scale with the values of the maximum amount of streams per connection, instead of effectively capping at STREAM_CLIENTS_PER_CONN. The original limit is intended to define the pipelining limit for TCP/DoT. However, it appeared to be too restrictive for DoH, as it works quite differently and implements pipelining at protocol level by means of multiplexing multiple streams. That renders each stream to be effectively a separate connection from the point of view of the rest of the codebase. (cherry picked from commit `a22bc2d7d4`)	2025-02-19 20:34:29 +02:00
Artem Boldariev	96e8ea1245	DoH: Track the amount of in flight outgoing data Previously we would limit the amount of incoming data to process based solely on the presence of not completed send requests. That worked, however, it was found to severely degrade performance in certain cases, as was revealed during extended testing. Now we switch to keeping track of how much data is in flight (or ready to be in flight) and limit the amount of processed incoming data when the amount of in flight data surpasses the given threshold, similar to what we do in other transports. (cherry picked from commit `05e8a50818`)	2025-02-19 20:34:29 +02:00
Ondřej Surý	a9f4e3369a	Reduce false sharing in dns_qpcache Instead of having many node_lock_count * sizeof(<member>) arrays, pack all the members into a qpcache_bucket_t struct that is cacheline aligned and have a single array of those. Additionally, make both the head and the tail of isc_queue_t padded, not just the head, to prevent false sharing of the lock-free structure with the lock that follows it. (cherry picked from commit `c602d76c1f`)	2025-02-04 23:27:28 +01:00
Artem Boldariev	50a062e5ce	DoH: reduce excessive bad request logging We started using isc_nm_bad_request() more actively throughout codebase. In the case of HTTP/2 it can lead to a large count of useless "Bad Request" messages in the BIND log, as often we attempt to send such request over effectively finished HTTP/2 sessions. This commit fixes that. (cherry picked from commit `937b5f8349`)	2025-01-15 16:07:13 +01:00
Artem Boldariev	c53541bfc5	Do not stop timer in isc_nm_read_stop() in manual timer mode A call to isc_nm_read_stop() would always stop reading timer even in manual timer control mode which was added with StreamDNS in mind. That looks like an omission that happened due to how timers are controlled in StreamDNS where we always stop the timer before pausing reading anyway (see streamdns_on_complete_dnsmessage()). That would not work well for HTTP, though, where we might want to pause reading without stopping the timer in the case we want to split incoming data into multiple chunks to be processed independently. I suppose that it happened due to NM refactoring in the middle of StreamDNS development (at the time isc_nm_cancelread() and isc_nm_pauseread() were removed), as the StreamDNS code seems to be written as if timers are not stopping during a call to isc_nm_read_stop(). (cherry picked from commit `4ae4e255cf`)	2025-01-15 16:05:56 +01:00
Artem Boldariev	36e9720d24	DoH: introduce manual read timer control This commit introduces manual read timer control as used by StreamDNS and its underlying transports. Before that, DoH code would rely on the timer control provided by TCP, which would reset the timer any time some data arrived. Now, the timer is restarted only when a full DNS message is processed in line with other DNS transports. That change is required because we should not stop the timer when reading from the network is paused due to throttling. We need a way to drop timed-out clients, particularly those who refuse to read the data we send. (cherry picked from commit `609a41517b`)	2025-01-15 16:05:47 +01:00
Artem Boldariev	4907248d14	DoH: flooding clients detection This commit adds logic to make code better protected against clients that send valid HTTP/2 data that is useless from a DNS server perspective. Firstly, it adds logic that protects against clients who send too little useful (=DNS) data. We achieve that by adding a check that eventually detects such clients with an unfavorable useful to processed data ratio after the initial grace period. The grace period is limited to processing 128 KiB of data, which should be enough for sending the largest possible DNS message in a GET request and then some. This is the main safety belt that would detect even flooding clients that initially behave well in order to fool the server's checks. Secondly, in addition to the above, we introduce additional checks to detect outright misbehaving clients earlier: The code will treat clients that open too many streams (50) without sending any data for processing as flooding ones; The clients that managed to send 1.5 KiB of data without opening a single stream or submitting at least some DNS data will be treated as flooding ones. Of course, the behaviour described above is nothing else but heuristic checks, so they can never be perfect. At the same time, they should be reasonable enough not to drop any valid clients, relatively easy to implement, and have negligible computational overhead. (cherry picked from commit `3425e4b1d0`)	2025-01-15 16:05:33 +01:00
Artem Boldariev	5eec1f5368	DoH: process data chunk by chunk instead of all at once Initially, our DNS-over-HTTP(S) implementation would try to process as much incoming data from the network as possible. However, that might be undesirable as we might create too many streams (each effectively backed by a ns_client_t object). That is too forgiving as it might overwhelm the server and trash its memory allocator, causing high CPU and memory usage. Instead of doing that, we resort to processing incoming data using a chunk-by-chunk processing strategy. That is, we split data into small chunks (currently 256 bytes) and process each of them asynchronously. However, we can process more than one chunk at once (up to 4 currently), given that the number of HTTP/2 streams has not increased while processing a chunk. That alone is not enough, though. In addition to the above, we should limit the number of active streams: those streams for which we have received a request and started processing it (the ones for which a read callback was called), as it is perfectly fine to have more opened streams than active ones. In the case we have reached or surpassed the limit of active streams, we stop reading AND processing the data from the remote peer. The number of active streams is effectively decreased only when responses associated with the active streams are sent to the remote peer. Overall, this strategy is very similar to the one used for other stream-based DNS transports like TCP and TLS. (cherry picked from commit `9846f395ad`)	2025-01-15 16:05:13 +01:00
Artem Boldariev	4f8ade0e1e	TLS SNI - add low level support for SNI to the networking code This commit adds support for setting SNI hostnames in outgoing connections over TLS. Most of the changes are related to either adapting the code to accept an extra argument in *connect() functions and a couple of changes to the TLS Stream to actually make use of the new SNI hostname information. (cherry picked from commit `6691a1530d`)	2024-12-26 18:31:03 +02:00
Pavel Březina	93bef0ea28	mark loop as shuttingdown earlier in shutdown_cb `shutdown_trigger_close_cb` is not called in the main loop since queued events in the `loop->async_trigger`, including loop teardown (shutdown_server) are processed first, before the `uv_close` callback is executed. In order to pass the information to the queued events, it is necessary to set the flag earlier in the process and not wait for the `uv_close` callback to trigger. (cherry picked from commit `67e21d94d4`)	2024-12-10 19:52:13 +00:00
Ondřej Surý	476757770b	Update picohttpparser.{c,h} with upstream repository Upstream code doesn't do regular releases, so we need to regularly sync the code from the upstream repository. This is synchronization up to the commit f8d0513 from Jan 29, 2024. (cherry picked from commit `d14a76e115`)	2024-12-08 12:30:07 +00:00
Matthijs Mekking	a7b291adc7	Fix nsupdate hang when processing a large update The root cause is the fix for CVE-2024-0760 (part 3), which resets the TCP connection on a failed send. Specifically commit `4b7c61381f` stops reading on the socket because the TCP connection is throttling. When the tcpdns_send_cb callback thinks about restarting reading on the socket, this fails because the socket is a client socket. And nsupdate is a client and is using the same netmgr code. This commit removes the requirement that the socket must be a server socket, allowing reading on the socket again after being throttled. (cherry picked from commit `aa24b77d8b`)	2024-12-06 08:31:19 +00:00
Matthijs Mekking	492f79560d	Implement global limit for outgoing queries This global limit is not reset on query restarts and is a hard limit for any client request. (cherry picked from commit `16b3bd1cc7`)	2024-12-06 06:20:33 +00:00
Matthijs Mekking	511c86facb	Implement getter function for counter limit (cherry picked from commit `ca7d487357`)	2024-12-06 06:20:33 +00:00
Ondřej Surý	624ea6c57e	Move contributed DLZ modules into a separate repository The DLZ modules are poorly maintained as we only ensure they can still be compiled, the DLZ interface is blocking, so anything that blocks the query to the database blocks the whole server and they should not be used except in testing. The DLZ interface itself should be scheduled for removal. (cherry picked from commit `a6cce753e2`)	2024-11-26 16:24:17 +01:00
Alessio Podda	0472494417	Incrementally apply AXFR transfer Reintroduce logic to apply diffs when the number of pending tuples is above 128. The previous strategy of accumulating all the tuples and pushing them at the end leads to excessive memory consumption during transfer. This effectively reverts half of `e3892805d6` (cherry picked from commit `99b4f01b33`)	2024-11-26 07:17:06 +00:00
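
The adaptive chunking strategy described in commit d21f63884a above (track the size of the most recently allocated chunk, then double it for each new allocation up to a predefined maximum) can be sketched roughly as follows. This is an illustrative Python model, not the BIND 9 code: the class name `ChunkSizer`, the method `next_chunk_twigs`, and the starting size of 64 twigs are assumptions made for the example; only the earlier fixed size of 2^10 twigs comes from the commit message, and it is used here as the cap.

```python
class ChunkSizer:
    """Hypothetical model of the doubling chunk-size policy from the commit message."""

    def __init__(self, min_twigs: int = 64, max_twigs: int = 1024) -> None:
        # min_twigs is an assumed starting size; max_twigs mirrors the earlier
        # fixed chunk size of 2^10 twigs mentioned in the commit message.
        if min_twigs < 1 or max_twigs < min_twigs:
            raise ValueError("require 1 <= min_twigs <= max_twigs")
        self.min_twigs = min_twigs
        self.max_twigs = max_twigs
        self.last_twigs = 0  # size of the most recently allocated chunk (0 = none yet)

    def next_chunk_twigs(self) -> int:
        """Return the size (in twigs) to use for the next chunk allocation."""
        if self.last_twigs == 0:
            size = self.min_twigs  # first allocation starts small
        else:
            size = min(self.last_twigs * 2, self.max_twigs)  # double, capped at the maximum
        self.last_twigs = size
        return size


if __name__ == "__main__":
    sizer = ChunkSizer()
    # Print the sizes of the first few chunk allocations for one trie.
    print([sizer.next_chunk_twigs() for _ in range(7)])
```

Under these assumed bounds, a small trie only ever pays for its first small chunk or two, while a large trie reaches the maximum chunk size after a handful of allocations.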

4997 commits