bind9

mirror of https://github.com/isc-projects/bind9.git synced 2026-05-23 10:37:43 -04:00

Author	SHA1	Message	Date
Ondřej Surý	8e240bbb5f	Fix isc_buffer_init capacity mismatch in DoH data chunk callback isc_buffer_init() is given MAX_DNS_MESSAGE_SIZE (65535) as capacity but only h2->content_length bytes are allocated. This makes the buffer believe it has more space than actually allocated. A secondary bounds check (new_bufsize <= h2->content_length) prevents actual overflow, but the buffer invariant is violated. Pass h2->content_length as the capacity to match the allocation.	2026-03-18 11:39:01 +01:00
Ondřej Surý	2ab3d7c075	Fix missing server socket detach in TLS accept error path When TLS creation fails in tlslisten_acceptcb(), tlssock->server was not detached before detaching tlssock itself.	2026-03-14 13:58:32 +01:00
Ondřej Surý	295139f8ca	Rename isc_net_getudpportrange() to isc_net_getportrange() This better reflects the true nature of the function as we are reading the ephemeral port range which is not related to UDP at all.	2026-02-20 14:06:23 +01:00
Ondřej Surý	04c81b55d2	Implement IP_LOCAL_PORT_RANGE socket option for Linux For Linux >= 6.8: Since 2023, Linux has introduced a change to the IP_LOCAL_PORT_RANGE socket option that eliminates the need for the random window shifting (implemented as a fallback in the next commit). By setting the IP_LOCAL_PORT_RANGE option, we tell the kernel to use a better approach to source port selection. For Linux < 6.8: This implements port selection by randomly shifting a range, leveraging the IP_LOCAL_PORT_RANGE socket option. The network manager is initialized with the ephemeral port range (on startup and on reconfig) and then for every outgoing TCP connection, we define a custom port range (1000 ports) and then randomly shift the custom range within the system range. This helps the kernel to reduce the search space to the custom window between <random_offset, random_offset + 1000>. Reference: https://blog.cloudflare.com/linux-transport-protocol-port-selection-performance/#kernel	2026-02-20 14:06:23 +01:00
Ondřej Surý	2c48fcaeed	Improve the source port selection on Linux Since 2015, Linux has introduced a new socket option to overcome TCP limitations: When an application needs to force a source IP on an active TCP socket it has to use bind(IP, port=x). As most applications do not want to deal with already used ports, x is often set to 0, meaning the kernel is in charge of finding an available port. But the kernel does not know yet if this socket is going to be a listener or be connected. The IP_BIND_ADDRESS_NO_PORT socket option asks the kernel to ignore the 0 port provided by the application in bind(IP, port=0) and only remember the given IP address. The port will be automatically chosen at connect() time, in a way that allows sharing a source port as long as the 4-tuples are unique. Enable IP_BIND_ADDRESS_NO_PORT on the outgoing TCP sockets to overcome this TCP limitation.	2026-02-20 14:06:23 +01:00
Evan Hunt	d4ebea1037	use a standard CLEANUP macro CLEANUP is a macro similar to CHECK but unconditional, jumping to cleanup even if the result is ISC_R_SUCCESS. It is now used in place of DST_RET, CLEANUP_WITH, and CHECK(<non-success constant>).	2025-12-03 13:45:43 -08:00
Evan Hunt	6b33b7fc77	switch to RETERR where it wasn't being used replace all instances of the pattern: result = <statement> if (result != ISC_R_SUCCESS) { return result; } with: RETERR(<statement>);	2025-12-03 13:45:43 -08:00
Evan Hunt	38e94cc7da	switch to CHECK where it wasn't being used replace all instances of the pattern: result = <statement> if (result != ISC_R_SUCCESS) { goto cleanup; } with: CHECK(<statement>);	2025-12-03 13:45:42 -08:00
Colin Vidal	7c8b517d56	attach socket before async streamdns_resume_processing The call to `streamdns_resume_processing` is asynchronous, but the socket passed as its argument is not attached when the call is scheduled. While there is no reproducible way (so far) to make the socket reference count drop to 0 before `streamdns_resume_processing` is called, attach the socket before scheduling the call. This guards against a hypothetical case where, for some reason, the socket refcount would reach 0 and the socket would be freed from memory by the time `streamdns_resume_processing` is called.	2025-11-20 18:08:52 +01:00
Michal Nowak	d91e8ed575	Use SET_IF_NOT_NULL in isc__nm_base64*	2025-10-22 12:50:55 +02:00
Ondřej Surý	94b4d105e8	Apply the changes from updated set_if_not_null semantic patch	2025-10-08 17:44:50 +02:00
Alessio Podda	6e7aec2cb7	Use unique names for probes.d files Enabling LTO in the subsequent commit requires the file names to be unique and having same probes.d in each of the libraries breaks this requirement. Rename probes.d to probes-{isc,dns,ns}.d files and adjust the includes.	2025-09-24 13:18:13 +02:00
Colin Vidal	2cbe958df6	simplify nchildren count in isc_nm_listenudp Slight simplification of the logic that sets .nchildren for the listening UDP socket.	2025-09-16 14:22:15 +02:00
Ondřej Surý	42496f3f4a	Use ControlStatementsExceptControlMacros for SpaceBeforeParens "Put a space before opening parentheses only after control statement keywords (for/if/while...) except this option doesn’t apply to ForEach and If macros. This is useful in projects where ForEach/If macros are treated as function calls instead of control statements."	2025-08-19 07:58:33 +02:00
Ondřej Surý	f6aed602f0	Refactor the network manager to be a singleton There is only a single network manager running on top of the loop manager (except for tests). Refactor the network manager to be a singleton (a single instance) and change the unit tests, so that the shorter read timeouts apply only to a specific handle, not the whole extra 'connect_nm' network manager instance.	2025-07-23 22:45:38 +02:00
Ondřej Surý	b8d00e2e18	Change the loopmgr to be singleton All the applications built on top of the loop manager were required to create just a single instance of the loop manager. Refactor the loop manager to not expose this instance to the callers and keep the loop manager object internal to the isc_loop compilation unit. This significantly simplifies a number of data structures and calls to the isc_loop API.	2025-07-23 22:44:16 +02:00
Ondřej Surý	cca4b26d31	Use regular reference counting macro for isc_nm_t structure Instead of having hand-crafted attach/detach/destroy functions, replace them with the standard ISC_REFCOUNT macro. This also has the advantage that a delayed netmgr detach (from dns_dispatch) no longer causes an assertion failure. This can happen with delayed (call_rcu) shutdown of dns_adb.	2025-07-09 21:22:48 +02:00
Ondřej Surý	1032681af0	Convert the isc/tid.h to use own signed integer isc_tid_t type Change the internal type used for the isc_tid unit to isc_tid_t to hide the specific integer type being used for the 'tid'. Internally, a signed integer type is used. This allows us to have negatively indexed arrays that work both for threads with an assigned tid and for threads with an unassigned tid. This should be used only in specific situations.	2025-06-28 13:32:12 +02:00
Mark Andrews	422b9118e8	Use clang-format-20 to update formatting	2025-06-25 12:44:22 +10:00
Aydın Mercan	5cd6c173ff	replace the build system with meson Meson is a modern build system that has seen a rise in adoption and some version of it is available on almost every supported platform. Compared to automake, meson has the following advantages: * Meson provides a significant boost to the build and configuration time by better exploiting parallelism. * Meson is subjectively considered to be better in readability. These merits alone justify experimenting with meson as a way of improving development time and ergonomics. However, there are some compromises to ensure the transition goes relatively smoothly: * The system tests currently rely on various files within the source directory. Changing this requirement is a non-trivial task that can't be currently justified. Currently the last compiled build directory writes into the source tree which is in turn used by pytest. * The minimum version supported has been fixed at 0.61. Increasing this value will require choosing a baseline of distributions that can package with meson. On the contrary, there will likely be an attempt to decrease this value to ensure almost universal support for building BIND 9 with meson.	2025-06-11 10:30:12 +03:00
Ondřej Surý	7f498cc60d	Give every memory pool a name Instead of giving the memory pools names with an explicit call to isc_mempool_setname(), add the name to the isc_mempool_create() call so that every memory pool is unconditionally named.	2025-05-29 05:46:46 +02:00
Evan Hunt	dd9a685f4a	simplify code around isc_mem_put() and isc_mem_free() it isn't necessary to set a pointer to NULL after calling isc_mem_put() or isc_mem_free(), because those macros take care of it automatically.	2025-05-28 17:22:32 -07:00
Evan Hunt	8487e43ad9	make all ISC_LIST_FOREACH calls safe previously, ISC_LIST_FOREACH and ISC_LIST_FOREACH_SAFE were two separate macros, with the _SAFE version allowing entries to be unlinked during the loop. ISC_LIST_FOREACH is now also safe, and the separate _SAFE macro has been removed. similarly, the ISC_LIST_FOREACH_REV macro is now safe, and ISC_LIST_FOREACH_REV_SAFE has also been removed.	2025-05-23 13:09:10 -07:00
Aram Sargsyan	74a8acdc8d	Separate the single setter/getter functions for TCP timeouts Previously all kinds of TCP timeouts shared a single getter and a single setter function. Give each timeout its own getter/setter functions, because in the majority of cases only one is required at a time, and it's not optimal to expand those functions every time a new timeout value is implemented.	2025-04-23 17:03:05 +00:00
Aram Sargsyan	70ad94257d	Implement tcp-primaries-timeout The new 'tcp-primaries-timeout' configuration option works the same way as the existing 'tcp-initial-timeout' option, but applies only to the TCP connections made to the primary servers, so that the timeout value can be set separately for them. The default is 15 seconds. Also, while accommodating zone.c's code to support the new option, make a light refactoring with the way UDP timeouts are calculated by using definitions instead of hardcoded values.	2025-04-23 17:03:05 +00:00
Evan Hunt	ad7f744115	use ISC_LIST_FOREACH in more places use the ISC_LIST_FOREACH pattern in places where lists had been iterated using a different pattern from the typical `for` loop: for example, `while (!ISC_LIST_EMPTY(...))` or `while ((e = ISC_LIST_HEAD(...)) != NULL)`.	2025-03-31 13:45:14 -07:00
Evan Hunt	522ca7bb54	switch to ISC_LIST_FOREACH everywhere the pattern `for (x = ISC_LIST_HEAD(...); x != NULL; ISC_LIST_NEXT(...)` has been changed to `ISC_LIST_FOREACH` throughout BIND, except in a few cases where the change would be excessively complex. in most cases this was a straightforward change. in some places, however, the list element variable was referenced after the loop ended, and the code was refactored to avoid this necessity. also, because `ISC_LIST_FOREACH` uses typeof(list.head) to declare the list elements, compilation failures can occur if the list object has a `const` qualifier. some `const` qualifiers have been removed from function parameters to avoid this problem, and where that was not possible, `UNCONST` was used.	2025-03-31 13:45:10 -07:00
Artem Boldariev	eaad0aefe6	DoH: Bump the active streams processing limit This commit bumps the total number of active streams (= the opened streams for which a request is received, but response is not ready) to 60% of the total streams limit. The previous limit turned out to be too tight as revealed by longer (≥1h) runs of "stress:long:rpz:doh+udp:linux:*" tests.	2025-03-03 11:32:29 +02:00
Artem Boldariev	217a1ebd79	DoH: remove obsolete INSIST() check The check, while not active by default, is not valid since the commit `8b8f4d500d`. See 'if (total == 0) { ...' below branch to understand why.	2025-03-03 11:32:11 +02:00
Artem Boldariev	c5f7968856	DoH: Flush HTTP write buffer on an outgoing DNS message Previously, the code would try to avoid sending any data regardless of what it is unless: a) The flush limit is reached; b) There are no sends in flight. This strategy is used to avoid too numerous send requests with little amount of data. However, it has been proven to be too aggressive and, in fact, harms performance in some cases (e.g., on longer (≥1h) runs of "stress:long:rpz:doh+udp:linux:*"). Now, additionally to the listed cases, we also: c) Flush the buffer and perform a send operation when there is an outgoing DNS message passed to the code (which is indicated by the presence of a send callback). That helps improve performance for "stress:long:rpz:doh+udp:linux:*" tests.	2025-03-03 11:32:11 +02:00
Artem Boldariev	0e1b02868a	DoH: Limit the number of delayed IO processing requests Previously, a function for continuing IO processing on the next UV tick was introduced (http_do_bio_async()). The intention behind this function was to ensure that http_do_bio() is eventually called at least once in the future. However, the current implementation allows queueing multiple such delayed requests needlessly. There is currently no need for these excessive requests as http_do_bio() can requeue them if needed. At the same time, each such request can lead to a memory allocation, particularly in BIND 9.18. This commit ensures that the number of enqueued delayed IO processing requests never exceeds one in order to avoid potentially bombarding IO threads with the delayed requests needlessly.	2025-03-03 11:32:11 +02:00
Artem Boldariev	0956fb9b9e	DoH: Simplify http_do_bio() This commit significantly simplifies the code flow in the http_do_bio() function, which is responsible for processing incoming and outgoing HTTP/2 data. It seems that the way it was structured before was indirectly caused by the presence of the missing callback calls bug, fixed in `8b8f4d500d`. The change introduced by this commit is known to remove a bottleneck and allows reproducible and measurable performance improvement for long runs (>= 1h) of "stress:long:rpz:doh+udp:linux:*" tests. Additionally, it fixes a similar issue with potentially missing send callback calls processing and hardens the code against use-after-free errors related to the session object (they can potentially occur).	2025-03-03 11:32:11 +02:00
Ondřej Surý	c5075a9a61	Remove convenience list macros from isc/util.h The short convenience list macros were used very sparingly and inconsistently in the code base. As consistency is preferred over convenience, all shortened list macros were removed in favor of their ISC_LIST API targets.	2025-03-01 07:33:40 +01:00
Ondřej Surý	2aa70fff76	Remove unused isc_mutexblock and isc_condition units The isc_mutexblock and isc_condition units were no longer in use and were removed.	2025-03-01 07:33:09 +01:00
Artem Boldariev	2adabe835a	DoH: http_send_outgoing() return value is not used The value returned by http_send_outgoing() is not used anywhere, so we make it not return anything (void). Probably it is an omission from older times.	2025-02-19 17:52:36 +02:00
Artem Boldariev	8b8f4d500d	DoH: Fix missing send callback calls When handling outgoing data, there were a couple of rarely executed code paths that would not take into account that the callback MUST be called. It could lead to potential memory leaks and consequent shutdown hangs.	2025-02-19 17:52:36 +02:00
Artem Boldariev	a22bc2d7d4	DoH: change how the active streams number is calculated This commit changes the way the number of active HTTP streams is calculated and allows it to scale with the value of the maximum number of streams per connection, instead of effectively capping at STREAM_CLIENTS_PER_CONN. The original limit is intended to define the pipelining limit for TCP/DoT. However, it appeared to be too restrictive for DoH, as it works quite differently and implements pipelining at protocol level by means of multiplexing multiple streams. That renders each stream to be effectively a separate connection from the point of view of the rest of the codebase.	2025-02-19 17:52:36 +02:00
Artem Boldariev	05e8a50818	DoH: Track the amount of in flight outgoing data Previously we would limit the amount of incoming data to process based solely on the presence of not completed send requests. That worked, however, it was found to severely degrade performance in certain cases, as was revealed during extended testing. Now we switch to keeping track of how much data is in flight (or ready to be in flight) and limit the amount of processed incoming data when the amount of in flight data surpasses the given threshold, similarly to what we do in other transports.	2025-02-19 17:52:36 +02:00
Artem Boldariev	937b5f8349	DoH: reduce excessive bad request logging We started using isc_nm_bad_request() more actively throughout the codebase. In the case of HTTP/2 it can lead to a large count of useless "Bad Request" messages in the BIND log, as often we attempt to send such a request over effectively finished HTTP/2 sessions. This commit fixes that.	2025-01-15 14:09:17 +00:00
Artem Boldariev	4ae4e255cf	Do not stop timer in isc_nm_read_stop() in manual timer mode A call to isc_nm_read_stop() would always stop the read timer even in manual timer control mode, which was added with StreamDNS in mind. That looks like an omission that happened due to how timers are controlled in StreamDNS, where we always stop the timer before pausing reading anyway (see streamdns_on_complete_dnsmessage()). That would not work well for HTTP, though, where we might want to pause reading without stopping the timer in the case we want to split incoming data into multiple chunks to be processed independently. I suppose that it happened due to NM refactoring in the middle of StreamDNS development (at the time isc_nm_cancelread() and isc_nm_pauseread() were removed), as the StreamDNS code seems to be written as if timers are not stopped during a call to isc_nm_read_stop().	2025-01-15 14:09:17 +00:00
Artem Boldariev	609a41517b	DoH: introduce manual read timer control This commit introduces manual read timer control as used by StreamDNS and its underlying transports. Before that, DoH code would rely on the timer control provided by TCP, which would reset the timer any time some data arrived. Now, the timer is restarted only when a full DNS message is processed in line with other DNS transports. That change is required because we should not stop the timer when reading from the network is paused due to throttling. We need a way to drop timed-out clients, particularly those who refuse to read the data we send.	2025-01-15 14:09:17 +00:00
Artem Boldariev	3425e4b1d0	DoH: flooding clients detection This commit adds logic to make code better protected against clients that send valid HTTP/2 data that is useless from a DNS server perspective. Firstly, it adds logic that protects against clients who send too little useful (=DNS) data. We achieve that by adding a check that eventually detects such clients with an unfavorable useful-to-processed data ratio after the initial grace period. The grace period is limited to processing 128 KiB of data, which should be enough for sending the largest possible DNS message in a GET request and then some. This is the main safety belt that would detect even flooding clients that initially behave well in order to fool the checks server. Secondly, in addition to the above, we introduce additional checks to detect outright misbehaving clients earlier: The code will treat clients that open too many streams (50) without sending any data for processing as flooding ones; The clients that managed to send 1.5 KiB of data without opening a single stream or submitting at least some DNS data will be treated as flooding ones. Of course, the behaviour described above is nothing else but heuristic checks, so they can never be perfect. At the same time, they should be reasonable enough not to drop any valid clients, relatively easy to implement, and have negligible computational overhead.	2025-01-15 14:09:17 +00:00
Artem Boldariev	9846f395ad	DoH: process data chunk by chunk instead of all at once Initially, our DNS-over-HTTP(S) implementation would try to process as much incoming data from the network as possible. However, that might be undesirable as we might create too many streams (each effectively backed by a ns_client_t object). That is too forgiving as it might overwhelm the server and thrash its memory allocator, causing high CPU and memory usage. Instead of doing that, we resort to processing incoming data using a chunk-by-chunk processing strategy. That is, we split data into small chunks (currently 256 bytes) and process each of them asynchronously. However, we can process more than one chunk at once (up to 4 currently), given that the number of HTTP/2 streams has not increased while processing a chunk. That alone is not enough, though. In addition to the above, we should limit the number of active streams: those streams for which we have received a request and started processing it (the ones for which a read callback was called), as it is perfectly fine to have more opened streams than active ones. In the case we have reached or surpassed the limit of active streams, we stop reading AND processing the data from the remote peer. The number of active streams is effectively decreased only when responses associated with the active streams are sent to the remote peer. Overall, this strategy is very similar to the one used for other stream-based DNS transports like TCP and TLS.	2025-01-15 14:09:17 +00:00
Michał Kępień	d6f9785ac6	Enable extraction of exact local socket addresses Extracting the exact address that each wildcard/TCP socket is bound to locally requires issuing the getsockname() system call, which libuv exposes via its uv_*_getsockname() functions. This is only required for detailed logging and comes at a noticeable performance cost, so it should not happen by default. However, it is useful for debugging certain problems (e.g. cryptic system test failures), so a convenient way of enabling that behavior should exist. Update isc_nmhandle_localaddr() so that it calls uv_*_getsockname() when the ISC_SOCKET_DETAILS preprocessor macro is set at compile time. Ensure proper handling of sockets that wrap other sockets. Set the new ISC_SOCKET_DETAILS macro by default when --enable-developer is passed to ./configure. This enables detailed logging in the system tests run in GitLab CI without affecting performance in non-development BIND 9 builds. Note that setting the ISC_SOCKET_DETAILS preprocessor macro at compile time enables all callers of isc_nmhandle_localaddr() to extract the exact address of a given local socket, which results e.g. in dnstap captures containing more accurate information. Mention the new preprocessor macro in the section of the ARM that discusses why exact socket addresses may not be logged by default.	2024-12-29 12:32:05 +01:00
Artem Boldariev	6691a1530d	TLS SNI - add low level support for SNI to the networking code This commit adds support for setting SNI hostnames in outgoing connections over TLS. Most of the changes are related to either adapting the code to accept an extra argument in *connect() functions or a couple of changes to the TLS Stream to actually make use of the new SNI hostname information.	2024-12-26 17:23:12 +02:00
Matthijs Mekking	aa24b77d8b	Fix nsupdate hang when processing a large update The root cause is the fix for CVE-2024-0760 (part 3), which resets the TCP connection on a failed send. Specifically commit `4b7c61381f` stops reading on the socket because the TCP connection is throttling. When the tcpdns_send_cb callback thinks about restarting reading on the socket, this fails because the socket is a client socket. And nsupdate is a client and is using the same netmgr code. This commit removes the requirement that the socket must be a server socket, allowing reading on the socket again after being throttled.	2024-12-05 15:40:48 +01:00
Artem Boldariev	300f05110d	Extended TCP accept()/close() logging This commit adds extra log messages issued when accepting or closing a TCP connection (provided that debugging logging level >=99 is enabled).	2024-11-27 21:14:08 +02:00
Aydın Mercan	d987e2d745	add separate query counters for new protocols Add query counters for DoT, DoH, unencrypted DoH and their proxied counterparts. The protocols don't increment TCP/UDP counters anymore since they aren't the same as plain DNS-over-53.	2024-11-25 13:07:29 +03:00
Ondřej Surý	1a19ce39db	Remove redundant semicolons after the closing braces of functions	2024-11-19 12:27:22 +01:00
Ondřej Surý	0258850f20	Remove redundant parentheses from the return statement	2024-11-19 12:27:22 +01:00
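
The three error-handling patterns named in commits d4ebea1037, 6b33b7fc77, and 38e94cc7da (CLEANUP, RETERR, CHECK) can be sketched as follows. This is a minimal illustration built only from the patterns quoted in those commit messages; the macro bodies are simplified stand-ins rather than BIND 9's actual definitions, and `result_t`, `R_SUCCESS`, `R_FAILURE`, `R_NOTFOUND`, `step()`, `validate()`, and `process()` are hypothetical names introduced for the example.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for isc_result_t and its constants. */
typedef enum { R_SUCCESS = 0, R_FAILURE = 1, R_NOTFOUND = 2 } result_t;

/* Simplified stand-ins; the real BIND 9 macros may be defined differently. */

/* Return immediately on failure (only safe when nothing needs releasing). */
#define RETERR(x)                          \
	do {                               \
		result = (x);              \
		if (result != R_SUCCESS) { \
			return result;     \
		}                          \
	} while (0)

/* Jump to the cleanup label on failure. */
#define CHECK(x)                           \
	do {                               \
		result = (x);              \
		if (result != R_SUCCESS) { \
			goto cleanup;      \
		}                          \
	} while (0)

/* Unconditional: set the result and jump to cleanup, even on success. */
#define CLEANUP(r)            \
	do {                  \
		result = (r); \
		goto cleanup; \
	} while (0)

/* Hypothetical operation that succeeds when 'ok' is non-zero. */
static result_t
step(int ok) {
	return ok ? R_SUCCESS : R_FAILURE;
}

/* No resources are held here, so RETERR can return directly. */
static result_t
validate(int a, int b) {
	result_t result;

	RETERR(step(a));
	RETERR(step(b));
	return R_SUCCESS;
}

static int live_allocations = 0;

/* A buffer is held here, so every exit path goes through 'cleanup'. */
static result_t
process(int ok, int found) {
	result_t result;
	char *buf = malloc(16);

	if (buf == NULL) {
		return R_FAILURE;
	}
	live_allocations++;

	CHECK(step(ok));
	if (!found) {
		CLEANUP(R_NOTFOUND);
	}
	result = R_SUCCESS;

cleanup:
	free(buf);
	live_allocations--;
	return result;
}
```

The split between RETERR and CHECK in this sketch follows from whether the function owns anything at the point of failure: `validate()` can return early, while `process()` must always reach the `cleanup` label so the buffer is released.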
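
The random window shifting described in commit 04c81b55d2 (a 1000-port window placed at a random offset inside the system ephemeral range) might look roughly like the sketch below. It is not the code from that commit: `pick_port_window()` and `pack_port_range()` are hypothetical helpers, the window is treated as 1000 ports inclusive, and the caller is assumed to supply `rnd` from a suitable random source. The packing follows the Linux ip(7) description of IP_LOCAL_PORT_RANGE, where the upper bound occupies the high 16 bits and the lower bound the low 16 bits of a 32-bit value.

```c
#include <assert.h>
#include <stdint.h>

#define WINDOW_SIZE 1000 /* window width taken from the commit message */

/*
 * Hypothetical helper: choose a window [*win_lo, *win_hi] of WINDOW_SIZE
 * ports inside the system range [lo, hi], shifted by 'rnd'.
 * Assumes lo <= hi. If the system range is not wider than the window,
 * the whole range is used.
 */
static void
pick_port_window(uint16_t lo, uint16_t hi, uint32_t rnd, uint16_t *win_lo,
		 uint16_t *win_hi) {
	uint32_t span = (uint32_t)hi - lo + 1;

	if (span <= WINDOW_SIZE) {
		*win_lo = lo;
		*win_hi = hi;
		return;
	}

	/* Number of valid start offsets is span - WINDOW_SIZE + 1. */
	uint32_t offset = rnd % (span - WINDOW_SIZE + 1);

	*win_lo = (uint16_t)(lo + offset);
	*win_hi = (uint16_t)(lo + offset + WINDOW_SIZE - 1);
}

/*
 * Pack a range into the 32-bit value format that ip(7) documents for
 * IP_LOCAL_PORT_RANGE: upper bound in the high 16 bits, lower bound in
 * the low 16 bits.
 */
static uint32_t
pack_port_range(uint16_t win_lo, uint16_t win_hi) {
	return ((uint32_t)win_hi << 16) | win_lo;
}
```

On Linux, the packed value would then be passed to setsockopt() with the IP_LOCAL_PORT_RANGE option before connecting; that call is left out here so the sketch stays portable. The modulo reduction introduces a slight bias, which is acceptable for an illustration but worth reconsidering in real code.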


624 commits