bind9

mirror of https://github.com/isc-projects/bind9.git synced 2026-05-25 02:47:54 -04:00

Author	SHA1	Message	Date
Ondřej Surý	6cb6373b5a	Convert Stream DNS to use isc_buffer API Drop the whole isc_dnsbuffer API and use new improved isc_buffer API that provides same functionality as the isc_dnsbuffer unit now.	2022-12-20 22:13:53 +02:00
Artem Boldariev	0a7e83feea	StreamDNS: Use isc__nm_senddns() to send DNS messages This commit modifies the Stream DNS message so that it uses the optimised code path (isc__nm_senddns()) for sending DNS messages over the underlying transport. This way we avoid allocating any intermediate memory buffers needed to render a DNS message with its length pre-pended ahead of the contents (TCP DNS message format).	2022-12-20 22:13:53 +02:00
Artem Boldariev	cb6f3dc3c8	TLS: isc__nm_senddns() support This commit adds support for isc_nm_senddns() to the generic TLS code.	2022-12-20 22:13:53 +02:00
Artem Boldariev	ad876a65af	Add isc__nm_senddns() The new internal function works in the same way as isc_nm_send() except that it sends a DNS message size ahead of the DNS message data (the format used in DNS over TCP). The intention is to provide a fast path for sending DNS messages over streams protocols - that is, without allocating any intermediate memory buffers.	2022-12-20 22:13:53 +02:00
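The framing described for isc__nm_senddns() (a message size sent ahead of the message data, as in DNS over TCP) can be sketched in Python. This is a minimal illustration, not BIND code: the function name `frame_dns_message` is hypothetical, and the two-byte big-endian length prefix is the standard DNS-over-TCP format rather than something stated in the commit message. Returning the prefix and the payload as separate buffers mirrors the idea of avoiding an intermediate copy.

```python
import struct


def frame_dns_message(message: bytes) -> list[bytes]:
    """Return [length_prefix, message], ready for a scatter/gather send.

    Hypothetical helper: the payload is never copied into a new buffer;
    only the two-byte big-endian length prefix is created.
    """
    if not 0 < len(message) <= 0xFFFF:
        raise ValueError("DNS message must be 1..65535 bytes long")
    prefix = struct.pack("!H", len(message))
    return [prefix, message]


buffers = frame_dns_message(b"\x12\x34query")
wire = b"".join(buffers)  # joined here only to show the on-wire bytes
```

With a real socket, the two buffers could be passed to `socket.sendmsg()` so that they are written in one call without being concatenated first.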
Artem Boldariev	56732ac2a0	TLS: try to avoid allocating send request objects This commit optimises TLS send request object allocation to enable send request object reuse, somewhat reducing pressure on the memory manager. It is especially helpful in the case when Stream DNS uses the TLS implementation as the transport.	2022-12-20 22:13:53 +02:00
Artem Boldariev	4277eeeb9c	Remove TLS DNS transport (and parts common with TCP DNS) This commit removes TLS DNS transport superseded by Stream DNS.	2022-12-20 22:13:53 +02:00
Artem Boldariev	e5649710d3	Remove TCP DNS transport This commit removes TCP DNS transport superseded by Stream DNS.	2022-12-20 22:13:53 +02:00
Artem Boldariev	4524bf4083	Make isc_nm_tlssocket non-optional This commit unties generic TLS code (isc_nm_tlssocket) from DoH, so that it will be available regardless of the fact if BIND was built with DNS over HTTP support or not.	2022-12-20 22:13:53 +02:00
Artem Boldariev	efe4267044	DoH: use isc_nmhandle_set_tcp_nodelay() This commit replaces ad-hoc code for disabling Nagle's algorithm with a call to isc_nmhandle_set_tcp_nodelay().	2022-12-20 22:13:53 +02:00
Artem Boldariev	e89575ddce	StreamDNS: opportunistically disable Nagle's algorithm This commit ensures that Stream DNS code attempts to disable Nagle's algorithm regardless of underlying stream transport (TCP or TLS), as we are not interested in trading latency for throughput when dealing with DNS messages.	2022-12-20 22:13:53 +02:00
Artem Boldariev	05cfb27b80	Disable Nagle's algorithm for TLS connections by default This commit ensures that Nagle's algorithm is disabled by default for TLS connections on best effort basis, just like other networking software (e.g. NGINX) does, as, in the case of TLS, we are not interested in trading latency for throughput, rather vice versa. We attempt to disable it as early as we can, right after TCP connections establishment, as an attempt to speed up handshake handling.	2022-12-20 22:13:53 +02:00
Artem Boldariev	371b02f37a	TCP: make it possible to set Nagle's algorithm state via handle This commit adds the ability to turn Nagle's algorithm on or off via a connection's handle. It adds the isc_nmhandle_set_tcp_nodelay() function as the public interface for this functionality.	2022-12-20 22:13:53 +02:00
Artem Boldariev	4606384345	Extend isc__nm_socket_tcp_nodelay() to accept value This makes it possible to both enable and disable Nagle's algorithm for a TCP socket descriptor, before the change it was possible only to disable it.	2022-12-20 22:13:53 +02:00
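Toggling Nagle's algorithm on a TCP socket, as the commits above describe, comes down to setting the `TCP_NODELAY` socket option. The sketch below uses Python's standard `socket` module; the helper name `set_tcp_nodelay` is hypothetical and is not the BIND function of a similar name.

```python
import socket


def set_tcp_nodelay(sock: socket.socket, enabled: bool) -> None:
    """Disable Nagle's algorithm when enabled is True, re-enable it otherwise."""
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1 if enabled else 0)


with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    set_tcp_nodelay(s, True)
    nodelay_on = s.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0
    set_tcp_nodelay(s, False)
    nodelay_off = s.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) == 0
```

Accepting a boolean rather than only offering a "disable" call matches the change described above, where the option can be set in both directions.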
Artem Boldariev	f395cd4b3e	Add isc_nm_streamdnssocket (aka Stream DNS) This commit adds an initial implementation of isc_nm_streamdnssocket transport: a unified transport for DNS over stream protocols messages, which is capable of replacing both TCP DNS and TLS DNS transports. Currently, the interface it provides is a unified set of interfaces provided by both of the transports it attempts to replace. The transport is built around "isc_dnsbuffer_t" and "isc_dnsstream_assembler_t" objects and attempts to minimise both the number of memory allocations during network transfers as well as memory usage.	2022-12-20 22:13:51 +02:00
Artem Boldariev	338cf3e467	Add isc_dnsstream_assembler_t implementation This commit adds the implementation for an "isc_dnsstream_assembler_t" object. The object is built on top of "isc_dnsbuffer_t" and is intended to encapsulate the state machine used for handling DNS messages received in the format used for messages transmitted over TCP. The idea is that the object accepts the input data received from a socket, tries to assemble DNS messages from the incoming data and calls a callback, passing it the status of the incoming data as well as a pointer to the memory region referencing the data of the assembled message. It is capable of assembling DNS messages no matter how torn apart they are when sent over the network. The following statuses might be passed to the callback: * ISC_R_SUCCESS - a message has been successfully assembled; * ISC_R_NOMORE - not enough data has been processed to assemble a message; * ISC_R_RANGE - there was an attempt to process a zero-sized DNS message (someone attempts to send us junk data). One could say that the object replaces the implementation of "isc__nm__processbuffer()" functions used by the old TCP DNS and TLS DNS transports with a better defined state machine completely decoupled from the networking code itself. Such a design makes it trivial to write unit tests for it, leading to better verification of its correctness. Another important difference is directly related to the fact that it is built on top of "isc_dnsbuffer_t", which tries to manage memory in a smart way. In particular: * It tries to use a static buffer for smaller messages, reducing pressure on the memory manager (hot path); * When allocating dynamic memory for larger messages, it tries to allocate memory conservatively (generic path). These characteristics are a significant upgrade over the older logic where a 64KB(+2 bytes) buffer was allocated from dynamic memory regardless of whether we need a buffer this large or not. That is, lower memory usage is expected in a generic case for DNS transports built on top of "isc_dnsstream_assembler_t".	2022-12-20 21:24:44 +02:00
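The assembler behaviour described above (accept arbitrary chunks, report a status through a callback) can be sketched as a small state machine. This is an illustrative Python sketch, not the BIND implementation: the class name, the status constants and the callback signature are assumptions, and it always continues after a successful message, which the real object may not do.

```python
import struct

# Hypothetical stand-ins for ISC_R_SUCCESS, ISC_R_NOMORE and ISC_R_RANGE.
SUCCESS, NOMORE, RANGE = "success", "nomore", "range"


class StreamAssembler:
    """Assemble length-prefixed DNS messages from arbitrarily split input."""

    def __init__(self, callback) -> None:
        self._buf = bytearray()
        self._callback = callback  # called as callback(status, message_or_None)

    def feed(self, data: bytes) -> None:
        self._buf += data
        while True:
            if len(self._buf) < 2:  # length prefix not complete yet
                self._callback(NOMORE, None)
                return
            (length,) = struct.unpack_from("!H", self._buf, 0)
            if length == 0:  # zero-sized message: treat as junk
                self._buf.clear()
                self._callback(RANGE, None)
                return
            if len(self._buf) < 2 + length:  # message body not complete yet
                self._callback(NOMORE, None)
                return
            message = bytes(self._buf[2:2 + length])
            del self._buf[:2 + length]
            self._callback(SUCCESS, message)


events = []
assembler = StreamAssembler(lambda status, msg: events.append((status, msg)))
assembler.feed(b"\x00\x03ab")      # first message is torn apart
assembler.feed(b"c\x00\x01z")      # rest of it, plus a second message
```

Because the class touches no sockets, it can be exercised with plain byte strings, which is the testability benefit the commit message points out.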
Artem Boldariev	cbb758abd4	Add isc_dnsbuffer_t implementation This commit adds "isc_dnsbuffer_t" object implementation, a thin wrapper on top of "isc_buffer_t" which has the following characteristics: * provides interface specifically attuned for handling/generating DNS messages, especially in the format used for DNS messages over TCP; * avoids allocating dynamic memory when handling small DNS messages, while transparently switching to using dynamic memory when handling larger messages. This approach significantly reduces pressure on the memory allocator, as most of the DNS messages are small.	2022-12-20 21:24:44 +02:00
Artem Boldariev	c0c59b55ab	TLS: add an internal function isc__nmhandle_get_selected_alpn() The added function provides the interface for getting an ALPN tag negotiated during TLS connection establishment. The new function can be used by higher level transports.	2022-12-20 21:24:44 +02:00
Artem Boldariev	15e626f1ca	TLS: add manual read timer control mode This commit adds manual read timer control mode, similarly to TCP. This way the read timer can be controlled manually using: * isc__nmsocket_timer_start(); * isc__nmsocket_timer_stop(); * isc__nmsocket_timer_restart(). The change is required to make it possible to implement more sophisticated read timer control policies in DNS transports, built on top of TLS.	2022-12-20 21:24:44 +02:00
Artem Boldariev	9aabd55725	TCP: add manual read timer control mode This commit adds a manual read timer control mode to the TCP code (adding isc__nmhandle_set_manual_timer() as the interface to it). Manual read timer control mode suppresses restarting the read timer when any amount of data is received. This way the read timer can be controlled manually using: * isc__nmsocket_timer_start(); * isc__nmsocket_timer_stop(); * isc__nmsocket_timer_restart(). The change is required to make it possible to implement more sophisticated read timer control policies in DNS transports, built on top of TCP.	2022-12-20 21:24:44 +02:00
Artem Boldariev	f4760358f8	TLS: expose the ability to (re)start and stop underlying read timer This commit adds implementation of isc__nmsocket_timer_restart() and isc__nmsocket_timer_stop() for generic TLS code in order to make its interface more compatible with that of TCP.	2022-12-20 21:24:44 +02:00
Artem Boldariev	f18a9b3743	TLS: add isc__nmsocket_timer_running() support This commit adds isc__nmsocket_timer_running() support to the generic TLS code in order to make it more compatible with TCP.	2022-12-20 21:24:44 +02:00
Artem Boldariev	c0808532e1	TLS: isc_nm_bad_request() and isc__nmsocket_reset() support This commit adds implementations of isc_nm_bad_request() and isc__nmsocket_reset() to the generic TLS stream code in order to make it more compatible with TCP code.	2022-12-20 21:24:44 +02:00
Artem Boldariev	94e650ce89	Use 'restrict' and 'const' for 'isc_buffer_t' The purpose of this commit is to aid compiler in generating better code when working with `isc_buffer_t` objects by using restricted pointers (and, to a lesser extent, 'const' modifier for read-only arguments). This way we, basically, instruct the compiler that the members of structured passed by pointers into the functions can be treated as local variables in the scope of a function. That should reduce the number of load/store operations emitted by compilers when accessing objects (e.g. 'isc_buffer_t') via pointers.	2022-12-20 21:01:27 +02:00
Ondřej Surý	460afcda18	Add isc_buffer_trycompact() function needed for StreamDNS Add isc_buffer_trycompact() that's an optimization; it will compact the buffer only when the remaining length is smaller than used length.	2022-12-20 19:13:48 +01:00
Ondřej Surý	e6062ee3ae	Add isc_buffer_setmctx() and isc_buffer_clearmctx() function Add two extra functions needed by StreamDNS: 1. isc_buffer_setmctx() sets the buffer internal memory context, so we can use isc_buffer_reserve() on the buffer. For this, we also need to track whether the .base was dynamically allocated or not. This needs to be called after isc_buffer_init() and before first isc_buffer_reserve() call. 2. isc_buffer_clearmctx() clears the buffer internal memory context, and frees any dynamically allocated buffer. This needs to be called after the last isc_buffer_reserve() call and before calling the isc_buffer_invalidate()	2022-12-20 19:13:48 +01:00
Ondřej Surý	8e3a86f6dd	Make the isc_buffer unit header-only The isc_buffer is often used in the hot-path, so make it header-only implementation.	2022-12-20 19:13:48 +01:00
Ondřej Surý	2ddea1e41c	Add a static pre-allocated buffer to isc_buffer_t When the buffer is allocated via isc_buffer_allocate() and the size is smaller or equal ISC_BUFFER_STATIC_SIZE (currently 512 bytes), the buffer will be allocated as a flexible array member in the buffer structure itself instead of allocating it on the heap. This should help when the buffer is used on the hot-path with small allocations.	2022-12-20 19:13:48 +01:00
Ondřej Surý	6bd2b34180	Enable auto-reallocation for all isc_buffer_allocate() buffers When isc_buffer_t buffer is created with isc_buffer_allocate() assume that we want it to always auto-reallocate instead of having an extra call to enable auto-reallocation.	2022-12-20 19:13:48 +01:00
Ondřej Surý	135ec7a0f0	Remove single use isc_buffer_putdecint() function The isc_buffer_putdecint() could be easily replaced with isc_buffer_printf() with just a small overhead of calling vsnprintf() twice instead once. This is not on a hot-path (dns_catz unit), so we can ignore the overhead and instead have less single-use code in favor of using reusable more generic function.	2022-12-20 19:13:48 +01:00
Ondřej Surý	2a94123d5b	Refactor the isc_buffer_{get,put}uintN, add isc_buffer_peekuintN The Stream DNS implementation needs peek methods that read a value from the buffer without advancing the current position. Add isc_buffer_peekuintN methods, refactor the isc_buffer_{get,put}uintN methods to modern integer types, and move the isc_buffer_getuintN to the header as static inline functions.	2022-12-20 19:13:48 +01:00
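The difference between a "get" and a "peek" on a buffer with a current position can be shown with a short Python sketch. The class and method names below are hypothetical and only illustrate the semantics described above; they are not the isc_buffer API.

```python
import struct


class ReadBuffer:
    """Minimal read-only buffer with a current position (illustrative only)."""

    def __init__(self, data: bytes) -> None:
        self._data = data
        self._current = 0

    def remaining(self) -> int:
        return len(self._data) - self._current

    def peek_uint16(self) -> int:
        """Read a big-endian 16-bit value without moving the position."""
        (value,) = struct.unpack_from("!H", self._data, self._current)
        return value

    def get_uint16(self) -> int:
        """Read a big-endian 16-bit value and advance past it."""
        value = self.peek_uint16()
        self._current += 2
        return value


buf = ReadBuffer(b"\x00\x05\x01\x02")
first_peek = buf.peek_uint16()   # position unchanged
second_peek = buf.peek_uint16()  # same value again
consumed = buf.get_uint16()      # now the position moves forward
```

A peek is useful for a length-prefixed stream because the length can be inspected before deciding whether the whole message has arrived.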
Ondřej Surý	a1d45685e6	Move and extend the uint8_t low-endian to uint{32,64}t to endian.h Move the U8TO{32,64}_LE and U{32,64}TO8_LE macros to endian.h and extend the macros for 16-bit and Big-Endian variants. Use the macros both in isc_siphash (LE) and isc_buffer (BE) units.	2022-12-20 19:13:48 +01:00
Ondřej Surý	aea251f3bc	Change the isc_buffer_reserve() to take just buffer pointer The isc_buffer_reserve() would be passed a reference to the buffer pointer, which was unnecessary as the pointer would never be changed in the current implementation. Remove the extra dereference.	2022-12-20 19:13:48 +01:00
Ondřej Surý	52307f8116	Add internal logging functions to the netmgr Add internal logging functions isc__netmgr_log, isc__nmsocket_log(), and isc__nmhandle_log() that can be used to add logging messages to the netmgr, and change all direct use of isc_log_write() to use those logging functions to properly prefix them with netmgr, nmsocket and nmsocket+nmhandle.	2022-12-14 19:34:48 +01:00
Ondřej Surý	7cefcb6184	Allow zero length keys in isc_hashmap In case, we are trying to hash the empty key into the hashmap, the key is going to have zero length. This might happen in the unit test. Allow this and add a unit test to ensure the empty zero-length key doesn't hash to slot 0 as SipHash 2-4 (our hash function of choice) has no problem with zero-length inputs.	2022-12-14 17:59:07 +01:00
Artem Boldariev	837fef78b1	Fix TLS session resumption via IDs when Mutual TLS is used This commit fixes TLS session resumption via session IDs when client certificates are used. To do so it makes sure that session ID contexts are set within server TLS contexts. See OpenSSL documentation for 'SSL_CTX_set_session_id_context()', the "Warnings" section.	2022-12-14 18:06:20 +02:00
Ondřej Surý	e2262c2112	Remove isc_resource API and set limits directly in named_os unit The only function left in the isc_resource API was setting the file limit. Replace the whole unit with a simple getrlimit to check the maximum value of RLIMIT_NOFILE and set the maximum back to rlimit_cur. This is more compatible than trying to set RLIMIT_UNLIMITED on the RLIMIT_NOFILE as it doesn't work on Linux (see man 5 proc on /proc/sys/fs/nr_open), nor does it work on the Darwin kernel (see man 2 getrlimit). The only place where the maximum value could be raised under privileged user would be BSDs, but `named_os_adjustnofile()` was not called there before. We would apply the increased limits only on Linux and Sun platforms.	2022-12-07 19:40:00 +01:00
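The approach described above (query RLIMIT_NOFILE and raise the current limit to the existing maximum instead of requesting an unlimited value) can be sketched with Python's standard `resource` module. The function name `raise_nofile_limit` is hypothetical, and swallowing errors is an assumption made to keep the sketch portable; it says nothing about how named_os handles failures.

```python
import resource


def raise_nofile_limit() -> int:
    """Raise the soft RLIMIT_NOFILE to the hard limit; return the soft limit in effect."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft != hard:
        try:
            # Only the soft limit changes; the hard limit is left as found.
            resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))
        except (ValueError, OSError):
            # Some kernels reject the hard value as a soft limit; keep the old one.
            pass
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]


new_soft = raise_nofile_limit()
```

Staying within the existing hard limit avoids the platform-specific failures that the commit message attributes to requesting an unlimited value.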
Artem Boldariev	bed5e2bb08	TLS: check for sock->recv_cb when handling received data This commit adds a check if 'sock->recv_cb' might have been nullified during the call to 'sock->recv_cb'. That could happen, e.g. by an indirect call to 'isc_nmhandle_close()' from within the callback when wrapping up. In this case, let's close the TLS connection.	2022-12-02 13:20:37 +02:00
Artem Boldariev	8b7e123528	DoH: Avoid accessing non-atomic listener socket flags when accepting This commit ensures that the non-atomic flags inside a DoH listener socket object (and associated worker) are accessed when doing accept for a connection only from within the context of the dedicated thread, but not other worker threads. The purpose of this commit is to avoid TSAN errors during isc__nmsocket_closing() calls. It is a continuation of `4b5559cd8f`.	2022-12-02 12:16:12 +02:00
Artem Boldariev	4d0c226375	TLS: Avoid accessing non-atomic listener socket flags during HS This commit ensures that the non-atomic flags inside a TLS listener socket object (and associated worker) are accessed when doing handshake for a connection only from within the context of the dedicated thread, but not other worker threads. The purpose of this commit is to avoid TSAN errors during isc__nmsocket_closing() calls. It is a continuation of `4b5559cd8f`.	2022-12-02 12:16:12 +02:00
Artem Boldariev	4b5559cd8f	TLS: Avoid accessing listener socket flags from other threads This commit ensures that the flags inside a TLS listener socket object (and associated worker) are accessed when accepting a connection only from within the context of the dedicated thread, but not other worker threads.	2022-12-01 21:07:49 +02:00
Ondřej Surý	e3c628d562	Honour single read per client isc_nm_read() call in the TLSDNS The TLSDNS transport was not honouring the single read callback for TLSDNS client. It would call the read callbacks repeatedly in case the single TLS read would result in multiple DNS messages in the decoded buffer.	2022-12-01 18:31:05 +01:00
Artem Boldariev	2bfc079946	TLS stream: always handle send callbacks asynchronously This commit ensures that send callbacks are always called from within the context of the worker thread, even in the case of a shutting down or inactive socket, just like the TCP transport does, with which TLS attempts to be as compatible as possible.	2022-11-30 18:09:52 +02:00
Artem Boldariev	ef659365ce	TLS Stream: use ISC_R_CANCELLED error when shutting down This commit changes ISC_R_NOTCONNECTED error code to ISC_R_CANCELLED when attempting to start reading data on the shutting down socket in order to make its behaviour compatible with that of TCP and not break the common code in the unit tests.	2022-11-30 18:09:52 +02:00
Artem Boldariev	fb9955a372	TLS Stream: fix isc_nm_read_stop() and reading flags handling It turned out that after the latest Network Manager refactoring the 'sock->reading' flag was not processed correctly. Due to this, isc_nm_read_stop() might not work as expected because reading from the underlying TCP socket could have been resumed in 'tls_do_bio()' regardless of the 'sock->reading' value. This bug did not seem to cause problems with DoH, so it was not noticed, but Stream DNS has stricter expectations regarding the underlying transport. In addition to the above, the 'sock->recv_read' flag was completely ignored and the corresponding logic was completely unimplemented. That made it impossible to implement one fine detail of TCP behaviour: once reading is started, it can be satisfied by reading a single datum. This commit fixes the issues above.	2022-11-30 18:09:52 +02:00
Ondřej Surý	50f357cb36	Refactor the dns_adb unit The dns_adb unit has been refactored to be much simpler. The following changes have been made: 1. Simplify the ADB to always allow GLUE and hints There were only two places where dns_adb_createfind() was used - in the dns_resolver unit where hints and GLUE addresses were ok, and in the dns_zone where dns_adb_createfind() would be called without DNS_ADBFIND_HINTOK and DNS_ADBFIND_GLUEOK set. Simplify the logic by allowing hint and GLUE addresses when looking up the nameserver addresses to notify. The difference is negligible and would cause a difference in the notified addresses only when there's a mismatch between the parent and child addresses and we haven't cached the child addresses yet. 2. Drop the namebuckets and entrybuckets Formerly, the namebuckets and entrybuckets were used to reduce the lock contention when accessing the double-linked lists stored in each bucket. In the previous refactoring, the custom hashtable for the buckets has been replaced with isc_ht/isc_hashmap, so only a single item (mostly, see below) would end up in each bucket. Removing the entrybuckets has been straightforward, the only matching was done on the isc_sockaddr_t member of the dns_adbentry. Removing the zonebuckets required GLUEOK and HINTOK bits to be removed because the find could match entries with-or-without the bits set, and creating a custom key that stores the DNS_ADBFIND_STARTATZONE in the first byte of the key, so we can do a straightforward lookup into the hashtable without traversing a list that contains items with different flags. 3. Remove unassociated entries from ADB database Previously, the adbentries could live in the ADB database even after unlinking them from dns_adbnames. Such entries would show up as "Unassociated entries" in the ADB dump. The benefit of keeping such entries is small - the chance that we link such an entry to an adbname is small, and it's simpler to evict unlinked entries from the ADB cache (and the hashtable) than to create a second LRU cleaning mechanism. Unlinked ADB entries are now directly deleted from the hash table (hashmap) upon destruction. 4. Cleanup expired entries from the hash table When buckets were still in place, the code would keep the buckets always allocated and never shrink the hash table (hashmap). With proper reference counting in place, we can delete the adbnames from the hash table and the LRU list. 5. Stop purging the names early when we hit the time limit Because the LRU list is now time ordered, we can stop purging the names when we find the first entry that doesn't fulfil our time-based eviction criteria because no further entry on the LRU list will meet the criteria. Future work: 1. Lock contention In this commit, the focus was on correctness of the data structure, but in the future, the lock contention in the ADB database needs to be addressed. Currently, we use a simple mutex to lock the hash tables, because we almost always need to use a write lock for properly purging the hashtables. The ADB database needs to be sharded (similar to the effect that buckets had in the past). Each shard would contain its own hashmap and its own LRU list. 2. Time-based purging The ADB names and entries stay intact when there are no lookups. When we add separate shards, a timer needs to be added for time-based cleaning in case there's no traffic hashing to the inactive shard. 3. Revisit the 30 minutes limit The ADB cache is capped at 30 minutes. This needs to be revisited, and at least the limit should be configurable (in both directions).	2022-11-30 10:03:24 +01:00
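Point 5 above (stop purging at the first entry that is not yet old enough, because the LRU list is time ordered) can be sketched as follows. This is an illustrative Python sketch, not the dns_adb code: the class `NameCache`, its methods and the numeric age threshold are all hypothetical, and it assumes `touch()` is called with non-decreasing timestamps so the list stays ordered.

```python
from collections import OrderedDict


class NameCache:
    """Time-ordered LRU: the least recently used name is always first."""

    def __init__(self, max_age: float) -> None:
        self._max_age = max_age
        self._lru = OrderedDict()  # name -> last-used time, oldest first

    def touch(self, name: str, now: float) -> None:
        """Record a use of name and move it to the most-recent end."""
        self._lru[name] = now
        self._lru.move_to_end(name)

    def purge(self, now: float) -> list:
        """Evict expired names, stopping at the first one that is still fresh."""
        purged = []
        while self._lru:
            name, last_used = next(iter(self._lru.items()))
            if now - last_used < self._max_age:
                break  # every later entry was used more recently
            self._lru.popitem(last=False)
            purged.append(name)
        return purged

    def names(self) -> list:
        return list(self._lru)


cache = NameCache(max_age=30)
cache.touch("a.example", 0)
cache.touch("b.example", 10)
cache.touch("c.example", 25)
cache.touch("a.example", 28)      # "a.example" becomes the most recent entry
purged = cache.purge(now=45)      # only "b.example" is 30 or more units old
```

The early `break` is what keeps the purge cost proportional to the number of expired entries rather than to the size of the cache.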
Ondřej Surý	118ae66976	Add extra set of ISC_REFCOUNT_TRACE_{IMPL,DECL} macros The new ISC_REFCOUNT_TRACE_{IMPL,DECL} macros can be used to add a reference tracing capability to any unit using the reference counting. It requires a little bit of extra work in each header as you can't have a define from inside a define (see rpz.h), but it's fairly easy to add tracing to any struct using reference counting with these macros.	2022-11-29 23:57:40 -08:00
Artem Boldariev	9b1c8c03fd	TCP: use uv_try_write() to optimise sends This commit makes the TCP code use uv_try_write() on a best effort basis, just like the TCP DNS and TLS DNS code does. This optimisation was added in 'caa5b6548a11da6ca772d6f7e10db3a164a18f8d', but a similar change was mistakenly omitted for the generic TCP code. This commit fixes that.	2022-11-29 13:41:10 +02:00
Michal Nowak	afdb41a5aa	Update sources to Clang 15 formatting	2022-11-29 08:54:34 +01:00
Ondřej Surý	d8df29e37d	Be more resilient when destroying the httpd requests Don't restart reading in the send callback after the httpdmgr has been shut down, and call httpd_request(..., ISC_R_SHUTDOWN, ...) when shutting down the httpdmgr to reduce code duplication.	2022-11-25 16:20:34 +01:00
Ondřej Surý	f3004da3a5	Make the netmgr send callback to be asynchronous only when needed Previously, the send callback would be synchronous only on success. Add an option (similar to what other callbacks have) to decide whether we need the asynchronous send callback on a higher level. On a general level, we need the asynchronous callbacks to happen only when we are invoking the callback from the public API. If the path to the callback went through the libuv callback or netmgr callback, we are already on asynchronous path, and there's no need to make the call to the callback asynchronous again. For the send callback, this means we need the asynchronous path for failure paths inside the isc_nm_send() (which calls isc__nm_udp_send(), isc__nm_tcp_send(), etc...) - all other invocations of the send callback could be synchronous, because those are called from the respective libuv send callbacks.	2022-11-25 15:46:25 +01:00

4609 commits