Merge tag 'v9.20.5' into bind-9.20

This commit is contained in:
Andoni Duarte Pintado 2025-01-29 17:21:44 +01:00
commit 2d0323e006
19 changed files with 870 additions and 65 deletions

View file

@ -279,7 +279,7 @@ n=$((n + 1))
echo_i "testing with 'minimal-any no;' ($n)"
ret=0
$DIG $DIGOPTS -t ANY www.rt.example @10.53.0.1 >dig.out.$n || ret=1
grep "ANSWER: 3, AUTHORITY: 2, ADDITIONAL: 2" dig.out.$n >/dev/null || ret=1
grep "ANSWER: 3, AUTHORITY: 2, ADDITIONAL: 1" dig.out.$n >/dev/null || ret=1
if [ $ret -eq 1 ]; then
echo_i "failed"
status=$((status + 1))

View file

@ -1,12 +0,0 @@
Copyright (C) Internet Systems Consortium, Inc. ("ISC")
SPDX-License-Identifier: MPL-2.0
This Source Code Form is subject to the terms of the Mozilla Public
License, v. 2.0. If a copy of the MPL was not distributed with this
file, you can obtain one at https://mozilla.org/MPL/2.0/.
See the COPYRIGHT file distributed with this work for additional
information regarding copyright ownership.
Add -T noaa.

View file

@ -221,6 +221,10 @@ done
if [ $ret != 0 ]; then echo_i "failed"; fi
status=$((status + ret))
stop_server ns4
touch ns4/named.noaa
start_server --noclean --restart --port ${PORT} ns4 || ret=1
n=$((n + 1))
echo_i "RT21594 regression test check setup ($n)"
ret=0
@ -257,6 +261,10 @@ grep "status: NXDOMAIN" dig.ns5.out.${n} >/dev/null || ret=1
if [ $ret != 0 ]; then echo_i "failed"; fi
status=$((status + ret))
stop_server ns4
rm ns4/named.noaa
start_server --noclean --restart --port ${PORT} ns4 || ret=1
n=$((n + 1))
echo_i "check that replacement of additional data by a negative cache no data entry clears the additional RRSIGs ($n)"
ret=0

View file

@ -99,7 +99,7 @@ from a primary server, the secondary checks to see that its version of
the zone is the current version and, if not, initiates a zone transfer.
For more information about DNS NOTIFY, see the description of the
:namedconf:ref:`notify` and :namedconf:ref`also-notify` statements.
:namedconf:ref:`notify` and :namedconf:ref:`also-notify` statements.
The NOTIFY protocol is specified in :rfc:`1996`.
.. note::

View file

@ -18,6 +18,7 @@ Changelog
development. Regular users should refer to :ref:`Release Notes <relnotes>`
for changes relevant to them.
.. include:: ../changelog/changelog-9.20.5.rst
.. include:: ../changelog/changelog-9.20.4.rst
.. include:: ../changelog/changelog-9.20.3.rst
.. include:: ../changelog/changelog-9.20.2.rst

View file

@ -45,6 +45,7 @@ The list of known issues affecting the latest version in the 9.20 branch can be
found at
https://gitlab.isc.org/isc-projects/bind9/-/wikis/Known-Issues-in-BIND-9.20
.. include:: ../notes/notes-9.20.5.rst
.. include:: ../notes/notes-9.20.4.rst
.. include:: ../notes/notes-9.20.3.rst
.. include:: ../notes/notes-9.20.2.rst

View file

@ -0,0 +1,238 @@
.. Copyright (C) Internet Systems Consortium, Inc. ("ISC")
..
.. SPDX-License-Identifier: MPL-2.0
..
.. This Source Code Form is subject to the terms of the Mozilla Public
.. License, v. 2.0. If a copy of the MPL was not distributed with this
.. file, you can obtain one at https://mozilla.org/MPL/2.0/.
..
.. See the COPYRIGHT file distributed with this work for additional
.. information regarding copyright ownership.
BIND 9.20.5
-----------
Security Fixes
~~~~~~~~~~~~~~
- [CVE-2024-12705] DNS-over-HTTP(s) flooding fixes. ``51900adf29c``
Fix DNS-over-HTTP(S) implementation issues that arise under heavy
query load. Optimize resource usage for :iscman:`named` instances that
accept queries over DNS-over-HTTP(S).
Previously, :iscman:`named` would process all incoming HTTP/2 data at
once, which could overwhelm the server, especially when dealing with
clients that send requests but don't wait for responses. That has been
fixed. Now, :iscman:`named` handles HTTP/2 data in smaller chunks and
throttles reading until the remote side reads the response data. It
also throttles clients that send too many requests at once.
Additionally, :iscman:`named` now carefully processes data sent by
some clients, which can be considered "flooding." It logs these
clients and drops connections from them. :gl:`#4795`
In some cases, :iscman:`named` could leave DNS-over-HTTP(S)
connections in the `CLOSE_WAIT` state indefinitely. That also has been
fixed. ISC would like to thank JF Billaud for thoroughly investigating
the issue and verifying the fix. :gl:`#5083` :gl:`#4795` :gl:`#5083`
- [CVE-2024-11187] Limit the additional processing for large RDATA sets.
``4d3d17c344f``
When answering queries, don't add data to the additional section if
the answer has more than 13 names in the RDATA. This limits the number
of lookups into the database(s) during a single client query, reducing
query processing load. :gl:`#5034`
New Features
~~~~~~~~~~~~
- Add Extended DNS Error Code 22 - No Reachable Authority.
``ee77a192091``
When the resolver is trying to query an authority server and
eventually timed out, a SERVFAIL answer is given to the client. Add
the Extended DNS Error Code 22 - No Reachable Authority to the
response. :gl:`#2268` :gl:`!9814`
- Add a new option to configure the maximum number of outgoing queries
per client request. ``844a5310532``
The configuration option 'max-query-count' sets how many outgoing
queries per client request is allowed. The existing
'max-recursion-queries' is the number of permissible queries for a
single name and is reset on every CNAME redirection. This new option
is a global limit on the client request. The default is 200.
This allows us to send a bit more queries while looking up a single
name. The default for 'max-recursion-queries' is changed from 32 to
50. :gl:`#4980` :gl:`#4921` :gl:`!9832`
Removed Features
~~~~~~~~~~~~~~~~
- Drop single-use RETERR macro. ``87f70696c87``
If the RETERR define is only used once in a file, just drop the macro.
:gl:`!9885`
Feature Changes
~~~~~~~~~~~~~~~
- Update picohttpparser.{c,h} with upstream repository. ``3c9657a3f48``
:gl:`#4485` :gl:`!9863`
- The configuration clauses parental-agents and primaries are renamed to
remote-servers. ``b483cd4638c``
The top blocks 'primaries' and 'parental-agents' are no longer
preferred and should be renamed to 'remote-servers'. The zone
statements 'parental-agents' and 'primaries' are still used, and may
refer to any 'remote-servers' top block. :gl:`#4544` :gl:`!9911`
- Add none parameter to query-source and query-source-v6 to disable IPv4
or IPv6 upstream queries. ``e260eb39c56``
Add a none parameter to named configuration option `query-source`
(respectively `query-source-v6`) which forbid usage of IPv4
(respectively IPv6) addresses when named is doing an upstream query.
:gl:`#4981` Turning-off upstream IPv6 queries while still listening to
downstream queries on IPv6. :gl:`!9727` :gl:`!9775`
- Optimize memory layout of core structs. ``67fa22a7746``
Reduce memory footprint by: - Reordering struct fields to minimize
padding. - Using exact-sized atomic types instead of
`*_least`/`*_fast` variants - Downsizing integer fields where possible
Affected structs: - dns_name_t - dns_slabheader_t - dns_rdata_t -
qpcnode_t - qpznode_t :gl:`#5022` :gl:`!9793`
- Revert "Fix NSEC3 closest encloser lookup for names with empty
non-terminals" ``993cb761489``
Revert the fix for #4950 for 9.20.
This reverts MR !9438.
History: A performance improvement for NSEC3 closest encloser lookups
(#4460) was introduced (in MR !9436) and backported to 9.20 (MR !9438)
and to 9.18 in (MR !9439). It was released in 9.18.30 (and 9.20.2 and
9.21.1).
There was a bug in the code (#4950), so we reverted the change in
!9611, !9613 and !9614.
Then a new attempt was merged in main (MR !9610) and backported to
9.20 (MR !9631) and 9.18 (MR !9632). The latter should not have been
backported and was reverted in !9689.
We now also revert the fix for 9.20 :gl:`#5108` :gl:`!9947`
- Add TLS SNI extension to all outgoing TLS connections. ``b14148ac897``
:gl:`!9933`
- Remove unused maxquerycount. ``d61bfeb91e0``
Related to #4980 :gl:`!9853`
- Use query counters in validator code. ``d91835160a2``
Commit af7db8951364a89c468eda1535efb3f53adc2c1f as part of #4141 was
supposed to apply the 'max-recursion-queries' quota to validator
queries, but the counter was never actually passed on to
'dns_resolver_createfetch()'. This has been fixed, and the global
query counter ('max-query-count', per client request) is now also
added.
Related to #4980 :gl:`!9866`
Bug Fixes
~~~~~~~~~
- Fix nsupdate hang when processing a large update. ``4ca7a5d6011``
To mitigate DNS flood attacks over a single TCP connection, we
throttle the connection when the other side does not read the data.
Throttling should only occur on server-side sockets, but erroneously
also happened for nsupdate, which acts as a client. When nsupdate
started throttling the connection, it never attempts to read again.
This has been fixed. :gl:`#4910` :gl:`!9834`
- Lock and attach when returning zone stats. ``79e6519168e``
When returning zone statistics counters, the statistics sets are now
attached while the zone is locked. This addresses Coverity warnings
CID 468720, 468728 and 468729. :gl:`#4934` :gl:`!9843`
- Fix possible assertion failure when reloading server while processing
updates. ``41af766cd08``
:gl:`#5006` :gl:`!9820`
- Preserve cache across reconfig when using attach-cache.
``826dfa006e2``
When the `attach-cache` option is used in the `options` block with an
arbitrary name, it causes all views to use the same cache. Previously,
this configuration caused the cache to be deleted and a new cache
created every time the server was reconfigured. This has been fixed.
:gl:`#5061` :gl:`!9862`
- Resolve the spurious drops in performance due GLUE cache.
``eb3c66304f3``
For performance reasons, the returned GLUE records are cached on the
first use. The current implementation could randomly cause a
performance drop and increased memory use. This has been fixed.
:gl:`#5064` :gl:`!9918`
- Fix dnssec-signzone signing non-DNSKEY RRsets with revoked keys.
``c577c3b544d``
`dnssec-signzone` was using revoked keys for signing RRsets other than
DNSKEY. This has been corrected. :gl:`#5070` :gl:`!9840`
- Revert "Lock and attach when returning zone stats" ``d954d9c20b9``
:gl:`#5082` :gl:`!9860`
- Unknown directive in resolv.conf not handled properly. ``7738fd28c91``
The line after an unknown directive in resolv.conf could accidentally
be skipped, potentially affecting dig, host, nslookup, nsupdate, or
delv. This has been fixed. :gl:`#5084` :gl:`!9877`
- Fix response policy zones and catalog zones with an $INCLUDE statement
defined. ``cc0cbbe697c``
Response policy zones (RPZ) and catalog zones were not working
correctly if they had an $INCLUDE statement defined. This has been
fixed. :gl:`#5111` :gl:`!9941`
- Finalize removal of memory debug flags size and mctx. ``31918336e8a``
Commit 4b3d0c66009d30f5c0bc12ee128fc59f1d853f44 has removed them, but
did not remove few traces in documentation and help. Remove them from
remaining places. :gl:`!9842`
- Fix m4 macro in configure.ac. ``ae739c80ccb``
:gl:`!9813`
- Mark loop as shuttingdown earlier in shutdown_cb. ``fed5e55e339``
:gl:`!9891`
- Use CMM_{STORE,LOAD}_SHARED to store/load glue in gluelist.
``fa7443d3fd2``
ThreadSanitizer has trouble understanding that gluelist->glue is
constant after it is assigned to the slabheader with cmpxchg. Help
ThreadSanitizer to understand the code by using CMM_STORE_SHARED and
CMM_LOAD_SHARED on gluelist->glue. :gl:`!9936`

149
doc/notes/notes-9.20.5.rst Normal file
View file

@ -0,0 +1,149 @@
.. Copyright (C) Internet Systems Consortium, Inc. ("ISC")
..
.. SPDX-License-Identifier: MPL-2.0
..
.. This Source Code Form is subject to the terms of the Mozilla Public
.. License, v. 2.0. If a copy of the MPL was not distributed with this
.. file, you can obtain one at https://mozilla.org/MPL/2.0/.
..
.. See the COPYRIGHT file distributed with this work for additional
.. information regarding copyright ownership.
Notes for BIND 9.20.5
---------------------
Security Fixes
~~~~~~~~~~~~~~
- DNS-over-HTTPS flooding fixes. :cve:`2024-12705`
Fix DNS-over-HTTPS implementation issues that arise under heavy
query load. Optimize resource usage for :iscman:`named` instances that
accept queries over DNS-over-HTTPS.
Previously, :iscman:`named` processed all incoming HTTP/2 data at
once, which could overwhelm the server, especially when dealing with
clients that sent requests but did not wait for responses. That has been
fixed. Now, :iscman:`named` handles HTTP/2 data in smaller chunks and
throttles reading until the remote side reads the response data. It
also throttles clients that send too many requests at once.
In addition, :iscman:`named` now evaluates excessive streams opened by
clients that include no DNS data, which is considered "flooding." It
logs these clients and drops connections from them. :gl:`#4795`
In some cases, :iscman:`named` could leave DNS-over-HTTPS
connections in the `CLOSE_WAIT` state indefinitely. That has also been
fixed. :gl:`#5083`
ISC would like to thank Jean-François Billaud for his assistance with
investigating this issue.
- Limit additional section processing for large RDATA sets.
:cve:`2024-11187`
When answering queries, don't add data to the additional section if
the answer has more than 13 names in the RDATA. This limits the number
of lookups into the database(s) during a single client query, reducing
the query-processing load. :gl:`#5034`
ISC would like to thank Toshifumi Sakaguchi for bringing this
vulnerability to our attention.
New Features
~~~~~~~~~~~~
- Add Extended DNS Error Code 22 - No Reachable Authority.
When the resolver is trying to query an authoritative server and
eventually times out, a SERVFAIL answer is given to the client. Add
the Extended DNS Error Code 22 - No Reachable Authority to the
response. :gl:`#2268`
- Add a new option to configure the maximum number of outgoing queries
per client request.
The configuration option :any:`max-query-count` sets how many outgoing
queries per client request are allowed. The existing
:any:`max-recursion-queries` value is the number of permissible queries for a
single name and is reset on every CNAME redirection. This new option
is a global limit on the client request. The default is 200.
The default for :any:`max-recursion-queries` is changed from 32 to
50. This allows :any:`named` to send a few more queries
while looking up a single name. :gl:`#4980` :gl:`#4921`
- Use the Server Name Indication (SNI) extension for all outgoing TLS
connections.
This improves compatibility with other DNS server software.
:gl:`#5099`
Feature Changes
~~~~~~~~~~~~~~~
- Performance optimization for NSEC3 lookups introduced in BIND 9.20.2 was
reverted to avoid risks associated with a complex code change. :gl:`#5108`
- The configuration clauses ``parental-agents`` and ``primaries`` are renamed to
:any:`remote-servers`.
The top blocks ``primaries`` and ``parental-agents`` are no longer
preferred and should be renamed to :any:`remote-servers`. The zone
statements :any:`parental-agents` and :any:`primaries` are still used, and may
refer to any :any:`remote-servers` top block. :gl:`#4544`
- Add `none` parameter to :namedconf:ref:`query-source` and
:namedconf:ref:`query-source-v6` to disable IPv4 or IPv6 upstream queries but
allow listening to queries from clients on IPv4 or IPv6. :gl:`#4981`
Bug Fixes
~~~~~~~~~
- Fix :iscman:`nsupdate` hang when processing a large update.
To mitigate DNS flood attacks over a single TCP connection, throttle
the connection when the other side does not read the data. Throttling
should only occur on server-side sockets, but erroneously also
happened for :iscman:`nsupdate`, which acts as a client. When
:iscman:`nsupdate` started throttling the connection, it never
attempted to read again. This has been fixed. :gl:`#4910`
- Fix possible assertion failure when reloading server while processing
update policy rules. :gl:`#5006`
- Preserve cache across reconfig when using :any:`attach-cache`.
When the :any:`attach-cache` option is used in the ``options`` block with an
arbitrary name, it causes all views to use the same cache. Previously,
this configuration caused the cache to be deleted and a new cache
to be created every time the server was reconfigured. This has been fixed.
:gl:`#5061`
- Resolve the spurious drops in performance due to glue cache.
For performance reasons, the returned glue records are cached on the
first use. The current implementation could randomly cause a
performance drop and increased memory use. This has been fixed.
:gl:`#5064`
- Fix :iscman:`dnssec-signzone` signing non-DNSKEY RRsets with revoked keys.
:any:`dnssec-signzone` was using revoked keys for signing RRsets other than
DNSKEY. This has been corrected. :gl:`#5070`
- Fix improper handling of unknown directives in ``resolv.conf``.
The line after an unknown directive in ``resolv.conf`` could accidentally be
skipped, potentially affecting :iscman:`dig`, :iscman:`host`,
:iscman:`nslookup`, :iscman:`nsupdate`, or :iscman:`delv`. This has been
fixed. :gl:`#5084`
- Fix response policy zones and catalog zones with an ``$INCLUDE`` statement
defined.
Response policy zones (RPZ) and catalog zones were not working
correctly if they had an ``$INCLUDE`` statement defined. This has been
fixed. :gl:`#5111`

View file

@ -1401,7 +1401,7 @@ create_gluelist(dns_db_t *db, dns_dbversion_t *dbversion, dns_dbnode_t *dbnode,
* in-bailiwick).
*/
(void)dns_rdataset_additionaldata(rdataset, dns_rootname, add, &ctx);
(void)dns_rdataset_additionaldata(rdataset, dns_rootname, add, &ctx, 0);
CMM_STORE_SHARED(gluelist->glue, ctx.glue);

View file

@ -54,6 +54,8 @@
#include <dns/rdatastruct.h>
#include <dns/types.h>
#define DNS_RDATASET_MAXADDITIONAL 13
/* Fixed RRSet helper macros */
#define DNS_RDATASET_LENGTH 2;
@ -535,7 +537,8 @@ dns_rdataset_towirepartial(dns_rdataset_t *rdataset,
isc_result_t
dns_rdataset_additionaldata(dns_rdataset_t *rdataset,
const dns_name_t *owner_name,
dns_additionaldatafunc_t add, void *arg);
dns_additionaldatafunc_t add, void *arg,
size_t limit);
/*%<
* For each rdata in rdataset, call 'add' for each name and type in the
* rdata which is subject to additional section processing.
@ -554,10 +557,15 @@ dns_rdataset_additionaldata(dns_rdataset_t *rdataset,
*\li If a call to dns_rdata_additionaldata() is not successful, the
* result returned will be the result of dns_rdataset_additionaldata().
*
*\li If 'limit' is non-zero and the number of the rdatasets is larger
* than 'limit', no additional data will be processed.
*
* Returns:
*
*\li #ISC_R_SUCCESS
*
*\li #DNS_R_TOOMANYRECORDS in case rdataset count is larger than 'limit'
*
*\li Any error that dns_rdata_additionaldata() can return.
*/

View file

@ -507,7 +507,8 @@ dns_rdataset_towire(dns_rdataset_t *rdataset, const dns_name_t *owner_name,
isc_result_t
dns_rdataset_additionaldata(dns_rdataset_t *rdataset,
const dns_name_t *owner_name,
dns_additionaldatafunc_t add, void *arg) {
dns_additionaldatafunc_t add, void *arg,
size_t limit) {
dns_rdata_t rdata = DNS_RDATA_INIT;
isc_result_t result;
@ -519,6 +520,10 @@ dns_rdataset_additionaldata(dns_rdataset_t *rdataset,
REQUIRE(DNS_RDATASET_VALID(rdataset));
REQUIRE((rdataset->attributes & DNS_RDATASETATTR_QUESTION) == 0);
if (limit != 0 && dns_rdataset_count(rdataset) > limit) {
return DNS_R_TOOMANYRECORDS;
}
result = dns_rdataset_first(rdataset);
if (result != ISC_R_SUCCESS) {
return result;

View file

@ -8584,7 +8584,7 @@ rctx_answer_any(respctx_t *rctx) {
rdataset->trust = rctx->trust;
(void)dns_rdataset_additionaldata(rdataset, rctx->aname,
check_related, rctx);
check_related, rctx, 0);
}
return ISC_R_SUCCESS;
@ -8632,7 +8632,7 @@ rctx_answer_match(respctx_t *rctx) {
rctx->ardataset->attributes |= DNS_RDATASETATTR_CACHE;
rctx->ardataset->trust = rctx->trust;
(void)dns_rdataset_additionaldata(rctx->ardataset, rctx->aname,
check_related, rctx);
check_related, rctx, 0);
for (sigrdataset = ISC_LIST_HEAD(rctx->aname->list);
sigrdataset != NULL;
@ -8839,7 +8839,7 @@ rctx_authority_positive(respctx_t *rctx) {
*/
(void)dns_rdataset_additionaldata(
rdataset, name, check_related,
rctx);
rctx, 0);
done = true;
}
}
@ -9346,8 +9346,12 @@ rctx_referral(respctx_t *rctx) {
*/
INSIST(rctx->ns_rdataset != NULL);
FCTX_ATTR_SET(fctx, FCTX_ATTR_GLUING);
/*
* Mark the glue records in the additional section to be cached.
*/
(void)dns_rdataset_additionaldata(rctx->ns_rdataset, rctx->ns_name,
check_related, rctx);
check_related, rctx, 0);
#if CHECK_FOR_GLUE_IN_ANSWER
/*
* Look in the answer section for "glue" that is incorrectly
@ -9359,8 +9363,9 @@ rctx_referral(respctx_t *rctx) {
if (rctx->glue_in_answer &&
(fctx->type == dns_rdatatype_aaaa || fctx->type == dns_rdatatype_a))
{
(void)dns_rdataset_additionaldata(
rctx->ns_rdataset, rctx->ns_name, check_answer, fctx);
(void)dns_rdataset_additionaldata(rctx->ns_rdataset,
rctx->ns_name, check_answer,
fctx, 0);
}
#endif /* if CHECK_FOR_GLUE_IN_ANSWER */
FCTX_ATTR_CLR(fctx, FCTX_ATTR_GLUING);
@ -9462,7 +9467,7 @@ again:
if (CHASE(rdataset)) {
rdataset->attributes &= ~DNS_RDATASETATTR_CHASE;
(void)dns_rdataset_additionaldata(
rdataset, name, check_related, rctx);
rdataset, name, check_related, rctx, 0);
rescan = true;
}
}

View file

@ -85,6 +85,37 @@
#define INITIAL_DNS_MESSAGE_BUFFER_SIZE (512)
/*
* The value should be small enough to not allow a server to open too
* many streams at once. It should not be too small either because
* the incoming data will be split into too many chunks with each of
* them processed asynchronously.
*/
#define INCOMING_DATA_CHUNK_SIZE (256)
/*
* Often processing a chunk does not change the number of streams. In
* that case we can process more than once, but we still should have a
* hard limit on that.
*/
#define INCOMING_DATA_MAX_CHUNKS_AT_ONCE (4)
/*
* These constants define the grace period to help detect flooding clients.
*
* The first one defines how much data can be processed before opening
* a first stream and received at least some useful (=DNS) data.
*
* The second one defines how much data from a client we read before
* trying to drop a clients who sends not enough useful data.
*
* The third constant defines how many streams we agree to process
* before checking if there was at least one DNS request received.
*/
#define INCOMING_DATA_INITIAL_STREAM_SIZE (1536)
#define INCOMING_DATA_GRACE_SIZE (MAX_ALLOWED_DATA_IN_HEADERS)
#define MAX_STREAMS_BEFORE_FIRST_REQUEST (50)
typedef struct isc_nm_http_response_status {
size_t code;
size_t content_length;
@ -143,6 +174,7 @@ struct isc_nm_http_session {
ISC_LIST(http_cstream_t) cstreams;
ISC_LIST(isc_nmsocket_h2_t) sstreams;
size_t nsstreams;
uint64_t total_opened_sstreams;
isc_nmhandle_t *handle;
isc_nmhandle_t *client_httphandle;
@ -155,6 +187,18 @@ struct isc_nm_http_session {
isc__nm_http_pending_callbacks_t pending_write_callbacks;
isc_buffer_t *pending_write_data;
/*
* The statistical values below are for usage on server-side
* only. They are meant to detect clients that are taking too many
* resources from the server.
*/
uint64_t received; /* How many requests have been received. */
uint64_t submitted; /* How many responses were submitted to send */
uint64_t processed; /* How many responses were processed. */
uint64_t processed_incoming_data;
uint64_t processed_useful_data; /* DNS data */
};
typedef enum isc_http_error_responses {
@ -177,6 +221,7 @@ typedef struct isc_http_send_req {
void *cbarg;
isc_buffer_t *pending_write_data;
isc__nm_http_pending_callbacks_t pending_write_callbacks;
uint64_t submitted;
} isc_http_send_req_t;
#define HTTP_ENDPOINTS_MAGIC ISC_MAGIC('H', 'T', 'E', 'P')
@ -189,10 +234,26 @@ static bool
http_send_outgoing(isc_nm_http_session_t *session, isc_nmhandle_t *httphandle,
isc_nm_cb_t cb, void *cbarg);
static void
http_log_flooding_peer(isc_nm_http_session_t *session);
static bool
http_is_flooding_peer(isc_nm_http_session_t *session);
static ssize_t
http_process_input_data(isc_nm_http_session_t *session,
isc_buffer_t *input_data);
static inline bool
http_too_many_active_streams(isc_nm_http_session_t *session);
static void
http_do_bio(isc_nm_http_session_t *session, isc_nmhandle_t *send_httphandle,
isc_nm_cb_t send_cb, void *send_cbarg);
static void
http_do_bio_async(isc_nm_http_session_t *session);
static void
failed_httpstream_read_cb(isc_nmsocket_t *sock, isc_result_t result,
isc_nm_http_session_t *session);
@ -499,9 +560,20 @@ finish_http_session(isc_nm_http_session_t *session) {
if (!session->closed) {
session->closed = true;
session->reading = false;
isc_nm_read_stop(session->handle);
isc__nmsocket_timer_stop(session->handle->sock);
isc_nmhandle_close(session->handle);
}
/*
* Free any unprocessed incoming data in order to not process
* it during indirect calls to http_do_bio() that might happen
* when calling the failed callbacks.
*/
if (session->buf != NULL) {
isc_buffer_free(&session->buf);
}
if (session->client) {
client_call_failed_read_cb(ISC_R_UNEXPECTED, session);
} else {
@ -575,6 +647,7 @@ on_server_data_chunk_recv_callback(int32_t stream_id, const uint8_t *data,
if (new_bufsize <= MAX_DNS_MESSAGE_SIZE &&
new_bufsize <= h2->content_length)
{
session->processed_useful_data += len;
isc_buffer_putmem(&h2->rbuf, data, len);
break;
}
@ -623,6 +696,9 @@ call_unlink_cstream_readcb(http_cstream_t *cstream,
isc_buffer_usedregion(cstream->rbuf, &read_data);
cstream->read_cb(session->client_httphandle, result, &read_data,
cstream->read_cbarg);
if (result == ISC_R_SUCCESS) {
isc__nmsocket_timer_restart(session->handle->sock);
}
put_http_cstream(session->mctx, cstream);
}
@ -664,6 +740,9 @@ on_server_stream_close_callback(int32_t stream_id,
ISC_LIST_UNLINK(session->sstreams, sock->h2, link);
session->nsstreams--;
if (sock->h2->request_received) {
session->submitted++;
}
/*
* By making a call to isc__nmsocket_prep_destroy(), we ensure that
@ -975,6 +1054,181 @@ client_submit_request(isc_nm_http_session_t *session, http_cstream_t *stream) {
return ISC_R_SUCCESS;
}
static ssize_t
http_process_input_data(isc_nm_http_session_t *session,
isc_buffer_t *input_data) {
ssize_t readlen = 0;
ssize_t processed = 0;
isc_region_t chunk = { 0 };
size_t before, after;
size_t i;
REQUIRE(VALID_HTTP2_SESSION(session));
REQUIRE(input_data != NULL);
if (!http_session_active(session)) {
return 0;
}
/*
* For clients that initiate request themselves just process
* everything.
*/
if (session->client) {
isc_buffer_remainingregion(input_data, &chunk);
if (chunk.length == 0) {
return 0;
}
readlen = nghttp2_session_mem_recv(session->ngsession,
chunk.base, chunk.length);
if (readlen >= 0) {
isc_buffer_forward(input_data, readlen);
session->processed_incoming_data += readlen;
}
return readlen;
}
/*
* If no streams are created during processing, we might process
* more than one chunk at a time. Still we should not overdo that
* to avoid processing too much data at once as such behaviour is
* known for trashing the memory allocator at times.
*/
for (before = after = session->nsstreams, i = 0;
after <= before && i < INCOMING_DATA_MAX_CHUNKS_AT_ONCE;
after = session->nsstreams, i++)
{
const uint64_t active_streams =
(session->received - session->processed);
/*
* If there are non completed send requests in flight -let's
* not process any incoming data, as it could lead to piling
* up too much send data in send buffers. With many clients
* connected it can lead to excessive memory consumption on
* the server instance.
*/
if (session->sending > 0) {
break;
}
/*
* If we have reached the maximum number of streams used, we
* might stop processing for now, as nghttp2 will happily
* consume as much data as possible.
*/
if (session->nsstreams >= session->max_concurrent_streams &&
active_streams > 0)
{
break;
}
if (http_too_many_active_streams(session)) {
break;
}
isc_buffer_remainingregion(input_data, &chunk);
if (chunk.length == 0) {
break;
}
chunk.length = ISC_MIN(chunk.length, INCOMING_DATA_CHUNK_SIZE);
readlen = nghttp2_session_mem_recv(session->ngsession,
chunk.base, chunk.length);
if (readlen >= 0) {
isc_buffer_forward(input_data, readlen);
session->processed_incoming_data += readlen;
processed += readlen;
} else {
isc_buffer_clear(input_data);
return readlen;
}
}
return processed;
}
static void
http_log_flooding_peer(isc_nm_http_session_t *session) {
const int log_level = ISC_LOG_DEBUG(1);
if (session->handle != NULL && isc_log_wouldlog(isc_lctx, log_level)) {
char client_sabuf[ISC_SOCKADDR_FORMATSIZE];
char local_sabuf[ISC_SOCKADDR_FORMATSIZE];
isc_sockaddr_format(&session->handle->sock->peer, client_sabuf,
sizeof(client_sabuf));
isc_sockaddr_format(&session->handle->sock->iface, local_sabuf,
sizeof(local_sabuf));
isc__nmsocket_log(session->handle->sock, log_level,
"Dropping a flooding HTTP/2 peer "
"%s (on %s) - processed: %" PRIu64
" bytes, of them useful: %" PRIu64 "",
client_sabuf, local_sabuf,
session->processed_incoming_data,
session->processed_useful_data);
}
}
static bool
http_is_flooding_peer(isc_nm_http_session_t *session) {
if (session->client) {
return false;
}
/*
* A flooding client can try to open a lot of streams before
* submitting a request. Let's drop such clients.
*/
if (session->received == 0 &&
session->total_opened_sstreams > MAX_STREAMS_BEFORE_FIRST_REQUEST)
{
return true;
}
/*
* We have processed enough data to open at least one stream and
* get some useful data.
*/
if (session->processed_incoming_data >
INCOMING_DATA_INITIAL_STREAM_SIZE &&
(session->total_opened_sstreams == 0 ||
session->processed_useful_data == 0))
{
return true;
}
if (session->processed_incoming_data < INCOMING_DATA_GRACE_SIZE) {
return false;
}
/*
* The overhead of DoH per DNS message can be minimum 160-180
* bytes. We should allow more for extra information that can be
* included in headers, so let's use 256 bytes. Minimum DNS
* message size is 12 bytes. So, (256+12)/12=22. Even that can be
* too restricting for some edge cases, but should be good enough
* for any practical purposes. Not to mention that HTTP/2 may
* include legitimate data that is completely useless for DNS
* purposes...
*
* Anyway, at that point we should have processed enough requests
* for such clients (if any).
*/
if (session->processed_useful_data == 0 ||
(session->processed_incoming_data /
session->processed_useful_data) > 22)
{
return true;
}
return false;
}
/*
* Read callback from TLS socket.
*/
@ -984,6 +1238,7 @@ http_readcb(isc_nmhandle_t *handle ISC_ATTR_UNUSED, isc_result_t result,
isc_nm_http_session_t *session = (isc_nm_http_session_t *)data;
isc_nm_http_session_t *tmpsess = NULL;
ssize_t readlen;
isc_buffer_t input;
REQUIRE(VALID_HTTP2_SESSION(session));
@ -1001,11 +1256,17 @@ http_readcb(isc_nmhandle_t *handle ISC_ATTR_UNUSED, isc_result_t result,
goto done;
}
readlen = nghttp2_session_mem_recv(session->ngsession, region->base,
region->length);
isc_buffer_init(&input, region->base, region->length);
isc_buffer_add(&input, region->length);
readlen = http_process_input_data(session, &input);
if (readlen < 0) {
failed_read_cb(ISC_R_UNEXPECTED, session);
goto done;
} else if (http_is_flooding_peer(session)) {
http_log_flooding_peer(session);
failed_read_cb(ISC_R_RANGE, session);
goto done;
}
if ((size_t)readlen < region->length) {
@ -1017,11 +1278,12 @@ http_readcb(isc_nmhandle_t *handle ISC_ATTR_UNUSED, isc_result_t result,
isc_buffer_putmem(session->buf, region->base + readlen,
unread_size);
isc_nm_read_stop(session->handle);
http_do_bio_async(session);
} else {
/* We might have something to receive or send, do IO */
http_do_bio(session, NULL, NULL, NULL);
}
/* We might have something to receive or send, do IO */
http_do_bio(session, NULL, NULL, NULL);
done:
isc__nm_httpsession_detach(&tmpsess);
}
@ -1059,14 +1321,18 @@ http_writecb(isc_nmhandle_t *handle, isc_result_t result, void *arg) {
}
isc_buffer_free(&req->pending_write_data);
session->processed += req->submitted;
isc_mem_put(session->mctx, req, sizeof(*req));
session->sending--;
http_do_bio(session, NULL, NULL, NULL);
isc_nmhandle_detach(&transphandle);
if (result != ISC_R_SUCCESS && session->sending == 0) {
if (result == ISC_R_SUCCESS) {
http_do_bio(session, NULL, NULL, NULL);
} else {
finish_http_session(session);
}
isc_nmhandle_detach(&transphandle);
isc__nm_httpsession_detach(&session);
}
@ -1227,7 +1493,9 @@ http_send_outgoing(isc_nm_http_session_t *session, isc_nmhandle_t *httphandle,
*send = (isc_http_send_req_t){ .pending_write_data =
session->pending_write_data,
.cb = cb,
.cbarg = cbarg };
.cbarg = cbarg,
.submitted = session->submitted };
session->submitted = 0;
session->pending_write_data = NULL;
move_pending_send_callbacks(session, send);
@ -1249,6 +1517,28 @@ nothing_to_send:
return false;
}
static inline bool
http_too_many_active_streams(isc_nm_http_session_t *session) {
const uint64_t active_streams = session->received - session->processed;
const uint64_t max_active_streams =
ISC_MIN(ISC_NETMGR_MAX_STREAM_CLIENTS_PER_CONN,
session->max_concurrent_streams);
if (session->client) {
return false;
}
/*
* Do not process incoming data if there are too many active DNS
* clients (streams) per connection.
*/
if (active_streams >= max_active_streams) {
return true;
}
return false;
}
static void
http_do_bio(isc_nm_http_session_t *session, isc_nmhandle_t *send_httphandle,
isc_nm_cb_t send_cb, void *send_cbarg) {
@ -1264,42 +1554,86 @@ http_do_bio(isc_nm_http_session_t *session, isc_nmhandle_t *send_httphandle,
finish_http_session(session);
}
return;
} else if (nghttp2_session_want_read(session->ngsession) == 0 &&
nghttp2_session_want_write(session->ngsession) == 0 &&
session->pending_write_data == NULL)
{
session->closing = true;
}
if (send_cb != NULL) {
INSIST(VALID_NMHANDLE(send_httphandle));
(void)http_send_outgoing(session, send_httphandle, send_cb,
send_cbarg);
return;
}
INSIST(send_httphandle == NULL);
INSIST(send_cb == NULL);
INSIST(send_cbarg == NULL);
if (session->pending_write_data != NULL && session->sending == 0) {
(void)http_send_outgoing(session, NULL, NULL, NULL);
return;
}
if (nghttp2_session_want_read(session->ngsession) != 0) {
if (!session->reading) {
/* We have not yet started
* reading from this handle */
/* We have not yet started reading from this handle */
isc__nmsocket_timer_start(session->handle->sock);
isc_nm_read(session->handle, http_readcb, session);
session->reading = true;
} else if (session->buf != NULL) {
size_t remaining =
isc_buffer_remaininglength(session->buf);
/* Leftover data in the
* buffer, use it */
size_t readlen = nghttp2_session_mem_recv(
session->ngsession,
isc_buffer_current(session->buf), remaining);
/* Leftover data in the buffer, use it */
size_t remaining_after = 0;
ssize_t readlen = 0;
isc_nm_http_session_t *tmpsess = NULL;
if (readlen == remaining) {
/*
* Let's ensure that HTTP/2 session and its associated
* data will not go "out of scope" too early.
*/
isc__nm_httpsession_attach(session, &tmpsess);
readlen = http_process_input_data(session,
session->buf);
remaining_after =
isc_buffer_remaininglength(session->buf);
if (readlen < 0) {
failed_read_cb(ISC_R_UNEXPECTED, session);
} else if (http_is_flooding_peer(session)) {
http_log_flooding_peer(session);
failed_read_cb(ISC_R_RANGE, session);
} else if ((size_t)readlen == remaining) {
isc_buffer_free(&session->buf);
http_do_bio(session, NULL, NULL, NULL);
} else if (remaining_after > 0 &&
remaining_after < remaining)
{
/*
* We have processed a part of the data, now
* let's delay processing of whatever is left
* here. We want it to be an async operation so
* that we will:
*
* a) let other things run;
* b) have finer grained control over how much
* data is processed at once, because nghttp2
* would happily consume as much data we pass to
* it and that could overwhelm the server.
*/
http_do_bio_async(session);
} else {
isc_buffer_forward(session->buf, readlen);
(void)http_send_outgoing(session, NULL, NULL,
NULL);
}
http_do_bio(session, send_httphandle, send_cb,
send_cbarg);
isc__nm_httpsession_detach(&tmpsess);
return;
} else {
/* Resume reading, it's
* idempotent, wait for more
/*
* Resume reading, it's idempotent, wait for more
*/
isc__nmsocket_timer_start(session->handle->sock);
isc_nm_read(session->handle, http_readcb, session);
}
} else {
@ -1307,20 +1641,54 @@ http_do_bio(isc_nm_http_session_t *session, isc_nmhandle_t *send_httphandle,
isc_nm_read_stop(session->handle);
}
if (send_cb != NULL) {
INSIST(VALID_NMHANDLE(send_httphandle));
(void)http_send_outgoing(session, send_httphandle, send_cb,
send_cbarg);
} else {
INSIST(send_httphandle == NULL);
INSIST(send_cb == NULL);
INSIST(send_cbarg == NULL);
(void)http_send_outgoing(session, NULL, NULL, NULL);
/* we might have some data to send after processing */
(void)http_send_outgoing(session, NULL, NULL, NULL);
if (nghttp2_session_want_read(session->ngsession) == 0 &&
nghttp2_session_want_write(session->ngsession) == 0 &&
session->pending_write_data == NULL)
{
session->closing = true;
isc_nm_read_stop(session->handle);
if (session->sending == 0) {
finish_http_session(session);
}
}
return;
}
static void
http_do_bio_async_cb(void *arg) {
isc_nm_http_session_t *session = arg;
REQUIRE(VALID_HTTP2_SESSION(session));
if (session->handle != NULL &&
!isc__nmsocket_closing(session->handle->sock))
{
http_do_bio(session, NULL, NULL, NULL);
}
isc__nm_httpsession_detach(&session);
}
static void
http_do_bio_async(isc_nm_http_session_t *session) {
isc_nm_http_session_t *tmpsess = NULL;
REQUIRE(VALID_HTTP2_SESSION(session));
if (session->handle == NULL ||
isc__nmsocket_closing(session->handle->sock))
{
return;
}
isc__nm_httpsession_attach(session, &tmpsess);
isc_async_run(session->handle->sock->worker->loop, http_do_bio_async_cb,
tmpsess);
}
static isc_result_t
get_http_cstream(isc_nmsocket_t *sock, http_cstream_t **streamp) {
http_cstream_t *cstream = sock->h2->connect.cstream;
@ -1442,6 +1810,7 @@ transport_connect_cb(isc_nmhandle_t *handle, isc_result_t result, void *cbarg) {
}
http_transpost_tcp_nodelay(handle);
isc__nmhandle_set_manual_timer(session->handle, true);
http_call_connect_cb(http_sock, session, result);
@ -1723,6 +2092,7 @@ server_on_begin_headers_callback(nghttp2_session *ngsession,
session->nsstreams++;
isc__nm_httpsession_attach(session, &socket->h2->session);
ISC_LIST_APPEND(session->sstreams, socket->h2, link);
session->total_opened_sstreams++;
nghttp2_session_set_stream_user_data(ngsession, frame->hd.stream_id,
socket);
@ -1782,6 +2152,8 @@ server_handle_path_header(isc_nmsocket_t *socket, const uint8_t *value,
socket->worker->mctx, dns_value,
dns_value_len,
&socket->h2->query_data_len);
socket->h2->session->processed_useful_data +=
dns_value_len;
} else {
socket->h2->query_too_large = true;
return ISC_HTTP_ERROR_PAYLOAD_TOO_LARGE;
@ -2090,6 +2462,12 @@ server_call_cb(isc_nmsocket_t *socket, const isc_result_t result,
handle = isc__nmhandle_get(socket, NULL, NULL);
if (result != ISC_R_SUCCESS) {
data = NULL;
} else if (socket->h2->session->handle != NULL) {
isc__nmsocket_timer_restart(socket->h2->session->handle->sock);
}
if (result == ISC_R_SUCCESS) {
socket->h2->request_received = true;
socket->h2->session->received++;
}
socket->h2->cb(handle, result, data, socket->h2->cbarg);
isc_nmhandle_detach(&handle);
@ -2106,6 +2484,12 @@ isc__nm_http_bad_request(isc_nmhandle_t *handle) {
REQUIRE(!sock->client);
REQUIRE(VALID_HTTP2_SESSION(sock->h2->session));
if (sock->h2->response_submitted ||
!http_session_active(sock->h2->session))
{
return;
}
(void)server_send_error_response(ISC_HTTP_ERROR_BAD_REQUEST,
sock->h2->session->ngsession, sock);
}
@ -2492,6 +2876,8 @@ httplisten_acceptcb(isc_nmhandle_t *handle, isc_result_t result, void *cbarg) {
isc__nmsocket_attach(httpserver, &session->serversocket);
server_send_connection_header(session);
isc__nmhandle_set_manual_timer(session->handle, true);
/* TODO H2 */
http_do_bio(session, NULL, NULL, NULL);
return ISC_R_SUCCESS;

View file

@ -474,6 +474,7 @@ typedef struct isc_nmsocket_h2 {
isc_nm_http_endpoints_t *peer_endpoints;
bool request_received;
bool response_submitted;
struct {
char *uri;

View file

@ -121,6 +121,7 @@ proxystream_on_header_data_cb(const isc_result_t result,
* the case of TCP it is disabled by default
*/
proxystream_read_stop(sock);
isc__nmsocket_timer_stop(sock);
isc__nmhandle_set_manual_timer(sock->outerhandle, false);
sock->proxy.header_processed = true;
@ -775,6 +776,7 @@ isc__nm_proxystream_close(isc_nmsocket_t *sock) {
* external references, we can close everything.
*/
proxystream_read_stop(sock);
isc__nmsocket_timer_stop(sock);
if (sock->outerhandle != NULL) {
sock->reading = false;
isc_nm_read_stop(sock->outerhandle);

View file

@ -1009,6 +1009,7 @@ streamdns_close_direct(isc_nmsocket_t *sock) {
if (sock->outerhandle != NULL) {
sock->streamdns.reading = false;
isc__nmsocket_timer_stop(sock);
isc_nm_read_stop(sock->outerhandle);
isc_nmhandle_close(sock->outerhandle);
isc_nmhandle_detach(&sock->outerhandle);

View file

@ -722,7 +722,9 @@ isc__nm_tcp_read_stop(isc_nmhandle_t *handle) {
isc_nmsocket_t *sock = handle->sock;
isc__nmsocket_timer_stop(sock);
if (!sock->manual_read_timer) {
isc__nmsocket_timer_stop(sock);
}
isc__nm_stop_reading(sock);
sock->reading = false;

View file

@ -465,6 +465,7 @@ tls_try_handshake(isc_nmsocket_t *sock, isc_result_t *presult) {
isc__nmsocket_log_tls_session_reuse(sock, sock->tlsstream.tls);
tlshandle = isc__nmhandle_get(sock, &sock->peer, &sock->iface);
isc__nmsocket_timer_stop(sock);
tls_read_stop(sock);
if (isc__nm_closing(sock->worker)) {
@ -1154,6 +1155,10 @@ isc__nm_tls_read_stop(isc_nmhandle_t *handle) {
handle->sock->reading = false;
if (!handle->sock->manual_read_timer) {
isc__nmsocket_timer_stop(handle->sock);
}
tls_read_stop(handle->sock);
}
@ -1174,6 +1179,7 @@ isc__nm_tls_close(isc_nmsocket_t *sock) {
*/
tls_read_stop(sock);
if (sock->outerhandle != NULL) {
isc__nmsocket_timer_stop(sock);
isc_nm_read_stop(sock->outerhandle);
isc_nmhandle_close(sock->outerhandle);
isc_nmhandle_detach(&sock->outerhandle);

View file

@ -2147,7 +2147,8 @@ addname:
if (trdataset != NULL && dns_rdatatype_followadditional(type)) {
if (client->additionaldepth++ < client->view->max_restarts) {
eresult = dns_rdataset_additionaldata(
trdataset, fname, query_additional_cb, qctx);
trdataset, fname, query_additional_cb, qctx,
DNS_RDATASET_MAXADDITIONAL);
}
client->additionaldepth--;
}
@ -2245,7 +2246,7 @@ regular:
* We don't care if dns_rdataset_additionaldata() fails.
*/
(void)dns_rdataset_additionaldata(rdataset, name, query_additional_cb,
qctx);
qctx, DNS_RDATASET_MAXADDITIONAL);
CTRACE(ISC_LOG_DEBUG(3), "query_additional: done");
}
@ -2271,7 +2272,8 @@ query_addrrset(query_ctx_t *qctx, dns_name_t **namep,
* To the current response for 'client', add the answer RRset
* '*rdatasetp' and an optional signature set '*sigrdatasetp', with
* owner name '*namep', to section 'section', unless they are
* already there. Also add any pertinent additional data.
* already there. Also add any pertinent additional data, unless
* the query was for type ANY.
*
* If 'dbuf' is not NULL, then '*namep' is the name whose data is
* stored in 'dbuf'. In this case, query_addrrset() guarantees that
@ -2326,7 +2328,9 @@ query_addrrset(query_ctx_t *qctx, dns_name_t **namep,
*/
query_addtoname(mname, rdataset);
query_setorder(qctx, mname, rdataset);
query_additional(qctx, mname, rdataset);
if (qctx->qtype != dns_rdatatype_any) {
query_additional(qctx, mname, rdataset);
}
/*
* Note: we only add SIGs if we've added the type they cover, so