haproxy

mirror of https://github.com/haproxy/haproxy.git synced 2026-04-26 08:38:20 -04:00

Author	SHA1	Message	Date
Amaury Denoyelle	d85f9f9d43	BUG/MEDIUM: mux-quic: fix RESET_STREAM on send-only stream When receiving a RESET_STREAM on a send-only stream, it is mandatory to close the connection with a STREAM_STATE error. However, this was badly implemented as it caused two invocations of qcc_set_error(), which is forbidden by the mux-quic API. To fix this, rely on qcc_get_qcs() to properly detect the error. Remove qcc_set_error() usage from qcc_recv_reset_stream() instead. This must be backported up to 2.7.	2023-10-11 14:15:31 +02:00
Amaury Denoyelle	a4c59f5b9e	BUG/MINOR: quic: reject packet with no frame RFC 9000 indicates that a QUIC packet with no frame must trigger a connection closure with PROTOCOL_VIOLATION error code. Implement this via an early return inside qc_parse_pkt_frms(). This should be backported up to 2.6.	2023-10-11 14:15:31 +02:00
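Note: the early return described in a4c59f5b9e can be pictured with the minimal sketch below. It is not the actual qc_parse_pkt_frms() code; the function name, signature and return convention are assumptions made for illustration. Only the PROTOCOL_VIOLATION code (0x0a) comes from RFC 9000.

```c
#include <stddef.h>
#include <stdint.h>

/* Transport error code defined by RFC 9000. */
#define EX_QUIC_PROTOCOL_VIOLATION 0x0a

/* Hypothetical frame parser entry point: returns 1 on success, or 0 with
 * *err set to the transport error code to use for the connection closure.
 */
int ex_parse_pkt_frames(const uint8_t *payload, size_t len, uint64_t *err)
{
	(void)payload; /* real frame decoding is out of scope for this sketch */

	/* A packet carrying no frame at all is rejected before any parsing. */
	if (len == 0) {
		*err = EX_QUIC_PROTOCOL_VIOLATION;
		return 0;
	}

	*err = 0;
	return 1;
}
```

A caller would be expected to close the connection with the reported code whenever 0 is returned.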
Amaury Denoyelle	f59f8326f9	REORG: quic: cleanup traces definition Move all QUIC trace definitions from quic_conn.h to quic_trace-t.h. Also move the duplicated trace_quic macro definition into quic_trace.h. This forces all QUIC source files that rely on traces to include it while reducing the size of quic_conn.h.	2023-10-11 14:15:31 +02:00
Frédéric Lécaille	bd83b6effb	BUG/MINOR: quic: Avoid crashing with unsupported cryptographic algos This bug was detected when compiling haproxy against aws-lc TLS stack during QUIC interop runner tests. Some algorithms could be negotiated by haproxy through the TLS stack but not fully supported by the haproxy QUIC implementation. This led tls_aead() to return NULL (same thing for tls_md(), tls_hp()). As the values returned by these functions were never checked, they could trigger segfaults. To fix this, one closes the connection as soon as possible with a handshake_failure(40) TLS alert. Note that as the TLS stack successfully negotiates an algorithm, it provides haproxy with CRYPTO data before entering ->set_encryption_secrets() callback. This is why this callback (ha_set_encryption_secrets() on haproxy side) is modified to release all the CRYPTO frames before triggering a CONNECTION_CLOSE with a TLS alert. This is done calling qc_release_pktns_frms() for all the packet number spaces. Modify some quic_tls_keys_hexdump to avoid crashes when the ->aead or ->hp EVP_CIPHER are NULL. Modify qc_release_pktns_frms() to do nothing if the packet number space passed as parameter is not initialized. This bug does not impact the QUIC TLS compatibility mode (USE_QUIC_OPENSSL_COMPAT). Thank you to @ilia-shipitsin for having reported this issue in GH #2309. Must be backported as far as 2.6.	2023-10-11 11:52:22 +02:00
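Note: the pattern behind bd83b6effb (check the lookup result, fail the handshake instead of dereferencing NULL) can be sketched as follows. This is an illustration under stated assumptions, not haproxy code: the `ex_` names, the lookup table and its contents are hypothetical. Only the handshake_failure(40) alert value comes from the commit message.

```c
#include <stddef.h>
#include <string.h>

/* TLS alert value quoted in the commit message. */
#define EX_TLS_ALERT_HANDSHAKE_FAILURE 40

struct ex_aead {
	const char *name;
};

/* Hypothetical table of algorithms supported by the QUIC layer. */
static const struct ex_aead ex_supported[] = {
	{ "AES_128_GCM" },
	{ "AES_256_GCM" },
};

/* Stand-in for a lookup such as tls_aead(): NULL means "not supported". */
const struct ex_aead *ex_tls_aead(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(ex_supported) / sizeof(ex_supported[0]); i++)
		if (strcmp(ex_supported[i].name, name) == 0)
			return &ex_supported[i];
	return NULL;
}

/* Returns 0 on success, or the TLS alert to send when the negotiated
 * algorithm has no usable implementation. The result is always checked
 * before use, so a NULL is never dereferenced.
 */
int ex_set_encryption_secrets(const char *negotiated, const struct ex_aead **aead)
{
	*aead = ex_tls_aead(negotiated);
	if (!*aead)
		return EX_TLS_ALERT_HANDSHAKE_FAILURE;
	return 0;
}
```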
William Lallemand	a62a2d8b48	MINOR: ssl: add an explicit error when 'ciphersuites' are not supported Add an explicit error when the support for 'ciphersuites' was not enabled in the build because of the SSL library.	2023-10-09 14:46:09 +02:00
Aurelien DARRAGON	31e8a003a5	MINOR: sink: function to add new sink servers Move the sft creation part out of sink_finalize() function so that it becomes possible to register sink's servers without forward_px being set.	2023-10-06 15:34:31 +02:00
Aurelien DARRAGON	205d480d9f	MINOR: sink: refine forward_px usage now forward_px only serves as a hint to know if a proxy was created specifically for the sink, in which case the sink is responsible for it. Everywhere forward_px was used in appctx context: get the parent proxy from the sft->srv instead. This permits to finally get rid of the double link dependency between sink and proxy.	2023-10-06 15:34:31 +02:00
Aurelien DARRAGON	405567c125	MINOR: sink: don't rely on forward_px to init sink forwarding Instead, we check if at least one sft has been registered into the sink, if it is the case, then we need to init the forwarding for the sink.	2023-10-06 15:34:31 +02:00
Aurelien DARRAGON	3c53f6cb76	MINOR: sink: don't rely on p->parent in sink appctx Removing unnecessary dependency on proxy->parent pointer in sink appctx functions by directly using the sink sft from the applet->svcctx to get back to sink related structs. Thanks to this, proxy used for a ringbuf does not have to be exclusive to a single sink anymore.	2023-10-06 15:34:31 +02:00
Aurelien DARRAGON	ec770b7924	MINOR: sink: remove useless check after sink creation It's useless to check if sink has been created with BUF type after calling sink_new_buf() since the goal of the function is to create a new sink of BUF type.	2023-10-06 15:34:31 +02:00
Aurelien DARRAGON	cb01da8d12	MINOR: sink/log: fix some typos around postparsing logic Fixing some typos that have been overlooked during the recent log/sink API improvements. Using this patch to make sink_new_from_logsrv() static since it is not used outside of sink.c	2023-10-06 15:34:31 +02:00
Aurelien DARRAGON	19a1210dcd	MINOR: cfgparse-listen: warn when use-server rules is used in wrong mode haproxy will report a warning when "use-server" keyword is used within a backend that doesn't support server rules to inform the user that rules will be ignored. To this day, only TCP and HTTP backends can make use of it.	2023-10-06 15:34:30 +02:00
Aurelien DARRAGON	3934901e51	MINOR: proxy: report a warning for max_ka_queue in proxy_cfg_ensure_no_http() Display a warning when max_ka_queue is set (it is the case when "max-keep-alive-queue" directive is used within a proxy section) to inform the user that this directive depends on the "http" mode to work and thus will safely be ignored.	2023-10-06 15:34:30 +02:00
Aurelien DARRAGON	65f1124b5d	MINOR: cfgparse-listen: "http-reuse" requires TCP or HTTP mode Prevent the use of the "http-reuse" keyword in proxy section when neither the TCP nor the HTTP mode is set.	2023-10-06 15:34:30 +02:00
Aurelien DARRAGON	403fdee6a4	MINOR: proxy: dynamic-cookie CLIs require TCP or HTTP mode Prevent the use of "dynamic-cookie" related CLI commands if the backend is not in TCP or HTTP mode.	2023-10-06 15:34:30 +02:00
Aurelien DARRAGON	0b09727a22	MINOR: cfgparse-listen: "dynamic-cookie-key" requires TCP or HTTP mode Prevent the use of the "dynamic-cookie-key" keyword in proxy sections when TCP or HTTP modes are not set.	2023-10-06 15:34:30 +02:00
Aurelien DARRAGON	d354947365	MINOR: cfgparse-listen: "http-send-name-header" requires TCP or HTTP mode Prevent the use of the "http-send-name-header" keyword in proxy section when neither TCP nor HTTP mode is set.	2023-10-06 15:34:30 +02:00
Aurelien DARRAGON	0ba731f50b	MINOR: fcgi-app: "use-fcgi-app" requires TCP or HTTP mode Prevent the use of the "use-fcgi-app" keyword in proxy sections where neither TCP nor HTTP mode is set.	2023-10-06 15:34:30 +02:00
Aurelien DARRAGON	b41b77b4cc	MINOR: http_htx/errors: prevent the use of some keywords when not in tcp/http mode Prevent the use of "errorfile", "errorfiles" and various errorloc options in proxies that are neither in TCP nor HTTP mode.	2023-10-06 15:34:30 +02:00
Aurelien DARRAGON	225526dc16	MINOR: flt_http_comp: "compression" requires TCP or HTTP mode Prevent the use of "compression" keyword in proxy sections when the proxy is neither in tcp nor http mode.	2023-10-06 15:34:30 +02:00
Aurelien DARRAGON	1e0093a317	MINOR: backend/balance: "balance" requires TCP or HTTP mode Prevent the use of "balance" and associated keywords when proxy is neither in tcp nor http mode.	2023-10-06 15:34:30 +02:00
Aurelien DARRAGON	f9422551cd	MINOR: filter: "filter" requires TCP or HTTP mode Prevent the use of "filter" when proxy is not in TCP or HTTP mode.	2023-10-06 15:34:30 +02:00
Aurelien DARRAGON	098ae743fd	MINOR: stktable: "stick" requires TCP or HTTP mode Prevent the use of "stick-table" and "stick *" when proxy is neither in tcp nor http mode.	2023-10-06 15:34:30 +02:00
Aurelien DARRAGON	09b15e4163	MINOR: tcp_rules: tcp-{request,response} requires TCP or HTTP mode Prevent the use of tcp-{request,response} keyword in proxies that are neither in TCP nor HTTP mode.	2023-10-06 15:34:30 +02:00
Willy Tarreau	90fa2eaa15	MINOR: haproxy: permit to register features during boot The regtests are using the "feature()" predicate but this one can only rely on build-time options. It would be nice if some runtime-specific options could be detected at boot time so that regtests could more flexibly adapt to what is supported (capabilities, splicing, etc). Similarly, certain features that are currently enabled with USE_XXX could also be automatically detected at build time using ifdefs and would simplify the configuration, but then we'd lose the feature report in the feature list which is convenient for regtests. This patch makes sure that haproxy -vv shows the variable's contents and not the macro's contents, and adds a new hap_register_feature() to allow the code to register a new keyword.	2023-10-06 11:40:02 +02:00
Remi Tricot-Le Breton	a5e96425a2	MEDIUM: cache: Add "Origin" header to secondary cache key This patch adds a hash of the Origin header to the cache's secondary key. This makes it possible to store responses that have a "Vary: Origin" header in the cache when vary is enabled. This cannot be considered as a means to manage CORS requests though, it only processes the Origin header and hashes the presented value without any form of URI normalization. This need was expressed by Philipp Hossner in GitHub issue #251. Co-Authored-by: Philipp Hossner <philipp.hossner@posteo.de>	2023-10-05 10:53:54 +02:00
Amaury Denoyelle	544e320f80	BUG/MINOR: hq-interop: simplify parser requirement hq-interop should be limited for QUIC testing. As such, its code should be kept plain simple and not implement too many things. This patch fixes issues which may cause rare QUIC interop failures : - remove some unneeded BUG_ON() as parser should not be too strict - remove support of partial message parsing - ensure buffer data does not wrap as it was not properly handled. In any case, this should never happen as only a single message will be stored for each qcs buffer. This should be backported up to 2.6.	2023-10-04 17:32:23 +02:00
William Lallemand	45174e4fdc	BUILD: quic: allow USE_QUIC to work with AWSLC This patch fixes the build with AWSLC and USE_QUIC=1, this is only meant to be able to build for now and it's not feature complete. The set_encryption_secrets callback has been split in set_read_secret and set_write_secret. Missing features: - 0RTT was disabled. - TLS1_3_CK_CHACHA20_POLY1305_SHA256, TLS1_3_CK_AES_128_CCM_SHA256 were disabled - clienthello callback is missing, certificate selection could be limited (RSA + ECDSA at the same time)	2023-10-04 16:55:19 +02:00
Christopher Faulet	225a4d02e1	MINOR: h1-htx: Declare successful tunnel establishment as bodyless Successful responses to a CONNECT or to an upgrade request have no payload. Be explicit on this point by setting HTX_SL_F_BODYLESS_RESP flag on the HTX start-line.	2023-10-04 15:34:18 +02:00
Christopher Faulet	b6c32f1e04	BUG/MINOR: h1-htx: Keep flags about C-L/T-E during HEAD response parsing When a response to a HEAD request is parsed, flags to know if the content length is set or if the payload is chunked must be preserved. It is important because of the previous fix. Otherwise, these headers will be removed from the response sent to the client. This patch must only be backported if "BUG/MEDIUM: mux-h1; Ignore headers modifications about payload representation" is backported.	2023-10-04 15:34:18 +02:00
Christopher Faulet	f89ba27caa	BUG/MEDIUM: mux-h1; Ignore headers modifications about payload representation We now ignore modifications during the message analysis about the payload representation if only headers are updated and not meta-data. It means a C-L header removed to add a T-E one or the opposite via HTTP actions. This kind of change is ignored because it is extremely hard to be sure the payload will be properly formatted. It is an issue since the HTX was introduced and it was never reported. Thus, there is no reason to backport this patch for now. It relies on following commits: * MINOR: mux-h1: Add flags if outgoing msg contains a header about its payload * MINOR: mux-h1: Rely on H1S_F_HAVE_CHNK to add T-E in outgoing messages * BUG/MEDIUM: mux-h1: Add C-L header in outgoing message if it was removed	2023-10-04 15:34:18 +02:00
Christopher Faulet	c43742c188	BUG/MEDIUM: mux-h1: Add C-L header in outgoing message if it was removed If a C-L header was found during parsing of a message but it was removed via an HTTP action, it is re-added during the message formatting. Indeed, if headers about the payload are modified, meta-data of the message must also be updated. Otherwise, it is not possible to guarantee the message will be properly formatted. To do so, we rely on the flag H1S_F_HAVE_CLEN. This patch should not be backported unless an issue is explicitly reported. It relies on "MINOR: mux-h1: Add flags if outgoing msg contains a header about its payload".	2023-10-04 15:34:18 +02:00
Christopher Faulet	accd3e911c	MINOR: mux-h1: Rely on H1S_F_HAVE_CHNK to add T-E in outgoing messages If a message is declared to have a known length but no C-L or T-E headers are set, a "Transfer-Encoding: chunked" header is automatically added. It is useful for H2/H3 messages with no C-L header. There is now a flag to know this header was found or added. So we use it.	2023-10-04 15:34:18 +02:00
Christopher Faulet	e7964eac2d	BUG/MEDIUM: h1: Ignore C-L value in the H1 parser if T-E is also set In fact, during the parsing there is already a test to remove the Content-Length header if a Transfer-Encoding one is found. However, in the parser, the content-length value was still used to set the body length (the final one and the remaining one). This value is thus also used to set the extra field in the HTX message and is then used during the sending stage to announce the chunk size. So, Content-Length header value must be ignored by the H1 parser to properly reformat the message when it is sent. This patch must be backported as far as 2.6. Lower versions don't handle this case.	2023-10-04 15:34:18 +02:00
Christopher Faulet	c367957851	BUG/MINOR: mux-h1: Ignore C-L when sending H1 messages if T-E is also set In fact, it is already done but both flags (H1_MF_CLEN and H1_MF_CHUNK) are set on the H1 parser. Thus it is error-prone when H1 messages are sent, especially because most of the time, the "Content-length" case is processed before the "chunked" one. This may lead to compute the wrong chunk size and to miss the last chunk. This patch must be backported as far as 2.6. This case is not handled in 2.4 and lower.	2023-10-04 15:34:18 +02:00
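Note: the rule shared by e7964eac2d and c367957851 (when both Content-Length and Transfer-Encoding are present, the chunked framing wins and the C-L value is ignored) can be sketched as below. The flag names and values are hypothetical stand-ins and are not the real H1_MF_CLEN / H1_MF_CHUNK definitions; the functions are illustrative only.

```c
#include <stdint.h>

/* Hypothetical flags for illustration; not haproxy's H1_MF_* values. */
#define EX_MF_CLEN 0x1u /* a Content-Length header was seen */
#define EX_MF_CHNK 0x2u /* "Transfer-Encoding: chunked" was seen */

enum ex_framing {
	EX_FRAMING_NONE,
	EX_FRAMING_CLEN,
	EX_FRAMING_CHUNKED,
};

/* The chunked case is tested first, so a message carrying both headers
 * is never handled as a fixed-length one.
 */
enum ex_framing ex_select_framing(uint32_t flags)
{
	if (flags & EX_MF_CHNK)
		return EX_FRAMING_CHUNKED;
	if (flags & EX_MF_CLEN)
		return EX_FRAMING_CLEN;
	return EX_FRAMING_NONE;
}

/* Body length to announce up front: the C-L value only counts when the
 * message is not chunked.
 */
uint64_t ex_body_len(uint32_t flags, uint64_t clen)
{
	return ex_select_framing(flags) == EX_FRAMING_CLEN ? clen : 0;
}
```

Testing the chunked flag before the length flag is the design point here: the ordering issue mentioned in the commit message disappears once the precedence is encoded in a single place.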
Christopher Faulet	331241b084	BUG/MINOR: mux-h1: Handle read0 in rcv_pipe() only when data receipt was tried In rcv_pipe() callback we must be careful to not report the end of stream too early because some data may still be present in the input buffer. If we report an EOS here, this will block the subsequent call to rcv_buf() to process remaining input data. This only happens when we try a last rcv_pipe() when the xfer length is unknown and all data was already received in the input buffer. Concretely this happens with a payload larger than a buffer but smaller than 2 buffers. This patch must be backported as far as 2.7.	2023-10-04 15:34:18 +02:00
Christopher Faulet	2225cb660c	DEBUG: mux-h1: Fix event label from trace messages about payload formatting The label used for in/out trace messages about payload formatting was not the right one. Use H1_EV_TX_BODY, instead of H1_EV_TX_HDRS.	2023-10-04 15:34:18 +02:00
Christopher Faulet	751b59c40b	BUG/MEDIUM: hlua: Initialize appctx used by a lua socket on connect only The appctx used by a lua socket was synchronously initialized after the appctx creation. The connect itself is performed later. However it is an issue because the script may be interrupted between the two operations. In this case, the stream attached to the appctx is woken up before any destination is set. The stream will try to connect but without destination, it fails. When the lua script is rescheduled and the connect is performed, the connection has already failed and an error is returned. To fix the issue, we must be sure not to wake up the stream before the connect. To do so, we must defer the appctx initialization. It is now performed on connect. This patch relies on the following commits: * MINOR: hlua: Test the hlua struct first when the lua socket is connecting * MINOR: hlua: Save the lua socket's server in its context * MINOR: hlua: Save the lua socket's timeout in its context * MINOR: hlua: Don't preform operations on a not connected socket * MINOR: hlua: Set context's appctx when the lua socket is created All the series must be backported as far as 2.6.	2023-10-04 15:34:13 +02:00
Christopher Faulet	66fc9238f0	MINOR: hlua: Test the hlua struct first when the lua socket is connecting It makes sense to first verify the hlua context is valid. It is probably better than doing it after updating the appctx.	2023-10-04 15:34:10 +02:00
Christopher Faulet	6f4041c75d	MINOR: hlua: Save the lua socket's server in its context For the same reason as the timeout, the server used by a lua socket is now saved in its context. This will be mandatory to fix issues with the lua sockets.	2023-10-04 15:34:06 +02:00
Christopher Faulet	0be1ae2fa2	MINOR: hlua: Save the lua socket's timeout in its context When the lua socket timeout is set, it is now saved in its context. If there is already a stream attached to the appctx, the timeout is then immediately modified. Otherwise, it is modified when the stream is created, thus during the appctx initialization. For now, the appctx is initialized when it is created. But this will change to fix issues with the lua sockets. Thus, this patch is mandatory.	2023-10-04 15:34:03 +02:00
Christopher Faulet	ee687aa18d	MINOR: hlua: Don't preform operations on a not connected socket Nothing prevents someone from creating a lua socket and trying to receive or to write before the connection is established or after the shutdown was performed. The same is true when info about the socket is retrieved. It is not an issue because this will fail later. But now, we check earlier whether the socket is connected or not. It is more efficient but it will also be mandatory to fix issues with the lua sockets.	2023-10-04 15:34:00 +02:00
Christopher Faulet	ed9333827a	MINOR: hlua: Set context's appctx when the lua socket is created The lua socket's context referenced the owning appctx. It was set when the appctx was initialized. It is now performed when the appctx is created. It is a small change but this will be required to fix several issues with the lua sockets.	2023-10-04 15:33:57 +02:00
Christopher Faulet	b62d5689d2	BUILD: pool: Fix GCC error about potential null pointer dereference In pool_gc(), GCC 13.2.1 reports an error about a potential null pointer dereference: src/pool.c: In function ‘pool_gc’: src/pool.c:807:64: error: potential null pointer dereference [-Werror=null-dereference] 807 \| entry->buckets[bucket].free_list = temp->next; \| ~~~~^~~~~~ There is no issue here because "bucket" variable cannot be greater than CONFIG_HAP_POOL_BUCKETS. But to make GCC happy, we now break the loop if it is greater than or equal to CONFIG_HAP_POOL_BUCKETS.	2023-10-04 08:03:02 +02:00
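Note: the kind of explicit bound check described in b62d5689d2 can be illustrated with the sketch below. It is not the pool_gc() code; the structures, the bucket count and the function are hypothetical, and the guard shown is one possible way to make the "index is in range and pointer is non-NULL" invariant visible to both the compiler and the reader.

```c
#include <stddef.h>

/* Hypothetical bucket count standing in for CONFIG_HAP_POOL_BUCKETS. */
#define EX_POOL_BUCKETS 4

struct ex_item {
	struct ex_item *next;
};

struct ex_bucket {
	struct ex_item *free_list;
};

/* Pops one item from the first non-empty bucket, or returns NULL when
 * every bucket is empty.
 */
struct ex_item *ex_pop_any(struct ex_bucket *buckets)
{
	struct ex_item *temp = NULL;
	size_t bucket;

	for (bucket = 0; bucket < EX_POOL_BUCKETS; bucket++) {
		temp = buckets[bucket].free_list;
		if (temp)
			break;
	}

	/* Explicit guard: never dereference temp or index the array when
	 * the scan ran past the last bucket.
	 */
	if (bucket >= EX_POOL_BUCKETS || !temp)
		return NULL;

	buckets[bucket].free_list = temp->next;
	return temp;
}
```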
Amaury Denoyelle	90873dc678	MINOR: proto_reverse_connect: support source address setting Support backend configuration for explicit source address on pre-connect. These settings can be specified via "source" backend keyword or directly on the server line. Previously, all source parameters triggered a BUG_ON() when binding a reverse connect listener. This was done because some settings are incompatible with reverse connect context : this is the case for all source settings which do not specify a fixed address but rather rely on a frontend connection. Indeed, in case of preconnect, connection is initiated on its own without the existence of a previous frontend connection. This patch allows to use a source parameter with a fixed address. All other settings (usesrc client/clientip/hdr_ip) are rejected on listener binding. On connection init, alloc_bind_address() is used to set the optional source address.	2023-10-03 17:50:36 +02:00
Amaury Denoyelle	bd001ff346	MINOR: backend: refactor specific source address allocation Refactor alloc_bind_address() function which is used to allocate a sockaddr if a connection to a target server relies on a specific source address setting. The main objective of this change is to be able to use this function outside of backend module, namely for preconnections using a reverse server. As such, this function is now exported globally. For reverse connect, there is no stream instance. As such, the function parts which relied on it were reduced to the minimal. Now, stream is only used if a non-static address is configured which is useful for usesrc client\|clientip\|hdr_ip. These options have no sense for reverse connect so it should be safe to use the same function.	2023-10-03 17:49:12 +02:00
Amaury Denoyelle	2ac5d9a657	MINOR: quic: handle perm error on bind during runtime Improve handling of EACCES permission errors encountered when using QUIC connection socket at runtime : * First occurrence of the error on the process will generate a log warning. This should prevent users from using a privileged port without mandatory access rights. * Socket mode will automatically fallback to listener socket for the receiver instance. This requires to duplicate the settings from the bind_conf to the receiver instance to support configurations with multiple addresses on the same bind line.	2023-10-03 16:52:02 +02:00
Amaury Denoyelle	3ef6df7387	MINOR: quic: define quic-socket bind setting Define a new bind option quic-socket : quic-socket [ connection \| listener ] This new setting works in conjunction with the existing configuration global tune.quic.socket-owner and reuses the same semantics. The purpose of this setting is to allow to disable connection socket usage on listener instances individually. This will notably be useful to deactivate it after encountering a fatal permission error on bind() at runtime.	2023-10-03 16:49:26 +02:00
Remi Tricot-Le Breton	b019636cd7	DOC: sample: Add a comment in 'check_operator' to explain why 'vars_check_arg' should ignore the 'err' buffer This extra comment ensures that we do not try to pass an 'err' argument to 'vars_check_arg' otherwise some warnings will be raised if an operator is given an integer directly in the configuration file.	2023-10-03 11:13:10 +02:00
Remi Tricot-Le Breton	6fe57303f7	Revert "MEDIUM: sample: Small fix in function check_operator for eror reporting" This reverts commit `d897d7da87`. The "check_operator" function is used for all the operator converters such as "and", "or", "add"... With such a converter that accepts a variable name as well as an integer, the "vars_check_arg" call is expected to fail when an integer is provided. Passing an "err" variable has the unwanted side effect of raising a warning during init for a configuration such as the following: http-request set-query "s=%[rand,add(20)]" which raises the following warning: [WARNING] (33040) : config : parsing [hap.cfg:14] : invalid variable name '20'. A variable name must be start by its scope. The scope can be 'proc', 'sess', 'txn', 'req', 'res' or 'check'.	2023-10-03 11:13:10 +02:00
William Lallemand	c21ec3b735	BUG/MINOR: proto_reverse_connect: fix FD leak upon connect new_reverse_conn() is creating its own socket with sock_create_server_socket(). However the connect is done with conn->ctrl->connect() which is tcp_connect_server(). tcp_connect_server() is also creating its own socket and sets it in the struct conn, leaving the previous socket unclosed and leaking at each attempt. This patch fixes the issue by letting tcp_connect_server() handle the socket part, and removes it in new_reverse_conn().	2023-09-30 00:53:43 +02:00
Amaury Denoyelle	c58fd4d1cc	MINOR: tcp_act: remove limitation on protocol for attach-srv This patch allows to specify "tcp-request session attach-srv" without requiring that each associated bind line mandates HTTP/2 usage. If a non supported protocol is targeted by this rule, conn_install_mux_fe() is responsible to reject it. This change is mandatory to be able to mix attach-srv and standard non-reversible connections on the same bind instances. An ACL can be used to activate attach-srv only on some conditions.	2023-09-29 18:11:10 +02:00
Amaury Denoyelle	337c71423f	MINOR: connection: define mux flag for reverse support Add a new MUX flag MX_FL_REVERSABLE. This value is used to indicate that MUX instance supports connection reversal. For the moment, only HTTP/2 multiplexer is flagged with it. This allows to dynamically check if reversal can be completed during MUX installation. This will allow to relax requirement on config writing for 'tcp-request session attach-srv' which currently cannot be used mixed with non-http/2 listener instances, even if used conditionally with an ACL.	2023-09-29 18:09:08 +02:00
Amaury Denoyelle	ac1164de7c	MINOR: connection: define error for reverse connect Define a new error code for connection CO_ER_REVERSE. This will be used to report an issue which happens on a connection targeted for reversal before reverse process is completed.	2023-09-29 18:08:26 +02:00
Amaury Denoyelle	753fe2b9ac	BUG/MINOR: tcp_act: fix attach-srv rule ACL parsing Fix parser for tcp-request session attach-srv rule. Before this commit, it was impossible to use an anonymous ACL with it. This was caused because support for optional name argument was badly implemented. No need to backport this.	2023-09-29 18:07:52 +02:00
Amaury Denoyelle	6118590e95	BUG/MINOR: proto_reverse_connect: fix FD leak on connection error Listener using "rev@" address is responsible to setup connection and reverse it using a server instance. If an error occurred before reversal is completed, proper freeing must be taken care of by the listener as no session exists for this. Currently, there are two locations where a connection is freed on error before reversal inside reverse_connect protocol. Both of these were incomplete as several functions must be used to ensure connection is properly freed. This commit fixes this by reusing the same cleaning mechanism used inside H2 multiplexer. One of the biggest drawbacks before this patch was that connection FD was not properly removed from fdtab which caused a file-descriptor leak. No need to backport this.	2023-09-29 18:02:36 +02:00
Willy Tarreau	b3dcd59f8d	MINOR: stream: fix output alignment of stuck thread dumps Since commit `c185bc465` ("MEDIUM: stream: now provide full stream dumps in case of loops"), the stuck threads show the stream's pointer in the margin since it appears immediately after a line feed. Let's add it after the prefix and "stream=" to make the output more readable.	2023-09-29 16:43:07 +02:00
Emeric Brun	3c250cb847	Revert "BUG/MEDIUM: quic: missing check of dcid for init pkt including a token" This reverts commit `072e774939`. Doing h2load with h3 tests we notice this behavior: Client ---- INIT no token SCID = a , DCID = A ---> Server (1) Client <--- RETRY+TOKEN DCID = a, SCID = B ---- Server (2) Client ---- INIT+TOKEN SCID = a , DCID = B ---> Server (3) Client <--- INIT DCID = a, SCID = C ---- Server (4) Client ---- INIT+TOKEN SCID = a, DCID = C ---> Server (5) With (5) dropped by haproxy due to token validation. Indeed the previous patch adds the SCID of the sent retry packet to the aad used for token ciphering. It was useful to validate that the next INIT packets including the token are sent by the client using the new provided SCID for DCID as mentioned in RFC 9000. But this stateless information is lost on received INIT packets following the first outgoing INIT packet from the server because the client is also supposed to re-use a second time the latest received SCID for its new DCID. This will break the token validation on those last packets and they will be dropped by haproxy. It was discussed there: https://mailarchive.ietf.org/arch/msg/quic/7kXVvzhNCpgPk6FwtyPuIC6tRk0/ To summarize: this is not the role of the server to verify the re-use of retry's SCID for DCID in further client's INIT packets. The previous patch must be reverted in all versions where it was backported (supposed until 2.6)	2023-09-29 09:27:22 +02:00
Willy Tarreau	d956db6638	CLEANUP: stream: remove the now unused stream_dump() function It was superseded by strm_dump_to_buffer() which provides much more complete information and supports anonymizing.	2023-09-29 09:20:27 +02:00
Willy Tarreau	feff6296a1	MINOR: debug: use the more detailed stream dump in panics Similarly upon a panic we'd like to have a more detailed dump of a stream's state, so let's use the full dump function for this now.	2023-09-29 09:20:27 +02:00
Willy Tarreau	c185bc4656	MEDIUM: stream: now provide full stream dumps in case of loops When a stream is caught looping, we produce some output to help figure its internal state explaining why it's looping. The problem is that this debug output is quite old and the info it provides are quite insufficient to debug a modern process, and since such bugs happen only once or twice a year the situation doesn't improve. On the other hand the output of "show sess all" is extremely detailed and kept up to date with code evolutions since it's a heavily used debugging tool. This commit replaces the call to the totally outdated stream_dump() with a call to strm_dump_to_buffer(), and removes the filters dump since they are already emitted there, and it now produces much more exploitable output: [ALERT] (5936) : A bogus STREAM [0x7fa8dc02f660] is spinning at 5653514 calls per second and refuses to die, aborting now! Please report this error to developers: 0x7fa8dc02f660: [28/Sep/2023:09:53:08.811818] id=2 proto=tcpv4 source=127.0.0.1:58306 flags=0xc4a, conn_retries=0, conn_exp=<NEVER> conn_et=0x000 srv_conn=0x133f220, pend_pos=(nil) waiting=0 epoch=0x1 frontend=public (id=2 mode=http), listener=? 
(id=1) addr=127.0.0.1:4080 backend=public (id=2 mode=http) addr=127.0.0.1:61932 server=s1 (id=1) addr=127.0.0.1:7443 task=0x7fa8dc02fa40 (state=0x01 nice=0 calls=5749559 rate=5653514 exp=3s tid=1(1/1) age=1s) txn=0x7fa8dc02fbf0 flags=0x3000 meth=1 status=-1 req.st=MSG_DONE rsp.st=MSG_RPBEFORE req.f=0x4c rsp.f=0x00 scf=0x7fa8dc02f5f0 flags=0x00000482 state=EST endp=CONN,0x7fa8dc02b4b0,0x05004001 sub=1 rex=58s wex=<NEVER> h1s=0x7fa8dc02b4b0 h1s.flg=0x100010 .sd.flg=0x5004001 .req.state=MSG_DONE .res.state=MSG_RPBEFORE .meth=GET status=0 .sd.flg=0x05004001 .sc.flg=0x00000482 .sc.app=0x7fa8dc02f660 .subs=0x7fa8dc02f608(ev=1 tl=0x7fa8dc02fae0 tl.calls=0 tl.ctx=0x7fa8dc02f5f0 tl.fct=sc_conn_io_cb) h1c=0x7fa8dc0272d0 h1c.flg=0x0 .sub=0 .ibuf=0@(nil)+0/0 .obuf=0@(nil)+0/0 .task=0x7fa8dc0273f0 .exp=<NEVER> co0=0x7fa8dc027040 ctrl=tcpv4 xprt=RAW mux=H1 data=STRM target=LISTENER:0x12840c0 flags=0x00000300 fd=32 fd.state=20 updt=0 fd.tmask=0x2 scb=0x7fa8dc02fb30 flags=0x00001411 state=EST endp=CONN,0x7fa8dc0300c0,0x05000001 sub=1 rex=58s wex=<NEVER> h1s=0x7fa8dc0300c0 h1s.flg=0x4010 .sd.flg=0x5000001 .req.state=MSG_DONE .res.state=MSG_RPBEFORE .meth=GET status=0 .sd.flg=0x05000001 .sc.flg=0x00001411 .sc.app=0x7fa8dc02f660 .subs=0x7fa8dc02fb48(ev=1 tl=0x7fa8dc02feb0 tl.calls=2 tl.ctx=0x7fa8dc02fb30 tl.fct=sc_conn_io_cb) h1c=0x7fa8dc02ff00 h1c.flg=0x80000000 .sub=1 .ibuf=0@(nil)+0/0 .obuf=0@(nil)+0/0 .task=0x7fa8dc030020 .exp=<NEVER> co1=0x7fa8dc02fcd0 ctrl=tcpv4 xprt=RAW mux=H1 data=STRM target=SERVER:0x133f220 flags=0x10000300 fd=33 fd.state=10421 updt=0 fd.tmask=0x2 req=0x7fa8dc02f680 (f=0x1840000 an=0x8000 pipe=0 tofwd=0 total=79) an_exp=<NEVER> buf=0x7fa8dc02f688 data=(nil) o=0 p=0 i=0 size=0 htx=0xc18f60 flags=0x0 size=0 data=0 used=0 wrap=NO extra=0 res=0x7fa8dc02f6d0 (f=0x80000000 an=0x1400000 pipe=0 tofwd=0 total=0) an_exp=<NEVER> buf=0x7fa8dc02f6d8 data=(nil) o=0 p=0 i=0 size=0 htx=0xc18f60 flags=0x0 size=0 data=0 used=0 wrap=NO extra=0 call trace(10): \| 0x59f2b7 [0f 
0b 0f 1f 80 00 00 00]: stream_dump_and_crash+0x1f7/0x2bf \| 0x5a0d71 [e9 af e6 ff ff ba 40 00]: process_stream+0x19f1/0x3a56 \| 0x68d7bb [49 89 c7 4d 85 ff 74 77]: run_tasks_from_lists+0x3ab/0x924 \| 0x68e0b4 [29 44 24 14 8b 4c 24 14]: process_runnable_tasks+0x374/0x6d6 \| 0x656f67 [83 3d f2 75 84 00 01 0f]: run_poll_loop+0x127/0x5a8 \| 0x6575d7 [48 8b 1d 42 50 5c 00 48]: main+0x1b22f7 \| 0x7fa8e0f35e45 [64 48 89 04 25 30 06 00]: libpthread:+0x7e45 \| 0x7fa8e0e5a4af [48 89 c7 b8 3c 00 00 00]: libc:clone+0x3f/0x5a Note that the output is subject to the global anon key so that IPs and object names can be anonymized if required. It could make sense to backport this and the few related previous patches next time such an issue is reported.	2023-09-29 09:20:27 +02:00
Willy Tarreau	b206504f43	MINOR: streams: add support for line prefixes to strm_dump_to_buffer() Now the function can prepend every new line with a caller-fed prefix that will later be used for indenting. The caller has to feed the prefix for the first line itself though, allowing to possibly append the first line at the end of an existing one.	2023-09-29 09:20:27 +02:00
Willy Tarreau	5743eeea88	MINOR: stream: make stream_dump() always multi-line There used to be two working modes for this function, a single-line one and a multi-line one, the difference being made on the "eol" argument which could contain either a space or an LF (and with the prefix being adjusted accordingly). Let's get rid of the single-line mode as it's what limits the output contents because it's difficult to produce exploitable structured data this way. It was only used in the rare case of spinning streams and applets and these are the ones lacking info. Now a spinning stream produces: [ALERT] (3511) : A bogus STREAM [0x227e7b0] is spinning at 5581202 calls per second and refuses to die, aborting now! Please report this error to developers: strm=0x227e7b0,c4a src=127.0.0.1 fe=public be=public dst=s1 txn=0x2041650,3000 txn.req=MSG_DONE,4c txn.rsp=MSG_RPBEFORE,0 rqf=1840000 rqa=8000 rpf=80000000 rpa=1400000 scf=0x24af280,EST,482 scb=0x24af430,EST,1411 af=(nil),0 sab=(nil),0 cof=0x7fdb28026630,300:H1(0x24a6f60)/RAW((nil))/tcpv4(33) cob=0x23199f0,10000300:H1(0x24af630)/RAW((nil))/tcpv4(32) filters={} call trace(11): (...)	2023-09-29 09:20:27 +02:00
Willy Tarreau	5ddeba7af3	MINOR: stream: make strm_dump_to_buffer() show the list of filters That's one of the rare pieces of information that was not present in the full dump and only in the short one, the list of filters the stream is subscribed to (however the current filter was present and more detailed).	2023-09-29 09:20:27 +02:00
Willy Tarreau	3e630a9871	MINOR: stream: make strm_dump_to_buffer() take an arbitrary buffer We won't always want to dump into the trash, so let's make the function accept an arbitrary buffer.	2023-09-29 09:20:27 +02:00
Willy Tarreau	6bc07103f8	CLEANUP: stream: make strm_dump_to_buffer() take a const stream Now that we don't need a variable anymore, let's pass a const stream. It will void any doubt about what can happen to the stream when the function is called from inspection points (show sess etc).	2023-09-29 09:20:27 +02:00
Willy Tarreau	1a01ee4740	CLEANUP: stream: use const filters in the dump function The strm_dump_to_buffer() function requires a variable stream only for a few functions in it that do not take a const. strm_flt() is one of them (and for good reasons since most call places want to update filters). Here we know we won't modify the filter nor the stream so let's directly access the strm_flt in the stream and assign it to a const filter. This will also catch any future accidental change.	2023-09-29 09:20:27 +02:00
Willy Tarreau	77ecb3146a	MINOR: stream: split stats_dump_full_strm_to_buffer() in two The function only works with the CLI's appctx and does most of the convenient work of dumping a stream into a buffer (well, the trash buffer for now). Let's split it in two so that most of the work is done in a generic function and that the CLI-specific function relies on that one. The diff looks huge due to the changed indent caused by the extraction of the switch/case statement, but when looked at using diff -b it's small.	2023-09-29 09:20:27 +02:00
Willy Tarreau	6c2af048d6	CLEANUP: stream: make the dump code not depend on the CLI appctx The HA_ANON_CLI() helper relies on the CLI appctx and prevents the code from being made more generic. Let's extract the CLI's anon key separately and pass it via HA_ANON_STR() instead.	2023-09-29 09:20:27 +02:00
Amaury Denoyelle	7cf9cf705e	BUG/MINOR: mux-quic: remove full demux flag on ncbuf release When rcv_buf stream callback is invoked, mux tasklet is woken up if demux was previously blocked due to lack of buffer space. A BUG_ON() is present to ensure there is data in qcs Rx buffer. If this is not the case, wakeup is unneeded : BUG_ON(!ncb_data(&qcs->rx.ncbuf, 0)); This BUG_ON() may be triggered if RESET_STREAM is received after demux has been blocked. On reset, Rx buffer is purged according to RFC 9000 which allows to discard any data not yet consumed. This will trigger the BUG_ON() assertion if rcv_buf stream callback is invoked after this. To prevent BUG_ON() crash, just clear demux block flag each time Rx buffer is purged. This covers accordingly RESET_STREAM reception. This should be backported up to 2.7. This may fix github issue #2293. This bug relies on several preconditions so its occurrence is rare. This was reproduced by using a custom client which posts big enough data to fill the buffer. It then emits a RESET_STREAM in place of a proper FIN. Moreover, mux code has been edited to artificially stall stream read to force demux blocking. h3_data_to_htx: - return htx_sent; + return 1; qcc_recv_reset_stream: qcs_free_ncbuf(qcs, &qcs->rx.ncbuf); + qcs_notify_recv(qcs); qmux_strm_rcv_buf: char fin = 0; + static int i = 0; + if (++i < 2) + return 0; TRACE_ENTER(QMUX_EV_STRM_RECV, qcc->conn, qcs);	2023-09-28 11:44:53 +02:00
Vladimir Vdovin	f8b81f6eb7	MINOR: support for http-request set-timeout client Added set-timeout for frontend side of session, so it can be used to set custom per-client timeouts if needed. Added cur_client_timeout to fetch client timeout samples.	2023-09-28 08:49:22 +02:00
Amaury Denoyelle	b9bb3b932c	MINOR: proto_reverse_connect: emit log for preconnect Add reporting using send_log() for preconnect operation. This is minimal to ensure we understand the current status of listener in active reverse connect. To limit logging quantity, only important transitions are considered. This requires to implement a minimal state machine as a new field in receiver structure. Here are the logs produced : * Initiating : first time preconnect is enabled on a listener * Error : last preconnect attempt interrupted on a connection error * Reaching maxconn : all necessary connections were reversed and are operational on a listener	2023-09-22 17:21:53 +02:00
Amaury Denoyelle	069ca55e70	MINOR: proto_reverse_connect: remove unneeded wakeup No need to use task_wakeup() on rev_bind_listener() to bootstrap preconnect. A similar call is done on rev_enable_listener() which serves both for bootstrap and also later to reinitiate attempts to maintain maxconn if connections are freed.	2023-09-22 17:06:18 +02:00
Amaury Denoyelle	1f43fb71be	MINOR: proto_reverse_connect: refactor preconnect failure When a connection is freed during preconnect before reversal, the error must be notified to the listener to remove any connection reference and rearm a new preconnect attempt. Currently, this can occur through 2 code paths : * conn_free() called directly by H2 mux * error during conn_create_mux(). For this case, connection is flagged with CO_FL_ERROR and reverse_connect task is woken up. The process task handler is then responsible to call conn_free() for such connection. Duplicated steps were done both in conn_free() and process task handler. These are now removed. To facilitate code maintenance, dedicated operations have been centralized in a new function rev_notify_preconn_err() which is called by conn_free().	2023-09-22 16:43:36 +02:00
Amaury Denoyelle	a37abee266	BUG/MINOR: proto_reverse_connect: set default maxconn If maxconn is not set for preconnect, it assumes we want to establish a single connection. However, this does not work properly in case the connection is closed after reversal. Listener is not resumed by protocol layer to attempt a new preconnect. To fix this, explicitly set maxconn to 1 in the listener instance if none is defined. This ensures the behavior is consistent. A BUG_ON() has been added to validate we never try to use a listener with a 0 maxconn.	2023-09-22 16:40:58 +02:00
Emeric Brun	27b2fd2e06	MINOR: quic: handle external extra CIDs generator. This patch adds the ability to externalize and customize the code of the computation of extra CIDs after the first one was derived from the ODCID. This is to prepare interoperability with extra components such as different QUIC proxies or routers for instance. To do so, the patch defines two function callbacks: - the first one computes a 64-bit hash from the first generated CID (which itself continues to be derived from the ODCID). The resulting hash is stored into the 'quic_conn', and 64 bits were chosen to be large enough to store an entire haproxy CID. - the second callback re-uses the previously computed hash to derive an extra CID using the custom algorithm. If not set haproxy will continue to choose a randomized CID value. Those two functions also have the 'cluster_secret' passed as an argument: this way, it is usable for obfuscation or ciphering.	2023-09-22 10:32:14 +02:00
Lokesh Jindal	d897d7da87	MEDIUM: sample: Small fix in function check_operator for error reporting When function "check_operator" calls function "vars_check_arg" to decode a variable, it passes in NULL value for pointer to the char array meant for capturing the error message. This commit replaces NULL with the pointer to the real char array. This should help in correct error reporting.	2023-09-22 08:48:53 +02:00
Lokesh Jindal	915e48675a	MEDIUM: sample: Enhances converter "bytes" to take variable names as arguments Prior to this commit, converter "bytes" takes only integer values as arguments. After this commit, it can take variable names as inputs. This allows us to dynamically determine the offset/length and capture them in variables. These variables can then be used with the converter. Example use case: parsing a token present in a request header.	2023-09-22 08:48:51 +02:00
Amaury Denoyelle	d3db96f11a	MINOR: proto_reverse_connect: prevent transparent server for pre-connect Prevent using transparent servers for pre-connect on startup by emitting a fatal error. This is used to ensure we never try to connect to a target with an unspecified destination address or port.	2023-09-21 16:58:08 +02:00
Amaury Denoyelle	9b6812d781	BUG/MINOR: proto_reverse_connect: fix preconnect with startup name resolution addr member of server structure is not set consistently depending on the server address type. When using <IP:PORT> notation, its port is properly set. However, when using <HOSTNAME:PORT>, only IP address is set after startup name resolution but its port is left to 0. This behavior causes preconnect to not be functional when using server with hostname for startup name resolution. Indeed, only srv.addr is used as connect argument through function new_reverse_conn(). To fix this, rely on srv.svc_port : this member is always set for servers using IP or hostname. This is similar to connect_server() on the backend side. This does not need to be backported.	2023-09-21 16:57:30 +02:00
Sébastien Gross	6a9ba85322	MINOR: hlua: Add support for the "http-after-res" action This commit introduces support for the "http-after-res" action in hlua, enabling the invocation of a Lua function in a "http-after-response" rule. With this enhancement, a Lua action can be registered using the "http-after-res" action type: core.register_action('myaction', {'http-after-res'}, myaction) A new "lua.myaction" is created and can be invoked in a "http-after-response" rule: http-after-response lua.myaction This addition provides greater flexibility and extensibility in handling post-response actions using Lua. This commit depends on: - `4457783` ("MINOR: http_ana: position the FINAL flag for http_after_res execution") Signed-off-by: Sébastien Gross <sgross@haproxy.com>	2023-09-21 16:31:20 +02:00
Aurelien DARRAGON	95c4d24825	BUG/MEDIUM: server/cli: don't delete a dynamic server that has streams In cli_parse_delete_server(), we take care of checking that the server is in MAINT and that the cur_sess counter is set to 0, in the hope that no connection/stream resources continue to point to the server, else we refuse to delete it. As shown in GH #2298, this is not sufficient. Indeed, when the server option "on-marked-down shutdown-sessions" is not used, server streams are not purged when srv enters maintenance mode. As such, there could be remaining streams that point to the server. To detect this, a secondary check on srv->cur_sess counter was performed in cli_parse_delete_server(). Unfortunately, there are some code paths that could lead to cur_sess being decremented, and not resulting in a stream being actually shutdown. As such, if the delete_server cli is handled right after cur_sess has been decremented with streams still pointing to the server, we could face some nasty bugs where stream->srv_conn could point to garbage memory area, as described in the original github report. To make the check more reliable prior to deleting the server, we don't rely exclusively on cur_sess and directly check that the server is not used in any stream through the srv_has_stream() helper function. Thanks to @capflam which found out the root cause for the bug and greatly helped to provide the fix. This should be backported up to 2.6.	2023-09-21 14:57:01 +02:00
Aurelien DARRAGON	0189a4679e	MINOR: pattern/ip: simplify pat_match_ip() function pat_match_ip() has been updated several times over the last decade to introduce new features, but it was never cleaned up. The result is that the function is pretty hard to read, and there are multiple duplicated code blocks so it becomes error-prone to maintain it, plus it bloats the haproxy binary for nothing. In this patch, we move the tree search (ip4 / ip6) logic into 2 dedicated helper functions. This allows us to refactor pat_match_ip() without touching the original behavior.	2023-09-21 09:50:56 +02:00
Aurelien DARRAGON	f80122db26	MINOR: pattern/ip: offload ip conversion logic to helper functions Now that v4tov6() and v6tov4() were reworked to match behavior from pat_match_ip() function in ("MINOR: tools/ip: v4tov6() and v6tov4() rework"), we can remove code duplication in pat_match_ip() by directly using those dedicated functions where relevant.	2023-09-21 09:50:55 +02:00
Aurelien DARRAGON	72514a4467	MEDIUM: tools/ip: v4tov6() and v6tov4() rework v4tov6() and v6tov4() helper functions were initially implemented in `4f92d3200` ("[MEDIUM] IPv6 support for stick-tables"). However, since `ceb4ac9c3` ("MEDIUM: acl: support IPv6 address matching") support for legacy ip6 to ip4 conversion formats was added, with the parsing logic directly performed in acl_match_ip (which later became pat_match_ip). The issue is that the original v6tov4() function which is used for sample expressions handling lacks those additional formats, so we could face inconsistencies depending on whether we rely on ip4/ip6 conversions from an acl context or an expression context. To unify ip4/ip6 automatic mapping behavior, we reworked v4tov6 and v6tov4 functions so that they now behave like in pat_match_ip() function. Note: '6to4 (RFC3056)' and 'RFC4291 ipv4 compatible address' formats are still supported for legacy purposes despite being deprecated for a while now.	2023-09-21 09:50:55 +02:00
Christopher Faulet	d3e379b3ce	BUG/MEDIUM: http-ana: Try to handle response before handling server abort In the request analyser responsible to forward the request, we try to detect the server abort to stop the request forwarding. However, we must be careful to not block the response processing, if any. Indeed, it is possible to get the response and the server abort at the same time. In this case, we must try to forward the response to the client first. So to fix the issue, in the request analyser we no longer handle the server abort if the response channel is not empty. In the end, the response analyser is able to detect the server abort if it is relevant. Otherwise, the stream will be woken up after the response forwarding and the server abort should be handled at this stage. This patch should be backported as far as 2.7 only because the risk of breakage is high. And it is probably a good idea to wait a bit before backporting it.	2023-09-21 09:36:37 +02:00
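The three IPv6-to-IPv4 formats discussed in the entry above can be illustrated with a minimal sketch. This is not the haproxy implementation: `sketch_v6tov4()` is a hypothetical name, the address is taken as a raw 16-byte array, and the choice to exclude `::` and `::1` from the IPv4-compatible case is an assumption of this sketch, not something stated in the commit message.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 and fills <v4> when the 16-byte IPv6
 * address <v6> embeds an IPv4 address in one of three well-known layouts,
 * otherwise returns 0 and leaves <v4> untouched.
 */
static int sketch_v6tov4(uint8_t v4[4], const uint8_t v6[16])
{
	static const uint8_t mapped_pfx[12] = { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0xff, 0xff };
	static const uint8_t zero_pfx[12]   = { 0 };

	assert(v4 && v6);

	/* IPv4-mapped address (RFC 4291): ::ffff:a.b.c.d */
	if (memcmp(v6, mapped_pfx, sizeof(mapped_pfx)) == 0) {
		memcpy(v4, v6 + 12, 4);
		return 1;
	}

	/* 6to4 (RFC 3056): 2002:AABB:CCDD::/48, IPv4 address in bytes 2..5 */
	if (v6[0] == 0x20 && v6[1] == 0x02) {
		memcpy(v4, v6 + 2, 4);
		return 1;
	}

	/* IPv4-compatible address (RFC 4291, deprecated): ::a.b.c.d.
	 * Assumption of this sketch: "::" and "::1" are not treated as
	 * embedded IPv4 addresses, so at least one of bytes 12..14 must be set.
	 */
	if (memcmp(v6, zero_pfx, sizeof(zero_pfx)) == 0 &&
	    (v6[12] || v6[13] || v6[14])) {
		memcpy(v4, v6 + 12, 4);
		return 1;
	}

	return 0;
}
```

The checks are ordered from the most specific prefix to the least specific one, so that a mapped address is never mistaken for a compatible one.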
Willy Tarreau	cbbee15462	CLEANUP: ring: rename the ring lock "RING_LOCK" instead of "LOGSRV_LOCK" The ring lock was initially mostly used for the logs and used to inherit its name in lock stats. Now that it's exclusively used by rings, let's rename it accordingly.	2023-09-20 21:38:33 +02:00
Willy Tarreau	cec8b42cb3	MEDIUM: logs: atomically check and update the log sample index The log server lock is pretty visible in perf top when using log samples because it's taken for each server in turn while trying to validate and update the log server's index. Let's change this for a CAS, since we have the index and the range at hand now. This allows us to remove the logsrv lock. The test on 4 servers now shows a 3.7 times improvement thanks to much lower contention. Without log sampling a test producing 4.4M logs/s delivers 4.4M logs/s at 21 CPUs used, everything spent in the kernel. After enabling 4 samples (1:4, 2:4, 3:4 and 4:4), the throughput would previously drop to 1.13M log/s with 37 CPUs used and 75% spent in process_send_log(). Now with this change, 4.25M logs/s are emitted, using 26 CPUs and 22% in process_send_log(). That's a 3.7x throughput improvement for a 30% global CPU usage reduction, but in practice it mostly shows that the performance drop caused by having samples is much less noticeable (each of the 4 servers has its index updated for each log). Note that in order to even avoid incrementing an index for each log srv that is consulted, it would be more convenient to have a single index per frontend and apply the modulus on each log server in turn to see if the range has to be updated. It would then only perform one write per range switch. However the place where this is done doesn't have access to a frontend, so some changes would need to be performed for this, and it would require to update the current range independently in each logsrv, which is not necessarily easier since we don't know yet if we can commit it.	2023-09-20 21:38:33 +02:00
Willy Tarreau	e00470378b	MINOR: logs: use a single index to store the current range and index By using a single long long to store both the current range and the next index, we'll make it possible to perform atomic operations instead of locking. Let's only regroup them for now under a new "curr_rg_idx". The upper word is the range, the lower is the index.	2023-09-20 21:38:33 +02:00
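A minimal sketch of the idea in the entry above: keep the range in the upper word and the index in the lower word of one 64-bit value, so that both can be updated with a single compare-and-swap. The helper names, the use of C11 `<stdatomic.h>` rather than haproxy's own atomic macros, and the wrap-around policy (a fixed number of indexes per range) are all assumptions made for illustration only.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical layout: upper 32 bits = range, lower 32 bits = index. */
static inline uint64_t rg_idx_pack(uint32_t range, uint32_t index)
{
	return ((uint64_t)range << 32) | index;
}

static inline uint32_t rg_idx_range(uint64_t v) { return (uint32_t)(v >> 32); }
static inline uint32_t rg_idx_index(uint64_t v) { return (uint32_t)v; }

/* Atomically advances the index by one. When the index reaches
 * <idx_per_range>, it restarts from 0 and the range moves to the next one,
 * modulo <nb_ranges>. Returns the value observed before the update.
 * The policy is made up for this sketch; only the "pack both fields and
 * CAS them together" technique is the point.
 */
static uint64_t rg_idx_advance(_Atomic uint64_t *cur, uint32_t nb_ranges,
                               uint32_t idx_per_range)
{
	uint64_t old_v, new_v;

	assert(cur && nb_ranges > 0 && idx_per_range > 0);

	old_v = atomic_load(cur);
	do {
		uint32_t rg  = rg_idx_range(old_v);
		uint32_t idx = rg_idx_index(old_v) + 1;

		if (idx >= idx_per_range) {
			idx = 0;
			rg = (rg + 1) % nb_ranges;
		}
		new_v = rg_idx_pack(rg, idx);
		/* on failure, old_v is refreshed with the current value */
	} while (!atomic_compare_exchange_weak(cur, &old_v, new_v));

	return old_v;
}
```

Because both fields live in the same word, no thread can ever observe a new range paired with a stale index, which is what makes dropping the lock possible in principle.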
Willy Tarreau	49ddc0138c	CLEANUP: logs: rename a confusing local variable "curr_rg" to "smp_rg" The variable curr_rg in process_send_log() is misleading because it is not related to the integer curr_rg that's used to calculate it, instead it's a pointer to the current smp_log_range from smp_rgs[], so let's call it "smp_rg" as a singular for this "smp_rgs" and put an end to this confusion.	2023-09-20 21:38:33 +02:00
Willy Tarreau	3f1284560f	MINOR: log: remove the unused curr_idx in struct smp_log_range This index is useless because it only serves to know when the global index reached the end, while the global one already knows it. Let's just drop it and perform the test on the global range. It was verified with the following config that the first server continues to take 1/10 of the traffic, the 2nd one 2/10, the 3rd one 3/10 and the 4th one 4/10: log 127.0.0.1:10001 sample 1:10 local0 log 127.0.0.1:10002 sample 2,5:10 local0 log 127.0.0.1:10003 sample 3,7,9:10 local0 log 127.0.0.1:10004 sample 4,6,8,10:10 local0	2023-09-20 21:38:33 +02:00
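The 1/10, 2/10, 3/10 and 4/10 shares quoted in the entry above follow directly from counting the log numbers listed in each `sample` specification. The sketch below does that arithmetic for a spec string such as `3,7,9:10`; `sample_share()` is a hypothetical helper, not haproxy's parser, and it neither detects overlapping ranges nor checks each bound against the period.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: counts how many log numbers per period are selected
 * by a spec of the form "<ranges>:<period>", where <ranges> is a comma
 * separated list of "N" or "N-M" items numbered from 1.
 * Returns 0 and fills <hits> and <period> on success, -1 on syntax error.
 */
static int sample_share(const char *spec, unsigned long *hits,
                        unsigned long *period)
{
	const char *p = spec;
	char *end;
	unsigned long count = 0;

	assert(spec && hits && period);

	for (;;) {
		unsigned long lo = strtoul(p, &end, 10);
		unsigned long hi = lo;

		if (end == p || lo == 0)
			return -1; /* missing number, or range starting at 0 */

		if (*end == '-') {
			p = end + 1;
			hi = strtoul(p, &end, 10);
			if (end == p || hi < lo)
				return -1;
		}
		count += hi - lo + 1;

		if (*end != ',')
			break;
		p = end + 1;
	}

	if (*end != ':')
		return -1;

	p = end + 1;
	*period = strtoul(p, &end, 10);
	if (end == p || *end != '\0' || *period == 0 || count > *period)
		return -1;

	*hits = count;
	return 0;
}
```

Applied to the four specs of the configuration above, the hit counts are 1, 2, 3 and 4 for a period of 10, which add up to the full period, so every log is sent to exactly one server as long as the lists do not overlap.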
Willy Tarreau	4351364700	MINOR: logs: clarify the check of the log range The test of the log range is not very clear, in part due to the reuse of the "curr_idx" name that happens at two levels. The call to in_smp_log_range() applies to the smp_info's index to which 1 is added: it verifies that the next index is still within the current range. Let's just have a local variable "next_index" in process_send_log() that gets assigned the next index (current+1) and compare it to the current range's boundaries. This makes the test much clearer. We can then simply remove in_smp_log_range() that's no longer needed.	2023-09-20 21:38:33 +02:00
Aurelien DARRAGON	2c9bd3ae80	BUG/MINOR: server: add missing free for server->rdr_pfx rdr_pfx was not being freed during server cleanup, leading to a small memory leak when the "redir" argument was used on a server line (HTTP only). This should be backported to all stable versions. [For 2.6 and 2.7: the free should be performed in srv_drop() directly. For older versions: free in deinit() function near the free for the cookie string]	2023-09-15 17:46:49 +02:00
Willy Tarreau	6cbb5a057b	Revert "MAJOR: import: update mt_list to support exponential back-off" This reverts commit `c618ed5ff4`. The list iterator is broken. As found by Fred, running QUIC single- threaded shows that only the first connection is accepted because the accepter relies on the element being initialized once detached (which is expected and matches what MT_LIST_DELETE_SAFE() used to do before). However while doing this in the quic_sock code seems to work, doing it inside the macro shows total breakage and the unit test doesn't work anymore (random crashes). Thus it looks like the fix is not trivial, let's roll this back for the time it will take to fix the loop.	2023-09-15 17:13:43 +02:00
William Lallemand	694889ac2d	BUILD: quic: fix build on centos 8 and USE_QUIC_OPENSSL_COMPAT When using USE_QUIC_OPENSSL_COMPAT=1 on centos-8 the build fails this way: In file included from src/quic_openssl_compat.c:11: /usr/include/openssl/kdf.h:33:46: error: unknown type name 'va_list' int EVP_KDF_vctrl(EVP_KDF_CTX *ctx, int cmd, va_list args); This is because of openssl/kdf.h being included before openssl-compat.h	2023-09-14 16:26:58 +02:00
Christopher Faulet	89e20033c7	BUG/MAJOR: mux-h2: Report a protocol error for any DATA frame before headers If any DATA frame is received before all headers are fully received, a protocol error must be reported. It is required by the HTTP/2 RFC but it is also important because the HTTP analyzers expect the first HTX block is a start-line. It leads to a crash if this statement is not respected. For instance, it is possible to trigger a crash by sending an interim message with a DATA frame (It may be an empty DATA frame with the ES flag). AFAIK, only the server side is affected by this bug. To fix the issue, a protocol error is reported for the stream. This patch should fix the issue #2291. It must be backported as far as 2.2 (and probably to 2.0 too).	2023-09-14 11:39:39 +02:00
Frédéric Lécaille	3921bf80c7	BUG/MINOR: quic: Leak of frames to send. In very rare cases, it is possible that packets are detected as lost, their frames requeued, then the connection is released without releasing them, for whatever reason (to be killed because of a sendto() fatal failure for instance). Such frames are lost and never released because the function which releases their packet number spaces does not release the frames which are still enqueued to be sent. Must be backported as far as 2.6.	2023-09-13 15:32:14 +02:00
William Lallemand	c7424a1bac	MINOR: samples: implement bytes_in and bytes_out samples %[bytes_in] and %[bytes_out] are equivalent to %U and %B tags in log-format.	2023-09-13 14:54:50 +02:00
Willy Tarreau	5abbae2d3d	CLEANUP: pools: simplify the pool expression when no pool was matched in dump When dumping pool information, we make a special case of the condition where the pool couldn't be identified and we consider that it was the correct one. In the code arrangements brought by commit `efc46dede` ("DEBUG: pools: inspect pools on fatal error and dump information found"), a ternary expression for testing this depends on the "if" block condition so this can be simplified and will make Coverity happy. This was reported in GH #2290.	2023-09-13 13:31:41 +02:00
Willy Tarreau	c618ed5ff4	MAJOR: import: update mt_list to support exponential back-off The new mt_list code supports exponential back-off on conflict, which is important for use cases where there is contention on a large number of threads. The API evolved a little bit and required some updates: - mt_list_for_each_entry_safe() is now in upper case to explicitly show that it is a macro, and only uses the back element, doesn't require a secondary pointer for deletes anymore. - MT_LIST_DELETE_SAFE() doesn't exist anymore, instead one just has to set the list iterator to NULL so that it is not re-inserted into the list and the list is spliced there. One must be careful because it was usually performed before freeing the element. Now instead the element must be nulled before the continue/break. - MT_LIST_LOCK_ELT() and MT_LIST_UNLOCK_ELT() have always been unclear. They were replaced by mt_list_cut_around() and mt_list_connect_elem() which more explicitly detach the element and reconnect it into the list. - MT_LIST_APPEND_LOCKED() was only in haproxy so it was left as-is in list.h. It may however possibly benefit from being upstreamed. This required tiny adaptations to event_hdl.c and quic_sock.c. The test case was updated and the API doc added. Note that in order to keep include files small, the struct mt_list definition remains in list-t.h (part of the internal API) and was ifdef'd out in mt_list.h. A test on QUIC with both quictls 1.1.1 and wolfssl 5.6.3 on ARM64 with 80 threads shows a drastic reduction of CPU usage thanks to this and the refined memory barriers. Please note that the CPU usage on OpenSSL 3.0.9 is significantly higher due to the excessive use of atomic ops by openssl, but 3.1 is only slightly above 1.1.1 though: - before: 35 Gbps, 3.5 Mpps, 7800% CPU - after: 41 Gbps, 4.2 Mpps, 2900% CPU	2023-09-13 11:50:33 +02:00
Christopher Faulet	13fb7170be	BUG/MEDIUM: master/cli: Pin the master CLI on the first thread of the group 1 There is no reason to start the master CLI on several threads and on several groups. And in fact, it must not be done otherwise the same FD is inserted several times in the fdtab, leading to a crash during startup because of a BUG_ON(). It happens when several groups are configured. To fix the bug the master CLI is now pinned on the first thread of the first group. This patch should fix the issue #2259 and must be backported to 2.8.	2023-09-13 10:26:32 +02:00
Christopher Faulet	665703d456	BUG/MEDIUM: mux-fcgi: Don't swap trash and dbuf when handling STDERR records trash chunks are buffers but not allocated from the buffers pool. And the "trash" chunk is static and thread-local. These are two reasons not to swap it with a regular buffer allocated from the buffers pool. Unfortunately, it is exactly what is performed in the FCGI mux when a STDERR record is handled. b_xfer() is used to copy data from the demux buffer to the trash to format the error message. A zero-copy via a swap may be performed. In this case, this leads to a memory corruption and a crash because, some time later, the demux buffer is released because it is empty. And it is in fact the trash chunk. b_force_xfer() must be used instead. This function forces the copy. This patch must be backported as far as 2.2. For 2.4 and 2.2, b_force_xfer() does not exist. For these versions, the following commit must be backported too: * `c7860007cc` ("MINOR: buf: Add b_force_xfer() function")	2023-09-12 19:50:17 +02:00
Aurelien DARRAGON	1115fc348e	BUG/MINOR: hlua/init: coroutine may not resume itself It's not supported to call lua_resume with <L> and <from> designating the same lua coroutine. It didn't cause visible bugs so far because Lua 5.3 used to be more permissive about this, and moreover, yielding is not involved during the hlua init state. But this is wrong usage, and the doc clearly specifies that the <from> argument can be NULL when there is no such coroutine, which is the case here. This should be backported to all stable versions.	2023-09-12 19:50:17 +02:00
Aurelien DARRAGON	e7281f3f5d	BUG/MEDIUM: hlua: don't pass stale nargs argument to lua_resume() In hlua_ctx_resume(), we call lua_resume() function like this: lua_resume(lua->T, hlua_states[lua->state_id], lua->nargs) Once the call returns, we may call the function again with the same hlua context when E_YIELD is returned (the execution was interrupted and may be resumed through another lua_resume() call). The 3rd argument to lua_resume(), 'nargs', is a hint passed to Lua to know how many (optional) arguments were pushed on the stack prior to resuming the execution (arguments that Lua will then expose to the Lua script). But here is the catch: we never reset lua->nargs between successive lua_resume() calls, meaning that next lua_resume() calls will still inherit from the initial nargs value that was set in hlua ctx prior to calling hlua_ctx_resume() (our wrapper function) for the first time. This is problematic, because despite not being explicitly mentioned in the Lua documentation, passed arguments (to which `nargs` refer to), are already consumed once lua_resume() returns. This means that we cannot keep calling lua_resume() with non-zero nargs if we don't push new arguments on the stack prior to resuming lua after the initial call: nargs is proper to a single lua_resume() invocation. Despite improper use of lua_resume() for a long time, this didn't cause visible issues in the past with Lua 5.3, but it is particularly sensitive starting with Lua 5.4.3 due to debugging hooks improvements that led to some internal changes (see: lua/lua@58aa09a). Not using nargs properly now exposes us to undefined behavior when resuming after a yield triggered from a debugging hook, which may cause running scripts to crash unexpectedly: for instance with Lua raising errors and complaining about values being NULL where it should not be the case. 
For reference, this issue was initially raised on the Lua mailing list: http://lua-users.org/lists/lua-l/2023-09/msg00005.html In this patch, we immediately reset nargs when lua_resume() returns to prevent any misuse. It should be backported to all maintained versions.	2023-09-12 19:50:17 +02:00
Willy Tarreau	93c2ea0ec3	MEDIUM: pools: refine pool size rounding The pools sizes were rounded up a little bit too much with commit `30f931ead` ("BUG/MEDIUM: pools: fix the minimum allocation size"). The goal was in fact to make sure they were always at least large enough to store 2 list heads, and stuffing this into the alignment calculation resulted in the size being always rounded up to this size. This is problematic because it means that the appended tag at the end doesn't always catch potential overflows since more bytes than needed are allocated. Moreover, this test was later reinforced by commit `b5ba09ed5` ("BUG/MEDIUM: pools: ensure items are always large enough for the pool_cache_item"), proving that the first test was not always sufficient. This needs to be reworked to proceed correctly: - the two lists are needed when the object is in the cache, hence when we don't care about the tag, which means that the tag's size, if any, can easily cover for the missing bytes to reach that size. This is actually what was already being checked for. - the rounding should not be performed (beyond the size of a word to preserve pointer alignment) when pool tagging is enabled, otherwise we don't detect small overflows. It means that there will be less merging when proceeding like this. Tests show that we merge 93 pools into 36 without tags and 43 with tags enabled. - the rounding should not consider the extra size, since it's already done when calculating the allocated size later (i.e. don't round up twice). The difference is subtle but it's what makes sure the tag immediately follows the area instead of starting from the end. Thanks to this, now when writing one byte too many at the end of a struct stream, the error is instantly caught.	2023-09-12 18:14:05 +02:00
Willy Tarreau	61575769ac	DEBUG: pools: print the contents surrounding the expected tag location When no tag matches a known pool, we can inspect around to help figure what could have possibly overwritten memory. The contents are printed one machine word per line in hex, then using printable characters, and when they can be resolved to a pointer, either the pool's pointer name or a resolvable symbol with offset. The goal here is to help recognize what is easily identifiable in memory. For example applying the following patch to stream_free(): - pool_free(pool_head_stream, s); + pool_free(pool_head_stream, (void*)s+1); Causes the following dump to be emitted: FATAL: pool inconsistency detected in thread 1: tag mismatch on free(). caller: 0x59e968 (stream_free+0x6d8/0xa0a) item: 0x13df5c1 pool: 0x12782c0 ('stream', size 888, real 904, users 1) Tag does not match (0x4f00000000012782). Tag does not match any other pool. Contents around address 0x13df5c1+888=0x13df939: 0x13df918 [00 00 00 00 00 00 00 00] [........] 0x13df920 [00 00 00 00 00 00 00 00] [........] 0x13df928 [00 00 00 00 00 00 00 00] [........] 0x13df930 [00 00 00 00 00 00 00 00] [........] 0x13df938 [c0 82 27 01 00 00 00 00] [..'.....] [pool:stream] 0x13df940 [4f c0 59 00 00 00 00 00] [O.Y.....] [stream_new+0x4f/0xbec] 0x13df948 [49 46 49 43 41 54 45 2d] [IFICATE-] 0x13df950 [81 02 00 00 00 00 00 00] [........] 0x13df958 [df 13 00 00 00 00 00 00] [........] Other possible callers: (...) We notice that the tag references pool_head_stream with the allocation point in stream_new. Another benefit is that a caller may be figured from the tag even if the "caller" feature is not enabled, because upon a free() we always put the caller's location into the tag. This should be sufficient to debug most cases that normally require gdb.	2023-09-12 18:14:05 +02:00
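The "one machine word per line in hex, then using printable characters" layout shown in the dump above can be sketched as follows. This is only an illustration of the formatting step: `fmt_word()` is a hypothetical name, the word size is assumed to be 8 bytes as in the sample dump, and the address prefix and symbol resolution of the real output are left out.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

#define WORD_BYTES 8 /* assumption: 64-bit machine word, as in the dump above */

/* Hypothetical helper: formats WORD_BYTES bytes starting at <p> as
 * "[xx xx xx xx xx xx xx xx] [cccccccc]" into <out>, replacing
 * non-printable bytes with '.'. Returns the string length, or -1 if
 * <outsz> is too small to hold the result and its trailing NUL.
 */
static int fmt_word(char *out, size_t outsz, const unsigned char *p)
{
	/* '[' + hex digits and separators + "] [" + chars + ']' + NUL */
	const size_t need = 1 + (WORD_BYTES * 3 - 1) + 3 + WORD_BYTES + 1 + 1;
	size_t pos = 0;
	int i;

	assert(out && p);
	if (outsz < need)
		return -1;

	out[pos++] = '[';
	for (i = 0; i < WORD_BYTES; i++)
		pos += (size_t)snprintf(out + pos, outsz - pos,
		                        i ? " %02x" : "%02x", (unsigned)p[i]);

	memcpy(out + pos, "] [", 3);
	pos += 3;

	for (i = 0; i < WORD_BYTES; i++)
		out[pos++] = isprint(p[i]) ? (char)p[i] : '.';

	out[pos++] = ']';
	out[pos] = '\0';
	return (int)pos;
}
```

Feeding it the bytes `49 46 49 43 41 54 45 2d` from the dump above yields the `[IFICATE-]` text column, which is the kind of recognizable pattern the commit aims to surface.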
Willy Tarreau	0f9a10c7f1	DEBUG: pools: also print the value of the tag when it doesn't match Sometimes the tag's value may reveal a recognizable pattern, so let's print it when it doesn't match a known pool.	2023-09-12 18:14:05 +02:00
Willy Tarreau	96c1a24224	DEBUG: pools: also print the item's pointer when crashing To inspect a core or recognize some values, it's important to have the item pointer, but it was not provided.	2023-09-12 18:14:05 +02:00
Willy Tarreau	efc46dede9	DEBUG: pools: inspect pools on fatal error and dump information found It's a bit frustrating sometimes to see pool checks catch a bug but not provide exploitable information without a core. Here we're adding a function "pool_inspect_item()" which is called just before aborting in pool_check_pattern() and POOL_DEBUG_CHECK_MARK() and which will display the error type, the pool's pointer and name, and will try to check if the item's tag matches the pool, and if not, will iterate over all pools to see if one would be a better candidate, then will try to figure the last known caller and possibly other likely candidates if the pool's tag is not sufficiently trusted. This typically helps better diagnose corruption in use-after-free scenarios, or freeing to a pool that differs from the one the object was allocated from, and will also indicate calling points that may help figure where an object was last released or allocated. The info is printed on stderr just before the backtrace. For example, the recent off-by-one test in the PPv2 changes would have produced the following output in vtest logs: * h1 debug\|FATAL: pool inconsistency detected in thread 1: tag mismatch on free(). * h1 debug\| caller: 0x62bb87 (conn_free+0x147/0x3c5) * h1 debug\| pool: 0x2211ec0 ('pp_tlv_256', size 304, real 320, users 1) * h1 debug\|Tag does not match. Possible origin pool(s): * h1 debug\| tag: @0x2565530 = 0x2216740 (pp_tlv_128, size 176, real 192, users 1) * h1 debug\|Recorded caller if pool 'pp_tlv_128': *** h1 debug\| @0x2565538 (+0184) = 0x62c76d (conn_recv_proxy+0x4cd/0xa24) A mismatch in the allocated/released pool is already visible, and the callers confirm it once resolved, where the allocator indeed allocates from pp_tlv_128 and conn_free() releases to pp_tlv_256: $ addr2line -spafe ./haproxy <<< $'0x62bb87\n0x62c76d' 0x000000000062bb87: conn_free at connection.c:568 0x000000000062c76d: conn_recv_proxy at connection.c:1177	2023-09-11 15:46:14 +02:00
Willy Tarreau	f6bee5a50b	DEBUG: pools: make pool_check_pattern() take a pointer to the pool This will be useful to report detailed bug traces.	2023-09-11 15:19:49 +02:00
Willy Tarreau	e92e96b00f	DEBUG: pools: pass the caller pointer to the check functions and macros In preparation for more detailed pool error reports, let's pass the caller pointers to the check functions. This will be useful to produce messages indicating where the issue happened.	2023-09-11 15:19:49 +02:00
Willy Tarreau	baf2070421	DEBUG: pools: always record the caller for uncached allocs as well When recording the caller of a pool_alloc(), we currently store it only when the object comes from the cache and never when it comes from the heap. There's no valid reason for this except that the caller's pointer was not passed to pool_alloc_nocache(), so it used to set NULL there. Let's just pass it down the chain.	2023-09-11 15:19:49 +02:00
Frédéric Lécaille	2dedbe76c9	BUG/MINOR: quic: fdtab array underflow access When using the listener socket as file descriptor, qc->fd value is -1. In this case one must not access fdtab[qc->fd] element to change its value. This bug could have been detected by asan with such a backtrace: ================================================================= ==402222==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x7fa8ecf417ex7fa8e915cf90 sp 0x7fa8e915cf88 WRITE of size 8 at 0x7fa8ecf417e8 thread T6 #0 0x55707a0bf18a in qc_new_cc_conn src/quic_conn.c:838 #1 0x55707a0c6dc0 in quic_conn_release src/quic_conn.c:1408 #2 0x55707a10916f in quic_close src/xprt_quic.c:35 #3 0x55707a0cec77 in conn_xprt_close include/haproxy/connection.h:153 #4 0x55707a0ceed0 in conn_full_close include/haproxy/connection.h:197 #5 0x55707a0ec253 in qcc_release src/mux_quic.c:2412 #6 0x55707a0ec7d0 in qcc_io_cb src/mux_quic.c:2443 #7 0x55707a63ff2a in run_tasks_from_lists src/task.c:596 #8 0x55707a641cc9 in process_runnable_tasks src/task.c:876 #9 0x55707a56f7b2 in run_poll_loop src/haproxy.c:2954 #10 0x55707a5705fd in run_thread_poll_loop src/haproxy.c:3153 #11 0x7fa8f9450ea6 in start_thread nptl/pthread_create.c:477 #12 0x7fa8f936ea2e in __clone (/lib/x86_64-linux-gnu/libc.so.6+0xfba2e) 0x7fa8ecf417e8 is located 24 bytes to the left of 134217728-byte region [0x7fa8e allocated by thread T0 here: #0 0x7fa8f9a37037 in __interceptor_calloc ../../../../src/libsanitizer/asan/ #1 0x55707a71a61d in init_pollers src/fd.c:1161 #2 0x55707a56cdf1 in init src/haproxy.c:2672 #3 0x55707a5714c2 in main src/haproxy.c:3298 #4 0x7fa8f9296d09 in __libc_start_main ../csu/libc-start.c:308 Thread T6 created by T0 here: #0 0x7fa8f99e22a2 in __interceptor_pthread_create ../../../../src/libsanitizpp:214 #1 0x55707a748a21 in setup_extra_threads src/thread.c:252 #2 0x55707a5735c9 in main src/haproxy.c:3844 #3 0x7fa8f9296d09 in __libc_start_main ../csu/libc-start.c:308 SUMMARY: AddressSanitizer: heap-buffer-overflow 
src/quic_conn.c:838 in qc_new_cc Shadow bytes around the buggy address: 0x0ff59d9e02a0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x0ff59d9e02b0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x0ff59d9e02c0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x0ff59d9e02d0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x0ff59d9e02e0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa =>0x0ff59d9e02f0: fa fa fa fa fa fa fa fa fa fa fa fa fa[fa]fa fa 0x0ff59d9e0300: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0ff59d9e0310: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0ff59d9e0320: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0ff59d9e0330: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0ff59d9e0340: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 Shadow byte legend (one shadow byte represents 8 application bytes): Addressable: 00 Partially addressable: 01 02 03 04 05 06 07 Heap left redzone: fa Freed heap region: fd Stack left redzone: f1 Stack mid redzone: f2 Stack right redzone: f3 Stack after return: f5 Stack use after scope: f8 Global redzone: f9 Global init order: f6 Poisoned by user: f7 Container overflow: fc Array cookie: ac Intra object redzone: bb ASan internal: fe Left alloca redzone: ca Right alloca redzone: cb Shadow gap: cc ==402222==ABORTING Aborted Thank you to @Tristan971 for having reported this bug in GH #2247. No need to backport.	2023-09-11 15:14:22 +02:00
Willy Tarreau	4a18d9e560	REORG: cpuset: move parse_cpu_set() and parse_cpumap() to cpuset.c These ones were still in cfgparse.c but they're not specific to the config at all and may actually be used even when parsing cpu list entries in /sys. Better move them where they can be reused.	2023-09-08 16:25:19 +02:00
Willy Tarreau	5119109e3f	MINOR: cpuset: dynamically allocate cpu_map cpu_map is 8.2kB/entry and there's one such entry per group, that's ~520kB total. In addition, the init code is still in haproxy.c enclosed in ifdefs. Let's make this a dynamically allocated array in the cpuset code and remove that init code. Later we may even consider reallocating it once the number of threads and groups is known, in order to shrink it a little bit, as the typical setup with a single group will only need 8.2kB, thus saving half a MB of RAM. This would require that the upper bound is placed in a variable though.	2023-09-08 16:25:19 +02:00
Willy Tarreau	b0f20ed79b	MEDIUM: cfgparse: assign NUMA affinity to cpu-maps Do not force affinity on the process, instead let's just apply it to cpu-map, it will automatically be used later in the init process. We can do this because we know that cpu-map was not set when we're using this detection code. This is much saner, as we don't need to manipulate the process' affinity at this point in time, and just update the info that the user omitted to set by themselves, which guarantees a better long-term consistency with the documented feature.	2023-09-08 16:25:19 +02:00
Willy Tarreau	809a49da96	MINOR: cfgparse: use read_line_to_trash() to read from /sys It's easier to use this function now to natively support variable fields in the file's path. This also removes read_file_from_trash() that was only used here and was static.	2023-09-08 16:25:19 +02:00
Willy Tarreau	1f2433fb6a	MINOR: tools: add function read_line_to_trash() to read a line of a file This function takes on input a printf format for the file name, making it particularly suitable for /proc or /sys entries which take a lot of numbers. It also automatically trims the trailing CR and/or LF chars.	2023-09-08 16:25:19 +02:00
Willy Tarreau	5f10176e2c	MEDIUM: init: initialize the trash earlier More and more utility functions rely on the trash while most of the init code doesn't have access to it because it's initialized very late (in PRE_CHECK for the initial one). It's a pool, and it purposely supports being reallocated, so let's initialize it in STG_POOL so that early STG_INIT code can at least use it.	2023-09-08 16:25:19 +02:00
Frédéric Lécaille	e3e218b98e	CLEANUP: quic: Remove useless free_quic_tx_pkts() function. This function is defined but has no longer been used since this commit: BUG/MAJOR: quic: Really ignore malformed ACK frames.	2023-09-08 10:17:25 +02:00
Frédéric Lécaille	292dfdd78d	BUG/MINOR: quic: Wrong cluster secret initialization The function generate_random_cluster_secret() which initializes the cluster secret when not supplied by configuration is buggy. There is a 1/256 chance that the cluster secret string is empty. To fix this, the cluster secret, if defined by configuration, is stored as the first 128 bits of its own SHA1 (160 bits) digest. If this is not the case, it is initialized with a 128-bit random value. Thus, the cluster secret is always initialized. As a consequence, several tests are from now on useless. This patch removes such tests (if(global.cluster_secret)) in the QUIC code part and at parsing time: no need to check that a cluster secret was initialized with "quic-force-retry" option. Must be backported as far as 2.6.	2023-09-08 09:50:58 +02:00
William Lallemand	15e591b6e0	MINOR: ssl: add support for 'curves' keyword on server lines This patch implements the 'curves' keyword on server lines as well as the 'ssl-default-server-curves' keyword in the global section. It also adds the keyword on the server line in the ssl_curves reg-test. These keywords allow the configuration of the curves list for a server.	2023-09-07 23:29:10 +02:00
Willy Tarreau	28ff1a5d56	MINOR: tasks/stats: report the number of niced tasks in "show info" We currently know the number of tasks in the run queue that are niced, and we don't expose it. It's too bad because it can give a hint about what share of the load is relevant. For example if one runs a Lua script that was purposely reniced, or if a stats page or the CLI is hammered with slow operations, seeing them appear there can help identify what part of the load is not caused by the traffic, and improve monitoring systems or autoscalers.	2023-09-06 17:44:44 +02:00
Remi Tricot-Le Breton	e03d060aa3	MINOR: cache: Change hash function in default normalizer used in case of "vary" When building the secondary signature for cache entries when vary is enabled, the referer part of the signature was a simple crc32 of the first referer header. This patch changes it to a 64-bit hash based on the xxhash algorithm with a random seed built during init. This will prevent "malicious" hash collisions between entries of the cache.	2023-09-06 16:11:31 +02:00
Aurelien DARRAGON	7a71801af6	CLEANUP: log: remove unnecessary trim in __do_send_log Since both sink_write and fd_write_frag_line take the maxlen parameter as argument, there is no added value for the trim before passing the msg parameter to those functions.	2023-09-06 16:06:39 +02:00
Aurelien DARRAGON	8e6339aa29	MEDIUM: sink: add sink_finalize() function To further clean the code and remove duplication, some sink postparsing and sink->sft finalization is now performed in a dedicated function named sink_finalize().	2023-09-06 16:06:39 +02:00
Aurelien DARRAGON	b2879e3502	MEDIUM: sink/ring: introduce high level ring creation helper function ease code maintenance.	2023-09-06 16:06:39 +02:00
Aurelien DARRAGON	5a8755681d	MINOR: sink: add helper function to deallocate sink struct In this patch we move sink freeing logic outside of sink_deinit() function in order to create the sink_free() helper function that could be used on error paths for example.	2023-09-06 16:06:39 +02:00
Aurelien DARRAGON	6049a478e4	MEDIUM: spoe-agent: properly postresolve log rings Now that we have sink_postresolve_logsrvs() function, we make use of it for spoe-agent log postparsing logic. This will allow this kind of config to work: \|spoe-agent test \| log tcp@127.0.0.1:514 local0 \| use-backend xxx Plus, consistency checks will also be performed as for regular log directives used from global, log-forward or proxy sections.	2023-09-06 16:06:39 +02:00
Aurelien DARRAGON	486aa01204	MEDIUM: fcgi-app: properly postresolve logsrvs Now that we have postresolve_logsrv_list() function, we make use of it for fcgi-app log postparsing logic. This will allow this kind of config to work: \|fcgi-app test \| docroot / \| log-stderr tcp@127.0.0.1:514 local0 Plus, consistency checks will also be performed as for regular log directives used from global, log-forward or proxy sections.	2023-09-06 16:06:39 +02:00
Aurelien DARRAGON	d9b81e5b49	MEDIUM: log/sink: make logsrv postparsing more generic We previously had postparsing logic but only for logsrv sinks, but now we need to make this operation on logsrv directly instead of sinks to prepare for additional postparsing logic that is not sink-specific. To do this, we migrated post_sink_resolve() and sink_postresolve_logsrvs() to their postresolve_logsrvs() and postresolve_logsrv_list() equivalents. Then, we split postresolve_logsrv_list() so that the sink-only logic stays in sink.c (sink_resolve_logsrv_buffer() function), and the "generic" target part stays in log.c as resolve_logsrv(). Error messages formatting was preserved as far as possible but some slight variations are to be expected. As for the functional aspect, no change should be expected.	2023-09-06 16:06:39 +02:00
Aurelien DARRAGON	969e212c66	MINOR: log: add dup_logsrv() helper function ease code maintenance by introducing dup_logsrv() helper function to properly duplicate an existing logsrv struct.	2023-09-06 16:06:39 +02:00
Aurelien DARRAGON	7a12e2d369	MEDIUM: httpclient/logs: rely on per-proxy post-check instead of global one httpclient used to register a global post-check function to iterate over all known proxies and post-initialize httpclient related ones (mainly for logs initialization). But we currently have an issue: post_sink_resolve() function which is also registered using REGISTER_POST_CHECK() macro conflicts with httpclient_postcheck() function. This is because post_sink_resolve() relies on proxy->logsrvs to be correctly initialized already, and httpclient_postcheck() may create and insert new logsrvs entries to existing proxies when executed. So depending on which function runs first, we could run into trouble. Hopefully, to this day, everything works "by accident" due to http_client.c file being loaded before sink.c file when compiling source code. But as soon as we would move one of the two functions to other files, or if we rename files or make changes to the Makefile build recipe, we could break this at any time. To prevent post_sink_resolve() from randomly failing in the future, we now make httpclient postcheck rely on per-proxy post-checks by slightly modifying httpclient_postcheck() function so that it can be registered using REGISTER_POST_PROXY_CHECK() macro. As per-proxy post-check functions are executed right after config parsing for each known proxy (vs global post-check which are executed a bit later in the init process), we can be certain that functions registered using global post-check macro, ie: post_sink_resolve(), will always be executed after httpclient postcheck, effectively resolving the ordering conflict. This should normally not cause visible behavior changes, and while it could be considered as a bug, it's probably not worth backporting it since the only way to trigger the issue is through code refactors, unless we want to backport it to ease code maintenance of course, in which case it should easily apply for >= 2.7.	2023-09-06 16:06:39 +02:00
Aurelien DARRAGON	e187361b52	MINOR: log: move log-forwarders cleanup in log.c Move the log-forwarded proxies cleanup from global deinit() function into log dedicated deinit function. No backport needed.	2023-09-06 16:06:39 +02:00
Aurelien DARRAGON	32f1db6d0d	MEDIUM: sink: don't perform implicit truncations when maxlen is not set maxlen now defaults to ~0 (instead of BUFSIZE) to make sure no implicit truncation will be performed when the option is not specified, since the doc doesn't mention any default value for maxlen. As such, if the payload is too big, it will be dropped (this is the default expected behavior).	2023-09-06 16:06:39 +02:00
Aurelien DARRAGON	fdf82d058b	MINOR: sink: inform the user when logs will be implicitly truncated Consider the following example: \|log ring@test-ring len 2000 local0 \| \|ring test-ring \| maxlen 1000 This would result in emitted logs being silently truncated to 1000 because test-ring maxlen is smaller than the log directive maxlen. In this patch we're adding an extra check in post_sink_resolve() to detect this kind of confusing setups and warn the user about the implicit truncation when DIAG mode is on. This commit depends on: - "MINOR: sink: simplify post_sink_resolve function"	2023-09-06 16:06:39 +02:00
Aurelien DARRAGON	ceaa1ddb06	MINOR: log/sink: detect when log maxlen exceeds sink size To prevent logs from being silently (and unexpectedly) dropped at runtime, we check that the maxlen parameter from the log directives is strictly inferior to the targeted ring size. \|global \| tune.bufsize 16384 \| log tcp@127.0.0.1:514 len 32768 \| log myring@127.0.0.1:514 len 32768 \|ring myring \| # no explicit size On such configs, a diag warning will be reported. This commit depends on: - "MINOR: sink: simplify post_sink_resolve function" - "MINOR: ring: add a function to compute max ring payload"	2023-09-06 16:06:39 +02:00
Aurelien DARRAGON	d499485aa9	MINOR: sink: simplify post_sink_resolve function Simplify post_sink_resolve() function to reduce code duplication and make it easier to maintain.	2023-09-06 16:06:39 +02:00
Aurelien DARRAGON	ddd8671b19	BUG/MEDIUM: ring: adjust maxlen consistency check When the user specifies a maxlen parameter that is greater than the size of a given ring section, a warning is emitted to inform that the max length exceeds size, and then the maxlen is forced to size. The logic is good, but imprecise, because it doesn't take into account the slight overhead from storing payloads into the ring. In practice, we cannot store a single message which is exactly the same length as size. Doing so will result in the message being dropped at runtime. Thanks to the ring_max_payload() function introduced in "MINOR: ring: add a function to compute max ring payload", we can now deduce the maximum value for the maxlen parameter before it could result in messages being dropped. When maxlen is set to an improper value, the warning will be emitted and maxlen will be forced to the maximum "single" payload len that could fit in the ring buffer, preventing messages from being dropped unexpectedly. This commit depends on: - "MINOR: ring: add a function to compute max ring payload" This may be backported as far as 2.2	2023-09-06 16:06:39 +02:00
Aurelien DARRAGON	5b295ff409	MINOR: ring: add a function to compute max ring payload Add a helper function to the ring API to compute the maximum payload length that could fit into the ring based on ring size.	2023-09-06 16:06:39 +02:00
Aurelien DARRAGON	4457783ade	MINOR: http_ana: position the FINAL flag for http_after_res execution Ensure that the ACT_OPT_FINAL flag is always set when executing actions from http_after_res context. This will permit lua functions to be executed as http_after_res actions since hlua_ctx_resume() automatically disables "yielding" when such flag is set: the hlua handler will only allow 1shot executions at this point (lua or not, we don't want to reschedule http_after_res actions).	2023-09-06 11:42:34 +02:00
Aurelien DARRAGON	967608a432	BUG/MINOR: hlua/action: incorrect message on E_YIELD error When hlua_action error messages were reworked in `d5b073cf1` ("MINOR: lua: Improve error message"), an error was made for the E_YIELD case. Indeed, everywhere E_YIELD error is handled: "yield is not allowed" or similar error message is reported to the user. But instead we currently have: "aborting Lua processing on expired timeout". It is quite misleading because this error message often refers to the HLUA_E_ETMOUT case. Thus, we now report the proper error message thanks to this patch. This should be backported to all stable versions. [on 2.0, the patch needs to be slightly adapted]	2023-09-06 11:42:34 +02:00
Frédéric Lécaille	e7240a0ba6	BUG/MINOR: quic: Dereferenced unchecked pointer to Handshake packet number space This issue was reported by longrtt interop test with quic-go as client and @chipitsine in GH #2282 when haproxy is compiled against libressl. Add two checks to prevent a pointer to the Handshake packet number space to be dereferenced if this packet number space was released. Thank you to @chipitsine for this report. No need to backport.	2023-09-06 10:13:40 +02:00
Christopher Faulet	700ca14fc1	BUG/MINOR: ring/cli: Don't expect input data when showing events The "show events" command may wait for new events if "-w" option is used. In this case, no timeout must be triggered. So we explicitly state no input data are expected. This disables the read timeout on the client side. This patch should be backported to 2.8. It is probably useless to backport it further. In all cases, it depends on the commit "BUG/MINOR: applet: Always expect data when CLI is waiting for a new command"	2023-09-06 09:36:29 +02:00
Christopher Faulet	2f1e0a0a46	BUG/MINOR: applet: Always expect data when CLI is waiting for a new command There is a mechanism for applets to disable the read timeout on the opposite side if it is not waiting for any data. Of course, there is also a way to re-activate it. But it must be explicitly handled by applets. For the CLI, some commands may state no input data are expected. So we must be sure to reset its state when the applet is waiting for a new command. For now, it is not a bug because no CLI command uses this mechanism. This patch must be backported to 2.8.	2023-09-06 09:36:19 +02:00
Christopher Faulet	8073094bf1	BUG/MEDIUM: stconn: Always update stream's expiration date after I/O It is a revert of the following patches: * `d7111e7ac` ("MEDIUM: stconn: Don't requeue the stream's task after I/O") * `3479d99d5` ("BUG/MEDIUM: stconn: Update stream expiration date on blocked sends") Because the first one is reverted, the second one is useless and can be reverted too. The issue here is that I/O may be performed without stream wakeup. So if no expiration date was set on the last call to process_stream(), the stream is never rescheduled and no timeout can be detected. This especially happens on TCP streams because fast-forward is enabled very early. Instead of tracking all places where the stream's expiration date must be updated, it is now centralized in sc_notify(), as it was performed before the timeout refactoring. This patch must be backported to 2.8.	2023-09-06 09:29:27 +02:00
Christopher Faulet	b9c87f8082	BUG/MEDIUM: stconn/stream: Forward shutdown on write timeout The commit `7f59d68fe2` ("BUG/MEDIIM: stconn: Flush output data before forwarding close to write side") introduced a regression. When a write timeout is detected, the shutdown is no longer forwarded. Depending on the channels' state, it may block the processing, waiting for the client or the server to leave. The commit above tries to avoid truncating messages on shutdown but on write timeout, if the channel is not empty, there is nothing more we can do to send these data. It means the endpoint is unable to send data. In this case, we must forward the shutdown. This patch should be backported as far as 2.2.	2023-09-06 09:29:27 +02:00
Christopher Faulet	d18657ae11	BUG/MEDIUM: applet: Report an error if applet requests more room on aborted SC If an abort was performed and the applet still requests more room, it means the applet has not properly handled the error on its own. At least the CLI applet is concerned. Instead of reviewing all applets, the error is now handled in task_run_applet() function. Because of this bug, a session may be blocked infinitely and may also lead to a wakeup loop. This patch must only be backported to 2.8 for now. And only to lower versions if a bug is reported because it is a bit sensitive and the code in older versions is very different.	2023-09-06 09:29:27 +02:00
Christopher Faulet	34645a6365	BUG/MEDIUM: stconn: Report read activity when a stream is attached to front SC It only concerns the front SC. But it is important to report a read activity when a stream is created and attached to the front SC, especially in TCP. In HTTP, when this happens, the request was necessarily received. But in TCP, the client may open a connection without sending anything. We must still report a first read activity in this case to be able to properly report client timeout. This patch must be backported to 2.8.	2023-09-06 09:29:27 +02:00
Christopher Faulet	015fec6a29	BUG/MINOR: stconn: Don't inhibit shutdown on connection on error In the SC function responsible to perform shutdown, there is a statement inhibiting the shutdown if an error was encountered on the SC. This statement is inherited from very old version and should in fact be removed. The error may be set from the stream. In this case the shutdown must be performed. In all cases, it is not a big deal if the shutdown is performed twice because underlying functions already handle multiple calls. This patch does not fix any bug. Thus there is no reason to backport it.	2023-09-06 09:29:27 +02:00
Frédéric Lécaille	fb4294be55	BUG/MINOR: quic: Wrong RTT computation (srtt and rtt_var) Due to the fact that several variable values (rtt_var, srtt) were stored as multiples of their real values, some calculations were less accurate than expected. Stop storing 4*rtt_var values, and 8*srtt values. Adjust all the impacted statements. Must be backported as far as 2.6.	2023-09-05 17:14:51 +02:00
Frédéric Lécaille	cf768f7456	BUG/MINOR: quic: Wrong RTT adjustments There was a typo in the test statement to check if the rtt must be adjusted (>= incorrectly replaced by >). Must be backported as far as 2.6.	2023-09-05 17:14:51 +02:00
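The two RTT fixes above can be illustrated with a minimal sketch of the RFC 9002 estimator. It assumes millisecond units and stores `srtt` and `rtt_var` at their real values rather than as 8x and 4x multiples. The names `rtt_est` and `rtt_update()` are hypothetical, not HAProxy's identifiers, and the clamp of the ACK delay to max_ack_delay is left out for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical RTT state: every field holds its real value in ms. */
struct rtt_est {
	uint32_t srtt;     /* smoothed RTT */
	uint32_t rtt_var;  /* RTT variation */
	uint32_t rtt_min;  /* smallest RTT sample seen so far */
	int has_sample;    /* 0 until the first sample is recorded */
};

/* Feed one RTT sample and the ACK delay reported by the peer. */
static void rtt_update(struct rtt_est *e, uint32_t latest, uint32_t ack_delay)
{
	uint32_t adjusted = latest;
	uint32_t diff;

	assert(e != NULL);

	if (!e->has_sample) {
		/* first sample: initialize the estimator */
		e->srtt = latest;
		e->rtt_var = latest / 2;
		e->rtt_min = latest;
		e->has_sample = 1;
		return;
	}

	if (latest < e->rtt_min)
		e->rtt_min = latest;

	/* ">=" and not ">": the sample is adjusted even when it is
	 * exactly equal to rtt_min + ack_delay.
	 */
	if (latest >= e->rtt_min + ack_delay)
		adjusted = latest - ack_delay;

	diff = e->srtt > adjusted ? e->srtt - adjusted : adjusted - e->srtt;

	/* rtt_var = 3/4 * rtt_var + 1/4 * |srtt - adjusted| */
	e->rtt_var = (3 * e->rtt_var + diff) / 4;
	/* srtt = 7/8 * srtt + 1/8 * adjusted */
	e->srtt = (7 * e->srtt + adjusted) / 8;
}
```

With a first sample of 100 ms followed by a 120 ms sample carrying a 20 ms ACK delay, the boundary case applies: the comparison with `>=` subtracts the delay, whereas `>` would leave the sample unadjusted and inflate the smoothed RTT.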
William Lallemand	6bc00a97da	MINOR: httpclient: allow to configure the timeout.connect When using the httpclient, one could be bothered with it returning after a very long time when failing. By default the httpclient has a retries of 3 and a timeout connect of 5s, which can result in a pause of 20s upon failure. This patch allows the user to configure the "timeout connect" of the httpclient so it could reduce the time to return an error. This patch helps fix part of the issue #2269. Could be backported in 2.7 if needed.	2023-09-05 16:42:27 +02:00
William Lallemand	c52948bd2c	MINOR: httpclient: allow to configure the retries When using the httpclient, one could be bothered with it returning after a very long time when failing. By default the httpclient has a retries of 3 and a timeout connect of 5s, which can result in a pause of 20s upon failure. This patch allows the user to configure the retries of the httpclient so it could reduce the time to return an error. This patch helps fix part of the issue #2269. Could be backported in 2.7 if needed.	2023-09-05 15:55:04 +02:00
William Lallemand	fcb080d8f9	MEDIUM: mworker: display a more accessible message when a worker crashes Should fix issue #1034. When a worker crashes, display a more accessible message about what to do. Example: $ ./haproxy -W -f haproxy.cfg [NOTICE] (308877) : New worker (308884) forked [NOTICE] (308877) : Loading success. [NOTICE] (308877) : haproxy version is 2.9-dev4-d90d3b-58 [NOTICE] (308877) : path to executable is ./haproxy [ALERT] (308877) : Current worker (308884) exited with code 139 (Segmentation fault) [WARNING] (308877) : A worker process unexpectedly died and this can only be explained by a bug in haproxy or its dependencies. Please check that you are running an up to date and maintained version of haproxy and open a bug report. HAProxy version 2.9-dev4-d90d3b-58 2023/09/05 - https://haproxy.org/ Status: development branch - not safe for use in production. Known bugs: https://github.com/haproxy/haproxy/issues?q=is:issue+is:open Running on: Linux 6.2.0-31-generic #31-Ubuntu SMP PREEMPT_DYNAMIC Mon Aug 14 13:42:26 UTC 2023 x86_64 [ALERT] (308877) : exit-on-failure: killing every processes with SIGTERM [WARNING] (308877) : All workers exited. Exiting... (139)	2023-09-05 15:31:04 +02:00
William Lallemand	d90d3bf894	MINOR: global: export the display_version() symbol Export the display_version() function which can be used elsewhere than in haproxy.c	2023-09-05 15:24:39 +02:00
Frédéric Lécaille	aeb2f28ca7	BUG/MINOR: quic: Unchecked pointer to Handshake packet number space It is possible that there are still Initial crypto data in flight without Handshake crypto data in flight. This is very rare but possible. This issue was reported by handshakeloss interop test with quic-go as client and @chipitsine in GH #2279. No need to backport.	2023-09-05 11:38:33 +02:00
Frédéric Lécaille	3afe54ed5b	BUILD: quic: Compilation issue on 32-bits systems with quic_may_send_bytes() quic_may_send_bytes() implementation arrived with this commit: MINOR: quic: Amplification limit handling sanitization. It returns a size_t. So when compared with QUIC_MIN() with qc->path->mtu there is no need to cast the latter anymore because it is also a size_t. Detected when compiled with -m32 gcc option.	2023-09-05 10:33:56 +02:00
Willy Tarreau	86854dd032	MEDIUM: threads: detect excessive thread counts vs cpu-map This detects when there are more threads bound via cpu-map than CPUs enabled in cpu-map, or when there are more total threads than the total number of CPUs available at boot (for unbound threads) and configured for bound threads. In this case, a warning is emitted to explain the problems it will cause, and explaining how to address the situation. Note that some configurations will not be detected as faulty because the algorithmic complexity to resolve all arrangements grows in O(N!). This means that having 3 threads on 2 CPUs and one thread on 2 CPUs will not be detected as it's 4 threads for 4 CPUs. But at least configs such as T0:(1,4) T1:(1,4) T2:(2,4) T3:(3,4) will not trigger a warning since they're valid.	2023-09-04 19:39:17 +02:00
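The bound-thread part of the check described above can be sketched as a simple counting test. This is a simplified illustration, assuming each thread's binding fits in a 64-bit mask where bit N stands for CPU N; it does not use HAProxy's ha_cpuset API, and `count_bits()` and `too_many_bound_threads()` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Count the bits set in a mask (portable population count). */
static unsigned int count_bits(uint64_t mask)
{
	unsigned int n = 0;

	while (mask) {
		mask &= mask - 1; /* clear the lowest set bit */
		n++;
	}
	return n;
}

/* Return 1 if more threads are bound than there are distinct CPUs in
 * the union of their masks, otherwise 0. A mask of 0 means "unbound".
 * Only totals are compared, so arrangements that are wrong per thread
 * but right in total are not detected.
 */
static int too_many_bound_threads(const uint64_t *masks, size_t nbthr)
{
	uint64_t all_cpus = 0;
	size_t bound = 0;
	size_t i;

	assert(masks != NULL || nbthr == 0);

	for (i = 0; i < nbthr; i++) {
		if (!masks[i])
			continue;
		all_cpus |= masks[i]; /* union of the CPUs in use */
		bound++;
	}
	return bound > count_bits(all_cpus);
}
```

Comparing totals keeps the check linear in the number of threads. The price is the blind spot the commit message mentions: three threads sharing two CPUs plus one thread on two other CPUs still counts as four threads for four CPUs.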
Willy Tarreau	8357f950cb	MEDIUM: threads: detect incomplete CPU bindings It's very easy to mess up with some cpu-map directives and to leave some thread unbound. Let's add a test that checks that either all threads are bound or none are bound, but that we do not face the intermediary situation where some are pinned and others are left wandering around, possibly on the same CPUs as bound ones. Note that this should not be backported, or maybe turned into a notice only, as it appears that it will easily catch invalid configs and that may break updates for some users.	2023-09-04 19:39:17 +02:00
Willy Tarreau	e65f54cf96	MINOR: cpuset: centralize a reliable bound cpu detection Till now the CPUs that were bound were only retrieved in thread_cpus_enabled() in order to count the number of CPUs allowed, and it relied on arch-specific code. Let's slightly arrange this into ha_cpuset_detect_bound() that reuses the ha_cpuset struct and the accompanying code. This makes the code much clearer without having to carry along some arch-specific stuff out of this area. Note that the macos-specific code used in thread.c to only count online CPUs but not retrieve a mask, so for now we can't infer anything from it and can't implement it. In addition and more importantly, this function is reliable in that it will only return a value when the detection is accurate, and will not return incomplete sets on operating systems where we don't have an exact list, such as online CPUs.	2023-09-04 19:39:17 +02:00
Willy Tarreau	d3ecc67a01	MINOR: cpuset: add ha_cpuset_or() to bitwise-OR two CPU sets This operation was not implemented and will be needed later.	2023-09-04 19:39:17 +02:00
Willy Tarreau	eb10567254	MINOR: cpuset: add ha_cpuset_isset() to check for the presence of a CPU in a set This function will be convenient to test for the presence of a given CPU in a set.	2023-09-04 19:39:17 +02:00
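The two cpuset helpers described in the commits above (a bitwise OR of two sets, and a membership test) can be illustrated with a minimal sketch. It assumes a CPU set is a single 64-bit mask; the `cpuset_sketch` type and function names are hypothetical and are not haproxy's actual `ha_cpuset` implementation, which is platform-dependent.

```c
#include <stdint.h>

/* Hypothetical stand-in for a CPU set: one bit per CPU, CPUs 0..63 only. */
struct cpuset_sketch {
	uint64_t bits;
};

/* Add <cpu> to the set; out-of-range values are ignored. */
static void cpuset_sketch_set(struct cpuset_sketch *s, int cpu)
{
	if (cpu >= 0 && cpu < 64)
		s->bits |= (uint64_t)1 << cpu;
}

/* Bitwise-OR <src> into <dst>: <dst> ends up holding the union of both sets. */
static void cpuset_sketch_or(struct cpuset_sketch *dst, const struct cpuset_sketch *src)
{
	dst->bits |= src->bits;
}

/* Return non-zero if <cpu> is present in the set, zero otherwise. */
static int cpuset_sketch_isset(const struct cpuset_sketch *s, int cpu)
{
	if (cpu < 0 || cpu >= 64)
		return 0;
	return (int)((s->bits >> cpu) & 1);
}
```

With such helpers, the union of all per-thread masks can be accumulated with the OR operation and then probed CPU by CPU with the membership test.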
Willy Tarreau	fca3fc0d90	BUILD: checks: shut up yet another stupid gcc warning gcc has always had hallucinations regarding value ranges, and this one is interesting, and affects branches 4.7 to 11.3 at least. When building without threads, the randomly picked new_tid that is reduced to a multiply by 1 shifted right 32 bits, hence a constant output of 0 shows this warning: src/check.c: In function 'process_chk_conn': src/check.c:1150:32: warning: array subscript [-1, 0] is outside array bounds of 'struct thread_ctx[1]' [-Warray-bounds] In file included from include/haproxy/thread.h:28, from include/haproxy/list.h:26, from include/haproxy/action.h:28, from src/check.c:31: or this one when trying to force the test to see that it cannot be zero(!): src/check.c: In function 'process_chk_conn': src/check.c:1150:54: warning: array subscript [0, 0] is outside array bounds of 'struct thread_ctx[1]' [-Warray-bounds] 1150 \| uint t2_act = _HA_ATOMIC_LOAD(&ha_thread_ctx[thr2].active_checks); \| ~~~~~~~~~~~~~^~~~~~ include/haproxy/atomic.h:66:40: note: in definition of macro 'HA_ATOMIC_LOAD' 66 \| #define HA_ATOMIC_LOAD(val) *(val) \| ^~~ src/check.c:1150:24: note: in expansion of macro '_HA_ATOMIC_LOAD' 1150 \| uint t2_act = _HA_ATOMIC_LOAD(&ha_thread_ctx[thr2].active_checks); \| ^~~~~~~~~~~~~~~ Let's just add an ALREADY_CHECKED() statement there, no other check seems to get rid of it. No backport is needed.	2023-09-04 19:38:51 +02:00
Andrew Hopkins	b3f94f8b3b	BUILD: ssl: Build with new cryptographic library AWS-LC This adds a new option for the Makefile USE_OPENSSL_AWSLC, and update the documentation with instructions to use HAProxy with AWS-LC. Update the type of the OCSP callback retrieved with SSL_CTX_get_tlsext_status_cb with the actual type for libcrypto versions greater than 1.0.2. This doesn't affect OpenSSL which casts the callback to void* in SSL_CTX_ctrl.	2023-09-04 18:19:18 +02:00
Miroslav Zagorac	3cfc30416c	MINOR: properly mark the end of the CLI command in error messages In several places in the file src/ssl_ckch.c, in the message about the incorrect use of the CLI command, the end of that CLI command is not correctly marked with the sign ' .	2023-09-04 18:13:43 +02:00
Willy Tarreau	8547f5cfa2	BUG/MINOR: stream: further protect stream_dump() against incomplete sessions As found by Coverity in issue #2273, the fix in commit `e64bccab2` ("BUG/MINOR: stream: protect stream_dump() against incomplete streams") was still not enough, as scf/scb are still dereferenced to dump their flags and states. This should be backported to 2.8.	2023-09-04 15:32:17 +02:00
Chris Staite	3939e39479	BUG/MEDIUM: h1-htx: Ensure chunked parsing with full output buffer A previous fix to ensure that there is sufficient space on the output buffer to place parsed data (#2053) introduced an issue that if the output buffer is filled on a chunk boundary no data is parsed but the congested flag is not set due to the state not being H1_MSG_DATA. The check to ensure that there is sufficient space in the output buffer is actually already performed in all downstream functions before it is used. This makes the early optimisation that avoids the state transition to H1_MSG_DATA needless. Therefore, in order to allow the chunk parser to continue in this edge case we can simply remove the early check. This ensures that the state can progress and set the congested flag correctly in the caller. This patch fixes #2262. The upstream change that caused this logic error was backported as far as 2.5, therefore it makes sense to backport this fix back that far also.	2023-09-04 12:15:36 +02:00
Willy Tarreau	135c66f6cb	BUG/MEDIUM: connection: fix pool free regression with recent ppv2 TLV patches In commit `fecc573da` ("MEDIUM: connection: Generic, list-based allocation and look-up of PPv2 TLVs") there was a tiny mistake, elements of length <= 128 are allocated from pool_pp_128 but only those of length < 128 are released to this pool, other ones go to pool_pp_256. Because of this, elements of size exactly 128 are allocated from 128 and released to 256. It can be reproduced a few times by running sample_fetches/tlvs.vtc 1000 times with -DDEBUG_DONT_SHARE_POOLS -DDEBUG_MEMORY_POOLS -DDEBUG_EXPR -DDEBUG_STRICT=2 -DDEBUG_POOL_INTEGRITY -DDEBUG_POOL_TRACING -DDEBUG_NO_POOLS. Not sure why it doesn't reproduce more often though. No backport is needed. This should address github issues #2275 and #2274.	2023-09-04 11:45:37 +02:00
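The off-by-one described in the commit above (allocate with `<= 128`, release with `< 128`) can be sketched as follows. This is only an illustration of the boundary mismatch: the enum and function names are hypothetical, and only the two pools named in the message are modelled.

```c
#include <stddef.h>

/* Hypothetical pool identifiers standing in for pool_pp_128 / pool_pp_256. */
enum tlv_pool_sketch {
	TLV_POOL_128,
	TLV_POOL_256,
};

/* Allocation side: lengths up to and including 128 come from the 128 pool. */
static enum tlv_pool_sketch tlv_pool_for_alloc(size_t len)
{
	return (len <= 128) ? TLV_POOL_128 : TLV_POOL_256;
}

/* Release side as described for the regression: strict '<' sends a
 * 128-byte element to the 256 pool, which it was never allocated from.
 */
static enum tlv_pool_sketch tlv_pool_for_free_buggy(size_t len)
{
	return (len < 128) ? TLV_POOL_128 : TLV_POOL_256;
}

/* Release side with the same boundary as the allocation side. */
static enum tlv_pool_sketch tlv_pool_for_free_fixed(size_t len)
{
	return (len <= 128) ? TLV_POOL_128 : TLV_POOL_256;
}
```

Only a length of exactly 128 exposes the mismatch, which is consistent with the message's remark that the problem reproduces rarely.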
Frédéric Lécaille	d52466726f	BUG/MINOR: quic: Unchecked pointer to packet number space dereferenced It is possible that there are still Initial crypto data in flight without Handshake crypto data in flight. This is very rare but possible. This issue was reported by long-rtt interop test with quic-go as client and @chipitsine in GH #2276. No need to backport.	2023-09-04 11:29:35 +02:00
Frédéric Lécaille	9077f20251	BUG/MAJOR: quic: Really ignore malformed ACK frames. If not correctly parsed, an ACK frame must be ignored without any more treatment. Before this patch an ACK frame could be partially correctly parsed, then some errors could be detected which leaded newly acknowledged packets to be released in a wrong way calling free_quic_tx_pkts() called by qc_parse_ack_frm(). But there is no reason to release such packets because of a malformed ACK frame. This patch modifies qc_parse_ack_frm(). The newly acknowledged TX packets is done in two steps. It first collects the newly acknowledged packet calling qc_newly_acked_pkts(). Then proceed the same way as before for the treatments of haproxy TX packets acknowledged by the peer. If the ACK frame could not be fully parsed, the newly ackowledged packets are replaced back from where they were detached: the tree of TX packets for their encryption level. Must be backported as far as 2.6.	2023-09-04 11:29:35 +02:00
Frédéric Lécaille	7dad52bdbd	MINOR: quic: Add a trace to quic_release_frm() Display the address of the frame to be released as soon as entering into quic_release_frm() whose job is obviously to released the memory allocated for the frame <frm> passed as parameter.	2023-09-04 11:29:35 +02:00
Frédéric Lécaille	3c90c1ce6b	BUG/MINOR: quic: Possible skipped RTT sampling There are very few chances this bug may occur. Furthermore the consequences are not dramatic: an RTT sampling may be ignored. I guess this may happen when the now_ms global value wraps. Do not rely on the time variable value a packet was sent to decide if it is a newly acknowledged packet but on its presence or not in the tx packet ebtree. Must be backported as far as 2.6.	2023-09-04 11:29:35 +02:00
Christopher Faulet	0b93ff8c87	BUG/MEDIUM: stconn: Wake applets on sending path if there is a pending shutdown An applet is not woken up on sending path if it is not waiting for data or if it states it will not consume data. However, it is important to still wake it up if there is a pending shutdown. Otherwise, the event may be missed and some data may remain blocked in the channel's buffer. Because of this bug, it is possible to have a stream stuck if data are also blocked on the opposite channel. It is for instance possible to hit the buf with the stats applet and a client not consuming data. This patch must slowly be backported as far as 2.2. It should partially fix issue #2249.	2023-09-01 14:18:26 +02:00
Christopher Faulet	9e394d34e0	BUG/MINOR: stconn: Don't report blocked sends during connection establishment The server timeout must not be handled during the connection establishment to not superseed the connect timeout. To do so, we must not consider outgoing data are blocked during this stage. Concretly, it means the fsb time must not be updated during connection establishment. It is not an issue with regular clients because the server timeout is only defined when the connection is estalished. However, it may be an issue for the HTTP client, when the server timeout is lower than the connect timeout. In this case, an early 502 may be reported with no connection retries. This patch must be backported to 2.8.	2023-09-01 14:18:26 +02:00
Christopher Faulet	3479d99d5f	BUG/MEDIUM: stconn: Update stream expiration date on blocked sends When outgoing data are blocked, we must update the stream expiration date and requeue the task. It is important to be sure to properly handle write timeout, expecially if the stream cannot expire on reads. This bug was introduced when handling of channel's timeouts was refactored to be managed by the stream-connectors. It is an issue if there is no server timeout and the client does not consume the response (or the opposite but it is less common). It is also possible to trigger the same scenario with applets on server side because, most of time, there is no server timeout. This patch must be backported to 2.8.	2023-09-01 14:18:26 +02:00
Christopher Faulet	49ed83e948	DEBUG: applet: Properly report opposite SC expiration dates in traces The wrong label was used in trace to report expiration dates of the opposite SC. "sc" was used instead of "sco". This patch should be backported to 2.8.	2023-09-01 14:18:26 +02:00
Willy Tarreau	b0031d9679	MINOR: checks: also consider the thread's queue for rebalancing Let's also check for other threads when the current one is queueing, let's not wait for the load to be high. Now this totally eliminates differences between threads.	2023-09-01 14:00:04 +02:00
Willy Tarreau	844a3bc25b	MEDIUM: checks: implement a queue in order to limit concurrent checks The progressive adoption of OpenSSL 3 and its abysmal handshake performance has started to reveal situations where it simply isn't possible anymore to succesfully run health checks on many servers, because between the moment all the checks are started and the moment the handshake finally completes, the timeout has expired! This also has consequences on production traffic which gets significantly delayed as well, all that for lots of checks. While it's possible to increase the check delays, it doesn't solve everything as checks still take a huge amount of time to converge in such conditions. Here we take a different approach by permitting to enforce the maximum concurrent checks per thread limitation and implementing an ordered queue. Thanks to this, if a thread about to start a check has reached its limit, it will add the check at the end of a queue and it will be processed once another check is finished. This proves to be extremely efficient, with all checks completing in a reasonable amount of time and not being disturbed by the rest of the traffic from other checks. They're just cycling slower, but at the speed the machine can handle. One must understand however that if some complex checks perform multiple exchanges, they will take a check slot for all the required duration. This is why the limit is not enforced by default. Tests on SSL show that a limit of 5-50 checks per thread on local servers gives excellent results already, so that could be a good starting point.	2023-09-01 14:00:04 +02:00
Willy Tarreau	cfc0bceeb5	MEDIUM: checks: search more aggressively for another thread on overload When the current check is overloaded (more running checks than the configured limit), we'll try more aggressively to find another thread. Instead of just opportunistically looking for one half as loaded, now if the current thread has more than 1% more active checks than another one, or has more than a configured limit of concurrent running checks, it will search for a more suitable thread among 3 other random ones in order to migrate the check there. The number of migrations remains very low (~1%) and the checks load very fair across all threads (~1% as well). The new parameter is called tune.max-checks-per-thread.	2023-09-01 08:26:06 +02:00
Willy Tarreau	016e189ea3	MINOR: check: also consider the random other thread's active checks When checking if it's worth transferring a sleeping thread to another random thread, let's also check if that random other thread has less checks than the current one, which is another reason for transferring the load there. This commit adds a function "check_thread_cmp_load()" to compare two threads' loads in order to simplify the decision taking. The minimum active check count before starting to consider rebalancing the load was now raised from 2 to 3, because tests show that at 15k concurrent checks, at 2, 50% are evaluated for rebalancing and 30% are rebalanced, while at 3, this is cut in half.	2023-09-01 08:26:06 +02:00
Willy Tarreau	00de9e0804	MINOR: checks: maintain counters of active checks per thread Let's keep two check counters per thread: - one for "active" checks, i.e. checks that are no more sleeping and are assigned to the thread. These include sleeping and running checks ; - one for "running" checks, i.e. those which are currently executing on the thread. By doing so, we'll be able to spread the health checks load a bit better and refrain from sending too many at once per thread. The counters are atomic since a migration increments the target thread's active counter. These numbers are reported in "show activity", which allows to check per thread and globally how many checks are currently pending and running on the system. Ideally, we should only consider checks in the process of establishing a connection since that's really the expensive part (particularly with OpenSSL 3.0). But the inner layers are really not suitable to doing this. However knowing the number of active checks is already a good enough hint.	2023-09-01 08:26:06 +02:00
Willy Tarreau	3b7942a1c9	MINOR: check/activity: collect some per-thread check activity stats We now count the number of times a check was started on each thread and the number of times a check was adopted. This helps understand better what is observed regarding checks.	2023-09-01 08:26:06 +02:00
Willy Tarreau	e03d05c6ce	MINOR: check: remember when we migrate a check The goal here is to explicitly mark that a check was migrated so that we don't do it again. This will allow us to perform other actions on the target thread while still knowing that we don't want to be migrated again. The new READY bit combine with SLEEPING to form 4 possible states: SLP RDY State Description 0 0 - (reserved) 0 1 RUNNING Check is bound to current thread and running 1 0 SLEEPING Check is sleeping, not bound to a thread 1 1 MIGRATING Check is migrating to another thread Thus we set READY upon migration, and check for it before migrating, this is sufficient to prevent a second migration. To make things a bit clearer, the SLEEPING bit was switched with FASTINTER so that SLEEPING and READY are adjacent.	2023-09-01 08:26:06 +02:00
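The four states formed by the SLEEPING and READY bits in the commit above can be sketched as a small decoder. The bit values and the `CHK_SKETCH_*` and function names are assumptions made for illustration; only the SLP/RDY truth table and the "set READY upon migration, check it before migrating" rule come from the message.

```c
/* Hypothetical bit values; the message only says the two bits are adjacent. */
#define CHK_SKETCH_READY    0x1u
#define CHK_SKETCH_SLEEPING 0x2u

/* Decode the SLP/RDY pair into the state names listed in the commit message. */
static const char *check_state_name(unsigned int state)
{
	int slp = !!(state & CHK_SKETCH_SLEEPING);
	int rdy = !!(state & CHK_SKETCH_READY);

	if (slp && rdy)
		return "MIGRATING"; /* migrating to another thread */
	if (slp)
		return "SLEEPING";  /* sleeping, not bound to a thread */
	if (rdy)
		return "RUNNING";   /* bound to the current thread and running */
	return "-";                 /* reserved */
}

/* A check may be migrated only while sleeping and not already marked READY. */
static int check_may_migrate(unsigned int state)
{
	return (state & CHK_SKETCH_SLEEPING) && !(state & CHK_SKETCH_READY);
}

/* Mark the check as migrated: READY is set so it cannot be migrated twice. */
static unsigned int check_mark_migrated(unsigned int state)
{
	return state | CHK_SKETCH_READY;
}
```

After `check_mark_migrated()` a sleeping check decodes as MIGRATING and `check_may_migrate()` returns zero, which is the property the commit relies on to prevent a second migration.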
Willy Tarreau	3544c9f8a0	MINOR: checks: pin the check to its thread upon wakeup When a check leaves the sleeping state, we must pin it to the thread that is processing it. It's normally always the case after the first execution, but initial checks that start assigned to any thread (-1) could be assigned much later, causing problems with planned changes involving queuing. Thus better do it early, so that all threads start properly pinned.	2023-09-01 08:26:06 +02:00
Willy Tarreau	7163f95b43	MINOR: checks: start the checks in sleeping state The CHK_ST_SLEEPING state was introduced by commit `d114f4a68` ("MEDIUM: checks: spread the checks load over random threads") to indicate that a check was not currently bound to a thread and that it could easily be migrated to any other thread. However it did not start the checks in this state, meaning that they were not redispatchable on startup. Sometimes under heavy load (e.g. when using SSL checks with OpenSSL 3.0) the cost of setting up new connections is so high that some threads may experience connection timeouts on startup. In this case it's better if they can transfer their excess load to other idle threads. By just marking the check as sleeping upon startup, we can do this and significantly reduce the number of failed initial checks.	2023-09-01 08:26:06 +02:00
Willy Tarreau	48442b8b15	BUG/MINOR: checks: do not queue/wake a bounced check A small issue was introduced with commit `d114f4a68` ("MEDIUM: checks: spread the checks load over random threads"): when a check is bounced to another thread, its expiration time is set to TICK_ETERNITY. This makes it show as not expired upon first wakeup on the next thread, thus being detected as "woke up too early" and being instantly rescheduled. Only this after this next wakeup it will be properly considered. Several approaches were attempted to fix this. The best one seems to consist in resetting t->expire and expired upon wakeup, and changing the !expired test for !tick_is_expired() so that we don't trigger on this case. This needs to be backported to 2.7.	2023-09-01 08:26:06 +02:00
Willy Tarreau	338431ecb6	MINOR: activity: report the current run queue size While troubleshooting the causes of load spikes, it appeared that the length of individual run queues was missing, let's add it to "show activity".	2023-09-01 08:26:06 +02:00
Willy Tarreau	2cb896c4b0	MEDIUM: server/ssl: pick another thread's session when we have none yet The per-thread SSL context in servers causes a burst of connection renegotiations on startup, both for the forwarded traffic and for the health checks. Health checks have been seen to continue to cause SSL rekeying for several minutes after a restart on large thread-count machines. The reason is that the context is exlusively per-thread and that the more threads there are, the more likely it is for a new connection to start on a thread that doesn't have such a context yet. In order to improve this situation, this commit ensures that a thread starting an SSL connection to a server without a session will first look at the last session that was updated by another thread, and will try to use it. In order to minimize the contention, we're using a read lock here to protect the data, and the first-level index is an integer containing the thread number, that is always valid and may always be dereferenced. This way the session retrieval algorithm becomes quite simple: - if the last thread index is valid, then try to use the same session under a read lock ; - if any error happens, then atomically nuke the index so that other threads don't use it and the next one to update a connection updates it again And for the ssl_sess_new_srv_cb(), we have this: - update the entry under a write lock if the new session is valid, otherwise kill it if the session is not valid; - atomically update the index if it was 0 and the new one is valid, otherwise atomically nuke it if the session failed. Note that even if only the pointer is destroyed, the element will be re-allocated by the next thread during the sess_new_srv_sb(). Right now a session is picked even if the SNI doesn't match, because we don't know the SNI yet during ssl_sock_init(), but that's essentially a matter of API, since connect_server() figures the SNI very early, then calls conn_prepare() which calls ssl_sock_init(). 
Thus in the future we could easily imaging storing a number of SNI-based contexts instead of storing contexts per thread. It could be worth backporting this to one LTS version after some observation, though this is not strictly necessary. the current commit depends on the following ones: BUG/MINOR: ssl_sock: fix possible memory leak on OOM MINOR: ssl_sock: avoid iterating realloc(+1) on stored context DOC: ssl: add some comments about the non-obvious session allocation stuff CLEANUP: ssl: keep a pointer to the server in ssl_sock_init() MEDIUM: ssl_sock: always use the SSL's server name, not the one from the tid MEDIUM: server/ssl: place an rwlock in the per-thread ssl server session MINOR: server/ssl: maintain an index of the last known valid SSL session MINOR: server/ssl: clear the shared good session index on failure MEDIUM: server/ssl: pick another thread's session when we have none yet	2023-08-31 09:27:14 +02:00
Willy Tarreau	777f62cfb7	MINOR: server/ssl: clear the shared good session index on failure If we fail to set the session using SSL_set_session(), we want to quickly erase our index from the shared one so that any other thread with a valid session replaces it.	2023-08-31 08:50:01 +02:00
Willy Tarreau	52b260bae4	MINOR: server/ssl: maintain an index of the last known valid SSL session When a thread creates a new session for a server, if none was known yet, we assign the thread id (hence the reused_sess index) to a shared variable so that other threads will later be able to find it when they don't have one yet. For now we only set and clear the pointer upon session creation, we do not yet pick it. Note that we could have done it per thread-group, so as to avoid any cross-thread exchanges, but it's anticipated that this is essentially used during startup, at a moment where the cost of inter-thread contention is very low compared to the ability to restart at full speed, which explains why instead we store a single entry.	2023-08-31 08:50:01 +02:00
Willy Tarreau	607041dec3	MEDIUM: server/ssl: place an rwlock in the per-thread ssl server session The goal will be to permit a thread to update its session while having it shared with other threads. For now we only place the lock and arrange the code around it so that this is quite light. For now only the owner thread uses this lock so there is no contention. Note that there is a subtlety in the openssl API regarding i2s_SSL_SESSION() in that it fills the area pointed to by its argument with a dump of the session and returns a size that's equal to the previously allocated one. As such, it does modify the shared area even if that's not obvious at first glance.	2023-08-31 08:50:01 +02:00
Willy Tarreau	95ac5fe4a8	MEDIUM: ssl_sock: always use the SSL's server name, not the one from the tid In ssl_sock_set_servername(), we're retrieving the current server name from the current thread, hoping it will not have changed. This is a bit dangerous as strictly speaking it's not easy to prove that no other connection had to use one between the moment it was retrieved in ssl_sock_init() and the moment it's being read here. In addition, this forces us to maintain one session per thread while this is not the real need, in practice we only need one session per SNI. And the current model prevents us from sharing sessions between threads. This had been done in 2.5 via commit `e18d4e828` ("BUG/MEDIUM: ssl: backend TLS resumption with sni and TLSv1.3"), but as analyzed with William, it turns out that a saner approach consists in keeping the call to SSL_get_servername() there and instead to always assign the SNI to the current SSL context via SSL_set_tlsext_host_name() immediately when the session is retreived. This way the session and SNI are consulted atomically and the host name is only checked from the session and not from possibly changing elements. As a bonus the rdlock that was added by that commit could now be removed, though it didn't cost much.	2023-08-31 08:49:15 +02:00
Willy Tarreau	335b5adf2c	CLEANUP: ssl: keep a pointer to the server in ssl_sock_init() We're using about 6 times "__objt_server(conn->target)" there, it's not quite easy to read, let's keep a pointer to the server.	2023-08-30 18:58:40 +02:00
Willy Tarreau	bc31ef0896	DOC: ssl: add some comments about the non-obvious session allocation stuff The SSL session allocation/reuse part is far from being trivial, and there are some necessary tricks such as allocating then immediately freeing that are required by the API due to internal refcount. All of this is particularly hard to grasp, even with the scarce man pages. Let's document a little bit what's granted and expected along this path to help the reader later.	2023-08-30 11:43:06 +02:00
Willy Tarreau	2c6fe24001	MINOR: ssl_sock: avoid iterating realloc(+1) on stored context The SSL context storage in servers is per-thread, and the contents are allocated for a length that is determined from the session. It turns out that placing some traces there revealed that the realloc() that is called to grow the area can be called multiple times in a row even for just health checks, to grow the area by just one or two bytes. Given that malloc() allocates in multiples of 8 or 16 anyway, let's round the allocated size up to the nearest multiple of 8 to avoid this unneeded operation.	2023-08-30 11:43:06 +02:00
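The rounding described in the commit above can be sketched with the usual power-of-two mask trick. The function name is hypothetical and this is not necessarily the expression used in the actual patch.

```c
#include <stddef.h>

/* Round <len> up to the nearest multiple of 8 (8 is a power of two, so
 * adding 7 and clearing the three low bits is enough).
 */
static size_t round_up_to_8(size_t len)
{
	return (len + 7) & ~(size_t)7;
}
```

With this, a stored session that grows from, say, 121 to 123 bytes still fits in the 128 bytes already allocated, so no realloc() is needed for growth of a byte or two.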
Alexander Stephan	2cc53ecc8f	MINOR: sample: Add common TLV types as constants for fc_pp_tlv This patch adds common TLV types as specified in the PPv2 spec. We will use the suffix of the type, e.g., PP2_TYPE_AUTHORITY becomes AUTHORITY.	2023-08-29 15:32:02 +02:00
Alexander Stephan	0a4f6992e0	MINOR: sample: Refactor fc_pp_unique_id by wrapping the generic TLV fetch The fetch logic is redundant and can be simplified by simply calling the generic fetch with the correct TLV ID set as an argument, similar to fc_pp_authority.	2023-08-29 15:32:01 +02:00
Alexander Stephan	ece0d1ab49	MINOR: sample: Refactor fc_pp_authority by wrapping the generic TLV fetch We already have a call that can retreive an TLV with any value. Therefore, the fetch logic is redundant and can be simplified by simply calling the generic fetch with the correct TLV ID set as an argument.	2023-08-29 15:31:51 +02:00
Alexander Stephan	f773ef721c	MEDIUM: sample: Add fetch for arbitrary TLVs Based on the new, generic allocation infrastructure, a new sample fetch fc_pp_tlv is introduced. It is an abstraction for existing PPv2 TLV sample fetches. It takes any valid TLV ID as argument and returns the value as a string, similar to fc_pp_authority and fc_pp_unique_id.	2023-08-29 15:31:28 +02:00

16556 commits