Commit graph

7549 commits

Author SHA1 Message Date
Willy Tarreau
c62a2d540d MEDIUM: ring: move the ring reader code to ring_dispatch_messages()
This new function is made around the loop that scans a ring for new
messages and dispatches them to a message handler. It also takes
ring flags (WAIT, NEW, etc) and offset pointers that the caller will
use to initialize/reuse/update the current processing offset. The
caller is still responsible for presetting it to ~0 before the
first call if it wants the function to automatically adjust it (or set
it to the correct value). The function may also return the last_ofs
that was known before releasing the lock so that the caller knows
what to compare against and if it needs to restart processing or not.
The context remains a void* so that should not necessarily depend on
an appctx.

The current "show ring" code was ported to this and it continues to
work as expected.
2024-03-25 17:34:19 +00:00
Willy Tarreau
ad31e53287 REORG: dns/ring: split the ring between the generic one and the DNS one
A ring is used for the DNS code but slightly differently from the generic
one, which prevents some important changes from being made to the generic
code without breaking DNS. As the use cases differ, it's better to just
split them apart for now and have the DNS code use its own ring that we
rename dns_ring and let the generic code continue to live on its own.

The unused parts such as CLI registration were dropped, resizing and
allocation from a mapped area were dropped. dns_ring_detach_appctx() was
kept despite not being used, so as to stay consistent with the comments
that say it must be called, despite the DNS code explicitly mentioning
that it skips it for now (i.e. this may change in the future).

Hopefully after the generic rings are converted the DNS code can migrate
back to them, though this is really not necessary.
2024-03-25 17:34:19 +00:00
Willy Tarreau
201c706330 MINOR: log/applet: add new function syslog_applet_append_event()
This function takes a buffer on input, and offset and a length, and
consumes the block from that buffer to send it to the appctx's output
buffer. Contrary to its sibling applet_append_line(), instead of just
appending an LF at the end of the line, it prepends the message size
in decimal and a space before the message, as expected by syslog TCP
implementaions. This will be used to simplify the ring reader code.
2024-03-25 17:34:19 +00:00
Willy Tarreau
6ae41dc510 MINOR: applet: add new function applet_append_line()
This function takes a buffer on input, and offset and a length, and
consumes the block from that buffer to send it to the appctx's output
buffer. This will be used to simplify the ring reader code.
2024-03-25 17:34:19 +00:00
Willy Tarreau
c038ca8e8c MINOR: atomic: add a read-specific variant of __ha_cpu_relax()
Tests on various systems show that x86 prefers not to wait at all inside
read loops while aarch64 prefers to wait a little bit. Instead of having
to stuff ifdefs around __ha_cpu_relax() inside plenty of such loops
waiting for a condition to appear, better implement a new variant that
we call __ha_cpu_relax_for_read() which honors each architecture's
preferences and is the same as __ha_cpu_relax() for other ones.
2024-03-25 17:34:19 +00:00
Aurelien DARRAGON
3de1acfb23 BUILD: server: fix build regression on old compilers (<= gcc-4.4)
Willy reported that since 3ac79b504 ("MEDIUM: server:
make server_set_inetaddr() updater serializable"), haproxy fails to
compile on some older compilers such as gcc-4.4 with this kind of error:

  src/server.c: In function 'snr_resolution_cb':
  src/server.c:4471: error: unknown field 'dns_resolver' specified in initializer
  compilation terminated due to -Wfatal-errors.
  make: *** [Makefile:1006: src/server.o] Error 1

This is due to referencing a member inside anonymous union from a compound
literal assignment. Apparently such use of anonymous union wasn't properly
supported back then on older compilers. To fix the issue, we give "u" name
to the parent union use this name to explicitly refer to the union where
relevant in the code (only a few changes fortunately).

The fix itself was verified to restore build compatibility with gcc 4.4
(and even 4.2).

As 3ac79b504 is used as a prerequisite for 64c9c8ef3 ("BUG/MINOR:
server/dns: use server_set_inetaddr() to unset srv addr from DNS"), please
consider backporting this patch too if 64c9c8ef3 happens to be backported
in 2.9.
2024-03-25 16:23:37 +01:00
Amaury Denoyelle
0d4273f04b MEDIUM: server: close private idle connection before server deletion
This commit similar to the following one :
  65ae241dcfe710e1cdd3ec4e7a9bde38d2e4c116
  MEDIUM: server: close idle conn before server deletion

This patch implements a similar logic, this time to close private idle
connections stored in sessions. The principle is identical to the above
commit : conn_release() is used on idle connections after a takeover to
ensure thread safety.

An extra change was required to be able to execute takeover on such
connections. Their original thread ID was unknown, contrary to non
private connections which are stored in sharded lists. As such, a new
tid member has been added under sess_priv_conns chaining element.
2024-03-22 17:12:27 +01:00
Amaury Denoyelle
f3862a9bc7 MINOR: connection: extend takeover with release option
Extend takeover API both for MUX and XPRT with a new boolean argument
<release>. Its purpose is to signal if the connection will be freed
immediately after the takeover, rendering new resources allocation
unnecessary.

For the moment, release argument is always false. However, it will be
set to true on delete server CLI handler to proactively close server
idle connections.
2024-03-22 16:12:36 +01:00
Amaury Denoyelle
ff2e71ae24 MINOR: connection: implement conn_release()
Several places reuse the same code to ensure a connection is properly
freed, either via its MUX or by calling the proper set of functions.
Factorize all of this in a new function conn_release().

This new function is now called via session_free() and
session_accept_fd(). It will also be reused on delete server to
proactively close idle connections.
2024-03-22 16:12:36 +01:00
Remi Tricot-Le Breton
5c25c577a0 BUG/MEDIUM: ssl: Fix crash when calling "update ssl ocsp-response" when an update is ongoing
The CLI command "update ssl ocsp-response" was forcefully removing an
OCSP response from the update tree regardless of whether it used to be
in it beforehand or not. But since the main OCSP upate task works by
removing the entry being currently updated from the update tree and then
reinserting it when the update process is over, it meant that in the CLI
command code we were modifying a structure that was already being used.

These concurrent accesses were not properly locked on the "regular"
update case because it was assumed that once an entry was removed from
the update tree, the update task was the only one able to work on it.

Rather than locking the whole update process, an "updating" flag was
added to the certificate_ocsp in order to prevent the "update ssl
ocsp-response" command from trying to update a response already being
updated.

An easy way to reproduce this crash was to perform two "simultaneous"
calls to "update ssl ocsp-response" on the same certificate. It would
then crash on an eb64_delete call in the main ocsp update task function.

This patch can be backported up to 2.8. Wait a little bit before
backporting.
2024-03-20 16:12:10 +01:00
Remi Tricot-Le Breton
69071490ff BUG/MAJOR: ocsp: Separate refcount per instance and per store
With the current way OCSP responses are stored, a single OCSP response
is stored (in a certificate_ocsp structure) when it is loaded during a
certificate parsing, and each SSL_CTX that references it increments its
refcount. The reference to the certificate_ocsp is kept in the SSL_CTX
linked to each ckch_inst, in an ex_data entry that gets freed when the
context is freed.
One of the downsides of this implementation is that if every ckch_inst
referencing a certificate_ocsp gets detroyed, then the OCSP response is
removed from the system. So if we were to remove all crt-list lines
containing a given certificate (that has an OCSP response), and if all
the corresponding SSL_CTXs were destroyed (no ongoing connection using
them), the OCSP response would be destroyed even if the certificate
remains in the system (as an unused certificate).
In such a case, we would want the OCSP response not to be "usable",
since it is not used by any ckch_inst, but still remain in the OCSP
response tree so that if the certificate gets reused (via an "add ssl
crt-list" command for instance), its OCSP response is still known as
well.
But we would also like such an entry not to be updated automatically
anymore once no instance uses it. An easy way to do it could have been
to keep a reference to the certificate_ocsp structure in the ckch_store
as well, on top of all the ones in the ckch_instances, and to remove the
ocsp response from the update tree once the refcount falls to 1, but it
would not work because of the way the ocsp response tree keys are
calculated. They are decorrelated from the ckch_store and are the actual
OCSP_CERTIDs, which is a combination of the issuer's name hash and key
hash, and the certificate's serial number. So two copies of the same
certificate but with different names would still point to the same ocsp
response tree entry.

The solution that answers to all the needs expressed aboved is actually
to have two reference counters in the certificate_ocsp structure, one
actual reference counter corresponding to the number of "live" pointers
on the certificate_ocsp structure, incremented for every SSL_CTX using
it, and one for the ckch stores.
If the ckch_store reference counter falls to 0, the corresponding
certificate must have been removed via CLI calls ('set ssl cert' for
instance).
If the actual refcount falls to 0, then no live SSL_CTX uses the
response anymore. It could happen if all the corresponding crt-list
lines were removed and there are no live SSL sessions using the
certificate anymore.
If any of the two refcounts becomes 0, we will always remove the
response from the auto update tree, because there's no point in spending
time updating an OCSP response that no new SSL connection will be able
to use. But the certificate_ocsp object won't be removed from the tree
unless both refcounts are 0.

Must be backported up to 2.8. Wait a little bit before backporting.
2024-03-20 16:12:10 +01:00
Amaury Denoyelle
c130f74803 BUG/MINOR: session: ensure conn owner is set after insert into session
A crash could occured if a session_add_conn() would temporarily failed
when called via h2_detach(). In this case, connection owner is reset to
NULL. However, if this wasn't the last connection stream, the connection
won't be destroyed. When h2_detach() is recalled for another stream and
this time session_add_conn() succeeds, a crash will occur due to
session_check_idle_conn() invocation with a NULL connection owner.

To fix this, ensure connection owner is always set after
session_add_conn() success.

This bug is considered as minor as the only failure reason for
session_add_conn() is a pool allocation issue.

This should be backported up to all stable releases.
2024-03-20 14:26:57 +01:00
Christopher Faulet
189f74d4ff MINOR: cfgparse: Add a global option to expose deprecated directives
Similarly to "expose-exprimental-directives" option, there is no a global
option to expose some deprecated directives. Idea is to have a way to silent
warnings about deprecated directives when there is no alternative solution.

Of course, deprecated directives covered by this option are not listed and
may change. It is only a best effort to let users upgrade smoothly.
2024-03-15 11:31:48 +01:00
Amaury Denoyelle
7dae3ceaa0 BUG/MAJOR: server: do not delete srv referenced by session
A server can only be deleted if there is no elements which reference it.
This is taken care via srv_check_for_deletion(), most notably for active
and idle connections.

A special case occurs for connections directly managed by a session.
This is for so-called private connections, when using http-reuse never
or H2 + http-reuse safe for example. In this case. server does not
account these connections into its idle lists. This caused a bug as the
server is deleted despite the session still being able to access it.

To properly fix this, add a new referencing element into the server for
these session connections. A mt_list has been chosen for this. On
default http-reuse, private connections are typically not used so it
won't make any difference. If using H2 servers, or more generally when
dealing with private connections, insert/delete should typically occur
only once per session lifetime so impact on performance should be
minimal.

This should be backported up to 2.4. Note that srv_check_for_deletion()
was introduced in 3.0 dev tree. On backport, the extra condition in it
should be placed in cli_parse_delete_server() instead.
2024-03-14 15:21:07 +01:00
Amaury Denoyelle
5ad801c058 MINOR: session: rename private conns elements
By default, backend connections are attached to a server instance. This
allows to implement connection reuse. However, in some particular cases,
connection cannot be shared accross several clients. These connections
are considered and private and are attached to the session instance
instead.

These private connections are also indexed by the target server to not
mix them. All of this is implemented via a dedicated structure
previously named struct sess_srv_list.

Rename it to better reflect its usage to struct sess_priv_conns. Also
rename its internal members and all of the associated functions.

This commit is only a renaming, thus no functional impact is expected.
2024-03-14 15:21:02 +01:00
Aurelien DARRAGON
07b2e84bce BUG/MEDIUM: hlua: streams don't support mixing lua-load with lua-load-per-thread (2nd try)
While trying to reproduce another crash case involving lua filters
reported by @bgrooot on GH #2467, we found out that mixing filters loaded
from different contexts ('lua-load' vs 'lua-load-per-thread') for the same
stream isn't supported and may even cause the process to crash.

Historically, mixing lua-load and lua-load-per-threads for a stream wasn't
supported, but this changed thanks to 0913386 ("BUG/MEDIUM: hlua: streams
don't support mixing lua-load with lua-load-per-thread").

However, the above fix didn't consider lua filters's use-case properly:
unlike lua fetches, actions or even services, lua filters don't simply
use the stream hlua context as a "temporary" hlua running context to
process some hlua code. For fetches, actions.. hlua executions are
processed sequentially, so we simply reuse the hlua context from the
previous action/fetch to run the next one (this allows to bypass memory
allocations and initialization, thus it increases performance), unless
we need to run on a different hlua state-id, in which case we perform a
reset of the hlua context.

But this cannot work with filters: indeed, once registered, a filter will
last for the whole stream duration. It means that the filter will rely
on the stream hlua context from ->attach() to ->detach(). And here is the
catch, if for the same stream we register 2 lua filters from different
contexts ('lua-load' + 'lua-load-per-thread'), then we have an issue,
because the hlua stream will be re-created each time we switch between
runtime contexts, which means each time we switch between the filters (may
happen for each stream processing step), and since lua filters rely on the
stream hlua to carry context between filtering steps, this context will be
lost upon a switch. Given that lua filters code was not designed with that
in mind, it would confuse the code and cause unexpected behaviors ranging
from lua errors to crashing process.

So here we take another approach: instead of re-creating the stream hlua
context each time we switch between "global" and "per-thread" runtime
context, let's have both of them inside the stream directly as initially
suggested by Christopher back then when talked about the original issue.

For this we leverage hlua_stream_ctx_prepare() and hlua_stream_ctx_get()
helper functions which return the proper hlua context for a given stream
and state_id combination.

As for debugging infos reported after ha_panic(), we check for both hlua
runtime contexts to check if one of them was active when the panic occured
(only 1 runtime ctx per stream may be active at a given time).

This should be backported to all stable versions with 0913386
("BUG/MEDIUM: hlua: streams don't support mixing lua-load with lua-load-per-thread")

This commit depends on:
 - "DEBUG: lua: precisely identify if stream is stuck inside lua or not"
   [for versions < 2.9 the ha_thread_dump_one() part should be skipped]
 - "MINOR: hlua: use accessors for stream hlua ctx"

For 2.4, the filters API didn't exist. However it may be a good idea to
backport it anyway because ->set_priv()/->get_priv() from tcp/http lua
applets may also be affected by this bug, plus it will ease code
maintenance. Of course, filters-related parts should be skipped in this
case.
2024-03-13 09:24:46 +01:00
Aurelien DARRAGON
1a2cdf64c9 DEBUG: lua: precisely identify if stream is stuck inside lua or not
When ha_panic() is called by the watchdog, we try to guess from
ha_task_dump() and ha_thread_dump_one() if the thread was stuck while
executing lua from the stream context. However we consider this is the
case by simply checking if the stream hlua context was set, but this is
not very precise because if the hlua context is set, then it simply means
that at least one lua instruction was executed at the stream level, not
that the stuck was currently executing lua when the panic occured.

This is especially true with filters, one could simply register a lua
filter that does nothing but this will still end up initializing the
stream hlua context for each stream. If the thread end up being stuck
during the stream handling, then debug dumping functions will report
that the stream was stuck while handling lua, which is not necessarilly
true, and could in fact confuse us even more.

So here we take another approach, we add the BUSY flag to hlua context:
this flag is set by hlua_ctx_resume() around lua_resume() call, this way
we can precisely tell if the thread was handling lua when it was
interrupted, and we rely on this flag in debug functions to check if the
thread was effectively stuck inside lua or not while processing the stream

No backport needed unless a commit depends on it.
2024-03-13 09:24:46 +01:00
William Lallemand
bbc215d3bd CLEANUP: ssl: remove useless #ifdef in openssl-compat.h
Remove a useless #ifdef in openssl-compat.h
2024-03-13 08:51:04 +01:00
William Lallemand
501d9fdb86 MEDIUM: ssl: allow to change the OpenSSL security level from global section
The new "ssl-security-level" option allows one to change the OpenSSL
security level without having to change the openssl.cnf global file of
your distribution. This directives applies on every SSL_CTX context.

People sometimes change their security level directly in the ciphers
directive, however there are some cases when the security level change
is not applied in the right order (for example when applying a DH
param).

Before this patch, it was to possible to trick by using a specific
openssl.cnf file and start haproxy this way:

    OPENSSL_CONF=./openssl.cnf ./haproxy -f bug-2468.cfg

Values for the security level can be found there:

https://www.openssl.org/docs/man1.1.1/man3/SSL_CTX_set_security_level.html

This was discussed in github issue #2468.
2024-03-12 17:37:11 +01:00
Amaury Denoyelle
c499d66f37 MINOR: quic: remove qc_treat_rx_crypto_frms()
This commit removes qc_treat_rx_crypto_frms(). This function was used in
a single place inside qc_ssl_provide_all_quic_data(). Besides, its
naming was confusing as conceptually it is directly linked to quic_ssl
module instead of quic_rx.

Thus, body of qc_treat_rx_crypto_frms() is inlined directly inside
qc_ssl_provide_all_quic_data(). Also, qc_ssl_provide_quic_data() is now
only used inside quic_ssl to its scope is set to static. Overall, API
for CRYPTO frame handling is now cleaner.
2024-03-11 14:27:51 +01:00
matthias sweertvaegher
062ea3a3d4 BUILD: solaris: fix compilation errors
Compilation on solaris fails because of usage of names reserved on that
platform, i.e. 'queue' and 's_addr'.

This patch redefines 'queue' as '_queue' and renames 's_addr' to
'srv_addr' which fixes compilation for now.

Future plan: rename 'queue' in code base so define can be removed again.

Backporting: 2.9, 2.8
2024-03-09 11:24:54 +01:00
Willy Tarreau
758cb450a2 OPTIM: sink: drop the sink lock used to count drops
The sink lock was made to prevent event producers from passing while
there were other threads trying to print a "dropped" message, in order
to guarantee the absence of reordering. It has a serious impact however,
which is that all threads need to take the read lock when producing a
regular trace even when there's no reader.

This patch takes a different approach. The drop counter is shifted left
by one so that the lowest bit is used to indicate that one thread is
already taking care of trying to dump the counter. Threads only read
this value normally, and will only try to change it if it's non-null,
in which case they'll first check if they are the first ones trying to
dump it, otherwise will simply count another drop and leave. This has
a large benefit. First, it will avoid the locking that causes stalls
as soon as a slow reader is present. Second, it avoids any write on the
fast path as long as there's no drop. And it remains very lightweight
since we just need to add +2 or subtract 2*dropped in operations, while
offering the guarantee that the sink_write() has succeeded before
unlocking the counter.

While a reader was previously limiting the traffic to 11k RPS under
4C/8T, now we reach 36k RPS vs 14k with no reader, so readers will no
longer slow the traffic down and will instead even speed it up due to
avoiding the contention down the chain in the ring. The locking cost
dropped from ~75% to ~60% now (it's in ring_write now).
2024-03-09 11:23:52 +01:00
Willy Tarreau
26cd248feb BUILD: ssl: define EVP_CTRL_AEAD_GET_TAG for older versions
Amaury reported that previous commit 08ac282375 ("MINOR: Add aes_gcm_enc
converter") broke the CI on OpenSSL 1.0.2 due to the define above not
existing there. Let's just map it to its older name when not existing.
For reference, these were renamed when switching to 1.1.0:

    https://marc.info/?l=openssl-cvs&m=142244190907706&w=2

No backport is needed.
2024-03-08 18:23:34 +01:00
Amaury Denoyelle
1ee7bf5bd9 MINOR: quic: always use ncbuf for rx CRYPTO
The previous patch fix the handling of in-order CRYPTO frames which
requires the usage of a new buffer for these data as their handling is
delayed to run under TASK_HEAVY.

In fact, as now all CRYPTO frames handling must be delayed, their
handling can be unify. This is the purpose of this commit, which removes
the just introduced new buffer. Now, all CRYPTO frames are buffered
inside the ncbuf. Unused elements such as crypto_frms member for
encryption level are also removed.

This commit is not a bugcfix but is a direct follow-up to the last one.
As such, it can probably be backported with it to 2.9 to reduce code
differences between these versions.
2024-03-08 17:22:48 +01:00
Amaury Denoyelle
81f118cec0 BUG/MEDIUM: quic: fix handshake freeze under high traffic
QUIC relies on SSL_do_hanshake() to be able to validate handshake. As
this function is computation heavy, it is since 2.9 called only under
TASK_HEAVY. This has been implemented by the following patch :
  94d20be138
  MEDIUM: quic: Heavy task mode during handshake

Instead of handling CRYPTO frames immediately during reception, this
patch delays the process to run under TASK_HEAVY tasklet. A frame copy
is stored in qel.rx.crypto_frms list. However, this frame still
reference the receive buffer. If the receive buffer is cleared before
the tasklet is rescheduled, it will point to garbage data, resulting in
haproxy decryption error. This happens if a fair amount of data is
received constantly to preempt the quic_conn tasklet execution.

This bug can be reproduced with a fair amount of clients. It is
exhibited by 'show quic full' which can report connections blocked on
handshake. Using the following commands result in h2load non able to
complete the last connections.

$ h2load --alpn-list h3 -t 8 -c 800 -m 10 -w 10 -n 8000 "https://127.0.0.1:20443/?s=10k"

Also, haproxy QUIC listener socket mode was active to trigger the issue.
This forces several connections to share the same reception buffer,
rendering the bug even more plausible to occur. It should be possible to
reproduce it with connection socket if increasing the clients amount.

To fix this bug, define a new buffer under quic_cstream. It is used
exclusively to copy CRYPTO data for in-order frame if ncbuf is empty.
This ensures data remains accessible even if receive buffer is cleared.

Note that this fix is only a temporary step. Indeed, a ncbuf is also
already used for out-of-order data. It should be possible to unify its
usage for both in and out-of-order data, rendering this new buffer
instance unnecessary. In this case, several unneeded elements will
become obsolete such as qel.rx.crypto_frms list. This will be done in a
future refactoring patch.

This must be backported up to 2.9.
2024-03-08 17:22:48 +01:00
Nenad Merdanovic
e225e04ba7 MINOR: vars: export var_set and var_unset functions
Co-authored-by: Dragan Dosen <ddosen@haproxy.com>
2024-03-08 17:20:43 +01:00
Willy Tarreau
93a0fb74f4 BUILD: buf: make b_ncat() take a const for the source
In 2.7 with commit 35df34223b ("MINOR: buffers: split b_force_xfer() into
b_cpy() and b_force_xfer()"), b_ncat() was extracted from b_force_xfer()
but kept its source variable instead of constant, making it unusable for
calls from a const source. Let's just fix it.
2024-03-05 11:50:34 +01:00
Willy Tarreau
0a0041d195 BUILD: tree-wide: fix a few missing includes in a few files
Some include files, mostly types definitions, are missing a few includes
to define the types they're using, causing include ordering dependencies
between files, which are most often not seen due to the alphabetical
order of includes. Let's just fix them.

These were spotted by building pre-compiled headers for all these files
to .h.gch.
2024-03-05 11:50:34 +01:00
Willy Tarreau
ac692d7ee5 BUILD: thread: move lock label definitions to thread-t.h
The 'lock_label' enum is defined in thread.h but it's used in a few
type files, so let's move it to thread-t.h to allow explicit includes.
2024-03-05 11:50:34 +01:00
Amaury Denoyelle
f913d42aaf MINOR: quic: add MUX output for show quic
Extend "show quic" to be able to dump MUX related information. This is
done via the new function qcc_show_quic(). This replaces the old streams
dumping list which was incomplete.

These info are displayed on full output or by specifying "mux" field.
2024-02-29 10:03:36 +01:00
Christopher Faulet
60fcc27577 MEDIUM: htx/http-ana: No longer close connection on early HAProxy response
When a response was returned by HAProxy, a dedicated HTX flag was
set. Thanks to this flag, it was possible to add a "connection: close"
header to the response if the request was not fully received and to close
the connection. In the same way, when a redirect rule was applied,
keep-alive was forcefully disabled for unfinished requests.

All these mechanisms are now useless because the H1 mux is able to drain the
response. So HTX_FL_PROXY_RESP flag is removed and no special processing is
performed on HAProxy response when the request is unfinished.
2024-02-28 16:02:33 +01:00
Christopher Faulet
077906da14 MAJOR: mux-h1: Drain requests on client side before shut a stream down
unlike for H2 and H3, there is no mechanism in H1 to notify the client it
must stop to upload data when a response is replied before the end of the
request without closing the connection. There is no RST_STREAM frame
equivalent.

Thus, there is only two ways to deal with this situation: closing the
connection or draining the request. Until now, HAProxy didn't support
draining H1 messages. Closing the connection in this case has however a
major drawback. It leads to send a TCP reset, dropping this way all in-fly
data. There is no warranty the client has fully received the response.

Draining H1 messages was never implemented because in old versions it was a
bit tricky to implement. However, it is now far simplier to support this
feature because it is possible to have a H1 stream without any applicative
stream. It is the purpose of this patch. Now, when a shutdown is requested
and the stream is detached from the connection, if the request is unfinished
while the response was fully sent, the request in drained.

To do so, in this case the shutdown and the detach are delayed. From the
upper layer point of view, there is no changes. The endpoint is shut down
and detached as usual. But on H1 mux point of view, the H1 stream is still
alive and is being able to drain data. However the stream-endpoint
descriptor is orphan. Once the request is fully received (and drained), the
connection is shut down if it cannot be reused for a new transaction and the
H1 stream is destroyed.
2024-02-28 16:02:33 +01:00
Amaury Denoyelle
8a31783b64 BUG/MEDIUM: server: fix dynamic servers initial settings
Contrary to static servers, dynamic servers does not initialize their
settings from a default server instance. As such, _srv_parse_init() was
responsible to set a set of minimal values to have a correct behavior.

However, some settings were not properly initialized. This caused
dynamic servers to not behave as static ones without explicit
parameters.

Currently, the main issue detected is connection reuse which was
completely impossible. This is due to incorrect pool_purge_delay and
max_reuse settings incompatible with srv_add_to_idle_list().

To fix the connection reuse, but also more generally to ensure dynamic
servers are aligned with other server instances, define a new function
srv_settings_init(). This is used to set initial values for both default
servers and dynamic servers. For static servers, srv_settings_cpy() is
kept instead, using their default server as reference.

This patch could have unexpected effects on dynamic servers behavior as
it restored proper initial settings. Previously, they were set to 0 via
calloc() invocation from new_server().

This should be backported up to 2.6, after a brief period of
observation.
2024-02-27 17:02:20 +01:00
William Lallemand
4895fdac5a BUG/MAJOR: ssl/ocsp: crash with ocsp when old process exit or using ocsp CLI
This patch reverts 2 fixes that were made in an attempt to fix the
ocsp-update feature used with the 'commit ssl cert' command.

The patches crash the worker when doing a soft-stop when the 'set ssl
ocsp-response' command was used, or during runtime if the ocsp-update
was used.

This was reported in issue #2462 and #2442.

The last patch reverted is the associated reg-test.

Revert "BUG/MEDIUM: ssl: Fix crash when calling "update ssl ocsp-response" when an update is ongoing"
This reverts commit 5e66bf26ec.

Revert "BUG/MEDIUM: ocsp: Separate refcount per instance and per store"
This reverts commit 04b77f84d1b52185fc64735d7d81137479d68b00.

Revert "REGTESTS: ssl: Add OCSP related tests"
This reverts commit acd1b85d3442fc58164bd0fb96e72f3d4b501d15.
2024-02-26 18:04:25 +01:00
Willy Tarreau
a4d44250eb BUG/MINOR: ist: only store NUL byte on succeeded alloc
The trailing NUL added at the end of istdup() by recent commit de0216758
("BUG/MINOR: ist: allocate nul byte on istdup") was placed outside of
the pointer validity test, rightfully showing null deref warnings. This
fix should be backported along with the fix above, to the same versions.
2024-02-23 19:51:54 +01:00
Miroslav Zagorac
3f771f5118 MINOR: ssl: Call callback function after loading SSL CRL data
Due to the possibility of calling a control process after adding CRLs, the
ssl_commit_crlfile_cb variable was added.  It is actually a pointer to the
callback function, which is called if defined after initial loading of CRL
data from disk and after committing CRL data via CLI command
'commit ssl crl-file ..'.

If the callback function returns an error, then the CLI commit operation
is terminated.

Also, one case was added to the CLI context used by "commit cafile" and
"commit crlfile": CACRL_ST_CRLCB in which the callback function is called.

Signed-off-by: William Lallemand <wlallemand@haproxy.com>
2024-02-23 18:12:27 +01:00
Christopher Faulet
3d93ecc132 BUG/MAJOR: cli: Restore non-interactive mode behavior with pipelined commands
The issue was decribed in commit "BUG/MEDIUM: cli: Warn if pipelined commands
are delimited by a \n". In non-interactive mode, it was possible to use a
newline character as delimiter for pipelined commands. As a consequence, it was
possible to stop commands processing on the middle.

With the above commit, a warning is emitted to notify users. With this one,
we restore the expected behavior, as documented in the management guide.
Only the first line of commands is parsed. This commit will not be
backported to avoid breaking changes on stable versions.

This commit has of course some visible effects. All script using a newline
character as delimiter to pipeline commands in non-interactive mode will
stop working. Only the first command will be evaluated, all others will be
ignored. Pipelined commands MUST now be separated by a semi-colon.

But there is a more subtle and probably more annoying change. It is no
longer possible to pipeline commands with a payload ! A command with a
payload will always be the last one evaluated because it must be finished by
a newline (eventually preceeded by a custom pattern).

It is really annoying to introduce such breaking change. But, on the long
term, it is mandatory. The 2.8 will be the last LST version supporting the
old behavior (with some warning however). This will let 4 years to users to
adapt their scripts.

No backport needed.
2024-02-23 15:19:49 +01:00
Christopher Faulet
598c7f164c BUG/MEDIUM: cli: Warn if pipelined commands are delimited by a \n
This was broken since commit 0011c25144 ("BUG/MINOR: cli: avoid O(bufsize)
parsing cost on pipelined commands"). It is not really a bug fix but it is
labelled as is to make it more visible.

Before, a full line was first retrieved from the request buffer before
extracting the first command to eval it. Now, only one command is retrieved.
But we rely on the request buffer state to interrupt processing in
non-interactive mode. After a command processing, if output of the request
buffer is empty, we leave. Before the above commit, this was not a problem.
But since then, it is obviously a bad statement. First because some input
data may still be there. It is not true today, but it might change. Then,
there is no warranty to receive all commands in same time. For small list of
commands, it will be most of time the case, but it is a dangerous
assumption. For long list of commands, it is almost always false.

To be an issue, commands must be chunked exactly between two commands. But
in this case, remaining commands are skipped. A good way to reproduce the
issue is to wait a bit between two commands, for instance:

    (printf "show info;"; sleep 2; printf "show stat\n") | socat ...

In fact, to properly fix the issue, we should exit on the first command
finished by a newline. Indeed, as stated in the documentation, in
non-interactive mode, a single line is processed. To pipeline commands,
commands must be separated by a semi-colon. Unfortunately, the above commit
introduced another change. It is possible to pipeline commands delimited by
a newline. It was pushed 2 years ago and backported to all stable versions.
Several scripts may rely on this behavior.

So, on stable version, the bug will not be fixed. However a warning will be
emitted to notify users their scripts don't respect the documentation and
they must adapt it. Mainly because the cli behavior on this point will be
changed in 3.0 to stick to the doc. This warning will only be emitted once
over the whole worker process life. Idea is to not flood the logs with the
same warning for every offending commands.

This commit should probably be backported to all stable versions. But with
some cautions because the CLI was often modified.
2024-02-23 15:19:49 +01:00
Amaury Denoyelle
de02167584 BUG/MINOR: ist: allocate nul byte on istdup
istdup() is documented as having the same behavior as strdup(). However,
it may cause confusion as it allocates a block of input length, without
an extra byte for \0 delimiter. This behavior is incoherent as in case
of an empty string however a single \0 is allocated.

This API inconsistency could cause a bug anywhere an IST is used as a
C-string after istdup() invocation. Currently, the only found issue is
with 'wait' CLI command using 'srv-unused'. This causes a buffer
overflow due to ist0() invocation after istdup() for be_name and
sv_name.

Backport should be done to all stable releases. Even if no bug has been
found outside of wait CLI implementation, it ensures the code is more
consistent on every releases.
2024-02-22 18:24:35 +01:00
Aurelien DARRAGON
1c2e16ba8a MINOR: log: add free_logformat_node() helper function
Function may be used to free a single logformat node.
2024-02-22 15:32:42 +01:00
Aurelien DARRAGON
891bac673b CLEANUP: proxy/log: remove unused proxy flag
Since 3d6350e10 ("MINOR: log: Remove log-error-via-logformat option"),
PR_O_ERR_LOGFMT flag is not used anymore, but it was left in the proxy-t.h
header file. Simply removing it and adding a comment to indicate that the
corresponding bit is now unused.
2024-02-22 15:32:42 +01:00
Amaury Denoyelle
8b950f40fa MINOR: quic: only use sendmsg() syscall variant
This patch is the direct followup of the previous one :
  MINOR: quic: remove sendto() usage variant

This finalizes qc_snd_buf() simplification by removing send() syscall
usage for quic-conn owned socket. Syscall invocation is merged in a
single code location to the sendmsg() variant.

The only difference for owned socket is that destination address for
sendmsg() is set to NULL. This usage is documented in man 2 sendmsg as
valid for connected sockets. This allows maximum performance by avoiding
unnecessary lookups on kernel socket address tables.

As the previous patch, no functional change should happen here. However,
it will be simpler to extend qc_snd_buf() for GSO usage.
2024-02-20 16:42:05 +01:00
Aurelien DARRAGON
1448478d62 MINOR: log: explicit typecasting for logformat nodes
Add the ability to manually specify desired output type after a custom
field name for logformat nodes. Forcing the type can be useful to ensure
value is stored with the proper type representation. (i.e.: forcing
numerical to string to work around the limited resolution of JS number
types)

By default, type is set to SMP_T_SAME, which means the original type will
be preserved.

Currently supported types are: bool, str, sint
2024-02-20 15:49:54 +01:00
Aurelien DARRAGON
0cfcc64b79 MINOR: sample: add type_to_smp() helper function
type_to_smp(type) does the reverse operation of smp_to_type[smp]: it takes
a type name as input string and tries to return the corresponding SMP_T_*
smp type or SMP_TYPES if not found.
2024-02-20 15:18:39 +01:00
Aurelien DARRAGON
2ed6068f2a MINOR: log: custom name for logformat node
Add the ability to specify custom name (will be used for representation
in verbose output types such as json) to logformat nodes.

For now, a custom name should be composed by characters [a-zA-Z0-9-_]*
2024-02-20 15:18:39 +01:00
Christopher Faulet
22d4b0e901 MINOR: stconn: Add SE flag to announce zero-copy forwarding on consumer side
The SE_FL_MAY_FASTFWD_CONS is added and it will be used by endpoints to
announce their support for the zero-copy forwarding on the consumer
side. The flag is not necessarily permanent. However, it will be used this
way for now.
2024-02-14 15:09:14 +01:00
Christopher Faulet
7598c0ba69 MINOR: stconn: Rename SE_FL_MAY_FASTFWD and reorder bitfield
To fix a bug, a flag to announce the capabitlity to support the zero-copy
forwarding on the consumer side will be added on the SE descriptor. So the
old flag SE_FL_MAY_FASTFWD is renamed to indicate it concerns the producer
side. It is now SE_FL_MAY_FASTFWD_PROD. And to prepare addition of the new
flag, the bitfield is a bit reordered.
2024-02-14 15:00:32 +01:00
Christopher Faulet
8cfc11f461 CLEANUP: stconn: Move SE flags set by app layer at the end of the bitfield
To fix a bug, some SE flags must be added or renamed. To avoid mixing flags
set by the endpoint and flags set by the app, the second set of flags are
moved at the end of the bitfield, leaving the holes on the middle.
2024-02-14 14:52:25 +01:00
Christopher Faulet
40d98176ba BUG/MEDIUM: stconn: Don't check pending shutdown to wake an applet up
This revert of commit 0b93ff8c87 ("BUG/MEDIUM: stconn: Wake applets on
sending path if there is a pending shutdown") and 9e394d34e0 ("BUG/MINOR:
stconn: Don't report blocked sends during connection establishment") because
it was not the right fixes.

We must not wake an applet up when a shutdown is pending because it means
output some data are still blocked in the channel buffer. The applet does
not necessarily consume these data. In this case, the applet may be woken up
infinitly, except if it explicitly reports it wont consume datay yet.

This patch must be backported as far as 2.8. For older versions, as far as
2.2, it may be backported. If so, a previous fix must be pushed to prevent
an HTTP applet to be stuck. In http_ana.c, in http_end_request() and
http_end_reponse(), the call to channel_htx_truncate() on the request
channel in case of MSG_ERROR must be replace by a call to
channel_htx_erase().
2024-02-14 14:22:36 +01:00
Christopher Faulet
2c672f282d BUG/MEDIUM: stconn: Allow expiration update when READ/WRITE event is pending
When a READ or a WRITE activity is reported on a channel, the corresponding
date is updated. the last-read-activity date (lra) is updated and the
first-send-block date (fsb) is reset. The event is also reported at the
channel level by setting CF_READ_EVENT or CF_WRITE_EVENT flags. When one of
these flags is set, this prevent the update of the stream's task expiration
date from sc_notify(). It also prevents corresponding timeout to be reported
from process_stream().

But it is a problem during fast-forwarding stage if no expiration date was
set by the stream. Only process_stream() resets these flags. So a first READ
or WRITE event will prevent any stream's expiration date update till a new
call to process_stream(). But with no expiration date, this will only happen
on shutdown/abort event, blocking the stream for a while.

It is for instance possible to block the stats applet or the cli applet if a
client does not consume the response. The stream may be blocked, the client
timeout is not respected and the stream can only be closed on a client
abort.

So now, we update the stream's expiration date, regardless of reported
READ/WRITE events. It is not a big deal because lra and fsb date are
properly updated. It also means an old READ/WRITE event will no prevent the
stream to report a timeout and it is expected too.

This patch must be backported as far as 2.8. On older versions, timeouts and
stream's expiration date are not updated in the same way and this works as
expected.
2024-02-14 14:22:36 +01:00
Christopher Faulet
4a78f766ff MEDIUM: applet: Add notion of shutdown for write for applets
In fact there is already flags on the SE to state a shutdown for reads or
writes was performed. But for applets, this notion does not exist. Both
flags are set in same time when the applet is released. But at the SC level,
there are functions to perform a shutdown (formely the shutw) and an abort
(formely the shutr). For applets, when a shutdown is performed on the SC, if
the applet is not immediately released, nothing is acknowledge at the SE
level.

With old way to implement applets, this was not an real issue until recently
because applets accessed to the channel/SC flags. It was thus possible to
catch the shutdowns. But the "wait" command on the CLI reveals the
flaw. Indeed, when this command is executed, nothing is read or sent. So, it
is not possible to detect the shutdowns. As a workaround, a dedicated test
on the SC flags was added at the end of the wait command I/O handler. But it
is pretty ugly.

With new way to implement applets, there is no longer access to the channel
or SC. So we must add a way to acknowledge shutdown into the SE.

This patch solves the both sides of the issue. The shutw notion is added for
applets. Its only purpose is to set SE_FL_SHWN flags. This flag is tested by
all applets, so, it solves the issue quite simply.

Note that it is described as a bug fix but there is no real issue, just a
design flaw. However, if the "wait" command is backported, this patch must
be backported too. Unfortinately it will require an adaptation because there
is no appctx flags on older versions.
2024-02-14 14:22:36 +01:00
Christopher Faulet
5df45cff8f BUG/MEDIUM: stconn/applet: Block 0-copy forwarding if producer needs more room
This case does not exist yet with the H1 multiplexer, but applets may decide to
not produce data if there is not enough room in the destination buffer (the
applet's outbuf or the opposite SE buffer). It is true for the stats applets for
instance. However this case is not properly handled when the zero-copy
forwarding is in-use.

To fix the issue, the se_done_ff() function was modified to return the number of
bytes really forwarded and to subs for sends if nothing was forwarded while the
zero-copy forwarding was blocked by the producer. On the applet side, we take
care to block the zero-copy forwarding if the applet requests more room. At the
end, zero-copy forwarding is unblocked if something was forwarded.

This way, it is now possible for the stats applet to report a full buffer and
block the zero-copy forwarding, even if the buffer is not really full, by
requesting more room.

No backport needed.
2024-02-14 14:22:36 +01:00
Christopher Faulet
ece002af1d BUG/MEDIUM: applet: Add a flag to state an applet is using zero-copy forwarding
An issue was introduced when zero-copy forwarding was added to the stats and
cache applets. There is no test to be sure the upper layer is ready to use
the zero-copy forwarding. So these applets refuse to deliver the response
into the applet's output buffer if the zero-copy forwarding is supported by
the opposite endpoint. It is especially an issue when a filter, like the
compression, is in-use on the response channel.

Because of this bug, the response is not delivered and the applet is woken
up in loop to produce data.

To fix the issue, an appctx flag was added, APPCTX_FL_FASTFWD, to know when
the zero-copy forwarding is in-use. We rely on this flag to not fill the
outbuf in the applet's I/O handler.

No backport needed.
2024-02-14 14:22:36 +01:00
Frederic Lecaille
167e38e0e0 MINOR: quic: Add a counter for reordered packets
A packet is considered as reordered when it is detected as lost because its packet
number is above the largest acknowledeged packet number by at least the
packet reordering threshold value.

Add ->nb_reordered_pkt new quic_loss struct member at the same location that
the number of lost packets to count such packets.

Should be backported to 2.6.
2024-02-14 11:32:29 +01:00
Frederic Lecaille
eeeb81bb49 MINOR: quic: Dynamic packet reordering threshold
Let's say that the largest packet number acknowledged by the peer is #10, when inspecting
the non already acknowledged packets to detect if they are lost or not, this is the
case a least if the difference between this largest packet number and and their
packet numbers are bigger or equal to the packet reordering threshold as defined
by the RFC 9002. This latter must not be less than QUIC_LOSS_PACKET_THRESHOLD(3).
Which such a value, packets #7 and oldest are detected as lost if non acknowledged,
contrary to packet number #8 or #9.

So, the packet loss detection is very sensitive to such a network characteristic
where non acknowledged packets are distant from each others by their packet number
differences.

Do not use this static value anymore for the packet reordering threshold which is used
as a criteria to detect packet loss. In place, make it depend on the difference
between the number of the last transmitted packet and the number of the oldest
one among the packet which are still in flight before being inspected to be
deemed as lost.

Add new tune.quic.reorder-ratio setting to apply a ratio in percent to this
dynamic packet reorder threshold.

Should be backported to 2.6.
2024-02-14 11:32:29 +01:00
Remi Tricot-Le Breton
5e66bf26ec BUG/MEDIUM: ssl: Fix crash when calling "update ssl ocsp-response" when an update is ongoing
The CLI command "update ssl ocsp-response" was forcefully removing an
OCSP response from the update tree regardless of whether it used to be
in it beforehand or not. But since the main OCSP upate task works by
removing the entry being currently updated from the update tree and then
reinserting it when the update process is over, it meant that in the CLI
command code we were modifying a structure that was already being used.

These concurrent accesses were not properly locked on the "regular"
update case because it was assumed that once an entry was removed from
the update tree, the update task was the only one able to work on it.

Rather than locking the whole update process, an "updating" flag was
added to the certificate_ocsp in order to prevent the "update ssl
ocsp-response" command from trying to update a response already being
updated.

An easy way to reproduce this crash was to perform two "simultaneous"
calls to "update ssl ocsp-response" on the same certificate. It would
then crash on an eb64_delete call in the main ocsp update task function.

This patch can be backported up to 2.8.
2024-02-12 11:15:45 +01:00
Willy Tarreau
613e959c7b MINOR: cli/wait: add a condition to wait on a server to become unused
The "wait" command now supports a condition, "srv-unused", which waits
for the designated server to become totally unused, indicating that it
is removable. Upon each wakeup it calls srv_check_for_deletion() to
verify if conditions are met, if not if it's recoverable, or if it's
not recoverable, and proceeds according to this, never waiting for a
final decision longer than the configured delay.

The purpose is to make it possible to remove servers from the CLI after
waiting for their sessions to be terminated:

  $ socat -t5 /path/to/socket - <<< "
        disable server px/srv1
        shutdown sessions server px/srv1
        wait 2s srv-unused px/srv1
        del server px/srv1"

Or even wait for connections to terminate themselves:

  $ socat -t70 /path/to/socket - <<< "
        disable server px/srv1
        wait 1m srv-unused px/srv1
        del server px/srv1"
2024-02-09 20:38:08 +01:00
Willy Tarreau
66989ff426 MINOR: cli/wait: also pass up to 4 arguments to the external conditions
Conditions will need to have context, arguments etc from the command line.
Since these will vary with time (otherwise we wouldn't wait), let's just
pass them as text (possibly pre-processed). We're starting with 4 strings
that are expected to be allocated by strdup() and are always sent to free()
upon release.
2024-02-09 20:38:08 +01:00
Willy Tarreau
2673f8be82 MINOR: cli/wait: also support an unrecoverable failure status
Since we'll support waiting for an action to succeed or permanently
fail, we need the ability to return an unrecoverable failure. Let's
add CLI_WAIT_ERR_FAIL for this. A static error message may be placed
into ctx->msg to report to the user why the failure is unrecoverable.
2024-02-09 20:38:08 +01:00
Willy Tarreau
9b680d7411 MINOR: server: split the server deletion code in two parts
We'll need to be able to verify whether or not a server may be deleted.
For now, both the verification and the action are performed in the same
function, at once under thread isolation. The goal here is to extract
the verification code into a new function that will perform these checks,
return a status between success/recoverable/non-recoverable failure, and
will also return a message for the caller.
2024-02-09 20:38:08 +01:00
Willy Tarreau
1d2255a78a MINOR: cli: add a new "wait" command to wait for a certain delay
This allows to insert delays between commands, i.e. to collect a same
set of metrics at a fixed interval. E.g:

  $ socat -t20 /path/to/socket <<< "show activity; wait 10s; show activity"

The goal will be to extend the feature to optionally support waiting on
certain conditions. For this reason the struct definitions and enums were
placed into cli-t.h.
2024-02-08 21:54:54 +01:00
Willy Tarreau
8581d62daf MINOR: session: add the necessary functions to update the per-session glitches
This provides a new function session_add_glitch_ctr() that will update
the glitch counter and rate for the session, if tracked at all.
2024-02-08 15:51:49 +01:00
Willy Tarreau
c9c6b683fb MEDIUM: stick-tables: add a new stored type for glitch_cnt and glitch_rate
This adds a new pair of stored types in the stick-tables:
  - glitch_cnt
  - glitch_rate

These keep count of the number of glitches reported on a front connection,
in order to decide how to act with a badly defective client or a potential
attacker. For now nothing updates these counters, but all the infrastructure
needed to configure, update and retrieve them was added, including the doc.

No regtest was added yet since they're not filled yet.
2024-02-08 15:51:49 +01:00
Remi Tricot-Le Breton
befebf8b51 BUG/MEDIUM: ocsp: Separate refcount per instance and per store
With the current way OCSP responses are stored, a single OCSP response
is stored (in a certificate_ocsp structure) when it is loaded during a
certificate parsing, and each ckch_inst that references it increments
its refcount. The reference to the certificate_ocsp is actually kept in
the SSL_CTX linked to each ckch_inst, in an ex_data entry that gets
freed when he context is freed.
One of the downside of this implementation is that is every ckch_inst
referencing a certificate_ocsp gets detroyed, then the OCSP response is
removed from the system. So if we were to remove all crt-list lines
containing a given certificate (that has an OCSP response), the response
would be destroyed even if the certificate remains in the system (as an
unused certificate). In such a case, we would want the OCSP response not
to be "usable", since it is not used by any ckch_inst, but still remain
in the OCSP response tree so that if the certificate gets reused (via an
"add ssl crt-list" command for instance), its OCSP response is still
known as well. But we would also like such an entry not to be updated
automatically anymore once no instance uses it. An easy way to do it
could have been to keep a reference to the certificate_ocsp structure in
the ckch_store as well, on top of all the ones in the ckch_instances,
and to remove the ocsp response from the update tree once the refcount
falls to 1, but it would not work because of the way the ocsp response
tree keys are calculated. They are decorrelated from the ckch_store and
are the actual OCSP_CERTIDs, which is a combination of the issuer's name
hash and key hash, and the certificate's serial number. So two copies of
the same certificate but with different names would still point to the
same ocsp response tree entry.

The solution that answers to all the needs expressed aboved is actually
to have two reference counters in the certificate_ocsp structure, one
for the actual ckch instances and one for the ckch stores. If the
instance refcount becomes 0 then we remove the entry from the auto
update tree, and if the store reference becomes 0 we can then remove the
OCSP response from the tree. This would allow to chain some "del ssl
crt-list" and "add ssl crt-list" CLI commands without losing any
functionality.

Must be backported to 2.8.
2024-02-07 17:10:05 +01:00
Remi Tricot-Le Breton
28e78a0a74 MINOR: ssl: Use OCSP_CERTID instead of ckch_store in ckch_store_build_certid
The only useful information taken out of the ckch_store in order to copy
an OCSP certid into a buffer (later used as a key for entries in the
OCSP response tree) is the ocsp_certid field of the ckch_data structure.
We then don't need to pass a pointer to the full ckch_store to
ckch_store_build_certid or even any information related to the store
itself.
The ckch_store_build_certid is then converted into a helper function
that simply takes an OCSP_CERTID and converts it into a char buffer.
2024-02-07 17:09:39 +01:00
Christopher Faulet
d7467cd495 MINOR: applet: Identify applets using their own buffers via a flag
These applets can now be identified by testing APPCTX_FL_INOUT_BUFS
flag. This will be useful between the kind of applets in helper functions.
2024-02-07 15:05:05 +01:00
Christopher Faulet
a9301c96f1 MINOR: applet: Use an option to disable zero-copy forwarding for all applets
At the beginning of the 3.0-dev cycle, the zero-copy forwarding support was
added only for the cache applet with an option to disable it. This was a
hack, waiting for a better integration with applets. It is now possible to
implement the zero-copy forwarding for any applets. So the specific option
for the cache applet was renamed to be used for all applets. And this option
is now also checked for the stats applet.

Concretely, 'tune.cache.zero-copy-forwarding' was renamed to
'tune.applet.zero-copy-forwarding'.
2024-02-07 15:05:01 +01:00
Christopher Faulet
ee53d8421f MEDIUM: applet: Simplify a bit API to exchange data with applets
Default .rcv_buf and .snd_buf functions that applets can use are now
specialized to manipulate raw buffers or HTX buffers.

Thus a TCP applet should use appctx_raw_rcv_buf() and appctx_raw_snd_buf()
while HTTP applet should use appctx_htx_rcv_buf() and appctx_htx_snd_buf().

Note that the appctx is now directly passed to these functions instead of
the SC.
2024-02-07 15:04:52 +01:00
Christopher Faulet
868205943c MAJOR: stats: Send stats dump over HTTP using zero-copy forwarding
Just like for the cache applet, it is now possible to send response to the
opposite side using the zero-copy forwarding. Internal functions were
slightly updated but there is nothing special to say. Except the requested
size during the nego stage is not exact.
2024-02-07 15:04:48 +01:00
Christopher Faulet
1c18d32a0d MEDIUM: stconn: Nofify requested size during zero-copy forwarding nego is exact
It is now possible to use a flag during zero-copy forwarding negotiation to
specify the requested size is exact, it means the producer really expect to
receive at least this amount of data.

It can be used by consumer to prepare some processing at this stage, based
on the requested size. For instance, in the H1 mux, it is used to write the
next chunk size.
2024-02-07 15:04:38 +01:00
Christopher Faulet
2297f52734 MINOR: stconn: Add support for flags during zero-copy forwarding negotiation
During zero-copy forwarding negotiation, a pseudo flag was already used to
notify the consummer if the producer is able to use kernel splicing or not. But
this was not extensible. So, now we use a true bitfield to be able to pass flags
during the negotiation. NEGO_FF_FL_* flags may be used now.

Of course, for now, there is only one flags, the kernel splicing support on
producer side (NEGO_FF_FL_MAY_SPLICE).
2024-02-07 15:04:29 +01:00
Christopher Faulet
39b6f5b04c MEDIUM: applet: Add support for zero-copy forwarding from an applet
Thanks to this patch, it is possible to an applet to directly send data to
the opposite endpoint. To do so, it must implement <fastfwd> appctx callback
function and set SE_FL_MAY_FASTFWD flag.

Everything will be handled by appctx_fastfwd() function. The applet is only
responsible to transfer data. If it sets <to_forward> value, it is used to
limit the amount of data to forward.
2024-02-07 15:04:01 +01:00
Christopher Faulet
62a81cb6a6 MINOR: applet: Add callback function to deal with zero-copy forwarding
This patch introduces the support for the callback function responsible to
produce data via the zero-copy forwarding mechanism. There is no
implementation for now. But <to_forward> field was added in the appctx
structure to let an applet inform how much data it want to forward. It is
not mandatory but it will be used during the zero-copy forwarding
negociation.
2024-02-07 15:03:57 +01:00
Christopher Faulet
cc7b141e1c MINOR: applet: Add an appctx flag to report shutdown to applets
There is no shutdown for reads and send with applets. Both are performed
when the appctx is released. So instead of 2 flags, like for
muxes/connections, only one flag is used. But the idea is the same:
acknowledge the event at the applet level.
2024-02-07 15:03:50 +01:00
Christopher Faulet
14bd091fd7 MINOR: applet: Remove appctx state field to only used the flags
The appctx state was never really used as a state. It is only used to know
when an applet should be freed on the next wakeup. This can be converted to
a flag and the state can be removed. This is what this patch does.
2024-02-07 15:03:46 +01:00
Christopher Faulet
4434b03358 MINIOR: applet: Add flags to deal with ends of input, ends of stream and errors
Dedicated appctx flags to report EOI, EOS and errors (pending or terminal) were
added with the functions to set these flags. It is pretty similar to what it
done on most of muxes.
2024-02-07 15:03:42 +01:00
Christopher Faulet
e8655546b7 MINOR: applet: Add flags on the appctx and stop abusing its state
Till now, we've extended the appctx state to add some flags. However, the
field name is misleading. So a bitfield was added to handle real flags. And
helper functions to manipulate this bitfield were added.
2024-02-07 15:03:34 +01:00
Christopher Faulet
4ad8192ce4 MEDIM: applet: Add the applet handler based on IN/OUT buffers
A dedicated function to run applets was introduced, in addition to the old
one, to deal with applets that use their own buffers. The main differnce
here is that this handler does not use channels at all. It performs a
synchronous send before calling the applet and performs a synchronous
receive just after.

No applets are plugged on this handler for now.
2024-02-07 15:03:26 +01:00
Christopher Faulet
f81b704d01 MEDIUM: stconn: Add functions to handle applets I/O from the SC layer
There is no tasklet to handle I/O subscriptions for applets, but functions
to deal with receives and sends from the SC layer were added. it meanse a
function to retrieve data from an applet with this synchronous version and a
function to push data to an applet wit this synchronous version.

It is pretty similar to the functions used for muxes but there are some
differences. So for now, we keep them separated.

Zero-copy forwarding is not supported for now. In addition, there is no
subscription mechanism.
2024-02-07 15:03:23 +01:00
Christopher Faulet
525ec12305 MINOR: applet: Implement default functions to exchange data with channels
In this patch, we add default functions to copy data from a channel to the
<inbuf> buffer of an applet (appctx_rcv_buf) and another on to copy data
from <outbuf> buffer of an applet to a channel (appctx_snd_buf).

These functions are not used for now, but they will be used by applets to
define their <rcv_buf> and <snd_buf> callback functions. Of course, it will
be possible for a specific applet to implement its own functions but these
ones should be good enough for most of applets. HTX and RAW buffers are
supported.
2024-02-07 15:03:18 +01:00
Christopher Faulet
361b81bfca MINOR: applet: Add support for callback functions to exchange data with channels
For now, it is not usable, but this patch introduce the support of callback
functions, in the applet structure, to exchange data between channels and
applets. It is pretty similar to callback functions defined by muxes.
2024-02-07 15:03:14 +01:00
Christopher Faulet
ab9d2c6ca8 MINOR: applet: Add dedicated IN/OUT buffers for appctx
It is the first patch of a series aimed to align applets on connections.
Here, dedicated buffers are added for applets. For now, buffers are
initialized and helpers function to deal with allocation are added. In
addition, flags to report allocation failures or full buffers are also
introduced. <inbuf> will be used to push data to the applet from the stream
and <outbuf> will be used to push data from the applet to the stream.
2024-02-07 15:03:01 +01:00
Christopher Faulet
0dd7ff0d67 MINOR: stconn: Be able to detect applets using HTX
IS_HXT_SC() macro is only usable if the stream-connector is attached to a
connection. It is a bit restrictive because this cannot work if the SC is
attached to an applet. So let's fix that be adding the support of applets
too.
2024-02-07 15:02:19 +01:00
Christopher Faulet
6734e56514 MINOR: task: Move wait_event in the task header file
wait_event structure was in connection header file because it is only used
by connections and muxes. But, this may change. For instance applets may be
good candidates to use it too. So, the structure is moved to the task header
file instead.
2024-02-07 15:02:13 +01:00
Willy Tarreau
25968c186a MINOR: debug: add an optional message argument to the BUG_ON() family
This commit adds support for an optional second argument to BUG_ON(),
WARN_ON(), CHECK_IF(), that can be a constant string. When such an
argument is given, it will be printed on a second line after the
existing first message that contains the condition.

This can be used to provide more human-readable explanations about
what happened, such as "too low on memory" or "memory corruption
detected" that may help a user resolve the incident by themselves.
2024-02-05 17:09:00 +01:00
Willy Tarreau
d417863828 MINOR: debug: support passing an optional message in ABORT_NOW()
The ABORT_NOW() macro is not much used since we have BUG_ON(), but
there are situations where it makes sense, typically if the program
must always die regardless od DEBUG_STRICT, or if the condition must
always be evaluated (e.g. decompress something and check it).

It's not convenient not to have any hint about what happened there. But
providing too much info also results in wiping some registers, making
the trace less exploitable, so a compromise must be found.

What this patch does is to provide the support for an optional argument
to ABORT_NOW(). When an argument is passed (a string), then a message
will be emitted with the file name, line number, the message and a
trailing LF, before the stack dump and the crash. It should be used
reasonably, for example in functions that have multiple calls that need
to be more easily distinguished.
2024-02-05 17:09:00 +01:00
Willy Tarreau
bc70b385fd MINOR: debug: make BUG_ON() catch build errors even without DEBUG_STRICT
As seen in previous commit 59acb27001 ("BUILD: quic: Variable name typo
inside a BUG_ON()."), it can sometimes happen that with DEBUG forced
without DEBUG_STRICT, BUG_ON() statements are ignored. Sadly, it means
that typos there are not even build-tested.

This patch makes these statements reference sizeof(cond) to make sure
the condition is parsed. This doesn't result in any code being emitted,
but makes sure the expression is correct so that an issue such as the one
above will fail to build (which was verified).

This may be backported as it can help spot failed backports as well.
2024-02-05 15:09:37 +01:00
Aurelien DARRAGON
be0165b249 BUILD: debug: remove leftover parentheses in ABORT_NOW()
Since d480b7b ("MINOR: debug: make ABORT_NOW() store the caller's line
number when using abort"), building with 'DEBUG_USE_ABORT' fails with:

  |In file included from include/haproxy/api.h:35,
  |                 from include/haproxy/activity.h:26,
  |                 from src/ev_poll.c:20:
  |include/haproxy/thread.h: In function ‘ha_set_thread’:
  |include/haproxy/bug.h:107:47: error: expected ‘;’ before ‘_with_line’
  |  107 | #define ABORT_NOW() do { DUMP_TRACE(); abort()_with_line(__LINE__); } while (0)
  |      |                                               ^~~~~~~~~~
  |include/haproxy/bug.h:129:25: note: in expansion of macro ‘ABORT_NOW’
  |  129 |                         ABORT_NOW();                                    \
  |      |                         ^~~~~~~~~
  |include/haproxy/bug.h:123:9: note: in expansion of macro ‘__BUG_ON’
  |  123 |         __BUG_ON(cond, file, line, crash, pfx, sfx)
  |      |         ^~~~~~~~
  |include/haproxy/bug.h:174:30: note: in expansion of macro ‘_BUG_ON’
  |  174 | #  define BUG_ON(cond)       _BUG_ON     (cond, __FILE__, __LINE__, 3, "FATAL: bug ",     "")
  |      |                              ^~~~~~~
  |include/haproxy/thread.h:201:17: note: in expansion of macro ‘BUG_ON’
  |  201 |                 BUG_ON(!thr->ltid_bit);
  |      |                 ^~~~~~
  |compilation terminated due to -Wfatal-errors.
  |make: *** [Makefile:1006: src/ev_poll.o] Error 1

This is because of a leftover: abort()_with_line(__LINE__);
                                    ^^
Fixing it by removing the extra parentheses after 'abort' since the
abort() call is now performed under abort_with_line() helper function.

This was raised by Ilya in GH #2440.

No backport is needed, unless the above commit gets backported.
2024-02-05 14:55:04 +01:00
Willy Tarreau
d480b7be96 MINOR: debug: make ABORT_NOW() store the caller's line number when using abort
Placing DO_NOT_FOLD() before abort() only works in -O2 but not in -Os which
continues to place only 5 calls to abort() in h3.o for call places. The
approach taken here is to replace abort() with a new function that wraps
it and stores the line number in the stack. This slightly increases the
code size (+0.1%) but when unwinding a crash, the line number remains
present now. This is a very low cost, especially if we consider that
DEBUG_USE_ABORT is almost only used by code coverage tools and occasional
debugging sessions.
2024-02-02 17:12:06 +01:00
Willy Tarreau
2bb192ba91 MINOR: debug: make sure calls to ha_crash_now() are never merged
As indicated in previous commit, we don't want calls to ha_crash_now()
to be merged, since it will make gdb return a wrong line number. This
was found to happen with gcc 4.7 and 4.8 in h3.c where 26 calls end up
as only 5 to 18 "ud2" instructions depending on optimizations. By
calling DO_NOT_FOLD() just before provoking the trap, we can reliably
avoid this folding problem. Note that this does not address the case
where abort() is used instead (DEBUG_USE_ABORT).
2024-02-02 17:12:06 +01:00
Willy Tarreau
e06e8a2390 MINOR: compiler: add a new DO_NOT_FOLD() macro to prevent code folding
Modern compilers sometimes perform function tail merging and identical
code folding, which consist in merging identical occurrences of same
code paths, generally final ones (e.g. before a return, a jump or an
unreachable statement). In the case of ABORT_NOW(), it can happen that
the compiler merges all of them into a single one in a function,
defeating the purpose of the check which initially was to figure where
the bug occurred.

Here we're creating a DO_NO_FOLD() macro which makes use of the line
number and passes it as an integer argument to an empty asm() statement.
The effect is a code position dependency which prevents the compiler
from merging the code till that point (though it may still merge the
following code). In practice it's efficient at stopping the compilers
from merging calls to ha_crash_now(), which was the initial purpose.

It may also be used to force certain optimization constructs since it
gives more control to the developer.
2024-02-02 17:12:06 +01:00
Christopher Faulet
3246f863d6 MEDIUM: stats: Be able to access a specific field into a stats module
It is now possible to selectively retrieve extra counters from stats
modules. H1, H2, QUIC and H3 fill_stats() callback functions are updated to
return a specific counter.
2024-02-01 12:00:53 +01:00
Christopher Faulet
fd366a106b MINOR: stats: Be able to access to registered stats modules from anywhere
The list of modules registered on the stats to expose extra counters is now
public. It is required to export these counters into the Prometheus
exporter.
2024-02-01 12:00:53 +01:00
Aurelien DARRAGON
42a97d9feb MEDIUM: tcp-act/backend: support for set-bc-{mark,tos} actions
set-bc-{mark,tos} actions are pretty similar to set-fc-{mark,tos} to set
mark/tos on packets sent from haproxy to server: set-bc-{mark,tos} actions
act on the whole backend/srv connection: from connect() to connection
teardown, thus they may only be used before the connection to the server
is instantiated, meaning that they are only relevant for request-oriented
rules such as tcp-request or http-request rules. For now their use is
limited to content request rules, because tos and mark informations are
stored directly within the stream, thus it is required that the stream
already exists.

stream flags are used in combination with dedicated stream struct members
variables to pass 'tos' and 'mark' informations so that they are correctly
considered during stream connection assignment logic (prior to connecting
to actually connecting to the server)

'tos' and 'mark' fd sockopts are taken into account in conn hash
parameters for connection reuse mechanism.

The documentation was updated accordingly.
2024-02-01 10:58:30 +01:00
Aurelien DARRAGON
b4ee7b044e MEDIUM: tcp-act: <expr> support for set-fc-{mark,tos} actions
In this patch we add the possibility to use sample expression as argument
for set-fc-{mark,tos} actions. To make it backward compatible with
previous behavior, during parsing we first try to parse the value as
as integer (decimal or hex notation), and then fallback to expr parsing
in case of failure.

The documentation was updated accordingly.
2024-02-01 10:58:30 +01:00
Aurelien DARRAGON
ea09075f59 OPTIM: connection: progressive hash for conn_calculate_hash()
Some CPU time is needlessly wasted in conn_calculate_hash(), because all
params are first copied into a temporary buffer before computing the
hash on the whole buffer. Instead, let's leverage the XXH progressive
hash update functions to avoid expensive memcpys.
2024-02-01 10:58:30 +01:00
Aurelien DARRAGON
1de149fb6d CLEANUP: connection: remove obsolete comment in header file
0x00000008 bit for CO_FL_* flags is no more unused since 8cc3fc73f1
("MINOR: connection: update rhttp flags usage"). Removing the comment
that says otherwise.
2024-02-01 10:58:30 +01:00
Amaury Denoyelle
4b5f557283 MINOR: mux-quic: realign Tx buffer if possible
A major reorganization of QUIC MUX sending has been implemented. Now
data transfer occur over a single QCS buffer. This has improve
performance but at the cost of restrictions on snd_buf. Indeed, buffer
instances are now shared from stream callback snd_buf up to quic-conn
layer.

As such, snd_buf cannot manipulate freely already present data buffer.
In particular, realign has been completely removed by the previous
patches.

This commit reintroduces a partial realign support. This is only done if
the buffer contains only unsent data, via a new MUX function
qcc_realign_stream_txbuf() which is called during snd_buf.
2024-01-31 16:28:54 +01:00
Amaury Denoyelle
4513787d0d MEDIUM: mux-quic: properly handle conn Tx buf exhaustion
This commit is a direct follow-up on the major rearchitecture of send
buffering. This patch implements the proper handling of connection pool
buffer temporary exhaustion.

The first step is to be able to differentiate a fatal allocation error
from a temporary pool exhaustion. This is done via a new output argument
on qcc_get_stream_txbuf(). For a fatal error, application protocol layer
will schedule the immediate connection closing. For a pool exhaustion,
QCC is flagged with QC_CF_CONN_FULL and stream sending process is
interrupted. QCS instance is also registered in a new list
<qcc.buf_wait_list>.

A new connection buffer can become available when all ACKs are received
for an older buffer. This process is taken in charge by quic-conn layer.
It uses qcc_notify_buf() function to clear QC_CF_CONN_FULL and to wake
up every streams registered on buf_wait_list to resume sending process.
2024-01-31 16:28:54 +01:00
Amaury Denoyelle
cd22200d23 MEDIUM: mux-quic: release Tx buf on too small room
This commit is a direct follow-up on the major rearchitecture of send
buffering. It allows application protocol to react if current QCS
sending buffer space is too small. In this case, the buffer can be
released to the quic-conn layer. This allows to allocate a new QCS
buffer and retry HTX parsing, unless connection buffer pool is already
depleted.

A new function qcc_release_stream_txbuf() serves as API for app protocol
to release the QCS sending buffer. This operation fails if there is
unsent data in it. In this case, MUX has to keep it to finalize transfer
of unsent data to quic-conn layer. QCS is thus flagged with
QC_SF_BLK_MROOM to interrupt snd_buf operation.

When all data are sent to the quic-conn layer, QC_SF_BLK_MROOM is
cleared via qcc_streams_sent_done() and stream layer is woken up to
restart snd_buf.

Note that a new function qcc_stream_can_send() has been defined. It
allows app proto to check if sending is currently blocked for the
current QCS. For now, it checks QC_SF_BLK_MROOM flag. However, it will
be extended to other conditions with the following patches.
2024-01-31 16:28:54 +01:00