Commit graph

27461 commits

Author SHA1 Message Date
William Lallemand
339d25636d REORG: httpclient/lua: move the lua httpclient code to http_client.c
Move the lua httpclient code from hlua.c to http_client.c

The code is almost the same, except for the registration of the class, which
is done in hlua_http_client_init_state(), called from REGISTER_HLUA_STATE_INIT().

check_args() calls have been replaced by hlua_check_args().

hlua_httpclient_destroy_all() is exported so it can be called in hlua.c.
hlua_httpclient_table_to_hdrs() is made static.
2026-06-13 21:18:20 +02:00
William Lallemand
974560128d MINOR: lua: export hlua_pusherror() and check_args()
hlua_pusherror() and check_args() are being exported.

check_args() is now a macro to hlua_check_args() so it's not confusing
when called outside hlua.c.
2026-06-13 21:18:20 +02:00
Amaury Denoyelle
2ff861747c BUG/MINOR: server: fix add server with consistent hash balancing
When a dynamic server is added with consistent hash balancing on the
backend, its lb_nodes elements are allocated and associated with a
calculated server key. This operation is performed in add server handler
via srv_alloc_lb(). By default, the server key is based on its ID.
However, automatic server ID is calculated later in add server handler,
which means the initial lb_nodes are not valid.

This could cause a load balancing issue, but in fact this is not directly
visible as the server key is recalculated when the dynamic server is
enabled via chash_queue_dequeue_srv() : all server lb_nodes are dequeued
and requeued with the now proper key.

Thus, "add server" handler must be corrected as it is buggy when
considering it alone. The simplest solution of the current patch is to
initialize server ID before srv_alloc_lb() is invoked. There is no issue
as the handler runs under thread isolation so there is no risk of multiple
servers manipulating the same ID. Server insertion in proxy ID tree is
still performed at the end of the handler when all fallible operations
are completed.

The fact that server key is recalculated when the server is set to ready
state is a side effect of the following patch which was introduced in
3.0. What this means though is that users of older releases are facing a
bigger issue, with load-balancing not working as expected. Thus,
this patch is even more crucial for 2.8 and older releases.

  faa8c3e024
  MEDIUM: lb-chash: Deterministic node hashes based on server address

This should fix github issue #3413. Thanks to Joao Morais for his
analysis of the problem.

This must be backported to all stable releases.
2026-06-12 15:31:41 +02:00
Olivier Houchard
98b1fd4ff9 BUG/MEDIUM: h3: Properly handle PUSH_PROMISE on backend connections
When we receive a PUSH_PROMISE frame while we don't expect it, flag it
as a connection error, do not just set ret to H3_ERR_ID_ERROR, as it
would just be considered the number of bytes we read, and could lead to
random corruption. This should only happen with backend connections.
This should be backported whenever commit 4a8bb2fe5 is backported.
2026-06-12 14:01:07 +02:00
Olivier Houchard
b9aa1c0e64 MEDIUM: tasks: Redispatch shared tasks when the thread is loaded
Now that there is no longer a shared wake queue, chances are if a shared task
is scheduled, it will always end up on the same thread. In
wake_expired_tasks(), when a task has to be woken up, randomly look at
three other threads, and if the runqueue of the current thread is at least
two times bigger than the runqueue of one of the other threads, then give
that task to that thread, so that our load gets reduced.
If we're giving the task to another thread, then we have to add the
TASK_RUNNING flag until we have woken it up, otherwise the other thread could
just run it, if it gets woken up from another path, and free it while
we're still not done with it.
2 times has been chosen somewhat arbitrarily, and may be tweaked at a
later date if deemed not optimal.
2026-06-12 11:49:09 +02:00
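The redispatch heuristic described in commit b9aa1c0e64 above can be sketched as follows. This is only an illustration under stated assumptions, not the code from the patch: pick_target_thread(), PROBES, RATIO and the rq_len array are hypothetical names, rand() stands in for whatever random source is really used, and the guard against an empty local run queue is an assumption of this sketch.

```c
#include <assert.h>
#include <stdlib.h>

#define PROBES 3   /* number of other threads sampled, per the commit message */
#define RATIO  2   /* the "2 times" threshold, said to be somewhat arbitrary */

/* Returns the thread that should receive a woken shared task: either `self`,
 * or a randomly sampled thread whose run queue is at most half as long as
 * ours. rq_len[i] is the (hypothetical) run queue length of thread i.
 */
static int pick_target_thread(const unsigned int *rq_len, int nb_threads, int self)
{
    int i;

    assert(nb_threads > 0 && self >= 0 && self < nb_threads);
    if (nb_threads < 2)
        return self;

    for (i = 0; i < PROBES; i++) {
        /* pick a random thread other than ourselves */
        int other = (self + 1 + rand() % (nb_threads - 1)) % nb_threads;

        /* assumption: an idle thread never gives its task away */
        if (rq_len[self] > 0 && rq_len[self] >= RATIO * rq_len[other])
            return other;
    }
    return self;
}
```

With equal run queues the task stays on the current thread; it only moves when one of the sampled threads is clearly less loaded.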
Olivier Houchard
aaee6c463c MINOR: tasks: Remove wq_lock and the per-thread group wait queues
Now that they are no longer used, remove wq_lock and the per-thread
group wait queues.
2026-06-12 11:49:09 +02:00
Olivier Houchard
caa1cd0674 MINOR: tasks: Use __task_set_state_and_tid() in task_instant_wakeup()
Modify task_instant_wakeup() to use __task_set_state_and_tid().
It uses the new ownership behavior, but that's okay because
task_instant_wakeup() was not used anywhere.
2026-06-12 11:49:09 +02:00
Olivier Houchard
0988b9c773 MEDIUM: tasks: Remove the per-thread group wait queue
Totally remove the per-thread group wait queue. This was potentially a
source of contention, because there was only a global lock for all
those wait queues.
Instead, for shared tasks, there is now the concept of ownership for the
task. When a task is in the wait queue, run queue, or is running on that
particular thread, the task's tid is set to -2 - thread_tid, and only
that thread will be responsible for it until it is no longer running,
and in none of its queues.
When a shared task is scheduled to be run at a later time, if its
current tid is -1, then the current thread will take ownership, and put
it in its own wait queue. If it is already owned, then TASK_WOKEN_WQ is
added to the task's state, and a task_wakeup() is done, so that the
owner thread will add it in its wait queue.
If there is any owner, then a task_wakeup() will just add the task to
the owner's runqueue, otherwise the current thread will become the
owner.
2026-06-12 11:49:09 +02:00
Olivier Houchard
c9f3ddcb1e MINOR: tasks: Start using __task_set_state_and_tid()
Start using __task_set_state_and_tid() when we're changing the state of
the task while queueing it, in preparation for the future ownership
changes.
2026-06-12 11:49:09 +02:00
Olivier Houchard
95cb3251a0 MINOR: tasks: Use __task_get_current_owner() in task_kill.
In task_kill(), to know which thread to send the task to, use
__task_get_current_owner(), in preparation for future changes.
2026-06-12 11:49:09 +02:00
Olivier Houchard
74b16c5477 MINOR: tasks: Introduce __task_get_current_owner
Introduce a new function, __task_get_current_owner, that returns the
owner of a task based on its current tid.
-1 means there is no current owner, otherwise either the tid is >= 0, in
which case it will just return it, or it's < -1, in which case it will
return -2 - tid, the tid of the thread with the current ownership.
2026-06-12 11:49:09 +02:00
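The tid encoding described in commits 8b6d8f5e4f and 74b16c5477 above can be illustrated with a short sketch. The function names below are hypothetical and are not the ones used in the patches; only the arithmetic (-1 for "no owner", -2 - tid for temporary ownership) comes from the commit messages.

```c
#include <assert.h>

/* A shared task with no current owner, per the commit message. */
#define TID_NO_OWNER (-1)

/* Encode temporary ownership of a shared task by thread `thread_tid`. */
static int owner_to_tid_field(int thread_tid)
{
    assert(thread_tid >= 0);
    return -2 - thread_tid;
}

/* Decode the owner from a task's tid field.
 * Returns -1 when there is no current owner.
 */
static int tid_field_to_owner(int tid)
{
    if (tid == TID_NO_OWNER)
        return -1;
    if (tid >= 0)
        return tid;          /* task bound to a thread: that thread owns it */
    return -2 - tid;         /* shared task temporarily owned by a thread */
}
```

The mapping is its own inverse for values below -1, so thread 0 is stored as -2, thread 3 as -5, and decoding -5 gives 3 again.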
Olivier Houchard
8b6d8f5e4f MINOR: tasks: Add __task_get_new_tid_field()
Introduce __task_get_new_tid_field(), that provides the tid to be used
for a task.
For shared task, to mark temporary ownership of a task, instead of -1,
the tid will be set to -2-tid, tid being the tid of the current thread.
2026-06-12 11:49:09 +02:00
Olivier Houchard
91f9e3a3dd MINOR: tasks: Introduce __task_set_state_and_tid
Introduce a new function, __task_set_state_and_tid, that atomically can
set a task's state and its tid. This will be used later, as the tid will
be used to indicate task ownership even for shared tasks.
2026-06-12 11:49:09 +02:00
William Lallemand
92206fb02f DOC: acme: add mentions of lua features
Mention ACME.challenge_ready() and event_hdl which are useful in lua to
implement dns-01.
2026-06-11 23:51:45 +02:00
William Lallemand
d2c9bf70e5 EXAMPLES: lua/acme: add a dns-01 handler for Gandi LiveDNS API
This Lua script automates dns-01 ACME challenges using the Gandi LiveDNS
API v5. It subscribes to the ACME_DEPLOY event to set the required
_acme-challenge TXT record via the Gandi REST API, signals HAProxy that
the challenge is ready using ACME.challenge_ready(), then cleans up the
TXT record once the certificate is issued on ACME_NEWCERT.

The API key is read from the GANDI_API_KEY environment variable at
startup. Zone discovery is automatic: the script probes parent zones
from longest to shortest until Gandi accepts the record, which handles
both apex and wildcard certificates transparently.
2026-06-11 19:37:49 +02:00
William Lallemand
4bb21dae2f MINOR: acme: publish ACME_DEPLOY event via event_hdl
Add EVENT_HDL_SUB_ACME_DEPLOY to the ACME family. It is published in
the dns-01 challenge path after the TXT record information has been
prepared, carrying the certificate store name, domain, account
thumbprint, dns_record value, and optionally the provider and vars
strings.

Lua subscribers using core.event_sub() receive the event data as an
AcmeEvent object, which is the same class used for ACME_NEWCERT and
carries the fields relevant to the event type.
2026-06-11 19:14:52 +02:00
William Lallemand
81d7624e01 MINOR: acme: publish ACME_NEWCERT event via event_hdl
Add a new EVENT_HDL_SUB_ACME_NEWCERT event type in the ACME family.
It is published after a new certificate has been successfully fetched
and installed. The event carries the certificate store name, allowing
subscribers to act on newly available certificates.

Lua subscribers using core.event_sub() receive the event data as an
AcmeEvent object with a crtname field containing the certificate store
name.
2026-06-11 19:14:52 +02:00
Willy Tarreau
960fa1c921 BUG/MINOR: cpu-topo: use ha_diag_notice() to report thread creations
Using ha_diag_warning() to report the number of threads created resulted
in warnings being counted and possibly an error being fired when combined
with -dW:

  $ printf "global\nstats socket /tmp/sock1\n" | ./haproxy -dD -dW -c -f /dev/stdin; echo $?
  [NOTICE]   (10406) : haproxy version is 3.5-dev0-5091ac-35
  [NOTICE]   (10406) : path to executable is ./haproxy
  [DIAG]     (10406) : Created 20 threads split into 2 groups
  [ALERT]    (10406) : Some warnings were found and 'zero-warning' is set. Aborting.
  1

Now that we have ha_diag_notice(), let's use it:

  $ printf "global\nstats socket /tmp/sock1\n" | ./haproxy -dD -dW -c -f /dev/stdin; echo $?
  [DIAG]     (10513) : Created 20 threads split into 2 groups
  0

It would make sense to backport this to 3.2 because it helps validate configs
against diag warnings without triggering a false positive. It depends on
this previous patch:

  MINOR: errors: add ha_diag_notice() to report diag-level notifications
2026-06-11 18:49:57 +02:00
Willy Tarreau
7d63efa5f5 MINOR: errors: add ha_diag_notice() to report diag-level notifications
Right now the only way to report info that is only displayed in diag
mode with -dD is to use ha_diag_warning(). The problem is that this is
then counted as a warning and may result in errors when combined with
-dW, as happens for the CPU topology info:

  $ printf "global\nstats socket /tmp/sock1\n" | ./haproxy -dD -dW -c -f /dev/stdin; echo $?
  [NOTICE]   (10406) : haproxy version is 3.5-dev0-5091ac-35
  [NOTICE]   (10406) : path to executable is ./haproxy
  [DIAG]     (10406) : Created 20 threads split into 2 groups
  [ALERT]    (10406) : Some warnings were found and 'zero-warning' is set. Aborting.
  1

We need another level. This commit introduces ha_diag_notice() which only
emits a notification that doesn't count as a warning. Note that we could
even introduce an info level and revisit various messages so that notice
only reports certain events while info is for anything (like versions
above). That could be a future improvement.
2026-06-11 18:48:59 +02:00
Karol Kucharski
96b08e959c BUG/MEDIUM: ktls: defer enabling TLS ULP on a socket until connected
The Linux tls module requires a socket to be in TCP_ESTABLISHED state
before we can enable the TLS ULP on the socket. If the socket is in any
other state, then the setsockopt() call will fail, and we won't use
kTLS on that socket.
To make sure we're not doing it too early, defer it until the TLS
handshake is done, which means the TCP connection is established.

This should be backported up to 3.3.

Signed-off-by: Karol Kucharski <kkucharski@fastlogic.pl>
2026-06-11 14:18:31 +02:00
William Lallemand
784f972a6f MINOR: acme/lua: implement ACME.challenge_ready() Lua function
Add a new ACME global Lua table with a challenge_ready(crt, dns) method
that wraps acme_challenge_ready(). It marks the ACME challenge for domain
<dns> in certificate <crt> as ready and returns the number of remaining
challenges, or 0 when all challenges are ready and validation has been
triggered. A Lua error is raised if the certificate or domain is not found.

The ACME table is registered for each lua_State via the new
REGISTER_HLUA_STATE_INIT() mechanism.
2026-06-11 15:01:38 +02:00
William Lallemand
5c0733db9a MEDIUM: lua: move longjmp annotation macros to hlua.h
__LJMP, WILL_LJMP() and MAY_LJMP() were defined locally in hlua.c,
making them unavailable to other modules that implement Lua bindings.
Move them to include/haproxy/hlua.h so they can be used outside of
hlua.c.
2026-06-11 14:40:27 +02:00
William Lallemand
d0fde90e16 MINOR: lua: add REGISTER_HLUA_STATE_INIT() to register state init callbacks
Add a registration mechanism so that modules outside of hlua.c can hook
into each lua_State creation. Modules call hap_register_hlua_state_init()
(or the REGISTER_HLUA_STATE_INIT() macro) with a callback of the form:

  int my_init(lua_State *L, char **errmsg);

The callback returns an ERR_* code. ERR_ALERT and ERR_WARN trigger
ha_alert()/ha_warning() respectively; any other non-zero errmsg is
emitted via ha_notice(). ERR_FATAL or ERR_ABORT cause exit(1).
Registered entries are freed in hlua_deinit().
2026-06-11 14:13:04 +02:00
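The registration mechanism described in commit d0fde90e16 above can be pictured with a small stand-alone mock. Everything here is an assumption made for illustration: register_state_init(), run_state_inits(), free_state_inits() and my_init() are hypothetical names, lua_State is left opaque, and combining return codes with a bitwise OR is a guess, not a description of what hlua.c does.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct lua_State lua_State;   /* opaque here; normally from lua.h */

/* Callback shape quoted in the commit message. */
typedef int (*state_init_fct)(lua_State *L, char **errmsg);

struct state_init_entry {
    state_init_fct fct;
    struct state_init_entry *next;
};

static struct state_init_entry *state_init_list;

/* Register a callback; returns 0 on success, -1 on allocation failure.
 * Entries are prepended, so they run in reverse registration order.
 */
static int register_state_init(state_init_fct fct)
{
    struct state_init_entry *e;

    assert(fct != NULL);
    e = malloc(sizeof(*e));
    if (!e)
        return -1;
    e->fct = fct;
    e->next = state_init_list;
    state_init_list = e;
    return 0;
}

/* Run every registered callback for a new state and OR their codes. */
static int run_state_inits(lua_State *L, char **errmsg)
{
    struct state_init_entry *e;
    int ret = 0;

    for (e = state_init_list; e; e = e->next)
        ret |= e->fct(L, errmsg);
    return ret;
}

/* Release all entries, as would be done at deinit time. */
static void free_state_inits(void)
{
    while (state_init_list) {
        struct state_init_entry *next = state_init_list->next;

        free(state_init_list);
        state_init_list = next;
    }
}

/* Example callback: counts how many states it was called for. */
static int calls;

static int my_init(lua_State *L, char **errmsg)
{
    (void)L;
    (void)errmsg;
    calls++;
    return 0;
}
```

A module would register my_init() once at startup, and the Lua core would then call run_state_inits() each time it creates a state.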
Amaury Denoyelle
3dfd86062b BUILD: h3: fix compilation with USE_TRACE=0
Mark argument in h3_trace_header as unused if USE_TRACE is not set.

No need to backport unless HTTP/3 header traces are picked.
2026-06-11 11:47:57 +02:00
Amaury Denoyelle
cc01214a67 MINOR: h3: trace HTTP headers on BE side
Output HTTP/3 header traces on the backend side. As in the previous commit,
this relies on h3_trace_header() function.

Extra calls are added for fields extracted from the request start-line
which produce an HTTP/3 pseudo-header.
2026-06-11 11:40:06 +02:00
Amaury Denoyelle
00c081b5f3 MINOR: h3: trace HTTP headers on FE side
Output trace for HTTP/3 headers manipulated on the frontend side. This
is implemented via a new utility function h3_trace_header(), largely
inspired by the existing h2_trace_header().

An extra call is added for HTTP/3 response :status which is extracted
from the HTX start line.
2026-06-11 11:40:06 +02:00
Amaury Denoyelle
53fe4181a5 MINOR: h3: extend trace verbosity
Define two new values for HTTP/3 trace verbosity : simple and advanced.
For now, these values are unused. However advanced level will become
useful to implement HTTP/3 header traces.
2026-06-11 11:40:06 +02:00
William Lallemand
9e60d35aaf MINOR: acme: introduce acme_challenge_ready() for reuse outside the CLI
Extract the challenge-readiness logic from cli_acme_chall_ready_parse()
into a new acme_challenge_ready(crt, dns) function so it can be called
from other contexts such as Lua event handlers.

It slightly changes the messages on the CLI.
2026-06-11 11:33:27 +02:00
William Lallemand
0a90ff6b3d BUG/MEDIUM: acme: stuck ACME task when authz is already "valid"
When an ACME order is re-used or when a domain was recently validated,
the CA may return status "valid" for an authorization without requiring
any challenge to be solved.  In acme_res_auth(), this is handled by
setting auth->validated = 1 and jumping to out — but auth->ready is
never initialized and stays 0.

This became a bug in 3.4 when the "challenge-ready" option and the
ACME_CLI_WAIT state were introduced (commit 2b0c510aff).  ACME_CLI_WAIT
computes:

    all_cond_ready &= auth->ready;

across all authorizations.  A single auth->ready == 0 drives the AND
to zero and the task waits indefinitely for a readiness signal that
will never arrive, since no challenge was published and no external
agent will ever call challenge_ready() for that domain.

Fix it by setting auth->ready = ctx->cfg->cond_ready for already-valid
authorizations, marking them as satisfying all required readiness
conditions so ACME_CLI_WAIT can proceed normally.

This should be backported to 3.4.
2026-06-10 18:19:55 +02:00
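The stuck-task condition described in commit 0a90ff6b3d above can be reproduced with a reduced model. The struct and helpers below are hypothetical simplifications: only the `all_cond_ready &= auth->ready` accumulation and the `ready = cond_ready` fix come from the commit message, while the way the accumulator is initialized and compared is an assumption of this sketch.

```c
#include <assert.h>

/* Hypothetical, simplified authorization: only the two fields discussed. */
struct auth {
    unsigned int ready;   /* bitmask of readiness conditions already met */
    int validated;        /* set when the CA reports status "valid" */
};

/* AND the readiness of all authorizations together.
 * Returns non-zero only if each one satisfies every bit of `cond_ready`.
 */
static int all_ready(const struct auth *auths, int n, unsigned int cond_ready)
{
    unsigned int all_cond_ready = cond_ready;
    int i;

    assert(n >= 0);
    for (i = 0; i < n; i++)
        all_cond_ready &= auths[i].ready;
    return all_cond_ready == cond_ready;
}

/* The fix: an already-valid authorization meets all required conditions. */
static void mark_already_valid(struct auth *a, unsigned int cond_ready)
{
    a->validated = 1;
    a->ready = cond_ready;
}
```

With one authorization left at ready == 0, all_ready() can never succeed, which is the indefinite wait described above; once mark_already_valid() sets its ready bits, the AND goes through.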
Olivier Houchard
6c75202b48 BUILD: servers: Fix build with -std=gnu89
Commit 3c923d075 introduced a C99ism by declaring a variable in a for loop,
don't do that, especially since there already is a variable named "i"
declared.
This should fix the build when -std=c89 is used.
This should be backported if commit 3c923d075 is backported.
2026-06-10 10:28:52 +02:00
Amaury Denoyelle
fb38e40ad5 BUG/MINOR: quic: fix Initial length value in sent packets
QUIC packets using a long header contain a Length field. Its value is
the length of the content following it, i.e. the packet number field and
the remaining payload (QUIC frames and TLS AEAD tag).

Computation to determine the packet length is performed in
qc_do_build_pkt(). However this calculation is incorrect when Initial
padding is added on a small enough Initial packet. As the length field is
encoded as a varint, padding changes the field size, which grows from one to
two bytes, in effect reducing the total required padding length by one
byte. However, the length value is not updated and thus is one byte bigger
than the final packet payload with padding.

Fix this calculation by reducing the length value after padding size has
been adjusted.

This bug caused the peer to reject such faulty packets. However, its
impact is minor as it only happened for Initial packets with a small enough
payload. Packets used for ClientHello/ServerHello exchanges should be
large enough and typically not concerned by this bug, except maybe in
case of fragmentation.

This bug was detected by testing QUIC backend with a quiche server. The
server endpoint reported the faulty packets with the following trace :
  [2026-06-09T13:42:13.694179158Z ERROR quiche_server] 1b1b961c9c4ae1f470f3687510b120da1f5d5f5a recv failed: InvalidPacket

This must be backported up to 3.0.
2026-06-09 15:42:33 +02:00
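To see why padding can change the size of the Length field in commit fb38e40ad5 above, it helps to look at how QUIC variable-length integers are sized (RFC 9000, section 16). The helper below is a stand-alone sketch with a hypothetical name, not the function used in the QUIC code.

```c
#include <assert.h>
#include <stdint.h>

/* Number of bytes needed to encode `v` as a QUIC variable-length integer.
 * The two high bits of the first byte select a 1, 2, 4 or 8 byte encoding,
 * leaving 6, 14, 30 or 62 bits for the value.
 */
static int varint_len(uint64_t v)
{
    assert(v < (1ULL << 62));   /* larger values cannot be encoded */

    if (v < (1ULL << 6))
        return 1;
    if (v < (1ULL << 14))
        return 2;
    if (v < (1ULL << 30))
        return 4;
    return 8;
}
```

A Length value of 63 fits in one byte while 64 needs two, so padding a small Initial packet past that boundary makes the header one byte longer. That extra byte has to be taken out of the padding and out of the Length value.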
Christopher Faulet
9bc37232f4 REGTESTS: Fix log matching in healthcheck-section.vtc
One of the regex matching syslog messages for S2 was not correct, making the
script fail depending on the order of the healthchecks. It is now fixed.
2026-06-09 08:42:01 +02:00
Olivier Houchard
3c923d075c MEDIUM: servers: Move to a per-thread idle connection cleanup task
Having a single task to take care of idle connection cleanup across all
servers leads to high contention. It uses a lock to maintain its tree of
servers to track, and then can acquire the idle_conns lock for each thread.
Instead, have one task per thread. Each thread will maintain its own
tree, so there will be no need for any lock, and it will just acquire
its own idle_conns lock, so it will lead to less contention.
This is a performance improvement, so backporting is optional, but may be
considered if it is worth it. That would require backporting commit
6f8dab2583 too.
2026-06-08 15:38:22 +02:00
Olivier Houchard
6f8dab2583 MINOR: servers: Add a back-pointer to the server in srv_per_thread
In struct srv_per_thread, add a pointer to the server, as with just a
pointer to srv_per_thread, we can't figure out the related server.
2026-06-08 15:37:50 +02:00
Olivier Houchard
a4520229a7 BUG/MEDIUM: checks: Dequeue checks on purge
When tune.max-checks-per-thread is used, checks that should run are
queued, to avoid having too many checks running at the same time.
But if the check is about to be purged, because the server is being
deleted, we have to explicitly remove it from the queue as that memory is
about to be freed, otherwise it will cause a use-after-free.
Also, queued checks have not yet incremented th_ctx->running_checks, so
don't decrement it if we're queued.

This should be backported up to 3.0.
2026-06-08 15:06:09 +02:00
Willy Tarreau
3fa818c78f MINOR: memprof: be careful to account allocations only once
For certain calls like strdup(), certain libc call the malloc() symbol
themselves, resulting in both strdup() and malloc() accounting for the
allocation while a single free() call is accounted for. Usually it's
not very hard to spot as these allocations are done inside libc, but
yet they complicate the tracing of allocations.

Let's note when we enter a handler and refrain from doing the accounting
again in this case. This way, the strdup() call place will be accountable
for the allocation and the libc's internal malloc() will not be seen.
2026-06-08 13:46:18 +02:00
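The "account only once" idea from commit 3fa818c78f above can be sketched with a per-thread flag. This is a simplified model and not the memprof code: in_handler, accounted_allocs, traced_malloc() and traced_strdup() are hypothetical names, and traced_strdup() calls traced_malloc() explicitly to mimic a libc whose strdup() calls malloc() internally.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Non-zero while the current thread is inside an accounting handler. */
static _Thread_local int in_handler;
static unsigned long accounted_allocs;

/* Account one allocation unless an outer handler is already doing so.
 * Returns 1 if this call did the accounting, 0 if it was nested.
 */
static int account_alloc_enter(void)
{
    if (in_handler)
        return 0;
    in_handler = 1;
    accounted_allocs++;
    return 1;
}

/* Only the outermost handler clears the flag. */
static void account_alloc_leave(int owner)
{
    if (owner)
        in_handler = 0;
}

/* Stand-in for a wrapped malloc(). */
static void *traced_malloc(size_t size)
{
    int owner = account_alloc_enter();
    void *p = malloc(size);

    account_alloc_leave(owner);
    return p;
}

/* Stand-in for a wrapped strdup() that allocates through malloc(). */
static char *traced_strdup(const char *s)
{
    int owner;
    size_t len;
    char *p;

    assert(s != NULL);
    owner = account_alloc_enter();
    len = strlen(s) + 1;
    p = traced_malloc(len);      /* nested call: not accounted again */
    if (p)
        memcpy(p, s, len);
    account_alloc_leave(owner);
    return p;
}
```

A call to traced_strdup() increments the counter once, attributed to the strdup() call, and the nested allocation stays invisible.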
Willy Tarreau
a7888f0373 MINOR: memprof: make in_memprof a bitfield instead of a counter
It's not convenient to use it as it is now because it may only be
used to count passes via the memprof init code. Let's turn it to
a bitfield instead so that we can also check what we're doing there.
This is safe because all callers of memprof_init() check for the
bit being zero first so it's not reentrant.
2026-06-08 13:46:18 +02:00
Willy Tarreau
ef191c46d7 BUG/MINOR: acl: report "ACL" not "map" in ACL ID lookup failures
As reported by @broxio in issue #3411, when trying to delete an ACL by
its name, in case of error the message says "unknown map identifier".
We need to check the type to decide between map and ACL as in other
messages.

This can be backported to all stable branches. Thanks to @broxio for
reporting the issue with a reproducer and providing this tested fix.
2026-06-08 13:45:39 +02:00
Willy Tarreau
b9fa07bd20 MINOR: pools: reject creation of pools containing invalid chars in their name
In order to preventively avoid issues that complicate debugging, let's
report to developers early if a pool name is not acceptable. This patch
does it in create_pool_from_reg() which catches both direct and declared
registrations. Aside from the previous case, this didn't catch any other
occurrence.
2026-06-08 08:54:37 +02:00
Willy Tarreau
172306c308 CLEANUP: sessions: simplify the sess_priv_conns pool name
Using "show pools detailed" on the CLI breaks the column alignment on
"sess_priv_conns" because the pool name contains spaces: "session priv
conns list", which is not welcome as pool names are truncated after the
12th char anyway. Let's shorten it to the pool's name as done for many
other ones: sess_priv_conns.

This can be backported as far as 3.0 where this name was introduced,
because it helps when trying to sum or graph certain metrics during
debugging.
2026-06-08 08:44:25 +02:00
Willy Tarreau
e51ae5ce66 BUG/MEDIUM: xprt_qmux: implement ->get_ssl_sock_ctx() to get the SSL laye
conn_get_ssl_sock_ctx() retrieves the ssl_sock_ctx of a connection by
calling conn->xprt->get_ssl_sock_ctx(). Only ssl_sock implements this
method, and it returns conn->xprt_ctx. This works because for every
existing XPRT combination the SSL layer is the topmost one: even
xprt_handshake (SOCKS4, PROXY, NetScaler CIP) is installed *below*
ssl_sock, so conn->xprt keeps pointing to ssl_sock.

Qmux changes this assumption: xprt_qmux is stacked *on top of* ssl_sock
and keeps the SSL layer as its lower layer to exchange the QUIC transport
parameters over the established TLS stream. During the qmux handshake,
conn->xprt therefore points to xprt_qmux, which does not implement
get_ssl_sock_ctx(), making conn_get_ssl_sock_ctx() return NULL for the
whole connection, affecting every caller that inspects the SSL layer
(sample fetches, logging, ssl_sock_infocbk(), ...).

The visible consequence was a crash: when the peer sends a TLS alert
during the qmux handshake, the SSL library calls ssl_sock_infocbk(),
which recovers a valid connection but a NULL ctx, rightfully triggering
the "BUG_ON(!ctx)" early in the function.

This patch implements xprt_qmux_get_ssl_sock_ctx() so that it returns
the ssl_sock_ctx of the lower layer when it is the SSL layer, just like
ssl_sock_get_ctx() does. conn_get_ssl_sock_ctx() then works again for
all callers while the qmux handshake is in progress. After the handshake,
conn->xprt is restored to the SSL layer so nothing else changes.

This should be backported to 3.4.
2026-06-08 08:31:20 +02:00
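The layering problem fixed by commit e51ae5ce66 above can be modelled with a few structs. This is a heavily simplified sketch and none of the types match HAProxy's real definitions: struct conn, struct xprt_ops, struct stacked_ctx and the functions below are hypothetical stand-ins that only show how an upper layer forwards the lookup to the layer below it.

```c
#include <assert.h>
#include <stddef.h>

struct ssl_sock_ctx { int dummy; };

/* Hypothetical transport layer descriptor with the single method of interest. */
struct xprt_ops {
    struct ssl_sock_ctx *(*get_ssl_sock_ctx)(void *xprt_ctx);
};

struct conn {
    const struct xprt_ops *xprt;   /* topmost transport layer */
    void *xprt_ctx;                /* context of that layer */
};

/* Context of a layer stacked on top of another one. */
struct stacked_ctx {
    const struct xprt_ops *lower;
    void *lower_ctx;
};

/* SSL layer: its context is the ssl_sock_ctx itself. */
static struct ssl_sock_ctx *ssl_get_ctx(void *xprt_ctx)
{
    return xprt_ctx;
}

/* Upper layer: forward the request to the layer below, if it supports it. */
static struct ssl_sock_ctx *stacked_get_ctx(void *xprt_ctx)
{
    struct stacked_ctx *ctx = xprt_ctx;

    if (!ctx || !ctx->lower || !ctx->lower->get_ssl_sock_ctx)
        return NULL;
    return ctx->lower->get_ssl_sock_ctx(ctx->lower_ctx);
}

/* Connection-level lookup: ask the topmost layer. */
static struct ssl_sock_ctx *conn_get_ctx(const struct conn *c)
{
    assert(c != NULL);
    if (!c->xprt || !c->xprt->get_ssl_sock_ctx)
        return NULL;
    return c->xprt->get_ssl_sock_ctx(c->xprt_ctx);
}

static const struct xprt_ops ssl_ops = { ssl_get_ctx };
static const struct xprt_ops stacked_ops = { stacked_get_ctx };
static const struct xprt_ops no_ssl_ops = { NULL };
```

When the topmost layer lacks the method, as with no_ssl_ops here, the lookup returns NULL for the whole connection, which is the situation the commit describes before the fix.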
Olivier Houchard
45a64123d6 BUG/MEDIUM: threads: Fix build when using no thread
In thread_detect_count(), avoid any usage of thread_cpu_enable_at_boot
if we're building without thread support. That variable is only defined
when building with threads, and those tests make little sense when
building with no thread, anyway.
This was submitted by: ririnto <ririnto@kakao.com>
This should fix github issue #3408.
This should be backported to 3.4.
2026-06-08 01:16:49 +02:00
Willy Tarreau
ac776e3819 BUG/MEDIUM: regex: initialize the match array earlier during boot
As reported by @zhanhb in github issue #3410, since 3.3 with commit
fda6dc959 ("MINOR: regex: use a thread-local match pointer for pcre2"),
the local_pcre2_match array is initialized too late for use by Lua. If
a lua-load makes use of regex, it may segfault (actually using PCRE2
is fine but PCRE2_JIT will crash):

Let's change the init sequence so that the first thread's context is
initialized early at boot and other threads are initialized when they
are created. For lua-load-per-thread, all extra threads will run on
the first thread's temporary storage during init but that's not a
problem since the sole purpose is to avoid concurrent accesses.

Thanks to @zhanhb for the detailed report and quick tests. This needs
to be backported to 3.3.
2026-06-07 07:46:32 +02:00
Christopher Faulet
1e00743520 REGTESTS: checks: Add script for external healthchecks
This script is quite basic but it should validate the external healthchecks
are working well.
2026-06-05 17:15:31 +02:00
Christopher Faulet
b227ad2dc7 BUG/MINOR: tcpcheck: Override external check if healthcheck section is set
When an external check was configured at the proxy level, the healthcheck
section set on a server was not considered. The main reason was that the
check type of the server was always inherited from the proxy one.

To fix the issue, when a healthcheck section is set on a server line, the
check type for the server is forced to TCPCHK.

This patch must be backported to 3.4.
2026-06-05 17:15:31 +02:00
Amaury Denoyelle
07deafa104 BUG/MINOR: mux_quic: do not interrupt recv on error/incomplete data
Prior to this patch, qcc_io_recv() stream decoding loop was interrupted
on the first decoding error or if incomplete data could not be parsed.

This patch adjusts this part so that the loop is stopped only on a
connection level error. In case of a stream level error or on incomplete
data, decoding continues on the next QCS entry.
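
The control flow described above can be sketched as follows. This is a
minimal illustration only, not the actual qcc_io_recv() code: the
demo_result enum and demo_decode_all() are hypothetical names, and each
array element stands in for the outcome of decoding one QCS entry.

```c
#include <stddef.h>

/* Hypothetical decode outcomes for one stream (not HAProxy's real codes). */
enum demo_result {
    DEMO_OK,          /* stream fully decoded */
    DEMO_INCOMPLETE,  /* not enough data to parse yet */
    DEMO_STREAM_ERR,  /* error limited to this stream */
    DEMO_CONN_ERR     /* error affecting the whole connection */
};

/*
 * Walk over the per-stream outcomes. Only a connection-level error stops
 * the loop; a stream-level error or incomplete data simply moves on to
 * the next entry. Returns the number of entries for which a decode was
 * attempted and sets *conn_failed to 1 if the loop was interrupted.
 */
static size_t demo_decode_all(const enum demo_result *results, size_t n,
                              int *conn_failed)
{
    size_t i, attempted = 0;

    *conn_failed = 0;
    for (i = 0; i < n; i++) {
        attempted++;
        if (results[i] == DEMO_CONN_ERR) {
            *conn_failed = 1;
            break;
        }
        /* DEMO_STREAM_ERR and DEMO_INCOMPLETE: continue with next entry */
    }
    return attempted;
}
```

With outcomes { OK, STREAM_ERR, INCOMPLETE, OK }, all four entries are
attempted, whereas { OK, CONN_ERR, OK } stops after the second one.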

Without this patch, there is a risk that a QCS decode is not performed
as expected, with a possible client timeout firing. This is pretty
unlikely though. However, this patch is still necessary to completely
remove this possibility.

This should be backported up to 3.2.
2026-06-05 16:27:10 +02:00
Amaury Denoyelle
a39b1a40ad OPTIM: mux_quic: remove QCS from recv_list on reset
When a RESET_STREAM is received, QCS Rx channel is closed and pending Rx
data and buf are cleared without being transmitted to upper stream
layer.

This patch complements this by removing the QCS from recv_list if
present in it. This is a small optimization, as nothing would be performed
for such a QCS in qcc_io_recv().
2026-06-05 15:42:44 +02:00
Amaury Denoyelle
83ae0c250c BUG/MEDIUM: mux_quic: prevent risk of infinite loop on recv
When a RESET_STREAM is received, QCS Rx channel is closed and pending Rx
data and buf are cleared without being transmitted to upper stream
layer.

This can cause an issue if this QCS instance is present in the QCC
recv_list. When qcc_io_recv() is executed after reset handling, an
infinite loop is triggered for the QCS instance as qcs_rx_avail_data()
always returns 0.

This issue happened due to the poor writing of the while loop in
qcc_io_recv() which is not correctly protected against infinite
execution.

To prevent this issue, this patch rewrites the loop. Crucially,
LIST_DEL_INIT() is now performed unconditionally outside of the inner
loop. This guarantees that even if the inner loop is not executed, the
stream will be removed from QCC recv_list and iteration will progress.
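
The loop shape described above can be sketched as follows. This is a
simplified illustration under stated assumptions, not the real
qcc_io_recv() code: struct demo_stream, demo_avail_data() and
demo_recv() are hypothetical, and a plain singly linked list replaces
HAProxy's list macros. The max_iter guard exists only so that the
sketch can report a lack of progress.

```c
#include <stddef.h>

/* Hypothetical stand-in for a QCS; not HAProxy's real structure. */
struct demo_stream {
    struct demo_stream *next; /* next entry in the receive list, or NULL */
    int in_list;              /* 1 while linked in the receive list */
    int avail;                /* units of data still to decode */
    int rx_closed;            /* 1 once a reset has closed the Rx channel */
};

/* No data is reported as available once the Rx channel is closed. */
static int demo_avail_data(const struct demo_stream *s)
{
    return s->rx_closed ? 0 : s->avail;
}

/*
 * Drain the receive list. Returns the number of outer iterations, or -1
 * if more than max_iter iterations were needed (i.e. no progress).
 */
static int demo_recv(struct demo_stream **head, int max_iter)
{
    int iter = 0;

    while (*head) {
        struct demo_stream *s = *head;

        if (++iter > max_iter)
            return -1;

        /* Inner loop: may run zero times, e.g. after a stream reset. */
        while (demo_avail_data(s) > 0)
            s->avail--;

        /* Unconditional removal, outside of the inner loop: the outer
         * loop progresses even if nothing was decoded. */
        *head = s->next;
        s->next = NULL;
        s->in_list = 0;
    }
    return iter;
}
```

A stream whose Rx channel was closed by a reset never enters the inner
loop, yet it is still unlinked, so the outer loop terminates after one
pass per entry.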

This is functionally correct as a QCS should not be present in recv_list
if there is no available data or if demux is currently blocked. For the first
condition, qcc_decode_qcs() will be called again when new data is read
unless demux is blocked. In this case, QCS will be reinserted in the
list on unblocking, with a rescheduling to invoke qcc_decode_qcs().

In the context of the currently found reproducer linked to stream reset,
the QCS instance can be safely removed from the recv_list without
implication.

This must be backported up to 3.2.
2026-06-05 15:32:55 +02:00
Christopher Faulet
f7bc8246ee BUG/MEDIUM: server/checks: Support healthcheck keyword on default-server lines
The healthcheck keyword could be parsed on default-server lines but not
copied during server initialization, making it ineffective. But there is
also a real issue when setting it on a default-server line. The pseudo server used
to parse the default-server line is not initialized via the new_server()
function, unlike regular servers. So there is no tcpcheck information inherited
from the proxy. We must take care of that when the "healthcheck" keyword is
parsed to avoid crashes.

This patch must be backported to 3.4.
2026-06-04 21:53:32 +02:00
Christopher Faulet
3daf4498f3 MINOR: check: Don't dump buffers state in check traces for external checks
In healthcheck trace messages, there is no reason to dump the in/out buffers
state for external checks. So let's skip this part in that case.
2026-06-04 21:50:12 +02:00