Commit graph

8185 commits

Author SHA1 Message Date
Christopher Faulet
a58e650ad1 MEDIUM: tevt/muxes: Add dedicated termination events for muxc/se locations
Termination events dedicated to mux connection and stream-endpoint
descriptors are added in this patch. Specific events to these locations are
thus added. Changes for the H1 and H2 multiplexers are reviewed to be more
accurate.
2025-01-31 10:41:50 +01:00
Christopher Faulet
f2778ccc7d MINOR: tevt/connection: Add dedicated termination events for lower locations
To be able to add more accurate termination events for each location, the
enum will be split by location. Indeed, there are at most 16 possible
events. It would be pretty confusing to use the same termination events for
the different locations. So the best is to split them.

In this patch, the termination events for the fd, hs and xprt locations are
introduced. For now some holes are added to keep similar events aligned
across enums. But this may change in future.
2025-01-31 10:41:50 +01:00
Christopher Faulet
a4c281a190 MINOR: tevt/muxes: Add CTL and SCTL command to get the termination event logs
MUX_CTL_TEVTS command is added to get the termination event logs of a mux
connection and MUX_SCTL_TEVTS command to get the termination event logs of a
mux stream.
2025-01-31 10:41:50 +01:00
Christopher Faulet
00a07c8b54 MINOR: tevt/stream/stconn: Report termination events for stream and sc
In this patch, events for the stream location are reported. These events are
first reported on the corresponding stream-connector: front events on scf
and back events on scb. Then all events are merged in the stream. But
only 4 events are saved on the stream.

Several internal events are for now grouped with the type
"tevt_type_intercepted". More events will be added to have a better
resolution. But at least the places to report these events are identified.

For now, when an event is reported on a SC, it is also reported on the stream
and vice versa.
2025-01-31 10:41:50 +01:00
Christopher Faulet
992b4b9726 MINOR: tevt/stconn: Add a termination events log in the SE descriptor
This termination events log will be used to report events from the mux
streams. The location will be "tevt_loc_se" and the muxes will be
responsible to report the corresponding events.
2025-01-31 10:41:50 +01:00
Christopher Faulet
e944944990 MINOR: tevt: Add the termination events log's fundations
Termination events logs will be used to report the events that led to the
closure of a connection. Unlike flags, which reflect a state, the idea here is
to store a log to preserve the order of the events. Most of the time, when
debugging an issue, the order of the events is crucial to be able to
understand the root cause of the issue. The traces are truly helpful to do so.
But it is not always possible to activate them because they are pretty
verbose. On heavily loaded platforms, it is not acceptable. We hope that the
termination events logs will help us in those situations.

One termination events log will be stored at each layer (connection, mux
connection, mux stream...) as a 32-bit integer. Each event will be stored on
8 bits, 4 bits for the location and 4 bits for the type. So only the first
four events will be stored for each layer. It should be enough to understand
why a connection is closed.

In this patch, the enums defining the termination event locations and types
are added. The macro to report a new event is also added, as well as a
function to convert a termination events log to a string that could be
displayed in log messages for instance.
2025-01-31 10:41:49 +01:00
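The layout described in e944944990 (four 8-bit events packed into a 32-bit integer, each made of a 4-bit location and a 4-bit type) could be sketched as follows. This is only an illustration: the demo_* names, the slot ordering and the hexadecimal output format are assumptions, not the enums, macro or function added by the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: append one event to a 32-bit log.
 * An event is 8 bits: location in the high nibble, type in the low nibble.
 * Assumes a valid event is never 0, so that 0 can mean "unused slot".
 */
static inline uint32_t demo_tevt_report(uint32_t log, unsigned int loc, unsigned int type)
{
	uint32_t evt = ((loc & 0xfu) << 4) | (type & 0xfu);

	if (log & 0xff000000u) /* four events already stored: keep the first ones */
		return log;
	return (log << 8) | evt;
}

/* Hypothetical helper: dump the log, oldest event first, as two hex digits
 * per event. <buf> must be at least 9 bytes long.
 */
static inline const char *demo_tevt_str(uint32_t log, char *buf)
{
	static const char hex[] = "0123456789abcdef";
	size_t len = 0;
	int shift;

	for (shift = 24; shift >= 0; shift -= 8) {
		unsigned int evt = (log >> shift) & 0xffu;

		if (!evt)
			continue; /* unused slot */
		buf[len++] = hex[evt >> 4];
		buf[len++] = hex[evt & 0xfu];
	}
	buf[len] = '\0';
	return buf;
}
```

With this sketch, a fifth reported event is silently dropped, which preserves the order of the first four events.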
Christopher Faulet
e56e718c82 MINOR: mux-h1: Add masks to group H1S DEMUX and MUX errors
It is just a small patch to clean up mux/demux functions. Instead of listing
the H1S errors that must be handled during demux or mux operations, masks of
flags are used. It is more readable.
2025-01-31 10:41:49 +01:00
Willy Tarreau
d155924efe MINOR: fd: add a generation number to file descriptors
This patch adds a counter of close() on file descriptors in the fdtab.
The goal is to better detect if reported events concern the current or
a previous file descriptor. For now the counter is only added, and is
shown in "show fd" as "gen". We're reusing unused space at the end of
the struct. If it's needed for something more important later, this
patch can be reverted.
2025-01-30 19:45:34 +01:00
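As an illustration of what a generation number makes possible, the sketch below bumps a counter on each close and lets a caller check whether an event recorded earlier still concerns the current user of the fd. The patch itself only adds and displays the counter; the table layout, the demo_* names and the comparison helper are assumptions.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_FD 16

/* Hypothetical, simplified fd table entry */
struct demo_fdtab_entry {
	void *owner;         /* NULL when the fd is closed */
	uint32_t generation; /* number of close() seen on this fd */
};

static struct demo_fdtab_entry demo_fdtab[DEMO_MAX_FD];

static inline void demo_fd_open(int fd, void *owner)
{
	demo_fdtab[fd].owner = owner;
}

static inline void demo_fd_close(int fd)
{
	demo_fdtab[fd].owner = NULL;
	demo_fdtab[fd].generation++; /* any event recorded before is now stale */
}

/* Returns 1 if an event recorded with generation <gen> still concerns the
 * current user of <fd>, otherwise 0.
 */
static inline int demo_fd_event_is_current(int fd, uint32_t gen)
{
	return demo_fdtab[fd].owner != NULL && demo_fdtab[fd].generation == gen;
}
```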
Willy Tarreau
44ac7a7e73 DEBUG: fd: add a counter of takeovers of an FD since it was last opened
That's essentially in order to help with debugging strange cases like
the occasional epoll issues/races, by keeping a counter of how many
times an FD was taken over since last inserted. The room is available
so let's use it. If it's needed later, this patch can easily be reverted.
The counter is also reported in "show fd" as "tkov".
2025-01-30 19:45:34 +01:00
Amaury Denoyelle
b849ee5fa3 BUILD: quic: fix overflow in global tune
A new global option was recently introduced to disable pacing. However,
the value used (1<<31) caused issues with some compilers as the options field
used for storage is declared as int. Move the pacing deactivation flag
into the newly defined quic_tune to fix this.

This should be backported up to 3.1 after a period of observation. Note
that it relies on the previous patch which defined the new quic_tune type.
2025-01-30 18:12:53 +01:00
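The overflow comes from the fact that, with a 32-bit int, 1<<31 shifts a bit into the sign position, which is undefined behavior in C. A minimal sketch of the approach taken by b849ee5fa3 is to store the flag in a dedicated structure with an unsigned field. The field and flag names below are assumptions, not the actual quic_tune definition.

```c
/* Hypothetical, simplified version of a dedicated QUIC tuning structure */
struct demo_quic_tune {
	unsigned int options; /* unsigned, so every bit up to 1U << 31 is usable */
};

#define DEMO_QUIC_TUNE_NO_PACING 0x00000001u

static struct demo_quic_tune demo_quic_tune;

static inline void demo_quic_disable_pacing(void)
{
	demo_quic_tune.options |= DEMO_QUIC_TUNE_NO_PACING;
}

static inline int demo_quic_pacing_disabled(void)
{
	return !!(demo_quic_tune.options & DEMO_QUIC_TUNE_NO_PACING);
}
```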
Amaury Denoyelle
09e9c7d5b7 MINOR: quic: define quic_tune
Define a new structure quic_tune. It will be useful to regroup various
configuration settings and tunable related to QUIC, instead of defining
them into the global structure.
2025-01-30 18:12:40 +01:00
Amaury Denoyelle
0c8b54b2d1 MINOR: quic: transform pacing settings into a global option
Pacing support was previously activated on each bind line individually,
via an optional argument of quic-cc-algo keyword. Remove this optional
argument and introduce a global setting to enable/disable pacing. Pacing
activation is still flagged as experimental.

One important change is that previously BBR usage automatically
activated pacing support. This is not the case anymore, so users should
now always explicitly activate pacing if BBR is selected. A new warning
message will be displayed if this is not the case.

Another consequence of this change is that now pacing_inter callback is
always defined for every quic_cc_algo types. As such, QUIC MUX uses
global.tune.options to determine if pacing is required.

This should be backported up to 3.1, after a period of observation.
2025-01-30 17:19:38 +01:00
William Lallemand
b43e5d8c16 BUILD: ssl: more cleaner approach to WolfSSL without renegotiation
Patch discussed in https://github.com/wolfSSL/wolfssl/issues/6834

When building WolfSSL without renegotiation options, WolfSSL still
defines the macros about it, which warns during the build.

This patch completes the previous one by undefining the macros so
haproxy could build without any warning.
2025-01-28 20:55:20 +01:00
William Lallemand
c6a8279cdf BUILD: ssl: allow to build without the renegotiation API of WolfSSL
In ticket https://github.com/wolfSSL/wolfssl/issues/6834, it was
suggested to push --enable-haproxy within --enable-distro.

WolfSSL does not want to include the renegotiation support in
--enable-distro.

To achieve this, let haproxy build without SSL_renegotiate_pending()
when wolfssl does not define HAVE_SECURE_RENEGOCIATION or
HAVE_SERVER_RENEGOCIATION_INFO.
2025-01-28 18:31:32 +01:00
Willy Tarreau
f17b0a994b BUILD: tools: fix build on BSD by dropping the ETIME check
Commit 44537379fc ("MINOR: tools: add errname to print errno macro
name") brought a facility to report errno using a symbolic string
when known instead of showing only the value. However, among the
listed options, ETIME is mentioned but is unknown on FreeBSD where
it breaks the build. Let's simply drop it, we don't use ETIME anyway
and even if it would be reported, the default code path still reports
the numeric value so there's no harm. If other ones fail to build in
the future, they could be handled the same way.
2025-01-28 15:58:57 +01:00
Christopher Faulet
36d151dc10 MEDIUM: stream: No longer use TASK_F_UEVT* to shut a stream down
Thanks to the previous patch, it is now possible to explicitly rely on
stream's events to shut it down. The right event is set in
stream_shutdown(), before waking up the stream, via an atomic operation. In
process_stream(), this event will be handled as expected.

Thus, TASK_F_UEVT* are no longer used, but not removed since still usable
for other tasks.

This patch depends on "MEDIUM: stream: Map task wake up reasons to dedicated
stream events".
2025-01-28 14:53:37 +01:00
Christopher Faulet
6048460102 MEDIUM: stream: Map task wake up reasons to dedicated stream events
To fix thread-safety issues when a stream must be shut, three new task
states were added. These states are generic (UEVT1, UEVT2 and UEVT3), the
task callback function is responsible to know what to do with them. However,
it is not really scalable.

The best is to use an atomic field in the stream structure itself to deal
with these dedicated events. There is already the "pending_events" field
that saves wake up reasons (TASK_WOKEN_*) to not lose them if
process_stream() is interrupted before it had a chance to handle them.

So the idea is to introduce a new field to handle stream-dedicated events
and merge them with the task's wake up reasons used by the stream. This
means a mapping must be performed between some task wake up reasons and
stream events. Note that not all task wake up reasons will be mapped.

In this patch, the "new_events" field is introduced. It is an atomic
bit-field. Streams events (STRM_EVT_*) are also introduced to map the task
wake up reasons used by process_stream(). Only TASK_WOKEN_TIMER and
TASK_WOKEN_MSG are mapped, in addition to TASK_F_UEVT* flags. In
process_stream(), "pending_events" field is now filled with new stream
events and the mapping of the wake up reasons.
2025-01-28 14:53:37 +01:00
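The mechanism described in 6048460102 (an atomic bit-field filled by any thread, then merged into the stream's own pending events) could look like the sketch below, written with C11 atomics. The structure, the event bits and the function names are hypothetical stand-ins for the stream structure, STRM_EVT_*, stream_shutdown() and process_stream(). HAProxy uses its own atomic macros rather than stdatomic.h.

```c
#include <stdatomic.h>

/* Hypothetical event bits, standing in for STRM_EVT_* */
#define DEMO_EVT_TIMER         0x01u
#define DEMO_EVT_MSG           0x02u
#define DEMO_EVT_SHUT_SRV_DOWN 0x04u
#define DEMO_EVT_SHUT_SRV_UP   0x08u
#define DEMO_EVT_KILLED        0x10u

struct demo_stream {
	atomic_uint new_events;      /* may be set by any thread */
	unsigned int pending_events; /* only touched by the stream's own task */
};

/* May be called from any thread: record why the stream must be shut down */
static inline void demo_stream_shutdown(struct demo_stream *s, unsigned int why)
{
	atomic_fetch_or(&s->new_events, why);
	/* a real implementation would wake the stream's task up here */
}

/* Called from the stream's task: fetch and clear the new events in one
 * atomic operation, then merge them with the mapped wake up reasons.
 */
static inline unsigned int demo_stream_collect_events(struct demo_stream *s,
                                                      unsigned int wake_reasons)
{
	s->pending_events |= atomic_exchange(&s->new_events, 0u);
	s->pending_events |= wake_reasons;
	return s->pending_events;
}
```

The atomic exchange ensures that an event set concurrently is either collected now or left for the next call, never lost.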
Christopher Faulet
0a52a75ef7 BUG/MINOR: stream: Properly handle "on-marked-up shutdown-backup-sessions"
shutdown-backup-sessions action for on-marked-up directive does not work anymore
since the stream_shutdown() function was modified to be async-safe.

When stream_shutdown() was modified to be async-safe, dedicated task events were
added to map the reasons to shut a stream down. SF_ERR_DOWN was mapped to
TASK_F_UEVT1 and SF_ERR_KILLED was mapped to TASK_F_UEVT2. The reverse mapping was
performed by process_stream() to shut the stream with the appropriate reason.

However, SF_ERR_UP reason, used by shutdown-backup-sessions action to shut a
stream down because a preferred server became available, was not mapped in the
same way. So since commit b8e3b0a18d ("BUG/MEDIUM: stream: make
stream_shutdown() async-safe"), this action is ignored and does not work
anymore.

To fix the issue, and to be able to backport the fix, a third task event was
added. TASK_F_UEVT3 is now mapped on SF_ERR_UP.

This patch should fix the issue #2848. It must be backported as far as 2.6.
2025-01-28 14:53:37 +01:00
Olivier Houchard
26b3e5236f MEDIUM: servers/proxies: Switch to using per-tgroup queues.
For both servers and proxies, use one connection queue per thread-group,
instead of only one. Having only one can lead to severe performance
issues on NUMA machines: it is actually trivial to get the watchdog to
trigger on an AMD machine, having a server with a maxconn of 96, and an
injector that uses 160 concurrent connections.
We now have one queue per thread-group. However, when dequeuing, we're
dequeuing MAX_SELF_USE_QUEUE (currently 9) pendconns from our own queue
before dequeuing one from another thread group, if available, to make
sure everybody is still running.
2025-01-28 12:49:41 +01:00
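The dequeuing policy described in 26b3e5236f (serve our own queue up to MAX_SELF_USE_QUEUE times, then take one entry from another thread group) might be sketched as below. Queues are reduced to simple counters, and the structure, the demo_* names and the fallback when no other group has work are assumptions.

```c
#define DEMO_NB_TGROUPS         4
#define DEMO_MAX_SELF_USE_QUEUE 9

/* Hypothetical, simplified server: each queue is just a counter here */
struct demo_server {
	unsigned int queued[DEMO_NB_TGROUPS];      /* pendconns per thread group */
	unsigned int self_served[DEMO_NB_TGROUPS]; /* consecutive dequeues from own queue */
};

/* Removes one pending connection on behalf of thread group <tg>.
 * Returns the group whose queue was used, or -1 if all queues are empty.
 */
static int demo_dequeue_one(struct demo_server *srv, int tg)
{
	int i, other;

	if (srv->queued[tg] && srv->self_served[tg] < DEMO_MAX_SELF_USE_QUEUE) {
		srv->self_served[tg]++;
		srv->queued[tg]--;
		return tg;
	}

	/* own queue empty, or used enough times in a row: help another group */
	srv->self_served[tg] = 0;
	for (i = 1; i < DEMO_NB_TGROUPS; i++) {
		other = (tg + i) % DEMO_NB_TGROUPS;
		if (srv->queued[other]) {
			srv->queued[other]--;
			return other;
		}
	}

	/* nothing elsewhere: fall back to our own queue if possible */
	if (srv->queued[tg]) {
		srv->self_served[tg] = 1;
		srv->queued[tg]--;
		return tg;
	}
	return -1;
}
```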
Olivier Houchard
583303c48b MINOR: proxies/servers: Calculate queueslength and use it.
For both proxies and servers, properly calculate queueslength, which is
the total number of elements across all queues (as they currently are only
using one queue, it is equivalent to the number of elements of that
queue), and use it instead of the queue's length.
2025-01-28 12:49:41 +01:00
Olivier Houchard
59eddabe16 MINOR: Add fields to the per-thread group field in struct server.
Add a per-thread group queue and associated fields in per-thread group
field in struct server, as well as a new field, queues length.
This is currently unused, so should change nothing.
2025-01-28 12:49:41 +01:00
Olivier Houchard
f879b9a18a MINOR: proxies: Add a per-thread group field to struct proxy.
Add a per-thread group field to struct proxy, that will contain a struct
queue, as well as a new field, "queueslength".
This is currently unused, so should change nothing.
Please note that proxy_init_per_thr() must now be called for each proxy
once the thread groups number is known.
2025-01-28 12:49:41 +01:00
Aurelien DARRAGON
e768a531b7 CLEANUP: tree-wide: define and use acl_match_cond() helper
acl_match_cond() combines acl_exec_cond() + acl_pass() and a check on the
condition->pol (to check if the cond is inverted) in order to return
either 0 if the cond doesn't match or 1 if it matches (or if it is NULL).

Thanks to this we can actually simplify some redundant constructs that
iterate over rules and evaluate if the condition matches or not.

Conditions for tcp-request inspect-content and tcp-response
inspect-content couldn't be simplified because they perform an extra
check for missing data, and thus still need to leverage acl_exec_cond().

It's best to display the patch using "-w", like "git show xxxx -w",
because some blocks had to be re-indented after the cleanup, which
makes the patch hard to review by default.
2025-01-27 11:11:43 +01:00
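The shape of such a helper (evaluate the condition, reduce the result to a boolean, then invert it for an "unless" condition) can be sketched as follows. The types and names are hypothetical stand-ins for the ACL structures, the tri-state values are made up, and treating a NULL condition as a match is an assumption drawn from the "(or NULL)" remark in e768a531b7.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the ACL types */
enum demo_acl_test { DEMO_ACL_FAIL = 0, DEMO_ACL_MISS = 1, DEMO_ACL_PASS = 3 };
enum demo_acl_pol  { DEMO_COND_IF = 0, DEMO_COND_UNLESS = 1 };

struct demo_acl_cond {
	enum demo_acl_pol pol;                       /* inverted when UNLESS */
	enum demo_acl_test (*exec)(const void *ctx); /* evaluates the condition */
};

/* Returns 1 if the rule guarded by <cond> applies, otherwise 0 */
static inline int demo_acl_match_cond(const struct demo_acl_cond *cond, const void *ctx)
{
	int ret;

	if (!cond)
		return 1; /* no condition: the rule always applies */

	ret = (cond->exec(ctx) == DEMO_ACL_PASS); /* tri-state to boolean */
	if (cond->pol == DEMO_COND_UNLESS)
		ret = !ret;
	return ret;
}

/* Two trivial evaluators used to exercise the helper */
static enum demo_acl_test demo_always_pass(const void *ctx)
{
	(void)ctx;
	return DEMO_ACL_PASS;
}

static enum demo_acl_test demo_always_fail(const void *ctx)
{
	(void)ctx;
	return DEMO_ACL_FAIL;
}
```

A rule loop can then be reduced to a single "if (!demo_acl_match_cond(rule_cond, ctx)) continue;" test instead of repeating the three steps at every call place.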
Valentine Krasnobaeva
94d3b7375a CLEANUP: ssl: move ssl_sock_gencert_load_ca declaration in ssl_gencert.h
As ssl_sock_gencert_load_ca and ssl_sock_gencert_free_ca are compiled only if
SSL_NO_GENERATE_CERTIFICATES is not defined, let's align it and move these
declarations in ssl_gencert.h.
2025-01-24 12:31:07 +01:00
Valentine Krasnobaeva
846819b316 CLEANUP: ssl: rename ssl_sock_load_ca to ssl_sock_gencert_load_ca
ssl_sock_load_ca is defined in ssl_gencert.c and compiled only if
SSL_NO_GENERATE_CERTIFICATES is not defined. Its name is a bit confusing, as
we may think at first glance that it's a generic function, which is also
used to load the CA file provided via the 'ca-file' keyword.
ssl_set_verify_locations_file is used in this case.

So let's rename ssl_sock_load_ca into ssl_sock_gencert_load_ca. Same is
applied to ssl_sock_free_ca.
2025-01-24 12:31:07 +01:00
Valentine Krasnobaeva
44537379fc MINOR: tools: add errname to print errno macro name
Add helper to print the name of errno's corresponding macro, for example
"EINVAL" for errno=22. This may be helpful for debugging and for using in
some CLI commands output. The switch-case in errname() contains only the
errnos currently used in the code. So, it needs to be extended, if one starts
to use new syscalls.
2025-01-24 09:54:57 +01:00
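A helper of this kind is essentially a switch-case over the errno macros, with a numeric fallback for values that are not listed. The sketch below assumes a hypothetical signature with a caller-provided buffer and lists only a few portable values; it is not the errname() added by 44537379fc.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical helper: returns the macro name for <err> when known,
 * otherwise its numeric value formatted into <buf>.
 */
static const char *demo_errname(int err, char *buf, size_t size)
{
	switch (err) {
	case EINVAL: return "EINVAL";
	case ENOMEM: return "ENOMEM";
	case EAGAIN: return "EAGAIN";
	case EINTR:  return "EINTR";
	case ENOENT: return "ENOENT";
	default:
		/* unknown here: fall back to the numeric value */
		snprintf(buf, size, "%d", err);
		return buf;
	}
}
```

Listing only macros that exist on every supported platform avoids build failures on systems where a given errno is not defined.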
Amaury Denoyelle
7896edccdc MINOR: quic: remove unused pacing burst in bind_conf/quic_cc_path
Pacing burst size is now dynamic. As such, configuration value has been
removed and related fields in bind_conf and quic_cc_path structures can
be safely removed.

This should be backported up to 3.1.
2025-01-23 17:40:48 +01:00
Amaury Denoyelle
cb91ccd8a8 MEDIUM: quic: use dynamic credit for pacing
Major improvements have been introduced in pacing recently. Most
notably, QMUX schedules emission on a millisecond resolution, which
allows the use of passive wait, which is much more CPU friendly.

However, an issue remains with the pacing max credit. Unless BBR is
used, it is fixed to the configured value from quic-cc-algo bind
statement. This is not practical: if too low, it may drastically
reduce performance due to the 1ms sleep resolution. If too high, some
clients will suffer from too much packet loss.

This commit fixes the issue by implementing a dynamic maximum credit
value based on the network conditions specific to each client.
Calculation is done to fix a maximum value which should allow QMUX
current tasklet context to emit enough data to cover the delay with the
next tasklet invocation. As such, avg_loop_us is used to detect the
process load. If too small, 1.5ms is used as minimal value, to cover the
extra delay incurred by the system which will happen for a default 1ms
sleep.

This should be backported up to 3.1.
2025-01-23 17:40:48 +01:00
Amaury Denoyelle
8098be1fdc MEDIUM: mux-quic: reduce pacing CPU usage with passive wait
Pacing algorithm has been revamped in the previous commit to implement a
credit based solution. This is a far more adaptive solution, which in
particular allows to catch up in case the pause between pacing emissions
was longer than expected.

This allows QMUX to remove the active loop based on tasklet wake-up.
Instead, a new task is used when emission should be paced. The main
advantage is that CPU usage is drastically reduced.

New pacing task timer is reset each time qcc_io_send() is invoked. Timer
will be set only if pacing engine reports that emission must be
interrupted. In this case timer is set via qcc_wakeup_pacing() to the
delay reported by congestion algorithm, or 1ms if delay is too short. At
the end of qcc_io_cb(), pacing task is queued if timer has been set.

Pacing task execution is simple enough : it immediately wakes up QCC I/O
handler.

Note that to have decent performance, it requires to have a large enough
burst defined in configuration of quic-cc-algo. However, this value is
common to all clients of a listener, which may cause too much loss under
some network conditions. This will be addressed in a future patch.

This should be backported up to 3.1.
2025-01-23 17:40:22 +01:00
Amaury Denoyelle
4489a61585 MEDIUM: quic: implement credit based pacing
Implement a new method for QUIC pacing emission based on credit. This
represents the number of packets which can be emitted in a single burst.
After emission, the number of emitted packets is decremented from the credit.
Several emissions can be conducted in the same sequence until the credit
is completely consumed.

When a new emission sequence is initiated (i.e. under a new QMUX tasklet
invocation), credit is refilled according to the delay which occurred
between the last and current emission context.

The main advantage of this new mechanism is that it allows conducting
several emissions in the same task context without having to wait between
each invocation. Wait is only forced if pacing is expired, which is now
equivalent to having a null credit.

Furthermore, if the delay between two emission sequences was
smaller than expected, credit is only partially refilled. This allows to
restart emission without having to wait for the whole credit to be
available.

On the implementation side, a new field <credit> is available in
quic_pacer structure. It is automatically decremented on
quic_pacing_sent_done() invocation. Also, a new function
quic_pacing_reload() must be used by QUIC MUX when a new emission
sequence is initiated to refill credit. <next> field from quic_pacer has
been removed.

For the moment, credit is based on the burst configured via quic-cc-algo
keyword, or directly reported by BBR.

This should be backported up to 3.1.
2025-01-23 17:40:20 +01:00
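The credit mechanism of 4489a61585 (decrement on emission, refill according to the elapsed delay, partial refill when the delay was short) could be sketched as below. The structure layout, the nanosecond clock, the refill formula and the demo_* names are assumptions. Only the general behavior follows the commit message, and these functions are not quic_pacing_reload() or quic_pacing_sent_done().

```c
#include <stdint.h>

/* Hypothetical pacer: credit is the number of packets that may still be
 * emitted in the current sequence.
 */
struct demo_pacer {
	uint64_t inter_ns;   /* delay between two packets, from the congestion algorithm */
	uint64_t last_ns;    /* date up to which credit was already granted */
	unsigned int burst;  /* maximum credit */
	unsigned int credit; /* remaining credit */
};

/* To call when a new emission sequence starts: refill the credit according
 * to the time elapsed since the last refill.
 */
static inline void demo_pacing_reload(struct demo_pacer *p, uint64_t now_ns)
{
	uint64_t earned;

	if (!p->inter_ns) { /* no pacing delay: always a full burst */
		p->credit = p->burst;
		p->last_ns = now_ns;
		return;
	}

	earned = (now_ns - p->last_ns) / p->inter_ns;
	if (earned >= p->burst - p->credit) {
		p->credit = p->burst; /* waited long enough: full refill */
		p->last_ns = now_ns;
	} else {
		p->credit += (unsigned int)earned;  /* partial refill */
		p->last_ns += earned * p->inter_ns; /* keep the unused fraction */
	}
}

/* To call after emission: consume credit for <sent> packets */
static inline void demo_pacing_sent_done(struct demo_pacer *p, unsigned int sent)
{
	p->credit -= (sent < p->credit) ? sent : p->credit;
}

/* Pacing is expired, i.e. emission must wait, when no credit remains */
static inline int demo_pacing_expired(const struct demo_pacer *p)
{
	return p->credit == 0;
}
```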
Amaury Denoyelle
bbaa7aef7b BUG/MINOR: quic: do not increase congestion window if app limited
Previously, congestion window was increased each time a new
acknowledgment was received. However, it did not take into account the
window filling level. In a network condition with negligible loss, this
will cause the window to be incremented until the maximum value (by
default 480k), even though the application does not have enough data to
fill it.

In most cases, this issue is not noticeable. However, it may lead to
excessive memory consumption when a QUIC connection is suddenly
interrupted, as in this case haproxy will fill the window with
retransmissions. It even caused OOM crashes when thousands of clients
were interrupted at once on a local network benchmark.

Fix this by first checking window level prior to every incrementation
via a new helper function quic_cwnd_may_increase(). It was arbitrarily
decided that the window must be at least 50% full when the ACK is
handled before incrementing it. This value is a good compromise to keep
window in check while still allowing fast increment when needed.

Note that this patch only concerns cubic and newreno algorithms. BBR
already has its notion of application limited which ensures the window is
only incremented when necessary.

This should be backported up to 2.6.
2025-01-23 14:49:35 +01:00
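The 50% rule from bbaa7aef7b amounts to a simple comparison made before any window increment. The sketch below uses hypothetical signatures and a simplified slow-start style increment. It is not the actual quic_cwnd_may_increase() nor the cubic/newreno code.

```c
#include <stddef.h>

/* Hypothetical check: the window may only grow if it is at least half
 * full, i.e. if the application is not the limiting factor.
 */
static inline int demo_cwnd_may_increase(size_t cwnd, size_t in_flight)
{
	return 2 * in_flight >= cwnd;
}

/* Simplified ACK handling: returns the new congestion window */
static inline size_t demo_on_ack(size_t cwnd, size_t in_flight,
                                 size_t acked, size_t max_cwnd)
{
	if (!demo_cwnd_may_increase(cwnd, in_flight))
		return cwnd; /* app limited: leave the window untouched */

	cwnd += acked;
	return cwnd > max_cwnd ? max_cwnd : cwnd;
}
```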
Amaury Denoyelle
7c0820892f MINOR: quic: rename pacing_rate cb to pacing_inter
Rename one of the congestion algorithms pacing callback from pacing_rate
to pacing_inter. This better reflects that this function returns a delay
(in nanoseconds) which should be applied between each packet emission to
fill the congestion window with a perfectly smoothed emission.

This should be backported up to 3.1.
2025-01-23 14:49:35 +01:00
Amaury Denoyelle
2178bf1192 CLEANUP: quic: remove unused prototype
Remove undefined quic_pacing_send() function prototype from quic_pacing
module.

This should be backported up to 3.1.
2025-01-23 14:49:35 +01:00
Frederic Lecaille
4f38c4bfd8 MINOR: quic: Add a BUG_ON() on quic_tx_packet refcount
It is definitely a bug to call quic_tx_packet_refdec() to decrement the reference
counter of a TX packet, and possibly to release its memory, when this counter is
already negative or null.

This counter is incremented when a TX frame is attached to it with some allocated memory
and when the packet is inserted into a data structure, if needed (list or tree).

Should be easily backported as far as 2.6 to ease any further backport around
this code part.
2025-01-21 22:01:34 +01:00
Frederic Lecaille
cb729fb64d BUG/MINOR: quic: ensure a detached coalesced packet can't access its neighbours
Reset ->prev and ->next fields of a coalesced TX packet to ensure it cannot access
its neighbours again after it is supposed to be detached from them by calling
quic_tx_packet_dgram_detach().

There are two cases where a packet can be coalesced to another previously built one:
when it is built into the same datagram without GSO (and flagged with
QUIC_FL_TX_PACKET_COALESCED) or when sent from the same sendto() syscall with GSO
(not flagged with QUIC_FL_TX_PACKET_COALESCED).

This fix may be in relation with GH #2839.

Must be backported as far as 2.6.
2025-01-21 22:01:34 +01:00
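The idea behind cb729fb64d can be illustrated with a plain doubly linked chain: once a packet is unlinked, its own pointers are reset so that a second detach, or any later access, cannot reach the former neighbours. The structure and function below are hypothetical and much simpler than the real quic_tx_packet and quic_tx_packet_dgram_detach().

```c
#include <stddef.h>

/* Hypothetical, minimal TX packet: only the coalescing links are kept */
struct demo_tx_pkt {
	struct demo_tx_pkt *prev;
	struct demo_tx_pkt *next;
};

/* Unlinks <pkt> from its neighbours, then forgets them */
static inline void demo_tx_pkt_dgram_detach(struct demo_tx_pkt *pkt)
{
	if (pkt->prev)
		pkt->prev->next = pkt->next;
	if (pkt->next)
		pkt->next->prev = pkt->prev;

	/* without this reset, a second call would touch the neighbours again */
	pkt->prev = NULL;
	pkt->next = NULL;
}
```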
Willy Tarreau
b066c0affb REORG: version: move the remaining BUILD_* stuff from haproxy.c to version.c
version.c tries to centralize all variables conveying version information,
but there's still an issue with the BUILD_* variables which are only
passed to haproxy.o and are only updated when that one is rebuilt. This
is not very logical given that we can end up with values there which
contradict info from version.c.

Better move all of these to version.c which is systematically rebuilt.
Most of these variables only end up as string concatenation at the
moment. Some of them are even duplicated. In version.c we now have one
variable (or constant) for each of them and haproxy.c references them
in messages. This is much more logical and easier to maintain in a
consistent state.

The patch looks a bit large but it really only moves the ifdefed string
assignment from one file to another, placing them into variables.
2025-01-20 17:53:55 +01:00
Amaury Denoyelle
a50dd07c16 MINOR: trace: ensure -dt priority over traces config section
Traces can be activated on startup either via -dt command line argument
or via the traces configuration section. This can cause confusion as it
may not be clear how a trace source can be completed or overridden by one or
the other.

Fix the precedence to give the priority to the command line argument.
Now, each trace source configured via -dt is first reset to a default
state before applying new settings. Then, it is impossible to change a
trace source via the configuration file if it was already targeted via
-dt argument.
2025-01-10 14:50:59 +01:00
Willy Tarreau
b25850f25b MINOR: tools: add a few functions to simply check for a file's existence
At many places we'd like to be able to simply construct a path from a
format string and check if that path corresponds to an existing file,
directory etc. Here we add 3 functions, a generic one to test that a
path corresponds to a given file mode (e.g. S_IFDIR, S_IFREG etc), and
two other ones specifically checking for a file or a dir for easier
use.
2025-01-09 09:18:49 +01:00
Willy Tarreau
bd06502b22 BUILD: makefile: add a qinfo macro to pass info in quiet mode
Some commands such as $(cmd_CC) etc already handle the quiet vs verbose
mode in the makefile, but sometimes we may want to pass other info. The
new "qinfo" macro can be called with a 9-char string argument (spaces
included) as a prefix for some commands, to emit that string when in
quiet mode. The caller must fill the spaces needed for alignment. E.g:

  $(call qinfo,  CC     )$(CC) ...
2025-01-08 11:26:05 +01:00
Amaury Denoyelle
af00be8e0f MINOR: mux-quic: change return value of qcs_attach_sc()
A recent fix was introduced to ensure that a streamdesc instance won't
be attached to an already completed QCS which is eligible for purging.
This was performed by skipping application protocol decoding if a QCS is
in such a state. Here is the patch responsible for this change.
  caf60ac696
  BUG/MEDIUM: mux-quic: do not attach on already closed stream

However, this is too restrictive, in particular for unidirectional streams
where no streamdesc is ever attached. To fix this behavior, first
qcs_attach_sc() API has been modified. Instead of returning a streamdesc
instance, it returns either 0 on success or a negative error code.

There should be no functional changes with this patch. It is only to be
able to extend qcs_attach_sc() with the possibility of skipping
streamdesc instantiation while still keeping a success return value.

This should be backported wherever the above patch has been merged. For
the record, it was scheduled for immediate backport on 3.1, plus merging
on older releases up to 2.8 after a period of observation.
2025-01-03 17:19:21 +01:00
Willy Tarreau
f486f976c7 BUILD: limits: make normalize_rlim() take an rlim_t to fix build on m68k
As can be seen here, the build fails on m68k since commit 665dde648
("MINOR: debug: use LIM2A to show limits") in 3.1:

  https://github.com/haproxy/haproxy/actions/runs/12440234399/job/34735360177

The reason is the comparison between a ulong limit and RLIM_INFINITY.
Indeed, on m68k, rlim_t is an unsigned long long. Let's just change
the function's input type to take an rlim_t instead. This also allows
to get rid of the casts in the call place.

This can be backported to 3.1 though it's not important given the low
prevalence of this platform for such use cases.
2024-12-25 12:33:06 +01:00
Willy Tarreau
f78121dd32 BUILD: compat: add missing fcntl.h before defining F_SETPIPE_SZ
In 1.5-dev8, 13 years ago, support for setting pipe size was added by
commit bd9a0a778 ("OPTIM/MINOR: make it possible to change pipe size
(tune.pipesize)"). For compatibility purposes, it was defining
F_SETPIPE_SZ in compat.h if it was not set. It apparently always had
F_SETPIPE_SZ defined before being included.

Now in 3.2-dev1, commit fbc534a6f ("REORG: startup: move nofile limit
checks in limits.c") reordered a few includes and ended up with
mworker-prog.c including compat.h before fcntl.h, causing a redefinition
error on certain libcs:

    CC      src/mworker-prog.o
  In file included from /usr/include/bits/fcntl.h:61:0,
                   from /usr/include/fcntl.h:35,
                   from include/haproxy/limits.h:11,
                   from include/haproxy/mworker.h:18,
                   from src/mworker-prog.c:27:
  /usr/include/bits/fcntl-linux.h:203:0: warning: "F_SETPIPE_SZ" redefined [enabled by default]
  In file included from include/haproxy/api-t.h:35:0,
                   from include/haproxy/api.h:33,
                   from src/mworker-prog.c:23:
  include/haproxy/compat.h:161:0: note: this is the location of the previous definition

Let's simply include fcntl.h in compat.h before the macro is redefined.

There's normally no need to backport this, though it's harmless to do
it if needed.
2024-12-25 11:53:11 +01:00
Olivier Houchard
505480eeef CLEANUP: Remove pendconn_must_try_again().
Remove pendconn_must_try_again(), now that it no longer is used.
2024-12-24 14:10:06 +01:00
Olivier Houchard
cda7275ef5 MEDIUM: queue: Handle the race condition between queue and dequeue differently
There is a small race condition between a server checking if there is
something left in the proxy queue and a stream being added to the proxy
queue. If the server checks just before the stream is added to the queue,
and it no longer has any stream to deal with, then nothing will take
care of the stream, that may stay in the queue forever.
This was worked around with commit 5541d4995d, by checking for that exact
condition after adding the stream to the queue, and trying again to get
a server assigned if it is detected.
That fix led to multiple infinite loops, which got fixed, but it is not
unlikely that it could happen again. So let's fix the initial problem
differently: a single server may mark itself as ready, and it removes
itself once used. The principle is that when we discover that the just
queued stream is alone with no active request anywhere to dequeue it,
instead of rebalancing it, it will be assigned to that current "ready"
server that is available to handle it. The extra cost of the atomic ops
is negligible since the situation is super rare.
2024-12-24 14:10:06 +01:00
Olivier Houchard
5b8899b6cc BUG/MEDIUM: queue: Make process_srv_queue return the number of streams
Make process_srv_queue() return the number of streams unqueued, as
pendconn_grab_from_px() did, as that number is used by
srv_update_status() to generate logs.

This should be backported up to 2.6 with
111ea83ed4
2024-12-23 15:03:40 +01:00
William Lallemand
056ec51c26 MEDIUM: ssl/ocsp: counters for OCSP stapling
Add 2 counters in the SSL stats module for OCSP stapling.

- ssl_ocsp_staple is the number of OCSP responses successfully stapled
  with the handshake
- ssl_failed_ocsp_stapled is the number of OCSP responses that we
  couldn't staple, it could be because of an error or because the
  response is expired.

These counters are incremented in the OCSP stapling callback, so if no
OCSP was configured they will never increase. Also they only
work in frontends.

This was discussed in github issue #2822.
2024-12-23 11:23:00 +01:00
William Lallemand
0e6af97233 MINOR: ssl: change visibility of ssl_stats_module
In order to add stats from other files, the ssl_stats_module needs to be
visible from other files.

This moves the ssl_counters definition in ssl_sock-t.h and removes the
static of ssl_stats_module.
2024-12-23 11:23:00 +01:00
William Lallemand
acb2c9eb8b MINOR: ssl: improve HAVE_SSL_OCSP ifdef
Allow to build correctly without OCSP. It can be disabled easily with
an OpenSSL built with OPENSSL_NO_OCSP, or even with
DEFINE="-DOPENSSL_NO_OCSP" on the haproxy make line.
2024-12-19 10:53:05 +01:00
Remi Tricot-Le Breton
93f2c73423 MINOR: ssl/ocsp: Add extra details in error logs when possible
When the ocsp response auto update process fails during insertion or
while validating the received ocsp response, we call
ssl_sock_update_ocsp_response or ssl_ocsp_check_response respectively
and both these functions take an 'err' parameter in which detailed error
messages can be written. Until now, those error messages were discarded
and the only information given to the user was a generic error
(ERR_CHECK or ERR_INSERT) which does not help much.
We now keep a pointer to the last error message in the certificate_ocsp
structure and dump its content in the update logs as well as in the
"show ssl ocsp-updates" cli command.

This issue was raised in GitHub #2817.
2024-12-18 10:41:16 +01:00
Amaury Denoyelle
9d155ca706 MINOR: trace: implement tracing disabling API
Define a set of functions to temporarily disable/reactivate tracing for
the current thread. This could be useful when wanting to quickly remove
tracing output for some code parts.

The API relies on a disable/resume set of functions, with a thread-local
counter. This counter is tested under __trace_enabled(). It is a
cumulative value so that the same count of resume must be issued after
several disable usage. There is also the possibility to force reset the
counter to 0 before restoring the old value.

This should be backported up to 3.1.
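
The disable/resume counter described above can be pictured with the
following minimal sketch. All names here are hypothetical and are not the
actual HAProxy API; it only illustrates a cumulative thread-local counter
with a force/restore pair.

```c
/* Hypothetical names: a per-thread nesting counter for trace suppression. */
static _Thread_local unsigned int trace_disable_cnt;

/* Stop tracing on the current thread; calls may be nested. */
static inline void my_trace_disable(void)
{
    trace_disable_cnt++;
}

/* Undo one my_trace_disable(); tracing resumes once the count is back to 0. */
static inline void my_trace_resume(void)
{
    if (trace_disable_cnt)
        trace_disable_cnt--;
}

/* Force tracing back on and return the previous count for a later restore. */
static inline unsigned int my_trace_force(void)
{
    unsigned int old = trace_disable_cnt;

    trace_disable_cnt = 0;
    return old;
}

/* Restore a count previously returned by my_trace_force(). */
static inline void my_trace_restore(unsigned int old)
{
    trace_disable_cnt = old;
}

/* The kind of test a check such as __trace_enabled() could perform. */
static inline int my_trace_enabled(void)
{
    return trace_disable_cnt == 0;
}
```

Because the counter is cumulative, two nested disable calls need two resume
calls before my_trace_enabled() reports true again.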
2024-12-18 09:52:06 +01:00
Amaury Denoyelle
e296585ae9 MEDIUM/OPTIM: mux-quic: implement purg_list
This commit is part of the current series which aims to refactor and
improve overall performance of QUIC MUX I/O handler.

qcc_io_process() is responsible for performing some internal operations
on QUIC MUX after I/O completion. It is notably called on every
qcc_io_cb() tasklet handler.

The most intensive work on it is the purging of QCS instances after
transfer completion. This was implemented by looping on QCC streams tree
and inspecting the state of every QCS. The purpose of this commit is to
optimize this processing.

A new purg_list QCC member is defined. It lists every QCS instance whose
transfer has been completed. It is thus safe to reuse the <el_send> QCS
list attach point. Stream purging will thus only loop on purg_list
instead of every known QCS.

This should be backported up to 3.1.
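
As a rough illustration of the idea (not the actual mux-quic code), the
sketch below queues a stream on a dedicated list when its transfer
completes, so that purging only walks that list. The struct and function
names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for QCS and QCC. */
struct my_stream {
    int id;
    int done;                      /* set once the transfer is complete */
    struct my_stream *purge_next;  /* link used only while queued for purge */
};

struct my_conn {
    struct my_stream *purge_head;  /* streams waiting to be released */
};

/* Queue <s> for purging the first time its transfer completes. */
static void my_stream_mark_done(struct my_conn *c, struct my_stream *s)
{
    if (s->done)
        return;
    s->done = 1;
    s->purge_next = c->purge_head;
    c->purge_head = s;
}

/* Release every queued stream; cost depends on completed streams only,
 * not on the total number of streams known to the connection. */
static int my_conn_purge(struct my_conn *c)
{
    int released = 0;

    while (c->purge_head) {
        struct my_stream *s = c->purge_head;

        c->purge_head = s->purge_next;
        s->purge_next = NULL;
        /* a real implementation would free the stream here */
        released++;
    }
    return released;
}
```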
2024-12-18 09:33:52 +01:00
Amaury Denoyelle
4b42dd4ae0 MEDIUM/OPTIM: mux-quic: define a recv_list for demux resumption
This commit is part of the current series which aims to refactor and
improve overall performance of QUIC MUX I/O handler.

Define a recv_list element into qcc structure. This is used to
register every qcs instance which is currently blocked on
demuxing, which happens when no more space is left in <rx.appbuf>.

The purpose of this patch is to reduce qcc_io_recv() CPU usage. Now,
only recv_list iteration is performed, instead of the previous looping
over every qcs instances. This is useful as qcc_io_recv() is called each
time qcc_io_cb() is scheduled, even if only sending condition was the
wakeup origin.

A qcs is not inserted into recv_list immediately after blocking on demux
full buffer. Instead, this is only done after unblocking via stream
rcv_buf callback, which ensures that new buffer space is available.

This should be backported up to 3.1.
2024-12-18 09:23:41 +01:00
Amaury Denoyelle
0a53a008d0 MINOR: mux-quic: refactor wait-for-handshake support
This commit refactors wait-for-handshake support from QUIC MUX. The flag
logic QC_CF_WAIT_HS is inverted: it is now set only if MUX is
instantiated before handshake completion. When the handshake is
completed, the flag is removed.

The flag is now set directly on initialization via qmux_init(). Removal
via qcc_wait_for_hs() is moved from qcc_io_process() to qcc_io_recv().
This is deemed more logical as QUIC MUX is scheduled on RECV to be
notified by the transport layer about handshake termination. Moreover,
qcc_wait_for_hs() is now called if recv subscription is still active.

This commit is the first of a series which aims to refactor QUIC MUX I/O
handler and improve its overall performance. The ultimate objective is
to be able to streamline qcc_io_cb() by removing pacing specific code path
via qcc_purge_sending().

This should be backported up to 3.1.
2024-12-18 09:23:41 +01:00
Amaury Denoyelle
17bfe93768 CLEANUP: mux-quic: remove unused qcc member send_retry_list
Remove unused field send_retry_list from qcc and its corresponding
attach element el from qcs.

This should be backported up to 3.1.
2024-12-18 09:20:20 +01:00
Willy Tarreau
7b6acb6a51 MINOR: bug: make BUG_ON() fall back to ASSUME
When the strict level is zero and BUG_ON() is not implemented, some
possible null-deref warnings are emitted again because some were
covering for these cases. Let's make it fall back to ASSUME() so that
the compiler continues to know that the tested expression never happens.
It also allows to further optimize certain functions by helping the
compiler eliminate certain tests for impossible values. However it
requires that the expression is really evaluated before passing the
result through ASSUME() otherwise it was shown that gcc-11 and above
will fail to evaluate its implications and will continue to emit the
null-deref warnings in case the expression is non-trivial (e.g. it
has multiple terms).

We don't do it for BUG_ON_HOT() however because the extra cost of
evaluating the condition is generally not welcome in fast paths,
particularly when that BUG_ON_HOT() was kept disabled for
performance reasons.
2024-12-17 17:39:12 +01:00
Willy Tarreau
63798088b3 MINOR: compiler: add ASSUME_NONNULL() to tell the compiler a pointer is valid
In many places we have ALREADY_CHECKED() or DISGUISE() on a pointer
just to avoid "possibly null-deref" warnings. These have the side
effect of weakening optimizations by passing through an assembly step.
Using ASSUME_NONNULL() we can avoid that extra step. And when the
__builtin_unreachable() builtin is not present, we fall back to the old
method using assembly. The macro returns the input value so that it may
be used both as a declarative way to claim non-nullity or directly inside
an expression like DISGUISE().
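
A minimal sketch of such a value-returning macro is shown below. The macro
name is hypothetical, it relies on GNU statement expressions, and it is not
claimed to match the actual HAProxy definition.

```c
#include <stddef.h>

/* Simplest possible fallback for compilers without __has_builtin(). */
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

#if __has_builtin(__builtin_unreachable)
/* Tell the compiler that <p> cannot be NULL, then return it unchanged. */
#define MY_ASSUME_NONNULL(p) ({                 \
        __typeof__(p) p_ = (p);                 \
        if (p_ == NULL)                         \
                __builtin_unreachable();        \
        p_;                                     \
})
#else
/* Fallback: hide the pointer's origin behind an empty asm statement. */
#define MY_ASSUME_NONNULL(p) ({                 \
        __typeof__(p) p_ = (p);                 \
        __asm__ volatile("" : "+r"(p_));        \
        p_;                                     \
})
#endif

struct my_node {
    int value;
};

/* Used directly inside an expression, since the macro yields its input. */
static inline int my_node_value(struct my_node *n)
{
    return MY_ASSUME_NONNULL(n)->value;
}
```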
2024-12-17 16:46:46 +01:00
Willy Tarreau
2ce63b7b17 MINOR: compiler: also enable __builtin_assume() for ASSUME()
Clang apparently has __builtin_assume() which does exactly the same
as our macro, since at least v3.8. Let's enable it, in case it may
even better detect assumptions vs unreachable code.
2024-12-17 16:46:46 +01:00
Willy Tarreau
efc897484b MINOR: compiler: add a new "ASSUME" macro to help the compiler
This macro takes an expression, tests it and calls an unreachable
statement if false. This allows the compiler to know that such a
combination does not happen, and totally eliminate tests that would
be related to this condition. When the statement is not available
in the compiler, we just perform a break from a do {} while loop
so that the expression remains evaluated if needed (e.g. function
call).
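
The behaviour described above can be sketched as follows. MY_ASSUME() is a
hypothetical name used for illustration, and the __has_builtin() fallback
is the simplest possible one rather than the project's own mechanism.

```c
/* Simplest possible fallback for compilers without __has_builtin(). */
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

#if __has_builtin(__builtin_unreachable)
/* If <expr> is false, execution reaches an unreachable statement, so the
 * compiler may consider that <expr> always holds past this point. */
#define MY_ASSUME(expr) do { if (!(expr)) __builtin_unreachable(); } while (0)
#else
/* No builtin: still evaluate <expr> (it may have side effects) and just
 * break out of the do { } while loop. */
#define MY_ASSUME(expr) do { if (!(expr)) break; } while (0)
#endif

/* The compiler is allowed to drop the masking since v < 16 is assumed. */
static inline unsigned int my_bucket_of(unsigned int v)
{
    MY_ASSUME(v < 16);
    return v & 15;
}
```

Callers remain responsible for never passing a value that violates the
assumption, since the behaviour is undefined in that case.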
2024-12-17 16:46:46 +01:00
Willy Tarreau
41fc18b1d1 MINOR: compiler: rely on builtin detection for __builtin_unreachable()
Due to __builtin_unreachable() only being associated to gcc 4.5 and
above, it turns out it was not enabled for clang. It's not used *that*
much but still a little bit, so let's enable it now. This reduces the
code size by 0.2% and makes it a bit more efficient.
2024-12-17 16:46:46 +01:00
Willy Tarreau
96cfcb1df3 MINOR: compiler: add a __has_builtin() macro to detect features more easily
We already have a __has_attribute() macro to detect when the compiler
supports a specific attribute, but we didn't have the equivalent for
builtins. clang-3 and gcc-10 have __has_builtin() for this. Let's just
bring it using the same mechanism as __has_attribute(), which will allow
us to simply define the macro's value for older compilers. It will save
us from keeping that many compiler-specific tests that are incomplete
(e.g. the __builtin_unreachable() test currently doesn't cover clang).
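
For illustration only, the simplest form of such a detection is sketched
below: when the compiler does not provide __has_builtin(), every builtin is
reported as absent. The commit relies on a finer mechanism that lets values
be defined for older compilers; MY_UNREACHABLE() is a hypothetical name.

```c
/* Compilers without __has_builtin() report every builtin as missing. */
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

#if __has_builtin(__builtin_unreachable)
#define MY_UNREACHABLE() __builtin_unreachable()
#else
/* Portable fallback: a loop that is never expected to be entered. */
#define MY_UNREACHABLE() do { } while (1)
#endif

/* Both values of (dir & 1) are handled, so the end is never reached. */
static const char *my_dir_name(int dir)
{
    switch (dir & 1) {
    case 0:
        return "front";
    case 1:
        return "back";
    }
    MY_UNREACHABLE();
}
```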
2024-12-17 16:46:46 +01:00
Olivier Houchard
b3cd5a4b86 CLEANUP: queues: Remove pendconn_grab_from_px().
pendconn_grab_from_px() is now unused, so just remove it.
2024-12-17 16:05:44 +01:00
William Lallemand
bb88f68cf7 MINOR: ssl: add utils functions to extract X509 notAfter date
Add ASN1_to_time_t() which converts an ASN1_TIME to a time_t and
x509_get_notafter_time_t() which returns the notAfter date in time_t
format.
2024-12-16 14:54:53 +01:00
Valentine Krasnobaeva
fbc534a6fa REORG: startup: move nofile limit checks in limits.c
Let's encapsulate the code, which checks the applied nofile limit into
a separate helper check_nofile_lim_and_prealloc_fd(). Let's keep in this new
function scope the block, which tries to create a copy of FD with the highest
number, if prealloc-fd is set in the configuration.
2024-12-16 10:44:01 +01:00
Valentine Krasnobaeva
14f5e00d38 REORG: startup: move code that applies limits to limits.c
In step_init_3() we try to apply the haproxy maxsock and memmax limits
that were provided or calculated earlier.

Let's encapsulate these code blocks in dedicated functions:
apply_nofile_limit() and apply_memory_limit() and let's move them into
limits.c. limits.c now gathers all the logic for calculating and setting
system limits depending on the provided configuration.
2024-12-16 10:44:01 +01:00
Valentine Krasnobaeva
1332e9b58d REORG: startup: move global.maxconn calculations in limits.c
Let's encapsulate the code, which calculates global.maxconn and
global.maxsslconn into a dedicated function set_global_maxconn() and let's
move this function in limits.c. In limits.c we keep helpers to calculate and
check haproxy internal limits, based on the system nofile and memory limits.
2024-12-16 10:44:01 +01:00
Frederic Lecaille
e1d25cdbdd CLEANUP: quic: remove a wrong comment about ->app_limited (drs)
->app_limited quic_drs struct member is not a boolean. This is
the index of the last transmitted packet marked as application-limited, or 0 if
the connection is not currently application-limited (see C.app_limited
definition in BBR v3 draft).
2024-12-13 14:42:43 +01:00
Frederic Lecaille
eeaeb412dc MINOR: quic: reduce the private data size of QUIC cc algos
After these commits:

    BUG/MINOR: quic: remove max_bw filter from delivery rate sampling
    BUG/MINOR: quic: fix BBR max bandwidth oscillation issue

where some members were removed from bbr struct, the private data
size of QUIC cc algorithms may be reduced from 160 to 144 uint32_t.

Should be easily backported to 3.1 alongside the commits mentioned above.
2024-12-13 14:42:43 +01:00
Frederic Lecaille
22ab45a3a8 BUG/MINOR: quic: remove max_bw filter from delivery rate sampling
This filter is no longer needed after this commit:

 BUG/MINOR: quic: fix BBR max bandwidth oscillation issue.

Indeed, this filter was added at the delivery rate sampling level to filter
the BBR max bandwidth estimations. It was inspired by the ngtcp2 source code
when trying to fix the oscillation issue. But this BBR max bandwidth
oscillation issue was fixed by the aforementioned commit.

Furthermore this code tends to always increment the BBR max bandwidth. From my point
of view, this is not a good idea at all.

Must be backported to 3.1.
2024-12-13 14:42:43 +01:00
Frederic Lecaille
a9a2f98f86 MINOR: window_filter: rely on the time to update the filter samples (QUIC/BBR)
The windowed filters are used only by the BBR implementation for QUIC to
filter the maximum bandwidth samples for its estimation over a virtual time
interval tracked by counting the cyclical progression through ProbeBW cycles.
ngtcp2 and quiche use such windowed filters in their BBR implementation, but
in a slightly different way: when updating the 2nd or 3rd filter samples,
this is done based on their values instead of the time they have been sampled.
It seems more logical to rely on the sample timestamps, even if this has no
implication because when a sample is updated using another sample because it
has the same value, they both have the same timestamp!

This patch replaces two statements which compare two consecutive filter
samples based on their values (smp[]->v) with statements which compare them
based on the virtual time they have been sampled (smp[]->t). This fully
complies with the code used by the Linux kernel in lib/win_minmax.c.

Also take the opportunity of this patch to shorten some statements by using
the <smp> local variable value to update the smp[2] sample instead of
initializing its two members with the <smp> member values.

This patch SHOULD be easily backported to 3.1 where BBR was first implemented.
2024-12-13 14:42:43 +01:00
Amaury Denoyelle
1f458b3ea8 MINOR: applet: define applet_putchk_stress() alternative
Previous patch introduced stress mode to be able to easily test
alternative code paths.

The first point would be to force interruption of stats dump on every
line and check reentrant paths, in particular while adding and removing
server instances.

The purpose of this patch is to be able to use applet_putchk_stress()
during stats dump while not impacting other applets. To support this,
extract applet_putchk() into an internal _applet_putchk() which has a
new argument stress. Define two helpers applet_putchk() and
applet_putchk_stress(), the latter setting the stress argument to true.

For the moment, applet_putchk_stress() is not used. This will be the
subject of the next patch.
2024-12-12 11:26:33 +01:00
Amaury Denoyelle
9d19fc4cf7 MINOR: build: define DEBUG_STRESS
Define a new build mode DEBUG_STRESS. This will be used to stress, with
alternative suboptimal code, some code paths which cannot be reproduced
easily.

First, a global <mode_stress> is set either to 1 or 0 depending on
DEBUG_STRESS compilation. A new global keyword "stress-level" is also
defined. It allows specifying a level from 0 to 9, to increase the
stress incurred on the code.

Helper macros STRESS_RUN* are defined for each stress level. They make it
easy to specify an instruction for default execution and a stress
counterpart used when running at the corresponding stress level.
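
One possible shape for such helpers is sketched below. The names, the
argument order and the way the level is stored are assumptions made for
illustration, not the definitions introduced by this commit.

```c
/* Set at build time, e.g. with -DMY_DEBUG_STRESS (hypothetical define). */
#ifdef MY_DEBUG_STRESS
#define MY_MODE_STRESS 1
#else
#define MY_MODE_STRESS 0
#endif

/* Runtime level from 0 to 9, as a "stress-level" style setting would give. */
static int my_stress_level;

/* Evaluate <stress> when stress mode is built in and the level is reached,
 * otherwise evaluate <dflt>. The argument order is an assumption. */
#define MY_STRESS_RUN(lvl, stress, dflt) \
    ((MY_MODE_STRESS && my_stress_level >= (lvl)) ? (stress) : (dflt))
#define MY_STRESS_RUN1(stress, dflt) MY_STRESS_RUN(1, stress, dflt)
#define MY_STRESS_RUN2(stress, dflt) MY_STRESS_RUN(2, stress, dflt)

/* Example: emit one byte at a time under stress, a full buffer otherwise. */
static int my_pick_chunk_size(void)
{
    return MY_STRESS_RUN1(1, 4096);
}
```

In a build without the stress define, the condition is a compile-time
constant, so the stress branch can be eliminated by the compiler.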
2024-12-12 11:19:10 +01:00
Aurelien DARRAGON
358166ae6a BUG/MINOR: hlua_fcn: restore server pairs iterator pointer consistency
Since 9c91b30 ("MINOR: server: remove prev_deleted server list"), hlua
server pair iterator may use and return invalid (stale) server pointer
if multiple servers were deleted between two iterations.

Indeed, the server refcount mechanism (using srv_take()) is no longer
sufficient as the prev_deleted mitigation was removed.

To ensure server pointer consistency between two yields, the new watcher
mechanism must be used (as is already the case for stats dumping).

Thus in this patch we slightly change the server iteration logic:
hlua_server_list_iterator_context struct now stores the next valid server
pointer, and a watcher is added to ensure this pointer is never stale.

Then in hlua_listable_servers_pairs_iterator(), this next pointer is used
to create the Lua server object, and the next valid pointer is obtained by
leveraging watcher_next().

No backport needed unless 9c91b30 ("MINOR: server: remove prev_deleted
server list") is. Please note that dynamic servers were not supported in
Lua prior to 2.8, so it doesn't make sense to backport this patch further
than 2.8.
2024-12-11 10:52:11 +01:00
Amaury Denoyelle
9c91b30139 MINOR: server: remove prev_deleted server list
This patch is a direct follow-up to the previous one. Thanks to the watcher
type, it is now safe to assume that servers manipulated via stats dump
were not targeted by a "delete server" CLI command. As such, the
prev_deleted server list member is now unneeded. This patch thus removes
any reference to it.
2024-12-10 16:19:33 +01:00
Amaury Denoyelle
071ae8ce3d BUG/MEDIUM: stats/server: use watcher to track server during stats dump
If a server A is deleted while a stats dump is currently on it, deletion
is delayed thanks to reference counting. Server A is nonetheless removed
from the proxy list. However, this list is a singly linked list. If the
next server B is deleted and freed immediately, server A would still
point to it. This problem has been solved by the prev_deleted list in
servers.

This model seems correct, but it is difficult to fully ensure its
validity. In particular, it implies that when stats dump is resumed,
server A elements will be accessed despite the server being in a
half-deleted state.

Thus, it has been decided to completely ditch the refcount mechanism for
stats dump. Instead, use the watcher element to register every stats
dump currently tracking a server instance. Each time a server is deleted
on the CLI, every stats dump element which may point to it is updated
to access the next server instance, or NULL if this is the last server.
This ensures that a server which was deleted via CLI but not completely
freed is never accessed on stats dump resumption.

Currently, no race condition related to dynamic servers and stats dump
is known. However, as described above, the previous model is deemed too
fragile, as such this patch is labelled as bug-fix. It should be
backported up to 2.6, after a reasonable period of observation. It
relies on the following patch :
  MINOR: list: define a watcher type
2024-12-10 16:19:33 +01:00
Amaury Denoyelle
eafa8a32bb MINOR: list: define a watcher type
Define a new watcher type into list module. This type is similar to bref
and can be used to register an element which is currently tracking a
dynamic target. Contrary to bref, if the target is freed, every watcher
element is updated to point to the next valid entry or NULL.

This type will simplify handling of dynamic server deletion, in
particular while stats dumps are performed.

This patch is not a bug-fix. However, it is mandatory to fix a race
condition in dynamic servers. Thus, it should be backported along the
next commit up to 2.6.
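
The principle can be illustrated with the simplified sketch below, where
each watcher keeps a caller-owned pointer up to date. The structures and
function names are hypothetical and much simpler than the real list
module.

```c
#include <stddef.h>

struct my_item;

/* A watcher keeps the caller's pointer <*pptr> valid across deletions. */
struct my_watcher {
    struct my_item **pptr;      /* pointer owned by the tracking element */
    struct my_watcher *next;    /* next watcher attached to the same item */
};

struct my_item {
    int id;
    struct my_item *next;         /* singly linked list of items */
    struct my_watcher *watchers;  /* watchers currently tracking this item */
};

/* Make <w> track <target> (may be NULL) and update the caller's pointer. */
static void my_watcher_attach(struct my_watcher *w, struct my_item **pptr,
                              struct my_item *target)
{
    w->pptr = pptr;
    *pptr = target;
    w->next = NULL;
    if (target) {
        w->next = target->watchers;
        target->watchers = w;
    }
}

/* To be called right before <it> is unlinked and freed: every watcher is
 * moved to the next valid entry, or to NULL when <it> was the last one. */
static void my_item_release_watchers(struct my_item *it)
{
    struct my_watcher *w, *nw;

    for (w = it->watchers; w; w = nw) {
        nw = w->next;
        my_watcher_attach(w, w->pptr, it->next);
    }
    it->watchers = NULL;
}
```

With this scheme, a dump that resumes later reads its own pointer and never
sees an entry that was deleted in the meantime.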
2024-12-10 16:04:11 +01:00
Valentine Krasnobaeva
1f63a53955 BUG/MINOR: mworker: detach from tty when received READY from worker
Some of the master process' initialization steps are conditioned by receiving
the READY message from the worker (pidfile creation, forwarding the READY
message to the launching parent). So, the master process cannot perform these
initialization routines before.

If the master process fails while creating the pidfile or forwarding READY to
the parent in daemon mode, it exits with a proper alert message. In daemon
mode we no longer see such a message, as the process is already detached from
the tty.

To fix this, as these alerts could be very useful, let's detach the master
process from the tty after its last initialization steps in _send_status.
2024-12-09 21:32:54 +01:00
Valentine Krasnobaeva
663d75e7a0 BUG/MEDIUM: startup: report status if daemonized process fails
Due to the master-worker rework, the daemonization fork now happens before
parsing and applying the configuration. This makes it impossible to correctly
report all warnings and alerts to the shell's stdout. When the daemonized
process fails while already in the background, the exit code reported by the
shell via '$?' equals 0, as it's the exit code of its parent.

To fix this, let's create a pipe between the parent and the daemonized child.
The child will send a "READY" message into this pipe when it finishes its
initialization. The parent will wait on the "read" end of the pipe until
receiving something. If read() fails, the parent obtains the status of the
exited child with waitpid(). So, the parent can correctly report the error to
stdout and exit with the child's exit code.

This fix should be backported only in 3.1.
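
The handshake between the parent and the daemonized child can be sketched
as below. This is a simplified, hypothetical illustration (function names
and the immediate exit of the child are assumptions), not the code from
this commit.

```c
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Two hypothetical initialization routines used as examples. */
static int my_init_ok(void)   { return 0; }
static int my_init_fail(void) { return 3; }

/* Fork a child, wait for its READY message and return the exit code the
 * parent should report: 0 on READY, else the child's exit status. */
static int my_launch_daemon(int (*child_init)(void))
{
    int fds[2];
    char buf[8];
    ssize_t n;
    pid_t pid;
    int status;

    if (pipe(fds) < 0)
        return 1;

    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return 1;
    }

    if (pid == 0) {
        int rc;

        close(fds[0]);
        rc = child_init();
        if (rc != 0)
            _exit(rc);      /* the pipe closes without READY */
        if (write(fds[1], "READY", 5) != 5)
            _exit(1);
        close(fds[1]);
        _exit(0);           /* a real daemon would enter its main loop */
    }

    close(fds[1]);
    do {
        n = read(fds[0], buf, sizeof(buf));
    } while (n < 0 && errno == EINTR);
    close(fds[0]);

    if (n == 5 && memcmp(buf, "READY", 5) == 0)
        return 0;

    /* no READY received: collect and propagate the child's exit status */
    if (waitpid(pid, &status, 0) == pid && WIFEXITED(status))
        return WEXITSTATUS(status);
    return 1;
}
```

In this sketch the parent would simply pass the returned value to exit(),
so that '$?' reflects the child's failure.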
2024-12-09 21:32:44 +01:00
William Lallemand
5454824e31 MINOR: ssl: add notBefore and notAfter utility functions
Extracting notBefore and notAfter as a string can be bothersome, so
add 2 utility functions that return the value in a static buffer.
2024-12-09 18:29:23 +01:00
Willy Tarreau
c3ee4e375b MINOR: tools: make fddebug() automatically emit the location
fddebug() is sometimes quite helpful, but annoying to use when following
a call path because it's a pain to always repeat the function name and
call place. Let's have it automatically prepend the function name, the
file name and the line number, and make its arguments optional, replacing
them by a simple LF when all absent. This way, simply placing:

    fddebug();

is sufficient to emit a location following the format "[%s@%s:%d]\n". This
function must not be used in production (and even call places with it
shouldn't be committed) and it should only be used by developers, so the
simpler the better.
2024-12-09 18:05:09 +01:00
Willy Tarreau
d6dc8120c0 BUILD: debug: fix build issues in COUNT_IF() with -Wunused-value
Commit 7f64bb79fd ("BUG/MINOR: debug: COUNT_IF() should return true/false")
allowed the COUNT_IF() macro to return the evaluated value. This is handy
to place it in "if ()" conditions and count them at the same time. When
glitches are disabled, the condition is just returned as-is, but most call
places do not use the result, making some compilers complain. In addition,
while reviewing this, it was noticed that when DEBUG_STRICT=0, the macro
would still be replaced by a "do { } while (0)" statement, which not only
does not evaluate the expression, but also cannot return anything. Ditto
for COUNT_IF_HOT().

Let's make sure both are always properly evaluated now.
2024-12-09 18:04:51 +01:00
Willy Tarreau
7f64bb79fd BUG/MINOR: debug: COUNT_IF() should return true/false
The COUNT_IF() macro was initially meant to return true/false to be used
in if() conditions but had an extra do { } while(0) that prevents it from
doing so. Let's get rid of the do { } while(0) before the code generalizes
to too many places. There's no impact on existing code, but may have to be
backported if future fixes rely on it.
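
To illustrate why dropping the do { } while(0) matters, the sketch below
uses a GNU statement expression so that the macro both counts and yields
the condition. The macro name and the single global counter are
hypothetical simplifications.

```c
/* Hypothetical single counter; real code may keep one per call place. */
static unsigned long my_glitch_count;

/* Count the event when <cond> is true and evaluate to 1 or 0, so that the
 * macro may be used directly inside an if () condition. */
#define MY_COUNT_IF(cond) ({                    \
        int c_ = !!(cond);                      \
        if (c_)                                 \
                my_glitch_count++;              \
        c_;                                     \
})

/* Reject negative lengths while counting how often this happens. */
static int my_parse_len(int len)
{
    if (MY_COUNT_IF(len < 0))
        return -1;
    return len;
}
```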
2024-12-06 18:45:46 +01:00
Valentine Krasnobaeva
cd0b58e23e BUG/MINOR: startup: fix error path for master, if can't open pidfile
If the master process can't open a pidfile, there is no point in sending
SIGTTIN to oldpids, as it will exit. So, old workers will terminate as well.
It's better to send the last alert to the log about the unrecoverable error,
because the master is already in its polling loop.

For the standalone mode we should keep the previous logic in this case: send
SIGTTIN to old process and unbind listeners for the new one. So, it's better
to put this error path in main(), as it's done when other configuration settings
can't be applied.

This patch should be backported only in 3.1.
2024-12-06 12:00:22 +01:00
Aurelien DARRAGON
ae9d8d40d0 CLEANUP: stktable: add some stktable flags polishing
Better late than never: commit 1f73d35 ("MINOR: stktable: implement
"recv-only" table option") implemented stktable flags and initial
definitions, but it lacks some comments. Also, the flag is stored as
16 bits while the STK_FL_ definition width allows for only 8 bits, which
is a bit confusing. Let's fix that.
2024-12-05 13:14:21 +01:00
Aurelien DARRAGON
9f44c5f9be CLEANUP: stktable: replace nopurge attribute with flag
Thanks to the previous commit, the stktable struct now has a "flags" member.

Let's take this opportunity to remove the isolated "nopurge" attribute in
stktable struct and rely on a flag named STK_FL_NOPURGE instead.

This helps to better organize stktable struct members.
2024-12-05 12:15:31 +01:00
Aurelien DARRAGON
1f73d3524d MINOR: stktable: implement "recv-only" table option
When "recv-only" keyword is added on a stick table declaration (in peers
or proxy section), haproxy considers that the table is only used for
data retrieval from a remote location and not used to perform local
updates. As such, it enables the retrieval of local-only values such
as conn_cur that are ignored by default. This can be useful in some
contexts where we want to know about local values such as conn_cur
from a remote peer.

To do this, add stktable struct flags which default to NONE and enable
the RECV_ONLY flag on the table when the "recv-only" keyword is found in the
table declaration. Then, in peer_treat_updatemsg(), when handling
table updates, don't ignore data updates for local-only values if the flag
is set.
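
The decision taken while handling an update can be pictured with the
sketch below. The flag names, values and helper are hypothetical and only
mirror the description above.

```c
#include <stdint.h>

/* Hypothetical flag values mirroring the NONE / RECV_ONLY description. */
#define MY_STK_FL_NONE       0x0000
#define MY_STK_FL_RECV_ONLY  0x0001

struct my_stktable {
    uint16_t flags;
};

/* Return 1 if a received data update must be applied, 0 if it must be
 * ignored. Local-only values are only accepted on recv-only tables. */
static int my_accept_update(const struct my_stktable *t, int is_local_only)
{
    if (is_local_only && !(t->flags & MY_STK_FL_RECV_ONLY))
        return 0;
    return 1;
}
```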
2024-12-05 12:15:24 +01:00
Willy Tarreau
e6f4f15929 MINOR: tasklet: set TASK_WOKEN_OTHER on tasklets by default
Now when tasklets are woken up via tasklet_wakeup(), tasklet_wakeup_on()
or tasklet_wakeup_after(), either the optional wakeup flags will be used,
or TASK_WOKEN_OTHER will be used.

This allows tasklet handlers waking up for any given cause to notice
whether or not they were also woken for another reason. For example, a
mux handler could skip heavy parts when seeing that TASK_WOKEN_OTHER is
absent, proving that no standard tasklet_wakeup() was done, for example
in response to a subscribe().

The benefit of the TASK_WOKEN_* flags is that they're purged during the
wakeup, and that they're easy to check for using TASK_WOKEN_ANY.
TASK_F_UEVT1 and TASK_F_UEVT2 are also usable for private use (e.g. wakeup
from a stream to a connection inside a mux).

In the future, code dealing with subscribe events should probably
start to place TASK_WOKEN_IO as is done for upper layers.
2024-12-03 19:45:08 +01:00
Willy Tarreau
6322c9fbbf MINOR: tools: add a new macro DEFVAL() to provide a default argument
This is like DEFZERO and DEFNULL, but this one takes the default value
to be used as its first argument.
2024-12-03 19:45:08 +01:00
Valentine Krasnobaeva
a33977da48 BUG/MINOR: startup: close pidfd and free global.pidfile in handle_pidfile()
After master-worker mode refactoring, global.pidfile is only used in
handle_pidfile(), which opens the provided file and writes the PID into it. So,
it's more appropriate to perform the close(pidfd) and ha_free(&global.pidfile)
also in this function.

This commit prepares the fix of the pidfile creation, as it's created now very
early, when we are not sure, that process has successfully started. In
master-worker mode handle_pidfile() can be called in the master process context.
So, let's make it accessible from other compilation units via global.h.

This should be backported only in 3.1.
2024-12-02 17:28:04 +01:00
Aurelien DARRAGON
8bce7ff854 MINOR: hlua_fcn: add Patref:commit() method
commit() method may be used to commit pending updates on the local patref
object:

hlua_patref flags were added:
 HLUA_PATREF_FL_GEN means the patref object has been updated
 and it is associated to a new revision (curr_gen) in order to prepare
 and commit the pending updates.

upon commit, the pattern API is leveraged with curr_gen as revision to
commit new object items. Once commit is performed, previous (pending)
revisions that are older than the committed one are cleaned up (similar
to what's done with commit on the cli). Also, Patref function APIs now
take into account curr_gen to perform lookups.
2024-11-29 07:23:08 +01:00
Aurelien DARRAGON
e769d8f426 MINOR: pattern: add pat_ref_may_commit() helper function
pat_ref_may_commit() may be used to know if a given generation ID is still
valid, which means it may still be committed at some point. Else it means
that another pending generation ID older than the tested one was already
committed and thus other generation IDs below this one are stale and must
be regenerated.
2024-11-29 07:23:01 +01:00
Aurelien DARRAGON
43ab25f007 MINOR: hlua_fcn: wrap pat_ref struct for patref class
In order to extend the patref class features, let's wrap the pat_ref struct
into hlua_patref struct. This way we may add additional data alongside the
pat_ref pointer to store additional context required for pat_ref data
manipulation from lua.

Since the wrapper (hlua_patref) is an allocated object, we declare the _gc
metamethod for patref class in order to properly cleanup resources when
they are out of scope.
2024-11-29 07:22:54 +01:00
Aurelien DARRAGON
2021072391 MINOR: hlua_fcn: implement index and pair metamethods for patref class
patref object may now leverage index and pair metamethods to list and
access patref elements at a specific index (=key)

Also, patref:is_map() method may be used to know if the patref stores acl
(key only) or map-style (key:value) patterns.
2024-11-29 07:22:46 +01:00
Aurelien DARRAGON
956a25cf60 MINOR: hlua: add patref class
Implement patref class to expose the internal pat_ref pattern struct
in Lua. This is some prerequisite work needed to be able to manipulate
existing generic pattern object lists (acl/map) from Lua, because the Map
class can only be used to perform matching ops on Map files.
2024-11-29 07:22:32 +01:00
Aurelien DARRAGON
f72a66eef2 MINOR: pattern: publish event_hdl events on pat_ref updates
Now that PAT_REF events were defined in previous commit, let's actually
publish them from pattern API where relevant. Unlike server events,
pattern reference events are only published in the pat_ref subscriber's
list on purpose, because in some setups patref updates (updates performed
on a map for instance from action or cli) are very frequent, and we don't
want to impact pattern API performance just for that.

Moreover, as the main use case is to be able to subscribe to maps updates
from Lua, allowing a per-pattern reference registration is already enough.

No additional data is provided for such events (also for performance reason)

Care was taken not to publish events when the update doesn't affect the
live subset (the one targeted by curr_gen).
2024-11-29 07:22:25 +01:00
Aurelien DARRAGON
f7267bd315 MINOR: event_hdl: add PAT_REF events
This is some prerequisite work for implementing PAT_REF events.

In this commit we define the PAT_REF event_hdl family (which gets family
slot id #2), with the following supported events:

  - EVENT_HDL_SUB_PAT_REF_ADD: element was added to the current version of
    the pattern ref
  - EVENT_HDL_SUB_PAT_REF_DEL: element was deleted from the current
    version of the pattern ref
  - EVENT_HDL_SUB_PAT_REF_SET: element was modified in the current version
    of the pattern ref
  - EVENT_HDL_SUB_PAT_REF_COMMIT: pending element(s) was/were committed in
    the current version of the pattern ref
  - EVENT_HDL_SUB_PAT_REF_CLEAR: all elements were cleared from the
    current version of the pattern ref

The goal is to be able to track a pat_ref struct in order to be notified
when it is updated. For performance reasons, events from this family won't
provide any additional info, and will only be published in the pat_ref
subscription list. Indeed, pat_ref may be updated at a relatively high
frequency (or worse, batch work), so we cannot afford doing expensive
treatment for each update.
2024-11-29 07:22:18 +01:00
Frederic Lecaille
f8b697c19b BUG/MINOR: improve BBR throughput on very fast links
This patch fixes the loss of information when computing the delivery rate
(quic_cc_drs.c) on links with very low latency due to usage of 32bits
variables with the millisecond as precision.

Initialize the quic_conn task with the TASK_F_WANTS_TIME flag to ask
the scheduler to update the call date of this task. This allows this task to
get a nanosecond resolution on the call date by calling task_mono_time(). This
is enabled only for congestion control algorithms with delivery rate
estimation support (BBR only at this time).

Store the send date of each TX packet with nanosecond precision, obtained
thanks to task_mono_time(), into the new ->time_sent_ns quic_tx_packet struct
member.

Make use of this new timestamp by the delivery rate estimation algorithm (quic_cc_drs.c).

Rename current ->time_sent member from quic_tx_packet struct to ->time_sent_ms to
distinguish the unit used by this variable (millisecond) and update the code which
uses this variable. The logic found in quic_loss.c is not modified at all.

Must be backported to 3.1.
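
The loss of information can be seen with a small hypothetical computation:
two packets sent 300 microseconds apart fall in the same millisecond, so an
interval measured in milliseconds is zero while the nanosecond one is not.
The helpers below are illustrative only and are not taken from
quic_cc_drs.c.

```c
#include <stdint.h>

/* Delivery rate in bytes/s from millisecond timestamps; 0 if the interval
 * is too short to be measured at this resolution. */
static uint64_t my_rate_from_ms(uint64_t bytes, uint32_t t0_ms, uint32_t t1_ms)
{
    uint32_t delta = t1_ms - t0_ms;

    return delta ? bytes * 1000ULL / delta : 0;
}

/* Same computation from nanosecond timestamps. */
static uint64_t my_rate_from_ns(uint64_t bytes, uint64_t t0_ns, uint64_t t1_ns)
{
    uint64_t delta = t1_ns - t0_ns;

    return delta ? bytes * 1000000000ULL / delta : 0;
}
```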
2024-11-28 21:39:05 +01:00
Christopher Faulet
bc66d31985 MINOR: proxy: Add support of 421-Misdirected-Request in retry-on status
The "421" status can now be specified on retry-on directives. PR_RE_* flags
were updated to remain sorted.

This patch should fix the issue #2794. It is quite simple so it may safely
be backported to 3.1 if necessary.
2024-11-28 11:47:40 +01:00
Willy Tarreau
97d33abb23 MINOR: version: this is development again (3.2)
This basically reverts commit b629f366a7 ("MINOR: version: mention that
3.1 is stable now").
2024-11-26 17:21:16 +01:00
Aurelien DARRAGON
4792f27892 MINOR: pattern: add pat_ref_gen_delete() function
pat_ref_gen_delete(ref, gen_id, key) tries to delete all samples belonging
to <gen_id> and matching <key> under <ref>

The goal is to be able to target a single subset from <ref>
2024-11-26 16:12:21 +01:00
Aurelien DARRAGON
a131c542a6 MINOR: pattern: add pat_ref_gen_find_elt() function
pat_ref_gen_find_elt(ref, gen_id, key) tries to find <elt> element
belonging to <gen_id> and matching <key> in <ref> reference.

The goal is to be able to target a single subset from <ref>
2024-11-26 16:12:16 +01:00