The lack of mjson_next() prevents to iterate easily and need to hack by
iterating on a loop of snprintf + $.field[XXX] combined with
mjson_find().
This reintroduce mjson_next() so we could iterate without having to
build the string.
The patch does not reintroduce MJSON_ENABLE_NEXT so it could be used
without having to define it.
The ACME_INITIAL_DELAY state displays a message about 'dns-01', but this
state is also used for 'dns-persist-01'.
This patch displays the challenge that was configured instead of dns-01
This allows to use the `unique-id` fetch within `tcp-check` or `http-check`
ruleset. The format is taken from the checked server's backend (which is
naturally inherited from the corresponding `defaults` section).
This is particularly useful with
http-check send ... hdr request-id %[unique-id]
to ensure all requests sent by HAProxy have a unique ID header attached.
This resolves GitHub Issue #3307.
Reviewed-by: Volker Dusch <github@wallbash.com>
This implementation is directly modeled after `stream_generate_unique_id()` and
the corresponding `unique_id` field on `struct stream`.
It will be used in a future commit to enable the use of the `%[unique-id]`
fetch in check rules.
Use the return value of `stream_generate_unique_id()` instead of relying on the
`unique_id` field of `struct stream` when handling the `%ID` log placeholder.
This also allowed to unify the "stream available" and "stream not available"
paths.
Reviewed-by: Volker Dusch <github@wallbash.com>
With the introduction of the `generate_unique_id()` helper, the actual
complicated logic is sitting in a different file. Allow inlining of
`stream_generate_unique_id()`, so that callers can benefit from an abstraction
without hiding away the access of `strm->unique_id` behind a function call.
This new function will handle the actual generation of the unique ID according
to a format. The caller is responsible to check that no unique ID is stored
yet.
Commit 6d16b11022 ("BUG/MINOR: haterm: preserve the pipe size margin
for splicing") solved the issue of pipe size being sufficient for the
vmsplice() call, but as Christopher pointed out, the ratio was applied
to the default size of 64k, so now it's applied twice, giving 100k
instead of 80k. Let's drop it from there.
No backport needed.
Printing a "(null)" when NULL passed with the %s format specifier is a
GNU extension, so it must be avoided for portability reasons.
Must be backported as far as 3.2
The wildcard field was declared and used when building the dns-persist-01
TXT record value (policy=wildcard suffix), but was never populated from
the server's authorization response. Add the missing mjson_get_bool() call
to read $.wildcard before saving auth->dns.
Add challenge_type parameter to acme_rslv_start() to select the correct
DNS lookup prefix: _validation-persist.<domain> for dns-persist-01 and
_acme-challenge.<domain> for dns-01.
Default cond_ready to ACME_RDY_DNS|ACME_RDY_DELAY for dns-persist-01.
Extend ACME_CLI_WAIT to cover dns-persist-01 alongside dns-01.
In ACME_RSLV_READY, check only TXT record existence for dns-persist-01
since the resolver cannot parse multiple strings within a single TXT entry.
Implements draft DNS-PERSIST-01 challenge based on
https://datatracker.ietf.org/doc/html/draft-ietf-acme-dns-persist
Blog post: https://letsencrypt.org/2026/02/18/dns-persist-01
This challenge is designed to use preprovisioned DNS records,
unlike DNS-01 challenge it doesn't need per provider API integration.
In short instead of validating order by crafting a custom response
based on input recieved from ACME server, like other challenges do
in particular DNS-01, HTTP-01, TLS-ALPN-01, in this challenge you
authorize domain statically, ACME account key functions similar to
a private key and accounturi in the record functions like a public key,
ACME server verifies that account uri matches account key and authorizes
based on that. You only need to write DNS record one time,
accounturi binds to an account key, and will only change if new account
key is created, although it is possible to rotate account key without
changing account uri.
Main benefits of this challenge in contrast to DNS-01:
1. Security, no need to give reverse proxy write access to the DNS.
2. Simplicity, no complex per provider integrations like Lego needed.
3. Robustness, no worrying about DNS record cache each renewal.
It would be used like this:
1. generate an account key ahead of time
2. add required DNS record manually or automatically using IaC tools
3. start HAProxy with the same account key used
Intended way to use this challenge is with a code that will print
and maybe sets DNS records ahead of time. For example that could
be integrated into the IaC provisioning step. This challenge type
is extremely recent though, so those integrations are yet to be written.
It is possible to do this challenge without extra tools too,
with pebble / challtestsrv steps would be as following:
After starting HAProxy it will print required records in the logs.
With challtestsrv you can then set those records like this:
curl -d '{
"host":"_validation-persist.localhost.",
"value": "pebble.letsencrypt.org; accounturi=...; policy=wildcard"}
' http://localhost:8055/set-txt
After setting the records run renew with the name of the certificate:
echo "acme renew @cert/localhost.pem" \
| socat stdio tcp4-connect:127.0.0.1:9999
Or just restart HAProxy.
Unlike with DNS-01 you don't have to worry about DNS records changing,
if there is any problem with DNS records you can just retry.
Originally in httpterm we used to allocate 5/4 of the size of a pipe to
permit to use vmsplice because there's some fragmentation or overhead
internally that requires to use a bit of margin. While this was initially
applied to haterm as well, it was accidentally lost with commit fb82dece47
("BUG/MEDIUM: haterm: Properly initialize the splicing support for haterm"),
resulting in errors about vmsplice() whenever tune.pipesize is set. Let's
enforce the ratio again.
No backport is needed.
I noticed some strange checks for presence of errmsg. Called functions
generate non-empty error message in case of failure, so a non-NULL address
of the error message is enough.
No backport needed.
When command line is parsed, when the payload was too big the error was not
properly handled. Instead of leaving the parsing function to print the
error, we looped infinitly trying to parse remaining data.
When the command line is too big, we must exit the parsing function in
CLI_ST_PRINT_ERR state. Instead of exiting the function, we only left the
while loop, setting this way the cli applet in CLI_ST_PROMPT state.
This patch must be backported as far as 3.2.
Instead of relying on the implementation detail that
`stream_generate_unique_id()` will store the unique ID in `strm->unique_id` we
should use the returned value, especially since that one is already checked in
the `isttest()`.
Reviewed-by: Volker Dusch <github@wallbash.com>
The return value of the `if()` and `else` branch is identical. We can just move
it out of conditional paths.
Reviewed-by: Volker Dusch <github@wallbash.com>
`sess_build_logline_orig()` takes a `size_t maxsize` as input and accordingly
should also return `size_t` instead of `int` as the resulting length. In
practice most of the callers already stored the result in a `size_t` anyways.
The few places that used an `int` were adjusted.
This Coccinelle patch was used to check for completeness:
@@
type T != size_t;
T var;
@@
(
* var = build_logline(...)
|
* var = build_logline_orig(...)
|
* var = sess_build_logline(...)
|
* var = sess_build_logline_orig(...)
)
Reviewed-by: Volker Dusch <github@wallbash.com>
The following configuration:
defaults
unique-id-format TEST-%[srv_name]
frontend fe_http
mode http
bind :::8080 v4v6
Emitted the following error:
[ALERT] (219835) : Parsing [./patch.cfg:2]: failed to parse unique-id : sample fetch <srv_name]> may not be reliably used here because it needs 'server' which is not available here.
The `]` in the name of the sample fetch should not be there.
This bug exists since at least HAProxy 2.4, which is the oldest supported
version. The fix should be backported there.
Reviewed-by: Volker Dusch <github@wallbash.com>
Reuse QUIC transport parameters value set in xprt_qstrm layer in frame
builder function. Prior to this patch, mux_quic would use different
values from the advertised ones.
No need to backport.
When QMux was first implemented, values used for emitted transport
parameters in xprt_qstrm and local flow control in mux_quic were
initialized separately. This is error prone in particular if a value is
change in one layer but not the other.
This patch fixes this by using xprt_qstrm_lparams() in QMux init
function. Mux flow control is then loaded with these values. Thus all
values are now initialized in a single place which is xprt_qstrm_init().
The OpenTelemetry (OTel) filter enables distributed tracing of requests
across service boundaries, export of metrics such as request rates,
latencies and error counts, and structured logging tied to trace context,
giving operators a unified view of HAProxy traffic through any
OpenTelemetry-compatible backend.
The OTel filter is implemented using the standard HAProxy stream filter
API. Stream filters attach to proxies and intercept traffic at each stage
of processing: they receive callbacks on stream creation and destruction,
channel analyzer events, HTTP header and payload processing, and TCP data
forwarding. This allows the filter to collect telemetry data at every
stage of the request/response lifecycle without modifying the core proxy
logic.
This commit added the minimum set of files required for the filter to
compile: the addon Makefile with pkg-config-based detection of the
opentelemetry-c-wrapper library, header files with configuration
constants, utility macros and type definitions, and the source files
containing stub filter operation callbacks registered through
flt_otel_ops and the "opentelemetry" keyword parser entry point.
The filter uses the opentelemetry-c-wrapper library from HAProxy
Technologies, which provides a C interface to the OpenTelemetry C++ SDK.
This wrapper allows HAProxy, a C codebase, to leverage the full
OpenTelemetry observability pipeline without direct C++ dependencies
in the HAProxy source tree.
https://github.com/haproxytech/opentelemetry-c-wrapperhttps://github.com/open-telemetry/opentelemetry-cpp
Build options:
USE_OTEL - enable the OpenTelemetry filter
OTEL_DEBUG - compile the filter in debug mode
OTEL_INC - force the include path to the C wrapper
OTEL_LIB - force the library path to the C wrapper
OTEL_RUNPATH - add the C wrapper RUNPATH to the executable
Example build with OTel and debug enabled:
make -j8 USE_OTEL=1 OTEL_DEBUG=1 TARGET=linux-glibc
conn_recv_qstrm() may be called several times per connection if the read
data is too short and a truncated record is received.
Previously, record length was parsed every time the function is invoked.
However, this must only be performed if record length varint is
incomplete. Once read and parsed, data are removed from the buffer via
b_quic_dec_int(). Thus, next conn_recv_qstrm() run will reread an
invalid record length this time.
This patch fixes this by only parsing record length if <rxrlen> member
is null. Prior to it, parsing of QMux transport parameters would fail in
case of a first truncated read, which would prevent the connection
initialization.
No need to backport.
A QCC connection may be flagged with QC_CF_ERRL to trigger a
CONNECTION_CLOSE emission. However, for now error reporting is not
functional with QMux, as it relies on quic_conn layer access.
To prevent a crash in qcc_io_send() when using QMux, add a
conn_is_quic() check when QC_CF_ERRL is set to ensure no access will be
performed on quic_conn layer. In the future, this should be extended so
that QMux is also able to emit CONNECTION_CLOSE for connection closure.
No need to backport.
First, we must not emit any warning if splicing is not configured and the
global maxpipes value is 0. Then we must not remove GTUNE_USE_SPLICE flag
when we fail to allocate the haterm master pipe. Instead, we test it when we
negociate with the opposite side, to properly exclude the splicing if it is
not usable.
No backport needed.
This reverts commit 8056117e98.
Moving haterm init from haproxy is not the right way to fix the issue
because it should be possible to use a haterm configuration in haproxy.
So let's revert the commit above.
Add QUIC BLOCKED frames in the list of supported types in
qstrm_parse_frm(). Nothing is really implemented for them as for QUIC,
but this prevents a crash when receiving one of them via QMux.
No need to backport.
This patch implements emission of the new Record layer for QMux frames.
This handles mux-quic and xprt_qstrm layers as this is performed
similarly in both cases.
Currently, the simplest approach has been prefered : each frame is
encoded in its own record. This is not the most efficient in size but it
is extremely simple to implement for a first interop testing.
This patch implements the new QMux record layer parsing for xprt_qstrm.
This is mostly similar to the MUX code from the previous patch.
Along with this change, a new xprt_qstrm layer accessor exposes the
possible remaining record length after Transport parameters parsing.
This can only occur when xprt_qstrm Rx buffer is not completely emptied
due to other following frames. If stored in the same record, MUX layer
has to know the remaining record length.
Thus, xprt_qstrm_rxrlen() is now used in qmux_init() to preinitialize
<rx.rlen> QCC field.
This is the first patch of a serie which aims to support the new Record
layer defined by the draft 01 of QMux protocol.
https://www.ietf.org/archive/id/draft-ietf-quic-qmux-01.html#name-qmux-records
This patch deals with QMux reception at the MUX layer. The function
qcc_qstrm_recv() is adapted to read record headers before frame parsing.
This requires to keep the last record length read in a new QCC field
named <rx.rlen>.
Frames are only parsed once a full record is received. One of the
advantage of the record layer is that it can only contains whole frame
without truncation.
This patch implements proper connection error handling for xprt_qstrm
layer. Basically, processing is interrupted if CO_FL_ERROR is
encountered after either rcv_buf or snd_buf operations. Connectionn
error is set to the newly defined value CO_ER_QSTRM.
This commit adds buffering on transmission for xprt_qstrm layer. This is
necessary in the rare case where send syscall only emits partial data.
A new <txbuf> member is defined in xprt_qstrm context. On first send
invokation, buffer is allocated and then the QMux transport parameters
frame is encoded. Then emission is performed via snd_buf and each time
the send function is invoked.
Layer xprt_qstrm is responsible to read the initial QMux transport
parameters frame. However, it could receive more data if some other
frames follow it. This extra content can only be handled by the MUX
layer once initialized.
Theorically, it could have been implemented via MSG_PEEK. However, this
flag is currently ignored by SSL layer. Besides, it is tedious to
implement safely. A new approach has been prefered where the MUX layer
is responsible to retrieve remaining data via xprt_qstrm_rxbuf()
accessor function during its initialization.
Thus, qmux_init() now may retrieve the buffer from xprt_qstrm layer.
This is performed via b_xfer() which will result in a zero copy
transfer. If this happens, tasklet is immediately scheduled to start
demuxing.
Implement buffering for reception on xprt_qstrm layer. This is necessary
to handle reception of a truncated QMux transport parameters frame.
This is performed via a new dedicated <rxbuf> member in xprt_qstrm
context. Read is performed by reusing the buffer until a whole frame can
be read.
QUIC frame parsers functions take a <pos> pointer as input argument for
the data to be parsed. If parsing is successful, <pos> must be
incremented to point to the next data.
Increment was not performed when parsing QMux transport parameters
frame. This commit fixes this. Note that for now there is no real issue
as xprt_qstrm does not check the QMux frame length.
No need to backport.
In qcc_release(), <conn> may be NULL. Thus every access on it must be
tested.
With recent QMux introduction, a call to conn_is_quic() has been added
prior to registration of the stream rejection callback. It could lead to
NULL deref as <conn> is not tested there. Fix this by adding an extra
check on the pointer validity.
No need to backport.
hlua_applet_http_status() stored the result of luaL_optlstring()
directly in http_ctx->reason. The pointer references Lua-managed
string storage which is only guaranteed valid until the C function
returns to Lua. If the GC runs between applet:set_status(200, str)
and applet:start_response(), the pointer dangles.
hlua_applet_http_send_response() then calls ist(http_ctx->reason)
which does strlen() on freed memory, followed by memcpy into the
HTX status line. The freed-and-reallocated chunk contents are sent
verbatim to the HTTP client.
Trigger:
applet:set_status(200, table.concat({"Reason ", str:rep(50)}))
collectgarbage("collect"); collectgarbage("collect")
applet:start_response()
With heap grooming, adjacent allocation contents (session data, TLS
material from the same thread) leak into the response status line.
Anchor the Lua string in the registry keyed by the http_ctx field
address so it survives until the applet is done with it. The
registry entry is overwritten on each call (handles repeated
set_status) and naturally cleaned up when the lua_State is closed.
This patch should be backported to all stable versions.
FCGI content_length is a 16-bit field but fcgi_set_record_size()
is called with size_t/uint32_t arguments. With tune.bufsize >= 65544
(legal; cfgparse-global.c only enforces <= INT_MAX-16), a single
HTX DATA block or accumulated outbuf can exceed 65535 bytes. The
implicit conversion to uint16_t silently truncates the length field
while b_add(mbuf, outbuf.data) writes the full body.
A client posting ~99000 bytes can craft the body so that bytes
after the truncated length are parsed by PHP-FPM as fresh FCGI
records on the connection: a smuggled BEGIN_REQUEST + PARAMS with
arbitrary SCRIPT_FILENAME / PHP_VALUE bypasses all haproxy ACLs.
Fix the zero-copy path by refusing it when the block exceeds 65535
bytes (falls through to copy). Fix the copy path by capping
outbuf.size to 65535 + header so the data-fill loop naturally
stops at the FCGI maximum and emits the rest in a subsequent record.
The PARAMS path at line 2084 is similarly affected but harder to
trigger (requires combined header+param size > 65535) and is
covered by the same outbuf.size cap pattern if applied there.
This patch must be backported to all stable versions.
exp_replace() returns int and returns -1 when the back-reference
expansion overflows the output buffer (regex.c:51). output->data is
size_t, so -1 becomes SIZE_MAX. There was no error check.
The subsequent comparisons interpret SIZE_MAX as a huge length:
"output->data > b_room(trash)" tries to grow trash, then
"max > output->data" is false so max stays at trash->size, and
memcpy(trash, output->area, trash->size) copies the full chunk.
output->area is a pool_alloc()'d chunk that is NOT zeroed; the
bytes after the partial exp_replace output are stale data from a
prior pool user (request headers, response bodies from the same
worker thread).
Trigger with a backreference whose expansion exceeds bufsize:
http-request set-header X %[req.hdr(In),regsub('(.+)','\1\1')]
and a request with In: of ~9000 bytes. The X header sent to the
backend then contains ~9KB of stale heap data.
With tune.bufsize.large set, get_larger_trash_chunk() upgrades trash
and the memcpy reads up to ~50KB past the (smaller) output->area
allocation.
http_ana.c:2728 and http_act.c:551 already check exp_replace() for
-1; this call site was missed when backreferences were added.
This patch must be backported to all stable versions.
Samples of type SMP_T_METH were not properly handled in smp_dup(),
smp_is_safe() and smp_is_rw(). For "other" methods, for instance PATCH, a
fallback was performed on the SMP_T_STR type. Only the buffer considered
changed. "smp->data.u.meth.str" should be used for the SMP_T_METH samples
while smp->data.u.str should be used for SMP_T_STR samples. However, in
smp_dup(), the result was stored in wrong buffer, the string one instead of
the method one. In smp_is_safe() and smp_is_rw(), the method buffer was not
used at all.
We now take care to use the right buffer.
This patch must be backported to all stable versions.
When "Expect" header was found in request headers, "HTTP/1.1 100-continue"
was returned instead of "HTTP/1.1 100 continue". Let's fix it.
No backport needed.
http_action_set_headers_bin() decodes varint name and value lengths
from a binary sample but never validates that the decoded length
fits in the remaining sample data before constructing the ist.
If the value's varint decodes to a large number with only a few
bytes following, v.len exceeds the buffer and http_add_header()
memcpys past the sample, copying adjacent heap data into a header
sent to the backend (or client, with http-response).
The intended source for this action is the hdrs_bin sample fetch
which produces well-formed output, but nothing prevents an admin
from feeding it req.body or another untrusted source. With:
http-request set-var(txn.h) req.body
http-request add-headers-bin var(txn.h)
a POST body of [05]"X-Foo"[c8]"AB" produces v = {ptr="AB", len=200}
and 198 bytes of adjacent heap data go into X-Foo.
http_action_del_headers_bin() was fixed too.
Compare spoe_decode_buffer() which has the equivalent check.
Validate both name and value lengths against remaining data.
No backport needed.
Commit c84c15d393 ("BUG/MINOR: resolvers: Apply dns-accept-family
setting on additional records") converted a switch statement to an
if/else chain but left the break; in the AAAA branch. In the new
form, break exits the surrounding for loop instead of a switch case.
For every AAAA additional record in an SRV response:
- answer_record allocated at line 1460 is never freed and never
inserted into answer_tree -> ~580 bytes leaked per response
- all subsequent additional records in the response are silently
discarded
A DNS server controlling SRV responses for haproxy service discovery
can leak memory at MB/min rates given default resolution intervals.
Also breaks IPv6 SRV target resolution outright since the AAAA record
is leaked rather than attached to its SRV entry.
HAProxy has always called luaL_openlibs() unconditionally, which opens
all standard Lua libraries including io, os, package and debug. This
makes it impossible to prevent Lua scripts from executing binaries
(os.execute, io.popen), loading native C modules (package/require), or
bypassing any Lua-level sandbox via the debug library.
Add a new global directive tune.lua.openlibs that accepts a comma-separated
list of library names to load:
tune.lua.openlibs none # only base + coroutine
tune.lua.openlibs string,math,table,utf8 # safe libs only
tune.lua.openlibs all # default, same as before
The base and coroutine libraries are always loaded regardless: base provides
core Lua functions that HAProxy relies on, and coroutine is required because
HAProxy overrides coroutine.create() with its own safe implementation.
When all libraries are enabled (the default), the fast path still calls
luaL_openlibs() directly with no overhead. A parse error is returned if
the directive appears after lua-load or lua-load-per-thread (the Lua state
is already initialised at that point), or if 'none' is combined with other
library names. Note that fork() and new thread creation are already blocked
by default regardless of this setting (see "insecure-fork-wanted").
Literals are sent in two ways:
- in EOB state, unencoded and prefixed with their length
- in FIXED state, huffman-encoded
And references are only sent in FIXED state.
The API promises that the amount of data will not grow by more than
5 bytes every 65535 input bytes (the comment was adjusted to remind
this last point). This is guaranteed by the literal encoding in EOB
state (BT, LEN, NLEN + bytes), which is supposed to be the worst
case by design.
However, as reported by Greg KH, this is currently not true: the test
that decides whether or not to switch to FIXED state to send references
doesn't properly account for the number of bytes needed to roll back
to the *exact* same state in EOB, which means sending EOB, BT,
alignment, LEN and NLEN in addition to the referenced bytes, versus
sending the encoding for the reference. By not taking into account the
cost of returning to the initial state (BT+LEN+NLEN), it was possible
to stay too long in the FIXED state and to consume the extra bytes that
are needed to return to the EOB state, resulting in producing much more
data in case of multiple switchovers (up to 6.25% increase was measured
in tests, or 1/16, which matches worst case estimates based on the code).
And this check is only valid when starting from EOB (in order to restore
the same state that offers this guarantee). When already in FIXED state,
the encoded reference is always smaller than or same size as the data.
The smallest match length we support is 4 bytes, and when encoded this
is no more than 28 bits, so it is safe to stay in FIXED state as long
as needed while checking the possibility of switching back to EOB.
This very slightly reduces the compression ratio (-0.17% on a linux
kernel source) but makes sure we respect the API promise of no more
than 5 extra bytes per 65535 of input. A side effect of the slightly
simpler check is an ~7.5% performance increase in compression speed.
Many thanks to Greg for the detailed report allowing to reproduce
the issue.
This is libslz upstream commit 002e838935bf298d967f670036efa95822b6c84e.
Note: in haproxy's default configuration (tune.bufsize 16384,
tune.maxrewrite 1024), this problem cannot be triggered, because the
reserve limits input to 15360 bytes, and the overflow is maximum
960 bytes resulting in 16320 bytes total, which still fits into the
buffer. However, reducing tune.maxrewrite below 964, or tune.bufsize
above 17408 can result in overflows for specially crafted patterns.
A workaround for larger buffers consists in always setting tune.bufsize
to at least 1/16 of tune.bufsize.
Reported-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://www.mail-archive.com/haproxy@formilux.org/msg46837.html
Storing the protocol directly into the check was not a good idea,
because the protocol may not be determined until after a DNS resolution
on the server, and may even change at runtime, if the DNS changes.
What we can, however, figure out at start up, is the net_addr_type,
which will contain all that we need to find out which protocol to use
later.
Also revert the changes made by commit 07edaed191
that would not reuse the server xprt if a different alpn is set for
checks. The alpn is just a string, and should not influence the choice
of the xprt.
We'll now make sure to use the server xprt, unless an address is
provided, in which case we'll use whatever xprt matches that address, or
a port, in which case we'll assume we want TCP, and use check_ssl to
know whetver we want the SSL xprt or not.
Now that the check contains all that is needed to know which protocol to
look up, always just use that when creating a new check connection if it
is the default check connection, and for now, always use TCP when a
tcp-check or http-check connect rule is used (which means those can't be
used for QUIC so far).
This should hopefully fix github issue #3324.
This reverts commit a03120e228.
A WIP version of the patch was applied before the actual patch by
accident. The correct patch is 2db801c ("BUG/MINOR: hlua: fix stack
overflow in httpclient headers conversion")
When the app_ops were removed, direct calls to the SC wake callback function
were replaced by tasklet wakeups. However, in conn_create_mux(), it was
replaced by a direct call to sc_conn_process(). However, sc_conn_process()
is only usable when the SC is attach to a stream. A backend mux can be
created for a healcheck. In this context, sc_conn_process() cannot be
called.
Because of this bug, crashes can be experienced when an error is triggered
during a SSL connection attempt from a healthcheck.
To fix the issue, the call to sc_conn_process() was replaced by a tasklet
wakeup.
This patch should fix the issue #3326. No backport needed.
When a peer sends a dictionary entry update with a value (the else
branch at line 2109), the entry id decoded from the wire was never
validated against dc->max_entries before being used as an array index
into dc->rx[].
A malicious peer can send id=N where N > 128 (PEER_STKT_CACHE_MAX_ENTRIES)
to:
- dc->rx[id-1].de at line 2123: OOB read followed by atomic decrement
and potential free of an attacker-controlled pointer via
dict_entry_unref()
- dc->rx[id-1].de = de at line 2124: OOB write of a heap pointer at
an attacker-controlled offset (16-byte stride, ~64 GiB range)
The bounds check was added to the key-only branch in commit f9e51beec
("BUG/MINOR: peers: Do not ignore a protocol error for dictionary
entries.") but was never added to the with-value branch. The bug has
been present since dictionary support was introduced in commit
8d78fa7def ("MINOR: peers: Make peers protocol support new
"server_name" data type.").
Reachable from any TCP client that knows the configured peer name
(no cryptographic authentication on the peers protocol). Requires a
stick-table with "store server_key" in the configuration.
Fix by hoisting the bounds check above the branch so it covers both
paths.
Must be backported as far as 2.6.
When the input chunk is already the large buffer (chk->size ==
large_trash_size), the <= comparison still matched and returned
another large buffer of the same size. Callers that retry on a
non-NULL return value (sample.c:4567 in json_query) loop forever.
The json_query infinite loop is trivially triggered: mjson_unescape()
returns -1 not only when the output buffer is too small but also for
any \uXXYY escape where XX != "00" (mjson.c:305) and for invalid
escapes like \q. The retry loop assumes -1 always means "grow the
buffer", so a 14-byte JSON body of {"k":"\u0100"} hangs the worker
thread permanently. Send N such requests to exhaust all worker
threads.
Use < instead of <= so a chunk that is already large yields NULL.
This also fixes the json converter overflow at sample.c:2869 where
no recheck happens after the "growth" returned a same-size buffer.
Introduced in commit ce912271db ("MEDIUM: chunk: Add support for
large chunks"). No backport needed.
A copy-paste error in alloc_trash_buffers_per_thread() passes
global.tune.bufsize_large to alloc_small_trash_buffers() instead of
global.tune.bufsize_small. This sets small_trash_size = bufsize_large.
When tune.bufsize.large is configured, get_larger_trash_chunk() then
incorrectly matches a large buffer against small_trash_size at line
169 and "grows" it to a regular (smaller) buffer. b_xfer() at line
179 attempts to copy the large buffer's contents into the smaller one:
- Default builds (DEBUG_STRICT=1): BUG_ON in __b_putblk() aborts
the process -> remote DoS
- DEBUG_STRICT=0 builds: BUG_ON becomes ASSUME() and the compiler
elides the check -> heap overflow with attacker-controlled bytes
Reachable via the json converter (sample.c:2862) when escaping
~bufsize_large/6 control characters in attacker-supplied data such
as a request header or body.
Introduced in commit 92a24a4e87 ("MEDIUM: chunk: Add support for
small chunks"). No backport needed.
hlua_error() is a printf-family function (calls vsnprintf), but
hlua_patref_set, hlua_patref_add, and _hlua_patref_add_bulk pass
errmsg directly as the format string. errmsg is built by pattern.c
helpers that embed the user-supplied key or value verbatim, e.g.
pat_ref_set_elt() generates "unable to parse '<value>'".
A Lua script calling:
ref:set("key", "%p.%p.%p.%p.%p.%p.%p.%p")
against a map with an integer output type (where the parse fails)
gets stack/register contents formatted into the (nil, err) return
value -> ASLR/canary leak. With %n and no _FORTIFY_SOURCE this
becomes an arbitrary write primitive.
This must be backported as far as the Patref Lua API exists.
hlua_httpclient_table_to_hdrs() declares a VLA of size
global.tune.max_http_hdr (default 101) on the stack but never checks
hdr_num against that bound. A Lua script that supplies a header table
with more than 101 values writes struct http_hdr entries (two ist =
two heap pointers + two lengths) past the end of the VLA, smashing
the stack frame.
Trigger from any Lua action/task/service:
local hc = core.httpclient()
local v = {}
for i = 1, 300 do v[i] = "x" end
hc:get{ url = "http://127.0.0.1/", headers = { ["X"] = v } }
Each out-of-bounds entry writes a heap pointer (controllable
allocation contents via istdup) plus an attacker-chosen length onto
the stack, overwriting the saved return address.
[wla: this is only reachable if the Lua script passes more than
max_http_hdr header values, which requires access to the script itself]
This must be backported as far as the httpclient Lua API exists.
Signed-off-by: William Lallemand <wlallemand@haproxy.com>
hlua_httpclient_table_to_hdrs() declares a VLA of size
global.tune.max_http_hdr (default 101) on the stack but never checks
hdr_num against that bound. A Lua script that supplies a header table
with more than 101 values writes struct http_hdr entries (two ist =
two heap pointers + two lengths) past the end of the VLA, smashing
the stack frame.
Trigger from any Lua action/task/service:
local hc = core.httpclient()
local v = {}
for i = 1, 300 do v[i] = "x" end
hc:get{ url = "http://127.0.0.1/", headers = { ["X"] = v } }
Each out-of-bounds entry writes a heap pointer (controllable
allocation contents via istdup) plus an attacker-chosen length onto
the stack, overwriting the saved return address. With no stack
canary, this is direct RCE; with a canary, it requires a leak first.
Reachable from any deployment that loads Lua scripts. While Lua
scripts are nominally trusted, this turns "can edit Lua" into "can
execute arbitrary native code", which is a meaningful boundary in
many setups (Lua sandbox escape).
This must be backported as far as the httpclient Lua API exists.
When the secret argument to jwt_decrypt_secret is a variable
(ARGT_VAR) rather than a literal string, alloc_trash_chunk() is
called to hold the base64-decoded secret but the buffer is never
released. The end: label frees input, decrypted_cek, out, and the
decoded_items array but not secret.
Each request leaks one trash chunk (~tune.bufsize, default 16KB).
At ~65000 requests per GiB this allows slow memory exhaustion DoS
against any config of the form:
http-request set-var(txn.x) req.hdr(...),jwt_decrypt_secret(txn.key)
This must be backported as far as JWE support exists.
convert_ecdsa_sig() calls i2d_ECDSA_SIG(ecdsa_sig, &p) where p
points into signature->area, a trash chunk of tune.bufsize bytes
(default 16384). i2d writes with no output bound.
The raw R||S input can be up to bufsize bytes (filled by
base64urldec at jwt.c:520-527), giving bignum_len up to 8192. The
DER encoding adds a SEQUENCE header (2-4 bytes), two INTEGER headers
(2-4 bytes each), and up to two leading-zero sign-padding bytes when
the bignum high bit is set. With two 8192-byte bignums having the
high bit set, the encoding is ~16398 bytes, overflowing the 16384-
byte buffer by ~14 bytes.
Triggered by any JWT with alg=ES256/384/512 and a ~21830-character
base64url signature. The signature does not need to verify
successfully; the overflow happens before verification. Reachable
from any config using jwt_verify with an EC algorithm.
Also fixes the existing wrong check: i2d returns -1 on error which
became SIZE_MAX in the size_t signature->data, defeating the
"== 0" test.
This must be backported as far as JWT support exists.
In sample_conv_jwt_decrypt_secret(), when a JWE token has an empty
encrypted-key section but the algorithm is not "dir" (e.g. A128KW),
neither branch initializes decrypted_cek. The NULL pointer is then
passed to decrypt_ciphertext() which dereferences it:
- For GCM encodings: aes_process() calls b_orig(NULL) -> SIGSEGV
- For CBC encodings: b_data(NULL) at jwe.c:463 -> SIGSEGV
A single HTTP request with a crafted Authorization header crashes the
worker process. Trigger token (JOSE header {"alg":"A128KW","enc":"A128GCM"},
empty CEK section between the two dots):
eyJhbGciOiJBMTI4S1ciLCJlbmMiOiJBMTI4R0NNIn0..AAAAAAAAAAAAAAAA.AA.AA
Reachable in any configuration using the jwt_decrypt_secret converter.
The other two decrypt converters (jwt_decrypt_jwk, jwt_decrypt_cert)
already have the check.
This must be backported as far as JWE support exists.
The 16-bit name_len field is read directly from the ClientHello and
stored as the sample length without any validation against srv_len,
ext_len, or the channel buffer size. A 65-byte ClientHello with
name_len=0xffff produces a sample claiming 65535 bytes of data when
only ~4 bytes are actually present in the buffer.
Downstream consumers then read tens of kilobytes past the channel
buffer:
- pattern.c:741 XXH3() hashes 65535 bytes -> ~50KB OOB heap read
- sample.c smp_dup memcpy if large trash configured
- log-format %[req.ssl_sni] leaks heap contents to logs/headers
Reachable pre-authentication on any TCP frontend using req.ssl_sni
(req_ssl_sni), which is the documented way to do SNI-based content
switching in TCP mode. No SSL handshake is required; the parser
runs on raw buffer contents in tcp-request content rules.
Bug introduced in commit d4c33c8889 (2013). The ALPN parser in
the same file at line 1044 has the equivalent check; SNI never did.
This must be backported to all supported versions.
When the healthcheck section support was added, the tcpcheck type was moved
into the tcpcheck ruleset. However, conn_install_mux_chk() function was not
updated accordingly. So the TCP mode was always returned.
No backport needed. This patch is related to #3324 but it is not the root
cause of the issue.
As reported by GH @phihos on GH #3320, using the shm-stats-file feature
with objects exceeding 127 chars would result in object name being
unexpectedly truncated, while GUID API supports up to 128 chars.
Indeed, with the config below, and shm-stats-file enabled:
server s1 127.0.0.1:1 guid srv:xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx:SRV_1 disabled
server s10 127.0.0.1:1 guid srv:xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx:SRV_10 disabled
haproxy would store the second server object with the same id as the first
one, but upon reload, only the first one would be restored, which would
eventually cause shm-stats-file slot exhaustion with repetitive reloads.
@phihos, found out the underlying issue, in counters.c we used snprintf()
with sizeof(shm_obj->guid) - 1 as <size> parameter, while we should have
use sizeof(shm_obj->guid) instead since shm_obj->guid already takes the
terminating NULL byte into account.
So we simply apply the fix suggested by @phihos, and hopefully this should
solve the shm-stats-file slot leak that was observed.
Unfortunately, for now, we cannot warn the user that a duplicate
shm-stats-file object was found, because we accept duplicate objects
by design for 2 reasons. The first one is for a new process to be able
to change the object type for a previously known GUID while allowing
previous processes to use the old object as long as they are alive.
The second reason is that upon startup we cannot afford to scan the
whole object list, as soon as we find a match (type + GUID), we bind
the object, and this way we avoid unnecessary lookup time.
Perhaps we have room for improvement in the future, but for now let's
keep it this way.
It should be backported to 3.3
Big thanks to @phihos for the bug description, analysis and
suggestions.
The parsing of the "healthcheck" parameter for dynamic servers was not
finished. The post-config was missing, leading to a crash because the
ruleset pointer was NULL.
To fix the issue, check_server_tcpcheck() function is called in
cli_parse_add_server().
No backport needed.
When an early response is sent to the client and the H1 connection is
switched to the draining state, we must take care to disable the 0-copy data
forwarding because the backend side is no longer here. It is an issue
because this prevent any regular receive to be performed.
This patch should fix the issue #3316. It must be backported as far as 3.0.
Functions used to initialize haterm (the splicing and the response buffers)
were defined and registered in haterm.c. The problem is that this file in
compiled with haproxy. So it may be an issue. And for the splicing part,
warnings may be emitted when haproxy is started.
To avoid any issue during haproxy startup and to avoid to initialize some
part of haterm, all init functions were moved into haterm_init.c file.
No backport needed.
This is another pre-requisite work for upcoming decompression filter.
In this patch we implement the "filter-sequence" directive which can be
used in proxy section (frontend,backend,listen) and takes 2 parameters
The first one is the direction (request or response), the second one
is a comma separated list of filter names previously declared on the
proxy using the "filter" keyword.
The main goal of this directive is to be able to instruct haproxy in which
order the filters should be executed on request and response paths,
especially if the ordering between request and response handling must
differ, and without relying on the filter declaration ordering (within
the proxy) which is used by default by haproxy.
Another benefit of this feature is that it becomes possible to "ignore"
a previously declared filter on the proxy. Indeed, when filter-sequence
is defined for a given direction (request/response), then it will be used
over the implicit filter ordering, but if a filter which was previously
declared is not specified in the related filter-sequence, it will not be
executed on purpose. This can be used as a way to temporarily disable a
filter without completely removing its configuration.
Documentation was updated (check examples for more info)
flt_conf struct stores the filter id, which is used internally to check
match the filter against static pointer identifier, and also used as
descriptive text to describe the filter. But the id is not consistent
with the public name as used in the configuration (for instance when
selecting filter through the 'filter' directive).
What we do in this patch is that we add flt_conf->name member, which
stores the real filter name as seen in the configuration. This will
allow to select filters by their name from other directives in the
configuration.
The ssl_crtname_index was registered with SSL_get_ex_new_index() but the
certificate name is stored on a SSL_CTX object via SSL_CTX_set_ex_data().
The free callback is only invoked for the object type matching the index
registration, so the strdup'd name was never freed when the SSL_CTX was
released.
Fix this by using SSL_CTX_get_ex_new_index() instead, which ensures the
free callback fires when the SSL_CTX is destroyed.
No backport needed.
Following request options are now handled as flags:
- ?k=1 => flag HS_ST_OPT_CHUNK_RES is set
- ?c=0 => flag HS_ST_OPT_NO_CACHE is set
- ?R=1 => flag HS_ST_OPT_RANDOM_RES is set
- ?A=A => flag HS_ST_OPT_REQ_AFTER_RES is set.
By default, none is set.
The support for the splicing was added and enabled by default, if
supported. The command line option '-dS' was also added to disable the
feature.
When the splicing can be used and the front multiplexer agrees to proceed,
tee() is used to "copy" data from the master pipe to the client pipe.
Now the zero-copy data forwarding is supported, we will add the splicing
support. To do so, we first create a master pipe with vmsplice() during
haterm startup. It is only performed if the splicing is supported. And its
size can be configured by setting "tune.pipesize" global parameter.
This master pipe will be used to fill the pipe with the client.
The support for the zero-copy data forwarding was added and enabled by
default. The command line option '-dZ' was also added to disable the
feature.
Concretely, when haterm pushes the response payload, if the zero-copy
forwarding is supported, a dedicated function is used to do so.
hstream_ff_snd() will rely on se_nego_ff() to know how many data can send
and at the end, on se_done_ff() to really send data.
hstream_add_ff_data() function was added to perform the raw copy of the
payload in the sedesc I/O buffer.
hstream_add_data() function is renamed to hstream_add_htx_data() because
there will be a similar function to add data in zero-copy forwarding
mode. The function was also adapted to take the data length to add in
parameter and to return the number of written bytes.
This new sample fetch returns the name of the certificate selected for
an incoming SSL/TLS connection, as it would appear in "show ssl cert".
It may be a filename with its relative or absolute path, or an alias,
depending on how the certificate was declared in the configuration.
The certificate name is stored as ex_data on the SSL_CTX at load time
in ckch_inst_new_load_store(), and freed via a dedicated free callback.
The "feature" predicate takes an argument name. Not passing one will
cause strstr() to always find something, including at the end of the
string, and to read past end that ASAN detects. We need to check that
we didn't reach end before proceeding.
This bug was reported by OSS Fuzz here:
https://issues.oss-fuzz.com/issues/499133314
The issue is present since 2.4 with commit 58ca706e16 ("MINOR: config:
add predicate "feature" to detect certain built-in features") so this
fix must be backported to all stable versions.
Using awslc_api_before() with an invalid argument results in "(null)"
appearing in the error message due to -1 being returned without the
error message being filled. Let's always fill the error message on error.
This was introduced in 3.3 with commit 3d15c07ed0 ("MINOR: cfgcond: add
"awslc_api_atleast" and "awslc_api_before""), and this fix must be
backported to 3.3.
Using openssl_version_before() with an invalid argument results in "(null)"
appearing in the error message due to -1 being returned without the error
message being filled. Let's always fill the error message on error.
This was introduced in 2.5 with commit 3aeb3f9347 ("MINOR: cfgcond:
implements openssl_version_atleast and openssl_version_before"), and
this fix must be backported to 2.6.
cfg_eval_condition() says that the <errptr> pointer will be set upon
error. However, cfg_eval_cond_expr() can fail (e.g. failure to handle
a dynamic argument) but would branch to "done" and leave errptr unset.
Let's check for this case as well.
This bug was reported by OSS Fuzz here:
https://issues.oss-fuzz.com/issues/499135825
The bug was introduced in 2.5 around commit ca81887599 ("MINOR:
cfgcond: insert an expression between the condition and the term") so
the fix must be backported as far as 2.6.
The previous ACME_RSLV_WAIT state served a dual role: it applied the
initial dns-delay before the first DNS probe and also handled the
delay between retries. There was no way to simply wait a fixed delay
before submitting the challenge without also triggering DNS pre-checks.
Replace ACME_RSLV_WAIT with two distinct states:
- ACME_INITIAL_DELAY: an optional initial wait before proceeding,
only applied when "challenge-ready" includes the new "delay" keyword
- ACME_RSLV_RETRY_DELAY: the delay between resolution retries, always
applied when DNS pre-checks are in progress
The new "delay" keyword in "challenge-ready" can be used standalone
(wait then submit the challenge directly) or combined with "dns" (wait
then start the DNS pre-checks). When "delay" is not set, the first DNS
probe fires immediately.
Update the documentation accordingly.
The TASK_WOKEN_TIMER check that previously handled the case where
RSLV_TRIGGER was reached directly from the CLI command is therefore dead
code and can be removed.
Fix the following build warning from obsolete compilers for <orig_frm>
variable in qcc_qstrm_send_frames() function :
src/mux_quic_qstrm.c:266:17: warning: 'orig_frm' may be used
uninitialized in this function [-Wmaybe-uninitialized]
The variable is now explicitely initialized to NULL on each loop, which
should prevent this warning. Note that for code clarity, the variable is
renamed <next_frm>.
No need to backport.
Previously the dns timeout timer was initialized in ACME_RSLV_WAIT,
before the initial dns-delay expires. This meant the countdown started
before any DNS request was actually sent, so the effective timeout was
shorter than expected by one dns-delay period.
Move the initialization to ACME_RSLV_TRIGGER so the timer starts only
when the first DNS resolution attempt is triggered. Update the
documentation to clarify this behaviour.
Defines an API for xprt_qstrm so that the QMux transport parameters can
be retrieved by the MUX layer on its initialization. This concerns both
local and remote parameters.
Functions xprt_qstrm_lparams/rparams() are defined and exported for
this. They are both used in qmux_init() if QMux protocol is active.
On SSL handshake completion, MUX layer can be initialized if not already
the case. However, for QMux protocol, it is necessary first to perform
transport parameters exchange, via the new xprt_qstrm layer. This patch
ensures this is performed if any flag CO_FL_QSTRM_* is set on the
connection.
Also, SSL layer registers itself via add_xprt. This ensures that it can
be used by xprt_qstrm for the emission/reception of the necessary
frames.
This patch implements QMux emission of transport parameters via
xprt_qstrm. Similarly to receive, this is performed in conn_send_qstrm()
which uses lower xprt snd_buf operation. The connection must first be
flagged with CO_FL_QSTRM_SEND to trigger this step.
Extend xprt_qstrm to implement the reception of QMux transport
parameters. This is performed via conn_recv_qstrm() which relies on the
lower xprt rcv_buf operation. Once received, parameters are kept in
xprt_qstrm context, so that the MUX can retrieve them on init.
For the reception of parameters to be active, the connection must first
be flagged with CO_FL_QSTRM_RECV.
Add get_alpn operation support for xprt_qstrm. This simply acts as a
passthrough method to the underlying XPRT layer.
This function is necessary for QMux when running above SSL, as mux-quic
will access ALPN during its initialization in order to instantiate the
proper application protocol layer.
Define a new XPRT layer for the new QMux protocol. Its role will be to
perform the initial exchange of transport parameters.
On completion, contrary to XPRT handshake, xprt_qstrm will first init
the MUX and then removes itself. This will be necessary so that the
parameters can be retrieved by the MUX during its initialization.
This patch only declares the new xprt_qstrm along with basic operations.
Future commits will implement the proper reception/emission steps.
Similarly to reception, a new buffer is defined in QCC connection to
handle emission for QMux protocol. This replaces the trash buffer usage
in qcc_qstrm_send_frames().
This buffer is necessary to handle partial emission. On retry, the
buffer must be completely emitted before starting to send new frames.
Each time a QUIC frame is emitted, mux-quic layer is notified via a
callback to update the underlying QCS. For QUIC, this is performed via
qc_stream_desc element.
In QMux protocol, this can be simplified as there is no
qc_stream_desc/quic_conn layer interaction. Instead, each time snd_buf
is called, QCS can be updated immediately using its return value. This
is performed via a new function qstrm_ctrl_send().
Its work is similar to the QUIC equivalent but in a simpler mode. In
particular, sent data can be immediately removed from the Tx buffer as
there is no need for retransmission when running above TCP.
This patchs implement mux-quic reception for the new QMux protocol. This
is performed via the new function qcc_qstrm_send_frames(). Its interface
is similar to the QUIC equivalent : it takes a list of frames and
encodes them in a buffer before sending it via snd_buf.
Contrary to QUIC, a check on CO_FL_ERROR flag is performed prior to
every qcc_qstrm_send_frames() invokation to interrupt emission. This is
necessary as the transport layer may set it during snd_buf. This is not
the case currently for quic_conn layer, but maybe a similar mechanism
should be implemented as well for QUIC in the future.
The previous patch defines a new QCC buffer member to implement QMux
reception. This patch completes this by perfoming realign on it during
qcc_qstrm_recv(). This is necessary when there is not enough contiguous
data to read a whole frame.
When QMux is used, mux-quic must actively performed reception of new
content. This has been implemented by the previous patch.
The current patch extends this by defining a buffer on QCC dedicated to
this operation. This replaces the usage of the trash buffer. This is
necessary to deal with incomplete reads.
Implements parsing of frames related to flow-control for mux-quic
running on the new QMux protocol. This simply calls qcc_recv_*() MUX
functions already used by QUIC.
This patch implements a new function qcc_qstrm_recv() dedicated to the
new QMux protocol. It is responsible to perform data reception via
rcv_buf() callback. This is defined in a new mux_quic_strm module.
Read data are parsed in frames. Each frame is handled via standard
mux-quic functions. Currently, only STREAM and RESET_STREAM types are
implemented.
One major difference between QUIC and QMux is that mux-quic is passive
on the reception side on the former protocol. For the new one, mux-quic
becomes active. Thus, a new call to qcc_qstrm_recv() is performed via
qcc_io_recv().
STREAM frame will also be used by the new QMux protocol. This requires
some adaptation in the qf_stream structure. Reference to qc_stream_desc
object is replaced by a generic void* pointer.
This change is necessary as QMux protocol will not use any
qc_stream_desc elements for emission.
Ensure mux-quic traces will be compatible with the new QMux protocol.
This is necessary as the quic_conn element is accessed to display some
transport information. Use conn_is_quic() to protect these accesses.
Ensure mux-quic operations related to initialization and shutdown will
be compatible with the new QMux protocol. This requires to use
conn_is_quic() before any access to the quic_conn element, in
qmux_init(), qcc_shutdown() and qcc_release().
Adapts mux-quic functions related to emission for future QMux protocol
support.
In short, QCS will not used a qc_stream_desc object but instead a plain
buffer. This is inserted as a union in QCS structure. Every access to
QUIC qc_stream_desc is protected by a prior conn_is_quic() check. Also,
pacing is useless for QMux and thus is disabled for such protocol.
Move <stream> field from qcs type into the inner structure 'tx'. This
change is only a minor refactoring without any impact. It is cleaner as
Rx buffer elements are already present in 'rx' inner structure.
This reorganization is performed before introducing of a new Tx buffer
field used for QMux protocol.
Implement parse/build methods for QX_TRANSPORT_PARAMETER frame. Both
functions may fail due to buffer space too small (encoding) or truncated
frame (parsing).
Define a new frame type for QMux transport parameter exchange. Frame
type is 0x3f5153300d0a0d0a and is declared as an extra frame, outside of
quic_frame_parsers / quic_frame_builders.
The next patch will implement parsing/encoding of this frame payload.
The previous patch refactored QUIC transport parameters decoding and
validity checks. These two operation are now performed in two distinct
functions. This renders quic_tp_dec_err type useless. Thus, this patch
removes it. Function returns are converted to a simple integer value.
Function quic_transport_params_decode() is used for decoding received
parameters. Prior to this patch, it also contained validity checks on
some of the parameters. Finally, it also tested that mandatory
parameters were indeed found.
This patch separates this two parts. Params validity is now tested in a
new function quic_transport_params_check(), which can be called just
after decode operation.
This patch will be useful for QMux protocol, as this allows to reuse
decode operation without executing checks which are tied to the QUIC
specification, in particular for mandatory parameters.
The documentation for functions related to transport parameters decoding
is unclear or sometimes completely wrong on the meaning of the <server>
argument. It must be set to reflect the origin of the parameters,
contrary to what was implied in function comments.
Fix this by rewriting comments related to this <server> argument. This
should prevent to make any mistake in the future.
This is purely a documentation fix. However, it could be useful to
backport it up to 2.6.
This patch is a direct follow-up of the previous one. This time,
refactoring is performed on qc_build_frm() which is used for frame
encoding.
Function prototype has changed as now packet argument is removed. To be
able to check frame validity with a packet, one can use the new parent
function qc_build_frm_pkt() which relies on qc_build_frm().
As with the previous patch, there is no function change expected. The
objective is to facilitate a future QMux implementation.
This patch refactors parsing in QUIC frame module. Function
qc_parse_frm() has been splitted in three :
* qc_parse_frm_type()
* qc_parse_frm_pkt()
* qc_parse_frm_payload()
No functional change. The main objective of this patch is to facilitate
a QMux implementation. One of the gain is the ability to manipulate QUIC
frames without any reference to a QUIC packet as it is irrelevant for
QMux. Also, quic_set_connection_close() calls are extracted as this
relies on qc type. The caller is now responsible to set the required
error code.
Set the default dns-delay to 30s so it can be more efficient with fast
DNS providers. The dns-timeout is set to 600s by default so this does
not have a big impact, it will only do more check and allow the
challenge to be started more quickly.
When using the dns-01 challenge method with "challenge-ready dns", HAProxy
retries DNS resolution indefinitely at the interval set by "dns-delay". This
adds a "dns-timeout" keyword to set a maximum duration for the DNS check phase
(default: 600s). If the next resolution attempt would be scheduled beyond that
deadline, the renewal is aborted with an explicit error message.
A new "dnsstarttime" field is stored in the acme_ctx to record when DNS
resolution began, used to evaluate the timeout on each retry.
Thanks to this patch, it is now possible to specify an healthcheck section
on the server line. In that case, the server will use the tcpcheck as
defined in the correspoding healthcheck section instead of the proxy's one.
tcpcheck_ruleset struct was extended to host a config part that will be used
for healthcheck sections. This config part is mainly used to store element
for the server's tcpcheck part.
When a healthcheck section is parsed, a ruleset is created with its name
(which must be unique). "*healthcheck-{NAME}" is used for these ruleset. So
it is not possible to mix them with regular rulesets.
For now, in a healthcheck section, the type must be defined, based on the
options name (tcp-check, httpchk, redis-check...). In addition, several
"tcp-check" or "http-check" rules can be specified, depending on the
healthcheck type.
Functions used to parse directives related to tcpchecks were split to have a
first step testing the proxy and creating the tcpcheck ruleset if necessary,
and a second step filling the ruleset. The aim of this patch is to preapre
the parsing of healthcheck sections. In this context, only the second steip
will be used.
When log-format stirngs were parsed in context of a tcpcheck, ARGC_SRV
context was used instead of ARGC_TCK. This context is used to report
accurrate errors.
This patch could be backported to all stable versions.
The proxy flag PR_O_TCPCHK_SSL is replaced by a flag on the tcpcheck
itself. When TCPCHK_FL_USE_SSL flag is set, it means the healthcheck will
use an SSL connection and the SSL xprt must be prepared for the server.
In tcpchecks context, when HTTP sample expressions are parsed, there is no
reason to set the proxy's http_needed value to 1. This value is only used
for streams to allocate an HTTP txn.
This patch could be backported to all stable versions.
disable-on-404 and send-state options, configured on an HTTP healtcheck,
were handled as proxy options. Now, these options are handled in the
tcp-check itself. So the corresponding PR_O and PR_02 flags are removed.
The tcpcheck_rules structure is replaced by the tcpcheck structure. The main
difference is that the ruleset is now referenced in the tcpcheck structure,
instead of the rules list. The flags about the ruleset type are moved into
the ruleset structure and flags to track unused rules remains on the
tcpcheck structure. So it should be easier to track unused rulesets. But it
should be possible to configure a set of tcpcheck rules outside of the proxy
scope.
The main idea of these changes is to prepare the parsing of a new
healthcheck section. So this patch is quite huge, but it is mainly about
renaming some fields.
When parsing httpchck option, a wrong flag (TCPCHK_SND_HTTP_FROM_OPT) was
set on the rules, while it is in fact a flag for a send rule. Let's remove
it. There is no issue here because there is no corresponding flag for
tcpcheck rules.
This patch must be backported to all stable versions.
These actions were added recently and it appeared the way binary headers
were retrieved could be simplified.
First, there is no reason to retrieve a base64 encoded string. It is
possible to rely on the binary string directly. "b64dec" converter can be
used to perform a base64 decoding if necessary.
Then, using a log-format string is quite overkill and probably
conterintuitive. Most of time, the headers will be retrieved from a
variable. So a sample expression is easier to use. Thanks to the previous
patch, it is quite easy to achieve.
This patch relies on the commit "MINOR: action: Add a sample expression
field in arguments used by HTTP actions". The documentation was updated
accordingly.
An error is erroneously triggered if a if/unless statement is found after
set-headers-bin and add-headers-bin actions. To make it works, during
parsing of these actions, we should leave when an unknown argument is found
to let the rule parser the opportunity to parse an if/unless statement.
No backport needed.
Add keylog_format_fc and keylog_format_bc global variables containing
the SSLKEYLOGFILE log-format strings for the frontend (client-facing)
and backend (server-facing) TLS connections respectively. These produce
output compatible with the SSLKEYLOGFILE format described at:
https://tlswg.org/sslkeylogfile/draft-ietf-tls-keylogfile.html
Both formats are also exported as environment variables at startup:
HAPROXY_KEYLOG_FC_LOG_FMT
HAPROXY_KEYLOG_BC_LOG_FMT
These variables contains \n so they might not be compatible with syslog
servers, using them with stderr or a sink might be required.
These can be referenced directly in "log-format" directives to produce
SSLKEYLOGFILE-compatible output, usable by network analyzers such as
Wireshark to decrypt captured TLS traffic.
Reverse the default, to hide the version from stats by default, and add
a new keyword, "stats show-version", to enable them, as we don't want to
disclose the version by default, especially on public websites.
When binary headers are decoded, return value of decode_varint() function is
not properly handled. On error, it can return -1. However, the result is
inconditionnaly added to an unsigned offset.
Now, a temporary variable is used to be abl to test decode_varint() return
value. It is added to the offset on success only.
No backport needed.
When h1_snd_buf() inherits the CO_SFL_MSG_MORE flag from the upper layer, it
unconditionally propagates it to H1C_F_CO_MSG_MORE, which eventually sets
MSG_MORE on the sendmsg() call. For bodyless responses (HEAD, 204, 304), this
causes the kernel to cork the TCP connection for ~200ms waiting for body data
that will never be sent.
With an H1 frontend and H2 backend, this adds ~200ms of latency to many or
all bodyless responses. The 200ms corresponds to the kernel's tcp_cork_time
default. H1 backends are less affected because h1_postparse_res_hdrs() sets
HTX_FL_EOM during header parsing for bodyless responses, but H2 backends
frequently deliver the end-of-stream signal in a separate scheduling round,
leaving htx_expect_more() returning TRUE when headers are first forwarded.
The fix guards H1C_F_CO_MSG_MORE so it is only set when the connection is a
backend (H1C_F_IS_BACK) or the response is not bodyless
(!H1S_F_BODYLESS_RESP). This ensures bodyless responses on the front
connection are sent immediately without corking.
This should be backported to all stable branches.
Co-developed-by: Billy Campoli <bcampoli@meta.com>
Co-developed-by: Chandan Avdhut <cavdhut@meta.com>
Co-developed-by: Neel Raja <neelraja@meta.com
These actions allow setting, adding and deleting multiple headers from
the same action, without having to know the header names during parsing.
This is useful when doing things with SPOE.
The CLI commands (get|add|del|clear|commit|set) | (acl|map) does not
contain a permission check on admin level.
Must be backported to 3.3. This can be a breaking change for some users.
Initially reported by Cameron Brown.
'set ssl ocsp-response', 'update ssl ocsp-response', 'show ssl
ocsp-response', 'show ssl ocsp-updates' are lacking permissions checks
on admin level.
Must be backported in 3.3. This can be a breaking change for some users.
Initially reported by Cameron Brown.
Both 'set ssl tls-key' and 'show tls-keys' command are missing the
permission checks so the commands can be used only in admin mode.
Must be backported to 3.3. This can be a breaking change for some users.
Initially reported by Cameron Brown.
This commit adds an ha_warning() when map/acl commands are accessed
without admin level. This is to warn users that these commands will be
restricted to admin only in HAProxy 3.3.
Must be backported in every stable branches.
Initially reported by Cameron Brown.
This commit adds an ha_warning() when OCSP commands are accessed without
admin level. This is to warn users that these commands will be
restricted to admin only in HAProxy 3.3.
Must be backported in every stable branches.
Initially reported by Cameron Brown.
This commit adds an ha_warning() when 'show tls-keys' or 'set ssl
tls-key' are accessed without admin level. This is to warn users that
these commands will be restricted to admin only in HAProxy 3.3.
Must be backported in every stable branches.
Initially reported by Cameron Brown.
The previous patch implemented the 'dns-check' option. This one replaces
it by a more generic 'challenge-ready' option, which allows the user to
chose the condition to validate the readiness of a challenge. It could
be 'cli', 'dns' or both.
When in dns-01 mode it's by default to 'cli' so the external tool used to
configure the TXT record can validate itself. If the tool does not
validate the TXT record, you can use 'cli,dns' so a DNS check would be
done after the CLI validated with 'challenge_ready'.
For an automated validation of the challenge, it should be set to 'dns',
this would check that the TXT record is right by itself.
When using the dns-01 challenge type, TXT record propagation across
DNS servers can take time. If the ACME server verifies the challenge
before the record is visible, the challenge fails and it's not possible
to trigger it again.
This patch introduces an optional DNS pre-check mechanism controlled
by two new configuration directives in the "acme" section:
- "dns-check on|off": enable DNS propagation verification before
notifying the ACME server (default: off)
- "dns-delay <time>": delay before querying DNS (default: 300s)
When enabled, three new states are inserted in the state machine
between AUTH and CHALLENGE:
- ACME_RSLV_WAIT: waits dns-delay seconds before starting
- ACME_RSLV_TRIGGER: starts an async TXT resolution for each
pending authorization using HAProxy's resolver infrastructure
- ACME_RSLV_READY: compares the resolved TXT record against the
expected token; retries from ACME_RSLV_WAIT if any record is
missing or does not match
The "acme_rslv" structure is implemented in acme_resolvers.c, it holds
the resolution for each domain. The "auth" structure which contains each
challenge to resolve contains an "acme_rslv" structure. Once
ACME_RSLV_TRIGGER leaves, the DNS tasks run on the same thread, and the
last DNS task which finishes will wake up acme_process().
Note that the resolution goes through the configured resolvers, not
through the authoritative name servers of the domain. The result may
therefore still be affected by DNS caching at the resolver level.
This patch adds support for TXT records. It allows to get the first
string of a TXT-record which is limited to 255 characters.
The rest of the record is ignored.
Latest commit a336c467a0 ("BUG/MINOR: net_helper: fix length controls
on ip.fp tcp options parsing") was malformed and broke the build. This
should be backported wherever the fix above is backported.
If opt len is truncated by tcplen we may read 1 Byte after the
tcp header.
There is also missing controls parsing MSS and WS we may compute
invalid values on fingerprint reading after the tcp header in
case of truncated options.
This patch should be backported on versions including ip.fp
We leverage the SE_FL_APP_STARTED flag to detect whether the application
layer had a chance to run or not when an RST_STREAM is received. This
allows us to triage RST_STREAM between regular ones and harmful ones,
and to count glitches for them. It reveals extremely effective at
detecting fast HEADERS+RST pairs.
It could be useful to backport it to 3.2, though it depends on these
two previous patches to be backported first (the first one was already
planned and the second one is harmless, though will require to drop
the haterm changes):
BUG/MINOR: stconn: Always declare the SC created from healthchecks as a back SC
MINOR: stconn: flag the stream endpoint descriptor when the app has started
In order to improve our ability to distinguish operations that had
already started from others under high loads, it would be nice to know
if an application layer (stream) has started to work with an endpoint
or not. The use case typically is a frontend mux instantiating a stream
to instantly cancel it. Currently this info will take some time to be
detected and processed if the applcation's task takes time to wake up.
By flagging the sedesc with SE_FL_APP_STARTED the first time a the app
layer starts, the lower layers can know whether they're cancelling a
stream that has started to work or not, and act accordingly. For now
this is done unconditionally on the backend, and performed early in the
only two app layers that can be reached by a frontend: process_stream()
and process_hstream() (for haterm).
The SC created from a healthcheck is always a back SC. But SC_FL_ISBACK
flags was missing. Instead of passing it when sc_new_from_check() is called,
the function was simplified to set SC_FL_ISBACK flag systematically when a
SC is created from a healthcheck.
This patch should be backported as far as 2.6.
RFC 9000 lists each supported frames and the type of packets in which it
can be present.
Prior to this patch, a packet with an incompatible frame is dropped.
However, QUIC specification mandates that the connection is immediately
closed with PROTOCOL_VIOLATION error code. This patch completes
qc_parse_frm() to add such connection closure.
This must be backported up to 2.6.
When an htx DATA block is partially transfer, we must take care to remove
exactly the copied size. To do so, we must save the size of the last block
value copied and not rely on the last data block after the copy. Indeed,
data can be merged with an existing DATA block, so the last block size can
be larger than the last part copied.
Because of this issue, it is possible to remove more data than
expected. Worse, this could lead to a crash by performing an integer
overflow on the block size.
No backport needed.
Fix a leak of the task object in acme_start_task() when one of the
condition in the function failed.
Fix issue #3308.
Must be backported to 3.2 and later.
There are two settings to control idle connection sharing across
threads.
tune.idle-pool.shared, that enables or disables it, and then
tune.takeover-other-tg-connections, which lets you or not get idle
connections from other thread groups.
Add a new keyword for tune.idle-pool.shared, "full", that lets you get
connections from other thread groups (equivalent to "full" keyword for
tune.takeover-other-tg-connections). The "on" keyword now will be
equivalent to the "restrict" one, which allowed getting connection from
other thread groups only when not doing it would result in a connection
failure (when reverse-http or when strict-macxonn are used).
tune.takeover-other-tg-connections will be deprecated.
If server returns an auth with status valid it seems that client
needs to always skip it, CA can recycle authorizations, without
this change haproxy fails to obtain certificates in that case.
It is also something that is explicitly allowed and stated
in the dns-persist-01 draft RFC.
Note that it would be better to change how haproxy does status polling,
and implements the state machine, but that will take some thought
and time, this patch is a quick fix of the problem.
See:
https://github.com/letsencrypt/boulder/issues/2125https://github.com/letsencrypt/pebble/issues/133
This must be backported to 3.2 and later.
When abortonclose option is enabled (by default since 3.3), the HTTP rules
can no longer yield if the client aborts. However, stream aborts were also
considered. So it was possible to interrupt yielding rules, especially on
the response processing, while the client was still waiting for the
response.
So now, when abortonclose option is enabled, we now take care to only
consider client aborts to prevent HTTP rules to yield.
Many thanks to @DirkyJerky for his detailed analysis.
This patch should fix the issue #3306. It should be backported as far as
2.8.
warnif_misplaced_* functions return 1 when a warning is reported and 0
otherwise. So the caller must properly handle the return value.
When parsing a proxy, ERR_WARN code must be added to the error code instead
of the return value. When a warning was reported, ERR_RETRYABLE (1) was
added instead of ERR_WARN.
And when tcp rules were parsed, warnings were ignored. Message were emitted
but the return values were ignored.
This patch should be backported to all stable versions.
When warnif_cond_conflicts() is called, we must take care to emit a warning
only when a conflict is reported. We cannot rely on the err_code variable
because some warnings may have been already reported. We now rely on the
errmsg variable. If it contains something, a warning is emitted. It is good
enough becasue warnif_cond_conflicts() only reports warnings.
This patch should fix the issue #3305. It is a 3.4-dev specific issue. No
backport needed.
When picking a mux, pay attention to its MX_FL_FRAMED. If it is set,
then it means we explicitely want QUIC, so don't use that mux for any
protocol that is not QUIC.
When parsing the check address, store the associated proto too.
That way we can use the notation like quic4@address, and the right
protocol will be used. It is possible for checks to use a different
protocol than the server, ie we can have a QUIC server but want to run
TCP checks, so we can't just reuse whatever the server uses.
WIP: store the protocol in checks
Don't assume the check will reuse the server's xprt. It may not be true
if some settings such as the ALPN has been set, and it differs from the
server's one. If the server is QUIC, and we want to use TCP for checks,
we certainly don't want to reuse its XPRT.
Permission checks on the CLI for ACME are missing.
This patch adds a check on the ACME commands
so they can only be run in admin mode.
ACME is stil a feature in experimental-mode.
Initial report by Cameron Brown.
Must be backported to 3.2 and later.
Permission checks on the CLI for ECH are missing.
This patch adds a check for "(add|set|del|show) ssl ech" commands
so they can only be run in admin mode.
ECH is stil a feature in experimental-mode and is not compiled by
default.
Initial report by Cameron Brown.
Must be backported to 3.3.
This patch fixes a warning that can be reproduced with gcc-8.5 on RHEL8
(gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-28)).
This should fix issue #3303.
Must be backported everywhere 917e82f283 ("MINOR: debug: copy debug
symbols from /usr/lib/debug when present") was backported, which is
to branch 3.2 for now.
Fix the check or arguments of the 'acme challenge_ready' command which
was checking if all arguments are NULL instead of one of the argument.
Must be backported to 3.2 and later.
Replace atol() by _strl2uic() in cases the input are ISTs when parsing
the retry-after header. There's no risk of an error since it will stop
at the first non-digit.
Must be backported to 3.2 and later.
In acme_req_finalize() the data buffer is only freed when a2base64url
succeed. This patch moves the allocation so it free() the DER buffer in
every cases.
Must be backported to 3.2 and later.
Thanks to previous commits, it is possible to use small buffers at different
places: to store the request when a connection is queued or when L7 retries
are enabled, or for health-checks requests. However, there was no
configuration parameter to fine tune small buffer use.
It is now possible, thanks to the proxy option "use-small-buffers".
Documentation was updated accordingly.
If support for small buffers is enabled, we now try to use them for
healthcheck requests. First, we take care the tcpcheck ruleset may use small
buffers. Send rules using LF strings or too large data are excluded. The
ability to use small buffers or not are set on the ruleset. All send rules
of the ruleset must be compatible. This info is then transfer to server's
healthchecks relying on this ruleset.
Then, when a healthcheck is running, when a send rule is evaluated, if
possible, we try to use small buffers. On error, the ability to use small
buffers is removed and we retry with a regular buffer. It means on the first
error, the support is disabled for the healthcheck and all other runs will
use regular buffers.
In h2_rcv_buf(), HTX flags are transfer with data when htx_xfer() is
called. There is no reason to continue to deal with them in the H2 mux. In
addition, there is no reason to set SE_FL_EOI flag when a parsing error was
reported. This part was added before the stconn era. Nowadays, when an HTX
parsing error is reported, an error on the sedesc should also be reported.
This reverts commit 44932b6c41.
The patch above was only necessary to handle partial headers or trailers
parsing. There was nothing to prevent the H2 multiplexer to start to add
headers or trailers in an HTX message and to stop the processing on error,
leaving the HTX message with no EOH/EOT block.
From the HTX API point of view, it is unexepected. And this was fixed thanks
to the commit ba7dc46a9 ("BUG/MINOR: h2/h3: Never insert partial
headers/trailers in an HTX message").
So this patch can be reverted. It is important to not report a parsign error
too early, when there are still data to transfer to the upper layer.
This patch must be backport where 44932b6c4 was backported but only after
backporting ba7dc46a9 first.
When a HTX stream is queued, if the request is small enough, it is moved
into a small buffer. This should save memory on instances intensively using
queues.
Applet and connection receive function were update to block receive when a
small buffer is in use.
In the same way support for large chunks was added to properly work with
large buffers, we are now adding supports for small chunks because it is
possible to process small buffers.
So a dedicated memory pool is added to allocate small
chunks. alloc_small_trash_chunk() must be used to allocate a small
chunk. alloc_trash_chunk_sz() and free_trash_chunk() were uppdated to
support small chunks.
In addition, small trash buffers are also created, using the same mechanism
than for regular trash buffers. So three thread-local trash buffers are
created. get_small_trash_chunk() must be used to get a small trash buffer.
And get_trash_chunk_sz() was updated to also deal with small buffers.
htx_move_to_small_buffer()/htx_move_to_large_buffer() and
htx_copy_to_small_buffer()/htx_copy_to_large_buffer() functions can now be
used to move or copy blocks from a default buffer to a small or large
buffer. The destination buffer is allocated and then each blocks are
transferred into it.
These funtions relies in htx_xfer() function.
htx_xfer() function should replace htx_xfer_blks(). It will be a bit easier to
maintain and to use. The behavior of htx_xfer() can be changed by calling it
with specific flags:
* HTX_XFER_KEEP_SRC_BLKS: Blocks from the source message are just copied
* HTX_XFER_PARTIAL_HDRS_COPY: It is allowed to partially xfer headers or trailers
* HTX_XFER_HDRS_ONLY: only headers are xferred
By default (HTX_XFER_DEFAULT or 0), all blocks from the source message are moved
into to the destination mesage. So copied in the destination messageand removed
from the source message.
The caller must still define the maximum amount of data (including meta-data)
that can be xferred.
It is no longer necessary to specify a block type to stop the copy. Most of
time, with htx_xfer_blks(), this parameter was set to HTX_BLK_UNUSED. And
otherwise it was only specified to transfer headers.
It is important to not that the caller is responsible to verify the original
HTX message is well-formated. Especially, it must be sure headers part and
trailers part are complete (finished by EOH/EOT block).
For now, htx_xfer_blks() is not removed for compatiblity reason. But it is
deprecated.
When small buffer size was greater than the default buffer size, an error
was triggered. We now do the same than for large buffer. A warning is
emitted and the small buffer size is set to 0 do disable small buffer
allocation.
Because small buffers were only used by QUIC streams, the pool used to alloc
these buffers was located in the quic code. However, their usage will be
extended to other parts. So, the small buffers pool was moved into the
dynbuf part.
http-errors parsing has been refactored in a recent serie of patches.
However, a null deref was introduced by the following patch in case a
non-existent http-errors section is referenced by an "errorfiles"
directive.
commit 2ca7601c2d
MINOR/OPTIM: http_htx: lookup once http_errors section on check/init
Fix this by delaying ha_free() so that it is called after ha_alert().
No need to backport.
The cfg_parse_acme() function checks if an 'acme' section is already
existing in the configuration with cur_acme->linenum > 0. But the wrong
filename and line number are displayed in the commit message.
Must be backported to 3.2 and later.
This patch fixes a leak of the ext_san structure when
sk_X509_EXTENSION_push() failed. sk_X509_EXTENSION_pop_free() is already
suppose to free it, so ext_san must be set to NULL upon success to avoid
a double-free.
Must be backported to 3.2 and later.
Use proxy_check_http_errors() on defaults proxy instances. This will
emit alert messages for errorfiles directives referencing a non-existing
http-errors section, or a warning if an explicitely listed status code
is not present in the target section.
This is a small behavior changes, as previouly this was only performed
for regular proxies. Thus, errorfile/errorfiles directives in an unused
defaults were never checked.
This may prevent startup of haproxy with a configuration file previously
considered as valid. However, this change is considered as necessary to
be able to use http-errors with dynamic backends. Any invalid defaults
will be detected on startup, rather than having to discover it at
runtime via "add backend" invokation.
Thus, any restriction on http-errors usage is now lifted for the
creation of dynamic backends.
The previous patch has splitted the original proxy_check_errors()
function in two, so that check and init steps are performed separately.
However, this renders the code inefficient for "errorfiles" directive as
tree lookup on http-errors section is performed twice.
Optimize this by adding a reference to the section in conf_errors
structure. This is resolved during proxy_check_http_errors() and
proxy_finalize_http_errors() can reuse it.
No need to backport.
Function proxy_check_errors() is used when configuration parsing is
over. This patch splits it in two newly named ones.
The first function is named proxy_check_http_errors(). It is responsible
to check for the validity of any "errorfiles" directive which could
reference non-existent http-errors section or code not defined in such
section. This function is now called via proxy_finalize().
The second function is named proxy_finalize_http_errors(). It converts
each conf_errors type used during parsing in a proper http_reply type
for runtime usage. This function is still called via post-proxy-check,
after proxy_finalize().
This patch does not bring any functional change. However, it will become
necessary to ensure http-errors can be used as expected with dynamic
backends.
This patch is the second part of the refactoring for http-errors
parsing. It renames some fields in <conf_errors> structure to clarify
their usage. In particular, union variants are renamed "inl"/"section",
which better highlight the link with the newly defined enum
http_err_directive.
In conf_errors struct, arbitrary integer values were used for both
<type> field and <status> array. This renders the code difficult to
follow.
Replaces these values with proper enums type. Two new types are defined
for each of these fields. The first one represents the directive type,
derived from the keyword used (errorfile vs errorfiles). This directly
represents which part of <info> union should be manipulated.
The second enum is used for errorfiles directive with a reference on a
http-errors section. It indicates whether or not if a status code should
be imported from this section, and if this import is explicit or
implicit.
Several resources were leaked on both success and error paths:
- X509_NAME *nm was never freed. X509_REQ_set_subject_name() makes
an internal copy, so nm must be freed separately by the caller.
- str_san allocated via my_strndup() was never freed on either path.
- On error paths after allocation, x (X509_REQ) and exts
(STACK_OF(X509_EXTENSION)) were also leaked.
Fix this by adding proper cleanup of all allocated resources in both
the success and error paths. Also move sk_X509_EXTENSION_pop_free()
after X509_REQ_sign() so it is not skipped when sign fails, and
initialize nm to NULL to make early error paths safe.
Must be backported as far as 3.2.