The direct fowarding support was only relying on "hs->to_write" value. But
we must be sure to retry if fast-forward data are still there in the I/O
buffer.
No client timeout was set with haterm. It could be an issue with
unresponsive clients. So the I/O timeout of the SC is initialized to the
frontend client timeout when the hstream is created. Then a read activity is
reported when data are received. This read activity is used to set an
expiration date on the hstream task and test it when the hstream is woken up
with TASK_WOKEN_TIMER reason.
When a client timeout is detected, the hstream try to send a 408 and report
an error.
There were several issues with the handling of end of the request or end of
response. The main problem was about the request draining.
To help to fix these issues, two flags were introduced:
* HS_ST_HTTP_EOM_RCVD: to know the request was fully received
* HS_ST_HTTP_EOM_SENT: to know the response was fully sent
Thanks to these flags some parts were reviewed and simplified.
In the I/O callback function, outside of any error, the hstream task is now
woken when one of the direction is not finished or when there are still some
data in a buffer.
The function hstream_must_drain() was reworked to properly drain request
with no content-length before replying.
The condition to wake hstream up to drain the request after replying was
also reworked, and moved ouside of the else block. Indeed, it must also be
evaluated when the response was fully sent in one call, when request headers
were processed.
Finally, the condition to shut the hstream was slighly adapted to use the
new flags. In addition, we now rely on se_shutdown().
When draining the request, if some data were received, no subscribe for
receives was performed to get the remaining. However, because request data
are just ignored, we must always subscribe until it was fully
drained. Otherwise, haterm will never be woken up to drain more data.
When request data were received, the request body length was decremented
accordingly with no check on it to be sure it was set. However, it remains
equal to 0 for chunked requests or H2/H3 requests with no content-length.
So now, it is only decremented when it is greater than 0.
When draining the request data, if an error was reported while some data
were received, the error was not processed immediately. This part was copied
from tcpchecks where the response should be processed first. For haterm, the
request data are ignored. So no reason to wait to handle the error. It may
be an issue because the response may be sent in the meanwhile.
When haterm was waiting for request headers, there was two test to know if
it had to drain the request data before replying. One of them was useless
and was thus removed.
In h1_parse_full_contig_chunks(), we first try to reserve the bigger HTX
DATA block as possible. It is ajusted at the end of chunks parsing or
removed if no data was copied. However, it should also be removed when a
parsing error is triggered. It could be an issue for http health checks and
haterm to properly handle errors.
This patch should be backported as far as 2.6.
When http_auth_bearer() sample fetch function is called with a custom header
and the header is not found or type didn't match 'Bearer', a mismatch must
be reported instead of an empty string.
This patch should be backported as far as 2.6.
Several return value for chunk_istcat() or chunk_memcat() calls were not
tested. Now, 0 is returned on failure.
Concretly, for now, it is unexpected to trigger error because the result
cannot exceed the buffer size. Data are extracted from an HTX message.
At first glance, no reason to backport it.
According to ACME RFC contact email is optional.
Letsencrypt used it some long time ago, but not today.
Currently HAProxy always sets the value of the contact mail to a string
that is read from the config, but if that string is not specified,
it sets %s in mailto:%s to null, which cases new account request
to fail in pebble.
Also HAProxy currently passes termsOfServiceAgreed bool to requests
that contain onlyReturnExisting, that isn't needed according to the RFC
and other ACME impls.
This patch dynamically builds the account request JSON to address that.
Can be backported to 3.2
As reported by Huangbin Zhan in github issue #3355, we're too lax on
the :protocol pseudo header. It is currently accepted with regular
CONNECT as well as non-CONNECT methods while it only ought to be
accepted with extended CONNECT (i.e. CONNECT after the connection
negotiated the RFC8441 extension). Let's refine the check in H2 by
leveraging the new flag H2_MSGF_EXT_CONN_OK that is passed by the
caller when the connection supports the extension. This is sufficient
to sort the various cases.
The proto upgrade regtest was updated to verify that CONNECT with
:protocol without nego and another method with nego and :protocol
both fail.
Thanks to Huangbin Zhan (@zhanhb) for the report and helpful reproducer.
This needs to be backported to all versions. It relies on these patches
first:
REGTESTS: http-messaging: always send RFC8441 client settings to use ext connect
BUG/MINOR: mux-h2: condition the processing of 8441 extension to global setting
MINOR: mux-h2: add a new message flag to indicate ext connect support
When rfc8441 (extended connect) is disabled via
h2-workaround-bogus-websocket-clients, we properly refrain from
advertising support for extended connect, but we should also ignore
the incoming setting, otherwise it can remain enabled if the client
advertises it.
This should be backported to stable versions.
Function h2_phdr_to_list() was missing the decoding for the :protocol
header and would emit :UNKNOWN in this case. It's only used in traces
so it's not important. The fix can be backported in all versions.
The tests were validating extended connect without sending the setting
in the client settings frame. It currently works due to a bug, so let's
fix the vtc first.
All ci_insert() calls in pcli_parse_request() were ignoring the return
value, silently miscounting the bytes to forward if an insertion failed.
Add a check on each call and return -1 on failure. In practice this has
no impact: the channel receive machinery enforces a maxrewrite reserve,
so there is always sufficient room in the buffer for these small
prefixes by the time pcli_parse_request() is called.
Must be backported in every maintained versions, the list of available
commands might change.
When @@<pid> is matched in pcli_parse_request(), no "operator -" or
"user -" is being sent before the command, like it's done for @<pid>.
It leads to privileges not being respected and commands are sent as
admin.
Fix this by applying the access-level downgrade in the @@<pid> path,
like it's done for @<pid>.
Must be backported to 3.2.
Reported-by: Omkhar Arasaratnam <omkhar@linkedin.com>
This test first performs two successive requests over the same
connection where reuse is expected, then perform two 401 which must
both work, testing both the transition from null->sess, and sess->sess.
This test could be backported to detect changes related to private
sessions.
Thanks to Omkhar Arasaratnam for the test.
During the architectural review leading to commit 90b2154d93 ("MEDIUM:
muxes: always set conn->owner to the session that owns the connection"),
we wondered whether srv_conn->owner could ever be NULL or even invalid,
or if it ought to be changed to sess, and the projections of various use
cases, as well as a number of attempts to fool it led us to conclude
that it was always valid since the connection is private. So it was
considered safer not to start fiddling with the pointer in case it
could still match a previous session after a reuse, which would match
the scenario described in the session_add_conn() comment.
Actually there was exactly one case where a NULL could be met, and that
was covered by the preliminary call to conn_set_owner() that was removed
in that patch, precisely related to the one that the next patch tried to
address: in http-reuse always, after the second request on a connection
releases the connection, the owner can now become NULL, so if an NTLM
header is seen at this point, it will crash.
Interestingly, after the immediately following commit was merged,
d93c53b0df ("MEDIUM: session: always reset the conn->owner on backend
when installing mux"), the problem became immediate as the conn's
owner is now null during operation if the connection is not private, and
now the first response in NTLM is sufficient to crash the process. On
the other hand, thanks to the two patches above, we're now certain never
to meet a different session, which was the sought goal: either the session
is normal and it has no owner, or it's private and it has <sess> as owner.
Also with HTTP/1 (since the code explicitly checks for H1), there may be
a single request at a time on a connection so the owner should either be
the session or NULL.
So this patch finally implements the original plan, to pass <sess> to
session_add_conn(). The call is idempotent if the owner is already set,
but at least the function performs some preliminary sanity checks which
are quite welcome, so better continue to always call it.
Note that this is only for 3.4 or any branch that has exactly the two
patches above. And if the patches above were to ever be backported
(together), this one would be needed as well.
Thanks to Omkhar Arasaratnam for reporting this regression.
AES-CBC uses a 16-byte block size, and PKCS padding always adds at least
one byte (up to a full 16 bytes when input is already block-aligned), so
the encrypted output is always larger than the input. Without checking
that the output buffer can hold the padded result, encryption could
overflow it. Add a pre-encryption guard for block cipher (blksize > 1)
that rejects the operation when the output buffer is too small.
No backport needed.
Reported-by: Omkhar Arasaratnam <omkhar@linkedin.com>
The exact same issue that was fixed for ip.fp with commit dbf471f99a
("BUG/MAJOR: net_helper: ip.fp infinite loop on malformed tcp options")
also exists for tcp_options_list: an option with a zero size will prevent
the offset from making any progress and will loop forever. In this case,
since we produce output, trash->data gets incremented on each loop and
the byte at ofs is used to fill the memory there, quickly overflowing the
area.
The exact same fix as above was applied here again, including the uchar
cast to make sure we don't go back-and-forth between two positions.
Such TCP options are not valid and will not be reported by the TCP stack,
however they can happen in case options are passed in headers by a first
proxy, or are offered in a return directive fed from a query string as a
means of providing a debugging tool for admins.
Thanks to Omkhar Arasaratnam for reporting this issue.
This fix must be backported to 3.2 where the commits above were
backported.
The error handler at do_resolve_parse_error freed varname and resolvers_id
but missed freeing rule->arg.resolv.opts (allocated via calloc). Added
ha_free(&rule->arg.resolv.opts) to the cleanup path.
This patch could be backported to all stable branches.
In resolv_check_response(), the condition 'if (!srvrq->named_servers)' was
inverted - it ran the named server lookup when the tree was empty/NULL
instead of when it was populated. This prevented proper hostname-based
matching of SRV records to servers from the state file.
The issue was introduced when tree of servers was replaced by a cebis tree
(fdf6fd5b45).
This patch must be backported to 3.3.
If an error is triggered while the requester was allocated, it is not
immediately released. It is not really a memory leak because the requester
will be reused laster, on a next attempt, or released on deinit. For a
do-resolv action, the requester is released with the stream. So no leak
here. But it is probably not expected to keep a newly allocated requester in
case of error.
This patch could be backported as far as 2.6.
When an error is reported on an expect rule for tcp and http health-checks,
a dedicated message is reported with details about the wrong match. However,
this was never performed for HTTP health-check because of the wrong test on
the check type.
In addition, when an error was reported on an "expect hdr" rule, a break
statement was missing. So the error message could be truncated (but never
emitted because of the issue above).
This patch should be backported to all stable versions.
When we removed the use of connect() to reach DNS servers in 3.3 with
commit 2c7e05f80e ("MEDIUM: dns: don't call connect to dest socket for
AF_INET*"), we accidentally lost a check on the server's address in
responses, opening the possibility of spoofed DNS responses for someone
who knows both haproxy's IP:port and the transaction ID.
In practice, the impact is very low, because:
- DNS servers IP addresses are almost always known, and often even among
the widely used ones (1.1.1.1, 8.8.8.8 etc), and their port is 53.
- all DNS "security" relies on the ignorance of the transaction ID and
is possible source port, so if either the attacker is on-path and sees
them, or it's off-path and has to guess them, but in any case it's
trivial to spoof the known server in responses, with or without the
check.
Regardless, let's not further weaken the protocol and do the check.
Thanks to Omkhar Arasaratnam for reporting this issue.
An interesting observation while testing this fix was that the code does
support UNIX dgram sockets (via connect()) but that since we don't bind
to a local UNIX socket to send requests, the server's recvfrom() doesn't
get any address and has nowhere to respond to. So in practice while the
code is designed to deal with UNIX sockets, these cannot work by design.
This fix must be backported to 3.2 where the commit above was backported.
Both keywords are used with dns-01 and dns-persist-01 challenges to pass
information to an external DNS provisioning tool (e.g. the dataplaneAPI)
via the "dpapi" sink. provider-name sets the DNS provider identifier and
acme-vars passes arbitrary tool-specific variables.
Thanks to @oliwer for reporting the issue.
Must be backported to 3.2, however previous version don't have
"dns-persist-01".
The docs (readme and configuration.txt) still used to mention that OTEL
was under development. Now that it's released, let's indicate that it's
ready with the download URL.
Commit d12edebe4a ("BUG/MAJOR: mux-h2: detect incomplete transfers on
HEADERS frames as well") tried to enforce strict matching between
advertised content-length and transferred data when dealing with ES on
a headers frame. It purposely arranged the code so that it would cover
both headers and trailers. The problem is, in h2c_dec_hdrs() we preset
message flags (msgf) based on the current state and knowledge related
to the stream being processed, then we pass these flags to the headers
parser and use their final state to perform some extra checks.
MSGF_BODY_CL was set by the parsers themselves when processing a
content-length header, but when parsing a trailers frame, it will not
be set, and due to this the matching between the remaining expected
content-length and the transferred data is not verified, so the fix
above doesn't work for trailers.
This patch sets MSGF_BODY_CL to the same value as H2_SF_DATA_CLEN so
that during headers it remains zero, but it matches what was learned
during headers when processing trailers. It is sufficient to re-enable
the check that was attempted in the commit above.
The impact remains the same as the one indicated in the commit above: in
practice this can be used to force subsequent requests to fail, or when
running with "http-reuse never" or when running with a totally idle server,
to perform a request smuggling by constructing specially crafted request
pairs where the first one is used to trigger an early response and hide
parts of or all headers of the second one, to instead use a second
embedded one that was not subject to analysis. As such, the risk remains
moderate given the low prevalence of "http-reuse never" in production
environments, and of idle servers. Again, a temporary alternative to the
fix is to disable HTTP/2 by specifying "alpn http/1.1" on "bind" lines,
and adding "option disable-h2-upgrade" in HTTP frontends.
Many thanks to Pratham Gupta / alchemy1729 for spotting and analyzing
this problem, and for providing a lightweight reproducer to illustrate
the problem!
This fix must be backported to all versions where the fix above was
backported (i.e. all). Note that it depends on this previous commit
otherwise trailers will always break:
BUG/MEDIUM: mux-h2: fix the body_len to check when parsing request trailers
As a side note, it's worth noting that these temporary message flags have
reached a level of pain and fragility that really warrants a complete
rework. Ideally we should have a pair of such flags in the h2s (one per
direction) and the callers of the parsers should point to them so that
they are always up to date. And by having generic HTTP flags instead of
H2, we could better unify the h1/h2/h3/fcgi processors (and maybe avoid
some HTX conversion). One flag could even indicate that trailers are
being parsed (since they're last) so as to ease this detection down the
chain.
The h2 content-length validation in commit d12edebe4a ("BUG/MAJOR:
mux-h2: detect incomplete transfers on HEADERS frames as well") was
insufficient. The content-length check is still ineffective on request
trailers and it could not work by default due to the fact that the
default body_len is used in h2c_frt_handle_headers() when processing
trailers, instead of passing h2s->body_len, which was necessarily parsed
before reaching trailers. Let's fix this point first, otherwise fixing
the second issue would break trailers.
Many thanks to Pratham Gupta / alchemy1729 for spotting and analyzing
this problem, and for providing a lightweight reproducer to illustrate
the problem!
This fix must be backported to all versions where the fix above was
backported (i.e. all).
The opentelemetry addons now live outside haproxy sources and is
available at https://github.com/haproxytech/haproxy-opentelemetry/
The addon must be built using the EXTRA_MAKE option from HAProxy
Makefile:
$ PKG_CONFIG_PATH=/opt/lib/pkgconfig make -j8 TARGET=linux-glibc
EXTRA_MAKE="/tmp/a/a/haproxy-opentelemetry" OTEL_DEBUG=1 OTEL_USE_VARS=1
The OpenTelemetry filter has been moved out of the haproxy source tree and
now lives and is developed in the haproxy-opentelemetry repository as an
external addon. Removed all hard-coded references to USE_OTEL and to the
addons/otel directory from the main Makefile, as the addon is now plugged
in through the generic EXTRA_MAKE hook added in the previous commit.
Allow to call an external Makefile called Makefile.inc in order to build
complex addons.
make TARGET=linux-glibc ... EXTRA_MAKE="/path/to/addon1" \
EXTRA_MAKE+="/path/to/addon2"
The historical code dealing with timeout was left with a confusing
condition that is always true but always requires some analysis. It
would check if some streams were left pending before deciding to
release the connection. It could indeed be problematic to leave with
no timeout and an active connection!
As Christopher figured, the situation cannot exist because a first
check ensures there's no more h2c via h2c_may_expire(), then a call
to h2_wake_some_streams() will call h2s_wake_one_stream() for each
of the h2s, and all those with no h2c are purged. Thus on return
from h2_wake_some_streams() we're guaranteed to have an empty tree.
Let's just remove the condition and clean up the code.
Recently, datagram reception architecture has been completely reworked
to improve performance. A regression has been introduced when using
traces in qc_rcv_buf() : datagram argument is uninitialized after recv
syscall. This may cause a crash as CIDs buffer is dereferenced.
Fix this by removing dgram argument from the affected trace. A new trace
is added after quic_dgram_init() to keep the ability to display the
received content.
This issue has caused failure of all QUIC interop testing.
No need to backport.
In ACME_INITIAL_RSLV_READY, when the opportunistic DNS propagation check
fails and the code falls back to ACME_CLI_WAIT, ACME_RDY_INITIAL_DNS was
left set in cond_ready. Since the CLI-wait path only ever sets ACME_RDY_CLI
on auth->ready, the readiness check in ACME_CLI_WAIT could never be
satisfied, permanently stalling certificate renewal.
Fix this by stripping ACME_RDY_INITIAL_DNS from cond_ready before falling
back to the regular CLI-wait flow. Also replace the &= with a plain
assignment in the success path to make the intent explicit.
No backport needed, 3.4 only.
New type sockaddr_in46 has been recently introduced. It serves as a
union which can store either an IPv4 or IPv6 address. The objective is
to reduce the storage size for QUIC datagrams which previously uses a
sockaddr_storage field.
On qc_new_conn(), source and destination addresses from the datagram are
passed to the function as sockaddr_storage so that they are copied into
the newly built quic_conn instance. However, the involved memcpy() is
producing a buffer overflow as sockaddr_in46 is smaller than
sockaddr_storage type.
This patch fixes this by defining a new helper function
in46un_to_addr(). This allows to convert safely sockaddr_in46 to a plain
sockaddr type. The function is now used before invoking qc_new_conn().
Note that there is still other several places where union sockaddr_in46
is casted as sockaddr_storage type. However, these should be safe as in
these cases sockaddr fields are accessed individually after checking
ss_family. The memory issue only exists when plain memcpy is used.
This bug was detected using ASAN. It generates the following traces when
a QUIC connection is instantiated.
==37474==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x7c3bb9a61100 at pc 0x5631f52c3946 bp 0x7ffc83e45b50 sp 0x7ffc83e45310
READ of size 128 at 0x7c3bb9a61100 thread T0
#0 0x5631f52c3945 in __asan_memcpy (/home/amaury/work/haproxy-quic-dev/haproxy+0x3ae945) (BuildId: a7ccfd74b7a71a869b8ff8d13f6dcde8c82c1487)
#1 0x5631f55f9e34 in qc_new_conn /home/amaury/work/haproxy-quic-dev/src/quic_conn.c:1311:2
#2 0x5631f558d5c3 in quic_rx_pkt_retrieve_conn /home/amaury/work/haproxy-quic-dev/src/quic_rx.c:1875:10
#3 0x5631f558330b in quic_dgram_parse /home/amaury/work/haproxy-quic-dev/src/quic_rx.c:2463:29
#4 0x5631f5625da6 in quic_lstnr_dghdlr /home/amaury/work/haproxy-quic-dev/src/quic_sock.c:206:3
#5 0x5631f6a64173 in run_tasks_from_lists /home/amaury/work/haproxy-quic-dev/src/task.c:660:26
#6 0x5631f6a6ba1e in process_runnable_tasks /home/amaury/work/haproxy-quic-dev/src/task.c:913:9
#7 0x5631f5e984c3 in run_poll_loop /home/amaury/work/haproxy-quic-dev/src/haproxy.c:2982:3
#8 0x5631f5e9a715 in run_thread_poll_loop /home/amaury/work/haproxy-quic-dev/src/haproxy.c:3212:2
#9 0x5631f5e9f732 in main /home/amaury/work/haproxy-quic-dev/src/haproxy.c:3853:2
#10 0x7f2bba8276c0 (/usr/lib/libc.so.6+0x276c0) (BuildId: ca0db5ab57a36507d61bbcf4988d344974331f19)
#11 0x7f2bba8277f8 in __libc_start_main (/usr/lib/libc.so.6+0x277f8) (BuildId: ca0db5ab57a36507d61bbcf4988d344974331f19)
#12 0x5631f51be594 in _start (/home/amaury/work/haproxy-quic-dev/haproxy+0x2a9594) (BuildId: a7ccfd74b7a71a869b8ff8d13f6dcde8c82c1487)
No need to backport.
aclkw->match_type was compared twice to PAT_MATCH_DOM in parse_acl_expr().
After auditing the involved types, it's only a copy-paste mistake, as no
other matching method is missing, so let's drop it to avoid the confusion.
Also drop variable ckw which is assigned NULL and passed to free(), and is
clearly a leftover from a previous version.
No backport needed.
In pattern_read_from_file(), in case of failure to load from the file,
the newly allocated reference remains attached to the list and is never
freed. Let's just do it.
This can be backported to all versions since it arrived in 1.5 with
commit 1e00d3853b ("MAJOR: pattern/map: Extends the map edition system
in the patterns").
Maps can leak a map descriptor in sample_load_map() on error. This is
harmless since the process won't start after this, and this cannot be
used at run time. but it's cleaner to fix it. This can be backported.