Complete qc_send function. After having processed each qcs emission, it
will now retry send on qcs where transfer can continue. This is useful
when qc_stream_desc buffer is full and there is still data present in
qcs buf.
To implement this, each eligible qcs is inserted in a new list
<qcc.send_retry_list>. This is done on send notification from the
transport layer through qcc_streams_sent_done(). Retry emission until
send_retry_list is empty or the transport layer cannot proceed more
data.
Several send operations are now called on two different places. Thus a
new _qc_send_qcs() function is defined to factorize the code.
This change should maximize the throughput during QUIC transfers.
MUX streams can now allocate multiple buffers for sending. quic-conn is
responsible to limit the total count of allowed allocated buffers. A
counter is stored in the new field <stream_buf_count>.
For the moment, the value is hardcoded to 30.
On stream buffer allocation failure, the qcc MUX is flagged with
QC_CF_CONN_FULL. The MUX is then woken up as soon as a buffer is freed,
most notably on ACK reception.
Acknowledge of STREAM has been complexified with the introduction of
stream multi buffers. Two functions are executing roughly the same set
of instructions in xprt_quic.c.
To simplify this, move the code complexity in a new function
qc_stream_desc_ack(). It will handle offset calculation, removal of
data, freeing oldest buffer and freeing stream instance if required.
The qc_stream_desc API is cleaner as qc_stream_desc_free_buf() ambiguous
function has been removed.
Complete the qc_stream_desc type to support multiple buffers on
emission. The main objective is to increase the transfer throughput.
The MUX is now able to transfer more data without having to wait ACKs.
To implement this feature, a new type qc_stream_buf is declared. it
encapsulates a buffer with a list element. New functions are defined to
retrieve the current buffer, release it or allocate a new one. Each
buffer is kept in the qc_stream_desc list until all of its data is
acknowledged.
On the MUX side, a qcs uses the current stream buffer to transfer data.
Once the buffer is full, it is released and a new one will be allocated
on a future qc_send() invocation.
Add a new member <qc> in qc_stream_desc structure. This change is
possible since previous patch which add quic-conn argument to
qc_stream_desc_new().
The purpose of this change is to simplify the future evolution of
qc-stream-desc API. This will avoid to repeat qc as argument in various
functions which already used a qc_stream_desc.
Simplify the model qcs/qc_stream_desc. Each types has now its own tree
node, stored respectively in qcc and quic-conn trees. It is still
necessary to mark the stream as detached by the MUX once all data is
transfered to the lower layer.
This might improve slightly the performance on ACK management as now
only the lookup in quic-conn is necessary. On the other hand, memory
size of qcs structure is increased.
Regroup all type definitions and functions related to qc_stream_desc in
the source file src/quic_stream.c.
qc_stream_desc complexity will be increased with the development of Tx
multi-buffers. Having a dedicated module is useful to mix it with
pure transport/quic-conn code.
DHE ciphers do not present a security risk if the key is big enough but
they are slow and mostly obsoleted by ECDHE. This patch removes any
default DH parameters. This will effectively disable all DHE ciphers
unless a global ssl-dh-param-file is defined, or
tune.ssl.default-dh-param is set, or a frontend has DH parameters
included in its PEM certificate. In this latter case, only the frontends
that have DH parameters will have DHE ciphers enabled.
Adding explicitely a DHE ciphers in a "bind" line will not be enough to
actually enable DHE. We would still need to know which DH parameters to
use so one of the three conditions described above must be met.
This request was described in GitHub issue #1604.
MacOS can feed fc_rtt, fc_rttvar, fc_sacked, fc_lost and fc_retrans
so let's expose them on this platform.
Note that at the tcp(7) level, the API is slightly different, as
struct tcp_info is called tcp_connection_info and TCP_INFO is
called TCP_CONNECTION_INFO, so for convenience these ones were
defined to point to their equivalent. However there is a small
difference now in that tcpi_rtt is called tcpi_rttcur on this
platform, which forces us to make a special case for it before
other platforms.
The two recent patches b12966af1 ("BUILD: debug: mark the
__start_mem_stats/__stop_mem_stats symbols as weak") and 2a06e248f
("BUILD: initcall: mark the __start_i_* symbols as weak, not global")
aimed at fixing a build warning and resulted in a build breakage on
MacOS which doesn't have a ".weak" asm statement.
We've already had MacOS-specific asm() statements for section names, so
this patch continues on this trend by moving HA_GLOBL() to compiler.h
and using ".globl" on MacOS since apparently nobody complains there.
It is debatable whether to expose this only when !USE_OBSOLETE_LINKER
or all the time, but since these are just macroes it's no big deal to
let them be available when needed and let the caller decide on the
build conditions.
If any of the patches above is backported, this one will need to as
well.
Ilya reported in issue #1638 that Clang 14 has invented a new warning
that encourages to modify the code in a way that is not always
equivalent, by turning "|" to "||" between some logical operators,
except that the first one guarantees that all members of the expression
will always be evaluated while the latter will stop at the first one
which is true!
This warning triggers in thread_has_tasks(), which is not sensitive to
such change of behavior but which is built this way because it results
in branchless code for something that most often evaluates to false for
all terms. As such it was out of question to turn this to less efficient
compare-and-jump that needlessly pollute the branch predictor, so the
workaround consists in casting each expression to (int). It was verified
that the code is the same.
Yet another example of how-to-introduce-bugs-by-fixing-valid-code
through warnings invented around a beer without thinking longer!
This may need to be backported to a few older branches in case this
compiler lands in recent distros or if gcc finds it wise to imitate it.
Just like for previous fix, these symbols are marked ".globl" during
their declaration, but their later mention uses __attribute__((weak)),
so it's better to only use ".weak" during the declaration so that the
symbol's class does not change.
No need to backport this unless someone reports build issues.
Building with clang and DEBUG_MEM_STATS shows the following warnings:
warning: __start_mem_stats changed binding to STB_WEAK [-Wsource-mgr]
warning: __stop_mem_stats changed binding to STB_WEAK [-Wsource-mgr]
The reason is that the symbols are declared using ".globl" while they
are also referenced as __attribute__((weak)) elsewhere. It turns out
that a weak symbol is implicitly a global one and that the two classes
are exclusive, thus it may confuse the linker. Better fix this.
This may be backported where the patch applies.
cs_conn_io_cb(), cs_conn_sync_recv() and cs_conn_sync_send() are moved in
conn_stream.c. Associated functions are moved too (cs_notify, cs_conn_read0,
cs_conn_recv, cs_conn_send and cs_conn_process).
Remaining flags and associated functions are move in the conn-stream
scope. These flags are added on the endpoint and not the conn-stream
itself. This way it will be possible to get them from the mux or the
applet. The functions to get or set these flags are renamed accordingly with
the "cs_" prefix and updated to manipualte a conn-stream instead of a
stream-interface.
si_conn_cb variable is renamed cs_data_conn_cb. In addtion, its associated
functions are also renamed. si_cs_recv(), si_cs_send() and si_cs_process() are
renamed cs_conn_recv(), cs_conn_send and cs_conn_process(). These functions are
updated to manipulate conn-streams instead of stream-interfaces.
data callbacks were only used for streams attached to a connection and
for health-checks. However there is a callback used by task_run_applet. So,
si_applet_wake_cb() is first renamed to cs_applet_process() and it is
defined as the data callback for streams attached to an applet. This way,
this part now manipulates a conn-stream instead of a stream-interface. In
addition, applets are no longer handled as an exception for this part.
si_update_both() is renamed stream_update_both_cs() and moved in stream.c.
The function is slightly changed to manipulate the stream instead the front
and back conn-streams.
si_update_rx(), si_update_tx() and si_update() are renamed cs_update_rx(),
cs_upate_tx() and cs_update() and updated to manipulate a conn-stream
instead of a stream-interface.
It is a transient commit. It should ease next changes about the conn-stream
refactoring. At the end these functions will be moved in the conn-stream
scope.
si_register_applet() and si_applet_release() are renamed
cs_register_applet() and cs_applet_release() and now manipulate a
conn-stream instead of a stream-inteface.
si_shutr(), si_shutw(), si_chk_rcv() and si_chk_snd() are moved in the
conn-stream scope and renamed, respectively, cs_shutr(), cs_shutw(),
cs_chk_rcv(), cs_chk_snd() and manipulate a conn-stream instead of a
stream-interface.
Some conn-stream functions are only used when there is a connection. Thus,
they was renamed with "cs_conn_" prefix. In addition, we expect to have a
connection, so a BUG_ON is added to be sure the functions are never called
in another context.
wait_event structure is moved in the conn-stream. The tasklet is only
created if the conn-stream is attached to a mux and released when the mux is
detached. This implies a subtle change. In stream_int_chk_rcv() function,
the wakeup of the tasklet was removed because there is no longer tasklet at
this stage (stream_int_chk_rcv() is a callback function of si_embedded_ops).
To be able to move wait_event from the stream-interface to the conn-stream,
we must be prepare to handle errors when a mux is attached to a conn-stream.
Indeed, the wait_event's tasklet will be allocated when both a mux and a
stream will be both attached to a stream. So, we must be prepared to handle
allocation errors.
These flags only concerns the connection part. In addition, it is required
for a next commit, to avoid circular deps. Thus CS_SHR_* and CS_SHW_* were
renamed with the "CO_" prefix.
si_connect() is moved in backend.c and renamed as do_connect_server(). In
addition, the function now manipulate a stream instead of a
stream-interface.
si_retnclose() is used to send a reply to a client before closing. There is
no use on the server side, in spite of the function is generic. Thus, it is
renamed stream_retnclose() and moved into the stream scope. The function now
handle a stream and explicitly send a message to the client.
The stream-interface state (SI_ST_*) is now in the conn-stream. It is a
mechanical replacement for now. Nothing special. SI_ST_* and SI_SB_* were
renamed accordingly. Utils functions to manipulate these infos were moved
under the conn-stream scope.
But it could be good to keep in mind that this part should be
reworked. Indeed, at the CS level, we only need to know if it is ready to
receive or to send. The state of conn-stream from INI to EST is only used on
the server side. The client CS is immediately set to EST. Thus current
SI_ST_* states should probably be moved to the stream to reflect the server
connection state during the establishment stage.
Only the server side is concerned by the stream-interface error type. It is
useless to have an err_type field on the client side. So, it is now move to
the stream. SI_ET_* are renames STRM_ET_* and moved in stream-t.h header
file.
The previous connection state on the client side was only used for debugging
purpose to report client close. But this may be handled when the client
stream-interface is switched from SI_ST_DIS to SI_ST_CLO.
So, there only remains the previous connection state on the server side that
is used by the stream, in process_stream(), to be able to set the correct
termination flags. Thus, instead of keeping this info in the
stream-interface for only one side, the info is now stored in the stream
itself.
Flag to get the source ip/port with getsockname is now handled at the stream
level. Thus SI_FL_SRC_ADDR stream-int flag is replaced by SF_SRC_ADDR stream
flag.
Flag to consider a stream as indepenent is now handled at the conn-stream
level. Thus SI_FL_INDEP_STR stream-int flag is replaced by CS_FL_INDEP_STR
conn-stream flags.
Flag to not wake the stream up on I/O is now handled at the conn-stream
level. Thus SI_FL_DONT_WAKE stream-int flag is replaced by CS_FL_DONT_WAKE
conn-stream flags.
Flags to disable lingering and half-close are now handled at the conn-stream
level. Thus SI_FL_NOLINGER and SI_FL_NOHALF stream-int flags are replaced by
CS_FL_NOLINGER and CS_FL_NOHALF conn-stream flags.
Instead of setting a stream-interface flag to then set the corresponding
conn-stream endpoint flag, we now only rely the conn-stream endoint. Thus
SI_FL_KILL_CON is replaced by CS_EP_KILL_CONN.
In addition si_must_kill_conn() is replaced by cs_must_kill_conn().
Instead of relying on the conn-stream error, via CS_FL_ERR flags, we now
directly use the error at the endpoint level with the flag CS_EP_ERROR. It
should be safe to do so. But we must be careful because it is still possible
that an error is processed too early. Anyway, a conn-stream has always a
valid endpoint, maybe detached from any endpoint, but valid.
SI_FL_ERR is removed and replaced by CS_FL_ERROR. It is a transient patch
because the idea is to rely on the endpoint to handle errors at this
level. But if for any reason it is not possible, the stream-interface flags
will still be replaced.
The expiration date in the stream-interface was only used on the server side
to set the connect, queue or turn-around timeout. It was checked on the
frontend stream-interface, but never used concretely. So it was removed and
replaced by a connect expiration date in the stream itself. Thus, SI_FL_EXP
flag in stream-interfaces is replaced by a stream flag, SF_CONN_EXP.
The source and destination addresses at the applicative layer are moved from
the stream-interface to the conn-stream. This simplifies a bit the code and
it is a logicial step to remove the stream-interface.
The conn_retries counter was set to the max value and decremented at each
connection retry. Thus the counter reflected the number of retries left and
not the real number of retries. All calculations of redispatch or reporting
of number of retries experienced were made using subtracts from the
configured retries, which was complicated and didn't bring any benefit.
Now, this counter is set to 0 and incremented at each retry. We know we've
reached the maximum allowed connection retries by comparing it to the
configured value. In all other cases, we directly use the counter.
This patch should address the feature request #1608.
The conn_retries counter may be moved into the stream structure. It only
concerns the connection establishment. The frontend stream-interface does not
use it. So it is a logical change.
The L7 retries only concerns the stream when a server connection is
established. Thus instead of storing the L7 buffer into the
stream-interface, it may be moved to the stream. And because it is only
available for HTTP streams, it may be moved in the HTTP transaction.
Associated flags are also moved into the HTTP transaction.
At many places, we now use the new CS functions to get a stream or a channel
from a conn-stream instead of using the stream-interface API. It is the
first step to reduce the scope of the stream-interfaces. The main change
here is about the applet I/O callback functions. Before the refactoring, the
stream-interface was the appctx owner. Thus, it was heavily used. Now, as
far as possible,the conn-stream is used. Of course, it remains many calls to
the stream-interface API.
cs_utils.h header file will contain all util functions related to the
conn_streams. For now, few functions were added, all are equivalent to SI
functions. Idea is to progressively replace SI functions by CS ones.
CS_FL_ISBACK is a new flag, set on backend conn-streams. We must just be
careful to preserve this flag when the endpoint is detached from the
conn-stream.
All old flags CS_FL_* are now moved in the endpoint scope and renamed
CS_EP_* accordingly. It is a systematic replacement. There is no true change
except for the health-check and the endpoint reset. Here it is a bit special
because the same conn-stream is reused. Thus, we must handle endpoint
allocation errors. To do so, cs_reset_endp() has been adapted.
Thanks to this last change, it will now be possible to simplify the
multiplexer and probably the applets too. A review must also be performed to
remove some flags in the channel or the stream-interface. The HTX will
probably be simplified too. Finally, there is now some place in the
conn-stream to move info from the stream-interface.
The conn-stream endpoint is now shared between the conn-stream and the
applet or the multiplexer. If the mux or the applet is created first, it is
responsible to also create the endpoint and share it with the conn-stream.
If the conn-stream is created first, it is the opposite.
When the endpoint is only owned by an applet or a mux, it is called an
orphan endpoint (there is no conn-stream). When it is only owned by a
conn-stream, it is called a detached endpoint (there is no mux/applet).
The last entity that owns an endpoint is responsible to release it. When a
mux or an applet is detached from a conn-stream, the conn-stream
relinquishes the endpoint to recreate a new one. This way, the endpoint
state is never lost for the mux or the applet.
It is a transient commit to prepare next changes. Now, when a conn-stream is
created from an applet or a multiplexer, an endpoint is always provided. In
addition, the API to create a conn-stream was specialized to have one
function per type.
The next step will be to share the endpoint structure.
It is a transient commit to prepare next changes. It is possible to pass a
pre-allocated endpoint to create a new conn-stream. If it is NULL, a new
endpoint is created, otherwise the existing one is used. There no more
change at the conn-stream level.
In the applets, all conn-stream are created with no pre-allocated
endpoint. But for multiplexers, an endpoint is systematically created before
creating the conn-stream.
Some CS flags, only related to the endpoint, are moved into the endpoint
struct. More will probably moved later. Those ones are not critical. So it
is pretty safe to move them now and this will ease next changes.
Group the endpoint target of a conn-stream, its context and the associated
flags in a dedicated structure in the conn-stream. It is not inlined in the
conn-stream structure. There is a dedicated pool.
For now, there is no complexity. It is just an indirection to get the
endpoint or its context. But the purpose of this structure is to be able to
share a refcounted context between the mux and the conn-stream. This way, it
will be possible to preserve it when the mux is detached from the
conn-stream.
The function cs_init() is only called by cs_new(). The conn-stream
initialization will be reviewed. It is easier to do it in cs_new() instead
of using a dedicated function. cs_new() is pretty simple, there is no reason
to split the code in this case.
This change is only significant for the multiplexer part. For the applets,
the context and the endpoint are the same. Thus, there is no much change. For
the multiplexer part, the connection was used to set the conn-stream
endpoint and the mux's stream was the context. But it is a bit strange
because once a mux is installed, it takes over the connection. In a
wonderful world, the connection should be totally hidden behind the mux. The
stream-interface and, in a lesser extent, the stream, still access the
connection because that was inherited from the pre-multiplexer era.
Now, the conn-stream endpoint is the mux's stream (an opaque entity for the
conn-stream) and the connection is the context. Dedicated functions have
been added to attached an applet or a mux to a conn-stream.
The appctx owner is now always a conn-stream. Thus, it can be set during the
appctx allocation. But, to do so, the conn-stream must be created first. It
is not a problem on the server side because the conn-stream is created with
the stream. On the client side, we must take care to create the conn-stream
first.
This change should ease other changes about the applets bootstrapping.
This patch is mandatory to invert the endpoint and the context in the
conn-stream. There is no common type (at least for now) for the entity
representing a mux (h1s, h2s...), thus we must set its type when the
endpoint is attached to a conn-stream. There is 2 types for the conn-stream
endpoints: the mux (CS_FL_ENDP_MUX) and the applet (CS_FL_ENDP_APP).
For now there is no much change. Only the appctx is passed as argument when
the .init callback function is called. And it is not possible to yield at
this stage. It is not a problem because the feature is not used. Only the
lua defines this callback function for the lua TCP/HTTP services. The idea
is to be able to use it for all applets to initialize the appctx context.
First gcc, then now coverity report possible null derefs in situations
where we know these cannot happen since we call the functions in
contexts that guarantee the existence of the connection and the method
used. Let's introduce an unchecked version of the function for such
cases, just like we had to do with objt_*. This allows us to remove the
ALREADY_CHECKED() statements (which coverity doesn't see), and addresses
github issues #1643, #1644, #1647.
It was supposed to be there, and probably was not placed there due to
historic limitations in listener_accept(), but now there does not seem
to be a remaining valid reason for keeping the quic_conn out of the
handle. In addition in new_quic_cli_conn() the handle->fd was incorrectly
set to the listener's FD.
Historically there was a single way to have an SSL transport on a
connection, so detecting if the transport layer was SSL and a context
was present was sufficient to detect SSL. With QUIC, things have changed
because QUIC also relies on SSL, but the context is embedded inside the
quic_conn and the transport layer doesn't match expectations outside,
making it difficult to detect that SSL is in use over the connection.
The approach taken here to improve this consists in adding a new method
at the transport layer, get_ssl_sock_ctx(), to retrieve this often needed
ssl_sock_ctx, and to use this to detect the presence of SSL. This will
even allow some simplifications and cleanups to be made in the SSL code
itself, and QUIC will be able to provide one to export its ssl_sock_ctx.
These functions will allow the connection layer to retrieve a quic_conn's
source or destination when possible. The quic_conn holds the peer's address
but not the local one, and the sockets API doesn't always makes that easy
for datagrams. Thus for frontend connection what we're doing here is to
retrieve the listener's address when the destination address is desired.
Now it finally becomes possible to fetch the source and destination using
"src" and "dst", and to pass an incoming connection's endpoints via the
proxy protocol.
Right now the proto_fam descriptor provides a family-specific
get_src() and get_dst() pair of calls to retrieve a socket's source
or destination address. However this only works for connected mode
sockets. QUIC provides its own stream protocol, which relies on a
datagram protocol underneath, so the get_src()/get_dst() at that
protocol's family will not work, and QUIC would need to provide its
own.
This patch implements get_src() and get_dst() at the protocol level
from a connection, and makes sure that conn_get_src()/conn_get_dst()
will automatically use them if defined before falling back to the
family's pair of functions.
We'll want conn_get_src/dst to support other means of retrieving these
respective IP addresses, but the functions as they're designed are a bit
too restrictive for now.
This patch arranges them to have a default error fallback allowing to
test different mechanisms. In addition we now make sure the underlying
protocol is of type stream before calling the family's get_src/dst as
it makes no sense to do that on dgram sockets for example.
Certain functions cannot be called on an FD-less conn because they are
normally called as part of the protocol-specific setup/teardown sequence.
Better place a few BUG_ON() to make sure none of them is called in other
situations. If any of them would trigger in ambiguous conditions, it would
always be possible to replace it with an error.
Some syscalls at the TCP level act directly on the FD. Some of them
are used by TCP actions like set-tos, set-mark, silent-drop, others
try to retrieve TCP info, get the source or destination address. These
ones must not be called with an invalid FD coming from an FD-less
connection, so let's add the relevant tests for this. It's worth
noting that all these ones already have fall back plans (do nothing,
error, or switch to alternate implementation).
There are plenty of places (particularly in debug code) where we try to
dump the connection's FD only when the connection is defined. That's
already a pain but now it gets one step further with QUIC because we do
*not* want to dump this FD in this case.
conn_fd() checks if the connection exists, is ready and is not fd-less,
and returns the FD only in this case, otherwise returns -1. This aims at
simplifying most of these conditions.
QUIC connections do not use a file descriptor, instead they use the
quic equivalent which is the quic_conn. A number of our historical
functions at the connection level continue to unconditionally touch
the file descriptor and this may have consequences once QUIC starts
to be used.
This patch adds a new flag on QUIC connections, CO_FL_FDLESS, to
mention that the connection doesn't have a file descriptor, hence the
FD-based API must never be used on them.
From now on it will be possible to intrument existing functions to
panic when this flag is present.
The OpenSSL engine API is deprecated starting with OpenSSL 3.0.
In order to have a clean build this feature is now disabled by default.
It can be reactivated with USE_ENGINE=1 on the build line.
The new 'close-spread-time' global option can be used to spread idle and
active HTTP connction closing after a SIGUSR1 signal is received. This
allows to limit bursts of reconnections when too many idle connections
are closed at once. Indeed, without this new mechanism, in case of
soft-stop, all the idle connections would be closed at once (after the
grace period is over), and all active HTTP connections would be closed
by appending a "Connection: close" header to the next response that goes
over it (or via a GOAWAY frame in case of HTTP2).
This patch adds the support of this new option for HTTP as well as HTTP2
connections. It works differently on active and idle connections.
On active connections, instead of sending systematically the GOAWAY
frame or adding the 'Connection: close' header like before once the
soft-stop has started, a random based on the remainder of the close
window is calculated, and depending on its result we could decide to
keep the connection alive. The random will be recalculated for any
subsequent request/response on this connection so the GOAWAY will still
end up being sent, but we might wait a few more round trips. This will
ensure that goaways are distributed along a longer time window than
before.
On idle connections, a random factor is used when determining the expire
field of the connection's task, which should naturally spread connection
closings on the time window (see h2c_update_timeout).
This feature request was described in GitHub issue #1614.
This patch should be backported to 2.5. It depends on "BUG/MEDIUM:
mux-h2: make use of http-request and keep-alive timeouts" which
refactorized the timeout management of HTTP2 connections.
We modify the key update feature implementation to support reusable cipher contexts
as this is done for the other cipher contexts for packet decryption and encryption.
To do so we attach a context to the quic_tls_kp struct and initialize it each time
the underlying secret key is updated. Same thing when we rotate the secrets keys,
we rotate the contexts as the same time.
Add ->ctx new member field to quic_tls_secrets struct to store the cipher context
for each QUIC TLS context TX/RX parts.
Add quic_tls_rx_ctx_init() and quic_tls_tx_ctx_init() functions to initialize
these cipher context for RX and TX parts respectively.
Make qc_new_isecs() call these two functions to initialize the cipher contexts
of the Initial secrets. Same thing for ha_quic_set_encryption_secrets() to
initialize the cipher contexts of the subsequent derived secrets (ORTT, Handshake,
1RTT).
Modify quic_tls_decrypt() and quic_tls_encrypt() to always use the same cipher
context without allocating it each time they are called.
Define a new API to notify the MUX from the quic-conn when the
connection is about to be closed. This happens in the following cases :
- on idle timeout
- on CONNECTION_CLOSE emission or reception
The MUX wake callback is called on these conditions. The quic-conn
QUIC_FL_NOTIFY_CLOSE is set to only report once. On the MUX side,
connection flags CO_FL_SOCK_RD_SH|CO_FL_SOCK_WR_SH are set to interrupt
future emission/reception.
This patch is the counterpart to
"MEDIUM: mux-quic: report CO_FL_ERROR on send".
Now the quic-conn is able to report its closing, which may be translated
by the MUX into a CO_FL_ERROR on the connection for the upper layer.
This allows the MUX to properly react to the QUIC closing mechanism for
both idle-timeout and closing/draining states.
Complete the error reporting. For each attached streams, if CO_FL_ERROR
is set, mark them with CS_FL_ERR_PENDING|CS_FL_ERROR. This will notify
the upper layer to trigger streams detach and release of the MUX.
This reporting is implemented in a new function qc_wake_some_streams(),
called by qc_wake(). This ensures that a lower-layer error is quickly
reported to the individual streams.
Add a new app layer operation is_active. This can be used by the MUX to
check if the connection can be considered as active or not. This is used
inside qcc_is_dead as a first check.
For example on HTTP/3, if there is at least one bidir client stream
opened the connection is active. This explicitly ignore the uni streams
used for control and qpack as they can never be closed during the
connection lifetime.
Improve timeout handling on the MUX. When releasing a stream, first
check if the connection can be considered as dead and should be freed
immediatly. This allows to liberate resources faster when possible.
If the connection is still active, ensure there is no attached
conn-stream before scheduling the timeout. To do this, add a nb_cs field
in the qcc structure.
This flag was used to notify the MUX about a CONNECTION_CLOSE frame
reception. It is now unused on the MUX side and can be removed. A new
mechanism to detect quic-conn closing will be soon implemented.
Rationalize the lifetime of the quic-conn regarding with the MUX. The
quic-conn must not be freed if the MUX is still allocated.
This simplify the MUX code when accessing the quic-conn and removed
possible segfaults.
To implement this, if the quic-conn timer expired, the quic-conn is
released only if the MUX is not allocated. Else, the quic-conn is
flagged with QUIC_FL_CONN_EXP_TIMER. The MUX is then responsible
to call quic_close() which will free the flagged quic-conn.
New received packets after sending CONNECTION_CLOSE frame trigger a new
CONNECTION_CLOSE frame to be sent. Each time such a frame is sent we
increase the number of packet required to send another CONNECTION_CLOSE
frame.
Rearm only one time the idle timer when sending a CONNECTION_CLOSE frame.
This should be useful to have an idea of the list of frames which could be built
towards the list of available frames when building packets.
Same thing about the frames which could not be built because of a lack of room
in the TX buffer.
Due to a erroneous interpretation of the RFC 9000 (quic-transport), ACKs frames
were always sent only after having received two ack-eliciting packets.
This could trigger useless retransmissions for tail packets on the peer side.
For now on, we send as soon as possible ACK frames as soon as we have ACK to send,
in the same packets as the ack-eliciting frame packets, and we also send ACK
frames after having received 2 ack-eliciting packets since the last time we sent
an ACK frame with other ack-eliciting frames.
As such variables are handled by the QUIC connection I/O handler which runs
always on the thread, there is no need to continue to use such atomic operations
The new qc_stream_desc type has a tree node for storage. Thus, we can
remove the node in the qcs structure.
When initializing a new stream, it is stored into the qcc streams_by_id
tree. When the MUX releases it, it will freed as soon as its buffer is
emptied. Before this, the quic-conn is responsible to store it inside
its own streams_by_id tree.
Move the xprt-buf and ack related fields from qcs to the qc_stream_desc
structure. In exchange, qcs has a pointer to the low-level stream. For
each new qcs, a qc_stream_desc is automatically allocated.
This simplify the transport layer by removing qcs/mux manipulation
during ACK frame parsing. An additional check is done to not notify the
MUX on sending if the stream is already released : this case may now
happen on retransmission.
To complete this change, the quic_stream frame now references the
quic_stream instance instead of a qcs.
Currently, the mux qcs streams manage the Tx buffering, even after
sending it to the transport layer. Buffers are emptied when
acknowledgement are treated by the transport layer. This complicates the
MUX liberation and we may loose some data after the MUX free.
Change this paradigm by moving the buffering on the transport layer. For
this goal, a new type is implemented as low-level stream at the
transport layer, as a counterpart of qcs mux instances. This structure
is called qc_stream_desc. This will allow to free the qcs/qcc instances
without having to wait for acknowledge reception.
For the moment, the quic-conn is responsible to store the qc_stream_desc
in a new tree named streams_by_id. This will sligthly change in the next
commits to remove the qcs node which has a similar purpose :
qc_stream_desc instances will be shared between the qcc MUX and the
quic-conn.
This patch only introduces the new type definition and the function to
manipulate it. The following commit will bring the rearchitecture in the
qcs structure.
Define a new callback release inside qcc_app_ops. It is called when the
qcc MUX is freed via qc_release. This will allows to implement cleaning
on the app layer.
Regroup some cleaning operations inside a new function qcs_free. This
can be used for all streams, both through qcs_destroy and with
uni-directional streams.
The CertCache.set() function allows to update an SSL certificate file
stored in the memory of the HAProxy process. This function does the same
as "set ssl cert" + "commit ssl cert" over the CLI.
This could be used to update the crt and key, as well as the OCSP, the
SCTL, and the OSCP issuer.
The implementation does yield every 10 ckch instances, the same way the
"commit ssl cert" do.
Extract the code that replace the ckch_store and its dependencies into
the ckch_store_replace() function.
This function must be used under the global ckch lock.
It frees everything related to the old ckch_store.
The new function dump_act_rules() now dumps the list of actions supported
by a ruleset. These actions are alphanumerically sorted first so that the
produced output is easy to compare.
When trying to sort sets of strings, it's often needed to required to
compare 3 strings to see if the chosen one fits well between the two
others. That's what this function does, in addition to being able to
ignore extremities when they're NULL (typically for the first iteration
for example).
Similar to the sample fetch keywords, let's also list the converter
keywords. They're much simpler since there's no compatibility matrix.
Instead the input and output types are listed. This is called by
dump_registered_keywords() for the "cnv" keywords class.
New function smp_dump_fetch_kw lists registered sample fetch keywords
with their compatibility matrix, mandatory and optional argument types,
and output types. It's called from dump_registered_keywords() with class
"smp".
New function acl_dump_kwd() dumps the registered ACL keywords and their
sample-fetch equivalent to stdout. It's called by dump_registered_keywords()
for keyword class "acl".
New function cli_list_keywords() scans the list of registered CLI keywords
and dumps them on stdout. It's now called from dump_registered_keywords()
for the class "cli".
Some keywords are valid for the master, they'll be suffixed with
"[MASTER]". Others are valid for the worker, they'll have "[WORKER]".
Those accessible only in expert mode will show "[EXPERT]" and the
experimental ones will show "[EXPERIM]".
All registered config keywords that are valid in the config parser are
dumped to stdout organized like the regular sections (global, listen,
etc). Some keywords that are known to only be valid in frontends or
backends will be suffixed with [FE] or [BE].
All regularly registered "bind" and "server" keywords are also dumped,
one per "bind" or "server" line. Those depending on ssl are listed after
the "ssl" keyword. Doing so required to export the listener and server
keyword lists that were static.
The function is called from dump_registered_keywords() for keyword
class "cfg".
It's difficult from outside haproxy to detect the supported keywords
and syntax. Interestingly, many of our modern keywords are enumerated
since they're registered from constructors, so it's not very hard to
enumerate most of them.
This patch creates some basic infrastructure to support dumping existing
keywords from different classes on stdout. The format will differ depending
on the classes, but the idea is that the output could easily be passed to
a script that generates some simple syntax highlighting rules, completion
rules for editors, syntax checkers or config parsers.
The principle chosen here is that if "-dK" is passed on the command-line,
at the end of the parsing the registered keywords will be dumped for the
requested classes passed after "-dK". Special name "help" will show known
classes, while "all" will execute all of them. The reason for doing that
after the end of the config processor is that it will also enumerate
internally-generated keywords, Lua or even those loaded from external
code (e.g. if an add-on is loaded using LD_PRELOAD). A typical way to
call this with a valid config would be:
./haproxy -dKall -q -c -f /path/to/config
If there's no config available, feeding /dev/null will also do the job,
though it will not be able to detect dynamically created keywords, of
course.
This patch also updates the management doc.
For now nothing but the help is listed, various subsystems will follow
in subsequent patches.
Move all inline functions with trace from quic_loss.h to a dedicated
object file. This let to remove the TRACE_SOURCE macro definition
outside of the include file.
This change is required to be able to define another TRACE_SOUCE inside
the mux_quic.c for a dedicated trace module.
This commit is similar to the previous one but with MAX_DATA frames.
This allows to increase the connection level flow-control limit. If the
connection was blocked due to QC_CF_BLK_MFCTL flag, the flag is reseted.
Implement a MUX method to parse MAX_STREAM_DATA. If the limit is greater
than the previous one and the stream was blocked, the flag
QC_SF_BLK_SFCTL is removed.
This commit is similar to the previous one, but this time on the
connection level instead of the stream.
When the connection limit is reached, the connection is flagged with
QC_CF_BLK_MFCTL. This flag is checked in qc_send.
qcs_push_frame uses a new parameter which is used to not exceed the
connection flow-limit while calling it repeatdly over multiple streams
instance before transfering data to the transport layer.
Implement the flow-control max-streams-data limit on emission. We ensure
that we never push more than the offset limit set by the peer. When the
limit is reached, the stream is marked as blocked with a new flag
QC_SF_BLK_SFCTL to disable emission.
Currently, this is only implemented for bidirectional streams. It's
required to unify the sending for unidirectional streams via
qcs_push_frame from the H3 layer to respect the flow-control limit for
them.
Rename the fields used for flow-control in the qcc structure. The
objective is to have shorter name for better readability while keeping
their purpose clear. It will be useful when the flow-control will be
extended with new fields.
In MQTTv3.1, protocol name is "MQIsdp" and protocol level is 3. The mqtt
converters(mqtt_is_valid and mqtt_field_value) did not work for clients on
mqttv3.1 because the mqtt_parse_connect() marked the CONNECT message invalid
if either the protocol name is not "MQTT" or the protocol version is other than
v3.1.1 or v5.0. To fix it, we have added the mqttv3.1 protocol name and version
as part of the checks.
This patch fixes the mqtt converters to support mqttv3.1 clients as well (issue #1600).
It must be backported to 2.4.
During the packet number space discarding, do no reset tx.in_flight counter
before decrement it from other variables.
Furthermore path prep_in_flight counter was not decremented.
We must consider the peer address as validated as soon as we received an
handshake packet. An ACK frame in handshake packet was too restrictive.
Rename the concerned flag to reflect this situation.
The most important one is the ->flags member which leads to an erratic xprt behavior.
For instance a non ack-eliciting packet could be seen as ack-eliciting leading the
xprt to try to retransmit a packet which are not ack-eliciting. In this case, the
xprt does nothing and remains indefinitively in a blocking state.
The TX packet refcounting had come with the multithreading support but not only.
It is very useful to ease the management of the memory allocated for TX packets
with TX frames attached to. At some locations of the code we have to move TX
frames from a packet to a new one during retranmission when the packet has been
deemed as lost or not. When deemed lost the memory allocated for the paquet must
be released contrary to when its frames are retransmitted when probing (PTO).
For now on, thanks to this patch we handle the TX packets memory this way. We
increment the packet refcount when:
- we insert it in its packet number space tree,
- we attache an ack-eliciting frame to it.
And reciprocally we decrement this refcount when:
- we remove an ack-eliciting frame from the packet,
- we delete the packet from its packet number space tree.
Note that an optimization WOULD NOT be to fully reuse (without releasing its
memorya TX packet to retransmit its contents (its ack-eliciting frames). Its
information (timestamp, in flight length) to be processed by packet loss detection
and the congestion control.
When building a packet with an ACK frame, we store the largest acknowledged
packet number sent in this frame in the packet (quic_tx_packet struc).
When receiving an ack for such a packet we can purge the tree of acknowledged
packet number ranges from the range sent before this largest acknowledged
packet number.
This struct member stores the largest acked packet number which was received. It
is used to build (TX) packet. But this is confusing to store it in the tx packet
of the packet number space structure even if it is used to build and transmit
packets.
There was free_act_rules() that frees all rules from a head but nothing
to free a single rule. Currently some rulesets partially free their own
rules on parsing error, and we're seeing some regtests emit errors under
ASAN because of this.
Let's first extract the code to free a rule into its own function so
that it becomes possible to use it on a single rule.
Log servers are a real mess because:
- entries are duplicated using memcpy() without their strings being
reallocated, which results in these ones not being freeable every
time.
- a new field, ring_name, was added in 2.2 by commit 99c453df9
("MEDIUM: ring: new section ring to declare custom ring buffers.")
but it's never initialized during copies, causing the same issue
- no attempt is made at freeing all that.
Of course, running "haproxy -c" under ASAN quickly notices that and
dumps a core.
This patch adds the missing strdup() and initialization where required,
adds a new free_logsrv() function to cleanly free() such a structure,
calls it from the proxy when iterating over logsrvs instead of silently
leaking their file names and ring names, and adds the same logsrv loop
to the proxy_free_defaults() function so that we don't leak defaults
sections on exit.
It looks a bit entangled, but it comes as a whole because all this stuff
is inter-dependent and was missing.
It's probably preferable not to backport this in the foreseable future
as it may reveal other jokes if some obscure parts continue to memcpy()
the logsrv struct.
The BUG_ON_HOT() test condition added to b_peek_varint() by commit
8873b85bd ("DEBUG: buf: add BUG_ON_HOT() to most buffer management
functions") was wrong as <data> in this function is not b->data,
so that was triggering during live dumps of H2 traces on the CLI
when built with -DDEBUG_STRICT=2. No backport is needed.
The current implementation of STREAM frames emission has some
limitation. Most notably when we cannot sent all frames in a single
qc_send run.
In this case, frames are left in front of the MUX list. It will be
re-send individually before other frames, possibly another frame from
the same STREAM with new data. An opportunity to merge the frames is
lost here.
This method is now improved. If a frame cannot be send entirely, it is
discarded. On the next qc_send run, we retry to send to this position. A
new field qcs.sent_offset is used to remember this. A new frame list is
used for each qc_send.
The impact of this change is not precisely known. The most notable point
is that it is a more logical method of emission. It might also improve
performance as we do not keep old STREAM frames which might delay other
streams.
Implement a new MUX function qcc_notify_send. This function must be
called by the transport layer to confirm the sending of STREAM data to
the MUX.
For the moment, the function has no real purpose. However, it will be
useful to solve limitations on push frame and implement the flow
control.
The aim of the idle timeout is to silently closed the connection after a period
of inactivity depending on the "max_idle_timeout" transport parameters advertised
by the endpoints. We add a new task to implement this timer. Its expiry is
updated each time we received an ack-eliciting packet, and each time we send
an ack-eliciting packet if no other such packet was sent since we received
the last ack-eliciting packet. Such conditions may be implemented thanks
to QUIC_FL_CONN_IDLE_TIMER_RESTARTED_AFTER_READ new flag.
This packet number space flags were defined with the same value because
defined at different places in the file. Assemble them at the same location
with different values.
This bug could unvalidate the peer address after it was validated
during the handshake leading to the anti-amplication limit to be
enabled again after having been disabled. The situation could not
be unblocked (deadlock).
There is no need to use such a reference counter anymore since the QUIC
connections are always handled by the same thread.
quic_conn_drop() is removed. Its code is merged into quic_conn_release().
When we store the remote transport parameters, we compute the maximum idle
timeout for the connection which is the minimum of the two advertised
max_idle_timeout transport parameter values if both have non-null values, or the
maximum if one of the value is set and non-null.
When a tcp-{request,response} content or http-request/http-response
rule delivers a final verdict (deny, accept, redirect etc), the last
evaluated one will now be recorded in the stream. The purpose is to
permit to log the last one that performed a final action. For now
the log is not produced.
The server_id_hdr_name is already processed as an ist in various locations lets
also just store it as such.
see 0643b0e7e ("MINOR: proxy: Make `header_unique_id` a `struct ist`") for a
very similar past commit.
The orgto_hdr_name is already processed as an ist in `http_process_request`,
lets also just store it as such.
see 0643b0e7e ("MINOR: proxy: Make `header_unique_id` a `struct ist`") for a
very similar past commit.
The fwdfor_hdr_name is already processed as an ist in `http_process_request`,
lets also just store it as such.
see 0643b0e7e ("MINOR: proxy: Make `header_unique_id` a `struct ist`") for a
very similar past commit.
The monitor_uri is already processed as an ist in `http_wait_for_request`, lets
also just store it as such.
see 0643b0e7e ("MINOR: proxy: Make `header_unique_id` a `struct ist`") for a
very similar past commit.
Supporting kFreebsd previously led to FreeBSD (< 14) build breakage:
In file included from src/cpuset.c:5:
In file included from include/haproxy/cpuset.h:4:
include/haproxy/cpuset-t.h:46:2: error: unknown type name 'cpu_set_t'; did you mean 'cpuset_t'?
CPUSET_REPR cpuset;
^~~~~~~~~~~
cpuset_t
include/haproxy/cpuset-t.h:21:22: note: expanded from macro 'CPUSET_REPR'
# define CPUSET_REPR cpu_set_t
^
Around limits for QUIC integer encoding, this functions could return
wrong values which lead to qc_build_frms() to prepare wrong CRYPTO (less chances)
or STREAM frames (more chances). qc_do_build_pkt() could build wrong packets
with bad CRYPTO/STREAM frames which could not be decoded by the peer.
In such a case ngtcp2 closes the connection with an ENCRYPTION_ERROR error
in a transport CONNECTION_CLOSE frame.
This function returns the maximum integer which may be encoded with a number of
bytes passed as parameter. Useful to precisely compute the number of bytes which
may used to fulfill a buffer with lengths as QUIC enteger encoded prefixes for the
number of following bytes.
When in congestion avoidance state and when acknowledging an <acked> number bytes
we must increase the congestion window by at most one datagram (<path->mtu>)
by congestion window. So thanks to this patch we apply a ratio to the current
number of acked bytes : <acked> * <path->mtu> / <cwnd>.
So, when <cwnd> bytes are acked we precisely increment <cwnd> by <path->mtu>.
Furthermore we take into an account the number of remaining acknowledged bytes
each time we increment the window by <acked> storing their values in the algorithm
struct state (->remain_acked) so that it might be take into an account at the
next ACK event.
This function returns the remaining number of bytes which can be sent on the
network before fulfilling the congestion window. There is a counter for
the number of prepared data and another one for the really in flight number
of bytes (in_flight). These variable have been mixed up.
Since the persistent congestion detection is done out of the congestion
controllers, there is no need to pass them information through quic_cc_event struct.
We remove its useless members. Also remove qc_cc_loss_event() which is no more used.
We establish the persistent congestion out of any congestion controller
to improve the algorithms genericity. This path characteristic detection may
be implemented regarless of the underlying congestion control algorithm.
Send congestion (loss) event using directly quic_cc_event(), so without
qc_cc_loss_event() wrapper function around quic_cc_event().
Take the opportunity of this patch to shorten "newest_time_sent" member field
of quic_cc_event to "time_sent".
We want to be able to make the congestion controllers re-enter the slow
start state outside of the congestion controllers themselves. So,
we add a callback ->slow_start() to do so.
Define this callback for NewReno algorithm.
kFreeBSD needs to be treated as a distinct target from FreeBSD
since the underlying system libc is the GNU one. Thus, relying
only on __GLIBC__ no longer suffice.
- freebsd-glibc new target, key difference is including crypt.h
and linking to libdl like linux.
- cpu affinity available but the api is still the FreeBSD's.
- enabling auxiliary data access only for Linux.
Patch based on preliminary work done by @bigon.
closes#1555
Implement the locally flow-control streams limit for opened
bidirectional streams. Add a counter which is used to count the total
number of closed streams. If this number is big enough, emit a
MAX_STREAMS frame to increase the limit of remotely opened bidirectional
streams.
This is the first commit to implement QUIC flow-control. A series of
patches should follow to complete this.
This is required to be able to handle more than 100 client requests.
This should help to validate the Multiplexing interop test.
Modify the STREAM emission in qc_send. Use the new transport function
qc_send_app_pkts to directly send the list of constructed frames. This
allows to remove the tasklet wakeup on the quic_conn and should reduce
the latency.
If not all frames are send after the transport call, subscribe the MUX
on the lower layer to be able to retry. Currently there is a bug because
the transport layer does not retry to send frames in excess after a
successful sendto. This might cause the transfer to be interrupted.
Define two new unions in the qcc structure named 'lfctl' and 'rfctl'.
For the moment they are empty. They will be completed to store the
initial and current level for flow-control on the local and remote side.
Improve the functions used to detect the stream characteristics :
uni/bidirectional and local/remote initiated.
Most notably, these functions are now designed to work transparently for
a MUX in the frontend or backend side. For this, we use the connection
to determine the current MUX side. This will be useful if QUIC is
implemented on the server side.
Since QUIC accept handling has been improved, the MUX is initialized
after the handshake completion. Thus its safe to access transport
parameters in qc_init via the quic_conn.
Remove quic_mux_transport_params_update which was called by the
transport for the MUX. This improves the architecture by removing a
direct call from the transport to the MUX.
The deleted function body is not transfered to qc_init because this part
will change heavily in the near future when implementing the
flow-control.
As reported by Tim in issue #1428, our sources are clean, there are
just a few files with a few rare non-ASCII chars for the paragraph
symbol, a few typos, or in Fred's name. Given that Fred already uses
the non-accentuated form at other places like on the public list,
let's uniformize all this and make sure the code displays equally
everywhere.
This is the pool equivalent of commit 97ea9c49f ("BUG/MEDIUM: fd: always
align fdtab[] to 64 bytes"). After a careful code review, it happens that
the pool heads are the other structures allocated with malloc/calloc that
claim to be aligned to a size larger than what the allocator can offer.
While no issue was reported on them, no memset() is performed and no type
is large, this is a problem waiting to happen, so better fix it. In
addition, it's relatively easy to do by storing the allocation address
inside the pool_head itself and use it at free() time. Finally, threads
might benefit from the fact that the caches will really be aligned and
that there will be no false sharing.
This should be backported to all versions where it applies easily.
Many inline functions involve some BUG_ON() calls and because of the
partial complexity of the functions, they're not inlined anymore (e.g.
co_data()). The reason is that the expression instantiates the message,
its size, sometimes a counter, then the atomic OR to taint the process,
and the back trace. That can be a lot for an inline function and most
of it is always the same.
This commit modifies this by delegating the common parts to a dedicated
function "complain()" that takes care of updating the counter if needed,
writing the message and measuring its length, and tainting the process.
This way the caller only has to check a condition, pass a pointer to the
preset message, and the info about the type (bug or warn) for the tainting,
then decide whether to dump or crash. Note that this part could also be
moved to the function but resulted in complain() always being at the top
of the stack, which didn't seem like an improvement.
Thanks to these changes, the BUG_ON() calls do not result in uninlining
functions anymore and the overall code size was reduced by 60 to 120 kB
depending on the build options.
This one is referenced in initcalls by its pointer, it makes no sense
to declare it inline. At best it causes function duplication, at worst
it doesn't build on older compilers.
This one is referenced in initcalls by its pointer, it makes no sense
to declare it inline. At best it causes function duplication, at worst
it doesn't build on older compilers.
The 3 functions http_{req,res,after_res}_keywords_register() are
referenced in initcalls by their pointer, it makes no sense to declare
them inline. At best it causes function duplication, at worst it doesn't
build on older compilers.
This one is referenced in initcalls by its pointer, it makes no sense
to declare it inline. At best it causes function duplication, at worst
it doesn't build on older compilers.
gcc 6 continues its saga with excessive reports of null-deref warnings.
This time it was in the IS_HTX_CS() macro. Let's use __cs_conn() after
cs_conn() was checked.
Do not distinguish the direction (TX/RX) when settings TLS secrets flags.
There is not such a distinction in the RFC 9001.
Assemble them at the same level: at the upper context level.
Wakeup asap the timer task when setting its timer in the past.
Take also the opportunity of this patch to make simplify quic_pto_pktns():
calling tick_first() is useless here to compare <lpto> with <tmp_pto>.
We had several warnings when building haproxy 2.5.4 with
old openssl 1.0.1e. This version of openssl is the latest
available in EOL centos 6.
include/haproxy/openssl-compat.h:157:51: \
warning: "LIBRESSL_VERSION_NUMBER" is not defined
This patch fixed the build. It changes the #if condition,
as done in other similar parts of openssl-compat.h.
Reorganize the Rx path for STREAM frames on bidirectional streams. A new
function qcc_recv is implemented on the MUX. It will handle the STREAM
frames copy and offset calculation from transport to MUX.
Another function named qcc_decode_qcs from the MUX can be called by
transport each time new STREAM data has been copied.
The architecture is now cleaner with the MUX layer in charge of parsing
the STREAM frames offsets. This is required to be able to implement the
flow-control on the MUX layer.
Note that as a convenience, a STREAM frame is not partially copied to
the MUX buffer. This simplify the implementation for the moment but it
may change in the future to optimize the STREAM frames handling.
For the moment, only bidirectional streams benefit from this change. In
the future, it may be extended to unidirectional streams to unify the
STREAM frames processing.
FIN flag on a STREAM frame was not detected if the frame was previously
buffered on qcs.rx.frms before being handled.
To fix this, copy the fin field from the quic_stream instance to
quic_rx_strm_frm. This is required to properly notify the FIN flag on
qc_treat_rx_strm_frms for the MUX layer.
Without this fix, the request channel might be left opened after the
last STREAM frame reception if there is out-of-order frames on the Rx
path.
This flag is set when the STREAM frame with FIN set has been received on
a qcs instance. For now, this is only used as a BUG_ON guard to prevent
against multiple frames with FIN set. It will also be useful when
reorganize the RX path and move some of its code in the mux.
The new macro was introduced with commit 86bcc5308 ("DEBUG: implement 4
levels of choices between warn and crash.") but some older compilers can
complain that we test the value when the macro is not defined despite
having already been checked in a previous #if directive. Let's just
repeat the test for the definition.
693b23bb1 ("MEDIUM: tree-wide: Use unsafe conn-stream API when it is
relevant") introduced a regression in DEBUG_STRICT mode because some BUG_ON
conditions were inverted. It should ok now.
In addition, ALREADY_CHECKED macro was removed from appctx_wakeup() function
because it is useless now.
This way si_*_recv() and si_*_sned() API are defined the same
way. si_sync_snd/si_sync_recv are both exported and defined in the C
file. And si_cs_send/si_cs_recv are private and only used by
stream-interface internals.
The unsafe conn-stream API (__cs_*) is now used when we are sure the good
endpoint or application is attached to the conn-stream. This avoids compiler
warnings about possible null derefs. It also simplify the code and clear up
any ambiguity about manipulated entities.
Depending on the context, we know the endpoint or the application attached
to the conn_stream is defined and we know its type. However, having
accessors testing the endpoint or the application may lead the compiler to
report possible null derefs here and there. The alternative is to add
useless tests or use ALREAD_CHECKED/DISGUISE macros. It is tedious and
inelegant.
So now, similarily to the ob API, the safe API, testing
endpoint/application, relies on an unsafe one (same name prefixed with
'__'). This way, any caller may use the unsafe API when it is relevant.
In addition, there is no reason to test the conn-stream itself. It is the
caller responsibility to be sure there is a conn-stream to get its endpoint
or its application. And most of type, we are sure to have a conn-stream.
A few functions such as c_adv(), c_rew(), co_set_data() or co_skip() got a
BUG_ON_HOT() to make sure they're not used to push more data than available
in the buffer. Note that with HTX the margin can be high and will less easily
trigger, but the goal is to detect a misuse early enough.
co_data() should never be called with a wrong c->output. At least it never
happens in regtests, but we're adding a CHECK_IF_HOT() there to avoid crashing
but report it if it ever happened when the hot path checks are enabled.