HAProxy - Load balancer
Find a file
Willy Tarreau 17ccd1a356 BUG/MEDIUM: connection: add a mux flag to indicate splice usability
Commit c640ef1a7d ("BUG/MINOR: stream-int: avoid calling rcv_buf() when
splicing is still possible") fixed splicing in TCP and legacy mode but
broke it badly in HTX mode.

What happens in HTX mode is that the channel's to_forward value remains
set to CHN_INFINITE_FORWARD during the whole transfer, and as such it is
not a reliable signal anymore to indicate whether more data are expected
or not. Thus, when data are spliced out of the mux using rcv_pipe(), even
when the end is reached (that only the mux knows about), the call to
rcv_buf() to get the final HTX blocks completing the message were skipped
and there was often no new event to wake this up, resulting in transfer
timeouts at the end of large objects.

All this goes down to the fact that the channel has no more information
about whether it can splice or not despite being the one having to take
the decision to call rcv_pipe() or not. And we cannot afford to call
rcv_buf() inconditionally because, as the commit above showed, this
reduces the forwarding performance by 2 to 3 in TCP and legacy modes
due to data lying in the buffer preventing splicing from being used
later.

The approach taken by this patch consists in offering the muxes the ability
to report a bit more information to the upper layers via the conn_stream.
This information could simply be to indicate that more data are awaited
but the real need being to distinguish splicing and receiving, here
instead we clearly report the mux's willingness to be called for splicing
or not. Hence the flag's name, CS_FL_MAY_SPLICE.

The mux sets this flag when it knows that its buffer is empty and that
data waiting past what is currently known may be spliced, and clears it
when it knows there's no more data or that the caller must fall back to
rcv_buf() instead.

The stream-int code now uses this to determine if splicing may be used
or not instead of looking at the rcv_pipe() callbacks through the whole
chain. And after the rcv_pipe() call, it checks the flag again to decide
whether it may safely skip rcv_buf() or not.

All this bitfield dance remains a bit complex and it starts to appear
obvious that splicing vs reading should be a decision of the mux based
on permission granted by the data layer. This would however increase
the API's complexity but definitely need to be thought about, and should
even significantly simplify the data processing layer.

The way it was integrated in mux-h1 will also result in no more calls
to rcv_pipe() on chunked encoded data, since these ones are currently
disabled at the mux level. However once the issue with chunks+splice
is fixed, it will be important to explicitly check for curr_len|CHNK
to set MAY_SPLICE, so that we don't call rcv_buf() after each chunk.

This fix must be backported to 2.1 and 2.0.
2020-01-17 17:00:12 +01:00
.github/ISSUE_TEMPLATE DOC: Add GitHub issue config.yml 2019-11-03 15:36:06 +01:00
contrib BUG/MINOR: contrib/prometheus-exporter: decode parameter and value only 2019-11-27 11:51:35 +01:00
doc DOC: clarify crt-base usage 2020-01-15 10:55:43 +01:00
ebtree BUILD: ebtree: make eb_is_empty() and eb_is_dup() take a const 2019-10-02 15:24:19 +02:00
examples CLEANUP: removed obsolete examples an move a few to better places 2019-06-15 21:25:06 +02:00
include BUG/MEDIUM: connection: add a mux flag to indicate splice usability 2020-01-17 17:00:12 +01:00
reg-tests REGTEST: add sample_fetches/hashes.vtc to validate hashes 2020-01-16 08:45:27 +01:00
scripts REGTEST: run-regtests: implement #REQUIRE_BINARIES 2019-12-19 14:36:46 +01:00
src BUG/MEDIUM: connection: add a mux flag to indicate splice usability 2020-01-17 17:00:12 +01:00
tests TESTS: Add a stress-test for mt_lists. 2019-09-23 18:16:08 +02:00
.cirrus.yml BUILD: cirrus-ci: choose proper openssl package name 2020-01-08 16:26:11 +01:00
.gitignore DOC: create a BRANCHES file to explain the life cycle 2019-06-15 22:00:14 +02:00
.travis.yml BUILD: travis-ci: reenable address sanitizer for clang builds 2019-12-26 06:30:21 +01:00
BRANCHES DOC: create a BRANCHES file to explain the life cycle 2019-06-15 22:00:14 +02:00
CHANGELOG [RELEASE] Released version 2.2-dev0 2019-11-25 20:36:16 +01:00
CONTRIBUTING DOC: improve the wording in CONTRIBUTING about how to document a bug fix 2019-07-26 15:46:21 +02:00
INSTALL DOC: this is development again 2019-11-25 20:37:49 +01:00
LICENSE LICENSE: add licence exception for OpenSSL 2012-09-07 13:52:26 +02:00
MAINTAINERS DOC: wurfl: added point of contact in MAINTAINERS file 2019-04-23 11:00:23 +02:00
Makefile BUILD: reorder the objects in the makefile 2019-11-25 19:47:23 +01:00
README DOC: create a BRANCHES file to explain the life cycle 2019-06-15 22:00:14 +02:00
ROADMAP DOC: update the outdated ROADMAP file 2019-06-15 21:59:54 +02:00
SUBVERS BUILD: use format tags in VERDATE and SUBVERS files 2013-12-10 11:22:49 +01:00
VERDATE [RELEASE] Released version 2.1.0 2019-11-25 19:47:40 +01:00
VERSION [RELEASE] Released version 2.2-dev0 2019-11-25 20:36:16 +01:00

The HAProxy documentation has been split into a number of different files for
ease of use.

Please refer to the following files depending on what you're looking for :

  - INSTALL for instructions on how to build and install HAProxy
  - BRANCHES to understand the project's life cycle and what version to use
  - LICENSE for the project's license
  - CONTRIBUTING for the process to follow to submit contributions

The more detailed documentation is located into the doc/ directory :

  - doc/intro.txt for a quick introduction on HAProxy
  - doc/configuration.txt for the configuration's reference manual
  - doc/lua.txt for the Lua's reference manual
  - doc/SPOE.txt for how to use the SPOE engine
  - doc/network-namespaces.txt for how to use network namespaces under Linux
  - doc/management.txt for the management guide
  - doc/regression-testing.txt for how to use the regression testing suite
  - doc/peers.txt for the peers protocol reference
  - doc/coding-style.txt for how to adopt HAProxy's coding style
  - doc/internals for developer-specific documentation (not all up to date)