Implement "help" as a sub-command for "add server" CLI. The objective is
to list all the keywords that are supported for dynamic servers. CLI IO
handler and add_srv_ctx are used to support reentrancy on full output
buffer.
Now that this command is implemented, the outdated keyword list on "add
server" from management documentation can be removed.
Currently there is "no-tls-tickets" that is also supported in the
ssl-default-bind-options directive, but there's no way to re-enable
them on a specific "bind" line. This patch simply provides the option
to re-enable them. Note that the flag is inverted because tickets are
enabled by default and the no-tls-ticket option sets the flag to
disable them.
Several users already reported that it would be nice to support
strict-sni in ssl-default-bind-options. However, in order to support
it, we also need an option to disable it.
This patch moves the setting of the option from the strict_sni field
to a flag in the ssl_options field so that it can be inherited from
the default bind options, and adds a new "no-strict-sni" directive to
allow to disable it on a specific "bind" line.
The test file "del_ssl_crt-list.vtc" which already tests both options
was updated to make use of the default option and the no- variant to
confirm everything continues to work.
Released version 3.2-dev17 with the following main changes :
- DOC: configuration: explicit multi-choice on bind shards option
- BUG/MINOR: sink: detect and warn when using "send-proxy" options with ring servers
- BUG/MEDIUM: peers: also limit the number of incoming updates
- MEDIUM: hlua: Add function to change the body length of an HTTP Message
- BUG/MEDIUM: stconn: Disable 0-copy forwarding for filters altering the payload
- BUG/MINOR: h3: don't insert more than one Host header
- BUG/MEDIUM: h1/h2/h3: reject forbidden chars in the Host header field
- DOC: config: properly index "table and "stick-table" in their section
- DOC: management: change reference to configuration manual
- BUILD: debug: mark ha_crash_now() as attribute(noreturn)
- IMPORT: slz: avoid multiple shifts on 64-bits
- IMPORT: slz: support crc32c for lookup hash on sse4 but only if requested
- IMPORT: slz: use a better hash for machines with a fast multiply
- IMPORT: slz: fix header used for empty zlib message
- IMPORT: slz: silence a build warning on non-x86 non-arm
- BUG/MAJOR: leastconn: do not loop forever when facing saturated servers
- BUG/MAJOR: queue: properly keep count of the queue length
- BUG/MINOR: quic: fix crash on quic_conn alloc failure
- BUG/MAJOR: leastconn: never reuse the node after dropping the lock
- MINOR: acme: renewal notification over the dpapi sink
- CLEANUP: quic: Useless BIO_METHOD initialization
- MINOR: quic: Add useful error traces about qc_ssl_sess_init() failures
- MINOR: quic: Allow the use of the new OpenSSL 3.5.0 QUIC TLS API (to be completed)
- MINOR: quic: implement all remaining callbacks for OpenSSL 3.5 QUIC API
- MINOR: quic: OpenSSL 3.5 internal QUIC custom extension for transport parameters reset
- MINOR: quic: OpenSSL 3.5 trick to support 0-RTT
- DOC: update INSTALL for QUIC with OpenSSL 3.5 usages
- DOC: management: update 'acme status'
- BUG/MEDIUM: wdt: always ignore the first watchdog wakeup
- CLEANUP: wdt: clarify the comments on the common exit path
- BUILD: ssl: avoid possible printf format warning in traces
- BUILD: acme: fix build issue on 32-bit archs with 64-bit time_t
- DOC: management: precise some of the fields of "show servers conn"
- BUG/MEDIUM: mux-quic: fix BUG_ON() on rxbuf alloc error
- DOC: watchdog: update the doc to reflect the recent changes
- BUG/MEDIUM: acme: check if acme domains are configured
- BUG/MINOR: acme: fix formatting issue in error and logs
- EXAMPLES: lua: avoid screen refresh effect in "trisdemo"
- CLEANUP: quic: remove unused cbuf module
- MINOR: quic: move function to check stream type in utils
- MINOR: quic: refactor handling of streams after MUX release
- MINOR: quic: add some missing includes
- MINOR: quic: adjust quic_conn-t.h include list
- CLEANUP: cfgparse: alphabetically sort the global keywords
- MINOR: glitches: add global setting "tune.glitches.kill.cpu-usage"
It was mentioned during the development of glitches that it would be
nice to support not killing misbehaving connections below a certain
CPU usage so that poor implementations that routinely misbehave without
impact are not killed. This is now possible by setting a CPU usage
threshold under which we don't kill them via this parameter. It defaults
to zero so that we continue to kill them by default.
As reported in issue #2970, the output of "show servers conn" is not
clear. It was essentially meant as a debugging tool during some changes
to idle connections management, but if some users want to monitor or
graph them, more info is needed. The doc mentions the currently known
list of fields, and reminds that this output is not meant to be stable
over time, but as long as it does not change, it can provide some useful
metrics to some users.
Since e24b77e7 ('DOC: config: move the extraneous sections out of the
"global" definition') the ACME section of the configuration manual was
move from 3.13 to 12.8.
Change the reference to that section in "acme renew".
Tim reported in issue #2953 that "stick-table" and "table" were not
indexed as keywords. The issue was the indent level. Also let's make
sure to put a box around the "store" arguments as well.
There was no function for a lua filter to change the body length of an HTTP
Message. But it is mandatory to be able to alter the message payload. It is
not possible update to directly update the message headers because the
internal state of the message must also be updated accordingly.
It is the purpose of HTTPMessage.set_body_len() function. The new body
length myst be passed as argument. If it is an integer, the right
"Content-Length" header is set. If the "chunked" string is used, it forces
the message to be chunked-encoded and in that case the "Transfer-Encoding"
header.
This patch should fix the issue #2837. It could be backported as far as 2.6.
From the documentation, this wasn't clear enough that shards should
be followed by one of the options number / by-thread / by-group.
Align it with existing options in documentation so that it becomes
more explicit.
Released version 3.2-dev16 with the following main changes :
- BUG/MEDIUM: mux-quic: fix crash on invalid fctl frame dereference
- DEBUG: pool: permit per-pool UAF configuration
- MINOR: acme: add the global option 'acme.scheduler'
- DEBUG: pools: add a new integrity mode "backup" to copy the released area
- MEDIUM: sock-inet: re-check IPv6 connectivity every 30s
- BUG/MINOR: ssl: doesn't fill conf->crt with first arg
- BUG/MINOR: ssl: prevent multiple 'crt' on the same ssl-f-use line
- BUG/MINOR: ssl/ckch: always free() the previous entry during parsing
- MINOR: tools: ha_freearray() frees an array of string
- BUG/MINOR: ssl/ckch: always ha_freearray() the previous entry during parsing
- MINOR: ssl/ckch: warn when the same keyword was used twice
- BUG/MINOR: threads: fix soft-stop without multithreading support
- BUG/MINOR: tools: improve parse_line()'s robustness against empty args
- BUG/MINOR: cfgparse: improve the empty arg position report's robustness
- BUG/MINOR: server: dont depend on proxy for server cleanup in srv_drop()
- BUG/MINOR: server: perform lbprm deinit for dynamic servers
- MINOR: http: add a function to validate characters of :authority
- BUG/MEDIUM: h2/h3: reject some forbidden chars in :authority before reassembly
- MINOR: quic: account Tx data per stream
- MINOR: mux-quic: account Rx data per stream
- MINOR: quic: add stream format for "show quic"
- MINOR: quic: display QCS info on "show quic stream"
- MINOR: quic: display stream age
- BUG/MINOR: cpu-topo: fix group-by-cluster policy for disordered clusters
- MINOR: cpu-topo: add a new "group-by-ccx" CPU policy
- MINOR: cpu-topo: provide a function to sort clusters by average capacity
- MEDIUM: cpu-topo: change "performance" to consider per-core capacity
- MEDIUM: cpu-topo: change "efficiency" to consider per-core capacity
- MEDIUM: cpu-topo: prefer grouping by CCX for "performance" and "efficiency"
- MEDIUM: config: change default limits to 1024 threads and 32 groups
- BUG/MINOR: hlua: Fix Channel:data() and Channel:line() to respect documentation
- DOC: config: Fix a typo in the "term_events" definition
- BUG/MINOR: spoe: Don't report error on applet release if filter is in DONE state
- BUG/MINOR: mux-spop: Don't report error for stream if ACK was already received
- BUG/MINOR: mux-spop: Make the demux stream ID a signed integer
- BUG/MINOR: mux-spop: Don't open new streams for SPOP connection on error
- MINOR: mux-spop: Don't set SPOP connection state to FRAME_H after ACK parsing
- BUG/MEDIUM: mux-spop: Remove frame parsing states from the SPOP connection state
- BUG/MEDIUM: mux-spop: Properly handle CLOSING state
- BUG/MEDIUM: spop-conn: Report short read for partial frames payload
- BUG/MEDIUM: mux-spop: Properly detect truncated frames on demux to report error
- BUG/MEDIUM: mux-spop; Don't report a read error if there are pending data
- DEBUG: mux-spop: Review some trace messages to adjust the message or the level
- DOC: config: move address formats definition to section 2
- DOC: config: move stick-tables and peers to their own section
- DOC: config: move the extraneous sections out of the "global" definition
- CI: AWS-LC(fips): enable unit tests
- CI: AWS-LC: enable unit tests
- CI: compliance: limit run on forks only to manual + cleanup
- CI: musl: enable unit tests
- CI: QuicTLS (weekly): limit run on forks only to manual dispatch
- CI: WolfSSL: enable unit tests
Due to some historic mistakes that have spread to newly added sections,
a number of of recently added small sections found themselves described
under section 3 "global parameters" which is specific to "global" section
keywords. This is highly confusing, especially given that sections 3.1,
3.2, 3.3 and 3.10 directly start with keywords valid in the global section,
while others start with keywords that describe a new section.
Let's just create a new chapter "12. other sections" and move them all
there. 3.10 "HTTPclient tuning" however was moved to 3.4 as it's really
a definition of the global options assigned to the HTTP client. The
"programs" that are going away in 3.3 were moved at the end to avoid a
renumbering later.
Another nice benefit is that it moves a lot of text that was previously
keeping the global and proxies sections apart.
As suggested by Tim in issue #2953, stick-tables really deserve their own
section to explain the configuration. And peers have to move there as well
since they're totally dedicated to stick-tables.
Now we introduce a new section "Stick-tables and Peers", explaining the
concepts, and under which there is one subsection for stick-tables
configuration and one for the peers (which mostly keeps the existing
peers section).
Section 2 describes the config file format, variables naming etc, so
there's no reason why the address format used in this file should be
in a separate section, let's bring it into section 2 as well.
Most of the time, machines made of multiple CPU types use the same L3
for them, and grouping CPUs by frequencies to form groups doesn't bring
any value and on the opposite can impair the incoming connection balancing.
This choice of grouping by cluster was made in order to constitute a good
choice on homogenous machines as well, so better rely on the per-CCX
grouping than the per-cluster one in this case. This will create less
clusters on machines where it counts without affecting other ones.
It doesn't seem necessary to change anything for the "resource" policy
since it selects a single cluster.
This is similar to the previous change to the "performance" policy but
it applies to the "efficiency" one. Here we're changing the sorting
method to sort CPU clusters by average per-CPU capacity, and we evict
clusters whose per-CPU capacity is above 125% of the previous one.
Per-core capacity allows to detect discrepancies between CPU cores,
and to continue to focus on efficient ones as a priority.
Running the "performance" policy on highly heterogenous systems yields
bad choices when there are sufficiently more small than big cores,
and/or when there are multiple cluster types, because on such setups,
the higher the frequency, the lower the number of cores, despite small
differences in frequencies. In such cases, we quickly end up with
"performance" only choosing the small or the medium cores, which is
contrary to the original intent, which was to select performance cores.
This is what happens on boards like the Orion O6 for example where only
the 4 medium cores and 2 big cores are choosen, evicting the 2 biggest
cores and the 4 smallest ones.
Here we're changing the sorting method to sort CPU clusters by average
per-CPU capacity, and we evict clusters whose per-CPU capacity falls
below 80% of the previous one. Per-core capacity allows to detect
discrepancies between CPU cores, and to continue to focus on high
performance ones as a priority.
This cpu-policy will only consider CCX and not clusters. This makes
a difference on machines with heterogenous CPUs that generally share
the same L3 cache, where it's not desirable to create multiple groups
based on the CPU types, but instead create one with the different CPU
types. The variants "group-by-2/3/4-ccx" have also been added.
Let's also add some text explaining the difference between cluster
and CCX.
Add a new format for "show quic" command labelled as "stream". This is
an equivalent of "show sess", dedicated to the QUIC stack. Each active
QUIC streams are listed on a line with their related infos.
The main objective of this command is to ensure there is no freeze
streams remaining after a transfer.
IPv6 connectivity might start off (e.g. network not fully up when
haproxy starts), so for features like resolvers, it would be nice to
periodically recheck.
With this change, instead of having the resolvers code rely on a variable
indicating connectivity, it will now call a function that will check for
how long a connectivity check hasn't been run, and will perform a new one
if needed. The age was set to 30s which seems reasonable considering that
the DNS will cache results anyway. There's no saving in spacing it more
since the syscall is very check (just a connect() without any packet being
emitted).
The variables remain exported so that we could present them in show info
or anywhere else.
This way, "dns-accept-family auto" will now stay up to date. Warning
though, it does perform some caching so even with a refreshed IPv6
connectivity, an older record may be returned anyway.
This way we can preserve the entire contents of the released area for
later inspection. This automatically enables comparison at reallocation
time as well (like "integrity" does). If used in combination with
integrity, the comparison is disabled but the check of non-corruption
of the area mangled by integrity is still operated.
The automatic scheduler is useful but sometimes you don't want to use,
or schedule manually.
This patch adds an 'acme.scheduler' option in the global section, which
can be set to either 'auto' or 'off'. (auto is the default value)
This also change the ouput of the 'acme status' command so it does not
shows scheduled values. The state will be 'Stopped' instead of
'Scheduled'.
The new MEM_F_UAF flag can be set just after a pool's creation to make
this pool UAF for debugging purposes. This allows to maintain a better
overall performance required to reproduce issues while still having a
chance to catch UAF. It will only be used by developers who will manually
add it to areas worth being inspected, though.
Released version 3.2-dev15 with the following main changes :
- BUG/MEDIUM: stktable: fix sc_*(<ctr>) BUG_ON() regression with ctx > 9
- BUG/MINOR: acme/cli: don't output error on success
- BUG/MINOR: tools: do not create an empty arg from trailing spaces
- MEDIUM: config: warn about the consequences of empty arguments on a config line
- MINOR: tools: make parse_line() provide hints about empty args
- MINOR: cfgparse: visually show the input line on empty args
- BUG/MINOR: tools: always terminate empty lines
- BUG/MINOR: tools: make parseline report the required space for the trailing 0
- DEBUG: threads: don't keep lock label "OTHER" in the per-thread history
- DEBUG: threads: merge successive idempotent lock operations in history
- DEBUG: threads: display held locks in threads dumps
- BUG/MINOR: proxy: only use proxy_inc_fe_cum_sess_ver_ctr() with frontends
- Revert "BUG/MEDIUM: mux-spop: Handle CLOSING state and wait for AGENT DISCONNECT frame"
- MINOR: acme/cli: 'acme status' show the status acme-configured certificates
- MEDIUM: acme/ssl: remove 'acme ps' in favor of 'acme status'
- DOC: configuration: add "acme" section to the keywords list
- DOC: configuration: add the "crt-store" keyword
- BUG/MAJOR: queue: lock around the call to pendconn_process_next_strm()
- MINOR: ssl: add filename and linenum for ssl-f-use errors
- BUG/MINOR: ssl: can't use crt-store some certificates in ssl-f-use
- BUG/MINOR: tools: only fill first empty arg when not out of range
- MINOR: debug: bump the dump buffer to 8kB
- MINOR: stick-tables: add "ipv4" as an alias for the "ip" type
- MINOR: quic: extend return value during TP parsing
- BUG/MINOR: quic: use proper error code on missing CID in TPs
- BUG/MINOR: quic: use proper error code on invalid server TP
- BUG/MINOR: quic: reject retry_source_cid TP on server side
- BUG/MINOR: quic: use proper error code on invalid received TP value
- BUG/MINOR: quic: fix TP reject on invalid max-ack-delay
- BUG/MINOR: quic: reject invalid max_udp_payload size
- BUG/MEDIUM: peers: hold the refcnt until updating ts->seen
- BUG/MEDIUM: stick-tables: close a tiny race in __stksess_kill()
- BUG/MINOR: cli: fix too many args detection for commands
- MINOR: server: ensure server postparse tasks are run for dynamic servers
- BUG/MEDIUM: stick-table: always remove update before adding a new one
- BUG/MEDIUM: quic: free stream_desc on all data acked
- BUG/MINOR: cfgparse: consider the special case of empty arg caused by \x00
- DOC: config: recommend disabling libc-based resolution with resolvers
Using both libc and haproxy resolvers can lead to hard to diagnose issues
when their bevahiour diverges; recommend using only one type of resolver.
Should be backported to stable versions.
Link: https://www.mail-archive.com/haproxy@formilux.org/msg45663.html
Co-authored-by: Lukas Tribus <lukas@ltri.eu>
However the doc purposely says the opposite, to encourage migrating away
from "ip". The goal is that in the future we change "ip" to mean "ipv6",
which seems to be what most users naturally expect. But we cannot break
configurations in the LTS version so for now "ipv4" is the alias.
The reason for not changing it in the table is that the type name is
used at a few places (look for "].kw"):
- dumps
- promex
We'd rather not change that output for 3.2, but only do it in 3.3.
This way, 3.2 can be made future-proof by using "ipv4" in the config
without any other side effect.
Please see github issue #2962 for updates on this transition.
Add the "crt-store" keyword with its argument in the "3.12" section, so
this could be detected by haproxy-dconv has a keyword and put in the
keywords list.
Must be backported as far as 3.0
Remove the 'acme ps' command which does not seem useful anymore with the
'acme status' command.
The big difference with the 'acme status' command is that it was only
displaying the running tasks instead of the status of all certificate.
The "acme status" command, shows the status of every certificates
configured with ACME, not only the running task like "acme ps".
The IO handler loops on the ckch_store tree and outputs a line for each
ckch_store which has an acme section set. This is still done under the
ckch_store lock and doesn't support resuming when the buffer is full,
but we need to change that in the future.
Released version 3.2-dev14 with the following main changes :
- MINOR: acme: retry label always do a request
- MINOR: acme: does not leave task for next request
- BUG/MINOR: acme: reinit the retries only at next request
- MINOR: acme: change the default max retries to 5
- MINOR: acme: allow a delay after a valid response
- MINOR: acme: wait 5s before checking the challenges results
- MINOR: acme: emit a log when starting
- MINOR: acme: delay of 5s after the finalize
- BUG/MEDIUM: quic: Let it be known if the tasklet has been released.
- BUG/MAJOR: tasks: fix task accounting when killed
- CLEANUP: tasks: use the local state, not t->state, to check for tasklets
- DOC: acme: external account binding is not supported
- MINOR: hlua: ignore "tune.lua.bool-sample-conversion" if set after "lua-load"
- MEDIUM: peers: Give up if we fail to take locks in hot path
- MEDIUM: stick-tables: defer adding updates to a tasklet
- MEDIUM: stick-tables: Limit the number of old entries we remove
- MEDIUM: stick-tables: Limit the number of entries we expire
- MINOR: cfgparse-global: add explicit error messages in cfg_parse_global_env_opts
- MINOR: ssl: add function to extract X509 notBefore date in time_t
- BUILD: acme: need HAVE_ASN1_TIME_TO_TM
- MINOR: acme: move the acme task init in a dedicated function
- MEDIUM: acme: add a basic scheduler
- MINOR: acme: emit a log when the scheduler can't start the task
This patch implements a very basic scheduler for the ACME tasks.
The scheduler is a task which is started from the postparser function
when at least one acme section was configured.
The scheduler will loop over the certificates in the ckchs_tree, and for
each certificate will start an ACME task if the notAfter date is past
curtime + (notAfter - notBefore) / 12, or 7 days if notBefore is not
available.
Once the lookup over all certificates is terminated, the task will sleep
and will wakeup after 12 hours.
tune.lua.bool-sample-conversion must be set before any lua-load or
lua-load-per-thread is used for it to be considered. Indeed, lua-load
directives are parsed on the fly and will cause some parts of the scripts
to be executed during init already (script body/init contexts).
As such, we cannot afford to have "tune.lua.bool-sample-conversion" set
after some Lua code was loaded, because it would mean that the setting
would be handled differently for Lua's code executed during or after
config parsing.
To avoid ambiguities, the documentation now states that the setting must
be set before any lua-load(-per-thread) directive, and if the setting
is met after some Lua was already loaded, the directive is ignored and
a warning informs about that.
It should fix GH #2957
It may be backported with 29b6d8af16 ("MINOR: hlua: rename
"tune.lua.preserve-smp-bool" to "tune.lua.bool-sample-conversion"")
Released version 3.2-dev13 with the following main changes :
- MEDIUM: checks: Make sure we return the tasklet from srv_chk_io_cb
- MEDIUM: listener: Make sure w ereturn the tasklet from accept_queue_process
- MEDIUM: mux_fcgi: Make sure we return the tasklet from fcgi_deferred_shut
- MEDIUM: quic: Make sure we return the tasklet from qcc_io_cb
- MEDIUM: quic: Make sure we return NULL in quic_conn_app_io_cb if needed
- MEDIUM: quic: Make sure we return the tasklet from quic_accept_run
- BUG/MAJOR: tasklets: Make sure he tasklet can't run twice
- BUG/MAJOR: listeners: transfer connection accounting when switching listeners
- MINOR: ssl/cli: add a '-t' option to 'show ssl sni'
- DOC: config: fix ACME paragraph rendering issue
- DOC: config: clarify log-forward "host" option
- MINOR: promex: expose ST_I_PX_RATE (current_session_rate)
- BUILD: acme: use my_strndup() instead of strndup()
- BUILD: leastconn: fix build warning when building without threads on old machines
- MINOR: threads: prepare DEBUG_THREAD to receive more values
- MINOR: threads: turn the full lock debugging to DEBUG_THREAD=2
- MEDIUM: threads: keep history of taken locks with DEBUG_THREAD > 0
- MINOR: threads/cli: display the lock history on "show threads"
- MEDIUM: thread: set DEBUG_THREAD to 1 by default
- BUG/MINOR: ssl/acme: free EVP_PKEY upon error
- MINOR: acme: separate the code generating private keys
- MINOR: acme: failure when no directory is specified
- MEDIUM: acme: generate the account file when not found
- MEDIUM: acme: use 'crt-base' to load the account key
- MINOR: compiler: add more macros to detect macro definitions
- MINOR: cli: split APPCTX_CLI_ST1_PROMPT into two distinct flags
- MEDIUM: cli: make the prompt mode configurable between n/i/p
- MEDIUM: mcli: make the prompt mode configurable between i/p
- MEDIUM: mcli: replicate the current mode when enterin the worker process
- DOC: configuration: acme account key are auto generated
- CLEANUP: acme: remove old TODO for account key
- DOC: configuration: add quic4 to the ssl-f-use example
- BUG/MINOR: acme: does not try to unlock after a failed trylock
- BUG/MINOR: mux-h2: fix the offset of the pattern for the ping frame
- MINOR: tcp: add support for setting TCP_NOTSENT_LOWAT on both sides
- BUG/MINOR: acme: creating an account should not end the task
- MINOR: quic: rename min/max fields for congestion window algo
- MINOR: quic: refactor BBR API
- BUG/MINOR: quic: ensure cwnd limits are always enforced
- MINOR: thread: define cshared type
- MINOR: quic: account for global congestion window
- MEDIUM: quic: limit global Tx memory
- MEDIUM: acme: use a map to store tokens and thumbprints
- BUG/MINOR: acme: remove references to virt@acme
- MINOR: applet: add appctx_schedule() macro
- BUG/MINOR: dns: add tempo between 2 connection attempts for dns servers
- CLEANUP: dns: remove unused dns_stream_server struct member
- BUG/MINOR: dns: prevent ds accumulation within dss
- CLEANUP: proxy: mention that px->conn_retries isn't relevant in some cases
- DOC: ring: refer to newer RFC5424
- MINOR: tools: make my_strndup() take a size_t len instead of and int
- MINOR: Add "sigalg" to "sigalg name" helper function
- MINOR: ssl: Add traces to ssl init/close functions
- MINOR: ssl: Add traces to recv/send functions
- MINOR: ssl: Add traces to ssl_sock_io_cb function
- MINOR: ssl: Add traces around SSL_do_handshake call
- MINOR: ssl: Add traces to verify callback
- MINOR: ssl: Add ocsp stapling callback traces
- MINOR: ssl: Add traces to the switchctx callback
- MINOR: ssl: Add traces about sigalg extension parsing in clientHello callback
- MINOR: Add 'conn' param to ssl_sock_chose_sni_ctx
- BUG/MEDIUM: mux-spop: Wait end of handshake to declare a spop connection ready
- BUG/MEDIUM: mux-spop: Handle CLOSING state and wait for AGENT DISCONNECT frame
- BUG/MINOR: mux-h1: Don't pretend connection was released for TCP>H1>H2 upgrade
- BUG/MINOR: mux-h1: Fix trace message in h1_detroy() to not relay on connection
- BUILD: ssl: Fix wolfssl build
- BUG/MINOR: mux-spop: Use the right bitwise operator in spop_ctl()
- MEDIUM: mux-quic: increase flow-control on each bufsize
- MINOR: mux-quic: limit emitted MSD frames count per qcs
- MINOR: add hlua_yield_asap() helper
- MINOR: hlua_fcn: enforce yield after *_get_stats() methods
- DOC: config: restore default values for resolvers hold directive
- MINOR: ssl/cli: "acme ps" shows the acme tasks
- MINOR: acme: acme_ctx_destroy() returns upon NULL
- MINOR: acme: use acme_ctx_destroy() upon error
- MEDIUM: tasks: Mutualize code between tasks and tasklets.
- MEDIUM: tasks: More code factorization
- MEDIUM: tasks: Remove TASK_IN_LIST and use TASK_QUEUED instead.
- MINOR: tasks: Remove unused tasklet_remove_from_tasklet_list
- MEDIUM: tasks: Mutualize the TASK_KILLED code between tasks and tasklets
- BUG/MEDIUM: connections: Report connection closing in conn_create_mux()
- BUILD/MEDIUM: quic: Make sure we build with recent changes
Implement a way to display the running acme tasks over the CLI.
It currently only displays a "Running" status with the certificate name
and the acme section from the configuration.
The displayed running tasks are limited to the size of a buffer for now,
it will require a backref list later to be called multiple times to
resume the list.
Default values for hold directive (resolver context) used to be documented
but this was lost when the keyword description was reworked in 24b319b
("Default value is 10s for "valid", 0s for "obsolete" and 30s for
others.")
Restoring the part that describes the default value.
It may be backported to all stable versions with 24b319b
The stateless mode which was documented previously in the ACME example
is not convenient for all use cases.
First, when HAProxy generates the account key itself, you wouldn't be
able to put the thumbprint in the configuration, so you will have to get
the thumbprint and then reload.
Second, in the case you are using multiple account key, there are
multiple thumbprint, and it's not easy to know which one you want to use
when responding to the challenger.
This patch allows to configure a map in the acme section, which will be
filled by the acme task with the token corresponding to the challenge,
as the key, and the thumbprint as the value. This way it's easy to reply
the right thumbprint.
Example:
http-request return status 200 content-type text/plain lf-string "%[path,field(-1,/)].%[path,field(-1,/),map(virt@acme)]\n" if { path_beg '/.well-known/acme-challenge/' }
Define a new settings tune.quic.frontend.max-tot-window. It contains a
size argument which can be used to set a limit on the sum of all QUIC
connections congestion window. This is applied both on
quic_cc_path_set() and quic_cc_path_inc().
Note that this limitation cannot reduce a congestion window more than
the minimal limit which is set to 2 datagrams.
TCP_NOTSENT_LOWAT is very convenient as it indicates when to report
EAGAIN on the sending side. It takes a margin on top of the estimated
window, meaning that it's no longer needed to store too many data in
socket buffers. Instead there's just enough to fill the send window
and a little bit of margin to cover the scheduling time to restart
sending. Experiments on a 100ms network have shown a 10-fold reduction
in the memory used by socket buffers by just setting this value to
tune.bufsize, without noticing any performance degradation. Theoretically
the responsiveness on multiplexed protocols such as H2 should also be
improved.
While humans can find it convenient to enter the worker process in prompt
mode, for external tools it will not be convenient to have to systematically
disable it. A better approach is to replicate the master socket's mode
there, since it has already been configured to suit the user: interactive,
prompt and timed modes are automatically passed to the worker process.
This makes the using the worker commands more natural from the master
process, without having to systematically adapt it for each new connection.
Support the same syntax in master mode as in worker mode in order to
configure the prompt. The only thing is that for now the master doesn't
have a non-interactive mode and it doesn't seem necessary to implement
it, so we only support the interactive and prompt modes. However the code
was written in a way that makes it easy to change this later if desired.
Now the prompt mode can more finely be configured between non-interactive
(default), interactive without prompt, and interactive with prompt. This
will ease the usage from automated tools which are not necessarily
interested in having to consume '> ' after each command nor displaying
"+" on payload lines. This can also be convenient when coming from the
master CLI to keep the same output format.
log-forward "host" option may be confusing because we often mention the
host field when talking about syslog RFC3164 or RFC5424 messages, but
neither rfc actually define "host" field. In fact, everywhere we used
"host field" we actually meant "hostname field" as documented in RFC5424.
This was a language abuse on our side.
In this patch we replace "host" with "hostname" where relevant in the
documentation to prevent confusion.
Thanks to Nick Ramirez for having reported the issue.
Nick Ramirez reported that the ACME paragraph (3.13) caused a rendering
issue where simple text was rendered as a directive. This was caused
by the use of unescaped <name> which confuses dconv.
Let's escape <name> by putting quotes around it to prevent the rendering
issue.
No backport needed.
Add a -t option to 'show ssl sni', allowing to add an offset to the
current date so it would allow to check which certificates are expired
after a certain period of time.
Released version 3.2-dev12 with the following main changes :
- BUG/MINOR: quic: do not crash on CRYPTO ncbuf alloc failure
- BUG/MINOR: proxy: always detach a proxy from the names tree on free()
- CLEANUP: proxy: detach the name node in proxy_free_common() instead
- CLEANUP: Slightly reorder some proxy option flags to free slots
- MINOR: proxy: Add options to drop HTTP trailers during message forwarding
- MINOR: h1-htx: Skip C-L and T-E headers for 1xx and 204 messages during parsing
- MINOR: mux-h1: Keep custom "Content-Length: 0" header in 1xx and 204 messages
- MINOR: hlua/h1: Use http_parse_cont_len_header() to parse content-length value
- CLEANUP: h1: Remove now useless h1_parse_cont_len_header() function
- BUG/MEDIUM: mux-spop: Respect the negociated max-frame-size value to send frames
- MINOR: http-act: Add 'pause' action to temporarily suspend the message analysis
- MINOR: acme/cli: add the 'acme renew' command to the help message
- MINOR: httpclient: add an "https" log-format
- MEDIUM: acme: use a customized proxy
- MEDIUM: acme: rename "uri" into "directory"
- MEDIUM: acme: rename "account" into "account-key"
- MINOR: stick-table: use a separate lock label for updates
- MINOR: h3: simplify h3_rcv_buf return path
- BUG/MINOR: mux-quic: fix possible infinite loop during decoding
- BUG/MINOR: mux-quic: do not decode if conn in error
- BUG/MINOR: cli: Issue an error when too many args are passed for a command
- MINOR: cli: Use a full prompt command for bidir connections with workers
- MAJOR: cli: Refacor parsing and execution of pipelined commands
- MINOR: cli: Rename some CLI applet states to reflect recent refactoring
- CLEANUP: applet: Update st0/st1 comment in appctx structure
- BUG/MINOR: hlua: Fix I/O handler of lua CLI commands to not rely on the SC
- BUG/MINOR: ring: Fix I/O handler of "show event" command to not rely on the SC
- MINOR: cli/applet: Move appctx fields only used by the CLI in a private context
- MINOR: cache: Add a pointer on the cache in the cache applet context
- MINOR: hlua: Use the applet name in error messages for lua services
- MINOR: applet: Save the "use-service" rule in the stream to init a service applet
- CLEANUP: applet: Remove unsued rule pointer in appctx structure
- BUG/MINOR: master/cli: properly trim the '@@' process name in error messages
- MEDIUM: resolvers: add global "dns-accept-family" directive
- MINOR: resolvers: add command-line argument -4 to force IPv4-only DNS
- MINOR: sock-inet: detect apparent IPv6 connectivity
- MINOR: resolvers: add "dns-accept-family auto" to rely on detected IPv6
- MEDIUM: acme: use Retry-After value for retries
- MEDIUM: acme: reset the remaining retries
- MEDIUM: acme: better error/retry management of the challenge checks
- BUG/MEDIUM: cli: Handle applet shutdown when waiting for a command line
- Revert "BUG/MINOR: master/cli: properly trim the '@@' process name in error messages"
- BUG/MINOR: master/cli: only parse the '@@' prefix on complete lines
- MINOR: resolvers: use the runtime IPv6 status instead of boot time one
Instead of always having to force IPv4 or IPv6, let's now also offer
"auto" which will only enable IPv6 if the system has a default gateway
for it. This means that properly configured dual-stack systems will
default to "ipv4,ipv6" while those lacking a gateway will only use
"ipv4". Note that no real connectivity test is performed, so firewalled
systems may still get it wrong and might prefer to rely on a manual
"ipv4" assignment.
In order to ease troubleshooting and testing, the new "-4" command line
argument enforces queries and processing of "A" DNS records only, i.e.
those representing IPv4 addresses. This can be useful when a host lack
end-to-end dual-stack connectivity. This overrides the global
"dns-accept-family" directive and is equivalent to value "ipv4".
By default, DNS resolvers accept both IPv4 and IPv6 addresses. This can be
influenced by the "resolve-prefer" keywords on server lines as well as the
family argument to the "do-resolve" action, but that is only a preference,
which does not block the other family from being used when it's alone. In
some environments where dual-stack is not usable, stumbling on an unreachable
IPv6-only DNS record can cause significant trouble as it will replace a
previous IPv4 one which would possibly have continued to work till next
request. The "dns-accept-family" global option permits to enforce usage of
only one (or both) address families. The argument is a comma-delimited list
of the following words:
- "ipv4": query and accept IPv4 addresses ("A" records)
- "ipv6": query and accept IPv6 addresses ("AAAA" records)
When a single family is used, no request will be sent to resolvers for the
other family, and any response for the othe family will be ignored. The
default value is "ipv4,ipv6", which effectively enables both families.
The 'pause' HTTP action can now be used to suspend for a moment the message
analysis. A timeout, expressed in milliseconds using a time-format
parameter, or an expression can be used. If an expression is used, errors
and invalid values are ignored.
Internally, the action will set the analysis expiration date on the
corresponding channel to the configured value and it will yield while it is
not expired.
The 'pause' action is available for 'http-request' and 'http-response'
rules.
In RFC9110, it is stated that trailers could be merged with the
headers. While it should be performed with a speicial care, it may be a
problem for some applications. To avoid any trouble with such applications,
two new options were added to drop trailers during the message forwarding.
On the backend, "http-drop-request-trailers" option can be enabled to drop
trailers from the requests before sending them to the server. And on the
frontend, "http-drop-response-trailers" option can be enabled to drop
trailers from the responses before sending them to the client. The options
can be defined in defaults sections and disabled with "no" keyword.
This patch should fix the issue #2930.
Released version 3.2-dev11 with the following main changes :
- CI: enable weekly QuicTLS build
- DOC: management: slightly clarify the prefix role of the '@' command
- DOC: management: add a paragraph about the limitations of the '@' prefix
- MINOR: master/cli: support bidirectional communications with workers
- MEDIUM: ssl/ckch: add filename and linenum argument to crt-store parsing
- MINOR: acme: add the acme section in the configuration parser
- MINOR: acme: add configuration for the crt-store
- MINOR: acme: add private key configuration
- MINOR: acme/cli: add the 'acme renew' command
- MINOR: acme: the acme section is experimental
- MINOR: acme: get the ACME directory
- MINOR: acme: handle the nonce
- MINOR: acme: check if the account exist
- MINOR: acme: generate new account
- MINOR: acme: newOrder request retrieve authorizations URLs
- MINOR: acme: allow empty payload in acme_jws_payload()
- MINOR: acme: get the challenges object from the Auth URL
- MINOR: acme: send the request for challenge ready
- MINOR: acme: implement a check on the challenge status
- MINOR: acme: generate the CSR in a X509_REQ
- MINOR: acme: finalize by sending the CSR
- MINOR: acme: verify the order status once finalized
- MINOR: acme: implement retrieval of the certificate
- BUG/MINOR: acme: ckch_conf_acme_init() when no filename
- MINOR: ssl/ckch: handle ckch_conf in ckchs_dup() and ckch_conf_clean()
- MINOR: acme: copy the original ckch_store
- MEDIUM: acme: replace the previous ckch instance with new ones
- MINOR: acme: schedule retries with a timer
- BUILD: acme: enable the ACME feature when JWS is present
- BUG/MINOR: cpu-topo: check the correct variable for NULL after malloc()
- BUG/MINOR: acme: key not restored upon error in acme_res_certificate()
- BUG/MINOR: thread: protect thread_cpus_enabled_at_boot with USE_THREAD
- MINOR: acme: default to 2048bits for RSA
- DOC: acme: explain how to configure and run ACME
- BUG/MINOR: debug: remove the trailing \n from BUG_ON() statements
- DOC: config: add the missing "profiling.memory" to the global kw index
- DOC: config: add the missing "force-cfg-parser-pause" to the global kw index
- DEBUG: init: report invalid characters in debug description strings
- DEBUG: rename DEBUG_GLITCHES to DEBUG_COUNTERS and enable it by default
- DEBUG: counters: make COUNT_IF() only appear at DEBUG_COUNTERS>=1
- DEBUG: counters: add the ability to enable/disable updating the COUNT_IF counters
- MINOR: tools: let dump_addr_and_bytes() support dumping before the offset
- MINOR: debug: in call traces, dump the 8 bytes before the return address, not after
- MINOR: debug: detect call instructions and show the branch target in backtraces
- BUG/MINOR: acme: fix possible NULL deref
- CLEANUP: acme: stored value is overwritten before it can be used
- BUILD: incompatible pointer type suspected with -DDEBUG_UNIT
- BUG/MINOR: http-ana: Properly detect client abort when forwarding the response
- BUG/MEDIUM: http-ana: Report 502 from req analyzer only during rsp forwarding
- CI: fedora rawhide: enable unit tests
- DOC: configuration: fix a typo in ACME documentation
- MEDIUM: sink: add a new dpapi ring buffer
- Revert "BUG/MINOR: acme: key not restored upon error in acme_res_certificate()"
- BUG/MINOR: acme: key not restored upon error in acme_res_certificate() V2
- BUG/MINOR: acme: fix the exponential backoff of retries
- DOC: configuration: specify limitations of ACME for 3.2
- MINOR: acme: emit logs instead of ha_notice
- MINOR: acme: add a success message to the logs
- BUG/MINOR: acme/cli: fix certificate name in error message
- MINOR: acme: register the task in the ckch_store
- MINOR: acme: free acme_ctx once the task is done
- BUG/MEDIUM: h3: trim whitespaces when parsing headers value
- BUG/MEDIUM: h3: trim whitespaces in header value prior to QPACK encoding
- BUG/MINOR: h3: filter upgrade connection header
- BUG/MINOR: h3: reject invalid :path in request
- BUG/MINOR: h3: reject request URI with invalid characters
- MEDIUM: h3: use absolute URI form with :authority
- BUG/MEDIUM: hlua: fix hlua_applet_{http,tcp}_fct() yield regression (lost data)
- BUG/MINOR: mux-h2: prevent past scheduling with idle connections
- BUG/MINOR: rhttp: fix reconnect if timeout connect unset
- BUG/MINOR: rhttp: ensure GOAWAY can be emitted after reversal
- BUG/MINOR: mux-h2: do not apply timer on idle backend connection
- MINOR: mux-h2: refactor idle timeout calculation
- MINOR: mux-h2: prepare to support PING emission
- MEDIUM: server/mux-h2: implement idle-ping on backend side
- MEDIUM: listener/mux-h2: implement idle-ping on frontend side
- MINOR: mux-h2: do not emit GOAWAY on idle ping expiration
- MINOR: mux-h2: handle idle-ping on conn reverse
- BUILD: makefile: enable backtrace by default on musl
- BUG/MINOR: threads: set threads_idle and threads_harmless even with no threads
- BUG/MINOR debug: fix !USE_THREAD_DUMP in ha_thread_dump_fill()
- BUG/MINOR: wdt/debug: avoid signal re-entrance between debugger and watchdog
- BUG/MINOR: debug: detect and prevent re-entrance in ha_thread_dump_fill()
- MINOR: debug: do not statify a few debugging functions often used with wdt/dbg
- MINOR: tools: also protect the library name resolution against concurrent accesses
- MINOR: tools: protect dladdr() against reentrant calls from the debug handler
- MINOR: debug: protect ha_dump_backtrace() against risks of re-entrance
- MINOR: tinfo: keep a copy of the pointer to the thread dump buffer
- MINOR: debug: always reset the dump pointer when done
- MINOR: debug: remove unused case of thr!=tid in ha_thread_dump_one()
- MINOR: pass a valid buffer pointer to ha_thread_dump_one()
- MEDIUM: wdt: always make the faulty thread report its own warnings
- MINOR: debug: make ha_stuck_warning() only work for the current thread
- MINOR: debug: make ha_stuck_warning() print the whole message at once
- CLEANUP: debug: no longer set nor use TH_FL_DUMPING_OTHERS
- MINOR: sched: add a new function is_sched_alive() to report scheduler's health
- MINOR: wdt: use is_sched_alive() instead of keeping a local ctxsw copy
- MINOR: sample: add 4 new sample fetches for clienthello parsing
- REGTEST: add new reg-test for the 4 new clienthello fetches
- MINOR: servers: Move the per-thread server initialization earlier
- MINOR: proxies: Initialize the per-thread structure earlier.
- MINOR: servers: Provide a pointer to the server in srv_per_tgroup.
- MINOR: lb_fwrr: Move the next weight out of fwrr_group.
- MINOR: proxies: Add a per-thread group lbprm struct.
- MEDIUM: lb_fwrr: Use one ebtree per thread group.
- MEDIUM: lb_fwrr: Don't start all thread groups on the same server.
- MINOR: proxies: Do stage2 initialization for sinks too
This patch contains this 4 new fetches and doc changes for the new fetches:
- req.ssl_cipherlist
- req.ssl_sigalgs
- req.ssl_keyshare_groups
- req.ssl_supported_groups
Towards:#2532
Instead of using the thread dump buffer for post-mortem analysis, we'll
keep a copy of the assigned pointer whenever it's used, even for warnings
or "show threads". This will offer more opportunities to figure from a
core what happened, and will give us more freedom regarding the value of
the thread_dump_buffer itself. For example, even at the end of the dump
when the pointer is reset, the last used buffer is now preserved.
This commit extends MUX H2 connection reversal step to properly take
into account the new idle-ping feature. It first ensures that h2c task
is properly instantiated/freed depending now on both timers and
idle-ping configuration. Also, h2c_update_timeout() is now called
instead of manually requeuing the task, which ensures the proper timer
value is selected depending on the new connection side.
This commit is the counterpart of the previous one, adapted on the
frontend side. "idle-ping" is added as keyword to bind lines, to be able
to refresh client timeout of idle frontend connections.
H2 MUX behavior remains similar as the previous patch. The only
significant change is in h2c_update_timeout(), as idle-ping is now taken
into account also for frontend connection. The calculated value is
compared with http-request/http-keep-alive timeout value. The shorter
delay is then used as expired date. As hr/ka timeout are based on
idle_start, this allows to run them in parallel with an idle-ping timer.
This commit implements support for idle-ping on the backend side. First,
a new server keyword "idle-ping" is defined in configuration parsing. It
is used to set the corresponding new server member.
The second part of this commit implements idle-ping support on H2 MUX. A
new inlined function conn_idle_ping() is defined to access connection
idle-ping value. Two new connection flags are defined H2_CF_IDL_PING and
H2_CF_IDL_PING_SENT. The first one is set for idle connections via
h2c_update_timeout().
On h2_timeout_task() handler, if first flag is set, instead of releasing
the connection as before, the second flag is set and tasklet is
scheduled. As both flags are now set, h2_process_mux() will proceed to
PING emission. The timer has also been rearmed to the idle-ping value.
If a PING ACK is received before next timeout, connection timer is
refreshed. Else, the connection is released, as with timer expiration.
Also of importance, special care is needed when a backend connection is
going to idle. In this case, idle-ping timer must be rearmed. Thus a new
invokation of h2c_update_timeout() is performed on h2_detach().
These counters can have a noticeable cost on large machines, though not
dramatic. There's no single good choice to keep them enabled or disabled.
This commit adds multiple choices:
- DEBUG_COUNTERS set to 2 will automatically enable them by default, while
1 will disable them by default
- the global "debug.counters on/off" will allow to change the setting at
boot, regardless of DEBUG_COUNTERS as long as it was at least 1.
- the CLI "debug counters on/off" will also allow to change the value at
run time, allowing to observe a phenomenon while it's happening, or to
disable counters if it's suspected that their cost is too high
Finally, the "debug counters" command will append "(stopped)" at the end
of the CNT lines when these counters are stopped.
Not that the whole mechanism would easily support being extended to all
counter types by specifying the types to apply to, but it doesn't seem
useful at all and would require the user to also type "cnt" on debug
lines. This may easily be changed in the future if it's found relevant.
Till now the per-line glitches counters were only enabled with the
confusingly named DEBUG_GLITCHES (which would not turn glitches off
when disabled). Let's instead change it to DEBUG_COUNTERS and make sure
it's enabled by default (though it can still be disabled with
-DDEBUG_GLITCHES=0 just like for DEBUG_STRICT). It will later be
expanded to cover more counters.
Some rare commands in the worker require to keep their input open and
terminate when it's closed ("show events -w", "wait"). Others maintain
a per-session context ("set anon on"). But in its default operation
mode, the master CLI passes commands one at a time to the worker, and
closes the CLI's input channel so that the command can immediately
close upon response. This effectively prevents these two specific cases
from being used.
Here the approach that we take is to introduce a bidirectional mode to
connect to the worker, where everything sent to the master is immediately
forwarded to the worker (including the raw command), allowing to queue
multiple commands at once in the same session, and to continue to watch
the input to detect when the client closes. It must be a client's choice
however, since doing so means that the client cannot batch many commands
at once to the master process, but must wait for these commands to complete
before sending new ones. For this reason we use the prefix "@@<pid>" for
this. It works exactly like "@" except that it maintains the channel
open during the whole execution. Similarly to "@<pid>" with no command,
"@@<pid>" will simply open an interactive CLI session to the worker, that
will be ended by "quit" or by closing the connection. This can be convenient
for the user, and possibly for clients willing to dedicate a connection to
the worker.
The '@' prefix permits to execute a single command at once in a worker.
It is very handy but comes with some limitations affecting rare commands,
which is better to be documented (one command per session, input closed)
since it can seldom have user-visible effects.
While the examples were clear, the text did not fully imply what was
reflected there. Better have the text explicitly mention that the
'@' command may be used as a prefix or wrapper in front of a command
as well as a standalone command.
Released version 3.2-dev10 with the following main changes :
- REORG: ssl: move curves2nid and nid2nist to ssl_utils
- BUG/MEDIUM: stream: Fix a possible freeze during a forced shut on a stream
- MEDIUM: stream: Save SC and channel flags earlier in process_steam()
- BUG/MINOR: peers: fix expire learned from a peer not converted from ms to ticks
- BUG/MEDIUM: peers: prevent learning expiration too far in futur from unsync node
- CI: spell check: allow manual trigger
- CI: codespell: add "pres" to spellcheck whitelist
- CLEANUP: assorted typo fixes in the code, commits and doc
- CLEANUP: atomics: remove support for gcc < 4.7
- CLEANUP: atomics: also replace __sync_synchronize() with __atomic_thread_fence()
- TESTS: Fix build for filltab25.c
- MEDIUM: ssl: replace "crt" lines by "ssl-f-use" lines
- DOC: configuration: replace "crt" by "ssl-f-use" in listeners
- MINOR: backend: mark srv as nonnull in alloc_dst_address()
- BUG/MINOR: server: ensure check-reuse-pool is copied from default-server
- MINOR: server: activate automatically check reuse for rhttp@ protocol
- MINOR: check/backend: support conn reuse with SNI
- MINOR: check: implement check-pool-conn-name srv keyword
- MINOR: task: add thread safe notification_new and notification_wake variants
- BUG/MINOR: hlua_fcn: fix potential UAF with Queue:pop_wait()
- MINOR: hlua_fcn: register queue class using hlua_register_metatable()
- MINOR: hlua: add core.wait()
- MINOR: hlua: core.wait() takes optional delay paramater
- MINOR: hlua: split hlua_applet_tcp_recv_yield() in two functions
- MINOR: hlua: add AppletTCP:try_receive()
- MINOR: hlua_fcn: add Queue:alarm()
- MEDIUM: task: make notification_* API thread safe by default
- CLEANUP: log: adjust _lf_cbor_encode_byte() comment
- MEDIUM: ssl/crt-list: warn on negative wildcard filters
- MEDIUM: ssl/crt-list: warn on negative filters only
- BUILD: atomics: fix build issue on non-x86/non-arm systems
- BUG/MINOR: log: fix CBOR encoding with LOG_VARTEXT_START() + lf_encode_chunk()
- BUG/MEDIUM: sample: fix risk of overflow when replacing multiple regex back-refs
- DOC: configuration: rework the crt-list section
- MINOR: ring: support arbitrary delimiters through ring_dispatch_messages()
- MINOR: ring/cli: support delimiting events with a trailing \0 on "show events"
- DEV: h2: fix h2-tracer.lua nil value index
- BUG/MINOR: backend: do not use the source port when hashing clientip
- BUG/MINOR: hlua: fix invalid errmsg use in hlua_init()
- MINOR: proxy: add setup_new_proxy() function
- MINOR: checks: mark CHECKS-FE dummy frontend as internal
- MINOR: flt_spoe: mark spoe agent frontend as internal
- MEDIUM: tree-wide: avoid manually initializing proxies
- MINOR: proxy: add deinit_proxy() helper func
- MINOR: checks: deinit checks_fe upon deinit
- MINOR: flt_spoe: deinit spoe agent proxy upon agent release
At the moment it is not supported to produce multi-line events on the
"show events" output, simply because the LF character is used as the
default end-of-event mark. However it could be convenient to produce
well-formatted multi-line events, e.g. in JSON or other formats. UNIX
utilities have already faced similar needs in the past and added
"-print0" to "find" and "-0" to "xargs" to mention that the delimiter
is the NUL character. This makes perfect sense since it's never present
in contents, so let's do exactly the same here.
Thus from now on, "show events <ring> -0" will delimit messages using
a \0 instead of a \n, permitting a better and safer encapsulation.
Queue:alarm() sets a wakeup alarm on the task when new data becomes
available on Queue. It must be re-armed for each event.
Lua documentation was updated
This is the non-blocking variant for AppletTCP:receive(). It doesn't
take any argument, instead it tries to read as much data as available
at once. If no data is available, empty string is returned.
Lua documentation was updated.
core.wait() now accepts optional delay parameter in ms. Passed this delay
the task is woken up if no event woke the task before.
Lua documentation was updated.
Similar to core.yield(), except that the task is not woken up
automatically, instead it waits for events to trigger the task
wakeup.
Lua documentation was updated.
This commit is a direct follow-up of the previous one. It defines a new
server keyword check-pool-conn-name. It is used as the default value for
the name parameter of idle connection hash generation.
Its behavior is similar to server keyword pool-conn-name, but reserved
for checks reuse. If check-pool-conn-name is set, it is used in priority
to match a connection for reuse. If unset, a fallback is performed on
check-sni.
Without check-reuse-pool, it is impossible to perform check on server
using @rhttp protocol. This is due to the inherent nature of the
protocol which does not implement an active connect method.
Thus, ensure that check-reuse-pool is always set when a reverse HTTP
server is declared. This reduces server configuration and should prevent
any omission. Note that it is still require to add "check" server
keyword so activate server checks.
Replace the "crt" keyword from the frontend section with a "ssl-f-use"
keyword, "crt" could be ambigous in case we don't want to put a
certificate filename.
Released version 3.2-dev9 with the following main changes :
- MINOR: quic: move global tune options into quic_tune
- CLEANUP: quic: reorganize TP flow-control initialization
- MINOR: quic: ignore uni-stream for initial max data TP
- MINOR: mux-quic: define config for max-data
- MINOR: quic: define max-stream-data configuration as a ratio
- MEDIUM: lb-chash: add directive hash-preserve-affinity
- MEDIUM: pools: be a bit smarter when merging comparable size pools
- REGTESTS: disable the test balance/balance-hash-maxqueue
- BUG/MINOR: log: fix gcc warn about truncating NUL terminator while init char arrays
- CI: fedora rawhide: allow "on: workflow_dispatch" in forks
- CI: fedora rawhide: install "awk" as a dependency
- CI: spellcheck: allow "on: workflow_dispatch" in forks
- CI: coverity scan: allow "on: workflow_dispatch" in forks
- CI: cross compile: allow "on: workflow_dispatch" in forks
- CI: Illumos: allow "on: workflow_dispatch" in forks
- CI: NetBSD: allow "on: workflow_dispatch" in forks
- CI: QUIC Interop on AWS-LC: allow "on: workflow_dispatch" in forks
- CI: QUIC Interop on LibreSSL: allow "on: workflow_dispatch" in forks
- MINOR: compiler: add __nonstring macro
- MINOR: thread: dump the CPU topology in thread_map_to_groups()
- MINOR: cpu-set: compare two cpu sets with ha_cpuset_isequal()
- MINOR: cpu-set: add a new function to print cpu-sets in human-friendly mode
- MINOR: cpu-topo: add a dump of thread-to-CPU mapping to -dc
- MINOR: cpu-topo: pass an extra argument to ha_cpu_policy
- MINOR: cpu-topo: add new cpu-policies "group-by-2-clusters" and above
- BUG/MINOR: config: silence .notice/.warning/.alert in discovery mode
- EXAMPLES: add "games.cfg" and an example game in Lua
- MINOR: jws: emit the JWK thumbprint
- TESTS: jws: change the jwk format
- MINOR: ssl/ckch: add substring parser for ckch_conf
- MINOR: mt_list: Implement mt_list_try_lock_prev().
- MINOR: lbprm: Add method to deinit server and proxy
- MINOR: threads: Add HA_RWLOCK_TRYRDTOWR()
- MAJOR: leastconn; Revamp the way servers are ordered.
- BUG/MINOR: ssl/ckch: leak in error path
- BUILD: ssl/ckch: potential null pointer dereference
- MINOR: log: support "raw" logformat node typecast
- CLEANUP: assorted typo fixes in the code and comments
- DOC: config: fix two missing "content" in "tcp-request" examples
- MINOR: cpu-topo: cpu_dump_topology() SMT info check little optimisation
- BUILD: compiler: undefine the CONCAT() macro if already defined
- BUG/MEDIUM: leastconn: Don't try to reposition if the server is down
- BUG/MINOR: rhttp: fix incorrect dst/dst_port values
- BUG/MINOR: backend: do not overwrite srv dst address on reuse
- BUG/MEDIUM: backend: fix reuse with set-dst/set-dst-port
- MINOR: sample: define bc_reused fetch
- REGTESTS: extend conn reuse test with transparent proxy
- MINOR: backend: fix comment when killing idle conns
- MINOR: backend: adjust conn_backend_get() API
- MINOR: backend: extract conn hash calculation from connect_server()
- MINOR: backend: extract conn reuse from connect_server()
- MINOR: backend: remove stream usage on connection reuse
- MINOR: check define check-reuse-pool server keyword
- MEDIUM: check: implement check-reuse-pool
- BUILD: backend: silence a build warning when not using ssl
- BUILD: quic_sock: address a strict-aliasing build warning with gcc 5 and 6
- BUILD: ssl_ckch: use my_strndup() instead of strndup()
- DOC: update INSTALL to reflect the minimum compiler version
Define a new server keyword check-reuse-pool, and its counterpart with a
"no" prefix. For the moment, only parsing is implemented. The real
behavior adjustment will be implemented in the next patch.
As reported by Uku Srmus in GitHub issue #2917, two "tcp-request" rules
in an example were mistakenly missing the "content" hook, rendering them
invalid.
This can be backported.
Implement mt_list_try_lock_prev(), that does the same thing
as mt_list_lock_prev(), exceot if the list is locked, it
returns { NULL, NULL } instaed of waiting.
This adds "group-by-{2,3,4}-clusters", which, as its name implies,
create one thread group per X clusters. This can be useful when CPUs
are split into too small clusters, as well as when the total number
of assigned cores is not even between the clusters, to try to spread
the load between less different ones.
When using hash-based load balancing, requests are always assigned to
the server corresponding to the hash bucket for the balancing key,
without taking maxconn or maxqueue into account, unlike in other load
balancing methods like 'first'. This adds a new backend directive that
can be used to take maxconn and possibly maxqueue in that context. This
can be used when hashing is desired to achieve cache locality, but
sending requests to a different server is preferable to queuing for a
long time or failing requests when the initial server is saturated.
By default, affinity is preserved as was the case previously. When
'hash-preserve-affinity' is set to 'maxqueue', servers are considered
successively in the order of the hash ring until a server that does not
have a full queue is found.
When 'maxconn' is set on a server, queueing cannot be disabled, as
'maxqueue=0' means unlimited. To support picking a different server
when a server is at 'maxconn' irrespective of the queue,
'hash-preserve-affinity' can be set to 'maxconn'.
Define a new global configuration tune.quic.frontend.max-data. This
allows users to explicitely set the value for the corresponding QUIC TP
initial-max-data, with direct impact on haproxy memory consumption.
Released version 3.2-dev8 with the following main changes :
- MINOR: jws: implement JWS signing
- TESTS: jws: implement a test for JWS signing
- CI: github: add "jose" to apt dependencies
- CLEANUP: log-forward: remove useless options2 init
- CLEANUP: log: add syslog_process_message() helper
- MINOR: proxy: add proxy->options3
- MINOR: log: migrate log-forward options from proxy->options2 to options3
- MINOR: log: provide source address information in syslog_process_message()
- MINOR: tools: only print address in sa2str() when port == -1
- MINOR: log: add "option host" log-forward option
- MINOR: log: handle log-forward "option host"
- MEDIUM: log: change default "host" strategy for log-forward section
- BUG/MEDIUM: thread: use pthread_self() not ha_pthread[tid] in set_affinity
- MINOR: compiler: add a simple macro to concatenate resolved strings
- MINOR: compiler: add a new __decl_thread_var() macro to declare local variables
- BUILD: tools: silence a build warning when USE_THREAD=0
- BUILD: backend: silence a build warning when threads are disabled
- DOC: management: rename some last occurences from domain "dns" to "resolvers"
- BUG/MINOR: stats: fix capabilities and hide settings for some generic metrics
- MINOR: cli: export cli_io_handler() to ease symbol resolution
- MINOR: tools: improve symbol resolution without dl_addr
- MINOR: tools: ease the declaration of known symbols in resolve_sym_name()
- MINOR: tools: teach resolve_sym_name() a few more common symbols
- BUILD: tools: avoid a build warning on gcc-4.8 in resolve_sym_name()
- DEV: ncpu: also emulate sysconf() for _SC_NPROCESSORS_*
- DOC: design-thoughts: commit numa-auto.txt
- MINOR: cpuset: make the API support negative CPU IDs
- MINOR: thread: rely on the cpuset functions to count bound CPUs
- MINOR: cpu-topo: add ha_cpu_topo definition
- MINOR: cpu-topo: allocate and initialize the ha_cpu_topo array.
- MINOR: cpu-topo: rely on _SC_NPROCESSORS_CONF to trim maxcpus
- MINOR: cpu-topo: add a function to dump CPU topology
- MINOR: cpu-topo: update CPU topology from excluded CPUs at boot
- REORG: cpu-topo: move bound cpu detection from cpuset to cpu-topo
- MINOR: cpu-topo: add detection of online CPUs on Linux
- MINOR: cpu-topo: add detection of online CPUs on FreeBSD
- MINOR: cpu-topo: try to detect offline cpus at boot
- MINOR: cpu-topo: add CPU topology detection for linux
- MINOR: cpu-topo: also store the sibling ID with SMT
- MINOR: cpu-topo: add NUMA node identification to CPUs on Linux
- MINOR: cpu-topo: add NUMA node identification to CPUs on FreeBSD
- MINOR: thread: turn thread_cpu_mask_forced() into an init-time variable
- MINOR: cfgparse: move the binding detection into numa_detect_topology()
- MINOR: cfgparse: use already known offline CPU information
- MINOR: global: add a command-line option to enable CPU binding debugging
- MINOR: cpu-topo: add a new "cpu-set" global directive to choose cpus
- MINOR: cpu-topo: add "drop-cpu" and "only-cpu" to cpu-set
- MEDIUM: thread: start to detect thread groups and threads min/max
- MEDIUM: cpu-topo: make sure to properly assign CPUs to threads as a fallback
- MEDIUM: thread: reimplement first numa node detection
- MEDIUM: cfgparse: remove now unused numa & thread-count detection
- MINOR: cpu-topo: refine cpu dump output to better show kept/dropped CPUs
- MINOR: cpu-topo: fall back to nominal_perf and scaling_max_freq for the capacity
- MINOR: cpu-topo: use cpufreq before acpi cppc
- MINOR: cpu-topo: boost the capacity of performance cores with cpufreq
- MINOR: cpu-topo: skip CPU detection when /sys/.../cpu does not exist
- MINOR: cpu-topo: skip identification of non-existing CPUs
- MINOR: cpu-topo: skip CPU properties that we've verified do not exist
- MINOR: cpu-topo: implement a sorting mechanism for CPU index
- MINOR: cpu-topo: implement a sorting mechanism by CPU locality
- MINOR: cpu-topo: implement a CPU sorting mechanism by cluster ID
- MINOR: cpu-topo: ignore single-core clusters
- MINOR: cpu-topo: assign clusters to cores without and renumber them
- MINOR: cpu-topo: make sure we don't leave unassigned IDs in the cpu_topo
- MINOR: cpu-topo: assign an L3 cache if more than 2 L2 instances
- MINOR: cpu-topo: renumber cores to avoid holes and make them contiguous
- MINOR: cpu-topo: add a function to sort by cluster+capacity
- MINOR: cpu-topo: consider capacity when forming clusters
- MINOR: cpu-topo: create an array of the clusters
- MINOR: cpu-topo: ignore excess of too small clusters
- MINOR: cpu-topo: add "only-node" and "drop-node" to cpu-set
- MINOR: cpu-topo: add "only-thread" and "drop-thread" to cpu-set
- MINOR: cpu-topo: add "only-core" and "drop-core" to cpu-set
- MINOR: cpu-topo: add "only-cluster" and "drop-cluster" to cpu-set
- MINOR: cpu-topo: add a CPU policy setting to the global section
- MINOR: cpu-topo: add a 'first-usable-node' cpu policy
- MEDIUM: cpu-topo: use the "first-usable-node" cpu-policy by default
- CLEANUP: thread: now remove the temporary CPU node binding code
- MINOR: cpu-topo: add cpu-policy "group-by-cluster"
- MEDIUM: cpu-topo: let the "group-by-cluster" split groups
- MINOR: cpu-topo: add a new "performance" cpu-policy
- MINOR: cpu-topo: add a new "efficiency" cpu-policy
- MINOR: cpu-topo: add a new "resource" cpu-policy
- MINOR: jws: add new functions in jws.h
- MINOR: cpu-topo: fix unused stack var 'cpu2' reported by coverity
- MINOR: hlua: add an optional timeout to AppletTCP:receive()
- MINOR: jws: use jwt_alg type instead of a char
- BUG/MINOR: log: prevent saddr NULL deref in syslog_io_handler()
- MINOR: stream: decrement srv->served after detaching from the list
- BUG/MINOR: hlua: fix optional timeout argument index for AppletTCP:receive()
- MINOR: server: simplify srv_has_streams()
- CLEANUP: server: make it clear that srv_check_for_deletion() is thread-safe
- MINOR: cli/server: don't take thread isolation to check for srv-removable
- BUG/MINOR: limits: compute_ideal_maxconn: don't cap remain if fd_hard_limit=0
- MINOR: limits: fix check_if_maxsock_permitted description
- BUG/MEDIUM: hlua/cli: fix cli applet UAF in hlua_applet_wakeup()
- MINOR: tools: path_base() concatenates a path with a base path
- MEDIUM: ssl/ckch: make the ckch_conf more generic
- BUG/MINOR: mux-h2: Reset streams with NO_ERROR code if full response was already sent
- MINOR: stats: add .generic explicit field in stat_col struct
- MINOR: stats: STATS_PX_CAP___B_ macro
- MINOR: stats: add .cap for some static metrics
- MINOR: stats: use stat_col storage stat_cols_info
- MEDIUM: promex: switch to using stat_cols_info for global metrics
- MINOR: promex: expose ST_I_INF_WARNINGS (AKA total_warnings) metric
- MEDIUM: promex: switch to using stat_cols_px for front/back/server metrics
- MINOR: stats: explicitly add frontend cap for ST_I_PX_REQ_TOT
- CLEANUP: promex: remove unused PROMEX_FL_{INFO,FRONT,BACK,LI,SRV} flags
- BUG/MEDIUM: mux-quic: fix crash on RS/SS emission if already close local
- BUG/MINOR: mux-quic: remove extra BUG_ON() in _qcc_send_stream()
- MEDIUM: mt_list: Reduce the max number of loops with exponential backoff
- MINOR: stats: add alt_name field to stat_col struct
- MINOR: stats: add alt name info to stat_cols_info where relevant
- MINOR: promex: get rid of promex_global_metric array
- MINOR: stats-proxy: add alt_name field for ME_NEW_{FE,BE,PX} helpers
- MINOR: stats-proxy: add alt name info to stat_cols_px where relevant
- MINOR: promex: get rid of promex_st_metrics array
- MINOR: pools: rename the "by_what" field of the show pools context to "how"
- MINOR: cli/pools: record the list of pool registrations even when merging them
By default, create_pool() tries to merge similar pools into one. But when
dealing with certain bugs, it's hard to say which ones were merged together.
We do have the information at registration time, so let's just create a
list of registrations ("pool_registration") attached to each pool, that
will store that information. It can then be consulted on the CLI using
"show pools detailed", where the names, sizes, alignment and flags are
reported.
TCP services might want to be interactive, and without a timeout on
receive(), the possibilities are a bit limited. Let's add an optional
timeout in the 3rd argument to possibly limit the wait time. In this
case if the timeout strikes before the requested size is complete,
a possibly incomplete block will be returned.
This cpu policy keeps the smallest CPU cluster. This can
be used to limit the resource usage to the strict minimum
that still delivers decent performance, for example to
try to further reduce power consumption or minimize the
number of cores needed on some rented systems for a
sidecar setup, in order to scale the system down more
easily. Note that if a single cluster is present, it
will still be fully used.
When started on a 64-core EPYC gen3, it uses only one CCX
with 8 cores and 16 threads, all in the same group.
This cpu policy tries to evict performant core clusters and only
focuses on efficiency-oriented ones. On an intel i9-14900k, we can
get 525k rps using 8 performance cores, versus 405k when using all
24 efficiency cores. In some cases the power savings might be more
desirable (e.g. scalability tests on a developer's laptop), or the
performance cores might be better suited for another component
(application or security component).
This cpu policy tries to evict efficient core clusters and only
focuses on performance-oriented ones. On an intel i9-14900k, we can
get 525k rps using only 8 cores this way, versus 594k when using all
24 cores. The gains from using all these codes are not significant
enough to waste them on this. Also these cores can be much slower
at doing SSL handshakes so it can make sense to evict them. Better
keep the efficiency cores for network interrupts for example.
Also, on a developer's machine it can be convenient to keep all these
cores for the local tasks and extra tools (load generators etc).
This policy forms thread groups from the CPU clusters, and bind all the
threads in them to all the CPUs of the cluster. This is recommended on
system with bad inter-CCX latencies. It was shown to simply triple the
performance with queuing on a 64-core EPYC without having to manually
assign the cores with cpu-map.
This now turns the cpu-policy to "first-usable-node" by default, so that
we preserve the current default behavior consisting in binding to the
first node if nothing was forced. If a second node is found,
global.nbthread is set and the previous code will be skipped.
This is a reimplemlentation of the current default policy. It binds to
the first node having usable CPUs if found, and drops CPUs from the
second and next nodes.
These are processed after the topology is detected, and they allow to
restrict binding to or evict CPUs matching the indicated hardware
cluster number(s). It can be used to bind to only some clusters, such
as CCX or different energy efficiency cores. For this reason, here we
use the cluster's local ID (local to the node).
These are processed after the topology is detected, and they allow to
restrict binding to or evict CPUs matching the indicated hardware
core number(s). It can be used to bind to only some clusters as well
as to evict efficient cores whose number is known.
These are processed after the topology is detected, and they allow to
restrict binding to or evict CPUs matching the indicated hardware
thread number(s). It can be used to reserve even threads for HW IRQs
and odd threads for haproxy for example, or to evict efficient cores
that do only have thread #0.
For now it's limited, it only supports "reset" to ask that any previous
"taskset" be ignored. The goal will be to later add more actions that
allow to symbolically define sets of cpus to bind to or to drop. This
also clears the cpu_mask_forced variable that is used to detect
that a taskset had been used.
During development, everything related to CPU binding and the CPU topology
is debugged using state dumps at various places, but it does make sense to
have a real command line option so that this remains usable in production
to help users figure why some CPUs are not used by default. Let's add
"-dc" for this. Since the list of global.tune.options values is almost
full and does not 100% match this option, let's add a new "tune.debug"
field for this.
Lots of collected data and observations aggregated into a single commit
so as not to lose them. Some parts below come from several commit
messages and are incremental.
Add captures and analysis of intel 14900 where it's not easy to draw
the line between the desired P and E cores.
The 14900 raises some questions (imagine a dual-die variant in multi-socket).
That's the start of an algorithmic distribution of performance cores into
thread groups.
cpu-map currently conflicts a lot with the choices after auto-detection
but it doesn't have to. The problem is the inability to configure the
threads for the whole process like taskset does. By offering this ability
we can also start to designate groups of CPUs symbolically (package, die,
ccx, cores, smt).
It can also be useful to exploit the info from cpuinfo that is not
available in /sys, such as the model number. At least on arm, higher
numbers indicate bigger cores and can be useful to distinguish cores
inside a cluster. It will not indicate big vs medium ones of the same
type (e.g. a78 3.0 vs 2.4 GHz) but can still be effective at identifying
the efficient ones.
In short, infos such as cluster ID not always reliable, and are
local to the package. die_id as well. die number is not reported
here but should definitely be used, as a higher priority than L3.
We're still missing a discriminant between the l3 and cluster number
in order to address heterogenous CPUs (e.g. intel 14900), though in
terms of locality that's currently done correctly.
CPU selection is also a full topic, and some thoughts were noted
regarding sorting by perf vs locality so as never to mix inter-
socket CPUs due to sorting.
The proposed cpu-selection cannot work as-is, because it acts both on
restriction and preference, and these two are not actions but a sequence.
First restrictions must be enforced, and second the remaining CPUs are
sorted according to the preferred criterion, and a number of threads are
selected.
Currently we refine the OS-exposed cluster number but it's not correct
as we can end up with something poorly numbered. We need to respect the
LLC in any case so let's explain the approach.
This is a complementary patch to cf913c2f9 ("DOC: management: rename show
stats domain cli "dns" to "resolvers"). The doc still refered to the
legacy "dns" domain filter for stat command. Let's rename those occurences
to "resolvers".
It may be backported to all stable versions.
Historically, log-forward proxy used to preserve host field from input
message as much as possible, and if syslog host wasn't provided
(rfc5424 '-' or bad rfc3164 or rfc5424 message) then "localhost" or "-"
would be used as host when outputting message using rfc3164 or rfc5424.
We change that behavior (which corresponds to "keep" host option), so that
log-forward now uses "fill" strategy as default: if the host is provided
in input message, it is preserved. However if it is missing and IP address
from sender is available, we use it.
Following previous patch, we know implement the logic for the host
option under log-forward section. Possible strategies are:
replace If input message already contains a value for the host
field, we replace it by the source IP address from the
sender.
If input message doesn't contain a value for the host field
(ie: '-' as input rfc5424 message or non compliant rfc3164
or rfc5424 message), we use the source IP address from the
sender as host field.
fill If input message already contains a value for the host field,
we keep it.
If input message doesn't contain a value for the host field
(ie: '-' as input rfc5424 message or non compliant rfc3164
or rfc5424 message), we use the source IP address from the
sender as host field.
keep If input message already contains a value for the host field,
we keep it.
If input message doesn't contain a value for the host field,
we set it to localhost (rfc3164) or '-' (rfc5424).
(This is the default)
append If input message already contains a value for the host field,
we append a comma followed by the IP address from the sender.
If input message doesn't contain a value for the host field,
we use the source IP address from the sender.
Default value (unchanged) is "keep" strategy. option host is only relevant
with rfc3164 or rfc5424 format on log targets. Also, if the source address
is not available (ie: UNIX socket), default behavior prevails.
Documentation was updated.
Released version 3.2-dev7 with the following main changes :
- BUG/MEDIUM: applet: Don't handle EOI/EOS/ERROR is applet is waiting for room
- BUG/MEDIUM: spoe/mux-spop: Introduce an NOOP action to deal with empty ACK
- BUG/MINOR: cfgparse: fix NULL ptr dereference in cfg_parse_peers
- BUG/MEDIUM: uxst: fix outgoing abns address family in connect()
- REGTESTS: fix reg-tests/server/abnsz.vtc
- BUG/MINOR: log: fix outgoing abns address family
- BUG/MINOR: sink: add tempo between 2 connection attempts for sft servers
- MINOR: clock: always use atomic ops for global_now_ms
- CI: QUIC Interop: clean old docker images
- BUG/MINOR: stream: do not call co_data() from __strm_dump_to_buffer()
- BUG/MINOR: mux-h1: always make sure h1s->sd exists in h1_dump_h1s_info()
- MINOR: tinfo: add a new thread flag to indicate a call from a sig handler
- BUG/MEDIUM: stream: never allocate connection addresses from signal handler
- MINOR: freq_ctr: provide non-blocking read functions
- BUG/MEDIUM: stream: use non-blocking freq_ctr calls from the stream dumper
- MINOR: tools: use only opportunistic symbols resolution
- CLEANUP: task: move the barrier after clearing th_ctx->current
- MINOR: compression: Introduce minimum size
- BUG/MINOR: h2: always trim leading and trailing LWS in header values
- MINOR: tinfo: split the signal handler report flags into 3
- BUG/MEDIUM: stream: don't use localtime in dumps from a signal handler
- OPTIM: connection: don't try to kill other threads' connection when !shared
- BUILD: add possibility to use different QuicTLS variants
- MEDIUM: fd: Wait if locked in fd_grab_tgid() and fd_take_tgid().
- MINOR: fd: Add fd_lock_tgid_cur().
- MEDIUM: epoll: Make sure we can add a new event
- MINOR: pollers: Add a fixup_tgid_takeover() method.
- MEDIUM: pollers: Drop fd events after a takeover to another tgid.
- MEDIUM: connections: Allow taking over connections from other tgroups.
- MEDIUM: servers: Add strict-maxconn.
- BUG/MEDIUM: server: properly initialize PROXY v2 TLVs
- BUG/MINOR: server: fix the "server-template" prefix memory leak
- BUG/MINOR: h3: do not report transfer as aborted on preemptive response
- CLEANUP: h3: fix documentation of h3_rcv_buf()
- MINOR: hq-interop: properly handle incomplete request
- BUG/MEDIUM: mux-fcgi: Try to fully fill demux buffer on receive if not empty
- MINOR: h1: permit to relax the websocket checks for missing mandatory headers
- BUG/MINOR: hq-interop: fix leak in case of rcv_buf early return
- BUG/MINOR: server: check for either proxy-protocol v1 or v2 to send hedaer
- MINOR: jws: implement a JWK public key converter
- DEBUG: init: add a way to register functions for unit tests
- TESTS: add a unit test runner in the Makefile
- TESTS: jws: register a unittest for jwk
- CI: github: run make unit-tests on the CI
- TESTS: add config smoke checks in the unit tests
- MINOR: jws: conversion to NIST curves name
- CI: github: remove smoke tests from vtest.yml
- TESTS: ist: fix wrong array size
- TESTS: ist: use the exit code to return a verdict
- TESTS: ist: add a ist.sh to launch in make unit-tests
- CI: github: fix h2spec.config proxy names
- DEBUG: init: Add a macro to register unit tests
- MINOR: sample: allow custom date format in error-log-format
- CLEANUP: log: removing "log-balance" references
- BUG/MINOR: log: set proper smp size for balance log-hash
- MINOR: log: use __send_log() with exact payload length
- MEDIUM: log: postpone the decision to send or not log with empty messages
- MINOR: proxy: make pr_mode enum bitfield compatible
- MINOR: cfgparse-listen: add and use cfg_parse_listen_match_option() helper
- MINOR: log: add options eval for log-forward
- MINOR: log: detach prepare from parse message
- MINOR: log: add dont-parse-log and assume-rfc6587-ntf options
- BUG/MEIDUM: startup: return to initial cwd only after check_config_validity()
- TESTS: change the output of run-unittests.sh
- TESTS: unit-tests: store sh -x in a result file
- CI: github: show results of the Unit tests
- BUG/MINOR: cfgparse/peers: fix inconsistent check for missing peer server
- BUG/MINOR: cfgparse/peers: properly handle ignored local peer case
- BUG/MINOR: server: dont return immediately from parse_server() when skipping checks
- MINOR: cfgparse/peers: provide more info when ignoring invalid "peer" or "server" lines
- BUG/MINOR: stream: fix age calculation in "show sess" output
- MINOR: stream/cli: rework "show sess" to better consider optional arguments
- MINOR: stream/cli: make "show sess" support filtering on front/back/server
- TESTS: quic: create first quic unittest
- MINOR: h3/hq-interop: restore function for standalone FIN receive
- MINOR/OPTIM: mux-quic: do not allocate rxbuf on standalone FIN
- MINOR: mux-quic: refine reception of standalone STREAM FIN
- MINOR: mux-quic: define globally stream rxbuf size
- MINOR: mux-quic: define rxbuf wrapper
- MINOR: mux-quic: store QCS Rx buf in a single-entry tree
- MINOR: mux-quic: adjust Rx data consumption API
- MINOR: mux-quic: adapt return value of qcc_decode_qcs()
- MAJOR: mux-quic: support multiple QCS RX buffers
- MEDIUM: mux-quic: handle too short data splitted on multiple rxbuf
- MAJOR: mux-quic: increase stream flow-control for multi-buffer alloc
- BUG/MINOR: cfgparse-tcp: relax namespace bind check
- MINOR: startup: adjust alert messages, when capabilities are missed
With "show sess", particularly "show sess all", we're often missing the
ability to inspect only streams attached to a frontend, backend or server.
Let's just add these filters to the command. Only one at a time may be set.
One typical use case could be to dump streams attached to a server after
issuing "shutdown sessions server XXX" to figure why any wouldn't stop
for example.
The "show sess" CLI command parser is getting really annoying because
several options were added in an exclusive mode as the single possible
argument. Recently some cumulable options were added ("show-uri") but
the older ones were not yet adapted. Let's just make sure that the
various filters such as "older" and "age" now belong to the options
and leave only <id>, "all", and "help" for the first ones. The doc was
updated and it's now easier to find these options.
This commit introduces the dont-parse-log option to disable log message
parsing, allowing raw log data to be forwarded without modification.
Also, it adds the assume-rfc6587-ntf option to frame log messages
using only non-transparent framing as per RFC 6587. This avoids
missparsing in certain cases (mainly with non RFC compliant messages).
The documentation is updated to include details on the new options and
their intended use cases.
This feature was discussed in GH #2856
At least one user would like to allow a standards-violating client setup
WebSocket connections through haproxy to a standards-violating server that
accepts them. While this should of course never be done over the internet,
it can make sense in the datacenter between application components which do
not need to mask the data, so this typically falls into the situation of
what the "accept-unsafe-violations-in-http-request" option and the
"accept-unsafe-violations-in-http-response" option are made for.
See GH #2876 for more context.
This patch relaxes the test on the "Sec-Websocket-Key" header field in
the request, and of the "Sec-Websocket-Accept" header in the response
when these respective options are set.
The doc was updated to reference this addition. This may be backported
to 3.1 but preferably not further.
Maxconn is a bit of a misnomer when it comes to servers, as it doesn't
control the maximum number of connections we establish to a server, but
the maximum number of simultaneous requests. So add "strict-maxconn",
that will make it so we will never establish more connections than
maxconn.
It extends the meaning of the "restricted" setting of
tune.takeover-other-tg-connections, as it will also attempt to get idle
connections from other thread groups if strict-maxconn is set.
Allow haproxy to take over idle connections from other thread groups
than our own. To control that, add a new tunable,
tune.takeover-other-tg-connections. It can have 3 values, "none", where
we won't attempt to get connections from the other thread group (the
default), "restricted", where we only will try to get idle connections
from other thread groups when we're using reverse HTTP, and "full",
where we always try to get connections from other thread groups.
Unless there is a special need, it is advised to use "none" (or
restricted if we're using reverse HTTP) as using connections from other
thread groups may have a performance impact.
This is the introduction of "minsize-req" and "minsize-res".
These two options allow you to set the minimum payload size required for
compression to be applied.
This helps save CPU on both server and client sides when the payload does
not need to be compressed.
Released version 3.2-dev6 with the following main changes :
- BUG/MEDIUM: debug: close a possible race between thread dump and panic()
- DEBUG: thread: report the spin lock counters as seek locks
- DEBUG: thread: make lock time computation more consistent
- DEBUG: thread: report the wait time buckets for lock classes
- DEBUG: thread: don't keep the redundant _locked counter
- DEBUG: thread: make lock_stat per operation instead of for all operations
- DEBUG: thread: reduce the struct lock_stat to store only 30 buckets
- MINOR: lbprm: add a new callback ->server_requeue to the lbprm
- MEDIUM: server: allocate a tasklet for asyncronous requeuing
- MAJOR: leastconn: postpone the server's repositioning under contention
- BUG/MINOR: quic: reserve length field for long header encoding
- BUG/MINOR: quic: fix CRYPTO payload size calcul for encoding
- MINOR: quic: simplify length calculation for STREAM/CRYPTO frames
- BUG/MINOR: mworker: section ignored in discovery after a post_section_parser
- BUG/MINOR: mworker: post_section_parser for the last section in discovery
- CLEANUP: mworker: "program" section does not have a post_section_parser anymore
- MEDIUM: initcall: allow to register mutiple post_section_parser per section
- CI: cirrus-ci: bump FreeBSD image to 14-2
- DOC: initcall: name correctly REGISTER_CONFIG_POST_SECTION()
- REGTESTS: stop using truncated.vtc on freebsd
- MINOR: quic: refactor STREAM encoding and splitting
- MINOR: quic: refactor CRYPTO encoding and splitting
- BUG/MEDIUM: fd: mark FD transferred to another process as FD_CLONED
- BUG/MINOR: ssl/cli: "show ssl crt-list" lacks client-sigals
- BUG/MINOR: ssl/cli: "show ssl crt-list" lacks sigals
- MINOR: ssl/cli: display more filenames in 'show ssl cert'
- DOC: watchdog: document the sequence of the watchdog and panic
- MINOR: ssl: store the filenames resulting from a lookup in ckch_conf
- MINOR: startup: allow hap_register_feature() to enable a feature in the list
- MINOR: quic: support frame type as a varint
- BUG/MINOR: startup: leave at first post_section_parser which fails
- BUG/MINOR: startup: hap_register_feature() fix for partial feature name
- BUG/MEDIUM: cli: Be sure to drop all input data in END state
- BUG/MINOR: cli: Wait for the last ACK when FDs are xferred from the old worker
- BUG/MEDIUM: filters: Handle filters registered on data with no payload callback
- BUG/MINOR: fcgi: Don't set the status to 302 if it is already set
- MINOR: ssl/crtlist: split the ckch_conf loading from the crtlist line parsing
- MINOR: ssl/crtlist: handle crt_path == cc->crt in crtlist_load_crt()
- MINOR: ssl/ckch: return from ckch_conf_clean() when conf is NULL
- MEDIUM: ssl/crtlist: "crt" keyword in frontend
- DOC: configuration: document the "crt" frontend keyword
- DEV: h2: add a Lua-based HTTP/2 connection tracer
- BUG/MINOR: quic: prevent crash on conn access after MUX init failure
- BUG/MINOR: mux-quic: prevent crash after MUX init failure
- DEV: h2: fix flags for the continuation frame
- REGTESTS: Fix truncated.vtc to send 0-CRLF
- BUG/MINOR: mux-h2: Properly handle full or truncated HTX messages on shut
- Revert "REGTESTS: stop using truncated.vtc on freebsd"
- MINOR: mux-quic: define a QCC application state member
- MINOR: mux-quic/h3: emit SETTINGS via MUX tasklet handler
- MINOR: mux-quic/h3: support temporary blocking on control stream sending
Each time we go into the watchdog and panic code, it's super hard to
figure who calls what since signals are involved to bounce between
threads. Let's document the main principles and sequences to ease the
journey next time.
Before this patch, REGISTER_CONFIG_SECTION() allowed to register one and only
one callback (<post>) called after the parsing of a section.
It was limitating because you couldn't register a post callback from anywhere
else in the code.
This patch introduces the new REGISTER_CONFIG_SECTION_POST() macros which allows
to register a new post callback for a section keyword from anywhere.
This patch introduces the feature by allowing `struct cfg_section` entries that
does not have a `section_parser`, and then iterating on all cfg_section with a
post_section_parser for a keyword.
Released version 3.2-dev5 with the following main changes :
- BUG/MINOR: ssl: put ssl_sock_load_ca under SSL_NO_GENERATE_CERTIFICATES
- CLEANUP: ssl: rename ssl_sock_load_ca to ssl_sock_gencert_load_ca
- CLEANUP: ssl: move ssl_sock_gencert_load_ca declaration in ssl_gencert.h
- CLEANUP: tree-wide: define and use acl_match_cond() helper
- MINOR: epoll: permit to mask certain specific events
- MINOR: proxies: Add a per-thread group field to struct proxy.
- MINOR: Add fields to the per-thread group field in struct server.
- MINOR: proxies/servers: Calculate queueslength and use it.
- MEDIUM: servers/proxies: Switch to using per-tgroup queues.
- BUG/MINOR: stream: Properly handle "on-marked-up shutdown-backup-sessions"
- MEDIUM: stream: Map task wake up reasons to dedicated stream events
- MEDIUM: stream: No longer use TASK_F_UEVT* to shut a stream down
- BUILD: tools: fix build on BSD by dropping the ETIME check
- MINOR: queues: use __ha_cpu_relax() on failed CAS.
- BUILD: queues: Use unsigned int when needed
- BUILD: ssl: allow to build without the renegotiation API of WolfSSL
- BUILD: ssl: more cleaner approach to WolfSSL without renegotiation
- BUG/MEDIUM: chunk: make sure to flush the trash pool before resizing
- MINOR: quic: remove references to burst in quic-cc-algo parsing
- MINOR: quic: allow BBR testing without pacing
- MINOR: quic: transform pacing settings into a global option
- MAJOR: quic: mark pacing as stable and enable it by default
- MINOR: quic: mark BBR as stable
- MINOR: quic: define quic_tune
- BUILD: quic: fix overflow in global tune
- DEBUG: fd: add a counter of takeovers of an FD since it was last opened
- MINOR: fd: add a generation number to file descriptors
- DEBUG: epoll: store and compare the FD's generation count with reported event
- MEDIUM: epoll: skip reports of stale file descriptors
- MINOR: mux-h1: Add masks to group H1S DEMUX and MUX errors
- BUG/MINOR: mux-h1: Only report a SE error on demux error
- MINOR: tevt: Add the termination events log's fundations
- MINOR: tevt/stconn: Add a termination events log in the SE descriptor
- MINOR: tevt/mux-h1: Report termination events for the H1C and H1S
- MINOR: tevt/mux-h2: Report termination events for the H2C
- MINOR: tevt/stream/stconn: Report termination events for stream and sc
- MINOR: tevt/conn: Report intercepted event for L4 rules
- MINOR: tevt/mux-h1/mux-h2: Add termination events log when dumping mux info
- MINOR: tevt/muxes: Add CTL and SCTL command to get the termination event logs
- MINOR: tevt/mux-pt: Add support for termination event logs
- MINOR: tevt/connection: Add dedicated termination events for lower locations
- MEDIUM: tevt/muxes: Add dedicated termination events for muxc/se locations
- MINOR: tevt/stconn: Be more accurate to report shutw events
- MEDIUM: tevt/stconn/stream: Add dedicated termination events for stream location
- MINOR: tevt: Don't duplicate termination event during reporting
- MINOR: tevt/applet: Add limited support for termination event logs for applets
- MINOR: tevt: Add a sample to get termination events for all locations
- MINOR: tevt: Improve function to convert a termination events log to string
- REORG: tevt/connection: Move enums at the end of the header file
- MINOR: tevt/dev: Add term_events tool
- MINOR: tevt/connection: Add support for POLL_HUP/POLL_ERR events
- MINOR: tevt/dev: Parse tuple of termination events
- BUG/MEDIUM: htx: wrong count computation in htx_xfer_blks()
- DOC: htx: clarify <mark> parameter for htx_xfer_blks()
- BUILD: quic: remove GCC undefined error in qc_release_lost_pkts()
- MEDIUM: htx: prevent <mark> to copy incomplete headers in htx_xfer_blks()
- BUG/MEDIUM: mux-fcgi: Properly handle read0 on partial records
- BUG/MINOR: tevt/http-ana: Remove badly placed event reports
- DEBUG: http-ana: Remove debug counters from HTTP analyzers
- DEBUG: mux-h1: Remove some debug counters
- BUG/MINOR: tcp-rules: Don't forward close during tcp-response content rules eval
- MEDIUM: stream: interrupt costly rulesets after too many evaluations
- BUG/MINOR: http-check: Don't pretend a C-L heeader is set before adding it
- BUILD: ssl: remove a boringssl definition defined by recent boringssl libs
- BUG/MINOR: tevt/mux-h2: Set truncated receive/eos events at SE level on error
- BUG/MEDIUM: flt-spoe: Set/test applet flags instead of SE flags from I/O handler
- BUG/MEDIUM: applet: Don't pretend to have more data to handle EOI/EOS/ERROR
- BUG/MEDIUM: flt-spoe: Properly handle end of stream from the SPOE applet
- MINOR: flt-spoe: Report end of input immediately after applet init
- MINOR: mux-spop: Report EOI on the SE when a ACK is received for a stream
- MINOR: mux-spop: Set SPOP_CF_ERROR flag on connection error only
- MINOR: tevt/mux-spop: Report termination events for the SPOP connect/stream
- CLEANUP: mux-spop: Remove useless comments
- MINOR: mux-spop: Dump info about connections and streams in dedicated functions
- MINOR: mux-spop: Implement .show_sd callback function
- MEDIUM: mux-fcgi: Add a function to propagate termination flags from fstrm to SE
- BUG/MEDIUM: mux-fcgi: Propagate flags to SE in fcgi_strm_wake_one_stream
- MINOR: tevt/mux-fcgi: Report termination events for the FCGI connect/stream
- MINOR: mux-fcgi: Dump info about connections and streams in dedicated functions
- MINOR: mux-spop/mux-fcgi: Add support of the debug string for logs
- BUG/MINOR: cli: Don't set SE flags from the cli applet
- BUG/MINOR: cli: Fix memory leak on error for _getsocks command
- BUG/MINOR: cli: Fix a possible infinite loop in _getsocks()
- BUG/MINOR: config/userlist: Support one 'users' option for 'group' directive
- BUG/MINOR: auth: Fix a leak on error path when parsing user's groups
- BUG/MINOR: flt-trace: Support only one name option
- MINOR: filters: Improve errors formating during filters parsing
- BUG/MINOR: stats-json: Define JSON_INT_MAX as a signed integer
- DOC: option redispatch should mention persist options
- BUG/MINOR: debug: make "debug dev sched" accept a negative TID
- BUG/MINOR: debug: make sure the "debug dev sched" tasks don't block stopping
- IMPORT: plock: export the uninlined version of the lock wait function
- IMPORT: plock: give higher precedence to W than S
- IMPORT: plock: lower the slope of the exponential back-off
- IMPORT: plock: use cpu_relax() for a shorter time in EBO
- Revert "IMPORT: plock: export the uninlined version of the lock wait function"
- BUG/MEDIUM: ssl: chosing correct certificate using RSA-PSS with TLSv1.3
"option redispatch" remains vague in which cases a session would persist;
let's mention "option persist" and "force-persist" as an example so folks
don't draw the conclusion that this may be default.
Should be backported to stable branches.
It is not rare to see configurations with a large number of "tcp-request
content" or "http-request" rules for instance. A large number of rules
combined with cpu-demanding actions (e.g.: actions that work on content)
may create thread contention as all the rules from a given ruleset are
evaluated under the same polling loop if the evaluation is not interrupted
Thus, in this patch we add extra logic around "tcp-request content",
"tcp-response content", "http-request" and "http-response" rulesets, so
that when a certain number of rules are evaluated under the single polling
loop, we force the evaluating function to yield. As such, the rule which
was about to be evaluated is saved, and the function starts evaluating
rules from the save pointer when it returns (in the next polling loop).
We use task_wakeup(task, TASK_WOKEN_MSG) to explicitly wake the task so
that no time is wasted and the processing is resumed ASAP. TASK_WOKEN_MSG
is mandatory here because process_stream() expects TASK_WOKEN_MSG for
explicit analyzers re-evaluation.
rules_bcount stream's attribute was added to count how manu rules were
evaluated since last interruption (yield). Also, SF_RULE_FYIELD flag
was added to know that the s->current_rule was assigned due to forced
yield and not regular yield.
By default haproxy will enforce a yield every 50 rules, this behavior
can be configured using the "tune.max-rules-at-once" global keyword.
There is a limitation though: for now, if the ACT_OPT_FINAL flag is set
on act_opts, we consider it is not safe to yield (as it is already the
case for automatic yield). In this case instead of yielding an taking
the risk of not being called back, we skip the yield and hope it will
not create contention. This is something we should ideally try to
improve in order to yield in all conditions.
"term_events" is a sample fetche function that can be used to get
termination events for all locations in one call. The format equivalent to:
{fc_term_events,fc_mux_term_events,fs.term_events,txn.term_events,bs.term_events,bc_mux_term_events,bc_term_events}
If no event was reported for a location, the field is empty. If the feature
is not supported yet, a dash ('-') is printed.
Pacing has recently been moved out of experimental status and is
activated by default. This is a mandatory requirement for BBR.
Furthermore, BBR is now considered stable. As such, removes its
experimental status with this commit.
Remove pacing experimental status, so it's not required anymore to use
expose-experimental-directives to enable it.
Along this change, pacing is now activated by default. As such, pacing
configuration is transformed into its final form. The global on/off
setting is turned into a disable setting without argument.
Pacing support was previously activated on each bind line individually,
via an optional argument of quic-cc-algo keyword. Remove this optional
argument and introduce a global setting to enable/disable pacing. Pacing
activation is still flagged as experimental.
One important change is that previously BBR usage automatically
activated pacing support. This is not the case anymore, so users should
now always explicitely activate pacing if BBR is selected. A new warning
message will be displayed if this is not the case.
Another consequence of this change is that now pacing_inter callback is
always defined for every quic_cc_algo types. As such, QUIC MUX uses
global.tune.options to determine if pacing is required.
This should be backported up to 3.1, after a period of observation.
Pacing is activated per bind line via an optional boolean argument of
quic-cc-algo keyword. Contrary to the default usage, pacing is
automatically activated when BBR is chosen. This is because this
algorithm is expected to run on top of pacing, else its behavior is
undefined.
Previously, pacing argument was thus ignored when BBR was selected.
Change this to support explicit deactivation of pacing with it. This
could be useful to test BBR without pacing when debugging some issues.
This should be backported up to 3.1, after a period of observation.
shutdown-backup-sessions action for on-marked-up directive does not work anymore
since the stream_shutdown() function was modified to be async-safe.
When stream_shutdown() was modified to be async-safe, dedicated task events were
added to map the reasons to shut a stream down. SF_ERR_DOWN was mapped to
TASK_F_EVT1 and SF_ERR_KILLED was mapped to TASK_F_EVT2. The reverse mapping was
performed by process_stream() to shut the stream with the appropriate reason.
However, SF_ERR_UP reason, used by shutdown-backup-sessions action to shut a
stream down because a preferred server became available, was not mapped in the
same way. So since commit b8e3b0a18d ("BUG/MEDIUM: stream: make
stream_shutdown() async-safe"), this action is ignored and does not work
anymore.
To fix an issue, and being able to bakcport the fix, a third task event was
added. TASK_F_EVT3 is now mapped on SF_ERR_UP.
This patch should fix the issue #2848. It must be backported as far as 2.6.
A few times in the past we've seen cases where epoll was caught reporting
a wrong event that caused trouble (e.g. spuriously reporting HUP or RDHUP
after a successful connect()). The new tune.epoll.mask-events directive
permits to mask events such as ERR, HUP and RDHUP and convert them to IN
events that are processed by the regular receive path. This should help
better diagnose and troubleshoot issues such as this one, as well as rule
out such a cause when similar issues are reported:
https://github.com/haproxy/haproxy/issues/2368https://www.spinics.net/lists/netdev/msg876470.html
It should be harmless to backport this if necessary.
Released version 3.2-dev4 with the following main changes :
- BUG/MINOR: stktable: fix big-endian compatiblity in smp_to_stkey()
- MINOR: stktable: add stkey_to_smp() helper
- MINOR: stktable: add stksess_getkey() helper
- MINOR: stktable: add sc[0-2]_key fetches
- BUG/MEDIUM: queues: Adjust the proxy counters when appropriate
- MINOR: trace: add help message for -dt argument
- MINOR: trace: ensure -dt priority over traces config section
- MINOR: trace: support all source alias on -dt
- BUG/MINOR: quic: reject NEW_TOKEN frames from clients
- MINOR: stktable: fix potential build issue in smp_to_stkey
- BUG/MEDIUM: stktable: fix missing lock on some table converters
- BUG/MEDIUM: promex: Use right context pointers to dump backends extra-counters
- MINOR: stktable: fix potential build issue in smp_to_stkey (2nd try)
- MINOR: stktable: add smp_fetch_stksess() helper function
- MEDIUM: stktable: split src-based key smp_fetch_sc functions
- MEDIUM: stktable: split sc_ and src_ fetch lookup logics
- MEDIUM: stktable: leverage smp_fetch_* helpers from sample conv
- DOC: config: unify sample conv|fetches optional arguments syntax
- DOC: config: stick-table converters support implicit <table> argument
- DOC: config: stick-table converter do accept ANY-typed input
- DOC: config: clarify return type for some stick-table converters
- DOC: config: refer to canonical sticktable converters for src_* fetches
- CLEANUP: stktable: move sample_conv_table_bytes_out_rate()
- MINOR: stktable: add table_{inc,clr}_gpc* converters
- BUG/MAJOR: quic: reject too large CRYPTO frames
- BUG/MAJOR: log/sink: possible sink collision in sink_new_from_srv()
- BUG/MINOR: init: set HAPROXY_STARTUP_VERSION from the variable, not the macro
- REORG: version: move the remaining BUILD_* stuff from haproxy.c to version.c
- BUG/MINOR: quic: ensure a detached coalesced packet can't access its neighbours
- MINOR: quic: Add a BUG_ON() on quic_tx_packet refcount
- BUILD: quic: Move an ASSUME_NONNULL() for variable which is not null
- BUG/MEDIUM: mux-h1: Properly close H1C if an error is reported before sending data
- CLEANUP: quic: remove unused prototype
- MINOR: quic: rename pacing_rate cb to pacing_inter
- BUG/MINOR: quic: do not increase congestion window if app limited
- MINOR: mux-quic: increment pacing retry counter on expired
- MEDIUM: quic: implement credit based pacing
- MEDIUM: mux-quic: reduce pacing CPU usage with passive wait
- MEDIUM: quic: use dynamic credit for pacing
- MINOR: quic: remove unused pacing burst in bind_conf/quic_cc_path
- MINOR: quic: adapt credit based pacing to BBR
- MINOR: tools: add errname to print errno macro name
- MINOR: debug: debug_parse_cli_show_dev: use errname
- MINOR: debug: show boot and runtime process settings in table
Major improvements have been introduced in pacing recently. Most
notably, QMUX schedules emission on a millisecond resolution, which
allow to use passive wait to be much CPU friendly.
However, an issue remains with the pacing max credit. Unless BBR is
used, it is fixed to the configured value from quic-cc-algo bind
statement. This is not practical as if too low, it may drastically
reduce performance due to 1ms sleep resolution. If too high, some
clients will suffer from too much packet loss.
This commit fixes the issue by implementing a dynamic maximum credit
value based on the network condition specific to each clients.
Calculation is done to fix a maximum value which should allow QMUX
current tasklet context to emit enough data to cover the delay with the
next tasklet invokation. As such, avg_loop_us is used to detect the
process load. If too small, 1.5ms is used as minimal value, to cover the
extra delay incurred by the system which will happen for a default 1ms
sleep.
This should be backported up to 3.1.
Pacing algorithm has been revamped in the previous commit to implement a
credit based solution. This is a far more adaptative solution, in
particular which allow to catch up in case pause between pacing emission
was longer than expected.
This allows QMUX to remove the active loop based on tasklet wake-up.
Instead, a new task is used when emission should be paced. The main
advantage is that CPU usage is drastically reduced.
New pacing task timer is reset each time qcc_io_send() is invoked. Timer
will be set only if pacing engine reports that emission must be
interrupted. In this case timer is set via qcc_wakeup_pacing() to the
delay reported by congestion algorithm, or 1ms if delay is too short. At
the end of qcc_io_cb(), pacing task is queued if timer has been set.
Pacing task execution is simple enough : it immediately wakes up QCC I/O
handler.
Note that to have decent performance, it requires to have a large enough
burst defined in configuration of quic-cc-algo. However, this value is
common to every listener clients, which may cause too much loss under
network conditions. This will be address in a future patch.
This should be backported up to 3.1.
As discussed in GH #2423, there are some cases where src_{inc,clr}_gpc*
is not sufficient because we need to perform the lookup on a specific
key. Indeed, just like we did in e642916 ("MEDIUM: stktable: leverage
smp_fetch_* helpers from sample conv"), we can easily implement new
table converters based on existing fetches. This is what we do in
this patch.
Also the doc was updated so that src_{inc,clr}_gpc* fetches now point to
their generic equivalent table_{inc,clr}_gpc*. Indeed, src_{inc,clr}_gpc*
are simply aliases.
This should fix GH #2423.
When available, to prevent doc duplication, let's make src_* fetches
point to equivalent table_* converters, as they are in fact aliases
for src,table_* converters.
Some stick-table converters such as "table_gpt" erroneously suggest that
the returned type is a boolean while in fact it is integer type, as
properly documented for the sample fetch equivalents.
Since 2d17db58 ("MINOR: stick-table: change all stick-table converters'
inputs to SMP_T_ANY"), all stick-table converters accept ANY input
type as parameter, this means that it does no longer restrict the key as
a string representation of the input. However the doc wasn't updated when
the change was made. Moreover, some converters document the updated behavior
while others don't, which is kind of confusing, let's fix that.
As with stick-table sample fetches, the <table> argument is not strictly
needed and defaults to the current proxy's stick-table when not provided
Let's update the doc and prototype to reflect the current behavior.
The most common way (and proper way it seems) to declare optional
arguments in sample fetch or converters' prototype is to declare
them between square brackets, including the leading coma (because the
coma should be omitted if the argument is not provided). Also, when
multiple optional arguments are found, we should apply the same logic
but recursively.
In this patch we fix prototypes that include optional arguments and don't
follow this syntax. This improves readibility and sets the norm for
upcoming sample fetches/converters.