Add forward declarations in types/ssl_crtlist.h in order to avoid
circular dependencies. Also remove the listener.h include which is not
needed anymore.
The ssl_sock.c file contains a lot of macros and structure definitions
that should be in a .h. Move them to the more appropriate
types/ssl_sock.h file.
This patch adds the ability to register callbacks for SSL/TLS protocol
messages by using the function ssl_sock_register_msg_callback().
All registered callback functions will be called when observing received
or sent SSL/TLS protocol messages.
The EVP_MD_CTX_create() and EVP_MD_CTX_destroy() functions were renamed to
EVP_MD_CTX_new() and EVP_MD_CTX_free() in OpenSSL 1.1.0, respectively. These
functions are used by the digest converter, introduced by the commit 8e36651ed
("MINOR: sample: Add digest and hmac converters"). So for prior versions of
openssl, macros are used to fallback on old functions.
This patch must only be backported if the commit 8e36651ed is backported too.
This one really ought to be defined in hathreads.h like all other thread
definitions, which is what this patch does. As expected, all files but
one (regex.h) were already including hathreads.h when using THREAD_LOCAL;
regex.h was fixed for this.
This was the last entry in config.h which is now useless.
The setting of CONFIG_HAP_LOCKLESS_POOLS depending on threads and
compat was done in config.h for use only in memory.h and memory.c
where other settings are dealt with. Further, the default pool cache
size was set there from a fixed value instead of being set from
defaults.h
Let's move the decision to enable lockless pools via
CONFIG_HAP_LOCKLESS_POOLS to memory.h, and set the default pool
cache size in defaults.h like other default settings.
This was the next-to-last setting in config.h.
CONFIG_HAP_MEM_OPTIM was introduced with memory pools in 1.3 and dropped
in 1.6 when pools became the only way to allocate memory. Still the
option remained present in config.h. Let's kill it.
It is now possible to use log-format string (or hexadecimal string for the
binary version) to match a content in tcp-check based expect rules. For
hexadecimal log-format string, the conversion in binary is performed after the
string evaluation, during health check execution. The pattern keywords to use
are "string-lf" for the log-format string and "binary-lf" for the hexadecimal
log-format string.
Just like in previous patch, it happens that HA_ATOMIC_UPDATE_MIN() and
HA_ATOMIC_UPDATE_MAX() would evaluate the (val) argument up to 3 times.
However this time it affects both thread and non-thread versions. It's
strange because the copy was properly performed for the (new) argument
in order to avoid this. Anyway it was done for the "val" one as well.
A quick code inspection showed that this currently has no effect as
these macros are fairly limited in usage.
It would be best to backport this for long-term stability (till 1.8)
but it will not fix an existing bug.
When threads are disabled, HA_ATOMIC_CAS() becomes a simple compound
expression. However this expression presents a problem, which is that
its arguments are evaluated multiple times, once for the comparison
and once again for the assignement. This presents a risk of performing
some side-effect operations twice in the non-threaded case (e.g. in
case of auto-increment or function return).
The macro was rewritten using local copies for arguments like the
other macros do.
Fortunately a complete inspection of the code indicates that this case
currently never happens. It was however responsible for the strict-aliasing
warning emitted when building fd.c without threads but with 64-bit CAS.
This may be backported as far as 1.8 though it will not fix any existing
bug and is more of a long-term safety measure in case a future fix would
depend on this behavior.
It is now possible to add http-check expect rules matching HTTP header names and
values. Here is the format of these rules:
http-check expect header name [ -m <meth> ] <name> [log-format] \
[ value [ -m <meth> ] <value> [log-format] [full] ]
the name pattern (name ...) is mandatory but the value pattern (value ...) is
optionnal. If not specified, only the header presence is verified. <meth> is the
matching method, applied on the header name or the header value. Supported
matching methods are:
* "str" (exact match)
* "beg" (prefix match)
* "end" (suffix match)
* "sub" (substring match)
* "reg" (regex match)
If not specified, exact matching method is used. If the "log-format" option is
used, the pattern (<name> or <value>) is evaluated as a log-format string. This
option cannot be used with the regex matching method. Finally, by default, the
header value is considered as comma-separated list. Each part may be tested. The
"full" option may be used to test the full header line. Note that matchings are
case insensitive on the header names.
It is now possible to use different matching methods to look for header names in
an HTTP message:
* The exact match. It is the default method. http_find_header() uses this
method. http_find_str_header() is an alias.
* The prefix match. It evals the header names starting by a prefix.
http_find_pfx_header() must be called to use this method.
* The suffix match. It evals the header names ending by a suffix.
http_find_sfx_header() must be called to use this method.
* The substring match. It evals the header names containing a string.
http_find_sub_header() must be called to use this method.
* The regex match. It evals the header names matching a regular expression.
http_match_header() must be called to use this method.
Some HTTP sample fetches will be accessible from the context of a http-check
health check. Thus, the prefetch function responsible to return the HTX message
has been update to handle a check, in addition to a channel. Both cannot be used
at the same time. So there is no ambiguity.
Given that a "count" value of 32M was seen in _shctx_wait4lock(), it
is very important to prevent this from happening again. It's absolutely
essential to prevent the value from growing unbounded because with an
increase of the number of threads, the number of successive failed
attempts will necessarily grow.
Instead now we're scanning all 2^p-1 values from 3 to 255 and are
bounding to count to 255 so that in the worst case each thread tries an
xchg every 255 failed read attempts. That's one every 4 on average per
thread when there are 64 threads, which corresponds to the initial count
of 4 for the first attempt so it seems like a reasonable value to keep a
low latency.
The bug was introduced with the shctx entries in 1.5 so the fix must
be backported to all versions. Before 1.8 the function was called
_shared_context_wait4lock() and was in shctx.c.
Jrme reported an amazing crash in the spinlock version of
_shctx_wait4lock() with an extremely high <count> value of 32M! The
root cause is that the function cannot deal with contention on the lock
at all because it forgets to check if the lock's value has changed! As
such, every time it's called due to a contention, it waits twice as
long before trying again and lets the caller check for the contention
by itself.
The correct thing to do is to compare the value again at each loop.
This way it makes sure to mostly perform read accesses on the shared
cache line without writing too often, and to be ready fast enough to
try to grab the lock. And we must not increase the count on success
either!
Unfortunately I'd have expected to see a performance boost on the cache
with this but there was absolutely no change, so it's very likely that
these issues only happen once in a while and are sufficient to derail
the process when they strike, but not to have a permanent performance
impact.
The bug was introduced with the shctx entries in 1.5 so the fix must
be backported to all versions. Before 1.8 the function was called
_shared_context_wait4lock() and was in shctx.c.
I changed my mind twice on this one and pushed after the last test with
threads disabled, without re-enabling long long, causing this rightful
build warning.
This needs to be backported if the previous commit ff64d3b027 ("MINOR:
threads: export the POSIX thread ID in panic dumps") is backported as
well.
It is very difficult to map a panic dump against a gdb thread dump
because the thread numbers do not match. However gdb provides the
pthread ID but this one is supposed to be opaque and not to be cast
to a scalar.
This patch provides a fnuction, ha_get_pthread_id() which retrieves
the pthread ID of the indicated thread and casts it to an unsigned
long long so as to lose the least possible amount of information from
it. This is done cleanly using a union to maintain alignment so as
long as these IDs are stored on 1..8 bytes they will be properly
reported. This ID is now presented in the panic dumps so it now
becomes possible to map these threads. When threads are disabled,
zero is returned. For example, this is a panic dump:
Thread 1 is about to kill the process.
*>Thread 1 : id=0x7fe92b825180 act=0 glob=0 wq=1 rq=0 tl=0 tlsz=0 rqsz=0
stuck=1 prof=0 harmless=0 wantrdv=0
cpu_ns: poll=5119122 now=2009446995 diff=2004327873
curr_task=0xc99bf0 (task) calls=4 last=0
fct=0x592440(task_run_applet) ctx=0xca9c50(<CLI>)
strm=0xc996a0 src=unix fe=GLOBAL be=GLOBAL dst=<CLI>
rqf=848202 rqa=0 rpf=80048202 rpa=0 sif=EST,200008 sib=EST,204018
af=(nil),0 csf=0xc9ba40,8200
ab=0xca9c50,4 csb=(nil),0
cof=0xbf0e50,1300:PASS(0xc9cee0)/RAW((nil))/unix_stream(20)
cob=(nil),0:NONE((nil))/NONE((nil))/NONE(0)
call trace(20):
| 0x59e4cf [48 83 c4 10 5b 5d 41 5c]: wdt_handler+0xff/0x10c
| 0x7fe92c170690 [48 c7 c0 0f 00 00 00 0f]: libpthread:+0x13690
| 0x7ffce29519d9 [48 c1 e2 20 48 09 d0 48]: linux-vdso:+0x9d9
| 0x7ffce2951d54 [eb d9 f3 90 e9 1c ff ff]: linux-vdso:__vdso_gettimeofday+0x104/0x133
| 0x57b484 [48 89 e6 48 8d 7c 24 10]: main+0x157114
| 0x50ee6a [85 c0 75 76 48 8b 55 38]: main+0xeaafa
| 0x50f69c [48 63 54 24 20 85 c0 0f]: main+0xeb32c
| 0x59252c [48 c7 c6 d8 ff ff ff 44]: task_run_applet+0xec/0x88c
Thread 2 : id=0x7fe92b6e6700 act=0 glob=0 wq=0 rq=0 tl=0 tlsz=0 rqsz=0
stuck=0 prof=0 harmless=1 wantrdv=0
cpu_ns: poll=786738 now=1086955 diff=300217
curr_task=0
Thread 3 : id=0x7fe92aee5700 act=0 glob=0 wq=0 rq=0 tl=0 tlsz=0 rqsz=0
stuck=0 prof=0 harmless=1 wantrdv=0
cpu_ns: poll=828056 now=1129738 diff=301682
curr_task=0
Thread 4 : id=0x7fe92a6e4700 act=0 glob=0 wq=0 rq=0 tl=0 tlsz=0 rqsz=0
stuck=0 prof=0 harmless=1 wantrdv=0
cpu_ns: poll=818900 now=1153551 diff=334651
curr_task=0
And this is the gdb output:
(gdb) info thr
Id Target Id Frame
* 1 Thread 0x7fe92b825180 (LWP 15234) 0x00007fe92ba81d6b in raise () from /lib64/libc.so.6
2 Thread 0x7fe92b6e6700 (LWP 15235) 0x00007fe92bb56a56 in epoll_wait () from /lib64/libc.so.6
3 Thread 0x7fe92a6e4700 (LWP 15237) 0x00007fe92bb56a56 in epoll_wait () from /lib64/libc.so.6
4 Thread 0x7fe92aee5700 (LWP 15236) 0x00007fe92bb56a56 in epoll_wait () from /lib64/libc.so.6
We can clearly see that while threads 1 and 2 are the same, gdb's
threads 3 and 4 respectively are haproxy's threads 4 and 3.
This may be backported to 2.0 as it removes some confusion in github issues.
It can be sometimes useful to measure total time of a request as seen
from an end user, including TCP/TLS negotiation, server response time
and transfer time. "Tt" currently provides something close to that, but
it also takes client idle time into account, which is problematic for
keep-alive requests as idle time can be very long. "Ta" is also not
sufficient as it hides TCP/TLS negotiationtime. To improve that, introduce
a "Tu" timer, without idle time and everything else. It roughly estimates
time spent time spent from user point of view (without DNS resolution
time), assuming network latency is the same in both directions.
This bug was introduced by the commit 2444aa5b ("MEDIUM: sessions: Don't be
responsible for connections anymore."). In session_check_idle_conn(), when the
mux is destroyed, its context must be passed as argument instead of the
connection.
It is de 2.2-dev bug. No need to backport.
It is now possible to match on a comma-separated list of status codes or range
of codes. In addtion, instead of a string comparison to match the response's
status code, a integer comparison is performed. Here is an example:
http-check expect status 200,201,300-310
This reverts commit 1979943c30ef285ed04f07ecf829514de971d9b2.
Captures in comment was only used when a tcp-check expect based on a negative
regex matching failed to eventually report what was captured while it was not
expected. It is a bit far-fetched to be useable IMHO. on-error and on-success
log-format strings are far more usable. For now there is few check sample
fetches (in fact only one...). But it could be really powerful to report info in
logs.
Since all tcp-check rulesets are globally stored, it is a problem to use
list. For configuration with many backends, the lookups in list may be costly
and slow downs HAProxy startup. To solve this problem, tcp-check rulesets are
now stored in a tree.
The patch is not obvious at the first glance. But it is just a reorg. Functions
have been grouped and ordered in a more logical way. Some structures and flags
are now private to the checks module (so moved from the .h to the .c file).
Defaut health-checks, without any option, doing only a connection check, are now
based on tcp-checks. An implicit default tcp-check connect rule is used. A
shared tcp-check ruleset, name "*tcp-check" is created to support these checks.
This function is unused for now. But it will have be used to install a mux for
an outgoing connection openned in a health-check context. In this case, the
session's origin is the check itself, and it is used to know the mode, HTTP or
TCP, depending on the tcp-check type and not the proxy mode. The check is also
used to get the mux protocol if configured.
It is not set and not used for now, but it will be possible to force the mux
protocol thanks to this patch. A mux proto field is added to the checks and to
tcp-check connect rules.
HTTP health-checks are now internally based on tcp-checks. Of course all the
configuration parsing of the "http-check" keyword and the httpchk option has
been rewritten. But the main changes is that now, as for tcp-check ruleset, it
is possible to perform several send/expect sequences into the same
health-checks. Thus the connect rule is now also available from HTTP checks, jst
like set-var, unset-var and comment rules.
Because the request defined by the "option httpchk" line is used for the first
request only, it is now possible to set the method, the uri and the version on a
"http-check send" line.
Instead of having 2 independent integers, used as boolean values, to know if the
expect rule is invered and to know if the matching regexp has captures, we know
use a 32-bits bitfield.
All tcp-check rules are now stored in the globla shared list. The ones created
to parse a specific protocol, for instance redis, are already stored in this
list. Now pure tcp-check rules are also stored in it. The ruleset name is
created using the proxy name and its config file and line. tcp-check rules
declared in a defaults section are also stored this way using "defaults" as
proxy name.
For now, all tcp-check ruleset are stored in a list. But it could be a bit slow
to looks for a specific ruleset with a huge number of backends. So, it could be
a good idea to use a tree instead.
It is now possible to specified the healthcheck status to use on success of a
tcp-check rule, if it is the last evaluated rule. The option "ok-status"
supports "L4OK", "L6OK", "L7OK" and "L7OKC" status.
A shared tcp-check ruleset is now created to support agent checks. The following
sequence is used :
tcp-check send "%[var(check.agent_string)] log-format
tcp-check expect custom
The custom function to evaluate the expect rule does the same that it was done
to handle agent response when a custom check was used.
A share tcp-check ruleset is now created to support SPOP checks. This way no
extra memory is used if several backends use a SPOP check.
The following sequence is used :
tcp-check send-binary SPOP_REQ
tcp-check expect custom min-recv 4
The spop request is the result of the function
spoe_prepare_healthcheck_request() and the expect rule relies on a custom
function calling spoe_handle_healthcheck_response().
A shared tcp-check ruleset is now created to support LDAP check. This way no
extra memory is used if several backends use a LDAP check.
The following sequance is used :
tcp-check send-binary "300C020101600702010304008000"
tcp-check expect rbinary "^30" min-recv 14 \
on-error "Not LDAPv3 protocol"
tcp-check expect custom
The last expect rule relies on a custom function to check the LDAP server reply.
A share tcp-check ruleset is now created to support MySQL checks. This way no
extra memory is used if several backends use a MySQL check.
One for the following sequence is used :
## If no extra params are set
tcp-check connect default linger
tcp-check expect custom ## will test the initial handshake
## If the username is defined
tcp-check connect default linger
tcp-check send-binary MYSQL_REQ log-format
tcp-check expect custom ## will test the initial handshake
tcp-check expect custom ## will test the reply to the client message
The log-format hexa string MYSQL_REQ depends on 2 preset variables, the packet
header containing the packet length and the sequence ID (check.header) and the
username (check.username). If is also different if the "post-41" option is set
or not. Expect rules relies on custom functions to check MySQL server packets.
A shared tcp-check ruleset is now created to support postgres check. This way no
extra memory is used if several backends use a pgsql check.
The following sequence is used :
tcp-check connect default linger
tcp-check send-binary PGSQL_REQ log-format
tcp-check expect !rstring "^E" min-recv 5 \
error-status "L7RSP" on-error "%[check.payload(6,0)]"
tcp-check expect rbinary "^520000000800000000 min-recv "9" \
error-status "L7STS" \
on-success "PostgreSQL server is ok" \
on-error "PostgreSQL unknown error"
The log-format hexa string PGSQL_REQ depends on 2 preset variables, the packet
length (check.plen) and the username (check.username).
A share tcp-check ruleset is now created to support smtp checks. This way no
extra memory is used if several backends use a smtp check.
The following sequence is used :
tcp-check connect default linger
tcp-check expect rstring "^[0-9]{3}[ \r]" min-recv 4 \
error-status "L7RSP" on-error "%[check.payload(),cut_crlf]"
tcp-check expect rstring "^2[0-9]{2}[ \r]" min-recv 4 \
error-status "L7STS" \
on-error %[check.payload(4,0),ltrim(' '),cut_crlf] \
status-code "check.payload(0,3)"
tcp-echeck send "%[var(check.smtp_cmd)]\r\n" log-format
tcp-check expect rstring "^2[0-9]{2}[- \r]" min-recv 4 \
error-status "L7STS" \
on-error %[check.payload(4,0),ltrim(' '),cut_crlf] \
on-success "%[check.payload(4,0),ltrim(' '),cut_crlf]" \
status-code "check.payload(0,3)"
The variable check.smtp_cmd is by default the string "HELO localhost" by may be
customized setting <helo> and <domain> parameters on the option smtpchk
line. Note there is a difference with the old smtp check. The server gretting
message is checked before send the HELO/EHLO comand.
A shared tcp-check ruleset is now created to support ssl-hello check. This way
no extra memory is used if several backends use a ssl-hello check.
The following sequence is used :
tcp-check send-binary SSLV3_CLIENT_HELLO log-format
tcp-check expect rbinary "^1[56]" min-recv 5 \
error-status "L6RSP" tout-status "L6TOUT"
SSLV3_CLIENT_HELLO is a log-format hexa string representing a SSLv3 CLIENT HELLO
packet. It is the same than the one used by the old ssl-hello except the sample
expression "%[date(),htonl,hex]" is used to set the date field.
A share tcp-check ruleset is now created to support redis checks. This way no
extra memory is used if several backends use a redis check.
The following sequence is used :
tcp-check send "*1\r\n$4\r\nPING\r\n"
tcp-check expect string "+PONG\r\n" error-status "L7STS" \
on-error "%[check.payload(),cut_crlf]" on-success "Redis server is ok"
It is now possible to set a custom function to evaluate a tcp-check expect
rule. It is an internal and not documentd option because the right pointer of
function must be set and it is not possible to express it in the
configuration. It will be used to convert some protocol healthchecks to
tcp-checks.
Custom functions must have the following signature:
enum tcpcheck_eval_ret (*custom)(struct check *, struct tcpcheck_rule *, int);
A list of variables is now associated to each tcp-check ruleset. It is more a
less a list of set-var expressions. This list may be filled during the
configuration parsing. The listed variables will then be set during each
execution of the tcp-check healthcheck, at the begining, before execution of the
the first tcp-check rule.
This patch is mandatory to convert all protocol checks to tcp-checks. It is a
way to customize shared tcp-check rulesets.
This option defines a sample expression, evaluated as an integer, to set the
status code (check->code) if a tcp-check healthcheck ends on the corresponding
expect rule.
These options define log-format strings used to produce the info message if a
tcp-check expect rule fails (on-error option) or succeeds (on-success
option). For this last option, it must be the ending rule, otherwise the
parameter is ignored.
It is now possible to specified the healthcheck status to use on error or on
timeout for tcp-check expect rules. First, to define the error status, the
option "error-status" must be used followed by "L4CON", "L6RSP", "L7RSP" or
"L7STS". Then, to define the timeout status, the option "tout-status" must be
used followed by "L4TOUT", "L6TOUT" or "L7TOUT".
These options will be used to convert specific protocol healthchecks (redis,
pgsql...) to tcp-check ones.
x
A global list to tcp-check ruleset can now be used to share common rulesets with
all backends without any duplication. It is mandatory to convert all specific
protocol checks (redis, pgsql...) to tcp-check healthchecks.
To do so, a flag is now attached to each tcp-check ruleset to know if it is a
shared ruleset or not. tcp-check rules defined in a backend are still directly
attached to the proxy and not shared. In addition a second flag is used to know
if the ruleset is inherited from the defaults section.
An extra parameter for tcp-check send rules can be specified to handle the
string or the hexa string as a log-format one. Using "log-format" option,
instead of considering the data to send as raw data, it is parsed as a
log-format string. Thus it is possible to call sample fetches to customize data
sent to a server. Of course, because we have no stream attached to healthchecks,
not all sample fetches are available. So be careful.
tcp-check set-var(check.port) int(8000)
tcp-check set-var(check.uri) str(/status)
tcp-check connect port var(check.port)
tcp-check send "GET %[check.uri] HTTP/1.0\r\n" log-format
tcp-check send "Host: %[srv_name]\r\n" log-format
tcp-check send "\r\n"
Since we have a session attached to tcp-check healthchecks, It is possible use
sample expression and variables. In addition, it is possible to add tcp-check
set-var rules to define custom variables. So, now, a sample expression can be
used to define the port to use to establish a connection for a tcp-check connect
rule. For instance:
tcp-check set-var(check.port) int(8888)
tcp-check connect port var(check.port)
With this option, it is now possible to use a specific address to open the
connection for a tcp-check connect rule. If the port option is also specified,
it is used in priority.
With this option, it is possible to establish the connection opened by a
tcp-check connect rule using upstream socks4 proxy. Info from the socks4
parameter on the server are used.
Register the custom action rules "set-var" and "unset-var", that will
call the parse_store() command upon parsing.
These rules are thus built and integrated to the tcp-check ruleset, but
have no further effect for the moment.
Add a dedicated vars scope for checks. This scope is considered as part of the
session scope for accounting purposes.
The scope can be addressed by a valid session, even embryonic. The stream is not
necessary.
The scope is initialized after the check session is created. All variables are
then pruned before the session is destroyed.
Create a session for each healthcheck relying on a tcp-check ruleset. When such
check is started, a session is allocated, which will be freed when the check
finishes. A dummy static frontend is used to create these sessions. This will be
useful to support variables and sample expression. This will also be used,
later, by HTTP healthchecks to rely on HTTP muxes.
The loop in tcpcheck_main() function is quite hard to understand. Depending
where we are in the loop, The current_step is the currentely executed rule or
the one to execute on the next call to tcpcheck_main(). When the check result is
reported, we rely on the rule pointed by last_started_step or the one pointed by
current_step. In addition, the loop does not use the common list_for_each_entry
macro and it is thus quite confusing.
So the loop has been totally rewritten and splitted to several functions to
simplify its reading and its understanding. Tcp-check rules are evaluated in
dedicated functions. And a common for_each loop is used and only one rule is
referenced, the current one.
After the configuration parsing, when its validity check, an implicit tcp-check
connect rule is added in front of the tcp-check ruleset if the first non-comment
rule is not a connect one. This implicit rule is flagged to use the default
check parameter.
This means now, all tcp-check rulesets begin with a connect and are never
empty. When tcp-check healthchecks are used, all connections are thus handled by
tcpcheck_main() function.
To allow reusing these blocks without consuming more memory, their list
should be static and share-able accross uses. The head of the list will
be shared as well.
It is thus necessary to extract the head of the rule list from the proxy
itself. Transform it into a pointer instead, that can be easily set to
an external dynamically allocated head.
Parse back-references in comments of tcp-check expect rules. If references are
made, capture groups in the match and replace references to it within the
comment when logging the error. Both text and binary regex can caputre groups
and reference them in the expect rule comment.
[Cf: I slightly updated the patch. exp_replace() function is used instead of a
custom one. And if the trash buffer is too small to contain the comment during
the substitution, the comment is ignored.]
The rbinary match works similarly to the rstring match type, however the
received data is rewritten as hex-string before the match operation is
done.
This allows using regexes on binary content even with the POSIX regex
engine.
[Cf: I slightly updated the patch. mem2hex function was removed and dump_binary
is used instead.]
Allow declaring tcpcheck connect commands with a new parameter,
"linger". This option will configure the connection to avoid using an
RST segment to close, instead following the four-way termination
handshake. Some servers would otherwise log each healthcheck as
an error.
Some expect rules cannot be satisfied due to inherent ambiguity towards
the received data: in the absence of match, the current behavior is to
be forced to wait either the end of the connection or a buffer full,
whichever comes first. Only then does the matching diagnostic is
considered conclusive. For instance :
tcp-check connect
tcp-check expect !rstring "^error"
tcp-check expect string "valid"
This check will only succeed if the connection is closed by the server before
the check timeout. Otherwise the first expect rule will wait for more data until
"^error" regex matches or the check expires.
Allow the user to explicitly define an amount of data that will be
considered enough to determine the value of the check.
This allows succeeding on negative rstring rules, as previously
in valid condition no match happened, and the matching was repeated
until the end of the connection. This could timeout the check
while no error was happening.
[Cf: I slighly updated the patch. The parameter was renamed and the value is a
signed integer to support -1 as default value to ignore the parameter.]
When receiving additional data while chaining multiple tcp-check expects,
previous inverse expects might have a different result with the new data. They
need to be evaluated again against the new data.
Add a pointer to the first inverse expect rule of the current expect chain
(possibly of length one) to each expect rule. When receiving new data, the
currently evaluated tcp-check rule is set back to this pointed rule.
Fonctionnaly speaking, it is a bug and it exists since the introduction of the
feature. But there is no way for now to hit it because when an expect rule does
not match, we wait for more data, independently on the inverse flag. The only
way to move to the following rule is to be sure no more data will be received.
This patch depends on the commit "MINOR: mini-clist: Add functions to iterate
backward on a list".
[Cf: I slightly updated the patch. First, it only concerns inverse expect
rule. Normal expect rules are not concerned. Then, I removed the BUG tag
because, for now, it is not possible to move to the following rule when the
current one does not match while more data can be received.]
Replace the generic integer with an enumerated list. This allows light
type check and helps debugging (seeing action = 2 in the struct is not
helpful).
This options is used to force a non-SSL connection to check a SSL server or to
invert a check-ssl option inherited from the default section. The use_ssl field
in the check structure is used to know if a SSL connection must be used
(use_ssl=1) or not (use_ssl=0). The server configuration is used by default.
The problem is that we cannot distinguish the default case (no specific SSL
check option) and the case of an explicit non-SSL check. In both, use_ssl is set
to 0. So the server configuration is always used. For a SSL server, when
no-check-ssl option is set, the check is still performed using a SSL
configuration.
To fix the bug, instead of a boolean value (0=TCP, 1=SSL), we use a ternary value :
* 0 = use server config
* 1 = force SSL
* -1 = force non-SSL
The same is done for the server parameter. It is not really necessary for
now. But it is a good way to know is the server no-ssl option is set.
In addition, the PR_O_TCPCHK_SSL proxy option is no longer used to set use_ssl
to 1 for a check. Instead the flag is directly tested to prepare or destroy the
server SSL context.
This patch should be backported as far as 1.8.
The 'http-check send' directive have been added to add headers and optionnaly a
payload to the request sent during HTTP healthchecks. The request line may be
customized by the "option httpchk" directive but there was not official way to
add extra headers. An old trick consisted to hide these headers at the end of
the version string, on the "option httpchk" line. And it was impossible to add
an extra payload with an "http-check expect" directive because of the
"Connection: close" header appended to the request (See issue #16 for details).
So to make things official and fully support payload additions, the "http-check
send" directive have been added :
option httpchk POST /status HTTP/1.1
http-check send hdr Content-Type "application/json;charset=UTF-8" \
hdr X-test-1 value1 hdr X-test-2 value2 \
body "{id: 1, field: \"value\"}"
When a payload is defined, the Content-Length header is automatically added. So
chunk-encoded requests are not supported yet. For now, there is no special
validity checks on the extra headers.
This patch is inspired by Kiran Gavali's work. It should fix the issue #16 and
as far as possible, it may be backported, at least as far as 1.8.
list_for_each_entry_rev() and list_for_each_entry_from_rev() and corresponding
safe versions have been added to iterate on a list in the reverse order. All
these functions work the same way than the forward versions, except they use the
.p field to move for an element to another.
Server address and port may change at runtime. So the address and port passed as
arguments and as environment variables when an external check is executed must
be updated. The current number of connections on the server was already updated
before executing the command. So the same mechanism is used for the server
address and port. But in addition, command arguments are also updated.
This patch must be backported to all stable versions. It should fix the
issue #577.
The url_decode() function used by the url_dec converter and a few other
call points is ambiguous on its processing of the '+' character which
itself isn't stable in the spec. This one belongs to the reserved
characters for the query string but not for the path nor the scheme,
in which it must be left as-is. It's only in argument strings that
follow the application/x-www-form-urlencoded encoding that it must be
turned into a space, that is, in query strings and POST arguments.
The problem is that the function is used to process full URLs and
paths in various configs, and to process query strings from the stats
page for example.
This patch updates the function to differentiate the situation where
it's parsing a path and a query string. A new argument indicates if a
query string should be assumed, otherwise it's only assumed after seeing
a question mark.
The various locations in the code making use of this function were
updated to take care of this (most call places were using it to decode
POST arguments).
The url_dec converter is usually called on path or url samples, so it
needs to remain compatible with this and will default to parsing a path
and turning the '+' to a space only after a question mark. However in
situations where it would explicitly be extracted from a POST or a
query string, it now becomes possible to enforce the decoding by passing
a non-null value in argument.
It seems to be what was reported in issue #585. This fix may be
backported to older stable releases.
As reported in issue #596, the edx register isn't marked as clobbered
in div64_32(), which could technically allow gcc to try to reuse it
if it needed a copy of the 32 highest bits of the o1 register after
the operation.
Two attempts were tried, one using a dummy 32-bit local variable to
store the intermediary edx and another one switching to "=A" and making
result a long long. It turns out the former makes the resulting object
code significantly dirtier while the latter makes it better and was
kept. This is due to gcc's difficulties at working with register pairs
mixing 32- and 64- bit values on i386. It was verified that no code
change happened at all on x86_64, armv7, aarch64 nor mips32.
In practice it's only used by the frequency counters so this bug
cannot even be triggered but better fix it.
This may be backported to stable branches though it will not fix any
issue.
If haproxy fails to start and emits an alert, then it can be useful
to have it also emit the version and the path used to load it. Some
users may be mistakenly launching the wrong binary due to a misconfigured
PATH variable and this will save them some troubleshooting time when it
reports that some keywords are not understood.
What we do here is that we *try* to extract the binary name from the
AUX vector on glibc, and we report this as a NOTICE tag before the
very first alert is emitted.
Since some systems switched to service managers which hide all warnings
by default, some users are not aware of some possibly important warnings
and get caught too late with errors that could have been detected earlier.
This patch adds a new global keyword, "zero-warning" and an equivalent
command-line option "-dW" to refuse to start in case any warning is
detected. It is recommended to use these with configurations that are
managed by humans in order to catch mistakes very early.
This helps quickly checking if the config produces any warning. For
this we reuse the "warned" bit field to add a new WARN_ANY bit that is
set by ha_warning(). The rest of the bit field was also cleaned from
unused bits.
Before supporting "server" line in "peers" section, such sections without
any local peer were removed from the configuration to get it validated.
This patch fixes the issue where a "server" line without address and port which
is a remote peer without address and port makes the configuration parsing fail.
When encoutering such cases we now ignore such lines remove them from the
configuration.
Thank you to Jrme Magnin for having reported this bug.
Must be backported to 2.1 and 2.0.
In 'commit ssl cert', instead of trying to regenerate a list of filters
from the SNIs, use the list provided by the crtlist_entry used to
generate the ckch_inst.
This list of filters doesn't need to be free'd anymore since they are
always reused from the crtlist_entry.
Use the refcount of the SSL_CTX' to free them instead of freeing them on
certains conditions. That way we can free the SSL_CTX everywhere its
pointer is used.
The dump and show ssl crt-list commands does the same thing, they dump
the content of a crt-list, but the 'show' displays an ID in the first
column. Delete the 'dump' command so it is replaced by the 'show' one.
The old 'show' command is replaced by an '-n' option to dump the ID.
And the ID which was a pointer is replaced by a line number and placed
after colons in the filename.
Example:
$ echo "show ssl crt-list -n kikyo.crt-list" | socat /tmp/sock1 -
# kikyo.crt-list
kikyo.pem.rsa:1 secure.domain.tld
kikyo.pem.ecdsa:2 secure.domain.tld
This patch fixes a bad stop condition when decoding a protocol buffer variable integer
whose maximum lenghts are 10, shifting a uint64_t value by more than 63.
Thank you to Ilya for having reported this issue.
Must be backported to 2.1 and 2.0.
When updating a ckch_store we may want to update its pointer in the
crtlist_entry which use it. To do this, we need the list of the entries
using the store.
The instances were wrongly inserted in the crtlist entries, all
instances of a crt-list were inserted in the last crt-list entry.
Which was kind of handy to free all instances upon error.
Now that it's done correctly, the error path was changed, it must
iterate on the entries and find the ckch_insts which were generated for
this bind_conf. To avoid wasting time, it stops the iteration once it
found the first unsuccessful generation.
In order to be able to add new certificate in a crt-list, we need the
list of bind_conf that uses this crt-list so we can create a ckch_inst
for each of them.
Add a counter to know the current number of used connections, as well as the
max, this will be used later to refine the algorithm used to kill idle
connections, based on current usage.
With server-template was introduced the possibility to scale the
number of servers in a backend without needing a configuration change
and associated reload. On the other hand it became impractical to
write use-server rules for these servers as they would only accept
existing server labels as argument. This patch allows the use of
log-format notation to describe targets of a use-server rules, such
as in the example below:
listen test
bind *:1234
use-server %[hdr(srv)] if { hdr(srv) -m found }
use-server s1 if { path / }
server s1 127.0.0.1:18080
server s2 127.0.0.1:18081
If a use-server rule is applied because it was conditionned by an
ACL returning true, but the target of the use-server rule cannot be
resolved, no other use-server rule is evaluated and we fall back to
load balancing.
This feature was requested on the ML, and bumped with issue #563.
In srv_add_to_idle_list(), make sure we set the idle_time before we add
the connection to an idle list, not after, otherwise another thread may
grab it, set the idle_time to 0, only to have the original thread set it
back to now_ms.
This may have an impact, as in conn_free() we check idle_time to decide
if we should decrement the idle connection counters for the server.
In connect_server(), if we no longer have any idle connections for the
current thread, attempt to use the new "takeover" mux method to steal a
connection from another thread.
This should have no impact right now, given no mux implements it.
Make the "list" element a struct mt_list, and explicitely use
list_from_mt_list to get a struct list * where it is used as such, so that
mt_list_for_each_entry will be usable with it.
Add a new mux method, "takeover", that will attempt to make the current thread
responsible for the connection.
It should return 0 on success, and non-zero on failure.
Implement a new function, fd_takeover(), that lets you become the thread
responsible for the fd. On architectures that do not have a double-width CAS,
use a global rwlock.
fd_set_running() was also changed to be able to compete with fd_takeover(),
either using a dooble-width CAS on both running_mask and thread_mask, or
by claiming a reader on the global rwlock. This extra operation should not
have any measurable impact on modern architectures where threading is
relevant.
Revamp the server connection lists. We know have 3 lists :
- idle_conns, which contains idling connections
- safe_conns, which contains idling connections that are safe to use even
for the first request
- available_conns, which contains connections that are not idling, but can
still accept new streams (those are HTTP/2 or fastcgi, and are always
considered safe).
Make it so sessions are not responsible for connection anymore, except for
connections that are private, and thus can't be shared, otherwise, as soon
as a request is done, the session will just add the connection to the
orphan connections pool.
This will break http-reuse safe, but it is expected to be fixed later.
The flush_lock was introduced, mostly to be sure that pool_gc() will never
dereference a pointer that has been free'd. __pool_get_first() was acquiring
the lock to, the fear was that otherwise that pointer could get free'd later,
and then pool_gc() would attempt to dereference it. However, that can not
happen, because the only functions that can free a pointer, when using
lockless pools, are pool_gc() and pool_flush(), and as long as those two
are mutually exclusive, nobody will be able to free the pointer while
pool_gc() attempts to access it.
So change the flush_lock to a spinlock, and don't bother acquire/release
it in __pool_get_first(), that way callers of __pool_get_first() won't have
to wait while the pool is flushed. The worst that can happen is we call
__pool_refill_alloc() while the pool is getting flushed, and memory can
get allocated just to be free'd.
This may help with github issue #552
This may be backported to 2.1, 2.0 and 1.9.
Move the definition of WDTSIG and DEBUGSIG from wdt.c and debug.c into
types/signal.h, so that we can access them in another file.
We need those definition to avoid blocking those signals when running
__signal_process_queue().
This should be backported to 2.1, 2.0 and 1.9.
In the struct fdtab, introduce a new mask, running_mask. Each thread should
add its bit before using the fd.
Use the running_mask instead of a lock, in fd_insert/fd_delete, we'll just
spin as long as the mask is non-zero, to be sure we access the data
exclusively.
fd_set_running_excl() spins until the mask is 0, fd_set_running() just
adds the thread bit, and fd_clr_running() removes it.
The crtlist structure defines a crt-list in the HAProxy configuration.
It contains crtlist_entry structures which are the lines in a crt-list
file.
crt-list are now loaded in memory using crtlist and crtlist_entry
structures. The file is read only once. The generation algorithm changed
a little bit, new ckch instances are generated from the crtlist
structures, instead of being generated during the file loading.
The loading function was split in two, one that loads and caches the
crt-list and certificates, and one that looks for a crt-list and creates
the ckch instances.
Filters are also stored in crtlist_entry->filters as a char ** so we can
generate the sni_ctx again if needed. I won't be needed anymore to parse
the sni_ctx to do that.
A crtlist_entry stores the list of all ckch_inst that were generated
from this entry.
With these debug options we still get these warnings:
include/common/memory.h:501:23: warning: null pointer dereference [-Wnull-dereference]
*(volatile int *)0 = 0;
~~~~~~~~~~~~~~~~~~~^~~
include/common/memory.h:460:22: warning: null pointer dereference [-Wnull-dereference]
*(volatile int *)0 = 0;
~~~~~~~~~~~~~~~~~~~^~~
These are purposely there to crash the process at specific locations.
But the annoying warnings do not help with debugging and they are not
even reliable as the compiler may decide to optimize them away. Let's
pass the pointer through DISGUISE() to avoid this.
It's more generic and versatile than the previous shut_your_big_mouth_gcc()
that was used to silence annoying warnings as it's not limited to ignoring
syscalls returns only. This allows us to get rid of the aforementioned
function and the shut_your_big_mouth_gcc_int variable, that started to
look ugly in multi-threaded environments.
Tim reported that BUG_ON() issues warnings on his distro, as the libc marks
some syscalls with __attribute__((warn_unused_result)). Let's pass the
write() result through DISGUISE() to hide it.
This does exactly the same as ALREADY_CHECKED() but does it inline,
returning an identical copy of the scalar variable without letting
the compiler know how it might have been transformed. This can
forcefully disable certain null-pointer checks or result checks when
known undesirable. Typically forcing a crash with *(DISGUISE(NULL))=0
will not cause a null-deref warning.
These are mostly comments in the code. A few error messages were fixed
and are of low enough importance not to deserve a backport. Some regtests
were also fixed.
This patch adds the `unique-id` option to `proxy-v2-options`. If this
option is set a unique ID will be generated based on the `unique-id-format`
while sending the proxy protocol v2 header and stored as the unique id for
the first stream of the connection.
This feature is meant to be used in `tcp` mode. It works on HTTP mode, but
might result in inconsistent unique IDs for the first request on a keep-alive
connection, because the unique ID for the first stream is generated earlier
than the others.
Now that we can send unique IDs in `tcp` mode the `%ID` log variable is made
available in TCP mode.
Remove the list of private connections from server, it has been largely
unused, we only inserted connections in it, but we would never actually
use it.
Implement mt_list_to_list() and list_to_mt_list(), to be able to convert
from a struct list to a struct mt_list, and vice versa.
This is normally of no use, except for struct connection's list field, that
can go in either a struct list or a struct mt_list.
The stream-int code doesn't need to load server.h as it doesn't use
servers at all. However removing this one reveals that proxy.h was
lacking types/checks.h that used to be silently inherited from
types/server.h loaded before in stream_interface.h.
Ryan O'Hara reported that haproxy breaks on fedora-32 using gcc-10
(pre-release). It turns out that constructs such as:
while (item != head) {
item = LIST_ELEM(item.n);
}
loop forever, never matching <item> to <head> despite a printf there
showing them equal. In practice the problem is that the LIST_ELEM()
macro is wrong, it assigns the subtract of two pointers (an integer)
to another pointer through a cast to its pointer type. And GCC 10 now
considers that this cannot match a pointer and silently optimizes the
comparison away. A tested workaround for this is to build with
-fno-tree-pta. Note that older gcc versions even with -ftree-pta do
not exhibit this rather surprizing behavior.
This patch changes the test to instead cast the null-based address to
an int to get the offset and subtract it from the pointer, and this
time it works. There were just a few places to adjust. Ideally
offsetof() should be used but the LIST_ELEM() API doesn't make this
trivial as it's commonly called with a typeof(ptr) and not typeof(ptr*)
thus it would require to completely change the whole API, which is not
something workable in the short term, especially for a backport.
With this change, the emitted code is subtly different even on older
versions. A code size reduction of ~600 bytes and a total executable
size reduction of ~1kB are expected to be observed and should not be
taken as an anomaly. Typically this loop in dequeue_proxy_listeners() :
while ((listener = MT_LIST_POP(...)))
used to produce this code where the comparison is performed on RAX
while the new offset is assigned to RDI even though both are always
identical:
53ded8: 48 8d 78 c0 lea -0x40(%rax),%rdi
53dedc: 48 83 f8 40 cmp $0x40,%rax
53dee0: 74 39 je 53df1b <dequeue_proxy_listeners+0xab>
and now produces this one which is slightly more efficient as the
same register is used for both purposes:
53dd08: 48 83 ef 40 sub $0x40,%rdi
53dd0c: 74 2d je 53dd3b <dequeue_proxy_listeners+0x9b>
Similarly, retrieving the channel from a stream_interface using si_ic()
and si_oc() used to cause this (stream-int in rdi):
1cb7: c7 47 1c 00 02 00 00 movl $0x200,0x1c(%rdi)
1cbe: f6 47 04 10 testb $0x10,0x4(%rdi)
1cc2: 74 1c je 1ce0 <si_report_error+0x30>
1cc4: 48 81 ef 00 03 00 00 sub $0x300,%rdi
1ccb: 81 4f 10 00 08 00 00 orl $0x800,0x10(%rdi)
and now causes this:
1cb7: c7 47 1c 00 02 00 00 movl $0x200,0x1c(%rdi)
1cbe: f6 47 04 10 testb $0x10,0x4(%rdi)
1cc2: 74 1c je 1ce0 <si_report_error+0x30>
1cc4: 81 8f 10 fd ff ff 00 orl $0x800,-0x2f0(%rdi)
There is extremely little chance that this fix wakes up a dormant bug as
the emitted code effectively does what the source code intends.
This must be backported to all supported branches (dropping MT_LIST_ELEM
and the spoa_example parts as needed), since the bug is subtle and may
not always be visible even when compiling with gcc-10.
In MT_LIST_DEL_SAFE(), when the code was changed to use a temporary variable
instead of using the provided pointer directly, we shouldn't have changed
the code that set the pointer to NULL, as we really want the pointer
provided to be nullified, otherwise other parts of the code won't know
we just deleted an element, and bad things will happen.
This should be backported to 2.1.
The splice() syscall has been supported in glibc since version 2.5 issued
in 2006 and is present on supported systems so there's no need for having
our own arch-specific syscall definitions anymore.