BUG/MINOR: sock: store the connection error status

When an async connect() fails in sock_conn_check(), it returns an errno
that will not be retrieved later by a subsequent getsockopt(SO_ERROR).
The problem is that this errno is then definitely lost. This is visible
in the 4be_1srv_smtpchk_httpchk_layer47errors regtest that fails on
certain systems (e.g. glibc 2.31 on arm32 running Linux 6.1), where the
connect() error is systematically lost and the "Connection refused" is
never seen in the check status. It also matches a few random reports of
the past indicating that the connection error was sometimes not reported
in the stats page in front of a down server.

Ideally we should store errno in connections as soon as the error is
seen. However this would require significant changes that are not
acceptable yet for 3.4 nor stable releases. A more acceptable fix is to
make use of the extra CO_ER_* flags set by conn_set_errno() as soon as
the error is detected. This will recognize a sufficiently large number
of errors and the check status will report them (here we'll have
"ECONNREFUSED" in the check). Note that on systems where the error is
seen synchronously, we can have "ECONNREFUSED (Connection refused)",
but this is not a problem.

This fix adds the missing conn_set_errno() call to sock_conn_check(),
that is thus sufficient to catch this error. In addition, the two
affected regtests were updated to search for ECONNREFUSED here.

This might be backported to older releases if users request it, but it
is probably not necessary.
This commit is contained in:
Willy Tarreau 2026-05-18 17:06:28 +02:00
parent fdb569c2ea
commit 3da2b63274
3 changed files with 4 additions and 3 deletions

View file

@ -29,7 +29,7 @@ syslog S3 -level notice {
syslog S4 -level notice {
recv
expect ~ "[^:\\[ ]\\[${h1_pid}\\]: Health check for server be4/srv4 failed.+reason: Layer4 connection problem.+info: \"Connection refused\".+check duration: [[:digit:]]+ms.+status: 0/1 DOWN."
expect ~ "[^:\\[ ]\\[${h1_pid}\\]: Health check for server be4/srv4 failed.+reason: Layer4 connection problem.+info: \"ECONNREFUSED returned by OS.*\".+check duration: [[:digit:]]+ms.+status: 0/1 DOWN."
} -start
server s1 {

View file

@ -7,12 +7,12 @@ feature ignore_unknown_macro
syslog S1 -level notice {
recv
expect ~ "[^:\\[ ]\\[${h1_pid}\\]: Health check for server be1/srv1 failed.*Connection refused at step 2 of tcp-check.*connect port 1"
expect ~ "[^:\\[ ]\\[${h1_pid}\\]: Health check for server be1/srv1 failed.*ECONNREFUSED returned by OS.* at step 2 of tcp-check.*connect port 1"
} -start
syslog S2 -level notice {
recv
expect ~ "[^:\\[ ]\\[${h1_pid}\\]: Health check for server be2/srv1 failed.*Connection refused at step 1 of tcp-check.*connect port 1"
expect ~ "[^:\\[ ]\\[${h1_pid}\\]: Health check for server be2/srv1 failed.*ECONNREFUSED returned by OS.* at step 1 of tcp-check.*connect port 1"
} -start
server s1 {

View file

@ -1011,6 +1011,7 @@ int sock_conn_check(struct connection *conn)
goto wait;
if (errno && errno != EISCONN) {
conn_set_errno(conn, errno);
conn_report_term_evt(conn, tevt_loc_fd, fd_tevt_type_connect_err);
goto out_error;
}