haproxy

mirror of https://github.com/haproxy/haproxy.git synced 2026-06-19 22:39:46 -04:00

Author	SHA1	Message	Date
Willy Tarreau	c83e45e9b0	MEDIUM: queue: simplify again the process_srv_queue() API This basically undoes the API changes that were performed by commit `0274286dd` ("BUG/MAJOR: server: fix deadlock when changing maxconn via agent-check") to address the deadlock issue: since process_srv_queue() doesn't use the server lock anymore, it doesn't need the "server_locked" argument, so let's get rid of it before it gets used again.	2021-06-22 18:57:15 +02:00
Willy Tarreau	fcb8bf8650	MEDIUM: queue: use a dedicated lock for the queues Till now whenever a server or proxy's queue was touched, this server or proxy's lock was taken. Not only this requires distinct code paths, but it also causes unnecessary contention with other uses of these locks. This patch adds a lock inside the "queue" structure that will be used the same way by the server and the proxy queuing code. The server used to use a spinlock and the proxy an rwlock, though the queue only used it for locked writes. This new version uses a spinlock since we don't need the read lock part here. Tests have not shown any benefit nor cost in using this one versus the rwlock so we could change later if needed. The lower contention on the locks increases the performance from 491k to 507k req/s on 16 threads with 20 servers and leastconn. The gain with roundrobin even increases by 6%. The performance profile changes from this: 13.03% haproxy [.] fwlc_srv_reposition 8.08% haproxy [.] fwlc_get_next_server 3.62% haproxy [.] process_srv_queue 1.78% haproxy [.] pendconn_dequeue 1.74% haproxy [.] pendconn_add to this: 11.95% haproxy [.] fwlc_srv_reposition 7.57% haproxy [.] fwlc_get_next_server 3.51% haproxy [.] process_srv_queue 1.74% haproxy [.] pendconn_dequeue 1.70% haproxy [.] pendconn_add At this point the differences are mostly measurement noise. This is tagged medium because the lock is changed, but no other part of the code touches the queues, with nor without locking, so this should remain invisible.	2021-06-22 18:43:56 +02:00
Willy Tarreau	a05704582c	MINOR: server: replace the pendconns-related stuff with a struct queue Just like for proxies, all three elements (pendconns, nbpend, queue_idx) were moved to struct queue.	2021-06-22 18:43:14 +02:00
Willy Tarreau	7f3c1df248	MINOR: proxy: replace the pendconns-related stuff with a struct queue All three elements (pendconns, nbpend, queue_idx) were moved to struct queue.	2021-06-22 18:43:14 +02:00
Willy Tarreau	5941ef0a6c	MINOR: lb/api: remove the locked argument from take_conn/drop_conn This essentially reverts commit 2b4370078 ("MINOR: lb/api: let callers of take_conn/drop_conn tell if they have the lock") that was merged during 2.4 before the various locks could be eliminated at the lower layers. Passing that information complicates the cleanup of the queuing code and it's become useless.	2021-06-22 18:43:12 +02:00
Willy Tarreau	1b648c857b	MEDIUM: queue: refine the locking in process_srv_queue() The lock in process_srv_queue() was placed around the whole loop to avoid the cost of taking/releasing it multiple times. But in practice almost all calls to this function only dequeue a single connection, so that argument doesn't really stand. However by placing the lock inside the loop, we'd make it possible to release it before manipulating the pendconn and waking the task up. That's what this patch does. This increases the performance from 431k to 491k req/s on 16 threads with 20 servers under leastconn. The performance profile changes from this: 14.09% haproxy [.] process_srv_queue 10.22% haproxy [.] fwlc_srv_reposition 6.39% haproxy [.] fwlc_get_next_server 3.97% haproxy [.] pendconn_dequeue 3.84% haproxy [.] pendconn_add to this: 13.03% haproxy [.] fwlc_srv_reposition 8.08% haproxy [.] fwlc_get_next_server 3.62% haproxy [.] process_srv_queue 1.78% haproxy [.] pendconn_dequeue 1.74% haproxy [.] pendconn_add The difference is even slightly more visible in roundrobin which does not have take_conn() call.	2021-06-22 18:41:55 +02:00
Willy Tarreau	3e92a31783	MINOR: queue: update proxy->served once out of the loop It's not needed during all these operations and doesn't even affect queueing in the LB algo, so we can safely update it out of the loop and the lock.	2021-06-22 18:37:45 +02:00
Willy Tarreau	5304669e1b	MEDIUM: queue: make pendconn_process_next_strm() only return the pendconn It used to do far too much under the lock, including waking up tasks, updating counters and repositionning entries in the load balancing algo. This patch first moves all that stuff out of the function into the only caller (process_srv_queue()). The decision to update the LB algo is now taken out of the lock. The wakeups could be performed outside of the loop by using a local list. This increases the performance from 377k to 431k req/s on 16 threads with 20 servers under leastconn. The perf profile changes from this: 23.17% haproxy [.] process_srv_queue 6.58% haproxy [.] pendconn_add 6.40% haproxy [.] pendconn_dequeue 5.48% haproxy [.] fwlc_srv_reposition 3.70% haproxy [.] fwlc_get_next_server to this: 13.95% haproxy [.] process_srv_queue 9.96% haproxy [.] fwlc_srv_reposition 6.21% haproxy [.] fwlc_get_next_server 3.96% haproxy [.] pendconn_dequeue 3.75% haproxy [.] pendconn_add	2021-06-22 18:37:41 +02:00
Amaury Denoyelle	0274286dd3	BUG/MAJOR: server: fix deadlock when changing maxconn via agent-check The server_parse_maxconn_change_request locks the server lock. However, this function can be called via agent-checks or lua code which already lock it. This bug has been introduced by the following commit : commit `79a88ba3d0` BUG/MAJOR: server: prevent deadlock when using 'set maxconn server' This commit tried to fix another deadlock with can occur because previoulsy server_parse_maxconn_change_request requires the server lock to be held. However, it may call internally process_srv_queue which also locks the server lock. The locking policy has thus been updated. The fix is functional for the CLI 'set maxconn' but fails to address the agent-check / lua counterparts. This new issue is fixed in two steps : - changes from the above commit have been reverted. This means that server_parse_maxconn_change_request must again be called with the server lock. - to counter the deadlock fixed by the above commit, process_srv_queue now takes an argument to render the server locking optional if the caller already held it. This is only used by server_parse_maxconn_change_request. The above commit was subject to backport up to 1.8. Thus this commit must be backported in every release where it is already present.	2021-06-22 11:39:20 +02:00
Willy Tarreau	901972e261	MINOR: queue: update the stream's pend_pos before queuing it Since commit `c7eedf7a5` ("MINOR: queue: reduce the locked area in pendconn_add()") the stream's pend_pos is set out of the lock, after the pendconn is queued. While this entry is only manipulated by the stream itself and there is no bug caused by this right now, it's a bit dangerous because another thread could decide to look at this field during dequeuing and could randomly see something else. Also in case of crashes, memory inspection wouldn't be as trustable. Let's assign the pendconn before it can be found in the queue.	2021-06-18 18:21:18 +02:00
Amaury Denoyelle	34897d2eff	MINOR: ssl: support ssl keyword for dynamic servers Activate the 'ssl' keyword for dynamic servers. This is the final step to have ssl dynamic servers feature implemented. If activated, ssl_sock_prepare_srv_ctx will be called at the end of the 'add server' CLI handler. At the same time, update the management doc to list all ssl keywords implemented for dynamic servers.	2021-06-18 16:42:26 +02:00
Amaury Denoyelle	71f9a06e4b	MINOR: ssl: enable a series of ssl keywords for dynamic servers These keywords are deemed safe-enough to be enable on dynamic servers. Their parsing functions are simple and can be called at runtime. - allow-0rtt - alpn - ciphers - ciphersuites - force-sslv3/tlsv10/tlsv11/tlsv12/tlsv13 - no-sslv3/tlsv10/tlsv11/tlsv12/tlsv13 - no-ssl-reuse - no-tls-tickets - npn - send-proxy-v2-ssl - send-proxy-v2-ssl-cn - sni - ssl-min-ver - ssl-max-ver - tls-tickets - verify - verifyhost 'no-ssl-reuse' and 'no-tls-tickets' are enabled to override the default behavior. 'tls-tickets' is enable to override a possible 'no-tls-tickets' set via the global option 'ssl-default-server-options'. 'force' and 'no' variants of tls method options are useful to override a possible 'ssl-default-server-options'.	2021-06-18 16:42:26 +02:00
Amaury Denoyelle	fde82605cd	MINOR: ssl: support crl arg for dynamic servers File-access through ssl_store_load_locations_file is deactivated if srv_parse_crl is used at runtime for a dynamic server. The crl must have already been loaded either in the config or through the 'ssl crl' CLI commands.	2021-06-18 16:42:26 +02:00
Amaury Denoyelle	93be21e0c6	MINOR: ssl: support crt arg for dynamic servers File-access through ssl_store_load_locations_file is deactivated if srv_parse_crt is used at runtime for a dynamic server. The cert must have already been loaded either in the config or through the 'ssl cert' CLI commands.	2021-06-18 16:42:26 +02:00
Amaury Denoyelle	482550280a	MINOR: ssl: support ca-file arg for dynamic servers File-access through ssl_store_load_locations_file is deactivated if srv_parse_ca_file is used at runtime for a dynamic server. The ca-file must have already been loaded either in the config or through the 'ssl ca-file' CLI commands.	2021-06-18 16:42:26 +02:00
Amaury Denoyelle	7addf56b72	MINOR: ssl: split parse functions for alpn/check-alpn This will be in preparation for support of ssl on dynamic servers. The 'alpn' keyword will be allowed for dynamic servers but not the 'check-alpn'. The alpn parsing is extracted into a new function parse_alpn. Each srv_parse_alpn and srv_parse_check_alpn called it.	2021-06-18 16:42:26 +02:00
Amaury Denoyelle	36aa451a4e	MINOR: ssl: render file-access optional on server crt loading The function ssl_sock_load_srv_cert will be used at runtime for dynamic servers. If the cert is not loaded on ckch tree, we try to access it from the file-system. Now this access operation is rendered optional by a new function argument. It is only allowed at parsing time, but will be disabled for dynamic servers at runtime.	2021-06-18 16:42:25 +02:00
Amaury Denoyelle	b89d3d3de7	MINOR: server: disable CLI 'set server ssl' for dynamic servers 'set server ssl' uses ssl parameters from default-server. As dynamic servers does not reuse any default-server parameters, this command has no sense for them.	2021-06-18 16:42:25 +02:00
Amaury Denoyelle	1f9333b30e	MINOR: ssl: check allocation in parse npn/sni These checks are especially required now as this function will be used at runtime for dynamic servers.	2021-06-18 16:42:25 +02:00
Amaury Denoyelle	cbbf87f119	MINOR: ssl: check allocation in parse ciphers/ciphersuites/verifyhost These checks are especially required now as this function will be used at runtime for dynamic servers.	2021-06-18 16:42:25 +02:00
Amaury Denoyelle	949c94e462	MINOR: ssl: check allocation in ssl_sock_init_srv These checks are especially required now as this function will be used at runtime for dynamic servers.	2021-06-18 16:42:25 +02:00
Amaury Denoyelle	c593bcdb43	MINOR: ssl: always initialize random generator Explicitly call ssl_initialize_random to initialize the random generator in init() global function. If the initialization fails, the startup is interrupted. This commit is in preparation for support of ssl on dynamic servers. To be able to activate ssl on dynamic servers, it is necessary to ensure that the random generator is initialized on startup regardless of the config. It cannot be called at runtime as access to /dev/urandom is required. This also has the effect to fix the previous non-consistent behavior. Indeed, if bind or server in the config are using ssl, the initialization function was called, and if it failed, the startup was interrupted. Otherwise, the ssl initialization code could have been called through the ssl server for lua, but this times without blocking the startup on error. Or not called at all if lua was deactivated.	2021-06-18 16:42:25 +02:00
Amaury Denoyelle	b11ad9ed61	MINOR: ssl: fix typo in usage for 'new ssl ca-file' Fix the usage for the command new ssl ca-file, which has a missing '-' dash separator.	2021-06-18 16:42:25 +02:00
Tim Duesterhus	3bc6af417d	BUG/MINOR: cache: Correctly handle existing-but-empty 'accept-encoding' header RFC 7231#5.3.4 makes a difference between a completely missing 'accept-encoding' header and an 'accept-encoding' header without any values. This case was already correctly handled by accident, because an empty accept encoding does not match any known encoding. However this resulted in the 'other' encoding being added to the bitmap. Usually this also succeeds in serving cached responses, because the cached response likely has no 'content-encoding', thus matching the identity case instead of not serving the response, due to the 'other' encoding. But it's technically not 100% correct. Fix this by special-casing 'accept-encoding' values with a length of zero and extend the test to check that an empty accept-encoding is correctly handled. Due to the reasons given above the test also passes without the change in cache.c. Vary support was added in HAProxy 2.4. This fix should be backported to 2.4+.	2021-06-18 15:48:20 +02:00
Christopher Faulet	0ba54bb401	BUG/MINOR: server/cli: Fix locking in function processing "set server" command The commit `c7b391aed` ("BUG/MEDIUM: server/cli: Fix ABBA deadlock when fqdn is set from the CLI") introduced 2 bugs. The first one is a typo on the server's lock label (s/SERVER_UNLOCK/SERVER_LOCK/). The second one is about the server's lock itself. It must be acquired to execute the "agent-send" subcommand. The patch above is marked to be backported as far as 1.8. Thus, this one must also backported as far 1.8. BUG/MINOR: server/cli: Don't forget to lock server on agent-send subcommand	2021-06-18 09:16:32 +02:00
Christopher Faulet	e886dd5c32	BUG/MINOR: resolvers: Use resolver's lock in resolv_srvrq_expire_task() The commit `dcac41806` ("BUG/MEDIUM: resolvers: Add a task on servers to check SRV resolution status") introduced a type. In resolv_srvrq_expire_task() function, the resolver's lock must be used instead of the resolver itself. This patch must be backported with the patch above (at least as far as 2.2).	2021-06-18 09:15:35 +02:00
Amaury Denoyelle	655dec81bd	BUG/MINOR: backend: do not set sni on connection reuse When reusing a backend connection, do not reapply the SNI on the connection. It should already be defined when the connection was instantiated on a previous connect_server invocation. As the SNI is a parameter used to select a connection, only connection with same value can be reused. The impact of this bug is unknown and may be null. No memory leak has been reported by valgrind. So this is more a cleaning fix. This commit relies on the SF_SRV_REUSED flag and thus depends on the following fix : BUG/MINOR: backend: restore the SF_SRV_REUSED flag original purpose This should be backported up to 2.4.	2021-06-17 18:01:57 +02:00
Amaury Denoyelle	2b1d91758d	BUG/MINOR: backend: restore the SF_SRV_REUSED flag original purpose The SF_SRV_REUSED flag was set if a stream reused a backend connection. One of its purpose is to count the total reuse on the backend in opposition to newly instantiated connection. However, the flag was diverted from its original purpose since the following commit : `e8f5f5d8b2` BUG/MEDIUM: servers: Only set SF_SRV_REUSED if the connection if fully ready. With this change, the flag is not set anymore if the mux is not ready when a connection is picked for reuse. This can happen for multiplexed connections which are inserted in the available list as soon as created in http-reuse always mode. The goal of this change is to not retry immediately this request in case on an error on the same server if the reused connection is not fully ready. This change is justified for the retry timeout handling but it breaks other places which still uses the flag for its original purpose. Mainly, in this case the wrong 'connect' backend counter is incremented instead of the 'reuse' one. The flag is also used in http_return_srv_error and may have an impact if a http server error is replied for this stream. To fix this problem, the original purpose of the flag is restored by setting it unconditionaly when a connection is reused. Additionally, a new flag SF_SRV_REUSED_ANTICIPATED is created. This flag is set when the connection is reused but the mux is not ready yet. For the timeout handling on error, the request is retried immediately only if the stream reused a connection without this newly anticipated flag. This must be backported up to 2.1.	2021-06-17 17:58:50 +02:00
Christopher Faulet	dcac418062	BUG/MEDIUM: resolvers: Add a task on servers to check SRV resolution status When a server relies on a SRV resolution, a task is created to clean it up (fqdn/port and address) when the SRV resolution is considered as outdated (based on the resolvers 'timeout' value). It is only possible if the server inherits outdated info from a state file and is no longer selected to be attached to a SRV item. Note that most of time, a server is attached to a SRV item. Thus when the item becomes obsolete, the server is cleaned up. It is important to have such task to be sure the server will be free again to have a chance to be resolved again with fresh information. Of course, this patch is a workaround to solve a design issue. But there is no other obvious way to fix it without rewritting all the resolvers part. And it must be backportable. This patch relies on following commits: * MINOR: resolvers: Clean server in a dedicated function when removing a SRV item * MINOR: resolvers: Remove server from named_servers tree when removing a SRV item All the series must be backported as far as 2.2 after some observation period. Backports to 2.0 and 1.8 must be evaluated.	2021-06-17 16:52:35 +02:00
Christopher Faulet	73001ab6e3	MINOR: resolvers: Remove server from named_servers tree when removing a SRV item When a server is cleaned up because the corresponding SRV item is removed, we always remove the server from the srvrq's name_servers tree. For now, it is useless because, if a server was attached to a SRV item, it means it was already removed from the tree. But it will be mandatory to fix a bug.	2021-06-17 16:52:35 +02:00
Christopher Faulet	11c6c39656	MINOR: resolvers: Clean server in a dedicated function when removing a SRV item A dedicated function is now used to clean up servers when a SRV item becomes obsolete or when a requester is removed from a resolution. This patch is mandatory to fix a bug.	2021-06-17 16:52:35 +02:00
Christopher Faulet	c7b391aed2	BUG/MEDIUM: server/cli: Fix ABBA deadlock when fqdn is set from the CLI To perform servers resolution, the resolver's lock is first acquired then the server's lock when necessary. However, when the fqdn is set via the CLI, the opposite is performed. So, it is possible to experience an ABBA deadlock. To fix this bug, the server's lock is acquired and released for each subcommand of "set server" with an exception when the fqdn is set. The resolver's lock is first acquired. Of course, this means we must be sure to have a resolver to lock. This patch must be backported as far as 1.8.	2021-06-17 16:52:14 +02:00
Christopher Faulet	a386e78823	BUG/MINOR: server: Forbid to set fqdn on the CLI if SRV resolution is enabled If a server is configured to rely on a SRV resolution, we must forbid to change its fqdn on the CLI. Indeed, in this case, the server retrieves its fqdn from the SRV resolution. If the fqdn is changed via the CLI, this conflicts with the SRV resolution and leaves the server in an undefined state. Most of time, the SRV resolution remains enabled with no effect on the server (no update). Some time the A/AAAA resolution for the new fqdn is not enabled at all. It depends on the server state and resolver state when the CLI command is executed. This patch must be backported as far as 2.0 (maybe to 1.8 too ?) after some observation period.	2021-06-17 16:17:14 +02:00
Miroslav Zagorac	8a8f270f6a	CLEANUP: server: a separate function for initializing the per_thr field To avoid repeating the same source code, allocating memory and initializing the per_thr field from the server structure is transferred to a separate function.	2021-06-17 16:07:10 +02:00
Ilya Shipitsin	213bb99f9e	CLEANUP: assorted typo fixes in the code and comments This is 24th iteration of typo fixes	2021-06-17 09:02:16 +02:00
Willy Tarreau	3a53707160	BUG/MINOR: mux-h2/traces: bring back the lost "sent H2 REQ/RES" traces In 2.4, commit `d1ac2b90c` ("MAJOR: htx: Remove the EOM block type and use HTX_FL_EOM instead") changed the HTX processing to destroy the blocks as they are processed. So the traces that were emitted at the end of the send headers functions didn't have anything to show. Let's move these traces earlier in the function, right before the HTX processing, so that everything is still in place. This should be backported to 2.4.	2021-06-17 08:43:43 +02:00
Willy Tarreau	29268e9a3c	BUG/MINOR: mux-h2/traces: bring back the lost "rcvd H2 REQ" trace Since commit `7d013e796` ("BUG/MEDIUM: mux-h2: Xfer rxbuf to the upper layer when creating a front stream"), the rxbuf is lost during the call to h2c_frt_stream_new(), so the trace that happens later cannot find a request there and we've lost the useful part indicating what the request looked like. Let's move the trace before this call. This should be backported to 2.4.	2021-06-17 08:43:27 +02:00
Willy Tarreau	ee4684f65b	MINOR: mux-h2: obey http-ignore-probes during the preface We're seeing some browsers setting up multiple connections and closing some to just keep one. It looks like they do this in case they'd negotiate H1. This results in aborted prefaces and log pollution about bad requests and "PR--" in the status flags. We already have an option to ignore connections with no data, it's called http-ignore-probes. But it was not used by the H2 mux. However it totally makes sense to use it during the preface. This patch changes this so that connections aborted before sending the preface can avoid being logged. This should be backported to 2.4 and 2.3 at least, and probably even as far as 2.0.	2021-06-17 08:08:48 +02:00
Willy Tarreau	fc8e438637	BUG/MINOR: stats: make "show stat typed desc" work again As part of the changes to support per-module stats data in 2.3-dev6 with commit `ee63d4bd6` ("MEDIUM: stats: integrate static proxies stats in new stats"), a small change resulted in the description field to be replaced by the name field, making it pointless. Let's fix this back. This should fix issue #1291. Thanks to Nick Ramirez for reporting this issue. This patch can be backported to 2.3.	2021-06-17 07:25:22 +02:00
Willy Tarreau	9abb317683	CLEANUP: mux-h2/traces: better align user messages "sent H2 request" was already misaligned with the 3 other ones (sent/rcvd, request/response), and now with "new H2 connection" that's yet another alignment making the traces even less legible. Let's just realign all 5 messages, this even eases quick pointer comparisons. This should probably be backported to 2.4 as it's where it's the most likely to be used in the mid-term.	2021-06-16 18:32:42 +02:00
Willy Tarreau	8e6f749f18	MINOR: mux-h2/trace: report a few connection-level info during h2_init() It is currently very difficult to match some H2 trace outputs against some log extracts because there's no exactly equivalent info. This patch tries to address this by adding a TRACE_USER() call in h2_init() that is matched in h2_trace() to report: - connection pointer and direction - frontend's name or server's name - transport layer and control layer (e.g. "SSL/tcpv4") - source and/or destination depending on what is set This now permits to get something like this at verbosity level complete: <0>2021-06-16T18:30:19.810897+02:00 [00\|h2\|1\|mux_h2.c:1006] new H2 connection : h2c=0x19fee50(F,PRF) : conn=0x7f373c026850(IN) fe=h2gw RAW/tcpv4 src=127.0.0.1:19540 <0>2021-06-16T18:30:19.810919+02:00 [00\|h2\|1\|mux_h2.c:2731] rcvd H2 request : h2c=0x19fee50(F,FRH) <0>2021-06-16T18:30:19.810998+02:00 [00\|h2\|1\|mux_h2.c:1006] new H2 connection : h2c=0x1a04ee0(B,PRF) : conn=0x1a04ce0(OUT) sv=h2gw/s1 RAW/tcpv4 dst=127.0.0.1:4446	2021-06-16 18:30:42 +02:00
Willy Tarreau	d943a044aa	MINOR: connection: add helper conn_append_debug_info() This function appends to a buffer some information from a connection. This will be used by traces and possibly some debugging as well. A frontend/backend/server, transport/control layers, source/destination ip:port, connection pointer and direction are reported depending on the available information.	2021-06-16 18:30:42 +02:00
Willy Tarreau	b74debd826	BUG/MINOR: mux-h1: do not skip the error response on bad requests Since 2.4-dev3 with commit `c4bfa59f1` ("MAJOR: mux-h1: Create the client stream as later as possible"), a request error doesn't result in any error response if "option http-ignore-probes" is set, there's just a close. This is caused by an unneeded b_reset() in h1_process_demux()'s error path, which makes h1_handle_bad_req() believe there was an empty request. There is no reason for this reset to be there, it must have been a leftover of an earlier attempt at dealing with the error, let's drop it. This should be backported to 2.4.	2021-06-16 15:06:43 +02:00
Willy Tarreau	f9a7c442f6	MINOR: backend: only skip LB when there are actual connections In 2.3, a significant improvement was brought against situations where the queue was heavily used, because some LB algos were still checked for no reason before deciding to put the request into the queue. This was commit `82cd5c13a` ("OPTIM: backend: skip LB when we know the backend is full"). As seen in previous commit ("BUG/MAJOR: queue: set SF_ASSIGNED when setting strm->target on dequeue") the dequeuing code is extremely tricky, and the optimization above tends to emphasize transient issues by making them permanent until the next reload, which is not acceptable as the code must always be robust against any bad situation. This commit brings a protection against such a situation by slightly relaxing the test. Instead of checking that there are pending connections in the backend queue, it also verifies that the backend's connections are not solely composed of queued connections, which would then indicate we are in this situation. This is not rocket science, but at least if the situation happens, we know that it will unlock by itself once the streams have left, as new requests will be allowed to reach the servers and to flush the queue again. This needs to be backported to 2.4 and 2.3.	2021-06-16 09:05:35 +02:00
Willy Tarreau	7867cebf31	BUG/MAJOR: queue: set SF_ASSIGNED when setting strm->target on dequeue Commit `82cd5c13a` ("OPTIM: backend: skip LB when we know the backend is full") has uncovered a long-burried bug in the dequeing code: when a server releases a connection, it picks a new one from the proxy's or its queue. Technically speaking it only picks a pendconn which is a link between a position in the queue and a stream. It then sets this pendconn's target to itself, and wakes up the stream's task so that it can try to connect again. The stream then goes through the regular connection setup phases, calls back_try_conn_req() which calls pendconn_dequeue(), which sets the stream's target to the pendconn's and releases the pendconn. It then reaches assign_server() which sees no SF_ASSIGNED and calls assign_server_and_queue() to perform load balancing or queuing. This one first destroys the stream's target and gets ready to perform load balancing. At this point we're load-balancing for no reason since we already knew what server was available. And this is where the commit above comes into play: the check for the backend's queue above may detect other connections that arrived in between, and will immediately return FULL, forcing this request back into the queue. If the server had a very low maxconn (e.g. 1 due to a long slowstart), it's possible that this evicted connection was the last one on the server and that no other one will ever be present to process the queue. Usually a regularly processed request will still have its own srv_conn that will be used during stream_free() to dequeue other connections. But if the server had a down-up cycle, then a call to pendconn_grab_from_px() may start to dequeue entries which had no srv_conn and which will have no server slot to offer when they expire, thus maintaining the situation above forever. Worse, as new requests arrive, there are always some requests in the queue and the situation feeds on itself. The correct fix here is to properly set SF_ASSIGNED in pendconn_dequeue() when the stream's target is assigned (as it's what this flag means), so as to avoid a load-balancing pass when dequeuing. Many thanks to Pierre Cheynier for the numerous detailed traces he provided that helped narrow this problem down. This could be backported to all stable versions, but in practice only 2.3 and above are really affected since the presence of the commit above. Given how tricky this code is it's better to limit it to those versions that really need it.	2021-06-16 09:05:35 +02:00
Willy Tarreau	6fd0450b47	CLEANUP: shctx: remove the different inter-process locking techniques With a single process, we don't need to USE_PRIVATE_CACHE, USE_FUTEX nor USE_PTHREAD_PSHARED anymore. Let's only keep the basic spinlock to lock between threads.	2021-06-15 16:52:42 +02:00
Willy Tarreau	b54ca70e7c	MEDIUM: config: warn about "bind-process" deprecation Let's indicate that "bind-process" is deprecated and scheduled for removal in 2.7, as it only supports "1".	2021-06-15 16:52:42 +02:00
Willy Tarreau	e8422bf56b	MEDIUM: global: remove the relative_pid from global and mworker The relative_pid is always 1. In mworker mode we also have a child->relative_pid which is always equalt relative_pid, except for a master (0) or external process (-1), but these types are usually tested for, except for one place that was amended to carefully check for the PROC_O_TYPE_WORKER option. Changes were pretty limited as most usages of relative_pid were for designating a process in stats output and peers protocol.	2021-06-15 16:52:42 +02:00
Willy Tarreau	06987f4238	CLEANUP: global: remove unused definition of MAX_PROCS This one was forced to 1 and the only reference was a test to verify it was comprised between 1 and LONGBITS.	2021-06-15 16:52:42 +02:00
Willy Tarreau	44ea631b77	MEDIUM: cpu-set: make the proc a single bit field and not an array We only have a single process now so we don't need to store the per-proc CPU binding anymore.	2021-06-15 16:52:42 +02:00
Willy Tarreau	bda7c1decd	MEDIUM: config: simplify cpu-map handling As there's no more nbproc>1, we can remove some loops and tests in cpu-map. Both the lack of thread number and thread 1 can count as the whole process now (which is still used for whole process binding when threads are disabled).	2021-06-15 16:52:42 +02:00
Willy Tarreau	72faef3866	MEDIUM: global: remove dead code from nbproc/bind_proc removal Lots of places iterating over nbproc or comparing with nbproc could be simplified. Further, "bind-process" and "process" parsing that was already limited to process 1 or "all" or "odd" resulted in a bind_proc field that was either 0 or 1 during the init phase and later always 1. All the checks for compatibilities were removed since it's not possible anymore to run a frontend and a backend on different processes or to have peers and stick-tables bound on different ones. This is the largest part of this patch. The bind_proc field was removed from both the proxy and the receiver structs. Since the "process" and "bind-process" directives are still parsed, configs making use of correct values allowing process 1 will continue to work.	2021-06-15 16:52:42 +02:00
Willy Tarreau	5301f5d72a	CLEANUP: global: remove pid_bit and all_proc_mask They were already set to 1 and never changed. Let's remove them and replace their references with 1.	2021-06-15 16:52:42 +02:00
Willy Tarreau	91358595f8	CLEANUP: global: remove the nbproc field from the global structure Let's use 1 in the rare places where it was still referenced since it's now its only possible value.	2021-06-15 16:52:42 +02:00
Willy Tarreau	6185a0343b	MINOR: mworker: remove the initialization loop over processes There was a loop used to prepare structures for all current processes. Let's just assume there's a single iteration now.	2021-06-15 16:52:42 +02:00
Willy Tarreau	d67ff340a5	MEDIUM: init: remove the loop over processes during init There was a loop iterating over all nbproc values during init that couldn't be immediately removed because the loop's index was used to distinguish a child from a parent. That's now fixed by replacing the iterator with an in_parent flag. All bindings that were checking (1UL << proc) or cpu_map.proc[proc] were adjusted to always use zero for proc.	2021-06-15 16:52:42 +02:00
Willy Tarreau	e34cf28011	BUG/MINOR: mworker: fix typo in chroot error message Since its introduction in 1.8 with commit `095ba4c24` ("MEDIUM: mworker: replace systemd mode by master worker mode"), it says "cannot chroot1(...)" which seems to be a leftover of a debug message. It could be backported but probably nobody will notice.	2021-06-15 16:52:07 +02:00
Willy Tarreau	4c19e99621	BUG/MINOR: ssl: use atomic ops to update global shctx stats The global shctx lookups and misses was updated without using atomic ops, so the stats available in "show info" are very likely off by a few units over time. This should be backported as far as 1.8. Versions without _HA_ATOMIC_INC() can use HA_ATOMIC_ADD(,1).	2021-06-15 16:52:07 +02:00
Willy Tarreau	9e467af804	BUG/MEDIUM: shctx: use at least thread-based locking on USE_PRIVATE_CACHE Since threads were introduced in 1.8, the USE_PRIVATE_CACHE mode of the shctx was not updated to use locks. Originally it was meant to disable sharing between processes, so it removes the lock/unlock instructions. But with threads enabled, it's not possible to work like this anymore. It's easy to see that once built with private cache and threads enabled, sending violent SSL traffic to the the process instantly makes it die. The HTTP cache is very likely affected as well. This patch addresses this by falling back to our native spinlocks when USE_PRIVATE_CACHE is used. In practice we could use them also for other modes and remove all older implementations, but this patch aims at keeping the changes very low and easy to backport. A new SHCTX_LOCK label was added to help with debugging, but OTHER_LOCK might be usable as well for backports. An even lighter approach for backports may consist in always declaring the lock (or reusing "waiters"), and calling pl_take_s() for the lock() and pl_drop_s() for the unlock() operation. This could even be used in all modes (process and threads), even when thread support is disabled. Subsequent patches will further clean up this area. This patch must be backported to all supported versions since 1.8.	2021-06-15 16:52:07 +02:00
Amaury Denoyelle	8ff0434b61	BUG/MEDIUM: server: do not auto insert a dynamic server in px addr_node Until then, the servers were automatically attached on their creation into the proxy addr_node tree via _srv_parse_init. In case of an invalid dynamic server which is instantly freed, no detach operation was made leaving a NULL server in the tree. Change this mode of operation by marking the attach operation as optional in _srv_parse_init. This operation is not conduct for a dynamic server. The server is attached only at the end of the CLI handler when it is marked as valid. This must be backported up to 2.4.	2021-06-15 11:42:53 +02:00
Amaury Denoyelle	1613b4a75d	BUG/MINOR: server: do not keep an invalid dynamic server in px ids tree A bug is present when trying to create a dynamic server with a fixed id. If the server is detected invalid due to a later parsing arguments error, the server is not removed from the proxy used ids tree before being freed. Change the mode of operation of 'id' keyword parsing handler. The insertion in the backend tree is removed from the handler and is not taken in charge by parse_server for configuration parsing. For the dynamic servers, the insertion is called at the end of the 'add server' CLI handler when the server has been validated. This must be backported up to 2.4.	2021-06-15 11:42:53 +02:00
Amaury Denoyelle	406aaef55a	BUG/MEDIUM: server: do not forget to generate the dynamic servers ids If no id is specified by the user for a dynamic server, it is necessary to generate a new one. This operation is now done at the end of 'add server' CLI handler. The server is then inserted into the proxy ids tree. Without this, several features may be broken for dynamic servers. Among them, there is the "first" lb algorithm, the persistence using stick-tables or the uniqueness internal check of srv_parse_id. This must be backported up to 2.4.	2021-06-15 11:42:53 +02:00
Amaury Denoyelle	82d7f77463	BUG/MEDIUM: server: clear dynamic srv on delete from proxy id/name trees Do not leave deleted server in used_server_id/used_server_addr backend trees. This might lead to crashes if a deleted server is used through these trees. At this moment, dynamic servers are only added in used_server_id if they have a fixed id. They are never inserted in used_server_addr as this code is missing. So these new delete instructions are noop. However, a fix will be provided soon to insert properly all dynamic servers in both used_server_id and used_server_addr trees so the deletion counterpart will be mandatory in the CLI server delete handler. This must be backported to 2.4.	2021-06-15 11:38:06 +02:00
Amaury Denoyelle	31ddd76fef	BUG/MEDIUM: server: extend thread-isolate over much of CLI 'add server' Some config parsing handlers were designed to be run at startup on a single-thread. When executing at runtime for dynamic servers, thread-safety is not guaranteed. This is the case for example in srv_parse_id which manipulates backend used_ids tree. One solution could be to add locks but it might be tricky to found all affected functions and it can be an easy source of deadlock. The other solution which has been chosen is to use thread-isolation over almost all of the cli_parse_add_server CLI handler. For now this solution is sufficient. If some users make heavy use of the 'add server', hurting the overall performance, it will be necessary to design a much thinner solution. This must be backported up to 2.4.	2021-06-15 11:19:43 +02:00
Amaury Denoyelle	077c6b8d29	BUG/MINOR: stick-table: insert srv in used_name tree even with fixed id If the server id is fixed in the configuration, it is immediately inserted in the 'used_server_id' backend tree via srv_parse_id. On check_config_validity, the dynamic id generation is thus skipped for fixed-id servers. However, it must nevertheless be inserted in the 'used_server_name' backend tree. This bug seems to be not noticeable for the user. Indeed, before the fix, the search in sticking_rule_find_target always returned NULL for the name, then the fallback search with server id succeeded, so the persistence is properly applied. However with the fix the fallback search is not executed anymore, which saves from the locking of STK_SESS. This should be backported up to 2.0.	2021-06-15 10:50:02 +02:00
Remi Tricot-Le Breton	6916493c29	MINOR: ssl: Use OpenSSL's ASN1_TIME convertor when available The ASN1_TIME_to_tm function was added in OpenSSL1.1.1 so with this version of the library we do not need our homemade time convertor anymore.	2021-06-14 15:12:53 +02:00
Emeric Brun	caef19e0c7	BUG/MAJOR: resolvers: segfault using server template without SRV RECORDs This patch fix the issue adding a test in srvrq before registering the server on it during server template init. This was a regression due to commit : `3406766d57` This should be backported with this previous commit (until 2.0)	2021-06-14 11:04:02 +02:00
Willy Tarreau	2a651e2d0d	BUILD: log: remove unused fmt_directive() fmt_directive() became unused after the removal of the deprecated tags, and it emits a warning on some compilers. Let's drop it.	2021-06-11 17:32:03 +02:00
Willy Tarreau	3ae1d1eab9	BUILD: init: remove initialization of multi-process thread mappings This broke the build with recent compilers and is not used anyway.	2021-06-11 17:28:19 +02:00
Willy Tarreau	b63dbb7b2e	MAJOR: config: remove parsing of the global "nbproc" directive This one was deprecated in 2.3 and marked for removal in 2.5. It suffers too many limitations compared to threads, and prevents some improvements from being engaged. Instead of a bypassable startup error, there is now a hard error. The parsing code was removed, and very few obvious cases were as well. The code is deeply rooted at certain places (e.g. "for" loops iterating from 0 to nbproc) so it will not be that trivial to remove everywhere. The "bind" and "bind-process" parsers will have to be adjusted, though maybe not completely changed if we later want to support thread groups for large NUMA machines. Some stats socket restrictions were removed, and the doc was updated according to what was done. A few places in the doc still refer to nbproc and will have to be revisited. The master-worker code also refers to the process number to distinguish between master and workers and will have to be carefully adjusted. The MAX_PROCS macro was reset to 1, this will at least reduce the size of some remaining arrays. Two regtests were dependieng on this directive, one with an explicit "nbproc 1" and another one testing the master's CLI using nbproc 4. Both were adapted.	2021-06-11 17:02:13 +02:00
Willy Tarreau	eb778248d9	MEDIUM: proxy: remove the deprecated "grace" keyword Commit `ab0a5192a` ("MEDIUM: config: mark "grace" as deprecated") marked the "grace" keyword as deprecated in 2.3, tentative removal for 2.4 with a hard deadline in 2.5, so let's remove it and return an error now. This old and outdated feature was incompatible with soft-stop, reload and socket transfers, and keeping it forced ugly hacks in the lower layers of the protocol stack.	2021-06-11 16:57:34 +02:00
Willy Tarreau	d2f2537d1b	MINOR: config: remove deprecated option "http-tunnel" It was marked as deprecated in 2.1-dev2 and for removal in 2.2, but it was missed. A warning was already emitted and the doc didn't refer to it any more, let's now get rid of it.	2021-06-11 16:57:34 +02:00
Willy Tarreau	6ba69841f8	MINOR: config: reject long-deprecated "option forceclose" It's been warning as being deprecated since 2.0-dev4, it's about time to drop it now. The error message recommends to either remove it or use "option httpclose" instead. It's still referred to in the old internal doc about the connection header, which itself seems highly inaccurate by now.	2021-06-11 16:57:34 +02:00
Willy Tarreau	4a83977283	MINOR: http: remove the long deprecated "set-cookie()" sample fetch function This one was marked as deprecated 9 years ago by commit `28376d62c` ("MEDIUM: http: merge ACL and pattern cookie fetches into a single one") and has disappeared from any documentation, so it never appeared in any released version. Let's remove it now.	2021-06-11 16:57:34 +02:00
Willy Tarreau	fd6ab66041	MINOR: log: remove the long-deprecated early log-format tags The following 10 log-format tags were implemented during log-format development and changed before the release. They were marked as deprecated in 2012 by commit `2beef5888` ("MEDIUM: log: change a few log tokens to make them easier to remember") and were not documented. They've been emitting a warning since then, with a suggestion of the one to use instead. Let's get rid of them now. Bi => bi, Bp => bp, Ci => ci, Cp => cp, Fi => fi Fp => fp, Si => si, Sp => sp, cc => CC, cs => CS	2021-06-11 16:57:34 +02:00
Willy Tarreau	9862787e8f	MINOR: config: completely remove support for "no option http-use-htx" This one used to still be supported, emitting a warning about it being deprecated and the default since 2.1. Let's remove it now.	2021-06-11 16:57:34 +02:00
Willy Tarreau	eb9d90a5a2	MINOR: config: remove support for deprecated option "tune.chksize" It was marked as deprecated for immediate removal as it was not used, let's reject it and remove it from the doc. A specific error suggests to check tune.bufsize instead.	2021-06-11 16:57:34 +02:00
Christopher Faulet	85af93b8c7	BUG/MINOR: server-state: load SRV resolution only if params match the config When the state of a server is loaded, if there is no hostname defined for this server and if a fqdn and a server record are retrieved from the state file, it means the server should rely on a SRV resolution. But we must be sure the server is configured this way. A SRV resolution must be configured with the same SRV record. This part must be skipped if there is no SRV resolution configured for this server or if the SRV record used is not the same. This patch should be backported as far as 1.8 after some observation period.	2021-06-11 16:16:20 +02:00
Emeric Brun	3406766d57	MEDIUM: resolvers: add a ref between servers and srv request or used SRV record This patch add a ref into servers to register them onto the record answer item used to set their hostnames. It also adds a head list into 'srvrq' to register servers free to be affected to a SRV record. A head of a tree is also added to srvrq to put servers which present a hotname in server state file. To re-link them fastly to the matching record as soon an item present the same name. This results in better performances on SRV record response parsing. This is an optimization but it could avoid to trigger the haproxy's internal wathdog in some circumstances. And for this reason it should be backported as far we can (2.0 ?)	2021-06-11 16:16:16 +02:00
Emeric Brun	bd78c912fd	MEDIUM: resolvers: add a ref on server to the used A/AAAA answer item This patch adds a head list into answer items on servers which use this record to set their IPs. It makes lookup on duplicated ip faster and allow to check immediatly if an item is still valid renewing the IP. This results in better performances on A/AAAA resolutions. This is an optimization but it could avoid to trigger the haproxy's internal wathdog in some circumstances. And for this reason it should be backported as far we can (2.0 ?)	2021-06-11 16:16:16 +02:00
Emeric Brun	12ca658dbe	BUG/MINOR: resolvers: answser item list was randomly purged or errors In case of SRV records, The answer item list was purged by the error callback of the first requester which considers the error could not be safely ignored. It makes this item list unavailable for subsequent requesters even if they consider the error could be ignored. On A resolution or do_resolve action error, the answer items were never trashed. This patch re-work the error callbacks and the code to check the return code If a callback return 1, we consider the error was ignored and the answer item list must be kept. At the opposite, If all error callbacks of all requesters of the same resolution returns 0 the list will be purged This patch should be backported as far as 2.0.	2021-06-11 16:16:16 +02:00
Christopher Faulet	0fe1864f7d	CLEANUP: l7-retries: do not test the buffer before calling b_alloc() The return value is enough now to know if the allocation succeeded or failed. This cleanup was already pushed by Willy (`f499f50`) but a revert crushed it. It may be backported to the 2.4 because the original patch was done on this version.	2021-06-11 16:04:28 +02:00
Christopher Faulet	bf76df12a6	BUG/MINOR: h1-htx: Fix a signess bug with char data type when parsing chunk size On some platform, a char may be unsigned. Of course, we should not rely on the signess of a char to be portable. Unfortunatly, since the commit a835f3cb ("MINOR: h1-htx: Use a correlation table to speed-up small chunks parsing") we rely on it to test the value retrieved from the hexadecimal correlation table when the size of a chunk is parsed. To fix the bug, we now test the result is in the range [0,15] with a bitwise AND. This patch should fix the issue #1272. It is 2.5-specific, no backport is needed except if the commit above is backported.	2021-06-11 14:15:48 +02:00
Christopher Faulet	5cd0e528cf	BUG/MINOR: mux-fcgi: Expose SERVER_SOFTWARE parameter by default As specified in the RFC3875 (section 4.1.17), this parameter must be set to the name and version of the information server software making the CGI request. Thus, it is now added to the default parameters defined by HAProxy. It is set to the string "HAProxy $version". This patch should fix the issue #1285 and must be backported as far as 2.2.	2021-06-11 14:15:48 +02:00
Christopher Faulet	1cf414b522	BUG/MAJOR: htx: Fix htx_defrag() when an HTX block is expanded When an HTX block is expanded, a defragmentation may be performed first to have enough space to copy the new data. When it happens, the meta data of the HTX message must take account of the new data length but copied data are still unchanged at this stage (because we need more space to update the message content). And here there is a bug because the meta data are updated by the caller. It means that when the blocks content is copied, the new length is already set. Thus a block larger than the reality is copied and data outside the buffer may be accessed, leading to a crash. To fix this bug, htx_defrag() is updated to use an extra argument with the new meta data to use for the referenced block. Thus the caller does not need to update the HTX message by itself. However, it still have to update the data. Most of time, the bug will be encountered in the HTTP compression filter. But, even if it is highly unlikely, in theory it is also possible to hit it when a HTTP header (or only its value) is replaced or when the start-line is changed. This patch must be backported as far as 2.0.	2021-06-11 14:05:34 +02:00
Remi Tricot-Le Breton	3faf0cbba6	BUILD: ssl: Fix compilation with BoringSSL The ifdefs surrounding the "show ssl ocsp-response" functionality that were supposed to disable the code with BoringSSL were built the wrong way. It does not need to be backported.	2021-06-10 19:01:13 +02:00
Willy Tarreau	8715dec6f9	MEDIUM: pools: remove the locked pools implementation Now that the modified lockless variant does not need a DWCAS anymore, there's no reason to keep the much slower locked version, so let's just get rid of it.	2021-06-10 17:46:50 +02:00
Willy Tarreau	2a4523f6f4	BUG/MAJOR: pools: fix possible race with free() in the lockless variant In GH issue #1275, Fabiano Nunes Parente provided a nicely detailed report showing reproducible crashes under musl. Musl is one of the libs coming with a simple allocator for which we prefer to keep the shared cache. On x86 we have a DWCAS so the lockless implementation is enabled for such libraries. And this implementation has had a small race since day one: the allocator will need to read the first object's <next> pointer to place it into the free list's head. If another thread picks the same element and immediately releases it, while both the local and the shared pools are too crowded, it will be freed to the OS. If the libc's allocator immediately releases it, the memory area is unmapped and we can have a crash while trying to read that pointer. However there is no problem as long as the item remains mapped in memory because whatever value found there will not be placed into the head since the counter will have changed. The probability for this to happen is extremely low, but as analyzed by Fabiano, it increases with the buffer size. On 16 threads it's relatively easy to reproduce with 2MB buffers above 200k req/s, where it should happen within the first 20 seconds of traffic usually. This is a structural issue for which there are two non-trivial solutions: - place a read lock in the alloc call and a barrier made of lock/unlock in the free() call to force to serialize operations; this will have a big performance impact since free() is already one of the contention points; - change the allocator to use a self-locked head, similar to what is done in the MT_LISTS. This requires two memory writes to the head instead of a single one, thus the overhead is exactly one memory write during alloc and one during free; This patch implements the second option. A new POOL_DUMMY pointer was defined for the locked pointer value, allowing to both read and lock it with a single xchg call. The code was carefully optimized so that the locked period remains the shortest possible and that bus writes are avoided as much as possible whenever the lock is held. Tests show that while a bit slower than the original lockless implementation on large buffers (2MB), it's 2.6 times faster than both the no-cache and the locked implementation on such large buffers, and remains as fast or faster than the all implementations when buffers are 48k or higher. Tests were also run on arm64 with similar results. Note that this code is not used on modern libcs featuring a fast allocator. A nice benefit of this change is that since it removes a dependency on the DWCAS, it will be possible to remove the locked implementation and replace it with this one, that is then usable on all systems, thus significantly increasing their performance with large buffers. Given that lockless pools were introduced in 1.9 (not supported anymore), this patch will have to be backported as far as 2.0. The code changed several times in this area and is subject to many ifdefs which will complicate the backport. What is important is to remove all the DWCAS code from the shared cache alloc/free lockless code and replace it with this one. The pool_flush() code is basically the same code as the allocator, retrieving the whole list at once. If in doubt regarding what barriers to use in older versions, it's safe to use the generic ones. This patch depends on the following previous commits: - MINOR: pools: do not maintain the lock during pool_flush() - MINOR: pools: call malloc_trim() under thread isolation - MEDIUM: pools: use a single pool_gc() function for locked and lockless The last one also removes one occurrence of an unneeded DWCAS in the code that was incompatible with this fix. The removal of the now unused seq field will happen in a future patch. Many thanks to Fabiano for his detailed report, and to Olivier for his help on this issue.	2021-06-10 17:46:50 +02:00
Willy Tarreau	9b3ed51371	MEDIUM: pools: use a single pool_gc() function for locked and lockless Locked and lockless shared pools don't need to use a different pool_gc() function because this function isolates itself during the operation, so we do not need to rely on DWCAS nor any atomic operation in fact. Let's just get rid of the lockless one in favor of the simple one. This should even result in a faster execution. The ifdefs were slightly moved so that we can have pool_gc() defined as soon as there are global pools, this avoids duplicating the function.	2021-06-10 17:46:50 +02:00
Willy Tarreau	26ed183556	MINOR: pools: call malloc_trim() under thread isolation pool_gc() was adjusted to run under thread isolation by commit `c0e2ff202` ("MEDIUM: memory: make pool_gc() run under thread isolation") so that the underlying malloc() and free() don't compete between threads during these potentially aggressive moments (especially when mmap/munmap are involved). Commit `88366c292` ("MEDIUM: pools: call malloc_trim() from pool_gc()") later added a call to malloc_trim() but made it outside of the thread isolation, which is contrary to the principle explained above. Also it missed it in the locked version, meaning that those without a lockless implementation cannot benefit from trimming. This patch fixes that by calling it before thread_release() in both places.	2021-06-10 17:46:50 +02:00
Willy Tarreau	c88914379d	MINOR: pools: do not maintain the lock during pool_flush() The locked version of pool_flush() is absurd, it locks the pool for each and every element to be released till the end. Not only this is extremely inefficient, but it may even never finish if other threads spend their time refilling the pool. The only case where this can happen is during soft-stop so the risk remains limited, but it should be addressed.	2021-06-10 17:46:50 +02:00
Willy Tarreau	9a7aa3b4a1	BUG/MINOR: pools: make DEBUG_UAF always write to the to-be-freed location Since the code was reorganized, DEBUG_UAF was still tested in the locked pool code despite pools being disabled when DEBUG_UAF is used. Let's move the test to pool_put_to_os() which is the one that is always called in this condition. The impact is only a possible misleading analysis during a troubleshooting session due to a missing double-frees or free of const area test that is normally already dealt with by the underlying code anyway. In practice it's unlikely anyone will ever notice. This should only be backported to 2.4.	2021-06-10 17:46:50 +02:00
Willy Tarreau	c239cde26f	BUG/MINOR: pools: fix a possible memory leak in the lockless pool_flush() The lockless version of pool_flush() had a leftover of the original version causing the pool's first entry to be set to NULL at the end. The problem is that it does this outside of any lock and in a non- atomic way, so that any concurrent alloc+free would result in a lost object. The risk is low and the consequence even lower, given that pool_flush() is only used in pool_destroy() (hence single-threaded) or by stream_free() during a soft-stop (not the place where most allocations happen), so in the worst case it could result in valgrind complaining on soft-stop. The bug was introduced with the first version of the code, in 1.9, so the fix can be backported to all stable versions.	2021-06-10 17:46:50 +02:00
Amaury Denoyelle	efbf35caf9	BUG/MINOR: server: explicitly set "none" init-addr for dynamic servers Define srv.init_addr_methods to SRV_IADDR_NONE on 'add server' CLI handler. This explicitly states that no resolution will be made on the server creation. This is not a real bug as the default value (SRV_IADDR_END) has the same effect in practice. However the intent is clearer and prevent to use the default "libc,last" by mistake which cannot execute on runtime (blocking call + file access via gethostbyname/getaddrinfo). The doc is also updated to reflect this limitation. This should be backported up to 2.4.	2021-06-10 17:44:05 +02:00
Remi Tricot-Le Breton	6056e61ae2	MINOR: ssl: Add the "show ssl cert foo.pem.ocsp" CLI command Add the ability to dump an OCSP response details through a call to "show ssl cert cert.pem.ocsp". It can also be used on an ongoing transaction by prefixing the certificate name with a '*'. Even if the ckch structure holds an ocsp_response buffer, we still need to look for the actual ocsp response entry in the ocsp response tree rather than just dumping the ckch's buffer details because when updating an ocsp response through a "set ssl ocsp-response" call, the corresponding buffer in the ckch is not updated accordingly. So this buffer, even if it is not empty, might hold an outdated ocsp response.	2021-06-10 16:44:11 +02:00
Remi Tricot-Le Breton	da968f69c7	MINOR: ssl: Add the OCSP entry key when displaying the details of a certificate This patch adds an "OCSP Response Key" information in the output of a "show ssl cert <certfile>" call. The key can then be used in a "show ssl ocsp-response <key>" CLI command.	2021-06-10 16:44:11 +02:00
Remi Tricot-Le Breton	d92fd11c77	MINOR: ssl: Add new "show ssl ocsp-response" CLI command This patch adds the "show ssl ocsp-response [<id>]" CLI command. This command can be used to display the IDs of the OCSP tree entries along with details about the entries' certificate ID (issuer's name and key hash + serial number), or to display the details of a single ocsp-response if an ID is given. The details displayed in this latter case are the ones shown by a "openssl ocsp -respin <ocsp-response> -text" call.	2021-06-10 16:44:11 +02:00
Remi Tricot-Le Breton	5aa1dce5ee	MINOR: ssl: Keep the actual key length in the certificate_ocsp structure The OCSP tree entry key is a serialized version of the OCSP_CERTID of the entry which is stored in a buffer that can be at most 128 bytes. Depending on the length of the serial number, the actual non-zero part of the key can be smaller than 128 bytes and this new structure member allows to know how many of the bytes are filled. It will be useful when dumping the key (in a "show ssl cert <cert>" output for instance).	2021-06-10 16:44:11 +02:00
Christopher Faulet	12554d00f6	BUG/MEDIUM: compression: Add a flag to know the filter is still processing data Since the commit `acfd71b97` ("BUG/MINOR: http-comp: Preserve HTTP_MSGF_COMPRESSIONG flag on the response"), there is no more flag to know when the compression ends. This means it is possible to finish the compression several time if there are trailers. So, we reintroduce almost the same mechanism but with a dedicated flag. So now, there is a bits field in the compression filter context. The commit above is marked to be backported as far as 2.0. Thus this patch must also be backported as far as 2.0.	2021-06-10 08:57:55 +02:00
Christopher Faulet	402740c3ad	BUG/MEDIUM: compression: Properly get the next block to iterate on payload When a DATA block is compressed, or when the compression context is finished on a TLR/EOT block, the next block used to loop on the HTX message must be refreshed because a defragmentation may have occurred. This bug was introduced when the EOM block was removed in 2.4. Thus, this patch must be backported to 2.4.	2021-06-10 08:57:55 +02:00

1 2 3 4 5 ...

11766 commits