Many of our subsystems are now starting to be shared by thread groups
(listeners, queues, stick-tables, stats, idle connections, LB algos).
This has made it possible to recover the performance that used to be
out of reach on loosely shared platforms (typically AMD EPYC systems),
but in parallel, other large unified systems (Xeon and large Arm in
general) still suffer from the remaining contention when too many
threads are placed in a group.

A first test running on a 64-core Neoverse-N1 processor, with a single
backend with one server and no LB algo specified, shows 1.58 Mrps with
64 threads per group and 1.71 Mrps with 16 threads per group. The
difference is essentially spent updating stats counters everywhere.

Another test is the connection:close mode, delivering 85 kcps with
64 threads per group and 172 kcps (202%) with 16 threads per group.
In this case it's mostly the more numerous listeners that improve
the situation, as the change is mostly in the kernel:

max-threads-per-group 64:

  # perf top
  Samples: 244K of event 'cycles', 4000 Hz, Event count (approx.): 61065854708 los
  Overhead  Shared Object  Symbol
    10.41%  [kernel]       [k] queued_spin_lock_slowpath
    10.36%  [kernel]       [k] _raw_spin_unlock_irqrestore
     2.54%  [kernel]       [k] _raw_spin_lock
     2.24%  [kernel]       [k] handle_softirqs
     1.49%  haproxy        [.] process_stream
     1.22%  [kernel]       [k] _raw_spin_lock_bh

  # h1load
  time conns tot_conn tot_req tot_bytes err  cps  rps  bps   ttfb
     1  1024    84560   83536   4761666   0 84k5 83k5 38M0 11.91m
     2  1024   168736  167713   9559698   0 84k0 84k0 38M3 11.98m
     3  1024   253865  252841  14412165   0 85k0 85k0 38M7 11.84m
     4  1024   339143  338119  19272783   0 85k1 85k1 38M8 11.80m
     5  1024   424204  423180  24121374   0 84k9 84k9 38M7 11.86m

max-threads-per-group 16:

  # perf top
  Samples: 1M of event 'cycles', 4000 Hz, Event count (approx.): 375998622679 lost
  Overhead  Shared Object  Symbol
    15.20%  [kernel]       [k] queued_spin_lock_slowpath
     4.31%  [kernel]       [k] _raw_spin_unlock_irqrestore
     3.33%  [kernel]       [k] handle_softirqs
     2.54%  [kernel]       [k] _raw_spin_lock
     1.46%  haproxy        [.] process_stream
     1.12%  [kernel]       [k] _raw_spin_lock_bh

  # h1load
  time conns tot_conn tot_req tot_bytes err  cps  rps  bps   ttfb
     1  1020   172230  171211   9759255   0 172k 171k 78M0 5.817m
     2  1024   343482  342460  19520277   0 171k 171k 78M0 5.875m
     3  1021   515947  514926  29350953   0 172k 172k 78M5 5.841m
     4  1024   689972  688949  39270207   0 173k 173k 79M2 5.783m
     5  1024   863904  862881  49184274   0 173k 173k 79M2 5.795m
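
As a quick cross-check of the figures quoted above, the relative gains can be
recomputed from the rounded numbers in the text. This is only an arithmetic
sketch: the `results` table and the `relative()` helper are illustrative names
and not part of HAProxy or h1load.

```python
# Rounded figures quoted in the text: (64 threads/group, 16 threads/group).
results = {
    "first test (Mrps)": (1.58, 1.71),
    "connection:close (kcps)": (85.0, 172.0),
}


def relative(per64: float, per16: float) -> float:
    """Return the 16-threads-per-group figure as a fraction of the 64 one."""
    return per16 / per64


for name, (per64, per16) in results.items():
    # Format the ratio as a whole percentage, as done in the text.
    print(f"{name}: {per64} -> {per16} = {relative(per64, per16):.0%}")
```

With these inputs the ratios come out at 108% for the first test and 202% for
the connection:close test, which matches the 202% stated above.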

So let's change the default value to 16. It also happens to match what's
used by default on EPYC systems these days.

This change was marked MEDIUM as it will increase the number of listening
sockets on some systems, to match their counterparts from other vendors,
which is easier for capacity planning.
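
For capacity planning, the effect on the number of groups and listening
sockets can be estimated roughly as follows. This is a sketch under two
stated assumptions that are not taken from the commit itself: threads are
split evenly into the smallest number of groups that respects the limit, and
each bind line ends up with one listening socket per thread group. The actual
layout may differ, for example depending on CPU topology. The function names
are hypothetical.

```python
import math


def thread_groups(nbthread: int, max_per_group: int) -> int:
    """Smallest number of groups such that no group exceeds the limit.

    Assumption: an even split, ignoring CPU topology.
    """
    return math.ceil(nbthread / max_per_group)


def listening_sockets(bind_lines: int, nbthread: int, max_per_group: int) -> int:
    """Estimated socket count, assuming one socket per bind line per group."""
    return bind_lines * thread_groups(nbthread, max_per_group)


# The 64-thread machine used in the tests, with a single bind line.
for limit in (64, 16):
    print(limit, thread_groups(64, limit), listening_sockets(1, 64, limit))
```

Under these assumptions, a 64-thread machine goes from 1 group to 4 groups
when the limit drops from 64 to 16, and the listening socket count per bind
line grows by the same factor.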
HAProxy
HAProxy is a free, very fast and reliable reverse-proxy offering high availability, load balancing, and proxying for TCP and HTTP-based applications.
Installation
The INSTALL file describes how to build HAProxy. A list of packages is also available on the wiki.
Getting help
The discourse and the mailing-list are available for questions or configuration assistance. You can also use the slack or IRC channel. Please don't use the issue tracker for these.
The issue tracker is only for bug reports or feature requests.
Documentation
The HAProxy documentation has been split into a number of different files for ease of use. It is available in text format as well as HTML. The wiki is also meant to replace the old architecture guide.
Please refer to the following files depending on what you're looking for:
- INSTALL for instructions on how to build and install HAProxy
- BRANCHES to understand the project's life cycle and what version to use
- LICENSE for the project's license
- CONTRIBUTING for the process to follow to submit contributions
The more detailed documentation is located in the doc/ directory:
- doc/intro.txt for a quick introduction on HAProxy
- doc/configuration.txt for the configuration's reference manual
- doc/lua.txt for the Lua reference manual
- doc/SPOE.txt for how to use the SPOE engine
- doc/network-namespaces.txt for how to use network namespaces under Linux
- doc/management.txt for the management guide
- doc/regression-testing.txt for how to use the regression testing suite
- doc/peers.txt for the peers protocol reference
- doc/coding-style.txt for how to adopt HAProxy's coding style
- doc/internals for developer-specific documentation (not all up to date)
License
HAProxy is licensed under GPL 2 or any later version, the headers under LGPL 2.1. See the LICENSE file for a more detailed explanation.
