HAProxy - Load balancer
Find a file
Willy Tarreau 2c3f9818e8 BUG/MEDIUM: fd: do not wait on FD removal in fd_delete()
Christopher discovered an issue mostly affecting 2.2 and to a less extent
2.3 and above, which is that it's possible to deadlock a soft-stop when
several threads are using a same listener:

          thread1                             thread2

      unbind_listener()                   fd_set_running()
          lock(listener)                  listener_accept()
          fd_delete()                          lock(listener)
             while (running_mask);  -----> deadlock
          unlock(listener)

This simple case disappeared from 2.3 due to the removal of some locked
operations at the end of listener_accept() on the regular path, but the
architectural problem is still here and caused by a lock inversion built
around the loop on running_mask in fd_clr_running_excl(), because there
are situations where the caller of fd_delete() may hold a lock that is
preventing other threads from dropping their bit in running_mask.

The real need here is to make sure the last user deletes the FD. We have
all we need to know the last one, it's the one calling fd_clr_running()
last, or entering fd_delete() last, both of which can be summed up as
the last one calling fd_clr_running() if fd_delete() calls fd_clr_running()
at the end. And we can prevent new threads from appearing in running_mask
by removing their bits in thread_mask.

So what this patch does is that it sets the running_mask for the thread
in fd_delete(), clears the thread_mask, thus marking the FD as orphaned,
then clears the running mask again, and completes the deletion if it was
the last one. If it was not, another thread will pass through fd_clr_running
and will complete the deletion of the FD.

The bug is easily reproducible in 2.2 under high connection rates during
soft close. When the old process stops its listener, occasionally two
threads will deadlock and the old process will then be killed by the
watchdog. It's strongly believed that similar situations do exist in 2.3
and 2.4 (e.g. if the removal attempt happens during resume_listener()
called from listener_accept()) but if so, they should be much harder to
trigger.

This should be backported to 2.2 as the issue appeared with the FD
migration. It requires previous patches "fd: make fd_clr_running() return
the remaining running mask" and "MINOR: fd: remove the unneeded running
bit from fd_insert()".

Notes for backport: in 2.2, the fd_dodelete() function requires an extra
argument "do_close" indicating whether we want to remove and close the FD
(fd_delete) or just delete it (fd_remove). While this information is not
conveyed along the chain, we know that late calls always imply do_close=1
become do_close=0 exclusively results from fd_remove() which is only used
by the config parser and the master, both of which are single-threaded,
hence are always the last ones in the running_mask. Thus it is safe to
assume that a postponed FD deletion always implies do_close=1.

Thanks to Olivier for his help in designing this optimal solution.
2021-03-24 17:17:21 +01:00
.github CI: github actions: update LibreSSL to 3.2.5 2021-03-20 09:32:52 +01:00
contrib MINOR: opentracing: use pool_alloc(), not pool_alloc_dirty() 2021-03-22 15:35:53 +01:00
doc CLEANUP: dynbuf: remove the unused b_alloc_fast() function 2021-03-22 16:28:05 +01:00
examples CLEANUP: assorted typo fixes in the code and comments 2020-06-26 11:27:28 +02:00
include BUG/MEDIUM: fd: do not wait on FD removal in fd_delete() 2021-03-24 17:17:21 +01:00
reg-tests BUG/MINOR: ssl: Prevent disk access when using "add ssl crt-list" 2021-03-23 19:29:46 +01:00
scripts BUILD: Makefile: move REGTESTST_TYPE default setting 2021-02-05 11:41:16 +01:00
src BUG/MEDIUM: fd: do not wait on FD removal in fd_delete() 2021-03-24 17:17:21 +01:00
tests MEDIUM: config: remove the deprecated and dangerous global "debug" directive 2020-10-09 19:18:45 +02:00
.cirrus.yml CI: cirrus: update FreeBSD image to 12.2 2021-02-12 16:04:52 +01:00
.gitattributes MINOR: Configure the cpp userdiff driver for *.[ch] in .gitattributes 2021-02-22 18:17:57 +01:00
.gitignore CLEANUP: Update .gitignore 2020-09-12 13:11:24 +02:00
.travis.yml CI: travis-ci: drop coverity scan builds 2020-12-22 19:39:23 +01:00
BRANCHES DOC: fix some spelling issues over multiple files 2021-01-08 14:53:47 +01:00
CHANGELOG [RELEASE] Released version 2.4-dev13 2021-03-19 17:16:18 +01:00
CONTRIBUTING DOC: fix some spelling issues over multiple files 2021-01-08 14:53:47 +01:00
INSTALL DOC: fix some spelling issues over multiple files 2021-01-08 14:53:47 +01:00
LICENSE LICENSE: add licence exception for OpenSSL 2012-09-07 13:52:26 +02:00
MAINTAINERS DOC: Update the module list in MAINTAINERS file 2021-02-24 22:09:57 +01:00
Makefile BUG/MINOR: protocol: add missing support of dgram unix socket. 2021-03-18 18:30:29 +01:00
README DOC: create a BRANCHES file to explain the life cycle 2019-06-15 22:00:14 +02:00
ROADMAP DOC: update the outdated ROADMAP file 2019-06-15 21:59:54 +02:00
SUBVERS BUILD: use format tags in VERDATE and SUBVERS files 2013-12-10 11:22:49 +01:00
VERDATE [RELEASE] Released version 2.4-dev13 2021-03-19 17:16:18 +01:00
VERSION [RELEASE] Released version 2.4-dev13 2021-03-19 17:16:18 +01:00

The HAProxy documentation has been split into a number of different files for
ease of use.

Please refer to the following files depending on what you're looking for :

  - INSTALL for instructions on how to build and install HAProxy
  - BRANCHES to understand the project's life cycle and what version to use
  - LICENSE for the project's license
  - CONTRIBUTING for the process to follow to submit contributions

The more detailed documentation is located into the doc/ directory :

  - doc/intro.txt for a quick introduction on HAProxy
  - doc/configuration.txt for the configuration's reference manual
  - doc/lua.txt for the Lua's reference manual
  - doc/SPOE.txt for how to use the SPOE engine
  - doc/network-namespaces.txt for how to use network namespaces under Linux
  - doc/management.txt for the management guide
  - doc/regression-testing.txt for how to use the regression testing suite
  - doc/peers.txt for the peers protocol reference
  - doc/coding-style.txt for how to adopt HAProxy's coding style
  - doc/internals for developer-specific documentation (not all up to date)