/*
 * FD polling functions for FreeBSD kqueue()
 *
 * Copyright 2000-2014 Willy Tarreau <w@1wt.eu>
 *
 * This program is free software; you can redistribute it and/or
 * modify it under the terms of the GNU General Public License
 * as published by the Free Software Foundation; either version
 * 2 of the License, or (at your option) any later version.
 *
 */

#include <unistd.h>
#include <sys/time.h>
#include <sys/types.h>

#include <sys/event.h>
#include <sys/time.h>

#include <common/compat.h>
#include <common/config.h>
#include <common/hathreads.h>
#include <common/ticks.h>
#include <common/time.h>
#include <common/tools.h>

#include <types/global.h>

#include <proto/activity.h>
#include <proto/fd.h>
#include <proto/signal.h>


/* private data */
static int kqueue_fd[MAX_THREADS]; // per-thread kqueue_fd
static THREAD_LOCAL struct kevent *kev = NULL;
static struct kevent *kev_out = NULL; // trash buffer for kevent() to write the eventlist in

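/* Appends the kevent() changes needed to make the kernel's registration of
 * <fd> match its desired polled state for this thread, starting at index
 * <start> of the changelist in <kev>. Read/write filters are added or
 * deleted accordingly and polled_mask is updated. Returns the new number of
 * pending changes.
 */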
static int _update_fd(int fd, int start)
{
	int en;
	int changes = start;

	en = fdtab[fd].state;

	if (!(fdtab[fd].thread_mask & tid_bit) || !(en & FD_EV_POLLED_RW)) {
		if (!(polled_mask[fd].poll_recv & tid_bit) &&
		    !(polled_mask[fd].poll_send & tid_bit)) {
			/* fd was not watched, it's still not */
			return changes;
		}
		/* fd totally removed from poll list */
		EV_SET(&kev[changes++], fd, EVFILT_READ, EV_DELETE, 0, 0, NULL);
		EV_SET(&kev[changes++], fd, EVFILT_WRITE, EV_DELETE, 0, 0, NULL);
		if (polled_mask[fd].poll_recv & tid_bit)
			_HA_ATOMIC_AND(&polled_mask[fd].poll_recv, ~tid_bit);
		if (polled_mask[fd].poll_send & tid_bit)
			_HA_ATOMIC_AND(&polled_mask[fd].poll_send, ~tid_bit);
	}
	else {
		/* OK, fd has to be monitored, it was either added or changed */

		if (en & FD_EV_POLLED_R) {
			if (!(polled_mask[fd].poll_recv & tid_bit)) {
				EV_SET(&kev[changes++], fd, EVFILT_READ, EV_ADD, 0, 0, NULL);
				_HA_ATOMIC_OR(&polled_mask[fd].poll_recv, tid_bit);
			}
		}
		else if (polled_mask[fd].poll_recv & tid_bit) {
			EV_SET(&kev[changes++], fd, EVFILT_READ, EV_DELETE, 0, 0, NULL);
			_HA_ATOMIC_AND(&polled_mask[fd].poll_recv, ~tid_bit);
		}

		if (en & FD_EV_POLLED_W) {
			if (!(polled_mask[fd].poll_send & tid_bit)) {
				EV_SET(&kev[changes++], fd, EVFILT_WRITE, EV_ADD, 0, 0, NULL);
				_HA_ATOMIC_OR(&polled_mask[fd].poll_send, tid_bit);
			}
		}
		else if (polled_mask[fd].poll_send & tid_bit) {
			EV_SET(&kev[changes++], fd, EVFILT_WRITE, EV_DELETE, 0, 0, NULL);
			_HA_ATOMIC_AND(&polled_mask[fd].poll_send, ~tid_bit);
		}
	}
	return changes;
}

/*
 * kqueue() poller
 */
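/* Waits for I/O events. <exp> is the earliest known timer expiration date
 * (a tick), and a non-zero <wake> requests an immediate return. Received
 * events are translated into FD_POLL_* flags and passed to
 * fd_update_events().
 */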
REGPRM3 static void _do_poll(struct poller *p, int exp, int wake)
{
	int status;
	int count, fd, wait_time;
	struct timespec timeout_ts;
	int updt_idx;
	int changes = 0;
	int old_fd;

	timeout_ts.tv_sec = 0;
	timeout_ts.tv_nsec = 0;

	/* first, scan the update list to find changes */
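	/* fd_updt[] is this thread's private list of FDs whose polled state
	 * changed; each entry is turned into changelist events by _update_fd().
	 */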
	for (updt_idx = 0; updt_idx < fd_nbupdt; updt_idx++) {
		fd = fd_updt[updt_idx];

		_HA_ATOMIC_AND(&fdtab[fd].update_mask, ~tid_bit);
		if (!fdtab[fd].owner) {
			activity[tid].poll_drop++;
			continue;
		}
		changes = _update_fd(fd, changes);
	}

	/* Scan the global update list */
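	/* Its links use negative values as markers: -1 terminates the list,
	 * -2 flags an entry that is concurrently being modified (the scan then
	 * restarts from the last visited fd), and values <= -3 encode the next
	 * fd of a removed entry as -4 - fd.
	 */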
	for (old_fd = fd = update_list.first; fd != -1; fd = fdtab[fd].update.next) {
		if (fd == -2) {
			fd = old_fd;
			continue;
		}
		else if (fd <= -3)
			fd = -fd - 4;
		if (fd == -1)
			break;
		if (fdtab[fd].update_mask & tid_bit)
			done_update_polling(fd);
		else
			continue;
		if (!fdtab[fd].owner)
			continue;
		changes = _update_fd(fd, changes);
	}

	thread_harmless_now();

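	/* Commit all accumulated changes in a single kevent() call with a zero
	 * timeout: any per-change status is written into kev_out and ignored
	 * here.
	 */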
	if (changes) {
#ifdef EV_RECEIPT
		kev[0].flags |= EV_RECEIPT;
#else
		/* If EV_RECEIPT isn't defined, just add an invalid entry,
		 * so that we get an error and kevent() stops before scanning
		 * the kqueue.
		 */
		EV_SET(&kev[changes++], -1, EVFILT_WRITE, EV_DELETE, 0, 0, NULL);
#endif
		kevent(kqueue_fd[tid], kev, changes, kev_out, changes, &timeout_ts);
	}
	fd_nbupdt = 0;

	/* now let's wait for events */
	wait_time = wake ? 0 : compute_poll_timeout(exp);
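	/* <fd> is reused below as the maximum number of events to return per
	 * kevent() call.
	 */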
	fd = global.tune.maxpollevents;
	tv_entering_poll();
	activity_count_runtime();
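	/* With busy polling enabled (GTUNE_BUSY_POLLING), kevent() is called
	 * with a zero timeout and the loop spins until an event, an expired
	 * timeout, a pending signal or a wakeup request shows up, trading CPU
	 * usage for lower wakeup latency.
	 */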
	do {
		int timeout = (global.tune.options & GTUNE_BUSY_POLLING) ? 0 : wait_time;

		timeout_ts.tv_sec  = (timeout / 1000);
		timeout_ts.tv_nsec = (timeout % 1000) * 1000000;

		status = kevent(kqueue_fd[tid], // int kq
		                NULL,           // const struct kevent *changelist
		                0,              // int nchanges
		                kev,            // struct kevent *eventlist
		                fd,             // int nevents
		                &timeout_ts);   // const struct timespec *timeout
		tv_update_date(timeout, status);

		if (status)
			break;
		if (timeout || !wait_time)
			break;
		if (signal_queue_len || wake)
			break;
		if (tick_isset(exp) && tick_is_expired(exp, now_ms))
			break;
	} while (1);

	tv_leaving_poll(wait_time, status);
	thread_harmless_end();
	if (sleeping_thread_mask & tid_bit)
		_HA_ATOMIC_AND(&sleeping_thread_mask, ~tid_bit);

	for (count = 0; count < status; count++) {
		unsigned int n = 0;

		fd = kev[count].ident;

		if (!fdtab[fd].owner) {
			activity[tid].poll_dead++;
			continue;
		}

		if (!(fdtab[fd].thread_mask & tid_bit)) {
			activity[tid].poll_skip++;
			continue;
		}
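		/* translate the kevent into FD_POLL_* flags: EV_EOF on the
		 * read filter is reported as a hangup, EV_EOF on the write
		 * filter as an error.
		 */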
		if (kev[count].filter == EVFILT_READ) {
			if (kev[count].data)
				n |= FD_POLL_IN;
			if (kev[count].flags & EV_EOF)
				n |= FD_POLL_HUP;
		}
		else if (kev[count].filter == EVFILT_WRITE) {
			n |= FD_POLL_OUT;
			if (kev[count].flags & EV_EOF)
				n |= FD_POLL_ERR;
		}

		fd_update_events(fd, n);
	}
}

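/* Per-thread initialization: allocates the thread-local event buffer and, for
 * all threads but the first one (whose kqueue was opened in _do_init()), opens
 * this thread's kqueue. All FDs are then marked for a polling state update.
 * Returns 1 on success, 0 on failure.
 */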
static int init_kqueue_per_thread()
{
	int fd;

	/* we can have up to two events per fd, so allocate enough to store
	 * 2*fd events, plus an extra one in case EV_RECEIPT isn't defined,
	 * so that we can add an invalid entry and get an error, to avoid
	 * scanning the kqueue uselessly.
	 */
	kev = calloc(1, sizeof(struct kevent) * (2 * global.maxsock + 1));
	if (kev == NULL)
		goto fail_alloc;

	if (MAX_THREADS > 1 && tid) {
		kqueue_fd[tid] = kqueue();
		if (kqueue_fd[tid] < 0)
			goto fail_fd;
	}

	/* we may have to unregister some events initially registered on the
	 * original fd when it was alone, and/or to register events on the new
	 * fd for this thread. Let's just mark them as updated, the poller will
	 * do the rest.
	 */
	for (fd = 0; fd < global.maxsock; fd++)
		updt_fd_polling(fd);

	return 1;

 fail_fd:
	free(kev);
 fail_alloc:
	return 0;
}

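/* Per-thread cleanup: closes this thread's kqueue (the first thread's one is
 * closed by _do_term()) and releases the thread-local event buffer.
 */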
static void deinit_kqueue_per_thread()
{
	if (MAX_THREADS > 1 && tid)
		close(kqueue_fd[tid]);

	free(kev);
	kev = NULL;
}

/*
 * Initialization of the kqueue() poller.
 * Returns 0 in case of failure, non-zero in case of success. If it fails, it
 * disables the poller by setting its pref to 0.
 */
REGPRM1 static int _do_init(struct poller *p)
{
	p->private = NULL;

	/* we can have up to two events per fd, so allocate enough to store
	 * 2*fd events, plus an extra one in case EV_RECEIPT isn't defined,
	 * so that we can add an invalid entry and get an error, to avoid
	 * scanning the kqueue uselessly.
	 */
	kev_out = calloc(1, sizeof(struct kevent) * (2 * global.maxsock + 1));
	if (!kev_out)
		goto fail_alloc;

	kqueue_fd[tid] = kqueue();
	if (kqueue_fd[tid] < 0)
		goto fail_fd;

	hap_register_per_thread_init(init_kqueue_per_thread);
	hap_register_per_thread_deinit(deinit_kqueue_per_thread);
	return 1;

 fail_fd:
	free(kev_out);
	kev_out = NULL;
 fail_alloc:
	p->pref = 0;
	return 0;
}

/*
 * Termination of the kqueue() poller.
 * Memory is released and the poller is marked as unselectable.
 */
REGPRM1 static void _do_term(struct poller *p)
{
	if (kqueue_fd[tid] >= 0) {
		close(kqueue_fd[tid]);
		kqueue_fd[tid] = -1;
	}

	p->private = NULL;
	p->pref = 0;
	if (kev_out) {
		free(kev_out);
		kev_out = NULL;
	}
}

/*
 * Check that the poller works.
 * Returns 1 if OK, otherwise 0.
 */
REGPRM1 static int _do_test(struct poller *p)
{
	int fd;

	fd = kqueue();
	if (fd < 0)
		return 0;
	close(fd);
	return 1;
}

/*
 * Recreate the kqueue file descriptor after a fork(). Returns 1 if OK,
 * otherwise 0. Note that some pollers need to be reopened after a fork()
 * (such as kqueue), and some others may fail to do so in a chroot.
 */
REGPRM1 static int _do_fork(struct poller *p)
{
	kqueue_fd[tid] = kqueue();
	if (kqueue_fd[tid] < 0)
		return 0;
	return 1;
}

/*
 * Registers the poller. This is a constructor, which means it will
 * automatically be called before main(). This is GCC-specific but it works at
 * least since 2.95. Special care must be taken so that it does not need any
 * uninitialized data.
 */
__attribute__((constructor))
static void _do_register(void)
{
	struct poller *p;
	int i;

	if (nbpollers >= MAX_POLLERS)
		return;

	for (i = 0; i < MAX_THREADS; i++)
		kqueue_fd[i] = -1;

	p = &pollers[nbpollers++];

	p->name = "kqueue";
	p->pref = 300;
	p->flags = HAP_POLL_F_RDHUP;
	p->private = NULL;

	p->clo  = NULL;
	p->test = _do_test;
	p->init = _do_init;
	p->term = _do_term;
	p->poll = _do_poll;
	p->fork = _do_fork;
}

/*
 * Local variables:
 *  c-indent-level: 8
 *  c-basic-offset: 8
 * End:
 */