Commit graph

3478 commits

Author SHA1 Message Date
Willy Tarreau
c40efc1919 MINOR: init/threads: make the threads array global
Currently the thread array is a local variable inside a function block
and there is no access to it from outside, which often complicates
debugging. Let's make it global and export it. Also the allocation
return is now checked.
2019-05-03 10:16:30 +02:00
Willy Tarreau
81492c989c MINOR: threads: flatten the per-thread cpu-map
When we initially experimented with threads and processes support, we
needed to implement arrays of threads per process for cpu-map, but this
is not needed anymore since we support either threads or processes.
Let's simply make the thread-based cpu-map per thread and not per
thread and per process since that's not used anymore. Doing so reduces
the global struct from 33kB to 1.5kB.
2019-05-03 09:46:45 +02:00
Olivier Houchard
a48237fd07 BUG/MEDIUM: connections: Make sure we remove CO_FL_SESS_IDLE on disown.
When for some reason the session is not the owner of the connection anymore,
make sure we remove CO_FL_SESS_IDLE, even if we're about to call
conn->mux->destroy(), as the destroy may not destroy the connection
immediately if it's still in use.
This should be backported to 1.9.
u
2019-05-02 12:08:39 +02:00
Olivier Houchard
55071d30ca BUG/MEDIUM: channels: Don't forget to reset output in channel_erase().
In channel_erase(), don't forget to set output to 0, otherwise the
channel won't seem empty, when it really is, and that could lead to
stream never closing properly.

This should be backported to 1.9.
2019-05-02 10:40:59 +02:00
Christopher Faulet
102854cbba BUG/MEDIUM: listener: Fix how unlimited number of consecutive accepts is handled
There is a bug when global.tune.maxaccept is set to -1 (no limit). It is pretty
visible with one process (nbproc sets to 1). The functions listener_accept() and
accept_queue_process() don't expect to handle negative maxaccept values. So
instead of accepting incoming connections without any limit, none are never
accepted and HAProxy loop infinitly in the scheduler.

When there are 2 or more processes, the bug is a bit more subtile. The limit for
a listener is set to 1. So only one connection is accepted at a time by a given
listener. This happens because the listener's maxaccept value is an unsigned
integer. In check_config_validity(), it is first set to UINT_MAX (-1 casted in
an unsigned integer), and then some calculations on it leads to an integer
overflow.

To fix the bug, the listener's maxaccept value is now a signed integer. So, if a
negative value is set for global.tune.maxaccept, we keep it untouched for the
listener and no calculation is made on it. Then, in the listener code, this
signed value is casted to a unsigned one. It simplifies all tests instead of
dealing with negative values. So, it limits the number of connections accepted
at a time to UINT_MAX at most. But, honestly, it not an issue.

This patch must be backported to 1.9 and 1.8.
2019-04-30 15:28:29 +02:00
Olivier Houchard
07425de717 BUG/MEDIUM: port_range: Make the ring buffer lock-free.
Port range uses a ring buffer, and unfortunately, when making haproxy
multithreaded, it's been overlooked, and the ring buffer is not thread-safe.
When specifying a source range, 2 or more threads could pick the same
port, and of course only one of them could use the port, the others would
always fail the connection.
To fix this, make it a lock-free ring buffer. This is easier than usual
because we know the ring buffer can never be full.

This should be backported to 1.8 and 1.9.
2019-04-30 15:10:17 +02:00
Olivier Houchard
9ce62b5498 MINOR: threads: Implement HA_ATOMIC_LOAD().
The same way we have HA_ATOMIC_STORE(), implement HA_ATOMIC_LOAD().

This should be backported to 1.8 and 1.9, as we need it for a bug fix
in port ranges.
2019-04-30 15:10:08 +02:00
Willy Tarreau
bc13bec548 MINOR: activity: report context switch counts instead of rates
It's not logical to report context switch rates per thread in show activity
because everything else is a counter and it's not even possible to compare
values. Let's only report counts. Further, this simplifies the scheduler's
code.
2019-04-30 14:55:18 +02:00
Willy Tarreau
9634e86dc7 CLEANUP: task: move the task_per_thread definition to task.h
It's the second time I look for it and can't find it because it's not
in the right file.
2019-04-30 14:36:47 +02:00
Frédéric Lécaille
d803e475e5 MINOR: log: Enable the log sampling and load-balancing feature.
This patch implements the sampling and load-balancing of log servers configured
with "sample" new keyword implemented by this commit:
    'MINOR: log: Add "sample" new keyword to "log" lines'.
As the list of ranges used to sample the log to balance is ordered, we only
have to maintain ->curr_idx member of smp_info struct which is the index of
the sample and check if it belongs or not to the current range to decide if we
must send it to the log server or not.
2019-04-30 09:25:09 +02:00
Frédéric Lécaille
d95ea2897e MINOR: log: Add "sample" new keyword to "log" lines.
This patch implements the parsing of "sample" new optional keyword for "log" lines
to be able to sample and balance the load of log messages between serveral log
destinations declared by "log" lines. This keyword must be followed by a list of
comma seperated ranges of indexes numbered from 1 to define the samples to be used
to balance the load of logs to send. This "sample" keyword must be used on "log" lines
obviously before the remaining optional ones without keyword. The list of ranges
must be followed by a colon character to separate it from the log sampling size.

With such following configuration declarations:

   log stderr local0
   log 127.0.0.1:10001 sample 2-3,8-11:11 local0
   log 127.0.0.2:10002 sample 5:5 local0

in addition to being sent to stderr, about the second "log" line, every 11 logs
the logs #2 up to #3 would be sent to 127.0.0.1:10001, then #8 up tp #11 four
logs would be sent to the same log server and so on periodically. Logs would be
sent to 127.0.0.2:100002 every 5 logs.

It is also possible to define the size of the sample with a value different of
the maximum of the high limits of the ranges, for instance as follows:

   log 127.0.0.1:10001 sample 2-3,8-11:15 local0

as before the two logs #2 and #3 would be sent to 127.0.0.1:10001, then #8
up tp #11 logs, but in this case here, this would be done periodically every 15
messages.

Also note that the ranges must not overlap each others. This is to ease the
way the logs are periodically sent.
2019-04-30 09:25:09 +02:00
Christopher Faulet
85db3212b8 MINOR: spoe: Use the sample context to pass frag_ctx info during encoding
This simplifies the API and hide the details in the sample. This way, only
string and binary are aware of these info, because other types cannot be
partially encoded.

This patch may be backported to 1.9 and 1.8.
2019-04-29 16:02:05 +02:00
Kevin Zhu
f7f54280c8 BUG/MEDIUM: spoe: arg len encoded in previous frag frame but len changed
Fragmented arg will do fetch at every encode time, each fetch may get
different result if SMP_F_MAY_CHANGE, for example res.payload, but
the length already encoded in first fragment of the frame, that will
cause SPOA decode failed and waste resources.

This patch must be backported to 1.9 and 1.8.
2019-04-29 16:02:05 +02:00
Willy Tarreau
71c07ac65a MINOR: stream/debug: make a stream dump and crash function
During 1.9 development (and even a bit after) we've started to face a
significant number of situations where streams were abusively spinning
due to an uncaught error flag or complex conditions that couldn't be
correctly identified. Sometimes streams wake appctx up and conversely
as well. More importantly when this happens the only fix is to restart.

This patch adds a new function to report a serious error, some relevant
info and to crash the process using abort() so that a core dump is
available. The purpose will be for this function to be called in various
situations where the process is unfixable. It will help detect these
issues much earlier during development and may even help fixing test
platforms which are able to automatically restart when such a condition
happens, though this is not the primary purpose.

This patch only provides the function and doesn't use it yet.
2019-04-26 13:15:56 +02:00
Willy Tarreau
5e6a5b3a6e MINOR: connection: make the debugging helper functions safer
We have various functions like conn_get_ctrl_name() to retrieve
some information reported in "show sess" for debugging, which
assume that the connection is valid. This is really not convenient
in code aimed at debugging and is error-prone. Let's add a validity
test first.
2019-04-25 18:35:49 +02:00
Willy Tarreau
d5ec4bfe85 CLEANUP: standard: use proper const to addr_to_str() and port_to_str()
The input parameter was not marked const, making it painful for some calls.
2019-04-25 17:48:16 +02:00
Willy Tarreau
d2d3348acb MINOR: activity: enable automatic profiling turn on/off
Instead of having to manually turn task profiling on/off in the
configuration, by default it will work in "auto" mode, which
automatically turns on on any thread experiencing sustained loop
latencies over one millisecond averaged over the last 1024 samples.

This may happen with configs using lots of regex (thing map_reg for
example, which is the lazy way to convert Apache's rewrite rules but
must not be abused), and such high latencies affect all the process
and the problem is most often intermittent (e.g. hitting a map which
is only used for certain host names).

Thus now by default, with profiling set to "auto", it remains off all
the time until something bad happens. This also helps better focus on
the issues when looking at the logs as well as in "show sess" output.
It automatically turns off when the average loop latency over the last
1024 calls goes below 990 microseconds (which typically takes a while
when in idle).

This patch could be backported to stable versions after a bit more
exposure, as it definitely improves observability and the ability to
quickly spot the culprit. In this case, previous patch ("MINOR:
activity: make the profiling status per thread and not global") must
also be taken.
2019-04-25 17:26:46 +02:00
Willy Tarreau
d9add3acc8 MINOR: activity: make the profiling status per thread and not global
In order to later support automatic profiling turn on/off, we need to
have it per-thread. We're keeping the global option to know whether to
turn it or on off, but the profiling status is now set per thread. We're
updating the status in activity_count_runtime() which is called before
entering poll(). The reason is that we'll extend this with run time
measurement when deciding to automatically turn it on or off.
2019-04-25 17:26:19 +02:00
Willy Tarreau
22d63a24d9 MINOR: applet: measure and report an appctx's call rate in "show sess"
Very similarly to previous commit doing the same for streams, we now
measure and report an appctx's call rate. This will help catch applets
which do not consume all their data and/or which do not properly report
that they're waiting for something else. Some of them like peers might
theorically be able to exhibit some occasional peeks when teaching a
full table to a nearby peer (e.g. the new replacement process), but
nothing close to what a bogus service can do so there is no risk of
confusion.
2019-04-24 16:04:23 +02:00
Willy Tarreau
2e9c1d2960 MINOR: stream: measure and report a stream's call rate in "show sess"
Quite a few times some bugs have made a stream task incorrectly
handle a complex combination of events, which was often reported as
"100% CPU", and was usually caused by the event not being properly
identified and flushed, and the stream's handler called in loops.

This patch adds a call rate counter to the stream struct. It's not
huge, it's really inexpensive (especially compared to the rest of the
processing function) and will easily help spot such tasks in "show sess"
output, possibly even allowing to kill them.

A future patch should probably consist in alerting when they're above a
certain threshold, possibly sending a dump and killing them. Some options
could also consist in aborting in order to get an analyzable core dump
and let a service manager restart a fresh new process.
2019-04-24 16:04:23 +02:00
Willy Tarreau
0212fadd65 MINOR: tasks/activity: report the context switch and task wakeup rates
It's particularly useful to spot runaway tasks to see this. The context
switch rate covers all tasklet calls (tasks and I/O handlers) while the
task wakeups only covers tasks picked from the run queue to be executed.
High values there will indicate either an intense traffic or a bug that
mades a task go wild.
2019-04-24 16:04:23 +02:00
Christopher Faulet
c1918d1a8f BUG/MAJOR: muxes: Use the HTX mode to find the best mux for HTTP proxies only
Since the commit 1d2b586cd ("MAJOR: htx: Enable the HTX mode by default for all
proxies"), the HTX is enabled by default for all proxies, HTTP and TCP, but also
CLI and HEALTH proxies. But when the best mux is retrieved, only HTTP and TCP
modes are checked. If the TCP mode is not explicitly set, it is considered as an
HTTP proxy. It is an hidden bug introduced when the option "http-use-htx" was
added. It has no effect until the commit 1d2b586cd. But now, when a stats socket
is created for the master process, the mux h1 is installed on all incoming
connections to the CLI proxy, leading to segfaults because HTX operations are
performed on raw buffers.

So to fix the buf, when a mux is installed, all proxies are considered as TCP
proxies, except HTTP ones. This way, CLI and HEALTH proxies will be handled as
TCP proxies.

This patch must be backported to 1.9 although it has no effect. It is safer to
not keep hidden bugs.
2019-04-24 15:40:02 +02:00
Baptiste Assmann
333939c2ee MINOR: action: new '(http-request|tcp-request content) do-resolve' action
The 'do-resolve' action is an http-request or tcp-request content action
which allows to run DNS resolution at run time in HAProxy.
The name to be resolved can be picked up in the request sent by the
client and the result of the resolution is stored in a variable.
The time the resolution is being performed, the request is on pause.
If the resolution can't provide a suitable result, then the variable
will be empty. It's up to the admin to take decisions based on this
statement (return 503 to prevent loops).

Read carefully the documentation concerning this feature, to ensure your
setup is secure and safe to be used in production.

This patch creates a global counter to track various errors reported by
the action 'do-resolve'.
2019-04-23 11:41:52 +02:00
Baptiste Assmann
0b9ce82dfa MINOR: obj_type: new object type for struct stream
This patch creates a new obj_type for the struct stream in HAProxy.
2019-04-23 11:35:56 +02:00
Baptiste Assmann
dfd35fd71a MINOR: dns: dns_requester structures are now in a memory pool
dns_requester structure can be allocated at run time when servers get
associated to DNS resolution (this happens when SRV records are used in
conjunction with service discovery).
Well, this memory allocation is safer if managed in an HAProxy pool,
furthermore with upcoming HTTP action which can perform DNS resolution
at runtime.

This patch moves the memory management of the dns_requester structure
into its own pool.
2019-04-23 11:33:48 +02:00
Emeric Brun
d0e095c2aa MINOR: ssl/cli: async fd io-handlers printable on show fd
This patch exports the async fd iohandlers and make them printable
doing a 'show fd' on cli.
2019-04-19 17:27:01 +02:00
Christopher Faulet
22c57bef56 BUG/MEDIUM: h1: Don't parse chunks CRLF if not enough data are available
As specified in the function comment, the function h1_skip_chunk_crlf() must not
change anything and return zero if not enough data are available. This must
include the case where there is no data at all. On this point, it must do the
same that other h1 parsing functions. This bug is made visible since the commit
91f77d599 ("BUG/MINOR: mux-h1: Process input even if the input buffer is
empty").

This patch must be backported to 1.9.
2019-04-19 15:53:23 +02:00
Olivier Houchard
88698d966d MEDIUM: connections: Add a way to control the number of idling connections.
As by default we add all keepalive connections to the idle pool, if we run
into a pathological case, where all client don't do keepalive, but the server
does, and haproxy is configured to only reuse "safe" connections, we will
soon find ourself having lots of idling, unusable for new sessions, connections,
while we won't have any file descriptors available to create new connections.

To fix this, add 2 new global settings, "pool_low_ratio" and "pool_high_ratio".
pool-low-fd-ratio  is the % of fds we're allowed to use (against the maximum
number of fds available to haproxy) before we stop adding connections to the
idle pool, and destroy them instead. The default is 20. pool-high-fd-ratio is
the % of fds we're allowed to use (against the maximum number of fds available
to haproxy) before we start killing idling connection in the event we have to
create a new outgoing connection, and no reuse is possible. The default is 25.
2019-04-18 19:52:03 +02:00
Olivier Houchard
7c49d2e213 MINOR: fd: Add a counter of used fds.
Add a new counter, ha_used_fds, that let us know how many file descriptors
we're currently using.
2019-04-18 19:19:59 +02:00
Olivier Houchard
e179d0e88f MEDIUM: connections: Provide a xprt_ctx for each xprt method.
For most of the xprt methods, provide a xprt_ctx.  This will be useful later
when we'll want to be able to stack xprts.
The init() method now has to create and provide the said xprt_ctx if needed.
2019-04-18 14:56:24 +02:00
Olivier Houchard
7b5fd1ec26 MEDIUM: connections: Move some fields from struct connection to ssl_sock_ctx.
Move xprt_st, tmp_early_data and sent_early_data from struct connection to
struct ssl_sock_ctx, as they are only used in the SSL code.
2019-04-18 14:56:24 +02:00
Olivier Houchard
3f795f76e8 MEDIUM: tasks: Merge task_delete() and task_free() into task_destroy().
task_delete() was never used without calling task_free() just after, and
task_free() was only used on error pathes to destroy a just-created task,
so merge them into task_destroy(), that will remove the task from the
wait queue, and make sure the task is either destroyed immediately if it's
not in the run queue, or destroyed when it's supposed to run.
2019-04-18 10:10:04 +02:00
Willy Tarreau
8c12e2f785 MINOR: task/thread: factor out a wake-up condition
The wakeup condition in task_wakeup() is redundant as it is already
validated by the CAS. Better move the __task_wakeup() call there, it
also has the merit of being easier to audit this way. This also reduces
the code size by around 1.8 kB :

  $ size haproxy-?
     text    data     bss     dec     hex filename
  2153806  100208 1307676 3561690  3658da haproxy-1
  2152094  100208 1307676 3559978  36522a haproxy-2
2019-04-17 22:15:58 +02:00
Willy Tarreau
a70bfaaf8b BUG/MAJOR: task: make sure never to delete a queued task
Commit 0c7a4b6 ("MINOR: tasks: Don't set the TASK_RUNNING flag when
adding in the tasklet list.") revealed a hole in the way tasks may
be freed : they could be removed while in the run queue when the
TASK_QUEUED flag was present but not the TASK_RUNNING one. But it
seems the issue was emphasized by commit cde7902 ("MEDIUM: tasks:
improve fairness between the local and global queues") though the
code it replaces was already affected given how late the TASK_RUNNING
flag was set after removal from the global queue.

At the moment the task is picked from the global run queue, if it
is the last one, the global run queue lock is dropped, and then
the TASK_RUNNING flag was added. In the mean time another thread
might have performed a task_free(), and immediately after, the
TASK_RUNNING flag was re-added to the task, which was then added
to the tasklet list. The unprotected window was extremely faint
but does definitely exist and inconsistent task lists have been
observed a few times during very intensive tests over the last few
days. From this point various options are possible, the task might
have been re-allocated while running, and assigned state 0 and/or
state QUEUED while it was still running, resulting in the tast not
being put back into the tree.

This commit simply makes sure that tests on TASK_RUNNING before removing
the task also cover TASK_QUEUED.

It must be backported to 1.9 along with the previous ones touching
that area.
2019-04-17 22:15:58 +02:00
Olivier Houchard
4a1be0c6d6 MEDIUM: tasks: No longer use rq.node.leaf_p as a lock.
Now that we have the warranty that a task won't be added in the runqueue
while the TASK_QUEUED or the TASK_RUNNING flag is set, don't bother trying
to lock the task by setting leaf_p to 0x1 while inserting it in the runqueue
or having it in the tasklet_list, as nobody else will attempt to add it.
2019-04-17 19:28:01 +02:00
Olivier Houchard
5c964f7b42 MINOR: tasks: Don't consider we can wake task with tasklet_wakeup().
In tasklet_wakeup(), don't bother checking if the tasklet is really a task,
calling tasklet_wakeup() with a task is invalid.
2019-04-17 19:28:01 +02:00
Willy Tarreau
b038007ae8 BUG/MEDIUM: tasks: Make sure we set TASK_QUEUED before adding a task to the rq.
Make sure we set TASK_QUEUED in every case before adding the task to the
run queue. task_wakeup() now checks if either TASK_QUEUED or TASK_RUNNING
is set, and if neither is set, add TASK_QUEUED and effectively add the task
to the runqueue.
No longer use __task_wakeup() anywhere except in task_wakeup(), always use
task_wakeup() instead.
With the old code, process_runnable_task() may re-add a task in the runqueue
without setting the TASK_QUEUED flag, and there were race conditions that could
lead to a task having the TASK_QUEUED flag but not in the runqueue, thus
being unschedulable.

This should be backported to 1.9.
2019-04-17 19:28:01 +02:00
Christopher Faulet
5ec8bcb021 BUG/MINOR: http_fetch/htx: Allow permissive sample prefetch for the HTX
As for smp_prefetch_http(), there is now a way to successfully perform a
prefetch in HTX, even if the message forwarding already begun. It is used for
the sample fetches "req.proto_http" and "method".

This patch must be backported to 1.9.
2019-04-17 15:12:27 +02:00
Christopher Faulet
89dc499359 BUG/MAJOR: http_fetch: Get the channel depending on the keyword used
All HTTP samples are buggy because the channel tested in the prefetch functions
(HTX and legacy HTTP) is chosen depending on the sample direction and not the
keyword really used. It means the request channel is used if the sample is
called during the request analysis and the response channel is used if it is
called during the response analysis, regardless the sample really called. For
instance, if you use the sample "req.ver" in an http-response rule, the response
channel will be prefeched because it is called during the response analysis,
while the request channel should have been used instead. So some assumptions on
the validity of the sample may be made on the wrong channel. It is the first
bug.

Then the same error is done in some samples themselves. So fetches are performed
on the wrong channel. For instance, the header extraction (req.fhdr, res.fhdr,
req.hdr, res.hdr...). If the sample "req.hdr" is used in an http-response rule,
then the matching is done on the response headers and not the request ones. It
is the second bug.

Finally, the last one but not the least, in some samples, the right channel is
used. But because the prefetch was done on the wrong one, this channel may be in
a undefined state. For instance, using the sample "req.ver" in an http-response
rule leads to a matching on a posibility released buffer.

To fix all these bugs, the right channel is now chosen in sample fetches, before
the prefetch. If the same function is used to fetch requests and responses
elements, then the keyword is used to choose the right one. This channel is then
used by the functions smp_prefetch_htx() and smp_prefetch_http(). Of course, it
is also used by the samples themselves to extract information.

This patch must be backported to all supported versions. For version 1.8 and
priors, it must be totally refactored. First because there is no HTX into these
versions. Then the buffers API has changed in HAProxy 1.9. The files
http_fetch.{ch} doesn't exist on old versions.
2019-04-17 15:12:27 +02:00
Christopher Faulet
3a4d1bea61 BUG/MEDIUM: htx: Don't return the start-line if the HTX message is empty
In the function htx_get_stline(), NULL must be returned if the HTX message
doesn't contain any element.

This patch must be backported to 1.9.
2019-04-17 15:12:27 +02:00
Willy Tarreau
636848aa86 MINOR: init: add a "set-dumpable" global directive to enable core dumps
It's always a pain to get a core dump when enabling user/group setting
(which disables the dumpable flag on Linux), when using a chroot and/or
when haproxy is started by a service management tool which requires
complex operations to just raise the core dump limit.

This patch introduces a new "set-dumpable" global directive to work
around these troubles by doing the following :

  - remove file size limits     (equivalent of ulimit -f unlimited)
  - remove core size limits     (equivalent of ulimit -c unlimited)
  - mark the process dumpable again (equivalent of suid_dumpable=1)

Some of these will depend on the operating system. This way it becomes
much easier to retrieve a core file. Temporarily moving the chroot to
a user-writable place generally enough.
2019-04-16 14:31:23 +02:00
William Lallemand
8f7069a389 CLEANUP: mworker: remove the type field in mworker_proc
Since the introduction of the options field, we can use it to store the
type of process.

type = 'm' is replaced by PROC_O_TYPE_MASTER
type = 'w' is replaced by PROC_O_TYPE_WORKER
type = 'e' is replaced by PROC_O_TYPE_PROG

The old values are still used in the HAPROXY_PROCESSES environment
variable to pass the information during a reload.
2019-04-16 13:26:43 +02:00
William Lallemand
bd3de3efb7 MEDIUM: mworker-prog: implements 'option start-on-reload'
This option is already the default, but its opposite 'no option
start-on-reload' allows the master to keep a previous instance of a
program and don't start a new one upon a reload.

The old program will then appear as a current one in "show proc" and
could also trigger an exit-on-failure upon a segfault.
2019-04-16 13:26:43 +02:00
William Lallemand
4528611ed6 MEDIUM: mworker: store the leaving state of a process
Previously we were assuming than a process was in a leaving state when
its number of reload was greater than 0. With mworker programs it's not
the case anymore so we need to store a leaving state.
2019-04-16 13:26:43 +02:00
Willy Tarreau
9df86f997e BUG/MAJOR: lb/threads: fix insufficient locking on round-robin LB
Maksim Kupriianov reported very strange crashes in fwrr_update_position()
which didn't make sense because of an apparent divide overflow except that
the value was not null in the core.

It happens that while the locking is correct in all the functions' call
graph, the uppermost one (fwrr_get_next_server()) incorrectly expected
that its target server was already locked when called. This stupid
assumption causd the server lock not to be held when calling the other
ones, explaining how it was possible to change the server's eweight by
calling srv_lb_commit_status() under the server lock yet collide with
its unprotected usage.

This commit makes sure that fwrr_get_server_from_group() retrieves a
locked server and that fwrr_get_next_server() is responsible for
unlocking the server before returning it. There is one subtlety in
this function which is that it builds a list of avoided servers that
were full while scanning the tree, and all of them are queued in a
full state so they must be unlocked upon return.

Many thanks to Maksim for providing detailed info allowing to narrow
down this bug.

This fix must be backported to 1.9. In 1.8 the lock seems much wider
and changes to the server's state are performed under the rendez-vous
point so this it doesn't seem possible that it happens there.
2019-04-16 11:21:14 +02:00
Frédéric Lécaille
95679dc096 MINOR: peers: Add a new command to the CLI for peers.
Implements "show peers [peers section]" new CLI command to dump information
about the peers and their stick-tables to be synchronized and others internal.

May be backported as far as 1.5.
2019-04-16 09:58:40 +02:00
Willy Tarreau
8de1df92a3 BUILD: do not specify "const" on functions returning structs or scalars
Older compilers (like gcc-3.4) warn about the use of "const" on functions
returning a struct, which makes sense since the return may only be copied :

  include/common/htx.h:233: warning: type qualifiers ignored on function return type

Let's simply drop "const" here.
2019-04-15 21:55:48 +02:00
Willy Tarreau
0e492e2ad0 BUILD: address a few cases of "static <type> inline foo()"
Older compilers don't like to see "inline" placed after the type in a
function declaration, it must be "static inline <type>" only. This
patch touches various areas. The warnings were seen with gcc-3.4.
2019-04-15 21:55:48 +02:00
Olivier Houchard
3212a2c438 BUG/MEDIUM: Threads: Only use the gcc >= 4.7 builtins when using gcc >= 4.7.
Move the definition of the various _HA_ATOMIC_* macros that use
__atomic_* in the #if GCC_VERSION >= 4.7, not just after it, so that we
can build with older versions of gcc again.
2019-04-15 21:16:24 +02:00
Olivier Houchard
e5eef1f1b4 MINOR: connections: Remove the SUB_CALL_UNSUBSCRIBE flag.
Garbage collect SUB_CALL_UNSUBSCIRBE, as it's now unused.
2019-04-15 19:27:57 +02:00
Nenad Merdanovic
8ef706502a BUG/MINOR: ssl: Fix 48 byte TLS ticket key rotation
Whenever HAProxy was reloaded with rotated keys, the resumption would be
broken for previous encryption key. The bug was introduced with the addition
of 80 byte keys in 9e7547 (MINOR: ssl: add support of aes256 bits ticket keys
on file and cli.).

This fix needs to be backported to 1.9.
2019-04-15 10:09:54 +02:00
Willy Tarreau
24f382f555 CLEANUP: task: do not export rq_next anymore
This one hasn't been used anymore since the scheduler changes after 1.8
but it kept being exported and maintained up to date while it's always
reset when scanning the trees. Let's stop exporting it and updating it.
2019-04-15 09:50:56 +02:00
Christopher Faulet
0ef372a390 MAJOR: muxes/htx: Handle inplicit upgrades from h1 to h2
The upgrade is performed when an H2 preface is detected when the first request
on a connection is parsed. The CS is destroyed by setting EOS flag on it. A
special flag is added on the HTX message to warn the HTX analyzers the stream
will be closed because of an upgrade. This way, no error and no log are
emitted. When the mux h1 is released, we create a mux h2, without any CS and
passing the buffer with the unparsed H2 preface.
2019-04-12 22:06:53 +02:00
Christopher Faulet
c0016d8119 MEDIUM: connection: Add conn_upgrade_mux_fe() to handle mux upgrades
This function will handle mux upgrades, for frontend connections only. It will
retrieve the best mux in the same way than conn_install_mux_fe except that the
mode and optionnally the proto are forced.

The new multiplexer is initialized using a new context and a specific input
buffer. Then, the old one is destroyed. If an error occurred, everything is
rolled back.
2019-04-12 22:06:53 +02:00
Christopher Faulet
73c1207c71 MINOR: muxes: Pass the context of the mux to destroy() instead of the connection
It is mandatory to handle mux upgrades, because during a mux upgrade, the
connection will be reassigned to another multiplexer. So when the old one is
destroyed, it does not own the connection anymore. Or in other words, conn->ctx
does not point to the old mux's context when its destroy() callback is
called. So we now rely on the multiplexer context do destroy it instead of the
connection.

In addition, h1_release() and h2_release() have also been updated in the same
way.
2019-04-12 22:06:53 +02:00
Christopher Faulet
51f73eb11a MEDIUM: muxes: Add an optional input buffer during mux initialization
The mux's callback init() now take a pointer to a buffer as extra argument. It
must be used by the multiplexer as its input buffer. This buffer is always NULL
when a multiplexer is initialized with a fresh connection. But if a mux upgrade
is performed, it may be filled with existing data. Note that, for now, mux
upgrades are not supported. But this commit is mandatory to do so.
2019-04-12 22:06:53 +02:00
Christopher Faulet
209829f159 MINOR: http: update the macro IS_HTX_STRM() to check the stream flag SF_HTX
Instead of matching on the frontend options, we now check if the flag SF_HTX is
set or not on the stream to know if it is an HTX stream or not.
2019-04-12 22:06:53 +02:00
Christopher Faulet
0e160ff5bb MINOR: stream: Set a flag when the stream uses the HTX
The flag SF_HTX has been added to know when a stream uses the HTX or not. It is
set when an HTX stream is created. There are 2 conditions to set it. The first
one is when the HTTP frontend enables the HTX. The second one is when the attached
conn_stream uses an HTX multiplexer.
2019-04-12 22:06:53 +02:00
Christopher Faulet
9f38f5aa80 MINOR: muxes: Add a flag to specify a multiplexer uses the HTX
A multiplexer must now set the flag MX_FL_HTX when it uses the HTX to structured
the data exchanged with channels. the muxes h1 and h2 set this flag. Of course,
for the mux h2, it is set on h2_htx_ops only.
2019-04-12 22:06:53 +02:00
Christopher Faulet
a51ebb7f56 MEDIUM: h1: Add an option to sanitize connection headers during parsing
The flag H1_MF_CLEAN_CONN_HDR has been added to let the H1 parser sanitize
connection headers. It means it will remove all "close" and "keep-alive" values
during the parsing. One noticeable effect is that connection headers may be
unfolded. In practice, this is not a problem because it is not frequent to have
multiple values for the connection headers.

If this flag is set, during the parsing The function
h1_parse_next_connection_header() is called in a loop instead of
h1_parse_conection_header().

No need to backport this patch
2019-04-12 22:06:53 +02:00
Christopher Faulet
03b9d8ba4a MINOR: proto_htx: Don't adjust transaction mode anymore in HTX analyzers
Because the option http-tunnel is now ignored in HTX, there is no longer any
need to adjust the transaction mode in HTX analyzers. A channel can still be
switch to the tunnel mode for legitimate cases (HTTP CONNECT or switching
protocols). So the function htx_adjust_conn_mode() is now useless.

This patch must be backported to 1.9. It is not strictly speaking required but
it will ease futur backports.
2019-04-12 22:06:53 +02:00
Willy Tarreau
64a9c05f37 MINOR: cli/listener: report the number of accepts on "show activity"
The "show activity" command reports the number of incoming connections
dispatched per thread but doesn't report the number of connections
received by each thread. It is important to be able to monitor this
value as it can show that for whatever reason a smaller set of threads
is receiving the connections and dispatching them to all other ones.
2019-04-12 15:54:15 +02:00
Olivier Houchard
526dc95eb9 MINOR: initcall: Don't forget to define the __start/stop_init_##stg symbols.
When creating a new initcall, don't forget to define the symbols, as it may
not be done automatically and that would lead to undefined symbols.

This should be backported to 1.9.
2019-04-10 16:33:25 +02:00
Christopher Faulet
f192d683a7 BUG/MINOR: htx: Preserve empty HTX messages with an unprocessed parsing error
This let a chance to HTX analyzers to handle the error and send the appropriate
response to the client.

This patch must be backported to 1.9.
2019-04-01 15:43:40 +02:00
William Lallemand
9a1ee7ac31 MEDIUM: mworker-prog: implement program for master-worker
This patch implements the external binary support in the master worker.

To configure an external process, you need to use the program section,
for example:

	program dataplane-api
		command ./dataplane_api

Those processes are launched at the same time as the workers.

During a reload of HAProxy, those processes are dealing with the same
sequence as a worker:

  - the master is re-executed
  - the master sends a USR1 signal to the program
  - the master launches a new instance of the program

During a stop, or restart, a SIGTERM is sent to the program.
2019-04-01 14:45:37 +02:00
William Lallemand
7175e6861e MINOR: cli: export cli_parse_default() definition in cli.h
Export the cli_parse_default() function in cli.h so it could be used in
other files.
2019-04-01 14:45:37 +02:00
William Lallemand
3f12887ffa MINOR: mworker: don't use children variable anymore
The children variable is still used in haproxy, it is not required
anymore since we have the information about the current workers in the
mworker_proc linked list.

The oldpids array is also replaced by this linked list when we
generated the arguments for the master reexec.
2019-04-01 14:45:37 +02:00
William Lallemand
9001ce8c2f REORG: mworker: move mworker_cleanlisteners to mworker.c 2019-04-01 14:45:37 +02:00
William Lallemand
e25473c846 REORG: mworker: move signal handlers and related functions
Move the following functions to mworker.c:

void mworker_catch_sighup(struct sig_handler *sh);
void mworker_catch_sigterm(struct sig_handler *sh);
void mworker_catch_sigchld(struct sig_handler *sh);

static void mworker_kill(int sig);
int current_child(int pid);
2019-04-01 14:45:37 +02:00
William Lallemand
3fa724db87 REORG: mworker: move IPC functions to mworker.c
Move the following functions to mworker.c:

void mworker_accept_wrapper(int fd);
void mworker_pipe_register();
2019-04-01 14:45:37 +02:00
William Lallemand
3cd95d2f1b REORG: mworker: move signals functions to mworker.c
Move the following functions to mworker.c:

void mworker_block_signals();
void mworker_unblock_signals();
2019-04-01 14:45:37 +02:00
William Lallemand
48dfbbdea9 REORG: mworker: move serializing functions to mworker.c
Move the 2 following functions to mworker.c:

void mworker_proc_list_to_env()
void mworker_env_to_proc_list()
2019-04-01 14:45:37 +02:00
Willy Tarreau
a1bd1faeeb BUILD: use inttypes.h instead of stdint.h
I found on an (old) AIX 5.1 machine that stdint.h didn't exist while
inttypes.h which is expected to include it does exist and provides the
desired functionalities.

As explained here, stdint being just a subset of inttypes for use in
freestanding environments, it's probably always OK to switch to inttypes
instead:

  https://pubs.opengroup.org/onlinepubs/009696799/basedefs/stdint.h.html

Also it's even clearer here in the autoconf doc :

  https://www.gnu.org/software/autoconf/manual/autoconf-2.61/html_node/Header-Portability.html

  "The C99 standard says that inttypes.h includes stdint.h, so there's
   no need to include stdint.h separately in a standard environment.
   Some implementations have inttypes.h but not stdint.h (e.g., Solaris
   7), but we don't know of any implementation that has stdint.h but not
   inttypes.h"
2019-04-01 07:44:56 +02:00
Willy Tarreau
7b5654f54a BUILD: re-implement an initcall variant without using executable sections
The current initcall implementation relies on dedicated sections (one
section per init stage) to store the initcall descriptors. Then upon
startup, these sections are scanned from beginning to end and all items
found there are called in sequence.

On platforms like AIX or Cygwin it seems difficult to figure the
beginning and end of sections as the linker doesn't seem to provide
the corresponding symbols. In order to replace this, this patch
simply implements an array of single linked (one per init stage)
which are fed using constructors for each register call. These
constructors are declared static, with a name depending on their
line number in the file, in order to avoid name clashes. The final
effect is the same, except that the method is slightly more expensive
in that it explicitly produces code to register these initcalls :

$ size  haproxy.sections haproxy.constructor
   text    data     bss     dec     hex filename
4060312  249176 1457652 5767140  57ffe4 haproxy.sections
4062862  260408 1457652 5780922  5835ba haproxy.constructor

This mechanism is enabled as an alternative to the default one when
build option USE_OBSOLETE_LINKER is set. This option is currently
enabled by default only on AIX and Cygwin, and may be attempted for
any target which fails to build complaining about missing symbols
__start_init_* and/or __stop_init_*.

Once confirmed as a reliable fix, this will likely have to be backported
to 1.9 where AIX and Cygwin do not build anymore.
2019-04-01 07:43:07 +02:00
Willy Tarreau
9d22e56178 MINOR: tools: add an unsetenv() implementation
Older Solaris and AIX versions do not have unsetenv(). This adds a
fairly simple implementation which scans the environment, for use
with those systems. It will simply require to pass the define in
the "DEFINE" macro at build time like this :

      DEFINE="-Dunsetenv=my_unsetenv"
2019-03-29 21:05:37 +01:00
Willy Tarreau
72d9f3351d BUILD: chunk: properly declare pool_head_trash as extern
This one was also declared without the extern modifier in an include
file.

This needs to be backported to 1.9.
2019-03-29 21:03:20 +01:00
Willy Tarreau
e01d11a75b BUILD: http: properly mark some struct as extern
http_known_methods, HTTP_100 and HTTP_103 were not declared extern and
as such were multiply defined since they were in http.h. There was
apparently no more side effect but it may depend on the platform and
the linker.

This needs to be backported to 1.9.
2019-03-29 21:00:22 +01:00
Willy Tarreau
a33d39a1b1 CLEANUP: task: only perform a LIST_DEL() when the list is not empty
In tasklet_free() we unconditionally perform a LIST_DEL() even when
the list is empty, let's move the LIST_DEL() inside the matching block.
2019-03-25 18:10:53 +01:00
Willy Tarreau
e73256fd2a BUG/MEDIUM: task/h2: add an idempotent task removal fucntion
Previous commit 3ea351368 ("BUG/MEDIUM: h2: Remove the tasklet from the
task list if unsubscribing.") uncovered an issue which needs to be
addressed in the scheduler's API. The function task_remove_from_task_list()
was initially designed to remove a task from the running tasklet list from
within the scheduler, and had to be used in h2 to abort pending I/O events.
However this function was not designed to be idempotent, occasionally
causing a double removal from the tasklet list, with the second doing
nothing but affecting the apparent tasks count and making haproxy use
100% CPU on some tests consisting in stopping the client during some
transfers. The h2_unsubscribe() function can sometimes be called upon
stream exit after an error where the tasklet was possibly already
removed, so it.

This patch does 2 things :
  - it renames task_remove_from_task_list() to
    __task_remove_from_tasklet_list() to discourage users from calling
    it. Also note the fix in the naming since it's a tasklet list and
    not a task list. This function is still uesd from the scheduler.
  - it adds a new, idempotent, task_remove_from_tasklet_list() function
    which does nothing if the task is already not in the tasklet list.

This patch will need to be backported where the commit above is backported.
2019-03-25 18:02:54 +01:00
Christopher Faulet
87a8f353f1 CLEANUP: muxes/stream-int: Remove flags CS_FL_READ_NULL and SI_FL_READ_NULL
Since the flag CF_SHUTR is no more set to mark the end of the message, these
flags become useless.

This patch should be backported to 1.9.
2019-03-25 06:55:23 +01:00
Christopher Faulet
297d3e2e0f MINOR: channel: Report EOI on the input channel if it was reached in the mux
The flag CF_EOI is now set on the input channel when the flag CS_FL_EOI is set
on the corresponding conn_stream. In addition, if a read activity is reported
when this flag is set, the stream is woken up.

This patch should be backported to 1.9.
2019-03-25 06:24:43 +01:00
Christopher Faulet
5311a9255d MINOR: connection: and new flag to mark end of input (EOI)
Since the begining, in the H2 multiplexer, when the end of a message is reached,
the flag CS_FL_(R)EOS is set on the conn_stream to notify the upper layer that
all data were received and consumed and there is no longer any expected. The
stream-interface converts it into a shutdown read. But it leads to some
ambiguities with the real shutr. Once it was reported at the end of the message,
there is no way to report it when the read0 is received. For this reason, aborts
after the message was fully received cannot be reported. And on the channel
side, it is hard to make the difference between a shutr because the end of the
message was reached and a shutr because of an abort.

For these reasons, there is now a flag to mark the end of the message. It is
called CS_FL_EOI (end-of-input) because it is only used on the receipt path.
This flag is only declared and not used yet.

This patch will be used by future bug fixes and will have to be backported
to 1.9.
2019-03-25 06:24:25 +01:00
Willy Tarreau
0f22299435 CLEANUP: cache: don't export http_cache_applet anymore
This one can become static since it's not used by http/htx anymore.
2019-03-19 09:58:35 +01:00
Christopher Faulet
3a78aa6e95 BUG/MINOR: stats: Fully consume large requests in the stats applet
In the stats applet (in HTX and legacy HTTP), after a response is fully sent to
a client, the request is consumed. It is done at the end, after all the response
was copied into the channel's buffer. But only outgoing data at time the applet
is called are consumed. Then the applet is closed. If a request with a huge body
is sent, an error is triggerred because a SHUTW is catched for an unfinisehd
request.

Now, we consume request data until the end. In fact, we don't try to shutdown
the request's channel for write anymore.

This patch must be backported to 1.9 after some observation period. It should
probably be backported in prior versions too. But honnestly, with refactoring
on the connection layer and the stream interface in 1.9, it is probably safer
to not do so.
2019-03-19 09:49:29 +01:00
Willy Tarreau
679bba13f7 MINOR: init: report the list of optionally available services
It's never easy to guess what services are built in. We currently have
the prometheus exporter in contrib/ which is the only extension for now.
Let's enumerate all available ones just like we do for filterr and pollers.
2019-03-19 08:08:10 +01:00
Christopher Faulet
203b2b0a5a MINOR: muxes: Report the Last read with a dedicated flag
For conveniance, in HTTP muxes (h1 and h2), the end of the stream and the end of
the message are reported the same way to the stream, by setting the flag
CS_FL_EOS. In the stream-interface, when CS_FL_EOS is detected, a shutdown for
read is reported on the channel side. This is historical. With the legacy HTTP
layer, because the parsing is done by the stream in HTTP analyzers, the EOS
really means a shutdown for read.

Most of time, for muxes h1 and h2, it works pretty well, especially because the
keep-alive is handled by the muxes. The stream is only used for one
transaction. So mixing EOS and EOM is good enough. But not everytime. For now,
client aborts are only reported if it happens before the end of the request. It
is an error and it is properly handled. But because the EOS was already
reported, client aborts after the end of the request are silently
ignored. Eventually an error can be reported when the response is sent to the
client, if the sending fails. Otherwise, if the server does not reply fast
enough, an error is reported when the server timeout is reached. It is the
expected behaviour, excpect when the option abortonclose is set. In this case,
we must report an error when the client aborts. But as said before, this event
can be ignored. So to be short, for now, the abortonclose is broken.

In fact, it is a design problem and we have to rethink all channel's flags and
probably the conn-stream ones too. It is important to split EOS and EOM to not
loose information anymore. But it is not a small job and the refactoring will be
far from straightforward.

So for now, temporary flags are introduced. When the last read is received, the
flag CS_FL_READ_NULL is set on the conn-stream. This way, we can set the flag
SI_FL_READ_NULL on the stream interface. Both flags are persistant. And to be
sure to wake the stream, the event CF_READ_NULL is reported. So the stream will
always have the chance to handle the last read.

This patch must be backported to 1.9 because it will be used by another patch to
fix the option abortonclose.
2019-03-18 15:50:23 +01:00
Christopher Faulet
2b9b6784b9 MINOR: stats: Move stuff about the stats status codes in stats files
The status codes definition (STAT_STATUS_*) and their string representation
stat_status_codes) have been moved in stats files. There is no reason to keep
them in proto_http files.
2019-03-15 14:34:59 +01:00
Christopher Faulet
3c2ecf75c8 MINOR: stats: Add the status code STAT_STATUS_IVAL to handle invalid requests
This patch must be backported to 1.9 because a bug fix depends on it.
2019-03-15 14:34:52 +01:00
Olivier Houchard
1d7f37a2cb BUG/MAJOR: tasks: Use the TASK_GLOBAL flag to know if we're in the global rq.
In task_unlink_rq, to decide if we should logk the global runqueue lock,
use the TASK_GLOBAL flag instead of relying on t->thread_mask being tid_bit,
as it could be so while still being in the global runqueue if another thread
woke that task for us.

This should be backported to 1.9.
2019-03-14 16:19:11 +01:00
Olivier Houchard
237985b228 MEDIUM: connections: Use _HA_ATOMIC_*
Use _HA_ATOMIC_ instead of HA_ATOMIC_ because we know we don't need barriers
2019-03-14 15:55:15 +01:00
Olivier Houchard
9f8d821a55 MEDIUM: list: Use _HA_ATOMIC_*
Use _HA_ATOMIC_ instead of HA_ATOMIC_ because we know we don't need barriers.
2019-03-14 15:55:15 +01:00
Olivier Houchard
17fbb4eb3f MEDIUM: list: Remove useless barriers.
Don't bother forcing a barrier after using HA_ATOMIC_XCHG if we're about
to check the returned value anyway.
2019-03-14 15:55:15 +01:00
Willy Tarreau
b0cef35b09 BUG/MEDIUM: list: fix incorrect pointer unlocking in LIST_DEL_LOCKED()
Injecting on a saturated listener started to exhibit some deadlocks
again between LIST_POP_LOCKED() and LIST_DEL_LOCKED(). Olivier found
it was due to a leftover from a previous debugging session. This patch
fixes it.

This will have to be backported if the other LIST_*_LOCKED() patches
are backported.
2019-03-13 14:15:54 +01:00
Willy Tarreau
df23c0ce45 MINOR: config: continue to rely on DEFAULT_MAXCONN to set the minimum maxconn
Some packages used to rely on DEFAULT_MAXCONN to set the default global
maxconn value to use regardless of the initial ulimit. The recent changes
made the lowest bound set to 100 so that it is compatible with almost any
environment. Now that DEFAULT_MAXCONN is not needed for anything else, we
can use it for the lowest bound set when maxconn is not configured. This
way it retains its original purpose of setting the default maxconn value
eventhough most of the time the effective value will be higher thanks to
the automatic computation based on "ulimit -n".
2019-03-13 10:10:49 +01:00
Willy Tarreau
ca783d4ee6 MINOR: config: remove obsolete use of DEFAULT_MAXCONN at various places
This entry was still set to 2000 but never used anymore. The only places
where it appeared was as an alias to SYSTEM_MAXCONN which forces it, so
let's turn these ones to SYSTEM_MAXCONN and remove the default value for
DEFAULT_MAXCONN. SYSTEM_MAXCONN still defines the upper bound however.
2019-03-13 10:10:25 +01:00
Olivier Houchard
20872763dd MEDIUM: memory: Use the new _HA_ATOMIC_* macros.
Use the new _HA_ATOMIC_* macros and add barriers where needed.
2019-03-11 17:02:38 +01:00
Olivier Houchard
4c28328572 MEDIUM: task: Use the new _HA_ATOMIC_* macros.
Use the new _HA_ATOMIC_* macros and add barriers where needed.
2019-03-11 17:02:37 +01:00
Olivier Houchard
aa4d71a7fe MEDIUM: server: Use the new _HA_ATOMIC_* macros.
Use the new _HA_ATOMIC_* macros and add barriers where needed.
2019-03-11 17:02:37 +01:00
Olivier Houchard
11ecfd1c01 MEDIUM: proxy: Use the new _HA_ATOMIC_* macros.
Use the new _HA_ATOMIC_* macros and add barriers where needed.
2019-03-11 17:02:37 +01:00
Olivier Houchard
d5f9b19196 MEDIUM: freq_ctr: Use the new _HA_ATOMIC_* macros.
Use the new _HA_ATOMIC_* macros and add barriers where needed.
2019-03-11 17:02:37 +01:00