This removes all the hard-coded 8-bit and 256 entries to use a pair of
macros instead so that we can more easily experiment with larger table
sizes if needed.
Storing the pointer to the pool along with the stats is quite useful as
it allows to report the name. That's what we're doing here. We could
store it in place of another field but that's not convenient as it would
require to change all functions that manipulate counters. Thus here we
store one extra field, as well as some padding because the struct turns
56 bytes long, thus better go to 64 directly. Example of output from
"show profiling memory":
2 0 48 0| 0x4bfb2c ha_quic_set_encryption_secrets+0xcc/0xb5e p_alloc(24) [pool=quic_tls_iv]
0 55252 0 10608384| 0x4bed32 main+0x2beb2 free(-192)
15 0 2760 0| 0x4be855 main+0x2b9d5 p_alloc(184) [pool=quic_frame]
1 0 1048 0| 0x4be266 ha_quic_add_handshake_data+0x2b6/0x66d p_alloc(1048) [pool=quic_crypto]
3 0 552 0| 0x4be142 ha_quic_add_handshake_data+0x192/0x66d p_alloc(184) [pool=quic_frame]
31276 0 6755616 0| 0x4bb8f9 quic_sock_fd_iocb+0x689/0x69b p_alloc(216) [pool=quic_dgram]
0 31424 0 6787584| 0x4bb7f3 quic_sock_fd_iocb+0x583/0x69b p_free(-216) [pool=quic_dgram]
152 0 32832 0| 0x4bb4d9 quic_sock_fd_iocb+0x269/0x69b p_alloc(216) [pool=quic_dgram]
Pools are being used so well that it becomes difficult to profile their
usage via the regular memory profiling. Let's add new entries for pools
there, named "p_alloc" and "p_free" that correspond to pool_alloc() and
pool_free(). Ideally it would be nice to only report those that fail
cache lookups but that's complicated, particularly on the free() path
since free lists are released in clusters to the shared pools.
It's worth noting that the alloc_tot/free_tot fields can easily be
determined by multiplying alloc_calls/free_calls by the pool's size, and
could be better used to store a pointer to the pool itself. However it
would require significant changes down the code that sorts output.
If this were to cause a measurable slowdown, an alternate approach could
consist in using a different value of USE_MEMORY_PROFILING to enable pools
profiling. Also, this profiler doesn't depend on intercepting regular malloc
functions, so we could also imagine enabling it alone or the other one alone
or both.
Tests show that the CPU overhead on QUIC (which is already an extremely
intensive user of pools) jumps from ~7% to ~10%. This is quite acceptable
in most deployments.
Right now it's not possible to feed memory profiling info from outside
activity.c, so let's export the function and move the enum and struct
to the include file.
This one is set whenever an FD is reported by a poller with a null owner,
regardless of the thread_mask. It has become totally meaningless because
it only indicates a migrated FD that was not yet reassigned to a thread,
but as soon as a thread uses it, the status will change to skip_fd. Thus
there is no reason to distinguish between the two, it adds more confusion
than it helps. Let's simply drop it.
This adds the necessary flags to permit run-time enabling/disabling of
memory profiling. For now this is disabled.
A few words were added to the management doc about it and recalling that
this is limited to certain OSes.
Both structures are identical except the name of the field starting
the period and its description. Let's call them all freq_ctr and the
period's start "curr_tick" which is generic.
This is only a temporary change and fields are expected to remain
the same with no code change (verified).
The new sched_activity structure will be used to collect task-level
activity based on the target function. The principle is to declare a
large enough array to make collisions rare (256 entries), and hash
the function pointer using a reduced XXH to decide where to store the
stats. On first computation an entry is definitely assigned to the
array and it's done atomically. A special entry (0) is used to store
collisions ("others"). The goal is to make it easy and inexpensive for
the scheduler code to use these to store #calls, cpu_time and lat_time
for each task.
In 2.0, commit d2d3348ac ("MINOR: activity: enable automatic profiling
turn on/off") introduced an automatic mode to enable/disable profiling.
The problem is that the automatic mode automatically changes to on/off,
which implied that the forced on/off modes aren't sticky anymore. It's
annoying when debugging because as soon as the load decreases, profiling
stops.
This makes a small change which ought to have been done first, which
consists in having two states for "auto" (auto-on, auto-off) to
distinguish them from the forced states. Setting to "auto" in the config
defaults to "auto-off" as before, and setting it on the CLI switches to
auto but keeps the current operating state.
This is simple enough to be backported to older releases if needed.
The FD takeover operation might have certain impacts explaining
unexpected activities, so it's important to report such a counter
there. We thus count the number of times a thread has stolen an
FD from another thread.
This one was confusingly called, I thought it was the cumulated number
of streams but it's the number of calls to process_stream(). Let's make
this clearer.
We have poll_drop, poll_dead and poll_skip which are confusingly named
like their poll_io and poll_exp counterparts except that they are not
per poll() call but per-fd. This patch renames them to poll_drop_fd(),
poll_dead_fd() and poll_skip_fd() for this reason.
The "show activity" output mentions a number of indicators to explain
wake up reasons but doesn't have the number of times poll() sees some
I/O. And given that multiple events can happen simultaneously, it's
not always possible to deduce this metric by subtracting.
This patch adds a new "poll_io" counter that allows one to see how
often poll() returns with at least one active FD. This should help
detect stuck events and measure various ratios of poll sub-metrics.
Since 2.1-dev2, with commit 305d5ab46 ("MAJOR: fd: Get rid of the fd cache.")
we don't have the fd_lock anymore and as such its acitvity counter is always
zero. Let's remove it from the struct and from "show activity" output, as
there are already plenty of indicators to look at.
The cache line comment in the struct activity was updated to reflect
reality as it looks like another one already got removed in the past.
This moves types/activity.h to haproxy/activity-t.h and
proto/activity.h to haproxy/activity.h.
The macros defining the bit field values for the profiling variable
were moved to the type file to be more future-proof.