Commit graph

192 commits

Author SHA1 Message Date
Miroslav Zagorac
bab0ea7b77 MEDIUM: otel: implemented scope execution and span management
Implemented the scope execution engine that creates OTel spans, evaluates
sample expressions to collect telemetry data, and manages span lifecycle
during request and response processing.

The scope runner flt_otel_scope_run() was expanded from a stub into a
complete implementation that evaluates ACL conditions on the scope,
extracts span contexts from HTTP headers when configured, iterates over
the scope's span definitions calling flt_otel_scope_run_span() for each,
marks and finishes completed spans, and cleans up unused runtime
resources.

The span runner flt_otel_scope_run_span() creates OTel spans via the
tracer with optional parent references (from other spans or extracted
contexts), collects telemetry by calling flt_otel_sample_add() for each
configured attribute, event, baggage and status entry, then applies the
collected data to the span (attributes, events with their own key-value
arrays, baggage items, and status code with description) and injects the
span context into HTTP headers when configured.

The sample evaluation layer converts HAProxy sample expressions into OTel
telemetry data.  flt_otel_sample_add() evaluates each sample expression
against the stream, converts the result via flt_otel_sample_to_value()
which preserves native types (booleans as OTELC_VALUE_BOOL, integers as
OTELC_VALUE_INT64, all others as strings), and routes the key-value pair
to the appropriate collector based on the sample type (attribute, event,
baggage, or status).  The key-value arrays grow dynamically using the
FLT_OTEL_ATTR_INIT_SIZE and FLT_OTEL_ATTR_INC_SIZE constants.
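
The growth pattern described above can be sketched as follows. This is a minimal illustration and not the filter's code: only the two macro names come from the commit message, while their values and the struct kv, struct kv_array and kv_array_add() names are assumptions made for the example.

```c
#include <assert.h>
#include <stdlib.h>

/* Names match the commit message; the values here are assumed. */
#define FLT_OTEL_ATTR_INIT_SIZE 8
#define FLT_OTEL_ATTR_INC_SIZE  4

/* Hypothetical key-value pair and growable array (not the filter's types). */
struct kv {
	const char *key;
	const char *value;
};

struct kv_array {
	struct kv *items;
	size_t     cnt;  /* entries in use */
	size_t     size; /* entries allocated */
};

/* Append a pair, growing the array by a fixed increment when it is full.
 * Returns 0 on success, -1 on allocation failure (array left untouched).
 */
static int kv_array_add(struct kv_array *a, const char *key, const char *value)
{
	assert(a != NULL);

	if (a->cnt == a->size) {
		size_t new_size = a->size ? a->size + FLT_OTEL_ATTR_INC_SIZE
		                          : FLT_OTEL_ATTR_INIT_SIZE;
		struct kv *ptr = realloc(a->items, new_size * sizeof(*ptr));

		if (ptr == NULL)
			return -1;
		a->items = ptr;
		a->size  = new_size;
	}
	a->items[a->cnt].key   = key;
	a->items[a->cnt].value = value;
	a->cnt++;
	return 0;
}
```

A fixed increment keeps the per-span memory footprint predictable, at the cost of more reallocations than a doubling strategy when many entries are added.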

Span finishing is handled in two phases: flt_otel_scope_finish_mark()
marks spans and contexts for completion using exact name matching or
wildcards ("*" for all, "*req*" for request-direction, "*res*" for
response-direction), and flt_otel_scope_finish_marked() ends all marked
spans with a common monotonic timestamp and destroys their contexts.
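
The marking phase can be pictured with the sketch below. It only illustrates the matching rules listed above; struct span_entry, enum span_dir and finish_mark() are hypothetical names, and the real filter works on its own span and context lists.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical direction tags and span record (not the filter's types). */
enum span_dir { SPAN_DIR_REQ, SPAN_DIR_RES };

struct span_entry {
	const char   *name;
	enum span_dir dir;
	int           marked; /* set when the span should be finished */
};

/* Mark every span selected by <pattern>: "*" selects all spans, "*req*"
 * and "*res*" select by direction, anything else is an exact name.
 * Returns the number of spans newly marked.
 */
static size_t finish_mark(struct span_entry *spans, size_t cnt, const char *pattern)
{
	size_t i, marked = 0;

	assert(spans != NULL && pattern != NULL);

	for (i = 0; i < cnt; i++) {
		int match;

		if (strcmp(pattern, "*") == 0)
			match = 1;
		else if (strcmp(pattern, "*req*") == 0)
			match = (spans[i].dir == SPAN_DIR_REQ);
		else if (strcmp(pattern, "*res*") == 0)
			match = (spans[i].dir == SPAN_DIR_RES);
		else
			match = (strcmp(spans[i].name, pattern) == 0);

		if (match && !spans[i].marked) {
			spans[i].marked = 1;
			marked++;
		}
	}
	return marked;
}
```

Separating marking from finishing lets a second pass end all marked spans with one shared timestamp.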
2026-04-13 09:23:26 +02:00
Miroslav Zagorac
3184470339 MEDIUM: otel: wired OTel C wrapper library integration
Connected the OpenTelemetry C wrapper library to the filter lifecycle by
implementing the library initialization, tracer creation, memory and
thread callbacks, shutdown sequence, and span completion.

The flt_otel_lib_init() function now verifies the C wrapper library
version against the compiled headers, calls otelc_init() with the absolute
configuration file path, and creates the tracer via otelc_tracer_create().
On success, it registers HAProxy pool-based memory callbacks
(flt_otel_mem_malloc, flt_otel_mem_free) and a thread ID callback
(flt_otel_thread_id) through otelc_ext_init(), so the C++ SDK allocates
span and context objects from pool_head_otel_span_context.  A custom log
handler (flt_otel_log_handler_cb) is registered via otelc_log_set_handler()
to count OTel SDK internal diagnostic messages in the flt_otel_drop_cnt
counter.

The per-thread init callback now starts the tracer thread via
OTELC_OPS(tracer, start) instead of unconditionally returning success.

The deinit callback saves the tracer handle before freeing the
configuration, then shuts down the library via otelc_deinit() after the
pool is destroyed, ensuring the ext callbacks remain valid while the
configuration structures are still being freed.  In debug builds, it logs
wrapper statistics, attach counters, and per-event HTX usage counters
before shutdown.

The runtime context cleanup in flt_otel_runtime_context_free() now ends
all active spans with a common monotonic timestamp via
OTELC_OPSR(span, end_with_options) before freeing them.  The scope context
cleanup in flt_otel_scope_context_free() now destroys the underlying OTel
span context via OTELC_OPSR(context, destroy).

The parser gained static storage for the debug memory tracker
(OTELC_DBG_MEM) and its initialization in the parse entry point, used when
compiled with the OTELC_DBG_MEM flag.
2026-04-13 09:23:26 +02:00
Miroslav Zagorac
2e962a5443 MEDIUM: otel: implemented filter callbacks and event dispatcher
Replaced the stub filter callbacks with full implementations that dispatch
OTel events through the scope execution engine, and added the supporting
debug, error handling and utility infrastructure.

The filter lifecycle callbacks (init, deinit, init_per_thread) now
initialize the OpenTelemetry C wrapper library, create the tracer from the
instrumentation configuration file, enable HTX stream filtering, and clean
up the configuration and memory pools on shutdown.

The stream callbacks (attach, stream_start, stream_set_backend,
stream_stop, detach, check_timeouts) create the per-stream runtime context
on attach with rate-limit based sampling, fire the corresponding OTel
events (on-stream-start, on-backend-set, on-stream-stop), manage the
idle timeout timer with reschedule logic in detach, and free the runtime
context in check_timeouts.  The attach callback also registers the
required pre and post channel analyzers from the instrumentation
configuration.

The channel callbacks (start_analyze, pre_analyze, post_analyze,
end_analyze) register per-channel analyzers, map analyzer bits to event
indices via flt_otel_get_event(), and dispatch the matching events.
The end_analyze callback also fires the on-server-unavailable event
when response analyzers were configured but never executed.

The HTTP callbacks (http_headers, http_end, http_reply, and the debug-only
http_payload and http_reset) dispatch their respective request/response
events based on the channel direction.

The event dispatcher flt_otel_event_run() in event.c iterates over all
scopes matching a given event index and calls flt_otel_scope_run() for
each, sharing a common monotonic and wall-clock timestamp across all spans
within a single event.

Error handling is centralized in flt_otel_return_int() and
flt_otel_return_void(), which implement the hard-error/soft-error policy:
hard errors disable the filter for the stream, soft errors are silently
cleared.
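
A rough sketch of such a policy is shown below. It assumes that errors are signalled by a negative return value or a non-NULL message, and that the stream itself keeps being processed in both cases; struct stream_state and return_int() are hypothetical stand-ins, not the filter's own definitions.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-stream state (not the filter's runtime context). */
struct stream_state {
	int hard_errors; /* policy flag: treat errors as hard */
	int disabled;    /* set once the filter is turned off for the stream */
};

/* Apply the policy to a callback result: <retval> < 0 or a non-NULL
 * <*err> signals an error, and <*err> may carry a heap-allocated message.
 * A hard error disables the filter for this stream; a soft error is
 * dropped. Either way the message is released and a non-error value is
 * handed back, so that the stream is not interrupted.
 */
static int return_int(struct stream_state *st, char **err, int retval)
{
	assert(st != NULL && err != NULL);

	if (retval < 0 || *err != NULL) {
		if (st->hard_errors)
			st->disabled = 1;
		free(*err);
		*err = NULL;
		retval = 0;
	}
	return retval;
}
```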

The new debug.h header provides conditional debug macros
(FLT_OTEL_DBG_ARGS, FLT_OTEL_DBG_BUF) and the FLT_OTEL_LOG macro for
structured logging through the instrumentation's log server list.  The
utility layer gained debug-only label functions for channel direction,
proxy mode, stream position, filter type, and analyzer bit name lookups.
2026-04-13 09:23:26 +02:00
Miroslav Zagorac
f05a6735b1 MEDIUM: otel: added memory pool and runtime scope layer
Added the memory pool management and the runtime scope layer that track
per-stream OTel spans and contexts during request processing.

The pool layer in pool.c manages HAProxy memory pools for the runtime
structures used by the filter: scope spans, scope contexts, runtime
contexts, and span contexts.  Each pool is conditionally compiled via
USE_POOL_OTEL_* macros defined in config.h and registered with
REGISTER_POOL().  The allocation functions (flt_otel_pool_alloc,
flt_otel_pool_strndup, flt_otel_pool_free) transparently fall back to
heap allocation when the corresponding pool is not enabled.  Trash buffer
helpers (flt_otel_trash_alloc, flt_otel_trash_free) provide scratch space
using either HAProxy's trash chunk pool or direct heap allocation.

The scope layer in scope.c implements the per-stream runtime state.  The
flt_otel_runtime_context structure is allocated when a stream starts and
holds the stream and filter references, hard-error/disabled/logging flags
copied from the instrumentation configuration, idle timeout state, a
generated UUID, and lists of active scope spans and extracted scope
contexts.  Scope spans (flt_otel_scope_span) carry the operation name,
fetch direction, the OTel span handle, and optional parent references
resolved from other spans or extracted contexts.  Scope contexts
(flt_otel_scope_context) hold an extracted span context obtained from
a carrier text map via the tracer.  The scope data structures
(flt_otel_scope_data) aggregate growable key-value arrays for attributes
and baggage, a linked list of named events with their own attribute
arrays, and a span status code with description, representing the
telemetry collected during a single event execution.
2026-04-13 09:23:26 +02:00
Miroslav Zagorac
c0fd39457f MEDIUM: otel: added post-parse configuration check
Implemented the flt_otel_ops_check() callback that validates the parsed
OTel filter configuration after all HAProxy configuration sections have
been processed.

The check callback performs the following validations: resolves deferred
sample fetch arguments under full frontend and backend capabilities,
verifies uniqueness of filter IDs across all proxies, ensures the
instrumentation section and its configuration file are present, checks
for duplicate group and scope section names, verifies that groups are not
empty, resolves group-to-scope and instrumentation-to-group/scope
cross-references by linking placeholder entries to their definitions,
detects unused scopes, counts root spans and warns when the count differs
from one, and accumulates the required channel analyzer bits from all used
scopes into the instrumentation configuration.

The commit also added the flt_otel_counters structure to track per-event
diagnostic counters in debug builds, the FLT_OTEL_ALERT macro for
filter-scoped error messages, and the FLT_OTEL_DBG_LIST macro for
iterating and dumping named configuration lists.
2026-04-13 09:23:26 +02:00
Miroslav Zagorac
2d56399b0c MEDIUM: otel: added configuration parser and event model
Added the full configuration parser that reads the OTel filter's external
configuration file and the event model that maps filter events to HAProxy
channel analyzers.

The event model in event.h defines an X-macro table
(FLT_OTEL_EVENT_DEFINES) that maps each filter event to its HAProxy
channel analyzer bit, sample fetch direction, and event name.  Events
cover stream lifecycle (start, stop, backend-set, idle-timeout), client
and server sessions, request analyzers (frontend and backend TCP and
HTTP inspection, switching rules, sticking rules, RDP cookie), response
analyzers (TCP inspection, HTTP response processing), and HTTP headers,
end, and reply callbacks.  The event names are partially compatible with
the SPOE filter.  The flt_otel_event_data[] table in event.c is generated
from the same X-macro and provides per-event metadata at runtime.
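
For readers unfamiliar with the X-macro technique, here is a small self-contained sketch. The event names are taken from this series, but the analyzer bit values, the direction flags and every identifier (EVENT_DEFINES, event_data, get_event(), event_name()) are invented for illustration and do not mirror event.h.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical X-macro table: (enum suffix, analyzer bit, is_request, name).
 * Bit values and direction flags are invented for illustration.
 */
#define EVENT_DEFINES                                        \
	EVENT_DEF(STREAM_START, 0x01, 1, "on-stream-start")  \
	EVENT_DEF(BACKEND_SET,  0x02, 1, "on-backend-set")   \
	EVENT_DEF(STREAM_STOP,  0x04, 0, "on-stream-stop")

/* First expansion: one enum constant per event, plus a count. */
enum event_id {
#define EVENT_DEF(id, bit, req, name) EVENT_##id,
	EVENT_DEFINES
#undef EVENT_DEF
	EVENT_MAX
};

struct event_data {
	unsigned int an_bit;
	int          is_request;
	const char  *name;
};

/* Second expansion: the runtime metadata table, indexed by enum value. */
static const struct event_data event_data[EVENT_MAX] = {
#define EVENT_DEF(id, bit, req, name) { bit, req, name },
	EVENT_DEFINES
#undef EVENT_DEF
};

/* Map an analyzer bit back to its event index, or -1 when unknown. */
static int get_event(unsigned int an_bit)
{
	int i;

	for (i = 0; i < EVENT_MAX; i++)
		if (event_data[i].an_bit == an_bit)
			return i;
	return -1;
}

/* Return the configuration name of a valid event index. */
static const char *event_name(int id)
{
	assert(id >= 0 && id < EVENT_MAX);
	return event_data[id].name;
}
```

Because the enum and the table expand from the same list, adding an event is a one-line change and the two can never go out of sync.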

The parser in parser.c implements section parsers for the three OTel
configuration blocks: otel-instrumentation (tracer identity, log server,
config file path, groups, scopes, ACLs, rate-limit, options for
disabled/hard-errors/nolognorm, and debug-level), otel-group (group
identity and scope list), and otel-scope (scope identity, span definitions
with optional root/parent modifiers, attributes, events, baggages, status
codes, inject/extract context operations, finish lists, idle-timeout,
ACLs, and otel-event binding with optional if/unless ACL conditions).
Each section has a post-parse callback that validates the parsed state.

The top-level flt_otel_parse_cfg() temporarily registers these section
parsers, loads the external configuration file via parse_cfg(), and
handles deferred resolution of sample fetch arguments by saving them in
conf->smp_args for later resolution in flt_otel_check() when full frontend
and backend capabilities are available.  The main flt_otel_parse() entry
point was extended to parse the filter ID and config file keywords, verify
that insecure-fork-wanted is enabled, and wire the parsed configuration
into the flt_conf structure.

The utility layer gained flt_otel_strtod() and flt_otel_strtoll() for
validated string-to-number conversion used by rate-limit and debug-level
parsing.
2026-04-13 09:23:26 +02:00
Miroslav Zagorac
8126fd569b MEDIUM: otel: added configuration and utility layer
Added the configuration structures that model the OTel filter's
instrumentation hierarchy and the utility functions that support the
configuration parser.

The configuration is organized as a tree rooted at flt_otel_conf, which
holds the proxy reference, filter identity, and lists of groups and
scopes.  Below it, flt_otel_conf_instr carries the instrumentation
settings: tracer handle, rate limiting, hard-error mode, logging state,
channel analyzers, and placeholder references to groups and scopes.
Groups (flt_otel_conf_group) aggregate scopes by name.  Scopes
(flt_otel_conf_scope) bind an event to its ACL condition, span context
declarations, span definitions and a list of spans scheduled for
finishing.  Spans (flt_otel_conf_span) carry attributes, events,
baggages and status entries, each represented as flt_otel_conf_sample
structures that pair a key with concatenated sample-expression arguments.

All configuration types share a common header macro (FLT_OTEL_CONF_HDR)
that embeds an identifier string, its length, a configuration line number,
and a list link.  Their init and free functions are generated by the
FLT_OTEL_CONF_FUNC_INIT and FLT_OTEL_CONF_FUNC_FREE macros in
conf_funcs.h, with per-type custom initialization and cleanup bodies.

The utility layer in util.c provides argument counting and concatenation
for the configuration parser, sample data to string conversion covering
boolean, integer, IPv4, IPv6, string and HTTP method types, and debug
helpers for dumping argument arrays and linked list state.
2026-04-13 09:23:26 +02:00
Miroslav Zagorac
cd14abf9f3 MEDIUM: otel: added OpenTelemetry filter skeleton
The OpenTelemetry (OTel) filter enables distributed tracing of requests
across service boundaries, export of metrics such as request rates,
latencies and error counts, and structured logging tied to trace context,
giving operators a unified view of HAProxy traffic through any
OpenTelemetry-compatible backend.

The OTel filter is implemented using the standard HAProxy stream filter
API.  Stream filters attach to proxies and intercept traffic at each stage
of processing: they receive callbacks on stream creation and destruction,
channel analyzer events, HTTP header and payload processing, and TCP data
forwarding.  This allows the filter to collect telemetry data at every
stage of the request/response lifecycle without modifying the core proxy
logic.

This commit added the minimum set of files required for the filter to
compile: the addon Makefile with pkg-config-based detection of the
opentelemetry-c-wrapper library, header files with configuration
constants, utility macros and type definitions, and the source files
containing stub filter operation callbacks registered through
flt_otel_ops and the "opentelemetry" keyword parser entry point.

The filter uses the opentelemetry-c-wrapper library from HAProxy
Technologies, which provides a C interface to the OpenTelemetry C++ SDK.
This wrapper allows HAProxy, a C codebase, to leverage the full
OpenTelemetry observability pipeline without direct C++ dependencies
in the HAProxy source tree.

  https://github.com/haproxytech/opentelemetry-c-wrapper
  https://github.com/open-telemetry/opentelemetry-cpp

Build options:

  USE_OTEL     - enable the OpenTelemetry filter
  OTEL_DEBUG   - compile the filter in debug mode
  OTEL_INC     - force the include path to the C wrapper
  OTEL_LIB     - force the library path to the C wrapper
  OTEL_RUNPATH - add the C wrapper RUNPATH to the executable

Example build with OTel and debug enabled:

  make -j8 USE_OTEL=1 OTEL_DEBUG=1 TARGET=linux-glibc
2026-04-13 09:23:26 +02:00
Aurelien DARRAGON
8fe0950511 MINOR: promex: export "haproxy_sticktable_local_updates" metric
haproxy_sticktable_local_updates corresponds to the table->localupdate
counter, which is used internally by the peers protocol to identify
update messages in order to send and ack them among peers.

Here we decide to expose this information, as it is already the case in
"show peers" output, because it turns out that this value, which is
cumulative and grows in sync with the number of updates triggered on the
table due to changes initiated by the current process, can be used to
compute the update rate of the table. Computing the update rate of the
table (from the process point of view, ie: updates sent by the process and
not those received by the process), can be a great load indicator in order
to properly scale the infrastructure that is intended to handle the
table updates.

Note that there is a pitfall, which is that the value will eventually
wrap since it is stored as an unsigned 32-bit integer. Scripts or systems
making use of this value must take wrapping into account between two
readings to properly compute the effective number of updates that were
performed between two readings. Also, they must ensure that the "polling"
rate between readings is small enough so that the value cannot wrap behind
their back.
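
As an illustration of the wrapping rule, a consumer written in C could compute the number of updates between two readings as sketched below. The function names are hypothetical and the sketch assumes the counter wrapped at most once between the two readings.

```c
#include <assert.h>
#include <stdint.h>

/* Number of updates between two readings of a wrapping 32-bit counter.
 * Unsigned subtraction is modulo 2^32, so the result stays correct across
 * one wrap, provided fewer than 2^32 updates happened between readings.
 */
static uint32_t updates_between(uint32_t prev, uint32_t curr)
{
	return curr - prev;
}

/* Average update rate per second over <elapsed_s> seconds. */
static double update_rate(uint32_t prev, uint32_t curr, double elapsed_s)
{
	assert(elapsed_s > 0.0);
	return (double)updates_between(prev, curr) / elapsed_s;
}
```

Tools that store the readings in wider or signed types need to reduce the difference modulo 2^32 themselves.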
2026-03-18 11:18:37 +01:00
Amaury Denoyelle
dd1990a97a MINOR: promex: use watcher to iterate over backend instances
Ensures proxies iteration in promex applet is safe via a new watcher
member. The principle is similar to the one already used for servers
iteration.

Note that ctx.p[0] is not updated anymore at the end of a function, as
this is automatically done via the watcher itself.
2026-02-27 10:28:24 +01:00
Amaury Denoyelle
5000f0b2ef BUG/MINOR: promex: fix server iteration when last server is deleted
Servers iteration via promex is now resilient to server runtime deletion
thanks to the watcher mechanism. However, the watcher was not correctly
initialized which could cause duplicate metrics reporting.

This issue happens when the promex dump yields while manipulating the last
server of a proxy. If this server is removed in parallel, <sv> pointer
will be set to NULL when promex resumes. Instead of switching to another
proxy, the code would reuse the same one and iterate again on the same
server list.

To fix this issue, <sv> pointer must not be reinitialized just after a
resumption point. Instead, this is now performed before
promex_dump_srv_metrics(), or just after switching to another proxy
instance. Thus, on resumption, if promex_dump_srv_metrics() is started
with <sv> as NULL, it means that the server was deleted and the end of
the current proxy list is reached, hence iteration is restarted on the
next proxy instance.

Note that ctx.p[1] does not need to be manually updated at the end of
promex_dump_srv_metrics() as srv_watch already does that.

This patch must be backported up to 3.0.
2026-02-26 18:24:36 +01:00
Amaury Denoyelle
00a106059e MINOR: promex: test applet resume in stress mode
Implement a stress mode with force yield for promex applet each time a
metric is displayed. This is implemented by returning 0 in
promex_dump_ts() each time the output buffer is not empty.

To test this, haproxy must be compiled with DEBUG_STRESS and the
following configuration must be used:

  global
    stress-level 1
2026-02-26 18:24:36 +01:00
Willy Tarreau
95a9f472d2 MEDIUM: counters: change the fill_stats() API to pass the module and extra_counters
We'll soon need to iterate over thread groups in the fill_stats() functions,
so let's first pass the extra_counters and stats_module pointers to the
fill_stats functions. They now call EXTRA_COUNTERS_GET() themselves with
these elements in order to retrieve the required pointer. Nothing else
changed, and it's getting even a bit more transparent for callers.

This doesn't change anything visible however.
2026-02-26 08:24:03 +01:00
David Carlier
cb63e899d9 CLEANUP: deviceatlas: add unlikely hints and minor code tidying
Add unlikely() hints on error paths in init, conv and fetch functions.
Remove unnecessary zero-initialization of local buffers that are
always written before use. Fix indentation in da_haproxy_checkinst()
and remove unused loop variable initialization.
2026-02-14 15:49:00 +01:00
David Carlier
076ec9443c MINOR: deviceatlas: precompute maxhdrlen to skip oversized headers early
Precompute the maximum header name length from the atlas evidence
headers at init and hot-reload time. Use it in da_haproxy_fetch() to
skip headers early that cannot match any known DeviceAtlas evidence
header, avoiding unnecessary string copies and comparisons.
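
The idea can be sketched in isolation as follows. The helper names are hypothetical and the evidence headers are reduced to a plain array of strings; the DeviceAtlas API itself is not used here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Longest name in a list of known header names, computed once at init. */
static size_t compute_maxhdrlen(const char *const *names, size_t cnt)
{
	size_t i, max = 0;

	assert(names != NULL || cnt == 0);

	for (i = 0; i < cnt; i++) {
		size_t len = strlen(names[i]);

		if (len > max)
			max = len;
	}
	return max;
}

/* Per-request check: a header name longer than every known name cannot
 * match, so it can be skipped before any copy or comparison takes place.
 */
static int header_may_match(size_t hdr_name_len, size_t maxhdrlen)
{
	return hdr_name_len <= maxhdrlen;
}
```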
2026-02-14 15:49:00 +01:00
David Carlier
f5d03bbe13 MINOR: deviceatlas: define header_evidence_entry in dummy library header
Add the struct header_evidence_entry definition to the dummy dac.h
to accommodate the ongoing deviceatlas module update which now
iterates over atlas header_priorities to precompute maxhdrlen.
The struct was already referenced by struct da_atlas but lacked
a definition in the dummy header.
2026-02-14 15:49:00 +01:00
David Carlier
23aeb72798 MINOR: deviceatlas: increase DA_MAX_HEADERS and header buffer sizes
Increase DA_MAX_HEADERS from 24 to 32 and hbuf from 24 to 64 to
accommodate current DeviceAtlas data files which may use more headers
and longer header names.
2026-02-14 14:47:22 +01:00
David Carlier
8031bf6e03 MINOR: deviceatlas: check getproptype return and remove pprop indirection
Check the return value of da_atlas_getproptype() and skip the property
on failure instead of using an uninitialized proptype. Also remove the
unnecessary pprop pointer indirection, using prop directly.
2026-02-14 14:47:22 +01:00
David Carlier
0fad24b5da BUG/MINOR: deviceatlas: set cache_size on hot-reloaded atlas instance
When hot-reloading the atlas in da_haproxy_checkinst(), the configured
cache_size was not applied to the new instance, causing it to use the
default value.

This should be backported to lower branches.
2026-02-14 14:47:22 +01:00
David Carlier
1d1daff7c4 BUG/MINOR: deviceatlas: fix deinit to only finalize when initialized
da_fini() was called unconditionally in deinit_deviceatlas() even when
da_init() was never called. Move it inside the daset check. Also remove
the erroneous shm_unlink() call which could affect the dadwsch shared
memory used by the scheduling process.

This should be backported to lower branches.
2026-02-14 14:47:22 +01:00
David Carlier
d8ff676592 BUG/MINOR: deviceatlas: fix resource leak on hot-reload compile failure
In da_haproxy_checkinst(), when da_atlas_compile() failed, the cnew
buffer was leaked. Add a free(cnew) in the else branch.

This should be backported to lower branches.
2026-02-14 14:47:22 +01:00
David Carlier
ea3b1bb866 BUG/MINOR: deviceatlas: fix double-checked locking race in checkinst
In da_haproxy_checkinst(), base[0] was checked before acquiring the
lock but not re-checked after. Another thread could have already
processed the reload between the initial check and the lock
acquisition, leading to a race condition.
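
The corrected pattern looks roughly like the sketch below. It is a generic illustration built on C11 atomics, with an atomic_flag spinlock standing in for the real lock; the pending flag, the reloads counter and check_reload() are hypothetical and unrelated to the module's actual variables.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical reload state: <pending> plays the role of the value checked
 * before locking, <reloads> counts how often the reload body actually ran.
 */
static atomic_int  pending = 0;
static atomic_flag lock = ATOMIC_FLAG_INIT;
static int         reloads = 0; /* protected by <lock> */

static void check_reload(void)
{
	/* First check, without the lock: cheap exit on the common path. */
	if (!atomic_load(&pending))
		return;

	while (atomic_flag_test_and_set(&lock))
		; /* spin until the lock is ours */

	/* Second check, under the lock: another thread may have handled
	 * the reload between the first check and the lock acquisition.
	 */
	if (atomic_load(&pending)) {
		reloads++;
		atomic_store(&pending, 0);
	}

	atomic_flag_clear(&lock);
}
```

Without the second check, two threads passing the first test at the same time would both run the reload body, one after the other.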

This should be backported to lower branches.
2026-02-14 14:47:22 +01:00
David Carlier
734a139c52 BUG/MINOR: deviceatlas: fix cookie vlen using wrong length after extraction
In da_haproxy_fetch(), vlen was set from v.len (the raw header value
length) instead of the truncated copy length. Also the cookie-specific
vlen calculation used an incorrect subtraction instead of the actual
extracted cookie value length (pl) returned by
http_extract_cookie_value().

This should be backported to lower branches.
2026-02-14 14:47:22 +01:00
David Carlier
7098b4f93a BUG/MINOR: deviceatlas: fix off-by-one in da_haproxy_conv()
The user-agent string copy had an off-by-one error: the buffer size
limit did not account for the null terminator, and the memcpy length
used i-1 which truncated the last character of the user-agent string.
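
A bounded copy that avoids both mistakes can be written as in the sketch below. copy_bounded() is a hypothetical helper used for illustration, not the code of da_haproxy_conv().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy at most <dstsize> - 1 bytes of <src> (length <srclen>) into <dst>
 * and always null-terminate. One byte of the buffer is reserved for the
 * terminator, and the copy length is the full clamped length rather than
 * one less. Returns the number of bytes copied.
 */
static size_t copy_bounded(char *dst, size_t dstsize, const char *src, size_t srclen)
{
	size_t n;

	assert(dst != NULL && src != NULL && dstsize > 0);

	n = (srclen < dstsize) ? srclen : dstsize - 1;
	memcpy(dst, src, n);
	dst[n] = '\0';
	return n;
}
```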

This should be backported to lower branches.
2026-02-14 14:47:22 +01:00
David Carlier
d8f219b380 BUG/MEDIUM: deviceatlas: fix resource leaks on init error paths
When da_atlas_compile() or da_atlas_open() failed in init_deviceatlas(),
atlasimgptr was leaked and da_fini() was never called. Also add a NULL
check on strdup() for the default cookie name with proper cleanup of
the atlas and image pointer on failure.

This should be backported to lower branches.
2026-02-14 14:47:22 +01:00
David Carlier
6342705cee BUG/MINOR: deviceatlas: add NULL checks on strdup() results in config parsers
Add missing NULL checks after strdup() for the json file path in
da_json_file() and the cookie name in da_properties_cookie().

This should be backported to lower branches.
2026-02-14 14:47:22 +01:00
David Carlier
2d6e9e15cd BUG/MINOR: deviceatlas: add missing return on error in config parsers
da_log_level() and da_cache_size() were missing a return -1 on error,
causing fall-through to the normal return 0 path when invalid values
were provided.

This should be backported to lower branches.
2026-02-14 14:47:22 +01:00
Christopher Faulet
be68ecc37d BUG/MINOR: promex: Detach promex from the server on error dump its metrics dump
If an error occurs during the dump of a metric for a server, we must take
care to detach promex from the watcher list for this server. It must be
performed explicitly because on error, the applet state (st1) is changed, so
it is not possible to detach it during the applet release stage.

This patch must be backported with b4f64c0ab ("BUG/MEDIUM: promex: server
iteration may rely on stale server") as far as 3.0. On older versions, 2.8
and 2.6, the watcher_detach() line must be replaced by "srv_drop(ctx->p[1])".
2026-01-23 11:40:54 +01:00
Aurelien DARRAGON
b4f64c0abf BUG/MEDIUM: promex: server iteration may rely on stale server
When performing a promex dump, even though we hold reference on server
during resumption after a yield (ie: buffer full), the refcount mechanism
only guarantees that the server pointer will be valid upon resumption, not
that its content will be consistent. As such, sv->next may be garbage upon
resumption. Instead, we must rely on the watcher mechanism to iterate over
server list when resumption is involved like we already do for stats and
lua handlers.

It must be backported anywhere 071ae8ce3 ("BUG/MEDIUM: stats/server: use
watcher to track server during stats dump") was (up to 2.8 it seems).
2026-01-19 14:24:11 +01:00
Christopher Faulet
927884a3eb MINOR: applet: Add a flag to know an applet is using HTX buffers
Multiplexers already explicitly announce their HTX support. Now that it is
possible to set flags on applets, it could be handy to do the same. So, from
now on, HTX-aware applets must set the APPLET_FL_HTX flag.
2025-08-25 11:11:05 +02:00
Willy Tarreau
c264ea1679 MEDIUM: tree-wide: replace most DECLARE_POOL with DECLARE_TYPED_POOL
This will make the pools size and alignment automatically inherit
the type declaration. It was done like this:

   sed -i -e 's:DECLARE_POOL(\([^,]*,[^,]*,\s*\)sizeof(\([^)]*\))):DECLARE_TYPED_POOL(\1\2):g' $(git grep -lw DECLARE_POOL src addons)
   sed -i -e 's:DECLARE_STATIC_POOL(\([^,]*,[^,]*,\s*\)sizeof(\([^)]*\))):DECLARE_STATIC_TYPED_POOL(\1\2):g' $(git grep -lw DECLARE_STATIC_POOL src addons)

81 replacements were made. The only remaining ones are those which set
their own size without depending on a structure. The few ones with an
extra size were manually handled.

It also means that the requested alignments are now checked against the
type's. Given that none is specified for now, no issue is reported.

It was verified with "show pools detailed" that the definitions are
exactly the same, and that the binaries are similar.
2025-08-11 19:55:30 +02:00
Christopher Faulet
337768656b MINOR: applet: Add support for flags on applets with a flag about the new API
A new field was added in the applet structure to be able to set flags on the
applets. The first one is related to the new API. APPLET_FL_NEW_API is set
for applets based on the new API. It was set on all HAProxy's applets.
2025-07-25 15:44:02 +02:00
Christopher Faulet
2e5e6cdf23 MEDIUM: promex: Update the promex applet to use their own buffers
Thanks to this patch, the promex applet is now using its own buffers.
.rcv_buf and .snd_buf callback functions are now defined to use the default
HTX functions. Parts to receive and send data have also been updated to use
the applet API and to remove any dependencies on the stream-connectors and
the channels.
2025-07-24 12:13:42 +02:00
David Carlier
0e8e20a83f BUILD/MEDIUM: deviceatlas: fix when installed in custom locations.
DEVICEATLAS_INC and DEVICEATLAS_LIB are reused when the DeviceAtlas
library has been compiled with cmake and installed via its make install
target. This works fine except when ldconfig is unaware of the install path,
so the corresponding cflags/ldflags are now added as well.

Ideally, to be backported down to the lowest stable branch.
2025-07-03 09:08:06 +02:00
Christopher Faulet
7244f16ac4 MINOR: promex: Add agent check status/code/duration metrics
In the Prometheus exporter, the last health check status is already exposed,
with its code and duration in seconds. The server status is also exposed.
But the information about the agent check is not available. This is not
really handy because when a server status is changed because of the agent,
it is not obvious by looking at the Prometheus metrics. Indeed, the server
may be reported as DOWN for instance, while the health check status still
reports a success. Being able to get the agent status in that case could be
valuable.

So now, the last agent check status is exposed, with its code and duration
in seconds. The following metrics can now be retrieved:

  * haproxy_server_agent_status
  * haproxy_server_agent_code
  * haproxy_server_agent_duration_seconds

Note that unlike the other metrics, no per-backend aggregated metric is
exposed.

This patch is related to issue #2983.
2025-05-22 09:50:10 +02:00
Ilia Shipitsin
27a6353ceb CLEANUP: assorted typo fixes in the code, commits and doc
2025-04-03 11:37:25 +02:00
Ilia Shipitsin
78b849b839 CLEANUP: assorted typo fixes in the code and comments
code, comments and doc actually.
2025-04-02 11:12:20 +02:00
Aurelien DARRAGON
83074bf690 MINOR: promex: get rid of promex_st_metrics array
In this patch we pursue the work started in a5aadbd ("MEDIUM: promex:
switch to using stat_cols_px for front/back/server metrics"):

Indeed, while having ".promex_name" info in stat_cols_info generic array
was confusing, Willy suggested that we have ".alt_name" which stays
generic and may be considered by alternative exporters for metric naming.
For now, only promex exporter will make use of it.

This allows us to completely get rid of the
promex_st_metrics array. The other main benefit is that it will be much harder
to overlook promex metric definition now because .alt_name has more
visibility in the main metric array rather than in an addon file.
2025-03-21 17:05:31 +01:00
Aurelien DARRAGON
155fb4ec74 MINOR: promex: get rid of promex_global_metric array
In this patch we pursue the work started in 1adc796 ("MEDIUM: promex:
switch to using stat_cols_info for global metrics"):

Indeed, while having ".promex_name" info in stat_cols_info generic array
was confusing, Willy suggested that we have ".alt_name" which stays
generic and may be considered by alternative exporters for metric naming.
For now, only promex exporter will make use of it.

This allows us to completely get rid of the
promex_global_metric array. The other main benefit is that it will be
much harder to overlook promex metric definition now because .alt_name
has more visibility in the main metric array rather than in an addon file.
2025-03-21 17:05:14 +01:00
Aurelien DARRAGON
85f2f93d11 CLEANUP: promex: remove unused PROMEX_FL_{INFO,FRONT,BACK,LI,SRV} flags
Now that promex metric dumping relies on the stat_cols API, we no longer
make use of these flags, so let's remove them.
2025-03-20 11:42:58 +01:00
Aurelien DARRAGON
a5aadbd512 MEDIUM: promex: switch to using stat_cols_px for front/back/server metrics
Now that the stat_cols_px array contains all info that prometheus requires,
stop using the promex_st_metrics array that contains redundant info.

As for ("MEDIUM: promex: switch to using stat_cols_info for global
metrics"), the initial goal was to completely get rid of the
promex_st_metrics array, but it turns out it is still required, though only
for the name mapping part now. So in this commit we change it from an array
of complex structures (with redundant info) to a simple ist array holding
the metric id:promex name mapping. If a metric name is not defined there,
then promex ignores it.
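
The id:name mapping described above can be sketched as follows. This is
only an illustration, not the promex code: the demo_* names and metric
ids are hypothetical, and plain C strings stand in for HAProxy's ist type.

```c
#include <stddef.h>

/* Hypothetical metric ids, standing in for the real stat column enum. */
enum demo_metric_id {
	DEMO_ST_PXNAME,
	DEMO_ST_SCUR,
	DEMO_ST_SMAX,
	DEMO_ST_MAX
};

/* Sparse id -> exported name table. Ids that are not listed stay NULL. */
static const char *const demo_promex_names[DEMO_ST_MAX] = {
	[DEMO_ST_SCUR] = "current_sessions",
	[DEMO_ST_SMAX] = "max_sessions",
};

/* Returns the exported name, or NULL when the exporter must skip the metric. */
static const char *demo_promex_name(int id)
{
	if (id < 0 || id >= DEMO_ST_MAX)
		return NULL;
	return demo_promex_names[id];
}
```

With such a table, exposing a new metric only requires adding one
designated initializer; everything else about the metric comes from the
generic stats array.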
2025-03-20 11:40:07 +01:00
Aurelien DARRAGON
d31ef6134a MINOR: promex: expose ST_I_INF_WARNINGS (AKA total_warnings) metric
It has been requested to have the ST_I_INF_WARNINGS metric available from
prometheus, let's define it in promex_global_metrics ist array so that
prometheus starts advertising it.
2025-03-20 11:39:16 +01:00
Aurelien DARRAGON
1adc796c4b MEDIUM: promex: switch to using stat_cols_info for global metrics
Now the stat_cols_info array contains all info that prometheus requires,
stop using the promex_global_metrics array that contains redundant infos.

The initial goal was to completely drop the promex_global_metrics array.
However, this was deemed no longer relevant, as prometheus stats rely on a
custom name that cannot be derived from stat_cols_info[] unless we add
a specific ".promex_name" field or similar to name the stats for
prometheus. This is what was done in a first attempt, but it proved
to burden the stat_cols_info[] array (not only memory-wise: it is also
quite confusing to see promex in the main codebase, given that prometheus
is shipped as an optional add-on).

The new strategy consists in revamping the promex_global_metrics array
from promex_metric (with all redundant fields for metrics) to a simple
ID<==>IST mapping. If the metric is mapped, then it means promex addon
should advertise it (using the name provided in the mapping). Now for
all the metric retrieval, no longer rely on built-in hardcoded values
but instead leverage the new stat cols API.

The tricky part is the .type association, because the general rule doesn't
apply to all metrics: it seems that we stated that some metrics that are
not counters (at least from haproxy's point of view) had to be presented
as counter metrics. So in this patch we add some special treatment for
those metrics to emulate the old behavior. If that's not relevant in the
future, it may be removed. But this requires ensuring that promex users
will properly cope with that change. At least for now, no change of
behavior should be expected.
2025-03-20 11:38:56 +01:00
Christopher Faulet
6048460102 MEDIUM: stream: Map task wake up reasons to dedicated stream events
To fix thread-safety issues when a stream must be shut, three new task
states were added. These states are generic (UEVT1, UEVT2 and UEVT3), and
the task callback function is responsible for knowing what to do with them.
However, this is not really scalable.

The best is to use an atomic field in the stream structure itself to deal
with these dedicated events. There is already the "pending_events" field
that saves wake up reasons (TASK_WOKEN_*) so as not to lose them if
process_stream() is interrupted before it had a chance to handle them.

So the idea is to introduce a new field to handle stream-dedicated events
and merge them with the task's wake up reasons used by the stream. This
means a mapping must be performed between some task wake up reasons and
stream events. Note that not all task wake up reasons will be mapped.

In this patch, the "new_events" field is introduced. It is an atomic
bit-field. Streams events (STRM_EVT_*) are also introduced to map the task
wake up reasons used by process_stream(). Only TASK_WOKEN_TIMER and
TASK_WOKEN_MSG are mapped, in addition to TASK_F_UEVT* flags. In
process_stream(), "pending_events" field is now filled with new stream
events and the mapping of the wake up reasons.
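
The scheme described above can be sketched as follows. This is a minimal
illustration using C11 atomics, not the actual stream code: all DEMO_* and
demo_* names and bit values are hypothetical stand-ins for TASK_WOKEN_*,
TASK_F_UEVT* and STRM_EVT_*.

```c
#include <stdatomic.h>

/* Hypothetical wake-up reason bits (stand-ins for TASK_WOKEN_* / TASK_F_UEVT*). */
#define DEMO_WOKEN_TIMER 0x01u
#define DEMO_WOKEN_IO    0x02u /* deliberately left unmapped */
#define DEMO_WOKEN_MSG   0x04u
#define DEMO_F_UEVT1     0x08u

/* Hypothetical stream events (stand-ins for STRM_EVT_*). */
#define DEMO_EVT_TIMER   0x01u
#define DEMO_EVT_MSG     0x02u
#define DEMO_EVT_USER1   0x04u

struct demo_stream {
	atomic_uint new_events;      /* may be set from any thread */
	unsigned int pending_events; /* only touched by the stream's own thread */
};

/* Maps the wake-up reasons the stream cares about to stream events. */
static unsigned int demo_map_wakeup(unsigned int state)
{
	unsigned int evts = 0;

	if (state & DEMO_WOKEN_TIMER)
		evts |= DEMO_EVT_TIMER;
	if (state & DEMO_WOKEN_MSG)
		evts |= DEMO_EVT_MSG;
	if (state & DEMO_F_UEVT1)
		evts |= DEMO_EVT_USER1;
	return evts;
}

/* Called by another thread to notify the stream of an event. */
static void demo_post_event(struct demo_stream *s, unsigned int evt)
{
	atomic_fetch_or(&s->new_events, evt);
}

/* Called at the start of processing: fetch-and-clear the new events and
 * merge them with the mapped wake-up reasons into pending_events.
 */
static void demo_collect_events(struct demo_stream *s, unsigned int state)
{
	s->pending_events |= atomic_exchange(&s->new_events, 0u);
	s->pending_events |= demo_map_wakeup(state);
}
```

The fetch-and-clear step is what keeps an event from being lost or
handled twice when another thread posts it while the stream is running.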
2025-01-28 14:53:37 +01:00
Christopher Faulet
91578212d7 BUG/MEDIUM: promex: Use right context pointers to dump backends extra-counters
When backends extra counters are dumped, the wrong pointer was used in the
promex context to retrieve the stats module. p[1] must be used instead of
p[2]. Because of this typo, an infinite loop could be experienced if the
output buffer is full during this stage. But in all cases an overflow is
possible, leading to a memory corruption.

This patch may be related to issue #2831. It must be backported as far as
3.0.
2025-01-14 15:38:43 +01:00
Willy Tarreau
450528b9f5 DOC: ot: mention planned deprecation of the OT filter
Miroslav mentioned below that he's currently working on an OpenTelemetry
replacement for the OpenTracing filter since OpenTracing itself is no
longer maintained nor supported:

  https://github.com/haproxy/haproxy/issues/2782#issuecomment-2493576327

Given that he aims for 3.2, let's already settle on an upcoming deprecation
of the filter for 3.3 with a removal for 3.5. This will leave time to finish
the development and permit users to switch smoothly. At this point no warning
is emitted (since the users have no alternative) but better mention this plan
in the doc to make them aware of future changes.
2024-11-22 16:11:51 +01:00
Christopher Faulet
25b0592745 MINOR: promex: Add global and proxies description as labels to all metrics
While the global description is exposed, when defined, in a dedicated
metric, it was not possible to dump the description defined in
frontend/listen/backend sections. So, thanks to this patch, it is now
possible to dump it as a label of all metrics of the corresponding
section. To do so, the "desc-labels" parameter must be provided on the URL:

    /metrics?desc-labels

When this parameter is set, if a description is provided in a section,
including the global one, the "desc" label will be added to all metrics of
this section. For instance:

  haproxy_frontend_current_sessions{proxy="front-http",desc="..."} 1

Note that servers metrics inherit the description of their backend/listen
section.
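
The resulting output line can be sketched as follows. This only
illustrates the label layout shown above; demo_format_metric() is a
hypothetical helper, not a promex function, and it does not escape label
values, which a real exporter has to do.

```c
#include <stdio.h>

/* Formats one metric line into <out>. The "desc" label is only added when
 * <with_desc> is set (the "desc-labels" URL parameter was given) and the
 * section has a non-empty description. Returns what snprintf() returns.
 */
static int demo_format_metric(char *out, size_t size, const char *name,
                              const char *proxy, const char *desc,
                              int with_desc, long value)
{
	if (with_desc && desc && *desc)
		return snprintf(out, size, "%s{proxy=\"%s\",desc=\"%s\"} %ld\n",
		                name, proxy, desc, value);
	return snprintf(out, size, "%s{proxy=\"%s\"} %ld\n", name, proxy, value);
}
```

Without the parameter, or for a section with no description, the line
keeps its usual labels only.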

This patch should solve the issue #1531.
2024-11-15 14:25:13 +01:00
Christopher Faulet
451d216a53 MINOR: promex: Expose the global node and description in process metrics
The global node value is now exposed via the "haproxy_process_node" metric.
The metric value is always set to 1 and the node name itself is the "node"
label. The same is done for the global description, but only if it is
defined. In that case the "haproxy_process_description" metric is defined,
with 1 as value and the description itself set in the "desc" label.
2024-11-15 14:24:31 +01:00
Miroslav Zagorac
aadda34fd6 BUILD: ot: use a cebtree instead of a list for variable names
In order for the function flt_ot_vars_scope_dump() to work, it is
necessary to take into account the changes made by the commits 47ec7c681
("OPTIM: vars: use a cebtree instead of a list for variable names") and
5d350d1e5 ("OPTIM: vars: use multiple name heads in the vars struct").

The function is only used if the OT_DEBUG=1 option is set when compiling
HAProxy.
2024-11-12 11:07:13 +01:00
Christopher Faulet
d1adfd9fe4 BUG/MEDIUM: promex: Fix dump of extra counters
When extra counters are dumped for an entity (frontend, backend, server or
listener), there is a filter on capabilities. Some extra counters are not
available for all entities and must be ignored. However, when a counter was
ignored this way, the field number, used as an index to dump the metric
value, was still incremented although it should not have been, leading to
an overflow or a stats mix-up.

This patch must be backported to 3.0.
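
The class of bug fixed here can be sketched as follows. This is not the
promex code: the demo_* names are hypothetical, and the sketch assumes
that the values array only holds entries for the counters that apply to
the dumped entity.

```c
#include <stddef.h>

#define DEMO_CAP_FE 0x1u
#define DEMO_CAP_BE 0x2u

/* Hypothetical extra counter definition with its capability mask. */
struct demo_counter {
	const char *name;
	unsigned int cap;
};

/* Dumps the counters matching <cap>. values[] is assumed to be compact:
 * it only contains the counters applicable to this entity, in order.
 * Returns the number of counters dumped.
 */
static size_t demo_dump_extra(const struct demo_counter *defs, size_t ndefs,
                              unsigned int cap,
                              const long *values, size_t nvalues,
                              const char **names_out, long *values_out)
{
	size_t fld = 0; /* index into the compact values[] array */
	size_t i;

	for (i = 0; i < ndefs; i++) {
		if (!(defs[i].cap & cap))
			continue; /* filtered out: fld must NOT advance here */
		if (fld >= nvalues)
			break;    /* never read past the values array */
		names_out[fld] = defs[i].name;
		values_out[fld] = values[fld];
		fld++;
	}
	return fld;
}
```

Advancing the index on the filtered branch as well would pair names with
the wrong values and could read past the end of the values array.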
2024-11-05 15:36:41 +01:00