haproxy/doc/internals/core-principles.txt
Willy Tarreau 4519906c70 DOC: internal: add a few rules about internal core principles
The new file core-principles.txt quickly enumerates a number of rules
and invariants across the project. These can be used as quick reminders
as well as basic rules for reviews. It's still lacking a lot of info but
should be a good start.
2026-05-16 20:12:32 +02:00

HAPROXY CORE PRINCIPLES
0. RULE ZERO: EXCEPTIONS AND JUSTIFICATION
- These rules are mandatory; violations are bugs unless explicitly justified.
- A violation is acceptable if accompanied by a comment explaining WHY the
standard approach was insufficient (e.g., "Performance-critical bypass").
- Reviews should flag unjustified violations but accept commented ones.
1. PROJECT ORGANIZATION
- header files all under "include/", and split between haproxy/<file>-t.h for
type definitions (types, enums, structures), and haproxy/<file>.h for static
definitions and exported symbols. A few imported libs under include/import.
- C source files in src/.
- some API doc in doc/internals/api/ (not always up to date, check date or
version at the top).
2. ENVIRONMENT AND DATA TYPES
- The project targets 32/64-bit POSIX systems (little or big endian).
- Char is signed or unsigned 8-bit, short signed 16-bit, int signed 32-bit.
- Long and pointers always match the native word size. Long long is 64-bit.
- Aliases: uchar (unsigned char), uint (unsigned int), ulong (unsigned long),
ushort (unsigned short), ullong (unsigned long long), llong (long long),
schar (signed char).
- size_t is always the same size as long, but is often declared as uint on
32-bit and ulong on 64-bit. Do not pass it to printf() without a cast (e.g.
cast to ulong and print with "%lu").
- Main platforms are x86_64 and aarch64 with high thread counts (>=64).
- Unaligned accesses are permitted for archs that support them; portable
wrappers in net_helper.h (read_u32(), write_u32() etc).
- signed integer wrapping well-defined via -fwrapv.
- arch-specific asm() statements OK as long as equivalent C-code exists for
generic archs.
- Pointer arithmetic is used a lot via container_of(), offset_of(), and void*
casts.
- Floating point not used.
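As an illustration of the pointer arithmetic and size_t rules above, here is
a minimal sketch. DEMO_CONTAINER_OF, struct demo_item, demo_item_of() and
demo_fmt_size() are hypothetical names written for this example only; they
are stand-ins, not the project's own definitions.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a container_of()-style macro: from a pointer
 * to a member, compute the address of the enclosing structure.
 */
#define DEMO_CONTAINER_OF(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct demo_node {
	struct demo_node *next;
};

struct demo_item {
	int id;
	struct demo_node node; /* embedded member, e.g. a list element */
};

/* Returns the item which embeds node <n>. */
struct demo_item *demo_item_of(struct demo_node *n)
{
	return DEMO_CONTAINER_OF(n, struct demo_item, node);
}

/* Formats <v> into <dst> (capacity <dstlen>). size_t is cast to unsigned
 * long and printed with "%lu" because its underlying type differs between
 * 32-bit and 64-bit targets. Returns what snprintf() returned.
 */
int demo_fmt_size(char *dst, size_t dstlen, size_t v)
{
	return snprintf(dst, dstlen, "%lu", (unsigned long)v);
}
```

The cast is what keeps the format string identical on every target, since
only the argument type is adapted.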
3. MEMORY MANAGEMENT AND POOLS
- Pools are used for runtime allocation; malloc/free are for boot code only.
- pool_alloc() semantics match malloc(); the return must always be tested.
- pool_alloc() and malloc() are not interchangeable / compatible.
- pool_free() semantics match free(); it is a no-op on NULL.
- pool_free() makes the pointer invalid immediately; it must not be touched
or passed to pool_free() again.
- Memory allocated from one pool must be released to the same pool.
- ha_free() calls free() and sets the pointer to NULL before returning.
- my_realloc2() frees the original pointer if the allocation fails.
- never leave dangling pointers in structs after free().
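The "free and reset" and "free on failed realloc" behaviours described above
can be sketched as follows. DEMO_FREE(), demo_realloc2() and demo_grow() are
hypothetical stand-ins built on the standard allocator for illustration; they
only mimic the semantics stated above and are not the project's code.

```c
#include <stdlib.h>

/* Stand-in for an ha_free()-like helper: releases the area pointed to by
 * *<pp> and resets *<pp> to NULL so that no dangling pointer remains.
 */
#define DEMO_FREE(pp) do { free(*(pp)); *(pp) = NULL; } while (0)

/* Stand-in for a my_realloc2()-like helper: behaves like realloc() except
 * that the original area is released when the allocation fails.
 */
void *demo_realloc2(void *ptr, size_t size)
{
	void *ret;

	ret = realloc(ptr, size);
	if (!ret && size)
		free(ptr);
	return ret;
}

/* Resizes *<area> to <newsize> bytes. Returns non-zero on success, or 0 on
 * failure, in which case the old area was released and *<area> is NULL.
 */
int demo_grow(char **area, size_t newsize)
{
	char *tmp;

	tmp = demo_realloc2(*area, newsize);
	*area = tmp; /* on failure the old block is already freed */
	return tmp != NULL;
}
```

Assigning the result back unconditionally is only safe here because the
helper frees the original on failure; with plain realloc() the same pattern
would leak.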
4. BUFFER INVARIANTS (struct buffer)
- Buffers are 4-word inline structs used for data in transit (wrapping,
sliding window).
- Members: area (storage), size (capacity), head (offset), data (count).
- The area pointer is allowed to be NULL when size is zero.
- always true: 0<=data<=size; always true when size>0: 0<=head<size.
- contents start at <head>, for <data> bytes, and may wrap at the end of the
storage area (area+size).
- API (b_*, in buf.h and dynbuf.h) supports empty or unallocated buffers.
- idempotent functions b_alloc() and b_free() use pools to manage the
storage area and check <size> to know if alloc/free still needed.
- a non-contiguous version exists (ncbuf, ncbmbuf), allowing holes anywhere
in data. The former mandates holes of at least 8 bytes. The latter relies
on a bitmap of populated places.
- another string API exists, "ist", representing a pointer and a length in a
struct that is returned by inline functions and macros. It is described in
doc/internals/api/ist.txt
- buffers can switch to and from HTX, which is an internal representation of
HTTP elements, with an API supporting header addition/modification/removal,
start-line manipulation, data appending/consumption etc. HTX functions are
all prefixed with "htx_". Between htx_from_buf() and htx_to_buf(), only the
HTX API may be used, not the b_* API.
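The head/data/size invariants and the wrapping rule can be made concrete with
a small sketch. struct demo_buf and the demo_buf_*() functions below are
hypothetical and only reproduce the four members and the invariants listed
above; the real API is the b_* one.

```c
#include <stddef.h>

struct demo_buf {
	char *area;  /* storage, may be NULL when size is zero */
	size_t size; /* capacity in bytes */
	size_t head; /* offset of the first byte of data */
	size_t data; /* number of bytes of data */
};

/* Returns non-zero if <b> respects the invariants, otherwise 0. */
int demo_buf_is_valid(const struct demo_buf *b)
{
	if (b->data > b->size)
		return 0;
	if (b->size != 0 && b->head >= b->size)
		return 0;
	return 1;
}

/* Returns the number of bytes readable from <head> without wrapping. */
size_t demo_buf_contig(const struct demo_buf *b)
{
	size_t to_end = b->size - b->head;

	return b->data < to_end ? b->data : to_end;
}

/* Returns the byte located <ofs> bytes after the head, wrapping at the end
 * of the storage area. The caller must ensure that ofs < b->data.
 */
char demo_buf_peek(const struct demo_buf *b, size_t ofs)
{
	size_t pos = b->head + ofs;

	if (pos >= b->size)
		pos -= b->size;
	return b->area[pos];
}
```

With an 8-byte area, head=6 and data=4, the contents are the bytes at offsets
6, 7, 0 and 1, of which only the first two are contiguous.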
5. DATA MANIPULATION (CHUNKS, TRASH, LISTS, TREES)
- Chunks use the buffer API but are NOT allowed to wrap.
- Chunks are used for linear operations like chunk_printf().
- Trash is a thread-local temporary buffer; scope stays within the caller.
- trash always the same size as a buffer (global.tune.bufsize).
- get_trash_chunk() provides up to 3 rotating thread-local trash chunks (with
a scope spanning from the call to the next function call).
- For longer lived trash chunks, alloc_trash_chunk() is available but must be
released using free_trash_chunk() on leaving.
- standard doubly-linked lists (struct list) are provided via macros LIST_*.
- LIST_INIT() must be used on new heads and elements. LIST_DELETE() only
removes the element and does not reinitialize it, so the idempotent
LIST_DEL_INIT() is generally preferred. Iterators like list_for_each_* are
available, some safe against item removal. See doc/internals/api/list.txt
for details (grep -i "^list_" to list available macros).
- thread-safe doubly-linked lists (struct mt_list) are provided via macros
mt_list_*. They work like lists and use compatible storage, though they may
not be mixed. See doc/internals/api/mt_list.txt (grep -i "^mt_list_" to
list available operations).
- elastic binary trees (ebtree) are used for fast access (O(logN) operations,
O(1) deletion). Idempotent deletion. Main functions are lookup, insert,
delete, first, next, with type-based prefix eb{32,64,st,mb,pt}_*().
- compact elastic binary trees (cebtree) are used for read-mostly data,
focusing on space savings (O(logN) operations, but higher cost than ebtree).
Same ops
as ebtree, with type-based prefix ceb{32,u32,64,u64,s,is}_*.
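To show why an idempotent "delete and reinitialize" is generally preferred
over a plain delete on doubly-linked lists, here is a minimal sketch.
struct demo_list and the demo_list_*() functions are hypothetical stand-ins
for a circular list; they are not the LIST_* macros themselves.

```c
#include <stddef.h>

struct demo_list {
	struct demo_list *n; /* next element */
	struct demo_list *p; /* previous element */
};

/* Makes <l> point to itself; required for new heads and new elements. */
void demo_list_init(struct demo_list *l)
{
	l->n = l;
	l->p = l;
}

/* Returns non-zero if <l> points to itself (empty head, detached element). */
int demo_list_is_empty(const struct demo_list *l)
{
	return l->n == l;
}

/* Appends <el> at the end of the list whose head is <head>. */
void demo_list_append(struct demo_list *head, struct demo_list *el)
{
	el->p = head->p;
	el->n = head;
	head->p->n = el;
	head->p = el;
}

/* Unlinks <el> only: <el> keeps stale pointers to its former neighbours,
 * so calling this twice on the same element corrupts the list.
 */
void demo_list_delete(struct demo_list *el)
{
	el->n->p = el->p;
	el->p->n = el->n;
}

/* Unlinks <el> and reinitializes it; calling it again is a no-op. */
void demo_list_del_init(struct demo_list *el)
{
	demo_list_delete(el);
	demo_list_init(el);
}
```

After demo_list_del_init() the element points to itself, so a second call
only rewrites its own pointers and leaves the rest of the list untouched.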
6. THREAD SYNCHRONIZATION
- Threads are started at boot (one per CPU) and persist for the process life,
arranged in thread groups (tg) by cache locality.
- Each thread has its own polling loop and scheduler. Total parallelism.
- thread_isolate()/thread_release() for total thread isolation (very heavy).
- "tid" always current thread number, "th_ctx" always current thread's context,
"ti" current thread info.
- "tgid" always current tg number, "tg_ctx" current tg context.
- HA_ATOMIC_* for atomic operations on integers and pointers (includes load
and store). DWCAS available on some platforms but requires an equivalent
for other ones.
- The _HA_ATOMIC_* versions (leading underscore) do not use barriers so these
must be explicit (__ha_barrier_*).
- Atomic loops must use CPU relaxation or exponential back-off.
- For multiple changes at once, threads may use spinlocks (HA_SPIN_LOCK()/
HA_SPIN_UNLOCK()/HA_SPIN_TRYLOCK()), and upgradable RW locks (HA_RWLOCK_*) if
read accesses dominate.
- No sleeping locks (mutex etc), only spinning/rwlocks/atomic loops.
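The rule about atomic loops using CPU relaxation can be sketched as below.
This example uses C11 <stdatomic.h> as a stand-in because the HA_ATOMIC_*
macros are not available outside the source tree; demo_cpu_relax() and
demo_update_max() are hypothetical names, and the arch-specific asm has a
generic fallback as required earlier in this document. It assumes a GCC or
Clang compatible compiler.

```c
#include <stdatomic.h>

/* Hints the CPU that we are spinning. Arch-specific asm is used where
 * known, with a generic fallback (compiler barrier only) elsewhere.
 */
static inline void demo_cpu_relax(void)
{
#if defined(__x86_64__) || defined(__i386__)
	__asm__ volatile("pause" ::: "memory");
#elif defined(__aarch64__)
	__asm__ volatile("isb" ::: "memory");
#else
	atomic_signal_fence(memory_order_seq_cst);
#endif
}

/* Atomically raises *<max> to <val> if <val> is larger. Returns the last
 * value observed in *<max> before any update made by this call.
 */
unsigned int demo_update_max(_Atomic unsigned int *max, unsigned int val)
{
	unsigned int cur;

	cur = atomic_load(max);
	while (cur < val) {
		/* on failure, <cur> is refreshed with the current value */
		if (atomic_compare_exchange_weak(max, &cur, val))
			break;
		demo_cpu_relax();
	}
	return cur;
}
```

The loop exits as soon as another thread has stored a value at least as
large, so contention only costs retries when the update is still needed.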
7. SCHEDULING AND LATENCY
- Latency is critical.
- No runtime filesystem access, no blocking calls, no long loops.
- Complex processing must be split into small steps; the task must yield.
- CPUs are not dedicated to haproxy, high risk of a thread being interrupted
by another process if it works too long, catastrophic if it happens with a
lock held.
- A watchdog kills the process if a task hogs a CPU for more than a few
milliseconds.
- Tasks vs Tasklets: Tasks have tree storage (rq) and timers (wq); tasklets
use list elements instead of rq and are smaller (no wq). Only task.c/h may
distinguish rq vs list access.
- Tasks are aliased to tasklets while they are running (which is why some
functions cast tasks to tasklets and conversely to access certain fields).
- inter-thread task/tasklet wakeups always safe using the task_* API.
- task/tasklet->state field must always be accessed atomically.
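Splitting a long processing into small steps that yield can be sketched as
follows. struct demo_job and demo_job_run() are hypothetical: they do not use
the task_* API and only show the principle of keeping the resume point in a
context and returning once a work budget is exhausted.

```c
#include <stddef.h>

struct demo_job {
	const unsigned char *data; /* input to process */
	size_t len;                /* total number of bytes */
	size_t pos;                /* resume point, preserved across calls */
	unsigned long sum;         /* result accumulated so far */
};

/* Processes at most <budget> bytes of <job> then returns. Returns non-zero
 * once the job is complete, or 0 if it yielded and must be called again.
 */
int demo_job_run(struct demo_job *job, size_t budget)
{
	while (job->pos < job->len) {
		if (budget == 0)
			return 0; /* yield: let other work run */
		job->sum += job->data[job->pos++];
		budget--;
	}
	return 1;
}
```

A caller would typically reschedule itself when 0 is returned instead of
looping, so that each invocation stays short.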
8. ARCHITECTURAL LAYERS (MUX AND STREAMS)
- Naming: Lower layer (multiplexed), attached to the connection uses suffix
'c' (h1c, h2c, qcc, muxc); Upper layer (demultiplexed/application, often a
stream) uses suffix 's' (h1s, h2s, qcs, muxs).
- Application layer stream (struct stream) has two stream connectors (stconn):
front (scf) and back (scb). It is responsible for processing requests and
responses, deciding which server to route them to, finding a backend
connection or creating one, and exchanging data between the two sides.
- Stream connectors link to a muxs or applet via a stream endpoint descriptor
(sedesc/sd), and exchange data via buffers, which for an HTTP muxs are HTX
buffers containing HTX blocks.
- The sd carries the shared context between layers.
- When a stream detaches from a mux, a new sd is allocated for the stream and
the mux keeps its previous sd: stconn and muxs both always have a valid sd.
- Front connections/streams are tied to the creator thread forever.
- Idle back connections can be stolen via mux->takeover(), but become
thread-bound once a stream attaches. => all streams of a mux are on the
same thread.
- session vs connection vs stream: connection is transport; session lasts for
the client connection's life; streams are request/response pairs.
- applets carry a context specific to the service being executed or the CLI
command in appctx->svcctx, and this one is always zeroed before the handler
is first called.
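The rule that both the stconn and the muxs always hold a valid sd, including
after a detach, can be pictured with the sketch below. All types and
functions here (struct demo_sd, demo_attach(), demo_detach()) are
hypothetical and heavily simplified; calloc() is used only because pools are
not available in a standalone example.

```c
#include <stdlib.h>

struct demo_sd {
	void *shared_ctx; /* context shared between the two layers */
};

struct demo_stconn {
	struct demo_sd *sd; /* never NULL once attached */
};

struct demo_muxs {
	struct demo_sd *sd; /* never NULL once attached */
};

/* Links <sc> and <ms> through the same descriptor <sd>. */
void demo_attach(struct demo_stconn *sc, struct demo_muxs *ms,
                 struct demo_sd *sd)
{
	sc->sd = sd;
	ms->sd = sd;
}

/* Detaches <sc> from its mux stream: a new sd is allocated for <sc> while
 * the mux side keeps the previous one. Returns non-zero on success, or 0
 * on allocation failure, in which case nothing was changed.
 */
int demo_detach(struct demo_stconn *sc)
{
	struct demo_sd *fresh;

	fresh = calloc(1, sizeof(*fresh));
	if (!fresh)
		return 0;
	sc->sd = fresh;
	return 1;
}
```

Allocating the new descriptor before touching the old link is what keeps
both pointers valid at every step, even when the allocation fails.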
9. FUNCTION RETURN CONVENTIONS
- Boolean style: Functions named as actions/sentences return 0 (failure) or
non-zero (success).
- Integer style: some syscall-like functions return <0 (error) or >=0 (success).
- Tri-state style, e.g. counts: <0 (error), 0 (no progress), >0 (success).
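The three return styles can be compared side by side in the sketch below.
The function names (demo_parse_digit(), demo_find_char(), demo_copy_some())
are hypothetical and were chosen only to illustrate each convention.

```c
#include <stddef.h>

/* Boolean style: named as an action. Stores the value of digit <c> into
 * <out> and returns non-zero on success, or returns 0 on failure.
 */
int demo_parse_digit(char c, int *out)
{
	if (c < '0' || c > '9')
		return 0;
	*out = c - '0';
	return 1;
}

/* Integer style: returns the index (>= 0) of the first <c> in <s>, or a
 * negative value if not found.
 */
int demo_find_char(const char *s, char c)
{
	int i;

	for (i = 0; s[i]; i++) {
		if (s[i] == c)
			return i;
	}
	return -1;
}

/* Tri-state style: copies up to <room> bytes out of <len> from <src> to
 * <dst>. Returns <0 on error, 0 if no progress was made, or the number of
 * bytes copied (> 0).
 */
int demo_copy_some(char *dst, size_t room, const char *src, size_t len)
{
	size_t n;
	size_t i;

	if (!dst || !src)
		return -1;
	n = len < room ? len : room;
	for (i = 0; i < n; i++)
		dst[i] = src[i];
	return (int)n;
}
```

Callers test the first style with "if (!f())", the second with
"if (f() < 0)", and must handle all three outcomes for the third.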
10. DIAGNOSTICS AND SAFETY
- When DEBUG_STRICT is set, ABORT_NOW() crashes the program immediately, and
BUG_ON(cond[,msg]) crashes the program if the condition is true.
- COUNT_IF() / CHECK_IF() only track if a condition occurs (non-fatal).
- Glitches are counters for uncommon events used to detect hostile behavior.
- strcpy(), strcat() and sprintf() are totally forbidden (the program will
not build).
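Since strcpy(), strcat() and sprintf() are forbidden, string building has to
go through size-bounded calls. One possible way to do it with the standard
snprintf() is sketched below; demo_append_str() is a hypothetical helper
written for this example and is not claimed to be what the project uses.

```c
#include <stdio.h>

/* Appends <src> to the string held in <dst>, whose total capacity is <size>
 * bytes including the trailing zero. Returns non-zero on success, or 0 if
 * the result would not fit, in which case <dst> is left unchanged.
 */
int demo_append_str(char *dst, size_t size, const char *src)
{
	size_t used;
	int ret;

	/* bounded search for the end of the existing string */
	for (used = 0; used < size && dst[used]; used++)
		;
	if (used == size)
		return 0; /* <dst> is not zero-terminated */

	ret = snprintf(dst + used, size - used, "%s", src);
	if (ret < 0 || (size_t)ret >= size - used) {
		dst[used] = 0; /* undo the truncated write */
		return 0;
	}
	return 1;
}
```

Comparing snprintf()'s return value with the remaining room is what detects
truncation, since the function reports the length it wanted to write.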
11. BASIC CODING STYLE
- Linux Kernel-like, but uses tabs for indent, spaces for alignment. Function
definitions have their opening brace on a new line, never on the same line.
- All local variables must be declared at the beginning of the function
block, before any executable statements (gnu89-like).
- Avoid variable shadowing in code blocks.
- Beware of local static and global variables.
- Use const arguments whenever possible.
- Avoid static storage when persistence is not needed.
- Macros in uppercase unless they're used to wrap functions which then get a
leading underscore.
- Explicitly compare with 0 the result of functions returning an integer
(e.g. strcmp), unless they explicitly return a boolean (e.g. isalnum) or a
pointer (e.g. strchr).
- Unsigned int comparisons to zero never use >0 but !=0 to avoid signedness
mistakes.
- turn non-zero integer to boolean using "!" or "!!".
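Several of the style rules above are combined in the short sketch below:
opening brace of a function on its own line, locals declared before any
statement, explicit comparison of strcmp() with 0, "!= 0" for unsigned
values, and "!!" to normalize to a boolean. The functions demo_is_enabled()
and demo_to_bool() are hypothetical.

```c
#include <string.h>

/* Returns non-zero if <name> equals "on" and <count> is not zero. */
int demo_is_enabled(const char *name, unsigned int count)
{
	int match;     /* all locals are declared first */
	int has_items;

	match = strcmp(name, "on") == 0; /* explicit comparison with 0 */
	has_items = count != 0;          /* unsigned: "!= 0", never "> 0" */
	return match && has_items;
}

/* Turns any non-zero <flags> value into exactly 1, and zero into 0. */
int demo_to_bool(unsigned int flags)
{
	return !!flags;
}
```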
12. BUILD AND TEST
- Preferred build command:
$ make -j$(nproc) TARGET=linux-glibc OPT_CFLAGS='-std=gnu89 -Os' \
USE_OPENSSL=1 USE_QUIC_OPENSSL_COMPAT=1 USE_QUIC=1 USE_LUA=1
- Individual files can be tested by passing src/file.o as a make argument.
- Compiler warnings are not permitted for new code.
13. COMMIT MESSAGES AND DOCUMENTATION
- Commit messages must follow the project's strict format below. During
reviews, do not try to infer the format from previous commits, which might
be wrong.
- Structure: <TAG>: <location>: <subject> (max ~70 chars), then blank line,
then description.
- Tags:
- CLEANUP: spelling fixes, refactoring, no new code nor functional change.
- MINOR: new feature or low-impact change, may be backported if needed.
- MEDIUM: new feature or change with moderate severity/impact/risk.
- MAJOR: new feature or change with important severity/impact/risk.
- OPTIM: Performance improvements, may always be reverted if it breaks.
- DOC: Documentation updates or fixes.
- BUG/<severity>: Fixes a bug. Specify if regression or long-standing.
Valid severities are MINOR (low impact), MEDIUM (perf/stability risk
in uncommon configs), MAJOR (most configs), CRITICAL (stability risk
without workaround).
- Regressions: Find original commit via `git blame`; designate using
`git log -1 --format='%h ("%s")'` and version via `git describe --tags`.
- Location: subsystem (stream, tasks, mux-h2, qpack etc).
- Description: Explain technical "WHY", "HOW", and technical impact. Explain
how to trigger the bug for developer testing.
- Backports: only for fixes, mention versions ("Must be backported to 3.0").
- Style: No generic messages like "fix(xxx): blah". Be technically precise.
- Do not mix spelling fixes in comments (not important) with other changes.
However it's preferred to have a single commit for many typo fixes at once.
- Spelling mistakes in user-visible parts (doc, logs, traces, error messages)
must be in their own commit (may need backport).
- One commit per bug.
- Example:
BUG/MEDIUM: sample: fix null pointer dereference in h1_parse_line
When parsing malformed headers, the line buffer was not initialized.
This caused a crash on certain edge cases. Let's fix this by always
initializing the line buffer when first calling the parser. This was
brought by commit 04c9e8f5 ("MINOR: add h1_parse_line") in latest -dev
so no backport is needed.