mirror of
https://github.com/redis/redis.git
synced 2026-05-28 04:02:46 -04:00
1729 commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
95040d61d5
|
Replace INCREX out-of-bounds policy to a single SATURATE option (#15237)
Follow-up to https://github.com/redis/redis/issues/15045

## Summary

Simplify INCREX's out-of-bounds policy.

The original INCREX shipped with three out-of-bounds policies (OVERFLOW FAIL, OVERFLOW SAT, OVERFLOW REJECT), but FAIL and REJECT are functionally redundant: both leave the key untouched when the result is out of bounds. They differ only in how the caller is notified (error reply vs. [current_value, 0] array reply), which forces the user to make a stylistic choice with no real semantic difference.

This PR collapses the three policies into one clear behavior:

* Default: the operation is rejected; the key value and TTL are left unchanged, and the reply is [current_value, 0]. Callers detect non-application by checking the applied-increment field; no error-handling branch is required.
* SATURATE: the result is saturated to UBOUND / LBOUND, or to the type limits (LLONG_MAX/MIN for BYINT, ±LDBL_MAX for BYFLOAT) when no explicit bound is given.

New syntax:

INCREX <key> [BYFLOAT increment | BYINT increment] [LBOUND lowerbound] [UBOUND upperbound] [SATURATE] [EX seconds | PX milliseconds | EXAT seconds-timestamp | PXAT milliseconds-timestamp | PERSIST] [ENX]

---------

Co-authored-by: Ozan Tezcan <ozantezcan@gmail.com> |
||
|
|
b1a53ea21f
|
Add error on enabling memory tracking in non-clustered mode (#15005)
Enabling memory tracking at runtime is forbidden if it is currently disabled. In non-clustered mode, however, the checks were incorrect, so this PR enforces the correct behavior in non-clustered environments. |
||
|
|
2e46d2e735
|
Hold GCRA out of the release (#15191)
After the GCRA algorithm was introduced into Redis (https://github.com/redis/redis/pull/14826), followed by the new RATE_LIMIT object type (https://github.com/redis/redis/pull/14905), it was decided internally not to include GCRA in the new release. Since no decision has been made yet on whether it will be kept in the future, this PR only makes the GCRA-related code dead: commands are inaccessible and AOF/RDB load and save are disabled.

---------

Co-authored-by: debing.sun <debing.sun@redis.com> |
||
|
|
54ea50c029 |
Fix cluster-announce-ip rejecting hostnames (#15188)
Fixes [#15183](https://github.com/redis/redis/issues/15183). ## Motivation Commit [ |
||
|
|
22f1ab6e27 |
Fix cluster AUX-field newline/control-character injection bricks a node on restart
* fix cluster AUX-field newline/control-character injection bricks a node on restart |
||
|
|
5c355b68ec |
Fix use-after-free when evicting blocked client during unblock (CVE-2026-23479)
When re-executing a pending command after unblocking, check the return value of `processCommandAndResetClient` and exit if needed. |
||
|
|
c4e3405704 | Invalid Memory Access in Redis RESTORE Command (CVE-2026-25243) | ||
|
|
80621b1d0e |
Add DENYOOM flag to SUBSCRIBE, PSUBSCRIBE and SSUBSCRIBE commands
Add the DENYOOM flag to SUBSCRIBE, PSUBSCRIBE, and SSUBSCRIBE commands to bring their memory protection behavior in line with other Redis commands.

Problem: Currently, subscribe commands lack memory protection when Redis reaches its memory limit. This becomes problematic in two specific scenarios:

1. When the eviction policy doesn't allow eviction (e.g., noeviction)
2. When there are no evictable keys remaining in the database

In these cases, memory usage from pub/sub subscribers can keep growing unchecked, potentially causing the Redis server to run out of memory. This behavior is inconsistent with other Redis commands, which are protected by the DENYOOM flag.

Solution: Add the DENYOOM flag to all subscribe commands. When memory limits are reached, these commands will be rejected, preventing uncontrolled memory growth and aligning their behavior with other Redis commands. |
||
|
|
0d9576435f
|
Implement the new Redis Array type (#15162)
# Redis Array For years, Redis has been missing a real indexed data structure for the use cases where the index and the spatial relationship of elements are semantic. Hashes give you random lookups, but you have to store an index as a key, and have no range visibility. Lists give you appending and trimming, but what is in the middle remains hard to access. Streams give you append-only events, which is another (useful, indeed) beast. None of these is what you want when the *position itself* has business meaning — slot 37, step 4, row 18552, day from 2934 to 2949, file line 11, 12, 15 and so forth. And, all those types, for different reasons, are all suboptimal when you want a **ring buffer** able to store the latest N observed samples of something. Up to now, users found ways (they always do \o/) using the fact that the data structures that are obvious in this universe are also extremely powerful, if well implemented. But this forces compromises. Arrays handle these index-first requirements natively, and usually with much better memory and CPU usage than the workarounds. If the use case is the right one, Arrays often provide much better space, time and usability at the same time. ## Internal encoding 1. When dense, an Array is essentially a more fancy C array. You don't pay anything for storing the index. 2. Yet, instead of going really flat, arrays are sliced into 4096-element slices, and each slice, when it contains just a few elements, uses a special sparse encoding. When a slice is empty it's just a `NULL` stored in the directory. 3. Small ints, floats, and short strings are pointer-tagged, so they cost zero additional memory beyond the pointer slot itself. 4. When very sparse, a super-directory of windowed directories is used. This allows the data type to be safe, instead of exhibiting pathological space or time behavior. This representation is only triggered when there are more than 8 million elements or very high indexes set. 
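The slice-and-directory layout from the internal encoding list above can be pictured with a toy model. This is only a sketch under assumptions: the class name `SlicedArray`, its methods, and the use of a `dict` as a stand-in for the sparse per-slice encoding are all hypothetical, and pointer tagging and the super-directory are left out. Only the 4096-element slice width and the "empty slice is a NULL in the directory" rule come from the text.

```python
SLICE_SIZE = 4096  # slice width quoted in the commit message


class SlicedArray:
    """Toy model (not the Redis implementation): a directory of slices."""

    def __init__(self):
        # One slot per slice; None plays the role of the NULL directory entry.
        self.directory = []

    def set(self, index, value):
        slot, offset = divmod(index, SLICE_SIZE)
        # Grow the directory with empty slots up to the target slice.
        while len(self.directory) <= slot:
            self.directory.append(None)
        if self.directory[slot] is None:
            # Assumption: a dict stands in for the sparse slice encoding.
            self.directory[slot] = {}
        self.directory[slot][offset] = value

    def get(self, index):
        slot, offset = divmod(index, SLICE_SIZE)
        if slot >= len(self.directory) or self.directory[slot] is None:
            return None
        return self.directory[slot].get(offset)

    def count(self):
        # Number of populated elements, skipping empty slices entirely.
        return sum(len(s) for s in self.directory if s is not None)


arr = SlicedArray()
arr.set(18552, "ERR:bad_email")
print(arr.get(18552), arr.get(0), arr.count())
```

In this model, setting index 18552 allocates only the fifth slice; the four slices before it stay as `None`, which is the property that keeps sparse arrays cheap.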
## Use cases Arrays are mostly stateless if not for the fact that each array remembers the index of the latest added item, allowing `ARINSERT` and `ARRING` to work properly. Otherwise it is a set/get at this index game, with solid support for both setting / getting ranges, server-side scanning, returning only populated elements in a time which is proportional not to the range size, but to the population size. A few concrete examples, that may work as mental models for the set of problems that are similar to them (from the POV of the data modeling). **Thermometer.** A sensor reporting once per minute, with gaps: ``` ARSET temp:room12:day7 123 22.3 ARGETRANGE temp:room12:day7 600 660 # the 10:00–11:00 window, with NULLs ARSCAN temp:room12:day7 600 660 # only populated elements AROP temp:room12:day7 0 1439 MAX # peak of the day, server-side ``` Missing minutes cost little to nothing. Numeric aggregation runs inside Redis. Telemetry, IoT, meter readings, KPI rollups. **Calendar.** A clinic with 96 fifteen-minute slots per day: ``` ARSET sched:room12:day 32 booking:991 ARSCAN sched:room12:day 0 95 # only occupied slots ARGETRANGE sched:room12:day 48 63 # the afternoon full view to render ``` The slot number is the business key in this case. Room booking, parking spaces, warehouse bins, lockers, ... **Ring buffer.** ARRING replaces the classic LPUSH+LTRIM pattern. Imagine remote `dmesg`. ``` ARRING machine:123 200 "[141087.430123]: arm_cpu_init(): cpu 14 online" # Capped to 200 entries ARLASTITEMS machine:123 50 REV # 50 newest first ``` Faster than LPUSH+LTRIM, keep indexed access to past elements. Last-N alarms, recent fraud scores, access history, remote logs, device events. Ok here the use cases are mainly the ones of the old pattern: it is just a better fit and allows to access random items in the middle, aggregate server-side, and so forth. **Workflow.** Step number is the index, value is the status. 
Gaps are meaningful: ``` ARSET claim:99172 0 received ARSET claim:99172 3 waiting:reviewer42 ARSET claim:99172 5 approved ARGETRANGE claim:99172 0 5 # full workflow view, with NULLs for missing steps ARSCAN claim:99172 0 5 # only steps that have a state ARCOUNT claim:99172 # number of recorded steps ARLEN claim:99172 # highest reached step + 1 ``` **Skills knowledge base for agents.** Arrays are good at representing / grepping into Markdown files: ``` ARSET skill:metal_gpu 0 "...." ARSET skill:metal_gpu 1 "...." ARSET skill:metal_gpu 2 "...." ARGREP skill:metal_gpu - + RE "M3|M4" WITHVALUES ``` ARGREP has EXACT, MATCH, GLOB, RE, you can have multiple predicates, can select AND or OR behavior. **Bulk import results.** Sparse row annotations over millions of rows / CSV / ...: ``` ARSET import:job551 18552 ERR:bad_email ARSCAN import:job551 0 1000000 # Provides only rows that have something ``` ## TLDR If the position is part of the meaning, use an Array. If you want to aggregate or grep remotely, use an Array. Feedback welcome :) --------- Co-authored-by: debing.sun <debing.sun@redis.com> Co-authored-by: Shubham S Taple <155555100+ShubhamTaple@users.noreply.github.com> Co-authored-by: Yuan Wang <yuan.wang@redis.com> Co-authored-by: Marc Gravell <marc.gravell@gmail.com> |
||
|
|
b7d6ef6b5a
|
Add slowlog entry truncation limits configs (#15182)
Add configurations for the `SLOWLOG_ENTRY_MAX_ARGC` and `SLOWLOG_ENTRY_MAX_STRING` values, which are currently hardcoded.

Two new configurations:

* `slowlog-entry-max-argc` - maximum number of command arguments kept in a slowlog entry. Default: 32
* `slowlog-entry-max-string-len` - maximum length of a command argument in a slowlog entry. Default: 128

Useful for better diagnostics of slow commands with numerous and long arguments.

---------

Co-authored-by: debing.sun <debing.sun@redis.com> |
||
|
|
9c1ecd044e
|
Add INCREX command for atomic increment with TTL and bounds (#15045)
Close #14278 ## Overview Rate limiters, sliding windows, request counters, and numerous other network-facing patterns share a common primitive: **atomically increment a counter and set its expiration**. Achieving this in Redis requires either multiple round-trips or a Lua script that bundles `INCR` / `INCRBY` / `INCRBYFLOAT` with `EXPIRE` / `PEXPIRE`. We propose a new command, **`INCREX`**, that collapses this two-step pattern into a single, native, O(1) command. `INCREX` atomically: 1. Increments (or decrements) a key's numeric value — by integer or float. 2. Optionally enforces lower and/or upper bounds, with a configurable overflow policy (error out, saturate, or no-op), enabling built-in cap enforcement (e.g., max request count) without additional client logic. 3. Optionally sets or removes the key's expiration. 4. Returns both the **new value** and the **actual increment applied**, giving the caller immediate feedback on whether the operation was saturated or skipped. ## Use Cases ### Basic Usage ``` # Increment by 1 (default) and set a 60-second TTL. > SET mykey 10 > INCREX mykey EX 60 1) (integer) 11 # new value 2) (integer) 1 # actual increment # Use 0 as initial value if the key doesn't exist. > DEL mykey > INCREX mykey 1) (integer) 1 # new value 2) (integer) 1 # actual increment # Default policy (OVERFLOW FAIL): exceeding a bound returns an error. > SET mykey 5 > INCREX mykey BYINT 20 UBOUND 10 (error) value is out of bounds # Opt into saturation with OVERFLOW SAT. > INCREX mykey BYINT 20 UBOUND 10 OVERFLOW SAT 1) (integer) 10 # saturated to upper bound 2) (integer) 5 # only 5 was actually applied # Skip the operation with OVERFLOW REJECT — the key and its TTL are # untouched, and the reply reports the current value with a zero delta. 
> SET mykey 5 > INCREX mykey BYINT 20 UBOUND 10 OVERFLOW REJECT 1) (integer) 5 # current (unchanged) value 2) (integer) 0 # nothing was applied # Increment by a float > SET mykey 1 > INCREX mykey BYFLOAT 0.5 1) "1.5" 2) "0.5" ``` ### Use Case: Rate Limiter **Before (Lua script):** ```lua -- KEYS[1] = rate limit key, ARGV[1] = limit, ARGV[2] = window in seconds local current = redis.call('INCR', KEYS[1]) if current > tonumber(ARGV[1]) then return 0 -- rejected end if current == 1 then redis.call('EXPIRE', KEYS[1], ARGV[2]) end return 1 -- allowed ``` Client invocation: ```python result = redis.eval(LUA_SCRIPT, 1, f"ratelimit:{user_id}", 100, 60) if result == 0: reject_request() ``` **After (INCREX):** ```python new_val, actual_incr = redis.execute_command( "INCREX", f"ratelimit:{user_id}", "UBOUND", 100, "OVERFLOW", "REJECT", "EX", 60, "ENX" ) if actual_incr == 0: # Rate limit exceeded — key left unchanged. reject_request() ``` `ENX` means: set expiration only if the key doesn't already have an expiration. This ensures the sliding window's TTL is only set on the first request. ### Use Case: Token Bucket Refill Refill tokens periodically up to a capacity ceiling, saturating at the cap instead of erroring: ``` > INCREX tokens:user123 BYINT 10 UBOUND 100 OVERFLOW SAT EX 3600 ENX 1) (integer) 10 2) (integer) 10 ``` Tokens cannot exceed 100, and the key auto-expires after inactivity. ### Use Case: Countdown / Resource Consumption Decrement a resource counter down to zero, saturating at the floor: ``` > SET credits:user123 50 > INCREX credits:user123 BYINT -1 LBOUND 0 OVERFLOW SAT 1) (integer) 49 2) (integer) -1 ``` When credits are exhausted, `OVERFLOW SAT` prevents negative balances without client-side checks. 
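The three overflow policies used in the examples above can be summarized as a small reference model. This is a sketch of the integer mode only, written from the behavior described in this commit message; the function name `increx_int` and its parameters are hypothetical, and TTL handling, float mode, and the error for non-representable saturated deltas are deliberately omitted.

```python
LLONG_MIN, LLONG_MAX = -2**63, 2**63 - 1  # default bounds in integer mode


def increx_int(current, incr=1, lbound=LLONG_MIN, ubound=LLONG_MAX,
               overflow="FAIL"):
    """Model of the described reply: (new_value, applied_increment)."""
    target = current + incr  # Python ints never overflow, so this is exact
    if lbound <= target <= ubound:
        return target, incr
    if overflow == "FAIL":
        # Described default: error reply, key left unchanged.
        raise ValueError("value is out of bounds")
    if overflow == "REJECT":
        # Key untouched; a zero delta signals the rejection.
        return current, 0
    if overflow == "SAT":
        # Clamp to the violated bound and report the delta actually applied.
        new_value = min(max(target, lbound), ubound)
        return new_value, new_value - current
    raise ValueError("unknown overflow policy")


print(increx_int(5, 20, ubound=10, overflow="SAT"))
print(increx_int(5, 20, ubound=10, overflow="REJECT"))
print(increx_int(50, -1, lbound=0, overflow="SAT"))
```

The calls mirror the `redis-cli` transcripts above: saturation at `UBOUND 10`, a rejected increment, and the credits countdown.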
## Parameter Reference ### Syntax ``` INCREX key [BYFLOAT increment | BYINT increment] [LBOUND lowerbound] [UBOUND upperbound] [OVERFLOW <FAIL | SAT | REJECT>] [EX seconds | PX milliseconds | EXAT unix-time-seconds | PXAT unix-time-milliseconds | PERSIST] [ENX] ``` ### Parameters | Parameter | Description | |-----------|-------------| | `key` | The key to increment. Created with value `0` if it does not exist. | | `BYFLOAT increment` | Increment the value by the given long-double float. | | `BYINT increment` | Increment the value by the given 64-bit signed integer. | | `LBOUND lowerbound` | Set lower bound for the increment result. Defaults to `LLONG_MIN` (integer) or `-LDBL_MAX` (float). | | `UBOUND upperbound` | Set upper bound for the increment result. Defaults to `LLONG_MAX` (integer) or `LDBL_MAX` (float). | | `OVERFLOW <FAIL \| SAT \| REJECT>` | Set the overflow policy when the result would be out of bounds. `FAIL` rejects the operation with an error (default). `SAT` saturates the result to the bound. `REJECT` leaves the key and its TTL untouched and replies with the current value and a zero delta. | | `EX seconds` | Set the key's TTL to `seconds` seconds. | | `PX milliseconds` | Set the key's TTL to `milliseconds` milliseconds. | | `EXAT unix-time-seconds` | Set the key's expiration to the absolute Unix timestamp in seconds. | | `PXAT unix-time-milliseconds` | Set the key's expiration to the absolute Unix timestamp in milliseconds. | | `PERSIST` | Remove the key's existing TTL. | | `ENX` | Set the key's TTL/expiration if it has No eXpiration | If neither `BYINT` nor `BYFLOAT` is specified, the increment defaults to integer `1`. ### Return Value An **array of two elements**: 1. **New value** — the value of the key after the increment (or the unchanged current value under `OVERFLOW REJECT`). 2. **Actual increment** — the increment that was actually applied. 
May differ from the requested increment when `OVERFLOW SAT` saturates the result to a bound, and is always `0` when `OVERFLOW REJECT` skipped the operation. - In integer mode (default or `BYINT`): both elements are **integers**. - In float mode (`BYFLOAT`): both elements are **bulk strings** representing the float values on RESP2, and **RESP3 Doubles** on RESP3. ### Overflow Policy (FAIL vs. SAT vs. REJECT) Controlled by the optional `OVERFLOW` argument. A bound violation includes both exceeding an explicit `LBOUND`/`UBOUND` and overflowing the type limits when no explicit bound is given. - **`OVERFLOW FAIL` (default)**: if the computed result would violate a bound, the command returns an error and the key is left unchanged. This matches the existing semantics of `INCRBY` / `INCRBYFLOAT` on overflow. - **`OVERFLOW SAT`**: the result is silently capped at `UBOUND` / floored at `LBOUND` (or saturated to the type limits when no explicit bound is given). The second element of the reply reflects the saturated delta. If the delta cannot be represented as a 64-bit signed integer(default or `BYINT`), or would produce Infinity(`BYFLOAT`), an error is returned. - **`OVERFLOW REJECT`**: the operation is silently skipped — the key value and its TTL are left unchanged, no keyspace notification is fired, and nothing is replicated. The reply is `[current_value, 0]`, allowing the caller to detect the rejection without handling an error. ### Notes - If **no expiration option** is given, the key's existing TTL is preserved (like `INCR`). - `ENX` requires one of `EX`/`PX`/`EXAT`/`PXAT`. - If the result is saturated by `OVERFLOW SAT`, the expiration is still applied as specified. - Under `OVERFLOW REJECT` the expiration option is ignored on the rejected branch — TTL is preserved exactly as it was before the call. - **`BYINT` requires an integer-typed existing value; `BYFLOAT` accepts both.** Integers can be promoted to floats losslessly, but a stored float (e.g. 
`"1.5"`) cannot be parsed back as an integer. This is consistent with `INCR`/`INCRBY` (integer-only) and `INCRBYFLOAT` (accepts both). --------- Co-authored-by: debing.sun <debing.sun@redis.com> Co-authored-by: Ozan Tezcan <ozantezcan@gmail.com> Co-authored-by: Moti Cohen <moti.cohen@redis.com> Co-authored-by: oranagra <oran@redislabs.com> |
||
|
|
62551a7b12
|
Batched MGET/MSET dict prefetch with dictType-driven payload hints (#15133)
Reduce MGET / MSET latency by overlapping the dict-lookup memory accesses across the keys of a single multi-key command. Builds on the cross-command batched prefetch framework introduced in #14017 and the dict-prefetch state machine in `memory_prefetch.c`, and lifts the kvobject-aware bits out of the state machine into two new `dictType` callbacks so the same machinery can be reused for other dict-encoded types later (hash hashtable, sets, sorted sets) without paying for `kvobj`-specific code paths in the core loop. Bundles the work originally proposed in #14899 (MGET prefetch framework, by @mpozniak95) and #15043 (MSET batch prefetch). ## Design Two new optional callbacks on `dictType`: ```c typedef struct dictType { ... /* Bring the entry's key payload into cache before keyCompare runs. * Returns the address to prefetch, or NULL if the entry alone is enough. */ void *(*prefetchEntryKey)(const dictEntry *de); /* Called only after a key match. Returns the value-side payload to * prefetch (or NULL). */ void *(*prefetchEntryValue)(const dictEntry *de); } dictType; ``` `dbDictType` registers both. The kv-aware logic — the `dictEntryIsKey()` shortcut for embedded kvobjs, and `kv->ptr` for `OBJ_STRING` / `OBJ_ENCODING_RAW` values — now lives in two small helpers in `server.c`: ```c static void *dbDictPrefetchEntryKey(const dictEntry *de) { return dictEntryIsKey(de) ? NULL : dictGetKey(de); } static void *dbDictPrefetchEntryValue(const dictEntry *de) { kvobj *kv = dictGetKey(de); return (kv->type == OBJ_STRING && kv->encoding == OBJ_ENCODING_RAW) ? kv->ptr : NULL; } ``` The `PrefetchGetValueDataFunc` typedef and the per-call `get_val_data` parameter on `dictPrefetchKeys()` / `dictPrefetch()` are removed — the dict's own type drives both ends. This also removes the foot-gun where callers (like `mgetCommand`) had to remember whether to pass `prefetchGetObjectValuePtr` or `NULL`. `memory_prefetch.c` no longer references `kvobj`, `kvobjGetKey`, or any specific value layout. 
## State machine Two file-local types in `memory_prefetch.c`: | Type | Role | |---|---| | `dictPrefetchLookup` | Per-key snapshot of an in-flight, software-pipelined `dictFind` (mirrors the locals a synchronous `dictFind` would carry across one bucket walk). | | `dictPrefetcher` | Driver that advances a batch of `dictPrefetchLookup`s through the FSM, yielding to the next in-flight lookup each time a prefetch is issued. | Five-stage lifecycle for each lookup, driven by the prefetcher: ```text │ start │ ┌────────▼─────────┐ ┌─────────►│ PREFETCH_BUCKET ├────►────────┐ │ └────────┬─────────┘ no more tables │ bucket│found │ │ │ │ entry not found - goto next table ┌────────▼────────┐ │ └────◄─────┤ PREFETCH_ENTRY │ ▼ ┌────────────►└────────┬────────┘ │ │ entry│found │ │ │ │ │ ┌───────────▼─────────────┐ │ │ │ PREFETCH_ENTRY_KEY │ ◄── dictType->prefetchEntryKey(de) │ └───────────┬─────────────┘ │ │ │ │ key mismatch - goto next entry │ │ │ ┌───────────▼─────────────┐ │ └──────◄───│ PREFETCH_ENTRY_VALUE │ ◄── keyCompare; on match, └───────────┬─────────────┘ dictType->prefetchEntryValue(de) │ │ ┌─────────▼─────────────┐ │ │ PREFETCH_DONE │◄────────┘ └───────────────────────┘ ``` `PREFETCH_BUCKET` first picks `ht_table[0]`, then flips to `ht_table[1]` if the dict is mid-rehash, then transitions to `PREFETCH_DONE` if no more tables remain. `memory_prefetch.c` exposes a small lifecycle that any caller can drive: ```c dictPrefetcherInit(p, max_keys); /* one-shot heap alloc of lookups[] */ dictPrefetcherReset(p, dicts, keys, nkeys); /* configure for one batch */ dictPrefetcherRun(p); /* drive FSM until all PREFETCH_DONE */ dictPrefetcherFree(p); /* release */ ``` Each FSM stage is a named static function (`dictPrefetchBucket`, `dictPrefetchEntry`, `dictPrefetchEntryKey`, `dictPrefetchEntryValue`), so the `dictPrefetcherRun` driver is a four-line `switch` over the state. The state machine is dict-pure: no `kvobj` field on `dictPrefetchLookup`, no `kvobjGetKey` reach-through. 
Round-robin advance semantics — a state only advances the cursor if a prefetch was actually issued — are preserved, so the embedded-kvobj fast path (`dictEntryIsKey(de) == 1` → callback returns NULL) still skips the extra prefetch and falls straight into the compare on the next loop iteration. The cross-command path (`prefetchCommands` / `PrefetchCommandsBatch`) embeds a `dictPrefetcher` initialized once at startup and reset per batch, so cross-command prefetching no longer allocates per call. ## Intra-command API ```c void dictPrefetchKeys(dict **dicts, void **keys, size_t nkeys); ``` A single multi-key command (e.g. MGET) can prefetch dict data for a batch of its own keys, reusing the same state machine that the cross-command path uses. Single-key calls (`nkeys <= 1`) early-return — nothing to interleave with. The implementation stack-allocates a fixed-size lookup array bounded by `DICT_PREFETCH_MAX_SIZE = 64` (no VLA, predictable stack usage), so the intra-command path doesn't touch the heap. ## Notes on the call sites A shared helper picks the next prefetch batch and warms it via `dictPrefetchKeys`: ```c /* Pick the next prefetch batch starting at argv[start] and warm it via * dictPrefetchKeys. 'stride' is 1 for keys-only args (MGET) or 2 for * key/value pairs (MSET). Returns the chosen batch size in items. */ static int prefetchKeysBatch(client *c, int slot, int start, int stride); ``` Adaptive batch sizing inside the helper: if at least two full batches (`PREFETCH_BATCH_SIZE * 2 = 32` items) remain, take one batch (`PREFETCH_BATCH_SIZE = 16`); otherwise take all remaining items in one call. This generalizes the small-request fast path so the trailing batch of a large request also gets the single-call benefit. - **MGET (`mgetCommand`)** — gated by `do_prefetch = server.prefetch_batch_max_size && !already_prefetched && numkeys > 1`, with `already_prefetched = c->current_pending_cmd && (c->current_pending_cmd->flags & PENDING_CMD_KEYS_PREFETCHED)`. 
When `do_prefetch` is set, each iteration calls `prefetchKeysBatch(c, slot, j, 1)` and then sequentially `lookupKeyRead`s + replies the chosen batch. When `do_prefetch` is clear (cross-command path already warmed the keys, or batch prefetching is off), the loop takes all remaining items in one go and skips the prefetch. - **MSET / MSETNX (`msetGenericCommand`)** — same `do_prefetch` gate as MGET with `stride = 2`. For the NX flag the NX-check loop runs `lookupKeyWrite` (which already warmed everything via `prefetchKeysBatch`); the SET loop then disables further prefetch (`do_prefetch &&= !nx`) so we don't re-prefetch on the second pass. Going through the full state machine (rather than bucket-only) means `dbDictType`'s `prefetchEntryValue` callback runs on a key match — warming the old kvobj's payload, which `setKey -> dbReplaceValue -> updateKeysizesHist(oldlen, newlen)` then reads to compute the histogram delta. The slot dict is re-fetched per batch — in cluster mode the slot dict can be freed mid-MSET (`KVSTORE_FREE_EMPTY_DICTS` + `expireIfNeeded`), so a cached pointer would otherwise dangle. - **Cross-command batch path (`addCommandToBatch`)** — sets `PENDING_CMD_KEYS_PREFETCHED` on every command added to the batch, even on partial-batch overflow (was: only when ALL keys fit). The intra-command path then uniformly skips supplemental prefetching for any command the batch touched. Rationale: running both paths (cross-command warm + intra-command supplement) caused a measured −9.6 % regression on x86 with pipeline-10, and the partial cross- command warmup is sufficient for the head of the keyset; the cold tail goes through normal lookup, which is still cheaper than running the FSM a second time on already-warm keys. - **Future types**: each dict's `dictType` can register its own `prefetchEntryKey` / `prefetchEntryValue` (e.g. for the hashtable hash encoding, the field-sds and value-sds payloads), without touching `memory_prefetch.c`. 
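The adaptive batch sizing rule described in the call-site notes above can be sketched in a few lines. This is an illustration only: `next_batch_size` and `plan_batches` are hypothetical names, and the real helper (`prefetchKeysBatch`) also warms the keys, which is not modeled here. The constant 16 and the "two full batches" threshold come from the text.

```python
PREFETCH_BATCH_SIZE = 16  # value quoted in the commit message


def next_batch_size(remaining, batch=PREFETCH_BATCH_SIZE):
    """Take one batch while at least two full batches remain, else take all."""
    return batch if remaining >= 2 * batch else remaining


def plan_batches(nitems):
    """List the batch sizes chosen for a command with nitems keys."""
    sizes = []
    while nitems > 0:
        n = next_batch_size(nitems)
        sizes.append(n)
        nitems -= n
    return sizes


print(plan_batches(100))
print(plan_batches(31))
```

Under this rule the trailing batch can hold up to 31 items, so a large request never ends with a tiny leftover batch, and a request of fewer than 32 items is handled in a single call.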
## Benchmark validation On x86, performance improvements are significant for larger batch sizes: - 5Mkeys-string-mget-10B-100keys-pipeline-10: +89.44% - 5Mkeys-string-mget-100B-100keys: +37.33% - 5Mkeys-string-mget-100B-30keys: +22.40% On ARM (Graviton4), the gains are even more pronounced: - 5Mkeys-string-mget-10B-100keys-pipeline-10: +128.34% - 5Mkeys-string-mget-100B-100keys-pipeline-10: +46.76% Overall, the improvement scales with batch size, while a few small-batch cases show marginal gains or slight regressions. --------- Co-authored-by: Marcin Poźniak <marcin.pozniak@intel.com> Co-authored-by: Yuan Wang <yuan.wang@redis.com> |
||
|
|
7bdab45ff1
|
Reduce memory allocation overhead (#15096)
While profiling command execution, I noticed that command argv object alloc/free overhead is quite high for workloads with many small arguments (e.g. `HSET` with many fields). The effect is much more visible with pipelining when Redis becomes CPU bound. I experimented with replacing argv object alloc/free with a simple object pool and saw significant speedups. (Note: related effort around this topic: https://github.com/redis/redis/pull/13726) In this PR, I tried to improve the main hotspots in the memory allocation path (focusing on command arg allocations) to close the gap with custom pool performance, so we can avoid having a dedicated memory pools and let the whole codebase benefit from these optimizations. ## Changes ### 1) Faster dealloc via passing size hint to jemalloc (separate PR #15071) Jemalloc does more work than an object pool on free (a lookup on a tree to find the allocation's size class). For some deallocations, we can reduce free path overhead by passing a size hint to jemalloc (i.e. `sdallocx()`) which can skip metadata lookup in the common case. This PR introduces `zfree_with_size()` and uses it where we can know the allocation size i.e. `OBJ_ENCODING_EMBSTR` objects in `decrRefCount()` and SDS free path. ### 2) Reduce atomic operation cost for stat updates `update_zmalloc_stat_alloc()` / `update_zmalloc_stat_free()` previously used atomic read-modify-write (RMW) operations (`atomicIncrGet` / `atomicDecr`) which can emit expensive locked instructions on x86. When we can guarantee a single writer to a counter, we can use a cheaper load+add+store sequence instead of a locked RMW. This PR gives the first 16 threads dedicated slots for used_memory stats (intended to cover the main thread/ I/O threads) so they can use this single writer fast path. Threads beyond that fall back to a shared pool and continue to use full atomic RMW. 
### 3) Improve jemalloc tcache hit rate With the default `lookahead=16` config, a pipelined HSET with ~20 fields does ~40 small allocations per command (fields + values), so you can get 16 x 40 = ~640 allocations. When args are small, many of these land in the 32 byte size class (often `EMBSTR`). Jemalloc’s default per-bin tcache cap is 200, so this kind of burst overflows the cache and it does frequent flushes. I raised the small-bin tcache limits (lg_tcache_nslots_mul:3, tcache_nslots_small_max:1000) to handle these bursts better. In the worst case, tcache may have a higher memory usage due to this change. Perhaps, another option was lowering `lookahead` to tune it differently. ### 4) Inlining When you have a simple pool, it has a few small functions and it is easy for compiler to inline them. Compared to that, jemalloc alloc/free path has a deeper call stack. Also, jemalloc was not compiled with `-flto` which was preventing inlining jemalloc functions. As part of this PR, I added `-flto` flag to jemalloc when it is enabled for Redis. Compiler also chooses not to inline some hot path functions in Redis. This suggests PGO (profile-guided optimization) could provide additional wins and perhaps we can start experimenting with it sometime. We could try to force inlining with attributes like `always_inline` but it is hard to apply across a deep call stack and misuse can cause code bloat. So, rather than going in this direction, I added `inline` keyword to some functions for now. This doesn't make compiler to inline all hot path functions but at least it is a step ahead. (If we can further improve this in future, performance gets very close to custom memory pool implementation). 
## Benchmark results
Commands were like:
```
memtier_benchmark --command="HSET __key__ username john_doe email john@example.com password hashed_pwd_123 created_at 1709125200 updated_at 1709125200 first_name John last_name Doe phone_number +1234567890 address 123_Main_St city NewYork country USA postal_code 10001 company Acme_Corp job_title Engineer bio Loves_coding" --command-ratio=1 --command-key-pattern=P --key-prefix="hsetkey" --key-minimum=1 --key-maximum=100000 -n 1000000 -c 50 -t 2 --hide-histogram --pipeline 50
```
| Benchmark | Improvement |
| --- | ---: |
| SET | +0% |
| SET (pipeline) | +8% |
| HSET 15 fields | +2% |
| HSET 15 fields (pipeline) | +17% |
| ZADD 15 elements | +3% |
| ZADD 15 elements (pipeline) | +15% |
|
||
|
|
05859cdd7e
|
Fix client output buffer memory tracking not accounting for copy-avoided bulk string references (#14934)
## Problem
After #14608 (Reply Copy Avoidance), when copy avoidance kicks in, bulk string replies are sent by reference instead of being copied into the output buffer. The referenced bytes are not counted in `reply_bytes`, which causes:
1. `getClientOutputBufferMemoryUsage()` underestimates the actual memory usage, so output buffer limits may not be triggered in time, allowing clients to consume unbounded memory.
2. Client eviction does not account for the referenced bytes, making it ineffective when copy avoidance is used.
3. `omem` reported in `CLIENT LIST` / `CLIENT INFO` does not reflect the true output buffer memory footprint.
## Solution
Track the bytes of referenced bulk strings in the output buffer with two per-client counters:
1. `reply_bytes_shared` - the logical size of all BULK_STR_REF payloads in the output buffer. Updated incrementally whenever a reference is added/removed. Represents memory the client is "charged" for even though it is shared with the keyspace.
2. `reply_bytes_unshared` - the subset of the above where the referenced object's refcount == 1 (i.e. the key has been deleted from the keyspace), so the memory is kept alive solely by this client's output buffer and would actually be freed on disconnect. Maintained as a lazy cache refreshed via `updateClientUnsharedReplyBytes()`.
## Info field
CLIENT LIST / CLIENT INFO: two new fields, plus refined semantics for existing ones:
| Field | Meaning |
| -- | -- |
| omem | (semantics changed) logical output-buffer memory, now including shared memory referenced from the keyspace. Still excludes client->buf so static clients show 0. |
| omem-shared | (new) shared output-buffer memory (referenced bulk strings, not solely owned by this client). |
| omem-unshared | (new) unshared output-buffer memory (referenced bulk strings solely owned by this client; freed on disconnect). |
| tot-mem | (semantics refined) actual memory usage: includes omem-unshared, excludes omem-shared to avoid double-counting keyspace memory. |
INFO memory: two new fields mirroring the above:
| Field | Meaning |
| -- | -- |
| mem_clients_normal | (semantics changed) actual memory usage of normal clients (includes unshared, excludes shared). |
| mem_clients_normal_shared | (new) aggregate shared output-buffer memory across normal clients. |
| mem_clients_normal_unshared | (new) aggregate unshared output-buffer memory across normal clients. |
MEMORY STATS: schema extended with the matching keys:
| Field | Meaning |
| -- | -- |
| clients.normal.shared | (new) aggregate shared output-buffer memory across normal clients. |
| clients.normal.unshared | (new) aggregate unshared output-buffer memory across normal clients. |
## Bug Fix
Fix missing `closeClientOnOutputBufferLimitReached()` call when adding a referenced robj to the reply.
---------
Co-authored-by: oranagra <oran@redislabs.com>
|
||
|
|
417cc6e4fc
|
test: stabilize HOTKEYS MULTI/EXEC test by increasing iteration count (#15129)
## Problem
The test `HOTKEYS - commands inside MULTI/EXEC` in `tests/unit/hotkeys.tcl` is flaky on fast hardware. This PR raises its inner loop count from 7 to 30 to make `key2` reliably appear in the CPU top-K.
Failed CI: https://github.com/redis/redis/actions/runs/25051455424/job/73380034469?pr=15128
Inside `MULTI`/`EXEC`, each queued command's per-command CPU time is recorded as `c->duration = ustime() - call_timer` (microseconds, integer). Very fast commands such as `SET` against a small value can complete in less than 1 µs and therefore be measured as `0`. `hotkeyStatsUpdateCurrentCmd` then forwards that zero duration as the weight to `chkTopKUpdate`, which has an explicit early return on `weight == 0`:
```c
sds chkTopKUpdate(chkTopK *topk, char *item, int itemlen, counter_t weight) {
    if (weight == 0) return NULL;
    ...
}
```
In the original test, `key2` is `SET` only 7 times inside the transaction. On fast hosts (the failure was observed on an ARM box with `ustime()` ticking at 1 µs resolution) it is possible for all 7 calls to be measured as 0 µs, which means `key2` is never inserted into the CPU top-K and the assertion
```tcl
assert [dict exists $cpu_result $key2]
```
fails. `key1` has 21 calls and is statistically safe. The author already anticipated this and left a comment ("Send multiple commands to avoid <1us cpu for $key2"), but 7 iterations turned out to be insufficient.
## Changes
Bump the iteration count from 7 to 30. With `key2` now `SET` 30 times, the probability of every single call being measured as 0 µs becomes negligible on any realistic hardware.
|
||
|
|
0bbb196c46
|
Fix sharded pubsub unsubscribe lookup using cached command slot (#15094)
Fixes #15085 ## Problem getKeySlot() may return `server.current_client->slot` while a command is executing instead of computing the slot from the provided string. The unsubscribe can be triggered by another client, in which case server.current_client is not the client being unsubscribed, so getKeySlot() would return that client's cached slot. Using this wrong slot would make the lookup in type.serverPubSubChannels miss the channel and ultimately trigger the assertion below. ## Fix Always use keyHashSlot() instead of getKeySlot() on unsubscribe. --------- Co-authored-by: debing.sun <debing.sun@redis.com> |
||
|
|
5a05863e97
|
t_string: rewrite SET GET propagation in place (#15114)
Optimize `SET key value GET` propagation rewriting in `setGenericCommand()` by removing GET arguments in place with `rewriteClientCommandArgument()`. This avoids the overhead of allocating a new argv vector and incrementing reference counts for every retained argument.
The optimization is scoped to the no-expire `SET ... GET` rewrite path. It also adds test coverage for cases with repeated GET tokens to ensure robust string semantics and consistent replication behavior.
Changes:
- Use `rewriteClientCommandArgument(c, j, NULL)` for in-place removal.
- Eliminate redundant argv allocations and refcount increments.
- Improve performance of SET GET in high-throughput write streams.
|
||
|
|
625b6f58f6
|
tracking: fix self-overlap returning non-zero loop index (#15073)
Fixes checkPrefixCollisionsOrReply() to return 0 (failure) on any provided-prefix self-overlap, instead of accidentally returning a non-zero loop index for overlaps found after the first prefix. Signed-off-by: Raj Danday <rajkripal.danday@gmail.com> |
||
|
|
fafc47251a
|
Fix signed integer overflow in scan count parameter (#14982)
### Problem In `scanGenericCommand`, `maxiterations = count * 10` overflows when `count > LONG_MAX / 10`, causing undefined behavior. ### Changed 1. Use saturating arithmetic to prevent overflow. 2. Added a test to trigger the overflow path, detectable by UBSan. |
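The saturating multiplication described above can be sketched as follows. This is an illustrative helper under stated assumptions, not the code from the PR: `saturating_mul_nonneg()` is a hypothetical name, and it assumes the count has already been validated as non-negative (inputs outside that assumption simply yield 0 here).

```c
#include <limits.h>

/* Hypothetical helper: multiply a non-negative count by a positive factor,
 * clamping to LONG_MAX instead of overflowing (signed overflow is undefined
 * behavior in C). */
static long saturating_mul_nonneg(long count, long factor) {
    if (count < 0 || factor <= 0) return 0;          /* outside the sketch's assumptions */
    if (count > LONG_MAX / factor) return LONG_MAX;  /* product would overflow */
    return count * factor;
}
```

The overflow check uses a division rather than the multiplication itself, so no undefined behavior occurs even for `count == LONG_MAX`.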
||
|
|
63f02e7876
|
Fix double ERR prefix in XNACK error replies (#15091)
Several `addReplyError` and `addReplyErrorFormat` calls in `xnackCommand` included a redundant `"ERR "` prefix in the message string. Since `addReplyErrorLength` already prepends `-ERR ` to the RESP reply, clients received `ERR ERR ...` for these error paths. This PR removes the redundant prefix from all five affected calls and tightens the corresponding test patterns to match from the beginning of the error message (`"ERR ..."` instead of `"*...*"`), so any future double-prefix regression will be caught. |
||
|
|
0fa78fd8fd
|
perf: widen fast_float_strtod fast path to 17-19 digit mantissas (#15061)
## Root cause
Roughly 50% of random double scores generated by the ZADD listpack workload have 17-19 significant digits, which exceed `MAX_MANTISSA_FAST_PATH` (`2^53`). These inputs fall through to the `strtod()` fallback:
```c
char static_buf[128];
memcpy(buf, nptr, len);            /* memcpy back! */
buf[len] = '\0';                   /* null-term */
double result = strtod(buf, ...);  /* glibc strtod, ~10× slower on ARM */
```
The original C++ `fast_float` library handled the same 17-19 digit inputs with Eisel-Lemire / bigint arithmetic without falling back to `strtod()`. That is what the pure-C replacement lost.
## Fix
Compute `mantissa * 10^exponent` in 128-bit integer arithmetic using `__uint128_t`, then convert to double with a single IEEE round-to-nearest-even cast. Supported for `|exp| in [0, 19]` where `10^|exp|` fits in `uint64`; cases outside that range (or otherwise outside the fast path's preconditions) still fall through to `strtod()`.
---------
Co-authored-by: debing.sun <debing.sun@redis.com>
|
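A rough sketch of the 128-bit idea follows. It is not the code from the PR: `POW10_U64` and `scale_mantissa_u128()` are hypothetical names, the sketch covers only non-negative exponents, and it assumes a compiler that provides `__uint128_t` (GCC or Clang on 64-bit targets) with the default round-to-nearest floating-point mode.

```c
#include <stdint.h>

/* Powers of ten that fit in uint64_t: 10^0 .. 10^19. */
static const uint64_t POW10_U64[20] = {
    1ULL,
    10ULL,
    100ULL,
    1000ULL,
    10000ULL,
    100000ULL,
    1000000ULL,
    10000000ULL,
    100000000ULL,
    1000000000ULL,
    10000000000ULL,
    100000000000ULL,
    1000000000000ULL,
    10000000000000ULL,
    100000000000000ULL,
    1000000000000000ULL,
    10000000000000000ULL,
    100000000000000000ULL,
    1000000000000000000ULL,
    10000000000000000000ULL
};

/* Hypothetical helper: computes mantissa * 10^exp10 exactly in 128 bits and
 * rounds once when converting to double. Returns 1 on success, or 0 when the
 * input is outside this sketch's range, in which case a caller would fall
 * back to strtod(). Negative exponents are deliberately not handled here. */
static int scale_mantissa_u128(uint64_t mantissa, int exp10, double *out) {
    if (exp10 < 0 || exp10 > 19) return 0;
    /* A 64-bit value times a value below 2^64 always fits in 128 bits. */
    __uint128_t product = (__uint128_t)mantissa * POW10_U64[exp10];
    *out = (double)product; /* the only rounding step */
    return 1;
}
```

Because the product is exact, the final cast is the only place where rounding happens, which is what lets a mantissa above 2^53 take a fast path at all.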
||
|
|
8677971360
|
Remove unnecessary -ERR and \r\n for addReplyErrorFormat in extractLongLatOrReply() (#14995)
In addReplyErrorLength and addReplyErrorFormatInternal, `-ERR` is automatically prepended if the message doesn’t start with `-`, so the initial `-ERR` is unnecessary. Also, trailing `\r\n` will be trimmed, so it doesn’t need to be included. --------- Signed-off-by: charsyam <charsyam@naver.com> Signed-off-by: DaeMyung Kang <charsyam@gmail.com> Co-authored-by: debing.sun <debing.sun@redis.com> |
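The prefixing and trimming rules described above can be mimicked in a small standalone function. This is a hedged sketch, not Redis's implementation: `format_error_reply()` is a hypothetical helper that only reproduces the two behaviors named in the text (prepend `-ERR ` unless the message starts with `-`, and trim trailing `\r\n`).

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the reply formatting step. Writes the RESP error
 * line into `out` and returns the snprintf() result. */
static int format_error_reply(char *out, size_t outlen, const char *msg) {
    size_t len = strlen(msg);
    /* Trim trailing CR/LF so callers need not (and should not) add them. */
    while (len > 0 && (msg[len - 1] == '\r' || msg[len - 1] == '\n')) len--;
    /* Prepend the generic prefix only when the message has no code of its own. */
    const char *prefix = (len > 0 && msg[0] == '-') ? "" : "-ERR ";
    return snprintf(out, outlen, "%s%.*s\r\n", prefix, (int)len, msg);
}
```

Passing a message that already starts with `ERR ` (without the leading `-`) shows how the doubled prefix arises: the helper cannot tell it apart from ordinary text, so it produces `-ERR ERR ...`.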
||
|
|
8aeea8c210
|
Increase threshold for HPEXPIRETIME persists after RDB reload test (#15047) | ||
|
|
15cb40dac2
|
Fix command-docs and corrupt-dump-fuzzer of OBJ_GCRA type (#15055)
### Problem While the new type `OBJ_GCRA` was added, several related code paths were not updated accordingly, leading to failures in the `reply-schemas-validator` CI job and `corrupt-dump-fuzzer.tcl` ##### reply-schemas-validator Failed CI: https://github.com/redis/redis/actions/runs/24485248057/job/71558533290#step:10:903 ```shell Traceback (most recent call last): File "/home/runner/work/redis/redis/./utils/req-res-log-validator.py", line 238, in process_file jsonschema.validate(instance=res.json, schema=req.schema, cls=schema_validator) File "/home/runner/.local/lib/python3.12/site-packages/jsonschema/validators.py", line 1121, in validate raise error jsonschema.exceptions.ValidationError: 'rate_limit' is not valid under any of the given schemas Failed validating 'oneOf' in schema['patternProperties']['^.*$']['properties']['group']: {'description': 'the functional group to which the command belongs', 'oneOf': [{'const': 'bitmap'}, {'const': 'cluster'}, {'const': 'connection'}, {'const': 'generic'}, {'const': 'geo'}, {'const': 'hash'}, {'const': 'hyperloglog'}, {'const': 'list'}, {'const': 'module'}, {'const': 'pubsub'}, {'const': 'scripting'}, {'const': 'sentinel'}, {'const': 'server'}, {'const': 'set'}, {'const': 'sorted-set'}, {'const': 'stream'}, {'const': 'string'}, {'const': 'transactions'}]} On instance['gcrasetvalue']['group']: 'rate_limit' ``` ##### `corrupt-dump-fuzzer.tcl` Also fixed `: Fuzzer corrupt restore payloads - sanitize_dump: yes in tests/integration/corrupt-dump-fuzzer.tcl` Failed daily test : https://github.com/redis/redis/actions/runs/24485248057/job/71558533312#step:6:8652 ```shell Server crashed (by signal: 0, err: key "gcra" not known in dictionary), with payload: "\x1C\x0A\x02\x5F\x37\xC0\x06\xC0\x00\x02\x5F\x39\xC0\x08\x02\x5F\x33\x02\x5F\x35\x02\x5F\x31\xC0\x02\xC0\x04\x0E\x00\xA9\x71\xBF\xEE\x6F\x46\xEF\xA6" violating commands: Done 1434 cycles in 600 seconds. 
RESTORE: successful: 601, rejected: 833 Total commands sent in traffic: 1194776, crashes during traffic: 1 (0 by signal). [: Fuzzer corrupt restore payloads - sanitize_dump: yes in tests/integration/corrupt-dump-fuzzer.tcl Expected '1' to be equal to '0' (context: type eval line 155 cmd {assert_equal $stat_terminated_in_traffic 0} proc ::test) [147/147 done]: integration/corrupt-dump-fuzzer (1201 seconds) ``` ### Changed This change completes the necessary updates across all relevant components to ensure consistent handling of the rate_limit group and restores CI stability. |
||
|
|
4757561861
|
Subkey notification for hash fields (#14958)
## Motivation
Redis's existing keyspace notification system operates at the **key
level** only — when a hash field is modified via `HSET`, `HDEL`, or
`HEXPIRE`, the subscriber receives the key name and the event type, but
not **which fields** were affected. As a result, these notifications have
very little practical value.
This PR introduces a subkey notification system that extends keyspace
events to include field-level (subkey) details for hash operations,
through both Pub/Sub channels and the Module API.
## New Pub/Sub Notification Channels
Four new channels are added:
|Channel Format | Payload |
|---------------|---------|
| `__subkeyspace@<db>__:<key>` | `<event>\|<len>:<subkey>[,...]` |
|`__subkeyevent@<db>__:<event>` |
`<key_len>:<key>\|<len>:<subkey>[,...]` |
| `__subkeyspaceitem@<db>__:<key>\n<subkey>` | `<event>` |
|`__subkeyspaceevent@<db>__:<event>\|<key>` | `<len>:<subkey>[,...]` |
**Design rationale for 4 channels:**
- **Subkeyspace**: Subscribe to a specific key, receive all field
changes in a single message — efficient for key-centric consumers.
- **Subkeyevent**: Subscribe to a specific event type, receive
key+fields — efficient for event-centric consumers.
- **Subkeyspaceitem**: Subscribe to a specific key+field combination —
the most selective, one message per field, no parsing needed.
- **Subkeyspaceevent**: Subscribe to event+key combination, receiving
only the affected fields — server-side filtering on both dimensions.
Subkeys are encoded in a length-prefixed format (`<len>:<subkey>`) to
support binary-safe field names containing delimiters.
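To make the length-prefixed encoding concrete, here is a minimal sketch that builds a `__subkeyspace` style payload of the form `<event>|<len>:<subkey>[,...]`. The names `subkey_t` and `build_subkeyspace_payload()` are hypothetical and are not taken from the PR; the sketch only follows the payload format shown in the table above.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical (pointer, length) pair so a subkey may contain any bytes,
 * including ',' or '|'. */
typedef struct {
    const char *ptr;
    size_t len;
} subkey_t;

/* Builds "<event>|<len>:<subkey>[,...]" into `out`. Returns the number of
 * bytes written (excluding the terminating NUL) or -1 if `out` is too small. */
static int build_subkeyspace_payload(char *out, size_t outlen, const char *event,
                                     const subkey_t *subkeys, int count) {
    int n = snprintf(out, outlen, "%s|", event);
    if (n < 0 || (size_t)n >= outlen) return -1;
    size_t pos = (size_t)n;
    for (int i = 0; i < count; i++) {
        /* The length prefix tells the parser how many bytes to take, so
         * delimiters inside the subkey are harmless. */
        n = snprintf(out + pos, outlen - pos, "%s%zu:", i ? "," : "", subkeys[i].len);
        if (n < 0 || (size_t)n >= outlen - pos) return -1;
        pos += (size_t)n;
        if (subkeys[i].len >= outlen - pos) return -1;
        memcpy(out + pos, subkeys[i].ptr, subkeys[i].len);
        pos += subkeys[i].len;
    }
    out[pos] = '\0';
    return (int)pos;
}
```

For the fields `f1` and `a,b` with event `hset`, the payload is `hset|2:f1,3:a,b`: a parser reads `3:` and then consumes exactly three bytes, so the comma inside the second field is not mistaken for a separator.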
**Safety guards:**
- Events containing `|` are skipped for the `__subkeyspace` and
`__subkeyspaceevent` channels (to avoid parsing ambiguity).
- Keys containing `\n` are skipped for the `__subkeyspaceitem` channel
(newline is the key/subkey separator).
- Subkey channels are only published to when `subkeys != NULL && count >
0`.
## Hash Command Integration
The following hash operations now emit subkey level notifications with
the affected field names:
| Command | Event | Subkeys |
|---------|-------|---------|
| `HSET` / `HMSET` | `hset` | All fields being set |
| `HSETNX` | `hset` | The field (if set) |
| `HDEL` | `hdel` | All fields deleted |
| `HGETDEL` | `hdel` / `hexpired` | Deleted or lazily expired fields |
| `HGETEX` | `hexpire` / `hpersist` / `hdel` / `hexpired` | Affected
fields per event |
| `HINCRBY` | `hincrby` | The field |
| `HINCRBYFLOAT` | `hincrbyfloat` | The field |
| `HEXPIRE` / `HPEXPIRE` / `HEXPIREAT` / `HPEXPIREAT` | `hexpire` |
Updated fields |
| `HPERSIST` | `hpersist` | Persisted fields |
| `HSETEX` | `hset` / `hdel` / `hexpire` / `hexpired` | Affected fields
per event |
| Field expiration (active/lazy) | `hexpired` | All expired fields
(batched) |
For field expiration, expired fields are collected into a dynamic array
and sent as a single batched notification after the expiration loop,
rather than one notification per field.
## Module API
Three new APIs and one new callback type:
```c
/* Function pointer type for keyspace event notifications with subkeys from modules. */
typedef void (*RedisModuleNotificationWithSubkeysFunc)(
RedisModuleCtx *ctx, int type, const char *event,
RedisModuleString *key, RedisModuleString **subkeys, int count);
/* Subscribe to keyspace notifications with subkey information.
*
* This is the extended version of RM_SubscribeToKeyspaceEvents. When subkeys
* are available, the `subkeys` array and `count` are passed to the callback.
* `subkeys` contains only the names of affected subkeys (values are not included),
* and `count` is the number of elements. The array may contain duplicates when
* the same subkey appears more than once in a command (e.g. HSET key f1 v1 f1 v2
* produces subkeys=["f1","f1"], count=2). When no subkeys are present, `subkeys`
* will be NULL and `count` will be 0. Whether events without subkeys are delivered
* depends on the `flags` parameter (see below).
*
* `types` is a bit mask of event types the module is interested in
* (using the same REDISMODULE_NOTIFY_* flags as RM_SubscribeToKeyspaceEvents).
*
* `flags` controls delivery filtering:
* - REDISMODULE_NOTIFY_FLAG_NONE: The callback is invoked for all matching
* events regardless of whether subkeys are present, so a separate
* RM_SubscribeToKeyspaceEvents registration can be omitted.
* - REDISMODULE_NOTIFY_FLAG_SUBKEYS_REQUIRED: The callback is only invoked
* when subkeys are not empty. Events without subkey information (e.g. SET,
* EXPIRE, DEL) are skipped.
*
* The callback signature is:
* void callback(RedisModuleCtx *ctx, int type, const char *event,
* RedisModuleString *key, RedisModuleString **subkeys, int count);
*
* The subkeys array and its contents are only valid during the callback.
* The underlying objects may be stack-allocated or temporary, so
* RM_RetainString must NOT be used on them. To keep a subkey beyond
* the callback (e.g. in a RM_AddPostNotificationJob callback), use
* RM_HoldString (which handles static objects by copying) or
* RM_CreateStringFromString to make a deep copy before returning.
*/
int RM_SubscribeToKeyspaceEventsWithSubkeys(RedisModuleCtx *ctx, int types, int flags, RedisModuleNotificationWithSubkeysFunc callback);
/* Unregister a module's callback from keyspace notifications with subkeys
* for specific event types.
*
* This function removes a previously registered subscription identified by
* the event mask, delivery flags, and the callback function.
*
* Parameters:
* - ctx: The RedisModuleCtx associated with the calling module.
* - types: The event mask representing the notification types to unsubscribe from.
* - flags: The delivery flags that were used during registration.
* - callback: The callback function pointer that was originally registered.
*
* Returns:
* - REDISMODULE_OK on successful removal of the subscription.
* - REDISMODULE_ERR if no matching subscription was found. */
int RM_UnsubscribeFromKeyspaceEventsWithSubkeys(
RedisModuleCtx *ctx, int types, int flags,
RedisModuleNotificationWithSubkeysFunc cb);
/* Like RM_NotifyKeyspaceEvent, but also triggers subkey-level notifications
* when subkeys are provided. Both key-level (keyspace/keyevent) and
* subkey-level (subkeyspace/subkeyevent/subkeyspaceitem/subkeyspaceevent)
* channels are published to, depending on the server configuration.
*
* This is the extended version of RM_NotifyKeyspaceEvent and can actually
* replace it. When called with subkeys=NULL and count=0, it behaves
* identically to RM_NotifyKeyspaceEvent. */
int RM_NotifyKeyspaceEventWithSubkeys(
RedisModuleCtx *ctx, int type, const char *event,
RedisModuleString *key, RedisModuleString **subkeys, int count);
```
## Configuration
Subkey notifications are controlled via the existing
`notify-keyspace-events` configuration string with four new characters:
`notify-keyspace-events` "STIV"
**S** -> Subkeyspace events, published with `__subkeyspace@<db>__:<key>`
prefix.
**T** -> Subkeyevent events, published with
`__subkeyevent@<db>__:<event>` prefix.
**I** -> Subkeyspaceitem events, published per subkey with
`__subkeyspaceitem@<db>__:<key>\n<subkey>` prefix.
**V** -> Subkeyspaceevent events, published with
`__subkeyspaceevent@<db>__:<event>|<key>` prefix.
These flags are **independent** from the existing key-level flags (`K`,
`E`, etc.). Enabling subkey notifications does **not** implicitly enable
or depend on keyspace/keyevent notifications, and vice versa.
## Known Limitations
- **Duplicate fields in subkey notifications**: Subkey notification
payloads may contain duplicate field names when the same field is
affected more than once within a single command. Since duplicate fields
are not the common case and deduplication would introduce significant
overhead on every notification, we chose not to deduplicate at this
time.
- **Subkeys must be sds-encoded objects**: The code assumes each subkey
is an sds-encoded object and accesses it through `subkey->ptr`. An assertion
enforces this, so Redis will crash if a subkey uses another encoding.
|
||
|
|
fa6d4c3d63
|
Fix SIGABRT in HSETEX when a field appears twice in the FIELDS list (#14956)
HSETEX crashed on an assert() with a SIGABRT when the same field appeared more than once in the FIELDS list and an expiry time was given (EX/PX/EXAT/PXAT).
Root cause: hfieldPersist() and the KEEP_TTL path in hashTypeSet() both asserted that dictExpireMeta->expireMeta.trash == 0, meaning the hash must be globally registered in the HFE DS. This is incorrect during HSETEX execution because hashTypeSetExDone(), which registers the hash globally and clears trash, is called only at the end of the flow. The private per-field ebuckets are fully valid regardless of the global registration state.
Fix: Remove both incorrect assertions. The operations on the private ebuckets (ebRemove in hfieldPersist, ebAdd in the KEEP_TTL path) are correct and do not require the hash to be globally registered.
Tests: Added two regression tests covering the crash scenarios:
- HSETEX EX with a duplicate field (existing field, expiry given)
- HSETEX FNX EX with a duplicate field (no prior field, FNX condition passes)
|
||
|
|
3bcfbbe92a
|
Add new OBJ_GCRA type (#14905)
[PR #14826](https://github.com/redis/redis/pull/14826) introduced a new rate limiting command which stores its internal implementation-detail data in a string key. Because this prevents a client from detecting type errors, accidental overwrites, or value invalidations (e.g. via SET or INCR), this PR introduces a new data type, OBJ_GCRA, created specifically for that command.
Furthermore, a new RATE_LIMIT KSN type was introduced for emitting "gcra" events on such keys.
GCRASETTAT was renamed to GCRASETVALUE.
---------
Co-authored-by: debing.sun <debing.sun@redis.com>
|
||
|
|
670993a89d
|
Replace fast_float C++ library with pure C implementation (#14661)
The fast_float dependency required C++ (libstdc++) to build Redis. This commit replaces the 3800-line C++ template library with a minimal pure C implementation (~360 lines) that provides the same functionality needed by Redis.
This is **very important** because the Redis build process would fail without g++ installed, a common situation in Linux distributions even after installing the basic build tools: we want the build process of Redis to be as simple as possible. Redis is also sometimes compiled on embedded systems lacking the g++ toolchain. There is no reason to depend on C++ in a project written in C.
## The C implementation uses
1. Fast path (Clinger's algorithm) for numbers with mantissa <= 2^53 and exponent in [-22, 22], covering ~99% of real-world cases.
2. Fallback to strtod() for complex cases to ensure correctly-rounded results.
## Changes
- Move the new fast_float_strtod.c (C implementation) from deps into Redis core, since it is now a single file and no longer needs a separate directory.
- Remove all C++ dependencies.
The implementation was tested against both strtod and the original C++ implementation with 10,000+ test cases including edge cases, special values (inf/nan), and random inputs.
---------
Co-authored-by: debing.sun <debing.sun@redis.com>
Co-authored-by: Mincho Paskalev <minchopaskal@gmail.com>
Co-authored-by: Moti Cohen <moti.cohen@redis.com>
|
||
|
|
3cd464263b
|
Fix gen_write_load error on MOVED/ASK during atomic-slot-migration tests (#15016)
|
||
|
|
80f1ebda88
|
Add AGGREGATE COUNT option to ZUNION, ZINTER, ZUNIONSTORE, and ZINTERSTORE (#14892)
### Overview
This PR adds a new `COUNT` aggregation mode to the `ZUNIONSTORE`,
`ZINTERSTORE`, `ZUNION`, and `ZINTER` sorted set commands. When
`AGGREGATE COUNT` is specified, the resulting score for each element
reflects how many input sets contain it (optionally scaled by
`WEIGHTS`), rather than combining the actual scores of the elements.
This enables a common use case — counting set membership frequency —
directly at the command level, without application-side workarounds.
### Problem Statement
For developers who need to know **how many input sorted sets contain
each element**, there is no single-command solution today.
**Example:** given several game leaderboards, find how many leaderboards
each player appears in.
The existing aggregation modes (`SUM`, `MIN`, `MAX`) all operate on the
elements' scores. To ignore scores and just count set membership, you'd
currently need to copy each sorted set with all scores set to 1, then
run `ZUNIONSTORE`/`ZINTERSTORE` with `SUM` — requiring multiple round
trips, temporary keys, and application-level locking to avoid races.
A `COUNT` aggregation mode solves this directly.
### Solution
Introduces `AGGREGATE COUNT` as a fourth aggregation mode:
- `ZINTER numkeys key [key ...] [WEIGHTS weight [weight ...]] [AGGREGATE
<SUM | MIN | MAX | COUNT>] [WITHSCORES]`
- `ZINTERSTORE destination numkeys key [key ...] [WEIGHTS weight [weight
...]] [AGGREGATE <SUM | MIN | MAX | COUNT>]`
- `ZUNION numkeys key [key ...] [WEIGHTS weight [weight ...]] [AGGREGATE
<SUM | MIN | MAX | COUNT>] [WITHSCORES]`
- `ZUNIONSTORE destination numkeys key [key ...] [WEIGHTS weight [weight
...]] [AGGREGATE <SUM | MIN | MAX | COUNT>]`
When `COUNT` is specified, **the scores in the input sets are ignored**.
Note that `WEIGHTS` is **not** ignored — each set contributes its weight
(default 1) per element, and the contributions are summed.
**Implementation details:**
A new helper function `zuiWeightedScore()` computes the per-set
contribution:
```c
inline static double zuiWeightedScore(double score, double weight, int aggregate) {
return (aggregate == REDIS_AGGR_COUNT) ? weight : weight * score;
}
```
The `zunionInterAggregate()` function treats `COUNT` identically to
`SUM` — it adds the per-set contributions. All four call sites where
`weight * score` was previously computed inline are updated to use
`zuiWeightedScore()`.
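The effect of the helper can be checked with a small standalone model of the aggregation. This is a sketch, not the Redis implementation: `AGGR_SUM`, `AGGR_COUNT`, `member_t`, `weighted_score()`, and `aggregate_element()` are hypothetical names, and only union-style summation over the sets that contain the element is modeled.

```c
/* Hypothetical constants standing in for Redis's REDIS_AGGR_* values. */
enum { AGGR_SUM, AGGR_COUNT };

/* Same idea as the helper above: with COUNT the element's score is ignored
 * and the set contributes only its weight. */
static double weighted_score(double score, double weight, int aggregate) {
    return (aggregate == AGGR_COUNT) ? weight : weight * score;
}

/* One input set's view of an element: whether it is present, and its score. */
typedef struct {
    int present;
    double score;
} member_t;

/* Adds up the per-set contributions for a single element across `nsets`
 * input sets. SUM and COUNT are both combined by addition. */
static double aggregate_element(const member_t *m, const double *weights,
                                int nsets, int aggregate) {
    double total = 0.0;
    for (int i = 0; i < nsets; i++) {
        if (!m[i].present) continue;
        total += weighted_score(m[i].score, weights[i], aggregate);
    }
    return total;
}
```

With weights 10, 5, 3, an element present in all three sets with scores 1, 2, 3 yields 29 under SUM and 18 under COUNT, and an element present only in the first two sets yields 20 and 15, matching the `ZUNIONSTORE` outputs in the examples for this PR.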
### Examples
```
> ZADD s1 1 foo 1 bar
> ZADD s2 2 foo 2 bar
> ZADD s3 3 foo
```
**With `SUM` (existing behavior, for comparison):**
```
> ZINTERSTORE t1 3 s1 s2 s3 WEIGHTS 10 5 3 AGGREGATE SUM
(integer) 1
> ZRANGE t1 0 -1 WITHSCORES
1) "foo"
2) "29"
> ZUNIONSTORE t1 3 s1 s2 s3 WEIGHTS 10 5 3 AGGREGATE SUM
(integer) 2
> ZRANGE t1 0 -1 WITHSCORES
1) "bar"
2) "20"
3) "foo"
4) "29"
```
**With `COUNT` and `WEIGHTS`:**
```
> ZINTERSTORE t1 3 s1 s2 s3 WEIGHTS 10 5 3 AGGREGATE COUNT
(integer) 1
> ZRANGE t1 0 -1 WITHSCORES
1) "foo"
2) "18"
> ZUNIONSTORE t1 3 s1 s2 s3 WEIGHTS 10 5 3 AGGREGATE COUNT
(integer) 2
> ZRANGE t1 0 -1 WITHSCORES
1) "bar"
2) "15"
3) "foo"
4) "18"
```
**With `COUNT` and no specified `WEIGHTS`** — resulting score equals the
number of input sorted sets containing the element:
```
> ZINTERSTORE t1 3 s1 s2 s3 AGGREGATE COUNT
(integer) 1
> ZRANGE t1 0 -1 WITHSCORES
1) "foo"
2) "3"
> ZUNIONSTORE t1 3 s1 s2 s3 AGGREGATE COUNT
(integer) 2
> ZRANGE t1 0 -1 WITHSCORES
1) "bar"
2) "2"
3) "foo"
4) "3"
```
### Backward Compatibility
This is a fully additive change. The new `COUNT` keyword is only
recognized after the `AGGREGATE` token in the four affected commands.
Existing commands, arguments, and default behavior (`AGGREGATE SUM`) are
completely unchanged. No new command is introduced, and no existing
response format is modified.
|
||
|
|
e1d35aca01
|
Fix HEXPIRE numfields overflow (#15021)
- Validate HEXPIRE-family field counts without parser overflow
- keep flexible option order; only require fields fit in argv
- add tests for INT_MAX numfields across HEXPIRE/HPEXPIRE/HEXPIREAT/HPEXPIREAT |
||
|
|
e8da0e5b47
|
Fix brittle assert_match patterns for unexpected slowlog fields (#14948) | ||
|
|
0be39e5032
|
Fix missing consumer propagation on empty XREADGROUP (#14963)
## Summary
Fixes consumer replication inconsistency when `XREADGROUP` is called for a new consumer but no `XCLAIM` commands are propagated to the replica.

Previously, consumer creation was only propagated to replicas when `noack=true`, relying on `XCLAIM` propagation to implicitly create the consumer in the non-NOACK path. However, if no messages exist to read, no `XCLAIM` is generated, and the consumer is silently lost on the replica.

This is a follow-up to the original fix in [redis/redis#7140](https://github.com/redis/redis/issues/7140) / [redis/redis#7526](https://github.com/redis/redis/pull/7526), which introduced `XGROUP CREATECONSUMER` propagation but only for the `NOACK` case.
## Changes
- **`xreadgroupCommand` (src/t_stream.c):** Replaced the `if (noack)` guard around the `streamPropagateConsumerCreation()` call with a deferred check after `streamReplyWithRange()`. Consumer creation is now propagated when `noack || propCount == 0`, that is, only when no `XCLAIM` commands were generated. This avoids redundant propagation in the common case where `XCLAIM` already implicitly creates the consumer on the replica, while correctly handling both the NOACK path (where PEL/XCLAIM is skipped entirely) and the no-messages path (where there is nothing to XCLAIM).
- **Test (tests/unit/type/stream-cgroups.tcl):** Added replication test `"XREADGROUP propagates new consumer to replica"` that sets up a master-replica pair and verifies consumer propagation in two cases: (1) without NOACK when no messages are available to deliver, and (2) with NOACK when messages are delivered but XCLAIM is skipped.
## Benefits
- **Master-replica consistency:** Consumers created by `XREADGROUP` are now visible on replicas whenever no `XCLAIM` would otherwise create them, covering both the NOACK path and the empty-stream path.
- **No redundant propagation:** The `noack || propCount == 0` condition avoids emitting a superfluous `XGROUP CREATECONSUMER` when `XCLAIM` commands are already propagated and would implicitly create the consumer on the replica. |
||
|
|
747dfe578e
|
Add XNACK command for releasing stream messages back to the group (#14797)
### Overview
This PR enhances Redis Streams consumer groups by adding a new `XNACK`
command that allows consumers to explicitly release pending messages
back to the group without acknowledging them. Released (NACKed) entries
become immediately available for re-delivery to other consumers,
eliminating the idle-timeout delay currently required for message
recovery. The command supports three modes — SILENT, FAIL, and FATAL —
giving consumers fine-grained control over delivery counter semantics to
handle graceful shutdowns, transient failures, and poison messages
respectively.
### Problem Statement
For developers using Redis Streams with consumer groups, there are
several common scenarios where a consumer needs to release a message it
has claimed without acknowledging it:
1. **Transient internal failures**: A consumer may fail to process a
message because of problems unrelated to the message itself — for
example, it cannot connect to an external service to fetch required
context. The message is perfectly valid and should be retried promptly
by another consumer.
2. **Resource pressure**: A consumer under resource stress (low CPU, low
memory) may be unable to handle a specific message (e.g., a complex or
large message) within acceptable QoS. It should leave the opportunity to
other consumers in the group, with minimal delay.
3. **Graceful shutdown**: A consumer about to shut down would like to
immediately release all unprocessed messages it has claimed, so they can
be picked up by remaining consumers without waiting for idle timeouts.
4. **Poison / malicious messages**: A consumer may detect or suspect
that a claimed message is invalid or malicious and wants to mark it as
permanently failed (for dead-letter queue processing when available).
**Currently, a consumer cannot NACK a message.** It can either:
- **XACK** it — marks it as "processed" and removes it from the PEL
entirely, losing the ability to redeliver it
- **Leave it pending** — requires other consumers to discover it via
`XPENDING` and claim it via `XCLAIM`/`XAUTOCLAIM` or `XREADGROUP CLAIM`
after the idle timeout expires, introducing a long, unnecessary delay
In all these cases, the logic that applications must implement
introduces **message handling delays**, **implementation complexity**,
and **code duplication** across consumer implementations.
### Solution
Introduces a new `XNACK` (Negative ACKnowledge) command that explicitly
releases pending messages from their owning consumer back to the group's
PEL, making them immediately claimable via `XCLAIM` and `XAUTOCLAIM`,
and prioritized for re-delivery in `XREADGROUP CLAIM`:
```
XNACK key group <SILENT|FAIL|FATAL> IDS numids id [id ...] [RETRYCOUNT count] [FORCE]
```
When executed, the command:
1. **Disassociates** the entry from its owning consumer (`consumer =
NULL`)
2. **Repositions** the entry to the head of the PEL time-ordered list
(`delivery_time = 0`), making it immediately claimable with any
`min-idle-time` threshold
3. **Adjusts the delivery counter** based on the specified mode, giving
consumers fine-grained control over retry semantics
4. **Returns** the count of successfully NACKed entries
**Mode** controls the delivery counter adjustment and communicates the
reason for the NACK:
| Mode | Delivery Counter Behavior | Use Case |
|----------|---------------------------------------------------|---------------------------------------------|
| `SILENT` | Decrement by 1 (undo the delivery increment) | Consumer shutdown / transient internal error: the delivery "didn't count" |
| `FAIL` | No change (keep the incremented value) | Message too complex for this consumer, but may work for others: count this as an attempt |
| `FATAL` | Set to `LLONG_MAX` | Invalid / suspected malicious message: mark as permanently failed |
The three modes map directly to the real-world scenarios above:
- **SILENT** for graceful shutdown or transient failures unrelated to
the message
- **FAIL** for resource-constrained consumers that cannot handle a
specific message
- **FATAL** for poison message detection and dead-letter queue
integration
**Optional parameters:**
- **`RETRYCOUNT count`**: Directly sets `delivery_count` to the
specified value, overriding the mode-based adjustment
- **`FORCE`**: Creates new unowned PEL entries for IDs that are not
already in the group PEL (the entry must exist in the stream). When
`FORCE` creates an entry, the delivery counter is set to `0` (or to
`RETRYCOUNT` if specified, or to `LLONG_MAX` if mode is `FATAL`). This
is used internally for AOF rewrite and replication.
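The counter rules for the three modes, `RETRYCOUNT`, and `FORCE` can be summarized in a short sketch. This is an illustration under stated assumptions, not the code from the PR: the names `nack_mode`, `nack_existing_count()`, and `nack_forced_count()` are hypothetical, clamping the `SILENT` decrement at zero is an assumption, and so is letting `RETRYCOUNT` take precedence over `FATAL` for entries created by `FORCE` (the description does not state which wins).

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical names; the PR's internal identifiers are not shown here. */
typedef enum { NACK_SILENT, NACK_FAIL, NACK_FATAL } nack_mode;

/* Delivery counter after XNACK for an entry that is already in the PEL.
 * has_retry / retry model the optional RETRYCOUNT argument. */
static long long nack_existing_count(nack_mode mode, long long current,
                                     bool has_retry, long long retry) {
    assert(current >= 0);
    if (has_retry) return retry;  /* RETRYCOUNT overrides the mode */
    switch (mode) {
    case NACK_SILENT: return current > 0 ? current - 1 : 0; /* clamp is assumed */
    case NACK_FAIL:   return current;                        /* keep the attempt */
    case NACK_FATAL:  return LLONG_MAX;                      /* permanently failed */
    }
    return current;
}

/* Delivery counter for a PEL entry newly created by FORCE. */
static long long nack_forced_count(nack_mode mode, bool has_retry,
                                   long long retry) {
    if (has_retry) return retry;  /* assumed to win over FATAL */
    return mode == NACK_FATAL ? LLONG_MAX : 0;
}
```

For an entry delivered once, `SILENT` returns the counter to 0, `FAIL` leaves it at 1, and `FATAL` pins it at `LLONG_MAX`, which a dead-letter consumer could use as a marker.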
### Response Format
The command returns an integer — the number of messages successfully
NACKed (released back to the group PEL):
```
127.0.0.1:6379> XADD mystream 1-0 f v1
"1-0"
127.0.0.1:6379> XADD mystream 2-0 f v2
"2-0"
127.0.0.1:6379> XGROUP CREATE mystream grp 0
OK
127.0.0.1:6379> XREADGROUP GROUP grp c1 STREAMS mystream >
1) 1) "mystream"
2) 1) 1) "1-0"
2) 1) "f"
2) "v1"
2) 1) "2-0"
2) 1) "f"
2) "v2"
127.0.0.1:6379> XNACK mystream grp FAIL IDS 2 1-0 2-0
(integer) 2
```
After XNACK, the entries appear with an empty consumer in XPENDING
output:
```
127.0.0.1:6379> XPENDING mystream grp - + 10
1) 1) "1-0"
2) ""
3) (integer) -1
4) (integer) 1
2) 1) "2-0"
2) ""
3) (integer) -1
4) (integer) 1
```
### NACK Zone: Data Structure Extension
To support unowned PEL entries and ensure they are prioritized for
re-delivery, a **NACK zone** is introduced at the head of the existing
PEL time-ordered doubly-linked list. A new `pel_nack_tail` pointer is
added to the `streamCG` structure:
**PEL ordering:**
```
[pel_time_head] <-> ... <-> [pel_nack_tail] <-> [owned entries...] <-> [pel_time_tail]
|_____________ NACK zone ______________| |_______ normal PEL ________|
```
The head of the PEL contains all NACKed messages (FIFO-ordered),
followed by all delivered messages that were not NACKed (same order as
today). This ensures NACKed messages are always prioritized over idle
pending messages.
The delivery order for `XREADGROUP` is therefore:
1. If `CLAIM` was specified: first deliver NACKed messages, then deliver
due pending messages (current behavior)
2. Deliver new entries after the group's last-delivered-id (current
behavior)
**Structure Design:**
- NACKed entries occupy positions from `pel_time_head` to
`pel_nack_tail` in the time-ordered list
- Their `delivery_time` is set to `0`, ensuring they always appear
"oldest" and are immediately claimable
- Their `consumer` pointer is set to `NULL`, marking them as unowned
- `pel_nack_tail` is `NULL` when no NACKed entries exist
**Key Properties:**
- **O(1) insertion**: New NACKed entries are inserted right after
`pel_nack_tail` (or at the list head if the zone is empty)
- **FIFO ordering** among NACKed entries: entries stay in the order in
which they were released
- **Immediate claimability**: Since `delivery_time = 0`, NACKed entries
have maximum idle time and satisfy any `min-idle-time` threshold in
`XCLAIM` and `XAUTOCLAIM`. In `XREADGROUP CLAIM`, NACKed entries are
also prioritized over other pending entries due to their position at the
head of the PEL
- **Zone integrity**: The `pelListInsertSorted` function is updated to
stop scanning at the `pel_nack_tail` boundary, ensuring owned entries
are never placed inside the NACK zone
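The NACK zone layout can be sketched with a plain doubly-linked list and one extra pointer. The types `pel_node` and `pel_list` and both functions below are simplified, hypothetical stand-ins for the real `streamCG` fields and `pelListInsertNacked`. The sketch also assumes the node is already detached; a real NACK would first unlink it from its current position.

```c
#include <assert.h>
#include <stddef.h>

typedef struct pel_node {
    struct pel_node *prev, *next;
    long long delivery_time;
    const char *consumer;   /* NULL means unowned (NACKed) */
} pel_node;

typedef struct {
    pel_node *head, *tail;  /* play the role of pel_time_head / pel_time_tail */
    pel_node *nack_tail;    /* last node of the NACK zone, NULL if it is empty */
} pel_list;

/* Append a detached node to the end of the NACK zone in O(1). */
static void pel_insert_nacked(pel_list *l, pel_node *n) {
    assert(l != NULL && n != NULL);
    n->consumer = NULL;
    n->delivery_time = 0;
    n->prev = l->nack_tail;
    n->next = l->nack_tail ? l->nack_tail->next : l->head;
    if (n->prev) n->prev->next = n; else l->head = n;
    if (n->next) n->next->prev = n; else l->tail = n;
    l->nack_tail = n;
}

/* Append an owned node at the tail, i.e. in the normal PEL region. */
static void pel_append_owned(pel_list *l, pel_node *n, const char *consumer,
                             long long delivery_time) {
    assert(l != NULL && n != NULL && consumer != NULL);
    n->consumer = consumer;
    n->delivery_time = delivery_time;
    n->next = NULL;
    n->prev = l->tail;
    if (l->tail) l->tail->next = n; else l->head = n;
    l->tail = n;
}
```

Because insertion only touches `nack_tail` and its neighbor, the cost does not depend on the PEL length, and NACKed nodes always precede owned ones when the list is walked from the head.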
### Impact on Existing Commands
All commands that interact with the PEL are updated to handle unowned
(`consumer = NULL`) entries:
- **XPENDING**: Shows NACKed entries with an empty consumer name
- **XCLAIM / XAUTOCLAIM**: Can claim NACKed entries (they satisfy any
min-idle-time since `delivery_time = 0`)
- **XREADGROUP CLAIM**: NACKed entries are picked up by the claim phase
- **XACK**: Works correctly on NACKed entries (removes from group PEL)
- **XINFO STREAM FULL**: Displays NACKed entries with an empty consumer
name
- **XGROUP DELCONSUMER**: Unaffected — NACKed entries are not in any
consumer's PEL
Propagation is also updated: when `XCLAIM` or `XAUTOCLAIM` encounters a
deleted stream entry for an unowned NACK, it propagates `XACK` (instead
of `XCLAIM`) to replicas and AOF, since there is no source consumer to
reference.
### Persistence
**RDB:**
- A new RDB type `RDB_TYPE_STREAM_LISTPACKS_5` (type 27) is introduced
- After saving consumer PEL entries, the NACK zone stream IDs are saved
separately (count + encoded IDs)
- On load, NACK zone entries are reconstructed by looking them up in the
group PEL, unlinking from their sorted position, and re-inserting into
the NACK zone via `pelListInsertNacked`
- Backward compatibility is preserved: old RDB types continue to load
with the existing validation (all entries must have consumers)
**AOF:**
- AOF rewrite emits `XNACK <key> <group> FAIL IDS <n> <id...> RETRYCOUNT
<cnt> FORCE` commands for entries in the NACK zone
- Consecutive entries with the same `delivery_count` are batched into a
single command (up to `AOF_REWRITE_ITEMS_PER_CMD` IDs per command)
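The batching rule can be illustrated with a small counting helper. This is a sketch of the rule as described, not the rewrite code itself: `count_batches()` is a hypothetical name, and `max_ids` plays the role of `AOF_REWRITE_ITEMS_PER_CMD`, whose value is passed in rather than assumed.

```c
#include <assert.h>
#include <stddef.h>

/* Count how many XNACK commands would be emitted for n NACK-zone entries
 * when consecutive entries with the same delivery_count share one command,
 * capped at max_ids IDs per command. counts[i] is the delivery_count of the
 * i-th entry in NACK-zone order. */
static size_t count_batches(const long long *counts, size_t n, size_t max_ids) {
    assert(max_ids > 0);
    size_t batches = 0, in_batch = 0;
    for (size_t i = 0; i < n; i++) {
        /* Start a new command on the first entry, when the current command
         * is full, or when the delivery_count changes. */
        int new_batch = in_batch == 0 || in_batch == max_ids ||
                        counts[i] != counts[i - 1];
        if (new_batch) {
            batches++;
            in_batch = 0;
        }
        in_batch++;
    }
    return batches;
}
```

For delivery counts `{1, 1, 1, 2, 2, 1}` and a cap of 2, the entries split into `[1,1] [1] [2,2] [1]`, so four commands are counted; with a large cap the same input needs three.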
### Defragmentation
The defragmentation logic is restructured to handle unowned entries:
- **`defragStreamCGPendingEntry`** (new): Walks the group-level PEL rax,
defragments each NACK, updates the doubly-linked list pointers
(`pel_prev`, `pel_next`), `pel_time_head`, `pel_time_tail`,
`pel_nack_tail`, and the consumer PEL back-pointer for owned entries
- **`defragStreamConsumerPendingEntry`** (simplified): Only fixes up
back-pointers to the possibly-relocated consumer and CG, since actual
defragmentation is now done at the group-level walk. Unowned (NACK zone)
entries have no consumer PEL walk, so the group-level pass is their only
chance
### Key Benefits
- **Immediate re-delivery**: NACKed entries are instantly claimable by
other consumers via `XCLAIM` and `XAUTOCLAIM` (since `delivery_time = 0`
satisfies any `min-idle-time`), and prioritized for re-delivery in
`XREADGROUP CLAIM`, eliminating idle-time delays that can range from
seconds to minutes
- **Explicit release semantics**: Consumers can release messages
intentionally, with fine-grained control over retry behavior — a
capability that exists in competing systems like RabbitMQ
- **Flexible retry control**: Three modes (SILENT, FAIL, FATAL) plus
RETRYCOUNT cover the full spectrum of failure handling strategies, from
graceful shutdown to poison message detection
- **Reduced application complexity**: Eliminates the need for
application-level workarounds involving XPENDING polling, arbitrary idle
timeouts, and manual XCLAIM orchestration
- **Dead-letter queue readiness**: FATAL mode + delivery count enables
straightforward poison message detection and future DLQ integration
- **Backward compatibility**: Fully optional new command with zero
breaking changes to existing behavior
|
||
|
|
153b79a290
|
keymeta: add DEBUG flag for runtime keymeta class registration (#14968)
M_CreateKeyMetaClass() allows registration only on:
- 'DEBUG enable-module-keymeta-runtime-registration 1' (replaces server.enable_debug_cmd)
- REDISMODULE_CTX_FLAGS_SERVER_STARTUP, in addition to module->onload |
||
|
|
d22c68f904
|
Partial support set keymeta on ksn (#15004)
As part of KSN, modules must not modify keys. However, RediSearch modifies key metadata in some flows, which may invalidate the local kvobj pointer.

Introduce KSN_INVALIDATE_KVOBJ() to explicitly invalidate kvobj after notifications, preventing further access by Redis core. Currently relevant for hash keys without HFE.

Changes:
- Add KSN_INVALIDATE_KVOBJ() to guard unsafe flows
- Apply invalidation beyond hash-specific paths
- Extend KSN side-effect coverage for DELEX and MOVE
- Rearrange flows to avoid kvobj access after notification
- Include additional tests from @JoanFM (#14939)

Behavior: No intended behavior change and no reordering of notifications. |
||
|
|
ef536f48fd
|
GCRA param renaming (#14950)
Renames the `GCRA` command interface to use token terminology: `requests-per-period` becomes `tokens-per-period`, and the optional `NUM_REQUESTS` argument becomes `TOKENS` (with corresponding error messages/documentation updates). |
||
|
|
a6a27f56f2
|
Test tcp deadlock fixes (#14946)
This fix follows #14667 and #14886.

Several tests pipelined large numbers of commands on deferring clients without draining replies. That can fill buffers and stall progress. Fix by draining replies every 500 pipelined requests to avoid TCP stalls.

---------
Co-authored-by: oranagra <oran@redislabs.com> |
||
|
|
5f5ddfd1a1
|
Fix COMMAND GETKEYS for PFMERGE with no source keys (#14942)
PFMERGE's second key spec (source keys) produces an empty range when called with only a dest key (e.g. PFMERGE dest). getKeysUsingKeySpecs treats that as invalid_spec, which discards all previously found keys and returns an error. Add pfmergeGetKeys as a getkeys callback so the command correctly falls back to it when key specs fail on the edge case. |
||
|
|
2ba0194fbe
|
Fix memory leak in ZDIFF algorithm 2 on early exit (#14932)
## Problem
`zdiffAlgorithm2()` can break out early once the destination cardinality reaches zero. In that path, a temporary SDS created by `zuiSdsFromValue()` is left dirty and never released, because the cleanup normally happens on the next `zuiNext()` call, which is skipped due to the early `break`. `zuiClearIterator()`, called after the loop, does **not** clean up dirty values; only `zuiNext()` or an explicit `zuiDiscardDirtyValue()` does.
## Fix
Add `zuiDiscardDirtyValue(&zval)` before the early `break` to ensure the temporary SDS is freed on all exit paths. |
||
|
|
bbc0dcbb9a
|
Fix potential TCP deadlock in Active defrag IDMP streams test (#14886)
The IDMP streams defrag test sends all commands (100k) before reading any replies, which can cause TCP deadlock when buffers fill up. Fix by batching writes and reads (1000 iterations per batch), consistent with the approach already used in the script defrag test above. |
||
|
|
78e21daafe
|
HSETNX notify after accessing KV (#14918)
In the HSETNX command handler, the keyspace notification is sent before the handler has finished using the kvobject held on the stack, unlike in the rest of the HSET* family of command handlers. The problem is that when a module writes to key metadata, this kvobject may be reallocated, so using it after the notification is dangerous and can lead to a crash. The issue surfaced while integrating key metadata support into a module: updating some metadata while handling HSETNX caused a crash. |
||
|
|
f9b9140f10
|
Fix test failures under Valgrind caused by unexpected slowlog fields in cmdstat output (#14914)
The recent PR https://github.com/redis/redis/pull/14896 introduced new slowlog statistics in `INFO` commandstats. This causes `assert_match` to fail in CI (especially under Valgrind) when commands trigger the slowlog, adding extra fields after failed_calls. This PR updates the test patterns to append a trailing `*` to tolerate these optional fields. |
||
|
|
0f5190f103
|
Add global and per-command stats for slowlog metrics (#14896)
# What
Add global and per-command slowlog metrics.

`INFO STATS` now shows:
- slowlog_commands_count - total count of commands written to slowlog (including trimmed ones)
- slowlog_commands_time_ms_sum - sum of execution times of the commands from the slowlog
- slowlog_commands_time_ms_max - maximum execution time of a command from the slowlog (useful for calculating average values)

`INFO COMMANDSTATS` adds the equivalent 3 metrics, but per command. They are only shown for a command if it was added to the slowlog at least once:
- slowlog_count - how many times the command was written in the slowlog
- slowlog_time_ms_sum - sum of execution time of the command (only from the slowlog)
- slowlog_time_ms_max - maximum execution time of the command (only from the slowlog)
# Why
More fine-grained slowlog metrics and easier creation of slowlog-based alerts. |
||
|
|
f4d176b3b7
|
KEYSIZES/ASM: simplify histograms, fix background trim, and refactor debug assertions (#14877)
- Simplify KEYSIZES tracking and save memory by removing per-slot histogram state in kvstoreDictMetadata and routing all updates through kvsUpdateHistogram + updateKeysizesHist(db, type, ...) (per-DB only).
- Fix KEYSIZES consistency during ASM background trim by introducing asmTrimCtx, passing it through emptyDbDataAsync/BIO lazy-free, computing histogram deltas in the background, and applying them on completion; add bg_trim_running to coordinate with validation.
- Refactor and relax debug assertions into a unified dbg_assert_flags bitmask and dbgRunAssertions(db), and skip KEYSIZES/ALLOCSIZE checks during nested execution, RDB load, ASM import, and ASM background trim.
- Update commands, module APIs, tests (including UNLINK async deletion coverage), and daily CI workflow to reflect the new histogram behavior and re-enable ASM/cluster tests.
- Revise the daily CI "debug-assert-keyspace" workflow to run ASM and slot-stats unit tests again.

Potential edge case with histogram accuracy: The histogram could become inaccurate if the database is flushed while ASM trim is running in the background. I considered adding a generation counter to detect this, but decided against it since this is purely an INFO/diagnostic feature and the edge case is quite rare. The target_kvstore pointer check prevents applying stale deltas to the wrong kvstore, and if the histogram does become incorrect we have debugServerAssert() to catch negative values; it won't cause crashes in production. This is a known limitation we can document and revisit if it becomes a real issue in practice. |
||
|
|
9accf8bd24
|
Refine error message for HSETEX command with PERSIST (#14880)
Fixes #14879
> PERSIST can only be given for HGETEX and KEEPTTL for HSETEX but the code doesn’t verify it.

**Behavior currently:**
```
127.0.0.1:5555> HSETEX h ex 100 persist fields 1 k v
(error) ERR Only one of EX, PX, EXAT, PXAT or KEEPTTL arguments can be specified
```
**With this PR, behavior would be:**
```
127.0.0.1:5555> HSETEX h ex 100 persist fields 1 k v
(error) ERR unknown argument: persist
```
|
||
|
|
462e603a1f
|
Fix stream_idmp_keys missing from database lifecycle ops (#14897)
**Summary**
Ensures `db->stream_idmp_keys` is managed consistently with `keys`, `expires`, and `subexpires` across every database lifecycle operation: flush, swap, temp-DB init/discard, lazy-free, and cluster slot migration. Without this fix, the dict was absent from these code paths, causing three classes of bugs: a SIGSEGV during diskless replication in `swapdb` mode (NULL pointer dereference in `initTempDb`), silently lost IDMP tracking after `SWAPDB` and diskless replication (pointers never swapped), and stale IDMP entries surviving `FLUSHDB` (dict never cleared).

**Changes**
- **`emptyDbStructure`** (`src/db.c`): Clear `stream_idmp_keys` on flush so stale entries don't persist across `FLUSHDB`/`FLUSHALL`.
- **`initTempDb` / `discardTempDb`** (`src/db.c`): Create and release `stream_idmp_keys` for temp databases used during diskless replication RDB load. Fixes the SIGSEGV when `rdbLoadRioWithLoadingCtx` calls `dictAddRaw` on a NULL pointer.
- **`dbSwapDatabases`** (`src/db.c`): Swap `stream_idmp_keys` alongside the other per-DB dicts so IDMP tracking follows the data during `SWAPDB`.
- **`swapMainDbWithTempDb`** (`src/db.c`): Swap `stream_idmp_keys` when promoting a temp database after diskless replication, so the cron can discover and expire IDMP entries on replicas.
- **`streamMoveIdmpKeys`** (`src/db.c`): New helper that migrates IDMP entries by slot from one dict to another, used during cluster slot migration.
- **`emptyDbAsync` / `emptyDbDataAsync`** (`src/lazyfree.c`): Replace the dict with a fresh one and hand the old one to the background lazy-free job.
- **`lazyfreeFreeDatabase`** (`src/lazyfree.c`): Free `stream_idmp_keys` in the background job (now takes 4 args instead of 3).
- **`asmTriggerBackgroundTrim`** (`src/cluster_asm.c`): Move matching IDMP entries by slot into a temporary dict during background trim, then pass it to `emptyDbDataAsync` for async cleanup.
- **`server.h`**: Updated `emptyDbDataAsync` signature; added `streamMoveIdmpKeys` declaration.
- **Tests** (`tests/unit/type/stream.tcl`): Four new integration tests covering IDMP expiry after `SAVE` + restart, tracking survival across `SWAPDB`, cleanup on `FLUSHDB`, and diskless replication in `swapdb` mode (both `rdbchannel=yes` and `rdbchannel=no`).

**What this fixes**
- **No crash on diskless replication**: `initTempDb` now initializes the dict, eliminating the NULL dereference during RDB load.
- **Tracking preserved across swaps**: Both `SWAPDB` and diskless replication correctly transfer the dict, so the cron keeps expiring entries in the right database.
- **Clean state after flush**: `FLUSHDB`/`FLUSHALL` clear the dict, preventing ghost entries from interfering with new streams.
- **Correct cluster migration cleanup**: IDMP entries for migrated slots are moved and freed alongside the key data. |
||
|
|
1b615c774d
|
Fix FIELDS argument validation in HSETEX/HGETEX (#14883)
Fixes #14879

Improve validation of the FIELDS argument in HSETEX and HGETEX to ensure exactly one field is provided, rejecting both missing and multiple fields with consistent and accurate error messages. Align behavior across both commands. |
||
|
|
c4d74587b5
|
Fix ACL OOB for wrong-arity KEYNUM commands (#14847)
`luaRedisAclCheckCmdPermissionsCommand` and `RM_ACLCheckCommandPermissions` now call `commandCheckArity()` to check command arity before calling `ACLCheckAllUserCommandPerm`, matching the behavior of `processCommand`, `scriptCall`, and `RM_Call`. Without this, KEYNUM keyspec commands like `EVAL` with wrong arity cause out-of-bounds `argv` access during key extraction.

Also fix KEYNUM index calculation (`first + keynumidx`) and add a bounds check in `genericGetKeys()`.

Add scripting and module ACL tests for wrong-arity `EVAL` to lock in the non-crashing behavior.

Fixes #14843 |
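
The out-of-bounds access described in this commit can be illustrated with a small model of KEYNUM-style key extraction. The sketch below is not the code in `genericGetKeys()`; the function name `keynum_key_indexes` and its parameters are hypothetical, and it only assumes the general KEYNUM layout in which a key count is read at `first + keynumidx` and the keys start at `first + firstkey`. It shows why both the count position and the last key position must be checked against the argument count.

```python
from typing import List, Optional, Sequence


def keynum_key_indexes(
    argv: Sequence[str],
    first: int,       # index where the key spec's search begins (hypothetical name)
    keynumidx: int,   # offset of the "number of keys" argument, relative to first
    firstkey: int,    # offset of the first key, relative to first
    step: int = 1,    # distance between consecutive keys
) -> Optional[List[int]]:
    """Return the argv indexes of the keys, or None if argv is too short or malformed."""
    argc = len(argv)

    # Bounds check 1: the key-count argument itself must exist.
    num_pos = first + keynumidx
    if num_pos >= argc:
        return None

    try:
        numkeys = int(argv[num_pos])
    except ValueError:
        return None
    if numkeys < 0:
        return None
    if numkeys == 0:
        return []

    # Bounds check 2: the last declared key must still be inside argv.
    start = first + firstkey
    last = start + (numkeys - 1) * step
    if last >= argc:
        return None

    return list(range(start, last + 1, step))
```

For an `EVAL`-shaped command (`EVAL script numkeys key ...`), the call would use `first=2`, `keynumidx=0`, `firstkey=1`. A two-argument `EVAL script` then returns `None` instead of reading past the end of the argument list.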
||
|
|
31a4356ac0
|
GCRA Rate Limiter (#14826)
# What

Implement rate limiting functionality via the GCRA algorithm. Introduce a new command `GCRA` to facilitate it. The implementation is heavily based on the popular [redis-cell](https://github.com/brandur/redis-cell) module (by [brandur](https://github.com/brandur)) with small changes in the API.

# Why

Rate limiting is a very common use case of Redis, and GCRA is one of the most popular algorithms used because of its simplicity and speed. Currently rate limiting with GCRA is possible via Lua scripts or even client libraries via the relatively recent `SET IFEQ`/`DIGEST` commands ([redis-py example](https://gist.github.com/minchopaskal/b7acd4550f7144b88e2d0f86568a0d7b)). Implementing it directly inside Redis gives us even faster performance.

# API

```
GCRA key max_burst requests_per_period period [NUM_REQUESTS count]
```

## Description

Rate limit via GCRA. `requests_per_period` requests are allowed per `period` at a sustained rate. Thus we have a minimum spacing (emission interval) of `period`/`requests_per_period` seconds between each request. `max_burst` allows for occasional spikes by granting up to `max_burst` additional requests to be consumed at once. See more in the [GCRA wiki](https://en.wikipedia.org/wiki/Generic_cell_rate_algorithm).

## Options

**KEY** - key related to a specific rate limiting case
**MAX_BURST** - maximum number of tokens allowed as a burst (in addition to the sustained rate). Min: 0
**REQUESTS_PER_PERIOD** - number of requests allowed per PERIOD. Min: 1
**PERIOD** - period in seconds as a floating point number, used for calculating the sustained rate. Min: 1.0
**NUM_REQUESTS** - cost (or weight) of this rate-limiting request. A higher cost drains the allowance faster. Default: 1

### Note

In the redis-cell module, and in most other modules based on it, PERIOD is given in seconds as an integer. We decided to use floating point for greater flexibility. Internally, time periods are calculated with microsecond granularity.

## Reply

The reply is identical to the reply of redis-cell:

```
127.0.0.1:6379> GCRA <key> <max_burst> <requests_per_period> <period> NUM_REQUESTS <count>
1) <limited> # 0 or 1
2) <max-req-num> # max number of requests. Always equal to max_burst+1
3) <num-avail-req> # number of requests available immediately
4) <reply-after> # number of seconds after which caller should retry. Always returns -1 if request isn't limited.
5) <full-burst-after> # number of seconds after which a full burst will be allowed
```

---------

Co-authored-by: debing.sun <debing.sun@redis.com> |
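
To make the description above concrete, here is a minimal in-memory sketch of the textbook GCRA calculation, following the general shape used by redis-cell. It is not the Redis implementation: the class name `GcraLimiter`, the `check` method, and the explicit `now` argument are illustrative assumptions, and plain floats are used in place of the microsecond arithmetic the commit mentions. The returned tuple mirrors the five reply fields listed in the commit message.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class GcraLimiter:
    """Hypothetical in-memory GCRA limiter; one theoretical arrival time (TAT) per key."""
    max_burst: int
    requests_per_period: int
    period: float  # seconds
    _tat: Dict[str, float] = field(default_factory=dict, init=False)

    def check(self, key: str, now: float, num_requests: int = 1) -> Tuple[int, int, int, float, float]:
        # Minimum spacing between requests at the sustained rate.
        emission_interval = self.period / self.requests_per_period
        limit = self.max_burst + 1
        # How far the TAT may run ahead of "now" before requests are refused.
        tolerance = emission_interval * limit
        increment = emission_interval * num_requests

        tat = max(self._tat.get(key, now), now)
        new_tat = tat + increment
        allow_at = new_tat - tolerance

        if now < allow_at:
            # Limited: state is left untouched.
            limited = 1
            # A request costing more than the whole burst can never succeed.
            retry_after = allow_at - now if increment <= tolerance else -1.0
            ttl = tat - now
        else:
            limited = 0
            retry_after = -1.0
            self._tat[key] = new_tat
            ttl = new_tat - now

        available = max(0, int((tolerance - ttl) / emission_interval))
        # (limited, max-req-num, num-avail-req, reply-after, full-burst-after)
        return limited, limit, available, retry_after, ttl
```

With `max_burst=2`, `requests_per_period=1`, and `period=1.0`, three back-to-back calls at the same instant are allowed and the fourth is limited with a retry hint of one second. Passing `now` explicitly keeps the sketch deterministic; a real caller would read a monotonic clock instead.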