redis/tests/unit/replybufsize.tcl
debing.sun 05859cdd7e
Fix client output buffer memory tracking not accounting for copy-avoided bulk string references (#14934)
## Problem

After #14608 (Reply Copy Avoidance), when copy avoidance kicks in, bulk
string replies are sent by reference instead of being copied into the
output buffer.
The referenced bytes are not counted in `reply_bytes`, which causes:

1. `getClientOutputBufferMemoryUsage()` underestimates the actual memory
usage, so output buffer limits may not be triggered in time, allowing
clients to consume unbounded memory.
2. Client eviction does not account for the referenced bytes, making it
ineffective when copy avoidance is used.
3. `omem` reported in `CLIENT LIST` / `CLIENT INFO` does not reflect the
true output buffer memory footprint.

## Solution

Track the bytes of referenced bulk strings in the output buffer with two
per-client counters:

1. `reply_bytes_shared`: the logical size of all BULK_STR_REF payloads in
   the output buffer. Updated incrementally whenever a reference is
   added or removed. Represents memory the client is "charged" for even
   though it is shared with the keyspace.

2. `reply_bytes_unshared`: the subset of the above where the referenced
   object's refcount == 1 (i.e. the key has been deleted from the
   keyspace), so the memory is kept alive solely by this client's output
   buffer and would actually be freed on disconnect. Maintained as a
   lazy cache refreshed via `updateClientUnsharedReplyBytes()`.
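
The accounting described above can be illustrated with a small toy model. This is only a sketch and not the Redis implementation: the `toy::` namespace, its procs, and the `{size refcount}` pair representation are all hypothetical names invented for this example. It assumes the shared counter is adjusted on every add/remove, while the unshared figure is recomputed on demand by scanning for references whose refcount is 1.

```tcl
# Toy model only (hypothetical names, not Redis code).
# Each queued reference is a {size refcount} pair.
namespace eval toy {
    variable refs {}          ;# references queued in the client's output buffer
    variable shared_bytes 0   ;# models reply_bytes_shared
}

# Queue a reference and charge its logical size to the client immediately.
proc toy::add_ref {size refcount} {
    variable refs
    variable shared_bytes
    lappend refs [list $size $refcount]
    incr shared_bytes $size
}

# Drop the oldest reference (e.g. after it was written to the socket).
proc toy::remove_first_ref {} {
    variable refs
    variable shared_bytes
    if {[llength $refs] == 0} { return }
    lassign [lindex $refs 0] size refcount
    set refs [lrange $refs 1 end]
    incr shared_bytes [expr {-$size}]
}

# Models the lazy refresh: sum only references that this client alone keeps alive.
proc toy::unshared_bytes {} {
    variable refs
    set total 0
    foreach ref $refs {
        lassign $ref size refcount
        if {$refcount == 1} { incr total $size }
    }
    return $total
}

toy::add_ref 20000 2   ;# key still present in the keyspace
toy::add_ref 30000 1   ;# key deleted; only the output buffer holds the value
puts "shared=$toy::shared_bytes unshared=[toy::unshared_bytes]"
```

In this model the unshared total is always a subset of the shared total, which matches the description of `reply_bytes_unshared` as "the subset of the above".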

## Info fields

CLIENT LIST / CLIENT INFO — two new fields, plus refined semantics for
existing ones:

Field | Meaning
-- | --
omem | (semantics changed) logical output-buffer memory, now including shared memory referenced from the keyspace. Still excludes client->buf so static clients show 0.
omem-shared | (new) shared output-buffer memory (referenced bulk strings, not solely owned by this client).
omem-unshared | (new) unshared output-buffer memory (referenced bulk strings solely owned by this client; freed on disconnect).
tot-mem | (semantics refined) actual memory usage: includes omem-unshared, excludes omem-shared to avoid double-counting keyspace memory.
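
As a rough sketch of how a test or script might read the new fields, the snippet below splits one `CLIENT LIST` / `CLIENT INFO` line into a dict. The proc name `parse_client_line` is hypothetical, and the sample line is invented for illustration: its numbers are arbitrary and are not meant to imply any relationship between the fields.

```tcl
# Parse one "key=value key=value ..." client line into a dict.
# Hypothetical helper; not part of the Redis test suite.
proc parse_client_line {line} {
    set fields [dict create]
    foreach pair [split [string trim $line] " "] {
        set eq [string first "=" $pair]
        if {$eq < 0} { continue }   ;# skip empty or malformed tokens
        dict set fields \
            [string range $pair 0 [expr {$eq - 1}]] \
            [string range $pair [expr {$eq + 1}] end]
    }
    return $fields
}

# Invented sample line; real output contains many more fields.
set line "id=7 name=test_client omem=50000 omem-shared=20000 omem-unshared=30000 tot-mem=52000"
set f [parse_client_line $line]
puts "omem-shared=[dict get $f omem-shared] omem-unshared=[dict get $f omem-unshared]"
```

Splitting on the first `=` keeps values that themselves contain `=` intact, which a plain `split $pair =` would not.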

INFO memory — two new fields mirroring the above:

Field | Meaning
-- | --
mem_clients_normal | (semantics changed) actual memory usage of normal clients (includes unshared, excludes shared).
mem_clients_normal_shared | (new) aggregate shared output-buffer memory across normal clients.
mem_clients_normal_unshared | (new) aggregate unshared output-buffer memory across normal clients.
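
Because the three field names above share a common prefix, a lookup in `INFO memory` output needs an exact key match rather than a substring match. The following is a minimal sketch of such a lookup; `info_field` is a hypothetical helper and the sample text uses made-up values.

```tcl
# Return the value of an exact "name:value" line from INFO-style text.
# Hypothetical helper written for this example.
proc info_field {text name} {
    foreach line [split $text "\n"] {
        set line [string trim $line]          ;# also strips the trailing \r
        if {[string match "#*" $line]} { continue }   ;# section header
        set colon [string first ":" $line]
        if {$colon < 0} { continue }
        if {[string range $line 0 [expr {$colon - 1}]] eq $name} {
            return [string range $line [expr {$colon + 1}] end]
        }
    }
    error "field $name not found"
}

# Invented sample; the values are arbitrary.
set sample "# Memory\r\nmem_clients_normal_shared:20000\r\nmem_clients_normal:52000\r\nmem_clients_normal_unshared:30000\r\n"
puts [info_field $sample mem_clients_normal]
```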

MEMORY STATS — schema extended with the matching keys:

Field | Meaning
-- | --
clients.normal.shared | (new) aggregate shared output-buffer memory across normal clients.
clients.normal.unshared | (new) aggregate unshared output-buffer memory across normal clients.

## Bug Fix

Add the missing `closeClientOnOutputBufferLimitReached()` call when adding a
referenced robj to the reply.

---------

Co-authored-by: oranagra <oran@redislabs.com>
2026-05-06 09:46:17 +08:00

# Return the "rbs" (reply buffer size) field of the client named $cname,
# as reported by CLIENT LIST.
proc get_reply_buffer_size {cname} {
    set clients [split [string trim [r client list]] "\r\n"]
    set c [lsearch -inline $clients *name=$cname*]
    if {![regexp rbs=(\[a-zA-Z0-9-\]+) $c - rbufsize]} {
        error "field rbs not found in $c"
    }
    return $rbufsize
}

start_server {tags {"replybufsize"}} {

    test {verify reply buffer limits} {
        # In order to reduce test time we can set the peak reset time very low
        r debug replybuffer peak-reset-time 100

        # Create a simple idle test client
        variable tc [redis_client]
        $tc client setname test_client

        # Make sure the client is idle for 1 second so that it shrinks the reply buffer
        wait_for_condition 10 100 {
            [get_reply_buffer_size test_client] >= 1024 && [get_reply_buffer_size test_client] < 2046
        } else {
            set rbs [get_reply_buffer_size test_client]
            fail "reply buffer of idle client is $rbs after 1 second"
        }

        # Keep value <= 16KB to avoid copy-avoidance, which shares memory and slows tot-mem growth.
        r set bigval [string repeat x 8192]

        # Disable the peak reset so the reply buffer is not shrunk while the client is busy
        r debug replybuffer peak-reset-time never

        wait_for_condition 10 100 {
            [$tc mget bigval bigval bigval bigval ; get_reply_buffer_size test_client] >= 16384 && [get_reply_buffer_size test_client] < 32768
        } else {
            set rbs [get_reply_buffer_size test_client]
            fail "reply buffer of busy client is $rbs after 1 second"
        }

        # Restore the peak reset time to default
        r debug replybuffer peak-reset-time reset

        $tc close
    } {0} {needs:debug}
}
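
The field extraction used by `get_reply_buffer_size` can be tried without a running server. The sketch below applies the same pattern to a literal line; `rbs_from_line` is a hypothetical stand-in for the helper above, and the sample line is invented (real `CLIENT LIST` output has many more fields).

```tcl
# Extract the rbs field from a single client line.
# Hypothetical offline variant of get_reply_buffer_size.
proc rbs_from_line {line} {
    # Bracing the pattern avoids having to escape the square brackets.
    if {![regexp {rbs=([a-zA-Z0-9-]+)} $line -> rbufsize]} {
        error "field rbs not found in $line"
    }
    return $rbufsize
}

# Invented sample line for illustration.
set line "id=3 name=test_client rbs=1024 rbp=0 omem=0"
set rbs [rbs_from_line $line]

# Same bounds the test uses for an idle client.
puts [expr {$rbs >= 1024 && $rbs < 2046}]
```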