Skip memory prefetch during loading to avoid crash in dictEmpty callback (#14848)

Fixes #14838

## Summary

Fix a crash in `prefetchCommands()` that occurs during replica full sync
when the replica has existing data that needs to be emptied.

## Problem Description

During `emptyData()` → `kvstoreEmpty()` → `dictEmpty()` →
`_dictClear()`, the first hash table is cleared and `d->ht_table[0]` is
set to NULL via `_dictReset`. While the second hash table is being
cleared, `_dictClear()` invokes the callback once every 65536 buckets:
`replicationEmptyDbCallback()` → `processEventsWhileBlocked()` →
`readQueryFromClient()` → `prefetchCommands()`.
At this point, `dictSize() > 0` still holds (because the second hash
table isn't fully cleared yet), but `ht_table[0]` is already NULL. The
prefetch code assumed `ht_table[0]` is always valid whenever
`dictSize() > 0`, leading to a crash.
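
The inconsistent state described above can be illustrated with a toy model. This is only a sketch: `toy_dict` and every `toy_*` function are hypothetical stand-ins invented for illustration, not the real Redis `dict` API, and it assumes the dict holds entries in both tables when clearing starts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for a two-table dict: one bucket
 * array and one used-entry counter per table. Not the real Redis struct. */
typedef struct toy_dict {
    void **ht_table[2];
    unsigned long ht_used[2];
} toy_dict;

/* Same idea as dictSize(): entries counted across both tables. */
static unsigned long toy_dict_size(const toy_dict *d) {
    return d->ht_used[0] + d->ht_used[1];
}

/* Same idea as _dictReset(): release one table and NULL its pointer. */
static void toy_dict_reset(toy_dict *d, int idx) {
    free(d->ht_table[idx]);
    d->ht_table[idx] = NULL;
    d->ht_used[idx] = 0;
}

/* Build a dict with entries in both tables (assumed starting state).
 * The counters are set by hand; no real entries are stored. */
static toy_dict toy_dict_with_two_tables(void) {
    toy_dict d;
    d.ht_table[0] = calloc(4, sizeof(void *));
    d.ht_table[1] = calloc(8, sizeof(void *));
    d.ht_used[0] = 3;
    d.ht_used[1] = 5;
    return d;
}

/* True in the window where the size is nonzero but table 0 is gone,
 * which is the state the prefetch code did not expect. */
static int toy_unsafe_to_prefetch(const toy_dict *d) {
    return toy_dict_size(d) > 0 && d->ht_table[0] == NULL;
}
```

Resetting table 0 and then checking `toy_unsafe_to_prefetch()` before table 1 is reset reproduces the window in which the callback runs.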

## Solution
1. **Skip prefetch during loading**: Added a `server.loading` check at
the top of `prefetchCommands()` to return early. During RDB loading, the
main dictionary is being rebuilt, so prefetching keys from it is useless
anyway.
2. **Add defensive assertion**: Added
`serverAssert(batch->current_dicts[i]->ht_table[0])` in
`initBatchInfo()` to catch any future cases where `ht_table[0]` is NULL
while `dictSize() > 0` (which should only happen mid-`dictEmpty` via
`_dictReset`).
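
The early-return guard from item 1 can be sketched in isolation as follows. The `toy_server` and `toy_batch` types and the `toy_prefetch_commands()` function are hypothetical simplifications made up for this example; only the shape of the condition (`!batch || server.loading`) comes from the actual change.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the server state and the prefetch batch. */
typedef struct toy_server {
    int loading;          /* nonzero while a dataset is being loaded */
} toy_server;

typedef struct toy_batch {
    size_t pcmd_count;    /* pending commands queued for prefetch */
    size_t prefetched;    /* commands this sketch pretends to prefetch */
} toy_batch;

/* Returns 1 if prefetching ran, 0 if it was skipped.
 * The guard bails out before any dictionary is touched. */
static int toy_prefetch_commands(const toy_server *srv, toy_batch *batch) {
    if (batch == NULL || srv->loading) return 0;
    batch->prefetched = batch->pcmd_count;
    return 1;
}
```

Because the check sits at the top of the function, a call made from inside the emptying callback returns before any hash table pointer is read.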

---------

Co-authored-by: kairosci <kairosci@users.noreply.github.com>
Co-authored-by: debing.sun <debing.sun@redis.com>
Co-authored-by: Yuan Wang <yuan.wang@redis.com>
Alessio Attilio 2026-03-24 03:22:58 +01:00 committed by GitHub
parent 63d841c3ae
commit 1abd489d07
2 changed files with 46 additions and 1 deletions


@ -165,6 +165,11 @@ static void initBatchInfo(dict **dicts, GetValueDataFunc func) {
info->state = PREFETCH_DONE;
continue;
}
/* We skip prefetch during loading, so ht_table[0] should never be NULL
* when dictSize() > 0 (which only happens mid-dictEmpty via _dictReset). */
serverAssert(batch->current_dicts[i]->ht_table[0]);
info->ht_idx = HT_IDX_INVALID;
info->current_entry = NULL;
info->current_kv = NULL;
@ -334,7 +339,7 @@ int determinePrefetchCount(int len) {
* 3. Prefetch the keys and values for all commands in the current batch from
* the main dictionaries. */
void prefetchCommands(void) {
- if (!batch) return;
+ if (!batch || server.loading) return;
/* Prefetch argv's for all pending commands */
for (size_t i = 0; i < batch->pcmd_count; i++) {


@ -1832,3 +1832,43 @@ start_server {tags {"repl external:skip"}} {
}
}
}
start_server {tags {"repl external:skip"}} {
set master [srv 0 client]
set master_host [srv 0 host]
set master_port [srv 0 port]
start_server {overrides {io-threads 2}} {
set slave [srv 0 client]
test {prefetchCommands handles NULL argv and keys during RDB replication with IO threads} {
# Enable diskless sync to trigger RDB streaming during replication
$master config set repl-diskless-sync yes
$master config set repl-diskless-sync-delay 0
# Populate keys in the format key:$i with 128-byte values.
$slave debug populate 700000 key 128
# Force a full resync by resetting the slave.
set rd [redis_deferring_client 0]
$rd slaveof $master_host $master_port
# Create a large pipeline command.
set batch_size 1000
set buf ""
for {set i 0} {$i < $batch_size} {incr i} {
append buf [format_command get key:1]
}
# Continuously send pipelined commands so that the replica processes
# and prefetches them while it is emptying old data during full sync.
set start_time [clock milliseconds]
while {[clock milliseconds] - $start_time < 5000} {
$rd write $buf
$rd flush
if {[s 0 master_link_status] eq "up"} break
}
$rd close
}
}
}