Fix self-deadlock when replaying WAL generated by an older minor version

Commit 77dff5d937 introduced a SimpleLruWriteAll() call when replaying
multixact WAL records generated by older minor versions. However,
SimpleLruWriteAll() acquires the SLRU lock itself, and on v16 and below
it is called while RecordNewMultiXact() already holds that lock. LWLocks
are not reentrant, so the second acquisition self-deadlocks. Versions 17
and 18 do not have the problem, because in those versions the lock is
acquired later in the function.

To fix, acquire MultiXactOffsetSLRULock later in RecordNewMultiXact(),
at the same place where it is acquired on versions 17 and 18.
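
The lock ordering described above can be sketched with a toy,
single-threaded model. This is only an illustration, not PostgreSQL code:
ToyLock, toy_acquire(), toy_release(), toy_write_all(), replay_lock_early()
and replay_lock_late() are hypothetical names, and the toy lock reports a
failed acquisition where a real LWLock would block forever.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a non-reentrant lock; not PostgreSQL's LWLock. */
typedef struct ToyLock
{
    bool held;
} ToyLock;

/*
 * Returns false if the lock is already held. A real non-reentrant lock
 * would block forever here when the holder is the caller itself.
 */
static bool
toy_acquire(ToyLock *lock)
{
    if (lock->held)
        return false;
    lock->held = true;
    return true;
}

static void
toy_release(ToyLock *lock)
{
    assert(lock->held);
    lock->held = false;
}

/* Hypothetical helper that takes the lock itself, like SimpleLruWriteAll(). */
static bool
toy_write_all(ToyLock *lock)
{
    if (!toy_acquire(lock))
        return false;
    /* ... flush dirty pages here ... */
    toy_release(lock);
    return true;
}

/* Old ordering: lock taken at the top, helper called while it is held. */
bool
replay_lock_early(void)
{
    ToyLock lock = {false};
    bool ok;

    toy_acquire(&lock);
    ok = toy_write_all(&lock);  /* fails: the lock is already held */
    toy_release(&lock);
    return ok;
}

/* New ordering: helper runs first, lock taken only where it is needed. */
bool
replay_lock_late(void)
{
    ToyLock lock = {false};
    bool ok;

    ok = toy_write_all(&lock);
    if (!toy_acquire(&lock))
        return false;
    /* ... update the offsets page here ... */
    toy_release(&lock);
    return ok;
}
```

In this model replay_lock_early() returns false, which corresponds to the
self-deadlock, while replay_lock_late() returns true because the helper
finishes and releases the lock before the caller takes it.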

Author: Andrey Borodin <x4mmm@yandex-team.ru>
Reported-by: Radim Marek <radim@boringsql.com>
Discussion: https://www.postgresql.org/message-id/19490-9c59c6a583513b99@postgresql.org
Backpatch-through: 14-16
Heikki Linnakangas 2026-05-27 11:49:50 +03:00
parent e786fb5aa7
commit 2dfe75f984

@@ -887,8 +887,6 @@ RecordNewMultiXact(MultiXactId multi, MultiXactOffset offset,
MultiXactOffset *next_offptr;
MultiXactOffset next_offset;
-LWLockAcquire(MultiXactOffsetSLRULock, LW_EXCLUSIVE);
/* position of this multixid in the offsets SLRU area */
pageno = MultiXactIdToOffsetPage(multi);
entryno = MultiXactIdToOffsetEntry(multi);
@@ -950,6 +948,8 @@ RecordNewMultiXact(MultiXactId multi, MultiXactOffset offset,
{
elog(DEBUG1, "next offsets page is not initialized, initializing it now");
+LWLockAcquire(MultiXactOffsetSLRULock, LW_EXCLUSIVE);
/* Create and zero the page */
slotno = SimpleLruZeroPage(MultiXactOffsetCtl, next_pageno);
@@ -957,6 +957,8 @@ RecordNewMultiXact(MultiXactId multi, MultiXactOffset offset,
SimpleLruWritePage(MultiXactOffsetCtl, slotno);
Assert(!MultiXactOffsetCtl->shared->page_dirty[slotno]);
+LWLockRelease(MultiXactOffsetSLRULock);
/*
* Remember that we initialized the page, so that we don't zero it
* again at the XLOG_MULTIXACT_ZERO_OFF_PAGE record.
@@ -975,6 +977,7 @@ RecordNewMultiXact(MultiXactId multi, MultiXactOffset offset,
* concurrently, we might race ahead and get called before the previous
* multixid.
*/
+LWLockAcquire(MultiXactOffsetSLRULock, LW_EXCLUSIVE);
/*
* Note: we pass the MultiXactId to SimpleLruReadPage as the "transaction"