Remove recovery.signal at recovery end when both signal files are present.

When both standby.signal and recovery.signal are present, standby.signal
takes precedence and the server runs in standby mode. Previously,
in this case, recovery.signal was not removed at the end of standby mode
(i.e., on promotion) or at the end of archive recovery, while standby.signal
was removed. As a result, a leftover recovery.signal could cause
a subsequent restart to enter archive recovery unexpectedly, potentially
preventing the server from starting (for example, if restore_command is
not set). This behavior was surprising and confusing to users.

This commit fixes the issue by updating the recovery code to remove
recovery.signal alongside standby.signal when both files are present and
recovery completes.
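
As a rough illustration of the change (a simplified sketch, not the actual
server code), the C fragment below models the two signal files as booleans.
The names SignalFiles, RecoveryState, read_signal_files, finish_recovery and
the independent_checks switch are hypothetical. The sketch also assumes that
end-of-recovery cleanup removes only the files that were flagged as found,
which is why checking recovery.signal independently is enough to get it
removed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical in-memory stand-in for the signal files in the data directory. */
typedef struct SignalFiles
{
    bool standby_signal;    /* standby.signal exists */
    bool recovery_signal;   /* recovery.signal exists */
} SignalFiles;

/* Hypothetical summary of what the startup code recorded and requested. */
typedef struct RecoveryState
{
    bool standby_signal_file_found;
    bool recovery_signal_file_found;
    bool StandbyModeRequested;
    bool ArchiveRecoveryRequested;
} RecoveryState;

/*
 * independent_checks = true  models the new "if" (each file is checked).
 * independent_checks = false models the old "else if" (recovery.signal is
 * ignored once standby.signal has been found).
 */
static RecoveryState
read_signal_files(const SignalFiles *dir, bool independent_checks)
{
    RecoveryState st = {false, false, false, false};

    assert(dir != NULL);

    if (dir->standby_signal)
        st.standby_signal_file_found = true;

    if (dir->recovery_signal &&
        (independent_checks || !st.standby_signal_file_found))
        st.recovery_signal_file_found = true;

    /* If both files are present, standby.signal takes precedence. */
    if (st.standby_signal_file_found)
    {
        st.StandbyModeRequested = true;
        st.ArchiveRecoveryRequested = true;
    }
    else if (st.recovery_signal_file_found)
        st.ArchiveRecoveryRequested = true;

    return st;
}

/* Assumption: at recovery end, only files recorded as found are removed. */
static void
finish_recovery(SignalFiles *dir, const RecoveryState *st)
{
    assert(dir != NULL && st != NULL);

    if (st->standby_signal_file_found)
        dir->standby_signal = false;
    if (st->recovery_signal_file_found)
        dir->recovery_signal = false;
}
```

With both files present, read_signal_files(&dir, true) followed by
finish_recovery() leaves neither file behind, whereas passing false
reproduces the old behavior and leaves recovery.signal in place. In both
cases standby mode is what gets requested.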

Because this code path is particularly sensitive and changes in recovery
behavior can be risky for stable branches, this change is applied only to
the master branch.

Reported-by: Nikolay Samokhvalov <nik@postgres.ai>
Author: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: David Steele <david@pgbackrest.org>
Discussion: https://postgr.es/m/CAM527d8PVAQFLt_ndTXE19F-XpDZui861882L0rLY3YihQB8qA@mail.gmail.com
Fujii Masao 2026-02-16 13:57:38 +09:00
parent 459576303d
commit 351265a6c7
2 changed files with 23 additions and 5 deletions

@@ -1068,9 +1068,6 @@ readRecoverySignalFile(void)
* Check for recovery signal files and if found, fsync them since they
* represent server state information. We don't sweat too much about the
* possibility of fsync failure, however.
- *
- * If present, standby signal file takes precedence. If neither is present
- * then we won't enter archive recovery.
*/
if (stat(STANDBY_SIGNAL_FILE, &stat_buf) == 0)
{
@@ -1085,7 +1082,8 @@ readRecoverySignalFile(void)
}
standby_signal_file_found = true;
}
-else if (stat(RECOVERY_SIGNAL_FILE, &stat_buf) == 0)
+if (stat(RECOVERY_SIGNAL_FILE, &stat_buf) == 0)
{
int fd;
@@ -1099,6 +1097,10 @@ readRecoverySignalFile(void)
recovery_signal_file_found = true;
}
+/*
+ * If both signal files are present, standby signal file takes precedence.
+ * If neither is present then we won't enter archive recovery.
+ */
StandbyModeRequested = false;
ArchiveRecoveryRequested = false;
if (standby_signal_file_found)

@@ -115,6 +115,17 @@ $node_standby2->append_conf(
recovery_end_command = 'echo recovery_end_failed > missing_dir/xyz.file'
));
+# Create recovery.signal and confirm that both signal files exist.
+# This is necessary to test how recovery behaves when both files are present,
+# i.e., standby.signal should take precedence and both files should be
+# removed at the end of recovery.
+$node_standby2->set_recovery_mode();
+my $node_standby2_data = $node_standby2->data_dir;
+ok(-f "$node_standby2_data/recovery.signal",
+"recovery.signal is present at the beginning of recovery");
+ok(-f "$node_standby2_data/standby.signal",
+"standby.signal is present at the beginning of recovery");
$node_standby2->start;
# Save the log location, to see the failure of recovery_end_command.
@@ -126,7 +137,6 @@ $node_standby2->promote;
# Check the logs of the standby to see that the commands have failed.
my $log_contents = slurp_file($node_standby2->logfile, $log_location);
-my $node_standby2_data = $node_standby2->data_dir;
like(
$log_contents,
@@ -141,4 +151,10 @@ like(
qr/WARNING:.*recovery_end_command/s,
"recovery_end_command failure detected in logs after promotion");
+# Check that no signal files are present after promotion.
+ok( !-f "$node_standby2_data/recovery.signal",
+"recovery.signal was removed after promotion");
+ok( !-f "$node_standby2_data/standby.signal",
+"standby.signal was removed after promotion");
done_testing();