postgresql/src
Michael Paquier 1346794feb Fix rare instability in recovery TAP test 004_timeline_switch
This fixes a problem similar to ad8c86d22c.  In this case, the test
could fail under the following circumstances:
- The primary is stopped with teardown_node(), meaning that it may not
be able to send all its WAL records to standby_1 and standby_2.
- If standby_2 receives more records than standby_1, attempting to
reconnect standby_2 to the promoted standby_1 would fail because of a
timeline fork.

This race condition is fixed with a simple trick: instead of tearing
down the primary, it is stopped cleanly so as all the WAL records of the
primary are received and flushed by both standby_1 and standby_2.  Once
we do that, there is no need for a wait_for_catchup() before stopping
the node.  The test wants to check that a timeline jump can be achieved
when reconnecting a standby to a promoted standby in the same cluster,
hence an immediate stop of the primary is not required.

This failure is harder to reach than the previous instability of
009_twophase, still the buildfarm has been able to detect this failure
at least once.  I have tried Alexander Lakhin's test trick with the
bgwriter and very aggressive standby snapshots, but I could not
reproduce it directly.  It is reachable, as the buildfarm has proved.

Backpatch down to all supported branches, and this problem can lead to
spurious failures in the buildfarm.

Discussion: https://postgr.es/m/493401a8-063f-436a-8287-a235d9e065fc@gmail.com
Backpatch-through: 14
2026-03-05 10:06:08 +09:00
..
backend Don't malloc(0) in EventTriggerCollectAlterTSConfig 2026-03-04 15:04:53 +01:00
bin Fix some cases of indirectly casting away const. 2026-02-25 11:19:50 -05:00
common Fix mb2wchar functions on short input. 2026-02-09 12:38:12 +13:00
fe_utils In fmtIdEnc(), handle failure of enlargePQExpBuffer(). 2025-02-16 12:46:35 -05:00
include Allow PG_PRINTF_ATTRIBUTE to be different in C and C++ code. 2026-02-25 11:57:26 -05:00
interfaces Fix some cases of indirectly casting away const. 2026-02-25 11:19:50 -05:00
makefiles Add NO_INSTALL option to pgxs 2021-05-27 13:58:29 +02:00
pl EUC_CN, EUC_JP, EUC_KR, EUC_TW: Skip U+00A0 tests instead of failing. 2026-02-25 18:13:45 -08:00
port Fix some cases of indirectly casting away const. 2026-02-25 11:19:50 -05:00
template On NetBSD, force dynamic symbol resolution at postmaster start. 2022-08-30 17:29:03 -04:00
test Fix rare instability in recovery TAP test 004_timeline_switch 2026-03-05 10:06:08 +09:00
timezone Fix some cases of indirectly casting away const. 2026-02-25 11:19:50 -05:00
tools Fix Solution.pm for change in pg_config.h contents. 2026-02-26 12:26:52 -05:00
tutorial Doc: sync src/tutorial/basics.source with SGML documentation. 2022-11-19 13:09:14 -05:00
.gitignore
DEVELOPERS
Makefile Remove the option to build thread_test.c outside configure. 2020-10-21 12:08:48 -04:00
Makefile.global.in Don't put library-supplied -L/-I switches before user-supplied ones. 2025-07-29 15:17:41 -04:00
Makefile.shlib Stop using "-multiply_defined suppress" on macOS. 2023-09-26 21:06:21 -04:00
nls-global.mk Fix update-po for the PGXS case 2025-10-16 20:21:05 +02:00