postgresql

mirror of https://github.com/postgres/postgres.git synced 2026-05-28 04:35:45 -04:00

Author	SHA1	Message	Date
Masahiko Sawada	d79bf7612a	Fix race between ProcSignalInit() and EmitProcSignalBarrier(). Previously, ProcSignalInit() read the global barrier generation before publishing its PID into pss_pid. This created a race condition: a process could initialize its local generation with an older global value, while a concurrent EmitProcSignalBarrier() might skip that process because its pss_pid was still zero. This resulted in WaitForProcSignalBarrier() hanging indefinitely. Fix this by publishing pss_pid before reading psh_barrierGeneration with a memory barrier so that the store to pss_pid is ordered before the load. A concurrent EmitProcSignalBarrier() then either observes the published PID and signals this slot, or completes its generation increment before we load it. While this race has become more visible due to recent features using signal barriers in more places (such as online wal_level changes), the issue is theoretically present since signal barriers were introduced to release smgr caches (e.g., in DROP DATABASE). v14 has the procsignal barrier infrastructure but no in-tree caller that actually emits a barrier, so the case is unreachable there. This issue was also reported by buildfarm member flaviventris. Reported-by: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Alexander Lakhin <exclusion@gmail.com> Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com> Discussion: https://postgr.es/m/CAEze2WgAJmWReDN7Chtba8Er2YBvKCoa0KVN25-1evnTrHsLyA@mail.gmail.com Backpatch-through: 15	2026-05-27 16:25:56 -07:00
Masahiko Sawada	47ad2233fa	Fix 051_effective_wal_level.pl on builds without injection points. Commit `2af1dc8928` placed the new "logical decoding disabled after REPACK (CONCURRENTLY)" check at the end of 051_effective_wal_level.pl. That placement assumed the logical slot "test_slot" no longer existed when the check ran, but the assumption only holds on builds with injection points: the earlier injection-point-driven tests drop "test_slot" as a side effect, while on builds without injection points the slot persists. When "test_slot" still exists, logical decoding remains enabled and the new check fails on those buildfarm members. Move the REPACK test earlier in the script, ensuring that the test starts with logical decoding disabled. Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/CAD21AoBmdmBQ-+Jga+jSKKq5OPGEP1pEjSJfRPT6MCwVHLD6og@mail.gmail.com	2026-05-27 15:52:30 -07:00
Álvaro Herrera	2af1dc8928	Disable logical decoding after REPACK (CONCURRENTLY) REPACK (CONCURRENTLY) uses a temporary logical replication slot, which is dropped once done, but it wasn't calling RequestDisableLogicalDecoding(), leaving effective_wal_level stuck at 'logical'. Fix by adding a Boolean flag to ReplicationSlotDropAcquired() to have it request to disable logical decoding, and passing it as true on REPACK. Other callers of that function preserve their existing behavior. Author: Imran Zaheer <imran.zhir@gmail.com> Reviewed-by: shveta malik <shveta.malik@gmail.com> Discussion: https://postgr.es/m/CA+UBfaktds57dw2M8BEv_kS-=ixph3w+3MxKixtaDQMi_k7Ybg@mail.gmail.com	2026-05-27 20:11:29 +02:00
Tom Lane	0f24332aeb	Fix NOTIFY wakeups for pre-commit LISTEN entries. Commit `282b1cde9` made SignalBackends() ignore ListenerEntry entries whose "listening" flag said that the listener was not yet committed. That will be true for a new listener that has already registered its queue position, but has not yet reached AtCommit_Notify(). If another backend notifies the same channel in that window, SignalBackends() would directly advance the new listener's queue position, causing it to miss message(s). Really this is a definitional question: is a new listener active as of PreCommit, or as of AtCommit? But it seems to make more sense to expect that the new listener will see all messages after its initially-registered queue position, especially since the direct-advance logic is supposed to be an optimization that doesn't affect semantics. Fix this by treating all channel entries as valid wakeup targets. Rename the "listening" flag to removeOnAbort to reflect its remaining purpose: identifying staged LISTEN entries that abort cleanup must remove. While we're here, remove an obsolete test case added by `282b1cde9`. The check for "ChannelHashAddListener array growth" was meant to exercise code that never made it into the committed patch, so now it's just a waste of test cycles. Author: Joel Jacobson <joel@compiler.org> Reviewed-by: Arseniy Mukhin <arseniy.mukhin.dev@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/9835b0a4-9121-47ac-9c44-427b8b1a7f1b@app.fastmail.com Discussion: https://postgr.es/m/6fe5ee75-537d-4d4f-909a-b21303c3ce75@app.fastmail.com	2026-05-27 12:23:42 -04:00
Heikki Linnakangas	2fbb21170e	Avoid orphaned objects dependencies Concurrent DDL can leave behind objects referencing other objects that no longer exist. This can happen if an object is dropped, while a new object that depends on it is created concurrently. For example: session 1: BEGIN; CREATE FUNCTION myschema.myfunc() ...; session 2: DROP SCHEMA myschema; session 1: COMMIT; DROP SCHEMA does check that there are no objects depending on the schema being dropped, but it does not see objects being concurrently created by other sessions. Even if it did, this scenario would still fail: session 1: BEGIN; DROP SCHEMA myschema; session 2: CREATE FUNCTION myschema.myfunc() ...; session 1: COMMIT; When the DROP SCHEMA runs, the schema was empty, but the new function is created in it before the dropping transaction completes. The CREATE FUNCTION does not see that the schema is concurrently being dropped. In both of these scenarios, the function is left behind in the schema that no longer exists. To fix, acquire AccessShareLock on all referenced objects when recording dependencies. This conflicts with the AccessExclusiveLock taken by DROP, preventing the race. After acquiring the lock, verify that the object still exists, and if it was dropped concurrently, report an error. We already had such a mechanism for shared dependencies, but for some reason we didn't do it for in-database dependencies. Ideally the locks would be acquired much earlier when creating a new object, but that will require modifying a lot of callers. This check while recording the dependency is a nice wholesale protection, and even if we change all the CREATE commands to acquire locks earlier, it's still good to have this as a backstop to catch any cases where we forgot to do so. The patch adds a few tests for some cases that left behind orphaned objects before this.
It also adds a test for roles, which already had such protection, although that test is partially disabled because the error message includes an OID which is not predictable. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Reviewed-by: Heikki Linnakangas <heikki.linnakangas@iki.fi> Discussion: https://postgr.es/m/ZiYjn0eVc7pxVY45@ip-10-97-1-34.eu-west-3.compute.internal Backpatch-through: 14	2026-05-27 18:41:14 +03:00
Heikki Linnakangas	fd93ee1008	Don't try to record dependency on a dropped column's datatype When creating a relation with a dropped column, we called recordDependencyOn() also on the datatype of the dropped column, which is always InvalidOid. In versions 15 and above, that was harmless because recordDependencyOn() considers InvalidOid as a pinned object, and skips over it. On version 14, isPinnedObject() does not consider InvalidOid as pinned, so we created a bogus pg_depend entry with refobjectid == 0. As far as I can tell, the only case when AddNewAttributeTuples() is called with dropped columns is when performing a table-rewriting ALTER TABLE command. That temporarily creates a new relation with the same columns, including dropped ones, then swaps the relations, and drops the newly created table again. So even on version 14, the bogus pg_depend entry was only on the transient relation that was dropped at the end of the ALTER TABLE command, which was harmless. Even though this is harmless, let's be tidy, similar to commit `713bce9484`. The reason I noticed this now and why I backported this, is because the next commit will add code to acquire locks on the referenced objects, and we don't want to acquire a lock on InvalidOid. Discussion: https://postgr.es/m/ZiYjn0eVc7pxVY45@ip-10-97-1-34.eu-west-3.compute.internal Backpatch-through: 14	2026-05-27 18:41:03 +03:00
Peter Eisentraut	ee31868a53	Use strtoi64 instead of strtoll This is mostly for notational consistency, since the result is stored in a variable of type int64.	2026-05-27 17:12:27 +02:00
Daniel Gustafsson	c71b94f033	Remove incorrect OpenSSL feature guards Commit `316472146` introduced support for ECDH key exchange with an ifdef guard to ensure support in the underlying OpenSSL installation. Commit 10bf4fc2c3 in OpenSSL removed this guard in 2015 which effectively made our check a no-op. There have been no complaints that this doesn't work and OpenSSL installations without ECDH support are likely very rare, so remove the checks rather than re-implementing support. Not backpatched since this fix doesn't alter functionality. Also fix a typo introduced in the original commit which had survived till this day. Author: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com> Discussion: https://postgr.es/m/1787BA9F-A11C-4A7A-9252-94C470D5CBE3@yesql.se	2026-05-27 12:58:56 +02:00
Michael Paquier	84b9d6bcea	Fix procLatch ownership race in ProcKill() DisownLatch() was executed after the PGPROC entry of the process terminated is pushed back into a freelist. A newly-forked backend that recycles the slot could call OwnLatch() and PANIC with a "latch already owned by PID", taking down the server. There were two scenarios related to lock groups where this issue could be reached: * A follower pushes the leader's PGPROC back to the freelist while the leader has not yet called DisownLatch() in its own ProcKill(). * A leader outliving all its followers pushes its own PGPROC onto the freelist before reaching DisownLatch(), which would be the most common scenario. This issue is fixed by calling SwitchBackToLocalLatch() and DisownLatch() at an earlier phase of ProcKill(), before any freelist manipulation happens, so that the slot of the backend terminated is never exposed as owning a latch. Note that pgstat_reset_wait_event_storage() is kept at a later stage. An upcoming commit will take advantage of that by introducing a test able to check the original PANIC scenario. Author: Vlad Lesin <vladlesin@gmail.com> Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/d2983796-2603-41b7-a66e-fc8489ddb954@gmail.com Backpatch-through: 14	2026-05-27 17:19:20 +09:00
Michael Paquier	5631045231	Fix race conditions in ProcKill()'s lock-group freelist handling This commit fixes two bugs in ProcKill()'s lock-group teardown freelist publication: * a double push of the leader's PGPROC that corrupts the freelist. * a leak of the last follower's PGPROC slot. ProcKill()'s lock-group teardown had two PGPROC freelist updates scattered through the function, done under two separate freeProcsLock acquisitions: * A follower's push of the leader's PGPROC, done when a follower is the last group member exiting. * Every backend's self-push at the bottom of the function. The two freelist updates were coordinated only by inspecting proc->lockGroupLeader, which a follower could clear as a side effect of pushing the leader. This coordination was broken. For example, with two concurrent backends: * The follower clears leader->lockGroupLeader and pushes the leader's PGPROC under leader_lwlock. * The follower does not clear its own proc->lockGroupLeader, being skipped. * When the leader reaches the bottom of ProcKill(), it sees a NULL proc->lockGroupLeader (the follower cleared it) and pushes itself, causing a second dlist_push_tail() of the same node onto the same freelist. * The follower at the bottom sees its own proc->lockGroupLeader being not NULL (never cleared) and skips its own push, causing its own slot to leak. This commit refactors the freelist manipulation to be done in two distinct phases, each step using its own lock acquisition to ensure that each freelist operation happens in an isolated manner for each backend (follower or leader): - First, under a single leader_lwlock acquisition, check the state of the lock-group. Depending on if we are dealing with a follower and/or a leader, and if the leader has exited before a follower, then set some state booleans that define which actions should be taken with the freelist. 
- Second, under a single freeProcsLock acquisition, perform the cleanup actions, self-push of a backend and/or push of the leader back to the freelist. This is an old issue, dating back to 9.6 where parallel workers and lock grouping were added. Author: Vlad Lesin <vladlesin@gmail.com> Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/d2983796-2603-41b7-a66e-fc8489ddb954@gmail.com Backpatch-through: 14	2026-05-27 14:48:56 +09:00
Fujii Masao	12d0004889	pg_createsubscriber: Fix cleanup of publisher-side objects after errors When pg_createsubscriber fails after creating logical replication objects, it should remove the publication and replication slot that it created on the publisher. Previously, if dropping subscriber-side objects failed, pg_createsubscriber reset its internal cleanup state too early. As a result, the exit-time cleanup could skip removing the publication or replication slot on the publisher. This could leave pg_createsubscriber-created objects behind on the publisher after a failed run. That can make a retry harder, because the leftover publication or replication slot may need to be removed manually before running pg_createsubscriber again. In the case of a replication slot, leaving it behind can also retain WAL files longer than expected. The cause of this issue was that the flags made_publication and made_replslot tracking whether pg_createsubscriber created a publication or replication slot on the primary were incorrectly reset to false when failures occurred while dropping objects on the subscriber. This commit fixes the issue by preventing those cleanup flags from being reset even when failures occurred while dropping objects on the subscriber, ensuring proper cleanup of primary objects before exit on failure. Backpatch to v17, where pg_createsubscriber was added. Author: Nisha Moond <nisha.moond412@gmail.com> Reviewed-by: David G. Johnston <david.g.johnston@gmail.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Peter Smith <smithpb2250@gmail.com> Discussion: https://postgr.es/m/CABdArM5V9QKK1PkLY9dpgAcZa3kUp84-wPqPovxvdLOri4=69w@mail.gmail.com Backpatch-through: 17	2026-05-27 10:34:17 +09:00
Bruce Momjian	9a41b34a28	doc: add comma to UPDATE docs, for consistency Reported-by: X-MAN Author: X-MAN Discussion: https://postgr.es/m/tencent_90A64D807DE3586650CF3426C28BB599D30A@qq.com	2026-05-26 20:18:00 -04:00
Alexander Korotkov	0b866bb903	Clean up 019_replslot_limit.pl comments Update stale comments and test names in 019_replslot_limit.pl to match the actual WAL advancement and wal_status checks. Remove a redundant standby stop in the inactive_since coverage. Discussion: https://postgr.es/m/CABPTF7XxDonXAcz6DsN6AUJB3swYrZkJHq3UCDaD3Q2H%2Bj0gUA%40mail.gmail.com Author: Xuneng Zhou <xunengzhou@gmail.com> Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>	2026-05-27 03:09:44 +03:00
Alexander Korotkov	cdb1d1cf1d	Stabilize 019_replslot_limit.pl: wait on slot restart_lsn wait_for_catchup() has "wait for the standby to reach the target LSN" semantics. However, the previous polling implementation actually waited for the primary to observe that position via pg_stat_replication. `7e8aeb9e48` introduced the new WAIT FOR LSN-based implementation, which just probes the standby. 019_replslot_limit.pl relied on the old side effect: its "slot state changes to extended/unreserved" subtests inspect primary-side pg_replication_slots, whose wal_status depends on restart_lsn, which only advances after the walsender processes a standby reply. Make the test wait on what it actually needs by replacing each wait_for_catchup() with wait_for_slot_catchup('rep1', 'restart', primary->lsn('write')). Reported-by: Alexander Lakhin <exclusion@gmail.com> Discussion: https://postgr.es/m/63f6abc9-c0ae-465d-a4e6-667eca6ea008@gmail.com Author: Xuneng Zhou <xunengzhou@gmail.com> Author: Alexander Korotkov <aekorotkov@gmail.com>	2026-05-27 02:54:04 +03:00
Alexander Korotkov	bec61f5935	Skip pg_database.dathasloginevt cleanup on standby EventTriggerOnLogin() tries to clear pg_database.dathasloginevt when the database no longer has any login event triggers but the flag is still set. To make that safe against concurrent flag setters, it takes a conditional AccessExclusiveLock on the database object. On a hot standby, that lock acquisition fails outright with FATAL: cannot acquire lock mode AccessExclusiveLock on database objects while recovery is in progress because LockAcquireExtended() refuses locks stronger than RowExclusiveLock on database objects during recovery. The standby already replays the flag's value from the primary, so the dangling flag is the result of replaying a state in which the primary had already dropped its login event triggers but not yet run a login event trigger pass to clear the flag. Any session connecting to the standby in that window therefore fails to connect. Skip the cleanup on a standby. The flag will be cleared via WAL replay once the primary clears it on its side. Add a recovery TAP test that reproduces the original report: create and drop a login event trigger on the primary in one session, wait for the standby to replay, then verify that a fresh connection to the standby succeeds. Backpatch to v17, where the login event triggers were introduced. Author: Ayush Tiwari <ayushtiwari.slg01@gmail.com> Reported-by: Egor Chindyaskin <kyzevan23@mail.ru> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com> Discussion: https://postgr.es/m/19488-d7ccfca2bf6b74b0%40postgresql.org Backpatch-through: 17	2026-05-27 02:27:32 +03:00
Amit Kapila	490259d072	Fix memory accumulation in pg_sync_replication_slots() during retries. Unlike the slotsync worker, whose retry cycles are separated by transaction boundaries, pg_sync_replication_slots() retries within a single SQL function call. Per-cycle allocations for slot names, plugin names, database names, and auxiliary list containers get accumulated across retries until the function returned. Memory growth is proportional to the number of retries and remote slots, and the function may wait an extended period between cycles when slots are slow to persist. Fix by running each retry cycle in a short-lived memory context (sync_retry_ctx) that is reset before the next attempt. Additionally, release tuple slots created with MakeSingleTupleTableSlot() before clearing the walreceiver result. Author: Xuneng Zhou <xunengzhou@gmail.com> Reviewed-by: Zhijie Hou <houzj.fnst@fujitsu.com> Reviewed-by: shveta malik <shveta.malik@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/CABPTF7VVPxgfYyr8Kyi=+JACjckQ6NpniV9eRtHboj2hMn0REw@mail.gmail.com	2026-05-26 15:16:12 -07:00
Bruce Momjian	8656ba7f71	doc PG 19 relnotes: more fixes Reported-by: Thom Brown Author: Thom Brown Discussion: https://postgr.es/m/CAA-aLv7B7M9s5fZgCoWzXqer5RJ9jqG_k0h8t5QHFW=Qbxa=Eg@mail.gmail.com	2026-05-26 17:49:31 -04:00
Bruce Momjian	1d751b4b6b	doc PG 19 relnotes: various corrections Reported-by: Thom Brown Author: Thom Brown Discussion: https://postgr.es/m/CAA-aLv7w1wwucet76yAW0yq3-LrN5wL81uRrnpT3Tyxh7dmyTw@mail.gmail.com	2026-05-26 16:31:58 -04:00
Tom Lane	61ea5cc6a6	Add stack depth check to QueueFKConstraintValidation(). QueueFKConstraintValidation() recurses through the partition hierarchy to queue child constraint validations and to mark child rows as validated. With a sufficiently deep partition tree, this can result in a stack-overflow crash. Defend against that as we do elsewhere. Bug: #19482 Reported-by: Alexander Lakhin <exclusion@gmail.com> Author: Ayush Tiwari <ayushtiwari.slg01@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/19482-4cc37cbf52d55235@postgresql.org Backpatch-through: 18	2026-05-26 11:58:25 -04:00
Álvaro Herrera	1588d89af2	Restructure repack worker teardown The original code would leave a shared memory segment unreleased if we fail partway through initialization. Change the shutdown order so that we always free it. Author: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Antonin Houska <ah@cybertec.at> Discussion: https://postgr.es/m/agtNn6ZCmdI2KJFn@alvherre.pgsql	2026-05-26 17:24:06 +02:00
Bruce Momjian	cfedd45133	doc PG 19 relnotes: adjust item to mention pg_replication_slots Reported-by: Chong Peng Author: Chong Peng Discussion: https://postgr.es/m/CC2712F9-8457-4733-AA9D-7D7C9843B590@gmail.com	2026-05-26 10:59:30 -04:00
Michael Paquier	6aa26be288	Fix calculation of members_size in pg_get_multixact_stats() pg_get_multixact_stats() uses members_size to report the amount of storage used by the currently retained multixact members. However, MultiXactOffsetStorageSize() divided the member count by the number of members per storage group before multiplying by the group size, so it was rounding down its result and incorrectly reported zero when there were few retained members. The calculation is changed to calculate the same based on the member count. While on it, this fixes a different issue in the isolation test multixact-stats. Three fields were defined for checks related to the oldest offset values, but were not used. The offsets existed in an older version of the patch than what has been committed. These are replaced by checks for members_size, checking the new calculation formula. Thinkos introduced in `97b101776c`. Author: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/819AC1B2-1A71-4244-B081-3ADD85D1725D@gmail.com	2026-05-26 13:49:04 +09:00
Michael Paquier	d40aed5542	Adjust some error hints The wording of two error hints is tweaked in this commit: - Import of extended statistics, where the value of an array element is not a NULL or a string. - Online data checksum switch, where a period was missing. Author: Baji Shaik <baji.pgdev@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CA+fm-RMrKbyky_+vi5SDdAVnFVjWh7zW3GoDAVnrp5OpDnW6tw@mail.gmail.com	2026-05-26 08:13:22 +09:00
Tom Lane	524cc0f638	Fix missed ReleaseVariableStats() in intarray's _int_matchsel(). Given a WHERE clause like "int[] @@ query_int" or "query_int ~~ int[]" where the query_int side is a table column having statistics, _int_matchsel() exited without remembering to free the statistics tuple. This would typically lead to warnings about cache refcount leakage, like WARNING: resource was not closed: cache pg_statistic (73), tuple 42/12 has count 1 It's been wrong since this code was added, in commit `c6fbe6d6f`. Bug: #19492 Reported-by: Man Zeng <zengman@halodbtech.com> Author: Man Zeng <zengman@halodbtech.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/19492-ddcd0e22399ef85a@postgresql.org Backpatch-through: 14	2026-05-25 18:15:49 -04:00
Fujii Masao	e2b8813403	dblink: Reject use_scram_passthrough on foreign-data wrappers Previously, dblink accepted the use_scram_passthrough option on foreign-data wrappers via ALTER FOREIGN DATA WRAPPER dblink_fdw OPTIONS, even though the setting had no effect there. use_scram_passthrough should be only meaningful for foreign servers and user mappings, so this commit updates dblink to accept the option only in those contexts. Backpatch to v18, where use_scram_passthrough was introduced. Author: Matheus Alcantara <matheusssilv97@gmail.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/CAHGQGwEJ8rZjmbOvCicyr4vbuLio082bNTde0WNoSWaWr9wVcg@mail.gmail.com Backpatch-through: 18	2026-05-26 01:07:24 +09:00
Fujii Masao	5f5165e2fe	dblink: Give user mapping precedence for use_scram_passthrough Commit `97f6fc10ff` changed postgres_fdw so that user-mapping settings override foreign server settings for use_scram_passthrough. This commit applies the same behavior to dblink. Backpatch to v18, where use_scram_passthrough was introduced. Author: Matheus Alcantara <matheusssilv97@gmail.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/CAHGQGwEJ8rZjmbOvCicyr4vbuLio082bNTde0WNoSWaWr9wVcg@mail.gmail.com Backpatch-through: 18	2026-05-26 00:51:18 +09:00
Fujii Masao	97f6fc10ff	postgres_fdw: Give user mapping precedence for use_scram_passthrough Previously, when use_scram_passthrough was specified on both a foreign server and a user mapping, the server-level setting took precedence over the user-mapping setting. This was inconsistent with the usual semantics of postgres_fdw options, where foreign server options provide shared defaults and user mapping options override them on a per-user basis. This commit updates postgres_fdw so that the user-mapping setting takes precedence when use_scram_passthrough is specified in both places. This matches the behavior of other connection options such as sslcert and sslkey. Backpatch to v18, where use_scram_passthrough was introduced. In v18, this only affects limited configurations that specify conflicting values at both the foreign server and user-mapping levels. In such cases, users would naturally expect the user-mapping setting to override the server-level setting, so changing the behavior should be minimally disruptive. Also keeping v18 as the only branch with different semantics for use_scram_passthrough would be unnecessarily confusing, so backpatch this fix to v18. Author: Matheus Alcantara <matheusssilv97@gmail.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/CAHGQGwEJ8rZjmbOvCicyr4vbuLio082bNTde0WNoSWaWr9wVcg@mail.gmail.com Backpatch-through: 18	2026-05-26 00:46:31 +09:00
Daniel Gustafsson	377cc45194	doc: Clarify CHECKPOINT handling of unlogged buffers The CHECKPOINT reference page still described checkpoints as flushing all data files, which could be misleading as it depends on the value of the FLUSH_UNLOGGED option. Update the description to make it clearer that only data files of permanent relations are flushed by default. Author: Chao Li <lic@highgo.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://postgr.es/m/4855807D-F1CA-44E6-9B58-406691832848@gmail.com	2026-05-25 12:15:29 +02:00
Daniel Gustafsson	7e5d8bd013	psql: Tab completion for CHECKPOINT FLUSH_UNLOGGED boolean options Tab completion for CHECKPOINT options contained FLUSH_UNLOGGED, but the boolean value was not part of the completion. Fix to make this consistent with other boolean values. Author: Chao Li <lic@highgo.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://postgr.es/m/4855807D-F1CA-44E6-9B58-406691832848@gmail.com	2026-05-25 11:57:14 +02:00
Alexander Korotkov	e64a9ba2b4	Reject degenerate SPLIT PARTITION with DEFAULT partition ALTER TABLE ... SPLIT PARTITION allows a DEFAULT partition to be created as one of the replacement partitions when the parent table does not already have one. However, it should not allow the degenerate case where a non-DEFAULT partition keeps exactly the same bound as the split partition and the command merely adds a DEFAULT partition through the SPLIT PARTITION path. Detect that case by comparing the bound of the split partition with the bound of the only non-DEFAULT replacement partition, and raise an error when they are the same. Users should add a DEFAULT partition directly with CREATE TABLE ... PARTITION OF ... DEFAULT or ALTER TABLE ... ATTACH PARTITION ... DEFAULT instead. The comparison goes through the partition operator family rather than byte equality so that values which are binary-different but compare equal under the partition key's comparator are treated as the same bound. The corresponding regression test uses a float8 LIST partition with -0.0 and 0.0 -- they have different bit patterns but are equal under float8 -- to verify that a datumIsEqual()-based check would let the degenerate split through while the partsupfunc-based check correctly rejects it. Author: Chao Li <lic@highgo.com> Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com> Discussion: https://postgr.es/m/C18878AB-DEB2-4A61-9995-A035DD644B81@gmail.com	2026-05-25 11:57:42 +03:00
Michael Paquier	0b8fa5fd37	Fix size check in statext_dependencies_deserialize() The check for the minimum expected bytea size of a MVDependencies object was using SizeOfItem() for its calculation. This macro uses the number of attributes in a single dependency. This minimum size calculation should be based on MinSizeOfItems(), that computes the minimum expected size as the header plus the minimally-sized number of dependency items. Oversight in `d08c44f7a4`. Author: Ilia Evdokimov <ilya.evdokimov@tantorlabs.com> Discussion: https://postgr.es/m/4b8d299d-2505-4c30-bf80-0f697410db35@tantorlabs.com Backpatch-through: 14	2026-05-25 14:38:02 +09:00
Álvaro Herrera	01a80f0621	Revert "Allow logical replication snapshots to be database-specific" This reverts commit `0d3dba38c7`, which was determined to have fundamental flaws. This restricts REPACK (CONCURRENTLY) so that only one process can run it concurrently on different tables and even on different databases; we'll lift that restriction in another way during the next development cycle. Reported-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/CAA4eK1Jg21ODQ7fS2fvN5W_S5kDRhAP5inj3XMRQaa=s-GbYhw@mail.gmail.com	2026-05-23 21:33:19 -07:00
Fujii Masao	2c4bd2bf57	psql: Add missing IO option to EXPLAIN tab completion Commit `681daed931` added EXPLAIN (IO) as a boolean option, but did not update psql's tab completion to include it. Add IO to both the option keyword list and the boolean ON/OFF completion. Author: Afrah Razzak <mypg.afrah@gmail.com> Reviewed-by: Zhenwei Shang <a934172442@gmail.com> Discussion: https://postgr.es/m/CAAJ6gzGi9gK6nGjsGCch0nFPdd2+odWatTS1uAGwRDPbHkmSVQ@mail.gmail.com	2026-05-23 09:39:58 +09:00
Michael Paquier	c37b38806a	Avoid exposing WAL receiver raw conninfo during timeline jumps When reusing an existing WAL receiver after it has reached WALRCV_WAITING for new instructions, RequestXLogStreaming() copied PrimaryConnInfo into WalRcv->conninfo before switching the state to WALRCV_RESTARTING. At that point ready_to_display could still be true, so pg_stat_wal_receiver could expose the raw connection string, including sensitive fields, but it should only show the user-displayable version of the connection string. WALRCV_RESTARTING does not establish a new connection. The waiting WAL receiver reuses its existing connection and only needs a new startpoint and timeline, so there is no need to copy the raw connection string into shared memory again. Let's only copy conninfo when launching a new WAL receiver after WALRCV_STOPPED, not while waiting for instructions. This commit adds coverage for the case fixed by this commit to the timeline-switch test by verifying that the WAL receiver conninfo remains consistent across the jump. Backpatch all the way down, as this issue is possible since pg_stat_wal_receiver has been introduced. Author: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/EF91FF76-1E2B-4F3B-9162-290B4DC517FF@gmail.com Backpatch-through: 14	2026-05-23 08:10:07 +09:00
Michael Paquier	7f469097c7	Improve pg_stat_wal_receiver for CONNECTING status Commit `a36164e746` added a CONNECTING status for the WAL receiver, but pg_stat_wal_receiver returned no information while the connection to the primary was attempted, limiting the usability of the feature in high-latency environments where the connection attempt to the primary could take time. This commit improves the report of the status by splitting the way the shared memory state of the WAL receiver is filled before and after the connection to the primary is attempted with walrcv_connect(): - Before the attempt, reset all the connection fields, switch ready_to_display to true. - After the attempt, fill in the connection fields. This change means two spinlock acquisitions instead of one, but at least monitoring tools can know about the connection attempt before its completion, enlarging the usability of the feature. This code path is taken only once when a WAL receiver is spawned, so the extra acquisition does not matter performance-wise. Reported-by: Chao Li <li.evan.chao@gmail.com> Author: Michael Paquier <michael@paquier.xyz> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/EF91FF76-1E2B-4F3B-9162-290B4DC517FF@gmail.com	2026-05-23 04:04:26 +09:00
Fujii Masao	06a5c3cdef	Set notice receiver before libpq connection startup completes Commit `112faf1378` added custom notice receivers for replication, postgres_fdw, and dblink so that remote NOTICE, WARNING, and similar messages are reported via ereport(). However, those notice receivers were installed only after libpqsrv_connect() and libpqsrv_connect_params() returned, by which point libpq connection startup had already completed. As a result, messages emitted during connection establishment could be missed. This commit fixes the issue by splitting libpqsrv_connect() and libpqsrv_connect_params() into separate start and complete phases: libpqsrv_connect_start(), libpqsrv_connect_params_start(), and libpqsrv_connect_complete(). This allows callers to perform per-connection setup, such as installing a notice receiver, after the connection has been started but before startup completes. Note that callers of libpqsrv_connect_start() and libpqsrv_connect_params_start() must still call libpqsrv_connect_complete(), even if the start function returns NULL, so that any external FDs reserved during startup are released properly. Author: Chao Li <lic@highgo.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Vignesh C <vignesh21@gmail.com> Reviewed-by: Rafia Sabih <rafia.pghackers@gmail.com> Discussion: https://postgr.es/m/A2B8B7DE-C119-492F-A9FA-14CF86849777@gmail.com	2026-05-23 00:25:48 +09:00
Fujii Masao	d8b5d87e54	Prevent setting NO INHERIT on partitioned NOT NULL constraints The documentation states that NOT NULL constraints on partitioned tables are always inherited by all partitions, and therefore cannot be declared NO INHERIT. While a check already existed to reject creating such constraints with NO INHERIT, previously the same check was missing for ALTER TABLE ... ALTER CONSTRAINT ... NO INHERIT. This commit adds the missing check so that attempting to set NO INHERIT on a partitioned NOT NULL constraint now fails. Backpatch to v18, where ALTER TABLE ... ALTER CONSTRAINT ... [NO] INHERIT was added. Author: Andreas Karlsson <andreas@proxel.se> Reviewed-by: Jim Jones <jim.jones@uni-muenster.de> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/ecc985ad-6ec1-4094-a315-317943ca5f3f@proxel.se Backpatch-through: 18	2026-05-22 23:59:04 +09:00
Alexander Korotkov	0392fb900e	Revert "Reject degenerate SPLIT PARTITION with DEFAULT partition" This reverts commit `d8af730100`. Per buildfarm failures.	2026-05-20 23:23:49 +03:00
Alexander Korotkov	d8af730100	Reject degenerate SPLIT PARTITION with DEFAULT partition ALTER TABLE ... SPLIT PARTITION allows a DEFAULT partition to be created as one of the replacement partitions when the parent table does not already have one. However, it should not allow the degenerate case where a non-DEFAULT partition keeps exactly the same bound as the split partition and the command merely adds a DEFAULT partition through the SPLIT PARTITION path. Detect that case by comparing the bound of the split partition with the bound of the only non-DEFAULT replacement partition, and raise an error when they are the same. Users should add a DEFAULT partition directly with CREATE TABLE ... PARTITION OF ... DEFAULT or ALTER TABLE ... ATTACH PARTITION ... DEFAULT instead. Author: Chao Li <lic@highgo.com> Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com> Discussion: https://postgr.es/m/C18878AB-DEB2-4A61-9995-A035DD644B81@gmail.com	2026-05-20 14:32:57 +03:00
Fujii Masao	d6a72bbe00	pg_recvlogical: Add tests for output file permissions Commit `263d1e6dfe` changed pg_recvlogical to honor source cluster file permissions when creating output files. This commit adds tests verifying that output files are created with mode 0600 when the source cluster is initialized without group access, and with mode 0640 when group access is enabled. Author: Srinath Reddy Sadipiralla <srinath2133@gmail.com> Author: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/CAHGQGwHhpizYzMo3nFP4GkNMueSNMY3QfC-gBN1VTXtuiANDvw@mail.gmail.com	2026-05-20 16:01:56 +09:00
Fujii Masao	263d1e6dfe	pg_recvlogical: Honor source cluster file permissions for output files Commit `c37b3d08ca` attempted to preserve group permissions on pg_recvlogical output files when group access was enabled on the source cluster. However, the output files were still created with a fixed S_IRUSR | S_IWUSR mode, preventing group-read permissions from being applied. This commit fixes the issue by creating output files with pg_file_create_mode instead of a hard-coded mode. This allows pg_recvlogical to correctly preserve group permissions from the source cluster. Backpatch to all supported branches. Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Srinath Reddy Sadipiralla <srinath2133@gmail.com> Discussion: https://postgr.es/m/CAHGQGwHhpizYzMo3nFP4GkNMueSNMY3QfC-gBN1VTXtuiANDvw@mail.gmail.com Backpatch-through: 14	2026-05-20 15:54:13 +09:00
Álvaro Herrera	0160143ad9	Fix REPACK decoding worker not cleaned up on FATAL exit When the launching backend of REPACK (CONCURRENTLY) is terminated via pg_terminate_backend(), ProcDiePending causes ereport(FATAL) which bypasses PG_FINALLY blocks. As a result, stop_repack_decoding_worker() is never called, leaving the decoding worker running indefinitely and holding its temporary replication slot. Fix by using PG_ENSURE_ERROR_CLEANUP, which handles both ERROR and FATAL exits. Author: Baji Shaik <baji.pgdev@gmail.com> Reviewed-by: Sami Imseih <samimseih@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/CA+fm-RNoPxL2N7db_A0anMXV_aDu6jWj4PNOPtMtBUAPDPvSXQ@mail.gmail.com	2026-05-19 11:37:46 -07:00
Alexander Korotkov	83df16f1fa	Clarify SPLIT PARTITION bound requirements in docs The documentation said that the bounds of new partitions should not overlap and that their combined bounds should equal the bounds of the split partition. That is misleading when a new DEFAULT partition is specified, because the explicit partitions may cover only part of the split partition while the DEFAULT partition covers the rest. Clarify that new non-DEFAULT partition bounds must not overlap with other new or existing partitions and must be contained within the bounds of the split partition. Also state that the combined bounds must exactly match the split partition only when no new DEFAULT partition is specified. While here, improve nearby wording about hash-partitioned target tables and splitting a DEFAULT partition with the same partition name. Author: Chao Li <lic@highgo.com> Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com> Discussion: https://postgr.es/m/C18878AB-DEB2-4A61-9995-A035DD644B81@gmail.com	2026-05-19 13:54:55 +03:00
Alexander Korotkov	971017c495	Fix SPLIT PARTITION hint for DEFAULT partition bounds When ALTER TABLE ... SPLIT PARTITION specifies a DEFAULT partition, the explicit partitions do not need to cover the split partition's bound exactly. They may cover only part of it, with the DEFAULT partition covering the remaining range. However, the existing hint said that the combined bounds of the new partitions must exactly match the bound of the split partition, which is misleading for this case and inconsistent with the code comment. Fix the hint to state the actual requirement: explicit partition bounds must stay within the bounds of the split partition when a DEFAULT partition is specified. Author: Chao Li <lic@highgo.com> Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com> Discussion: https://postgr.es/m/C18878AB-DEB2-4A61-9995-A035DD644B81@gmail.com	2026-05-19 13:54:55 +03:00
Alexander Korotkov	9354896920	Fix SPLIT PARTITION range bound validation with DEFAULT When splitting a range partition and defining a new DEFAULT partition, the validation checked the lower bound of the first explicit partition and the upper bound of explicit partitions only when they were not first. If there was exactly one explicit non-DEFAULT partition, its upper bound was therefore not checked. This could allow the replacement partition to extend beyond the upper bound of the partition being split, potentially overlapping another existing partition. Fix this by checking the upper bound whenever the explicit partition is the last one. Add a regression test covering the single explicit partition plus DEFAULT case. Author: Chao Li <lic@highgo.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Zhenwei Shang <a934172442@gmail.com> Reviewed-by: Dmitry Koval <d.koval@postgrespro.ru> Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com> Discussion: https://postgr.es/m/C18878AB-DEB2-4A61-9995-A035DD644B81@gmail.com	2026-05-19 13:54:55 +03:00
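The upper-bound gap described in the commit above can be illustrated with a minimal Python sketch. This is not the PostgreSQL C code: the function name `bounds_ok`, the `(lower, upper)` tuple representation, and the `fixed` flag are hypothetical, chosen only to show why a single explicit partition could escape the upper-bound check.

```python
def bounds_ok(explicit_parts, split_lower, split_upper, fixed=True):
    """Check explicit (lower, upper) ranges against the split partition's range.

    Hypothetical model of the validation when a new DEFAULT partition is
    specified, so the explicit partitions only need to stay inside the range.
    """
    last_index = len(explicit_parts) - 1
    for i, (lower, upper) in enumerate(explicit_parts):
        is_first = i == 0
        is_last = i == last_index
        # The first explicit partition must not start below the split range.
        if is_first and lower < split_lower:
            return False
        if fixed:
            # Fixed behavior: check the upper bound whenever this is the last one.
            check_upper = is_last
        else:
            # Old behavior as described: upper bound checked only for
            # partitions that are not first.
            check_upper = is_last and not is_first
        if check_upper and upper > split_upper:
            return False
    return True
```

With one explicit partition, that partition is both first and last, so the old condition never checked its upper bound; `bounds_ok([(0, 20)], 0, 10, fixed=False)` accepts a range that extends past the split partition, while the fixed variant rejects it.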
Fujii Masao	1164a82272	Fix COPY FROM ON_ERROR SET_NULL with selective column list When using COPY FROM ... ON_ERROR SET_NULL with a selective column list, the domain_with_constraint array was incorrectly allocated based on the length of the target column list. While the array was populated sequentially, CopyFromTextLikeOneRow attempted to access it using the physical attribute index (attnum - 1). This mismatch caused out-of-bounds reads when targeting high-numbered columns, allowing NULL values to bypass NOT NULL domain checks and be silently inserted. Fix by allocating the array to match the total number of physical attributes (num_phys_attrs) and indexing via attnum - 1, bringing it into alignment with other per-column arrays in BeginCopyFrom. Author: SATYANARAYANA NARLAPURAM <satyanarlapuram@gmail.com> Reviewed-by: Jian He <jian.universality@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/CAHg+QDdej0c0gWJi2FnbirzhgzyZNPiTwC1P5B_-dSNCzq-91A@mail.gmail.com	2026-05-19 10:11:41 +09:00
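The sizing and indexing mismatch described in the commit above can be sketched in Python. All names here are hypothetical stand-ins rather than the backend's C code, and Python raises `IndexError` where C would silently read out of bounds.

```python
def flags_sized_by_column_list(attnumlist, has_constraint):
    # Old layout as described: one slot per listed column, filled sequentially.
    return [has_constraint(attnum) for attnum in attnumlist]


def flags_sized_by_physical_attrs(num_phys_attrs, attnumlist, has_constraint):
    # Fixed layout: one slot per physical attribute, filled at attnum - 1.
    flags = [False] * num_phys_attrs
    for attnum in attnumlist:
        flags[attnum - 1] = has_constraint(attnum)
    return flags


def lookup(flags, attnum):
    # The reader always indexes by physical attribute number (1-based).
    return flags[attnum - 1]
```

For a five-column table with only column 5 listed, the first layout yields a one-element array, so `lookup(flags, 5)` reaches index 4 and fails; the second layout keeps the writer and the reader on the same index.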
Daniel Gustafsson	801b9962e7	Remove support for 8 byte tear free read/write on 32-bit The macro for enabling single-copy atomicity on i586+ when using GCC has been incorrect since 2017 (commit `e8fdbd58f`) without any complaints, and getting it to work is non-trivial. Getting this to work reliably requires C11 atomics, which in turn also bumps the required MSVC version. For now, simply remove the attempted support which doesn't work anyway. Author: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Reported-by: Jakub Wartak <jakub.wartak@enterprisedb.com> Suggested-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/CAKZiRmycHOOJyEPc9FUss1_69_U62WoSx32jT7wyES-YkStZKA@mail.gmail.com Discussion: https://posrgr.es/m/CA+hUKGKFvu3zyvv3aaj5hHs9VtWcjFAmisOwOc7aOZNc5AF3NA@mail.gmail.com	2026-05-18 08:59:59 -07:00
Daniel Gustafsson	15b140d465	Remove obsolete comment in AtEOXact_Inval This comment was originally added to RegisterInvalid() in POSTGRES before Postgres95, and came in via the Postgres95 import. It has been obsolete for quite some time so remove. Author: Steven Niu <niushiji@highgo.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://postgr.es/m/MN2PR15MB30219837B2381AE2518A4C45A7FCA@MN2PR15MB3021.namprd15.prod.outlook.com	2026-05-18 08:43:12 -07:00
Daniel Gustafsson	e04910a9a2	psql: Make ParseVariableDouble reject values above max ParseVariableDouble missed returning false after logging an error when the parsed value exceeded max, making the value assigned rather than rejected. Backpatch down to v18 where this was introduced as part of the \WATCH_INTERVAL. Author: Sven Klemm <sven@tigerdata.com> Co-authored-by: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://postgr.es/m/CAMCrgp31p_5SDVi7dwnP39tTW5icQ0MWHA+N4kJdXgkL0PEy8w@mail.gmail.com Backpatch-through: 18	2026-05-18 08:33:36 -07:00
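The missing early return described in the commit above follows a common pattern, sketched here in Python. The function `parse_variable_double` and its `(ok, value)` return shape are hypothetical illustrations, not psql's C signature.

```python
def parse_variable_double(text, min_value, max_value):
    """Return (ok, value); value is None whenever the input is rejected."""
    try:
        parsed = float(text)
    except ValueError:
        return False, None
    if parsed < min_value:
        return False, None
    if parsed > max_value:
        # The step the commit says was missing: report the error AND reject,
        # rather than falling through and assigning the out-of-range value.
        return False, None
    return True, parsed
```

Without the return in the `parsed > max_value` branch, control would fall through to the final line and the out-of-range value would be accepted even though an error had been logged.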
Daniel Gustafsson	aa7eb23aca	oauth: Fix missing quote in error message The error message for incorrect oauth validator configuration was missing a quote character. OAuth was introduced in v18 but there is no need for a backpatch since this was introduced in `22f9207aaa`. Author: Jonathan Gonzalez V. <jonathan.abdiel@gmail.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://postgr.es/m/ff9b84b9e6d5a3fef1f320ee5d63ec7dae722739.camel@gmail.com	2026-05-18 08:03:09 -07:00
Michael Paquier	a28fa2947d	Fix issues with handling of expressions in extended stats restore This commit addresses some defects with the handling of expressions in pg_restore_extended_stats() and pg_clear_extended_stats(): - Misleading WARNING for an incorrect number of expressions, where the number of required expressions was reported as the number of elements given in input rather than the actual number of expressions expected by the extstats object definition. - Incorrect matching of expression names, where a key name was considered as valid as long as it matched with the prefix of a legit key name. For example "correlatio" given in input would match with "correlation", and be considered valid. The consequence of this bug was a silent discard of the input data, where the operation would be considered a success. The value associated to the prefixed key was not inserted in the catalogs, just ignored. pg_dump would not generate such input data patterns, but a user doing manual stats injection could. - Missing heap_freetuple() in pg_clear_extended_stats(), for the case where the extstats object in input does not match with its parent relation. Author: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/A7C11B83-7534-4A09-9071-FBD09175CFC8@gmail.com	2026-05-18 13:18:35 +09:00
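The prefix-matching defect described in the commit above can be shown with a short Python sketch. Both helper names are hypothetical; only the key name "correlation" and the truncated input "correlatio" come from the commit message.

```python
def matches_by_prefix(name, key):
    # Buggy style: compares only as many characters as the input provides,
    # so any truncated form of a valid key is accepted.
    return key[:len(name)] == name


def matches_exactly(name, key):
    # Fixed style: the whole name must equal the whole key.
    return name == key
```

Here `matches_by_prefix("correlatio", "correlation")` is true while `matches_exactly("correlatio", "correlation")` is false, which is the difference between silently discarding a mistyped key and rejecting it.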
Fujii Masao	a120ecf549	Fix parsing of REPACK options Previously, REPACK option parsing had two bugs. First, REPACK (CONCURRENTLY OFF) failed with: ERROR: unrecognized REPACK option "concurrently" while CONCURRENTLY ON was accepted correctly. Second, when the same option was specified multiple times, the last value specified was not always honored. If any occurrence set the option to ON, the option was treated as enabled even when the final setting was OFF. This commit fixes these issues by correctly accepting CONCURRENTLY regardless of its value, and by making the last specified value take precedence when an option appears multiple times. Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/CAHGQGwGAY4kfDtC4i+hAOX-a3u0yOA6__6EDTQz-ytsDHgh-yQ@mail.gmail.com	2026-05-18 13:14:49 +09:00
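The second bug described in the commit above (any ON occurrence winning over a later OFF) can be sketched in Python. The function names and the list-of-pairs option representation are hypothetical, not the backend's parser.

```python
def concurrently_any_on(options):
    # Old behavior as described: once any occurrence is ON, it stays enabled.
    enabled = False
    for name, value in options:
        if name == "concurrently" and value:
            enabled = True
    return enabled


def concurrently_last_wins(options):
    # Fixed behavior: each occurrence overwrites the previous setting.
    enabled = False
    for name, value in options:
        if name == "concurrently":
            enabled = value
    return enabled
```

For the option list `[("concurrently", True), ("concurrently", False)]`, the first function still reports the option as enabled, while the second honors the final OFF.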
Tatsuo Ishii	26269fe3c8	Fix IGNORE NULLS nullness cache for volatile window arguments. The IGNORE NULLS implementation caches whether a window function argument evaluated to NULL or NOT NULL for a given partition row. That is safe for ordinary expressions, but not for volatile expressions, where evaluating the same argument on the same row can produce a different NULL/NOT NULL result later. This could produce wrong results in two ways. A row previously cached as NULL could be skipped even though a later evaluation would return NOT NULL. Conversely, a row cached as NOT NULL could be chosen as the target row, then re-evaluated to fetch the actual value and return NULL. Make the nullness cache conditional per argument. Do not use it for arguments containing volatile functions or subplans, following the same conservative approach used for moving window aggregates. Also avoid re-evaluating non-cacheable partition arguments after the scan has already found the target row. Add regression tests covering volatile arguments and subplan arguments with IGNORE NULLS. Author: Chao Li <lic@highgo.com> Reviewed-by: Tatsuo Ishii <ishii@postgresql.org> Discussion: https://postgr.es/m/42B42506-6972-4266-8422-FB73E61D9DA7@gmail.com	2026-05-18 12:09:37 +09:00
Michael Paquier	e7b416b2fa	injection_points: Move some structs to new header injection_points.h This commit moves the definitions of InjectionPointConditionType and InjectionPointCondition into a new header local to the test module injection_points.h, so as these can be shared across more files in the module. A patch for a bug fix is under discussion, whose proposed test will benefit from this refactoring. Backpatch down to where the module exists, as this should be useful for future bug fixes, even cases unrelated to the thread where this change has been discussed. Author: Andrey Borodin <x4mmm@yandex-team.ru> Author: Vlad Lesin <vladlesin@gmail.com> Discussion: https://postgr.es/m/d2983796-2603-41b7-a66e-fc8489ddb954@gmail.com Backpatch-through: 17	2026-05-18 11:11:40 +09:00
Noah Misch	bf7d19be9b	Use ereport(ERROR), not Assert(), for publisher tuples missing columns. Three locations use Assert() to guard against a mismatch between the number of columns advertised in the RELATION message and the number actually received in the subsequent INSERT/UPDATE tuple message. Since these values originate from the publisher, the check must survive into production builds. A malicious or buggy publisher can send a RELATION claiming N columns and an INSERT claiming M < N columns. The subscriber's apply worker indexes into colvalues[]/colstatus[] using column indices from the RELATION message's attribute map, causing a heap out-of-bounds read when the tuple's column array is smaller than expected. We've looked, without success, for a scenario in which the publisher holds sufficient control over these out-of-bounds bytes to exploit this or even to reach a SIGSEGV. Despite not finding one, the code has been fragile. Back-patch to v14 (all supported versions). Reported-by: Varik Matevosyan <varikmatevosyan@gmail.com> Author: Varik Matevosyan <varikmatevosyan@gmail.com> Discussion: https://postgr.es/m/CA+bBoog3cCogktzfLb9bppUByu-10B3CFp8u=iKXG_OvtAguCw@mail.gmail.com Backpatch-through: 14	2026-05-16 18:01:35 -07:00
Michael Paquier	3dcd85d1b9	Simplify signature of ProcessStartupPacket() There is now only one caller of ProcessStartupPacket(). Let's simplify the routine so as the GSS and SSL states are tracked inside it. If future callers are added, there is less guessing to do. Suggested-by: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Discussion: https://postgr.es/m/aga7lCWluyc5zLb5@paquier.xyz	2026-05-17 07:44:17 +09:00
Michael Paquier	4111b91ab3	doc: Fix example of pg_restore_extended_stats() Oversight in `ba97bf9cb7`, probably due to an incorrect rebase. Author: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/8A72720A-55AC-4D41-B9DF-5610307600E6@gmail.com	2026-05-17 07:36:04 +09:00
Andres Freund	5ba34f6dc8	pg_test_timing: Show additional TSC clock source debug info In some cases it's necessary to understand whether TSC frequency data was sourced from CPUID, and which of the registers. Show this debug info at the end of pg_test_timing, and rework TSC functions to support that. This would have helped debug the buildfarm report fixed in `7fc36c5db5` and is likely going to aid in any TSC-related issues reported during the beta period or later. Additionally, emit a warning if TSC frequency from calibration differs by more than 10% from the TSC frequency in use, and suggest the use of timing_clock_source = 'system'. In passing, add an explicit early return in the output function if the loop count is zero. This can't happen in practice, but coverity complained because we unconditionally call output for the fast TSC measurement. Author: Lukas Fittl <lukas@fittl.com> Suggested-by: Andres Freund <andres@anarazel.de> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Haibo Yan <tristan.yim@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> (coverity fix only) Discussion: https://postgr.es/m/CAP53Pkw3Gzb+KTF5pu_o7tzbfZ7+qm2m6uDWuGtTJjZpV9yNpg@mail.gmail.com	2026-05-16 11:51:34 -04:00
Etsuro Fujita	aa1f93a338	postgres_fdw: Replace buffers in RemoteAttributeMapping with pointers. Commit `28972b6fc` ("Add support for importing statistics from remote servers.") stored the names of local/remote columns for a foreign table into the buffers of NAMEDATALEN bytes in this structure, without accounting for the possibility that the remote column name in particular could be longer than NAMEDATALEN - 1. If it was longer than that, this would leave it unterminated/truncated in the buffer, invoking undefined behavior when match_attrmap() processes it, which assumes that it's fully-contained/terminated in the buffer. To fix, replace the buffers with char pointers, pstrdup the local/remote column names, and store the results into the pointers. This commit also adds a function to clean up the nested data structure. Per Coverity and Tom Lane. Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Author: Corey Huinker <corey.huinker@gmail.com> Reviewed-by: Etsuro Fujita <etsuro.fujita@gmail.com> Discussion: https://postgr.es/m/342868.1776017700%40sss.pgh.pa.us	2026-05-16 17:55:00 +09:00
Jeff Davis	8eba2edb80	Check retain_dead_tuples for ALTER SUBSCRIPTION ... SERVER. Previously, the subscription setting retain_dead_tuples didn't cause ALTER SUBSCRIPTION ... SERVER to check the publisher. And if the publisher was checked for some other reason, then it would use the old conninfo. Fix ALTER SUBSCRIPTION ... SERVER to always check the publisher when retain_dead_tuples is set, and to use the new connection info, like ALTER SUBSCRIPTION ... CONNECTION. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/f13a8e29410bbbf9999290f2c04513a8884fa51c.camel@j-davis.com	2026-05-15 15:52:33 -07:00
Jeff Davis	6d22c67c3b	Don't accept length of -1 in pg_locale.h APIs. Reverts `ac30021356`. Per discussion, that commit interfered with useful tooling, and was not worth the special cases. Suggested-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/s32n3tm2mjh247f3xkkxkdk7cf77hglbr3ia3hrsdjylajou7y@nlldpag3tjd5	2026-05-15 11:09:15 -07:00
Bruce Momjian	41b60bf172	doc PG 19 relnotes: remove "Add fake LSN support to hash index" Also add missing commit link to json_array() item. Reported-by: Peter Geoghegan Discussion: https://postgr.es/m/CAH2-Wzm1UAuv9ih6_ATbwbmrmusKPoJ2qSo3HBF-JaUEkVYUPg@mail.gmail.com	2026-05-15 13:26:50 -04:00
Michael Paquier	27bdae8413	Re-add regression tests for ltree and intarray These tests have been removed by `906ea101d0`, due to some of them being unstable in the buildfarm with low max_stack_depth values. They are now reworked so as they should be more portable. The tests to cover the findoprnd() overflows use a balanced tree to avoid using too much stack, per a suggestion and an investigation by Tom Lane. Note: This is initially applied only on HEAD; a backpatch will follow should the buildfarm be fine with the situation. Discussion: https://postgr.es/m/agZc6XecyE7E7fep@paquier.xyz Backpatch-through: 14	2026-05-15 14:27:30 +09:00
Fujii Masao	e5035950da	psql: Fix tab completion for REPACK boolean options Previously, tab completion for REPACK parenthesized boolean options (ANALYZE, CONCURRENTLY, and VERBOSE) did not suggest the boolean values ON and OFF, unlike VACUUM. This commit fixes the issue by adding ON/OFF completion for those options. Author: Baji Shaik <baji.pgdev@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/CA+fm-RNZpy7MAceR9gSyy833H_uL-fTx0LxO73RnvwEaprpuRA@mail.gmail.com	2026-05-15 14:24:45 +09:00
Bruce Momjian	6b48f5d1a7	doc PG 19 relnotes: update to current	2026-05-14 16:37:47 -04:00
Nathan Bossart	611756948e	refint: Fix segfault in check_foreign_key(). When an UPDATE statement triggers check_foreign_key() with the action set to "cascade", it generates more UPDATE statements to modify the key values in referencing relations. If a new key value is NULL, SPI_getvalue() returns a NULL pointer, which is subsequently passed to quote_literal_cstr(), causing a segfault. To fix, skip quoting when a new key value is NULL and insert an unquoted NULL keyword instead. Oversight in commit `260e97733b`. While the refint documentation recommends marking primary key columns NOT NULL, the aforementioned scenario accidentally worked on platforms where snprintf() substitutes "(null)" for NULL pointers. Note that for character-type columns, the old code quoted "(null)" as a string literal, so this didn't always produce correct results. But it still seems better to fix this than to reject cases that previously worked. Reported-by: Nikita Kalinin <n.kalinin@postgrespro.ru> Author: Ayush Tiwari <ayushtiwari.slg01@gmail.com> Reviewed-by: Pierre Forstmann <pierre.forstmann@gmail.com> Discussion: https://postgr.es/m/19476-bd04ea6241345303%40postgresql.org Backpatch-through: 14	2026-05-14 13:11:49 -05:00
Masahiko Sawada	82f0135a26	Fix attribute mapping for COPY TO on partitioned tables. Commit `4bea91f21f` enabled COPY TO on a partitioned table to read tuples from its partitions and mapped them to the root table's tuple descriptor before output. However, it incorrectly built the attribute map from the root table to the partition. This commit fixes by building the attribute map from the partition to the root table, ensuring that partition attributes are correctly mapped to their corresponding root attributes. Author: Chao Li <lic@highgo.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Discussion: https://postgr.es/m/85EA70F3-C3DB-477B-B856-EA569FDAAE7C@gmail.com	2026-05-14 10:32:34 -07:00
Alexander Korotkov	ce146621f7	Prevent access to other sessions' temp tables Commit `b7b0f3f272` ("Use streaming I/O in sequential scans") routed sequential scans through read_stream_next_buffer(), bypassing the RELATION_IS_OTHER_TEMP() check in ReadBufferExtended(). As a result, a superuser can attempt to read or modify temp tables of other sessions through the read-stream path. When the query plan uses no index, SELECT/UPDATE/DELETE/MERGE silently see no rows / report zero affected rows, and COPY produces an empty output -- because the buffer manager has no visibility into the owning session's local buffers and silently returns nothing. Any query plan that uses, for instance, a btree index still errors out via the existing check in ReadBufferExtended(), which is reached from hio.c and nbtree respectively, but this is incidental. Fix by enforcing RELATION_IS_OTHER_TEMP() at the three additional buffer-manager entry points: - read_stream_begin_impl() rejects the read at stream setup time, covering sequential and bitmap scans that go through the read-stream path. - ReadBuffer_common() becomes the canonical place for the check, consolidating the existing one previously kept in ReadBufferExtended(). All ReadBufferExtended() callers go through ReadBuffer_common(), so the consolidation is behavior-preserving. - StartReadBuffersImpl() catches direct callers of StartReadBuffers() that bypass both of the above. This is currently defense-in-depth, but documents the contract for future code. The companion test in src/test/modules/test_misc was added in the preceding commit; this commit updates the assertions for SELECT, UPDATE, DELETE, MERGE, and COPY (which previously documented the bug as silent success) to expect the new error. 
Author: Jim Jones <jim.jones@uni-muenster.de> Author: Daniil Davydov <3danissimo@gmail.com> Co-authored-by: Alexander Korotkov <aekorotkov@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Soumya S Murali <soumyamurali.work@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAJDiXghdFcZ8%3Dnh4G69te7iRr3Q0uFyXxb3ZdG09_GTNZXwH0g%40mail.gmail.com Backpatch-through: 17	2026-05-14 15:01:17 +03:00
Alexander Korotkov	1fee0e857e	Add tests for cross-session temp table access Add a TAP test in src/test/modules/test_misc that documents what happens when one session attempts to read or modify another session's temporary table. This commit only adds tests; it does not change backend behavior, so the assertions reflect current behavior: - SELECT, UPDATE, DELETE, MERGE, COPY on a table without an index silently succeed with no error and zero rows / zero affected rows. These commands run through the read-stream path, which currently bypasses the RELATION_IS_OTHER_TEMP() check. This is the underlying bug to be fixed in a follow-up. - INSERT errors with "cannot access temporary tables of other sessions" because hio.c calls ReadBufferExtended() to find a page with free space and is caught by the existing check there. - Index scan errors via the same existing check, reached through nbtree -> ReadBuffer -> ReadBufferExtended. - TRUNCATE / ALTER TABLE / ALTER INDEX / CLUSTER fail with their command-specific error messages. - VACUUM is silently skipped to avoid noise during database-wide VACUUM (vacuum_rel() returns without warning). - DROP TABLE is intentionally allowed: DROP does not touch the table's contents, and autovacuum relies on this to clean up temp relations orphaned by a crashed backend. - ALTER FUNCTION / DROP FUNCTION on an owner-created function over its own temp row type work as catalog operations -- they don't read the underlying data. - CREATE FUNCTION from a separate session, using another session's temp row type as an argument, is allowed but emits a NOTICE: the function is moved into the creator's pg_temp namespace with an auto-dependency on the borrowed type, so it disappears together with the session that created it. - A bare DROP TABLE on a temp table that has a cross-session dependent function fails with a catalog-level dependency error. 
- LOCK TABLE in ACCESS SHARE mode on another session's temp table succeeds and properly blocks the owner's session-exit cleanup (which acquires AccessExclusiveLock via findDependentObjects). This exercises the same LockRelationOid path used by autovacuum when cleaning up orphaned temp relations. - When the owner session ends, the normal session-exit cleanup cascades through DEPENDENCY_NORMAL and removes both the temp objects and any cross-session functions that depended on them. Also, document the contract for RELATION_IS_OTHER_TEMP() so that future buffer-access entry points enforce the same rule. Backpatch this through PostgreSQL 17, where `b7b0f3f272` introduces a code path bypassing this check. Author: Jim Jones <jim.jones@uni-muenster.de> Author: Daniil Davydov <3danissimo@gmail.com> Co-authored-by: Alexander Korotkov <aekorotkov@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Soumya S Murali <soumyamurali.work@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAJDiXghdFcZ8%3Dnh4G69te7iRr3Q0uFyXxb3ZdG09_GTNZXwH0g%40mail.gmail.com Backpatch-through: 17	2026-05-14 15:01:17 +03:00
Etsuro Fujita	5107398e6d	postgres_fdw: Fix deparsing of remote column names in stats import. build_remattrmap() deparses a list of remote column names for a query that retrieves attribute stats for them from the remote server. Previously, it did so by using the array-literal syntax with each column name individually quoted by quote_identifier(), causing the query to fail on the remote server with a syntax error or no results when that column name included a single quote or backslash, as quote_identifier() doesn't escape those characters, making the query invalid or incorrect. Fix by switching from the array-literal syntax to the ARRAY constructor syntax with each column name individually quoted by deparseStringLiteral(). Oversight in commit `28972b6fc`. Reported-by: Satya Narlapuram <satyanarlapuram@gmail.com> Reported-by: Ayush Tiwari <ayushtiwari.slg01@gmail.com> Author: Ayush Tiwari <ayushtiwari.slg01@gmail.com> Reviewed-by: Alex Guo <guo.alex.hengchen@gmail.com> Reviewed-by: Zhenwei Shang <a934172442@gmail.com> Reviewed-by: Etsuro Fujita <etsuro.fujita@gmail.com> Discussion: https://postgr.es/m/CAHg%2BQDc9%3DWtYi%3DJW6QUL6ASOJc6PcGPTuxoMkhnkQ7oi7j5atg%40mail.gmail.com Discussion: https://postgr.es/m/CAJTYsWWGhVDFjr%2BsmdYdU-Q_TT9YMzXA4QcLCr7rizDOyrEEow%40mail.gmail.com	2026-05-14 17:05:00 +09:00
Michael Paquier	954e57708e	Fix jsonpath .split_part() to honor silent mode The jsonpath .split_part() method passed its field-position argument through numeric_int4(), that can fail hard if called directly. This commit switches the code to use numeric_int4_safe() with an error context for soft reporting, so as the overflow and zero field-position cases can be handled in silent mode. Oversight in `bd4f879a9c`. Author: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com> Discussion: https://postgr.es/m/FCF996D0-580B-431C-8DE1-A540C58E444C@gmail.com	2026-05-14 16:02:07 +09:00
Fujii Masao	61f8a85a57	pgbench: fix verbose error message corruption with multiple threads When pgbench runs with multiple threads and verbose error reporting is enabled (--verbose-errors), multiple clients can build verbose error messages concurrently. Previously, a function-local static PQExpBuffer was used for these messages, causing the buffer to be shared across threads. This was not thread-safe and could result in corrupted or incorrect log output. Fix this by using a local PQExpBufferData instead of a static buffer. This keeps verbose error messages correct during concurrent execution. Backpatch to v15, where this issue was introduced. Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Alex Guo <guo.alex.hengchen@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CAHGQGwER1AjGXpkKB9t9820NBhMQ_Ghv7=HsKeodUr3=SZsF4g@mail.gmail.com Backpatch-through: 15	2026-05-14 12:30:34 +09:00
Nathan Bossart	0c025ab347	Add several commits to .git-blame-ignore-revs.	2026-05-13 14:53:48 -05:00
Álvaro Herrera	3bf63730cb	Fix style in a few REPACK ereports Use consistent "REPACK (CONCURRENTLY)" naming in errhint messages, matching the actual command syntax and the errmsg text used elsewhere in the same file. Also improve the ereport() after XLogReadRecord failure to be like others in the tree. While at it, remove direct mentions of the DDL in the translatable strings, both in the same errhint() calls as well as some errmsg() calls. Add periods where missing. These are all oversights in `28d534e2ae`. Reported-by: Baji Shaik <baji.pgdev@gmail.com> Discussion: https://postgr.es/m/CA+fm-RPxX1xTcYY4qQGPRDXB2-Fy2SDNdZi=zVjr0j=MPg2PaA@mail.gmail.com	2026-05-13 18:28:31 +02:00
Tom Lane	2122281672	Use "grep -E" not "egrep". "egrep" has never been in POSIX; the standard way to access this functionality is "grep -E". Recent versions of GNU grep have started to warn about this, so stop using "egrep". This could be back-patched, but I see little need to do so because the affected places are not code that runs during normal builds. (Perhaps src/backend/port/aix/mkldexport.sh is an exception, but let's wait to see if any AIX users complain before touching that.) Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/473272.1778685870@sss.pgh.pa.us	2026-05-13 12:07:19 -04:00
Tom Lane	b94989e73d	Pre-beta updates: run src/tools/copyright.pl. As usual, post-1-Jan patches missed some copyright-year updates.	2026-05-13 11:01:57 -04:00
Tom Lane	c7cb8e5b73	Do pre-release housekeeping on catalog data. Run renumber_oids.pl to move high-numbered OIDs down, as per pre-beta tasks specified by RELEASE_CHANGES. For reference, the command was ./renumber_oids.pl --first-mapped-oid 8000 --target-oid 6400 (but there were already some used OIDs at 6400, so the first one actually assigned was 6434).	2026-05-13 10:54:44 -04:00
Tom Lane	652ae1a520	Add preceding commits to .git-blame-ignore-revs.	2026-05-13 10:44:36 -04:00
Tom Lane	719fe0779d	Pre-beta mechanical code beautification, step 3: run reformat-dat-files.	2026-05-13 10:41:33 -04:00
Tom Lane	736a97bddd	Pre-beta mechanical code beautification, step 2: run pgperltidy. It's as opinionated as ever.	2026-05-13 10:37:42 -04:00
Tom Lane	020794ee42	Pre-beta mechanical code beautification, step 1: run pgindent. Update typedefs.list from the buildfarm, and run pgindent. The changes from the new typedefs list are pretty minimal, since we'd been pretty good (not perfect) about updating typedefs.list by hand. But the pgindent behavior changes installed by `a3e6beba6`, `b518ba4af`, and `60f9467c3` add up to make this a relatively sizable diff.	2026-05-13 10:34:17 -04:00
Tom Lane	60f9467c38	pgindent: improve formatting of multiline comments. Enforce this standard formatting of multiline comments that start in column 1: /* * line 1 * line 2 */ Unlike indented comments, we don't reconsider line breaks, except for forcing the initial /* and trailing */ onto their own lines. We do make each line start with " *", with some whitespace following. We preserve pgindent's existing behavior of not touching comments that begin with /**... or /*-... Also, if the first line looks like /* === or /* ---, we don't split that line; similarly for the last line. The vast majority of multiline comments in our tree already look like this, but this change will clean up some stragglers. Author: Aleksander Alekseev <aleksander@tigerdata.com> Reported-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Arseniy Mukhin <arseniy.mukhin.dev@gmail.com> Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAJ7c6TPQ0kkHQG-AqeAJ3PV_YtmDzcc7s%2B_V4%3Dt%2BxgSnZm1cFw%40mail.gmail.com Discussion: https://postgr.es/m/EB0141C5-ACC2-4F0B-85EA-0E3AFBCE322F@umbc.edu	2026-05-13 10:21:54 -04:00
Tom Lane	b518ba4aff	Make pg_bsd_indent add a space between comma and period. Formatting of variadic functions and struct literals with named fields used to be ugly due to pg_bsd_indent treating period as always being a binary operator. After a comma, it's not that, so insert a space. Bump pg_bsd_indent's version so that people who use out-of-tree copies will know they need to update. (This also covers the other pg_bsd_indent behavioral change introduced in a3e6beba6.) Author: Andreas Karlsson <andreas@proxel.se> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/c3327be8-09e2-46a1-88b4-228a339d6916@proxel.se	2026-05-13 10:17:57 -04:00
Nathan Bossart	a3e6beba60	pgindent: Fix spacing after != when member name matches typedef. When a struct member name matches a registered typedef, pgindent removes the space after "!=" (and some other operators), like so: entry->dsh.dsa_handle !=DSA_HANDLE_INVALID The problem is that the related code in lexi.c sets last_u_d to true before jumping to found_typename, causing the next operator to be classified as unary and suppressing the following space. This is correct for type names, but not for struct members. For example, "Datum *x" needs "*" to be unary to suppress the space before "x". To fix, only set last_u_d before jumping to found_typename if the typedef name doesn't appear after "." or "->". Note that this does not bump INDENT_VERSION. We'll do that just once after some other changes to pg_bsd_indent are committed. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/aS9hkwnkWf3dZIA_%40nathan	2026-05-13 09:10:50 -05:00
Peter Eisentraut	7ca8c94296	Fix FOR PORTION OF with non-updatable view columns Both UPDATE and DELETE were failing to test that the application-time column was updatable. The column is not part of perminfo->updatedCols, because it should not be checked for permissions. And it needs to be checked in the DELETE case as well, since we might insert leftovers with a value for that column. Author: Paul A. Jungwirth <pj@illuminatedcomputing.com> Co-authored-by: jian he <jian.universality@gmail.com> Discussion: https://www.postgresql.org/message-id/CACJufxFRqg8%3DgbZ-Q6ZS_UQ%2BYdwfZpk%2B9rf7jgWrk8m4RMUm%3DA%40mail.gmail.com	2026-05-13 13:44:28 +02:00
Michael Paquier	6636621782	pg_stat_statements: Set PlannedStmt to NULL after nested utility execution As mentioned in `8268e41aca`, pgss_ProcessUtility() may free the PlannedStmt after an internal ROLLBACK. This commit sets the PlannedStmt "pstmt" to NULL once it is no longer safe to rely on it, making bugs similar to the one fixed by the previous commit easier to detect. Suggested-by: Andres Freund <andres@anarazel.de> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/0A9A8DAC-BC3C-4C7A-9504-2C6050405544@anarazel.de	2026-05-13 15:39:44 +09:00
Michael Paquier	900c07b854	Add more tests for corrupted data with pglz_decompress() Two cases fixed by `2b5ba2a0a1` were not covered, to emulate the handling of corrupted data, for: - set control bit with a valid 2-byte match tag where offset is 0. - set control bit with a valid 2-byte match tag where offset exceeds output written. Oversight in `67d318e704`. Reviewed-by: Ayush Tiwari <ayushtiwari.slg01@gmail.com> Discussion: https://postgr.es/m/agF4xkIdRcrCIprs@paquier.xyz Backpatch-through: 14	2026-05-13 14:43:42 +09:00
Fujii Masao	422e54e309	Fix stale COPY progress during logical replication table sync Previously, pg_stat_progress_copy in the subscriber could continue to show the initial COPY operation for logical replication table synchronization as active even after the data copy had finished. The stale progress entry remained visible until synchronization caught up with the publisher. This happened because the table synchronization code called BeginCopyFrom() and CopyFrom(), but failed to call EndCopyFrom() afterward. This commit fixes the issue by adding the missing EndCopyFrom() call so that the COPY progress state in the subscriber is cleared as soon as the initial data copy completes. Backpatch to all supported branches. Author: Shinya Kato <shinya11.kato@gmail.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: ChangAo Chen <cca5507@qq.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CAOzEurQKuy3RiPkd=25PEwEzaqHuGvEOf=X7vaVzhgNjaukYzA@mail.gmail.com Backpatch-through: 14	2026-05-13 11:44:31 +09:00
Bruce Momjian	34be85f657	psql: save/restore truePrint/falsePrint printQueryOpt values Reported-by: a.kozhemyakin Author: David G. Johnston Discussion: https://postgr.es/m/83e247ed-0b2d-4aba-bc42-e7bbc20be0d6@postgrespro.ru	2026-05-12 18:28:20 -04:00
Bruce Momjian	cac0f24eb5	doc PG 19 relnotes: add two optimizer hooks Reported-by: Jian He Discussion: https://postgr.es/m/CACJufxE8Ew_DCXtd1VZSC=pNPHqZRa4RJkbCr7z6ZPJJ3o3hGQ@mail.gmail.com	2026-05-12 16:16:33 -04:00
Tom Lane	163f20ca12	De-obfuscate the comment in tsrank.c's calc_rank_or(). Oleg's original comment was intelligible only to him. Aleksander has reverse-engineered what seems like a plausible explanation of what the code is trying to do, so replace the comment with that. (Also, re-order the final expression to match the new comment.) In passing, this makes the comment satisfy our usual formatting conventions. pgindent has let it pass as-is so far, but planned changes would mess it up without some sort of intervention. Author: Aleksander Alekseev <aleksander@tigerdata.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAJ7c6TO0xvunpeOv89i1eKQBhKF9=GEETkTz+yAGs1xGYH25MQ@mail.gmail.com	2026-05-12 15:21:36 -04:00
Bruce Momjian	06fccab4c6	doc PG 19 relnotes: remove "Optionally" for CPU optimizations Reported-by: John Naylor Discussion: https://postgr.es/m/CANWCAZZWfdoMcemSaTMon+e6aCkSABN3+sco0aStC90cFPVE4A@mail.gmail.com	2026-05-12 15:13:46 -04:00
Peter Eisentraut	7b22f15a01	Add psql tab completion for FOR PORTION OF clause Add tab completion support in psql for the FOR PORTION OF clause used in UPDATE and DELETE statements with temporal tables. For both UPDATE and DELETE, completion now guides users through: <table> FOR -> PORTION -> OF -> <column> -> FROM Author: Kiran Kaki <itskkpg@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CAD0dvCQLqLzPrQJRjjA2qXDH%3DD%2BXShcxhbSPxNhVruC8HGhkbQ%40mail.gmail.com	2026-05-12 17:24:01 +02:00
Michael Paquier	8268e41aca	pg_stat_statements: Fix potential use-after-free of PlannedStmt pgss_ProcessUtility() included a reference to a portion of a PlannedStmt after the point where this data's structure could have been freed, causing an incorrect memory access. There was a comment documenting this requirement, missed in `3357471cf9`. This commit includes a test able to make valgrind complain with a PlannedStmt freed by an internal ROLLBACK query. Similarly to what is mentioned in `495e73c207`, this can be triggered by using the extended query protocol, something that can be now tested thanks to the recent meta-command additions in psql. This commit mentions potential other cases, but as far as I can see the extended protocol case with an internal ROLLBACK is the only problematic pattern reachable in practice. Issue introduced by `3357471cf9`, gone unnoticed due to a lack of test coverage. The fix is authored by Chao, my contribution being the new test. Author: Chao Li <li.evan.chao@gmail.com> Co-authored-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/2F91906A-F2B5-4A6B-9695-D136957D4545@gmail.com	2026-05-12 13:36:38 +09:00
Bruce Momjian	8974a7c433	doc PG 19 relnotes: adjustments/removal of items Reported-by: John Naylor Discussion: https://postgr.es/m/CANWCAZZWfdoMcemSaTMon+e6aCkSABN3+sco0aStC90cFPVE4A@mail.gmail.com	2026-05-11 17:43:15 -04:00
Heikki Linnakangas	c3f7dde39e	Use palloc_array() in a few more places to avoid overflow These could overflow on 32-bit systems. Backpatch-through: 14 Security: CVE-2026-6473	2026-05-11 21:27:55 +03:00
Álvaro Herrera	36f52a59b3	Fix REPACK with WITHOUT OVERLAPS replica identity indexes REPACK replay builds scan keys for the replica identity index, but it hard-coded BTEqualStrategyNumber when looking up the equality operator. That is not correct for non-btree identity indexes, such as the GiST indexes created for WITHOUT OVERLAPS primary keys. In addition, find_target_tuple() accepted the first tuple returned by the identity index scan, which is unsafe for lossy index scans because the index AM may return false positives with xs_recheck set. Fix this by using IndexAmTranslateCompareType() to translate COMPARE_EQ to the equality strategy number for the index AM, and by continuing the scan when recheck is required until a candidate tuple matches the locator tuple on all replica identity key columns. The recheck uses the same equality operator functions as the identity index scan keys, preserving ScanKey argument ordering. Author: Chao Li <lic@highgo.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/7B0EC0EC-5461-41EF-9B31-F9BBE608DEA5@gmail.com	2026-05-11 18:17:46 +02:00
Tom Lane	906ea101d0	Remove test cases for field overflows in intarray and ltree. These checks are failing in the buildfarm, reporting stack overflows rather than the expected errors, though seemingly only on ppc64 and s390x platforms. Perhaps there is something off about our tests for stack depth on those architectures? But there's no time to debug that right now, and surely these tests aren't too essential. Revert for now and plan to revisit after the release dust settles. Backpatch-through: 14 Security: CVE-2026-6473	2026-05-11 12:12:03 -04:00
Nathan Bossart	260e97733b	refint: Fix SQL injection and buffer overruns. Maliciously crafted key value updates could achieve SQL injection within check_foreign_key(). To fix, ensure new key values are properly quoted and escaped in the internally generated SQL statements. While at it, avoid potential buffer overruns by replacing the stack buffers for internally generated SQL statements with StringInfo. Reported-by: Nikolay Samokhvalov <nik@postgres.ai> Author: Nathan Bossart <nathandbossart@gmail.com> Reviewed-by: Noah Misch <noah@leadboat.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Security: CVE-2026-6637 Backpatch-through: 14	2026-05-11 05:13:47 -07:00
Nathan Bossart	bd48114937	Mark PQfn() unsafe and fix overrun in frontend LO interface. When result_is_int is set to 0, PQfn() cannot validate that the result fits in result_buf, so it will write data beyond the end of the buffer when the server returns more data than requested. Since this function is insecurable and obsolete, add a warning to the top of the pertinent documentation advising against its use. The only in-tree caller of PQfn() is the frontend large object interface. To fix that, add a buf_size parameter to pqFunctionCall3() that is used to protect against overruns, and use it in a private version of PQfn() that also accepts a buf_size parameter. Reported-by: Yu Kunpeng <yu443940816@live.com> Reported-by: Martin Heistermann <martin.heistermann@unibe.ch> Author: Nathan Bossart <nathandbossart@gmail.com> Reviewed-by: Noah Misch <noah@leadboat.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Etsuro Fujita <etsuro.fujita@gmail.com> Security: CVE-2026-6477 Backpatch-through: 14	2026-05-11 05:13:47 -07:00
Heikki Linnakangas	6d68fcb28f	Fix integer overflow in array_agg(), when the array grows too large If you accumulate many arrays full of NULLs, you could overflow 'nitems', before reaching the MaxAllocSize limit on the allocations. Add an explicit check that the number of items doesn't grow too large. With more than MaxArraySize items, getting the final result with makeArrayResultArr() would fail anyway, so better to error out early. Reported-by: Xint Code Author: Heikki Linnakangas <heikki.linnakangas@iki.fi> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Backpatch-through: 14 Security: CVE-2026-6473	2026-05-11 05:13:47 -07:00
Tom Lane	b2869ebc43	Fix integer-overflow and alignment hazards in locale-related code. pg_locale_icu.c was full of places where a very long input string could cause integer overflow while calculating a buffer size, leading to buffer overruns. It also was cavalier about using char-type local arrays as buffers holding arrays of UChar. The alignment of a char[] variable isn't guaranteed, so that this risked failure on alignment-picky platforms. The lack of complaints suggests that such platforms are very rare nowadays; but it's likely that we are paying a performance price on rather more platforms. Declare those arrays as UChar[] instead, keeping their physical size the same. pg_locale_libc.c's strncoll_libc_win32_utf8() also had the disease of assuming it could double or quadruple the input string length without concern for overflow. Reported-by: Xint Code Reported-by: Pavel Kohout <pavel.kohout@aisle.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Backpatch-through: 14 Security: CVE-2026-6473	2026-05-11 05:13:47 -07:00
Michael Paquier	a1063eeced	Prevent path traversal in pg_basebackup and pg_rewind pg_rewind and pg_basebackup could be fed paths from rogue endpoints that could overwrite the contents of the client when received, achieving path traversal. There were two areas in the tree that were sensitive to this problem: - pg_basebackup, through the astreamer code, where no validation was performed before building an output path when streaming tar data. This is an issue in v15 and newer versions. - pg_rewind file operations for paths received through libpq, for all the stable branches supported. In order to address this problem, this commit adds a helper function in path.c, that reuses path_is_relative_and_below_cwd() after applying canonicalize_path(). This can be used to validate the paths received from a connection point. A path is considered invalid if any of the two following conditions is satisfied: - The path is absolute. - The path includes a direct parent-directory reference. Reported-by: XlabAI Team of Tencent Xuanwu Lab Reported-by: Valery Gubanov <valerygubanov95@gmail.com> Author: Michael Paquier <michael@paquier.xyz> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Backpatch-through: 14 Security: CVE-2026-6475	2026-05-11 05:13:47 -07:00
Nathan Bossart	6a985e71e9	Avoid overflow in size calculations in formatting.c. A few functions in this file were incautious about multiplying a possibly large integer by a factor more than 1 and then using it as an allocation size. This is harmless on 64-bit systems where we'd compute a size exceeding MaxAllocSize and then fail, but on 32-bit systems we could overflow size_t, leading to an undersized allocation and buffer overrun. To fix, use palloc_array() or mul_size() instead of handwritten multiplication. Reported-by: Sven Klemm <sven@tigerdata.com> Reported-by: Xint Code Author: Nathan Bossart <nathandbossart@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Tatsuo Ishii <ishii@postgresql.org> Security: CVE-2026-6473 Backpatch-through: 14	2026-05-11 05:13:47 -07:00
Nathan Bossart	4793fc41f8	Check CREATE privilege on multirange type schema in CREATE TYPE. This omission allowed roles to create multirange types in any schema, potentially leading to privilege escalations. Note that when a multirange type name is not specified in CREATE TYPE, it is automatically placed in the range type's schema, which is checked at the beginning of DefineRange(). Reported-by: Jelte Fennema-Nio <postgres@jeltef.nl> Author: Jelte Fennema-Nio <postgres@jeltef.nl> Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Reviewed-by: Tomas Vondra <tomas@vondra.me> Security: CVE-2026-6472 Backpatch-through: 14	2026-05-11 05:13:47 -07:00
Nathan Bossart	d389415ffa	pg_createsubscriber: Obstruct SQL injection via subscription names. drop_existing_subscription() neglected to escape the subscription name when generating its query string. To fix, use PQescapeIdentifier() to construct a properly escaped name, and use it in the ALTER SUBSCRIPTION and DROP SUBSCRIPTION commands. Reported-by: Yu Kunpeng <yu443940816@live.com> Author: Nathan Bossart <nathandbossart@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Security: CVE-2026-6476 Backpatch-through: 17	2026-05-11 05:13:47 -07:00
Michael Paquier	6d6348f032	Fix MCV input array checks in statistics restore functions The SQL functions for the restore of attribute and expression statistics accept "most_common_vals" and "most_common_freqs" as independent arrays. The planner assumes these have the same number of elements, but it was possible to insert in the catalogs data that would cause an over-read when the catalog data is loaded in the planner. There were two holes in the stats restore logic: - Both arrays should match in size. - The input array must be one-dimensional, and it should match with what is delivered by pg_dump when scanning the pg_stats catalogs. The multivariate extended statistics MCV path (import_mcv) already validated these inputs via check_mcvlist_array(), and is not affected. These problems exist in v18 and newer versions for the restore of attribute statistics. These problems affect only HEAD for the restore of the expression statistics. Reported-by: Jeroen Gui <jeroen.gui1@proton.me> Author: Michael Paquier <michael@paquier.xyz> Reviewed-by: Amit Langote <amitlangote09@gmail.com> Reviewed-by: John Naylor <johncnaylorls@gmail.com> Security: CVE-2026-6575 Backpatch-through: 18	2026-05-11 05:13:46 -07:00
Tom Lane	ec8ded4b32	Guard against unsafe conditions in usage of pg_strftime(). Although pg_strftime() has defined error conditions, no callers bother to check for errors. This is problematic because the output string is very likely not null-terminated if an error occurs, so that blindly using it is unsafe. Rather than trusting that we can find and fix all the callers, let's alter the function's API spec slightly: make it guarantee a null-terminated result so long as maxsize > 0. Furthermore, if we do get an error, let's make that null-terminated result be an empty string. We could instead truncate at the buffer length, but that risks producing mis-encoded output if the tz_name string contains multibyte characters. It doesn't seem reasonable for src/timezone/ to make use of our encoding-aware truncation logic. Also, the only really likely source of a failure is a user-supplied timezone name that is intentionally trying to overrun our buffers. I don't feel a need to be particularly friendly about that case. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: John Naylor <johncnaylorls@gmail.com> Backpatch-through: 14 Security: CVE-2026-6474	2026-05-11 05:13:46 -07:00
Tom Lane	76ab76f875	Avoid passing unintended format codes to snprintf(). timeofday() assumed that the output of pg_strftime() could not contain % signs, other than the one it explicitly asks for with %%. However, we don't have that guarantee with respect to the time zone name (%Z). A crafted time zone setting could abuse the subsequent snprintf() call, resulting in crashes or disclosure of server memory. To fix, split the pg_strftime() call into two and then treat the outputs as literal strings, not a snprintf format string. The extra pg_strftime() call doesn't really cost anything, since the bulk of the conversion work was done by pg_localtime(). Also, adjust buffer widths so that we're not risking string truncation during the snprintf() step, as that would create a hazard of producing mis-encoded output. This also fixes a latent portability issue: the format string expects an int, but tp.tv_usec is long int on many platforms. Reported-by: Xint Code Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: John Naylor <johncnaylorls@gmail.com> Backpatch-through: 14 Security: CVE-2026-6474	2026-05-11 05:13:46 -07:00
Noah Misch	46b4f5c11b	Fix SQL injection in logical replication origin checks. ALTER SUBSCRIPTION ... REFRESH PUBLICATION interpolates schema and relation names into SQL without quoting them. A crafted subscriber relation name can inject arbitrary SQL on the publisher. Test such a name. Back-patch to v16, where commit `8756930190` first appeared. Reported-by: Pavel Kohout <pavel.kohout@aisle.com> Author: Pavel Kohout <pavel.kohout@aisle.com> Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Backpatch-through: 16 Security: CVE-2026-6638	2026-05-11 05:13:46 -07:00
Michael Paquier	5924e256c4	Apply timingsafe_bcmp() in authentication paths This commit applies timingsafe_bcmp() to authentication paths that handle attributes or data previously compared with memcmp() or strcmp(), which are sensitive to timing attacks. The following data is concerned by this change, some being in the backend and some in the frontend: - For a SCRAM or MD5 password, the computed key or the MD5 hash compared with a password during a plain authentication. - For a SCRAM exchange, the stored key, the client's final nonce and the server nonce. - RADIUS (up to v18), the encrypted password. - For MD5 authentication, the MD5(MD5()) hash. Reported-by: Joe Conway <mail@joeconway.com> Security: CVE-2026-6478 Author: Michael Paquier <michael@paquier.xyz> Reviewed-by: John Naylor <johncnaylorls@gmail.com> Backpatch-through: 14	2026-05-11 05:13:46 -07:00
Tom Lane	43451a7a2b	Guard against overflow in "left" fields of query_int and ltxtquery. contrib/intarray's query_int type uses an int16 field to hold the offset from a binary operator node to its left operand. However, it allows the number of nodes to be as much as will fit in MaxAllocSize, so there is a risk of overflowing int16 depending on the precise shape of the tree. Simple right-associative cases like "a | b | c | ..." work fine, so we should not solve this by restricting the overall number of nodes. Instead add a direct test of whether each individual offset is too large. contrib/ltree's ltxtquery type uses essentially the same logic and has the same 16-bit restriction. (The core backend's tsquery.c has a variant of this logic too, but in that case the target field is 32 bits, so it is okay so long as varlena datums are restricted to 1GB.) In v16 and up, these types support soft error reporting, so we have to complicate the recursive findoprnd function's API a bit to allow the complaint to be reported softly. v14/v15 don't need that. Undocumented and overcomplicated code like this makes my head hurt, so add some comments and simplify while at it. Reported-by: Xint Code Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Michael Paquier <michael@paquier.xyz> Backpatch-through: 14 Security: CVE-2026-6473	2026-05-11 05:13:46 -07:00
Michael Paquier	b63f25bddf	Fix unbounded recursive handling of SSL/GSS in ProcessStartupPacket() The handling of SSL and GSS negotiation messages in ProcessStartupPacket() could cause a recursion of the backend, ultimately crashing the server as the negotiation attempts were not tracked across multiple calls processing startup packets. A malicious client could therefore alternate rejected SSL and GSS requests indefinitely, each adding a stack frame, until the backend crashed with a stack overflow, taking down a server. This commit addresses this issue by modifying ProcessStartupPacket() so as processed negotiation attempts are tracked, preventing infinite recursive attempts. A TAP test is added to check this problem, where multiple SSL and GSS negotiated attempts are stacked. Reported-by: Calif.io in collaboration with Claude and Anthropic Research Author: Michael Paquier <michael@paquier.xyz> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Security: CVE-2026-6479 Backpatch-through: 14	2026-05-11 05:13:46 -07:00
Tom Lane	c55cea5290	Fix assorted places that need to use palloc_array(). multirange_recv and BlockRefTableReaderNextRelation were incautious about multiplying a possibly-large integer by a factor more than 1 and then using it as an allocation size. This is harmless on 64-bit systems where we'd compute a size exceeding MaxAllocSize and then fail, but on 32-bit systems we could overflow size_t leading to an undersized allocation and buffer overrun. Fix these places by using palloc_array() instead of a handwritten multiplication. (In HEAD, some of them were fixed already, but none of that work got back-patched at the time.) In addition, BlockRefTableReaderNextRelation passes the same value to BlockRefTableRead's "int length" parameter. If built for 64-bit frontend code, palloc_array() allows a larger array size than it otherwise would, potentially allowing that parameter to overflow. Add an explicit check to forestall that and keep the behavior the same cross-platform. Reported-by: Xint Code Author: Tom Lane <tgl@sss.pgh.pa.us> Backpatch-through: 14 Security: CVE-2026-6473	2026-05-11 05:13:46 -07:00
Tom Lane	066b7b144f	Prevent buffer overrun in unicode_normalize(). Some UTF8 characters decompose to more than a dozen codepoints. It is possible for an input string that fits into well under 1GB to produce more than 4G decomposed codepoints, causing unicode_normalize()'s decomp_size variable to wrap around to a small positive value. This results in a small output buffer allocation and subsequent buffer overrun. To fix, test after each addition to see if we've overrun MaxAllocSize, and break out of the loop early if so. In frontend code we want to just return NULL for this failure (treating it like OOM). In the backend, we can rely on the following palloc() call to throw error. I also tightened things up in the calling functions in varlena.c, using size_t rather than int and allocating the input workspace with palloc_array(). These changes are probably unnecessary given the knowledge that the original input and the normalized output_chars array must fit into 1GB, but it's a lot easier to believe the code is safe with these changes. Reported-by: Xint Code Reported-by: Bruce Dang <bruce@calif.io> Author: Tom Lane <tgl@sss.pgh.pa.us> Co-authored-by: Heikki Linnakangas <hlinnaka@iki.fi> Backpatch-through: 14 Security: CVE-2026-6473	2026-05-11 05:13:46 -07:00
Tom Lane	0dc1fdc75e	Harden our regex engine against integer overflow in size calculations. The number of NFA states, number of NFA arcs, and number of colors are all bounded to reasonably small values. However, there are places where we try to allocate arrays sized by products of those quantities, and those calculations could overflow, enabling buffer-overrun attacks. In practice there's no problem on 64-bit machines, but there are some live scenarios on 32-bit machines. A related problem is that citerdissect() and creviterdissect() allocate arrays based on the length of the input string, which potentially could overflow. To fix, invent MALLOC_ARRAY and REALLOC_ARRAY macros that rely on palloc_array_extended and repalloc_array_extended with the NO_OOM option, similarly to the existing MALLOC and REALLOC macros. (Like those, they'll throw an error not return a NULL result for oversize requests. This doesn't really fit into the regex code's view of error handling, but it'll do for now. We can consider whether to change that behavior in a non-security follow-up patch.) I installed similar defenses in the colormap construction code. It's not entirely clear whether integer overflow is possible there, but analyzing the behavior in detail seems not worth the trouble, as the risky spots are not in hot code paths. I left a bunch of calls as-is after verifying that they can't overflow given reasonable limits on nstates and narcs. Those limits were enforced already via REG_MAX_COMPILE_SPACE, but add commentary to document the interactions. In passing, also fix a related edge case, which is that the special color numbers used in LACON carcs could overflow the "color" data type, if ncolors is close to MAX_COLOR. In v14 and v15, the regex engine calls malloc() directly instead of using palloc(), so MALLOC_ARRAY and REALLOC_ARRAY do likewise. Reported-by: Xint Code Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Backpatch-through: 14 Security: CVE-2026-6473	2026-05-11 05:13:46 -07:00
Tom Lane	46593aea0a	Make palloc_array() and friends safe against integer overflow. Sufficiently large "count" arguments could result in undetected overflow, causing the allocated memory chunk to be much smaller than what the caller will subsequently write into it. This is unlikely to be a hazard with 64-bit size_t but can sometimes happen on 32-bit builds, primarily where a function allocates workspace that's significantly larger than its input data. Rather than trying to patch the at-risk callers piecemeal, let's just redefine these macros so that they always check. To do that, move the longstanding add_size() and mul_size() functions into palloc.h and mcxt.c, and adjust them to not be specific to shared-memory allocation. Then invent palloc_mul(), palloc0_mul(), palloc_mul_extended() to use these functions. Actually, the latter use inlined copies to save one function call. repalloc_array() gets similar treatment. I didn't bother trying to inline the calls for repalloc0_array() though. In v14 and v15, this also adds repalloc_extended(), which previously was only available in v16 and up. We need copies of all this in fe_memutils.[hc] as well, since that module also provides palloc_array() etc. Reported-by: Xint Code Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Backpatch-through: 14 Security: CVE-2026-6473	2026-05-11 05:13:46 -07:00
Michael Paquier	d388e1d7f0	Fix overflows with ts_headline() The options "StartSel", "StopSel" and "FragmentDelimiter" given by a caller of the SQL function ts_headline() have their lengths stored as int16. When providing values larger than PG_INT16_MAX, it was possible to overflow the length values stored, leading to incorrect behaviors in generateHeadline(), in most cases translating to a crash. Attempting to use values for these options larger than PG_INT16_MAX is now blocked. Some test cases are added to cover our tracks. Reported-by: Xint Code Author: Michael Paquier <michael@paquier.xyz> Backpatch-through: 14 Security: CVE-2026-6473	2026-05-11 05:13:46 -07:00
Michael Paquier	2f1b16e867	ltree: Fix overflows with lquery parsing The lquery parser in contrib/ltree/ had two overflow problems: - A single lquery level with many OR-separated variants (e.g., 'label1|label2|...') could cause an overflow of totallen, this being stored as a uint16, meaning a maximum value of UINT16_MAX or 65k. Each variant contributes MAXALIGN(LVAR_HDRSIZE + len) bytes. With enough long variants, the value would wrap around. This would corrupt the data written by LQL_NEXT(), leading to a stack corruption, most likely translating into a crash, but it would allow incorrect memory access. - numvar, declared as a uint16, counts the number of OR-variants in a single level, and it is incremented without bounds checking. With more than PG_UINT16_MAX (65k) variants in a single level, and a minimum of 131kB of input data, it would wrap to 0. When a (wildcard) '*' is used, this would change the query results silently. For both issues, a set of overflow checks is added to guard against these problematic patterns. The first issue has been reported by the three people listed below, affecting v16 and newer versions due to `b1665bf01e`. Its coding was still unsafe in v14 and v15. The second issue affects all the stable branches; I bumped into it while reviewing the code of the module. Reported-by: Vergissmeinnicht <vergissmeinnichtzh@gmail.com> Reported-by: A1ex <alex000young@gmail.com> Reported-by: Jihe Wang <wangjihe.mail@gmail.com> Author: Michael Paquier <michael@paquier.xyz> Security: CVE-2026-6473 Backpatch-through: 14	2026-05-11 05:13:46 -07:00
Peter Eisentraut	c1fe2d1a38	pg_upgrade: Message improvements	2026-05-11 11:38:20 +02:00
John Naylor	901ed9b352	Fix universal builds on macOS Commit `16743db06` assumed that the CPUID instruction was always available when the usual x86 symbols were defined. That is not the case, so zero out the info rather than erroring out. Reported-by: Jakob Egger <jakob@eggerapps.at> Reported-by: Tobias Bussmann <t.bussmann@gmx.net> Suggested-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/223EA201-A0E8-4A13-B220-EB903E8DF817@eggerapps.at	2026-05-08 16:44:25 +07:00
Richard Guo	9d124a14b3	Enforce RETURNING typmod for empty-set JSON_ARRAY(query) Commit `8d829f5a0` introduced a COALESCE wrapper around the JSON_ARRAYAGG subquery so that JSON_ARRAY(query) returns '[]' rather than NULL when the subquery yields no rows, per the SQL/JSON standard. The empty-array Const used as the COALESCE fallback was, however, built with typmod -1 and the type input function was likewise invoked with typmod -1. As a result, any length restriction from the RETURNING clause was silently bypassed on the empty-set path, while the non-empty path enforced it via the JSON_ARRAYAGG coercion. Build the empty-array Const using the typmod of the COALESCE's non-empty argument, and pass that typmod to OidInputFunctionCall as well so the value is length-checked at parse time. This makes the empty-set and non-empty-set paths behave consistently. Reported-by: Ayush Tiwari <ayushtiwari.slg01@gmail.com> Author: Richard Guo <guofenglinux@gmail.com> Discussion: https://postgr.es/m/CAJTYsWXPYqa58YXrU+SQMVonsAhjLS46HNUMU=wO5zm9MgY3_g@mail.gmail.com	2026-05-08 17:21:48 +09:00
Amit Kapila	a49b9cfd72	Use schema-qualified names in EXCEPT clause error messages. Error messages in check_publication_add_relation() previously reported only the relation name when a table in an EXCEPT clause could not be processed, which is ambiguous when the same name exists in multiple schemas. Use schema-qualified names instead, consistent with other error messages that reference relation names. Author: Dilip Kumar <dilipbalaut@gmail.com> Author: vignesh C <vignesh21@gmail.com> Reviewed-by: shveta malik <shveta.malik@gmail.com> Reviewed-by: Euler Taveira <euler@eulerto.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Peter Smith <smithpb2250@gmail.com> Discussion: https://postgr.es/m/CAFiTN-scG7b11Jsp+VoDRT8ZFE84eSKLcDsSB18dZ8AaP=R-mw@mail.gmail.com	2026-05-08 10:00:26 +05:30
Etsuro Fujita	3f7a1afbae	postgres_fdw: Fix syntax error in fetch_attstats(). When importing remote stats for a foreign table backed by a pre-v17 remote server, the query built/executed in this function has three NULL placeholders for the range stats supported in v17 at the end of the SELECT list. Previously, it included a trailing comma after the last NULL, like "SELECT ..., NULL, NULL, NULL, FROM pg_catalog.pg_stats ...", causing a syntax error on the remote server. Fix by removing the comma. Oversight in commit `28972b6fc`. Author: Satya Narlapuram <satyanarlapuram@gmail.com> Discussion: https://postgr.es/m/CAHg%2BQDdEE7wp1S60Fn9Kmna8KfdMo5Tu6dROLpMn_-EOUBKmWQ%40mail.gmail.com	2026-05-08 13:15:00 +09:00
Richard Guo	a1b754558a	Consider opfamily and collation when removing redundant GROUP BY columns remove_useless_groupby_columns() uses a relation's unique indexes to prove that some GROUP BY columns are functionally dependent on others, and so can be dropped from the GROUP BY clause. The match between index columns and GROUP BY columns was done by attno alone, ignoring two equality-relation issues. A type may belong to multiple btree opfamilies whose notions of equality differ. The record type, for instance, has record_ops (per-field equality) and record_image_ops (bytewise equality). A unique index under one opfamily does not prove uniqueness under the equality used by GROUP BY when the SortGroupClause's eqop comes from a different opfamily. Likewise, since nondeterministic collations were introduced in PG 12, two collations may disagree on equality, and a unique index under one collation does not prove uniqueness under another. In either case, rows that the index considers distinct can collapse into a single GROUP BY group, taking ungrouped columns of differing values with them, so the planner drops a column that is not in fact functionally dependent and produces wrong results. Fix by requiring, for each unique-index key column, that some GROUP BY item on the same column has an eqop in the index's opfamily and a collation that agrees on equality with the index's collation. This mirrors the combined check relation_has_unique_index_for() applies to join clauses. This is a v18 regression: commit `bd10ec529` extended remove_useless_groupby_columns() from primary-key constraints to arbitrary unique indexes. Before that, the function consulted only primary keys, whose enforcement index is required by parse_utilcmd.c to use the default opclass and the column's declared collation, so neither mismatch could arise. Back-patch to v18 only. 
Author: Richard Guo <guofenglinux@gmail.com> Reviewed-by: Ayush Tiwari <ayushtiwari.slg01@gmail.com> Discussion: https://postgr.es/m/CAMbWs49t6uArWoTT-cHY+nhsi23nJJKcF9Xb9cYGzaZ9kNJ98g@mail.gmail.com Backpatch-through: 18	2026-05-08 12:45:51 +09:00
Richard Guo	ba82de48e6	Fix HAVING-to-WHERE pushdown for simple-CASE form Commit `f76686ce7` added a walker that detects when a HAVING clause uses a collation that conflicts with the GROUP BY's nondeterministic collation, keeping such clauses in HAVING. The walker uses exprInputCollation() to identify each ancestor's comparison collation, but missed the simple-CASE case: parse analysis builds each WHEN as OpExpr(CaseTestExpr op val), where CaseTestExpr is a placeholder for the arg, while the actual arg expression sits at cexpr->arg, outside the OpExpr that carries the comparison's inputcollid. A GROUP Var at cexpr->arg was therefore visited with the WHEN's inputcollid absent from the ancestor stack, the conflict went undetected, and the clause was wrongly pushed to WHERE. Fix by handling simple CASE explicitly: before walking cexpr->arg, push every WHEN's inputcollid onto the ancestor stack so a GROUP Var at the arg is checked against the same collations the WHEN comparisons would apply. Then walk the WHEN bodies and defresult under the unchanged stack, where their own collation contexts are picked up by the default path. Back-patch to v18 only; this fix extends the walker added by commit `f76686ce7` and inherits its dependency on the v18 RTE_GROUP mechanism. Author: SATYANARAYANA NARLAPURAM <satyanarlapuram@gmail.com> Reviewed-by: Richard Guo <guofenglinux@gmail.com> Discussion: https://postgr.es/m/CAHg+QDcqPdd=2V0PQ_oNYj50OUeqSqznqFaYtP3RdokLBDXBqw@mail.gmail.com Backpatch-through: 18	2026-05-08 10:57:50 +09:00
Bruce Momjian	12ca57bf34	doc PG 19 relnotes: add UTF-8 case folding performance item Reported-by: Andreas Karlsson Discussion: https://postgr.es/m/9dae1593-4441-4a20-a1ab-ce5018db9878@proxel.se	2026-05-07 20:53:41 -04:00
Amit Langote	4b1b2be22f	Fix use-after-free of qs in AfterTriggerEndQuery. afterTriggerInvokeEvents() may repalloc afterTriggers.query_stack while firing trigger events, leaving any precomputed entry pointer dangling. The loop body in AfterTriggerEndQuery() recomputes qs after each afterTriggerInvokeEvents() call for that reason, but the "all fired" break path exits without the recompute, and the subsequent FireAfterTriggerBatchCallbacks(qs->batch_callbacks) dereferences the freed pointer. Fix by recomputing qs immediately before FireAfterTriggerBatchCallbacks(), as the loop body already does after each afterTriggerInvokeEvents() call. The hazard was introduced in `34a3078629`, which added the qs->batch_callbacks dereference at this site. Reported-by: Amul Sul <sulamul@gmail.com> Author: Amul Sul <sulamul@gmail.com> Reviewed-by: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com> Reviewed-by: Amit Langote <amitlangote09@gmail.com> Discussion: https://postgr.es/m/CAAJ_b95p6-qiVpE2Gpr=bUsNAqTcejD_rPgLnfjx9m=fo3Rf3Q@mail.gmail.com	2026-05-08 09:42:42 +09:00
Bruce Momjian	2d773a9f00	doc PG 19 relnotes: correct two items Reported-by: jian he Discussion: https://postgr.es/m/CACJufxG_ZTCTtFMxKiVji-s10jHt99krfH+Kn+Ww2prF=X6g6Q@mail.gmail.com	2026-05-07 20:21:16 -04:00
Bruce Momjian	4c0f1e4910	doc PG 19 relnotes: add missing commits and details Reported-by: Xuneng Zhou Discussion: https://postgr.es/m/CABPTF7VxrFB_4Qoo2=PyrczGyq8CqOpQ5D5yye3DyxDC=so_0Q@mail.gmail.com	2026-05-07 18:02:21 -04:00
Masahiko Sawada	b384cdb274	Fix race condition in XLogLogicalInfo and ProcSignal initialization. Previously, InitializeProcessXLogLogicalInfo() was called before ProcSignalInit(). This created a window where a process could miss a signal barrier if it was issued between these two calls. As a result, the process could fail to update its local XLogLogicalInfo cache, leading to an inconsistent logical decoding state. This commit fixes this by moving InitializeProcessXLogLogicalInfo() after ProcSignalInit(). This ensures that the process is registered to participate in signal barriers before its state is initialized, preventing it from missing any state changes propagated during the startup sequence. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com> Discussion: https://postgr.es/m/CAD21AoBzdeSyLSSPM5E6ysN1r8qzp8u_BRmnLvuAp_S8QxS_fQ@mail.gmail.com Discussion: https://postgr.es/m/CAD21AoBj+zKvgw_Q8gjr4YbKccW_uMe3OFQ5+KT246FHUuNXSQ@mail.gmail.com	2026-05-07 10:09:42 -07:00
John Naylor	ecb2508aaf	Rationalize error comments in partition split/merge tests The regression tests had a copy of the full error, detail, and hint text in comments above each failing statement in the .sql files. This is a maintenance hazard, so simplify to "-- ERROR", in line with other tests. Author: Ayush Tiwari <ayushtiwari.slg01@gmail.com> Reviewed-by: Jian He <jian.universality@gmail.com> Discussion: https://postgr.es/m/CANWCAZap26BRLwtd+A7GFDSD6-+C3F0NVdUGUAu2LUfvpOTy=w@mail.gmail.com	2026-05-07 19:10:51 +07:00
John Naylor	52e629be95	Message corrections for partition split/merge commands Fix spelling and grammar, turn an accidental duplicate errmsg into errdetail, and remove an errposition that was not pointing at anything relevant to the error. Author: Ayush Tiwari <ayushtiwari.slg01@gmail.com> Reviewed-by: Jian He <jian.universality@gmail.com> Reviewed-by: Yuchen Li <liyuchen_xyz@163.com> (earlier version) Discussion: https://postgr.es/m/CAJTYsWUvMT5uKOasPnm6-o9CrdXbRONiAYHTKJb7wx66LB8S1A@mail.gmail.com	2026-05-07 19:10:35 +07:00
Peter Eisentraut	43fc1dc752	pg_createsubscriber: Message improvements and corrections	2026-05-07 11:19:55 +02:00
Peter Eisentraut	5778fb3eaf	Fix typo in error message Author: Ayush Tiwari <ayushtiwari.slg01@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CAJTYsWXFy1j_T82%2BM_S9kFxU414tQYnZQD-b82%3DoL_LbG_5fPQ%40mail.gmail.com	2026-05-07 10:36:59 +02:00
Michael Paquier	6827de95ee	Simplify code in objectaddress.c for some property graph objects Property graph element labels and label properties relied on a direct systable scan when retrieving their object descriptions. These can be simplified with get_catalog_object_by_oid(). This offers the benefit of doing a direct syscache lookup, if available. The same logic will be used in a follow-up patch when retrieving the object identity parts, applying the same rule across the board for these object types. Extracted from a larger patch by the author. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-by: Alex Guo <guo.alex.hengchen@gmail.com> Discussion: https://postgr.es/m/aej1DkLwhyZWmtxJ@bdtpg	2026-05-07 10:18:49 +09:00
Alexander Korotkov	5cdec42319	Fix WAIT FOR LSN cleanup on subtransaction abort WAIT FOR LSN registers the current backend in shared memory before entering an interruptible wait loop. Top-level abort and backend exit already call WaitLSNCleanup(), but subtransaction abort did not. If an interrupt, such as statement_timeout, occurred while waiting inside a savepoint, rolling back to the savepoint left the backend marked as present in the WAIT FOR LSN heap. Clean up WAIT FOR LSN state from AbortSubTransaction() as well, and add a TAP test covering reuse of WAIT FOR LSN after a savepoint rollback. Reported-by: Ayush Tiwari <ayushtiwari.slg01@gmail.com> Discussion: https://postgr.es/m/CAJTYsWXDRwo-RVRaQgwxVcXgURVFeX8BKnijQrPiPcSCkDDX9A%40mail.gmail.com Author: Ayush Tiwari <ayushtiwari.slg01@gmail.com> Author: Xuneng Zhou <xunengzhou@gmail.com> Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>	2026-05-06 13:56:38 +03:00
Daniel Gustafsson	486b9a9b9e	Fix regex searching for page verification failures in tests The test for finding page verification failures in the logfiles was missing the /m modifier, which makes sure the regex anchors to every newline in the search space buffer, and not just the last one. Spotted while adding a test for the recently reported issue with excessive WAL for unlogged relations. Author: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Satyanarayana Narlapuram <satyanarlapuram@gmail.com> Reviewed-by: Ayush Tiwari <ayushtiwari.slg01@gmail.com> Discussion: https://postgr.es/m/CAHg+QDeGrpZbNZdLjd_T4b43xKEEXZN0HGhkFm-1bkBdyzK7AQ@mail.gmail.com	2026-05-06 12:38:15 +02:00
Daniel Gustafsson	9a39056c41	Apply data-checksum worker throttling parameters The DataChecksumsWorker accepts cost_delay and cost_limit parameters from pg_enable_data_checksums() so users can throttle the I/O caused by enabling checksums. Because the API for setting the cost parameters changed between when the code was written and when it was committed, the call to the new cost-update function was omitted, and thus the parameters were silently ignored. Fix by calling VacuumUpdateCosts() after assigning the parameters (both during worker startup and on the runtime cost-update path), and by leaving the page-cost weights at their GUC-controlled defaults. Author: Satyanarayana Narlapuram <satyanarlapuram@gmail.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Ayush Tiwari <ayushtiwari.slg01@gmail.com> Discussion: https://postgr.es/m/CAHg+QDeevH6aTyWdXYBJW0wOmfoZy66gDi5TfinK_dXeCrHQLg@mail.gmail.com	2026-05-06 12:38:12 +02:00
Daniel Gustafsson	2018bd6167	Skip WAL for unlogged main fork during online checksum enable ProcessSingleRelationFork() unconditionally generated an FPI WAL record for every page of every relation when enabling checksums. Unlogged relations, which by definition never generate WAL for data changes, were not exempt, which caused excessive WAL to be emitted. Fix by guarding the FPI WAL record call with RelationNeedsWAL() to avoid emitting WAL for unlogged main forks. Unlogged pages are still dirtied to ensure the checksum is written to disk at the next checkpoint. The init fork remains WAL-logged even for unlogged relations, as it's needed on the standby to materialize the relation after promotion (see ResetUnloggedRelations()). Skipping init-fork WAL would leave the standby with a stale init fork that, once copied to the main fork on promotion, would fail checksum verification on every read of the unlogged relation. A test which creates an unlogged table with an index, enables checksums, promotes the standby, and verifies that the unlogged relation and its indexes are still readable post-promotion has been added. Author: Satyanarayana Narlapuram <satyanarlapuram@gmail.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Ayush Tiwari <ayushtiwari.slg01@gmail.com> Discussion: https://postgr.es/m/CAHg+QDeGrpZbNZdLjd_T4b43xKEEXZN0HGhkFm-1bkBdyzK7AQ@mail.gmail.com	2026-05-06 12:38:01 +02:00
Peter Eisentraut	43dc21f76f	Document deprecated --wal-directory option for pg_verifybackup Commit `b3cf461b3c` renamed --wal-directory to --wal-path but retained the former as a silent alias. Per project policy, all options, including deprecated ones, should be documented to assist users transitioning between versions. This patch restores --wal-directory to the documentation and --help output. Author: Amul Sul <sulamul@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/E1w3fZp-000gje-31%40gemulon.postgresql.org	2026-05-06 10:45:42 +02:00
Álvaro Herrera	a0a0c0c20e	Skip other sessions' temp tables in REPACK, CLUSTER, and VACUUM FULL get_tables_to_repack() and get_all_vacuum_rels() were including other sessions' temporary tables in their output work list, causing REPACK, CLUSTER and VACUUM FULL (when executed without a table list) to attempt to acquire AccessExclusiveLock on them, potentially blocking for an extended time. Fix by skipping other-session temp tables early, before they are added to the list. This issue is ancient, but there have been no complaints about it that I know of, so I'm opting for not backpatching at present. Author: Jim Jones <jim.jones@uni-muenster.de> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Zsolt Parragi <zsolt.parragi@percona.com> Discussion: https://postgr.es/m/0b555318-2bf2-46df-9377-09629a2a59db@uni-muenster.de	2026-05-05 16:20:26 +02:00
John Naylor	6766264262	Add missing guard for __builtin_constant_p Oversight in commit `e2809e3a1`. While at it, use pg_integer_constant_p in master. Discussion: https://postgr.es/m/CANWCAZbOha-x5MCreQn3TRA56VdKWNMAKMy3fAV1kJSw9Vp4pw@mail.gmail.com Backpatch-through: 18	2026-05-05 18:51:07 +07:00
Etsuro Fujita	648818ba38	postgres_fdw: Fix handling of abort-cleanup-failed connections. As connections that failed abort cleanup can't safely be further used, if a remote query tries to get such a connection, we reject it. Previously, this rejection involved dropping the connection if it was open, without accounting for the possibility of open cursors using it, causing a server crash when such an open cursor tried to use an already-dropped connection, as a cursor-handling function (create_cursor, fetch_more_data, or close_cursor) was called on a freed PGconn. To fix, delay dropping failed connections until abort cleanup of the main transaction, to ensure open cursors using such a connection can safely refer to the PGconn for it. Oversight in commit `8bf58c0d9`. Reported-by: Zhibai Song <songzhibai1234@gmail.com> Diagnosed-by: Zhibai Song <songzhibai1234@gmail.com> Author: Etsuro Fujita <etsuro.fujita@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com> Discussion: https://postgr.es/m/CAPmGK176y6JP017-Cn%2BhS9CEJx_6iVhRoYbAqzuLU4d8-XPPNg%40mail.gmail.com Backpatch-through: 14	2026-05-05 18:55:00 +09:00
Peter Eisentraut	d0ed9ad8b0	doc: Clean up title case use	2026-05-05 11:24:16 +02:00
Peter Eisentraut	22f9207aaa	Message style improvements (oauth related)	2026-05-05 10:39:13 +02:00
Álvaro Herrera	eb2e2eb4d4	Don't lose column values on REPACK Commit `28d534e2ae` introduced reform_tuple() with a fast path that returns the source tuple verbatim when no dropped columns require fixing up. I (Álvaro) failed to realize that this broke handling of columns with a 'missingval' defined: after a VACUUM FULL, CLUSTER, or REPACK operation, the catalogued missingval is thrown away, so the tuples are no longer correct. Fix by forcing the rewrite when the tuple is shorter than the tuple descriptor. Author: Satya Narlapuram <satyanarlapuram@gmail.com> Discussion: https://postgr.es/m/CAHg+QDeoccU5CudrJpmSKZfKZ1gRMNY=5BxSC=JpHgkonzgcOw@mail.gmail.com	2026-05-05 10:24:49 +02:00
Peter Eisentraut	d0eac3cafb	Make spelling consistent "vertexes" -> "vertices" Reported-by: Ayush Tiwari <ayushtiwari.slg01@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CAJTYsWXFy1j_T82%2BM_S9kFxU414tQYnZQD-b82%3DoL_LbG_5fPQ%40mail.gmail.com	2026-05-05 09:36:54 +02:00
Peter Eisentraut	1190f858ea	doc: Small synopsis wording change for consistency	2026-05-05 09:27:32 +02:00
Richard Guo	574581b50a	Consider collation when proving subquery uniqueness rel_is_distinct_for()'s RTE_SUBQUERY branch passed only the equality operator from each join clause to query_is_distinct_for(), discarding the operator's input collation. query_is_distinct_for() then verified opfamily compatibility but never checked collations, so a DISTINCT / GROUP BY / set-op operating under one collation was trusted to prove uniqueness for a comparison performed under an unrelated collation. As with the recent fix in relation_has_unique_index_for(), this is unsound for nondeterministic collations and yields wrong query results in any optimization that consumes the proof. Fix by carrying each clause's operator input collation into query_is_distinct_for() and validating it at every check-site against the subquery target expression's collation. Back-patch to all supported branches. query_is_distinct_for() is declared in an installed header, so on stable branches the existing two-list signature is retained as a thin wrapper that forwards to a new collation-aware entry point; external callers continue to receive the historical collation-blind answer. Author: Richard Guo <guofenglinux@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAMbWs4_XUUSTyzCaRjUeeahWNqi=8ZOA5Q4coi8zUVEDSBkM6A@mail.gmail.com Backpatch-through: 14	2026-05-05 10:23:31 +09:00
Richard Guo	5a55ea507a	Consider collation when proving uniqueness from unique indexes relation_has_unique_index_for() has long had an XXX noting that it doesn't check collations when matching a unique index's columns against equality clauses. This was benign as long as all collations in play reduced to the same notion of equality, but has been incorrect since nondeterministic collations were introduced in PG 12: a unique index under a deterministic collation does not prove uniqueness under a nondeterministic collation, nor vice versa. The consequence is wrong query results for any planner optimization that consumes the faulty proof, including inner-unique join execution (which stops the inner search after the first match per outer row), useless-left-join removal, semijoin-to-innerjoin reduction, and self-join elimination. Fix by requiring the index's collation to agree on equality with the clause's input collation. Two collations agree on equality if either is InvalidOid (denoting a non-collation-sensitive operation, which cannot conflict with the other side), if they have the same OID, or if both are deterministic: by definition a deterministic collation treats two strings as equal iff they are byte-wise equal (see CREATE COLLATION), so any two deterministic collations share the same equality relation and the uniqueness proof carries over. Any mismatch involving a nondeterministic collation is rejected. Back-patch to all supported branches; the bug has existed since nondeterministic collations were introduced in PG 12. Author: Richard Guo <guofenglinux@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAMbWs4_XUUSTyzCaRjUeeahWNqi=8ZOA5Q4coi8zUVEDSBkM6A@mail.gmail.com Backpatch-through: 14	2026-05-05 10:22:53 +09:00
Tom Lane	93da297366	Declare load_hosts() as returning HostsFileLoadResult. This function returns some value of enum HostsFileLoadResult, but for reasons lost in the development process was declared to return "int". Fix that, for clarity and so that our typedefs collection tooling sees the typedef as used. Also fix the variable that the sole call assigns into. Move the typedef to the header file that declares load_hosts() to avoid creating header dependency problems. Discussion: https://postgr.es/m/359138.1777922557@sss.pgh.pa.us	2026-05-04 18:33:06 -04:00
Peter Eisentraut	f6edd8ed70	Add ORDER BY to test query to stabilize test for commit `dc9e7c9ed9`	2026-05-04 20:59:16 +02:00
Álvaro Herrera	b5f92b8eb4	Fix off-by-one in repack index loop A blunder of mine (Álvaro) in commit `28d534e2ae`. Author: Lakshmi N <lakshmin.jhs@gmail.com> Reviewed-by: Xiaopeng Wang <wxp_728@163.com> Reviewed-by: John Naylor <johncnaylorls@gmail.com> Discussion: https://postgr.es/m/CA+3i_M9ytFufvD8Tm0rhpfxuC4XrpgQDBHxM7NJQYxv488JW7w@mail.gmail.com	2026-05-04 20:01:19 +02:00
Peter Eisentraut	dc9e7c9ed9	Handle nodes that may appear in GraphPattern expression trees expression_tree_mutator_impl() did not handle T_GraphPattern, T_GraphElementPattern, and T_GraphPropertyRef. The corresponding expression_tree_walker_impl() already handles all three node types. This caused an "unrecognized node type" error whenever a GRAPH_TABLE appeared in an expression tree. While at it, also update raw_expression_tree_walker() and expression_tree_walker() to handle missing nodes that may appear in GraphPattern expression trees. When raw_expression_tree_walker() is called, GraphElementPattern::labelexpr contains ColumnRefs instead of GraphLabelRefs. Hence those are not handled in raw_expression_tree_walker(). Author: Satyanarayana Narlapuram <satyanarlapuram@gmail.com> Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CAHg%2BQDc97WFTSkXg%3Dg_ZAH8GnY2gJrvq72cs%2BYjqEAuZgXnkAQ%40mail.gmail.com	2026-05-04 17:34:32 +02:00
Peter Eisentraut	891a57c739	Do not define type for a property graph Even though a property graph is defined in pg_class, it does not contain any rows by itself and need not have a type defined. Avoid creating a type for it. Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CAExHW5ucu7ZTgYkO6rB_1ShJP3e%3DGAT2T3CP4XWN8rUVEsiJoA%40mail.gmail.com	2026-05-04 15:45:56 +02:00
Peter Eisentraut	abff4492d0	Fix options listing of pg_restore --no-globals The new pg_restore option --no-globals (commit `3c19983cc0`) appeared out of order in the documentation and help output. Fix that.	2026-05-04 12:00:22 +02:00
Peter Eisentraut	b83a94a73b	Add missing serial commas	2026-05-04 11:53:04 +02:00
Peter Eisentraut	2fcc8aaeb2	doc: Fix up spacing around verbatim DocBook elements	2026-05-04 09:45:40 +02:00
Amit Kapila	bf3ead6075	Simplify translatable messages for tuple value details in conflict.c. append_tuple_value_detail() constructed user-visible messages using separately translated fragments such as ": ", ", ", and ".". This makes correct translation difficult or impossible in some languages. Refactor append_tuple_value_detail() to move all punctuation and sentence construction to the callers, which now use a single translatable string with a %s placeholder for the tuple data. Reported-by: David Rowley <dgrowleyml@gmail.com> Author: vignesh C <vignesh21@gmail.com> Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Zhijie Hou <houzj.fnst@fujitsu.com> Reviewed-by: Peter Smith <smithpb2250@gmail.com> Discussion: https://postgr.es/m/227279.1775956328%40sss.pgh.pa.us#8f3a5f50543556c60cc5a13270cb7ba4 Discussion: https://postgr.es/m/CAApHDvohYOdrvhVxXzCJNX_GYMSWBfjTTtB6hgDauEtZ8Nar2A@mail.gmail.com	2026-05-04 12:06:41 +05:30
Alexander Korotkov	c06d1a4ba6	Mark the modified FSM buffer as dirty during recovery The XLogRecordPageWithFreeSpace function updates the freespace map (FSM) data while replaying data-level WAL records during recovery. If the FSM block is updated, it needs to be marked as modified. Currently, this is done with the MarkBufferDirtyHint call (as in all other cases for modifying FSM data). However, in the recovery context, this function will actually do nothing if checksums are enabled. It's assumed that the page should not be dirtied during recovery while modifying hints to protect against torn pages, since no new WAL data can be generated at this point to store an FPI. Such logic does not seem fully aligned with the FSM case, as its blocks could be simply zeroed if a checksum mismatch is detected. Currently, changes to an FSM block could be lost if each change to that block occurs infrequently enough to allow it to be evicted from the cache. To persist the change, the modification needs to be performed while the FSM block is still kept in buffers and marked as dirty after receiving its FPI. If the block has already been cleaned, the change won't be persisted, so stored FSM blocks may remain in an obsolete state. If a large number of discrepancies between the data in leaf FSM blocks and the actual data blocks accumulate on the replica server, this could cause significant delays in insert operations after switchover. Such an insert operation may need to visit many data blocks marked as having sufficient space in the FSM, only to discover that the information is incorrect and the FSM records need to be corrected. In a heavily trafficked insert-only table with many concurrent clients performing inserts, this has been observed to cause several-second stalls, causing visible application malfunction. The desire to avoid such cases was the reason behind commit `ab7dbd681`, which introduced an update of FSM data during the heap_xlog_visible invocation. 
However, an update to the FSM data on the standby side could be lost due to a missing 'dirty' flag, so there is still a possibility that a large number of FSM records will contain incorrect data. Note that having a zeroed FSM page in such a case (due to a checksum mismatch) is preferable, as a zero value will be interpreted as an indication of full data blocks, and the inserter will be routed to the next FSM block or to the end of the table. Given that FSM is ready to handle torn page writes and XLogRecordPageWithFreeSpace is called only during the recovery, there seems to be no reason to use MarkBufferDirtyHint here instead of a regular MarkBufferDirty call. Discussion: https://postgr.es/m/596c4f1c-f966-4512-b9c9-dd8fbcaf0928%40postgrespro.ru Author: Alexey Makhmutov <a.makhmutov@postgrespro.ru> Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru> Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>	2026-05-03 20:23:50 +03:00
Alexander Korotkov	21d290161b	Document that WAIT FOR LSN is timeline-blind WAIT FOR LSN compares only the numeric LSN and has no notion of which timeline a WAL record belongs to. There are many possible scenarios in which a timeline switch can break read-your-writes consistency. Proper analysis and timeline support may be possible in the next major release; for now, just document the current behaviour. Reported-by: Xuneng Zhou <xunengzhou@gmail.com> Author: Alexander Korotkov <aekorotkov@gmail.com>	2026-05-03 16:22:02 +03:00
Alexander Korotkov	cb096e6d69	Improve WAIT FOR LSN test coverage Add regression coverage for several WAIT FOR LSN edge cases. First, cover fresh walreceiver shared-memory initialization after a standby restart. Restart the standby while its upstream is down, so RequestXLogStreaming() seeds writtenUpto/flushedUpto to the segment-aligned receiveStart and the walreceiver cannot immediately advance them. Verify that the seeded flush position is segment-aligned, that replay can be ahead of it, and that standby_write/standby_flush still succeed for an already-replayed LSN via the replay-position floor in GetCurrentLSNForWaitType(). Second, add fencepost checks for the target <= currentLSN predicate. With replay paused and walreceiver stopped, verify exact boundaries for standby_replay using pg_last_wal_replay_lsn(), and for standby_flush using pg_last_wal_receive_lsn(). Also verify that a waiter for current + 1 sleeps while replay is paused and wakes with success once new WAL is delivered and replay advances. Finally, add a cascading-standby timeline-switch test. Start a waiter on the downstream standby, promote its upstream, generate WAL on the new timeline, and verify that the cascade follows the new timeline and the wait completes successfully once replay reaches the target LSN. Reported-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/1957514.1775526774%40sss.pgh.pa.us Author: Alexander Korotkov <aekorotkov@gmail.com> Author: Xuneng Zhou <xunengzhou@gmail.com>	2026-05-03 16:22:02 +03:00
Alexander Korotkov	e7cd592174	Wake standby_write/standby_flush waiters from the WAL replay loop The startup process only woke STANDBY_REPLAY waiters after replaying each WAL record. STANDBY_WRITE and STANDBY_FLUSH waiters depended only on walreceiver write/flush callbacks. As a result, replay progress alone did not wake those waiters, and in pure archive recovery (where no walreceiver exists) they could sleep until timeout. Fix by also calling WaitLSNWakeup() for STANDBY_WRITE and STANDBY_FLUSH after each replay. For the replay-floor semantics used by GetCurrentLSNForWaitType(), replay progress is a valid lower bound for both modes: WAL cannot be replayed unless it has already been written and flushed locally. This works together with the replay-position floor in GetCurrentLSNForWaitType(). The getter ensures that a waiter woken by replay can recheck successfully; the replay-side wakeups ensure that a waiter already asleep is notified when replay reaches its target. Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/1957514.1775526774%40sss.pgh.pa.us Author: Xuneng Zhou <xunengzhou@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>	2026-05-03 16:22:02 +03:00
Alexander Korotkov	cba67b5b87	Use replay position as floor for WAIT FOR LSN standby_(write|flush) GetCurrentLSNForWaitType() for standby_write and standby_flush modes returned only the walreceiver position, which may lag behind WAL already present on the standby from a base backup, archive restore, or prior streaming. This could cause unnecessary blocking if the target LSN falls between the walreceiver's tracked position and the replay position. Fix by returning the maximum of the walreceiver position and the replay position. WAL up to the replay point is physically on disk regardless of its origin, so there is no reason to wait for the walreceiver to re-receive it. This complements `29e7dbf5e4`, which seeded writtenUpto to receiveStart in RequestXLogStreaming() to fix the most common hang scenario. The getter-level floor handles the remaining edge cases: targets between receiveStart and the replay position, and standbys running with archive recovery only (no walreceiver). Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/1957514.1775526774%40sss.pgh.pa.us Author: Xuneng Zhou <xunengzhou@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>	2026-05-03 16:22:02 +03:00
Alexander Korotkov	df9f938ca2	Remove redundant WAIT FOR LSN caller-side pre-checks All five wakeup call sites duplicate WaitLSNWakeup()'s internal fast-path minWaitedLSN check and add an unnecessary NULL check on waitLSNState. Remove the inline pre-checks and call WaitLSNWakeup() directly. The fast-path check inside WaitLSNWakeup() already returns early when no waiter's target has been reached, so there is no performance difference. The waitLSNState NULL checks are also unnecessary: shared memory is fully initialized before any backend or auxiliary process starts, so waitLSNState is always non-NULL at these call sites. Reported-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/jzq5shdewncpxc35r3s2mcfsmo4bjovkza5mnqf5bdfumhfi3g%40bglckf7dxmw5 Author: Xuneng Zhou <xunengzhou@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>	2026-05-03 16:22:02 +03:00
Alexander Korotkov	a80a593ab6	Fix memory ordering in WAIT FOR LSN wakeup mechanism WAIT FOR LSN uses a Dekker-style handshake: the waker stores an LSN position then reads minWaitedLSN; the waiter stores its target into minWaitedLSN then reads the position. Without a barrier between each side's store and load, a CPU may satisfy the load before the store becomes globally visible, causing either side to miss a concurrent update. The result is a missed wakeup: the waiter sleeps indefinitely until the next unrelated event. Fix by embedding the required barriers into the atomic operations on minWaitedLSN: - In updateMinWaitedLSN(), use pg_atomic_write_membarrier_u64() so the waiter's preceding heap update is visible before the new minWaitedLSN value is published. - In WaitLSNWakeup(), use pg_atomic_read_membarrier_u64() in the fast-path check so the waker's preceding position store is globally visible before minWaitedLSN is read. The waiter side is also covered by the barrier semantics already present in GetCurrentLSNForWaitType(): GetWalRcvWriteRecPtr() uses an explicit read barrier (from patch 0001), while the remaining getters acquire a spinlock, which implies the same ordering. Also call ResetLatch() unconditionally after WaitLatch(), following the standard latch loop pattern. WaitLatch() does not guarantee that all simultaneously true wake conditions are reported in one return, so a timeout can race with SetLatch(). If we skip ResetLatch() on a timeout return, the code performs further asynchronous-state checks before consuming the latch, violating the latch API's required wait/reset pattern. That can leave the latch set across loop exit and cause a later unrelated WaitLatch() in the same backend to return immediately. 
Reported-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/zqbppucpmkeqecfy4s5kscnru4tbk6khp3ozqz6ad2zijz354k%40w4bdf4z3wqoz Author: Xuneng Zhou <xunengzhou@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>	2026-05-03 16:22:02 +03:00
Alexander Korotkov	dfb690dd52	Use barrier semantics when reading/writing writtenUpto The walreceiver publishes its write position lock-free via writtenUpto. On weakly-ordered architectures (ARM, PowerPC), both sides of this handshake need explicit barriers so that the lock-less reader sees a consistent state. Use pg_atomic_write_membarrier_u64() at both write sites and pg_atomic_read_membarrier_u64() in GetWalRcvWriteRecPtr(). This matches the barrier semantics that GetWalRcvFlushRecPtr() and other LSN-position functions get implicitly from their spinlock acquire/release, and protects from bugs caused by expectations of similar barrier guarantees from different LSN-position functions. Reported-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/zqbppucpmkeqecfy4s5kscnru4tbk6khp3ozqz6ad2zijz354k%40w4bdf4z3wqoz Author: Xuneng Zhou <xunengzhou@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>	2026-05-03 16:22:02 +03:00
Andrew Dunstan	c34a280c85	Add missing connection validation in ECPG ECPGdeallocate_all(), ECPGprepared_statement(), ECPGget_desc(), and ecpg_freeStmtCacheEntry() could crash with a SIGSEGV when called without an established connection (for example, when EXEC SQL CONNECT was forgotten or a non-existent connection name was used), because they dereferenced the result of ecpg_get_connection() without first checking it for NULL. Each site is fixed in the style of the surrounding code. New tests are added for these conditions. Author: Shruthi Gowda <gowdashru@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Mahendra Singh Thalor <mahi6run@gmail.com> Reviewed-by: Nishant Sharma <nishant.sharma@enterprisedb.com> Discussion: https://postgr.es/m/3007317.1765210195@sss.pgh.pa.us Backpatch-through: 14	2026-05-01 15:12:28 -04:00
Andrew Dunstan	b772f3fcad	Only show signal-sender PID/UID detail in server log The errdetail() added in `55890a9194` (and reworked in `3e2a1496ba`) exposed the operating-system PID and UID of whoever sent the termination signal directly to the affected client. Discussion suggested this should not be sent to the client, but only recorded in the server log where the admin can use it for diagnosis. Author: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Jakub Wartak <jakub.wartak@enterprisedb.com> Discussion: https://postgr.es/m/E5CA274C-74BD-4067-8B73-A3AD8C080EFA@gmail.com	2026-05-01 13:20:08 -04:00
Amit Kapila	f67dbd8398	Fix BF failure introduced in commit `2bf6c9ff71`. The sequence subscription test switches regress_seq_sub to connect to the publisher as regress_seq_repl (a non-superuser) when checking behavior with insufficient sequence privileges but forgot to set up pg_hba.conf to allow connections from it. The special setup is only needed on Windows machines that don't use UNIX sockets. As per buildfarm. Reported-by: Ajin Cherian <itsajin@gmail.com> Author: Ayush Tiwari <ayushtiwari.slg01@gmail.com> Reviewed-by: vignesh C <vignesh21@gmail.com> Discussion: https://postgr.es/m/CAFPTHDad911HUMkHgD1KZk+WOvTopiBcYf4C_8Fqj1-sZk3xgw@mail.gmail.com	2026-05-01 14:35:26 +05:30
Michael Paquier	0916282a06	doc: Mention validation attempt during ALTER INDEX .. ATTACH PARTITION Since `9d3e094f12`, the command tries to validate the parent index of the named index, if invalid. The documentation did not mention this behavior, which could be confusing. Author: Mohamed ALi <moali.pg@gmail.com> Discussion: https://postgr.es/m/CAGnOmWpHu25_LpT=zv7KtetQhqV1QEZzFYLd_TDyOLu1Od9fpw@mail.gmail.com Backpatch-through: 14	2026-05-01 13:10:35 +09:00
Fujii Masao	c0b24b32b0	Avoid blocking indefinitely while finishing walsender shutdown When walsender finishes streaming during shutdown, it sends a CommandComplete message to tell the receiver that WAL streaming is done. Previously, that path used EndCommand() followed by pq_flush(). Those functions can block indefinitely waiting for the socket to become writeable. As a result, even when wal_sender_shutdown_timeout is set, walsender could remain stuck while sending the final completion message, and the shutdown timeout would not be enforced. Fix this by introducing EndCommandExtended(), which allows CommandComplete to be queued with pq_putmessage_noblock(), and by using the walsender nonblocking flush path instead of pq_flush(), so the shutdown timeout continues to be checked while pending output is flushed. Per CI testing on FreeBSD. Reported-by: Andres Freund <andres@anarazel.de> Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/vwlugmsogfn36jhm56zwrgd7m6xe6ircltvfh3kzt6kldvbtht@f45dgow5uhnx	2026-05-01 12:12:44 +09:00
Richard Guo	f76686ce7f	Fix HAVING-to-WHERE pushdown with nondeterministic collations When GROUP BY uses a nondeterministic collation, the planner's optimization of moving HAVING clauses to WHERE can produce incorrect query results. The HAVING clause may apply a stricter collation that distinguishes values the GROUP BY considers equal. Pushing such a clause to WHERE causes it to filter individual rows before grouping, potentially eliminating group members and changing aggregate results. Fix this by detecting collation conflicts before flatten_group_exprs, while the HAVING clause still contains GROUP Vars (Vars referencing RTE_GROUP). At that point, each GROUP Var directly carries the GROUP BY collation as its varcollid, making it straightforward to compare against the operator's inputcollid. A mismatch where the GROUP BY collation is nondeterministic means the clause is unsafe to push down. RowCompareExpr is treated specially, since it carries per-column inputcollids[] rather than a single inputcollid. The conflicting clause indices are recorded in a Bitmapset and consulted during the existing HAVING-to-WHERE loop, so that only affected clauses are kept in HAVING; other safe clauses in the same query are still pushed. Back-patch to v18 only. The fix relies on the RTE_GROUP mechanism introduced in v18 (commit `247dea89f`), which is what lets us identify grouping expressions and their resolved collations via GROUP Vars on pre-flatten havingQual. Pre-v18 branches lack that machinery, so a back-patch there would need a different approach. Given the absence of field reports of this bug on back branches, the risk of carrying a different fix on stable branches is not justified. Author: Richard Guo <guofenglinux@gmail.com> Reviewed-by: wenhui qiu <qiuwenhuifx@gmail.com> Discussion: https://postgr.es/m/CAMbWs48Dn2wW6XM94GZsoyMiH42=KgMo+WcobPKuWvGYnWaPOQ@mail.gmail.com Backpatch-through: 18	2026-05-01 11:13:50 +09:00
Amit Langote	410013d2a5	Use "concurrent delete" in serialization error for TM_Deleted cases In ExecLockRows() and ri_LockPKTuple(), the TM_Deleted code path was using the same "could not serialize access due to concurrent update" message as the TM_Updated path. Use "concurrent delete" instead, since the tuple was deleted, not updated. The ExecLockRows() instance was likely a copy-paste error per Andres; the ri_LockPKTuple() instance was carried over from the same pattern in commit `2da86c1ef9`. Update affected isolation test expected files accordingly and add a new test to fk-concurrent-pk-upd.spec with concurrent delete of the PK row. The ExecLockRows() change is master-only for lack of user complaints and to avoid breaking anything that might match on the error text. Reported-by: Jian He <jian.universality@gmail.com> Author: Amit Langote <amitlangote09@gmail.com> Reviewed-by: Junwang Zhao <zhjwpku@gmail.com> Discussion: https://postgr.es/m/CACJufxEG1JTCq4A1gnNAu-bGAq9Xn=Xkf7kC3TRWFz6iuUOuRA@mail.gmail.com	2026-05-01 10:00:29 +09:00
Richard Guo	8d829f5a02	Fix JSON_ARRAY(query) empty set handling and view deparsing According to the SQL/JSON standard, JSON_ARRAY(query) must return an empty JSON array ('[]') when the subquery returns zero rows. Previously, the parser rewrote JSON_ARRAY(query) into a JSON_ARRAYAGG aggregate function. Because this aggregate evaluates to NULL over an empty set without a GROUP BY clause, the constructor erroneously returned NULL. Additionally, this premature rewrite baked physical implementation details into the catalog, preventing ruleutils.c from deparsing the original syntax for views. This patch resolves both issues by introducing a new JSCTOR_JSON_ARRAY_QUERY constructor type. The parser builds the executable form --- a COALESCE-wrapped JSON_ARRAYAGG subquery --- from raw parse nodes via transformExprRecurse, and stores it in the func field. The original transformed Query is kept in a new orig_query field so that ruleutils.c can deparse the original syntax for views. During planning, eval_const_expressions replaces the node with the pre-built func expression. The deparsing issue was reported by Tom Lane. Bump catalog version. Bug: #19418 Reported-by: Lukas Eder <lukas.eder@gmail.com> Author: Richard Guo <guofenglinux@gmail.com> Reviewed-by: Amit Langote <amitlangote09@gmail.com> Discussion: https://postgr.es/m/19418-591ba1f29862ef5b@postgresql.org	2026-05-01 09:42:00 +09:00
Álvaro Herrera	6ca631b990	REPACK CONCURRENTLY: fix processing of toasted tuples In order to process tuples inserted or updated while REPACK executes, we write those tuples to disk and later restore them; however, some forms of toasted tuples were not being processed correctly. Fix that. Also expand the tests a bit for better coverage. Author: Satya Narlapuram <satyanarlapuram@gmail.com> Author: Antonin Houska <ah@cybertec.at> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CAHg+QDeXb9HM2VGKXQedyCp52GzajJK5KOUdNi6oLjsS0nerQw@mail.gmail.com	2026-04-30 23:32:57 +02:00
Álvaro Herrera	2fd787d0aa	Remove working test that was supposed to fail I evidently failed to review the expected output in commit `832e220d99` carefully enough. Per complaint from Tom Lane. Discussion: https://postgr.es/m/769631.1777575242@sss.pgh.pa.us	2026-04-30 22:57:24 +02:00
Andrew Dunstan	6cf49e804c	Fix attnum remapping in generateClonedExtStatsStmt() When cloning extended statistics via CREATE TABLE ... LIKE ... INCLUDING STATISTICS, stxkeys holds attribute numbers from the source (parent) table, but get_attname() was being called with the child relation's OID. If the parent has dropped columns, the child's attribute numbers are renumbered sequentially and no longer match, so the lookup either returns the wrong column name (silent corruption) or errors out when the attnum does not exist in the child. Fix it by remapping the parent attnum through attmap before the lookup, consistent with how expression statistics are already handled a few lines below. Add a regression test covering both manifestations: a 3-column parent where the stale attnum refers to no child column (cache-lookup error), and a 4-column parent where the stale attnum silently refers to the wrong child column. Author: Julien Tachoires <julmon@gmail.com> Reviewed-by: Srinath Reddy Sadipiralla <srinath2133@gmail.com> Discussion: https://postgr.es/m/20260415105718.tomuncfbmlt67oel@poseidon.home.virt Backpatch-through: 14	2026-04-30 11:04:57 -04:00
Andrew Dunstan	5642a0367c	Avoid SIGSEGV in pg_get_database_ddl() on NULL tablespace There is a narrow race in which a concurrent ALTER DATABASE ... SET TABLESPACE moves the database off the tablespace and a DROP TABLESPACE removes it between the syscache lookup and the catalog scan. If that happens, output an error. Author: Chao Li <lic@highgo.com> Reviewed-by: Jack Bonatakis <jack@bonatak.is> Reviewed-by: Satyanarayana Narlapuram <satyanarlapuram@gmail.com> Reviewed-by: Japin Li <japinli@hotmail.com> Discussion: https://postgr.es/m/573E45C1-31A4-4885-A00C-1A2171159A2A@gmail.com	2026-04-30 10:14:52 -04:00
Daniel Gustafsson	75152c5dc5	Fix data_checksum GUC show_hook Commit `f19c0eccae` erroneously omitted the show_hook for the data_checksum GUC. Author: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Tomas Vondra <tomas@vondra.me> Reviewed-by: SATYANARAYANA NARLAPURAM <satyanarlapuram@gmail.com> Reviewed-by: Ayush Tiwari <ayushtiwari.slg01@gmail.com> Discussion: https://postgr.es/m/9197F930-DDEB-4CAC-82A2-16FEC715CCE8@yesql.se	2026-04-30 13:41:57 +02:00
Daniel Gustafsson	1df361e3d8	Improve database detection logic in datachecksumsworker The worker needs to know whether a database which failed checksum processing still exists, or has been dropped. This improves the detection logic by checking for being partially dropped. Author: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Tomas Vondra <tomas@vondra.me> Reviewed-by: SATYANARAYANA NARLAPURAM <satyanarlapuram@gmail.com> Reviewed-by: Ayush Tiwari <ayushtiwari.slg01@gmail.com> Discussion: https://postgr.es/m/9197F930-DDEB-4CAC-82A2-16FEC715CCE8@yesql.se	2026-04-30 13:41:55 +02:00
Daniel Gustafsson	bf25e5571b	Improve handling of concurrent checksum requests When pg_{enable|disable}_data_checksums is called while checksums are being enabled or disabled, the already running launcher is detected and the new desired state is recorded. Processing will then pick up the new state and change its operation to fulfill the new request. If the same state is requested but with different cost values, the new cost values will take effect on the next relation processed. The previous coding had a complex logic of starting a new launcher for this, which is now avoided with the shared mem structure instead used to signal current processing. This makes the logic more robust, and fixes a bug where the launcher would erroneously revert back to the "off" state. Access to the shared memory is also protected with LWLocks in all cases. Since the shmem structure is used for signalling between the worker and the launcher, and there can be only one of each, there were no concurrency issues detected but it's better to stick to proper locking protocol should this ever be updated to handle multiple workers. Author: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Tomas Vondra <tomas@vondra.me> Reviewed-by: SATYANARAYANA NARLAPURAM <satyanarlapuram@gmail.com> Reviewed-by: Ayush Tiwari <ayushtiwari.slg01@gmail.com> Discussion: https://postgr.es/m/9197F930-DDEB-4CAC-82A2-16FEC715CCE8@yesql.se	2026-04-30 13:41:53 +02:00
Daniel Gustafsson	381d19da15	Typo and spelling fixups for online checksums A collection of spelling, wording and punctuation fixups for the code documentation from postcommit review. Author: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Tomas Vondra <tomas@vondra.me> Reviewed-by: Ayush Tiwari <ayushtiwari.slg01@gmail.com> Reviewed-by: SATYANARAYANA NARLAPURAM <satyanarlapuram@gmail.com> Discussion: https://postgr.es/m/9197F930-DDEB-4CAC-82A2-16FEC715CCE8@yesql.se	2026-04-30 13:41:50 +02:00
Daniel Gustafsson	25b922ec58	Fix invalid checksum state transition in checkpoints Commit `78e950cb8` added checksum state handling to all XLOG_CHECKPOINT records which caused unnecessary state transitions and emission of procsignal barriers. Remove as only the _REDO record needs to handle checksum state. Barrier emission is also consistently made after controlfile updates to avoid race conditions. Additionally, interrupts are held between calling ProcSignalInit and InitLocalDataChecksumState to remove a window where otherwise invalid state transitions can happen. Also remove a pointless assertion on Controlfile which will never hit. Author: Tomas Vondra <tomas@vondra.me> Author: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Ayush Tiwari <ayushtiwari.slg01@gmail.com> Reviewed-by: SATYANARAYANA NARLAPURAM <satyanarlapuram@gmail.com> Discussion: https://postgr.es/m/9197F930-DDEB-4CAC-82A2-16FEC715CCE8@yesql.se	2026-04-30 13:41:48 +02:00
Daniel Gustafsson	8fb8ded889	Handle data_checksum state changes during launcher_exit When erroring out from the datachecksums launcher during data checksum enabling, before state has transitioned to "on", we revert back to the "off" state. Since checksums weren't enabled, there is no use staying in an inprogress state since the checksum launcher currently doesn't support restarting from where it left off. Should restartability get added in the future, this would need to be revisited. This state transition was however missing from the allowed transitions in the statemachine causing an error. Author: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Tomas Vondra <tomas@vondra.me> Reviewed-by: Ayush Tiwari <ayushtiwari.slg01@gmail.com> Reviewed-by: SATYANARAYANA NARLAPURAM <satyanarlapuram@gmail.com> Discussion: https://postgr.es/m/9197F930-DDEB-4CAC-82A2-16FEC715CCE8@yesql.se	2026-04-30 13:41:46 +02:00
Daniel Gustafsson	a0d8f4c1ae	Test improvements for online checksums This includes a number of smaller fixups to the online checksums test module which were found during postcommit review and stabilization work. * Fix scope increase for PG_TEST_EXTRA: The online checksums tests have two levels of PG_TEST_EXTRA, checksum and checksums_extended for extra test runs and test runs with increased randomization. The logic for increasing the number of test iterations was however backwards. * Change stopmode for PITR test: The pitr suite used immediate stop mode which caused problems on slower machines where the sigquit would interrupt archive commands leaving partial WAL files behind. This would then prevent restart. Fix by using fast mode which is the appropriate mode for the test at hand. Also increase timeouts to help slower test systems since an expired timeout will incur the same effect as an immediate standby with a partial WAL left behind. This issue was observed when running the test suites on a Raspberry Pi 4 machine. * Improve logging: The test suite for data checksums uses a set of helper functions in a Perl module to avoid repeating code; this makes sure that the helper functions do a better job of logging their test output to make debugging easier. * Remove unused code: wait_for_cluster_crash was used during the development of online checksums but was never used in any test which shipped, so remove the function. * Standby fixes: Ensure no vacuum on pgbench init on standby with -n to avoid bogus error message in the log, and enable hot_standby_feedback to prevent queries from getting cancelled due to recovery on slower systems. Author: Daniel Gustafsson <daniel@yesql.se> Author: Tomas Vondra <tomas@vondra.me> Reviewed-by: Ayush Tiwari <ayushtiwari.slg01@gmail.com> Reviewed-by: SATYANARAYANA NARLAPURAM <satyanarlapuram@gmail.com> Discussion: https://postgr.es/m/9197F930-DDEB-4CAC-82A2-16FEC715CCE8@yesql.se	2026-04-30 13:41:43 +02:00
Daniel Gustafsson	b120358c61	Prevent pg_enable/disable_data_checksums() on standby These functions missed a RecoveryInProgress() check, allowing them to be called on a hot standby. Enabling, or disabling, checksums on the standby only would cause the cluster to get out of sync and replaying checksum transitions to fail. Author: Satyanarayana Narlapuram <satyanarlapuram@gmail.com> Reviewed-by: Ayush Tiwari <ayushtiwari.slg01@gmail.com> Reviewed-by: Tomas Vondra <tomas@vondra.me> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://postgr.es/m/CAHg+QDfRk4-S7DMmdbXJnQ-xF=sUpMAKuh8b83ObLqYVKx5QLA@mail.gmail.com	2026-04-30 13:41:41 +02:00
Amit Kapila	2bf6c9ff71	Fix double table_close of sequence_rel in copy_sequences(). sequence_rel was declared at batch scope, so when a row is skipped due to concurrent drop or insufficient privileges, the end-of-row cleanup closes the stale pointer from the previous row, tripping the relcache refcount assertion. Move sequence_rel inside the per-row loop. Author: Ayush Tiwari <ayushtiwari.slg01@gmail.com> Reviewed-by: vignesh C <vignesh21@gmail.com> Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com> Discussion: https://postgr.es/m/CAJTYsWWOuw-yfmzotV4jCJ6LLxEsb=STLcGtDYXOxRcU9Te3Pw@mail.gmail.com	2026-04-30 16:39:39 +05:30
Michael Paquier	5941e7f092	Fix errno check based on EINTR in pg_flush_data() Upon a failure of sync_file_range(), EINTR was checked based on the returned result of the routine rather than its errno. sync_file_range() returns -1 on failure, making the check a no-op, invalidating the retry attempt in this case. Oversight in `0d369ac650`. Author: DaeMyung Kang <charsyam@gmail.com> Discussion: https://postgr.es/m/20260429151811.1810874-1-charsyam@gmail.com Backpatch-through: 16	2026-04-30 18:44:38 +09:00
Michael Paquier	ac59a90bef	Adjust some incorrect GetDatum() macros This reverts portions of commit `6dcfac9696`, which is wrong in trying to use a GetDatum() that matches with the C types of the values read. GetDatum() should match with the output argument types of the SQL functions. The portions of `6dcfac9696` that are right regarding this rule are: - gistget.c, where the GiST support functions use DatumGetUInt16() to retrieve the strategy number. - The BRIN code for strategynum, used in syscache lookups. The adjustments done in this commit are for pageinspect, pg_buffercache and pg_lock_status(). While double-checking the whole state of the tree regarding non-matching pairs of DatumGet() and *GetDatum(), I have found many more code paths that are incorrect, unrelated to `6dcfac9696`. These may be adjusted in the future, in a different patch (perhaps not for v19, as we are already past feature freeze). Reported-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/97f9375a-be61-4272-a44d-408337fe8fa6@eisentraut.org Discussion: https://postgr.es/m/CAJ7c6TMcGu8qmRe1gZfJ-gOzVnZq-t=fwn-UuyStx1w6ZyydMw@mail.gmail.com	2026-04-30 13:10:19 +09:00
Michael Paquier	4bfd0f1b76	Fix error of pg_stat_reset_shared() "lock" is a supported value since `4019f725f5`, but the error message of the function used when specifying an incorrect value forgot about it. Author: Maksim Logvinenko <logvinenko-ms@yandex.ru> Discussion: https://postgr.es/m/433431777389005@mail.yandex.ru	2026-04-30 11:12:56 +09:00
Nathan Bossart	3dd42ee97b	Suppress "has no symbols" linker warnings on macOS. After a recent macOS update, building Postgres produces warnings that look like this: ranlib: warning: 'libpgport_shlib.a(pg_cpu_x86.c.o)' has no symbols ranlib: warning: 'libpgport_shlib.a(pg_popcount_x86.c.o)' has no symbols To fix, add a dummy symbol to files that may otherwise have none. Per project policy, this is a candidate for back-patching into out-of-support branches: it suppresses annoying compiler warnings but changes no behavior. Reported-by: Zhang Mingli <zmlpostgres@gmail.com> Reviewed-by: John Naylor <johncnaylorls@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/229aaaf3-f529-44ed-8e50-00cb6909af21%40Spark Backpatch-through: 13	2026-04-29 12:25:09 -05:00
Masahiko Sawada	a424e31b16	test_tidstore: Stabilize regression tests by sorting offsets. TidStoreSetBlockOffsets() requires its offsets array to be strictly ascending and asserts this precondition. In test_tidstore, we were passing random offset numbers deduplicated by a DISTINCT clause in an array_agg() call directly to the do_set_block_offsets() test harness. However, DISTINCT without an ORDER BY clause does not guarantee sorted results according to the SQL standard. Fix this by sorting the offsets in-place inside do_set_block_offsets() before calling TidStoreSetBlockOffsets(). While this assertion failure is not observed during regular regression tests because they use queries simple enough that the optimizer consistently chooses plans yielding sorted results, it makes sense to stabilize the test. The failure could theoretically occur depending on the optimizer's plan choice, and has been reported when experimenting with certain third-party extensions. Backpatch to v17, where test_tidstore was introduced, to ensure extension development on stable branches does not hit this assertion. Reported-by: Andrei Lepikhov <lepihov@gmail.com> Author: Andrei Lepikhov <lepihov@gmail.com> Discussion: https://postgr.es/m/b97f1850-fc7b-43c4-9b04-4e97bb9e7dc0@gmail.com Backpatch-through: 17	2026-04-29 09:10:04 -07:00
Andrew Dunstan	df1bac400f	Fix timezone dependence in test_misc/012_ddlutils.pl The tests introduced in `c529ee38b9` are timezone sensitive. Pin the cluster's timezone to UTC at init time so timestamptz output is deterministic regardless of the host's local timezone.	2026-04-29 12:00:32 -04:00
Andrew Dunstan	c529ee38b9	Convert ddlutils regression tests to TAP tests. The regression tests for pg_get_role_ddl(), pg_get_database_ddl(), and pg_get_tablespace_ddl() created databases and tablespaces, which are heavyweight operations. As noted by Andres Freund, this is wasteful in the core regression suite which gets run repeatedly. Convert the three test files (role_ddl.sql, database_ddl.sql, tablespace_ddl.sql) into a single TAP test that runs once, covering all the same functionality: basic DDL generation, pretty-printing, option handling, error cases, permission checks, and edge cases like quoted names and role memberships. Discussion: https://postgr.es/m/5c67dc79-909a-4e17-8606-6686667da6c6@dunslane.net	2026-04-29 11:34:01 -04:00
Peter Geoghegan	748d871b7c	Fix nbtree skip array parallel alloc accounting. btestimateparallelscan neglected to add btps_arrElems[] space overhead for skip array scan keys that were later output by nbtree preprocessing. Skip arrays don't actually need to use this space, but a scan with a subsequent SAOP array will need to subscript btps_arrElems[] using a simple so->arrayKeys[]-wise offset. so->arrayKeys[] has entries for both kinds of arrays. As a result of this oversight, it was possible for an index scan with a skip array and a lower-order SAOP array to write past the allocated shared memory boundary when storing the SAOP array's cur_elem. In practice the problem seems to be limited to scans with many skipped index columns, since our general approach to estimating the amount of shared memory that will be required is fairly conservative. To fix, have btestimateparallelscan request an extra sizeof(int) space for key columns that might require a skip array later on. Oversight in commit `92fe23d9`, which added the nbtree skip scan optimization. Author: Siddharth Kothari <sidkot@google.com> Discussion: https://postgr.es/m/CAGCUe0Lwk3C0qdkBa+OLpYc7yXwW=pbaz8Sju4xMXEQAmyp+5g@mail.gmail.com Backpatch-through: 18	2026-04-29 11:22:23 -04:00
John Naylor	ca9807dfec	Cosmetic fixes for radix sort Do minor comment fixes and remove implicit cast to Datum. While here, let's prefer crashing instead of entering an infinite loop in case of future programming mistakes when computing next_level, suggested by ChangAo Chen. Discussion: https://postgr.es/m/tencent_49E3F11E74D8A584A2144ED532A490CBC40A@qq.com	2026-04-29 16:14:25 +07:00
John Naylor	a0302eac78	Remove unused ByteaSortSupport.abbreviate field Oversight in commit `9303d62c6`. Author: Aleksander Alekseev <aleksander@tigerdata.com> Discussion: https://postgr.es/m/CAJ7c6TOsKmmgyA6EwxKVsNeHFHrWXYdgZivgjo_ujf890BpeeA@mail.gmail.com	2026-04-29 13:57:07 +07:00
Amit Kapila	c210647aeb	Fix xid_advance_interval when max_retention_duration is 0. When a subscription has retain_dead_tuples enabled and maxretention is zero (unlimited), adjust_xid_advance_interval() mistakenly caps xid_advance_interval to zero. This zero interval forces get_candidate_xid() to evaluate TimestampDifferenceExceeds() as always true, causing the apply worker to call GetOldestActiveTransactionId() for every WAL message. This leads to unnecessary ProcArrayLock acquisitions. Fix this by only capping the interval when maxretention > 0, allowing the exponential back-off to function properly. Author: SATYANARAYANA NARLAPURAM <satyanarlapuram@gmail.com> Reviewed-by: shveta malik <shveta.malik@gmail.com> Reviewed-by: Nisha Moond <nisha.moond412@gmail.com> Discussion: https://postgr.es/m/CAHg+QDdKVnCLHot=AcoPpEiSyDzGz7wGYjAFHVOw57oDtmUDWQ@mail.gmail.com	2026-04-28 14:51:38 +05:30
Amit Kapila	7424aac088	Fix wrong datum conversion for subretentionactive in CreateSubscription. Use BoolGetDatum() instead of Int32GetDatum() when storing the boolean subretentionactive column in pg_subscription. This was an oversight in `a850be2fe6`. Author: Lakshmi N <lakshmin.jhs@gmail.com> Reviewed-by: Nisha Moond <nisha.moond412@gmail.com> Discussion: https://postgr.es/m/CA+3i_M98-XjE-_fw0p+8xOnw64y2_YLtJfcwvCfsVMn-z2ZjGg@mail.gmail.com	2026-04-28 13:13:47 +05:30
Álvaro Herrera	832e220d99	REPACK CONCURRENTLY: Don't use deferrable primary keys Similarly to logical replication, REPACK CONCURRENTLY needs the ability to reliably locate a tuple based on an identity. A replica identity index is okay. Primary keys normally also are, except when they are deferrable, because a tuple being modified might not yet be indexed, causing REPACK to fail. Change the REPACK CONCURRENTLY code to use GetRelationIdentityOrPK(), similar to what the logical replication code does. (Though we don't yet support locating tuples based on arbitrary indexes for replica identity FULL.) While at it, add a few more test cases for situations that aren't supported by REPACK, to improve coverage. Author: Chao Li <lic@highgo.com> Reviewed-by: Zhijie Hou <houzj.fnst@fujitsu.com> Reviewed-by: Antonin Houska <ah@cybertec.at> Reviewed-by: Yuchen Li <liyuchen_xyz@163.com> Discussion: https://postgr.es/m/10DD5E13-B45D-44F1-BE08-C63E00ABCAC0@gmail.com	2026-04-27 18:22:03 +02:00
Peter Eisentraut	33db6c4baf	Fix DELETE/UPDATE FOR PORTION OF with rules Previously, these test cases would give internal errors or crash. The fix is to add some missing fields of ForPortionOfExpr to expression_tree_walker. Author: jian he <jian.universality@gmail.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Paul A Jungwirth <pj@illuminatedcomputing.com> Discussion: https://postgr.es/m/CACJufxHs1Hs00EqsZ4NbuAjmYzMzjJyP1sAj12Ne=cBsEVmQOA@mail.gmail.com	2026-04-27 10:34:06 +02:00
Michael Paquier	31b9d90f15	doc: Fix grammar in some logical replication pages Author: Peter Smith <smithpb2250@gmail.com> Discussion: https://postgr.es/m/CAHut+PuvY_wYLPJ4DTs7NE9Lu2ty4d-OgZAOJC-NvCM=2wwcQQ@mail.gmail.com Backpatch-through: 14	2026-04-27 16:17:04 +09:00
Richard Guo	c66d6d19eb	Fix bogus calls in remove_self_join_rel() remove_self_join_rel() called adjust_relid_set() on all_result_relids and leaf_result_relids but threw away the return value. Since adjust_relid_set() returns a freshly-built Relids and does not modify the input in place, the calls did nothing. This has been the case since the SJE feature went in (commit `fc069a3a6`). There has been no observable misbehavior, because the relid being passed is guaranteed not to be a member of either set. At the point remove_self_join_rel() runs, those sets contain only resultRelation; inheritance children have not been added yet, as that happens later in query_planner(), in expand_single_inheritance_child() called from add_other_rels_to_query(). And remove_self_joins_recurse() rejects parse->resultRelation as an SJE candidate to preserve the EvalPlanQual mechanism. Even with the result assigned, the calls would be no-ops in practice. Rather than make the calls do the cleanup they pretend to do, replace them with assertions of the invariant. Any future loosening of the SJE candidate filter -- for instance to allow eliminating a result relation under provable conditions -- will trip the assertion and force whoever does it to revisit this code. Additionally, decorate adjust_relid_set() with pg_nodiscard so that any future accidental discard of its return value is caught at compile time. Author: Richard Guo <guofenglinux@gmail.com> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/CAMbWs49fYQcqJfJ_Gtn8r1GFNoYtb1=2AUab4ieuqY4Zid9ocQ@mail.gmail.com	2026-04-27 10:40:37 +09:00
Michael Paquier	b801d5eef1	Fix some memory leaks in the WAL receiver These are old leaks, that can pile up if a WAL receiver stays alive, waiting for new WAL data after the sender has switched to a new timeline. While this is technically a bug, the impact is minimal and would only become noticeable if the WAL sender handles a lot of timeline switches, so no backpatch is done. Note that in most cases, primary_conninfo would be updated in a standby to point to a new sender, meaning a restart of the WAL receiver. Let's be clean on HEAD, though. Author: DaeMyung Kang <charsyam@gmail.com> Discussion: https://postgr.es/m/20260426170100.847923-1-charsyam@gmail.com Discussion: https://postgr.es/m/20260426170219.849330-1-charsyam@gmail.com	2026-04-27 10:32:45 +09:00
Noah Misch	f9c638054c	Fix new test with comma in build directory. Quote pg_hosts.conf fields derived from the build directory, since hba.c:next_token() treats a comma as a token separator. Commit `4f433025f6` introduced pg_hosts.conf and this test. A build directory name containing a comma worked before that commit. A build directory name containing a quote character has not worked, so don't handle that. Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://postgr.es/m/20260426213252.7a@rfd.leadboat.com	2026-04-26 15:03:51 -07:00
Peter Eisentraut	7d7e58feef	Don't use INT64_FORMAT in translatable string Use PRId64 instead.	2026-04-25 20:23:03 +02:00
Tom Lane	f64f62f5be	Update time zone data files to tzdata release 2026b. British Columbia (America/Vancouver) moved to permanent UTC-07 on 2026-03-09, which will affect their clocks beginning on 2026-11-01. For lack of any clarity on the point, assume their TZ abbreviation will be MST from that time forward. Moldova (Europe/Chisinau) has followed EU DST transition times since 2022. Backpatch-through: 14	2026-04-24 12:28:35 -04:00
Peter Eisentraut	3b28dad70e	meson: Differentiate top-level and custom targets We need to create top-level targets to run targets with the ninja command like `ninja <target_name>`. Some targets (man, html, ...) have the same name as both a top-level target and a custom target. This creates confusion for the meson build: $ meson compile -C build html ``` ERROR: Can't invoke target `html`: ambiguous name. Add target type and/or path: - ./doc/src/sgml/html:custom - ./doc/src/sgml/html:alias ``` Solve that problem by adding a '-custom' suffix to the custom target names of these problematic targets. Top-level targets can be called with both meson and ninja now: $ meson compile -C build html $ ninja -C build html Author: Nazir Bilal Yavuz <byavuz81@gmail.com> Suggested-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/5508e572-79ae-4b20-84d0-010a66d077f2%40eisentraut.org	2026-04-24 09:51:09 +02:00
Peter Eisentraut	9d2979dd68	pg_get_viewdef() and lateral references in COLUMNS of GRAPH_TABLE Expressions in GRAPH_TABLE COLUMNS list may have lateral references. get_rule_expr() requires lateral namespaces to deparse such references. get_from_clause_item() does not pass them when processing the expressions in COLUMNS list causing ERROR "bogus varlevelsup: 0 offset 0". Fix get_from_clause_item() to pass input deparse_context containing lateral namespaces to get_rule_expr() instead of the dummy context. Author: Satyanarayana Narlapuram <satyanarlapuram@gmail.com> Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CAHg%2BQDcLVa2iBnggkHxY4itZbXtDMfsYHEjnCUYe9hNbnxDi-w%40mail.gmail.com	2026-04-24 09:12:03 +02:00
Peter Eisentraut	ac3bcc041c	Fix collation of expressions in GRAPH_TABLE COLUMNS clause GRAPH_TABLE clause is converted into a rangetable entry, which is ignored by assign_query_collations(). Hence we assign collations while transforming its parts. But expressions in COLUMNS clause missed that treatment, so fix that. While at it, also add comments about collation assignment to the parts of GRAPH_TABLE clause, and also fix a small grammar issue. Reported-by: Satyanarayana Narlapuram <satyanarlapuram@gmail.com> Author: Satyanarayana Narlapuram <satyanarlapuram@gmail.com> Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Discussion: https://www.postgresql.org/message-id/CAHg+QDc4aaiufYSgrwMMPMMRTPtQ66SghcrPFbWJFZMqNaG+BA@mail.gmail.com	2026-04-24 08:43:26 +02:00
Peter Eisentraut	9082680c34	Fix typos and grammar in graph table rewrite code Reported-by: Lakshmi N <lakshmin.jhs@gmail.com> Author: Lakshmi N <lakshmin.jhs@gmail.com> Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Discussion: https://www.postgresql.org/message-id/CA+3i_M9gpUGjH-BkJk=UFjK16jq9fEQHpmZ1cxpJO+xM4hWC+A@mail.gmail.com	2026-04-24 08:27:04 +02:00
Peter Eisentraut	2ff289d039	Check for stack overflow when rewriting graph queries generate_queries_for_path_pattern_recurse() and generate_setop_from_pathqueries() are recursive functions. For a property graph with hundreds of tables, a graph pattern with a handful of element patterns can cause stack overflow. Fix it by calling check_stack_depth() at the beginning of these functions. Author: Satyanarayana Narlapuram <satyanarlapuram@gmail.com> Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Discussion: https://www.postgresql.org/message-id/CAHg+QDfgK0xddH8f3eAb+UVn7sBDOnv8RvM6OkP4HtHAt6aD7w@mail.gmail.com	2026-04-24 08:18:21 +02:00
Fujii Masao	863c4b827d	pg_test_timing: store timing deltas in int64 Commit `0b096e379e` changed pg_test_timing to measure timing differences in nanoseconds instead of microseconds, but the resulting deltas continued to be stored in int32. That can overflow for large gaps (for example, values greater than about 2.14 seconds in nanoseconds), leading to truncation or incorrect output. This commit fixes the issue by storing measured timing deltas in int64. This prevents overflow for large values and better matches nanosecond-resolution measurements. Author: Chao Li <lic@highgo.com> Reviewed-by: Lukas Fittl <lukas@fittl.com> Reviewed-by: Xiaopeng Wang <wxp_728@163.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/F780CEEB-A237-4302-9F55-60E9D8B6533D@gmail.com	2026-04-24 12:11:40 +09:00
David Rowley	94219a73f7	Fix incorrect logic for hashed IN / NOT IN with non-strict operators ExecEvalHashedScalarArrayOp(), when using a strict equality function, performs a short-circuit when looking up NULL values. When the function is non-strict, the code incorrectly looked up the hash table for a zero-valued Datum, which could have resulted in an accidental true return if the hash table contained a zero-valued Datum, or could result in a crash for non-byval types. Here we fix this by adding an extra step when we build the hash table to check what the result of a NULL lookup would be. This requires looping over the array and checking what the non-hashed version of the code would do. We cache the results of that in the expression so that we can reuse the result any time we're asked to search for a NULL value. It's important to note that non-strict equality functions are free to treat any NULL value as equal to any non-NULL value. For example, someone may wish to design a type that treats an empty string and NULL as equal. All built-in types have strict equality functions, so this could affect custom / user-defined types. Author: Chengpeng Yan <chengpeng_yan@outlook.com> Author: David Rowley <dgrowleyml@gmail.com> Reviewed-by: ChangAo Chen <cca5507@qq.com> Discussion: https://postgr.es/m/A16187AE-2359-4265-9F5E-71D015EC2B2D@outlook.com Backpatch-through: 14	2026-04-24 14:03:12 +12:00
Fujii Masao	019cc9962b	pg_test_timing: fix unit in backward-clock warning pg_test_timing reports timing differences in nanoseconds in master, and in microseconds in v14 through v18, but previously the backward-clock warning incorrectly labeled the value as milliseconds. This commit fixes the warning message to use "ns" in master and "us" in v14 through v18, matching the actual unit being reported. Backpatch to all supported versions. Author: Chao Li <lic@highgo.com> Reviewed-by: Lukas Fittl <lukas@fittl.com> Reviewed-by: Xiaopeng Wang <wxp_728@163.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/F780CEEB-A237-4302-9F55-60E9D8B6533D@gmail.com Backpatch-through: 14	2026-04-24 09:02:03 +09:00
Peter Eisentraut	aa27a3331a	Add missing source files to several nls.mk	2026-04-23 21:52:02 +02:00
Heikki Linnakangas	713bce9484	Don't call CheckAttributeType() with InvalidOid on dropped cols If CheckAttributeType() is called with InvalidOid, it performs a bunch of pointless, futile syscache lookups with InvalidOid, but ultimately tolerates it and has no effect. We were calling it with InvalidOid on dropped columns, but it seems accidental that it works, so let's stop doing it. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/93ce56cd-02a6-4db1-8224-c8999372facc@iki.fi Backpatch-through: 14	2026-04-23 21:28:26 +03:00
Heikki Linnakangas	dd40691976	Don't allow composite type to be member of itself via multirange CheckAttributeType() checks that a composite type is not made a member of itself with ALTER TABLE ADD COLUMN or ALTER TYPE ADD ATTRIBUTE, even indirectly via a domain, array, another composite type or a range type. But it missed checking for multiranges. That was a simple oversight when multiranges were added. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/93ce56cd-02a6-4db1-8224-c8999372facc@iki.fi Backpatch-through: 14	2026-04-23 21:28:11 +03:00
Álvaro Herrera	4b2aa4b39c	Move REPACK (CONCURRENTLY) test out of stock regression tests These tests sometimes run with wal_level=minimal, which does not allow running REPACK (CONCURRENTLY). Move them to test_decoding, which is guaranteed to run with a high enough wal_level. Discussion: https://postgr.es/m/260901.1776696126@sss.pgh.pa.us	2026-04-23 12:34:41 +02:00
Amit Kapila	2e1d4fdb10	psql: Improve describe footer titles for publications. The psql describe (`\d`) footer titles were previously unintuitive when listing publications that included or excluded specific tables. Even though the tag for included publications was pre-existing, it is better to update it to "Included in publications:" to match the phrasing of the "Excluded from publications:" tag. Footer titles for sequence and schema descriptions have been updated similarly to maintain consistency. Reported-by: Álvaro Herrera <alvherre@kurilemu.de> Author: Peter Smith <smithpb2250@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: vignesh C <vignesh21@gmail.com> Reviewed-by: Yuchen Li <liyuchen_xyz@163.com> Discussion: https://postgr.es/m/aeDs7iZUox1bbKAK%40alvherre.pgsql	2026-04-23 14:10:03 +05:30
Peter Eisentraut	71123a5454	Avoid casting void * function arguments Like commit `c3c240537f`, but for newly added code.	2026-04-23 08:08:57 +02:00
David Rowley	4f0cbc6fb5	Fix new-to-v19 -Wshadow warnings There's some talk about upgrading our current -Wshadow=compatible-local up to -Wshadow. There are some pending questions as to whether the churn and extra backpatching pain are worthwhile for doing all of them. We can't use the latter argument for ones that are new to v19, provided we fix them now. So let's fix those ones so that the problem is not any worse if we decide to fix the remainder for v20. Author: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Peter Smith <smithpb2250@gmail.com> Reviewed-by: Yuchen Li <liyuchen_xyz@163.com> Discussion: https://postgr.es/m/CAApHDvp=rx5GxM=yW8QhFF3noXtYt7LkOxJ7zkaPOzpti4Gm8w@mail.gmail.com	2026-04-23 16:49:29 +12:00
Jeff Davis	dbf217c1c7	catcache.c: use C_COLLATION_OID for texteqfast/texthashfast. The problem report was about setting GUCs in the startup packet for a physical replication connection. Setting the GUC required an ACL check, which performed a lookup on pg_parameter_acl.parname. The catalog cache was hardwired to use DEFAULT_COLLATION_OID for texteqfast() and texthashfast(), but the database default collation was uninitialized because it's a physical walsender and never connects to a database. In versions 18 and later, this resulted in a NULL pointer dereference, while in version 17 it resulted in an ERROR. As the comments stated, using DEFAULT_COLLATION_OID was arbitrary anyway: if the collation actually mattered, it should have used the column's actual collation. (In the catalog, some text columns are the default collation and some are "C".) Fix by using C_COLLATION_OID, which doesn't require any initialization and is always available. When any deterministic collation will do, it's best to consistently use the simplest and fastest one, so this is a good idea anyway. Another problem was raised in the thread, which this commit doesn't fix (see second discussion link). Reported-by: Andrey Borodin <x4mmm@yandex-team.ru> Discussion: https://postgr.es/m/D18AD72A-5004-4EF8-AF80-10732AF677FA@yandex-team.ru Discussion: https://postgr.es/m/4524ed61a015d3496fc008644dcb999bb31916a7.camel%40j-davis.com Backpatch-through: 17	2026-04-22 10:22:44 -07:00
Masahiko Sawada	e471dc5912	pg_upgrade: Fix detection of invalid logical replication slots. Commit `7a1f0f8747` optimized the slot verification query but overlooked cases where all logical replication slots are already invalidated. In this scenario, the CTE returns no rows, causing the main query (which used a cross join) to return an empty result even when invalid slots exist. This commit fixes this by using a LEFT JOIN with the CTE, ensuring that slots are properly reported even if the CTE returns no rows. Author: Lakshmi N <lakshmin.jhs@gmail.com> Reviewed-by: Shveta Malik <shveta.malik@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CA+3i_M8eT6j8_cBHkYykV-SXCxbmAxpVSKptjDVq+MFtpT-Paw@mail.gmail.com	2026-04-22 09:59:46 -07:00
Peter Geoghegan	d14f69a32a	Harmonize function parameter names for Postgres 19. Make sure that function declarations use names that exactly match the corresponding names from function definitions in a few places. Most of these inconsistencies were introduced during Postgres 19 development. This commit was written with help from clang-tidy, by mechanically applying the same rules as similar clean-up commits (the earliest such commit was commit `035ce1fe`).	2026-04-22 12:47:19 -04:00
Tom Lane	a50777680f	Guard against overly-long numeric formatting symbols from locale. to_char() allocates its output buffer with 8 bytes per formatting code in the pattern. If the locale's currency symbol, thousands separator, or decimal or sign symbol is more than 8 bytes long, in principle we could overrun the output buffer. No such locales exist in the real world, so it seems sufficient to truncate the symbol if we do see it's too long. Reported-by: Xint Code Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/638232.1776790821@sss.pgh.pa.us Backpatch-through: 14	2026-04-22 12:41:00 -04:00
Tom Lane	d7970e7e95	Prevent some buffer overruns in spell.c's parsing of affix files. parse_affentry() and addCompoundAffixFlagValue() each collect fields from an affix file into working buffers of size BUFSIZ. They failed to defend against overlength fields, so that a malicious affix file could cause a stack smash. BUFSIZ (typically 8K) is certainly way longer than any reasonable affix field, but let's fix this while we're closing holes in this area. I chose to do this by silently truncating the input before it can overrun the buffer, using logic comparable to the existing logic in get_nextfield(). Certainly there's at least as good an argument for raising an error, but for now let's follow the existing precedent. Reported-by: Igor Stepansky <igor.stepansky@orca.security> Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru> Discussion: https://postgr.es/m/864123.1776810909@sss.pgh.pa.us Backpatch-through: 14	2026-04-22 12:02:15 -04:00
Tom Lane	844bb90d49	Prevent buffer overrun in spell.c's CheckAffix(). This function writes into a caller-supplied buffer of length 2 * MAXNORMLEN, which should be plenty in real-world cases. However a malicious affix file could supply an affix long enough to overrun that. Defend by just rejecting the match if it would overrun the buffer. I also inserted a check of the input word length against Affix->replen, just to be sure we won't index off the buffer, though it would be caller error for that not to be true. Also make the actual copying steps a bit more readable, and remove an unnecessary requirement for the whole input word to fit into the output buffer (even though it always will with the current caller). The lack of documentation in this code makes my head hurt, so I also reverse-engineered a basic header comment for CheckAffix. Reported-by: Xint Code Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru> Discussion: https://postgr.es/m/641711.1776792744@sss.pgh.pa.us Backpatch-through: 14	2026-04-22 10:47:56 -04:00
Alexander Korotkov	713e553e32	Preserve extension dependencies on indexes during partition merge/split When using ALTER TABLE ... MERGE PARTITIONS or ALTER TABLE ... SPLIT PARTITION, extension dependencies on partition indexes were being lost. This happened because the new partition indexes are created fresh from the parent partitioned table's indexes, while the old partition indexes (with their extension dependencies) are dropped. Fix this by collecting extension dependencies from source partition indexes before detaching them, then applying those dependencies to the corresponding new partition indexes after they're created. The mapping between old and new indexes is done via their common parent partitioned index. For MERGE operations, all source partition indexes sharing a parent partitioned index must have the same extension dependencies; if they differ, an error naming both conflicting partition indexes is raised. The check is implemented by collecting one entry per partition index, sorting by parent index OID, and comparing adjacent entries in a single pass. This is order-independent: the same set of partitions produces the same decision regardless of the order they are listed in the MERGE command, and subset mismatches are caught in both directions. For SPLIT operations, the new partition indexes simply inherit all extension dependencies from the source partition's index. The regression tests exercising this feature live under src/test/modules/test_extensions, where the test_ext3 and test_ext5 extensions are available; core regression tests cannot assume any particular extension is installed. Author: Matheus Alcantara <matheusssilv97@gmail.com> Co-authored-by: Alexander Korotkov <aekorotkov@gmail.com> Reported-by: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Dmitry Koval <d.koval@postgrespro.ru> Discussion: https://www.postgresql.org/message-id/CALdSSPjXtzGM7Uk4fWRwRMXcCczge5uNirPQcYCHKPAWPkp9iQ%40mail.gmail.com	2026-04-22 14:34:20 +03:00
Dean Rasheed	5548a969b6	Fix UPDATE/DELETE ... WHERE CURRENT OF on a table with virtual columns. Formerly, attempting to use WHERE CURRENT OF to update or delete from a table with virtual generated columns would fail with the error "WHERE CURRENT OF on a view is not implemented". The reason was that the check preventing WHERE CURRENT OF from being used on a view was in replace_rte_variables_mutator(), which presumed that the only way it could get there was as part of rewriting a query on a view. That is no longer the case, since replace_rte_variables() is now also used to expand the virtual generated columns of a table. Fix by doing the check for WHERE CURRENT OF on a view at parse time. This is safe, since it is no longer possible for the relkind to change after the query is parsed (as of `b23cd185f`). Reported-by: Satyanarayana Narlapuram <satyanarlapuram@gmail.com> Author: Satyanarayana Narlapuram <satyanarlapuram@gmail.com> Author: Dean Rasheed <dean.a.rasheed@gmail.com> Discussion: https://postgr.es/m/CAHg+QDc_TwzSgb=B_QgNLt3mvZdmRK23rLb+RkanSQkDF40GjA@mail.gmail.com Backpatch-through: 18	2026-04-22 11:50:17 +01:00
Dean Rasheed	7834251758	Fix expansion of EXCLUDED virtual generated columns. If the SET or WHERE clause of an INSERT ... ON CONFLICT command references EXCLUDED.col, where col is a virtual generated column, the column was not properly expanded, leading to an "unexpected virtual generated column reference" error, or incorrect results. The problem was that expand_virtual_generated_columns() would expand virtual generated columns in both the SET and WHERE clauses and in the targetlist of the EXCLUDED pseudo-relation (exclRelTlist). Then fix_join_expr() from set_plan_refs() would turn the expanded expressions in the SET and WHERE clauses back into Vars, because they would be found to match the expression entries in the indexed tlist produced from exclRelTlist. To fix this, arrange for expand_virtual_generated_columns() to not expand virtual generated columns in exclRelTlist. This forces set_plan_refs() to resolve generation expressions in the query using non-virtual columns, as required by the executor. In addition, exclRelTlist now always contains only Vars. That was something already claimed in a couple of existing comments in the planner, which relied on that fact to skip some processing, though those did not appear to constitute active bugs. Reported-by: Satyanarayana Narlapuram <satyanarlapuram@gmail.com> Author: Satyanarayana Narlapuram <satyanarlapuram@gmail.com> Author: Dean Rasheed <dean.a.rasheed@gmail.com> Discussion: https://postgr.es/m/CAHg+QDf7wTLz_vqb1wi1EJ_4Uh+Vxm75+b4c-Ky=6P+yOAHjbQ@mail.gmail.com Backpatch-through: 18	2026-04-22 09:03:44 +01:00
Amit Langote	1b9dc2cb75	Fix some const qualifier use in ri_triggers.c The ri_FetchConstraintInfo() and ri_LoadConstraintInfo() functions were declared to return const RI_ConstraintInfo *, but callers sometimes need to modify the struct, requiring casts to drop the const. Remove the misapplied const qualifiers and the casts that worked around them. Reported-by: Peter Eisentraut <peter@eisentraut.org> Author: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/548600ed-8bbb-4e50-8fc3-65091b122276@eisentraut.org	2026-04-22 11:36:54 +09:00
Michael Paquier	9d3e094f12	Allow ALTER INDEX .. ATTACH PARTITION to validate a parent index This commit tweaks ALTER INDEX .. ATTACH PARTITION to attempt a validation of a parent index in the case where an index is already attached but the parent is not yet valid. This occurs in cases where a parent index was created invalid such as with CREATE INDEX ONLY, but was left invalid after an invalid child index was attached (partitioned indexes set indisvalid to false if at least one partition is !indisvalid, indisvalid is true in a partitioned table iff all partitions are indisvalid). This could leave a partition tree in a situation where a user could not bring the parent index back to valid after fixing the child index, as there is no built-in mechanism to do so. This commit relies on the fact that repeated ATTACH PARTITION commands on the same index silently succeed. An invalid parent index is more than just a passive issue. It causes, for example, ON CONFLICT on a partitioned table to fail if the invalid parent index is used to enforce a unique constraint. Some test cases are added to track some of the problematic patterns, using a set of partition trees with combinations of invalid indexes and ATTACH PARTITION. Reported-by: Mohamed Ali <moali.pg@gmail.com> Author: Sami Imseih <sanmimseih@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Haibo Yan <tristan.yim@gmail.com> Discussion: http://postgr.es/m/CAGnOmWqi1D9ycBgUeOGf6mOCd2Dcf=6sKhbf4sHLs5xAcKVCMQ@mail.gmail.com Backpatch-through: 14	2026-04-22 10:32:10 +09:00
Tom Lane	64b2b42124	Fix not-quite-right Makefile for src/test/modules/test_checksums. This neglected to set TAP_TESTS = 1, and partially compensated for that by writing duplicative hand-made rules for check and installcheck. That's not really sufficient though. The way I noticed the error was that "make distclean" didn't clean out the tmp_check subdirectory, and there might be other consequences. Do it the standard way instead.	2026-04-21 18:29:36 -04:00
Melanie Plageman	31b0544b32	bufmgr: use I/O stats arguments in FlushUnlockedBuffer() FlushUnlockedBuffer() accepted io_object and io_context arguments but hardcoded IOOBJECT_RELATION and IOCONTEXT_NORMAL when calling FlushBuffer(). Pass them through instead. Also fix FlushBuffer() to use its io_object parameter for I/O timing stats rather than hardcoding IOOBJECT_RELATION. Not actively broken since all current callers pass IOOBJECT_RELATION and IOCONTEXT_NORMAL, so not backpatched. Author: Chao Li <lic@highgo.com> Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Discussion: https://postgr.es/m/BC97546F-5C15-42F2-AD57-CFACDB9657D0@gmail.com	2026-04-21 17:47:50 -04:00
Melanie Plageman	62407d26b7	Stabilize btree_gist test against on-access VM setting The btree_gist enum test expects a bitmap heap scan. Since `b46e1e54d0` enabled setting the VM during on-access pruning and `378a21618` set pd_prune_xid on INSERT, scans of enumtmp may set pages all-visible. If autovacuum or autoanalyze then updates pg_class.relallvisible, the planner could choose an index-only scan instead. Make the enumtmp a temp table to exclude it from autovacuum/autoanalyze. Reported-by: Alexander Lakhin <exclusion@gmail.com> Author: Melanie Plageman <melanieplageman@gmail.com> Discussion: https://postgr.es/m/46733d68-aec0-4d09-8120-4c66b87047a4%40gmail.com	2026-04-21 17:32:45 -04:00
Melanie Plageman	85ae8ab053	Stabilize plancache test against on-access VM setting Since `b46e1e54d0` allowed setting the VM on-access and `378a21618` set pd_prune_xid on INSERT, the testing of generic/custom plans in src/test/regress/sql/plancache.sql was destabilized. One of the queries of test_mode could have set the pages all-visible and if autovacuum/autoanalyze ran and updated pg_class.relallvisible, it would affect whether we got an index-only or sequential scan. Preclude this by disabling autovacuum and autoanalyze for test_mode and carefully sequencing when ANALYZE is run. Reported-by: Alexander Lakhin <exclusion@gmail.com> Author: Melanie Plageman <melanieplageman@gmail.com> Discussion: https://postgr.es/m/71277259-264e-4983-a201-938b404049d7%40gmail.com	2026-04-21 14:36:59 -04:00
Melanie Plageman	da6874635d	Make local buffers pin limit more conservative GetLocalPinLimit() and GetAdditionalLocalPinLimit(), currently in use only by the read stream, previously allowed a backend to pin all num_temp_buffers local buffers. This meant that the read stream could use every available local buffer for read-ahead, leaving none for other concurrent pin-holders like other read streams and related buffers like the visibility map buffer needed during on-access pruning. This became more noticeable since `b46e1e54d0`, which allows on-access pruning to set the visibility map, which meant that some scans also needed to pin a page of the VM. It caused a test in src/test/regress/sql/temp.sql to fail in some cases. Cap the local pin limit to num_temp_buffers / 4, providing some headroom. This doesn't guarantee that all needed pins will be available — for example, a backend can still open more cursors than there are buffers — but it makes it less likely that read-ahead will exhaust the pool. Note that these functions are not limited by definition to use in the read stream; however, this cap should be appropriate in other contexts. Reported-by: Alexander Lakhin <exclusion@gmail.com> Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/97529f5a-ec10-46b1-ab50-4653126c6889%40gmail.com	2026-04-21 11:03:05 -04:00
Tom Lane	1cd3cd372a	Remove gen_node_support.pl's ad-hoc ABI stability check. We installed this in commit `eea9fa9b2` to protect against foreseeable mistakes that would break ABI in stable branches by renumbering NodeTag enum entries. However, we now have much more thorough ABI stability checks thanks to buildfarm members using libabigail (see the .abi-compliance-history mechanism). So this incomplete, single-purpose check seems like an anachronism. I wouldn't object to keeping it were it not that it requires an additional manual step when making a new stable git branch. That seems like something easy to screw up, so let's get rid of it. This patch just removes the logic that checks for changes in the last auto-assigned NodeTag value. We still need eea9fa9b2's cross-check on the supplied list of header files, to prevent divergence between the makefile and meson build systems. We'll also sometimes need the nodetag_number() infrastructure for hand-assigning new NodeTags in stable branches. Discussion: https://postgr.es/m/1458883.1776143073@sss.pgh.pa.us	2026-04-21 10:58:00 -04:00
Tom Lane	81c082f51a	Make plpgsql_trap test more robust and less resource-intensive. We were using "select count(*) into x from generate_series(1, 1_000_000_000_000)" to waste one second waiting for a statement timeout trap. Aside from consuming CPU to little purpose, this could easily eat several hundred MB of temporary file space, which has been observed to cause out-of-disk-space errors in the buildfarm. Let's just use "pg_sleep(10)", which is far less resource-intensive. Also update the "when others" exception handler so that if it does ever again trap an error, it will tell us what error. The cause of these intermittent buildfarm failures had been obscure for a while. Discussion: https://postgr.es/m/557992.1776779694@sss.pgh.pa.us Backpatch-through: 14	2026-04-21 10:54:39 -04:00
Michael Paquier	d3bba04154	Fix a set of typos and grammar issues across the tree This batch is similar to `462fe0ff62` and addresses a variety of code style issues, including grammar mistakes, typos, inconsistent variable names in function declarations, and incorrect function names in comments and documentation. These fixes have accumulated on the community mailing lists since the commit mentioned above. Notably, Alexander Lakhin previously submitted a patch identifying many of the trivial typos and grammar issues that had been reported on pgsql-hackers. His patch covered a somewhat large portion of the issues addressed here, though not all of them. The documentation changes only affect HEAD.	2026-04-21 14:46:22 +09:00
Richard Guo	c6a79be3f3	Fix incorrect NEW references to generated columns in rule rewriting When a rule action or rule qualification references NEW.col where col is a generated column (stored or virtual), the rewriter produces incorrect results. rewriteTargetListIU removes generated columns from the query's target list, since stored generated columns are recomputed by the executor and virtual ones store nothing. However, ReplaceVarsFromTargetList then cannot find these columns when resolving NEW references during rule rewriting. For UPDATE, the REPLACEVARS_CHANGE_VARNO fallback redirects NEW.col to the original target relation, making it read the pre-update value (same as OLD.col). For INSERT, REPLACEVARS_SUBSTITUTE_NULL replaces it with NULL. Both are wrong when the generated column depends on columns being modified. Fix by building target list entries for generated columns from their generation expressions, pre-resolving the NEW.attribute references within those expressions against the query's targetlist, and passing them together with the query's targetlist to ReplaceVarsFromTargetList. Back-patch to all supported branches. Virtual generated columns were added in v18, so the back-patches in pre-v18 branches only handle stored generated columns. Reported-by: SATYANARAYANA NARLAPURAM <satyanarlapuram@gmail.com> Author: Richard Guo <guofenglinux@gmail.com> Author: Dean Rasheed <dean.a.rasheed@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CAHg+QDexGTmCZzx=73gXkY2ZADS6LRhpnU+-8Y_QmrdTS6yUhA@mail.gmail.com Backpatch-through: 14	2026-04-21 14:28:26 +09:00
Michael Paquier	9b43e6793b	Fix orphaned processes when startup process fails during PM_STARTUP When the startup process exits with a FATAL error during PM_STARTUP, the postmaster called ExitPostmaster() directly, assuming that no other processes are running at this stage. Since `7ff23c6d27`, this assumption is not true, as the checkpointer, the background writer, the IO workers and bgworkers kicking in early would be around. This commit removes the startup-specific shortcut happening in process_pm_child_exit() for a failing startup process during PM_STARTUP, falling down to the existing exit() flow to signal all the started children with SIGQUIT, so that there is no risk of creating orphaned processes. This required an extra change in HandleFatalError() for v18 and newer versions, as an assertion could be triggered for PM_STARTUP. It is now incorrect. In v17 and older versions, HandleChildCrash() needs to be changed to handle PM_STARTUP so that children can be waited on. While on it, fix a comment at the top of postmaster.c. It was claiming that the checkpointer and the background writer were started after PM_RECOVERY. That is not the case. Author: Ayush Tiwari <ayushtiwari.slg01@gmail.com> Discussion: https://postgr.es/m/CAJTYsWVoD3V9yhhqSae1_wqcnTdpFY-hDT7dPm5005ZFsL_bpA@mail.gmail.com Backpatch-through: 15	2026-04-21 09:39:59 +09:00
Fujii Masao	8155581ec6	doc: Use "integer" for some I/O worker GUC type descriptions The documentation previously described the io_max_workers, io_worker_idle_timeout, and io_worker_launch_interval GUCs as type "int". However, the documentation consistently uses "integer" for parameters of this type. This commit updates these parameter descriptions to use "integer" for consistency. Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://postgr.es/m/CAHGQGwEpMDpB-K8SSUVRRHg6L6z3pLAkekd9aviOS=ns0EC=+Q@mail.gmail.com	2026-04-21 08:50:10 +09:00
Fujii Masao	524cbb5155	doc: Correct context description for some JIT support GUCs The documentation for jit_debugging_support and jit_profiling_support previously stated that these parameters can only be set at server start. However, both parameters use the PGC_SU_BACKEND context, meaning they can be set at session start by superusers or users granted the appropriate SET privilege, but cannot be changed within an active session. This commit updates the documentation to reflect the actual behavior. Backpatch to all supported versions. Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://postgr.es/m/CAHGQGwEpMDpB-K8SSUVRRHg6L6z3pLAkekd9aviOS=ns0EC=+Q@mail.gmail.com Backpatch-through: 14	2026-04-21 08:44:19 +09:00
Fujii Masao	f1cfb48efb	plsample: Use TextDatumGetCString() for text-to-CString conversion Replace the outdated DatumGetCString(DirectFunctionCall1(textout, ...)) pattern with TextDatumGetCString(). The macro is the modern, more efficient way to convert a text Datum to a C string as it avoids unnecessary function call machinery and handles detoasting internally. Since plsample serves as reference code for extension authors, it should follow current idiomatic practices. Author: Amul Sul <sulamul@gmail.com> Discussion: https://postgr.es/m/CAAJ_b95-xMvUN1PEqxv8y6g-A-8k+fSgyv20kSZc9eF1wZAUPg@mail.gmail.com	2026-04-21 08:37:17 +09:00
Tom Lane	f0ac6d494b	Fix relid-set clobber during join removal. Commit `cfcd57111` et al fell over under Valgrind testing. (It seems to be enough to #define USE_VALGRIND, you don't actually need to run it under Valgrind to see failures.) The cause is that remove_rel_from_eclass updates each EquivalenceMember's em_relids, and those can be aliases of the left_relids or right_relids of some RestrictInfo in ec_sources. If the update made em_relids empty then bms_del_member will have pfree'd the relid set, so that the subsequent attempt to clean up ec_sources accesses already-freed memory. We missed seeing ill effects before `cfcd57111` because (a) if the pfree happens then we will remove the EquivalenceMember altogether, making the source RestrictInfo no longer of use, and (b) the cleanup of ec_sources didn't touch left/right_relids before that. I'm unclear though on how `cfcd57111` managed to pass non-USE_VALGRIND testing. Apparently we managed to store another Bitmapset into the freed space before trying to access it, but you'd not think that would happen 100% of the time. I think what USE_VALGRIND changes is that it makes list.c much more memory-hungry, so that the freed space gets claimed by some List node before a Bitmapset can be put there. This failure can be seen in v16, v17, and master, but oddly enough not v18. That's because the SJE patch replaced the simple bms_del_members calls used here with adjust_relid_set, which is careful not to scribble on its input. But commit `20efbdffe` just recently put back the old coding and thus resurrected the problem. Discussion: https://postgr.es/m/458729.1776724816@sss.pgh.pa.us Backpatch-through: 16, 17, master	2026-04-20 19:24:52 -04:00
Jeff Davis	bdcb85b56a	Fix callers of unicode_strtitle() using srclen == -1. Currently, only called that way in tests, which failed to fail. Discussion: https://postgr.es/m/581a72ff452bb045ba83bbe3c6cf4467702d4f0f.camel@j-davis.com Backpatch-through: 18	2026-04-20 14:44:08 -07:00
Jeff Davis	59919ec776	style: define parameterless functions as foo(void). Avoids warning in 'update-unicode' build target. Similar to `11171fe1fc`. Discussion: https://postgr.es/m/581a72ff452bb045ba83bbe3c6cf4467702d4f0f.camel@j-davis.com	2026-04-20 14:42:54 -07:00
Masahiko Sawada	79fba6ebab	doc: Fix missing role attribute in pg_get_tablespace_ddl() description. The second function signature entry for pg_get_tablespace_ddl() was missing the role="func_signature" attribute. This commit adds the missing attribute to ensure consistent formatting with other function entries. Author: Tatsuya Kawata <kawatatatsuya0913@gmail.com> Discussion: https://postgr.es/m/CAHza6qcSgwdh+f41zEm6NSaGHvs5_cwjVu22+KTic=TfnonrFA@mail.gmail.com	2026-04-20 13:31:13 -07:00
Tom Lane	cfcd571116	Clean up all relid fields of RestrictInfos during join removal. The original implementation of remove_rel_from_restrictinfo() thought it could skate by with removing no-longer-valid relid bits from only the clause_relids and required_relids fields. This is quite bogus, although somehow we had not run across a counterexample before now. At minimum, the left_relids and right_relids fields need to be fixed because they will be examined later by clause_sides_match_join(). But it seems pretty foolish not to fix all the relid fields, so do that. This needs to be back-patched as far as v16, because the bug report shows a planner failure that does not occur before v16. I'm a little nervous about back-patching, because this could cause unexpected plan changes due to opening up join possibilities that were rejected before. But it's hard to argue that this isn't a regression. Also, the fact that this changes no existing regression test results suggests that the scope of changes may be fairly narrow. I'll refrain from back-patching further though, since no adverse effects have been demonstrated in older branches. Bug: #19460 Reported-by: François Jehl <francois.jehl@pigment.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Richard Guo <guofenglinux@gmail.com> Discussion: https://postgr.es/m/19460-5625143cef66012f@postgresql.org Backpatch-through: 16	2026-04-20 14:48:23 -04:00
Tom Lane	207cb2abcb	Make ExecForPortionOfLeftovers() obey SRF protocol. Before each call to the SRF, initialize isnull and isDone, as per the comments for struct ReturnSetInfo. This fixes a Coverity warning about rsi.isDone not being initialized. The built-in {multi,}range_minus_multi functions don't return without setting it, but a user-supplied function might not be as accommodating. We also add statistics tracking around the function call, which will be expected once user-defined withoutPortionProcs functions are supported, and a cross-check on rsi.returnMode just for paranoia's sake. Author: Tom Lane <tgl@sss.pgh.pa.us> Co-authored-by: Paul A Jungwirth <pj@illuminatedcomputing.com> Discussion: https://postgr.es/m/4126231.1776622202@sss.pgh.pa.us	2026-04-20 10:21:52 -04:00
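The calling convention described in the entry above can be sketched outside the server as follows. This is a minimal illustration only: the `Sketch*` types and functions are hypothetical stand-ins for `ReturnSetInfo` and a value-per-call set-returning function, not PostgreSQL code. The point it shows is that the caller resets `isnull` and `isDone` before every call, so a function that never touches them still yields well-defined behavior.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for ExprDoneCond and ReturnSetInfo. */
typedef enum
{
	SKETCH_SINGLE_RESULT,
	SKETCH_MULTIPLE_RESULT,
	SKETCH_END_RESULT
} SketchDoneCond;

typedef struct
{
	SketchDoneCond isDone;
} SketchReturnSetInfo;

typedef int (*SketchSRF) (SketchReturnSetInfo *rsi, bool *isnull, int *state);

/* Well-behaved function: returns 1, 2, 3 and then signals the end. */
int
count_to_three(SketchReturnSetInfo *rsi, bool *isnull, int *state)
{
	(void) isnull;
	if (*state >= 3)
	{
		rsi->isDone = SKETCH_END_RESULT;
		return 0;
	}
	rsi->isDone = SKETCH_MULTIPLE_RESULT;
	return ++(*state);
}

/* Careless function: returns a value without ever setting isDone. */
int
lazy_scalar(SketchReturnSetInfo *rsi, bool *isnull, int *state)
{
	(void) rsi;
	(void) isnull;
	(void) state;
	return 7;
}

/* Caller: re-initialize the per-call fields before each invocation. */
int
collect_results(SketchSRF fn, int *out, int max)
{
	SketchReturnSetInfo rsi;
	int			state = 0;
	int			n = 0;

	assert(fn != NULL && out != NULL);
	while (n < max)
	{
		bool		isnull = false;		/* reset before every call */

		rsi.isDone = SKETCH_SINGLE_RESULT;	/* reset before every call */
		int			value = fn(&rsi, &isnull, &state);

		if (rsi.isDone == SKETCH_END_RESULT)
			break;
		if (!isnull)
			out[n++] = value;
		if (rsi.isDone == SKETCH_SINGLE_RESULT)
			break;				/* not a set: exactly one result */
	}
	return n;
}
```

Because `isDone` is reset to the single-result value on each iteration, `lazy_scalar` is treated as returning exactly one row instead of leaving the loop to read an uninitialized field.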
Álvaro Herrera	5dbb63fc82	REPACK: do not require REPLICATION or LOGIN Although REPACK (CONCURRENTLY) uses replication slots, there is no concern that the slot will leak data of other users, because the MAINTAIN privilege on the table is required anyway; requiring REPLICATION is user-unfriendly without providing any actual protection. A related aspect is that the REPLICATION attribute is not needed to prevent REPACK from stealing slots from logical replication, since commit `e76d8c749c` made REPACK use a separate pool of replication slots. Similarly, there's no reason to require that the table owner has the LOGIN privilege. Bypass the default behavior in the background worker launch sequence. Because there are now successful concurrent repack runs in the regression tests, we're forced to run test_plan_advice under wal_level=replica, so add that. Also, move the cluster.sql test to a different parallel group in parallel_schedule: apparently the use of the repack worker causes it to exceed the maximum limit of processes in some runs (the actual limit reached is the number of XIDs in a snapshot's xip array). Author: Antonin Houska <ah@cybertec.at> Reported-by: Justin Pryzby <pryzby@telsasoft.com> Reviewed-by: Chao Li <lic@highgo.com> Discussion: https://postgr.es/m/aeJHPNmL4vVy3oPw@pryzbyj2023	2026-04-20 15:44:23 +02:00
Bruce Momjian	158d8fadd7	doc PG 19 relnotes: fix typo, "date" -> "data" Reported-by: shammat@gmx.net Discussion: https://postgr.es/m/c0d3dfe1-d3e1-45ff-bcdd-40ded5d37ada@gmx.net	2026-04-20 07:15:25 -04:00
Alexander Korotkov	23cbadeeb4	049_wait_for_lsn.pl: create function and procedure at once Create the PL/pgSQL function and procedure for the top-level WAIT FOR checks in a single transaction, then wait once for standby replay before running both tests. Also revise some surrounding comments. This avoids an extra 'wait_for_catchup()' on the delayed standby without changing the test coverage. Discussion: https://postgr.es/m/CABPTF7WZ1yuYz8V%3Dxsbghg8e7qaAm5MpyNw6BthWcbN7%2BP6biw%40mail.gmail.com Author: Xuneng Zhou <xunengzhou@gmail.com> Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>	2026-04-20 13:05:55 +03:00
Richard Guo	20efbdffeb	Clean up remove_rel_from_query() after self-join elimination commit The self-join elimination (SJE) commit grafted self-join removal onto remove_rel_from_query(), which was originally written for left-join removal only. This resulted in several issues: - Comments throughout remove_rel_from_query() still assumed only left-join removal, making the code misleading. - ChangeVarNodesExtended was called on phv->phexpr with subst=-1 during left-join removal, which is pointless and confusing since any surviving PHV shouldn't reference the removed rel. - phinfo->ph_lateral was adjusted for left-join removal, which is unnecessary since the removed relid cannot appear in ph_lateral for outer joins. - The comment about attr_needed reconstruction was in remove_rel_from_query(), but the actual rebuild is performed by the callers. - EquivalenceClass processing in remove_rel_from_query() is redundant for self-join removal, since the caller (remove_self_join_rel) already handles ECs via update_eclasses(). - In remove_self_join_rel(), ChangeVarNodesExtended was called on root->processed_groupClause, which contains SortGroupClause nodes that have no Var nodes to rewrite. The accompanying comment incorrectly mentioned "HAVING clause". This patch fixes all these issues, clarifying the separation between left-join removal and self-join elimination code paths within remove_rel_from_query(). The resulting code is also better structured for adding new types of join removal (such as inner-join removal) in the future. Author: Richard Guo <guofenglinux@gmail.com> Reviewed-by: Tender Wang <tndrwang@gmail.com> Reviewed-by: Andrei Lepikhov <lepihov@gmail.com> Reviewed-by: wenhui qiu <qiuwenhuifx@gmail.com> Discussion: https://postgr.es/m/CAMbWs48JC4OVqE=3gMB6se2WmRNNfMyFyYxm-09vgpm+Vwe8Hg@mail.gmail.com	2026-04-20 17:00:22 +09:00
Peter Eisentraut	04f9ea372a	Add missing Datum conversions Similar to commit `ff89e182d4`, for new code added since.	2026-04-20 07:22:16 +02:00
Peter Eisentraut	5936afe1ee	Fix incorrect format placeholders	2026-04-20 07:09:13 +02:00
Amit Kapila	090c4297e4	Flush statistics during idle periods in parallel apply worker. Parallel apply workers previously failed to report statistics while waiting for new work in the main loop. This resulted in the stats from the most recent transaction remaining unbuffered, leading to arbitrary reporting delays—particularly when streamed transactions were infrequent. This commit ensures that statistics are explicitly flushed when the worker is idle, providing timely visibility into accumulated worker activity. Author: Zhijie Hou <houzj.fnst@fujitsu.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Backpatch-through: 16, where it was introduced Discussion: https://postgr.es/m/TYRPR01MB1419579F217CC4332B615589594202@TYRPR01MB14195.jpnprd01.prod.outlook.com	2026-04-20 10:31:11 +05:30
Michael Paquier	63a116a96e	Meson: Fix check_header() for readline and gssapi Since `f039c22441`, the minimum version of meson supported is 0.57.2, meaning that it is possible to use the result of declare_dependency() when checking for headers with check_header(). There were two TODOs for readline and gssapi to change declare_dependency() after upgrading to at least 0.57.0, which were not addressed yet. While on it, this fixes a comment related to str.replace(). The function has been introduced in meson 0.58.0, not 0.56. Author: Andreas Karlsson <andreas@proxel.se> Reviewed-by: Tristan Partin <tristan@partin.io> Discussion: https://postgr.es/m/00cd2e0c-85df-4cf9-a889-125d85e66980@proxel.se	2026-04-20 12:36:14 +09:00
David Rowley	5142f0093e	Minor fixes for test_bitmapset.c 1. Make it so test_random_operations() can accept a NULL to have the function select a random seed. 2. Widen the seed parameter of test_random_operations() to bigint. Without that, it'll be impossible to run the function with a seed which was selected by GetCurrentTimestamp(), and if a randomly selected seed ever results in a failure, we'll likely want to run with the same seed to debug the issue. 3. Report the seed in the error messages in test_random_operations(). If the buildfarm were ever to fail there, we'd certainly want to know what this was. 4. Add CHECK_FOR_INTERRUPTS() to test_random_operations(). Someone might run with a large num_ops and they'd have no way to cancel the query. 5. Minor cosmetic fixes; header order and whitespace issue. To allow #1, the STRICT modifier had to be removed. The additional prechecks were added as I didn't see how else to handle someone passing those parameters as NULL. Author: David Rowley <dgrowleyml@gmail.com> Reviewed-by: Greg Burd <greg@burd.me> Discussion: https://postgr.es/m/CAApHDvrDW9W72vAr7h7XeCu7+Qz-_Vff02Q+RPPuVeM0Qf0MCw@mail.gmail.com	2026-04-20 09:58:40 +12:00
Peter Eisentraut	9018c7d37b	Fix 64-bit shifting in dynahash.c The switch from long to int64 in commit `13b935cd52` was incomplete. It was shifting the constant 1L, which is not always 64 bit. Fix by using an explicit int64 constant. MSVC warning: ../src/backend/utils/hash/dynahash.c(1767): warning C4334: '<<': result of 32-bit shift implicitly converted to 64 bits (was 64-bit shift intended?) Also add the corresponding warning to the standard warning set on MSVC, to help catch similar issues in the future. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/1142ad86-e475-41b3-aeee-c6ad913064fa%40eisentraut.org	2026-04-19 13:27:54 +02:00
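A minimal sketch of the kind of fix described in the entry above, assuming a hypothetical helper `pow2_int64()` that is not the dynahash.c code. With a constant of type `long`, the shift is carried out in the width of `long`, which is 32 bits with MSVC on 64-bit Windows; giving the constant an explicit 64-bit type makes the shift 64 bits wide on every platform.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (illustration only): return 2^shift as a 64-bit
 * value.  INT64_C(1) has a 64-bit type everywhere, unlike 1L, whose
 * width follows the platform's "long".
 */
int64_t
pow2_int64(int shift)
{
	assert(shift >= 0 && shift < 63);
	return INT64_C(1) << shift;
}
```

A cast such as `(int64_t) 1 << shift` would work equally well; the essential point is that the left operand is already 64 bits wide before the shift happens.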
Robert Haas	228a1f9542	pg_plan_advice: pgindent Per buildfarm member koel.	2026-04-17 17:46:27 -04:00
Heikki Linnakangas	d65995cbc6	Change PointerGetDatum() back to a macro The argument was marked as "const void *X", but that might rightly give the compiler the idea that *X cannot be modified through the resulting Datum, and make incorrect optimizations based on that. Some functions use pointer Datums to pass output arguments, like GIN support functions. Coverity started to complain after commit `6f5ad00ab7` that there's dead code in ginExtractEntries(), because it didn't see that it passes PointerGetDatum(&nentries) to a function that sets it. This issue goes back to commit `c8b2ef05f4` (version 16), which changed PointerGetDatum() from a macro to a static inline function. This commit changes it back to a macro, but uses a trick with a dummy conditional expression to still produce a compiler error if you try to pass a non-pointer as the argument. Even though this goes back to v16, I'm only committing this to 'master' for now, to verify that this silences the Coverity warning. If this works, we might want to introduce separate const and non-const versions of PointerGetDatum() instead of this, but that's a bigger patch. It's also not decided yet whether to back-patch this (or some other fix), given that we haven't yet seen any hard evidence of compilers actually producing buggy code because of this. Discussion: https://www.postgresql.org/message-id/342012.1776017102@sss.pgh.pa.us	2026-04-17 22:14:40 +03:00
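One way such a macro could look is sketched below. This is an assumption for illustration, not the committed definition: `SKETCH_POINTER_GET_DATUM`, the `Datum` typedef, and the two functions are hypothetical. The conditional never selects its third operand, but the compiler still type-checks both branches, so an integer argument other than a null pointer constant is diagnosed (as a warning or an error, depending on the compiler and its flags). Since no `const` qualifier is added along the way, passing the address of an output argument remains visible as a potential write.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t Datum;		/* stand-in for PostgreSQL's Datum */

/*
 * Hypothetical macro: the dummy conditional forces X to be compatible
 * with a pointer operand, then the pointer is converted to a Datum.
 */
#define SKETCH_POINTER_GET_DATUM(X) \
	((Datum) (uintptr_t) (1 ? (X) : (void *) 0))

/* Callee that writes its result through a pointer passed as a Datum. */
void
set_count(Datum out)
{
	int		   *countp = (int *) out;

	assert(countp != NULL);
	*countp = 42;
}

/* Caller that passes the address of a local as an output argument. */
int
count_via_datum(void)
{
	int			nentries = 0;

	set_count(SKETCH_POINTER_GET_DATUM(&nentries));
	return nentries;
}
```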
Robert Haas	4321dcad47	pg_plan_advice: Fix another unique-semijoin bug. This one occurs when an outer join appears beneath the made-unique side of a semijoin. The issue is that join RTEs are not featured out of sj_unique_rels entries. Fix, and add a test case. Reported-by: Alexander Lakhin <exclusion@gmail.com> Analyzed-by: Tender Wang <tndrwang@gmail.com> Discussion: http://postgr.es/m/c0c63979-43c2-4424-8fe8-56949934c9d8@gmail.com	2026-04-17 14:08:37 -04:00
Amit Kapila	f3ae1ec729	Doc: Improve the wording of logical slot prerequisites. Replace the previous negative phrasing such as "there are no slots whose ... is not true" with a direct expression that all slots must have conflicting = false. Similarly, reword the requirement on the new cluster to state that it must not have any permanent logical slots, clarifying that any existing logical slots must have temporary set to true. These changes improve readability without altering the meaning. Reported-by: <mimidatabase@gmail.com> Author: Vignesh C <vignesh21@gmail.com> Reviewed-by: David G. Johnston <david.g.johnston@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/177609278737.403059.14174275013090471947%40wrigleys.postgresql.org	2026-04-17 15:02:15 +05:30
Fujii Masao	950f50d5d4	doc: Improve description of pg_ctl -l log file permissions The documentation stated only that the log file created by pg_ctl -l is inaccessible to other users by default. However, since commit `c37b3d0`, the actual behavior is that only the cluster owner has access by default, but users in the same group as the cluster owner may also read the file if group access is enabled in the cluster. This commit updates the documentation to describe this behavior more clearly. Backpatch to all supported versions. Author: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Andreas Karlsson <andreas@proxel.se> Reviewed-by: Xiaopeng Wang <wxp_728@163.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/OS9PR01MB1214959BE987B4839E3046050F54BA@OS9PR01MB12149.jpnprd01.prod.outlook.com Backpatch-through: 14	2026-04-17 15:30:59 +09:00
Fujii Masao	4e0e1f3b27	psql: Fix incorrect tab completion after CREATE PUBLICATION ... EXCEPT (...) Previously, tab completion after EXCEPT (...) always suggested FROM SERVER. This was correct for IMPORT FOREIGN SCHEMA ... EXCEPT (...), but became incorrect once commit `fd366065e0` added CREATE PUBLICATION ... EXCEPT (...). This commit updates tab completion so FROM SERVER is no longer suggested after CREATE PUBLICATION ... EXCEPT (...), while preserving the existing behavior for IMPORT FOREIGN SCHEMA ... EXCEPT (...). Author: Vignesh C <vignesh21@gmail.com> Reviewed-by: Shveta Malik <shveta.malik@gmail.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/CALDaNm1-Fx6Msw6zcRuSjgQdw6asdTyp2DwP-4TCKGYAT+ndsA@mail.gmail.com	2026-04-17 14:31:05 +09:00
Amit Langote	cda0c4c5d6	Reject invalid databases in pg_get_database_ddl() An invalid database has datconnlimit set to -2. pg_get_database_ddl() emits this verbatim as CONNECTION LIMIT = -2, which ALTER DATABASE rejects. Error out early instead. Reported-by: Lakshmi N <lakshmin.jhs@gmail.com> Author: Lakshmi N <lakshmin.jhs@gmail.com> Reviewed-by: Amit Langote <amitlangote09@gmail.com> Reviewed-by: Euler Taveira <euler@eulerto.com> Reviewed-by: Hu Xunqi <huxunqi.08@gmail.com> Discussion: https://postgr.es/m/CA+3i_M8m1k2gFch+tU0JmAQh9FRV+pFrfTXDrJo+BqmwsTmOhg@mail.gmail.com	2026-04-17 13:19:56 +09:00
Bruce Momjian	f3c28c2f2b	doc PG 19 relnotes: change "free space map" to "visibility map" Reported-by: Melanie Plageman Author: Melanie Plageman	2026-04-16 17:23:55 -04:00
Andrew Dunstan	446c400fd8	Make psql DETAIL line test unconditionally optional. Commit `3e2a1496ba` made the psql TAP test require the DETAIL line on platforms with SA_SIGINFO, rather than making it optional. This unexpectedly blew up on OpenBSD buildfarm members, because OpenBSD does not set si_pid for SIGTERM signals even though it has SA_SIGINFO defined. So revert to the test as it was in commit `55890a9194`, where the detail line being missing never causes an error. Author: Jakub Wartak <jakub.wartak@enterprisedb.com> Suggested-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/2007157.1776269052%40sss.pgh.pa.us	2026-04-16 16:56:18 -04:00
Álvaro Herrera	05c401d578	Add missing initialization The backend running REPACK can check DecodingWorkerShared->initialized before the worker could have the chance to initialize it, possibly leading to wrong behavior. While at it, remove DecodingWorkerShared->worker_dsm_segment, because that doesn't actually need to be in shared memory; a simple local-memory global variable is enough. Oversights in commit `28d534e2ae`. Author: Antonin Houska <ah@cybertec.at> Reported-by: Alexander Lakhin <exclusion@gmail.com> Discussion: https://postgr.es/m/18181295-8375-4789-ad32-269d78d6001e@gmail.com	2026-04-16 22:27:04 +02:00
Bruce Momjian	191a037d4f	doc PG 19 relnotes: add author and move items Reported-by: Richard Guo Author: Richard Guo Discussion: https://postgr.es/m/CAMbWs4_etzZZPMEzte8hJv2f4Tn6dGskg8v1R_N9uCd2of0kMQ@mail.gmail.com	2026-04-16 12:46:00 -04:00
Melanie Plageman	b4c1b2be30	Update FSM during prune/freeze replay even if freespace is zero `add323da40` started updating the visibility map in the same WAL record as pruning and freezing. This included updating the freespace map during replay of a record setting the VM, which we've done since `ab7dbd681`. `add323da40`, however, conditioned doing so on there being > 0 freespace on the page, which differed from the previous state for records updating the VM. The FSM is not WAL-logged and is instead updated heuristically on standbys. In rare cases, this heuristic could lead to pages with 0 freespace having outdated entries in the FSM. If the standby is later promoted and vacuum skips these pages because they are marked all-visible/all-frozen, overly optimistic values would be propagated up the FSM tree, causing slowness when searching for freespace for new tuples. Fix it by always updating the FSM during replay when setting VM bits. Author: Melanie Plageman <melanieplageman@gmail.com> Reported-by: Alexey Makhmutov <a.makhmutov@postgrespro.ru> Discussion: https://postgr.es/m/ead2f110-c736-48f5-99e1-023dc9acbf0b%40postgrespro.ru	2026-04-16 12:10:47 -04:00
Bruce Momjian	af1ed03739	doc PG 19 relnotes: update author Reported-by: Masahiko Sawada Author: Masahiko Sawada Discussion: https://postgr.es/m/CAD21AoCLCZnzEFam8H07qq-=fUpDwmTmV7+4RPnT2x_xoJBrgg@mail.gmail.com	2026-04-16 11:23:55 -04:00
Bruce Momjian	2dc34eaa07	doc PG 19 relnotes: corrections reported to me privately	2026-04-16 10:43:37 -04:00
Fujii Masao	2fd84e2226	Use XLogRecPtrIsValid() consistently for WAL position checks Commit `a2b02293bc` switched various checks to use XLogRecPtrIsValid(), but later changes reintroduced XLogRecPtrIsInvalid() and direct comparisons with InvalidXLogRecPtr. This commit replaces those uses with XLogRecPtrIsValid() for better readability and consistency. Author: Vignesh C <vignesh21@gmail.com> Reviewed-by: Xiaopeng Wang <wxp_728@163.com> Reviewed-by: Amul Sul <sulamul@gmail.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/CALDaNm16knMFtcqyAG3XYSkyagmVXfhaR0T=hau8UTAU0+eLQQ@mail.gmail.com	2026-04-16 23:02:34 +09:00
Daniel Gustafsson	4abcdc1bbe	doc: Add missing GUCs to SSL SNI docs The ssl_sni and hosts_file GUCs were missing from the configuration section of the documentation; they were only described in the main SSL SNI subsection. This adds the GUCs to the relevant sections and rewords the existing SSL SNI documentation to refer to the settings, along with a few smaller fixups. Author: Daniel Gustafsson <daniel@yesql.se> Reported-by: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/CAHGQGwESD2Pty+J1kP3mXmWwMKZ5uJmknZdJsSGrMSRR6CQBmw@mail.gmail.com	2026-04-16 11:18:57 +02:00
Peter Eisentraut	1a51ec16db	MSVC: Turn missing function declaration into an error Calling an undeclared function should be an error as of C99, and GCC and Clang do that, but MSVC doesn't even warn about it in the default warning level. (Commit `c86d2ccdb3` fixed an instance of this problem.) This turns on this warning and makes it an error by default, to match other compilers. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/1142ad86-e475-41b3-aeee-c6ad913064fa%40eisentraut.org	2026-04-16 09:53:03 +02:00
Peter Eisentraut	c86d2ccdb3	Add missing include "utils/pg_locale.h" is needed when under MSVC for wchar2char(), introduced by commit `65707ed9af`. Surprisingly, MSVC doesn't warn by default about calling undeclared functions. This will be addressed in a separate commit. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/1142ad86-e475-41b3-aeee-c6ad913064fa%40eisentraut.org	2026-04-16 09:35:05 +02:00
Thomas Munro	9a618901a4	Fix comments for Korean encodings in encnames.c * JOHAB: replace the incorrect "simplified Chinese" description with a correct one that identifies it as the Korean combining (Johab) encoding standardized in KS X 1001 annex 3. * EUC_KR: drop a stray space before the comma in the existing comment, and note that the encoding covers the KS X 1001 precomposed (Wansung) form. * UHC: spell out "Unified Hangul Code", clarify that it is Microsoft Windows CodePage 949, and describe its relationship to EUC-KR (superset covering all 11,172 precomposed Hangul syllables). Backpatch-through: 14 Author: Henson Choi <assam258@gmail.com> Discussion: https://postgr.es/m/CAAAe_zAFz1v-3b7Je4L%2B%3DwZM3UGAczXV47YVZfZi9wbJxspxeA%40mail.gmail.com	2026-04-16 18:17:05 +12:00
Amit Langote	059cf7f58d	Fix pg_overexplain to emit valid output with RANGE_TABLE option. overexplain_range_table() emitted the "Unprunable RTIs" and "Result RTIs" properties before closing the "Range Table" group. In the JSON and YAML formats the Range Table group is rendered as an array of RTE objects, so emitting key/value pairs inside it produced structurally invalid output. The XML format had a related oddity, with these elements nested inside <Range-Table> rather than appearing as its siblings. These fields are properties of the PlannedStmt as a whole, not of any individual RTE, so close the Range Table group before emitting them. They now appear as siblings of "Range Table" in the parent Query object, which is what was intended. Also add a test exercising FORMAT JSON with RANGE_TABLE so that any future regression in the output structure is caught. Reported-by: Satyanarayana Narlapuram <satyanarlapuram@gmail.com> Author: Satyanarayana Narlapuram <satyanarlapuram@gmail.com> Reviewed-by: Amit Langote <amitlangote09@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CAHg+QDdDrdqMr98a_OBYDYmK3RaT7XwCEShZfvDYKZpZTfOEjQ@mail.gmail.com Backpatch-through: 18	2026-04-16 13:47:07 +09:00
Amit Langote	b5062a4e57	Fix incorrect comment in JsonTablePlanJoinNextRow() The comment on the return-false path when both UNION siblings are exhausted said "there are more rows," which is the opposite of what the code does. The code itself is correct, returning false to signal no more rows, but the misleading comment could tempt a reader into "fixing" the return value, which would cause UNION plans to loop indefinitely. Back-patch to 17, where JSON_TABLE was introduced. Author: Chuanwen Hu <463945512@qq.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/tencent_4CC6316F02DECA61ACCF22F933FEA5C12806@qq.com Backpatch-through: 17	2026-04-16 13:45:33 +09:00
Fujii Masao	ee550254a2	Use proc_exit() for walreceiver exit in WalRcvWaitForStartPosition() Previously, when the walreceiver exited from WalRcvWaitForStartPosition() at the startup process's request, it called exit(1) directly. This could skip cleanup performed by the callback functions. This commit makes the walreceiver use proc_exit() instead, ensuring normal cleanup is executed on exit. This commit also updates comments describing walreceiver termination. Apply to master only, as this has not caused practical issues so far. Author: Chao Li <lic@highgo.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Andreas Karlsson <andreas@proxel.se> Reviewed-by: Xuneng Zhou <xunengzhou@gmail.com> Discussion: https://postgr.es/m/74381238-4E8A-4621-B794-57025DCCE0BA@gmail.com	2026-04-16 12:33:17 +09:00
Bruce Momjian	7102ce9823	doc PG 19 relnotes: remove doc author from "Allow autovacuum" Reported-by: Aleksander Alekseev Discussion: https://postgr.es/m/CAJ7c6TO-FHg4SGF48PJ9dnV3cg1-_xW9=P4t8-cd-+JWvZAPyQ@mail.gmail.com	2026-04-15 16:58:11 -04:00
Bruce Momjian	be32494126	doc PG 19 relnotes: add free space map all-visible item Reported-by: Melanie Plageman Discussion: https://postgr.es/m/CAAKRu_bzN6ioG+h7agjCF847whVpS2WEiJB3UXAtkJ3WVXOZwA@mail.gmail.com	2026-04-15 16:52:17 -04:00
Bruce Momjian	3837e72757	doc PG 19 relnotes: fixes for commands and authors Reported-by: jian he Discussion: https://postgr.es/m/CANzqJaCKn_AahetGkZWKJVi6MKyGKqr1JrziquyHt1-SRwQpSw@mail.gmail.com	2026-04-15 16:18:51 -04:00
Bruce Momjian	75693dc5b7	doc PG 19 relnotes: remove "Lakshmi N" as author of checksums Reported-by: Daniel Gustafsson Discussion: https://postgr.es/m/762DAF2F-7055-4F52-9DF7-23C4A19478A0@yesql.se	2026-04-15 15:43:36 -04:00
Bruce Momjian	caebf16509	doc PG 19 relnotes: fix "now targets" Reported-by: Laurenz Albe Discussion: https://postgr.es/m/27be2ef070e3a0ca55b478e0493fac0124d4f95e.camel@cybertec.at	2026-04-15 15:37:00 -04:00
Bruce Momjian	57f768816d	doc PG 19 relnotes: adjust CREATE/ALTER PUBLICATION "EXCEPT" Reported-by: Peter Smith Backpatch-through: CAHut+Psb41Lou8+BS4ZYmZJFG8pF99wEr+xcP17PCZP1MaY_+Q@mail.gmail.com	2026-04-15 15:35:23 -04:00
Bruce Momjian	23ec74c8a8	doc PG 19 relnotes: add missing March 16 autovacuum score item Also fix "deformed" tuples. Reported-by: David Rowley Backpatch-through: CAApHDvrsyD3QKBO=dypNkyFzYOzQEbgy+xJLwn=y+h+bLSDd-g@mail.gmail.com	2026-04-15 14:42:57 -04:00
Bruce Momjian	e70ac90d95	doc PG 19 relnotes: adjust ShmemRequestStruct item Reported-by: Ashutosh Bapat Author: Ashutosh Bapat Discussion: https://postgr.es/m/CAExHW5vjpd=mWauQZsTbKX9QqD8yxDUABBGQAT5n+CT+nr8QHw@mail.gmail.com	2026-04-15 13:03:29 -04:00
Andrew Dunstan	f30d0c720f	Fix COPY TO FORMAT JSON to exclude generated columns. COPY TO with FORMAT json was including generated columns in the output, unlike TEXT and CSV formats. Virtual generated columns appeared as null, and stored ones showed their computed values. The JSON code path only built a restricted TupleDesc when an explicit column list was given (attnamelist != NIL), but CopyGetAttnums() also excludes generated columns from the default list. Fix by checking whether the attnumlist is shorter than the full TupleDesc instead. Bug introduced in `7dadd38cda`. Author: Satya Narlapuram <satya.narlapuram@gmail.com> Reviewed-by: Jian He <jian.universality@gmail.com> Discussion: https://postgr.es/m/CAHg+QDcfpGDoPL3fvfjXRtfn=fny6DdJR6BAy6TpS1Xj2EZfXA@mail.gmail.com	2026-04-15 07:58:17 -04:00
Andrew Dunstan	3e2a1496ba	Rework signal handler infrastructure to pass sender info as argument. Commit 095c9d4cf06 added errdetail() reporting of the PID and UID of the process that sent a termination signal. However, as noted by Andres Freund, the implementation had architectural problems: 1. wrapper_handler() in pqsignal.c contained SIGTERM-specific logic (setting ProcDieSenderPid/Uid), violating its role as a generic signal dispatch wrapper. 2. Using globals to pass sender info between wrapper_handler and the real handler is unsafe when signals nest on some platforms. 3. The syncrep.c errdetail used psprintf() to conditionally embed text via %s, breaking translatability. Adopt the approach proposed by Andres Freund: introduce a pg_signal_info struct that is passed as an argument to all signal handlers via the SIGNAL_ARGS macro. wrapper_handler populates it from siginfo_t when SA_SIGINFO is available, or with zeros otherwise. This keeps wrapper_handler fully generic and avoids any globals for passing signal metadata. Since pqsigfunc now has a different signature from the system's signal handler type, SIG_IGN and SIG_DFL can no longer be passed directly to pqsignal(). Introduce PG_SIG_IGN and PG_SIG_DFL macros that cast to the new pqsigfunc type, and update all call sites. The legacy pqsignal() in libpq retains its original signature via a local typedef. Only die() reads pg_siginfo today, copying the sender PID/UID into ProcDieSenderPid/Uid for later use by ProcessInterrupts(). Only the first SIGTERM's sender info is recorded. Also fix the syncrep.c translatability issue by using separate ereport calls with complete, independently translatable errdetail strings. Also make the psql TAP test require the DETAIL line on platforms with SA_SIGINFO, rather than making it unconditionally optional. On Windows, pg_signal_info uses uint32_t for pid and uid fields since pid_t/uid_t are not available early enough in the include chain. The Windows signal dispatch in pgwin32_dispatch_queued_signals() passes a zeroed pg_signal_info to handlers. Author: Andres Freund <andres@anarazel.de> Author: Jakub Wartak <jakub.wartak@enterprisedb.com> Reviewed-by: Andrew Dunstan <andrew@dunslane.net> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/cwyyryh2veejuxbj5ifzyaejw7jhhqc5mrdeq56xckknsdecn2@6hzfcxde2nm5 Discussion: https://postgr.es/m/jygesyr7mwg7ovdbxpmjvvbi3hccptpkcreqb645h7f56puwbz@hmkkwi3melfe	2026-04-15 07:30:34 -04:00
Bruce Momjian	972c14fb91	doc: first draft of PG 19 release notes	2026-04-14 21:06:27 -04:00
Richard Guo	363af93bdd	Fix var_is_nonnullable() to handle invalid NOT NULL constraints The NOTNULL_SOURCE_SYSCACHE code path in var_is_nonnullable() used get_attnotnull() to check pg_attribute.attnotnull, which is true for both valid and invalid (NOT VALID) NOT NULL constraints. An invalid constraint does not guarantee the absence of NULLs, so this could lead to incorrect results. For example, query_outputs_are_not_nullable() could wrongly conclude that a subquery's output is non-nullable, causing NOT IN to be incorrectly converted to an anti-join. Fix by checking the attnullability field in the relation's tuple descriptor instead, which correctly distinguishes valid from invalid constraints, consistent with what the NOTNULL_SOURCE_HASHTABLE code path already does. While at it, rename NOTNULL_SOURCE_SYSCACHE to NOTNULL_SOURCE_CATALOG to reflect that this code path no longer uses a syscache lookup, and remove the now-unused get_attnotnull() function. Author: Richard Guo <guofenglinux@gmail.com> Reviewed-by: SATYANARAYANA NARLAPURAM <satyanarlapuram@gmail.com> Discussion: https://postgr.es/m/CAMbWs48ALW=mR0ydQ62dGS-Q+3D7WdDSh=EWDezcKp19xi=TUA@mail.gmail.com	2026-04-15 09:38:56 +09:00
Andrew Dunstan	1f108fc02e	Fix pfree crash in pg_get_role_ddl() and pg_get_database_ddl(). DatumGetArrayTypeP() can return a pointer into the tuple when the datum is stored as a short varlena, so pfree() on the result crashes. Use DatumGetArrayTypePCopy() to always get a palloc'd copy. Bug introduced in `76e514ebb4` and `a4f774cf1c`. Reported-by: Jeff Davis <pgsql@j-davis.com> Author: Satya Narlapuram <satya.narlapuram@gmail.com> Discussion: https://postgr.es/m/CAHg+QDdWtv9PKtPZEokwGCNtbv4MVnfYw5wMZrsEj4xizSNe5Q@mail.gmail.com	2026-04-14 18:29:46 -04:00
Jeff Davis	dacd8fa6f2	Check for unterminated strings when calling uloc_getLanguage(). Missed by commit `1671f990dd`. Author: Andreas Karlsson <andreas@proxel.se> Discussion: https://postgr.es/m/118ca69e-47eb-42e1-83e9-72ccf40dd6fd@proxel.se Backpatch-through: 16	2026-04-14 14:46:14 -07:00
Michael Paquier	67d318e704	Add tests for low-level PGLZ [de]compression routines The goal of this module is to provide an entry point for the coverage of the low-level compression and decompression PGLZ routines. The new test is moved to a new parallel group, with all the existing compression-related tests added to it. This includes tests for the cases detected by fuzzing that emulate corrupted compressed data, as fixed by `2b5ba2a0a1`: - Set control bit with read of a match tag, where no data follows. - Set control bit with read of a match tag, where 1 byte follows. - Set control bit with match tag where length nibble is 3 bytes (extended case). While on it, some tests are added for compress/decompress roundtrips, and for check_complete=false/true. Like `2b5ba2a0a1`, backpatch to all the stable branches. Discussion: https://postgr.es/m/adw647wuGjh1oU6p@paquier.xyz Backpatch-through: 14	2026-04-15 05:09:05 +09:00
Heikki Linnakangas	66ad764c8d	Replace deprecated StaticAssertStmt() with StaticAssertDecl() Commit `6f5ad00ab7` added another use of StaticAssertStmt(), but it was marked as deprecated in commit `d50c86e743`. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://www.postgresql.org/message-id/adeNWH5pDawDvvR2@ip-10-97-1-34.eu-west-3.compute.internal	2026-04-14 12:03:30 +03:00
Amit Kapila	fce3f7d267	Add missing period to HINT messages. Author: Peter Smith <smithpb2250@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Robert Treat <rob@xzilla.net> Discussion: https://postgr.es/m/CAHut+PvikGr4AtoFSs=jq=hmTybVF2NCMEZ57-sjwbGudfuqsQ@mail.gmail.com	2026-04-14 09:37:18 +05:30
Jeff Davis	06ce97b999	Fix overrun when comparing with unterminated ICU language string. The overrun was introduced in commit `c4ff35f10`. Author: Andreas Karlsson <andreas@proxel.se> Reported-by: Alexander Lakhin <exclusion@gmail.com> Discussion: https://postgr.es/m/96d80a47-f17f-42fa-82b1-2908efbd6541@gmail.com Backpatch-through: 18	2026-04-13 11:19:04 -07:00
Robert Haas	e89f98ff03	doc: Remove stray word from pg_stash_advice docs. Commit `c10edb102a` left behind the word "both" where it no longer makes sense. Reported-by: Erik Rijkers <er@xs4all.nl> Discussion: http://postgr.es/m/8912b2e5-ccad-4cbd-ab53-869b0b9ecec5@xs4all.nl	2026-04-13 12:51:04 -04:00
Robert Haas	f4a4f1a7e6	doc: Fix a couple of mistakes in pgplanadvice.sgml It said FOREIGN_SCAN where it should say FOREIGN_JOIN. NESTED_LOOP_MEMOIZE was mistakenly omitted from the list of join methods. Author: Lakshmi N <lakshmin.jhs@gmail.com> Reviewed-by: jie wang <jugierwang@gmail.com> Discussion: http://postgr.es/m/CA+3i_M-mo7Of=Pn8WzRfJLt=fc=gDTn1oOdj8v8BEtgXh9ZMCg@mail.gmail.com	2026-04-13 12:45:57 -04:00
Robert Haas	c644aca240	pg_plan_advice: Export feedback-related definitions. It turns out that our main regression test suite queries tables upon which concurrent DDL is occurring, which can, rarely, cause test_plan_advice failures. We're not quite ready to fix that problem just yet, because we want to gather some more information about how often it actually happens first. But, our plan is going to require test_plan_advice to access a few bits of pg_plan_advice that have been considered internal up until now, so this commit rejiggers things to expose those bits. First, test_plan_advice is going to need to be able to interpret the PGPA_TE_* constants which have been declared in pgpa_trove.h. The "TE" stands for "trove entry" but that's kind of a silly name; change the naming to "FB" (for "feedback") and move the declarations to pg_plan_advice.h, which is a header file that's already installed. This has the side benefit of making these constants available to any other extensions that may want to examine plan advice feedback. Second, test_plan_advice is going to call pgpa_planner_feedback_warning, so make that function non-static and mark it PGDLLEXPORT. Discussion: http://postgr.es/m/CA+TgmobOOmmXSJz3e+cjTY-bA1+W0dqVDqzxUBEvGtW62whYGg@mail.gmail.com	2026-04-13 11:47:40 -04:00
Robert Haas	0f93ebb311	pg_plan_advice: Fix a bug when a subquery is pruned away entirely. If a subquery is proven empty, and if that subquery contained a semijoin, and if making one side or the other of that semijoin unique and performing an inner join was a possible strategy, then the previous code would fail with ERROR: no rtoffset for plan %s when attempting to generate advice. Fix that. Reported-by: Alexander Lakhin <exclusion@gmail.com> Discussion: http://postgr.es/m/CA+TgmobOOmmXSJz3e+cjTY-bA1+W0dqVDqzxUBEvGtW62whYGg@mail.gmail.com	2026-04-13 10:34:09 -04:00
Robert Haas	1faf9dfa47	pg_plan_advice: Add alternatives test to Makefile. Oversight in commit `6455e55b0d`. Discussion: http://postgr.es/m/CA+TgmobOOmmXSJz3e+cjTY-bA1+W0dqVDqzxUBEvGtW62whYGg@mail.gmail.com	2026-04-13 10:09:20 -04:00
Robert Haas	3311ccc3d2	pg_plan_advice: Handle non-repeatable TABLESAMPLE scans. When a tablesample routine says that it is not repeatable across scans, set_tablesample_rel_pathlist will (usually) materialize it, confusing pg_plan_advice's plan walker machinery. To fix, update that machinery to view such Material paths as essentially an extension of the underlying scan. Reported-by: Alexander Lakhin <exclusion@gmail.com> Discussion: http://postgr.es/m/CA+TgmobOOmmXSJz3e+cjTY-bA1+W0dqVDqzxUBEvGtW62whYGg@mail.gmail.com	2026-04-13 08:46:25 -04:00
Alexander Korotkov	a8b61c23c5	Explicitly forbid non-top-level WAIT FOR execution Previously we were relying on a snapshot-based check to detect invalid execution contexts. However, when WAIT FOR is wrapped into a stored procedure or a DO block, it could pass this check, causing an error elsewhere. This commit implements an explicit isTopLevel check to reject WAIT FOR when called from within a function, procedure, or DO block. The isTopLevel check catches these cases early with a clear error message, matching the pattern used by other utility commands like VACUUM and REINDEX. The snapshot check is retained for the remaining case: top-level execution within a transaction block using an isolation level higher than READ COMMITTED. Also adds tests for WAIT FOR LSN wrapped in a procedure and DO block, complementing the existing test that uses a function wrapper. Relevant documentation paragraph is also added. Reported-by: Satyanarayana Narlapuram <satyanarlapuram@gmail.com> Discussion: https://postgr.es/m/CAHg%2BQDcN-n3NUqgRtj%3DBQb9fFQmH8-DeEROCr%3DPDbw_BBRKOYA%40mail.gmail.com Author: Satyanarayana Narlapuram <satyanarlapuram@gmail.com> Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com> Reviewed-by: Xuneng Zhou <xunengzhou@gmail.com>	2026-04-13 14:04:52 +03:00
Peter Eisentraut	b47854b699	Update Unicode data to CLDR 48.2 No actual changes result. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://www.postgresql.org/message-id/flat/2a668979-ed92-49a3-abf9-a3ec2d460ec2%40eisentraut.org	2026-04-13 11:13:36 +02:00
Peter Eisentraut	99b726ac48	pg_createsubscriber: Don't use MAXPGPATH Use dynamic allocation instead. Using MAXPGPATH is unnecessary in new code like this. Discussion: https://www.postgresql.org/message-id/flat/CAEqnbaUthOQARV1dscGvB_EsqC-YfxiM6rWkVDHc%2BG%2Bf4oSUHw%40mail.gmail.com	2026-04-13 10:59:08 +02:00
Peter Eisentraut	f5528b90b4	pg_createsubscriber: Remove separate logfile_open() function This seems like an excessive indirection. Discussion: https://www.postgresql.org/message-id/flat/CAEqnbaUthOQARV1dscGvB_EsqC-YfxiM6rWkVDHc%2BG%2Bf4oSUHw%40mail.gmail.com	2026-04-13 10:52:19 +02:00
Peter Eisentraut	847336ba53	pg_createsubscriber: Use logging.c log file callback This reverts commit `6b5b7eae3a`, where a new logging API layer was introduced locally in pg_createsubscriber. Instead, use the log file callback introduced in logging.c. This new approach is simpler, eliminates code duplication, and doesn't require any caller changes or NLS updates (which the previous commit missed). Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CAEqnbaUthOQARV1dscGvB_EsqC-YfxiM6rWkVDHc%2BG%2Bf4oSUHw%40mail.gmail.com	2026-04-13 10:44:14 +02:00
Peter Eisentraut	41237556f8	Add log file support to logging.c This adds the ability for users of logging.c to provide a file handle for a log file, where log messages are also written in addition to stderr. Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CAEqnbaUthOQARV1dscGvB_EsqC-YfxiM6rWkVDHc%2BG%2Bf4oSUHw%40mail.gmail.com	2026-04-13 10:44:02 +02:00
Amit Kapila	8f81c92351	Fix capitalization in publication describe output. Consistent with existing psql metadata display conventions, update the description tags for EXCEPT publications to use lowercase for the second word (e.g., "Except tables" instead of "Except Tables"). This aligns the output style with other publication describe commands. Author: Peter Smith <smithpb2250@gmail.com> Reviewed-by: vignesh C <vignesh21@gmail.com> Discussion: https://postgr.es/m/CAHut+Pt3t_tCYwDStkj5fG4Z=YXrHvPBA7iGdh745QipC5zKeg@mail.gmail.com	2026-04-13 10:54:16 +05:30
Amit Kapila	85c17f612a	Fix excessive logging in idle slotsync worker. The slotsync worker was incorrectly identifying no-op states as successful updates, triggering a busy loop to sync slots that logged messages every 200ms. This patch corrects the logic to properly classify these states, enabling the worker to respect normal sleep intervals when no work is performed. Reported-by: Fujii Masao <masao.fujii@gmail.com> Author: Zhijie Hou <houzj.fnst@fujitsu.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: shveta malik <shveta.malik@gmail.com> Backpatch-through: 17, where it was introduced Discussion: https://postgr.es/m/CAHGQGwF6zG9Z8ws1yb3hY1VqV-WT7hR0qyXCn2HdbjvZQKufDw@mail.gmail.com	2026-04-13 10:06:50 +05:30
David Rowley	49ce41810f	Improve various new-to-v19 appendStringInfo calls Similar to `928394b66` and `8461424fd`, here we adjust a few new locations which were not using the most suitable appendStringInfo* or appendPQExpBuffer* function for the intended purpose. Author: David Rowley <drowleyml@gmail.com> Discussion: https://postgr.es/m/CAApHDvohYOdrvhVxXzCJNX_GYMSWBfjTTtB6hgDauEtZ8Nar2A@mail.gmail.com	2026-04-13 13:16:48 +12:00
Michael Paquier	5d35531af1	test_saslprep: Fix issue with copy of input bytea The data given in input of the function may not be null-terminated, causing strlcpy() to complain with an invalid read. Issue spotted using valgrind. Reported-by: Alexander Lakhin <exclusion@gmail.com> Discussion: https://postgr.es/m/09df9d75-13e7-45fe-89af-33fe118e797b@gmail.com	2026-04-13 09:06:17 +09:00
David Rowley	e3e26d04bd	Fix unlikely overflow bug in bms_next_member() ... and bms_prev_member(). Both of these functions won't work correctly when given a prevbit of INT_MAX and would crash when operating on a Bitmapset that happened to have a member with that value. Here we fix that by using an unsigned int to calculate which member to look for next. I've also adjusted bms_prev_member() to check for < 0 rather than == -1 for starting the loop. This was done as it's safer and comes at zero extra cost. With our current use cases, it's likely impossible to have a Bitmapset with an INT_MAX member, so no backpatch here. I only noticed this issue when working on a bms function to bitshift a Bitmapset. Author: David Rowley <dgrowleyml@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CAApHDvr1B2gbf6JF69QmueM2QNRvbQeeKLxDnF=w9f9--022uA@mail.gmail.com	2026-04-13 11:39:15 +12:00
David Rowley	a63bbc811d	Use stack-allocated StringInfoDatas, where possible `6d0eba662` already did most of the changes, but some new ones snuck in just prior to that commit, so these got missed. Having these short-lived StringInfoDatas on the stack rather than having them get palloc'd by makeStringInfo() is simply for performance as it saves doing a 2nd palloc. Since this code is new to v19, it makes sense to improve it now rather than wait until we branch as having v19 and v20 differ here just makes it harder to backpatch fixes in this area. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/adt4wpj4FZwR+S7I@ip-10-97-1-34.eu-west-3.compute.internal	2026-04-13 10:43:19 +12:00
David Rowley	a78cf591a3	Doc: use "an SQL" consistently rather than "a SQL" Per the precedent set by `04539e73f`, adjust article prefixes for "SQL" to use "an" consistently rather than "a", i.e., "an es-que-ell" rather than "a sequel". Also see `b51f86e49`, `b1b13d2b5`, `d866f0374` and `7bdd489d3`. Author: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/CAApHDvp3osQwQam+wNTp9BdhP+QfWO6aY6ZTixQQMfM-UArKCw@mail.gmail.com	2026-04-12 22:49:27 +12:00
Michael Paquier	80156cee06	Honor passed-in database OIDs in pgstat_database.c Three routines in pgstat_database.c incorrectly ignore the database OID provided by their caller, using MyDatabaseId instead: - pgstat_report_connect() - pgstat_report_disconnect() - pgstat_reset_database_timestamp() The first two functions, for connection and disconnection, each have a single caller that already passes MyDatabaseId. This was harmless, but still incorrect. The timestamp reset function also has a single caller, but in this case the issue has a real impact: it fails to reset the timestamp for the shared-database entry (datid=0) when operating on shared objects. This situation can occur, for example, when resetting counters for shared relations via pg_stat_reset_single_table_counters(). There is currently one test in the tree that checks the reset of a shared relation, for pg_shdescription; we rely on it to check what is stored in pg_stat_database. As stats_reset may be NULL, two resets are done to provide a baseline for comparison. Author: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Dapeng Wang <wangdp20191008@gmail.com> Discussion: https://postgr.es/m/ABBD5026-506F-4006-A569-28F72C188693@gmail.com Backpatch-through: 15	2026-04-11 17:02:52 +09:00
Richard Guo	77d0e82e58	Fix estimate_array_length error with set-operation array coercions When a nested set operation's output type doesn't match the parent's expected type, recurse_set_operations builds a projection target list using generate_setop_tlist with varno 0. If the required type coercion involves an ArrayCoerceExpr, estimate_array_length could be called on such a Var, and would pass it to examine_variable, which errors in find_base_rel because varno 0 has no valid relation entry. Fix by skipping the statistics lookup for Vars with varno 0. Bug introduced by commit `9391f7152`. Back-patch to v17, where estimate_array_length was taught to use statistics. Reported-by: Justin Pryzby <pryzby@telsasoft.com> Author: Tender Wang <tndrwang@gmail.com> Reviewed-by: Richard Guo <guofenglinux@gmail.com> Discussion: https://postgr.es/m/adjW8rfPDkplC7lF@pryzbyj2023 Backpatch-through: 17	2026-04-11 16:38:47 +09:00
Thomas Munro	b2a17ba7a5	read_stream: Remove obsolete comment. This comment was describing the v17 implementation (or io_method=sync). Backpatch-through: 18	2026-04-11 11:25:25 +12:00
Masahiko Sawada	c22d115f1d	Fix unstable log verification in test_autovacuum. The test in test_autovacuum was unstable because it called log_contains() immediately after verifying autovacuum_count in pg_stat_user_tables. This created a race condition where the statistics could be updated before the autovacuum logs were fully flushed to disk. This commit replaces log_contains() with wait_for_log() to ensure the test waits for the parallel vacuum messages to appear. Additionally, remove the checks of the autovacuum count. Verifying the log messages is sufficient to confirm parallel autovacuum behavior, as logging is only enabled for the specific table under test. Per report from buildfarm member flaviventris. Author: Sami Imseih <samimseih@gmail.com> Discussion: https://postgr.es/m/525d0f48-93f7-493f-a988-f39b460a79bc@gmail.com	2026-04-10 16:01:42 -07:00
Masahiko Sawada	2a3d2f9f68	doc: Improve consistency of parallel vacuum description. Use consistent phrasing for parallel vacuum descriptions between manual VACUUM and autovacuum. Specifically, clarify that the parallel worker count is limited by the respective options only if they are explicitly specified. Also, fix a typo in the parallel vacuum section. Author: Aleksander Alekseev <aleksander@tigerdata.com> Discussion: https://postgr.es/m/CAJ7c6TPcSqzhbhrsiCMmVwmE8F7pwS7i9J49SP1zPKS_ER+vcA@mail.gmail.com	2026-04-10 10:59:24 -07:00
Fujii Masao	de74d1e9a5	Adjust log level of logical decoding messages by context Commit `21b018e7ea` lowered some logical decoding messages from LOG to DEBUG1. However, per discussion on pgsql-hackers, messages from background activity (e.g., walsender or slotsync worker) should remain at LOG, as they are less frequent and more likely to indicate issues that DBAs should notice. For foreground SQL functions (e.g., pg_logical_slot_peek_binary_changes()), keep these messages at DEBUG1 to avoid excessive log noise. They can still be enabled by lowering client_min_messages or log_min_messages for the session. This commit updates logical decoding to log these messages at LOG for background activity and at DEBUG1 for foreground execution. Suggested-by: Robert Haas <robertmhaas@gmail.com> Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/CA+TgmoYsu2+YAo9eLGkDp5VP-pfQ-jOoX382vS4THKHeRTNgew@mail.gmail.com	2026-04-10 22:59:34 +09:00
Andrew Dunstan	eec8e234bd	Revert "Add built-in fuzzing harnesses for security testing." This reverts commit `4a18907b41`. Inadvertently pushed, mea culpa.	2026-04-10 09:53:58 -04:00
Andrew Dunstan	3f8913f683	Use size_t instead of Size in pg_waldump In commit `b15c151398` I missed the memo about not using Size in new code. Per complaint from Thomas Munro Discussion: https://postgr.es/m/CA+hUKGJkeTVuq5u5WKJm6xkwmW577UuQ7fA=PyBCSR3h9g2GtQ@mail.gmail.com	2026-04-10 09:29:00 -04:00
Andrew Dunstan	4a18907b41	Add built-in fuzzing harnesses for security testing. Add 12 libFuzzer-compatible fuzzing harnesses behind a new -Dfuzzing=true meson option. Each harness implements LLVMFuzzerTestOneInput() and can also be built in standalone mode (reading from files) when no fuzzer engine is detected. Frontend targets (no backend dependencies): fuzz_json - non-incremental JSON parser (pg_parse_json) fuzz_json_incremental - incremental/chunked JSON parser fuzz_conninfo - libpq connection string parser (PQconninfoParse) fuzz_pglz - PGLZ decompressor (pglz_decompress) fuzz_unescapebytea - libpq bytea unescape (PQunescapeBytea) fuzz_b64decode - base64 decoder (pg_b64_decode) fuzz_saslprep - SASLprep normalization (pg_saslprep) fuzz_parsepgarray - array literal parser (parsePGArray) fuzz_pgbench_expr - pgbench expression parser (via Bison/Flex) Backend targets (link against postgres_lib): fuzz_rawparser - SQL raw parser (raw_parser) fuzz_regex - regex engine (pg_regcomp/pg_regexec) fuzz_typeinput - type input functions (numeric/date/timestamp/interval)	2026-04-10 07:13:08 -04:00
Andrew Dunstan	2b5ba2a0a1	Fix heap-buffer-overflow in pglz_decompress() on corrupt input. When decoding a match tag, pglz_decompress() reads 2 bytes (or 3 for extended-length matches) from the source buffer before checking whether enough data remains. The existing bounds check (sp > srcend) occurs after the reads, so truncated compressed data that ends mid-tag causes a read past the allocated buffer. Fix by validating that sufficient source bytes are available before reading each part of the match tag. The post-read sp > srcend check is no longer needed and is removed. Found by fuzz testing with libFuzzer and AddressSanitizer.	2026-04-10 07:13:08 -04:00
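The check-before-read pattern described in the pglz_decompress() fix can be sketched as follows. This is a minimal illustration in Python, not the actual C code: the function name `read_match_tag` is hypothetical, and the tag layout (low nibble plus 3 for the length, high nibble and second byte for the offset, a third byte when the length is 18) is assumed from the PGLZ format rather than taken from the commit message.

```python
def read_match_tag(src: bytes, sp: int):
    """Decode one match tag starting at src[sp].

    Hypothetical helper: the available byte count is validated before
    each part of the tag is read, instead of after the reads.
    Returns (length, offset, new_sp) or raises ValueError on truncation.
    """
    # The basic tag is 2 bytes; make sure both exist before touching them.
    if len(src) - sp < 2:
        raise ValueError("truncated match tag")
    length = (src[sp] & 0x0F) + 3
    offset = ((src[sp] & 0xF0) << 4) | src[sp + 1]
    sp += 2

    # An extended-length match (assumed here: length == 18) has a third byte.
    if length == 18:
        if len(src) - sp < 1:
            raise ValueError("truncated extended-length byte")
        length += src[sp]
        sp += 1
    return length, offset, sp


def is_truncated(src: bytes) -> bool:
    """Report whether decoding a tag at position 0 hits a truncation."""
    try:
        read_match_tag(src, 0)
    except ValueError:
        return True
    return False
```

In C the equivalent checks compare the remaining source length against the number of bytes about to be read; Python would raise an IndexError on its own, so the explicit checks here only mirror the ordering of validation and reads.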
Andrew Dunstan	2478bd5db0	Fix incremental JSON parser numeric token reassembly across chunks. When the incremental JSON parser splits a numeric token across chunk boundaries, it accumulates continuation characters into the partial token buffer. The accumulator's switch statement unconditionally accepted '+', '-', '.', 'e', and 'E' as valid numeric continuations regardless of position, which violated JSON number grammar (-? int [frac] [exp]). For example, input "4-" fed in single-byte chunks would accumulate the '-' into the numeric token, producing an invalid token that later triggered an assertion failure during re-lexing. Fix by tracking parser state (seen_dot, seen_exp, prev character) across the existing partial token and incoming bytes, so that each character class is accepted only in its grammatically valid position.	2026-04-10 07:13:08 -04:00
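The position-aware acceptance described in the incremental JSON parser fix can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the parser's C code: `can_continue_number` and `accumulate` are hypothetical names, and the state (seen_dot, seen_exp, prev) is recomputed from the partial token instead of being tracked incrementally.

```python
DIGITS = "0123456789"


def is_digit(c: str) -> bool:
    return len(c) == 1 and c in DIGITS


def can_continue_number(partial: str, ch: str) -> bool:
    """Return True if ch may follow partial under -? int [frac] [exp]."""
    seen_dot = "." in partial
    seen_exp = "e" in partial or "E" in partial
    prev = partial[-1] if partial else ""

    if is_digit(ch):
        # A leading zero in the integer part cannot be followed by a digit.
        return partial not in ("0", "-0")
    if ch == "-":
        # Minus is valid only at the very start or right after e/E.
        return prev == "" or prev in ("e", "E")
    if ch == "+":
        return prev in ("e", "E")
    if ch == ".":
        # One fraction point, after a digit, and never inside the exponent.
        return is_digit(prev) and not seen_dot and not seen_exp
    if ch in ("e", "E"):
        return is_digit(prev) and not seen_exp
    return False


def accumulate(text: str) -> str:
    """Feed text one character at a time, as single-byte chunks would."""
    token = ""
    for ch in text:
        if not can_continue_number(token, ch):
            break
        token += ch
    return token
```

With this rule, the input "4-" from the commit message yields the token "4" and leaves the '-' for the next token. The sketch only decides whether a character may continue the token; it does not check that the finished token is complete (for example, "1e" is accepted as a prefix).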
Amit Langote	009ea1b08d	Add test case for same-type reordered FK columns The test added in `980c1a85d8` covered reordered FK columns with different types, which triggered an "operator not a member of opfamily" error in the fast-path prior to that commit. Add a test for the same-type case, which is also fixed by that commit but where the wrong scan key ordering instead produced a spurious FK violation without any internal error. Reported-by: Fredrik Widlert <fredrik.widlert@digpro.se> Discussion: https://postgr.es/m/CADfhSr8hYc-4Cz7vfXH_oV-Jq81pyK9W4phLrOGspovsg2W7Kw@mail.gmail.com	2026-04-10 17:44:06 +09:00
Amit Langote	d6e96bacd3	Move afterTriggerFiringDepth into AfterTriggersData The static variable afterTriggerFiringDepth introduced by commit `5c54c3ed1b` is logically part of the after-trigger state and in hindsight should have been a field in AfterTriggersData alongside query_depth and the other per-transaction after-trigger state. Move it there as firing_depth. Also update its comment to accurately reflect its sole remaining purpose: signaling to AfterTriggerIsActive() that after-trigger firing is active. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CA+HiwqFt4NGTNk7BinOsHHM48E9zGAa852vCfGoSe1bbL=JNFQ@mail.gmail.com	2026-04-10 16:17:58 +09:00
Richard Guo	f6936bf9da	Fix var_is_nonnullable() to account for varreturningtype var_is_nonnullable() failed to consider varreturningtype, which meant it could incorrectly claim a Var is non-nullable based on a column's NOT NULL constraint even when the Var refers to a non-existent row. Specifically, OLD.col is NULL for INSERT (no old row exists) and NEW.col is NULL for DELETE (no new row exists), regardless of any NOT NULL constraint on the column. This caused the planner's constant folding in eval_const_expressions to incorrectly simplify IS NULL / IS NOT NULL tests on such Vars. For example, "old.a IS NULL" in an INSERT's RETURNING clause would be folded to false when column "a" has a NOT NULL constraint, even though the correct result is true. Fix by returning false from var_is_nonnullable() when varreturningtype is not VAR_RETURNING_DEFAULT, since such Vars can be NULL regardless of table constraints. Author: SATYANARAYANA NARLAPURAM <satyanarlapuram@gmail.com> Reviewed-by: Tender Wang <tndrwang@gmail.com> Reviewed-by: Richard Guo <guofenglinux@gmail.com> Discussion: https://postgr.es/m/CAHg+QDfaAipL6YzOq2H=gAhKBbcUTYmfbAv+W1zueOfRKH43FQ@mail.gmail.com	2026-04-10 15:51:00 +09:00
Amit Langote	155c03ee9d	Assert index_attnos[0] == 1 in ri_FastPathFlushArray() ri_FastPathFlushArray() handles single-column FKs only, so index_attnos[0] is always 1. Add an Assert to make this invariant explicit, as a followup to `980c1a85d8`. Suggested-by: Junwang Zhao <zhjwpku@gmail.com> (offlist) Discussion: https://www.postgresql.org/message-id/CADfhSr-pCkbDxmiOVYSAGE5QGjsQ48KKH_W424SPk%2BpwzKZFaQ%40mail.gmail.com	2026-04-10 15:24:38 +09:00
Amit Langote	980c1a85d8	Fix FK fast-path scan key ordering for mismatched column order The fast-path foreign key check introduced in `2da86c1ef9` assumed that constraint key positions directly correspond to index column positions. This is not always true as a FK constraint can reference PK columns in a different order than they appear in the PK's unique index. For example, if the PK is (a, b, c) and the FK references them as (a, c, b), the constraint stores keys in the FK-specified order, but the index has columns in PK order. The buggy code used the constraint key index to access rd_opfamily[i], which retrieved the wrong operator family when columns were reordered, causing "operator X is not a member of opfamily Y" errors. After fixing the opfamily lookup, a second issue started to happen: btree index scans require scan keys to be ordered by attribute number. The code was placing scan keys at array position i with attribute number idx_attno, producing out-of-order keys when columns were swapped. This caused "btree index keys must be ordered by attribute" errors. The fix adds an index_attnos array to FastPathMeta that maps each constraint key position to its corresponding index column position. In ri_populate_fastpath_metadata(), we search indkey to find the actual index column for each pk_attnums[i] and use that position for the opfamily lookup. In build_index_scankeys(), we place each scan key at the array position corresponding to its index column (skeys[idx_attno-1]) rather than at the constraint key position, ensuring scan keys are properly ordered by attribute number as btree requires. Reported-by: Fredrik Widlert <fredrik.widlert@digpro.se> Author: Matheus Alcantara <matheusssilv97@gmail.com> Reviewed-by: Junwang Zhao <zhjwpku@gmail.com> Discussion: https://www.postgresql.org/message-id/CADfhSr-pCkbDxmiOVYSAGE5QGjsQ48KKH_W424SPk%2BpwzKZFaQ%40mail.gmail.com	2026-04-10 13:33:55 +09:00
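The mapping from constraint key positions to index column positions described in this fix can be sketched as follows. This is a minimal Python illustration, not the C code in ri_triggers.c: both function names are hypothetical, and scan keys are modeled as plain (attribute number, value) tuples.

```python
def map_constraint_keys_to_index(pk_attnums, indkey):
    """For each constraint key, return its 1-based index column position.

    pk_attnums: referenced column numbers in FK-specified order.
    indkey: table column numbers in the order the unique index stores them.
    """
    return [indkey.index(attno) + 1 for attno in pk_attnums]


def build_scankeys(values, index_attnos):
    """Place each key at the slot of its index column, not its FK position."""
    skeys = [None] * len(values)
    for i, value in enumerate(values):
        idx_attno = index_attnos[i]
        skeys[idx_attno - 1] = (idx_attno, value)
    return skeys


# PK index on table columns (5, 7, 9); the FK references them as (5, 9, 7).
index_attnos = map_constraint_keys_to_index([5, 9, 7], [5, 7, 9])
skeys = build_scankeys(["a-val", "c-val", "b-val"], index_attnos)
```

Here `index_attnos` comes out as [1, 3, 2], and the resulting keys are ordered by attribute number 1, 2, 3 even though the values arrived in FK order, which is the property the commit says btree scans require.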
Amit Langote	03029409b4	Fix typo left by `34a3078629` Reported-by: jie wang <jugierwang@gmail.com> Reported-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CAJnZyeDyaS=X-eYN=9rDYqK=6ma1gMLa0qDgfNbZKK0e0+q99Q@mail.gmail.com	2026-04-10 13:32:38 +09:00
Amit Langote	34a3078629	Fix RI fast-path crash under nested C-level SPI When a C-language function uses SPI_connect/SPI_execute/SPI_finish to INSERT into a table with FK constraints, the FK AFTER triggers fire and schedule ri_FastPathEndBatch via RegisterAfterTriggerBatchCallback(), opening PK relations under CurrentResourceOwner at the time of the SPI call. The query_depth > 0 guard in FireAfterTriggerBatchCallbacks suppresses the callback at that nesting level, deferring teardown to the outer query's AfterTriggerEndQuery. By then the resource owner active during the SPI call may have been released, decrementing the cached relations' refcounts to zero. ri_FastPathTeardown, running under the outer query's resource owner, then crashes in assert builds when it attempts to close relations whose refcounts are already zero: TRAP: failed Assert("rel->rd_refcnt > 0") Fix by storing batch callbacks at the level where they should fire: in AfterTriggersQueryData.batch_callbacks for immediate constraints (fired by AfterTriggerEndQuery) and in AfterTriggersData.batch_callbacks for deferred constraints (fired by AfterTriggerFireDeferred and AfterTriggerSetState). RegisterAfterTriggerBatchCallback() routes the callback to the current query-level list when query_depth >= 0, and to the top-level list otherwise. FireAfterTriggerBatchCallbacks() takes a list parameter and simply iterates and invokes it; memory cleanup is handled by the caller. This replaces the query_depth > 0 guard with list-level scoping. Note that deferred constraints are unaffected by this bug: their callbacks fire at commit via AfterTriggerFireDeferred, under the outer transaction's resource owner, which remains valid throughout. Also add firing_batch_callbacks to AfterTriggersData to enforce that callbacks do not register new callbacks during FireAfterTriggerBatchCallbacks(), which would be unsafe as it could modify the list being iterated. 
An Assert in RegisterAfterTriggerBatchCallback() enforces this discipline for future callers. The flag is reset at transaction and subtransaction boundaries to handle cases where an error thrown by a callback is caught and the subtransaction is rolled back. While at it, ensure callbacks are properly accounted for at all transaction boundaries, as cleanup of `b7b27eb41a`: discard any remaining top-level callbacks on both commit and abort in AfterTriggerEndXact(), and clean up query-level callbacks in AfterTriggerFreeQuery(). Note that ri_PerformCheck() calls SPI with fire_triggers=false, which skips AfterTriggerBeginQuery/EndQuery for that SPI command. Any triggers queued during that SPI command are not fired immediately but deferred to the outer query level. Since the fast-path check for those triggers runs under the outer query's resource owner rather than a nested SPI resource owner, and ri_PerformCheck() does not create a dedicated child resource owner, the bug described above does not apply. Reported-by: Evan Montgomery-Recht <montge@mianetworks.net> Reported-by: Sandro Santilli <strk@kbt.io> Analyzed-by: Evan Montgomery-Recht <montge@mianetworks.net> Author: Amit Langote <amitlangote09@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CAEg7pwcKf01FmDqFAf-Hzu_pYnMYScY_Otid-pe9uw3BJ6gq9g@mail.gmail.com	2026-04-10 12:41:34 +09:00
Jeff Davis	90630ec429	Document new catalog columns, missed in commit `8185bb5347`. Reported-by: "Shinoda, Noriyoshi (PSD Japan FSI)" <noriyoshi.shinoda@hpe.com> Co-authored-by: "Shinoda, Noriyoshi (PSD Japan FSI)" <noriyoshi.shinoda@hpe.com> Discussion: https://postgr.es/m/LV8PR84MB3787135EBDBF7747A05731F3EE592@LV8PR84MB3787.NAMPRD84.PROD.OUTLOOK.COM	2026-04-09 20:29:42 -07:00
Michael Paquier	5b5bf51e43	Zero-fill private_data when attaching an injection point InjectionPointAttach() did not initialize the private_data buffer of the shared memory entry before (perhaps partially) overwriting it. When the private data is set to NULL by the caller, the buffer was left uninitialized. If set, it could have stale contents. The buffer is now initialized to zero, so that the contents recorded when a point is attached are deterministic. Author: Sami Imseih <samimseih@gmail.com> Discussion: https://postgr.es/m/CAA5RZ0tsGHu2h6YLnVu4HiK05q+gTE_9WVUAqihW2LSscAYS-g@mail.gmail.com Backpatch-through: 17	2026-04-10 11:17:09 +09:00
Nathan Bossart	71ff232a5b	Fix double-free in pg_stat_autovacuum_scores. Presently, relation_needs_vacanalyze() unconditionally frees the pgstat entry returned by pgstat_fetch_stat_tabentry_ext(). This behavior was first added by commit `02502c1bca` to avoid memory leakage in autovacuum. While this is fine for autovacuum since it forces stats_fetch_consistency to "none", it is not okay for other callers that use "cache" or "snapshot". This manifests as a double-free when pg_stat_autovacuum_scores is called multiple times in the same transaction. To fix, add a "bool *may_free" parameter to pgstat_fetch_stat_tabentry_ext() that returns whether it is safe for the caller to explicitly pfree() the result. If a caller would rather leave it to the memory context machinery to free the result, it can pass NULL as the "may_free" argument (or just ignore its value). Oversight in commit `87f61f0c82`. Reported-by: Tender Wang <tndrwang@gmail.com> Reported-by: Alexander Lakhin <exclusion@gmail.com> Suggested-by: Andres Freund <andres@anarazel.de> Suggested-by: Tom Lane <tgl@sss.pgh.pa.us> Author: Sami Imseih <samimseih@gmail.com> Discussion: https://postgr.es/m/CAHewXNkJKdwb3D5OnksrdOqzqUnXUEMpDam1TPW0vfUkW%3D7jUw%40mail.gmail.com Discussion: https://postgr.es/m/5684f479-858e-4c5d-b8f5-bcf05de1f909%40gmail.com	2026-04-09 13:07:06 -05:00
Masahiko Sawada	8030b839d3	Remove an unstable wait from parallel autovacuum regression test. The test 001_parallel_autovacuum.pl verified that vacuum delay parameters are propagated to parallel vacuum workers by using injection points. It previously waited for autovacuum to complete on the test_autovac table. However, since injection points are cluster-wide, an autovacuum worker could be triggered on tables in other databases (e.g., template1) and get stuck at the same injection point. This could lead to a timeout when the test waits for the expected table's autovacuum to finish. This commit removes the wait for autovacuum completion from this specific test case. Since the primary goal is to verify the propagation of parameter updates, which is already confirmed via log messages, waiting for the entire vacuum process to finish is unnecessary and prone to instability in concurrent test environments. Author: Sami Imseih <samimseih@gmail.com> Discussion: https://postgr.es/m/CAA5RZ0s+kZZRMSF4HW7tZ9W2jS1o4B+Fg8dr5a-T6mANX+mdQA@mail.gmail.com	2026-04-09 09:13:32 -07:00
Andres Freund	7fc36c5db5	instrumentation: Avoid CPUID 0x15/0x16 for Hypervisor TSC frequency This restricts the retrieval of the TSC frequency whilst under a Hypervisor to either Hypervisor-specific CPUID registers (0x40000010), or TSC calibration. We previously allowed retrieving from the traditional CPUID registers for TSC frequency (0x15/0x16) like on bare metal, but it turns out that they are not trustworthy when virtualized and can report wildly incorrect frequencies, like 7 kHz when the actual calibrated frequency is 2.5 GHz. Per report from buildfarm member drongo. Author: Lukas Fittl <lukas@fittl.com> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/jr4hk2sxhqcfpb67ftz5g4vw33nm67cgf7go3wwmqsafu5aclq%405m67ukuhyszz	2026-04-09 11:50:46 -04:00
Nathan Bossart	60165db6e1	Add LOG_NEVER error level code. This logging level means not to emit the log, which is useful for functions like relation_needs_vacanalyze(). This function accepts a log level argument but not all callers want it to emit logs. Suggested-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/3101163.1775676098%40sss.pgh.pa.us	2026-04-09 10:18:15 -05:00
Richard Guo	8b6c89e377	Fix integer overflow in nodeWindowAgg.c In nodeWindowAgg.c, the calculations for frame start and end positions in ROWS and GROUPS modes were performed using simple integer addition. If a user-supplied offset was sufficiently large (close to INT64_MAX), adding it to the current row or group index could cause a signed integer overflow, wrapping the result to a negative number. This led to incorrect behavior where frame boundaries that should have extended indefinitely (or beyond the partition end) were treated as falling at the first row, or where valid rows were incorrectly marked as out-of-frame. Depending on the specific query and data, these overflows can result in incorrect query results, execution errors, or assertion failures. To fix, use overflow-aware integer addition (ie, pg_add_s64_overflow) to check for overflows during these additions. If an overflow is detected, the boundary is now clamped to INT64_MAX. This ensures the logic correctly treats the boundary as extending to the end of the partition. Bug: #19405 Reported-by: Alexander Lakhin <exclusion@gmail.com> Author: Richard Guo <guofenglinux@gmail.com> Reviewed-by: Tender Wang <tndrwang@gmail.com> Discussion: https://postgr.es/m/19405-1ecf025dda171555@postgresql.org Backpatch-through: 14	2026-04-09 19:28:33 +09:00
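The clamp-on-overflow behavior described in the nodeWindowAgg.c fix can be sketched as follows. Python integers do not overflow, so this illustration simulates the 64-bit check explicitly: `add_s64_overflow` is a hypothetical stand-in loosely modeled on pg_add_s64_overflow, and `frame_boundary` is an illustrative name, not a function from the executor.

```python
INT64_MAX = 2**63 - 1
INT64_MIN = -(2**63)


def add_s64_overflow(a: int, b: int):
    """Return (overflowed, result) for a signed 64-bit addition."""
    result = a + b
    if result > INT64_MAX or result < INT64_MIN:
        return True, 0
    return False, result


def frame_boundary(current_pos: int, offset: int) -> int:
    """Compute current_pos + offset, clamping to INT64_MAX on overflow.

    Clamping makes the boundary behave as "beyond the partition end"
    instead of wrapping around to a negative position.
    """
    overflowed, pos = add_s64_overflow(current_pos, offset)
    return INT64_MAX if overflowed else pos
```

With a plain wrapping addition, a row at position 10 with an offset of INT64_MAX would produce a negative boundary; with the clamp it yields INT64_MAX, which any partition-end comparison treats as extending to the last row.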
Peter Eisentraut	11d6042337	Update config.guess and config.sub	2026-04-09 11:26:14 +02:00
Richard Guo	c1408956e3	Strip PlaceHolderVars from partition pruning operands When pulling up a subquery, its targetlist items may be wrapped in PlaceHolderVars to enforce separate identity or as a result of outer joins. This causes any upper-level WHERE clauses referencing these outputs to contain PlaceHolderVars, which prevents partprune.c from recognizing that they match partition key columns, defeating partition pruning. To fix, strip PlaceHolderVars from operands before comparing them to partition keys. A PlaceHolderVar with empty phnullingrels appearing in a relation-scan-level expression is effectively a no-op, so stripping it is safe. This parallels the existing treatment in indxpath.c for index matching. In passing, rename strip_phvs_in_index_operand() to strip_noop_phvs() and move it from indxpath.c to placeholder.c, since it is now a general-purpose utility used by both index matching and partition pruning code. Back-patch to v18. Although this issue exists before that, changes in that version made it common enough to notice. Given the lack of field reports for older versions, I am not back-patching further. In the v18 back-patch, strip_phvs_in_index_operand() is retained as a thin wrapper around the new strip_noop_phvs() to avoid breaking third-party extensions that may reference it. Reported-by: Cándido Antonio Martínez Descalzo <candido@ninehq.com> Diagnosed-by: David Rowley <dgrowleyml@gmail.com> Author: Richard Guo <guofenglinux@gmail.com> Discussion: https://postgr.es/m/CAH5YaUwVUWETTyVECTnhs7C=CVwi+uMSQH=cOkwAUqMdvXdwWA@mail.gmail.com Backpatch-through: 18	2026-04-09 16:41:31 +09:00
Amit Langote	e1cc57fabd	Add nkeys parameter to recheck_matched_pk_tuple() The function looped over ii_NumIndexKeyAttrs elements of the skeys array, but one caller (ri_FastPathFlushArray) passes a one-element array since it only handles single-column FKs. The function signature did not communicate this constraint, which static analysis flags as a potential out-of-bounds read. Add an nkeys parameter and assert that it matches ii_NumIndexKeyAttrs, then use it in the loop. The call sites already know the key count. Reported-by: Evan Montgomery-Recht <montge@mianetworks.net> Discussion: https://postgr.es/m/CAEg7pwcKf01FmDqFAf-Hzu_pYnMYScY_Otid-pe9uw3BJ6gq9g@mail.gmail.com	2026-04-09 14:45:31 +09:00
Michael Paquier	e0fa5bd146	Reduce presence of syscache.h in src/include/ `ee642cccc4` has added syscache.h in inval.h and objectaddress.h, enlarging by a lot the footprint of this header, particularly via objectaddress.h. A change in syscache.h would cause a lot more files to be recompiled. This commit reduces the presence of syscache.h by switching to a direct use of syscache_ids.h in inval.h and objectaddress.h, where the enum SysCacheIdentifier is defined. genbki.pl gains an #ifndef block for this header, so that its inclusion is more controlled. Reported-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/vlcexdcimsmvu3aplt2yxpfndkgtuvjsrms2fdl46rbw3k2kug@drspkoxlaije	2026-04-09 08:49:36 +09:00
Álvaro Herrera	2cff363715	Simplify declaration of memcpy target The existing one is understandably failing on (some?) 32-bit platforms. Reported-by: Tomas Vondra <tomas@vondra.me> Suggested-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/1c197f2d-49a2-4830-8dde-55867218b62d@vondra.me	2026-04-08 22:58:56 +02:00
Daniel Gustafsson	b364828f82	doc: Fix data_checksums data type Commit `f19c0eccae` changed the data_checksums GUC datatype from a boolean to an enum. This updates the documentation to accurately reflect its new type and document the new possible states: 'on', 'off', 'inprogress-on', and 'inprogress-off'. Also update the xref for more information to point to the section on data checksums rather than the initdb checksum option. Author: Lakshmi N <lakshmin.jhs@gmail.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://postgr.es/m/CA+3i_M-AtTnqTB2KLBTpu-c-jvnTuy7bGxyxs80rgiQLxWrRUQ@mail.gmail.com	2026-04-08 22:53:43 +03:00
Nathan Bossart	e0851bded6	Add a couple of commits to .git-blame-ignore-revs.	2026-04-08 13:41:22 -05:00
Peter Eisentraut	f8eec1ced6	Add missing PGDLLIMPORT markings	2026-04-08 15:49:33 +02:00
Thomas Munro	a1643d40b3	Remove RADIUS support. Our RADIUS implementation supported only the deprecated RADIUS/UDP variant, without the recommended Message-Authenticator attribute to mitigate against the Blast-RADIUS vulnerability. By now, popular RADIUS servers are expected to generate loud warnings or reject our authentication attempts outright. Since there have been no user reports about this, it seems unlikely that there are users. Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Aleksander Alekseev <aleksander@tigerdata.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com> Reviewed-by: Michael Banck <mbanck@gmx.net> Discussion: https://postgr.es/m/CA%2BhUKG%2BSH309V8KECU5%3DxuLP9Dks0v9f9UVS2W74fPAE5O21dg%40mail.gmail.com	2026-04-08 22:38:43 +12:00
Etsuro Fujita	28972b6fc3	Add support for importing statistics from remote servers. Add a new FDW callback routine that allows importing remote statistics for a foreign table directly to the local server, instead of collecting statistics locally. The new callback routine is called at the beginning of the ANALYZE operation on the table, and if the FDW failed to import the statistics, the existing callback routine is called on the table to collect statistics locally. Also implement this for postgres_fdw. It is enabled by "restore_stats" option both at the server and table level. Currently, it is the user's responsibility to ensure remote statistics to import are up-to-date, so the default is false. Author: Corey Huinker <corey.huinker@gmail.com> Co-authored-by: Etsuro Fujita <etsuro.fujita@gmail.com> Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Etsuro Fujita <etsuro.fujita@gmail.com> Discussion: https://postgr.es/m/CADkLM%3DchrYAx%3DX2KUcDRST4RLaRLivYDohZrkW4LLBa0iBhb5w%40mail.gmail.com	2026-04-08 19:15:00 +09:00
Thomas Munro	d1c01b79d4	aio: Adjust I/O worker pool automatically. The size of the I/O worker pool used to implement io_method=worker was previously controlled by the io_workers setting, defaulting to 3. It was hard to know how to tune it effectively. That is replaced with: io_min_workers=2 io_max_workers=8 (up to 32) io_worker_idle_timeout=60s io_worker_launch_interval=100ms The pool is automatically sized within the configured range according to recent variation in demand. It grows when existing workers detect that latency might be introduced by queuing, and shrinks when the highest-numbered worker is idle for too long. Work was already concentrated into low-numbered workers in anticipation of this logic. The logic for waking extra workers now also tries to measure and reduce the number of spurious wakeups, though they are not entirely eliminated. Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Dmitry Dolgov <9erthalion6@gmail.com> Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com> Discussion: https://postgr.es/m/CA%2BhUKG%2Bm4xV0LMoH2c%3DoRAdEXuCnh%2BtGBTWa7uFeFMGgTLAw%2BQ%40mail.gmail.com	2026-04-08 19:08:32 +12:00
John Naylor	948ef7cdc4	Exit early from pg_comp_crc32c_pmull for small inputs The vectorized path in commit `fbc57f2bc` had a side effect of putting more branches in the path taken for small inputs. To reduce risk of regressions, only proceed with the vectorized path if we can guarantee that the remaining input after the alignment preamble is greater than 64 bytes. That also allows removing length checks in the alignment preamble. Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Discussion: https://postgr.es/m/CANWCAZZ48GuLYhJCcTy8TXysjrMVJL6n1n7NP94=iG+t80YKPw@mail.gmail.com	2026-04-08 13:52:14 +07:00
Thomas Munro	ce11e63f81	pg_upgrade: Check for unsupported encodings. Since we have dropped MULE_INTERNAL, add a check that all encodings used in the source cluster are still supported according to PG_ENCODING_BE_VALID(). This is done generically, in case we decide to drop another encoding some day. Suggested-by: Jeff Davis <pgsql@j-davis.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CA%2BhUKGKXDXh-FdU0orjfv%2BF08f%3DD91BhV3Ra-4zL-q%2BJmGYqTA%40mail.gmail.com	2026-04-08 17:45:09 +12:00
Thomas Munro	77645d44e3	Remove MULE_INTERNAL encoding. This was useful before widespread Unicode adoption, and was based on the internal encoding Emacs used to mix multiple sub-encodings. Emacs itself has stopped using it, and our implementation hadn't been updated with modern underlying standards. It is thought to be very unlikely that anyone is still using it in the field. Since such a complex encoding comes with costs and risks, we agreed to drop support. Any existing database using this encoding would need to be dumped and restored with a new encoding to upgrade to PostgreSQL 19, most likely UTF8, since pg_upgrade would fail. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Tatsuo Ishii <ishii@postgresql.org> Reviewed-by: Jeff Davis <pgsql@j-davis.com> Discussion: https://postgr.es/m/CA%2BhUKGKXDXh-FdU0orjfv%2BF08f%3DD91BhV3Ra-4zL-q%2BJmGYqTA%40mail.gmail.com	2026-04-08 17:40:06 +12:00
Andres Freund	2c16deee2f	instrumentation: Allocate query level instrumentation in ExecutorStart Until now extensions that wanted to measure overall query execution could create QueryDesc->totaltime, which the core executor would then start and stop. That's a bit odd and composes badly, e.g. extensions always had to use INSTRUMENT_ALL, because otherwise another extension might not get what they need. Instead this introduces a new field, QueryDesc->query_instr_options, that extensions can use to indicate whether they need query level instrumentation populated, and with which instrumentation options. Extensions should take care to only add options they need, instead of replacing the options of others. The prior name of the field, totaltime, sounded like it would only measure time, but these days the instrumentation infrastructure can track more resources. The secondary benefit is that this will make it obvious to extensions that they may not create the Instrumentation struct themselves anymore (often extensions build only against a postgres build without assertions). Adjust pg_stat_statements and auto_explain to match, and lower the requested instrumentation level for auto_explain to INSTRUMENT_TIMER, since the summary instrumentation it needs is only runtime. The reason to push this now, rather than in the PG 20 cycle, is that `5a79e78501` already required extensions using query level instrumentations to adjust their code, and it seemed undesirable to require them to do so again for 20. Author: Lukas Fittl <lukas@fittl.com> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/CAP53Pkyqsht+exJQYRsjhSWYKu+vFGHhPub7m6PmFD6Or0=p1g@mail.gmail.com	2026-04-08 00:06:45 -04:00
Fujii Masao	db93032a7c	Fix slotsync worker blocking promotion when stuck in wait Previously, on standby promotion, the startup process sent SIGUSR1 to the slotsync worker (or a backend performing slot synchronization) and waited for it to exit. This worked in most cases, but if the process was blocked waiting for a response from the primary (e.g., due to a network failure), SIGUSR1 would not interrupt the wait. As a result, the process could remain stuck, causing the startup process to wait for a long time and delaying promotion. This commit fixes the issue by introducing a new procsignal reason, PROCSIG_SLOTSYNC_MESSAGE. On promotion, the startup process sends this signal, and the handler sets interrupt flags so the process exits (or errors out) promptly at CHECK_FOR_INTERRUPTS(), allowing promotion to complete without delay. Backpatch to v17, where slotsync was introduced. Author: Nisha Moond <nisha.moond412@gmail.com> Reviewed-by: shveta malik <shveta.malik@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Zhijie Hou <houzj.fnst@fujitsu.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/CAHGQGwFzNYroAxSoyJhqTU-pH=t4Ej6RyvhVmBZ91Exj_TPMMQ@mail.gmail.com Backpatch-through: 17	2026-04-08 11:22:21 +09:00
Andres Freund	544000288e	instrumentation: Move ExecProcNodeInstr to allow inlining This moves the implementation of ExecProcNodeInstr, the ExecProcNode variant that gets used when instrumentation is on, to be defined in instrument.c instead of execProcNode.c, and marks functions it uses as inline. This allows compilers to generate an optimized implementation, and shows a 4 to 12% reduction in instrumentation overhead for queries that move lots of rows. Author: Lukas Fittl <lukas@fittl.com> Suggested-by: Andres Freund <andres@anarazel.de> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/CAP53PkzdBK8VJ1fS4AZ481LgMN8f9mJiC39ZRHqkFUSYq6KWmg@mail.gmail.com	2026-04-07 21:36:49 -04:00
Tomas Vondra	e157fe6f76	Add EXPLAIN (IO) instrumentation for TidRangeScan Adds support for EXPLAIN (IO) instrumentation for TidRange scans. This requires adding shared instrumentation for parallel scans, using the separate DSM approach introduced by `dd78e69cfc`. Author: Tomas Vondra <tomas@vondra.me> Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Lukas Fittl <lukas@fittl.com> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/flat/a177a6dd-240b-455a-8f25-aca0b1c08c6e%40vondra.me	2026-04-07 23:25:05 +02:00
Andres Freund	16fca48254	pg_test_timing: Also test RDTSC[P] timing, report time source, TSC frequency This adds support to pg_test_timing for the different timing sources added by `294520c444`. Author: Lukas Fittl <lukas@fittl.com> Author: David Geier <geidav.pg@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: David Geier <geidav.pg@gmail.com> Reviewed-by: Zsolt Parragi <zsolt.parragi@percona.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> (in an earlier version) Discussion: https://www.postgresql.org/message-id/flat/20200612232810.f46nbqkdhbutzqdg%40alap3.anarazel.de	2026-04-07 17:12:08 -04:00
Tomas Vondra	3b1117d6e2	Add EXPLAIN (IO) instrumentation for SeqScan Adds support for EXPLAIN (IO) instrumentation for sequential scans. This requires adding shared instrumentation, using the separate DSM approach introduced by `dd78e69cfc`. Author: Tomas Vondra <tomas@vondra.me> Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Lukas Fittl <lukas@fittl.com> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/flat/a177a6dd-240b-455a-8f25-aca0b1c08c6e%40vondra.me	2026-04-07 23:07:03 +02:00
Tom Lane	b268928f93	Suppress unused-variable warning. x86 machines lacking HAVE__CPUIDEX saw a complaint about "unused variable 'reg'", per buildfarm as well as local experience. Oversight in `bcb2cf41f`.	2026-04-07 17:03:20 -04:00
Tomas Vondra	61c36a34a4	auto_explain: Add new GUC auto_explain.log_io Allows enabling the new EXPLAIN "IO" option for auto_explain. Author: Tomas Vondra <tomas@vondra.me> Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Lukas Fittl <lukas@fittl.com> Discussion: https://postgr.es/m/flat/a177a6dd-240b-455a-8f25-aca0b1c08c6e%40vondra.me	2026-04-07 22:49:44 +02:00
Tomas Vondra	681daed931	Add EXPLAIN (IO) infrastructure with BitmapHeapScan support Allows collecting details about AIO / prefetch for scan nodes backed by a ReadStream. This may be enabled by a new "IO" option in EXPLAIN, and it shows information about the prefetch distance and I/O requests. As of this commit this applies only to BitmapHeapScan, because that's the only scan node using a ReadStream and collecting instrumentation from workers in a parallel query. Support for SeqScan and TidRangeScan, the other scan nodes using ReadStream, will be added in subsequent commits. The stats are collected only when required by EXPLAIN ANALYZE, with the IO option (disabled by default). The amount of collected statistics is very limited, but we don't want to clutter EXPLAIN with too much data. The IOStats struct is stored in the scan descriptor as a field, next to other fields used by table AMs. A pointer to the field is passed to the ReadStream, and updated directly. It's the responsibility of the table AM to allocate the struct (e.g. in ambeginscan) whenever the SO_SCAN_INSTRUMENT flag is passed to the scan, so that the executor and ReadStream have access to it. The collected stats are designed for ReadStream, but are meant to be reasonably generic in case a TAM manages I/Os in different ways. Author: Tomas Vondra <tomas@vondra.me> Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Lukas Fittl <lukas@fittl.com> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/flat/a177a6dd-240b-455a-8f25-aca0b1c08c6e%40vondra.me	2026-04-07 22:33:34 +02:00
Tomas Vondra	10d5a12a93	Switch EXPLAIN to unaligned output for json/xml/yaml Use unaligned output for multiple EXPLAIN queries using non-text format in regression tests. With aligned output adding/removing explain fields can be very disruptive, as it often modifies the whole block because of padding. Unaligned output does not have this issue. Author: Tomas Vondra <tomas@vondra.me> Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Lukas Fittl <lukas@fittl.com> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/flat/a177a6dd-240b-455a-8f25-aca0b1c08c6e%40vondra.me	2026-04-07 22:12:27 +02:00
Tom Lane	4edd6036d6	Fix WITHOUT OVERLAPS' interaction with domains. UNIQUE/PRIMARY KEY ... WITHOUT OVERLAPS requires the no-overlap column to be a range or multirange, but it should allow a domain over such a type too. This requires minor adjustments in both the parser and executor. In passing, fix a nearby break-instead-of-continue thinko in transformIndexConstraint. This had the effect of disabling parse-time validation of the no-overlap column's type in the context of ALTER TABLE ADD CONSTRAINT, if it follows a dropped column. We'd still complain appropriately at runtime though. Author: Jian He <jian.universality@gmail.com> Reviewed-by: Paul A Jungwirth <pj@illuminatedcomputing.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CACJufxGoAmN_0iJ=hjTG0vGpOSOyy-vYyfE+-q0AWxrq2_p5XQ@mail.gmail.com Backpatch-through: 18	2026-04-07 14:45:37 -04:00
Andres Freund	294520c444	instrumentation: Use Time-Stamp Counter on x86-64 to lower overhead This allows the direct use of the Time-Stamp Counter (TSC) value retrieved from the CPU using RDTSC/RDTSCP instructions, instead of APIs like clock_gettime() on POSIX systems. This reduces the overhead of EXPLAIN with ANALYZE and TIMING ON. Tests showed that the overhead on top of actual runtime when instrumenting queries moving lots of rows through the plan can be reduced from 2x as slow to 1.2x as slow compared to the actual runtime. More complex workloads such as TPCH queries have also shown ~20% gains when instrumented compared to before. To control use of the TSC, the new "timing_clock_source" GUC is introduced, whose default ("auto") automatically uses the TSC when reliable, for example when running on modern Intel CPUs, or when running on Linux and the system clocksource is reported as "tsc". The use of the operating system clock source can be enforced by setting "system", or on x86-64 architectures the use of TSC can be enforced by explicitly setting "tsc". In order to use the TSC the frequency is first determined by use of CPUID, and if not available, by running a short calibration loop at program start, falling back to the system clock source if TSC values are not stable. Note that we split TSC usage into the RDTSC CPU instruction which does not wait for out-of-order execution (faster, less precise) and the RDTSCP instruction, which waits for outstanding instructions to retire. RDTSCP is deemed to have little benefit in the typical InstrStartNode() / InstrStopNode() use case of EXPLAIN, and can be up to twice as slow. To separate these use cases, the new macro INSTR_TIME_SET_CURRENT_FAST() is introduced, which uses RDTSC. The original macro INSTR_TIME_SET_CURRENT() uses RDTSCP and is supposed to be used when precision is more important than performance. 
When the system timing clock source is used both of these macros instead utilize the system APIs (clock_gettime / QueryPerformanceCounter) like before. Additional users of interval timing, such as track_io_timing and track_wal_io_timing could also benefit from being converted to use INSTR_TIME_SET_CURRENT_FAST() but are left for future changes. Author: Lukas Fittl <lukas@fittl.com> Author: Andres Freund <andres@anarazel.de> Author: David Geier <geidav.pg@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: David Geier <geidav.pg@gmail.com> Reviewed-by: Lukas Fittl <lukas@fittl.com> Reviewed-by: Zsolt Parragi <zsolt.parragi@percona.com> Reviewed-by: Thomas Munro <thomas.munro@gmail.com> (in an earlier version) Reviewed-by: Maciek Sakrejda <m.sakrejda@gmail.com> (in an earlier version) Reviewed-by: Robert Haas <robertmhaas@gmail.com> (in an earlier version) Reviewed-by: Jakub Wartak <jakub.wartak@enterprisedb.com> (in an earlier version) Discussion: https://postgr.es/m/20200612232810.f46nbqkdhbutzqdg@alap3.anarazel.de	2026-04-07 13:00:24 -04:00
Andres Freund	bcb2cf41f9	Allow retrieving x86 TSC frequency/flags from CPUID This adds additional x86 specific CPUID checks for flags needed for determining whether the Time-Stamp Counter (TSC) is usable on a given system, as well as a helper function to retrieve the TSC frequency from CPUID. This is intended for a future patch that will utilize the TSC to lower the overhead of timing instrumentation. In passing, always make pg_cpuid_subleaf reset the variables used for its result, to avoid accidentally using stale results if __get_cpuid_count errors out and the caller doesn't check for it. Author: Lukas Fittl <lukas@fittl.com> Author: David Geier <geidav.pg@gmail.com> Author: Andres Freund <andres@anarazel.de> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: David Geier <geidav.pg@gmail.com> Reviewed-by: John Naylor <john.naylor@postgresql.org> Reviewed-by: Jakub Wartak <jakub.wartak@enterprisedb.com> (in an earlier version) Discussion: https://www.postgresql.org/message-id/flat/20200612232810.f46nbqkdhbutzqdg%40alap3.anarazel.de	2026-04-07 13:00:24 -04:00
Andres Freund	0022622c93	instrumentation: Standardize ticks to nanosecond conversion method The timing infrastructure (INSTR_* macros) measures time elapsed using clock_gettime() on POSIX systems, which returns the time as nanoseconds, and QueryPerformanceCounter() on Windows, which is a specialized timing clock source that returns a tick counter that needs to be converted to nanoseconds using the result of QueryPerformanceFrequency(). This conversion currently happens ad-hoc on Windows, e.g. when calling INSTR_TIME_GET_NANOSEC, which calls QueryPerformanceFrequency() on every invocation, despite the frequency being stable after program start, incurring unnecessary overhead. It also causes a fractured implementation where macros are defined differently between platforms. To ease code readability, and prepare for a future change that intends to use a ticks-to-nanosecond conversion on x86-64 for TSC use, introduce new pg_ticks_to_ns() / pg_ns_to_ticks() functions that get called from INSTR_* macros on all platforms. These functions rely on a separately initialized ticks_per_ns_scaled value, that represents the conversion ratio. This value is initialized from QueryPerformanceFrequency() on Windows, and set to zero on x86-64 POSIX systems, which results in the ticks being treated as nanoseconds. Other architectures always directly return the original ticks. To support this, pg_initialize_timing() is introduced, and is now mandatory for both the backend and any frontend programs to call before utilizing INSTR_* macros. In passing, fix variable names in comment documenting INSTR_TIME_ADD_NANOSEC(). 
Author: Lukas Fittl <lukas@fittl.com> Author: David Geier <geidav.pg@gmail.com> Author: Andres Freund <andres@anarazel.de> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: David Geier <geidav.pg@gmail.com> Reviewed-by: Lukas Fittl <lukas@fittl.com> Reviewed-by: Zsolt Parragi <zsolt.parragi@percona.com> Discussion: https://www.postgresql.org/message-id/flat/20200612232810.f46nbqkdhbutzqdg%40alap3.anarazel.de	2026-04-07 13:00:24 -04:00
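To make the ticks-to-nanoseconds entry above (0022622c93) more concrete, here is a minimal Python sketch of a scaled-integer conversion in which a zero ratio means the ticks are already nanoseconds. The SCALE_BITS constant, the helper names, and the exact arithmetic are assumptions for illustration; the commit message names pg_ticks_to_ns(), pg_ns_to_ticks(), and ticks_per_ns_scaled but does not spell out the formula.

```python
# Hypothetical fixed-point precision; not taken from the PostgreSQL source.
SCALE_BITS = 14


def make_ticks_per_ns_scaled(frequency_hz):
    """Ticks per nanosecond, multiplied by 2**SCALE_BITS to stay in integers."""
    return (frequency_hz << SCALE_BITS) // 1_000_000_000


def ticks_to_ns(ticks, ticks_per_ns_scaled):
    """Convert a tick count to nanoseconds.

    A ratio of zero means "ticks are already nanoseconds", mirroring the
    behaviour the commit message describes for POSIX clock_gettime().
    """
    if ticks_per_ns_scaled == 0:
        return ticks
    return (ticks << SCALE_BITS) // ticks_per_ns_scaled


def ns_to_ticks(ns, ticks_per_ns_scaled):
    """Inverse conversion, with the same zero-ratio shortcut."""
    if ticks_per_ns_scaled == 0:
        return ns
    return (ns * ticks_per_ns_scaled) >> SCALE_BITS


if __name__ == "__main__":
    ratio = make_ticks_per_ns_scaled(2_000_000_000)  # a 2 GHz counter
    print(ticks_to_ns(2000, ratio))
    print(ticks_to_ns(1234, 0))
```

Computing the ratio once at startup is the point of the design described in the entry: the per-call path then needs only a multiply or divide and a shift, not a fresh frequency query. Precision in this sketch depends on SCALE_BITS and on how evenly the frequency divides.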
Jacob Champion	b977bd308a	oauth: Allow validators to register custom HBA options OAuth validators can already use custom GUCs to configure behavior globally. But we currently provide no ability to adjust settings for individual HBA entries, because the original design focused on a world where a provider covered a "single audience" of users for one database cluster. This assumption does not apply to multitenant use cases, where a single validator may be controlling access for wildly different user groups. To improve this use case, add two new API calls for use by validator callbacks: RegisterOAuthHBAOptions() and GetOAuthHBAOption(). Registering options "foo" and "bar" allows a user to set "validator.foo" and "validator.bar" in an oauth HBA entry. These options are stringly typed (syntax validation is solely the responsibility of the defining module), and names are restricted to a subset of ASCII to avoid tying our hands with future HBA syntax improvements. Unfortunately, we can't check the custom option names during a reload of the configuration, like we do with standard HBA options, without requiring all validators to be loaded via shared_preload_libraries. (I consider this to be a nonstarter: most validators should probably use session_preload_libraries at most, since requiring a full restart just to update authentication behavior will be unacceptable to many users.) Instead, the new validator.* options are checked against the registered list at connection time. Multiple alternatives were proposed and/or prototyped, including extending the GUC system to allow per-HBA overrides, joining forces with recent refactoring work on the reloptions subsystem, and giving the ability to customize HBA options to all PostgreSQL extensions. I personally believe per-HBA GUC overrides are the best option, because several existing GUCs like authentication_timeout and pre_auth_delay would fit there usefully. 
But the recent addition of SNI per-host settings in `4f433025f` indicates that a more general solution is needed, and I expect that to take multiple releases' worth of discussion. This compromise patch, then, is intentionally designed to be an architectural dead end: simple to describe, cheap to maintain, and providing just enough functionality to let validators move forward for PG19. The hope is that it will be replaced in the future by a solution that can handle per-host, per-HBA, and other per-context configuration with the same functionality that GUCs provide today. In the meantime, the bulk of the code in this patch consists of strict guardrails on the simple API, to try to ensure that we don't have any reason to regret its existence during its unknown lifespan. I owe particular thanks here to Zsolt Parragi, who prototyped several approaches that guided the final design. Suggested-by: Zsolt Parragi <zsolt.parragi@percona.com> Suggested-by: VASUKI M <vasukianand0119@gmail.com> Reviewed-by: Zsolt Parragi <zsolt.parragi@percona.com> Discussion: https://postgr.es/m/CAN4CZFM3b8u5uNNNsY6XCya257u%2BDofms3su9f11iMCxvCacag%40mail.gmail.com	2026-04-07 08:15:19 -07:00
Jacob Champion	6d00fb9048	libpq: Split PGOAUTHDEBUG=UNSAFE into multiple options PGOAUTHDEBUG is a blunt instrument: you get all the debugging features, or none of them. The most annoying consequence during manual use is the Curl debug trace, which tends to obscure the device flow prompt entirely. The promotion of PGOAUTHCAFILE into its own feature in `993368113` improved the situation somewhat, but there's still the discomfort of knowing you have to opt into many dangerous behaviors just to get the single debug feature you wanted. Explode the PGOAUTHDEBUG syntax into a comma-separated list. The old "UNSAFE" value enables everything, like before. Any individual unsafe features still require the envvar to begin with an "UNSAFE:" prefix, to try to interrupt the flow of someone who is about to do something they should not. So now, rather than PGOAUTHDEBUG=UNSAFE # enable all the unsafe things a developer can say PGOAUTHDEBUG=call-count # only show me the call count. safe! PGOAUTHDEBUG=UNSAFE:trace # print secrets, but don't allow HTTP To avoid adding more build system scaffolding to libpq-oauth, implement this entirely in a small private header. This unfortunately can't be standalone, so it needs a headerscheck exception. Author: Zsolt Parragi <zsolt.parragi@percona.com> Co-authored-by: Jacob Champion <jacob.champion@enterprisedb.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Zsolt Parragi <zsolt.parragi@percona.com> Discussion: https://postgr.es/m/CAOYmi%2B%3DfbZNJSkHVci%3DGpR8XPYObK%3DH%2B2ERRha0LDTS%2BifsWnw%40mail.gmail.com Discussion: https://postgr.es/m/CAN4CZFMmDZMH56O9vb_g7vHqAk8ryWFxBMV19C39PFghENg8kA%40mail.gmail.com	2026-04-07 08:15:14 -07:00
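The PGOAUTHDEBUG syntax described in the entry above (6d00fb9048) can be sketched roughly as follows. This is an illustrative Python model, not the libpq implementation: the feature names "call-count" and "trace" come from the commit message, while "http", the function name, and the choice to raise on bad input are assumptions made for the example.

```python
SAFE_FEATURES = {"call-count"}
# "trace" is named in the commit message; "http" is a hypothetical name
# for the option that allows plain HTTP.
UNSAFE_FEATURES = {"trace", "http"}


def parse_oauth_debug(value):
    """Return the set of debug features enabled by a PGOAUTHDEBUG value."""
    # The legacy bare "UNSAFE" value enables everything, as before.
    if value == "UNSAFE":
        return SAFE_FEATURES | UNSAFE_FEATURES

    # Individual unsafe features are only accepted behind "UNSAFE:".
    unsafe_allowed = value.startswith("UNSAFE:")
    if unsafe_allowed:
        value = value[len("UNSAFE:"):]

    enabled = set()
    for name in value.split(","):
        name = name.strip()
        if not name:
            continue
        if name in SAFE_FEATURES:
            enabled.add(name)
        elif name in UNSAFE_FEATURES:
            if not unsafe_allowed:
                # Assumption: reject outright; the real code may warn instead.
                raise ValueError(f"{name!r} requires the UNSAFE: prefix")
            enabled.add(name)
        else:
            raise ValueError(f"unknown debug feature {name!r}")
    return enabled


if __name__ == "__main__":
    print(sorted(parse_oauth_debug("call-count")))
    print(sorted(parse_oauth_debug("UNSAFE:trace")))
```

Requiring the prefix on the whole variable, not on each option, keeps the dangerous opt-in visible at the start of the value.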
Álvaro Herrera	e76d8c749c	Reserve replication slots specifically for REPACK Add a new GUC max_repack_replication_slots, which lets the user reserve some additional replication slots for concurrent repack (and only concurrent repack). With this, the user doesn't have to worry about changing the max_replication_slots in order to cater for use of concurrent repack. (We still use the same pool of bgworkers though, but that's less commonly a problem than slots.) Author: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Srinath Reddy Sadipiralla <srinath2133@gmail.com> Discussion: https://postgr.es/m/202604012148.nnnmyxxrr6nh@alvherre.pgsql	2026-04-07 16:55:29 +02:00
Heikki Linnakangas	979387f188	Fix harmless leftover in _hash_kill_items() Checking for 'havePin' is sufficient here. An earlier version of the patch didn't have the 'havePin' variable and used 'so->hashso_bucket_buf == so->currPos.buf' as the condition when both locking and unlocking the page. The havePin variable was added later during development, but the unlocking condition wasn't fully updated. Tidy it up. Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://www.postgresql.org/message-id/b9de8d05-3b02-4a27-9b0b-03972fa4bfd3@iki.fi	2026-04-07 17:38:11 +03:00
Andrew Dunstan	55890a9194	Add errdetail() with PID and UID about source of termination signal. When a backend is terminated via pg_terminate_backend() or an external SIGTERM, the error message now includes the sender's PID and UID as errdetail, making it easier to identify the source of unexpected terminations in multi-user environments. On platforms that support SA_SIGINFO (Linux, FreeBSD, and most modern Unix systems), the signal handler captures si_pid and si_uid from the siginfo_t structure. On platforms without SA_SIGINFO, the detail is simply omitted. Author: Jakub Wartak <jakub.wartak@enterprisedb.com> Reviewed-by: Andrew Dunstan <andrew@dunslane.net> Reviewed-by: Chao Li <1356863904@qq.com> Discussion: https://postgr.es/m/CAKZiRmyrOWovZSdixpLd3PGMQXuQL_zw2Ght5XhHCkQ1uDsxjw@mail.gmail.com	2026-04-07 10:22:33 -04:00
Robert Haas	c10edb102a	pg_stash_advice: Allow stashed advice to be persisted to disk. If pg_stash_advice.persist = true, stashed advice will be written to pg_stash_advice.tsv in the data directory, periodically and at shutdown. On restart, stash modifications are locked out until this file has been reloaded, but queries will not be, so there may be a short window after startup during which previously-stashed advice is not automatically applied. Author: Robert Haas <rhaas@postgresql.org> Co-authored-by: Lukas Fittl <lukas@fittl.com> Discussion: https://postgr.es/m/CA+Tgmob87qsWa-VugofU6epuV0H5XjWZGMbQas4Q-ADKmvSyBg@mail.gmail.com	2026-04-07 10:11:36 -04:00
Andres Freund	29e7dbf5e4	Minimal fix for WAIT FOR ... MODE 'standby_flush' The investigation into the negative test performance impact of `7e8aeb9e48` led to discovering that there are a few issues with WAIT FOR. This commit is just a minimal fix to prevent hangs in standby_flush mode, due to WAIT FOR ... 'standby_flush' seeing a 0 LSN if a newly started walreceiver does not receive any writes, because the standby is already caught up. There are several other issues and this isn't necessarily the best fix. But this way we get the hangs out of the way. Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/zqbppucpmkeqecfy4s5kscnru4tbk6khp3ozqz6ad2zijz354k@w4bdf4z3wqoz	2026-04-07 09:48:09 -04:00
Álvaro Herrera	8fb95a8ab6	doc: Add an example of REPACK (CONCURRENTLY) Suggested-by: vignesh C <vignesh21@gmail.com> Discussion: https://postgr.es/m/CALDaNm3tiKhtegx5Cawi34UjbHmNGEDNAtScGM1RgWRtV-5_0Q@mail.gmail.com	2026-04-07 15:33:55 +02:00
Heikki Linnakangas	9480c585df	Tidy up #ifdef USE_INJECTION_POINTS guards Remove unnecessary #ifdef guard around the function prototypes; they are already inside a larger #ifdef block. Move #include "subsystems.h" inside the USE_INJECTION_POINTS guard; it's needed for InjectionPointShmemCallbacks, which is also inside the guard. Reported-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> Discussion: https://www.postgresql.org/message-id/87y0iz2c1v.fsf@wibble.ilmari.org	2026-04-07 16:18:31 +03:00
Álvaro Herrera	be142fa008	Fix tests under wal_level=minimal Buildfarm members which are specifically configured to use wal_level=minimal fail the repack regression tests, which require wal_level=replica. Add a temp config file to fix that.	2026-04-07 15:14:32 +02:00
Heikki Linnakangas	257c8231bf	Modernize and optimize pg_buffercache_pages() Refactor pg_buffercache_pages() to use SFRM_Materialize mode and construct a tuplestore directly. That's simpler and more efficient than collecting all the data to a custom array first. Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Author: Palak Chaturvedi <chaturvedipalak1911@gmail.com> Discussion: https://www.postgresql.org/message-id/CAExHW5sMsaz1j+hrdhyo-DJp7JCgJx87=q2iJfOc_9mwYWyvmw@mail.gmail.com	2026-04-07 16:04:48 +03:00
Heikki Linnakangas	9f3755ea07	Optimize sorting and deduplicating trigrams Use templated qsort() so that the comparison function can be inlined. To speed up qunique(), use a specialized comparison function that only checks for equality. Author: David Geier <geidav.pg@gmail.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Discussion: https://www.postgresql.org/message-id/2a76b5ef-4b12-4023-93a1-eed6e64968f3@gmail.com	2026-04-07 14:11:25 +03:00
Tomas Vondra	884f9b3c76	Use add_size/mul_size for index instrumentation size calculations Use overflow-safe size arithmetic in the Index[Only]Scan and parallel instrumentation functions, consistent with other executor nodes (Hash, Sort, Agg, Memoize). This was an oversight in `dd78e69cfc`. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Tomas Vondra <tomas@vondra.me> Reviewed-by: Lukas Fittl <lukas@fittl.com> Discussion: https://postgr.es/m/flat/a177a6dd-240b-455a-8f25-aca0b1c08c6e%40vondra.me	2026-04-07 12:47:28 +02:00
Tomas Vondra	9c18b47e61	Fix BitmapHeapScan non-parallel-aware EXPLAIN ANALYZE Allocates shared bitmap table scan instrumentation for all parallel scans. Previously, the instrumentation was only allocated for parallel-aware scans; other bitmap heap scans in the parallel query had no shared instrumentation and EXPLAIN didn't report exact/lossy pages. This affected cases like scans on the outside of a parallel join or queries run with debug_parallel_query=regress. Fixed by allocating a separate DSM chunk for shared instrumentation and doing so regardless of parallel-awareness. The instrumentation is allocated in its own DSM chunk, separate from ParallelBitmapHeapState. Report and initial patch by me. The approach with a separate DSM was proposed and implemented by Melanie. Not backpatched. The issue affects Postgres 18 (since `5a1e6df3b8`), but having multiple DSM chunks is possible only since `dd78e69cfc`. If we decide to fix this in backbranches too, it will need to be done in a less invasive way. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Tomas Vondra <tomas@vondra.me> Reviewed-by: Lukas Fittl <lukas@fittl.com> Discussion: https://postgr.es/m/flat/a177a6dd-240b-455a-8f25-aca0b1c08c6e%40vondra.me	2026-04-07 12:47:13 +02:00
Álvaro Herrera	0d3dba38c7	Allow logical replication snapshots to be database-specific By default, the logical decoding assumes access to shared catalogs, so the snapshot builder needs to consider cluster-wide XIDs during startup. That in turn means that, if any transaction is already running (and has XID assigned), the snapshot builder needs to wait for its completion, as it does not know if that transaction performed catalog changes earlier. A possible problem with this concept is that if REPACK (CONCURRENTLY) is running in some database, backends running the same command in other databases get stuck until the first one has committed. Thus only a single backend in the cluster can run REPACK (CONCURRENTLY) at any time. Likewise, REPACK (CONCURRENTLY) can block walsenders starting on behalf of subscriptions throughout the cluster. This patch adds a new option to logical replication output plugin, to declare that it does not use shared catalogs (i.e. catalogs that can be changed by transactions running in other databases in the cluster). In that case, no snapshot the backend will use during the decoding needs to contain information about transactions running in other databases. Thus the snapshot builder only needs to wait for completion of transactions in the current database. Currently we only use this option in the REPACK background worker. It could possibly be used in the plugin for logical replication too, however that would need thorough analysis of that plugin. Bump WAL version number, due to a new field in xl_running_xacts. Author: Antonin Houska <ah@cybertec.at> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/90475.1775218118@localhost	2026-04-07 12:31:18 +02:00
Álvaro Herrera	a3b069ef90	Avoid different-size pointer-to-integer cast Buildfarm member mamba is unhappy that I wrote "(Datum) NULL" in commit `28d534e2ae`: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mamba&dt=2026-04-07%2005%3A08%3A08 Use "(Datum) 0" which is what we do everywhere else. Discussion: https://postgr.es/m/CANWCAZaOs_+WPH13ow33Q==+FwBwVZkqzm4vND=WEB4_NBmv1Q@mail.gmail.com	2026-04-07 12:28:05 +02:00
Heikki Linnakangas	6f5ad00ab7	Optimize sort and deduplication in ginExtractEntries() Remove NULLs from the array first, and use qsort to deduplicate only the non-NULL items. This simplifies the comparison function. Also replace qsort_arg() with a templated version so that the comparison function can be inlined. These changes make ginExtractEntries() a little faster especially for simple datatypes like integers. Author: David Geier <geidav.pg@gmail.com> Discussion: https://www.postgresql.org/message-id/6d16b6bd-a1ff-4469-aefb-a1c8274e561a@iki.fi	2026-04-07 13:26:39 +03:00
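The approach in the ginExtractEntries() entry above (6f5ad00ab7) can be illustrated with a short Python sketch: strip NULLs first, then sort and deduplicate only the remaining values, so the comparison never has to handle NULL. The function name and the way NULL presence is reported (a boolean flag) are assumptions for the example, not the actual C interface.

```python
def extract_entries(values):
    """Return (sorted unique non-NULL values, whether any NULL was present)."""
    # Step 1: drop NULLs up front so the sort needs no NULL-aware comparator.
    non_null = [v for v in values if v is not None]
    has_null = len(non_null) != len(values)

    # Step 2: sort, then remove adjacent duplicates in one linear pass.
    non_null.sort()
    unique = []
    for v in non_null:
        if not unique or unique[-1] != v:
            unique.append(v)
    return unique, has_null


if __name__ == "__main__":
    print(extract_entries([3, None, 1, 3, None, 2]))
```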
Peter Eisentraut	b6ccd30d8f	Add isolation tests for UPDATE/DELETE FOR PORTION OF Add documentation about concurrency issues related to UPDATE/DELETE FOR PORTION OF as well as supporting isolation tests. Author: Paul A. Jungwirth <pj@illuminatedcomputing.com> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://www.postgresql.org/message-id/flat/ec498c3d-5f2b-48ec-b989-5561c8aa2024%40illuminatedcomputing.com	2026-04-07 11:22:11 +02:00
Álvaro Herrera	5bcc3fbd19	Fix valgrind failure Buildfarm member skink reports that the new REPACK code is trying to write uninitialized bytes to disk, which correspond to padding space in the SerializedSnapshotData struct. Silence that by initializing the memory in SerializeSnapshot() to all zeroes. Co-authored-by: Srinath Reddy Sadipiralla <srinath2133@gmail.com> Co-authored-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/1976915.1775537087@sss.pgh.pa.us	2026-04-07 11:13:50 +02:00
John Naylor	8c3e22a8f8	Use .h for the file containing the page checksum code fragment Commit `5e13b0f24` used a .c file for a file containing a code fragment, to avoid adding an exception to headerscheck. That turned out to be too clever, since it meant installation didn't happen by the usual mechanism. Make it look like a normal header and add the requisite exception. Bug: #19450 Reported-by: RekGRpth <rekgrpth@gmail.com> Discussion: https://postgr.es/m/19450-bb0612c50c6786e5@postgresql.org	2026-04-07 15:52:55 +07:00
John Naylor	30229be755	Simplify SortSupport for the macaddr data type As of commit `6aebedc38` Datums are 64-bit values. Since MAC addresses have only 6 bytes, the abbreviated key always contains the entire MAC address and is thus authoritative (for practical purposes -- the tuple sort machinery has no way of knowing that). Abbreviating this datatype is cheap, and aborting abbreviation prevents optimizations like radix sort, so remove cardinality estimation. Author: Aleksander Alekseev <aleksander@tigerdata.com> Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru> Reviewed-by: Michael Paquier <michael@paquier.xyz> Suggested-by: John Naylor <johncnaylorls@gmail.com> Discussion: https://postgr.es/m/CAJ7c6TMk10rF_LiMz6j9rRy1rqk-5s+wBPuBefLix4cY+-4s1w@mail.gmail.com	2026-04-07 13:29:27 +07:00
Michael Paquier	49cc0d4148	Mark JumbleState as a const in the post_parse_analyze hook This commit changes the post_parse_analyze_hook_type() hook to take a const JumbleState, to tell external modules that they are not allowed to touch the JumbleState that has been compiled by the core code. This fixes a pretty old problem with pg_stat_statements, which has always modified the lengths of the constants stored in the JumbleState. The previous state could confuse extensions that need to look at a JumbleState depending on the loading order, if pg_stat_statements is part of the stack loaded. Another piece included in this commit is the move of the routine fill_in_constant_lengths() to queryjumblefuncs.c, to give an option to extensions to compile the lengths of the constants, if necessary. I was surprised by the amount of external code that carries a copy of this routine (see the thread for details). Previously, this routine modified JumbleState. It now copies the set of LocationLens from JumbleState, and fills the constant lengths for separate use. pg_stat_statements is updated to use the new ComputeConstantLengths(). JumbleState is now marked with a const in the module, where relevant. Author: Sami Imseih <samimseih@gmail.com> Co-authored-by: Lukas Fittl <lukas@fittl.com> Discussion: https://postgr.es/m/CAA5RZ0tZp5qU0ikZEEqJnxvdSNGh1DWv80sb-k4QAUmiMoOp_Q@mail.gmail.com	2026-04-07 15:22:49 +09:00
John Naylor	51098839cf	Split CREATE STATISTICS error reasons out into errdetails Some errmsgs in statscmds.c were phrased as "...cannot be used because...". Put the reasons into errdetails. While at it, switch from passive voice to "cannot create..." for the errmsg. Author: Yugo Nagata <nagata@sraoss.co.jp> Suggested-by: John Naylor <johncnaylorls@gmail.com> Discussion: https://postgr.es/m/CANWCAZaZeX0omWNh_ZbD_JVujzYQdRUW8UZOQ4dWh9Sg7OcAow@mail.gmail.com	2026-04-07 11:37:48 +07:00
Michael Paquier	3284e3f63c	Fix injection point detach timing problem in TAP test for lock stats injection_points_detach() could fail because of a concurrent cleanup triggered by injection_points_set_local() when a session finishes. This problem could be reproduced by adding a hardcoded sleep in InjectionPointDetach(), and has been detected by the CI. As the test is designed so that the injection point is detached before being awakened, there is no need for it to be local, similarly to test 010_index_concurrently_upsert. This commit removes injection_points_set_local(), replacing it with a confirmation that the point has been attached in the session expected to block on a lock. With this removal, the detach cannot happen concurrently anymore, only before the point is woken up. Issue introduced by `557a9f1e3e`, where the test has been added. Reported-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/rp6wz4lnz5qn4zlh7uxtavzfrmqvycy2g42z4zasfss2gxi54f@zzcsjdvdflwp	2026-04-07 13:17:13 +09:00
Michael Paquier	17132f55c5	Fix shmem allocation of fixed-sized custom stats kind StatsShmemSize(), that computes the shmem size needed for pgstats, includes the amount of shared memory wanted by all the custom stats kinds registered. However, the shared memory allocation was done by ShmemAlloc() in StatsShmemInit(), meaning that the space reserved was not used, wasting some memory. These extra allocations would show up under "<anonymous>" in pg_shmem_allocations, as the allocations done by ShmemAlloc() are not tracked by ShmemIndexEnt. Issue introduced by `7949d95945`. Author: Heikki Linnakangas <hlinnaka@iki.fi> Discussion: https://postgr.es/m/04b04387-92f5-476c-90b0-4064e71c5f37@iki.fi Backpatch-through: 18	2026-04-07 11:59:49 +09:00
Amit Langote	5c54c3ed1b	Fix deferred FK check batching introduced by commit `b7b27eb41a` That commit introduced AfterTriggerIsActive() to detect whether we are inside the after-trigger firing machinery, so that RI trigger functions can take the batched fast path. It was implemented using query_depth >= 0, which correctly identified immediate trigger firing but missed the deferred case where query_depth is -1 at COMMIT via AfterTriggerFireDeferred(). This caused deferred FK checks to fall back to the per-row fast path instead of the batched path. The correct check is whether we are inside an after-trigger firing loop specifically. Introduce afterTriggerFiringDepth, a counter incremented around the trigger-firing loops in AfterTriggerEndQuery, AfterTriggerFireDeferred, and AfterTriggerSetState, and decremented after FireAfterTriggerBatchCallbacks() returns. AfterTriggerIsActive() now returns afterTriggerFiringDepth > 0. Reported-by: Chao Li <li.evan.chao@gmail.com> Author: Chao Li <li.evan.chao@gmail.com> Co-authored-by: Amit Langote <amitlangote09@gmail.com> Discussion: https://postgr.es/m/C2133B47-79CD-40FF-B088-02D20D654806@gmail.com	2026-04-07 10:45:59 +09:00
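The difference between the two checks in the entry above (5c54c3ed1b) can be modelled in a few lines of Python. This is a simplified sketch under stated assumptions: the class and method names are hypothetical, and only the two facts from the commit message are modelled (query_depth is -1 during deferred firing at COMMIT, and a separate counter is incremented around the firing loop).

```python
from contextlib import contextmanager


class AfterTriggerState:
    """Toy model of the after-trigger bookkeeping discussed in the commit."""

    def __init__(self):
        self.query_depth = -1   # -1: not inside any query (e.g. at COMMIT)
        self.firing_depth = 0   # stands in for afterTriggerFiringDepth

    def is_active_old(self):
        # Original check: true only while a query is in progress.
        return self.query_depth >= 0

    def is_active_new(self):
        # Fixed check: true whenever a trigger-firing loop is running.
        return self.firing_depth > 0

    @contextmanager
    def firing(self):
        # Increment around the firing loop and always decrement afterwards.
        self.firing_depth += 1
        try:
            yield
        finally:
            self.firing_depth -= 1


def fire_deferred(state):
    """Simulate deferred firing at COMMIT, where query_depth stays at -1."""
    with state.firing():
        return state.is_active_old(), state.is_active_new()


if __name__ == "__main__":
    print(fire_deferred(AfterTriggerState()))
```

In this model the old check reports "not active" during deferred firing while the counter-based check reports "active", which is the gap the commit describes.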
Michael Paquier	9897957805	Fix shared memory size of template code for custom fixed-sized pgstats On HEAD, the template code for custom fixed-sized pgstats is in the test module test_custom_stats. On REL_18_STABLE, this code lives in the test module injection_points. Both cases were underestimating the size of the shared memory area required for the storage of the stats data, using a single entry rather than the whole area. This underestimation meant that there was no memory allocated for the LWLock required for the stats, and even more. This problem would be also misleading for extension developers looking at this code. This issue has been noticed while digging into a different bug reported by Heikki Linnakangas, showing that the underestimation was causing failures in the TAP tests of the test modules for 32-bit builds. The other issue reported, related to the memory allocation of custom fixed-sized pgstats, will be fixed in a follow-up commit. Discussion: https://postgr.es/m/adMk_lWbnz3HDOA8@paquier.xyz Backpatch-through: 18	2026-04-07 08:24:32 +09:00
Melanie Plageman	dd78e69cfc	Allocate separate DSM chunk for parallel Index[Only]Scan instrumentation Previously, parallel index and index-only scans packed the parallel scan descriptor and shared instrumentation (for EXPLAIN ANALYZE) into a single DSM allocation. Since scans may be instrumented without being parallel-aware, and vice versa, using separate DSM chunks -- each with its own TOC key -- is cleaner. A future commit will extend this pattern to other scan node types. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Tomas Vondra <tomas@vondra.me> Discussion: https://postgr.es/m/flat/a177a6dd-240b-455a-8f25-aca0b1c08c6e%40vondra.me	2026-04-06 19:10:19 -04:00
Melanie Plageman	43222b8e53	Assert no duplicate keys in shm_toc_insert() shm_toc_insert() silently accepts duplicate keys. Since shm_toc_lookup() returns the first matching entry, any later entry with the same key would be unreachable. Add an assertion to catch this. Author: Melanie Plageman <melanieplageman@gmail.com> Discussion: https://postgr.es/m/flat/a177a6dd-240b-455a-8f25-aca0b1c08c6e%40vondra.me	2026-04-06 18:41:47 -04:00
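A small Python model shows why the assertion in the entry above (43222b8e53) is useful: with first-match lookup, a second entry under the same key can never be returned. The Toc class and its methods are hypothetical stand-ins for the shm_toc API, not a translation of it.

```python
class Toc:
    """Toy table of contents with first-match lookup."""

    def __init__(self):
        self.entries = []  # (key, value) pairs in insertion order

    def insert(self, key, value):
        # Catch duplicates at insertion time; without this check the
        # second entry would be silently unreachable via lookup().
        assert all(k != key for k, _ in self.entries), f"duplicate key {key}"
        self.entries.append((key, value))

    def lookup(self, key):
        # Returns the first matching entry, or None when the key is absent.
        for k, v in self.entries:
            if k == key:
                return v
        return None


if __name__ == "__main__":
    toc = Toc()
    toc.insert(1, "scan descriptor")
    toc.insert(2, "instrumentation")
    print(toc.lookup(2))
```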
Nathan Bossart	87f61f0c82	Add pg_stat_autovacuum_scores system view. This view contains one row for each table in the current database, showing the current autovacuum scores for that specific table. It also shows whether autovacuum would vacuum or analyze the table. Bumps catversion. Author: Sami Imseih <samimseih@gmail.com> Reviewed-by: Satyanarayana Narlapuram <satyanarlapuram@gmail.com> Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> Reviewed-by: Robert Treat <rob@xzilla.net> Discussion: https://postgr.es/m/CAA5RZ0s4xjMrB-VAnLccC7kY8d0-4806-Lsac-czJsdA1LXtAw%40mail.gmail.com	2026-04-06 16:56:33 -05:00
Daniel Gustafsson	b3a37ffbc5	Use PG_DATA_CHECKSUM_OFF instead of hardcoded value For a long time, the online checksums patchset kept the "off" state as literal zero without a label to be consistent with the previous coding which only had a label for the "on" state. Later, when an "off" label was made not all uses in the code got the memo. Fix by setting these to PG_DATA_CHECKSUM_OFF. While there, fix a duplicate word in a comment introduced by the same commit. Author: Aleksander Alekseev <aleksander@tigerdata.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://postgr.es/m/CAJ7c6TPRTnQFXXX1CRcYoTLXw2swtDH==uSz1MYoMKdLrKZHjA@mail.gmail.com	2026-04-06 22:11:53 +02:00
Álvaro Herrera	28d534e2ae	Add CONCURRENTLY option to REPACK When this flag is specified, REPACK no longer acquires access-exclusive lock while the new copy of the table is being created; instead, it creates the initial copy under share-update-exclusive lock only (same as vacuum, etc), and it follows an MVCC snapshot; it sets up a replication slot starting at that snapshot, and uses a concurrent background worker to do logical decoding starting at the snapshot to populate a stash of concurrent data changes. Those changes can then be re-applied to the new copy of the table just before swapping the relfilenodes. Applications can continue to access the original copy of the table normally until just before the swap, which is the only point at which the access-exclusive lock is needed. There are some loose ends in this commit: 1. concurrent repack needs its own replication slot in order to apply logical decoding, which are a scarce resource and easy to run out of. 2. due to the way the historic snapshot is initially set up, only one REPACK process can be running at any one time on the whole system. 3. there's a danger of deadlocking (and thus abort) due to the lock upgrade required at the final phase. These issues will be addressed in upcoming commits. The design and most of the code are by Antonin Houska, heavily based on his own pg_squeeze third-party implementation. 
Author: Antonin Houska <ah@cybertec.at> Co-authored-by: Mihail Nikalayeu <mihailnikalayeu@gmail.com> Co-authored-by: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com> Reviewed-by: Srinath Reddy Sadipiralla <srinath2133@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Jim Jones <jim.jones@uni-muenster.de> Reviewed-by: Robert Treat <rob@xzilla.net> Reviewed-by: Noriyoshi Shinoda <noriyoshi.shinoda@hpe.com> Reviewed-by: vignesh C <vignesh21@gmail.com> Discussion: https://postgr.es/m/5186.1706694913@antos Discussion: https://postgr.es/m/202507262156.sb455angijk6@alvherre.pgsql	2026-04-06 21:55:08 +02:00
Alexander Korotkov	10484c2cc7	Document that WAIT FOR may be interrupted by recovery conflicts Add a note to the WAIT FOR documentation explaining that sessions using this command on a standby server may be interrupted by recovery conflicts. Some conflicts are unavoidable - for example, replaying a tablespace drop terminates all backends unconditionally. Discussion: https://postgr.es/m/CAPpHfds7oSCbZqob7ytT_Lso8fv-NW8LnedUTE4Krde%2B3rkJeA%40mail.gmail.com Author: Alexander Korotkov <aekorotkov@gmail.com> Reviewed-by: Xuneng Zhou <xunengzhou@gmail.com>	2026-04-06 22:47:26 +03:00
Alexander Korotkov	7e8aeb9e48	Use WAIT FOR LSN in PostgreSQL::Test::Cluster::wait_for_catchup() When the standby is passed as a PostgreSQL::Test::Cluster instance, use the WAIT FOR LSN command on the standby server to implement wait_for_catchup() for replay, write, and flush modes. This is more efficient than polling pg_stat_replication on the upstream, as the WAIT FOR LSN command uses a latch-based wakeup mechanism. The optimization applies when: - The standby is passed as a Cluster object (not just a name string) - The mode is 'replay', 'write', or 'flush' (not 'sent') Rather than pre-checking pg_is_in_recovery() on the standby (which would add an extra round-trip on every call), we issue WAIT FOR LSN directly and handle the 'not in recovery' result as a signal to fall back to polling. For 'sent' mode, when the standby is passed as a string (e.g., a subscription name for logical replication), when the standby has been promoted, or when WAIT FOR LSN is interrupted by a recovery conflict, the function falls back to the original polling-based approach using pg_stat_replication on the upstream. The recovery conflict fallback is necessary because some conflicts are unavoidable - for example, ResolveRecoveryConflictWithTablespace() kills all backends unconditionally, regardless of what they are doing. The recovery conflict detection matches the English error message "conflict with recovery", which is reliable because the test suite runs with LC_MESSAGES=C. Discussion: https://postgr.es/m/CABPTF7UiArgW-sXj9CNwRzUhYOQrevLzkYcgBydmX5oDes1sjg%40mail.gmail.com Author: Xuneng Zhou <xunengzhou@gmail.com> Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Alvaro Herrera <alvherre@kurilemu.de>	2026-04-06 22:47:26 +03:00
Alexander Korotkov	834038c1f8	Avoid syscache lookup while building a WAIT FOR tuple descriptor Use TupleDescInitBuiltinEntry instead of TupleDescInitEntry when building the result tuple descriptor for the WAIT FOR command. This avoids a syscache access that could re-establish a catalog snapshot after we've explicitly released all snapshots before the wait. Discussion: https://postgr.es/m/CABPTF7U%2BSUnJX_woQYGe%3D%3DR9Oz%2B-V6X0VO2stBLPGfJmH_LEhw%40mail.gmail.com Author: Xuneng Zhou <xunengzhou@gmail.com> Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>	2026-04-06 22:47:26 +03:00
Nathan Bossart	775fe51daa	Remove recheck_relation_needs_vacanalyze(). This function is a thin wrapper around relation_needs_vacanalyze() that handles fetching and freeing the pgstat entry for the table. Since all callers of relation_needs_vacanalyze() do that anyway, we can teach that function to fetch/free the pgstat entry and use it instead. Suggested-by: Álvaro Herrera <alvherre@kurilemu.de> Author: Sami Imseih <samimseih@gmail.com> Co-authored-by: Nathan Bossart <nathandbossart@gmail.com> Discussion: https://postgr.es/m/CAA5RZ0s4xjMrB-VAnLccC7kY8d0-4806-Lsac-czJsdA1LXtAw%40mail.gmail.com	2026-04-06 14:30:52 -05:00
Robert Haas	e972dff6c3	auto_explain: Add new GUC, auto_explain.log_extension_options. The associated value should look like something that could be part of an EXPLAIN options list, but restricted to EXPLAIN options added by extensions. For example, if pg_overexplain is loaded, you could set auto_explain.log_extension_options = 'DEBUG, RANGE_TABLE'. You can also specify arguments to these options in the same manner as normal e.g. 'DEBUG 1, RANGE_TABLE false'. Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com> Reviewed-by: Lukas Fittl <lukas@fittl.com> Discussion: http://postgr.es/m/CA+Tgmob-0W8306mvrJX5Urtqt1AAasu8pi4yLrZ1XfwZU-Uj1w@mail.gmail.com	2026-04-06 15:19:42 -04:00
Tom Lane	d516974840	Support more object types within CREATE SCHEMA. Having rejected the principle that we should know how to re-order the sub-commands of CREATE SCHEMA, there is not really anything except a little coding to stop us from supporting more object types. This patch adds support for creating functions (including procedures and aggregates), operators, types (including domains), collations, and text search objects. SQL:2021 specifies that we should allow functions, procedures, types, domains, and collations, so this moves us a great deal closer to full SQL compatibility of CREATE SCHEMA. What remains missing from their list are casts, transforms, roles, and some object types we don't support yet (e.g. CREATE CHARACTER SET). Supporting casts or transforms would be problematic because they don't have names at all, let alone schema-qualified names, so it'd be quite a stretch to say that they belong to a schema. Roles likewise are not schema-qualified, plus they are global to a cluster, making it even less reasonable to consider them as belonging to a schema. So I don't see us trying to complete the list. User-defined aggregates and operators are outside the spec's ken, as are text search objects, so adding them does not do anything for spec compatibility. But they go along with these other object types, plus it takes no additional code to support them since they are represented as DefineStmts like some variants of CREATE TYPE. It would indeed take some effort to reject them. Author: Kirill Reshke <reshkekirill@gmail.com> Author: Jian He <jian.universality@gmail.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CALdSSPh4jUSDsWu3K58hjO60wnTRR0DuO4CKRcwa8EVuOSfXxg@mail.gmail.com	2026-04-06 15:16:25 -04:00
Tom Lane	404db8f9ed	Execute foreign key constraints in CREATE SCHEMA at the end. The previous patch simplified CREATE SCHEMA's behavior to "execute all subcommands in the order they are written". However, that's a bit too simple, as the spec clearly requires forward references in foreign key constraint clauses to work, see feature F311-01. (Most other SQL implementations seem to read more into the spec than that, but it's not clear that there's justification for more in the text, and this is the only case that doesn't introduce unresolvable issues.) We never implemented that before, but let's do so now. To fix it, transform FOREIGN KEY clauses into ALTER TABLE ... ADD FOREIGN KEY commands and append them to the end of the CREATE SCHEMA's subcommand list. This works because the foreign key constraints are independent and don't affect any other DDL that might be in CREATE SCHEMA. For simplicity, we do this for all FOREIGN KEY clauses even if they would have worked where they were. Author: Jian He <jian.universality@gmail.com> Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/1075425.1732993688@sss.pgh.pa.us	2026-04-06 15:16:25 -04:00
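The transformation described above can be sketched as a list rewrite. This is a simplified Python model, not the parser code: the dictionary layout and the function name `defer_foreign_keys` are assumptions made for illustration, and only CREATE TABLE subcommands are modelled.

```python
def defer_foreign_keys(subcommands):
    """Move every foreign key clause to the end of the subcommand list.

    subcommands: list of dicts such as
        {"table": "t", "columns": ["id int"],
         "foreign_keys": ["(a) REFERENCES u (id)"]}
    Returns a list of statement strings in execution order.
    """
    creates, alters = [], []
    for cmd in subcommands:
        columns = ", ".join(cmd["columns"])
        creates.append(f"CREATE TABLE {cmd['table']} ({columns})")
        # Every foreign key is deferred, even one that would have worked
        # in place, mirroring the "for simplicity" note in the message.
        for fk in cmd.get("foreign_keys", []):
            alters.append(f"ALTER TABLE {cmd['table']} ADD FOREIGN KEY {fk}")
    return creates + alters


statements = defer_foreign_keys([
    {"table": "orders", "columns": ["id int", "customer_id int"],
     "foreign_keys": ["(customer_id) REFERENCES customers (id)"]},
    {"table": "customers", "columns": ["id int PRIMARY KEY"]},
])
```

Here `orders` refers to `customers` before it is created, yet the resulting order is valid because the constraint is only added after both tables exist.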
Tom Lane	a9c350d9ee	Don't try to re-order the subcommands of CREATE SCHEMA. transformCreateSchemaStmtElements has always believed that it is supposed to re-order the subcommands of CREATE SCHEMA into a safe execution order. However, it is nowhere near being capable of doing that correctly. Nor is there reason to think that it ever will be, or that that is a well-defined requirement. (The SQL standard does say that it should be possible to do foreign-key forward references within CREATE SCHEMA, but it's not clear that the text requires anything more than that.) Moreover, the problem will get worse as we add more subcommand types. Let's just drop the whole idea and execute the commands in the order given, which seems like a much less astonishment-prone definition anyway. The foreign-key issue will be handled in a follow-up patch. This will result in a release-note-worthy incompatibility, which is that forward references like CREATE SCHEMA myschema CREATE VIEW myview AS SELECT * FROM mytable CREATE TABLE mytable (...); used to work and no longer will. Considering how many closely related variants never worked, this isn't much of a loss. Along the way, pass down a ParseState so that we can provide an error cursor for "wrong schema name" and related errors, and fix transformCreateSchemaStmtElements so that it doesn't scribble on the parsetree passed to it. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Jian He <jian.universality@gmail.com> Discussion: https://postgr.es/m/1075425.1732993688@sss.pgh.pa.us	2026-04-06 15:16:25 -04:00
Masahiko Sawada	1ff3180ca0	Allow autovacuum to use parallel vacuum workers. Previously, autovacuum always disabled parallel vacuum regardless of the table's index count or configuration. This commit enables autovacuum workers to use parallel index vacuuming and index cleanup, using the same parallel vacuum infrastructure as manual VACUUM. Two new configuration options control the feature. The GUC autovacuum_max_parallel_workers sets the maximum number of parallel workers a single autovacuum worker may launch; it defaults to 0, preserving existing behavior unless explicitly enabled. The per-table storage parameter autovacuum_parallel_workers provides per-table limits. A value of 0 disables parallel vacuum for the table, a positive value caps the worker count (still bounded by the GUC), and -1 (the default) defers to the GUC. To handle cases where autovacuum workers receive a SIGHUP and update their cost-based vacuum delay parameters mid-operation, a new propagation mechanism is added to vacuumparallel.c. The leader stores its effective cost parameters in a DSM segment. Parallel vacuum workers poll for changes in vacuum_delay_point(); if an update is detected, they apply the new values locally via VacuumUpdateCosts(). A new test module, src/test/modules/test_autovacuum, is added to verify that parallel autovacuum workers are correctly launched and that cost-parameter updates are propagated as expected. The patch was originally proposed by Maxim Orlov, but the implementation has undergone significant architectural changes since then during the review process. 
Author: Daniil Davydov <3danissimo@gmail.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Sami Imseih <samimseih@gmail.com> Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com> Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com> Reviewed-by: zengman <zengman@halodbtech.com> Discussion: https://postgr.es/m/CACG=ezZOrNsuLoETLD1gAswZMuH2nGGq7Ogcc0QOE5hhWaw=cw@mail.gmail.com	2026-04-06 11:48:29 -07:00
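How the two settings interact can be summarised in a few lines. The function below is a hypothetical Python model of the rules stated in the commit message, not the server's implementation; the number of workers actually launched may depend on further factors that the message does not spell out.

```python
def effective_parallel_limit(reloption: int, guc_max: int) -> int:
    """Upper bound on parallel workers for one table (modelled).

    reloption: value of the autovacuum_parallel_workers storage parameter
               (-1 = defer to the GUC, 0 = disabled, > 0 = per-table cap)
    guc_max:   value of autovacuum_max_parallel_workers (0 by default)
    """
    if reloption < -1 or guc_max < 0:
        raise ValueError("setting out of range")
    if reloption == -1:
        return guc_max              # default: defer to the GUC
    if reloption == 0:
        return 0                    # parallel vacuum disabled for this table
    return min(reloption, guc_max)  # per-table cap, still bounded by the GUC
```

With the default GUC value of 0 every combination yields 0, which matches the statement that existing behavior is preserved unless the feature is explicitly enabled.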
Álvaro Herrera	c0b53ec063	Rename cluster.c to repack.c (and corresponding .h) CLUSTER is no longer the favored way to invoke this functionality, and the code is about to shift its focus to the REPACK more ambitiously. Rename the file to avoid leaving an unnecessary historical artifact around. Author: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/202603271635.owyhm7btgoic@alvherre.pgsql	2026-04-06 20:11:01 +02:00
Tom Lane	21c69dc73f	Disallow system columns in COPY FROM WHERE conditions. These columns haven't been computed yet when the filtering happens (since we've not written the candidate tuple into the table); so any check on them is wrong or useless. Worse, since `aa606b931` such a reference results in an access off the end of a TupleDesc, potentially causing a phony "generated columns are not supported in COPY FROM WHERE conditions" error; and since `c98ad086a` it throws an Assert instead. Actually we could allow tableoid, which has been set to the OID of the table named as the COPY target. However, plausible uses for tests of tableoid would involve a partitioned target table, and the user would wish it to read as the OID of the destination partition. There has been some discussion of changing things to make it work like that, but pending that happening we should just disallow tableoid along with other system columns. It seems best though to install this prohibition only in HEAD. In the back branches we'll just guard the unsafe TupleDesc access, and people will keep getting whatever semantics they got before. Reported-by: Alexander Lakhin <exclusion@gmail.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/6f435023-8ab6-47c2-ba07-035d0c4212f9@gmail.com	2026-04-06 14:05:01 -04:00
Tom Lane	f7da81f68b	Add missing .gitignore files. contrib/pg_stash_advice and src/test/modules/test_shmem missed these, leading to complaints from git after an in-tree check-world run. Use our standard boilerplate list of ignorable subdirectories, although the two modules presently create different subsets of that.	2026-04-06 13:25:29 -04:00
Tom Lane	6582010c80	Fix null-bitmap combining in array_agg_array_combine(). This code missed the need to update the combined state's nullbitmap if state1 already had a bitmap but state2 didn't. We need to extend the existing bitmap with 1's but didn't. This could result in wrong output from a parallelized array_agg(anyarray) calculation, if the input has a mix of null and non-null elements. The errors depended on timing of the parallel workers, and therefore would vary from one run to another. Also install guards against integer overflow when calculating the combined object's sizes, and make some trivial cosmetic improvements. Author: Dmytro Astapov <dastapov@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAFQUnFj2pQ1HbGp69+w2fKqARSfGhAi9UOb+JjyExp7kx3gsqA@mail.gmail.com Backpatch-through: 16	2026-04-06 13:14:53 -04:00
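The missed case is easier to see in a reduced model. The sketch below is a hypothetical Python rendering, not the C code: a bitmap is a list with 1 for a non-null element and 0 for a null, and `None` stands for a state that has no bitmap because it has seen no nulls.

```python
def combine_null_bitmaps(n1, bitmap1, n2, bitmap2):
    """Return the null bitmap for the concatenation of two states.

    n1, n2:           element counts of state1 and state2
    bitmap1, bitmap2: list of 0/1 flags, or None if the state has no bitmap
    """
    if bitmap1 is None and bitmap2 is None:
        return None  # still no nulls anywhere, no bitmap needed
    # A missing bitmap means "all elements non-null", so it expands to 1's.
    # Forgetting the second expansion (state1 has a bitmap, state2 does not)
    # corresponds to the case the commit message says was missed.
    left = bitmap1 if bitmap1 is not None else [1] * n1
    right = bitmap2 if bitmap2 is not None else [1] * n2
    return left + right
```

In this model the combined bitmap always has one flag per element; a bitmap that is shorter than the element count is the kind of inconsistency that could produce the wrong output described above.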
Robert Haas	0442f1c9ef	Add a guc_check_handler to the EXPLAIN extension mechanism. It would be useful to be able to tell auto_explain to set a custom EXPLAIN option, but it would be bad if it tried to do so and the option name or value wasn't valid, because then every query would fail with a complaint about the EXPLAIN option. So add a guc_check_handler that auto_explain will be able to use to only try to set option name/value/type combinations that have been determined to be legal, and to emit useful messages about ones that aren't. Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com> Reviewed-by: Lukas Fittl <lukas@fittl.com> Discussion: http://postgr.es/m/CA+Tgmob-0W8306mvrJX5Urtqt1AAasu8pi4yLrZ1XfwZU-Uj1w@mail.gmail.com	2026-04-06 12:31:47 -04:00
Nathan Bossart	e3481edfd1	Remove autoanalyze corner case. The restructuring in commit `53b8ca6881` revealed an interesting corner case: if a table needs vacuuming for wraparound prevention and autovacuum is disabled for it, we might still choose to analyze it. Research seems to indicate this was an accidental addition by commit `48188e1621`, and further discussion indicates there is consensus that it is unnecessary and can be removed. Reviewed-by: Robert Treat <rob@xzilla.net> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Sami Imseih <samimseih@gmail.com> Reviewed-by: Shinya Kato <shinya11.kato@gmail.com> Discussion: https://postgr.es/m/adB9nSsm_S0D9708%40nathan	2026-04-06 11:28:46 -05:00
Robert Haas	e0e819cc08	Expose helper functions scan_quoted_identifier and scan_identifier. Previously, this logic was embedded within SplitIdentifierString, SplitDirectoriesString, and SplitGUCList. Factoring it out saves a bit of duplicated code, and also makes it available to extensions that might want to do similar things without necessarily wanting to do exactly the same thing. Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com> Reviewed-by: Lukas Fittl <lukas@fittl.com> Discussion: http://postgr.es/m/CA+Tgmob-0W8306mvrJX5Urtqt1AAasu8pi4yLrZ1XfwZU-Uj1w@mail.gmail.com	2026-04-06 11:13:25 -04:00
Fujii Masao	ca2b5443e2	Add TAP tests for log_lock_waits This commit updates 011_lock_stats.pl to verify log_lock_waits behavior. The tests check that messages are emitted both when a wait occurs and when the lock is acquired, and that the "still waiting for" message is logged exactly once per wait, even if the backend wakes up during the wait. The latter covers the behavior introduced by commit `fd6ecbfa75`. Author: Hüseyin Demir <huseyin.d3r@gmail.com> Co-authored-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/CAB5wL7YB1my9W5k5i=SY+=sTjeozyJ0YkvGXrVfeDNzuRkoTPg@mail.gmail.com	2026-04-06 23:49:40 +09:00
Fujii Masao	93dc1ace20	Release postmaster working memory context in slotsync worker Child processes do not need the postmaster's working memory context and normally release it at the start of their main entry point. However, the slotsync worker forgot to do so. This commit makes the slotsync worker release the postmaster's working memory context at startup, preventing unintended use. Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Tiancheng Ge <getiancheng_2012@163.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CAHGQGwHO05JaUpgKF8FBDmPdBUJsK22axRRcgmAUc2Jyi8OK8g@mail.gmail.com	2026-04-06 23:04:18 +09:00
Heikki Linnakangas	ed71d7356e	Fix memory leaks introduced by commit `283e823f9d` When freeing pending_shmem_requests we should also free the ->options. Author: Aleksander Alekseev <aleksander@tigerdata.com> Discussion: https://www.postgresql.org/message-id/CAJ7c6TN9tp8MTc0WXM0zfSWqjfBqU8gpe+o5KqHB1-cQ7409Kw@mail.gmail.com	2026-04-06 15:46:03 +03:00
Heikki Linnakangas	2670a0fcc6	Fix compilation without injection points with some compilers Some compilers didn't like the empty initializer when compiled without USE_INJECTION_POINTS. Per buildfarm member 'drongo', using Visual Studio 2019. Author: Michael Paquier <michael@paquier.xyz> Discussion: https://www.postgresql.org/message-id/adNHcBVJO5gIOp1l@paquier.xyz	2026-04-06 15:46:00 +03:00
Robert Haas	e8ec19aa32	Add pg_stash_advice contrib module. This module allows plan advice strings to be provided automatically from an in-memory advice stash. Advice stashes are stored in dynamic shared memory and must be recreated and repopulated after a server restart. If pg_stash_advice.stash_name is set to the name of an advice stash, and if query identifiers are enabled, the query identifier for each query will be looked up in the advice stash and the associated advice string, if any, will be used each time that query is planned. Reviewed-by: Lukas Fittl <lukas@fittl.com> Reviewed-by: Alexandra Wang <alexandra.wang.oss@gmail.com> Reviewed-by: David G. Johnston <david.g.johnston@gmail.com> Reviewed-by: Jakub Wartak <jakub.wartak@enterprisedb.com> Discussion: http://postgr.es/m/CA+TgmoaeNuHXQ60P3ZZqJLrSjP3L1KYokW9kPfGbWDyt+1t=Ng@mail.gmail.com	2026-04-06 07:41:28 -04:00
Michael Paquier	404a17c155	Use single LWLock for lock statistics in pgstats Previously, one LWLock was used for each lock type, adding complexity without an observable performance benefit as data is gathered only for paths involving lock waits, at least currently. This commit replaces the per-type set of LWLocks with a single LWLock protecting the stats data of all the lock types, like the stats kinds for SLRU or WAL. A good chunk of the callbacks get simpler thanks to this change. The previous approach also had one bug in the flush callback when nowait was called with "true": a backend iterating over all entries could successfully flush some entries while skipping others due to contention, then unconditionally reset the pending data. This would cause some stats data loss. Oversight in `4019f725f5`. Reported-by: Tomas Vondra <tomas@vondra.me> Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://postgr.es/m/1af63e6d-16d5-4d5b-9b03-11472ef1adf9@vondra.me	2026-04-06 14:01:04 +09:00
Michael Paquier	283c5fb22b	Improve more stability of worker_spi termination test Alexander Lakhin has noticed that it can be possible on machines with slow storage to have the spawned workers be stuck in initialize_worker_spi(), before they reach their main loop. Waiting for a flush to happen would block the interrupt attempts done by the database commands, causing the test to fail on timeout once the number of interrupt attempts is reached in CountOtherDBBackends(). This commit switches the test to wait for the spawned bgworkers to reach their main loops before attempting the database commands that would trigger the interrupts, napping for a time larger than the default, with worker_spi.naptime set at 10 minutes. Another thing that could be attempted is to enforce a larger number of tries in CountOtherDBBackends(), if what is done here is not enough. Let's see first if what this commit does is enough for the buildfarm members widowbird and jay. Analyzed-by: Alexander Lakhin <exclusion@gmail.com> Discussion: https://postgr.es/m/f913fba1-da59-404c-9eb3-07c7304be637@gmail.com	2026-04-06 13:23:28 +09:00
Fujii Masao	d78a4f0bf0	Simplify redundant current_database() subqueries in stats.sql regression test Previously the stats.sql regression test used conditions like "datname = (SELECT current_database())" to check the current database name. The subquery is unnecessary, so this commit simplifies these expressions to "datname = current_database()". Author: Chao Li <lic@highgo.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/A1535A8F-65AF-4C3D-ACBE-25891CB5D38B@gmail.com	2026-04-06 13:19:45 +09:00
Richard Guo	3a08a2a8b4	Fix volatile function evaluation in eager aggregation Pushing aggregates containing volatile functions below a join can violate volatility semantics by changing the number of times the function is executed. Here we check the Aggref nodes in the targetlist and havingQual for volatile functions and disable eager aggregation when such functions are present. Author: Richard Guo <guofenglinux@gmail.com> Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com> Discussion: https://postgr.es/m/CAMbWs48A53PY1Y4zoj7YhxPww9fO1hfnbdntKfA855zpXfVFRA@mail.gmail.com	2026-04-06 11:54:08 +09:00
Richard Guo	bd94845e8c	Fix collation handling for grouping keys in eager aggregation When determining if it is safe to use an expression as a grouping key for partial aggregation, eager aggregation relies on the B-tree equalimage support function to ensure that equality implies image equality. Previously, the code incorrectly passed the default collation of the expression's data type to the equalimage procedure, rather than the expression's actual collation. As a result, if a column used a non-deterministic collation but the base type's default collation was deterministic, eager aggregation would incorrectly assume that the column was safe for byte-level grouping. This could cause rows to be prematurely grouped and subsequently discarded by strict join conditions, resulting in incorrect query results. This patch fixes the issue by passing the expression's actual collation to the equalimage procedure. Author: Richard Guo <guofenglinux@gmail.com> Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com> Discussion: https://postgr.es/m/CAMbWs48A53PY1Y4zoj7YhxPww9fO1hfnbdntKfA855zpXfVFRA@mail.gmail.com	2026-04-06 11:52:33 +09:00
Fujii Masao	a8f45dee91	Add wal_sender_shutdown_timeout GUC to limit shutdown wait for replication Previously, during shutdown, walsenders always waited until all pending data was replicated to receivers. This ensures sender and receiver stay in sync after shutdown, which is important for physical replication switchovers, but it can significantly delay shutdown. For example, in logical replication, if apply workers are blocked on locks, walsenders may wait until those locks are released, preventing shutdown from completing for a long time. This commit introduces a new GUC, wal_sender_shutdown_timeout, which specifies the maximum time a walsender waits during shutdown for all pending data to be replicated. When set, shutdown completes once all data is replicated or the timeout expires. A value of -1 (the default) disables the timeout. This can reduce shutdown time when replication is slow or stalled. However, if the timeout is reached, the sender and receiver may be left out of sync, which can be problematic for physical replication switchovers. 
Author: Andrey Silitskiy <a.silitskiy@postgrespro.ru> Author: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Takamichi Osumi <osumi.takamichi@fujitsu.com> Reviewed-by: Peter Smith <smithpb2250@gmail.com> Reviewed-by: Greg Sabino Mullane <htamfids@gmail.com> Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com> Reviewed-by: Vitaly Davydov <v.davydov@postgrespro.ru> Reviewed-by: Ronan Dunklau <ronan@dunklau.fr> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Japin Li <japinli@hotmail.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/TYAPR01MB586668E50FC2447AD7F92491F5E89@TYAPR01MB5866.jpnprd01.prod.outlook.com	2026-04-06 11:35:03 +09:00
John Naylor	8194c4a9dd	Fix unportable use of __builtin_constant_p On MSVC Arm, USE_ARMV8_CRC32C is defined, but __builtin_constant_p is not available. Use pg_integer_constant_p and add appropriate guards. There is a similar potential hazard for the x86 path, but for now let's get the buildfarm green. Oversight in commit `fbc57f2bc`, per buildfarm member hoatzin.	2026-04-06 09:30:01 +07:00
Daniel Gustafsson	07009121c2	Test stabilization for online checksums Postcommit review and buildfarm/CI failures revealed a few issues in the test code which this commit attempts to resolve. These failures are verified using synthetic means. * Wait for launcher exit in enable/disable checksum tests When enabling or disabling data checksums in a test with waiting for an end state (on or off), the test typically wants to perform more tests against the cluster immediately. Make sure to wait for the launcher to exit in these cases before returning in order to know it can immediately be acted on. This is a more generic way of implementing `0036232ba8`. * Refactor injection point tests to use the injection_points test extension. Two injection points added for online checksums were better expressed using the injection_points extension with the test code embedded in datachecksum_state.c. * Make tests less timing dependent and allow transitions to "on" and not just "inprogress-on" in case a test manages to finish before it's checked for state. * When waiting on a blocking background psql keeping a temporary table open, the test first closed the background session and then the server. This could cause data checksums to manage to get enabled in the brief window between dropping the temporary table and closing the server. Fix by closing the server first before the background session. * Remove a few superfluous duplicate checks and general cleanup of comments as well as making LSN logging consistent. These issues were reported by Andres as well as spotted in the buildfarm and on CI. Author: Daniel Gustafsson <daniel@yesql.se> Reported-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/92F25C14-801E-4198-994D-D83E31FEB0D8@yesql.se	2026-04-06 02:03:10 +02:00
Daniel Gustafsson	d771b0a907	Handle checksumworker startup wait race If the background worker for processing databases manages to finish before the launcher starts waiting for it, the launcher would erroneously treat it as an error. Fix by making sure to check the result state in this case. Identified on CI and synthetically reproduced during local testing. Also, while at it, make sure to properly lock the shared memory structure before updating the result state. Author: Daniel Gustafsson <daniel@yesql.se> Reported-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/4fxw37ge47v5baeozla5phymi233hxbcjbwwsfwv3mpg3kyl2z@6jk4nkf6jp4	2026-04-06 01:55:06 +02:00
Michael Paquier	557a9f1e3e	Add tests for lock statistics, take two Commit `7c64d56fd9` has removed the isolation test providing coverage for lock statistics due to some instability in the CI, where the deadlock timeout may not have enough time to process, preventing the stats data from being updated. These also relied on a set of hardcoded sleeps. This commit switches the test suite to TAP, instead, that uses an injection point with a wait to avoid the sleeps. The injection point is added in ProcSleep(), once we know that the deadlock timeout has fired and that the stats have been updated. Multiple lock patterns are checked, all rely on the same workflow, with two sessions: - session 1 holds a given lock type. - session 2 attaches to the new injection point with the wait action. - session 2 attempts to acquire a lock conflicting with the lock of session 1, waiting for the injection point to be reached. - session 1 releases its lock, session 2 commits. - pg_stat_lock is polled until the counters are updated for the lock type. Bertrand's version of the patch introduced a new routine to BackgroundPsql() to detect the blocked background sessions. I have tweaked the test so that we use the same method as some of the other tests instead, based on some \echo commands. This test has been run multiple times in the CI, all passing, so I'd like to think that this is more stable than the first version attempted. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Co-authored-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/acNTR1lLHwQJ0o+P@ip-10-97-1-34.eu-west-3.compute.internal	2026-04-06 08:51:30 +09:00
Heikki Linnakangas	9b5acad3f4	Convert all remaining subsystems to use the new shmem allocation API This removes all remaining uses of ShmemInitStruct() and ShmemInitHash() from built-in code. Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://www.postgresql.org/message-id/CAExHW5vM1bneLYfg0wGeAa=52UiJ3z4vKd3AJ72X8Fw6k3KKrg@mail.gmail.com	2026-04-06 02:13:10 +03:00
Heikki Linnakangas	a4b6139dcc	Convert buffer manager to use the new shmem allocation functions This rectifies the initialization functions a little, making the "buffer strategy" stuff in freelist.c and buffer mapping hash table in buf_init.c top-level "subsystems" of their own, registered directly in subsystemlist.h. Previously they were called indirectly from BufferManagerShmemInit() and BufferManagerShmemSize(). Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://www.postgresql.org/message-id/CAExHW5vM1bneLYfg0wGeAa=52UiJ3z4vKd3AJ72X8Fw6k3KKrg@mail.gmail.com	2026-04-06 02:13:08 +03:00
Heikki Linnakangas	dacfe81a0d	Add alignment option to ShmemRequestStruct() The buffer blocks, converted to use ShmemRequestStruct() in the next commit, are IO-aligned. This might come in handy in other places too, so make it an explicit feature of ShmemRequestStruct(). Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://www.postgresql.org/message-id/CAExHW5vM1bneLYfg0wGeAa=52UiJ3z4vKd3AJ72X8Fw6k3KKrg@mail.gmail.com	2026-04-06 02:13:06 +03:00
Heikki Linnakangas	58a1573385	Convert AIO to use the new shmem allocation functions This replaces the "shmem_size" and "shmem_init" callbacks in the IO methods table with the same ShmemCallback struct that we now use in other subsystems. Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://www.postgresql.org/message-id/CAExHW5vM1bneLYfg0wGeAa=52UiJ3z4vKd3AJ72X8Fw6k3KKrg@mail.gmail.com	2026-04-06 02:13:04 +03:00
Heikki Linnakangas	2e0943a859	Convert SLRUs to use the new shmem allocation functions I replaced the old SimpleLruInit() function without a backwards compatibility wrapper, because few extensions define their own SLRUs. Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://www.postgresql.org/message-id/CAExHW5vM1bneLYfg0wGeAa=52UiJ3z4vKd3AJ72X8Fw6k3KKrg@mail.gmail.com	2026-04-06 02:13:02 +03:00
Heikki Linnakangas	4c9eca5afe	Refactor shmem initialization code in predicate.c This is in preparation to convert it to use the new shmem allocation functions, making the next commit that does that smaller. This inlines SerialInit() to the caller, and moves all the initialization steps within PredicateLockShmemInit() to happen after all the ShmemInit{Struct|Hash}() calls. Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://www.postgresql.org/message-id/CAExHW5vM1bneLYfg0wGeAa=52UiJ3z4vKd3AJ72X8Fw6k3KKrg@mail.gmail.com	2026-04-06 02:13:01 +03:00
Heikki Linnakangas	c6d55714ba	Use the new shmem allocation functions in a few core subsystems These subsystems have some complicating properties, making them slightly harder to convert than most: - The initialization callbacks of some of these subsystems have dependencies, i.e. they need to be initialized in the right order. - The ProcGlobal pointer still needs to be inherited by the BackendParameters mechanism on EXEC_BACKEND builds, because ProcGlobal is required by InitProcess() to get a PGPROC entry, and the PGPROC entry is required to use LWLocks, and usually attaching to shared memory areas requires the use of LWLocks. - Similarly, ProcSignal pointer still needs to be handled by BackendParameters, because query cancellation connections access it without calling InitProcess. Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://www.postgresql.org/message-id/CAExHW5vM1bneLYfg0wGeAa=52UiJ3z4vKd3AJ72X8Fw6k3KKrg@mail.gmail.com	2026-04-06 02:12:59 +03:00
Heikki Linnakangas	a006bc7b16	Convert lwlock.c to use the new shmem allocation functions It seems like a good candidate to convert first because it needs to be initialized before any other subsystem, but other than that it's nothing special. Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://www.postgresql.org/message-id/CAExHW5vM1bneLYfg0wGeAa=52UiJ3z4vKd3AJ72X8Fw6k3KKrg@mail.gmail.com	2026-04-06 02:12:57 +03:00
Heikki Linnakangas	1fc2e9fbc0	Introduce a registry of built-in shmem subsystems To add a new built-in subsystem, add it to subsystemslist.h. That hooks up its shmem callbacks so that they get called at the right times during postmaster startup. For now this is unused, but will replace the current SubsystemShmemSize() and SubsystemShmemInit() calls in the next commits. Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://www.postgresql.org/message-id/CAExHW5vM1bneLYfg0wGeAa=52UiJ3z4vKd3AJ72X8Fw6k3KKrg@mail.gmail.com	2026-04-06 02:12:55 +03:00
Heikki Linnakangas	d4885af3d6	Convert pg_stat_statements to use the new shmem allocation functions As part of this, embed the LWLock it needs in the shared memory struct itself, so that we don't need to use RequestNamedLWLockTranche() anymore. LWLockNewTrancheId() + LWLockInitialize() is more convenient to use in extensions. Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://www.postgresql.org/message-id/CAExHW5vM1bneLYfg0wGeAa=52UiJ3z4vKd3AJ72X8Fw6k3KKrg@mail.gmail.com	2026-04-06 02:12:53 +03:00
Heikki Linnakangas	6409994c7d	Add a test module to test after-startup shmem allocations The old ShmemInit{Struct/Hash}() functions could be used after postmaster startup, as long as the allocation is small enough to fit in spare shmem reserved at startup. I believe some extensions do that, although we hadn't really documented it and had no coverage for it. The new test module covers that after-startup usage with the new ShmemRequestStruct() functions. Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://www.postgresql.org/message-id/CAExHW5vM1bneLYfg0wGeAa=52UiJ3z4vKd3AJ72X8Fw6k3KKrg@mail.gmail.com	2026-04-06 02:12:51 +03:00
Heikki Linnakangas	283e823f9d	Introduce a new mechanism for registering shared memory areas This replaces the [Subsystem]ShmemSize() and [Subsystem]ShmemInit() functions called at postmaster startup with a new set of callbacks. The new mechanism is designed to be more ergonomic. Notably, the size of each shmem area is specified in the same ShmemRequestStruct() call, together with its name. The same mechanism is used in extensions, replacing the shmem_{request/startup}_hooks. ShmemInitStruct() and ShmemInitHash() become backwards-compatibility wrappers around the new functions. In future commits, I will replace all ShmemInitStruct() and ShmemInitHash() calls with the new functions, although we'll still need to keep them around for extensions. Co-authored-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com> Reviewed-by: Zsolt Parragi <zsolt.parragi@percona.com> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://www.postgresql.org/message-id/CAExHW5vM1bneLYfg0wGeAa=52UiJ3z4vKd3AJ72X8Fw6k3KKrg@mail.gmail.com	2026-04-06 02:12:50 +03:00
Heikki Linnakangas	6ef9bee293	Move some code from shmem.c and shmem.h A little refactoring in preparation for the next commit, to make the material changes in that commit more clear. Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com> Discussion: https://www.postgresql.org/message-id/CAExHW5vM1bneLYfg0wGeAa=52UiJ3z4vKd3AJ72X8Fw6k3KKrg@mail.gmail.com	2026-04-06 02:12:48 +03:00
Andres Freund	5a79e78501	instrumentation: Separate per-node logic from other uses Previously, different places (e.g. query "total time") were repurposing the Instrumentation struct initially introduced for capturing per-node statistics during execution. This overuse of the same struct is confusing, e.g. by cluttering calls of InstrStartNode/InstrStopNode in unrelated code paths, and prevents future refactorings. Instead, simplify the Instrumentation struct to only track time and WAL/buffer usage. Similarly, drop the use of InstrEndLoop outside of per-node instrumentation - these calls were added without any apparent benefit since the relevant fields were never read. Introduce the NodeInstrumentation struct to carry forward the per-node instrumentation information. WorkerInstrumentation is renamed to WorkerNodeInstrumentation for clarity. In passing, clarify that InstrAggNode is expected to only run after InstrEndLoop (as it does in practice), and drop unused code. This also fixes a consequence-less bug: Previously ->async_mode was only set when a non-zero instrument_option was passed. That turns out to be harmless right now, as ->async_mode only affects a timing related field. Author: Lukas Fittl <lukas@fittl.com> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/CAP53PkzdBK8VJ1fS4AZ481LgMN8f9mJiC39ZRHqkFUSYq6KWmg@mail.gmail.com	2026-04-05 19:04:24 -04:00
Andres Freund	7d9b74df53	instrumentation: Separate trigger logic from other uses Introduce TriggerInstrumentation to capture trigger timing and firings (previously counted in "ntuples"), to aid a future refactoring that splits out all Instrumentation fields beyond timing and WAL/buffers into more specific structs. In passing, drop the "n" argument to InstrAlloc, as all remaining callers need exactly one Instrumentation struct. The duplication between InstrAlloc() and InstrInit(), as well as the conditional initialization of async_mode will be addressed in a subsequent commit. Author: Lukas Fittl <lukas@fittl.com> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://www.postgresql.org/message-id/flat/CAP53PkzdBK8VJ1fS4AZ481LgMN8f9mJiC39ZRHqkFUSYq6KWmg@mail.gmail.com	2026-04-05 16:56:50 -04:00
Andres Freund	6c7bce28c8	Fixups for `a4f774cf1c` The database name was warned about when building with -DENFORCE_REGRESSION_TEST_NAME_RESTRICTIONS, leading to BF and CI failures. It is somewhat confusing that the required prefix is different for databases than other object types. Also fix a pgindent violation that caused koel to start to fail. Discussion: https://postgr.es/m/ptyiexyhmtxf4lm524s7o7w64r26ra237uusv4tjav4yhpmeoo@vfwwllz7tivb	2026-04-05 15:36:34 -04:00
Andres Freund	df6949ccf7	Add tid_block() and tid_offset() accessor functions The two new functions allow extracting the block number and offset from a tid. There are existing ways to do so (e.g. by doing (ctid::text::point)[0]), but they are hard to remember and not pretty. tid_block() returns int8 (bigint) because BlockNumber is uint32, which exceeds the range of int4. tid_offset() returns int4 (integer) because OffsetNumber is uint16, which fits safely in int4. Bumps catversion. Author: Ayush Tiwari <ayushtiwari.slg01@gmail.com> Discussion: https://postgr.es/m/CAJTYsWUzok2+mvSYkbVUwq_SWWg-GdHqCuYumN82AU97SjwjCA@mail.gmail.com	2026-04-05 15:17:05 -04:00
Heikki Linnakangas	f10b6be258	Check that the tranche name is unique in RequestNamedLWLockTranche You could request two tranches with same name, but things would get confusing when you called GetNamedLWLockTranche() to get the LWLocks allocated for them; it would always return the first tranche with the name. That doesn't make sense, so forbid duplicates. We still allow duplicates with LWLockNewTrancheId(). That works better as you don't use the name to look up the tranche ID later. It's still confusing in wait events, for example, but it's not dangerous in the same way. Reviewed-by: Sami Imseih <samimseih@gmail.com> Discussion: https://www.postgresql.org/message-id/463a28db-0c0b-4af6-bac6-3891828bbbfe@iki.fi	2026-04-05 21:05:20 +03:00
Heikki Linnakangas	92a685e407	Improve test_lwlock_tranches While working on refactoring how shmem is allocated, I made a mistake where the main LWLock array did not reserve space for the LWLocks allocated with RequestNamedLWLockTranche(), and the test still passed. Matthias van de Meent spotted that before it got committed, but in order to catch such mistakes in the future, add checks in test_lwlock_tranches that the locks allocated with RequestNamedLWLockTranche() can be acquired and released. Another change is to stop requesting multiple tranches with the same name with RequestNamedLWLockTranche(). As soon as I started to test using the locks I realized that's bogus, and the next commit will forbid it. Keep test coverage for duplicates requested with LWLockNewTrancheId() for now, but make it more clear that that's what the test does. Reviewed-by: Sami Imseih <samimseih@gmail.com> Discussion: https://www.postgresql.org/message-id/463a28db-0c0b-4af6-bac6-3891828bbbfe@iki.fi Discussion: https://www.postgresql.org/message-id/CAEze2WjgCROMMXY0+j8FFdm3iFcr7By-+6Mwiz=PgGSEydiW3A@mail.gmail.com	2026-04-05 21:05:15 +03:00
Andrew Dunstan	a4f774cf1c	Add pg_get_database_ddl() function Add a new SQL-callable function that returns the DDL statements needed to recreate a database. It takes a regdatabase argument and an optional VARIADIC text argument for options that are specified as alternating name/value pairs. The following options are supported: pretty (boolean) for formatted output, owner (boolean) to include OWNER and tablespace (boolean) to include TABLESPACE. The return is one or multiple rows where the first row is a CREATE DATABASE statement and subsequent rows are ALTER DATABASE statements to set some database properties. The caller must have CONNECT privilege on the target database. Author: Akshay Joshi <akshay.joshi@enterprisedb.com> Co-authored-by: Andrew Dunstan <andrew@dunslane.net> Co-authored-by: Euler Taveira <euler@eulerto.com> Reviewed-by: Japin Li <japinli@hotmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Quan Zongliang <quanzongliang@yeah.net> Discussion: https://postgr.es/m/CANxoLDc6FHBYJvcgOnZyS+jF0NUo3Lq_83-rttBuJgs9id_UDg@mail.gmail.com Discussion: https://postgr.es/m/e247c261-e3fb-4810-81e0-a65893170e94@dunslane.net	2026-04-05 10:54:54 -04:00
Andrew Dunstan	b99fd9fd7f	Add pg_get_tablespace_ddl() function Add a new SQL-callable function that returns the DDL statements needed to recreate a tablespace. It takes a tablespace name or OID and an optional VARIADIC text argument for options that are specified as alternating name/value pairs. The following options are supported: pretty (boolean) for formatted output and owner (boolean) to include OWNER. (It includes two variants because there is no regtablespace pseudotype.) The return is one or multiple rows where the first row is a CREATE TABLESPACE statement and subsequent rows are ALTER TABLESPACE statements to set some tablespace properties. The caller must have SELECT privilege on pg_tablespace. get_reloptions() in ruleutils.c is made non-static so it can be called from the new ddlutils.c file. Author: Nishant Sharma <nishant.sharma@enterprisedb.com> Author: Manni Wood <manni.wood@enterprisedb.com> Co-authored-by: Andrew Dunstan <andrew@dunslane.net> Co-authored-by: Euler Taveira <euler@eulerto.com> Reviewed-by: Jim Jones <jim.jones@uni-muenster.de> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CAKWEB6rmnmGKUA87Zmq-s=b3Scsnj02C0kObQjnbL2ajfPWGEw@mail.gmail.com Discussion: https://postgr.es/m/e247c261-e3fb-4810-81e0-a65893170e94@dunslane.net	2026-04-05 10:54:54 -04:00
Andrew Dunstan	76e514ebb4	Add pg_get_role_ddl() function Add a new SQL-callable function that returns the DDL statements needed to recreate a role. It takes a regrole argument and an optional VARIADIC text argument for options that are specified as alternating name/value pairs. The following options are supported: pretty (boolean) for formatted output and memberships (boolean) to include GRANT statements for role memberships and membership options. The return is one or multiple rows where the first row is a CREATE ROLE statement and subsequent rows are ALTER ROLE statements to set some role properties. Password information is never included in the output. The caller must have SELECT privilege on pg_authid. Author: Mario Gonzalez <gonzalemario@gmail.com> Author: Bryan Green <dbryan.green@gmail.com> Co-authored-by: Andrew Dunstan <andrew@dunslane.net> Co-authored-by: Euler Taveira <euler@eulerto.com> Reviewed-by: Japin Li <japinli@hotmail.com> Reviewed-by: Quan Zongliang <quanzongliang@yeah.net> Reviewed-by: jian he <jian.universality@gmail.com> Discussion: https://postgr.es/m/4c5f895e-3281-48f8-b943-9228b7da6471@gmail.com Discussion: https://postgr.es/m/e247c261-e3fb-4810-81e0-a65893170e94@dunslane.net	2026-04-05 10:54:54 -04:00
Andrew Dunstan	4881981f92	Add infrastructure for pg_get_*_ddl functions Add parse_ddl_options(), append_ddl_option(), and append_guc_value() helper functions in a new ddlutils.c file that provide common option parsing and output formatting for the pg_get_*_ddl family of functions which will follow in later patches. These accept VARIADIC text arguments as alternating name/value pairs. Callers declare an array of DdlOption descriptors specifying the accepted option names and their types (boolean, text, or integer). parse_ddl_options() matches each supplied pair against the array, validates the value, and fills in the result fields. This descriptor-based scheme is based on an idea from Euler Taveira. This is placed in a new ddlutils.c file which will contain the pg_get_*_ddl functions. Author: Akshay Joshi <akshay.joshi@enterprisedb.com> Co-authored-by: Andrew Dunstan <andrew@dunslane.net> Co-authored-by: Euler Taveira <euler@eulerto.com> Discussion: https://postgr.es/m/CAKWEB6rmnmGKUA87Zmq-s=b3Scsnj02C0kObQjnbL2ajfPWGEw@mail.gmail.com Discussion: https://postgr.es/m/4c5f895e-3281-48f8-b943-9228b7da6471@gmail.com Discussion: https://postgr.es/m/CANxoLDc6FHBYJvcgOnZyS+jF0NUo3Lq_83-rttBuJgs9id_UDg@mail.gmail.com Discussion: https://postgr.es/m/e247c261-e3fb-4810-81e0-a65893170e94@dunslane.net	2026-04-05 10:54:54 -04:00
Álvaro Herrera	caec9d9fad	Allow index_create to suppress index_build progress reporting A future REPACK patch wants a way to suppress index_build doing its progress reports when building an index, because that would interfere with repack's own reporting; so add an INDEX_CREATE_SUPPRESS_PROGRESS bit that enables this. Furthermore, change the index_create_copy() API so that it takes flag bits for index_create() and passes them unchanged. This gives its callers more direct control, which eases the interface -- now its callers can pass the INDEX_CREATE_SUPPRESS_PROGRESS bit directly. We use it for the current caller in REINDEX CONCURRENTLY, since it's also not interested in progress reporting, since it doesn't want index_build() to be called at all in the first place. One thing to keep in mind, pointed out by Mihail, is that we're not suppressing the index-AM-specific progress report updates which happen during ambuild(). At present this is not a problem, because the values updated by those don't overlap with those used by commands other than CREATE INDEX; but maybe in the future we'll want the ability to suppress them also. (Alternatively we might want to display how each index-build-subcommand progresses during REPACK and others.) Author: Antonin Houska <ah@cybertec.at> Author: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Mihail Nikalayeu <mihailnikalayeu@gmail.com> Discussion: https://postgr.es/m/102906.1773668762@localhost	2026-04-05 13:34:08 +02:00
Etsuro Fujita	de28140ded	postgres_fdw: Inherit the local transaction's access/deferrable modes. READ ONLY transactions should prevent modifications to foreign data as well as local data, but postgres_fdw transactions declared as READ ONLY that reference foreign tables mapped to a remote view executing volatile functions would modify data on remote servers, as it would open remote transactions in READ WRITE mode. Similarly, DEFERRABLE transactions should not abort due to a serialization failure even when accessing foreign data, but postgres_fdw transactions declared as DEFERRABLE would abort due to that failure in a remote server, as it would open remote transactions in NOT DEFERRABLE mode. To fix, modify postgres_fdw to open remote transactions in the same access/deferrable modes as the local transaction. This commit also modifies it to open remote subtransactions in the same access mode as the local subtransaction. This commit changes the behavior of READ ONLY/DEFERRABLE transactions using postgres_fdw; in particular, it doesn't allow the READ ONLY transactions to modify data on remote servers anymore, so such transactions should be redeclared as READ WRITE or rewritten using other tools like dblink. The release notes should note this as an incompatibility. These issues exist since the introduction of postgres_fdw, but to avoid the incompatibility in the back branches, fix them in master only. Author: Etsuro Fujita <etsuro.fujita@gmail.com> Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/CAPmGK16n_hcUUWuOdmeUS%2Bw4Q6dZvTEDHb%3DOP%3D5JBzo-M3QmpQ%40mail.gmail.com Discussion: https://postgr.es/m/E1uLe9X-000zsY-2g%40gemulon.postgresql.org	2026-04-05 18:55:00 +09:00
Thomas Munro	fc44f10665	aio: Simplify pgaio_worker_submit(). Merge pgaio_worker_submit_internal() and pgaio_worker_submit(). The separation didn't serve any purpose. Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com> Discussion: https://postgr.es/m/CA%2BhUKG%2Bm4xV0LMoH2c%3DoRAdEXuCnh%2BtGBTWa7uFeFMGgTLAw%2BQ%40mail.gmail.com	2026-04-05 18:07:21 +12:00
Andres Freund	f63ca33790	read_stream: Only increase read-ahead distance when waiting for IO This avoids increasing the distance to the maximum in cases where the I/O subsystem is already keeping up. This turns out to be important for performance for two reasons: - Pinning a lot of buffers is not cheap. If additional pins allow us to avoid IO waits, it's definitely worth it, but if we can already do all the necessary readahead at a distance of 16, reading ahead 512 buffers can increase the CPU overhead substantially. This is particularly noticeable when the to-be-read blocks are already in the kernel page cache. - If the read stream is read to completion, reading in data earlier than needed is of limited consequences, leaving aside the CPU costs mentioned above. But if the read stream will not be fully consumed, e.g. because it is on the inner side of a nested loop join, the additional IO can be a serious performance issue. This is not that commonly a problem for current read stream users, but the upcoming work, to use a read stream to fetch table pages as part of an index scan, frequently encounters this. Note that this commit would have substantial performance downsides without earlier commits: - Commit `6e36930f9a`, which avoids decreasing the readahead distance when there was recent IO, is crucial, as otherwise we very often would end up not reading ahead aggressively enough anymore with this commit, due to increasing the distance less often. - "read stream: Split decision about look ahead for AIO and combining" is important as we would otherwise not perform IO combining when the IO subsystem can keep up. - "aio: io_uring: Trigger async processing for large IOs" is important to continue to benefit from memory copy parallelism when using fewer IOs. 
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com> Tested-by: Tomas Vondra <tomas@vondra.me> Discussion: https://postgr.es/m/f3xxfrkafjxpyqxywcxricxgyizjirfceychyxsgn7bwjp5eda@kwbduhy7tfmu Discussion: https://postgr.es/m/CA+hUKGL2PhFyDoqrHefqasOnaXhSg48t1phs3VM8BAdrZqKZkw@mail.gmail.com	2026-04-05 00:43:54 -04:00
Andres Freund	8ca147d582	read stream: Split decision about look ahead for AIO and combining In a subsequent commit the read-ahead distance will only be increased when waiting for IO. Without further work that would cause a regression: As IO combining and read-ahead are currently controlled by the same mechanism, we would end up not allowing IO combining when never needing to wait for IO (as the distance ends up too small to allow for full sized IOs), which can increase CPU overhead. A typical reason to not have to wait for IO completion at a low look-ahead distance is use of io_uring with the to-be-read data in the page cache. But even with worker the IO submission rate may be low enough for the worker to keep up. One might think that we could just always perform IO combining, but doing so at the start of a scan can cause performance regressions: 1) Performing a large IO commonly has a higher latency than smaller IOs. That is not a problem once reading ahead far enough, but at the start of a stream it can lead to longer waits for IO completion. 2) Sometimes read streams will not be read to completion. Immediately starting with full sized IOs leads to more wasted effort. This is not commonly an issue with existing read stream users, but the upcoming use of read streams to fetch table pages as part of an index scan frequently encounters this. Solve this issue by splitting ReadStream->distance into ->combine_distance and ->readahead_distance. Right now they are increased/decreased at the same time, but that will change in the next commit. One of the comments in read_stream_should_look_ahead() refers to a motivation that only really exists as of the next commit, but without it the code doesn't make sense on its own. 
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com> Discussion: https://postgr.es/m/f3xxfrkafjxpyqxywcxricxgyizjirfceychyxsgn7bwjp5eda@kwbduhy7tfmu Discussion: https://postgr.es/m/CA+hUKGL2PhFyDoqrHefqasOnaXhSg48t1phs3VM8BAdrZqKZkw@mail.gmail.com	2026-04-05 00:43:54 -04:00
Andres Freund	434dab76ba	read_stream: Move logic about IO combining & issuing to helpers The long if statements were hard to read and hard to document. Splitting them into inline helpers makes it much easier to explain each part separately. This is done in preparation for making the logic more complicated... Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com> Discussion: https://postgr.es/m/f3xxfrkafjxpyqxywcxricxgyizjirfceychyxsgn7bwjp5eda@kwbduhy7tfmu	2026-04-05 00:43:54 -04:00
Andres Freund	a9ee668817	aio: io_uring: Trigger async processing for large IOs io_method=io_uring has a heuristic to trigger asynchronous processing of IOs once the IO depth is a bit larger. That heuristic is important when doing buffered IO from the kernel page cache, to allow parallelizing of the memory copy, as otherwise io_method=io_uring would be a lot slower than io_method=worker in that case. An upcoming commit will make read_stream.c only increase the read-ahead distance if we needed to wait for IO to complete. If to-be-read data is in the kernel page cache, io_uring will synchronously execute IO, unless the IO is flagged as async. Therefore the aforementioned change in read_stream.c heuristic would lead to a substantial performance regression with io_uring when data is in the page cache, as we would never reach a deep enough queue to actually trigger the existing heuristic. Parallelizing the copy from the page cache is mainly important when doing a lot of IO, which commonly is only possible when doing largely sequential IO. The reason we don't just mark all io_uring IOs as asynchronous is that the dispatch to a kernel thread has overhead. This overhead is mostly noticeable with small random IOs with a low queue depth, as in that case the gain from parallelizing the memory copy is small and the latency cost high. The facts from the two prior paragraphs show a way out: Use the size of the IO in addition to the depth of the queue to trigger asynchronous processing. One might think that just using the IO size might be enough, but experimentation has shown that not to be the case - with deep look-ahead distances being able to parallelize the memory copy is important even with smaller IOs. 
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com> Discussion: https://postgr.es/m/f3xxfrkafjxpyqxywcxricxgyizjirfceychyxsgn7bwjp5eda@kwbduhy7tfmu Discussion: https://postgr.es/m/CA+hUKGL2PhFyDoqrHefqasOnaXhSg48t1phs3VM8BAdrZqKZkw@mail.gmail.com	2026-04-05 00:43:54 -04:00
John Naylor	2849fe4c97	Fix unused function warning on Arm platforms Guard the definition of pg_pmull_available() on compile-time availability of PMULL. Oversight in `fbc57f2bc`. In passing, remove "inline" hint for consistency. Reported-by: Tomas Vondra <tomas@vondra.me> Discussion: https://postgr.es/m/f153d5a4-a9be-4211-b0b2-7e99b56d68d5@vondra.me	2026-04-05 08:49:47 +07:00
Álvaro Herrera	69c11f0545	Modernize struct declarations in snapbuild.h Just a cosmetic cleanup.	2026-04-05 00:21:53 +02:00
Álvaro Herrera	33bf7318f9	Make index_concurrently_create_copy more general Also rename it to index_create_copy. Add a 'boolean concurrent' option, and make it work for both cases: in concurrent mode, just create the catalog entries; caller is responsible for the actual building later. In non-concurrent mode, the index is built right away. This allows it to be reused for other purposes -- specifically, for concurrent REPACK. (With the CONCURRENTLY option, REPACK cannot simply swap the heap file and rebuild its indexes. Instead, it needs to build a separate set of indexes, including their system catalog entries, before the actual swap, to reduce the time AccessExclusiveLock needs to be held for. This approach is different from what CREATE INDEX CONCURRENTLY does.) Per a suggestion from Mihail Nikalayeu. Author: Antonin Houska <ah@cybertec.at> Reviewed-by: Mihail Nikalayeu <mihailnikalayeu@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/41104.1754922120@localhost	2026-04-04 20:38:26 +02:00
Peter Geoghegan	2d3490dd99	heapam: Keep buffer pins across index scan resets. Avoid dropping the heap page pin (xs_cbuf) and visibility map pin (xs_vmbuffer) within heapam_index_fetch_reset. Retaining these pins saves cycles during certain nested loop joins and merge joins that frequently restore a saved mark: cases where the next tuple fetched after a reset often falls on the same heap page will now avoid the cost of repeated pinning and unpinning. Avoiding dropping the scan's heap page buffer pin is preparation for an upcoming patch that will add I/O prefetching to index scans. Testing of that patch (which makes heapam tend to pin more buffers concurrently than was typical before now) shows that the aforementioned cases get a small but clearly measurable benefit from this optimization. Upcoming work to add a slot-based table AM interface for index scans (which is further preparation for prefetching) will move VM checks for index-only scans out of the executor and into heapam. That will expand the role of xs_vmbuffer to include VM lookups for index-only scans (the field won't just be used for setting pages all-visible during on-access pruning via the enhancement recently introduced by commit `b46e1e54`). Avoiding dropping the xs_vmbuffer pin will preserve the historical behavior of nodeIndexonlyscan.c, which always kept this pin on a rescan; that aspect of this commit isn't really new. Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/CAH2-Wz=g=JTSyDB4UtB5su2ZcvsS7VbP+ZMvvaG6ABoCb+s8Lw@mail.gmail.com	2026-04-04 13:49:37 -04:00
Heikki Linnakangas	fda5300132	Remove unnecessary #include "spin.h" from shmem.h Commit `6b8238cb6a` removed the last usage of slock_t from the file. proc.c was relying on the indirect #include, so add it to proc.c directly.	2026-04-04 20:22:04 +03:00
Peter Geoghegan	c7d09595e4	heapam: Track heap block in IndexFetchHeapData. Add an explicit BlockNumber field (xs_blk) to IndexFetchHeapData that tracks which heap block is currently pinned in xs_cbuf. heapam_index_fetch_tuple now uses xs_blk to determine when buffer switching is needed, replacing the previous approach that compared buffer identities via ReleaseAndReadBuffer on every non-HOT-chain call. This is preparatory work for an upcoming commit that will add index prefetching using a read stream. Delegating the release of a currently pinned buffer to ReleaseAndReadBuffer won't work anymore -- at least not when the next buffer that the scan needs to pin is one returned by read_stream_next_buffer (not a buffer returned by ReadBuffer). Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/CAH2-Wz=g=JTSyDB4UtB5su2ZcvsS7VbP+ZMvvaG6ABoCb+s8Lw@mail.gmail.com	2026-04-04 11:45:33 -04:00
Peter Geoghegan	a29fdd6c8d	Move heapam_handler.c index scan code to new file. Move the heapam index fetch callbacks (index_fetch_begin, index_fetch_reset, index_fetch_end, and index_fetch_tuple) into a new dedicated file. Also move heap_hot_search_buffer over. This is a purely mechanical move with no functional impact. Upcoming work to add a slot-based table AM interface for index scans will substantially expand this code. Keeping it in heapam_handler.c would clutter a file whose primary role is to wire up the TableAmRoutine callbacks. Bitmap heap scans and sequential scans would benefit from similar separation in the future. Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/bmbrkiyjxoal6o5xadzv5bveoynrt3x37wqch7w3jnwumkq2yo@b4zmtnrfs4mh	2026-04-04 11:30:41 -04:00
Peter Geoghegan	1adff1a0c5	Rename heapam_index_fetch_tuple argument for clarity. Rename heapam_index_fetch_tuple's call_again argument to heap_continue, for consistency with the pointed-to variable name (IndexScanDescData's xs_heap_continue field). Preparation for an upcoming commit that will move index scan related heapam functions into their own file. Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/bmbrkiyjxoal6o5xadzv5bveoynrt3x37wqch7w3jnwumkq2yo@b4zmtnrfs4mh	2026-04-04 11:30:05 -04:00
John Naylor	519acd1be5	Fix indentation Per buildfarm member koel	2026-04-04 21:50:54 +07:00
John Naylor	fbc57f2bc2	Compute CRC32C on ARM using the Crypto Extension where available In similar vein to commit `3c6e8c123`, the ARMv8 cryptography extension has 64x64 -> 128-bit carryless multiplication instructions suitable for computing CRC. This was tested to be around twice as fast as scalar CRC instructions for longer inputs. We now do a runtime check, even for builds that target "armv8-a+crc", but those builds can still use a direct call for constant inputs, which we assume are short. As for x86, the MIT-licensed implementation was generated with the "generate" program from https://github.com/corsix/fast-crc32/ Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Discussion: https://postgr.es/m/CANWCAZaKhE+RD5KKouUFoxx1EbUNrNhcduM1VQ=DkSDadNEFng@mail.gmail.com	2026-04-04 20:47:01 +07:00
John Naylor	5e13b0f240	Use AVX2 for calculating page checksums where available We already rely on autovectorization for computing page checksums, but on x86 we can get a further several-fold performance increase by annotating pg_checksum_block() with a function target attribute for the AVX2 instruction set extension. Not only does that use 256-bit registers, it can also use vector multiplication rather than the vector shifts and adds used in SSE2. Similar to other hardware-specific paths, we set a function pointer on first use. We don't bother to avoid this on platforms without AVX2 since the overhead of indirect calls doesn't matter for multi-kilobyte inputs. However, we do arrange so that only core has the function pointer mechanism. External programs will continue to build a normal static function and don't need to be aware of this. This matters most when using io_uring since in that case the checksum computation is not done in parallel by IO workers. Co-authored-by: Matthew Sterrett <matthewsterrett2@gmail.com> Co-authored-by: Andrew Kim <andrew.kim@intel.com> Reviewed-by: Oleg Tselebrovskiy <o.tselebrovskiy@postgrespro.ru> Tested-by: Ants Aasma <ants.aasma@cybertec.at> Tested-by: Stepan Neretin <slpmcf@gmail.com> (earlier version) Discussion: https://postgr.es/m/CA+vA85_5GTu+HHniSbvvP+8k3=xZO=WE84NPwiKyxztqvpfZ3Q@mail.gmail.com Discussion: https://postgr.es/m/20250911054220.3784-1-root%40ip-172-31-36-228.ec2.internal	2026-04-04 18:07:15 +07:00
Heikki Linnakangas	c06443063f	Add missing shmem size estimate for fast-path locking struct It's been missing ever since fast-path locking was introduced. It's a small discrepancy, about 4 kB, but let's be tidy. This doesn't seem worth backpatching, however; in stable branches we were less precise about the estimates and e.g. added a 10% margin to the hash table estimates, which is usually much bigger than this discrepancy.	2026-04-04 11:46:11 +03:00
Thomas Munro	bab656bb87	More tar portability adjustments. For the three implementations that have caused problems so far: * GNU and BSD (libarchive) tar both understand --format=ustar * ustar doesn't support large UID/GID values, so set them to 0 to avoid a hard error from at least GNU tar * OpenBSD tar needs -F ustar, and it appears to warn but carry on with "nobody" if a UID is too large * -f /dev/null is a more portable way to throw away the output, since the default destination might be a tape device depending on build options that a distribution might change * Windows ships BSD tar but lacks /dev/null, so ask perl for its name Based on their manuals, the other two implementations the tests are likely to encounter in the wild don't seem to need any special handling: * Solaris/illumos tar uses ustar and replaces large UIDs with 60001 * AIX tar uses ustar (unless --format=pax) and truncates large UIDs Backpatch-through: 18 Co-authored-by: Thomas Munro <thomas.munro@gmail.com> Co-authored-by: Sami Imseih <samimseih@gmail.com> (large UIDs) Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> (earlier version) Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com> (OpenBSD) Reviewed-by: Andrew Dunstan <andrew@dunslane.net> (Windows) Discussion: https://postgr.es/m/3676229.1775170250%40sss.pgh.pa.us Discussion: https://postgr.es/m/CAA5RZ0tt89MgNi4-0F4onH%2B-TFSsysFjMM-tBc6aXbuQv5xBXw%40mail.gmail.com	2026-04-04 13:54:21 +13:00
Heikki Linnakangas	4953a25b7f	Remove HASH_DIRSIZE, always use the default algorithm to select it It's not very useful to specify a non-standard directory size. The HASH_DIRSIZE option was only used for shared memory hash tables, and those always used hash_select_dirsize() to choose the size, which in turn just uses the default algorithm anyway. That assumption was ingrained in hash_estimate_size(), too. Reviewed-by: Tomas Vondra <tomas@vondra.me> Discussion: https://www.postgresql.org/message-id/01ab1d41-3eda-4705-8bbd-af898f5007f1@iki.fi	2026-04-04 02:40:28 +03:00
Heikki Linnakangas	9fe9ecd516	Allocate all parts of shmem hash table from a single contiguous area Previously, the shared header (HASHHDR) and the directory were allocated by the caller, and passed to hash_create(), while the actual elements were allocated separately with ShmemAlloc(). After this commit, all the memory needed by the header, the directory, and all the elements is allocated using a single ShmemInitStruct() call, and the different parts are carved out of that allocation. This way the ShmemIndex entries (and thus pg_shmem_allocations) reflect the size of the whole hash table, rather than just the directories. Commit `f5930f9a98` attempted this earlier, but it had to be reverted. The new strategy is to let dynahash.c perform all the allocations with the alloc function, but have the alloc function carve out the parts from the one larger allocation. The shared header and the directory are now also allocated with alloc calls, instead of passing the area for those directly from the caller. Reviewed-by: Tomas Vondra <tomas@vondra.me> Discussion: https://www.postgresql.org/message-id/01ab1d41-3eda-4705-8bbd-af898f5007f1@iki.fi	2026-04-04 02:40:25 +03:00
Heikki Linnakangas	999e9ebb51	Prevent shared memory hash tables from growing beyond initial size Set HASH_FIXED_SIZE on all shared memory hash tables, to prevent them from growing after the initial allocation. It was always weirdly indeterministic that if one hash table used up all the unused shared memory, you could not use that space for other things anymore until restart. We just got rid of that behavior for the LOCK and PROCLOCK tables, but it's similarly weird for all other hash tables. Increase SHMEM_INDEX_SIZE because we were already above the max size, on that one, and it's now a hard limit. Some callers of ShmemInitHash() still pass HASH_FIXED_SIZE, but that's now unnecessary. They should perhaps now be removed, but it doesn't do any harm either to pass it. Reviewed-by: Tomas Vondra <tomas@vondra.me> Discussion: https://www.postgresql.org/message-id/01ab1d41-3eda-4705-8bbd-af898f5007f1@iki.fi	2026-04-04 02:40:24 +03:00
Heikki Linnakangas	9ebe1c4f2c	Merge init and max size options on shmem hash tables Replace the separate init and max size options with a single size option. We didn't make much use of the feature, all callers except the ones in wait_event.c already used the same size for both, and the hash tables in wait_event.c are small so there's little harm in just allocating them to the max size. The only reason why you might want to not reserve the max size upfront is to make the memory available for other hash tables to grow beyond their max size. Letting hash tables grow much beyond their max size is bad for performance, however, because we cannot resize the directory, and we never had very much "wiggle room" to grow to anyway so you couldn't really rely on it. We recently marked the LOCK and PROCLOCK tables with HASH_FIXED_SIZE, so there's nothing left in core that would benefit from more unallocated shared memory. Reviewed-by: Tomas Vondra <tomas@vondra.me> Discussion: https://www.postgresql.org/message-id/01ab1d41-3eda-4705-8bbd-af898f5007f1@iki.fi	2026-04-04 02:40:20 +03:00
Jacob Champion	d438a36591	oauth: Let validators provide failure DETAILs At the moment, the only way for a validator module to report error details on failure is to log them separately before returning from validate_cb. Independently of that problem, the ereport() calls that we make during validation failure partially duplicate some of the work of auth_failed(). The end result is overly verbose and confusing for readers of the logs: [768233] LOG: [my_validator] bad signature in bearer token [768233] LOG: OAuth bearer authentication failed for user "jacob" [768233] DETAIL: Validator failed to authorize the provided token. [768233] FATAL: OAuth bearer authentication failed for user "jacob" [768233] DETAIL: Connection matched file ".../pg_hba.conf" line ... Solve both problems by making use of the existing logdetail pointer that's provided by ClientAuthentication. Validator modules may set ValidatorModuleResult->error_detail to override our default generic message. The end result looks something like [242284] FATAL: OAuth bearer authentication failed for user "jacob" [242284] DETAIL: [my_validator] bad signature in bearer token Connection matched file ".../pg_hba.conf" line ... Reported-by: Álvaro Herrera <alvherre@kurilemu.de> Reported-by: Zsolt Parragi <zsolt.parragi@percona.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Zsolt Parragi <zsolt.parragi@percona.com> Discussion: https://postgr.es/m/202601241015.y5uvxd7oxnfs%40alvherre.pgsql	2026-04-03 16:05:33 -07:00
Daniel Gustafsson	0036232ba8	Make data checksum tests more resilient for slow machines The test for re-running checksum enabling was only checking for the data checksum state to transition to 'on', but didn't account for the launcher process having had time to exit, thus getting an error instead of the expected no-op. Adding a pg_stat_activity check for the launcher exiting resolves the error, verified by inducing delay in the launcher. Also wrap a variable only used in injection point tests within the correct USE macros to avoid warning for an unused variable. All per the buildfarm. Author: Daniel Gustafsson <daniel@yesql.se> Reported-by: Buildfarm Discussion: https://postgr.es/m/1CB288C9-564B-4664-B096-C2F4377D17AB@yesql.se	2026-04-04 00:25:07 +02:00
Nathan Bossart	01876ace13	Add elevel parameter to relation_needs_vacanalyze(). This will be used in a follow-up commit to avoid emitting debug logs from this function. Author: Sami Imseih <samimseih@gmail.com> Discussion: https://postgr.es/m/CAA5RZ0s4xjMrB-VAnLccC7kY8d0-4806-Lsac-czJsdA1LXtAw%40mail.gmail.com	2026-04-03 17:04:28 -05:00
Nathan Bossart	53b8ca6881	Teach relation_needs_vacanalyze() to always compute scores. Presently, this function only computes component scores when the corresponding threshold is reached. A follow-up commit will add a view that shows tables' autovacuum scores, and we anticipate that users will want to use this view to discover tables that are nearing autovacuum eligibility. This commit teaches this function to always compute autovacuum scores, even when a threshold has not been reached or autovacuum is disabled. The restructuring in this commit revealed an interesting edge case. If the table needs vacuuming for wraparound prevention and autovacuum is disabled for it, we might still choose to analyze it. It's not clear if this is intentional, but it has been this way for nearly 20 years, so it seems best to avoid changing it without further discussion. Author: Sami Imseih <samimseih@gmail.com> Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> Discussion: https://postgr.es/m/CAA5RZ0s4xjMrB-VAnLccC7kY8d0-4806-Lsac-czJsdA1LXtAw%40mail.gmail.com	2026-04-03 16:44:41 -05:00
Daniel Gustafsson	f19c0eccae	Online enabling and disabling of data checksums This allows data checksums to be enabled, or disabled, in a running cluster without restricting access to the cluster during processing. Data checksums could prior to this only be enabled during initdb or when the cluster is offline using the pg_checksums app. This commit introduces functionality to enable, or disable, data checksums while the cluster is running regardless of how it was initialized. A background worker launcher process is responsible for launching a dynamic per-database background worker which will mark all buffers dirty for all relations with storage in order for them to have data checksums calculated on write. Once all relations in all databases have been processed, the data_checksums state will be set to on and the cluster will at that point be identical to one which had data checksums enabled during initialization or via offline processing. When data checksums are being enabled, concurrent I/O operations from backends other than the data checksums worker will write the checksums but not verify them on reading. Only when all backends have absorbed the procsignalbarrier for setting data_checksums to on will they also start verifying checksums on reading. The same process is repeated during disabling; all backends write checksums but do not verify them until the barrier for setting the state to off has been absorbed by all. This in-progress state is used to ensure there are no false negatives (or positives) due to reading a checksum which is not in sync with the page. A new test module, test_checksums, is introduced with an extensive set of tests covering both online and offline data checksum mode changes. The tests which run concurrent pgbench during online processing are gated behind the PG_TEST_EXTRA flag due to being very expensive to run. Two levels of PG_TEST_EXTRA flags exist to turn on a subset of the expensive tests, or the full suite of multiple runs. 
This work is based on an earlier version of this patch which was reviewed by among others Heikki Linnakangas, Robert Haas, Andres Freund, Tomas Vondra, Michael Banck and Andrey Borodin. During the work on this new version, Tomas Vondra has given invaluable assistance with not only coding and reviewing but very in-depth testing. Author: Daniel Gustafsson <daniel@yesql.se> Author: Magnus Hagander <magnus@hagander.net> Co-authored-by: Tomas Vondra <tomas@vondra.me> Reviewed-by: Tomas Vondra <tomas@vondra.me> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Discussion: https://postgr.es/m/CABUevExz9hUUOLnJVr2kpw9Cx=o4MCr1SVKwbupzuxP7ckNutA@mail.gmail.com Discussion: https://postgr.es/m/20181030051643.elbxjww5jjgnjaxg@alap3.anarazel.de Discussion: https://postgr.es/m/CABUevEwE3urLtwxxqdgd5O2oQz9J717ZzMbh+ziCSa5YLLU_BA@mail.gmail.com	2026-04-03 22:58:51 +02:00
Nathan Bossart	8261ee24fe	Refactor relation_needs_vacanalyze(). This commit adds an early return to this function, allowing us to remove a level of indentation on a decent chunk of code. This is preparatory work for follow-up commits that will add a new system view to show tables' autovacuum scores. Reviewed-by: Sami Imseih <samimseih@gmail.com> Discussion: https://postgr.es/m/CAA5RZ0s4xjMrB-VAnLccC7kY8d0-4806-Lsac-czJsdA1LXtAw%40mail.gmail.com	2026-04-03 14:03:12 -05:00
Heikki Linnakangas	79534f9065	Change default of max_locks_per_transaction to 128 The previous commits reduced the amount of memory available for locks by eliminating the "safety margins" and by settling the split between LOCK and PROCLOCK tables at startup. The allocation is now more deterministic, but it also means that you often hit one of the limits sooner than before. To compensate for that, bump up max_locks_per_transaction from 64 to 128. With that there is a little more space in both hash tables than what was the effective maximum size for either table before the previous commits. This only changes the default, so if you had changed max_locks_per_transaction in postgresql.conf, you will still have fewer locks available than before for the same setting value. This should be noted in the release notes. A good rule of thumb is that if you double max_locks_per_transaction, you should be able to get as many locks as before. Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com> Discussion: https://www.postgresql.org/message-id/e07be2ba-856b-4ff5-8313-8b58b6b4e4d0@iki.fi	2026-04-03 20:27:46 +03:00
Heikki Linnakangas	e1ad034809	Make the lock hash tables fixed-sized This prevents the LOCK table from "stealing" space that was originally calculated for the PROCLOCK table, and vice versa. That was weirdly indeterministic so that if you e.g. took a lot of locks consuming all the available shared memory for the LOCK table, subsequent transactions that needed more space for the PROCLOCK table would fail, but if you restarted the system then the space would be available for PROCLOCK again. Better to be strict and predictable, even though that means that in many cases you can acquire far fewer locks than before. This also prevents the lock hash tables from using up the general-purpose 100 kB reserve we set aside for "stuff that's too small to bother estimating" in CalculateShmemSize(). We are pretty good at accounting for everything nowadays, so we could probably make that reservation smaller, but I'll leave that for another commit. Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com> Discussion: https://www.postgresql.org/message-id/e07be2ba-856b-4ff5-8313-8b58b6b4e4d0@iki.fi	2026-04-03 20:27:16 +03:00
Heikki Linnakangas	3e854d2ff1	Remove 10% safety margin from lock manager hash table estimates As the comment says, the hash table sizes are just estimates, but that doesn't mean we need a "safety margin" here. hash_estimate_size() estimates the needed size in bytes pretty accurately for the given number of elements, so if we wanted room for more elements in the table, we should just use larger max_table_size in the hash_estimate_size() call. Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com> Discussion: https://www.postgresql.org/message-id/e07be2ba-856b-4ff5-8313-8b58b6b4e4d0@iki.fi	2026-04-03 20:26:18 +03:00
Heikki Linnakangas	feb03dfecd	Remove bogus "safety margin" from predicate.c shmem estimates The 10% safety margin was copy-pasted from lock.c when the predicate locking code was originally added. However, we later (commit `7c797e7194`) added the HASH_FIXED_SIZE flag to the hash tables, which means that they cannot actually use the safety margin that we're calculating for them. The extra memory was mainly used by the main lock manager, which is the only shmem hash table of non-trivial size that does not use the HASH_FIXED_SIZE flag. If we wanted to have more space for the lock manager, we should reserve it directly in lock.c. After this commit, the lock manager will just have less memory available than before. Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com> Discussion: https://www.postgresql.org/message-id/e07be2ba-856b-4ff5-8313-8b58b6b4e4d0@iki.fi	2026-04-03 20:25:57 +03:00
Amit Langote	b7b27eb41a	Optimize fast-path FK checks with batched index probes Instead of probing the PK index on each trigger invocation, buffer FK rows in a new per-constraint cache entry (RI_FastPathEntry) and flush them as a batch. On each trigger invocation, the new ri_FastPathBatchAdd() buffers the FK row in RI_FastPathEntry. When the buffer fills (64 rows) or the trigger-firing cycle ends, the new ri_FastPathBatchFlush() probes the index for all buffered rows, sharing a single CommandCounterIncrement, snapshot, permission check, and security context switch across the batch, rather than repeating each per row as the SPI path does. Per-flush CCI is safe because all AFTER triggers for the buffered rows have already fired by flush time. For single-column foreign keys, the new ri_FastPathFlushArray() builds an ArrayType from the buffered FK values (casting to the PK-side type if needed) and constructs a scan key with the SK_SEARCHARRAY flag. The index AM sorts and deduplicates the array internally, then walks matching leaf pages in one ordered traversal instead of descending from the root once per row. A matched[] bitmap tracks which batch items were satisfied; the first unmatched item is reported as a violation. Multi-column foreign keys fall back to per-row probing via the new ri_FastPathFlushLoop(). The fast path introduced in the previous commit (`2da86c1ef9`) yields ~1.8x speedup. This commit adds ~1.6x on top of that, for a combined ~2.9x speedup over the unpatched code (int PK / int FK, 1M rows, PK table and index cached in memory). FK tuples are materialized via ExecCopySlotHeapTuple() into a new purpose-specific memory context (flush_cxt), child of TopTransactionContext, which is also used for per-flush transient work: cast results, the search array, and index scan allocations. It is reset after each flush and deleted in teardown. 
The PK relation, index, tuple slots, and fast-path metadata are cached in RI_FastPathEntry across trigger invocations within a trigger-firing batch, avoiding repeated open/close overhead. The snapshot and IndexScanDesc are taken fresh per flush. The entry is not subject to cache invalidation: cached relations are held with locks for the transaction duration, and the entry's lifetime is bounded by the trigger-firing cycle. Lifecycle management for RI_FastPathEntry relies on three new mechanisms: - AfterTriggerBatchCallback: A new general-purpose callback mechanism in trigger.c. Callbacks registered via RegisterAfterTriggerBatchCallback() fire at the end of each trigger-firing batch (AfterTriggerEndQuery for immediate constraints, AfterTriggerFireDeferred at COMMIT, and AfterTriggerSetState for SET CONSTRAINTS IMMEDIATE). The RI code registers ri_FastPathEndBatch as a batch callback. - Batch callbacks only fire at the outermost query level (checked inside FireAfterTriggerBatchCallbacks), so nested queries from SPI inside other AFTER triggers do not tear down the cache mid-batch. - XactCallback: ri_FastPathXactCallback NULLs the static cache pointer at transaction end, handling the abort path where the batch callback never fired. - SubXactCallback: ri_FastPathSubXactCallback NULLs the static cache pointer on subtransaction abort, preventing the batch callback from accessing already-released resources. - AfterTriggerBatchIsActive(): A new exported accessor that returns true when afterTriggers.query_depth >= 0. During ALTER TABLE ... ADD FOREIGN KEY validation, RI triggers are called directly outside the after-trigger framework, so batch callbacks would never fire. The fast-path code uses this to fall back to the non-cached per-invocation path in that context. ri_FastPathEndBatch() flushes any partial batch before tearing down cached resources. Since the FK relation may already be closed by flush time (e.g. 
for deferred constraints at COMMIT), it reopens the relation using entry->fk_relid if needed. The existing ALTER TABLE validation path bypasses batching and continues to call ri_FastPathCheck() directly per row, because RI triggers are called outside the after-trigger framework there and batch callbacks would never fire to flush the buffer. Suggested-by: David Rowley <dgrowleyml@gmail.com> Author: Amit Langote <amitlangote09@gmail.com> Co-authored-by: Junwang Zhao <zhjwpku@gmail.com> Reviewed-by: Haibo Yan <tristan.yim@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Tested-by: Tomas Vondra <tomas@vondra.me> Discussion: https://postgr.es/m/CA+HiwqF4C0ws3cO+z5cLkPuvwnAwkSp7sfvgGj3yQ=Li6KNMqA@mail.gmail.com	2026-04-03 14:33:53 +09:00
Thomas Munro	be21341e13	jit: No backport::SectionMemoryManager for LLVM 22. LLVM 22 has the fix that we copied into our tree in commit `9044fc1d` and a new function to reach it[1][2], so we only need to use our copy for Aarch64 + LLVM < 22. The only change to the final version that our copy didn't get is a new LLVM_ABI macro, but that isn't appropriate for us. Our copy is hopefully now frozen and would only need maintenance if bugs are found in the upstream code. Non-Aarch64 systems now also use the new API with LLVM 22. It allocates all sections with one contiguous mmap() instead of one per section. We could have done that earlier, but commit `9044fc1d` wanted to limit the blast radius to the affected systems. We might as well benefit from that small improvement everywhere now that it is available out of the box. We can't delete our copy until LLVM 22 is our minimum supported version, or we switch to the newer JITLink API for at least Aarch64. [1] https://github.com/llvm/llvm-project/pull/71968 [2] https://github.com/llvm/llvm-project/pull/174307 Backpatch-through: 14 Discussion: https://postgr.es/m/CA%2BhUKGJTumad75o8Zao-LFseEbt%3DenbUFCM7LZVV%3Dc8yg2i7dg%40mail.gmail.com	2026-04-03 14:55:11 +13:00
Tom Lane	ebba64c08d	Further harden tests that might use not-so-compatible tar versions. Buildfarm testing shows that OpenSUSE (and perhaps related platforms?) configures GNU tar in such a way that it'll archive sparse WAL files by default, thus triggering the pax-extension detection code added by `bc30c704a`. Thus, we need something similar to `852de579a` but for GNU tar's option set. "--format=ustar" seems to do the trick. Moreover, the buildfarm shows that pg_verifybackup's 003_corruption.pl test script is also triggering creation of pax-format tar files on that platform. We had not noticed because those test cases all fail (intentionally) before getting to the point of trying to verify WAL data. Since that means two TAP scripts need this option-selection logic, and plausibly more will do so in future, factor it out into a subroutine in Test::Utils. We also need to back-patch the 003_corruption.pl fix into v18, where it's also failing. While at it, clean up some places where guards for $tar being empty or undefined were incomplete or even outright backwards. Presumably, we missed noticing because the set of machines that run TAP tests and don't have tar installed is empty. But if we're going to try to handle that scenario, we should do it correctly. Reported-by: Tomas Vondra <tomas@vondra.me> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/02770bea-b3f3-4015-8a43-443ae345379c@vondra.me Backpatch-through: 18	2026-04-02 17:21:27 -04:00
Andrew Dunstan	bd4f879a9c	Add additional jsonpath string methods Add the following jsonpath methods: * l/r/btrim() * lower(), upper() * initcap() * replace() * split_part() Each simply dispatches to the standard string processing functions. These depend on the locale, but since it's set at `initdb`, they can be considered immutable and therefore allowed in any jsonpath expression. Author: Florents Tselai <florents.tselai@gmail.com> Co-authored-by: David E. Wheeler <david@justatheory.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Andrew Dunstan <andrew@dunslane.net> Discussion: https://postgr.es/m/CA+v5N40sJF39m0v7h=QN86zGp0CUf9F1WKasnZy9nNVj_VhCZQ@mail.gmail.com	2026-04-02 15:19:49 -04:00
Andrew Dunstan	a35c9d524e	Rename jsonpath method arg tokens This is just cleanup in the jsonpath grammar. Rename the `csv_` tokens to `int_`, because they represent signed or unsigned integers, as follows: * `csv_elem` => `int_elem` * `csv_list` => `int_list` * `opt_csv_list` => `opt_int_list` Rename the `datetime_precision` tokens to `uint_arg`, as they represent unsigned integers and will be useful for other methods in the future, as follows: * `datetime_precision` => `uint_elem` * `opt_datetime_precision` => `opt_uint_arg` Rename the `datetime_template` tokens to `str_arg`, as they represent strings and will be useful for other methods in the future, as follows: * `datetime_template` => `str_elem` * `opt_datetime_template` => `opt_str_arg` Author: David E. Wheeler <david@justatheory.com> Reviewed-by: Andrew Dunstan <andrew@dunslane.net> Discussion: https://postgr.es/m/CA+v5N40sJF39m0v7h=QN86zGp0CUf9F1WKasnZy9nNVj_VhCZQ@mail.gmail.com	2026-04-02 15:19:49 -04:00
Masahiko Sawada	fd7a25af11	Add target_relid parameter to pg_get_publication_tables(). When a tablesync worker checks whether a specific table is published, it previously issued a query to the publisher calling pg_get_publication_tables() and filtering the result by relid via a WHERE clause. Because the function itself was fully evaluated before the filter was applied, this forced the publisher to enumerate all tables in the publication. For publications covering a large number of tables, this resulted in expensive catalog scans and unnecessary CPU overhead on the publisher. This commit adds a new overloaded form of pg_get_publication_tables() that accepts an array of publication names and a target table OID. Instead of enumerating all published tables, it evaluates membership for the specified relation via syscache lookups, using the new is_table_publishable_in_publication() helper. This helper correctly accounts for publish_via_partition_root, ALL TABLES with EXCEPT clauses, schema publications, and partition inheritance, while avoiding the overhead of building the complete published table list. The existing VARIADIC array form of pg_get_publication_tables() is preserved for backward compatibility. Tablesync workers use the new two-argument form when connected to a publisher running PostgreSQL 19 or later. Bump catalog version. Reported-by: Marcos Pegoraro <marcos@f10.com.br> Reviewed-by: Zhijie Hou <houzj.fnst@fujitsu.com> Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Peter Smith <smithpb2250@gmail.com> Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Haoyan Wang <wanghaoyan20@163.com> Discussion: https://postgr.es/m/CAB-JLwbBFNuASyEnZWP0Tck9uNkthBZqi6WoXNevUT6+mV8XmA@mail.gmail.com	2026-04-02 11:34:50 -07:00
Tom Lane	bc30c704ad	Harden astreamer tar parsing logic against archives it can't handle. Previously, there was essentially no verification in this code that the input is a tar file at all, let alone that it fits into the subset of valid tar files that we can handle. This was exposed by the discovery that we couldn't handle files that FreeBSD's tar makes, because it's fairly aggressive about converting sparse WAL files into sparse tar entries. To fix: * Bail out if we find a pax extension header. This covers the sparse-file case, and also protects us against scenarios where the pax header changes other file properties that we care about. (Eventually we may extend the logic to actually handle such headers, but that won't happen in time for v19.) * Be more wary about tar file type codes in general: do not assume that anything that's neither a directory nor a symlink must be a regular file. Instead, we just ignore entries that are none of the three supported types. * Apply pg_dump's isValidTarHeader to verify that a purported header block is actually in tar format. To make this possible, move isValidTarHeader into src/port/tar.c, which is probably where it should have been since that file was created. I also took the opportunity to const-ify the arguments of isValidTarHeader and tarChecksum, and to use symbols not hard-wired constants inside tarChecksum. Back-patch to v18 but not further. Although this code exists inside pg_basebackup in older branches, it's not really exposed in that usage to tar files that weren't generated by our own code, so it doesn't seem worth back-porting these changes across `3c9056981` and `f80b09bac`. I did choose to include a back-patch of `5868372bb` into v18 though, to minimize cosmetic differences between these two branches. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Thomas Munro <thomas.munro@gmail.com> Discussion: https://postgr.es/m/3049460.1775067940@sss.pgh.pa.us Backpatch-through: 18	2026-04-02 12:20:36 -04:00
Fujii Masao	5770679918	Remove redundant SetLatch() calls in interrupt handling functions Interrupt handling functions (e.g., HandleCatchupInterrupt(), HandleParallelApplyMessageInterrupt()) are called only by procsignal_sigusr1_handler(), which already calls SetLatch() for the current process at the end of its processing. Therefore, these interrupt handling functions do not need to call SetLatch() themselves. However, previously, some of these functions redundantly called SetLatch(). This commit removes those unnecessary calls. While duplicate SetLatch() calls are redundant, they are harmless, so this change is not backpatched. Author: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com> Discussion: https://postgr.es/m/CALj2ACWd5apddj6Cd885WwJ6LquYu_G81C4GoR4xSoDV1x-FEA@mail.gmail.com	2026-04-02 23:55:30 +09:00
John Naylor	effaa464af	Check for __cpuidex and __get_cpuid_count separately Previously we would only check for the availability of __cpuidex if the related __get_cpuid_count was not available on a platform. Future commits will need to access hypervisor information about the TSC frequency of x86 CPUs. For that case __cpuidex is the only viable option for accessing a high leaf (e.g. 0x40000000), since __get_cpuid_count does not allow that. __cpuidex is defined in cpuid.h for gcc/clang, but in intrin.h for MSVC, so adjust tests to suit. We also need to cast the array of unsigned ints to signed, since gcc (with -Wall) and clang emit warnings otherwise. Author: Lukas Fittl <lukas@fittl.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: John Naylor <john.naylor@postgresql.org> Discussion: https://postgr.es/m/CAP53PkyooCeR8YV0BUD_xC7oTZESHz8OdA=tP7pBRHFVQ9xtKg@mail.gmail.com	2026-04-02 19:39:57 +07:00
Andrew Dunstan	bb6ae9707c	Use command_ok for pg_regress calls in 002_pg_upgrade and 027_stream_regress Now that command_ok() captures and displays failure output, use it instead of system() plus manual diff-dumping in these two tests. This simplifies both scripts and produces consistent, truncated output on failure. Author: Jelte Fennema-Nio <postgres@jeltef.nl> Reviewed-by: Andrew Dunstan <andrew@dunslane.net> Reviewed-by: Corey Huinker <corey.huinker@gmail.com> Discussion: https://postgr.es/m/DFYFWM053WHS.10K8ZPJ605UFK@jeltef.nl	2026-04-02 08:13:44 -04:00
Andrew Dunstan	b8da9869b8	perl tap: Use croak instead of die in our helper modules Replace die with croak throughout Cluster.pm and Utils.pm (except in INIT blocks and signal handlers, where die is correct) so that error messages report the test script's line number rather than the helper module's. Add @CARP_NOT in Utils.pm listing PostgreSQL::Test::Cluster, so that when a Utils function is called through a Cluster.pm wrapper, croak skips both packages and reports the actual test-script caller. Author: Jelte Fennema-Nio <postgres@jeltef.nl> Reviewed-by: Andrew Dunstan <andrew@dunslane.net> Reviewed-by: Corey Huinker <corey.huinker@gmail.com> Reviewed-by: Zsolt Parragi <zsolt.parragi@percona.com> Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/DFYFWM053WHS.10K8ZPJ605UFK@jeltef.nl	2026-04-02 08:13:44 -04:00
Andrew Dunstan	76540fdedf	perl tap: Show die reason in TAP output Install a $SIG{__DIE__} handler in the INIT block of Utils.pm that emits the die message as a TAP diagnostic. Previously, an unexpected die (e.g. from safe_psql) produced only "no plan was declared" with no indication of the actual error. The handler also calls done_testing() to suppress that confusing message. Dies during compilation ($^S undefined) and inside eval ($^S == 1) are left alone. Author: Jelte Fennema-Nio <postgres@jeltef.nl> Reviewed-by: Andrew Dunstan <andrew@dunslane.net> Reviewed-by: Corey Huinker <corey.huinker@gmail.com> Reviewed-by: Zsolt Parragi <zsolt.parragi@percona.com> Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/DFYFWM053WHS.10K8ZPJ605UFK@jeltef.nl Discussion: https://postgr.es/m/20220222181924.eehi7o4pmneeb4hm%40alap3.anarazel.de	2026-04-02 08:13:44 -04:00
Andrew Dunstan	1402b8d2fc	perl tap: Show failed command output Capture stdout and stderr from command_ok() and command_fails() and emit them as TAP diagnostics on failure. Output is truncated to the first and last 30 lines per channel to avoid flooding. A new helper _diag_command_output() is introduced in Utils.pm so both functions share the same truncation and formatting logic. Author: Jelte Fennema-Nio <postgres@jeltef.nl> Reviewed-by: Andrew Dunstan <andrew@dunslane.net> Reviewed-by: Corey Huinker <corey.huinker@gmail.com> Reviewed-by: Zsolt Parragi <zsolt.parragi@percona.com> Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/DFYFWM053WHS.10K8ZPJ605UFK@jeltef.nl	2026-04-02 08:13:44 -04:00
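The "first and last 30 lines per channel" truncation described in `1402b8d2fc` can be illustrated with a minimal Python sketch. This is not the Perl helper `_diag_command_output()` itself; the function name `truncate_output` and the wording of the omission marker are hypothetical, chosen only for illustration.

```python
def truncate_output(text, keep=30):
    """Return the lines of `text`, keeping only the first and last `keep`.

    If the output has at most 2 * keep lines it is returned unchanged.
    Otherwise the middle is replaced by a single marker line (the marker
    text is an assumption, not the format used by Utils.pm).
    """
    lines = text.splitlines()
    if len(lines) <= 2 * keep:
        return lines
    omitted = len(lines) - 2 * keep
    return lines[:keep] + [f"... {omitted} lines omitted ..."] + lines[-keep:]


# Example: 100 lines of fake command output.
sample = "\n".join(f"line {i}" for i in range(100))
shown = truncate_output(sample)
```

Applying the same function to stdout and stderr separately would give the per-channel behavior the commit message describes.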
Andrew Dunstan	5720ae0143	pg_regress: Include diffs in TAP output When pg_regress fails it is often tedious to find the actual diffs, especially in CI where you must navigate a file browser. Emit the first 80 lines of the combined regression.diffs as TAP diagnostics so the failure reason is visible directly in the test output. The line limit is across all failing tests in a single pg_regress run to avoid flooding when a crash causes every subsequent test to fail. New DIAG_DETAIL / DIAG_END tap output types are added, mirroring the existing NOTE_DETAIL / NOTE_END pair, so that long diff lines can be emitted without spurious '#' prefixes on continuation lines. Author: Jelte Fennema-Nio <postgres@jeltef.nl> Reviewed-by: Andrew Dunstan <andrew@dunslane.net> Reviewed-by: Corey Huinker <corey.huinker@gmail.com> Reviewed-by: Zsolt Parragi <zsolt.parragi@percona.com> Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/DFYFWM053WHS.10K8ZPJ605UFK@jeltef.nl	2026-04-02 08:13:44 -04:00
Tomas Vondra	7f8c88c2b8	jit: Change the default to off. While JIT can speed up large analytical queries, it can also cause serious performance issues on otherwise very fast queries. Compiling and optimizing the expressions may be so expensive, it completely outweighs the JIT benefits for shorter queries. Ideally, we'd address this in the cost model, but the part deciding whether to enable JIT for a query is rather simple, partially because we don't have any reliable estimates of how expensive the LLVM compilation and optimization is. Sometimes seemingly unrelated changes (for example a couple additional INSERTs into a table) increase the cost just enough to enable JIT, resulting in a performance cliff. Because of these risks, most large-scale deployments already disable JIT by default. Notably, this includes all hyperscalers. This commit changes our default to align with that established practice. If we improve the JIT (be it better costing or cheaper execution), we can consider enabling it by default again. Author: Jelte Fennema-Nio <postgres@jeltef.nl> Discussion: https://postgr.es/m/DG1VZJEX1AQH.2EH4OKGRUDB71@jeltef.nl	2026-04-02 13:40:29 +02:00
Heikki Linnakangas	148fe2b05d	Test pg_stat_statements across crash restart Add 'pg_stat_statements' to the crash restart test, to test that shared memory and LWLock initialization works across crash restart in a library listed in shared_preload_libraries. We had no test coverage for that. Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Discussion: https://www.postgresql.org/message-id/CAExHW5vM1bneLYfg0wGeAa=52UiJ3z4vKd3AJ72X8Fw6k3KKrg@mail.gmail.com	2026-04-02 13:33:06 +03:00
Amit Kapila	4441d6b2e4	Doc: Fix oversight in commit `55cefadde8`. The documentation for pg_publication_rel.prrelid referred to sequences, whereas the catalog stores information only about tables. Author: Peter Smith <smithpb2250@gmail.com> Reviewed-by: shveta malik <shveta.malik@gmail.com> Discussion: https://postgr.es/m/CAHut+Pv1UKR_bxmN7wcCCpQveHoYprvH-hbdFq8gsaH1Ye7B_w@mail.gmail.com	2026-04-02 10:16:53 +05:30
Thomas Munro	de6b80e5ff	jit: Stop emitting lifetime.end for LLVM 22. The lifetime.end intrinsic can now only be used for stack memory allocated with alloca[1][2][3]. We use it to tell LLVM about the lifetime of function arguments/isnull values that we keep in palloc'd memory, so that it can avoid spilling registers to memory. We might need to rearrange things and put them on the stack, but that'll take some research. In the meantime, unbreak the build on LLVM 22. [1] https://github.com/llvm/llvm-project/pull/149310 [2] https://llvm.org/docs/LangRef.html#llvm-lifetime-end-intrinsic [3] https://llvm.org/docs/LangRef.html#i-alloca Backpatch-through: 14 Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com> (earlier attempt) Reviewed-by: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com> (earlier attempt) Reviewed-by: Andres Freund <andres@anarazel.de> (earlier attempt) Discussion: https://postgr.es/m/CA%2BhUKGJTumad75o8Zao-LFseEbt%3DenbUFCM7LZVV%3Dc8yg2i7dg%40mail.gmail.com	2026-04-02 15:52:48 +13:00
David Rowley	331d829e62	Fix nocachegetattr() so it again supports deforming cstrings `c456e3911` added various optimizations to the tuple deformation routines. One optimization assumed that heap tuples would never contain cstrings. That optimization also made its way into nocachegetattr(), which isn't correct as ROW() types get formed into HeapTuples by ExecEvalRow() and those can contain cstring Datums. nocachegetattr() gets used to extract Datums from those tuples. Here we remove the pg_assume(), which was there to instruct the compiler to omit the attlen == -2 related code in att_addlength_pointer(). Author: David Rowley <dgrowleyml@gmail.com> Reported-by: Alexander Lakhin <exclusion@gmail.com> Discussion: https://postgr.es/m/80aeac57-8f50-4732-a5b4-c2373c3f8149@gmail.com	2026-04-02 14:11:17 +13:00
Andres Freund	82c0cb4e67	pg_test_timing: Reduce per-loop overhead The pg_test_timing program was previously using INSTR_TIME_GET_NANOSEC on an absolute instr_time value in order to do a diff, which goes against the spirit of how the GET_* macros are supposed to be used, and will cause overhead in a future change that assumes these macros are typically used on intervals only. Additionally the program was doing unnecessary work in the test loop by measuring the time elapsed, instead of checking the existing current time measurement against a target end time. To support that, introduce a new INSTR_TIME_ADD_NANOSEC macro that allows adding user-defined nanoseconds to an instr_time variable. While modifying the relevant code anyway, simplify it by not handling durations <= 0 in test_timing(), since duration is unsigned and 0 is disallowed by the caller. Author: Lukas Fittl <lukas@fittl.com> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/CAP53Pkyxv3-3gX+aOxC5tX0p2v9RHU+XH0iyvb64+ZnBXj92vg@mail.gmail.com	2026-04-01 20:07:38 -04:00
Andres Freund	6e36930f9a	read_stream: Prevent distance from decaying too quickly Until now we reduced the look-ahead distance by 1 on every hit, and doubled it on every miss. That is problematic because there are very common IO patterns where this prevents us from ever reaching a sufficiently high distance (e.g. a miss followed by a hit will never have the distance grow beyond 2). In many such cases, if we had ever reached a sufficient look-ahead distance, things would have been fine, because we grow the distance faster than we decrease it. One might think that the most obvious answer to this problem would be to never reduce the distance. However, that would not work well, as (particularly with upcoming users of read streams), it is reasonably common to at first have a lot of misses and then to transition to a fully cached workload, e.g. because the same blocks are needed repeatedly within one stream. Doing unnecessarily deep readahead can be costly, due to having to pin a lot more buffers, which increases CPU overhead. Because the cost of a synchronously handled miss can be very high (multiple milliseconds for every IO with commonly used storage) compared to the CPU overhead of keeping the distance too high, we want to err on the side of not reducing the distance too early. The insight that a decrease of the distance by 1 at every hit may be ok at large distances, but not at low distances, shows a way out: If we only allow decreasing the distance once there were no misses for our maximum look-ahead distance, we will keep the distance high as long as readahead has a chance to do IO asynchronously, but not commonly when not. Several folks have written variants of this patch, including at least Thomas Munro, Melanie Plageman and me. Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com> Discussion: https://postgr.es/m/f3xxfrkafjxpyqxywcxricxgyizjirfceychyxsgn7bwjp5eda@kwbduhy7tfmu Discussion: https://postgr.es/m/CA+hUKGL2PhFyDoqrHefqasOnaXhSg48t1phs3VM8BAdrZqKZkw@mail.gmail.com Discussion: https://postgr.es/m/CAH2-Wz%3DkMg3PNay96cHMT0LFwtxP-cQSRZTZzh1Cixxf8G%3Dzrw%40mail.gmail.com	2026-04-01 19:50:03 -04:00
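The difference between the old and new distance policies in `6e36930f9a` can be seen in a small simulation. The following Python sketch is not the read_stream code: the function name, the starting distance of 1, and the reading of "no misses for our maximum look-ahead distance" as a count of consecutive hits are all assumptions made for illustration.

```python
def simulate(pattern, max_distance, hold_off):
    """Return the look-ahead distance after each access in `pattern`.

    pattern:  iterable of booleans, True = cache hit, False = miss.
    hold_off: if False, every hit shrinks the distance by 1 (old policy).
              If True, a hit may shrink the distance only after
              `max_distance` consecutive hits since the last miss
              (assumed interpretation of the new policy).
    """
    distance = 1
    hits_needed = 0  # hits still required before shrinking is allowed
    history = []
    for hit in pattern:
        if not hit:
            # A miss doubles the distance, capped at the maximum.
            distance = min(distance * 2, max_distance)
            hits_needed = max_distance if hold_off else 0
        elif hits_needed > 0:
            hits_needed -= 1
        elif distance > 1:
            distance -= 1
        history.append(distance)
    return history


# The problematic pattern from the commit message: miss, hit, miss, hit, ...
alternating = [False, True] * 8
old = simulate(alternating, max_distance=16, hold_off=False)
new = simulate(alternating, max_distance=16, hold_off=True)

# A burst of misses followed by a fully cached phase.
cached_later = simulate([False] * 5 + [True] * 40, max_distance=16, hold_off=True)
```

Under these assumptions the old policy oscillates between 1 and 2 on the alternating pattern, while the hold-off variant climbs to the maximum and still decays back to 1 once the workload becomes fully cached.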
Andres Freund	cceb1bf45e	read_stream: Issue IO synchronously while in fast path While in fast-path, execute any IO that we might encounter synchronously. Because we are, in that moment, not reading ahead, dispatching any occasional IO to workers has the dispatch overhead, without any realistic chance of the IO completing before we need it. This helps io_method=worker performance for workloads that have only occasional cache misses, but where those occasional misses still take long enough to matter. It is likely this is only measurable with fast local storage or workloads with the data in the kernel page cache, as with remote storage the IO latency, not the dispatch-to-worker latency, is the determining factor. Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com> Discussion: https://postgr.es/m/f3xxfrkafjxpyqxywcxricxgyizjirfceychyxsgn7bwjp5eda@kwbduhy7tfmu Discussion: https://postgr.es/m/CAH2-Wz%3DkMg3PNay96cHMT0LFwtxP-cQSRZTZzh1Cixxf8G%3Dzrw%40mail.gmail.com	2026-04-01 19:22:44 -04:00
Heikki Linnakangas	1bdbb211bb	Make ShmemIndex visible in the pg_shmem_allocations view Reviewed-by: Tomas Vondra <tomas@vondra.me> Discussion: https://www.postgresql.org/message-id/01ab1d41-3eda-4705-8bbd-af898f5007f1@iki.fi	2026-04-01 23:56:51 +03:00
Álvaro Herrera	db89a47115	Give an 'options' parameter to tuple_delete/_update The tuple_insert() method already has an equivalent argument, so this makes sense just on consistency grounds, for future growth. table_delete() can immediately use it to carry the 'changingPart' boolean; for table_update we don't have any options at present. Author: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> (older version) Reviewed-by: Zsolt Parragi <zsolt.parragi@percona.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Antonin Houska <ah@cybertec.at> Discussion: https://postgr.es/m/202603171606.kf6pmhscqbqz@alvherre.pgsql	2026-04-01 20:26:57 +02:00
Peter Eisentraut	8e72d914c5	Add UPDATE/DELETE FOR PORTION OF This is an extension of the UPDATE and DELETE commands to do a "temporal update/delete" based on a range or multirange column. The user can say UPDATE t FOR PORTION OF valid_at FROM '2001-01-01' TO '2002-01-01' SET ... (or likewise with DELETE) where valid_at is a range or multirange column. The command is automatically limited to rows overlapping the targeted portion, and only history within those bounds is changed. If a row represents history partly inside and partly outside the bounds, then the command truncates the row's application time to fit within the targeted portion, then it inserts one or more "temporal leftovers": new rows containing all the original values, except with the application-time column changed to only represent the untouched part of history. To compute the temporal leftovers that are required, we use the *_minus_multi set-returning functions defined in `5eed8ce50c`. - Added bison support for FOR PORTION OF syntax. The bounds must be constant, so we forbid column references, subqueries, etc. We do accept functions like NOW(). - Added logic to executor to insert new rows for the "temporal leftover" part of a record touched by a FOR PORTION OF query. - Documented FOR PORTION OF. - Added tests. Author: Paul A. Jungwirth <pj@illuminatedcomputing.com> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://www.postgresql.org/message-id/flat/ec498c3d-5f2b-48ec-b989-5561c8aa2024%40illuminatedcomputing.com	2026-04-01 19:06:03 +02:00
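The "temporal leftovers" rule in `8e72d914c5` can be sketched in Python for the simple case of a single half-open range. This is only an illustration of the arithmetic, not the executor logic or the `*_minus_multi` functions; the function name `for_portion_of` is hypothetical and multiranges are not handled.

```python
from datetime import date


def for_portion_of(row, target):
    """Split a row's application-time range against a targeted portion.

    row, target: half-open (start, end) tuples of comparable values.
    Returns (touched, leftovers):
      touched   - the part of `row` inside `target`, or None if the row
                  does not overlap the target and is left alone
      leftovers - the untouched parts of `row` that would be re-inserted
                  as new rows with the original column values
    """
    row_start, row_end = row
    tgt_start, tgt_end = target
    lo, hi = max(row_start, tgt_start), min(row_end, tgt_end)
    if lo >= hi:
        return None, []  # no overlap: the command does not touch this row
    leftovers = []
    if row_start < lo:
        leftovers.append((row_start, lo))  # history before the portion
    if hi < row_end:
        leftovers.append((hi, row_end))    # history after the portion
    return (lo, hi), leftovers


# A row valid from mid-2000 to 2003, updated FOR PORTION OF 2001.
touched, leftovers = for_portion_of(
    (date(2000, 6, 1), date(2003, 1, 1)),
    (date(2001, 1, 1), date(2002, 1, 1)),
)
```

Here the updated row is truncated to the year 2001 and two leftover rows cover the history before and after it.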
Álvaro Herrera	ec2f81766a	Fix vicinity of tuple_insert to use uint32, not int, for options Oversight in commit `1bd6f22f43`: I was way too optimistic about the compiler letting me know what variables needed to be updated, and missed a few of them. Clean it up. Author: Álvaro Herrera <alvherre@kurilemu.de> Reported-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/40E570EE-5A60-49D8-B8F7-2F8F2B7C8DFA@gmail.com	2026-04-01 18:14:51 +02:00
Dean Rasheed	f7f4052a4e	Add support for extended statistics on virtual generated columns. This allows both univariate and multivariate statistics to be built on virtual generated columns and expressions that refer to virtual generated columns. The restriction disallowing extended statistics on a single column is lifted in the case of a single virtual generated column, since it is treated as a single expression. In the catalogs, references to virtual generated columns are stored as-is. They are expanded at ANALYZE time to build the statistics, and at planning time to allow the optimizer to make use of the statistics. This allows the statistics to be correctly rebuilt using ANALYZE, if a column's generation expression is altered (which causes any existing statistics data to be deleted). Author: Yugo Nagata <nagata@sraoss.co.jp> Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com> Discussion: https://postgr.es/m/20250422181006.dd6f9d1d81299f5b2ad55e1a@sraoss.co.jp	2026-04-01 17:02:24 +01:00
Nathan Bossart	196bf448e0	doc: Add missing description for DROP SUBSCRIPTION IF EXISTS. Oversight in commit `665d1fad99`. Author: Peter Smith <smithpb2250@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CAHut%2BPv72haFerrCdYdmF6hu6o2jKcGzkXehom%2BsP-JBBmOVDg%40mail.gmail.com Backpatch-through: 14	2026-04-01 09:48:48 -05:00
Andres Freund	513374a47a	bufmgr: Return whether WaitReadBuffers() needed to wait Thanks to the previous commit, pgaio_wref_check_done() will now detect whether IO has completed even if userspace has not yet consumed the kernel completion. This knowledge can be useful for callers of WaitReadBuffers() to know whether it needed to wait or not, e.g. for adjusting read-ahead aggressiveness or for instrumentation. Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com> Discussion: https://postgr.es/m/f3xxfrkafjxpyqxywcxricxgyizjirfceychyxsgn7bwjp5eda@kwbduhy7tfmu Discussion: https://postgr.es/m/CAH2-Wz%3DkMg3PNay96cHMT0LFwtxP-cQSRZTZzh1Cixxf8G%3Dzrw%40mail.gmail.com Discussion: https://postgr.es/m/a177a6dd-240b-455a-8f25-aca0b1c08c6e@vondra.me	2026-04-01 09:26:43 -04:00
Andres Freund	6e648e353f	aio: io_uring: Allow IO methods to check if IO completed in the background Until now pgaio_wref_check_done() with io_method=io_uring would not detect if IOs are known to have completed to the kernel, but the completion has not yet been consumed by userspace. This can lead to inferior performance and also makes it harder to use smarter feedback logic in read_stream, because we cannot use knowledge about whether an IO completed to control the readahead distance. This commit just adds the io_uring specific infrastructure. Later commits will return whether a wait was needed from WaitReadBuffers() and then use that knowledge. Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com> Discussion: https://postgr.es/m/f3xxfrkafjxpyqxywcxricxgyizjirfceychyxsgn7bwjp5eda@kwbduhy7tfmu Discussion: https://postgr.es/m/CAH2-Wz%3DkMg3PNay96cHMT0LFwtxP-cQSRZTZzh1Cixxf8G%3Dzrw%40mail.gmail.com	2026-04-01 09:26:43 -04:00
Amit Langote	edee563456	Make FastPathMeta self-contained by copying FmgrInfo structs FastPathMeta stored pointers into ri_compare_cache entries via compare_entries[], creating a dependency on that cache remaining stable. If ri_compare_cache entries were invalidated after fpmeta was populated, the pointers would dangle. Replace compare_entries[] with inline copies of the two FmgrInfo fields actually needed (cast_func_finfo and eq_opr_finfo), copied at populate time via fmgr_info_copy(). fpmeta now depends only on riinfo remaining valid, which is already handled by the invalidation callback. Introduced by commit `2da86c1ef9` ("Add fast path for foreign key constraint checks"), noticed while reviewing code for robustness under CLOBBER_CACHE_ALWAYS. Discussion: https://postgr.es/m/CA+HiwqFQ+ZA7hSOygv4uv_t75B3r0_gosjadetCsAEoaZwTu6g@mail.gmail.com	2026-04-01 18:43:40 +09:00
Amit Langote	e484b0eea6	Fix two issues in fast-path FK check introduced by commit `2da86c1ef9` First, under CLOBBER_CACHE_ALWAYS, the RI_ConstraintInfo entry can be invalidated by relcache callbacks triggered inside table_open() or index_open(), leaving ri_FastPathCheck() calling ri_populate_fastpath_metadata() with a stale entry whose valid flag is false. Fix by moving the fpmeta initialization to after ri_CheckPermissions(), reloading riinfo first to ensure it is valid, then calling ri_ExtractValues() and build_index_scankeys() immediately after before any further operations that could trigger invalidation. Second, fpmeta allocated in TopMemoryContext was not freed when the entry was invalidated in InvalidateConstraintCacheCallBack(), leaking memory each time the constraint cache entry was recycled. Fix by freeing and NULLing fpmeta at invalidation time. Noticed locally when testing with CLOBBER_CACHE_ALWAYS. Discussion: https://postgr.es/m/CA+HiwqGBU__7-VZZhQWQ3EQuwLYNPd9==ngnzduhGWKHMj9mvw@mail.gmail.com	2026-04-01 17:30:33 +09:00
John Naylor	f6bd9f0fe2	Skip common prefixes during radix sort During the counting step, keep track of the bits that are the same for the entire input. If we counted only a single distinct byte, the next recursion will start at the next byte position that has more than one distinct byte in the input. This allows us to skip over multiple passes where the byte is the same for the entire input. This provides a significant speedup for integers that have some upper bytes with all-zeros or all-ones, which is common. Reviewed-by: Chengpeng Yan <chengpeng_yan@outlook.com> Reviewed-by: ChangAo Chen <cca5507@qq.com> Discussion: https://postgr.es/m/CANWCAZYpGMDSSwAa18fOxJGXaPzVdyPsWpOkfCX32DWh3Qznzw@mail.gmail.com	2026-04-01 14:18:57 +07:00
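The prefix-skipping idea in `f6bd9f0fe2` can be illustrated with a toy most-significant-byte radix sort in Python. This is a sketch of the technique, not PostgreSQL's implementation: it sorts non-negative integers only, distributes out of place for brevity, and the names `radix_sort`, `_sort`, and the `levels` trace parameter are hypothetical.

```python
def radix_sort(values, key_bytes=8, levels=None):
    """MSD radix sort of non-negative ints that fit in `key_bytes` bytes.

    If `levels` is a list, the byte position of every counting pass is
    appended to it, which makes skipped positions visible.
    """
    out = list(values)
    _sort(out, 0, len(out), key_bytes - 1, levels)
    return out


def _sort(a, lo, hi, level, levels):
    if hi - lo < 2 or level < 0:
        return
    if levels is not None:
        levels.append(level)
    shift = level * 8
    counts = [0] * 256
    first = a[lo]
    diff = 0  # bits that differ anywhere in this partition
    for i in range(lo, hi):
        counts[(a[i] >> shift) & 0xFF] += 1
        diff |= a[i] ^ first
    if counts[(first >> shift) & 0xFF] == hi - lo:
        # Only one distinct byte here: jump straight to the highest
        # lower byte position where the inputs actually differ.
        low_diff = diff & ((1 << shift) - 1)
        if low_diff:
            _sort(a, lo, hi, (low_diff.bit_length() - 1) // 8, levels)
        return
    # Distribute into buckets by the current byte, then recurse.
    buckets = [[] for _ in range(256)]
    for i in range(lo, hi):
        buckets[(a[i] >> shift) & 0xFF].append(a[i])
    pos = lo
    for bucket in buckets:
        if bucket:
            a[pos:pos + len(bucket)] = bucket
            _sort(a, pos, pos + len(bucket), level - 1, levels)
            pos += len(bucket)


trace = []
result = radix_sort([513, 2, 258, 1], levels=trace)
```

For these small values the upper six bytes are all zero, so after the first counting pass at byte 7 the sort jumps directly to byte 1 instead of making a pass per byte.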
Fujii Masao	21b018e7ea	Reduce log level of some logical decoding messages from LOG to DEBUG1 Previously some logical decoding messages (e.g., "logical decoding found consistent point") were logged at level LOG, even though they provided low-level, developer-oriented information that DBAs were typically not interested in. Since these messages can occur routinely (for example, when repeatedly calling pg_logical_slot_get_changes() to obtain the changes from logical decoding), logging them at LOG can be overly verbose. This commit reduces their log level to DEBUG1 to avoid unnecessary log noise. This change applies to a small set of messages for now. Additional messages may be adjusted similarly in the future. Even with this change, if these messages from walsender still need to be observed, enabling DEBUG1 logging selectively for walsender (e.g., log_min_messages = 'warning,walsender:debug1') would be helpful to avoid increasing overall log volume. Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> Discussion: https://postgr.es/m/CAHGQGwGTyHgtD9tyN664x6vQ8Q1G53H7ZUCgBU9_X=nLt3f1QA@mail.gmail.com	2026-04-01 15:43:02 +09:00
Peter Eisentraut	76f4b92bac	Use standard C23 and C++ attributes if available Use the standard C23 and C++ attributes [[nodiscard]], [[noreturn]], and [[maybe_unused]], if available. This makes pg_nodiscard and pg_attribute_unused() available in not-GCC-compatible compilers that support C23 as well as in C++. For pg_noreturn, we can now drop the GCC-specific and MSVC-specific fallbacks, because the C11 and the C++ implementation will now cover all required cases. Note, in a few places, we need to change the position of the attribute because it's not valid in that place in C23. Discussion: https://www.postgresql.org/message-id/flat/pxr5b3z7jmkpenssra5zroxi7qzzp6eswuggokw64axmdixpnk@zbwxuq7gbbcw	2026-04-01 08:15:02 +02:00
Peter Eisentraut	c05ad248f9	Enable test_cplusplusext with MSVC The test_cplusplusext test module has so far been disabled on MSVC. The only remaining problem now is that designated initializers, as used in PG_MODULE_MAGIC, require C++20. (With GCC and Clang they work in older C++ versions as well.) This adds another test in the top-level meson.build to check that the compiler supports C++20 designated initializers. This is not required, we are just checking and recording the answer. If yes, we can enable the test module. Most current compilers likely won't be in C++20 mode by default. This doesn't change that; we are not doing anything to try to switch the compiler into that mode. This might be a separate project, but for now we'll leave that for the user or the test scaffolding. The VS task on Cirrus CI is changed to provide the required flag to turn on C++20 mode. There is no equivalent change in configure, since this change mainly targets MSVC. Co-authored-by: Jelte Fennema-Nio <postgres@jeltef.nl> Discussion: https://www.postgresql.org/message-id/flat/CAGECzQR21OnnKiZO_1rLWO0-16kg1JBxnVq-wymYW0-_1cUNtg%40mail.gmail.com	2026-04-01 07:48:47 +02:00
Amit Kapila	6b0550c45d	Fix miscellaneous issues in EXCEPT publication clause. Improve documentation regarding multiple publications and partition hierarchies. Refine error reporting for excluded relations. Consolidate docs by using table_object instead of expanded table syntax in publication commands. Also includes minor test cleanup and naming fixes. Reported-by: Peter Smith <smithpb2250@gmail.com> Author: vignesh C <vignesh21@gmail.com> Reviewed-by: Peter Smith <smithpb2250@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/CALDaNm1CiBYcteE_jjPA4BPHfX30dg9eTTTkJgkjY5tgE7t=bQ@mail.gmail.com Discussion: https://postgr.es/m/CALDaNm3=JrucjhiiwsYQw5-PGtBHFONa6F7hhWCXMsGvh=tamA@mail.gmail.com	2026-04-01 09:13:43 +05:30
Thomas Munro	852de579a6	Fix pg_waldump/t/001_basic.pl with BSD tar on ZFS. The new test fails with an error about a missing WAL file on that stack, because it is archived in GNU tar's --sparse --format=posix format. BSD tar uses that format by default, unlike GNU tar itself, and ZFS triggers it by implicitly creating sparse files when it sees a lot of zeroes. The problem will surely also affect real users of the new tar support in pg_waldump (commit `b15c1513`) and pg_verifybackup (commit `b3cf461b3`) on such systems. Ideas under discussion, but for now the test is made to pass by disabling sparse file detection in BSD tar. Diagnosed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Andrew Dunstan <andrew@dunslane.net> Discussion: https://postgr.es/m/1624716.1774736283%40sss.pgh.pa.us	2026-04-01 14:37:01 +13:00
Andres Freund	c0af4eb4e7	bufmgr: Fix ordering of checks in PinBuffer() The check for skip_if_not_valid added in `819dc118c0` was put at the start of the loop. A CAS loop in theory does allow to make that check in a race free manner. However, just after the check, there's a old_buf_state = WaitBufHdrUnlocked(buf); which introduces a race, because it would allow BM_VALID to be cleared, after the skip_if_not_valid check. Fix by restarting the loop after WaitBufHdrUnlocked(). Reported-by: Yura Sokolov <y.sokolov@postgrespro.ru> Discussion: https://postgr.es/m/5bf667f3-5270-4b19-a08f-0facbecdff68@postgrespro.ru	2026-03-31 19:24:58 -04:00
Tom Lane	273d26b75e	Doc: warn that parallel pg_restore may fail if --no-schema was used. If the archive file was made with --no-schema or related options then it likely does not have enough dependency information to ensure that parallel restore will choose a workable restore order. Document this. In passing, do some minor wordsmithing on new nearby text about restoring from pg_dumpall archives. Author: vaibhave postgres <postgresvaibhave@gmail.com> Reviewed-by: David G. Johnston <david.g.johnston@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAM_eQjzTLtt1X9WKvMV6Rew0UvxT3mmhimZa9WT-vqaPjmXk-g@mail.gmail.com	2026-03-31 16:36:01 -04:00
Melanie Plageman	8519251ee9	Fix test_aio read_buffers() to work without cassert In a production build, StartReadBuffers() doesn't populate all fields of a ReadBuffersOperation for a buffer hit because no callers use them (they are populated in assert builds). read_buffers() (a test-only function) relied on some of these fields, so AIO tests failed on non-assert builds (discovered on the buildfarm after commit `020c02bd90`). Fix by tracking the required information ourselves in read_buffers() and avoiding reliance on the ReadBuffersOperation unless we know that we did IO. Reported-by: Alexander Lakhin <exclusion@gmail.com> Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/9ce8f5d8-8ab2-4aa2-b062-c5d74161069c%40gmail.com	2026-03-31 15:02:52 -04:00
Jacob Champion	e020a897ef	oauth: Don't log discovery connections by default Currently, when the client sends a parameter discovery request within OAUTHBEARER, the server logs the attempt with FATAL: OAuth bearer authentication failed for user These log entries are difficult to distinguish from true authentication failures, and by default, libpq sends a discovery request as part of every OAuth connection, making them annoyingly noisy. Use the new PG_SASL_EXCHANGE_ABANDONED status to suppress them. Patch by Zsolt Parragi, with some additional comments added by me. Author: Zsolt Parragi <zsolt.parragi@percona.com> Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CAN4CZFPim7hUiyb7daNKQPSZ8CvQRBGkVhbvED7yZi8VktSn4Q%40mail.gmail.com	2026-03-31 11:47:33 -07:00
Jacob Champion	c4ff16339f	sasl: Allow backend mechanisms to "abandon" exchanges Introduce PG_SASL_EXCHANGE_ABANDONED, which allows CheckSASLAuth to suppress the failing log entry for any SASL exchange that isn't actually an authentication attempt. This is desirable for OAUTHBEARER's discovery exchanges (and a subsequent commit will make use of it there). This might have some overlap in the future with in-band aborts for SASL exchanges, but it's intentionally not named _ABORTED to avoid confusion. (We don't currently support clientside aborts in our SASL profile.) Adapted from a patch by Zsolt Parragi. Author: Zsolt Parragi <zsolt.parragi@percona.com> Co-authored-by: Jacob Champion <jacob.champion@enterprisedb.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CAN4CZFPim7hUiyb7daNKQPSZ8CvQRBGkVhbvED7yZi8VktSn4Q%40mail.gmail.com	2026-03-31 11:47:31 -07:00
Jacob Champion	c2bca7cc96	Add FATAL_CLIENT_ONLY to ereport/elog SASL exchanges must end with either an AuthenticationOk or an ErrorResponse from the server, and the standard way to produce an ErrorResponse packet is for auth_failed() to call ereport(FATAL). This means that there's no way for a SASL mechanism to suppress the server log entry if the "authentication attempt" was really just a query for authentication metadata, as is done with OAUTHBEARER. Following the example of `1f9158ba4`, add a FATAL_CLIENT_ONLY elevel. This will allow ClientAuthentication() to choose not to log a particular failure, while still correctly ending the authentication exchange before process exit. (The provenance of this patch is convoluted: since it's a mechanical copy-paste of `1f9158ba4`, both Zsolt Parragi and I produced nearly identical versions independently, and Andrey Borodin reviewed Zsolt's version. Tom Lane is the author of `1f9158ba4`, but I don't want to imply that he's signed off on this adaptation. See Discussion.) Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru> Discussion: https://postgr.es/m/CAN4CZFPim7hUiyb7daNKQPSZ8CvQRBGkVhbvED7yZi8VktSn4Q%40mail.gmail.com	2026-03-31 11:47:29 -07:00
Jacob Champion	09532b4040	libpq: Allow developers to reimplement libpq-oauth For PG19, since we won't have the ability to officially switch out flow plugins, relax the flow-loading code to not require the internal init function. Modules that don't have one will be treated as custom user flows in error messages. This will let bleeding-edge developers more easily test out the API and provide feedback for PG20, by telling the runtime linker to find a different libpq-oauth. It remains undocumented for end users. Reviewed-by: Zsolt Parragi <zsolt.parragi@percona.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CAOYmi%2BmrGg%2Bn_X2MOLgeWcj3v_M00gR8uz_D7mM8z%3DdX1JYVbg%40mail.gmail.com	2026-03-31 11:47:26 -07:00
Jacob Champion	0af4d402cb	libpq: Poison the v2 part of a v1 Bearer request The new PGoauthBearerRequestV2 API (which has similarities to the "subclass" pointer architecture in use by the backend, for Nodes) carries the risk of a developer ignoring the type of hook in use and just casting directly to the V2 struct. This will appear to work fine in 19, but crash (or worse) when speaking to libpq 18. However, we're in a unique position to catch this problem, because we have tight control over the struct. Add poisoning code to the v1 path which does the following: - masks the v2 request->issuer pointer, to hopefully point at nonsense memory - abort()s if the v2 request->error is assigned by the hook - attempts to cover both with VALGRIND_MAKE_MEM_NOACCESS for the duration of the callback (a potential AddressSanitizer implementation is left for future work) The struct is unpoisoned after the call, so we can switch back to the v2 internal implementation when necessary. Reviewed-by: Zsolt Parragi <zsolt.parragi@percona.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CAOYmi%2BnCg5upBVOo_UCSjMfO%3DYMkZXcSEsgaADKXqerr5wahZQ%40mail.gmail.com	2026-03-31 11:47:23 -07:00
Nathan Bossart	771fe0948c	Avoid including vacuum.h in tableam.h and heapam.h. Commit `2252fcd427` modified some function prototypes in tableam.h and heapam.h to take a VacuumParams argument instead of a pointer, which required including vacuum.h in those headers. vacuum.h has a reasonably large dependency tree, and headers like tableam.h are widely included, so this is not ideal. To fix, change the functions in question to accept a "const VacuumParams *" argument instead. That allows us to use a forward declaration for VacuumParams and avoid including vacuum.h. Since vacuum_rel() needs to scribble on the params argument, we still pass it by value to that function so that the original struct is not modified. Reported-by: Andres Freund <andres@anarazel.de> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/rzxpxod4c4la62yvutyrvgoyilrl2fx55djaf2suidy7np5m6c%403l2ln476eadh	2026-03-31 12:43:52 -05:00
Tom Lane	960382e3e9	Doc: remove bogus claim that tsvectors can have up to 2^64 entries. This is nonsense on its face, since the textsearch parsing logic generally uses int32 to count words (see, eg, struct ParsedText). Not to mention that we don't support input strings larger than 1GB. The actual limitation of interest is documented nearby: a tsvector can't be larger than 1MB, thanks to 20-bit offset fields within it (see WordEntry.pos). That constrains us to well under 256K lexemes per tsvector, depending on how many positions are stored per lexeme. It seems sufficient therefore to just remove the bit about number of lexemes. Author: Dharin Shah <dharinshah95@gmail.com> Discussion: https://postgr.es/m/CAOj6k6d0YO6AO-bhxkfUXPxUi-+YX9-doh2h5D5z0Bm8D2w=OA@mail.gmail.com	2026-03-31 11:49:54 -04:00
Tom Lane	fb7a9050d5	Doc: improve explanation of GiST compress/decompress methods. The docs previously didn't explain that leaf and non-leaf keys could be treated differently, even though many of our opclasses do exactly that. It also wasn't explained how that relates to the STORAGE option, particularly since only one storage type can be specified for both leaf and non-leaf keys. While here, reorganize the text slightly, rather than sticking additional detail into what's supposed to be a brief summary paragraph. Author: Paul A Jungwirth <pj@illuminatedcomputing.com> Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CA+renyWs5Np+FLSYfL+eu20S4U671A3fQGb-+7e22HLrD1NbYw@mail.gmail.com	2026-03-31 11:23:26 -04:00
Heikki Linnakangas	7b424e3108	Change the signature of dynahash's alloc function Instead of passing the current memory context to the alloc function via a shared variable, pass it directly as an argument. Reviewed-by: Tomas Vondra <tomas@vondra.me> Discussion: https://www.postgresql.org/message-id/01ab1d41-3eda-4705-8bbd-af898f5007f1@iki.fi	2026-03-31 16:55:03 +03:00
Heikki Linnakangas	dde69621c3	Remove HASH_SEGMENT option It's been unused forever. There's no urgency in removing it now, but it was just something that caught my eye. Aleksander Alekseev proposed this a long time ago [0], but Tom Lane was worried about third-party extensions using it. I believe that's a non-issue: I tried grepping through all extensions found on github and didn't find any references to HASH_SEGMENT. [0] https://www.postgresql.org/message-id/20160418180711.55ac82c0@fujitsu Reviewed-by: Tomas Vondra <tomas@vondra.me> Discussion: https://www.postgresql.org/message-id/01ab1d41-3eda-4705-8bbd-af898f5007f1@iki.fi	2026-03-31 16:45:28 +03:00
Peter Eisentraut	a0dd0702e4	Fix cross variable references in graph pattern causing segfault When converting the WHERE clause in an element pattern, generate_query_for_graph_path() calls replace_property_refs() to replace the property references in it. Only the current graph element pattern is passed as the context for replacement. If there are references to variables from other element patterns, it causes a segmentation fault (an assertion failure in an Assert-enabled build) since it does not find the path_element object corresponding to those variables. We do not support forward and backward variable references within a graph table clause. Hence prohibit all the cross references. Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reported-by: Man Zeng <zengman@halodbtech.com> Reviewed-by: Henson Choi <assam258@gmail.com> Reviewed-by: Junwang Zhao <zhjwpku@gmail.com> Discussion: https://www.postgresql.org/message-id/CAExHW5u6AoDfNg4%3DR5eVJn_bJn%3DC%3DwVPrto02P_06fxy39fniA%40mail.gmail.com	2026-03-31 11:47:19 +02:00
Peter Eisentraut	c5b3253b8a	Property references are preferred over regular column references When a ColumnRef can be resolved both as a graph table property reference and as a lateral table column reference, prefer the graph table property reference, since element pattern variables in the GRAPH_TABLE clause form the innermost namespace. Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-by: Henson Choi <assam258@gmail.com> Discussion: https://www.postgresql.org/message-id/CAExHW5u6AoDfNg4%3DR5eVJn_bJn%3DC%3DwVPrto02P_06fxy39fniA%40mail.gmail.com	2026-03-31 11:47:19 +02:00
Amit Langote	68a8601ee9	Fix use-after-free in ri_LoadConstraintInfo conindid was read from conForm after ReleaseSysCache(tup). Move the read to before the release. Introduced by commit `2da86c1ef9`. Per buildfarm member prion. Discussion: https://postgr.es/m/CA+HiwqGGYjN6F2oL7yAk=hvSs-sj3TPqZ9JC9iyLkCqJadECrw@mail.gmail.com	2026-03-31 17:04:44 +09:00
Daniel Gustafsson	097ab69d17	Formalize WAL record for XLOG_CHECKPOINT_REDO XLOG_CHECKPOINT_REDO only contains the wal_level copied straight in without an encapsulating record structure. While it works, it makes future uses of XLOG_CHECKPOINT_REDO hard as there is nowhere to put new data items. This fix was inspired by the online checksums patch, which adds data to this record, but this change has value on its own. Author: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Discussion: https://postgr.es/m/c92b5d8b-bc03-47bc-b209-2e4a719eee32@iki.fi	2026-03-31 09:38:01 +02:00
Peter Eisentraut	82a7cbea74	Disable some C++ warnings in MSVC Flexible array members, as used in many PostgreSQL header files, are not a C++ feature. MSVC warns about these. Disable the warning. (GCC and Clang accept them, but they would warn in -pedantic mode.) Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl> Discussion: https://www.postgresql.org/message-id/flat/CAGECzQR21OnnKiZO_1rLWO0-16kg1JBxnVq-wymYW0-_1cUNtg%40mail.gmail.com	2026-03-31 08:42:26 +02:00
Peter Eisentraut	4c83f12535	meson: Make room for C++-only warning flags for MSVC Refactor the MSVC warning option handling to have a list of common flags and lists of flags specific to C and C++. Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl> Discussion: https://www.postgresql.org/message-id/flat/CAGECzQR21OnnKiZO_1rLWO0-16kg1JBxnVq-wymYW0-_1cUNtg%40mail.gmail.com	2026-03-31 08:42:22 +02:00
Amit Langote	2da86c1ef9	Add fast path for foreign key constraint checks Add a fast-path optimization for foreign key checks that bypasses SPI by directly probing the unique index on the referenced table. Benchmarking shows ~1.8x speedup for bulk FK inserts (int PK/int FK, 1M rows, where PK table and index are cached). The fast path applies when the referenced table is not partitioned and the constraint does not involve temporal semantics. Otherwise, the existing SPI path is used. This optimization covers only the referential check trigger (RI_FKey_check). The action triggers (CASCADE, SET NULL, SET DEFAULT, RESTRICT, NO ACTION) must find rows on the FK side to modify, which requires a table scan with no guaranteed index available, and then execute DML against those rows through the full executor path including any triggered actions. Replicating that without substantial code duplication is not feasible, so those triggers remain on the SPI path. Extending the fast path to action triggers remains possible as future work if the necessary infrastructure is built. The new ri_FastPathCheck() function extracts the FK values, builds scan keys, performs an index scan, and locks the matching tuple with LockTupleKeyShare via ri_LockPKTuple(), which handles the RI-specific subset of table_tuple_lock() results. If the locked tuple was reached by chasing an update chain (tmfd.traversed), recheck_matched_pk_tuple() verifies that the key is still the same, emulating EvalPlanQual. The scan uses GetTransactionSnapshot(), matching what the SPI path uses (via _SPI_execute_plan pushing GetTransactionSnapshot() as the active snapshot). Under READ COMMITTED this is a fresh snapshot; under REPEATABLE READ / SERIALIZABLE it is the frozen transaction- start snapshot, so PK rows committed after the transaction started are not visible. 
The ri_CheckPermissions() function performs schema USAGE and table SELECT checks, matching what the SPI path gets implicitly through the executor's permission checks. The fast path also switches to the PK table owner's security context (with SECURITY_NOFORCE_RLS) before the index probe, matching the SPI path where the query runs as the table owner. ri_HashCompareOp() is adjusted to handle cross-type equality operators (e.g. int48eq for int4 PK / int8 FK) which can appear in conpfeqop. The existing code asserted same-type operators only, which was correct for its existing callers (ri_KeysEqual compares same-type FK column values via ff_eq_oprs), but the fast path is the first caller to pass pf_eq_oprs, which can be cross-type. Per-key metadata (compare entries, operator procedures, strategy numbers) is cached in RI_ConstraintInfo via ri_populate_fastpath_metadata() on first use, eliminating repeated calls to ri_HashCompareOp() and get_op_opfamily_properties(). conindid and pk_is_partitioned are also cached at constraint load time, avoiding per-invocation syscache lookups and the need to open pk_rel before deciding whether the fast path applies. New regression tests cover RLS bypass and ACL enforcement for the fast-path permission checks. New isolation tests exercise concurrent PK updates under both READ COMMITTED and REPEATABLE READ. Author: Junwang Zhao <zhjwpku@gmail.com> Co-authored-by: Amit Langote <amitlangote09@gmail.com> Reviewed-by: Haibo Yan <tristan.yim@gmail.com> Tested-by: Tomas Vondra <tomas@vondra.me> Discussion: https://postgr.es/m/CA+HiwqF4C0ws3cO+z5cLkPuvwnAwkSp7sfvgGj3yQ=Li6KNMqA@mail.gmail.com	2026-03-31 13:49:21 +09:00
Amit Kapila	5984ea868e	Change syntax of EXCEPT TABLE clause in publication commands. Adjust the syntax of the EXCEPT clause in CREATE/ALTER PUBLICATION added in commits `fd366065e0` and `493f8c6439` to move the TABLE keyword inside the relation list. Old syntax: CREATE PUBLICATION ... FOR ALL TABLES EXCEPT TABLE (t1, ...); ALTER PUBLICATION ... SET ALL TABLES EXCEPT TABLE (t1, ...); New syntax: CREATE PUBLICATION ... FOR ALL TABLES EXCEPT (TABLE t1, ...); ALTER PUBLICATION ... SET ALL TABLES EXCEPT (TABLE t1, ...); This is to ensure that inclusion and exclusion lists can be specified in the same way. Previously, the exclusion table list could be specified as TABLE (t1, t2, t3), while the inclusion list could be specified as TABLE t1, t2, t3, or TABLE t1, TABLE t2, TABLE t3. This change is purely syntactic and does not alter behavior. Reported-by: Masahiko Sawada <sawada.mshk@gmail.com> Author: vignesh C <vignesh21@gmail.com> Author: Shlok Kyal <shlok.kyal.oss@gmail.com> Reviewed-by: shveta malik <shveta.malik@gmail.com> Reviewed-by: Peter Smith <smithpb2250@gmail.com> Reviewed-by: Zhijie Hou <houzj.fnst@fujitsu.com> Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com> Reviewed-by: SATYANARAYANA NARLAPURAM <satyanarlapuram@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/CAD21AoCC8XuwfX62qKBSfHUAoww_XB3_84HjswgL9jxQy696yw@mail.gmail.com Discussion: https://postgr.es/m/CALDaNm3=JrucjhiiwsYQw5-PGtBHFONa6F7hhWCXMsGvh=tamA@mail.gmail.com	2026-03-31 09:40:51 +05:30
Tom Lane	786552e7f2	Doc: update ddl.sgml's description of cmin and cmax. We long ago folded these two tuple header fields into one field to save space. However, nothing was done to the user-facing documentation about them, perhaps with the idea that we'd add code to emit something approximating the original definitions. That never happened and presumably never will, so update the text to reflect current reality. Author: Paul A Jungwirth <pj@illuminatedcomputing.com> Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CA+renyVYYboiTayRRE0j1oKpeB+NjEBSUXfwgEu6O0JESSmauQ@mail.gmail.com	2026-03-30 18:25:19 -04:00
Peter Eisentraut	c73e8ee061	Add warning option -Wold-style-declaration This warning has been triggered a few times via the buildfarm (see commits `8212625e53`, `2b7259f855`, `afe86a9e73`), so we might as well add it so that everyone sees it. (This is completely separate from the recently added -Wold-style-definition.) Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/aa73q1aT0A3/vke/%40ip-10-97-1-34.eu-west-3.compute.internal	2026-03-30 23:34:13 +02:00
Jacob Champion	993368113c	libpq: Add oauth_ca_file option to change CAs without debugging PG18 hid the PGOAUTHCAFILE envvar behind PGOAUTHDEBUG=UNSAFE, because I thought that any "real" production usage of private CA certificates would have them added to the Curl system trust store. But there are use cases, such as containerized environments, that prefer to manage custom CA settings more granularly; some of them consider envvar configuration of certificates to be standard practice. Move PGOAUTHCAFILE out from under the debug flag, and add an oauth_ca_file option to libpq to configure trusted CAs per connection. Patch by Jonathan Gonzalez V., with some additional wordsmithing and test organization by me. Author: Jonathan Gonzalez V. <jonathan.abdiel@gmail.com> Co-authored-by: Jacob Champion <jacob.champion@enterprisedb.com> Reviewed-by: Zsolt Parragi <zsolt.parragi@percona.com> Discussion: https://postgr.es/m/16a91d02795cb991963326a902afa764e4d721db.camel%40gmail.com	2026-03-30 14:14:45 -07:00
Nathan Bossart	bab2f27eaa	Remove bits* typedefs. In addition to removing the bits8, bits16, and bits32 typedefs, this commit replaces all uses with uint8, uint16, or uint32. bits* provided little benefit beyond establishing the intent of the variable, and they were inconsistently used for that purpose. Third-party code should instead use the corresponding uint* typedef. Suggested-by: Andres Freund <andres@anarazel.de> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> Discussion: https://postgr.es/m/absbX33E4eaA0Ity%40nathan	2026-03-30 16:12:08 -05:00
Heikki Linnakangas	40c41dc773	Use ShmemInitStruct to allocate shmem for semaphores This makes them visible in pg_shmem_allocations Reviewed-by: Tomas Vondra <tomas@vondra.me> Discussion: https://www.postgresql.org/message-id/01ab1d41-3eda-4705-8bbd-af898f5007f1@iki.fi	2026-03-30 23:39:35 +03:00
Melanie Plageman	378a216187	Set pd_prune_xid on insert Now that on-access pruning can update the visibility map (VM) during read-only queries, set the page’s pd_prune_xid hint during INSERT and on the new page during UPDATE. This allows heap_page_prune_and_freeze() to set the VM the first time a page is read after being filled with tuples. This may avoid I/O amplification by setting the page all-visible when it is still in shared buffers and allowing later vacuums to skip scanning the page. It also enables index-only scans of newly inserted data much sooner. As a side benefit, this addresses a long-standing note in heap_insert() and heap_multi_insert(): aborted inserts can now be pruned on-access rather than lingering until the next VACUUM. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/flat/CAAKRu_ZMw6Npd_qm2KM%2BFwQ3cMOMx1Dh3VMhp8-V7SOLxdK9-g%40mail.gmail.com	2026-03-30 16:07:11 -04:00
Melanie Plageman	b46e1e54d0	Allow on-access pruning to set pages all-visible Many queries do not modify the underlying relation. For such queries, if on-access pruning occurs during the scan, we can check whether the page has become all-visible and update the visibility map accordingly. Previously, only vacuum and COPY FREEZE marked pages as all-visible or all-frozen. This commit implements on-access VM setting for sequential scans, tid range scans, sample scans, bitmap heap scans, and the underlying heap relation in index scans. Setting the visibility map on-access can avoid write amplification caused by vacuum later needing to set the page all-visible, which could trigger a write and potentially an FPI. It also allows more frequent index-only scans, since they require pages to be marked all-visible in the VM. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/flat/CAAKRu_ZMw6Npd_qm2KM%2BFwQ3cMOMx1Dh3VMhp8-V7SOLxdK9-g%40mail.gmail.com	2026-03-30 15:47:07 -04:00
Nathan Bossart	e3637a05dc	Add commit `874da8b1f6` to .git-blame-ignore-revs.	2026-03-30 14:35:24 -05:00
Peter Eisentraut	488ab592d9	configure: Apply -Werror=vla to C++ as well as C The comment part of `d9dd406fe2` mentioned that -Wvla is not applicable for C++. That is not fully correct: it is true that VLAs are not part of the C++ standard, and g++ with -pedantic will also warn about them as a non-standard extension. However, -Wvla itself works fine in C++ and will catch VLA usage just as in C. Fix configure.ac to apply -Werror=vla to C++ as well. There is no need to fix meson.build as it already includes it in common_warning_flags. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Suggested-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/7bf60ab1-2b5d-4a77-93ce-815072a0a014%40eisentraut.org	2026-03-30 20:55:16 +02:00
Tom Lane	7394773450	Be more careful to preserve consistency of a tuplestore. Several places in tuplestore.c would leave the tuplestore data structure effectively corrupt if some subroutine were to throw an error. Notably, if WRITETUP() failed after some number of successful calls within dumptuples(), the tuplestore would contain some memtuples pointers that were apparently live entries but in fact pointed to pfree'd chunks. In most cases this sort of thing is fine because transaction abort cleanup is not too picky about the contents of memory that it's going to throw away anyway. There's at least one exception though: if a Portal has a holdStore, we're going to call tuplestore_end() on that, even during transaction abort. So it's not cool if that tuplestore is corrupt, and that means tuplestore.c has to be more careful. This oversight demonstrably leads to crashes in v15 and before, if a holdable cursor fails to persist its data due to an undersized temp_file_limit setting. Very possibly the same thing can happen in v16 and v17 as well, though the specific test case submitted failed to fail there (cf. `095555daf`). The failure is accidentally dodged as of v18 because `590b045c3` got rid of tuplestore_end's retail tuple deletion loop. Still, it seems unwise to permit tuplestores to become internally inconsistent in any branch, so I've applied the same fix across the board. Since the known test case for this is rather expensive and doesn't fail in recent branches, I've omitted it. Bug: #19438 Reported-by: Dmitriy Kuzmin <kuzmin.db4@gmail.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/19438-9d37b179c56d43aa@postgresql.org Backpatch-through: 14	2026-03-30 13:59:58 -04:00
Heikki Linnakangas	681774315d	Replace getopt() with our re-entrant variant in the backend Some of these probably could continue using non-re-entrant getopt() even if we start using threads in the future, but it seems better to make them all anyway, so that we have a clear-cut rule of "no plain getopt() in the postgres binary". Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://www.postgresql.org/message-id/d1da5f0e-0d68-47c9-a882-eb22f462752f@iki.fi	2026-03-30 20:47:16 +03:00
Heikki Linnakangas	fd8e3f7cee	Invent a variant of getopt(3) that is thread-safe The standard getopt(3) function is neither re-entrant nor thread-safe. That's OK for current usage, but it's one more little thing we need to change in order to make the server multi-threaded. There's no standard getopt_r() function on any platform, I presume because command line arguments are usually parsed early when you start a program, before launching any threads, so there isn't much need for it. However, we call it at backend startup to parse options from the startup packet. Because there's no standard, we're free to define our own. The pg_getopt_start/next() implementation is based on the old getopt implementation; I just gathered all the state variables into a struct. The non-re-entrant getopt() function is now a wrapper around the re-entrant variant, on platforms that don't have getopt(3). getopt_long() is not used in the server, so we don't need to provide a re-entrant variant of that. Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://www.postgresql.org/message-id/d1da5f0e-0d68-47c9-a882-eb22f462752f@iki.fi	2026-03-30 20:47:13 +03:00
Heikki Linnakangas	c5f7820e57	Fix latent bug in get_stats_option_name() The function is supposed to look at the passed in 'arg' argument, but peeks at the 'optarg' global variable that's part of getopt() instead. It happened to work anyway, because all callers passed 'optarg' as the argument. Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://www.postgresql.org/message-id/d1da5f0e-0d68-47c9-a882-eb22f462752f@iki.fi	2026-03-30 20:34:48 +03:00
Melanie Plageman	50eb5faea2	Pass down information on table modification to scan nodes Pass down information to sequential scan, index [only] scan, bitmap table scan, sample scan, and TID range scan nodes on whether or not the query modifies the relation being scanned. A later commit will use this information to update the VM during on-access pruning only if the relation is not modified by the query. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru> Reviewed-by: Tomas Vondra <tomas@vondra.me> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/4379FDA3-9446-4E2C-9C15-32EFE8D4F31B%40yandex-team.ru	2026-03-30 13:27:34 -04:00
Álvaro Herrera	349bd88202	Don't use bits32 in table AM interface Seems there's near-universal dislike for the bitsXX typedefs. Revert that part of commit `1bd6f22f43` in favor of using plain uint32.	2026-03-30 19:06:33 +02:00
Melanie Plageman	dcd8cc1c85	Thread flags through begin-scan APIs Add an AM user-settable flags parameter to several of the table scan functions, one table AM callback, and index_beginscan(). This allows users to pass additional context to be used when building the scan descriptors. For index scans, a new flags field is added to IndexFetchTableData, and the heap AM saves the caller-provided flags there. This introduces an extension point for follow-up work to pass per-scan information (such as whether the relation is read-only for the current query) from the executor to the AM layer. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Tomas Vondra <tomas@vondra.me> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/2be31f17-5405-4de9-8d73-90ebc322f7d8%40vondra.me	2026-03-30 12:27:24 -04:00
Tom Lane	095555daf1	Detect pfree or repalloc of a previously-freed memory chunk. Before the major rewrite in commit `c6e0fe1f2`, AllocSetFree() would typically crash when asked to free an already-free chunk. That was an ugly but serviceable way of detecting coding errors that led to double pfrees. But since that rewrite, double pfrees went through just fine, because the "hdrmask" of a freed chunk isn't changed at all when putting it on the freelist. We'd end with a corrupt freelist that circularly links back to the doubly-freed chunk, which would usually result in trouble later, far removed from the actual bug. This situation is no good at all for debugging purposes. Fortunately, we can fix it at low cost in MEMORY_CONTEXT_CHECKING builds by making AllocSetFree() check for chunk->requested_size == InvalidAllocSize, relying on the pre-existing code that sets it that way just below. I investigated the alternative of changing a freed chunk's methodid field, which would allow detection in non-MEMORY_CONTEXT_CHECKING builds too. But that adds measurable overhead. Seeing that we didn't notice this oversight for more than three years, it's hard to argue that detecting this type of bug is worth any extra overhead in production builds. Likewise fix AllocSetRealloc() to detect repalloc() on a freed chunk, and apply similar changes in generation.c and slab.c. (generation.c would hit an Assert failure anyway, but it seems best to make it act like aset.c.) bump.c doesn't need changes since it doesn't support pfree in the first place. Ideally alignedalloc.c would receive similar changes, but in debugging builds it's impossible to reach AlignedAllocFree() or AlignedAllocRealloc() on a pfreed chunk, because the underlying context's pfree would have wiped the chunk header of the aligned chunk. But that means we should get an error of some sort, so let's be content with that. 
Per investigation of why the test case for bug #19438 didn't appear to fail in v16 and up, even though the underlying bug was still present. (This doesn't fix the underlying double-free bug; it just causes it to get detected.) Bug: #19438 Reported-by: Dmitriy Kuzmin <kuzmin.db4@gmail.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/19438-9d37b179c56d43aa@postgresql.org Backpatch-through: 16	2026-03-30 12:02:08 -04:00
Heikki Linnakangas	bd365b1ae5	Fix outdated comment on MainLWLockArray It's no longer passed down to child processes via BackendParameters in EXEC_BACKEND mode. Reported-by: Sami Imseih <samimseih@gmail.com> Discussion: https://www.postgresql.org/message-id/CAA5RZ0vPWNMvTBqyH7nqDRrHd6Y4Et5iNqXFuwpbsPOk3cL4rQ@mail.gmail.com	2026-03-30 17:13:11 +03:00
Robert Haas	e2ee95233c	pg_plan_advice: Avoid assertion failure with partitionwise aggregate. An Append node that is part of a partitionwise aggregate has no apprelids. If such a node was elided, the previous coding would attempt to call unique_nonjoin_rtekind() on a NULL pointer, which leads to an assertion failure. Insert a NULL check to prevent that. Reported-by: Alexander Lakhin <exclusion@gmail.com> Discussion: http://postgr.es/m/0afba1ce-c946-4131-972d-191d9a1c097c@gmail.com	2026-03-30 09:58:25 -04:00
Melanie Plageman	39dcd10a2c	Remove PlannedStmt->resultRelations in favor of resultRelationRelids PlannedStmt->resultRelations was an integer list of range table indexes because at the time it was added (to Query), the Bitmapset data type did not yet exist in Postgres. `0f4c170cf3` added a Bitmapset of result relations, so remove the integer list of RTIs and use the more compact resultRelationRelids. Discussion: https://postgr.es/m/CAApHDvqAOeOwCKh9g0gfxWa040%3DHyc7_oA%3DC59rjod8kXJDWyw%40mail.gmail.com	2026-03-30 09:51:28 -04:00
Melanie Plageman	0f4c170cf3	Make it cheap to check if a relation is modified by a query Save the range table indexes of result relations and row mark relations in separate bitmapsets in the PlannedStmt. Precomputing them allows cheap membership checks during execution. Together, these two groups approximate all relations that will be modified by a query. This includes relations targeted by INSERT, UPDATE, DELETE, and MERGE as well as relations with any row mark (like SELECT FOR UPDATE). Future work will use information on whether or not a relation is modified by a query in a heuristic. PlannedStmt->resultRelations is only used in a membership check, so it will be removed in a separate commit. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/F5CDD1B5-628C-44A1-9F85-3958C626F6A9%40gmail.com	2026-03-30 09:38:03 -04:00
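The cheap membership check described above can be illustrated with integers used as bit sets. This is a rough Python sketch under stated assumptions: the function names are hypothetical, a Python `int` stands in for PostgreSQL's Bitmapset, and combining the two sets with a union is only one way to read "these two groups approximate all relations that will be modified".

```python
def make_relids(range_table_indexes):
    """Build a bit set with one bit per range table index."""
    relids = 0
    for rti in range_table_indexes:
        relids |= 1 << rti
    return relids


def is_member(rti, relids):
    """Membership is a shift and a mask, independent of the set size."""
    return bool((relids >> rti) & 1)


def relation_is_modified(rti, result_relids, row_mark_relids):
    """Treat a relation as modified if it is in either precomputed set."""
    return is_member(rti, result_relids | row_mark_relids)
```

For example, with result relations at indexes 1 and 4 and a row-marked relation at index 2, `relation_is_modified(3, make_relids([1, 4]), make_relids([2]))` is false while indexes 1, 2, and 4 all test true.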
Álvaro Herrera	1bd6f22f43	Have table_insert and siblings use an unsigned type for options Using signed types can lead to bugs, such as the one fixed by commit `2a2e1b470b`. Discussion: https://postgr.es/m/44e6ze3kuunhky63wmfjxrmn72pds2whwf5ok6hpz7c4my7k2h@l65zhpcuasnf	2026-03-30 13:58:16 +02:00
Peter Eisentraut	c546f008cd	headerscheck: Avoid mutual inclusion of pg_config.h and c.h Headers that c.h includes early should not have another header included before them in the headerscheck test file, especially not c.h. A particular instance of a problem is that pg_config.h defines some symbols that c.h later undefines in some cases, such as in the code added by commit `cd083b54bd`, but there were also some before that. This only works correctly if pg_config.h is included first. pg_config_manual.h and pg_config_os.h are closely related to pg_config.h and should be treated the same way. postgres_ext.h is meant to be usable standalone, so testing it with c.h included first defeats the point. c.h also includes port.h, but this commit leaves that alone, since port.h does need some of c.h to be processed first. (But because of header guards, testing port.h separately is probably ineffective.) Discussion: https://www.postgresql.org/message-id/flat/579116be-5016-4dbc-aed0-c06f8d9f5bbb%40eisentraut.org	2026-03-30 10:14:41 +02:00
Peter Eisentraut	b36b956404	Make cast functions to type money error safe This converts the cast functions from types integer, bigint, and numeric to type money to support soft errors. Note: Casting from type money to type numeric (the other way, function cash_numeric) is not yet error safe. Author: jian he <jian.universality@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CADkLM%3Dfv1JfY4Ufa-jcwwNbjQixNViskQ8jZu3Tz_p656i_4hQ%40mail.gmail.com	2026-03-30 10:10:56 +02:00
John Naylor	ec5981c381	Remove extraneous PGDLLIMPORT Oversight from commit `3c6e8c1238`. Should be harmless, so no backpatch. Reported-by: Zsolt Parragi <zsolt.parragi@percona.com> Discussion: https://postgr.es/m/CAN4CZFM8jmh4+1rUR2c++JWK9sV85T8_mqmwHMvM0YWkTm4_dQ@mail.gmail.com	2026-03-30 14:39:13 +07:00
Peter Eisentraut	75a5914d00	Fix accidentally casting away const Recently introduced in commit `b15c151398`.	2026-03-30 09:24:10 +02:00
Peter Eisentraut	26f9012bee	Make cast function from circle to polygon error safe Previously, the function casting type circle to type polygon could not be made error safe, because it is an SQL language function. This refactors it as a C/internal function, by sharing code with the C/internal function that the SQL function previously wrapped, and soft error support is added. Author: jian he <jian.universality@gmail.com> Reviewed-by: Amul Sul <sulamul@gmail.com> Reviewed-by: Corey Huinker <corey.huinker@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CADkLM%3Dfv1JfY4Ufa-jcwwNbjQixNViskQ8jZu3Tz_p656i_4hQ%40mail.gmail.com	2026-03-30 09:11:08 +02:00
Fujii Masao	2497dac556	Fix FK triggers losing DEFERRABLE/INITIALLY DEFERRED when marked ENFORCED again Previously, a foreign key defined as DEFERRABLE INITIALLY DEFERRED could behave as NOT DEFERRABLE after being set to NOT ENFORCED and then back to ENFORCED. This happened because recreating the FK triggers on re-enabling the constraint forgot to restore the tgdeferrable and tginitdeferred fields in pg_trigger. Fix this bug by properly setting those fields when the foreign key constraint is marked ENFORCED again and its triggers are recreated, so the original DEFERRABLE and INITIALLY DEFERRED properties are preserved. Backpatch to v18, where NOT ENFORCED foreign keys were introduced. Author: Yasuo Honda <yasuo.honda@gmail.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/CAKmOUTms2nkxEZDdcrsjq5P3b2L_PR266Hv8kW5pANwmVaRJJQ@mail.gmail.com Backpatch-through: 18	2026-03-30 14:37:33 +09:00
David Rowley	0d866282b8	Fix datum_image_*()'s inability to detect sign-extension variations Functions such as hash_numeric() are not careful to use the correct PG_RETURN_*() macro according to the return type of that function as defined in pg_proc. Because that function is meant to return int32, when the hashed value exceeds 2^31, the 64-bit Datum value won't wrap to a negative number, which means the Datum won't have the same value as it would have had it been cast to int32 on a two's complement machine. This isn't harmless as both datum_image_eq() and datum_image_hash() may receive a Datum that's been formed and deformed from a tuple in some cases, and not in other cases. When formed into a tuple, the Datum value will be coerced into an integer according to the attlen as specified by the TupleDesc. This can result in two Datums that should be equal being classed as not equal, which could result in (but not limited to) an error such as: ERROR: could not find memoization table entry Here we fix this by ensuring we cast the Datum value to a signed integer according to the typLen specified in the datum_image_eq/datum_image_hash function call before comparing or hashing. Author: David Rowley <dgrowleyml@gmail.com> Reported-by: Tender Wang <tndrwang@gmail.com> Backpatch-through: 14 Discussion: https://postgr.es/m/CAHewXNmcXVFdB9_WwA8Ez0P+m_TQy_KzYk5Ri5dvg+fuwjD_yw@mail.gmail.com	2026-03-30 16:14:34 +13:00
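The mismatch described in this commit can be reproduced with plain integer arithmetic. The following is a small Python model, not the PostgreSQL code: the function names are hypothetical, a 64-bit Datum is modeled as a non-negative Python integer below 2^64, and `normalize()` is only a sketch of "cast to a signed integer according to typLen before comparing".

```python
MASK64 = (1 << 64) - 1


def zero_extended(value32):
    """64-bit image when a 32-bit result is widened as an unsigned value."""
    return value32 & 0xFFFFFFFF


def sign_extended(value32):
    """64-bit image when the same 32 bits are widened as a signed int32."""
    v = value32 & 0xFFFFFFFF
    if v & 0x80000000:
        v -= 1 << 32
    return v & MASK64


def normalize(datum, typlen):
    """Reinterpret the low typlen bytes as signed, then widen to 64 bits."""
    bits = typlen * 8
    v = datum & ((1 << bits) - 1)
    if v & (1 << (bits - 1)):
        v -= 1 << bits
    return v & MASK64


hash_value = 0x9ABCDEF0  # an arbitrary 32-bit value above 2^31
a = zero_extended(hash_value)
b = sign_extended(hash_value)
print(hex(a), hex(b), a == b)
print(normalize(a, 4) == normalize(b, 4))
```

The two images differ only in their upper 32 bits, so a raw 64-bit comparison treats them as unequal, while comparing after normalizing to the declared length treats them as the same value.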
Fujii Masao	1a11405a43	psql: Make \d+ inheritance tables list formatting consistent with other objects This follows up on the previous change (commit `7bff9f106a`) for partitions by applying the same formatting to inheritance tables lists. Previously, \d+ <table> displayed inheritance tables differently from other object lists: the first inheritance table appeared on the same line as the "Inherits" header. For example: Inherits: test_like_5, test_like_5x This commit updates the output so that inheritance tables are listed consistently with other objects, with each entry on its own line starting below the header: Inherits: test_like_5 test_like_5x Author: Peter Smith <smithpb2250@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Neil Chen <carpenter.nail.cz@gmail.com> Reviewed-by: Greg Sabino Mullane <htamfids@gmail.com> Reviewed-by: Soumya S Murali <soumyamurali.work@gmail.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/CAHut+Pu1puO00C-OhgLnAcECzww8MB3Q8DCsvx0cZWHRfs4gBQ@mail.gmail.com	2026-03-30 11:21:22 +09:00
Fujii Masao	7bff9f106a	psql: Make \d+ partition list formatting consistent with other objects Previously, \d+ <table> displayed partitions differently from other object lists: the first partition appeared on the same line as the "Partitions" header. For example: Partitions: pt12 FOR VALUES IN (1, 2), pt34 FOR VALUES IN (3, 4) This commit updates the output so that partitions are listed consistently with other objects, with each entry on its own line starting below the header: Partitions: pt12 FOR VALUES IN (1, 2) pt34 FOR VALUES IN (3, 4) Author: Peter Smith <smithpb2250@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Neil Chen <carpenter.nail.cz@gmail.com> Reviewed-by: Greg Sabino Mullane <htamfids@gmail.com> Reviewed-by: Soumya S Murali <soumyamurali.work@gmail.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/CAHut+Pu1puO00C-OhgLnAcECzww8MB3Q8DCsvx0cZWHRfs4gBQ@mail.gmail.com	2026-03-30 11:06:42 +09:00
Amit Langote	c57d8178eb	Doc: fix stale text about partition locking with cached plans Commit `121d774cae` added text to master describing pruning-aware locking behavior introduced by `525392d57`. That behavior was reverted in May 2025, making the text incorrect. Replace it with the text used in back branches, which correctly describes current behavior: pruned partitions are still locked at the beginning of execution. Discussion: https://postgr.es/m/CA+HiwqFT0fPPoYBr0iUFWNB-Og7bEXB9hB=6ogk_qD9=OM8Vbw@mail.gmail.com	2026-03-30 10:29:21 +09:00
Amit Langote	1ad7191f7e	Add comment explaining fire_triggers=false in ri_PerformCheck() The reason for passing fire_triggers=false to SPI_execute_snapshot() in ri_PerformCheck() was not documented, making it unclear why it was done that way. Add a comment explaining that it ensures AFTER triggers on rows modified by the RI action are queued in the outer query's after-trigger context and fire only after all RI updates on the same row are complete. Author: Yugo Nagata <nagata@sraoss.co.jp> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Surya Poondla <suryapoondla4@gmail.com> Discussion: https://postgr.es/m/20250331212648.ad4ab804559001d7f0788741@sraoss.co.jp	2026-03-30 10:10:17 +09:00
Peter Eisentraut	45cdaf3665	Make geometry cast functions error safe This adjusts cast functions of the geometry types to support soft errors. This requires refactoring of various helper functions to support error contexts. Also make the float8 to float4 cast error safe. It requires some of the same helper functions. This is in preparation for a future feature where conversion errors in casts can be caught. (The function casting type circle to type polygon is not yet made error safe, because it is an SQL language function.) Author: jian he <jian.universality@gmail.com> Reviewed-by: Amul Sul <sulamul@gmail.com> Reviewed-by: Corey Huinker <corey.huinker@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CADkLM%3Dfv1JfY4Ufa-jcwwNbjQixNViskQ8jZu3Tz_p656i_4hQ%40mail.gmail.com	2026-03-29 20:40:50 +02:00
Tom Lane	d4cb9c3776	Doc: document more incompatible pg_restore option pairs. Most of the pairs of incompatible options (such as --file and --dbname) are pretty obvious and need no explanation. But it may not be obvious that --single-transaction cannot be used together with --create or multiple jobs, so let's mention that in the documentation. Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at> Discussion: https://postgr.es/m/CAExHW5ti5igDwOOde6shgfS7JPtCY9gNrkB3xNr=FuGTYVDSjQ@mail.gmail.com	2026-03-29 14:06:50 -04:00
Tom Lane	e7b809ae75	Doc: clarify introductory description of pg_dumpall. Add a sentence that describes the parts of a cluster's state that are not included in the output. Also swap two sentences in the introductory paragraph. Without that, it is not clear what the "it" at the beginning of the second sentence is referring to. Also add a reference to pg_restore, since not all output formats are restored with pg_dump. Also clarify the recently-added text about where different output formats go, and relocate it above the ancillary text about having to run as superuser. Reported-by: Dimitre Radoulov <cichomitiko@gmail.com> Author: Laurenz Albe <laurenz.albe@cybertec.at> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAGJBphSX2oMPPu=VM4U8NP4+qffFH_483tFQCJ_s-mOcN3DLDw@mail.gmail.com	2026-03-29 13:53:17 -04:00
Andrew Dunstan	01d58d7e3f	Fix multiple bugs in astreamer pipeline code. astreamer_tar_parser_content() sent the wrong data pointer when forwarding MEMBER_TRAILER padding to the next streamer. After astreamer_buffer_until() buffers the padding bytes, the 'data' pointer has been advanced past them, but the code passed 'data' instead of bbs_buffer.data. This caused the downstream consumer to receive bytes from after the padding rather than the padding itself, and could read past the end of the input buffer. astreamer_gzip_decompressor_content() only checked for Z_STREAM_ERROR from inflate(), silently ignoring Z_DATA_ERROR (corrupted data) and Z_MEM_ERROR (out of memory). Fix by treating any return other than Z_OK, Z_STREAM_END, and Z_BUF_ERROR as fatal. astreamer_gzip_decompressor_free() missed calling inflateEnd() to release zlib's internal decompression state. astreamer_tar_parser_free() neglected to pfree() the streamer struct itself, leaking it. astreamer_extractor_content() did not check the return value of fclose() when closing an extracted file. A deferred write error (e.g., disk full on buffered I/O) would be silently lost. Discussion: https://postgr.es/m/results/98c6b630-acbb-44a7-97fa-1692ce2b827c@dunslane.net Reviewed-By: Tom Lane <tgl@sss.pgh.pa.us> Backpatch-through: 15	2026-03-29 09:01:47 -04:00
Álvaro Herrera	0841b219bf	Sort InternalBGWorkers list alphabetically This simplifies deciding where to add a new one.	2026-03-29 14:15:00 +02:00
Peter Eisentraut	10e4d8aaf4	Make cast functions from jsonb error safe This adjusts cast functions from jsonb to other types to support soft errors. This just involves some refactoring of the underlying helper functions to use ereturn. This is in preparation for a future feature where conversion errors in casts can be caught. Author: jian he <jian.universality@gmail.com> Reviewed-by: Amul Sul <sulamul@gmail.com> Reviewed-by: Corey Huinker <corey.huinker@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CADkLM%3Dfv1JfY4Ufa-jcwwNbjQixNViskQ8jZu3Tz_p656i_4hQ%40mail.gmail.com	2026-03-28 15:44:13 +01:00
Andres Freund	999dec9ec6	aio: Don't wait for already in-progress IO When a backend attempts to start a read IO and finds the first buffer already has I/O in progress, previously it waited for that I/O to complete before initiating reads for any of the subsequent buffers. Although it must wait for the I/O to finish when acquiring the buffer, there's no reason for it to wait when setting up the read operation. Waiting at this point prevents starting I/O on subsequent buffers and can significantly reduce concurrency. This matters in two workloads: 1) When multiple backends scan the same relation concurrently. 2) When a single backend requests the same block multiple times within the readahead distance. Waiting each time an in-progress read is encountered effectively degenerates the access pattern into synchronous I/O. To fix this, when encountering an already in-progress IO for the head buffer, the wait reference is now recorded and waiting is deferred until WaitReadBuffers(), when the buffer actually needs to be acquired. In rare cases, a backend may still need to wait synchronously at IO start time: If another backend has set BM_IO_IN_PROGRESS on the buffer but has not yet set the wait reference. Such windows should be brief and uncommon. Author: Melanie Plageman <melanieplageman@gmail.com> Author: Andres Freund <andres@anarazel.de> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com> Discussion: https://postgr.es/m/flat/zljergweqti7x67lg5ije2rzjusie37nslsnkjkkby4laqqbfw%403p3zu522yykv	2026-03-27 19:53:32 -04:00
Andres Freund	74eafeab1a	bufmgr: Improve StartBufferIO interface Until now StartBufferIO() had a few weaknesses: - As it did not submit staged IOs, it was not safe to call StartBufferIO() where there was a potential for unsubmitted IO, which required AsyncReadBuffers() to use a wrapper (ReadBuffersCanStartIO()) around StartBufferIO(). - With nowait = true, the boolean return value did not allow distinguishing between no IO being necessary and having to wait, which would lead ReadBuffersCanStartIO() to unnecessarily submit staged IO. - Several callers needed to handle both local and shared buffers, requiring the caller to differentiate between StartBufferIO() and StartLocalBufferIO() - In a future commit some callers of StartBufferIO() want the BufferDesc's io_wref to be returned, to asynchronously wait for in-progress IO - Indicating whether to wait with the nowait parameter was somewhat confusing compared to a wait parameter Address these issues as follows: - StartBufferIO() is renamed to StartSharedBufferIO() - A new StartBufferIO() is introduced that supports both shared and local buffers - The boolean return value has been replaced with an enum, indicating whether the IO is already done, already in progress or that the buffer has been readied for IO - A new PgAioWaitRef * argument allows the caller to get the wait reference if desired. All current callers pass NULL, a user of this will be introduced subsequently - Instead of the nowait argument there now is wait This probably would not have been worthwhile on its own, but since all these lines needed to be touched anyway... Author: Andres Freund <andres@anarazel.de> Author: Melanie Plageman <melanieplageman@gmail.com> Discussion: https://postgr.es/m/zljergweqti7x67lg5ije2rzjusie37nslsnkjkkby4laqqbfw@3p3zu522yykv	2026-03-27 19:08:12 -04:00
Heikki Linnakangas	2407c8db15	Fix RequestNamedLWLockTranche in single-user mode PostmasterContext is not available in single-user mode, use TopMemoryContext instead. Also make sure that we use the correct memory context in the lappend(). Author: Nathan Bossart <nathandbossart@gmail.com> Discussion: https://www.postgresql.org/message-id/acb_Eo1XtmCO_9z7@nathan	2026-03-28 01:02:11 +02:00
Andres Freund	1f6f200cab	test_aio: Add read_stream test infrastructure & tests While we have a lot of indirect coverage of read streams, there are corner cases that are hard to test when only indirectly controlling and observing the read stream. This commit adds an SQL callable SRF interface for a read stream and uses that in a few tests. To make some of the tests possible, the injection point infrastructure in test_aio had to be expanded to allow blocking IO completion. While at it, fix a wrong debug message in inj_io_short_read_hook(). Author: Andres Freund <andres@anarazel.de> Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com> Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Discussion: https://postgr.es/m/zljergweqti7x67lg5ije2rzjusie37nslsnkjkkby4laqqbfw@3p3zu522yykv	2026-03-27 18:52:43 -04:00
Andres Freund	020c02bd90	test_aio: Add basic tests for StartReadBuffers() Upcoming commits will change StartReadBuffers() and its building blocks, making it worthwhile to directly test StartReadBuffers(). Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Discussion: https://postgr.es/m/zljergweqti7x67lg5ije2rzjusie37nslsnkjkkby4laqqbfw@3p3zu522yykv	2026-03-27 18:52:43 -04:00
Tom Lane	00c025a001	Doc: split functions-posix-regexp section into multiple subsections. Create a <sect4> section for each function that the previous text described in one long series of paragraphs. Also split the functions' previously in-line syntax summaries into <synopsis> clauses, which is more readable and allows us to sneak in an explicit mention of the result data type. This change gives us an opportunity to make cross-reference links more specific, too, so do that. Author: jian he <jian.universality@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CACJufxFuk9P=P4=BZ=qCkgvo6im8aL8NnCkjxx2S2MQDWNdouw@mail.gmail.com	2026-03-27 17:41:08 -04:00
Andres Freund	f39cb8c011	bufmgr: Make UnlockReleaseBuffer() more efficient Now that the buffer content lock is implemented as part of BufferDesc.state, releasing the lock and unpinning the buffer can be implemented as a single atomic operation. This substantially improves workloads that have heavy contention on a small number of buffers; e.g., I see a ~20% improvement for pipelined readonly pgbench on an older two socket machine. Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Discussion: https://postgr.es/m/5ubipyssiju5twkb7zgqwdr7q2vhpkpmuelxfpanetlk6ofnop@hvxb4g2amb2d	2026-03-27 15:56:29 -04:00
Andres Freund	8df3c48e46	Use UnlockReleaseBuffer() in more places An upcoming commit will make UnlockReleaseBuffer() considerably faster and more scalable than doing LockBuffer(BUFFER_LOCK_UNLOCK); ReleaseBuffer();. But it's a small performance benefit even as-is. Most of the callsites changed in this patch are not performance sensitive, however some, like the nbtree ones, are in critical paths. This patch changes all the easily convertible places over to UnlockReleaseBuffer() mainly because I needed to check all of them anyway, and reducing cases where the operations are done separately makes the checking easier. Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Discussion: https://postgr.es/m/5ubipyssiju5twkb7zgqwdr7q2vhpkpmuelxfpanetlk6ofnop@hvxb4g2amb2d	2026-03-27 15:56:29 -04:00
Andres Freund	41d3d64e87	bufmgr: Don't copy pages while writing out After the series of preceding commits introducing and using BufferBeginSetHintBits()/BufferSetHintBits16(), hint bits are not set anymore while IO is going on. Therefore we do not need to copy pages while they are being written out anymore. For the same reason XLogSaveBufferForHint() now does not need to operate on a copy of the page anymore, but can instead use the normal XLogRegisterBuffer() mechanism. For that the assertions and comments to XLogRegisterBuffer() had to be updated to allow share-exclusive locked buffers to be registered. Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Discussion: https://postgr.es/m/5ubipyssiju5twkb7zgqwdr7q2vhpkpmuelxfpanetlk6ofnop@hvxb4g2amb2d	2026-03-27 15:56:29 -04:00
Tom Lane	79ac82125e	pgindent: ensure all C files end with a newline. Not only is this good style, but it dodges some obscure bugs within pg_bsd_indent. We could try to fix said bugs, but the amount of effort required seems far out of proportion to the benefit. Reported-by: Akshay Joshi <akshay.joshi@enterprisedb.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Andrew Dunstan <andrew@dunslane.net> Discussion: https://postgr.es/m/CANxoLDfca8O5SkeDxB_j6SVNXd+pNKaDmVmEW+2yyicdU8fy0w@mail.gmail.com	2026-03-27 15:38:48 -04:00
Masahiko Sawada	e752a2ccc9	doc: Clarify collation requirements for base32hex sortability. While fixing the base32hex UUID sortability test in commit `89210037a0`, it turned out that the expected lexicographical order is only maintained under the C collation (or an equivalent byte-wise collation). Natural language collations may employ different rules, breaking the sortability. This commit updates the documentation to explicitly state that base32hex is "byte-wise sortable", ensuring users do not fall into the trap of using natural language collations when querying their encoded data. Co-Authored-by: Andrey Borodin <x4mmm@yandex-team.ru> Discussion: https://postgr.es/m/CAD21AoAwX1D6baSGuQXm0mzPXPWB07kgaoaaahjNHHenbdY24A@mail.gmail.com	2026-03-27 12:13:29 -07:00
Nathan Bossart	d7965d65fc	Add rudimentary table prioritization to autovacuum. Autovacuum workers scan pg_class twice to collect the set of tables to process. The first pass is for plain relations and materialized views, and the second is for TOAST tables. When the worker finds a table to process, it adds it to the end of a list. Later on, it processes the tables in the same order as the list. This simple strategy has worked surprisingly well for a long time, but there have been many discussions over the years about trying to improve it. This commit introduces a scoring system that is used to sort the aforementioned list of tables to process. The idea is to have autovacuum workers prioritize tables that are furthest beyond their thresholds (e.g., a table nearing transaction ID wraparound should be vacuumed first). This prioritization scheme is certainly far from perfect; there are simply too many possibilities for any scoring technique to work across all workloads, and the situation might change significantly between the time we calculate the score and the time that autovacuum processes it. However, we have attempted to develop something that is expected to work for a large portion of workloads with reasonable parameter settings. The score is calculated as the maximum of the ratios of each of the table's relevant values to its threshold. For example, if the number of inserted tuples is 100, and the insert threshold for the table is 80, the insert score is 1.25. If all other scores are below that value, the table's score will be 1.25. The other criteria considered for the score are the table ages (both relfrozenxid and relminmxid) compared to the corresponding freeze-max-age setting, the number of updated/deleted tuples compared to the vacuum threshold, and the number of inserted/updated/deleted tuples compared to the analyze threshold. 
One exception to the previous paragraph is for tables nearing wraparound, i.e., those that have surpassed the effective failsafe ages. In that case, the relfrozenxid/relminmxid-based score is scaled aggressively so that the table has a decent chance of sorting to the front of the list. To adjust how strongly each component contributes to the score, the following parameters can be adjusted from their default of 1.0 to anywhere between 0.0 and 10.0 (inclusive). Setting all of these to 0.0 restores pre-v19 prioritization behavior: autovacuum_freeze_score_weight autovacuum_multixact_freeze_score_weight autovacuum_vacuum_score_weight autovacuum_vacuum_insert_score_weight autovacuum_analyze_score_weight This is intended to be a baby step towards smarter autovacuum workers. Possible future improvements include, but are not limited to, periodic reprioritization, automatic cost limit adjustments, and better observability (e.g., a system view that shows current scores). While we do not expect this commit to produce any earth-shattering improvements, it is arguably a prerequisite for the aforementioned follow-up changes. Reviewed-by: Sami Imseih <samimseih@gmail.com> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Reviewed-by: wenhui qiu <qiuwenhuifx@gmail.com> Reviewed-by: Greg Burd <greg@burd.me> Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> Discussion: https://postgr.es/m/aOaAuXREwnPZVISO%40nathan	2026-03-27 10:17:05 -05:00
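The scoring rule in this commit message can be worked through as a small calculation. This is a Python sketch, not the autovacuum code: the function names are hypothetical, the weight is assumed to multiply each ratio, the aggressive scaling for tables past the failsafe age is not modeled, and only the insert figures (100 tuples against a threshold of 80) come from the message; the other numbers are made up.

```python
def table_score(components, weights=None):
    """Return the maximum weighted value/threshold ratio for one table.

    components maps a component name to a (value, threshold) pair.
    weights maps a component name to a multiplier, defaulting to 1.0.
    """
    weights = weights or {}
    scores = [
        weights.get(name, 1.0) * value / threshold
        for name, (value, threshold) in components.items()
        if threshold > 0
    ]
    return max(scores, default=0.0)


def prioritize(scored_tables):
    """Sort (table_name, score) pairs so the highest score comes first."""
    return sorted(scored_tables, key=lambda item: item[1], reverse=True)


example = {"insert": (100, 80), "vacuum": (30, 50), "analyze": (40, 60)}
print(table_score(example))  # 1.25
```

With every weight set to 0.0 all scores in this model collapse to 0.0, and because Python's sort is stable the list keeps its original order, which is one way to picture how zero weights could fall back to the earlier list-order behavior.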
Peter Eisentraut	9a9998163b	Align tests for stored and virtual generated columns These tests were intended to be aligned with each other, but additional tests for virtual generated columns disrupted that alignment. The test confirming that user-defined types are not allowed in virtual generated columns has also been moved to the generated_virtual.sql-specific section. Author: Yugo Nagata <nagata@sraoss.co.jp> Reviewed-by: Paul A Jungwirth <pj@illuminatedcomputing.com> Reviewed-by: Mutaamba Maasha <maasha@gmail.com> Reviewed-by: Surya Poondla <s_poondla@apple.com> Discussion: https://www.postgresql.org/message-id/flat/20250808115142.e9ccb81f35466a9a131a4c55@sraoss.co.jp	2026-03-27 15:49:34 +01:00
Peter Eisentraut	6857947db5	pgindent: Always clean up .BAK files from pg_bsd_indent The previous commit let pgindent clean up File::Temp files on SIGINT. This extends that to also cleaning up the .BAK files, created by pg_bsd_indent. Author: Jelte Fennema-Nio <postgres@jeltef.nl> Discussion: https://www.postgresql.org/message-id/flat/DFCDD5H4J7VX.3GJKRBBDCKQ86@jeltef.nl	2026-03-27 14:26:43 +01:00
Peter Eisentraut	801de0bd44	pgindent: Clean up temp files created by File::Temp on SIGINT When pressing Ctrl+C while running pgindent, it would often leave around files like pgtypedefAXUEEA. This slightly changes SIGINT handling so those files are cleaned up. Author: Jelte Fennema-Nio <postgres@jeltef.nl> Discussion: https://www.postgresql.org/message-id/flat/DFCDD5H4J7VX.3GJKRBBDCKQ86@jeltef.nl	2026-03-27 14:26:43 +01:00
Heikki Linnakangas	3fd0577728	Refactor PredicateLockShmemInit to not reuse var for different things The PredicateLockShmemInit function is pretty complicated, and one source of confusion is that it reuses the same local variable for sizes of things. Replace the different uses with separate variables for clarity. Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Discussion: https://www.postgresql.org/message-id/113724ab-0028-493f-9605-6e8570f0939f@iki.fi	2026-03-27 13:24:34 +02:00
Heikki Linnakangas	3c74cb5762	Avoid memory leak on error while parsing pg_stat_statements dump file By using palloc() instead of raw malloc(). Reported-by: Gaurav Singh <gaurav.singh@yugabyte.com> Reviewed-by: Lukas Fittl <lukas@fittl.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://www.postgresql.org/message-id/CAEcQ1bYR9s4eQLFDjzzJHU8fj-MTbmRpW-9J-r2gsCn+HEsynw@mail.gmail.com Backpatch-through: 14	2026-03-27 12:25:10 +02:00
Peter Eisentraut	288ae96872	Add a graph pattern variable only once An element pattern variable may be repeated in the path pattern. GraphTableParseState maintains a list of all variable names used in the graph pattern. Add a new variable name to that list only when it is not present already. This isn't a problem right now, but it could be in the future. Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Discussion: https://www.postgresql.org/message-id/CAExHW5tR4O0vjeqTCPr2VB5pYjNYbJgbCBEQf63NtU5Pz1MiOQ%40mail.gmail.com	2026-03-27 10:55:17 +01:00
Heikki Linnakangas	98993150c0	Minor comment fixes to yesterday's LWLock tranche refactoring Author: Sami Imseih <samimseih@gmail.com> Discussion: https://www.postgresql.org/message-id/CAA5RZ0sLENRM+BicUjQFs_rP38oPx3gm0SsGrD0-jMhhM+HZ_w@mail.gmail.com	2026-03-27 11:44:10 +02:00
Peter Eisentraut	720f0f89d6	Reject consecutive element patterns of same kind Adding an implicit empty vertex pattern when a path pattern starts or ends with an edge pattern or when two consecutive edge patterns appear in the pattern is not supported right now. Prohibit such path patterns. Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reported-by: Alexander Lakhin <exclusion@gmail.com> Reviewed-by: Henson Choi <assam258@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/72a23702-6d96-4103-a54b-057c2352e885%2540eisentraut.org	2026-03-27 10:31:53 +01:00
Peter Eisentraut	b4a1320224	Enable warning like -Wstrict-prototypes on MSVC as well This adds an MSVC warning option equivalent to those added in commit `29bf4ee749` for GCC/Clang. Note that this requires commit `bccfc73acd` (Disable warnings in system headers in MSVC). Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/aa73q1aT0A3/vke/%40ip-10-97-1-34.eu-west-3.compute.internal	2026-03-27 08:28:07 +01:00
Robert Haas	874da8b1f6	pg_plan_advice: pgindent Reported-by: Lukas Fittl <lukas@fittl.com>	2026-03-26 20:10:13 -04:00
Heikki Linnakangas	30d432502b	Use ShmemInitStruct to allocate lwlock.c's shared memory It's nice to have them show up in pg_shmem_allocations like all other shmem areas. ShmemInitStruct() depends on ShmemIndexLock, but only after postmaster startup. Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Discussion: https://www.postgresql.org/message-id/47aaf57e-1b7b-4e12-bda2-0316081ff50e@iki.fi	2026-03-26 23:51:41 +02:00
Heikki Linnakangas	06d859aaf4	Move ShmemIndexLock into ShmemAllocator This makes shmem.c independent of the main LWLock array. That makes it possible to stop passing MainLWLockArray through BackendParameters in the next commit. Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Discussion: https://www.postgresql.org/message-id/47aaf57e-1b7b-4e12-bda2-0316081ff50e@iki.fi	2026-03-26 23:51:41 +02:00
Heikki Linnakangas	12e3e0f2c8	Use a separate spinlock to protect LWLockTranches Previously we reused the shmem allocator's ShmemLock to also protect lwlock.c's shared memory structures. Introduce a separate spinlock for lwlock.c for the sake of modularity. Now that lwlock.c has its own shared memory struct (LWLockTranches), this is easy to do. Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Discussion: https://www.postgresql.org/message-id/47aaf57e-1b7b-4e12-bda2-0316081ff50e@iki.fi	2026-03-26 23:50:59 +02:00
Heikki Linnakangas	d6eba30a24	Refactor how user-defined LWLock tranches are stored in shmem Merge the LWLockTranches and NamedLWLockTrancheRequest data structures in shared memory into one array of user-defined tranches. The NamedLWLockTrancheRequest list is now only used in postmaster, to hold the requests until shared memory is initialized. Introduce a C struct, LWLockTranches, to hold all the different fields kept in shared memory. This gives an easier overview of what are all the things kept in shared memory. Previously, we had separate pointers for LWLockTrancheNames, LWLockCounter and the (shared memory copy of) NamedLWLockTrancheRequestArray. Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Discussion: https://www.postgresql.org/message-id/47aaf57e-1b7b-4e12-bda2-0316081ff50e@iki.fi	2026-03-26 23:47:22 +02:00
Heikki Linnakangas	cc88481aeb	Rename MAX_NAMED_TRANCHES to MAX_USER_DEFINED_TRANCHES The "named tranches" term is a little confusing. In most places it refers to tranches requested with RequestNamedLWLockTranche(), even though all built-in tranches and tranches allocated with LWLockNewTrancheId() also have a name. But in MAX_NAMED_TRANCHES, it refers to tranches requested with either RequestNamedLWLockTranche() or LWLockNewTrancheId(), as it's the maximum of all of those in total. The "user defined" term is already used in LWTRANCHE_FIRST_USER_DEFINED, so let's standardize on that to mean tranches allocated with either RequestNamedLWLockTranche() or LWLockNewTrancheId(). Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Reviewed-by: Sami Imseih <samimseih@gmail.com> Discussion: https://www.postgresql.org/message-id/47aaf57e-1b7b-4e12-bda2-0316081ff50e@iki.fi	2026-03-26 23:46:04 +02:00
Tom Lane	a6d26e0fb2	Doc: declutter CREATE TABLE synopsis. Factor out the "persistence mode" and storage/compression parts of the syntax synopsis to reduce line lengths and increase readability. Also add an introductory para about the persistence modes so that the Description section still lines up with the synopsis. Author: David G. Johnston <david.g.johnston@gmail.com> Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at> Reviewed-by: Jian He <jian.universality@gmail.com> Discussion: https://postgr.es/m/CAKFQuwYfMV-2SdrP-umr5SVNSqTn378BUvHsebetp5=DhT494w@mail.gmail.com	2026-03-26 17:27:40 -04:00
Robert Haas	6455e55b0d	pg_plan_advice: Invent DO_NOT_SCAN(relation_identifier). The premise of src/test/modules/test_plan_advice is that if we plan a query once, generate plan advice, and then replan it using that same advice, all of that advice should apply cleanly, since the settings and everything else are the same. Unfortunately, that's not the case: the test suite is the main regression tests, and concurrent activity can change the statistics on tables involved in the query, especially system catalogs. That's OK as long as it only affects costing, but in a few cases, it affects which relations appear in the final plan at all. In the buildfarm failures observed to date, this happens because we consider alternative subplans for the same portion of the query; in theory, MinMaxAggPath is vulnerable to a similar hazard. In both cases, the planner clones an entire subquery, and the clone has a different plan name, and therefore different range table identifiers, than the original. If a cost change results in flipping between one of these plans and the other, the test_plan_advice tests will fail, because the range table identifiers to which advice was applied won't even be present in the output of the second planning cycle. To fix, invent a new DO_NOT_SCAN advice tag. When generating advice, emit it for relations that should not appear in the final plan at all, because some alternative version of that relation was used instead. When DO_NOT_SCAN is supplied, disable all scan methods for that relation. To make this work, we reuse a bunch of the machinery that previously existed for the purpose of ensuring that we build the same set of relation identifiers during planning as we do from the final PlannedStmt. 
In the process, this commit slightly weakens the cross-check mechanism: before this commit, it would fire whenever the pg_plan_advice module was loaded, even if pg_plan_advice wasn't actually doing anything; now, it will only engage when we have some other reason to create a pgpa_planner_state. The old way was complex and didn't add much useful test coverage, so this seems like an acceptable sacrifice. Discussion: http://postgr.es/m/CA+TgmoYuWmN-00Ec5pY7zAcpSFQUQLbgAdVWGR9kOR-HM-fHrA@mail.gmail.com Reviewed-by: Lukas Fittl <lukas@fittl.com>	2026-03-26 17:09:57 -04:00
Robert Haas	26255a3207	Add an alternative_plan_name field to PlannerInfo. Typically, we have only one PlannerInfo for any given subquery, but when we are considering a MinMaxAggPath or a hashed subplan, we end up creating a second PlannerInfo for the same portion of the query, with a clone of the original range table. In fact, in the MinMaxAggPath case, we might end up creating several clones, one per aggregate. At present, there's no easy way for a plugin, such as pg_plan_advice, to understand the relationships between the original range table and the copies of it that are created in these cases. To fix, add an alternative_plan_name field to PlannerInfo. For a hashed subplan, this is the plan name for the non-hashed alternative; for minmax aggregates, this is the plan_name from the parent PlannerInfo; otherwise, it's the same as plan_name. Discussion: http://postgr.es/m/CA+TgmoYuWmN-00Ec5pY7zAcpSFQUQLbgAdVWGR9kOR-HM-fHrA@mail.gmail.com Reviewed-by: Lukas Fittl <lukas@fittl.com>	2026-03-26 16:45:17 -04:00
Tom Lane	10e2a8ac6a	Doc: commit performs rollback of aborted transactions. The COMMIT command handles an aborted transaction in the same manner as the ROLLBACK command, but this wasn't explained in its official reference page. Also mention that behavior in the tutorial's material on transactions. Also add a comment mentioning that we don't raise an exception for COMMIT within an aborted transaction, as the SQL standard would have us do. Hyperlink a couple of cross-references while we're at it. Author: David G. Johnston <david.g.johnston@gmail.com> Reviewed-by: Gurjeet Singh <gurjeet@singh.im> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAKFQuwYgYR3rWt6vFXw=ZWZ__bv7PqvdOnHujG+UyqE11f+3sg@mail.gmail.com	2026-03-26 15:14:27 -04:00
Andres Freund	698ab40469	Address perlcritic complaint in response to `906a046972`	2026-03-26 15:03:47 -04:00
Andres Freund	8a1a1d6ab8	bufmgr: Restructure AsyncReadBuffers() Restructure AsyncReadBuffers() to use early return when the head buffer is already valid, instead of using a did_start_io flag and if/else branches. Also move around a bit of the code to be located closer to where it is used. This is a refactor only. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/zljergweqti7x67lg5ije2rzjusie37nslsnkjkkby4laqqbfw@3p3zu522yykv	2026-03-26 12:07:05 -04:00
Andres Freund	df09452c32	bufmgr: Make buffer hit helper Already two places count buffer hits, requiring quite a few lines of code since we do accounting in so many places. Future commits will add more locations, so refactor into a helper. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com> Discussion: https://postgr.es/m/zljergweqti7x67lg5ije2rzjusie37nslsnkjkkby4laqqbfw@3p3zu522yykv	2026-03-26 12:07:05 -04:00
Andres Freund	c2a68e08b1	bufmgr: Pass io_object and io_context through to PinBufferForBlock() PinBufferForBlock() is always_inline and called in a loop in StartReadBuffersImpl(). Previously it computed io_context and io_object internally, which required calling IOContextForStrategy() -- a non-inline function the compiler cannot prove is side-effect-free. This could potentially cause redundant function calls. Compute io_context and io_object in the callers instead, allowing StartReadBuffersImpl() to do so once before entering the loop. Author: Melanie Plageman <melanieplageman@gmail.com> Suggested-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/zljergweqti7x67lg5ije2rzjusie37nslsnkjkkby4laqqbfw@3p3zu522yykv	2026-03-26 12:07:05 -04:00
Robert Haas	5dcb15e89a	pg_plan_advice: Refactor to invent pgpa_planner_info pg_plan_advice tracks two pieces of per-PlannerInfo data: (1) for each RTI, the corresponding relation identifier, for purposes of cross-checking those calculations against the final plan; and (2) the set of semijoins seen during planning for which the strategy of making one side unique was considered. The former is tracked using a hash table that uses <plan_name, RTI> as the key, and the latter is tracked using a List of <plan_name, relids>. It seems better to track both of these things in the same way and to try to reuse some code instead of having everything be completely separate, so invent pgpa_planner_info; we'll create one every time we see a new PlannerInfo and need to associate some data with it, and we'll use the plan_name field to distinguish between PlannerInfo objects, as it should always be unique. Then, refactor the two systems mentioned above to use this new infrastructure. (Note that the adjustment in pgpa_plan_walker is necessary in order to avoid spuriously triggering the sanity check in that function, in the case where a pgpa_planner_info is created for a purpose not related to sj_unique_rels.) Discussion: https://postgr.es/m/CA+TgmoaK=4w7-qknUo3QhUJ53pXZq=c=KgZmRyD+k7ytqfmgSg@mail.gmail.com Reviewed-by: Lukas Fittl <lukas@fittl.com>	2026-03-26 11:57:33 -04:00
Tom Lane	41d69e6dcc	Add labels to help make psql's hidden queries more understandable. We recommend looking at psql's "-E" output to help understand the system catalogs, but in some cases (particularly table displays) there's a bunch of rather impenetrable SQL there. As a small improvement, label each query issued by describe.c with a short description of its purpose. The code is arranged so that the labels also appear as SQL comments in the server log, if the server is logging these commands. We could expand this policy to every use of PSQLexec(), but most of the ones outside describe.c are issuing simple commands like "BEGIN" or "COMMIT", which don't seem to need such glosses. I did add labels to the commands issued by \sf, \sv and friends. Also, make the -E and log output for hidden queries say "INTERNAL QUERY" not just "QUERY", to distinguish them from user-written queries. Author: Greg Sabino Mullane <htamfids@gmail.com> Co-authored-by: David Christensen <david+pg@pgguru.net> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAKAnmmJz8Hh=8Ru8jgzySPWmLBhnv4=oc_0KRiz-UORJ0Dex+w@mail.gmail.com	2026-03-26 11:36:52 -04:00
Andres Freund	cf66978d79	Fix off-by-one error in read IO tracing AsyncReadBuffers()'s no-IO-needed path passed TRACE_POSTGRESQL_BUFFER_READ_DONE the wrong block number because it had already incremented operation->nblocks_done. Fix by folding the nblocks_done offset into the blocknum local variable at initialization. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/u73un3xeljr4fiidzwi4ikcr6vm7oqugn4fo5vqpstjio6anl2%40hph6fvdiiria Backpatch-through: 18	2026-03-26 10:38:56 -04:00
Andres Freund	906a046972	aio: Refactor tests in preparation for more tests In a future commit more AIO related tests are due to be introduced. However 001_aio.pl already is fairly large. This commit introduces a new TestAio package with helpers for writing AIO related tests. Then it uses the new helpers to simplify the existing 001_aio.pl by iterating over all supported io_methods. This will be particularly helpful because additional methods already have been submitted. Additionally this commit splits out testing of initdb using a non-default method into its own test. While that test is somewhat important, it's fairly slow and doesn't break that often. For development velocity it's helpful for 001_aio.pl to be faster. While particularly the latter could benefit from being its own commit, it seems to introduce more back-and-forth than it's worth. Author: Andres Freund <andres@anarazel.de> Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com> Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Discussion: https://postgr.es/m/zljergweqti7x67lg5ije2rzjusie37nslsnkjkkby4laqqbfw@3p3zu522yykv	2026-03-26 10:38:56 -04:00
Robert Haas	47c110f77e	Respect disabled_nodes in fix_alternative_subplan. When my commit `e222534679` added the concept of disabled_nodes, it failed to add a disabled_nodes field to SubPlan. This is a regression: before that commit, when fix_alternative_subplan compared the costs of two plans, the number of disabled nodes affected the result, because it was just a component of the total cost. After that commit, it no longer did, making it possible for a disabled path to win on cost over one that is not disabled. Fix that. As usual for planner fixes that might destabilize plan choices, no back-patch. Discussion: https://postgr.es/m/CA+TgmoaK=4w7-qknUo3QhUJ53pXZq=c=KgZmRyD+k7ytqfmgSg@mail.gmail.com Reviewed-by: Lukas Fittl <lukas@fittl.com>	2026-03-26 10:25:04 -04:00
Peter Eisentraut	119e791e9c	Fix -Wcast-qual warning This dials back a couple of the qualifiers added by commit `7724cb9935`. Specifically, in match_boolean_partition_clause() the call to negate_clause() casts away the const, so we shouldn't make the input argument const.	2026-03-26 15:00:24 +01:00
Fujii Masao	400a790a48	Avoid sending duplicate WAL locations in standby status replies Previously, when the startup process applied WAL and requested walreceiver to send an apply notification to the primary, walreceiver sent a status reply unconditionally, even if the WAL locations had not advanced since the previous update. As a result, the standby could send two consecutive status reply messages with identical WAL locations even though wal_receiver_status_interval had not yet elapsed. This could unexpectedly reset the reported replication lag, making it difficult for users to monitor lag. The second message was also unnecessary because it reported no progress. This commit updates walreceiver to send a reply only when the apply location has advanced since the last status update, even when the startup process requests a notification. Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Shinya Kato <shinya11.kato@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CAOzEurTzcUrEzrH97DD7+Yz=HGPU81kzWQonKZvqBwYhx2G9_A@mail.gmail.com	2026-03-26 20:54:32 +09:00
Fujii Masao	eef1ba704d	Fix premature NULL lag reporting in pg_stat_replication pg_stat_replication is documented to keep the last measured lag values for a short time after the standby catches up, and then set them to NULL when there is no WAL activity. However, previously lag values could become NULL prematurely even while WAL activity was ongoing, especially in logical replication. This happened because the code cleared lag when two consecutive reply messages indicated that the apply location had caught up with the send location. It did not verify that the reported positions were unchanged, so lag could be cleared even when positions had advanced between messages. In logical replication, where the apply location often quickly catches up, this issue was more likely to occur. This commit fixes the issue by clearing lag only when the standby reports that it has fully replayed WAL (i.e., both flush and apply locations have caught up with the send location) and the write/flush/apply positions remain unchanged across two consecutive reply messages. The second message with unchanged positions typically results from wal_receiver_status_interval, so lag values are cleared after that interval when there is no activity. This avoids showing stale lag data while preventing premature NULL values. Even with this fix, lag may rarely become NULL during activity if identical position reports are sent repeatedly. Eliminating such duplicate messages would address this fully, but that change is considered too invasive for stable branches and will be handled in master only later. Backpatch to all supported branches. Author: Shinya Kato <shinya11.kato@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/CAOzEurTzcUrEzrH97DD7+Yz=HGPU81kzWQonKZvqBwYhx2G9_A@mail.gmail.com Backpatch-through: 14	2026-03-26 20:49:31 +09:00
Heikki Linnakangas	6b8238cb6a	Refactor ShmemIndex initialization Initialize the ShmemIndex hash table in InitShmemAllocator() already, removing the need for the separate InitShmemIndex() step. Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Discussion: https://www.postgresql.org/message-id/CAExHW5vM1bneLYfg0wGeAa=52UiJ3z4vKd3AJ72X8Fw6k3KKrg@mail.gmail.com	2026-03-26 11:35:55 +02:00
Peter Eisentraut	515b0dc4bc	MSVC: Remove unnecessary warning option The MSVC warning option /w24777 added by commit `2307cfe316` was a typo, it should have been /w24477. But this option is already enabled by default in level 1, so we don't need to add it explicitly. So just remove it.	2026-03-26 09:10:42 +01:00
Peter Eisentraut	f8e7ca3285	Make fixed-length list building macros work in C++ Compound literals, as used in pg_list.h for list_makeN(), are not a C++ feature. MSVC doesn't accept these. (GCC and Clang accept them, but they would warn in -pedantic mode.) Replace with equivalent inline functions. (These are the only instances of compound literals used in PostgreSQL header files.) Author: Jelte Fennema-Nio <postgres@jeltef.nl> Discussion: https://www.postgresql.org/message-id/flat/CAGECzQR21OnnKiZO_1rLWO0-16kg1JBxnVq-wymYW0-_1cUNtg%40mail.gmail.com	2026-03-26 08:53:13 +01:00
Amit Kapila	735e8fe685	Refactor replorigin_session_setup() for better readability. Reorder the validation checks in replorigin_session_setup() to provide a more logical flow. This makes the function easier to follow and ensures that basic state checks are performed consistently. Additionally, update an error message to align its phrasing with similar diagnostics in the replication origin subsystem, improving overall consistency. Author: Heikki Linnakangas <hlinnaka@iki.fi> Reviewed-by: shveta malik <shveta.malik@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/e0508305-bc6a-417c-b969-36564d632f9e@iki.fi	2026-03-26 09:15:25 +05:30
Masahiko Sawada	89210037a0	Fix UUID sortability tests in base32hex encoding. Commit `497c1170cb` added base32hex encoding support, but its regression test for UUIDs failed on buildfarm members hippopotamus and jay using natural language locales (such as cs_CZ). This happened because those collations may sort characters differently, which breaks the strict byte-wise lexicographical ordering expected by base32hex encoding. This commit fixes the regression tests by explicitly using the C collation. Per buildfarm members hippopotamus and jay. Analyzed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/682417.1774482047@sss.pgh.pa.us	2026-03-25 20:12:26 -07:00
Michael Paquier	4287c50fc2	Improve timeout handling of pg_promote() Previously, pg_promote() looped a fixed number of times, calculated from the specified timeout, and waited 100ms on a latch, once per iteration, for the promotion of a standby to complete. However, unrelated signals to the backend could set the latch and wake up the backend early, consuming the loop iterations faster and producing an execution time that does not match the timeout given as input. This could be confusing for the function caller, especially if some backend-side timeout is aggressive, because the function would return much earlier than expected and report that the promote request has not completed within the time requested. This commit refines the logic to track the time actually elapsed, by looping until the requested duration has truly passed. The code calculates the end time we expect, then uses it when looping. Author: Robert Pang <robertpang@google.com> Reviewed-by: Tiancheng Ge <getiancheng_2012@163.com> Discussion: https://postgr.es/m/CAJhEC07OK8J7tLUbyiccnuOXRE7UKxBNqD2-pLfeFXa=tBoWtw@mail.gmail.com	2026-03-26 10:39:40 +09:00
Tom Lane	e9d723487b	Remove a low-value, high-risk optimization in pg_waldump. The code removed here deleted already-used data from a partially-read WAL segment's hashtable entry. The intent was evidently to try to keep the entry's memory consumption below the WAL segment's total size, but we don't use WAL segments that are so large as to make that a big win. The important memory-space optimization is to remove hashtable entries altogether when done with them, and that's handled elsewhere. To buy that, we must accept a substantially more complex (and under-documented) logical invariant about what is in entry->buf, as well as complex and under-documented interactions with the entry spilling logic, various re-checking code paths in xlogreader.c, and pg_waldump's overall data processing order. Any of those aspects could have bugs lurking still, and are quite likely to be prone to new bugs after future code changes. Given the number of bugs we've already found in commit `b15c15139`, I judge that simplifying anything we possibly can is a good decision. While here, revise and extend some related comments. Discussion: https://postgr.es/m/374225.1774459521@sss.pgh.pa.us	2026-03-25 19:15:52 -04:00
Tom Lane	ff84efe4fd	Fix misuse of simplehash.h hash operations in pg_waldump. Both ArchivedWAL_insert() and ArchivedWAL_delete_item() can cause existing hashtable entries to move. The code didn't account for this and could leave privateInfo->cur_file pointing at a dead or incorrect entry, with hilarity ensuing. Likewise, read_archive_wal_page calls read_archive_file which could result in movement of the hashtable entry it is working with. I believe these bugs explain some odd buildfarm failures, although the amount of data we use in pg_waldump's TAP tests isn't enough to trigger them reliably. This code's all new as of commit `b15c15139`, so no need for back-patch. Discussion: https://postgr.es/m/374225.1774459521@sss.pgh.pa.us	2026-03-25 18:37:28 -04:00
Tom Lane	03b1e30e7a	Fix file descriptor leakages in pg_waldump. TarWALDumpCloseSegment was of the opinion that it didn't need to do anything. It was mistaken: it has to close the open file if any, because nothing else will, leading to a descriptor leak. In addition, we failed to ensure that any file being read by the XLogReader machinery gets closed before the atexit callback tries to cleanup the temporary directory holding spilled WAL files. While the file would have been closed already in case of a success exit, this doesn't happen in case of pg_fatal() exits. The least messy way to fix that is to move the atexit function into pg_waldump.c, where it has easier access to the XLogReaderState pointer and to WALDumpCloseSegment. These FD leakages are pretty insignificant on Unix-ish platforms, but they're a bug on Windows, because they prevent successful cleanup of the temporary directory for extracted WAL files. (Windows can't delete a directory that holds a deleted-but-still-open file.) This is visible in occasional buildfarm failures. This code's all new as of commit `b15c15139`, so no need for back-patch. Discussion: https://postgr.es/m/374225.1774459521@sss.pgh.pa.us	2026-03-25 18:28:42 -04:00
Masahiko Sawada	497c1170cb	Add base32hex support to encode() and decode() functions. This adds support for base32hex encoding and decoding, as defined in RFC 4648 Section 7. Unlike standard base32, base32hex uses the extended hex alphabet (0-9, A-V) which preserves the lexicographical order of the encoded data. This is particularly useful for representing UUIDv7 values in a compact string format while maintaining their time-ordered sort property. The encode() function produces output padded with '=', while decode() accepts both padded and unpadded input. Following the behavior of other encoding types, decoding is case-insensitive. Suggested-by: Sergey Prokhorenko <sergeyprokhorenko@yahoo.com.au> Author: Andrey Borodin <x4mmm@yandex-team.ru> Co-authored-by: Aleksander Alekseev <aleksander@tigerdata.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Илья Чердаков <i.cherdakov.pg@gmail.com> Reviewed-by: Chengxi Sun <chengxisun92@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CAJ7c6TOramr1UTLcyB128LWMqita1Y7%3Darq3KHaU%3Dqikf5yKOQ%40mail.gmail.com	2026-03-25 11:35:19 -07:00
Álvaro Herrera	c8b4a3ec08	Remove unused autovac_table.at_sharedrel The last use was removed by commit `38f7831d70`. After that, we compute MyWorkerInfo->wi_sharedrel directly from the pg_class tuple of the table being vacuumed rather than passing it around. Author: Yugo Nagata <nagata@sraoss.co.jp> Discussion: https://postgr.es/m/20260325165734.7ab8e4e55fe4c2f1e55031d9@sraoss.co.jp	2026-03-25 18:24:34 +01:00
Masahiko Sawada	5fa7837d9a	psql: Fix tab completion for FOREIGN DATA WRAPPER and SUBSCRIPTION. Commit `8185bb5347` extended the CREATE/ALTER SUBSCRIPTION and CREATE/ALTER FOREIGN DATA WRAPPER commands, but missed the corresponding tab-completion logic. This commit fixes that oversight by adding completion support for: - The CONNECTION keyword in CREATE/ALTER FOREIGN DATA WRAPPER. - The list of foreign servers in CREATE/ALTER SUBSCRIPTION. Author: Yamaguchi Atsuo <acrobatcoder@gmail.com> Discussion: https://postgr.es/m/CAKSyusJWdWcUKVd3qJXcEaQxJewGymQWV_r3-mc=Knrqo0AZ_g@mail.gmail.com	2026-03-25 09:30:26 -07:00
Peter Eisentraut	87e1891c45	Remove compiler warning option -Wendif-labels This warning has always been on by default in GCC (and in Clang at least going back to 3.1), so we don't need the option explicitly. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/aa73q1aT0A3/vke/%40ip-10-97-1-34.eu-west-3.compute.internal	2026-03-25 15:04:18 +01:00
Peter Eisentraut	bccfc73acd	Disable warnings in system headers in MSVC This is similar to the standard behavior in GCC. For MSVC, we set all headers in angle brackets to be considered system headers. (GCC goes by path, not include style.) The required option is available since VS 2017. (Before VS 2019 version 16.10, the additional option /experimental:external is required, but per discussion in [0], we effectively require 16.11, so this shouldn't be a problem.) [0]: https://www.postgresql.org/message-id/04ab76a3-186c-4a37-8076-e6882ebf9d43%40eisentraut.org Then, we can remove one workaround for avoiding a warning from a system header. (And some warnings to be enabled in the future could benefit from this.) Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/aa73q1aT0A3/vke/%40ip-10-97-1-34.eu-west-3.compute.internal	2026-03-25 15:03:52 +01:00
Peter Eisentraut	5282bf535e	Fix some typos and make small stylistic improvements for commit `2f094e7ac6` Author: zengman <zengman@halodbtech.com> Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/a855795d-e697-4fa5-8698-d20122126567@eisentraut.org	2026-03-25 09:17:40 +01:00
Peter Eisentraut	c79e414127	Fix typo Mistake in commit `e2f289e5b9`: SOFT_ERROR_OCCURRED was called with the wrong fcinfo field. Reported-by: Jianghua Yang <yjhjstz@gmail.com> Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CAAZLFmSGti716gWeY%3DDCZ9TTVOixnHZ4_4V4tDzoeE86D64vOA%40mail.gmail.com	2026-03-25 07:09:44 +01:00
Amit Kapila	6b5b7eae3a	pg_createsubscriber: Add -l/--logdir option to redirect output to files. This commit introduces a -l (or --logdir) argument to pg_createsubscriber, allowing users to specify a directory for log files. When enabled, a timestamped subdirectory is created within the specified log directory, containing: pg_createsubscriber_server.log: Captures logs from the standby server during its start/stop cycles. pg_createsubscriber_internal.log: Captures the tool's own internal diagnostic and progress messages. This ensures that transient server and utility messages are preserved for troubleshooting after the subscriber creation process completes or errors out. Author: Gyan Sreejith <gyan.sreejith@gmail.com> Author: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: vignesh C <vignesh21@gmail.com> Reviewed-by: Euler Taveira <euler@eulerto.com> Reviewed-by: shveta malik <shveta.malik@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Shlok Kyal <shlok.kyal.oss@gmail.com> Reviewed-by: Peter Smith <smithpb2250@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CAEqnbaUthOQARV1dscGvB_EsqC-YfxiM6rWkVDHc+G+f4oSUHw@mail.gmail.com	2026-03-25 11:22:07 +05:30
John Naylor	be6a7494d2	Refactor handling of x86 CPUID instructions Introduce two helpers for CPUID, pg_cpuid and pg_cpuid_subleaf, that wrap the platform-specific __get_cpuid/__cpuid and __get_cpuid_count/__cpuidex functions. Additionally, use macros to specify register names (e.g. EAX) for clarity, instead of numeric indexes into the result array. Author: Lukas Fittl <lukas@fittl.com> Suggested-By: John Naylor <john.naylor@postgresql.org> Discussion: https://postgr.es/m/CANWCAZZ+Crjt5za9YmFsURRMDW7M4T2mutDezd_3s1gTLnrzGQ@mail.gmail.com	2026-03-25 12:32:36 +07:00
Michael Paquier	7c64d56fd9	Remove isolation test lock-stats This test is proving to be unstable in the CI for Windows, at least. The origin of the issue is that the deadlock_timeout requests may not be processed, causing the lock stats to not be updated. This could be mitigated by making the hardcoded sleep longer, but that would increase the runtime on fast machines. On slow machines, there is no guarantee that a longer sleep would be enough. An isolation test may not be the best method to write this test (a TAP test with an injection point and a NOTICE+wait_for_log before processing the deadlock_timeout request should remove the need for a sleep). As we are late in the release cycle, I am removing the test for now to keep the CI and the buildfarm as stable as possible. Let's revisit this part later. Discussion: https://postgr.es/m/hlkdrplgrmudbspibsuq6xooxrqxqsgwo6x5b6x5ptvkgjbe7w@xogt6xgua6dz	2026-03-25 08:48:15 +09:00
Jeff Davis	11f8018ee6	Refactor to remove ForeignServerName(). Callers either have a ForeignServer object or can readily construct one. Discussion: https://postgr.es/m/CAExHW5vV5znEvecX=ra2-v7UBj9-M6qvdDzuB78M-TxbYD1PEA@mail.gmail.com Suggested-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>	2026-03-24 15:20:28 -07:00
Jeff Davis	f16f5d608c	GetSubscription(): use per-object memory context. Constructing a Subscription object uses a number of small or temporary allocations. Use a per-object memory context for easy cleanup. Get rid of FreeSubscription(), which did not free all the allocations anyway. Also get rid of the PG_TRY()/PG_CATCH() logic in ForeignServerConnectionString(), which was used to avoid leaks during GetSubscription(). Co-authored-by: Álvaro Herrera <alvherre@kurilemu.de> Suggested-by: Andres Freund <andres@anarazel.de> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/xvdjrdqnpap3uq7owbaox3r7p5gf7sv62aaqf2ju3vb6yglatr%40kvvwhoudrlxq Discussion: https://postgr.es/m/CAA4eK1K=WjZ1maBCmj=5ZdO66AwPORK5ZBxVKedS0xdCcb621A@mail.gmail.com	2026-03-24 15:11:45 -07:00
Melanie Plageman	a881cc9c7e	Remove XLOG_HEAP2_VISIBLE entirely There are no remaining users that emit XLOG_HEAP2_VISIBLE records, so it can be removed. This includes deleting the xl_heap_visible struct and all functions responsible for emitting or replaying XLOG_HEAP2_VISIBLE records. Bumps XLOG_PAGE_MAGIC because we removed a WAL record type. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/flat/CAAKRu_ZMw6Npd_qm2KM%2BFwQ3cMOMx1Dh3VMhp8-V7SOLxdK9-g%40mail.gmail.com	2026-03-24 17:58:12 -04:00
Melanie Plageman	a759ced2f1	WAL log VM setting for empty pages in XLOG_HEAP2_PRUNE_VACUUM_SCAN As part of removing XLOG_HEAP2_VISIBLE records, phase I of VACUUM now marks empty pages all-visible and all-frozen in a XLOG_HEAP2_PRUNE_VACUUM_SCAN record. This has no real independent benefit, but empty pages were the last user of XLOG_HEAP2_VISIBLE, so by making this change we can next remove all of the XLOG_HEAP2_VISIBLE code. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Earlier version Reviewed-by: Robert Haas <robertmhaas@gmail.com>	2026-03-24 17:30:54 -04:00
Melanie Plageman	1252a4ee28	WAL log VM setting during vacuum phase I in XLOG_HEAP2_PRUNE_VACUUM_SCAN Vacuum no longer emits a separate WAL record for each page set all-visible or all-frozen during phase I. Instead, visibility map updates are now included in the XLOG_HEAP2_PRUNE_VACUUM_SCAN record that is already emitted for pruning and freezing. Previously, heap_page_prune_and_freeze() determined whether a page was all-visible, but the corresponding VM bits were only set later in lazy_scan_prune(). Now the VM is updated immediately in heap_page_prune_and_freeze(), at the same time as the heap modifications. This reduces WAL volume produced by vacuum. For now, vacuum is still the only user of heap_page_prune_and_freeze() allowed to set the VM. On-access pruning is not yet able to set the VM. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Earlier version Reviewed-by: Robert Haas <robertmhaas@gmail.com> Discussion: https://postgr.es/m/flat/CAAKRu_ZMw6Npd_qm2KM%2BFwQ3cMOMx1Dh3VMhp8-V7SOLxdK9-g%40mail.gmail.com	2026-03-24 16:49:46 -04:00
Robert Haas	dc47beacaa	get_memoize_path: Don't exit quickly when PGS_NESTLOOP_PLAIN is unset. This function exits early in the case where the number of inner rows is estimated to be less than 2, on the theory that in that case a Nested Loop with inner Memoize must lose to a plain Nested Loop. But since commit `4020b370f2` it's possible for a plain Nested Loop to be disabled, while a Nested Loop with inner Memoize is still enabled. In that case, this reasoning is not valid, so adjust the code not to exit early in that case. This issue was revealed by a test_plan_advice failure on buildfarm member skink, where NESTED_LOOP_MEMOIZE() couldn't be enforced on replanning due to this early exit. Discussion: http://postgr.es/m/CA+TgmoZUN8FT1Ah=m6Uis5bHa4FUa+_hMDWtcABG17toEfpiUg@mail.gmail.com	2026-03-24 16:17:26 -04:00
Melanie Plageman	9ba3ec076a	Keep newest live XID up-to-date even if page not all-visible During pruning, we keep track of the newest xmin of live tuples on the page visible to all running and future transactions so that we can use it later as the snapshot conflict horizon when setting the VM if the page turns out to be all-visible. Previously, we stopped updating this value once we determined the page was not all-visible. However, maintaining it even when the page is not all-visible is inexpensive and makes the snapshot conflict horizon calculation clearer. This guarantees it won't contain a stale value. Since we'll keep it up to date all the time now anyway, there's no reason not to maintain set_all_visible for on-access pruning. This will allow us to set the VM on-access in the future. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/bqc4kh5midfn44gnjiqez3bjqv4zogydguvdn446riw45jcf3y%404ez66il7ebvk	2026-03-24 15:37:18 -04:00
Melanie Plageman	dd5716f3c7	Use GlobalVisState in vacuum to determine page level visibility During vacuum's first and third phases, we examine tuples' visibility to determine if we can set the page all-visible in the visibility map. Previously, this check compared tuple xmins against a single XID chosen at the start of vacuum (OldestXmin). We now use GlobalVisState, which enables future work to set the VM during on-access pruning, since ordinary queries have access to GlobalVisState but not OldestXmin. This also benefits vacuum: in some cases, GlobalVisState may advance during a vacuum, allowing more pages to become considered all-visible. And, in the future, we could easily add a heuristic to update GlobalVisState more frequently during vacuums of large tables. OldestXmin is still used for freezing and as a backstop to ensure we don't freeze a dead tuple that wasn't yet prunable according to GlobalVisState in the rare occurrences where GlobalVisState moves backwards. Because comparing a transaction ID against GlobalVisState is more expensive than comparing against a single XID, we defer this check until after scanning all tuples on the page. Therefore, we perform the GlobalVisState check only once per page. This is safe because visibility_cutoff_xid records the newest live xmin on the page; if it is globally visible, then the entire page is all-visible. Using GlobalVisState means on-access pruning can also maintain visibility_cutoff_xid, which is required to set the visibility map on-access in the future. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Discussion: https://postgr.es/m/flat/bqc4kh5midfn44gnjiqez3bjqv4zogydguvdn446riw45jcf3y%404ez66il7ebvk#c755ef151507aba58471ffaca607e493	2026-03-24 14:50:59 -04:00
Álvaro Herrera	f227b7b20c	Avoid including clog.h in proc.h The number of .c files that must include access/clog.h can currently be counted on one's fingers and miss only one (assuming one has the usual number of hands). However, due to indirect inclusion via proc.h, there's a lot of files that are pointlessly including it. This is easy to avoid with the easy trick implemented by this commit. Author: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/202603221856.iwlhitt6dxxx@alvherre.pgsql	2026-03-24 17:31:16 +01:00
Tom Lane	6e243d81c5	Fix poorly-sized buffers in astreamer compression modules. astreamer_gzip.c and astreamer_lz4.c left their decompression output buffers at StringInfo's default allocation, merely 1kB. This results in a lot of ping-ponging between the decompressor and the next astreamer filter. This patch increases these buffer sizes to 256kB. In a simple test this had a small but measurable effect (saving a few percent) on the overall runtime of pg_waldump for the gzipped-data case; I didn't bother measuring for lz4. astreamer_zstd.c used ZSTD_DStreamOutSize() to size its compression output buffer, but the libzstd API says you should use ZSTD_CStreamOutSize(); ZSTD_DStreamOutSize() is for decompression. The two functions seem to produce the same value (256kB) here, so this is just cosmetic, but nonetheless we should play by the rules. While these issues are old, they don't seem significant enough to warrant back-patching. Discussion: https://postgr.es/m/3424809.1774234940@sss.pgh.pa.us	2026-03-24 12:17:12 -04:00
Tom Lane	ca1f1ade3f	Remove read_archive_file()'s "count" parameter. Instead, always try to fill the allocated buffer completely. The previous coding apparently intended (though it's undocumented) to read only small amounts of data until we are able to identify the WAL segment size and begin filtering out unwanted segments. However this extra complication has no measurable value according to simple testing here, and it could easily be a net loss if there is a substantial amount of non-WAL data in the archive file before the first WAL file. Discussion: https://postgr.es/m/3341199.1774221191@sss.pgh.pa.us	2026-03-24 12:17:12 -04:00
Álvaro Herrera	2102ebb195	Don't include storage/lock.h in so many headers Since storage/locktags.h was added by commit `322bab7974`, many headers can be made leaner by depending on that instead of on storage/lock.h, which has many other dependencies. (In fact, some of these changes were possible even before that.) Author: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://postgr.es/m/abvrRZo52Yx9ZzWQ@ip-10-97-1-34.eu-west-3.compute.internal	2026-03-24 17:11:12 +01:00
Álvaro Herrera	5f2350a043	Fix dereference in a couple of GUC check hooks check_backtrace_functions() and check_archive_directory() were doing an empty-string check this way: *newval[0] == '\0' which, because of operator precedence, is interpreted as *(newval[0]) instead of (*newval)[0] -- but these variables are pointers to C-strings and we want to check the first character therein, rather than check the first pointer of the array, so that interpretation is wrong. This would be wrong for any index element other than 0, as evidenced by every other dereference of the same variable in check_backtrace_functions, which use parentheses. Add parentheses to make the intended dereference explicit. This is just cosmetic at this stage, so no backpatch, although it's been "wrong" for a long time. Author: Zhang Hu <kongbaik228@gmail.com> Reviewed-by: Junwang Zhao <zhjwpku@gmail.com> Reviewed-by: Chao Li <lic@highgo.com> Discussion: https://postgr.es/m/CAB5m2QssN6UO+ckr6ZCcV0A71mKUB6WdiTw1nHo43v4DTW1Dfg@mail.gmail.com	2026-03-24 16:45:39 +01:00
Nathan Bossart	c7b9f16113	test_bloomfilter: Fix error message. The error message in question uses the wrong format specifier and variable. This has been wrong for a while, but since it's in a test module and wasn't noticed until just now, no back-patch. Oversight in commit `51bc271790`. Author: Jianghua Yang <yjhjstz@gmail.com> Discussion: https://postgr.es/m/CAAZLFmS2OMiwe65gdm-MKgO%3DLnKatGMSK6JWxhycGN3TWrhbnw%40mail.gmail.com	2026-03-24 09:32:15 -05:00
Robert Haas	4647ee2da3	Add a test for creating an index on a whole-row expression. Surprisingly, we have no existing test for this. Had this test been present before commit `570e2fcc04` the Assert added in commit `c98ad086ad` would have caught the bug. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: http://postgr.es/m/CA+TgmoacixUZVvi00hOjk_d9B4iYKswWP1gNqQ8Vfray-AcOCA@mail.gmail.com	2026-03-24 10:06:46 -04:00
Peter Eisentraut	6bc7449eac	Fix accidentally casting away const Recently introduced in commit `4b5ba0c4ca`.	2026-03-24 14:34:50 +01:00
Fujii Masao	1c162c965a	Report detailed errors from XLogFindNextRecord() failures. Previously, XLogFindNextRecord() did not return detailed error information when it failed to find a valid WAL record. As a result, callers such as the WAL summarizer, pg_waldump, and pg_walinspect could only report generic errors (e.g., "could not find a valid record after ..."), making troubleshooting difficult. This commit fixes the issue by extending XLogFindNextRecord() to return detailed error information on failure, and updating its callers to include those details in their error messages. For example, when pg_waldump is run on a WAL file with an invalid magic number, it now reports not only the generic error but also the specific cause (e.g., "invalid magic number"). Author: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com> Reviewed-by: Mircea Cadariu <cadariu.mircea@gmail.com> Reviewed-by: Japin Li <japinli@hotmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/CAO6_XqoxJXddcT4wkd9Xd+cD6Sz-fyspRGuV4Bq-wbXG4pVNzA@mail.gmail.com	2026-03-24 22:33:09 +09:00
Robert Haas	c98ad086ad	Bounds-check access to TupleDescAttr with an Assert. The second argument to TupleDescAttr should always be at least zero and less than natts; otherwise, we index outside of the attribute array. Assert that this is the case. Various violations, or possible violations, of this rule that are currently in the tree are actually harmless, because while we do call TupleDescAttr() before verifying that the argument is within range, we don't actually dereference it unless the argument was within range all along. Nonetheless, the Assert means we should be more careful, so tidy up accordingly. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: http://postgr.es/m/CA+TgmoacixUZVvi00hOjk_d9B4iYKswWP1gNqQ8Vfray-AcOCA@mail.gmail.com	2026-03-24 08:58:50 -04:00
Peter Eisentraut	e2f289e5b9	Make many cast functions error safe This adjusts many C functions underlying casts to support soft errors. This is in preparation for a future feature where conversion errors in casts can be caught. This patch covers cast functions that can be adjusted easily by changing ereport to ereturn or making other light changes. The underlying helper functions were already changed to support soft errors some time ago as part of soft error support in type input functions. Other casts and types will require some more work and are being kept as separate patches. Author: jian he <jian.universality@gmail.com> Reviewed-by: Amul Sul <sulamul@gmail.com> Reviewed-by: Corey Huinker <corey.huinker@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CADkLM%3Dfv1JfY4Ufa-jcwwNbjQixNViskQ8jZu3Tz_p656i_4hQ%40mail.gmail.com	2026-03-24 12:08:22 +01:00
Robert Haas	570e2fcc04	Prevent spurious "indexes on virtual generated columns are not supported". Both of the checks in DefineIndex() that can produce this error message have a guard against negative attribute numbers, but lack a guard to ensure that attno is non-zero. As a result, we can index off the beginning of the TupleDesc and read a garbage byte for attgenerated. If that byte happens to be 'v', we'll incorrectly produce the error mentioned above. The first call site is easy to hit: any attempt to create an expression index does so. The second one is not currently hit in the regression tests, but can be hit by something like CREATE INDEX ON some_table ((some_function(some_table))). Found by study of a test_plan_advice failure on buildfarm member skink, though this issue has nothing to do with test_plan_advice and seems to have only been revealed by happenstance. Backpatch-through: 18 Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: http://postgr.es/m/CA+TgmoacixUZVvi00hOjk_d9B4iYKswWP1gNqQ8Vfray-AcOCA@mail.gmail.com	2026-03-24 06:28:33 -04:00
John Naylor	d2a1aa77c2	Fix copy-paste error in test_ginpostinglist The check for a mismatch on the second decoded item pointer was an exact copy of the first item pointer check, comparing orig_itemptrs[0] with decoded_itemptrs[0] instead of orig_itemptrs[1] with decoded_itemptrs[1]. The error message also reported (0, 1) as the expected value instead of (blk, off). As a result, any decoding error in the second item pointer (where the varbyte delta encoding is exercised) would go undetected. This has been wrong since commit `bde7493d1`, so backpatch to all supported versions. Author: Jianghua Yang <yjhjstz@gmail.com> Discussion: https://postgr.es/m/CAAZLFmSOD8R7tZjRLZsmpKtJLoqjgawAaM-Pne1j8B_Q2aQK8w@mail.gmail.com Backpatch-through: 14	2026-03-24 17:14:11 +07:00
Alexander Korotkov	6888658516	Further improve commentary about ChangeVarNodesWalkExpression() The updated comment explains why we use ChangeVarNodes_walker() instead of expression_tree_walker(), and provides a bit more detail about the differences in processing top-level Query and subqueries. Author: Alexander Korotkov <aekorotkov@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAPpHfdvbjq342WTQ705Wmqhe8794pcp7wospz%2BWUJ2qB7vuOqA%40mail.gmail.com Backpatch-through: 18	2026-03-24 09:54:00 +02:00
Michael Paquier	4019f725f5	Add support for lock statistics in pgstats This commit adds a new stats kind, called PGSTAT_KIND_LOCK, implementing statistics for lock tags, as reported by pg_locks. The implementation is fixed-sized, as the data is capped based on the number of lock tags in LockTagType. The new statistics kind records the following fields, providing insight regarding lock behavior, while avoiding impact on performance-critical code paths (such as fast-path lock acquisition): - waits and wait_time: respectively track the number of times a lock required waiting and the total time spent acquiring it. These metrics are only collected once a lock is successfully acquired and after deadlock_timeout has been exceeded. - fastpath_exceeded: counts how often a lock could not be acquired via the fast path due to the max_locks_per_transaction slot limits. A new view called pg_stat_lock can be used to access this data, coupled with a SQL function called pg_stat_get_lock(). Bump stat file format PGSTAT_FILE_FORMAT_ID. Bump catalog version. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/aIyNxBWFCybgBZBS%40ip-10-97-1-34.eu-west-3.compute.internal	2026-03-24 15:32:09 +09:00
Michael Paquier	a90d865182	Move some code blocks in lock.c and proc.c This change will simplify an upcoming change that will introduce lock statistics, reducing code churn. This commit means that we begin to calculate the time it took to acquire a lock after the deadlock check interrupt has run should log_lock_waits be off, when taken in isolation. This is not a performance-critical code path, and note that log_lock_waits is enabled by default since `2aac62be8c`. Extracted from a larger patch by the same author. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/aIyNxBWFCybgBZBS@ip-10-97-1-34.eu-west-3.compute.internal	2026-03-24 13:34:54 +09:00
Michael Paquier	3d10ece612	Make implementation of SASLprep compliant for ASCII characters This commit makes our implementation of SASLprep() compliant with RFC 3454 (Stringprep) and RFC 4013 (SASLprep). Originally, as introduced in `60f11b87a2`, the operation considered a password made of only ASCII characters as valid, performing an optimization for this case to skip the internal NFKC transformation. However, the RFCs listed above use a different definition, with the following characters being prohibited: - 0x00~0x1F (0~31), control characters. - 0x7F (127, DEL). In its SCRAM protocol, Postgres has the idea to apply a password as-is if SASLprep() is not a success, so this change is safe on backward-compatibility grounds: - A libpq client with the compliant SASLprep can connect to a server with a non-compliant SASLprep. - A libpq client with the non-compliant SASLprep can connect to a server with a compliant SASLprep. This commit removes the all-ASCII optimization used in pg_saslprep() and applies SASLprep even if a password is made only of ASCII characters, making the operation compatible with the RFC. All the in-core callers of pg_saslprep() do that: - pg_be_scram_build_secret() in auth-scram.c, when generating a SCRAM verifier for rolpassword in the backend. - scram_init() in fe-auth-scram.c, when starting the SASL exchange. - pg_fe_scram_build_secret() in fe-auth-scram.c, when generating a SCRAM verifier for the frontend with libpq, to generate it for an ALTER/CREATE ROLE command for example. The test module test_saslprep shows the difference this change is leading to. Author: Michael Paquier <michael@paquier.xyz> Reviewed-by: John Naylor <johncnaylorls@gmail.com> Discussion: https://postgr.es/m/aaEJ-El2seZHeFcG@paquier.xyz	2026-03-24 08:29:23 +09:00
Tom Lane	2e123e3c2b	Silence compiler warning from older compilers. Our RHEL7-vintage buildfarm animals are complaining about "the comparison will always evaluate as true" for a usage of SOFT_ERROR_OCCURRED() on a local variable. This is the same issue addressed in `7bc88c3d6` and some earlier commits, so solve it the same way: write "escontext.error_occurred" instead. Problem dates to recent commit `a0b6ef29a`, no need for back-patch.	2026-03-23 17:25:12 -04:00
Tom Lane	7c08a7e809	Doc: minor improvements to SNI documentation. My attention was drawn to this new documentation by overlength-line complaints in the PDF docs builds: the synopsis for hostname lines was too wide. I initially thought of shortening the parameter names to fit, but it turns out that adding <optional> markup is enough to persuade DocBook to break the line, and that seems more helpful anyway. While here, I couldn't resist some copy-editing, mostly being consistent about whether to use Oxford commas or not. The biggest change was to re-order the entries in the hostname-values table to match the running text.	2026-03-23 15:33:51 -04:00
Tom Lane	99d6aa64ef	Doc: document how EXPLAIN ANALYZE reports parallel queries. This wasn't covered anywhere before... Reported-by: Marcos Pegoraro <marcos@f10.com.br> Author: Maciek Sakrejda <maciek@pganalyze.com> Reviewed-by: Ilia Evdokimov <ilya.evdokimov@tantorlabs.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAB-JLwYCgdiB=trauAV1HN5rAWQdvDGgaaY_mqziN88pBTvqqg@mail.gmail.com	2026-03-23 14:48:58 -04:00
Bruce Momjian	0a68fd70cb	doc: make "datadir" argument specification more specific Previously these cases were listed as "directory". Author: Peter Smith Discussion: https://postgr.es/m/CAHut+PvCOQqMi0zRk3GecbYzm5xX1wQixxm9Qs3oXXr5fFCUgw@mail.gmail.com	2026-03-23 12:13:31 -04:00
Tom Lane	360dd6f7b4	Improve commentary about ChangeVarNodesWalkExpression(). IMO the proximate cause of the bug fixed in commit `07b7a964d` was sloppy thinking about what ChangeVarNodesWalkExpression() is to be used for. Flesh out its header comment to try to improve that situation. Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/1607553.1774017006@sss.pgh.pa.us Backpatch-through: 18	2026-03-23 11:14:24 -04:00
Michael Paquier	93b76db0ac	Fix invalid value of pg_aios.pid, function pg_get_aios() When the value of pg_aios.pid is found to be 0, the function had the idea to set "nulls" to "false" instead of "true", without setting the value stored in the tuplestore. This could lead to the display of buggy data. The intention of the code is clearly to display NULL when a PID of 0 is found, and this commit adjusts the logic to do so. Issue introduced by `60f566b4f2`. Author: ChangAo Chen <cca5507@qq.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/tencent_7D61A85D6143AD57CA8D8C00DEC541869D06@qq.com Backpatch-through: 18	2026-03-23 18:13:56 +09:00
Peter Eisentraut	085a531983	ci: Run headerscheck and cpluspluscheck in parallel This can save several seconds of wall-clock time for that task. Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://www.postgresql.org/message-id/flat/b49e74d4-3cf9-4d1c-9dce-09f75e55d026%40eisentraut.org	2026-03-23 08:40:29 +01:00
Peter Eisentraut	0f17d1dbfa	headerscheck: Get CXXFLAGS from Makefile.global headerscheck in C++ mode (cpluspluscheck) previously hardcoded CXXFLAGS and documented that you might need to override them manually from the environment. Now that we have better C++ support in the build system, we can just get CXXFLAGS from Makefile.global, like we do for other variables. Furthermore, this is necessary in some configurations to make cpluspluscheck work under meson, because under meson, some -I options end up in CXXFLAGS where under make they would be in CPPFLAGS. Therefore, getting the correct CXXFLAGS is required in those cases. Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CAMSWrt-PoQt4sHryWrB1ViuGBJF_PpbjoSGrWR2Ry47bHNLDqg%40mail.gmail.com	2026-03-23 07:53:43 +01:00
Amit Kapila	d6628a5ea0	pg_createsubscriber: Introduce module-specific logging functions. Replace generic pg_log_* calls with report_createsub_log() and report_createsub_fatal(). This refactor provides the necessary infrastructure to support logging to external files via the -l option. These new functions enable the utility to route messages to both the terminal and a log file based on the logging configuration and verbosity levels provided by the user. Author: Hayato Kuroda <kuroda.hayato@fujitsu.com> Author: Gyan Sreejith <gyan.sreejith@gmail.com> Reviewed-by: shveta malik <shveta.malik@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/CAEqnbaUthOQARV1dscGvB_EsqC-YfxiM6rWkVDHc+G+f4oSUHw@mail.gmail.com	2026-03-23 09:23:20 +05:30
Michael Paquier	ded9754804	Add missing deflateEnd() for server-side gzip base backups The gzip basebackup sink called deflateInit2() in begin_archive() but never called deflateEnd(), leaking zlib's internal compression state (~256KB per archive) until the memory context of the base backup is destroyed. The code tree already has a matching deflateEnd() call for each deflateInit[2]() call (pgcrypto, etc.), except for the file touched in this commit, so this brings more consistency for all the compression methods. The server-side LZ4 and zstd implementations require a dedicated cleanup callback as they allocate their state outside the context of a palloc(). As currently used, deflateInit2() is called once per tablespace in a single backup. Memory would slightly bloat only when dealing with many tablespaces at once, not across multiple base backups so this is not worth a backpatch. This change could matter for future uses of this code. zlib allows the definition of memory allocation and free callbacks in the z_stream object given to a deflateInit[2](). The base backup backend code relies on palloc() for the allocations and deflateEnd() internally only cleans up memory (no fd allocation for example). Author: Jianghua Yang <yjhjstz@gmail.com> Discussion: https://postgr.es/m/CAAZLFmQNJ0QNArpWEOZXwv=vbumcWKEHz-b1me5gBqRqG67EwQ@mail.gmail.com	2026-03-23 09:04:44 +09:00
Tom Lane	69c57466a7	Fix another buglet in archive_waldump.c. While re-reading `860359ea0`, I noticed another problem: when spilling to a temp file, it did not bother to check the result of fclose(). This is bad since write errors (like ENOSPC) may not be reported until close time.	2026-03-22 18:48:38 -04:00
Tom Lane	860359ea02	Fix assorted bugs in archive_waldump.c. 1. archive_waldump.c called astreamer_finalize() nowhere. This meant that any data retained in decompression buffers at the moment we detect archive EOF would never reach astreamer_waldump_content(), resulting in surprising failures if we actually need the last few bytes of the archive file. To fix that, make read_archive_file() do the finalize once it detects EOF. Change its API to return a boolean "yes there's more data" rather than the entirely-misleading raw count of bytes read. 2. init_archive_reader() relied on privateInfo->cur_file to track which WAL segment was being read, but cur_file can become NULL if a member trailer is processed during a read_archive_file() call. This could cause unreproducible "could not find WAL in archive" failures, particularly with compressed archives where all the WAL data fits in a small number of compressed bytes. Fix by scanning the hash table after each read to find any cached WAL segment with sufficient data, instead of depending on cur_file. Also reduce the minimum data requirement from XLOG_BLCKSZ to sizeof(XLogLongPageHeaderData), since we only need the long page header to extract the segment size. We likewise need to fix init_archive_reader() to scan the whole hash table for irrelevant entries, since we might have already loaded more than one entry when the data is compressible enough. 3. get_archive_wal_entry() relied on tracking cur_file to identify WAL hash table entries that need to be spilled to disk. However, this can't work for entries that are read completely within a single read_archive_file call: the caller will never see cur_file pointing at such an entry. Instead, scan the WAL hash table to find entries we should spill. This also fixes a buglet that any hash table entries completely loaded during init_archive_reader were never considered for spilling. 
Also, simplify the logic tremendously by not attempting to spill entries that haven't been read fully. I am not convinced that the old logic handled that correctly in every path, and it's really not worth the complication and risk of bugs to try to spill entries on the fly. We can just write them in a single go once they are no longer the cur_file. 4. Fix a rather critical performance problem: the code thought that resetStringInfo() will reclaim storage, but it doesn't. So by the end of the run we'd have consumed storage space equal to the total amount of WAL read, negating all the effort of the spill logic. Also document the contract that cur_file can change (or become NULL) during a single read_archive_file() call, since the decompression pipeline may produce enough output to trigger multiple astreamer callbacks. Author: Tom Lane <tgl@sss.pgh.pa.us> Co-authored-by: Andrew Dunstan <andrew@dunslane.net> Discussion: https://postgr.es/m/2178517.1774064942@sss.pgh.pa.us	2026-03-22 18:24:42 -04:00
Tom Lane	5868372bbf	Remove nonfunctional tar file trailer size check. The ASTREAMER_ARCHIVE_TRAILER case in astreamer_tar_parser_content() intended to reject tar files whose trailer exceeded 2 blocks. However, the check compared 'len' after astreamer_buffer_bytes() had already consumed all the data and set len to 0, so the pg_fatal() could never fire. Moreover, per the POSIX specification for the ustar format, the last physical block of a tar archive is always full-sized, and "logical records after the two zero logical records may contain undefined data." GNU tar, for example, zero-pads its output to a 10kB boundary by default. So rejecting extra data after the two zero blocks would be wrong even if the check worked. (But if the check had worked, it would have alerted us to the bug just fixed in 9aa1fcc54.) Remove the dead check and update the comment to explain why trailing data is expected and harmless. Per report from Tom Lane. Author: Andrew Dunstan <andrew@dunslane.net> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/2178517.1774064942@sss.pgh.pa.us	2026-03-22 18:13:41 -04:00
Tom Lane	9aa1fcc547	Fix finalization of decompressor astreamers. Send the correct amount of data to the next astreamer, not the whole allocated buffer size. This bug escaped detection because in present uses the next astreamer is always a tar-file parser which is insensitive to trailing garbage. But that may not be true in future uses. Author: Andrew Dunstan <andrew@dunslane.net> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/2178517.1774064942@sss.pgh.pa.us Backpatch-through: 15	2026-03-22 18:06:48 -04:00
Peter Geoghegan	e5836f7b7d	Add fake LSN support to hash index AM. Use fake LSNs in all hash AM critical sections that write a WAL record. This gives us a reliable way (a way that works during scans of both logged and unlogged relations) to detect when an index page was concurrently modified during the window between when the page is initially read (by _hash_readpage) and when the page has any known-dead items LP_DEAD-marked (by _hash_kill_items). Preparation for an upcoming patch that makes the hash index AM use the amgetbatch interface, enabling I/O prefetching during hash index scans. The amgetbatch design imposes certain rules on index AMs with respect to how they hold on to index page buffer pins (at least in the case of pins held as an interlock against unsafe concurrent TID recycling by VACUUM). These rules have consequences for routines that set LP_DEAD bits on index tuples from an amgetbatch index AM: such routines have an inherent need to reason about concurrent TID recycling by VACUUM, but can no longer rely on their amgettuple routine holding on to a buffer pin (during the aforementioned window) as an interlock against such recycling. Instead, they have to follow a new, standardized approach. The new approach taken by amgetbatch index AMs when setting LP_DEAD bits is heavily based on the current nbtree dropPin design, which was added by commit `2ed5b87f`. It also works by checking if the page's LSN advanced during the window where unsafe concurrent TID recycling might have taken place. This commit is similar to commit `8a879119`, which taught nbtree to use fake LSNs to improve its dropPin behavior. However, unlike that commit, this is not an independently useful enhancement, since hash doesn't implement anything like nbtree's dropPin behavior (not yet). 
Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/CAH2-WzkehuhxyuA8quc7rRN3EtNXpiKsjPfO8mhb+0Dr2K0Dtg@mail.gmail.com	2026-03-22 17:31:43 -04:00
Melanie Plageman	01b7e4a46d	Add pruning fast path for all-visible and all-frozen pages Because of the SKIP_PAGES_THRESHOLD optimization or a stale prune XID, heap_page_prune_and_freeze() can be invoked for pages with no pruning or freezing work to do. To avoid this, if a page is already all-frozen or it is all-visible and no freezing will be attempted, exit early. We can't exit early if vacuum passed DISABLE_PAGE_SKIPPING, though. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Discussion: https://postgr.es/m/bqc4kh5midfn44gnjiqez3bjqv4zogydguvdn446riw45jcf3y%404ez66il7ebvk	2026-03-22 15:46:50 -04:00
Peter Geoghegan	f026fbf059	Make IndexScanInstrumentation a pointer in executor scan nodes. Change the IndexScanInstrumentation fields in IndexScanState, IndexOnlyScanState, and BitmapIndexScanState from inline structs to pointers. This avoids additional space overhead whenever new fields are added to IndexScanInstrumentation in the future, at least in the common case where the instrumentation isn't used (i.e. when the executor node isn't being run through an EXPLAIN ANALYZE). Preparation for an upcoming patch series that will add index prefetching. The new slot-based interface that will enable index prefetching necessitates that we add at least one more field to IndexScanInstrumentation (to count heap fetches during index-only scans). Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/CAH2-Wz=g=JTSyDB4UtB5su2ZcvsS7VbP+ZMvvaG6ABoCb+s8Lw@mail.gmail.com	2026-03-22 13:20:29 -04:00
Melanie Plageman	4f7ecca84d	Detect and fix visibility map corruption in more cases Move VM corruption detection and repair into heap page pruning. This allows VM repair during on-access pruning, not only during vacuum. Also, expand corruption detection to cover pages marked all-visible that contain dead tuples and tuples inserted or deleted by in-progress transactions, rather than only all-visible pages with LP_DEAD items. Pinning the correct VM page before on-access pruning is cheap when compared to the cost of actually pruning. The vmbuffer is saved in the scan descriptor, so a query should only need to pin each VM page once, and a single VM page covers a large number of heap pages. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Discussion: https://postgr.es/m/bqc4kh5midfn44gnjiqez3bjqv4zogydguvdn446riw45jcf3y%404ez66il7ebvk	2026-03-22 11:52:40 -04:00
Heikki Linnakangas	516310ed4d	Don't reset 'latest_page_number' when replaying multixid truncation 'latest_page_number' is set to the correct value, according to nextOffset, early at system startup. Contrary to the comment, it hence should be set up correctly by the time we get to WAL replay. This was committed to back-branches earlier already (commit `817f74600d`), to fix a bug in a backwards-compatibility codepath. We don't have that bug on 'master', but the change nevertheless makes sense on 'master' too. Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Discussion: https://www.postgresql.org/message-id/20260214090150.GC2297@p46.dedyn.io;lightning.p46.dedyn.io Discussion: https://www.postgresql.org/message-id/e1787b17-dc93-4621-a5a1-c713d1ac6a1b@iki.fi	2026-03-22 14:23:54 +02:00
Michael Paquier	1f7947a48d	Add test for single-page VACUUM of hash index on INSERT _hash_vacuum_one_page() in hashinsert.c is a routine related to hash indexes that can perform a single-page VACUUM when dead tuples are detected during index insertion. This routine previously had no test coverage, and this commit adds a test case for that purpose. To safely create dead tuples in a way that works with parallel tests, this uses a technique based on a rolled-back INSERT, following a suggestion by Heikki Linnakangas. Author: Alexander Kuzmenkov <akuzmenkov@tigerdata.com> Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Discussion: https://postgr.es/m/CALzhyqxrc1ZHYmf5V8NE+yMboqVg7xZrQM7K2c7VS0p1v8z42w@mail.gmail.com	2026-03-22 15:24:33 +09:00
Michael Paquier	322bab7974	Move declarations related to locktags from lock.h to new locktag.h This commit moves all the declarations related to locktags from lock.h to a new header called locktag.h. This header is useful so that code paths that care about locktags but not the lock hashtable can know about these without having to include lock.h and its whole set of dependencies. This move includes the basic locktag structures and the set of macros to fill in the locktag fields before attempting to acquire a lock. Based on a suggestion from me, made while discussing a different feature. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/abufUya2oK-_PJ3E@paquier.xyz	2026-03-21 14:34:47 +09:00
Tom Lane	ce8d5fe0e2	plpgsql: optimize "SELECT simple-expression INTO var". Previously, we always fed SELECT ... INTO to the SPI machinery. While that works for all cases, it's a great deal slower than the otherwise-equivalent "var := expression" if the expression is "simple" and the INTO target is a single variable. Users coming from MSSQL or T-SQL are likely to be surprised by this; they are used to writing SELECT ... INTO since there is no "var := expression" syntax in those dialects. Hence, check for a simple expression and use the faster code path if possible. (Here, "simple" means whatever exec_is_simple_query accepts, which basically means "SELECT scalar-expression" without any input tables, aggregates, qual clauses, etc.) This optimization is not entirely transparent. Notably, one of the reasons it's faster is that the hooks that pg_stat_statements uses aren't called in this path, so that the evaluated expression no longer appears in pg_stat_statements output as it did before. There may be some other minor behavioral changes too, although I tried hard to make error reporting look the same. Hopefully, none of them are significant enough to not be acceptable as routine changes in a PG major version. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com> Discussion: https://postgr.es/m/CAFj8pRDieSQOPDHD_svvR75875uRejS9cN87FoAC3iXMXS1saQ@mail.gmail.com	2026-03-20 18:23:45 -04:00
Jeff Davis	4a0b46b6e1	Fix dependency on FDW's connection function. Missed in commit `8185bb5347`. Catalog version bump. Discussion: https://postgr.es/m/fd49b44dc65da8e71ab20c1cf1ec7e65921c20f5.camel@j-davis.com	2026-03-20 12:42:59 -07:00
Andrew Dunstan	b3cf461b3c	pg_verifybackup: Enable WAL parsing for tar-format backups Now that pg_waldump supports reading WAL from tar archives, remove the restriction that forced --no-parse-wal for tar-format backups. pg_verifybackup now automatically locates the WAL archive: it looks for a separate pg_wal.tar first, then falls back to the main base.tar. A new --wal-path option (replacing the old --wal-directory, which is kept as a silent alias) accepts either a directory or a tar archive path. The default WAL directory preparation is deferred until the backup format is known, since tar-format backups resolve the WAL path differently from plain-format ones. Author: Amul Sul <sulamul@gmail.com> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Reviewed-by: Jakub Wartak <jakub.wartak@enterprisedb.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Euler Taveira <euler@eulerto.com> Reviewed-by: Andrew Dunstan <andrew@dunslane.net> discussion: https://postgr.es/m/CAAJ_b94bqdWN3h2J-PzzzQ2Npbwct5ZQHggn_QoYGhC2rn-=WQ@mail.gmail.com	2026-03-20 15:31:35 -04:00
Andrew Dunstan	b15c151398	pg_waldump: Add support for reading WAL from tar archives pg_waldump can now accept the path to a tar archive (optionally compressed with gzip, lz4, or zstd) containing WAL files and decode them. This was added primarily for pg_verifybackup, which previously had to skip WAL parsing for tar-format backups. The implementation uses the existing archive streamer infrastructure with a hash table to track WAL segments read from the archive. If WAL files within the archive are not in sequential order, out-of-order segments are written to a temporary directory (created via mkdtemp under $TMPDIR or the archive's directory) and read back when needed. An atexit callback ensures the temporary directory is cleaned up. The --follow option is not supported when reading from a tar archive. Author: Amul Sul <sulamul@gmail.com> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Reviewed-by: Jakub Wartak <jakub.wartak@enterprisedb.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Euler Taveira <euler@eulerto.com> Reviewed-by: Andrew Dunstan <andrew@dunslane.net> Reviewed-by: Zsolt Parragi <zsolt.parragi@percona.com> discussion: https://postgr.es/m/CAAJ_b94bqdWN3h2J-PzzzQ2Npbwct5ZQHggn_QoYGhC2rn-=WQ@mail.gmail.com	2026-03-20 15:31:35 -04:00
Andrew Dunstan	f8a0cd2671	pg_waldump: Preparatory refactoring for tar archive WAL decoding. Several refactoring steps in preparation for adding tar archive WAL decoding support to pg_waldump: - Move XLogDumpPrivate and related declarations into a new pg_waldump.h header, allowing a second source file to share them. - Factor out required_read_len() so the read-size calculation can be reused for both regular WAL files and tar-archived WAL. - Move the WAL segment size variable into XLogDumpPrivate and rename it to segsize, making it accessible to the archive streamer code. Author: Amul Sul <sulamul@gmail.com> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Reviewed-by: Jakub Wartak <jakub.wartak@enterprisedb.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Euler Taveira <euler@eulerto.com> Reviewed-by: Andrew Dunstan <andrew@dunslane.net> discussion: https://postgr.es/m/CAAJ_b94bqdWN3h2J-PzzzQ2Npbwct5ZQHggn_QoYGhC2rn-=WQ@mail.gmail.com	2026-03-20 15:31:35 -04:00
Andrew Dunstan	c8a350a439	Move tar detection and compression logic to common. Consolidate tar archive identification and compression-type detection logic into a shared location. Currently used by pg_basebackup and pg_verifybackup, this functionality is also required for upcoming pg_waldump enhancements. This change promotes code reuse and simplifies maintenance across frontend tools. Author: Amul Sul <sulamul@gmail.com> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Reviewed-by: Jakub Wartak <jakub.wartak@enterprisedb.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Euler Taveira <euler@eulerto.com> Reviewed-by: Andrew Dunstan <andrew@dunslane.net> Reviewed-by: Zsolt Parragi <zsolt.parragi@percona.com> discussion: https://postgr.es/m/CAAJ_b94bqdWN3h2J-PzzzQ2Npbwct5ZQHggn_QoYGhC2rn-=WQ@mail.gmail.com	2026-03-20 15:31:35 -04:00
Nathan Bossart	48f11bfa06	Bump transaction/multixact ID warning limits to 100M. These warning limits were last changed to 40M by commit `cd5e82256d`. For the benefit of workloads that rapidly consume transactions or multixacts, this commit bumps the limits to 100M. This will hopefully give users enough time to react. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Shinya Kato <shinya11.kato@gmail.com> Reviewed-by: wenhui qiu <qiuwenhuifx@gmail.com> Discussion: https://postgr.es/m/aRdhSSFb9zZH_0zc%40nathan	2026-03-20 14:15:33 -05:00
Nathan Bossart	e646450e60	Add percentage of available IDs to wraparound warnings. This commit adds DETAIL messages to the existing wraparound WARNINGs that include the percentage of transaction/multixact IDs that remain available for use. The hope is that this more clearly expresses the urgency of the situation. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Shinya Kato <shinya11.kato@gmail.com> Reviewed-by: wenhui qiu <qiuwenhuifx@gmail.com> Discussion: https://postgr.es/m/aRdhSSFb9zZH_0zc%40nathan	2026-03-20 14:15:33 -05:00
Tom Lane	733f20df53	Discount the metapage when estimating number of index pages visited. genericcostestimate() estimates the number of index leaf pages to be visited as a pro-rata fraction of the total number of leaf pages. Or at least that was the intention. What it actually used in the calculation was the total number of index pages, so that non-leaf pages were also counted. In a decent-sized index the error is probably small, since we expect upper page fanout to be high. But in a small index that's not true; in the worst case with one data-bearing page plus a metapage, we had 100% relative error. This led to surprising planning choices such as not using a small partial index. To fix, ask genericcostestimate's caller to supply an estimate of the number of non-leaf pages, and subtract that. For the built-in index AMs, it seems sufficient to count the index metapage (if the AM uses one) as non-leaf. Per the above argument, counting upper index pages shouldn't change the estimate much, and in most cases we don't have any easy way of estimating the number of upper pages. This might be an area for further research in future. Any external genericcostestimate callers that do not set the new field GenericCosts.numNonLeafPages will see the same behavior as before, assuming they followed the advice to zero out that whole struct. Unsurprisingly, this change affects a number of plans seen in the core regression tests. I hacked up the existing tests to keep the tests' plans the same, since in each case it appeared that the test's intent was to test exactly that plan. Also add one new test case demonstrating that a better index choice is now made. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Henson Choi <assam258@gmail.com> Discussion: https://postgr.es/m/870521.1745860752@sss.pgh.pa.us	2026-03-20 14:50:53 -04:00
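The worst case mentioned in `733f20df53` (one data-bearing page plus a metapage) can be worked through numerically. The two functions below are a simplified sketch of the pro-rata calculation, not the actual genericcostestimate() code: the function and parameter names are hypothetical, and any rounding or clamping the real code may apply is omitted.

```c
#include <assert.h>

/* Before: the pro-rata fraction is applied to all index pages. */
static double
leaf_pages_before(double tuples_fetched, double index_pages,
                  double index_tuples)
{
    return tuples_fetched * index_pages / index_tuples;
}

/* After: non-leaf pages (here, just the metapage) are subtracted first. */
static double
leaf_pages_after(double tuples_fetched, double index_pages,
                 double non_leaf_pages, double index_tuples)
{
    assert(index_pages >= non_leaf_pages);
    return tuples_fetched * (index_pages - non_leaf_pages) / index_tuples;
}
```

For an index with 2 pages (one metapage, one leaf page) and 100 tuples, a scan fetching all 100 tuples is estimated at 2 pages before the change and 1 page after it, which is the 100% relative error the commit message describes.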
Alexander Korotkov	07b7a964d3	Fix self-join removal to update bare Var references in join clauses Self-join removal failed to update Var nodes when the join clause was a bare Var (e.g., ON t1.bool_col) rather than an expression containing Vars. ChangeVarNodesWalkExpression() used expression_tree_walker(), which descends into child nodes but does not process the top-level node itself. When a bare Var referencing the removed relation appeared as the clause, its varno was left unchanged, leading to "no relation entry for relid N" errors. Fix by calling ChangeVarNodes_walker() directly instead of expression_tree_walker(), so the top-level node is also processed. Bug: #19435 Reported-by: Hang Ammmkilo <ammmkilo@163.com> Author: Andrei Lepikhov <lepihov@gmail.com> Co-authored-by: Tender Wang <tndrwang@gmail.com> Co-authored-by: Alexander Korotkov <aekorotkov@gmail.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/19435-3cc1a87f291129f1%40postgresql.org Backpatch-through: 18	2026-03-20 15:46:30 +02:00
Álvaro Herrera	e7975f1c06	SET NOT NULL: Call object-alter hook only after the catalog change ... otherwise, the function invoked by the hook might consult the catalog and not see that the new constraint exists. This relies on set_attnotnull doing CommandCounterIncrement() after successfully modifying the catalog. Oversight in commit `14e87ffa5c`. Author: Artur Zakirov <zaartur@gmail.com> Backpatch-through: 18 Discussion: https://postgr.es/m/CAKNkYnxUPCJk-3Xe0A3rmCC8B8V8kqVJbYMVN6ySGpjs_qd7dQ@mail.gmail.com	2026-03-20 14:38:50 +01:00
Robert Haas	12444183e4	test_plan_advice: Set TAP test priority 50 in meson.build. Since this runs the main regression tests, it can take some time to complete. Therefore, it's better to start it earlier, as we also do for the main regression test suite. Author: Matheus Alcantara <matheusssilv97@gmail.com> Discussion: http://postgr.es/m/1095d3fe-a6eb-4d83-866e-649d6f369908@gmail.com	2026-03-20 08:41:38 -04:00
Andrew Dunstan	4c0390ac53	Add option force_array for COPY JSON FORMAT This adds the force_array option, which is available exclusively when using COPY TO with the JSON format. When enabled, this option wraps the output in a top-level JSON array (enclosed in square brackets with comma-separated elements), making the entire result a valid single JSON value. Without this option, the default behavior is to output a stream of independent JSON objects. Attempting to use this option with COPY FROM or with formats other than JSON will raise an error. Author: Joe Conway <mail@joeconway.com> Author: jian he <jian.universality@gmail.com> Reviewed-by: Junwang Zhao <zhjwpku@gmail.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Florents Tselai <florents.tselai@gmail.com> Reviewed-by: Andrew Dunstan <andrew@dunslane.net> Discussion: https://postgr.es/m/CALvfUkBxTYy5uWPFVwpk_7ii2zgT07t3d-yR_cy4sfrrLU%3Dkcg%40mail.gmail.com Discussion: https://postgr.es/m/6a04628d-0d53-41d9-9e35-5a8dc302c34c@joeconway.com	2026-03-20 08:40:17 -04:00
Andrew Dunstan	7dadd38cda	json format for COPY TO This introduces the JSON format option for the COPY TO command, allowing users to export query results or table data directly as a stream of JSON objects (one per line, NDJSON style). The JSON format is currently supported only for COPY TO operations; it is not available for COPY FROM. JSON format is incompatible with some standard text/CSV formatting options, including HEADER, DEFAULT, NULL, DELIMITER, FORCE QUOTE, FORCE NOT NULL, and FORCE NULL. Column list support is included: when a column list is specified, only the named columns are emitted in each JSON object. Regression tests covering valid JSON exports and error handling for incompatible options have been added to src/test/regress/sql/copy.sql. Author: Joe Conway <mail@joeconway.com> Author: jian he <jian.universality@gmail.com> Co-Authored-By: Andrew Dunstan <andrew@dunslane.net> Reviewed-by: Andrey M. Borodin <x4mmm@yandex-team.ru> Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com> Reviewed-by: Daniel Verite <daniel@manitou-mail.org> Reviewed-by: Davin Shearer <davin@apache.org> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org> Reviewed-by: Junwang Zhao <zhjwpku@gmail.com> Discussion: https://postgr.es/m/CALvfUkBxTYy5uWPFVwpk_7ii2zgT07t3d-yR_cy4sfrrLU%3Dkcg%40mail.gmail.com Discussion: https://postgr.es/m/6a04628d-0d53-41d9-9e35-5a8dc302c34c@joeconway.com	2026-03-20 08:40:04 -04:00
Andrew Dunstan	a2145605ee	introduce CopyFormat, refactor CopyFormatOptions Currently, the COPY command format is determined by two boolean fields (binary, csv_mode) in CopyFormatOptions. This approach, while functional, isn't ideal for implementing other formats in the future. To simplify adding new formats, introduce a CopyFormat enum. This makes the code cleaner and more maintainable, allowing for easier integration of additional formats down the line. Author: Joel Jacobson <joel@compiler.org> Author: jian he <jian.universality@gmail.com> Reviewed-by: Andrew Dunstan <andrew@dunslane.net> Discussion: https://postgr.es/m/CALvfUkBxTYy5uWPFVwpk_7ii2zgT07t3d-yR_cy4sfrrLU%3Dkcg%40mail.gmail.com Discussion: https://postgr.es/m/6a04628d-0d53-41d9-9e35-5a8dc302c34c@joeconway.com	2026-03-20 08:21:57 -04:00
Peter Eisentraut	040a56be4b	Cleanup users and roles in graph_table_rls test This test leaves behind the roles and users it creates. The 002_pg_upgrade test dumps and restores the regression database when PG_TEST_EXTRA contains regress_dump_restore. The global objects such as users and roles are not dumped by pg_dump. But it still dumps the policies associated with users, and commands to set the ownership. Restoring these policies and the ownerships fails since the users and roles do not exist. To fix this failure we could use --no-owner, but it does not exclude the policy objects associated with users. Hence drop the users, roles and policies that depend upon them at the end of the test. Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reported-by: Peter Eisentraut <peter@eisentraut.org> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/a855795d-e697-4fa5-8698-d20122126567@eisentraut.org	2026-03-20 10:55:51 +01:00
Peter Eisentraut	57ee397953	Update Unicode data to Unicode 17.0.0 Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Alexander Borisov <lex.borisov@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://www.postgresql.org/message-id/flat/2a668979-ed92-49a3-abf9-a3ec2d460ec2%40eisentraut.org	2026-03-20 08:42:50 +01:00
Amit Kapila	493f8c6439	Add support for EXCEPT TABLE in ALTER PUBLICATION. Following commit `fd366065e0`, which added EXCEPT TABLE support to CREATE PUBLICATION, this commit extends ALTER PUBLICATION to allow modifying the exclusion list. New Syntax: ALTER PUBLICATION name SET publication_all_object [, ... ] where publication_all_object is one of: ALL TABLES [ EXCEPT TABLE ( except_table_object [, ... ] ) ] ALL SEQUENCES If the EXCEPT clause is provided, the existing exclusion list in pg_publication_rel is replaced with the specified relations. If the EXCEPT clause is omitted, any existing exclusions for the publication are cleared. Similarly, SET ALL SEQUENCES updates Note that because this is a SET command, specifying only one object type (e.g., SET ALL SEQUENCES) will reset the other unspecified flags (e.g., setting puballtables to false). Consistent with CREATE PUBLICATION, only root partitioned tables or standard tables can be specified in the EXCEPT list. Specifying a partition child will result in an error. Author: vignesh C <vignesh21@gmail.com> Reviewed-by: shveta malik <shveta.malik@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Peter Smith <smithpb2250@gmail.com> Reviewed-by: Nisha Moond <nisha.moond412@gmail.com> Discussion: https://postgr.es/m/CALDaNm3=JrucjhiiwsYQw5-PGtBHFONa6F7hhWCXMsGvh=tamA@mail.gmail.com	2026-03-20 11:36:09 +05:30
David Rowley	07d5bffe75	Fix new tuple deforming code so it can support cstrings again In `c456e3911`, I mistakenly thought that the deformer code would never see cstrings and that I could use pg_assume() to have the compiler omit producing code for attlen == -2 attributes. That saves bloating the deforming code a bit with the extra check and strlen() call. While this is ok to do for tuples from the heap, it's not ok to do for MinimalTuples as those can contain cstrings and tts_minimal_getsomeattrs() implements deforming by inlining the (slightly misleadingly named) slot_deform_heap_tuple() code. To fix, add a new parameter to the slot_deform_heap_tuple() and have the callers define which code to inline. Because this new parameter is passed as a const, the compiler can choose to emit or not emit the cstring-related code based on the parameter's value. Author: David Rowley <dgrowleyml@gmail.com> Reported-by: Tender Wang <tndrwang@gmail.com> Discussion: https://postgr.es/m/CAHewXNmSK+gKziAt_WvQoMVWt3_LRVMmRYY9dAbMPMcpPV0QmA@mail.gmail.com	2026-03-20 14:16:06 +13:00
Jeff Davis	703fee3b25	Fix dependency on FDW handler. ALTER FOREIGN DATA WRAPPER could drop the dependency on the handler function if it wasn't explicitly specified. Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Discussion: https://postgr.es/m/35c44a4b7fb76d35418c4d66b775a88f4ce60c86.camel@j-davis.com Backpatch-through: 14	2026-03-19 15:07:43 -07:00
Masahiko Sawada	adcdbe9386	Add parallel vacuum worker usage to VACUUM (VERBOSE) and autovacuum logs. This commit adds both the number of parallel workers planned and the number of parallel workers actually launched to the output of VACUUM (VERBOSE) and autovacuum logs. Previously, this information was only reported as an INFO message during VACUUM (VERBOSE), which meant it was not included in autovacuum logs in practice. Although autovacuum does not yet support parallel vacuum, a subsequent patch will enable it and utilize these logs in its regression tests. This change also improves observability by making it easier to verify if parallel vacuum is utilizing the expected number of workers. Author: Daniil Davydov <3danissimo@gmail.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Sami Imseih <samimseih@gmail.com> Discussion: https://postgr.es/m/CACG=ezZOrNsuLoETLD1gAswZMuH2nGGq7Ogcc0QOE5hhWaw=cw@mail.gmail.com	2026-03-19 15:01:47 -07:00
Masahiko Sawada	ba21f5bf8a	Allow explicit casting between bytea and uuid. This enables the use of functions such as encode() and decode() with UUID values, allowing them to be converted to and from alternative formats like base64 or hex. The cast maps the 16-byte internal representation of a UUID directly to a bytea datum. This is more efficient than going through a text representation. Bump catalog version. Author: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> Co-authored-by: Aleksander Alekseev <aleksander@tigerdata.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru> Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl> Discussion: https://postgr.es/m/CAJ7c6TOramr1UTLcyB128LWMqita1Y7%3Darq3KHaU%3Dqikf5yKOQ%40mail.gmail.com	2026-03-19 13:51:50 -07:00
Tom Lane	1811f1af98	Improve hash join's handling of tuples with null join keys. In a plain join, we can just summarily discard an input tuple with null join key(s), since it cannot match anything from the other side of the join (assuming a strict join operator). However, if the tuple comes from the outer side of an outer join then we have to emit it with null-extension of the other side. Up to now, hash joins did that by inserting the tuple into the hash table as though it were a normal tuple. This is unnecessarily inefficient though, since the required processing is far simpler than for a potentially-matchable tuple. Worse, if there are a lot of such tuples they will bloat the hash bucket they go into, possibly causing useless repeated attempts to split that bucket or increase the number of batches. We have a report of a large join vainly creating many thousands of batches when faced with such input. This patch improves the situation by keeping such tuples out of the hash table altogether, instead pushing them into a separate tuplestore from which we return them later. (One might consider trying to return them immediately; but that would require substantial refactoring, and it doesn't work anyway for cases where we rescan an unmodified hash table.) This works even in parallel hash joins, because whichever worker reads a null-keyed tuple can just return it; there's no need for consultation with other workers. Thus the tuplestores are local storage even in a parallel join. A pre-existing buglet that I noticed while analyzing the code's behavior is that ExecHashRemoveNextSkewBucket fails to decrement hashtable->skewTuples for tuples moved into the main hash table from the skew hash table. This invalidates ExecHashTableInsert's calculation of the number of main-hash-table tuples, though probably not by a lot since we expect the skew table to be small relative to the main one. Nonetheless, let's fix that too while we're here. 
Bug: #18909 Reported-by: Sergey Koposov <Sergey.Koposov@ed.ac.uk> Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/3061845.1746486714@sss.pgh.pa.us	2026-03-19 15:21:36 -04:00
Tom Lane	8b02c22bb4	Avoid leaking duplicated file descriptors in corner cases. pg_dump's compression modules had variations on the theme of fp = fdopen(dup(fd), mode); if (fp == NULL) // fail, reporting errno which is problematic for two reasons. First, if dup() succeeds but fdopen() fails, we'd leak the duplicated FD. That's not important at present since the program will just exit immediately after failure anyway; but perhaps someday we'll try to continue, making the resource leak potentially significant. Second, if dup() fails then fdopen() will overwrite the useful errno (perhaps EMFILE) with a misleading value EBADF, making it difficult to understand what went wrong. Fix both issues by testing for dup() failure before proceeding to the next call. These failures are sufficiently unlikely, and the consequences minor enough, that this doesn't seem worth the effort to back-patch. But let's fix it in HEAD. Author: Jianghua Yang <yjhjstz@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/62bbe34d-2315-4b42-b768-56d901aa83e1@gmail.com	2026-03-19 14:25:26 -04:00
Nathan Bossart	dd1398f137	Allow choosing specific grantors via GRANT/REVOKE ... GRANTED BY. Except for GRANT and REVOKE on roles, the GRANTED BY clause currently only accepts the current role to match the SQL standard. And even if an acceptable grantor (i.e., the current role) is specified, Postgres ignores it and chooses the "best" grantor for the command. Allowing the user to select a specific grantor would allow better control over the precise behavior of GRANT/REVOKE statements. This commit adds that ability. For consistency with select_best_grantor(), we only permit choosing grantor roles for which the current role inherits privileges. Author: Nathan Bossart <nathandbossart@gmail.com> Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/aRYLkTpazxKhnS_w%40nathan	2026-03-19 11:41:39 -05:00
Robert Haas	6f0738ddec	dshash: Make it possible to suppress out of memory errors Introduce dshash_find_or_insert_extended, which is just like dshash_find_or_insert except that it takes a flags argument. Currently, the only supported flag is DSHASH_INSERT_NO_OOM, but I have chosen to use an integer rather than a boolean in case we end up with more flags in the future. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Sami Imseih <samimseih@gmail.com> Discussion: http://postgr.es/m/CA+TgmoaJwUukUZGu7_yL74oMTQQz2=zqucMhF9+9xBmSC5us1w@mail.gmail.com	2026-03-19 11:51:17 -04:00
Tom Lane	5a2043bf71	Fix transient memory leakage in jsonpath evaluation. This patch reimplements JsonValueList to be more space-efficient and arranges for temporary JsonValueLists created during jsonpath evaluation to be freed when no longer needed, rather than being leaked till the end of the function evaluation cycle as before. The motivation is to prevent indefinite memory bloat while evaluating jsonpath expressions that traverse a lot of data. As an example, this query SELECT jsonb_path_query((SELECT jsonb_agg(i) FROM generate_series(1,10000) i), '$[] ? (@ < $)'); formerly required about 6GB to execute, with the space required growing quadratically with the length of the input array. With this patch the memory consumption stays static. (The time required is still quadratic, but we can't do much about that: this path expression asks to compare each array element to each other one.) The bloat happens because we construct a JsonValueList containing all the array elements to represent the second occurrence of "$", and then just leak it after evaluating the filter expression for any one value generated from "$[]". If I were implementing this functionality from scratch I'd probably try to avoid materializing that representation at all, but changing that now looks like more trouble than it's worth. This patch takes the more conservative approach of just making sure we free the list after we're done with it. The existing representation of JsonValueList is neither especially compact nor especially easy to free: it's a List containing pointers to separately-palloc'd JsonbValue structs. We could theoretically use list_free_deep, but it's not 100% clear that all the JsonbValues are always safe for us to free. In any case we are talking about a lot of palloc/pfree traffic if we keep it like this. This patch replaces that with what's essentially an expansible array of JsonbValues, so that even a long list requires relatively few palloc requests. 
Also, for the very common case that only one or two elements appear in the list, this representation uses zero pallocs: the elements can be kept in the on-the-stack base struct. Note that we are only interested in freeing the JsonbValue structs themselves. While many types of JsonbValue include pointers to external data such as strings or numerics, we expect that that data is part of the original jsonb input Datum(s) and need not (indeed cannot) be freed here. In this reimplementation, JsonValueListAppend() always copies the supplied JsonbValue struct into the JsonValueList data. This allows simplifying and regularizing many call sites that sometimes palloc'd JsonbValues and sometimes passed a local-variable JsonbValue. Always doing the latter is simpler, faster, and less bug-prone. I also removed JsonValueListLength() in favor of constant-time tests for whether the list has zero, one, or more than one member, which is what the callers really need to know. JsonValueListLength() was not a hot code path, so this aspect of the patch won't move the needle in the least performance-wise. But it seems neater. I've not done any wide-ranging performance testing, but this should be faster than the old code thanks to reduction of palloc overhead. On the specific example shown above, it's about twice as fast as before on not-very-large inputs; and of course it wins big if you consider an input large enough to drive the old code into swapping. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/569394.1773783211@sss.pgh.pa.us	2026-03-19 11:37:14 -04:00
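The storage idea in the commit above (a few elements kept inline in the base struct, with heap storage only for longer lists) can be sketched roughly as follows. This is an illustrative sketch, not the PostgreSQL code: `Item`, `ItemList`, `INLINE_ITEMS`, and all function names are hypothetical, plain `realloc` doubling stands in for whatever growth scheme the patch actually uses, and `Item` is a stand-in for `JsonbValue`.

```c
#include <stdlib.h>

/* Hypothetical stand-in for JsonbValue; the real struct is larger. */
typedef struct Item
{
    int val;
} Item;

#define INLINE_ITEMS 2          /* assumed inline capacity, for illustration */

typedef struct ItemList
{
    size_t nitems;              /* total number of stored elements */
    size_t heap_cap;            /* capacity of the heap array, 0 if unused */
    Item   first[INLINE_ITEMS]; /* short lists live entirely in here */
    Item  *heap;                /* elements beyond the inline ones */
} ItemList;

static void
item_list_init(ItemList *l)
{
    l->nitems = 0;
    l->heap_cap = 0;
    l->heap = NULL;
}

/* Append a copy of *item. Returns 0 on success, -1 on out of memory. */
static int
item_list_append(ItemList *l, const Item *item)
{
    if (l->nitems < INLINE_ITEMS)
    {
        l->first[l->nitems++] = *item;  /* no allocation for short lists */
        return 0;
    }

    size_t used = l->nitems - INLINE_ITEMS;

    if (used == l->heap_cap)
    {
        /* Grow geometrically so long lists need few allocation requests. */
        size_t newcap = l->heap_cap ? l->heap_cap * 2 : 8;
        Item  *p = realloc(l->heap, newcap * sizeof(Item));

        if (p == NULL)
            return -1;
        l->heap = p;
        l->heap_cap = newcap;
    }
    l->heap[used] = *item;
    l->nitems++;
    return 0;
}

static const Item *
item_list_get(const ItemList *l, size_t i)
{
    return i < INLINE_ITEMS ? &l->first[i] : &l->heap[i - INLINE_ITEMS];
}

/* Constant-time tests of the kind the callers are said to need. */
static int item_list_is_empty(const ItemList *l)     { return l->nitems == 0; }
static int item_list_is_singleton(const ItemList *l) { return l->nitems == 1; }

/* Frees only the element storage, never anything the elements point to. */
static void
item_list_free(ItemList *l)
{
    free(l->heap);
    item_list_init(l);
}
```

Because appending always copies the element, a caller can pass the address of a local variable, which matches the call-site simplification the commit message describes.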
Peter Eisentraut	7724cb9935	Add some const qualifiers enabled by typeof_unqual change on copyObject The recent commit to change copyObject() to use typeof_unqual allows cleaning up some APIs to take advantage of this improved qualifier handling. EventTriggerCollectSimpleCommand() is a good example: It takes a node tree and makes a copy that it keeps around for its internal purposes, but it can't communicate via its function signature that it promises not to scribble on the passed node tree. That is now fixed. Reviewed-by: David Geier <geidav.pg@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/92f9750f-c7f6-42d8-9a4a-85a3cbe808f3%40eisentraut.org	2026-03-19 06:35:54 +01:00
Michael Paquier	46fb08aff6	Add commit `015d32016d` to .git-blame-ignore-revs.	2026-03-19 13:45:07 +09:00
Michael Paquier	015d32016d	test_saslprep: Apply proper indentation Noticed before koel has the idea to complain. Rebase thinko from commit `aa73838a5c`.	2026-03-19 13:42:24 +09:00
David Rowley	c95cd2991f	Short-circuit row estimation in NOT IN containing NULL consts ScalarArrayOpExpr used for either NOT IN or <>/= ALL, when the array contains a NULL constant, will never evaluate to true. Here we add an explicit short-circuit in scalararraysel() to account for this and return 0.0 rows when we see that a NULL exists. When the array is a constant, we can very quickly see if there are any NULL values and return early before going to much effort in scalararraysel(). For non-const arrays, we short-circuit after finding the first NULL and forego selectivity estimations of any remaining elements. In the future, it might be better to do something for this case in constant folding. We would need to be careful to only do this for strict operators on expressions located in places that don't care about distinguishing false from NULL returns. i.e. EXPRKIND_QUAL expressions. Doing that requires a bit more thought and effort, so here we just fix some needlessly slow selectivity estimations for ScalarArrayOpExpr containing many array elements and at least one NULL. Author: Ilia Evdokimov <ilya.evdokimov@tantorlabs.com> Reviewed-by: David Geier <geidav.pg@gmail.com> Reviewed-by: Zsolt Parragi <zsolt.parragi@percona.com> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/eaa2598c-5356-4e1e-9ec3-5fd6eb1cd704@tantorlabs.com	2026-03-19 17:16:36 +13:00
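The reason a NOT IN list containing NULL can never evaluate to true follows from SQL's three-valued logic, which the following sketch models for integers. Everything here is hypothetical and for illustration only: `TriBool`, `NullableInt`, `not_in`, and `not_in_selectivity_sketch` are not PostgreSQL names, and the real `scalararraysel()` works on expression nodes and statistics rather than plain values.

```c
#include <stdbool.h>
#include <stddef.h>

typedef enum { TV_FALSE, TV_TRUE, TV_NULL } TriBool;

typedef struct NullableInt
{
    bool isnull;
    int  value;
} NullableInt;

/*
 * Evaluate "x <> ALL (elems)", i.e. the AND of (x <> elem) over all
 * elements, for a strict comparison operator.
 */
static TriBool
not_in(NullableInt x, const NullableInt *elems, size_t n)
{
    bool saw_null = false;

    for (size_t i = 0; i < n; i++)
    {
        if (x.isnull || elems[i].isnull)
        {
            saw_null = true;    /* this comparison yields NULL */
            continue;
        }
        if (x.value == elems[i].value)
            return TV_FALSE;    /* one false comparison decides the AND */
    }
    /* A NULL comparison means the result can be NULL at best, never true. */
    return saw_null ? TV_NULL : TV_TRUE;
}

/*
 * Hypothetical estimator shortcut: stop at the first NULL element and
 * report that no rows can pass; otherwise defer to some other estimate.
 */
static double
not_in_selectivity_sketch(const NullableInt *elems, size_t n, double fallback)
{
    for (size_t i = 0; i < n; i++)
    {
        if (elems[i].isnull)
            return 0.0;
    }
    return fallback;
}
```

Since a WHERE clause keeps only rows for which the condition is true, both the false and the NULL outcomes filter the row out, which is why a selectivity of 0.0 is a safe answer there.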
Michael Paquier	aa73838a5c	test_saslprep: Test module for SASLprep() This module includes two functions: - test_saslprep(), that performs pg_saslprep on a bytea. - test_saslprep_ranges(), able to check for all valid ranges of UTF-8 codepoints pg_saslprep() handles each one of them. This provides a detailed coverage of our implementation of SASLprep() used for SCRAM, with: - ASCII characters. - Incomplete UTF-8 sequences, for `390b3cbbb2` (later backpatched). - A more advanced check for all the valid UTF-8 ranges of codepoints, to check for cases where these generate an empty password, based on an original suggestion from Heikki Linnakangas. This part consumes resources and time, so it is implemented as a TAP test under a new PG_TEST_EXTRA value. A different patch is still under discussion to tweak our internal SASLprep() implementation, and this module can be used to track any changes in behavior. Author: Michael Paquier <michael@paquier.xyz> Reviewed-by: John Naylor <johncnaylorls@gmail.com> Discussion: https://postgr.es/m/aaEJ-El2seZHeFcG@paquier.xyz	2026-03-19 13:03:30 +09:00
Michael Paquier	79a5911fe6	Add more debugging information for bgworker termination tests of worker_spi widowbird has failed again after `af8837a10b`, with the same symptoms of a backend still lying around when attempting a database rename with a bgworker connected to the database being renamed. We are still not sure yet how the failure can be reached, if this is a timing issue in the test or an actual bug in the logic used for interruptible bgworkers. This commit adds more debugging information in the backend to help with the analysis as a temporary measure. Another thing I have noticed is that the queries launching the dynamic bgworkers or checking pg_stat_activity would connect to the database renamed. These are switched to use 'postgres'. That will hopefully remove some of the friction of the test, but I doubt that this is the end of the story. Discussion: https://postgr.es/m/abtJLEAsf1HZXWdR@paquier.xyz	2026-03-19 11:39:31 +09:00
Fujii Masao	645c6d05cc	doc: Clarify BUFFERS behavior without ANALYZE in EXPLAIN This commit clarifies the documentation for the BUFFERS option of EXPLAIN by explicitly describing its behavior when ANALYZE is not specified. Author: Ryo Matsumura <matsumura.ryo@fujitsu.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/TYRPR01MB13457D31085CB5B246DBBA1AFE845A@TYRPR01MB13457.jpnprd01.prod.outlook.com	2026-03-19 08:30:50 +09:00
Robert Haas	b335fe56f3	pg_plan_advice: Fix multiple copy-and-paste-errors in test case. The second half of this file is meant to test feedback, not generated advice, and is meant to use the statements that it prepares, not leftover prepared statements from earlier in the file. These mistakes resulted in failures under debug_discard_caches = 1, because re-executing pt2 instead of executing pt4 for the first time resulted in different output depending on whether the query was replanned. Reported-by: Tom Lane <tgl@sss.pgh.pa.us> (per BF member avocet)	2026-03-18 18:24:39 -04:00
Daniel Gustafsson	e87ab5049d	ssl: Skip passphrase reload tests in EXEC_BACKEND builds SSL password command reloading must be enabled on Windows and in EXEC_BACKEND builds due to them always reloading the context. The new tests in commit `4f433025` skipped under Windows but missed the EXEC_BACKEND check. Reported by buildfarm member culicidae. Author: Daniel Gustafsson <daniel@yesql.se> Discussion: https://postgr.es/m/CAOYmi+kXmCCgBWffzmSjaNhME5rD=gjyc_OP1FeWQTw2MmSNjg@mail.gmail.com	2026-03-18 22:59:57 +01:00
Tom Lane	f91b8ff6af	Fix -Wstrict-prototypes warning in ecpg_init_sqlca() declaration. When headerscheck compiles ecpglib_extern.h, POSTGRES_ECPG_INTERNAL is not defined, causing sqlca.h to expand "sqlca" as a macro (*ECPGget_sqlca()). This causes the ecpg_init_sqlca() declaration to trigger a -Wstrict-prototypes warning. Fix by renaming the parameter from "sqlca" to "sqlca_p" in both the declaration and definition, avoiding the macro expansion. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Reported-by: Nazir Bilal Yavuz <byavuz81@gmail.com> Diagnosed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAN55FZ1VDwJ-ZD092ChYf%2B%2BhuP%2B-S3Cg45tJ8jNH5wx2c4BHAg%40mail.gmail.com	2026-03-18 15:27:49 -04:00
Nathan Bossart	ec80215c03	pg_restore: Remove unnecessary strlen() calls in options parsing. Unlike pg_dump and pg_dumpall, pg_restore first checks whether the argument passed to --format, --host, and --port is empty before setting the corresponding variable. Consequently, pg_restore does not error if given an empty format name, whereas pg_dump and pg_dumpall do. Empty arguments for --host and --port are ignored by all three applications, so this commit produces no functionality changes there. This behavior should perhaps be reconsidered, but that is left as a future exercise. As with other recent changes to option handling for these applications (commits `b2898baaf7`, `7c8280eeb5`, and `be0d0b457c`), no back-patch. Author: Mahendra Singh Thalor <mahi6run@gmail.com> Reviewed-by: Srinath Reddy Sadipiralla <srinath2133@gmail.com> Discussion: https://postgr.es/m/CAKYtNApkh%3DVy2DpNRCnEJmPpxNuksbAh_QBav%3D2fLmVjBhGwFw%40mail.gmail.com	2026-03-18 14:22:15 -05:00
Jeff Davis	1c5bf1185a	ALTER SUBSCRIPTION ... SERVER test. Test ALTER SUBSCRIPTION ... SERVER and ALTER SUBSCRIPTION ... CONNECTION, including invalidation. Also run perltidy on the test file. Discussion: https://postgr.es/m/CAExHW5vV5znEvecX=ra2-v7UBj9-M6qvdDzuB78M-TxbYD1PEA@mail.gmail.com Suggested-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>	2026-03-18 10:15:51 -07:00
Tom Lane	8df3c7a85e	Exclude contrib/pg_plan_advice/pgpa_parser.h from headerscheck. Like other Bison-written headers, it's not worth the trouble to make this compilable standalone. (We might revisit this someday, if we ever move up our minimum required Bison version.)	2026-03-18 13:10:22 -04:00
Jeff Davis	b71bf3b845	Fix pg_dump for CREATE FOREIGN DATA WRAPPER ... CONNECTION. Discussion: https://postgr.es/m/7eb0c03b4312b32cb76d340023b39a751745a1f9.camel@j-davis.com	2026-03-18 09:58:42 -07:00
Peter Eisentraut	29bf4ee749	Enable -Wstrict-prototypes and -Wold-style-definition by default Those are available in all gcc and clang versions that support C11 and, since C11 is required as of `f5e0186f86`, we can add them without a capability test. Having them enabled by default avoids having to chase these manually like `11171fe1fc`, `cdf4b9aff2`, `0e72b9d440`, `7069dbcc31`, `f1283ed6cc`, `7b66e2c086`, `e95126cf04` and `9f7c527af3` have done. Also, readline headers trigger a lot of warnings with -Wstrict-prototypes, so we make use of the system_header pragma to hide the warnings. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://postgr.es/m/13d51b20-a69c-4ac1-8546-ec4fc278064f%40eisentraut.org Discussion: https://postgr.es/m/aTFctZwWSpl2/LG5%40ip-10-97-1-34.eu-west-3.compute.internal	2026-03-18 14:31:50 +01:00
Peter Eisentraut	9b406a9e48	Update RELEASE_CHANGES The existing instructions did not cover meson. Point to src/common/unicode/README instead, where there is more information. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/2a668979-ed92-49a3-abf9-a3ec2d460ec2%40eisentraut.org	2026-03-18 13:42:06 +01:00
Peter Eisentraut	1b0c269f2e	Implement unaccent Unicode data update in meson The meson/ninja update-unicode target did not cover the required updates in contrib/unaccent/. This is fixed now. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Alexander Borisov <lex.borisov@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/2a668979-ed92-49a3-abf9-a3ec2d460ec2%40eisentraut.org	2026-03-18 13:42:05 +01:00
Daniel Gustafsson	4f433025f6	ssl: Serverside SNI support for libpq Support for SNI was added to clientside libpq in `5c55dc8b47` with the sslsni parameter, but there was no support for utilizing it serverside. This adds support for serverside SNI such that certificate/key handling is available per host. A new config file, $datadir/pg_hosts.conf, is used for configuring which certificate and key should be used for which hostname. In order to use SNI the ssl_sni GUC must be set to on; when it is off the ssl configuration works just like before. If ssl_sni is enabled and pg_hosts.conf is non-empty it will take precedence over the regular SSL GUCs; if it is empty or missing the regular GUCs will be used just as before this commit with no hostname specific handling. The TLS init hook is not compatible with ssl_sni since it operates on a single TLS configuration and SNI breaks that assumption. If the init hook and ssl_sni are both enabled, a WARNING will be issued. Host configuration can either be for a literal hostname to match, non-SNI connections using the no_sni keyword or a default fallback matching all connections. By omitting no_sni and the fallback a strict mode can be achieved where only connections using sslsni=1 and a specified hostname are allowed. CRL file(s) are applied from postgresql.conf to all configured hostnames. Serverside SNI requires OpenSSL, currently LibreSSL does not support the required infrastructure to update the SSL context during the TLS handshake.
Author: Daniel Gustafsson <daniel@yesql.se> Co-authored-by: Jacob Champion <jacob.champion@enterprisedb.com> Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com> Reviewed-by: Zsolt Parragi <zsolt.parragi@percona.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Dewei Dai <daidewei1970@163.com> Reviewed-by: Cary Huang <cary.huang@highgo.ca> Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Discussion: https://postgr.es/m/1C81CD0D-407E-44F9-833A-DD0331C202E5@yesql.se	2026-03-18 12:37:11 +01:00
Daniel Gustafsson	25e568ba7c	ssl: Add tests for client CA These tests were originally written to test the SSL SNI patchset but they have merit on their own since we lack coverage for these scenarios in the non SNI case as well. Author: Jacob Champion <jacob.champion@enterprisedb.com> Co-authored-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://postgr.es/m/1C81CD0D-407E-44F9-833A-DD0331C202E5@yesql.se	2026-03-18 12:36:53 +01:00
Peter Eisentraut	e82fc27e09	meson: Add headerscheck and cpluspluscheck targets Author: Miłosz Bieniek <bieniek.milosz0@gmail.com> Co-authored-by: Peter Eisentraut <peter@eisentraut.org> Reviewed-by: Bilal Yavuz <byavuz81@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CAMSWrt-PoQt4sHryWrB1ViuGBJF_PpbjoSGrWR2Ry47bHNLDqg%40mail.gmail.com	2026-03-18 11:27:43 +01:00
Peter Eisentraut	720c9b504e	meson: Add {perl|python}_includespec to generated Makefile.global This is meant to help enable headerscheck under meson, but can also be useful in general, for example for third-party extensions that might use these values. Author: Nazir Bilal Yavuz <byavuz81@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CAMSWrt-PoQt4sHryWrB1ViuGBJF_PpbjoSGrWR2Ry47bHNLDqg%40mail.gmail.com	2026-03-18 11:09:14 +01:00
Peter Eisentraut	905e44152a	Allow setting the collation strength in ICU tailoring rules There was a bug that if you created an ICU collation with tailoring rules, any strength specification inside the rules was ignored. This was because we called ucol_openRules() with UCOL_DEFAULT_STRENGTH for the strength argument, which overrides the strength. This was because of faulty guidance in the ICU documentation, which has since been fixed. The correct invocation is to use UCOL_DEFAULT for the strength argument. This fixes bug #18771 and bug #19425. Author: Daniel Verite <daniel@manitou-mail.org> Reported-by: Ruben Ruiz <ruben.ruizcuadrado@gmail.com> Reported-by: dorian.752@live.fr Reported-by: Todd Lang <Todd.Lang@D2L.com> Discussion: https://www.postgresql.org/message-id/flat/YT2PPF959236618377A072745A280E278F4BE1DA@YT2PPF959236618.CANPRD01.PROD.OUTLOOK.COM Discussion: https://www.postgresql.org/message-id/flat/18771-98bb23e455b0f367@postgresql.org Discussion: https://www.postgresql.org/message-id/flat/19425-58915e19dacd4f40%40postgresql.org	2026-03-18 08:58:47 +01:00
David Rowley	374a6394c6	Move planner row-estimation tests to new planner_est.sql Move explain_mask_costs() and the associated planner row-estimation tests from misc_functions.sql to a new regression test file, planner_est.sql. Previously, there wasn't an ideal home for such tests, likely as there were very few such tests due to width and selectivity estimations being too dependent on statistics and hardware. That's not always the case, as we have SupportRequestRows support functions. More such tests are possibly on the way, so let's create a better home so that we don't have to create the explain_mask_costs() function in each file we might have added such tests to. Author: Ilia Evdokimov <ilya.evdokimov@tantorlabs.com> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/CAApHDvphShGABn-3AoE36dTvGHW7gUpFSw0_ZZnH84wGCW3hHw@mail.gmail.com	2026-03-18 17:22:05 +13:00
Michael Paquier	ab697307dd	test_plan_advice: Add .gitignore Issue noticed while playing with the tree.	2026-03-18 11:04:10 +09:00
Andrew Dunstan	3b4c2b9db2	Allow IS JSON predicate to work with domain types The IS JSON predicate only accepted the base types text, json, jsonb, and bytea. Extend it to also accept domain types over those base types by resolving through getBaseType() during parse analysis. The base type OID is stored in the JsonIsPredicate node (as exprBaseType) so the executor can dispatch to the correct validation path without repeating the domain lookup at runtime. When a non-supported type (or domain over a non-supported type) is used, the error message displays the original type name as written by the user, rather than the resolved base type. Author: jian he <jian.universality@gmail.com> Reviewed-by: Andrew Dunstan <andrew@dunslane.net> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Discussion: https://postgr.es/m/CACJufxEk34DnJFG72CRsPPT4tsJL9arobX0tNPsn7yH28J=zQg@mail.gmail.com	2026-03-17 15:20:22 -04:00
Andres Freund	f5eb854ab6	Fix use of wrong variable in _hash_kill_items() In `82467f627b` I somehow ended up using 'so->currPos.buf' instead of the 'buf' variable, which is incorrect when the buffer is not already pinned. At the very least this can lead to assertion failures. Unfortunately this shows that this code path was not covered. Expand src/test/modules/index/specs/killtuples.spec to test it. Until now the 'result' step always reported either 0 or 1 buffer accesses, but when exercising hash overflows, more buffers are accessed. To avoid depending on the precise number of accesses, change the result step to return whether there were any heap accesses. That makes the change a lot more verbose, but still seems worth it. Reported-by: Alexander Kuzmenkov <akuzmenkov@tigerdata.com> Reported-by: Alexander Lakhin <exclusion@gmail.com> Reported-by: Heikki Linnakangas <hlinnaka@iki.fi> Discussion: https://postgr.es/m/vjtmvwvbxt7w5uyacxpzibpj65ewcb7uqaqbhd4arvnjbp5jqz%405ksdh6fsyqve Discussion: https://postgr.es/m/b9de8d05-3b02-4a27-9b0b-03972fa4bfd3@iki.fi	2026-03-17 14:54:41 -04:00
Robert Haas	01b02c0eca	pg_plan_advice: Avoid a crash under GEQO. The previous code could allocate pgpa_sj_unique_rel objects in a context that had too short a lifespan. Fix by allocating them (and any associated List-related allocations) in the same context as the pgpa_planner_state to which they are attached. We also need to copy uniquerel->relids, because the associated RelOptInfo may also be allocated within a short-lived context. Reported-by: Alexander Lakhin <exclusion@gmail.com> Discussion: http://postgr.es/m/a6e6d603-e847-44dc-acd5-879fb4570062@gmail.com	2026-03-17 14:25:43 -04:00
Robert Haas	e0e4c132ef	Test pg_plan_advice using a new test_plan_advice module. The TAP test included in this new module runs the regression tests with pg_plan_advice loaded. It arranges for each query to be planned twice. The first time, we generate plan advice. The second time, we replan the query using the resulting advice string. If the tests fail, that means that using pg_plan_advice to tell the planner to do what it was going to do anyway breaks something, which indicates a problem either with pg_plan_advice or with the planner. The test also enables pg_plan_advice.feedback_warnings, so that if the plan advice fails to apply cleanly when the query is replanned, a failure will occur. Reviewed-by: Alexandra Wang <alexandra.wang.oss@gmail.com> Reviewed-by: Lukas Fittl <lukas@fittl.com> Discussion: http://postgr.es/m/CA%2BTgmoZzM2i%2Bp-Rxdphs4qx7sshn-kzxF91ASQ5duOo0dFRXLQ%40mail.gmail.com	2026-03-17 14:06:26 -04:00
Robert Haas	59dcc19b39	pg_plan_advice: Always install pg_plan_advice.h, and in the right place The Makefile failed to set HEADERS_pg_plan_advice, so the header wasn't installed. Fixing that reveals another problem: since this is just a loadable module, not an extension, the header file is installed into $(includedir_server)/contrib rather than $(includedir_server)/extension. While we have no existing cases of installing header files there, it appears to be the intent of pgxs.mk. However, this is inconsistent with meson.build, which was using dir_include_extension. Changing that to dir_include_server / 'contrib' makes the install locations consistent across the two builds. Author: Zsolt Parragi <zsolt.parragi@percona.com> Discussion: http://postgr.es/m/CAN4CZFP6NOjv__4Mx+iQD8StdpbHvzDAatEQn2n15UKJ=MySSQ@mail.gmail.com	2026-03-17 12:53:13 -04:00
Nathan Bossart	4b5ba0c4ca	pg_dump: Simplify query for retrieving attribute statistics. This query fetches information from pg_stats, which did not return table OIDs until recent commit `3b88e50d6c`. Because of this, we had to cart around arrays of schema and table names, and we needed an extra filter clause to hopefully convince the planner to use the correct index. With the introduction of pg_stats.tableid, we can instead just use an array of OIDs, and we no longer need the extra filter clause hack. Author: Corey Huinker <corey.huinker@gmail.com> Reviewed-by: Sami Imseih <samimseih@gmail.com> Discussion: https://postgr.es/m/CADkLM%3DcoCVy92QkVUUTLdo5eO2bMDtwMrzRn_8miAhX%2BuPaqXg%40mail.gmail.com	2026-03-17 11:32:40 -05:00
Peter Eisentraut	2eb6cd327c	Hardcode typeof_unqual to __typeof_unqual__ for clang A new attempt was made in `63275ce84d` to make typeof_unqual work on all configurations of CC and CLANG. This re-introduced an old problem though, where CLANG would only support __typeof_unqual__ but the configure check for CC detected support for typeof_unqual. This fixes that by always defining typeof_unqual as __typeof_unqual__ under clang. Author: Jelte Fennema-Nio <postgres@jeltef.nl> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/92f9750f-c7f6-42d8-9a4a-85a3cbe808f3%40eisentraut.org	2026-03-17 16:44:43 +01:00
Robert Haas	7560995a38	pg_plan_advice: Fix variable type confusion. pgs_mask values should always be uint64, but in a couple of places I incorrectly used uint32. Fix that. Reported-by: David Rowley <dgrowleyml@gmail.com> Discussion: http://postgr.es/m/CAApHDvquH6wnp4fhpaCOkC4R3KAvr2BOTbhhDPDQCBNR3YbLMQ@mail.gmail.com	2026-03-17 11:34:26 -04:00
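The kind of bug the commit above guards against can be illustrated with a tiny sketch: storing a 64-bit mask in a 32-bit variable silently drops any bit above bit 31. The flag value and function names below are hypothetical and chosen only to make the truncation visible; whether pgs_mask currently uses such high bits is not stated in the commit message.

```c
#include <stdint.h>

/* Hypothetical flag living in the upper half of a 64-bit mask. */
#define HYPOTHETICAL_HIGH_FLAG (UINT64_C(1) << 40)

/* Wrong: the conversion to uint32_t discards bits 32..63. */
static int
flag_survives_uint32(uint64_t mask)
{
    uint32_t narrowed = (uint32_t) mask;

    return (narrowed & HYPOTHETICAL_HIGH_FLAG) != 0;
}

/* Right: keep the mask in a type as wide as its definition. */
static int
flag_survives_uint64(uint64_t mask)
{
    uint64_t kept = mask;

    return (kept & HYPOTHETICAL_HIGH_FLAG) != 0;
}
```

Compilers usually accept the narrowing assignment without complaint unless a warning such as -Wconversion is enabled, which is why this class of mistake tends to be found by review rather than by the build.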
Andrew Dunstan	ecd9288624	make immutability tests in to_json and to_jsonb complete Complete the TODOs in to_json_is_immutable() and to_jsonb_is_immutable() by recursing into container types (arrays, composites, ranges, multiranges, domains) to check element/sub-type mutability, rather than conservatively returning "mutable" for all arrays and composites. The shared logic is factored into a single json_check_mutability() function in jsonfuncs.c, with the existing exported functions as thin wrappers. Composite type inspection uses lookup_rowtype_tupdesc() (typcache) instead of relation_open() to avoid unnecessary lock acquisition in the optimizer. Range and multirange types are now also checked recursively: if the subtype's conversion is immutable, the range is considered immutable for JSON purposes, even though range_out is generically marked STABLE. This is a behavioral change: range types with immutable subtypes (e.g., int4range) can now appear in expression indexes via JSON_ARRAY/JSON_OBJECT, whereas previously they were conservatively rejected. Add regression tests for JSON_ARRAY and JSON_OBJECT mutability with expression indexes and generated columns, covering arrays, composites, domains, ranges, multiranges and combinations thereof. Author: Jian He <jian.universality@gmail.com> Reviewed-by: Andrew Dunstan <andrew@dunslane.net> Discussion: https://postgr.es/m/CACJufxFz=OsXQdsMJ-cqoqspD9aJrwntsQP-U2A-UaV_M+-S9g@mail.gmail.com Commitfest: https://commitfest.postgresql.org/patch/5759	2026-03-17 11:28:33 -04:00
Nathan Bossart	3b88e50d6c	Add more columns to pg_stats, pg_stats_ext, and pg_stats_ext_exprs. This commit adds table OID and attribute number columns to pg_stats, and it adds table OID and statistics object OID columns to pg_stats_ext and pg_stats_ext_exprs. A proposed follow-up commit would use pg_stats.tableid to simplify a query in pg_dump. The others have no immediate purpose but may be useful later. Bumps catversion. Author: Corey Huinker <corey.huinker@gmail.com> Reviewed-by: Sami Imseih <samimseih@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CADkLM%3DcoCVy92QkVUUTLdo5eO2bMDtwMrzRn_8miAhX%2BuPaqXg%40mail.gmail.com	2026-03-17 09:26:27 -05:00
Peter Eisentraut	c9babbc881	Dump labels in reproducible order In pg_get_propgraphdef(), sort the labels before writing out, for a consistent dump order. Also, since we now have a list, we can get rid of the separate table scan to get the count. Co-authored-by: Peter Eisentraut <peter@eisentraut.org> Co-authored-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Co-authored-by: Andrew Dunstan <andrew@dunslane.net> Reviewed-by: Junwang Zhao <zhjwpku@gmail.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/a855795d-e697-4fa5-8698-d20122126567@eisentraut.org	2026-03-17 14:07:29 +01:00
Heikki Linnakangas	2c1a7d421f	Don't leave behind files in src dir in 007_multixact_conversion.pl pg_upgrade test 007_multixact_conversion.pl was leaving files like delete_old_cluster.sh in the source directory for VPATH and meson builds. To fix, change the tmp_check directory before running the test, like in the other pg_upgrade tests. Author: Hayato Kuroda <kuroda.hayato@fujitsu.com> https://www.postgresql.org/message-id/TYRPR01MB121563A4DA8B2FE9A2ECB79F5F541A@TYRPR01MB12156.jpnprd01.prod.outlook.com	2026-03-17 11:24:52 +02:00
Peter Eisentraut	182cdf5aea	pg_dump: Add appropriate version check Some code added by commit `2f094e7ac6` needs to be behind a version check so that it is not run against older databases. Author: Andrew Dunstan <andrew@dunslane.net> Reviewed-by: Junwang Zhao <zhjwpku@gmail.com> Discussion: https://www.postgresql.org/message-id/afe3f099-3271-4fc4-8e32-467b5309affb%40dunslane.net	2026-03-17 09:46:06 +01:00
Michael Paquier	233e6ae953	gen_guc_tables.pl: Improve detection of inconsistent data This commit adds two improvements to gen_guc_tables.pl: 1) When finding two entries with the same name, the script complained about these being not in alphabetical order, which was confusing. Duplicated entries are now reported as their own error. 2) While the presence of the required fields is checked for all the parameters, the script did not perform any checks on the non-required fields. A check is added to check that any field defined matches with what can be accepted. Previously, a typo in the name of a required field would cause the field to be reported as missing. Non-mandatory fields would be silently ignored, which was problematic as we could lose some information. Author: Zsolt Parragi <zsolt.parragi@percona.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CAN4CZFP=3xUoXb9jpn5OWwicg+rbyrca8-tVmgJsQAa4+OExkw@mail.gmail.com	2026-03-17 17:38:55 +09:00
Michael Paquier	1a7ccd2b33	Refactor some code around ALTER TABLE [NO] INHERIT [NO] INHERIT is not supported for partitioned tables, but this portion of tablecmds.c did not apply the same rules as the other sub-commands, checking the relkind in the execution phase, not the preparation phase. This commit refactors the code to centralize the relkind and other checks in the preparation phase for both command patterns, getting rid of one translatable string on the way. ATT_PARTITIONED_TABLE is removed from ATSimplePermissions(), and the child relation is checked the same way for both sub-commands. The ALTER TABLE patterns that now fail at preparation failed already at execution, hence there should be no changes from the user perspective except more consistent error messages generated. Some comments at the top of ATPrepAddInherit() were incorrect, CreateInheritance() being the routine checking the columns and constraints between the parent and its to-be-child. Author: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Zsolt Parragi <zsolt.parragi@percona.com> Discussion: https://postgr.es/m/CAEoWx2kggo1N2kDH6OSfXHL_5gKg3DqQ0PdNuL4LH4XSTKJ3-g@mail.gmail.com	2026-03-17 14:34:29 +09:00
Michael Paquier	cbf9a72993	Add regression test for ALTER TABLE .. NO INHERIT on typed tables This test addition has come up as a suggestion by me, while discussing a patch that manipulates the area of the code related to this command pattern. Author: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CAEoWx2kggo1N2kDH6OSfXHL_5gKg3DqQ0PdNuL4LH4XSTKJ3-g@mail.gmail.com	2026-03-17 13:14:02 +09:00
Michael Paquier	af8837a10b	Tweak TAP test for worker terminations in worker_spi The test has been reported as having a race condition for the case of a worker that should be terminated after a database rename. Based on the report received from buildfarm member jay, the database renamed is accessed by a different session, preventing the ALTER DATABASE from completing, ultimately failing the test. Honestly, I am not completely sure what is the origin of this disturbance, but two possibilities are an autovacuum or parallel worker (due to debug_parallel_query being used by the host). In order to (hopefully) stabilize the test, autovacuum and debug_parallel_query are now disabled in the configuration of the node used in the test. The failure is hard to reproduce, so it will take a few weeks to make sure that the test has become stable. Let's see where it goes. Reported-by: Aya Iwata <iwata.aya@fujitsu.com> Discussion: https://postgr.es/m/OS3PR01MB8889505E2F3E443CCA4BD72EEA45A@OS3PR01MB8889.jpnprd01.prod.outlook.com	2026-03-17 12:56:46 +09:00
David Rowley	d8a859d22b	Reduce size of CompactAttribute struct to 8 bytes Previously, this was 16 bytes. With the use of some bitflags and by reducing the attcacheoff field size to a 16-bit type, we can halve the size of the struct. It's unlikely that caching the offsets for offsets larger than what will fit in a 16-bit int will help much as the tuple is very likely to have some non-fixed-width types anyway, the offsets of which we cannot cache. Shrinking this down to 8 bytes helps by accessing fewer cachelines when performing tuple deformation. The fields used there are all fully fledged fields, which don't require any bitmasking to extract the value of. It also helps to more efficiently calculate the address of a compact_attrs[] element in TupleDesc as the x86 LEA instruction can work with 8 byte offsets, which allows the element address to be calculated from the TupleDesc's address in a single instruction using LEA's concurrent shift and add. Author: David Rowley <dgrowleyml@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/CAApHDvodSVBj3ypOYbYUCJX%2BNWL%3DVZs63RNBQ_FxB_F%2B6QXF-A%40mail.gmail.com	2026-03-17 15:06:31 +13:00
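The packing approach described in the commit above can be sketched as follows. The layout is hypothetical: the field names echo the commit message where it names them (`attcacheoff`) and are otherwise invented, the flag bits are illustrative, and the real CompactAttribute definition may differ. The sketch only shows how 16-bit fields plus a flags byte fit in 8 bytes, and that element addresses then sit at multiples of 8.

```c
#include <stdint.h>
#include <stddef.h>

/* Illustrative flag bits for rarely consulted properties. */
#define SKETCH_FLAG_HASMISSING  0x01
#define SKETCH_FLAG_ISDROPPED   0x02

typedef struct CompactAttrSketch
{
    int16_t attcacheoff;    /* cached offset, -1 if not cacheable */
    int16_t attlen;         /* fixed length, or negative for varlena */
    uint8_t attbyval;       /* plain field: read without any masking */
    uint8_t attalignby;     /* plain field: alignment in bytes */
    uint8_t flags;          /* SKETCH_FLAG_* bits */
    /* one byte of trailing padding rounds the struct up to 8 bytes */
} CompactAttrSketch;

/* Holds on common ABIs where int16_t is 2-byte aligned. */
_Static_assert(sizeof(CompactAttrSketch) == 8,
               "sketch struct is expected to be 8 bytes");

/* Address of element i is base + i * 8, a single scaled-index computation. */
static const CompactAttrSketch *
sketch_attr(const CompactAttrSketch *base, size_t i)
{
    return &base[i];
}

static int
sketch_has_missing(const CompactAttrSketch *att)
{
    return (att->flags & SKETCH_FLAG_HASMISSING) != 0;
}
```

Keeping the fields read during tuple deformation as ordinary members, and pushing only the seldom-read properties into the flags byte, mirrors the trade-off the commit message describes.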
Fujii Masao	d927b4bd97	Fix WAL flush LSN used by logical walsender during shutdown Commit `6eedb2a5fd` made the logical walsender call XLogFlush(GetXLogInsertRecPtr()) to ensure that all pending WAL is flushed, fixing a publisher shutdown hang. However, if the last WAL record ends at a page boundary, GetXLogInsertRecPtr() can return an LSN pointing past the page header, which can cause XLogFlush() to report an error. A similar issue previously existed in the GiST code. Commit `b1f14c9672` introduced GetXLogInsertEndRecPtr(), which returns a safe WAL insertion end location (returning the start of the page when the last record ends at a page boundary), and updated the GiST code to use it with XLogFlush(). This commit fixes the issue by making the logical walsender use XLogFlush(GetXLogInsertEndRecPtr()) when flushing pending WAL during shutdown. Backpatch to all supported versions. Reported-by: Andres Freund <andres@anarazel.de> Author: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/vzguaguldbcyfbyuq76qj7hx5qdr5kmh67gqkncyb2yhsygrdt@dfhcpteqifux Backpatch-through: 14	2026-03-17 08:10:20 +09:00
Jeff Davis	f4af7849b3	Clean up postgres_fdw/t/010_subscription.pl. The test was based on test/subscription/002_rep_changes.pl, but had some leftover copy+paste problems that were useless and/or distracting. Discussion: https://postgr.es/m/CAA4eK1+=V_UFNHwcoMFqzy0F4AtS9_GyXhQDUzizgieQPWr=0A@mail.gmail.com Reported-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>	2026-03-16 13:42:55 -07:00
David Rowley	7a2ab122a1	Fix thinko in nocachegetattr() and nocache_index_getattr() This code was recently adjusted by `c456e3911`, but that commit didn't get the logic correct when finding the attnum to start walking the tuple in. If there is a NULL, we need to start walking the tuple before it. Author: David Rowley <dgrowleyml@gmail.com> Reported-by: Tender Wang <tndrwang@gmail.com> Discussion: https://postgr.es/m/CAHewXNnb-s_=VdVUZ9h7dPA0u3hxV8x2aU3obZytnqQZ_MiROA@mail.gmail.com	2026-03-17 09:00:39 +13:00
Robert Haas	5e72ce2467	pg_plan_advice: Fix failures to accept identifier keywords. TOK_IDENT allows only non-keywords; identifier should be used any place where either keywords or non-keywords should be accepted. Hence, without this commit, any string that happens to be a keyword can't be used as a partition schema, partition name, or plan name, which is incorrect. Author: Lukas Fittl <lukas@fittl.com> Discussion: http://postgr.es/m/CAP53PkzKeD=t90OfeMsniYrcRe2THQbUx3g6wV17Y=ZtiwmWTQ@mail.gmail.com	2026-03-16 14:46:50 -04:00
Peter Eisentraut	4f888d0f94	Fix whitespace	2026-03-16 19:33:13 +01:00
Peter Eisentraut	63275ce84d	Hardcode override of typeof_unqual for clang-for-bitcode The fundamental problem is that when we call clang to generate bitcode, we might be using configure results from a different compiler, which might not be fully compatible with the clang we are using. In practice, clang supports most things other compilers support, so this has apparently not been a problem in practice. But commits `4cfce4e62c`, `0af05b5dbb`, and `59292f7aac` have been struggling to make typeof_unqual work in this situation. Clang added support in version 19, GCC in version 14, so if you are using, say, GCC 14 and Clang 16, the compilation with the latter will fail. Such combinations are not very likely in practice, because GCC 14 and Clang 19 were released within a few months of each other, and so Linux distributions are likely to have suitable combinations. But some buildfarm members and some Fedora versions are affected, so this tries to fix it. The fully correct solution would be to run a separate set of configure tests for that clang-for-bitcode, but that would be very difficult to implement, and probably of limited use in practice. So the workaround here is that we hardcodedly override the configure result under clang based on the version number. As long as we only have a few of these cases, this should be manageable. Also swap the order of the tests of typeof_unqual: Commit `59292f7aac` tested the underscore variant first, but the reasons for that are now gone. Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/92f9750f-c7f6-42d8-9a4a-85a3cbe808f3%40eisentraut.org	2026-03-16 19:24:49 +01:00
Nathan Bossart	be0d0b457c	pg_dumpall: Fix handling of incompatible options. This commit teaches pg_dumpall to fail when both --clean and --data-only are specified. Previously, it passed the options through to pg_dump, which would fail after pg_dumpall had already started producing output. Like recent commits `b2898baaf7` and `7c8280eeb5`, no back-patch. Author: Mahendra Singh Thalor <mahi6run@gmail.com> Reviewed-by: Srinath Reddy Sadipiralla <srinath2133@gmail.com> Discussion: https://postgr.es/m/CAKYtNArrHiJ0LDB9BFZiUWs6tC78QkBN50wiwO07WhxewYDS3Q%40mail.gmail.com	2026-03-16 11:01:20 -05:00
Peter Eisentraut	cd8844e7db	Make some tests more stable by adding more explicit ordering for some tests added by commit `2f094e7ac6`, based on buildfarm results	2026-03-16 16:24:22 +01:00
Álvaro Herrera	fba4233c83	Reduce header inclusions via execnodes.h Remove a bunch of #include lines from execnodes.h. Most of these require suitable typedefs to be added, so that it still compiles standalone. In one case, the fix is to move a struct definition to the one .c file where it is needed. Also some light cleanup in plannodes.h and genam.h, though not as extensive as in execnodes.h. Author: Álvaro Herrera <alvherre@kurilemu.de> Author: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/202603131240.ihwqdxnj7w2o@alvherre.pgsql	2026-03-16 14:34:57 +01:00
Fujii Masao	57b5543bb8	Remove unstable test for pg_statio_all_sequences stats reset Commit `8fe315f18d` added the stats_reset column to pg_statio_all_sequences and included a regression test to verify that statistics in this view are reset correctly. However, this test caused buildfarm member crake to report a pg_upgradeCheck failure. The failing test assumed that the blks_read and blks_hit counters in pg_statio_all_sequences would be zero after calling pg_stat_reset_single_table_counters(). On crake, however, either blks_read or blks_hit sometimes appeared as 1 during the pg_upgradeCheck test, even right after the reset. Since these counters may change due to concurrent activity and the test is unstable, this commit removes the checks for blks_read and blks_hit in pg_statio_all_sequences from the regression test. Per buildfarm member crake. Discussion: https://postgr.es/m/CAHGQGwFcay_tX=7HSS=N=+Yd0FLEm2GrJgwxnqHM4wvxX0B=4g@mail.gmail.com	2026-03-16 21:05:13 +09:00
Peter Eisentraut	1e67508730	Fix pg_upgrade failure when extension_control_path is used When an extension is located via extension_control_path and it has a hardcoded $libdir/ path, this is stripped by the extension_control_path mechanism. But when pg_upgrade verifies the extension using LOAD, this stripping does not happen, and so pg_upgrade will fail because it cannot load the extension. To work around that, change pg_upgrade to itself strip the prefix when it runs its checks. A test case is also added. Author: Jonathan Gonzalez V. <jonathan.abdiel@gmail.com> Reviewed-by: Niccolò Fei <niccolo.fei@enterprisedb.com> Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/43b3691c673a8b9158f5a09f06eacc3c63e2c02d.camel%40gmail.com	2026-03-16 11:52:26 +01:00
Peter Eisentraut	5c2a8d272b	Use C11 alignas in typedef definitions They were already using pg_attribute_aligned. This replaces that with alignas and moves that into the required syntactic position. Suggested-by: Peter Eisentraut <peter@eisentraut.org> Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://postgr.es/m/d7a788fa-e609-4894-a8be-2f70e135424f%40eisentraut.org	2026-03-16 11:35:51 +01:00
Peter Eisentraut	d7ad79e506	Prevent -Wstrict-prototypes and -Wold-style-definition warnings A following commit will enable -Wstrict-prototypes and -Wold-style-definition by default. This commit fixes the warnings that those new flags will generate before actually adding the new flags. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://postgr.es/m/13d51b20-a69c-4ac1-8546-ec4fc278064f%40eisentraut.org	2026-03-16 10:53:24 +01:00
Peter Eisentraut	2f094e7ac6	SQL Property Graph Queries (SQL/PGQ) Implementation of SQL property graph queries, according to SQL/PGQ standard (ISO/IEC 9075-16:2023). This adds: - GRAPH_TABLE table function for graph pattern matching - DDL commands CREATE/ALTER/DROP PROPERTY GRAPH - several new system catalogs and information schema views - psql \dG command - pg_get_propgraphdef() function for pg_dump and psql A property graph is a relation with a new relkind RELKIND_PROPGRAPH. It acts like a view in many ways. It is rewritten to a standard relational query in the rewriter. Access privileges act similar to a security invoker view. (The security definer variant is not currently implemented.) Starting documentation can be found in doc/src/sgml/ddl.sgml and doc/src/sgml/queries.sgml. Author: Peter Eisentraut <peter@eisentraut.org> Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-by: Junwang Zhao <zhjwpku@gmail.com> Reviewed-by: Ajay Pal <ajay.pal.k@gmail.com> Reviewed-by: Henson Choi <assam258@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/a855795d-e697-4fa5-8698-d20122126567@eisentraut.org	2026-03-16 10:14:18 +01:00
Fujii Masao	fd6ecbfa75	Ensure "still waiting on lock" message is logged only once per wait. When log_lock_waits is enabled, the "still waiting on lock" message is normally emitted only once while a session continues waiting. However, if the wait is interrupted, for example by wakeups from client_connection_check_interval, SIGHUP for configuration reloads, or similar events, the message could be emitted again each time the wait resumes. For example, with very small client_connection_check_interval values (e.g., 100 ms), this behavior could flood the logs with repeated messages, making them difficult to use. To prevent this, this commit guards the "still waiting on lock" message so it is reported at most once during a lock wait, even if the wait is interrupted. This preserves the intended behavior when no interrupts occur. Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Hüseyin Demir <huseyin.d3r@gmail.com> Discussion: https://postgr.es/m/CAHGQGwHZUmg+r4kMcPYt_Z-txxVX+CJJhfra+qemxKXvAxYbpw@mail.gmail.com	2026-03-16 18:10:57 +09:00
Michael Paquier	c336133c65	Reject ALTER TABLE .. CLUSTER earlier for partitioned tables ALTER TABLE .. CLUSTER ON and SET WITHOUT CLUSTER are not supported for partitioned tables and already fail with a check happening when the sub-command is executed, not when it is prepared. This commit moves the relkind check for partitioned tables to happen when the sub-command is prepared in ATSimplePermissions(). This matches with the practice of the other sub-commands of ALTER TABLE, shaving one translatable string. mark_index_clustered() can be a bit simplified, switching one elog(ERROR) to an assertion. Note that mark_index_clustered() can also be called through a CLUSTER command, but it cannot be reached for a partitioned table, per the assertion based on the relkind in cluster_rel(), and there is only one caller of rebuild_relation(). Author: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Zsolt Parragi <zsolt.parragi@percona.com> Discussion: https://postgr.es/m/CAEoWx2kggo1N2kDH6OSfXHL_5gKg3DqQ0PdNuL4LH4XSTKJ3-g@mail.gmail.com	2026-03-16 17:48:39 +09:00
Fujii Masao	8fe315f18d	Add stats_reset column to pg_statio_all_sequences pg_statio_all_sequences lacked a stats_reset column, unlike the other pg_statio_* views that already expose it. This commit adds the column so users can see when the statistics in this view were last reset. Also this commit updates the documentation for pg_stat_reset_single_table_counters() to clarify that it can reset statistics for sequences and materialized views as well. Catalog version bumped. Author: Sami Imseih <samimseih@gmail.com> Co-authored-by: Shihao Zhong <zhong950419@gmail.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/CAA5RZ0v0OPGyDpwxkX81CtTt9xsj9-TNxhm=8JdOvEKPsVVFNg@mail.gmail.com	2026-03-16 17:24:08 +09:00
Peter Eisentraut	a41bc38439	Fix accidentally casting away const Recently introduced in commit `8c2b30487c`.	2026-03-16 07:37:03 +01:00
Amit Kapila	5f39698c90	Remove obsolete speculative insert cleanup in ReorderBuffer. Commit `4daa140a2f` introduced proper decoding for speculative aborts. As a result, the internal state is guaranteed to be clean when a new speculative insert is encountered. This patch removes the defensive cleanup code that is no longer reachable. Author: Antonin Houska <ah@cybertec.at> Discussion: https://postgr.es/m/23256.1772702981@localhost	2026-03-16 10:14:22 +05:30
Fujii Masao	d8879d34b9	file_fdw: Add regression test for file_fdw with ON_ERROR='set_null' Commit `2a525cc97e` introduced the ON_ERROR = 'set_null' option for COPY, allowing it to be used with foreign tables backed by file_fdw. However, unlike ON_ERROR = 'ignore', no regression test was added to verify this behavior for file_fdw. This commit adds a regression test to ensure that foreign tables using file_fdw work correctly with ON_ERROR = 'set_null', improving test coverage. Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Yi Ding <dingyi_yale@163.com> Discussion: https://postgr.es/m/CAHGQGwGmPc6aHpA5=WxKreiDePiOEitfOFsW2dSo5m81xWXgRA@mail.gmail.com	2026-03-16 12:13:11 +09:00
Michael Paquier	bfa3c4f106	Optimize hash index bulk-deletion with streaming read This commit refactors hashbulkdelete() to use streaming reads, improving the efficiency of the operation by prefetching upcoming buckets while processing a current bucket. There are some specific changes required to make sure that the cleanup work happens in accordance to the data pushed to the stream read callback. When the cached metadata page is refreshed to be able to process the next set of buckets, the stream is reset and the data fed to the stream read callback has to be updated. The reset needs to happen in two code paths, when _hash_getcachedmetap() is called. The author has seen better performance numbers than myself on this one (with tweaks similar to `6c228755ad`). The numbers are good enough for both of us that this change is worth doing, in terms of IO and runtime. Author: Xuneng Zhou <xunengzhou@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com> Discussion: https://postgr.es/m/CABPTF7VrqfbcDXqGrdLQ2xaQ=K0RzExNuw6U_GGqzSJu32wfdQ@mail.gmail.com	2026-03-16 09:22:09 +09:00
Tom Lane	82ff54377e	Move -ffast-math defense to float.c and remove the configure check. We had defenses against -ffast-math in timestamp-related files, which is a pretty obsolete place for them since we've not supported floating-point timestamps in a long time. Remove those and instead put one in float.c, which is still broken by using this switch. Add some commentary to put more color on why it's a bad idea. Also remove the check from configure. That was just there to fail faster, but it doesn't really seem necessary anymore, and besides we have no corresponding check in meson.build. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Suggested-by: Andres Freund <andres@anarazel.de> Suggested-by: Peter Eisentraut <peter@eisentraut.org> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/abFXfKC8zR0Oclon%40ip-10-97-1-34.eu-west-3.compute.internal	2026-03-15 19:34:52 -04:00
Tom Lane	c675d80d72	Be more careful about int vs. Oid in ecpglib. Print an OID value inserted into a SQL query with %u not %d. The existing code accidentally fails to malfunction when given an OID above 2^31, but only accidentally; future changes to our SQL parser could perhaps break it. Declare the Oid values that ecpg_type_infocache_push() and ecpg_is_type_an_array() work with as "Oid" not "int". This doesn't have any functional effect, but it's clearer. At the moment I don't see a need to back-patch this. Bug: #19429 Author: fairyfar@msn.com Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/19429-aead3b1874be1a99@postgresql.org	2026-03-15 18:55:46 -04:00
David Rowley	c456e39113	Optimize tuple deformation This commit includes various optimizations to improve the performance of tuple deformation. We now precalculate CompactAttribute's attcacheoff, which allows us to remove the code from the deform routines which was setting the attcacheoff. Setting the attcacheoff is now handled by TupleDescFinalize(), which must be called before the TupleDesc is used for anything. Having TupleDescFinalize() means we can store the first attribute in the TupleDesc which does not have an offset cached. That allows us to add a dedicated deforming loop to deform all attributes up to the final one with an attcacheoff set, or up to the first NULL attribute, whichever comes first. Here we also improve tuple deformation performance of tuples with NULLs. Previously, if the HEAP_HASNULL bit was set in the tuple's t_infomask, deforming would, one-by-one, check each and every bit in the NULL bitmap to see if it was zero. Now, we process the NULL bitmap 1 byte at a time rather than 1 bit at a time to find the attnum with the first NULL. We can now deform the tuple without checking for NULLs up to just before that attribute. We also record the maximum attribute number which is guaranteed to exist in the tuple, that is, has a NOT NULL constraint and isn't an atthasmissing attribute. When deforming only attributes prior to the guaranteed attnum, we've no need to access the tuple's natt count. As an additional optimization, we only count fixed-width columns when calculating the maximum guaranteed column, as this eliminates the need to emit code to fetch byref types in the deformation loop for guaranteed attributes. Some locations in the code deform tuples that have yet to go through NOT NULL constraint validation. We're unable to perform the guaranteed attribute optimization when that's the case. This optimization is opt-in via the TupleTableSlot using the TTS_FLAG_OBEYS_NOT_NULL_CONSTRAINTS flag. 
This commit also adds a more efficient way of populating the isnull array by using a bit-wise SWAR trick which performs multiplication on the inverse of the tuple's bitmap byte and masking out all but the lower bit of each of the boolean's byte. This results in much more optimal code when compared to determining the NULLness via att_isnull(). 8 isnull elements are processed at once using this method, which means we need to round the tts_isnull array size up to the next 8 bytes. The palloc code does this anyway, but the round-up needed to be formalized so as not to overwrite the sentinel byte in MEMORY_CONTEXT_CHECKING builds. Doing this also allows the NULL-checking deforming loop to more efficiently check the isnull array, rather than doing the bit-wise processing for each attribute that att_isnull() does. The level of performance improvement from these changes seems to vary depending on the CPU architecture. Apple's M chips seem particularly fond of the changes, with some of the tested deform-heavy queries going over twice as fast as before. With x86-64, the speedups aren't quite as large. With tables containing only a small number of columns, the speedups will be less. Author: David Rowley <dgrowleyml@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: John Naylor <johncnaylorls@gmail.com> Reviewed-by: Amit Langote <amitlangote09@gmail.com> Reviewed-by: Zsolt Parragi <zsolt.parragi@percona.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Junwang Zhao <zhjwpku@gmail.com> Discussion: https://postgr.es/m/CAApHDvpoFjaj3%2Bw_jD5uPnGazaw41A71tVJokLDJg2zfcigpMQ%40mail.gmail.com	2026-03-16 11:46:00 +13:00
David Rowley	503620311e	Add all required calls to TupleDescFinalize() As of this commit all TupleDescs must have TupleDescFinalize() called on them once the TupleDesc is set up and before BlessTupleDesc() is called. In this commit, TupleDescFinalize() does nothing. This change has only been separated out from the commit that properly implements this function to make the change more obvious. Any extension which makes its own TupleDesc will need to be modified to call the new function. The follow-up commit which properly implements TupleDescFinalize() will cause any code which forgets to do this to fail in assert-enabled builds in BlessTupleDesc(). It may still be worth mentioning this change in the release notes so that extension authors update their code. Author: David Rowley <dgrowleyml@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: John Naylor <johncnaylorls@gmail.com> Reviewed-by: Amit Langote <amitlangote09@gmail.com> Reviewed-by: Zsolt Parragi <zsolt.parragi@percona.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Junwang Zhao <zhjwpku@gmail.com> Discussion: https://postgr.es/m/CAApHDvpoFjaj3%2Bw_jD5uPnGazaw41A71tVJokLDJg2zfcigpMQ%40mail.gmail.com	2026-03-16 11:45:49 +13:00
Tom Lane	e5a77d876d	Save a few bytes per CatCTup. CatalogCacheCreateEntry() computed the space needed for a CatCTup as sizeof(CatCTup) + MAXIMUM_ALIGNOF. That's not our usual style, and it wastes memory by allocating more padding than necessary. On 64-bit machines sizeof(CatCTup) would be maxaligned already since it contains pointer fields, therefore this code is wasting 8 bytes compared to the more usual MAXALIGN(sizeof(CatCTup)). While at it, we don't really need to do MemoryContextSwitchTo() when we're only allocating one block. Author: ChangAo Chen <cca5507@qq.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/tencent_A42E0544C6184FE940CD8E3B14A3F0A39605@qq.com	2026-03-15 18:05:38 -04:00
Tom Lane	bb53b8d359	Fix small memory leak in get_dbname_oid_list_from_mfile(). Coverity complained that this function leaked the dumpdirpath string, which it did. But we don't need to make a copy at all, because there's not really any point in trimming trailing slashes from the directory name here. If that were needed, the initial file_exists_in_directory() test would have failed, since it doesn't bother with that (and neither does anyplace else in this file). Moreover, if we did want that, reimplementing canonicalize_path() poorly is not the way to proceed. Arguably, all of this code should be reexamined with an eye to using src/port/path.c's facilities, but for today I'll settle for getting rid of the memory leak.	2026-03-15 15:24:04 -04:00
Andrew Dunstan	a793677e57	pg_restore: Remove dead code in restore_all_databases() Cleanup from commit `763aaa06f0`. Author: Mahendra Singh Thalor <mahi6run@gmail.com> Discussion: https://postgr.es/m/CAKYtNAqN49Hqd4v0wWH3uW6d6QsH+8e8bR_MVf4CboTZSzd+Aw@mail.gmail.com	2026-03-15 12:13:19 -04:00
Melanie Plageman	99bf1f8aa6	Save vmbuffer in heap-specific scan descriptors for on-access pruning Future commits will use the visibility map in on-access pruning to fix VM corruption and set the VM if the page is all-visible. Saving the vmbuffer in the scan descriptor reduces the number of times it would need to be pinned and unpinned, making the overhead of doing so negligible. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/C3AB3F5B-626E-4AAA-9529-23E9A20C727F%40gmail.com	2026-03-15 11:09:10 -04:00
Melanie Plageman	8d2c1df4f4	Avoid BufferGetPage() calls in heap_update() BufferGetPage() isn't cheap and heap_update() calls it multiple times when it could just save the page from a single call. Do that. While we are at it, make separate variables for old and new page in heap_xlog_update(). It's confusing to reuse "page" for both pages. Author: Melanie Plageman <melanieplageman@gmail.com> Discussion: https://postgr.es/m/CAAKRu_a%2BhO4PCptyaPR7AMZd7FjcHfOFKKJT8ouU3KedMud0tQ%40mail.gmail.com	2026-03-15 10:42:34 -04:00
Melanie Plageman	a3511443e5	Initialize missing fields in CreateExecutorState() `d47cbf474e` and `cbc127917e` forgot to initialize a few fields they introduced in the EState, so do that now. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/F5CDD1B5-628C-44A1-9F85-3958C626F6A9%40gmail.com	2026-03-15 10:13:14 -04:00
Peter Eisentraut	cd083b54bd	Make typeof and typeof_unqual fallback definitions work on C++11 These macros were unintentionally using C++14 features. This replaces them with valid C++11 code. Tested locally by compiling with -std=c++11 (which reproduced the original issue). Author: Jelte Fennema-Nio <postgres@jeltef.nl> Discussion: https://www.postgresql.org/message-id/flat/92f9750f-c7f6-42d8-9a4a-85a3cbe808f3%40eisentraut.org	2026-03-15 07:36:27 +01:00
Tom Lane	0123ce131f	Switch the semaphore API on Solaris to unnamed POSIX. Solaris descendants (Illumos, OpenIndiana, OmniOS, etc.) hit System V semaphore limits ("No space left on device" from semget) when running many parallel test scripts under default system settings. We could tell people to raise those settings, but there's a better answer. Unnamed POSIX semaphores have been available on Solaris for decades and work well, so prefer them, as was recently done for AIX. This patch also updates the documentation to remove now-unnecessary advice about raising project.max-sem-ids and project.max-msg-ids. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Greg Burd <greg@burd.me> Discussion: https://postgr.es/m/470305.1772417108@sss.pgh.pa.us	2026-03-14 14:10:32 -04:00
Tom Lane	2eb87345e1	Fix aclitemout() to work during early bootstrap. "initdb -d" has been broken since commit `f95d73ed4`, because I changed aclitemin to work in bootstrap mode but failed to consider aclitemout. That routine isn't reached by default, but it is if the elog message level is high enough, so it needs to work without catalog access too. This patch just makes it use its existing code paths to print role OIDs numerically. We could alternatively invent an inverse of boot_get_role_oid() and print them symbolically, but that would take more code and it's not apparent that it'd be any better for debugging purposes. Reported-by: Greg Burd <greg@burd.me> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/4416.1773328045@sss.pgh.pa.us	2026-03-14 13:46:54 -04:00
Tomas Vondra	02eecead86	Tighten asserts on ParallelWorkerNumber The comment about ParallelWorkerNumber in parallel.c says: In parallel workers, it will be set to a value >= 0 and < the number of workers before any user code is invoked; each parallel worker will get a different parallel worker number. However, asserts in various places collecting instrumentation allowed (ParallelWorkerNumber == num_workers). That would be a bug, as the value is used as an index into an array with num_workers entries. Fixed by adjusting the asserts accordingly. Backpatch to all supported versions. Discussion: https://postgr.es/m/5db067a1-2cdf-4afb-a577-a04f30b69167@vondra.me Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Backpatch-through: 14	2026-03-14 15:26:39 +01:00
Michael Paquier	ae58189a4d	pgstattuple: Optimize pgstattuple_approx() with streaming read This commit plugs the streaming read APIs into pgstattuple_approx(), the SQL function that is faster than pgstattuple() and returns approximate results. A callback is used to be able to skip all-visible pages via VM lookup, to match the logic prior to this commit. Under test conditions similar to `6c228755ad` (some dm_delay and debug_io_direct=data), this can substantially improve the execution time of the function, particularly for large relations. Author: Xuneng Zhou <xunengzhou@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com> Discussion: https://postgr.es/m/CABPTF7VrqfbcDXqGrdLQ2xaQ=K0RzExNuw6U_GGqzSJu32wfdQ@mail.gmail.com	2026-03-14 15:06:13 +09:00
David Rowley	4deecb52af	Allow sibling call optimization in slot_getsomeattrs_int() This changes the TupleTableSlotOps contract to make it so the getsomeattrs() function is in charge of calling slot_getmissingattrs(). Since this removes all code from slot_getsomeattrs_int() aside from the getsomeattrs() call itself, we may as well adjust slot_getsomeattrs() so that it calls getsomeattrs() directly. We leave slot_getsomeattrs_int() intact as this is still called from the JIT code. Author: David Rowley <dgrowleyml@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Zsolt Parragi <zsolt.parragi@percona.com> Discussion: https://postgr.es/m/CAApHDvodSVBj3ypOYbYUCJX%2BNWL%3DVZs63RNBQ_FxB_F%2B6QXF-A%40mail.gmail.com	2026-03-14 13:52:09 +13:00
Peter Geoghegan	8a879119a1	Use fake LSNs to improve nbtree dropPin behavior. Use fake LSNs in all nbtree critical sections that write a WAL record. That way we can safely apply the _bt_killitems LSN trick with logged and unlogged indexes alike. This brings the same benefits to plain scans of unlogged relations that commit `2ed5b87f` brought to plain scans of logged relations: scans will drop their leaf page pin eagerly (by applying the "dropPin" optimization), which avoids blocking progress by VACUUM. This is particularly helpful with applications that allow a scrollable cursor to remain idle for long periods. Preparation for an upcoming commit that will add the amgetbatch interface, and switch nbtree over to it (from amgettuple) to enable I/O prefetching. The index prefetching read stream's effective prefetch distance is adversely affected by any buffer pins held by the index AM. At the same time, it can be useful for prefetching to read dozens of leaf pages ahead of the scan to maintain an adequate prefetch distance. The index prefetching patch avoids this tension by always eagerly dropping index page pins of the kind traditionally held as an interlock against unsafe concurrent TID recycling by VACUUM (essentially the same way that amgetbitmap routines have always avoided holding onto pins). The work from this commit makes that possible during scans of nbtree unlogged indexes -- without our having to give up on setting LP_DEAD bits on index tuples altogether. Follow-up to commit `d774072f`, which moved the fake LSN infrastructure out of GiST so that it could be used by other index AMs. Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: Andres Freund <andres@anarazel.de> Reviewed-By: Tomas Vondra <tomas@vondra.me> Discussion: https://postgr.es/m/CAH2-WzkehuhxyuA8quc7rRN3EtNXpiKsjPfO8mhb+0Dr2K0Dtg@mail.gmail.com	2026-03-13 20:37:39 -04:00
Peter Geoghegan	d774072f00	Move fake LSN infrastructure out of GiST. Move utility functions used by GiST to generate fake LSNs into xlog.c and xloginsert.c, so that other index AMs can also generate fake LSNs. Preparation for an upcoming commit that will add support for fake LSNs to nbtree, allowing its dropPin optimization to be used during scans of unlogged relations. That commit is itself preparation for another upcoming commit that will add a new amgetbatch/btgetbatch interface to enable I/O prefetching. Bump XLOG_PAGE_MAGIC due to XLOG_GIST_ASSIGN_LSN becoming XLOG_ASSIGN_LSN. Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: Andres Freund <andres@anarazel.de> Reviewed-By: Tomas Vondra <tomas@vondra.me> Discussion: https://postgr.es/m/CAH2-WzkehuhxyuA8quc7rRN3EtNXpiKsjPfO8mhb+0Dr2K0Dtg@mail.gmail.com	2026-03-13 19:38:17 -04:00
Jeff Davis	9b860373da	Add error code to user-visible message. Reported-by: Alexander Lakhin <exclusion@gmail.com>	2026-03-13 16:07:54 -07:00
Tomas Vondra	b1f14c9672	Use GetXLogInsertEndRecPtr in gistGetFakeLSN The function used GetXLogInsertRecPtr() to generate the fake LSN. Most of the time this is the same as what XLogInsert() would return, and so it works fine with the XLogFlush() call. But if the last record ends at a page boundary, GetXLogInsertRecPtr() returns an LSN pointing after the page header. In such a case XLogFlush() fails with errors like this: ERROR: xlog flush request 0/01BD2018 is not satisfied --- flushed only to 0/01BD2000 Such failures are very hard to trigger, particularly outside aggressive test scenarios. Fixed by introducing GetXLogInsertEndRecPtr(), returning the correct LSN without skipping the header. This is the same as GetXLogInsertRecPtr(), except that it calls XLogBytePosToEndRecPtr(). Initial investigation by me, root cause identified by Andres Freund. This is a long-standing bug in gistGetFakeLSN(), probably introduced by `c6b92041d3` in PG13. Backpatch to all supported versions. Reported-by: Peter Geoghegan <pg@bowt.ie> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Noah Misch <noah@leadboat.com> Discussion: https://postgr.es/m/vf4hbwrotvhbgcnknrqmfbqlu75oyjkmausvy66ic7x7vuhafx@e4rvwavtjswo Backpatch-through: 14	2026-03-13 23:25:24 +01:00
Heikki Linnakangas	311a851436	Free memory allocated for unrecognized_protocol_options Since `4966bd3ed9` Valgrind started to warn about a small amount of memory being leaked in ProcessStartupPacket(). This is not critical but the warnings may distract from real issues. Fix it by freeing the list after use. Author: Aleksander Alekseev <aleksander@tigerdata.com> Discussion: https://www.postgresql.org/message-id/CAJ7c6TN3Hbb5p=UHx0SPVN+h_JwPAV6rxoqOm7gHBMFKfnGK-Q@mail.gmail.com	2026-03-13 23:37:19 +02:00
Nathan Bossart	233bbdf031	Add convenience view to stats import test. Presently, many statements in stats_import.sql select all columns from the pg_stats system view. A proposed follow-up commit would add columns to this view (some of which are not stable across test runs), breaking all of these tests. This commit introduces a convenience view for those statements so that future changes are minimally disruptive. Author: Corey Huinker <corey.huinker@gmail.com> Reviewed-by: Sami Imseih <samimseih@gmail.com> Discussion: https://postgr.es/m/CADkLM%3DcoCVy92QkVUUTLdo5eO2bMDtwMrzRn_8miAhX%2BuPaqXg%40mail.gmail.com	2026-03-13 15:04:10 -05:00
Andres Freund	ce5d489166	Fix bug due to confusion about what IsMVCCSnapshot means In `0b96e734c5` I (Andres) relied on page_collect_tuples() being called only with an MVCC snapshot, and added assertions to that end, but did not realize that IsMVCCSnapshot() allows both proper MVCC snapshots and historical snapshots, which behave quite similarly to MVCC snapshots. Unfortunately that can lead to incorrect visibility results during logical decoding, as a historical snapshot is interpreted as a plain MVCC snapshot. The only reason this wasn't noticed earlier is that it's hard to reach as most of the time there are no sequential scans during logical decoding. To fix the bug and avoid issues like this in the future, split IsMVCCSnapshot() into IsMVCCSnapshot() and IsMVCCLikeSnapshot(), where now only the latter includes historic snapshots. One effect of this is that during logical decoding no page-at-a-time snapshots are used, as otherwise runtime branches to handle historic snapshots would be needed in some performance critical paths. Given how uncommon sequential scans are during logical decoding, that seems acceptable. Author: Antonin Houska <ah@cybertec.at> Reported-by: Antonin Houska <ah@cybertec.at> Discussion: https://postgr.es/m/61812.1770637345@localhost	2026-03-13 13:53:19 -04:00
Jacob Champion	b634c4e0e8	libpq-oauth: Fix Makefile dependencies As part of `6225403f2`, I'd removed the override for the `stlib` target, since NAME no longer contains a major version number. But I forgot that its dependencies are declared before Makefile.shlib is included; those dependencies were then omitted entirely. Per buildfarm member indri, which appears to be the only system so far that's bothered by an empty archive.	2026-03-13 10:34:03 -07:00
Nathan Bossart	1c33a2d81d	Add commit `b6eb8dde6b` to .git-blame-ignore-revs.	2026-03-13 11:45:34 -05:00
Jacob Champion	dba3560448	libpq-oauth: Never link against libpq's encoding functions Now that libpq-oauth doesn't have to match the major version of libpq, some things in pg_wchar.h are technically unsafe for us to use. (See `b6c7cfac8` for a fuller discussion.) This is unlikely to be a problem -- we only care about UTF-8 in the context of OAuth right now -- but if anyone did introduce a way to hit it, it'd be extremely difficult to debug or reproduce, and it'd be a potential security vulnerability to boot. Define USE_PRIVATE_ENCODING_FUNCS so that anyone who tries to add a dependency on the exported APIs will simply fail to link the shared module. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Zsolt Parragi <zsolt.parragi@percona.com> Discussion: https://postgr.es/m/CAOYmi%2BmrGg%2Bn_X2MOLgeWcj3v_M00gR8uz_D7mM8z%3DdX1JYVbg%40mail.gmail.com	2026-03-13 09:38:04 -07:00
Jacob Champion	6225403f27	libpq-oauth: Use the PGoauthBearerRequestV2 API Switch the private libpq-oauth ABI to a public one, based on the new PGoauthBearerRequestV2 API. A huge amount of glue code can be removed as part of this, and several code paths can be deduplicated. Additionally, the shared library no longer needs to change its name for every major release; it's now just "libpq-oauth.so". Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Zsolt Parragi <zsolt.parragi@percona.com> Discussion: https://postgr.es/m/CAOYmi%2BmrGg%2Bn_X2MOLgeWcj3v_M00gR8uz_D7mM8z%3DdX1JYVbg%40mail.gmail.com	2026-03-13 09:37:59 -07:00
Nathan Bossart	be43c48c22	Initialize variable to placate compiler. Since commit `5883ff30b0`, some compilers have been warning that the rtekind variable in unique_nonjoin_rtekind() may be used uninitialized. There doesn't appear to be any actual risk, so let's just initialize it to something to silence the compiler warnings. Author: Sami Imseih <samimseih@gmail.com> Discussion: https://postgr.es/m/CAA5RZ0sieVNfniCKMDdDjuXGd1OuzMQfTS5%3D9vX3sa-iiujKUA%40mail.gmail.com	2026-03-13 11:32:14 -05:00
Nathan Bossart	e0a3a3fd53	Optimize COPY FROM (FORMAT {text,csv}) using SIMD. Presently, such commands scan the input buffer one byte at a time looking for special characters. This commit adds a new path that uses SIMD instructions to skip over chunks of data without any special characters. This can be much faster. To avoid regressions, SIMD processing is disabled for the remainder of the COPY FROM command as soon as we encounter a short line or a special character (except for end-of-line characters, else we'd always disable it after the first line). This is perhaps too conservative, but it could probably be made more lenient in the future via fine-tuned heuristics. Author: Nazir Bilal Yavuz <byavuz81@gmail.com> Co-authored-by: Shinya Kato <shinya11.kato@gmail.com> Reviewed-by: Ayoub Kazar <ma_kazar@esi.dz> Reviewed-by: Andrew Dunstan <andrew@dunslane.net> Reviewed-by: Neil Conway <neil.conway@gmail.com> Reviewed-by: Greg Burd <greg@burd.me> Tested-by: Manni Wood <manni.wood@enterprisedb.com> Tested-by: Mark Wong <markwkm@gmail.com> Discussion: https://postgr.es/m/CAOzEurSW8cNr6TPKsjrstnPfhf4QyQqB4tnPXGGe8N4e_v7Jig%40mail.gmail.com	2026-03-13 11:07:32 -05:00
Peter Eisentraut	8c2b30487c	Factor out constructSetOpTargetlist() from transformSetOperationTree() This would be used separately by a future patch. It also makes transformSetOperationTree() a little smaller. Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/a855795d-e697-4fa5-8698-d20122126567@eisentraut.org	2026-03-13 16:16:40 +01:00
Heikki Linnakangas	f9de9bf302	Add callback for I/O error messages in SLRUs Historically, all SLRUs were addressed by transaction IDs, but that hasn't been true for a long time. However, the error message on I/O error still always talked about accessing a transaction ID. This commit adds a callback that allows subsystems to construct their own error messages, which can then correctly refer to a transaction ID, multixid or whatever else is used to address the particular SLRU. Author: Maxim Orlov <orlovmg@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://www.postgresql.org/message-id/CACG=ezZZfurhYV+66ceubxQAyWqv9vaUi0yoO4-t48OE5xc0DQ@mail.gmail.com	2026-03-13 16:21:06 +02:00
Fujii Masao	723619eaa3	Add stats_reset column to pg_stat_database_conflicts. This commit adds a stats_reset column to pg_stat_database_conflicts, allowing users to see when the statistics in this view were last reset. This makes the view consistent with pg_stat_database and other statistics views. Catalog version bumped. Author: Shihao Zhong <zhong950419@gmail.com> Reviewed-by: Sami Imseih <samimseih@gmail.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/CAGRkXqS98OebEWjax99_LVAECsxCB8i=BfsdAL34i-5QHfwyOQ@mail.gmail.com	2026-03-13 22:17:14 +09:00
Heikki Linnakangas	2e1dcf8c54	Check for interrupts during non-fast-update GIN insertion ginExtractEntries() can produce a lot of entries for a single item. During index build, we check for interrupts between entries, and the fast-update codepath does it as part of vacuum_delay_point(), but the non-fast update insertion codepath was uninterruptible. Add CHECK_FOR_INTERRUPTS() between entries in the non-fast update codepath too. Author: Vinod Sridharan <vsridh90@gmail.com> Discussion: https://www.postgresql.org/message-id/CAFMdLD6mQvAuStiOGvBJxAEfo6wdjZhj3+JveTLxOX8MVn4zmA@mail.gmail.com	2026-03-13 15:12:32 +02:00
Alexander Korotkov	fa6f2f624c	Rework ginScanToDelete() to pass Buffers instead of BlockNumbers. Previously, ginScanToDelete() and ginDeletePage() passed BlockNumbers and re-read pages that were already pinned and locked during the tree walk. The caller (ginVacuumPostingTree()) held a cleanup-locked root buffer, yet ginScanToDelete() re-read it by block number with special-case code to skip re-locking. First, this commit gives both functions more appropriate names, ginScanPostingTreeToDelete() and ginDeletePostingPage(), indicating they deal with posting trees/pages. This is more descriptive and similar to the way we name other GIN functions, for instance, ginVacuumPostingTree() and ginVacuumPostingTreeLeaves(). Then rework both functions to pass Buffers directly. DataPageDeleteStack now carries buffer, myoff (downlink offset in parent), and isRoot per level, so ginScanPostingTreeToDelete() takes only GinVacuumState and DataPageDeleteStack pointers. Also, ginDeletePostingPage() receives the three Buffers directly, and no longer reads or releases them itself. The caller reads and locks child pages before recursing, and manages buffer lifecycle afterward. This eliminates the confusing isRoot special cases in buffer management, including the apparent (but unreachable) double release of the root buffer identified by Andres Freund. Add comments explaining the locking protocol and the DataPageDeleteStack structure. Reported-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/utrlxij43fbguzw4kldte2spc4btoldizutcqyrfakqnbrp3ir@ph3sphpj4asz Reviewed-by: Pavel Borisov <pashkin.elfe@gmail.com> Reviewed-by: Xuneng Zhou <xunengzhou@gmail.com> Reviewed-by: Jinbinge <jinbinge@126.com>	2026-03-13 13:50:13 +02:00
Heikki Linnakangas	f30cebb954	Fix pointer type of ShmemAllocatorData->index This went unnoticed in commit `e2362eb2bd` because the pointer is cast to/from a void pointer.	2026-03-13 11:00:15 +02:00
Michael Paquier	7d64419f80	xml2: Fix failure with xslt_process() under -fsanitize=undefined The logic of xslt_process() has never considered the fact that xsltSaveResultToString() would return NULL for an empty string (the upstream code has always done so, with a string length of 0). This would cause memcpy() to be called with a NULL pointer, something forbidden by POSIX. Like `46ab07ffda` and similar fixes, this is backpatched down to all the supported branches, with a test case to cover this scenario. An empty string has always been returned in xml2 in this case, based on the history of the module, so this is an old issue. Reported-by: Alexander Lakhin <exclusion@gmail.com> Discussion: https://postgr.es/m/c516a0d9-4406-47e3-9087-5ca5176ebcf9@gmail.com Backpatch-through: 14	2026-03-13 16:06:28 +09:00
Peter Eisentraut	59292f7aac	Change copyObject() to use typeof_unqual Currently, when the argument of copyObject() is const-qualified, the return type is also, because the use of typeof carries over all the qualifiers. This is incorrect, since the point of copyObject() is to make a copy to mutate. But apparently no code ran into it. The new implementation uses typeof_unqual, which drops the qualifiers, making this work correctly. typeof_unqual is standardized in C23, but all recent versions of all the usual compilers support it even in non-C23 mode, at least as __typeof_unqual__. We add a configure/meson test for typeof_unqual and __typeof_unqual__ and use it if it's available, else we use the existing fallback of just returning void *. This is the second attempt, after the first attempt in commit `4cfce4e62c` was reverted. The following two points address problems with the earlier version: We test the underscore variant first so that there is a higher chance that clang used for bitcode also supports it, since we don't test that separately. Unlike the typeof test, the typeof_unqual test also tests with a void pointer similar to how copyObject() would use it, because that is not handled by MSVC, so we want the test to fail there. Reviewed-by: David Geier <geidav.pg@gmail.com> Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl> Discussion: https://www.postgresql.org/message-id/flat/92f9750f-c7f6-42d8-9a4a-85a3cbe808f3%40eisentraut.org	2026-03-13 07:06:57 +01:00
Michael Paquier	213f0079b3	pgstattuple: Optimize btree and hash index functions with streaming read This commit replaces the synchronous ReadBufferExtended() loops with the streaming read routines, affecting pgstatindex() (for btree) and pgstathashindex() (for hash indexes). Under test conditions similar to `6c228755ad` (some dm_delay and debug_io_direct=data), this can result in nice runtime and IO gains. Author: Xuneng Zhou <xunengzhou@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com> Discussion: https://postgr.es/m/CABPTF7VrqfbcDXqGrdLQ2xaQ=K0RzExNuw6U_GGqzSJu32wfdQ@mail.gmail.com	2026-03-13 10:48:45 +09:00
Andrew Dunstan	a0b6ef29a5	Enable fast default for domains with non-volatile constraints Previously, ALTER TABLE ADD COLUMN always forced a table rewrite when the column type was a domain with constraints (CHECK or NOT NULL), even if the default value satisfied those constraints. This was because contain_volatile_functions() considers CoerceToDomain immutable, so the code conservatively assumed any constrained domain might fail. Improve this by using soft error handling (ErrorSaveContext) to evaluate the CoerceToDomain expression at ALTER TABLE time. If the default value passes the domain's constraints, the value is stored as a "missing" attribute default and no table rewrite is needed. If the constraint check fails, we fall back to a table rewrite, preserving the historical behavior that constraint violations are only raised when the table actually contains rows. Domains with volatile constraint expressions always require a table rewrite since the constraint result could differ per evaluation and cannot be cached. Author: Jian He <jian.universality@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Andrew Dunstan <andrew@dunslane.net> Reviewed-by: Viktor Holmberg <viktor.holmberg@aiven.io> Discussion: https://postgr.es/m/CACJufxE_+iZBR1i49k_AHigppPwLTJi6km8NOsC7FWvKdEmmXg@mail.gmail.com	2026-03-12 18:05:01 -04:00
Andrew Dunstan	487cf2cbd2	Extend DomainHasConstraints() to optionally check constraint volatility Add an optional bool *has_volatile output parameter to DomainHasConstraints(). When non-NULL, the function checks whether any CHECK constraint contains a volatile expression. Callers that don't need this information pass NULL and get the same behavior as before. This is needed by a subsequent commit that enables the fast default optimization for domains with non-volatile constraints: we can safely evaluate such constraints once at ALTER TABLE time, but volatile constraints require a full table rewrite. Author: Jian He <jian.universality@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Andrew Dunstan <andrew@dunslane.net> Reviewed-by: Viktor Holmberg <viktor.holmberg@aiven.io> Discussion: https://postgr.es/m/CACJufxE_+iZBR1i49k_AHigppPwLTJi6km8NOsC7FWvKdEmmXg@mail.gmail.com	2026-03-12 18:04:16 -04:00
Álvaro Herrera	a630ac5c20	Document the 'command' column of pg_stat_progress_repack Commit `ac58465e06` added and documented a new progress-report view for REPACK, but neglected to list the 'command' column in the docs. This is my (Álvaro's) fail, as I added the column in v23 of the patch and forgot to document it. In passing, add a note in the docs for pg_stat_progress_cluster that it might contain rows for sessions running REPACK, though mapping the command name to one of the older commands; and that it is for backwards-compatibility only. (Maybe we should just remove this older view.) Author: Noriyoshi Shinoda <noriyoshi.shinoda@hpe.com> Discussion: https://postgr.es/m/LV8PR84MB37870F0F35EF2E8CB99768CBEE47A@LV8PR84MB3787.NAMPRD84.PROD.OUTLOOK.COM Discussion: https://postgr.es/m/202510101352.vvp4p3p2dblu@alvherre.pgsql	2026-03-12 19:19:23 +01:00
Peter Geoghegan	a367c433ad	Use simplehash for backend-private buffer pin refcounts. Replace dynahash with simplehash for the per-backend PrivateRefCountHash overflow table. Simplehash generates inlined, open-addressed lookup code, avoiding the per-call overhead of dynahash that becomes noticeable when many buffers are pinned with a CPU-bound workload. Motivated by testing of the index prefetching patch, which pins many more buffers concurrently than typical index scans. Author: Peter Geoghegan <pg@bowt.ie> Suggested-by: Andres Freund <andres@anarazel.de> Reviewed-By: Tomas Vondra <tomas@vondra.me> Reviewed-By: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/CAH2-Wz=g=JTSyDB4UtB5su2ZcvsS7VbP+ZMvvaG6ABoCb+s8Lw@mail.gmail.com	2026-03-12 13:26:16 -04:00
Peter Geoghegan	d071e1cfec	nbtree: Avoid allocating _bt_search stack. Avoid allocating memory for an nbtree descent stack during index scans. We only require a descent stack during inserts, when it is used to determine where to insert a new pivot tuple/downlink into the target leaf page's parent page in the event of a page split. (Page deletion's first phase also performs a _bt_search that requires a descent stack.) This optimization improves performance by minimizing palloc churn. It speeds up index scans that call _bt_search frequently/descend the index many times, especially when the cost of scanning the index dominates (e.g., with index-only skip scans). Testing has shown that the underlying issue causes performance problems for an upcoming patch that will replace btgettuple with a new btgetbatch interface to enable I/O prefetching. Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: Tomas Vondra <tomas@vondra.me> Discussion: https://postgr.es/m/CAH2-Wzmy7NMba9k8m_VZ-XNDZJEUQBU8TeLEeL960-rAKb-+tQ@mail.gmail.com	2026-03-12 13:22:36 -04:00
Robert Haas	5883ff30b0	Add pg_plan_advice contrib module. Provide a facility that (1) can be used to stabilize certain plan choices so that the planner cannot reverse course without authorization and (2) can be used by knowledgeable users to insist on plan choices contrary to what the planner believes best. In both cases, terrible outcomes are possible: users should think twice and perhaps three times before constraining the planner's ability to do as it thinks best; nevertheless, there are problems that are much more easily solved with these facilities than without them. This patch takes the approach of analyzing a finished plan to produce textual output, which we call "plan advice", that describes key decisions made during planning; if that plan advice is provided during future planning cycles, it will force those key decisions to be made in the same way. Not all planner decisions can be controlled using advice; for example, decisions about how to perform aggregation are currently out of scope, as is choice of sort order. Plan advice can also be edited by the user, or even written from scratch in simple cases, making it possible to generate outcomes that the planner would not have produced. Partial advice can be provided to control some planner outcomes but not others. Currently, plan advice is focused only on specific outcomes, such as the choice to use a sequential scan for a particular relation, and not on estimates that might contribute to those outcomes, such as a possibly-incorrect selectivity estimate. While it would be useful to users to be able to provide plan advice that affects selectivity estimates or other aspects of costing, that is out of scope for this commit. Reviewed-by: Lukas Fittl <lukas@fittl.com> Reviewed-by: Jakub Wartak <jakub.wartak@enterprisedb.com> Reviewed-by: Greg Burd <greg@burd.me> Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com> Reviewed-by: Haibo Yan <tristan.yim@gmail.com> Reviewed-by: Dian Fay <di@nmfay.com> Reviewed-by: Ajay Pal <ajay.pal.k@gmail.com> Reviewed-by: John Naylor <johncnaylorls@gmail.com> Reviewed-by: Alexandra Wang <alexandra.wang.oss@gmail.com> Discussion: http://postgr.es/m/CA+TgmoZ-Jh1T6QyWoCODMVQdhTUPYkaZjWztzP1En4=ZHoKPzw@mail.gmail.com	2026-03-12 13:00:43 -04:00
Michael Paquier	02976b0a17	doc: Document variables for path substitution in SQL tests Test suites driven by pg_regress can use the following environment variables for path substitutions since `d1029bb5a2`: - PG_ABS_SRCDIR - PG_ABS_BUILDDIR - PG_DLSUFFIX - PG_LIBDIR These variables have never been documented, and they can be useful for out-of-core code based on the options used by the pg_regress command invoked by installcheck (or equivalent) to build paths to libraries for various commands, like LOAD or CREATE FUNCTION. Reviewed-by: Zhang Hu <kongbaik228@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/abDAWzHaesHLDFke@paquier.xyz	2026-03-12 16:41:41 +09:00
Michael Paquier	d841ca2d14	bloom: Optimize VACUUM and bulk-deletion with streaming read This commit replaces the synchronous ReadBufferExtended() loops done in blbulkdelete() and blvacuumcleanup() with the streaming read equivalent, to improve I/O efficiency during bloom index vacuum cleanup operations. Under the same test conditions as `6c228755ad`, the runtime improves by around 30%, with most of the benefit coming from a large reduction in IO operations, based on the stats retrieved in the scenarios run. Author: Xuneng Zhou <xunengzhou@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com> Discussion: https://postgr.es/m/CABPTF7VrqfbcDXqGrdLQ2xaQ=K0RzExNuw6U_GGqzSJu32wfdQ@mail.gmail.com	2026-03-12 12:00:22 +09:00
Michael Paquier	6c228755ad	Use streaming read for VACUUM cleanup of GIN This commit replaces the synchronous ReadBufferExtended() loop done in ginvacuumcleanup() with the streaming read equivalent, to improve I/O efficiency during GIN index vacuum cleanup operations. With dm_delay to emulate some latency and debug_io_direct=data to force synchronous writes and force the read path to be exercised, the author has noticed a 5x improvement in runtime, with a substantial reduction in IO stats numbers. I have reproduced similar numbers while running similar tests, with improvements becoming better with more tuples and more pages manipulated. Author: Xuneng Zhou <xunengzhou@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com> Discussion: https://postgr.es/m/CABPTF7VrqfbcDXqGrdLQ2xaQ=K0RzExNuw6U_GGqzSJu32wfdQ@mail.gmail.com	2026-03-12 11:48:31 +09:00
Richard Guo	383eb21ebf	Convert NOT IN sublinks to anti-joins when safe The planner has historically been unable to convert "x NOT IN (SELECT y ...)" sublinks into anti-joins. This is because standard SQL semantics for NOT IN require that if the comparison "x = y" returns NULL, the "NOT IN" expression evaluates to NULL (effectively false), causing the row to be discarded. In contrast, an anti-join preserves the row if no match is found. Due to this semantic mismatch regarding NULL handling, the conversion was previously considered unsafe. However, if we can prove that neither side of the comparison can yield NULL values, and further that the operator itself cannot return NULL for non-null inputs, the behavior of NOT IN and anti-join becomes identical. Enabling this conversion allows the planner to treat the sublink as a first-class relation rather than an opaque SubPlan filter. This unlocks global join ordering optimization and permits the selection of the most efficient join algorithm based on cost, often yielding significant performance improvements for large datasets. This patch verifies that neither side of the comparison can be NULL and that the operator is safe regarding NULL results before performing the conversion. To verify operator safety, we require that the operator be a member of a B-tree or Hash operator family. This serves as a proxy for standard boolean behavior, ensuring the operator does not return NULL on valid non-null inputs, as doing so would break index integrity. For operand non-nullability, this patch makes use of several existing mechanisms. It leverages the outer-join-aware-Var infrastructure to verify that a Var does not come from the nullable side of an outer join, and consults the NOT-NULL-attnums hash table to efficiently verify schema-level NOT NULL constraints. Additionally, it employs find_nonnullable_vars to identify Vars forced non-nullable by qual clauses, and expr_is_nonnullable to deduce non-nullability for other expression types. The logic for verifying the non-nullability of the subquery outputs was adapted from prior work by David Rowley and Tom Lane. Author: Richard Guo <guofenglinux@gmail.com> Reviewed-by: wenhui qiu <qiuwenhuifx@gmail.com> Reviewed-by: Zhang Mingli <zmlpostgres@gmail.com> Reviewed-by: Japin Li <japinli@hotmail.com> Discussion: https://postgr.es/m/CAMbWs495eF=-fSa5CwJS6B-BaEi3ARp0UNb4Lt3EkgUGZJwkAQ@mail.gmail.com	2026-03-12 09:45:18 +09:00
Andres Freund	6322a028fa	bufmgr: Fix use of wrong variable in GetPrivateRefCountEntrySlow() Unfortunately, in `30df61990c`, I made GetPrivateRefCountEntrySlow() set a wrong cache hint when moving entries from the hash table to the faster array. There are no correctness concerns due to this, just an unnecessary loss of performance. Noticed while testing the index prefetching patch. Discussion: https://postgr.es/m/CAH2-Wz=g=JTSyDB4UtB5su2ZcvsS7VbP+ZMvvaG6ABoCb+s8Lw@mail.gmail.com	2026-03-11 17:52:21 -04:00
Jeff Davis	547c15f9f8	Fix use of volatile. Commit `8185bb5347` misused volatile. Fix it. See also `6307b096e2`. Reported-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/1bb21c7d-885f-4f07-a3ed-21b60d7c92c6@eisentraut.org	2026-03-11 14:27:58 -07:00
Andrew Dunstan	342051d73b	Add support for altering CHECK constraint enforceability This commit adds support for ALTER TABLE ALTER CONSTRAINT ... [NOT] ENFORCED for CHECK constraints. Previously, only foreign key constraints could have their enforceability altered. When changing from NOT ENFORCED to ENFORCED, the operation not only updates catalog information but also performs a full table scan in Phase 3 to validate that existing data satisfies the constraint. For partitioned tables and inheritance hierarchies, the operation recurses to all child tables. When changing to NOT ENFORCED, we must recurse even if the parent is already NOT ENFORCED, since child constraints may still be ENFORCED. Author: Jian He <jian.universality@gmail.com> Reviewed-by: Robert Treat <rob@xzilla.net> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Amul Sul <sulamul@gmail.com> Reviewed-by: Zsolt Parragi <zsolt.parragi@cybertec.at> Reviewed-by: Andrew Dunstan <andrew@dunslane.net> Discussion: https://postgr.es/m/CACJufxHCh_FU-FsEwsCvg9mN6-5tzR6H9ntn+0KUgTCaerDOmg@mail.gmail.com	2026-03-11 16:15:35 -04:00
Andrew Dunstan	a9747153e1	rename alter constraint enforceability related functions The functions AlterConstrEnforceabilityRecurse and ATExecAlterConstrEnforceability are being renamed to AlterFKConstrEnforceabilityRecurse and ATExecAlterFKConstrEnforceability, respectively. The current alter constraint functions only handle Foreign Key constraints. Renaming them to be more explicit about the constraint type is necessary; otherwise, it will cause confusion when we later introduce the ability to alter the enforceability of other constraints. Author: Jian He <jian.universality@gmail.com> Reviewed-by: Amul Sul <sulamul@gmail.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Robert Treat <rob@xzilla.net> Discussion: https://postgr.es/m/CACJufxHCh_FU-FsEwsCvg9mN6-5tzR6H9ntn+0KUgTCaerDOmg@mail.gmail.com	2026-03-11 16:14:58 -04:00
Andres Freund	a766125efd	bufmgr: Switch to standard order in MarkBufferDirtyHint() When we were updating hint bits with just a share lock MarkBufferDirtyHint() had to use a non-standard order of operations, i.e. WAL log the buffer before marking the buffer dirty. This was required because the lock level used to set hints did not conflict with the lock level that was used to flush pages, which would have allowed flushing the page out before the WAL record. The non-standard order in turn required preventing the checkpoint from starting between writing the WAL record and flushing out the page. Now that setting hints and writing out buffers use share-exclusive, we can revert back to the normal order of operations. Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Discussion: https://postgr.es/m/5ubipyssiju5twkb7zgqwdr7q2vhpkpmuelxfpanetlk6ofnop@hvxb4g2amb2d	2026-03-11 14:58:29 -04:00
Andres Freund	b0f4ff3c92	bufmgr: Remove the, now obsolete, BM_JUST_DIRTIED Due to the recent changes to use a share-exclusive mode for setting hint bits and for flushing pages - instead of using share mode as before - a buffer cannot be dirtied while the flush is ongoing. The reason we needed JUST_DIRTIED was to handle the case where the buffer was dirtied while IO was ongoing - which is not possible anymore. Discussion: https://postgr.es/m/5ubipyssiju5twkb7zgqwdr7q2vhpkpmuelxfpanetlk6ofnop@hvxb4g2amb2d	2026-03-11 14:58:29 -04:00
Melanie Plageman	11e0824bd9	Avoid WAL flush checks for unlogged buffers in GetVictimBuffer() GetVictimBuffer() rejects a victim buffer if it is from a bulkread strategy ring and reusing it would require flushing WAL. Unlogged table buffers can have fake LSNs (e.g. unlogged GiST pages) and calling XLogNeedsFlush() on a fake LSN is meaningless. This is a bit of future-proofing because currently the bulkread strategy is not used for relations with fake LSNs. Author: Melanie Plageman <melanieplageman@gmail.com> Reported-by: Andres Freund <andres@anarazel.de> Reviewed-by: Andres Freund <andres@anarazel.de> Earlier version reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/flat/fmkqmyeyy7bdpvcgkheb6yaqewemkik3ls6aaveyi5ibmvtxnd%40nu2kvy5rq3a6	2026-03-11 14:50:50 -04:00
Tomas Vondra	943e881733	Do not lock in BufferGetLSNAtomic() on archs with 8 byte atomic reads On platforms where we can read or write the whole LSN atomically, we do not need to lock the buffer header to prevent torn LSNs. We can do this only on platforms with PG_HAVE_8BYTE_SINGLE_COPY_ATOMICITY, and when the pd_lsn field is properly aligned. For historical reasons the PageXLogRecPtr was defined as a struct with two uint32 fields. This replaces it with a single uint64 value, to make the intent clearer. To prevent issues with weak typedefs the value is still wrapped in a struct. This also adjusts heapfuncs() in pageinspect, to ensure proper alignment when reading the LSN from a page on alignment-sensitive hardware. Idea by Andres Freund. Initial patch by Andreas Karlsson, improved by Peter Geoghegan. Minor tweaks by me. Author: Andreas Karlsson <andreas@proxel.se> Author: Peter Geoghegan <pg@bowt.ie> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Tomas Vondra <tomas@vondra.me> Discussion: https://postgr.es/m/b6610c3b-3f59-465a-bdbb-8e9259f0abc4@proxel.se	2026-03-11 19:46:08 +01:00
Tomas Vondra	b6eb8dde6b	Fix indentation from commit `29a0fb2157` Per buildfarm animal koel	2026-03-11 15:14:46 +01:00
Tomas Vondra	29a0fb2157	Conditional locking in pgaio_worker_submit_internal With io_method=worker, there's a single I/O submission queue. With enough workers, the backends and workers may end up spending a lot of time competing for the AioWorkerSubmissionQueueLock lock. This can happen with workloads that keep the queue full, in which case it's impossible to add requests to the queue. Increasing the number of I/O workers increases the pressure on the lock, worsening the issue. This change improves the situation in two ways: * If AioWorkerSubmissionQueueLock can't be acquired without waiting, the I/O is performed synchronously (as if the queue was full). * When an entry can't be added to a full queue, stop trying to add more entries. All remaining entries are handled as synchronous I/O. The regression was reported by Alexandre Felipe. Investigation and patch by me, based on an idea by Andres Freund. Reported-by: Alexandre Felipe <o.alexandre.felipe@gmail.com> Author: Tomas Vondra <tomas@vondra.me> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/CAE8JnxOn4+xUAnce+M7LfZWOqfrMMxasMaEmSKwiKbQtZr65uA@mail.gmail.com	2026-03-11 13:40:23 +01:00
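The two changes described in `29a0fb2157` (never wait for the queue lock, and stop queueing at the first entry that does not fit) can be illustrated with a minimal sketch. This is not the PostgreSQL code: `SubmitQueue`, `queue_init`, `submit_batch`, and `QUEUE_CAPACITY` are hypothetical names, and a C11 `atomic_flag` stands in for AioWorkerSubmissionQueueLock.

```c
#include <stdatomic.h>
#include <stddef.h>

#define QUEUE_CAPACITY 4        /* hypothetical queue size */

/* Hypothetical stand-in for a shared submission queue. */
typedef struct SubmitQueue
{
    atomic_flag busy;           /* set while someone holds the "lock" */
    int         items[QUEUE_CAPACITY];
    size_t      used;
} SubmitQueue;

void
queue_init(SubmitQueue *q)
{
    atomic_flag_clear(&q->busy);
    q->used = 0;
}

/*
 * Try to queue n requests without ever waiting.  Returns the number of
 * requests queued; *nsync receives how many the caller must perform
 * synchronously instead.
 */
size_t
submit_batch(SubmitQueue *q, const int *ios, size_t n, size_t *nsync)
{
    size_t      queued = 0;

    /* test_and_set returns the previous value: false means we got the lock */
    if (!atomic_flag_test_and_set(&q->busy))
    {
        /* stop at the first request that does not fit in the queue */
        while (queued < n && q->used < QUEUE_CAPACITY)
            q->items[q->used++] = ios[queued++];
        atomic_flag_clear(&q->busy);
    }

    /* everything not queued (lock busy or queue full) goes synchronous */
    *nsync = n - queued;
    return queued;
}
```

A caller would run the first `queued` requests through the workers and issue the remaining `*nsync` requests itself, which is the fallback the commit message describes for both the busy-lock and the full-queue case.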
Peter Eisentraut	9c05f152b5	Fixes for C++ typeof implementation This fixes two bugs in commit `1887d822f1`. First, if we are using the fallback C++ implementation of typeof, then we need to include the C++ header <type_traits> for std::remove_reference_t. This header is also likely to be used for other C++ implementations of type tricks, so we'll put it into the global includes. Second, for the case that the C compiler supports typeof in a spelling that is not "typeof" (for example, __typeof__), then we need to #undef typeof in the C++ section to avoid warnings about duplicate macro definitions. Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl> Discussion: https://www.postgresql.org/message-id/flat/92f9750f-c7f6-42d8-9a4a-85a3cbe808f3%40eisentraut.org	2026-03-11 11:54:10 +01:00
Peter Eisentraut	d4a080b8a1	Remove Int8GetDatum function We have no uses of Int8GetDatum in our tree and did not have for a long time (or never), and the inverse does not exist either. Author: Kirill Reshke <reshkekirill@gmail.com> Suggested-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/CALdSSPhFyb9qLSHee73XtZm1CBWJNo9+JzFNf-zUEWCRW5yEiQ@mail.gmail.com	2026-03-11 10:46:08 +01:00
Peter Eisentraut	d537f59fbb	Sort out table_open vs. relation_open in rewriter table_open() is a wrapper around relation_open() that checks that the relkind is table-like and gives a user-facing error message if not. It is best used in directly user-facing areas to check that the user used the right kind of command for the relkind. In internal uses where the relkind was previously checked from the user's perspective, table_open() is not necessary and might even be confusing if it were to give out-of-context error messages. In rewriteHandler.c, there were several such table_open() calls, which this changes to relation_open(). This currently doesn't make a difference, but there are plans to have other relkinds that could appear in the rewriter but that shouldn't be accessible via table-specific commands, and this clears the way for that. Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/6d3fef19-a420-4e11-8235-8ea534bf2080%40eisentraut.org Discussion: https://www.postgresql.org/message-id/flat/a855795d-e697-4fa5-8698-d20122126567@eisentraut.org	2026-03-11 09:22:11 +01:00
Andres Freund	82467f627b	Require share-exclusive lock to set hint bits and to flush At the moment hint bits can be set with just a share lock on a page (and, until `45f658dacb`, in one case even without any lock). Because of this we need to copy pages while writing them out, as otherwise the checksum could be corrupted. The need to copy the page is problematic to implement AIO writes: 1) Instead of just needing a single buffer for a copied page we need one for each page that's potentially undergoing I/O 2) To be able to use the "worker" AIO implementation the copied page needs to reside in shared memory It also causes problems for using unbuffered/direct-IO, independent of AIO: Some filesystems, raid implementations, ... do not tolerate the data being written out to change during the write. E.g. they may compute internal checksums that can be invalidated by concurrent modifications, leading e.g. to filesystem errors (as the case with btrfs). It also just is plain odd to allow modifications of buffers that are just share locked. To address these issues, this commit changes the rules so that modifications to pages are not allowed anymore while holding a share lock. Instead the new share-exclusive lock (introduced in `fcb9c977aa`) allows at most one backend to modify a buffer while other backends have the same page share locked. An existing share-lock can be upgraded to a share-exclusive lock, if there are no conflicting locks. For that BufferBeginSetHintBits()/BufferFinishSetHintBits() and BufferSetHintBits16() have been introduced. To prevent hint bits from being set while the buffer is being written out, writing out buffers now requires a share-exclusive lock. The use of share-exclusive to gate setting hint bits means that from now on only one backend can set hint bits at a time. 
To allow multiple backends to set hint bits would require more complicated locking: For setting hint bits we'd need to store the count of backends currently setting hint bits and we would need another lock-level for I/O conflicting with the lock-level to set hint bits. Given that the share-exclusive lock for setting hint bits is only held for a short time, that backends would often just set the same hint bits and that the cost of occasionally not setting hint bits in hotly accessed pages is fairly low, this seems like an acceptable tradeoff. The biggest change to adapt to this is in heapam. To avoid performance regressions for sequential scans that need to set a lot of hint bits, we need to amortize the cost of BufferBeginSetHintBits() for cases where hint bits are set at a high frequency. To that end HeapTupleSatisfiesMVCCBatch() uses the new SetHintBitsExt(), which defers BufferFinishSetHintBits() until all hint bits on a page have been set. Conversely, to avoid regressions in cases where we can't set hint bits in bulk (because we're looking only at individual tuples), use BufferSetHintBits16() when setting hint bits without batching. Several other places also need to be adapted, but those changes are comparatively simpler. After this we do not need to copy buffers to write them out anymore. That change is done separately however. Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/fvfmkr5kk4nyex56ejgxj3uzi63isfxovp2biecb4bspbjrze7@az2pljabhnff Discussion: https://postgr.es/m/stj36ea6yyhoxtqkhpieia2z4krnam7qyetc57rfezgk4zgapf%40gcnactj4z56m	2026-03-10 19:32:13 -04:00
Michael Paquier	4c910f3bbe	bloom: Optimize bitmap scan path with streaming read This commit replaces the per-page buffer read loop in blgetbitmap() with a read stream, to improve scan efficiency, particularly useful for large bloom indexes. Some benchmarking with a large number of rows has shown a very nice improvement in terms of runtime and IO read reduction with test cases up to 10M rows for a bloom index scan. For the io_uring method, the author has reported a 3x improvement in runtime while I was at close to 7x. For the worker method with 3 workers, the author has reported better numbers than myself in runtime, with the reduction in IO stats being appealing for all the cases measured. Author: Xuneng Zhou <xunengzhou@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com> Discussion: https://postgr.es/m/CABPTF7VrqfbcDXqGrdLQ2xaQ=K0RzExNuw6U_GGqzSJu32wfdQ@mail.gmail.com	2026-03-11 07:36:10 +09:00
Melanie Plageman	4c7362c553	Remove unused PruneState member frz_conflict_horizon `c2a23dcf9e` removed use of PruneState.frz_conflict_horizon but neglected to actually remove the member. Do that now.	2026-03-10 18:31:00 -04:00
Heikki Linnakangas	138592d1b0	Don't clear pendingRecoveryConflicts at end of transaction Commit `17f51ea818` introduced a new pendingRecoveryConflicts field in PGPROC to replace the various ProcSignals. The new field was cleared in ProcArrayEndTransaction(), which makes sense for conflicts with e.g. locks or buffer pins which are gone at end of transaction. But it is not appropriate for conflicts on a database, or a logical slot. Because of this, the 035_standby_logical_decoding.pl test was occasionally getting stuck in the buildfarm. It happens if the startup process signals recovery conflict with the logical slot just when the walsender process using the slot calls ProcArrayEndTransaction(). To fix, don't clear pendingRecoveryConflicts in ProcArrayEndTransaction(). We could still clear certain conflict flags, like conflicts on locks, but we didn't try to do that before commit `17f51ea818` either. In passing, fix a misspelled comment, and make InitAuxiliaryProcess() also clear pendingRecoveryConflicts. I don't think aux processes can have recovery conflicts, but it seems best to initialize the field and keep InitAuxiliaryProcess() as close to InitProcess() as possible. Analyzed-by: Alexander Lakhin <exclusion@gmail.com> Discussion: https://www.postgresql.org/message-id/3e07149d-060b-48a0-8f94-3d5e4946ae45@gmail.com	2026-03-11 00:06:09 +02:00
Melanie Plageman	c2a23dcf9e	Use the newest to-be-frozen xid as the conflict horizon for freezing Previously WAL records that froze tuples used OldestXmin as the snapshot conflict horizon, or the visibility cutoff if the page would become all-frozen. Both are newer than (or equal to) the newest XID actually frozen on the page. Track the newest XID that will be frozen and use that as the snapshot conflict horizon instead. This yields an older horizon resulting in fewer query cancellations on standbys. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Peter Geoghegan <pg@bowt.ie> Discussion: https://postgr.es/m/CAAKRu_bbaUV8OUjAfVa_iALgKnTSfB4gO3jnkfpcFgrxEpSGJQ%40mail.gmail.com	2026-03-10 15:24:39 -04:00
Álvaro Herrera	ac58465e06	Introduce the REPACK command REPACK absorbs the functionality of VACUUM FULL and CLUSTER in a single command. Because this functionality is completely different from regular VACUUM, having it separate from VACUUM makes it easier for users to understand; as for CLUSTER, the term is heavily overloaded in the IT world and even in Postgres itself, so it's good that we can avoid it. We retain those older commands, but de-emphasize them in the documentation, in favor of REPACK; the difference between VACUUM FULL and CLUSTER (namely, the fact that tuples are written in a specific ordering) is neatly absorbed as two different modes of REPACK. This allows us to introduce further functionality in the future that works regardless of whether an ordering is being applied, such as (and especially) a concurrent mode. Author: Antonin Houska <ah@cybertec.at> Reviewed-by: Mihail Nikalayeu <mihailnikalayeu@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Robert Treat <rob@xzilla.net> Reviewed-by: Euler Taveira <euler@eulerto.com> Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com> Reviewed-by: Junwang Zhao <zhjwpku@gmail.com> Reviewed-by: jian he <jian.universality@gmail.com> Discussion: https://postgr.es/m/82651.1720540558@antos Discussion: https://postgr.es/m/202507262156.sb455angijk6@alvherre.pgsql	2026-03-10 19:56:39 +01:00
Masahiko Sawada	a596d27d80	Fix grammar in short description of effective_wal_level. Align with the convention of using third-person singular (e.g., "Shows" instead of "Show") for GUC parameter descriptions. Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Discussion: https://postgr.es/m/20260210.143752.1113524465620875233.horikyota.ntt@gmail.com	2026-03-10 11:36:38 -07:00
Andres Freund	f4a4ce52c0	heapam: Don't mimic MarkBufferDirtyHint() in inplace updates Previously heap_inplace_update_and_unlock() used an operation order similar to MarkBufferDirty(), to reduce the number of different approaches used for updating buffers. However, in an upcoming patch, MarkBufferDirtyHint() will switch to using the update protocol used by most other places (enabled by hint bits only being set while holding a share-exclusive lock). Luckily it's pretty easy to adjust heap_inplace_update_and_unlock(). As a comment already foresaw, we can use the normal order, with the slight change of updating the buffer contents after WAL logging. Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Reviewed-by: Noah Misch <noah@leadboat.com> Discussion: https://postgr.es/m/5ubipyssiju5twkb7zgqwdr7q2vhpkpmuelxfpanetlk6ofnop@hvxb4g2amb2d	2026-03-10 11:58:06 -04:00
Álvaro Herrera	a198c26ded	pg_dumpall: simplify coding of dropDBs() There's no need for a StringInfo when all you want is a string being constructed in a single pass. Author: Álvaro Herrera <alvherre@kurilemu.de> Reported-by: Ranier Vilela <ranier.vf@gmail.com> Reviewed-by: Yang Yuanzhuo <1197620467@qq.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Andrew Dunstan <andrew@dunslane.net> Discussion: https://postgr.es/m/CAEudQAq2wyXZRdsh+wVHcOrungPU+_aQeQU12wbcgrmE0bQovA@mail.gmail.com	2026-03-10 16:00:19 +01:00
Fujii Masao	59bae23435	Remove duplicate initialization in initialize_brin_buildstate(). Commit `dae761a` added initialization of some BrinBuildState fields in initialize_brin_buildstate(). Later, commit `b437571` inadvertently added the same initialization again. This commit removes that redundant initialization. No behavioral change is intended. Author: Chao Li <lic@highgo.com> Reviewed-by: Shinya Kato <shinya11.kato@gmail.com> Discussion: https://postgr.es/m/CAEoWx2nmrca6-9SNChDvRYD6+r==fs9qg5J93kahS7vpoq8QVg@mail.gmail.com	2026-03-10 22:55:11 +09:00
Peter Eisentraut	8080f44f96	Rename grammar nonterminal to simplify reuse A list of expressions with optional AS-labels is useful in a few different places. Right now, this is available as xml_attribute_list because it was first used in the XMLATTRIBUTES construct, but it is already used elsewhere, and there are other possible future uses. To reduce possible confusion going forward, rename it to labeled_expr_list (like existing expr_list plus ColLabel). Discussion: https://www.postgresql.org/message-id/flat/a855795d-e697-4fa5-8698-d20122126567@eisentraut.org	2026-03-10 14:09:09 +01:00
Robert Haas	0fbfd37cef	Allow extensions to mark an individual index as disabled. Up until now, the only way for a loadable module to disable the use of a particular index was to use build_simple_rel_hook (or, previous to yesterday's commit, get_relation_info_hook) to remove it from the index list. While that works, it has some disadvantages. First, the index becomes invisible for all purposes, and can no longer be used for optimizations such as self-join elimination or left join removal, which can severely degrade the resulting plan. Second, if the module attempts to compel the use of a certain index by removing all other indexes from the index list and disabling other scan types, but the planner is unable to use the chosen index for some reason, it will fall back to a sequential scan, because that is only disabled, whereas the other indexes are, from the planner's point of view, completely gone. While this situation ideally shouldn't occur, it's hard for a loadable module to be completely sure whether the planner will view a certain index as usable for a certain query. If it isn't, it may be better to fall back to a scan using a disabled index rather than falling back to an also-disabled sequential scan. Reviewed-by: Alexandra Wang <alexandra.wang.oss@gmail.com> Discussion: http://postgr.es/m/CA%2BTgmoYS4ZCVAF2jTce%3DbMP0Oq_db_srocR4cZyO0OBp9oUoGg%40mail.gmail.com	2026-03-10 08:33:55 -04:00
Michael Paquier	03facc1211	Switch to FATAL error for missing checkpoint record without backup_label Crash recovery started without a backup_label previously crashed with a PANIC if the checkpoint record could not be found. This commit lowers the report generated to be a FATAL instead. With recovery methods being more imaginative these days, this should provide more flexibility when handling PostgreSQL recovery processing in the event of a driver error, similarly to `15f68cebdc`. An extra benefit of this change is that it becomes possible to add a test to check that a FATAL is hit with an expected error message pattern. With the recovery code becoming more complicated over the last couple of years, I suspect that this will be beneficial to cover in the long term. The original PANIC behavior has been introduced in the early days of crash recovery, as of `4d14fe0048` (PANIC did not exist yet, the code used STOP). Author: Nitin Jadhav <nitinjadhavpostgres@gmail.com> Discussion: https://postgr.es/m/CAMm1aWZbQ-Acp_xAxC7mX9uZZMH8+NpfepY9w=AOxbBVT9E=uA@mail.gmail.com	2026-03-10 12:00:05 +09:00
Michael Paquier	6307b096e2	Fix misuse of "volatile" in xml.c What should be used is not "volatile foo *ptr" but "foo *volatile ptr". The incorrect (former) style means that what the pointer variable points to is volatile. The correct (latter) style means that the pointer variable itself needs to be treated as volatile. The latter style is required to ensure a consistent treatment of these variables after a longjmp with the TRY/CATCH blocks. Some casts can be removed thanks to this change. Issue introduced by `2e94721747`, so no backpatch is required. A similar set of issues has been fixed in `93001888d8` for contrib/xml2/. Author: ChangAo Chen <cca5507@qq.com> Discussion: https://postgr.es/m/tencent_5BE8DAD985EE140ED62EA728C8D4E1311F0A@qq.com	2026-03-10 07:05:32 +09:00
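The distinction drawn in `6307b096e2` is about where the qualifier sits relative to the `*`. The following is a minimal sketch using plain `setjmp`/`longjmp` rather than PostgreSQL's TRY/CATCH macros; `value_after_longjmp` and `fail` are hypothetical names chosen for illustration. In standard C, a non-volatile local that is modified between `setjmp` and `longjmp` has an indeterminate value after the jump, so the pointer variable itself is the thing that needs the qualifier.

```c
#include <setjmp.h>

static jmp_buf env;

/* Simulates an error being thrown from deeper in the call stack. */
static void
fail(void)
{
    longjmp(env, 1);
}

int
value_after_longjmp(void)
{
    static int  before = 1;
    static int  after = 2;

    /*
     * "int *volatile ptr": the pointer variable is volatile, so its value
     * is reliably preserved across longjmp.  "volatile int *ptr" would
     * instead qualify the int being pointed to, which does not help here.
     */
    int        *volatile ptr = &before;

    if (setjmp(env) == 0)
    {
        ptr = &after;           /* modified after setjmp ... */
        fail();                 /* ... and before the longjmp */
    }

    /* reached via longjmp; ptr is guaranteed to point at "after" */
    return *ptr;
}
```

With the qualifier on the pointer, the function returns 2. Without it, the C standard leaves the value of `ptr` after the jump indeterminate, and an optimizing compiler may keep the old value in a register.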
Nathan Bossart	7c8280eeb5	pg_{dump,restore}: Refactor handling of conflicting options. This commit makes use of the function added by commit `b2898baaf7` for these applications' handling of conflicting options. It doesn't fix any bugs, but it does trim several lines of code. Author: Jian He <jian.universality@gmail.com> Reviewed-by: Steven Niu <niushiji@gmail.com> Reviewed-by: Zsolt Parragi <zsolt.parragi@percona.com> Discussion: https://postgr.es/m/CACJufxHDYn%2B3-2jR_kwYB0U7UrNP%2B0EPvAWzBBD5EfUzzr1uiw%40mail.gmail.com	2026-03-09 11:37:46 -05:00
Robert Haas	91f33a2ae9	Replace get_relation_info_hook with build_simple_rel_hook. For a long time, PostgreSQL has had a get_relation_info_hook which plugins can use to editorialize on the information that get_relation_info obtains from the catalogs. However, this hook is only called for baserels of type RTE_RELATION, and there is potential utility in a similar call back for other types of RTEs. This might have had utility even before commit `4020b370f2` added pgs_mask to RelOptInfo, but it certainly has utility now. So, move the callback up one level, deleting get_relation_info_hook and adding build_simple_rel_hook instead. The new callback is called just slightly later than before and with slightly different arguments, but it should be fairly straightforward to adjust existing code that currently uses get_relation_info_hook: the values previously available as relationObjectId and inhparent are now available via rte->relid and rte->inh, and calls where rte->rtekind != RTE_RELATION can be ignored if desired. Reviewed-by: Alexandra Wang <alexandra.wang.oss@gmail.com> Discussion: http://postgr.es/m/CA%2BTgmoYg8uUWyco7Pb3HYLMBRQoO6Zh9hwgm27V39Pb6Pdf%3Dug%40mail.gmail.com	2026-03-09 09:48:26 -04:00
Robert Haas	8300d3ad4a	Consider startup cost as a figure of merit for partial paths. Previously, the comments stated that there was no purpose to considering startup cost for partial paths, but this is not the case: it's perfectly reasonable to want a fast-start path for a plan that involves a LIMIT (perhaps over an aggregate, so that there is enough data being processed to justify parallel query but yet we don't want all the result rows). Accordingly, rewrite add_partial_path and add_partial_path_precheck to consider startup costs. This also fixes an independent bug in add_partial_path_precheck: commit `e222534679` failed to update it to do anything with the new disabled_nodes field. That bug fix is formally separate from the rest of this patch and could be committed separately, but I think it makes more sense to fix both issues together, because then we can (as this commit does) just make add_partial_path_precheck do the cost comparisons in the same way as compare_path_costs_fuzzily, which hopefully reduces the chances of ending up with something that's still incorrect. This patch is based on earlier work on this topic by Tomas Vondra, but I have rewritten a great deal of it. Co-authored-by: Robert Haas <rhaas@postgresql.org> Co-authored-by: Tomas Vondra <tomas@vondra.me> Discussion: http://postgr.es/m/CA+TgmobRufbUSksBoxytGJS1P+mQY4rWctCk-d0iAUO6-k9Wrg@mail.gmail.com	2026-03-09 08:16:30 -04:00
Robert Haas	ffc226ab64	Prevent restore of incremental backup from bloating VM fork. When I (rhaas) wrote the WAL summarizer code, I incorrectly believed that XLOG_SMGR_TRUNCATE truncates all forks to the same length. In fact, what other parts of the code do is compute the truncation length for the FSM and VM forks from the truncation length used for the main fork. But, because I was confused, I coded the WAL summarizer to set the limit block for the VM fork to the same value as for the main fork. (Incremental backup always copies FSM forks in full, so there is no similar issue in that case.) Doing that doesn't directly cause any data corruption, as far as I can see. However, it does create a serious risk of consuming a large amount of extra disk space, because pg_combinebackup's reconstruct.c believes that the reconstructed file should always be at least as long as the limit block value. We might want to be smarter about that at some point in the future, because it's always safe to omit all-zeroes blocks at the end of the last segment of a relation, and doing so could save disk space, but the current algorithm will rarely waste enough disk space to worry about unless we believe that a relation has been truncated to a length much longer than its actual length on disk, which is exactly what happens as a result of the problem mentioned in the previous paragraph. To fix, create a new visibilitymap helper function and use it to include the right limit block in the summary files. Incremental backups taken with existing summary files will still have this issue, but this should improve the situation going forward. Diagnosed-by: Oleg Tkachenko <oatkachenko@gmail.com> Diagnosed-by: Amul Sul <sulamul@gmail.com> Discussion: http://postgr.es/m/CAAJ_b97PqG89hvPNJ8cGwmk94gJ9KOf_pLsowUyQGZgJY32o9g@mail.gmail.com Discussion: http://postgr.es/m/6897DAF7-B699-41BF-A6FB-B818FCFFD585%40gmail.com Backpatch-through: 17	2026-03-09 06:45:32 -04:00
Amit Kapila	06d8302262	Remove trailing period from errmsg in subscriptioncmds.c. Author: Sahitya Chandra <sahityajb@gmail.com> Discussion: https://postgr.es/m/20260308142806.181309-1-sahityajb@gmail.com	2026-03-09 15:10:03 +05:30
Peter Eisentraut	2799e29fb8	Move comment back to better place Commit `f014b1b9bb` inserted some new code in between existing code and a trailing comment. Move the comment back to near the code it belongs to.	2026-03-09 10:36:16 +01:00
Fujii Masao	173aa8c5e8	doc: Document IF NOT EXISTS option for ALTER FOREIGN TABLE ADD COLUMN. Commit `2cd40adb85` added the IF NOT EXISTS option to ALTER TABLE ADD COLUMN. This also enabled IF NOT EXISTS for ALTER FOREIGN TABLE ADD COLUMN, but the ALTER FOREIGN TABLE documentation was not updated to mention it. This commit updates the documentation to describe the IF NOT EXISTS option for ALTER FOREIGN TABLE ADD COLUMN. While updating that section, this commit also clarifies that the COLUMN keyword is optional in ALTER FOREIGN TABLE ADD/DROP COLUMN. Previously, part of the documentation could be read as if COLUMN were required. This commit adds regression tests covering these ALTER FOREIGN TABLE syntaxes. Backpatch to all supported versions. Suggested-by: Fujii Masao <masao.fujii@gmail.com> Author: Chao Li <lic@highgo.com> Reviewed-by: Robert Treat <rob@xzilla.net> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/CAHGQGwFk=rrhrwGwPtQxBesbT4DzSZ86Q3ftcwCu3AR5bOiXLw@mail.gmail.com Backpatch-through: 14	2026-03-09 18:23:36 +09:00
Michael Paquier	4da2afd01f	Fix size underestimation of DSA pagemap for odd-sized segments When make_new_segment() creates an odd-sized segment, the pagemap was only sized based on a number of usable_pages entries, forgetting that a segment also contains metadata pages, and that the FreePageManager uses absolute page indices that cover the entire segment. This miscalculation could cause accesses to pagemap entries to be out of bounds. During subsequent reuse of the allocated segment, allocations landing on pages with indices higher than usable_pages could cause out-of-bounds pagemap reads and/or writes. On write, 'span' pointers are stored into the data area, corrupting the allocated objects. On read (aka during a dsa_free), garbage is interpreted as a span pointer, typically crashing the server in dsa_get_address(). The normal geometric path correctly sizes the pagemap for all pages in the segment. The odd-sized path needs to do the same, but it works forward from usable_pages rather than backward from total_size. This commit fixes the sizing of the odd-sized case by adding pagemap entries for the metadata pages after the initial metadata_bytes calculation, using an integer ceiling division to compute the exact number of additional entries needed in one go, avoiding any iteration in the calculation. An assertion is added in the code path for odd-sized segments, ensuring that the pagemap includes the metadata area, and that the result is appropriately sized. This problem would show up depending on the size requested for the allocation of a DSA segment. The reporter has noticed this issue when a parallel hash join makes a DSA allocation large enough to trigger the odd-sized segment path, but it could happen for anything that does a DSA allocation. A regression test is added to test_dsa, down to v17 where the test module has been introduced. This adds a set of cheap tests to check the problem, the new assertion being useful for this purpose. 
Sami has proposed a test that took a longer time than what I have done here; the test committed is faster and good enough to check the odd-sized allocation path. Author: Paul Bunn <paul.bunn@icloud.com> Reviewed-by: Sami Imseih <samimseih@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/044401dcabac$fe432490$fac96db0$@icloud.com Backpatch-through: 14	2026-03-09 13:46:27 +09:00
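The sizing problem in `4da2afd01f` is circular: the pagemap needs one entry per page in the segment, but the pagemap itself lives in metadata pages that also need entries. A single ceiling division can break the circle. The sketch below is an illustration under stated assumptions, not the dsa.c code: `metadata_pages_needed` and all four parameters are hypothetical, and it assumes `entry_size < page_size`. Requiring `m * page_size >= header_bytes + (usable_pages + m) * entry_size` and solving for the smallest integer `m` gives the formula used.

```c
#include <stddef.h>

/*
 * Smallest number of metadata pages m such that the metadata area
 * (a fixed header plus one pagemap entry for every page in the segment,
 * metadata pages included) fits in m pages:
 *
 *     m * page_size >= header_bytes + (usable_pages + m) * entry_size
 * =>  m >= (header_bytes + usable_pages * entry_size)
 *          / (page_size - entry_size)
 *
 * All names and parameters are hypothetical; assumes entry_size < page_size.
 */
size_t
metadata_pages_needed(size_t usable_pages, size_t header_bytes,
                      size_t page_size, size_t entry_size)
{
    size_t      numer = header_bytes + usable_pages * entry_size;
    size_t      denom = page_size - entry_size;

    /* integer ceiling division, no iteration needed */
    return (numer + denom - 1) / denom;
}
```

For example, with 4096-byte pages, 8-byte entries, a 1000-byte header and 1000 usable pages, the result is 3: three metadata pages hold 1000 + 1003 * 8 = 9024 bytes, while two pages (8192 bytes) would not hold the 9016 bytes needed for 1002 entries.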
Michael Paquier	ccd7abaa45	Refactor tests for catalog diff comparisons in stats_import.sql The tests of stats_import.sql include a set of queries to do differential checks of the three statistics catalog relations, based on the comparison of a source relation and a target relation, used for the copy of the stats data with the restore functions: - pg_statistic - pg_stats_ext - pg_stats_ext_exprs This commit refactors the tests to reduce the bloat of such differential queries, by creating a set of objects that make the differential queries smaller: - View for a base relation type. - First function to retrieve stats data, that returns a type based on the view previously created. - Second function that checks the difference, based on two calls of the first function. This change leads to a nice reduction of stats_import.sql, with a larger effect on the output file. While on it, this adds some sanity checks for the three catalogs, to warn developers that the stats import facilities may need to be updated if any of the three catalogs change. These are rare in practice, see `918eee0c49` as one example. Another stylistic change is the use of the extended output format for the differential queries, so that we avoid long lines of output if a diff is caught. Author: Corey Huinker <corey.huinker@gmail.com> Discussion: https://postgr.es/m/CADkLM=eEhxJpSUP+eC=eMGZZsVOpnfKDvVkuCbsFg9CajYwDsA@mail.gmail.com	2026-03-09 08:46:06 +09:00
Michael Paquier	9e8193a262	Fix typo in stats_import.sql The test mentioned pg_stat_ext_exprs, but the correct catalog name is pg_stats_ext_exprs. Thinko in `ba97bf9cb7`. Discussion: https://postgr.es/m/CADkLM=eEhxJpSUP+eC=eMGZZsVOpnfKDvVkuCbsFg9CajYwDsA@mail.gmail.com	2026-03-09 07:15:26 +09:00
Álvaro Herrera	eb2c867b0a	Fix invalid boolean if-test We were testing the truth value of the array of booleans (which is always true) instead of the boolean element specific to the affected table column. This causes a binary-upgrade dump to fail to omit the name of a constraint; that is, the correct constraint name is always printed, even when it's not needed. The affected case is a binary-upgrade dump of a not-null constraint in an inherited column, which must in addition have no comment. Another point is that in order for this to make a difference, the constraint must have the default name in the child table. That is, the constraint must have been created _in the parent table_ with the name that it would have in the child table, like so: CREATE TABLE parent (a int CONSTRAINT child_a_not_null NOT NULL); CREATE TABLE child () INHERITS (parent); Otherwise, the correct name must be printed by binary-upgrade pg_dump anyway, since it wouldn't match the name produced at the parent. Moreover, when it does hit, the pre-18-compatibility code (which has to work with a constraint that has no name) gets involved and uses the UPDATE on pg_constraint using the conkey instead of column name ... and so everything ends up working correctly AFAICS. I think it might cause a problem if the table and column names are overly long, but I didn't want to spend time investigating further. Still, it's wrong code, and static analyzers have twice complained about it, so fix it by adding the array index accessor that was obviously meant. Reported-by: Ranier Vilela <ranier.vf@gmail.com> Reported-by: George Tarasov <george.v.tarasov@gmail.com> Backpatch-through: 18 Discussion: https://postgr.es/m/CAEudQAo7ah=4TDheuEjtb0dsv6bHoK7uBNqv53Tsub2h-xBSJw@mail.gmail.com Discussion: https://postgr.es/m/f3029f25-acc9-4cb9-a74f-fe93bcfb3a27@gmail.com	2026-03-07 14:28:16 +01:00
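The class of bug fixed by `eb2c867b0a` is easy to reproduce in isolation. The sketch below does not use pg_dump's real structures: `TableFlags`, `name_is_default`, `buggy_check`, and `fixed_check` are hypothetical names. It only shows why testing the array pointer is always true once the array has been allocated, whereas the per-column element carries the intended answer.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-table info: one boolean flag per column. */
typedef struct TableFlags
{
    int         ncols;
    bool       *name_is_default;    /* array of ncols flags */
} TableFlags;

/* Buggy form: tests the array pointer, which is non-NULL for any table. */
bool
buggy_check(const TableFlags *t, int col)
{
    (void) col;                 /* the column index is never consulted */
    return t->name_is_default ? true : false;
}

/* Fixed form: tests the flag that belongs to the given column. */
bool
fixed_check(const TableFlags *t, int col)
{
    return t->name_is_default[col];
}
```

Given flags `{true, false, true}`, `buggy_check` reports true for column 1 while `fixed_check` reports false, which is the kind of discrepancy static analyzers flag.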
Jacob Champion	e982331b52	libpq: Introduce PQAUTHDATA_OAUTH_BEARER_TOKEN_V2 For the libpq-oauth module to eventually make use of the PGoauthBearerRequest API, it needs some additional functionality: the derived Issuer ID for the authorization server needs to be provided, and error messages need to be built without relying on PGconn internals. These features seem useful for application hooks, too, so that they don't each have to reinvent the wheel. The original plan was for additions to PGoauthBearerRequest to be made without a version bump to the PGauthData type. Applications would simply check a LIBPQ_HAS_* macro at compile time to decide whether they could use the new features. That theoretically works for applications linked against libpq, since it's not safe to downgrade libpq from the version you've compiled against. We've since found that this strategy won't work for plugins, due to a complication first noticed during the libpq-oauth module split: it's normal for a plugin on disk to be newer than the libpq that's loading it, because you might have upgraded your installation while an application was running. (In other words, a plugin architecture causes the compile-time and run-time dependency arrows to point in opposite directions, so plugins won't be able to rely on the LIBPQ_HAS_* macros to determine what APIs are available to them.) Instead, extend the original PGoauthBearerRequest (now retroactively referred to as "v1" in the code) with a v2 subclass-style struct. When an application implements and accepts PQAUTHDATA_OAUTH_BEARER_TOKEN_V2, it may safely cast the base request pointer it receives in its callbacks to v2 in order to make use of the new functionality. libpq will query the application for a v2 hook first, then v1 to maintain backwards compatibility, before giving up and using the builtin flow. 
Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Zsolt Parragi <zsolt.parragi@percona.com> Discussion: https://postgr.es/m/CAOYmi%2BmrGg%2Bn_X2MOLgeWcj3v_M00gR8uz_D7mM8z%3DdX1JYVbg%40mail.gmail.com	2026-03-06 12:05:51 -08:00
Nathan Bossart	b2898baaf7	pg_dumpall: Fix handling of conflicting options. pg_dumpall is missing checks for some conflicting options, including those passed through to pg_dump. To fix, introduce a new function that checks whether mutually exclusive options are set, and use that in pg_dumpall. A similar change could likely be made for pg_dump and pg_restore, but that is left as a future exercise. This is arguably a bug fix, but since this might break existing scripts, no back-patch for now. Author: Jian He <jian.universality@gmail.com> Co-authored-by: Nathan Bossart <nathandbossart@gmail.com> Reviewed-by: Wang Peng <215722532@qq.com> Reviewed-by: Zsolt Parragi <zsolt.parragi@percona.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CACJufxFf5%3DwSv2MsuO8iZOvpLZQ1-meAMwhw7JX5gNvWo5PDug%40mail.gmail.com	2026-03-06 14:00:04 -06:00
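A rough sketch of what a check for mutually exclusive options might look like, written in Python rather than C. The function name, its signature, and the message wording are assumptions for illustration and are not the helper added by the commit; the option names are existing pg_dumpall switches.

```python
def check_mutually_exclusive(options):
    """Return an error message for the first conflicting pair, or None.

    `options` maps an option name to whether it was given on the
    command line. Hypothetical helper, not pg_dumpall's actual code.
    """
    given = [name for name, was_set in options.items() if was_set]
    if len(given) > 1:
        return f"options {given[0]} and {given[1]} cannot be used together"
    return None

# Two switches from the same exclusive group were both supplied.
conflict = check_mutually_exclusive(
    {"--globals-only": True, "--roles-only": True, "--tablespaces-only": False}
)
# Only one switch from the group was supplied.
no_conflict = check_mutually_exclusive(
    {"--globals-only": True, "--roles-only": False, "--tablespaces-only": False}
)
```

Collecting each exclusive group in one place keeps the checks uniform instead of scattering pairwise `if` tests.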
Masahiko Sawada	50ea4e09b6	Use palloc_object() and palloc_array() in more areas of the logical replication. The idea is to encourage the use of newer routines across the tree, as these offer stronger type-safety guarantees than raw palloc(). Similar work has been done in commits `1b105f9472`, `0c3c5c3b06`, `31d3847a37`, and `4f7dacc5b8`. This commit extends those changes to more locations within src/backend/replication/logical/. Author: Peter Smith <smithpb2250@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Discussion: https://postgr.es/m/CAHut+Pv4N7Vpxo18+NAR1r9RGvR8b0BtwTkoeCE2PfFoXgmR6A@mail.gmail.com	2026-03-06 10:49:50 -08:00
Tom Lane	415100aa62	Support grouping-expression references and GROUPING() in subqueries. Until now, substitute_grouped_columns and its predecessor check_ungrouped_columns intentionally did not cope with references to GROUP BY expressions (anything more complex than a Var) within subqueries of the query having GROUP BY. Because they didn't try to match subexpressions of subqueries to the GROUP BY list, they'd drill down to raw Vars of the grouping level and then fail with "subquery uses ungrouped column from outer query". There have been remarkably few complaints about this deficiency, so nobody ever did anything about it. The reason for not wanting to deal with it is that within a subquery, Vars will have varlevelsup different from zero and will thus not be equal() to the expressions seen in the outer query. We recognized this at least as far back as `96ca8ffeb`, although I think the comment I added about it then was just documenting a pre-existing deficiency. It looks like at the time, the solutions I considered were (1) write a version of equal() that permits an offset in varlevelsup, or (2) dynamically apply IncrementVarSublevelsUp at each subexpression. (1) would require an amount of new code that seems rather out of proportion to the benefit, while (2) would add an exponential amount of cost to the matching process. But rethinking it now, what seems attractive is (3) apply IncrementVarSublevelsUp to the groupingClause list not the subexpressions, and do so only once per subquery depth level. Then we can still use plain equal() to check for matches, and we're not incurring cost proportional to some power of the subquery's complexity. This patch continues to use the old logic when the GROUP BY list is all Vars. We could discard the special comparison logic for that and always do it the more general way, but that would be a good deal slower. (Micro-benchmarking just parse analysis suggests it's about 50% slower than the Vars-only path. 
But we've not heard complaints about the speed of matching within the main query, so I doubt that applying the same matching logic within subqueries will be a problem.) The lack of complaints suggests strongly that this is a very minority use-case, so I don't want to make the typical case slower to fix it. While testing that, I was surprised to discover a nearby bug: GROUPING() within a subquery fails to match GROUP BY Vars that are join alias Vars. It tries to apply flatten_join_alias_vars to make such cases work, but that fails to work inside a subquery because varlevelsup is wrong. Therefore, this patch invents a new entry point flatten_join_alias_for_parser() that allows specification of a sublevels_up offset. (It seems cleaner to give the parser its own entry point rather than abuse the planner's conventions even further.) While this is pretty clearly a bug fix, I'm hesitant to take the risk of back-patching, seeing that the existing behavior has stood for so long with so few complaints. Maybe we can reconsider once this patch has baked awhile in master. Reported-by: PALAYRET Jacques <jacques.palayret@meteo.fr> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/531183.1772058731@sss.pgh.pa.us	2026-03-06 13:40:55 -05:00
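Option (3) from the entry above can be illustrated with a toy model. This is a simplified Python sketch, not parser code: `Var`, `Op`, and `increment_levels` are hypothetical stand-ins, and unlike the real IncrementVarSublevelsUp the helper shifts every Var unconditionally.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    levelsup: int  # number of query levels up (0 = current query)
    varno: int
    attno: int

@dataclass(frozen=True)
class Op:
    name: str
    args: tuple

def increment_levels(node, delta=1):
    # Shift every Var in the expression tree by `delta` query levels.
    if isinstance(node, Var):
        return Var(node.levelsup + delta, node.varno, node.attno)
    if isinstance(node, Op):
        return Op(node.name, tuple(increment_levels(a, delta) for a in node.args))
    return node  # constants are left unchanged

# GROUP BY x + 1 at the outer level.
group_by = [Op("+", (Var(0, 1, 1), 1))]
# The same expression seen from inside a subquery: the Var is one level up.
inner_ref = Op("+", (Var(1, 1, 1), 1))

# Shift the GROUP BY list once per subquery depth, then compare plainly.
shifted_group_by = [increment_levels(g) for g in group_by]
matched_before = inner_ref in group_by
matched_after = inner_ref in shifted_group_by
```

The subquery's expressions are never rewritten; only the short GROUP BY list is shifted, so ordinary equality suffices for the match.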
Jeff Davis	8185bb5347	CREATE SUBSCRIPTION ... SERVER. Allow CREATE SUBSCRIPTION to accept a foreign server using the SERVER clause instead of a raw connection string using the CONNECTION clause. * Enables a user with sufficient privileges to create a subscription using a foreign server by name without specifying the connection details. * Integrates with user mappings (and other FDW infrastructure) using the subscription owner. * Provides a layer of indirection to manage multiple subscriptions to the same remote server more easily. Also add CREATE FOREIGN DATA WRAPPER ... CONNECTION clause to specify a connection_function. To be eligible for a subscription, the foreign server's foreign data wrapper must specify a connection_function. Add connection_function support to postgres_fdw, and bump postgres_fdw version to 1.3. Bump catversion. Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-by: Shlok Kyal <shlok.kyal.oss@gmail.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/61831790a0a937038f78ce09f8dd4cef7de7456a.camel@j-davis.com	2026-03-06 08:27:56 -08:00
Álvaro Herrera	868825aaeb	Don't include wait_event.h in pgstat.h wait_event.h itself includes wait_event_types.h, which is a generated file, so it's nice that we can avoid compiling >10% of the tree just because that file is regenerated. To avoid breaking too many third-party modules, we now #include utils/wait_classes.h in storage/latch.h. Then, the very common case of doing WaitLatch(..., PG_WAIT_EXTENSION) continues to work by including just storage/latch.h. (I didn't try to determine how many modules would actually break if we don't do this, but this seems a convenient and low-impact measure.) Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/202602181214.gcmhx2vhlxzp@alvherre.pgsql	2026-03-06 16:24:58 +01:00
Peter Eisentraut	90ca7c1429	doc: Fix capitalization of Unicode Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/2a668979-ed92-49a3-abf9-a3ec2d460ec2%40eisentraut.org	2026-03-06 10:32:05 +01:00
Peter Eisentraut	16686a853f	Fix Python deprecation warning Starting with Python 3.14, contrib/unaccent/generate_unaccent_rules.py complains DeprecationWarning: codecs.open() is deprecated. Use open() instead. This makes that change. This works for all Python 3.x versions. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/2a668979-ed92-49a3-abf9-a3ec2d460ec2%40eisentraut.org	2026-03-06 10:31:59 +01:00
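The replacement suggested by the deprecation warning can be sketched as follows. This is a minimal example with a hypothetical helper and sample file, not the code of generate_unaccent_rules.py; it only shows the built-in `open()` with an explicit `encoding` doing the job of `codecs.open()`.

```python
import os
import tempfile

def read_rules(path):
    # Built-in open() with an explicit encoding takes the place of
    # codecs.open(path, "r", "utf-8").
    with open(path, "r", encoding="utf-8") as f:
        return f.read()

# Round-trip a non-ASCII sample through a temporary file.
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "sample.txt")
    with open(sample_path, "w", encoding="utf-8") as f:
        f.write("é\te\n")
    sample_text = read_rules(sample_path)
```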
Peter Eisentraut	258248d0bd	Make unconstify and unvolatize use StaticAssertVariableIsOfTypeMacro The unconstify and unvolatize macros had an almost identical assertion as was already defined in StaticAssertVariableIsOfTypeMacro, only it had a less useful error message and didn't have a sizeof fallback. Author: Jelte Fennema-Nio <postgres@jeltef.nl> Discussion: https://www.postgresql.org/message-id/flat/CAGECzQR21OnnKiZO_1rLWO0-16kg1JBxnVq-wymYW0-_1cUNtg@mail.gmail.com	2026-03-06 10:14:32 +01:00
Peter Eisentraut	e2308350c9	Use typeof everywhere instead of compiler specific spellings We define typeof ourselves as __typeof__ if it does not exist. So let's actually use that for consistency. The meson/autoconf checks for __builtin_types_compatible_p still use __typeof__ though, because there we have not redefined it. Author: Jelte Fennema-Nio <postgres@jeltef.nl> Discussion: https://www.postgresql.org/message-id/flat/CAGECzQR21OnnKiZO_1rLWO0-16kg1JBxnVq-wymYW0-_1cUNtg@mail.gmail.com	2026-03-06 10:14:32 +01:00
Peter Eisentraut	aa7c868523	Portable StaticAssertExpr Use a different way to write StaticAssertExpr() that does not require the GCC extension statement expressions. For C, we put the static_assert into a struct. This appears to be a common approach. We still need to keep the fallback implementation to support buggy MSVC < 19.33. For C++, we put it into a lambda expression. (The C approach doesn't work; it's not permitted to define a new type inside sizeof.) Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl> Discussion: https://www.postgresql.org/message-id/flat/5fa3a9f5-eb9a-4408-9baf-403d281f8b10%40eisentraut.org	2026-03-06 09:27:54 +01:00
Fujii Masao	6eedb2a5fd	Fix publisher shutdown hang caused by logical walsender busy loop. Previously, when logical replication was running, shutting down the publisher could cause the logical walsender to enter a busy loop and prevent the publisher from completing shutdown. During shutdown, the logical walsender waits for all pending WAL to be written out. However, some WAL records could remain unflushed, causing the walsender to wait indefinitely. The issue occurred because the walsender used XLogBackgroundFlush() to flush pending WAL. This function does not guarantee that all WAL is written. For example, WAL generated by a transaction without an assigned transaction ID that aborts might not be flushed. This commit fixes the bug by making the logical walsender call XLogFlush() instead, ensuring that all pending WAL is written and preventing the busy loop during shutdown. Backpatch to all supported versions. Author: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com> Reviewed-by: Alexander Lakhin <exclusion@gmail.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/CAO6_Xqo3co3BuUVEVzkaBVw9LidBgeeQ_2hfxeLMQcXwovB3GQ@mail.gmail.com Backpatch-through: 14	2026-03-06 16:43:40 +09:00
Fujii Masao	2007df4333	Improve tests for recovery_target_timeline GUC. Commit `fd7d7b7191` added regression tests to verify recovery_target_timeline settings. To confirm that invalid values are rejected, those tests started the server with an invalid setting and then verified that startup failed. While functionally correct, this approach was expensive because it required setting up and starting the server for each test case. This commit updates the tests for recovery_target_timeline to use the simpler approach introduced by commit `bffd7130` for recovery_target_xid, using ALTER SYSTEM SET to verify that invalid settings are rejected. This avoids the need to set up and start the server when checking invalid recovery_target_timeline values. Author: David Steele <david@pgbackrest.org> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/CAHGQGwG44vZbSoBmg076G+xkR6n=Tj2=q+fVkfP7yEsyF1daFA@mail.gmail.com	2026-03-06 16:02:57 +09:00
Michael Paquier	d5ea206728	Fix inconsistency with HeapTuple freeing in extended_stats_funcs.c heap_freetuple() is a thin wrapper doing a pfree(), and the function import_pg_statistic(), introduced by `ba97bf9cb7`, had the idea to call directly pfree() rather than the "dedicated" heap tuple routine. upsert_pg_statistic_ext_data already uses heap_freetuple(). This code is harmless as-is, but let's be consistent across the board. Reported-by: Yonghao Lee <yonghao_lee@qq.com> Discussion: https://postgr.es/m/tencent_CA1315EE8FB9C62F742C71E95FAD72214205@qq.com	2026-03-06 14:49:00 +09:00
Michael Paquier	2d4ead6f4b	Fix order of columns in pg_stat_recovery recovery_last_xact_time is listed before current_chunk_start_time in the documentation, the function definition and the view definition, but their order was reversed in the code. Thinko in `01d485b142`. Mea culpa. Author: Shinya Kato <shinya11.kato@gmail.com> Discussion: https://postgr.es/m/CAOzEurQQ1naKmPJhfE5WOUQjtf5tu08Kw3QCGY5UY=7Rt9fE=w@mail.gmail.com	2026-03-06 14:41:41 +09:00
Amit Kapila	f1ddaa1535	Fix inconsistent elevel in pg_sync_replication_slots() retry logic. The commit `0d2d4a0ec3` allowed pg_sync_replication_slots() to retry sync attempts, but missed a case, when WAL prior to a slot's confirmed_flush_lsn is not yet flushed locally. By changing the elevel from ERROR to LOG, we allow the sync loop to continue. This provides the opportunity for the slot to be synchronized once the standby catches up with the necessary WAL. Author: Zhijie Hou <houzj.fnst@fujitsu.com> Reviewed-by: shveta malik <shveta.malik@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/CAFPTHDZAA+gWDntpa5ucqKKba41=tXmoXqN3q4rpjO9cdxgQrw@mail.gmail.com	2026-03-06 10:51:32 +05:30
Michael Paquier	01d485b142	Add system view pg_stat_recovery This commit introduces pg_stat_recovery, that exposes at SQL level the state of recovery as tracked by XLogRecoveryCtlData in shared memory, maintained by the startup process. This new view includes the following fields, that are useful for monitoring purposes on a standby, once it has reached a consistent state (making the execution of the SQL function possible): - Last-successfully replayed WAL record LSN boundaries and its timeline. - Currently replaying WAL record end LSN and its timeline. - Current WAL chunk start time. - Promotion trigger state. - Timestamp of latest processed commit/abort. - Recovery pause state. Some of this data can already be recovered from different system functions, but not all of it. See pg_get_wal_replay_pause_state or pg_last_xact_replay_timestamp. This new view offers a stronger consistency guarantee, by grabbing the recovery state for all fields through one spinlock acquisition. The system view relies on a new function, called pg_stat_get_recovery(). Querying this data requires the pg_read_all_stats privilege. The view returns no rows if the node is not in recovery. This feature originates from a suggestion I have made while discussing the addition of a CONNECTING state to the WAL receiver's shared memory state, because we lacked access to some of the state data. The author has taken the time to implement it, so thanks for that. Bump catalog version. Author: Xuneng Zhou <xunengzhou@gmail.com> Discussion: https://postgr.es/m/CABPTF7W+Nody-+P9y4PNk37-QWuLpfUrEonHuEhrX+Vx9Kq+Kw@mail.gmail.com Discussion: https://postgr.es/m/aW13GJn_RfTJIFCa@paquier.xyz	2026-03-06 12:37:40 +09:00
Michael Paquier	42a12856a6	Refactor code retrieving string for RecoveryPauseState This refactoring is going to be useful in an upcoming commit, to avoid some code duplication with the function pg_get_wal_replay_pause_state(), that returns a string for the recovery pause state. Refactoring opportunity noticed while hacking on a different patch. Discussion: https://postgr.es/m/CABPTF7W+Nody-+P9y4PNk37-QWuLpfUrEonHuEhrX+Vx9Kq+Kw@mail.gmail.com	2026-03-06 11:53:23 +09:00
Tom Lane	f95d73ed43	Simplify creation of built-in functions with non-default ACLs. Up to now, to create such a function, one had to make a pg_proc.dat entry and then modify it with GRANT/REVOKE commands, which we put in system_functions.sql. That seems a little ugly though, because it violates the idea of having a single source of truth about the initial contents of pg_proc, and it results in leaving dead rows in the initial contents of pg_proc. This patch improves matters by allowing aclitemin to work during early bootstrap, before pg_authid has been loaded. On the same principle that we use for early access to pg_type details, put a table of known built-in role names into bootstrap.c, and use that in bootstrap mode. To create a built-in function with a non-default ACL, one should write the desired ACL list in its pg_proc.dat entry, using a simplified version of aclitemout's notation: omit the grantor (if it is the bootstrap superuser, which it pretty much always should be) and spell the bootstrap superuser's name as POSTGRES, similarly to the notation used elsewhere in src/include/catalog. This results in entries like proacl => '{POSTGRES=X,pg_monitor=X}' which shows that we've revoked public execute permissions and instead granted that to pg_monitor. In addition to fixing up pg_proc.dat entries, I got rid of some role grants that had been stuck into system_functions.sql, and instead put them into a new file pg_auth_members.dat; that seems like a far less random place to put the information. The correctness of the data changes can be verified by comparing the initial contents of pg_proc and pg_auth_members before and after. pg_proc should match exactly, but the OID column of pg_auth_members will probably be different because those OIDs now get assigned a little earlier in bootstrap. (I forced a catversion bump out of caution, but it wasn't really necessary.) 
Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/183292bb-4891-4c96-a3ca-e78b5e0e1358@dunslane.net	2026-03-05 17:43:09 -05:00
Tom Lane	7664319ccb	Be more wary of false matches in initdb's replace_token(). Do not replace the target string unless the occurrence is surrounded by whitespace or line start/end. This avoids potential false match to a substring of a field. While we've not had trouble with that up to now, the next patch creates hazards of false matches to POSTGRES within an ACL field. There is one call site that needs adjustment, as it was presuming it could write "::1" and have that match "::1/128". For all the others, this restriction is okay and strictly safer. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/183292bb-4891-4c96-a3ca-e78b5e0e1358@dunslane.net	2026-03-05 17:22:31 -05:00
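The matching rule described in the entry above can be sketched with a regular expression. This is a Python approximation under the stated rule (the token must be surrounded by whitespace or line start/end), not initdb's C implementation; the function signature is an assumption.

```python
import re

def replace_token(lines, token, replacement):
    # Match the token only when it is bounded by whitespace or by the
    # start/end of the line, so substrings of larger fields are skipped.
    pattern = re.compile(r"(?<!\S)" + re.escape(token) + r"(?!\S)")
    # A function replacement avoids backslash processing in `replacement`.
    return [pattern.sub(lambda m: replacement, line) for line in lines]

hba_lines = ["host all all ::1/128 trust"]
acl_lines = ["proacl => '{POSTGRES=X,pg_monitor=X}'", "owner POSTGRES here"]

hba_result = replace_token(hba_lines, "::1", "X")
acl_result = replace_token(acl_lines, "POSTGRES", "bob")
```

Here "::1" no longer matches inside "::1/128", and POSTGRES inside the ACL field is left alone while the standalone occurrence is replaced.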
Melanie Plageman	34cb4254bd	Prefix PruneState->all_{visible,frozen} with set_ The PruneState had members called "all_visible" and "all_frozen" which reflect not the current state of the page but the state it could be in once pruning and freezing have been executed. These are then saved in the PruneFreezeResult so the caller can set the VM accordingly. Prefix the PruneState members as well as the corresponding PruneFreezeResult members with "set_" to clarify that they represent the proposed state of the all-visible and all-frozen bits for a heap page in the visibility map, not the current state. Author: Melanie Plageman <melanieplageman@gmail.com> Suggested-by: Andres Freund <andres@anarazel.de> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/bqc4kh5midfn44gnjiqez3bjqv4zogydguvdn446riw45jcf3y%404ez66il7ebvk	2026-03-05 16:55:00 -05:00
Melanie Plageman	68c2dcb913	Add PageGetPruneXid() helper This is similar to the other page accessors in bufpage.h. It improves readability and avoids long lines. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/BD8B69E7-26D8-4706-9164-597C6AE57812%40gmail.com	2026-03-05 16:22:57 -05:00
Melanie Plageman	59663e4207	Move commonly used context into PruneState and simplify helpers heap_page_prune_and_freeze() and many of its helpers use the heap buffer, block number, and page. Other helpers took the heap page and didn't use it. Initializing these values once during prune_freeze_setup() simplifies the helpers' interfaces and avoids any repeated calls to BufferGetBlockNumber() and BufferGetPage(). While updating PruneState, also reorganize its fields to make the layout and member documentation more consistent. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/BD8B69E7-26D8-4706-9164-597C6AE57812%40gmail.com	2026-03-05 16:10:29 -05:00
Tom Lane	ac0accafd6	Exit after fatal errors in client-side compression code. It looks like whoever wrote the astreamer (nee bbstreamer) code thought that pg_log_error() is equivalent to elog(ERROR), but it's not; it just prints a message. So all these places tried to continue on after a compression or decompression error return, with the inevitable result being garbage output and possibly cascading error messages. We should use pg_fatal() instead. These error conditions are probably pretty unlikely in practice, which no doubt accounts for the lack of field complaints. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/1531718.1772644615@sss.pgh.pa.us Backpatch-through: 15	2026-03-05 14:43:21 -05:00
Jacob Champion	a6483f5ac9	oauth: Add TLS support for oauth_validator tests The oauth_validator tests don't currently support HTTPS, which makes testing PGOAUTHCAFILE difficult. Add a localhost certificate to src/test/ssl and make use of it in oauth_server.py. In passing, explain the hardcoded use of IPv4 in our issuer identifier, after intermittent failures on NetBSD led to commit `8d9d5843b`. (The new certificate is still set up for IPv6, to make it easier to improve that behavior in the future.) Patch by Jonathan Gonzalez V., with some additional tests and tweaks by me. Author: Jonathan Gonzalez V. <jonathan.abdiel@gmail.com> Discussion: https://postgr.es/m/8a296a2c128aba924bff0ae48af2b88bf8f9188d.camel@gmail.com	2026-03-05 10:04:53 -08:00
Jacob Champion	b8d7685835	libpq: Add PQgetThreadLock() to mirror PQregisterThreadLock() Allow libpq clients to retrieve the current pg_g_threadlock pointer with PQgetThreadLock(). Single-threaded applications could already do this in a convoluted way: pgthreadlock_t tlock; tlock = PQregisterThreadLock(NULL); PQregisterThreadLock(tlock); /* re-register the callback */ /* use tlock */ But a generic library can't do that without potentially breaking concurrent libpq connections. The motivation for doing this now is the libpq-oauth plugin, which currently relies on direct injection of pg_g_threadlock, and should ideally not. Reviewed-by: Zsolt Parragi <zsolt.parragi@percona.com> Discussion: https://postgr.es/m/CAOYmi%2BmEU_q9sr1PMmE-4rLwFN%3DOjyndDwFZvpsMU3RNJLrM9g%40mail.gmail.com Discussion: https://postgr.es/m/CAOYmi%2B%3DMHD%2BWKD4rsTn0v8220mYfyLGhEc5EfhmtqrAb7SmC5g%40mail.gmail.com	2026-03-05 10:04:48 -08:00
Jacob Champion	f8c0b91a60	oauth: Report cleanup errors as warnings on stderr Using conn->errorMessage for these "shouldn't-happen" cases will only work if the connection itself fails. Our SSL and password callbacks print WARNINGs when they find themselves in similar situations, so follow their lead. Reviewed-by: Zsolt Parragi <zsolt.parragi@percona.com> Discussion: https://postgr.es/m/CAOYmi%2BmEU_q9sr1PMmE-4rLwFN%3DOjyndDwFZvpsMU3RNJLrM9g%40mail.gmail.com	2026-03-05 10:04:36 -08:00
Alexander Korotkov	177037341a	Fix handling of updated tuples in the MERGE statement This branch missed the IsolationUsesXactSnapshot() check. That led to EPQ on repeatable read and serializable isolation levels. This commit fixes the issue and provides a simple isolation check for that. Backpatch through v15 where MERGE statement was introduced. Reported-by: Tender Wang <tndrwang@gmail.com> Discussion: https://postgr.es/m/CAPpHfdvzZSaNYdj5ac-tYRi6MuuZnYHiUkZ3D-AoY-ny8v%2BS%2Bw%40mail.gmail.com Author: Tender Wang <tndrwang@gmail.com> Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com> Backpatch-through: 15	2026-03-05 19:49:28 +02:00
Fujii Masao	bffd7130e9	Improve validation of recovery_target_xid GUC values. Previously, the recovery_target_xid GUC values were not sufficiently validated. As a result, clearly invalid inputs such as the string "bogus", a decimal value like "1.1", or 0 (a transaction ID smaller than the minimum valid value of 3) were unexpectedly accepted. In these cases, the value was interpreted as transaction ID 0, which could cause recovery to behave unexpectedly. This commit improves validation of recovery_target_xid GUC so that invalid values are rejected with an error. This prevents recovery from proceeding with misconfigured recovery_target_xid settings. Also this commit updates the documentation to clarify the allowed values for recovery_target_xid GUC. Author: David Steele <david@pgbackrest.org> Reviewed-by: Hüseyin Demir <huseyin.d3r@gmail.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/f14463ab-990b-4ae9-a177-998d2677aae0@pgbackrest.org	2026-03-05 21:40:32 +09:00
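The kind of validation described in the entry above might look like the sketch below. It is written in Python with a hypothetical function name, and it encodes only what the message states: non-numeric strings, decimals, and values below the minimum valid transaction ID of 3 are rejected. Any upper bound or alternative input format accepted by the real GUC check is outside this sketch.

```python
FIRST_NORMAL_XID = 3  # minimum valid value, per the commit message

def is_valid_recovery_target_xid(value):
    # Hypothetical validator, not the server's check hook.
    text = value.strip()
    # Require plain ASCII digits: rejects "", "bogus", "1.1", "-5".
    if not (text.isascii() and text.isdigit()):
        return False
    # Reject values below the first normal transaction ID (e.g. 0).
    return int(text) >= FIRST_NORMAL_XID
```

Before the fix, inputs such as "bogus" or "1.1" were silently interpreted as transaction ID 0; a strict parse makes the misconfiguration visible instead.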
Fujii Masao	9b0e5bd532	doc: Clarify that COLUMN is optional in ALTER TABLE ... ADD/DROP COLUMN. In ALTER TABLE ... ADD/DROP COLUMN, the COLUMN keyword is optional. However, part of the documentation could be read as if COLUMN were required, which may mislead users about the command syntax. This commit updates the ALTER TABLE documentation to clearly state that COLUMN is optional for ADD and DROP. Also this commit adds regression tests covering ALTER TABLE ... ADD/DROP without the COLUMN keyword. Backpatch to all supported versions. Author: Chao Li <lic@highgo.com> Reviewed-by: Robert Treat <rob@xzilla.net> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/CAEoWx2n6ShLMOnjOtf63TjjgGbgiTVT5OMsSOFmbjGb6Xue1Bw@mail.gmail.com Backpatch-through: 14	2026-03-05 12:55:52 +09:00
Michael Paquier	5f8124a0cf	Move definition of XLogRecoveryCtlData to xlogrecovery.h XLogRecoveryCtlData is the structure that stores the shared-memory state of WAL recovery, including information such as promotion requests, the timeline ID (TLI), and the LSNs of replayed records. This refactoring is independently useful because it allows code outside of core to access the live recovery state. It will be used by an upcoming patch that introduces a SQL function for querying this information, that can be accessed on a standby once a consistent state has been reached. This only moves code around, changing nothing functionally. Author: Xuneng Zhou <xunengzhou@gmail.com> Discussion: https://postgr.es/m/CABPTF7W+Nody-+P9y4PNk37-QWuLpfUrEonHuEhrX+Vx9Kq+Kw@mail.gmail.com	2026-03-05 12:17:47 +09:00
Michael Paquier	ea4744782b	Fix rare instability in recovery TAP test 004_timeline_switch This fixes a problem similar to `ad8c86d22c`. In this case, the test could fail under the following circumstances: - The primary is stopped with teardown_node(), meaning that it may not be able to send all its WAL records to standby_1 and standby_2. - If standby_2 receives more records than standby_1, attempting to reconnect standby_2 to the promoted standby_1 would fail because of a timeline fork. This race condition is fixed with a simple trick: instead of tearing down the primary, it is stopped cleanly so that all the WAL records of the primary are received and flushed by both standby_1 and standby_2. Once we do that, there is no need for a wait_for_catchup() before stopping the node. The test wants to check that a timeline jump can be achieved when reconnecting a standby to a promoted standby in the same cluster, hence an immediate stop of the primary is not required. This failure is harder to reach than the previous instability of 009_twophase, still the buildfarm has been able to detect this failure at least once. I have tried Alexander Lakhin's test trick with the bgwriter and very aggressive standby snapshots, but I could not reproduce it directly. It is reachable, as the buildfarm has proved. Backpatch down to all supported branches, as this problem can lead to spurious failures in the buildfarm. Discussion: https://postgr.es/m/493401a8-063f-436a-8287-a235d9e065fc@gmail.com Backpatch-through: 14	2026-03-05 10:05:44 +09:00
Michael Paquier	34dfca2934	Change default value of default_toast_compression to "lz4", take two The default value for default_toast_compression was "pglz". The main reason for this choice is that this option is always available, pglz code being embedded in Postgres. However, it is known that LZ4 is more efficient than pglz: less CPU required, more compression on average. As of this commit, the default value of default_toast_compression becomes "lz4", if available. By switching to LZ4 as the default, users should see natural speedups on TOAST data reads and/or writes. Support for LZ4 in TOAST compression was added in Postgres v14, or 5 releases ago. This should be long enough to consider this feature as stable. While at it, quotes are removed from default_toast_compression in postgresql.conf.sample. Quotes are not required in this case. The in-place value replacement done by initdb if the build supports LZ4 would not use them in the postgresql.conf file added to a freshly-initialized cluster. Note that this is a version lighter than `7c1849311e`, that included a replacement of --with-lz4 by --without-lz4 in configure builds, forcing a requirement for LZ4 in all environments. The buildfarm did not like it, at all. This commit switches default_toast_compression to lz4 as default only when --with-lz4 is defined, which should keep the buildfarm at bay while still allowing users to benefit from LZ4 compression in TOAST as long as the code is compiled with it. Author: Euler Taveira <euler@eulerto.com> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Reviewed-by: Aleksander Alekseev <aleksander@tigerdata.com> Discussion: https://posgr.es/m/435df33a-129e-4f0c-a803-f3935c5a5ecb@eisentraut.org	2026-03-05 09:24:35 +09:00
Michael Paquier	4f0b3afab4	Revert "Change default value of default_toast_compression to "lz4"" This reverts commit `7c1849311e`, due to the fact that more than 60% of the buildfarm members do not have lz4 installed. As we are in the last commit fest of the development cycle, and that it could take a couple of weeks to stabilize things, this change is reverted for now. This commit will be reworked in a lighter version, as default_toast_compression's default can be changed to "lz4" without the switch from --with-lz4 to --without-lz4. This approach will keep the buildfarm at bay, and still allow builds to take advantage of LZ4 in TOAST by default, as long as the code is compiled with LZ4 support. A harder requirement based on LZ4 should be achievable at some point, but it is going to require some work from the buildfarm owners first. Perhaps this part could be revisited at the beginning of the next development cycle. Discussion: https://postgr.es/m/CAOYmi+meTT0NbLbnVqOJD5OKwCtHL86PQ+RZZTrn6umfmHyWaw@mail.gmail.com	2026-03-05 08:25:35 +09:00
Andrew Dunstan	3c19983cc0	pg_restore: add --no-globals option to skip globals This is a followup to commit `763aaa06f0` Add non-text output formats to pg_dumpall. Add a --no-globals option to pg_restore that skips restoring global objects (roles and tablespaces) when restoring from a pg_dumpall archive. When -C/--create is not specified, databases that do not already exist on the target server are also skipped. This is useful when restoring only specific databases from a pg_dumpall archive without needing the global objects to be restored first. Author: Mahendra Singh Thalor <mahi6run@gmail.com> With small tweaks by me. Discussion: https://postgr.es/m/CAKYtNArdcc5kx1MdTtTKFNYiauo3=zCA-NB0LmBCW-RU_kSb3A@mail.gmail.com	2026-03-04 16:53:29 -05:00
Andrew Dunstan	c7572cd48d	Improve writing map.dat preamble Fix code from commit `763aaa06f0` Suggestion from Alvaro Herrera following a bug discovered by Coverity.	2026-03-04 16:08:04 -05:00
Andrew Dunstan	01c729e0c7	Fix casting away const-ness in pg_restore.c This was introduced in commit `763aaa06f0` per gripe from Peter Eisentraut. Author: Mahendra Singh Thalor <mahi6run@gmail.com> Slightly tweaked by me. Discussion: https://postgr.es/m/016819c0-666e-42a8-bfc8-2b93fd8d0176@eisentraut.org	2026-03-04 15:54:02 -05:00
Tom Lane	e6a1d8f5ac	Fix estimate_hash_bucket_stats's correction for skewed data. The previous idea was "scale up the bucketsize estimate by the ratio of the MCV's frequency to the average value's frequency". But we should have been suspicious of that plan, since it frequently led to impossible (> 1) values which we had to apply an ad-hoc clamp to. Joel Jacobson demonstrated that it sometimes leads to making the wrong choice about which side of the hash join should be inner. Instead, drop the whole business of estimating average frequency, and just clamp the bucketsize estimate to be at least the MCV's frequency. This corresponds to the bucket size we'd get if only the MCV appears in a bucket, and the MCV's frequency is not affected by the WHERE-clause filters. (We were already making the latter assumption.) This also matches the coding used since `4867d7f62` in the case where only a default ndistinct estimate is available. Interestingly, this change affects no existing regression test cases. Add one to demonstrate that it helps pick the smaller table to be hashed when the MCV is common enough to affect the results. This leaves estimate_hash_bucket_stats not considering the effects of null join keys at all, which we should probably improve. However, I have a different patch in the queue that will change the executor's handling of null join keys, so it seems appropriate to wait till that's in before doing anything more here. Reported-by: Joel Jacobson <joel@compiler.org> Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Joel Jacobson <joel@compiler.org> Discussion: https://postgr.es/m/341b723c-da45-4058-9446-1514dedb17c1@app.fastmail.com	2026-03-04 15:33:15 -05:00
Tom Lane	c70f6dc6bd	Fix yet another bug in archive streamer with LZ4 decompression. The code path in astreamer_lz4_decompressor_content() that updated the output pointers when the output buffer isn't full was wrong. It advanced next_out by bytes_written, which could include previous decompression output not just that of the current cycle. The correct amount to advance is out_size. While at it, make the output pointer updates look more like the input pointer updates. This bug is pretty hard to reach, as it requires consecutive compression frames that are too small to fill the output buffer. pg_dump could have produced such data before `66ec01dc4`, but I'm unsure whether any files we use astreamer with would be likely to contain problematic data. Author: Chao Li <lic@highgo.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/0594CC79-1544-45DD-8AA4-26270DE777A7@gmail.com Backpatch-through: 15	2026-03-04 12:08:37 -05:00
Álvaro Herrera	ce4fbe1ac6	Don't malloc(0) in EventTriggerCollectAlterTSConfig Author: Florin Irion <florin.irion@enterprisedb.com> Discussion: https://postgr.es/m/c6fff161-9aee-4290-9ada-71e21e4d84de@gmail.com	2026-03-04 15:04:53 +01:00
Amit Kapila	fd366065e0	Allow table exclusions in publications via EXCEPT TABLE. Extend CREATE PUBLICATION ... FOR ALL TABLES to support the EXCEPT TABLE syntax. This allows one or more tables to be excluded. The publisher will not send the data of excluded tables to the subscriber. To support this, pg_publication_rel now includes a prexcept column to flag excluded relations. For partitioned tables, the exclusion is applied at the root level; specifying a root table excludes all current and future partitions in that tree. Follow-up work will implement ALTER PUBLICATION support for managing these exclusions. Author: vignesh C <vignesh21@gmail.com> Author: Shlok Kyal <shlok.kyal.oss@gmail.com> Reviewed-by: shveta malik <shveta.malik@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Peter Smith <smithpb2250@gmail.com> Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com> Reviewed-by: Zhijie Hou <houzj.fnst@fujitsu.com> Reviewed-by: Nisha Moond <nisha.moond412@gmail.com> Reviewed-by: David G. Johnston <david.g.johnston@gmail.com> Reviewed-by: Ashutosh Sharma <ashu.coek88@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Andrei Lepikhov <lepihov@gmail.com> Discussion: https://postgr.es/m/CALDaNm3=JrucjhiiwsYQw5-PGtBHFONa6F7hhWCXMsGvh=tamA@mail.gmail.com	2026-03-04 15:56:48 +05:30
Heikki Linnakangas	fe08113aef	Add test for row-locking and multixids with prepared transactions This is a repro for the issue fixed in commit `ccae90abdb`. Backpatch to v17 like that commit, although that's a little arbitrary as this test would work on older versions too. Author: Sami Imseih <samimseih@gmail.com> Discussion: https://www.postgresql.org/message-id/CAA5RZ0twq5bNMq0r0QNoopQnAEv+J3qJNCrLs7HVqTEntBhJ=g@mail.gmail.com Backpatch-through: 17	2026-03-04 11:29:02 +02:00
Heikki Linnakangas	19615a44b3	Skip prepared_xacts test if max_prepared_transactions < 2 This reduces maintenance overhead, as we no longer need to update the dummy expected output file every time the .sql file changes. Discussion: https://www.postgresql.org/message-id/1009073.1772551323@sss.pgh.pa.us Backpatch-through: 14	2026-03-04 11:06:43 +02:00
Michael Paquier	ad8c86d22c	Fix rare instability in recovery TAP test 009_twophase The phase of the test where we want to check that 2PC transactions prepared on a primary can be committed on a promoted standby relied on an immediate stop of the primary. This logic has a race condition: it could be possible that some records (most likely standby snapshot records) are generated on the primary before it finishes its shutdown, without the promoted standby knowing about them. When the primary is recycled as new standby, the test could fail because of a timeline fork as an effect of these extra records. This fix takes care of the instability by doing a clean stop of the primary instead of a teardown (aka immediate stop), so that all records generated on the primary are sent to the promoted standby and flushed there. There is no need for a teardown of the primary in this test scenario: the commit of 2PC transactions on a promoted standby does not care about the state of the primary, only of the standby. This race is very hard to hit in practice, even slow buildfarm members like skink have a very low rate of reproduction. Alexander Lakhin has come up with a recipe to improve the reproduction rate a lot: - Enable -DWAL_DEBUG. - Patch the bgwriter so that standby snapshots are generated every millisecond. - Run 009_twophase tests under heavy parallelism. With this method, the failure appears after a couple of iterations. With the fix in place, I have been able to run more than 50 iterations of the parallel test sequence, without seeing a failure. Issue introduced in `30820982b2`, due to a copy-pasto coming from the surrounding tests. Thanks also to Hayato Kuroda for digging into the details of the failure. He has proposed a fix different than the one of this commit. Unfortunately, it relied on injection points, a feature only available in v17. The solution of this commit is simpler, and can be applied to v14~v16.
Reported-by: Alexander Lakhin <exclusion@gmail.com> Discussion: https://postgr.es/m/b0102688-6d6c-c86a-db79-e0e91d245b1a@gmail.com Backpatch-through: 14	2026-03-04 16:30:51 +09:00
Michael Paquier	7c1849311e	Change default value of default_toast_compression to "lz4", when available The default value for default_toast_compression was "pglz". The main reason for this choice is that this option is always available, pglz code being embedded in Postgres. However, it is known that LZ4 is more efficient than pglz: less CPU required, more compression on average. As of this commit, the default value of default_toast_compression becomes "lz4", if available. By switching to LZ4 as the default, users should see natural speedups on TOAST data reads and/or writes. Support for LZ4 in TOAST compression was added in Postgres v14, or 5 releases ago. This should be long enough to consider this feature as stable. --with-lz4 is removed, replaced by a --without-lz4 to disable LZ4 in the builds on an option-basis, following a practice similar to readline or ICU. References to --with-lz4 are removed from the documentation. While at it, quotes are removed from default_toast_compression in postgresql.conf.sample. Quotes are not required in this case. The in-place value replacement done by initdb if the build supports LZ4 would not use them in the postgresql.conf file added to a freshly-initialized cluster. For the reference, a similar switch has been done with ICU in `fcb21b3acd`. Some of the changes done in this commit are consistent with that. Note: this is going to create some disturbance in the buildfarm, in environments where lz4 is not installed. Author: Euler Taveira <euler@eulerto.com> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Reviewed-by: Aleksander Alekseev <aleksander@tigerdata.com> Discussion: https://posgr.es/m/435df33a-129e-4f0c-a803-f3935c5a5ecb@eisentraut.org	2026-03-04 13:05:31 +09:00
Richard Guo	1f4f87d794	Remove redundant restriction checks in apply_child_basequals In apply_child_basequals, after translating a parent relation's restriction quals for a child relation, we simplify each child qual by calling eval_const_expressions. Historically, the code then called restriction_is_always_false and restriction_is_always_true to reduce NullTest quals that are provably false or true. However, since commit `e2debb643`, the planner natively performs NullTest deduction during constant folding. Therefore, calling restriction_is_always_false and restriction_is_always_true immediately afterward is redundant and wastes CPU cycles. We can safely remove them and simply rely on the constant folding to handle the deduction. Author: Richard Guo <guofenglinux@gmail.com> Discussion: https://postgr.es/m/CAMbWs4-vLmGXaUEZyOMacN0BVfqWCt2tM-eDVWdDfJnOQaauGg@mail.gmail.com	2026-03-04 10:57:43 +09:00
Richard Guo	ce1c17a316	Remove obsolete SAMESIGN macro The SAMESIGN macro was historically used as a helper for manual integer overflow checks. However, since commit `4d6ad3125` introduced overflow-aware integer operations, this manual sign-checking logic is no longer necessary. The macro remains defined in brin_minmax_multi.c and timestamp.c, but is not used in either file. This patch removes these definitions to clean things up. Author: Richard Guo <guofenglinux@gmail.com> Discussion: https://postgr.es/m/CAMbWs4-NL3J3hQ3LzrwV-YUkQC18P+jM7ZiegQyAHzgdZev2qg@mail.gmail.com	2026-03-04 10:56:06 +09:00
Michael Paquier	9ef6381829	Add some tests for CREATE OR REPLACE VIEW with column additions When working on an already-defined view with matching attributes, CREATE OR REPLACE VIEW would internally generate an ALTER TABLE command with a set of AT_AddColumnToView sub-commands, one for each attribute added. Such a command is stored in event triggers twice: - Once as a simple command. - Once as an ALTER TABLE command, as it has sub-commands. There was no test coverage to track this command pattern in terms of event triggers and DDL deparsing: - For the test module test_ddl_deparse, two command notices are issued. - For event triggers, a CREATE VIEW command is logged twice, which may look a bit weird first, but again this maps with the internal behavior of how the commands are built, and how the event trigger code reacts in terms of commands gathered. While on it, this adds a test for CREATE SCHEMA with a CREATE VIEW command embedded in it, case supported by the grammar but not covered yet. This hole in the test coverage has been found while digging into what would be a similar behavior for sequences if adding attributes to them with ALTER TABLE variants, after the initial relation creation. Discussion: https://postgr.es/m/aaFG9bqkEn0RhLJG@paquier.xyz	2026-03-04 09:55:58 +09:00
Melanie Plageman	38229cb905	Add read_stream_{pause,resume}() Read stream users can now pause lookahead when no blocks are currently available. After resuming, subsequent read_stream_next_buffer() calls continue lookahead with the previous lookahead distance. This is especially useful for read stream users with self-referential access patterns (where consuming already-read buffers can produce additional block numbers). Author: Thomas Munro <thomas.munro@gmail.com> Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com> Discussion: https://postgr.es/m/CA%2BhUKGJLT2JvWLEiBXMbkSSc5so_Y7%3DN%2BS2ce7npjLw8QL3d5w%40mail.gmail.com	2026-03-03 16:03:09 -05:00
Peter Eisentraut	b30656ce00	doc: Add restart on failure to example systemd file The documentation previously had a systemd unit file that would not attempt to recover from process failures such as OOMs, segfaults, etc. This commit adds "Restart=on-failure", which tells systemd to attempt to restart the process after failure. This is the recommended configuration per the systemd documentation: "Setting this to on-failure is the recommended choice for long-running services". Many PostgreSQL users will simply copy/paste what the PostgreSQL documentation recommends and will probably do their own research and change the service file to restart on failure, so might as well set this as the default in the PostgreSQL documentation. Author: Andrew Jackson <andrewjackson947@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CAKK5BkFfMpAQnv8CLs%3Di%3DrZwurtCV_gmfRb0uZi-V%2Bd6wcryqg%40mail.gmail.com	2026-03-03 13:18:53 +01:00
Álvaro Herrera	cece37c984	Reduce scope of for-loop-local variables to avoid shadowing Adjust a couple of for-loops where a local variable was shadowed by another in the same scope, by renaming it as well as reducing its scope to the containing for-loop. Author: Chao Li <lic@highgo.com> Reviewed-by: Peter Smith <smithpb2250@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/CAEoWx2kQ2x5gMaj8tHLJ3=jfC+p5YXHkJyHrDTiQw2nn2FJTmQ@mail.gmail.com	2026-03-03 11:24:11 +01:00
Peter Eisentraut	f2d7570cdd	Reduce the scope of volatile qualifiers Commit `c66a7d75e6` introduced a new "cast discards ‘volatile’" warning (-Wcast-qual) in vac_truncate_clog(). Instead of making use of unvolatize(), remove the warning by reducing the scope of the volatile qualifier (added in commit `2d2e40e3be`) to only 2 fields. Also do the same for vac_update_datfrozenxid(), since the intent of commit `f65ab862e3` was to prevent the same kind of race condition that commit `2d2e40e3be` was fixing. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Suggested-by: Peter Eisentraut <peter@eisentraut.org> Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/aZ3a%2BV82uSfEjDmD%40ip-10-97-1-34.eu-west-3.compute.internal	2026-03-03 10:02:28 +01:00
Peter Eisentraut	2a525cc97e	Add COPY (on_error set_null) option If ON_ERROR SET_NULL is specified during COPY FROM, any data type conversion errors will result in the affected column being set to a null value. A column's not-null constraints are still enforced, and attempting to set a null value in such columns will raise a constraint violation error. This applies to a column whose data type is a domain with a NOT NULL constraint. Author: Jian He <jian.universality@gmail.com> Author: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com> Reviewed-by: Jim Jones <jim.jones@uni-muenster.de> Reviewed-by: "David G. Johnston" <david.g.johnston@gmail.com> Reviewed-by: Yugo NAGATA <nagata@sraoss.co.jp> Reviewed-by: torikoshia <torikoshia@oss.nttdata.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Atsushi Torikoshi <torikoshia@oss.nttdata.com> Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://www.postgresql.org/message-id/flat/CAKFQuwawy1e6YR4S%3Dj%2By7pXqg_Dw1WBVrgvf%3DBP3d1_aSfe_%2BQ%40mail.gmail.com	2026-03-03 07:37:12 +01:00
Michael Paquier	a1bd0c1615	doc: Fix sentence of pg_walsummary page Author: Peter Smith <smithpb2250@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Robert Treat <rob@xzilla.net> Discussion: https://postgr.es/m/CAHut+PvfYBL-ppX-i8DPeRu7cakYCZz+QYBhrmQzicx7z_Tj5w@mail.gmail.com Backpatch-through: 17	2026-03-03 15:27:50 +09:00
Fujii Masao	bae42a54e3	doc: Clarify that empty COMMENT string removes the comment. Clarify the documentation of COMMENT ON to state that specifying an empty string is treated as NULL, meaning that the comment is removed. This makes the behavior explicit and avoids possible confusion about how empty strings are handled. Also adds regress test cases that use empty string to remove a comment. Backpatch to all supported versions. Author: Chao Li <lic@highgo.com> Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-by: David G. Johnston <david.g.johnston@gmail.com> Reviewed-by: Shengbin Zhao <zshengbin91@gmail.com> Reviewed-by: Jim Jones <jim.jones@uni-muenster.de> Reviewed-by: zhangqiang <zhang_qiang81@163.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/26476097-B1C1-4BA8-AA92-0AD0B8EC7190@gmail.com Backpatch-through: 14	2026-03-03 14:45:52 +09:00
Michael Paquier	ba97bf9cb7	Add support for "exprs" in pg_restore_extended_stats() This commit adds support for the restore of extended statistics of the kind "exprs", counting for the statistics data computed for expressions. The input format consists of a jsonb object which must be an array of objects which are keyed by statistics parameter names, like this: [{"stat_type1": "...", "stat_type2": "...", ...}, {"stat_type1": "...", "stat_type2": "...", ...}, ...] The outer array must have as many elements as there are expressions defined in the statistics object, mapping with the way extended statistics are built with one pg_statistic tuple stored for each expression whose statistics have been computed. The elements of the array must be either objects or null values (equivalent of invalid data, case also supported by the stats computations when its data is inserted in the catalogs). The keys of the inner objects are names of the statistical columns in pg_stats_ext_exprs (i.e. everything after "inherited"). Not all parameter keys need to be provided, those omitted being silently ignored. Key values that do not match a statistical column name will cause a warning to be issued, but do not otherwise fail the expression or the import as a whole. The expected value type for all parameters is jbvString, which allows us to validate the values using the input function specific to that parameter. Any parameters with a null value are silently ignored, same as if they were not provided in the first place. This commit includes a battery of test cases: - Sanity checks for what-should-be-all the failures in restore code paths, including parsing errors, parameter sanity checks depending on the extended stats object definition, etc. - Value injection, for scalar, array, range, multi-range cases. - Stats data cloning, with differential checks between the source relation and its target. The source and the target should hold the same stats data after restore. 
- While expressions are supported in extended statistics since v14, range_length_histogram, range_empty_frac, and range_bounds_histogram have been added to pg_stats_ext_exprs only in v19. A test case has been added to emulate a dump taken from v18, with expression stats restored for a range data type where these three fields are NULL. Support for pg_dump is included, with expressions supported since v14, inherited since v15, and data for range types in expressions in v19. pg_upgrade is the main use-case of this feature; it is also possible to inject statistics, same as for the other extstat kinds. As of this commit, ANALYZE should not be required after pg_upgrade when the cluster being upgraded uses extended statistics, as MCV, dependencies, expressions and ndistinct stats are all covered. The stats data related to range types used in expressions requires v19, whose support has also been added. Author: Corey Huinker <corey.huinker@gmail.com> Co-authored-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CADkLM=fPcci6oPyuyEZ0F4bWqAA7HzaWO+ZPptufuX5_uWt6kw@mail.gmail.com	2026-03-03 14:19:54 +09:00
Jeff Davis	11171fe1fc	style: define parameterless functions as foo(void). Change pg_icu_unicode_version() to pg_icu_unicode_version(void), introduced by commit `af2d4ca191`. See commit `9b05e2ec08`, which fixed similar cases. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/aaEhpwrj1FY/8/7n@ip-10-97-1-34.eu-west-3.compute.internal	2026-03-02 20:12:38 -08:00
Tom Lane	cdaa675658	Fix local-variable shadowing in pg_trgm's printSourceNFA(). We hadn't noticed this violation of -Wshadow=compatible-local because this function isn't compiled without -DTRGM_REGEXP_DEBUG. As long as we have to clean it up, let's do so by converting all this function's loops to use C99 loop-local control variables. Reported-by: Sergei Kornilov <sk@zsrv.org> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/3009911772478436@08341ecb-668d-43a9-af4d-b45f00c72521	2026-03-02 14:40:29 -05:00
Nathan Bossart	f191dc6766	Add commit `7b24959434` to .git-blame-ignore-revs.	2026-03-02 13:23:28 -06:00
Nathan Bossart	cc774c543b	basic_archive: Allow archive directory to be missing at startup. Presently, the GUC check hook for basic_archive.archive_directory checks that the specified directory exists. Consequently, if the directory does not exist at server startup, archiving will be stuck indefinitely, even if it appears later. To fix, remove this check from the hook so that archiving will resume automatically once the directory is present. basic_archive must already be prepared to deal with the directory disappearing at any time, so no additional special handling is required. Reported-by: Олег Самойлов <splarv@ya.ru> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Sergei Kornilov <sk@zsrv.org> Discussion: https://postgr.es/m/73271769675212%40mail.yandex.ru Backpatch-through: 15	2026-03-02 13:12:25 -06:00
Heikki Linnakangas	ccae90abdb	Fix OldestMemberMXactId and OldestVisibleMXactId array usage Commit `ab355e3a88` changed how the OldestMemberMXactId array is indexed. It's no longer indexed by synthetic dummyBackendId, but with ProcNumber. The PGPROC entries for prepared xacts come after auxiliary processes in the allProcs array, which rendered the calculation for MaxOldestSlot and the indexes into the array incorrect. (The OldestVisibleMXactId array is not used for prepared xacts, and thus never accessed with ProcNumbers greater than MaxBackends, so this only affects the OldestMemberMXactId array.) As a result, a prepared xact would store its value past the end of the OldestMemberMXactId array, overflowing into the OldestVisibleMXactId array. That could cause a transaction's row lock to appear invisible to other backends, or other such visibility issues. With a very small max_connections setting, the store could even go beyond the OldestVisibleMXactId array, stomping over the first element in the BufferDescriptor array. To fix, calculate the array sizes more precisely, and introduce helper functions to calculate the array indexes correctly. Author: Yura Sokolov <y.sokolov@postgrespro.ru> Reviewed-by: Sami Imseih <samimseih@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/7acc94b0-ea82-4657-b1b0-77842cb7a60c@postgrespro.ru Backpatch-through: 17	2026-03-02 19:19:22 +02:00
Álvaro Herrera	344b572e3e	psql: Tab-complete ALTER ROLE ... IN DATABASE SET/RESET Detailed completion of the RESET clause is still missing. Not sure a detailed implementation is worth the trouble. Author: Ian Lawrence Barwick <barwick@gmail.com> Author: Vasuki M <vasukianand0119@gmail.com> Reviewed-by: zengman <zengman@halodbtech.com> Reviewed-by: Dharin Shah <dharinshah95@gmail.com> Reviewed-by: Surya Poondla <suryapoondla4@gmail.com> Discussion: https://postgr.es/m/CAB8KJ=iH_v1YB2ss1A=BqvOAf28OVYiWRqUdE6TJ3pP-RdsPig@mail.gmail.com	2026-03-02 18:03:44 +01:00
Tom Lane	74b4438a70	In pg_dumpall, don't skip role GRANTs with dangling grantor OIDs. In commits `29d75b25b` et al, I made pg_dumpall's dumpRoleMembership logic treat a dangling grantor OID the same as dangling role and member OIDs: print a warning and skip emitting the GRANT. This wasn't terribly well thought out; instead, we should handle the case by emitting the GRANT without the GRANTED BY clause. When the source database is pre-v16, such cases are somewhat expected because those versions didn't prevent dropping the grantor role; so don't even print a warning that we did this. (This change therefore restores pg_dumpall's pre-v16 behavior for these cases.) The case is not expected in >= v16, so then we do print a warning, but soldiering on with no GRANTED BY clause still seems like a reasonable strategy. Per complaint from Robert Haas that we were now dropping GRANTs altogether in easily-reachable scenarios. Reported-by: Robert Haas <robertmhaas@gmail.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CA+TgmoauoiW4ydDhdrseg+DD4Kwha=+TSZp18BrJeHKx3o1Fdw@mail.gmail.com Backpatch-through: 16	2026-03-02 11:15:10 -05:00
Melanie Plageman	8b9d42bf6b	Save prune cycles by consistently clearing prune hints on all-visible pages All-visible pages can't contain prunable tuples. We already clear the prune hint (pd_prune_xid) during pruning of all-visible pages, but we were not doing so in vacuum phase three, nor initializing it for all-frozen pages created by COPY FREEZE, and we were not clearing it on standbys. Because page hints are not WAL-logged, pages on a standby carry stale pd_prune_xid values. After promotion, that stale hint triggers unnecessary on-access pruning. Fix this by clearing the prune hint everywhere we currently mark a heap page all-visible. Clearing it when setting PD_ALL_VISIBLE ensures no extra overhead. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/flat/CAAKRu_b-BMOyu0X-0jc_8bWNSbQ5K6JTEueayEhcQuw-OkCSKg%40mail.gmail.com	2026-03-02 11:05:59 -05:00
Peter Eisentraut	1887d822f1	Support using copyObject in standard C++ Calling copyObject in C++ without GNU extensions (e.g. when using -std=c++11 instead of -std=gnu++11) fails with an error like this: error: use of undeclared identifier 'typeof'; did you mean 'typeid' This is due to the C compiler used to compile PostgreSQL supporting typeof, but that function actually not being present in the C++ compiler. This fixes that by explicitly checking for typeof support in C++, and then either use that or define typeof ourselves as: std::remove_reference_t<decltype(x)> According to the paper that led to adding typeof to the C standard, that's the C++ equivalent of the C typeof: https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2927.htm#existing-decltype Author: Jelte Fennema-Nio <postgres@jeltef.nl> Discussion: https://www.postgresql.org/message-id/flat/DGPW5WCFY7WY.1IHCDNIVVT300%2540jeltef.nl	2026-03-02 11:48:13 +01:00
Peter Eisentraut	386ca3908d	Check for memset_explicit() and explicit_memset() We can use either of these to implement a missing explicit_bzero(). explicit_memset() is supported on NetBSD. NetBSD hitherto didn't have a way to implement explicit_bzero() other than the fallback variant. memset_explicit() is the C23 standard, so we use it as first preference. It is currently supported on: - NetBSD 11 - FreeBSD 15 - glibc 2.43 It doesn't provide additional coverage, but as it's the new standard, its availability will presumably grow. Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/c4701776-8d99-41da-938d-88528a3adc15%40eisentraut.org	2026-03-02 07:51:19 +01:00
Michael Paquier	f68d7e7483	Remove WAL page header flag XLP_BKP_REMOVABLE There are no known users of this flag. The last supposed user was pglesslog, which is the reason why this flag has been introduced in core, based on an historical search pointing at `a8d539f124`. I have mentioned that we may want to remove this flag back in 2018, due to zero users of it in core. More recently, Noah has pointed out that this flag is not safe to use: XLP_BKP_REMOVABLE can be set by the WAL writer in a lock-free fashion with runningBackups > 0, meaning that some full-page images could be required but not logged, ultimately corrupting backups. Bump XLOG_PAGE_MAGIC. Author: Matthias van de Meent <boekewurm+postgres@gmail.com> Discussion: https://postgr.es/m/20250705001628.c3.nmisch@google.com Discussion: https://postgr.es/m/CAEze2WhiwKSoAvfUggjDeoeY0-rz9cTpfrHcqvBMmJxv-K_5DA@mail.gmail.com	2026-03-02 14:13:05 +09:00
Michael Paquier	f7dc17aa91	Fix memory allocation size in RegisterExtensionExplainOption() The allocations used for the static array ExplainExtensionOptionArray, that tracks a set of ExplainExtensionOption, used "char *" instead of ExplainExtensionOption as the memory size consumed by one element, underestimating the memory required by half. The initial allocation of ExplainExtensionNameArray wants to hold 16 elements before being reallocated, and with "char *" it meant that there was enough space only for 8 ExplainExtensionOption elements, 16 bytes required for each element. The backend would crash once one tries to register a 9th EXPLAIN option. As far as I can see, the allocation formulas of GetExplainExtensionId() have been copy-pasted to RegisterExtensionExplainOption(), but the internal maths of the copy were not adjusted accordingly. Oversight in `c65bc2e1d1`. Author: Joel Jacobson <joel@compiler.org> Discussion: https://postgr.es/m/2a4bd2f5-2a2f-409f-8ac7-110dd3fad4fc@app.fastmail.com Backpatch-through: 18	2026-03-02 13:14:15 +09:00
Michael Paquier	2176520089	test_custom_types: Test module with fancy custom data types This commit adds a new test module called "test_custom_types", that can be used to stress code paths related to custom data type implementations. Currently, this is used as a test suite to validate the set of fixes done in `3b7a6fa157`, that requires some typanalyze callbacks that can force very specific backend behaviors, as of: - typanalyze callback that returns "false" as status, to mark a failure in computing statistics. - typanalyze callback that returns "true" but lets the backend know that no interesting stats could be computed, with stats_valid set to "false". This could be extended more in the future if more problems are found. For simplicity, the module uses a fake int4 data type, that requires a btree operator class to be usable with extended statistics. The type is created by the extension, and its properties are altered in the test. Like `3b7a6fa157`, this module is backpatched down to v14, for coverage purposes. Author: Michael Paquier <michael@paquier.xyz> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/aaDrJsE1I5mrE-QF@paquier.xyz Backpatch-through: 14	2026-03-02 11:10:31 +09:00
Fujii Masao	0bf7d4ca9a	psql: Add tab completion for DELETE ... USING. This implements the tab completion that was marked as XXX TODO in the source code. The following completion is now supported: DELETE FROM <table> USING <TAB> -> list of relations supporting SELECT This uses Query_for_list_of_selectables (instead of Query_for_list_of_tables) because the USING clause can reference not only tables but also views and other selectable objects, following the same syntax as the FROM clause of a SELECT statement. Author: Tatsuya Kawata <kawatatatsuya0913@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Soumya S Murali <soumyamurali.work@gmail.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/CAHza6qf0CLJuJr+5cQw0oWNebM5VyMB-ghoKBgnEjOQ_JtAiuw@mail.gmail.com	2026-03-02 11:07:42 +09:00
Michael Paquier	3b7a6fa157	Fix set of issues with extended statistics on expressions This commit addresses two defects regarding extended statistics on expressions: - When building extended statistics in lookup_var_attr_stats(), the call to examine_attribute() did not account for the possibility of a NULL return value. This can happen depending on the behavior of a typanalyze callback — for example, if the callback returns false, if no rows are sampled, or if no statistics are computed. In such cases, the code attempted to build MCV, dependency, and ndistinct statistics using a NULL pointer, incorrectly assuming valid statistics were available, which could lead to a server crash. - When loading extended statistics for expressions, statext_expressions_load() did not account for NULL entries in the pg_statistic array storing expression statistics. Such NULL entries can be generated when statistics collection fails for an expression, as may occur during the final step of serialize_expr_stats(). An extended statistics object defining N expressions requires N corresponding elements in the pg_statistic array stored for the expressions, and some of these elements can be NULL. This situation is reachable when a typanalyze callback returns true, but sets stats_valid to indicate that no useful statistics could be computed. While these scenarios cannot occur with in-core typanalyze callbacks, as far as I have analyzed, they can be triggered by custom data types with custom typanalyze implementations, at least. No tests are added in this commit. A follow-up commit will introduce a test module that can be extended to cover similar edge cases if additional issues are discovered. This takes care of the core of the problem. Attribute and relation statistics already offer similar protections: - ANALYZE detects and skips the build of invalid statistics. - Invalid catalog data is handled defensively when loading statistics. 
This issue has existed since support for extended statistics on expressions was added, down to v14 as of `a4d75c86bf`. Backpatch to all supported stable branches. Author: Michael Paquier <michael@paquier.xyz> Reviewed-by: Corey Huinker <corey.huinker@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/aaDrJsE1I5mrE-QF@paquier.xyz Backpatch-through: 14	2026-03-02 09:38:37 +09:00
Tom Lane	d80b022501	Correctly calculate "MCV frequency" for a unique column. In commit `bd3e3e9e5`, I over-hastily used 1 / rel->rows as the assumed frequency of entries in a column that ANALYZE has found to be unique. However, rel->rows is the number of table rows that are estimated to pass the query's restriction conditions, so that we got a too-large result if the query has selective restrictions. What I should have used is 1 / rel->tuples, since that is the estimated total number of table rows. The pre-existing code path that digs a frequency out of the histogram produces a frequency relative to the whole table, so surely this new alternative code path must do so as well. Any correction needed on the basis of selectivity must be done by the user of the mcv_freq value. Fixing this causes all the regression test plans changed by `bd3e3e9e5` to revert to what they had been, except for the first change in join.out. As I correctly argued in `bd3e3e9e5`, in that test case we have no stats and should not risk a hash join. Evidently I was less correct to argue that the other changes were improvements. Reported-by: Joel Jacobson <joel@compiler.org> Diagnosed-by: Tender Wang <tndrwang@gmail.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/341b723c-da45-4058-9446-1514dedb17c1@app.fastmail.com	2026-03-01 12:56:55 -05:00
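The arithmetic behind the commit above can be illustrated with a small sketch. This is not the planner's C code: the function name and the sample numbers are hypothetical, and the two variables only stand in for the roles that rel->tuples and rel->rows play in the commit message.

```python
def unique_value_frequency(total_tuples: float) -> float:
    """Assumed frequency of one value in a column found to be unique,
    expressed relative to the whole table (hypothetical helper)."""
    if total_tuples <= 0:
        raise ValueError("total_tuples must be positive")
    return 1.0 / total_tuples


# Made-up numbers for illustration only.
total_tuples = 1_000_000.0   # stands in for rel->tuples (estimated table size)
restricted_rows = 100.0      # stands in for rel->rows (rows passing restrictions)

# Dividing by the restricted row count overstates the frequency
# whenever the query's restrictions are selective.
too_large = 1.0 / restricted_rows

# Dividing by the total tuple count gives a whole-table frequency,
# matching what the histogram-based code path is described as producing.
whole_table = unique_value_frequency(total_tuples)
```

With these sample values the restricted-row version is several orders of magnitude larger than the whole-table frequency, which is the kind of overestimate the commit message describes.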
Fujii Masao	aecc558666	psql: Show comments in \dRp+, \dRs+, and \dX+ psql meta-commands. Previously, the psql meta-commands that list publications, subscriptions, and extended statistics did not display their associated comments, whereas other \d meta-commands did. This made it inconvenient for users to view these objects together with their descriptions. This commit improves \dRp+ and \dRs+ to include comments for publications and subscriptions. It also extends the \dX meta-command to accept the + option, allowing comments for extended statistics to be shown when requested. Author: Fujii Masao <masao.fujii@gmail.com> Co-authored-by: Jim Jones <jim.jones@uni-muenster.de> Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CAHGQGwGL4JqiKA26fnGx-cTM=VzoTs_uzqejvj4Fawyr4uLUUw@mail.gmail.com	2026-02-28 23:56:46 +09:00
John Naylor	51bb4a58ed	Refactor detection of x86 ZMM registers - Call _xgetbv within x86_set_runtime_features rather than in a separate function - Use symbols for XCR mask bits rather than a magic constant A future commit will build on this to detect YMM registers without code duplication. Reviewed-by: Zsolt Parragi <zsolt.parragi@percona.com> Discussion: https://postgr.es/m/CANWCAZbgEUFw7LuYSVeJ=Tj98R5HoOB1Ffeqk3aLvbw5rU5NTw@mail.gmail.com	2026-02-28 16:28:09 +07:00
Peter Eisentraut	3f98862980	Fix some -Wcast-qual warnings This fixes some warnings from -Wcast-qual that are easy to fix, without using unconstify or the like. Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://www.postgresql.org/message-id/990c9117-b013-4026-aaf5-261fe2832c3d%40eisentraut.org	2026-02-27 21:57:33 +01:00
Tom Lane	65a3ff8f1b	Doc: improve user docs and code comments about EXISTS(SELECT * ...). Point out that Postgres automatically optimizes away the target list of an EXISTS' subquery, except in weird cases such as target lists containing set-returning functions. Thus, both common conventions EXISTS(SELECT * FROM ...) and EXISTS(SELECT 1 FROM ...) are overhead-free and there's little reason to prefer one over the other. In the code comments, mention that the SQL spec says that EXISTS(SELECT * FROM ...) should be interpreted as EXISTS(SELECT some-literal FROM ...), but we don't choose to do it exactly that way. Author: Peter Eisentraut <peter@eisentraut.org> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/9b301c70-3909-4f0f-98ca-9e3c4d142f3e@eisentraut.org	2026-02-27 15:20:16 -05:00
Tom Lane	98616ac18b	Don't flatten join alias Vars that are stored within a GROUP RTE. The RTE's groupexprs list is used for deparsing views, and for that usage it must contain the original alias Vars; else we can get incorrect SQL output. But since commit `247dea89f`, parseCheckAggregates put the GROUP BY expressions through flatten_join_alias_vars before building the RTE_GROUP RTE. Changing the order of operations there is enough to fix it. This patch unfortunately can do nothing for already-created views: if they use a coding pattern that is subject to the bug, they will deparse incorrectly and hence present a dump/reload hazard in the future. The only fix is to recreate the view from the original SQL. But the trouble cases seem to be quite narrow. AFAICT the output was only wrong for "SELECT ... t1 LEFT JOIN t2 USING (x) GROUP BY x" where t1.x and t2.x were not of identical data types and t1.x was the side that required an implicit coercion. If there was no hidden coercion, or if the join was plain, RIGHT, or FULL, the deparsed output was uglier than intended but not functionally wrong. Reported-by: Swirl Smog Dowry <swirl-smog-dowry@duck.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Richard Guo <guofenglinux@gmail.com> Discussion: https://postgr.es/m/CA+-gibjCg_vjcq3hWTM0sLs3_TUZ6Q9rkv8+pe2yJrdh4o4uoQ@mail.gmail.com Backpatch-through: 18	2026-02-27 12:54:02 -05:00
John Naylor	16743db061	Centralize detection of x86 CPU features We now maintain an array of booleans that indicate which features were detected at runtime. When code wants to check for a given feature, the array is automatically checked if it has been initialized and if not, a single function checks all features at once. Move all x86 feature detection to pg_cpu_x86.c, and move the CRC function choosing logic to the file where the hardware-specific functions are defined, consistent with more recent hardware-specific files in src/port. Reviewed-by: Zsolt Parragi <zsolt.parragi@percona.com> Discussion: https://postgr.es/m/CANWCAZbgEUFw7LuYSVeJ=Tj98R5HoOB1Ffeqk3aLvbw5rU5NTw@mail.gmail.com	2026-02-27 20:30:41 +07:00
Andrew Dunstan	d6d9b96b40	Clean up nodes that are no longer of use in 007_pgdumpall.pl Oversight in commit `763aaa06f0`. When nodes are going out of scope, we should stop the underlying postmasters rather than waiting for the script to end. Per gripe from Tom Lane Discussion: https://postgr.es/m/740033.1772142754@sss.pgh.pa.us	2026-02-27 07:33:50 -05:00
Michael Paquier	574bee89c2	Use pg_malloc_object() and pg_malloc_array() variants in frontend code This commit updates the frontend tools (src/bin/, contrib/ and src/test/) to use the memory allocation variants based on pg_malloc_object() and pg_malloc_array() in various code paths. This does not cover all the allocations, but a good chunk of them. Like all the changes of this kind (`31d3847a37`, etc.), this should encourage any future code to use this new style. Author: Andreas Karlsson <andreas@proxel.se> Discussion: https://postgr.es/m/cfb645da-6b3a-4f22-9bcc-5bc46b0e9c61@proxel.se	2026-02-27 18:59:41 +09:00
Álvaro Herrera	a2c89835f5	Don't include proc.h in shm_mq.h This prevents proliferation of proc.h to tons of other places; shm_mq.h is widely included. Discussion: https://postgr.es/m/202602261733.s2rkxezwuif6@alvherre.pgsql	2026-02-27 10:53:47 +01:00
Etsuro Fujita	e7b97a2238	postgres_fdw: Fix thinko in comment for UserMappingPasswordRequired(). This commit also rephrases this comment to improve readability. Oversight in commit `6136e94dc`. Reported-by: Etsuro Fujita <etsuro.fujita@gmail.com> Author: Andreas Karlsson <andreas@proxel.se> Co-authored-by: Etsuro Fujita <etsuro.fujita@gmail.com> Discussion: https://postgr.es/m/CAPmGK16pDnM_wU3kmquPj-M9MYqG3y0BdntRZ0eytqbCaFY3WQ%40mail.gmail.com Backpatch-through: 14	2026-02-27 17:05:00 +09:00
Melanie Plageman	284925508a	Remove table_scan_analyze_next_tuple unneeded parameter OldestXmin heapam_scan_analyze_next_tuple() doesn't distinguish between dead and recently dead tuples when counting them, so it doesn't need OldestXmin. GetOldestNonRemovableTransactionId() isn't free, so removing it is a win. Looking at other table AMs implementing table_scan_analyze_next_tuple(), we couldn't find one using OldestXmin either, so remove it from the callback. Author: Melanie Plageman <melanieplageman@gmail.com> Suggested-by: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CALdSSPjvhGXihT_9f-GJabYU%3D_PjrFDUxYaURuTbfLyQM6TErg%40mail.gmail.com	2026-02-26 15:41:53 -05:00
Melanie Plageman	3efe58febc	Simplify visibility check in heap_page_would_be_all_visible() heap_page_would_be_all_visible() does not need to distinguish between HEAPTUPLE_RECENTLY_DEAD and HEAPTUPLE_DEAD tuples: any tuple in a state other than HEAPTUPLE_LIVE means the page is not all-visible and heap_page_would_be_all_visible() returns false. Given that, calling HeapTupleSatisfiesVacuum() is unnecessary, since it performs extra work to distinguish between dead and recently dead tuples using OldestXmin. Replace it with the more minimal HeapTupleSatisfiesVacuumHorizon(). Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CALdSSPjvhGXihT_9f-GJabYU%3D_PjrFDUxYaURuTbfLyQM6TErg%40mail.gmail.com	2026-02-26 15:41:45 -05:00
Jeff Davis	c8308a984d	Fix more multibyte issues in ltree. Commit `84d5efa7e3` missed some multibyte issues caused by short-circuit logic in the callers. The callers assumed that if the predicate string is longer than the label string, then it couldn't possibly be a match, but it can be when using case-insensitive matching (LVAR_INCASE) if casefolding changes the byte length. Fix by refactoring to get rid of the short-circuit logic as well as the function pointer, and consolidate the logic in a replacement function ltree_label_match(). Discussion: https://postgr.es/m/02c6ef6cf56a5013ede61ad03c7a26affd27d449.camel@j-davis.com Backpatch-through: 14	2026-02-26 12:23:22 -08:00
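The length-based short circuit described in the commit above can be sketched as follows. This is a simplified illustration, not the ltree code: both function names are hypothetical, and Python's str.casefold() stands in for whatever locale-dependent case folding the server applies.

```python
def short_circuit_match(predicate: str, label: str) -> bool:
    """Case-insensitive equality with the flawed early exit: a predicate
    that is longer in bytes than the label is assumed not to match."""
    if len(predicate.encode("utf-8")) > len(label.encode("utf-8")):
        return False
    return predicate.casefold() == label.casefold()


def full_match(predicate: str, label: str) -> bool:
    """Case-insensitive equality that compares only after case folding."""
    return predicate.casefold() == label.casefold()


# U+1E9E (LATIN CAPITAL LETTER SHARP S) is three bytes in UTF-8 and
# case-folds to the two-byte string "ss".
predicate = "\u1e9e"
label = "ss"

missed = short_circuit_match(predicate, label)
found = full_match(predicate, label)
```

Here the predicate is longer in bytes than the label, so the early exit rejects a pair that is equal once both sides are case-folded.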
Jeff Davis	d942511f08	Fix memory leaks in pg_locale_icu.c. The backport prior to 18 requires minor modification due to code refactoring. Discussion: https://postgr.es/m/e2b7a0a88aaadded7e2d19f42d5ab03c9e182ad8.camel@j-davis.com Backpatch-through: 16	2026-02-26 12:15:01 -08:00
Melanie Plageman	5aea60839b	Rename LVRelState VM-related logging counters The LVRelState fields that track newly all-visible/all-frozen pages were previously named vm_new_visible_pages, vm_new_frozen_pages, and vm_new_visible_frozen_pages. The correct terminology is all-visible and all-frozen; omitting “all” was open to misinterpretation, as the page isn't visible or invisible, rather all the tuples on the page are visible to all running and future transactions. Rename the members accordingly. Author: Melanie Plageman <melanieplageman@gmail.com> Suggested-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/bqc4kh5midfn44gnjiqez3bjqv4zogydguvdn446riw45jcf3y%404ez66il7ebvk	2026-02-26 15:04:49 -05:00
Álvaro Herrera	7b9b620d8f	Don't include latch.h in libpq/libpq.h This reduces the inclusion footprint of latch.h a bit. Per suggestion from Andres Freund. Discussion: https://postgr.es/m/pap7mzhcxvuwlfdebjkh646ntyk4brtwm4dbocfpllwdccta5t@w3d7wz6mjpwv	2026-02-26 18:04:13 +01:00
Andres Freund	9d6294c09e	instrumentation: Drop INSTR_TIME_SET_CURRENT_LAZY macro This macro had exactly one user in InstrStartNode, and the caller can instead use INSTR_TIME_IS_ZERO / INSTR_TIME_SET_CURRENT directly. This supports a future change that intends to modify the time source being used in the InstrStartNode case. Author: Lukas Fittl <lukas@fittl.com> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/CAP53Pkx1bK1FB71_nBqYmzvSSXnp_MbE0ZDnU+baPJF6Ud2WDA@mail.gmail.com	2026-02-26 10:39:29 -05:00
Andres Freund	3218825271	instrumentation: Rename INSTR_TIME_LT macro to INSTR_TIME_GT This was incorrectly named "LT" for "larger than" in `e5a5e0a907`, but that is against existing conventions, where "LT" means "less than". Clarify by using "GT" for "greater than" in macro name, and add a missing comment at the top of instr_time.h to note the macro's existence. Reported by: Peter Smith <smithpb2250@gmail.com> Author: Lukas Fittl <lukas@fittl.com> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/CAHut%2BPut94CTpjQsqOJHdHkgJ2ZXq%2BqVSfMEcmDKLiWLW-hPfA%40mail.gmail.com	2026-02-26 10:38:59 -05:00
Andrew Dunstan	763aaa06f0	Add non-text output formats to pg_dumpall pg_dumpall can now produce output in custom, directory, or tar formats in addition to plain text SQL scripts. When using non-text formats, pg_dumpall creates a directory containing: - toc.glo: global data (roles and tablespaces) in custom format - map.dat: mapping between database OIDs and names - databases/: subdirectory with per-database archives named by OID pg_restore is extended to handle these pg_dumpall archives, restoring globals and then each database. The --globals-only option can be used to restore only the global objects. This enables parallel restore of pg_dumpall output and selective restoration of individual databases from a cluster-wide backup. Author: Mahendra Singh Thalor <mahi6run@gmail.com> Co-Author: Andrew Dunstan <andrew@dunslane.net> Reviewed-By: Tushar Ahuja <tushar.ahuja@enterprisedb.com> Reviewed-By: Jian He <jian.universality@gmail.com> Reviewed-By: Vaibhav Dalvi <vaibhav.dalvi@enterprisedb.com> Reviewed-By: Srinath Reddy <srinath2133@gmail.com> Discussion: https://postgr.es/m/cb103623-8ee6-4ba5-a2c9-f32e3a4933fa@dunslane.net	2026-02-26 08:29:56 -05:00
Álvaro Herrera	7bb50dd7d6	Reduce includes in pgstat.h The lack of fallout here is somewhat surprising. Author: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/aY-UE-4t7FiYgH3t@alap3.anarazel.de	2026-02-26 13:50:24 +01:00
Álvaro Herrera	d0833fdae7	pg_dump: Preserve NO INHERIT on NOT NULL on inheritance children When the constraint is printed without the column, we were not printing the NO INHERIT flag. Author: Jian He <jian.universality@gmail.com> Backpatch-through: 18 Discussion: https://postgr.es/m/CACJufxEDEOO09G+OQFr=HmFr9ZDLZbRoV7+pj58h3_WeJ_K5UQ@mail.gmail.com	2026-02-26 11:50:26 +01:00
Noah Misch	0163951b78	EUC_CN, EUC_JP, EUC_KR, EUC_TW: Skip U+00A0 tests instead of failing. Settings that ran the new test euc_kr.sql to completion would fail these older src/pl tests. Use alternative expected outputs, for which psql \gset and \if have reduced the maintenance burden. This fixes "LANG=ko_KR.euckr LC_MESSAGES=C make check-world". (LC_MESSAGES=C fixes IO::Pty usage in tests 010_tab_completion and 001_password.) That file is new in commit `c67bef3f32`. Back-patch to v14, like that commit. Discussion: https://postgr.es/m/20260217184758.da.noahmisch@microsoft.com Backpatch-through: 14	2026-02-25 18:13:22 -08:00
Fujii Masao	b2ff2a0b52	doc: Clarify INCLUDING COMMENTS behavior in CREATE TABLE LIKE. The documentation for the INCLUDING COMMENTS option of the LIKE clause in CREATE TABLE was inaccurate and incomplete. It stated that comments for copied columns, constraints, and indexes are copied, but regarding comments on constraints in reality only comments on CHECK and NOT NULL constraints are copied; comments on other constraints (such as primary keys) are not. In addition, comments on extended statistics are copied, but this was not documented. The CREATE FOREIGN TABLE documentation had a similar omission: comments on extended statistics are also copied, but this was not mentioned. This commit updates the documentation to clarify the actual behavior. The CREATE TABLE reference now specifies that comments on copied columns, CHECK constraints, NOT NULL constraints, indexes, and extended statistics are copied. The CREATE FOREIGN TABLE reference now notes that comments on extended statistics are copied as well. Backpatch to all supported versions. Documentation updates related to CREATE FOREIGN TABLE LIKE and NOT NULL constraint comment copying are not applied to v17 and earlier, since those features were introduced in v18. Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com> Discussion: https://postgr.es/m/CAHGQGwHSOSGcaYDvHF8EYCUCfGPjbRwGFsJ23cx5KbJ1X6JouQ@mail.gmail.com Backpatch-through: 14	2026-02-26 09:01:52 +09:00
Fujii Masao	70f470314c	Fix ProcWakeup() resetting wrong waitStart field. Previously, when one process woke another that was waiting on a lock, ProcWakeup() incorrectly cleared its own waitStart field (i.e., MyProc->waitStart) instead of that of the process being awakened. As a result, the awakened process retained a stale lock-wait start timestamp. This did not cause user-visible issues. pg_locks.waitstart was reported as NULL for the awakened process (i.e., when pg_locks.granted is true), regardless of the waitStart value. This bug was introduced by commit `46d6e5f567`. This commit fixes this by resetting the waitStart field of the process being awakened in ProcWakeup(). Backpatch to all supported branches. Reported-by: Chao Li <li.evan.chao@gmail.com> Author: Chao Li <li.evan.chao@gmail.com> Reviewed-by: ji xu <thanksgreed@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/537BD852-EC61-4D25-AB55-BE8BE46D07D7@gmail.com Backpatch-through: 14	2026-02-26 08:46:12 +09:00
Tom Lane	4c1a27e53a	Stabilize output of new isolation test insert-conflict-do-update-4. The test added by commit `4b760a181` assumed that a table's physical row order would be predictable after an UPDATE. But a non-heap table AM might produce some other order. Even with heap AM, the assumption seems risky; compare `a3fd53bab` for instance. Adding an ORDER BY is cheap insurance and doesn't break any goal of the test. Author: Pavel Borisov <pashkin.elfe@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CALT9ZEHcE6tpvumScYPO6pGk_ASjTjWojLkodHnk33dvRPHXVw@mail.gmail.com Backpatch-through: 14	2026-02-25 10:51:42 -05:00
Richard Guo	77c7a17a6e	Fix unsafe RTE_GROUP removal in simplify_EXISTS_query When simplify_EXISTS_query removes the GROUP BY clauses from an EXISTS subquery, it previously deleted the RTE_GROUP RTE directly from the subquery's range table. This approach is dangerous because deleting an RTE from the middle of the rtable list shifts the index of any subsequent RTE, which can silently corrupt any Var nodes in the query tree that reference those later relations. (Currently, this direct removal has not caused problems because the RTE_GROUP RTE happens to always be the last entry in the rtable list. However, relying on that is extremely fragile and seems like trouble waiting to happen.) Instead of deleting the RTE_GROUP RTE, this patch converts it in-place to be RTE_RESULT type and clears its groupexprs list. This preserves the length and indexing of the rtable list, ensuring all Var references remain intact. Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Author: Richard Guo <guofenglinux@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/3472344.1771858107@sss.pgh.pa.us Backpatch-through: 18	2026-02-25 11:13:21 +09:00
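The indexing hazard described in the commit above can be illustrated with a toy range table. This is a sketch under stated assumptions: the class, the helper names, and the table layout are hypothetical, and the GROUP entry is deliberately placed in the middle of the list, which the commit message says does not currently happen in practice.

```python
from dataclasses import dataclass, field


@dataclass
class RangeTableEntry:
    kind: str                                 # e.g. "RELATION", "GROUP", "RESULT"
    name: str
    groupexprs: list = field(default_factory=list)


def lookup(rtable, varno):
    """Resolve a 1-based range-table index, the way a Var's varno would."""
    return rtable[varno - 1]


def delete_group_rte(rtable):
    """Fragile approach: drop the GROUP entry, shifting later entries down."""
    return [rte for rte in rtable if rte.kind != "GROUP"]


def neutralize_group_rte(rtable):
    """Safer approach: keep the slot, change its kind, clear its groupexprs."""
    for rte in rtable:
        if rte.kind == "GROUP":
            rte.kind = "RESULT"
            rte.groupexprs = []
    return rtable


def make_rtable():
    # Hypothetical layout with an entry after the GROUP entry.
    return [
        RangeTableEntry("RELATION", "t1"),
        RangeTableEntry("GROUP", "*GROUP*", ["t1.x"]),
        RangeTableEntry("RELATION", "t2"),
    ]


shrunk = delete_group_rte(make_rtable())    # t2 moves from index 3 to index 2
kept = neutralize_group_rte(make_rtable())  # t2 stays at index 3
```

After deletion, a reference that used index 3 for t2 no longer resolves to it, while the in-place conversion keeps the list length and every index unchanged.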
John Naylor	3322f01a11	Fix USE_SLICING_BY_8_CRC32C builds on x86 A future commit will move the CRC function choosing logic to the file where the hardware-specific functions are defined, but until then add guards for builds without those functions. Oversight in commit `b9278871f`. Per buildfarm animal rhinoceros Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/4014992.1771963187@sss.pgh.pa.us	2026-02-25 08:44:59 +07:00
Jacob Champion	a60a103386	pg_upgrade: Use max_protocol_version=3.0 for older servers The grease patch in `4966bd3ed` found its first problem: prior to the February 2018 patch releases, no server knew how to negotiate protocol versions, so pg_upgrade needs to take that into account when speaking to those older servers. This will be true even after the grease feature is reverted; we don't need anyone to trip over this again in the future. Backpatch so that all supported versions of pg_upgrade can gracefully handle an update to the default protocol version. (This is needed for any distributions that link older binaries against newer libpqs, such as Debian.) Branches prior to 18 need an additional version check, for the existence of max_protocol_version. Per buildfarm member crake. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAOYmi%2B%3D4QhCjssfNEoZVK8LPtWxnfkwT5p-PAeoxtG9gpNjqOQ%40mail.gmail.com Backpatch-through: 14	2026-02-24 14:01:37 -08:00
Álvaro Herrera	65707ed9af	Add backtrace support for Windows using DbgHelp API Previously, backtrace generation on Windows would return an "unsupported" message. With this commit, we rely on CaptureStackBackTrace() to capture the call stack and the DbgHelp API (SymFromAddrW, SymGetLineFromAddrW64) for symbol resolution. Symbol handler initialization (SymInitialize) is performed once per process and cached. If initialization fails, the report for it is returned as the backtrace output. The symbol handler is cleaned up via on_proc_exit() to release DbgHelp resources. The implementation provides symbol names, offsets, and addresses. When PDB files are available, it also includes source file names and line numbers. Symbol names and file paths are converted from UTF-16 to the database encoding using wchar2char(), which properly handles both UTF-8 and non-UTF-8 databases on Windows. When symbol information is unavailable or encoding conversion fails, it falls back to displaying raw addresses. The implementation uses the explicit UTF16 versions of the DbgHelp functions (SYMBOL_INFOW, SymFromAddrW, IMAGEHLP_LINEW64, SymGetLineFromAddrW64) rather than the generic versions. This allows us to rely on predictable encoding conversion, rather than using the haphazard ANSI codepage that we'd get otherwise. DbgHelp is apparently available on all Windows platforms we support, so there are no version number checks. Author: Bryan Green <dbryan.green@gmail.com> Reviewed-by: Euler Taveira <euler@eulerto.com> Reviewed-by: Jakub Wartak <jakub.wartak@enterprisedb.com> Reviewed-by: Greg Burd <greg@burd.me> Discussion: https://postgr.es/m/a692c0fe-caca-4c08-9c5d-debfd0ef2504@gmail.com	2026-02-24 17:34:56 +01:00
Peter Eisentraut	dea0812cda	doc: Add link targets to CREATE/ALTER FOREIGN TABLE reference pages This adds IDs to create_foreign_table.sgml's and alter_foreign_table.sgml's <varlistentry> and <refsect1>, similar to other reference pages. Author: jian he <jian.universality@gmail.com> Reviewed-by: Quan Zongliang <quanzongliang@yeah.net> Reviewed-by: wenhui qiu <qiuwenhuifx@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CACJufxE6fW2jFAyTFWEYdUSDP%3D9P2yYerdksPTgxqDM4DZvvvw%40mail.gmail.com	2026-02-24 11:27:49 +01:00
Peter Eisentraut	a99c6b56ff	Make ALTER DOMAIN VALIDATE CONSTRAINT no-op when constraint is already validated Currently, AlterDomainValidateConstraint will re-validate a constraint that has already been validated, which would just waste cycles. This operation should be a no-op when the constraint is already validated. This also aligns with ATExecValidateConstraint. Author: jian he <jian.universality@gmail.com> Discussion: https://postgr.es/m/CACJufxG=-Dv9fPJHqkA9c-wGZ2dDOWOXSp-X-0K_G7r-DgaASw@mail.gmail.com	2026-02-24 10:58:36 +01:00
Peter Eisentraut	f80bedd52b	Allow ALTER COLUMN SET EXPRESSION on virtual columns with CHECK constraints Previously, changing the generation expression of a virtual column was prohibited if the column was referenced by a CHECK constraint. This lifts that restriction. RememberAllDependentForRebuilding within ATExecSetExpression will rebuild all the dependent constraints, later ATPostAlterTypeCleanup queues the required AlterTableStmt operations for ALTER TABLE Phase 3 execution. Overall, ALTER COLUMN SET EXPRESSION on virtual columns may require scanning the table to re-verify any associated CHECK constraints, but it does not require a table rewrite in ALTER TABLE Phase 3. Author: jian he <jian.universality@gmail.com> Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com> Discussion: https://postgr.es/m/CACJufxH3VETr7orF5rW29GnDk3n1wWbOE3WdkHYd3iPGrQ9E_A@mail.gmail.com	2026-02-24 10:32:05 +01:00
Michael Paquier	462fe0ff62	Fix variety of typos and grammar mistakes This commit includes a batch of fixes for various minor typos and grammar mistakes, that have been proposed to the hackers mailing list since the beginning of January. Similar batches are planned on a bi-monthly basis depending on the amount received, with the next one for the end of April.	2026-02-24 13:26:37 +09:00
Michael Paquier	e2f3d82f89	doc: Adjust some markups on pg_waldump page Author: Peter Smith <smithpb2250@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Discussion: https://postgr.es/m/CAHut+PuuPps9bUPvouU5dH=tOTiF8QBzQox5O7DqXeOFdda79Q@mail.gmail.com	2026-02-24 12:54:23 +09:00
Michael Paquier	ff393fa526	fe_utils: Sprinkle some pg_malloc_object() and pg_malloc_array() The idea is to further encourage the use of these allocation routines across the tree, as these offer stronger type safety guarantees than pg_malloc() & friends (type cast in the result, sizeof() embedded). This commit updates some code paths of src/fe_utils/. This commit is similar to `31d3847a37`. Author: Henrik TJ <henrik@0x48.dk> Reviewed-by: Andreas Karlsson <andreas@proxel.se> Discussion: https://postgr.es/m/6df1b64e-1314-9afd-41a3-3fefb76225e1@0x48.dk	2026-02-24 12:34:42 +09:00
Nathan Bossart	bfc321b472	Convert SpinLock* macros to static inline functions. This is preparatory work for a proposed follow-up commit that would add assertions to these functions. Reviewed-by: Fabrízio de Royes Mello <fabriziomello@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/aZX2oUcKf7IzHnnK%40nathan Discussion: https://postgr.es/m/20200617183354.pm3biu3zbmo2pktq%40alap3.anarazel.de	2026-02-23 15:32:01 -06:00
Andrew Dunstan	7b24959434	Fix indentation from commit `b380a56a3f` Per buildfarm animal koel	2026-02-23 16:22:49 -05:00
Nathan Bossart	d981976027	Allow pg_{read,write}_all_data to access large objects. Since the initial goal of pg_read_all_data was to be able to run pg_dump as a non-superuser without explicitly granting access to every object, it follows that it should allow reading all large objects. For consistency, pg_write_all_data should allow writing all large objects, too. Author: Nitin Motiani <nitinmotiani@google.com> Co-authored-by: Nathan Bossart <nathandbossart@gmail.com> Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com> Discussion: https://postgr.es/m/CAH5HC96dxAEvP78s1-JK_nDABH5c4w2MDfyx4vEWxBEfofGWsw%40mail.gmail.com	2026-02-23 14:55:21 -06:00
Tom Lane	d743545d84	Work around lgamma(NaN) bug on AIX. lgamma(NaN) should produce NaN, but on older versions of AIX it reports an ERANGE error. While that's been fixed in the latest version of libm, it'll take a while for the fix to propagate. This workaround is harmless even when the underlying bug does get fixed. Discussion: https://postgr.es/m/3603369.1771877682@sss.pgh.pa.us	2026-02-23 15:30:50 -05:00
Peter Eisentraut	aca61f7e5f	Use LOCKMODE in parse_relation.c/.h There were a couple of comments in parse_relation.c > Note: properly, lockmode should be declared LOCKMODE not int, but that > would require importing storage/lock.h into parse_relation.h. Since > LOCKMODE is typedef'd as int anyway, that seems like overkill. but actually LOCKMODE has been in storage/lockdefs.h for a while, which is intentionally a more narrow header. So we can include that one in parse_relation.h and just use LOCKMODE normally. An alternative would be to add a duplicate typedef into parse_relation.h, but that doesn't seem necessary here. Reviewed-by: Andreas Karlsson <andreas@proxel.se> Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/4bcd65fb-2497-484c-bb41-83cb435eb64d%40eisentraut.org	2026-02-23 21:25:55 +01:00
Jacob Champion	4966bd3ed9	libpq: Grease the protocol by default Send PG_PROTOCOL_GREASE and _pq_.test_protocol_negotiation, which were introduced in commit `d8d7c5dc8`, by default, and fail the connection if the server attempts to claim support for them. The hope is to provide feedback to noncompliant implementations and gain confidence in our ability to advance the protocol. (See the other commit for details.) To help end users navigate the situation, a link to our documentation that explains the behavior is displayed. We append this to the error message when the NegotiateProtocolVersion response is incorrect, or when the peer sends an error during startup that appears to be grease-related. It's still possible for users to connect to servers that don't support protocol negotiation, by adding max_protocol_version=3.0 to their connection strings. Only the default connection behavior is impacted. This commit is tracked as a PG19 open item and will be reverted before RC1. (The implementation here doesn't handle negotiation with later server versions, so it can't be released into the wild as a five-year-supported feature. But an improved implementation might be able to do so, in the future...) Author: Jelte Fennema-Nio <postgres@jeltef.nl> Co-authored-by: Jacob Champion <jacob.champion@enterprisedb.com> Discussion: https://postgr.es/m/DDPR5BPWH1RJ.1LWAK6QAURVAY%40jeltef.nl	2026-02-23 10:48:20 -08:00
Tom Lane	4a1b05caa5	Restore AIX support. The concerns that led us to remove AIX support in commit `0b16bb877` have now been alleviated: 1. IBM has stepped forward to provide support, including buildfarm animal(s). 2. AIX 7.2 and later seem to be fine with large pg_attribute_aligned requirements. Since 7.1 is now EOL anyway, we can just cease to support it. 3. Tossing xlc support overboard seems okay as well. It's a bit sad to drop one of the few remaining non-gcc-alike compilers, but working around xlc's bugs and idiosyncrasies doesn't seem justified by the theoretical portability benefits. 4. Likewise, we can stop supporting 32-bit AIX builds. This is not so much about whether we could build such executables as that they're too much of a pain to manage in the field, due to limited address space available for dynamic library loading. 5. We hit on a way to manage catalog column alignment that doesn't require continuing developer effort (see commit `ecae09725`). Hence, this commit reverts `0b16bb877` and some follow-on commits such as `e6bb491bf`, except for not putting back XLC support nor the changes related to catalog column alignment. Some other notable changes from the way things were in v16: Prefer unnamed POSIX semaphores on AIX, rather than the default choice of SysV semaphores. Include /opt/freeware/lib in -Wl,-blibpath, even when it is not mentioned anywhere in LDFLAGS. Remove platform-specific adjustment of MEMSET_LOOP_LIMIT; maybe that's still the right thing, but it really ought to be re-tested. Silence compiler warnings related to getpeereid(), wcstombs_l(), and PAM conversation procs. Accept "libpythonXXX.a" as an okay name for the Python shared library (but only on AIX!). 
Author: Aditya Kamath <Aditya.Kamath1@ibm.com> Author: Srirama Kucherlapati <sriram.rk@in.ibm.com> Co-authored-by: Peter Eisentraut <peter@eisentraut.org> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CY5PR11MB63928CC05906F27FB10D74D0FD322@CY5PR11MB6392.namprd11.prod.outlook.com	2026-02-23 13:34:22 -05:00
Tom Lane	ecae097252	Cope with AIX's alignment woes by using _Pragma("pack"). Because we assume that int64 and double have the same alignment requirement, AIX's default behavior that alignof(double) = 4 while alignof(int64) = 8 is a headache. There are two issues: 1. We align both int8 and float8 tuple columns per ALIGNOF_DOUBLE, which is an ancient choice that can't be undone without breaking pg_upgrade and creating some subtle SQL-level compatibility issues too. However, the cost of that is just some marginal inefficiency in fetching int8 values, which can't be too awful if the platform architects were willing to pay the same costs for fetching float8s. So our decision is to leave that alone. This patch makes our alignment choices the same as they were pre-v17, namely that ALIGNOF_DOUBLE and ALIGNOF_INT64_T are whatever the compiler prefers and then MAXIMUM_ALIGNOF is the larger of the two. (On all supported platforms other than AIX, all three values will be the same.) 2. We need to overlay C structs onto catalog tuples, and int8 fields in those struct declarations may not be aligned to match this rule. In the old branches we had some annoying rules about ordering catalog columns to avoid alignment problems, but nobody wants to resurrect those. However, there's a better answer: make the compiler construe those struct declarations the way we need it to by using the pack(N) pragma. This requires no manual effort to maintain going forward; we only have to insert the pragma into all the catalog *.h files. (As the catalogs stand at this writing, nothing actually changes because we've not moved any affected columns since v16; hence no catversion bump is required. The point of this is to not have to worry about the issue going forward.) We did not have this option when the AIX port was first made. This patch depends on the C99 feature _Pragma(), as well as the pack(N) pragma which dates to somewhere around gcc 4.0, and probably doesn't exist in xlc at all. 
But now that we've agreed to toss xlc support out the window, there doesn't seem to be a reason not to go this way. In passing, I got rid of LONGALIGN[_DOWN] along with the configure probes for ALIGNOF_LONG. We were not using those anywhere and it seems highly unlikely that we'd do so in future. Instead supply INT64ALIGN[_DOWN], which isn't used either but at least could have a good reason to be used. Discussion: https://postgr.es/m/1127261.1769649624@sss.pgh.pa.us	2026-02-23 12:34:54 -05:00
Nathan Bossart	bc60ee8606	Warn upon successful MD5 password authentication. This uses the "connection warning" infrastructure introduced by commit `1d92e0c2cc` to emit a WARNING when an MD5 password is used to authenticate. MD5 password support was marked as deprecated in v18 and will be removed in a future release of Postgres. These warnings are on by default but can be turned off via the existing md5_password_warnings parameter. Reviewed-by: Andreas Karlsson <andreas@proxel.se> Reviewed-by: Xiangyu Liang <liangxiangyu_2013@163.com> Discussion: https://postgr.es/m/aYzeAYEbodkkg5e-%40nathan	2026-02-23 11:22:04 -06:00
Peter Eisentraut	797872f6b9	Rename validate_relation_kind() There are three static definitions of validate_relation_kind() in the codebase, one each in table.c, indexam.c and sequence.c, validating that the given relation is a table, an index or a sequence respectively. The compiler knows which definition to use where because they are static. But this could be confusing to a reader. Rename these functions so that their names reflect the kind of relation they are validating. While at it, also update the comments in table.c to clarify the definition of table-like relkinds so that we don't have to maintain the exclusion list as the set of relkinds undergoes changes. Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-by: Junwang Zhao <zhjwpku@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/6d3fef19-a420-4e11-8235-8ea534bf2080%40eisentraut.org	2026-02-23 17:38:06 +01:00
Peter Eisentraut	d7be57ad85	Flip logic in table validate_relation_kind Instead of checking which relkinds it shouldn't be, explicitly list the ones we accept. This is used to check which relkinds are accepted in table_open() and related functions. Before this change, figuring that out was always a few steps too complicated. This also makes changes for new relkinds more explicit instead of accidental. Finally, this makes this more aligned with the functions of the same name in src/backend/access/index/indexam.c and src/backend/access/sequence/sequence.c. Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-by: Junwang Zhao <zhjwpku@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/6d3fef19-a420-4e11-8235-8ea534bf2080%40eisentraut.org	2026-02-23 17:32:07 +01:00
Andrew Dunstan	b380a56a3f	Disallow CR and LF in database, role, and tablespace names Previously, these characters could cause problems when passed through shell commands, and were flagged with a comment in string_utils.c suggesting they be rejected in a future major release. The affected commands are CREATE DATABASE, CREATE ROLE, CREATE TABLESPACE, ALTER DATABASE RENAME, ALTER ROLE RENAME, and ALTER TABLESPACE RENAME. Also add a pg_upgrade check to detect these invalid names in clusters being upgraded from pre-v19 versions, producing a report file listing any offending objects that must be renamed before upgrading. Tests have been modified accordingly. Author: Mahendra Singh Thalor <mahi6run@gmail.com> Reviewed-By: Álvaro Herrera <alvherre@alvh.no-ip.org> Reviewed-By: Andrew Dunstan <andrew@dunslane.net> Reviewed-By: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-By: Nathan Bossart <nathandbossart@gmail.com> Reviewed-By: Srinath Reddy <srinath2133@gmail.com> Discussion: https://postgr.es/m/CAKYtNApkOi4FY0S7+3jpTqnHVyyZ6Tbzhtbah-NBbY-mGsiKAQ@mail.gmail.com	2026-02-23 11:19:13 -05:00
Peter Eisentraut	78727dcba3	meson: allow disabling building/installation of static libraries. We now support the common meson option -Ddefault_library, with values 'both' (the default), 'shared' (install only shared libraries), and 'static' (install only static libraries). The 'static' choice doesn't actually work, since psql and other programs insist on linking to the shared version of libpq, but it's there pro-forma. It could be built out if we really wanted, but since we have never supported the equivalent in the autoconf build system, there doesn't appear to be an urgent need. With an eye to re-supporting AIX, the internal implementation distinguishes whether to install libpgport.a and other static-only libraries from whether to build/install the static variant of libraries that we can build both ways. This detail isn't exposed as a meson option, though it could be if there's demand. The Cirrus CI task SanityCheck now uses -Ddefault_library=shared to save a little bit of build time (and to test this option). Author: Peter Eisentraut <peter@eisentraut.org> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/e8aa97db-872b-4087-b073-f296baae948d@eisentraut.org	2026-02-23 16:45:40 +01:00
Nathan Bossart	f33b8793fd	Make use of pg_popcount() in more places. This replaces some loops over word-length popcount functions with calls to pg_popcount(). Since pg_popcount() may use a function pointer for inputs with sizes >= a Bitmapset word, this produces a small regression for the common one-word case in bms_num_members(). To deal with that, this commit adds an inlined fast-path for that case. This fast-path could arguably go in pg_popcount() itself (with an appropriate alignment check), but that is left for future study. Suggested-by: John Naylor <johncnaylorls@gmail.com> Reviewed-by: John Naylor <johncnaylorls@gmail.com> Discussion: https://postgr.es/m/CANWCAZY7R%2Biy%2Br9YM_sySNydHzNqUirx1xk0tB3ej5HO62GdgQ%40mail.gmail.com	2026-02-23 09:26:00 -06:00
Nathan Bossart	eb9ab7e093	Remove uses of popcount builtins. This commit replaces the implementations of pg_popcount{32,64} with branchless ones in plain C. While these new implementations do not make use of more sophisticated population count instructions available on some CPUs, testing indicates they perform well, especially now that they are inlined. Newer versions of popular compilers will automatically replace these with special instructions if possible, anyway. A follow-up commit will replace various loops over these functions with calls to pg_popcount(), leaving us little reason to worry about micro-optimizing them further. Since this commit removes the only uses of the popcount builtins, we can also remove the corresponding configuration checks. Suggested-by: John Naylor <johncnaylorls@gmail.com> Reviewed-by: John Naylor <johncnaylorls@gmail.com> Discussion: https://postgr.es/m/CANWCAZY7R%2Biy%2Br9YM_sySNydHzNqUirx1xk0tB3ej5HO62GdgQ%40mail.gmail.com	2026-02-23 09:26:00 -06:00
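As an illustration of what a branchless population count in plain C can look like, here is a minimal sketch using the classic SWAR (SIMD within a register) bit trick. The function names are hypothetical, and this is not necessarily the code that the commit above installs for pg_popcount32() and pg_popcount64(); it only shows the general technique of counting bits without loops, branches, or compiler builtins.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not PostgreSQL's implementation: count the set bits
 * of a 32-bit word by summing neighbouring bit fields in parallel.
 */
static inline int
sketch_popcount32(uint32_t x)
{
	x = x - ((x >> 1) & 0x55555555u);					/* 2-bit partial sums */
	x = (x & 0x33333333u) + ((x >> 2) & 0x33333333u);	/* 4-bit partial sums */
	x = (x + (x >> 4)) & 0x0F0F0F0Fu;					/* 8-bit partial sums */
	return (int) ((x * 0x01010101u) >> 24);				/* add the four bytes */
}

/* The same idea widened to 64 bits. */
static inline int
sketch_popcount64(uint64_t x)
{
	x = x - ((x >> 1) & UINT64_C(0x5555555555555555));
	x = (x & UINT64_C(0x3333333333333333)) +
		((x >> 2) & UINT64_C(0x3333333333333333));
	x = (x + (x >> 4)) & UINT64_C(0x0F0F0F0F0F0F0F0F);
	return (int) ((x * UINT64_C(0x0101010101010101)) >> 56);
}
```

Because both functions are short and free of branches, a compiler can inline them and, where the target supports it, may recognize the pattern and substitute a dedicated instruction.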
John Naylor	b9278871f9	Rename pg_crc32c_sse42_choose.c for general purpose Future commits will consolidate the CPU feature detection functionality now scattered around in various files, and the CRC "*_choose.c" files seem to be the natural place for it. For now, just rename in a separate commit to make it easier to follow the git log. Do the minimum necessary to keep the build systems functional, and build the new file pg_cpu_x86.c unconditionally using guards to control the visibility of its contents, following the model of some more recent files in src/port. Limit scope to x86 to reduce the number of moving parts, since the motivation for doing this now is to clear out some technical debt before adding AVX2 detection. Arm is left for future work. Reviewed-by: Zsolt Parragi <zsolt.parragi@percona.com> Discussion: https://postgr.es/m/CANWCAZbgEUFw7LuYSVeJ=Tj98R5HoOB1Ffeqk3aLvbw5rU5NTw@mail.gmail.com	2026-02-23 19:24:56 +07:00
Peter Eisentraut	55f3859329	Change error message for sequence validate_relation_kind() We can just say "... is not a sequence" instead of the more complicated variant from before, which was probably copied from src/backend/access/table/table.c. Fix a typo in a comment in passing. Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/6d3fef19-a420-4e11-8235-8ea534bf2080%40eisentraut.org	2026-02-23 10:56:54 +01:00
Peter Eisentraut	4bfbbeb679	meson: Refactor libpq targets variables Some of the knowledge of the libpq targets was spread around between the top-level meson.build and src/interfaces/libpq*. This change organizes it more like other targets by having a libpq_targets variable that different subdirectories can add to. Discussion: https://www.postgresql.org/message-id/flat/e8aa97db-872b-4087-b073-f296baae948d%40eisentraut.org	2026-02-23 10:37:38 +01:00
Peter Eisentraut	2f2c9d8363	test_cplusplusext: Add C++ pg_fallthrough test case Discussion: https://www.postgresql.org/message-id/flat/76a8efcd-925a-4eaf-bdd1-d972cd1a32ff%40eisentraut.org	2026-02-23 07:40:19 +01:00
Peter Eisentraut	0284e07599	Enable -Wimplicit-fallthrough option for clang On clang, -Wimplicit-fallthrough requires annotations with attributes, but on gcc, -Wimplicit-fallthrough is the same as -Wimplicit-fallthrough=3, which allows annotations with comments. In order to enforce consistent annotations with attributes on both compilers, we test first for -Wimplicit-fallthrough=5, which will succeed on gcc, and if that is not found we test for -Wimplicit-fallthrough. Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl> Discussion: https://www.postgresql.org/message-id/flat/76a8efcd-925a-4eaf-bdd1-d972cd1a32ff%40eisentraut.org	2026-02-23 07:40:19 +01:00
Peter Eisentraut	3f7a0e1e55	Fix additional fallthrough warning Clang warns about this one, but GCC did not. (Apparently a bug in GCC: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122796) Apparently, the previous "fall through" comment was introduced manually in commit `f76892c9ff` without the compiler actually asking for it. This is in preparation for enabling fallthrough warnings on Clang. Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl> Discussion: https://www.postgresql.org/message-id/flat/76a8efcd-925a-4eaf-bdd1-d972cd1a32ff%40eisentraut.org	2026-02-23 07:40:19 +01:00
Peter Eisentraut	3a63b76571	Fix additional fallthrough warnings from clang Clang warns if falling through to a case or default label that is immediately followed by break, but GCC does not (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91432). (MSVC also warns about the equivalent code in C++.) This is in preparation for enabling fallthrough warnings on Clang. Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl> Discussion: https://www.postgresql.org/message-id/flat/76a8efcd-925a-4eaf-bdd1-d972cd1a32ff%40eisentraut.org	2026-02-23 07:40:19 +01:00
Amit Kapila	308622edf1	Avoid including utils/timestamp.h in conflict.h. conflict.h currently includes utils/timestamp.h despite only requiring basic timestamp type definitions. This creates unnecessary overhead. Replace the include with datatype/timestamp.h to provide the necessary types. This change requires explicitly including utils/timestamp.h in test_custom_fixed_stats.c, which previously relied on the indirect inclusion. Extracted from the larger patch by Andres Freund. Discussion: https://postgr.es/m/aY-UE-4t7FiYgH3t@alap3.anarazel.de	2026-02-23 10:19:05 +05:30
Michael Paquier	ee46584884	doc: Add section "Options" for pg_controldata Adding this section brings consistency with the pages of other tools, potentially easing the introduction of new options in the future as these are now showing in the shape of a list. Author: Peter Smith <smithpb2250@gmail.com> Reviewed-by: Andreas Karlsson <andreas@proxel.se> Discussion: https://postgr.es/m/CAHut+PtSF5AW3DHpYA-_muDLms2xBUzHpd545snVj8vFpmsmGg@mail.gmail.com	2026-02-23 13:42:38 +09:00
Heikki Linnakangas	412f78c66e	Align PGPROC to cache line boundary On common architectures, the PGPROC struct happened to be a multiple of 64 bytes on PG 18, but it's changed on 'master' since. There was worry that changing the alignment might hurt performance, due to false cacheline sharing across elements in the proc array. However, there was no explicit alignment, so any alignment to cache lines was accidental. Add explicit alignment to remove worry about false sharing. Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://www.postgresql.org/message-id/3dd6f70c-b94d-4428-8e75-74a7136396be@iki.fi	2026-02-22 13:13:43 +02:00
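The difference between accidental and explicit cache-line alignment can be sketched with standard C11 facilities. The struct, its fields, and the 64-byte constant below are assumptions for illustration only; the commit above does not show how PGPROC itself is annotated.

```c
#include <stdalign.h>
#include <stddef.h>

/* Assumed cache line size; real hardware may differ. */
#define SKETCH_CACHE_LINE_SIZE 64

/*
 * Hypothetical stand-in for a per-process shared-memory struct.  Aligning
 * the first member raises the alignment of the whole struct, and the
 * compiler pads sizeof() up to a multiple of that alignment, so adjacent
 * array elements never share a cache line.
 */
typedef struct SketchProc
{
	alignas(SKETCH_CACHE_LINE_SIZE) int pid;
	int			wait_status;
	char		payload[70];
} SketchProc;

/* Compile-time checks: the guarantee no longer depends on field layout. */
_Static_assert(alignof(SketchProc) == SKETCH_CACHE_LINE_SIZE,
			   "struct must be cache-line aligned");
_Static_assert(sizeof(SketchProc) % SKETCH_CACHE_LINE_SIZE == 0,
			   "struct size must be a multiple of the cache line size");
```

With the explicit annotation, adding or removing fields changes the amount of padding but not the alignment property.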
Heikki Linnakangas	2e0853176f	Rearrange fields in PGPROC, for clarity The ordering was pretty random, making it hard to get an overview of what's in it. Group related fields together, and add comments to act as separators between the groups. Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://www.postgresql.org/message-id/3dd6f70c-b94d-4428-8e75-74a7136396be@iki.fi	2026-02-22 12:45:13 +02:00
Michael Paquier	4476106c65	doc: Add description of "filename" for pg_walsummary This command requires an input file (WAL summary file), that has to be specified without an option name. The shape of the command and how to use this parameter is implied in its synopsis. However, this page lacked a description of the parameter. Listing parameters that do not require an option is a common practice across the docs. See for example pg_dump, pg_restore, etc. Author: Peter Smith <smithpb2250@gmail.com> Discussion: https://postgr.es/m/CAHut+PtbQi8Dw_0upS9dd=Oh9OqfOdAo=0_DOKG=YSRT_a+0Fw@mail.gmail.com	2026-02-22 15:12:58 +09:00
Álvaro Herrera	0eeffd31bf	Avoid name collision with NOT NULL constraints If a CREATE TABLE statement defined a constraint whose name is identical to the name generated for a NOT NULL constraint, we'd throw an (unnecessary) unique key violation error on pg_constraint_conrelid_contypid_conname_index: this can easily be avoided by choosing a different name for the NOT NULL constraint. Fix by passing the constraint names already created by AddRelationNewConstraints() to AddRelationNotNullConstraints(), so that the latter can avoid name collisions with them. Bug: #19393 Author: Laurenz Albe <laurenz.albe@cybertec.at> Reported-by: Hüseyin Demir <huseyin.d3r@gmail.com> Backpatch-through: 18 Discussion: https://postgr.es/m/19393-6a82427485a744cf@postgresql.org	2026-02-21 12:22:08 +01:00
Heikki Linnakangas	36bbcd5be3	Split PGPROC 'links' field into two, for clarity The field was mainly used for the position in a LOCK's wait queue, but also as the position in the freelist when the PGPROC entry was not in use. The reuse saves some memory at the expense of readability, which seems like a bad tradeoff. If we wanted to make the struct smaller there's other things we could do, but we're actually just discussing adding padding to the struct for performance reasons. Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://www.postgresql.org/message-id/3dd6f70c-b94d-4428-8e75-74a7136396be@iki.fi	2026-02-20 22:34:42 +02:00
Nathan Bossart	dc592a4155	Speedup COPY FROM with additional function inlining. Following the example set by commit `58a359e585`, we can squeeze out a little more performance from COPY FROM (FORMAT {text,csv}) by inlining CopyReadLineText() and passing the is_csv parameter as a constant. This allows the compiler to emit specialized code with fewer branches. This is preparatory work for a proposed follow-up commit that would further optimize this code with SIMD instructions. Author: Nazir Bilal Yavuz <byavuz81@gmail.com> Reviewed-by: Ayoub Kazar <ma_kazar@esi.dz> Tested-by: Manni Wood <manni.wood@enterprisedb.com> Discussion: https://postgr.es/m/CAOzEurSW8cNr6TPKsjrstnPfhf4QyQqB4tnPXGGe8N4e_v7Jig%40mail.gmail.com	2026-02-20 12:07:27 -06:00
Heikki Linnakangas	18bcdb75d1	Fix expanding 'bounds' in pg_trgm's calc_word_similarity() function If the 'bounds' array needs to be expanded, because the input contains more trigrams than the initial guess, the code didn't return the reallocated array correctly to the caller. That could lead to a crash in the rare case that the input string becomes longer when it's lower-cased. The only known instance of that is when an ICU locale is used with certain single-byte encodings. This was an oversight in commit `00896ddaf4`. Author: Zsolt Parragi <zsolt.parragi@percona.com> Backpatch-through: 18	2026-02-20 11:56:42 +02:00
Richard Guo	691977d370	Fix computation of varnullingrels when translating appendrel Var When adjust_appendrel_attrs translates a Var referencing a parent relation into a Var referencing a child relation, it propagates varnullingrels from the parent Var to the translated Var. Previously, the code simply overwrote the translated Var's varnullingrels with those of the parent. This was incorrect because the translated Var might already possess nonempty varnullingrels. This happens, for example, when a LATERAL subquery within a UNION ALL references a Var from the nullable side of an outer join. In such cases, the translated Var correctly carries the outer join's relid in its varnullingrels. Overwriting these bits with the parent Var's set caused the planner to lose track of the fact that the Var could be nulled by that outer join. In the reported case, because the underlying column had a NOT NULL constraint, the planner incorrectly deduced that the Var could never be NULL and discarded essential IS NOT NULL filters. This led to incorrect query results where NULL rows were returned instead of being filtered out. To fix, use bms_add_members to merge the parent Var's varnullingrels into the translated Var's existing set, preserving both sources of nullability. Back-patch to v16. Although the reported case does not seem to cause problems in v16, leaving incorrect varnullingrels in the tree seems like a trap for the unwary. Bug: #19412 Reported-by: Sergey Shinderuk <s.shinderuk@postgrespro.ru> Author: Richard Guo <guofenglinux@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/19412-1d0318089b86859e@postgresql.org Backpatch-through: 16	2026-02-20 17:57:53 +09:00
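The overwrite-versus-merge distinction described in the commit above can be illustrated with a plain 64-bit mask standing in for a Bitmapset. The type and function names are hypothetical; the real fix uses bms_add_members() on the Var's varnullingrels, which this sketch does not reproduce.

```c
#include <stdint.h>

/* Hypothetical stand-in for a set of relids: bit N set means relid N. */
typedef uint64_t SketchRelids;

/* Old behaviour: the child's own nulling rels are silently discarded. */
static SketchRelids
sketch_translate_overwrite(SketchRelids child, SketchRelids parent)
{
	(void) child;
	return parent;
}

/* Fixed behaviour: keep the child's nulling rels and add the parent's. */
static SketchRelids
sketch_translate_merge(SketchRelids child, SketchRelids parent)
{
	return child | parent;
}
```

If the child already records outer join 3 and the parent records outer join 5, the overwrite variant returns only {5}, while the merge variant returns {3, 5}, so the information that join 3 can null the Var survives.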
Michael Paquier	0dc22fff64	Fix constant in error message for recovery_target_timeline The intention was to use PG_UINT32_MAX, not UINT_MAX. Let's be consistent and use the same constant. Thinko in `fd7d7b7191`. Author: David Steele <david@pgbackrest.org> Discussion: https://postgr.es/m/aZfXO97jSQaTTlfD@paquier.xyz	2026-02-20 16:17:57 +09:00
Amit Kapila	9842e8aca0	Avoid including worker_internal.h in pgstat.h. pgstat.h is a widely included header. Including worker_internal.h there is unnecessary and creates tight coupling. By refactoring pgstat_report_subscription_error() to fetch the required LogicalRepWorkerType internally rather than receiving it as an argument, we can eliminate the need for the internal header. Reported-by: Andres Freund <andres@anarazel.de> Author: Nisha Moond <nisha.moond412@gmail.com> Reviewed-by: vignesh C <vignesh21@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/aY-UE-4t7FiYgH3t@alap3.anarazel.de	2026-02-20 09:26:33 +05:30
Nathan Bossart	ba401828c1	Remove SpinLockFree() and S_LOCK_FREE(). S_LOCK_FREE() is used by the test program in s_lock.c, but nobody has voiced concerns about losing some coverage there. SpinLockFree() appears to have been unused since it was introduced by commit `499abb0c0f`. There was agreement to remove these in 2020, but it never happened. Since we still have agreement for removal in 2026, let's do that now. Reviewed-by: Fabrízio de Royes Mello <fabriziomello@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/aZX2oUcKf7IzHnnK%40nathan Discussion: https://postgr.es/m/20200608225338.m5zho424w6lpwb2d%40alap3.anarazel.de	2026-02-19 16:19:41 -06:00
Nathan Bossart	aa71a35a40	Assume "inline" keyword is available. This has been a keyword since C99, and we now require C11, so we no longer need to use __inline__ or to check for it at configure time. Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/aZdGbDaV4_yKCMc-%40nathan	2026-02-19 14:37:29 -06:00
Robert Haas	6e466e1e83	Fix add_partial_path interaction with disabled_nodes Commit `e222534679` adjusted the logic in add_path() to keep the path list sorted by disabled_nodes and then by total_cost, but failed to make the corresponding adjustment to add_partial_path. As a result, add_partial_path might sort the path list just by total cost, which could lead to later planner misbehavior. In principle, this should be back-patched to v18, but we are typically reluctant to back-patch planner fixes for fear of destabilizing working installations, and it is unclear to me that this has sufficiently serious consequences to justify an exception, so for now, no back-patch. Reviewed-by: Richard Guo <guofenglinux@gmail.com> Discussion: http://postgr.es/m/CAMbWs4-mO3jMK4t_LgcJ+7Eo=NmGgkxettgRaVbJzZvVZ1koMA@mail.gmail.com	2026-02-19 13:46:10 -05:00
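The ordering the commit above refers to, by disabled_nodes first and total_cost second, can be sketched as a two-key comparison. The struct and function below are hypothetical and much simpler than the planner's real logic, which applies further rules not described in this log entry.

```c
/* Hypothetical, stripped-down path: only the two sort keys. */
typedef struct SketchPath
{
	int			disabled_nodes;
	double		total_cost;
} SketchPath;

/*
 * Return <0 if a sorts before b, >0 if after, 0 if tied.  Fewer disabled
 * nodes always wins; total cost only breaks ties among equally disabled
 * paths.
 */
static int
sketch_path_cmp(const SketchPath *a, const SketchPath *b)
{
	if (a->disabled_nodes != b->disabled_nodes)
		return (a->disabled_nodes < b->disabled_nodes) ? -1 : 1;
	if (a->total_cost != b->total_cost)
		return (a->total_cost < b->total_cost) ? -1 : 1;
	return 0;
}
```

Sorting on total cost alone would place a cheap path with disabled nodes ahead of a more expensive path with none, which is the inconsistency the commit describes.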
Álvaro Herrera	fc3896c786	Add translator comment Otherwise the message is not very clear. Backpatch-through: 18	2026-02-19 17:11:04 +01:00
Tom Lane	2f248ad573	Remove no-longer-useful markers in pg_hba.conf.sample. The source version of pg_hba.conf.sample contains @remove-line-for-nolocal@ markers that indicate which lines should be deleted for an installation that doesn't HAVE_UNIX_SOCKETS. We no longer support that case, and since commit `f55808828` all that initdb is doing is unconditionally removing the markers. We might as well remove the markers from the source version and drop the removal code, which is unintelligible now anyway. This will not of course save any noticeable number of cycles in initdb, but it might save some confusion for future developers looking at pg_hba.conf.sample. It also reduces the number of distinct cases that replace_token() has to support, possibly allowing some tightening of that function. Discussion: https://postgr.es/m/2287786.1771458157@sss.pgh.pa.us	2026-02-19 11:09:00 -05:00
Fujii Masao	fb80f388f4	Add per-subscription wal_receiver_timeout setting. This commit allows setting wal_receiver_timeout per subscription using the CREATE SUBSCRIPTION and ALTER SUBSCRIPTION commands. The value is stored in the subwalrcvtimeout column of the pg_subscription catalog. When set, this value overrides the global wal_receiver_timeout for the subscription's apply worker. The default is -1, which means the global setting (from the server configuration, command line, role, or database) remains in effect. This feature is useful for configuring different timeout values for each subscription, especially when connecting to multiple publisher servers, to improve failure detection. Bump catalog version. Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Japin Li <japinli@hotmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/a1414b64-bf58-43a6-8494-9704975a41e9@oss.nttdata.com	2026-02-20 01:00:09 +09:00
Fujii Masao	8a6af3ad08	Make GUC wal_receiver_timeout user-settable. When multiple subscribers connect to different publisher servers, it can be useful to set different wal_receiver_timeout values for each connection to better detect failures. However, previously this wasn't possible, which limited flexibility in managing subscriptions. This commit changes wal_receiver_timeout to be user-settable, allowing different values to be assigned using ALTER ROLE SET for each subscription owner. This effectively enables per-subscription configuration. Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Japin Li <japinli@hotmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/a1414b64-bf58-43a6-8494-9704975a41e9@oss.nttdata.com	2026-02-20 00:52:43 +09:00
Fujii Masao	5b93a5987b	Log checkpoint request flags in checkpoint completion messages. Checkpoint completion log messages include more detail than checkpoint start messages, but previously omitted the checkpoint request flags, which were only logged at checkpoint start. As a result, users had to correlate completion messages with earlier start messages to see the full context. This commit includes the checkpoint request flags in the checkpoint completion log message as well. This duplicates some information, but makes the completion message self-contained and easier to interpret. Author: Soumya S Murali <soumyamurali.work@gmail.com> Reviewed-by: Michael Banck <mbanck@gmx.net> Reviewed-by: Yuan Li <carol.li2025@outlook.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/CAMtXxw9tPwV=NBv5S9GZXMSKPeKv5f9hRhSjZ8__oLsoS5jcuA@mail.gmail.com	2026-02-19 23:55:12 +09:00
Peter Eisentraut	8354b9d6b6	Use fallthrough attribute instead of comment Instead of using comments to mark fallthrough switch cases, use the fallthrough attribute. This will (in the future, not here) allow supporting other compilers besides gcc. The commenting convention is only supported by gcc, the attribute is supported by clang, and in the fullness of time the C23 standard attribute would allow supporting other compilers as well. Right now, we package the attribute into a macro called pg_fallthrough. This commit defines that macro and replaces the existing comments with that macro invocation. We also raise the level of the gcc -Wimplicit-fallthrough= option from 3 to 5 to enforce the use of the attribute. Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl> Discussion: https://www.postgresql.org/message-id/flat/76a8efcd-925a-4eaf-bdd1-d972cd1a32ff%40eisentraut.org	2026-02-19 08:51:12 +01:00
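To show the difference between a fallthrough comment and a fallthrough attribute wrapped in a macro, here is a minimal sketch. The macro name sketch_fallthrough and its definition are assumptions made for this example; the log entry above does not show how pg_fallthrough is actually defined.

```c
/*
 * Hypothetical macro in the spirit of pg_fallthrough.  GCC and Clang accept
 * the statement attribute; other compilers get a harmless no-op.
 */
#if defined(__GNUC__) || defined(__clang__)
#define sketch_fallthrough __attribute__((fallthrough))
#else
#define sketch_fallthrough ((void) 0)
#endif

/* Counts how many case labels execute when starting at the given level. */
static int
sketch_count_steps(int level)
{
	int			steps = 0;

	switch (level)
	{
		case 2:
			steps++;
			sketch_fallthrough;		/* deliberate: continue into case 1 */
		case 1:
			steps++;
			sketch_fallthrough;		/* deliberate: continue into case 0 */
		case 0:
			steps++;
			break;
		default:
			break;
	}
	return steps;
}
```

Unlike a comment, the attribute is part of the program text the compiler parses, so a missing annotation can be diagnosed consistently across compilers that support it.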
Peter Eisentraut	0c3fbb3fef	Remove useless fallthrough annotation A fallthrough attribute after the last case is a constraint violation in C23, and clang warns about it (not about this comment, but if we changed it to an attribute). Remove it. (There was apparently never anything after this to fall through to, even in the first commit da07a1e8565.) Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl> Discussion: https://www.postgresql.org/message-id/flat/76a8efcd-925a-4eaf-bdd1-d972cd1a32ff%40eisentraut.org	2026-02-19 08:50:58 +01:00
Michael Paquier	21e323e941	Sanitize some WAL-logging buffer handling in GIN and GiST code As transam's README documents, the general order of actions recommended when WAL-logging a buffer is to unlock and unpin buffers after leaving a critical section. This pattern was not being followed by some code paths of GIN and GiST, adjusted in this commit, where buffers were either unlocked or unpinned inside a critical section. Based on my analysis of each code path updated here, there is no reason to not follow the recommended unlocking/unpin pattern done outside of a critical section. These inconsistencies are rather old, coming mainly from `ecaa4708e5` and `ff301d6e69`. The guidelines in the README predate these commits, being introduced in `6d61cdec07`. Author: Kirill Reshke <reshkekirill@gmail.com> Discussion: https://postgr.es/m/CALdSSPgBPnpNNzxv0Y+_GNFzW6PmzRZYh+_hpf06Y1N2zLhZaQ@mail.gmail.com	2026-02-19 15:59:20 +09:00
Tom Lane	759b03b24c	Simplify creation of built-in functions with default arguments. Up to now, to create such a function, one had to make a pg_proc.dat entry and then overwrite it with a CREATE OR REPLACE command in system_functions.sql. That's error-prone (cf. bug #19409) and results in leaving dead rows in the initial contents of pg_proc. Manual maintenance of pg_node_tree strings seems entirely impractical, and parsing expressions during bootstrap would be extremely difficult as well. But Andres Freund observed that all the current use-cases are simple constants, and building a Const node is well within the capabilities of bootstrap mode. So this patch invents a special case: if bootstrap mode is asked to ingest a non-null value for pg_proc.proargdefaults (which would otherwise fail in pg_node_tree_in), it parses the value as an array literal and then feeds the element strings to the input functions for the corresponding parameter types. Then we can build a suitable pg_node_tree string with just a few more lines of code. This allows removing all the system_functions.sql entries that are just there to set up default arguments, replacing them with proargdefaults fields in pg_proc.dat entries. The old technique remains available in case someone needs a non-constant default. The initial contents of pg_proc are demonstrably the same after this patch, except that (1) json_strip_nulls and jsonb_strip_nulls now have the correct provolatile setting, as per bug #19409; (2) pg_terminate_backend, make_interval, and drandom_normal now have defaults that don't include a type coercion, which is how they should have been all along. In passing, remove some unused entries from bootstrap.c's TypInfo[] array. I had to add some new ones because we'll now need an entry for each default-possessing system function parameter, but we shouldn't carry more than we need there; it's just a maintenance gotcha. 
Bug: #19409 Reported-by: Lucio Chiessi <lucio.chiessi@trustly.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Author: Andrew Dunstan <andrew@dunslane.net> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/183292bb-4891-4c96-a3ca-e78b5e0e1358@dunslane.net Discussion: https://postgr.es/m/19409-e16cd2605e59a4af@postgresql.org	2026-02-18 14:14:44 -05:00
Heikki Linnakangas	d62dca3b29	Use standard die() handler for SIGTERM in bgworkers The previous default handler, bgworker_die(), would exit with elog(FATAL) directly from the signal handler. That could cause deadlocks or crashes if the signal handler runs while we're e.g. holding a spinlock or in the middle of a memory allocation. All the built-in background workers overrode that to use the normal die() handler and CHECK_FOR_INTERRUPTS(). Let's make that the default for all background workers. Some extensions relying on the old behavior might need to adapt, but the new default is much safer and is the right thing to do for most background workers. Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Discussion: https://www.postgresql.org/message-id/5238fe45-e486-4c62-a7f3-c7d8d416e812@iki.fi	2026-02-18 19:59:34 +02:00
Álvaro Herrera	3894f08abe	Update obsolete comment table_tuple_update's update_indexes argument hasn't been a boolean since commit `19d8e2308b`. Backpatch-through: 16	2026-02-18 18:09:54 +01:00
Michael Paquier	623a90c2ad	Force creation of stamp file after libpq library check in meson builds Previously, if --stamp_file was specified, libpq_check.pl would create a new stamp file only if none could be found. If there was already a stamp file, the script would do nothing, leaving the previous stamp file in place. This logic could cause unnecessary rebuilds because meson relies on the timestamp of the output files to determine if a rebuild should happen. In this case, a stamp file generated during an older check would be kept, but we need a stamp file from the latest moment where the libpq check has been run, so that correct rebuild decisions can be made. This commit changes libpq_check.pl so that a fresh stamp file is created each time libpq_check.pl is run, when --stamp_file is specified. Oversight in commit `4a8e6f43a6`. Reported-by: Andres Freund <andres@anarazel.de> Author: Nazir Bilal Yavuz <byavuz81@gmail.com> Reviewed-by: VASUKI M <vasukim1992002@gmail.com> Discussion: https://postgr.es/m/CAN55FZ22rrN6gCn7urtmTR=_5z7ArZLUJu-TsMChdXwmRTaquA@mail.gmail.com	2026-02-18 16:07:13 +09:00
Michael Paquier	ee642cccc4	Switch SysCacheIdentifier to a typedef enum The main purpose of this change is to allow an ABI checker to understand when the list of SysCacheIdentifier changes, by switching all the routine declarations that relied on a signed integer for a syscache ID to this new type. This is going to be useful in the long-term for versions newer than v19 so as we will be able to check when the list of values in SysCacheIdentifier is updated in a non-ABI compliant fashion. Most of the changes of this commit are due to the new definition of SyscacheCallbackFunction, where a SysCacheIdentifier is now required for the syscache ID. It is a mechanical change, still slightly invasive. There are more areas in the tree that could be improved with an ABI checker in mind; this takes care of only one area. Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Author: Andreas Karlsson <andreas@proxel.se> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/289125.1770913057@sss.pgh.pa.us	2026-02-18 09:58:38 +09:00
Michael Paquier	c06b5b99bb	Add concept of invalid value to SysCacheIdentifier This commit tweaks the generation of the syscache IDs for the enum SysCacheIdentifier to now include an invalid value, with -1 assigned as value. The concept of an invalid syscache ID exists when handling lookups of a ObjectAddress, based on their set of properties in ObjectPropertyType. -1 is used for the case where an object type has no option for a syscache lookup. This has been found as independently useful while discussing a switch of SysCacheIdentifier to a typedef, as we already have places that want to know about the concept of an invalid value when dealing with ObjectAddresses. Reviewed-by: Andreas Karlsson <andreas@proxel.se> Discussion: https://postgr.es/m/aZQRnmp9nVjtxAHS@paquier.xyz	2026-02-18 09:25:52 +09:00
Michael Paquier	f7df12a66c	Fix one-off issue with cache ID in objectaddress.c get_catalog_object_by_oid_extended() has been doing a syscache lookup when given a cache ID strictly higher than 0, which is wrong because the first valid value of SysCacheIdentifier is 0. This issue had no consequences, as the first value assigned in the enum SysCacheIdentifier is AGGFNOID, which is not used in the object type properties listed in objectaddress.c. Even if an ID of 0 was hypothetically given, the code would still work with a less efficient heap-or-index scan. Discussion: https://postgr.es/m/aZTr_R6JGmqokUBb@paquier.xyz	2026-02-18 08:47:58 +09:00
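The off-by-one described in `f7df12a66c` can be illustrated with a minimal sketch. The enum members and helper names below are hypothetical stand-ins, and the `>= 0` comparison is an assumption about the shape of the fix, not a quote of objectaddress.c.

```python
from enum import IntEnum


class CacheId(IntEnum):
    """Hypothetical stand-in for SysCacheIdentifier."""
    INVALID = -1   # no syscache lookup available for this object type
    FIRST = 0      # first valid ID (AGGFNOID in the real enum)
    SECOND = 1


def uses_syscache_old(cache_id: int) -> bool:
    # Old test: "strictly higher than 0" wrongly rejects the first valid ID.
    return cache_id > 0


def uses_syscache_fixed(cache_id: int) -> bool:
    # Assumed shape of the fix: every non-negative ID is a valid cache ID.
    return cache_id >= 0
```

With the old test, an ID of 0 falls through to the slower heap-or-index scan; with the fixed test, only the invalid value does.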
Álvaro Herrera	b7271aa1d7	Use a bitmask for ExecInsertIndexTuples options ... instead of passing a bunch of separate booleans. Also, rearrange the argument list in a hopefully more sensible order. Discussion: https://postgr.es/m/202602111846.xpvuccb3inbx@alvherre.pgsql Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Fabrízio de Royes Mello <fabriziomello@gmail.com> (older version)	2026-02-17 17:59:45 +01:00
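The general idea behind `b7271aa1d7`, one options bitmask in place of several boolean arguments, can be sketched as follows. The flag names and the function are hypothetical and do not reflect the actual ExecInsertIndexTuples() signature.

```python
from enum import IntFlag


class InsertIndexOpts(IntFlag):
    """Hypothetical option flags; the real names in the tree may differ."""
    NONE = 0
    IS_UPDATE = 1 << 0
    NO_DUP_ERROR = 1 << 1
    ONLY_SUMMARIZING = 1 << 2


def insert_index_tuples(opts: InsertIndexOpts) -> dict:
    """Decode the bitmask into the individual settings a callee would test."""
    return {
        "update": bool(opts & InsertIndexOpts.IS_UPDATE),
        "no_dup_error": bool(opts & InsertIndexOpts.NO_DUP_ERROR),
        "only_summarizing": bool(opts & InsertIndexOpts.ONLY_SUMMARIZING),
    }
```

A call such as `insert_index_tuples(InsertIndexOpts.IS_UPDATE | InsertIndexOpts.NO_DUP_ERROR)` names each option at the call site, which is harder to get wrong than a run of positional `True`/`False` arguments.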
Álvaro Herrera	661237056b	Fix memory leak in new GUC check_hook Commit `38e0190ced` forgot to pfree() an allocation (freed in other places of the same function) in only one of several spots in check_log_min_messages(). Per Coverity. Add that. While at it, avoid open-coding guc_strdup(). The new coding does a strlen() that wasn't there before, but I doubt it's measurable.	2026-02-17 16:38:24 +01:00
Heikki Linnakangas	a92b809f9d	Ignore SIGINT in walwriter and walsummarizer Previously, SIGINT was treated the same as SIGTERM in walwriter and walsummarizer. That decision goes back to when the walwriter process was introduced (commit `ad4295728e`), and was later copied to walsummarizer. It was a pretty arbitrary decision back then, and we haven't adopted that convention in all the other processes that have been introduced later. Summary of how other processes respond to SIGINT: - Autovacuum launcher: Cancel the current iteration of launching - bgworker: Ignore (unless connected to a database) - checkpointer: Request shutdown checkpoint - bgwriter: Ignore - pgarch: Ignore - startup process: Ignore - walreceiver: Ignore - IO worker: die() IO workers are a notable exception in that they exit on SIGINT, and there's a documented reason for that: IO workers ignore SIGTERM, so SIGINT provides a way to manually kill them. (They do respond to SIGUSR2, though, like all the other processes that we don't want to exit immediately on SIGTERM on operating system shutdown.) To make this a little more consistent, ignore SIGINT in walwriter and walsummarizer. They have no "query" to cancel, and they react to SIGTERM just fine. Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://www.postgresql.org/message-id/818bafaf-1e77-4c78-8037-d7120878d87c@iki.fi	2026-02-17 17:18:31 +02:00
Peter Eisentraut	451650eaac	Test most StaticAssert macros in C++ extensions Most of the StaticAssert macros already worked in C++ with Clang and GCC (the only compilers we're currently testing C++ extension support for). This adds a regression test for them in our test C++ extension, so we can safely change their implementation without accidentally breaking C++. The only StaticAssert macros that don't work yet are StaticAssertVariableIsOfType and StaticAssertVariableIsOfTypeMacro. These will be added in a follow-on commit. Author: Jelte Fennema-Nio <postgres@jeltef.nl> Discussion: https://www.postgresql.org/message-id/flat/CAGECzQR21OnnKiZO_1rLWO0-16kg1JBxnVq-wymYW0-_1cUNtg@mail.gmail.com	2026-02-17 10:17:57 +01:00
Peter Eisentraut	3d28ecb5ac	Test List macros in C++ extensions All of these macros already work in C++ with Clang and GCC (the only compilers we're currently testing C++ extension support for). This adds a regression test for them in our test C++ extension, so we can safely change their implementation without accidentally breaking C++. Some of the List macros didn't work in C++ in the past (see commit `d5ca15ee5`), and this would have caught that. Author: Jelte Fennema-Nio <postgres@jeltef.nl> Discussion: https://www.postgresql.org/message-id/flat/CAGECzQR21OnnKiZO_1rLWO0-16kg1JBxnVq-wymYW0-_1cUNtg@mail.gmail.com	2026-02-17 10:17:57 +01:00
Thomas Munro	bd626ef093	Fix test_valid_server_encoding helper function. Commit `c67bef3f32` introduced this test helper function for use by src/test/regress/sql/encoding.sql, but its logic was incorrect. It confused an encoding ID for a boolean so it gave the wrong results for some inputs, and also forgot the usual return macro. The mistake didn't affect values actually used in the test, so there is no change in behavior. Also drop it and another missed function at the end of the test, for consistency. Backpatch-through: 14 Author: Zsolt Parragi <zsolt.parragi@percona.com>	2026-02-17 16:12:05 +13:00
Noah Misch	8cef93d8a5	Suppress new "may be used uninitialized" warning. Various buildfarm members, having compilers like gcc 8.5 and 6.3, fail to deduce that text_substring() variable "E" is initialized if slice_size!=-1. This suppression approach quiets gcc 8.5; I did not reproduce the warning elsewhere. Back-patch to v14, like commit `9f4fd119b2`. Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/1157953.1771266105@sss.pgh.pa.us Backpatch-through: 14	2026-02-16 18:04:58 -08:00
Michael Paquier	a6f823e778	hstore: Fix NULL pointer dereference with receive function The receive function of hstore was not able to handle correctly duplicate key values when a new duplicate links to a NULL value, where a pfree() could be attempted on a NULL pointer, crashing due to a pointer dereference. This problem would happen for a COPY BINARY, when stacking values like that: aa => 5 aa => null The second key/value pair is discarded and pfree() calls are attempted on its key and its value, leading to a pointer dereference for the value part as the value is NULL. The first key/value pair takes priority when a duplicate is found. Per offline report. Reported-by: "Anemone" <vergissmeinnichtzh@gmail.com> Reported-by: "A1ex" <alex000young@gmail.com> Backpatch-through: 14	2026-02-17 08:41:26 +09:00
Nathan Bossart	b33f753612	pg_upgrade: Use COPY for LO metadata for upgrades from < v12. Before v12, pg_largeobject_metadata was defined WITH OIDS, so unlike newer versions, the "oid" column was a hidden system column that pg_dump's getTableAttrs() will not pick up. Thus, for commit `161a3e8b68`, we did not bother trying to use COPY for pg_largeobject_metadata for upgrades from older versions. This commit removes that restriction by adjusting the query in getTableAttrs() to pick up the "oid" system column and by teaching dumpTableData_copy() to use COPY (SELECT ...) for this catalog, since system columns cannot be used in COPY's column list. Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/aYzuAz_ITUpd9ZvH%40nathan	2026-02-16 15:13:06 -06:00
Tom Lane	6be5b76d66	Ensure that all three build methods install the same set of files. syscache_info.h was installed into $installdir/include/server/catalog if you use a non-VPATH autoconf build, but not if you use a VPATH build or meson. That happened because the makefiles blindly install src/include/catalog/*.h, and in a non-VPATH build the generated header files would be swept up in that. While it's hard to conjure a reason to need syscache_info.h outside of backend build, it's also hard to get the makefiles to skip syscache_info.h, so let's go the other way and install it in the other two cases too. Another problem, new in v19, was that meson builds install a copy of src/include/catalog/README, while autoconf builds do not. The issue here is that that file is new and wasn't added to meson.build's exclusion list. While it's clearly a bug if different build methods don't install the same set of files, I doubt anyone would thank us for changing the behavior in released branches. Hence, fix in master only. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/946828.1771185367@sss.pgh.pa.us	2026-02-16 15:20:15 -05:00
Daniel Gustafsson	db93988ab0	doc: Add note to ssl_group config on X25519 and FIPS The X25519 curve is not allowed when OpenSSL is configured for FIPS mode, so add a note to the documentation that the default setting must be altered for such setups. Author: Daniel Gustafsson <daniel@yesql.se> Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/3521653.1770666093@sss.pgh.pa.us	2026-02-16 15:11:29 +01:00
Daniel Gustafsson	07e90c6913	Avoid using the X25519 curve in ssl tests The X25519 curve is disallowed when OpenSSL is configured for FIPS mode which makes the testsuite fail. Since X25519 isn't required for the tests we can remove it to allow FIPS enabled configurations to run the tests. Author: Daniel Gustafsson <daniel@yesql.se> Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/3521653.1770666093@sss.pgh.pa.us	2026-02-16 15:10:16 +01:00
Peter Eisentraut	d50c86e743	Change remaining StaticAssertStmt() to StaticAssertDecl() This completes the work started by commit `75f49221c2`. In basebackup.c, changing the StaticAssertStmt to StaticAssertDecl results in having the same StaticAssertDecl() in 2 functions. So, it makes more sense to move it to file scope instead. Also, as it depends on some computations based on 2 tar blocks, define TAR_NUM_TERMINATION_BLOCKS. In deadlock.c, change the StaticAssertStmt to StaticAssertDecl and keep it in the function scope. Add new braces to avoid warning from -Wdeclaration-after-statement. In aset.c, change the StaticAssertStmt to StaticAssertDecl and move it to file scope. Finally, update the comments in c.h a bit. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Co-authored-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://www.postgresql.org/message-id/aYH6ii46AvGVCB84%40ip-10-97-1-34.eu-west-3.compute.internal	2026-02-16 09:22:43 +01:00
Fujii Masao	351265a6c7	Remove recovery.signal at recovery end when both signal files are present. When both standby.signal and recovery.signal are present, standby.signal takes precedence and the server runs in standby mode. Previously, in this case, recovery.signal was not removed at the end of standby mode (i.e., on promotion) or at the end of archive recovery, while standby.signal was removed. As a result, a leftover recovery.signal could cause a subsequent restart to enter archive recovery unexpectedly, potentially preventing the server from starting. This behavior was surprising and confusing to users. This commit fixes the issue by updating the recovery code to remove recovery.signal alongside standby.signal when both files are present and recovery completes. Because this code path is particularly sensitive and changes in recovery behavior can be risky for stable branches, this change is applied only to the master branch. Reported-by: Nikolay Samokhvalov <nik@postgres.ai> Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: David Steele <david@pgbackrest.org> Discussion: https://postgr.es/m/CAM527d8PVAQFLt_ndTXE19F-XpDZui861882L0rLY3YihQB8qA@mail.gmail.com	2026-02-16 13:57:38 +09:00
Michael Paquier	459576303d	pgcrypto: Tweak error message for incorrect session key length The error message added in `379695d3cc` referred to the public key being too long. This is confusing as it is in fact the session key included in a PGP message which is too long. This is harmless, but let's be precise about what is wrong. Per offline report. Reported-by: Zsolt Parragi <zsolt.parragi@percona.com> Backpatch-through: 14	2026-02-16 12:18:18 +09:00
Noah Misch	9f4fd119b2	Fix SUBSTRING() for toasted multibyte characters. Commit `1e7fe06c10` changed pg_mbstrlen_with_len() to ereport(ERROR) if the input ends in an incomplete character. Most callers want that. text_substring() does not. It detoasts the most bytes it could possibly need to get the requested number of characters. For example, to extract up to 2 chars from UTF8, it needs to detoast 8 bytes. In a string of 3-byte UTF8 chars, 8 bytes spans 2 complete chars and 1 partial char. Fix this by replacing this pg_mbstrlen_with_len() call with a string traversal that differs by stopping upon finding as many chars as the substring could need. This also makes SUBSTRING() stop raising an encoding error if the incomplete char is past the end of the substring. This is consistent with the general philosophy of the above commit, which was to raise errors on a just-in-time basis. Before the above commit, SUBSTRING() never raised an encoding error. SUBSTRING() has long been detoasting enough for one more char than needed, because it did not distinguish exclusive and inclusive end position. For avoidance of doubt, stop detoasting extra. Back-patch to v14, like the above commit. For applications using SUBSTRING() on non-ASCII column values, consider applying this to your copy of any of the February 12, 2026 releases. Reported-by: SATŌ Kentarō <ranvis@gmail.com> Reviewed-by: Thomas Munro <thomas.munro@gmail.com> Bug: #19406 Discussion: https://postgr.es/m/19406-9867fddddd724fca@postgresql.org Backpatch-through: 14	2026-02-14 12:16:16 -08:00
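The traversal described in `9f4fd119b2` can be sketched in a few lines: count characters only until the substring has as many as it needs, so a character cut off by the detoast window is never inspected. This is an illustration with hypothetical function names, not the text_substring() code.

```python
def utf8_char_len(lead: int) -> int:
    """Length in bytes of a UTF-8 sequence, judged from its lead byte."""
    if lead < 0x80:
        return 1
    if lead >> 5 == 0b110:
        return 2
    if lead >> 4 == 0b1110:
        return 3
    if lead >> 3 == 0b11110:
        return 4
    raise ValueError("invalid UTF-8 lead byte")


def bytes_for_chars(buf: bytes, max_chars: int) -> int:
    """Return the byte length of the first max_chars characters in buf.

    The loop stops once enough characters are found, so an incomplete
    character past that point does not raise an error.
    """
    pos = 0
    for _ in range(max_chars):
        if pos >= len(buf):
            break
        length = utf8_char_len(buf[pos])
        if pos + length > len(buf):
            raise ValueError("incomplete character inside the requested range")
        pos += length
    return pos
```

For three 3-byte characters truncated to an 8-byte window, asking for 2 characters returns 6 bytes and ignores the trailing partial character; asking for 3 characters raises, because the incomplete character then falls inside the requested range.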
Noah Misch	4644f8b23b	pg_mblen_range, pg_mblen_with_len: Valgrind after encoding ereport. The prior order caused spurious Valgrind errors. They're spurious because the ereport(ERROR) non-local exit discards the pointer in question. pg_mblen_cstr() ordered the checks correctly, but these other two did not. Back-patch to v14, like commit `1e7fe06c10`. Reviewed-by: Thomas Munro <thomas.munro@gmail.com> Discussion: https://postgr.es/m/20260214053821.fa.noahmisch@microsoft.com Backpatch-through: 14	2026-02-14 12:16:16 -08:00
John Naylor	ef3c3cf6d0	Perform radix sort on SortTuples with pass-by-value Datums Radix sort can be much faster than quicksort, but for our purposes it is limited to sequences of unsigned bytes. To make tuples with other types amenable to this technique, several features of tuple comparison must be accounted for, i.e. the sort key must be "normalized": 1. Signedness -- It's possible to modify a signed integer such that it can be compared as unsigned. For example, a signed char has range -128 to 127. If we cast that to unsigned char and add 128, the range of values becomes 0 to 255 while preserving order. 2. Direction -- SQL allows specification of ASC or DESC. The descending case is easily handled by taking the complement of the unsigned representation. 3. NULL values -- NULLS FIRST and NULLS LAST must work correctly. This commit only handles the case where datum1 is pass-by-value Datum (possibly abbreviated) that compares like an ordinary integer. (Abbreviations of values of type "numeric" are a convenient counterexample.) First, tuples are partitioned by nullness in the correct NULL ordering. Then the NOT NULL tuples are sorted with radix sort on datum1. For tiebreaks on subsequent sortkeys (including the first sort key if abbreviated), we divert to the usual qsort. ORDER BY queries on pre-warmed buffers are up to 2x faster on high cardinality inputs with radix sort than the sort specializations added by commit `697492434`, so get rid of them. It's sufficient to fall back to qsort_tuple() for small arrays. Moderately low cardinality inputs show more modest improvements. Our qsort is strongly optimized for very low cardinality inputs, but radix sort is usually equal or very close in those cases. The changes to the regression tests are caused by under-specified sort orders, e.g. "SELECT a, b from mytable order by a;". For unstable sorts, such as our qsort and this in-place radix sort, there is no guarantee of the order of "b" within each group of "a". 
The implementation is taken from ska_byte_sort() (Boost licensed), which is similar to American flag sort (an in-place radix sort) with modifications to make it better suited for modern pipelined CPUs. The technique of normalization described above can also be extended to the case of multiple keys. That is left for future work (Thanks to Peter Geoghegan for the suggestion to look into this area). Reviewed-by: Chengpeng Yan <chengpeng_yan@outlook.com> Reviewed-by: zengman <zengman@halodbtech.com> Reviewed-by: ChangAo Chen <cca5507@qq.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Chao Li <li.evan.chao@gmail.com> (earlier version) Discussion: https://postgr.es/m/CANWCAZYzx7a7E9AY16Jt_U3+GVKDADfgApZ-42SYNiig8dTnFA@mail.gmail.com	2026-02-14 13:50:06 +07:00
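The three normalization steps listed in `ef3c3cf6d0` can be sketched for the signed-char example the message uses. The function names are hypothetical, and `list.sort()` stands in for the radix sort itself; only the key transformation is the point.

```python
def normalize_int8(value: int, descending: bool = False) -> int:
    """Map a signed 8-bit integer to an unsigned byte that sorts the same way."""
    if not -128 <= value <= 127:
        raise ValueError("value out of range for a signed char")
    key = (value + 128) & 0xFF   # signedness: -128..127 becomes 0..255
    if descending:
        key = ~key & 0xFF        # direction: the complement reverses the order
    return key


def sort_values(values, descending=False, nulls_first=False):
    """Partition by nullness, then order non-NULL values by normalized key."""
    nulls = [v for v in values if v is None]
    non_nulls = [v for v in values if v is not None]
    # A plain comparison sort on the normalized key stands in for radix sort.
    non_nulls.sort(key=lambda v: normalize_int8(v, descending))
    return nulls + non_nulls if nulls_first else non_nulls + nulls
```

Because every key is an unsigned byte after normalization, one byte-wise pass can order ascending and descending inputs alike, and NULL placement is settled by the partition step before any keys are compared.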
Daniel Gustafsson	aa082bed0b	doc: Mention PASSING support for jsonpath variables Commit `dfd79e2d` added a TODO comment to update this paragraph when support for PASSING was added. Commit `6185c9737c` added PASSING but missed resolving this TODO. Fix by expanding the paragraph with a reference to PASSING. Author: Aditya Gollamudi <adigollamudi@gmail.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://postgr.es/m/20260117051406.sx6pss4ryirn2x4v@pgs	2026-02-13 12:12:11 +01:00
Daniel Gustafsson	4469fe1761	doc: Update docs images README with required ditaa version The URL for Ditaa linked to the old Sourceforge version which is too old for what we need; the fork over on Github is the correct version to use for re-generating the SVG files for the docs. The required Ditaa version is 0.11.0, as that is when SVG support was added. Running the version found on Sourceforge produces the error below: $ ditaa -E -S --svg in.txt out.txt Unrecognized option: --svg usage: ditaa <INPFILE> [OUTFILE] [-A] [-b <BACKGROUND>] [-d] [-E] [-e <ENCODING>] [-h] [--help] [-o] [-r] [-S] [-s <SCALE>] [-T] [-t <TABS>] [-v] [-W] While there, also mention that meson rules exist for building images. Author: Nazir Bilal Yavuz <byavuz81@gmail.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Paul A Jungwirth <pj@illuminatedcomputing.com> Discussion: https://postgr.es/m/CAN55FZ2O-23xERF2NYcvv9DM_1c9T16y6mi3vyP=O1iuXS0ASA@mail.gmail.com	2026-02-13 11:50:17 +01:00
Daniel Gustafsson	4ec0e75afd	meson: Add target for generating docs images This adds an 'images' target to the meson build system in order to be able to regenerate the images used in the docs. Author: Nazir Bilal Yavuz <byavuz81@gmail.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Reported-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://postgr.es/m/CAN55FZ0c0Tcjx9=e-YibWGHa1-xmdV63p=THH4YYznz+pYcfig@mail.gmail.com	2026-02-13 11:50:14 +01:00
Michael Paquier	6736dea14a	pg_dump: Use pg_malloc_object() and pg_malloc_array() The idea is to encourage more the use of these allocation routines across the tree, as these offer stronger type safety guarantees than pg_malloc() & co (type cast in the result, sizeof() embedded). This set of changes is dedicated to the pg_dump code. Similar work has been done as of `31d3847a37`, as one example. Author: Peter Smith <smithpb2250@gmail.com> Reviewed-by: Aleksander Alekseev <aleksander@tigerdata.com> Discussion: https://postgr.es/m/CAHut+PvpGPDLhkHAoxw_g3jdrYxA1m16a8uagbgH3TGWSKtXNQ@mail.gmail.com	2026-02-13 19:48:35 +09:00
Daniel Gustafsson	53c6bd0aa3	Restart BackgroundPsql's timer more nicely. Use BackgroundPsql's published API for automatically restarting its timer for each query, rather than manually reaching into it to achieve the same thing. 010_tab_completion.pl's logic for this predates the invention of BackgroundPsql (and `664d75753` missed the opportunity to make it cleaner). 030_pager.pl copied-and-pasted the code. Author: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Reviewed-by: Andrew Dunstan <andrew@dunslane.net> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/1100715.1712265845@sss.pgh.pa.us	2026-02-13 11:36:31 +01:00
Michael Paquier	775fc01415	Improve error message for checksum failures in pgstat_database.c This log message was referring to conflicts, but it is about checksum failures. The log message improved in this commit should never show up, due to the fact that pgstat_prepare_report_checksum_failure() should always be called before pgstat_report_checksum_failures_in_db(), with a stats entry already created in the pgstats shared hash table. The three code paths able to report database-level checksum failures follow already this requirement. Oversight in `b96d3c3897`. Author: Wang Peng <215722532@qq.com> Discussion: https://postgr.es/m/tencent_9B6CD6D9D34AE28CDEADEC6188DB3BA1FE07@qq.com Backpatch-through: 18	2026-02-13 12:17:08 +09:00
Heikki Linnakangas	d7edcec35c	Make pg_numa_query_pages() work in frontend programs It's currently only used in the server, but it was placed in src/port with the idea that it might be useful in client programs too. However, it will currently fail to link if used in a client program, because CHECK_FOR_INTERRUPTS() is not usable in client programs. Fix that by wrapping it in "#ifndef FRONTEND". Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://www.postgresql.org/message-id/21cc7a48-99d9-4f69-9a3f-2c2de61ac8e5%40iki.fi Backpatch-through: 18	2026-02-12 19:41:06 +02:00
Heikki Linnakangas	d7a4291bb7	Fix comment neglected in commit `ddc3250208` I renamed the field in commit `ddc3250208`, but missed this one reference.	2026-02-12 19:41:02 +02:00
Nathan Bossart	a468898883	Remove specialized word-length popcount implementations. The uses of these functions do not justify the level of micro-optimization we've done and may even hurt performance in some cases (e.g., due to using function pointers). This commit removes all architecture-specific implementations of pg_popcount{32,64} and converts the portable ones to inlined functions in pg_bitutils.h. These inlined versions should produce the same code as before (but inlined), so in theory this is a net gain for many machines. A follow-up commit will replace the remaining loops over these word-length popcount functions with calls to pg_popcount(), further reducing the need for architecture-specific implementations. Suggested-by: John Naylor <johncnaylorls@gmail.com> Reviewed-by: John Naylor <johncnaylorls@gmail.com> Reviewed-by: Greg Burd <greg@burd.me> Discussion: https://postgr.es/m/CANWCAZY7R%2Biy%2Br9YM_sySNydHzNqUirx1xk0tB3ej5HO62GdgQ%40mail.gmail.com	2026-02-12 11:32:49 -06:00
Nathan Bossart	cb7b2e5e8e	Remove some unnecessary optimizations in popcount code. Over the past few releases, we've added a huge amount of complexity to our popcount implementations. Commits `fbe327e5b4`, `79e232ca01`, `8c6653516c`, and `25dc485074` did some preliminary refactoring, but many opportunities remain. In particular, if we disclaim interest in micro-optimizing this code for 32-bit builds and in unnecessary alignment checks on x86-64, we can remove a decent chunk of code. I cannot find public discussion or benchmarks for the code this commit removes, but it seems unlikely that this change will noticeably impact performance on affected systems. Suggested-by: John Naylor <johncnaylorls@gmail.com> Reviewed-by: John Naylor <johncnaylorls@gmail.com> Discussion: https://postgr.es/m/CANWCAZY7R%2Biy%2Br9YM_sySNydHzNqUirx1xk0tB3ej5HO62GdgQ%40mail.gmail.com	2026-02-12 11:32:49 -06:00
Dean Rasheed	88327092ff	Add support for INSERT ... ON CONFLICT DO SELECT. This adds a new ON CONFLICT action DO SELECT [FOR UPDATE/SHARE], which returns the pre-existing rows when conflicts are detected. The INSERT statement must have a RETURNING clause when DO SELECT is specified. The optional FOR UPDATE/SHARE clause allows the rows to be locked before they are returned. As with a DO UPDATE conflict action, an optional WHERE clause may be used to prevent rows from being selected for return (but as with a DO UPDATE action, rows filtered out by the WHERE clause are still locked). Bumps catversion as stored rules change. Author: Andreas Karlsson <andreas@proxel.se> Author: Marko Tiikkaja <marko@joh.to> Author: Viktor Holmberg <v@viktorh.net> Reviewed-by: Joel Jacobson <joel@compiler.org> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com> Reviewed-by: Jian He <jian.universality@gmail.com> Discussion: https://postgr.es/m/d631b406-13b7-433e-8c0b-c6040c4b4663@Spark Discussion: https://postgr.es/m/5fca222d-62ae-4a2f-9fcb-0eca56277094@Spark Discussion: https://postgr.es/m/2b5db2e6-8ece-44d0-9890-f256fdca9f7e@proxel.se Discussion: https://postgr.es/m/CAL9smLCdV-v3KgOJX3mU19FYK82N7yzqJj2HAwWX70E=P98kgQ@mail.gmail.com	2026-02-12 09:57:04 +00:00
Amit Kapila	788ec96d59	Refactor slot synchronization logic in slotsync.c. Following `e68b6adad9`, the reason for skipping slot synchronization is stored as a slot property. This commit removes redundant function parameters that previously tracked this state, instead relying directly on the slot property. Additionally, this change centralizes the logic for skipping synchronization when required WAL has not yet been received or flushed. By consolidating this check, we reduce code duplication and the risk of inconsistent state updates across different code paths. In passing, add an assertion to ensure a slot is marked as temporary if a consistent point has not been reached during synchronization. Author: Zhijie Hou <houzj.fnst@fujitsu.com> Reviewed-by: Shveta Malik <shveta.malik@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/TY4PR01MB16907DD16098BE3B20486D4569463A@TY4PR01MB16907.jpnprd01.prod.outlook.com Discussion: https://postgr.es/m/CAFPTHDZAA+gWDntpa5ucqKKba41=tXmoXqN3q4rpjO9cdxgQrw@mail.gmail.com	2026-02-12 14:38:31 +05:30
Dean Rasheed	706cadde32	Remove p_is_insert from struct ParseState. The only place that used p_is_insert was transformAssignedExpr(), which used it to distinguish INSERT from UPDATE when handling indirection on assignment target columns -- see commit `c1ca3a19df`. However, this information is already available to transformAssignedExpr() via its exprKind parameter, which is always either EXPR_KIND_INSERT_TARGET or EXPR_KIND_UPDATE_TARGET. As noted in the commit message for `c1ca3a19df`, this use of p_is_insert isn't particularly pretty, so have transformAssignedExpr() use the exprKind parameter instead. This then allows p_is_insert to be removed entirely, which simplifies state management in a few other places across the parser. Author: Viktor Holmberg <v@viktorh.net> Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com> Discussion: https://postgr.es/m/badc3b4c-da73-4000-b8d3-638a6f53a769@Spark	2026-02-12 09:01:42 +00:00
Richard Guo	cf74558feb	Reduce LEFT JOIN to ANTI JOIN using NOT NULL constraints For a LEFT JOIN, if any var from the right-hand side (RHS) is forced to null by upper-level quals but is known to be non-null for any matching row, the only way the upper quals can be satisfied is if the join fails to match, producing a null-extended row. Thus, we can treat this left join as an anti-join. Previously, this transformation was limited to cases where the join's own quals were strict for the var forced to null by upper qual levels. This patch extends the logic to check table constraints, leveraging the NOT NULL attribute information already available thanks to the infrastructure introduced by `e2debb643`. If a forced-null var belongs to the RHS and is defined as NOT NULL in the schema (and is not nullable due to lower-level outer joins), we know that the left join can be reduced to an anti-join. Note that to ensure the var is not nullable by any lower-level outer joins within the current subtree, we collect the relids of base rels that are nullable within each subtree during the first pass of the reduce-outer-joins process. This allows us to verify in the second pass that a NOT NULL var is indeed safe to treat as non-nullable. Based on a proposal by Nicolas Adenis-Lamarre, but this is not the original patch. Suggested-by: Nicolas Adenis-Lamarre <nicolas.adenis.lamarre@gmail.com> Author: Tender Wang <tndrwang@gmail.com> Co-authored-by: Richard Guo <guofenglinux@gmail.com> Discussion: https://postgr.es/m/CACPGbctKMDP50PpRH09in+oWbHtZdahWSroRstLPOoSDKwoFsw@mail.gmail.com	2026-02-12 15:30:13 +09:00
Tom Lane	9863c90759	Fix plpgsql's handling of "return simple_record_variable". If the variable's value is null, exec_stmt_return() missed filling in estate->rettype. This is a pretty old bug, but we'd managed not to notice because that value isn't consulted for a null result ... unless we have to cast it to a domain. That case led to a failure with "cache lookup failed for type 0". The correct way to assign the data type is known by exec_eval_datum. While we could copy-and-paste that logic, it seems like a better idea to just invoke exec_eval_datum, as the ROW case already does. Reported-by: Pavel Stehule <pavel.stehule@gmail.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAFj8pRBT_ahexDf-zT-cyH8bMR_qcySKM8D5nv5MvTWPiatYGA@mail.gmail.com Backpatch-through: 14	2026-02-11 16:53:14 -05:00
Heikki Linnakangas	78a5e3074b	Fix pg_stat_get_backend_wait_event() for aux processes The pg_stat_activity view shows information for aux processes, but the pg_stat_get_backend_wait_event() and pg_stat_get_backend_wait_event_type() functions did not. To fix, call AuxiliaryPidGetProc(pid) if BackendPidGetProc(pid) returns NULL, like we do in pg_stat_get_activity(). In version 17 and above, it's a little silly to use those functions when we already have the ProcNumber at hand, but it was necessary before v17 because the backend ID was different from ProcNumber. I have other plans for wait_event_info on master, so it doesn't seem worth applying a different fix on different versions now. Reviewed-by: Sami Imseih <samimseih@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Discussion: https://www.postgresql.org/message-id/c0320e04-6e85-4c49-80c5-27cfb3a58108@iki.fi Backpatch-through: 14	2026-02-11 18:50:57 +02:00
Nathan Bossart	1d92e0c2cc	Add password expiration warnings. This commit adds a new parameter called password_expiration_warning_threshold that controls when the server begins emitting imminent-password-expiration warnings upon successful password authentication. By default, this parameter is set to 7 days, but this functionality can be disabled by setting it to 0. This patch also introduces a new "connection warning" infrastructure that can be reused elsewhere. For example, we may want to warn about the use of MD5 passwords for a couple of releases before removing MD5 password support. Author: Gilles Darold <gilles@darold.net> Co-authored-by: Nathan Bossart <nathandbossart@gmail.com> Reviewed-by: Japin Li <japinli@hotmail.com> Reviewed-by: songjinzhou <tsinghualucky912@foxmail.com> Reviewed-by: liu xiaohui <liuxh.zj.cn@gmail.com> Reviewed-by: Yuefei Shi <shiyuefei1004@gmail.com> Reviewed-by: Steven Niu <niushiji@gmail.com> Reviewed-by: Soumya S Murali <soumyamurali.work@gmail.com> Reviewed-by: Euler Taveira <euler@eulerto.com> Reviewed-by: Zsolt Parragi <zsolt.parragi@percona.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Greg Sabino Mullane <htamfids@gmail.com> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/129bcfbf-47a6-e58a-190a-62fc21a17d03%40migops.com	2026-02-11 10:36:15 -06:00
Tom Lane	a3fd53babb	Further stabilize a postgres_fdw test case. The buildfarm occasionally shows a variant row order in the output of this UPDATE ... RETURNING, implying that the preceding INSERT dropped one of the rows into some free space within the table rather than appending them all at the end. It's not entirely clear why that happens sometimes and not other times, but we have established that it's affected by concurrent activity in other databases of the cluster. In any case, the behavior is not wrong; the test is at fault for presuming that a seqscan will give deterministic row ordering. Add an ORDER BY atop the update to stop the buildfarm noise. The buildfarm seems to have shown this only in v18 and master branches, but just in case the cause is older, back-patch to all supported branches. Discussion: https://postgr.es/m/3866274.1770743162@sss.pgh.pa.us Backpatch-through: 14	2026-02-11 11:03:17 -05:00
Álvaro Herrera	1efdd7cc63	Cleanup for log_min_messages changes in `38e0190ced` * Remove an unused variable * Use "default log level" consistently (instead of "generic") * Keep the process types in alphabetical order (missed one place in the SGML docs) * Since log_min_messages type was changed from enum to string, it is a good idea to add single quotes when printing it out. Otherwise it fails if the user copies and pastes from the SHOW output to SET, except in the simplest case. Using single quotes reduces confusion. * Use lowercase string for the burned-in default value, to keep the same output as previous versions. Author: Euler Taveira <euler@eulerto.com> Author: Man Zeng <zengman@halodbtech.com> Author: Noriyoshi Shinoda <noriyoshi.shinoda@hpe.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/202602091250.genyflm2d5dw@alvherre.pgsql	2026-02-11 16:38:18 +01:00
Heikki Linnakangas	7984ce7a1d	Move ProcStructLock to the ProcGlobal struct It protects the freeProcs and some other fields in ProcGlobal, so let's move it there. It's good for cache locality to have it next to the thing it protects, and just makes more sense anyway. I believe it was allocated as a separate shared memory area just for historical reasons. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Discussion: https://www.postgresql.org/message-id/b78719db-0c54-409f-b185-b0d59261143f@iki.fi	2026-02-11 16:48:45 +02:00
Dean Rasheed	bc953bf523	doc: Mention all SELECT privileges required by INSERT ... ON CONFLICT. On the INSERT page, mention that SELECT privileges are also required for any columns mentioned in the arbiter clause, including those referred to by the constraint, and clarify that this applies to all forms of ON CONFLICT, not just ON CONFLICT DO UPDATE. Author: Dean Rasheed <dean.a.rasheed@gmail.com> Reviewed-by: Viktor Holmberg <v@viktorh.net> Discussion: https://postgr.es/m/CAEZATCXGwMQ+x00YY9XYG46T0kCajH=21QaYL9Xatz0dLKii+g@mail.gmail.com Backpatch-through: 14	2026-02-11 10:52:58 +00:00
Dean Rasheed	227a6ea657	doc: Clarify RLS policies applied for ON CONFLICT DO NOTHING. On the CREATE POLICY page, the description of per-command policies stated that SELECT policies are applied when an INSERT has an ON CONFLICT DO NOTHING clause. However, that is only the case if it includes an arbiter clause, so clarify that. While at it, also clarify the comment in the regression tests that cover this. Author: Dean Rasheed <dean.a.rasheed@gmail.com> Reviewed-by: Viktor Holmberg <v@viktorh.net> Discussion: https://postgr.es/m/CAEZATCXGwMQ+x00YY9XYG46T0kCajH=21QaYL9Xatz0dLKii+g@mail.gmail.com Backpatch-through: 14	2026-02-11 10:25:05 +00:00
Heikki Linnakangas	ab32a9e21d	Remove useless store to local variable It was a leftover from commit `5764f611e1`, which converted the loop to use dclist_foreach. Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://www.postgresql.org/message-id/3dd6f70c-b94d-4428-8e75-74a7136396be@iki.fi	2026-02-11 11:49:18 +02:00
Robert Haas	7358abcc60	Store information about Append node consolidation in the final plan. An extension (or core code) might want to reconstruct the planner's decisions about whether and where to perform partitionwise joins from the final plan. To do so, it must be possible to find all of the RTIs of partitioned tables appearing in the plan. But when an AppendPath or MergeAppendPath pulls up child paths from a subordinate AppendPath or MergeAppendPath, the RTIs of the subordinate path do not appear in the final plan, making this kind of reconstruction impossible. To avoid this, propagate the RTI sets that would have been present in the 'apprelids' field of the subordinate Append or MergeAppend nodes that would have been created into the surviving Append or MergeAppend node, using a new 'child_append_relid_sets' field for that purpose. The value of this field is a list of Bitmapsets, because each relation whose append-list was pulled up had its own set of RTIs: just one, if it was a partitionwise scan, or more than one, if it was a partitionwise join. Since our goal is to see where partitionwise joins were done, it is essential to avoid losing the information about how the RTIs were grouped in the pulled-up relations. This commit also updates pg_overexplain so that EXPLAIN (RANGE_TABLE) will display the saved RTI sets. Co-authored-by: Robert Haas <rhaas@postgresql.org> Co-authored-by: Lukas Fittl <lukas@fittl.com> Reviewed-by: Lukas Fittl <lukas@fittl.com> Reviewed-by: Jakub Wartak <jakub.wartak@enterprisedb.com> Reviewed-by: Greg Burd <greg@burd.me> Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com> Reviewed-by: Amit Langote <amitlangote09@gmail.com> Reviewed-by: Haibo Yan <tristan.yim@gmail.com> Reviewed-by: Alexandra Wang <alexandra.wang.oss@gmail.com> Discussion: http://postgr.es/m/CA+TgmoZ-Jh1T6QyWoCODMVQdhTUPYkaZjWztzP1En4=ZHoKPzw@mail.gmail.com	2026-02-10 17:55:59 -05:00
Michael Paquier	9181c870ba	Improve type handling of varlena structures This commit changes the definition of varlena to a typedef, so that it becomes possible to remove "struct" markers from various declarations in the code base. Historically, "struct" markers are not the project style for variable declarations, so this update simplifies the code and makes it more consistent across the board. This change has an impact on the following structures, simplifying declarations using them: varlena, varatt_indirect, and varatt_external. This cleanup came up in a different patch set that played with TOAST and varatt.h, and is independently worth doing on its own. Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Andreas Karlsson <andreas@proxel.se> Reviewed-by: Shinya Kato <shinya11.kato@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/aW8xvVbovdhyI4yo@paquier.xyz	2026-02-11 07:33:24 +09:00
Robert Haas	0d4391b265	Store information about elided nodes in the final plan. An extension (or core code) might want to reconstruct the planner's choice of join order from the final plan. To do so, it must be possible to find all of the RTIs that were part of the join problem in that plan. Commit `adbad833f3`, together with the earlier work in `8c49a484e8`, is enough to let us match up RTIs we see in the final plan with RTIs that we see during the planning cycle, but we still have a problem if the planner decides to drop some RTIs out of the final plan altogether. To fix that, when setrefs.c removes a SubqueryScan, single-child Append, or single-child MergeAppend from the final Plan tree, record the type of the removed node and the RTIs that the removed node would have scanned in the final plan tree. It would be natural to record this information on the child of the removed plan node, but that would require adding an additional pointer field to type Plan, which seems undesirable. So, instead, store the information in a separate list that the executor need never consult, and use the plan_node_id to identify the plan node with which the removed node is logically associated. Also, update pg_overexplain to display these details. Reviewed-by: Lukas Fittl <lukas@fittl.com> Reviewed-by: Jakub Wartak <jakub.wartak@enterprisedb.com> Reviewed-by: Greg Burd <greg@burd.me> Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com> Reviewed-by: Amit Langote <amitlangote09@gmail.com> Reviewed-by: Haibo Yan <tristan.yim@gmail.com> Reviewed-by: Alexandra Wang <alexandra.wang.oss@gmail.com> Discussion: http://postgr.es/m/CA+TgmoZ-Jh1T6QyWoCODMVQdhTUPYkaZjWztzP1En4=ZHoKPzw@mail.gmail.com	2026-02-10 16:46:05 -05:00
Robert Haas	adbad833f3	Store information about range-table flattening in the final plan. Suppose that we're currently planning a query and, when that same query was previously planned and executed, we learned something about how a certain table within that query should be planned. We want to take note when that same table is being planned during the current planning cycle, but this is difficult to do, because the RTI of the table from the previous plan won't necessarily be equal to the RTI that we see during the current planning cycle. This is because each subquery has a separate range table during planning, but these are flattened into one range table when constructing the final plan, changing RTIs. Commit `8c49a484e8` allows us to match up subqueries seen in the previous planning cycles with the subqueries currently being planned just by comparing textual names, but that's not quite enough to let us deduce anything about individual tables, because we don't know where each subquery's range table appears in the final, flattened range table. To fix that, store a list of SubPlanRTInfo objects in the final planned statement, each including the name of the subplan, the offset at which it begins in the flattened range table, and whether or not it was a dummy subplan -- if it was, some RTIs may have been dropped from the final range table, but also there's no need to control how a dummy subquery gets planned. The toplevel subquery has no name and always begins at rtoffset 0, so we make no entry for it. This commit teaches pg_overexplain's RANGE_TABLE option to make use of this new data to display the subquery name for each range table entry. 
Reviewed-by: Lukas Fittl <lukas@fittl.com> Reviewed-by: Jakub Wartak <jakub.wartak@enterprisedb.com> Reviewed-by: Greg Burd <greg@burd.me> Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com> Reviewed-by: Amit Langote <amitlangote09@gmail.com> Reviewed-by: Haibo Yan <tristan.yim@gmail.com> Reviewed-by: Alexandra Wang <alexandra.wang.oss@gmail.com> Discussion: http://postgr.es/m/CA+TgmoZ-Jh1T6QyWoCODMVQdhTUPYkaZjWztzP1En4=ZHoKPzw@mail.gmail.com	2026-02-10 15:33:39 -05:00
Robert Haas	0f4c8d33d4	Pass cursorOptions to planner_setup_hook. Commit `94f3ad3961` failed to do this because I couldn't think of a use for the information, but this has proven to be short-sighted. Best to fix it before this code is officially released. Now, the only argument to standard_planner that isn't passed to planner_setup_hook is boundParams, but that is accessible via glob->boundParams, and so doesn't need to be passed separately. Discussion: https://www.postgresql.org/message-id/CA+TgmoYS4ZCVAF2jTce=bMP0Oq_db_srocR4cZyO0OBp9oUoGg@mail.gmail.com	2026-02-10 11:50:28 -05:00
Robert Haas	cbdf93d471	Fix PGS_CONSIDER_NONPARTIAL interaction with Materialize nodes. Commit `4020b370f2` had the idea that it would be a good idea to handle testing PGS_CONSIDER_NONPARTIAL within cost_material to save callers the trouble, but that turns out not to be a very good idea. One concern is that it makes cost_material() dependent on the caller having initialized certain fields in the MaterialPath, which is a bit awkward for materialize_finished_plan, which wants to use a dummy path. Another problem is that it can result in generated materialized nested loops where the Materialize node is disabled, contrary to the intention of joinpath.c's logic in match_unsorted_outer() and consider_parallel_nestloop(), which aims to consider such paths only when they would not need to be disabled. In the previous coding, it was possible for the pgs_mask on the joinrel to have PGS_CONSIDER_NONPARTIAL set, while the inner rel had the same bit clear. In that case, we'd generate and then disable a Materialize path. That seems wrong, so instead, pull up the logic to test the PGS_CONSIDER_NONPARTIAL bit into joinpath.c, restoring the historical behavior that either we don't generate a given materialized nested loop in the first place, or we don't disable it. Discussion: http://postgr.es/m/CA+TgmoawzvCoZAwFS85tE5+c8vBkqgcS8ZstQ_ohjXQ9wGT9sw@mail.gmail.com Discussion: http://postgr.es/m/CA+TgmoYS4ZCVAF2jTce=bMP0Oq_db_srocR4cZyO0OBp9oUoGg@mail.gmail.com	2026-02-10 11:49:07 -05:00
Heikki Linnakangas	be5257725d	Refactor ProcessRecoveryConflictInterrupt for readability Two changes here: 1. Introduce a separate RECOVERY_CONFLICT_BUFFERPIN_DEADLOCK flag to indicate a suspected deadlock that involves a buffer pin. Previously the startup process used the same flag for a deadlock involving just regular locks, and to check for deadlocks involving the buffer pin. The cases are handled separately in the startup process, but the receiving backend had to deduce which one it was based on HoldingBufferPinThatDelaysRecovery(). With a separate flag, the receiver doesn't need to guess. 2. Rewrite the ProcessRecoveryConflictInterrupt() function to not rely on fallthrough through the switch-statement. That was difficult to read. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/4cc13ba1-4248-4884-b6ba-4805349e7f39@iki.fi	2026-02-10 16:23:10 +02:00
Heikki Linnakangas	17f51ea818	Separate RecoveryConflictReasons from procsignals Share the same PROCSIG_RECOVERY_CONFLICT flag for all recovery conflict reasons. To distinguish, have a bitmask in PGPROC to indicate the reason(s). Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/4cc13ba1-4248-4884-b6ba-4805349e7f39@iki.fi	2026-02-10 16:23:08 +02:00
Heikki Linnakangas	ddc3250208	Use ProcNumber rather than pid in ReplicationSlot This helps the next commit. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/4cc13ba1-4248-4884-b6ba-4805349e7f39@iki.fi	2026-02-10 16:23:05 +02:00
Michael Paquier	f33c585774	Simplify some log messages in extended_stats_funcs.c The log messages used in this file applied too much quoting logic: - No need for quote_identifier(), which is fine to not use in the context of a log entry. - The usual project style is to group the namespace and object together in a quoted string, when mentioned in a log message. This code quoted the namespace name and the extended statistics object name separately, which was confusing. Reported-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Discussion: https://postgr.es/m/20260210.143752.1113524465620875233.horikyota.ntt@gmail.com	2026-02-10 16:59:19 +09:00
Michael Paquier	307447e6db	Add information about range type stats to pg_stats_ext_exprs This commit adds three attributes to the system view pg_stats_ext_exprs, whose data can exist when involving a range type in an expression: range_length_histogram, range_empty_frac, and range_bounds_histogram. These statistics fields have existed since `918eee0c49`, and became viewable in pg_stats later in `bc3c8db8ae`. This puts the definition of pg_stats_ext_exprs on par with pg_stats. This issue has shown up during the discussion about the restore of extended statistics for expressions, so that it becomes possible to query the stats data to restore from the catalogs. Having access to this data is useful on its own, without the restore part. Some documentation and some tests are added, written by me. Corey has authored the part in system_views.sql. Bump catalog version. Author: Corey Huinker <corey.huinker@gmail.com> Co-authored-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/aYmCUx9VvrKiZQLL@paquier.xyz	2026-02-10 12:36:57 +09:00
Richard Guo	f41ab51573	Teach planner to transform "x IS [NOT] DISTINCT FROM NULL" to a NullTest In the spirit of `8d19d0e13`, this patch teaches the planner about the principle that NullTest with !argisrow is fully equivalent to SQL's IS [NOT] DISTINCT FROM NULL. The parser already performs this transformation for literal NULLs. However, a DistinctExpr expression with one input evaluating to NULL during planning (e.g., via const-folding of "1 + NULL" or parameter substitution in custom plans) currently remains as a DistinctExpr node. This patch closes the gap for const-folded NULLs. It specifically targets the case where one input is a constant NULL and the other is a nullable non-constant expression. (If the other input were otherwise, the DistinctExpr node would have already been simplified to a constant TRUE or FALSE.) This transformation can be beneficial because NullTest is much more amenable to optimization than DistinctExpr, since the planner knows a good deal about the former and next to nothing about the latter. Author: Richard Guo <guofenglinux@gmail.com> Reviewed-by: Tender Wang <tndrwang@gmail.com> Discussion: https://postgr.es/m/CAMbWs49BMAOWvkdSHxpUDnniqJcEcGq3_8dd_5wTR4xrQY8urA@mail.gmail.com	2026-02-10 10:19:25 +09:00
Richard Guo	0aaf0de7fe	Optimize BooleanTest with non-nullable input The BooleanTest construct (IS [NOT] TRUE/FALSE/UNKNOWN) treats a NULL input as the logical value "unknown". However, when the input is proven to be non-nullable, this special handling becomes redundant. In such cases, the construct can be simplified directly to a boolean expression or a constant. Author: Richard Guo <guofenglinux@gmail.com> Reviewed-by: Tender Wang <tndrwang@gmail.com> Discussion: https://postgr.es/m/CAMbWs49BMAOWvkdSHxpUDnniqJcEcGq3_8dd_5wTR4xrQY8urA@mail.gmail.com	2026-02-10 10:18:47 +09:00
Richard Guo	0a37961254	Optimize IS DISTINCT FROM with non-nullable inputs The IS DISTINCT FROM construct compares values acting as though NULL were a normal data value, rather than "unknown". Semantically, "x IS DISTINCT FROM y" yields true if the values differ or if exactly one is NULL, and false if they are equal or both NULL. Unlike ordinary comparison operators, it never returns NULL. Previously, the planner only simplified this construct if all inputs were constants, folding it to a constant boolean result. This patch extends the optimization to cases where inputs are non-constant but proven to be non-nullable. Specifically, "x IS DISTINCT FROM NULL" folds to constant TRUE if "x" is known to be non-nullable. For cases where both inputs are guaranteed not to be NULL, the expression becomes semantically equivalent to "x <> y", and the DistinctExpr is converted into an inequality OpExpr. This transformation provides several benefits. It converts the comparison into a standard operator, allowing the use of partial indexes and constraint exclusion. Furthermore, if the clause is negated (i.e., "IS NOT DISTINCT FROM"), it simplifies to an equality operator. This enables the planner to generate better plans using index scans, merge joins, hash joins, and EC-based qual deduction. Author: Richard Guo <guofenglinux@gmail.com> Reviewed-by: Tender Wang <tndrwang@gmail.com> Discussion: https://postgr.es/m/CAMbWs49BMAOWvkdSHxpUDnniqJcEcGq3_8dd_5wTR4xrQY8urA@mail.gmail.com	2026-02-10 10:17:45 +09:00
Nathan Bossart	158408fef8	pg_upgrade: Fix handling of pg_largeobject_metadata. For binary upgrades from v16 or newer, pg_upgrade transfers the files for pg_largeobject_metadata from the old cluster, as opposed to using COPY or ordinary SQL commands to reconstruct its contents. While this approach adds complexity, it can greatly reduce pg_upgrade's runtime when there are many large objects. Large objects with comments or security labels are one source of complexity for this approach. During pg_upgrade, schema restoration happens before files are transferred. Comments and security labels are transferred in the former step, but the COMMENT and SECURITY LABEL commands will fail if their corresponding large objects do not exist. To deal with this, pg_upgrade first copies only the rows of pg_largeobject_metadata that are needed to avoid failures. Later, pg_upgrade overwrites those rows by replacing pg_largeobject_metadata's files with its files in the old cluster. Unfortunately, there's a subtle problem here. Simply put, there's no guarantee that pg_upgrade will overwrite all of pg_largeobject_metadata's files on the new cluster. For example, the new cluster's version might more aggressively extend relations or create visibility maps, and pg_upgrade's file transfer code is not sophisticated enough to remove files that lack counterparts in the old cluster. These extra files could cause problems post-upgrade. More fortunately, we can simultaneously fix the aforementioned problem and further optimize binary upgrades for clusters with many large objects. If we teach the COMMENT and SECURITY LABEL commands to allow nonexistent large objects during binary upgrades, pg_upgrade no longer needs to transfer pg_largeobject_metadata's contents beforehand. This approach also allows us to remove the associated dependency tracking from pg_dump, even for upgrades from v12-v15 that use COPY to transfer pg_largeobject_metadata's contents. 
In addition to what is described in the previous paragraph, this commit modifies the query in getLOs() to only retrieve LOs with comments or security labels for upgrades from v12 or newer. We have long assumed that such usage is rare, so this should reduce pg_upgrade's memory usage and runtime in many cases. We might also be able to remove the "upgrades from v12 or newer" restriction on the recent batch of optimizations by adding special handling for pg_largeobject_metadata's hidden OID column on older versions (since this catalog previously used the now-removed WITH OIDS feature), but that is left as a future exercise. Reported-by: Andres Freund <andres@anarazel.de> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/3yd2ss6n7xywo6pmhd7jjh3bqwgvx35bflzgv3ag4cnzfkik7m%40hiyadppqxx6w	2026-02-09 14:58:02 -06:00
Heikki Linnakangas	73d60ac385	cleanup: Deadlock checker is no longer called from signal handler Clean up a few leftovers from when the deadlock checker was called from signal handler. We stopped doing that in commit `6753333f55`, in year 2015. - CheckDeadLock can return a return value directly to the caller, there's no need to use a global variable for that. - Remove outdated comments that claimed that CheckDeadLock "signals ProcSleep". - It should be OK to ereport() from DeadLockCheck now. I considered getting rid of InitDeadLockChecking() and moving the workspace allocations into DeadLockCheck, but it's still good to avoid doing the allocations while we're holding all the partition locks. So just update the comment to give that as the reason we do the allocations up front.	2026-02-09 20:26:23 +02:00
Álvaro Herrera	cbef472558	Remove HeapTupleHeaderSetXminCommitted/Invalid functions They are not and never have been used by any known code -- apparently we just cargo-culted them in commit `37484ad2aa` (or their ancestor macros anyway, which begat these functions in commit `34694ec888`). Allegedly they're also potentially dangerous; users are better off going through HeapTupleSetHintBits instead. Author: Andy Fan <zhihuifan1213@163.com> Discussion: https://postgr.es/m/87sejogt4g.fsf@163.com	2026-02-09 19:15:20 +01:00
Heikki Linnakangas	18f0afb2a6	Fix incorrect iteration type in extension_file_exists() Commit `f3c9e341cd` changed the type of objects in the List that get_extension_control_directories() returns, from "char *" to "ExtensionLocation *", but missed adjusting this one caller. Author: Chao Li <lic@highgo.com> Discussion: https://www.postgresql.org/message-id/362EA9B3-589B-475A-A16E-F10C30426E28@gmail.com	2026-02-09 19:15:44 +02:00
Noah Misch	c5dc75479b	Fix test "NUL byte in text decrypt" for --without-zlib builds. Backpatch-through: 14 Security: CVE-2026-2006	2026-02-09 09:08:10 -08:00
Tom Lane	8ebdf41c26	Harden _int_matchsel() against being attached to the wrong operator. While the preceding commit prevented such attachments from occurring in future, this one aims to prevent further abuse of any already- created operator that exposes _int_matchsel to the wrong data types. (No other contrib module has a vulnerable selectivity estimator.) We need only check that the Const we've found in the query is indeed of the type we expect (query_int), but there's a difficulty: as an extension type, query_int doesn't have a fixed OID that we could hard-code into the estimator. Therefore, the bulk of this patch consists of infrastructure to let an extension function securely look up the OID of a datatype belonging to the same extension. (Extension authors have requested such functionality before, so we anticipate that this code will have additional non-security uses, and may soon be extended to allow looking up other kinds of SQL objects.) This is done by first finding the extension that owns the calling function (there can be only one), and then thumbing through the objects owned by that extension to find a type that has the desired name. This is relatively expensive, especially for large extensions, so a simple cache is put in front of these lookups. Reported-by: Daniel Firer as part of zeroday.cloud Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Noah Misch <noah@leadboat.com> Security: CVE-2026-2004 Backpatch-through: 14	2026-02-09 10:14:22 -05:00
Tom Lane	841d42cc4e	Require superuser to install a non-built-in selectivity estimator. Selectivity estimators come in two flavors: those that make specific assumptions about the data types they are working with, and those that don't. Most of the built-in estimators are of the latter kind and are meant to be safely attachable to any operator. If the operator does not behave as the estimator expects, you might get a poor estimate, but it won't crash. However, estimators that do make datatype assumptions can malfunction if they are attached to the wrong operator, since then the data they get from pg_statistic may not be of the type they expect. This can rise to the level of a security problem, even permitting arbitrary code execution by a user who has the ability to create SQL objects. To close this hole, establish a rule that built-in estimators are required to protect themselves against being called on the wrong type of data. It does not seem practical however to expect estimators in extensions to reach a similar level of security, at least not in the near term. Therefore, also establish a rule that superuser privilege is required to attach a non-built-in estimator to an operator. We expect that this restriction will have little negative impact on extensions, since estimators generally have to be written in C and thus superuser privilege is required to create them in the first place. This commit changes the privilege checks in CREATE/ALTER OPERATOR to enforce the rule about superuser privilege, and fixes a couple of built-in estimators that were making datatype assumptions without sufficiently checking that they're valid. Reported-by: Daniel Firer as part of zeroday.cloud Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Noah Misch <noah@leadboat.com> Security: CVE-2026-2004 Backpatch-through: 14	2026-02-09 10:07:31 -05:00
Tom Lane	60e7ae41a6	Guard against unexpected dimensions of oidvector/int2vector. These data types are represented like full-fledged arrays, but functions that deal specifically with these types assume that the array is 1-dimensional and contains no nulls. However, there are cast pathways that allow general oid[] or int2[] arrays to be cast to these types, allowing these expectations to be violated. This can be exploited to cause server memory disclosure or SIGSEGV. Fix by installing explicit checks in functions that accept these types. Reported-by: Altan Birler <altan.birler@tum.de> Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Noah Misch <noah@leadboat.com> Security: CVE-2026-2003 Backpatch-through: 14	2026-02-09 09:57:43 -05:00
Noah Misch	d536aee556	Require PGP-decrypted text to pass encoding validation. pgp_sym_decrypt() and pgp_pub_decrypt() will raise such errors, while bytea variants will not. The existing "dat3" test decrypted to non-UTF8 text, so switch that query to bytea. The long-term intent is for type "text" to always be valid in the database encoding. pgcrypto has long been known as a source of exceptions to that intent, but a report about exploiting invalid values of type "text" brought this module to the forefront. This particular exception is straightforward to fix, with reasonable effect on user queries. Back-patch to v14 (all supported versions). Reported-by: Paul Gerste (as part of zeroday.cloud) Reported-by: Moritz Sanft (as part of zeroday.cloud) Author: shihao zhong <zhong950419@gmail.com> Reviewed-by: cary huang <hcary328@gmail.com> Discussion: https://postgr.es/m/CAGRkXqRZyo0gLxPJqUsDqtWYBbgM14betsHiLRPj9mo2=z9VvA@mail.gmail.com Backpatch-through: 14 Security: CVE-2026-2006	2026-02-09 06:14:47 -08:00
Álvaro Herrera	38e0190ced	Allow log_min_messages to be set per process type Change log_min_messages from being a single element to a comma-separated list of type:level elements, with 'type' representing a process type, and 'level' being a log level to use for that type of process. The list must also have a freestanding level specification which is used for process types not listed, which conveniently makes the whole thing backwards-compatible. Some choices made here could be contested; for instance, we use the process type `backend` to affect regular backends as well as dead-end backends and the standalone backend, and `autovacuum` means both the launcher and the workers. I think it's largely sensible though, and it can easily be tweaked if desired. Author: Euler Taveira <euler@eulerto.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Japin Li <japinli@hotmail.com> Reviewed-by: Tan Yang <332696245@qq.com> Discussion: https://postgr.es/m/e85c6671-1600-4112-8887-f97a8a5d07b2@app.fastmail.com	2026-02-09 13:23:10 +01:00
Thomas Munro	c67bef3f32	Code coverage for most pg_mblen* calls. A security patch changed them today, so close the coverage gap now. Test that buffer overrun is avoided when pg_mblen*() requires more than the number of bytes remaining. This does not cover the calls in dict_thesaurus.c or in dict_synonym.c. That code is straightforward. To change that code's input, one must have access to modify installed OS files, so low-privilege users are not a threat. Testing this would likewise require changing installed share/postgresql/tsearch_data, which was enough of an obstacle to not bother. Security: CVE-2026-2006 Backpatch-through: 14 Co-authored-by: Thomas Munro <thomas.munro@gmail.com> Co-authored-by: Noah Misch <noah@leadboat.com> Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>	2026-02-09 12:44:12 +13:00
Thomas Munro	1e7fe06c10	Replace pg_mblen() with bounds-checked versions. A corrupted string could cause code that iterates with pg_mblen() to overrun its buffer. Fix, by converting all callers to one of the following: 1. Callers with a null-terminated string now use pg_mblen_cstr(), which raises an "illegal byte sequence" error if it finds a terminator in the middle of the sequence. 2. Callers with a length or end pointer now use either pg_mblen_with_len() or pg_mblen_range(), for the same effect, depending on which of the two seems more convenient at each site. 3. A small number of cases pre-validate a string, and can use pg_mblen_unbounded(). The traditional pg_mblen() function and COPYCHAR macro still exist for backward compatibility, but are no longer used by core code and are hereby deprecated. The same applies to the t_isXXX() functions. Security: CVE-2026-2006 Backpatch-through: 14 Co-authored-by: Thomas Munro <thomas.munro@gmail.com> Co-authored-by: Noah Misch <noah@leadboat.com> Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Reported-by: Paul Gerste (as part of zeroday.cloud) Reported-by: Moritz Sanft (as part of zeroday.cloud)	2026-02-09 12:44:04 +13:00
Thomas Munro	74ee636cc9	Fix mb2wchar functions on short input. When converting multibyte to pg_wchar, the UTF-8 implementation would silently ignore an incomplete final character, while the other implementations would cast a single byte to pg_wchar, and then repeat for the remaining byte sequence. While it didn't overrun the buffer, it was surely garbage output. Make all encodings behave like the UTF-8 implementation. A later change for master only will convert this to an error, but we choose not to back-patch that behavior change on the off-chance that someone is relying on the existing UTF-8 behavior. Security: CVE-2026-2006 Backpatch-through: 14 Author: Thomas Munro <thomas.munro@gmail.com> Reported-by: Noah Misch <noah@leadboat.com> Reviewed-by: Noah Misch <noah@leadboat.com> Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>	2026-02-09 12:08:58 +13:00
Thomas Munro	af79c30dc3	Fix encoding length for EUC_CN. While EUC_CN supports only 1- and 2-byte sequences (CS0, CS1), the mb<->wchar conversion functions allow 3-byte sequences beginning SS2, SS3. Change pg_encoding_max_length() to return 3, not 2, to close a hypothesized buffer overrun if a corrupted string is converted to wchar and back again in a newly allocated buffer. We might reconsider that in master (ie harmonizing in a different direction), but this change seems better for the back-branches. Also change pg_euccn_mblen() to report SS2 and SS3 characters as having length 3 (following the example of EUC_KR). Even though such characters would not pass verification, it's remotely possible that invalid bytes could be used to compute a buffer size for use in wchar conversion. Security: CVE-2026-2006 Backpatch-through: 14 Author: Thomas Munro <thomas.munro@gmail.com> Reviewed-by: Noah Misch <noah@leadboat.com> Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>	2026-02-09 12:08:58 +13:00
Heikki Linnakangas	00896ddaf4	Fix buffer overflows in pg_trgm due to lower-casing The code made a subtle assumption that the lower-cased version of a string never has more characters than the original. That is not always true. For example, in a database with the latin9 encoding: latin9db=# select lower(U&'\00CC' COLLATE "lt-x-icu"); lower ----------- i\x1A\x1A (1 row) In this example, lower-casing expands the single input character into three characters. The generate_trgm_only() function relied on that assumption in two ways: - It used "slen * pg_database_encoding_max_length() + 4" to allocate the buffer to hold the lowercased and blank-padded string. That formula accounts for expansion if the lower-case characters are longer (in bytes) than the originals, but it's still not enough if the lower-cased string contains more characters than the original. - Its callers sized the output array to hold the trigrams extracted from the input string with the formula "(slen / 2 + 1) * 3", where 'slen' is the input string length in bytes. (The formula was generous to account for the possibility that RPADDING was set to 2.) That's also not enough if one input byte can turn into multiple characters. To fix, introduce a growable trigram array and give up on trying to choose the correct max buffer sizes ahead of time. Backpatch to v18, but no further. In previous versions lower-casing was done character by character, and thus the assumption that lower-casing doesn't change the character length was valid. That was changed in v18, commit `fb1a18810f`. Security: CVE-2026-2007 Reviewed-by: Noah Misch <noah@leadboat.com> Reviewed-by: Jeff Davis <pgsql@j-davis.com>	2026-02-09 12:08:58 +13:00
Heikki Linnakangas	54598670fe	Remove 'charlen' argument from make_trigrams() The function assumed that if charlen == bytelen, there are no multibyte characters in the string. That's sensible, but the callers were a little careless in how they calculated the lengths. The callers converted the string to lowercase before calling make_trigram(), and the 'charlen' value was calculated before the conversion to lowercase while 'bytelen' was calculated after the conversion. If the lowercased string had a different number of characters than the original, make_trigram() might incorrectly apply the fastpath and treat all the bytes as single-byte characters, or fail to apply the fastpath (which is harmless), or it might hit the "Assert(bytelen == charlen)" assertion. I'm not aware of any locale / character combinations where you could hit that assertion in practice, i.e. where a string converted to lowercase would have fewer characters than the original, but it seems best to avoid making that assumption. To fix, remove the 'charlen' argument. To keep the performance when there are no multibyte characters, always try the fast path first, but check the input for multibyte characters as we go. The check on each byte adds some overhead, but it's close enough. And to compensate, the find_word() function no longer needs to count the characters. This fixes one small bug in make_trigrams(): in the multibyte codepath, it peeked at the byte just after the end of the input string. When compiled with IGNORECASE, that was harmless because there is always a NUL byte or blank after the input string. But with !IGNORECASE, the call from generate_wildcard_trgm() doesn't guarantee that. Backpatch to v18, but no further. In previous versions lower-casing was done character by character, and thus the assumption that lower-casing doesn't change the character length was valid. That was changed in v18, commit `fb1a18810f`. Security: CVE-2026-2007 Reviewed-by: Noah Misch <noah@leadboat.com>	2026-02-09 12:08:58 +13:00
Michael Paquier	379695d3cc	pgcrypto: Fix buffer overflow in pgp_pub_decrypt_bytea() pgp_pub_decrypt_bytea() was missing a safeguard for the session key length read from the message data, that can be given in input of pgp_pub_decrypt_bytea(). This can result in the possibility of a buffer overflow for the session key data, when the length specified is longer than PGP_MAX_KEY, which is the maximum size of the buffer where the session data is copied to. A script able to rebuild the message and key data that can trigger the overflow is included in this commit, based on some contents provided by the reporter, heavily edited by me. A SQL test is added, based on the data generated by the script. Reported-by: Team Xint Code as part of zeroday.cloud Author: Michael Paquier <michael@paquier.xyz> Reviewed-by: Noah Misch <noah@leadboat.com> Security: CVE-2026-2005 Backpatch-through: 14	2026-02-09 08:00:59 +09:00
Tom Lane	73dd7163c5	Replace some hard-wired OID constants with corresponding macros. Looking again at commit `7cdb633c8`, I wondered why we have hard-wired "1034" for the OID of type aclitem[]. Some other entries in the same array have numeric type OIDs as well. This seems to be a hangover from years ago when not every built-in pg_type entry had an OID macro. But since we made genbki.pl responsible for generating these macros, there are macros available for all these array types, so there's no reason not to follow the project policy of never writing numeric OID constants in C code.	2026-02-07 23:15:20 -05:00
Tom Lane	c0bf15729f	meson: host_system value for Solaris is 'sunos' not 'solaris'. This thinko caused us to not substitute our own getopt() code, which results in failing to parse long options for the postmaster since Solaris' getopt() doesn't do what we expect. This can be seen in the results of buildfarm member icarus, which is the only one trying to build via meson on Solaris. Per consultation with pgsql-release, it seems okay to fix this now even though we're in release freeze. The fix visibly won't affect any other platforms, and it can't break Solaris/meson builds any worse than they're already broken. Discussion: https://postgr.es/m/2471229.1770499291@sss.pgh.pa.us Backpatch-through: 16	2026-02-07 20:05:52 -05:00
Peter Eisentraut	1653ce5236	Further error message fix Further fix of error message changed in commit `74a116a79b`. The initial fix was not quite correct. Discussion: https://www.postgresql.org/message-id/flat/tencent_1EE1430B1E6C18A663B8990F%40qq.com	2026-02-07 22:37:02 +01:00
John Naylor	7467041cde	Future-proof sort template against undefined behavior Commit `176dffdf7` added a NULL array pointer check before performing a qsort in order to prevent undefined behavior when passing NULL pointer and zero length. To head off future degenerate cases, check that there are at least two elements to sort before proceeding with insertion sort. This has the added advantage of allowing us to remove four equivalent checks that guarded against recursion/iteration. There might be a tiny performance penalty from unproductive recursions, but we can buy that back by increasing the insertion sort threshold. That is left for future work. Discussion: https://postgr.es/m/CANWCAZZWvds_35nXc4vXD-eBQa_=mxVtqZf-PM_ps=SD7ghhJg@mail.gmail.com	2026-02-07 17:02:35 +07:00
Peter Eisentraut	0af05b5dbb	Revert "Change copyObject() to use typeof_unqual" This reverts commit `4cfce4e62c`. This implementation fails to compile on newer MSVC that support __typeof_unqual__. (Older versions did not support it and compiled fine.) Revert for now and research further. Reported-by: Bryan Green <dbryan.green@gmail.com> Discussion: https://www.postgresql.org/message-id/b03ddcd4-2a16-49ee-b105-e7f609f3c514%40gmail.com	2026-02-07 10:08:38 +01:00
Tom Lane	7cdb633c89	Make some minor cleanups in typalign-related code. Commit `7b378237a` widened AclMode to 64 bits, which implies that the alignment of AclItem is now determined by an int64 field. That commit correctly set the typalign for SQL type aclitem to 'd', but it missed the hard-wired knowledge about _aclitem in bootstrap.c. This doesn't seem to have caused any ill effects, probably because we never try to fill a non-null value into an aclitem[] column during bootstrap. Nonetheless, it's clearly a gotcha waiting to happen, so fix it up. In passing, also fix a couple of typanalyze functions that were using hard-coded typalign constants when they could just as easily use greppable TYPALIGN_xxx macros. Noticed these while working on a patch to expand the set of typalign values. I doubt we are going to pursue that path, but these fixes still seem worth a quick commit. Discussion: https://postgr.es/m/1127261.1769649624@sss.pgh.pa.us	2026-02-06 20:46:03 -05:00
Nathan Bossart	ba1e14134a	Adjust style of some debugging macros. This commit adjusts a few debugging macros to match the style of those in pg_config_manual.h. Like commits `123661427b` and `b4cbc106a6`, these were discovered while reviewing Aleksander Alekseev's proposed changes to pgindent. Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/aP-H6kSsGOxaB21k%40nathan	2026-02-06 16:24:21 -06:00
Jacob Champion	d8d7c5dc8f	libpq: Prepare for protocol grease during 19beta The main reason that libpq doesn't request protocol version 3.2 by default is because other proxy/server implementations don't implement the negotiation. This is a bit of a chicken-and-egg problem: We don't bump the default version that libpq requests, but other implementations may not be incentivized to implement version negotiation if their users never run into issues. One established practice to combat this is to flip Postel's Law on its head, by sending parameters that the server cannot possibly support. If the server fails the handshake instead of correctly negotiating, then the problem is surfaced naturally. If the server instead claims to support the bogus parameters, then we fail the connection to make the lie obvious. This is called "grease" (or "greasing"), after the GREASE mechanism in TLS that popularized the concept: https://www.rfc-editor.org/rfc/rfc8701.html This patch reserves 3.9999 as an explicitly unsupported protocol version number and `_pq_.test_protocol_negotiation` as an explicitly unsupported protocol extension. A later commit will send these by default in order to stress-test the ecosystem during the beta period; that commit will then be reverted before 19 RC1, so that we can decide what to do with whatever data has been gathered. The _pq_.test_protocol_negotiation change here is intentionally docs- only: after its implementation is reverted, the parameter should remain reserved. Extracted/adapted from a patch by Jelte Fennema-Nio. Author: Jelte Fennema-Nio <postgres@jeltef.nl> Co-authored-by: Jacob Champion <jacob.champion@enterprisedb.com> Discussion: https://postgr.es/m/DDPR5BPWH1RJ.1LWAK6QAURVAY%40jeltef.nl	2026-02-06 10:31:45 -08:00
Jacob Champion	e3d37853ec	doc: Expand upon protocol versions and extensions First, split the Protocol Versions table in two, and lead with the list of versions that are supported today. Reserved and unsupported version numbers go into the second table. Second, in anticipation of a new (reserved) protocol extension, document the extension negotiation process alongside version negotiation, and add the corresponding tables for future extension parameter registrations. Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl> Reviewed-by: David G. Johnston <david.g.johnston@gmail.com> Discussion: https://postgr.es/m/DDPR5BPWH1RJ.1LWAK6QAURVAY%40jeltef.nl	2026-02-06 10:25:12 -08:00
Michael Paquier	072c842135	Fix use of proc number in pgstat_create_backend() This routine's internals directly used MyProcNumber to choose which object ID to assign for the hash key of a backend's stats entry, while the value to use is given as input argument of the function. The original intention was to pass MyProcNumber as an argument of pgstat_create_backend() when called in pgstat_bestart_final(), pgstat_beinit() ensuring that MyProcNumber has been set, not use it directly in the function. This commit addresses this inconsistency by using the procnum given by the caller of pgstat_create_backend(), not MyProcNumber. This issue is not a cause of bugs currently. However, let's keep the code in sync across all the branches where this code exists, as it could matter in a future backpatch. Oversight in `4feba03d8b`. Reported-by: Ryo Matsumura <matsumura.ryo@fujitsu.com> Discussion: https://postgr.es/m/TYCPR01MB11316AD8150C8F470319ACCAEE866A@TYCPR01MB11316.jpnprd01.prod.outlook.com Backpatch-through: 18	2026-02-06 19:57:22 +09:00
Michael Paquier	74a116a79b	Fix some error message inconsistencies These errors are very unlikely going to show up, but in the event that they happen, some incorrect information would have been provided: - In pg_rewind, a stat() failure was reported as an open() failure. - In pg_combinebackup, a check for the new directory of a tablespace mapping was referred as the old directory. - In pg_combinebackup, a failure in reading a source file when copying blocks referred to the destination file. The changes for pg_combinebackup affect v17 and newer versions. For pg_rewind, all the stable branches are affected. Author: Man Zeng <zengman@halodbtech.com> Discussion: https://postgr.es/m/tencent_1EE1430B1E6C18A663B8990F@qq.com Backpatch-through: 14	2026-02-06 15:38:16 +09:00
Thomas Munro	f94e9141a0	Add file_extend_method=posix_fallocate,write_zeros. Provide a way to disable the use of posix_fallocate() for relation files. It was introduced by commit `4d330a61bb`. The new setting file_extend_method=write_zeros can be used as a workaround for problems reported from the field: * BTRFS compression is disabled by the use of posix_fallocate() * XFS could produce spurious ENOSPC errors in some Linux kernel versions, though that problem is reported to have been fixed The default is file_extend_method=posix_fallocate if available, as before. The write_zeros option is similar to PostgreSQL < 16, except that now it's multi-block. Backpatch-through: 16 Reviewed-by: Jakub Wartak <jakub.wartak@enterprisedb.com> Reported-by: Dimitrios Apostolou <jimis@gmx.net> Discussion: https://postgr.es/m/b1843124-fd22-e279-a31f-252dffb6fbf2%40gmx.net	2026-02-06 17:38:49 +13:00
Fujii Masao	e35add48cc	doc: Move synchronized_standby_slots to "Primary Server" section. synchronized_standby_slots is defined in guc_parameter.dat as part of the REPLICATION_PRIMARY group and is listed under the "Primary Server" section in postgresql.conf.sample. However, in the documentation its description was previously placed under the "Sending Servers" section. Since synchronized_standby_slots only takes effect on the primary server, this commit moves its documentation to the "Primary Server" section to match its behavior and other references. Backpatch to v17 where synchronized_standby_slots was added. Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Shinya Kato <shinya11.kato@gmail.com> Discussion: https://postgr.es/m/CAHGQGwE_LwgXgCrqd08OFteJqdERiF3noqOKu2vt7Kjk4vMiGg@mail.gmail.com Backpatch-through: 17	2026-02-06 09:40:05 +09:00
Michael Paquier	9476ef206c	Fix comment in extended_stats_funcs.c The attribute storing the statistics data for a set of expressions in pg_statistic_ext_data is stxdexpr. stxdexprs does not exist. Extracted from a larger patch by the same author. Incorrect as of `efbebb4e85`. Author: Corey Huinker <corey.huinker@gmail.com> Discussion: https://postgr.es/m/CADkLM=fPcci6oPyuyEZ0F4bWqAA7HzaWO+ZPptufuX5_uWt6kw@mail.gmail.com	2026-02-05 15:14:53 +09:00
Masahiko Sawada	7a1f0f8747	pg_upgrade: Optimize logical replication slot caught-up check. Commit `29d0a77fa6` improved pg_upgrade to allow migrating logical slots provided that all logical slots have caught up (i.e., they have no pending decodable WAL records). Previously, this verification was done by checking each slot individually, which could be time-consuming if there were many logical slots to migrate. This commit optimizes the check to avoid reading the same WAL stream multiple times. It performs the check only for the slot with the minimum confirmed_flush_lsn and applies the result to all other slots in the same database. This limits the check to at most one logical slot per database. During the check, we identify the last decodable WAL record's LSN to report any slots with unconsumed records, consistent with the existing error reporting behavior. Additionally, the maximum confirmed_flush_lsn among all logical slots on the database is used as an early scan cutoff; finding a decodable WAL record beyond this point implies that no slot has caught up. Performance testing demonstrated that the execution time remains stable regardless of the number of slots in the database. Note that we do not distinguish slots based on their output plugins. A hypothetical plugin might use a replication origin filter that filters out changes from a specific origin. In such cases, we might get a false positive (erroneously considering a slot caught up). However, this is safe from a data integrity standpoint, such scenarios are rare, and the impact of a false positive is minimal. This optimization is applied only when the old cluster is version 19 or later. Bump catalog version. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: shveta malik <shveta.malik@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/CAD21AoBZ0LAcw1OHGEKdW7S5TRJaURdhEk3CLAW69_siqfqyAg@mail.gmail.com	2026-02-04 17:11:27 -08:00
Michael Paquier	3c5ec35dea	oid2name: Add relation path to the information provided by -x/--extended This affects two command patterns, showing information about relations: * oid2name -x -d DBNAME, applying to all relations on a database. * oid2name -x -d DBNAME -t TABNAME [-t ..], applying to a subset of defined relations on a database. The relative path of a relation is added to the information provided, using pg_relation_filepath(). Author: David Bidoc <dcbidoc@gmail.com> Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at> Reviewed-by: Guillaume Lelarge <guillaume.lelarge@dalibo.com> Reviewed-by: Euler Taveira <euler@eulerto.com> Reviewed-by: Mark Wong <markwkm@gmail.com> Discussion: https://postgr.es/m/CABour1v2CU1wjjoM86wAFyezJQ3-+ncH43zY1f1uXeVojVN8Ow@mail.gmail.com	2026-02-05 09:02:12 +09:00
Álvaro Herrera	0c8e082fba	Assign "backend" type earlier during process start-up Instead of assigning the backend type in the Main function of each postmaster child, do it right after fork(), by which time it is already known by postmaster_child_launch(). This reduces the time frame during which MyBackendType is incorrect. Before this commit, ProcessStartupPacket would overwrite MyBackendType to B_BACKEND for dead-end backends, which is quite dubious. Stop that. We may now see MyBackendType == B_BG_WORKER before setting up MyBgworkerEntry. As far as I can see this is only a problem if we try to log a message and %b is in log_line_prefix, so we now have a constant string to cover that case. Previously, it would print "unrecognized", which seems strictly worse. Author: Euler Taveira <euler@eulerto.com> Discussion: https://postgr.es/m/e85c6671-1600-4112-8887-f97a8a5d07b2@app.fastmail.com	2026-02-04 16:56:57 +01:00
Fujii Masao	36ead71232	Fix logical replication TAP test to read publisher log correctly. Commit `5f13999aa1` added a TAP test for GUC settings passed via the CONNECTION string in logical replication, but the buildfarm member sungazer reported test failures. The test incorrectly used the subscriber's log file position as the starting offset when reading the publisher's log. As a result, the test failed to find the expected log message in the publisher's log and erroneously reported a failure. This commit fixes the test to use the publisher's own log file position when reading the publisher's log. Also, to avoid similar confusion in the future, this commit splits the single $log_location variable into $log_location_pub and $log_location_sub, clearly distinguishing publisher and subscriber log positions. Backpatched to v15, where commit `5f13999aa1` introduced the test. Per buildfarm member sungazer. This issue was reported and diagnosed by Alexander Lakhin. Reported-by: Alexander Lakhin <exclusion@gmail.com> Discussion: https://postgr.es/m/966ec3d8-1b6f-4f57-ae59-fc7d55bc9a5a@gmail.com Backpatch-through: 15	2026-02-05 00:43:06 +09:00
John Naylor	176dffdf7d	Fix various instances of undefined behavior Mostly this involves checking for NULL pointer before doing operations that add a non-zero offset. The exception is an overflow warning in heap_fetch_toast_slice(). This was caused by unneeded parentheses forcing an expression to be evaluated to a negative integer, which then got cast to size_t. Per clang 21 undefined behavior sanitizer. Backpatch to all supported versions. Co-authored-by: Alexander Lakhin <exclusion@gmail.com> Reported-by: Alexander Lakhin <exclusion@gmail.com> Discussion: https://postgr.es/m/777bd201-6e3a-4da0-a922-4ea9de46a3ee@gmail.com Backpatch-through: 14	2026-02-04 18:09:35 +07:00
Heikki Linnakangas	084e42bc71	Add backendType to PGPROC, replacing isRegularBackend We can immediately make use of it in pg_signal_backend(), which previously fetched the process type from the backend status array with pgstat_get_backend_type_by_proc_number(). That was correct but felt a little questionable to me: backend status should be for observability purposes only, not for permission checks. Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/b77e4962-a64a-43db-81a1-580444b3e8f5@iki.fi	2026-02-04 13:06:04 +02:00
Peter Eisentraut	4cfce4e62c	Change copyObject() to use typeof_unqual Currently, when the argument of copyObject() is const-qualified, the return type is also, because the use of typeof carries over all the qualifiers. This is incorrect, since the point of copyObject() is to make a copy to mutate. But apparently no code ran into it. The new implementation uses typeof_unqual, which drops the qualifiers, making this work correctly. typeof_unqual is standardized in C23, but all recent versions of all the usual compilers support it even in non-C23 mode, at least as __typeof_unqual__. We add a configure/meson test for typeof_unqual and __typeof_unqual__ and use it if it's available, else we use the existing fallback of just returning void *. Reviewed-by: David Geier <geidav.pg@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/92f9750f-c7f6-42d8-9a4a-85a3cbe808f3%40eisentraut.org	2026-02-04 09:22:41 +01:00
Michael Paquier	c8ec74713b	pg_resetwal: Fix incorrect error message related to pg_wal/summaries/ A failure while closing pg_wal/summaries/ incorrectly generated a report about pg_wal/archive_status/. While at it, this commit adds #undefs for the macros used in KillExistingWALSummaries() and KillExistingArchiveStatus() to prevent those values from being misused in an incorrect function context. Oversight in `dc21234005`. Author: Tianchen Zhang <zhang_tian_chen@163.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Discussion: https://postgr.es/m/SE2P216MB2390C84C23F428A7864EE07FA19BA@SE2P216MB2390.KORP216.PROD.OUTLOOK.COM Backpatch-through: 17	2026-02-04 16:38:06 +09:00
Álvaro Herrera	78bf28e3bf	Docs: consolidate dependency notes in pg_dump and pg_restore The pg_dump documentation had repetitive notes for the --schema, --table, and --extension switches, noting that dependent database objects are not automatically included in the dump. This commit removes these notes and replaces them with a consolidated paragraph in the "Notes" section. pg_restore had a similar note for -t but lacked one for -n; do likewise. Also, add a note to --extension in pg_dump to note that ancillary files (such as shared libraries and control files) are not included in the dump and must be present on the destination system. Author: Florents Tselai <florents.tselai@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/284C4D55-4F90-4AA0-84C8-1E6A28DDF271@gmail.com	2026-02-03 19:29:15 +01:00
Heikki Linnakangas	57bff90160	Don't hint that you can reconnect when the database is dropped Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/4cc13ba1-4248-4884-b6ba-4805349e7f39@iki.fi	2026-02-03 15:08:16 +02:00
Heikki Linnakangas	cd375d5b6d	Remove useless errdetail_abort() I don't understand how to reach errdetail_abort() with MyProc->recoveryConflictPending set. If a recovery conflict signal is received, ProcessRecoveryConflictInterrupt() raises an ERROR or FATAL error to cancel the query or connection, and abort processing clears the flag. The error message from ProcessRecoveryConflictInterrupt() is very clear that the query or connection was terminated because of recovery conflict. The only way to reach it AFAICS is with a race condition, if the startup process sends a recovery conflict signal when the transaction has just entered aborted state for some other reason. And in that case the detail would be misleading, as the transaction was already aborted for some other reason, not because of the recovery conflict. errdetail_abort() was the only user of the recoveryConflictPending flag in PGPROC, so we can remove that and all the related code too. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/4cc13ba1-4248-4884-b6ba-4805349e7f39@iki.fi	2026-02-03 15:08:13 +02:00
Álvaro Herrera	96e2af6050	Reject ADD CONSTRAINT NOT NULL if name mismatches existing constraint When using ALTER TABLE ... ADD CONSTRAINT to add a not-null constraint with an explicit name, we have to ensure that if the column is already marked NOT NULL, the provided name matches the existing constraint name. Failing to do so could lead to confusion regarding which constraint object actually enforces the rule. This patch adds a check to throw an error if the user tries to add a named not-null constraint to a column that already has one with a different name. Reported-by: yanliang lei <msdnchina@163.com> Co-authored-by: Álvaro Herrera <alvherre@kurilemu.de> Co-authored-by: Srinath Reddy Sadipiralla <srinath2133@gmail.com> Backpatch-through: 18 Discussion: https://postgr.es/m/19351-8f1c523ead498545%40postgresql.org	2026-02-03 12:33:29 +01:00
Peter Eisentraut	955e507668	Change StaticAssertVariableIsOfType to be a declaration This allows moving the uses to more natural and useful positions. Also, a declaration is the more native use of static assertions in C. Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/2273bc2a-045d-4a75-8584-7cd9396e5534%40eisentraut.org	2026-02-03 08:46:02 +01:00
Peter Eisentraut	137d05df2f	Rename AssertVariableIsOfType to StaticAssertVariableIsOfType This keeps run-time assertions and static assertions clearly separate. Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/2273bc2a-045d-4a75-8584-7cd9396e5534%40eisentraut.org	2026-02-03 08:45:24 +01:00
Michael Paquier	e05a24c2d4	Add two IO wait events for COPY FROM/TO on a pipe/file/program Two wait events are added to the COPY FROM/TO code: * COPY_FROM_READ: reading data from a copy_file. * COPY_TO_WRITE: writing data to a copy_file. In the COPY code, copy_file can be set when processing a command through the pipe mode (for the non-DestRemote case), the program mode or the file mode, when processing fread() or fwrite() on it. Author: Nikolay Samokhvalov <nik@postgres.ai> Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com> Reviewed-by: Sami Imseih <samimseih@gmail.com> Discussion: https://postgr.es/m/CAM527d_iDzz0Kqyi7HOfqa-Xzuq29jkR6AGXqfXLqA5PR5qsng@mail.gmail.com	2026-02-03 12:20:41 +09:00
Michael Paquier	213fec296f	Fix incorrect errno in OpenWalSummaryFile() This routine has an option to bypass an error if a WAL summary file is opened for read but is missing (missing_ok=true). However, the code incorrectly checked for EEXIST, which matters when using O_CREAT and O_EXCL, rather than ENOENT, for this case. There are currently only two callers of OpenWalSummaryFile() in the tree, and both use missing_ok=false, meaning that the check based on the errno is currently dead code. This issue could matter for out-of-core code or future backpatches that would like to use missing_ok set to true. Issue spotted while monitoring this area of the code, after `a9afa021e9`. Author: Michael Paquier <michael@paquier.xyz> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/aYAf8qDHbpBZ3Rml@paquier.xyz Backpatch-through: 17	2026-02-03 11:25:10 +09:00
Fujii Masao	21c1125d66	Release synchronous replication waiters immediately on configuration changes. Previously, when synchronous_standby_names was changed (for example, by reducing the number of required synchronous standbys or modifying the standby list), backends waiting for synchronous replication were not released immediately, even if the new configuration no longer required them to wait. They could remain blocked until additional messages arrived from standbys and triggered their release. This commit improves walsender so that backends waiting for synchronous replication are released as soon as the updated configuration takes effect and the new settings no longer require them to wait, by calling SyncRepReleaseWaiters() when configuration changes are processed. As part of this change, the duplicated code that handles configuration changes in walsender has been refactored into a new helper function, which is now used at the three existing call places. Since this is an improvement rather than a bug fix, it is applied only to the master branch. Author: Shinya Kato <shinya11.kato@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Xuneng Zhou <xunengzhou@gmail.com> Discussion: https://postgr.es/m/CAOzEurSRii0tEYhu5cePmRcvS=ZrxTLEvxm3Kj0d7_uKGdM23g@mail.gmail.com	2026-02-03 11:14:00 +09:00
Fujii Masao	dddbbc253b	psql: Add %i prompt escape to indicate hot standby status. This commit introduces a new prompt escape %i for psql, which shows whether the connected server is operating in hot standby mode. It expands to standby if the server reports in_hot_standby = on, and primary otherwise. This is useful for distinguishing standby servers from primary ones at a glance, especially when working with multiple connections in replicated environments where libpq's multi-host connection strings are used. Author: Jim Jones <jim.jones@uni-muenster.de> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Greg Sabino Mullane <htamfids@gmail.com> Reviewed-by: Srinath Reddy Sadipiralla <srinath2133@gmail.com> Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Andreas Karlsson <andreas@proxel.se> Discussion: https://www.postgresql.org/message-id/flat/016f6738-f9a9-4e98-bb5a-e1e4b9591d46@uni-muenster.de	2026-02-03 10:03:19 +09:00
Melanie Plageman	4a99ef1a0d	Fix flakiness in the pg_visibility VM-only vacuum test by using a temporary table. The test relies on VACUUM being able to mark a page all-visible, but this can fail when autovacuum in other sessions prevents the visibility horizon from advancing. Making the test table temporary isolates its horizon from other sessions, including catalog table vacuums, ensuring reliable test behavior. Reported-by: Alexander Lakhin <exclusion@gmail.com> Author: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Discussion: https://postgr.es/m/2b09fba6-6b71-497a-96ef-a6947fcc39f6%40gmail.com	2026-02-02 17:45:27 -05:00
Nathan Bossart	12451d9d1f	test_shm_mq: Set background worker names. Oversight in commit `5373bc2a08`. Author: Michael Banck <mbanck@gmx.net> Discussion: https://postgr.es/m/20260202173156.GB17962%40p46.dedyn.io%3Blightning.p46.dedyn.io	2026-02-02 15:43:01 -06:00
Tom Lane	da7a1dc0d6	Refactor att_align_nominal() to improve performance. Separate att_align_nominal() into two macros, similarly to what was already done with att_align_datum() and att_align_pointer(). The inner macro att_nominal_alignby() is really just TYPEALIGN(), while att_align_nominal() retains its previous API by mapping TYPALIGN_xxx values to numbers of bytes to align to and then calling att_nominal_alignby(). In support of this, split out tupdesc.c's logic to do that mapping into a publicly visible function typalign_to_alignby(). Having done that, we can replace performance-critical uses of att_align_nominal() with att_nominal_alignby(), where the typalign_to_alignby() mapping is done just once outside the loop. In most places I settled for doing typalign_to_alignby() once per function. We could in many places pass the alignby value in from the caller if we wanted to change function APIs for this purpose; but I'm a bit loath to do that, especially for exported APIs that extensions might call. Replacing a char typalign argument by a uint8 typalignby argument would be an API change that compilers would fail to warn about, thus silently breaking code in hard-to-debug ways. I did revise the APIs of array_iter_setup and array_iter_next, moving the element type attribute arguments to the former; if any external code uses those, the argument-count change will cause visible compile failures. Performance testing shows that ExecEvalScalarArrayOp is sped up by about 10% by this change, when using a simple per-element function such as int8eq. I did not check any of the other loops optimized here, but it's reasonable to expect similar gains. Although the motivation for creating this patch was to avoid a performance loss if we add some more typalign values, it evidently is worth doing whether that patch lands or not. Discussion: https://postgr.es/m/1127261.1769649624@sss.pgh.pa.us	2026-02-02 14:39:50 -05:00
Tom Lane	0c9f46c428	In s_lock.h, use regular labels with %= instead of local labels. Up to now we've used GNU-style local labels for branch targets in s_lock.h's assembly blocks. But there's an alternative style, which I for one didn't know about till recently: use regular assembler labels, and insert a per-asm-block number in them using %= to ensure they are distinct across multiple TAS calls within one source file. gcc has had %= since gcc 2.0, and I've verified that clang knows it too. While the immediate motivation for changing this is that AIX's assembler doesn't do local labels, it seems to me that this is a superior solution anyway. There is nothing mnemonic about "1:", while a regular label can convey something useful, and at least to me it feels less error-prone. Therefore let's standardize on this approach, also converting the one other usage in s_lock.h. Discussion: https://postgr.es/m/399291.1769998688@sss.pgh.pa.us	2026-02-02 11:13:38 -05:00
Michael Paquier	a9afa021e9	Fix error message in RemoveWalSummaryIfOlderThan() A failing unlink() was reporting an incorrect error message, referring to stat(). Author: Man Zeng <zengman@halodbtech.com> Reviewed-by: Junwang Zhao <zhjwpku@gmail.com> Discussion: https://postgr.es/m/tencent_3BBE865C5F49D452360FF190@qq.com Backpatch-through: 17	2026-02-02 10:21:04 +09:00
Michael Paquier	d46aa32ea5	Fix build inconsistency due to the generation of wait-event code The build generates four files based on the wait event contents stored in wait_event_names.txt: - wait_event_types.h - pgstat_wait_event.c - wait_event_funcs_data.c - wait_event_types.sgml The SGML file is generated as part of a documentation build, with its data stored in doc/src/sgml/ for meson and configure. The three others are handled differently for meson and configure: - In configure, all the files are created in src/backend/utils/activity/. A link to wait_event_types.h is created in src/include/utils/. - In meson, all the files are created in src/include/utils/. The two C files, pgstat_wait_event.c and wait_event_funcs_data.c, are then included in respectively wait_event.c and wait_event_funcs.c, without the "utils/" path. For configure, this does not present a problem. For meson, this has to be combined with a trick in src/backend/utils/activity/meson.build, where include_directories needs to point to include/utils/ to make the inclusion of the C files work properly, causing builds to pull in PostgreSQL headers rather than system headers in some build paths, as src/include/utils/ would take priority. In order to fix this issue, this commit reworks the way the C/H files are generated, becoming consistent with guc_tables.inc.c: - For meson, basically nothing changes. The files are still generated in src/include/utils/. The trick with include_directories is removed. - For configure, the files are now generated in src/backend/utils/, with links in src/include/utils/ pointing to the ones in src/backend/. This requires extra rules in src/backend/utils/activity/Makefile so as a make command in this sub-directory is able to work. - The three files now fall under header-stamp, which is actually simpler as guc_tables.inc.c does the same. - wait_event_funcs_data.c and pgstat_wait_event.c are now included with "utils/" in their path. 
This problem has not been an issue in the buildfarm; it has been noted with AIX and a conflict with float.h. This issue could, however, create conflicts in the buildfarm depending on the environment with unexpected headers pulled in, so this fix is backpatched down to where the generation of the wait-event files was introduced. While at it, this commit simplifies wait_event_names.txt regarding the paths of the files generated, to mention just the names of the files generated. The paths where the files are generated became incorrect. The path of the SGML file was wrong. This change has been tested in the CI, down to v17. Locally, I have run tests with configure (with and without VPATH), as well as meson, on the three branches. Combo oversight in `fa88928470` and `1e68e43d3f`. Reported-by: Aditya Kamath <aditya.kamath1@ibm.com> Discussion: https://postgr.es/m/LV8PR15MB64888765A43D229EA5D1CFE6D691A@LV8PR15MB6488.namprd15.prod.outlook.com Backpatch-through: 17	2026-02-02 08:02:39 +09:00
Tom Lane	6918434a4a	Make psql/t/030_pager.pl more robust. Similarly to the preceding commit, 030_pager.pl was assuming that patterns it looks for in interactive psql output would appear by themselves on a line, but that assumption tends to fall over in builds made --without-readline: the output we get might have a psql prompt immediately followed by the expected line of output. For several of these tests, just checking for the pattern followed by newline seems sufficient, because we could not get a false match against the command echo, nor against the unreplaced command output if the pager fails to be invoked when expected. However, that's fairly scary for the test that was relying on information_schema.referential_constraints: "\d+" could easily appear at the end of a line in that view. Let's get rid of that hazard by making a custom test view instead of using information_schema.referential_constraints. This test script is new in v19, so no need for back-patch. Reported-by: Oleg Tselebrovskiy <o.tselebrovskiy@postgrespro.ru> Author: Oleg Tselebrovskiy <o.tselebrovskiy@postgrespro.ru> Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Soumya S Murali <soumyamurali.work@gmail.com> Discussion: https://postgr.es/m/db6fdb35a8665ad3c18be01181d44b31@postgrespro.ru	2026-01-30 15:11:44 -05:00
Tom Lane	a1d7ae2b2e	Improve guards against false regex matches in BackgroundPsql.pm. BackgroundPsql needs to wait for all the output from an interactive psql command to come back. To make sure that's happened, it issues the command, then issues \echo and \warn psql commands that echo a "banner" string (which we assume won't appear in the command's output), then waits for the banner strings to appear. The hazard in this approach is that the banner will also appear in the echoed psql commands themselves, so we need to distinguish those echoes from the desired output. Commit `8b886a4e3` tried to do that by positing that the desired output would be directly preceded and followed by newlines, but it turns out that that assumption is timing-sensitive. In particular, it tends to fail in builds made --without-readline, wherein the command echoes will be made by the pty driver and may be interspersed with prompts issued by psql proper. It does seem safe to assume that the banner output we want will be followed by a newline, since that should be the last output before things quiesce. Therefore, we can improve matters by putting quotes around the banner strings in the \echo and \warn psql commands, so that their echoes cannot include banner directly followed by newline, and then checking for just banner-and-newline in the match pattern. While at it, spruce up the pump() call in sub query() to look like the neater version in wait_connect(), and don't die on timeout until after printing whatever we got. Reported-by: Oleg Tselebrovskiy <o.tselebrovskiy@postgrespro.ru> Diagnosed-by: Oleg Tselebrovskiy <o.tselebrovskiy@postgrespro.ru> Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Soumya S Murali <soumyamurali.work@gmail.com> Discussion: https://postgr.es/m/db6fdb35a8665ad3c18be01181d44b31@postgrespro.ru Backpatch-through: 14	2026-01-30 14:59:25 -05:00
Heikki Linnakangas	e2362eb2bd	Move shmem allocator's fields from PGShmemHeader to its own struct For readability. It was a slight modularity violation to have fields in PGShmemHeader that were only used by the allocator code in shmem.c. And it was inconsistent that ShmemLock was nevertheless not stored there. Moving all the allocator-related fields to a separate struct makes it more consistent and modular, and removes the need to allocate and pass ShmemLock separately via BackendParameters. Merge InitShmemAccess() and InitShmemAllocation() into a single function that initializes the struct when called from postmaster, and when called from backends in EXEC_BACKEND mode, re-establishes the global variables. That's similar to all the *ShmemInit() functions that we have. Co-authored-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Discussion: https://www.postgresql.org/message-id/CAExHW5uNRB9oT4pdo54qAo025MXFX4MfYrD9K15OCqe-ExnNvg@mail.gmail.com	2026-01-30 18:22:56 +02:00
Álvaro Herrera	e76221bd95	Minor cosmetic tweaks These changes should have been done by `2f9661311b`, but were overlooked. I noticed while reviewing the code for commit `b8926a5b4b`. Author: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/18984-0f4778a6599ac3ae@postgresql.org	2026-01-30 14:26:02 +01:00
Álvaro Herrera	1eb09ed63a	Use C99 designated initializers in a couple of places This makes the arrays somewhat easier to read. Author: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl> Discussion: https://postgr.es/m/202601281204.sdxbr5qvpunk@alvherre.pgsql	2026-01-30 10:11:04 +01:00
Fujii Masao	bb26a81ee2	Remove unused argument from ApplyLogicalMappingFile(). Author: Yugo Nagata <nagata@sraoss.co.jp> Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com> Discussion: https://postgr.es/m/20260128120056.b2a3e8184712ab5a537879eb@sraoss.co.jp	2026-01-30 09:05:35 +09:00
Andres Freund	87f7b824f2	tableam: Perform CheckXidAlive check once per scan Previously, the CheckXidAlive check was performed within the table_scannext functions. This caused the check to be executed for every fetched tuple, an unnecessary overhead. To fix, move the check to table_beginscan* so it is performed once per scan rather than once per row. Note: table_tuple_fetch_row_version() does not use a scan descriptor; therefore, the CheckXidAlive check is retained in that function. The overhead is unlikely to be relevant for the existing callers. Reported-by: Andres Freund <andres@anarazel.de> Author: Dilip Kumar <dilipbalaut@gmail.com> Suggested-by: Andres Freund <andres@anarazel.de> Suggested-by: Amit Kapila <akapila@postgresql.org> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://www.postgresql.org/message-id/tlpltqm5jjwj7mp66dtebwwhppe4ri36vdypux2zoczrc2i3mp%40dhv4v4nikyfg	2026-01-29 17:52:07 -05:00
Andres Freund	333f586372	bufmgr: Allow conditional locking of already locked buffer In `fcb9c977aa` I included an assertion in BufferLockConditional() to detect if a conditional lock acquisition is done on a buffer that we already have locked. The assertion was added in the course of adding other assertions. Unfortunately I failed to realize that some of our code relies on such lock acquisitions to silently fail. E.g. spgist and nbtree may try to conditionally lock an already locked buffer when acquiring an empty buffer. LWLockAcquireConditional(), which was previously used to implement ConditionalLockBuffer(), does not have such an assert. Instead of just removing the assert, and relying on the lock acquisition to fail due to the buffer already being locked, this commit changes the behaviour of conditional content lock acquisition to fail if the current backend has any pre-existing lock on the buffer, even if the lock modes would not conflict. The reason for that is that we currently do not have space to track multiple lock acquisitions on a single buffer. Allowing multiple locks on the same buffer by a backend also seems likely to lead to bugs. There is only one non-self-exclusive conditional content lock acquisition, in GetVictimBuffer(), but it only is used if the target buffer is not pinned and thus can't already be locked by the current backend. Reported-by: Alexander Lakhin <exclusion@gmail.com> Discussion: https://postgr.es/m/90bd2cbb-49ce-4092-9f61-5ac2ab782c94@gmail.com	2026-01-29 16:49:01 -05:00
Tom Lane	bd9dfac8b1	Further fix extended alignment for older g++. Commit `6ceef9408` was still one brick shy of a load, because it caused any usage at all of PGIOAlignedBlock or PGAlignedXLogBlock to fail under older g++. Notably, this broke "headerscheck --cplusplus". We can permit references to these structs as abstract structs though; only actual declaration of such a variable needs to be forbidden. Discussion: https://www.postgresql.org/message-id/3119480.1769189606@sss.pgh.pa.us	2026-01-29 16:16:36 -05:00
Jeff Davis	de90bb7db1	Fix theoretical memory leaks in pg_locale_libc.c. The leaks were hard to reach in practice and the impact was low. The callers provide a buffer the same number of bytes as the source string (plus one for NUL terminator) as a starting size, and libc never increases the number of characters. But, if the byte length of one of the converted characters is larger, then it might need a larger destination buffer. Previously, in that case, the working buffers would be leaked. Even in that case, the call typically happens within a context that will soon be reset. Regardless, it's worth fixing to avoid such assumptions, and the fix is simple so it's worth backporting. Discussion: https://postgr.es/m/e2b7a0a88aaadded7e2d19f42d5ab03c9e182ad8.camel@j-davis.com Backpatch-through: 18	2026-01-29 10:14:55 -08:00
Álvaro Herrera	ec31744071	Replace literal 0 with InvalidXLogRecPtr for XLogRecPtr assignments Use the proper constant InvalidXLogRecPtr instead of literal 0 when assigning XLogRecPtr variables and struct fields. This improves code clarity by making it explicit that these are invalid LSN values rather than ambiguous zero literals. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://postgr.es/m/aRtd2dw8FO1nNX7k@ip-10-97-1-34.eu-west-3.compute.internal	2026-01-29 18:37:09 +01:00
Robert Haas	71c1136989	Fix mistakes in commit `4020b370f2` cost_tidrangescan() was setting the disabled_nodes value correctly, and then immediately resetting it to zero, due to poor code editing on my part. materialize_finished_plan correctly set matpath.parent to zero, but forgot to also set matpath.parallel_workers = 0, causing an access to uninitialized memory in cost_material. (This shouldn't result in any real problem, but it makes valgrind unhappy.) reparameterize_path was dereferencing a variable before verifying that it was not NULL. Reported-by: Tom Lane <tgl@sss.pgh.pa.us> (issue #1) Reported-by: Michael Paquier <michael@paquier.xyz> (issue #1) Diagnosed-by: Lukas Fittl <lukas@fittl.com> (issue #1) Reported-by: Zsolt Parragi <zsolt.parragi@percona.com> (issue #2) Reported-by: Richard Guo <guofenglinux@gmail.com> (issue #3) Discussion: http://postgr.es/m/CAN4CZFPvwjNJEZ_JT9Y67yR7C=KMNa=LNefOB8ZY7TKDcmAXOA@mail.gmail.com Discussion: http://postgr.es/m/aXrnPgrq6Gggb5TG@paquier.xyz	2026-01-29 08:04:47 -05:00
Alexander Korotkov	20a8f783e1	Wake LSN waiters before recovery target stop Move WaitLSNWakeup() immediately after ApplyWalRecord() so waiters are signaled even when recoveryStopsAfter() breaks out for pause/promotion targets. Discussion: https://postgr.es/m/9533608f-f289-44bd-b881-9e5a73203c5b%40iki.fi Discussion: https://postgr.es/m/CABPTF7Wdq6KbvC3EhLX3Pz%3DODCCPEX7qVQ%2BE%3DcokkB91an2E-A%40mail.gmail.com Reported-by: Heikki Linnakangas <hlinnaka@iki.fi> Author: Xuneng Zhou <xunengzhou@gmail.com>	2026-01-29 09:47:09 +02:00
Michael Paquier	4b77282f25	psql: Disable %P (pipeline status) for non-active connection In the psql prompt, the %P escape shows the current pipeline status. Unlike most of the other options, its status was showing up in the output generated even if psql was not connected to a database. This was confusing, because without a connection a pipeline status makes no sense. Like the other options, %P is updated so that its data is now hidden without an active connection. Author: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/86EF76B5-6E62-404D-B9EC-66F4714D7D5F@gmail.com Backpatch-through: 18	2026-01-29 16:20:45 +09:00
Michael Paquier	740a1494f4	Fix two error messages in extended_stats_funcs.c These have been fat-fingered in `0e80f3f88d` and `302879bd68`. The error message for ndistinct had incorrect grammar, while the one for dependencies finished with a period (incorrect based on the project guidelines). Discussion: https://postgr.es/m/aXrsjZQbVuB6236u@paquier.xyz	2026-01-29 14:57:47 +09:00
Michael Paquier	fc365e4fcc	Add test doing some cloning of extended statistics data The test added in this commit copies the data of an ANALYZE run on one relation to a secondary relation with the same attribute definitions and extended statistics objects. Once the clone is done, the target and origin should have the same extended statistics information, with no differences. This test would have been able to catch `e3094679b9`, for example, as we expect the full range of statistics to be copied over, with no differences generated between the results of an ANALYZE and the data copied to the cloned relation. Note that this new test should remain at the bottom of stats_import.sql, so that any additions in the main relation and its clone are automatically covered when copying their statistics, and so that it works as a sanity check in the future. Author: Corey Huinker <corey.huinker@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CADkLM=dpz3KFnqP-dgJ-zvRvtjsa8UZv8wDAQdqho=qN3kX0Zg@mail.gmail.com	2026-01-29 13:22:07 +09:00
Michael Paquier	0b7beec42a	Add test for pg_restore_extended_stats() with multiranges The restore of extended statistics has some paths dedicated to multirange types and expressions for all the stats kinds supported, and we did not have coverage for the case where an extended stats object uses a multirange attribute with or without an expression. Extracted from a larger patch by the same author, with a couple of tweaks from me regarding the format of the output generated, to make it more readable to the eye. Author: Corey Huinker <corey.huinker@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CADkLM=dpz3KFnqP-dgJ-zvRvtjsa8UZv8wDAQdqho=qN3kX0Zg@mail.gmail.com	2026-01-29 12:38:10 +09:00
Amit Kapila	3a98f989e8	Fix CI failure introduced in commit `851f6649cc`. The test added in commit `851f6649cc` uses a backup taken from a node created by the previous test to perform standby related checks. On Windows, however, the standby failed to start with the following error: FATAL: could not rename file "backup_label" to "backup_label.old": Permission denied This occurred because some background sessions from the earlier test were still active. These leftover processes continued accessing the parent directory of the backup_label file, likely preventing the rename and causing the failure. Ensuring that these sessions are cleanly terminated resolves the issue in local testing. Additionally, the has_restoring => 1 option has been removed, as it was not required by the new test. Reported-by: Robert Haas <robertmhaas@gmail.com> Backpatch-through: 17 Discussion: https://postgr.es/m/CA+TgmobdVhO0ckZfsBZ0wqDO4qHVCwZZx8sf=EinafvUam-dsQ@mail.gmail.com	2026-01-29 03:22:02 +00:00
Michael Paquier	efbebb4e85	Add support for "mcv" in pg_restore_extended_stats() This commit adds support for the restore of extended statistics of the kind "mcv", aka most-common values. This format is different from n_distinct and dependencies stat types in that it is the combination of three of the four different arrays from the pg_stats_ext view which in turn require three different input parameters on pg_restore_extended_stats(). These are translated into three input arguments for the function: - "most_common_vals", acting as a leader of the others. It is a 2-dimension array, that includes the common values. - "most_common_freqs", 1-dimension array of float8[], with a number of elements that has to match with "most_common_vals". - "most_common_base_freqs", 1-dimension array of float8[], with a number of elements that has to match with "most_common_vals". All three arrays are required to achieve the restore of this type of extended statistics (if "most_common_vals" happens to be NULL in the catalogs, the rest is NULL by design). Note that "most_common_val_nulls" is not required in input, its data is rebuilt from the decomposition of the "most_common_vals" array based on its text[] representation. The initial versions of the patch provided this option in input, but we do not require it and it simplifies the result a lot. Support in pg_dump is added down to v13 which is where the support for this type of extended statistics has been added, when --statistics is used. This means that upgrade and dumps can restore extended statistics data transparently, like "dependencies", "ndistinct", attribute and relation statistics. For MCV, the values are directly queried from the relevant catalogs. Author: Corey Huinker <corey.huinker@gmail.com> Co-authored-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CADkLM=dpz3KFnqP-dgJ-zvRvtjsa8UZv8wDAQdqho=qN3kX0Zg@mail.gmail.com	2026-01-29 12:14:08 +09:00
Michael Paquier	e09e5ad69a	Fix whitespace issue in regression test stats_import Issue noticed while playing with this area of the tests for a different patch.	2026-01-29 11:59:43 +09:00
Nathan Bossart	ef1c865206	Add a couple of recent commits to .git-blame-ignore-revs.	2026-01-28 15:56:48 -06:00
Masahiko Sawada	8f1e2dfe03	Consolidate replication origin session globals into a single struct. This commit moves the separate global variables for replication origin state into a single ReplOriginXactState struct. This groups logically related variables, which improves code readability and simplifies state management (e.g., resetting the state) by handling them as a unit. Author: Chao Li <lic@highgo.com> Suggested-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Discussion: https://postgr.es/m/CAEoWx2=pYvfRthXHTzSrOsf5_FfyY4zJyK4zV2v4W=yjUij1cA@mail.gmail.com	2026-01-28 12:26:22 -08:00
Masahiko Sawada	227eb4eea2	Refactor replication origin state reset helpers. Factor out common logic for clearing replorigin_session_* variables into a dedicated helper function, replorigin_xact_clear(). This removes duplicated assignments of these variables across multiple call sites, and makes the intended scope of each reset explicit. Author: Chao Li <lic@highgo.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/CAEoWx2=pYvfRthXHTzSrOsf5_FfyY4zJyK4zV2v4W=yjUij1cA@mail.gmail.com	2026-01-28 11:45:26 -08:00
Masahiko Sawada	1fdbca159e	Standardize replication origin naming to use "ReplOrigin". The replication origin code was using inconsistent naming conventions. Functions were typically prefixed with 'replorigin', while typedefs and constants used "RepOrigin". This commit unifies the naming convention by renaming RepOriginId to ReplOriginId. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CAD21AoBDgm3hDqUZ+nqu=ViHmkCnJBuJyaxG_yvv27BAi2zBmQ@mail.gmail.com	2026-01-28 11:03:29 -08:00
Robert Haas	4020b370f2	Allow for plugin control over path generation strategies. Each RelOptInfo now has a pgs_mask member which is a mask of acceptable strategies. For most rels, this is populated from PlannerGlobal's default_pgs_mask, which is computed from the values of the enable_* GUCs at the start of planning. For baserels, get_relation_info_hook can be used to adjust pgs_mask for each new RelOptInfo, at least for rels of type RTE_RELATION. Adjusting pgs_mask is less useful for other types of rels, but if it proves to be necessary, we can revisit the way this hook works or add a new one. For joinrels, two new hooks are added. joinrel_setup_hook is called each time a joinrel is created, and one thing that can be done from that hook is to manipulate pgs_mask for the new joinrel. join_path_setup_hook is called each time we're about to add paths to a joinrel by considering some particular combination of an outer rel, an inner rel, and a join type. It can modify the pgs_mask propagated into JoinPathExtraData to restrict strategy choice for that particular combination of rels. To make joinrel_setup_hook work as intended, the existing calls to build_joinrel_partition_info are moved later in the calling functions; this is because that function checks whether the rel's pgs_mask includes PGS_CONSIDER_PARTITIONWISE, so we want it to only be called after plugins have had a chance to alter pgs_mask. Upper rels currently inherit pgs_mask from the input relation. It's unclear that this is the most useful behavior, but at the moment there are no hooks to allow the mask to be set in any other way. 
Reviewed-by: Lukas Fittl <lukas@fittl.com> Reviewed-by: Jakub Wartak <jakub.wartak@enterprisedb.com> Reviewed-by: Greg Burd <greg@burd.me> Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com> Reviewed-by: Amit Langote <amitlangote09@gmail.com> Reviewed-by: Haibo Yan <tristan.yim@gmail.com> Discussion: http://postgr.es/m/CA+TgmoZ-Jh1T6QyWoCODMVQdhTUPYkaZjWztzP1En4=ZHoKPzw@mail.gmail.com	2026-01-28 11:28:34 -05:00
Álvaro Herrera	e6d6e32f42	Fix duplicate arbiter detection during REINDEX CONCURRENTLY on partitions Commit `90eae926a` fixed ON CONFLICT handling during REINDEX CONCURRENTLY on partitioned tables by treating unparented indexes as potential arbiters. However, there's a remaining race condition: when pg_inherits records are swapped between consecutive calls to get_partition_ancestors(), two different child indexes can appear to have the same parent, causing duplicate entries in the arbiter list and triggering "invalid arbiter index list" errors. Note that this is not a new problem introduced by `90eae926a`. The same error could occur before that commit in a slightly different scenario: an index is selected during planning, then index_concurrently_swap() commits, and a subsequent call to get_partition_ancestors() uses a new catalog snapshot that sees zero ancestors for that index. Fix by tracking which parent indexes have already been processed. If a subsequent call to get_partition_ancestors() returns a parent we've already seen, treat that index as unparented instead, allowing it to be matched via IsIndexCompatibleAsArbiter() like other concurrent reindex scenarios. Author: Mihail Nikalayeu <mihailnikalayeu@gmail.com> Reported-by: Alexander Lakhin <exclusion@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/e5a8c1df-04e5-4343-85ef-5df2a7e3d90c@gmail.com	2026-01-28 14:38:53 +01:00
Michael Paquier	e3094679b9	Fix pg_restore_extended_stats() with expressions This commit fixes an issue with the restore of ndistinct and dependencies statistics, causing the operation to fail when any of these kinds included expressions. In extended statistics, expressions use strictly negative attribute numbers, decremented from -1. For example, let's imagine an object defined as follows: CREATE STATISTICS stats_obj (dependencies) ON lower(name), upper(name) FROM tab_obj; This object would generate dependencies stats using -1 and -2 as attribute numbers, like that: [{"attributes": [-1], "dependency": -2, "degree": 1.000000}, {"attributes": [-2], "dependency": -1, "degree": 1.000000}] However, pg_restore_extended_stats() forgot to account for the number of expressions defined in an extended statistics object. This would cause the validation step of ndistinct and dependencies data to fail, preventing a restore of their stats even if the input is valid. This issue has come up due to an incorrect split of the patch set. Some tests are included to cover this behavior. Author: Corey Huinker <corey.huinker@gmail.com> Co-authored-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/aXl4bMfSTQUxM_yy@paquier.xyz	2026-01-28 11:48:45 +09:00
Michael Paquier	f9562b95c6	Add output test for pg_dependencies statistics import Commit `302879bd68` has added the ability to restore extended stats of the type "dependencies", but it has forgotten the addition of a test to verify that the value restored was actually set. This test is the pg_dependencies equivalent of the test added for pg_ndistinct in `0e80f3f88d`. Author: Corey Huinker <corey.huinker@gmail.com> Discussion: https://postgr.es/m/CADkLM=dZr_Ut3jKw94_BisyyDtNZPRJWeOALXVzcJz=ZFTAhvQ@mail.gmail.com	2026-01-28 08:37:46 +09:00
Jacob Champion	c6a10a89fe	oauth: Correct test dependency on oauth_hook_client The oauth_validator tests missed the lessons of `c89525d57` et al, so certain combinations of command-line build order and `meson test` options can result in Command 'oauth_hook_client' not found in [...] at src/test/perl/PostgreSQL/Test/Utils.pm line 427. Add the missing dependency on the test executable. This fixes, for example, $ ninja clean && ninja meson-test-prereq && PG_TEST_EXTRA=oauth meson test --no-rebuild Reported-by: Jonathan Gonzalez V. <jonathan.abdiel@gmail.com> Author: Jonathan Gonzalez V. <jonathan.abdiel@gmail.com> Discussion: https://postgr.es/m/6e8f4f7c23faf77c4b6564c4b7dc5d3de64aa491.camel@gmail.com Discussion: https://postgr.es/m/qh4c5tvkgjef7jikjig56rclbcdrrotngnwpycukd2n3k25zi2%4044hxxvtwmgum Backpatch-through: 18	2026-01-27 11:56:44 -08:00
Robert Haas	9a446d0256	pg_waldump: Remove file-level global WalSegSz. It's better style to pass the value around to just the places that need it. This makes it easier to determine whether the value is always properly initialized before use. Reviewed-by: Amul Sul <sulamul@gmail.com> Discussion: http://postgr.es/m/CAAJ_b94+wObPn-z1VECipnSFhjMJ+R2cpTmKVYLjyQuVn+B5QA@mail.gmail.com	2026-01-27 08:33:20 -05:00
Amit Kapila	851f6649cc	Prevent invalidation of newly synced replication slots. A race condition could cause a newly synced replication slot to become invalidated between its initial sync and the checkpoint. When syncing a replication slot to a standby, the slot's initial restart_lsn is taken from the publisher's remote_restart_lsn. Because slot sync happens asynchronously, this value can lag behind the standby's current redo pointer. Without any interlocking between WAL reservation and checkpoints, a checkpoint may remove WAL required by the newly synced slot, causing the slot to be invalidated. To fix this, we acquire ReplicationSlotAllocationLock before reserving WAL for a newly synced slot, similar to commit `006dd4b2e5`. This ensures that if WAL reservation happens first, the checkpoint process must wait for slotsync to update the slot's restart_lsn before it computes the minimum required LSN. However, unlike in ReplicationSlotReserveWal(), this lock alone cannot protect a newly synced slot if a checkpoint has already run CheckPointReplicationSlots() before slotsync updates the slot. In such cases, the remote restart_lsn may be stale and earlier than the current redo pointer. To prevent relying on an outdated LSN, we use the oldest WAL location available if it is greater than the remote restart_lsn. This ensures that newly synced slots always start with a safe, non-stale restart_lsn and are not invalidated by concurrent checkpoints. Author: Zhijie Hou <houzj.fnst@fujitsu.com> Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Vitaly Davydov <v.davydov@postgrespro.ru> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Backpatch-through: 17 Discussion: https://postgr.es/m/TY4PR01MB16907E744589B1AB2EE89A31F94D7A%40TY4PR01MB16907.jpnprd01.prod.outlook.com	2026-01-27 05:06:29 +00:00
Michael Paquier	c32fb29e97	Include extended statistics data in pg_dump This commit integrates the new pg_restore_extended_stats() function into pg_dump, so as the data of extended statistics is detected and included in dumps when the --statistics switch is specified. Currently, the same extended stats kinds as the ones supported by the SQL function can be dumped: "n_distinct" and "dependencies". The extended statistics data can be dumped down to PostgreSQL 10, with the following changes depending on the backend version dealt with: - In v19 and newer versions, the format of pg_ndistinct and pg_dependencies has changed, catalogs can be directly queried. - In v18 and older versions, the format is translated to the new format supported by the backend. - In v14 and older versions, inherited extended statistics are not supported. - In v11 and older versions, the data for ndistinct and dependencies was stored in pg_statistic_ext. These have been moved to pg_stats_ext in v12. - Extended Statistics have been introduced in v10, no support is needed for versions older than that. The extended statistics data is dumped if it can be found in the catalogs. If the catalogs are empty, then no restore of the stats data is attempted. Author: Corey Huinker <corey.huinker@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CADkLM=dpz3KFnqP-dgJ-zvRvtjsa8UZv8wDAQdqho=qN3kX0Zg@mail.gmail.com	2026-01-27 13:42:32 +09:00
Fujii Masao	1ea44d7ddf	Remove unnecessary abort() from WalSndShutdown(). WalSndShutdown() previously called abort() after proc_exit(0) to silence compiler warnings. This is no longer needed, because both WalSndShutdown() and proc_exit() are declared pg_noreturn, allowing the compiler to recognize that the function does not return. Also there are already other functions, such as CheckpointerMain(), that call proc_exit() without an abort(), and they do not produce warnings. Therefore this abort() call in WalSndShutdown() is useless and this commit removes it. Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Discussion: https://postgr.es/m/CAHGQGwHPX1yoixq+YB5rF4zL90TMmSEa3FpHURtqW3Jc5+=oSA@mail.gmail.com	2026-01-27 11:55:32 +09:00
Tomas Vondra	09c37015d4	pgindent fix for `3fccbd94cb` Backpatch-through: 18	2026-01-27 00:26:36 +01:00
Tomas Vondra	3fccbd94cb	Handle ENOENT status when querying NUMA node We've assumed that touching the memory is sufficient for a page to be located on one of the NUMA nodes. But a page may be moved to swap after we touch it, due to memory pressure. We touch the memory before querying the status, but there is no guarantee it won't be moved to the swap in the meantime. The touching happens only on the first call, so later calls are more likely to be affected. And the batching increases the window too. It's up to the kernel if/when pages get moved to swap. We have to accept ENOENT (-2) as a valid result, and handle it without failing. This patch simply treats it as an unknown node, and returns NULL in the two affected views (pg_shmem_allocations_numa and pg_buffercache_numa). Hugepages cannot be swapped out, so this affects only regular pages. Reported by Christoph Berg, investigation and fix by me. Backpatch to 18, where the two views were introduced. Reported-by: Christoph Berg <myon@debian.org> Discussion: https://postgr.es/m/aTq5Gt_n-oS_QSpL@msg.df7cb.de Backpatch-through: 18	2026-01-27 00:21:40 +01:00
Michael Paquier	302879bd68	Add support for "dependencies" in pg_restore_extended_stats() This commit adds support for the restore of extended statistics of the kind "dependencies", for the following input data: [{"attributes": [2], "dependency": 3, "degree": 1.000000}, {"attributes": [3], "dependency": 2, "degree": 1.000000}] This relies on the existing routines of "dependencies" to cross-check the input data with the definition of the extended statistics objects for the attribute numbers. An input argument of type "pg_dependencies" is required for this new option. Thanks to the work done in `0e80f3f88d` for the restore function and `e1405aa5e3` for the input handling of data type pg_dependencies, this addition is straight-forward. This will be used so as it is possible to transfer these statistics across dumps and upgrades, removing the need for a post-operation ANALYZE for these kinds of statistics. Author: Corey Huinker <corey.huinker@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CADkLM=dpz3KFnqP-dgJ-zvRvtjsa8UZv8wDAQdqho=qN3kX0Zg@mail.gmail.com	2026-01-27 08:20:13 +09:00
Melanie Plageman	19af794b66	Refactor lazy_scan_prune() VM clear logic into helper Encapsulating the cases that clear the visibility map after vacuum phase I, when corruption is detected, into a helper makes the code cleaner and enables further refactoring in future commits. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru> Discussion: https://postgr.es/m/7ib3sa55sapwjlaz4sijbiq7iezna27kjvvvar4dpgkmadml6t%40gfpkkwmdnepx	2026-01-26 17:12:05 -05:00
Melanie Plageman	648a7e28d7	Eliminate use of cached VM value in lazy_scan_prune() lazy_scan_prune() takes a parameter from lazy_scan_heap() indicating whether the page was marked all-visible in the VM at the time it was last checked in find_next_unskippable_block(). This behavior is historical, dating back to commit `608195a3a3`, when we did not pin the VM page until deciding we must read it. Now that the VM page is already pinned, there is no meaningful benefit to relying on a cached VM status. Removing this cached value simplifies the logic in both lazy_scan_heap() and lazy_scan_prune(). It also clarifies future work that will set the visibility map on-access: such paths will not have a cached value available, which would make the logic harder to reason about. And eliminating it enables us to detect and repair VM corruption on-access. Along with removing the cached value and unconditionally checking the visibility status of the heap page, this commit also moves the VM corruption handling to occur first. This reordering should have no performance impact, since the checks are inexpensive and performed only once per page. It does, however, make the control flow easier to understand. The new restructuring also makes it possible to set the VM after fixing corruption (if pruning found the page all-visible). Now that no callers of visibilitymap_set() use its return value, change its (and visibilitymap_set_vmbits()) return type to void. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru> Reviewed-by: Xuneng Zhou <xunengzhou@gmail.com> Discussion: https://postgr.es/m/5CEAA162-67B1-44DA-B60D-8B65717E8B05%40gmail.com	2026-01-26 17:00:13 -05:00
Melanie Plageman	21796c267d	Combine visibilitymap_set() cases in lazy_scan_prune() lazy_scan_prune() previously had two separate cases that called visibilitymap_set() after pruning and freezing. These branches were nearly identical except that one attempted to avoid dirtying the heap buffer. However, that situation can never occur — the heap buffer cannot be clean at that point (and we would hit an assertion if it were). In lazy_scan_prune(), when we change a previously all-visible page to all-frozen and the page was recorded as all-visible in the visibility map by find_next_unskippable_block(), the heap buffer will always be dirty. Either we have just frozen a tuple and already dirtied the buffer, or the buffer was modified between find_next_unskippable_block() and heap_page_prune_and_freeze() and then pruned in heap_page_prune_and_freeze(). Additionally, XLogRegisterBuffer() asserts that the buffer is dirty, so attempting to add a clean heap buffer to the WAL chain would assert out anyway. Since the “clean heap buffer with already set VM” case is impossible, the two visibilitymap_set() branches in lazy_scan_prune() can be merged. Doing so makes the intent clearer and emphasizes that the heap buffer must always be marked dirty before being added to the WAL chain. This commit also adds a test case for vacuuming when no heap modifications are required. Currently this ensures that the heap buffer is marked dirty before it is added to the WAL chain, but if we later remove the heap buffer from the VM-set WAL chain or pass it with the REGBUF_NO_CHANGES flag, this test would guard that behavior. 
Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Srinath Reddy Sadipiralla <srinath2133@gmail.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Xuneng Zhou <xunengzhou@gmail.com> Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru> Discussion: https://postgr.es/m/5CEAA162-67B1-44DA-B60D-8B65717E8B05%40gmail.com Discussion: https://postgr.es/m/flat/CAAKRu_ZWx5gCbeCf7PWCv8p5%3D%3Db7EEws0VD2wksDxpXCvCyHvQ%40mail.gmail.com	2026-01-26 16:03:32 -05:00
Tomas Vondra	f6e5d21bf7	Exercise parallel GIN builds in regression tests Modify two places creating GIN indexes in regression tests, so that the build is parallel. This provides a basic test coverage, even if the amounts of data are fairly small. Reported-by: Kirill Reshke <reshkekirill@gmail.com> Backpatch-through: 18 Discussion: https://postgr.es/m/CALdSSPjUprTj+vYp1tRKWkcLYzdy=N=O4Cn4y_HoxNSqQwBttg@mail.gmail.com	2026-01-26 20:05:17 +01:00
Tomas Vondra	db14dcdec6	Lookup the correct ordering for parallel GIN builds When building a tuplesort during parallel GIN builds, the function incorrectly looked up the default B-Tree operator, not the function associated with the GIN opclass (through GIN_COMPARE_PROC). Fixed by using the same logic as initGinState(), and the other place in parallel GIN builds. This could cause two types of issues. First, a data type might not have a B-Tree opclass, in which case the PrepareSortSupportFromOrderingOp() fails with an ERROR. Second, a data type might have both B-Tree and GIN opclasses, defining order/equality in different ways. This could lead to logical corruption in the index. Backpatch to 18, where parallel GIN builds were introduced. Discussion: https://postgr.es/m/73a28b94-43d5-4f77-b26e-0d642f6de777@iki.fi Reported-by: Heikki Linnakangas <hlinnaka@iki.fi> Backpatch-through: 18	2026-01-26 20:05:17 +01:00
Robert Haas	4cbaf4dcd2	Reduce length of TAP test file name. Buildfarm member fairywren hit the Windows limitation on the length of a file path. While there may be other things we should also do to prevent this from happening, it's certainly the case that the length of this test file name is much longer than others in the same directory, so make it shorter. Reported-by: Alexander Lakhin <exclusion@gmail.com> Discussion: http://postgr.es/m/274e0a1a-d7d2-4bc8-8b56-dd09f285715e@gmail.com Backpatch-through: 17	2026-01-26 12:43:52 -05:00
Peter Eisentraut	5ca5f12c2c	Fix accidentally cast away qualifiers This fixes cases where a qualifier (const, in all cases here) was dropped by a cast, but the cast was otherwise necessary or desirable, so the straightforward fix is to add the qualifier into the cast. Co-authored-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/b04f4d3a-5e70-4e73-9ef2-87f777ca4aac%40eisentraut.org	2026-01-26 16:02:31 +01:00
Fujii Masao	33a92632b7	doc: Clarify that \d and \d+ output lists are illustrative, not exhaustive. The psql documentation for the \d and \d+ meta-commands lists objects that may be shown, but previously the wording could be read as exhaustive even though additional objects can also appear in the output. This commit clarifies the description by adding phrasing such as "for example" or "such as", making it clear that the listed objects are illustrative rather than a complete list. While the change is small, it helps avoid potential user confusion. As this is a documentation clarification rather than a bug fix, it is not backpatched. Author: Peter Smith <smithpb2250@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Andreas Karlsson <andreas@proxel.se> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/CAHut+Pt1DBtaUqfJftkkaQLJJJenYJBtb6Ec6s6vu82KEMh46A@mail.gmail.com	2026-01-26 20:45:05 +09:00
David Rowley	7027dd499d	Remove deduplication logic from find_window_functions This code thought it was optimizing WindowAgg evaluation by getting rid of duplicate WindowFuncs, but it turns out all it does today is lead to cost-underestimations and makes it possible that optimize_window_clauses could miss some of the WindowFuncs that must receive an updated winref. The deduplication likely was useful when it was first added, but since the projection code was changed in `b8d7f053c`, the list of WindowFuncs gathered by find_window_functions isn't used during execution. Instead, the expression evaluation code will process the node's targetlist to find the WindowFuncs. The reason the deduplication could cause issues for optimize_window_clauses() is because if a WindowFunc is moved to another WindowClause, the winref is adjusted to reference the new WindowClause. If any duplicate WindowFuncs were discarded in find_window_functions() then the WindowFuncLists may not include all the WindowFuncs that need their winref adjusted. This could lead to an error message such as: ERROR: WindowFunc with winref 2 assigned to WindowAgg with winref 1 The back-branches will receive a different fix so that the WindowAgg costs are not affected. Author: Meng Zhang <mza117jc@gmail.com> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/CAErYLFAuxmW0UVdgrz7iiuNrxGQnFK_OP9hBD5CUzRgjrVrz=Q@mail.gmail.com	2026-01-26 23:27:15 +13:00
Peter Eisentraut	6ceef9408c	Disable extended alignment uses on older g++ Fix for commit `a9bdb63bba`. The previous plan of redefining alignas didn't work, because it interfered with other C++ header files (e.g., LLVM). So now the new workaround is to just disable the affected typedefs under the affected compilers. These are not typically used in extensions anyway. Discussion: https://www.postgresql.org/message-id/3119480.1769189606%40sss.pgh.pa.us	2026-01-26 10:23:14 +01:00
Michael Paquier	d9abd9e105	Add test for MAINTAIN permission with pg_restore_extended_stats() Like its cousin functions for the restore of relation and attribute stats, pg_restore_extended_stats() needs to be run by a user that is the database owner or has MAINTAIN privileges on the table whose stats are restored. This commit adds a regression test ensuring that MAINTAIN is required when calling the function. This test also checks that a ShareUpdateExclusive lock is taken on the table whose stats are restored. This has been split from the commit that has introduced pg_restore_extended_stats(), for clarity. Author: Corey Huinker <corey.huinker@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CADkLM=dpz3KFnqP-dgJ-zvRvtjsa8UZv8wDAQdqho=qN3kX0Zg@mail.gmail.com	2026-01-26 16:32:33 +09:00
Michael Paquier	114e84c532	Fix missing initialization in pg_restore_extended_stats() The tuple data upserted into pg_statistic_ext_data was missing an initialization for the nulls flag of stxoid and stxdinherit. This would cause an incorrect handling of the stats data restored. This issue has been spotted by CatalogTupleCheckConstraints(), translating to a NOT NULL constraint inconsistency, while playing more with the follow-up portions of the patch set. Oversight in `0e80f3f88d` (mea culpa). Surprisingly, the buildfarm did not complain yet. Discussion: https://postgr.es/m/CADkLM=c7DY3Jv6ef0n_MGUJ1FyTMUoT697LbkST05nraVGNHYg@mail.gmail.com	2026-01-26 16:13:41 +09:00
Michael Paquier	0e80f3f88d	Add pg_restore_extended_stats() This function closely mirrors its relation and attribute counterparts, but for extended statistics (i.e. CREATE STATISTICS) objects, being able to restore extended statistics for an extended stats object. Like the other functions, the goal of this feature is to ease the dump or upgrade of clusters so as ANALYZE would not be required anymore after these operations, stats being directly loaded into the target cluster without any post-dump/upgrade computation. The caller of this function needs the following arguments for the extended stats to restore: - The name of the relation. - The schema name of the relation. - The name of the extended stats object. - The schema name of the extended stats object. - If the stats are inherited or not. - One or more extended stats kinds with their data. This commit adds only support for the restore of the extended statistics kind "n_distinct", building the basic infrastructure for the restore of more extended statistics kinds in follow-up commits, including MCV and dependencies. The support for "n_distinct" is eased in this commit thanks to the previous work done particularly in commits `1f927cce44` and `44eba8f06e`, that have added the input function for the type pg_ndistinct, used as data type in input of this new restore function. Bump catalog version. Author: Corey Huinker <corey.huinker@gmail.com> Co-authored-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CADkLM=dpz3KFnqP-dgJ-zvRvtjsa8UZv8wDAQdqho=qN3kX0Zg@mail.gmail.com	2026-01-26 15:08:15 +09:00
Michael Paquier	d4504d6f60	Add test with multirange type for pg_restore_attribute_stats() This commit adds a test for pg_restore_attribute_stats() with the injection of statistics related to a multirange type. This case is supported in statatt_get_type() since its introduction in `ce207d2a79`, but there was no test in the main regression test suite to check for the case where attribute stats is restored for a multirange type, as done by multirange_typanalyze(). Author: Corey Huinker <corey.huinker@gmail.com> Discussion: https://postgr.es/m/CADkLM=c3JivzHNXLt-X_JicYknRYwLTiOCHOPiKagm2_vdrFUg@mail.gmail.com	2026-01-26 13:32:17 +09:00
Michael Paquier	c100340729	Remove PG_MMAP_FLAGS from mem.h Based on the name of the macro, it was implied that it could be used for all mmap() calls on portability grounds. However, its use is limited to sysv_shmem.c, for CreateAnonymousSegment(). This commit removes the declaration, reducing the confusion around it as a portability tweak, being limited to SysV-style shared memory. This macro has been introduced in `b0fc0df936` for sysv_shmem.c, originally. It has been moved to mem.h in `0ac5e5a7e1` a bit later. Suggested by: Peter Eisentraut <peter@eisentraut.org> Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Discussion: https://postgr.es/m/CAExHW5vTWABxuM5fbQcFkGuTLwaxuZDEE2vtx2WuMUWk6JnF4g@mail.gmail.com Discussion: https://postgr.es/m/12add41a-7625-4639-a394-a5563e349322@eisentraut.org	2026-01-26 10:52:02 +09:00
David Rowley	83a53572a6	Always inline SeqNext and SeqRecheck The intention of the work done in `fb9f95502` was that these functions are inlined. I noticed my compiler isn't doing this on -O2 (gcc version 15.2.0). Also, clang version 20.1.8 isn't inlining either. Fix by marking both of these functions as pg_attribute_always_inline to avoid leaving this up to the compiler's heuristics. A quick test with a Seq Scan on a table with a single int column running a query that filters all 1 million rows in the WHERE clause yields a 3.9% speedup on my Zen4 machine. Author: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/CAApHDvrL7Q41B=gv+3wc8+AJGKZugGegUbBo8FPQ+3+NGTPb+w@mail.gmail.com	2026-01-26 14:29:10 +13:00
Michael Paquier	168765e5d4	Add more tests with clause STORAGE on table and TOAST interactions This commit adds more tests to cover STORAGE MAIN and EXTENDED, checking how these use TOAST tables. EXTENDED is already widely tested as the default behavior, but there were no tests where the clause pattern is directly specified. STORAGE MAIN and its interactions with TOAST was not covered at all. This hole in the tests has been noticed for STORAGE MAIN (inline compressible varlenas), where I have managed to break the backend without the tests able to notice the breakage while playing with the varlena structures. Reviewed-by: Nikhil Kumar Veldanda <veldanda.nikhilkumar17@gmail.com> Discussion: https://postgr.es/m/aXMdX1UTHnzYPkHk@paquier.xyz	2026-01-26 09:30:22 +09:00
Peter Eisentraut	a9bdb63bba	Work around buggy alignas in older g++ Older g++ (<9.3) mishandle the alignas specifier (raise warnings that the alignment is too large), but the more or less equivalent attribute works. So as a workaround, #define alignas to that attribute for those versions. see <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89357> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/3119480.1769189606%40sss.pgh.pa.us	2026-01-25 11:32:47 +01:00
Michael Paquier	72e3abd082	pg_stat_statements: Fix test instability with cache-clobbering builds Builds with CLOBBER_CACHE_ALWAYS enabled are failing the new test introduced in `1572ea96e6`, checking the nesting level calculation in the planner hook. The inner query of the function called twice is registered as normalized, as such builds would register a PGSS entry in the post-parse-analyse hook due to the cached plans requiring revalidation. A trick based on debug_discard_caches cannot work as far as I can tell, a normalized query still being registered. This commit takes a different approach with the addition of a DISCARD PLANS before the first function call. This forces the use of a normalized query in the PGSS entry for the inner query of the function with and without CLOBBER_CACHE_ALWAYS, which should be enough to stabilize the test. Note that the test is still checking what it should: when removing the nesting level calculation in the planner hook of PGSS, one still gets a failure for the PGSS entry of the inner query in the function, with "toplevel" being flipped to true instead of false (it should be false, as a non-top-level entry). Per buildfarm members avocet and trilobite, at least. Reported-by: Alexander Lakhin <exclusion@gmail.com> Discussion: https://postgr.es/m/82dd02bb-4e0f-40ad-a60b-baa1763ff0bd@gmail.com	2026-01-25 19:01:23 +09:00
Dean Rasheed	b4307ae2e5	Fix trigger transition table capture for MERGE in CTE queries. When executing a data-modifying CTE query containing MERGE and some other DML operation on a table with statement-level AFTER triggers, the transition tables passed to the triggers would fail to include the rows affected by the MERGE. The reason is that, when initializing a ModifyTable node for MERGE, MakeTransitionCaptureState() would create a TransitionCaptureState structure with a single "tcs_private" field pointing to an AfterTriggersTableData structure with cmdType == CMD_MERGE. Tuples captured there would then not be included in the sets of tuples captured when executing INSERT/UPDATE/DELETE ModifyTable nodes in the same query. Since there are no MERGE triggers, we should only create AfterTriggersTableData structures for INSERT/UPDATE/DELETE. Individual MERGE actions should then use those, thereby sharing the same capture tuplestores as any other DML commands executed in the same query. This requires changing the TransitionCaptureState structure, replacing "tcs_private" with 3 separate pointers to AfterTriggersTableData structures, one for each of INSERT, UPDATE, and DELETE. Nominally, this is an ABI break to a public structure in commands/trigger.h. However, since this is a private field pointing to an opaque data structure, the only way to create a valid TransitionCaptureState is by calling MakeTransitionCaptureState(), and no extensions appear to be doing that anyway, so it seems safe for back-patching. Backpatch to v15, where MERGE was introduced. Bug: #19380 Reported-by: Daniel Woelfel <dwwoelfel@gmail.com> Author: Dean Rasheed <dean.a.rasheed@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/19380-4e293be2b4007248%40postgresql.org Backpatch-through: 15	2026-01-24 11:30:48 +00:00
Jacob Champion	9b9eaf08ab	libpq_pipeline: Test the default protocol version In preparation for a future change to libpq's default protocol version, pin today's default (3.0) in the libpq_pipeline tests. Patch by Jelte Fennema-Nio, with some additional refactoring of the PQconnectdbParams() logic by me. Author: Jelte Fennema-Nio <postgres@jeltef.nl> Discussion: https://postgr.es/m/DDPR5BPWH1RJ.1LWAK6QAURVAY%40jeltef.nl	2026-01-23 12:59:03 -08:00
Jacob Champion	f7521bf721	pqcomm.h: Explicitly reserve protocol v3.1 Document this unused version alongside the other special protocol numbers. Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAOYmi%2BkKyw%3Dh-5NKqqpc7HC5M30_QmzFx3kgq2AdipyNj47nUw%40mail.gmail.com	2026-01-23 12:57:15 -08:00
Nathan Bossart	4ce105d9c4	Add missing #include. Oversight in commit `8eef2df189`. Per buildfarm.	2026-01-23 11:00:06 -06:00
Nathan Bossart	8eef2df189	Fix some rounding code for shared memory. InitializeShmemGUCs() always added 1 to the value calculated for shared_memory_size_in_huge_pages, which is unnecessary if the shared memory size is divisible by the huge page size. CreateAnonymousSegment() neglected to check for overflow when rounding up to a multiple of the huge page size. These are arguably bugs, but they seem extremely unlikely to be causing problems in practice, so no back-patch. Author: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com> Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CAO6_Xqq2vZbva0R9eQSY0p2kfksX2aP4r%3D%2BZ_q1HBYNU%3Dm8bBg%40mail.gmail.com	2026-01-23 10:46:49 -06:00
Michael Paquier	a36164e746	Add WALRCV_CONNECTING state to the WAL receiver Previously, a WAL receiver freshly started would set its state to WALRCV_STREAMING immediately at startup, before actually establishing a replication connection. This commit introduces a new state called WALRCV_CONNECTING, which is the state used when the WAL receiver freshly starts, or when a restart is requested, with a switch to WALRCV_STREAMING once the connection to the upstream server has been established with COPY_BOTH, meaning that the WAL receiver is ready to stream changes. This change is useful for monitoring purposes, especially in environments with a high latency where a connection could take some time to be established, giving some room between the [re]start phase and the streaming activity. From the point of view of the startup process, that flips the shared memory state of the WAL receiver when it needs to be stopped, the existing WALRCV_STREAMING and the new WALRCV_CONNECTING states have the same semantics: the WAL receiver has started and it can be stopped. Based on an initial suggestion from Noah Misch, with some input from me about the design. Author: Xuneng Zhou <xunengzhou@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Rahila Syed <rahilasyed90@gmail.com> Discussion: https://postgr.es/m/CABPTF7VQ5tGOSG5TS-Cg+Fb8gLCGFzxJ_eX4qg+WZ3ZPt=FtwQ@mail.gmail.com	2026-01-23 14:17:28 +09:00
Amit Langote	f9a468c664	Fix bogus ctid requirement for dummy-root partitioned targets ExecInitModifyTable() unconditionally required a ctid junk column even when the target was a partitioned table. This led to spurious "could not find junk ctid column" errors when all children were excluded and only the dummy root result relation remained. A partitioned table only appears in the result relations list when all leaf partitions have been pruned, leaving the dummy root as the sole entry. Assert this invariant (nrels == 1) and skip the ctid requirement. Also adjust ExecModifyTable() to tolerate invalid ri_RowIdAttNo for partitioned tables, which is safe since no rows will be processed in this case. Bug: #19099 Reported-by: Alexander Lakhin <exclusion@gmail.com> Author: Amit Langote <amitlangote09@gmail.com> Reviewed-by: Tender Wang <tndrwang@gmail.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/19099-e05dcfa022fe553d%40postgresql.org Backpatch-through: 14	2026-01-23 10:23:30 +09:00
Tom Lane	4b760a181a	Remove faulty Assert in partitioned INSERT...ON CONFLICT DO UPDATE. Commit `f16241bef` mistakenly supposed that INSERT...ON CONFLICT DO UPDATE rejects partitioned target tables. (This may have been accurate when the patch was written, but it was already obsolete when committed.) Hence, there's an assertion that we can't see ItemPointerIndicatesMovedPartitions() in that path, but the assertion is triggerable. Some other places throw error if they see a moved-across-partitions tuple, but there seems no need for that here, because if we just retry then we get the same behavior as in the update-within-partition case, as demonstrated by the new isolation test. So fix by deleting the faulty Assert. (The fact that this is the fix doubtless explains why we've heard no field complaints: the behavior of a non-assert build is fine.) The TM_Deleted case contains a cargo-culted copy of the same Assert, which I also deleted to avoid confusion, although I believe that one is actually not triggerable. Per our code coverage report, neither the TM_Updated nor the TM_Deleted case were reached at all by existing tests, so this patch adds tests for both. Reported-by: Dmitry Koval <d.koval@postgrespro.ru> Author: Joseph Koshakow <koshy44@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/f5fffe4b-11b2-4557-a864-3587ff9b4c36@postgrespro.ru Backpatch-through: 14	2026-01-22 18:35:31 -05:00
Álvaro Herrera	69f98fce5b	Make some use of anonymous unions [reloptions] In the spirit of commit `4b7e6c73b0` and following, which see for more details; it appears to have been quite an uncontroversial C11 feature to use and it makes the code nicer to read. This commit changes the relopt_value struct. Author: Peter Eisentraut <peter@eisentraut.org> Author: Álvaro Herrera <alvherre@kurilemu.de> Note: Yes, this was written twice independently. Discussion: https://postgr.es/m/202601192106.zcdi3yu2gzti@alvherre.pgsql	2026-01-22 17:04:59 +01:00
Peter Eisentraut	c257ba8397	Record range constructor functions in pg_range When a range type is created, several construction functions are also created, two for the range type and three for the multirange type. These have an internal dependency, so they "belong" to the range type. But there was no way to identify those functions when given a range type. An upcoming patch needs access to the two- or possibly the three-argument range constructor function for a given range type. The only way to do that would be with fragile workarounds like matching names and argument types. The correct way to do that kind of thing is to record the links in the system catalogs. This is what this patch does: it records the OIDs of these five constructor functions in the pg_range catalog. (Currently, there is no code that makes use of this.) Reviewed-by: Paul A Jungwirth <pj@illuminatedcomputing.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Discussion: https://www.postgresql.org/message-id/7d63ddfa-c735-4dfe-8c7a-4f1e2a621058%40eisentraut.org	2026-01-22 15:56:29 +01:00
Peter Eisentraut	a5b40d156e	Mark commented out code as unused There were many PG_GETARG_* calls, mostly around gin, gist, spgist code, that were commented out, presumably to indicate that the argument was unused and to indicate that it wasn't forgotten or miscounted. But keeping commented-out code updated with refactorings and style changes is annoying. So this commit changes them to #ifdef NOT_USED blocks, which is a style already in use. That way, at least the indentation and syntax highlighting works correctly, making some of these blocks much easier to read. An alternative would be to just delete that code, but there is some value in making unused arguments explicit, and some of this arguably serves as example code for index AM APIs. Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Reviewed-by: David Geier <geidav.pg@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/328e4371-9a4c-4196-9df9-1f23afc900df%40eisentraut.org	2026-01-22 12:44:07 +01:00
Peter Eisentraut	846fb3c790	Remove incorrect commented out code These calls, if activated, are happening before null checks, so they are not correct. Also, the "in" variable is shadowed later. Remove them to avoid confusion and bad examples. Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Reviewed-by: David Geier <geidav.pg@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/328e4371-9a4c-4196-9df9-1f23afc900df%40eisentraut.org	2026-01-22 12:44:07 +01:00
Peter Eisentraut	f3c96c9dff	Remove redundant AssertVariableIsOfType uses The uses of AssertVariableIsOfType in pg_upgrade are unnecessary because the calls to upgrade_task_add_step() already check the compatibility of the callback functions. These were apparently copied from a previous coding style, but similar removals were already done in commit `30b789eafe`. Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/3d289481-7f76-409f-81c7-81824219cc75%40eisentraut.org	2026-01-22 12:20:32 +01:00
Peter Eisentraut	ae4fe737ae	Detect if flags are needed for C++11 support Just like we only support compiling with C11, we only support compiling extensions with C++11 and up. Some compilers support C++11 but don't enable it by default. This detects if flags are needed to enable C++11 support, in a similar way to how we check the same for C11 support. The C++ test extension module added by commit `476b35d4e3` confirmed that C++11 is effectively required. (This was understood in mailing list discussions but not recorded anywhere in the source code.) Author: Jelte Fennema-Nio <postgres@jeltef.nl> Co-authored-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://www.postgresql.org/message-id/flat/E1viDt1-001d7E-2I%40gemulon.postgresql.org	2026-01-22 09:09:25 +01:00
Michael Paquier	1a1e733b62	doc: List all the possible values of pg_stat_wal_receiver.status The possible values of pg_stat_wal_receiver.status have never been documented. Note that the status "stopped" will never show up in this view, hence there is no need to document it. Issue noticed while discussing a patch that aims to add a new status to WAL receiver. Author: Xuneng Zhou <xunengzhou@gmail.com> Discussion: https://postgr.es/m/CABPTF7X_Jgmyk1FBVNf3tyAOKqU55LLpLMzWkGtEAb_jQWVN=g@mail.gmail.com	2026-01-22 17:03:21 +09:00
Michael Paquier	25be5e8a33	doc: Mention pg_get_partition_constraintdef() All the other SQL functions reconstructing definitions or commands are listed in the documentation, except this one. Oversight in `1848b73d45`. Author: Todd Liebenschutz-Jones <todd.liebenschutz-jones@starlingbank.com> Discussion: https://postgr.es/m/CAGTRfaD6uRQ9iutASDzc_iDoS25sQTLWgXTtR3ta63uwTxq6bA@mail.gmail.com Backpatch-through: 14	2026-01-22 16:35:36 +09:00
Thomas Munro	e5d99b4d9e	jit: Add missing inline pass for LLVM >= 17. With LLVM >= 17, transform passes are provided as a string to LLVMRunPasses. Only two strings were used: "default<O3>" and "default<O0>,mem2reg". With previous LLVM versions, an additional inline pass was added when JIT inlining was enabled without optimization. With LLVM >= 17, the code would go through llvm_inline, prepare the functions for inlining, but the generated bitcode would be the same due to the missing inline pass. This patch restores the previous behavior by adding an inline pass when inlining is enabled but no optimization is done. This fixes an oversight introduced by `76200e5e` when support for LLVM 17 was added. Backpatch-through: 14 Author: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com> Reviewed-by: Thomas Munro <thomas.munro@gmail.com> Reviewed-by: Andreas Karlsson <andreas@proxel.se> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Pierre Ducroquet <p.psql@pinaraf.info> Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com> Discussion: https://postgr.es/m/CAO6_XqrNjJnbn15ctPv7o4yEAT9fWa-dK15RSyun6QNw9YDtKg%40mail.gmail.com	2026-01-22 16:03:47 +13:00
Fujii Masao	26cb14aea1	file_fdw: Support multi-line HEADER option. Commit `bc2f348` introduced multi-line HEADER support for COPY. This commit extends this capability to file_fdw, allowing multiple header lines to be skipped. Because CREATE/ALTER FOREIGN TABLE requires option values to be single-quoted, this commit also updates defGetCopyHeaderOption() to accept integer values specified as strings for HEADER option. Author: Shinya Kato <shinya11.kato@gmail.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: songjinzhou <tsinghualucky912@foxmail.com> Reviewed-by: Japin Li <japinli@hotmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CAOzEurT+iwC47VHPMS+uJ4WSzvOLPsZ2F2_wopm8M7O+CZa3Xw@mail.gmail.com	2026-01-22 10:14:12 +09:00
Fujii Masao	f3da70a805	Improve the error message in COPY with HEADER option. The error message reported for invalid values of the HEADER option in COPY command previously used the term "non-negative integer", which is discouraged by the Error Message Style Guide because it is ambiguous about whether zero is allowed. This commit improves the error message by replacing "non-negative integer" there with "an integer value greater than or equal to zero" to make the accepted values explicit. Author: Shinya Kato <shinya11.kato@gmail.com> Reviewed-by: Alvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Steven Niu <niushiji@gmail.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/CAHGQGwE86PcuPZbP=aurmW7Oo=eycF10gxjErWq4NmY-5TTX4Q@mail.gmail.com	2026-01-22 10:13:07 +09:00
Nathan Bossart	25dc485074	Refactor some SIMD and popcount macros. This commit does the following: * Removes TRY_POPCNT_X86_64. We now assume that the required CPUID intrinsics are available when HAVE_X86_64_POPCNTQ is defined, as we have done since v16 for meson builds when USE_SSE42_CRC32C_WITH_RUNTIME_CHECK is defined and since v17 when USE_AVX512_POPCNT_WITH_RUNTIME_CHECK is defined. * Moves the MSVC check for HAVE_X86_64_POPCNTQ to configure-time. This way, we set it for all relevant platforms in one place. * Moves the #defines for USE_SSE2 and USE_NEON to c.h so that they can be used elsewhere without including simd.h. Consequently, we can remove the POPCNT_AARCH64 macro. * Moves the #includes for pg_bitutils.h to below the system headers in pg_popcount_{aarch64,x86}.c, since we no longer depend on macros from pg_bitutils.h to decide which system headers to use. Reviewed-by: John Naylor <johncnaylorls@gmail.com> Discussion: https://postgr.es/m/aWf_InS1VrbeXAfP%40nathan	2026-01-21 14:21:00 -06:00
Nathan Bossart	8c6653516c	Rename "fast" and "slow" popcount functions. Since we now have several implementations of the popcount functions, let's give them more descriptive names. This commit replaces "slow" with "portable" and "fast" with "sse42". While the POPCNT instruction is technically not part of SSE4.2, this naming scheme is close enough in practice and is arguably easier to understand than using "popcnt" instead. Reviewed-by: John Naylor <johncnaylorls@gmail.com> Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Discussion: https://postgr.es/m/aWf_InS1VrbeXAfP%40nathan	2026-01-21 14:21:00 -06:00
Nathan Bossart	79e232ca01	Move x86-64-specific popcount code to pg_popcount_x86.c. This moves the remaining x86-64-specific popcount implementations in pg_bitutils.c to pg_popcount_x86.c. Reviewed-by: John Naylor <johncnaylorls@gmail.com> Discussion: https://postgr.es/m/aWf_InS1VrbeXAfP%40nathan	2026-01-21 14:21:00 -06:00
Nathan Bossart	fbe327e5b4	Rename pg_popcount_avx512.c to pg_popcount_x86.c. This is preparatory work for a follow-up commit that will move the rest of the x86-64-specific popcount code to this file. Reviewed-by: John Naylor <johncnaylorls@gmail.com> Discussion: https://postgr.es/m/aWf_InS1VrbeXAfP%40nathan	2026-01-21 14:21:00 -06:00
Tom Lane	4576208454	Force standard_conforming_strings to always be ON. Continuing to support this backwards-compatibility feature has nontrivial costs; in particular it is potentially a security hazard if an application somehow gets confused about which setting the server is using. We changed the default to ON fifteen years ago, which seems like enough time for applications to have adapted. Let's remove support for the legacy string syntax. We should not remove the GUC altogether, since client-side code will still test it, pg_dump scripts will attempt to set it to ON, etc. Instead, just prevent it from being set to OFF. There is precedent for this approach (see commit `de66987ad`). This patch does remove the related GUC escape_string_warning, however. That setting does nothing when standard_conforming_strings is on, so it's now useless. We could leave it in place as a do-nothing setting to avoid breaking clients that still set it, if there are any. But it seems likely that any such client is also trying to turn off standard_conforming_strings, so it'll need work anyway. The client-side changes in this patch are pretty minimal, because even though we are dropping the server's support, most of our clients still need to be able to talk to older server versions. We could remove dead client code only once we disclaim compatibility with pre-v19 servers, which is surely years away. One change of note is that pg_dump/pg_dumpall now set standard_conforming_strings = on in their source session, rather than accepting the source server's default. This ensures that literals in view definitions and such will be printed in a way that's acceptable to v19+. In particular, pg_upgrade will work transparently even if the source installation has standard_conforming_strings = off. (However, pg_restore will behave the same as before if given an archive file containing standard_conforming_strings = off. 
Such an archive will not be safely restorable into v19+, but we shouldn't break the ability to extract valid data from it for use with an older server.) Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/3279216.1767072538@sss.pgh.pa.us	2026-01-21 15:08:38 -05:00
Álvaro Herrera	4d6a66f675	Allow Boolean reloptions to have ternary values From the user's point of view these are just Boolean values; from the implementation side we can now distinguish an option that hasn't been set. Reimplement the vacuum_truncate reloption using this type. This could also be used for reloptions vacuum_index_cleanup and buffering, but those additionally need a per-option "alias" for the state where the variable is unset (currently the value "auto"). Author: Nikolay Shaplov <dhyan@nataraj.su> Reviewed-by: Timur Magomedov <t.magomedov@postgrespro.ru> Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Discussion: https://postgr.es/m/3474141.usfYGdeWWP@thinkpad-pgpro	2026-01-21 20:06:01 +01:00
Tom Lane	cec5fe0d1e	Remove useless flag PVC_INCLUDE_CONVERTROWTYPES. This was introduced in the SJE patch (`fc069a3a6`), but it doesn't do anything because pull_var_clause() never tests it. Apparently it snuck in from somebody's private fork. Remove it again, but only in HEAD -- seems best to let it be in v18. Author: Alexander Pyhalov <a.pyhalov@postgrespro.ru> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/70008c19d22e3dd1565ca57f8436c0ba@postgrespro.ru	2026-01-21 13:26:19 -05:00
Álvaro Herrera	1f28982e40	amcheck: Fix snapshot usage in bt_index_parent_check We were using SnapshotAny to do some index checks, but that's wrong and causes spurious errors when used on indexes created by CREATE INDEX CONCURRENTLY. Fix it to use an MVCC snapshot, and add a test for it. Backpatch of `6bd469d26a` to branches 14-16. I previously misidentified the bug's origin: it came in with commit `7f563c09f8` (pg11-era, not `5ae2087202` as claimed previously), so all live branches are affected. Also take the opportunity to fix some comments that we failed to update in the original commits and apply pgperltidy. In branch 14, remove the unnecessary test plan specification (which would have needed to be changed anyway; cf. commit 549ec201d613.) Diagnosed-by: Donghang Lin <donghanglin@gmail.com> Author: Mihail Nikalayeu <mihailnikalayeu@gmail.com> Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru> Backpatch-through: 17 Discussion: https://postgr.es/m/CANtu0ojmVd27fEhfpST7RG2KZvwkX=dMyKUqg0KM87FkOSdz8Q@mail.gmail.com	2026-01-21 18:55:43 +01:00
Peter Eisentraut	e6bb491bf2	Remove more leftovers of AIX support The make variables MKLDEXPORT and POSTGRES_IMP were only used for AIX, so they should have been removed with commit `0b16bb8776`. Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/7a48b624-2236-4e11-9b9d-6a3c658d77a1%40eisentraut.org	2026-01-21 14:51:05 +01:00
Michael Paquier	1572ea96e6	pg_stat_statements: Add more tests for level tracking This commit adds tests to verify the computation of the nesting level for two code paths: the planner hook and the ExecutorFinish() hook. The nesting level is essential to save a correct "toplevel" status for the added PGSS entries. The author has noticed that removing the manipulations of nesting_level in these two code paths did not cause the tests to complain, meaning that we never had coverage for the assumptions taken by the code. Author: Sami Imseih <samimseih@gmail.com> Discussion: https://postgr.es/m/CAA5RZ0uK1PSrgf52bWCtDpzaqbWt04o6ZA7zBm6UQyv7vyvf9w@mail.gmail.com	2026-01-21 18:18:15 +09:00
Peter Eisentraut	b4555cb070	Fix for C++ compatibility After commit `476b35d4e3`, some buildfarm members are complaining about not recognizing _Noreturn when building the new C++ module test_cplusplusext. This is not a C++ feature, but it was gated like #if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L #define pg_noreturn _Noreturn But apparently that was not sufficient. Some platforms define __STDC_VERSION__ even in C++ mode. (In this particular case, it was g++ on Solaris, but apparently this is also done by some other platforms, and it is allowed by the C++ standard.) To fix, add a ... && !defined(__cplusplus) Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://www.postgresql.org/message-id/flat/CAGECzQR21OnnKiZO_1rLWO0-16kg1JBxnVq-wymYW0-_1cUNtg@mail.gmail.com	2026-01-21 08:54:35 +01:00
John Naylor	7892e25924	Update some comments for fasthash - Add advice about hashing multiple inputs with the incremental API - Generalize statements that were specific to C strings to include all variable length inputs, where applicable. - Update comments about the standalone functions and make it easy to find them. Reported-by: Robert Haas <robertmhaas@gmail.com> Reviewed-by: zengman <zengman@halodbtech.com> Discussion: https://postgr.es/m/CANWCAZZgKnf8dNOd_w03n88NqOfmMnMv2=D8_Oy6ADGyiMq+cg@mail.gmail.com Discussion: https://postgr.es/m/CANWCAZa-2mEUY27xBw2TpsybpvVu3Ez4ABrHCBqZpAs_UDTj2Q@mail.gmail.com	2026-01-21 14:11:40 +07:00
Amit Kapila	48efefa6ca	Improve errdetail for logical replication conflict messages. This change enhances the clarity and usefulness of error detail messages generated during logical replication conflicts. The following improvements have been made: 1. Eliminate redundant output: Avoid printing duplicate remote row and replica identity values for the multiple_unique_conflicts conflict type. 2. Improve message structure: Append tuple values directly to the main error message, separated by a colon (:), for better readability. 3. Simplify local row terminology: Remove the word 'existing' when referring to the local row, as this is already implied by context. 4. General code refinements: Apply miscellaneous code cleanups to improve how conflict detail messages are constructed and formatted. Author: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Shveta Malik <shveta.malik@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Zhijie Hou <houzj.fnst@fujitsu.com> Discussion: https://postgr.es/m/CAHut+Psgkwy5-yGRJC15izecySGGysrbCszv_z93ess8XtCDOQ@mail.gmail.com	2026-01-21 04:58:03 +00:00
Michael Paquier	905ef401d5	pg_stat_statements: Clean up REGRESS list in Makefile The "wal" and "entry_timestamp" items were still on the same line, which was not intentional. Thinko in `f9afd56218`. Reported-by: Man Zeng <zengman@halodbtech.com> Discussion: https://postgr.es/m/aW6_Xc8auuu5iAPi@paquier.xyz	2026-01-21 11:29:34 +09:00
Michael Paquier	f9afd56218	pg_stat_statements: Rework test order The test "squashing" was the last item of the REGRESS list, but "cleanup" should be the second to last, dropping the extension. "oldextversions" is the last item. In passing, the REGRESS list is cleaned up to include one item per line, so as diffs are minimized when adding new test files. Noticed while playing with this area of the code. Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Man Zeng <zengman@halodbtech.com> Discussion: https://postgr.es/m/aW6_Xc8auuu5iAPi@paquier.xyz	2026-01-21 07:47:38 +09:00
Peter Eisentraut	476b35d4e3	tests: Add a test C++ extension module While we already test that our headers are valid C++ using headerscheck, it turns out that the macros we define might still expand to invalid C++ code. This adds a minimal test extension that is compiled using C++ to test that it's actually possible to build and run extensions written in C++. Future commits will improve C++ compatibility of some of our macros and add usage of them to this extension to make sure that they don't regress in the future. The test module is for the moment disabled when using MSVC. In particular, the use of designated initializers in PG_MODULE_MAGIC would require C++20, for which we are currently not set up. (GCC and Clang support it as extensions.) It is planned to fix this. Author: Jelte Fennema-Nio <postgres@jeltef.nl> Discussion: https://www.postgresql.org/message-id/flat/CAGECzQR21OnnKiZO_1rLWO0-16kg1JBxnVq-wymYW0-_1cUNtg@mail.gmail.com	2026-01-20 16:42:30 +01:00
Álvaro Herrera	f1cd34f952	Use integer backend type when exec'ing a postmaster child This way we don't have to walk the entire process type array and strcmp() the string with the names therein. The integer value can be directly used as array index instead. Author: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Euler Taveira <euler@eulerto.com> Discussion: https://postgr.es/m/202512090935.k3xrtr44hxkn@alvherre.pgsql	2026-01-20 16:41:04 +01:00
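The difference between the two lookup styles mentioned in `f1cd34f952` can be sketched as follows. This is an illustration only: the `ChildType` enum, the `child_names` table, and both functions are hypothetical stand-ins, not the postmaster's actual process type array.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical process types; the real list in the server is longer. */
typedef enum
{
	CHILD_BACKEND,
	CHILD_CHECKPOINTER,
	CHILD_WALWRITER,
	NUM_CHILD_TYPES
} ChildType;

/* Hypothetical name table, indexed by ChildType. */
static const char *const child_names[NUM_CHILD_TYPES] = {
	"backend",
	"checkpointer",
	"walwriter",
};

/* String-based lookup: walk the whole array and strcmp() each name. */
int
lookup_child_by_name(const char *name)
{
	assert(name != NULL);
	for (int i = 0; i < NUM_CHILD_TYPES; i++)
	{
		if (strcmp(child_names[i], name) == 0)
			return i;
	}
	return -1;					/* unknown name */
}

/* Integer-based lookup: the value is used directly as the array index. */
const char *
lookup_child_by_type(int type)
{
	if (type < 0 || type >= NUM_CHILD_TYPES)
		return NULL;			/* out-of-range type */
	return child_names[type];
}
```

Passing the integer avoids the linear scan and the string comparisons, at the cost of needing a bounds check on the value received.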
Alexander Korotkov	30776ca468	Remove redundant pg_unreachable() after elog(ERROR) from ExecWaitStmt() elog(ERROR) never returns. Compilers don't always understand this. So, sometimes, we have to append pg_unreachable() to keep the compiler quiet about returning from a non-void function without a value. But pg_unreachable() is redundant for ExecWaitStmt(), which is void. Reported-by: Peter Eisentraut <peter@eisentraut.org> Author: Xuneng Zhou <xunengzhou@gmail.com> Discussion: https://postgr.es/m/8d72a2b3-7423-4a15-a981-e130bf60b1a6%40eisentraut.org Discussion: https://postgr.es/m/CABPTF7UcuVD0L-X%3DjZFfeygjPaZWWkVRwtWOaJw2tcXbEN2xsA%40mail.gmail.com	2026-01-20 16:10:25 +02:00
Amit Kapila	1ba3eee89a	Fix concurrent sequence drops during sequence synchronization. A recent BF failure showed that commit `7a485bd641` did not handle the case where a sequence is dropped concurrently during sequence synchronization on the subscriber. Previously, pg_get_sequence_data() would ERROR out if the sequence was dropped concurrently. After `7a485bd641`, it instead returns NULL, which leads to an assertion failure on the subscriber. To handle this change, update sequence synchronization to skip sequences for which pg_get_sequence_data() returns NULL. Author: vignesh C <vignesh21@gmail.com> Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/CALDaNm0FoGdt+1mzua0t-=wYdup5_zmFrvfNf-L=MGBnj9HAcg@mail.gmail.com	2026-01-20 09:40:13 +00:00
Michael Paquier	7ebb64c557	Add routine to free MCVList This addition is in the same spirit as `32e27bd320` for MVNDistinct and MVDependencies, except that we were missing a free routine for the third type of extended statistics, MCVList. I was not sure if we needed an equivalent for MCVList, but after more review of the main patch set for the import of extended statistics, it has become clear that we do. This is introduced as its own change as this routine can be useful on its own. This one is a piece that has not been written by Corey Huinker, I have just noticed it by myself on the way. Author: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CADkLM=dpz3KFnqP-dgJ-zvRvtjsa8UZv8wDAQdqho=qN3kX0Zg@mail.gmail.com	2026-01-20 13:13:47 +09:00
Bruce Momjian	2e937eeb93	doc: revert "xreflabel" used for PL/Python & libpq chapters This reverts `d8aa21b74f`, which was added for the PG 18 release notes, and adjusts the PG 18 release notes for this change. This is necessary since the "xreflabel" affected other references to these chapters. Reported-by: Robert Treat Author: Robert Treat Discussion: https://postgr.es/m/CABV9wwNEZDdp5QtrW5ut0H+MOf6U1PvrqBqmgSTgcixqk+Q73A@mail.gmail.com Backpatch-through: 18	2026-01-19 22:59:10 -05:00
Michael Paquier	5d95219faa	pg_stat_statements: Fix crash in list squashing with Vars When IN/ANY clauses contain both constants and variable expressions, the optimizer transforms them into separate structures: constants become an array expression while variables become individual OR conditions. This transformation was creating an overlap with the token locations, causing pg_stat_statements query normalization to crash because it could not calculate the amount of bytes remaining to write for the normalized query. This commit disables squashing for mixed IN list expressions when constructing a scalar array op, by setting list_start and list_end to -1 when both variables and non-variables are present. Some regression tests are added to PGSS to verify these patterns. Author: Sami Imseih <samimseih@gmail.com> Reviewed-by: Dmitry Dolgov <9erthalion6@gmail.com> Discussion: https://postgr.es/m/CAA5RZ0ts9qiONnHjjHxPxtePs22GBo4d3jZ_s2BQC59AN7XbAA@mail.gmail.com Backpatch-through: 18	2026-01-20 08:11:12 +09:00
Robert Haas	ecd275718b	Don't set the truncation block length greater than RELSEG_SIZE. When faced with a relation containing more than 1 physical segment (i.e. >1GB, with normal settings), the previous code could compute a truncation block length greater than RELSEG_SIZE, which could lead to restore failures of this form: file "%s" has truncation block length %u in excess of segment size %u The fix is simply to clamp the maximum computed truncation_block_length to RELSEG_SIZE. I have also added some comments to clarify the logic. The test case was written by Oleg Tkachenko, but I have rewritten its comments. Reported-by: Oleg Tkachenko <oatkachenko@gmail.com> Diagnosed-by: Oleg Tkachenko <oatkachenko@gmail.com> Co-authored-by: Robert Haas <rhaas@postgresql.org> Co-authored-by: Oleg Tkachenko <oatkachenko@gmail.com> Reviewed-by: Amul Sul <sulamul@gmail.com> Backpatch-through: 17 Discussion: http://postgr.es/m/00FEFC88-EA1D-4271-B38F-EB741733A84A@gmail.com	2026-01-19 12:09:32 -05:00
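The clamp described in `ecd275718b` amounts to the following sketch. The function name and the `EXAMPLE_RELSEG_SIZE` constant are hypothetical; the value 131072 assumes a default build (1GB segments of 8kB blocks) and is not taken from the commit itself.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed blocks per segment for a default build: 1GB / 8kB. */
#define EXAMPLE_RELSEG_SIZE 131072

/*
 * Hypothetical helper: limit a computed truncation block length so that
 * it never exceeds the number of blocks in one segment file.
 */
uint32_t
clamp_truncation_block_length(uint32_t computed_length)
{
	uint32_t	result = computed_length;

	if (result > EXAMPLE_RELSEG_SIZE)
		result = EXAMPLE_RELSEG_SIZE;

	assert(result <= EXAMPLE_RELSEG_SIZE);
	return result;
}
```

A length computed for a multi-segment relation, for example 200000 blocks, is reduced to the segment size, while smaller values pass through unchanged.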
Richard Guo	34740b90bc	Fix unsafe pushdown of quals referencing grouping Vars When checking a subquery's output expressions to see if it's safe to push down an upper-level qual, check_output_expressions() previously treated grouping Vars as opaque Vars. This implicitly assumed they were stable and scalar. However, a grouping Var's underlying expression corresponds to the grouping clause, which may be volatile or set-returning. If an upper-level qual references such an output column, pushing it down into the subquery is unsafe. This can cause strange results due to multiple evaluation of a volatile function, or introduce SRFs into the subquery's WHERE/HAVING quals. This patch teaches check_output_expressions() to look through grouping Vars to their underlying expressions. This ensures that any volatility or set-returning properties in the grouping clause are detected, preventing the unsafe pushdown. We do not need to recursively examine the Vars contained in these underlying expressions. Even if they reference outputs from lower-level subqueries (at any depth), those references are guaranteed not to expand to volatile or set-returning functions, because subqueries containing such functions in their targetlists are never pulled up. Backpatch to v18, where this issue was introduced. Reported-by: Eric Ridge <eebbrr@gmail.com> Diagnosed-by: Tom Lane <tgl@sss.pgh.pa.us> Author: Richard Guo <guofenglinux@gmail.com> Discussion: https://postgr.es/m/7900964C-F99E-481E-BEE5-4338774CEB9F@gmail.com Backpatch-through: 18	2026-01-19 11:13:23 +09:00
Tom Lane	228fe0c3e6	Update time zone data files to tzdata release 2025c. This is pretty pro-forma for our purposes, as the only change is a historical correction for pre-1976 DST laws in Baja California. (Upstream made this release mostly to update their leap-second data, which we don't use.) But with minor releases coming up, we should be up-to-date. Backpatch-through: 14	2026-01-18 14:54:33 -05:00
Michael Paquier	6bca4b50d0	Fix error message related to end TLI in backup manifest The code adding the WAL information included in a backup manifest is cross-checked with the contents of the timeline history file of the end timeline. A check based on the end timeline, when it fails, reported the value of the start timeline in the error message. This error is fixed to show the correct timeline number in the report. This error report would be confusing for users if seen, because it would provide incorrect information, so backpatch all the way down. Oversight in `0d8c9c1210`. Author: Man Zeng <zengman@halodbtech.com> Discussion: https://postgr.es/m/tencent_0F2949C4594556F672CF4658@qq.com Backpatch-through: 14	2026-01-18 17:24:25 +09:00
Michael Paquier	2a6ce34b55	Remove useless asserts in report_namespace_conflict() An assertion is used in this routine to check that a valid namespace OID is given by the caller, but it was repeated: once at the top of the routine and then multiple times in a switch/case. This commit removes the assertions within the switch/case. Thinko in commit `765cbfdc92`. Author: Man Zeng <zengman@halodbtech.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Discussion: https://postgr.es/m/tencent_40F8C1D82E2EE28065009AAA@qq.com	2026-01-18 16:11:46 +09:00
Peter Eisentraut	6831cd9e3b	Fix PL/Python build on MSVC with older Meson Amendment for commit `2bc60f8621`. With older Meson versions, we need to specify the Python include directory directly to cc.check_header instead of relying on the dependency to pass it through. Author: Bryan Green <dbryan.green@gmail.com> Discussion: https://www.postgresql.org/message-id/0de98c41-4145-44c1-aac5-087cf5b3e4a9%40gmail.com	2026-01-16 17:25:05 +01:00
Heikki Linnakangas	71379663fe	Fix crash in test function on removable_cutoff(NULL) The function is part of the injection_points test module and only used in tests. None of the current tests call it with a NULL argument, but it is supposed to work. Backpatch-through: 17	2026-01-16 14:42:22 +02:00
Heikki Linnakangas	1c64d2fcbe	Fix rare test failure in nbtree_half_dead_pages If auto-analyze kicks in at just the right moment, it can hold a snapshot and prevent the VACUUM command in the test from removing the deleted tuples. The test needs the tuples to be removed, otherwise no half-dead page is generated. To fix, introduce a helper procedure to wait for the removable cutoff to advance, like the one used in the syscache-update-pruned test for similar purposes. Thanks to Alexander Lakhin for reproducing and analyzing the test failure, and Tom Lane for the report. Discussion: https://www.postgresql.org/message-id/307198.1767408023@sss.pgh.pa.us	2026-01-16 14:38:20 +02:00
Andres Freund	84705b3727	bufmgr: Avoid spurious compiler warning after `fcb9c977aa` Some compilers, e.g. gcc with -Og or -O1, warn about the wait_event in BufferLockAcquire() possibly being uninitialized. That can't actually happen, as the switch() covers all legal lock mode values, but we still need to silence the warning. We could add a default:, but we'd like to get a warning if we were to get a new lock mode in the future. So just initialize wait_event to 0. Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/934395.1768518154@sss.pgh.pa.us	2026-01-16 06:58:35 -05:00
Michael Paquier	395b73c045	Improve pg_clear_extended_stats() with incorrect relation/stats combination Issue fat-fingered in `d756fa1019`, noticed while doing more review of the main patch set proposed. I have missed the fact that this can be triggered by specifying an extended stats object that does not match with the relation specified and already locked. Like the cases where an object defined in input is missing, the code is changed to issue a WARNING instead of a confusing cache lookup failure. A regression test is added to cover this case. Discussion: https://postgr.es/m/CADkLM=dpz3KFnqP-dgJ-zvRvtjsa8UZv8wDAQdqho=qN3kX0Zg@mail.gmail.com	2026-01-16 15:24:59 +09:00
Amit Langote	889676a0d5	Fix rowmark handling for non-relation RTEs during executor init Commit `cbc127917e` introduced tracking of unpruned relids to skip processing of pruned partitions. PlannedStmt.unprunableRelids is computed as the difference between PlannerGlobal.allRelids and prunableRelids, but allRelids only contains RTE_RELATION entries. This means non-relation RTEs (VALUES, subqueries, CTEs, etc.) are never included in unprunableRelids, and consequently not in es_unpruned_relids at runtime. As a result, rowmarks attached to non-relation RTEs were incorrectly skipped during executor initialization. This affects any DML statement that has rowmarks on such RTEs, including MERGE with a VALUES or subquery source, and UPDATE/DELETE with joins against subqueries or CTEs. When a concurrent update triggers an EPQ recheck, the missing rowmark leads to incorrect results. Fix by restricting the es_unpruned_relids membership check to RTE_RELATION entries only, since partition pruning only applies to actual relations. Rowmarks for other RTE kinds are now always processed. Bug: #19355 Reported-by: Bihua Wang <wangbihua.cn@gmail.com> Diagnosed-by: Dean Rasheed <dean.a.rasheed@gmail.com> Diagnosed-by: Tender Wang <tndrwang@gmail.com> Author: Dean Rasheed <dean.a.rasheed@gmail.com> Discussion: https://postgr.es/m/19355-57d7d52ea4980dc6@postgresql.org Backpatch-through: 18	2026-01-16 14:53:50 +09:00
Amit Langote	9cbb1d21d6	Fix segfault from releasing locks in detached DSM segments If a FATAL error occurs while holding a lock in a DSM segment (such as a dshash lock) and the process is not in a transaction, a segmentation fault can occur during process exit. The problem sequence is: 1. Process acquires a lock in a DSM segment (e.g., via dshash) 2. FATAL error occurs outside transaction context 3. proc_exit() begins, calling before_shmem_exit callbacks 4. dsm_backend_shutdown() detaches all DSM segments 5. Later, on_shmem_exit callbacks run 6. ProcKill() calls LWLockReleaseAll() 7. Segfault: the lock being released is in unmapped memory This only manifests outside transaction contexts because AbortTransaction() calls LWLockReleaseAll() during transaction abort, releasing locks before DSM cleanup. Background workers and other non-transactional code paths are vulnerable. Fix by calling LWLockReleaseAll() unconditionally at the start of shmem_exit(), before any callbacks run. Releasing locks before callbacks prevents the segfault - locks must be released before dsm_backend_shutdown() detaches their memory. This is safe because after an error, held locks are protecting potentially inconsistent data anyway, and callbacks can acquire fresh locks if needed. Also add a comment noting that LWLockReleaseAll() must be safe to call before LWLock initialization (which it is, since num_held_lwlocks will be 0), plus an Assert for the post-condition. This fix aligns with the original design intent from commit `001a573a2`, which noted that backends must clean up shared memory state (including releasing lwlocks) before unmapping dynamic shared memory segments. 
Reported-by: Rahila Syed <rahilasyed90@gmail.com> Author: Rahila Syed <rahilasyed90@gmail.com> Reviewed-by: Amit Langote <amitlangote09@gmail.com> Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/CAH2L28uSvyiosL+kaic9249jRVoQiQF6JOnaCitKFq=xiFzX3g@mail.gmail.com Backpatch-through: 14	2026-01-16 13:02:42 +09:00
Fujii Masao	b98cc4a14e	pg_recvlogical: remove unnecessary OutputFsync() return value checks. Commit `1e2fddfa33` changed OutputFsync() so that it always returns true. However, pg_recvlogical.c still contained checks of its boolean return value, which are now redundant. This commit removes those checks and changes the type of return value of OutputFsync() to void, simplifying the code. Suggested-by: Yilin Zhang <jiezhilove@126.com> Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Mircea Cadariu <cadariu.mircea@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CAHGQGwFeTymZQ7RLvMU6WuDGar8bUQCazg=VOfA-9GeBkg-FzA@mail.gmail.com	2026-01-16 12:37:05 +09:00
Fujii Masao	d89b1d8175	Add test for pg_recvlogical reconnection behavior. This commit adds a test to verify that data already received and flushed by pg_recvlogical is not streamed again even after the connection is lost, reestablished, and logical replication is restarted. Author: Mircea Cadariu <cadariu.mircea@gmail.com> Co-authored-by: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CAHGQGwFeTymZQ7RLvMU6WuDGar8bUQCazg=VOfA-9GeBkg-FzA@mail.gmail.com	2026-01-16 12:36:34 +09:00
Fujii Masao	0b10969db6	Add a new helper function wait_for_file() to Utils.pm. wait_for_file() waits for the contents of a specified file, starting at an optional offset, to match a given regular expression. If no offset is provided, the entire file is checked. The function times out after $PostgreSQL::Test::Utils::timeout_default seconds. It returns the total file length on success. The existing wait_for_log() function contains almost identical logic, but is limited to reading the cluster's log file. This commit also refactors wait_for_log() to call wait_for_file() instead, avoiding code duplication. This helper will be used by upcoming changes. Suggested-by: Mircea Cadariu <cadariu.mircea@gmail.com> Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Mircea Cadariu <cadariu.mircea@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CAHGQGwFeTymZQ7RLvMU6WuDGar8bUQCazg=VOfA-9GeBkg-FzA@mail.gmail.com	2026-01-16 12:35:56 +09:00
Fujii Masao	41cbdab0ab	pg_recvlogical: Prevent flushed data from being re-sent. Previously, when pg_recvlogical lost connection, reconnected, and restarted replication, data that had already been flushed could be streamed again. This happened because the replication start position used when restarting replication was taken from the last standby status message, which could be older than the position of the last flushed data. As a result, some flushed data newer than the replication start position could exist and be re-sent. This commit fixes the issue by ensuring all written data is flushed to disk before restarting replication, and by using the last flushed position as the replication start point. This prevents already flushed data from being re-sent. Additionally, previously when the --no-loop option was used, pg_recvlogical could exit without flushing written data, potentially losing data. To fix this issue, this commit also ensures all data is flushed to disk before exiting due to --no-loop. Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Mircea Cadariu <cadariu.mircea@gmail.com> Reviewed-by: Yilin Zhang <jiezhilove@126.com> Reviewed-by: Dewei Dai <daidewei1970@163.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CAHGQGwFeTymZQ7RLvMU6WuDGar8bUQCazg=VOfA-9GeBkg-FzA@mail.gmail.com	2026-01-16 12:35:26 +09:00
Michael Paquier	a7c63e4860	Fix stability issue with new TAP test of pg_createsubscriber The test introduced in `639352d904` has added a direct pg_ctl command to start a node, a method that is incompatible with the teardown() routine used at the end of the test, as the PID saved in the Cluster object would prevent the node from being shut down. This can ultimately prevent the test from performing its cleanup, failing on timeout. Like pg_ctl's 001_start_stop or ssl_passphrase_callback's 001_testfunc, this commit changes the test so a direct pg_ctl command is used to stop the rogue node. That should hopefully be enough to cool down the buildfarm. Per report from buildfarm member fairywren, which is the only animal that is showing this issue. Author: Hayato Kuroda <kuroda.hayato@fujitsu.com> Discussion: https://postgr.es/m/TY7PR01MB1455452AE9053DD2B77B74FEAF58CA@TY7PR01MB14554.jpnprd01.prod.outlook.com	2026-01-16 12:12:26 +09:00
Michael Paquier	d756fa1019	Add pg_clear_extended_stats() This function is able to clear the data associated with an extended statistics object, so that the object looks newly created. The caller of this function needs the following arguments for the extended stats to clear: - The name of the relation. - The schema name of the relation. - The name of the extended stats object. - The schema name of the extended stats object. - If the stats are inherited or not. The first two parameters are especially important to ensure a consistent lookup and ACL checks for the relation on which the extended stats object that will be cleared is based, relying first on a RangeVar lookup where permissions are checked without locking a relation, critical to prevent denial-of-service attacks when using this kind of function (see also `688dc6299a` for a similar concern). The third to fifth arguments give a way to target the extended stats records to clear. This has been extracted from a larger patch by the same author, for a piece which is again useful on its own. I have rewritten large portions of it. The tests have been extended while discussing this piece, resulting in what this commit includes. The intention behind this feature is to add support for the import of extended statistics across dumps and upgrades, this change building one piece that we will be able to rely on for the rest of the changes. Bump catalog version. Author: Corey Huinker <corey.huinker@gmail.com> Co-authored-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CADkLM=dpz3KFnqP-dgJ-zvRvtjsa8UZv8wDAQdqho=qN3kX0Zg@mail.gmail.com	2026-01-16 08:13:30 +09:00
Andres Freund	d40fd85187	lwlock: Remove support for disowned lwlocks This reverts commit `f8d7f29b3e`, plus parts of subsequent commits fixing a typo in a parameter name. Support for disowned lwlocks was added for the benefit of AIO, to be able to have content locks "owned" by the AIO subsystem. But as of commit `fcb9c977aa`, content locks do not use lwlocks anymore. It does not seem particularly likely that we need this facility outside of the AIO use-case, therefore remove the now unused functions. I did choose to keep the comment added in the aforementioned commit about lock->owner intentionally being left pointing to the last owner. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/cj5mcjdpucvw4a54hehslr3ctukavrbnxltvuzzhqnimvpju5e@cy3g3mnsefwz	2026-01-15 14:57:45 -05:00
Andres Freund	55fbfb738b	lwlock: Remove ForEachLWLockHeldByMe As of commit `fcb9c977aa`, ForEachLWLockHeldByMe(), introduced in `f4ece891fc`, is not used anymore, as content locks are now implemented in bufmgr.c. It doesn't seem that likely that a new user of the functionality will appear all that soon, making removal of the function seem like the most sensible path. It can easily be added back if necessary. Discussion: https://postgr.es/m/lneuyxqxamqoayd2ntau3lqjblzdckw6tjgeu4574ezwh4tzlg%40noioxkquezdw	2026-01-15 14:57:45 -05:00
Andres Freund	335f2231a3	pgindent fix for `8077649907` Per buildfarm member koel. Backpatch-through: 18	2026-01-15 14:57:45 -05:00
Andres Freund	fcb9c977aa	bufmgr: Implement buffer content locks independently of lwlocks Until now buffer content locks were implemented using lwlocks. That has the obvious advantage of not needing a separate efficient implementation of locks. However, the time for a dedicated buffer content lock implementation has come: 1) Hint bits are currently set while holding only a share lock. This leads to having to copy pages while they are being written out if checksums are enabled, which is not cheap. We would like to add AIO writes, however once many buffers can be written out at the same time, it gets a lot more expensive to copy them, particularly because that copy needs to reside in shared buffers (for worker mode to have access to the buffer). In addition, modifying buffers while they are being written out can cause issues with unbuffered/direct-IO, as some filesystems (like btrfs) do not like that, due to filesystem internal checksums getting corrupted. The solution to this is to require a new share-exclusive lock-level to set hint bits and to write out buffers, making those operations mutually exclusive. We could introduce such a lock-level into the generic lwlock implementation, however it does not look like there would be other users, and it does add some overhead into important code paths. 2) For AIO writes we need to be able to race-freely check whether a buffer is undergoing IO and whether an exclusive lock on the page can be acquired. That is rather hard to do efficiently when the buffer state and the lock state are separate atomic variables. This is a major hindrance to allowing writes to be done asynchronously. 3) Buffer locks are by far the most frequently taken locks. Optimizing them specifically for their use case is worth the effort. E.g. by merging content locks into buffer locks we will be able to release a buffer lock and pin in one atomic operation. 
4) There are more complicated optimizations, like long-lived "super pinned & locked" pages, that cannot realistically be implemented with the generic lwlock implementation. Therefore implement content locks inside bufmgr.c. The lockstate is stored as part of BufferDesc.state. The implementation of buffer content locks is fairly similar to lwlocks, with a few important differences: 1) An additional lock-level share-exclusive has been added. This lock-level conflicts with exclusive locks and itself, but not share locks. 2) Error recovery for content locks is implemented as part of the already existing private-refcount tracking mechanism in combination with resowners, instead of a bespoke mechanism as is the case for lwlocks. This means we do not need to add dedicated error-recovery code paths to release all content locks (like done with LWLockReleaseAll() for lwlocks). 3) The lock state is embedded in BufferDesc.state instead of having its own struct. 4) The wakeup logic is a tad more complicated due to needing to support the additional lock-level. This commit unfortunately introduces some code that is very similar to the code in lwlock.c, however the code is not equivalent enough to easily merge it. The future wins that this commit makes possible seem worth the cost. As of this commit nothing uses the new share-exclusive lock mode. It will be used in a future commit. It seemed too complicated to introduce the lock-level in a separate commit. It's worth calling out one wart in this commit: Despite content locks not being lwlocks anymore, they continue to use PGPROC->lw* - that seemed better than duplicating the relevant infrastructure. Another thing worth pointing out is that, after this change, content locks are not reported as LWLock wait events anymore, but as new wait events in the "Buffer" wait event class (see also `6c5c393b74`). The old BufferContent lwlock tranche has been removed. 
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Heikki Linnakangas <heikki.linnakangas@iki.fi> Reviewed-by: Greg Burd <greg@burd.me> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/fvfmkr5kk4nyex56ejgxj3uzi63isfxovp2biecb4bspbjrze7@az2pljabhnff	2026-01-15 14:26:53 -05:00
Andres Freund	dac328c8a6	bufmgr: Change BufferDesc.state to be a 64-bit atomic This is motivated by wanting to merge buffer content locks into BufferDesc.state in a future commit, rather than having a separate lwlock (see commit `c75ebc657f` for more details). As this change is rather mechanical, it seems to make sense to split it out into a separate commit, for easier review. Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/fvfmkr5kk4nyex56ejgxj3uzi63isfxovp2biecb4bspbjrze7@az2pljabhnff	2026-01-15 14:20:41 -05:00
Tom Lane	282b1cde9d	Optimize LISTEN/NOTIFY via shared channel map and direct advancement. This patch reworks LISTEN/NOTIFY to avoid waking backends that have no need to process the notification messages we just sent. The primary change is to create a shared hash table that tracks which processes are listening to which channels (where a "channel" is defined by a database OID and channel name). This allows a notifying process to accurately determine which listeners are interested, replacing the previous weak approximation that listeners in other databases couldn't be interested. Secondly, if a listener is known not to be interested and is currently stopped at the old queue head, we avoid waking it at all and just directly advance its queue pointer past the notifications we inserted. These changes permit very significant improvements (integer multiples) in NOTIFY throughput, as well as a noticeable reduction in latency, when there are many listeners but only a few are interested in any specific message. There is no improvement for the simplest case where every listener reads every message, but any loss seems below the noise level. Author: Joel Jacobson <joel@compiler.org> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/6899c044-4a82-49be-8117-e6f669765f7e@app.fastmail.com	2026-01-15 14:12:15 -05:00
Heikki Linnakangas	23b25586dc	Fix 'unexpected data beyond EOF' on replica restart On restart, a replica can fail with an error like 'unexpected data beyond EOF in block 200 of relation T/D/R'. These are the steps to reproduce it: - A relation has a size of 400 blocks. - Blocks 201 to 400 are empty. - Block 200 has two rows. - Blocks 100 to 199 are empty. - A restartpoint is done - Vacuum truncates the relation to 200 blocks - A FPW deletes a row in block 200 - A checkpoint is done - A FPW deletes the last row in block 200 - Vacuum truncates the relation to 100 blocks - The replica restarts When the replica restarts: - The relation on disk starts at 100 blocks, because all the truncations were applied before restart. - The first truncate to 200 blocks is replayed. It silently fails, but it will still (incorrectly!) update the cache size to 200 blocks - The first FPW on block 200 is applied. XLogReadBufferForRead relies on the cached size and incorrectly assumes that the page already exists in the file, and thus won't extend the relation. - The online checkpoint record is replayed, calling smgrdestroyall which causes the cached size to be discarded - The second FPW on block 200 is applied. This time, the detected size is 100 blocks, an extend is attempted. However, the block 200 is already present in the buffer cache due to the first FPW. This triggers the 'unexpected data beyond EOF'. To fix, update the cached size in SmgrRelation with the current size rather than the requested new size, when the requested new size is greater. Author: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com> Discussion: https://www.postgresql.org/message-id/CAO6_Xqrv-snNJNhbj1KjQmWiWHX3nYGDgAc=vxaZP3qc4g1Siw@mail.gmail.com Backpatch-through: 14	2026-01-15 21:02:49 +02:00
Álvaro Herrera	35e3fae738	Remove #include <math.h> where not needed Liujinyang reported the one in binaryheap.c, I then found and analyzed the rest. For future patches, we require git archaeological analysis before we accept patches of this nature. Co-authored-by: liujinyang <21043272@qq.com> Co-authored-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/tencent_6B302BFCAF6F010E00AB5C2C0ECB7AA3F205@qq.com	2026-01-15 19:09:47 +01:00
Andres Freund	8077649907	aio: io_uring: Fix danger of completion getting reused before being read We called io_uring_cqe_seen(..., cqe) before reading cqe->res. That allows the completion to be reused, which in turn could lead to cqe->res being overwritten. The window for that is very narrow and the likelihood of it happening is very low, as we should never actually utilize all CQEs, but the consequences would be bad. This bug was reported to me privately. Backpatch-through: 18 Discussion: https://postgr.es/m/bwo3e5lj2dgi2wzq4yvbyzu7nmwueczvvzioqsqo6azu6lm5oy@pbx75g2ach3p	2026-01-15 11:09:07 -05:00
Heikki Linnakangas	d9c3c94365	Wake up autovacuum launcher from postmaster when a worker exits When an autovacuum worker exits, the launcher needs to be notified with SIGUSR2, so that it can rebalance and possibly launch a new worker. The launcher must be notified only after the worker has finished ProcKill(), so that the worker slot is available for a new worker. Before this commit, the autovacuum worker was responsible for that, which required a slightly complicated dance to pass the launcher's PID from FreeWorkerInfo() to ProcKill() in a global variable. Simplify that by moving the responsibility of the signaling to the postmaster. The postmaster was already doing it when it failed to fork a worker process, so it seems logical to make it responsible for notifying the launcher on worker exit too. That's also how the notification on background worker exit is done. Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: li carol <carol.li2025@outlook.com> Discussion: https://www.postgresql.org/message-id/a5e27d25-c7e7-45d5-9bac-a17c8f462def@iki.fi	2026-01-15 18:02:25 +02:00
Heikki Linnakangas	102bdaa9be	Add check for invalid offset at multixid truncation If a multixid with zero offset is left behind after a crash, and that multixid later becomes the oldest multixid, truncation might try to look up its offset and read the zero value. In the worst case, we might incorrectly use the zero offset to truncate valid SLRU segments that are still needed. I'm not sure if that can happen in practice, or if there are some other lower-level safeguards or incidental reasons that prevent the caller from passing an unwritten multixid as the oldest multi. But better safe than sorry, so let's add an explicit check for it. In stable branches, we should perhaps do the same check for 'oldestOffset', i.e. the offset of the old oldest multixid (in master, 'oldestOffset' is gone). But if the old oldest multixid has an invalid offset, the damage has been done already, and we would never advance past that point. It's not clear what we should do in that case. The check that this commit adds will prevent such a multixid with invalid offset from becoming the oldest multixid in the first place, which seems enough for now. Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru> Discussion: https://www.postgresql.org/message-id/000301b2-5b81-4938-bdac-90f6eb660843@iki.fi Backpatch-through: 14	2026-01-15 16:48:45 +02:00
Heikki Linnakangas	c4b71e6f60	Remove some unnecessary code from multixact truncation With 64-bit multixact offsets, PerformMembersTruncation() doesn't need the starting offset anymore. The 'oldestOffset' value that TruncateMultiXact() calculates is no longer used for anything. Remove it, and the code to calculate it. 'oldestOffset' was included in the WAL record as 'startTruncMemb', which sounds nice if you e.g. look at the WAL with pg_waldump, but it was also confusing because we didn't actually use the value for determining what to truncate. Replaying the WAL would remove all segments older than 'endTruncMemb', regardless of 'startTruncMemb'. The 'startTruncOff' stored in the WAL record was similarly unnecessary even before 64-bit multixid offsets, it was stored just for the sake of symmetry with 'startTruncMemb'. Remove both from the WAL record, and rename the remaining 'endTruncOff' to 'oldestMulti' and 'endTruncMemb' to 'oldestOffset', for consistency with the variable names used for them in other places. Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru> Discussion: https://www.postgresql.org/message-id/000301b2-5b81-4938-bdac-90f6eb660843@iki.fi	2026-01-15 13:34:50 +02:00
Peter Eisentraut	da265a8717	plpython: Streamline initialization The initialization of PL/Python (the Python interpreter, the global state, the plpy module) was arranged confusingly across different functions with unclear and confusing boundaries. For example, PLy_init_interp() said "Initialize the Python interpreter ..." but it didn't actually do this, and PLy_init_plpy() said "initialize plpy module" but it didn't do that either. After this change, all the global initialization is called directly from _PG_init(), and the plpy module initialization is all called from its registered initialization function PyInit_plpy(). Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com> Reviewed-by: li carol <carol.li2025@outlook.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Discussion: https://www.postgresql.org/message-id/f31333f1-fbb7-4098-b209-bf2d71fbd4f3%40eisentraut.org	2026-01-15 12:11:52 +01:00
Peter Eisentraut	3263a893fb	plpython: Remove duplicate PyModule_Create() This seems to have existed like this since Python 3 support was added (commit `dd4cd55c15`), but it's unclear what this second call is supposed to accomplish. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com> Reviewed-by: li carol <carol.li2025@outlook.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Discussion: https://www.postgresql.org/message-id/f31333f1-fbb7-4098-b209-bf2d71fbd4f3%40eisentraut.org	2026-01-15 10:32:41 +01:00
Peter Eisentraut	34d8111c3a	plpython: Clean up PyModule_AddObject() uses The comments "PyModule_AddObject does not add a refcount to the object, for some odd reason" seem distracting. Arguably, this behavior is expected, not odd. Also, the additional references created by the existing code are apparently not necessary. But we should clean up the reference in the error case, as suggested by the Python documentation. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com> Reviewed-by: li carol <carol.li2025@outlook.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Discussion: https://www.postgresql.org/message-id/f31333f1-fbb7-4098-b209-bf2d71fbd4f3%40eisentraut.org	2026-01-15 10:32:38 +01:00
Peter Eisentraut	8cb95a0645	plpython: Remove commented out code This code has been commented out since the first commit of plpython. It doesn't seem worth keeping. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com> Reviewed-by: li carol <carol.li2025@outlook.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Discussion: https://www.postgresql.org/message-id/f31333f1-fbb7-4098-b209-bf2d71fbd4f3%40eisentraut.org	2026-01-15 10:32:34 +01:00
Michael Paquier	32e27bd320	Introduce routines to validate and free MVNDistinct and MVDependencies These routines are useful to perform some basic validation checks on each object structure, working currently on attribute numbers for non-expression and expression attnums. These checks could be extended in the future. Note that this code is not used yet in the tree, and that these functions will become handy for an upcoming patch for the import of extended statistics data. However, they are worth their own independent change as they are actually useful by themselves, with at least the extension code argument in mind (or perhaps I am just feeling more pedantic today). Extracted from a larger patch by the same author, with many adjustments and fixes by me. Author: Corey Huinker <corey.huinker@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CADkLM=dpz3KFnqP-dgJ-zvRvtjsa8UZv8wDAQdqho=qN3kX0Zg@mail.gmail.com	2026-01-15 09:36:05 +09:00
Jeff Davis	ed425b5a20	Remove redundant assignment in CreateWorkExprContext In CreateWorkExprContext(), maxBlockSize is initialized to ALLOCSET_DEFAULT_MAXSIZE, and it is then immediately reassigned, thus the initialization is redundant. Author: Andreas Karlsson <andreas@proxel.se> Reported-by: Chao Li <lic@highgo.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/83a14f3c-f347-4769-9c01-30030b31f1eb@gmail.com	2026-01-14 12:01:36 -08:00
Andres Freund	556c92a689	lwlock: Improve local variable name In `9a385f6166` I used the variable name new_release_in_progress, but new_wake_in_progress makes more sense given the flag name. Suggested-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/AC5E365D-7AD9-47AE-B2C6-25756712B188@gmail.com	2026-01-14 11:15:38 -05:00
Peter Eisentraut	fa16e7fd84	Revert "Replace pg_restrict by standard restrict" This reverts commit `f0f2c0c1ae`. The original problem that led to the use of pg_restrict was that MSVC couldn't handle plain restrict, and defining it to something else would conflict with its __declspec(restrict) that is used in system header files. In C11 mode, this is no longer a problem, as MSVC handles plain restrict. This led to the commit to replace pg_restrict with restrict. But this did not take C++ into account. Standard C++ does not have restrict, so we defined it as something else (for example, MSVC supports __restrict). But this then again conflicts with __declspec(restrict) in system header files. So we have to revert this attempt. The comments are updated to clarify that the reason for this is now C++ only. Reported-by: Jelte Fennema-Nio <postgres@jeltef.nl> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://www.postgresql.org/message-id/CAGECzQRoD7chJP1-dneSrhxUJv%2BBRcigoGOO4UwGzaShLot2Yw%40mail.gmail.com	2026-01-14 15:12:25 +01:00
Peter Eisentraut	794ba8b6a4	doc: Slightly correct advice on C/C++ linkage The documentation was writing that <literal>extern C</literal> should be used, but it should be <literal>extern "C"</literal>.	2026-01-14 15:05:29 +01:00
Peter Eisentraut	2bc60f8621	Enable Python Limited API for PL/Python on MSVC Previously, the Python Limited API was disabled on MSVC due to build failures caused by Meson not knowing to link against python3.lib instead of python3XX.lib when using the Limited API. This commit works around the Meson limitation by explicitly finding and linking against python3.lib on MSVC, and removes the preprocessor guard that was disabling the Limited API on MSVC in plpython.h. This requires python3.lib to be present in the Python installation, which is included when Python is installed. Author: Bryan Green <dbryan.green@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/ee410de1-1e0b-4770-b125-eeefd4726a24%40eisentraut.org	2026-01-14 10:43:51 +01:00
Álvaro Herrera	4196d6178a	Reword confusing comment to avoid "typo fixes" Author: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Reviewed-by: John Naylor <johncnaylorls@gmail.com> Discussion: https://postgr.es/m/CAApHDvqPmpa53jcTmfU8arFFm7=hB5cFoXX5dcUH=1qV0tRFHA@mail.gmail.com	2026-01-14 10:07:44 +01:00
Michael Paquier	6dcfac9696	Use more consistent *GetDatum() macros for some unsigned numbers This patch switches some code paths to use GetDatum() macros more in line with the data types of the variables they manipulate. This set of changes does not fix a problem, but it is always nice to be more consistent across the board. Author: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Roman Khapov <rkhapov@yandex-team.ru> Reviewed-by: Yuan Li <carol.li2025@outlook.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Man Zeng <zengman@halodbtech.com> Discussion: https://postgr.es/m/CALdSSPidtC7j3MwhkqRj0K2hyp36ztnnjSt6qzGxQtiePR1dzw@mail.gmail.com	2026-01-14 17:07:49 +09:00
Amit Kapila	e385a4e2fd	Prevent unintended dropping of active replication origins. Commit `5b148706c5` exposed functionality that allows multiple processes to use the same replication origin, enabling non-builtin logical replication solutions to implement parallel apply for large transactions. With this functionality, if two backends acquire the same replication origin and one of them resets it first, the acquired_by flag is cleared without acknowledging that another backend is still actively using the origin. This can lead to the origin being unintentionally dropped. If the shared memory for that dropped origin is later reused for a newly created origin, the remaining backend that still holds a pointer to the old memory may inadvertently advance the LSN of a completely different origin, causing unpredictable behavior. Although the underlying issue predates commit `5b148706c5`, it did not surface earlier because the internal parallel apply worker mechanism correctly coordinated origin resets and drops. This commit resolves the problem by introducing a reference counter for replication origins. The reference count increases when a backend sets the origin and decreases when it resets it. Additionally, the backend that first acquires the origin will not release it until all other backends using the origin have released it as well. The patch also prevents dropping a replication origin when acquired_by is zero but the reference counter is nonzero, covering the scenario where the first session exits without properly releasing the origin. Author: Hou Zhijie <houzj.fnst@fujitsu.com> Author: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Shveta Malik <shveta.malik@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/TY4PR01MB169077EE72ABE9E55BAF162D494B5A@TY4PR01MB16907.jpnprd01.prod.outlook.com Discussion: https://postgr.es/m/CAMPB6wfe4zLjJL8jiZV5kjjpwBM2=rTRme0UCL7Ra4L8MTVdOg@mail.gmail.com	2026-01-14 07:15:46 +00:00
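The reference-counting rule described in the commit above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the actual PostgreSQL code: `OriginSlot`, `origin_acquire`, `origin_release`, and `origin_can_drop` are hypothetical names, and locking and shared memory handling are omitted entirely.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a shared replication origin slot. */
typedef struct OriginSlot
{
    int acquired_by;    /* PID of the first acquiring backend, 0 if none */
    int refcount;       /* number of backends currently using the origin */
} OriginSlot;

/* A backend starts using the origin: count it, remember the first user. */
static void
origin_acquire(OriginSlot *slot, int pid)
{
    if (slot->refcount == 0)
        slot->acquired_by = pid;
    slot->refcount++;
}

/* A backend stops using the origin: drop its reference only. */
static void
origin_release(OriginSlot *slot, int pid)
{
    assert(slot->refcount > 0);
    slot->refcount--;
    if (slot->acquired_by == pid)
        slot->acquired_by = 0;
}

/*
 * Dropping is allowed only when nobody holds the origin.  Checking the
 * counter as well as acquired_by covers the case where the first
 * acquirer has gone away while other backends still use the origin.
 */
static bool
origin_can_drop(const OriginSlot *slot)
{
    return slot->acquired_by == 0 && slot->refcount == 0;
}
```

With two acquirers, `origin_can_drop()` keeps returning false after the first one releases and becomes true only once the second has released too. In this simplified model the first release clears `acquired_by` immediately, which corresponds to the "first session exits" scenario from the commit message rather than to the normal coordinated release.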
Michael Paquier	4fe1ea7777	pg_waldump: Relax LSN comparison check in TAP test The test 002_save_fullpage.pl, checking --save-fullpage fails with wal_consistency_checking enabled, due to the fact that the block saved in the file has the same LSN as the LSN used in the file name. The test required that the block LSN is strictly lower than file LSN. This commit relaxes the check a bit, by allowing the LSNs to match. While on it, the test name is reworded to include some information about the file and block LSNs, which is useful for debugging. Author: Andrey Borodin <x4mmm@yandex-team.ru> Discussion: https://postgr.es/m/4226AED7-E38F-419B-AAED-9BC853FB55DE@yandex-team.ru Backpatch-through: 16	2026-01-14 16:02:30 +09:00
Andres Freund	ff219c1987	bufmgr: Make definitions related to buffer descriptor easier to modify This is in preparation to widening the buffer state to 64 bits, which in turn is preparation for implementing content locks in bufmgr. This commit aims to make the subsequent commits a bit easier to review, by separating out reformatting etc from the actual changes. Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Discussion: https://postgr.es/m/4csodkvvfbfloxxjlkgsnl2lgfv2mtzdl7phqzd4jxjadxm4o5@usw7feyb5bzf	2026-01-13 19:38:29 -05:00
Andres Freund	9a385f6166	lwlock: Invert meaning of LW_FLAG_RELEASE_OK Previously, a flag was set to indicate that a lock release should wake up waiters. Since waking waiters is the default behavior in the majority of cases, this logic has been inverted. The new LW_FLAG_WAKE_IN_PROGRESS flag is now set iff wakeups are explicitly inhibited. The motivation for this change is that in an upcoming commit, content locks will be implemented independently of lwlocks, with the lock state stored as part of BufferDesc.state. As all of a buffer's flags are cleared when the buffer is invalidated, without this change we would have to re-add the RELEASE_OK flag after clearing the flags; otherwise, the next lock release would not wake waiters. It seems good to keep the implementation of lwlocks and buffer content locks as similar as reasonably possible. Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Discussion: https://postgr.es/m/4csodkvvfbfloxxjlkgsnl2lgfv2mtzdl7phqzd4jxjadxm4o5@usw7feyb5bzf	2026-01-13 19:38:29 -05:00
Michael Paquier	e217dc7484	Fix query jumbling with GROUP BY clauses RangeTblEntry.groupexprs was marked with the node attribute query_jumble_ignore, causing a list of GROUP BY expressions to be ignored during the query jumbling. For example, these two queries could be grouped together within the same query ID: SELECT count(*) FROM t GROUP BY a; SELECT count(*) FROM t GROUP BY b; However, as such queries use different GROUP BY clauses, they should be split across multiple entries. This fixes an oversight in `247dea89f7`, that has introduced an RTE for GROUP BY clauses. Query IDs are documented as being stable across minor releases, but as this is a regression new to v18 and that we are still early in its support cycle, a backpatch is exceptionally done as this has broken a behavior that exists since query jumbling is supported in core, since its introduction in pg_stat_statements. The tests of pg_stat_statements are expanded to cover this area, with patterns involving GROUP BY and GROUPING clauses. Author: Jian He <jian.universality@gmail.com> Discussion: https://postgr.es/m/CACJufxEy2W+tCqC7XuJ94r3ivWsM=onKJp94kRFx3hoARjBeFQ@mail.gmail.com Backpatch-through: 18	2026-01-14 08:44:12 +09:00
Fujii Masao	ad381d0d92	doc: Document DEFAULT option in file_fdw. Commit `9f8377f7a` introduced the DEFAULT option for file_fdw but did not update the documentation. This commit adds the missing description of the DEFAULT option to the file_fdw documentation. Backpatch to v16, where the DEFAULT option was introduced. Author: Shinya Kato <shinya11.kato@gmail.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/CAOzEurT_PE7QEh5xAdb7Cja84Rur5qPv2Fzt3Tuqi=NU0WJsbg@mail.gmail.com Backpatch-through: 16	2026-01-13 22:54:45 +09:00
Álvaro Herrera	8a47d9ee7f	Fix test_misc/010_index_concurrently_upsert for cache-clobbering builds The test script added by commit `e1c971945d` failed to handle the case of cache-clobbering builds (CLOBBER_CACHE_ALWAYS and CATCACHE_FORCE_RELEASE) properly -- it would only exit a loop on timeout, which is slow, and unfortunate because I (Álvaro) increased the timeout for that loop to the complete default TAP test timeout, causing the buildfarm to report the whole test run as a timeout failure. We can be much quicker: exit the loop as soon as the backend is seen as waiting on the injection point. In this commit we still reduce the timeout (of that loop and a nearby one just to be safe) to half of the default. I (Álvaro) had also changed Mihail's "sleep(1)" to "sleep(0.1)", which apparently turns a 1s sleep into a 0s sleep, because Perl -- probably making this a busy loop. Use Time::HiRes::usleep instead, like we do in other tests. Author: Mihail Nikalayeu <mihailnikalayeu@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/CADzfLwWOVyJygX6BFuyuhTKkJ7uw2e8OcVCDnf6iqnOFhMPE%2BA%40mail.gmail.com	2026-01-13 10:03:33 +01:00
John Naylor	94a24b4ee5	Improve some comment wording and grammar in extension.c Noted while looking at reports of grammatical errors. Reported-by: albert tan <alterttan1223@gmail.com> Reported-by: Yuan Li(carol) <carol.li2025@outlook.com> Discussion: https://postgr.es/m/CAEzortnJB7aue6miGT_xU2KLb3okoKgkBe4EzJ6yJ%3DY8LMB7gw%40mail.gmail.com	2026-01-13 12:33:08 +07:00
Jeff Davis	a00a25b6ce	Fix error message typo. Reported-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CAEoWx2mMmm9fTZYgE-r_T-KPTFR1rKO029QV-S-6n=7US_9EMA@mail.gmail.com	2026-01-12 19:07:00 -08:00
Andres Freund	0b96e734c5	heapam: Add batch mode mvcc check and use it in page mode There are two reasons for doing so: 1) It is generally faster to perform checks in a batched fashion and making sequential scans faster is nice. 2) We would like to stop setting hint bits while pages are being written out. The necessary locking becomes visible for page mode scans, if done for every tuple. With batching, the overhead can be amortized to only happen once per page. There are substantial further optimization opportunities along these lines: - Right now HeapTupleSatisfiesMVCCBatch() simply uses the single-tuple HeapTupleSatisfiesMVCC(), relying on the compiler to inline it. We could instead write an explicitly optimized version that avoids repeated xid tests. - Introduce batched version of the serializability test - Introduce batched version of HeapTupleSatisfiesVacuum Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Discussion: https://postgr.es/m/6rgb2nvhyvnszz4ul3wfzlf5rheb2kkwrglthnna7qhe24onwr@vw27225tkyar	2026-01-12 13:22:04 -05:00
Andres Freund	852558b9ec	heapam: Use exclusive lock on old page in CLUSTER To be able to guarantee that we can set the hint bit, acquire an exclusive lock on the old buffer. This is required as a future commit will only allow hint bits to be set with a new lock level, which is acquired as-needed in a non-blocking fashion. We need the hint bits, set in heapam_relation_copy_for_cluster() -> HeapTupleSatisfiesVacuum(), to be set, as otherwise reform_and_rewrite_tuple() -> rewrite_heap_tuple() will get confused. Specifically, rewrite_heap_tuple() checks for HEAP_XMAX_INVALID in the old tuple to determine whether to check the old-to-new mapping hash table. It'd be better if we somehow could avoid setting hint bits on the old page. A common reason to use VACUUM FULL is very bloated tables - rewriting most of the old table during VACUUM FULL doesn't exactly help. Reviewed-by: Heikki Linnakangas <heikki.linnakangas@iki.fi> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Discussion: https://postgr.es/m/4wggb7purufpto6x35fd2kwhasehnzfdy3zdcu47qryubs2hdz@fa5kannykekr	2026-01-12 12:40:13 -05:00
Andres Freund	45f658dacb	freespace: Don't modify page without any lock Before this commit fsm_vacuum_page() modified the page without any lock on the page. Historically that was kind of ok, as we didn't rely on the freespace to really stay consistent and we did not have checksums. But these days pages are checksummed and there are ways for FSM pages to be included in WAL records, even if the FSM itself is still not WAL logged. If a FSM page ever were modified while a WAL record referenced that page, we'd be in trouble, as the WAL CRC could end up getting corrupted. The reason to address this right now is a series of patches with the goal to only allow modifications of pages with an appropriate lock level. Obviously not having any lock is not appropriate :) Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Discussion: https://postgr.es/m/4wggb7purufpto6x35fd2kwhasehnzfdy3zdcu47qryubs2hdz@fa5kannykekr Discussion: https://postgr.es/m/e6a8f734-2198-4958-a028-aba863d4a204@iki.fi	2026-01-12 12:40:00 -05:00
Álvaro Herrera	225d1df1d2	Stop including {brin,gin}_tuple.h in tuplesort.h Doing this meant that those two headers, which are supposed to be internal to their corresponding index AMs, were being included pretty much universally, because tuplesort.h is included by execnodes.h which is very widely used. Stop that, and fix fallout. We also change indexing.h to no longer include execnodes.h (tuptable.h is sufficient), and relscan.h to no longer include buf.h (pointless since `c2fe139c20`). Author: Mario González <gonzalemario@gmail.com> Discussion: https://postgr.es/m/CAFsReFUcBFup=Ohv_xd7SNQ=e73TXi8YNEkTsFEE2BW7jS1noQ@mail.gmail.com	2026-01-12 18:09:49 +01:00
Jeff Davis	b96a9fd76f	fuzzystrmatch: use pg_ascii_toupper(). fuzzystrmatch is designed for ASCII, so no need to rely on the global LC_CTYPE setting. Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/dd0cdd1f-e786-426e-b336-1ffa9b2f1fc6%40eisentraut.org	2026-01-12 08:54:04 -08:00
Álvaro Herrera	2defd00062	Move instrumentation-related structs to instrument_node.h Some structs and enums related to parallel query instrumentation had organically grown scattered across various files, and were causing header pollution especially through execnodes.h. Create a single file where they can live together. This only moves the structs to the new file; cleaning up the pollution by removing no-longer-necessary cross-header inclusion will be done in future commits. Co-authored-by: Álvaro Herrera <alvherre@kurilemu.de> Co-authored-by: Mario González <gonzalemario@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/202510051642.wwmn4mj77wch@alvherre.pgsql Discussion: https://postgr.es/m/CAFsReFUr4KrQ60z+ck9cRM4WuUw1TCghN7EFwvV0KvuncTRc2w@mail.gmail.com	2026-01-12 16:59:28 +01:00
Peter Eisentraut	c3c240537f	Avoid casting void * function arguments In many cases, the cast would silently drop a const qualifier. To fix, drop the unnecessary cast and let the compiler check the types and qualifiers. Add const to read-only local variables, preserving the const qualifiers from the function signatures. Co-authored-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Co-authored-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/aUQHy/MmWq7c97wK%40ip-10-97-1-34.eu-west-3.compute.internal	2026-01-12 16:12:56 +01:00
Peter Eisentraut	707f905399	Add const to read only TableInfo pointers in pg_dump Functions that dump table data receive their parameters through const void * but were casting away const. Add const qualifiers to functions that only read the table information. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://postgr.es/m/aUQHy/MmWq7c97wK%40ip-10-97-1-34.eu-west-3.compute.internal	2026-01-12 14:26:26 +01:00
Peter Eisentraut	e39ece0343	Make dmetaphone collation-aware The dmetaphone() SQL function internally upper-cases the argument string. It did this using the toupper() function. That way, it has a dependency on the global LC_CTYPE locale setting, which we want to get rid of. The "double metaphone" algorithm specifically supports the "C with cedilla" letter, so just using ASCII case conversion wouldn't work. To fix that, use the passed-in collation and use the str_toupper() function, which has full awareness of collations and collation providers. Note that this does not change the fact that this function only works correctly with single-byte encodings. The change to str_toupper() makes the case conversion multibyte-enabled, but the rest of the function is still not ready. Reviewed-by: Jeff Davis <pgsql@j-davis.com> Discussion: https://www.postgresql.org/message-id/108e07a2-0632-4f00-984d-fe0e0d0ec726%40eisentraut.org	2026-01-12 08:35:48 +01:00
Nathan Bossart	5d1f5079ab	pg_dump: Fix memory leak in dumpSequenceData(). Oversight in commit `7a485bd641`. Per Coverity. Backpatch-through: 18	2026-01-11 13:52:50 -06:00
Michael Paquier	540c39cc56	doc: Improve description of pg_restore --jobs The parameter name used for the option value was named "number-of-jobs", which was inconsistent with what all the other tools with an option called --jobs use. This commit updates the parameter name to "njobs". Author: Tatsuro Yamada <yamatattsu@gmail.com> Discussion: https://postgr.es/m/CAOKkKFvHqA6Tny0RKkezWVfVV91nPJyj4OGtMi3C1RznDVXqrg@mail.gmail.com	2026-01-11 15:24:02 +09:00
Michael Paquier	1c0f6c3879	Fix some typos across the board Found while browsing the code.	2026-01-11 08:16:46 +09:00
Andres Freund	e5a5e0a907	instrumentation: Keep time fields as instrtime, convert in callers Previously the instrumentation logic always converted to seconds, only for many of the callers to do unnecessary division to get to milliseconds. As an upcoming refactoring will split the Instrumentation struct, utilize instrtime always to keep things simpler. It's also a bit faster to not have to first convert to a double in functions like InstrEndLoop(), InstrAggNode(). Author: Lukas Fittl <lukas@fittl.com> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/CAP53PkzZ3UotnRrrnXWAv=F4avRq9MQ8zU+bxoN9tpovEu6fGQ@mail.gmail.com	2026-01-09 13:38:00 -05:00
Heikki Linnakangas	bba81f9d3d	Inline ginCompareAttEntries for speed It is called in tight loops during GIN index build. Author: David Geier <geidav.pg@gmail.com> Discussion: https://www.postgresql.org/message-id/5d366878-2007-4d31-861e-19294b7a583b@gmail.com	2026-01-09 20:31:43 +02:00
Jacob Champion	e2aae8d68f	doc: Improve description of publish_via_partition_root Reword publish_via_partition_root's opening paragraph. Describe its behavior more clearly, and directly state that its default is false. Per complaint by Peter Smith; final text of the patch made in collaboration with Chao Li. Author: Chao Li <li.evan.chao@gmail.com> Author: Peter Smith <peter.b.smith@fujitsu.com> Reported-by: Peter Smith <peter.b.smith@fujitsu.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/CAHut%2BPu7SpK%2BctOYoqYR3V4w5LKc9sCs6c_qotk9uTQJQ4zp6g%40mail.gmail.com Backpatch-through: 14	2026-01-09 10:11:37 -08:00
Tom Lane	7a1d422e39	Improve "constraint must include all partitioning columns" message. This formerly said "unique constraint must ...", which was accurate enough when it only applied to UNIQUE and PRIMARY KEY constraints. However, now we use it for exclusion constraints too, and in that case it's a tad confusing. Do what we already did in the errdetail message: print the constraint_type, so that it looks like "UNIQUE constraint ...", "EXCLUDE constraint ...", etc. Author: jian he <jian.universality@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CACJufxH6VhAf65Vghg4T2q315gY=Rt4BUfMyunkfRj0n2S9n-g@mail.gmail.com	2026-01-09 12:59:35 -05:00
Nathan Bossart	7a485bd641	pg_dump: Fix gathering of sequence information. Since commit `bd15b7db48`, pg_dump uses pg_get_sequence_data() (née pg_sequence_read_tuple()) to gather all sequence data in a single query as opposed to a query per sequence. Two related bugs have been identified: * If the user lacks appropriate privileges on the sequence, pg_dump generates a setval() command with garbage values instead of failing as expected. * pg_dump can fail due to a concurrently dropped sequence, even if the dropped sequence's data isn't part of the dump. This commit fixes the above issues by 1) teaching pg_get_sequence_data() to return nulls instead of erroring for a missing sequence and 2) teaching pg_dump to fail if it tries to dump the data of a sequence for which pg_get_sequence_data() returned nulls. Note that pg_dump may still fail due to a concurrently dropped sequence, but it should now only do so when the sequence data is part of the dump. This matches the behavior before commit `bd15b7db48`. Bug: #19365 Reported-by: Paveł Tyślacki <pavel.tyslacki@gmail.com> Suggested-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/19365-6245240d8b926327%40postgresql.org Discussion: https://postgr.es/m/2885944.1767029161%40sss.pgh.pa.us Backpatch-through: 18	2026-01-09 10:12:54 -06:00
Fujii Masao	24cb3a08a4	Use IsA() macro in define.c, for sake of consistency. Commit `63d1b1cf7f` replaced a direct nodeTag() comparison with the IsA() macro in copy.c, but a similar direct comparison remained in define.c. This commit replaces that comparison with IsA() for consistency. Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Shinya Kato <shinya11.kato@gmail.com> Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://postgr.es/m/CAHGQGwGjWGS89_sTx=sbPm0FQemyQQrfTKm=taUhAJFV5k-9cw@mail.gmail.com	2026-01-09 20:24:00 +09:00
Peter Eisentraut	69d76fb2ab	Decouple C++ support in Meson's PGXS from LLVM enablement This is important for Postgres extensions that are written in C++, such as pg_duckdb, which uses PGXS as the build system currently. In the autotools build, C++ is not coupled to LLVM. If the autotools build is configured without --with-llvm, the C++ compiler and the various flags get persisted into the Makefile.global. Author: Tristan Partin <tristan@partin.io> Author: Jelte Fennema-Nio <postgres@jeltef.nl> Discussion: https://www.postgresql.org/message-id/flat/D98JHQF7H2A8.VSE3I4CJBTAB%40partin.io	2026-01-09 10:25:02 +01:00
Peter Eisentraut	831bbb9bf5	ci: Configure g++ with 32-bit for 32-bit build In future commits, we'll start to make improvements to C++ infrastructure. This requires that for our 32-bit builds we also build C++ code for the 32-bit architecture. Author: Jelte Fennema-Nio <postgres@jeltef.nl> Discussion: https://www.postgresql.org/message-id/flat/D98JHQF7H2A8.VSE3I4CJBTAB%40partin.io	2026-01-09 08:58:50 +01:00
Peter Eisentraut	1f08d687c3	meson: Rename cpp variable to cxx Since CPP is also used to mean C PreProcessor in our meson.build files, it's confusing to use it to also mean C PlusPlus. This uses the non-ambiguous cxx wherever possible. Author: Jelte Fennema-Nio <postgres@jeltef.nl> Discussion: https://www.postgresql.org/message-id/flat/D98JHQF7H2A8.VSE3I4CJBTAB%40partin.io	2026-01-09 08:58:23 +01:00
David Rowley	349107537d	Fix possible incorrect column reference in ERROR message When creating a partition for a RANGE partitioned table, the reporting of errors relating to converting the specified range values into constant values for the partition key's type could display the name of a previous partition key column when an earlier range was specified as MINVALUE or MAXVALUE. This was caused by the code not correctly incrementing the index that tracks which partition key the foreach loop was working on after processing MINVALUE/MAXVALUE ranges. Fix by using foreach_current_index() to ensure the index variable is always set to the List element being worked on. Author: myzhen <zhenmingyang@yeah.net> Reviewed-by: zhibin wang <killerwzb@gmail.com> Discussion: https://postgr.es/m/273cab52.978.19b96fc75e7.Coremail.zhenmingyang@yeah.net Backpatch-through: 14	2026-01-09 11:01:36 +13:00
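The class of bug fixed in the commit above (a hand-maintained index that an early `continue` fails to advance) can be illustrated with a small standalone sketch. It does not use PostgreSQL's List API or `foreach_current_index()`; `BoundKind`, `reported_index_buggy`, and `reported_index_fixed` are hypothetical names chosen for illustration, and the "fixed" variant simply derives the index from the loop position, which is the same idea the commit describes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical kinds of range bound, one per partition key column. */
typedef enum { BOUND_VALUE, BOUND_MINVALUE, BOUND_MAXVALUE } BoundKind;

/*
 * Buggy pattern: a separately tracked index is advanced only at the end
 * of the loop body, so "continue" for MINVALUE/MAXVALUE leaves it stale.
 * Returns the column index an error at position "failing" would report.
 */
static size_t
reported_index_buggy(const BoundKind *bounds, size_t n, size_t failing)
{
    size_t i = 0;

    assert(failing < n);
    for (size_t pos = 0; pos < n; pos++)
    {
        if (bounds[pos] != BOUND_VALUE)
            continue;           /* bug: i is not incremented here */
        if (pos == failing)
            return i;
        i++;
    }
    return n;                   /* no error reported */
}

/* Fixed pattern: the index always comes from the element being visited. */
static size_t
reported_index_fixed(const BoundKind *bounds, size_t n, size_t failing)
{
    assert(failing < n);
    for (size_t pos = 0; pos < n; pos++)
    {
        size_t i = pos;         /* always matches the current element */

        if (bounds[pos] != BOUND_VALUE)
            continue;
        if (pos == failing)
            return i;
    }
    return n;                   /* no error reported */
}
```

For bounds `{BOUND_MINVALUE, BOUND_VALUE}` with a conversion failure on the second element, the buggy variant reports column 0 (the previous key column) while the fixed variant reports column 1.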
Tom Lane	b8ccd29152	Remove now-useless btree_gist--1.2.sql script. In the wake of the previous commit, this script will fail if executed via CREATE EXTENSION, so it's useless. Remove it, but keep the delta scripts, allowing old (even very old) versions of the btree_gist SQL objects to be upgraded to 1.9 via ALTER EXTENSION UPDATE after a pg_upgrade. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com> Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Discussion: https://postgr.es/m/2483812.1754072263@sss.pgh.pa.us	2026-01-08 14:09:58 -05:00
Tom Lane	b352d3d80b	Mark GiST inet_ops as opcdefault, and deal with ensuing fallout. This patch completes the transition to making inet_ops be default for inet/cidr columns, rather than btree_gist's opclasses. Once we do that, though, pg_upgrade has a big problem. A dump from an older version will see btree_gist's opclasses as being default, so it will not mention the opclass explicitly in CREATE INDEX commands, which would cause the restore to create the indexes using inet_ops. Since that's not compatible with what's actually in the files, havoc would ensue. This isn't readily fixable, because the CREATE INDEX command strings are built by the older server's pg_get_indexdef() function; pg_dump hasn't nearly enough knowledge to modify those strings successfully. Even if we cared to put in the work to make that happen in pg_dump, it would be counterproductive because the end goal here is to get people off of these opclasses. Allowing such indexes to persist through pg_upgrade wouldn't advance that goal. Therefore, this patch just adds code to pg_upgrade to detect indexes that would be problematic and refuse to upgrade. There's another issue too: even without any indexes to worry about, pg_dump in binary-upgrade mode will reproduce the "CREATE OPERATOR CLASS ... DEFAULT" commands for btree_gist's opclasses, and those will fail because now we have a built-in opclass that provides a conflicting default. We could ask users to drop the btree_gist extension altogether before upgrading, but that would carry very severe penalties. It would affect perfectly-valid indexes for other data types, and it would drop operators that might be relied on in views or other database objects. Instead, put a hack in DefineOpClass to ignore the DEFAULT clauses for these opclasses when in binary-upgrade mode. This will result in installing a version of btree_gist that isn't quite the version it claims to be, but that can be fixed by issuing ALTER EXTENSION UPDATE afterwards. 
Since we don't apply that hack when not in binary-upgrade mode, it is now impossible to install any version of btree_gist less than 1.9 via CREATE EXTENSION. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com> Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Discussion: https://postgr.es/m/2483812.1754072263@sss.pgh.pa.us	2026-01-08 14:03:56 -05:00
Tom Lane	b3b0b45717	Create btree_gist v1.9, in which inet/cidr opclasses aren't default. btree_gist's gist_inet_ops and gist_cidr_ops opclasses are fundamentally broken: they rely on an approximate representation of the inet values and hence sometimes miss rows they should return. We want to eventually get rid of them altogether, but as the first step on that journey, we should mark them not-opcdefault. To do that, roll up the preceding deltas since 1.2 into a new base script btree_gist--1.9.sql. This will allow installing 1.9 without going through a transient situation where gist_inet_ops and gist_cidr_ops are marked as opcdefault; trying to create them that way will fail if there's already a matching default opclass in the core system. Additionally provide btree_gist--1.8--1.9.sql, so that a database that's been pg_upgraded from an older version can be migrated to 1.9. I noted along the way that commit `57e3c5160` had missed marking the gist_bool_ops support functions as PARALLEL SAFE. While that probably has little harmful effect (since AFAIK we don't check that when calling index support functions), this seems like a good time to make things consistent. Readers will also note that I removed the former habit of installing some opclass operators/functions with ALTER OPERATOR FAMILY, instead just rolling them all into the CREATE OPERATOR CLASS steps. The comment in btree_gist--1.2.sql that it's necessary to use ALTER for pg_upgrade reproducibility has been obsolete since we invented the amadjustmembers infrastructure. Nowadays, gistadjustmembers will force all operators and non-required support functions to have "soft" opfamily dependencies, regardless of whether they are installed by CREATE or ALTER. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com> Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Discussion: https://postgr.es/m/2483812.1754072263@sss.pgh.pa.us	2026-01-08 13:56:08 -05:00
Heikki Linnakangas	63d1b1cf7f	Use IsA macro, for sake of consistency Reported-by: Shinya Kato <shinya11.kato@gmail.com> Discussion: https://www.postgresql.org/message-id/CAOzEurS=PzRzGba3mpNXgEhbnQFA0dxXaU0ujCJ0aa9yMSH6Pw@mail.gmail.com	2026-01-08 18:58:28 +02:00
Heikki Linnakangas	ad853bb877	Fix misc typos, mostly in comments The only user-visible change is the fix in the "malformed pg_dependencies" error detail. That one is new in commit `e1405aa5e3`, so no backpatching required.	2026-01-08 18:10:08 +02:00
Heikki Linnakangas	d3304921db	Improve comments around _bt_checkkeys Discussion: https://www.postgresql.org/message-id/f5388839-99da-465a-8744-23cdfa8ce4db@iki.fi	2026-01-08 18:10:05 +02:00
Amit Kapila	31ddbb38ee	Fix typos in the code. Author: "Dewei Dai" <daidewei1970@163.com> Author: zengman <zengman@halodbtech.com> Author: Zhiyuan Su <suzhiyuan_pg@126.com> Discussion: https://postgr.es/m/2026010719201902382410@163.com Discussion: https://postgr.es/m/tencent_4DC563C83443A4B1082D2BFF@qq.com Discussion: https://postgr.es/m/44656d72.2a63.19b9b92b0a3.Coremail.suzhiyuan_pg@126.com	2026-01-08 09:43:50 +00:00
Peter Eisentraut	6ade3cd459	Remove use of rindex() function rindex() has been removed from POSIX 2008. Replace the one remaining use with the equivalent and more standard strrchr(). Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/98ce805c-6103-421b-adc3-fcf8f3dddbe3%40eisentraut.org	2026-01-08 08:53:49 +01:00
Peter Eisentraut	5e7abdac99	strnlen() is now required Remove all configure checks and workarounds for strnlen() missing. It is required by POSIX 2008. Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/98ce805c-6103-421b-adc3-fcf8f3dddbe3%40eisentraut.org	2026-01-08 08:51:20 +01:00
Michael Paquier	639352d904	pg_createsubscriber: Improve handling of automated recovery configuration When repurposing a standby to a logical replica, pg_createsubscriber uses for the new replica a set of configuration parameters saved into postgresql.auto.conf, to force recovery patterns when the physical replica is promoted. While not wrong in practice, this approach can cause issues when forcing again recovery on a logical replica or its base backup as the recovery parameters are not reset on the target server once pg_createsubscriber is done with the node. This commit aims at improving the situation, by changing the way recovery parameters are saved on the target node. Instead of writing all the configuration to postgresql.auto.conf, this file now uses an include_if_exists, that points to a pg_createsubscriber.conf. This new file contains all the recovery configuration, and is renamed to pg_createsubscriber.conf.disabled when pg_createsubscriber exits. This approach resets the recovery parameters, and offers the benefit to keep a trace of the setup used when the target node got promoted, for debugging purposes. If pg_createsubscriber.conf cannot be renamed (unlikely scenario), a warning is issued to inform users that a manual intervention may be required to reset this configuration. This commit includes a test case to demonstrate the problematic case: a standby node created from a base backup of what was the target node of pg_createsubscriber does not get confused when started. If removing this new logic, the test fails with the standby not able to start due to an incorrect recovery target setup, where the startup process fails quickly with a FATAL. I have provided the design idea for the patch, that Alyona has written (with some code adjustments from me). This could be considered as a bug, but after discussion this is put into the bucket for improvements. Redesigning pg_createsubscriber would not be acceptable in the stable branches anyway. Author: Alyona Vinter <dlaaren8@gmail.com> Reviewed-by: Ilyasov Ian <ianilyasov@outlook.com> Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com> Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Andrey Rudometov <unlimitedhikari@gmail.com> Discussion: https://postgr.es/m/CAGWv16K6L6Pzm99i1KiXLjFWx2bUS3DVsR6yV87-YR9QO7xb3A@mail.gmail.com	2026-01-08 10:12:33 +09:00
Masahiko Sawada	28c4b8a05b	psql: Add tab completion for pstdin and pstdout in \copy. This commit adds tab completion support for the keywords "pstdin" and "pstdout" in the \copy command. "pstdin" is now suggested after FROM, and "pstdout" is suggested after TO, alongside filenames and other keywords. Author: Yugo Nagata <nagata@sraoss.co.jp> Reviewed-by: Srinath Reddy Sadipiralla <srinath2133@gmail.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Discussion: https://postgr.es/m/20251231183953.95132e171e43abd5e9b78084@sraoss.co.jp	2026-01-07 16:22:42 -08:00
Álvaro Herrera	e1c971945d	Replace flaky CIC/RI isolation tests with a TAP test The isolation tests for INSERT ON CONFLICT behavior during CREATE INDEX CONCURRENTLY and REINDEX CONCURRENTLY (added by `bc32a12e0d`, `2bc7e886fc`, and `90eae926ab`) were disabled in `77038d6d0b` due to persistent CI flakiness, after several attempts at stabilization. This commit removes them and introduces a TAP test in test_misc module (010_index_concurrently_upsert.pl) that covers the same scenarios. This new test should hopefully be more stable while providing assurance that the fixes in all those commits (plus `81f72115cf`) continue to work. Author: Mihail Nikalayeu <mihailnikalayeu@gmail.com> Reported-by: Andres Freund <andres@anarazel.de> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/ccssrhafzbp3a3beju3ptyc56a7gbfimj4vwkbokoldofckrc7@bso37rxskjtf Discussion: https://postgr.es/m/CANtu0ogv+6wqRzPK241jik4U95s1pW3MCZ3rX5ZqbFdUysz7Qw@mail.gmail.com Discussion: https://postgr.es/m/202512112014.icpomgc37zx4@alvherre.pgsql	2026-01-07 19:44:57 -03:00
Nathan Bossart	a516b3f00d	MSVC: Support building for AArch64. This commit does the following to get tests passing for MSVC/AArch64: * Implements spin_delay() with an ISB instruction (like we do for gcc/clang on AArch64). * Sets USE_ARMV8_CRC32C unconditionally. Vendor-supported versions of Windows for AArch64 require at least ARMv8.1, which is where CRC extension support became mandatory. * Implements S_UNLOCK() with _InterlockedExchange(). The existing implementation for MSVC uses _ReadWriteBarrier() (a compiler barrier), which is insufficient for this purpose on non-TSO architectures. There are likely other changes required to take full advantage of the hardware (e.g., atomics/arch-arm.h, simd.h, pg_popcount_aarch64.c), but those can be dealt with later. Author: Niyas Sait <niyas.sait@linaro.org> Co-authored-by: Greg Burd <greg@burd.me> Co-authored-by: Dave Cramer <davecramer@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: John Naylor <johncnaylorls@gmail.com> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Thomas Munro <thomas.munro@gmail.com> Tested-by: Andrew Dunstan <andrew@dunslane.net> Discussion: https://postgr.es/m/A6152C7C-F5E3-4958-8F8E-7692D259FF2F%40greg.burd.me Discussion: https://postgr.es/m/CAFPTBD-74%2BAEuN9n7caJ0YUnW5A0r-KBX8rYoEJWqFPgLKpzdg%40mail.gmail.com	2026-01-07 13:42:57 -06:00
Peter Geoghegan	8aed8e168f	Fix nbtree skip array transformation comments. Fix comments that incorrectly described transformations performed by the "Avoid extra index searches through preprocessing" mechanism introduced by commit `b3f1a13f`. Author: Yugo Nagata <nagata@sraoss.co.jp> Reviewed-By: Chao Li <li.evan.chao@gmail.com> Reviewed-By: Peter Geoghegan <pg@bowt.ie> Discussion: https://postgr.es/m/20251230190145.c3c88c5eb0f88b136adda92f@sraoss.co.jp Backpatch-through: 18	2026-01-07 12:53:07 -05:00
Fujii Masao	1b795ef032	doc: Remove deprecated clauses from CREATE USER/GROUP syntax synopsis. The USER and IN GROUP clauses of CREATE ROLE are deprecated, and commit `8e78f0a1` removed them from the CREATE ROLE syntax synopsis in the docs. However, previously CREATE USER and CREATE GROUP docs still listed these clauses. Since CREATE USER is equivalent to CREATE ROLE ... WITH LOGIN and CREATE GROUP is equivalent to CREATE ROLE, their documented syntax synopsis should match CREATE ROLE to avoid confusion. Therefore this commit removes the deprecated USER and IN GROUP clauses from the CREATE USER and CREATE GROUP syntax synopsis in the docs. Author: Japin Li <japinli@hotmail.com> Reviewed-by: wenhui qiu <qiuwenhuifx@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/MEAPR01MB3031C30E72EF16CFC08C8565B687A@MEAPR01MB3031.ausprd01.prod.outlook.com	2026-01-08 01:10:36 +09:00
Alexander Korotkov	e54ce0b2da	Revert "Use WAIT FOR LSN in PostgreSQL::Test::Cluster::wait_for_catchup()" This reverts commit `f30848cb05` due to buildfarm failures related to recovery conflicts while executing the WAIT FOR command. It appears that WAIT FOR still holds xmin for a while. Reported-by: Thomas Munro <thomas.munro@gmail.com> Discussion: https://postgr.es/m/CA%2BhUKG%2BL0OQR8dQtsNBSaA3FNNyQeFabdeRVzurm0b7Xtp592g%40mail.gmail.com	2026-01-07 17:22:18 +02:00
Peter Eisentraut	6675f41c18	Fix typo Reported-by: Xueyu Gao <gaoxueyu_hope@163.com> Discussion: https://www.postgresql.org/message-id/42b5c99a.856d.19b73d858e2.Coremail.gaoxueyu_hope%40163.com	2026-01-07 15:47:02 +01:00
John Naylor	ece25c2611	createuser: Update docs to reflect defaults Commit `c7eab0e97` changed the default password_encryption setting to 'scram-sha-256', so update the example for creating a user with an assigned password. In addition, commit `08951a7c9` added new options that in turn pass default tokens NOBYPASSRLS and NOREPLICATION to the CREATE ROLE command, so fix this omission as well for v16 and later. Reported-by: Heikki Linnakangas <hlinnaka@iki.fi> Discussion: https://postgr.es/m/cff1ea60-c67d-4320-9e33-094637c2c4fb%40iki.fi Backpatch-through: 14	2026-01-07 16:02:19 +07:00
Michael Paquier	68119480a7	Fix unexpected reversal of lists during catcache rehash During catcache searches, the most-recently searched entries are kept at the head of the list to speed up subsequent searches, keeping the "freshest" entries at its beginning. A rehash of the catcache was doing the opposite: fresh entries were moved to the tail of the newly-created buckets, causing a rehash to slow down a bit. When a rehash is done, this commit switches the code to use dlist_push_tail() instead of dlist_push_head(), so that fresh entries are kept at the head of the lists, not their tail. Author: ChangAo Chen <cca5507@qq.com> Reviewed-by: John Naylor <johncnaylorls@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/tencent_9EA10D8512B5FE29E7323F780A0749768708@qq.com	2026-01-07 17:52:54 +09:00
Michael Paquier	ba887a8cdb	Fix grammar in datatype.sgml Introduced in `b139bd3b6e`. Reported-by: Man Zeng <zengman@halodbtech.com> Discussion: https://postgr.es/m/tencent_121C1BB152CAF3195C99D56C@qq.com	2026-01-07 14:13:18 +09:00
John Naylor	e171405afe	Further doc updates to reflect MD5 deprecation Followup to `44f49511b`. Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Discussion: https://postgr.es/m/CAHGQGwH_UfN96vcvLGA%3DYro%2Bo6qCn0nEgEGoviwzEiLTHtt2Pw%40mail.gmail.com Backpatch-through: 18	2026-01-07 12:00:05 +07:00
Fujii Masao	0a7c37b847	doc: Add glossary and index entries for GUC. GUC is a commonly used term but previously appeared only in the acronym documentation. This commit adds glossary and documentation index entries for GUC to make it easier to find and understand. Author: Robert Treat <rob@xzilla.net> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/CABV9wwPQnkeo_G6-orMGnHPK9SXGVWm7ajJPzsbE6944tDx=hQ@mail.gmail.com	2026-01-07 13:58:07 +09:00
Fujii Masao	466347ad28	doc: Add index entry for Git. This commit adds Git to the documentation index, pointing to the source code repository documentation. Author: Robert Treat <rob@xzilla.net> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/CABV9wwPQnkeo_G6-orMGnHPK9SXGVWm7ajJPzsbE6944tDx=hQ@mail.gmail.com	2026-01-07 13:57:36 +09:00
Michael Paquier	a2e632ece1	Improve portability of test with oid8 comparison function Oversight in `b139bd3b6e`, per reports from buildfarm members longfin and prion, that use -DSTRESS_SORT_INT_MIN. Thanks to Tom Lane for the poke. Discussion: https://postgr.es/m/1656709.1767754981@sss.pgh.pa.us	2026-01-07 12:55:16 +09:00
Michael Paquier	b139bd3b6e	Add data type oid8, 64-bit unsigned identifier This new identifier type provides support for 64-bit unsigned values, to be used in catalogs, like OIDs. An advantage of a new data type is that it becomes easier to grep for it in the code when assigning this type to a catalog attribute, linking it to dedicated APIs and internal structures. The following operators are added in this commit, with dedicated tests: - Casts with integer types and OID. - btree and hash operators - min/max functions. - C type with related macros and defines, named around "Oid8". This has been mentioned as useful on its own on the thread to add support for 64-bit TOAST values, so as it becomes possible to attach this data type to the TOAST code and catalog definitions. However, as this concept can apply to many more areas, it is implemented as its own independent change. This is based on a discussion with Andres Freund and Tom Lane. Bump catalog version. Author: Michael Paquier <michael@paquier.xyz> Reviewed-by: Greg Burd <greg@burd.me> Reviewed-by: Nikhil Kumar Veldanda <veldanda.nikhilkumar17@gmail.com> Discussion: https://postgr.es/m/1891064.1754681536@sss.pgh.pa.us	2026-01-07 11:37:00 +09:00
Jeff Davis	af2d4ca191	Clean up ICU includes. Remove ICU includes from pg_locale.h, and instead include them in the few C files that need ICU. Clean up a few other includes in passing. Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/48911db71d953edec66df0d2ce303563d631fbe0.camel@j-davis.com	2026-01-06 17:19:51 -08:00
Andres Freund	75609fded3	Fix buggy interaction between array subscripts and subplan params In `a7f107df2` I changed subplan param evaluation to happen within the containing expression. As part of that, ExecInitSubPlanExpr() was changed to evaluate parameters via a new EEOP_PARAM_SET expression step. These parameters were temporarily stored into ExprState->resvalue/resnull, with some reasoning why that would be fine. Unfortunately, that analysis was wrong - ExecInitSubscriptingRef() evaluates the input array into "resv"/"resnull", which will often point to ExprState->resvalue/resnull. This means that the EEOP_PARAM_SET, if inside an array subscript, would overwrite the input array of the array subscript. The fix is fairly simple - instead of evaluating into ExprState->resvalue/resnull, store the temporary result of the subplan in the subplan's return value. Bug: #19370 Reported-by: Zepeng Zhang <redraiment@gmail.com> Diagnosed-by: Tom Lane <tgl@sss.pgh.pa.us> Diagnosed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/19370-7fb7a5854b7618f1@postgresql.org Backpatch-through: 18	2026-01-06 19:51:10 -05:00
Jeff Davis	c4ff35f104	ICU: use UTF8-optimized case conversion API Initializes a UCaseMap object once for use across calls, and uses UTF8-optimized APIs. Author: Andreas Karlsson <andreas@proxel.se> Reviewed-by: zengman <zengman@halodbtech.com> Discussion: https://postgr.es/m/5a010b27-8ed9-4739-86fe-1562b07ba564@proxel.se	2026-01-06 14:09:07 -08:00
Michael Paquier	0547aeae0f	Improve portability of new worker_spi test The new test 002_worker_terminate relies on the generation of a LOG entry to check that a worker has been started, but missed the fact that a node set with log_error_verbosity = verbose would add an error code. The regexp used for the matching check did not take this case into account, making the test fail on a timeout. The regexp is now fixed to handle the verbose case correctly. Per buildfarm member prion, that uses log_error_verbosity = verbose. The error was reproducible by setting this GUC the same way in the test. Oversight in `f1e251be80`.	2026-01-06 20:24:17 +09:00
Peter Eisentraut	6449291728	Add test coverage for indirection transformation These tests cover nested arrays of composite data types, single-argument functions, and casting using dot-notation, providing a baseline for future enhancements to jsonb dot-notation support. Author: Alexandra Wang <alexandra.wang.oss@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CAK98qZ1JNNAx4QneJG+eX7iLesOhd6A68FNQVvvHP6Up_THf3A@mail.gmail.com	2026-01-06 09:45:17 +01:00
Alexander Korotkov	bf308639bf	Fix variable usage in wakeupWaiters() Mistakenly, `i` was used both as an integer representation of lsnType and an index in wakeUpProcs array. Discussion: https://postgr.es/m/CA%2BhUKG%2BL0OQR8dQtsNBSaA3FNNyQeFabdeRVzurm0b7Xtp592g%40mail.gmail.com Reviewed-by: Thomas Munro <thomas.munro@gmail.com> Reviewed-by: Ruikai Peng <ruikai@pwno.io>	2026-01-06 10:04:28 +02:00
Michael Paquier	8b9b93e39b	Use relation_close() more consistently in contrib/ All the code paths updated here have been using index_close() to close a relation that was opened with relation_open(), in pgstattuple and pageinspect. index_close() does the same thing as relation_close(), so there is no harm, but being inconsistent could lead to issues if the internals of these close() functions begin to introduce some specific logic in the future. In passing, this commit adds some comments explaining why we are using relation_open() instead of index_open() in a few places, which is due to the fact that partitioned indexes are not allowed in these functions. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Japin Li <japinli@hotmail.com> Discussion: https://postgr.es/m/aUKamYGiDKO6byp5@ip-10-97-1-34.eu-west-3.compute.internal	2026-01-06 16:17:59 +09:00
Michael Paquier	f1e251be80	Allow bgworkers to be terminated for database-related commands Background workers gain a new flag, called BGWORKER_INTERRUPTIBLE, that offers the possibility to terminate the workers when these are connected to a database that is involved in one of the following commands: ALTER DATABASE RENAME TO ALTER DATABASE SET TABLESPACE CREATE DATABASE DROP DATABASE This is useful to give background workers the same behavior as backends and autovacuum workers, which are stopped when these commands are executed. The default behavior, that exists since 9.3, is still to never terminate bgworkers connected to the database involved in any of these commands. The new flag has to be set to terminate the workers. A couple of tests are added to worker_spi to track the commands that impact the termination of the workers. There is a test case for a non-interruptible worker, additionally, that relies on an injection point to make the wait time in CountOtherDBBackends() reduced from 5s to 0.3s for faster test runs. The tests rely on the contents of the server logs to check if a worker has been started or terminated: - LOG generated by worker_spi_main() at startup, once connection to database is done. - FATAL in bgworker_die() when terminated. A couple of tests run in the CI have shown that this method is stable enough. The safe_psql() calls that scan pg_stat_activity could be replaced with some poll_query_until() for more stability, if the current method proves to be an issue in the buildfarm. Author: Aya Iwata <iwata.aya@fujitsu.com> Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Ryo Matsumura <matsumura.ryo@fujitsu.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Peter Smith <smithpb2250@gmail.com> Discussion: https://postgr.es/m/OS7PR01MB11964335F36BE41021B62EAE8EAE4A@OS7PR01MB11964.jpnprd01.prod.outlook.com	2026-01-06 14:24:29 +09:00
Amit Kapila	c970bdc037	Update comments atop ReplicationSlotCreate. Since commit `1462aad2e4`, which introduced the ability to modify the two_phase property of a slot, the comments above ReplicationSlotCreate have become outdated. We have now added a cautionary note in the comments above ReplicationSlotAlter explaining when it is safe to modify the two_phase property of a slot. Author: Daniil Davydov <3danissimo@gmail.com> Author: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com> Backpatch-through: 18 Discussion: https://postgr.es/m/CAJDiXggZXQZ7bD0QcTizDt6us9aX6ZKK4dWxzgb5x3+TsVHjqQ@mail.gmail.com	2026-01-06 05:02:25 +00:00
David Rowley	6ca8506ea5	Fix issue with EVENT TRIGGERS and ALTER PUBLICATION When processing the "publish" options of an ALTER PUBLICATION command, we call SplitIdentifierString() to split the options into a List of strings. Since SplitIdentifierString() modifies the delimiter characters and puts NULs in their place, this would overwrite the memory of the AlterPublicationStmt. Later in AlterPublicationOptions(), the modified AlterPublicationStmt is copied for event triggers, which would result in the event trigger only seeing the first "publish" option rather than all options that were specified in the command. To fix this, make a copy of the string before passing to SplitIdentifierString(). Here we also adjust a similar case in the pgoutput plugin. There are no known issues caused by SplitIdentifierString() here, so this is being done out of paranoia. Thanks to Henson Choi for putting together an example case showing the ALTER PUBLICATION issue. Author: sunil s <sunilfeb26@gmail.com> Reviewed-by: Henson Choi <assam258@gmail.com> Reviewed-by: zengman <zengman@halodbtech.com> Backpatch-through: 14	2026-01-06 17:29:12 +13:00
Amit Kapila	63ed3bc7f9	Fix typo in slot.c. Author: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/AC9B87F1-ED04-4547-B85C-9443B4253A08@gmail.com Discussion: https://postgr.es/m/CAJDiXggZXQZ7bD0QcTizDt6us9aX6ZKK4dWxzgb5x3+TsVHjqQ@mail.gmail.com	2026-01-06 04:13:40 +00:00
Michael Paquier	ae28373602	Fix typo in planner.c `b8cfcb9e00` did not get this change right. Author: Alexander Law <exclusion@gmail.com> Discussion: https://postgr.es/m/CAJ0YPFFWhJXs-e-=7iJz-FLp=b1dXfJA_qtrVAgto=bZmzD9zQ@mail.gmail.com	2026-01-06 12:30:01 +09:00
Fujii Masao	5f13999aa1	Add TAP test for GUC settings passed via CONNECTION in logical replication. Commit `d926462d81` restored the behavior of passing GUC settings from the CONNECTION string to the publisher's walsender, allowing per-connection configuration. This commit adds a TAP test to verify that behavior works correctly. Since commit `d926462d81` was recently applied and backpatched to v15, this follow-up commit is also backpatched accordingly. Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Chao Li <lic@highgo.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Japin Li <japinli@hotmail.com> Discussion: https://postgr.es/m/CAHGQGwGYV+-abbKwdrM2UHUe-JYOFWmsrs6=QicyJO-j+-Widw@mail.gmail.com Backpatch-through: 15	2026-01-06 11:57:12 +09:00
Fujii Masao	d926462d81	Honor GUC settings specified in CREATE SUBSCRIPTION CONNECTION. Prior to v15, GUC settings supplied in the CONNECTION clause of CREATE SUBSCRIPTION were correctly passed through to the publisher's walsender. For example: CREATE SUBSCRIPTION mysub CONNECTION 'options=''-c wal_sender_timeout=1000''' PUBLICATION ... would cause wal_sender_timeout to take effect on the publisher's walsender. However, commit `f3d4019da5` changed the way logical replication connections are established, forcing the publisher's relevant GUC settings (datestyle, intervalstyle, extra_float_digits) to override those provided in the CONNECTION string. As a result, from v15 through v18, GUC settings in the CONNECTION string were always ignored. This regression prevented per-connection tuning of logical replication. For example, using a shorter timeout for walsender connecting to a nearby subscriber and a longer one for walsender connecting to a remote subscriber. This commit restores the intended behavior by ensuring that GUC settings in the CONNECTION string are again passed through and applied by the walsender, allowing per-connection configuration. Backpatch to v15, where the regression was introduced. Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Chao Li <lic@highgo.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Japin Li <japinli@hotmail.com> Discussion: https://postgr.es/m/CAHGQGwGYV+-abbKwdrM2UHUe-JYOFWmsrs6=QicyJO-j+-Widw@mail.gmail.com Backpatch-through: 15	2026-01-06 11:52:22 +09:00
David Rowley	7f9acc9bcc	Simplify GetOperatorFromCompareType() code The old code would set *opid = InvalidOid to determine if the get_opclass_opfamily_and_input_type() worked or not. This means more moving parts than what's really needed here. Let's just fail immediately if the get_opclass_opfamily_and_input_type() lookup fails. Author: Paul A Jungwirth <pj@illuminatedcomputing.com> Reviewed-by: Neil Chen <carpenter.nail.cz@gmail.com> Discussion: https://postgr.es/m/CA+renyXOrjLacP_nhqEQUf2W+ZCoY2q5kpQCfG05vQVYzr8b9w@mail.gmail.com	2026-01-06 15:25:13 +13:00
David Rowley	e8d4e94a47	Fix misleading comment for GetOperatorFromCompareType The comment claimed *strat got set to InvalidStrategy when the function lookup fails. This isn't true; an ERROR is raised when that happens. Author: Paul A Jungwirth <pj@illuminatedcomputing.com> Reviewed-by: Neil Chen <carpenter.nail.cz@gmail.com> Discussion: https://postgr.es/m/CA+renyXOrjLacP_nhqEQUf2W+ZCoY2q5kpQCfG05vQVYzr8b9w@mail.gmail.com Backpatch-through: 18	2026-01-06 15:16:14 +13:00
Fujii Masao	b9ee5f2dcb	doc: Fix outdated doc in pg_rewind. Update pg_rewind documentation to reflect the change that data checksums are now enabled by default during initdb. Backpatch to v18, where data checksums were changed to be enabled by default. Author: Zhijie Hou <houzj.fnst@fujitsu.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/TY4PR01MB16907D62F3A0A377B30FDBEA794B2A@TY4PR01MB16907.jpnprd01.prod.outlook.com Backpatch-through: 18	2026-01-06 11:00:54 +09:00
David Rowley	c5af141cd4	Clarify where various catcache.h dlist_nodes are used Also remove a comment which mentions we don't currently divide the per-cache lists into hash buckets. Since `473182c95`, we do. Author: ChangAo Chen <cca5507@qq.com> Discussion: https://postgr.es/m/tencent_7732789707C8768EA13785A7B5EA29103208@qq.com	2026-01-06 14:39:36 +13:00
Masahiko Sawada	e212a0f8e6	pg_visibility: Fix incorrect buffer lock description in comment. Although the comment in collect_corrupt_items() stated that the buffer is locked in exclusive mode, it is actually locked in shared mode. Author: Chao Li <lic@highgo.com> Discussion: https://postgr.es/m/CAEoWx2kkhxgfp=kinPMetnwHaa0JjR6YBkO_0gg0oiy6mu7Zjw@mail.gmail.com	2026-01-05 15:49:43 -08:00
Tom Lane	b85d5dc0e7	Fix meson build of snowball code. include/snowball/libstemmer has to be in the -I search path, as it is in the autoconf build. It's not apparent to me how this ever worked before, nor why my recent commit made it stop working. Discussion: https://postgr.es/m/ld2iurl7kzexwydxmdfhdgarpa7xxsfrgvggqhbblt4rvt3h6t@bxsk6oz5x7cc	2026-01-05 16:51:36 -05:00
Tom Lane	7dc95cc3b9	Update to latest Snowball sources. It's been almost a year since we last did this, and upstream has been busy. They've added stemmers for Polish and Esperanto, and also deprecated their old Dutch stemmer in favor of the Kraaij-Pohlmann algorithm. (The "dutch" stemmer is now the latter, and "dutch_porter" is the old algorithm.) Upstream also decided to rename their internal header "header.h" to something less generic: "snowball_runtime.h". Seems like a good thing, but it complicates this patch a bit because we were relying on interposing our own version of "header.h" to control system header inclusion order. (We're partially failing at that now, because now the generated stemmer files include <stddef.h> before snowball_runtime.h. I think that'll be okay, but if the buildfarm complains then we'll have to do more-extensive editing of the generated files.) I realized that we weren't documenting the available stemmers in any user-visible place, except indirectly through sample \dFd output. That's incomplete because we only provide built-in dictionaries for the recommended stemmers for each language, not alternative stemmers such as dutch_porter. So I added a list to the documentation. I did not do anything with the stopword lists. If those are still available from snowballstem.org, they are mighty well hidden. Discussion: https://postgr.es/m/1185975.1767569534@sss.pgh.pa.us	2026-01-05 15:22:37 -05:00
Andres Freund	bb048e31dc	ci: Remove ulimit -p for netbsd/openbsd Previously the ulimit -p 256 was needed to increase the limit on openbsd. However, sometimes the limit actually was too low, causing "could not fork new process for connection: Resource temporarily unavailable" errors. Most commonly on netbsd, but also on openbsd. The ulimit on openbsd couldn't trivially be increased with ulimit, because of hitting the hard limit. Instead of increasing the limit in the CI script, the CI image generation now increases the limits: https://github.com/anarazel/pg-vm-images/pull/129 Backpatch-through: 18	2026-01-05 13:57:04 -05:00
Masahiko Sawada	d351063e49	Fix typo in parallel.c. Author: kelan <ke_lan1@qq.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Discussion: https://postgr.es/m/tencent_38B5875E2D440C8DA8C0C022ABD999F9C207@qq.com	2026-01-05 10:16:28 -08:00
Alexander Korotkov	f30848cb05	Use WAIT FOR LSN in PostgreSQL::Test::Cluster::wait_for_catchup() When the standby is passed as a PostgreSQL::Test::Cluster instance, use the WAIT FOR LSN command on the standby server to implement wait_for_catchup() for replay, write, and flush modes. This is more efficient than polling pg_stat_replication on the upstream, as the WAIT FOR LSN command uses a latch-based wakeup mechanism. The optimization applies when: - The standby is passed as a Cluster object (not just a name string) - The mode is 'replay', 'write', or 'flush' (not 'sent') - The standby is in recovery For 'sent' mode, when the standby is passed as a string (e.g., a subscription name for logical replication), or when the standby has been promoted, the function falls back to the original polling-based approach using pg_stat_replication on the upstream. Discussion: https://postgr.es/m/CABPTF7UiArgW-sXj9CNwRzUhYOQrevLzkYcgBydmX5oDes1sjg%40mail.gmail.com Author: Xuneng Zhou <xunengzhou@gmail.com> Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Alvaro Herrera <alvherre@kurilemu.de>	2026-01-05 19:56:19 +02:00
Alexander Korotkov	76948337f7	Add tab completion for the WAIT FOR LSN MODE option Update psql tab completion to support the optional MODE option in the WAIT FOR LSN command. After specifying an LSN value, completion now offers both MODE and WITH keywords. The MODE option specifies which LSN type to wait for. In particular, it controls whether the wait is evaluated from the standby or primary perspective. When MODE is specified, the completion suggests the valid mode values: standby_replay, standby_write, standby_flush, and primary_flush. Discussion: https://postgr.es/m/CABPTF7UiArgW-sXj9CNwRzUhYOQrevLzkYcgBydmX5oDes1sjg%40mail.gmail.com Author: Xuneng Zhou <xunengzhou@gmail.com> Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Alvaro Herrera <alvherre@kurilemu.de>	2026-01-05 19:56:19 +02:00
Alexander Korotkov	49a181b5d6	Add the MODE option to the WAIT FOR LSN command This commit extends the WAIT FOR LSN command with an optional MODE option in the WITH clause that specifies which LSN type to wait for: WAIT FOR LSN '<lsn>' [WITH (MODE '<mode>', ...)] where mode can be: - 'standby_replay' (default): Wait for WAL to be replayed to the specified LSN, - 'standby_write': Wait for WAL to be written (received) to the specified LSN, - 'standby_flush': Wait for WAL to be flushed to disk at the specified LSN, - 'primary_flush': Wait for WAL to be flushed to disk on the primary server. The default mode is 'standby_replay', matching the original behavior when MODE is not specified. This follows the pattern used by COPY and EXPLAIN commands, where options are specified as string values in the WITH clause. Modes are explicitly named to distinguish between primary and standby operations: - Standby modes ('standby_replay', 'standby_write', 'standby_flush') can only be used during recovery (on a standby server), - Primary mode ('primary_flush') can only be used on a primary server. The 'standby_write' and 'standby_flush' modes are useful for scenarios where applications need to ensure WAL has been received or persisted on the standby without necessarily waiting for replay to complete. The 'primary_flush' mode allows waiting for WAL to be flushed on the primary server. This commit also includes: - Documentation updates for the new syntax and mode descriptions, - Test coverage for all four modes, including error cases and concurrent waiters, - Wakeup logic in walreceiver for standby write/flush waiters, - Wakeup logic in WAL writer for primary flush waiters. Discussion: https://postgr.es/m/CABPTF7UiArgW-sXj9CNwRzUhYOQrevLzkYcgBydmX5oDes1sjg%40mail.gmail.com Author: Xuneng Zhou <xunengzhou@gmail.com> Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Alvaro Herrera <alvherre@kurilemu.de>	2026-01-05 19:56:19 +02:00
Alexander Korotkov	7a39f43d88	Extend xlogwait infrastructure with write and flush wait types Add support for waiting on WAL write and flush LSNs in addition to the existing replay LSN wait type. This provides the foundation for extending the WAIT FOR command with the MODE parameter. Key changes are as follows. - Add WAIT_LSN_TYPE_STANDBY_WRITE and WAIT_LSN_TYPE_STANDBY_FLUSH to WaitLSNType. - Add GetCurrentLSNForWaitType() to retrieve the current LSN for each wait type. - Add new wait events WAIT_EVENT_WAIT_FOR_WAL_WRITE and WAIT_EVENT_WAIT_FOR_WAL_FLUSH for pg_stat_activity visibility. - Update WaitForLSN() to use GetCurrentLSNForWaitType() internally. Discussion: https://postgr.es/m/CABPTF7UiArgW-sXj9CNwRzUhYOQrevLzkYcgBydmX5oDes1sjg%40mail.gmail.com Author: Xuneng Zhou <xunengzhou@gmail.com> Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Alvaro Herrera <alvherre@kurilemu.de>	2026-01-05 19:56:19 +02:00
Alexander Korotkov	d51a5d8e56	Adjust errcode in checkPartition() Replace ERRCODE_UNDEFINED_TABLE with ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE for the case where we don't find a parent-child relationship between the partitioned table and its partition. In this case, tables are present, but they are not in a prerequisite state (no relationship). Discussion: https://postgr.es/m/CAHewXNmBM%2B5qbrJMu60NxPn%2B0y-%3D2wXM-QVVs3xRp8NxFvDb9A%40mail.gmail.com Author: Tender Wang <tndrwang@gmail.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com>	2026-01-05 19:56:19 +02:00
Robert Haas	3f33b63de2	Remove redundant SET enable_partitionwise_join = on. partition_join.sql keeps partitionwise join enabled for the entire file, so we don't need to enable it for this test case individually. Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Discussion: http://postgr.es/m/CAExHW5uRW=Z==bmLR=NXm6Vv3JGH4rUvb+Rfft8TfjrfzUUm3g@mail.gmail.com	2026-01-05 11:57:24 -05:00
Michael Paquier	877ae5db89	Fix comment in tableam.c Author: shiyu qin <qinshy510@gmail.com> Discussion: https://postgr.es/m/CAJUCM3uJjoLR1zfKoZD4J71T-hdeFdFw1kTQoMkywKZP0hZsvw@mail.gmail.com	2026-01-05 19:15:55 +09:00
Peter Eisentraut	de746e0d2a	Separate read and write pointers in pg_saslprep Use separate pointers for reading const input ('p') and writing to mutable output ('outp'), avoiding the need to cast away const on the input parameter. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://postgr.es/m/aUQHy/MmWq7c97wK%40ip-10-97-1-34.eu-west-3.compute.internal	2026-01-05 11:03:49 +01:00
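The read-pointer/write-pointer split described in `de746e0d2a` can be illustrated with a minimal sketch. This is not the pg_saslprep code: `demo_strip_spaces` is a hypothetical helper, and only the pointer names `p` and `outp` are taken from the commit message. It shows how a const input can be scanned while a separate mutable pointer writes the output, so no cast away from const is needed.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper (not part of PostgreSQL): copy 'input' to 'output'
 * while dropping ASCII spaces.  'output' must have room for
 * strlen(input) + 1 bytes.
 */
void
demo_strip_spaces(const char *input, char *output)
{
	const char *p = input;		/* read pointer over const input */
	char	   *outp = output;	/* write pointer into mutable output */

	assert(input != NULL && output != NULL);

	while (*p != '\0')
	{
		if (*p != ' ')
			*outp++ = *p;
		p++;
	}
	*outp = '\0';
}
```

Because `p` is only ever read through, the compiler enforces that the input is never modified.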
Heikki Linnakangas	461b8cc952	Tighten up assertion on a local variable 'lineindex' is 0-based, as mentioned in the comments. Backpatch to v18 where the assertion was added. Author: ChangAo Chen <cca5507@qq.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/tencent_A84F3C810365BB9BD08442955AE494141907@qq.com Backpatch-through: 18	2026-01-05 11:33:35 +02:00
David Rowley	4c144e0452	Use the GetPGProcByNumber() macro when possible A few places were accessing &ProcGlobal->allProcs directly, so adjust them to use the accessor macro instead. Author: Maksim Melnikov <m.melnikov@postgrespro.ru> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/80621c00-aba6-483c-88b1-a845461d1165@postgrespro.ru	2026-01-05 21:19:03 +13:00
Amit Kapila	3f906d3af9	Improve the comments atop build_replindex_scan_key(). Author: zourenli <398740848@qq.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/tencent_C2DC8157CC05C8F5C36E12678A7864554809@qq.com	2026-01-05 03:06:55 +00:00
Michael Paquier	8ab4b864c1	Remove unneeded probes from configure and meson `7d854bdc5b` has removed two symbols from pg_config.h.in. This file is automatically generated. The correct cleanup needs to be done in the build scripts, instead. autoheader produces now a consistent pg_config.h.in, without the symbols that were removed in the previous commit. Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/1193764.1767573683@sss.pgh.pa.us	2026-01-05 11:03:43 +09:00
Michael Paquier	7d854bdc5b	Remove unneeded defines from pg_config.h.in This commit removes HAVE_ATOMIC_H and HAVE_MBARRIER_H from pg_config.h.in, cleanup that could have been done in `25f36066dd`. Author: Alexander Lakhin <exclusion@gmail.com> Discussion: https://postgr.es/m/b2c0d0b7-3944-487d-a03d-d155851958ff@gmail.com	2026-01-05 09:27:19 +09:00
Michael Paquier	b8cfcb9e00	Fix typos and inconsistencies in code and comments This change is a cocktail of harmonization of function argument names, grammar typos, renames for better consistency and unused code (see ltree). All of these have been spotted by the author. Author: Alexander Lakhin <exclusion@gmail.com> Discussion: https://postgr.es/m/b2c0d0b7-3944-487d-a03d-d155851958ff@gmail.com	2026-01-05 09:19:15 +09:00
Tom Lane	e3fbc9a8de	Allow role created by new test to log in on Windows. We must tell init about each role name we plan to connect as, else SSPI auth fails. Similar to previous patches such as `da44d71e7`. Oversight in `f3c9e341c`, per buildfarm member drongo.	2026-01-04 18:14:02 -05:00
Tom Lane	ba75f71752	Include error location in errors from ComputeIndexAttrs(). Make use of IndexElem's new location field to localize these errors better. Author: jian he <jian.universality@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CACJufxH3OgXF1hrzGAaWyNtye2jHEmk9JbtrtGv-KJK6tsGo5w@mail.gmail.com	2026-01-04 14:16:20 -05:00
Tom Lane	62299bbd90	Add parse location to IndexElem. This patch mostly just fills in the field, although a few error reports in resolve_unique_index_expr() are adjusted to use it. The next commit will add more uses. catversion bump out of an abundance of caution: I'm not sure IndexElem can appear in stored rules, but I'm not sure it can't either. Author: Álvaro Herrera <alvherre@kurilemu.de> Co-authored-by: jian he <jian.universality@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CACJufxH3OgXF1hrzGAaWyNtye2jHEmk9JbtrtGv-KJK6tsGo5w@mail.gmail.com Discussion: https://postgr.es/m/202512121327.f2zimsr6guso@alvherre.pgsql	2026-01-04 14:16:20 -05:00
Heikki Linnakangas	ac94ce8194	Fix partial read handling in pg_upgrade's multixact conversion Author: Man Zeng <zengman@halodbtech.com> Discussion: https://www.postgresql.org/message-id/tencent_566562B52163DB1502F4F7A4@qq.com	2026-01-04 20:04:36 +02:00
Peter Eisentraut	0eadf1767a	Remove bogus const qualifier on PageGetItem() argument The function ends up casting away the const qualifier, so it was a lie. No callers appear to rely on the const qualifier on the argument, so the simplest solution is to just remove it. Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://www.postgresql.org/message-id/beusplf77varvhip6ryuhd2fchsx26qmmhduqz432bnglq634b%402dx4k6yxj4cm	2026-01-04 16:00:15 +01:00
David Rowley	07e0e9ac27	Doc: add missing punctuation Author: Daisuke Higuchi <higuchi.daisuke11@gmail.com> Reviewed-by: Robert Treat <rob@xzilla.net> Discussion: https://postgr.es/m/CAEVT6c-yWYstu76YZ7VOxmij2XA8vrOEvens08QLmKHTDjEPBw@mail.gmail.com Backpatch-through: 14	2026-01-04 21:12:23 +13:00
David Rowley	4f49e4b55e	Fix selectivity estimation integer overflow in contrib/intarray This fixes a poorly written integer comparison function which was performing subtraction in an attempt to return a negative value when a < b and a positive value when a > b, and 0 when the values were equal. Unfortunately that didn't always work correctly due to two's complement having the INT_MIN 1 further from zero than INT_MAX. This could result in an overflow and cause the comparison function to return an incorrect result, which would result in the binary search failing to find the value being searched for. This could cause poor selectivity estimates when the statistics stored the value of INT_MAX (2147483647) and the value being searched for was large enough to result in the binary search doing a comparison with that INT_MAX value. Author: Chao Li <li.evan.chao@gmail.com> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/CAEoWx2ng1Ot5LoKbVU-Dh---dFTUZWJRH8wv2chBu29fnNDMaQ@mail.gmail.com Backpatch-through: 14	2026-01-04 20:32:40 +13:00
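The overflow described in `4f49e4b55e` can be reproduced with a small sketch. Neither function below is the contrib/intarray code; both names are hypothetical. The subtraction-based variant uses unsigned arithmetic only to model the wraparound without invoking undefined behavior (the conversion back to int is implementation-defined, and wraps on common platforms). The second variant is one common overflow-safe formulation, not necessarily the one the commit uses.

```c
#include <assert.h>
#include <limits.h>

/*
 * Hypothetical model of a subtraction-based comparator.  When the true
 * difference does not fit in an int, the sign of the result is wrong.
 */
int
demo_cmp_by_subtraction(int a, int b)
{
	return (int) ((unsigned int) a - (unsigned int) b);
}

/* Overflow-safe comparator: yields -1, 0 or 1 without any subtraction. */
int
demo_cmp_safe(int a, int b)
{
	return (a > b) - (a < b);
}

/* Small self-check covering the INT_MAX case named in the commit message. */
void
demo_cmp_selfcheck(void)
{
	assert(demo_cmp_safe(INT_MAX, -1) > 0);
	assert(demo_cmp_safe(INT_MIN, 1) < 0);
	assert(demo_cmp_safe(7, 7) == 0);
}
```

With `a = INT_MAX` and `b = -1`, the true difference is `INT_MAX + 1`, which is not representable as an int, so a binary search relying on the subtraction-based result can move in the wrong direction.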
Tom Lane	54315fde73	Improve a couple of error messages. Change "function" to "function or procedure" in PreventInTransactionBlock, and improve grammar of ExecWaitStmt's complaint about having an active snapshot. Author: Pavel Stehule <pavel.stehule@gmail.com> Reviewed-by: Andreas Karlsson <andreas@proxel.se> Reviewed-by: Marcos Pegoraro <marcos@f10.com.br> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAFj8pRCveWPR06bbad9GnMb0Kcr6jnXPttv9XOaOB+oFCD1Tsg@mail.gmail.com	2026-01-03 17:18:39 -05:00
David Rowley	094b61ce3e	Fix spelling mistake in fk-snapshot-3.spec Author: Aditya Gollamudi <adigollamudi@gmail.com> Discussion: https://postgr.es/m/CAD-KL_EdOOWp_cmPk9%3D5vNxo%2BabTTRpNx4vex-gVUm8u3GnkTg%40mail.gmail.com	2026-01-02 17:53:07 +13:00
Bruce Momjian	451c43974f	Update copyright for 2026 Backpatch-through: 14	2026-01-01 13:24:10 -05:00
Andrew Dunstan	f3c9e341cd	Add paths of extensions to pg_available_extensions Add a new "location" column to the pg_available_extensions and pg_available_extension_versions views, exposing the directory where the extension is located. The default system location is shown as '$system', the same value that can be used to configure the extension_control_path GUC. User-defined locations are only visible for super users, otherwise '<insufficient privilege>' is returned as a column value, the same behaviour that we already use in pg_stat_activity. I failed to resist the temptation to do a little extra editorializing of the TAP test script. Catalog version bumped. Author: Matheus Alcantara <mths.dev@pm.me> Reviewed-By: Chao Li <li.evan.chao@gmail.com> Reviewed-By: Rohit Prasad <rohit.prasad@arm.com> Reviewed-By: Michael Banck <mbanck@gmx.net> Reviewed-By: Manni Wood <manni.wood@enterprisedb.com> Reviewed-By: Euler Taveira <euler@eulerto.com> Reviewed-By: Quan Zongliang <quanzongliang@yeah.net>	2026-01-01 12:13:59 -05:00
Masahiko Sawada	85d5bd308b	Fix macro name for io_uring_queue_init_mem check. Commit `f54af9f267` added a check for io_uring_queue_init_mem(). However, it used the macro name HAVE_LIBURING_QUEUE_INIT_MEM in both meson.build and the C code, while the Autotools build script defined HAVE_IO_URING_QUEUE_INIT_MEM. As a result, the optimization was never enabled in builds configured with Autotools, as the C code checked for the wrong macro name. This commit changes the macro name to HAVE_IO_URING_QUEUE_INIT_MEM in meson.build and the C code. This matches the actual function name (io_uring_queue_init_mem), following the standard HAVE_<FUNCTION> convention. Backpatch to 18, where the macro was introduced. Bug: #19368 Reported-by: Evan Si <evsi@amazon.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/19368-016d79a7f3a1c599@postgresql.org Backpatch-through: 18	2025-12-31 11:18:14 -08:00
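The failure mode in `85d5bd308b` is a silent one: a feature test on a misspelled macro simply compiles the fallback path. The sketch below illustrates this under the assumption that the build system defines only the `HAVE_IO_URING_QUEUE_INIT_MEM` spelling; the `demo_*` functions are hypothetical and stand in for the real io_uring code.

```c
#include <assert.h>

/* Assumption for this sketch: the configure step defines this spelling. */
#define HAVE_IO_URING_QUEUE_INIT_MEM 1

/* Checks the spelling that the build system never defines. */
int
demo_checks_wrong_name(void)
{
#ifdef HAVE_LIBURING_QUEUE_INIT_MEM
	return 1;					/* optimized path */
#else
	return 0;					/* fallback path, chosen silently */
#endif
}

/* Checks the spelling that follows the HAVE_<FUNCTION> convention. */
int
demo_checks_right_name(void)
{
#ifdef HAVE_IO_URING_QUEUE_INIT_MEM
	return 1;					/* optimized path */
#else
	return 0;					/* fallback path */
#endif
}

void
demo_macro_selfcheck(void)
{
	assert(demo_checks_wrong_name() == 0);
	assert(demo_checks_right_name() == 1);
}
```

No compiler diagnostic is produced for the wrong name, which is consistent with the optimization going unused in Autotools builds until it was reported.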
Tom Lane	d6542f8dfc	Doc: remove obsolete, confused <note> about rowtype I/O syntax. This <note> was originally written to describe the double levels of de-backslashing encountered when a backslash-aware string literal is used to hold the text representation of a composite value. It still made sense when we switched to mostly using E'...' syntax for that type of literal. However, commit `f77de4b0c` mangled it completely by changing the example literal to be SQL-standard. The extra pass of de-backslashing described in the text doesn't actually occur with the example as written, unless you happen to be using standard_conforming_strings = off. We could restore this <note> to self-consistency by reverting the change from `f77de4b0c`, but on the whole I judge that its time has passed. standard_conforming_strings = off is nearly obsolete, and may soon be fully so. But without that, the behavior isn't so complicated as to justify a discursive note. I observe that the nearby section about array I/O syntax has no equivalent text, although that syntax is equally subject to this issue. Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/2998401.1767038920@sss.pgh.pa.us Discussion: https://postgr.es/m/3279216.1767072538@sss.pgh.pa.us	2025-12-31 13:19:27 -05:00
Thomas Munro	915711c8a4	jit: Fix jit_profiling_support when unavailable. jit_profiling_support=true captures profile data for Linux perf. On other platforms, LLVMCreatePerfJITEventListener() returns NULL and the attempt to register the listener would crash. Fix by ignoring the setting in that case. The documentation already says that it only has an effect if perf support is present, and we already did the same for older LLVM versions that lacked support. No field reports, unsurprisingly for an obscure developer-oriented setting. Noticed in passing while working on commit `1a28b4b4`. Backpatch-through: 14 Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/CA%2BhUKGJgB6gvrdDohgwLfCwzVQm%3DVMtb9m0vzQn%3DCwWn-kwG9w%40mail.gmail.com	2025-12-31 14:50:23 +13:00
Tom Lane	bc6374cd76	Change IndexAmRoutines to be statically-allocated structs. Up to now, index amhandlers were expected to produce a new, palloc'd struct on each call. That requires palloc/pfree overhead, and creates a risk of memory leaks if the caller fails to pfree, and the time taken to fill such a large structure isn't nil. Moreover, we were storing these things in the relcache, eating several hundred bytes for each cached index. There is not anything in these structs that needs to vary at runtime, so let's change the definition so that an amhandler can return a pointer to a "static const" struct of which there's only one copy per index AM. Mark all the core code's IndexAmRoutine pointers const so that we catch anyplace that might still try to change or pfree one. (This is similar to the way we were already handling TableAmRoutine structs. This commit does fix one comment that was infelicitously copied-and-pasted into tableamapi.c.) This commit needs to be called out in the v19 release notes as an API change for extension index AMs. An un-updated AM will still work (as of now, anyway) but it risks memory leaks and will be slower than necessary. Author: Matthias van de Meent <boekewurm+postgres@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAEoWx2=vApYk2LRu8R0DdahsPNEhWUxGBZ=rbZo1EXE=uA+opQ@mail.gmail.com	2025-12-30 18:26:23 -05:00
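The pattern adopted in `bc6374cd76` can be sketched as follows. `DemoAmRoutine` is a hypothetical two-field stand-in for IndexAmRoutine, and `demo_am_handler` is not a real handler signature; the point is only that the handler returns a pointer to a single "static const" struct instead of allocating and filling a new one on each call.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, heavily reduced stand-in for IndexAmRoutine. */
typedef struct DemoAmRoutine
{
	int			amstrategies;
	bool		amcanorder;
} DemoAmRoutine;

/*
 * Return the one statically allocated copy.  Callers receive a const
 * pointer, so attempts to modify the struct fail to compile, and there
 * is nothing to free.
 */
const DemoAmRoutine *
demo_am_handler(void)
{
	static const DemoAmRoutine amroutine = {
		.amstrategies = 5,
		.amcanorder = true,
	};

	return &amroutine;
}

void
demo_am_selfcheck(void)
{
	/* Every call yields the same address: no per-call allocation. */
	assert(demo_am_handler() == demo_am_handler());
}
```

An extension index AM that still allocates its struct on each call would, per the commit message, keep working but risk memory leaks and be slower than necessary.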
Masahiko Sawada	736f754eed	Add dead items memory usage to VACUUM (VERBOSE) and autovacuum logs. This commit adds the total memory allocated during vacuum, the number of times the dead items storage was reset, and the configured memory limit. This helps users understand how much memory VACUUM required, and such information can be used to avoid multiple index scans. Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CAHza6qcPitBCkyiKJosDTt3bmxMvzZOTONoebwCkBZrr3rk65Q%40mail.gmail.com	2025-12-30 13:12:10 -08:00
Masahiko Sawada	2a5225b99d	Fix a race condition in updating procArray->replication_slot_xmin. Previously, ReplicationSlotsComputeRequiredXmin() computed the oldest xmin across all slots without holding ProcArrayLock (when already_locked is false), acquiring the lock just before updating the replication slot xmin. This could lead to a race condition: if a backend creates a new slot and updates the global replication slot xmin, another backend concurrently running ReplicationSlotsComputeRequiredXmin() could overwrite that update with an invalid or stale value. This happens because the concurrent backend might have computed the aggregate xmin before the new slot was accounted for, but applied the update after the new slot had already updated the global value. In the reported failure, a walsender for an apply worker computed InvalidTransactionId as the oldest xmin and overwrote a valid replication slot xmin value computed by a walsender for a tablesync worker. Consequently, the tablesync worker computed a transaction ID via GetOldestSafeDecodingTransactionId() effectively without considering the replication slot xmin. This led to the error "cannot build an initial slot snapshot as oldest safe xid %u follows snapshot's xmin %u", which was an assertion failure prior to commit `240e0dbacd`. To fix this, we acquire ReplicationSlotControlLock in exclusive mode during slot creation to perform the initial update of the slot xmin. In ReplicationSlotsComputeRequiredXmin(), we hold ReplicationSlotControlLock in shared mode until the global slot xmin is updated in ProcArraySetReplicationSlotXmin(). This prevents concurrent computations and updates of the global xmin by other backends during the initial slot xmin update process, while still permitting concurrent calls to ReplicationSlotsComputeRequiredXmin(). Backpatch to all supported versions. 
Author: Zhijie Hou <houzj.fnst@fujitsu.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Pradeep Kumar <spradeepkumar29@gmail.com> Reviewed-by: Hayato Kuroda (Fujitsu) <kuroda.hayato@fujitsu.com> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CAA4eK1L8wYcyTPxNzPGkhuO52WBGoOZbT0A73Le=ZUWYAYmdfw@mail.gmail.com Backpatch-through: 14	2025-12-30 10:56:30 -08:00
Michael Paquier	ffdcc9c638	Fix comment in lsyscache.c Author: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CAEoWx2miv0KGcM9j29ANRN45-Vz-2qAqrM0cv9OtaLx8e_WCMQ@mail.gmail.com	2025-12-30 16:42:21 +09:00
Thomas Munro	1a28b4b455	jit: Drop redundant LLVM configure probes. We currently require LLVM 14, so these probes for LLVM 9 functions always succeeded. Even when the features aren't enabled in an LLVM build, dummy functions are defined (a problem for a later commit). The whole PGAC_CHECK_LLVM_FUNCTIONS macro and Meson equivalent are removed, because we switched to testing LLVM_VERSION_MAJOR at compile time in subsequent work and these were the last holdouts. That suits the nature of LLVM API evolution better, and also allows for strictly mechanical pruning in future commits like `820b5af7` and `972c2cd2`. They advanced the minimum LLVM version but failed to spot these. Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/CA%2BhUKGJgB6gvrdDohgwLfCwzVQm%3DVMtb9m0vzQn%3DCwWn-kwG9w%40mail.gmail.com	2025-12-30 20:24:42 +13:00
Michael Paquier	97b101776c	Add pg_get_multixact_stats() This new function exposes at SQL level some information related to multixacts, not available until now. This data is useful for monitoring purposes, especially for workloads that make heavy use of multixacts: - num_mxids, number of MultiXact IDs in use. - num_members, number of member entries in use. - members_size, bytes used by num_members in pg_multixact/members/. - oldest_multixact: oldest MultiXact still needed. This patch was originally proposed when MultiXactOffset was still 32 bits, to monitor wraparound. This part is not relevant anymore since `bd8d9c9bdf`, which widened MultiXactOffset to 64 bits. The monitoring of disk space usage for the members is still relevant. Some tests are added to check this function, in the shape of one isolation test with concurrent transactions that take a ROW SHARE lock, and some SQL tests for pg_read_all_stats. Some documentation is added to explain some patterns that can come from the information provided by the function. Bump catalog version. Author: Naga Appani <nagnrik@gmail.com> Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Atsushi Torikoshi <torikoshia@oss.nttdata.com> Discussion: https://postgr.es/m/CA+QeY+AAsYK6WvBW4qYzHz4bahHycDAY_q5ECmHkEV_eB9ckzg@mail.gmail.com	2025-12-30 15:38:50 +09:00
Michael Paquier	0e3ad4b96a	Add MultiXactOffsetStorageSize() to multixact_internal.h This function calculates in bytes the storage taken between two multixact offsets. This will be used in an upcoming patch, introduced separately here as this piece can be useful on its own. Author: Naga Appani <nagnrik@gmail.com> Co-authored-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/aUyTvZMq2CLgNEB4@paquier.xyz	2025-12-30 14:13:40 +09:00
Michael Paquier	9cf746a453	Change GetMultiXactInfo() to return the next multixact offset This routine returned a number of members as a MultiXactOffset, calculated based on the difference between the next-to-be-assigned offset and the oldest offset. However, this number is not actually an offset but a number. This type confusion comes from the original implementation of MultiXactMemberFreezeThreshold(), in `53bb309d2d`. The number of members is now defined as a uint64, large enough for MultiXactOffset. This change will be used in a follow-up patch. Reviewed-by: Naga Appani <nagnrik@gmail.com> Discussion: https://postgr.es/m/aUyTvZMq2CLgNEB4@paquier.xyz	2025-12-30 14:03:49 +09:00
Thomas Munro	7da9d8f2db	jit: Remove -Wno-deprecated-declarations in 18+. REL_18_STABLE and master have commit `ee485912`, so they always use the newer LLVM opaque pointer functions. Drop -Wno-deprecated-declarations (commit `a56e7b660`) for code under jit/llvm in those branches, to catch any new deprecation warnings that arrive in future version of LLVM. Older branches continued to use functions marked deprecated in LLVM 14 and 15 (ie switched to the newer functions only for LLVM 16+), as a precaution against unforeseen compatibility problems with bitcode already shipped. In those branches, the comment about warning suppression is updated to explain that situation better. In theory we could suppress warnings only for LLVM 14 and 15 specifically, but that isn't done here. Backpatch-through: 14 Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/1407185.1766682319%40sss.pgh.pa.us	2025-12-30 14:34:32 +13:00
Tom Lane	bd3e3e9e56	Ensure sanity of hash-join costing when there are no MCV statistics. estimate_hash_bucket_stats is defined to return zero to mcv_freq if it cannot obtain a value for the frequency of the most common value. Its sole caller final_cost_hashjoin ignored this provision and would blindly believe the zero value, resulting in computing zero for the largest bucket size. In consequence, the safety check that intended to prevent the largest bucket from exceeding get_hash_memory_limit() was ineffective, allowing very silly plans to be chosen if statistics were missing. After fixing final_cost_hashjoin to disregard zero results for mcv_freq, a second problem appeared: some cases that should use hash joins failed to. This is because estimate_hash_bucket_stats was unaware of the fact that ANALYZE won't store MCV statistics if it doesn't find any multiply-occurring values. Thus the lack of an MCV stats entry doesn't necessarily mean that we know nothing; we may well know that the column is unique. The former coding returned zero for mcv_freq in this case, which was pretty close to correct, but now final_cost_hashjoin doesn't believe it and disables the hash join. So check to see if there is a HISTOGRAM stats entry; if so, ANALYZE has in fact run for this column and must have found it to be unique. In that case report the MCV frequency as 1 / rows, instead of claiming ignorance. Reporting a more accurate *mcv_freq in this case can also affect the bucket-size skew adjustment further down in estimate_hash_bucket_stats, causing hash-join cost estimates to change slightly. This affects some plan choices in the core regression tests. The first diff in join.out corresponds to a case where we have no stats and should not risk a hash join, but the remaining changes are caused by producing a better bucket-size estimate for unique join columns. Those are all harmless changes so far as I can tell. The existing behavior was introduced in commit `4867d7f62` in v11. 
It appears from the commit log that disabling the bucket-size safety check in the absence of statistics was intentional; but we've now seen a case where the ensuing behavior is bad enough to make that seem like a poor decision. In any case the lack of other problems with that safety check after several years helps to justify enforcing it more strictly. However, we won't risk back-patching this, in case any applications are depending on the existing behavior. Bug: #19363 Reported-by: Jinhui Lai <jinhui.lai@qq.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/2380165.1766871097@sss.pgh.pa.us Discussion: https://postgr.es/m/19363-8dd32fc7600a1153@postgresql.org	2025-12-29 13:01:27 -05:00
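The fallback rule described in `bd3e3e9e56` can be sketched as a small decision function. This is an illustration under stated assumptions, not the code in estimate_hash_bucket_stats: `demo_mcv_freq` and its parameters are hypothetical, and a return value of zero is used here to mean "unknown", matching the convention the commit message describes.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical sketch of the rule:
 *  - if an MCV frequency is known, use it;
 *  - otherwise, if a histogram exists, ANALYZE has run and found no
 *    repeated values, so treat the column as unique: 1 / rows;
 *  - otherwise report 0, meaning "unknown", which the caller must not
 *    mistake for a real frequency.
 */
double
demo_mcv_freq(double mcv_freq, bool have_histogram, double rows)
{
	assert(mcv_freq >= 0.0);

	if (mcv_freq > 0.0)
		return mcv_freq;
	if (have_histogram && rows > 0.0)
		return 1.0 / rows;
	return 0.0;
}
```

Under this sketch, a column with a histogram and 4 rows but no MCV entry reports a frequency of `1.0 / 4.0`, while a column with no statistics at all still reports zero and leaves the caller to apply its safety check.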
Tom Lane	cb77bc0442	Further stabilize a postgres_fdw test case. This patch causes one postgres_fdw test case to revert to the plan it used before `aa86129e1`, i.e., using a remote sort in preference to local sort. That decision is actually a coin-flip because cost_sort() will give the same answer on both sides, so that the plan choice comes down to little more than roundoff error. In consequence, the test output can change as a result of even minor changes in nearby costs, as we saw in `aa86129e1` (compare also `b690e5fac` and `4b14e1871`). b690e5fac's solution to stabilizing the adjacent test case was to disable sorting locally, and here we extend that to the currently- problematic case. Without this, the following patch would cause this plan choice to change back in this same way, for even less apparent reason. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/2551253.1766952956@sss.pgh.pa.us	2025-12-29 12:53:49 -05:00
Thomas Munro	45d92b76dc	Fix Mkvcbuild.pm builds of test_cloexec.c. Mkvcbuild.pm scrapes Makefile contents, but couldn't understand the change made by commit `bec2a0aa`. Revealed by BF animal hamerkop in branch REL_16_STABLE. 1. It used += instead of =, which didn't match the pattern that Mkvcbuild.pm looks for. Drop the +. 2. Mkvcbuild.pm doesn't link PROGRAM executables with libpgport. Apply a local workaround to REL_16_STABLE only (later branches dropped Mkvcbuild.pm). Backpatch-through: 16 Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/175163.1766357334%40sss.pgh.pa.us	2025-12-29 15:47:31 +13:00
Richard Guo	559f9e90db	Ignore PlaceHolderVars when looking up statistics When looking up statistical data about an expression, we failed to look through PlaceHolderVar nodes, treating them as opaque. This could prevent us from matching an expression to base columns, index expressions, or extended statistics, as examine_variable() relies on strict structural matching. As a result, queries involving PlaceHolderVar nodes often fell back to default selectivity estimates, potentially leading to poor plan choices. This patch updates examine_variable() to strip PlaceHolderVars before analysis. This is safe during estimation because PlaceHolderVars are transparent for the purpose of statistics lookup: they do not alter the value distribution of the underlying expression. To minimize performance overhead on this hot path, a lightweight walker first checks for the presence of PlaceHolderVars. The more expensive mutator is invoked only when necessary. There is one ensuing plan change in the regression tests, which is expected and demonstrates the fix: the rowcount estimate becomes much more accurate with this patch. Back-patch to v18. Although this issue exists before that, changes in this version made it common enough to notice. Given the lack of field reports for older versions, I am not back-patching further. Reported-by: Haowu Ge <gehaowu@bitmoe.com> Author: Richard Guo <guofenglinux@gmail.com> Discussion: https://postgr.es/m/62af586c-c270-44f3-9c5e-02c81d537e3d.gehaowu@bitmoe.com Backpatch-through: 18	2025-12-29 11:40:45 +09:00
Richard Guo	ad66f705fa	Strip PlaceHolderVars from index operands When pulling up a subquery, we may need to wrap its targetlist items in PlaceHolderVars to enforce separate identity or as a result of outer joins. However, this causes any upper-level WHERE clauses referencing these outputs to contain PlaceHolderVars, which prevents indxpath.c from recognizing that they could be matched to index columns or index expressions, potentially affecting the planner's ability to use indexes. To fix, explicitly strip PlaceHolderVars from index operands. A PlaceHolderVar appearing in a relation-scan-level expression is effectively a no-op. Nevertheless, to play it safe, we strip only PlaceHolderVars that are not marked nullable. The stripping is performed recursively to handle cases where PlaceHolderVars are nested or interleaved with other node types. To minimize performance impact, we first use a lightweight walker to check for the presence of strippable PlaceHolderVars. The expensive mutator is invoked only if a candidate is found, avoiding unnecessary memory allocation and tree copying in the common case where no PlaceHolderVars are present. Back-patch to v18. Although this issue exists before that, changes in this version made it common enough to notice. Given the lack of field reports for older versions, I am not back-patching further. Reported-by: Haowu Ge <gehaowu@bitmoe.com> Author: Richard Guo <guofenglinux@gmail.com> Discussion: https://postgr.es/m/62af586c-c270-44f3-9c5e-02c81d537e3d.gehaowu@bitmoe.com Backpatch-through: 18	2025-12-29 11:38:49 +09:00
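Both `559f9e90db` and `ad66f705fa` describe the same two-phase approach: a cheap read-only walker first, and a copying mutator only when the walker finds something. The sketch below shows that shape on a toy expression tree. All types and functions are hypothetical, and it deliberately ignores the "not marked nullable" condition that the real index-operand stripping applies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

typedef enum { DEMO_VAR, DEMO_PHV, DEMO_OP } DemoKind;

/* Toy expression node: a PHV wraps 'left'; an OP has 'left' and 'right'. */
typedef struct DemoNode
{
	DemoKind	kind;
	struct DemoNode *left;
	struct DemoNode *right;
} DemoNode;

/* Cheap phase: read-only walk, no allocation. */
bool
demo_contains_phv(const DemoNode *n)
{
	if (n == NULL)
		return false;
	if (n->kind == DEMO_PHV)
		return true;
	return demo_contains_phv(n->left) || demo_contains_phv(n->right);
}

/* Expensive phase: build a copy of the tree with PHV wrappers removed. */
DemoNode *
demo_strip_phv_mutator(const DemoNode *n)
{
	DemoNode   *copy;

	if (n == NULL)
		return NULL;
	if (n->kind == DEMO_PHV)
	{
		assert(n->left != NULL);	/* a PHV must wrap an expression */
		return demo_strip_phv_mutator(n->left);
	}
	copy = malloc(sizeof(DemoNode));
	if (copy == NULL)
		abort();
	copy->kind = n->kind;
	copy->left = demo_strip_phv_mutator(n->left);
	copy->right = demo_strip_phv_mutator(n->right);
	return copy;
}

/* Entry point: return the input untouched in the common PHV-free case. */
const DemoNode *
demo_strip_phv(const DemoNode *n)
{
	if (!demo_contains_phv(n))
		return n;
	return demo_strip_phv_mutator(n);
}
```

In the common case with no wrapper nodes, `demo_strip_phv` returns the original pointer, so no memory is allocated and no tree is copied.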
Peter Eisentraut	b7057e4346	Change some Datum to void * for opaque pass-through pointer Here, Datum was used to pass around an opaque pointer between a group of functions. But one might as well use void * for that; the use of Datum doesn't achieve anything here and is just distracting. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/1c5d23cb-288b-4154-b1cd-191fe2301707%40eisentraut.org	2025-12-28 14:34:12 +01:00
Michael Paquier	9adf32da6b	Split some long Makefile lists This change makes more readable code diffs when adding new items or removing old items, while ensuring that lines do not get excessively long. Some SUBDIRS, PROGRAMS and REGRESS lists are split. Note that there are a few more REGRESS lists that could be split, particularly in contrib/. Author: Jelte Fennema-Nio <postgres@jeltef.nl> Co-Authored-By: Jacob Champion <jacob.champion@enterprisedb.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Japin Li <japinli@hotmail.com> Reviewed-by: Man Zeng <zengman@halodbtech.com> Discussion: https://postgr.es/m/DF6HDGB559U5.3MPRFCWPONEAE@jeltef.nl	2025-12-28 09:17:42 +09:00
Daniel Gustafsson	a9123db14a	Fix incorrectly spelled city name The correct spelling is Beijing, fix in regression test and docs. Author: JiaoShuntian <jiaoshuntian@gmail.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://postgr.es/m/ebfa3ec2-dc3c-4adb-be2a-4a882c2e85a7@gmail.com	2025-12-27 23:47:40 +01:00
Peter Eisentraut	b63443718a	Remove MsgType type Presumably, the C type MsgType was meant to hold the protocol message type in the pre-version-3 era, but this was never fully developed even then, and the name is pretty confusing nowadays. It has only one vestigial use for cancel requests that we can get rid of. Since a cancel request is indicated by a special protocol version number, we can use the ProtocolVersion type, which MsgType was based on. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/505e76cb-0ca2-4e22-ba0f-772b5dc3f230%40eisentraut.org	2025-12-27 23:46:28 +01:00
Daniel Gustafsson	ec0da9b893	Add oauth_validator_libraries to variable_is_guc_list_quote The variable_is_guc_list_quote function needs to know about all GUC_QUOTE variables; this adds oauth_validator_libraries, which was missing. Backpatch to v18 where OAuth was introduced. Author: ChangAo Chen <cca5507@qq.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://postgr.es/m/tencent_03D4D2A5C0C8DCE0CD1DB4D945858E15420A@qq.com Backpatch-through: 18	2025-12-27 23:05:48 +01:00
Michael Paquier	36b8f4974a	Fix pg_stat_get_backend_activity() to use multi-byte truncated result pg_stat_get_backend_activity() calls pgstat_clip_activity() to ensure that the reported query string is correctly truncated when it finishes with an incomplete multi-byte sequence. However, the result returned by the function was not what pgstat_clip_activity() generated, but the non-truncated, original, contents from PgBackendStatus.st_activity_raw. Oversight in `54b6cd589a`, so backpatch all the way down. Author: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CAEoWx2mDzwc48q2EK9tSXS6iJMJ35wvxNQnHX+rXjy5VgLvJQw@mail.gmail.com Backpatch-through: 14	2025-12-27 17:23:30 +09:00
Bruce Momjian	e82e9aaa6a	doc: warn about the use of "ctid" queries beyond the examples Also be more assertive that "ctid" should not be used for long-term storage. Reported-by: Bernice Southey Discussion: https://postgr.es/m/CAEDh4nyn5swFYuSfcnGAbpQrKOc47Hh_ZyKVSPYJcu2P=51Luw@mail.gmail.com Backpatch-through: 17	2025-12-26 17:34:17 -05:00
Michael Paquier	f8a4cad8f4	doc: Remove duplicate word in ECPG description Author: Laurenz Albe <laurenz.albe@cybertec.at> Reviewed-by: vignesh C <vignesh21@gmail.com> Discussion: https://postgr.es/m/d6d6a800f8b503cd78d5f4fa721198e40eec1677.camel@cybertec.at Backpatch-through: 14	2025-12-26 15:25:46 +09:00
Michael Paquier	bde3a46160	Upgrade BufFile to use int64 for byte positions This change has the advantage of removing some weird type casts, caused by offset calculations based on pgoff_t but saved as int (on older branches we use off_t, which could be 4 or 8 bytes depending on the environment). These are safe currently because capped by MAX_PHYSICAL_FILESIZE, but we would run into problems if we were to make MAX_PHYSICAL_FILESIZE larger or allow callers of these routines to use a larger physical max size on demand. While on it, this improves BufFileDumpBuffer() so that we do not use an offset for "availbytes". It is not a file offset per se, but a number of available bytes. This change should lead to no functional changes. Author: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/aUStrqoOCDRFAq1M@paquier.xyz	2025-12-26 08:41:56 +09:00
Michael Paquier	eee19a30d6	Fix typo in stat_utils.c Introduced by `213a1b8952`. Reported-by: Tender Wang <tndrwang@gmail.com> Discussion: https://postgr.es/m/CAHewXNku-jz-FPKeJVk25fZ1pV2buYh5vpeqGDOB=bFQhKxXhw@mail.gmail.com	2025-12-26 07:53:46 +09:00
Michael Paquier	213a1b8952	Move attribute statistics functions to stat_utils.c Many of the operations done for attribute stats in attribute_stats.c share the same logic as extended stats, as done by a patch under discussion to add support for extended stats import and export. All the pieces necessary for extended statistics are moved to stat_utils.c, which is the file where common facilities are shared for stats files. The following renames are done: * get_attr_stat_type() -> statatt_get_type() * init_empty_stats_tuple() -> statatt_init_empty_tuple() * set_stats_slot() -> statatt_set_slot() * get_elem_stat_type() -> statatt_get_elem_type() While on it, this commit adds more documentation for all these functions, describing their internals in more detail and the dependencies that have been implied for attribute statistics. The same concepts apply to extended statistics, to some degree. Author: Corey Huinker <corey.huinker@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Yu Wang <wangyu_runtime@163.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CADkLM=dpz3KFnqP-dgJ-zvRvtjsa8UZv8wDAQdqho=qN3kX0Zg@mail.gmail.com	2025-12-25 15:13:39 +09:00
Richard Guo	325808cac9	Fix planner error with SRFs and grouping sets If there are any SRFs in a PathTarget, we must separate it into SRF-computing and SRF-free targets. This is because the executor can only handle SRFs that appear at the top level of the targetlist of a ProjectSet plan node. If we find a subexpression that matches an expression already computed in the previous plan level, we should treat it like a Var and should not split it again. setrefs.c will later replace the expression with a Var referencing the subplan output. However, when processing the grouping target for grouping sets, the planner can fail to recognize that an expression is already computed in the scan/join phase. The root cause is a mismatch in the nullingrels bits. Expressions in the grouping target carry the grouping nulling bit in their nullingrels to indicate that they can be nulled by the grouping step. However, the corresponding expressions in the scan/join target do not have these bits. As a result, the exact match check in list_member() fails, leading the planner to incorrectly believe that the expression needs to be re-evaluated from its arguments, which are often not available in the subplan. This can lead to planner errors such as "variable not found in subplan target list". To fix, ignore the grouping nulling bit when checking whether an expression from the grouping target is available in the pre-grouping input target. This aligns with the matching logic in setrefs.c. Backpatch to v18, where this issue was introduced. Bug: #19353 Reported-by: Marian MULLER REBEYROL <marian.muller@serli.com> Author: Richard Guo <guofenglinux@gmail.com> Reviewed-by: Tender Wang <tndrwang@gmail.com> Discussion: https://postgr.es/m/19353-aaa179bba986a19b@postgresql.org Backpatch-through: 18	2025-12-25 12:12:52 +09:00
Masahiko Sawada	c5c808f9b3	psql: Fix tab completion for VACUUM option values. Commit `8a3e4011` introduced tab completion for the ONLY option of VACUUM and ANALYZE, along with some code simplification using MatchAnyN. However, it caused a regression in tab completion for VACUUM option values. For example, neither ON nor OFF was suggested after "VACUUM (VERBOSE". In addition, the ONLY keyword was not suggested immediately after a completed option list. Backpatch to v18. Author: Yugo Nagata <nagata@sraoss.co.jp> Discussion: https://postgr.es/m/20251223021509.19bba68ecbbc70c9f983c2b4@sraoss.co.jp Backpatch-through: 18	2025-12-24 13:55:29 -08:00
Bruce Momjian	41808377fe	doc: change "can not" to "cannot" Reported-by: Chao Li Author: Chao Li Discussion: https://postgr.es/m/CAEoWx2kyiD+7-vUoOYhH=y2Hrmvqyyhm4EhzgKyrxGBXOMWCxw@mail.gmail.com	2025-12-24 15:12:01 -05:00
Masahiko Sawada	0de5f0d869	Fix regression test failure when wal_level is set to minimal. Commit 67c209 removed the WARNING for insufficient wal_level from the expected output, but the WARNING may still appear on buildfarm members that run with wal_level=minimal. To avoid unstable test output depending on wal_level, this commit changes the test to use ALTER PUBLICATION for verifying the same behavior, ensuring the output remains consistent regardless of the wal_level setting. Per buildfarm member thorntail. Author: Zhijie Hou <houzj.fnst@fujitsu.com> Discussion: https://postgr.es/m/TY4PR01MB16907680E27BAB146C8EB1A4294B2A@TY4PR01MB16907.jpnprd01.prod.outlook.com	2025-12-24 10:48:27 -08:00
Fujii Masao	008beba005	doc: Use proper tags in pg_overexplain documentation. The pg_overexplain documentation previously used the <literal> tag for some file names, struct names, and commands. Update the markup to use the more appropriate tags: <filename>, <structname>, and <command>. Backpatch to v18, where pg_overexplain was introduced. Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Shixin Wang <wang-shi-xin@outlook.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CAHGQGwEyYUzz0LjBV_fMcdwU3wgmu0NCoT+JJiozPa8DG6eeog@mail.gmail.com Backpatch-through: 18	2025-12-25 00:27:19 +09:00
Fujii Masao	d1b35952da	Fix CREATE SUBSCRIPTION failure when the publisher runs on pre-PG19. CREATE SUBSCRIPTION with copy_data=true and origin='none' previously failed when the publisher was running a version earlier than PostgreSQL 19, even though this combination should be supported. The failure occurred because the command issued a query calling pg_get_publication_sequences function on the publisher. That function does not exist before PG19 and the query is only needed for logical replication sequence synchronization, which is supported starting in PG19. This commit fixes this issue by skipping that query when the publisher runs a version earlier than PG19. Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Shlok Kyal <shlok.kyal.oss@gmail.com> Discussion: https://postgr.es/m/CAHGQGwEx4twHtJdiPWTyAXJhcBPLaH467SH2ajGSe-41m65giA@mail.gmail.com	2025-12-24 23:45:19 +09:00
Fujii Masao	5e813edb55	Fix version check for retain_dead_tuples subscription option. The retain_dead_tuples subscription option is supported only when the publisher runs PostgreSQL 19 or later. However, it could previously be enabled even when the publisher was running an earlier version. This was caused by check_pub_dead_tuple_retention() comparing the publisher server version against 19000 instead of 190000. Fix this typo so that the version check correctly enforces the PG19+ requirement. Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Shlok Kyal <shlok.kyal.oss@gmail.com> Discussion: https://postgr.es/m/CAHGQGwEx4twHtJdiPWTyAXJhcBPLaH467SH2ajGSe-41m65giA@mail.gmail.com	2025-12-24 23:25:00 +09:00
Amit Kapila	98e8fe57c2	Update comments to reflect changes in `8e0d32a4a1`. Commit `8e0d32a4a1` fixed an issue by allowing the replication origin to be created while marking the table sync state as SUBREL_STATE_DATASYNC. Update the comment in check_old_cluster_subscription_state() to accurately describe this corrected behavior. Author: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Backpatch-through: 17, where the code was introduced Discussion: https://postgr.es/m/CAA4eK1+KaSf5nV_tWy+SDGV6MnFnKMhdt41jJjSDWm6yCyOcTw@mail.gmail.com Discussion: https://postgr.es/m/aUTekQTg4OYnw-Co@paquier.xyz	2025-12-24 10:29:53 +00:00
Amit Kapila	dc6c879455	Doc: Clarify publication privilege requirements. Update the logical replication documentation to explicitly outline the privilege requirements for each publication syntax. This will ensure users understand the necessary permissions when creating or managing publications. Author: Shlok Kyal <shlok.kyal.oss@gmail.com> Reviewed-by: Peter Smith <smithpb2250@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: David G. Johnston <david.g.johnston@gmail.com> Discussion: https://postgr.es/m/CANhcyEXODen4U0XLk0aAwFTwGxjAfE9eRaynREenLp-JBSaFHw@mail.gmail.com	2025-12-24 09:22:00 +00:00
Richard Guo	c8d2f68cc8	Teach expr_is_nonnullable() to handle more expression types Currently, the function expr_is_nonnullable() checks only Const and Var expressions to determine if an expression is non-nullable. This patch extends the detection logic to handle more expression types. This can enable several downstream optimizations, such as reducing NullTest quals to constant truth values (e.g., "COALESCE(var, 1) IS NULL" becomes FALSE) and converting "COUNT(expr)" to the more efficient "COUNT(*)" when the expression is proven non-nullable. This breaks a test case in test_predtest.sql, since we now simplify "ARRAY[] IS NULL" to constant FALSE, preventing it from weakly refuting a strict ScalarArrayOpExpr ("x = any(ARRAY[])"). To ensure the refutation logic is still exercised as intended, wrap the array argument in opaque_array(). Author: Richard Guo <guofenglinux@gmail.com> Reviewed-by: Tender Wang <tndrwang@gmail.com> Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com> Discussion: https://postgr.es/m/CAMbWs49UhPBjm+NRpxerjaeuFKyUZJ_AjM3NBcSYK2JgZ6VTEQ@mail.gmail.com	2025-12-24 18:00:44 +09:00
Richard Guo	cb7b7ec7a1	Optimize ROW(...) IS [NOT] NULL using non-nullable fields We break ROW(...) IS [NOT] NULL into separate tests on its component fields. During this breakdown, we can improve efficiency by utilizing expr_is_nonnullable() to detect fields that are provably non-nullable. If a component field is proven non-nullable, it affects the outcome based on the test type. For an IS NULL test, a single non-nullable field refutes the whole NullTest, reducing it to constant FALSE. For an IS NOT NULL test, the check for that specific field is guaranteed to succeed, so we can discard it from the list of component tests. This extends the existing optimization logic, which previously only handled Const fields, to support any expression that can be proven non-nullable. In passing, update the existing constant folding of NullTests to use expr_is_nonnullable() instead of var_is_nonnullable(), enabling it to benefit from future improvements to that function. Author: Richard Guo <guofenglinux@gmail.com> Reviewed-by: Tender Wang <tndrwang@gmail.com> Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com> Discussion: https://postgr.es/m/CAMbWs49UhPBjm+NRpxerjaeuFKyUZJ_AjM3NBcSYK2JgZ6VTEQ@mail.gmail.com	2025-12-24 18:00:02 +09:00
Richard Guo	10c4fe074a	Simplify COALESCE expressions using non-nullable arguments The COALESCE function returns the first of its arguments that is not null. When an argument is proven non-null, if it is the first non-null-constant argument, the entire COALESCE expression can be replaced by that argument. If it is a subsequent argument, all following arguments can be dropped, since they will never be reached. Currently, we perform this simplification only for Const arguments. This patch extends the simplification to support any expression that can be proven non-nullable. This can help avoid the overhead of evaluating unreachable arguments. It can also lead to better plans when the first argument is proven non-nullable and replaces the expression, as the planner no longer has to treat the expression as non-strict, and can also leverage index scans on the resulting expression. There is an ensuing plan change in generated_virtual.out, and we have to modify the test to ensure that it continues to test what it is intended to. Author: Richard Guo <guofenglinux@gmail.com> Reviewed-by: Tender Wang <tndrwang@gmail.com> Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com> Discussion: https://postgr.es/m/CAMbWs49UhPBjm+NRpxerjaeuFKyUZJ_AjM3NBcSYK2JgZ6VTEQ@mail.gmail.com	2025-12-24 17:58:49 +09:00
Michael Paquier	b947dd5c75	Improve comment in pgstatfuncs.c Author: Zizhen Qiao <zizhen_qiao@163.com> Discussion: https://postgr.es/m/5ee635f9.49f7.19b4ed9e803.Coremail.zizhen_qiao@163.com	2025-12-24 17:09:13 +09:00
Amit Kapila	1528b0d899	Don't advance origin during apply failure. The logical replication parallel apply worker could incorrectly advance the origin progress during an error or failed apply. This behavior risks transaction loss because such transactions will not be resent by the server. Commit `3f28b2fcac` addressed a similar issue for both the apply worker and the table sync worker by registering a before_shmem_exit callback to reset origin information. This prevents the worker from advancing the origin during transaction abortion on shutdown. This patch applies the same fix to the parallel apply worker, ensuring consistent behavior across all worker types. As with `3f28b2fcac`, we are backpatching through version 16, since parallel apply mode was introduced there and the issue only occurs when changes are applied before the transaction end record (COMMIT or ABORT) is received. Author: Hou Zhijie <houzj.fnst@fujitsu.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Backpatch-through: 16 Discussion: https://postgr.es/m/TY4PR01MB169078771FB31B395AB496A6B94B4A@TY4PR01MB16907.jpnprd01.prod.outlook.com Discussion: https://postgr.es/m/TYAPR01MB5692FAC23BE40C69DA8ED4AFF5B92@TYAPR01MB5692.jpnprd01.prod.outlook.com	2025-12-24 04:36:39 +00:00
Tom Lane	9f7565c6c2	Fix another case of indirectly casting away const. This one was missed in `8f1791c61`, because the machines that detected those issues don't compile this function. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/1324889.1764886170@sss.pgh.pa.us	2025-12-23 21:38:43 -05:00
Bruce Momjian	2214a207ee	C comment: fix psql "pstdout" duplicate to "pstdin" Reported-by: Ignat Remizov Author: Ignat Remizov Discussion: https://postgr.es/m/CAKiC8XbbR2_YqmbxmYWuEA+MmWP3c=obV5xS1Hye3ZHS-Ss_DA@mail.gmail.com	2025-12-23 20:36:11 -05:00
Masahiko Sawada	55c46bbf3a	pg_visibility: Use visibilitymap_count instead of loop. This commit updates pg_visibility_map_summary() to use the visibilitymap_count() API, replacing its own counting mechanism. This simplifies the function and improves performance by leveraging the vectorized implementation introduced in commit `41c51f0c68`. Author: Matthias van de Meent <boekewurm+postgres@gmail.com> Reviewed-by: wenhui qiu <qiuwenhuifx@gmail.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Discussion: https://postgr.es/m/CAEze2WgPu-EYYuYQimy=AHQHGa7w8EvLVve5DM5eGMR6zh-7sw@mail.gmail.com	2025-12-23 10:33:06 -08:00
Masahiko Sawada	67c20979ce	Toggle logical decoding dynamically based on logical slot presence. Previously logical decoding required wal_level to be set to 'logical' at server start. This meant that users had to incur the overhead of logical-level WAL logging even when no logical replication slots were in use. This commit adds functionality to automatically control logical decoding availability based on logical replication slot presence. The newly introduced module logicalctl.c allows logical decoding to be dynamically activated when needed when wal_level is set to 'replica'. When the first logical replication slot is created, the system automatically increases the effective WAL level to maintain logical-level WAL records. Conversely, after the last logical slot is dropped or invalidated, it decreases back to 'replica' WAL level. While activation occurs synchronously right after creating the first logical slot, deactivation happens asynchronously through the checkpointer process. This design avoids a race condition at the end of recovery; a concurrent deactivation could happen while the startup process enables logical decoding at the end of recovery, but WAL writes are still not permitted until recovery fully completes. The checkpointer will handle it after recovery is done. Asynchronous deactivation also avoids excessive toggling of the logical decoding status in workloads that repeatedly create and drop a single logical slot. On the other hand, this lazy approach can delay changes to effective_wal_level and the disabling of logical decoding, especially when the checkpointer is busy with other tasks. We chose this lazy approach in all deactivation paths to keep the implementation simple, even though laziness is strictly required only for end-of-recovery cases. 
Future work might address this limitation either by using a dedicated worker instead of the checkpointer, or by implementing synchronous waiting during slot drops if workloads are significantly affected by the lazy deactivation of logical decoding. The effective WAL level, determined internally by XLogLogicalInfo, is allowed to change within a transaction until an XID is assigned. Once an XID is assigned, the value becomes fixed for the remainder of the transaction. This behavior ensures that the logging mode remains consistent within a writing transaction, similar to the behavior of GUC parameters. A new read-only GUC parameter effective_wal_level is introduced to monitor the actual WAL level in effect. This parameter reflects the current operational WAL level, which may differ from the configured wal_level setting. Bump PG_CONTROL_VERSION as it adds a new field to CheckPoint struct. Reviewed-by: Shveta Malik <shveta.malik@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Reviewed-by: Peter Smith <smithpb2250@gmail.com> Reviewed-by: Shlok Kyal <shlok.kyal.oss@gmail.com> Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Discussion: https://postgr.es/m/CAD21AoCVLeLYq09pQPaWs+Jwdni5FuJ8v2jgq-u9_uFbcp6UbA@mail.gmail.com	2025-12-23 10:13:16 -08:00
Heikki Linnakangas	955f550686	Fix bug in following update chain when locking a heap tuple After waiting for a concurrent updater to finish, heap_lock_tuple() followed the update chain to lock all tuple versions. However, when stepping from the initial tuple to the next one, it failed to check that the next tuple's XMIN matches the initial tuple's XMAX. That's an important check whenever following an update chain, and the recursive part that follows the chain did it, but the initial step missed it. Without the check, if the updating transaction aborts, the updated tuple is vacuumed away and replaced by an unrelated tuple, the unrelated tuple might get incorrectly locked. Author: Jasper Smit <jasper.smit@servicenow.com> Discussion: https://www.postgresql.org/message-id/CAOG+RQ74x0q=kgBBQ=mezuvOeZBfSxM1qu_o0V28bwDz3dHxLw@mail.gmail.com Backpatch-through: 14	2025-12-23 13:37:16 +02:00
Michael Paquier	8e0d32a4a1	Fix orphaned origin in shared memory after DROP SUBSCRIPTION Since `ce0fdbfe97`, a replication slot and an origin are created by each tablesync worker, whose information is stored in both a catalog and shared memory (once the origin is set up in the latter case). The transaction where the origin is created is the same as the one that runs the initial COPY, with the catalog state of the origin becoming visible for other sessions only once the COPY transaction has committed. The catalog state is coupled with a state in shared memory, initialized at the same time as the origin created in the catalogs. Note that the transaction doing the initial data sync can take a long time, time that depends on the amount of data to transfer from a publication node to its subscriber node. Now, when a DROP SUBSCRIPTION is executed, all its workers are stopped with the origins removed. The removal of each origin relies on a catalog lookup. A worker still running the initial COPY would fail its transaction, with the catalog state of the origin rolled back while the shared memory state remains around. The session running the DROP SUBSCRIPTION should be in charge of cleaning up the catalog and the shared memory state, but as there is no data in the catalogs the shared memory state is not removed. This issue would leave orphaned origin data in shared memory, leading to a confusing state as it would still show up in pg_replication_origin_status. Note that this shared memory data is sticky, being flushed on disk in replorigin_checkpoint at checkpoint. This prevents other origins from reusing a slot position in the shared memory data. To address this problem, the commit moves the creation of the origin at the end of the transaction that precedes the one executing the initial COPY, making the origin immediately visible in the catalogs for other sessions, giving DROP SUBSCRIPTION a way to know about it. 
A different solution would have been to clean up the shared memory state using an abort callback within the tablesync worker. The solution of this commit is more consistent with the apply worker that creates an origin in a short transaction. A test is added in the subscription test 004_sync.pl, which was able to display the problem. The test fails when this commit is reverted. Reported-by: Tenglong Gu <brucegu@amazon.com> Reported-by: Daisuke Higuchi <higudai@amazon.com> Analyzed-by: Michael Paquier <michael@paquier.xyz> Author: Hou Zhijie <houzj.fnst@fujitsu.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Discussion: https://postgr.es/m/aUTekQTg4OYnw-Co@paquier.xyz Backpatch-through: 14	2025-12-23 14:32:14 +09:00
Bruce Momjian	ea97154fc2	doc: add "DO" to "ON CONFLICT" in CREATE VIEW text This is done for consistency. Reported-by: jian he Author: Laurenz Albe Discussion: https://postgr.es/m/CACJufxEW1RRDD9ZWGcW_Np_Z9VGPE-YC7u0C6RcsEY8EKiTdBg@mail.gmail.com	2025-12-22 19:42:11 -05:00
Michael Paquier	e5f3839af6	Switch buffile.c/h to use pgoff_t instead of off_t off_t was previously used for offsets, which is 4 bytes on Windows, hence limiting the backend code to a hard limit for files longer than 2GB. This leads to some simplification in these files, removing some casts based on long, also 4 bytes on Windows. This commit removes one comment introduced in `db3c4c3a2d`, not relevant anymore as pgoff_t is a safe 8-byte alternative on Windows. This change is surprisingly not invasive, as the callers of BufFileTell(), BufFileSeek() and BufFileTruncateFileSet() (worker.c, tuplestore.c, etc.) track offsets in local structures that just need to switch from off_t to pgoff_t for the most part. The file is still relying on a maximum file size of MAX_PHYSICAL_FILESIZE (1GB). This change allows the code to make this maximum potentially larger in the future, or larger on a per-demand basis. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/aUStrqoOCDRFAq1M@paquier.xyz	2025-12-23 07:41:34 +09:00
Masahiko Sawada	c6a7d3bab4	psql: Improve tab completion for COPY option lists. Previously, only the first option in a parenthesized option list was suggested by tab completion. This commit enhances tab completion for both COPY TO and COPY FROM commands to suggest options after each comma. Also add completion for HEADER and FREEZE option value candidates. Author: Yugo Nagata <nagata@sraoss.co.jp> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Discussion: https://postgr.es/m/20250605100835.b396f9d656df1018f65a4556@sraoss.co.jp	2025-12-22 14:28:12 -08:00
Tom Lane	37a1688a1b	Add missing .gitignore for src/test/modules/test_cloexec.	2025-12-22 14:06:54 -05:00
Fujii Masao	c5d162435a	doc: Fix incorrect reference in pg_overexplain documentation. Correct the referenced location of the RangeTblEntry definition in the pg_overexplain documentation. Backpatched to v18, where pg_overexplain was introduced. Author: Julien Tachoires <julien@tachoires.me> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/20251218092319.tht64ffmcvzqdz7u@poseidon.home.virt Backpatch-through: 18	2025-12-22 17:56:28 +09:00
Michael Paquier	d31276f0e2	Fix another typo in gininsert.c Reported-by: Tender Wang <tndrwang@gmail.com> Discussion: https://postgr.es/m/CAHewXNkRJ9DMFZMQKWQ32U+OTBR78KeGh2=9Wy5jEeWDxMVFcQ@mail.gmail.com	2025-12-22 12:38:40 +09:00
Peter Geoghegan	fab5cd3dd1	Remove obsolete name_ops index-only scan comments. nbtree index-only scans of an index that uses btree/name_ops as one of its index column's input opclasses are no longer at any risk of reading past the end of currTuples. We're no longer reliant on such scans being able to at least read from the start of markTuples storage (which uses space from the same allocation as currTuples) to avoid a segfault: StoreIndexTuple (from nodeIndexonlyscan.c) won't actually read past the end of a cstring datum from a name_ops index. In other words, we already have the "special-case treatment for name_ops" that the removed comment supposed we could avoid by relying on markTuples in this way. Oversight in commit `a63224be49`, which added special case handling of name_ops cstrings to StoreIndexTuple, but missed these comments.	2025-12-21 12:27:38 -05:00
Thomas Munro	bec2a0aa30	Clean up test_cloexec.c and Makefile. An unused variable caused a compiler warning on BF animal fairywren, an snprintf() call was redundant, and some buffer sizes were inconsistent. Per code review from Tom Lane. The Makefile's test ifeq ($(PORTNAME), win32) never succeeded due to a circularity, so only Meson builds were actually compiling the new test code, partially explaining why CI didn't tell us about the warning sooner (the other problem being that CompilerWarnings only makes world-bin, a problem for another commit). Simplify. Backpatch-through: 16, like commit `c507ba55` Author: Bryan Green <dbryan.green@gmail.com> Co-authored-by: Thomas Munro <tmunro@gmail.com> Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/1086088.1765593851%40sss.pgh.pa.us	2025-12-21 17:34:20 +13:00
Andres Freund	548de59d93	heapam: Move logic to handle HEAP_MOVED into a helper function Before we dealt with this in 6 near identical and one very similar copy. The helper function errors out when encountering a HEAP_MOVED_IN/HEAP_MOVED_OUT tuple with xvac considered current or in-progress. It'd be preferable to do that change separately, but otherwise it'd not be possible to deduplicate the handling in HeapTupleSatisfiesVacuum(). Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Discussion: https://postgr.es/m/lxzj26ga6ippdeunz6kuncectr5gfuugmm2ry22qu6hcx6oid6@lzx3sjsqhmt6 Discussion: https://postgr.es/m/6rgb2nvhyvnszz4ul3wfzlf5rheb2kkwrglthnna7qhe24onwr@vw27225tkyar	2025-12-19 13:28:34 -05:00
Andres Freund	09ae2c8bac	bufmgr: Optimize & harmonize LockBufHdr(), LWLockWaitListLock() The main optimization is for LockBufHdr() to delay initializing SpinDelayStatus, similar to what LWLockWaitListLock already did. The initialization is sufficiently expensive & buffer header lock acquisitions are sufficiently frequent, to make it worthwhile to instead have a fastpath (via a likely() branch) that does not initialize the SpinDelayStatus. While LWLockWaitListLock() already had the aforementioned optimization, it did not use likely(), and inspection of the assembly shows that this indeed leads to worse code generation (also observed in a microbenchmark). Fix that by adding the likely(). While the LockBufHdr() improvement is a small gain on its own, it mainly is aimed at preventing a regression after a future commit, which requires additional locking to set hint bits. While touching both, also make the comments more similar to each other. Reviewed-by: Heikki Linnakangas <heikki.linnakangas@iki.fi> Discussion: https://postgr.es/m/fvfmkr5kk4nyex56ejgxj3uzi63isfxovp2biecb4bspbjrze7@az2pljabhnff	2025-12-19 13:23:33 -05:00
Bruce Momjian	80f08a6e6a	doc: clarify when physical/logical replication is used The imprecision caused some text to be only partially accurate. Reported-by: Paul A Jungwirth Author: Robert Treat Discussion: https://postgr.es/m/CA%2BrenyULt3VBS1cRFKUfT2%3D5dr61xBOZdAZ-CqX3XLGXqY-aTQ%40mail.gmail.com	2025-12-19 12:01:23 -05:00
Heikki Linnakangas	47a9f61fca	Use proper type for RestoreTransactionSnapshot's PGPROC arg Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/08cbaeb5-aaaf-47b6-9ed8-4f7455b0bc4b@iki.fi	2025-12-19 13:40:02 +02:00
John Naylor	44f49511b7	Update pg_hba.conf example to reflect MD5 deprecation In the wake of commit `db6a4a985`, remove most use of 'md5' from the example configuration file. The only remainder is an example exception for a client that doesn't support SCRAM. Author: Mikael Gustavsson <mikael.gustavsson@smhi.se> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Andreas Karlsson <andreas@proxel.se> Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at> Discussion: https://postgr.es/m/176595607507.978865.11597773194269211255@wrigleys.postgresql.org Discussion: https://postgr.es/m/4ed268473fdb4cf9b0eced6c8019d353@smhi.se Backpatch-through: 18	2025-12-19 15:48:18 +07:00
Michael Paquier	5cdbec5aa9	Fix typos in gininsert.c Introduced by `8492feb98f`. Author: Xingbin She <xingbin.she@qq.com> Discussion: https://postgr.es/m/tencent_C254AE962588605F132DB4A6F87205D6A30A@qq.com	2025-12-19 14:33:38 +09:00
Fujii Masao	b3ccb0a2cb	Add guard to prevent recursive memory context logging. Previously, if memory context logging was triggered repeatedly and rapidly while a previous request was still being processed, it could result in recursive calls to ProcessLogMemoryContextInterrupt(). This could lead to infinite recursion and potentially crash the process. This commit adds a guard to prevent such recursion. If ProcessLogMemoryContextInterrupt() is already in progress and logging memory contexts, subsequent calls will exit immediately, avoiding unintended recursive calls. While this scenario is unlikely in practice, it's not impossible. This change adds a safety check to prevent such failures. Back-patch to v14, where memory context logging was introduced. Reported-by: Robert Haas <robertmhaas@gmail.com> Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Atsushi Torikoshi <torikoshia@oss.nttdata.com> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Reviewed-by: Artem Gavrilov <artem.gavrilov@percona.com> Discussion: https://postgr.es/m/CA+TgmoZMrv32tbNRrFTvF9iWLnTGqbhYSLVcrHGuwZvCtph0NA@mail.gmail.com Backpatch-through: 14	2025-12-19 12:05:37 +09:00
Michael Paquier	9d0f7996e5	Use table/index_close() more consistently All the code paths updated here have been using relation_close() to close a relation that has already been opened with table_open() or index_open(), where a relkind check is enforced. table_close() and index_close() do the same thing as relation_close(), so there was no harm, but being inconsistent could lead to issues if the internals of these close() functions begin to introduce some logic specific to each relkind in the future. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://postgr.es/m/aUKamYGiDKO6byp5@ip-10-97-1-34.eu-west-3.compute.internal	2025-12-19 07:55:58 +09:00
Noah Misch	d49936f302	Sort DO_SUBSCRIPTION_REL dump objects independent of OIDs. Commit `0decd5e89d` missed DO_SUBSCRIPTION_REL, leading to assertion failures. In the unlikely use case of diffing "pg_dump --binary-upgrade" output, spurious diffs were possible. As part of fixing that, align the DumpableObject naming and sort order with DO_PUBLICATION_REL. The overall effect of this commit is to change sort order from (subname, srsubid) to (rel, subname). Since DO_SUBSCRIPTION_REL is only for --binary-upgrade, accept that larger-than-usual dump order change. Back-patch to v17, where commit `9a17be1e24` introduced DO_SUBSCRIPTION_REL. Reported-by: vignesh C <vignesh21@gmail.com> Author: vignesh C <vignesh21@gmail.com> Discussion: https://postgr.es/m/CALDaNm2x3rd7C0_HjUpJFbxpAqXgm=QtoKfkEWDVA8h+JFpa_w@mail.gmail.com Backpatch-through: 17	2025-12-18 10:23:47 -08:00
Heikki Linnakangas	951b60f7ab	Do not emit WAL for unlogged BRIN indexes Operations on unlogged relations should not be WAL-logged. The brin_initialize_empty_new_buffer() function didn't get the memo. The function is only called when a concurrent update to a brin page uses up space that we're just about to insert to, which makes it pretty hard to hit. If you do manage to hit it, a full-page WAL record is erroneously emitted for the unlogged index. If you then crash, crash recovery will fail on that record with an error like this: FATAL: could not create file "base/5/32819": File exists Author: Kirill Reshke <reshkekirill@gmail.com> Discussion: https://www.postgresql.org/message-id/CALdSSPhpZXVFnWjwEBNcySx_vXtXHwB2g99gE6rK0uRJm-3GgQ@mail.gmail.com Backpatch-through: 14	2025-12-18 15:08:48 +02:00
Amit Kapila	b47c50e566	Fix intermittent BF failure in 040_standby_failover_slots_sync. Commit `0d2d4a0ec3` introduced a test that verifies replication slot synchronization to a standby server via SQL API. However, the test did not configure synchronized_standby_slots. Without this setting, logical failover slots can advance beyond the physical replication slot, causing intermittent synchronization failures. Author: Hou Zhijie <houzj.fnst@fujitsu.com> Discussion: https://postgr.es/m/TY4PR01MB16907DF70205308BE918E0D4494ABA@TY4PR01MB16907.jpnprd01.prod.outlook.com	2025-12-18 05:06:55 +00:00
Michael Paquier	5cf03552fb	btree_gist: Fix memory allocation formula This change has been suggested by the two authors listed in this commit, both of them providing an incomplete solution (David's formula relied on a "bytea *", while Bertrand's did not use palloc_array()). The solution provided in this commit uses GBT_VARKEY instead of the inconsistent bytea for the allocation size, with a palloc_array(). The change related to Vsrt is one I am flipping to a more consistent style, in passing. Author: David Geier <geidav.pg@gmail.com> Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://postgr.es/m/ad0748d4-3080-436e-b0bc-ac8f86a3466a@gmail.com Discussion: https://postgr.es/m/aTrG3Fi4APtfiCvQ@ip-10-97-1-34.eu-west-3.compute.internal	2025-12-18 11:01:43 +09:00
Michael Paquier	167cb26718	Fix const correctness in pgstat data serialization callbacks `4ba012a8ed` defined the "header" (pointer to the stats data) of from_serialized_data() as a const, even though it is fine (and expected!) for the callback to modify the shared memory entry when loading the stats at startup. While on it, this commit updates the callback to_serialized_data() in the test module test_custom_stats to make the data extracted from the "header" parameter a const since it should never be modified: the stats are written to disk and no modifications are expected in the shared memory entry. This clarifies the API contract of these new callbacks. Reported-By: Peter Eisentraut <peter@eisentraut.org> Author: Michael Paquier <michael@paquier.xyz> Co-authored-by: Sami Imseih <samimseih@gmail.com> Discussion: https://postgr.es/m/d87a93b0-19c7-4db6-b9c0-d6827e7b2da1@eisentraut.org	2025-12-18 07:33:40 +09:00
Jacob Champion	ab8af1db43	oauth_validator: Avoid races in log_check() Commit `e0f373ee4` fixed up races in Cluster::connect_fails when using log_like. t/002_client.pl didn't get the memo, though, because it doesn't use Test::Cluster to perform its custom hook tests. (This is probably not an issue at the moment, since the log check is only done after authentication success and not failure, but there's no reason to wait for someone to hit it.) Introduce the fix, based on debug2 logging, to its use of log_check() as well, and move the logic into the test() helper so that any additions don't need to continually duplicate it. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CAOYmi%2BmrGg%2Bn_X2MOLgeWcj3v_M00gR8uz_D7mM8z%3DdX1JYVbg%40mail.gmail.com Backpatch-through: 18	2025-12-17 11:46:05 -08:00
Jacob Champion	781ca72139	libpq-oauth: use correct c_args in meson.build Copy-paste bug from `b0635bfda`: libpq-oauth.so was being built with libpq_so_c_args, rather than libpq_oauth_so_c_args. (At the moment, the two lists are identical, but that won't be true forever.) Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CAOYmi%2BmrGg%2Bn_X2MOLgeWcj3v_M00gR8uz_D7mM8z%3DdX1JYVbg%40mail.gmail.com Backpatch-through: 18	2025-12-17 11:46:01 -08:00
Jacob Champion	8b217c96ea	libpq-fe.h: Don't claim SOCKTYPE in the global namespace The definition of PGoauthBearerRequest uses a temporary SOCKTYPE macro to hide the difference between Windows and Berkeley socket handles, since we don't surface pgsocket in our public API. This macro doesn't need to escape the header, because implementers will choose the correct socket type based on their platform, so I #undef'd it immediately after use. I didn't namespace that helper, though, so if anyone else needs a SOCKTYPE macro, libpq-fe.h will now unhelpfully get rid of it. This doesn't seem too far-fetched, given its proximity to existing POSIX macro names. Add a PQ_ prefix to avoid collisions, update and improve the surrounding documentation, and backpatch. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CAOYmi%2BmrGg%2Bn_X2MOLgeWcj3v_M00gR8uz_D7mM8z%3DdX1JYVbg%40mail.gmail.com Backpatch-through: 18	2025-12-17 11:45:52 -08:00
Tom Lane	5b4fb2b97d	Rename regress.so's .mo file to postgresql-regress-VERSION.mo. I originally used just "regress-VERSION.mo", but that seems too generic considering that some packagers will put this file into a system-wide directory. Per suggestion from Christoph Berg. Reported-by: Christoph Berg <myon@debian.org> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/aULSW7Xqx5MqDW_1@msg.df7cb.de	2025-12-17 14:10:42 -05:00
Heikki Linnakangas	5cbaa00592	Make postmaster 003_start_stop.pl test less flaky The test is very sensitive to how backends start and exit, because it tests dead-end backends which occur when all the connection slots are in use. The test failed occasionally in the CI, when the backend that was launched for the raw_connect_works() check lingered for a while, and exited only later during the test. When it exited, it released a connection slot, when the test expected all the slots to be in use at that time. The 002_connection_limits.pl test had a similar issue: if the backend launched for safe_psql() in the test initialization lingers around, it uses up a connection slot during the test, messing up the test's connection counting. I haven't seen that in the CI, but when I added a "sleep(1);" to proc_exit(), the test failed. To make the tests more robust, restart the server to ensure that the lingering backends don't interfere with the later test steps. In passing, fix a bogus test name. Report and analysis by Jelte Fennema-Nio, Andres Freund, Thomas Munro. Discussion: https://www.postgresql.org/message-id/CAGECzQSU2iGuocuP+fmu89hmBmR3tb-TNyYKjCcL2M_zTCkAFw@mail.gmail.com Backpatch-through: 18	2025-12-17 16:23:13 +02:00
Amit Kapila	85ddcc2f4c	Support existing publications in pg_createsubscriber. Allow pg_createsubscriber to reuse existing publications instead of failing when they already exist on the publisher. Previously, pg_createsubscriber would fail if any specified publication already existed. Now, existing publications are reused as-is with their current configuration, and non-existing publications are created automatically with FOR ALL TABLES. This change provides flexibility when working with mixed scenarios of existing and new publications. Users should verify that existing publications have the desired configuration before reusing them, and can use --dry-run with verbose mode to see which publications will be reused and which will be created. Only publications created by pg_createsubscriber are cleaned up during error cleanup operations. Pre-existing publications are preserved unless '--clean=publications' is explicitly specified, which drops all publications. This feature would be helpful for pub-sub configurations where users want to subscribe to a subset of tables from the publisher. Author: Shubham Khanna <khannashubham1197@gmail.com> Reviewed-by: Euler Taveira <euler@eulerto.com> Reviewed-by: Peter Smith <smithpb2250@gmail.com> Reviewed-by: Zhijie Hou (Fujitsu) <houzj.fnst@fujitsu.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: vignesh C <vignesh21@gmail.com> Reviewed-by: tianbing <tian_bing_0531@163.com> Discussion: https://postgr.es/m/CAHv8Rj%2BsxWutv10WiDEAPZnygaCbuY2RqiLMj2aRMH-H3iZwyA%40mail.gmail.com	2025-12-17 09:43:53 +00:00
Michael Paquier	f4e797171e	Change pgstat_report_vacuum() to use Relation This change makes pgstat_report_vacuum() more consistent with pgstat_report_analyze(), which also uses a Relation. This enforces a policy that callers of this routine should open and lock the relation whose statistics are updated before calling this routine. We are unlikely to have a lot of callers of this routine in the tree, but it seems like a good idea to imply this requirement in the long run. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Suggested-by: Andres Freund <andres@anarazel.de> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/aUEA6UZZkDCQFgSA@ip-10-97-1-34.eu-west-3.compute.internal	2025-12-17 11:26:17 +09:00
Michael Paquier	1d325ad99c	Remove useless code in InjectionPointAttach() strlcpy() ensures that a target string is zero-terminated, so there is no need to enforce it a second time in this code. This simplification could have been done in `0eb23285a2`. Author: Feilong Meng <feelingmeng@foxmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/tencent_771178777C5BC17FCB7F7A1771CD1FFD5708@qq.com	2025-12-17 08:58:58 +09:00
Jeff Davis	0a90df58cf	Avoid global LC_CTYPE dependency in pg_locale_icu.c. ICU still depends on libc for compatibility with certain historical behavior for single-byte encodings. Make the dependency explicit by holding a locale_t object when required. We should consider a better solution in the future, such as decoding the text to UTF-32 and using u_tolower(). That would be a behavior change and require additional infrastructure though; so for now, just avoid the global LC_CTYPE dependency. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/450ceb6260cad30d7afdf155d991a9caafee7c0d.camel@j-davis.com	2025-12-16 15:32:57 -08:00
Jeff Davis	87b2968df0	downcase_identifier(): use method table from locale provider. Previously, libc's tolower() was always used for lowercasing identifiers, regardless of the database locale (though only characters beyond 127 in single-byte encodings were affected). Refactor to allow each provider to supply its own implementation of identifier downcasing. For historical compatibility, when using a single-byte encoding, ICU still relies on tolower(). One minor behavior change is that, before the database default locale is initialized, it uses ASCII semantics to downcase the identifiers. Previously, it would use the postmaster's LC_CTYPE setting from the environment. While that could have some effect during GUC processing, for example, it would have been fragile to rely on the environment setting anyway. (Also, it only matters when the encoding is single-byte.) Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/450ceb6260cad30d7afdf155d991a9caafee7c0d.camel@j-davis.com	2025-12-16 15:32:41 -08:00
Jeff Davis	7f007e4a04	ltree: fix case-insensitive matching. Previously, ltree_prefix_eq_ci() used lowercasing with the default collation; while ltree_crc32_sz() used tolower() directly. These were equivalent only if the default collation provider was libc and the encoding was single-byte. Change both to use casefolding with the default collation. Backpatch through 18, where the casefolding APIs were introduced. The bug exists in earlier versions, but would require some adaptation. A REINDEX is required for ltree indexes where the database default collation is not libc. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Backpatch-through: 18 Discussion: https://postgr.es/m/450ceb6260cad30d7afdf155d991a9caafee7c0d.camel@j-davis.com Discussion: https://postgr.es/m/01fc00fd66f641b9693d4f9f1af0ccf44cbdfbdf.camel@j-davis.com	2025-12-16 12:57:00 -08:00
Jeff Davis	24bf379cb1	Clarify a #define introduced in `8d299052fe`. The value is the same, but use the right symbol for clarity.	2025-12-16 12:48:53 -08:00
Jeff Davis	84d5efa7e3	Fix multibyte issue in ltree_strncasecmp(). Previously, the API for ltree_strncasecmp() took two inputs but only one length (that of the smaller input). It truncated the larger input to that length, but that could break a multibyte sequence. Change the API to be a check for prefix equality (possibly case-insensitive) instead, which is all that's needed by the callers. Also, provide the lengths of both inputs. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/5f65b85740197ba6249ea507cddf609f84a6188b.camel%40j-davis.com Backpatch-through: 14	2025-12-16 10:35:40 -08:00
Robert Haas	f1a6e622bd	Switch memory contexts in ReinitializeParallelDSM. We already do this in CreateParallelContext, InitializeParallelDSM, and LaunchParallelWorkers. I suspect the reason why the matching logic was omitted from ReinitializeParallelDSM is that I failed to realize that any memory allocation was happening here -- but shm_mq_attach does allocate, which could result in a shm_mq_handle being allocated in a shorter-lived context than the ParallelContext which points to it. That could result in a crash if the shorter-lived context is freed before the parallel context is destroyed. As far as I am currently aware, there is no way to reach a crash using only code that is present in core PostgreSQL, but extensions could potentially trip over this. Fixing this in the back-branches appears low-risk, so back-patch to all supported versions. Author: Jakub Wartak <jakub.wartak@enterprisedb.com> Co-authored-by: Jeevan Chalke <jeevan.chalke@enterprisedb.com> Backpatch-through: 14 Discussion: http://postgr.es/m/CAKZiRmwfVripa3FGo06=5D1EddpsLu9JY2iJOTgbsxUQ339ogQ@mail.gmail.com	2025-12-16 12:24:55 -05:00
Tom Lane	462e247652	Test PRI* macros even when we can't test NLS translation. Further research shows that the reason commit `7db6809ce` failed is that recent glibc versions short-circuit translation attempts when LC_MESSAGES is 'C.<encoding>', not only when it's 'C'. There seems to be no way around that, so we'll have to live with only testing NLS when a suitable real locale is installed. However, something can still be salvaged: it still seems like a good idea to verify that the PRI* macros work as expected even when we can't check their translations (see `f8715ec86` for motivation). Hence, adjust the test to always run the ereport calls, and tweak the parameter values in hopes of detecting any cases where there's confusion about the actual widths of the parameters. Discussion: https://postgr.es/m/1991599.1765818338@sss.pgh.pa.us	2025-12-16 12:01:46 -05:00
Melanie Plageman	bfe5c4bec7	Add explanatory comment to prune_freeze_setup() heap_page_prune_and_freeze() fills in PruneState->deadoffsets, the array of OffsetNumbers of dead tuples. It is returned to the caller in the PruneFreezeResult. To avoid having two copies of the array, the PruneState saves only a pointer to the array. This was a bit unusual and confusing, so add a clarifying comment. Author: Melanie Plageman <melanieplageman@gmail.com> Suggested-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CAEoWx2=jiD1nqch4JQN+odAxZSD7mRvdoHUGJYN2r6tQG_66yQ@mail.gmail.com	2025-12-16 11:04:07 -05:00
Melanie Plageman	4877391ce8	Fix const qualification in prune_freeze_setup() The const qualification of the presult argument to prune_freeze_setup() is later cast away, so it was not correct. Remove it and add a comment explaining that presult should not be modified. Author: Peter Eisentraut <peter@eisentraut.org> Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Discussion: https://postgr.es/m/fb97d0ae-a0bc-411d-8a87-f84e7e146488%40eisentraut.org	2025-12-16 11:00:05 -05:00
Daniel Gustafsson	b39013b7b1	doc: Update header file mention for CompareType Commit `119fc30` moved CompareType to cmptype.h but the mention in the docs still referred to primnodes.h. Author: Daisuke Higuchi <higuchi.daisuke11@gmail.com> Reviewed-by: Paul A Jungwirth <pj@illuminatedcomputing.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://postgr.es/m/CAEVT6c8guXe5P=L_Un5NUUzCgEgbHnNcP+Y3TV2WbQh-xjiwqA@mail.gmail.com Backpatch-through: 18	2025-12-16 09:46:53 +01:00
John Naylor	9303d62c6d	Separate out bytea sort support from varlena.c In the wake of commit `b45242fd3`, bytea_sortsupport() still called out to varstr_sortsupport(). Treating bytea as a kind of text/varchar required varstr_sortsupport() to allow for the possibility of NUL bytes, but only for C collation. This was confusing. For better separation of concerns, create an independent sortsupport implementation in bytea.c. The heuristics for bytea_abbrev_abort() remain the same as for varstr_abbrev_abort(). It's possible that the bytea case warrants different treatment, but that is left for future investigation. In passing, adjust some strange looking comparisons in varstr_abbrev_abort(). Author: Aleksander Alekseev <aleksander@tigerdata.com> Reviewed-by: John Naylor <johncnaylorls@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CAJ7c6TP1bAbEhUJa6+rgceN6QJWMSsxhg1=mqfSN=Nb-n6DAKg@mail.gmail.com	2025-12-16 15:19:16 +07:00
Michael Paquier	15f68cebdc	Add TAP test to check recovery when redo LSN is missing This commit provides test coverage for `dc7c77f825`, where the redo record and the checkpoint record finish on different WAL segments with the start of recovery able to detect that the redo record is missing. This test uses a wait injection point done in the critical section of a checkpoint, a method that requires not one but actually two wait injection points to avoid any memory allocations within the critical section of the checkpoint: - Checkpoint run with a background psql. - A first wait point is run by the checkpointer before the critical section, allocating the shared memory required by the DSM registry for the wait machinery in the library injection_points. - First point is woken up. - Second wait point is loaded before the critical section, allocating the memory to build the path to the library loaded, then run in the critical section once the checkpoint redo record has been logged. - WAL segment is switched while waiting on the second point. - Checkpoint completes. - Stop cluster with immediate mode. - The segment that includes the redo record is removed. - Start, recovery fails as the redo record cannot be found. The error message introduced in `dc7c77f825` is now reduced to a FATAL, meaning that the information is still provided while being able to use a test for it. Nitin has provided a basic version of the test, which I have enhanced to make it portable with two points. Without `dc7c77f825`, the cluster crashes in this test, not on a PANIC but due to the pointer dereference at the beginning of recovery, the failure mentioned in the other commit. Author: Nitin Jadhav <nitinjadhavpostgres@gmail.com> Co-authored-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CAMm1aWaaJi2w49c0RiaDBfhdCL6ztbr9m=daGqiOuVdizYWYaA@mail.gmail.com	2025-12-16 14:28:05 +09:00
Michael Paquier	dc7c77f825	Fail recovery when missing redo checkpoint record without backup_label This commit adds an extra check at the beginning of recovery to ensure that the redo record of a checkpoint exists before attempting WAL replay, logging a PANIC if the redo record referenced by the checkpoint record could not be found. This is the same level of failure as when a checkpoint record is missing. This check is added when a cluster is started without a backup_label, after retrieving its checkpoint record. The redo LSN used for the check is retrieved from the checkpoint record successfully read. In the case where a backup_label exists, the startup process already fails if the redo record cannot be found after reading a checkpoint record at the beginning of recovery. Previously, the presence of the redo record was not checked. If the redo and checkpoint records were located on different WAL segments, it would be possible to miss an entire range of WAL records that should have been replayed but were just ignored. The consequences of missing the redo record depend on the version dealt with, these becoming worse the older the version used: - On HEAD, v18 and v17, recovery fails with a pointer dereference at the beginning of the redo loop, as the redo record is expected but cannot be found. These versions are good students, because we detect a failure before doing anything, even if the failure is misleading in the shape of a segmentation fault, giving no information that the redo record is missing. - In v16 and v15, problems show at the end of recovery within FinishWalRecovery(), the startup process using a buggy LSN to decide from where to start writing WAL. The cluster gets corrupted, though it is noisy about it. - v14 and older versions are worse: a cluster gets corrupted but it is entirely silent about the matter. The missing redo record causes the startup process to skip recovery entirely, because a missing record is treated the same as no redo being required at all.
This leads to data loss, as everything is missed between the redo record and the checkpoint record. Note that I have tested that down to 9.4, reproducing the issue with a version of the author's reproducer slightly modified. The code is wrong since at least 9.2, but I did not look at the exact point of origin. This problem has been found by debugging a cluster where the WAL segment including the redo record was missing due to an operator error, leading to a crash, based on an investigation in v15. Requesting archive recovery with the creation of a recovery.signal or a standby.signal even without a backup_label would mitigate the issue: if the record cannot be found in pg_wal/, the missing segment can be retrieved with a restore_command when checking that the redo record exists. This was already the case without this commit, where recovery would re-fetch the WAL segment that includes the redo record. The check introduced by this commit causes the segment to be retrieved earlier to make sure that the redo record can be found. On HEAD, the code will be slightly changed in a follow-up commit to not rely on a PANIC, to include a test able to emulate the original problem. This is a minimal backpatchable fix, kept separate for clarity. Reported-by: Andres Freund <andres@anarazel.de> Analyzed-by: Andres Freund <andres@anarazel.de> Author: Nitin Jadhav <nitinjadhavpostgres@gmail.com> Discussion: https://postgr.es/m/20231023232145.cmqe73stvivsmlhs@awork3.anarazel.de Discussion: https://postgr.es/m/CAMm1aWaaJi2w49c0RiaDBfhdCL6ztbr9m=daGqiOuVdizYWYaA@mail.gmail.com Backpatch-through: 14	2025-12-16 13:29:28 +09:00
Tom Lane	84a3778c79	Revert "Avoid requiring Spanish locale to test NLS infrastructure." This reverts commit `7db6809ced`. That doesn't seem to work with recent (last couple of years) glibc, and the reasons are obscure. I can't let the farm stay this broken for long.	2025-12-15 18:36:29 -05:00
Jacob Champion	f7fbd02d32	libpq: Align oauth_json_set_error() with other NLS patterns Now that the prior commits have fixed missing OAuth translations, pull the bespoke usage of libpq_gettext() for OAUTHBEARER parsing into oauth_json_set_error() itself, and make that a gettext trigger as well, to better match what the other sites are doing. Add an _internal() variant to handle the existing untranslated case. Suggested-by: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/0EEBCAA8-A5AC-4E3B-BABA-0BA7A08C361B%40yesql.se Backpatch-through: 18	2025-12-15 13:22:04 -08:00
Jacob Champion	301a1dcf00	libpq-oauth: Don't translate internal errors Some error messages are generated when OAuth multiplexer operations fail unexpectedly in the client. Álvaro pointed out that these are both difficult to translate idiomatically (as they use internal terminology heavily) and of dubious translation value to end users (since they're going to need to get developer help anyway). The response parsing engine has a similar issue. Remove these from the translation files by introducing internal variants of actx_error() and oauth_parse_set_error(). Suggested-by: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/CAOYmi%2BkQQ8vpRcoSrA5EQ98Wa3G6jFj1yRHs6mh1V7ohkTC7JA%40mail.gmail.com Backpatch-through: 18	2025-12-15 13:21:00 -08:00
Jacob Champion	ea3370b18e	libpq: Add missing OAuth translations Several strings that should have been translated as they passed through libpq_gettext were not actually being pulled into the translation files, because I hadn't directly wrapped them in one of the GETTEXT_TRIGGERS. Move the responsibility for calling libpq_gettext() to the code that sets actx->errctx. Doing the same in report_type_mismatch() would result in double-translation, so mark those strings with gettext_noop() instead. And wrap two ternary operands with gettext_noop(), even though they're already in one of the triggers, since xgettext sees only the first. Finally, fe-auth-oauth.c was missing from nls.mk, so none of that file was being translated at all. Add it now. Original patch by Zhijie Hou, plus suggested tweaks by Álvaro Herrera and small additions by me. Reported-by: Zhijie Hou <houzj.fnst@fujitsu.com> Author: Zhijie Hou <houzj.fnst@fujitsu.com> Co-authored-by: Álvaro Herrera <alvherre@kurilemu.de> Co-authored-by: Jacob Champion <jacob.champion@enterprisedb.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/TY4PR01MB1690746DB91991D1E9A47F57E94CDA%40TY4PR01MB16907.jpnprd01.prod.outlook.com Backpatch-through: 18	2025-12-15 13:17:10 -08:00
Nathan Bossart	48d4a1423d	Allow passing a pointer to GetNamedDSMSegment()'s init callback. This commit adds a new "void *arg" parameter to GetNamedDSMSegment() that is passed to the initialization callback function. This is useful for reusing an initialization callback function for multiple DSM segments. Author: Zsolt Parragi <zsolt.parragi@percona.com> Reviewed-by: Sami Imseih <samimseih@gmail.com> Discussion: https://postgr.es/m/CAN4CZFMjh8TrT9ZhWgjVTzBDkYZi2a84BnZ8bM%2BfLPuq7Cirzg%40mail.gmail.com	2025-12-15 14:27:16 -06:00
Noah Misch	64bf53dd61	Revisit cosmetics of "For inplace update, send nontransactional invalidations." This removes a never-used CacheInvalidateHeapTupleInplace() parameter. It adds README content about inplace update visibility in logical decoding. It rewrites other comments. Back-patch to v18, where commit `243e9b40f1` first appeared. Since this removes a CacheInvalidateHeapTupleInplace() parameter, expect a v18 ".abi-compliance-history" edit to follow. PGXN contains no calls to that function. Reported-by: Paul A Jungwirth <pj@illuminatedcomputing.com> Reported-by: Ilyasov Ian <ianilyasov@outlook.com> Reviewed-by: Paul A Jungwirth <pj@illuminatedcomputing.com> Reviewed-by: Surya Poondla <s_poondla@apple.com> Discussion: https://postgr.es/m/CA+renyU+LGLvCqS0=fHit-N1J-2=2_mPK97AQxvcfKm+F-DxJA@mail.gmail.com Backpatch-through: 18	2025-12-15 12:19:49 -08:00
Noah Misch	0839fbe400	Correct comments of "Fix data loss at inplace update after heap_update()". This corrects commit `a07e03fd8f`. Reported-by: Paul A Jungwirth <pj@illuminatedcomputing.com> Reported-by: Surya Poondla <s_poondla@apple.com> Reviewed-by: Paul A Jungwirth <pj@illuminatedcomputing.com> Discussion: https://postgr.es/m/CA+renyWCW+_2QvXERBQ+mna6ANwAVXXmHKCA-WzL04bZRsjoBA@mail.gmail.com	2025-12-15 12:19:49 -08:00
Tom Lane	7db6809ced	Avoid requiring Spanish locale to test NLS infrastructure. I had supposed that the majority of machines with gettext installed would have most language locales installed, but at least in the buildfarm it turns out less than half have es_ES installed. So depending on that to run the test now seems like a bad idea. But it turns out that gettext can be persuaded to "translate" even in the C locale, as long as you fake out its short-circuit logic by spelling the locale name like "C.UTF-8" or similar. (Many thanks to Bryan Green for correcting my misconceptions about that.) Quick testing suggests that that spelling is accepted by most platforms, though again the buildfarm may show that "most" isn't "all". Hence, remove the es_ES dependency and instead create a "C" message catalog. I've made the test unconditionally set lc_messages to 'C.UTF-8'. That approach might need adjustment depending on what the buildfarm shows, but let's keep it simple until proven wrong. While at it, tweak the test so that we run the various ereport's even when !ENABLE_NLS. This is useful to verify that the macros provided by <inttypes.h> are compatible with snprintf.c, as we now know is worth questioning. Discussion: https://postgr.es/m/1991599.1765818338@sss.pgh.pa.us	2025-12-15 15:16:45 -05:00
Jeff Davis	95a19fefdc	Remove incorrect declarations in pg_wchar.h. Oversight in commit `9acae56ce0`. Discussion: https://postgr.es/m/541F240E-94AD-4D65-9794-7D6C316BC3FF@gmail.com	2025-12-15 10:38:55 -08:00
Jeff Davis	54c41a6deb	Remove unused single-byte char_is_cased() API. https://postgr.es/m/450ceb6260cad30d7afdf155d991a9caafee7c0d.camel@j-davis.com	2025-12-15 10:24:57 -08:00
Jeff Davis	9c8de15969	Use multibyte-aware extraction of pattern prefixes. Previously, like_fixed_prefix() used char-at-a-time logic, which forced it to be too conservative for case-insensitive matching. Introduce like_fixed_prefix_ci(), and use that for case-insensitive pattern prefixes. It uses multibyte and locale-aware logic, along with the new pg_iswcased() API introduced in `630706ced0`. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/450ceb6260cad30d7afdf155d991a9caafee7c0d.camel@j-davis.com	2025-12-15 10:24:47 -08:00
Tom Lane	8191937082	Add offnum range checks to suppress compile warnings with UBSAN. Late-model gcc with -fsanitize=undefined enabled issues warnings about uses of PageGetItemId() when it can't prove that the offsetNumber is > 0. The call sites where this happens are checking that the offnum is <= PageGetMaxOffsetNumber(page), so it seems reasonable to add an explicit check that offnum >= 1 too. While at it, rearrange the code to be less contorted and avoid duplicate checks on PageGetMaxOffsetNumber. Maybe the compiler would optimize away the duplicate logic or maybe not, but the existing coding has little to recommend it anyway. There are multiple instances of this identical coding pattern in heapam.c and heapam_xlog.c. Current gcc only complains about two of them, but I fixed them all in the name of consistency. Potentially this could be back-patched in the name of silencing warnings; but I think enabling UBSAN is mainly something people would do on HEAD, so for now it seems not worth the trouble. Discussion: https://postgr.es/m/1699806.1765746897@sss.pgh.pa.us	2025-12-15 12:40:09 -05:00
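
The range check described in the commit above can be sketched in isolation. This is a minimal illustration and not the heapam.c code: `OffsetNumberSketch`, `offnum_in_range` and `item_index` are hypothetical names, and the page's maximum offset is passed in as a plain integer instead of being read from a page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a 1-based line pointer offset. */
typedef uint16_t OffsetNumberSketch;

/* Both bounds are tested in one place, so offnum >= 1 is explicit. */
static bool
offnum_in_range(OffsetNumberSketch offnum, OffsetNumberSketch maxoff)
{
    return offnum >= 1 && offnum <= maxoff;
}

/* Convert a validated 1-based offset into a 0-based array index. */
static size_t
item_index(OffsetNumberSketch offnum, OffsetNumberSketch maxoff)
{
    assert(offnum_in_range(offnum, maxoff));
    return (size_t) offnum - 1;
}
```

Stating the lower bound explicitly is what lets a compiler or sanitizer see that the subtraction cannot go below zero.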
Heikki Linnakangas	bd43940b02	Increase timeout in multixid_conversion upgrade test The workload to generate multixids before upgrade is very slow on buildfarm members running with JIT enabled. The workload runs a lot of small queries, so it's unsurprising that JIT makes it slower. On my laptop it nevertheless runs in under 10 s even with JIT enabled, while some buildfarm members have been hitting the 180 s timeout. That seems extreme, but I suppose it's still expected on very slow and busy buildfarm animals. The timeout applies to the BackgroundPsql sessions as a whole rather than the individual queries. Bump up the timeout to avoid the test failures. Add periodic progress reports to the test output so that we get a better picture of just how slow the test is. In passing, also fix comments about how many multixids and members the workload generates. The comments were written based on 10 parallel connections, but it actually uses 20. Discussion: https://www.postgresql.org/message-id/b7faf07c-7d2c-4f35-8c43-392e057153ef@gmail.com	2025-12-15 18:06:24 +02:00
Heikki Linnakangas	ecb553ae82	Improve sanity checks on multixid members length In the server, check explicitly for multixids with zero members. We used to have an assertion for it, but commit `d4b7bde418` replaced it with more extensive runtime checks, but it missed the original case of zero members. In the upgrade code, a negative length never makes sense, so better check for it explicitly. Commit `d4b7bde418` added a similar sanity check to the corresponding server code on master, and in backbranches, the 'length' is passed to palloc which would fail with "invalid memory alloc request size" error. Clarify the comments on what kind of invalid entries are tolerated by the upgrade code and which ones are reported as fatal errors. Coverity complained about 'length' in the upgrade code being tainted. That's bogus because we trust the data on disk at least to some extent, but hopefully this will silence the complaint. If not, I'll dismiss it manually. Discussion: https://www.postgresql.org/message-id/7b505284-c6e9-4c80-a7ee-816493170abc@iki.fi	2025-12-15 13:30:17 +02:00
Álvaro Herrera	77038d6d0b	Disable recently added CIC/RI isolation tests We have tried to stabilize them several times already, but they are very flaky -- apparently there's some intrinsic instability that's hard to solve with the isolationtester framework. They are very noisy in CI runs (whereas buildfarm has not registered any such failures). They may need to be rewritten completely. In the meantime just comment them out in Makefile/meson.build, leaving the spec files around. Per complaint from Andres Freund. Discussion: https://postgr.es/m/202512112014.icpomgc37zx4@alvherre.pgsql	2025-12-15 12:17:37 +01:00
Peter Eisentraut	17f446784d	Refactor static_assert() support. HAVE__STATIC_ASSERT was really a test for GCC statement expressions, as needed for StaticAssertExpr() now that _Static_assert could be assumed to be available through our C11 requirement. This artificially prevented Visual Studio from being able to use static_assert() in other contexts. Instead, make a new test for HAVE_STATEMENT_EXPRESSIONS, and use that to control only whether StaticAssertExpr() uses fallback code, not the other variants. This improves the quality of failure messages in the (much more common) other variants under Visual Studio. Also get rid of the two separate implementations for C++, since the C implementation is also valid as C++11. While it is a stretch to apply HAVE_STATEMENT_EXPRESSIONS tested with $CC to a C++ compiler, the previous C++ coding assumed that the C++ compiler had them unconditionally, so it isn't a new stretch. In practice, the C and C++ compilers are very likely to agree, and if a combination is ever reported that falsifies this assumption we can always reconsider that. Author: Thomas Munro <thomas.munro@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CA%2BhUKGKvr0x_oGmQTUkx%3DODgSksT2EtgCA6LmGx_jQFG%3DsDUpg%40mail.gmail.com	2025-12-15 11:54:23 +01:00
Heikki Linnakangas	366dcdaf57	Clarify comment on multixid offset wraparound check Coverity complained that offset cannot be 0 here because there's an explicit check for "offset == 0" earlier in the function, but it didn't see the possibility that offset could've wrapped around to 0. The code is correct, but clarify the comment about it. The same code exists in backbranches in the server GetMultiXactIdMembers() function and in 'master' in the pg_upgrade GetOldMultiXactIdSingleMember function. In backbranches Coverity didn't complain about it because the check was merely an assertion, but change the comment in all supported branches for consistency. Per Tom Lane's suggestion. Discussion: https://www.postgresql.org/message-id/1827755.1765752936@sss.pgh.pa.us	2025-12-15 11:47:04 +02:00
David Rowley	cd83ed9a91	Fix typo in tablecmds.c Author: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CAEoWx2%3DAib%2BcatZn6wHKmz0BWe8-q10NAhpxu8wUDT19SddKNA%40mail.gmail.com	2025-12-15 17:17:04 +13:00
Amit Kapila	0d2d4a0ec3	Add retry logic to pg_sync_replication_slots(). Previously, pg_sync_replication_slots() would finish without synchronizing slots that didn't meet requirements, rather than failing outright. This could leave some failover slots unsynchronized if required catalog rows or WAL segments were missing or at risk of removal, while the standby continued removing needed data. To address this, the function now waits for the primary slot to advance to a position where all required data is available on the standby before completing synchronization. It retries cyclically until all failover slots that existed on the primary at the start of the call are synchronized. Slots created after the function begins are not included. If the standby is promoted during this wait, the function exits gracefully and the temporary slots will be removed. Author: Ajin Cherian <itsajin@gmail.com> Author: Hou Zhijie <houzj.fnst@fujitsu.com> Reviewed-by: Shveta Malik <shveta.malik@gmail.com> Reviewed-by: Japin Li <japinli@hotmail.com> Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-by: Ashutosh Sharma <ashu.coek88@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Yilin Zhang <jiezhilove@126.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/CAFPTHDZAA%2BgWDntpa5ucqKKba41%3DtXmoXqN3q4rpjO9cdxgQrw%40mail.gmail.com	2025-12-15 02:50:21 +00:00
Michael Paquier	33980eaa6d	test_custom_stats: Fix compilation warning I have fat-fingered an error message related to an offset while switching the code to use pgoff_t. Let's switch to the same error message used in the rest of the tree for similar failures with fseeko(), instead. Per buildfarm members running macos: longfin, sifaka and indri.	2025-12-15 10:34:18 +09:00
Michael Paquier	171198ff2a	pageinspect: use index_close() for GiST index relation gist_page_items() opens its target relation with index_open(), but closed it using relation_close() instead of index_close(). This was harmless because index_close() and relation_close() do the exact same work, still inconsistent with the rest of the code tree as routines opening and closing a relation based on a relkind are expected to match, at least in name. Author: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CAEoWx2=bL41WWcD-4Fxx-buS2Y2G5=9PjkxZbHeFMR6Uy2WNvw@mail.gmail.com	2025-12-15 10:28:28 +09:00
Michael Paquier	481783e69f	test_custom_stats: Add tests with read/write of auxiliary data This commit builds upon `4ba012a8ed`, giving an example of what can be achieved with the new callbacks. This provides coverage for the new pgstats APIs, while serving as a reference template. Note that built-in stats kinds could use them, we just don't have a use-case there yet. Author: Sami Imseih <samimseih@gmail.com> Co-authored-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CAA5RZ0s9SDOu+Z6veoJCHWk+kDeTktAtC-KY9fQ9Z6BJdDUirQ@mail.gmail.com	2025-12-15 09:47:30 +09:00
Michael Paquier	4ba012a8ed	Allow cumulative statistics to read/write auxiliary data from/to disk Cumulative stats kinds gain the capability to write additional per-entry data when flushing the stats at shutdown, and read this data when loading back the stats at startup. This can be fit for example in the case of variable-length data (like normalized query strings), so as it becomes possible to link the shared memory stats entries to data that is stored in a different area, like a DSA segment. Three new optional callbacks are added to PgStat_KindInfo, available to variable-numbered stats kinds: * to_serialized_data: writes auxiliary data for an entry. * from_serialized_data: reads auxiliary data for an entry. * finish: performs actions after read/write/discard operations. This is invoked after processing all the entries of a kind, allowing extensions to close file handles and clean up resources. Stats kinds have the option to store this data in the existing pgstats file, but can as well store it in one or more additional files whose names can be built upon the entry keys. The new serialized callbacks are called once an entry key is read or written from the main stats file. A file descriptor to the main pgstats file is available in the arguments of the callbacks. Author: Sami Imseih <samimseih@gmail.com> Co-authored-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CAA5RZ0s9SDOu+Z6veoJCHWk+kDeTktAtC-KY9fQ9Z6BJdDUirQ@mail.gmail.com	2025-12-15 09:40:56 +09:00
Tom Lane	58dad7f349	Update typedefs.list to match what the buildfarm currently reports. The current list from the buildfarm includes quite a few typedef names that it used to miss. The reason is a bit obscure, but it seems likely to have something to do with our recent increased use of palloc_object and palloc_array. In any case, this makes the relevant struct declarations be much more nicely formatted, so I'll take it. Install the current list and re-run pgindent to update affected code. Syncing with the current list also removes some obsolete typedef names and fixes some alphabetization errors. Discussion: https://postgr.es/m/1681301.1765742268@sss.pgh.pa.us	2025-12-14 17:03:53 -05:00
Tom Lane	66b2282b0c	Make "pgoff_t" be a typedef not a #define. There doesn't seem to be any great reason why this has been a macro rather than a typedef. But doing it like that means our buildfarm typedef tooling doesn't capture the name as a typedef. That would result in pgindent glitches, except that we've seemingly kept it in typedefs.list manually. That's obviously error-prone, so let's convert it to a typedef now. Discussion: https://postgr.es/m/1681301.1765742268@sss.pgh.pa.us	2025-12-14 16:53:34 -05:00
Tom Lane	fe7ede45f1	Looks like we can't test NLS on machines that lack any es_ES locale. While commit `5b275a6e1` fixed a few unhappy buildfarm animals, it looks like the remainder simply don't have any es_ES locale at all. There's little point in running the test in that case, so minimize the number of variant expected-files by bailing out. Also emit a log entry so that it's possible to tell from buildfarm postmaster logs which case occurred. Possibly, the scope of this testing could be improved by providing additional translations. But I think it's likely that the failing animals have no non-C locales installed at all. In passing, update typedefs.list so that koel doesn't think regress.c is misformatted. Discussion: https://postgr.es/m/E1vUpNU-000kcQ-1D@gemulon.postgresql.org	2025-12-14 14:30:50 -05:00
Andres Freund	30df61990c	bufmgr: Add one-entry cache for private refcount The private refcount entry for a buffer is often looked up repeatedly for the same buffer, e.g. to pin and then unpin a buffer. Benchmarking shows that it's worthwhile to have a one-entry cache for that case. With that cache in place, it's worth splitting GetPrivateRefCountEntry() into a small inline portion (for the cache hit case) and an out-of-line helper for the rest. This is helpful for some workloads today, but becomes more important in an upcoming patch that will utilize the private refcount infrastructure to also store whether the buffer is currently locked, as that increases the rate of lookups substantially. Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Discussion: https://postgr.es/m/6rgb2nvhyvnszz4ul3wfzlf5rheb2kkwrglthnna7qhe24onwr@vw27225tkyar	2025-12-14 13:09:43 -05:00
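
The one-entry cache idea from the commit above can be sketched as follows. This is an illustrative toy under stated assumptions and not the bufmgr.c implementation: `RefCountSlot`, `slots`, `last_hit`, `lookup` and `lookup_slow` are hypothetical names, the table is a tiny fixed array, and `slow_lookups` exists only to make the effect of the cache observable.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_NSLOTS 8

/* Hypothetical per-buffer entry; buffer == 0 marks an unused slot. */
typedef struct RefCountSlot
{
    int         buffer;
    int         refcount;
} RefCountSlot;

static RefCountSlot slots[SKETCH_NSLOTS];
static int  last_hit = -1;      /* index of the most recently found slot */
static int  slow_lookups = 0;   /* counts trips through the slow path */

/* Out-of-line part: full scan, remembering where the entry was found. */
static RefCountSlot *
lookup_slow(int buffer)
{
    slow_lookups++;
    for (int i = 0; i < SKETCH_NSLOTS; i++)
    {
        if (slots[i].buffer == buffer)
        {
            last_hit = i;
            return &slots[i];
        }
    }
    return NULL;
}

/* Small inline part: serve repeated lookups of the same buffer cheaply. */
static inline RefCountSlot *
lookup(int buffer)
{
    assert(buffer != 0);
    if (last_hit >= 0 && slots[last_hit].buffer == buffer)
        return &slots[last_hit];
    return lookup_slow(buffer);
}
```

A pin followed by an unpin of the same buffer would take the slow path once and the inline path afterwards, which is the access pattern the commit message describes.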
Andres Freund	edbaaea0a9	bufmgr: Separate keys for private refcount infrastructure This makes lookups faster, due to allowing auto-vectorized lookups. It is also beneficial for an upcoming patch, independent of auto-vectorization, as the upcoming patch wants to track more information for each pinned buffer, making the existing loop, iterating over an array of PrivateRefCountEntry, more expensive due to increasing its size. Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Discussion: https://postgr.es/m/fvfmkr5kk4nyex56ejgxj3uzi63isfxovp2biecb4bspbjrze7@az2pljabhnff	2025-12-14 13:09:43 -05:00
Tom Lane	5b275a6e15	Try a few different locale name spellings in nls.sql. While CI testing in advance of commit `8c498479d` suggested that all Unix-ish platforms would accept 'es_ES.UTF-8', the buildfarm has a different opinion. Let's dynamically select something that works, if possible. Discussion: https://postgr.es/m/E1vUpNU-000kcQ-1D@gemulon.postgresql.org	2025-12-14 12:54:57 -05:00
Tom Lane	b853e644d7	Fix double assignment. Coverity complained about this, not without reason: OldMultiXactReader state = state = pg_malloc(sizeof(state)); (I'm surprised this is even legal C ... why is "state" in-scope in its initialization expression?) While at it, convert to use our newly-preferred "pg_malloc_object" macro instead of an explicit sizeof().	2025-12-14 12:09:56 -05:00
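
On the question raised in the commit above: in C, an identifier's scope begins just after the completion of its declarator, so the name is already visible inside its own initializer, which is why a doubled assignment compiles. The sketch below shows the single-assignment form with the size derived from the type. `ReaderSketch` and `ALLOC_OBJECT` are hypothetical stand-ins that use plain malloc(); they are not the pg_upgrade struct or the pg_malloc_object macro.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a reader state struct. */
typedef struct ReaderSketch
{
    long        members[16];
} ReaderSketch;

/* Hypothetical "allocate one object" helper: the size follows the type. */
#define ALLOC_OBJECT(type) ((type *) malloc(sizeof(type)))

static ReaderSketch *
make_reader(void)
{
    /*
     * "ReaderSketch *r = r = ..." would also compile, because r is in scope
     * as soon as its declarator ends; one assignment is what was meant.
     */
    ReaderSketch *r = ALLOC_OBJECT(ReaderSketch);

    if (r != NULL)
        r->members[0] = 0;
    return r;
}
```

Naming the type once in the macro avoids having to decide between sizeof(state) and sizeof(*state) at each call site.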
Tom Lane	8c498479d7	Add a regression test to verify that NLS translation works. We've never actually had a formal test for this facility. It seems worth adding one now, mainly because we are starting to depend on gettext() being able to handle the PRI* macros from <inttypes.h>, and it's not all that certain that that works everywhere. So the test goes to a bit of effort to check all the PRI* macros we are likely to use. (This effort has indeed found one problem already, now fixed in commit f8715ec86.) Discussion: https://postgr.es/m/3098752.1765221796@sss.pgh.pa.us Discussion: https://postgr.es/m/292844.1765315339@sss.pgh.pa.us	2025-12-14 11:55:18 -05:00
Alexander Korotkov	b27e48213f	Refactor WaitLSNType enum to use a macro for type count Change WAIT_LSN_TYPE_COUNT from an enum sentinel to a macro definition, in a similar way to IOObject, IOContext, and BackendType enums. Remove explicit enum value assignments as well. Author: Xuneng Zhou <xunengzhou@gmail.com>	2025-12-14 17:18:32 +02:00
Alexander Korotkov	c5ae07a90a	Fix usage of palloc() in MERGE/SPLIT PARTITION(s) code `f2e4cc4279` and `4b3d173629` implement ALTER TABLE ... MERGE/SPLIT PARTITION(s) commands. In several places, these commits use palloc(), where we should use palloc_object() and palloc_array(). This commit provides appropriate usage of palloc_object() and palloc_array(). Reported-by: Man Zeng <zengman@halodbtech.com> Discussion: https://postgr.es/m/tencent_3661BB522D5466B33EA33666%40qq.com	2025-12-14 16:10:25 +02:00
Alexander Korotkov	4b3d173629	Implement ALTER TABLE ... SPLIT PARTITION ... command This new DDL command splits a single partition into several partitions. Just like the ALTER TABLE ... MERGE PARTITIONS ... command, new partitions are created using the createPartitionTable() function with the parent partition as the template. This commit comprises a quite naive implementation which works in a single process and holds the ACCESS EXCLUSIVE LOCK on the parent table during all the operations, including the tuple routing. This is why the new DDL command can't be recommended for large, partitioned tables under high load. However, this implementation comes in handy in certain cases, even as it is. Also, it could serve as a foundation for future implementations with less locking and possibly parallelism. Discussion: https://postgr.es/m/c73a1746-0cd0-6bdd-6b23-3ae0b7c0c582%40postgrespro.ru Author: Dmitry Koval <d.koval@postgrespro.ru> Co-authored-by: Alexander Korotkov <aekorotkov@gmail.com> Co-authored-by: Tender Wang <tndrwang@gmail.com> Co-authored-by: Richard Guo <guofenglinux@gmail.com> Co-authored-by: Dagfinn Ilmari Mannsaker <ilmari@ilmari.org> Co-authored-by: Fujii Masao <masao.fujii@gmail.com> Co-authored-by: Jian He <jian.universality@gmail.com> Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com> Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at> Reviewed-by: Zhihong Yu <zyu@yugabyte.com> Reviewed-by: Justin Pryzby <pryzby@telsasoft.com> Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org> Reviewed-by: Robert Haas <rhaas@postgresql.org> Reviewed-by: Stephane Tachoires <stephane.tachoires@gmail.com> Reviewed-by: Jian He <jian.universality@gmail.com> Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com> Reviewed-by: Pavel Borisov <pashkin.elfe@gmail.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Alexander Lakhin <exclusion@gmail.com> Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Reviewed-by: Daniel Gustafsson <dgustafsson@postgresql.org> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Noah Misch <noah@leadboat.com>	2025-12-14 13:29:38 +02:00
Alexander Korotkov	f2e4cc4279	Implement ALTER TABLE ... MERGE PARTITIONS ... command This new DDL command merges several partitions into a single partition of the target table. The target partition is created using the new createPartitionTable() function with the parent partition as the template. This commit comprises a quite naive implementation which works in a single process and holds the ACCESS EXCLUSIVE LOCK on the parent table during all the operations, including the tuple routing. This is why this new DDL command can't be recommended for large partitioned tables under a high load. However, this implementation comes in handy in certain cases, even as it is. Also, it could serve as a foundation for future implementations with less locking and possibly parallelism. Discussion: https://postgr.es/m/c73a1746-0cd0-6bdd-6b23-3ae0b7c0c582%40postgrespro.ru Author: Dmitry Koval <d.koval@postgrespro.ru> Co-authored-by: Alexander Korotkov <aekorotkov@gmail.com> Co-authored-by: Tender Wang <tndrwang@gmail.com> Co-authored-by: Richard Guo <guofenglinux@gmail.com> Co-authored-by: Dagfinn Ilmari Mannsaker <ilmari@ilmari.org> Co-authored-by: Fujii Masao <masao.fujii@gmail.com> Co-authored-by: Jian He <jian.universality@gmail.com> Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com> Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at> Reviewed-by: Zhihong Yu <zyu@yugabyte.com> Reviewed-by: Justin Pryzby <pryzby@telsasoft.com> Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org> Reviewed-by: Robert Haas <rhaas@postgresql.org> Reviewed-by: Stephane Tachoires <stephane.tachoires@gmail.com> Reviewed-by: Jian He <jian.universality@gmail.com> Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com> Reviewed-by: Pavel Borisov <pashkin.elfe@gmail.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Alexander Lakhin <exclusion@gmail.com> Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Reviewed-by: Daniel Gustafsson <dgustafsson@postgresql.org> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Noah Misch <noah@leadboat.com>	2025-12-14 13:29:17 +02:00
Michael Paquier	5b3ef3055d	doc: Fix incorrect documentation for test_custom_stats The reference to the test module test_custom_stats should have been added under the section "Custom Cumulative Statistics", but the section "Injection Points" has been updated instead, reversing the references for both test modules. `d52c24b0f8` has removed a paragraph that was correct, and `31280d96a6` has added a paragraph that was incorrect. Author: Sami Imseih <samimseih@gmail.com> Discussion: https://postgr.es/m/CAA5RZ0s4heX926+ZNh63u12gLd9jgauU6yiirKc7xGo1G01PXQ@mail.gmail.com	2025-12-14 11:21:01 +09:00
Tom Lane	ef5f559b95	Fix jsonb_object_agg crash after eliminating null-valued pairs. In commit `b61aa76e4` I added an assumption in jsonb_object_agg_finalfn that it'd be okay to apply uniqueifyJsonbObject repeatedly to a JsonbValue. I should have studied that code more closely first, because in skip_nulls mode it removed leading nulls by changing the "pairs" array start pointer. This broke the data structure's invariants in two ways: pairs no longer references a repalloc-able chunk, and the distance from pairs to the end of its array is less than parseState->size. So any subsequent addition of more pairs is at high risk of clobbering memory and/or causing repalloc to crash. Unfortunately, adding more pairs is exactly what will happen when the aggregate is being used as a window function. Fix by rewriting uniqueifyJsonbObject to not do that. The prior coding had little to recommend it anyway. Reported-by: Alexander Lakhin <exclusion@gmail.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/ec5e96fb-ee49-4e5f-8a09-3f72b4780538@gmail.com	2025-12-13 16:18:29 -05:00
Peter Eisentraut	315342ffed	Use correct preprocessor conditional in relptr.h When relptr.h was added (commit `fbc1c12a94`), there was no check for HAVE_TYPEOF, so it used HAVE__BUILTIN_TYPES_COMPATIBLE_P, which already existed (commit `ea473fb2de`) and which was thought to cover approximately the same compilers. But the guarded code can also work without HAVE__BUILTIN_TYPES_COMPATIBLE_P, and we now have a check for HAVE_TYPEOF (commit `4cb824699e`), so let's fix this up to use the correct logic. Co-authored-by: Thomas Munro <thomas.munro@gmail.com> Discussion: https://www.postgresql.org/message-id/CA%2BhUKGL7trhWiJ4qxpksBztMMTWDyPnP1QN%2BLq341V7QL775DA%40mail.gmail.com	2025-12-13 19:56:09 +01:00
Peter Eisentraut	abb331da0a	Fix out-of-date comment on makeRangeConstructors We did define 4 functions in `4429f6a9e3`, but in `df73584431` we got rid of the 0- and 1-arg versions. Author: Paul A. Jungwirth <pj@illuminatedcomputing.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/CA%2BrenyVQti3iC7LE4UxtQb4ROLYMs6%2Bu-d4LrN5U4idH1Ghx6Q%40mail.gmail.com	2025-12-13 16:56:07 +01:00
Peter Eisentraut	ff30bad7f6	Clarify comment about temporal foreign keys In RI_ConstraintInfo, period_contained_by_oper and period_intersect_oper can take either anyrange or anymultirange. Author: Paul A. Jungwirth <pj@illuminatedcomputing.com> Discussion: https://www.postgresql.org/message-id/CA%2BrenyWzDth%2BjqLZA2L2Cezs3wE%2BWX-5P8W2EOVx_zfFD%3Daicg%40mail.gmail.com	2025-12-13 16:44:33 +01:00
Álvaro Herrera	630a93799d	Reject opclass options in ON CONFLICT clause It's as pointless as ASC/DESC and NULLS FIRST/LAST are, so reject all of them in the same way. While at it, normalize the others' error messages to have less translatable strings. Add tests for these errors. Noticed while reviewing recent INSERT ON CONFLICT patches. Author: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Peter Geoghegan <pg@bowt.ie> Discussion: https://postgr.es/m/202511271516.oiefpvn3z27m@alvherre.pgsql	2025-12-12 14:26:42 +01:00
Peter Eisentraut	493eb0da31	Replace most StaticAssertStmt() with StaticAssertDecl() Similar to commit `75f49221c2`, it is preferable to use StaticAssertDecl() instead of StaticAssertStmt() when possible. Discussion: https://www.postgresql.org/message-id/flat/CA%2BhUKGKvr0x_oGmQTUkx%3DODgSksT2EtgCA6LmGx_jQFG%3DsDUpg%40mail.gmail.com	2025-12-12 10:06:40 +01:00
Heikki Linnakangas	87a350e1f2	Never store 0 as the nextMXact Before this commit, when multixid wraparound happens, MultiXactState->nextMXact goes to 0, which is invalid. All the readers need to deal with that possibility and skip over the 0. That's error-prone and we've missed it a few times in the past. This commit changes the responsibility so that all the writers of MultiXactState->nextMXact skip over the zero already, and readers can trust that it's never 0. We were already doing that for MultiXactState->oldestMultiXactId; none of its writers would set it to 0. ReadMultiXactIdRange() was nevertheless checking for that possibility. For clarity, remove that check. Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-by: Maxim Orlov <orlovmg@gmail.com> Discussion: https://www.postgresql.org/message-id/3624730d-6dae-42bf-9458-76c4c965fb27@iki.fi	2025-12-12 10:47:34 +02:00
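
The writer-side rule from the commit above (advance the counter but never leave it at the invalid value 0) can be sketched like this. The names `MultiXactIdSketch`, `INVALID_MXID_SKETCH`, `FIRST_MXID_SKETCH` and `next_mxact_after` are hypothetical, and treating 1 as the first valid value is an assumption of the sketch, not something taken from the multixact.c source.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins, not the PostgreSQL definitions. */
typedef uint32_t MultiXactIdSketch;

#define INVALID_MXID_SKETCH ((MultiXactIdSketch) 0)
#define FIRST_MXID_SKETCH   ((MultiXactIdSketch) 1)

/* Advance a 32-bit counter, skipping over the invalid value on wraparound. */
static MultiXactIdSketch
next_mxact_after(MultiXactIdSketch current)
{
    MultiXactIdSketch next = current + 1;   /* wraps to 0 after UINT32_MAX */

    if (next == INVALID_MXID_SKETCH)
        next = FIRST_MXID_SKETCH;

    /* Readers may rely on this: the stored value is never 0. */
    assert(next != INVALID_MXID_SKETCH);
    return next;
}
```

With the skip done once in the writer, readers need no special case for 0, which is the simplification the commit message argues for.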
Nathan Bossart	b4cbc106a6	Fix some comments. Like commit `123661427b`, these were discovered while reviewing Aleksander Alekseev's proposed changes to pgindent.	2025-12-11 15:13:04 -06:00
Álvaro Herrera	81f72115cf	Fix infer_arbiter_index for partitioned tables The fix for concurrent index operations in `bc32a12e0d` started considering indexes that are not yet marked indisvalid as arbiters for INSERT ON CONFLICT. For partitioned tables, this leads to including indexes that may not exist in partitions, causing a trivially reproducible "invalid arbiter index list" error to be thrown because of failure to match the index. To fix, it suffices to ignore !indisvalid indexes on partitioned tables. There should be no risk that the set of indexes will change for concurrent transactions, because in order for such an index to be marked valid, an ALTER INDEX ATTACH PARTITION must run which requires AccessExclusiveLock. Author: Mihail Nikalayeu <mihailnikalayeu@gmail.com> Reported-by: Alexander Lakhin <exclusion@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/17622f79-117a-4a44-aa8e-0374e53faaf0%40gmail.com	2025-12-11 20:56:37 +01:00
Heikki Linnakangas	b65f1ad9b1	Fix comment on how temp files and subtransactions are handled The comment was accurate a long time ago, but not any more. I failed to update the comment in commit `ab3148b712`.	2025-12-11 15:57:11 +02:00
Heikki Linnakangas	d4b7bde418	Add runtime checks for bogus multixact offsets It's not far-fetched that we'd try to read a multixid with an invalid offset in case of bugs or corruption. Or if you call pg_get_multixact_members() after a crash that left behind invalid but unused multixids. Better to get a somewhat descriptive error message if that happens. Discussion: https://www.postgresql.org/message-id/3624730d-6dae-42bf-9458-76c4c965fb27@iki.fi	2025-12-11 11:18:14 +02:00
Peter Eisentraut	795e94c70c	Make <assert.h> consistently available in frontend and backend Previously, c.h made <assert.h> only available in frontends (#ifdef FRONTEND), which was probably reasonable, because the only thing it would give you is assert(), which you generally shouldn't use in the backend. But with C11, <assert.h> also makes available static_assert(), which would be useful everywhere. So this patch moves <assert.h> to the commonly available header files in c.h and fixes a small complication in regcustom.h that resulted from that. Co-authored-by: Thomas Munro <thomas.munro@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CA%2BhUKGKvr0x_oGmQTUkx%3DODgSksT2EtgCA6LmGx_jQFG%3DsDUpg%40mail.gmail.com	2025-12-11 09:56:57 +01:00
Michael Paquier	4f7dacc5b8	Use palloc_object() and palloc_array(), the last change This is the last batch of changes that have been suggested by the author, this part covering the non-trivial changes. Some of the changes suggested have been discarded as they seem to lead to more instructions generated, leaving the parts that can be qualified as in-place replacements. Similar work has been done in `1b105f9472`, `0c3c5c3b06` and `31d3847a37`. Author: David Geier <geidav.pg@gmail.com> Discussion: https://postgr.es/m/ad0748d4-3080-436e-b0bc-ac8f86a3466a@gmail.com	2025-12-11 14:29:12 +09:00
Michael Paquier	3f83de20ba	pg_buffercache: Fix memory allocation formula The code over-allocated the memory required for os_page_status, relying on uint64 for its element size instead of an int, hence doubling what was required. This could mean quite a lot of memory if dealing with a lot of NUMA pages. Oversight in `ba2a3c2302`. Author: David Geier <geidav.pg@gmail.com> Discussion: https://postgr.es/m/ad0748d4-3080-436e-b0bc-ac8f86a3466a@gmail.com Backpatch-through: 18	2025-12-11 14:11:06 +09:00
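
The kind of sizing mistake described in the commit above is easy to see in a small sketch. The names below (`ALLOC_ARRAY`, `bytes_for`, `alloc_page_status`) are hypothetical and use plain malloc(); they only illustrate deriving the allocation size from the element type that is actually stored, and are not the pg_buffercache code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical array helper: the element type appears exactly once. */
#define ALLOC_ARRAY(type, count) ((type *) malloc(sizeof(type) * (count)))

/* Bytes requested for npages elements of a given element size. */
static size_t
bytes_for(size_t npages, size_t elem_size)
{
    return npages * elem_size;
}

/* Allocate one int of status per page, sized from the element type. */
static int *
alloc_page_status(size_t npages)
{
    assert(npages > 0);
    return ALLOC_ARRAY(int, npages);
}
```

On platforms where int is 4 bytes, `bytes_for(n, sizeof(uint64_t))` is twice `bytes_for(n, sizeof(int))`, which matches the doubling the commit message mentions.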
Amit Kapila	1362bc33e0	Enhance slot synchronization API to respect promotion signal. Previously, during a promotion, only the slot synchronization worker was signaled to shut down. The backend executing slot synchronization via the pg_sync_replication_slots() SQL function was not signaled, allowing it to complete its synchronization cycle before exiting. An upcoming patch improves pg_sync_replication_slots() to wait until replication slots are fully persisted before finishing. This behaviour requires the backend to exit promptly if a promotion occurs. This patch ensures that, during promotion, a signal is also sent to the backend running pg_sync_replication_slots(), allowing it to be interrupted and exit immediately. Author: Ajin Cherian <itsajin@gmail.com> Reviewed-by: Shveta Malik <shveta.malik@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/CAFPTHDZAA%2BgWDntpa5ucqKKba41%3DtXmoXqN3q4rpjO9cdxgQrw%40mail.gmail.com	2025-12-11 03:49:28 +00:00
Peter Geoghegan	e16c6f0247	Clarify why _bt_killitems sorts its items array. Make it clear why _bt_killitems sorts the scan's so->killedItems[] array. Also add an assertion to the _bt_killitems loop (that iterates through this array) to verify it accesses tuples in leaf page order. Follow-up to commit `bfb335df58`. Author: Peter Geoghegan <pg@bowt.ie> Suggested-by: Victor Yegorov <vyegorov@gmail.com> Discussion: https://postgr.es/m/CAGnEboirgArezZDNeFrR8FOGvKF-Xok333s2iVwWi65gZf8MEA@mail.gmail.com	2025-12-10 20:50:47 -05:00
Michael Paquier	06761b6096	Fix allocation formula in llvmjit_expr.c An array of LLVMBasicBlockRef is allocated with the size used for an element being "LLVMBasicBlockRef *" rather than "LLVMBasicBlockRef". LLVMBasicBlockRef is a type that refers to a pointer, so this did not directly cause a problem because both should have the same size, still it is incorrect. This issue has been spotted while reviewing a different patch, and exists since `2a0faed9d7`, so backpatch all the way down. Discussion: https://postgr.es/m/CA+hUKGLngd9cKHtTUuUdEo2eWEgUcZ_EQRbP55MigV2t_zTReg@mail.gmail.com Backpatch-through: 14	2025-12-11 10:25:21 +09:00
Peter Geoghegan	473cb1b951	Fix MULTIXACT_DEBUG builds. Oversight in commit `bd8d9c9b`. Discussion: https://postgr.es/m/CAH2-WzmvwVKZ+0Z=RL_+g_aOku8QxWddDCXmtyLj02y+nYaD0g@mail.gmail.com	2025-12-10 19:31:13 -05:00
Tom Lane	0909380e4c	Allow PG_PRINTF_ATTRIBUTE to be different in C and C++ code. Although clang claims to be compatible with gcc's printf format archetypes, this appears to be a falsehood: it likes __syslog__ (which gcc does not, on most platforms) and doesn't accept gnu_printf. This means that if you try to use gcc with clang++ or clang with g++, you get compiler warnings when compiling printf-like calls in our C++ code. This has been true for quite awhile, but it's gotten more annoying with the recent appearance of several buildfarm members that are configured like this. To fix, run separate probes for the format archetype to use with the C and C++ compilers, and conditionally define PG_PRINTF_ATTRIBUTE depending on __cplusplus. (We could alternatively insist that you not mix-and-match C and C++ compilers; but if the case works otherwise, this is a poor reason to insist on that.) No back-patch for now, but we may want to do that if this patch survives buildfarm testing. Discussion: https://postgr.es/m/986485.1764825548@sss.pgh.pa.us	2025-12-10 17:09:10 -05:00
Peter Geoghegan	bfb335df58	Return TIDs in desc order during backwards scans. Always return TIDs in descending order when returning groups of TIDs from an nbtree posting list tuple during nbtree backwards scans. This makes backwards scans tend to require fewer buffer hits, since the scan is less likely to repeatedly pin and unpin the same heap page/buffer (we'll get exactly as many buffer hits as we get with a similar forwards scan case). Commit `0d861bbb`, which added nbtree deduplication, originally did things this way to avoid interfering with _bt_killitems's approach to setting LP_DEAD bits on posting list tuples. _bt_killitems makes a soft assumption that it can always iterate through posting lists in ascending TID order, finding corresponding killItems[]/so->currPos.items[] entries in that same order. This worked out because of the prior _bt_readpage backwards scan behavior. If we just changed the backwards scan posting list logic in _bt_readpage, without altering _bt_killitems itself, it would break its soft assumption. Avoid that problem by sorting the so->killedItems[] array at the start of _bt_killitems. That way the order that dead items are saved in from btgettuple can't matter; so->killedItems[] will always be in the same order as so->currPos.items[] in the end. Since so->currPos.items[] is now always in leaf page order, regardless of the scan direction used within _bt_readpage, and since so->killedItems[] is always in that same order, the _bt_killitems loop can continue to make a uniform assumption about everything being in page order. In fact, sorting like this makes the previous soft assumption about item order into a hard invariant. Also deduplicate the so->killedItems[] array after it is sorted. That way there's no risk of the _bt_killitems loop becoming confused by a duplicate dead item/TID. 
This was possible in cases that involved a scrollable cursor that encountered the same dead TID more than once (within the same leaf page/so->currPos context). This doesn't come up very much in practice, but it seems best to be as consistent as possible about how and when _bt_killitems will LP_DEAD-mark index tuples. Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: Mircea Cadariu <cadariu.mircea@gmail.com> Reviewed-By: Victor Yegorov <vyegorov@gmail.com> Discussion: https://postgr.es/m/CAH2-Wz=Wut2pKvbW-u3hJ_LXwsYeiXHiW8oN1GfbKPavcGo8Ow@mail.gmail.com	2025-12-10 15:35:30 -05:00
Jeff Davis	630706ced0	Add pg_iswcased(). True if character has multiple case forms. Will be a useful multibyte-aware replacement for char_is_cased(). Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/450ceb6260cad30d7afdf155d991a9caafee7c0d.camel@j-davis.com	2025-12-10 11:56:11 -08:00
Jeff Davis	1e493158d3	Remove char_tolower() API. It's only useful for an ILIKE optimization for the libc provider using a single-byte encoding and a non-C locale, but it creates significant internal complexity. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/450ceb6260cad30d7afdf155d991a9caafee7c0d.camel@j-davis.com	2025-12-10 11:55:59 -08:00
Heikki Linnakangas	820343bab3	Fix bogus extra arguments to query_safe in test The test seemed to incorrectly think that query_safe() takes an argument that describes what the query does, similar to e.g. command_ok(). Until commit `bd8d9c9bdf` the extra arguments were harmless and were just ignored, but when commit `bd8d9c9bdf` introduced a new optional argument to query_safe(), the extra arguments started clashing with that, causing the test to fail. Backpatch to v17, that's the oldest branch where the test exists. The extra arguments didn't cause any trouble on the older branches, but they were clearly bogus anyway.	2025-12-10 19:38:07 +02:00
Heikki Linnakangas	343693c3c1	Improve DDL deparsing test 1. The test initially focuses on the "parent" table, then switches to the "part" table, and goes back to the "parent" table. That seems a little weird, so move the tests around so that all the commands on the "parent" table are done first, followed by the "part" table. 2. ALTER TABLE ALTER COLUMN SET EXPRESSION was not tested, so add that. Author: jian he <jian.universality@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/CACJufxFDi7fnwB-8xXd_ExML7-7pKbTaK4j46AJ=4-14DXvtVg@mail.gmail.com	2025-12-10 19:27:02 +02:00
Melanie Plageman	eebec3ca4b	Add comment about keeping PD_ALL_VISIBLE and VM in sync The comment above heap_xlog_visible() about the critical integrity requirement for PD_ALL_VISIBLE and the visibility map should also be in heap_xlog_prune_freeze() where we set PD_ALL_VISIBLE. Oversight in `add323da40` Author: Melanie Plageman <melanieplageman@gmail.com> Discussion: https://postgr.es/m/flat/CAAKRu_ZMw6Npd_qm2KM%2BFwQ3cMOMx1Dh3VMhp8-V7SOLxdK9-g%40mail.gmail.com	2025-12-10 11:10:13 -05:00
Melanie Plageman	bd298f54a0	Simplify vacuum visibility assertion Phase I vacuum gives the page a once-over after pruning and freezing to check that the values of all_visible and all_frozen agree with the result of heap_page_is_all_visible(). This is meant to keep the logic in phase I for determining visibility in sync with the logic in phase III. Rewrite the assertion to avoid an Assert(false). Suggested by Andres Freund. Author: Melanie Plageman <melanieplageman@gmail.com> Discussion: https://postgr.es/m/mhf4vkmh3j57zx7vuxp4jagtdzwhu3573pgfpmnjwqa6i6yj5y%40sy4ymcdtdklo	2025-12-10 11:10:01 -05:00
Heikki Linnakangas	70b4d90439	Fix comment in GetPublicationRelations This function gets the list of relations associated with the publication but the comment said the opposite. Author: Shlok Kyal <shlok.kyal.oss@gmail.com> Discussion: https://www.postgresql.org/message-id/CANhcyEV3C_CGBeDtjvKjALDJDMH-Uuc9BWfSd=eck8SCXnE=fQ@mail.gmail.com	2025-12-10 15:33:29 +02:00
Heikki Linnakangas	fa44b8b7fb	Fix some near-bugs related to ResourceOwner function arguments These functions took a ResourceOwner argument, but only checked if it was NULL, and then used CurrentResourceOwner for the actual work. Surely the intention was to use the passed-in resource owner. All current callers passed CurrentResourceOwner or NULL, so this has no consequences at the moment, but it's an accident waiting to happen for future callers and extensions. Author: Matthias van de Meent <boekewurm+postgres@gmail.com> Discussion: https://www.postgresql.org/message-id/CAEze2Whnfv8VuRZaohE-Af+GxBA1SNfD_rXfm84Jv-958UCcJA@mail.gmail.com Backpatch-through: 17	2025-12-10 11:43:16 +02:00
Michael Paquier	8268e66ac6	libpq: Authorize pthread_exit() in libpq_check pthread_exit() is added to the list of symbols allowed when building libpq. This has been reported as possible when libpq is statically linked to libcrypto, where pthread_exit() could be called. Reported-by: Torsten Rupp <torsten.rupp@gmx.net> Author: Nazir Bilal Yavuz <byavuz81@gmail.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/19095-6d8256d0c37d4be2@postgresql.org	2025-12-10 13:56:33 +09:00
Michael Paquier	1d7b00dc14	Fix failures with cross-version pg_upgrade tests Buildfarm members skimmer and crake have reported that pg_upgrade running from v18 fails due to the changes of `d52c24b0f8`, with the expectations that the objects removed in the test module injection_points should still be present post upgrades, but the test module does not have them anymore. The origin of the issue is that the following test modules depend on injection_points, but they do not drop the extension once the tests finish, leaving its traces in the dumps used for the upgrades: - gin, down to v17 - typcache, down to v18 - nbtree, HEAD-only Test modules have no upgrade requirements, as they are used only for.. Tests, so there is no point in keeping them around. An alternative solution would be to drop the databases created by these modules in AdjustUpgrade.pm, but the solution of this commit to drop the extension is simpler. Note that there would be a catch if using a solution based on AdjustUpgrade.pm as the database name used for the test runs differs between configure and meson: - configure relies on USE_MODULE_DB for the database name unicity, that would build a database name based on the first entry of REGRESS, that lists all the SQL tests. - meson relies on a "name" field. For example, for the test module "gin", the regression database is named "regression_gin" under meson, while it is more complex for configure, as of "contrib_regression_gin_incomplete_splits". So a AdjustUpgrade.pm would need a set of DROP DATABASE IF EXISTS to solve this issue, to cope with each build system. The failure has been caused by `d52c24b0f8`, and the problem can happen with upgrade dumps from v17 and v18 to HEAD. This problem is not currently reachable in the back-branches, but it could be possible that a future change in injection_points in stable branches invalidates this theory, so this commit is applied down to v17 in the test modules that matter. 
Per discussion with Tom Lane and Heikki Linnakangas. Discussion: https://postgr.es/m/2899652.1765167313@sss.pgh.pa.us Backpatch-through: 17	2025-12-10 12:46:45 +09:00
Michael Paquier	06817fc8a4	Fix two issues with recently-introduced nbtree test REGRESS has forgotten about the test nbtree_half_dead_pages, and a .gitignore was missing from the module. Oversights in `c085aab278` for REGRESS and `1e4e5783e7` for the missing .gitignore. Discussion: https://postgr.es/m/aTipJA1Y1zVSmH3H@paquier.xyz	2025-12-10 11:56:42 +09:00
Michael Paquier	801b4ee7fa	Fix meson warning due to missing declaration of NM The warning was showing up in the early stages of the meson build, when the contents of Makefile.global is generated based on the configuration of meson for PGXS. NM is added to pgxs_empty. This declaration is only used internally for the libpq sanity check, so there is no point in exposing it in PGXS. Oversight in `4a8e6f43a6`. Reported-by: Peter Eisentraut <peter@eisentraut.org> Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com> Discussion: https://postgr.es/m/4423e01f-1e52-4f47-a6ca-05cc8081c888@eisentraut.org	2025-12-10 08:10:28 +09:00
Heikki Linnakangas	bae9d2f892	Fix typo in comment Author: Xuneng Zhou <xunengzhou@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/CABPTF7V8CbOXGePqrad6EH3Om7DRhNiO3C0rQ-62UuT7RdU-GQ@mail.gmail.com	2025-12-10 01:06:03 +02:00
David Rowley	f275afc997	Fix misleading comment in tuplesort.c A comment in tuplesort.c was claiming that the code was defining INITIAL_MEMTUPSIZE so that it does not exceed ALLOCSET_SEPARATE_THRESHOLD, but the code actually ensures that we purposefully do exceed ALLOCSET_SEPARATE_THRESHOLD for the initial allocation of the tuples array, as per reasons detailed in the commentary of grow_memtuples(). Also, there's not much need to repeat the mention about ALLOCSET_SEPARATE_THRESHOLD in each location where INITIAL_MEMTUPSIZE is used, so remove those comments. Author: ChangAo Chen <cca5507@qq.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: David G. Johnston <david.g.johnston@gmail.com> Discussion: https://postgr.es/m/tencent_6FA14F85D6B5B5291532D6789E07F4765C08%40qq.com	2025-12-10 12:01:14 +13:00
Michael Paquier	1b105f9472	Use palloc_object() and palloc_array() in backend code The idea is to encourage more the use of these new routines across the tree, as these offer stronger type safety guarantees than palloc(). This batch of changes includes most of the trivial changes suggested by the author for src/backend/. A total of 334 files are updated here. Among these files, 48 of them have their build change slightly; these are caused by line number changes as the new allocation formulas are simpler, shaving around 100 lines of code in total. Similar work has been done in `0c3c5c3b06` and `31d3847a37`. Author: David Geier <geidav.pg@gmail.com> Discussion: https://postgr.es/m/ad0748d4-3080-436e-b0bc-ac8f86a3466a@gmail.com	2025-12-10 07:36:46 +09:00
Thomas Munro	c507ba55f5	Fix O_CLOEXEC flag handling in Windows port. PostgreSQL's src/port/open.c has always set bInheritHandle = TRUE when opening files on Windows, making all file descriptors inheritable by child processes. This meant the O_CLOEXEC flag, added to many call sites by commit `1da569ca1f` (v16), was silently ignored. The original commit included a comment suggesting that our open() replacement doesn't create inheritable handles, but it was a misunderstanding of the code path. In practice, the code was creating inheritable handles in all cases. This hasn't caused widespread problems because most child processes (archive_command, COPY PROGRAM, etc.) operate on file paths passed as arguments rather than inherited file descriptors. Even if a child wanted to use an inherited handle, it would need to learn the numeric handle value, which isn't passed through our IPC mechanisms. Nonetheless, the current behavior is wrong. It violates documented O_CLOEXEC semantics, contradicts our own code comments, and makes PostgreSQL behave differently on Windows than on Unix. It also creates potential issues with future code or security auditing tools. To fix, define O_CLOEXEC to _O_NOINHERIT in master, previously used by O_DSYNC. We use different values in the back branches to preserve existing values. In pgwin32_open_handle() we set bInheritHandle according to whether O_CLOEXEC is specified, for the same atomic semantics as POSIX in multi-threaded programs that create processes. Backpatch-through: 16 Author: Bryan Green <dbryan.green@gmail.com> Co-authored-by: Thomas Munro <thomas.munro@gmail.com> (minor adjustments) Discussion: https://postgr.es/m/e2b16375-7430-4053-bda3-5d2194ff1880%40gmail.com	2025-12-10 09:01:35 +13:00
Nathan Bossart	d107176d27	vacuumdb: Add --dry-run. This new option instructs vacuumdb to print, but not execute, the VACUUM and ANALYZE commands that would've been sent to the server. Author: Corey Huinker <corey.huinker@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/CADkLM%3DckHkX7Of5SrK7g0LokPUwJ%3Dkk8JU1GXGF5pZ1eBVr0%3DQ%40mail.gmail.com	2025-12-09 13:34:22 -06:00
Nathan Bossart	750816971b	Add ParallelSlotSetIdle(). This commit refactors the code for marking a ParallelSlot as idle to a new static inline function. This can be used to mark a slot that was obtained via ParallelSlotGetIdle() but that we don't intend to actually use for a query as idle again. This is preparatory work for a follow-up commit that will add a --dry-run option to vacuumdb. Reviewed-by: Corey Huinker <corey.huinker@gmail.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CADkLM%3DckHkX7Of5SrK7g0LokPUwJ%3Dkk8JU1GXGF5pZ1eBVr0%3DQ%40mail.gmail.com	2025-12-09 13:34:22 -06:00
Nathan Bossart	cf1450e577	vacuumdb: Move some variables to the vacuumingOptions struct. Presently, the "echo" and "quiet" variables are carted around to various functions, which is a bit tedious. To simplify things, this commit moves them into the vacuumingOptions struct and removes the related function parameters. While at it, remove some redundant initialization code in vacuumdb's main() function. This is preparatory work for a follow-up commit that will add a --dry-run option to vacuumdb. Reviewed-by: Corey Huinker <corey.huinker@gmail.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CADkLM%3DckHkX7Of5SrK7g0LokPUwJ%3Dkk8JU1GXGF5pZ1eBVr0%3DQ%40mail.gmail.com	2025-12-09 13:34:22 -06:00
Masahiko Sawada	ab40db3852	Add started_by column to pg_stat_progress_analyze view. The new column, started_by, indicates the initiator of the analyze ('manual' or 'autovacuum'), helping users and monitoring tools to better understand ANALYZE behavior. Bump catalog version. Author: Shinya Kato <shinya11.kato@gmail.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Sami Imseih <samimseih@gmail.com> Reviewed-by: Yu Wang <wangyu_runtime@163.com> Discussion: https://postgr.es/m/CAA5RZ0suoicwxFeK_eDkUrzF7s0BVTaE7M%2BehCpYcCk5wiECpw%40mail.gmail.com	2025-12-09 11:23:45 -08:00
Masahiko Sawada	0d78952061	Add mode and started_by columns to pg_stat_progress_vacuum view. The new columns, mode and started_by, indicate the vacuum mode ('normal', 'aggressive', or 'failsafe') and the initiator of the vacuum ('manual', 'autovacuum', or 'autovacuum_wraparound'), respectively. This allows users and monitoring tools to better understand VACUUM behavior. Bump catalog version. Author: Shinya Kato <shinya11.kato@gmail.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Reviewed-by: Robert Treat <rob@xzilla.net> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Sami Imseih <samimseih@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Yu Wang <wangyu_runtime@163.com> Discussion: https://postgr.es/m/CAOzEurQcOY-OBL_ouEVfEaFqe_md3vB5pXjR_m6L71Dcp1JKCQ@mail.gmail.com	2025-12-09 10:51:14 -08:00
Nathan Bossart	b237f5422b	doc: Fix titles of some pg_buffercache functions. As in commit `59d6c03956`, use <function> rather than <structname> in the <title> to be consistent with how other functions in this module are documented. Oversights in commits `dcf7e1697b` and `9ccc049dfe`. Author: Noboru Saito <noborusai@gmail.com> Discussion: https://postgr.es/m/CAAM3qn%2B7KraFkCyoJCHq6m%3DurxcoHPEPryuyYeg%3DQ0EjJxjdTA%40mail.gmail.com Backpatch-through: 18	2025-12-09 11:01:38 -06:00
Tom Lane	f8715ec866	Support "j" length modifier in snprintf.c. POSIX has for a long time defined the "j" length modifier for printf conversions as meaning the size of intmax_t or uintmax_t. We got away without supporting that so far, because we were not using intmax_t anywhere. However, commit `e6be84356` re-introduced upstream's use of intmax_t and PRIdMAX into zic.c. It emerges that on some platforms (at least FreeBSD and macOS), <inttypes.h> defines PRIdMAX as "jd", so that snprintf.c falls over if that is used. (We hadn't noticed yet because it would only be apparent if bad data is fed to zic, resulting in an error report, and even then the only visible symptom is a missing line number in the error message.) We could revert that decision from our copy of zic.c, but on the whole it seems better to update snprintf.c to support this standard modifier. There might well be extensions, now or in future, that expect it to work. I did this in the lazy man's way of translating "j" to either "l" or "ll" depending on a compile-time sizeof() check, just as was done long ago to support "z" for size_t. One could imagine promoting intmax_t to have full support in snprintf.c, for example converting fmtint()'s value argument and internal arithmetic to use [u]intmax_t not [unsigned] long long. But that'd be more work and I'm hesitant to do it anyway: if there are any platforms out there where intmax_t is actually wider than "long long", this would doubtless result in a noticeable speed penalty to snprintf(). Let's not go there until we have positive evidence that there's a reason to, and some way to measure what size of penalty we're taking. Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/3210703.1765236740@sss.pgh.pa.us	2025-12-09 11:43:25 -05:00
Heikki Linnakangas	3cb5808bd1	Add wait event for the group commit delay before WAL flush Author: Rafia Sabih <rafia.pghackers@gmail.com> Reviewed-by: Sami Imseih <samimseih@gmail.com> Discussion: https://www.postgresql.org/message-id/CA%2BFpmFf-hWXtrC0Q3Cr_Xo78zuP_M_VC5xgWPOYOkwqOD0T8eg@mail.gmail.com	2025-12-09 17:06:40 +02:00
Heikki Linnakangas	f231a4e8c7	Fix warning about wrong format specifier for off_t type Per OS X buildfarm members.	2025-12-09 14:05:13 +02:00
Heikki Linnakangas	bd8d9c9bdf	Widen MultiXactOffset to 64 bits This eliminates MultiXactOffset wraparound and the 2^32 limit on the total number of multixid members. Multixids are still limited to 2^31, but this is a nice improvement because 'members' can grow much faster than the number of multixids. On such systems, you can now run longer before hitting hard limits or triggering anti-wraparound vacuums. Not having to deal with MultiXactOffset wraparound also simplifies the code and removes some gnarly corner cases. We no longer need to perform emergency anti-wraparound freezing because of running out of 'members' space, so the offset stop limit is gone. But you might still not want 'members' to consume huge amounts of disk space. For that reason, I kept the logic for lowering vacuum's multixid freezing cutoff if a large amount of 'members' space is used. The thresholds for that are roughly the same as the "safe" and "danger" thresholds used before, 2 billion transactions and 4 billion transactions. This keeps the behavior for the freeze cutoff roughly the same as before. It might make sense to make this smarter or configurable, now that the threshold is only needed to manage disk usage, but that's left for the future. Add code to pg_upgrade to convert multitransactions from the old to the new format, rewriting the pg_multixact SLRU files. Because pg_upgrade now rewrites the files, we can get rid of some hacks we had put in place to deal with old bugs and upgraded clusters. Bump catalog version for the pg_multixact/offsets format change. Author: Maxim Orlov <orlovmg@gmail.com> Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com> Reviewed-by: wenhui qiu <qiuwenhuifx@gmail.com> Discussion: https://www.postgresql.org/message-id/CACG%3DezaWg7_nt-8ey4aKv2w9LcuLthHknwCawmBgEeTnJrJTcw@mail.gmail.com	2025-12-09 13:53:03 +02:00
Heikki Linnakangas	bb3b1c4f64	Move pg_multixact SLRU page format definitions to a separate header This makes them accessible from pg_upgrade, needed by the next commit. I'm doing this mechanical move as a separate commit to make the next commit's changes to these definitions more obvious. Author: Maxim Orlov <orlovmg@gmail.com> Discussion: https://www.postgresql.org/message-id/CACG%3DezbZo_3_fnx%3DS5BfepwRftzrpJ%2B7WET4EkTU6wnjDTsnjg@mail.gmail.com	2025-12-09 13:45:01 +02:00
Dean Rasheed	e9443a5526	doc: Fix statement about ON CONFLICT and deferrable constraints. The description of deferrable constraints in create_table.sgml states that deferrable constraints cannot be used as conflict arbitrators in an INSERT with an ON CONFLICT DO UPDATE clause, but in fact this restriction applies to all ON CONFLICT clauses, not just those with DO UPDATE. Fix this, and while at it, change the word "arbitrators" to "arbiters", to match the terminology used elsewhere. Author: Dean Rasheed <dean.a.rasheed@gmail.com> Discussion: https://postgr.es/m/CAEZATCWsybvZP3ce8rGcVNx-QHuDOJZDz8y=p1SzqHwjRXyV4Q@mail.gmail.com Backpatch-through: 14	2025-12-09 10:49:16 +00:00
Richard Guo	f00484c170	Fix distinctness check for queries with grouping sets query_is_distinct_for() is intended to determine whether a query never returns duplicates of the specified columns. For queries using grouping sets, if there are no grouping expressions, the query may contain one or more empty grouping sets. The goal is to detect whether there is exactly one empty grouping set, in which case the query would return a single row and thus be distinct. The previous logic in query_is_distinct_for() was incomplete because the check was insufficiently thorough and could return false when it could have returned true. It failed to consider cases where the DISTINCT clause is used on the GROUP BY, in which case duplicate empty grouping sets are removed, leaving only one. It also did not correctly handle all possible structures of GroupingSet nodes that represent a single empty grouping set. To fix, add a check for the groupDistinct flag, and expand the query's groupingSets tree into a flat list, then verify that the expanded list contains only one element. No backpatch as this could result in plan changes. Author: Richard Guo <guofenglinux@gmail.com> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/CAMbWs480Z04NtP8-O55uROq2Zego309+h3hhaZhz6ztmgWLEBw@mail.gmail.com	2025-12-09 17:09:27 +09:00
Richard Guo	c925ad30b0	Fix const-simplification for index expressions and predicate Similar to the issue with constraint and statistics expressions fixed in `317c117d6`, index expressions and predicate can also suffer from incorrect reduction of NullTest clauses during const-simplification, due to unfixed varnos and the use of a NULL root. It has been reported that this issue can cause the planner to fail to pick up a partial index that it previously matched successfully. Because we need to cache the const-simplified index expressions and predicate in the relcache entry, we cannot fix the Vars before applying eval_const_expressions. To ensure proper reduction of NullTest clauses, this patch runs eval_const_expressions a second time -- after the Vars have been fixed and with a valid root. It could be argued that the additional call to eval_const_expressions might increase planning time, but I don't think that's a concern. It only runs when index expressions and predicate are present; it is relatively cheap when run on small expression trees (which is typically the case for index expressions and predicate), and it runs on expressions that have already been const-simplified once, making the second pass even cheaper. In return, in cases like the one reported, it allows the planner to match and use partial indexes, which can lead to significant execution-time improvements. Bug: #19007 Reported-by: Bryan Fox <bryfox@gmail.com> Author: Richard Guo <guofenglinux@gmail.com> Discussion: https://postgr.es/m/19007-4cc6e252ed8aa54a@postgresql.org	2025-12-09 16:56:26 +09:00
Amit Kapila	04396eacd3	Fix LOCK_TIMEOUT handling in slotsync worker. Previously, the slotsync worker relied on SIGINT for graceful shutdown during promotion. However, SIGINT is also used by the LOCK_TIMEOUT handler to cancel queries. Since the slotsync worker can lock catalog tables while parsing libpq tuples, this overlap caused it to ignore LOCK_TIMEOUT signals and potentially wait indefinitely on locks. This patch replaces the slotsync worker's SIGINT handler with StatementCancelHandler to correctly process query-cancel interrupts. Additionally, the startup process now uses SIGUSR1 to signal the slotsync worker to stop during promotion. The worker exits after detecting that the shared memory flag stopSignaled is set. Author: Hou Zhijie <houzj.fnst@fujitsu.com> Reviewed-by: shveta malik <shveta.malik@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Backpatch-through: 17, here it was introduced Discussion: https://postgr.es/m/TY4PR01MB169078F33846E9568412D878C94A2A@TY4PR01MB16907.jpnprd01.prod.outlook.com	2025-12-09 07:25:20 +00:00
Peter Eisentraut	2268f2b91b	Remove useless casts in format arguments There were a number of useless casts in format arguments, either where the input to the cast was already in the right type, or seemingly uselessly casting between types instead of just using the right format placeholder to begin with. Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/07fa29f9-42d7-4aac-8834-197918cbbab6%40eisentraut.org	2025-12-09 07:33:08 +01:00
Peter Eisentraut	907caf5c39	Clean up int64-related format strings Remove some gratuitous uses of INT64_FORMAT. Make use of PRIu64/PRId64 where appropriate, remove unnecessary casts. Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/07fa29f9-42d7-4aac-8834-197918cbbab6%40eisentraut.org	2025-12-09 07:33:08 +01:00
Peter Eisentraut	2b117bb014	Remove unnecessary casts in printf format arguments (%zu/%zd) Many of these are probably left over from before use of %zu/%zd was portable. Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/07fa29f9-42d7-4aac-8834-197918cbbab6%40eisentraut.org	2025-12-09 07:33:08 +01:00
Michael Paquier	0c3c5c3b06	Use palloc_object() and palloc_array() in more areas of the tree The idea is to encourage more the use of these new routines across the tree, as these offer stronger type safety guarantees than palloc(). The following paths are included in this batch, treating all the areas proposed by the author for the most trivial changes, except src/backend (by far the largest batch): src/bin/ src/common/ src/fe_utils/ src/include/ src/pl/ src/test/ src/tutorial/ Similar work has been done in `31d3847a37`. The code compiles the same before and after this commit, with the following exceptions due to changes in line numbers because some of the new allocation formulas are shorter: blkreftable.c pgfnames.c pl_exec.c Author: David Geier <geidav.pg@gmail.com> Discussion: https://postgr.es/m/ad0748d4-3080-436e-b0bc-ac8f86a3466a@gmail.com	2025-12-09 14:53:17 +09:00
Andres Freund	aa749bde32	Improve documentation for pg_atomic_unlocked_write_u32() After my recent commit `7902a47c20`, Nathan noticed that pg_atomic_unlocked_write_u64() was not accurately described by the comments for the 32bit version. Turns out the 32bit version has suffered from copy-and-paste-itis since its introduction. Fix. Reported-by: Nathan Bossart <nathandbossart@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Discussion: https://postgr.es/m/aTGt7q4Jvn97uGAx@nathan	2025-12-08 23:11:19 -05:00
David Rowley	52382feb78	Doc: fix typo in hash index documentation Plus a similar fix to the README. Backpatch as far back as the sgml issue exists. The README issue does exist in v14, but that seems unlikely to harm anyone. Author: David Geier <geidav.pg@gmail.com> Discussion: https://postgr.es/m/ed3db7ea-55b4-4809-86af-81ad3bb2c7d3@gmail.com Backpatch-through: 15	2025-12-09 14:41:30 +13:00
Michael Paquier	4a8e6f43a6	libpq: Refactor logic checking for exit() in shared library builds This commit refactors the sanity check done by libpq to ensure that there is no exit() reference in the build, moving the check from a standalone Makefile rule to a perl script. Platform-specific checks are now part of the script, avoiding most of the duplication created by the introduction of this check for meson, but not all of them: - Solaris and Windows skipped in the script. - Whitelist of symbols is in the script. - nm availability, with its path given as an option of the script. Its execution is checked in the script. - Check is disabled if coverage reports are enabled. This part is not pushed down to the script. - Check is disabled for static builds of libpq. This part is filtered out in each build script. A trick is required for the stamp file, in the shape of an optional argument that can be given to the script. Meson expects the stamp in output and uses this argument, generating the stamp file in the script. Meson is able to handle the removal of the stamp file internally when libpq needs to be rebuilt and the check done again. This refactoring piece has come up while discussing the addition of more items in the symbols considered as acceptable. This sanity check has never been run by meson since its introduction in `dc227eb82e`, so it is possible that this fails in some of the buildfarm members. At least the CI is happy with it, but let's see how it goes. Author: Nazir Bilal Yavuz <byavuz81@gmail.com> Co-authored-by: VASUKI M <vasukim1992002@gmail.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/19095-6d8256d0c37d4be2@postgresql.org	2025-12-09 10:39:08 +09:00
Tom Lane	c004d68c93	Fix minor portability issue in pg_resetwal.c. The argument of isspace() (like other <ctype.h> functions) must be cast to unsigned char to ensure portable results. Per NetBSD buildfarm members. Oversight in `636c1914b`.	2025-12-08 19:06:36 -05:00
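The portability rule cited in `c004d68c93` can be illustrated with a minimal sketch. The helper name `skip_whitespace` is hypothetical and is not taken from pg_resetwal.c; it only shows the cast the commit message describes.

```c
#include <ctype.h>

/*
 * Hypothetical helper (not from pg_resetwal.c): return a pointer to the
 * first non-whitespace character of s.
 */
const char *
skip_whitespace(const char *s)
{
	/*
	 * Plain char may be signed, so a byte with the high bit set can reach
	 * isspace() as a negative int, which is undefined behavior.  Casting
	 * to unsigned char keeps the argument in the range <ctype.h> accepts.
	 */
	while (*s != '\0' && isspace((unsigned char) *s))
		s++;
	return s;
}
```

Without the cast the code often appears to work, which is presumably why the problem only surfaced on particular buildfarm members.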
Peter Geoghegan	83a26ba59b	Avoid pointer chasing in _bt_readpage inner loop. Make _bt_readpage pass down the current scan direction to various utility functions within its pstate variable. Also have _bt_readpage work off of a local copy of scan->ignore_killed_tuples within its per-tuple loop (rather than using scan->ignore_killed_tuples directly). Testing has shown that this significantly benefits large range scans, which are naturally able to take full advantage of the pstate.startikey optimization added by commit `8a510275`. Running a pgbench script with a "SELECT abalance FROM pgbench_accounts WHERE aid BETWEEN ..." query shows an increase in transaction throughput of over 5%. There also appears to be a small performance benefit when running pgbench's built-in select-only script. Follow-up to commit `65d6acbc`. Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: Victor Yegorov <vyegorov@gmail.com> Discussion: https://postgr.es/m/CAH2-WzmwMwcwKFgaf+mYPwiz3iL4AqpXnwtW_O0vqpWPXRom9Q@mail.gmail.com	2025-12-08 13:48:09 -05:00
Álvaro Herrera	d0d0ba6cf6	Unify some more messages No backpatch here because of message wording changes. Author: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Discussion: https://postgr.es/m/202512081537.ahw5gwoencou@alvherre.pgsql	2025-12-08 19:25:36 +01:00
Peter Geoghegan	65d6acbc56	Relocate _bt_readpage and related functions. Quite a bit of code within nbtutils.c is only called by _bt_readpage. Move _bt_readpage and all of the nbtutils.c functions it depends on into a new .c file, nbtreadpage.c. Also reorder some of the functions within the new file for clarity. This commit has no functional impact. It is strictly mechanical. Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: Victor Yegorov <vyegorov@gmail.com> Discussion: https://postgr.es/m/CAH2-WzmwMwcwKFgaf+mYPwiz3iL4AqpXnwtW_O0vqpWPXRom9Q@mail.gmail.com	2025-12-08 13:15:00 -05:00
Álvaro Herrera	502e256f22	Unify error messages No visible changes, just refactor how messages are constructed.	2025-12-08 16:30:52 +01:00
Heikki Linnakangas	978cf02bb8	pg_resetwal: Use separate flags for whether an option is given Currently, we use special values that are otherwise invalid for each option to indicate "option was not given". Replace that with separate boolean variables for each option. It seems more clear to be explicit. We were already doing that for the -m option, because there were no invalid values for nextMulti that we could use (since commit `94939c5f3a`). Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/81adf5f3-36ad-4bcd-9ba5-1b95c7b7a807@iki.fi	2025-12-08 16:54:54 +02:00
Heikki Linnakangas	636c1914b4	pg_resetwal: Reject negative and out of range arguments The strtoul() function that we used to parse many of the options accepts negative values, and silently wraps them to the equivalent unsigned values. For example, -1 becomes 0xFFFFFFFF, on platforms where unsigned long is 32 bits wide. Also, on platforms where "unsigned long" is 64 bits wide, we silently casted values larger than UINT32_MAX to the equivalent 32-bit value. Both of those behaviors seem undesirable, so tighten up the parsing to reject them. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/81adf5f3-36ad-4bcd-9ba5-1b95c7b7a807@iki.fi	2025-12-08 16:54:50 +02:00
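A rough sketch of the stricter parsing described in `636c1914b4`, under stated assumptions: the function name `parse_uint32_strict` and its exact checks are illustrative only and are not the code that was committed.

```c
#include <ctype.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not the committed code): parse arg as a 32-bit
 * unsigned value, rejecting negative and out-of-range input.
 */
bool
parse_uint32_strict(const char *arg, uint32_t *result)
{
	char	   *endptr;
	unsigned long long val;

	/*
	 * The strtoul() family skips leading whitespace and accepts a sign,
	 * wrapping "-1" around to a huge unsigned value.  Look for the minus
	 * sign ourselves and reject it.
	 */
	while (isspace((unsigned char) *arg))
		arg++;
	if (*arg == '-')
		return false;

	errno = 0;
	val = strtoull(arg, &endptr, 0);

	/* Reject overflow, empty input, and trailing garbage. */
	if (errno != 0 || endptr == arg || *endptr != '\0')
		return false;

	/* Reject values that would be silently truncated to 32 bits. */
	if (val > UINT32_MAX)
		return false;

	*result = (uint32_t) val;
	return true;
}
```

Parsing into a wider type and range-checking afterwards gives the same result whether "unsigned long" is 32 or 64 bits wide, which covers both cases the commit message mentions.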
Peter Eisentraut	7f88553cea	Make ecpg parse.pl more robust with braces When parse.pl processes braces, it does not take into account that braces could also be their own token if single quoted ('{', '}'). This is not currently used but a future patch wants to make use of it. This fixes that by using lookaround assertions to detect the quotes. To make sure all Perl versions in play support this and to avoid surprises later on, let's give this a spin on the buildfarm now. It can exist independently of future work. Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/a855795d-e697-4fa5-8698-d20122126567@eisentraut.org	2025-12-08 15:53:52 +01:00
Peter Eisentraut	804046b39a	Use PGAlignedXLogBlock for some code simplification The code in BootStrapXLOG() and in pg_test_fsync.c tried to align WAL buffers in complicated ways. Also, they still used XLOG_BLCKSZ for the alignment, even though that should now be PG_IO_ALIGN_SIZE. This can now be simplified and made more consistent by using PGAlignedXLogBlock, either directly in BootStrapXLOG() and using alignas in pg_test_fsync.c. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Thomas Munro <thomas.munro@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/f462a175-b608-44a1-b428-bdf351e914f4%40eisentraut.org	2025-12-08 14:54:32 +01:00
Michael Paquier	31280d96a6	test_custom_stats: Test module for custom cumulative statistics This test module acts as a replacement for the code that existed prior to `d52c24b0f8` in the test module injection_points. It uses a more flexible structure than its ancestor: - Two libraries are built, one for fixed-sized stats and one for variable-sized stats. - No GUCs required. The stats are enabled only if one or both libraries are loaded with shared_preload_libraries. - Same kind IDs reserved: 25 (variable-sized) and 26 (fixed-sized) The goal of this redesign is to make it easier to extend the code coverage provided by this module for other changes that are currently under discussion, and injection_points was not suited for these. Injection points are also now widely used in the tree, so further extending the test coverage for custom pgstats in the test module injection_points would be a riskier long-term move. The new code is mostly a copy of what existed previously in the test module injection_points, with the same callbacks defined for fixed-sized and variable-sized stats, but a simpler overall structure in terms of the stats counters updated. The test coverage should remain the same as previously: one TAP test is used to check data reports, crash recovery and clean restart scenarios. Tests are added for the manual reset of fixed-sized stats, something not tested until now. Author: Sami Imseih <samimseih@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CAA5RZ0sJgO6GAwgFxmzg9MVP=rM7Us8KKcWpuqxe-f5qxmpE0g@mail.gmail.com	2025-12-08 15:23:09 +09:00
Amit Kapila	006dd4b2e5	Prevent invalidation of newly created replication slots. A race condition could cause a newly created replication slot to become invalidated between WAL reservation and a checkpoint. Previously, if the required WAL was removed, we retried the reservation process. However, the slot could still be invalidated before the retry if the WAL was not yet removed but the checkpoint advanced the redo pointer beyond the slot's intended restart LSN and computed the minimum LSN that needs to be preserved for the slots. The fix is to acquire an exclusive lock on ReplicationSlotAllocationLock during WAL reservation to serialize WAL reservation and checkpoint's minimum restart_lsn computation. This ensures that, if WAL reservation occurs first, the checkpoint waits until restart_lsn is updated before removing WAL. If the checkpoint runs first, subsequent WAL reservations pick a position at or after the latest checkpoint's redo pointer. We can't use the same fix for branch 17 and prior because commit `2090edc6f3` changed the code to compute the minimum restart_lsn among slots at the beginning of a checkpoint (or restart point). The fix for 17 and prior branches is under discussion and will be committed separately. Reported-by: suyu.cmj <mengjuan.cmj@alibaba-inc.com> Author: Hou Zhijie <houzj.fnst@fujitsu.com> Reviewed-by: Vitaly Davydov <v.davydov@postgrespro.ru> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Backpatch-through: 18 Discussion: https://postgr.es/m/5e045179-236f-4f8f-84f1-0f2566ba784c.mengjuan.cmj@alibaba-inc.com	2025-12-08 05:21:22 +00:00
Michael Paquier	d52c24b0f8	injection_points: Remove portions related to custom pgstats The test module injection_points has been used as a landing spot to provide coverage for the custom pgstats APIs, for both fixed-sized and variable-sized stats kinds. Some recent work related to pgstats is proving that this structure makes the implementation of new tests harder. This commit removes the code related to pgstats from injection_points, and an equivalent will be reintroduced as a separate test module in a follow-up commit. This removal is done in its own commit for clarity. Using injection_points for this test coverage was perhaps not the best way to design things, but this was good enough while working on the first flavor of the custom pgstats APIs. Using a new test module will make easier the introduction of new tests, and we will not need to worry about the impact of new changes related to custom pgstats could have with the internals of injection_points. Author: Sami Imseih <samimseih@gmail.com> Discussion: https://postgr.es/m/CAA5RZ0sJgO6GAwgFxmzg9MVP=rM7Us8KKcWpuqxe-f5qxmpE0g@mail.gmail.com	2025-12-08 12:45:20 +09:00
Michael Paquier	f68597ee77	Improve error messages of input functions for pg_dependencies and pg_ndistinct The error details updated in this commit can be reached in the regression tests. They did not follow the project style, and they should be written as full sentences. Some of the errors are switched to use an elog(), for cases that involve paths that cannot be reached based on the previous state of the parser processing the input data (array start, object end, etc.). The error messages for these cases now use a more consistent style across the board, with the state of the parser reported for debugging. Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Author: Michael Paquier <michael@paquier.xyz> Co-authored-by: Corey Huinker <corey.huinker@gmail.com> Discussion: https://postgr.es/m/1353179.1764901790@sss.pgh.pa.us	2025-12-08 10:23:48 +09:00
Tom Lane	4eda42e8bd	ecpg: refactor to eliminate cast-away-const in find_variable(). find_variable() and its subroutines transiently scribble on the passed-in "name" string, even though we've declared that "const". The string is in fact temporary, so this is not very harmful, but it's confusing and will produce compiler warnings with late-model gcc. Rearrange the code so that instead of modifying the given string, we make temporary copies of the parts that we need separated out. (I used loc_alloc so that the copies are short-lived and don't need to be freed explicitly.) This code is poorly structured and confusing, to the point where my first attempt to fix it was wrong. It is also under-tested, allowing the broken v1 patch to nonetheless pass regression. I'll restrain myself from rewriting it completely, and just add some comments and more test cases. We will probably want to back-patch this once gcc 15.2 becomes more widespread, but for now just put it in master. Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/1324889.1764886170@sss.pgh.pa.us	2025-12-07 14:32:36 -05:00
Tom Lane	005a2907dc	Micro-optimize datatype conversions in datum_to_jsonb_internal. The general case for converting to a JSONB numeric value is to run the source datatype's output function and then numeric_in, but we can do substantially better than that for integer and numeric source values. This patch improves the speed of jsonb_agg by 30% for integer input, and nearly 2X for numeric input. Sadly, the obvious idea of using float4_numeric and float8_numeric to speed up those cases doesn't work: they are actually slower than the generic coerce-via-I/O method, and not by a small amount. They might round off differently than this code has historically done, too. Leave that alone pending possible changes in those functions. We can also do better than the existing code for text/varchar/bpchar source data; this optimization is similar to one that already exists in the json_agg() code. That saves 20% or so for such inputs. Also make a couple of other minor improvements, such as not giving JSONTYPE_CAST its own special case outside the switch when it could perfectly well be handled inside, and not using dubious string hacking to detect infinity and NaN results. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: jian he <jian.universality@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/1060917.1753202222@sss.pgh.pa.us	2025-12-07 11:54:33 -05:00
Tom Lane	b61aa76e45	Remove fundamentally-redundant processing in jsonb_agg() et al. The various variants of jsonb_agg() operate as follows, for each aggregate input value: 1. Build a JsonbValue tree representation of the input value. 2. Flatten the JsonbValue tree into a Jsonb in on-disk format. 3. Iterate through the Jsonb, building a JsonbValue that is part of the aggregate's state stored in aggcontext, but is otherwise identical to what phase 1 built. This is very slightly less silly than it sounds, because phase 1 involves calling non-JSONB code such as datatype output functions, which are likely to leak memory, and we don't want to leak into the aggcontext. Nonetheless, phases 2 and 3 are accomplishing exactly nothing that is useful if we can make phase 1 put the JsonbValue tree where we need it. We could probably do that with a bunch of MemoryContextSwitchTo's, but what seems more robust is to give pushJsonbValue the responsibility of building the JsonbValue tree in a specified non-current memory context. The previous patch created the infrastructure for that, and this patch simply makes the aggregate functions use it and then rips out phases 2 and 3. For me, this makes jsonb_agg() with a text column as input run about 2X faster than before. It's not yet on par with json_agg(), but this removes a whole lot of the difference. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: jian he <jian.universality@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/1060917.1753202222@sss.pgh.pa.us	2025-12-07 11:52:22 -05:00
Tom Lane	0986e95161	Revise APIs for pushJsonbValue() and associated routines. Instead of passing "JsonbParseState **" to pushJsonbValue(), pass a pointer to a JsonbInState, which will contain the parseState stack pointer as well as other useful fields. Also, instead of returning a JsonbValue pointer that is often meaningless/ignored, return the top-level JsonbValue pointer in the "result" field of the JsonbInState. This involves a lot of (mostly mechanical) edits, but I think the results are notationally cleaner and easier to understand. Certainly the business with sometimes capturing the result of pushJsonbValue() and sometimes not was bug-prone and incapable of mechanical verification. In the new arrangement, JsonbInState.result remains null until we've completed a valid sequence of pushes, so that an incorrect sequence will result in a null-pointer dereference, not mistaken use of a partial result. However, this isn't simply an exercise in prettier notation. The real reason for doing it is to provide a mechanism whereby pushJsonbValue() can be told to construct the JsonbValue tree in a context that is not CurrentMemoryContext. That happens when a non-null "outcontext" is specified in the JsonbInState. No callers exercise that option in this patch, but the next patch in the series will make use of it. I tried to improve the comments in this area too. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: jian he <jian.universality@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/1060917.1753202222@sss.pgh.pa.us	2025-12-07 11:51:33 -05:00
Tom Lane	3628af4210	Add a macro for the declared typlen of type timetz. pg_type.typlen says 12 for the size of timetz, but sizeof(TimeTzADT) will be 16 on most platforms due to alignment padding. Using the sizeof number is no problem for usages such as palloc'ing a result datum, but in usages such as datumCopy we really ought to match what pg_type says. Add a macro TIMETZ_TYPLEN so that we have a symbolic way to write that rather than hard-coding "12". I cannot find any place where we've needed this so far, but an upcoming patch requires it. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/2329959.1765047648@sss.pgh.pa.us	2025-12-07 11:33:35 -05:00
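The gap between the declared typlen and sizeof() described in `3628af4210` can be sketched as follows. `DemoTimeTz` and `DEMO_TIMETZ_TYPLEN` are stand-ins invented for this illustration; the field layout (an 8-byte time followed by a 4-byte zone) is an assumption, not a copy of the real TimeTzADT definition or the TIMETZ_TYPLEN macro.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in struct; layout is assumed for the purpose of the example. */
typedef struct DemoTimeTz
{
	int64_t		time;			/* 8 bytes */
	int32_t		zone;			/* 4 bytes */
} DemoTimeTz;					/* usually padded to 16 bytes */

/* Stand-in for the declared on-disk length (12 per the commit message). */
#define DEMO_TIMETZ_TYPLEN 12

/* Number of trailing padding bytes the compiler added, if any. */
size_t
demo_padding_bytes(void)
{
	return sizeof(DemoTimeTz) - DEMO_TIMETZ_TYPLEN;
}

/*
 * Copy only the declared length, as a datumCopy-style routine would;
 * padding bytes in the destination are left untouched.
 */
void
demo_copy_declared(DemoTimeTz *dst, const DemoTimeTz *src)
{
	memcpy(dst, src, DEMO_TIMETZ_TYPLEN);
}
```

Allocating with sizeof() is harmless because it only over-allocates, while copying by the declared length keeps the copy in step with what the catalog says.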
Tom Lane	6498287696	Handle constant inputs to corr() and related aggregates more precisely. The SQL standard says that corr() and friends should return NULL in the mathematically-undefined case where all the inputs in one of the columns have the same value. We were checking that by seeing if the sums Sxx and Syy were zero, but that approach is very vulnerable to roundoff error: if a sum is close to zero but not exactly that, we'd come out with a pretty silly non-NULL result. Instead, directly track whether the inputs are all equal by remembering the common value in each column. Once we detect that a new input is different from before, represent that by storing NaN for the common value. (An objection to this scheme is that if the inputs are all NaN, we will consider that they were not all equal. But under IEEE float arithmetic rules, one NaN is never equal to another, so this behavior is arguably correct. Moreover it matches what we did before in such cases.) Then, leave the sums at their exact value of zero for as long as we haven't detected different input values. This solution requires the aggregate transition state to contain 8 float values not 6, which is not problematic, and it seems to add less than 1% to the aggregates' runtime, which seems acceptable. While we're here, improve corr()'s final function to cope with overflow/underflow in the final calculation, and to clamp its result to [-1, 1] in case of roundoff error. Although this is arguably a bug fix, it requires a catversion bump due to the change in aggregates' initial states, so it can't be back-patched. Patch written by me, but many of the ideas are due to Dean Rasheed, who also did a deal of testing. Bug: #19340 Reported-by: Oleg Ivanov <o15611@gmail.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Co-authored-by: Dean Rasheed <dean.a.rasheed@gmail.com> Discussion: https://postgr.es/m/19340-6fb9f6637f562092@postgresql.org	2025-12-06 18:31:26 -05:00
Tom Lane	25303961d0	Doc: include JSON in the list of SQL-standard types. Oversight I guess, it's been in the standard for awhile. Reported-by: Bob Kline <bkline@rksystems.com> Discussion: https://postgr.es/m/CAGjKmVoP4qVeJgkaBtQ6L46+OLARzmym53uQGhp5COw4wp65yQ@mail.gmail.com	2025-12-06 13:34:48 -05:00
Michael Paquier	47da198934	Improve error reporting of recovery test 027_stream_regress Previously, the 027_stream_regress test reported the full contents of regression.diffs upon a test failure, when the standby and the primary were still alive. If a test fails quite badly, the amount of information reported can be really high, bloating the reports in the buildfarm, the CI, or even local runs. In most cases, we have noticed that having all this information is not necessary when attempting to identify the source of a problem in this test. This commit changes the situation by including the head and tail of regression.diffs in the reports generated on failure rather than its full contents, building upon `b93f4e2f98` to optionally control the size of the reports with the new environment variable PG_TEST_FILE_READ_LINES. This will perhaps require some more tuning, but the hope is to reduce some of the buildfarm report bloat while making the information good enough to deduce what is happening when something is going wrong, be it in the buildfarm or some tests run in the CI, at least. Suggested-by: Andres Freund <andres@anarazel.de> Author: Nazir Bilal Yavuz <byavuz81@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CAN55FZ1D6KXvjSs7YGsDeadqCxNF3UUhjRAfforzzP0k-cE=bA@mail.gmail.com	2025-12-06 14:41:29 +09:00
Michael Paquier	b93f4e2f98	Add PostgreSQL::Test::Cluster::read_head_tail() helper to PostgreSQL/Utils.pm This function reads the lines from a file and filters its contents to report its head and tail contents. The amount of content to read from a file can be tuned by the environment variable PG_TEST_FILE_READ_LINES, which can be used to override the default of 50 lines. If the file whose content is read has fewer lines than two times PG_TEST_FILE_READ_LINES, the whole file is returned. This will be used in a follow-up commit to limit the amount of information reported by some of the TAP tests on failure, where we have noticed that the contents reported by the buildfarm can be heavily bloated in some cases, with the head and tail contents of a report being able to provide enough information to be useful for debugging. Author: Nazir Bilal Yavuz <byavuz81@gmail.com> Co-authored-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CAN55FZ1D6KXvjSs7YGsDeadqCxNF3UUhjRAfforzzP0k-cE=bA@mail.gmail.com	2025-12-06 14:27:53 +09:00
Tom Lane	6dfce8420e	Fix text substring search for non-deterministic collations. Due to an off-by-one error, the code failed to find matches at the end of the haystack. Fix by rewriting the loop. While at it, fix a comment that claimed that the function could find a zero-length match. Such a match could send a caller into an endless loop. However, zero-length matches only make sense with an empty search string, and that case is explicitly excluded by all callers. To make sure it stays that way, add an Assert and a comment. Bug: #19341 Reported-by: Adam Warland <adam.warland@infor.com> Author: Laurenz Albe <laurenz.albe@cybertec.at> Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/19341-1d9a22915edfec58@postgresql.org Backpatch-through: 18	2025-12-05 20:10:33 -05:00
Heikki Linnakangas	7c2061bdfb	Fix test to work with non-8kB block sizes Author: Maxim Orlov <orlovmg@gmail.com> Discussion: https://www.postgresql.org/message-id/CACG%3Dezbtm%2BLOzEMyLX7rzGcAv3ez3F6nNpSJjvZeMzed0Oe6Pw%40mail.gmail.com	2025-12-05 23:39:01 +02:00
Nathan Bossart	d0d873c382	Add commit `86b276a4a9` to .git-blame-ignore-revs.	2025-12-05 14:21:12 -06:00
Robert Haas	014f9a831a	Don't reset the pathlist of partitioned joinrels. apply_scanjoin_target_to_paths wants to avoid useless work and platform-specific dependencies by throwing away the path list created prior to applying the final scan/join target and constructing a whole new one using the final scan/join target. However, this is only valid when we'll consider all the same strategies after the pathlist reset as before. After resetting the path list, we reconsider Append and MergeAppend paths with the modified target list; therefore, it's only valid for a partitioned relation. However, what the previous coding missed is that it cannot be a partitioned join relation, because that also has paths that are not Append or MergeAppend paths and will not be reconsidered. Thus, before this patch, we'd sometimes choose a partitionwise strategy with a higher total cost than the cheapest non-partitionwise strategy, which is not good. We had a surprising number of test cases that were relying on this bug to work as they did. A big part of the reason for this is that row counts in regression test cases tend to be low, which brings the cost of partitionwise and non-partitionwise strategies very close together, especially for merge joins, where the real and perceived advantages of a partitionwise approach are minimal. In addition, one test case included a row-count-inflating join. In such cases, a partitionwise join can easily be a loser on cost, because the total number of tuples passing through an Append node is much higher than it is with a non-partitionwise strategy. That test case is adjusted by adding additional join clauses to avoid the row count inflation. Although the failure of the planner to choose the lowest-cost path is a bug, we generally do not back-patch fixes of this type, because planning is not an exact science and there is always a possibility that some user will end up with a plan that has a lower estimated cost but actually runs more slowly. Hence, no backpatch here, either. The code change here is exactly what was originally proposed by Ashutosh, but the changes to the comments and test cases have been very heavily rewritten by me, helped along by some very useful advice from Richard Guo. Reported-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Author: Robert Haas <rhaas@postgresql.org> Reviewed-by: Jakub Wartak <jakub.wartak@enterprisedb.com> Reviewed-by: Arne Roland <arne.roland@malkut.net> Reviewed-by: Richard Guo <guofenglinux@gmail.com> Discussion: http://postgr.es/m/CAExHW5toze58+jL-454J3ty11sqJyU13Sz5rJPQZDmASwZgWiA@mail.gmail.com	2025-12-05 12:00:18 -05:00
Tom Lane	8f1791c618	Fix some cases of indirectly casting away const. Newest versions of gcc are able to detect cases where code implicitly casts away const by assigning the result of strchr() or a similar function applied to a "const char *" value to a target variable that's just "char *". This of course creates a hazard of not getting a compiler warning about scribbling on a string one was not supposed to, so fixing up such cases is good. This patch fixes a dozen or so places where we were doing that. Most are trivial additions of "const" to the target variable, since no actually-hazardous change was occurring. There is one place in ecpg.trailer where we were indeed violating the intention of not modifying a string passed in as "const char *". I believe that's harmless, not a live bug, but let's fix it by copying the string before modifying it. There is a remaining trouble spot in ecpg/preproc/variable.c, which requires more complex surgery. I've left that out of this commit because I want to study that code a bit more first. We probably will want to back-patch this once compilers that detect this pattern get into wider circulation, but for now I'm just going to apply it to master to see what the buildfarm says. Thanks to Bertrand Drouvot for finding a couple more spots than I had. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://postgr.es/m/1324889.1764886170@sss.pgh.pa.us	2025-12-05 11:17:23 -05:00
Álvaro Herrera	a4a0fa0c75	Stabilize tests some more Tests added by commits `90eae926ab`, `2bc7e886fc`, `bc32a12e0d` have occasionally failed, depending on timing. Add some dependency markers to the spec to try and remove the instability. Author: Mihail Nikalayeu <mihailnikalayeu@gmail.com> Discussion: https://postgr.es/m/202512041739.sgg3tb2yobe2@alvherre.pgsql	2025-12-05 16:16:27 +01:00
Heikki Linnakangas	4d936c3fff	Fix setting next multixid's offset at offset wraparound In commit `789d65364c`, we started updating the next multixid's offset too when recording a multixid, so that it can always be used to calculate the number of members. I got it wrong at offset wraparound: we need to skip over offset 0. Fix that. Discussion: https://www.postgresql.org/message-id/d9996478-389a-4340-8735-bfad456b313c@iki.fi Backpatch-through: 14	2025-12-05 11:32:38 +02:00
Michael Paquier	31d3847a37	Use more palloc_object() and palloc_array() in contrib/ The idea is to encourage more the use of these new routines across the tree, as these offer stronger type safety guarantees than palloc(). In an ideal world, palloc() would then act as an internal routine of these flavors, whose footprint in the tree is minimal. The patch sent by the author is very large, and this chunk of changes represents something like 10% of the overall patch submitted. The code compiled is the same before and after this commit, using objdump to do some validation with a difference taken in-between. There are some diffs, which are caused by changes in line numbers because some of the new allocation formulas are shorter, for the following files: trgm_regexp.c, xpath.c and pg_walinspect.c. Author: David Geier <geidav.pg@gmail.com> Discussion: https://postgr.es/m/ad0748d4-3080-436e-b0bc-ac8f86a3466a@gmail.com	2025-12-05 16:40:26 +09:00
Michael Paquier	2f04110225	Improve test output of extended statistics for ndistinct and dependencies Corey Huinker has come up with a recipe that is more compact and more pleasant to the eye for extended stats because we know that all of them are 1-dimension JSON arrays. This commit switches the extended stats tests to use replace() instead of jsonb_pretty(), splitting the data so as one line is used for each item in the extended stats object. This results in the removal of a good chunk of test output, that is now easier to debug with one line used for each item in a stats object. This patch has not been provided by Corey. This is some post-commit cleanup work that I have noticed as good enough to do on its own while reviewing the rest of the patch set Corey has posted. Discussion: https://postgr.es/m/CADkLM=csMd52i39Ye8-PUUHyzBb3546eSCUTh-FBQ7bzT2uZ4Q@mail.gmail.com	2025-12-05 14:15:21 +09:00
Amit Kapila	5db6a344ab	Rename column slotsync_skip_at to slotsync_last_skip. Commit `76b78721ca` introduced two new columns in pg_stat_replication_slots to improve monitoring of slot synchronization. One of these columns was named slotsync_skip_at, which is inconsistent with the naming convention used for similar columns in other system views. Columns that store timestamps of the most recent event typically use the 'last_' in the column name (e.g., last_autovacuum, checksum_last_failure). Renaming slotsync_skip_at to slotsync_last_skip aligns with this pattern, making the purpose of the column clearer and improving overall consistency across the views. Author: Shlok Kyal <shlok.kyal.oss@gmail.com> Reviewed-by: Michael Banck <mbanck@gmx.net> Discussion: https://postgr.es/m/20251128091552.GB13635@p46.dedyn.io;lightning.p46.dedyn.io Discussion: https://postgr.es/m/CAE9k0PkhfKrTEAsGz4DjOhEj1nQ+hbQVfvWUxNacD38ibW3a1g@mail.gmail.com	2025-12-05 04:12:55 +00:00
Michael Paquier	7bc88c3d6f	Fix some compiler warnings Some of the buildfarm members with some old gcc versions have been complaining about an always-true test for a NULL pointer caused by a combination of SOFT_ERROR_OCCURRED() and a local ErrorSaveContext variable. These warnings are taken care of by removing SOFT_ERROR_OCCURRED(), switching to a direct variable check, like `56b1e88c80`. Oversights in `e1405aa5e3` and `44eba8f06e`. Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/1341064.1764895052@sss.pgh.pa.us	2025-12-05 12:30:43 +09:00
Michael Paquier	83f2f8413e	Show version of nodes in output of TAP tests This commit adds the version information of a node initialized by Cluster.pm, that may vary depending on the install_path given by the test. The code was written such that the node information, which includes the version number, was dumped before the version number was set. This is particularly useful for the pg_upgrade TAP tests, that may mix several versions for cross-version runs. The TAP infrastructure also allows mixing nodes with different versions, so this information can be useful for out-of-core tests. Backpatch down to v15, where Cluster.pm and the pg_upgrade TAP tests were introduced. Author: Potapov Alexander <a.potapov@postgrespro.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://postgr.es/m/e59bb-692c0a80-5-6f987180@170377126 Backpatch-through: 15	2025-12-05 09:21:13 +09:00
Melanie Plageman	904f9f5ea0	Suppress spurious Coverity warning in prune freeze logic Adjust the prune_freeze_setup() parameter types of new_relfrozen_xid and new_relmin_mxid to prevent misleading Coverity analysis. heap_page_prune_and_freeze() compared these values against NULL when passing them to prune_freeze_setup(), causing Coverity to assume they could be NULL and flag a possible null-pointer dereference later, even though it occurs inside a directly related conditional. Reported-by: Coverity Author: Melanie Plageman <melanieplageman@gmail.com>	2025-12-04 18:55:02 -05:00
Nathan Bossart	80f6e2fb4a	Fix key size of PrivateRefCountHash. The key is the first member of PrivateRefCountEntry, which has type Buffer. This commit changes the key size from sizeof(int32) to sizeof(Buffer). This appears to be an oversight in commit `4b4b680c3d`, but it's of no consequence because Buffer has been a signed 32-bit integer for a long time. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://postgr.es/m/aS77DTpl0fOkIKSZ%40ip-10-97-1-34.eu-west-3.compute.internal	2025-12-04 15:42:18 -06:00
Peter Eisentraut	e158fd4d68	Remove no longer needed casts from Pointer These casts used to be required when Pointer was char *, but now it's void * (commit `1b2bb5077e`), so they are not needed anymore. Author: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> Discussion: https://www.postgresql.org/message-id/4154950a-47ae-4223-bd01-1235cc50e933%40eisentraut.org	2025-12-04 20:44:52 +01:00
Peter Eisentraut	c6be3daa05	Remove no longer needed casts to Pointer These casts used to be required when Pointer was char *, but now it's void * (commit `1b2bb5077e`), so they are not needed anymore. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://www.postgresql.org/message-id/4154950a-47ae-4223-bd01-1235cc50e933%40eisentraut.org	2025-12-04 19:40:08 +01:00
Álvaro Herrera	6bd469d26a	amcheck: Fix snapshot usage in bt_index_parent_check We were using SnapshotAny to do some index checks, but that's wrong and causes spurious errors when used on indexes created by CREATE INDEX CONCURRENTLY. Fix it to use an MVCC snapshot, and add a test for it. This problem came in with commit `5ae2087202`, which introduced uniqueness check. Backpatch to 17. Author: Mihail Nikalayeu <mihailnikalayeu@gmail.com> Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru> Backpatch-through: 17 Discussion: https://postgr.es/m/CANtu0ojmVd27fEhfpST7RG2KZvwkX=dMyKUqg0KM87FkOSdz8Q@mail.gmail.com	2025-12-04 18:12:08 +01:00
Peter Eisentraut	40bdd839f5	headerscheck ccache support Currently, headerscheck and cpluspluscheck are very slow, and they defeat use of ccache. This fixes that, and now they are much faster. The problem was that the test files are created in a randomly-named directory (`mktemp -d /tmp/$me.XXXXXX`), and this directory is mentioned on the compiler command line, which is part of the cache key. The solution is to create the test files in the build directory. For example, for src/include/storage/ipc.h, we generate tmp_headerscheck_c/src_include_storage_ipc_h.c (or .cpp) Now ccache works. (And it's also a bit easier to debug everything with this naming.) (The subdirectory is used to keep the cleanup trap simple.) The observed speedup on Cirrus CI for headerscheck plus cpluspluscheck is from about 1min 20s to only 20s. In local use, the speedups are similar. Co-authored-by: Thomas Munro <thomas.munro@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://www.postgresql.org/message-id/flat/b49e74d4-3cf9-4d1c-9dce-09f75e55d026%40eisentraut.org	2025-12-04 11:23:23 +01:00
Peter Eisentraut	d0b7a0b4c8	headerscheck: Use LLVM_CPPFLAGS Otherwise, headerscheck will fail if the LLVM headers are in a location not reached by the normal CFLAGS/CPPFLAGS. Discussion: https://www.postgresql.org/message-id/flat/b49e74d4-3cf9-4d1c-9dce-09f75e55d026%40eisentraut.org	2025-12-04 10:58:15 +01:00
Alexander Korotkov	d6ef8ee3ee	Fix incorrect assertion bound in WaitForLSN() The assertion checking MyProcNumber used MaxBackends as the upper bound, but the procInfos array is allocated with size MaxBackends + NUM_AUXILIARY_PROCS. This inconsistency would cause a false assertion failure if an auxiliary process calls WaitForLSN(). Author: Xuneng Zhou <xunengzhou@gmail.com>	2025-12-04 10:38:12 +02:00
Andres Freund	6c5c393b74	Rename BUFFERPIN wait event class to BUFFER In an upcoming patch more wait events will be added to the wait event class (for buffer locking), making the current name too specific. Alternatively we could introduce a dedicated wait event class for those, but it seems somewhat confusing to have a BUFFERPIN and a BUFFER wait event class. Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Discussion: https://postgr.es/m/fvfmkr5kk4nyex56ejgxj3uzi63isfxovp2biecb4bspbjrze7@az2pljabhnff	2025-12-03 18:38:20 -05:00
Andres Freund	7902a47c20	Add pg_atomic_unlocked_write_u64 The 64bit equivalent of pg_atomic_unlocked_write_u32(), to be used in an upcoming patch converting BufferDesc.state into a 64bit atomic. Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Discussion: https://postgr.es/m/fvfmkr5kk4nyex56ejgxj3uzi63isfxovp2biecb4bspbjrze7@az2pljabhnff	2025-12-03 18:38:20 -05:00
Andres Freund	156680055d	bufmgr: Turn BUFFER_LOCK_* into an enum It seems cleaner to use an enum to tie the different values together. It also helps to have a more descriptive type in the argument to various functions. Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Discussion: https://postgr.es/m/fvfmkr5kk4nyex56ejgxj3uzi63isfxovp2biecb4bspbjrze7@az2pljabhnff	2025-12-03 18:38:20 -05:00
Tom Lane	8d61228717	Make stats_ext test faster under cache-clobbering test conditions. Commit `1eccb9315` added a test case that will cause a large number of evaluations of a plpgsql function. With -DCLOBBER_CACHE_ALWAYS, that takes an unreasonable amount of time (hours) because the function's cache entries are repeatedly deleted and rebuilt. That doesn't add any useful test coverage --- other test cases already exercise plpgsql well enough --- and it's not part of what this test intended to cover. We can get the same planner coverage, if not more, by making the test directly invoke numeric_lt(). Reported-by: Tomas Vondra <tomas@vondra.me> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/baf1ae02-83bd-4f5d-872a-1d04f11a9073@vondra.me	2025-12-03 13:23:50 -05:00
Heikki Linnakangas	7b81be9b42	Add test for multixid wraparound Author: Andrey Borodin <amborodin@acm.org> Discussion: https://www.postgresql.org/message-id/7de697df-d74d-47db-9f73-e069b7349c4b@iki.fi	2025-12-03 19:39:34 +02:00
Heikki Linnakangas	789d65364c	Set next multixid's offset when creating a new multixid With this commit, the next multixid's offset will always be set on the offsets page, by the time that a backend might try to read it, so we no longer need the waiting mechanism with the condition variable. In other words, this eliminates "corner case 2" mentioned in the comments. The waiting mechanism was broken in a few scenarios: - When nextMulti was advanced without WAL-logging the next multixid. For example, if a later multixid was already assigned and WAL-logged before the previous one was WAL-logged, and then the server crashed. In that case the next offset would never be set in the offsets SLRU, and a query trying to read it would get stuck waiting for it. Same thing could happen if pg_resetwal was used to forcibly advance nextMulti. - In hot standby mode, a deadlock could happen where one backend waits for the next multixid assignment record, but WAL replay is not advancing because of a recovery conflict with the waiting backend. The old TAP test used carefully placed injection points to exercise the old waiting code, but now that the waiting code is gone, much of the old test is no longer relevant. Rewrite the test to reproduce the IPC/MultixactCreation hang after crash recovery instead, and to verify that previously recorded multixids stay readable. Backpatch to all supported versions. In back-branches, we still need to be able to read WAL that was generated before this fix, so in the back-branches this includes a hack to initialize the next offsets page when replaying XLOG_MULTIXACT_CREATE_ID for the last multixid on a page. On 'master', bump XLOG_PAGE_MAGIC instead to indicate that the WAL is not compatible. Author: Andrey Borodin <amborodin@acm.org> Reviewed-by: Dmitry Yurichev <dsy.075@yandex.ru> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Ivan Bykov <i.bykov@modernsys.ru> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/172e5723-d65f-4eec-b512-14beacb326ce@yandex.ru Backpatch-through: 14	2025-12-03 19:15:08 +02:00
Nathan Bossart	9b05e2ec08	Use "foo(void)" for definitions of functions with no parameters. Standard practice in PostgreSQL is to use "foo(void)" instead of "foo()", as the latter looks like an "old-style" function declaration. Similar changes were made in commits `cdf4b9aff2`, `0e72b9d440`, `7069dbcc31`, `f1283ed6cc`, `7b66e2c086`, `e95126cf04`, and `9f7c527af3`. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com> Discussion: https://postgr.es/m/aTBObQPg%2Bps5I7vl%40ip-10-97-1-34.eu-west-3.compute.internal	2025-12-03 10:54:37 -06:00
Álvaro Herrera	be25c77677	Put back alternative-output expected files These were removed in `5dee7a603f`, but that was too optimistic, per buildfarm member prion as reported by Tom Lane. Mea (Álvaro's) culpa. Author: Mihail Nikalayeu <mihailnikalayeu@gmail.com> Discussion: https://postgr.es/m/570630.1764737028@sss.pgh.pa.us	2025-12-03 16:37:06 +01:00
Daniel Gustafsson	64527a17a5	doc: Consistently use restartpoint in the documentation The majority of cases already used "restartpoint" with just a few instances of "restart point". Changing the latter spelling to the former ensures consistency in the user facing documentation. Code comments are not affected by this since it is not worth the churn to change anything there. Author: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/0F6E38D0-649F-4489-B2C1-43CD937E6636@yesql.se	2025-12-03 15:22:38 +01:00
Peter Eisentraut	9790affcce	Fix stray references to SubscriptRef This type never existed. SubscriptingRef was meant instead. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/2eaa45e3-efc5-4d75-b082-f8159f51445f%40eisentraut.org	2025-12-03 14:44:14 +01:00
Peter Eisentraut	1b2bb5077e	Change Pointer to void * The comment for the Pointer type said 'XXX Pointer arithmetic is done with this, so it can't be void * under "true" ANSI compilers.'. This has been fixed in the previous commit `756a436893`. This now changes the definition of the type from char * to void *, as envisaged by that comment. Extension code that relies on using Pointer for pointer arithmetic will need to make changes similar to commit `756a436893`, but those changes would be backward compatible. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://www.postgresql.org/message-id/4154950a-47ae-4223-bd01-1235cc50e933%40eisentraut.org	2025-12-03 10:22:17 +01:00
Peter Eisentraut	756a436893	Don't rely on pointer arithmetic with Pointer type The comment for the Pointer type says 'XXX Pointer arithmetic is done with this, so it can't be void * under "true" ANSI compilers.'. This fixes that. Change from Pointer to use char * explicitly where pointer arithmetic is needed. This makes the meaning of the code clearer locally and removes a dependency on the actual definition of the Pointer type. (The definition of the Pointer type is not changed in this commit.) Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://www.postgresql.org/message-id/4154950a-47ae-4223-bd01-1235cc50e933%40eisentraut.org	2025-12-03 09:54:15 +01:00
Peter Eisentraut	8c6bbd674e	Use more appropriate DatumGet* function Use DatumGetCString() instead of DatumGetPointer() for returning a C string. Right now, they are the same, but that doesn't always have to be so. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://www.postgresql.org/message-id/4154950a-47ae-4223-bd01-1235cc50e933%40eisentraut.org	2025-12-03 08:52:28 +01:00
Peter Eisentraut	623801b3bd	Remove useless casts to Pointer in arguments of memcpy() and memmove() calls Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://www.postgresql.org/message-id/4154950a-47ae-4223-bd01-1235cc50e933%40eisentraut.org	2025-12-03 08:40:33 +01:00
Amit Kapila	c252d37d8c	Fix shadow variable warning in subscriptioncmds.c. Author: Shlok Kyal <shlok.kyal.oss@gmail.com> Author: Peter Smith <smithpb2250@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Vignesh C <vignesh21@gmail.com> Discussion: https://postgr.es/m/CAHut+PsF8R0Bt4J3c92+T2F0mun0rRfK=-GH+iBv2s-O8ahJJw@mail.gmail.com	2025-12-03 03:31:31 +00:00
Nathan Bossart	a6d05c8193	Use LW_SHARED in dsa.c where possible. Both dsa_get_total_size() and dsa_get_total_size_from_handle() take an exclusive lock just to read a variable. This commit reduces the lock level to LW_SHARED in those functions. Reviewed-by: Sami Imseih <samimseih@gmail.com> Discussion: https://postgr.es/m/aS8fMzWs9e8iHxk2%40nathan	2025-12-02 16:40:23 -06:00
Heikki Linnakangas	cbe04e5d72	Fix amcheck's handling of half-dead B-tree pages amcheck incorrectly reported the following error if there were any half-dead pages in the index: ERROR: mismatch between parent key and child high key in index "amchecktest_id_idx" It's expected that a half-dead page does not have a downlink in the parent level, so skip the test. Reported-by: Konstantin Knizhnik <knizhnik@garret.ru> Reviewed-by: Peter Geoghegan <pg@bowt.ie> Reviewed-by: Mihail Nikalayeu <mihailnikalayeu@gmail.com> Discussion: https://www.postgresql.org/message-id/33e39552-6a2a-46f3-8b34-3f9f8004451f@garret.ru Backpatch-through: 14	2025-12-02 21:11:15 +02:00
Heikki Linnakangas	c085aab278	Add a test for half-dead pages in B-tree indexes To increase our test coverage in general, and because I will use this in the next commit to test a bug we currently have in amcheck. Reviewed-by: Peter Geoghegan <pg@bowt.ie> Discussion: https://www.postgresql.org/message-id/33e39552-6a2a-46f3-8b34-3f9f8004451f@garret.ru	2025-12-02 21:11:05 +02:00
Heikki Linnakangas	6c05ef5729	Fix amcheck's handling of incomplete root splits in B-tree When the root page is being split, it's normal that the root page according to the metapage is not marked BTP_ROOT. Fix bogus error in amcheck about that case. Reviewed-by: Peter Geoghegan <pg@bowt.ie> Discussion: https://www.postgresql.org/message-id/abd65090-5336-42cc-b768-2bdd66738404@iki.fi Backpatch-through: 14	2025-12-02 21:10:51 +02:00
Heikki Linnakangas	1e4e5783e7	Add a test for incomplete splits in B-tree indexes To increase our test coverage in general, and because I will add onto this in the next commit to also test amcheck with incomplete splits. This is copied from the similar test we had for GIN indexes. B-tree's incomplete splits work similarly to GIN's, so with small changes, the same test works for B-tree too. Reviewed-by: Peter Geoghegan <pg@bowt.ie> Discussion: https://www.postgresql.org/message-id/abd65090-5336-42cc-b768-2bdd66738404@iki.fi	2025-12-02 21:10:47 +02:00
Nathan Bossart	f894acb24a	Show size of DSAs and dshashes in pg_dsm_registry_allocations. Presently, this view reports NULL for the size of DSAs and dshash tables because 1) the current backend might not be attached to them and 2) the registry doesn't save the pointers to the dsa_area or dshash_table in local memory. Also, the view doesn't show partially-initialized entries to avoid ambiguity, since those entries would report a NULL size as well. This commit introduces a function that looks up the size of a DSA given its handle (transiently attaching to the control segment if needed) and teaches pg_dsm_registry_allocations to use it to show the size of successfully-initialized DSA and dshash entries. Furthermore, the view now reports partially-initialized entries with a NULL size. Reviewed-by: Rahila Syed <rahilasyed90@gmail.com> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/aSeEDeznAsHR1_YF%40nathan	2025-12-02 10:29:45 -06:00
Álvaro Herrera	758479213d	Remove doc and code comments about ON CONFLICT deficiencies They have been fixed, so we don't need this text anymore. This reverts commit `8b18ed6dfb`. Author: Mihail Nikalayeu <mihailnikalayeu@gmail.com> Discussion: https://postgr.es/m/CADzfLwWo+FV9WSeOah9F1r=4haa6eay1hNvYYy_WfziJeK+aLQ@mail.gmail.com	2025-12-02 16:47:18 +01:00
Álvaro Herrera	5dee7a603f	Avoid use of NOTICE to wait for snapshot invalidation This idea (implemented in commits `bc32a12e0d` and `9e8fa05d34`) of using notices to detect that a session is sleeping was unreliable, so simplify the concurrency controller session to just look at pg_stat_activity for a process sleeping on the injection point we want it to hit. This change allows us to remove a secondary injection point and the alternative expected output files. Reproduced by Alexander Lakhin following a report in buildfarm member skink (which runs the server under valgrind). Author: Mihail Nikalayeu <mihailnikalayeu@gmail.com> Reported-by: Alexander Lakhin <exclusion@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/3e302c96-cdd2-45ec-af84-03dbcdccde4a@gmail.com	2025-12-02 16:43:27 +01:00
Álvaro Herrera	90eae926ab	Fix ON CONFLICT with REINDEX CONCURRENTLY and partitions When planning queries with ON CONFLICT on partitioned tables, the indexes to consider as arbiters for each partition are determined based on those found in the parent table. However, it's possible for an index on a partition to be reindexed, and in that case, the auxiliary indexes created on the partition must be considered as arbiters as well; failing to do that may result in spurious "duplicate key" errors given sufficient bad luck. We fix that in this commit by matching every index that doesn't have a parent to each initially-determined arbiter index. Every unparented matching index is considered an additional arbiter index. Closely related to the fixes in `bc32a12e0d` and `2bc7e886fc`, and for identical reasons, not backpatched (for now) even though it's a longstanding issue. Author: Mihail Nikalayeu <mihailnikalayeu@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/CANtu0ojXmqjmEzp-=aJSxjsdE76iAsRgHBoK0QtYHimb_mEfsg@mail.gmail.com	2025-12-02 13:51:53 +01:00
Peter Eisentraut	4f941d432b	Remove useless casting to same type This removes some casts where the input already has the same type as the type specified by the cast. Their presence could cause risks of hiding actual type mismatches in the future or silently discarding qualifiers. It also improves readability. Same kind of idea as `7f798aca1d` and `ef8fe69360`. (This does not change all such instances, but only those hand-picked by the author.) Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://www.postgresql.org/message-id/flat/aSQy2JawavlVlEB0%40ip-10-97-1-34.eu-west-3.compute.internal	2025-12-02 10:09:32 +01:00
Peter Eisentraut	35988b31db	Simplify hash_xlog_split_allocate_page() Instead of complicated pointer arithmetic, overlay a uint32 array and just access the array members. That's safe thanks to XLogRecGetBlockData() returning a MAXALIGNed buffer. Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com> Discussion: https://www.postgresql.org/message-id/flat/aSQy2JawavlVlEB0%40ip-10-97-1-34.eu-west-3.compute.internal	2025-12-02 09:18:54 +01:00
Peter Eisentraut	ec782f56b0	Replace pointer comparisons and assignments to literal zero with NULL While 0 is technically correct, NULL is the semantically appropriate choice for pointers. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://www.postgresql.org/message-id/aS1AYnZmuRZ8g%2B5G%40ip-10-97-1-34.eu-west-3.compute.internal	2025-12-02 08:39:24 +01:00
Peter Eisentraut	376c649634	Update comment related to C99 One could do more work here to eliminate the Windows difference described in the comment, but that can be a separate project. The purpose of this change is to update comments that might confusingly indicate that C99 is not required. Reviewed-by: Thomas Munro <thomas.munro@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/170308e6-a7a3-4484-87b2-f960bb564afa%40eisentraut.org	2025-12-02 08:20:43 +01:00
Michael Paquier	713d9a847e	Update some timestamp[tz] functions to use soft-error reporting This commit updates two functions that convert "timestamptz" to "timestamp", and vice-versa, to use the soft error reporting rather than their own logic to do the same. These are now named as follows: - timestamp2timestamptz_safe() - timestamptz2timestamp_safe() These functions were suffixed with "_opt_overflow", previously. This shaves some code, as it is possible to detect how a timestamp[tz] overflowed based on the returned value rather than a custom state. It is optionally possible for the callers of these functions to rely on the error generated internally by these functions, depending on the error context. Similar work has been done in `d03668ea05` and `4246a977ba`. Reviewed-by: Amul Sul <sulamul@gmail.com> Discussion: https://postgr.es/m/aS09YF2GmVXjAxbJ@paquier.xyz	2025-12-02 09:30:23 +09:00
Jeff Davis	19b966243c	Make regex "max_chr" depend on encoding, not provider. The regex mechanism scans through the first "max_chr" character values to cache character property ranges (isalpha, etc.). For single-byte encodings, there's no sense in scanning beyond UCHAR_MAX; but for UTF-8 it makes sense to cache higher code point values (though not all of them; only up to MAX_SIMPLE_CHR). Prior to `5a38104b36`, the logic about how many character values to scan was based on the pg_regex_strategy, which was dependent on the provider. Commit `5a38104b36` preserved that logic exactly, allowing different providers to define the "max_chr". Now, change it to depend only on the encoding and whether ctype_is_c. For this specific calculation, distinguishing between providers creates more complexity than it's worth. Discussion: https://postgr.es/m/450ceb6260cad30d7afdf155d991a9caafee7c0d.camel@j-davis.com Reviewed-by: Chao Li <li.evan.chao@gmail.com>	2025-12-01 11:06:17 -08:00
Jeff Davis	99cd8890be	Change some callers to use pg_ascii_toupper(). The input is ASCII anyway, so it's better to be clear that it's not locale-dependent. Discussion: https://postgr.es/m/450ceb6260cad30d7afdf155d991a9caafee7c0d.camel@j-davis.com	2025-12-01 09:24:03 -08:00
Álvaro Herrera	2bc7e886fc	Fix ON CONFLICT ON CONSTRAINT during REINDEX CONCURRENTLY When REINDEX CONCURRENTLY is processing the index that supports a constraint, there are periods during which multiple indexes match the constraint index's definition. Those must all be included in the set of inferred indexes for INSERT ON CONFLICT, in order to avoid spurious "duplicate key" errors. To fix, we set things up to match all indexes against attributes, expressions and predicates of the constraint index, then return all indexes that match those, rather than just the one constraint index. This is more onerous than before, where we would just test the named constraint for validity, but it's not more onerous than processing "conventional" inference (where a list of attribute names etc is given). This is closely related to the misbehaviors fixed by `bc32a12e0d`, for a different situation. We're not backpatching this one for now either, for the same reasons. Author: Mihail Nikalayeu <mihailnikalayeu@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/CANtu0ojXmqjmEzp-=aJSxjsdE76iAsRgHBoK0QtYHimb_mEfsg@mail.gmail.com	2025-12-01 17:34:13 +01:00
Peter Eisentraut	2fcc5a7151	Fix a strict aliasing violation This one is almost a textbook example of an aliasing violation, and it is straightforward to fix, so clean it up. (The warning only shows up if you remove the -fno-strict-aliasing option.) Also, move the code after the error checking. Doesn't make a difference technically, but it seems strange to do actions before errors are checked. Reported-by: Tatsuo Ishii <ishii@postgresql.org> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/20240724.155525.366150353176322967.ishii%40postgresql.org	2025-12-01 16:41:08 +01:00
Michael Paquier	a87987cafc	Move WAL sequence code into its own file This split exists for most of the other RMGRs, and makes cleaner the separation between the WAL code, the redo code and the record description code (already in its own file) when it comes to the sequence RMGR. The redo and masking routines are moved to a new file, sequence_xlog.c. All the RMGR routines are now located in a new header, sequence_xlog.h. This separation is useful for a different patch related to sequences that I have been working on, where it makes a refactoring of sequence.c easier if its RMGR routines and its core routines are split. Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Discussion: https://postgr.es/m/aSfTxIWjiXkTKh1E@paquier.xyz	2025-12-01 16:21:41 +09:00
Michael Paquier	d03668ea05	Switch some date/timestamp functions to use the soft error reporting This commit changes some functions related to the data types date and timestamp to use the soft error reporting rather than a custom boolean flag called "overflow", used to let the callers of these functions know if an overflow happens. This results in the removal of some boilerplate code, as it is possible to rely on an error context rather than a custom state, with the possibility to use the error generated inside the functions updated here, if necessary. These functions were suffixed with "_opt_overflow". They are now renamed to use "_safe" as suffix. This work is similar to `4246a977ba`. Author: Amul Sul <sulamul@gmail.com> Reviewed-by: Amit Langote <amitlangote09@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CAAJ_b95HEmFyzHZfsdPquSHeswcopk8MCG1Q_vn4tVkZ+xxofw@mail.gmail.com	2025-12-01 15:22:20 +09:00
David Rowley	5424f4da90	Don't call simplify_aggref with a NULL PlannerInfo `42473b3b3` added prosupport infrastructure to allow simplification of Aggrefs during constant-folding. In some cases the context->root that's given to eval_const_expressions_mutator() can be NULL. `42473b3b3` failed to take that into account, which could result in a crash. To fix, add a check and only call simplify_aggref() when the PlannerInfo is set. Author: David Rowley <dgrowleyml@gmail.com> Reported-by: Birler, Altan <altan.birler@tum.de> Discussion: https://postgr.es/m/132d4da23b844d5ab9e352d34096eab5@tum.de	2025-11-30 12:55:34 +13:00
Peter Geoghegan	c902bd57af	Update obsolete row compare preprocessing comments. We have some limited ability to detect redundant and contradictory conditions involving an nbtree row comparison key following commits `f09816a0` and `bd3f59fd`: we can do so in simple cases involving IS NULL and IS NOT NULL keys on a row compare key's first column. We can likewise determine that a scan's qual is unsatisfiable given a row compare whose first subkey's arg is NULL. Update obsolete comments that claimed that we merely copied row compares into the output key array "without any editorialization". Also update another _bt_preprocess_keys header comment paragraph: add a parenthetical remark that points out that preprocessing will generate a skip array for the preceding example qual. That will ultimately lead to preprocessing marking the example's lower-order y key required -- which is exactly what the example supposes cannot happen. Keep the original comment, though, since it accurately describes the mechanical rules that determine which keys get marked required in the absence of skip arrays (which can occasionally still matter). This fixes an oversight in commit `92fe23d9`, which added the nbtree skip scan optimization. Author: Peter Geoghegan <pg@bowt.ie> Backpatch-through: 18	2025-11-29 16:41:51 -05:00
Dean Rasheed	3881561d77	Avoid rewriting data-modifying CTEs more than once. Formerly, when updating an auto-updatable view, or a relation with rules, if the original query had any data-modifying CTEs, the rewriter would rewrite those CTEs multiple times as RewriteQuery() recursed into the product queries. In most cases that was harmless, because RewriteQuery() is mostly idempotent. However, if the CTE involved updating an always-generated column, it would trigger an error because any subsequent rewrite would appear to be attempting to assign a non-default value to the always-generated column. This could perhaps be fixed by attempting to make RewriteQuery() fully idempotent, but that looks quite tricky to achieve, and would probably be quite fragile, given that more generated-column-type features might be added in the future. Instead, fix by arranging for RewriteQuery() to rewrite each CTE exactly once (by tracking the number of CTEs already rewritten as it recurses). This has the advantage of being simpler and more efficient, but it does make RewriteQuery() dependent on the order in which rewriteRuleAction() joins the CTE lists from the original query and the rule action, so care must be taken if that is ever changed. Reported-by: Bernice Southey <bernice.southey@gmail.com> Author: Bernice Southey <bernice.southey@gmail.com> Author: Dean Rasheed <dean.a.rasheed@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Discussion: https://postgr.es/m/CAEDh4nyD6MSH9bROhsOsuTqGAv_QceU_GDvN9WcHLtZTCYM1kA@mail.gmail.com Backpatch-through: 14	2025-11-29 12:28:59 +00:00
Peter Eisentraut	87c6f8b047	Generate translator comments for GUC parameter descriptions Automatically generate comments like /* translator: GUC parameter "client_min_messages" short description */ in the generated guc_tables.inc.c. This provides translators more context. Reviewed-by: Pavlo Golub <pavlo.golub@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Stéphane Schildknecht <sas.postgresql@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/1a89b3f0-e588-41ef-b712-aba766143cad%40eisentraut.org	2025-11-28 16:01:59 +01:00
Peter Eisentraut	8b3e2c622a	Fix pg_isblank() There was a pg_isblank() function that claimed to be a replacement for the standard isblank() function, which was thought to be "not very portable yet". We can now assume that it's portable (it's in C99). But pg_isblank() actually diverged from the standard isblank() by also accepting '\r', while the standard one only accepts space and tab. This was added to support parsing pg_hba.conf under Windows. But the hba parsing code now works completely differently and already handles line endings before we get to pg_isblank(). The other user of pg_isblank() is for ident protocol message parsing, which also handles '\r' separately. So this behavior is now obsolete and confusing. To improve clarity, I separated those concerns. The ident parsing now gets its own function that hardcodes the whitespace characters mentioned by the relevant RFC. pg_isblank() is now static in hba.c and is a wrapper around the standard isblank(), with some extra logic to ensure robust treatment of non-ASCII characters. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/170308e6-a7a3-4484-87b2-f960bb564afa%40eisentraut.org	2025-11-28 08:33:07 +01:00
Amit Kapila	e68b6adad9	Add slotsync_skip_reason column to pg_replication_slots view. Introduce a new column, slotsync_skip_reason, in the pg_replication_slots view. This column records the reason why the last slot synchronization was skipped. It is primarily relevant for logical replication slots on standby servers where the 'synced' field is true. The value is NULL when synchronization succeeds. Author: Shlok Kyal <shlok.kyal.oss@gmail.com> Reviewed-by: shveta malik <shveta.malik@gmail.com> Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Ashutosh Sharma <ashu.coek88@gmail.com> Reviewed-by: Hou Zhijie <houzj.fnst@fujitsu.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/CAE9k0PkhfKrTEAsGz4DjOhEj1nQ+hbQVfvWUxNacD38ibW3a1g@mail.gmail.com	2025-11-28 05:21:35 +00:00
Michael Paquier	9ccc049dfe	pg_buffercache: Add pg_buffercache_mark_dirty{,_relation,_all}() This commit introduces three new functions for marking shared buffers as dirty by using the functions introduced in `9660906dbd`: * pg_buffercache_mark_dirty() for one shared buffer. * pg_buffercache_mark_dirty_relation() for all the shared buffers in a relation. * pg_buffercache_mark_dirty_all() for all the shared buffers in the pool. The "_all" and "_relation" flavors are designed to address the inefficiency of repeatedly calling pg_buffercache_mark_dirty() for each individual buffer, which can be time-consuming when dealing with a large shared buffers pool. These functions are intended as developer tools and are available only to superusers. There is no need to bump the version of pg_buffercache, `4b203d499c` having done this job in this release cycle. Author: Nazir Bilal Yavuz <byavuz81@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Aidar Imamov <a.imamov@postgrespro.ru> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Joseph Koshakow <koshy44@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Yuhang Qiu <iamqyh@gmail.com> Reviewed-by: Xuneng Zhou <xunengzhou@gmail.com> Discussion: https://postgr.es/m/CAN55FZ0h_YoSqqutxV6DES1RW8ig6wcA8CR9rJk358YRMxZFmw@mail.gmail.com	2025-11-28 09:04:04 +09:00
David Rowley	d167c19295	Fix possibly uninitialized HeapScanDesc.rs_startblock The solution used in `0ca3b1697` to determine the Parallel TID Range Scan's start location was to modify the signature of table_block_parallelscan_startblock_init() to allow the startblock to be passed in as a parameter. This allows the scan limits to be adjusted before that function is called so that the limits are picked up when the parallel scan starts. The commit made it so the call to table_block_parallelscan_startblock_init uses the HeapScanDesc's rs_startblock to pass the startblock to the parallel scan. That all works ok for Parallel TID Range scans as the HeapScanDesc rs_startblock gets set by heap_setscanlimits(), but for Parallel Seq Scans, initscan() does not initialize rs_startblock, and that results in passing an uninitialized value to table_block_parallelscan_startblock_init() as noted by the buildfarm member skink, running Valgrind. To fix this issue, make it so initscan() sets the rs_startblock for parallel scans unless we're doing a rescan. This makes it so table_block_parallelscan_startblock_init() will be called with the startblock set to InvalidBlockNumber, and that'll allow the syncscan code to find the correct start location (when enabled). For Parallel TID Range Scans, this InvalidBlockNumber value will be overwritten in the call to heap_setscanlimits(). initscan() is a bit light on documentation on what's meant to get initialized where for parallel scans. From what I can tell, it looks like it just didn't matter prior to `0ca3b1697` that rs_startblock was left uninitialized for parallel scans. To address the light documentation, I've also added some comments to mention that the syncscan location for parallel scans is figured out in table_block_parallelscan_startblock_init. I've also taken the liberty to adjust the if/else if/else code in initscan() to make it clearer which parts apply to parallel scans and which parts are for the serial scans. Author: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/CAApHDvqALm+k7FyfdQdCw1yF_8HojvR61YRrNhwRQPE=zSmnQA@mail.gmail.com	2025-11-28 12:40:50 +13:00
Michael Paquier	c75bf57a90	doc: Add missing tags in pg_buffercache page Issue noticed while looking at this area of the documentation, for a different patch. This is a matter of style, so no backpatch is done. Discussion: https://postgr.es/m/CAN55FZ0h_YoSqqutxV6DES1RW8ig6wcA8CR9rJk358YRMxZFmw@mail.gmail.com	2025-11-28 08:00:23 +09:00
Michael Paquier	9660906dbd	Add routines for marking buffers dirty efficiently This commit introduces new internal bufmgr routines for marking shared buffers as dirty: * MarkDirtyUnpinnedBuffer() * MarkDirtyRelUnpinnedBuffers() * MarkDirtyAllUnpinnedBuffers() These functions provide an efficient mechanism to respectively mark one buffer, all the buffers of a relation, or the entire shared buffer pool as dirty, something that can be useful to force patterns for the checkpointer. MarkDirtyUnpinnedBufferInternal(), an extra routine, is used by these three, to mark as dirty an unpinned buffer. They are intended as developer tools to manipulate buffer dirtiness in bulk, and will be used in a follow-up commit. Author: Nazir Bilal Yavuz <byavuz81@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Aidar Imamov <a.imamov@postgrespro.ru> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Joseph Koshakow <koshy44@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Yuhang Qiu <iamqyh@gmail.com> Reviewed-by: Xuneng Zhou <xunengzhou@gmail.com> Discussion: https://postgr.es/m/CAN55FZ0h_YoSqqutxV6DES1RW8ig6wcA8CR9rJk358YRMxZFmw@mail.gmail.com	2025-11-28 07:39:33 +09:00
Tom Lane	5528e8d104	Allow indexscans on partial hash indexes with implied quals. Normally, if a WHERE clause is implied by the predicate of a partial index, we drop that clause from the set of quals used with the index, since it's redundant to test it if we're scanning that index. However, if it's a hash index (or any !amoptionalkey index), this could result in dropping all available quals for the index's first key, preventing us from generating an indexscan. It's fair to question the practical usefulness of this case. Since hash only supports equality quals, the situation could only arise if the index's predicate is "WHERE indexkey = constant", implying that the index contains only one hash value, which would make hash a really poor choice of index type. However, perhaps there are other !amoptionalkey index AMs out there with which such cases are more plausible. To fix, just don't filter the candidate indexquals this way if the index is !amoptionalkey. That's a bit hokey because it may result in testing quals we didn't need to test, but to do it more accurately we'd have to redundantly identify which candidate quals are actually usable with the index, something we don't know at this early stage of planning. Doesn't seem worth the effort. Reported-by: Sergei Glukhov <s.glukhov@postgrespro.ru> Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/e200bf38-6b45-446a-83fd-48617211feff@postgrespro.ru Backpatch-through: 14	2025-11-27 13:09:59 -05:00
Fujii Masao	246ec4a51c	doc: Fix misleading synopsis for CREATE/ALTER PUBLICATION. The documentation for CREATE/ALTER PUBLICATION previously showed: [ ONLY ] table_name [ * ] [ ( column_name [, ... ] ) ] [ WHERE ( expression ) ] [, ... ] to indicate that the table/column specification could be repeated. However, placing [, ... ] directly after a multi-part construct was misleading and made it unclear which portion was repeatable. This commit introduces a new term, table_and_columns, to represent: [ ONLY ] table_name [ * ] [ ( column_name [, ... ] ) ] [ WHERE ( expression ) ] and updates the synopsis to use: table_and_columns [, ... ] which clearly identifies the repeatable element. Backpatched to v15, where the misleading syntax was introduced. Author: Peter Smith <smithpb2250@gmail.com> Reviewed-by: Chao Li <lic@highgo.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/CAHut+PtsyvYL3KmA6C8f0ZpXQ=7FEqQtETVy-BOF+cm9WPvfMQ@mail.gmail.com Backpatch-through: 15	2025-11-27 23:29:57 +09:00
Álvaro Herrera	9e8fa05d34	Fix new test for CATCACHE_FORCE_RELEASE builds Two of the isolation tests introduced by commit `bc32a12e0d` had a problem under CATCACHE_FORCE_RELEASE, as evidenced by buildfarm member prion. An injection point is hit ahead of what the test spec expects, so a session goes to sleep and there's no one there to wake it up. Fix in the simplest possible way, which is to conditionally wake the process up if it's waiting. An alternative output file is necessary to cover both cases. This suggests a couple of possible improvements to the injection points infrastructure: a conditional wakeup (doing nothing if no one is sleeping, as opposed to throwing an error), as well as a way to attach to a point in "deactivated" mode, activated later. Author: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Mihail Nikalayeu <mihailnikalayeu@gmail.com> Discussion: https://postgr.es/m/202511261817.fyixgtt3hqdr@alvherre.pgsql	2025-11-27 13:10:56 +01:00
Daniel Gustafsson	e396a18f32	doc: Fix typo in pg_dump documentation Reported-by: Erik Rijkers <er@xs4all.nl> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/7596672c-43e8-a030-0850-2dd09af98cac@xs4all.nl	2025-11-27 09:25:56 +01:00
Peter Eisentraut	e7075a3405	Use C11 alignas in pg_atomic_uint64 definitions They were already using pg_attribute_aligned. This replaces that with alignas and moves that into the required syntactic position. This ends up making these three atomics implementations appear a bit more consistent, but shouldn't change anything otherwise. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/46f05236-d4d4-4b4e-84d4-faa500f14691%40eisentraut.org	2025-11-27 07:53:34 +01:00
Amit Langote	519fa0433b	Fix error reporting for SQL/JSON path type mismatches transformJsonFuncExpr() used exprType()/exprLocation() on the possibly coerced path expression, which could be NULL when coercion to jsonpath failed, leading to "cache lookup failed for type 0" errors. Preserve the original expression node so that type and location in the "must be of type jsonpath" error are reported correctly. Add regression tests to cover these cases. Reported-by: Jian He <jian.universality@gmail.com> Author: Jian He <jian.universality@gmail.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Discussion: https://postgr.es/m/CACJufxHunVg81JMuNo8Yvv_hJD0DicgaVN2Wteu8aJbVJPBjZA@mail.gmail.com Backpatch-through: 17	2025-11-27 12:07:01 +09:00
David Rowley	0ca3b16973	Add parallelism support for TID Range Scans In v14, `bb437f995` added support for scanning for ranges of TIDs using a dedicated executor node for the purpose. Here, we allow these scans to be parallelized. The range of blocks to scan is divvied up similarly to how a Parallel Seq Scan does that, where 'chunks' of blocks are allocated to each worker and the size of those chunks is slowly reduced down to 1 block per worker by the time we're nearing the end of the scan. Doing that means workers finish at roughly the same time. Allowing TID Range Scans to be parallelized removes the dilemma from the planner as to whether a Parallel Seq Scan will cost less than a non-parallel TID Range Scan due to the CPU concurrency of the Seq Scan (disk costs are not divided by the number of workers). It was possible the planner could choose the Parallel Seq Scan which would result in reading more blocks during execution than the TID Scan would have. Allowing Parallel TID Range Scans removes the trade-off the planner makes when choosing between reduced CPU costs due to parallelism vs additional I/O from the Parallel Seq Scan due to it scanning blocks from outside of the required TID range. There are also, of course, the traditional parallelism performance benefits to be gained, which likely don't need to be explained here. Author: Cary Huang <cary.huang@highgo.ca> Author: David Rowley <dgrowleyml@gmail.com> Reviewed-by: Junwang Zhao <zhjwpku@gmail.com> Reviewed-by: Rafia Sabih <rafia.pghackers@gmail.com> Reviewed-by: Steven Niu <niushiji@gmail.com> Discussion: https://postgr.es/m/18f2c002a24.11bc2ab825151706.3749144144619388582@highgo.ca	2025-11-27 14:05:04 +13:00
David Rowley	42473b3b31	Have the planner replace COUNT(ANY) with COUNT(*), when possible This adds SupportRequestSimplifyAggref to allow pg_proc.prosupport functions to receive an Aggref and allow them to determine if there is a way that the Aggref call can be optimized. Also added is a support function to allow transformation of COUNT(ANY) into COUNT(*). This is possible to do when the given "ANY" cannot be NULL and also that there are no ORDER BY / DISTINCT clauses within the Aggref. This is a useful transformation to do as it is common that people write COUNT(1), which until now has added unneeded overhead. When counting a NOT NULL column, the overheads can be worse as that might mean deforming more of the tuple, which for large fact tables may be many columns in. It may be possible to add prosupport functions for other aggregates. We could consider if ORDER BY could be dropped for some calls, e.g. the ORDER BY is quite useless in MAX(c ORDER BY c). There is a little bit of passing fallout from adjusting expr_is_nonnullable() to handle Const which results in a plan change in the aggregates.out regression test. Previously, nothing was able to determine that "One-Time Filter: (100 IS NOT NULL)" was always true, therefore useless to include in the plan. Author: David Rowley <dgrowleyml@gmail.com> Reviewed-by: Corey Huinker <corey.huinker@gmail.com> Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com> Discussion: https://postgr.es/m/CAApHDvqGcPTagXpKfH=CrmHBqALpziThJEDs_MrPqjKVeDF9wA@mail.gmail.com	2025-11-27 10:43:28 +13:00
Nathan Bossart	dbdc717ac6	Teach DSM registry to retry entry initialization if needed. If DSM registry entry initialization fails, backends could try to use an uninitialized DSM segment, DSA, or dshash table (since the entry is still added to the registry). To fix, restructure the code so that the registry retries initialization as needed. This commit also modifies pg_get_dsm_registry_allocations() to leave out partially-initialized entries, as they shouldn't have any allocated memory. DSM registry entry initialization shouldn't fail often in practice, but retrying was deemed better than leaving entries in a permanently failed state (as was done by commit `1165a933aa`, which has since been reverted). Suggested-by: Robert Haas <robertmhaas@gmail.com> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Discussion: https://postgr.es/m/E1vJHUk-006I7r-37%40gemulon.postgresql.org Backpatch-through: 17	2025-11-26 15:12:25 -06:00
Jeff Davis	1476028225	Allow pg_locale_t APIs to work when ctype_is_c. Previously, the caller needed to check ctype_is_c first for some routines and not others. Now, the APIs consistently work, and the caller can just check ctype_is_c for optimization purposes. Discussion: https://postgr.es/m/450ceb6260cad30d7afdf155d991a9caafee7c0d.camel@j-davis.com Reviewed-by: Chao Li <li.evan.chao@gmail.com>	2025-11-26 12:54:37 -08:00
Daniel Gustafsson	1cdb84bb1b	Check for correct version of perltidy pgperltidy requires a particular version of perltidy, but the version wasn't checked like how pgindent checks the underlying indent binary. Fix by checking the version of perltidy and error out if an incorrect version is used. Author: Daniel Gustafsson <daniel@yesql.se> Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Andrew Dunstan <andrew@dunslane.net> Discussion: https://postgr.es/m/1209850.1764092152@sss.pgh.pa.us	2025-11-26 20:43:09 +01:00
Jeff Davis	8d299052fe	Add #define for UNICODE_CASEMAP_BUFSZ. Useful for mapping a single codepoint at a time into a statically-allocated buffer. Discussion: https://postgr.es/m/450ceb6260cad30d7afdf155d991a9caafee7c0d.camel@j-davis.com Reviewed-by: Chao Li <li.evan.chao@gmail.com>	2025-11-26 10:05:11 -08:00
Jeff Davis	ec4997a9d7	Inline pg_ascii_tolower() and pg_ascii_toupper(). Discussion: https://postgr.es/m/450ceb6260cad30d7afdf155d991a9caafee7c0d.camel@j-davis.com Reviewed-by: Chao Li <li.evan.chao@gmail.com>	2025-11-26 10:04:32 -08:00
Nathan Bossart	2dd506b859	Revert "Teach DSM registry to ERROR if attaching to an uninitialized entry." This reverts commit `1165a933aa` (and the corresponding commits on the back-branches). In a follow-up commit, we'll teach the registry to retry entry initialization instead of leaving it in a permanently failed state. Reviewed-by: Robert Haas <robertmhaas@gmail.com> Discussion: https://postgr.es/m/E1vJHUk-006I7r-37%40gemulon.postgresql.org Backpatch-through: 17	2025-11-26 11:37:21 -06:00
Melanie Plageman	e135e04457	Split heap_page_prune_and_freeze() into helpers Refactor the setup and planning phases of pruning and freezing into helpers. This streamlines heap_page_prune_and_freeze() and makes it more clear when the examination of tuples ends and page modifications begin. No code change beyond what was required to extract the code into helper functions. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/mhf4vkmh3j57zx7vuxp4jagtdzwhu3573pgfpmnjwqa6i6yj5y%40sy4ymcdtdklo	2025-11-26 11:00:34 -05:00
Nathan Bossart	9446f918ac	Remove a few unused struct members. Oversights in commits `ab9e0e718a`, `f3049a603a`, and `247ce06b88`. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://postgr.es/m/aScUuBSawPWogUxs%40ip-10-97-1-34.eu-west-3.compute.internal	2025-11-26 09:50:00 -06:00
Daniel Gustafsson	348020caa7	ssl: Add connection and reload tests for key passphrases ssl_passphrase_command_supports_reload was not covered by the SSL testsuite, and connection tests after unlocking secrets with the passphrase were also missing. This adds test coverage for reloads of passphrase commands as well as connection attempts which test the different codepaths for Windows and non-EXEC_BACKEND builds. Author: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/5F301096-921A-427D-8EC1-EBAEC2A35082@yesql.se	2025-11-26 14:24:34 +01:00
Daniel Gustafsson	b3fe098d33	Add GUC to show EXEC_BACKEND state There is no straightforward way to determine if a cluster is running in EXEC_BACKEND mode or not, which is useful for tests to know. This adds a GUC debug_exec_backend similar to debug_assertions which will be true when the server is running in EXEC_BACKEND mode. Author: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/5F301096-921A-427D-8EC1-EBAEC2A35082@yesql.se	2025-11-26 14:24:27 +01:00
Daniel Gustafsson	0f4f45772c	doc: Clarify passphrase command reloading on Windows When running on Windows (or EXEC_BACKEND) the SSL configuration will be reloaded on each backend start, so the passphrase command will be reloaded along with it. This implies that passphrase command reload must be enabled on Windows for connections to work at all. Document this since it wasn't mentioned explicitly, and while there add markup for parameter value to match the rest of the docs. Backpatch to all supported versions. Author: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/5F301096-921A-427D-8EC1-EBAEC2A35082@yesql.se Backpatch-through: 14	2025-11-26 14:24:04 +01:00
Peter Eisentraut	8fe4aef829	Replace internal C function pg_hypot() by standard hypot() The code comment said, "It is expected that this routine will eventually be replaced with the C99 hypot() function.", so let's do that now. This function is tested via the geometry regression test, so if it is faulty on any platform, it will show up there. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/170308e6-a7a3-4484-87b2-f960bb564afa%40eisentraut.org	2025-11-26 07:48:29 +01:00
Jacob Champion	47c7a7ebc8	oauth_validator: Shorten JSON responses in test logs Response padding from the oauth_validator abuse tests was adding a couple megabytes to the test logs. We don't need the buildfarm to hold onto that, and we don't need to read it when debugging; truncate it. Reported-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/202511251218.zfs4nu2qnh2m%40alvherre.pgsql Backpatch-through: 18	2025-11-25 20:39:47 -08:00
Amit Kapila	e3e787ca02	Fix test failure caused by commit `76b78721ca`. The test failed because it assumed that a newly created logical replication slot could be synced to the standby by the slotsync worker. However, the presence of an existing physical slot caused the new logical slot to use a non-latest xmin. On the standby, the DDL had already been replayed, advancing xmin, which led to the slotsync worker failing to sync the lagging logical slot. To resolve this, we moved the slot sync statistics tests to run after the tests that do not require the newly created slot to be sync-ready. As per buildfarm. Author: Hou Zhijie <houzj.fnst@fujitsu.com> Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/OSCPR01MB14966FE0BFB6C212298BFFEDEF5D1A@OSCPR01MB14966.jpnprd01.prod.outlook.com	2025-11-26 03:26:57 +00:00
Michael Paquier	e1405aa5e3	Add input function for data type pg_dependencies pg_dependencies is used as data type for the contents of dependencies extended statistics. This new input function consumes the format that has been established by `e76defbcf0` for the output function of pg_dependencies, enforcing some sanity checks for: - Checks for the input object, which should be a one-dimension array with correct attributes and values. - The key names: "attributes", "dependency", "degree". All are required, other key names are blocked. - Value types for each key: "attributes" requires an array of integers, "dependency" an attribute number, "degree" a float. - List of attributes. In this case, it is possible that some dependencies are not listed in the statistics data, as items with a degree of 0 are discarded when building the statistics. This commit includes checks for simple scenarios, like duplicated attributes, or overlapping values between the list of "attributes" and the "dependency" value. Even if the input function considers the input as valid, a value still needs to be cross-checked with the attributes defined in a statistics object at import. - Based on the discussion, the checks on the values are loose, as there is also an argument for potential stats injection. For example, "degree" should be defined in [0.0,1.0], but a check is not enforced. This is required for a follow-up patch that aims to implement the import of extended statistics. Some tests are added to check the code paths of the JSON parser checking the shape of the pg_dependencies inputs, with 91% of code coverage reached. The tests are located in their own new test file, for clarity. Author: Corey Huinker <corey.huinker@gmail.com> Reviewed-by: Jian He <jian.universality@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Yuefei Shi <shiyuefei1004@gmail.com> Discussion: https://postgr.es/m/CADkLM=dpz3KFnqP-dgJ-zvRvtjsa8UZv8wDAQdqho=qN3kX0Zg@mail.gmail.com	2025-11-26 10:53:16 +09:00
Michael Paquier	44eba8f06e	Add input function for data type pg_ndistinct pg_ndistinct is used as data type for the contents of ndistinct extended statistics. This new input function consumes the format that has been established by `1f927cce44` for the output function of pg_ndistinct, enforcing some sanity checks for: - Checks for the input object, which should be a one-dimension array with correct attributes and values. - The key names: "attributes", "ndistinct". Both are required, other key names are blocked. - Value types for each key: "attributes" requires an array of integers, and "ndistinct" an integer. - List of attributes. Note that this enforces a check so that an attribute list has to be a subset of the longest attribute list found. This does not enforce that a full group of attribute sets exist, based on how the groups are generated when the ndistinct objects are generated, making the list of ndistinct items a bit loose. Note a check would still be required at import to see if the attributes listed match with the attribute numbers set in the definition of a statistics object. - Based on the discussion, the checks on the values are loose, as there is also an argument for potential stats injection. The relation and attribute level stats follow the same line of argument for the values. This is required for a follow-up patch that aims to implement the import of extended statistics. Some tests are added to check the code paths of the JSON parser checking the shape of the pg_ndistinct inputs, with 90% of code coverage reached. The tests are located in their own new test file, for clarity. Author: Corey Huinker <corey.huinker@gmail.com> Reviewed-by: Jian He <jian.universality@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Yuefei Shi <shiyuefei1004@gmail.com> Discussion: https://postgr.es/m/CADkLM=dpz3KFnqP-dgJ-zvRvtjsa8UZv8wDAQdqho=qN3kX0Zg@mail.gmail.com	2025-11-26 10:13:18 +09:00
Melanie Plageman	cd38b7e773	Assert that cutoffs are provided if freezing will be attempted heap_page_prune_and_freeze() requires the caller to initialize PruneFreezeParams->cutoffs so that the function can correctly evaluate whether tuples should be frozen. This requirement previously existed only in comments and was easy to miss, especially after “cutoffs” was converted from a direct function parameter to a field of the newly introduced PruneFreezeParams struct (added in `1937ed7062`). Adding an assert makes this requirement explicit and harder to violate. Also, fix a minor typo while we're at it. Author: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Discussion: https://postgr.es/m/0AC177F5-5E26-45EE-B273-357C51212AC5%40gmail.com	2025-11-25 16:41:29 -05:00
Jeff Davis	3b9c118920	Remove a useless length check. Author: Chao Li <lic@highgo.com> Discussion: https://postgr.es/m/CAEoWx2mW0P8CByavV58zm3=eb2MQHaKOcDEF5B2UJYRyC2c3ig@mail.gmail.com	2025-11-25 11:39:41 -08:00
Álvaro Herrera	33bb3dc2d8	pg_dump tests: don't put dumps in stdout This bloats the regression log files for no reason. Backpatch to 18; no further only because it fails to apply cleanly. (It's just whitespace change that conflicts, but I don't think this warrants more effort than this.) Discussion: https://postgr.es/m/202511251218.zfs4nu2qnh2m@alvherre.pgsql	2025-11-25 19:08:36 +01:00
Álvaro Herrera	417ac9c1ee	Improve test case stability Given unlucky timing, some of the new tests added by commit `bc32a12e0d` can fail spuriously. We haven't seen such failures yet in buildfarm, but allegedly we can prevent them with this tweak. While at it, remove an unused injection point I (Álvaro) added. Author: Mihail Nikalayeu <mihailnikalayeu@gmail.com> Discussion: https://postgr.es/m/CADzfLwUc=jtSUEaQCtyt8zTeOJ-gHZ8=w_KJsVjDOYSLqaY9Lg@mail.gmail.com Discussion: https://postgr.es/m/CADzfLwV5oQq-Vg_VmG_o4SdL6yHjDoNO4T4pMtgJLzYGmYf74g@mail.gmail.com	2025-11-25 18:20:06 +01:00
Peter Eisentraut	7169c0b96b	gen_guc_tables.pl: Validate required GUC fields before code generation Previously, gen_guc_tables.pl would emit "Use of uninitialized value" warnings if required fields were missing in guc_parameters.dat (for example, when an integer or real GUC omitted the 'max' value). The resulting error messages were unclear and did not identify which GUC entry was problematic. Add explicit validation of required fields depending on the parameter type, and fail with a clear and specific message such as: guc_parameters.dat:1909: error: entry "max_index_keys" of type "int" is missing required field "max" No changes to generated guc_tables.c. Author: Chao Li <lic@highgo.com> Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://www.postgresql.org/message-id/flat/CAEoWx2%3DoP4LgHi771_OKhPPUS7B-CTqCs%3D%3DuQcNXWrwBoAm5Vg%40mail.gmail.com	2025-11-25 16:50:34 +01:00
Peter Eisentraut	2256af4ba2	backend/nodes cleanup: Move loop variables definitions into for statement Author: Chao Li (Evan) <lic@highgo.com> Discussion: https://www.postgresql.org/message-id/flat/CAEoWx2nP12qwAaiJutbn1Kw50atN6FbMJNQ4bh4%2BfP_Ay_u7Eg%40mail.gmail.com	2025-11-25 15:38:23 +01:00
Amit Kapila	3df4df53b0	Fix a BF failure caused by commit `76b78721ca`. The issue occurred because the replication slot was not released in the slotsync worker when a slot synchronization cycle was skipped. This skip happened because the required WAL was not received and flushed on the standby server. As a result, in the next cycle, when attempting to acquire the slot, an assertion failure was triggered. Author: Hou Zhijie <houzj.fnst@fujitsu.com> Discussion: https://postgr.es/m/CAA4eK1KMwYUYy=oAVHu9mam+vX50ixxfhO4_C=kgQC8VCQHEfw@mail.gmail.com	2025-11-25 08:49:46 +00:00
Amit Kapila	76b78721ca	Add slotsync skip statistics. This patch adds two new columns to the pg_stat_replication_slots view: slotsync_skip_count - the total number of times a slotsync operation was skipped. slotsync_skip_at - the timestamp of the most recent skip. These additions provide better visibility into replication slot synchronization behavior. A future patch will introduce the slotsync_skip_reason column in pg_replication_slots to capture the reason for skip. Author: Shlok Kyal <shlok.kyal.oss@gmail.com> Reviewed-by: shveta malik <shveta.malik@gmail.com> Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Ashutosh Sharma <ashu.coek88@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/CAE9k0PkhfKrTEAsGz4DjOhEj1nQ+hbQVfvWUxNacD38ibW3a1g@mail.gmail.com	2025-11-25 07:06:02 +00:00
Peter Eisentraut	c581c9a7ac	Remove obsolete comment This comment should probably have been moved to pg_locale_libc.c in commit `66ac94cdc7` (2024), but upon closer examination it was already completely obsolete then. The first part of the comment has been obsolete since commit `85feb77aa0` (2017), which required that the system provides the required wide-character functions. The second part has been obsolete since commit `e9931bfb75` (2024), which eliminated code paths depending on the global LC_CTYPE setting. Discussion: https://www.postgresql.org/message-id/flat/170308e6-a7a3-4484-87b2-f960bb564afa%40eisentraut.org	2025-11-25 06:26:49 +01:00
Michael Paquier	ed823da128	Rename routines for write/read of pgstats file This commit renames write_chunk and read_chunk to pgstat_write_chunk() and pgstat_read_chunk(), respectively, along with the *_s convenience macros. These are made available for plug-ins, so that any code that decides to write and/or read stats data can rely on a single code path for this work. Extracted from a larger patch by the same author. Author: Sami Imseih <samimseih@gmail.com> Discussion: https://postgr.es/m/CAA5RZ0s9SDOu+Z6veoJCHWk+kDeTktAtC-KY9fQ9Z6BJdDUirQ@mail.gmail.com	2025-11-25 10:55:40 +09:00
Andres Freund	81f7738953	lwlock: Fix, currently harmless, bug in LWLockWakeup() Accidentally the code in LWLockWakeup() checked the list of to-be-woken up processes to see if LW_FLAG_HAS_WAITERS should be unset. That means that HAS_WAITERS would not get unset immediately, but only during the next, unnecessary, call to LWLockWakeup(). Luckily, as the code stands, this is just a small efficiency issue. However, if there were (as in a patch of mine) a case in which LWLockWakeup() would not find any backend to wake, despite the wait list not being empty, we'd wrongly unset LW_FLAG_HAS_WAITERS, leading to potentially hanging. While the consequences in the backbranches are limited, the code as-is is confusing, and it is possible that there are workloads where the additional wait list lock acquisitions hurt, therefore backpatch. Discussion: https://postgr.es/m/fvfmkr5kk4nyex56ejgxj3uzi63isfxovp2biecb4bspbjrze7@az2pljabhnff Backpatch-through: 14	2025-11-24 18:10:48 -05:00
Jeff Davis	f81bf78ce1	Avoid global LC_CTYPE dependency in pg_locale_libc.c. Call tolower_l() directly instead of through pg_tolower(), because the latter depends on the global LC_CTYPE. Discussion: https://postgr.es/m/8186b28a1a39e61a0d833a4c25a8909ebbbabd48.camel@j-davis.com	2025-11-24 14:55:09 -08:00
Tom Lane	698fa924b1	Improve detection of implicitly-temporary views. We've long had a practice of making views temporary by default if they reference any temporary tables. However the implementation was pretty incomplete, in that it only searched for RangeTblEntry references to temp relations. Uses of temporary types, regclass constants, etc were not detected even though the dependency mechanism considers them grounds for dropping the view. Thus a view not believed to be temp could silently go away at session exit anyhow. To improve matters, replace the ad-hoc isQueryUsingTempRelation() logic with use of the dependency-based infrastructure introduced by commit `572c40ba9`. This is complete by definition, and it's less code overall. While we're at it, we can also extend the warning NOTICE (or ERROR in the case of a materialized view) to mention one of the temp objects motivating the classification of the view as temp, as was done for functions in `572c40ba9`. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Jim Jones <jim.jones@uni-muenster.de> Discussion: https://postgr.es/m/19cf6ae1-04cd-422c-a760-d7e75fe6cba9@uni-muenster.de	2025-11-24 17:00:16 -05:00
Jacob Champion	0664aa4ff8	Reorganize pqcomm.h a bit Group the PG_PROTOCOL() codes, add a comment to AuthRequest now that the AUTH_REQ codes live in a different header, and make some small adjustments to spacing and comment style for the sake of scannability. Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl> Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Discussion: https://postgr.es/m/CAOYmi%2B%3D6zg4oXXOQtifrVao_YKiujTDa3u6bxnU08r0FsSig4g%40mail.gmail.com	2025-11-24 10:01:30 -08:00
Jacob Champion	e2ceff13d8	postgres: Use pg_{add,mul}_size_overflow() The backend implementations of add_size() and mul_size() can now make use of the APIs provided in common/int.h. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CAOYmi%2B%3D%2BpqUd2MUitvgW1pAJuXgG_TKCVc3_Ek7pe8z9nkf%2BAg%40mail.gmail.com	2025-11-24 09:59:54 -08:00
Jacob Champion	8934f2136c	Add pg_add_size_overflow() and friends Commit `600086f47` added (several bespoke copies of) size_t addition with overflow checks to libpq. Move this to common/int.h, along with its subtraction and multiplication counterparts. pg_neg_size_overflow() is intentionally omitted; I'm not sure we should add SSIZE_MAX to win32_port.h for the sake of a function with no callers. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CAOYmi%2B%3D%2BpqUd2MUitvgW1pAJuXgG_TKCVc3_Ek7pe8z9nkf%2BAg%40mail.gmail.com	2025-11-24 09:59:38 -08:00
Jacob Champion	f1881c7dd3	Make some use of anonymous unions [libpq-oauth] Make some use of anonymous unions, which are allowed as of C11, as examples and encouragement for future code, and to test compilers. This commit changes the json_field struct. Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Discussion: https://postgr.es/m/CAOYmi%2BnV25oC5uXFgWodydGrHkfWMDCLUcjbAreM3mNX%3DF2JWw%40mail.gmail.com	2025-11-24 09:55:16 -08:00
Álvaro Herrera	bc32a12e0d	Fix infer_arbiter_index during concurrent index operations Previously, we would only consider indexes marked indisvalid as usable for INSERT ON CONFLICT. But that's problematic during CREATE INDEX CONCURRENTLY and REINDEX CONCURRENTLY, because concurrent transactions would end up with inconsistent lists of inferred indexes, leading to deadlocks and spurious errors about unique key violations (because two transactions are operating on different indexes for the speculative insertion tokens). Change this function to return indexes even if invalid. This fixes the spurious errors and deadlocks. Because such indexes might not be complete, we still need uniqueness to be verified in a different way. We do that by requiring that at least one index marked valid is part of the set of indexes returned. It is that index that is going to help ensure that the inserted tuple is indeed unique. This does not fix similar problems occurring with partitioned tables or with named constraints. These problems will be fixed in follow-up commits. We have no user report of this problem, even though it exists in all branches. Because of that and given that the fix is somewhat tricky, I decided not to backpatch for now. Author: Mihail Nikalayeu <mihailnikalayeu@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/CANtu0ogv+6wqRzPK241jik4U95s1pW3MCZ3rX5ZqbFdUysz7Qw@mail.gmail.com	2025-11-24 17:17:29 +01:00
Michael Paquier	e429c3cecb	Move isolation test index-killtuples to src/test/modules/index/ index-killtuples test depends on the contrib modules btree_gin and btree_gist, which would not be installed in a temporary installation with an execution of the main isolation test suite like this one: make -C src/test/isolation/ check src/test/isolation/ should not depend on contrib/, and EXTRA_INSTALL has no effect in this case as this test suite uses its own Makefile rules. This commit moves index-killtuples into its new module, called "index", whose name looks like the best fit there can be as it depends on more than one index AM. btree_gin and btree_gist are now pulled in the temporary installation with EXTRA_INSTALL. The test is renamed to "killtuples", for simplicity. Author: Nazir Bilal Yavuz <byavuz81@gmail.com> Suggested-by: Andres Freund <andres@anarazel.de> Suggested-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/aKJsWedftW7UX1WM@paquier.xyz	2025-11-24 19:33:51 +09:00
Peter Eisentraut	d4c0f91f7d	C11 alignas instead of unions -- extended alignments This replaces some uses of pg_attribute_aligned() with the standard alignas() for cases where extended alignment (larger than max_align_t) is required. This patch stipulates that all supported compilers must support alignments up to PG_IO_ALIGN_SIZE, but that seems pretty likely. We can then also desupport the case where direct I/O is disabled because pg_attribute_aligned is not supported. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/46f05236-d4d4-4b4e-84d4-faa500f14691%40eisentraut.org	2025-11-24 07:39:37 +01:00
Michael Paquier	4b203d499c	pg_buffercache: Add pg_buffercache_os_pages `ba2a3c2302` has added a way to check if a buffer is spread across multiple pages with some NUMA information, via a new view pg_buffercache_numa that depends on pg_buffercache_numa_pages(), a SQL function. These can only be queried when support for libnuma exists, generating an error if not. However, it can be useful to know how shared buffers and OS pages map when NUMA is not supported or not available. This commit expands the capabilities around pg_buffercache_numa: - pg_buffercache_numa_pages() is refactored as an internal function, able to optionally process NUMA. Its SQL definition prior to this commit is still around to ensure backward-compatibility with v1.6. - A SQL function called pg_buffercache_os_pages() is added, able to work with or without NUMA. - The view pg_buffercache_numa is redefined to use pg_buffercache_os_pages(). - A new view is added, called pg_buffercache_os_pages. This ignores NUMA for its result processing, for better efficiency. The implementation is done so that there is no code duplication between the NUMA and non-NUMA views/functions, relying on one internal function that does the job for all of them. The module is bumped to v1.7. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Reviewed-by: Mircea Cadariu <cadariu.mircea@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/Z/fFA2heH6lpSLlt@ip-10-97-1-34.eu-west-3.compute.internal	2025-11-24 14:29:15 +09:00
David Rowley	07d1dc3aeb	Fix incorrect IndexOptInfo header comment The comment incorrectly indicated that indexcollations[] stored collations for both key columns and INCLUDE columns, but in reality it only has elements for the key columns. canreturn[] didn't get a mention, so add that while we're here. Author: Junwang Zhao <zhjwpku@gmail.com> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/CAEG8a3LwbZgMKOQ9CmZarX5DEipKivdHp5PZMOO-riL0w%3DL%3D4A%40mail.gmail.com Backpatch-through: 14	2025-11-24 17:00:01 +13:00
Tom Lane	572c40ba94	Issue a NOTICE if a created function depends on any temp objects. We don't have an official concept of temporary functions. (You can make one explicitly in pg_temp, but then you have to explicitly schema-qualify it on every call.) However, until now we were quite laissez-faire about whether a non-temporary function could depend on a temporary object, such as a temp table or view. If one does, it will silently go away at end of session, due to the automatic DROP ... CASCADE on the session's temporary objects. People have complained that that's surprising; however, we can't really forbid it because other people (including our own regression tests) rely on being able to do it. Let's compromise by emitting a NOTICE at CREATE FUNCTION time. This is somewhat comparable to our ancient practice of emitting a NOTICE when forcing a view to become temp because it depends on temp tables. Along the way, refactor recordDependencyOnExpr() so that the dependencies of an expression can be combined with other dependencies, instead of being emitted separately and perhaps duplicatively. We should probably make the implementation of temp-by-default views use the same infrastructure used here, but that's for another patch. It's unclear whether there are any other object classes that deserve similar treatment. Author: Jim Jones <jim.jones@uni-muenster.de> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/19cf6ae1-04cd-422c-a760-d7e75fe6cba9@uni-muenster.de	2025-11-23 15:02:55 -05:00
Fujii Masao	81966c5458	psql: Improve tab-completion for PREPARE. This commit enhances tab-completion for PREPARE xx AS to also suggest MERGE INTO, VALUES, WITH, and TABLE. Author: Haruna Miwa <miwa@sraoss.co.jp> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/TY7P286MB5466B859BD6C5BE64E961878F1CEA@TY7P286MB5466.JPNP286.PROD.OUTLOOK.COM	2025-11-23 23:03:53 +09:00
Michael Paquier	7d9043aee8	pg_buffercache: Remove unused fields from BufferCacheNumaRec These fields have been added in commit `ba2a3c2302`, and have never been used. While on it, this commit moves a comment that was out of place, improving it. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://postgr.es/m/aSBOKX6pLJzumbmF@ip-10-97-1-34.eu-west-3.compute.internal	2025-11-23 13:37:42 +09:00
Tom Lane	b140c8d7a3	Add SupportRequestInlineInFrom planner support request. This request allows a support function to replace a function call appearing in FROM (typically a set-returning function) with an equivalent SELECT subquery. The subquery will then be subject to the planner's usual optimizations, potentially allowing a much better plan to be generated. While the planner has long done this automatically for simple SQL-language functions, it's now possible for extensions to do it for functions outside that group. Notably, this could be useful for functions that are presently implemented in PL/pgSQL and work by generating and then EXECUTE'ing a SQL query. Author: Paul A Jungwirth <pj@illuminatedcomputing.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/09de6afa-c33d-4d94-a5cb-afc6cea0d2bb@illuminatedcomputing.com	2025-11-22 19:33:34 -05:00
Bruce Momjian	c0bc9af151	tools: remove src/tools/codelines This one-line script never gained general usage since being added in 2005. Backpatch-through: master	2025-11-22 12:02:14 -05:00
Peter Eisentraut	5eed8ce50c	Add range_minus_multi and multirange_minus_multi functions The existing range_minus function raises an exception when the range is "split", because then the result can't be represented by a single range. For example '[0,10)'::int4range - '[4,5)' would be '[0,4)' and '[5,10)'. This commit adds new set-returning functions so that callers can get results even in the case of splits. There is no risk of an exception for multiranges, but a set-returning function lets us handle them the same way we handle ranges. Both functions return zero results if the subtraction would give an empty range/multirange. The main use-case for these functions is to implement UPDATE/DELETE FOR PORTION OF, which must compute the application-time of "temporal leftovers": the part of history in an updated/deleted row that was not changed. To preserve the untouched history, we will implicitly insert one record for each result returned by range/multirange_minus_multi. Using a set-returning function will also let us support user-defined types for application-time update/delete in the future. Author: Paul A. Jungwirth <pj@illuminatedcomputing.com> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/ec498c3d-5f2b-48ec-b989-5561c8aa2024%40illuminatedcomputing.com	2025-11-22 09:42:03 +01:00
Thomas Munro	0dceba21d7	jit: Adjust AArch64-only code for LLVM 21. LLVM 21 changed the arguments of RTDyldObjectLinkingLayer's constructor, breaking compilation with the backported SectionMemoryManager from commit `9044fc1d`. `cd585864c0` Backpatch-through: 14 Author: Holger Hoffstätte <holger@applied-asynchrony.com> Reviewed-by: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com> Discussion: https://postgr.es/m/d25e6e4a-d1b4-84d3-2f8a-6c45b975f53d%40applied-asynchrony.com	2025-11-22 21:21:11 +13:00
Andrew Dunstan	51da766494	Add 'make check-tests' behavior to the meson based builds There was no easy way to run specific tests in the meson based builds. Author: Nazir Bilal Yavuz <byavuz81@gmail.com> Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-by: Jian He <jian.universality@gmail.com> Discussion: postgr.es/m/CAExHW5tK-QqayUN0%2BN3MF5bjV6vLKDCkRuGwoDJwc7vGjwCygQ%40mail.gmail.com	2025-11-21 17:12:22 -05:00
Peter Eisentraut	51364113d5	Fix typo in documentation about application time Author: Paul A. Jungwirth <pj@illuminatedcomputing.com>	2025-11-21 17:36:25 +01:00
Peter Eisentraut	ef8fe69360	Remove useless casts to (void *) Their presence causes (small) risks of hiding actual type mismatches or silently discarding qualifiers. Some have been missed in `7f798aca1d` and some are new ones along the same lines. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/aR8Yv%2BuATLKbJCgI%40ip-10-97-1-34.eu-west-3.compute.internal	2025-11-21 16:49:40 +01:00
Heikki Linnakangas	2aabaa52df	Use strtoi64() in pgbench, replacing its open-coded implementation Makes the code a little simpler. The old implementation accepted trailing whitespace, but that was unnecessary. Firstly, its sibling function for parsing decimals, strtodouble(), does not accept trailing whitespace. Secondly, none of the callers can pass a string with trailing whitespace to it. In passing, check specifically for ERANGE before printing the "out of range" error. On some systems, strtoul() and strtod() return EINVAL on an empty or all-spaces string, and "invalid input syntax" is more appropriate for that than "out of range". For the existing strtodouble() function this is purely academic because it's never called with errorOK==false, but let's be tidy. (Perhaps we should remove the dead codepaths altogether, but I'll leave that for another day.) Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Yuefei Shi <shiyuefei1004@gmail.com> Reviewed-by: Neil Chen <carpenter.nail.cz@gmail.com> Discussion: https://www.postgresql.org/message-id/861dd5bd-f2c9-4ff5-8aa0-f82bdb75ec1f@iki.fi	2025-11-21 15:03:11 +02:00
Peter Eisentraut	e6be84356b	Update timezone to C99 This reverts changes done in PostgreSQL over the upstream code to avoid relying on C99 <stdint.h> and <inttypes.h>. In passing, there were a few other minor and cosmetic changes that I left in to improve alignment with upstream, including some C11 feature use (_Noreturn). Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/9ad2749f-77ab-4ecb-a321-1ca915480b05%40eisentraut.org	2025-11-21 13:07:40 +01:00
Peter Eisentraut	97e04c74be	C11 alignas instead of unions This changes a few union members that only existed to ensure alignments and replaces them with the C11 alignas specifier. This change only uses fundamental alignments (meaning approximately alignments of basic types), which all C11 compilers must support. There are opportunities for similar changes using extended alignments, for example in PGIOAlignedBlock, but these are not necessarily supported by all compilers, so they are kept as a separate change. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/46f05236-d4d4-4b4e-84d4-faa500f14691%40eisentraut.org	2025-11-21 10:08:24 +01:00
Masahiko Sawada	266543a620	Use "COPY table TO" for partitioned tables in initial table synchronization. Commit `4bea91f` added support for "COPY table TO" with partitioned tables. This commit enhances initial table synchronization in logical replication to use "COPY table TO" for partitioned tables if possible, instead of "COPY (SELECT ...) TO" variant, improving performance. Author: Ajin Cherian <itsajin@gmail.com> Discussion: https://postgr.es/m/CAFPTHDY=w+xmEof=yyjhbDzaLxhBkoBzKcksEofXcT6EcjMbtQ@mail.gmail.com	2025-11-20 14:50:27 -08:00
Melanie Plageman	1e14edcea5	Split PruneFreezeParams initializers to one field per line This conforms more closely with the style of other struct initializers in the code base. Initializing multiple fields on a single line is unpopular in part because pgindent won't permit a space after the comma before the next field's period. Author: Melanie Plageman <melanieplageman@gmail.com> Reported-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> Discussion: https://postgr.es/m/87see87fnq.fsf%40wibble.ilmari.org	2025-11-20 17:36:40 -05:00
Bruce Momjian	5d4dc112c7	tools: update tools/codelines to use "git ls-files" This generates a more accurate code count because 'make distclean' doesn't always remove build files. Author: idea from David Rowley Discussion: https://postgr.es/m/aR4hoOotVHB7TXo5@momjian.us Backpatch-through: master	2025-11-20 15:23:39 -05:00
Melanie Plageman	65ec565b19	Update PruneState.all_[visible|frozen] earlier in pruning During pruning and freezing in phase I of vacuum, we delay clearing all_visible and all_frozen in the presence of dead items. This allows opportunistic freezing if the page would otherwise be fully frozen, since those dead items are later removed in vacuum phase III. To move the VM update into the same WAL record that prunes and freezes tuples, we must know whether the page will be marked all-visible/all-frozen before emitting WAL. Previously we waited until after emitting WAL to update all_visible/all_frozen to their correct values. The only barrier to updating these flags immediately after deciding whether to opportunistically freeze was that while emitting WAL for a record freezing tuples, we use the pre-corrected value of all_frozen to compute the snapshot conflict horizon. By determining the conflict horizon earlier, we can update the flags immediately after making the opportunistic freeze decision. This is required to set the VM in the XLOG_HEAP2_PRUNE_VACUUM_SCAN record emitted by pruning and freezing. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Discussion: https://postgr.es/m/flat/CAAKRu_ZMw6Npd_qm2KM%2BFwQ3cMOMx1Dh3VMhp8-V7SOLxdK9-g%40mail.gmail.com	2025-11-20 11:43:18 -05:00
Melanie Plageman	351d7e2441	Keep all_frozen updated in heap_page_prune_and_freeze Previously, we relied on all_visible and all_frozen being used together to ensure that all_frozen was correct, but it is better to keep both fields updated. Future changes will separate their usage, so we should not depend on all_visible for the validity of all_frozen. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Discussion: https://postgr.es/m/flat/CAAKRu_ZMw6Npd_qm2KM%2BFwQ3cMOMx1Dh3VMhp8-V7SOLxdK9-g%40mail.gmail.com	2025-11-20 10:59:24 -05:00
Melanie Plageman	1937ed7062	Refactor heap_page_prune_and_freeze() parameters into a struct heap_page_prune_and_freeze() had accumulated an unwieldy number of input parameters and upcoming work to handle VM updates in this function will add even more. Introduce a new PruneFreezeParams struct to group the function’s input parameters, improving readability and maintainability. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Discussion: https://postgr.es/m/yn4zp35kkdsjx6wf47zcfmxgexxt4h2og47pvnw2x5ifyrs3qc%407uw6jyyxuyf7	2025-11-20 10:32:14 -05:00
Daniel Gustafsson	fa0ffa2877	doc: Assorted documentation improvements A set of wording improvements and spelling fixes. Author: Oleg Sibiryakov <o.sibiryakov@postgrespro.ru> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/e62bedb5-c26f-4d37-b4ed-ce9b55f1e980@postgrespro.ru	2025-11-20 15:04:41 +01:00
Daniel Gustafsson	20bff3d794	doc: Document how to run a subset of regress tests This patch was originally submitted a year ago, but never ended up getting committed. It was later brought up again on a recent thread on the same subject. Original patch by Paul A Jungwirth with some wordsmithing by me based on the review from the original thread. Author: Paul A. Jungwirth <pj@illuminatedcomputing.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Viktor Holmberg <v@viktorh.net> Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com> Discussion: https://postgr.es/m/CA+renyXB5jYG9r5-CaDc4g607EB398QwTk_efEXTzarrO8bPzw@mail.gmail.com Discussion: https://postgr.es/m/CACJufxHOcmeTkoh2CxFHKv9GRnp9sLVzN=LZhqTgvqT++PXZNQ@mail.gmail.com	2025-11-20 14:49:33 +01:00
Tomas Vondra	599336c64f	Handle EPERM in pg_numa_init When running in Docker, the container may not have privileges needed by get_mempolicy(). This is called by numa_available() in libnuma, but versions prior to 2.0.19 did not expect that. The numa_available() call seemingly succeeds, but then we get unexpected failures when trying to query status of pages: postgres =# select * from pg_shmem_allocations_numa; ERROR: XX000: failed NUMA pages inquiry status: Operation not permitted LOCATION: pg_get_shmem_allocations_numa, shmem.c:691 The best solution is to call get_mempolicy() first, and proceed to numa_available() only when it does not fail with EPERM. Otherwise we'd need to treat older libnuma versions as insufficient, which seems a bit too harsh, as this only affects containerized systems. Fix by me, based on suggestions by Christoph. Backpatch to 18, where the NUMA functions were introduced. Reported-by: Christoph Berg <myon@debian.org> Reviewed-by: Christoph Berg <myon@debian.org> Discussion: https://postgr.es/m/aPDZOxjrmEo_1JRG@msg.df7cb.de Backpatch-through: 18	2025-11-20 13:26:49 +01:00
Peter Eisentraut	b5623cc5e4	Remove obsolete cast The upstream timezone code uses a bool variable as an array subscript. Back when PostgreSQL's bool was char, this would have caused a warning from gcc -Wchar-subscripts, which is included in -Wall. But this has been obsolete since probably commit `d26a810ebf`, but certainly since bool is now the C standard bool. So we can remove this deviation from the upstream code, to make future code merges simpler. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/9ad2749f-77ab-4ecb-a321-1ca915480b05%40eisentraut.org	2025-11-20 07:47:48 +01:00
Fujii Masao	aaf035790a	doc: Update pg_upgrade documentation to match recent description changes. Commit `792353f7d5` updated the pg_dump and pg_dumpall documentation to clarify which statistics are not included in their output. The pg_upgrade documentation contained a nearly identical description, but it was not updated at the same time. This commit updates the pg_upgrade documentation to match those changes. Backpatch to v18, where commit `792353f7d5` was backpatched to. Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Bruce Momjian <bruce@momjian.us> Discussion: https://postgr.es/m/CAHGQGwFnfgdGz8aGWVzgFCFwoWQU7KnFFjmxinf4RkQAkzmR+w@mail.gmail.com Backpatch-through: 18	2025-11-20 09:18:51 +09:00
Fujii Masao	99780da720	Add HINT listing valid encodings to encode() and decode() errors. This commit updates encode() and decode() so that when an invalid encoding is specified, their error message includes a HINT listing all valid encodings. This helps users quickly see which encodings are supported without needing to consult the documentation. Author: Shinya Sugamoto <shinya34892@gmail.com> Reviewed-by: Chao Li <lic@highgo.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/CAAe3y+99sfPv8UDF1VM-rC1i5HBdqxUh=2HrbJJFm2+i=1OwOw@mail.gmail.com	2025-11-20 09:14:02 +09:00
Thomas Munro	6b46669883	Drop support for MSVCRT's float formatting quirk. Commit `f1885386` added code to remove an unnecessary leading zero from the exponent in a float formatted by the system snprintf(). The C standard doesn't allow unnecessary digits beyond two, and the tests pass without this on Windows' modern UCRT (required since commit `1758d424`). Discussion: https://postgr.es/m/CA%2BhUKGJnmzTqiODmTjf-23yZ%3DE3HXqFTtKoyp3TF-MpB93hTMQ%40mail.gmail.com	2025-11-20 10:38:15 +13:00
Thomas Munro	7ab9b34614	Drop support for MSVCRT's %I64 format strings. MSVCRT predated C99 and invented non-standard placeholders for 64-bit numbers, and then later used them in standard macros when C99 <inttypes.h> arrived. The macros just use %lld etc when building with UCRT, so there should be no way for our interposed sprintf.c code to receive the pre-standard kind these days. Time to drop the code that parses them. That code was in fact already dead when commit `962da900` landed, as we'd disclaimed MSVCRT support a couple of weeks earlier in commit `1758d424`, but patch development overlapped and the history of these macros hadn't been investigated. Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/4d8b1a67-aab2-4429-b44b-f03988095939%40eisentraut.org	2025-11-20 10:07:27 +13:00
Tom Lane	057012b205	Speed up eqjoinsel() with lots of MCV entries. If both sides of the operator have most-common-value statistics, eqjoinsel wants to check which MCVs have matches on the other side. Formerly it did this with a dumb compare-all-the-entries loop, which had O(N^2) behavior for long MCV lists. When that code was written, twenty-plus years ago, that seemed tolerable; but nowadays people frequently use much larger statistics targets, so that the O(N^2) behavior can hurt quite a bit. To add insult to injury, when asked for semijoin semantics, the entire comparison loop was done over, even though we frequently know that it will yield exactly the same results. To improve matters, switch to using a hash table to perform the matching. Testing suggests that depending on the data type, we may need up to about 100 MCVs on each side to amortize the extra costs of setting up the hash table and performing hash-value computations; so continue to use the old looping method when there are fewer MCVs than that. Also, refactor so that we don't repeat the matching work unless we really need to, which occurs only in the uncommon case where eqjoinsel_semi decides to truncate the set of inner MCVs it considers. The refactoring also got rid of the need to use the presented operator's commutator. Real-world operators that are using eqjoinsel should pretty much always have commutators, but at the very least this saves a few syscache lookups. Author: Ilia Evdokimov <ilya.evdokimov@tantorlabs.com> Co-authored-by: David Geier <geidav.pg@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/20ea8bf5-3569-4e46-92ef-ebb2666debf6@tantorlabs.com	2025-11-19 13:22:12 -05:00
Heikki Linnakangas	d5b4f3a6d4	Print new OldestXID value in pg_resetwal when it's being changed Commit `74cf7d46a9` added the --oldest-transaction-id option to pg_resetwal, but forgot to update the code that prints all the new values that are being set. Fix that. Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://www.postgresql.org/message-id/5461bc85-e684-4531-b4d2-d2e57ad18cba@iki.fi Backpatch-through: 14	2025-11-19 18:05:42 +02:00
Nathan Bossart	cbdce71b99	doc: Update formula for vacuum insert threshold. Oversight in commit `06eae9e621`. Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Discussion: https://postgr.es/m/aRODeqFUVkGDJSPP%40nathan Backpatch-through: 18	2025-11-19 10:01:37 -06:00
Peter Eisentraut	86b276a4a9	Fix indentation for commit `0fc33b0053`	2025-11-19 10:41:28 +01:00
Peter Eisentraut	0fc33b0053	Fix NLS for incorrect GUC enum value hint message The translation markers were applied at the wrong place, so no string was extracted for translation. Also add translator comments here and in a similar place. Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://www.postgresql.org/message-id/2c961fa1-14f6-44a2-985c-e30b95654e8d%40eisentraut.org	2025-11-19 08:48:21 +01:00
Peter Eisentraut	300c8f5324	Add <stdalign.h> to c.h This allows using the C11 constructs alignas and alignof (not done in this patch). Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/46f05236-d4d4-4b4e-84d4-faa500f14691%40eisentraut.org	2025-11-19 08:18:25 +01:00
Richard Guo	4d3f623ea8	Fix typo in nodeHash.c Replace "overlow" with "overflow". Author: Tender Wang <tndrwang@gmail.com> Discussion: https://postgr.es/m/CAHewXNnzFjAjYLTkP78HE2PQ17MjBqFdQQg+0X6Wo7YMUb68xA@mail.gmail.com	2025-11-19 11:04:03 +09:00
Tom Lane	3e83bdd35a	Fix pg_popcount_aarch64.c to build with ancient glibc releases. Like commit `6d969ca68`, except here we are mopping up after `519338ace`. (There are no other uses of <sys/auxv.h> in the tree, so we should be done now.) Reported-by: GaoZengqi <pgf00a@gmail.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAFmBtr3Av62-jBzdhFkDHXJF9vQmNtSnH2upwODjnRcsgdTytw@mail.gmail.com Backpatch-through: 18	2025-11-18 16:16:46 -05:00
Álvaro Herrera	77fb3959a4	Fix typo	2025-11-18 19:31:23 +01:00
Tom Lane	35b5c62c3a	Don't allow CTEs to determine semantic levels of aggregates. The fix for bug #19055 (commit `b0cc0a71e`) allowed CTE references in sub-selects within aggregate functions to affect the semantic levels assigned to such aggregates. It turns out this broke some related cases, leading to assertion failures or strange planner errors such as "unexpected outer reference in CTE query". After experimenting with some alternative rules for assigning the semantic level in such cases, we've come to the conclusion that changing the level is more likely to break things than be helpful. Therefore, this patch undoes what `b0cc0a71e` changed, and instead installs logic to throw an error if there is any reference to a CTE that's below the semantic level that standard SQL rules would assign to the aggregate based on its contained Var and Aggref nodes. (The SQL standard disallows sub-selects within aggregate functions, so it can't reach the troublesome case and hence has no rule for what to do.) Perhaps someone will come along with a legitimate query that this logic rejects, and if so probably the example will help us craft a level-adjustment rule that works better than what `b0cc0a71e` did. I'm not holding my breath for that though, because the previous logic had been there for a very long time before bug #19055 without complaints, and that bug report sure looks to have originated from fuzzing not from real usage. Like `b0cc0a71e`, back-patch to all supported branches, though sadly that no longer includes v13. Bug: #19106 Reported-by: Kamil Monicz <kamil@monicz.dev> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/19106-9dd3668a0734cd72@postgresql.org Backpatch-through: 14	2025-11-18 12:56:55 -05:00
Nathan Bossart	faf4128ad3	Add commit `f63ae72bbc` to .git-blame-ignore-revs.	2025-11-18 10:32:22 -06:00
Nathan Bossart	379f0e9f72	Check for tabs in postgresql.conf.sample. The previous commit updated this file to use spaces instead of tabs. This commit adds a test to ensure that no new tabs are added. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/aReNUKdMgKxLqmq7%40nathan	2025-11-18 10:28:36 -06:00
Nathan Bossart	f63ae72bbc	Switch from tabs to spaces in postgresql.conf.sample. This file is written for 8-space tabs, since we expect that most users who edit their configuration files use 8-space tabs. However, most of PostgreSQL is written for 4-space tabs, and at least one popular web interface defaults to 4-space tabs. Rather than trying to standardize on a particular tab width for this file, let's just switch to spaces. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/aReNUKdMgKxLqmq7%40nathan	2025-11-18 10:28:36 -06:00
Nathan Bossart	aeebb49b7c	Update .editorconfig and .gitattributes for postgresql.conf.sample. This commit updates .editorconfig and .gitattributes in preparation for a follow-up commit that will modify this file to use spaces instead of tabs. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/aReNUKdMgKxLqmq7%40nathan	2025-11-18 10:28:36 -06:00
Álvaro Herrera	c05dee1911	Log a note at program start when running in dry-run mode Users might get some peace of mind knowing their data is not being destroyed or whatever. Author: Peter Smith <smithpb2250@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/CAHut+PsvQJQnQO0KT0S2oegenkvJ8FUuY-QS5syyqmT24R2xFQ@mail.gmail.com	2025-11-18 16:13:29 +01:00
Alexander Korotkov	75e82b2f5a	Optimize shared memory usage for WaitLSNProcInfo We need separate pairing heaps for different WaitLSNType's, because there might be waiters for different LSN's at the same time. However, one process can wait only for one type of LSN at a time. So, no need for inHeap and heapNode fields to be arrays. Discussion: https://postgr.es/m/CAPpHfdsBR-7sDtXFJ1qpJtKiohfGoj%3DvqzKVjWxtWsWidx7G_A%40mail.gmail.com Author: Alexander Korotkov <aekorotkov@gmail.com> Reviewed-by: Xuneng Zhou <xunengzhou@gmail.com>	2025-11-18 09:50:12 +02:00
Michael Paquier	694b4ab33b	pg_buffercache: Fix incorrect result cast for relforknumber pg_buffercache_pages.relforknumber is defined as an int2, but its value was stored with ObjectIdGetDatum() rather than Int16GetDatum() in the result record. Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Discussion: https://postgr.es/m/CAExHW5s2_qwSdhKpVnUzjRMf0cf1PvmhUHQDLaFM3QzKbP1OyQ@mail.gmail.com	2025-11-18 15:46:43 +09:00
Michael Paquier	fce13424b9	doc: Fix style of description for pg_buffercache_numa.os_page_num Extracted from a larger patch by the same author. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://postgr.es/m/Z/fFA2heH6lpSLlt@ip-10-97-1-34.eu-west-3.compute.internal	2025-11-18 14:17:54 +09:00
Amit Kapila	3edaf29fa5	Rename two columns in pg_stat_subscription_stats. This patch renames the sync_error_count column to sync_table_error_count in the pg_stat_subscription_stats view. The new name makes the purpose explicit now that a separate column exists to track sequence synchronization errors. Additionally, the column seq_sync_error_count is renamed to sync_seq_error_count to maintain a consistent naming pattern, making it easier for users to group, and query synchronization related counters. Author: Vignesh C <vignesh21@gmail.com> Reviewed-by: Peter Smith <smithpb2250@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CALDaNm3WwJmz=-4ybTkhniB-Nf3qmFG9Zx1uKjyLLoPF5NYYXA@mail.gmail.com	2025-11-18 03:58:55 +00:00
Amit Kapila	c677f2b09f	Doc: Use <structfield> markup for sequence fields. Following commit `980a855c5c`, update documentation to use <structfield> for sequence columns. Previously, these were incorrectly marked up as <literal>. Author: Peter Smith <smithpb2250@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CAHut+PtpDMUE3Kd1p=1ff9pw2HMbgQCpowE_0Hd6gs5v2pKfQg@mail.gmail.com	2025-11-18 03:48:00 +00:00
Bruce Momjian	792353f7d5	doc: clarify that pg_upgrade preserves "optimizer" stats. Reported-by: Rambabu V Author: Robert Treat Discussion: https://postgr.es/m/CADtiZxrUzRRX6edyN2y-7U5HA8KSXttee7K=EFTLXjwG1SCE4A@mail.gmail.com Backpatch-through: 18	2025-11-17 18:55:41 -05:00
Masahiko Sawada	a6eac2273e	Use streaming read I/O in BRIN vacuum scan. This commit implements streaming read I/O for BRIN vacuum scans. Although BRIN indexes tend to be relatively small by design, performance tests have shown performance improvements. Author: Arseniy Mukhin <arseniy.mukhin.dev@gmail.com> Discussion: https://postgr.es/m/CAE7r3ML01aiq9Th_1OSz7U7Aq2pWbhMLoz5T%2BPXcg8J9ZAPFFA%40mail.gmail.com	2025-11-17 13:22:20 -08:00
Tom Lane	6d969ca687	Fix pg_crc32c_armv8_choose.c to build with ancient glibc releases. If you go back as far as the RHEL7 era, <sys/auxv.h> does not provide the HWCAPxxx macros needed with elf_aux_info or getauxval, so you need to get those from the kernel header <asm/hwcap.h> instead. We knew that for the 32-bit case but failed to extrapolate to the 64-bit case. Oversight in commit `aac831caf`. Reported-by: GaoZengqi <pgf00a@gmail.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAFmBtr3Av62-jBzdhFkDHXJF9vQmNtSnH2upwODjnRcsgdTytw@mail.gmail.com Backpatch-through: 18	2025-11-17 15:24:34 -05:00
Tom Lane	ed931377ab	Clean up match_orclause_to_indexcol(). Remove bogus stripping of RelabelTypes: that can result in building an output SAOP tree with incorrect exposed exprType for the operands, which might confuse polymorphic operators. Moreover it demonstrably prevents folding some OR-trees to SAOPs when the RHS expressions have different base types that were coerced to the same type by RelabelTypes. Reduce prohibition on type_is_rowtype to just disallow type RECORD. We need that because otherwise we would happily fold multiple RECORD Consts into a RECORDARRAY Const even if they aren't the same record type. (We could allow that perhaps, if we checked that they all have the same typmod, but the case doesn't seem worth that much effort.) However, there is no reason at all to disallow the transformation for named composite types, nor domains over them: as long as we can find a suitable array type we're good. Remove some assertions that seem rather out of place (it's not this code's duty to verify that the RestrictInfo structure is sane). Rewrite some comments. The issues with RelabelType stripping seem severe enough to back-patch this into v18 where the code was introduced. Author: Tender Wang <tndrwang@gmail.com> Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAHewXN=aH7GQBk4fXU-WaEeVmQWUmBAeNyBfJ3VKzPphyPKUkQ@mail.gmail.com Backpatch-through: 18	2025-11-17 13:54:52 -05:00
Fujii Masao	6793d6a839	doc: Document default values for some pg_recvlogical options. The documentation did not previously mention the default values for the --fsync-interval and --plugin options, even though pg_recvlogical --help shows them. This omission made it harder for users to understand the tool's behavior from the documentation alone. This commit adds the missing default value descriptions for both options to the pg_recvlogical documentation. Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at> Discussion: https://postgr.es/m/CAHGQGwFqssPBjkWMFofGq32e_tANOeWN-cM=6biAP3nnFUXMRw@mail.gmail.com	2025-11-17 23:24:39 +09:00
Daniel Gustafsson	ab805989b2	Fix typos in logical replication code comments Author: Chao Li <lic@highgo.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://postgr.es/m/CAEoWx2kt8m7wV39_zOBds5SNXx9EAkDqb5cPshk7Bxw6Js4Zpg@mail.gmail.com	2025-11-17 13:37:25 +01:00
Daniel Gustafsson	721bf9ce18	Mention md5 deprecation in postgresql.conf.sample PostgreSQL 18 deprecated password_encryption='md5', but the comments for this GUC in the sample configuration file did not mention the deprecation. Update comments with a notice to make as many users as possible aware of it. Also add a comment to the related md5_password_warnings GUC while there. Author: Michael Banck <mbanck@gmx.net> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Reviewed-by: Robert Treat <rob@xzilla.net> Backpatch-through: 18	2025-11-17 12:18:18 +01:00
Michael Paquier	e76defbcf0	Rework output format of pg_dependencies The existing format of pg_dependencies uses a single-object JSON structure, with each key value embedding all the knowledge about the set attributes tracked, like: {"1 => 5": 1.000000, "5 => 1": 0.423130} While this is a very compact format, it is confusing to read and it is difficult to manipulate the values within the object, particularly when tracking multiple attributes. The new output format introduced in this commit is a JSON array of objects, with: - A key named "degree", with a float value. - A key named "attributes", with an array of attribute numbers. - A key named "dependency", with an attribute number. The values use the same underlying type as previously when printed, with a new output format that shows now as follows: [{"degree": 1.000000, "attributes": [1], "dependency": 5}, {"degree": 0.423130, "attributes": [5], "dependency": 1}] This new format will become handy for a follow-up set of changes, so as it becomes possible to inject extended statistics rather than require an ANALYZE, like in a dump/restore sequence or after pg_upgrade on a new cluster. This format has been suggested by Tomas Vondra. The key names are defined in the header introduced by `1f927cce44`, to ease the integration of frontend-specific changes that are still under discussion. (Again a personal note: if anybody comes up with better name for the keys, of course feel free.) The bulk of the changes come from the regression tests, where jsonb_pretty() is now used to make the outputs generated easier to parse. Author: Corey Huinker <corey.huinker@gmail.com> Reviewed-by: Jian He <jian.universality@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CADkLM=dpz3KFnqP-dgJ-zvRvtjsa8UZv8wDAQdqho=qN3kX0Zg@mail.gmail.com	2025-11-17 10:44:26 +09:00
Michael Paquier	1f927cce44	Rework output format of pg_ndistinct The existing format of pg_ndistinct uses a single-object JSON structure where each key is itself a comma-separated list of attnums, like: {"3, 4": 11, "3, 6": 11, "4, 6": 11, "3, 4, 6": 11} While this is a very compact format, it is confusing to read and it is difficult to manipulate the values within the object. The new output format introduced in this commit is an array of objects, with: - A key named "attributes", that contains an array of attribute numbers. - A key named "ndistinct", represented as an integer. The values use the same underlying type as previously when printed, with a new output format that shows now as follows: [{"ndistinct": 11, "attributes": [3,4]}, {"ndistinct": 11, "attributes": [3,6]}, {"ndistinct": 11, "attributes": [4,6]}, {"ndistinct": 11, "attributes": [3,4,6]}] This new format will become handy for a follow-up set of changes, so as it becomes possible to inject extended statistics rather than require an ANALYZE, like in a dump/restore sequence or after pg_upgrade on a new cluster. This format has been suggested by Tomas Vondra. The key names are defined in a new header, to ease with the integration of frontend-specific changes that are still under discussion. (Personal note: I am not specifically wedded to these key names, but if there are better name suggestions for this release, feel free.) The bulk of the changes come from the regression tests, where jsonb_pretty() is now used to make the outputs generated easier to parse. Author: Corey Huinker <corey.huinker@gmail.com> Reviewed-by: Jian He <jian.universality@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CADkLM=dpz3KFnqP-dgJ-zvRvtjsa8UZv8wDAQdqho=qN3kX0Zg@mail.gmail.com	2025-11-17 09:52:20 +09:00
Thomas Munro	32b236644d	Define PS_USE_CLOBBER_ARGV on GNU/Hurd. Until `d2ea2d310d`, the PS_USE_PS_STRINGS option was used on the GNU/Hurd. As this option got removed and PS_USE_CLOBBER_ARGV appears to work fine nowadays on the Hurd, define this one to re-enable process title changes on this platform. In the 14 and 15 branches, the existing test for __hurd__ (added 25 years ago by commit `209aa77d`, removed in 16 by the above commit) is left unchanged for now as it was activating slightly different code paths and would need investigation by a Hurd user. Author: Michael Banck <mbanck@debian.org> Discussion: https://postgr.es/m/CA%2BhUKGJMNGUAqf27WbckYFrM-Mavy0RKJvocfJU%3DJ2XcAZyv%2Bw%40mail.gmail.com Backpatch-through: 16	2025-11-17 12:48:55 +13:00
David Rowley	586d63214e	Adjust MemSet macro to use size_t rather than long Likewise for MemSetAligned. "long" wasn't the most suitable type for these macros as with MSVC in 64-bit builds, sizeof(long) == 4, which is narrower than the processor's word size, therefore these macros had to perform twice as many loops as they otherwise might. Author: David Rowley <dgrowleyml@gmail.com> Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/CAApHDvoGFjSA3aNyVQ3ivbyc4ST=CC5L-_VjEUQ92HbE2Cxovg@mail.gmail.com	2025-11-17 12:27:00 +13:00
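The loop-count argument in the commit above can be illustrated with a small C sketch. This is not the MemSet macro itself: `zero_words` and `word_stores` are hypothetical helpers, and the sketch assumes the buffer is suitably aligned and its length is a multiple of `sizeof(size_t)`.

```c
#include <stddef.h>

/* Number of word-sized stores needed to cover len bytes. */
size_t
word_stores(size_t len, size_t word_size)
{
    return len / word_size;
}

/*
 * Zero an aligned buffer one size_t at a time and return the number of
 * loop iterations.  Assumes len is a multiple of sizeof(size_t).
 */
size_t
zero_words(void *start, size_t len)
{
    size_t *p = (size_t *) start;
    size_t *stop = (size_t *) ((char *) start + len);
    size_t iterations = 0;

    while (p < stop)
    {
        *p++ = 0;
        iterations++;
    }
    return iterations;
}
```

For a 64-byte buffer, a 4-byte word (as `long` is with MSVC in 64-bit builds) needs 16 stores while an 8-byte word needs 8, which is the factor of two the commit message refers to.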
David Rowley	9c047da51f	Get rid of long datatype in CATCACHE_STATS enabled builds "long" is 32 bits on Windows 64-bit. Switch to a datatype that's 64-bit on all platforms. While we're there, use an unsigned type as these fields count things that have occurred, of which it's not possible to have negative numbers of. Author: David Rowley <dgrowleyml@gmail.com> Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/CAApHDvoGFjSA3aNyVQ3ivbyc4ST=CC5L-_VjEUQ92HbE2Cxovg@mail.gmail.com	2025-11-17 12:26:41 +13:00
Michael Paquier	e7cde9dad2	Add test for temporary file removal and WITH HOLD cursor This new test, added in 009_log_temp_files, checks that the temporary files created by a WITH HOLD cursor are dropped at the end of the transaction where the transaction has been created. The portal's executor is shutdown in PersistHoldablePortal(), after for example some forced detoast, so as the cursor data can be accessed without requiring a snapshot. Author: Mircea Cadariu <cadariu.mircea@gmail.com> Discussion: https://postgr.es/m/0a666d28-9080-4239-90d6-f6345bb43468@gmail.com	2025-11-17 08:01:04 +09:00
Dean Rasheed	1b92fe7bb9	Fix Assert failure in EXPLAIN ANALYZE MERGE with a concurrent update. When instrumenting a MERGE command containing both WHEN NOT MATCHED BY SOURCE and WHEN NOT MATCHED BY TARGET actions using EXPLAIN ANALYZE, a concurrent update of the target relation could lead to an Assert failure in show_modifytable_info(). In a non-assert build, this would lead to an incorrect value for "skipped" tuples in the EXPLAIN output, rather than a crash. This could happen if the concurrent update caused a matched row to no longer match, in which case ExecMerge() treats the single originally matched row as a pair of not matched rows, and potentially executes 2 not-matched actions for the single source row. This could then lead to a state where the number of rows processed by the ModifyTable node exceeds the number of rows produced by its source node, causing "skipped_path" in show_modifytable_info() to be negative, triggering the Assert. Fix this in ExecMergeMatched() by incrementing the instrumentation tuple count on the source node whenever a concurrent update of this kind is detected, if both kinds of merge actions exist, so that the number of source rows matches the number of actions potentially executed, and the "skipped" tuple count is correct. Back-patch to v17, where support for WHEN NOT MATCHED BY SOURCE actions was introduced. Bug: #19111 Reported-by: Dilip Kumar <dilipbalaut@gmail.com> Author: Dean Rasheed <dean.a.rasheed@gmail.com> Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com> Discussion: https://postgr.es/m/19111-5b06624513d301b3@postgresql.org Backpatch-through: 17	2025-11-16 22:14:06 +00:00
David Rowley	2b54a1abdb	Doc: include MERGE in variable substitution command list Backpatch to 15, where MERGE was introduced. Reported-by: <emorgunov@mail.ru> Author: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/176278494385.770.15550176063450771532@wrigleys.postgresql.org Backpatch-through: 15	2025-11-17 10:51:26 +13:00
Alexander Korotkov	23792d7381	Fix incorrect function name in comments Update comments to reference WaitForLSN() instead of the outdated WaitForLSNReplay() function name. Discussion: https://postgr.es/m/CABPTF7UieOYbOgH3EnQCasaqcT1T4N6V2wammwrWCohQTnD_Lw%40mail.gmail.com Author: Xuneng Zhou <xunengzhou@gmail.com> Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>	2025-11-15 12:27:42 +02:00
Alexander Korotkov	ede6acef49	Fix WaitLSNWakeup() fast-path check for InvalidXLogRecPtr WaitLSNWakeup() incorrectly returned early when called with InvalidXLogRecPtr (meaning "wake all waiters"), because the fast-path check compared minWaitedLSN > 0 without validating currentLSN first. This caused WAIT FOR LSN commands to wait indefinitely during standby promotion until random signals woke them. Add an XLogRecPtrIsValid() check before the comparison so that InvalidXLogRecPtr bypasses the fast-path and wakes all waiters immediately. Discussion: https://postgr.es/m/CABPTF7UieOYbOgH3EnQCasaqcT1T4N6V2wammwrWCohQTnD_Lw%40mail.gmail.com Author: Xuneng Zhou <xunengzhou@gmail.com> Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>	2025-11-15 12:27:42 +02:00
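A minimal sketch of the fast-path problem described in the commit above, assuming the check has roughly the shape "skip the wakeup if the smallest waited-for LSN is still ahead of the current LSN". The `XLogRecPtr` typedef is a stand-in for the PostgreSQL one, and both function names are hypothetical; the real WaitLSNWakeup() works on shared state that is not modeled here.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t XLogRecPtr;    /* stand-in for the PostgreSQL typedef */
#define InvalidXLogRecPtr ((XLogRecPtr) 0)

/*
 * Pre-fix shape: with currentLSN == InvalidXLogRecPtr (0) the comparison
 * degenerates to "minWaitedLSN > 0", so the wakeup is skipped even though
 * the caller asked to wake every waiter.
 */
bool
skip_wakeup_before_fix(XLogRecPtr minWaitedLSN, XLogRecPtr currentLSN)
{
    return minWaitedLSN > currentLSN;
}

/*
 * Post-fix shape: an invalid currentLSN means "wake all waiters", so it
 * must never take the fast path.
 */
bool
skip_wakeup_after_fix(XLogRecPtr minWaitedLSN, XLogRecPtr currentLSN)
{
    return currentLSN != InvalidXLogRecPtr && minWaitedLSN > currentLSN;
}
```

With a waiter at LSN 100 and a "wake all" request, the first function returns true (wakeup skipped) and the second returns false (waiters are woken).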
Daniel Gustafsson	446568c222	Add test for postgresql.conf.sample line syntax All GUCs in postgresql.conf.sample should be set to the default value and be commented out. This syntax was however not tested for, making omissions easy to miss. Add a test which checks all lines for syntax. Author: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Andrew Dunstan <andrew@dunslane.net> Discussion: https://postgr.es/m/19727040-3EE4-4719-AF4F-2548544113D7@yesql.se	2025-11-14 23:44:56 +01:00
Nathan Bossart	478c4814a0	Comment out autovacuum_worker_slots in postgresql.conf.sample. All settings in this file should be commented out. In addition to fixing that, also fix the indentation for this line. Oversight in commit `c758119e5b`. Reported-by: Daniel Gustafsson <daniel@yesql.se> Author: Daniel Gustafsson <daniel@yesql.se> Discussion: https://postgr.es/m/19727040-3EE4-4719-AF4F-2548544113D7%40yesql.se Backpatch-through: 18	2025-11-14 13:45:04 -06:00
Nathan Bossart	7506bdbbf4	Add note about CreateStatistics()'s selective use of check_rights. Commit `5e4fcbe531` added a check_rights parameter to this function for use by ALTER TABLE commands that re-create statistics objects. However, we intentionally ignore check_rights when verifying relation ownership because this function's lookup could return a different answer than the caller's. This commit adds a note to this effect so that we remember it down the road. Reviewed-by: Noah Misch <noah@leadboat.com> Backpatch-through: 14	2025-11-14 13:20:09 -06:00
Bruce Momjian	4c00960772	doc: clarify that logical slots track transaction activity Previously it only mentioned WAL retention. Discussion: https://postgr.es/m/pexmenhqptw5h4ma4qasz3cvjtynivxprqifgghdjtmkxdig2g@djg7bk2p6pts Backpatch-through: master	2025-11-14 10:45:59 -05:00
Bruce Momjian	43e6929bb2	doc: double-quote use of %f, %p, and %r in literal commands. Path expansion might expose characters like spaces which would cause command failure, so double-quote the examples. While %f doesn't need quoting since it uses a fixed character set, it is best to be consistent. Discussion: https://postgr.es/m/aROPCQCfvKp9Htk4@momjian.us Backpatch-through: master	2025-11-14 09:08:53 -05:00
Bruce Momjian	a554389fb5	doc: remove verbiage about "receiving" data from rep. slots The slots are just LSN markers, not something to receive from. Backpatch-through: master	2025-11-14 08:56:04 -05:00
Fujii Masao	4aa0ac0576	pgbench: Fix assertion failure with multiple \syncpipeline in pipeline mode. Previously, when pgbench ran a custom script that triggered retriable errors (e.g., deadlocks) followed by multiple \syncpipeline commands in pipeline mode, the following assertion failure could occur: Assertion failed: (res == ((void*)0)), function discardUntilSync, file pgbench.c, line 3594. The issue was that discardUntilSync() assumed a pipeline sync result (PGRES_PIPELINE_SYNC) would always be followed by either another sync result or NULL. This assumption was incorrect: when multiple sync requests were sent, a sync result could instead be followed by another result type. In such cases, discardUntilSync() mishandled the results, leading to the assertion failure. This commit fixes the issue by making discardUntilSync() correctly handle cases where a pipeline sync result is followed by other result types. It now continues discarding results until another pipeline sync followed by NULL is reached. Backpatched to v17, where support for \syncpipeline command in pgbench was introduced. Author: Yugo Nagata <nagata@sraoss.co.jp> Reviewed-by: Chao Li <lic@highgo.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/20251111105037.f3fc554616bc19891f926c5b@sraoss.co.jp Backpatch-through: 17	2025-11-14 22:40:39 +09:00
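The termination rule described in the commit above ("continue discarding results until another pipeline sync followed by NULL is reached") can be modeled without libpq. The sketch below replaces the result stream with an array of a hypothetical `ResultKind` enum, where `RES_NULL` stands for a NULL result; `discard_until_sync` here is an illustration of the rule, not the pgbench function.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for what the result stream can yield. */
typedef enum
{
    RES_NULL,                   /* no more results for now */
    RES_SYNC,                   /* pipeline sync result */
    RES_OTHER                   /* any other result type */
} ResultKind;

/*
 * Discard results until a sync result is immediately followed by NULL.
 * Returns the number of results consumed, or -1 if the stream ends first.
 */
int
discard_until_sync(const ResultKind *results, int n)
{
    bool saw_sync = false;

    for (int i = 0; i < n; i++)
    {
        if (results[i] == RES_SYNC)
            saw_sync = true;
        else if (results[i] == RES_NULL)
        {
            if (saw_sync)
                return i + 1;   /* sync followed by NULL: done */
        }
        else
            saw_sync = false;   /* another result followed the sync */
    }
    return -1;
}
```

The key point is the last branch: a sync result followed by some other result type resets the state instead of being treated as an impossible case.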
Álvaro Herrera	e4018f891d	Doc: add IDs to copy.sgml's <varlistentry> and <refsect1> In the spirit of commit `78ee60ed84`. Author: jian he <jian.universality@gmail.com> Discussion: https://postgr.es/m/CACJufxFsPXCwSVR+_vScZ3bysh4-dpE19iVyeta30uNHwnwnSw@mail.gmail.com	2025-11-14 11:45:13 +01:00
Michael Paquier	910690415b	Revert "Drop unnamed portal immediately after execution to completion" This reverts commit `1fd981f053`, based on concerns that the logging improvements do not justify the protocol breakage of dropping an unnamed portal once its execution has completed. It seems unlikely that one would try to send an execute or describe message after the portal has been used, but if they do such post-completion messages would not be able to process as the previous versions. Let's revert this change for now so as we keep compatibility and consider a different solution. The tests added by `76bba03312` track the pre-1fd981f05369 behavior, and are still valid. Discussion: https://postgr.es/m/CA+TgmoYFJyJNQw3RT7veO3M2BWRE9Aw4hprC5rOcawHZti-f8g@mail.gmail.com	2025-11-14 14:37:10 +09:00
Bruce Momjian	8fa6b9030d	doc: adjust "Replication Slot" to mention physical & logical Much of the "Replication Slot" chapter applies to physical and logical slots, but it was sloppy in mentioning mostly physical slots. This patch clarified which parts of the text apply to which slot types. This chapter is referenced from the logical slot/subscriber chapter, so it needs to do double duty. Backpatch-through: master	2025-11-13 21:53:13 -05:00
Bruce Momjian	ada78cd7f8	doc: clarify "logical" replication slots Also mention that logical replication slots are created by default when subscriptions are created. This should clarify the text. Backpatch-through: master	2025-11-13 21:35:28 -05:00
Bruce Momjian	a5b69e3073	doc: clarify "physical" replication slot creation on the primary Previously it was not clear that "physical" replication slots were being discussed, and that they needed to be created on the primary and not the standby. Backpatch-through: master	2025-11-13 20:44:00 -05:00
Bruce Momjian	acbc9beaae	doc: reorder logical replication benefits in a logical order The previous ordering was hard to understand and remember. Also adjust wording to be more consistent with surrounding items. Backpatch-through: master	2025-11-13 18:12:37 -05:00
Daniel Gustafsson	d22cc7326c	Document that pg_getaddrinfo_all does not accept null hints While the underlying getaddrinfo call accepts a null pointer for the hintp parameter, pg_getaddrinfo_all does not. Document this difference with a comment to make it clear. Author: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reported-by: Sergey Tatarintsev <s.tatarintsev@postgrespro.ru> Discussion: https://postgr.es/m/1e5efc94-407e-40b8-8b10-4d25f823c6d7@postgrespro.ru	2025-11-13 16:35:07 +01:00
Dean Rasheed	7dc4fa9141	doc: Improve description of RLS policies applied by command type. On the CREATE POLICY page, the "Policies Applied by Command Type" table was missing MERGE ... THEN DELETE and some of the policies applied during INSERT ... ON CONFLICT and MERGE. Fix that, and try to improve readability by listing the various MERGE cases separately, rather than together with INSERT/UPDATE/DELETE. Mention COPY ... TO along with SELECT, since it behaves in the same way. In addition, document which policy violations cause errors to be thrown, and which just cause rows to be silently ignored. Also, a paragraph above the table states that INSERT ... ON CONFLICT DO UPDATE only checks the WITH CHECK expressions of INSERT policies for rows appended to the relation by the INSERT path, which is incorrect -- all rows proposed for insertion are checked, regardless of whether they end up being inserted. Fix that, and also mention that the same applies to INSERT ... ON CONFLICT DO NOTHING. In addition, in various other places on that page, clarify how the different types of policy are applied to different commands, and whether or not errors are thrown when policy checks do not pass. Backpatch to all supported versions. Prior to v17, MERGE did not support RETURNING, and so MERGE ... THEN INSERT would never check new rows against SELECT policies. Prior to v15, MERGE was not supported at all. Author: Dean Rasheed <dean.a.rasheed@gmail.com> Reviewed-by: Viktor Holmberg <v@viktorh.net> Reviewed-by: Jian He <jian.universality@gmail.com> Discussion: https://postgr.es/m/CAEZATCWqnfeChjK=n1V_dYZT4rt4mnq+ybf9c0qXDYTVMsy8pg@mail.gmail.com Backpatch-through: 14	2025-11-13 12:00:56 +00:00
Thomas Munro	017249b828	Add some missing #include <limits.h>. These files relied on transitive inclusion via port/atomics.h for constants CHAR_BIT and INT_MAX. Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Discussion: https://postgr.es/m/536409d2-c9df-4ef3-808d-1ffc3182868c@iki.fi	2025-11-13 22:56:08 +13:00
Michael Paquier	e6c9186e68	Add commit `c2b0e3a035` to .git-blame-ignore-revs.	2025-11-13 14:27:24 +09:00
Michael Paquier	c2b0e3a035	Fix indentation issue Issue introduced by `84fb27511d`. I have missed this diff while adding pgoff_t to the typedef list of pgindent, while addressing a separate indentation issue. Per buildfarm member koel.	2025-11-13 14:25:21 +09:00
Michael Paquier	84fb27511d	Replace off_t by pgoff_t in I/O routines PostgreSQL's Windows port has never been able to handle files larger than 2GB due to the use of off_t for file offsets, only 32-bit on Windows. This causes signed integer overflow at exactly 2^31 bytes when trying to handle files larger than 2GB, for the routines touched by this commit. Note that large files are forbidden by ./configure (`3c6248a828`) and meson (recent change, see `79cd66f28c`). This restriction also exists in v16 and older versions for the now-dead MSVC scripts. The code base already defines pgoff_t as __int64 (64-bit) on Windows for this purpose, and some function declarations in headers use it, but many internals still rely on off_t. This commit switches more routines to use pgoff_t, offering more portability, for areas mainly related to file extensions and storage. These are not critical for WAL segments yet, which have currently a maximum size allowed of 1GB (well, this opens the door at allowing a larger size for them). This matters more for segment files if we want to lift the large file restriction in ./configure and meson in the future, which would make sense to remove once/if all traces of off_t are gone from the tree. This can additionally matter for out-of-core code that may want files larger than 2GB in places where off_t is four bytes in size. Note that off_t is still used in other parts of the tree like buffile.c, WAL sender/receiver, base backup, pg_combinebackup, etc. These other code paths can be addressed separately, and their update will be required if we want to remove the large file restriction in the future. This commit is a good first cut in itself towards more portability, hopefully. On Unix-like systems, pgoff_t is defined as off_t, so this change only affects Windows behavior. Author: Bryan Green <dbryan.green@gmail.com> Reviewed-by: Thomas Munro <thomas.munro@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/0f238ff4-c442-42f5-adb8-01b762c94ca1@gmail.com	2025-11-13 12:41:40 +09:00
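The 2^31 boundary mentioned in the off_t commit above can be checked with a short C sketch. It uses fixed-width types in place of `off_t` and `pgoff_t`, so it behaves the same on every platform; `fits_in_32bit_offset` and `block_offset` are hypothetical helper names, and the 8192-byte block size in the usage note is only an example value.

```c
#include <stdbool.h>
#include <stdint.h>

/* Can this byte offset be represented by a 32-bit signed offset type? */
bool
fits_in_32bit_offset(int64_t offset)
{
    return offset >= 0 && offset <= INT32_MAX;
}

/*
 * Byte offset of a block, computed in 64-bit arithmetic so that the
 * multiplication cannot overflow for any 32-bit inputs.
 */
int64_t
block_offset(uint32_t blockno, uint32_t blocksize)
{
    return (int64_t) blockno * (int64_t) blocksize;
}
```

With 8192-byte blocks, block 262144 starts at byte 2^31 = 2147483648, which is one past `INT32_MAX`: the last representable offset is 2147483647, so the first byte beyond 2GB already needs a 64-bit type.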
Fujii Masao	705601c5ae	Fix incorrect assignment of InvalidXLogRecPtr to a non-LSN variable. pg_logical_slot_get_changes_guts() previously assigned InvalidXLogRecPtr to the local variable upto_nchanges, which is of type int32, not XLogRecPtr. While this caused no functional issue since InvalidXLogRecPtr is defined as 0, it was semantically incorrect. This commit fixes the issue by updating pg_logical_slot_get_changes_guts() to set upto_nchanges to 0 instead of InvalidXLogRecPtr. No backpatch is needed, as the previous behavior was harmless. Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Steven Niu <niushiji@gmail.com> Reviewed-by: Xuneng Zhou <xunengzhou@gmail.com> Discussion: https://postgr.es/m/CAHGQGwHKHuR5NGnGxU3+ebz7cbC1ZAR=AgG4Bueq==Lj6iX8Sw@mail.gmail.com	2025-11-13 08:44:33 +09:00
Nathan Bossart	180e7abe68	Remove obsolete autovacuum comment. This comment seems to refer to some stuff that was removed during development in 2005. Reviewed-by: David Rowley <dgrowleyml@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/aRJFDxKJLFE_1Iai%40nathan	2025-11-12 15:13:08 -06:00
Nathan Bossart	c5c74282f2	test_dsa: Avoid leaking LWLock tranches. Since this is a test module, leaking a couple of LWLock tranches is fine, but we want to discourage that pattern in third-party code. This commit teaches the module to create only one tranche and to store its ID in shared memory for use by other backends. Reported-by: Alexander Lakhin <exclusion@gmail.com> Reviewed-by: Sami Imseih <samimseih@gmail.com> Discussion: https://postgr.es/m/dd36d384-55df-4fc2-825c-5bc56c950fa9%40gmail.com	2025-11-12 14:57:48 -06:00
Nathan Bossart	1165a933aa	Teach DSM registry to ERROR if attaching to an uninitialized entry. If DSM entry initialization fails, backends could try to use an uninitialized DSM segment, DSA, or dshash table (since the entry is still added to the registry). To fix, keep track of whether initialization completed, and ERROR if a backend tries to attach to an uninitialized entry. We could instead retry initialization as needed, but that seemed complicated, error prone, and unlikely to help most cases. Furthermore, such problems probably indicate a coding error. Reported-by: Alexander Lakhin <exclusion@gmail.com> Reviewed-by: Sami Imseih <samimseih@gmail.com> Discussion: https://postgr.es/m/dd36d384-55df-4fc2-825c-5bc56c950fa9%40gmail.com Backpatch-through: 17	2025-11-12 14:30:11 -06:00
Heikki Linnakangas	0bdc777e80	Clear 'xid' in dummy async notify entries written to fill up pages Before we started to freeze async notify entries (commit `8eeb4a0f7c`), no one looked at the 'xid' on an entry with invalid 'dboid'. But now we might actually need to freeze it later. Initialize them with InvalidTransactionId to begin with, to avoid that work later. Álvaro pointed this out in review of commit `8eeb4a0f7c`, but I forgot to include this change there. Author: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://www.postgresql.org/message-id/202511071410.52ll56eyixx7@alvherre.pgsql Backpatch-through: 14	2025-11-12 21:19:03 +02:00
Heikki Linnakangas	797e9ea6e5	Fix remaining race condition with CLOG truncation and LISTEN/NOTIFY Previous commit fixed a bug where VACUUM would truncate the CLOG that's still needed to check the commit status of XIDs in the async notify queue, but as mentioned in the commit message, it wasn't a full fix. If a backend is executing asyncQueueReadAllNotifications() and has just made a local copy of an async SLRU page which contains old XIDs, vacuum can concurrently truncate the CLOG covering those XIDs, and the backend still gets an error when it calls TransactionIdDidCommit() on those XIDs in the local copy. This commit fixes that race condition. To fix, hold the SLRU bank lock across the TransactionIdDidCommit() calls in NOTIFY processing. Per Tom Lane's idea. Backpatch to all supported versions. Reviewed-by: Joel Jacobson <joel@compiler.org> Reviewed-by: Arseniy Mukhin <arseniy.mukhin.dev@gmail.com> Discussion: https://www.postgresql.org/message-id/2759499.1761756503@sss.pgh.pa.us Backpatch-through: 14	2025-11-12 20:59:44 +02:00
Heikki Linnakangas	8eeb4a0f7c	Fix bug where we truncated CLOG that was still needed by LISTEN/NOTIFY The async notification queue contains the XID of the sender, and when processing notifications we call TransactionIdDidCommit() on the XID. But we had no safeguards to prevent the CLOG segments containing those XIDs from being truncated away. As a result, if a backend didn't for some reason process its notifications for a long time, or when a new backend issued LISTEN, you could get an error like: test=# listen c21; ERROR: 58P01: could not access status of transaction 14279685 DETAIL: Could not open file "pg_xact/000D": No such file or directory. LOCATION: SlruReportIOError, slru.c:1087 To fix, make VACUUM "freeze" the XIDs in the async notification queue before truncating the CLOG. Old XIDs are replaced with FrozenTransactionId or InvalidTransactionId. Note: This commit is not a full fix. A race condition remains, where a backend is executing asyncQueueReadAllNotifications() and has just made a local copy of an async SLRU page which contains old XIDs, while vacuum concurrently truncates the CLOG covering those XIDs. When the backend then calls TransactionIdDidCommit() on those XIDs from the local copy, you still get the error. The next commit will fix that remaining race condition. This was first reported by Sergey Zhuravlev in 2021, with many other people hitting the same issue later. Thanks to: - Alexandra Wang, Daniil Davydov, Andrei Varashen and Jacques Combrink for investigating and providing reproducible test cases, - Matheus Alcantara and Arseniy Mukhin for review and earlier proposed patches to fix this, - Álvaro Herrera and Masahiko Sawada for reviews, - Yura Sokolov aka funny-falcon for the idea of marking transactions as committed in the notification queue, and - Joel Jacobson for the final patch version. I hope I didn't forget anyone. Backpatch to all supported versions. 
I believe the bug goes back all the way to commit `d1e027221d`, which introduced the SLRU-based async notification queue. Discussion: https://www.postgresql.org/message-id/16961-25f29f95b3604a8a@postgresql.org Discussion: https://www.postgresql.org/message-id/18804-bccbbde5e77a68c2@postgresql.org Discussion: https://www.postgresql.org/message-id/CAK98qZ3wZLE-RZJN_Y%2BTFjiTRPPFPBwNBpBi5K5CU8hUHkzDpw@mail.gmail.com Backpatch-through: 14	2025-11-12 20:59:36 +02:00
Heikki Linnakangas	1b4699090e	Escalate ERRORs during async notify processing to FATAL Previously, if async notify processing encountered an error, we would report the error to the client and advance our read position past the offending entry to prevent trying to process it over and over again. Trying to continue after an error has a few problems however: - We have no way of telling the client that a notification was lost. They get an ERROR, but that doesn't tell you much. As such, it's not clear if keeping the connection alive after losing a notification is a good thing. Depending on the application logic, missing a notification could cause the application to get stuck waiting, for example. - If the connection is idle, PqCommReadingMsg is set and any ERROR is turned into FATAL anyway. - We bailed out of the notification processing loop on first error without processing any subsequent notifications. The subsequent notifications would not be processed until another notify interrupt arrives. For example, if there were two notifications pending, and processing the first one caused an ERROR, the second notification would not be processed until someone sent a new NOTIFY. This commit changes the behavior so that any ERROR while processing async notifications is turned into FATAL, causing the client connection to be terminated. That makes the behavior more consistent as that's what happened in idle state already, and terminating the connection is a clear signal to the application that it might've missed some notifications. The reason to do this now is that the next commits will change the notification processing code in a way that would make it harder to skip over just the offending notification entry on error. 
Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Arseniy Mukhin <arseniy.mukhin.dev@gmail.com> Discussion: https://www.postgresql.org/message-id/fedbd908-4571-4bbe-b48e-63bfdcc38f64@iki.fi Backpatch-through: 14	2025-11-12 20:59:28 +02:00
Daniel Gustafsson	d36acd6f5c	doc: Document effects of ownership change on privileges Explicitly document that privileges are transferred along with the ownership. Backpatch to all supported versions since this behavior has always been present. Author: Laurenz Albe <laurenz.albe@cybertec.at> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: David G. Johnston <david.g.johnston@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Josef Šimánek <josef.simanek@gmail.com> Reported-by: Gilles Parc <gparc@free.fr> Discussion: https://postgr.es/m/2023185982.281851219.1646733038464.JavaMail.root@zimbra15-e2.priv.proxad.net Backpatch-through: 14	2025-11-12 17:04:35 +01:00
Álvaro Herrera	877a024902	Split out innards of pg_tablespace_location() This creates a src/backend/catalog/pg_tablespace.c supporting file containing a new function get_tablespace_location(), which lets the code underlying pg_tablespace_location() be reused for other purposes. Author: Manni Wood <manni.wood@enterprisedb.com> Author: Nishant Sharma <nishant.sharma@enterprisedb.com> Reviewed-by: Vaibhav Dalvi <vaibhav.dalvi@enterprisedb.com> Reviewed-by: Ian Lawrence Barwick <barwick@gmail.com> Reviewed-by: Jim Jones <jim.jones@uni-muenster.de> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/CAKWEB6rmnmGKUA87Zmq-s=b3Scsnj02C0kObQjnbL2ajfPWGEw@mail.gmail.com	2025-11-12 16:39:55 +01:00
Alexander Korotkov	a1f7f91be2	Add tab completion support for the WAIT FOR command This commit implements tab completion for the WAIT FOR LSN command in psql. Discussion: https://postgr.es/m/CABPTF7WnLPKcoTGCGge1dDpOieZ2HGF7OVqhNXDcRLPPdSw%3DxA%40mail.gmail.com Author: Xuneng Zhou <xunengzhou@gmail.com> Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>	2025-11-12 16:11:14 +02:00
Daniel Gustafsson	b4e32a076c	Fix range for commit_siblings in sample conf The range for commit_siblings was incorrectly listed as starting on 1 instead of 0 in the sample configuration file. Backpatch down to all supported branches. Author: Man Zeng <zengman@halodbtech.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://postgr.es/m/tencent_53B70BA72303AE9C6889E78E@qq.com Backpatch-through: 14	2025-11-12 13:51:53 +01:00
Daniel Gustafsson	9122ff65a1	libpq: threadsafety for SSL certificate callback In order to make the error handling code in backend libpq thread-safe, the global variable used by the certificate verification callback needs to be replaced with passing private data. This moves the threadsafety needle a little bit forward; the call to strerror_r also needs to be replaced with the error buffer made thread local. This is left as future work for when we add the thread primitives required for this to the tree. Author: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/353226C7-97A1-4507-A380-36AA92983AE6@yesql.se	2025-11-12 12:37:40 +01:00
Álvaro Herrera	78aae29830	Change coding pattern for CURL_IGNORE_DEPRECATION() Instead of having to write a semicolon inside the macro argument, we can insert a semicolon with another macro layer. This no longer gives pg_bsd_indent indigestion, so we can remove the digestive aids that had to be installed in the pgindent Perl script. Author: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Andrew Dunstan <andrew@dunslane.net> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/202511111134.njrwf5w5nbjm@alvherre.pgsql Backpatch-through: 18	2025-11-12 12:35:14 +01:00
Michael Paquier	040a39ed25	Fix comments of output routines for pg_ndistinct and pg_dependencies Oversights in `7b504eb282` (for pg_ndistinct) and `2686ee1b7c` (for pg_dependencies). Reported-by: Man Zeng <zengman@halodbtech.com> Discussion: https://postgr.es/m/176293711658.2081918.12019224686811870203.pgcf@coridan.postgresql.org	2025-11-12 20:24:10 +09:00
Heikki Linnakangas	94939c5f3a	Fix pg_upgrade around multixid and mxoff wraparound pg_resetwal didn't accept multixid 0 or multixact offset UINT32_MAX, but they are both valid values that can appear in the control file. That caused pg_upgrade to fail if you tried to upgrade a cluster exactly at multixid or offset wraparound, because pg_upgrade calls pg_resetwal to restore multixid/offset on the new cluster to the values from the old cluster. To fix, allow those values in pg_resetwal. Fixes bugs #18863 and #18865 reported by Dmitry Kovalenko. Backpatch down to v15. Version 14 has the same bug, but the patch doesn't apply cleanly there. It could be made to work but it doesn't seem worth the effort given how rare it is to hit this problem with pg_upgrade, and how few people are upgrading to v14 anymore. Author: Maxim Orlov <orlovmg@gmail.com> Discussion: https://www.postgresql.org/message-id/CACG%3DezaApSMTjd%3DM2Sfn5Ucuggd3FG8Z8Qte8Xq9k5-%2BRQis-g@mail.gmail.com Discussion: https://www.postgresql.org/message-id/18863-72f08858855344a2@postgresql.org Discussion: https://www.postgresql.org/message-id/18865-d4c66cf35c2a67af@postgresql.org Backpatch-through: 15	2025-11-12 12:20:16 +02:00
Amit Kapila	55cefadde8	Doc: Add documentation for sequence synchronization. Add documentation describing sequence synchronization support in logical replication. It explains how sequence changes are synchronized from the publisher to the subscriber, the configuration requirements, and provides examples illustrating setup and usage. Additionally, document the pg_get_sequence_data() function, which allows users to query sequence details on the publisher to determine when to refresh corresponding sequences on the subscriber. Author: Vignesh C <vignesh21@gmail.com> Reviewed-by: Peter Smith <smithpb2250@gmail.com> Reviewed-by: shveta malik <shveta.malik@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/CAA4eK1LC+KJiAkSrpE_NwvNdidw9F2os7GERUeSxSKv71gXysQ@mail.gmail.com	2025-11-12 08:49:01 +00:00
Michael Paquier	2ddc8d9e9b	Move code specific to pg_dependencies to new file This new file is named pg_dependencies.c and includes all the code directly related to the data type pg_dependencies, extracted from the extended statistics code. Some patches are under discussion to change its input and output functions, and this separation makes the follow-up changes cleaner by separating the logic related to the data type and the functional dependencies statistics core logic in dependencies.c. Author: Corey Huinker <corey.huinker@gmail.com> Co-authored-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/aQ2k8--a0FfwSwX9@paquier.xyz	2025-11-12 16:53:19 +09:00
Michael Paquier	a552312343	Move code specific to pg_ndistinct to new file This new file is named pg_ndistinct.c and includes all the code directly related to the data type pg_ndistinct, extracted from the extended statistics code. Some patches are under discussion to change its input and output functions, and this separation makes the follow-up changes cleaner by separating the logic related to the data type and the multivariate ndistinct coefficient core logic in mvdistinct.c. Author: Corey Huinker <corey.huinker@gmail.com> Co-authored-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/aQ2k8--a0FfwSwX9@paquier.xyz	2025-11-12 16:34:52 +09:00
Fujii Masao	df53fa1c1e	doc: Fix incorrect synopsis for ALTER PUBLICATION ... DROP ... The synopsis for the ALTER PUBLICATION ... DROP ... command incorrectly implied that a column list and WHERE clause could be specified as part of the publication object. However, these options are not allowed for DROP operations, making the documentation misleading. This commit corrects the synopsis to clearly show only the valid forms of publication objects. Backpatched to v15, where the incorrect synopsis was introduced. Author: Peter Smith <smithpb2250@gmail.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/CAHut+PsPu+47Q7b0o6h1r-qSt90U3zgbAHMHUag5o5E1Lo+=uw@mail.gmail.com Backpatch-through: 15	2025-11-12 13:37:58 +09:00
Amit Kapila	bfb7419b0b	Remove unused assignment in CREATE PUBLICATION grammar. Commit `96b3784973` extended the grammar for CREATE PUBLICATION to support the ALL SEQUENCES variant. However, it unnecessarily prepared publication objects for this variant, which is not required. This was a copy-paste oversight in that commit. Additionally, rename pub_obj_type_list to pub_all_obj_type_list to better reflect its purpose. Author: Shlok Kyal <shlok.kyal.oss@gmail.com> Reviewed-by: Vignesh C <vignesh21@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/CANhcyEWbjkFvk3mSy5LFs9+0z4K1gDwQeFj7GUjOe+L4vxs4AA@mail.gmail.com Discussion: https://postgr.es/m/CAA4eK1LC+KJiAkSrpE_NwvNdidw9F2os7GERUeSxSKv71gXysQ@mail.gmail.com	2025-11-12 03:28:17 +00:00
Thomas Munro	2421ade663	Prefer spelling "cacheable" over "cachable". Previously we had both in code and comments. Keep the more common and accepted variant. Author: Chao Li <lic@highgo.com> Reviewed-by: Peter Smith <smithpb2250@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/5EBF1771-0566-4D08-9F9B-CDCDEF4BDC98@gmail.com	2025-11-12 14:35:16 +13:00
Michael Paquier	fb9bff0454	injection_points: Add tests for name limits The maximum limits for point name, library name, function name and private area size were not kept track of in the tests. The new function introduced in `16a2f70695` gives a way to trigger them. This is not critical but cheap to cover. While on it, this commit cleans up some of the tests introduced by `16a2f70695` for NULL inputs by using more consistent argument values. The coverage does not change, but it makes the whole less confusing with argument values that are correct based on their position in the SQL function called. Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Xuneng Zhou <xunengzhou@gmail.com> Discussion: https://postgr.es/m/aRE7zhu6wOA29gFf@paquier.xyz	2025-11-12 10:32:50 +09:00
Michael Paquier	6e1535308c	Report better object limits in error messages for injection points Previously, error messages for oversized injection point names, libraries, and functions showed buffer sizes (64, 128, 128) instead of the usable character limits (63, 127, 127) as they did not account for the zero-terminated byte, which was confusing. These messages are adjusted to show better the reality. The limit enforced for the private area was also too strict by one byte, as specifying a zone worth exactly INJ_PRIVATE_MAXLEN should be able to work because there is no zero-terminated byte in this case. This is a stylistic change (well, mostly, a private_area size of exactly 1024 bytes can be defined with this change, something that nobody seems to care about based on the lack of complaints). However, as this is a testing facility, let's keep the logic consistent across all the branches where this code exists, as there is an argument in favor of out-of-core extensions that use injection points. Author: Xuneng Zhou <xunengzhou@gmail.com> Co-authored-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CABPTF7VxYp4Hny1h+7ejURY-P4O5-K8WZg79Q3GUx13cQ6B2kg@mail.gmail.com Backpatch-through: 17	2025-11-12 10:18:50 +09:00
Michael Paquier	79cd66f28c	Add check for large files in meson.build A similar check existed in the MSVC scripts that have been removed in v17 by `1301c80b21`, but nothing of the kind was checked in meson when building with a 4-byte off_t. This commit adds a check to fail the builds when trying to use a relation file size higher than 1GB when off_t is 4 bytes, like ./configure, rather than detecting these failures at runtime because the code is not able to handle large files in this case. Backpatch down to v16, where meson has been introduced. Discussion: https://postgr.es/m/aQ0hG36IrkaSGfN8@paquier.xyz Backpatch-through: 16	2025-11-12 09:02:27 +09:00
Heikki Linnakangas	6956bca515	Add warning to pg_controldata on PG_CONTROL_VERSION mismatch If you run pg_controldata on a cluster that has been initialized with different PG_CONTROL_VERSION than what the pg_controldata program has been compiled with, pg_controldata will still try to interpret the control file, but the result is likely to be somewhat nonsensical. How nonsensical it is depends on the differences between the versions. If sizeof(ControlFileData) differs between the versions, the CRC will not match and you get a warning of that, but otherwise you get no warning. Looking back at recent PG_CONTROL_VERSION updates, all changes that would mess up the printed values have also changed sizeof(ControlFileData), but there's no guarantee of that in future versions. Add an explicit check and warning for version number mismatch before the CRC check. That way, you get a more clear warning if you use the pg_controldata binary from wrong version, and if we change the control file in the future in a way that doesn't change sizeof(ControlFileData), this ensures that you get a warning in that case too. Discussion: https://www.postgresql.org/message-id/2afded89-f9f0-4191-84d8-8b8668e029a1@iki.fi	2025-11-11 19:00:41 +02:00
Heikki Linnakangas	676cd9ac07	Add pg_resetwal and pg_controldata support for new control file field I forgot these in commit `3e0ae46d90`. Discussion: https://www.postgresql.org/message-id/2afded89-f9f0-4191-84d8-8b8668e029a1@iki.fi	2025-11-11 19:00:34 +02:00
Peter Eisentraut	d2f24df19b	Clean up qsort comparison function for GUC entries guc_var_compare() is invoked from qsort() on an array of struct config_generic, but the function accesses these directly as strings (char *). This relies on the name being the first field, so this works. But we can write this more clearly by using the struct and then accessing the field through the struct. Before the reorganization of the GUC structs (commit `a13833c35f`), the old code was probably more convenient, but now we can write this more clearly and correctly. After this change, it is no longer required that the name is the first field in struct config_generic, so remove that comment. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/2c961fa1-14f6-44a2-985c-e30b95654e8d%40eisentraut.org	2025-11-11 07:55:10 +01:00
Heikki Linnakangas	e510378358	Bump PG_CONTROL_VERSION for commit `3e0ae46d90` Commit `3e0ae46d90` added a field to ControlFileData and bumped CATALOG_VERSION_NO, but CATALOG_VERSION_NO is not the right version number for ControlFileData changes. Bumping either one will force an initdb, but PG_CONTROL_VERSION is more accurate. Bump PG_CONTROL_VERSION now. Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/1874404.1762787779@sss.pgh.pa.us	2025-11-10 19:12:43 +02:00
Nathan Bossart	5e4fcbe531	Check for CREATE privilege on the schema in CREATE STATISTICS. This omission allowed table owners to create statistics in any schema, potentially leading to unexpected naming conflicts. For ALTER TABLE commands that require re-creating statistics objects, skip this check in case the user has since lost CREATE on the schema. The addition of a second parameter to CreateStatistics() breaks ABI compatibility, but we are unaware of any impacted third-party code. Reported-by: Jelte Fennema-Nio <postgres@jeltef.nl> Author: Jelte Fennema-Nio <postgres@jeltef.nl> Co-authored-by: Nathan Bossart <nathandbossart@gmail.com> Reviewed-by: Noah Misch <noah@leadboat.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Security: CVE-2025-12817 Backpatch-through: 13	2025-11-10 09:00:00 -06:00
Jacob Champion	600086f471	libpq: Prevent some overflows of int/size_t Several functions could overflow their size calculations, when presented with very large inputs from remote and/or untrusted locations, and then allocate buffers that were too small to hold the intended contents. Switch from int to size_t where appropriate, and check for overflow conditions when the inputs could have plausibly originated outside of the libpq trust boundary. (Overflows from within the trust boundary are still possible, but these will be fixed separately.) A version of add_size() is ported from the backend to assist with code that performs more complicated concatenation. Reported-by: Aleksey Solovev (Positive Technologies) Reviewed-by: Noah Misch <noah@leadboat.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Security: CVE-2025-12818 Backpatch-through: 13	2025-11-10 06:20:33 -08:00
Heikki Linnakangas	3e0ae46d90	Move SLRU_PAGES_PER_SEGMENT to pg_config_manual.h It seems plausible that someone might want to experiment with different values. The pressing reason though is that I'm reviewing a patch that requires pg_upgrade to manipulate SLRU files. That patch needs to access SLRU_PAGES_PER_SEGMENT from pg_upgrade code, and slru.h, where SLRU_PAGES_PER_SEGMENT is currently defined, cannot be included from frontend code. Moving it to pg_config_manual.h makes it accessible. Now that it's a little more likely that someone might change SLRU_PAGES_PER_SEGMENT, add a cluster compatibility check for it. Bump catalog version because of the new field in the control file. Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://www.postgresql.org/message-id/c7a4ea90-9f7b-4953-81be-b3fcb47db057@iki.fi	2025-11-10 16:11:41 +02:00
Daniel Gustafsson	3a872ddd64	Fix typos in nodeWindowAgg comments One of them was submitted by the author, with another one spotted during review, so this fixes both. Author: Tender Wang <tndrwang@gmail.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://postgr.es/m/CAHewXN=eNx2oJ_hzxJrkSvy-1A5Qf45SM8pxERWXE+6RoZyFrw@mail.gmail.com	2025-11-10 12:51:47 +01:00
Michael Paquier	b23fe993e1	Add more tests for relation statistics with rewrites While there are many tests related to relation rewrites, nothing existed to check how the cumulative statistics behave in such cases for relations. A different patch is under discussion to move the relation statistics to be tracked on a per-relfilenode basis, so as these could be rebuilt during crash recovery. This commit gives us a way to check (and perhaps change) the existing behaviors for several rewrite scenarios, mixing transactions, sub-transactions, two-phase commit and VACUUM. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/aQ3X20hbqoThQXgp@ip-10-97-1-34.eu-west-3.compute.internal	2025-11-10 14:30:10 +09:00
David Rowley	812367f3d4	Doc: more uppercase keywords in SQLs Per `49d43faa8`. These ones were missed. Reported-by: jian he <jian.universality@gmail.com> Author: Erik Wienhold <ewie@ewie.name> Discussion: https://postgr.es/m/CACJufxG5UaQtoYFQKdMCYjpz_5Kggvdgm1gVEW4sNEa_W__FKA@mail.gmail.com	2025-11-10 17:15:03 +13:00
Michael Paquier	16a2f70695	injection_points: Add variant for injection_point_attach() This new function is able to take in input more data than the existing injection_point_attach(): - A library name. - A function name. - Some private data. This gives more flexibility for tests so as these would not need to reinvent a wrapper for InjectionPointAttach() when attaching a callback from a library other than "injection_points". injection_point_detach() can be used with both versions of injection_point_attach(). Author: Rahila Syed <rahilasyed.90@gmail.com> Reviewed-by: Mihail Nikalayeu <mihailnikalayeu@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CAH2L28sOG2b_TKkZU51dy+pWJtny1mqDmeFiFoUASGa0X0iiKQ@mail.gmail.com	2025-11-10 09:52:14 +09:00
Michael Paquier	9d7e851a21	Fix comment in copyto.c Author: Tatsuya Kawata <kawatatatsuya0913@gmail.com> Discussion: https://postgr.es/m/CAHza6qeNbqgMfgDi15Dv6E6GWx+8maRAqe97OwzYz3qpEFouJQ@mail.gmail.com	2025-11-09 08:17:31 +09:00
Bruce Momjian	980a855c5c	doc: consistently use "structname" and "structfield" markup Previously "literal" and "classname" were used, inconsistently, for SQL table and column names. Reported-by: Peter Smith Author: Peter Smith Discussion: https://postgr.es/m/CAHut+Pvtf24r+bdPgBind84dBLPvgNL7aB+=HxAUupdPuo2gRg@mail.gmail.com Backpatch-through: master	2025-11-08 09:49:43 -05:00
Bruce Momjian	e8bfad4ca8	docs: fix text by adding/removing parentheses Reported-by: Daisuke Higuchi Author: Daisuke Higuchi, Erik Wienhold Reviewed-by: Erik Wienhold Discussion: https://postgr.es/m/CAEVT6c9FRQcFCzQ8AO=QoeQNA-w6RhTkfOUHzY6N2xD5YnBxhg@mail.gmail.com Backpatch-through: master	2025-11-07 22:19:09 -05:00
Bruce Momjian	6204d07ad6	Remove blank line in C code. Was added in commit `5e89985928`. Reported-by: Ashutosh Bapat Author: Ashutosh Bapat Discussion: https://postgr.es/m/CAExHW5tba_biyuMrd_iPVzq-+XvsMdPcEnjQ+d+__V=cjYj8Pg@mail.gmail.com Backpatch-through: master	2025-11-07 21:54:25 -05:00
Thomas Munro	c5d34f4a55	Fix generic read and write barriers for Clang. generic-gcc.h maps our read and write barriers to C11 acquire and release fences using compiler builtins, for platforms where we don't have our own hand-rolled assembler. This is apparently enough for GCC, but the C11 memory model is only defined in terms of atomic accesses, and our barriers for non-atomic, non-volatile accesses were not always respected under Clang's stricter interpretation of the standard. This explains the occasional breakage observed on new RISC-V + Clang animal greenfly in lock-free PgAioHandle manipulation code containing a repeating pattern of loads and read barriers. The problem can also be observed in code generated for MIPS and LoongArch, though we aren't currently testing those with Clang, and on x86, though we use our own assembler there. The scariest aspect is that we use the generic version on very common ARM systems, but it doesn't seem to reorder the relevant code there (or we'd have debugged this long ago). Fix by inserting an explicit compiler barrier. It expands to an empty assembler block declared to have memory side-effects, so registers are flushed and reordering is prevented. In those respects this is like the architecture-specific assembler versions, but the compiler is still in charge of generating the appropriate fence instruction. Done for write barriers on principle, though concrete problems have only been observed with read barriers. Reported-by: Alexander Lakhin <exclusion@gmail.com> Tested-by: Alexander Lakhin <exclusion@gmail.com> Discussion: https://postgr.es/m/d79691be-22bd-457d-9d90-18033b78c40a%40gmail.com Backpatch-through: 13	2025-11-08 12:26:43 +13:00
Alexander Korotkov	7742f99a02	Fix checking for recovery state in WaitForLSN() We only need to do it for WAIT_LSN_TYPE_REPLAY. WAIT_LSN_TYPE_FLUSH can work for both primary and follower.	2025-11-07 23:34:50 +02:00
Daniel Gustafsson	07961ef866	doc: Fix incorrect wording for --file in pg_dump The documentation stated that the directory specified by --file must not exist, but pg_dump does allow for empty directories to be specified and used. Author: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Bruce Momjian <bruce@momjian.us> Discussion: https://postgr.es/m/534AA60D-CF6B-432F-9882-E9737B33D1B7@gmail.com	2025-11-07 15:10:50 +01:00
Fujii Masao	0ab208fa50	pgbench: Add --continue-on-error option. This commit adds the --continue-on-error option, allowing pgbench clients to continue running even when SQL statements fail for reasons other than serialization or deadlock errors. Without this option (by default), the clients abort in such cases, which was the only available behavior previously. This option is useful for benchmarks using custom scripts that may raise errors, such as unique constraint violations, where users want pgbench to complete the run despite individual statement failures. Author: Rintaro Ikeda <ikedarintarof@oss.nttdata.com> Co-authored-by: Yugo Nagata <nagata@sraoss.co.jp> Co-authored-by: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Stepan Neretin <slpmcf@gmail.com> Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com> Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com> Reviewed-by: Srinath Reddy Sadipiralla <srinath2133@gmail.com> Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com> Reviewed-by: Chao Li <lic@highgo.com> Discussion: https://postgr.es/m/44334231a4d214fac382a69cceb7d9fc@oss.nttdata.com	2025-11-07 19:17:37 +09:00
Peter Eisentraut	a3ea5330fc	Fix "inconsistent DLL linkage" warning on Windows MSVC This warning was disabled in meson.build (warning 4273). If you enable it, it looks like this: ../src/backend/utils/misc/ps_status.c(27): warning C4273: '__p__environ': inconsistent dll linkage C:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\ucrt\stdlib.h(1158): note: see previous definition of '__p__environ' The declaration in ps_status.c was: #if !defined(WIN32) || defined(_MSC_VER) extern char **environ; #endif The declaration in the OS header file is: _DCRTIMP char* __cdecl __p__environ (void); #define _environ (*__p__environ()) So it is evident that this could be problematic. The old declaration was required by the old MSVCRT library, but we don't support that anymore with MSVC. To fix, disable the re-declaration in ps_status.c, and also in some other places that use the same code pattern but didn't trigger the warning. Then we can also re-enable the warning (delete the disablement in meson.build). Reviewed-by: Bryan Green <dbryan.green@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://www.postgresql.org/message-id/flat/bf060644-47ff-441b-97cf-c685d0827757@eisentraut.org	2025-11-07 10:14:25 +01:00
Amit Kapila	f6a4c498dc	Add seq_sync_error_count to subscription statistics. This commit adds a new column, seq_sync_error_count, to the pg_stat_subscription_stats view. This counter tracks the number of errors encountered by the sequence synchronization worker during operation. Since a single worker handles the synchronization of all sequences, this value may reflect errors from multiple sequences. This addition improves observability of sequence synchronization behavior and helps monitor potential issues during replication. Author: Vignesh C <vignesh21@gmail.com> Reviewed-by: shveta malik <shveta.malik@gmail.com> Reviewed-by: Peter Smith <smithpb2250@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/CAA4eK1LC+KJiAkSrpE_NwvNdidw9F2os7GERUeSxSKv71gXysQ@mail.gmail.com	2025-11-07 08:05:08 +00:00
Fujii Masao	c32e32f763	doc: Fix descriptions of some PGC_POSTMASTER parameters. The following parameters can only be set at server start because their context is PGC_POSTMASTER, but this information was missing or incorrectly documented. This commit adds or corrects that information for the following parameters: * debug_io_direct * dynamic_shared_memory_type * event_source * huge_pages * io_max_combine_limit * max_notify_queue_pages * shared_memory_type * track_commit_timestamp * wal_decode_buffer_size Backpatched to all supported branches. Author: Karina Litskevich <litskevichkarina@gmail.com> Reviewed-by: Chao Li <lic@highgo.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/CAHGQGwGfPzcin-_6XwPgVbWTOUFVZgHF5g9ROrwLUdCTfjy=0A@mail.gmail.com Backpatch-through: 13	2025-11-07 14:54:36 +09:00
Fujii Masao	6fba6cb05d	doc: Clarify units for io_combine_limit and io_max_combine_limit. If these parameters are set without units, the values are interpreted as blocks. This detail was previously missing from the documentation, so this commit adds it. Backpatch to v17 where io_combine_limit was added. Author: Karina Litskevich <litskevichkarina@gmail.com> Reviewed-by: Chao Li <lic@highgo.com> Reviewed-by: Xuneng Zhou <xunengzhou@gmail.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/CACiT8iZCDkz1bNYQNQyvGhXWJExSnJULRTYT894u4-Ti7Yh6jw@mail.gmail.com Backpatch-through: 17	2025-11-07 14:42:17 +09:00
Andres Freund	5310fac6e0	bufmgr: Use atomic sub for unpinning buffers The prior commit made it legal to modify BufferDesc.state while the buffer header spinlock is held. This allows us to replace the CAS loop in UnpinBufferNoOwner() with an atomic sub. This improves scalability significantly. See the prior commits for more background. Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com> Discussion: https://postgr.es/m/fvfmkr5kk4nyex56ejgxj3uzi63isfxovp2biecb4bspbjrze7@az2pljabhnff	2025-11-06 16:43:16 -05:00
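To illustrate the difference between a CAS loop and an atomic sub for releasing a pin, here is a minimal sketch using C11 `<stdatomic.h>`. The bit layout, the `SKETCH_` constants, and the function names are assumptions made for the example; they do not reproduce the real BufferDesc.state encoding or the bufmgr code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical layout: refcount in the low bits, one header-lock bit. */
#define SKETCH_REFCOUNT_ONE   1u
#define SKETCH_REFCOUNT_MASK  0x3FFFFu
#define SKETCH_LOCKED         (1u << 22)

/*
 * Older style: the state word must not change while the header lock bit
 * is set, so the unpin spins until the lock is clear and then retries a
 * compare-and-swap until no other process has raced with it.
 */
static uint32_t
sketch_unpin_cas(_Atomic uint32_t *state)
{
    uint32_t old = atomic_load(state);

    for (;;)
    {
        if (old & SKETCH_LOCKED)
        {
            old = atomic_load(state);   /* wait for the lock holder */
            continue;
        }
        assert((old & SKETCH_REFCOUNT_MASK) > 0);
        /* On failure, "old" is refreshed with the current value. */
        if (atomic_compare_exchange_weak(state, &old,
                                         old - SKETCH_REFCOUNT_ONE))
            return old - SKETCH_REFCOUNT_ONE;
    }
}

/*
 * Newer style: once modifications are permitted while the lock bit is
 * held, a single fetch-and-subtract suffices and can never fail.
 */
static uint32_t
sketch_unpin_sub(_Atomic uint32_t *state)
{
    uint32_t old = atomic_fetch_sub(state, SKETCH_REFCOUNT_ONE);

    assert((old & SKETCH_REFCOUNT_MASK) > 0);
    return old - SKETCH_REFCOUNT_ONE;
}
```

Under contention the CAS version can loop repeatedly as other processes change the word, whereas the subtract version does one read-modify-write regardless of what else is happening to the other bits.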
Andres Freund	c75ebc657f	bufmgr: Allow some buffer state modifications while holding header lock Until now BufferDesc.state was not allowed to be modified while the buffer header spinlock was held. This meant that operations like unpinning buffers needed to use a CAS loop, waiting for the buffer header spinlock to be released before updating. The benefit of that restriction is that it allowed us to unlock the buffer header spinlock with just a write barrier and an unlocked write (instead of a full atomic operation). That was important to avoid regressions in `48354581a4`. However, since then the hottest buffer header spinlock uses have been replaced with atomic operations (in particular, the most common use of PinBuffer_Locked(), in GetVictimBuffer() (formerly in BufferAlloc()), has been removed in `5e89985928`). This change will allow, in a subsequent commit, to release buffer pins with a single atomic-sub operation. This previously was not possible while such operations were not allowed while the buffer header spinlock was held, as an atomic-sub would not have allowed a race-free check for the buffer header lock being held. Using atomic-sub to unpin buffers is a nice scalability win, however it is not the primary motivation for this change (although it would be sufficient). The primary motivation is that we would like to merge the buffer content lock into BufferDesc.state, which will result in more frequent changes of the state variable, which in some situations can cause a performance regression, due to an increased CAS failure rate when unpinning buffers. The regression entirely vanishes when using atomic-sub. Naively implementing this would require putting CAS loops in every place modifying the buffer state while holding the buffer header lock. To avoid that, introduce UnlockBufHdrExt(), which can set/add flags as well as the refcount, together with releasing the lock. 
Reviewed-by: Robert Haas <robertmhaas@gmail.com> Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com> Discussion: https://postgr.es/m/fvfmkr5kk4nyex56ejgxj3uzi63isfxovp2biecb4bspbjrze7@az2pljabhnff	2025-11-06 16:42:10 -05:00
David Rowley	448b6a4173	Tidyup WARNING ereports in subscriptioncmds.c A couple of ereports were making use of StringInfos as temporary storage for the portions of the WARNING message. One was doing this to avoid having 2 separate ereports. This was all fairly unnecessary and resulted in more code rather than less code. Refactor out the additional StringInfos and make check_publications_origin_tables() use 2 ereports. In passing, adjust pubnames to become a stack-allocated StringInfoData to avoid having to palloc the temporary StringInfoData. This follows on from the efforts made in `6d0eba662`. Author: Mats Kindahl <mats.kindahl@gmail.com> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/0b381b02-cab9-41f9-a900-ad6c8d26c1fc%40gmail.com	2025-11-07 09:50:02 +13:00
Álvaro Herrera	a2b02293bc	Use XLogRecPtrIsValid() in various places Now that commit `06edbed478` has introduced XLogRecPtrIsValid(), we can use that instead of: - XLogRecPtrIsInvalid() - direct comparisons with InvalidXLogRecPtr - direct comparisons with literal 0 This makes the code more consistent. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://postgr.es/m/aQB7EvGqrbZXrMlg@ip-10-97-1-34.eu-west-3.compute.internal	2025-11-06 20:33:57 +01:00
Álvaro Herrera	06edbed478	Introduce XLogRecPtrIsValid() XLogRecPtrIsInvalid() is inconsistent with the affirmative form of macros used for other datatypes, and leads to awkward double negatives in a few places. This commit introduces XLogRecPtrIsValid(), which allows code to be written more naturally. This patch only adds the new macro. XLogRecPtrIsInvalid() is left in place, and all existing callers remain untouched. This means all supported branches can accept hypothetical bug fixes that use the new macro, and at the same time any code that compiled with the original formulation will continue to silently compile just fine. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Backpatch-through: 13 Discussion: https://postgr.es/m/aQB7EvGqrbZXrMlg@ip-10-97-1-34.eu-west-3.compute.internal	2025-11-06 19:08:29 +01:00
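The readability argument in the entry above can be shown with a small sketch. The `Sketch`-prefixed type and macros are stand-ins written for this example, assuming a 64-bit LSN type whose invalid value is zero; the real definitions are in the PostgreSQL headers.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in definitions, assumed for illustration only. */
typedef uint64_t SketchXLogRecPtr;
#define SketchInvalidXLogRecPtr       ((SketchXLogRecPtr) 0)
#define SketchXLogRecPtrIsInvalid(r)  ((r) == SketchInvalidXLogRecPtr)
#define SketchXLogRecPtrIsValid(r)    ((r) != SketchInvalidXLogRecPtr)

/* Negative-form macro: callers end up writing a double negative. */
static int
sketch_has_position_old(SketchXLogRecPtr lsn)
{
    return !SketchXLogRecPtrIsInvalid(lsn);
}

/* Affirmative-form macro: the condition reads directly. */
static int
sketch_has_position_new(SketchXLogRecPtr lsn)
{
    return SketchXLogRecPtrIsValid(lsn);
}

/* Bytes between two positions; both must be valid and ordered. */
static uint64_t
sketch_bytes_between(SketchXLogRecPtr start, SketchXLogRecPtr end)
{
    assert(SketchXLogRecPtrIsValid(start) && SketchXLogRecPtrIsValid(end));
    assert(start <= end);
    return end - start;
}
```

Both helper functions return the same result for every input; only the spelling at the call site changes.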
Álvaro Herrera	8fe7700b7e	Refer readers of \? to "\? variables" for pset options ... and remove the list of \pset options from the general \? output. That list was getting out of hand, both for developers to keep up to date as well as for users to read. Author: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/202511041638.dm4qukcxfjto@alvherre.pgsql	2025-11-06 15:50:04 +01:00
Peter Eisentraut	aa606b9316	Disallow generated columns in COPY WHERE clause Stored generated columns are not yet computed when the filtering happens, so we need to prohibit them to avoid incorrect behavior. Virtual generated columns currently error out ("unexpected virtual generated column reference"). They could probably work if we expand them in the right place, but for now let's keep them consistent with the stored variant. This doesn't change the behavior, it only gives a nicer error message. Co-authored-by: jian he <jian.universality@gmail.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CACJufxHb8YPQ095R_pYDr77W9XKNaXg5Rzy-WP525mkq+hRM3g@mail.gmail.com	2025-11-06 13:54:42 +01:00
Heikki Linnakangas	aa9c5fd3e3	Refactor shared memory allocation for semaphores Before commit `e25626677f`, spinlocks were implemented using semaphores on some platforms (--disable-spinlocks). That made it necessary to initialize semaphores early, before any spinlocks could be used. Now that we don't support --disable-spinlocks anymore, we can allocate the shared memory needed for semaphores the same way as other shared memory structures. Since the semaphores are used only in the PGPROC array, move the semaphore shmem size estimation and initialization calls to ProcGlobalShmemSize() and InitProcGlobal(). Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Discussion: https://www.postgresql.org/message-id/CAExHW5seSZpPx-znjidVZNzdagGHOk06F+Ds88MpPUbxd1kTaA@mail.gmail.com	2025-11-06 14:45:00 +02:00
Heikki Linnakangas	daf3d99d2b	Add comment to explain why PGReserveSemaphores() is called early Before commit `e25626677f`, PGReserveSemaphores() had to be called before SpinlockSemaInit() because spinlocks were implemented using semaphores on some platforms (--disable-spinlocks). Add a comment explaining that. Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Discussion: https://www.postgresql.org/message-id/CAExHW5seSZpPx-znjidVZNzdagGHOk06F+Ds88MpPUbxd1kTaA@mail.gmail.com Backpatch-to: 18	2025-11-06 14:20:48 +02:00
Peter Eisentraut	150e24501b	Fix redundancy in error message Discussion: https://www.postgresql.org/message-id/flat/E1vEsbx-004QDO-0o%40gemulon.postgresql.org	2025-11-06 10:47:45 +01:00
John Naylor	07b3df5d00	Cosmetic fixes in GiST README Fix a typo, add some missing conjunctions, and make a sentence flow more smoothly. Author: Paul A. Jungwirth <pj@illuminatedcomputing.com> Discussion: https://postgr.es/m/CA%2BrenyXZgwzegmO5t%3DUSU%3D9Wo5bc-YqNf-6E7Nv7e577DCmYXA%40mail.gmail.com	2025-11-06 16:35:40 +07:00
Amit Kapila	5a4eba558a	Fix few issues in commit `5509055d69`. Test failure on buildfarm member prion: The test failed due to an unexpected LOCATION: line appearing between the WARNING and ERROR messages. This occurred because the prion machine uses log_error_verbosity = verbose, which includes additional context in error messages. The test was originally checking for both WARNING and ERROR messages in sequence sync, but the extra LOCATION: line disrupted this pattern. To make the test robust across different verbosity settings, it now only checks for the presence of the WARNING message after the test, which is sufficient to validate the intended behavior. Failure to sync sequences with quoted names: The previous implementation did not correctly quote sequence names when querying remote information, leading to failures when quoted sequence names were used. This fix ensures that sequence names are properly quoted during remote queries, allowing sequences with quoted identifiers to be synced correctly. Author: Vignesh C <vignesh21@gmail.com> Author: Shinya Kato <shinya11.kato@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/CALDaNm0WcdSCoNPiE-5ek4J2dMJ5o111GPTzKCYj9G5i=ONYtQ@mail.gmail.com Discussion: https://postgr.es/m/CAOzEurQOSN=Zcp9uVnatNbAy=2WgMTJn_DYszYjv0KUeQX_e_A@mail.gmail.com	2025-11-06 08:52:31 +00:00
Thomas Munro	b498af4204	ci: Improve OpenBSD core dump backtrace handling. Since OpenBSD core dumps do not embed executable paths, the script now searches for the corresponding binary manually within the specified directory before invoking LLDB. This is imperfect but should find the right executable in practice, as needed for meaningful backtraces. Author: Nazir Bilal Yavuz <byavuz81@gmail.com> Discussion: https://postgr.es/m/CAN55FZ36R74TZ8RKsFueYwLxGKDAm3LU2FHM_ZUCSB6imd3vYA@mail.gmail.com Backpatch-through: 18	2025-11-06 21:14:05 +13:00
Michael Paquier	d6c132d83b	Document some structures in attribute_stats.c Like relation_stats.c, these structures are used to track the argument number, names and types of pg_restore_attribute_stats() and pg_clear_attribute_stats(). Extracted from a larger patch by the same author, reworded by me for consistency with relation_stats.c. Author: Corey Huinker <corey.huinker@gmail.com> Discussion: https://postgr.es/m/CADkLM=dpz3KFnqP-dgJ-zvRvtjsa8UZv8wDAQdqho=qN3kX0Zg@mail.gmail.com	2025-11-06 16:22:12 +09:00
Peter Eisentraut	489ec6b2fc	Fix spurious output in configure If sizeof off_t is 4, then configure will print a line saying just "0" after the test. This is the output of the following "expr" command. If we are using expr just for the exit code, the output should be sent to /dev/null, as is done elsewhere.	2025-11-06 08:00:57 +01:00
Peter Eisentraut	2307cfe316	MSVC: Improve warning options set The previous code had a set of warnings to disable on MSVC. But some of these weren't actually enabled by default anyway, only in higher MSVC warning levels (/W, maps to meson warning_level). I rearranged this so that it is clearer in what MSVC warning level a warning would have been enabled. Furthermore, sort them numerically within the levels. Moreover, we can add a few warning types to the default set, to get a similar set of warnings that we get by default with gcc or clang (the equivalents of -Wswitch and -Wformat). Reviewed-by: Bryan Green <dbryan.green@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/bf060644-47ff-441b-97cf-c685d0827757@eisentraut.org	2025-11-06 07:56:02 +01:00
Peter Eisentraut	01a985c3c4	Re-run autoheader Some of the changes in pg_config.h.in from commit `3853a6956c` didn't match the order that a fresh run would produce.	2025-11-06 07:37:22 +01:00
Peter Eisentraut	4b6fa00a3a	Re-run autoconf Some of the last-minute changes in commit `f0f2c0c1ae` were apparently not captured.	2025-11-06 07:36:06 +01:00
Peter Eisentraut	05b9edcb71	Update code comment Should have been part of commit `a13833c35f`.	2025-11-06 07:16:30 +01:00
David Rowley	eaa159632d	Fix UNION planner estimate_num_groups with varno==0 `03d40e4b5` added code to provide better row estimates for when a UNION query ended up only with a single child due to other children being found to be dummy rels. In that case, ordinarily it would be ok to call estimate_num_groups() on the targetlist of the only child path, however that's not safe to do if the UNION child is the result of some other set operation as we generate targetlists containing Vars with varno==0 for those, which estimate_num_groups() can't handle. This could lead to: ERROR: XX000: no relation entry for relid 0 Fix this by avoiding doing this when the only child is the result of another set operation. In that case we'll fall back on the assume-all-rows-are-unique method. Reported-by: Alexander Lakhin <exclusion@gmail.com> Author: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/cfbc99e5-9d44-4806-ba3c-f36b57a85e21@gmail.com	2025-11-06 16:34:55 +13:00
Etsuro Fujita	a3ebec4e4c	Update obsolete comment in ExecScanReScan(). Commit `27cc7cd2b` removed the epqScanDone flag from the EState struct, and instead added an equivalent flag named relsubs_done to the EPQState struct; but it failed to update this comment. Author: Etsuro Fujita <etsuro.fujita@gmail.com> Discussion: https://postgr.es/m/CAPmGK152zJ3fU5avDT5udfL0namrDeVfMTL3dxdOXw28SOrycg%40mail.gmail.com Backpatch-through: 13	2025-11-06 12:25:00 +09:00
Etsuro Fujita	c3eec94fc1	postgres_fdw: Add more test coverage for EvalPlanQual testing. postgres_fdw supports EvalPlanQual testing by using the infrastructure provided by the core with the RecheckForeignScan callback routine (cf. commits `5fc4c26db` and `385f337c9`), but there has been no test coverage for that, except that recent commit `12609fbac`, which fixed an issue in commit `385f337c9`, added a test case to exercise only a code path added by that commit to the core infrastructure. So let's add test cases to exercise other code paths as well at this time. Like commit `12609fbac`, back-patch to all supported branches. Reported-by: Masahiko Sawada <sawada.mshk@gmail.com> Author: Etsuro Fujita <etsuro.fujita@gmail.com> Discussion: https://postgr.es/m/CAPmGK15%2B6H%3DkDA%3D-y3Y28OAPY7fbAdyMosVofZZ%2BNc769epVTQ%40mail.gmail.com Backpatch-through: 13	2025-11-06 12:15:00 +09:00
David Rowley	49d43faa83	Doc: use uppercase keywords in SQLs Use uppercase SQL keywords consistently throughout the documentation to ease reading. Also add whitespace in a couple of places where it improves readability. Author: Erik Wienhold <ewie@ewie.name> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/82eb512b-8ed2-46be-b311-54ffd26978c4%40ewie.name	2025-11-06 16:03:02 +13:00
David Rowley	6d0eba6627	Use stack allocated StringInfoDatas, where possible Various places that were using StringInfo but didn't need that StringInfo to exist beyond the scope of the function were using makeStringInfo(), which allocates both a StringInfoData and the buffer it uses as two separate allocations. It's more efficient for these cases to use a StringInfoData on the stack and initialize it with initStringInfo(), which only allocates the string buffer. This also simplifies the cleanup, in a few cases. Author: Mats Kindahl <mats.kindahl@gmail.com> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/4379aac8-26f1-42f2-a356-ff0e886228d3@gmail.com	2025-11-06 14:59:48 +13:00
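The allocation saving described in the entry above can be sketched with a tiny stand-in string buffer. Everything prefixed `sketch_` or `Sketch` is hypothetical and uses `malloc` rather than PostgreSQL's memory contexts; only the shape of the two patterns (heap-allocated struct versus struct on the stack) is taken from the commit message.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for a growable string buffer; not the real StringInfoData. */
typedef struct SketchStringInfoData
{
    char   *data;
    size_t  len;
    size_t  maxlen;
} SketchStringInfoData;

typedef SketchStringInfoData *SketchStringInfo;

/* Initialize a caller-provided struct: one allocation (the buffer). */
static void
sketch_init_string_info(SketchStringInfo str)
{
    assert(str != NULL);
    str->maxlen = 64;
    str->data = malloc(str->maxlen);
    if (str->data == NULL)
        abort();
    str->data[0] = '\0';
    str->len = 0;
}

/* Allocate the struct too: two allocations (struct plus buffer). */
static SketchStringInfo
sketch_make_string_info(void)
{
    SketchStringInfo str = malloc(sizeof(SketchStringInfoData));

    if (str == NULL)
        abort();
    sketch_init_string_info(str);
    return str;
}

/* Append a NUL-terminated string, doubling the buffer as needed. */
static void
sketch_append(SketchStringInfo str, const char *s)
{
    size_t n = strlen(s);

    if (str->len + n + 1 > str->maxlen)
    {
        while (str->len + n + 1 > str->maxlen)
            str->maxlen *= 2;
        str->data = realloc(str->data, str->maxlen);
        if (str->data == NULL)
            abort();
    }
    memcpy(str->data + str->len, s, n + 1);
    str->len += n;
}

/* Heap pattern: two frees are needed to clean up. */
static size_t
sketch_build_heap(const char *a, const char *b)
{
    SketchStringInfo buf = sketch_make_string_info();
    size_t len;

    sketch_append(buf, a);
    sketch_append(buf, b);
    len = buf->len;
    free(buf->data);
    free(buf);
    return len;
}

/* Stack pattern: the struct lives in the frame; only the buffer is freed. */
static size_t
sketch_build_stack(const char *a, const char *b)
{
    SketchStringInfoData buf;
    size_t len;

    sketch_init_string_info(&buf);
    sketch_append(&buf, a);
    sketch_append(&buf, b);
    len = buf.len;
    free(buf.data);
    return len;
}
```

The stack variant is only appropriate when the buffer does not need to outlive the function, which is the condition the commit message states.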
Thomas Munro	cf638b46af	ci: Add missing "set -e" to scripts run by su. If any shell command fails, the whole script should fail. To avoid future omissions, add this even for single-command scripts that use su with heredoc syntax, as they might be extended or copied-and-pasted. Extracted from a larger patch that wanted to use #error during compilation, leading to the diagnosis of this problem. Reviewed-by: Tristan Partin <tristan@partin.io> (earlier version) Discussion: https://postgr.es/m/DDZP25P4VZ48.3LWMZBGA1K9RH%40partin.io Backpatch-through: 15	2025-11-06 13:24:30 +13:00
Tom Lane	d4baa327a1	Avoid possible crash within libsanitizer. We've successfully used libsanitizer for a while with the undefined and alignment sanitizers, but with some other sanitizers (at least thread and hwaddress) it crashes due to internal recursion before it's fully initialized itself. It turns out that that's due to the "__ubsan_default_options" hack installed by commit `f686ae82f`, and we can fix it by ensuring that __ubsan_default_options is built without any sanitizer instrumentation hooks. Reported-by: Emmanuel Sibi <emmanuelsibi.mec@gmail.com> Reported-by: Alexander Lakhin <exclusion@gmail.com> Diagnosed-by: Emmanuel Sibi <emmanuelsibi.mec@gmail.com> Fix-suggested-by: Jacob Champion <jacob.champion@enterprisedb.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/F7543B04-E56C-4D68-A040-B14CCBAD38F1@gmail.com Discussion: https://postgr.es/m/dbf77bf7-6e54-ed8a-c4ae-d196eeb664ce@gmail.com Backpatch-through: 16	2025-11-05 11:09:45 -05:00
Peter Eisentraut	e4d8a2af07	doc: Add section for temporal tables This section introduces temporal tables, with a focus on Application Time (which we support) and only a brief mention of System Time (which we don't). It covers temporal primary keys, unique constraints, and temporal foreign keys. We will document temporal update/delete and periods as we add those features. This commit also adds glossary entries for temporal table, application time, and system time. Author: Paul A. Jungwirth <pj@illuminatedcomputing.com> Discussion: https://www.postgresql.org/message-id/flat/ec498c3d-5f2b-48ec-b989-5561c8aa2024@illuminatedcomputing.com	2025-11-05 16:38:04 +01:00
Alexander Korotkov	447aae13b0	Implement WAIT FOR command WAIT FOR is to be used on standby and specifies waiting for the specific WAL location to be replayed. This option is useful when the user makes some data changes on primary and needs a guarantee to see these changes on standby. WAIT FOR needs to wait without any snapshot held. Otherwise, the snapshot could prevent the replay of WAL records, implying a kind of self-deadlock. This is why a separate utility command appears to be the most robust way to implement this functionality. It's not possible to implement this as a function. Previous experience shows that stored procedures also have limitations in this aspect. Discussion: https://www.postgresql.org/message-id/flat/CAPpHfdsjtZLVzxjGT8rJHCYbM0D5dwkO+BBjcirozJ6nYbOW8Q@mail.gmail.com Discussion: https://www.postgresql.org/message-id/flat/CABPTF7UNft368x-RgOXkfj475OwEbp%2BVVO-wEXz7StgjD_%3D6sw%40mail.gmail.com Author: Kartyshov Ivan <i.kartyshov@postgrespro.ru> Author: Alexander Korotkov <aekorotkov@gmail.com> Author: Xuneng Zhou <xunengzhou@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com> Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Alexander Lakhin <exclusion@gmail.com> Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> Reviewed-by: Euler Taveira <euler@eulerto.com> Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Reviewed-by: jian he <jian.universality@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Xuneng Zhou <xunengzhou@gmail.com>	2025-11-05 11:44:13 +02:00
Alexander Korotkov	3b4e53a075	Add infrastructure for efficient LSN waiting Implement a new facility that allows processes to wait for WAL to reach specific LSNs, both on primary (waiting for flush) and standby (waiting for replay) servers. The implementation uses shared memory with per-backend information organized into pairing heaps, allowing O(1) access to the minimum waited LSN. This enables fast-path checks: after replaying or flushing WAL, the startup process or WAL writer can quickly determine if any waiters need to be awakened. Key components: - New xlogwait.c/h module with WaitForLSNReplay() and WaitForLSNFlush() - Separate pairing heaps for replay and flush waiters - WaitLSN lightweight lock for coordinating shared state - Wait events WAIT_FOR_WAL_REPLAY and WAIT_FOR_WAL_FLUSH for monitoring This infrastructure can be used by features that need to wait for WAL operations to complete. Discussion: https://www.postgresql.org/message-id/flat/CAPpHfdsjtZLVzxjGT8rJHCYbM0D5dwkO+BBjcirozJ6nYbOW8Q@mail.gmail.com Discussion: https://www.postgresql.org/message-id/flat/CABPTF7UNft368x-RgOXkfj475OwEbp%2BVVO-wEXz7StgjD_%3D6sw%40mail.gmail.com Author: Kartyshov Ivan <i.kartyshov@postgrespro.ru> Author: Alexander Korotkov <aekorotkov@gmail.com> Author: Xuneng Zhou <xunengzhou@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com> Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Alexander Lakhin <exclusion@gmail.com> Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> Reviewed-by: Euler Taveira <euler@eulerto.com> Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Reviewed-by: Xuneng Zhou <xunengzhou@gmail.com>	2025-11-05 11:44:13 +02:00
Alexander Korotkov	8af3ae0d4b	Add pairingheap_initialize() for shared memory usage The existing pairingheap_allocate() uses palloc(), which allocates from process-local memory. For shared memory use cases, the pairingheap structure must be allocated via ShmemAlloc() or embedded in a shared memory struct. Add pairingheap_initialize() to initialize an already-allocated pairingheap structure in-place, enabling shared memory usage. Discussion: https://www.postgresql.org/message-id/flat/CAPpHfdsjtZLVzxjGT8rJHCYbM0D5dwkO+BBjcirozJ6nYbOW8Q@mail.gmail.com Discussion: https://www.postgresql.org/message-id/flat/CABPTF7UNft368x-RgOXkfj475OwEbp%2BVVO-wEXz7StgjD_%3D6sw%40mail.gmail.com Author: Kartyshov Ivan <i.kartyshov@postgrespro.ru> Author: Alexander Korotkov <aekorotkov@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com> Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Alexander Lakhin <exclusion@gmail.com> Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> Reviewed-by: Euler Taveira <euler@eulerto.com> Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Reviewed-by: Xuneng Zhou <xunengzhou@gmail.com>	2025-11-05 11:44:13 +02:00
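The two entries above rely on a pairing heap that is initialized in place and whose minimum can be read in constant time. The following is a minimal sketch of that idea, keyed on a 64-bit LSN. All `Sketch`/`sketch_` names are hypothetical, and the code is a textbook two-pass pairing heap rather than PostgreSQL's pairingheap implementation, whose API and node layout differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Intrusive node: the caller owns the memory, e.g. in a per-backend slot. */
typedef struct SketchNode
{
    struct SketchNode *first_child;
    struct SketchNode *next_sibling;
    uint64_t           lsn;         /* ordering key; smallest is at the root */
} SketchNode;

/* The heap header is just a root pointer, so it can be embedded anywhere. */
typedef struct SketchHeap
{
    SketchNode *root;
} SketchHeap;

/* In-place initialization: no allocation, works on embedded structs. */
static void
sketch_heap_initialize(SketchHeap *heap)
{
    heap->root = NULL;
}

/* Link two heap roots; the one with the smaller key becomes the parent. */
static SketchNode *
sketch_merge(SketchNode *a, SketchNode *b)
{
    if (a == NULL)
        return b;
    if (b == NULL)
        return a;
    if (b->lsn < a->lsn)
    {
        SketchNode *tmp = a;

        a = b;
        b = tmp;
    }
    b->next_sibling = a->first_child;
    a->first_child = b;
    return a;
}

static void
sketch_heap_add(SketchHeap *heap, SketchNode *node)
{
    node->first_child = NULL;
    node->next_sibling = NULL;
    heap->root = sketch_merge(heap->root, node);
}

/* O(1): the minimum waited-for LSN is always at the root. */
static SketchNode *
sketch_heap_first(const SketchHeap *heap)
{
    return heap->root;
}

/* Two-pass merge: pair up siblings, then fold the pairs together. */
static SketchNode *
sketch_merge_children(SketchNode *child)
{
    SketchNode *a, *b, *rest;

    if (child == NULL || child->next_sibling == NULL)
        return child;
    a = child;
    b = child->next_sibling;
    rest = b->next_sibling;
    a->next_sibling = NULL;
    b->next_sibling = NULL;
    return sketch_merge(sketch_merge(a, b), sketch_merge_children(rest));
}

static SketchNode *
sketch_heap_remove_first(SketchHeap *heap)
{
    SketchNode *old_root = heap->root;

    assert(old_root != NULL);
    heap->root = sketch_merge_children(old_root->first_child);
    return old_root;
}
```

Because the header and nodes contain no hidden allocations, both can sit inside a larger struct; in a real multi-process setting the pointers would additionally have to be valid in every process mapping the memory, and access would need a lock, neither of which this sketch addresses.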
Richard Guo	0ea5eee376	Avoid creating duplicate ordered append paths In generate_orderedappend_paths(), the function does not handle the case where the paths in total_subpaths and fractional_subpaths are identical. This situation is not uncommon, and as a result, it may generate two exactly identical ordered append paths. Fix by checking whether total_subpaths and fractional_subpaths contain the same paths, and skipping creation of the ordered append path for the fractional case when they are identical. Given the lack of field complaints about this, I'm a bit hesitant to back-patch, but let's clean it up in HEAD. Author: Richard Guo <guofenglinux@gmail.com> Reviewed-by: Andrei Lepikhov <lepihov@gmail.com> Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com> Discussion: https://postgr.es/m/CAMbWs4-OYsgA75tGGiBARt87G0y_z_GBTSLrzudcJxAzndYkYw@mail.gmail.com	2025-11-05 18:10:54 +09:00
Richard Guo	c1777f2d6d	Fix assertion failure in generate_orderedappend_paths() In generate_orderedappend_paths(), there is an assumption that a child relation's row estimate is always greater than zero. There is an Assert verifying this assumption, and the estimate is also used to convert an absolute tuple count into a fraction. However, this assumption is not always valid -- for example, upper relations can have their row estimates unset, resulting in a value of zero. This can cause an assertion failure in debug builds or lead to the tuple fraction being computed as infinity in production builds. To fix, use the row estimate from the cheapest_total path to compute the tuple fraction. The row estimate in this path should already have been forced to a valid value. In passing, update the comment for generate_orderedappend_paths() to note that the function also considers the cheapest-fractional case when not all tuples need to be retrieved. That is, it collects all the cheapest fractional paths and builds an ordered append path for each interesting ordering. Backpatch to v18, where this issue was introduced. Bug: #19102 Reported-by: Kuntal Ghosh <kuntalghosh.2007@gmail.com> Author: Richard Guo <guofenglinux@gmail.com> Reviewed-by: Kuntal Ghosh <kuntalghosh.2007@gmail.com> Reviewed-by: Andrei Lepikhov <lepihov@gmail.com> Discussion: https://postgr.es/m/19102-93480667e1200169@postgresql.org Backpatch-through: 18	2025-11-05 18:09:21 +09:00
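The failure mode in the entry above comes down to dividing an absolute tuple count by a row estimate that may be zero. A minimal sketch, assuming IEEE 754 floating point and using hypothetical `Sketch` structures that only loosely mirror the planner's relation and path nodes:

```c
#include <assert.h>

/* Hypothetical, heavily simplified planner structures. */
typedef struct SketchPath
{
    double rows;                /* assumed already forced to a valid value */
} SketchPath;

typedef struct SketchRel
{
    double     rows;            /* may be left unset (0) for upper relations */
    SketchPath cheapest_total;
} SketchRel;

/*
 * Problematic form: with rel->rows == 0 the quotient is +infinity on
 * IEEE 754 platforms (and an assertion on rows > 0 would fire first in
 * an assert-enabled build).
 */
static double
sketch_fraction_from_rel(const SketchRel *rel, double tuple_count)
{
    return tuple_count / rel->rows;
}

/* Form described by the fix: use the path's row estimate instead. */
static double
sketch_fraction_from_path(const SketchRel *rel, double tuple_count)
{
    assert(rel->cheapest_total.rows > 0);
    return tuple_count / rel->cheapest_total.rows;
}
```

As in the planner, a result of 1.0 or more would then be treated as "fetch everything"; that interpretation step is left out of the sketch.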
Michael Paquier	a4fd971c6f	Fix timing-dependent failure in recovery test 004_timeline_switch The test introduced by `17b2d5ec75` verifies that a WAL receiver survives across a timeline jump by searching the server logs for termination messages. However, it called restart() before the timeline switch, which kills the WAL receiver and may log the exact message being checked, hence failing the test. As TAP tests reuse the same log file across restarts, a rotate_logfile() is used before the restart so that the log matching check is not impacted by log entries generated by a previous shutdown. Recent changes to file handle inheritance altered I/O timing enough to make this fail consistently while testing another patch. While on it, this adds an extra check based on a PID comparison. This test may lead to false positives as it could be possible that the WAL receiver has processed a timeline jump before the initial PID is grabbed, but it should be good enough in most cases. Like `17b2d5ec75`, backpatch down to v13. Author: Bryan Green <dbryan.green@gmail.com> Co-authored-by: Xuneng Zhou <xunengzhou@gmail.com> Discussion: https://postgr.es/m/9d00b597-d64a-4f1e-802e-90f9dc394c70@gmail.com Backpatch-through: 13	2025-11-05 16:48:19 +09:00
Amit Kapila	5509055d69	Add sequence synchronization for logical replication. This patch introduces sequence synchronization. Sequences that are synced will have 2 states: - INIT (needs [re]synchronizing) - READY (is already synchronized) A new sequencesync worker is launched as needed to synchronize sequences. A single sequencesync worker is responsible for synchronizing all sequences. It begins by retrieving the list of sequences that are flagged for synchronization, i.e., those in the INIT state. These sequences are then processed in batches, allowing multiple entries to be synchronized within a single transaction. The worker fetches the current sequence values and page LSNs from the remote publisher, updates the corresponding sequences on the local subscriber, and finally marks each sequence as READY upon successful synchronization. Sequence synchronization occurs in 3 places: 1) CREATE SUBSCRIPTION - The command syntax remains unchanged. - The subscriber retrieves sequences associated with publications. - Published sequences are added to pg_subscription_rel with INIT state. - Initiate the sequencesync worker to synchronize all sequences. 2) ALTER SUBSCRIPTION ... REFRESH PUBLICATION - The command syntax remains unchanged. - Dropped published sequences are removed from pg_subscription_rel. - Newly published sequences are added to pg_subscription_rel with INIT state. - Initiate the sequencesync worker to synchronize only newly added sequences. 3) ALTER SUBSCRIPTION ... REFRESH SEQUENCES - A new command introduced for PG19 by `f0b3573c3a`. - All sequences in pg_subscription_rel are reset to INIT state. - Initiate the sequencesync worker to synchronize all sequences. - Unlike "ALTER SUBSCRIPTION ... REFRESH PUBLICATION" command, addition and removal of missing sequences will not be done in this case. 
Author: Vignesh C <vignesh21@gmail.com> Reviewed-by: shveta malik <shveta.malik@gmail.com> Reviewed-by: Hou Zhijie <houzj.fnst@fujitsu.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com> Reviewed-by: Peter Smith <smithpb2250@gmail.com> Reviewed-by: Nisha Moond <nisha.moond412@gmail.com> Reviewed-by: Shlok Kyal <shlok.kyal.oss@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CAA4eK1LC+KJiAkSrpE_NwvNdidw9F2os7GERUeSxSKv71gXysQ@mail.gmail.com	2025-11-05 05:59:58 +00:00
Michael Paquier	1fd981f053	Drop unnamed portal immediately after execution to completion Previously, unnamed portals were kept until the next Bind message or the end of the transaction. This could cause temporary files to persist longer than expected and make logging not reflect the actual SQL responsible for the temporary file. This patch changes exec_execute_message() to drop unnamed portals immediately after execution to completion at the end of an Execute message, making their removal more aggressive. This forces temporary file cleanups to happen at the same time as the completion of the portal execution, with statement logging correctly reflecting to which statements these temporary files were attached to (see the diffs in the TAP test updated by this commit for an idea). The documentation is updated to describe the lifetime of unnamed portals, and test cases are updated to verify temporary file removal and proper statement logging after unnamed portal execution. This changes how unnamed portals are handled in the protocol, hence no backpatch is done. Author: Frédéric Yhuel <frederic.yhuel@dalibo.com> Co-Authored-by: Sami Imseih <samimseih@gmail.com> Co-Authored-by: Mircea Cadariu <cadariu.mircea@gmail.com> Discussion: https://postgr.es/m/CAA5RZ0tTrTUoEr3kDXCuKsvqYGq8OOHiBwoD-dyJocq95uEOTQ%40mail.gmail.com	2025-11-05 14:35:16 +09:00
Richard Guo	59dec6c0b0	Fix comments for ChangeVarNodes() and related functions The comment for ChangeVarNodes() refers to a parameter named change_RangeTblRef, which does not exist in the code. The comment for ChangeVarNodesExtended() contains an extra space, while the comment for replace_relid_callback() has an awkward line break and a typo. This patch fixes these issues and revises some sentences for smoother wording. Oversights in commits `ab42d643c` and `fc069a3a6`. Author: Richard Guo <guofenglinux@gmail.com> Discussion: https://postgr.es/m/CAMbWs480j16HC1JtjKCgj5WshivT8ZJYkOfTyZAM0POjFomJkg@mail.gmail.com Backpatch-through: 18	2025-11-05 12:29:31 +09:00
Michael Paquier	2fc3107962	Add assertions checking for the startup process in WAL replay routines These assertions may prove to become useful to make sure that no process other than the startup process calls the routines where these checks are added, as we expect that these do not interfere with a WAL receiver switched to a "stopping" state by a startup process. The assumption that only the startup process can use this code has existed for many years, without a check enforcing it. Reviewed-by: Xuneng Zhou <xunengzhou@gmail.com> Discussion: https://postgr.es/m/aQmGeVLYl51y1m_0@paquier.xyz	2025-11-05 10:41:50 +09:00
Andres Freund	dae00f333b	aio: Improve assertions related to io_method First, the assertions in assign_io_method() were the wrong way round. Second, the lengthof() assertion checked the length of io_method_options, which is the wrong array to check and is always longer than pgaio_method_ops_table. While at it, add a static assert to ensure pgaio_method_ops_table and io_method_options stay in sync. Per coverity and Tom Lane. Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Backpatch-through: 18	2025-11-04 20:03:53 -05:00
Andres Freund	2d83d729d5	jit: Fix accidentally-harmless type confusion In `2a0faed9d7`, which added JIT compilation support for expressions, I accidentally used sizeof(LLVMBasicBlockRef *) instead of sizeof(LLVMBasicBlockRef) as part of computing the size of an allocation. That turns out to have no real negative consequences due to LLVMBasicBlockRef being a pointer itself (and thus having the same size). It still is wrong and confusing, so fix it. Reported by coverity. Backpatch-through: 13	2025-11-04 20:03:53 -05:00
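The sizeof mix-up described in `2d83d729d5` can be illustrated with a minimal sketch. DemoBlockRef is a hypothetical stand-in for LLVMBasicBlockRef (assumed here to be an opaque pointer typedef), and both helper functions are invented for illustration; this is not the code touched by the commit.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for LLVMBasicBlockRef: an opaque pointer typedef. */
typedef struct DemoOpaqueBlock *DemoBlockRef;

/*
 * Intended form: each array element is a DemoBlockRef, so the allocation
 * is sized by the element type, not by a pointer to it.
 */
static DemoBlockRef *
alloc_block_array(size_t nblocks)
{
	return calloc(nblocks, sizeof(DemoBlockRef));
}

/*
 * Returns 1 when the mistaken element size, sizeof(DemoBlockRef *), equals
 * the intended one. On mainstream platforms all object pointers share one
 * size, which is why the mistake is harmless in practice.
 */
static int
pointer_sizes_coincide(void)
{
	return sizeof(DemoBlockRef *) == sizeof(DemoBlockRef);
}
```

Sizing by the element type keeps the allocation correct even if the typedef later stops being a pointer.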
Jeff Davis	d115de9d89	Special case C_COLLATION_OID in pg_newlocale_from_collation(). Allow pg_newlocale_from_collation(C_COLLATION_OID) to work even if there's no catalog access, which some extensions expect. Not known to be a bug without extensions involved, but backport to 18. Also corrects an issue in master with dummy_c_locale (introduced in commit `5a38104b36`) where deterministic was not set. That wasn't a bug, but could have been if that structure was used more widely. Reported-by: Alexander Kukushkin <cyberdemn@gmail.com> Reviewed-by: Alexander Kukushkin <cyberdemn@gmail.com> Discussion: https://postgr.es/m/CAFh8B=nj966ECv5vi_u3RYij12v0j-7NPZCXLYzNwOQp9AcPWQ@mail.gmail.com Backpatch-through: 18	2025-11-04 16:48:16 -08:00
Masahiko Sawada	8ae0f6a0c3	Add CHECK_FOR_INTERRUPTS in Evict{Rel,All}UnpinnedBuffers. This commit adds CHECK_FOR_INTERRUPTS to the shared buffer iteration loops in EvictRelUnpinnedBuffers and EvictAllUnpinnedBuffers. These functions, used by pg_buffercache's pg_buffercache_evict_relation and pg_buffercache_evict_all, can now be interrupted during long-running operations. Backpatch to version 18, where these functions and their corresponding pg_buffercache functions were introduced. Author: Yuhang Qiu <iamqyh@gmail.com> Discussion: https://postgr.es/m/8DC280D4-94A2-4E7B-BAB9-C345891D0B78%40gmail.com Backpatch-through: 18	2025-11-04 15:47:25 -08:00
David Rowley	fdda78e361	Fix possible usage of incorrect UPPERREL_SETOP RelOptInfo `03d40e4b5` allowed dummy UNION [ALL] children to be removed from the plan by checking for is_dummy_rel(). That commit neglected to still account for the relids from the dummy rel so that the correct UPPERREL_SETOP RelOptInfo could be found and used for adding the Paths to. Not doing this could result in processing of subsequent UNIONs using the same RelOptInfo as a previously processed UNION, which could result in add_path() freeing old Paths that are needed by the previous UNION. The same fix was independently submitted (2 mins later) by Richard Guo. Reported-by: Alexander Lakhin <exclusion@gmail.com> Author: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/bee34aec-659c-46f1-9ab7-7bbae0b7616c@gmail.com	2025-11-05 11:48:09 +13:00
Álvaro Herrera	0a3d27bfe0	Fix snapshot handling bug in recent BRIN fix Commit `a95e3d84c0` added ActiveSnapshot push+pop when processing work-items (BRIN autosummarization), but forgot to handle the case of a transaction failing during the run, which drops the snapshot untimely. Fix by making the pop conditional on an element being actually there. Author: Álvaro Herrera <alvherre@kurilemu.de> Backpatch-through: 13 Discussion: https://postgr.es/m/202511041648.nofajnuddmwk@alvherre.pgsql	2025-11-04 20:31:43 +01:00
Tomas Vondra	1213cb4753	Trim TIDs during parallel GIN builds more eagerly The parallel GIN builds perform "freezing" of TID lists when merging chunks built earlier. This means determining what part of the list can no longer change, depending on the last received chunk. The frozen part can be evicted from memory and written out. The code attempted to freeze items right before merging the old and new TID list, after already attempting to trim the current buffer. That means part of the data may get frozen based on the new TID list, but will be trimmed later (on next loop). This increases memory usage. This inverts the order, so that we freeze data first (before trimming). The benefits are likely relatively small, but it's also virtually free with no other downsides. Discussion: https://postgr.es/m/CAHLJuCWDwn-PE2BMZE4Kux7x5wWt_6RoWtA0mUQffEDLeZ6sfA@mail.gmail.com	2025-11-04 20:06:01 +01:00
Masahiko Sawada	6d2ff1de4d	psql: Add tab completion for COPY ... PROGRAM. This commit adds tab completion support for COPY TO PROGRAM and COPY FROM PROGRAM syntax in psql. Author: Yugo Nagata <nagata@sraoss.co.jp> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Discussion: https://postgr.es/m/20250605100835.b396f9d656df1018f65a4556@sraoss.co.jp	2025-11-04 10:51:39 -08:00
Masahiko Sawada	02fd47dbfa	psql: Improve tab completion for COPY ... STDIN/STDOUT. This commit enhances tab completion for both COPY FROM and COPY TO commands to suggest STDIN and STDOUT, respectively. To make suggesting both file names and keywords easier, it introduces a new COMPLETE_WITH_FILES_PLUS() macro. Author: Yugo Nagata <nagata@sraoss.co.jp> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Discussion: https://postgr.es/m/20250605100835.b396f9d656df1018f65a4556@sraoss.co.jp	2025-11-04 10:40:58 -08:00
Andres Freund	be9efd4929	ci: debian: Switch to Debian Trixie release Debian Trixie CI images are generated now [1], so use them with the following changes: - detect_stack_use_after_return=0 option is added to the ASAN_OPTIONS because ASAN uses a "shadow stack" to track stack variable lifetimes and this confuses Postgres' stack depth check [2]. - Perl is updated to the newer version (perl5.40-i386-linux-gnu). - LLVM-14 is no longer default installation, no need to force using LLVM-16. - Switch MinGW CC/CXX to x86_64-w64-mingw32ucrt-* to fix build failure from missing _iswctype_l in mingw-w64 v12 headers. [1] https://github.com/anarazel/pg-vm-images/commit/35a144793f [2] https://postgr.es/m/20240130212304.q66rquj5es4375ab%40awork3.anarazel.de Author: Nazir Bilal Yavuz <byavuz81@gmail.com> Discussion: https://postgr.es/m/CAN55FZ1_B1usTskAv+AYt1bA7abVd9YH6XrUUSbr-2Z0d5Wd8w@mail.gmail.com Backpatch: 15-, where CI support was added	2025-11-04 13:25:22 -05:00
Tomas Vondra	c98dffcb7c	Limit the size of TID lists during parallel GIN build When building intermediate TID lists during parallel GIN builds, split the sorted lists into smaller chunks, to limit the amount of memory needed when merging the chunks later. The leader may need to keep in memory up to one chunk per worker, and possibly one extra chunk (before evicting some of the data). The code processing item pointers uses regular palloc/repalloc calls, which means it's subject to the MaxAllocSize (1GB) limit. We could fix this by allowing huge allocations, but that'd require changes in many places without much benefit. Larger chunks do not actually improve performance, so the memory usage would be wasted. Fixed by limiting the chunk size to not hit MaxAllocSize. Each worker gets a fair share. This requires remembering the number of participating workers, in a place that can be accessed from the callback. Luckily, the bs_worker_id field in GinBuildState was unused, so repurpose that. Report by Greg Smith, investigation and fix by me. Backpatched to 18, where parallel GIN builds were introduced. Reported-by: Gregory Smith <gregsmithpgsql@gmail.com> Discussion: https://postgr.es/m/CAHLJuCWDwn-PE2BMZE4Kux7x5wWt_6RoWtA0mUQffEDLeZ6sfA@mail.gmail.com Backpatch-through: 18	2025-11-04 18:51:17 +01:00
Jeff Davis	4bfaea11d2	Remove redundant memset() introduced by `a0942f4`. Reported-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CAEoWx2kAkNaDa01O0nKsQmkfEmxsDvm09SU=f1T0CV8ew3qJEA@mail.gmail.com	2025-11-04 09:46:00 -08:00
Tom Lane	ff4597acd4	Allow "SET list_guc TO NULL" to specify setting the GUC to empty. We have never had a SET syntax that allows setting a GUC_LIST_INPUT parameter to be an empty list. A locution such as SET search_path = ''; doesn't mean that; it means setting the GUC to contain a single item that is an empty string. (For search_path the net effect is much the same, because search_path ignores invalid schema names and '' must be invalid.) This is confusing, not least because configuration-file entries and the set_config() function can easily produce empty-list values. We considered making the empty-string syntax do this, but that would foreclose ever allowing empty-string items to be valid in list GUCs. While there isn't any obvious use-case for that today, it feels like the kind of restriction that might hurt someday. Instead, let's accept the forbidden-up-to-now value NULL and treat that as meaning an empty list. (An objection to this could be "what if we someday want to allow NULL as a GUC value?". That seems unlikely though, and even if we did allow it for scalar GUCs, we could continue to treat it as meaning an empty list for list GUCs.) Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Andrei Klychkov <andrew.a.klychkov@gmail.com> Reviewed-by: Jim Jones <jim.jones@uni-muenster.de> Discussion: https://postgr.es/m/CA+mfrmwsBmYsJayWjc8bJmicxc3phZcHHY=yW5aYe=P-1d_4bg@mail.gmail.com	2025-11-04 12:37:40 -05:00
Álvaro Herrera	93b7ab5b4b	Have psql's "\? variables" show csv_fieldsep Accidental omission in commit `aa2ba50c2c`. There are too many lists of these variables ... Discussion: https://postgr.es/m/202511031738.eqaeaedpx5cr@alvherre.pgsql	2025-11-04 17:30:44 +01:00
Peter Eisentraut	040cc5f3c7	Tighten check for generated column in partition key expression A generated column may end up being part of the partition key expression, if it's specified as an expression e.g. "(<generated column name>)" or if the partition key expression contains a whole-row reference, even though we do not allow a generated column to be part of partition key expression. Fix this hole. Co-authored-by: jian he <jian.universality@gmail.com> Co-authored-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com> Discussion: https://www.postgresql.org/message-id/flat/CACJufxF%3DWDGthXSAQr9thYUsfx_1_t9E6N8tE3B8EqXcVoVfQw%40mail.gmail.com	2025-11-04 14:46:58 +01:00
Álvaro Herrera	a95e3d84c0	BRIN autosummarization may need a snapshot It's possible to define BRIN indexes on functions that require a snapshot to run, but the autosummarization feature introduced by commit `7526e10224` fails to provide one. This causes autovacuum to leave a BRIN placeholder tuple behind after a failed work-item execution, making such indexes less efficient. Repair by obtaining a snapshot prior to running the task, and add a test to verify this behavior. Author: Álvaro Herrera <alvherre@kurilemu.de> Reported-by: Giovanni Fabris <giovanni.fabris@icon.it> Reported-by: Arthur Nascimento <tureba@gmail.com> Backpatch-through: 13 Discussion: https://postgr.es/m/202511031106.h4fwyuyui6fz@alvherre.pgsql	2025-11-04 13:23:26 +01:00
Peter Eisentraut	c09a06918d	Error message stylistic correction Fixup for commit `ef5e60a9d3`: The inconsistent use of articles was a bit awkward.	2025-11-04 12:25:04 +01:00
Michael Paquier	861af92610	libpq: Improve error handling in passwordFromFile() Previously, passwordFromFile() returned NULL for valid cases (like no matching password found) and actual errors (two out-of-memory paths). This made it impossible for its sole caller, pqConnectOptions2(), to distinguish between these scenarios and fail the connection appropriately should an out-of-memory error occur. This patch extends passwordFromFile() to be able to detect both valid and failure cases, with an error string given back to the caller of the function. Out-of-memory failures are unlikely to happen in the field, so no backpatch is done. Author: Joshua Shanks <jjshanks@gmail.com> Discussion: https://postgr.es/m/CAOxqWDfihFRmhNVdfu8epYTXQRxkCHSOrg+=-ij2c_X3gW=o3g@mail.gmail.com	2025-11-04 20:12:48 +09:00
Álvaro Herrera	ad1581d7fe	Use USECS_PER_SEC from datatype/timestamp.h We had two places defining their own constants for this. Author: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Peter Smith <smithpb2250@gmail.com> Discussion: https://postgr.es/m/202510311750.mxiykx3tp4mx@alvherre.pgsql	2025-11-04 10:07:54 +01:00
Michael Paquier	65f4976189	Add assertion check for WAL receiver state during stream-archive transition When the startup process switches from streaming to archive as WAL source, we avoid calling ShutdownWalRcv() if the WAL receiver is not streaming, based on WalRcvStreaming(). WALRCV_STOPPING is a state set by ShutdownWalRcv(), called only by the startup process, meaning that it should not be possible to reach this state while in WaitForWALToBecomeAvailable(). This commit adds an assertion to make sure that a WAL receiver is never in a WALRCV_STOPPING state should the startup process attempt to reset InstallXLogFileSegmentActive. Idea suggested by Noah Misch. Author: Xuneng Zhou <xunengzhou@gmail.com> Discussion: https://postgr.es/m/19093-c4fff49a608f82a0@postgresql.org	2025-11-04 13:14:46 +09:00
Michael Paquier	e0ca61e7c4	Add WalRcvGetState() to retrieve the state of a WAL receiver This has come up as a useful alternative to WalRcvStreaming(), to be able to do sanity checks based on the state of a WAL receiver. This will be used in a follow-up commit. Author: Xuneng Zhou <xunengzhou@gmail.com> Discussion: https://postgr.es/m/19093-c4fff49a608f82a0@postgresql.org	2025-11-04 12:57:36 +09:00
Michael Paquier	17b2d5ec75	Fix unconditional WAL receiver shutdown during stream-archive transition Commit `b4f584f9d2` (affecting v15~, later backpatched down to 13 as of `3635a0a35a`) introduced an unconditional WAL receiver shutdown when switching from streaming to archive WAL sources. This causes problems during a timeline switch, when a WAL receiver enters WALRCV_WAITING state but remains alive, waiting for instructions. The unconditional shutdown can break some monitoring scenarios as the WAL receiver gets repeatedly terminated and re-spawned, causing pg_stat_wal_receiver.status to show a "streaming" status instead of "waiting", masking the fact that the WAL receiver is waiting for a new TLI and a new LSN to be able to continue streaming. This commit changes the WAL receiver behavior so that the shutdown becomes conditional, with InstallXLogFileSegmentActive being always reset to prevent the regression fixed by `b4f584f9d2`: only terminate the WAL receiver when it is actively streaming (WALRCV_STREAMING, WALRCV_STARTING, or WALRCV_RESTARTING). When in WALRCV_WAITING state, just reset the InstallXLogFileSegmentActive flag to allow archive restoration without killing the process. WALRCV_STOPPED and WALRCV_STOPPING are not reachable states in this code path. For the latter, the startup process is the one in charge of setting WALRCV_STOPPING via ShutdownWalRcv(), waiting for the WAL receiver to reach a WALRCV_STOPPED state after switching walRcvState, so WaitForWALToBecomeAvailable() cannot be reached while a WAL receiver is in a WALRCV_STOPPING state. A regression test is added to check that a WAL receiver is not stopped on timeline jump; it fails when the fix of this commit is reverted. 
Reported-by: Ryan Bird <ryanzxg@gmail.com> Author: Xuneng Zhou <xunengzhou@gmail.com> Reviewed-by: Noah Misch <noah@leadboat.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/19093-c4fff49a608f82a0@postgresql.org Backpatch-through: 13	2025-11-04 10:47:38 +09:00
Noah Misch	8b18ed6dfb	Doc: cover index CONCURRENTLY causing errors in INSERT ... ON CONFLICT. Author: Mikhail Nikalayeu <mihailnikalayeu@gmail.com> Reviewed-by: Noah Misch <noah@leadboat.com> Discussion: https://postgr.es/m/CANtu0ojXmqjmEzp-=aJSxjsdE76iAsRgHBoK0QtYHimb_mEfsg@mail.gmail.com Backpatch-through: 13	2025-11-03 12:57:09 -08:00
Masahiko Sawada	e7ccb247b3	Fix outdated comment of COPY in gram.y. Author: ChangAo Chen <cca5507@qq.com> Discussion: https://postgr.es/m/tencent_392C0E92EC52432D0A336B9D52E66426F009@qq.com	2025-11-03 10:34:49 -08:00
Álvaro Herrera	645cb44c54	Add \pset options for boolean value display New \pset variables display_true and display_false allow the user to change how true and false values are displayed. Author: David G. Johnston <David.G.Johnston@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/CAKFQuwYts3vnfQ5AoKhEaKMTNMfJ443MW2kFswKwzn7fiofkrw@mail.gmail.com Discussion: https://postgr.es/m/56308F56.8060908@joh.to	2025-11-03 17:40:39 +01:00
Álvaro Herrera	cf8be02253	Prevent setting a column as identity if its not-null constraint is invalid We don't allow null values to appear in identity-generated columns in other ways, so we shouldn't let unvalidated not-null constraints do it either. Oversight in commit `a379061a22`. Author: jian he <jian.universality@gmail.com> Backpatch-through: 18 Discussion: https://postgr.es/m/CACJufxGQM_+vZoYJMaRoZfNyV=L2jxosjv_0TLAScbuLJXWRfQ@mail.gmail.com	2025-11-03 15:58:19 +01:00
Álvaro Herrera	f242dbcede	Remove WaitPMResult enum in pg_createsubscriber A simple boolean suffices. This is cosmetic, so no backpatch. Author: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/202510311750.mxiykx3tp4mx@alvherre.pgsql	2025-11-03 12:59:32 +01:00
Michael Paquier	ad25744f43	Add wal_fpi_bytes to VACUUM and ANALYZE logs The new wal_fpi_bytes counter calculates the total amount of full page images inserted in WAL records, in bytes. This commit adds this information to VACUUM and ANALYZE logs alongside the existing counters, building upon `f9a09aa295`. Author: Shinya Kato <shinya11.kato@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/aQMMSSlFXy4Evxn3@paquier.xyz	2025-11-03 19:42:03 +09:00
Peter Eisentraut	fce7c73fba	Sort guc_parameters.dat alphabetically by name The order in this list was previously pretty random and had grown organically over time. This made it unnecessarily cumbersome to maintain these lists, as there were no clear guidelines about where to put new entries. Also, after the merger of the type-specific GUC structs, the list still reflected the previous type-specific super-order. By using alphabetical order, the place for new entries becomes clear, and often related entries will be listed close together. This patch reorders the existing entries in guc_parameters.dat, and it also augments the generation script to error if an entry is found at the wrong place. Note: The order is actually checked after lower-casing, to handle the likes of "DateStyle". Reviewed-by: John Naylor <johncnaylorls@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Discussion: https://www.postgresql.org/message-id/flat/8fdfb91e-60fb-44fa-8df6-f5dea47353c9@eisentraut.org	2025-11-03 10:04:14 +01:00
Tom Lane	8f29467c57	Change "long" numGroups fields to be Cardinality (i.e., double). We've been nibbling away at removing uses of "long" for a long time, since its width is platform-dependent. Here's one more: change the remaining "long" fields in Plan nodes to Cardinality, since the three surviving examples all represent group-count estimates. The upstream planner code was converted to Cardinality some time ago; for example the corresponding fields in Path nodes are type Cardinality, as are the arguments of the make_foo_path functions. Downstream in the executor, it turns out that these all feed to the table-size argument of BuildTupleHashTable. Change that to "double" as well, and fix it so that it safely clamps out-of-range values to the uint32 limit of simplehash.h, as was not being done before. Essentially, this is removing all the artificial datatype-dependent limitations on these values from upstream processing, and applying just one clamp at the moment where we're forced to do so by the datatype choices of simplehash.h. Also, remove BuildTupleHashTable's misguided attempt to enforce work_mem/hash_mem_limit. It doesn't have enough information (particularly not the expected tuple width) to do that accurately, and it has no real business second-guessing the caller's choice. For all these plan types, it's really the planner's responsibility to not choose a hashed implementation if the hashtable is expected to exceed hash_mem_limit. The previous patch improved the accuracy of those estimates, and even if BuildTupleHashTable had more information it should arrive at the same conclusions. Reported-by: Jeff Janes <jeff.janes@gmail.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/CAMkU=1zia0JfW_QR8L5xA2vpa0oqVuiapm78h=WpNsHH13_9uw@mail.gmail.com	2025-11-02 16:57:43 -05:00
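The single clamp described in `8f29467c57` can be sketched as follows. The function name is hypothetical and this is not the code from the commit; it only illustrates clamping a double-valued cardinality estimate to the uint32 range at the point where a fixed-width type is finally required.

```c
#include <stdint.h>

/*
 * Hypothetical helper: convert a double-valued entry-count estimate to
 * uint32, clamping out-of-range values instead of overflowing.
 */
static uint32_t
demo_clamp_cardinality(double nentries)
{
	/* Catches negatives, zero, and NaN (any comparison with NaN is false). */
	if (!(nentries > 0.0))
		return 0;
	/* Anything at or above the uint32 limit saturates. */
	if (nentries >= (double) UINT32_MAX)
		return UINT32_MAX;
	/* In range: the conversion truncates toward zero. */
	return (uint32_t) nentries;
}
```

Keeping the estimate as a double until this point means upstream code never has to reason about platform-dependent integer widths.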
Tom Lane	1ea5bdb00b	Improve planner's estimates of tuple hash table sizes. For several types of plan nodes that use TupleHashTables, the planner estimated the expected size of the table as basically numEntries * (MAXALIGN(dataWidth) + MAXALIGN(SizeofHeapTupleHeader)). This is pretty far off, especially for small data widths, because it doesn't account for the overhead of the simplehash.h hash table nor for any per-tuple "additional space" the plan node may request. Jeff Janes noted a case where the estimate was off by about a factor of three, even though the obvious hazards such as inaccurate estimates of numEntries or dataWidth didn't apply. To improve matters, create functions provided by the relevant executor modules that can estimate the required sizes with reasonable accuracy. (We're still not accounting for effects like allocator padding, but this at least gets the first-order effects correct.) I added functions that can estimate the tuple table sizes for nodeSetOp and nodeSubplan; these rely on an estimator for TupleHashTables in general, and that in turn relies on one for simplehash.h hash tables. That feels like kind of a lot of mechanism, but if we take any short-cuts we're violating modularity boundaries. The other places that use TupleHashTables are nodeAgg, which took pains to get its numbers right already, and nodeRecursiveunion. I did not try to improve the situation for nodeRecursiveunion because there's nothing to improve: we are not making an estimate of the hash table size, and it wouldn't help us to do so because we have no non-hashed alternative implementation. On top of that, our estimate of the number of entries to be hashed in that module is so suspect that we'd likely often choose the wrong implementation if we did have two ways to do it. 
Reported-by: Jeff Janes <jeff.janes@gmail.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/CAMkU=1zia0JfW_QR8L5xA2vpa0oqVuiapm78h=WpNsHH13_9uw@mail.gmail.com	2025-11-02 16:57:26 -05:00
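The old size formula quoted in `1ea5bdb00b` can be worked through in a small sketch. The macro names are hypothetical, and the 8-byte alignment and 23-byte tuple header size are assumptions matching common 64-bit builds; they are not taken from the commit.

```c
#include <stddef.h>

/* Assumptions for illustration: 8-byte max alignment, 23-byte tuple header. */
#define DEMO_ALIGNOF 8
#define DEMO_MAXALIGN(len) \
	(((size_t) (len) + (DEMO_ALIGNOF - 1)) & ~((size_t) (DEMO_ALIGNOF - 1)))
#define DEMO_TUPLE_HEADER_SIZE 23

/*
 * Old-style estimate:
 *   numEntries * (MAXALIGN(dataWidth) + MAXALIGN(SizeofHeapTupleHeader))
 * It omits hash table overhead and any per-tuple additional space.
 */
static double
demo_old_hash_table_estimate(double num_entries, size_t data_width)
{
	return num_entries *
		(double) (DEMO_MAXALIGN(data_width) +
				  DEMO_MAXALIGN(DEMO_TUPLE_HEADER_SIZE));
}
```

Under these assumptions, 1000 entries of width 4 come out at 1000 * (8 + 24) = 32000 bytes. Since the formula leaves out hash table overhead, it makes sense that the error is largest for small data widths, as the commit message notes.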
Peter Geoghegan	b8f1c62807	Document nbtree row comparison design. Add comments explaining when and where it is safe for nbtree to treat row compare keys as if they were simple scalar inequality keys on the row's most significant column. This is particularly important within _bt_advance_array_keys, which deals with required inequality keys in a general and uniform way, without any special handling for row compares. Also spell out the implications of _bt_check_rowcompare's approach of _conditionally_ evaluating lower-order row compare subkeys, particularly when one of its lower-order subkeys might see NULL index tuple values (these may or may not affect whether the qual as a whole is satisfied). The behavior in this area isn't particularly intuitive, so these issues seem worth going into. In passing, add a few more defensive/documenting row comparison related assertions to _bt_first and _bt_check_rowcompare. Follow-up to commits `bd3f59fd` and `ec986020`. Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: Victor Yegorov <vyegorov@gmail.com> Reviewed-By: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CAH2-Wznwkak_K7pcAdv9uH8ZfNo8QO7+tHXOaCUddMeTfaCCFw@mail.gmail.com Backpatch-through: 18	2025-11-02 15:27:05 -05:00
Peter Geoghegan	4f08586c7a	Remove obsolete nbtree equality key comments. _bt_first reliably uses the same equality key (on each index column) for initial positioning purposes as the one that _bt_checkkeys can use to end the scan following commit `f09816a0`. _bt_first no longer applies its own independent rules to determine which initial positioning key to use on each column (for equality and inequality keys alike). Preprocessing is now fully in control of determining which keys start and end each scan, ensuring that _bt_first and _bt_checkkeys have symmetric behavior. Remove obsolete comments that described why _bt_first was expected to use at least one of the available required equality keys for initial positioning purposes. The rules in this area are now maximally strict and uniform, so there's no reason to draw attention to equality keys. Any column with a required equality key cannot have a redundant required inequality key (nor can it have a redundant required equality key). Oversight in commit `f09816a0`, which removed similar comments from _bt_first, but missed these comments. Author: Peter Geoghegan <pg@bowt.ie> Backpatch-through: 18	2025-11-02 13:34:18 -05:00
Tom Lane	645c1e2752	Avoid mixing void and integer in a conditional expression. The C standard says that the second and third arguments of a conditional operator shall be both void type or both not-void type. The Windows version of INTERRUPTS_PENDING_CONDITION() got this wrong. It's pretty harmless because the result of the operator is ignored anyway, but apparently recent versions of MSVC have started issuing a warning about it. Silence the warning by casting the dummy zero to void. Reported-by: Christian Ullrich <chris@chrullrich.net> Author: Bryan Green <dbryan.green@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/cc4ef8db-f8dc-4347-8a22-e7ebf44c0308@chrullrich.net Backpatch-through: 13	2025-11-02 12:30:44 -05:00
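The rule cited in `645c1e2752` can be shown with a minimal sketch. The macro and function names below are hypothetical and do not reproduce INTERRUPTS_PENDING_CONDITION(); the point is only that casting the dummy zero to void keeps both arms of the conditional operator void.

```c
#include <stdbool.h>

static int demo_dispatch_count = 0;

/* Hypothetical void-returning handler standing in for a dispatch routine. */
static void
demo_dispatch(void)
{
	demo_dispatch_count++;
}

/*
 * Writing "(cond) ? demo_dispatch() : 0" would mix a void arm with an int
 * arm. Casting the dummy zero to void makes both arms void.
 */
#define DEMO_DISPATCH_IF(cond) ((cond) ? demo_dispatch() : (void) 0)
```

The expression's result is discarded either way, so the cast changes no behavior and only removes the type mismatch.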
Tom Lane	b70cafd85f	Remove unused variable in recovery/t/006_logical_decoding.pl. Author: Daniil Davydov <3danissimo@gmail.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://postgr.es/m/CAJDiXggmZWew8+SY_9o0atpmaJmPTL25wdz07MrDoqCkp4D1ug@mail.gmail.com	2025-11-01 14:01:52 -04:00
Tom Lane	ff8aba65d4	Fix contrib/ltree's subpath() with negative offset. subpath(ltree,offset,len) now correctly errors when given an offset less than -n, where n is the number of labels in the given ltree. There was a duplicate block of code that allowed an offset as low as -2n. The documentation says no such thing, so this must have been a copy-and-paste error in the original ltree patch. While here, avoid redundant calculation of "end" and write LTREE_MAX_LEVELS rather than its hard-coded value. Author: Marcus Gartner <m.a.gartner@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAAUGV_SvBO9gWYbaejb9nhe-mS9FkNP4QADNTdM3wdRhvLobwA@mail.gmail.com	2025-11-01 13:25:42 -04:00
Álvaro Herrera	2648eab377	pg_createsubscriber: reword dry-run log messages The original messages were confusing in dry-run mode in that they state that something is being done, when in reality it isn't. Use alternative wording in that case, to make the distinction clear. Author: Peter Smith <smithpb2250@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Euler Taveira <euler@eulerto.com> Backpatch-through: 18 Discussion: https://postgr.es/m/CAHut+PsvQJQnQO0KT0S2oegenkvJ8FUuY-QS5syyqmT24R2xFQ@mail.gmail.com	2025-10-31 18:49:50 +01:00
Álvaro Herrera	11144915e1	pg_createsubscriber: Fix error complaining about the wrong thing The code updates the system identifier, then runs pg_resetwal; if the latter fails, it complains about the former, which makes no sense. Change the error message to complain about the right thing. Noticed while reviewing a patch touching nearby code. Author: Álvaro Herrera <alvherre@kurilemu.de> Backpatch-through: 17	2025-10-31 17:43:15 +01:00
Peter Eisentraut	8a27d418f8	Mark function arguments of type "Datum *" as "const Datum *" where possible Several functions in the codebase accept "Datum *" parameters but do not modify the pointed-to data. These have been updated to take "const Datum *" instead, improving type safety and making the interfaces clearer about their intent. This change helps the compiler catch accidental modifications and better documents immutability of arguments. Most "Datum *" parameters have a paired "bool *isnull" parameter; these are constified as well. No functional behavior is changed by this patch. Author: Chao Li <lic@highgo.com> Discussion: https://www.postgresql.org/message-id/flat/CAEoWx2msfT0knvzUa72ZBwu9LR_RLY4on85w2a9YpE-o2By5HQ@mail.gmail.com	2025-10-31 10:47:25 +01:00
Peter Eisentraut	aa4535307e	formatting.c cleanup: Change fill_str() return type to void The return value is not used anywhere. In passing, add a comment explaining the function's arguments. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/6dd9d208-a3ed-49b5-b03d-8617261da973%40eisentraut.org	2025-10-31 09:55:12 +01:00
Peter Eisentraut	da2052ab9a	formatting.c cleanup: Rename DCH_S_* to DCH_SUFFIX_* For clarity. Also rename several related macros and turn them into inline functions. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/6dd9d208-a3ed-49b5-b03d-8617261da973%40eisentraut.org	2025-10-31 08:06:46 +01:00
Peter Eisentraut	378212c68a	formatting.c cleanup: Change several int fields to enums This makes their purpose more self-documenting. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/6dd9d208-a3ed-49b5-b03d-8617261da973%40eisentraut.org	2025-10-31 08:06:46 +01:00
Peter Eisentraut	ce5f6817e4	formatting.c cleanup: Change TmFromChar.clock field to bool This makes the purpose clearer and avoids having two extra symbols, one of which (CLOCK_24_HOUR) was unused. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/6dd9d208-a3ed-49b5-b03d-8617261da973%40eisentraut.org	2025-10-31 08:06:46 +01:00
Michael Paquier	c9e38a569c	Add test tracking WAL receiver shutdown for primary_conninfo updates The test introduced by this commit checks that a reload of primary_conninfo leads to a restart of the WAL receiver, by looking at the request generated in the server logs. This is something for which there was no coverage. This has come up for a different patch, while discussing a regression where a WAL receiver should not be stopped while waiting for a new position to stream, like at the end of a timeline. In the case of the other patch, we want to check that this log entry is not generated, but if the error message is reworded the test would become silently broken. The test of this commit ensures that we at least keep track of the log message format, for a supported scenario. Extracted from a larger patch by the same author. Author: Xuneng Zhou <xunengzhou@gmail.com> Discussion: https://postgr.es/m/aQKlC1v2_MXGV6_9@paquier.xyz	2025-10-31 11:24:24 +09:00
Bruce Momjian	3896e861b3	doc: rewrite random_page_cost description This removes some of the specifics of how the default was set, and adds a mention of latency as a reason the value is lower than the storage hardware might suggest. It still mentions caching. Discussion: https://postgr.es/m/CAKAnmmK_nSPYr53LobUwQD59a-8U9GEC3XGJ43oaTYJq5nAOkw@mail.gmail.com Backpatch-through: 13	2025-10-30 19:11:53 -04:00
Andres Freund	9c398fdf48	ci: macos: Upgrade to Sequoia Author: Nazir Bilal Yavuz <byavuz81@gmail.com> Discussion: https://postgr.es/m/CAN55FZ3kO4vLq56PWrfJ7Fw6Wz8DhEN9j9GX3aScx%2BWOirtK-g%40mail.gmail.com Backpatch: 15-, where CI support was added	2025-10-30 16:08:21 -04:00
Andres Freund	0a8a4be866	ci: Fix Windows and MinGW task names They use Windows Server 2022, not 2019. Author: Nazir Bilal Yavuz <byavuz81@gmail.com> Discussion: https://postgr.es/m/flat/CAN55FZ1OsaM+852BMQDJ+Kgfg+07knJ6dM3PjbGbtYaK4qwfqA@mail.gmail.com	2025-10-30 13:06:42 -04:00
Tom Lane	c106ef0807	Use BumpContext contexts in TupleHashTables, and do some code cleanup. For all extant uses of TupleHashTables, execGrouping.c itself does nothing with the "tablecxt" except to allocate new hash entries in it, and the callers do nothing with it except to reset the whole context. So this is an ideal use-case for a BumpContext, and the hash tables are frequently big enough for the savings to be significant. (Commit `cc721c459` already taught nodeAgg.c this idea, but neglected the other callers of BuildTupleHashTable.) While at it, let's clean up some ill-advised leftovers from rebasing TupleHashTables on simplehash.h: * Many comments and variable names were based on the idea that the tablecxt holds the whole TupleHashTable, whereas now it only holds the hashed tuples (plus any caller-defined "additional storage"). Rename to names like tuplescxt and tuplesContext, and adjust the comments. Also adjust the memory context names to be like "<Foo> hashed tuples". * Make ResetTupleHashTable() reset the tuplescxt rather than relying on the caller to do so; that was fairly bizarre and seems like a recipe for leaks. This is less efficient in the case where nodeAgg.c uses the same tuplescxt for several different hashtables, but only microscopically so because mcxt.c will short-circuit the extra resets via its isReset flag. I judge the extra safety and intellectual cleanliness well worth those few cycles. * Remove the long-obsolete "allow_jit" check added by ac88807f9; instead, just Assert that metacxt and tuplescxt are different. We need that anyway for this definition of ResetTupleHashTable() to be safe. There is a side issue of the extent to which this change invalidates the planner's estimates of hashtable memory consumption. However, those estimates are already pretty bad, so improving them seems like it can be a separate project. This change is useful to do first to establish consistent executor behavior that the planner can expect. 
A loose end not addressed here is that the "entrysize" calculation in BuildTupleHashTable seems wrong: "sizeof(TupleHashEntryData) + additionalsize" corresponds neither to the size of the simplehash entries nor to the total space needed per tuple. It's questionable why BuildTupleHashTable is second-guessing its caller's nbuckets choice at all, since the original source of the number should have had more information. But that all seems wrapped up with the planner's estimation logic, so let's leave it for the planned followup patch. Reported-by: Jeff Janes <jeff.janes@gmail.com> Reported-by: David Rowley <dgrowleyml@gmail.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/CAMkU=1zia0JfW_QR8L5xA2vpa0oqVuiapm78h=WpNsHH13_9uw@mail.gmail.com Discussion: https://postgr.es/m/2268409.1761512111@sss.pgh.pa.us	2025-10-30 11:21:22 -04:00
Peter Eisentraut	e1ac846f3d	Mark ItemPointer arguments as const throughout This is a follow-up to `991295f`. I searched over src/ and made ItemPointer arguments const wherever possible. Note: We cut out from the original patch the pieces that would have created incompatibilities in the index or table AM APIs. Those could be considered separately. Author: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/CAEoWx2nBaypg16Z5ciHuKw66pk850RFWw9ACS2DqqJ_AkKeRsw%40mail.gmail.com	2025-10-30 14:12:06 +01:00
Álvaro Herrera	a27c40bfe8	Simplify coding in ProcessQuery The original is pretty baroque for no apparent reason; arguably, commit `2f9661311b` should have done this. Noted while reviewing related code for bug #18984. This is cosmetic (though I'm surprised that my compiler generates shorter assembly this way), so no backpatch. Discussion: https://postgr.es/m/18984-0f4778a6599ac3ae@postgresql.org	2025-10-30 11:26:35 +01:00
Peter Eisentraut	8ce795fcb7	Fix some confusing uses of const There are a few places where we have typedef struct FooData { ... } FooData; typedef FooData *Foo; and then function declarations with bar(const Foo x) which isn't incorrect but probably meant bar(const FooData *x) meaning that the thing x points to is immutable, not x itself. This patch makes those changes where appropriate. In one case (execGrouping.c), the thing being pointed to was not immutable, so in that case remove the const altogether, to avoid further confusion. Co-authored-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/CAEoWx2m2E0xE8Kvbkv31ULh_E%2B5zph-WA_bEdv3UR9CLhw%2B3vg%40mail.gmail.com Discussion: https://www.postgresql.org/message-id/CAEoWx2kTDz%3Db6T2xHX78vy_B_osDeCC5dcTCi9eG0vXHp5QpdQ%40mail.gmail.com	2025-10-30 11:20:04 +01:00
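The distinction drawn in the commit above can be sketched in a few lines of C. This is only an illustration, not code from the patch: `FooData`, `Foo`, and both function names are hypothetical stand-ins for the pattern the message describes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types mirroring the FooData/Foo pattern; not PostgreSQL code. */
typedef struct FooData
{
	int			value;
} FooData;

typedef FooData *Foo;

/*
 * "const Foo x" makes the pointer variable x itself const.  The struct it
 * points to is still writable, so this function compiles and modifies it.
 */
static void
bump_through_const_typedef(const Foo x)
{
	assert(x != NULL);
	x->value++;
}

/*
 * "const FooData *x" makes the pointed-to struct read-only, which is
 * usually what the author intended.  Writing x->value here would be
 * rejected by the compiler.
 */
static int
read_only(const FooData *x)
{
	assert(x != NULL);
	return x->value;
}
```

With the first signature the caller gets no guarantee about the pointed-to data, which is why the commit switches to the second form wherever the data really is immutable.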
Peter Eisentraut	9fcd4874ed	docs: Link to the correct protocol version inspection function The docs for max_protocol_version suggested PQprotocolVersion() instead of PQfullProtocolVersion() to find out the exact protocol version. Since PQprotocolVersion() only returns the major protocol version, that is bad advice. Author: Jelte Fennema-Nio <postgres@jeltef.nl> Reviewed-by: Shinya Kato <shinya11.kato@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CAGECzQSKFxQsYAgr11PhdOr-RtPZEdAXZnHx6U3avLuk3xQaTQ%40mail.gmail.com	2025-10-30 10:59:56 +01:00
Peter Eisentraut	3479a0f823	const-qualify ItemPointer comparison functions Add const qualifiers to ItemPointerEquals() and ItemPointerCompare(). This will allow further changes up the stack. It also complements commit `aeb767ca0b`, as we now have all of itemptr.h appropriately const-qualified. Author: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CAEoWx2nBaypg16Z5ciHuKw66pk850RFWw9ACS2DqqJ_AkKeRsw@mail.gmail.com	2025-10-30 10:13:47 +01:00
Peter Eisentraut	e2cf524e4a	formatting.c cleanup: Improve formatting of some struct declarations This makes future editing easier. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/6dd9d208-a3ed-49b5-b03d-8617261da973%40eisentraut.org	2025-10-30 08:35:33 +01:00
Peter Eisentraut	9a1a5dfee8	formatting.c cleanup: Remove unnecessary zeroize macros Replace with initializer or memset(). Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/6dd9d208-a3ed-49b5-b03d-8617261da973%40eisentraut.org	2025-10-30 08:35:28 +01:00
Peter Eisentraut	38506f55fd	formatting.c cleanup: Remove unnecessary extra line breaks in error message literals Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/6dd9d208-a3ed-49b5-b03d-8617261da973%40eisentraut.org	2025-10-30 08:35:18 +01:00
Michael Paquier	5ab0b6a248	Expose wal_fpi_bytes in EXPLAIN (WAL) The new wal_fpi_bytes counter calculates the total amount of full page images inserted in WAL records, in bytes. This commit exposes this information in EXPLAIN (ANALYZE, WAL) alongside the existing counters, for both the text and JSON/YAML outputs, building upon `f9a09aa295`. Author: Shinya Kato <shinya11.kato@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CAOzEurQtZEAfg6P0kU3Wa-f9BWQOi0RzJEMPN56wNTOmJLmfaQ@mail.gmail.com	2025-10-30 15:34:01 +09:00
Michael Paquier	d432094689	Fix regression with slot invalidation checks This commit reverts `818fefd8fd`, which was introduced to address an instability in some of the TAP tests due to the presence of random standby snapshot WAL records, when slots are invalidated by InvalidatePossiblyObsoleteSlot(). However, that commit also introduced a behavior regression. After `818fefd8fd`, the code may determine that a slot needs to be invalidated while it may not require one: the slot may have moved from a conflicting state to a non-conflicting state between the moment when the mutex is released and the moment when we recheck the slot, in InvalidatePossiblyObsoleteSlot(). Hence, the invalidations may be more aggressive than they actually have to be. `105b2cb336` has tackled the test instability in a way that should be hopefully sufficient for the buildfarm, even for slow members: - In v18, the test relies on an injection point that bypasses the creation of the random records generated for standby snapshots, eliminating the random factor that impacted the test. This option was not available when `818fefd8fd` was discussed. - In v16 and v17, the problem was bypassed by disallowing a slot to become active in some of the scenarios tested. While on it, this commit adds a comment to document that it is fine for a recheck to use xmin and LSN values stored in the slot, without storing and reusing them across multiple checks. Reported-by: "suyu.cmj" <mengjuan.cmj@alibaba-inc.com> Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/f492465f-657e-49af-8317-987460cb68b0.mengjuan.cmj@alibaba-inc.com Backpatch-through: 16	2025-10-30 13:13:28 +09:00
Richard Guo	257ee78341	Disable parallel plans for RIGHT_SEMI joins RIGHT_SEMI joins rely on the HEAP_TUPLE_HAS_MATCH flag to guarantee that only the first match for each inner tuple is considered. However, in a parallel hash join, the inner relation is stored in a shared global hash table that can be probed by multiple workers concurrently. This allows different workers to inspect and set the match flags of the same inner tuples at the same time. If two workers probe the same inner tuple concurrently, both may see the match flag as unset and emit the same tuple, leading to duplicate output rows and violating RIGHT_SEMI join semantics. For now, we disable parallel plans for RIGHT_SEMI joins. In the long term, it may be possible to support parallel execution by performing atomic operations on the match flag, for example using a CAS or similar mechanism. Backpatch to v18, where RIGHT_SEMI join was introduced. Bug: #19094 Reported-by: Lori Corbani <Lori.Corbani@jax.org> Diagnosed-by: Tom Lane <tgl@sss.pgh.pa.us> Author: Richard Guo <guofenglinux@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/19094-6ed410eb5b256abd@postgresql.org Backpatch-through: 18	2025-10-30 11:58:45 +09:00
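The race described in the commit above, and the compare-and-swap idea it mentions as a possible long-term direction, can be sketched with C11 atomics. This is a hypothetical illustration only: `SketchTuple` and `claim_first_match` are invented names, the real match flag lives in the tuple header rather than in a separate atomic field, and PostgreSQL has its own atomics layer that this sketch does not use.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an inner tuple in a shared hash table. */
typedef struct SketchTuple
{
	atomic_bool matched;		/* plays the role of the match flag */
} SketchTuple;

/*
 * Atomically flip the flag from false to true.  Returns true only for the
 * single caller that performed the transition, so at most one worker would
 * emit the tuple even if several probe it at the same moment.
 */
static bool
claim_first_match(SketchTuple *tuple)
{
	bool		expected = false;

	assert(tuple != NULL);
	return atomic_compare_exchange_strong(&tuple->matched, &expected, true);
}
```

A plain "read the flag, then set it" sequence has a window between the two steps in which a second worker can also see the flag unset; the compare-and-swap closes that window by making the check and the update one operation.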
David Rowley	50eb4e1181	Fix bogus use of "long" in AllocSetCheck() Because long is 32-bit on 64-bit Windows, it isn't a good datatype to store the difference between 2 pointers. The under-sized type could overflow and lead to scary warnings in MEMORY_CONTEXT_CHECKING builds, such as: WARNING: problem in alloc set ExecutorState: bad single-chunk %p in block %p However, the problem lies only in the code running the check, not from an actual memory accounting bug. Fix by using "Size" instead of "long". This means using an unsigned type rather than the previous signed type. If the block's freeptr was corrupted, we'd still catch that if the unsigned type wrapped. Unsigned allows us to avoid further needless complexities around comparing signed and unsigned types. Author: David Rowley <dgrowleyml@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Backpatch-through: 13 Discussion: https://postgr.es/m/CAApHDvo-RmiT4s33J=aC9C_-wPZjOXQ232V-EZFgKftSsNRi4w@mail.gmail.com	2025-10-30 14:48:10 +13:00
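The type-width problem in the commit above can be illustrated with a small C sketch. It is not the AllocSetCheck() code: both helper names are hypothetical, `size_t` stands in for PostgreSQL's `Size`, and the truncation helper only simulates what storing a large byte count in a 32-bit type does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of bytes used in a block, computed as the
 * distance from the block start to its free pointer.  size_t is wide
 * enough for any object size on the platform.
 */
static size_t
block_used_bytes(const char *block_start, const char *freeptr)
{
	assert(freeptr >= block_start);
	return (size_t) (freeptr - block_start);
}

/*
 * Simulate keeping a byte count in a 32-bit variable, as happens with
 * "long" on 64-bit Windows: only the low 32 bits survive.
 */
static uint32_t
truncate_to_32_bits(uint64_t nbytes)
{
	return (uint32_t) nbytes;
}
```

For example, a count of 4 GiB plus 16 bytes comes back as 16 after truncation, which is the kind of mismatch that would make a consistency check report a problem even though the underlying memory accounting is correct.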
Jeff Davis	3853a6956c	Use C11 char16_t and char32_t for Unicode code points. Reviewed-by: Tatsuo Ishii <ishii@postgresql.org> Reviewed-by: Thomas Munro <thomas.munro@gmail.com> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/bedcc93d06203dfd89815b10f815ca2de8626e85.camel%40j-davis.com	2025-10-29 14:17:13 -07:00
Álvaro Herrera	16edc1b94f	pg_stat_statements: Fix handling of duplicate constant locations Two or more constants can have the same location. We handled this correctly for non squashed constants, but failed to do it if squashed (resulting in out-of-bounds memory access), because the code structure became broken by commit `0f65f3eec4`: we failed to update 'last_loc' correctly when skipping these squashed constants. The simplest fix seems to be to get rid of 'last_loc' altogether -- in hindsight, it's quite pointless. Also, when ignoring a constant because of this, make sure to fulfill fill_in_constant_lengths's duty of setting its length to -1. Lastly, we can use == instead of <= because the locations have been sorted beforehand, so the < case cannot arise. Co-authored-by: Sami Imseih <samimseih@gmail.com> Co-authored-by: Dmitry Dolgov <9erthalion6@gmail.com> Reported-by: Konstantin Knizhnik <knizhnik@garret.ru> Backpatch-through: 18 Discussion: https://www.postgresql.org/message-id/2b91e358-0d99-43f7-be44-d2d4dbce37b3%40garret.ru	2025-10-29 12:35:02 +01:00
Álvaro Herrera	94f95d91b0	CheckNNConstraintFetch: Fill all of ConstrCheck in a single pass Previously, we'd fill all fields except ccbin, and only later obtain and detoast ccbin, with hypothetical failures being possible. If ccbin is null (rare catalog corruption I have never witnessed) or it's a corrupted toast entry, we leak a tiny bit of memory in CacheMemoryContext from having strdup'd the constraint name. Repair these by only attempting to fill the struct once ccbin has been detoasted. Author: Ranier Vilela <ranier.vf@gmail.com> Discussion: https://postgr.es/m/CAEudQAr=i3_Z4GvmediX900+sSySTeMkvuytYShhQqEwoGyvhA@mail.gmail.com	2025-10-29 11:41:39 +01:00
Peter Eisentraut	a13833c35f	Reorganize GUC structs Instead of having five separate GUC structs, one for each type, with the generic part contained in each of them, flip it around and have one common struct, with the type-specific part as a subfield. The very original GUC design had type-specific structs and type-specific lists, and the membership in one of the lists defined the type. But now the structs themselves know the type (from the .vartype field), and they are all loaded into a common hash table at run time, and so this original separation no longer makes sense. It creates a bunch of inconsistencies in the code about whether the type-specific or the generic struct is the primary struct, and a lot of casting in between, which makes certain assumptions about the struct layouts. After the change, all these casts are gone and all the data is accessed via normal field references. Also, various code is simplified because only one kind of struct needs to be processed. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Discussion: https://www.postgresql.org/message-id/flat/8fdfb91e-60fb-44fa-8df6-f5dea47353c9@eisentraut.org	2025-10-29 09:52:29 +01:00
Peter Eisentraut	2724830929	formatting.c cleanup: Remove unnecessary extra parentheses Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/6dd9d208-a3ed-49b5-b03d-8617261da973%40eisentraut.org	2025-10-29 09:29:00 +01:00
Peter Eisentraut	6271d9922e	formatting.c cleanup: Use array syntax instead of pointer arithmetic for easier readability Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/6dd9d208-a3ed-49b5-b03d-8617261da973%40eisentraut.org	2025-10-29 09:28:55 +01:00
Peter Eisentraut	b9def57a3c	formatting.c cleanup: Add some const pointer qualifiers Co-authored-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/6dd9d208-a3ed-49b5-b03d-8617261da973%40eisentraut.org	2025-10-29 09:28:50 +01:00
Peter Eisentraut	d98b3cdbaf	formatting.c cleanup: Use size_t for string length variables and arguments Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/6dd9d208-a3ed-49b5-b03d-8617261da973%40eisentraut.org	2025-10-29 09:28:43 +01:00
Peter Eisentraut	f0f2c0c1ae	Replace pg_restrict by standard restrict MSVC in C11 mode supports the standard restrict qualifier, so we don't need the workaround naming pg_restrict anymore. Even though restrict is in C99 and should be supported by all supported compilers, we keep the configure test and the hardcoded redirection to __restrict, because that will also work in C++ in all supported compilers. (restrict is not part of the C++ standard.) For backward compatibility for extensions, we keep a #define of pg_restrict around, but our own code doesn't use it anymore. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/0e3d8644-c01d-4374-86ea-9f0a987981f0%40eisentraut.org	2025-10-29 07:52:58 +01:00
Peter Eisentraut	c094be259b	Remove obsolete comment The comment "type prefixes (const, signed, volatile, inline) are handled in pg_config.h." has been mostly not true for a long time.	2025-10-29 07:32:21 +01:00
Michael Paquier	d3111cb753	Fix correctness issue with computation of FPI size in WAL stats XLogRecordAssemble() may be called multiple times before inserting a record in XLogInsertRecord(), and the amount of FPIs generated inside a record whose insertion is attempted multiple times may vary. The logic added in `f9a09aa295` touched directly pgWalUsage in XLogRecordAssemble(), meaning that it could be possible for pgWalUsage to be incremented multiple times for a single record. This commit changes the code to use the same logic as the number of FPIs added to a record, where XLogRecordAssemble() returns this information and feeds it to XLogInsertRecord(), updating pgWalUsage only when a record is inserted. Reported-by: Shinya Kato <shinya11.kato@gmail.com> Discussion: https://postgr.es/m/CAOzEurSiSr+rusd0GzVy8Bt30QwLTK=ugVMnF6=5WhsSrukvvw@mail.gmail.com	2025-10-29 09:13:31 +09:00
Nathan Bossart	b3ce55f413	Add psql PROMPT variable for search_path. The new %S substitution shows the current value of search_path. Note that this only works when connected to Postgres v18 or newer, since search_path was first marked as GUC_REPORT in commit `28a1121fd9`. On older versions that don't report search_path, %S is replaced with a question mark. Suggested-by: Lauri Siltanen <lauri.siltanen@gmail.com> Author: Florents Tselai <florents.tselai@gmail.com> Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl> Reviewed-by: Jim Jones <jim.jones@uni-muenster.de> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CANsM767JhTKCRagTaq5Lz52fVwLPVkhSpyD1C%2BOrridGv0SO0A%40mail.gmail.com	2025-10-28 14:08:38 -05:00
Peter Eisentraut	03fbb0814c	formatting.c cleanup: Move loop variables definitions into for statement Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/6dd9d208-a3ed-49b5-b03d-8617261da973%40eisentraut.org	2025-10-28 19:20:17 +01:00
Peter Eisentraut	95924672d5	formatting.c cleanup: Remove dashes in comments This saves some vertical space and makes the comments style more consistent with the rest of the code. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/6dd9d208-a3ed-49b5-b03d-8617261da973%40eisentraut.org	2025-10-28 19:20:02 +01:00
Álvaro Herrera	d5845aa8ad	Don't error out when dropping constraint if relchecks is already zero I have never seen this be a problem in practice, but it came up when purposely corrupting catalog contents to study the fix for a nearby bug: we'd try to decrement relchecks, but since it's zero we error out and fail to drop the constraint. The fix is to downgrade the error to warning, skip decrementing the counter, and otherwise proceed normally. Given lack of field complaints, no backpatch. Author: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/202508291058.q2zscdcs64fj@alvherre.pgsql	2025-10-28 19:13:32 +01:00
Jeff Davis	4da12e9e2e	Move comment about casts from pg_wchar. Suggested-by: Thomas Munro <thomas.munro@gmail.com> Discussion: https://postgr.es/m/CA+hUKGLXQUYK7Cq5KbLGgTWo7pORs7yhBWO1AEnZt7xTYbLRhg@mail.gmail.com	2025-10-28 10:49:20 -07:00
Peter Eisentraut	35e53b6841	Check that index can return in get_actual_variable_range() Some recent changes were made to remove the explicit dependency on btree indexes in some parts of the code. One of these changes was made in commit `9ef1851685`, which allows non-btree indexes to be used in get_actual_variable_range(). A follow-up commit `ee1ae8b99f` fixes the cases where an index doesn’t have a sortopfamily as this is a prerequisite to be used in get_actual_variable_range(). However, it was found that indexes that have amcanorder = true but do not allow index-only-scans (amcanreturn returns false or is NULL) will pass all of the conditions, while they should be rejected since get_actual_variable_range() uses the index-only-scan machinery in get_actual_variable_endpoint(). Such an index might cause errors like ERROR: no data returned for index-only scan during query planning. The fix is to add a check in get_actual_variable_range() to reject indexes that do not allow index-only scans. Author: Maxime Schoemans <maxime.schoemans@enterprisedb.com> Discussion: https://www.postgresql.org/message-id/flat/20ED852A-C2D9-41EB-8671-8C8B9D418BE9%40enterprisedb.com	2025-10-28 10:07:29 +01:00
Michael Paquier	f9a09aa295	Add wal_fpi_bytes to pg_stat_wal and pg_stat_get_backend_wal() This new counter, called "wal_fpi_bytes", tracks the total amount in bytes of full page images (FPIs) generated in WAL. This data becomes available globally via pg_stat_wal, and for backend statistics via pg_stat_get_backend_wal(). Previously, this information could only be retrieved with pg_waldump or pg_walinspect, which may not be available depending on the environment, and are expensive to execute. It offers hints about how much FPIs impact the WAL generated, which could be a large percentage for some workloads, as well as the effects of wal_compression or page holes. Bump catalog version. Bump PGSTAT_FILE_FORMAT_ID, due to the addition of wal_fpi_bytes in PgStat_WalCounters. Author: Shinya Kato <shinya11.kato@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CAOzEurQtZEAfg6P0kU3Wa-f9BWQOi0RzJEMPN56wNTOmJLmfaQ@mail.gmail.com	2025-10-28 16:21:51 +09:00
Amit Kapila	3e8e05596a	Add worker type argument to logical replication worker functions. Extend logicalrep_worker_stop, logicalrep_worker_wakeup, and logicalrep_worker_find to accept a worker type argument. This change enables differentiation between logical replication worker types, such as apply workers and table sync workers. While preserving existing behavior, it lays the groundwork for an upcoming patch to add sequence synchronization workers. Author: Vignesh C <vignesh21@gmail.com> Reviewed-by: shveta malik <shveta.malik@gmail.com> Reviewed-by: Peter Smith <smithpb2250@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/CAA4eK1LC+KJiAkSrpE_NwvNdidw9F2os7GERUeSxSKv71gXysQ@mail.gmail.com	2025-10-28 05:47:50 +00:00
Michael Paquier	8767b449a3	Simplify newline handling in two TAP tests Two tests are changed in this commit: - libpq's 006_service - ldap's 003_ldap_connection_param_lookup CRLF translation is already handled by the text mode, so there should be no need for any specific logic. See also `1c6d462939`, msys perl being one case where the translation mattered. Note: This is first applied on HEAD, and backpatch will follow once the buildfarm has provided an opinion about this commit. Author: Jacob Champion <jacob.champion@enterprisedb.com> Co-authored-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://postgr.es/m/aPsh39bxwYKvUlAf@paquier.xyz Backpatch-through: 13	2025-10-28 08:26:42 +09:00
Nathan Bossart	123661427b	Fix a couple of comments. These were discovered while reviewing Aleksander Alekseev's proposed changes to pgindent. Oversights in commits `393e0d2314` and `25a30bbd42`. Discussion: https://postgr.es/m/aP-H6kSsGOxaB21k%40nathan	2025-10-27 10:30:05 -05:00
Dean Rasheed	2e84248d64	Add new RLS tests to test policies applied by command type. The existing RLS tests focus on the outcomes of various testing scenarios, rather than the exact policies applied. This sometimes makes it hard to see why a particular result occurred (e.g., which policy failed), or to construct a test that fails a particular policy check without an earlier check failing. These new tests issue NOTICE messages to show the actual policies applied for each command type, including the different paths through INSERT ... ON CONFLICT and MERGE, making it easier to verify the expected behaviour. Author: Dean Rasheed <dean.a.rasheed@gmail.com> Reviewed-by: Viktor Holmberg <v@viktorh.net> Reviewed-by: Jian He <jian.universality@gmail.com> Discussion: https://postgr.es/m/CAEZATCWqnfeChjK=n1V_dYZT4rt4mnq+ybf9c0qXDYTVMsy8pg@mail.gmail.com	2025-10-27 10:21:16 +00:00
Peter Eisentraut	10b5bb3bff	Add some const qualifications Add some const qualifications afforded by the previous change that added a const qualification to PageAddItemExtended(). Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Reviewed-by: Peter Geoghegan <pg@bowt.ie> Discussion: https://www.postgresql.org/message-id/flat/c75cccf5-5709-407b-a36a-2ae6570be766@eisentraut.org	2025-10-27 09:55:59 +01:00
Peter Eisentraut	76acf4b722	Remove Item type This type is just char * underneath, it provides no real value, no type safety, and just makes the code one level more mysterious. It is more idiomatic to refer to blobs of memory by a combination of void * and size_t, so change it to that. Also, since this type hides the pointerness, we can't apply qualifiers to what is pointed to, which requires some unconstify nonsense. This change allows fixing that. Extension code that uses the Item type can change its code to use void * to be backward compatible. Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Reviewed-by: Peter Geoghegan <pg@bowt.ie> Discussion: https://www.postgresql.org/message-id/flat/c75cccf5-5709-407b-a36a-2ae6570be766@eisentraut.org	2025-10-27 09:55:59 +01:00
Peter Eisentraut	64d2b0968e	Remove meaningless restrict qualifiers The use of the restrict qualifier in casts is meaningless, so remove them. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/0e3d8644-c01d-4374-86ea-9f0a987981f0%40eisentraut.org	2025-10-27 08:53:09 +01:00
Amit Kapila	e0dc4bbfb8	Fix GUC check_hook validation for synchronized_standby_slots. Previously, the check_hook for synchronized_standby_slots attempted to validate that each specified slot existed and was physical. However, these checks were not performed during server startup. As a result, if users configured non-existent slots before startup, the misconfiguration would go undetected initially. This could later cause parallel query failures, as newly launched workers would detect the issue and raise an ERROR. This patch improves the check_hook by validating the syntax and format of slot names. Validation of slot existence and type is deferred to the WAL sender process, aligning with the behavior of the check_hook for primary_slot_name. Reported-by: Fabrice Chapuis <fabrice636861@gmail.com> Author: Shlok Kyal <shlok.kyal.oss@gmail.com> Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Ashutosh Sharma <ashu.coek88@gmail.com> Reviewed-by: Rahila Syed <rahilasyed90@gmail.com> Backpatch-through: 17, where it was introduced Discussion: https://postgr.es/m/CAA5-nLCeO4MQzWipCXH58qf0arruiw0OeUc1+Q=Z=4GM+=v1NQ@mail.gmail.com	2025-10-27 06:48:32 +00:00
Amit Kapila	549d9c91b1	Improve test in 009_matviews.pl. Ensure that the target table on the subscriber exists before executing any DML intended for replication. Currently, if the table is missing, the replication logic keeps retrying until the table is eventually created by the test. This behaviour does not cause test failures, since the table is created after the INSERT is published and replication eventually succeeds, but it introduces unnecessary looping and delays. Author: Grem Snoort <grem.snoort@gmail.com> Discussion: https://postgr.es/m/CANV9Qw5HD7=Fp4nox2O7DoVctHoabRRVW9Soo4A=QipqW5B=Tg@mail.gmail.com	2025-10-27 03:52:39 +00:00
Jeff Davis	371a302eec	Comment typo fixes: pg_wchar_t should be pg_wchar. Reported-by: Thomas Munro <thomas.munro@gmail.com> Discussion: https://postgr.es/m/CA+hUKGJ5Xh0KxLYXDZuPvw1_fHX=yuzb4xxtam1Cr6TPZZ1o+w@mail.gmail.com	2025-10-26 12:31:50 -07:00
David Rowley	39dcfda2d2	Fix incorrect logic for caching ResultRelInfos for triggers When dealing with ResultRelInfos for partitions, there are cases where there are mixed requirements for the ri_RootResultRelInfo. There are cases when the partition itself requires a NULL ri_RootResultRelInfo and in the same query, the same partition may require a ResultRelInfo with its parent set in ri_RootResultRelInfo. This could cause the column mapping between the partitioned table and the partition not to be done which could result in crashes if the column attnums didn't match exactly. The fix is simple. We now check that the ri_RootResultRelInfo matches what the caller passed to ExecGetTriggerResultRel() and only return a cached ResultRelInfo when the ri_RootResultRelInfo matches what the caller wants, otherwise we'll make a new one. Author: David Rowley <dgrowleyml@gmail.com> Author: Amit Langote <amitlangote09@gmail.com> Reported-by: Dmitry Fomin <fomin.list@gmail.com> Discussion: https://postgr.es/m/7DCE78D7-0520-4207-822B-92F60AEA14B4@gmail.com Backpatch-through: 15	2025-10-26 10:59:50 +13:00
Dean Rasheed	7e2af1fb11	Guard against division by zero in test_int128 module. When testing 128/32-bit division in the test_int128 test module, make sure that we don't attempt to divide by zero. While at it, use pg_prng_int64() and pg_prng_int32() to generate the random numbers required for the tests, rather than pg_prng_uint64(). This eliminates the need for any implicit or explicit type casts. Author: Dean Rasheed <dean.a.rasheed@gmail.com> Reported-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/da4wqngd6gwz5s4yf5y5f75xj7gpja62l4dbp6w4j3vs7fcjue@upvolcu4e6o2	2025-10-25 11:08:46 +01:00
Michael Paquier	5173bfd044	pg_rewind: Skip copy of WAL segments generated before point of divergence This commit makes the way WAL segments are handled from the source to the target server slightly smarter: the copy of the WAL segments is now skipped if these have been created before the point where source and target have diverged (the WAL segment where the point of divergence exists is still copied), because we know that such segments exist on both the target and source. Note that the on-disk size of the WAL segments on the source and target need to match. Hence, only the segments generated after the point of divergence are now copied. A segment existing on the source but not the target is copied. Previously, all the WAL segments were just copied in full. This change can make the rewind operation cheaper in some configurations, especially for setups where some WAL retention causes many segments to remain on the source server even after the promotion of a standby used as source to rewind a previous primary. A TAP test is added to track these new behaviors. The file map printed with --debug now includes all the information related to WAL segments, to be able to track if these are copied or skipped, and the test relies on the debug output generated. Author: John Hsu <johnhyvr@gmail.com> Author: Justin Kwan <justinpkwan@outlook.com> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com> Reviewed-by: Japin Li <japinli@hotmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Srinath Reddy Sadipiralla <srinath2133@gmail.com> Discussion: https://postgr.es/m/181b4c6fa9c.b8b725681941212.7547232617810891479@viggy28.dev	2025-10-25 09:07:31 +09:00
Fujii Masao	14ee8e6403	psql: Improve tab completion for large objects. This commit enhances psql's tab completion support for large objects: - Completes \lo_export <oid> with a file name - Completes GRANT/REVOKE ... LARGE with OBJECT - Completes ALTER DEFAULT PRIVILEGES GRANT/REVOKE ... LARGE with OBJECTS Author: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> Co-authored-by: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Chao Li <lic@highgo.com> Discussion: https://postgr.es/m/87y0syikki.fsf@wibble.ilmari.org	2025-10-24 14:31:14 +09:00
Tom Lane	0758111f5d	Update expected output for contrib/sepgsql's regression tests. Commit `65281391a` caused some additional error context lines to appear in the output of one test case. That's fine, but we missed updating the expected output. Do it now. While here, add some missing test-output subdirectories to contrib/sepgsql/.gitignore, so that we don't get git warnings after running the tests. Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/1613232.1761255361@sss.pgh.pa.us Backpatch-through: 18	2025-10-23 17:47:10 -04:00
Daniel Gustafsson	c0677d8b2e	doc: Remove mention of Git protocol support The project Git server hasn't supported cloning with the Git protocol in a very long time, but the documentation never got the memo. Remove the mention of using the Git protocol, and while there wrap a mention of Git in <productname> tags. Backpatch down to all supported versions. Author: Daniel Gustafsson <daniel@yesql.se> Reported-by: Gurjeet Singh <gurjeet@singh.im> Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com> Reviewed-by: Gurjeet Singh <gurjeet@singh.im> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CABwTF4WMiMb-KT2NRcib5W0C8TQF6URMb+HK9a_=rnZnY8Q42w@mail.gmail.com Backpatch-through: 13	2025-10-23 21:26:15 +02:00
Tom Lane	e557c82759	Avoid memory leak in validation of a PL/Python trigger function. If we're trying to perform check_function_bodies validation of a PL/Python trigger function, we create a new PLyProcedure, but we don't put it into the PLy_procedure_cache hash table. (Doing so would be useless, since we don't have the relation OID that is part of the cache key for a trigger function, so we could not make an entry that would be found by later uses.) However, we didn't think through what to do instead, with the result that the PLyProcedure was simply leaked. It would take a pretty large number of CREATE FUNCTION operations for this to amount to a serious problem, but it's easy to see the memory bloat if you do CREATE OR REPLACE FUNCTION in a loop. To fix, have PLy_procedure_get delete the new PLyProcedure and return NULL if it's not going to cache the PLyProcedure. I considered making plpython3_validator do the cleanup instead, which would be more natural. But then plpython3_validator would have to know the rules under which PLy_procedure_get returns a non-cached PLyProcedure, else it risks deleting something that's pointed to by a cache entry. On the whole it seems more robust to deal with the case inside PLy_procedure_get. Found by the new version of Coverity (nice catch!). In the end I feel this fix is more about satisfying Coverity than about fixing a real-world problem, so I'm not going to back-patch.	2025-10-23 14:23:26 -04:00
Tom Lane	9f9a04368f	Fix off-by-one Asserts in FreePageBtreeInsertInternal/Leaf. These two functions expect there to be room to insert another item in the FreePageBtree's array, but their assertions were too weak to guarantee that. This has little practical effect granting that the callers are not buggy, but it seems to be misleading late-model Coverity into complaining about possible array overrun. Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/799984.1761150474@sss.pgh.pa.us Backpatch-through: 13	2025-10-23 12:32:06 -04:00
Tom Lane	798b19d27b	Fix resource leaks in PL/Python error reporting, redux. Commit `c6f7f11d8` intended to prevent leaking any PyObject reference counts in edge cases (such as out-of-memory during string construction), but actually it introduced a leak in the normal case. Repeating an error-trapping operation often enough would lead to session-lifespan memory bloat. The problem is that I failed to think about the fact that PyObject_GetAttrString() increments the refcount of the returned PyObject, so that simply walking down the list of error frame objects causes all but the first one to have their refcount incremented. I experimented with several more-or-less-complex ways around that, and eventually concluded that the right fix is simply to drop the newly-obtained refcount as soon as we walk to the next frame object in PLy_traceback. This sounds unsafe, but it's perfectly okay because the caller holds a refcount on the first frame object and each frame object holds a refcount on the next one; so the current frame object can't disappear underneath us. By the same token, we can simplify the caller's cleanup back to simply dropping its refcount on the first object. Cleanup of each frame object will lead in turn to the refcount of the next one going to zero. I also added a couple of comments explaining why PLy_elog_impl() doesn't try to free the strings acquired from PLy_get_spi_error_data() or PLy_get_error_data(). That's because I got here by looking at a Coverity complaint about how those strings might get leaked. They are not leaked, but in testing that I discovered this other leak. Back-patch, as `c6f7f11d8` was. It's a bit nervous-making to be putting such a fix into v13, which is only a couple weeks from its final release; but I can't see that leaving a recently-introduced leak in place is a better idea. Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/1203918.1761184159@sss.pgh.pa.us Backpatch-through: 13	2025-10-23 11:47:46 -04:00
Amit Kapila	f0b3573c3a	Introduce "REFRESH SEQUENCES" for subscriptions. This patch adds support for a new SQL command: ALTER SUBSCRIPTION ... REFRESH SEQUENCES This command updates the sequence entries present in the pg_subscription_rel catalog table with the INIT state to trigger resynchronization. In addition to the new command, the following subscription commands have been enhanced to automatically refresh sequence mappings: ALTER SUBSCRIPTION ... REFRESH PUBLICATION ALTER SUBSCRIPTION ... ADD PUBLICATION ALTER SUBSCRIPTION ... DROP PUBLICATION ALTER SUBSCRIPTION ... SET PUBLICATION These commands will perform the following actions: Add newly published sequences that are not yet part of the subscription. Remove sequences that are no longer included in the publication. This ensures that sequence replication remains aligned with the current state of the publication on the publisher side. Note that the actual synchronization of sequence data/values will be handled in a subsequent patch that introduces a dedicated sequence sync worker. Author: Vignesh C <vignesh21@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: shveta malik <shveta.malik@gmail.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com> Reviewed-by: Peter Smith <smithpb2250@gmail.com> Reviewed-by: Nisha Moond <nisha.moond412@gmail.com> Reviewed-by: Shlok Kyal <shlok.kyal.oss@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Hou Zhijie <houzj.fnst@fujitsu.com> Discussion: https://postgr.es/m/CAA4eK1LC+KJiAkSrpE_NwvNdidw9F2os7GERUeSxSKv71gXysQ@mail.gmail.com	2025-10-23 08:30:27 +00:00
Michael Paquier	6ae08d9583	pg_rewind: Extend code detecting relation files to work with WAL files isRelDataFile() is renamed to getFileContentType(), extended so as it becomes able to detect more file patterns than only relation files. The new file name pattern that can be detected is WAL files. This refactoring has been suggested by Robert Haas. This will be used in a follow-up patch where we are looking at improving how WAL files are processed by pg_rewind. As of this change, WAL files are still handled the same way as previously, always copied from the source to the target server. Extracted from a larger patch by the same authors. Author: John Hsu <johnhyvr@gmail.com> Author: Justin Kwan <justinpkwan@outlook.com> Reviewed-by: Japin Li <japinli@hotmail.com> Reviewed-by: Srinath Reddy Sadipiralla <srinath2133@gmail.com> Discussion: https://postgr.es/m/181b4c6fa9c.b8b725681941212.7547232617810891479@viggy28.dev	2025-10-23 15:57:46 +09:00
Fujii Masao	abc2b71383	Add comments explaining overflow entries in the replication lag tracker. Commit `883a95646a` introduced overflow entries in the replication lag tracker to fix an issue where lag columns in pg_stat_replication could stall when the replay LSN stopped advancing. This commit adds comments clarifying the purpose and behavior of overflow entries to improve code readability and understanding. Since commit `883a95646a` was recently applied and backpatched to all supported branches, this follow-up commit is also backpatched accordingly. Author: Xuneng Zhou <xunengzhou@gmail.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/CABPTF7VxqQA_DePxyZ7Y8V+ErYyXkmwJ1P6NC+YC+cvxMipWKw@mail.gmail.com Backpatch-through: 13	2025-10-23 13:24:56 +09:00
Tatsuo Ishii	20628b62e4	Fix coding style with "else". The "else" code block having single statement with comments on a separate line should have been surrounded by braces. Reported-by: Chao Li <lic@highgo.com> Suggested-by: David Rowley <dgrowleyml@gmail.com> Author: Tatsuo Ishii <ishii@postgresql.org> Discussion: https://postgr.es/m/20251020.125847.997839131426057290.ishii%40postgresql.org	2025-10-23 10:58:41 +09:00
David Rowley	b30da2d233	Fix some misplaced comments in parallel_schedule These are listing which other tests one of the tests in the subsequent group depends on. A couple of comments were located with unrelated tests. In passing, fix a small grammatical issue. Noticed in passing while working on something else. Author: David Rowley <dgrowleyml@gmail.com>	2025-10-23 13:38:39 +13:00
Masahiko Sawada	487e2bc534	Add copyright notice to vacuum_horizon_floor.pl test. Fix oversight in commit `303ba0573`, which was backpatched through 14. Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CAD21AoBeFdTJcwUfUYPcEgONab3TS6i1PB9S5cSXcBAmdAdQKw%40mail.gmail.com Backpatch-through: 14	2025-10-22 17:17:49 -07:00
David Rowley	6911f80379	Fix incorrect zero extension of Datum in JIT tuple deform code When JIT deformed tuples (controlled via the jit_tuple_deforming GUC), types narrower than sizeof(Datum) would be zero-extended up to Datum width. This wasn't the same as what fetch_att() does in the standard tuple deforming code. Logically the values are the same when fetching via the DatumGet*() macros, but negative numbers are not the same in binary form. In the report, the problem was manifesting itself with: ERROR: could not find memoization table entry in a query which had a "Cache Mode: binary" Memoize node. However, it's currently unclear what else is affected. Anything that uses datum_image_eq() or datum_image_hash() on a Datum from a tuple deformed by JIT could be affected, but it may not be limited to that. The fix for this is simple: use signed extension instead of zero extension. Many thanks to Emmanuel Touzery for reporting this issue and providing steps and backup which allowed the problem to easily be recreated. Reported-by: Emmanuel Touzery <emmanuel.touzery@plandela.si> Author: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/DB8P194MB08532256D5BAF894F241C06393F3A@DB8P194MB0853.EURP194.PROD.OUTLOOK.COM Backpatch-through: 13	2025-10-23 13:11:02 +13:00
Tom Lane	fe9c051fd3	Avoid assuming that time_t can fit in an int. We had several places that used cast-to-unsigned-int as a substitute for properly checking for overflow. Coverity has started objecting to that practice as likely introducing Y2038 bugs. An extra comparison is surely not much compared to the cost of time(NULL), nor is this coding practice particularly readable. Let's do it honestly, with explicit logic covering the cases of first-time-through and clock-went-backwards. I don't feel a need to back-patch though: our released versions will be out of support long before 2038, and besides which I think the code would accidentally work anyway for another 70 years or so.	2025-10-22 17:50:11 -04:00
Nathan Bossart	d10866f1fd	Fix type of infomask parameter in htup_details.h functions. Oversight in commit `34694ec888`. Since there aren't any known live bugs related to this, no back-patch. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/aPk4u955ZPPZ_nYw%40nathan	2025-10-22 16:47:38 -05:00
Tom Lane	716c451128	Remove useless pstrdup() calls. The result of PLyUnicode_AsString is already palloc'd, so pstrdup'ing it is just a waste of time and memory. More importantly it might confuse people about whether that's necessary. Doesn't seem important enough to back-patch, but we should fix it. Spotted by Coverity.	2025-10-22 16:22:52 -04:00
Tom Lane	bc310c6ff3	Fix memory leaks in scripts/vacuuming.c. Coverity complained that the list of table names returned by retrieve_objects() was leaked, and it's right. Potentially this is quite a lot of memory; however, it's not entirely clear how much we gain by cleaning it up, since in many operating modes we're going to need the list until the program is about to exit. Still, it will win in some cases, so fix the leak. vacuuming.c is new in v19, and this patch doesn't apply at all cleanly to the predecessor code in v18. I'm not excited enough about the issue to devise a back-patch.	2025-10-22 15:19:19 -04:00
Tom Lane	9224c30252	Fix memory leaks in pg_combinebackup/reconstruct.c. One code path forgot to free the separately-malloc'd filename part of a struct rfile. Another place freed the filename but forgot the struct rfile itself. These seem worth fixing because with a large backup we could be dealing with many files. Coverity found the bug in make_rfile(). I found the other one by manual inspection.	2025-10-22 13:38:40 -04:00
Nathan Bossart	4c5e1d0785	Remove make_temptable_name_n(). This small function is only used in one place, and it fails to handle quoted table names (although the table name portion of the input should never be quoted in current usage). In addition to removing make_temptable_name_n() in favor of open-coding it where needed, this commit ensures the "diff" table name is properly quoted in order to future-proof this area a bit. Author: Aleksander Alekseev <aleksander@tigerdata.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Shinya Kato <shinya11.kato@gmail.com> Discussion: https://postgr.es/m/CAJ7c6TO3a5q2NKRsjdJ6sLf8isVe4aMaaX1-Hj2TdHdhFw8zRA%40mail.gmail.com	2025-10-22 12:31:55 -05:00
Fujii Masao	f33e60a53a	Make invalid primary_slot_name follow standard GUC error reporting. Previously, if primary_slot_name was set to an invalid slot name and the configuration file was reloaded, both the postmaster and all other backend processes reported a WARNING. With many processes running, this could produce a flood of duplicate messages. The problem was that the GUC check hook for primary_slot_name reported errors at WARNING level via ereport(). This commit changes the check hook to use GUC_check_errdetail() and GUC_check_errhint() for error reporting. As with other GUC parameters, this causes non-postmaster processes to log the message at DEBUG3, so by default, only the postmaster's message appears in the log file. Backpatch to all supported versions. Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Chao Li <lic@highgo.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com> Discussion: https://postgr.es/m/CAHGQGwFud-cvthCTfusBfKHBS6Jj6kdAPTdLWKvP2qjUX6L_wA@mail.gmail.com Backpatch-through: 13	2025-10-22 20:09:43 +09:00
Tatsuo Ishii	2d7b247cb4	Fix multi WinGetFuncArgInFrame/Partition calls with IGNORE NULLS. Previously it was mistakenly assumed that there's only one window function argument which needs to be processed by WinGetFuncArgInFrame or WinGetFuncArgInPartition when IGNORE NULLS option is specified. To eliminate the limitation, WindowObject->notnull_info is modified from "uint8 *" to "uint8 **" so that WindowObject->notnull_info could store pointers to "uint8 *" which holds NOT NULL info corresponding to each window function argument. Moreover, WindowObject->num_notnull_info is changed from "int" to "int64 *" so that WindowObject->num_notnull_info could store the number of NOT NULL info corresponding to each function argument. Memories for these data structures will be allocated when WinGetFuncArgInFrame or WinGetFuncArgInPartition is called. Thus no memory except the pointers is allocated for function arguments which do not call these functions. Also fix the set mark position logic in WinGetFuncArgInPartition to not raise a "cannot fetch row before WindowObject's mark position" error in IGNORE NULLS case. Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Author: Tatsuo Ishii <ishii@postgresql.org> Discussion: https://postgr.es/m/2952409.1760023154%40sss.pgh.pa.us	2025-10-22 12:06:33 +09:00
Fujii Masao	883a95646a	Fix stalled lag columns in pg_stat_replication when replay LSN stops advancing. Previously, when the replay LSN reported in feedback messages from a standby stopped advancing, for example, due to a recovery conflict, the write_lag and flush_lag columns in pg_stat_replication would initially update but then stop progressing. This prevented users from correctly monitoring replication lag. The problem occurred because when any LSN stopped updating, the lag tracker's cyclic buffer became full (the write head reached the slowest read head). In that state, the lag tracker could no longer compute round-trip lag values correctly. This commit fixes the issue by handling the slowest read entry (the one causing the buffer to fill) as a separate overflow entry and freeing space so the write and other read heads can continue advancing in the buffer. As a result, write_lag and flush_lag now continue updating even if the reported replay LSN remains stalled. Backpatch to all supported versions. Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Chao Li <lic@highgo.com> Reviewed-by: Shinya Kato <shinya11.kato@gmail.com> Reviewed-by: Xuneng Zhou <xunengzhou@gmail.com> Discussion: https://postgr.es/m/CAHGQGwGdGQ=1-X-71Caee-LREBUXSzyohkoQJd4yZZCMt24C0g@mail.gmail.com Backpatch-through: 13	2025-10-22 11:27:15 +09:00
Michael Paquier	2519fa8362	Bump catalog version for new function error_on_null() Oversight in `2b75c38b70`. No comments. Discussion: https://postgr.es/m/aPgu7kwiT4iGo6Ya@paquier.xyz	2025-10-22 10:08:47 +09:00
Michael Paquier	2b75c38b70	Add error_on_null(), checking if the input is the null value This polymorphic function produces an error if the input value is detected as being the null value; otherwise it returns the input value unchanged. This function can for example become handy in SQL function bodies, to enforce that exactly one row was returned. Author: Joel Jacobson <joel@compiler.org> Reviewed-by: Vik Fearing <vik@postgresfriends.org> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/ece8c6d1-2ab1-45d5-ba12-8dec96fc8886@app.fastmail.com Discussion: https://postgr.es/m/de94808d-ed58-4536-9e28-e79b09a534c7@app.fastmail.com	2025-10-22 09:55:17 +09:00
David Rowley	2470ca435c	Use CompactAttribute more often, when possible `5983a4cff` added CompactAttribute for storing commonly used fields from FormData_pg_attribute. `5983a4cff` didn't go to the trouble of adjusting every location where we can use CompactAttribute rather than FormData_pg_attribute, so here we change the remaining ones. There are some locations where I've left the code using FormData_pg_attribute. These are mostly in the ALTER TABLE code. Using CompactAttribute here seems more risky as often the TupleDesc is being changed and those changes may not have been flushed to the CompactAttribute yet. I've also left record_recv(), record_send(), record_cmp(), record_eq() and record_image_eq() alone as it's not clear to me that accessing the CompactAttribute is a win here due to the FormData_pg_attribute still having to be accessed for most cases. Switching the relevant parts to use CompactAttribute would result in having to access both for common cases. Careful benchmarking may reveal that something can be done to make this better, but in absence of that, the safer option is to leave these alone. In ReorderBufferToastReplace(), there was a check to skip attnums < 0 while looping over the TupleDesc. Doing this is redundant since TupleDescs don't store < 0 attnums. Removing that code allows us to move to using CompactAttribute. The change in validateDomainCheckConstraint() just moves fetching the FormData_pg_attribute into the ERROR path, which is cold due to calling errstart_cold() and results in code being moved out of the common path. Author: David Rowley <dgrowleyml@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CAApHDvrMy90o1Lgkt31F82tcSuwRFHq3vyGewSRN=-QuSEEvyQ@mail.gmail.com	2025-10-22 11:36:26 +13:00
Tom Lane	fba60a1b10	Avoid short seeks in pg_restore. If a data block to be skipped over is less than 4kB, just read the data instead of using fseeko(). Experimentation shows that this avoids useless kernel calls --- possibly quite a lot of them, at least with current glibc --- while not incurring any extra I/O, since libc will read 4kB at a time anyway. (There may be platforms where the default buffer size is different from 4kB, but this change seems unlikely to hurt in any case.) We don't expect short data blocks to be common in the wake of `66ec01dc4` and related commits. But older pg_dump files may well contain very short data blocks, and that will likely be a case to be concerned with for a long time. While here, do a little bit of other cleanup in _skipData. Make "buflen" be size_t not int; it can't really exceed the range of int, but comparing size_t and int variables is just asking for trouble. Also, when we initially allocate a buffer for reading skipped data into, make sure it's at least 4kB to reduce the odds that we'll shortly have to realloc it bigger. Author: Dimitrios Apostolou <jimis@gmx.net> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/2edb7a57-b225-3b23-a680-62ba90658fec@gmx.net	2025-10-21 14:10:14 -04:00
Nathan Bossart	b97d8d843a	Add reminder to create .abi-compliance-history. This commit adds a note to RELEASE_CHANGES to remind us to create an .abi-compliance-history file for new major versions. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: David E. Wheeler <david@justatheory.com> Discussion: https://postgr.es/m/aPJ03E2itovDBcKX%40nathan	2025-10-21 12:23:23 -05:00
Jeff Davis	ff53907c35	Make char2wchar() static. Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/0151ad01239e2cc7b3139644358cf8f7b9622ff7.camel@j-davis.com	2025-10-21 09:32:12 -07:00
Jeff Davis	844385d12e	Remove obsolete global database_ctype_is_c. Now that tsearch uses the database default locale, there's no need to track the database CTYPE separately. Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/0151ad01239e2cc7b3139644358cf8f7b9622ff7.camel@j-davis.com	2025-10-21 09:32:04 -07:00
Jeff Davis	e113f9c102	tsearch: use database default collation for parsing. Previously, tsearch used the database's CTYPE setting, which only matches the database default collation if the locale provider is libc. Note that tsearch types (tsvector and tsquery) are not collatable types. The locale affects parsing the original text, which is a lossy process, so a COLLATE clause on the already-parsed value would not make sense. Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/0151ad01239e2cc7b3139644358cf8f7b9622ff7.camel@j-davis.com	2025-10-21 09:31:49 -07:00
Nathan Bossart	776c2c2ae2	Add previous commit to .git-blame-ignore-revs. Backpatch-through: 13	2025-10-21 10:02:19 -05:00
Nathan Bossart	e94a7afe44	Re-pgindent brin.c. Backpatch-through: 13	2025-10-21 09:56:26 -05:00
Álvaro Herrera	b7cc6474e9	Make smgr access for a BufferManagerRelation safer in relcache inval Currently there's no bug, because we have no code path where we invalidate relcache entries where it'd cause a problem. But it's more robust to do it this way in case we introduce such a path later, as some Postgres forks reportedly already have. Author: Daniil Davydov <3danissimo@gmail.com> Reviewed-by: Stepan Neretin <slpmcf@gmail.com> Discussion: https://postgr.es/m/CAJDiXgj3FNzAhV+jjPqxMs3jz=OgPohsoXFj_fh-L+nS+13CKQ@mail.gmail.com	2025-10-21 10:51:55 +03:00
David Rowley	9fd29d7ff4	Fix BRIN 32-bit counter wrap issue with huge tables A BlockNumber (32-bit) might not be large enough to add bo_pagesPerRange to when the table contains close to 2^32 pages. At worst, this could result in a cancellable infinite loop during the BRIN index scan with power-of-2 pagesPerRange, and slow (inefficient) BRIN index scans and scanning of unneeded heap blocks for non power-of-2 pagesPerRange. Backpatch to all supported versions. Author: sunil s <sunilfeb26@gmail.com> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CAOG6S4-tGksTQhVzJM19NzLYAHusXsK2HmADPZzGQcfZABsvpA@mail.gmail.com Backpatch-through: 13	2025-10-21 20:46:14 +13:00
Michael Paquier	e4e496e88c	Fix comment in pg_get_shmem_allocations_numa() The comment fixed in this commit described the function as dealing with database blocks, but in reality it processes shared memory allocations. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://postgr.es/m/aH4DDhdiG9Gi0rG7@ip-10-97-1-34.eu-west-3.compute.internal Backpatch-through: 18	2025-10-21 16:12:30 +09:00
Richard Guo	18d2614093	Fix pushdown of degenerate HAVING clauses `67a54b9e8` taught the planner to push down HAVING clauses even when grouping sets are present, as long as the clause does not reference any columns that are nullable by the grouping sets. However, there was an oversight: if any empty grouping sets are present, the aggregation node can produce a row that did not come from the input, and pushing down a HAVING clause in this case may cause us to fail to filter out that row. Currently, non-degenerate HAVING clauses are not pushed down when empty grouping sets are present, since the empty grouping sets would nullify the vars they reference. However, degenerate (variable-free) HAVING clauses are not subject to this restriction and may be incorrectly pushed down. To fix, explicitly check for the presence of empty grouping sets and retain degenerate clauses in HAVING when they are present. This ensures that we don't emit a bogus aggregated row. A copy of each such clause is also put in WHERE so that query_planner() can use it in a gating Result node. To facilitate this check, this patch expands the groupingSets tree of the query to a flat list of grouping sets before applying the HAVING pushdown optimization. This does not add any additional planning overhead, since we need to do this expansion anyway. In passing, make a small tweak to preprocess_grouping_sets() by reordering its initial operations a bit. Backpatch to v18, where this issue was introduced. Reported-by: Yuhang Qiu <iamqyh@gmail.com> Author: Richard Guo <guofenglinux@gmail.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/0879D9C9-7FE2-4A20-9593-B23F7A0B5290@gmail.com Backpatch-through: 18	2025-10-21 12:35:36 +09:00
Michael Paquier	29b039e916	Fix POSIX compliance in pgwin32_unsetenv() for "name" argument pgwin32_unsetenv() (compatibility routine of unsetenv() on Windows) lacks the input validation that its sibling pgwin32_setenv() has. Without these checks, calling unsetenv() with incorrect names crashes on WIN32. However, invalid names should be handled, failing on EINVAL. This commit adds the same checks as setenv() to fail with EINVAL for a "name" set to NULL, an empty string, or if '=' is included in the value, per POSIX requirements. Like `7ca37fb040`, backpatch down to v14. pgwin32_unsetenv() is defined on REL_13_STABLE, but with the branch going EOL soon and the lack of setenv() there for WIN32, nothing is done for v13. Author: Bryan Green <dbryan.green@gmail.com> Discussion: https://postgr.es/m/b6a1e52b-d808-4df7-87f7-2ff48d15003e@gmail.com Backpatch-through: 14	2025-10-21 08:05:28 +09:00
Masahiko Sawada	4bea91f21f	Support COPY TO for partitioned tables. Previously, COPY TO command didn't support directly specifying partitioned tables so users had to use COPY (SELECT ...) TO variant. This commit adds direct COPY TO support for partitioned tables, improving both usability and performance. Performance tests show it's faster than the COPY (SELECT ...) TO variant as it avoids the overheads of query processing and sending results to the COPY TO command. When used with partitioned tables, COPY TO copies the same rows as SELECT * FROM table. Row-level security policies of the partitioned table are applied in the same way as when executing COPY TO on a plain table. Author: jian he <jian.universality@gmail.com> Reviewed-by: vignesh C <vignesh21@gmail.com> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Reviewed-by: Melih Mutlu <m.melihmutlu@gmail.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Atsushi Torikoshi <torikoshia@oss.nttdata.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CACJufxEZt%2BG19Ors3bQUq-42-61__C%3Dy5k2wk%3DsHEFRusu7%3DiQ%40mail.gmail.com	2025-10-20 10:38:52 -07:00
Tom Lane	d74cfe3263	Fix thinko in commit `7d129ba54`. The revised logic in 001_ssltests.pl would fail if openssl doesn't work or if Perl is a 32-bit build, because it had already overwritten $serialno with something inappropriate to use in the eventual match. We could go back to the previous code layout, but it seems best to introduce a separate variable for the output of openssl. Per failure on buildfarm member mamba, which has a 32-bit Perl.	2025-10-20 08:45:57 -04:00
Fujii Masao	762faf702c	pg_dump: Remove unnecessary code for security labels on extensions. Commit `d9572c4e3b` added extension support and made pg_dump attempt to dump security labels on extensions. However, security labels on extensions are not actually supported, so this code was unnecessary. This commit removes it. Suggested-by: Jian He <jian.universality@gmail.com> Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Jian He <jian.universality@gmail.com> Discussion: https://postgr.es/m/CACJufxF8=z0v=888NKKEoTHQ+Jc4EXutFi91BF0fFjgFsZT6JQ@mail.gmail.com	2025-10-20 11:44:11 +09:00
Michael Paquier	a7c3042200	pg_checksums: Use new routine to retrieve data of PG_VERSION Previously, attempting to use pg_checksums on a cluster with a control file whose version does not match with what the tool is able to support would lead to the following error: pg_checksums: error: pg_control CRC value is incorrect This is confusing, because it would look like the control file is corrupted. However, the contents of the control file are correct, pg_checksums not being able to understand how the past control file is shaped. This commit adds a check based on PG_VERSION, using the facility added by `cd0be131ba`, using the same error message as some of the other frontend tools. A note is added in the documentation about the major version requirement. Author: Michael Banck <mbanck@gmx.net> Discussion: https://postgr.es/m/68f1ff21.170a0220.2c9b5f.4df5@mx.google.com	2025-10-20 09:35:22 +09:00
Tom Lane	92cf557ffa	Add static assertion that RELSEG_SIZE fits in an int. Our configure script intended to ensure this, but it supposed that expr(1) would report an error for integer overflow. Maybe that was true when the code was written (commit `3c6248a82` of 2008-05-02), but all the modern expr's I tried will deliver bigger-than-int32 results without complaint. Moreover, if you use --with-segsize-blocks then there's no check at all. Ideally we'd add a test in configure itself to check that the value fits in int, but to do that we'd need to suppose that test(1) handles bigger-than-int32 numbers correctly. Probably modern ones do, but that's an assumption I could do without; and I'm not too trusting about meson either. Instead, let's install a static assertion, so that even people who ignore all the compiler warnings you get from such values will be forced to confront the fact that it won't work. This has been hazardous for awhile, but given that we hadn't heard a complaint about it till now, I don't feel a need to back-patch. Reported-by: Casey Shobe <casey.allen.shobe@icloud.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/C5DC82D6-C76D-4E8F-BC2E-DF03EFC4FA24@icloud.com	2025-10-19 18:28:46 -04:00
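The idea of the commit above, a compile-time check that the segment size fits in an int, can be illustrated with plain C11 `_Static_assert`. This is a sketch under assumptions: the `RELSEG_SIZE` value below is a stand-in (1GB segments of 8kB blocks) rather than a configure-generated one, `relseg_size_as_int` is a hypothetical helper, and PostgreSQL's own assertion macros are not used.

```c
#include <limits.h>

/* Assumed stand-in value: blocks per segment for 1GB segments, 8kB blocks. */
#define RELSEG_SIZE 131072

/*
 * Refuse to compile if the segment size cannot be stored in an int.
 * An oversized literal gets a wider integer type, so the comparison
 * itself does not overflow.
 */
_Static_assert(RELSEG_SIZE > 0 && RELSEG_SIZE <= INT_MAX,
			   "RELSEG_SIZE must fit in an int");

/* Hypothetical helper: safe only because of the assertion above. */
static int
relseg_size_as_int(void)
{
	return (int) RELSEG_SIZE;
}
```

Unlike a compiler warning, a failed static assertion stops the build, which is the property the commit message asks for.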
Tom Lane	277dec6514	Don't rely on zlib's gzgetc() macro. It emerges that zlib's configuration logic is not robust enough to guarantee that the macro will have the same ideas about struct field layout as the library itself does, leading to corruption of zlib's state struct followed by unintelligible failure messages. This hazard has existed for a long time, but we'd not noticed for several reasons: (1) We only use gzgetc() when trying to read a manually-compressed TOC file within a directory-format dump, which is a rarely-used scenario that we weren't even testing before `20ec99589`. (2) No corruption actually occurs unless sizeof(long) is different from sizeof(off_t) and the platform is big-endian. (3) Some platforms have already fixed the configuration instability, at least sufficiently for their environments. Despite (3), it seems foolish to assume that the problem isn't going to be present in some environments for a long time to come. Hence, avoid relying on this macro. We can just #undef it and fall back on the underlying function of the same name. Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/2122679.1760846783@sss.pgh.pa.us Backpatch-through: 13	2025-10-19 14:36:58 -04:00
Tatsuo Ishii	dd766a441d	Fix Coverity issue reported in commit `2273fa32bc`. Coverity complains that the return value from gettuple_eval_partition (stored in variable "datum") in a do..while loop in WinGetFuncArgInPartition is overwritten when exiting the while loop. This commit tries to fix the issue by changing the gettuple_eval_partition call to: (void) gettuple_eval_partition() explicitly stating that we discard the return value. We are just interested in whether we are inside or outside of partition, NULL or NOT NULL here. Also enhance some comments for easier code reading. Reported-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/aPCOabSE4VfJLaky%40paquier.xyz	2025-10-19 09:29:26 +09:00
Jeff Davis	e533524b23	Add pg_database_locale() to retrieve database default locale. Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/0151ad01239e2cc7b3139644358cf8f7b9622ff7.camel@j-davis.com	2025-10-18 16:25:23 -07:00
Jeff Davis	67a8b49e96	Add pg_iswxdigit(), useful for tsearch. Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/0151ad01239e2cc7b3139644358cf8f7b9622ff7.camel@j-davis.com	2025-10-18 16:25:11 -07:00
Tom Lane	da44d71e79	Allow role created by new test to log in on Windows. We must tell init about each role name we plan to connect as, else SSPI auth fails. Similar to previous patches such as `14793f471`, `973542866`. Oversight in `208927e65`, per buildfarm member drongo. (Although that was back-patched to v13, the test script only exists in v16 and up.)	2025-10-18 18:36:21 -04:00
David Rowley	e3b9e44689	Tidyup truncate_useless_pathkeys() function This removes a few static functions and replaces them with 2 functions which aim to be more reusable. The upper planner's pathkey requirements can be simplified down to operations which require pathkeys in the same order as the pathkeys for the given operation, and operations which can make use of a Path's pathkeys in any order. Here we also add some short-circuiting to truncate_useless_pathkeys(). At any point we discover that all pathkeys are useful to a single operation, we can stop checking the remaining operations as we're not going to be able to find any further useful pathkeys - they're all possibly useful already. Adjusting this seems to warrant trying to put the checks roughly in order of least-expensive-first so that the short-circuits have the most chance of skipping the more expensive checks. In passing clean up has_useful_pathkeys() as it seems to have grown a redundant check for group_pathkeys. This isn't needed as standard_qp_callback will set query_pathkeys if there's any requirement to have group_pathkeys. All this code does is waste run-time effort and take up needless space. Author: David Rowley <dgrowleyml@gmail.com> Reviewed-by: Richard Guo <guofenglinux@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CAApHDvpbsEoTksvW5901MMoZo-hHf78E5up3uDOfkJnxDe_WAw@mail.gmail.com	2025-10-19 10:13:13 +13:00
Álvaro Herrera	615ff828e1	Fix determination of not-null constraint "locality" for inherited columns It is possible to have a non-inherited not-null constraint on an inherited column, but we were failing to preserve such constraints during pg_upgrade where the source is 17 or older, because of a bug in the pg_dump query for it. Oversight in commit `14e87ffa5c`. Fix that query. In passing, touch-up a bogus nearby comment introduced by the same commit. In version 17, make the regression tests leave a table in this situation, so that this scenario is tested in the cross-version upgrade tests of 18 and up. Author: Dilip Kumar <dilipbalaut@gmail.com> Reported-by: Andrew Bille <andrewbille@gmail.com> Bug: #19074 Backpatch-through: 18 Discussion: https://postgr.es/m/19074-ae2548458cf0195c@postgresql.org	2025-10-18 18:18:19 +02:00
Álvaro Herrera	4921a5972a	Fix pg_dump sorting of foreign key constraints Apparently, commit `04bc2c42f7` failed to notice that DO_FK_CONSTRAINT objects require identical handling as DO_CONSTRAINT ones, which causes some pg_upgrade tests in debug builds to fail spuriously. Add that. Author: Álvaro Herrera <alvherre@kurilemu.de> Backpatch-through: 13 Discussion: https://postgr.es/m/202510181201.k6y75v2tpf5r@alvherre.pgsql	2025-10-18 17:50:10 +02:00
David Rowley	5c0a20003b	Fix reset of incorrect hash iterator in GROUPING SETS queries This fixes an unlikely issue when fetching GROUPING SET results from their internally stored hash tables. It was possible in rare cases that the hash iterator would be set up incorrectly which could result in a crash. This was introduced in `4d143509c`, so backpatch to v18. Many thanks to Yuri Zamyatin for reporting and helping to debug this issue. Bug: #19078 Reported-by: Yuri Zamyatin <yuri@yrz.am> Author: David Rowley <dgrowleyml@gmail.com> Reviewed-by: Jeff Davis <pgsql@j-davis.com> Discussion: https://postgr.es/m/19078-dfd62f840a2c0766@postgresql.org Backpatch-through: 18	2025-10-18 16:07:04 +13:00
David Rowley	86d118f9a6	Englishify comment wording Switch to using the English word here rather than using a verbified function name. The full word still fits within a single comment line, so it's probably better just to use that instead of trying to shorten it, which might cause confusion. Author: Rafia Sabih <rafia.pghackers@gmail.com> Discussion: https://postgr.es/m/CA+FpmFe7LnRF2NA_QfARjkSWme4mNt+Udwbh2Yb=zZm35Ji31w@mail.gmail.com	2025-10-18 12:50:14 +13:00
Tomas Vondra	b85c4700fc	Fix hashjoin memory balancing logic Commit `a1b4f289be` improved the hashjoin sizing to also consider the memory used by BufFiles for batches. The code however had multiple issues, making it ineffective or not working as expected in some cases. * The amount of memory needed by buffers was calculated using uint32, so it would overflow for nbatch >= 262144. If this happened the loop would exit prematurely and the memory usage would not be reduced. The nbatch overflow is fixed by reworking the condition to not use a multiplication at all, so there's no risk of overflow. An explicit cast was added to a similar calculation in ExecHashIncreaseBatchSize. * The loop adjusting the nbatch value used hash_table_bytes to calculate the old/new size, but then updated only space_allowed. The consequence is the total memory usage was not reduced, but all the memory saved by reducing the number of batches was used for the internal hash table. This was fixed by using only space_allowed. This is also more correct, because hash_table_bytes does not account for skew buckets. * The code was also doubling multiple parameters (e.g. the number of buckets for hash table), but was missing overflow protections. The loop now checks for overflow, and terminates if needed. It'd be possible to cap the value and continue the loop, but it's not worth the complexity. And the overflow implies the in-memory hash table is already very large anyway. While at it, rework the comment explaining how the memory balancing works, to make it more concise and easier to understand. The initial nbatch overflow issue was reported by Vaibhav Jain. The other issues were noticed by me and Melanie Plageman. Fix by me, with a lot of review and feedback by Melanie. Backpatch to 18, where the hashjoin memory balancing was introduced. Reported-by: Vaibhav Jain <jainva@google.com> Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Backpatch-through: 18 Discussion: https://postgr.es/m/CABa-Az174YvfFq7rLS+VNKaQyg7inA2exvPWmPWqnEn6Ditr_Q@mail.gmail.com	2025-10-17 22:21:50 +02:00
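The uint32 overflow at nbatch >= 262144 mentioned in the hashjoin commit above can be reproduced with a little arithmetic. The sketch assumes two 8kB file buffers per batch, which is what makes 262144 * 2 * 8192 equal exactly 2^32. All function names are hypothetical, and the division-based test is only one way to avoid the multiplication; the committed condition may be written differently.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed size of one batch file buffer, in bytes. */
#define BUFFILE_BYTES 8192u

/* Overflow-prone form: wraps to 0 once nbatch reaches 262144. */
static uint32_t
batch_buffer_bytes_u32(uint32_t nbatch)
{
	return (uint32_t) (nbatch * 2u * BUFFILE_BYTES);
}

/* Widening one operand before multiplying keeps the full result. */
static uint64_t
batch_buffer_bytes_u64(uint32_t nbatch)
{
	return (uint64_t) nbatch * 2u * BUFFILE_BYTES;
}

/*
 * Multiplication-free test for "buffer memory exceeds limit".
 * For integers, nbatch * k > limit holds exactly when
 * nbatch > limit / k (with truncating division), so nothing can overflow.
 */
static bool
buffers_exceed_limit(uint32_t nbatch, size_t limit)
{
	return nbatch > limit / (2u * BUFFILE_BYTES);
}
```

With the wrapped value, a loop guarded by "buffer bytes > limit" sees 0 bytes and exits early, which matches the premature loop exit the commit message describes.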
Masahiko Sawada	fd53065013	Remove unused data_bufsz from DecodedBkpBlock struct. Author: Mikhail Gribkov <youzhick@gmail.com> Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com> Discussion: https://postgr.es/m/CAMEv5_sxuaiAfSy1ZyN%3D7UGbHg3C10cwHhEk8nXEjiCsBVs4vQ%40mail.gmail.com	2025-10-17 11:28:54 -07:00
Nathan Bossart	208927e656	Fix privilege checks for pg_prewarm() on indexes. pg_prewarm() currently checks for SELECT privileges on the target relation. However, indexes do not have access rights of their own, so a role may be denied permission to prewarm an index despite having the SELECT privilege on its parent table. This commit fixes this by locking the parent table before the index (to avoid deadlocks) and checking for SELECT on the parent table. Note that the code is largely borrowed from amcheck_lock_relation_and_check(). An obvious downside of this change is the extra AccessShareLock on the parent table during prewarming, but that isn't expected to cause too much trouble in practice. Author: Ayush Vatsa <ayushvatsa1810@gmail.com> Co-authored-by: Nathan Bossart <nathandbossart@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Jeff Davis <pgsql@j-davis.com> Discussion: https://postgr.es/m/CACX%2BKaMz2ZoOojh0nQ6QNBYx8Ak1Dkoko%3DD4FSb80BYW%2Bo8CHQ%40mail.gmail.com Backpatch-through: 13	2025-10-17 11:36:50 -05:00
Tom Lane	a6113dc1da	Improve TAP tests by replacing ok() with better Test::More functions Transpose the changes made by commit `fabb33b35` in 002_pg_dump.pl into its recently-created clone 006_pg_dump_compress.pl.	2025-10-17 11:25:53 -04:00
Daniel Gustafsson	7d129ba54e	Avoid warnings in tests when openssl binary isn't available The SSL test for pg_stat_ssl tries to exactly match the serial from the certificate by extracting it with the openssl binary. If that fails due to the binary not being available, a fallback match is used, but the attempt to execute a missing binary adds a warning to the output which readers can mistake for a failure in the test. Fix by only attempting if the openssl binary was found by autoconf/meson. Backpatch down to v16 where commit `c8e4030d1b` made the test use the OPENSSL variable from autoconf/meson instead of a hard-coded value. Author: Daniel Gustafsson <daniel@yesql.se> Reported-by: Christoph Berg <myon@debian.org> Discussion: https://postgr.es/m/aNPSp1-RIAs3skZm@msg.df7cb.de Backpatch-through: 16	2025-10-17 14:21:26 +02:00
Peter Eisentraut	e1a912c86d	Change config_generic.vartype to be initialized at compile time Previously, this was initialized at run time so that it did not have to be maintained by hand in guc_tables.c. But since that table is now generated anyway, we might as well generate this bit as well. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/8fdfb91e-60fb-44fa-8df6-f5dea47353c9@eisentraut.org	2025-10-17 10:33:54 +02:00
Peter Eisentraut	0a7bde4610	Use designated initializers for guc_tables This makes the generating script simpler and the output easier to read. In the future, it will make it easier to reorder and rearrange the underlying C structures. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/8fdfb91e-60fb-44fa-8df6-f5dea47353c9@eisentraut.org	2025-10-17 10:29:42 +02:00
Daniel Gustafsson	0d82163958	ecpg: check return value of replace_variables() The function returns false if it fails to allocate memory, so make sure to check the return value in callsites. Author: Aleksander Alekseev <aleksander@tigerdata.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://postgr.es/m/CAJ7c6TNPrU8ZxgdfN3PyGY1tzo0bgszx+KkqW0Z7zt3heyC1GQ@mail.gmail.com	2025-10-17 10:03:15 +02:00
Daniel Gustafsson	6aa184c80f	Replace defunct URL with stable archive.org URL in rbtree.c The URL for "Sorting and Searching Algorithms: A Cookbook" by Thomas Niemann has started returning 404, and since we refer to the page for license terms this replaces the now defunct link with one to the copy on archive.org. Author: Chao Li <lic@highgo.com> Discussion: https://postgr.es/m/6DED3DEF-875E-4D1D-8F8F-7353D5AF7B79@gmail.com	2025-10-17 09:38:49 +02:00
Michael Paquier	fabb33b351	Improve TAP tests by replacing ok() with better Test::More functions The TAP tests whose ok() calls are changed in this commit were relying on perl operators, rather than equivalents available in Test::More. For example, rather than the following: ok($data =~ qr/expr/m, "expr matching"); ok($data !~ qr/expr/m, "expr not matching"); The new test code uses this equivalent: like($data, qr/expr/m, "expr matching"); unlike($data, qr/expr/m, "expr not matching"); A huge benefit of the new formulation is that it is possible to know about the values we are checking if a failure happens, making debugging easier, should the test runs happen in the buildfarm, in the CI or locally. This change leads to more test code overall as perltidy likes to make the code pretty the way it is in this commit. Author: Sadhuprasad Patro <b.sadhu@gmail.com> Discussion: https://postgr.es/m/CAFF0-CHhwNx_Cv2uy7tKjODUbeOgPrJpW4Rpf1jqB16_1bU2sg@mail.gmail.com	2025-10-17 14:39:09 +09:00
Fujii Masao	e64aa1a39d	doc: Clarify when backend_xmin in pg_stat_replication can be NULL. Improve the documentation of pg_stat_replication to explain when the backend_xmin column becomes NULL. This happens when a replication slot is used (the xmin is then shown in pg_replication_slots) or when hot_standby_feedback is disabled. Author: Renzo Dani <arons7@gmail.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/CA+XOKQAMXzskpdUmj2sg03_5fmiXc2Gs0r3TX1_rmcFcqh+=xQ@mail.gmail.com	2025-10-17 14:03:42 +09:00
Michael Paquier	d1b80a31ed	Fix matching check in recovery test 042_low_level_backup 042_low_level_backup compared the result of a query two times with a comparison operator based on an integer, while the result should be compared with a string. The outcome of the tests is currently not impacted by this change. However, it could be possible that the tests fail to detect future issues if the query results become different, for some reason. Oversight in `99b4a63bef`. Author: Sadhuprasad Patro <b.sadhu@gmail.com> Discussion: https://postgr.es/m/CAFF0-CHhwNx_Cv2uy7tKjODUbeOgPrJpW4Rpf1jqB16_1bU2sg@mail.gmail.com Backpatch-through: 17	2025-10-17 13:06:04 +09:00
Michael Paquier	d372888ade	pg_createsubscriber: Fix matching check in TAP test 040_pg_createsubscriber has been calling safe_psql(), which returns the result of a SQL query, with ok() without checking the result generated (in this case 't', for a number of publications). The outcome of the tests is currently not impacted by this change. However, it could be possible that the test fails to detect future issues if the query results become different. The test is rewritten so that the number of publications is checked. This is not the fix suggested originally by the author, but this is more reliable in the long run. Oversight in `e5aeed4b80`. Author: Sadhuprasad Patro <b.sadhu@gmail.com> Discussion: https://postgr.es/m/CAFF0-CHhwNx_Cv2uy7tKjODUbeOgPrJpW4Rpf1jqB16_1bU2sg@mail.gmail.com Backpatch-through: 18	2025-10-17 13:01:14 +09:00
Álvaro Herrera	6ad9378c9a	Fix update-po for the PGXS case The original formulation failed to take into account the fact that for the PGXS case, the source dir is not $(top_srcdir), so it ended up not doing anything. Handle it explicitly. Author: Ryo Matsumura <matsumura.ryo@fujitsu.com> Reviewed-by: Bryan Green <dbryan.green@gmail.com> Backpatch-through: 13 Discussion: https://postgr.es/m/TYCPR01MB113164770FB0B0BE6ED21E68EE8DCA@TYCPR01MB11316.jpnprd01.prod.outlook.com	2025-10-16 20:21:05 +02:00
Tom Lane	20ec995892	Add more TAP test coverage for pg_dump. Add a test case to cover pg_dump with --compress=none. This brings the coverage of compress_none.c up from about 64% to 90%, in particular covering the new code added in a previous patch. Include compression of toc.dat in manually-compressed test cases. We would have found the bug fixed in commit `a239c4a0c` much sooner if we'd done this. As far as I can tell, this doesn't reduce test coverage at all, since there are other tests of directory format that still use an uncompressed toc.dat. Widen the wide row used to verify correct (de)compression. Commit `1a05c1d25` advises us (not without reason) to ensure that this test case fully fills DEFAULT_IO_BUFFER_SIZE, so that loops within the compression logic will iterate completely. To follow that advice with the proposed DEFAULT_IO_BUFFER_SIZE of 128K, we need something close to this. This does indeed increase the reported code coverage by a few lines. While here, fix a glitch that I noticed in testing: the $glob_patterns tests were incapable of failing, because glob() will return 'foo' as 'foo' whether there is a matching file or not. (Indeed, the stanza just above that one relies on that.) In my testing, this patch adds approximately as much runtime as was saved by the previous patch, so that it's about a wash compared to the old code. However, we get better test coverage. Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/3515357.1760128017@sss.pgh.pa.us	2025-10-16 12:52:10 -04:00
Tom Lane	9dcf7f1172	Split 002_pg_dump.pl into two test files. Add a new test script 006_pg_dump_compress.pl, containing just the pg_dump tests specifically concerned with compression, and remove those tests from 002_pg_dump.pl. We can also drop some infrastructure in 002_pg_dump.pl that was used only for these tests. The point of this is to avoid the cost of running these test cases over and over in all the scenarios (runs) that 002_pg_dump.pl exercises. We don't learn anything more about the behavior of the compression code that way, and we expend significant amounts of time, since one of these test cases is quite large and due to get larger. The intent of this specific patch is to provide exactly the same coverage as before, except that I went back to using --no-sync in all the test runs moved over to 006_pg_dump_compress.pl. I think that avoiding that had basically been cargo-culted into these test cases as a result of modeling them on the defaults_custom_format test case; again, doing that over and over isn't going to teach us anything new. Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/3515357.1760128017@sss.pgh.pa.us	2025-10-16 12:51:55 -04:00
Tom Lane	66ec01dc41	Align the data block sizes of pg_dump's various compression modes. After commit `fe8192a95`, compress_zstd.c tends to produce data block sizes around 128K, and we don't really have any control over that unless we want to overrule ZSTD_CStreamOutSize(). Which seems like a bad idea. But let's try to align the other compression modes to produce block sizes roughly comparable to that, so that pg_restore's skip-data performance isn't enormously different for different modes. gzip compression can be brought in line simply by setting DEFAULT_IO_BUFFER_SIZE = 128K, which this patch does. That increases some unrelated buffer sizes, but none of them seem problematic for modern platforms. lz4's idea of appropriate block size is highly nonlinear: if we just increase DEFAULT_IO_BUFFER_SIZE then the output blocks end up around 200K. I found that adjusting the slop factor in LZ4State_compression_init was a not-too-ugly way of bringing that number roughly into line. With compress = none you get data blocks the same sizes as the table rows, which seems potentially problematic for narrow tables. Introduce a layer of buffering to make that case match the others. Comments in compress_io.h and 002_pg_dump.pl suggest that if we increase DEFAULT_IO_BUFFER_SIZE then we need to increase the amount of data fed through the tests in order to improve coverage. I've not done that here, leaving it for a separate patch. Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/3515357.1760128017@sss.pgh.pa.us	2025-10-16 12:50:18 -04:00
Nathan Bossart	812221b204	Remove partColsUpdated. This information appears to have been unused since commit `c5b7ba4e67`. We could not find any references in third-party code, either. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/aO_CyFRpbVMtgJWM%40nathan	2025-10-16 11:31:38 -05:00
Amit Kapila	41c674d2e3	Refactor logical worker synchronization code into a separate file. To support the upcoming addition of a sequence synchronization worker, this patch extracts common synchronization logic shared by table sync workers and the new sequence sync worker into a dedicated file. This modularization improves code reuse, maintainability, and clarity in the logical workers framework. Author: vignesh C <vignesh21@gmail.com> Author: Hou Zhijie <houzj.fnst@fujitsu.com> Reviewed-by: shveta malik <shveta.malik@gmail.com> Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com> Reviewed-by: Peter Smith <smithpb2250@gmail.com> Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/CAA4eK1LC+KJiAkSrpE_NwvNdidw9F2os7GERUeSxSKv71gXysQ@mail.gmail.com	2025-10-16 05:10:50 +00:00
Amit Langote	905e932f09	Fix EPQ crash from missing partition directory in EState EvalPlanQualStart() failed to propagate es_partition_directory into the child EState used for EPQ rechecks. When execution time partition pruning ran during the EPQ scan, executor code dereferenced a NULL partition directory and crashed. Previously, propagating es_partition_directory into the EPQ EState was unnecessary because CreatePartitionPruneState(), which sets it on demand, also initialized the exec-pruning context. After commit `d47cbf474`, CreatePartitionPruneState() now initializes only the init-time pruning context, leaving exec-pruning context initialization to ExecInitNode(). Since EvalPlanQualStart() runs only ExecInitNode() and not CreatePartitionPruneState(), it can encounter a NULL es_partition_directory. Other executor fields initialized during CreatePartitionPruneState() are already copied into the child EState thanks to commit `8741e48e5d`, but es_partition_directory was missed. Fix by borrowing the parent estate's es_partition_directory in EvalPlanQualStart(), and by clearing that field in EvalPlanQualEnd() so the parent remains responsible for freeing the directory. Add an isolation test permutation that triggers EPQ with execution-time partition pruning, the case that reproduces this crash. Bug: #19078 Reported-by: Yuri Zamyatin <yuri@yrz.am> Diagnosed-by: David Rowley <dgrowleyml@gmail.com> Author: David Rowley <dgrowleyml@gmail.com> Co-authored-by: Amit Langote <amitlangote09@gmail.com> Discussion: https://postgr.es/m/19078-dfd62f840a2c0766@postgresql.org Backpatch-through: 18	2025-10-16 14:01:44 +09:00
Michael Paquier	02c171f63f	Override log_error_verbosity to "default" in test 009_log_temp_files Per report from buildfarm member prion. The CI does not use this parameter, and this buildfarm member sets log_error_verbosity to "verbose". This would generate extra LOCATION entries in the logs, causing the regexps of the test to fail. Trying to support log_error_verbosity=verbose in the test would mean to tweak all the regexps used in the test to detect an optional set of LOCATION lines, at least. This would not improve the coverage, and forcing the GUC value is simpler. Oversight in `76bba03312`. Discussion: https://postgr.es/m/aPBaNNGiYT3xMBN1@paquier.xyz	2025-10-16 11:39:45 +09:00
Michael Paquier	76bba03312	Add tests for logging of temporary file removal and statement Temporary file usage is sometimes attributed to the wrong query in the logs output. One identified reason is that unnamed portal cleanup (and consequently temp file logging) happens during the next BIND message, after debug_query_string has already been updated to the new query. Dropping an unnamed portal in the next BIND message is a rather old protocol behavior (`fe19e56c57`, also mentioned in the docs). log_temp_files is a bit newer than that, as of `be8a431881`. This commit adds tests to track which query is displayed next to the temporary file(s) removed when a portal is dropped, and in some cases if a query is displayed or not. We have not concluded how to improve the situation yet; these tests will at least help in checking what changes in the logs depending on the proposal discussed and how it affects the scenarios tracked by this new test. Author: Sami Imseih <samimseih@gmail.com> Author: Frédéric Yhuel <frederic.yhuel@dalibo.com> Reviewed-by: Mircea Cadariu <cadariu.mircea@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/3d07ee43-8855-42db-97e0-bad5db82d972@dalibo.com	2025-10-16 09:02:51 +09:00
Nathan Bossart	079480dc20	Fix lookup code for REINDEX INDEX. This commit adjusts RangeVarCallbackForReindexIndex() to handle an extremely unlikely race condition involving concurrent OID reuse. In short, if REINDEX INDEX is executed at the same time that the index is re-created with the same name and OID but a different parent table OID, we might lock the wrong parent table. To fix, simply detect when this happens and emit an ERROR. Unfortunately, we can't gracefully handle this situation because we will have already locked the index, and we must lock the parent table before the index to avoid deadlocks. While at it, I've replaced all but one early return in this callback function with ERRORs that should be unreachable. While I haven't verified the presence of a live bug, the checks in question appear to be unnecessary, and the early returns seem prone to breaking the parent table locking code in subtle ways. If nothing else, this simplifies the code a bit. This is a bug fix and could be back-patched, but given the presumed rarity of the race condition and the lack of reports, I'm not going to bother. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Jeff Davis <pgsql@j-davis.com> Discussion: https://postgr.es/m/Z8zwVmGzXyDdkAXj%40nathan	2025-10-15 16:32:40 -05:00
Jeff Davis	af164f31b9	Add pg_iswalpha() and related functions. Per-character pg_locale_t APIs. Useful for tsearch parsing and potentially other places. Significant overlap with the regc_wc_isalpha() and related functions in regc_pg_locale.c, but this change leaves those intact for now. Discussion: https://postgr.es/m/0151ad01239e2cc7b3139644358cf8f7b9622ff7.camel@j-davis.com	2025-10-15 12:54:01 -07:00
Nathan Bossart	688dc6299a	Fix lookups in pg_{clear,restore}_{attribute,relation}_stats(). Presently, these functions look up the relation's OID, lock it, and then check privileges. Not only does this approach provide no guarantee that the locked relation matches the arguments of the lookup, but it also allows users to briefly lock relations for which they do not have privileges, which might enable denial-of-service attacks. This commit adjusts these functions to use RangeVarGetRelidExtended(), which is purpose-built to avoid both of these issues. The new RangeVarGetRelidCallback function is somewhat complicated because it must handle both tables and indexes, and for indexes, we must check privileges on the parent table and lock it first. Also, it needs to handle a couple of extremely unlikely race conditions involving concurrent OID reuse. A downside of this change is that the coding doesn't allow for locking indexes in AccessShare mode anymore; everything is locked in ShareUpdateExclusive mode. Per discussion, the original choice of lock levels was intended for a now defunct implementation that used in-place updates, so we believe this change is okay. Reviewed-by: Jeff Davis <pgsql@j-davis.com> Discussion: https://postgr.es/m/Z8zwVmGzXyDdkAXj%40nathan Backpatch-through: 18	2025-10-15 12:47:33 -05:00
Peter Eisentraut	5f4c3b33a9	Change reset_extra into a config_generic common field This is not specific to the GUC parameter type, so it can be part of the generic struct rather than the type-specific struct (like the related "extra" field). This allows for some code simplifications. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/8fdfb91e-60fb-44fa-8df6-f5dea47353c9@eisentraut.org	2025-10-15 15:20:28 +02:00
Peter Eisentraut	dd3ae37830	Add log_autoanalyze_min_duration The log output functionality of log_autovacuum_min_duration applies to both VACUUM and ANALYZE, so it is not possible to separate the VACUUM and ANALYZE log output thresholds. Logs are likely to be output only for VACUUM and not for ANALYZE. Therefore, we decided to separate the threshold for log output of VACUUM by autovacuum (log_autovacuum_min_duration) and the threshold for log output of ANALYZE by autovacuum (log_autoanalyze_min_duration). Author: Shinya Kato <shinya11.kato@gmail.com> Reviewed-by: Kasahara Tatsuhito <kasaharatt@oss.nttdata.com> Discussion: https://www.postgresql.org/message-id/flat/CAOzEurQtfV4MxJiWT-XDnimEeZAY+rgzVSLe8YsyEKhZcajzSA@mail.gmail.com	2025-10-15 14:31:12 +02:00
Etsuro Fujita	12609fbacb	Fix EvalPlanQual handling of foreign/custom joins in ExecScanFetch. If inside an EPQ recheck, ExecScanFetch would run the recheck method function for foreign/custom joins even if they aren't descendant nodes in the EPQ recheck plan tree, which is problematic at least in the foreign-join case, because such a foreign join isn't guaranteed to have an alternative local-join plan required for running the recheck method function; in the postgres_fdw case this could lead to a segmentation fault or an assert failure in an assert-enabled build when running the recheck method function. Even if inside an EPQ recheck, any scan nodes that aren't descendant ones in the EPQ recheck plan tree should be normally processed by using the access method function; fix by modifying ExecScanFetch so that if inside an EPQ recheck, it runs the recheck method function for foreign/custom joins that are descendant nodes in the EPQ recheck plan tree as before and runs the access method function for foreign/custom joins that aren't. This fix also adds to postgres_fdw an isolation test for an EPQ recheck that caused issues stated above. Oversight in commit `385f337c9`. Reported-by: Kristian Lejao <kristianlejao@gmail.com> Author: Masahiko Sawada <sawada.mshk@gmail.com> Co-authored-by: Etsuro Fujita <etsuro.fujita@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Etsuro Fujita <etsuro.fujita@gmail.com> Discussion: https://postgr.es/m/CAD21AoBpo6Gx55FBOW+9s5X=nUw3Xpq64v35fpDEKsTERnc4TQ@mail.gmail.com Backpatch-through: 13	2025-10-15 17:15:00 +09:00
Peter Eisentraut	29dc7a6687	Add some const qualifiers in guc-related source files, in anticipation of some further restructuring. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/8fdfb91e-60fb-44fa-8df6-f5dea47353c9@eisentraut.org	2025-10-15 10:05:53 +02:00
Peter Eisentraut	1a79518888	Modernize some for loops in guc-related source files, in anticipation of some further restructuring. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/8fdfb91e-60fb-44fa-8df6-f5dea47353c9@eisentraut.org	2025-10-15 10:00:37 +02:00
Peter Eisentraut	594ba21bce	plpython: Remove support for major version conflict detection This essentially reverts commit `866566a690`, which installed safeguards against loading plpython2 and plpython3 into the same process. We don't support plpython2 anymore, so this is obsolete. The Python and PL/Python initialization now happens again in _PG_init() rather than the first time a PL/Python call handler is invoked. (Often, these will be very close together.) I kept the separate PLy_initialize() function introduced by `866566a690` to keep _PG_init() a bit modular. Reviewed-by: Mario González Troncoso <gonzalemario@gmail.com> Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/9eb9feb6-1df3-4f0c-a0dc-9bcf35273111%40eisentraut.org	2025-10-15 08:18:29 +02:00
Amit Kapila	2436b8c047	Standardize use of REFRESH PUBLICATION in code and messages. This patch replaces ALTER SUBSCRIPTION REFRESH with ALTER SUBSCRIPTION REFRESH PUBLICATION in comments and error messages to improve clarity and support future extensibility. The change aligns with the upcoming addition of REFRESH SEQUENCES for sequence synchronization. Author: vignesh C <vignesh21@gmail.com> Author: Hou Zhijie <houzj.fnst@fujitsu.com> Reviewed-by: shveta malik <shveta.malik@gmail.com> Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com> Reviewed-by: Peter Smith <smithpb2250@gmail.com> Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/CAA4eK1LC+KJiAkSrpE_NwvNdidw9F2os7GERUeSxSKv71gXysQ@mail.gmail.com	2025-10-15 03:42:27 +00:00
Michael Paquier	fa55be2a50	pg_createsubscriber: Use new routine to retrieve data of PG_VERSION pg_createsubscriber is documented as requiring the same major version as the target clusters. Attempting to use this tool on a cluster where the control file version read does not match with the version compiled with would lead to the following error message: pg_createsubscriber: error: control file appears to be corrupt This is confusing as the control file is correct: only the version expected does not match. This commit integrates pg_createsubscriber with the facility added by `cd0be131ba`, where the contents of PG_VERSION are read and compared with the value of PG_MAJORVERSION_NUM expected by the tool. This puts pg_createsubscriber in line with the documentation, with a better error message when the version does not match. Author: Michael Paquier <michael@paquier.xyz> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Discussion: https://postgr.es/m/aONDWig0bIGilixs@paquier.xyz	2025-10-15 11:11:30 +09:00
Michael Paquier	c6a6cd53d3	pg_resetwal: Use new routine to retrieve data of PG_VERSION pg_resetwal's custom logic to retrieve the version number of a data folder's PG_VERSION can be replaced by the facility introduced in `cd0be131ba`. This removes some code. One thing specific to pg_resetwal is that the first line of PG_VERSION is read and reported in the error report generated when the major version read does not match with the version pg_resetwal has been compiled with. The new logic preserves this property, without changes to either the error message or the data used in the error report. Note that as a chdir() is done within the data folder before checking the data of PG_VERSION, get_pg_version() needs to be tweaked to look for PG_VERSION in the current folder. Author: Michael Paquier <michael@paquier.xyz> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Discussion: https://postgr.es/m/aOiirvWJzwdVCXph@paquier.xyz	2025-10-15 10:09:48 +09:00
Michael Paquier	e4775e42ca	pg_combinebackup: Use new routine to retrieve data of PG_VERSION pg_combinebackup's custom logic to retrieve the version number of a data folder's PG_VERSION can be replaced by the facility introduced in `cd0be131ba`. This removes some code. One thing specific to this tool is that backend versions older than v10 are not supported. The new code does the same checks as the previous code. Author: Michael Paquier <michael@paquier.xyz> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Discussion: https://postgr.es/m/aOiirvWJzwdVCXph@paquier.xyz	2025-10-15 09:54:56 +09:00
Masahiko Sawada	45a7faf130	Revert "pg_createsubscriber: Add log message when no publications exist to drop." This reverts commit `74ac377d75`. The previous change contained a misconception about how publications are cleaned up on the subscriber. The newly added log message could confuse users, particularly when running pg_createsubscriber with --dry-run - users would see a "dropping publication" message immediately followed by a "no publications found" message. Discussion: https://postgr.es/m/CAHut+Pu7xz1LqNvyQyvSHrV0Sw6D=e6T-Jm=gh1MRJrkuWGyBQ@mail.gmail.com	2025-10-14 17:36:11 -07:00
Melanie Plageman	3e4705484e	Make heap_page_is_all_visible independent of LVRelState This function only requires a few fields from LVRelState, so pass them in individually. This change allows calling heap_page_is_all_visible() from code such as pruneheap.c, which does not have access to an LVRelState. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/2wk7jo4m4qwh5sn33pfgerdjfujebbccsmmlownybddbh6nawl%40mdyyqpqzxjek	2025-10-14 17:43:41 -04:00
Melanie Plageman	43b05b38ea	Inline TransactionIdFollows/Precedes[OrEquals]() These functions appeared prominently in a profile of a patch that sets the visibility map on-access. Inline them to remove call overhead and make them cheaper to use in hot paths. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/2wk7jo4m4qwh5sn33pfgerdjfujebbccsmmlownybddbh6nawl%40mdyyqpqzxjek	2025-10-14 17:03:48 -04:00
Melanie Plageman	c8dd6542ba	Add helper for freeze determination to heap_page_prune_and_freeze After scanning the line pointers on a heap page during the first phase of vacuum, we use the information collected to decide whether to use the assembled freeze plans. Move this decision logic into a helper function to improve readability. While here, rename a PruneState member and disambiguate some local variables in heap_page_prune_and_freeze(). Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/2wk7jo4m4qwh5sn33pfgerdjfujebbccsmmlownybddbh6nawl%40mdyyqpqzxjek	2025-10-14 15:08:50 -04:00
Masahiko Sawada	74ac377d75	pg_createsubscriber: Add log message when no publications exist to drop. When specifying --clean=publication to pg_createsubscriber, it drops all existing publications with a log message "dropping all existing publications in database "testdb"". Add a new log message "no publications found" when there are no publications to drop, making the progress more transparent to users. Author: Peter Smith <smithpb2250@gmail.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Discussion: https://postgr.es/m/CAHut+Ptm+WJwbbYXhC0s6FP_98KzZCR=5CPu8F8N5uV8P7BpqA@mail.gmail.com	2025-10-14 11:45:29 -07:00
Jeff Davis	8efe982fe2	regc_pg_locale.c: rename some static functions. Use the more specific prefix "regc_" rather than the generic prefix "pg_". A subsequent commit will create generic versions of some of these functions that can be called from other modules. Discussion: https://postgr.es/m/0151ad01239e2cc7b3139644358cf8f7b9622ff7.camel@j-davis.com	2025-10-14 11:04:04 -07:00
Nathan Bossart	c9b299f6df	dblink: Avoid locking relation before privilege check. The present coding of dblink's get_rel_from_relname() predates the introduction of RangeVarGetRelidExtended(), which provides a way to check permissions before locking the relation. This commit adjusts get_rel_from_relname() to use that function. Reviewed-by: Jeff Davis <pgsql@j-davis.com> Discussion: https://postgr.es/m/aOgmi6avE6qMw_6t%40nathan	2025-10-14 12:20:48 -05:00
Melanie Plageman	4a8fb58671	Bump XLOG_PAGE_MAGIC after xl_heap_prune change `add323da40` altered xl_heap_prune, changing the WAL format, but neglected to bump XLOG_PAGE_MAGIC. Do so now. Author: Melanie Plageman <melanieplageman@gmail.com> Reported-by: Kirill Reshke <reshkekirill@gmail.com> Reported-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/aO3Gw6hCAZFUd5ab%40paquier.xyz	2025-10-14 10:13:10 -04:00
Tatsuo Ishii	5f3808646f	Use ereport rather than elog in WinCheckAndInitializeNullTreatment. Previously WinCheckAndInitializeNullTreatment() used elog() to emit an error message. ereport() should be used instead because it's a user-facing error. Also use the existing get_func_name() to get a function's name, rather than a custom implementation. In addition, add an assertion to validate the winobj parameter, as the other window function APIs do. Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Author: Tatsuo Ishii <ishii@postgresql.org> Reviewed-by: Chao Li <lic@highgo.com> Discussion: https://postgr.es/m/2952409.1760023154%40sss.pgh.pa.us	2025-10-14 19:15:24 +09:00
Richard Guo	1206df04c2	Rename apply_at to apply_agg_at for clarity The field name "apply_at" in RelAggInfo was a bit ambiguous. Rename it to "apply_agg_at" to improve clarity and make its purpose clearer. Per complaint from David Rowley, Robert Haas. Suggested-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CA+TgmoZ0KR2_XCWHy17=HHcQ3p2Mamc9c6Dnnhf1J6wPYFD9ng@mail.gmail.com	2025-10-14 16:35:22 +09:00
Michael Paquier	a7d8052910	pg_upgrade: Use new routine to retrieve data of PG_VERSION Unsurprisingly, this shaves code. get_major_server_version() can be replaced by the new routine added by `cd0be131ba`, with the contents of PG_VERSION stored in an allocated buffer instead of a fixed-sized one. Author: Michael Paquier <michael@paquier.xyz> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Discussion: https://postgr.es/m/aOiirvWJzwdVCXph@paquier.xyz	2025-10-14 16:27:13 +09:00
Michael Paquier	cd0be131ba	Introduce frontend API able to retrieve the contents of PG_VERSION get_pg_version() is able to return a version number that can be used for comparisons based on PG_VERSION_NUM. A macro is added to convert the result to a major version number, to work with PG_MAJORVERSION_NUM. It is possible to pass to the routine an optional argument, where the contents retrieved from PG_VERSION are saved. This requirement matters for some of the frontend code (one example: pg_upgrade wants that for tablespace paths with a version number strictly older than v10). This will be used by a set of follow-up patches, to be consumed in various frontend tools that duplicate logic similar to what this new routine does, like: - pg_resetwal - pg_combinebackup - pg_createsubscriber - pg_upgrade This routine supports both the post-v10 version number and the older flavor (aka 9.6), as required at least by pg_upgrade. Author: Michael Paquier <michael@paquier.xyz> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Discussion: https://postgr.es/m/aOiirvWJzwdVCXph@paquier.xyz	2025-10-14 16:20:42 +09:00
Michael Paquier	1c05fe11ab	Fix version number calculation for data folder flush in pg_combinebackup The version number calculated by read_pg_version_file() is multiplied once by 10000, to be able to do comparisons based on PG_VERSION_NUM or equivalents with a minor version included. However, the version number given to sync_pgdata() was multiplied by 10000 a second time, leading to an overestimated number. This issue was harmless (still incorrect) as pg_combinebackup does not support versions of Postgres older than v10, and sync_pgdata() only includes a version check due to the rename of pg_xlog/ to pg_wal/. This folder rename happened in the development cycle of v10. This would become a problem if in the future sync_pgdata() is changed to have more version-specific checks. Oversight in `dc21234005`, so backpatch down to v17. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/aOil5d0y87ZM_wsZ@paquier.xyz Backpatch-through: 17	2025-10-14 08:30:54 +09:00
Melanie Plageman	add323da40	Eliminate XLOG_HEAP2_VISIBLE from vacuum phase III Instead of emitting a separate XLOG_HEAP2_VISIBLE WAL record for each page that becomes all-visible in vacuum's third phase, specify the VM changes in the already emitted XLOG_HEAP2_PRUNE_VACUUM_CLEANUP record. Visibility checks are now performed before marking dead items unused. This is safe because the heap page is held under exclusive lock for the entire operation. This reduces the number of WAL records generated by VACUUM phase III by up to 50%. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Discussion: https://postgr.es/m/flat/CAAKRu_ZMw6Npd_qm2KM%2BFwQ3cMOMx1Dh3VMhp8-V7SOLxdK9-g%40mail.gmail.com	2025-10-13 18:01:06 -04:00
Tom Lane	03bf7a12c5	Fix incorrect message-printing in win32security.c. log_error() would probably fail completely if used, and would certainly print garbage for anything that needed to be interpolated into the message, because it was failing to use the correct printing subroutine for a va_list argument. This bug likely went undetected because the error cases this code is used for are rarely exercised - they only occur when Windows security API calls fail catastrophically (out of memory, security subsystem corruption, etc). The FRONTEND variant can be fixed just by calling vfprintf() instead of fprintf(). However, there was no va_list variant of write_stderr(), so create one by refactoring that function. Following the usual naming convention for such things, call it vwrite_stderr(). Author: Bryan Green <dbryan.green@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAF+pBj8goe4fRmZ0V3Cs6eyWzYLvK+HvFLYEYWG=TzaM+tWPnw@mail.gmail.com Backpatch-through: 13	2025-10-13 17:56:45 -04:00
David Rowley	615a0fc2f1	Doc: clarify n_distinct_inherited setting There was some confusion around how to adjust the n_distinct estimates for partitioned tables. Here we try and clarify that n_distinct_inherited needs to be adjusted rather than n_distinct. Also fix some slightly misleading text which was talking about table size rather than table rows, fix a grammatical error, and adjust some text which indicated that ANALYZE was performing calculations based on the n_distinct settings. Really it's the query planner that does this and ANALYZE only stores the overridden n_distinct estimate value in pg_statistic. Author: David Rowley <dgrowleyml@gmail.com> Reviewed-by: David G. Johnston <david.g.johnston@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Backpatch-through: 13 Discussion: https://postgr.es/m/CAApHDvrL7a-ZytM1SP8Uk9nEw9bR2CPzVb+uP+bcNj=_q-ZmVw@mail.gmail.com	2025-10-14 09:25:02 +13:00
Tom Lane	1f8062dd96	Fix serious performance problems in LZ4Stream_read_internal. I was distressed to find that reading an LZ4-compressed toc.dat file was hundreds of times slower than it ought to be. On investigation, the blame mostly affixes to LZ4Stream_read_overflow's habit of memmove'ing all the remaining buffered data after each read operation. Since reading a TOC file tends to involve a lot of small (even one-byte) decompression calls, that amounts to an O(N^2) cost. This could have been fixed with a minimal patch, but to my eyes LZ4Stream_read_internal and LZ4Stream_read_overflow are badly-written spaghetti code; in particular the eol_flag logic is inefficient and duplicative. I chose to throw the code away and rewrite from scratch. This version is about sixty lines shorter as well as not having the performance issue. Fortunately, AFAICT the only way to get to this problem is to manually LZ4-compress the toc.dat and/or blobs.toc files within a directory-style archive; in the main data files, we read blocks that are large enough that the O(N^2) behavior doesn't manifest. Few people do that, which likely explains the lack of field complaints. Otherwise this performance bug might be considered bad enough to warrant back-patching. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/3515357.1760128017@sss.pgh.pa.us	2025-10-13 13:17:45 -04:00
Tom Lane	fe8192a95e	Fix poor buffering logic in pg_dump's lz4 and zstd compression code. Both of these modules dumped each bit of output that they got from the underlying compression library as a separate "data block" in the emitted archive file. In the case of zstd this'd frequently result in block sizes well under 100 bytes; lz4 is a little better but still produces blocks around 300 bytes, at least in the test case I tried. This bloats the archive file a little bit compared to larger block sizes, but the real problem is that when pg_restore has to skip each data block rather than seeking directly to some target data, tiny block sizes are enormously inefficient. Fix both modules so that they fill their allocated buffer reasonably well before dumping a data block. In the case of lz4, also delete some redundant logic that caused the lz4 frame header to be emitted as a separate data block. (That saves little, but I see no reason to expend extra code to get worse results.) I fixed the "stream API" code too. In those cases, feeding small amounts of data to fwrite() probably doesn't have any meaningful performance consequences. But it seems like a bad idea to leave the two sets of code doing the same thing in two different ways. In passing, remove unnecessary "extra paranoia" check in _ZstdWriteCommon. _CustomWriteFunc (the only possible referent of cs->writeF) already protects itself against zero-length writes, and it's really a modularity violation for _ZstdWriteCommon to know that the custom format disallows empty data blocks. Also, fix Zstd_read_internal to do less work when passed size == 0. Reported-by: Dimitrios Apostolou <jimis@gmx.net> Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/3515357.1760128017@sss.pgh.pa.us	2025-10-13 13:01:45 -04:00
Tom Lane	a239c4a0c2	Fix issue with reading zero bytes in Gzip_read. pg_dump expects a read request of zero bytes to be a no-op; see for example ReadStr(). Gzip_read got this wrong and falsely supposed that the resulting gzret == 0 indicated an error. We could complicate that error-checking logic some more, but it seems best to just fall out immediately when passed size == 0. This bug breaks the nominally-supported case of manually gzip'ing the toc.dat file within a directory-style dump, so back-patch to v16 where this code came in. (Prior branches already have a short-circuit for size == 0 before their only gzread call.) Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/3515357.1760128017@sss.pgh.pa.us Backpatch-through: 16	2025-10-13 12:44:20 -04:00
Magnus Hagander	d3ba50db48	docs: Fix protocol version 3.2 message format of CancelRequest Since protocol version 3.2 the CancelRequest does not have a fixed size length anymore. The protocol docs still listed the length field to be a constant number though. This fixes that. Author: Jelte Fennema-Nio <postgres@jeltef.nl> Reported-by: Dmitry Igrishin <dmitigr@gmail.com> Backpatch-through: 18	2025-10-13 15:31:25 +02:00
Magnus Hagander	e062af861b	Remove extra semicolon in example Reported-By: Pavel Luzanov <p.luzanov@postgrespro.ru> Discussion: https://postgr.es/m/175976566145.768.4645962241073007347@wrigleys.postgresql.org Backpatch-through: 18	2025-10-13 15:26:37 +02:00
Peter Geoghegan	7a662a46eb	Remove unused nbtree array advancement variable. Remove a variable that is no longer in use following commit `9a2e2a28`. It's not immediately clear why there were no compiler warnings about this oversight. Author: Peter Geoghegan <pg@bowt.ie> Backpatch-through: 18	2025-10-12 14:04:08 -04:00
Tom Lane	26d1cd375f	Restore test coverage of LZ4Stream_gets(). In commit `a45c78e32` I removed the only regression test case that reaches this function, because it turns out that we only use it if reading an LZ4-compressed blobs.toc file in a directory dump, and that is a state that has to be created manually. That seems like a bad thing to not test, not so much for LZ4Stream_gets() itself as because it means the squirrely eol_flag logic in LZ4Stream_read_internal() is not tested. The reason for the change was that I thought the lz4 program did not have any way to perform compression without explicit specification of the output file name. However, it turns out that the syntax synopsis in its man page is a lie, and if you read enough of the man page you find out that with "-m" it will do what's needful. So restore the manual compression step in that test case. Noted while testing some proposed changes in pg_dump's compression logic. Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/3515357.1760128017@sss.pgh.pa.us Backpatch-through: 17	2025-10-11 16:33:55 -04:00
Álvaro Herrera	3231fd0455	Stop creating constraints during DETACH CONCURRENTLY Commit `71f4c8c6f7` (which implemented DETACH CONCURRENTLY) added code to create a separate table constraint when a table is detached concurrently, identical to the partition constraint, on the theory that such a constraint was needed in case the optimizer had constructed any query plans that depended on the constraint being there. However, that theory was apparently bogus because any such plans would be invalidated. For hash partitioning, those constraints are problematic, because their expressions reference the OID of the parent partitioned table, to which the detached table is no longer related; this causes all sorts of problems (such as inability of restoring a pg_dump of that table, and the table no longer working properly if the partitioned table is later dropped). We'd like to get rid of all those constraints. In fact, for branch master, do that -- no longer create any substitute constraints. However, out of fear that some users might somehow depend on these constraints for other partitioning strategies, for stable branches (back to 14, which added DETACH CONCURRENTLY), only do it for hash partitioning. (If you repeatedly DETACH CONCURRENTLY and then ATTACH a partition, then with this constraint addition you don't need to scan the table in the ATTACH step, which presumably is good. But if users really valued this feature, they would have requested that it worked for non-concurrent DETACH also.) Author: Haiyang Li <mohen.lhy@alibaba-inc.com> Reported-by: Fei Changhong <feichanghong@qq.com> Reported-by: Haiyang Li <mohen.lhy@alibaba-inc.com> Backpatch-through: 14 Discussion: https://postgr.es/m/18371-7fef49f63de13f02@postgresql.org Discussion: https://postgr.es/m/19070-781326347ade7c57@postgresql.org	2025-10-11 20:30:12 +02:00
Álvaro Herrera	ff47f9c16c	dbase_redo: Fix Valgrind-reported memory leak Introduced by my (Álvaro's) commit `9e4f914b5e`, which was itself backpatched to pg10, though only pg15 and up contain the problem because of commit `9c08aea6a3`. This isn't a particularly significant leak, but given the fix is trivial, we might as well backpatch to all branches where it applies, so do that. Author: Nathan Bossart <nathandbossart@gmail.com> Reported-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/x4odfdlrwvsjawscnqsqjpofvauxslw7b4oyvxgt5owoyf4ysn@heafjusodrz7	2025-10-11 16:39:22 +02:00
Peter Geoghegan	843e50208a	Remove overzealous _bt_killitems assertion. An assertion in _bt_killitems expected the scan's currPos state to contain a valid LSN, saved from when currPos's page was initially read. The assertion failed to account for the fact that even logged relations can have leaf pages with an invalid LSN when built with wal_level set to "minimal". Remove the faulty assertion. Oversight in commit `e6eed40e` (though note that the assertion was backpatched to stable branches before 18 by commit `7c319f54`). Author: Peter Geoghegan <pg@bowt.ie> Reported-By: Matthijs van der Vleuten <postgresql@zr40.nl> Bug: #19082 Discussion: https://postgr.es/m/19082-628e62160dbbc1c1@postgresql.org Backpatch-through: 13	2025-10-10 14:52:25 -04:00
Michael Paquier	3a36543d7d	Fix two typos in xlogstats.h and xlogstats.c Issue found while browsing this area of the code, introduced and copy-pasted around by `2258e76f90`. Backpatch-through: 15	2025-10-10 11:51:45 +09:00
Michael Paquier	912af1c7e9	Remove state.tmp when failing to save a replication slot An error happening while slot data is saved on disk in SaveSlotToPath() could cause a state.tmp file (temporary file holding the slot state data, renamed to its permanent name at the end of the function) to remain around after it has been created. This temporary file is created with O_EXCL, meaning that if an existing state.tmp is found, its creation would fail. This would prevent the slot data from being saved, requiring manual intervention to remove state.tmp before the slot can be saved again. A possible scenario where this temporary file could remain on disk is, for example, an ENOSPC case (no disk space) while writing, syncing or renaming it. The bug reports point to a write failure as the principal cause of the problems. Using O_TRUNC has been argued back in 2019 as a potential solution to discard any temporary file that could exist. This solution was rejected as O_EXCL can also act as a safety measure when saving the slot state, crash recovery offering cleanup guarantees post-crash. This commit uses the alternative approach that has been suggested by Andres Freund back in 2019. When the temporary state file cannot be written, synced, closed or renamed (note: not when created!), an unlink() is used to remove the temporary state file while holding the in-progress I/O LWLock, so that any follow-up attempts to save a slot's data do not choke on an existing file that remained around because of a previous failure. This problem has been reported a few times across the years, going back to 2019, but for some reason I have never come back to do something about it and it has been forgotten. A recent report has reminded me that this was still a problem. Reported-by: Kevin K Biju <kevinkbiju@gmail.com> Reported-by: Sergei Kornilov <sk@zsrv.org> Reported-by: Grigory Smolkin <g.smolkin@postgrespro.ru> Discussion: https://postgr.es/m/CAM45KeHa32soKL_G8Vk38CWvTBeOOXcsxAPAs7Jt7yPRf2mbVA@mail.gmail.com Discussion: https://postgr.es/m/3559061693910326@qy4q4a6esb2lebnz.sas.yp-c.yandex.net Discussion: https://postgr.es/m/08bbfab1-a61d-3750-fc18-4ab2c1aa7f09@postgrespro.ru Backpatch-through: 13	2025-10-10 09:23:59 +09:00
Andres Freund	c819d1017d	bufmgr: Fix valgrind checking for buffers pinned in StrategyGetBuffer() In `5e89985928` I made StrategyGetBuffer() pin buffers with a single CAS, instead of using PinBuffer_Locked(). Unfortunately I missed that PinBuffer_Locked() marked the page as defined for valgrind. Fix this oversight by centralizing the valgrind initialization into TrackNewBufferPin(), which also allows us to reduce the number of places doing VALGRIND_MAKE_MEM_DEFINED. Per buildfarm animal skink and Amit Langote. Discussion: https://postgr.es/m/fvfmkr5kk4nyex56ejgxj3uzi63isfxovp2biecb4bspbjrze7@az2pljabhnff Discussion: https://postgr.es/m/CA+HiwqGKJ6nEXEPQW7EpykVsEtzxp5-up_xhtcUAkWFtATVQvQ@mail.gmail.com	2025-10-09 19:17:13 -04:00
Michael Paquier	9d46b86529	test_bitmapset: Improve random function test_random_operations() did not check the result returned by bms_is_member() in its last phase, when checking that the contents of the bitmap match with what is expected. This was impacting the reliability of the function and the coverage it could provide. This commit improves the whole function, adding more checks based on bms_is_member(), using a bitmap and a secondary array that tracks the members added by random additions and deletions. While on it, more comments are added to document the internals of the function. Reported-by: Ranier Vilela <ranier.vf@gmail.com> Author: Greg Burd <greg@burd.me> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/CAEudQAq_zOSA2NUQSWePTGV_=90Uw0WcXxGOWnN-vwF046OOqA@mail.gmail.com	2025-10-10 07:20:03 +09:00
Melanie Plageman	d96f87332b	Eliminate COPY FREEZE use of XLOG_HEAP2_VISIBLE Instead of emitting a separate WAL XLOG_HEAP2_VISIBLE record for setting bits in the VM, specify the VM block changes in the XLOG_HEAP2_MULTI_INSERT record. This halves the number of WAL records emitted by COPY FREEZE. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Discussion: https://postgr.es/m/flat/CAAKRu_ZMw6Npd_qm2KM%2BFwQ3cMOMx1Dh3VMhp8-V7SOLxdK9-g%40mail.gmail.com	2025-10-09 16:29:01 -04:00
David Rowley	1b073cba49	Cleanup VACUUM option processing error messages The processing of the PARALLEL option for VACUUM was not quite following what the DefElem code had intended. defGetInt32() already has code to handle missing parameters and returns a perfectly good error message for when that happens. Here we get rid of the ExecVacuum() error: ERROR: parallel option requires a value between 0 and N and let defGetInt32() handle it, which will give: ERROR: parallel requires an integer value defGetInt32() was already handling the non-integer parameter case, so it may as well handle the missing parameter case too. Additionally, parameterize the option name to make translator work easier, and also use errhint_internal() rather than errhint() for the BUFFER_USAGE_LIMIT option since there isn't any work for a translator to do for "%s". Author: David Rowley <dgrowleyml@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Discussion: https://postgr.es/m/CAApHDvovH14tNWB+WvP6TSbfi7-=TysQ9h5tQ5AgavwyWRWKHA@mail.gmail.com	2025-10-10 09:25:23 +13:00
Tom Lane	89d57c1fb3	Clean up memory leakage that occurs in context callback functions. An error context callback function might leak some memory into ErrorContext, since those functions are run with ErrorContext as current context. In the case where the elevel is ERROR, this is no problem since the code level that catches the error should do FlushErrorState to clean up, and that will reset ErrorContext. However, if the elevel is less than ERROR then no such cleanup occurs. In principle, repeated leaks while emitting log messages or client notices could accumulate arbitrarily much leaked data, if no ERROR occurs in the session. To fix, let errfinish() perform an ErrorContext reset if it is at the outermost error nesting level. (If it isn't, we'll delay cleanup until the outermost nesting level is exited.) The only actual leakage of this sort that I've been able to observe within our regression tests was recently introduced by commit `f727b63e8`. While it seems plausible that there are other such leaks not reached in the regression tests, the lack of field reports suggests that they're not a big problem. Accordingly, I won't take the risk of back-patching this now. We can always back-patch later if we get field reports of leaks. Reported-by: Andres Freund <andres@anarazel.de> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/jngsjonyfscoont4tnwi2qoikatpd5hifsg373vmmjvugwiu6g@m6opxh7uisgd	2025-10-09 15:37:42 -04:00
Masahiko Sawada	b46efe9048	Fix access-to-already-freed-memory issue in pgoutput. While pgoutput caches relation synchronization information in RelationSyncCache that resides in CacheMemoryContext, each entry's information (such as row filter expressions and column lists) is stored in the entry's private memory context (entry_cxt in RelationSyncEntry), which is a descendant memory context of the decoding context. If a logical decoding invoked via SQL functions like pg_logical_slot_get_binary_changes fails with an error, subsequent logical decoding executions could access already-freed memory of the entry's cache, resulting in a crash. With this change, it's ensured that RelationSyncCache is cleaned up even in error cases by using a memory context reset callback function. Backpatch to 15, where entry_cxt was introduced for column filtering and row filtering. While the backbranches v13 and v14 have a similar issue where RelationSyncCache persists even after an error when pgoutput is used via SQL API, we decided not to backport this fix. This decision was made because v13 is approaching its final minor release, and we won't have a chance to fix any new issues that might arise. Additionally, since using pgoutput via SQL API is not a common use case, the risk outweighs the benefit. If we receive bug reports, we can consider backporting the fixes then. Author: vignesh C <vignesh21@gmail.com> Co-authored-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Zhijie Hou <houzj.fnst@fujitsu.com> Reviewed-by: Euler Taveira <euler@eulerto.com> Discussion: https://postgr.es/m/CALDaNm0x-aCehgt8Bevs2cm=uhmwS28MvbYq1=s2Ekf0aDPkOA@mail.gmail.com Backpatch-through: 15	2025-10-09 10:59:27 -07:00
Tom Lane	71540dcdcb	Avoid uninitialized-variable warnings from older compilers. Some of the buildfarm is still unhappy with WinGetFuncArgInPartition even after `2273fa32b`. While it seems to be just very old compilers, we can suppress the warnings and arguably make the code more readable by not initializing these variables till closer to where they are used. While at it, make a couple of cosmetic comment improvements.	2025-10-09 10:33:55 -04:00
Richard Guo	36fd8bde1b	Fix comment in eager_aggregate.sql The comment stated that eager aggregation is disabled by default, which is no longer true. This patch removes that comment as well as the related GUC set statement. Reported-by: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/CAApHDvr4YWpiMR3RsgYwJWv-u8xoRqTAKRiYy9zUszjZOqG4Ug@mail.gmail.com	2025-10-09 17:50:54 +09:00
Richard Guo	f997d777ad	Remove unnecessary include of "utils/fmgroids.h" In initsplan.c, no macros for built-in function OIDs are used, so this include is unnecessary and can be removed. This was my oversight in commit `8e1185910`. Discussion: https://postgr.es/m/CAMbWs4_-sag-cAKrLJ+X+5njL1=oudk=+KfLmsLZ5a2jckn=kg@mail.gmail.com	2025-10-09 17:49:20 +09:00
Michael Paquier	8d02f49696	Remove duplicated log related to slot creation in pg_createsubscriber The creation of a replication slot done in a specific database on a publisher was logged twice, with the second log not mentioning the database where the slot creation happened. This commit removes the information logged after a slot has been successfully created, moving the information about the publisher from the second to the first log. Note that failing a slot creation is also logged, so there is no loss of information. Author: Peter Smith <smithpb2250@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CAHut+Pv7qDvLbDgc9PQGhULT3rPXTxdu_=w+iW-kMs+zPADR+w@mail.gmail.com	2025-10-09 14:02:24 +09:00
Amit Kapila	96b3784973	Add "ALL SEQUENCES" support to publications. This patch adds support for the ALL SEQUENCES clause in publications, enabling synchronization/replication of all sequences, which is useful for upgrades. Publications can now include all sequences via FOR ALL SEQUENCES. psql enhancements: \d shows publications for a given sequence. \dRp indicates if a publication includes all sequences. ALL SEQUENCES can be combined with ALL TABLES, but not with other options like TABLE or TABLES IN SCHEMA. We can extend support for more granular clauses in future. The view pg_publication_sequences provides information about the mapping between publications and sequences. This patch enables publishing of sequences; subscriber-side support will be added in upcoming patches. Author: vignesh C <vignesh21@gmail.com> Author: Tomas Vondra <tomas@vondra.me> Reviewed-by: shveta malik <shveta.malik@gmail.com> Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com> Reviewed-by: Peter Smith <smithpb2250@gmail.com> Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Nisha Moond <nisha.moond412@gmail.com> Reviewed-by: Shlok Kyal <shlok.kyal.oss@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/CAA4eK1LC+KJiAkSrpE_NwvNdidw9F2os7GERUeSxSKv71gXysQ@mail.gmail.com	2025-10-09 03:48:54 +00:00
Amit Langote	ef5e60a9d3	Fix internal error from CollateExpr in SQL/JSON DEFAULT expressions SQL/JSON functions such as JSON_VALUE could fail with "unrecognized node type" errors when a DEFAULT clause contained an explicit COLLATE expression. That happened because assign_collations_walker() could invoke exprSetCollation() on a JsonBehavior expression whose DEFAULT still contained a CollateExpr, which exprSetCollation() does not handle. For example: SELECT JSON_VALUE('{"a":1}', '$.c' RETURNING text DEFAULT 'A' COLLATE "C" ON EMPTY); Fix by validating in transformJsonBehavior() that the DEFAULT expression's collation matches the enclosing JSON expression’s collation. In exprSetCollation(), replace the recursive call on the JsonBehavior expression with an assertion that its collation already matches the target, since the parser now enforces that condition. Reported-by: Jian He <jian.universality@gmail.com> Author: Jian He <jian.universality@gmail.com> Reviewed-by: Amit Langote <amitlangote09@gmail.com> Discussion: https://postgr.es/m/CACJufxHVwYYSyiVQ6o+PsRX6zQ7rAFinh_fv1kCfTsT1xG4Zeg@mail.gmail.com Backpatch-through: 17	2025-10-09 01:07:59 -04:00
David Rowley	a5a68dd6d5	Make truncate_useless_pathkeys() consider WindowFuncs truncate_useless_pathkeys() seems to have neglected to account for PathKeys that might be useful for WindowClause evaluation. Modify it so that it properly accounts for that. Making this work required adjusting two things: 1. Change from checking query_pathkeys to check sort_pathkeys instead. 2. Add explicit check for window_pathkeys For #1, query_pathkeys gets set in standard_qp_callback() according to the sort order requirements for the first operation to be applied after the join planner is finished, so this changes depending on which upper planner operations a particular query needs. If the query has window functions and no GROUP BY, then query_pathkeys gets set to window_pathkeys. Before this change, this meant PathKeys useful for the ORDER BY were not accounted for in queries with window functions. Because of #1, #2 is now required so that we explicitly check to ensure we don't truncate away PathKeys useful for window functions. Author: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/CAApHDvrj3HTKmXoLMbUjTO=_MNMxM=cnuCSyBKidAVibmYPnrg@mail.gmail.com	2025-10-09 12:38:33 +13:00
Andres Freund	5e89985928	bufmgr: Don't lock buffer header in StrategyGetBuffer() Previously StrategyGetBuffer() acquired the buffer header spinlock for every buffer, whether it was reusable or not. If reusable, it'd be returned, with the lock held, to GetVictimBuffer(), which then would pin the buffer with PinBuffer_Locked(). That's somewhat violating the spirit of the guidelines for holding spinlocks (i.e. that they are only held for a few lines of consecutive code) and necessitates using PinBuffer_Locked(), which scales worse than PinBuffer() due to holding the spinlock. This alone makes it worth changing the code. However, the main reason to change this is that a future commit will make PinBuffer_Locked() slower (due to making UnlockBufHdr() slower), to gain scalability for the much more common case of pinning a pre-existing buffer. By pinning the buffer with a single atomic operation, iff the buffer is reusable, we avoid any potential regression for miss-heavy workloads. There strictly are fewer atomic operations for each potential buffer after this change. The price for this improvement is that freelist.c needs two CAS loops and needs to be able to set up the resource accounting for pinned buffers. The latter is achieved by exposing a new function for that purpose from bufmgr.c, that seems better than exposing the entire private refcount infrastructure. The improvement seems worth the complexity. Reviewed-by: Robert Haas <robertmhaas@gmail.com> Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com> Discussion: https://postgr.es/m/fvfmkr5kk4nyex56ejgxj3uzi63isfxovp2biecb4bspbjrze7@az2pljabhnff	2025-10-08 17:04:07 -04:00
Andres Freund	3baae90013	bufmgr: fewer calls to BufferDescriptorGetContentLock We're planning to merge buffer content locks into BufferDesc.state. To reduce the size of that patch, centralize calls to BufferDescriptorGetContentLock(). The biggest part of the change is in assertions, by introducing BufferIsLockedByMe[InMode]() (and removing BufferIsExclusiveLocked()). This seems like an improvement even without aforementioned plans. Additionally replace some direct calls to LWLockAcquire() with calls to LockBuffer(). Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com> Discussion: https://postgr.es/m/fvfmkr5kk4nyex56ejgxj3uzi63isfxovp2biecb4bspbjrze7@az2pljabhnff	2025-10-08 16:06:19 -04:00
Andres Freund	2a2e1b470b	bufmgr: Fix signedness of mask variable in BufferSync() BM_PERMANENT is defined as 1U<<31, which is a negative number when interpreted as a signed integer. Unfortunately the mask variable in BufferSync() was signed. This has been wrong for a long time, but failed to fail, due to integer conversion rules. However, in an upcoming patch the width of the state variable will be increased, with the wrong signedness leading to never flushing permanent buffers - luckily caught in a test. It seems better to fix this separately, instead of doing so as part of a large, otherwise mechanical, patch. Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com> Discussion: https://postgr.es/m/fvfmkr5kk4nyex56ejgxj3uzi63isfxovp2biecb4bspbjrze7@az2pljabhnff	2025-10-08 14:34:30 -04:00
Andres Freund	3c2b97b29e	bufmgr: Introduce FlushUnlockedBuffer There were several copies of code locking a buffer, flushing its contents, and unlocking the buffer. It seems worth centralizing that into a helper function. Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com> Discussion: https://postgr.es/m/fvfmkr5kk4nyex56ejgxj3uzi63isfxovp2biecb4bspbjrze7@az2pljabhnff	2025-10-08 14:34:30 -04:00
Andres Freund	819dc118c0	Improve ReadRecentBuffer() scalability While testing a new potential use for ReadRecentBuffer(), Andres reported that it scales badly when called concurrently for the same buffer by many backends. Instead of a naive (but wrong) coding with PinBuffer(), it used the spinlock, so that it could be careful to pin only if the buffer was valid and holding the expected block, to avoid breaking invariants in eg GetVictimBuffer(). Unfortunately that made it less scalable than PinBuffer(), which uses compare-exchange instead. We can fix that by giving PinBuffer() a new skip_if_not_valid mode that doesn't pin invalid buffers. It might occasionally skip when it shouldn't due to the unlocked read of the header flags, but that's unlikely and perfectly acceptable for an opportunistic optimisation routine, and it can only succeed when it really should due to the compare-exchange loop. Note that this fixes ReadRecentBuffer()'s failure to bump the usage count. While this could be seen as a bug, there currently aren't cases affected by this in core, so it doesn't seem worth backpatching that portion. Author: Thomas Munro <thomas.munro@gmail.com> Reported-by: Andres Freund <andres@anarazel.de> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com> Discussion: https://postgr.es/m/20230627020546.t6z4tntmj7wmjrfh%40awork3.anarazel.de Discussion: https://postgr.es/m/fvfmkr5kk4nyex56ejgxj3uzi63isfxovp2biecb4bspbjrze7@az2pljabhnff	2025-10-08 13:10:40 -04:00
Masahiko Sawada	d3b6183dd9	Add mem_exceeded_count column to pg_stat_replication_slots. This commit introduces a new column mem_exceeded_count to the pg_stat_replication_slots view. This counter tracks how often the memory used by logical decoding exceeds the logical_decoding_work_mem limit. The new statistic helps users determine whether exceeding the logical_decoding_work_mem limit is a rare occurrence or a frequent issue, information that wasn't available through existing statistics. Bumps catversion. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: shveta malik <shveta.malik@gmail.com> Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/978D21E8-9D3B-40EA-A4B1-F87BABE7868C@yesql.se	2025-10-08 10:05:04 -07:00
Tom Lane	14ad0d7bf2	Cleanup NAN code in float.h, too. In the same spirit as `3bf905692`, assume that all compilers we still support provide the NAN macro, and get rid of workarounds for that. The C standard allows implementations to omit NAN if the underlying float arithmetic lacks quiet (non-signaling) NaNs. However, we've required that feature for years: the workarounds only supported lack of the macro, not lack of the functionality. I put in a compile-time #error if there's no macro, just for clarity. Also fix up the copies of these functions in ecpglib, and leave a breadcrumb for the next hacker who touches them. History of the hacks being removed here can be found in commits `1bc2d544b`, `4d17a2146`, `cec8394b5`. Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/1952095.1759764279@sss.pgh.pa.us	2025-10-08 12:19:53 -04:00
Robert Haas	4685977cc5	Add extension_state member to PlannedStmt. Extensions can stash data computed at plan time into this list using planner_shutdown_hook (or perhaps other mechanisms) and then access it from any code that has access to the PlannedStmt (such as explain hooks), allowing for extensible debugging and instrumentation of plans. Reviewed-by: Andrei Lepikhov <lepihov@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: http://postgr.es/m/CA+TgmoYWKHU2hKr62Toyzh-kTDEnMDeLw7gkOOnjL-TnOUq0kQ@mail.gmail.com	2025-10-08 09:07:49 -04:00
Robert Haas	94f3ad3961	Add planner_setup_hook and planner_shutdown_hook. These hooks allow plugins to get control at the earliest point at which the PlannerGlobal object is fully initialized, and then just before it gets destroyed. This is useful in combination with the extendable plan state facilities (see extendplan.h) and perhaps for other purposes as well. Reviewed-by: Andrei Lepikhov <lepihov@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: http://postgr.es/m/CA+TgmoYWKHU2hKr62Toyzh-kTDEnMDeLw7gkOOnjL-TnOUq0kQ@mail.gmail.com	2025-10-08 09:05:38 -04:00
Robert Haas	c83ac02ec7	Add ExplainState argument to pg_plan_query() and planner(). This allows extensions to have access to any data they've stored in the ExplainState during planning. Unfortunately, it won't help when EXPLAIN EXECUTE is used, but since that case is less common, this still seems like an improvement. Since planner() has quite a few arguments now, also add some documentation of those arguments and the return value. Author: Robert Haas <rhaas@postgresql.org> Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Andrei Lepikhov <lepihov@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: http://postgr.es/m/CA+TgmoYWKHU2hKr62Toyzh-kTDEnMDeLw7gkOOnjL-TnOUq0kQ@mail.gmail.com	2025-10-08 08:33:29 -04:00
Richard Guo	8e11859102	Implement Eager Aggregation Eager aggregation is a query optimization technique that partially pushes aggregation past a join, and finalizes it once all the relations are joined. Eager aggregation may reduce the number of input rows to the join and thus could result in a better overall plan. In the current planner architecture, the separation between the scan/join planning phase and the post-scan/join phase means that aggregation steps are not visible when constructing the join tree, limiting the planner's ability to exploit aggregation-aware optimizations. To implement eager aggregation, we collect information about aggregate functions in the targetlist and HAVING clause, along with grouping expressions from the GROUP BY clause, and store it in the PlannerInfo node. During the scan/join planning phase, this information is used to evaluate each base or join relation to determine whether eager aggregation can be applied. If applicable, we create a separate RelOptInfo, referred to as a grouped relation, to represent the partially-aggregated version of the relation and generate grouped paths for it. Grouped relation paths can be generated in two ways. The first method involves adding sorted and hashed partial aggregation paths on top of the non-grouped paths. To limit planning time, we only consider the cheapest or suitably-sorted non-grouped paths in this step. Alternatively, grouped paths can be generated by joining a grouped relation with a non-grouped relation. Joining two grouped relations is currently not supported. To further limit planning time, we currently adopt a strategy where partial aggregation is pushed only to the lowest feasible level in the join tree where it provides a significant reduction in row count. This strategy also helps ensure that all grouped paths for the same grouped relation produce the same set of rows, which is important to support a fundamental assumption of the planner. 
For the partial aggregation that is pushed down to a non-aggregated relation, we need to consider all expressions from this relation that are involved in upper join clauses and include them in the grouping keys, using compatible operators. This is essential to ensure that an aggregated row from the partial aggregation matches the other side of the join if and only if each row in the partial group does. This ensures that all rows within the same partial group share the same "destiny", which is crucial for maintaining correctness. One restriction is that we cannot push partial aggregation down to a relation that is in the nullable side of an outer join, because the NULL-extended rows produced by the outer join would not be available when we perform the partial aggregation, while with a non-eager-aggregation plan these rows are available for the top-level aggregation. Pushing partial aggregation in this case may result in the rows being grouped differently than expected, or produce incorrect values from the aggregate functions. If we have generated a grouped relation for the topmost join relation, we finalize its paths at the end. The final paths will compete in the usual way with paths built from regular planning. The patch was originally proposed by Antonin Houska in 2017. This commit reworks various important aspects and rewrites most of the current code. However, the original patch and reviews were very useful. 
Author: Richard Guo <guofenglinux@gmail.com> Author: Antonin Houska <ah@cybertec.at> (in an older version) Reviewed-by: Robert Haas <robertmhaas@gmail.com> Reviewed-by: Jian He <jian.universality@gmail.com> Reviewed-by: Tender Wang <tndrwang@gmail.com> Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Reviewed-by: Tomas Vondra <tomas@vondra.me> (in an older version) Reviewed-by: Andy Fan <zhihuifan1213@163.com> (in an older version) Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> (in an older version) Discussion: https://postgr.es/m/CAMbWs48jzLrPt1J_00ZcPZXWUQKawQOFE8ROc-ADiYqsqrpBNw@mail.gmail.com	2025-10-08 17:04:23 +09:00
Richard Guo	185e304263	Allow negative aggtransspace to indicate unbounded state size This patch reuses the existing aggtransspace in pg_aggregate to signal that an aggregate's transition state can grow unboundedly. If aggtransspace is set to a negative value, it now indicates that the transition state may consume unpredictable or large amounts of memory, such as in aggregates like array_agg or string_agg that accumulate input rows. This information can be used by the planner to avoid applying memory-sensitive optimizations (e.g., eager aggregation) when there is a risk of excessive memory usage during partial aggregation. Bump catalog version. Per idea from Robert Haas, though applied differently than originally suggested. Discussion: https://postgr.es/m/CA+TgmoYbkvYwLa+1vOP7RDY7kO2=A7rppoPusoRXe44VDOGBPg@mail.gmail.com	2025-10-08 17:01:48 +09:00
Michael Paquier	138da727a1	Improve description of some WAL records for GIN The following information is added in the description of some GIN records: - In INSERT_LISTPAGE, the number of tuples and the right link block. - In UPDATE_META_PAGE, the number of tuples, the previous tail block, and the right link block. - In SPLIT, the left and right children blocks. Author: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru> Discussion: https://postgr.es/m/CALdSSPgnAt5L=D_xGXRXLYO5FK1H31_eYEESxdU1n-r4g+6GqA@mail.gmail.com	2025-10-08 14:02:26 +09:00
Michael Paquier	b71bae41a0	Add stats_reset to pg_stat_user_functions It is possible to call pg_stat_reset_single_function_counters() for a single function, but the reset time was missing from the system view showing its statistics. Like all the fields of pg_stat_user_functions, the GUC track_functions needs to be enabled to show the statistics about function executions. Bump catalog version. Bump PGSTAT_FILE_FORMAT_ID, as a result of the new field added to PgStat_StatFuncEntry. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://postgr.es/m/aONjnsaJSx-nEdfU@paquier.xyz	2025-10-08 12:43:40 +09:00
Amit Kapila	035b09131d	Fix typo in function header comment. Reported-by: Robert Haas <robertmhaas@gmail.com> Discussion: https://postgr.es/m/CA+TgmoZYh_nw-2j_Fi9y6ZAvrpN+W1aSOFNM7Rus2Q-zTkCsQw@mail.gmail.com	2025-10-08 03:17:05 +00:00
Tatsuo Ishii	2273fa32bc	Fix Coverity issues reported in commit `25a30bbd42`. Fix several issues pointed out by Coverity (reported by Tom Lane). - In row_is_in_frame(), return value of window_gettupleslot() was not checked. - WinGetFuncArgInPartition() tried to dereference "isout" pointer even if it could be NULL in some places. Besides the issues, I also fixed a compiler warning reported by Álvaro Herrera. Moreover, in WinGetFuncArgInPartition, refactor the do...while loop so that the code inside the loop is simpler. Also simplify the case when abs_pos < 0. Author: Tatsuo Ishii <ishii@postgresql.org> Reviewed-by: Paul Ramsey <pramsey@cleverelephant.ca> Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Reported-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/1686755.1759679957%40sss.pgh.pa.us Discussion: https://postgr.es/m/202510051612.gw67jlc2iqpw%40alvherre.pgsql	2025-10-08 09:26:49 +09:00
David Rowley	3bf905692c	Cleanup INFINITY code in float.h The INFINITY macro is always defined per C99 standard, so this should mean we can now get rid of the workaround code for when that macro isn't defined. Also, delete the (now unneeded) #pragma code which was disabling a compiler warning in MSVC. There was a comment explaining why the #pragma was placed outside the function body to work around a MSVC compiler bug, but the link explaining that was dead, as reported by jian he. Author: David Rowley <dgrowleyml@gmail.com> Reported-by: jian he <jian.universality@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CACJufxGARYETnNwtCK7QC0zE_7gq-tfN0mME=gT5rTNtC=VSHQ@mail.gmail.com	2025-10-08 12:07:17 +13:00
Robert Haas	64095d1574	Remove PlannerInfo's join_search_private method. Instead, use the new mechanism that allows planner extensions to store private state inside a PlannerInfo, treating GEQO as an in-core planner extension. This is a useful test of the new facility, and also buys back a few bytes of storage. To make this work, we must remove innerrel_is_unique_ext's hack of testing whether join_search_private is set as a proxy for whether the join search might be retried. Add a flag that extensions can use to explicitly signal their intentions instead. Reviewed-by: Andrei Lepikhov <lepihov@gmail.com> Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: http://postgr.es/m/CA+TgmoYWKHU2hKr62Toyzh-kTDEnMDeLw7gkOOnjL-TnOUq0kQ@mail.gmail.com	2025-10-07 12:43:45 -04:00
Robert Haas	0132dddab3	Allow private state in certain planner data structures. Extensions that make extensive use of planner hooks may want to coordinate their efforts, for example to avoid duplicate computation, but that's currently difficult because there's no really good way to pass data between different hooks. To make that easier, allow for storage of extension-managed private state in PlannerGlobal, PlannerInfo, and RelOptInfo, along very similar lines to what we have permitted for ExplainState since commit `c65bc2e1d1`. Reviewed-by: Andrei Lepikhov <lepihov@gmail.com> Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: http://postgr.es/m/CA+TgmoYWKHU2hKr62Toyzh-kTDEnMDeLw7gkOOnjL-TnOUq0kQ@mail.gmail.com	2025-10-07 12:09:30 -04:00
Tom Lane	afd532c3a8	Adjust new TAP test to work on macOS. Seems Apple's version of "wc -l" puts spaces before the number. (I wonder why the cfbot didn't find this.) While here, make the failure case log what it got, to aid debugging future issues. Per buildfarm.	2025-10-07 11:47:27 -04:00
Tom Lane	27da1a796f	Improve psql's ability to select pager mode accurately. We try to use the pager only when more than a screenful's worth of data is to be printed. However, the code in print.c that's concerned with counting the number of lines that will be needed missed a lot of edge cases: * While plain aligned mode accounted for embedded newlines in column headers and table cells, unaligned and vertical output modes did not. * In particular, since vertical mode repeats the headers for each record, we need to account for embedded newlines in the headers for each record. * Multi-line table titles were not accounted for. * tuples_only mode (where headers aren't printed) wasn't accounted for. * Footers were accounted for as one line per footer, again missing the possibility of multi-line footers. (In some cases such as "\d+" on a view, there can be many lines in a footer.) Also, we failed to account for the default footer. To fix, move the entire responsibility for counting lines into IsPagerNeeded (or actually, into a new subroutine count_table_lines), and then expand the logic as appropriate. Also restructure to make it perhaps a bit easier to follow. It's still only completely accurate for ALIGNED/WRAPPED/UNALIGNED formats, but the other formats are not typically used with interactive output. Arrange to not run count_table_lines at all unless we will use its result, and teach it to quit early as soon as it's proven that the output is long enough to require use of the pager. When dealing with large tables this should save a noticeable amount of time, since pg_wcssize() isn't exactly cheap. In passing, move the "flog" output step to the bottom of printTable(), rather than running it when we've already opened the pager in some modes. 
In principle it shouldn't interfere with the pager because flog should always point to a non-interactive file; but it seems silly to risk any interference, especially when the existing positioning seems to have been chosen with the aid of a dartboard. Also add a TAP test to exercise pager mode. Up to now, we have had zero test coverage of these code paths, because they aren't reached unless isatty(stdout). We do have the test infrastructure to improve that situation, though. Following the lead of 010_tab_completion.pl, set up an interactive psql and feed it some test cases. To detect whether it really did invoke the pager, point PSQL_PAGER to "wc -l". The test is skipped if that utility isn't available. Author: Erik Wienhold <ewie@ewie.name> Test-authored-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/2dd2430f-dd20-4c89-97fd-242616a3d768@ewie.name	2025-10-07 10:57:56 -04:00
Robert Haas	8c49a484e8	Assign each subquery a unique name prior to planning it. Previously, subqueries were given names only after they were planned, which makes it difficult to use information from a previous execution of the query to guide future planning. If, for example, you knew something about how you want "InitPlan 2" to be planned, you won't know whether the subquery you're currently planning will end up being "InitPlan 2" until after you've finished planning it, by which point it's too late to use the information that you had. To fix this, assign each subplan a unique name before we begin planning it. To improve consistency, use textual names for all subplans, rather than, as we did previously, a mix of numbers (such as "InitPlan 1") and names (such as "CTE foo"), and make sure that the same name is never assigned more than once. We adopt the somewhat arbitrary convention of using the type of sublink to set the plan name; for example, a query that previously had two expression sublinks shown as InitPlan 2 and InitPlan 1 will now end up named expr_1 and expr_2. Because names are assigned before rather than after planning, some of the regression test outputs show the numerical part of the name switching positions: what was previously SubPlan 2 was actually the first one encountered, but we finished planning it later. We assign names even to subqueries that aren't shown as such within the EXPLAIN output. These include subqueries that are a FROM clause item or a branch of a set operation, rather than something that will be turned into an InitPlan or SubPlan. The purpose of this is to make sure that, below the topmost query level, there's always a name for each subquery that is stable from one planning cycle to the next (assuming no changes to the query or the database schema). 
Author: Robert Haas <rhaas@postgresql.org> Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Alexandra Wang <alexandra.wang.oss@gmail.com> Reviewed-by: Richard Guo <guofenglinux@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Junwang Zhao <zhjwpku@gmail.com> Discussion: http://postgr.es/m/3641043.1758751399@sss.pgh.pa.us	2025-10-07 09:18:54 -04:00
Daniel Gustafsson	8c2d5d4f11	doc: Add missing parenthesis in pg_stat_progress_analyze docs Author: Shinya Kato <shinya11.kato@gmail.com> Discussion: https://postgr.es/m/CAOzEurRgpAh9dsbEM88FPOhNaV_PkdL6p_9MJatcrNf9wXw1nw@mail.gmail.com Backpatch-through: 18	2025-10-07 15:02:20 +02:00
Álvaro Herrera	c53775185d	Fix compile of src/tutorial/funcs.c I broke this with recent #include removals. Fix by adding an explicit Reported-by: Devrim Gündüz <devrim@gunduz.org> Discussion: https://postgr.es/m/5e2c2d7c44434f3f0af7523864b27fe4fb590902.camel@gunduz.org	2025-10-07 10:45:57 +02:00
David Rowley	9c9d41af4d	Teach planner to short-circuit EXCEPT/INTERSECT with dummy inputs When either input of an INTERSECT [ALL] operator is proven not to return any results (a dummy rel), then mark the entire INTERSECT operation as dummy. Likewise, if an EXCEPT [ALL] operation's left input is proven empty, then mark the entire operation as dummy. With EXCEPT ALL, we can easily handle the right input being dummy as we can return the left input without any processing. That can lead to significant performance gains during query execution. We can't easily handle dummy right inputs for EXCEPT (without ALL), as that would require deduplication of the left input. Wiring up those Paths is likely more complex than it's worth as the gains during execution aren't that great, so let's leave that one to be handled by the normal Path generation code. Author: David Rowley <dgrowleyml@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAApHDvri53PPF76c3M94_QNWbJfXjyCnjXuj_2=LYM-0m8WZtw@mail.gmail.com	2025-10-07 17:17:52 +13:00
David Rowley	928df067d1	Fix incorrect targetlist in dummy UNIONs The prior code, added in `03d40e4b5` attempted to use the targetlist of the first UNION child when all UNION children were proven as dummy rels. That's not going to work when some operation atop of the Result node must find target entries within the Result's targetlist. This could have been something as simple as trying to sort the results of the UNION operation, which would lead to: ERROR: could not find pathkey item to sort Instead, use the top-level UNION's targetlist and fix the varnos in setrefs.c. Because set operation targetlists always use varno==0, we can rewrite those to become varno==1, i.e. use the Vars from the first UNION child. This does result in showing Vars from relations that are not present in the final plan, but that's no different to what we see when normal base relations are proven dummy. Without this fix it would be possible to see the following error in EXPLAIN VERBOSE when all UNION inputs were proven empty. ERROR: bogus varno: 0 Author: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/CAApHDvrUASy9sfULMEsM2udvZJP6AoBRCZvHYXYxZTy2tX9FYw@mail.gmail.com	2025-10-07 14:15:04 +13:00
Masahiko Sawada	771cfe22a0	Avoid unnecessary GinFormTuple() calls for incompressible posting lists. Previously, we attempted to form a posting list tuple even when ginCompressPostingList() failed to compress the posting list due to its size. While there was no functional failure, it always wasted one GinFormTuple() call when item pointers didn't fit in a posting list tuple. This commit ensures that a GIN index tuple is formed only when all item pointers in the posting list are successfully compressed. Author: Arseniy Mukhin <arseniy.mukhin.dev@gmail.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Discussion: https://postgr.es/m/CAE7r3M+C=jcpTD93f_RBHrQp3C+=TAXFs+k4tTuZuuxboK8AvA@mail.gmail.com	2025-10-06 14:02:01 -07:00
Nathan Bossart	ec8719ccbf	Optimize hex_encode() and hex_decode() using SIMD. The hex_encode() and hex_decode() functions serve as the workhorses for hexadecimal data for bytea's text format conversion functions, and some workloads are sensitive to their performance. This commit adds new implementations that use routines from port/simd.h, which testing indicates are much faster for larger inputs. For small or invalid inputs, we fall back on the existing scalar versions. Since we are using port/simd.h, these optimizations apply to both x86-64 and AArch64. Author: Nathan Bossart <nathandbossart@gmail.com> Co-authored-by: Chiranmoy Bhattacharya <chiranmoy.bhattacharya@fujitsu.com> Co-authored-by: Susmitha Devanga <devanga.susmitha@fujitsu.com> Reviewed-by: John Naylor <johncnaylorls@gmail.com> Discussion: https://postgr.es/m/aLhVWTRy0QPbW2tl%40nathan	2025-10-06 12:28:50 -05:00
Andrew Dunstan	5b5e8a29c1	Revert "Improve docs syntax checking" This reverts commit `b292256272`. Further discussion is needed Discussion: https://postgr.es/m/0198ec0f-0269-4cf4-b4a7-22c05b3047cb@eisentraut.org	2025-10-06 07:53:31 -04:00
Amit Kapila	b93172ca59	Expose sequence page LSN via pg_get_sequence_data. This patch enhances the pg_get_sequence_data function to include the page-level LSN (Log Sequence Number) of the sequence. This additional metadata will be used by upcoming patches to support synchronization of sequences during logical replication. By exposing the LSN, we enable more accurate tracking of sequence changes, which is essential for maintaining consistency across replicated nodes. Author: vignesh C <vignesh21@gmail.com> Reviewed-by: shveta malik <shveta.malik@gmail.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://www.postgresql.org/message-id/CAA4eK1LC+KJiAkSrpE_NwvNdidw9F2os7GERUeSxSKv71gXysQ@mail.gmail.com	2025-10-06 08:30:16 +00:00
Michael Paquier	42c6b74d89	Add comment in ginxlog.h about block used with ginxlogInsertListPage All the other structures describe the list of blocks used, and in the case of a GIN_INSERT_LISTPAGE record block 0 refers to a list page with the items added to it. Author: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru> Discussion: https://postgr.es/m/CALdSSPgk=9WRoXhZy5fdk+T1hiau7qbL_vn94w_L1N=gtEdbsg@mail.gmail.com	2025-10-06 16:23:51 +09:00
Michael Paquier	7072a8855e	Remove block information from description of some WAL records for GIN The WAL records XLOG_GIN_INSERT and XLOG_GIN_VACUUM_DATA_LEAF_PAGE included some information about the blocks added to the record. This information is already provided by XLogRecGetBlockRefInfo() with much more details about the blocks included in each record, like the compression information, for example. This commit removes the block information that existed in the record descriptions specific to GIN. Author: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru> Discussion: https://postgr.es/m/CALdSSPgk=9WRoXhZy5fdk+T1hiau7qbL_vn94w_L1N=gtEdbsg@mail.gmail.com	2025-10-06 16:14:59 +09:00
Michael Paquier	a5b543258a	Add stats_reset to pg_stat_all_{tables,indexes} and related views It is possible to call pg_stat_reset_single_table_counters() on a relation (index or table) but the reset time was missing from the system views showing their statistics. This commit adds the reset time as an attribute of pg_stat_all_tables, pg_stat_all_indexes, and other relations related to them. Bump catalog version. Bump PGSTAT_FILE_FORMAT_ID, as a result of the new field added to PgStat_StatTabEntry. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Reviewed-by: Sami Imseih <samimseih@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/aN8l182jKxEq1h9f@paquier.xyz	2025-10-06 15:31:21 +09:00
Michael Paquier	c173aaff98	Add test for pg_stat_reset_single_table_counters() on index stats.sql is already doing some test coverage on index statistics, by retrieving for example idx_scan and friends in pg_stat_all_tables. pg_stat_reset_single_table_counters() has been supported for indexes for a long time, but the case was never covered. This commit closes the gap, by using this reset function on an index, cross-checking the contents of pg_stat_all_indexes. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://postgr.es/m/aN8l182jKxEq1h9f@paquier.xyz	2025-10-06 14:34:45 +09:00
Michael Paquier	0c7f103028	Fix two comments in numeric.c The comments at the top of numeric_int4_safe() and numeric_int8_safe() mentioned respectively int4_numeric() and int8_numeric(). The intention is to refer to numeric_int4() and numeric_int8(). Oversights in `4246a977ba`. Reported-by: jian he <jian.universality@gmail.com> Discussion: https://postgr.es/m/CACJufxFfVt7Jx9_j=juxXyP-6tznN8OcvS9E-QSgp0BrD8KUgA@mail.gmail.com	2025-10-06 11:18:30 +09:00
Tom Lane	ea78bd6d5d	Use SOCK_ERRNO[_SET] in fe-secure-gssapi.c. On Windows, this code did not handle error conditions correctly at all, since it looked at "errno" which is not used for socket-related errors on that platform. This resulted, for example, in failure to connect to a PostgreSQL server with GSSAPI enabled. We have a convention for dealing with this within libpq, which is to use SOCK_ERRNO and SOCK_ERRNO_SET rather than touching errno directly; but the GSSAPI code is a relative latecomer and did not get that memo. (The equivalent backend code continues to use errno, because the backend does this differently. Maybe libpq's approach should be rethought someday.) Apparently nobody tries to build libpq with GSSAPI support on Windows, or we'd have heard about this before, because it's been broken all along. Back-patch to all supported branches. Author: Ning Wu <ning94803@gmail.com> Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAFGqpvg-pRw=cdsUpKYfwY6D3d-m9tw8WMcAEE7HHWfm-oYWvw@mail.gmail.com Backpatch-through: 13	2025-10-05 16:27:47 -04:00
Álvaro Herrera	1a8b5b11e4	Don't include access/htup_details.h in executor/tuptable.h This is not at all needed; I suspect it was a simple mistake in commit `5408e233f0`. It causes htup_details.h to bleed into a huge number of places via execnodes.h. Remove it and fix fallout. Discussion: https://postgr.es/m/202510021240.ptc2zl5cvwen@alvherre.pgsql	2025-10-05 18:00:38 +02:00
Álvaro Herrera	1b6f61bd89	Don't include execnodes.h in brin.h or gin.h These headers don't need execnodes.h for anything. I think they never have. Discussion: https://postgr.es/m/202510021240.ptc2zl5cvwen@alvherre.pgsql	2025-10-05 17:35:25 +02:00
David Rowley	03d40e4b52	Teach UNION planner to remove dummy inputs This adjusts UNION planning so that the planner produces more optimal plans when one or more of the UNION's subqueries have been proven to be empty (a dummy rel). If any of the inputs are empty, then that input can be removed from the Append / MergeAppend. Previously, a const-false "Result" node would appear to represent this. Removing empty inputs has a few extra benefits when only 1 union child remains as it means the Append or MergeAppend can be removed in setrefs.c, making the plan slightly faster to execute. Also, we can provide better n_distinct estimates by looking at the sole remaining input rel's statistics. Author: David Rowley <dgrowleyml@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAApHDvri53PPF76c3M94_QNWbJfXjyCnjXuj_2=LYM-0m8WZtw@mail.gmail.com	2025-10-04 14:30:03 +13:00
David Rowley	5092aae431	Use bms_add_members() instead of bms_union() when possible bms_union() causes a new set to be allocated. What this caller needs is members added to an existing set. bms_add_members() is the tool for that job. This is just a matter of fixing an inefficiency due to surplus memory allocations. No bugs being fixed. The only other place I found that might be valid to apply this change is in markNullableIfNeeded(), but I opted not to do that due to the risk to reward ratio not looking favorable. The risk being that there could be another pointer pointing to the Bitmapset. Author: David Rowley <dgrowleyml@gmail.com> Reviewed-by: Greg Burd <greg@burd.me> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAApHDvoCcoS-p5tZNJLTxFOKTYNjqVh7Dwf+5ikDUBwnvWftRw@mail.gmail.com	2025-10-04 12:19:31 +13:00
Nathan Bossart	f8f4afe751	Optimize vector8_has_le() on AArch64. Presently, the SIMD implementation of this function uses unsigned saturating subtraction to find bytes less than or equal to the given value, which is a workaround for the lack of unsigned comparison instructions on some architectures. However, Neon offers vminvq_u8(), which returns the minimum (unsigned) value in the vector. This commit adds a Neon-specific implementation that uses vminvq_u8() to optimize vector8_has_le() on AArch64. In passing, adjust the SSE2 implementation to use vector8_min() and vector8_eq() to find values less than or equal to the given value. This was the only use of vector8_ssub(), so it has been removed. Reviewed-by: John Naylor <johncnaylorls@gmail.com> Discussion: https://postgr.es/m/aNHDNDSHleq0ogC_%40nathan	2025-10-03 14:02:47 -05:00
Nathan Bossart	74b41f5a77	Make some use of anonymous unions [DSM registry]. Make some use of anonymous unions, which are allowed as of C11, as examples and encouragement for future code, and to test compilers. This commit changes the DSMRegistryEntry struct. Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/aNKsDg0fJwqhZdXX%40nathan	2025-10-03 10:14:33 -05:00
David Rowley	a69b55cd47	Tidy-up unneeded NULL parameter checks from SQL function This function is marked as strict, so we can safely remove the checks checking for NULL input parameters. Author: David Rowley <dgrowleyml@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://postgr.es/m/CAApHDvqiN0+mbooUOSCDALc=GoM8DmTbCdvwnCwak6Wb2O1ZJA@mail.gmail.com	2025-10-03 23:04:37 +13:00
John Naylor	54ab748651	Fix reuse-after-free hazard in dead_items_reset In similar vein to commit `ccc8194e42`, a reset instance of a shared memory TID store happened to occupy the same private memory as the old one for the entry point, since the chunk freed after the last round of index vacuuming was put on the context's freelist. The failure to update the vacrel->dead_items pointer was evident by nudging the system to allocate memory in a different area. This was not discovered at the time of the earlier commit since our regression tests didn't cover multiple index passes with parallel vacuum. Backpatch to v17, when TidStore came in. Author: Kevin Oommen Anish <kevin.o@zohocorp.com> Reviewed-by: Richard Guo <guofenglinux@gmail.com> Tested-by: Richard Guo <guofenglinux@gmail.com> Discussion: https://postgr.es/m/199a07cbdfc.7a1c4aac25838.1675074408277594551%40zohocorp.com Backpatch-through: 17	2025-10-03 16:05:02 +07:00
Richard Guo	605bfb7dbe	Fix incorrect function reference in comment The comment incorrectly references the defunct function BufFileOpenShared(), which was replaced in commit `dcac5e7ac`. This patch updates the comment to refer to the current function BufFileOpenFileSet(). Author: Zhang Mingli <zmlpostgres@gmail.com> Reviewed-by: wenhui qiu <qiuwenhuifx@gmail.com> Reviewed-by: Richard Guo <guofenglinux@gmail.com> Discussion: https://postgr.es/m/1cb48b4c-54ab-40cc-b355-0b3c2af6d3f7@Spark	2025-10-03 16:34:42 +09:00
Michael Paquier	902c08887a	pgbench: Fail cleanly when finding a COPY result state Currently, pgbench aborts when a COPY response is received in readCommandResponse(). However, as PQgetResult() returns an empty result when there is no asynchronous result, through getCopyResult(), the logic done at the end of readCommandResponse() for the error path leads to an infinite loop. This commit forcefully exits the COPY state with PQendcopy() before moving to the error handler when finding a COPY state, avoiding the infinite loop. The COPY protocol is not supported by pgbench anyway, as an error is assumed in this case, so giving up is better than having the tool be stuck forever. pgbench was interruptible in this state. A TAP test is added to check that an error happens if trying to use COPY. Author: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com> Discussion: https://postgr.es/m/CAO6_XqpHyF2m73ifV5a=5jhXxH2chk=XrgefY+eWWPe2Eft3=A@mail.gmail.com Backpatch-through: 13	2025-10-03 14:03:55 +09:00
Tatsuo Ishii	25a30bbd42	Add IGNORE NULLS/RESPECT NULLS option to Window functions. Add IGNORE NULLS/RESPECT NULLS option (null treatment clause) to lead, lag, first_value, last_value and nth_value window functions. If unspecified, the default is RESPECT NULLS which includes NULL values in any result calculation. IGNORE NULLS ignores NULL values. Built-in window functions are modified to call new API WinCheckAndInitializeNullTreatment() to indicate whether they accept IGNORE NULLS/RESPECT NULLS option or not (the API can be called by user defined window functions as well). If WinGetFuncArgInPartition's allowNullTreatment argument is true and IGNORE NULLS option is given, WinGetFuncArgInPartition() or WinGetFuncArgInFrame() will return evaluated function's argument expression on specified non NULL row (if it exists) in the partition or the frame. When IGNORE NULLS option is given, window functions need to visit and evaluate same rows over and over again to look for non null rows. To mitigate the issue, 2-bit not null information array is created while executing window functions to remember whether the row has been already evaluated to NULL or NOT NULL. If already evaluated, we could skip the evaluation work, thus we could get better performance. Author: Oliver Ford <ojford@gmail.com> Co-authored-by: Tatsuo Ishii <ishii@postgresql.org> Reviewed-by: Krasiyan Andreev <krasiyan@gmail.com> Reviewed-by: Andrew Gierth <andrew@tao11.riddles.org.uk> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: David Fetter <david@fetter.org> Reviewed-by: Vik Fearing <vik@postgresfriends.org> Reviewed-by: "David G. Johnston" <david.g.johnston@gmail.com> Reviewed-by: Chao Li <lic@highgo.com> Discussion: https://postgr.es/m/flat/CAGMVOdsbtRwE_4+v8zjH1d9xfovDeQAGLkP_B6k69_VoFEgX-A@mail.gmail.com	2025-10-03 09:47:36 +09:00
Daniel Gustafsson	381f5cffae	Remove check for NULL in STRICT function test_bms_make_singleton is defined as STRICT and only takes a single parameter, so there is no need to check that parameter for NULL as a NULL input won't ever reach there. Author: Daniel Gustafsson <daniel@yesql.se> Discussion: https://postgr.es/m/BC483901-9587-4076-B20F-9A414C66AB78@yesql.se	2025-10-02 22:54:37 +02:00
Daniel Gustafsson	a1b064e4b2	Fixes for comments in test_bitmapset This fixes a typo in the sql/expected test files and removes a leftover comment from test_bitmapset.c from when the functions invoked bms_free. Author: Daniel Gustafsson <daniel@yesql.se> Reported-by: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/978D21E8-9D3B-40EA-A4B1-F87BABE7868C@yesql.se	2025-10-02 22:41:24 +02:00
Andrew Dunstan	b292256272	Improve docs syntax checking Move the checks out of the Makefile into a perl script that can be called from both the Makefile and meson.build. The set of files checked is simplified, so it is just all the sgml and xsl files found in docs/src/sgml directory tree. Along the way make some adjustments to .cirrus.tasks.yml to support this better in CI. Also ensure that the checks are part of the Makefile's html target. Author: Nazir Bilal Yavuz <byavuz81@gmail.com> Co-Author: Andrew Dunstan <andrew@dunslane.net> Discussion: https://postgr.es/m/CAN55FZ3BnM+0twT-ZWL8As9oBEte_b+SBU==cz6Hk8JUCM_5Wg@mail.gmail.com	2025-10-02 10:26:32 -04:00
Daniel Gustafsson	482bc0705d	doc: Improve wording for base64url definition This sentence should be "the alphabet uses" due to it referring to multiple cases of use. Reported-by: Erik Rijkers <er@xs4all.nl> Discussion: https://postgr.es/m/81d6ab37-92dc-75c9-a649-4e1286a343ea@xs4all.nl	2025-10-02 11:47:46 +02:00
Michael Paquier	3f431109dc	Remove useless pointer update in ginxlog.c Oversight in `2c03216d83`, when the redo code of GIN got refactored for the new WAL format where block information has been standardized, as the payload data got tracked for each block after the change, and not in the whole record. This is just a cleanup. Author: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru> Discussion: https://postgr.es/m/CALdSSPgnAt5L=D_xGXRXLYO5FK1H31_eYEESxdU1n-r4g+6GqA@mail.gmail.com	2025-10-02 17:16:20 +09:00
John Naylor	48566180ef	Generate EUC_CN mappings from gb18030-2022.ucm In the wake of `cfa6cd292`, EUC_CN was the only encoding that used gb-18030-2000.xml to generate the .map files. Since EUC_CN is a subset of GB18030, we can easily use the same UCM file. This allows deleting the XML file from our repository. Author: Chao Li <lic@highgo.com> Discussion: https://postgr.es/m/CANWCAZaNRXZ-5NuXmsaMA2mKvMZnCGHZqQusLkpE%2B8YX%2Bi5OYg%40mail.gmail.com	2025-10-02 12:36:24 +07:00
Michael Paquier	684a745f55	pgstattuple: Improve reports generated for indexes (hash, gist, btree) pgstattuple checks the state of the pages retrieved for gist and hash using some check functions from each index AM, respectively gistcheckpage() and _hash_checkpage(). When these are called, they would fail when bumping into data that is found as incorrect (like opaque area size not matching, or empty pages), contrary to btree that simply discards these cases and continues to aggregate data. Zero pages can happen after a crash, with these AMs being able to do an internal cleanup when these are seen. Also, sporadic failures are annoying when doing for example a large-scale diagnostic query based on pgstattuple with a join of pg_class, as it forces one to use tricks like quals to discard hash or gist indexes, or use a PL wrapper able to catch errors. This commit changes the reports generated for btree, gist and hash to be more user-friendly; - When seeing an empty page, report it as free space. This new rule applies to gist and hash, and already applied to btree. - For btree, a check based on the size of BTPageOpaqueData is added. - For gist indexes, gistcheckpage() is not called anymore, replaced by a check based on the size of GISTPageOpaqueData. - For hash indexes, instead of _hash_getbuf_with_strategy(), use a direct call to ReadBufferExtended(), coupled with a check based on HashPageOpaqueData. The opaque area size check was already used. - Pages that do not match these criteria are discarded from the stats reports generated. There have been a couple of bug reports over the years that complained about the current behavior for hash and gist, as being not that useful, with nothing being done about it. Hence this change is backpatched down to v13. 
Reported-by: Noah Misch <noah@leadboat.com> Author: Nitin Motiani <nitinmotiani@google.com> Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com> Discussion: https://postgr.es/m/CAH5HC95gT1J3dRYK4qEnaywG8RqjbwDdt04wuj8p39R=HukayA@mail.gmail.com Backpatch-through: 13	2025-10-02 11:07:30 +09:00
Jacob Champion	fd726b8379	test_json_parser: Speed up 002_inline.pl Some macOS machines are having trouble with 002_inline, which executes the JSON parser test executables hundreds of times in a nested loop. Both developer machines and buildfarm critters have shown excessive test durations, upwards of 20 seconds. Push the innermost loop of 002_inline, which iterates through differing chunk sizes, down into the test executable. (I'd eventually like to push all of the JSON unit tests down into C, but this is an easy win in the short term.) Testers have reported a speedup between 4-9x. Reported-by: Robert Haas <robertmhaas@gmail.com> Suggested-by: Andres Freund <andres@anarazel.de> Tested-by: Andrew Dunstan <andrew@dunslane.net> Tested-by: Tom Lane <tgl@sss.pgh.pa.us> Tested-by: Robert Haas <robertmhaas@gmail.com> Discussion: https://postgr.es/m/CA%2BTgmobKoG%2BgKzH9qB7uE4MFo-z1hn7UngqAe9b0UqNbn3_XGQ%40mail.gmail.com Backpatch-through: 17	2025-10-01 09:48:57 -07:00
Peter Eisentraut	3e908fb54f	Fix compiler warnings around _CRT_glob Newer compilers warned about extern int _CRT_glob = 0; which is indeed a mysterious C construction, as it combines "extern" and an initialization. It turns out that according to the C standard, the "extern" is ignored here, so we can remove it to resolve the warnings. But then we also need to add a real extern declaration (without initializer) to satisfy -Wmissing-variable-declarations. (Note that this code is only active on MinGW.) Discussion: https://www.postgresql.org/message-id/1053279b-da01-4eb4-b7a3-da6b5d8f73d1%40eisentraut.org	2025-10-01 17:13:52 +02:00
David Rowley	3a66158068	Minor fixups of test_bitmapset.c The macro's comment had become outdated from a prior version and there's now no longer a need for the do/while loop (or my misplaced semi-colon). Author: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/CAApHDvr+P454SP_LDvB=bViPAbDQhV1Jmg5M55GEKz0d3z25NA@mail.gmail.com	2025-10-01 18:50:28 +13:00
Michael Paquier	9952f6c05a	test_bitmapset: Simplify code of the module Two macros are added in this module, to cut duplicated patterns: - PG_ARG_GETBITMAPSET(), for input argument handling, with knowledge about NULL. - PG_RETURN_BITMAPSET_AS_TEXT(), that generates a text result from a Bitmapset. These changes limit the code so that the SQL functions are now mostly wrappers of the equivalent C function. Functions that use integer input arguments still need some NULL handling, like bms_make_singleton(). A NULL input is translated to "<>", which is what nodeToString() generates. Some of the tests are able to generate this result. Per discussion, the calls of bms_free() are removed. These may be justified if the functions are used in a rather long-lived memory context, but let's keep the code minimal for now. These calls used NULL checks, which were also not necessary as NULL is an input authorized by bms_free(). Some of the tests existed to cover behaviors related to the SQL functions for NULL inputs. Most of them are still relevant, as the routines of bitmapset.c are able to handle such cases. The coverage reports of bitmapset.c and test_bitmapset.c remain the same after these changes, with 300 lines of C code removed. Author: David Rowley <dgrowleyml@gmail.com> Co-authored-by: Greg Burd <greg@burd.me> Discussion: https://postgr.es/m/CAApHDvqghMnm_zgSNefto9oaEJ0S-3Cgb3gdsV7XvLC-hMS02Q@mail.gmail.com	2025-10-01 14:17:54 +09:00
Peter Eisentraut	8e2acda2b0	Rename pg_builtin_integer_constant_p to pg_integer_constant_p Since it's not builtin. Also fix a related typo. Reviewed-by: David Rowley <dgrowleyml@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CAApHDvom02B_XNVSkvxznVUyZbjGAR%2B5myA89ZcbEd3%3DPA9UcA%40mail.gmail.com	2025-09-30 21:15:46 +02:00
Fujii Masao	19d4f9ffc2	pgbench: Fix error reporting in readCommandResponse(). pgbench uses readCommandResponse() to process server responses. When readCommandResponse() encounters an error during a call to PQgetResult() to fetch the current result, it attempts to report it with an additional error message from PQerrorMessage(). However, previously, this extra error message could be lost or become incorrect. The cause was that after fetching the current result (and detecting an error), readCommandResponse() called PQgetResult() again to peek at the next result. This second call could overwrite the libpq connection's error message before the original error was reported, causing the error message retrieved from PQerrorMessage() to be lost or overwritten. This commit fixes the issue by updating readCommandResponse() to use PQresultErrorMessage() instead of PQerrorMessage() to retrieve the error message generated when the PQgetResult() for the current result causes an error, ensuring the correct message is reported. Backpatch to all supported versions. Author: Yugo Nagata <nagata@sraoss.co.jp> Reviewed-by: Chao Li <lic@highgo.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/20250925110940.ebacc31725758ec47d5432c6@sraoss.co.jp Backpatch-through: 13	2025-09-30 23:52:28 +09:00
David Rowley	91df0465a6	Fix typo in pgstat_relation.c header comment Looks like a copy and paste error from pgstat_function.c Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://postgr.es/m/aNuaVMAdTGbgBgqh@ip-10-97-1-34.eu-west-3.compute.internal	2025-10-01 00:23:38 +13:00
Peter Eisentraut	f5aabe6d58	Revert "Make some use of anonymous unions [pgcrypto]" This reverts commit `efcd5199d8`. I rebased my patch series incorrectly. This patch contained unrelated parts from another patch, which made the overall build fail. Revert for now and reconsider.	2025-09-30 13:12:16 +02:00
Peter Eisentraut	8b7f27fef3	Make some use of anonymous unions [plpython] Make some use of anonymous unions, which are allowed as of C11, as examples and encouragement for future code, and to test compilers. This commit changes some structures in plpython. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/f00a9968-388e-4f8c-b5ef-5102e962d997%40eisentraut.org	2025-09-30 12:35:50 +02:00
Peter Eisentraut	efcd5199d8	Make some use of anonymous unions [pgcrypto] Make some use of anonymous unions, which are allowed as of C11, as examples and encouragement for future code, and to test compilers. This commit changes some structures in pgcrypto. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/f00a9968-388e-4f8c-b5ef-5102e962d997%40eisentraut.org	2025-09-30 12:35:50 +02:00
Peter Eisentraut	57d46dff9b	Make some use of anonymous unions [reorderbuffer xact_time] Make some use of anonymous unions, which are allowed as of C11, as examples and encouragement for future code, and to test compilers. This commit changes the ReorderBufferTXN struct. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/f00a9968-388e-4f8c-b5ef-5102e962d997%40eisentraut.org	2025-09-30 12:35:50 +02:00
Peter Eisentraut	4b7e6c73b0	Make some use of anonymous unions [pg_locale_t] Make some use of anonymous unions, which are allowed as of C11, as examples and encouragement for future code, and to test compilers. This commit changes the pg_locale_t type. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/f00a9968-388e-4f8c-b5ef-5102e962d997%40eisentraut.org	2025-09-30 12:35:50 +02:00
Álvaro Herrera	3bf31dd243	Do a tiny bit of header file maintenance Stop including utils/relcache.h in access/genam.h, and stop including htup_details.h in nodes/tidbitmap.h. Both these files (genam.h and tidbitmap.h) are widely used in other header files, so it's in our best interest that they remain as lean as reasonable. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/202509291356.o5t6ny2hoa3q@alvherre.pgsql	2025-09-30 12:28:29 +02:00
Michael Paquier	bb68cde413	Reorder XLogNeedsFlush() checks to be more consistent During recovery, XLogNeedsFlush() checks the minimum recovery LSN point instead of the flush LSN point. The same condition checks are used when updating the minimum recovery point in UpdateMinRecoveryPoint(), but are written in reverse order. This commit makes the order of the checks consistent between XLogNeedsFlush() and UpdateMinRecoveryPoint(), improving the code clarity. Note that the second check (as ordered by this commit) relies on InRecovery, which is true only in the startup process. So this makes XLogNeedsFlush() cheaper in the startup process with the first check acting as a shortcut while doing crash recovery, where LocalMinRecoveryPoint is an invalid LSN. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com> Discussion: https://postgr.es/m/aMIHNRTP6Wj6vw1s%40paquier.xyz	2025-09-30 09:38:32 +09:00
Michael Paquier	3cc689f833	injection_points: Add proper locking when reporting fixed-variable stats Contrary to its siblings for the archiver, the bgwriter and the checkpointer stats, pgstat_report_inj_fixed() can be called concurrently. This was causing an assertion failure, while messing up with the stats. This code is aimed at being a template for extension developers, so it is not a critical issue, but let's be correct. This module has also been useful for some benchmarking, at least for me, and that was how I have discovered this issue. Oversight in `f68cd847fa`. Author: Michael Paquier <michael@paquier.xyz> Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: wenhui qiu <qiuwenhuifx@gmail.com> Discussion: https://postgr.es/m/aNnXbAXHPFUWPIz2@paquier.xyz Backpatch-through: 18	2025-09-30 09:02:09 +09:00
Tom Lane	ef38a4d975	Add GROUP BY ALL. GROUP BY ALL is a form of GROUP BY that adds any TargetExpr that does not contain an aggregate or window function into the groupClause of the query, making it exactly equivalent to specifying those same expressions in an explicit GROUP BY list. This feature is useful for certain kinds of data exploration. It's already present in some other DBMSes, and the SQL committee recently accepted it into the standard, so we can be reasonably confident in the syntax being stable. We do have to invent part of the semantics, as the standard doesn't allow for expressions in GROUP BY, so they haven't specified what to do with window functions. We assume that those should be treated like aggregates, i.e., left out of the constructed GROUP BY list. In passing, wordsmith some existing documentation about GROUP BY, and update some neglected synopsis entries in select_into.sgml. Author: David Christensen <david@pgguru.net> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAHM0NXjz0kDwtzoe-fnHAqPB1qA8_VJN0XAmCgUZ+iPnvP5LbA@mail.gmail.com	2025-09-29 16:55:17 -04:00
David Rowley	b91067c899	Remove unused parameter from find_window_run_conditions() ... and check_and_push_window_quals(). Similar to `4be9024d5`, but it seems there was yet another unused parameter. Author: Matheus Alcantara <matheusssilv97@gmail.com> Discussion: https://postgr.es/m/DD5BEKORUG34.2M8492NMB9DB8@gmail.com	2025-09-30 08:37:42 +13:00
Noah Misch	a95393ecdb	Fix StatisticsObjIsVisibleExt() for pg_temp. Neighbor get_statistics_object_oid() ignores objects in pg_temp, as has been the standard for non-relation, non-type namespace searches since CVE-2007-2138. Hence, most operations that name a statistics object correctly decline to map an unqualified name to a statistics object in pg_temp. StatisticsObjIsVisibleExt() did not. Consequently, pg_statistics_obj_is_visible() wrongly returned true for such objects, psql \dX wrongly listed them, and getObjectDescription()-based ereport() and pg_describe_object() wrongly omitted namespace qualification. Any malfunction beyond that would depend on how a human or application acts on those wrong indications. Commit `d99d58cdc8` introduced this. Back-patch to v13 (all supported versions). Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Discussion: https://postgr.es/m/20250920162116.2e.nmisch@google.com Backpatch-through: 13	2025-09-29 11:15:44 -07:00
Michael Paquier	5668fff3c5	test_bitmapset: Expand more the test coverage This commit expands the set of tests added by `00c3d87a5c`, to bring the coverage of bitmapset.c close to 100% by addressing a lot of corner cases (most of these relate to word counts and reallocations). Some of the functions of this module also have their own idea of the result to return depending on the input values given. These are specific to the module, still let's add more coverage for all of them. Some comments are made more consistent in the tests, while on it. Author: Greg Burd <greg@burd.me> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/aNR-gsGmLnMaNT5i@paquier.xyz	2025-09-29 15:17:27 +09:00
David Rowley	2cb49c609b	Improve planner's width estimates for set operations For UNION, EXCEPT and INTERSECT, we were not very good at estimating the PathTarget.width for the set operation. Since the targetlist of the set operation is made up of Vars with varno==0, this would result in get_expr_width() applying a default estimate based on the Var's type rather than taking width estimates from any relation's statistics. Here we attempt to improve the situation by looking at the width estimates for the set operation child paths and calculating the average width of the relevant child paths weighted over the estimated number of rows. For UNION and INTERSECT, the relevant paths to look at are all child paths. For EXCEPT, since we don't return rows from the right-hand child (only possibly remove left-hand rows matching those), we use only the left-hand child for width estimates. This also adjusts the hashed-UNION Path's PathTarget to use the same PathTarget as its Append subpath. Both PathTargets will be the same and are void of any resjunk columns, per generate_append_tlist(). Making the AggPath use the same PathTarget saves having to adjust the "width" of the AggPath's PathTarget too. This was reported as a bug by sunw.fnst, but it's not something we ever claimed to do properly. Plus, if we were to adjust this in back branches, plans could change as the estimated input sizes to Sorts and Hash Aggregates could go up or down. Plan choices aren't something we want to destabilize in stable versions. Reported-by: sunw.fnst <936739278@qq.com> Author: David Rowley <drowleyml@gmail.com> Discussion: https://postgr.es/m/tencent_34CF8017AB81944A4C08DD089D410AB6C306@qq.com	2025-09-29 14:36:39 +13:00
Michael Paquier	acf0960c23	injection_points: Enable entry count in its variable-sized stats This serves as coverage for the tracking of entry count added by `7bd2975fa9` as built-in variable-sized stats kinds have no need for it, at least not yet. A new function, called injection_points_stats_count(), is added to the module. It is able to return the number of entries. This has been useful when doing some benchmarking to check the sanity of the counts. Author: Michael Paquier <michael@paquier.xyz> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/aMPKWR81KT5UXvEr@paquier.xyz	2025-09-29 09:06:32 +09:00
Michael Paquier	7bd2975fa9	Add support for tracking of entry count in pgstats Stats kinds can set a new option called "track_entry_count" (disabled by default, available for variable-numbered stats) that will make pgstats track the number of entries that exist in its shared hashtable. As there is only one code path where a new entry is added, and one code path where entries are freed, the count tracking is straight-forward in its implementation. Reads of these counters are optimistic, and may change across two calls. The counter is incremented when an entry is created (not when reused), and is decremented when an entry is freed from the hashtable (marked for drop with its refcount reaching 0), which is something that pgstats decides internally. A first use case of this facility would be pg_stat_statements, where we need to be able to cap the number of entries that would be stored in the shared hashtable, based on its "max" GUC. The module currently relies on hash_get_num_entries(), which offers a cheap way to count how many entries are in its hash table, but we cannot do that in pgstats for variable-sized stats kinds as a single hashtable is used for all the stats kinds. Independently of PGSS, this is useful for other custom stats kinds that want to cap, control, or track the number of entries they have, without depending on a potentially expensive sequential scan to know the number of entries while holding an extra exclusive lock. Author: Michael Paquier <michael@paquier.xyz> Reviewed-by: Sami Imseih <samimseih@gmail.com> Reviewed-by: Keisuke Kuroda <keisuke.kuroda.3862@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/aMPKWR81KT5UXvEr@paquier.xyz	2025-09-29 08:57:57 +09:00
Tom Lane	b0fb2c6aa5	Refactor to avoid code duplication in transformPLAssignStmt. transformPLAssignStmt contained many lines cribbed directly from transformSelectStmt. I had supposed that we could manage to keep the two copies in sync, but the bug just fixed in `7504d2be9` shows that that hope was foolish. Let's refactor so there's just one copy. The main stumbling block to doing this is that transformPLAssignStmt has a chunk of custom code that has to run after transformTargetList but before we potentially modify the tlist further during analysis of ORDER BY and GROUP BY. Rather than make transformSelectStmt fully aware of PLAssignStmt processing, I put that code into a callback function. It still feels a little bit ugly, but it's not too awful, and surely it's better than a hundred lines of duplicated code. The steps involved in processing a PLAssignStmt remain exactly the same as before, just in different places. Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/31027.1758919078@sss.pgh.pa.us	2025-09-27 17:17:51 -04:00
Tom Lane	7504d2be9e	Fix missed copying of groupDistinct in transformPLAssignStmt. Because we failed to do this, DISTINCT in GROUP BY DISTINCT would be ignored in PL/pgSQL assignment statements. It's not surprising that no one noticed, since such statements will throw an error if the query produces more than one row. That eliminates most scenarios where advanced forms of GROUP BY could be useful, and indeed makes it hard even to find a simple test case. Nonetheless it's wrong. This is directly the fault of `be45be9c3` which added the groupDistinct field, but I think much of the blame has to fall on `c9d529848`, in which I incautiously supposed that we'd manage to keep two copies of a big chunk of parse-analysis logic in sync. As a follow-up, I plan to refactor so that there's only one copy. But that seems useful only in master, so let's use this one-line fix for the back branches. Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/31027.1758919078@sss.pgh.pa.us Backpatch-through: 14	2025-09-27 14:29:41 -04:00
David Rowley	59c2f03d1e	Teach MSVC that elog/ereport ERROR doesn't return It had always been intended that this already works correctly as pg_unreachable() uses __assume(0) on MSVC, and that directs the compiler in a way so it knows that a given function won't return. However, with ereport_domain(), it didn't work... It's now understood that the failure to determine that elog(ERROR) does not return comes from the inability of the MSVC compiler to detect the "const int elevel_" is the same as the "elevel" macro parameter. MSVC seems to be unable to make the "if (elevel_ >= ERROR)" branch constantly-true when the macro is used with any elevel >= ERROR, therefore the pg_unreachable() is seen to only be present in a conditional branch rather than present unconditionally. While there seems to be no way to force the compiler into knowing that elevel_ is equal to elevel within the ereport_domain() macro, there is a way in C11 to determine if the elevel parameter is a compile-time constant or not. This is done via some hackery using the _Generic() intrinsic function, which gives us functionality similar to GCC's __builtin_constant_p(), albeit only for integers. Here we define pg_builtin_integer_constant_p() for this purpose. Callers can check for availability via HAVE_PG_BUILTIN_INTEGER_CONSTANT_P. ereport_domain() has been adjusted to use pg_builtin_integer_constant_p() instead of __builtin_constant_p(). It's not quite clear at this stage if this now allows us to forego doing the likes of "return NULL; /* keep compiler quiet */" as there may be other compilers in use that have similar struggles. It's just a matter of time before someone commits a function that does not "return" a value after an elog(ERROR). Let's make time and lack of complaints about said commit be the judge of if we need to continue the "/* keep compiler quiet */" palaver. Author: David Rowley <drowleyml@gmail.com> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/CAApHDvom02B_XNVSkvxznVUyZbjGAR+5myA89ZcbEd3=PA9UcA@mail.gmail.com	2025-09-27 22:41:04 +12:00
Masahiko Sawada	66cdef4425	Remove unused for_all_tables field from AlterPublicationStmt. No backpatch as AlterPublicationStmt struct is exposed. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/CAD21AoC6B6AuxWOST-TkxUbDgp8FwX=BLEJZmKLG_VrH-hfxpA@mail.gmail.com	2025-09-26 09:23:00 -07:00
Álvaro Herrera	c4067383cb	Split vacuumdb to create vacuuming.c/h This allows these routines to be reused by a future utility heavily based on vacuumdb. I made a few relatively minor changes from the original, most notably: - objfilter was declared as an enum but the values are bit-or'ed, and individual bits are tested throughout the code. We've discussed this coding pattern in other contexts and stayed away from it, on the grounds that the values so generated aren't really true values of the enum. This commit changes it to be a bits32 with a few #defines for the flag definitions, the way we do elsewhere. Also, instead of being a global variable, it's now in the vacuumingOptions struct. - Two booleans, analyze_only (in vacuumingOptions) and analyze_in_stages (passed around as a separate boolean argument), are really determining what "mode" the program runs in -- it's either vacuum, or one of those two modes. I have three adjectives for them: inconsistent, unergonomic, unorthodox. Removing these and replacing them with a mode enum to be kept in vacuumingOptions makes the code structure easier to understand in a couple of places, and it'll be useful for the new mode we add next, so do that. Reviewed-by: Antonin Houska <ah@cybertec.at> Discussion: https://postgr.es/m/202508301750.cbohxyy2pcce@alvherre.pgsql	2025-09-26 16:21:28 +02:00
Álvaro Herrera	dbf8cfb4f0	Create a separate file listing backend types Use our established coding pattern to reduce maintenance pain when adding other per-process-type characteristics. Like PG_KEYWORD, PG_CMDTAG, PG_RMGR. To keep the strings translatable, the relevant makefile now also scans src/include for this specific file. I didn't want to have it scan all .h files, as then gettext would have to scan all header files. I didn't find any way to affect the meson behavior in this respect though. Author: Álvaro Herrera <alvherre@kurilemu.de> Co-authored-by: Jonathan Gonzalez V. <jonathan.abdiel@gmail.com> Discussion: https://postgr.es/m/202507151830.dwgz5nmmqtdy@alvherre.pgsql	2025-09-26 15:21:49 +02:00
Fujii Masao	8bb174295e	pgbench: Fix assertion failure with retriable errors in pipeline mode. When running pgbench with --verbose-errors option and a custom script that triggered retriable errors (e.g., serialization errors) in pipeline mode, an assertion failure could occur: Assertion failed: (sql_script[st->use_file].commands[st->command]->type == 1), function commandError, file pgbench.c, line 3062. The failure happened because pgbench assumed these errors would only occur during SQL commands, but in pipeline mode they can also happen during \endpipeline meta command. This commit fixes the assertion failure by adjusting the assertion check to allow such errors during either SQL commands or \endpipeline. Backpatch to v15, where the assertion check was introduced. Author: Yugo Nagata <nagata@sraoss.co.jp> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/CAHGQGwGWQMOzNkQs-LmpDHdNC0h8dmAuUMRvZrEntQi5a-b=Kg@mail.gmail.com	2025-09-26 21:23:43 +09:00
Michael Paquier	85e0ff62b6	Improve stability of btree page split on ERRORs This improves the stability of VACUUM when processing btree indexes, which was previously able to trigger an assertion failure in _bt_lock_subtree_parent() when an error was previously thrown outside the scope of _bt_split() when splitting a btree page. VACUUM would consider the index as in a corrupted state as the right page would not be zeroed for the error thrown (allocation failure is one pattern). In a non-assert build, VACUUM is able to succeed, reporting what it sees as a corruption while attempting to fix the index. This would manifest as a LOG message, as of: LOG: failed to re-find parent key in index "idx" for deletion target page N CONTEXT: while vacuuming index "idx" of relation "public.tab" This commit improves the code to rely on two PGAlignedBlocks that are used as a temporary space for the left and right pages. The main change concerns the right page, whose contents are now copied into the "temporary" PGAlignedBlock page while its original space is zeroed. Its contents are moved from the PGAlignedBlock page back to the page once we enter in the critical section used for the split. This simplifies the split logic, as it is not necessary to zero the right page before throwing an error anymore. Hence errors can now be thrown outside the split code. For the left page, this shaves one allocation, with PageGetTempPage() being previously used. The previous logic originates from commit `8fa30f906b`, at a point where PGAlignedBlock did not exist yet. This could be argued as something that should be backpatched, but the lack of complaints indicates that it may not be necessary. Author: Konstantin Knizhnik <knizhnik@garret.ru> Discussion: https://postgr.es/m/566dacaf-5751-47e4-abc6-73de17a5d42a@garret.ru	2025-09-26 08:41:06 +09:00
David Rowley	3760d278dc	Fix misleading comment in pg_get_statisticsobjdef_string() The comment claimed that a TABLESPACE reference was added to the resulting string, but that's not true. Looks like the comment was copied from pg_get_indexdef_string() without being adjusted correctly. Reported-by: jian he <jian.universality@gmail.com> Discussion: https://postgr.es/m/CACJufxHwVPgeu8o9D8oUeDQYEHTAZGt-J5uaJNgYMzkAW7MiCA@mail.gmail.com	2025-09-26 11:04:15 +12:00
David Rowley	4be9024d57	Remove unused parameter from check_and_push_window_quals ... and find_window_run_conditions. This seems to have been around and unused ever since the Run Condition feature was added in `9d9c02ccd`. Let's remove it to clean things up a bit. Author: Matheus Alcantara <matheusssilv97@gmail.com> Discussion: https://postgr.es/m/DD26NJ0Y34ZS.2ZOJPHSY12PFI@gmail.com	2025-09-26 10:21:30 +12:00
Masahiko Sawada	76418a0b67	psql: Add COMPLETE_WITH_FILES and COMPLETE_WITH_GENERATOR macros. While most tab completions in match_previous_words() use COMPLETE_WITH* macros to wrap rl_completion_matches(), some direct calls to rl_completion_matches() still remained. This commit introduces COMPLETE_WITH_FILES and COMPLETE_WITH_GENERATOR macros to replace these direct calls, enhancing both code consistency and readability. Author: Yugo Nagata <nagata@sraoss.co.jp> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Discussion: https://postgr.es/m/20250605100835.b396f9d656df1018f65a4556@sraoss.co.jp	2025-09-25 14:28:01 -07:00
Tom Lane	02c4bc8830	Try to avoid floating-point roundoff error in pg_sleep(). I noticed the surprising behavior that pg_sleep(0.001) will sleep for 2ms not the expected 1ms. Apparently the float8 calculation of time-to-sleep is managing to produce something a hair over 1, which ceil() rounds up to 2, and then WaitLatch() faithfully waits 2ms. It could be that this works as-expected for some ranges of current timestamp but not others, which would account for not having seen it before. In any case, let's try to avoid it by removing the float arithmetic in the delay calculation. We're stuck with the declared input type being float8, but we can convert that to integer microseconds right away, and then work strictly with integral values. There might still be roundoff surprises for certain input values, but at least the behavior won't be time-varying. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Discussion: https://postgr.es/m/3879137.1758825752@sss.pgh.pa.us	2025-09-25 17:02:15 -04:00
Tom Lane	e849bd551c	Add minimal sleep to stats isolation test functions. The functions test_stat_func() and test_stat_func2() had empty function bodies, so that they took very little time to run. This made it possible that on machines with relatively low timer resolution the functions could return before the clock advanced, making the test fail (as seen on buildfarm members fruitcrow and hamerkop). To avoid that, pg_sleep for 10us during the functions. As far as we can tell, all current hardware has clock resolution much less than that. (The current implementation of pg_sleep will round it up to 1ms anyway, but someday that might get improved.) Author: Michael Banck <mbanck@gmx.net> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/68d413a3.a70a0220.24c74c.8be9@mx.google.com Backpatch-through: 15	2025-09-25 13:29:37 -04:00
Robert Haas	803ef0ed49	Fix array allocation bugs in SetExplainExtensionState. If we already have an extension_state array but see a new extension_id much larger than the highest extension_id we've previously seen, the old code might have failed to expand the array to a large enough size, leading to disaster. Also, if we don't have an extension array at all and need to create one, we should make sure that it's big enough that we don't have to resize it instantly. Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: http://postgr.es/m/2949591.1758570711@sss.pgh.pa.us Backpatch-through: 18	2025-09-25 11:43:52 -04:00
Tom Lane	507aa16125	Doc: clean up documentation for new UUID functions. Fix assorted failures to conform to our normal style for function documentation, such as lack of parentheses and incorrect markup. Author: Marcos Pegoraro <marcos@f10.com.br> Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAB-JLwbocrFjKfGHoKY43pHTf49Ca2O0j3WVebC8z-eQBMPJyw@mail.gmail.com Backpatch-through: 18	2025-09-25 11:23:27 -04:00
Tom Lane	170a8a3f46	Teach doc/src/sgml/Makefile about the new func/*.sgml files. These were omitted from build dependencies and also tab/nbsp checks, with the result that "make" did nothing after modifying a func/*.sgml file. Oversight in `4e23c9ef6`. AFAICT we don't need any comparable changes in meson.build, or at least I don't see it doing anything special for the pre-existing ref/*.sgml files.	2025-09-25 11:09:26 -04:00
Daniel Gustafsson	0b3ce7878a	Remove preprocessor guards from injection points When defining an injection point there is no need to wrap the definition with USE_INJECTION_POINT guards, the INJECTION_POINT macro is available in all builds. Remove to make the code consistent. Author: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://postgr.es/m/OSCPR01MB14966C8015DEB05ABEF2CE077F51FA@OSCPR01MB14966.jpnprd01.prod.outlook.com Backpatch-through: 17	2025-09-25 15:27:33 +02:00
Daniel Gustafsson	d8f07dbb81	Fix comments in recovery tests Commit `4464fddf` removed the large insertions but failed to remove all the comments referring to them. Also remove a superfluous ')' in another comment. Author: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://postgr.es/m/OSCPR01MB149663A99DAF2826BE691C23DF51FA@OSCPR01MB14966.jpnprd01.prod.outlook.com	2025-09-25 15:24:41 +02:00
Álvaro Herrera	7e638d7f50	Don't include execnodes.h in replication/conflict.h ... which silently propagates a lot of headers into many places via pgstat.h, as evidenced by the variety of headers that this patch needs to add to seemingly random places. Add a minimum of typedefs to conflict.h to be able to remove execnodes.h, and fix the fallout. Backpatch to 18, where conflict.h first appeared. Discussion: https://postgr.es/m/202509191927.uj2ijwmho7nv@alvherre.pgsql	2025-09-25 14:52:41 +02:00
Álvaro Herrera	81fc3e28e3	Update some more forward declarations to use typedef As commit `d4d1fc527b`. Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/202509191025.22agk3fvpilc@alvherre.pgsql	2025-09-25 14:33:19 +02:00
Fujii Masao	668de04309	pgbench: Fix typo in documentation. This commit fixes a typo introduced in commit `b6290ea48e`. Reported off-list by Erik Rijkers <er@xs4all.nl>	2025-09-25 14:06:12 +09:00
Fujii Masao	b6290ea48e	pgbench: Clarify documentation for \gset and \aset. This commit updates the pgbench documentation to list \gset and \aset as separate terms for easier reading. It also clarifies that \gset raises an error if the query returns zero or multiple rows, and explains how to detect cases where the query with \aset returned no rows. Author: Yugo Nagata <nagata@sraoss.co.jp> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/20250626180125.5b896902a3d0bcd93f86c240@sraoss.co.jp	2025-09-25 12:09:32 +09:00
Fujii Masao	879c492480	vacuumdb: Do not run VACUUM (ONLY_DATABASE_STATS) when --analyze-only. Previously, vacuumdb --analyze-only issued VACUUM (ONLY_DATABASE_STATS) at the end. Since --analyze-only is meant to update optimizer statistics only, this extra VACUUM command is unnecessary. This commit prevents vacuumdb --analyze-only from running that redundant VACUUM command. Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Mircea Cadariu <cadariu.mircea@gmail.com> Discussion: https://postgr.es/m/CAHGQGwEqHGa-k=wbRMucUVihHVXk4NQkK94GNN=ym9cQ5HBSHg@mail.gmail.com	2025-09-25 01:38:54 +09:00
Melanie Plageman	ae8ea7278c	Correct prune WAL record opcode name in comment `f83d709760` incorrectly refers to a XLOG_HEAP2_PRUNE_FREEZE WAL record opcode. No such code exists. The relevant opcodes are XLOG_HEAP2_PRUNE_ON_ACCESS, XLOG_HEAP2_PRUNE_VACUUM_SCAN, and XLOG_HEAP2_PRUNE_VACUUM_CLEANUP. Correct it. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/yn4zp35kkdsjx6wf47zcfmxgexxt4h2og47pvnw2x5ifyrs3qc%407uw6jyyxuyf7	2025-09-24 12:29:56 -04:00
Tom Lane	aadbcc40bc	Ensure guc_tables.o's dependency on guc_tables.inc.c is known. Without this, rebuilds can malfunction unless --enable-depend is used. Historically we've expected that you can get away without --enable-depend as long as you manually clean after changing *.h files; the makefiles are supposed to handle other sorts of dependencies. So add this one. Follow-on to `635998965`, so no need for back-patch. Discussion: https://postgr.es/m/3121329.1758650878@sss.pgh.pa.us	2025-09-24 12:28:20 -04:00
Tom Lane	7ccbf6d8b5	Include pg_test_timing's full output in the TAP test log. We were already doing a short (1-second) pg_test_timing run during check-world and buildfarm runs. But we weren't doing anything with the result except for a basic regex-based sanity check. Collecting that output from buildfarm runs is seeming very attractive though, because it would help us determine what sort of timing resolution is available on supported platforms. It's not very long, so let's just note it verbatim in the TAP log. Discussion: https://postgr.es/m/3321785.1758728271@sss.pgh.pa.us	2025-09-24 12:09:11 -04:00
Fujii Masao	7fcb32ad02	Fix incorrect and inconsistent comments in tableam.h and heapam.c. This commit corrects several issues in function comments: * The parameter "rel" was incorrectly referred to as "relation" in the comments for table_tuple_delete(), table_tuple_update(), and table_tuple_lock(). * In table_tuple_delete(), "changingPart" was listed as an output parameter in the comments but is actually input. * In table_tuple_update(), "slot" was listed as an input parameter in the comments but is actually output. * The comment for "update_indexes" in table_tuple_update() was mis-indented. * The comments for heap_lock_tuple() incorrectly referenced a non-existent "tid" parameter. Author: Chao Li <lic@highgo.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/CAEoWx2nB6Ay8g=KEn7L3qbYX_4+sLk9XOMkV0XZqHR4cTY8ZvQ@mail.gmail.com	2025-09-25 00:51:59 +09:00
Peter Eisentraut	a5b35fcedb	Remove PointerIsValid() This doesn't provide any value over the standard style of checking the pointer directly or comparing against NULL. Also remove related: - AllocPointerIsValid() [unused] - IndexScanIsValid() [had one user] - HeapScanIsValid() [unused] - InvalidRelation [unused] Leaving HeapTupleIsValid(), ItemIdIsValid(), PortalIsValid(), RelationIsValid for now, to reduce code churn. Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com> Discussion: https://www.postgresql.org/message-id/flat/ad50ab6b-6f74-4603-b099-1cd6382fb13d%40eisentraut.org Discussion: https://www.postgresql.org/message-id/CA+hUKG+NFKnr=K4oybwDvT35dW=VAjAAfiuLxp+5JeZSOV3nBg@mail.gmail.com Discussion: https://www.postgresql.org/message-id/bccf2803-5252-47c2-9ff0-340502d5bd1c@iki.fi	2025-09-24 15:17:20 +02:00
Daniel Gustafsson	0fba25eb72	Fix incorrect option name in usage screen The usage screen incorrectly referred to the --docs option as --sgml. Backpatch down to v17 where this script was introduced. Author: Daniel Gustafsson <daniel@yesql.se> Discussion: https://postgr.es/m/20250729.135638.1148639539103758555.horikyota.ntt@gmail.com Backpatch-through: 17	2025-09-24 14:58:18 +02:00
Daniel Gustafsson	711ccce38f	Consistently handle tab delimiters for wait event names Format validation and element extraction for intermediate line strings were inconsistent in their handling of tab delimiters, which resulted in an unclear error when multiple tab characters were used as a delimiter. This fixes it by using captures from the validation regex instead of a separate split() to avoid the inconsistency. Also, it ensures that \t+ is used consistently when inspecting the strings. Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://postgr.es/m/20250729.135638.1148639539103758555.horikyota.ntt@gmail.com	2025-09-24 14:57:26 +02:00
John Naylor	5334620eef	Update GB18030 encoding from version 2000 to 2022 Mappings for 18 characters have changed, affecting 36 code points. This is a break in compatibility, but these characters are rarely used. U+E5E5 (Private Use Area) was previously mapped to \xA3A0. This code point now maps to \x65356535. Attempting to convert \xA3A0 will now raise an error. Separate from the 2022 update, the following mappings were previously swapped, and subsequently corrected in 2000 and later versions: * U+E7C7 (Private Use Area) now maps to \x8135F437 * U+1E3F (Latin Small Letter M with Acute) now maps to \xA8BC The 2022 standard mentions the following policy changes, but they have no effect in our implementation: 66 new ideographs are now required, but these are mapped algorithmically so were already handled by utf8_and_gb18030.c. Nine CJK compatibility ideographs are no longer required, but implementations may retain them, as does the source we use from the Unicode Consortium. Release notes: Compatibility section For further details, see: https://www.unicode.org/L2/L2022/22274-disruptive-changes.pdf https://ken-lunde.medium.com/the-gb-18030-2022-standard-3d0ebaeb4132 Author: Chao Li <lic@highgo.com> Author: Zheng Tao <taoz@highgo.com> Discussion: https://postgr.es/m/966d9fc.169.198741fe60b.Coremail.jiaoshuntian%40highgo.com	2025-09-24 13:26:05 +07:00
Amit Kapila	e41d954da6	Fix LOCK_TIMEOUT handling during parallel apply. Previously, the parallel apply worker used SIGINT to receive a graceful shutdown signal from the leader apply worker. However, SIGINT is also used by the LOCK_TIMEOUT handler to trigger a query-cancel interrupt. This overlap caused the parallel apply worker to miss LOCK_TIMEOUT signals, leading to incorrect behavior during lock wait/contention. This patch resolves the conflict by switching the graceful shutdown signal from SIGINT to SIGUSR2. Reported-by: Zane Duffield <duffieldzane@gmail.com> Diagnosed-by: Zhijie Hou <houzj.fnst@fujitsu.com> Author: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Backpatch-through: 16, where it was introduced Discussion: https://postgr.es/m/CACMiCkXyC4au74kvE2g6Y=mCEF8X6r-Ne_ty4r7qWkUjRE4+oQ@mail.gmail.com	2025-09-24 04:11:53 +00:00
Michael Paquier	f83fe65f3f	Fix compiler warnings in test_bitmapset The macros doing conversions of/from "text" from/to Bitmapset were using arbitrary casts with Datum, something that is not fine since `2a600a93c7`. These macros do not actually need casts with Datum, as they are already given "text" and Bitmapset data in input. They are updated to use cstring_to_text() and text_to_cstring(), fixing the compiler warnings reported by the buildfarm. Note that appending a -m32 to gcc to trigger 32-bit builds was enough to reproduce the warnings here. While on it, outer parentheses are added to TEXT_TO_BITMAPSET(), and inner parentheses are removed from BITMAPSET_TO_TEXT(), to make these macros more consistent with the style used in the tree, based on suggestions by Tom Lane. Oversights in commit `00c3d87a5c`. Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Author: Greg Burd <greg@burd.me> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/3027069.1758606227@sss.pgh.pa.us	2025-09-24 08:20:23 +09:00
Robert Haas	f2bae51dfd	Keep track of what RTIs a Result node is scanning. Result nodes now include an RTI set, which is only non-NULL when they have no subplan, and is taken from the relid set of the RelOptInfo that the Result is generating. ExplainPreScanNode now takes notice of these RTIs, which means that a few things get schema-qualified in the regression tests that previously did not. This makes the output more consistent between cases where some part of the plan tree is replaced by a Result node and those where this does not happen. Likewise, pg_overexplain's EXPLAIN (RANGE_TABLE) now displays the RTIs stored in a Result node just as it already does for other RTI-bearing node types. Result nodes also now include a result_reason, which tells us something about why the Result node was inserted. Using that information, EXPLAIN now emits, where relevant, a "Replaces" line describing the origin of a Result node. The purpose of these changes is to allow code that inspects a Plan tree to understand the origin of Result nodes that appear therein. Discussion: http://postgr.es/m/CA+TgmoYeUZePZWLsSO+1FAN7UPePT_RMEZBKkqYBJVCF1s60=w@mail.gmail.com Reviewed-by: Alexandra Wang <alexandra.wang.oss@gmail.com> Reviewed-by: Richard Guo <guofenglinux@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Junwang Zhao <zhjwpku@gmail.com>	2025-09-23 09:07:55 -04:00
Daniel Gustafsson	a48d1ef586	doc: Remove trailing whitespace in xref Remove stray whitespace in xref tag. This was found due to a regression in xmllint 2.15.0 which flagged this as an error, and at the time of this commit no fix for xmllint has shipped. Author: Erik Wienhold <ewie@ewie.name> Discussion: https://postgr.es/m/f4c4661b-4e60-4c10-9336-768b7b55c084@ewie.name Backpatch-through: 17	2025-09-22 10:12:31 +02:00
Michael Paquier	00c3d87a5c	Add a test module for Bitmapset Bitmapset has a complex set of APIs, defined in bitmapset.h, and it can be hard to test edge cases with the backend core code only. This test module is aimed at closing the gap, and implements a set of SQL functions that act as wrappers of the low-level C functions of the same names. These functions rely on text as data type for the input and the output as Bitmapset as a node has support for these. An extra function, named test_random_operations(), can be used to stress bitmaps with random member values and a defined number of operations potentially useful for other purposes than only tests. The coverage increases from 85.2% to 93.4%. It should be possible to cover more code paths, but at least it's a beginning. Author: Greg Burd <greg@burd.me> Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/7BD1ABDB-B03A-464A-9BA9-A73B55AD8A1F@getmailspring.com	2025-09-22 16:53:00 +09:00
David Rowley	9fc7f6ab72	Fix various incorrect filename references Author: Chao Li <li.evan.chao@gmail.com> Author: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/CAEoWx2=hOBCPm-Z=F15twr_23XjHeoXSbifP5GdEdtWona97wQ@mail.gmail.com	2025-09-22 13:33:17 +12:00
Richard Guo	e3a0304eba	Fix misleading comment in RangeTblEntry The comment describing join_using_alias incorrectly referred to the alias field as being defined "below", when it actually appears earlier in the RangeTblEntry struct. This patch fixes that. Author: Steve Lau <stevelauc@outlook.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/TYWPR01MB10612B020C33FD08F729415CEB613A@TYWPR01MB10612.jpnprd01.prod.outlook.com	2025-09-22 10:04:39 +09:00
Michael Paquier	293a3286d7	Fix meson build with -Duuid=ossp when using version older than 0.60 The package for the UUID library may be named "uuid" or "ossp-uuid", and meson.build has been using a single call of dependency() with multiple names, something only supported since meson 0.60.0. The minimum version of meson supported by Postgres is 0.57.2 on HEAD, since `f039c22441`, and 0.54 on stable branches down to 16. Author: Oreo Yang <oreo.yang@hotmail.com> Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com> Discussion: https://postgr.es/m/OS3P301MB01656E6F91539770682B1E77E711A@OS3P301MB0165.JPNP301.PROD.OUTLOOK.COM Backpatch-through: 16	2025-09-22 08:03:23 +09:00
Daniel Gustafsson	e1d917182c	Add support for base64url encoding and decoding This adds support for base64url encoding and decoding, a base64 variant which is safe to use in filenames and URLs. base64url replaces '+' in the base64 alphabet with '-' and '/' with '_', thus making it safe for URL addresses and file systems. Support for base64url was originally suggested by Przemysław Sztoch. Author: Florents Tselai <florents.tselai@gmail.com> Reviewed-by: Aleksander Alekseev <aleksander@timescale.com> Reviewed-by: David E. Wheeler <david@justatheory.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Chao Li (Evan) <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/70f2b6a8-486a-4fdb-a951-84cef35e22ab@sztoch.pl	2025-09-20 23:19:32 +02:00
Tom Lane	261f89a976	Track the maximum possible frequency of non-MCE array elements. The lossy-counting algorithm that ANALYZE uses to identify most-common array elements has a notion of cutoff frequency: elements with frequency greater than that are guaranteed to be collected, elements with smaller frequencies are not. In cases where we find fewer MCEs than the stats target would permit us to store, the cutoff frequency provides valuable additional information, to wit that there are no non-MCEs with frequency greater than that. What the selectivity estimation functions actually use the "minfreq" entry for is as a ceiling on the possible frequency of non-MCEs, so using the cutoff rather than the lowest stored MCE frequency provides a tighter bound and more accurate estimates. Therefore, instead of redundantly storing the minimum observed MCE frequency, store the cutoff frequency when there are fewer tracked values than we want. (When there are more, then of course we cannot assert that no non-stored elements are above the cutoff frequency, since we're throwing away some that are; so we still use the minimum stored frequency in that case.) Notably, this works even when none of the values are common enough to be called MCEs. In such cases we previously stored nothing in the STATISTIC_KIND_MCELEM pg_statistic slot, which resulted in the selectivity functions falling back to default estimates. So in that case we want to construct a STATISTIC_KIND_MCELEM entry that contains no "values" but does have "numbers", to wit the three extra numbers that the MCELEM entry type defines. A small obstacle is that update_attstats() has traditionally stored a null, not an empty array, when passed zero "values" for a slot. That gives rise to an MCELEM entry that get_attstatsslot() will spit up on. 
The least risky solution seems to be to adjust update_attstats() so that it will emit a non-null (but possibly empty) array when the passed stavalues array pointer isn't NULL, rather than conditioning that on numvalues > 0. In other existing cases I don't believe that that changes anything. For consistency, handle the stanumbers array the same way. In passing, improve the comments in routines that use STATISTIC_KIND_MCELEM data. Particularly, explain why we use minfreq / 2 not minfreq as the estimate for non-MCE values. Thanks to Matt Long for the suggestion that we could apply this idea even when there are more than zero MCEs. Reported-by: Mark Frost <FROSTMAR@uk.ibm.com> Reported-by: Matt Long <matt@mattlong.org> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/PH3PPF1C905D6E6F24A5C1A1A1D8345B593E16FA@PH3PPF1C905D6E6.namprd15.prod.outlook.com	2025-09-20 14:48:16 -04:00
Tom Lane	1eccb93150	Re-allow using statistics for bool-valued functions in WHERE. Commit `a391ff3c3`, which added the ability for a function's support function to provide a custom selectivity estimate for "WHERE f(...)", unintentionally removed the possibility of applying expression statistics after finding there's no applicable support function. That happened because we no longer fell through to boolvarsel() as before. Refactor to do so again, putting the 0.3333333 default back into boolvarsel() where it had been (cf. commit `39df0f150`). I surely wouldn't have made this error if `39df0f150` had included a test case, so add one now. At the time we did not have the "extended statistics" infrastructure, but we do now, and it is also unable to work in this scenario because of this error. So make use of that for the test case. This is very clearly a bug fix, but I'm afraid to put it into released branches because of the likelihood of altering plan choices, which we avoid doing in minor releases. So, master only. Reported-by: Frédéric Yhuel <frederic.yhuel@dalibo.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/a8b99dce-1bfb-4d97-af73-54a32b85c916@dalibo.com	2025-09-20 12:44:52 -04:00
Nathan Bossart	18cdf5932a	Fix obsolete references to postgres.h in comments. Oversights in commits `d08741eab5` and `d952373a98`. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/aMxbfSJ2wLWd32x-%40nathan	2025-09-19 09:19:03 -05:00
David Rowley	ac7c8e412c	Improve wording in a few comments Initially this was to fix the "catched" typo, but I (David) wasn't quite clear on what the previous comment meant about being "effective". I expect this means efficiency, so I've reworded the comment to indicate that. While this is only a comment fixup, for the sake of minimizing possible future backpatching pain, I've opted to backpatch to 18 since this code is new to that version and the release isn't out the door yet. Author: Tender Wang <tndrwang@gmail.com> Discussion: https://postgr.es/m/CAHewXNmSYWPud1sfBvpKbCJeRkWeZYuqatxtV9U9LvAFXBEiBw@mail.gmail.com Backpatch-through: 18	2025-09-19 23:35:23 +12:00
Amit Kapila	5b148706c5	Add optional pid parameter to pg_replication_origin_session_setup(). Commit `216a784829` introduced parallel apply workers, allowing multiple processes to share a replication origin. To support this, replorigin_session_setup() was extended to accept a pid argument identifying the process using the origin. This commit exposes that capability through the SQL interface function pg_replication_origin_session_setup() by adding an optional pid parameter. This enables multiple processes to coordinate replication using the same origin when using SQL-level replication functions. This change allows the non-builtin logical replication solutions to implement parallel apply for large transactions. Additionally, an existing internal error was made user-facing, as it can now be triggered via the exposed SQL API. Author: Doruk Yilmaz <doruk@mixrank.com> Author: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Euler Taveira <euler@eulerto.com> Discussion: https://postgr.es/m/CAMPB6wfe4zLjJL8jiZV5kjjpwBM2=rTRme0UCL7Ra4L8MTVdOg@mail.gmail.com Discussion: https://postgr.es/m/CAE2gYzyTSNvHY1+iWUwykaLETSuAZsCWyryokjP6rG46ZvRgQA@mail.gmail.com	2025-09-19 05:38:40 +00:00
Amit Kapila	8aac5923a3	Improve a few errdetail messages introduced in commit `0d48d393d4`. Based on suggestions by Tom Lane. Reported-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Author: Zhijie Hou <houzj.fnst@fujitsu.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/20250916.114644.275726106301941878.horikyota.ntt@gmail.com	2025-09-19 04:52:59 +00:00
Michael Paquier	deb208df45	Make XLogFlush() and XLogNeedsFlush() decision-making more consistent When deciding which code path to use depending on the state of recovery, XLogFlush() and XLogNeedsFlush() have been relying on different criteria: - XLogFlush() relied on XLogInsertAllowed(). - XLogNeedsFlush() relied on RecoveryInProgress(). Currently, the checkpointer is allowed to insert WAL records while RecoveryInProgress() returns true for an end-of-recovery checkpoint, where XLogInsertAllowed() matters. Using RecoveryInProgress() in XLogNeedsFlush() did not really matter for its existing callers, as the checkpointer only called XLogFlush(). However, a feature under discussion, by Melanie Plageman, needs XLogNeedsFlush() to be able to work in more contexts, the end-of-recovery checkpoint being one. This commit changes XLogNeedsFlush() to use XLogInsertAllowed() instead of RecoveryInProgress(), making the checks in both routines more consistent. While on it, an assertion based on XLogNeedsFlush() is added at the end of XLogFlush(), triggered when flushing a physical position (not for the normal recovery path that checks for updates of the minimum recovery point). This assertion would fail for example in the recovery test 015_promotion_pages if XLogNeedsFlush() is changed to use RecoveryInProgress(). This should hopefully be enough to ensure that the checks done in both routines remain consistent. Author: Melanie Plageman <melanieplageman@gmail.com> Co-authored-by: Dilip Kumar <dilipbalaut@gmail.com> Reviewed-by: Jeff Davis <pgsql@j-davis.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CAAKRu_a1vZRZRWO3_jv_X13RYoqLRVipGO0237g5PKzPa2YX6g@mail.gmail.com	2025-09-19 13:47:28 +09:00
Amit Langote	8741e48e5d	Fix EPQ crash from missing partition pruning state in EState Commit `bb3ec16e14` moved partition pruning metadata into PlannedStmt. At executor startup this metadata is used to initialize the EState fields es_part_prune_infos, es_part_prune_states, and es_part_prune_results. EvalPlanQualStart() failed to copy those fields into the child EState, causing a NULL dereference when Append ran partition pruning during a recheck. This can occur with DELETE or UPDATE on partitioned tables that use runtime pruning, e.g. with generic plans. Fix by copying all partition pruning state into the EPQ estate. Add an isolation test that reproduces the crash with concurrent UPDATE and DELETE on a partitioned table, where the DELETE session hits the crash during its EPQ recheck after the UPDATE commits. Bug: #19056 Reported-by: Fei Changhong <feichanghong@qq.com> Diagnosed-by: Fei Changhong <feichanghong@qq.com> Author: David Rowley <dgrowleyml@gmail.com> Co-authored-by: Amit Langote <amitlangote09@gmail.com> Discussion: https://postgr.es/m/19056-a677cef9b54d76a0%40postgresql.org	2025-09-19 11:38:29 +09:00
Michael Paquier	3cd3a039da	Document and check that PgStat_HashKey has no padding This change is a tighter rework of `7d85d87f4d`, which tried to improve the code so as it would work should PgStat_HashKey gain new fields that create padding bytes. However, the previous change is proving to not be enough as some code paths of pgstats do not pass PgStat_HashKey by reference (valgrind would warn when padding is added to the structure, through a new field). Per discussion, let's document and check that PgStat_HashKey has no padding rather than try to complicate the code of pgstats so as it is able to work around that. This removes a couple of memset(0) calls that should not be required. While on it, this commit adds a static assertion checking that no padding is introduced in the structure, by checking that the size of PgStat_HashKey matches with the sum of the size of all its fields. The object ID part of the hash key is already 8 bytes, which should be plenty enough already. A comment is added to discourage the addition of new fields. Author: Michael Paquier <michael@paquier.xyz> Reviewed-by: Sami Imseih <samimseih@gmail.com> Discussion: https://postgr.es/m/CAA5RZ0t9omat+HVSakJXwTMWvhpYFcAZb41RPWKwrKFUgmAFBQ@mail.gmail.com	2025-09-19 09:54:05 +09:00
Nathan Bossart	16607718c0	Add a test harness for the LWLock tranche code. This code is heavily used and already has decent test coverage, but it lacks a dedicated test suite. This commit changes that. Author: Sami Imseih <samimseih@gmail.com> Co-authored-by: Nathan Bossart <nathandbossart@gmail.com> Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://postgr.es/m/CAA5RZ0tQ%2BEYSTOd2hQ8RXdsNfGBLAtOe-YmnsTE6ZVg0E-4qew%40mail.gmail.com Discussion: https://postgr.es/m/CAA5RZ0vpr0P2rbA%3D_K0_SCHM7bmfVX4wEO9FAyopN1eWCYORhA%40mail.gmail.com	2025-09-18 15:23:11 -05:00
Nathan Bossart	c3cc2ab87d	Fix re-initialization of LWLock-related shared memory. When shared memory is re-initialized after a crash, the named LWLock tranche request array that was copied to shared memory will no longer be accessible. To fix, save the pointer to the original array in postmaster's local memory, and switch to it when re-initializing the LWLock-related shared memory. Oversight in commit `ed1aad15e0`. Per buildfarm member batta. Reported-by: Michael Paquier <michael@paquier.xyz> Reported-by: Alexander Lakhin <exclusion@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/aMoejB3iTWy1SxfF%40paquier.xyz Discussion: https://postgr.es/m/f8ca018f-3479-49f6-a92c-e31db9f849d7%40gmail.com	2025-09-18 09:55:39 -05:00
Fujii Masao	2e66cae935	pgbench: Remove unused argument from create_sql_command(). Author: Yugo Nagata <nagata@sraoss.co.jp> Reviewed-by: Steven Niu <niushiji@gmail.com> Discussion: https://postgr.es/m/20250917112814.096f660ea4c3c64630475e62@sraoss.co.jp	2025-09-18 11:22:21 +09:00
Fujii Masao	45f50c995f	pg_restore: Fix security label handling with --no-publications/subscriptions. Previously, pg_restore did not skip security labels on publications or subscriptions even when --no-publications or --no-subscriptions was specified. As a result, it could issue SECURITY LABEL commands for objects that were never created, causing those commands to fail. This commit fixes the issue by ensuring that security labels on publications and subscriptions are also skipped when the corresponding options are used. Backpatch to all supported versions. Author: Jian He <jian.universality@gmail.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/CACJufxHCt00pR9h51AVu6+yPD5J7JQn=7dQXxqacj0XyDhc-fA@mail.gmail.com Backpatch-through: 13	2025-09-18 11:09:15 +09:00
Andres Freund	0110e2ec5c	Mark shared buffer lookup table HASH_FIXED_SIZE StrategyInitialize() calls InitBufTable() with the maximum number of entries that the buffer lookup table can ever have. Thus there should not be any need to allocate more elements after initialization. Hence mark the hash table as fixed-size. Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Discussion: https://postgr.es/m/CAExHW5v0jh3F_wj86yC=qBfWk0uiT94qy=Z41uzAHLHh0SerRA@mail.gmail.com	2025-09-17 20:28:43 -04:00
Tom Lane	b0cc0a71e0	Calculate agglevelsup correctly when Aggref contains a CTE. If an aggregate function call contains a sub-select that has an RTE referencing a CTE outside the aggregate, we must treat that reference like a Var referencing the CTE's query level for purposes of determining the aggregate's level. Otherwise we might reach the nonsensical conclusion that the aggregate should be evaluated at some query level higher than the CTE, ending in a planner error or a broken plan tree that causes executor failures. Bug: #19055 Reported-by: BugForge <dllggyx@outlook.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/19055-6970cfa8556a394d@postgresql.org Backpatch-through: 13	2025-09-17 16:32:57 -04:00
Thomas Munro	0951942bba	jit: Fix type used for Datum values in LLVM IR. Commit `2a600a93` made Datum 8 bytes wide everywhere. It was no longer appropriate to use TypeSizeT on 32 bit systems, and JIT compilation would fail with various type check errors. Introduce a separate LLVMTypeRef with the name TypeDatum. TypeSizeT is still used in some places for actual size_t values. Reported-by: Dmitry Mityugov <d.mityugov@postgrespro.ru> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Tested-by: Dmitry Mityugov <d.mityugov@postgrespro.ru> Discussion: https://postgr.es/m/0a9f0be59171c2e8f1b3bc10f4fcf267%40postgrespro.ru	2025-09-17 13:38:35 +12:00
Michael Paquier	39f67d9b55	injection_points: Fix incrementation of variable-numbered stats The pending entry was not used when incrementing its data, directly manipulating the shared memory pointer, without even locking it. This could mean losing statistics under concurrent activity. The flush callback was a no-op. This code serves as a base template for extensions for the custom cumulative statistics, so let's be clean and use a pending entry for the incrementations, whose data is then flushed to the corresponding entry in the shared hashtable when all the stats are reported, in its own flush callback. Author: Sami Imseih <samimseih@gmail.com> Discussion: https://postgr.es/m/CAA5RZ0v0U0yhPbY+bqChomkPbyUrRQ3rQXnZf_SB-svDiQOpgQ@mail.gmail.com Backpatch-through: 18	2025-09-17 10:15:13 +09:00
Michael Paquier	158c48303e	Fix shared memory calculation size of PgAioCtl The shared memory size was calculated based on an offset of io_handles, which is itself a pointer included in the structure. We tend to overestimate the shared memory size overall, so this was unlikely an issue in practice, but let's be correct and use the full size of the structure in the calculation, so as the pointer for io_handles is included. Oversight in `da7226993f`. Author: Madhukar Prasad <madhukarprasad@google.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com> Discussion: https://postgr.es/m/CAKi+wrbC2dTzh_vKJoAZXV5wqTbhY0n4wRNpCjJ=e36aoo0kFw@mail.gmail.com Backpatch-through: 18	2025-09-17 09:33:32 +09:00
David Rowley	ac06ea8f7b	Add missing EPQ recheck for TID Range Scan The EvalPlanQual recheck for TID Range Scan wasn't rechecking that the TID qual still passed after following update chains. This could result in tuples being updated or deleted by plans using TID Range Scans where the ctid of the new (updated) tuple no longer matches the clause of the scan. This isn't desired behavior, and isn't consistent with what would happen if the chosen plan had used an Index or Seq Scan, and that could lead to hard-to-predict behavior for scans that contain TID quals and other quals as the planner has freedom to choose TID Range or some other non-TID scan method for such queries, and the chosen plan could change at any moment. Here we fix this by properly implementing the recheck function for TID Range Scans. Backpatch to 14, where TID Range Scans were added. Reported-by: Sophie Alpert <pg@sophiebits.com> Author: Sophie Alpert <pg@sophiebits.com> Author: David Rowley <dgrowleyml@gmail.com> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/4a6268ff-3340-453a-9bf5-c98d51a6f729@app.fastmail.com Backpatch-through: 14	2025-09-17 12:19:15 +12:00
David Rowley	dee21ea6d6	Add missing EPQ recheck for TID Scan The EvalPlanQual recheck for TID Scan wasn't rechecking that the TID qual still passed after following update chains. This could result in tuples being updated or deleted by plans using TID Scans where the ctid of the new (updated) tuple no longer matches the clause of the scan. This isn't desired behavior, and isn't consistent with what would happen if the chosen plan had used an Index or Seq Scan, and that could lead to hard-to-predict behavior for scans that contain TID quals and other quals as the planner has freedom to choose TID or some other scan method for such queries, and the chosen plan could change at any moment. Here we fix this by properly implementing the recheck function for TID Scans. Backpatch to 13, the oldest supported version. Reported-by: Sophie Alpert <pg@sophiebits.com> Author: Sophie Alpert <pg@sophiebits.com> Author: David Rowley <dgrowleyml@gmail.com> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/4a6268ff-3340-453a-9bf5-c98d51a6f729@app.fastmail.com Backpatch-through: 13	2025-09-17 11:48:55 +12:00
Tom Lane	e633fa6351	Add regression expected-files for older OpenSSL in FIPS mode. Cover contrib/pgcrypto, per buildfarm. Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/443709.1757876535@sss.pgh.pa.us Backpatch-through: 17	2025-09-16 14:36:51 -04:00
Tom Lane	8abbbbae61	Revert "Avoid race condition between "GRANT role" and "DROP ROLE"". This reverts commit `98fc31d649`. That change allowed DROP OWNED BY to drop grants of the target role to other roles, arguing that nobody would need those privileges anymore. But that's not so: if you're not superuser, you still need admin privilege on the target role so you can drop it. It's not clear whether or how the dependency-based approach to solving the original problem can be adapted to keep these grants. Since v18 release is fast approaching, the sanest thing to do seems to be to revert this patch for now. The race-condition problem is low severity and not worth taking risks for. I didn't force a catversion bump in `98fc31d64`, so I won't do so here either. Reported-by: Dipesh Dhameliya <dipeshdhameliya125@gmail.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CABgZEgczOFicCJoqtrH9gbYMe_BV3Hq8zzCBRcMgmU6LRsihUA@mail.gmail.com Backpatch-through: 18	2025-09-16 13:05:53 -04:00
Noah Misch	c044b50d19	Fix pg_dump COMMENT dependency for separate domain constraints. The COMMENT should depend on the separately-dumped constraint, not the domain. Sufficient restore parallelism might fail the COMMENT command by issuing it before the constraint exists. Back-patch to v13, like commit `0858f0f96e`. Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/20250913020233.fa.nmisch@google.com Backpatch-through: 13	2025-09-16 09:40:44 -07:00
Tom Lane	83a5641945	Provide more-specific error details/hints for function lookup failures. Up to now we've contented ourselves with a one-size-fits-all error hint when we fail to find any match to a function or procedure call. That was mostly okay in the beginning, but it was never great, and since the introduction of named arguments it's really not adequate. We at least ought to distinguish "function name doesn't exist" from "function name exists, but not with those argument names". And the rules for named-argument matching are arcane enough that some more detail seems warranted if we match the argument names but the call still doesn't work. This patch creates a framework for dealing with these problems: FuncnameGetCandidates and related code will now pass back a bitmask of flags showing how far the match succeeded. This allows a considerable amount of granularity in the reports. The set-bits-in-a-bitmask approach means that when there are multiple candidate functions, the report will reflect the match(es) that got the furthest, which seems correct. Also, we can avoid mentioning "maybe add casts" unless failure to match argument types is actually the issue. Extend the same return-a-bitmask approach to OpernameGetCandidates. The issues around argument names don't apply to operator syntax, but it still seems worth distinguishing between "there is no operator of that name" and "we couldn't match the argument types". While at it, adjust these messages and related ones to more strictly separate "detail" from "hint", following our message style guidelines' distinction between those. Reported-by: Dominique Devienne <ddevienne@gmail.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Discussion: https://postgr.es/m/1756041.1754616558@sss.pgh.pa.us	2025-09-16 12:17:02 -04:00
Tom Lane	c7b0cb367d	Add regression expected-files for older OpenSSL in FIPS mode. In our previous discussions around making our regression tests pass in FIPS mode, we concluded that we didn't need to support the different error message wording observed with pre-3.0 OpenSSL. However there are still a few LTS distributions soldiering along with such versions, and now we have some in the buildfarm. So let's add the variant expected-files needed to make them happy. This commit only covers the core regression tests. Previous discussion suggested that we might need some adjustments in contrib as well, but it's not totally clear to me what those would be. Rather than work it out from first principles, I'll wait to see what the buildfarm shows. Back-patch to v17 which is the oldest branch that claims to support this case. Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/443709.1757876535@sss.pgh.pa.us Backpatch-through: 17	2025-09-16 10:09:49 -04:00
Richard Guo	b63a822452	Treat JsonConstructorExpr as non-strict JsonConstructorExpr can produce non-NULL output with a NULL input, so it should be treated as a non-strict construct. Failing to do so can lead to incorrect query behavior. For example, in the reported case, when pulling up a subquery that is under an outer join, if the subquery's target list contains a JsonConstructorExpr that uses subquery variables and it is mistakenly treated as strict, it will be pulled up without being wrapped in a PlaceHolderVar. As a result, the expression will be evaluated at the wrong place and will not be forced to null when the outer join should do so. Back-patch to v16 where JsonConstructorExpr was introduced. Bug: #19046 Reported-by: Runyuan He <runyuan@berkeley.edu> Author: Tender Wang <tndrwang@gmail.com> Co-authored-by: Richard Guo <guofenglinux@gmail.com> Discussion: https://postgr.es/m/19046-765b6602b0a8cfdf@postgresql.org Backpatch-through: 16	2025-09-16 18:42:20 +09:00
John Naylor	cfa6cd2927	Generate GB18030 mappings from the Unicode Consortium's UCM file Previously we built the .map files for GB18030 (version 2000) from an XML file. The 2022 version for this encoding is only available as a Unicode Character Mapping (UCM) file, so as preparatory refactoring switch to this format as the source for building version 2000. As we do with most input files for the conversion mappings, download the file on demand. In order to generate the same mappings we have now, we must download from a previous upstream commit, rather than the head since the latter contains a correction not present in our current .map files. The XML file is still used by EUC_CN, so we cannot delete it from our repository. GB18030 is a superset of EUC_CN, so it may be possible to build EUC_CN from the same UCM file, but that is left for future work. Author: Chao Li <lic@highgo.com> Discussion: https://postgr.es/m/966d9fc.169.198741fe60b.Coremail.jiaoshuntian%40highgo.com	2025-09-16 16:29:08 +07:00
Peter Eisentraut	e56a601e06	Move pg_int64 back to postgres_ext.h Fix for commit `3c86223c99`. That commit moved the typedef of pg_int64 from postgres_ext.h to libpq-fe.h, because the only remaining place where it might be used is libpq users, and since the type is obsolete, the intent was to limit its scope. The problem is that if someone builds an extension against an older (pre-PG18) server version and a new (PG18) libpq, they might get two typedefs, depending on include file order. This is not allowed under C99, so they might get warnings or errors, depending on the compiler and options. The underlying types might also be different (e.g., long int vs. long long int), which would also lead to errors. This scenario is plausible when using the standard Debian packaging, which provides only the newest libpq but per-major-version server packages. The fix is to undo that part of commit `3c86223c99`. That way, the typedef is in the same header file across versions. At least, this is the safest fix doable before PostgreSQL 18 releases. Reviewed-by: Thomas Munro <thomas.munro@gmail.com> Discussion: https://www.postgresql.org/message-id/25144219-5142-4589-89f8-4e76948b32db%40eisentraut.org	2025-09-16 10:48:56 +02:00
Fujii Masao	8e5b92928d	pg_dump: Fix dumping of security labels on subscriptions and event triggers. Previously, pg_dump incorrectly queried pg_seclabel to retrieve security labels for subscriptions, which are stored in pg_shseclabel as they are global objects. This could result in security labels for subscriptions not being dumped. This commit fixes the issue by updating pg_dump to query the pg_seclabels view, which aggregates entries from both pg_seclabel and pg_shseclabel. While querying pg_shseclabel directly for subscriptions was an alternative, using pg_seclabels is simpler and sufficient. In addition, pg_dump is updated to dump security labels on event triggers, which were previously omitted. Backpatch to all supported versions. Author: Jian He <jian.universality@gmail.com> Co-authored-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/CACJufxHCt00pR9h51AVu6+yPD5J7JQn=7dQXxqacj0XyDhc-fA@mail.gmail.com Backpatch-through: 13	2025-09-16 16:44:58 +09:00
Amit Kapila	0f42206531	Fix intermittent BF failures in 035_conflicts. This commit addresses two sources of instability in the 035_conflicts test: (1) Unstable VACUUM usage: The test previously relied on VACUUM to remove a deleted column, which can be unreliable due to concurrent background writer or checkpoint activity that may lock the page containing the deleted tuple. Since the test already verifies that replication_slot.xmin has advanced sufficiently to confirm the feature's correctness, the VACUUM step is removed to improve test stability. (2) Timing-sensitive retention resumption check: The test includes a check to confirm that retention of conflict-relevant information resumes after setting max_retention_duration to 0. However, in some cases, the apply worker resumes retention immediately after the inactive slot is removed from synchronized_standby_slots, even before max_retention_duration is updated. This can happen if remote changes are applied in under 1ms, causing the test to time out while waiting for a later log position. To ensure consistent behavior, this commit delays the removal of synchronized_standby_slots until after max_retention_duration is set to 0. Author: Zhijie Hou <houzj.fnst@fujitsu.com> Reviewed-by: shveta malik <shveta.malik@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/TY4PR01MB16907805DE4816E53C54708159414A@TY4PR01MB16907.jpnprd01.prod.outlook.com	2025-09-16 06:25:55 +00:00
Peter Eisentraut	bce18ef3c6	Fix incorrect const qualifier Commit `7202d72787` added in passing some const qualifiers, but the one on the postmaster_child_launch() startup_data argument was incorrect, because the function itself modifies the pointed-to data. This is hidden from the compiler because of casts. The qualifiers on the functions called by postmaster_child_launch() are still correct.	2025-09-16 07:27:32 +02:00
Fujii Masao	66dabc06b1	pg_restore: Fix comment handling with --no-policies. Previously, pg_restore did not skip comments on policies even when --no-policies was specified. As a result, it could issue COMMENT commands for policies that were never created, causing those commands to fail. This commit fixes the issue by ensuring that comments on policies are also skipped when --no-policies is used. Backpatch to v18, where --no-policies was added in pg_restore. Author: Jian He <jian.universality@gmail.com> Co-authored-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/CACJufxHCt00pR9h51AVu6+yPD5J7JQn=7dQXxqacj0XyDhc-fA@mail.gmail.com Backpatch-through: 18	2025-09-16 11:54:23 +09:00
Fujii Masao	b54e8dbfe3	pg_restore: Fix comment handling with --no-publications / --no-subscriptions. Previously, pg_restore did not skip comments on publications or subscriptions even when --no-publications or --no-subscriptions was specified. As a result, it could issue COMMENT commands for objects that were never created, causing those commands to fail. This commit fixes the issue by ensuring that comments on publications and subscriptions are also skipped when the corresponding options are used. Backpatch to all supported versions. Author: Jian He <jian.universality@gmail.com> Co-authored-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/CACJufxHCt00pR9h51AVu6+yPD5J7JQn=7dQXxqacj0XyDhc-fA@mail.gmail.com Backpatch-through: 13	2025-09-16 10:35:12 +09:00
Peter Geoghegan	7d9cd2df5f	Teach nbtree to avoid evaluating row compare keys. Add logic to _bt_set_startikey that determines whether row compare keys are guaranteed to be satisfied by every tuple on a page that is about to be read by _bt_readpage. This works in essentially the same way as the existing scalar inequality logic. Testing has shown that the new logic improves performance to about the same degree as the existing scalar inequality logic (compared to the unoptimized case). In other words, the new logic makes many row compare scans significantly faster. Note that the new row compare inequality logic is only effective when the same individual row member is the deciding subkey for all tuples on the page (obviously, all tuples have to satisfy the row compare, too). This is what makes the new row compare logic very similar to the existing logic for scalar inequalities. Note, in particular, that this makes it safe to ignore whether all row compare members are against either ASC or DESC index attributes (i.e. it doesn't matter if individual subkeys don't all use the same inequality strategy). Also stop refusing to set pstate.startikey to an offset beyond any nonrequired key (don't add logic that'll do that for an individual row compare subkey, either). We can fully rely on our firstchangingattnum tests instead. This will do the right thing when a page has a group of tuples with NULLs in a lower-order attribute that makes the tuples fail to satisfy a row compare key -- we won't incorrectly conclude that all tuples must satisfy the row compare, just because firsttup and lasttup happen to. Our firstchangingattnum test prevents that from happening. (Note that the original "avoid evaluating nbtree scan keys" mechanism added by commit `e0b1ee17` couldn't support row compares due to issues with tuples that contain NULLs in a lower-order subkey's attribute. 
That original mechanism relied on requiredness markings, which the replacement _bt_set_startikey mechanism never really needed.) Follow up to commit `8a510275`, which added the _bt_set_startikey optimization. _bt_set_startikey is now feature complete; there's no remaining kind of nbtree scan key that it still doesn't support. Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CAH2-WznL6Z3H_GTQze9d8T_Ls=cYbnd-_9f-Jo7aYgTGRUD58g@mail.gmail.com	2025-09-15 16:56:49 -04:00
Peter Eisentraut	ce71993ae4	Expand virtual generated columns in constraint expressions Virtual generated columns in constraint expressions need to be expanded because the optimizer matches these expressions to qual clauses. Failing to do so can cause us to miss opportunities for constraint exclusion. Author: Richard Guo <guofenglinux@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/204804c0-798f-4c72-bd1f-36116024fda3%40eisentraut.org	2025-09-15 16:27:50 +02:00
Peter Eisentraut	9ec0b29976	CREATE STATISTICS: improve misleading error message The previous change (commit `f225473cba`) was still not on target, because it talked about relation kinds, which are not what is being checked here. Provide a more accurate message. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CACJufxEZ48toGH0Em_6vdsT57Y3L8pLF=DZCQ_gCii6=C3MeXw@mail.gmail.com	2025-09-15 11:43:34 +02:00
Peter Eisentraut	4bd9191298	Change fmgr.h typedefs to use original names fmgr.h defined some types such as fmNodePtr which is just Node *, but it made its own types to avoid having to include various header files. With C11, we can now instead typedef the original names without fear of conflicts. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/10d32190-f31b-40a5-b177-11db55597355@eisentraut.org	2025-09-15 11:04:10 +02:00
Peter Eisentraut	dc41d7415f	Remove hbaPort type This was just a workaround to avoid including the header file that defines the Port type. With C11, we can now just re-define the Port type without the possibility of a conflict. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/10d32190-f31b-40a5-b177-11db55597355@eisentraut.org	2025-09-15 11:04:10 +02:00
Peter Eisentraut	d4d1fc527b	Update various forward declarations to use typedef There are a number of forward declarations that use struct but not the customary typedef, because that could have led to repeat typedefs, which was not allowed. This is now allowed in C11, so we can update these to provide the typedefs as well, so that the later uses of the types look more consistent. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/10d32190-f31b-40a5-b177-11db55597355@eisentraut.org	2025-09-15 11:04:10 +02:00
Peter Eisentraut	70407d39b7	Improve ExplainState type handling in header files Now that we can have repeat typedefs with C11, we don't need to use "struct ExplainState" anymore but can instead make a typedef where necessary. This doesn't change anything but makes it look nicer. (There are more opportunities for similar changes, but this is broken out because there was a separate discussion about it, and it's somewhat bulky on its own.) Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/f36c0a45-98cd-40b2-a7cc-f2bf02b12890%40eisentraut.org#a12fb1a2c1089d6d03010f6268871b00 Discussion: https://www.postgresql.org/message-id/flat/10d32190-f31b-40a5-b177-11db55597355@eisentraut.org	2025-09-15 11:04:10 +02:00
Peter Eisentraut	1e3b5edb8e	Remove workarounds against repeat typedefs This is allowed in C11, so we don't need the workarounds anymore. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/10d32190-f31b-40a5-b177-11db55597355@eisentraut.org	2025-09-15 11:04:10 +02:00
Amit Kapila	0d48d393d4	Resume conflict-relevant data retention automatically. This commit resumes automatic retention of conflict-relevant data for a subscription. Previously, retention would stop if the apply process failed to advance its xmin (oldest_nonremovable_xid) within the configured max_retention_duration, and the user then had to manually re-enable the retain_dead_tuples option. With this change, retention will resume automatically once the apply worker catches up and begins advancing its xmin (oldest_nonremovable_xid) within the configured threshold. Author: Zhijie Hou <houzj.fnst@fujitsu.com> Reviewed-by: shveta malik <shveta.malik@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Discussion: https://postgr.es/m/OS0PR01MB5716BE80DAEB0EE2A6A5D1F5949D2@OS0PR01MB5716.jpnprd01.prod.outlook.com	2025-09-15 08:46:55 +00:00
Peter Eisentraut	282d0bdee6	jit: fix build with LLVM-21 LLVM-21 renamed llvm::GlobalValue::getGUID() to getGUIDAssumingExternalLinkage(), so add a version guard. Author: Holger Hoffstätte <holger@applied-asynchrony.com> Discussion: https://www.postgresql.org/message-id/flat/d25e6e4a-d1b4-84d3-2f8a-6c45b975f53d%40applied-asynchrony.com	2025-09-15 08:31:11 +02:00
Peter Eisentraut	748caa9dcb	Some stylistic improvements in toast_save_datum() Move some variables to a smaller scope. Initialize chunk_data before storing a pointer to it; this avoids compiler warnings on clang-21, or respectively us having to work around it by initializing it to zero before the variable is used (as was done in commit `e92677e863`). Discussion: https://www.postgresql.org/message-id/flat/6604ad6e-5934-43ac-8590-15113d6ae4b1%40eisentraut.org	2025-09-15 07:43:23 +02:00
Peter Eisentraut	bf5da5d6ca	Hide duplicate names from extension views If extensions of equal names were installed in different directories in the path, the views pg_available_extensions and pg_available_extension_versions would show all of them, even though only the first one was actually reachable by CREATE EXTENSION. To fix, have those views skip extensions found later in the path if they have names already found earlier. Also add a bit of documentation that only the first extension in the path can be used. Reported-by: Pierrick <pierrick.chovelon@dalibo.com> Discussion: https://www.postgresql.org/message-id/flat/8f5a0517-1cb8-4085-ae89-77e7454e27ba%40dalibo.com	2025-09-15 07:30:31 +02:00
Peter Geoghegan	454c046094	nbtree: Always set skipScan flag on rescan. The TimescaleDB extension expects to be able to change an nbtree scan's keys across rescans. The issue arises in the extension's implementation of loose index scan. This is arguably a misuse of the index AM API, though apparently it worked until recently. It stopped working when the skipScan flag was added to BTScanOpaqueData by commit `8a510275`, though. The flag wouldn't reliably track whether the scan (actually, the current rescan) has any skip arrays, leading to confusion in _bt_set_startikey. nbtree preprocessing will now defensively initialize the scan's skipScan flag in all cases, including the case where _bt_preprocess_array_keys returns early due to the (re)scan not using arrays. While nbtree isn't obligated to support this use case (at least not according to my reading of the index AM API), it still seems like a good idea to be consistent here, on general robustness grounds. Author: Peter Geoghegan <pg@bowt.ie> Reported-By: Natalya Aksman <natalya@timescale.com> Discussion: https://postgr.es/m/CAJumhcirfMojbk20+W0YimbNDkwdECvJprQGQ-XqK--ph09nQw@mail.gmail.com Backpatch-through: 18	2025-09-13 21:01:33 -04:00
Tom Lane	cdf7feb965	Amend recent fix for SIMILAR TO regex conversion. Commit `e3ffc3e91` fixed the translation of character classes in SIMILAR TO regular expressions. Unfortunately the fix broke a corner case: if there is an escape character right after the opening bracket (for example in "[\q]"), a closing bracket right after the escape sequence would not be seen as closing the character class. There were two more oversights: a backslash or a nested opening bracket right at the beginning of a character class should remove the special meaning from any following caret or closing bracket. This bug suggests that this code needs to be more readable, so also rename the variables "charclass_depth" and "charclass_start" to something more meaningful, rewrite an "if" cascade to be more consistent, and improve the commentary. Reported-by: Dominique Devienne <ddevienne@gmail.com> Reported-by: Stephan Springl <springl-psql@bfw-online.de> Author: Laurenz Albe <laurenz.albe@cybertec.at> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAFCRh-8NwJd0jq6P=R3qhHyqU7hw0BTor3W0SvUcii24et+zAw@mail.gmail.com Backpatch-through: 13	2025-09-13 16:55:51 -04:00
Nathan Bossart	95bdc67228	Add commit `7e9c216b52` to .git-blame-ignore-revs.	2025-09-13 14:55:38 -05:00
Nathan Bossart	7e9c216b52	Re-pgindent nbtpreprocesskeys.c after commit `796962922e`. Backpatch-through: 18	2025-09-13 14:50:02 -05:00
Alexander Korotkov	f6edf403a9	Specify locale provider for pg_regress --no-locale pg_regress has a --no-locale option that forces the temporary database to have C locale. However, currently, locale C only exists in the 'builtin' locale provider. This makes 'pg_regress --no-locale' fail when the default locale provider is not 'builtin'. This commit makes 'pg_regress --no-locale' specify both LOCALE='C' and LOCALE_PROVIDER='builtin'. Discussion: https://postgr.es/m/b54921f95e23b4391b1613e9053a3d58%40postgrespro.ru Author: Oleg Tselebrovskiy <o.tselebrovskiy@postgrespro.ru> Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>	2025-09-13 20:39:09 +03:00
Tom Lane	88824e6861	Avoid context dependency in test case added by `9a71989a8`. It's not quite clear to me why this didn't show up in my local check-world testing, but some of the buildfarm evidently runs this test with a different database name. Adjust the test so that that doesn't affect the reported error messages.	2025-09-12 18:45:06 -04:00
Tom Lane	9a71989a8f	Reject "ALTER DATABASE/USER ... RESET foo" with invalid GUC name. If the database or user had no entry in pg_db_role_setting, RESET silently did nothing --- including not checking the validity of the given GUC name. This is quite inconsistent and surprising, because you would get such an error if there were any pg_db_role_setting entry, even though it contains values for unrelated GUCs. While this is clearly a bug, changing it in stable branches seems unwise. The effect will be that some ALTER commands that formerly were no-ops will now be errors, and people don't like that sort of thing in minor releases. Author: Vitaly Davydov <v.davydov@postgrespro.ru> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/30783e-68c28a00-9-41004480@130449754	2025-09-12 18:10:11 -04:00
Tom Lane	f14ea34d6e	Fix oversights in pg_event_trigger_dropped_objects() fixes. Commit `a0b99fc12` caused pg_event_trigger_dropped_objects() to not fill the object_name field for schemas, which it should have; and caused it to fill the object_name field for default values, which it should not have. In addition, triggers and RLS policies really should behave the same way as we're making column defaults do; that is, they should have is_temporary = true if they belong to a temporary table. Fix those things, and upgrade event_trigger.sql's woefully inadequate test coverage of these secondary output columns. As before, back-patch only to v15. Reported-by: Sergey Shinderuk <s.shinderuk@postgrespro.ru> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/bd7b4651-1c26-4d30-832b-f942fabcb145@postgrespro.ru Backpatch-through: 15	2025-09-12 17:43:17 -04:00
Noah Misch	4adb0380b9	Replace tests of ALTER DATABASE RESET TABLESPACE. This unblocks rejection of that syntax. One copy was a misspelling of "SET TABLESPACE pg_default" that instead made no persistent changes. The other copy just needed to populate a DATABASEOID syscache entry. This slightly raises database.sql test coverage of catcache.c, while dbcommands.c coverage remains the same. Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/1802710.1757608564@sss.pgh.pa.us	2025-09-12 12:44:14 -07:00
Peter Geoghegan	796962922e	Always commute strategy when preprocessing DESC keys. A recently added nbtree preprocessing step failed to account for the fact that DESC columns already had their B-Tree strategy number commuted at this point in preprocessing. As a result, preprocessing could output a set of scan keys where one or more keys had the correct strategy number, but used the wrong comparison routine. To fix, make the faulty code path that looks up a more restrictive replacement operator/comparison routine commute its requested inequality strategy (while outputting the transformed strategy number as before). This makes the final transformed scan key comport with the approach preprocessing has always used to deal with DESC columns (which is described by comments above _bt_fix_scankey_strategy). Oversight in commit `b3f1a13f`, which made nbtree preprocessing perform transformations on skip array inequalities that can reduce the total number of index searches. Author: Peter Geoghegan <pg@bowt.ie> Reported-By: Natalya Aksman <natalya@timescale.com> Discussion: https://postgr.es/m/19049-b7df801e71de41b2@postgresql.org Backpatch-through: 18	2025-09-12 13:23:00 -04:00
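As a rough illustration of what "commuting" a strategy means in the commit above, the sketch below mirrors an inequality around equality, so that `<` becomes `>` and `<=` becomes `>=`. The enum values and the `commute_strategy` helper are hypothetical stand-ins, assumed to follow the usual B-Tree numbering (1 through 5); they are not the nbtree code itself.

```c
#include <assert.h>

/* Hypothetical strategy numbers, assumed to follow the usual B-Tree order. */
enum
{
    LESS = 1,
    LESS_EQUAL = 2,
    EQUAL = 3,
    GREATER_EQUAL = 4,
    GREATER = 5,
    MAX_STRATEGY = 5
};

/*
 * Mirror a strategy around EQUAL: LESS <-> GREATER,
 * LESS_EQUAL <-> GREATER_EQUAL, and EQUAL stays EQUAL.
 */
static int
commute_strategy(int strategy)
{
    assert(strategy >= LESS && strategy <= MAX_STRATEGY);
    return MAX_STRATEGY + 1 - strategy;
}
```

Commuting twice returns the original strategy, which is why code that forgets a key was already commuted for a DESC column can end up pairing a correct strategy number with the wrong comparison routine.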
Álvaro Herrera	7dcea51c2a	Avoid unexpected changes of CurrentResourceOwner and CurrentMemoryContext Users of logical decoding can encounter an unexpected change of CurrentResourceOwner and CurrentMemoryContext. The problem is that, unlike other call sites of RollbackAndReleaseCurrentSubTransaction(), in reorderbuffer.c we fail to restore the original values of these global variables after being clobbered by subtransaction abort. This patch saves the values prior to the call and restores them eventually. In addition, logical.c and logicalfuncs.c had a hack to restore the resource owner, presumably because of the lack of this restore. Remove that. Instead, because the test coverage here is not very consistent, add an Assert() to ensure that the resowner is kept identical; this would make it easy to detect other cases of bugs where we fail to restore resowner properly. This could be removed later. This is arguably an old bug, but there appears to be no reason to backpatch it and it's risky to do so, so refrain for now. Author: Antonin Houska <ah@cybertec.at> Reported-by: Mihail Nikalayeu <mihailnikalayeu@gmail.com> Reviewed-by: Euler Taveira <euler@eulerto.com> Discussion: https://postgr.es/m/119497.1756892972@localhost	2025-09-12 18:47:25 +02:00
Andres Freund	20d541a200	ci: openbsd: Increase RAM disk's size Its size was ~3.8 GB before, which sometimes was not enough. OpenBSD CI tasks were often failing with "no space left on device". Increase the RAM disk size to ~4.6 GB. Author: Nazir Bilal Yavuz <byavuz81@gmail.com> Discussion: https://postgr.es/m/CAN55FZ2XVVPJRJmGB2DsL3gOrOinWh=HWvj6GO1cHzJ=6LwTag@mail.gmail.com Backpatch-through: 18, where openbsd was added to CI	2025-09-12 10:18:31 -04:00
Peter Eisentraut	ae0e1be9f2	Allow redeclaration of typedef yyscan_t This is allowed in C11, so we don't need the workaround guards against it anymore. This effectively reverts commit `382092a0cd` that put these guards in place. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/10d32190-f31b-40a5-b177-11db55597355@eisentraut.org	2025-09-12 08:16:00 +02:00
Peter Eisentraut	675ddc4d70	Improve pgbench definition of yyscan_t It was defining yyscan_t as a macro while the rest of the code uses a typedef with #ifdef guards around it. The latter is also what the flex generated code uses. So it seems best to make it look like those other places for consistency. The old way also had a potential for conflict if some code included multiple headers providing yyscan_t. exprscan.l includes #include "fe_utils/psqlscan_int.h" #include "pgbench.h" and fe_utils/psqlscan_int.h contains #ifndef YY_TYPEDEF_YY_SCANNER_T #define YY_TYPEDEF_YY_SCANNER_T typedef void *yyscan_t; #endif which was then followed by pgbench.h #define yyscan_t void * and then the generated code in exprscan.c #ifndef YY_TYPEDEF_YY_SCANNER_T #define YY_TYPEDEF_YY_SCANNER_T typedef void* yyscan_t; #endif This works, but if the #ifdef guard in psqlscan_int.h is removed, this fails. We want to move toward allowing repeat typedefs, per C11, but for that we need to make sure they are all the same. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/10d32190-f31b-40a5-b177-11db55597355@eisentraut.org	2025-09-12 08:15:41 +02:00
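The C11 rule that this commit and the related typedef cleanups rely on can be shown in a few lines. This is a minimal sketch, not code from the tree: the two typedefs stand in for declarations coming from two different headers, and `scanner_is_unset` is a hypothetical helper added only so the type gets used.

```c
#include <assert.h>
#include <stddef.h>

/* First declaration, as one header might provide it. */
typedef void *yyscan_t;

/*
 * Second, identical declaration, as another header might provide it.
 * C11 allows a typedef name to be redeclared when it denotes the same
 * type; a C99 compiler in strict mode may reject or warn about this.
 */
typedef void *yyscan_t;

/* Hypothetical helper: reports whether a scanner handle is still NULL. */
static int
scanner_is_unset(yyscan_t scanner)
{
    return scanner == NULL;
}
```

The redeclaration is only valid because both typedefs name exactly the same type. A macro such as `#define yyscan_t void *` sitting between them would rewrite the second declaration before the compiler sees it, which is the kind of mismatch the commit removes.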
Peter Eisentraut	2aac62be8c	Default to log_lock_waits=on If someone is stuck behind a lock for more than a second, that is almost always a problem that is worth a log entry. Author: Laurenz Albe <laurenz.albe@cybertec.at> Reviewed-By: Michael Banck <mbanck@gmx.net> Reviewed-By: Robert Haas <robertmhaas@gmail.com> Reviewed-By: Christoph Berg <myon@debian.org> Reviewed-By: Stephen Frost <sfrost@snowman.net> Discussion: https://postgr.es/m/b8b8502915e50f44deb111bc0b43a99e2733e117.camel%40cybertec.at	2025-09-12 07:57:06 +02:00
Peter Eisentraut	25f36066dd	Remove traces of support for Sun Studio compiler Per discussion, this compiler suite is no longer maintained, and it has not been able to compile PostgreSQL since at least PostgreSQL 17. This removes all the remaining support code for this compiler. Note that the Solaris operating system continues to be supported, but using GCC as the compiler. Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/a0f817ee-fb86-483a-8a14-b6f7f5991b6e%40eisentraut.org	2025-09-12 07:39:05 +02:00
Peter Eisentraut	e92677e863	Silence compiler warnings on clang 21 Clang 21 shows some new compiler warnings, for example: warning: variable 'dstsize' is uninitialized when passed as a const pointer argument here [-Wuninitialized-const-pointer] The fix is to initialize the variables when they are defined. This is similar to, for example, the existing situation in gistKeyIsEQ(). Discussion: https://www.postgresql.org/message-id/flat/6604ad6e-5934-43ac-8590-15113d6ae4b1%40eisentraut.org	2025-09-12 07:28:32 +02:00
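The pattern behind this fix can be sketched as follows. The function names and the value 42 are hypothetical and chosen only for illustration; the point is that a variable whose address is passed as a pointer-to-const gets a value at its definition, so no path can hand over an indeterminate value.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callee that only reads through a const pointer. */
static size_t
read_size(const size_t *size)
{
    return *size;
}

/*
 * Hypothetical caller: dstsize is initialized where it is defined,
 * so it holds a defined value even when the branch below is skipped.
 */
static size_t
measured_size(int have_input)
{
    size_t dstsize = 0;

    if (have_input)
        dstsize = 42;

    return read_size(&dstsize);
}
```

Without the initializer, the `have_input == 0` path would pass an uninitialized variable to `read_size`, which is the situation the quoted warning describes.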
Richard Guo	2d756ebbe8	Fix misuse of Relids for storing attribute numbers The typedef Relids (Bitmapset *) is intended to represent sets of relation identifiers, but was incorrectly used in several places to store sets of attribute numbers. This is my oversight in `e2debb643`. Fix that by replacing such usages with Bitmapset * to reflect the correct semantics. Author: Junwang Zhao <zhjwpku@gmail.com> Reviewed-by: Tender Wang <tndrwang@gmail.com> Reviewed-by: Richard Guo <guofenglinux@gmail.com> Discussion: https://postgr.es/m/CAEG8a3LJhp_xriXf39iCz0TsK+M-2biuhDhpLC6Baxw8+ZYT3A@mail.gmail.com	2025-09-12 11:12:19 +09:00
Michael Paquier	528dadf691	Add more information for WAL records of hash index AMs hashdesc.c was missing a couple of fields in its record descriptions, as of: - is_prev_bucket_same_wrt for SQUEEZE_PAGE. - procid for INIT_META_PAGE. - old_bucket_flag and new_bucket_flag for SPLIT_ALLOCATE_PAGE. The author has noted the first hole, and I have spotted the others while double-checking this area of the code. Note that the only data missing now are the offsets stored in VACUUM_ONE_PAGE. We could perhaps add them, if somebody sees value in this data, even if it makes the output larger. These are discarded here. Author: Kirill Reshke <reshkekirill@gmail.com> Discussion: https://postgr.es/m/CALdSSPjc-OVwtZH0Xrkvg7n=2ZwdbMJzqrm_ed_CfjiAzuKVGg@mail.gmail.com	2025-09-12 10:29:02 +09:00
Michael Paquier	306dd13079	Remove whitespace in comment of pg_stat_statements.c Introduced by `6b4d23feef`, spotted while reading this area of the code.	2025-09-12 09:56:10 +09:00
Nathan Bossart	ed1aad15e0	Move named LWLock tranche requests to shared memory. In EXEC_BACKEND builds, GetNamedLWLockTranche() can segfault when called outside of the postmaster process, as it might access NamedLWLockTrancheRequestArray, which won't be initialized. Given the lack of reports, this is apparently unusual, presumably because it is usually called from a shmem_startup_hook like this: mystruct = ShmemInitStruct(..., &found); if (!found) { mystruct->locks = GetNamedLWLockTranche(...); ... } This genre of shmem_startup_hook evades the aforementioned segfaults because the struct is initialized in the postmaster, so all other callers skip the !found path. We considered modifying the documentation or requiring GetNamedLWLockTranche() to be called from the postmaster, but ultimately we decided to simply move the request array to shared memory (and add it to the BackendParameters struct), thereby allowing calls outside postmaster on all platforms. Since the main shared memory segment is initialized after accepting LWLock tranche requests, postmaster builds the request array in local memory first and then copies it to shared memory later. Given the lack of reports, back-patching seems unnecessary. Reported-by: Sami Imseih <samimseih@gmail.com> Reviewed-by: Sami Imseih <samimseih@gmail.com> Discussion: https://postgr.es/m/CAA5RZ0v1_15QPg5Sqd2Qz5rh_qcsyCeHHmRDY89xVHcy2yt5BQ%40mail.gmail.com	2025-09-11 16:13:55 -05:00
Tom Lane	a0b99fc122	Report the correct is_temporary flag for column defaults. pg_event_trigger_dropped_objects() would report a column default object with is_temporary = false, even if it belongs to a temporary table. This seems clearly wrong, so adjust it to report the table's temp-ness. While here, refactor EventTriggerSQLDropAddObject to make its handling of namespace objects less messy and avoid duplication of the schema-lookup code. And add some explicit test coverage of dropped-object reports for dependencies of temp tables. Back-patch to v15. The bug exists further back, but the GetAttrDefaultColumnAddress function this patch depends on does not, and it doesn't seem worth adjusting it to cope with the older code. Author: Antoine Violin <violin.antuan@gmail.com> Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAFjUV9x3-hv0gihf+CtUc-1it0hh7Skp9iYFhMS7FJjtAeAptA@mail.gmail.com Backpatch-through: 15	2025-09-11 17:11:57 -04:00
Álvaro Herrera	1d5800019f	Improve comment about snapshot macros The comment mistakenly had "the others" for "the other", but this commit also reorders the comment so it matches the macros below. Now we describe the levels in increasing strictness. In addition, it seems easier to follow if we introduce one level at a time, rather than describing two, followed by "the other" (and then jumping back to one of the first two). Finally, reword the sentence about the purpose of the macros, which was slightly off-point. Author: Paul Jungwirth <pj@illuminatedcomputing.com> Reviewed-by: Rustam ALLAKOV <rustamallakov@gmail.com> Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Discussion: https://postgr.es/m/CA+renyUp=xja80rBaB6NpY3RRdi750y046x28bo_xg29zKY72Q@mail.gmail.com	2025-09-11 19:49:57 +02:00
Álvaro Herrera	e8cec3d179	Add test for temporal referential integrity This commit adds an isolation test showing that temporal foreign keys do not permit referential integrity violations under concurrency, like fk-snapshot-2. You can show that the test fails by passing false for detectNewRows to ri_PerformCheck in ri_restrict. Author: Paul Jungwirth <pj@illuminatedcomputing.com> Reviewed-by: Rustam ALLAKOV <rustamallakov@gmail.com> Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Discussion: https://postgr.es/m/CA+renyUp=xja80rBaB6NpY3RRdi750y046x28bo_xg29zKY72Q@mail.gmail.com	2025-09-11 18:16:25 +02:00
Álvaro Herrera	a2b4102a21	Fill testing gap for possible referential integrity violation This commit adds a missing isolation test for (non-PERIOD) foreign keys. With REPEATABLE READ, one transaction can insert a referencing row while another deletes the referenced row, and both see a valid state. But after they have committed, the table violates referential integrity. If the INSERT precedes the DELETE, we use a crosscheck snapshot to see the just-added row, so that the DELETE can raise a foreign key error. You can see the table violate referential integrity if you change ri_restrict to pass false for detectNewRows to ri_PerformCheck. A crosscheck snapshot is not needed when the DELETE comes first, because the INSERT's trigger takes a FOR KEY SHARE lock that sees the row now marked for deletion, waits for that transaction to commit, and raises a serialization error. I (Paul) added a test for that too though. We already have a similar test (in ri-triggers.spec) for SERIALIZABLE snapshot isolation showing that you can implement foreign keys with just pl/pgSQL, but that test does nothing to validate ri_triggers.c. We also have tests (in fk-snapshot.spec) for other concurrency scenarios, but not this one: we test concurrently deleting both the referencing and referenced row, when the constraint activates a cascade/set null action. But those tests don't exercise ri_restrict, and the consequence of omitting a crosscheck comparison is different: a serialization failure, not a referential integrity violation. Author: Paul Jungwirth <pj@illuminatedcomputing.com> Reviewed-by: Rustam ALLAKOV <rustamallakov@gmail.com> Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Discussion: https://postgr.es/m/CA+renyUp=xja80rBaB6NpY3RRdi750y046x28bo_xg29zKY72Q@mail.gmail.com	2025-09-11 18:11:46 +02:00
Peter Eisentraut	4fbe015145	Remove checks for no longer supported GCC versions Since commit `f5e0186f86` (Raise C requirement to C11), we effectively require at least GCC version 4.7, so checks for older versions can be removed. Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/a0f817ee-fb86-483a-8a14-b6f7f5991b6e%40eisentraut.org	2025-09-11 12:05:59 +02:00
Peter Eisentraut	368c38dd47	Remove stray semicolon at global scope The Sun Studio compiler complains about an empty declaration here. Note for future historians: This does not mean that this compiler is still of current interest for anyone using PostgreSQL. But we can let this small fix be its parting gift. Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/a0f817ee-fb86-483a-8a14-b6f7f5991b6e%40eisentraut.org	2025-09-11 12:03:15 +02:00
Amit Kapila	01d793698f	Fix intermittent test failure introduced in `6456c6e2c4`. The test assumes that a backend will execute COMMIT PREPARED on the publisher and hit the injection point commit-after-delay-checkpoint within the commit critical section. This should cause the apply worker on the subscriber to wait for the transaction to complete. However, the test does not guarantee that the injection point is actually triggered, creating a race condition where the apply worker may proceed prematurely during COMMIT PREPARED. This commit resolves the issue by explicitly waiting for the injection point to be hit before continuing with the test, ensuring consistent and reliable behavior. Author: Zhijie Hou <houzj.fnst@fujitsu.com> Reviewed-by: shveta malik <shveta.malik@gmail.com> Discussion: https://postgr.es/m/TY4PR01MB1690751D1CA8C128B0770EC6F9409A@TY4PR01MB16907.jpnprd01.prod.outlook.com	2025-09-11 09:33:48 +00:00
Dean Rasheed	2bbbb2eca9	doc: Fix indentation in func-datetime.sgml. Incorrect indentation introduced by commit `faf071b553`.	2025-09-11 09:49:39 +01:00
Dean Rasheed	9c24111c4d	doc: Improve description of new random(min, max) functions. Mention that the new variants of random(min, max) are affected by setseed(), like the original functions. Reported-by: Marcos Pegoraro <marcos@f10.com.br> Discussion: https://postgr.es/m/CAB-JLwb1=drA3Le6uZXDBi_tCpeS1qm6XQU7dKwac_x91Z4qDg@mail.gmail.com	2025-09-11 09:25:47 +01:00
Michael Paquier	26eadf4d2b	Fix description of WAL record blocks in hash_xlog.h hash_xlog.h included descriptions for the blocks used in WAL records that were not completely consistent with how the records are generated, with one block missing for SQUEEZE_PAGE, and inconsistent descriptions used for block 0 in VACUUM_ONE_PAGE and MOVE_PAGE_CONTENTS. This information has been incorrect since `c11453ce0a`, cross-checking the logic for the record generation. Author: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru> Discussion: https://postgr.es/m/CALdSSPj1j=a1d1hVA3oabRFz0hSU3KKrYtZPijw4UPUM7LY9zw@mail.gmail.com Backpatch-through: 13	2025-09-11 17:17:04 +09:00
Michael Paquier	c88ce73eda	Fix incorrect file reference in guc.h GucSource_Names was documented as being in guc.c, but since `0a20ff54f5` it is located in guc_tables.c. The reference to the location of GucSource_Names is important, as GucSource needs to be kept in sync with GucSource_Names. Author: David G. Johnston <david.g.johnston@gmail.com> Discussion: https://postgr.es/m/CAKFQuwYPgAHWPYjPzK7iXzhSZ6MKR8w20_Nz7ZXpOvx=kZbs7A@mail.gmail.com Backpatch-through: 16	2025-09-11 10:15:33 +09:00
Tom Lane	09036dc71c	Avoid faulty alignment of Datums in build_sorted_items(). If sizeof(Pointer) is 4 then sizeof(SortItem) will be 12, so that if data->numrows is odd then we placed the values array at a location that's not a multiple of 8. That was fine when sizeof(Datum) was also 4, but in the wake of commit `2a600a93c` it makes some alignment-picky machines unhappy. (You need a 32-bit machine that nonetheless expects 8-byte alignment of 8-byte quantities, which is an odd-seeming combination but it does exist outside the Intel universe.) To fix, MAXALIGN the space allocated to the SortItem array. In passing, let's make the "len" variable be Size not int, just for paranoia's sake. This code was arguably not too safe even before `2a600a93c`, but at present I don't see a strong argument for back-patching. Reported-by: Tomas Vondra <tomas@vondra.me> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/87036018-8d70-40ad-a0ac-192b07bd7b04@vondra.me	2025-09-10 17:51:24 -04:00
Tom Lane	bdc6cfcd12	Eliminate duplicative hashtempcxt in nodeSubplan.c. Instead of building a separate memory context that's used just for running hash functions, make the hash functions run in the per-tuple context of the node's innerecontext. This saves a little space at runtime, and it avoids needing to reset two contexts instead of one inside buildSubPlanHash's main loop. This largely reverts commit `133924e13`. That's safe to do now because `bf6c614a2` decoupled the evaluation context used by TupleHashTableMatch from that used for hash function evaluation, so that there's no longer a risk of resetting the innerecontext too soon. Per discussion of bug #19040, although this is not directly a fix for that. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Haiyang Li <mohen.lhy@alibaba-inc.com> Reviewed-by: Fei Changhong <feichanghong@qq.com> Discussion: https://postgr.es/m/19040-c9b6073ef814f48c@postgresql.org	2025-09-10 16:15:08 -04:00
Tom Lane	abdeacdb09	Fix memory leakage in nodeSubplan.c. If the hash functions used for hashing tuples leaked any memory, we failed to clean that up, resulting in query-lifespan memory leakage in queries using hashed subplans. One way that could happen is if the values being hashed require de-toasting, since most of our hash functions don't trouble to clean up de-toasted inputs. Prior to commit `bf6c614a2`, this leakage was largely masked because TupleHashTableMatch would reset hashtable->tempcxt (via execTuplesMatch). But it doesn't do that anymore, and that's not really the right place for this anyway: doing it there could reset the tempcxt many times per hash lookup, or not at all. Instead put reset calls into ExecHashSubPlan and buildSubPlanHash. Along the way to that, rearrange ExecHashSubPlan so that there's just one place to call MemoryContextReset instead of several. This amounts to accepting the de-facto API spec that the caller of the TupleHashTable routines is responsible for resetting the tempcxt adequately often. Although the other callers seem to get this right, it was not documented anywhere, so add a comment about it. Bug: #19040 Reported-by: Haiyang Li <mohen.lhy@alibaba-inc.com> Author: Haiyang Li <mohen.lhy@alibaba-inc.com> Reviewed-by: Fei Changhong <feichanghong@qq.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/19040-c9b6073ef814f48c@postgresql.org Backpatch-through: 13	2025-09-10 16:05:03 -04:00
Nathan Bossart	9016fa7e3b	meson: Build numeric.c with -ftree-vectorize. autoconf builds have compiled this file with -ftree-vectorize since commit `8870917623`, but meson builds seem to have missed the memo. Reviewed-by: Jeff Davis <pgsql@j-davis.com> Discussion: https://postgr.es/m/aL85CeasM51-0D1h%40nathan Backpatch-through: 16	2025-09-10 11:21:12 -05:00
Peter Eisentraut	33eec80940	Fix CREATE TABLE LIKE with not-valid check constraint In CREATE TABLE ... LIKE, any check constraints copied from the source table should be set to valid if they are ENFORCED (the default). Bug introduced in commit `ca87c415e2`. Author: jian he <jian.universality@gmail.com> Discussion: https://www.postgresql.org/message-id/CACJufxH%3D%2Bod8Wy0P4L3_GpapNwLUP3oAes5UFRJ7yTxrM_M5kg%40mail.gmail.com	2025-09-10 13:25:58 +02:00
Michael Paquier	e6da68a6e1	Remove dynahash.h All the callers of my_log2() are now limited inside dynahash.c, so let's remove this header. The same capability is provided by pg_bitutils.h already. Discussion: https://postgr.es/m/CAEZATCUJPQD_7sC-wErak2CQGNa6bj2hY-mr8wsBki=kX7f2_A@mail.gmail.com	2025-09-10 14:11:50 +09:00
Michael Paquier	b1187266e0	Replace callers of dynahash.h's my_log2() by equivalent in pg_bitutils.h All the calls replaced by this commit use 4-byte integers for their variables used in input of my_log2(). Hence, the limit against too-large inputs does not really apply. Thresholds are also applied, as of: - In nodeAgg.c, the number of partitions is limited by HASHAGG_MAX_PARTITIONS. - In nodeHash.c, ExecChooseHashTableSize() caps its maximum number of buckets based on HashJoinTuple and palloc() allocation limit. - In worker.c, the number of subxacts tracked by ApplySubXactData uses uint32, making pg_ceil_log2_64() safe to use directly. Several approaches have been discussed, like an integration with thresholds in pg_bitutils.h, but it was found confusing. This uses Dean's idea, which gives a simpler result than what I came up with to be able to remove dynahash.h. dynahash.h will be removed in a follow-up commit, removing some duplication with the ceil log2 routines. Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com> Discussion: https://postgr.es/m/CAEZATCUJPQD_7sC-wErak2CQGNa6bj2hY-mr8wsBki=kX7f2_A@mail.gmail.com	2025-09-10 11:20:46 +09:00
Michael Paquier	8c8f7b199d	Fix leak with SMgrRelations in startup process The startup process does not process shared invalidation messages, only sending them, and never calls AtEOXact_SMgr(), which cleans up any unpinned SMgrRelations. Hence, it is never able to free SMgrRelations on a periodic basis, bloating its hashtable over time. Like the checkpointer and the bgwriter, this commit takes a conservative approach by periodically freeing SMgrRelations when replaying a checkpoint record, either online or shutdown, so that the startup process has a way to perform a periodic cleanup. Issue caused by `21d9c3ee4e`, so backpatch down to v17. Author: Jingtang Zhang <mrdrivingduck@gmail.com> Reviewed-by: Yuhang Qiu <iamqyh@gmail.com> Discussion: https://postgr.es/m/28C687D4-F335-417E-B06C-6612A0BD5A10@gmail.com Backpatch-through: 17	2025-09-10 07:23:05 +09:00
Nathan Bossart	d96c854dfc	Fix documentation for shmem_startup_hook. This section claims that each backend executes the shmem_startup_hook shortly after attaching to shared memory, which is true for EXEC_BACKEND builds, but not for others. This commit adds this important detail. Oversight in commit `964152c476`. Reported-by: Sami Imseih <samimseih@gmail.com> Reviewed-by: Sami Imseih <samimseih@gmail.com> Discussion: https://postgr.es/m/CAA5RZ0vEGT1eigGbVt604LkXP6mUPMwPMxQoRCbFny44w%2B9EUQ%40mail.gmail.com Backpatch-through: 17	2025-09-09 14:35:30 -05:00
Nathan Bossart	530cfa8eb5	test_slru: Fix LWLock tranche allocation in EXEC_BACKEND builds. Currently, test_slru's shmem_startup_hook unconditionally generates new LWLock tranche IDs. This is fine on non-EXEC_BACKEND builds, where only the postmaster executes this hook, but on EXEC_BACKEND builds, every backend executes it, too. To fix, only generate the tranche IDs in the postmaster process by checking the IsUnderPostmaster variable. This is arguably a bug fix and could be back-patched, but since the damage is limited to some extra unused tranche IDs in a test module, I'm not going to bother. Reported-by: Sami Imseih <samimseih@gmail.com> Reviewed-by: Sami Imseih <samimseih@gmail.com> Discussion: https://postgr.es/m/CAA5RZ0vaAuonaf12CeDddQJu5xKL%2B6xVyS%2B_q1%2BcH%3D33JXV82w%40mail.gmail.com	2025-09-09 14:09:36 -05:00
Peter Eisentraut	81a61fde84	Fix typo in comment Author: Alexandra Wang <alexandra.wang.oss@gmail.com> Discussion: https://www.postgresql.org/message-id/CAK98qZ0whQ%3Dc%2BJGXbGSEBxCtLgy6sf-YGYqsKTAGsS-wt0wj%2BA%40mail.gmail.com	2025-09-09 15:33:46 +02:00
Dean Rasheed	faf071b553	Add date and timestamp variants of random(min, max). This adds 3 new variants of the random() function: random(min date, max date) returns date random(min timestamp, max timestamp) returns timestamp random(min timestamptz, max timestamptz) returns timestamptz Each returns a random value x in the range min <= x <= max. Author: Damien Clochard <damien@dalibo.info> Reviewed-by: Greg Sabino Mullane <htamfids@gmail.com> Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com> Reviewed-by: Vik Fearing <vik@postgresfriends.org> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/f524d8cab5914613d9e624d9ce177d3d@dalibo.info	2025-09-09 10:39:30 +01:00
Amit Kapila	5ac3c1ac22	Fix Coverity issue reported in commit `a850be2fe`. Address a potential SIGSEGV that may occur when the tablesync worker attempts to locate a deleted row while applying changes. This situation arises during conflict detection for update-deleted scenarios. To prevent this crash, ensure that the operation is errored out early if the leader apply worker is unavailable. Since the leader worker maintains the necessary conflict detection metadata, proceeding without it serves no purpose and risks reporting an incorrect conflict type. In passing, improve a nearby comment. Reported by Tom Lane as per Coverity Author: shveta malik <shveta.malik@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/334468.1757280992@sss.pgh.pa.us	2025-09-09 03:18:22 +00:00
Melanie Plageman	8ec97e78a7	Add error codes when vacuum discovers VM corruption Commit `fd6ec93bf8` and other previous work established the principle that when an error is potentially reachable in case of on-disk corruption but is not expected to be reached otherwise, ERRCODE_DATA_CORRUPTED should be used. This allows log monitoring software to search for evidence of corruption by filtering on the error code. Enhance the existing log messages emitted when the heap page is found to be inconsistent with the VM by adding this error code. Suggested-by: Andrey Borodin <x4mmm@yandex-team.ru> Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Discussion: https://postgr.es/m/87DD95AA-274F-4F4F-BAD9-7738E5B1F905%40yandex-team.ru	2025-09-08 17:13:31 -04:00
Jeff Davis	9af672bcb2	meson: build checksums with extra optimization flags. Use -funroll-loops and -ftree-vectorize when building checksum.c to match what autoconf does. Discussion: https://postgr.es/m/a81f2f7ef34afc24a89c613671ea017e3651329c.camel@j-davis.com Reviewed-by: Andres Freund <andres@anarazel.de>	2025-09-08 12:29:42 -07:00
Nathan Bossart	3bcfcd815e	pg_upgrade: Transfer pg_largeobject_metadata's files when possible. Commit `161a3e8b68` taught pg_upgrade to use COPY for large object metadata for upgrades from v12 and newer, which is much faster to restore than the proper large object commands. For upgrades from v16 and newer, we can take this a step further and transfer the large object metadata files as if they were user tables. We can't transfer the files from older versions because the aclitem data type (needed by pg_largeobject_metadata.lomacl) changed its storage format in v16 (see commit `7b378237aa`). Note that this commit is essentially a revert of commit `12a53c732c`. There are a couple of caveats. First, we still need to COPY the corresponding pg_shdepend rows for large objects. Second, we need to COPY anything in pg_largeobject_metadata with a comment or security label, else restoring those will fail. This means that an upgrade in which every large object has a comment or security label won't gain anything from this commit, but it should at least avoid making those unusual use-cases any worse. pg_upgrade must also take care to transfer the relfilenodes of pg_largeobject_metadata and its index, as was done for pg_largeobject in commits `d498e052b4` and `bbe08b8869`. Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/aJ3_Gih_XW1_O2HF%40nathan	2025-09-08 14:19:48 -05:00
Melanie Plageman	4b5f206de2	Remove unused xl_heap_prune member, reason `f83d709760` refactored xl_heap_prune and added an unused member, reason. While PruneReason is used when constructing this WAL record to set the WAL record definition, it doesn't need to be stored in a separate field in the record. Remove it. We won't backport this, since modifying an exposed struct definition to remove an unused field would do more harm than good. Author: Melanie Plageman <melanieplageman@gmail.com> Reported-by: Andres Freund <andres@anarazel.de> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Discussion: https://postgr.es/m/tvvtfoxz5ykpsctxjbzxg3nldnzfc7geplrt2z2s54pmgto27y%40hbijsndifu45	2025-09-08 14:25:10 -04:00
Robert Haas	5a170e992a	Don't generate fake "TLOCRN" or "TROCRN" aliases, either. This is just like the previous two commits, except that this fix actually doesn't change any regression test outputs. Author: Robert Haas <rhaas@postgresql.org> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CA+TgmoYSYmDA2GvanzPMci084n+mVucv0bJ0HPbs6uhmMN6HMg@mail.gmail.com	2025-09-08 12:58:07 -04:00
Robert Haas	6f79024df3	Don't generate fake "ANY_subquery" aliases, either. This is just like the previous commit, but for a different invented alias name. Author: Robert Haas <rhaas@postgresql.org> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CA+TgmoYSYmDA2GvanzPMci084n+mVucv0bJ0HPbs6uhmMN6HMg@mail.gmail.com	2025-09-08 12:24:02 -04:00
Robert Haas	585e31fcb6	Don't generate fake "SELECT" or "SELECT %d" subquery aliases. rte->alias should point only to a user-written alias, but in these cases that principle was violated. Fixing this causes some regression test output changes: wherever rte->alias previously had a value and is now NULL, rte->eref is now set to a generated name rather than to rte->alias; and the scheme used to generate eref names differs from what we were doing for aliases. The upshot is that instead of "SELECT" or "SELECT %d", EXPLAIN will now emit "unnamed_subquery" or "unnamed_subquery_%d". But that's a reasonable descriptor, and we were already producing that in yet other cases, so this seems not too objectionable. Author: Tom Lane <tgl@sss.pgh.pa.us> Co-authored-by: Robert Haas <rhaas@postgresql.org> Discussion: https://postgr.es/m/CA+TgmoYSYmDA2GvanzPMci084n+mVucv0bJ0HPbs6uhmMN6HMg@mail.gmail.com	2025-09-08 11:50:33 -04:00
Melanie Plageman	3399c26554	Remove unneeded VM pin from VM replay Previously, heap_xlog_visible() called visibilitymap_pin() even after getting a buffer from XLogReadBufferForRedoExtended() -- which returns a pinned buffer containing the specified block of the visibility map. This would just have resulted in visibilitymap_pin() returning early since the specified page was already present and pinned, but it was confusing extraneous code, so remove it. It doesn't seem worth backporting, though. It appears to be an oversight in `2c03216`. While we are at it, remove two VM-related redundant asserts in the COPY FREEZE code path. visibilitymap_set() already asserts that PD_ALL_VISIBLE is set on the heap page and checks that the vmbuffer contains the bits corresponding to the specified heap block, so callers do not also need to check this. Author: Melanie Plageman <melanieplageman@gmail.com> Reported-by: Melanie Plageman <melanieplageman@gmail.com> Reported-by: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/CALdSSPhu7WZd%2BEfQDha1nz%3DDC93OtY1%3DUFEdWwSZsASka_2eRQ%40mail.gmail.com	2025-09-08 10:22:42 -04:00
Amit Kapila	6456c6e2c4	Add test to prevent premature removal of conflict-relevant data. A test has been added to ensure that conflict-relevant data is not prematurely removed when a concurrent prepared transaction is being committed on the publisher. This test introduces an injection point that simulates the presence of a prepared transaction in the commit phase, validating that the system correctly delays conflict slot advancement until the transaction is fully committed. Additionally, the test serves as a safeguard for developers, ensuring that the acquisition of the commit timestamp does not occur before marking DELAY_CHKPT_IN_COMMIT in RecordTransactionCommitPrepared. Reported-by: Robert Haas <robertmhaas@gmail.com> Author: Zhijie Hou <houzj.fnst@fujitsu.com> Reviewed-by: shveta malik <shveta.malik@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/OS9PR01MB16913F67856B0DA2A909788129400A@OS9PR01MB16913.jpnprd01.prod.outlook.com	2025-09-08 12:06:03 +00:00
Michael Paquier	8191e0c16a	Fix corruption of pgstats shared hashtable due to OOM failures A new pgstats entry is created as a two-step process: - The entry is looked up in the shared hashtable of pgstats, and is inserted if not found. - When not found and inserted, its fields are then initialized. This part includes a DSA chunk allocation for the stats data of the new entry. As currently coded, if the DSA chunk allocation fails due to an out-of-memory failure, an ERROR is generated, leaving in the pgstats shared hashtable an inconsistent entry due to the first step, as the entry has already been inserted in the hashtable. These broken entries can then be found by other backends, crashing them. There are only two callers of pgstat_init_entry(), when loading the pgstats file at startup and when creating a new pgstats entry. This commit changes pgstat_init_entry() so that it uses dsa_allocate_extended() with DSA_ALLOC_NO_OOM, making it return NULL on allocation failure instead of failing. This way, a backend failing an entry creation can take appropriate cleanup actions in the shared hashtable before throwing an error. Currently, this means removing the entry from the shared hashtable before throwing the error for the allocation failure. Out-of-memory errors are unlikely to happen in the wild, and we usually do not bother with back-patches when these are fixed. However, the problem dealt with here is a degree worse as it breaks the shared memory state of pgstats, impacting other processes that may look at an inconsistent entry that a different process has failed to create. Author: Mikhail Kot <mikhail.kot@databricks.com> Discussion: https://postgr.es/m/CAAi9E7jELo5_-sBENftnc2E8XhW2PKZJWfTC3i2y-GMQd2bcqQ@mail.gmail.com Backpatch-through: 15	2025-09-08 15:52:23 +09:00
Amit Kapila	1f7e9ba3ac	Post-commit review fixes for `228c370868`. This commit fixes three issues: 1) When a disabled subscription is created with retain_dead_tuples set to true, the launcher is not woken up immediately, which may lead to delays in creating the conflict detection slot. Creating the conflict detection slot is essential even when the subscription is not enabled. This ensures that dead tuples are retained, which is necessary for accurately identifying the type of conflict during replication. 2) Conflict-related data was unnecessarily retained when the subscription does not have a table. 3) Conflict-relevant data could be prematurely removed before applying prepared transactions on the publisher that are in the commit critical section. This issue occurred because the backend executing COMMIT PREPARED was not accounted for during the computation of oldestXid in the commit phase on the publisher. As a result, the subscriber could advance the conflict slot's xmin without waiting for such COMMIT PREPARED transactions to complete. We fixed this issue by identifying prepared transactions that are in the commit critical section during computation of oldestXid in commit phase. Author: Zhijie Hou <houzj.fnst@fujitsu.com> Reviewed-by: shveta malik <shveta.malik@gmail.com> Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com> Reviewed-by: Nisha Moond <nisha.moond412@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/OS9PR01MB16913DACB64E5721872AA5C02943BA@OS9PR01MB16913.jpnprd01.prod.outlook.com Discussion: https://postgr.es/m/OS9PR01MB16913F67856B0DA2A909788129400A@OS9PR01MB16913.jpnprd01.prod.outlook.com	2025-09-08 06:10:15 +00:00
Michael Paquier	43eb2c5419	Update parser README to include parse_jsontable.c The README was missing parse_jsontable.c which handles JSON_TABLE. Oversight in `de3600452b`. Author: Karthik S <karthikselvaam@gmail.com> Discussion: https://postgr.es/m/CAK4gQD9gdcj+vq_FZGp=Rv-W+41v8_C7cmCUmDeu=cfrOdfXEw@mail.gmail.com Backpatch-through: 17	2025-09-08 10:07:14 +09:00
Tatsuo Ishii	06473f5a34	Allow to log raw parse tree. This commit allows to log the raw parse tree in the same way we currently log the parse tree, rewritten tree, and plan tree. To avoid unnecessary log noise for users not interested in this detail, a new GUC option, "debug_print_raw_parse", has been added. When starting the PostgreSQL process with "-d N", and N is 3 or higher, debug_print_raw_parse is enabled automatically, alongside debug_print_parse. Author: Chao Li <lic@highgo.com> Reviewed-by: Tender Wang <tndrwang@gmail.com> Reviewed-by: Tatsuo Ishii <ishii@postgresql.org> Reviewed-by: John Naylor <johncnaylorls@gmail.com> Discussion: https://postgr.es/m/CAEoWx2mcO0Gpo4vd8kPMAFWeJLSp0MeUUnaLdE1x0tSVd-VzUw%40mail.gmail.com	2025-09-06 07:49:51 +09:00
Andres Freund	2c78940527	bufmgr: Remove freelist, always use clock-sweep This set of changes removes the list of available buffers and instead simply uses the clock-sweep algorithm to find and return an available buffer. This also removes the have_free_buffer() function and simply caps the pg_autoprewarm process to at most NBuffers. While on the surface this appears to be removing an optimization it is in fact eliminating code that induces overhead in the form of synchronization that is problematic for multi-core systems. The main reason for removing the freelist, however, is not the moderate improvement in scalability, but that having the freelist would require dedicated complexity in several upcoming patches. As we have not been able to find a case benefiting from the freelist... Author: Greg Burd <greg@burd.me> Reviewed-by: Tomas Vondra <tomas@vondra.me> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/70C6A5B5-2A20-4D0B-BC73-EB09DD62D61C@getmailspring.com	2025-09-05 12:25:59 -04:00
Andres Freund	50e4c6ace5	bufmgr: Use consistent naming of the clock-sweep algorithm Minor edits to comments only. Author: Greg Burd <greg@burd.me> Reviewed-by: Tomas Vondra <tomas@vondra.me> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/70C6A5B5-2A20-4D0B-BC73-EB09DD62D61C@getmailspring.com	2025-09-05 12:25:59 -04:00
Melanie Plageman	e3d5ddb7ca	Add assert and log message to visibilitymap_set Add an assert to visibilitymap_set() that the provided heap buffer is exclusively locked, which is expected. Also, enhance the debug logging message to specify which VM flags were set. Based on a related suggestion by Kirill Reshke on an in-progress patchset. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/CALdSSPhAU56g1gGVT0%2BwG8RrSWE6qW8TOfNJS1HNAWX6wPgbFA%40mail.gmail.com	2025-09-05 09:33:36 -04:00
Dean Rasheed	6ede13d1b5	Fix concurrent update issue with MERGE. When executing a MERGE UPDATE action, if there is more than one concurrent update of the target row, the lock-and-retry code would sometimes incorrectly identify the latest version of the target tuple, leading to incorrect results. This was caused by using the ctid field from the TM_FailureData returned by table_tuple_lock() in a case where the result was TM_Ok, which is unsafe because the TM_FailureData struct is not guaranteed to be fully populated in that case. Instead, it should use the tupleid passed to (and updated by) table_tuple_lock(). To reduce the chances of similar errors in the future, improve the commentary for table_tuple_lock() and TM_FailureData to make it clearer that table_tuple_lock() updates the tid passed to it, and most fields of TM_FailureData should not be relied on in non-failure cases. An exception to this is the "traversed" field, which is set in both success and failure cases. Reported-by: Dmitry <dsy.075@yandex.ru> Author: Yugo Nagata <nagata@sraoss.co.jp> Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/1570d30e-2b95-4239-b9c3-f7bf2f2f8556@yandex.ru Backpatch-through: 15	2025-09-05 08:18:18 +01:00
Michael Paquier	567d27e8e2	Fix outdated comments in slru.c SlruRecentlyUsed() is an inline function since `53c2a97a92`, not a macro. The description of long_segment_names was missing at the top of SimpleLruInit(), part forgotten in `4ed8f0913b`. Author: Julien Rouhaud <rjuju123@gmail.com> Discussion: https://postgr.es/m/aLpBLMOYwEQkaleF@jrouhaud Backpatch-through: 17	2025-09-05 14:10:08 +09:00
Michael Paquier	4246a977ba	Switch some numeric-related functions to use soft error reporting This commit changes some functions related to the data type numeric to use soft error reporting rather than a custom boolean flag (called "have_error") that callers of these functions could rely on to bypass the generation of ERROR reports, letting the callers do their own error handling (timestamp, jsonpath and numeric_to_char() require them). This results in the removal of some boilerplate code that was required to handle both the ereport() and the "have_error" code paths bypassing ereport(), unifying everything under the soft error reporting facility. While on it, some duplicated error messages are removed. The functions upgraded in this commit were suffixed with "_opt_error" in their names. They are renamed to "_safe" instead. This change relies on `d9f7f5d32f`, which introduced the soft error reporting infrastructure. Author: Amul Sul <sulamul@gmail.com> Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com> Discussion: https://postgr.es/m/CAAJ_b96No5h5tRuR+KhcC44YcYUCw8WAHuLoqqyyop8_k3+JDQ@mail.gmail.com	2025-09-05 13:53:47 +09:00
Michael Paquier	ae45312008	Change pg_lsn_in_internal() to use soft error reporting pg_lsn includes pg_lsn_in_internal() for the purpose of parsing an LSN position for the GUC recovery_target_lsn (`21f428ebde`). It relies on a boolean called "have_error" that would be set when the LSN parsing fails, letting its callers handle any errors. `d9f7f5d32f` has added support for soft error reporting. This commit removes some boilerplate code and switches the routine to use soft error reporting directly, giving the callers of pg_lsn_in_internal() the possibility to be fed the error message generated on failure. The pg_lsn_in_internal() routine is renamed to pg_lsn_in_safe(), for consistency with other similar routines that are given an escontext. Author: Amul Sul <sulamul@gmail.com> Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com> Discussion: https://postgr.es/m/CAAJ_b96No5h5tRuR+KhcC44YcYUCw8WAHuLoqqyyop8_k3+JDQ@mail.gmail.com	2025-09-05 12:59:29 +09:00
Nathan Bossart	d814d7fc3d	Revert recent change to RequestNamedLWLockTranche(). Commit `38b602b028` modified this function to allocate enough space for MAX_NAMED_TRANCHES (256) requests, which is likely far more than most clusters need. This commit reverts that change so that it first allocates enough space for only 16 requests and resizes the array when necessary. While at it, remove the check for too many tranches from this function. We can now rely on InitializeLWLocks() to do that check via its calls to LWLockNewTrancheId() for the named tranches. Reviewed-by: Sami Imseih <samimseih@gmail.com> Discussion: https://postgr.es/m/aLmzwC2dRbqk14y6%40nathan	2025-09-04 15:34:48 -05:00
Peter Eisentraut	f0478149c3	Clean up newly added guc_tables.inc.c There was a missing makefile rule to clean up the guc_tables.inc.c symlink in src/include/. Oversight in commit `6359989654`. Author: Nathan Bossart <nathandbossart@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/dae6fe89-1e0c-4c3f-8d92-19d23374fb10%40eisentraut.org	2025-09-04 17:25:43 +02:00
Nathan Bossart	1129d3e4c8	Adjust commentary for WaitEventLWLock in wait_event_names.txt. In addition to changing a couple of references for clarity, this commit combines the two similar comments.	2025-09-04 10:18:42 -05:00
Dean Rasheed	fc6600fc1c	Fix replica identity check for MERGE. When executing a MERGE, check that the target relation supports all actions mentioned in the MERGE command. Specifically, check that it has a REPLICA IDENTITY if it publishes updates or deletes and the MERGE command contains update or delete actions. Failing to do this can silently break replication. Author: Zhijie Hou <houzj.fnst@fujitsu.com> Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com> Tested-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/OS3PR01MB57180C87E43A679A730482DF94B62@OS3PR01MB5718.jpnprd01.prod.outlook.com Backpatch-through: 15	2025-09-04 11:45:44 +01:00
Dean Rasheed	5386bfb9c1	Fix replica identity check for INSERT ON CONFLICT DO UPDATE. If an INSERT has an ON CONFLICT DO UPDATE clause, the executor must check that the target relation supports UPDATE as well as INSERT. In particular, it must check that the target relation has a REPLICA IDENTITY if it publishes updates. Formerly, it was not doing this check, which could lead to silently breaking replication. Fix by adding such a check to CheckValidResultRel(), which requires adding a new onConflictAction argument. In back-branches, preserve ABI compatibility by introducing a wrapper function with the original signature. Author: Zhijie Hou <houzj.fnst@fujitsu.com> Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com> Tested-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/OS3PR01MB57180C87E43A679A730482DF94B62@OS3PR01MB5718.jpnprd01.prod.outlook.com Backpatch-through: 13	2025-09-04 11:27:53 +01:00
Michael Paquier	09119238a1	Fix incorrect comment in pgstat_backend.c The counters saved from pgWalUsage, used for the difference calculations when flushing the backend WAL stats, are updated when calling pgstat_flush_backend() under PGSTAT_BACKEND_FLUSH_WAL, and not pgstat_report_wal(). The comment updated in this commit referenced the latter, but it is perfectly OK to flush the backend stats independently of the WAL stats. Noticed while looking at this area of the code, introduced by `76def4cdd7` as a copy-pasto. Backpatch-through: 18	2025-09-04 08:34:51 +09:00
Tom Lane	e351e5c4fe	Make libpq_pipeline.c shorter and more uniform via helper functions. There are many places in this test program that need to consume a PGresult while checking that its PQresultStatus is as-expected, or related tasks such as checking that PQgetResult has nothing more to return. These tasks were open-coded in a rather inconsistent way, leading to some outright bugs, some memory leakage, and frequent inconsistencies about what would be reported in event of an error. Invent a few helper functions to standardize the behavior and reduce code duplication. Also, rename the one pre-existing helper function from confirm_query_canceled to consume_query_cancel, per Álvaro's suggestion that "confirm" is a poor choice of verb for a function that will discard the PGresult. While at it, clean up assorted other places that were leaking PGresults or even server connections. This is pure neatnik-ism, since the test doesn't run long enough for those leaks to be of any real-world concern. While this fixes some things that are clearly bugs, it's only a test program, and none of the bugs seem serious enough to justify back-patching. Bug: #18960 Reported-by: Dmitry Kovalenko <d.kovalenko@postgrespro.ru> Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/18960-09cd4a5100152e58@postgresql.org	2025-09-03 16:07:57 -04:00
Nathan Bossart	38b602b028	Move dynamically-allocated LWLock tranche names to shared memory. There are two ways for shared libraries to allocate their own LWLock tranches. One way is to call RequestNamedLWLockTranche() in a shmem_request_hook, which requires the library to be loaded via shared_preload_libraries. The other way is to call LWLockNewTrancheId(), which is not subject to the same restrictions. However, LWLockNewTrancheId() does require each backend to store the tranche's name in backend-local memory via LWLockRegisterTranche(). This API is a little cumbersome and leads to things like unhelpful pg_stat_activity.wait_event values in backends that haven't loaded the library. This commit moves these LWLock tranche names to shared memory, thus eliminating the need for each backend to call LWLockRegisterTranche(). Instead, the tranche name must be provided to LWLockNewTrancheId(), which immediately makes the name available to all backends. Since the tranche name array is append-only, lookups can ordinarily avoid locking as long as their local copy of the LWLock counter is greater than the requested tranche ID. One downside of this approach is that we now have a hard limit on both the length of tranche names (NAMEDATALEN-1 bytes) and the number of dynamically-allocated tranches (256). Besides a limit of NAMEDATALEN-1 bytes for tranche names registered via RequestNamedLWLockTranche(), no such limits previously existed. We could avoid these new limits by using dynamic shared memory, but the complexity involved didn't seem worth it. We briefly considered making the tranche limit user-configurable but ultimately decided against that, too. Since there is still a lot of time left in the v19 development cycle, it's possible we will revisit this choice. 
Author: Sami Imseih <samimseih@gmail.com> Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Rahila Syed <rahilasyed90@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/CAA5RZ0vvED3naph8My8Szv6DL4AxOVK3eTPS0qXsaKi%3DbVdW2A%40mail.gmail.com	2025-09-03 13:57:48 -05:00
Jacob Champion	7b0fb9f5c6	ci: Explicitly enable Meson features Meson's "auto" feature mode silently disables features with missing prerequisites, which is nice for development but can lead to false positives in the CI (such as my commit `b0635bfda`, which broke OAuth detection on OpenBSD). Use an explicit feature list in the Cirrus config instead; this mirrors the --with-XXX experience of Autoconf. While we're here, also move common configuration options into a single variable, MESON_COMMON_PG_CONFIG_ARGS, as suggested by Peter. The resulting hierarchy is as follows: MESON_COMMON_PG_CONFIG_ARGS "global" Meson configuration options MESON_COMMON_FEATURES the default set of CI features, to be used unless there's a specific reason not to MESON_FEATURES per-OS feature configuration, overriding the above The current exceptions to the use of MESON_COMMON_FEATURES are - SanityCheck, which uses almost no dependencies; - Windows - VS, whose feature list has diverged significantly from the others; and - Linux, which continues to use 'auto' features so that autodetection is still tested in the CI. (Options shared between 64- and 32-bit builds can go into LINUX_MESON_FEATURES instead.) Author: Nazir Bilal Yavuz <byavuz81@gmail.com> Suggested-by: Jacob Champion <jacob.champion@enterprisedb.com> Suggested-by: Peter Eisentraut <peter@eisentraut.org> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/flat/CAN55FZ0aO8d_jkyRijcGP8qO%3DXH09qG%3Dpw0ZZDvB4LMzuXYU1w%40mail.gmail.com	2025-09-03 07:54:24 -07:00
Jacob Champion	01c5938003	ci: Remove extra PG_TEST_EXTRA from NetBSD/OpenBSD The PG_TEST_EXTRA environment variable is already set at the top level. As of `3d1aec225`, Meson tasks will use this by default, so there's no need for another intermediate variable. Author: Nazir Bilal Yavuz <byavuz81@gmail.com> Suggested-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/flat/CAN55FZ0aO8d_jkyRijcGP8qO%3DXH09qG%3Dpw0ZZDvB4LMzuXYU1w%40mail.gmail.com	2025-09-03 07:54:15 -07:00
Peter Eisentraut	01d6e5b2cf	Fix mistake in new GUC tables source Commit `6359989654` had it so that the parameter "debug_discard_caches" did not exist unless DISCARD_CACHES_ENABLED was defined (typically via enabling asserts). This was a mistake, it did not correspond to the prior setup. Several tests use this parameter, so they were now failing if you did not have asserts enabled.	2025-09-03 11:48:35 +02:00
Peter Eisentraut	6359989654	Generate GUC tables from .dat file Store the information in guc_tables.c in a .dat file similar to the catalog data in src/include/catalog/, and generate a part of guc_tables.c from that. The goal is to make it easier to edit that information, and to be able to make changes to the downstream data structures more easily. (Essentially, those are the same reasons as for the original adoption of the .dat format.) Reviewed-by: John Naylor <johncnaylorls@gmail.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: David E. Wheeler <david@justatheory.com> Discussion: https://www.postgresql.org/message-id/flat/dae6fe89-1e0c-4c3f-8d92-19d23374fb10%40eisentraut.org	2025-09-03 09:45:17 +02:00
Richard Guo	aba8f61c30	Fix planner error when estimating SubPlan cost SubPlan nodes are typically built very early, before any RelOptInfos have been constructed for the parent query level. As a result, the simple_rel_array in the parent root has not yet been initialized. Currently, during cost estimation of a SubPlan's testexpr, we may call examine_variable() to look up statistical data about the expressions. This can lead to "no relation entry for relid" errors. To fix, pass root as NULL to cost_qual_eval() in cost_subplan(), since the root does not yet contain enough information to safely consult statistics. One exception is SubPlan nodes built for the initplans of MIN/MAX aggregates from indexes. In this case, having a NULL root is safe because testexpr will be NULL. Additionally, an initplan will by definition not consult anything from the parent plan. Backpatch to all supported branches. Although the reported call path that triggers this error is not reachable prior to v17, there's no guarantee that other code paths -- especially in extensions -- could not encounter the same issue when cost_qual_eval() is called with a root that lacks a valid simple_rel_array. The test case is not included in pre-v17 branches though. Bug: #19037 Reported-by: Alexander Lakhin <exclusion@gmail.com> Diagnosed-by: Tom Lane <tgl@sss.pgh.pa.us> Author: Richard Guo <guofenglinux@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/19037-3d1c7bb553c7ce84@postgresql.org Backpatch-through: 13	2025-09-03 16:00:38 +09:00
Amit Kapila	f2dbc83501	Fix use-after-free issue in slot synchronization. Author: Shlok Kyal <shlok.kyal.oss@gmail.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Backpatch-through: 18, where it was introduced Discussion: https://postgr.es/m/CANhcyEXMrcEdzj-RNGJam0nJHM4y+ttdWsgUCFmXciM7BNKc7A@mail.gmail.com	2025-09-03 06:31:05 +00:00
Michael Paquier	db9405493b	libpq: Fix PQtrace() format for non-printable characters PQtrace() was generating its output for non-printable characters without casting the characters printed with unsigned char, leading to some extra "\xffffff" generated in the output due to the fact that char may be signed. Oversights introduced by commit `198b3716db`, so backpatch down to v14. Author: Ran Benita <ran@unusedvar.com> Discussion: https://postgr.es/m/a3383211-4539-459b-9d51-95c736ef08e0@app.fastmail.com Backpatch-through: 14	2025-09-03 12:54:23 +09:00
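The signedness issue described in the PQtrace() commit above can be illustrated with a minimal sketch. This is not the libpq code: `format_byte` is a hypothetical helper, and the exact output format of PQtrace() is not reproduced here. It only shows why the cast to unsigned char matters when a byte is printed with `%x` on platforms where plain char is signed.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of libpq): format one byte as "\xNN".
 *
 * Casting to unsigned char first keeps the value in the range 0..255.
 * Without the cast, a negative plain char is sign-extended when promoted
 * to int, so %x can print something like "ffffffe9" instead of "e9".
 */
static void
format_byte(char c, char *buf, size_t buflen)
{
    assert(buflen >= 5);        /* room for "\xNN" plus the terminating NUL */
    snprintf(buf, buflen, "\\x%02x", (unsigned char) c);
}
```

Calling `format_byte((char) 0xE9, buf, sizeof(buf))` is expected to leave `\xe9` in `buf` regardless of whether char is signed on the platform.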
Michael Paquier	c6ea528b47	Update outdated references to the SLRU ControlLock SLRU bank locks are referred to as "bank locks" or "SLRU bank locks" in the code comments. The comments updated in this commit use the latter term. Oversight in `53c2a97a92`, which replaced the single ControlLock with the bank control locks. Author: Julien Rouhaud <julien.rouhaud@free.fr> Discussion: https://postgr.es/m/aLUT2UO8RjJOzZNq@jrouhaud Backpatch-through: 17	2025-09-03 10:20:28 +09:00
Fujii Masao	229911c4bf	Add HINT for COPY TO when WHERE clause is used. COPY TO does not support a WHERE clause, and currently fails with the error: ERROR: WHERE clause not allowed with COPY TO Since the intended behavior can be achieved by using COPY (SELECT ... WHERE ...) TO, this commit adds a HINT to the error message: HINT: Try the COPY (SELECT ... WHERE ...) TO variant. This makes the error more informative and helps users quickly find the alternative usage. Author: Atsushi Torikoshi <torikoshia@oss.nttdata.com> Reviewed-by: Jim Jones <jim.jones@uni-muenster.de> Discussion: https://postgr.es/m/3520c224c5ffac0113aef84a9179f37e@oss.nttdata.com	2025-09-03 08:35:55 +09:00
Nathan Bossart	510777a2d5	Change ReplicationSlotPersistentData's "synced" member to a bool. Note that this doesn't require bumping SLOT_VERSION because we require sizeof(bool) == 1, thanks to commit `97525bc5c8`. Oversight in commit `ddd5f4f54a`. Discussion: Ranier Vilela <ranier.vf@gmail.com>	2025-09-02 16:53:54 -05:00
Tom Lane	1b1960c8c9	Improve error message for duplicate labels when creating an enum type. Previously, duplicate labels in CREATE TYPE AS ENUM were caught by the unique index on pg_enum, resulting in a generic error message. While this was evidently intentional, it's not terribly user-friendly, nor consistent with the ALTER TYPE cases which take more care with such errors. This patch adds an explicit check to produce a more user-friendly and descriptive error message. A potential objection to this implementation is that it adds O(N^2) work to the creation operation. However, quick testing finds that that's pretty negligible below 1000 enum labels, and tolerable even at 10000. So it doesn't really seem worth being smarter. Author: Yugo Nagata <nagata@sraoss.co.jp> Reviewed-by: Rahila Syed <rahilasyed90@gmail.com> Reviewed-by: Jim Jones <jim.jones@uni-muenster.de> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/20250704000402.37e605ab0c59c300965a17ee@sraoss.co.jp	2025-09-02 13:50:56 -04:00
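The O(N^2) check mentioned in the enum-label commit above can be sketched as a pairwise comparison. This is an illustration only, not the code from the patch: `find_duplicate_label` is a hypothetical name, and the real implementation works on the parser's list of label nodes rather than a plain array of C strings.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical sketch of a quadratic duplicate check over enum labels.
 * Returns the index of the first label that repeats an earlier one,
 * or -1 if all labels are distinct.
 */
static int
find_duplicate_label(const char *const *labels, size_t n)
{
    assert(labels != NULL || n == 0);

    for (size_t i = 1; i < n; i++)
    {
        /* compare label i against every label that precedes it */
        for (size_t j = 0; j < i; j++)
        {
            if (strcmp(labels[i], labels[j]) == 0)
                return (int) i;
        }
    }
    return -1;
}
```

For N labels this performs at most N(N-1)/2 string comparisons, which fits the commit's observation that the cost only becomes noticeable at thousands of labels.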
Michael Paquier	eccba079c2	Generate pgstat_count_slru*() functions for slru using macros This change replaces seven functions definitions by macros, reducing a bit some repetitive patterns in the code. An interesting side effect is that this removes an inconsistency in the naming of SLRU increment functions with the field names. This change is similar to `850f4b4c8c`, `8018ffbf58` or `83a1a1b566`. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://postgr.es/m/aLHA//gr4dTpDHHC@ip-10-97-1-34.eu-west-3.compute.internal	2025-09-02 16:22:03 +09:00
Amit Kapila	a850be2fe6	Add max_retention_duration option to subscriptions. This commit introduces a new subscription parameter, max_retention_duration, aimed at mitigating excessive accumulation of dead tuples when retain_dead_tuples is enabled and the apply worker lags behind the publisher. When the time spent advancing a non-removable transaction ID exceeds the max_retention_duration threshold, the apply worker will stop retaining conflict detection information. In such cases, the conflict slot's xmin will be set to InvalidTransactionId, provided that all apply workers associated with the subscription (with retain_dead_tuples enabled) confirm the retention duration has been exceeded. To ensure retention status persists across server restarts, a new column subretentionactive has been added to the pg_subscription catalog. This prevents unnecessary reactivation of retention logic after a restart. The conflict detection slot will not be automatically re-initialized unless a new subscription is created with retain_dead_tuples = true, or the user manually re-enables retain_dead_tuples. A future patch will introduce support for automatic slot re-initialization once at least one apply worker confirms that the retention duration is within the configured max_retention_duration. Author: Zhijie Hou <houzj.fnst@fujitsu.com> Reviewed-by: shveta malik <shveta.malik@gmail.com> Reviewed-by: Nisha Moond <nisha.moond412@gmail.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/OS0PR01MB5716BE80DAEB0EE2A6A5D1F5949D2@OS0PR01MB5716.jpnprd01.prod.outlook.com	2025-09-02 03:20:18 +00:00
Michael Paquier	36aed19fd9	postgres_fdw: Use psql variables for connection parameters Several statements need to reference the current connection's current database name and current port value. Until now, this has been accomplished by creating dynamic SQL statements inside of a DO block, which is not as easy to parse. It also takes away some of the granularity of any error messages that might occur, making debugging harder. By capturing the connection-specific settings into psql variables, it becomes possible to write simpler SQL statements for the FDW objects. This eliminates most of the DO blocks used in this test, making it a bit more readable and shorter. Author: Corey Huinker <corey.huinker@gmail.com> Discussion: https://postgr.es/m/CADkLM=cpUiJ3QF7aUthTvaVMmgQcm7QqZBRMDLhBRTR+gJX-Og@mail.gmail.com	2025-09-01 09:02:03 +09:00
Richard Guo	317c117d6d	Fix const-simplification for constraints and stats Constraint expressions and statistics expressions loaded from the system catalogs need to be run through const-simplification, because the planner will be comparing them to similarly-processed qual clauses. Without this step, the planner may fail to detect valid matches. Currently, NullTest clauses in these expressions may not be reduced correctly during const-simplification. This happens because their Var nodes do not yet have the correct varno when eval_const_expressions is applied. Since eval_const_expressions relies on varno to reduce NullTest quals, incorrect varno can cause problems. Additionally, for statistics expressions, eval_const_expressions is called with root set to NULL, which also inhibits NullTest reduction. This patch fixes the issue by ensuring that Vars are updated to have the correct varno before const-simplification, and that a valid root is passed to eval_const_expressions when needed. Author: Richard Guo <guofenglinux@gmail.com> Discussion: https://postgr.es/m/19007-4cc6e252ed8aa54a@postgresql.org	2025-08-31 08:59:48 +09:00
Bruce Momjian	0c6d572c11	add_commit_links.pl: error out if missing major version number Reported-by: Tom Lane Author: Tom Lane Discussion: https://postgr.es/m/53125.1756591456@sss.pgh.pa.us	2025-08-30 18:26:08 -04:00
Nathan Bossart	5487058b56	Prepare DSM registry for upcoming changes to LWLock tranche names. A proposed patch would place a limit of NAMEDATALEN-1 (i.e., 63) bytes on the names of dynamically-allocated LWLock tranches, but GetNamedDSA() and GetNamedDSHash() may register tranches with longer names. This commit lowers the maximum DSM registry entry name length to NAMEDATALEN-1 bytes and modifies GetNamedDSHash() to create only one tranche, thereby allowing us to keep the DSM registry's tranche names below NAMEDATALEN bytes. Author: Sami Imseih <samimseih@gmail.com> Discussion: https://postgr.es/m/aKzIg1JryN1qhNuy%40nathan	2025-08-29 20:34:53 -05:00
Tom Lane	f727b63e81	Provide error context when an error is thrown within WaitOnLock(). Show the requested lock level and the object being waited on, in the same format we use for deadlock reports and similar errors. This is particularly helpful for debugging lock-timeout errors, since otherwise the user has very little to go on about which lock timed out. The performance cost of setting up the callback should be negligible compared to the other tracing support already present in WaitOnLock. As in the deadlock-report case, we just show numeric object OIDs, because it seems too scary to try to perform catalog lookups in this context. Reported-by: Steve Baldwin <steve.baldwin@gmail.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/1602369.1752167154@sss.pgh.pa.us	2025-08-29 15:43:34 -04:00
Daniel Gustafsson	e686010c5b	pg_dump: Fix compression API errorhandling Compression in pg_dump is abstracted using an API with multiple implementations which can be selected at runtime by the user. The API and its implementations have evolved over time, notable commits include `bf9aa490db`, `e9960732a9`, `84adc8e20`, and `0da243fed`. The errorhandling defined by the API was however problematic and the implementations had a few bugs and/or were not following the API specification. This commit modifies the API to ensure that callers can perform errorhandling efficiently and fixes all the implementations such that they all implement the API in the same way. A full list of the changes can be seen below. * write_func: - Make write_func throw an error on all error conditions. All callers of write_func were already checking for success and calling pg_fatal on all errors, so we might as well make the API support that case directly with simpler errorhandling as a result. * open_func: - zstd: move stream initialization from the open function to the read and write functions as they can have fatal errors. Also ensure to dup the file descriptor like none and gzip. - lz4: Ensure to dup the file descriptor like none and gzip. * close_func: - zstd: Ensure to close the file descriptor even if closing down the compressor fails, and clean up state allocation on fclose failures. Make sure to capture errors set by fclose. - lz4: Ensure to close the file descriptor even if closing down the compressor fails, and instead of calling pg_fatal log the failures using pg_log_error. Make sure to capture errors set by fclose. - none: Make sure to catch errors set by fclose. * read_func / gets_func: - Make read_func unconditionally return the number of read bytes instead of making it optional per implementation. - lz4: Make sure to throw an error and not return -1 - gzip: gzread returning zero cannot be assumed to indicate EOF as it is documented to return zero for some types of errors. 
- lz4, zstd: Convert the _read_internal helper functions to not call pg_fatal on errors to be able to handle gets_func returning NULL on error. * getc_func: - zstd: Use an unsigned char rather than an int to read char into. * LZ4Stream_init: - Make sure to not switch to inited state until we know that initialization succeeded and reset errno just in case. On top of these changes there are minor comment cleanups and improvements as well as an attempt to consistently reset errno in codepaths where it is inspected. This work was initiated by a report of API misuse, which turned into a larger body of work. As this is an internal API these changes can be backpatched into all affected branches. Author: Tom Lane <tgl@sss.pgh.pa.us> Author: Daniel Gustafsson <daniel@yesql.se> Reported-by: Evgeniy Gorbanev <gorbanyoves@basealt.ru> Discussion: https://postgr.es/m/517794.1750082166@sss.pgh.pa.us Backpatch-through: 16	2025-08-29 19:28:46 +02:00
Nathan Bossart	67fcf48c3b	Make LWLockCounter a global variable. Using the LWLockCounter requires first calculating its address in shared memory like this: LWLockCounter = (int *) ((char *) MainLWLockArray - sizeof(int)); Commit `82e861fbe1` started this trend in order to fix EXEC_BACKEND builds, but it could also be fixed by adding it to the BackendParameters struct. The current approach is somewhat difficult to follow, so this commit switches to the latter. While at it, swap around the code in LWLockShmemSize() to match the order of assignments in CreateLWLocks() for added readability. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://postgr.es/m/aLDLnan9gNCS9fHx%40nathan	2025-08-29 12:13:37 -05:00
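The address calculation quoted in the LWLockCounter commit above relies on the counter being stored immediately before the lock array in the same allocation. A minimal sketch of that layout follows, assuming a plain heap buffer in place of shared memory; `counter_before` is a hypothetical name and nothing here is taken from lwlock.c beyond the shape of the cast.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical sketch: given a pointer to the start of an array that was
 * placed right after an int counter in one allocation, step back by
 * sizeof(int) bytes to recover the counter's address.
 *
 * The cast to char * makes the subtraction count bytes, not array elements.
 */
static int *
counter_before(void *array_start)
{
    assert(array_start != NULL);
    return (int *) ((char *) array_start - sizeof(int));
}
```

Keeping a separate pointer to the counter, as the commit does by making it a global, avoids every reader having to know this layout.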
Tom Lane	66fa3b5eef	Fix .gitignore for src/interfaces/libpq-oauth. This missed files created when running the oauth tests.	2025-08-29 12:05:58 -04:00
Nathan Bossart	6fbd7b93c6	Remove unused parameter from ProcessSlotSyncInterrupts(). Oversight in commit `93db6cbda0`. Author: ChangAo Chen <cca5507@qq.com> Discussion: https://postgr.es/m/tencent_7B42BBE8D0A5C28DDAB91436192CBCCB8307%40qq.com	2025-08-29 10:56:10 -05:00
Tom Lane	8722e7965f	Silence -Wmissing-variable-declarations in headerscheck. Newer gcc versions will emit warnings about missing extern declarations if certain header files are compiled by themselves. Add the "extern" declarations needed to quiet that. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/1127775.1754417387@sss.pgh.pa.us	2025-08-29 10:46:13 -04:00
David Rowley	da9f9f75e5	Fix possible use after free in expand_partitioned_rtentry() It's possible that if the only live partition is concurrently dropped and try_table_open() fails, that the bms_del_member() will pfree the live_parts Bitmapset. Since the bms_del_member() call does not assign the result back to the live_parts local variable, the while loop could segfault as that variable would still reference the pfree'd Bitmapset. Backpatch to 15. `52f3de874` was backpatched to 14, but there's no bms_del_member() there due to live_parts not yet existing in RelOptInfo in that version. Technically there's no bug in version 15 as bms_del_member() didn't pfree when the set became empty prior to `00b41463c` (from v16). Applied to v15 anyway to keep the code similar and to avoid the bad coding pattern. Author: Bernd Reiß <bd_reiss@gmx.at> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/6b88f27a-c45c-4826-8e37-d61a04d90182@gmx.at Backpatch-through: 15	2025-08-30 00:50:50 +12:00
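The coding pattern behind the expand_partitioned_rtentry() fix above can be shown with a toy set type. This is a sketch under stated assumptions: `ToySet`, `toy_add_member` and `toy_del_member` are hypothetical stand-ins, not PostgreSQL's Bitmapset API. They only mimic the one property the commit describes, that deleting the last member frees the set and returns NULL.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical toy set holding members 0..31; not PostgreSQL's Bitmapset. */
typedef struct ToySet
{
    unsigned int bits;
} ToySet;

/* Add a member, allocating the set on first use. Returns the set. */
static ToySet *
toy_add_member(ToySet *set, int x)
{
    assert(x >= 0 && x < 32);
    if (set == NULL)
    {
        set = malloc(sizeof(ToySet));
        if (set == NULL)
            abort();
        set->bits = 0;
    }
    set->bits |= 1u << x;
    return set;
}

/*
 * Remove a member. If the set becomes empty it is freed and NULL is
 * returned, so the caller's old pointer must not be used afterwards.
 */
static ToySet *
toy_del_member(ToySet *set, int x)
{
    assert(x >= 0 && x < 32);
    if (set == NULL)
        return NULL;
    set->bits &= ~(1u << x);
    if (set->bits == 0)
    {
        free(set);
        return NULL;
    }
    return set;
}
```

The safe call shape is `live = toy_del_member(live, x);`. Calling `toy_del_member(live, x);` without assigning the result leaves `live` dangling once the set becomes empty, which is the hazard the commit describes.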
Álvaro Herrera	f225473cba	CREATE STATISTICS: improve misleading error message I think the error message for a different condition was inadvertently copied. This problem seems to have been introduced by commit `a4d75c86bf`. Author: Álvaro Herrera <alvherre@kurilemu.de> Reported-by: jian he <jian.universality@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Backpatch-through: 14 Discussion: https://postgr.es/m/CACJufxEZ48toGH0Em_6vdsT57Y3L8pLF=DZCQ_gCii6=C3MeXw@mail.gmail.com	2025-08-29 14:43:47 +02:00
Daniel Gustafsson	5d7f58848c	Fix typo in isolation test spec Replace 'committs' with 'commits'. Author: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Discussion: https://postgr.es/m/CAEoWx2=BESkfXsZ9jQW+1NcGTazKuj2wEXsPm1_EpgzHs0BHDQ@mail.gmail.com	2025-08-29 13:08:32 +02:00
Peter Eisentraut	f5d0708582	headerscheck: Document that --with-llvm is required We already documented that other --with-* options are required for a successful run. It turns out --with-llvm is also required. Suggested-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/1127775.1754417387%40sss.pgh.pa.us	2025-08-29 09:30:50 +02:00
Peter Eisentraut	da0413373c	headerscheck: Ignore Windows-specific header Ignore src/include/port/win32/sys/resource.h. At least on macOS, including this results in warnings and errors because of duplication with system headers: ../src/include/port/win32/sys/resource.h:10:9: warning: 'RUSAGE_CHILDREN' redefined ../src/include/port/win32/sys/resource.h:16:1: error: redefinition of struct or union 'struct rusage' Since we are also not checking similar system-replacement headers for Windows, it makes sense to exclude this one, too. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/1127775.1754417387%40sss.pgh.pa.us	2025-08-29 09:01:46 +02:00
Peter Eisentraut	664e0d6789	headerscheck: Use ICU_CFLAGS Otherwise, headerscheck will fail if the ICU headers are in a location not reached by the normal CFLAGS/CPPFLAGS: ../src/include/utils/pg_locale.h:21:10: fatal error: unicode/ucol.h: No such file or directory Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/1127775.1754417387%40sss.pgh.pa.us	2025-08-29 09:01:46 +02:00
Peter Eisentraut	991295f387	Mark ItemPointer arguments as const in tuple/table lock functions The functions LockTuple, ConditionalLockTuple, UnlockTuple, and XactLockTableWait take an ItemPointer argument that they do not modify, so the argument can be const-qualified to better convey intent and allow the compiler to enforce immutability. Author: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CAEoWx2m9e4rECHBwpRE4%2BGCH%2BpbYZXLh2f4rB1Du5hDfKug%2BOg%40mail.gmail.com	2025-08-29 07:39:58 +02:00
Peter Eisentraut	710e6c4301	Remove unneeded casts of BufferGetPage() result BufferGetPage() already returns type Page, so casting it to Page doesn't achieve anything. A sizable number of call sites does this casting; remove that. This was already done inconsistently in the code in the first import in 1996 (but didn't exist in the pre-1995 code), and it was then apparently just copied around. Author: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Richard Guo <guofenglinux@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://www.postgresql.org/message-id/flat/CALdSSPgFhc5=vLqHdk-zCcnztC0zEY3EU_Q6a9vPEaw7FkE9Vw@mail.gmail.com	2025-08-29 07:18:29 +02:00
Richard Guo	97b0f36bde	Fix semijoin unique-ification for child relations For a child relation, we should not assume that its parent's unique-ified relation (or unique-ified path in v18) always exists. In cases where all RHS columns that need to be unique-ified are equated to constants, the unique-ified relation/path for the parent table is not built, as there are no columns left to unique-ify. Failing to account for this can result in a SIGSEGV crash during planning. This patch checks whether the parent's unique-ified relation or path exists and skips unique-ification of the child relation if it does not. Author: Richard Guo <guofenglinux@gmail.com> Discussion: https://postgr.es/m/CAMbWs49MOdLW2c+qbLHHBt8VBu=4ONpM91D19=AWeW93eFUF6A@mail.gmail.com Backpatch-through: 18	2025-08-29 13:14:12 +09:00
Masahiko Sawada	fabd8b8e2a	Use LW_SHARED in walsummarizer.c for WALSummarizerLock lock where possible. Previously, we used LW_EXCLUSIVE in several places despite only reading WalSummarizerCtl fields. This patch reduces the lock level to LW_SHARED where we are only reading the shared fields. Backpatch to 17, where wal summarization was introduced. Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Discussion: https://postgr.es/m/CAD21AoDdKhf_9oriEYxY-JCdF+Oe_muhca3pcdkMEdBMzyHyKw@mail.gmail.com Backpatch-through: 17	2025-08-28 17:06:42 -07:00
Tom Lane	b8a1bdc458	Fix "variable not found in subplan target lists" in semijoin de-duplication. One mechanism we have for implementing semi-joins is to de-duplicate the output of the RHS and then treat the join as a plain inner join. Initial construction of the join's SpecialJoinInfo identifies the RHS columns that need to be de-duplicated, but later we may find that some of those don't need to be handled explicitly, either because they're known to be constant or because they are redundant with some previous column. Up to now, while sort-based de-duplication handled such cases well, hash-based de-duplication didn't: we'd still hash on all of the originally-identified columns. This is probably not a very big deal performance-wise, but in the wake of commit `a3179ab69` it can cause planner errors. That happens when join elimination causes recalculation of variables' attr_needed bitmapsets, and we decide that a variable mentioned in a semijoin clause doesn't need to be propagated up to the join level anymore. There are a number of ways we could slice the blame for this, but the only fix that doesn't result in pessimizing plans for loosely-related cases is to be more careful about not hashing columns we don't actually need to de-duplicate. We can install that consideration into create_unique_paths in master, or the predecessor code in create_unique_path in v18, without much refactoring. (As follow-up work, it might be a good idea to look at more-invasive refactoring, in hopes of preventing other bugs in this area. But with v18 release so close, there's not time for that now, nor would we be likely to want to put such refactoring into v18 anyway.) 
Reported-by: Sergey Soloviev <sergey.soloviev@tantorlabs.ru> Diagnosed-by: Richard Guo <guofenglinux@gmail.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Richard Guo <guofenglinux@gmail.com> Discussion: https://postgr.es/m/1fd1a421-4609-4d46-a1af-ab74d5de504a@tantorlabs.ru Backpatch-through: 18	2025-08-28 13:49:23 -04:00
Álvaro Herrera	16a9165ce4	Glossary: improve definition of "relation" Define the more general term first, then the Postgres-specific meaning. Wording from Tom Lane. Discussion: https://postgr.es/m/CACJufxEZ48toGH0Em_6vdsT57Y3L8pLF=DZCQ_gCii6=C3MeXw@mail.gmail.com	2025-08-28 18:16:08 +02:00
Álvaro Herrera	325fc0ab14	Avoid including commands/dbcommands.h in so many places This has been done historically because of get_database_name (which since commit `cb98e6fb8f` belongs in lsyscache.c/h, so let's move it there) and get_database_oid (which is in the right place, but whose declaration should appear in pg_database.h rather than dbcommands.h). Clean this up. Also, xlogreader.h and stringinfo.h are no longer needed by dbcommands.h since commit `f1fd515b39`, so remove them. Author: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/202508191031.5ipojyuaswzt@alvherre.pgsql	2025-08-28 12:39:04 +02:00
Peter Eisentraut	80f1106132	Message style improvements An improvement pass over the new stats import functionality.	2025-08-28 09:09:26 +02:00
Andres Freund	5865150b6d	aio: Stop using enum bitfields due to bad code generation During an investigation into rather odd aio related errors on macOS, observed by Alexander and Konstantin, we started to wonder if bitfield access is related to the error. At the moment it looks like it is related: we cannot reproduce the failures when replacing the bitfields. In addition, the problem can only be reproduced with some compiler [versions] and not everyone has been able to reproduce the issue. The observed problem is that, very rarely, PgAioHandle->{state,target} are in an inconsistent state, after having been checked to be in a valid state not long before, triggering an assertion failure. Unfortunately, this could be caused by wrong compiler code generation or somehow by missing memory barriers - we don't really know. In theory there should not be any concurrent write access to the handle in the state the bug is triggered, as the handle was idle and is just being initialized. Separately from the bug, we observed that at least gcc and clang generate rather terrible code for the bitfield access. Even if it's not clear if the observed assertion failure is actually caused by the bitfield somehow, the bad code generation alone is sufficient reason to stop using bitfields. Therefore, replace the enum bitfields with uint8s and instead cast in each switch statement. Reported-by: Alexander Lakhin <exclusion@gmail.com> Reported-by: Konstantin Knizhnik <knizhnik@garret.ru> Discussion: https://postgr.es/m/1500090.1745443021@sss.pgh.pa.us Backpatch-through: 18	2025-08-27 19:12:11 -04:00
Peter Eisentraut	310d04169a	Put back intra-grant-inplace.spec test coverage Commit `d31bbfb659` lost some test coverage, because the situation being tested, a concurrent DROP, cannot happen anymore. Put the test coverage back with a bit of a trick, by deleting directly from the catalog table. Co-authored-by: Noah Misch <noah@leadboat.com> Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Discussion: https://www.postgresql.org/message-id/flat/bf72b82c-124d-4efa-a484-bb928e9494e4@eisentraut.org	2025-08-27 17:46:31 +02:00
Peter Eisentraut	e36fa9319b	Improve objectNamesToOids() comment Commit `d31bbfb659` removed the comment at objectNamesToOids() that there is no locking, because that commit added locking. But to fix all the problems, we'd still need a stronger lock. So put the comment back with a more detailed explanation. Co-authored-by: Noah Misch <noah@leadboat.com> Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Discussion: https://www.postgresql.org/message-id/flat/bf72b82c-124d-4efa-a484-bb928e9494e4@eisentraut.org	2025-08-27 17:46:26 +02:00
Peter Eisentraut	990c8db182	Fix: Don't strip $libdir from nested module_pathnames This patch fixes a bug in how 'load_external_function' handles '$libdir/' prefixes in module paths. Previously, 'load_external_function' would unconditionally strip '$libdir/' from the beginning of the 'filename' string. This caused an issue when the path was nested, such as "$libdir/nested/my_lib". Stripping the prefix resulted in a path of "nested/my_lib", which would fail to be found by the expand_dynamic_library_name function because the original '$libdir' macro was removed. To fix this, the code now checks for the presence of an additional directory separator ('/' or '\') after the '$libdir/' prefix. The prefix is only stripped if the remaining string does not contain a separator. This ensures that simple filenames like '"$libdir/my_lib"' are correctly handled, while nested paths are left intact for 'expand_dynamic_library_name' to process correctly. Reported-by: Dilip Kumar <dilipbalaut@gmail.com> Co-authored-by: Matheus Alcantara <matheusssilv97@gmail.com> Co-authored-by: Dilip Kumar <dilipbalaut@gmail.com> Reviewed-by: Srinath Reddy Sadipiralla <srinath2133@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CAFiTN-uKNzAro4tVwtJhF1UqcygfJ%2BR%2BRL%3Db-_ZMYE3LdHoGhA%40mail.gmail.com	2025-08-27 15:49:58 +02:00
Jeff Davis	ef5b87b970	Check for more Unicode functions during upgrade. When checking for expression indexes that may be affected by a Unicode update during upgrade, check for a few more functions. Specifically, check for documented regexp functions, as well as the new CASEFOLD() function. Also, fully-qualify references to pg_catalog.text and pg_catalog.regtype. Discussion: https://postgr.es/m/399b656a3abb0c9283538a040f72199c0601525c.camel@j-davis.com Backpatch-through: 18	2025-08-26 22:55:14 -07:00
Jacob Champion	85b380162c	oauth: Explicitly depend on -pthread Followup to `4e1e41733` and `52ecd05ae`. oauth-utils.c uses pthread_sigmask(), requiring -pthread on Debian bullseye at minimum. Reported-by: Christoph Berg <myon@debian.org> Tested-by: Christoph Berg <myon@debian.org> Discussion: https://postgr.es/m/aK4PZgC0wuwQ5xSK%40msg.df7cb.de Backpatch-through: 18	2025-08-26 14:16:31 -07:00
Peter Eisentraut	e567e22290	Message style improvements Mostly adding some quoting.	2025-08-26 22:52:11 +02:00
Nathan Bossart	984d7165dd	Document privileges required for vacuumdb --missing-stats-only. When vacuumdb's --missing-stats-only option is used, the catalog query for retrieving the list of relations to process must read pg_statistic and pg_statistic_ext_data. However, those catalogs can only be read by superusers by default, so --missing-stats-only is effectively superuser-only. This is unfortunate, but since the option is primarily intended for use by administrators after running pg_upgrade, let's just live with it for v18. This commit adds a note about the aforementioned privilege requirements to the documentation for --missing-stats-only. We first tried to improve matters by modifying the query to read the pg_stats and pg_stats_ext system views instead. While that is indeed more lenient from a privilege standpoint, it is also borderline incomprehensible. pg_stats shows rows for which the user has the SELECT privilege on the corresponding column, and pg_stats_ext shows rows for tables the user owns. Meanwhile, ANALYZE requires either MAINTAIN on the table or, for non-shared relations, ownership of the database. But even if the privilege discrepancies were tolerable, the performance impact was not. Ultimately, the modified query was substantially more expensive, so we abandoned the idea. For v19, perhaps we could introduce a simple, inexpensive way to discover which relations are missing statistics, such as a system function or view with similar privilege requirements to ANALYZE. Unfortunately, it is far too late for anything like that in v18. Reviewed-by: Yugo Nagata <nagata@sraoss.co.jp> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/CAHGQGwHh43suEfss1wvBsk7vqiou%3DUY0zcy8HGyE5hBp%2BHZ7SQ%40mail.gmail.com Backpatch-through: 18	2025-08-26 14:49:01 -05:00
Tom Lane	327b7324d0	Put "excludeOnly" GIN scan keys at the end of the scankey array. Commit `4b754d6c1` introduced the concept of an excludeOnly scan key, which cannot select matching index entries but can reject non-matching tuples, for example a tsquery such as '!term'. There are poorly-documented assumptions that such scan keys do not appear as the first scan key. ginNewScanKey did nothing to ensure that, however, with the result that certain GIN index searches could go into an infinite loop while apparently-equivalent queries with the clauses in a different order were fine. Fix by teaching ginNewScanKey to place all excludeOnly scan keys after all not-excludeOnly ones. So far as we know at present, it might be sufficient to avoid the case where the very first scan key is excludeOnly; but I'm not very convinced that there aren't other dependencies on the ordering. Bug: #19031 Reported-by: Tim Wood <washwithcare@gmail.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/19031-0638148643d25548@postgresql.org Backpatch-through: 13	2025-08-26 12:08:57 -04:00
Tom Lane	b55068236c	Do CHECK_FOR_INTERRUPTS inside, not before, scanGetItem. The CHECK_FOR_INTERRUPTS call in gingetbitmap turns out to be inadequate to prevent a long uninterruptible loop, because we now know a case where looping occurs within scanGetItem. While the next patch will fix the bug that caused that, it seems foolish to assume that no similar patterns are possible. Let's do the CFI within scanGetItem's retry loop, instead. This demonstrably allows canceling out of the loop exhibited in bug #19031. Bug: #19031 Reported-by: Tim Wood <washwithcare@gmail.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/19031-0638148643d25548@postgresql.org Backpatch-through: 13	2025-08-26 11:38:41 -04:00
Alexander Korotkov	5f6f951f88	Improve RowMark handling during Self-Join Elimination The Self-Join Elimination (SJE) feature messes up keeping and removing RowMarks in remove_self_joins_one_group(). That didn't lead to user-level error, because the planned RowMark is only used to reference a rtable entry in later execution stages. An RTE entry for keeping and removing relations is identical and refers to the same relation OID. To reduce confusion and prevent future issues, this commit cleans up the code and fixes the incorrect behaviour. Furthermore, it includes sanity checks in setrefs.c on existing non-null RTE and RelOptInfo entries for each RowMark. Discussion: https://postgr.es/m/18c6bd6c-6d2a-419a-b0da-dfedef34b585%40gmail.com Author: Andrei Lepikhov <lepihov@gmail.com> Reviewed-by: Greg Sabino Mullane <htamfids@gmail.com> Backpatch-through: 18	2025-08-26 13:23:18 +03:00
Alexander Korotkov	d713cf9b65	Refactor variable names in remove_self_joins_one_group() Rename inner and outer to rrel and krel, respectively, to highlight their connection to r and k indexes. For the same reason, rename imark and omark to rmark and kmark. Discussion: https://postgr.es/m/18c6bd6c-6d2a-419a-b0da-dfedef34b585%40gmail.com Author: Andrei Lepikhov <lepihov@gmail.com> Reviewed-by: Greg Sabino Mullane <htamfids@gmail.com> Backpatch-through: 18	2025-08-26 13:22:43 +03:00
Alexander Korotkov	f8ce9ed220	Further clarify documentation for the initcap function This is a follow-up for commit `c2c2c7e225`. It further clarifies the following in the initcap function documentation: * Document that title case is used for digraphs in specific locales, * Reference particular ICU function used, * Add note about the purpose of the function. Discussion: https://postgr.es/m/804cc10ef95d4d3b298e76b181fd9437%40postgrespro.ru Author: Oleg Tselebrovskiy <o.tselebrovskiy@postgrespro.ru> Co-authored-by: Alexander Korotkov <aekorotkov@gmail.com> Reviewed-by: Jeff Davis <pgsql@j-davis.com> Reviewed-by: Peter Eisentraut <peter@eisentraut.org>	2025-08-26 13:22:43 +03:00
Peter Eisentraut	f5e0186f86	Raise C requirement to C11 This changes configure and meson.build to require at least C11, instead of the previous C99. The installation documentation is updated accordingly. configure.ac previously used AC_PROG_CC_C99 to activate C99. But there is no AC_PROG_CC_C11 in Autoconf 2.69, because it's too old. (Also, post-2.69, the AC_PROG_CC_Cnn macros were deprecated and AC_PROG_CC activates the last supported C mode.) We could update the required Autoconf version, but that might be a separate project that no one wants to undertake at the moment. Instead, we open-code the test for C11 using some inspiration from later Autoconf versions. But instead of writing an elaborate test program, we keep it simple and just check __STDC_VERSION__, which should be good enough in practice. In meson.build, we update the existing C99 test to C11, but again we just check for __STDC_VERSION__. This also removes the separate option for the conforming preprocessor on MSVC, added by commit `8fd9bb1d96`, since that is activated automatically in C11 mode. Note, we don't use the "official" way to set the C standard in Meson using the c_std project option, because that is impossible to use correctly (see <https://github.com/mesonbuild/meson/issues/14717>). Reviewed-by: David Rowley <dgrowleyml@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/01a69441-af54-4822-891b-ca28e05b215a@eisentraut.org	2025-08-26 11:50:46 +02:00
Peter Eisentraut	99234e9ddc	Message wording improvements Use "row" instead of "tuple" for user-facing information for logical replication conflicts.	2025-08-25 23:15:24 +02:00
Nathan Bossart	989b2e4d5c	Use PqMsg_* macros in applyparallelworker.c. Oversight in commit `f4b54e1ed9`. Author: Ranier Vilela <ranier.vf@gmail.com> Discussion: https://postgr.es/m/CAEudQAobFsHaLMypA6C96-9YExvF4AcU1xNPoPuNYRVm3mq4dg%40mail.gmail.com	2025-08-25 14:11:01 -05:00
Jacob Champion	4e1e417330	oauth: Add unit tests for multiplexer handling To better record the internal behaviors of oauth-curl.c, add a unit test suite for the socket and timer handling code. This is all based on TAP and driven by our existing Test::More infrastructure. This commit is a replay of `1443b6c0e`, which was reverted due to buildfarm failures. Compared with that, this version protects the build targets in the Makefile with a with_libcurl conditional, and it tweaks the code style in 001_oauth.pl. Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> Reviewed-by: Andrew Dunstan <andrew@dunslane.net> Discussion: https://postgr.es/m/CAOYmi+nDZxJHaWj9_jRSyf8uMToCADAmOfJEggsKW-kY7aUwHA@mail.gmail.com Discussion: https://postgr.es/m/CAOYmi+m=xY0P_uAzAP_884uF-GhQ3wrineGwc9AEnb6fYxVqVQ@mail.gmail.com	2025-08-25 09:27:45 -07:00
Jacob Champion	52ecd05aee	oauth: Always link with -lm for floor() libpq-oauth uses floor() but did not link against libm. Since libpq itself uses -lm, nothing in the buildfarm has had problems with libpq-oauth yet, and it seems difficult to hit a failure in practice. But commit `1443b6c0e` attempted to add an executable based on libpq-oauth, which ran into link-time failures with Clang due to this omission. It seems prudent to fix this for both the module and the executable simultaneously so that no one trips over it in the future. This is a Makefile-only change. The Meson side already pulls in libm, through the os_deps dependency. Discussion: https://postgr.es/m/CAOYmi%2Bn6ORcmV10k%2BdAs%2Bp0b9QJ4bfpk0WuHQaF5ODXxM8Y36A%40mail.gmail.com Backpatch-through: 18	2025-08-25 09:27:39 -07:00
Nathan Bossart	3ef2b863a3	Use PqMsg_* macros in fe-protocol3.c. Oversight in commit `f4b54e1ed9`. Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com> Reviewed-by: Fabrízio de Royes Mello <fabriziomello@gmail.com> Discussion: https://postgr.es/m/aKx5vEbbP03JNgtp%40nathan	2025-08-25 11:08:26 -05:00
Peter Eisentraut	878656dbde	Formatting cleanup of guc_tables.c This cleans up a few minor formatting inconsistencies. Reviewed-by: John Naylor <johncnaylorls@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/dae6fe89-1e0c-4c3f-8d92-19d23374fb10%40eisentraut.org	2025-08-25 09:10:27 +02:00
Noah Misch	ad4412480d	Rewrite previous commit's test for TestUpgradeXversion compatibility. v17 introduced the MAINTAIN ON TABLES privilege. That changed the applicable "baseacls" reaching buildACLCommands(). That yielded spurious TestUpgradeXversion diffs. Change to use a TYPES privilege. Types have the same one privilege in all supported versions, so they avoid the problem. Per buildfarm. Back-patch to v13, like that commit. Discussion: https://postgr.es/m/20250823144505.88.nmisch@google.com Backpatch-through: 13	2025-08-23 16:46:20 -07:00
Noah Misch	b61a5c4bed	Sort DO_DEFAULT_ACL dump objects independent of OIDs. Commit `0decd5e89d` missed DO_DEFAULT_ACL, leading to assertion failures, potential dump order instability, and spurious schema diffs. Back-patch to v13, like that commit. Reported-by: Alexander Lakhin <exclusion@gmail.com> Author: Kirill Reshke <reshkekirill@gmail.com> Discussion: https://postgr.es/m/d32aaa8d-df7c-4f94-bcb3-4c85f02bea21@gmail.com Backpatch-through: 13	2025-08-22 20:50:28 -07:00
Alexander Korotkov	c13070a27b	Revert "Get rid of WALBufMappingLock" This reverts commit `bc22dc0e0d`. It appears that conditional variables are not suitable for use inside critical sections. If WaitLatch()/WaitEventSetWaitBlock() face postmaster death, they exit, releasing all locks instead of PANIC. In certain situations, this leads to data corruption. Reported-by: Andrey Borodin <x4mmm@yandex-team.ru> Discussion: https://postgr.es/m/B3C69B86-7F82-4111-B97F-0005497BB745%40yandex-team.ru Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru> Reviewed-by: Aleksander Alekseev <aleksander@tigerdata.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Thomas Munro <thomas.munro@gmail.com> Reviewed-by: Tomas Vondra <tomas@vondra.me> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Yura Sokolov <y.sokolov@postgrespro.ru> Reviewed-by: Michael Paquier <michael@paquier.xyz> Backpatch-through: 18	2025-08-22 19:26:38 +03:00
Nathan Bossart	b63952a781	vacuumdb: Fix --missing-stats-only with virtual generated columns. Statistics aren't created for virtual generated columns, so "vacuumdb --missing-stats-only" always chooses to analyze tables that have them. To fix, modify vacuumdb's query for retrieving relations that are missing statistics to exclude those columns. Oversight in commit `edba754f05`. Author: Yugo Nagata <nagata@sraoss.co.jp> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Corey Huinker <corey.huinker@gmail.com> Discussion: https://postgr.es/m/20250820104226.8ba51e43164cd590b863ce41%40sraoss.co.jp Backpatch-through: 18	2025-08-22 11:11:28 -05:00
Heikki Linnakangas	807ee417e5	Revert unnecessary check for NULL Jelte pointed out that this was unnecessary, but I failed to remove it before pushing `f6f0542266`. Oops. Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl> Discussion: https://www.postgresql.org/message-id/CAGECzQT%3DxNV-V%2BvFC7YQwYQMj0wGN61b3p%3DJ1_rL6M0vbjTtrA@mail.gmail.com Backpatch-through: 18	2025-08-22 14:47:19 +03:00
Heikki Linnakangas	e411a8d25a	libpq: Be strict about cancel key lengths The protocol documentation states that the maximum length of a cancel key is 256 bytes. This starts checking for that limit in libpq. Otherwise third party backend implementations will probably start using more bytes anyway. We also start requiring that a protocol 3.0 connection does not send a longer cancel key, to make sure that servers don't start breaking old 3.0-only clients by accident. Finally this also restricts the minimum key length to 4 bytes (both in the protocol spec and in the libpq implementation). Author: Jelte Fennema-Nio <postgres@jeltef.nl> Reviewed-by: Jacob Champion <jchampion@postgresql.org> Discussion: https://www.postgresql.org/message-id/df892f9f-5923-4046-9d6f-8c48d8980b50@iki.fi Backpatch-through: 18	2025-08-22 14:39:29 +03:00
Heikki Linnakangas	f6f0542266	libpq: Handle OOM by disconnecting instead of hanging or skipping msgs In most cases, if an out-of-memory situation happens, we attach the error message to the connection and report it at the next PQgetResult() call. However, there are a few cases, while processing messages that are not associated with any particular query, where we handled failed allocations differently and not very nicely: - If we ran out of memory while processing an async notification, getNotify() either returned EOF, which stopped processing any further data until more data was received from the server, or silently dropped the notification. Returning EOF is problematic because if more data never arrives, e.g. because the connection was used just to wait for the notification, or because the next ReadyForQuery was already received and buffered, it would get stuck forever. Silently dropping a notification is not nice either. - (New in v18) If we ran out of memory while receiving BackendKeyData message, getBackendKeyData() returned EOF, which has the same issues as in getNotify(). - If we ran out of memory while saving a received ParameterStatus message, we just skipped it. A later call to PQparameterStatus() would return NULL, even though the server did send the status. Change all those cases to terminate the connection instead. Our options for reporting those errors are limited, but it seems better to terminate than try to soldier on. Applications should handle connection loss gracefully, whereas silently missing a notification, parameter status, or cancellation key could cause much weirder problems. This also changes the error message on OOM while expanding the input buffer. It used to report "cannot allocate memory for input buffer", followed by "lost synchronization with server: got message type ...". The "lost synchronization" message seems unnecessary, so remove that and report only "cannot allocate memory for input buffer". 
(The comment speculated that the out of memory could indeed be caused by loss of sync, but that seems highly unlikely.) This evolved from a more narrow patch by Jelte Fennema-Nio, which was reviewed by Jacob Champion. Somewhat arbitrarily, backpatch to v18 but no further. These are long-standing issues, but we haven't received any complaints from the field. We can backpatch more later, if needed. Co-authored-by: Jelte Fennema-Nio <postgres@jeltef.nl> Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl> Reviewed-by: Jacob Champion <jchampion@postgresql.org> Discussion: https://www.postgresql.org/message-id/df892f9f-5923-4046-9d6f-8c48d8980b50@iki.fi Backpatch-through: 18	2025-08-22 14:39:25 +03:00
Heikki Linnakangas	661f821ef0	Use ereport() rather than elog() Noah pointed this out before I committed `50f770c3d9`, but I accidentally pushed the old version with elog() anyway. Oops. Reported-by: Noah Misch <noah@leadboat.com> Discussion: https://www.postgresql.org/message-id/20250820003756.31.nmisch@google.com	2025-08-22 13:35:05 +03:00
Heikki Linnakangas	50f770c3d9	Revert GetTransactionSnapshot() to return historic snapshot during LR Commit `1585ff7387` changed GetTransactionSnapshot() to throw an error if it's called during logical decoding, instead of returning the historic snapshot. I made that change for extra protection, because a historic snapshot can only be used to access catalog tables while GetTransactionSnapshot() is usually called when you're executing arbitrary queries. You might get very subtle visibility problems if you tried to use the historic snapshot for arbitrary queries. There's no built-in code in PostgreSQL that calls GetTransactionSnapshot() during logical decoding, but it turns out that the pglogical extension does just that, to evaluate row filter expressions. You would get weird results if the row filter runs arbitrary queries, but it is sane as long as you don't access any non-catalog tables. Even though there are no checks to enforce that in pglogical, a typical row filter expression does not access any tables and works fine. Accessing tables marked with the user_catalog_table = true option is also OK. To fix pglogical with row filters, and any other extensions that might do similar things, revert GetTransactionSnapshot() to return a historic snapshot during logical decoding. To try to still catch the unsafe usage of historic snapshots, add checks in heap_beginscan() and index_beginscan() to complain if you try to use a historic snapshot to scan a non-catalog table. We're very close to the version 18 release however, so add those new checks only in master. Backpatch-through: 18 Reported-by: Noah Misch <noah@leadboat.com> Reviewed-by: Noah Misch <noah@leadboat.com> Discussion: https://www.postgresql.org/message-id/20250809222338.cc.nmisch@google.com	2025-08-22 13:07:46 +03:00
Peter Eisentraut	16a0039dc0	Reduce lock level for ALTER DOMAIN ... VALIDATE CONSTRAINT Reduce from ShareLock to ShareUpdateExclusiveLock. Validation during ALTER DOMAIN ... ADD CONSTRAINT keeps using ShareLock. Example: create domain d1 as int; create table t (a d1); alter domain d1 add constraint cc10 check (value > 10) not valid; begin; alter domain d1 validate constraint cc10; -- another session insert into t values (8); Now we should still be able to perform DML operations on table t while the domain constraint is being validated. The equivalent works already on table constraints. Author: jian he <jian.universality@gmail.com> Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com> Reviewed-by: wenhui qiu <qiuwenhuifx@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CACJufxHz92A88NLRTA2msgE2dpXpE-EoZ2QO61od76-6bfqurA%40mail.gmail.com	2025-08-22 08:56:11 +02:00
Amit Kapila	123e65fdb7	Doc: Fix typo in logicaldecoding.sgml. Author: Hayato Kuroda <kuroda.hayato@fujitsu.com> Backpatch-through: 17, where it was introduced Discussion: https://postgr.es/m/OSCPR01MB149662EC5467B4135398E3731F532A@OSCPR01MB14966.jpnprd01.prod.outlook.com	2025-08-22 05:29:36 +00:00
Michael Paquier	13b935cd52	Change dynahash.c and hsearch.h to use int64 instead of long This code was relying on "long", which is signed 8 bytes everywhere except on Windows where it is 4 bytes, that could potentially expose it to overflows, even if the current uses in the code are fine as far as I know. This code is now able to rely on the same sizeof() variable everywhere, with int64. long was used for sizes, partition counts and entry counts. Some callers of the dynahash.c routines used long declarations, that can be cleaned up to use int64 instead. There was one shortcut based on SIZEOF_LONG, that can be removed. long is entirely removed from dynahash.c and hsearch.h. Similar work was done in `b1e5c9fa9a`. Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/aKQYp-bKTRtRauZ6@paquier.xyz	2025-08-22 11:59:02 +09:00
Michael Paquier	ef03ea01fe	Ignore temporary relations in RelidByRelfilenumber() Temporary relations may share the same RelFileNumber with a permanent relation, or other temporary relations associated with other sessions. Being able to uniquely identify a temporary relation would require RelidByRelfilenumber() to know about the proc number of the temporary relation it wants to identify, something it is not designed for since its introduction in `f01d1ae3a1`. There are currently three callers of RelidByRelfilenumber(): - autoprewarm. - Logical decoding, reorder buffer. - pg_filenode_relation(), that attempts to find a relation OID based on a tablespace OID and a RelFileNumber. This makes the situation problematic particularly for the first two cases, leading to the possibility of random ERRORs due to inconsistencies that temporary relations can create in the cache maintained by RelidByRelfilenumber(). The third case should be less of an issue, as I suspect that there are few direct callers of pg_filenode_relation(). The window in which the ERRORs can happen is very narrow, requiring an OID wraparound to create a lookup conflict in RelidByRelfilenumber() with a temporary table reusing the same OID as another relation already cached. The problem is easier to reach in workloads with a high OID consumption rate, especially with a higher number of temporary relations created. We could get pg_filenode_relation() and RelidByRelfilenumber() to work with temporary relations if provided the means to identify them with an optional proc number given in input, but the years have also shown that we do not have a use case for it, yet. Note that this could not be backpatched if pg_filenode_relation() needs changes. It is simpler to ignore temporary relations. 
Reported-by: Shenhao Wang <wangsh.fnst@fujitsu.com> Author: Vignesh C <vignesh21@gmail.com> Reviewed-By: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-By: Robert Haas <robertmhaas@gmail.com> Reviewed-By: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Reviewed-By: Takamichi Osumi <osumi.takamichi@fujitsu.com> Reviewed-By: Michael Paquier <michael@paquier.xyz> Reviewed-By: Masahiko Sawada <sawada.mshk@gmail.com> Reported-By: Shenhao Wang <wangsh.fnst@fujitsu.com> Discussion: https://postgr.es/m/bbaaf9f9-ebb2-645f-54bb-34d6efc7ac42@fujitsu.com Backpatch-through: 13	2025-08-22 09:03:59 +09:00
Peter Eisentraut	47932f3cdc	Use consistent type for pgaio_io_get_id() result The result of pgaio_io_get_id() was being assigned to a mix of int and uint32 variables. This fixes it to use int consistently, which seems the most correct. Also change the queue empty special value in method_worker.c to -1 from UINT32_MAX. Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://www.postgresql.org/message-id/70c784b3-f60b-4652-b8a6-75e5f051243e%40eisentraut.org	2025-08-21 19:45:25 +02:00
Fujii Masao	12da45742c	Disallow server start with sync_replication_slots = on and wal_level < logical. Replication slot synchronization (sync_replication_slots = on) requires wal_level to be logical. This commit prevents the server from starting if sync_replication_slots is enabled but wal_level is set to minimal or replica. Failing early during startup helps users catch invalid configurations immediately, which is important because changing wal_level requires a server restart. Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Shveta Malik <shveta.malik@gmail.com> Discussion: https://postgr.es/m/CAH0PTU_pc3oHi__XESF9ZigCyzai1Mo3LsOdFyQA4aUDkm01RA@mail.gmail.com	2025-08-21 22:18:11 +09:00
Peter Eisentraut	53eff471c6	PL/Python: Add event trigger support Allow event triggers to be written in PL/Python. It provides a TD dictionary with some information about the event trigger. Author: Euler Taveira <euler@eulerto.com> Co-authored-by: Dimitri Fontaine <dimitri@2ndQuadrant.fr> Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/03f03515-2068-4f5b-b357-8fb540883c38%40app.fastmail.com	2025-08-21 09:21:11 +02:00
Peter Eisentraut	6e09c960eb	PL/Python: Refactor for event trigger support Change is_trigger type from boolean to enum. That's a preparation for adding event trigger support. Author: Euler Taveira <euler@eulerto.com> Co-authored-by: Dimitri Fontaine <dimitri@2ndQuadrant.fr> Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/03f03515-2068-4f5b-b357-8fb540883c38%40app.fastmail.com	2025-08-21 09:16:29 +02:00
Michael Paquier	e8eb98754b	Apply some fat commas to commands of TAP tests This is similar to `19c6e92b13`, in order to keep the style used in the scripts consistent for the option names and values used in commands. The places updated in this commit have been added recently in `71ea0d6795`. These changes are cosmetic; there is no need for a backpatch.	2025-08-21 14:17:26 +09:00
Michael Paquier	b5d87a823f	doc: Improve description of wal_compression The description of this GUC provides a list of the situations where full-page writes are generated. However, it is not completely exact, mentioning only the cases where full_page_writes=on or base backups. It is possible to generate full-page writes in more situations than these two, making the description confusing as it implies that no other cases exist. The description is slightly reworded to take into account that other cases are possible, without mentioning them directly to minimize the maintenance burden should FPWs be generated in more contexts in the future. Author: Jingtang Zhang <mrdrivingduck@gmail.com> Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru> Reviewed-by: Xuneng Zhou <xunengzhou@gmail.com> Discussion: https://postgr.es/m/CAPsk3_CtAYa_fy4p6=x7qtoutrdKvg1kGk46D5fsE=sMt2546g@mail.gmail.com Backpatch-through: 13	2025-08-21 13:25:03 +09:00
Tom Lane	a67d4847a4	Fix re-execution of a failed SQLFunctionCache entry. If we error out during execution of a SQL-language function, we will often leave behind non-null pointers in its SQLFunctionCache's cplan and eslist fields. This is problematic if the SQLFunctionCache is re-used, because those pointers will point at resources that were released during error cleanup. This problem escaped detection so far because ordinarily we won't re-use an FmgrInfo+SQLFunctionCache struct after a query error. However, in the rather improbable case that someone implements an opclass support function in SQL language, there will be long-lived FmgrInfos for it in the relcache, and then the problem is reachable after the function throws an error. To fix, add a flag to SQLFunctionCache that tracks whether execution escapes out of fmgr_sql, and clear out the relevant fields during init_sql_fcache if so. (This is going to need more thought if we ever try to share FMgrInfos across threads; but it's very far from being the only problem such a project will encounter, since many functions regard fn_extra as being query-local state.) This broke at commit 0313c5dc6; before that we did not try to re-use SQLFunctionCache state across calls. Hence, back-patch to v18. Bug: #19026 Reported-by: Alexander Lakhin <exclusion@gmail.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/19026-90aed5e71d0c8af3@postgresql.org Backpatch-through: 18	2025-08-20 16:09:18 -04:00
Peter Eisentraut	e9c043a11a	Minor error message enhancement In refuseDupeIndexAttach(), change from errdetail("Another index is already attached for partition \"%s\"."...) to errdetail("Another index \"%s\" is already attached for partition \"%s\"."...) so we can easily understand which index is already attached for partition \"%s\". Author: Jian He <jian.universality@gmail.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://www.postgresql.org/message-id/flat/CACJufxGBfykJ_1ztk9T%2BL_gLmkOSOF%2BmL9Mn4ZPydz-rh%3DLccQ%40mail.gmail.com	2025-08-20 18:14:24 +02:00
Michael Paquier	1f2e51e3c7	Fix assertion failure with replication slot release in single-user mode Some replication slot manipulations (logical decoding via SQL, advancing) were failing an assertion when releasing a slot in single-user mode, because active_pid was not set in a ReplicationSlot when its slot is acquired. ReplicationSlotAcquire() has some logic to be able to work with the single-user mode. This commit sets ReplicationSlot->active_pid to MyProcPid, to let the slot-related logic fall-through, considering the single process as the one holding the slot. Some TAP tests are added for various replication slot functions with the single-user mode, while on it, for slot creation, drop, advancing, copy and logical decoding with multiple slot types (temporary, physical vs logical). These tests are skipped on Windows, as direct calls of postgres --single would fail on permission failures. There is no platform-specific behavior that needs to be checked, so living with this restriction should be fine. The CI is OK with that, now let's see what the buildfarm tells. Author: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Paul A. Jungwirth <pj@illuminatedcomputing.com> Reviewed-by: Mutaamba Maasha <maasha@gmail.com> Discussion: https://postgr.es/m/OSCPR01MB14966ED588A0328DAEBE8CB25F5FA2@OSCPR01MB14966.jpnprd01.prod.outlook.com Backpatch-through: 13	2025-08-20 15:00:04 +09:00
Fujii Masao	6429e5b771	vacuumdb: Make vacuumdb --analyze-only process partitioned tables. vacuumdb should follow the behavior of the underlying VACUUM and ANALYZE commands. When --analyze-only is used, it ought to analyze regular tables, materialized views, and partitioned tables, just as ANALYZE (with no explicit target tables) does. Otherwise, it should only process regular tables and materialized views, since VACUUM skips partitioned tables when no targets are given. Previously, vacuumdb --analyze-only skipped partitioned tables. This was inconsistent, and also inconvenient after pg_upgrade, where --analyze-only is typically used to gather missing statistics. This commit fixes the behavior so that vacuumdb --analyze-only also processes partitioned tables. As a result, both vacuumdb --analyze-only and ANALYZE (with no explicit targets) now analyze regular tables, partitioned tables, and materialized views, but not foreign tables. Because this is a nontrivial behavior change, it is applied only to master. Reported-by: Zechman, Derek S <Derek.S.Zechman@snapon.com> Author: Laurenz Albe <laurenz.albe@cybertec.at> Co-authored-by: Mircea Cadariu <cadariu.mircea@gmail.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/CO1PR04MB8281387B9AD9DE30976966BBC045A%40CO1PR04MB8281.namprd04.prod.outlook.com	2025-08-20 13:16:06 +09:00
Nathan Bossart	3eec0e6533	Fix comment for MAX_SIMUL_LWLOCKS. This comment mentions that pg_buffercache locks all buffer partitions simultaneously, but it hasn't done so since v10. Oversight in commit `6e654546fb`. Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/aKTuAHVEuYCUmmIy%40nathan	2025-08-19 16:48:22 -05:00
Masahiko Sawada	eab9e4e27c	Add CHECK_FOR_INTERRUPTS in contrib/pg_buffercache functions. This commit adds CHECK_FOR_INTERRUPTS to loops iterating over shared buffers in several pg_buffercache functions, allowing them to be interrupted during long-running operations. Backpatch to all supported versions. Add CHECK_FOR_INTERRUPTS to the loop in pg_buffercache_pages() in all supported branches, and to pg_buffercache_summary() and pg_buffercache_usage_counts() in version 16 and newer. Author: SATYANARAYANA NARLAPURAM <satyanarlapuram@gmail.com> Discussion: https://postgr.es/m/CAHg+QDcejeLx7WunFT3DX6XKh1KshvGKa8F5au8xVhqVvvQPRw@mail.gmail.com Backpatch-through: 13	2025-08-19 12:11:42 -07:00
Nathan Bossart	c6abf24ebf	Fix misspelling of "tranche" in dsa.h. Oversight in commit `bb952c8c8b`. Discussion: https://postgr.es/m/aKOWzsCPgrsoEG1Q%40nathan	2025-08-19 10:43:15 -05:00
Fujii Masao	38c5fbd97e	doc: Improve pgoutput documentation. This commit updates the pgoutput documentation with the following changes: - Specify the data type for each pgoutput option. - Clarify the relationship between proto_version and options such as streaming and two_phase. - Add a note on the use of pg_logical_slot_peek_changes and pg_logical_slot_get_changes with pgoutput. Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com> Discussion: https://postgr.es/m/CAHGQGwFJTbygdhhjR_iP4Oem=Lo1xsptWWOq825uoW+hG_Lfnw@mail.gmail.com	2025-08-19 18:54:27 +09:00
Fujii Masao	34a62c2c7f	doc: Improve documentation discoverability for pgoutput. Previously, the documentation for pgoutput was located in the section on the logical streaming replication protocol, and there was no index entry for it. As a result, users had difficulty finding information about pgoutput. This commit moves the pgoutput documentation under the logical decoding section and adds an index entry, making it easier for users to locate and access this information. Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Euler Taveira <euler@eulerto.com> Discussion: https://postgr.es/m/CAHGQGwFJTbygdhhjR_iP4Oem=Lo1xsptWWOq825uoW+hG_Lfnw@mail.gmail.com	2025-08-19 18:53:56 +09:00
Peter Eisentraut	16d434d53d	Add src/include/catalog/README This just includes a link to the bki documentation, to help people get started. Before commit `372728b0d4`, there was a README at src/backend/catalog/README, but then this was moved to the SGML documentation. So this effectively puts back a link to what was moved. But src/include/catalog/ is probably a better location, because that's where all the interesting files are. Co-authored-by: Florents Tselai <florents.tselai@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CA+v5N400GJFJ9RyXAX7hFKbtF7vVQGvWdFWEfcSQmvVhi9xfrA@mail.gmail.com	2025-08-19 08:41:42 +02:00
Amit Kapila	aa21e49225	Fix self-deadlock during DROP SUBSCRIPTION. The DROP SUBSCRIPTION command performs several operations: it stops the subscription workers, removes subscription-related entries from system catalogs, and deletes the replication slot on the publisher server. Previously, this command acquired an AccessExclusiveLock on pg_subscription before initiating these steps. However, while holding this lock, the command attempts to connect to the publisher to remove the replication slot. In cases where the connection is made to a newly created database on the same server as subscriber, the cache-building process during connection tries to acquire an AccessShareLock on pg_subscription, resulting in a self-deadlock. To resolve this issue, we reduce the lock level on pg_subscription during DROP SUBSCRIPTION from AccessExclusiveLock to RowExclusiveLock. Earlier, the higher lock level was used to prevent the launcher from starting a new worker during the drop operation, as a restarted worker could become orphaned. Now, instead of relying on a strict lock, we acquire an AccessShareLock on the specific subscription being dropped and re-validate its existence after acquiring the lock. If the subscription is no longer valid, the worker exits gracefully. This approach avoids the deadlock while still ensuring that orphan workers are not created. Reported-by: Alexander Lakhin <exclusion@gmail.com> Author: Dilip Kumar <dilipbalaut@gmail.com> Reviewed-by: vignesh C <vignesh21@gmail.com> Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Backpatch-through: 13 Discussion: https://postgr.es/m/18988-7312c868be2d467f@postgresql.org	2025-08-19 05:33:17 +00:00
Michael Paquier	a977e419ee	Refactor ReadMultiXactCounts() into GetMultiXactInfo() This provides a single entry point to access some information about the state of MultiXacts, able to return some data about multixacts offsets and counts. Originally this function was only able to return some information about the number of multixacts and multixact members, extended here to provide some data about the oldest multixact ID in use and the oldest offset, if known. This change has been proposed in a patch that aims at providing more monitoring capabilities for multixacts, and it is useful on its own. GetMultiXactInfo() is added to multixact.h, becoming available for out-of-core code. Extracted from a larger patch by the same author. Author: Naga Appani <nagnrik@gmail.com> Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CA+QeY+AAsYK6WvBW4qYzHz4bahHycDAY_q5ECmHkEV_eB9ckzg@mail.gmail.com	2025-08-19 14:04:09 +09:00
Michael Paquier	9b7eb6f02e	Remove useless pointer update in StatsShmemInit() This pointer was not used after its last update. This variable assignment was most likely a vestigial artifact of the earlier versions of the patch set that led to `5891c7a8ed`. This pointer update is useless, so let's remove it. It removes one call to pgstat_dsa_init_size(), making the code slightly easier to grasp. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://postgr.es/m/aKLsu2sdpnyeuSSc@ip-10-97-1-34.eu-west-3.compute.internal	2025-08-19 09:54:18 +09:00
Richard Guo	bf9ee294e5	Simplify relation_has_unique_index_for() Now that the only call to relation_has_unique_index_for() that supplied an exprlist and oprlist has been removed, the loop handling those lists is effectively dead code. This patch removes that loop and simplifies the function accordingly. Author: Richard Guo <guofenglinux@gmail.com> Discussion: https://postgr.es/m/CAMbWs4-EBnaRvEs7frTLbsXiweSTUXifsteF-d3rvv01FKO86w@mail.gmail.com	2025-08-19 09:37:04 +09:00
Richard Guo	24225ad9aa	Pathify RHS unique-ification for semijoin planning There are two implementation techniques for semijoins: one uses the JOIN_SEMI jointype, where the executor emits at most one matching row per left-hand side (LHS) row; the other unique-ifies the right-hand side (RHS) and then performs a plain inner join. The latter technique currently has some drawbacks related to the unique-ification step. * Only the cheapest-total path of the RHS is considered during unique-ification. This may cause us to miss some optimization opportunities; for example, a path with a better sort order might be overlooked simply because it is not the cheapest in total cost. Such a path could help avoid a sort at a higher level, potentially resulting in a cheaper overall plan. * We currently rely on heuristics to choose between hash-based and sort-based unique-ification. A better approach would be to generate paths for both methods and allow add_path() to decide which one is preferable, consistent with how path selection is handled elsewhere in the planner. * In the sort-based implementation, we currently pay no attention to the pathkeys of the input subpath or the resulting output. This can result in redundant sort nodes being added to the final plan. This patch improves semijoin planning by creating a new RelOptInfo for the RHS rel to represent its unique-ified version. It then generates multiple paths that represent elimination of distinct rows from the RHS, considering both a hash-based implementation using the cheapest total path of the original RHS rel, and sort-based implementations that either exploit presorted input paths or explicitly sort the cheapest total path. All resulting paths compete in add_path(), and those deemed worthy of consideration are added to the new RelOptInfo. Finally, the unique-ified rel is joined with the other side of the semijoin using a plain inner join. 
As a side effect, most of the code related to the JOIN_UNIQUE_OUTER and JOIN_UNIQUE_INNER jointypes -- used to indicate that the LHS or RHS path should be made unique -- has been removed. Besides, the T_Unique path now has the same meaning for both semijoins and upper DISTINCT clauses: it represents adjacent-duplicate removal on presorted input. This patch unifies their handling by sharing the same data structures and functions. This patch also removes the UNIQUE_PATH_NOOP related code along the way, as it is dead code -- if the RHS rel is provably unique, the semijoin should have already been simplified to a plain inner join by analyzejoins.c. Author: Richard Guo <guofenglinux@gmail.com> Reviewed-by: Alexandra Wang <alexandra.wang.oss@gmail.com> Reviewed-by: wenhui qiu <qiuwenhuifx@gmail.com> Discussion: https://postgr.es/m/CAMbWs4-EBnaRvEs7frTLbsXiweSTUXifsteF-d3rvv01FKO86w@mail.gmail.com	2025-08-19 09:35:40 +09:00
Michael Paquier	3c07944d04	test_ddl_deparse: Rename test create_sequence_1 to create_sequence This test was the only one named following the convention used for alternate output files. This was a little bit confusing when looking at the diffs of the test, because one would think that the diffs are based on an uncommon case, as alternate outputs are usually used for uncommon configuration scenarios. create_sequence_1 was the only test in the tree using such a name, and it had no alternate output. Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Discussion: https://postgr.es/m/aKLY6wCa_OInr3kY@paquier.xyz	2025-08-19 09:08:57 +09:00
Michael Paquier	24e71d53f8	Remove unneeded header declarations in multixact.c Two header declarations were related to SQL-callable functions, that should have been cleaned up in `df9133fa63`. Some more includes can be removed on closer inspection, so let's clean up these as well, while on it. Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/345438.1755524834@sss.pgh.pa.us	2025-08-19 08:57:20 +09:00
David Rowley	a98ccf727e	Remove HASH_DEBUG output from dynahash.c This existed in a semi-broken state from `be0a66666` until `296cba276`. Recent discussion has questioned the value of having this at all, as it only outputs static information about various hash table properties when the hash table is created. Author: Hayato Kuroda (Fujitsu) <kuroda.hayato@fujitsu.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/OSCPR01MB1496650D03FA0293AB9C21416F534A@OSCPR01MB14966.jpnprd01.prod.outlook.com	2025-08-19 11:14:21 +12:00
David Rowley	05fcb9667c	Use elog(DEBUG4) for dynahash.c statistics output Previously this was being output to stderr. This commit adjusts things to use elog(DEBUG4). Here we also adjust the format of the message to add the hash table name and also put the message on a single line. This should make grepping the logs for this information easier. Also get rid of the global hash table statistics. This seems very dated and didn't fit very well with trying to put all the statistics for a specific hash table on a single log line. The main aim here is to allow it so we can have at least one buildfarm member build with HASH_STATISTICS to help prevent future changes from breaking things in that area. `ca3891251` recently fixed some issues here. In passing, switch to using uint64 data types rather than longs for the usage counters. The long type is 32 bits on some platforms we support. Author: David Rowley <dgrowleyml@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CAApHDvoccvJ9CG5zx+i-EyCzJbcL5K=CzqrnL_YN59qaL5hiaw@mail.gmail.com	2025-08-19 10:57:44 +12:00
Tom Lane	5e8f05cd70	Fix missing "use Test::More" in Kerberos.pm. Apparently the only Test::More function this script uses is BAIL_OUT, so this omission just results in the wrong error output appearing in the cases where it bails out. Seems to have been an oversight in commit `9f899562d` which split Kerberos.pm out of another script. Author: Maxim Orlov <orlovmg@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CACG=ezY1Dp-S94b78nN0ZuaBGGcMUB6_nF-VyYUwPt1ArFqmGA@mail.gmail.com Backpatch-through: 17	2025-08-18 14:54:59 -04:00
Peter Eisentraut	c61d51d500	Detect buffer underflow in get_th() Input with zero length can result in a buffer underflow when accessing *(num + (len - 1)), as (len - 1) would produce a negative index. Add an assertion for zero-length input to prevent it. This was found by ALT Linux Team. Reviewing the call sites shows that get_th() currently cannot be applied to an empty string: it is always called on a string containing a number we've just printed. Therefore, an assertion rather than a user-facing error message is sufficient. Co-authored-by: Alexander Kuznetsov <kuznetsovam@altlinux.org> Discussion: https://www.postgresql.org/message-id/flat/e22df993-cdb4-4d0a-b629-42211ebed582@altlinux.org	2025-08-18 11:03:22 +02:00
Michael Paquier	df9133fa63	Move SQL-callable code related to multixacts into its own file A patch is under discussion to add more SQL capabilities related to multixacts, and this move avoids bloating the file more than necessary. This affects pg_get_multixact_members(). A side effect of this move is the requirement to add mxstatus_to_string() to multixact.h. Extracted from a larger patch by the same author, tweaked by me. Author: Naga Appani <nagnrik@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Discussion: https://postgr.es/m/CA+QeY+AAsYK6WvBW4qYzHz4bahHycDAY_q5ECmHkEV_eB9ckzg@mail.gmail.com	2025-08-18 14:57:55 +09:00
Peter Eisentraut	4a4038068b	meson: Move C99 test earlier Move the test for compiler options for C99 earlier in meson.build, before we make use of the compiler for other tests. That way, if any command-line options are needed, subsequent tests will also use them. This is at the moment a theoretical problem, but it seems better to get this correct. It also matches the order in the Autoconf-based build more closely. Discussion: https://www.postgresql.org/message-id/flat/01a69441-af54-4822-891b-ca28e05b215a@eisentraut.org	2025-08-18 07:42:39 +02:00
Michael Paquier	ba3d93b2e8	Refactor init_params() in sequence.c to not use FormData_pg_sequence_data init_params() sets up "last_value" and "is_called" for a sequence relation holding its metadata, based on the sequence properties in pg_sequences. "log_cnt" is the third property that can be updated in this routine for FormData_pg_sequence_data, tracking when WAL records should be generated for a sequence after nextval() iterations. This routine is called when creating or altering a sequence. This commit refactors init_params() to no longer depend on FormData_pg_sequence_data, removing traces of it in sequence.c and making the manipulation of sequence-related metadata easier. The knowledge about "log_cnt" is replaced with a more general "reset_state" flag, to let the caller know if the sequence state should be reset. In the case of in-core sequences, this relates to WAL logging. We still need to depend on FormData_pg_sequence. Author: Michael Paquier <michael@paquier.xyz> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Tomas Vondra <tomas@vondra.me> Discussion: https://postgr.es/m/ZWlohtKAs0uVVpZ3@paquier.xyz	2025-08-18 11:38:44 +09:00
Michael Paquier	97ca67377a	Remove md5() call from isolation test for CLUSTER and TOAST This test was failing in environments where MD5 computations are not supported. This switches the test to rely on sha256() instead, providing the same coverage while avoiding the failure. Oversight in `f57e214d1c`. Per buildfarm members gecko, molamola, shikra and froghopper. Discussion: https://postgr.es/m/aKJijS2ZRfRZiYb0@paquier.xyz	2025-08-18 08:18:09 +09:00
Etsuro Fujita	5a8ab650a7	Update obsolete comments in ResultRelInfo struct. Commit `c5b7ba4e6` changed things so that the ri_RootResultRelInfo field of this struct is set for both partitions and inheritance children and used for tuple routing and transition capture (before that commit, it was only set for partitions to route tuples into), but failed to update these comments. Author: Etsuro Fujita <etsuro.fujita@gmail.com> Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com> Discussion: https://postgr.es/m/CAPmGK14NF5CcdCmTZpxrvpvBiT0y4EqKikW1r_wAu1CEHeOmUA%40mail.gmail.com Backpatch-through: 14	2025-08-17 19:40:00 +09:00
Michael Paquier	f57e214d1c	Add isolation test for TOAST value reuse during CLUSTER This test exercises the corner case in toast_save_datum() where CLUSTER operations encounter duplicated TOAST references, reusing the existing TOAST data instead of creating redundant copies. During table rewrites like CLUSTER, both live and recently-dead versions of a row may reference the same TOAST value. When copying the second or later version of such a row, the system checks if a TOAST value already exists in the new TOAST table using toastrel_valueid_exists(). If found, toast_save_datum() sets data_todo = 0 so that redundant data is not stored, ensuring only one copy of the TOAST value exists in the new table. The test relies on a combination of UPDATE, CLUSTER, and checks of the TOAST values used before and after the relation rewrite, to make sure that the same values are reused across the rewrite. This is a continuation of `69f75d6714` to make sure that this corner case keeps working should we mess with this area of the code. Author: Nikhil Kumar Veldanda <veldanda.nikhilkumar17@gmail.com> Discussion: https://postgr.es/m/CAFAfj_E+kw5P713S8_jZyVgQAGVFfzFiTUJPrgo-TTtJJoazQw@mail.gmail.com	2025-08-17 15:20:01 +09:00
Masahiko Sawada	928da6ff12	Fix typos in comments. Oversight in commit `fd5a1a0c3e`. Author: Tender Wang <tndrwang@gmail.com> Discussion: https://postgr.es/m/CAHewXNmTT3M_w4NngG=6G3mdT3iJ6DdncTqV9YnGXBPHW8XYtA@mail.gmail.com	2025-08-16 01:11:40 -07:00
Masahiko Sawada	37265ca01f	Fix constant when extracting timestamp from UUIDv7. When extracting a timestamp from a UUIDv7, a conversion from milliseconds to microseconds was using the incorrect constant NS_PER_US instead of US_PER_MS. Although both constants have the same value, this fix improves code clarity by using the semantically correct constant. Backpatch to v18, where UUIDv7 was introduced. Author: Erik Nordström <erik@tigerdata.com> Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Discussion: https://postgr.es/m/CACAa4V+i07eaP6h4MHNydZeX47kkLPwAg0sqe67R=M5tLdxNuQ@mail.gmail.com Backpatch-through: 18	2025-08-15 11:58:53 -07:00
Peter Eisentraut	2e2e7ff7b8	Fix git whitespace warning Recent changes to src/tools/ci/README triggered warnings like src/tools/ci/README:88: leftover conflict marker Raise conflict-marker-size in .gitattributes to avoid these.	2025-08-15 10:32:35 +02:00
Peter Eisentraut	8212c83939	Add TAP tests for LDAP connection parameter lookup Add TAP tests that test the LDAP Lookup of Connection Parameters functionality in libpq. Prior to this commit, LDAP test coverage only existed for the server-side authentication functionality and for connection service files with parameters directly specified in the file. The tests included here test a pg_service.conf that contains a link to an LDAP system that contains all of the connection parameters. Author: Andrew Jackson <andrewjackson947@gmail.com> Discussion: https://www.postgresql.org/message-id/CAKK5BkHixcivSCA9pfd_eUp7wkLRhvQ6OtGLAYrWC%3Dk7E76LDQ%40mail.gmail.com	2025-08-15 10:17:22 +02:00
David Rowley	296cba2760	Fix invalid format string in HASH_DEBUG code This seems to have been broken back in `be0a66666`. Reported-by: Hayato Kuroda (Fujitsu) <kuroda.hayato@fujitsu.com> Author: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/OSCPR01MB14966E11EEFB37D7857FCEDB7F535A@OSCPR01MB14966.jpnprd01.prod.outlook.com Backpatch-through: 14	2025-08-15 18:05:44 +12:00
David Rowley	ca38912512	Fix failing -D HASH_STATISTICS builds This seems to have been broken for a few years by `cc5ef90ed`. Author: Hayato Kuroda (Fujitsu) <kuroda.hayato@fujitsu.com> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/OSCPR01MB14966E11EEFB37D7857FCEDB7F535A@OSCPR01MB14966.jpnprd01.prod.outlook.com Backpatch-through: 17	2025-08-15 17:23:45 +12:00
David Rowley	b4632883d4	Add Asserts to validate prevbit values in bms_prev_member bms_prev_member() could attempt to access memory outside of the words[] array in cases where the prevbit was a number < -1 or > a->nwords * BITS_PER_BITMAPWORD + 1. Here we add the Asserts to help draw attention to bogus callers so we're more likely to catch them during development. In passing, fix wording of bms_prev_member's header comment which talks about how we expect the callers to ensure only valid prevbit values are used. Author: Greg Burd <greg@burd.me> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/2000A717-1FFE-4031-827B-9330FB2E9065%40getmailspring.com	2025-08-15 16:33:07 +12:00
Michael Paquier	69f75d6714	Add SQL test for TOAST value allocations on rewrite The SQL test added in this commit checks a specific code path that had no coverage until now. When a TOAST datum is rewritten, toast_save_datum() has a dedicated path to make sure that a new value is allocated if it does not exist on the TOAST table yet. This test uses a trick with PLAIN and EXTERNAL storage, with a tuple large enough to be toasted and small enough to fit on a page. It is initially stored in plain mode, and the rewrite forces the tuple to be stored externally. The key point is that there is no value allocated during the initial insert, and that there is one after the rewrite. A second pattern checked is the reuse of the same value across rewrites, using \gset. A set of patches under discussion is messing with this area of the code, so this makes sure that such rewrite cases remain consistent across the board. Author: Nikhil Kumar Veldanda <veldanda.nikhilkumar17@gmail.com> Co-authored-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CAFAfj_E+kw5P713S8_jZyVgQAGVFfzFiTUJPrgo-TTtJJoazQw@mail.gmail.com	2025-08-15 12:30:36 +09:00
Andres Freund	60b64e6a31	ci: Simplify ci-os-only handling Handle 'ci-os-only' occurrences in the .cirrus.star file instead of .cirrus.tasks.yml file. Now, 'ci-os-only' occurrences are controlled from one central place instead of dealing with them in each task. Author: Andres Freund <andres@anarazel.de> Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com> Discussion: https://postgr.es/m/20240413021221.hg53rvqlvldqh57i%40awork3.anarazel.de Backpatch: 15-, where CI support was added	2025-08-14 12:09:34 -04:00
Andres Freund	49cba82bec	ci: Per-repo configuration for manually triggered tasks We do not want to trigger some tasks by default, to avoid using too many compute credits. These tasks have to be manually triggered to be run. But e.g. for cfbot we do have sufficient resources, so we always want to start those tasks. With this commit, an individual repository can be configured to trigger them automatically using an environment variable defined under "Repository Settings", for example: REPO_CI_AUTOMATIC_TRIGGER_TASKS="mingw netbsd openbsd" This will enable cfbot to turn them on by default when running tests for the Commitfest app. Backpatch this back to PG 15, even though PG 15 does not have any manually triggered task. Keeping the CI infrastructure the same seems advantageous. Author: Andres Freund <andres@anarazel.de> Co-authored-by: Thomas Munro <thomas.munro@gmail.com> Co-authored-by: Nazir Bilal Yavuz <byavuz81@gmail.com> Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com> Discussion: https://postgr.es/m/20240413021221.hg53rvqlvldqh57i%40awork3.anarazel.de Backpatch-through: 16	2025-08-14 11:54:03 -04:00
Álvaro Herrera	d0e7e04ede	Avoid including tableam.h and xlogreader.h in nbtree.h Doing that seems rather random and unnecessary. This commit removes those and fixes fallout, which is pretty minimal. We do need to add a forward declaration of struct TM_IndexDeleteOp (whose full definition appears in tableam.h) so that _bt_delitems_delete_check()'s declaration can use it. Author: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://postgr.es/m/202508051109.lzk3lcuzsaxo@alvherre.pgsql	2025-08-14 17:48:46 +02:00
Tom Lane	ed07361721	Don't leak memory during failure exit from SelectConfigFiles(). Make sure the memory allocated by make_absolute_path() is freed when SelectConfigFiles() fails. Since all the callers will exit immediately in that case, there's no practical gain here, but silencing Valgrind leak complaints seems useful. In any case, it was inconsistent that only one of the failure exits did this. Author: Aleksander Alekseev <aleksander@tigerdata.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAJ7c6TMByXE8dc7zDvDWTQjk6o-XXAdRg_RAg5CBaUOgFPV3LQ%40mail.gmail.com	2025-08-14 11:39:19 -04:00
Heikki Linnakangas	4ec6e22b43	Fix LSN format in debug message Commit `2633dae2e4` standardized all existing messages to use `%X/%08X` for LSNs, but this one crept back in after the commit.	2025-08-14 13:31:18 +03:00
Michael Paquier	6304256e79	Fix compilation warning with SerializeClientConnectionInfo() This function uses an argument named "maxsize" that is only used in assertions, being set once outside the assertion area. Because of that, recent gcc versions with -Wunused-but-set-parameter emit a warning when building without assertions enabled. In order to fix this issue, PG_USED_FOR_ASSERTS_ONLY is added to the function argument of SerializeClientConnectionInfo(), which is the first time we are doing so in the tree. The CI is fine with the change, but let's see what the buildfarm has to say on the matter. Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Jacob Champion <jchampion@postgresql.org> Discussion: https://postgr.es/m/pevajesswhxafjkivoq3yvwxga77tbncghlf3gq5fvchsvfuda@6uivg25sb3nx Backpatch-through: 16	2025-08-14 16:21:50 +09:00
Fujii Masao	e9a31c0cc6	Revert logical snapshot filename format change in SnapBuildSnapshotExists(). Commit `2633dae2e4` standardized LSN formatting but mistakenly changed the logical snapshot filename format in SnapBuildSnapshotExists() from "%X-%X.snap" to "%08X-%08X.snap". Other code still used the original "%X-%X.snap" format, causing the replication slot synchronization worker to fail to find existing snapshot files and produce excessive log messages. This commit restores the original "%X-%X.snap" format in SnapBuildSnapshotExists() to resolve the issue. Author: Shveta Malik <shveta.malik@gmail.com> Discussion: https://postgr.es/m/CAHGQGwHuHPB-ucAk_Tq3uSs4Fdziu1Jp_AA_RD3m5Ycky7m48w@mail.gmail.com	2025-08-14 12:33:14 +09:00
Fujii Masao	12f3639ee7	Fix incorrect LSN format in comment. The comment previously used %X/08X, which is wrong. Updated it to the standardized format %X/%08X. Author: Japin Li <japinli@hotmail.com> Discussion: https://postgr.es/m/ME0P300MB0445A37908EFCCD15E6D749DB62BA@ME0P300MB0445.AUSP300.PROD.OUTLOOK.COM	2025-08-14 11:12:03 +09:00
Tom Lane	ee54046601	Grab the low-hanging fruit from forcing USE_FLOAT8_BYVAL to true. Remove conditionally-compiled code for the other case. Replace uses of FLOAT8PASSBYVAL with constant "true", mainly because it was quite confusing in cases where the type we were dealing with wasn't float8. I left the associated pg_control and Pg_magic_struct fields in place. Perhaps we should get rid of them, but it would save little, so it doesn't seem worth thinking hard about the compatibility implications. I just labeled them "vestigial" in places where that seemed helpful. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/1749799.1752797397@sss.pgh.pa.us	2025-08-13 17:18:22 -04:00
Tom Lane	6aebedc384	Grab the low-hanging fruit from forcing sizeof(Datum) to 8. Remove conditionally-compiled code for smaller Datum widths, and simplify comments that describe cases no longer of interest. I also fixed up a few more places that were not using DatumGetIntXX where they should, and made some cosmetic adjustments such as using sizeof(int64) not sizeof(Datum) in places where that fit better with the surrounding code. One thing I remembered while preparing this part is that SP-GiST stores pass-by-value prefix keys as Datums, so that the on-disk representation depends on sizeof(Datum). That's even more unfortunate than the existing commentary makes it out to be, because now there is a hazard that the change of sizeof(Datum) will break SP-GiST indexes on 32-bit machines. It appears that there are no existing SP-GiST opclasses that are actually affected; and if there are some that I didn't find, the number of installations that are using them on 32-bit machines is doubtless tiny. So I'm proceeding on the assumption that we can get away with this, but it's something to worry about. (gininsert.c looks like it has a similar problem, but it's okay because the "tuples" it's constructing are just transient data within the tuplesort step. That's pretty poorly documented though, so I added some comments.) Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/1749799.1752797397@sss.pgh.pa.us	2025-08-13 17:18:22 -04:00
Tom Lane	2a600a93c7	Make type Datum be 8 bytes wide everywhere. This patch makes sizeof(Datum) be 8 on all platforms including 32-bit ones. The objective is to allow USE_FLOAT8_BYVAL to be true everywhere, and in consequence to remove a lot of code that is specific to pass-by-reference handling of float8, int8, etc. The code for abbreviated sort keys can be simplified similarly. In this way we can reduce the maintenance effort involved in supporting 32-bit platforms, without going so far as to actually desupport them. Since Datum is strictly an in-memory concept, this has no impact on on-disk storage, though an initdb or pg_upgrade will be needed to fix affected catalog entries. We have required platforms to support [u]int64 for ages, so this breaks no supported platform. We can expect that this change will make 32-bit builds a bit slower and more memory-hungry, although being able to use pass-by-value handling of 8-byte types may buy back some of that. But we stopped optimizing for 32-bit cases a long time ago, and this seems like just another step on that path. This initial patch simply forces the correct type definition and USE_FLOAT8_BYVAL setting, and cleans up a couple of minor compiler complaints that ensued. This is sufficient for testing purposes. In the wake of a bunch of Datum-conversion cleanups by Peter Eisentraut, this now compiles cleanly with gcc on a 32-bit platform. (I'd only tested the previous version with clang, which it turns out is less picky than gcc about width-changing coercions.) There is a good deal of now-dead code that I'll remove in separate follow-up patches. A catversion bump is required because this affects initial catalog contents (on 32-bit machines) in two ways: pg_type.typbyval changes for some built-in types, and Const nodes in stored views/rules will now have 8 bytes not 4 for pass-by-value types. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/1749799.1752797397@sss.pgh.pa.us	2025-08-13 17:18:22 -04:00
Andres Freund	66f8765c53	ci: windows: Stop using DEBUG:FASTLINK Currently the pdb files for libpq and some other libraries are named the same for the static and shared libraries. That has been the case for a long time, but recently started failing, after an image update started using a newer ninja version. The issue is not itself caused by ninja, but just made visible, as the newer version optimizes the build order and builds the shared libpq earlier than the static library. Previously both static and shared libraries were built at the same time, which prevented msvc from detecting the issue. When using /DEBUG:FASTLINK pdb files cannot be updated, triggering the error. We were using /DEBUG:FASTLINK due to running out of memory in the past, but that was when using container based CI images, rather than full VMs. This isn't really the correct fix (that'd be to deconflict the pdb file names), but we'd like to get CI to become green again, and a proper fix (in meson) will presumably take longer. Suggested-by: Andres Freund <andres@anarazel.de> Author: Nazir Bilal Yavuz <byavuz81@gmail.com> Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAN55FZ1RuBhJmPWs3Oi%3D9UoezDfrtO-VaU67db5%2B0_uy19uF%2BA%40mail.gmail.com Backpatch-through: 16	2025-08-13 15:52:10 -04:00
Andres Freund	377b7ab145	Add very basic test for kill_prior_tuples Previously our tests did not exercise kill_prior_tuples for hash and gist. For gist some related paths were reached, but gist's implementation seems to not work if all the dead tuples are on one page (or something like that). The coverage for other index types was rather incidental. Thus add an explicit test ensuring kill_prior_tuples works at all. Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Discussion: https://postgr.es/m/lxzj26ga6ippdeunz6kuncectr5gfuugmm2ry22qu6hcx6oid6@lzx3sjsqhmt6	2025-08-13 15:17:29 -04:00
Tom Lane	21fddb3d76	Don't treat EINVAL from semget() as a hard failure. It turns out that on some platforms (at least current macOS, NetBSD, OpenBSD) semget(2) will return EINVAL if there is a pre-existing semaphore set with the same key and too few semaphores. Our code expects EEXIST in that case and treats EINVAL as a hard failure, resulting in failure during initdb or postmaster start. POSIX does document EINVAL for too-few-semaphores-in-set, and is silent on its priority relative to EEXIST, so this behavior arguably conforms to spec. Nonetheless it's quite problematic because EINVAL is also documented to mean that nsems is greater than the system's limit on the number of semaphores per set (SEMMSL). If that is where the problem lies, retrying would just become an infinite loop. To resolve this contradiction, retry after EINVAL, but also install a loop limit that will make us give up regardless of the specific errno after trying 1000 different keys. (1000 is a pretty arbitrary number, but it seems like it should be sufficient.) I like this better than the previous infinite-looping behavior, since it will also keep us out of trouble if (say) we get EACCES due to a system-level permissions problem rather than anything to do with a specific semaphore set. This problem has only been observed in the field in PG 17, which uses a higher nsems value than other branches (cf. `38da05346`, `810a8b1c8`). That makes it possible to get the failure if a new v17 postmaster has a key collision with an existing postmaster of another branch. In principle though, we might see such a collision against a semaphore set created by some other application, in which case all branches are vulnerable on these platforms. Hence, backpatch. Reported-by: Gavin Panella <gavinpanella@gmail.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CALL7chmzY3eXHA7zHnODUVGZLSvK3wYCSP0RmcDFHJY8f28Q3g@mail.gmail.com Backpatch-through: 13	2025-08-13 12:00:03 -04:00
Peter Eisentraut	05f506a515	Adjust some table column widths in PDF Make some column widths more pleasing. Note: Some of this relies on the reduced body indents introduced by commit `37e06ba6e8`. Author: Noboru Saito <noborusai@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CAAM3qnLyMUD79XF+SqAVwWCwURCF3hyuFY9Ki9Csbqs-zMwwnw@mail.gmail.com	2025-08-13 17:40:13 +02:00
Peter Eisentraut	37e06ba6e8	Improve PDF documentation margins Set body indent to 0 to make use of the horizontal space better (and some reviewers thought it was also more readable). Add some left and right margin to the warning boxes, otherwise they drift too far off the page in combination with the above change. Author: Noboru Saito <noborusai@gmail.com> Reviewed-by: Hayato Kuroda (Fujitsu) <kuroda.hayato@fujitsu.com> Reviewed-by: Florents Tselai <florents.tselai@gmail.com> Reviewed-by: Tatsuo Ishii <ishii@postgresql.org> Discussion: https://www.postgresql.org/message-id/flat/CAAM3qnLyMUD79XF+SqAVwWCwURCF3hyuFY9Ki9Csbqs-zMwwnw@mail.gmail.com	2025-08-13 15:50:14 +02:00
Peter Eisentraut	8081e54bc5	Clean up order in stylesheet-fo.xsl Make a separate section for release notes customization. Commits `f986882ffd` and `8a6e85b46e` put those into the middle of unrelated things.	2025-08-13 15:09:56 +02:00
Michael Paquier	783cbb6d5e	postgres_fdw: Fix tests with ANALYZE and remote sampling The tests fixed in this commit were changing the sampling setting of a foreign server, but then were analyzing a local table instead of a foreign table, meaning that the test was not running for its original purpose. This commit changes the ANALYZE commands to analyze the foreign table, and changes the foreign table definition to point to a valid remote table. Attempting to analyze the foreign table "analyze_ftable" would have failed before this commit, because "analyze_rtable1" is not defined on the remote side. Issue introduced by `8ad51b5f44`. Author: Corey Huinker <corey.huinker@gmail.com> Discussion: https://postgr.es/m/CADkLM=cpUiJ3QF7aUthTvaVMmgQcm7QqZBRMDLhBRTR+gJX-Og@mail.gmail.com Backpatch-through: 16	2025-08-13 13:11:19 +09:00
Peter Eisentraut	5f19d13dfe	libpq: Set LDAP protocol version 3 Some LDAP servers reject the default version 2 protocol. So set version 3 before starting the connection. This matches how the backend LDAP code has worked all along. Co-authored-by: Andrew Jackson <andrewjackson947@gmail.com> Reviewed-by: Pavel Seleznev <pavel.seleznev@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CAKK5BkHixcivSCA9pfd_eUp7wkLRhvQ6OtGLAYrWC%3Dk7E76LDQ%40mail.gmail.com	2025-08-12 20:56:49 +02:00
Andres Freund	b227b0bb4e	Reduce ExecSeqScan* code size using pg_assume() `fb9f955025` optimized code generation by using specialized variants of ExecSeqScan* for [not] having a qual, projection etc. This allowed the compiler to optimize out the code for qual / projection. However, as observed by David Rowley at the time, the compiler couldn't prove the opposite, i.e. that the qual etc are present. By using pg_assume(), introduced in `d65eb5b1b8`, we can tell the compiler that the relevant variables are non-null. This reduces the code size to a surprising degree and seems to lead to a small but reproducible performance gain. Reviewed-by: Amit Langote <amitlangote09@gmail.com> Discussion: https://postgr.es/m/CA+HiwqFk-MbwhfX_kucxzL8zLmjEt9MMcHi2YF=DyhPrSjsBEA@mail.gmail.com	2025-08-11 15:41:34 -04:00
Andres Freund	01d6832c10	meson: add and use stamp files for generated headers Without using stamp files, meson lists the generated headers as the dependency for every .c file, bloating build.ninja by more than 2x. Processing all the dependencies also increases the time to generate build.ninja. The immediate benefit is that this makes re-configuring and clean builds a bit faster. The main motivation however is that I have other patches that introduce additional build targets that further would increase the size of build.ninja, making re-configuring more noticeably slower. Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com> Discussion: https://postgr.es/m/cgkdgvzdpinkacf4v33mky7tbmk467oda5dd4dlmucjjockxzi@xkqfvjoq4uiy	2025-08-11 15:18:23 -04:00
Nathan Bossart	71ea0d6795	Restrict psql meta-commands in plain-text dumps. A malicious server could inject psql meta-commands into plain-text dump output (i.e., scripts created with pg_dump --format=plain, pg_dumpall, or pg_restore --file) that are run at restore time on the machine running psql. To fix, introduce a new "restricted" mode in psql that blocks all meta-commands (except for \unrestrict to exit the mode), and teach pg_dump, pg_dumpall, and pg_restore to use this mode in plain-text dumps. While at it, encourage users to only restore dumps generated from trusted servers or to inspect them beforehand, since restoring causes the destination to execute arbitrary code of the source superusers' choice. However, the client running the dump and restore needn't trust the source or destination superusers. Reported-by: Martin Rakhmanov Reported-by: Matthieu Denais <litezeraw@gmail.com> Reported-by: RyotaK <ryotak.mail@gmail.com> Suggested-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Noah Misch <noah@leadboat.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Security: CVE-2025-8714 Backpatch-through: 13	2025-08-11 09:00:00 -05:00
Noah Misch	70693c645f	Convert newlines to spaces in names written in v11+ pg_dump comments. Maliciously-crafted object names could achieve SQL injection during restore. CVE-2012-0868 fixed this class of problem at the time, but later work reintroduced three cases. Commit `bc8cd50fef` (back-patched to v11+ in 2023-05 releases) introduced the pg_dump case. Commit `6cbdbd9e8d` (v12+) introduced the two pg_dumpall cases. Move sanitize_line(), unchanged, to dumputils.c so pg_dumpall has access to it in all supported versions. Back-patch to v13 (all supported versions). Reviewed-by: Robert Haas <robertmhaas@gmail.com> Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Backpatch-through: 13 Security: CVE-2025-8715	2025-08-11 06:18:59 -07:00
Dean Rasheed	22424953cd	Fix security checks in selectivity estimation functions. Commit `e2d4ef8de8` (the fix for CVE-2017-7484) added security checks to the selectivity estimation functions to prevent them from running user-supplied operators on data obtained from pg_statistic if the user lacks privileges to select from the underlying table. In cases involving inheritance/partitioning, those checks were originally performed against the child RTE (which for plain inheritance might actually refer to the parent table). Commit `553d2ec271` then extended that to also check the parent RTE, allowing access if the user had permissions on either the parent or the child. It turns out, however, that doing any checks using the child RTE is incorrect, since securityQuals is set to NULL when creating an RTE for an inheritance child (whether it refers to the parent table or the child table), and therefore such checks do not correctly account for any RLS policies or security barrier views. Therefore, do the security checks using only the parent RTE. This is consistent with how RLS policies are applied, and the executor's ACL checks, both of which use only the parent table's permissions/policies. Similar checks are performed in the extended stats code, so update that in the same way, centralizing all the checks in a new function. In addition, note that these checks by themselves are insufficient to ensure that the user has access to the table's data because, in a query that goes via a view, they only check that the view owner has permissions on the underlying table, not that the current user has permissions on the view itself. In the selectivity estimation functions, there is no easy way to navigate from underlying tables to views, so add permissions checks for all views mentioned in the query to the planner startup code. If the user lacks permissions on a view, a permissions error will now be reported at planner-startup, and the selectivity estimation functions will not be run. Checking view permissions at planner-startup in this way is a little ugly, since the same checks will be repeated at executor-startup. Longer-term, it might be better to move all the permissions checks from the executor to the planner so that permissions errors can be reported sooner, instead of creating a plan that won't ever be run. However, such a change seems too far-reaching to be back-patched. Back-patch to all supported versions. In v13, there is the added complication that UPDATEs and DELETEs on inherited target tables are planned using inheritance_planner(), which plans each inheritance child table separately, so that the selectivity estimation functions do not know that they are dealing with a child table accessed via its parent. Handle that by checking access permissions on the top parent table at planner-startup, in the same way as we do for views. Any securityQuals on the top parent table are moved down to the child tables by inheritance_planner(), so they continue to be checked by the selectivity estimation functions. Author: Dean Rasheed <dean.a.rasheed@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Noah Misch <noah@leadboat.com> Backpatch-through: 13 Security: CVE-2025-8713	2025-08-11 09:03:11 +01:00
Thomas Munro	b421223172	Fix rare bug in read_stream.c's split IO handling. The internal queue of buffers could become corrupted in a rare edge case that failed to invalidate an entry, causing a stale buffer to be "forwarded" to StartReadBuffers(). This is a simple fix for the immediate problem. A small API change might be able to remove this and related fragility entirely, but that will have to wait a bit. Defect in commit `ed0b87ca`. Bug: 19006 Backpatch-through: 18 Reported-by: Alexander Lakhin <exclusion@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Xuneng Zhou <xunengzhou@gmail.com> Discussion: https://postgr.es/m/19006-80fcaaf69000377e%40postgresql.org	2025-08-09 13:04:38 +12:00
Tom Lane	665c3dbba4	Mop-up for Datum conversion cleanups. Fix a couple more places where an explicit Datum conversion is needed (not clear how we missed these in `ff89e182d` and previous commits). Replace the minority usage "(Datum) NULL" with "(Datum) 0". The former depends on the assumption that Datum is the same width as Pointer, the latter doesn't. Anyway consistency is a good thing. This is, I believe, the last of the notational mop-up needed before we can consider changing Datum to uint64 everywhere. It's also important cleanup for more aggressive ideas such as making Datum a struct. Discussion: https://postgr.es/m/1749799.1752797397@sss.pgh.pa.us Discussion: https://postgr.es/m/8246d7ff-f4b7-4363-913e-827dadfeb145@eisentraut.org	2025-08-08 18:44:57 -04:00
Peter Eisentraut	ff89e182d4	Add missing Datum conversions Add various missing conversions from and to Datum. The previous code mostly relied on implicit conversions or its own explicit casts instead of using the correct DatumGet() or GetDatum() functions. We think these omissions are harmless. Some actual bugs that were discovered during this process have been committed separately (`80c758a2e1`, `fd2ab03fea`). Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/8246d7ff-f4b7-4363-913e-827dadfeb145%40eisentraut.org	2025-08-08 22:06:57 +02:00
Peter Eisentraut	dcfc0f8912	Remove useless/superfluous Datum conversions Remove useless DatumGetFoo() and FooGetDatum() calls. These are places where no conversion from or to Datum was actually happening. We think these extra calls covered here were harmless. Some actual bugs that were discovered during this process have been committed separately (`80c758a2e1`, `2242b26ce4`). Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/8246d7ff-f4b7-4363-913e-827dadfeb145%40eisentraut.org	2025-08-08 22:06:57 +02:00
Peter Eisentraut	138750dde4	postgres_fdw and dblink should check if backend has MyProcPort before checking ->has_scram_keys. MyProcPort is NULL in background workers. So this could crash for example if a background worker accessed a suitably configured foreign table. Author: Alexander Pyhalov <a.pyhalov@postgrespro.ru> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/27b29a35-9b96-46a9-bc1a-914140869dac%40gmail.com	2025-08-08 19:34:31 +02:00
Jacob Champion	ebaaf386ad	Revert "oauth: Add unit tests for multiplexer handling" Commit `1443b6c0e` introduced buildfarm breakage for Autoconf animals, which expect to be able to run `make installcheck` on the libpq-oauth directory even if libcurl support is disabled. Some other Meson animals complained of a missing -lm link as well. Since this is the day before a freeze, revert for now and come back later. Discussion: https://postgr.es/m/CAOYmi%2BnCkoh3zB%2BGkZad44%3DFNskwUg6F1kmuxqQZzng7Zgj5tw%40mail.gmail.com	2025-08-08 10:16:37 -07:00
Jacob Champion	1443b6c0ea	oauth: Add unit tests for multiplexer handling To better record the internal behaviors of oauth-curl.c, add a unit test suite for the socket and timer handling code. This is all based on TAP and driven by our existing Test::More infrastructure. Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> Discussion: https://postgr.es/m/CAOYmi+nDZxJHaWj9_jRSyf8uMToCADAmOfJEggsKW-kY7aUwHA@mail.gmail.com	2025-08-08 08:45:01 -07:00
Jacob Champion	3e311664e4	oauth: Track total call count during a client flow Tracking down the bugs that led to the addition of comb_multiplexer() and drain_timer_events() was difficult, because an inefficient flow is not visibly different from one that is working properly. To help maintainers notice when something has gone wrong, track the number of calls into the flow as part of debug mode, and print the total when the flow finishes. A new test makes sure the total count is less than 100. (We expect something on the order of 10.) This isn't foolproof, but it is able to catch several regressions in the logic of the prior two commits, and future work to add TLS support to the oauth_validator test server should strengthen it as well. Backpatch-through: 18 Discussion: https://postgr.es/m/CAOYmi+nDZxJHaWj9_jRSyf8uMToCADAmOfJEggsKW-kY7aUwHA@mail.gmail.com	2025-08-08 08:44:56 -07:00
Jacob Champion	1749a12f0d	oauth: Remove expired timers from the multiplexer In a case similar to the previous commit, an expired timer can remain permanently readable if Curl does not remove the timeout itself. Since that removal isn't guaranteed to happen in real-world situations, implement drain_timer_events() to reset the timer before calling into drive_request(). Moving to drain_timer_events() happens to fix a logic bug in the previous caller of timer_expired(), which treated an error condition as if the timer were expired instead of bailing out. The previous implementation of timer_expired() gave differing results for epoll and kqueue if the timer was reset. (For epoll, a reset timer was considered to be expired, and for kqueue it was not.) This didn't previously cause problems, since timer_expired() was only called while the timer was known to be set, but both implementations now use the kqueue logic. Reviewed-by: Thomas Munro <thomas.munro@gmail.com> Backpatch-through: 18 Discussion: https://postgr.es/m/CAOYmi+nDZxJHaWj9_jRSyf8uMToCADAmOfJEggsKW-kY7aUwHA@mail.gmail.com	2025-08-08 08:44:52 -07:00
Jacob Champion	3d9c03429a	oauth: Ensure unused socket registrations are removed If Curl needs to switch the direction of a socket's registration (e.g. from CURL_POLL_IN to CURL_POLL_OUT), it expects the old registration to be discarded. For epoll, this happened via EPOLL_CTL_MOD, but for kqueue, the old registration would remain if it was not explicitly removed by Curl. Explicitly remove the opposite-direction event during registrations. (If that event doesn't exist, we'll just get an ENOENT, which will be ignored by the same code that handles CURL_POLL_REMOVE.) A few assertions are also added to strengthen the relationship between the number of events added, the number of events pulled off the queue, and the lengths of the kevent arrays. Reviewed-by: Thomas Munro <thomas.munro@gmail.com> Backpatch-through: 18 Discussion: https://postgr.es/m/CAOYmi+nDZxJHaWj9_jRSyf8uMToCADAmOfJEggsKW-kY7aUwHA@mail.gmail.com	2025-08-08 08:44:46 -07:00
Jacob Champion	ff5b0824b3	oauth: Remove stale events from the kqueue multiplexer If a socket is added to the kqueue, becomes readable/writable, and subsequently becomes non-readable/writable again, the kqueue itself will remain readable until either the socket registration is removed, or the stale event is cleared via a call to kevent(). In many simple cases, Curl itself will remove the socket registration quickly, but in real-world usage, this is not guaranteed to happen. The kqueue can then remain stuck in a permanently readable state until the request ends, which results in pointless wakeups for the client and wasted CPU time. Implement comb_multiplexer() to call kevent() and unstick any stale events that would cause unnecessary callbacks. This is called right after drive_request(), before we return control to the client to wait. Suggested-by: Thomas Munro <thomas.munro@gmail.com> Co-authored-by: Thomas Munro <thomas.munro@gmail.com> Reviewed-by: Thomas Munro <thomas.munro@gmail.com> Backpatch-through: 18 Discussion: https://postgr.es/m/CAOYmi+nDZxJHaWj9_jRSyf8uMToCADAmOfJEggsKW-kY7aUwHA@mail.gmail.com	2025-08-08 08:44:37 -07:00
Thomas Munro	b5cd74612c	Remove obsolete comment. Remove a comment about potential for AIO in StartReadBuffersImpl(), because that change happened.	2025-08-09 01:46:04 +12:00
Peter Eisentraut	fd2ab03fea	Fix incorrect lack of Datum conversion in _int_matchsel() The code used return (Selectivity) 0.0; where PG_RETURN_FLOAT8(0.0); would be correct. On 64-bit systems, these are pretty much equivalent, but on 32-bit systems, PG_RETURN_FLOAT8() correctly produces a pointer, but the old wrong code would return a null pointer, possibly leading to a crash elsewhere. We think this code is actually not reachable because bqarr_in won't accept an empty query, and there is no other function that will create query_int values. But better be safe and not let such incorrect code lie around. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/8246d7ff-f4b7-4363-913e-827dadfeb145%40eisentraut.org	2025-08-08 12:06:06 +02:00
Etsuro Fujita	9e63f83a7e	Fix oversight in FindTriggerIncompatibleWithInheritance. This function is called from ATExecAttachPartition/ATExecAddInherit, which prevent tables with row-level triggers with transition tables from becoming partitions or inheritance children, to check if there is such a trigger on the given table, but failed to check if a found trigger is row-level, causing the caller functions to needlessly prevent a table with only a statement-level trigger with transition tables from becoming a partition or inheritance child. Repair. Oversight in commit `501ed02cf`. Author: Etsuro Fujita <etsuro.fujita@gmail.com> Discussion: https://postgr.es/m/CAPmGK167mXzwzzmJ_0YZ3EZrbwiCxtM1vogH_8drqsE6PtxRYw%40mail.gmail.com Backpatch-through: 13	2025-08-08 17:35:00 +09:00
Fujii Masao	85ccd7e30a	pg_dump: Fix incorrect parsing of object types in pg_dump --filter. Previously, pg_dump --filter could misinterpret invalid object types in the filter file as valid ones. For example, the invalid object type "table-data" (likely a typo for the valid "table_data") could be mistakenly recognized as "table", causing pg_dump to succeed when it should have failed. This happened because pg_dump identified keywords as sequences of ASCII alphabetic characters, treating non-alphabetic characters (like hyphens) as keyword boundaries. As a result, "table-data" was parsed as "table". To fix this, pg_dump --filter now treats keywords as strings of non-whitespace characters, ensuring invalid types like "table-data" are correctly rejected. Back-patch to v17, where the --filter option was introduced. Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Xuneng Zhou <xunengzhou@gmail.com> Reviewed-by: Srinath Reddy <srinath2133@gmail.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://postgr.es/m/CAHGQGwFzPKUwiV5C-NLBqz1oK1+z9K8cgrF+LcxFem-p3_Ftug@mail.gmail.com Backpatch-through: 17	2025-08-08 14:36:39 +09:00
Etsuro Fujita	62a1211d33	Disallow collecting transition tuples from child foreign tables. Commit `9e6104c66` disallowed transition tables on foreign tables, but failed to account for cases where a foreign table is a child table of a partitioned/inherited table on which transition tables exist, leading to incorrect transition tuples collected from such foreign tables for queries on the parent table triggering transition capture. This occurred not only for inherited UPDATE/DELETE but for partitioned INSERT later supported by commit `3d956d956`, which should have handled it at least for the INSERT case, but didn't. To fix, modify ExecARTriggers to throw an error if the given relation is a foreign table requesting transition capture. Also, this commit fixes make_modifytable so that in case of an inherited UPDATE/DELETE triggering transition capture, FDWs choose normal operations to modify child foreign tables, not DirectModify; which is needed because they would otherwise skip the calls to ExecARTriggers at execution, causing unexpected behavior. Author: Etsuro Fujita <etsuro.fujita@gmail.com> Reviewed-by: Amit Langote <amitlangote09@gmail.com> Discussion: https://postgr.es/m/CAPmGK14QJYikKzBDCe3jMbpGENnQ7popFmbEgm-XTNuk55oyHg%40mail.gmail.com Backpatch-through: 13	2025-08-08 10:50:00 +09:00
Michael Paquier	84b32fd228	Add information about "generation" when dropping twice pgstats entry Dropping twice a pgstats entry should not happen, and the error report generated was missing the "generation" counter (tracking when an entry is reused) that has been added in `818119afcc`. Like `d92573adcb`, backpatch down to v15 where this information is useful to have, to gather more information from instances where the problem shows up. A report has shown that this error path has been reached on a standby based on 17.3, for a relation stats entry and an OID close to wraparound. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://postgr.es/m/CAN4RuQvYth942J2+FcLmJKgdpq6fE5eqyFvb_PuskxF2eL=Wzg@mail.gmail.com Backpatch-through: 15	2025-08-08 09:07:10 +09:00
Jacob Champion	4ae03be547	meson: Fix install-quiet after clean libpq-oauth was missing from the installed_targets list, so $ ninja clean && ninja install-quiet failed with the error message ERROR: File 'src/interfaces/libpq-oauth/libpq-oauth.a' could not be found It seems a little odd to have to tell Meson what's missing, since it clearly knows how to build that file during regular installation. But the "quiet" variant we've created must use --no-rebuild, to avoid spawning concurrent ninja processes that would step on each other. Reported-by: Andres Freund <andres@anarazel.de> Backpatch-through: 18 Discussion: https://postgr.es/m/hbpqdwxkfnqijaxzgdpvdtp57s7gwxa5d6sbxswovjrournlk6%404jnb2gzan4em	2025-08-07 15:31:28 -07:00
Tom Lane	04b7ff3cd3	doc: add float as an alias for double precision. Although the "Floating-Point Types" section says that "float" data type is taken to mean "double precision", this information was not reflected in the data type table that lists all data type aliases. Reported-by: alexander.kjall@hafslund.no Author: Euler Taveira <euler@eulerto.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/175456294638.800.12038559679827947313@wrigleys.postgresql.org Backpatch-through: 13	2025-08-07 18:04:45 -04:00
Dean Rasheed	d699687b32	Extend int128.h to support more numeric code. This adds a few more functions to int128.h, allowing more of numeric.c to use 128-bit integers on all platforms. Specifically, int64_div_fast_to_numeric() and the following aggregate functions can now use 128-bit integers for improved performance on all platforms, rather than just platforms with native support for int128: - SUM(int8) - AVG(int8) - STDDEV_POP(int2 or int4) - STDDEV_SAMP(int2 or int4) - VAR_POP(int2 or int4) - VAR_SAMP(int2 or int4) In addition to improved performance on platforms lacking native 128-bit integer support, this significantly simplifies this numeric code by allowing a lot of conditionally compiled code to be deleted. A couple of numeric functions (div_var_int64() and sqrt_var()) still contain conditionally compiled 128-bit integer code that only works on platforms with native 128-bit integer support. Making those work more generally would require rolling our own higher precision 128-bit division, which isn't supported for now. Author: Dean Rasheed <dean.a.rasheed@gmail.com> Reviewed-by: John Naylor <johncnaylorls@gmail.com> Discussion: https://postgr.es/m/CAEZATCWgBMc9ZwKMYqQpaQz2X6gaamYRB+RnMsUNcdMcL2Mj_w@mail.gmail.com	2025-08-07 15:49:24 +01:00
Peter Eisentraut	0ef891e541	doc: Formatting improvements Small touch-up on commits `25505082f0` and `50fd428b2b`. Fix the formatting of the example messages in the documentation and adjust the wording to match the code.	2025-08-07 14:07:31 +02:00
Alexander Korotkov	466c5435fd	Fix checkpointer shared memory allocation Use Min(NBuffers, MAX_CHECKPOINT_REQUESTS) instead of NBuffers in CheckpointerShmemSize() to match the actual array size limit set in CheckpointerShmemInit(). This prevents wasting shared memory when NBuffers > MAX_CHECKPOINT_REQUESTS. Also, fix the comment. Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/1439188.1754506714%40sss.pgh.pa.us Author: Xuneng Zhou <xunengzhou@gmail.com> Co-authored-by: Alexander Korotkov <aekorotkov@gmail.com>	2025-08-07 14:29:02 +03:00
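The sizing fix in `466c5435fd` amounts to applying the same cap in the size calculation that the initialization code already applies to the array. A minimal sketch follows, assuming a hypothetical request struct and a hypothetical cap value; neither `SketchRequest` nor `SKETCH_MAX_REQUESTS` reflects the real definitions in the PostgreSQL source.

```c
#include <stddef.h>

#define SKETCH_MIN(a, b) ((a) < (b) ? (a) : (b))

/* Hypothetical cap; the real MAX_CHECKPOINT_REQUESTS value is not stated in the log. */
#define SKETCH_MAX_REQUESTS 1000

/* Hypothetical stand-in for one queued checkpointer request. */
typedef struct
{
    int fields[4];
} SketchRequest;

/* Before the fix (as described): size the array by the buffer count alone. */
static size_t
sketch_requests_size_old(int nbuffers)
{
    return (size_t) nbuffers * sizeof(SketchRequest);
}

/* After the fix: never reserve more slots than the array can actually use. */
static size_t
sketch_requests_size_new(int nbuffers)
{
    return (size_t) SKETCH_MIN(nbuffers, SKETCH_MAX_REQUESTS) * sizeof(SketchRequest);
}
```

The two calculations agree while the buffer count is below the cap; above it, only the old one keeps growing, which is the wasted shared memory the commit removes.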
John Naylor	90bfae9f93	Update ICU C++ API symbols Recent ICU versions have added U_SHOW_CPLUSPLUS_HEADER_API, and we need to set this to zero as well to hide the ICU C++ APIs from pg_locale.h Per discussion, we want cpluspluscheck to work cleanly in backbranches, so backpatch both this and its predecessor commit `ed26c4e25a` to all supported versions. Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/1115793.1754414782%40sss.pgh.pa.us Backpatch-through: 13	2025-08-07 17:10:52 +07:00
Peter Eisentraut	1beda2c3cf	pg_upgrade: Improve message indentation Fix commit `f295494d33` to use consistent four-space indentation for verbose messages.	2025-08-07 11:48:43 +02:00
Dean Rasheed	d8a08dbee4	Simplify non-native 64x64-bit multiplication in int128.h. In the non-native code in int128_add_int64_mul_int64(), use signed 64-bit integer multiplication instead of unsigned multiplication for the first three product terms. This simplifies the code needed to add each product term to the result, leading to more compact and efficient code. The actual performance gain is quite modest, but it seems worth it to improve the code's readability. Author: Dean Rasheed <dean.a.rasheed@gmail.com> Reviewed-by: John Naylor <johncnaylorls@gmail.com> Discussion: https://postgr.es/m/CAEZATCWgBMc9ZwKMYqQpaQz2X6gaamYRB+RnMsUNcdMcL2Mj_w@mail.gmail.com	2025-08-07 09:52:30 +01:00
Dean Rasheed	d9bb8ef093	Optimise non-native 128-bit addition in int128.h. On platforms without native 128-bit integer support, simplify the test for carry in int128_add_uint64() by noting that the low-part addition is unsigned integer arithmetic, which is just modular arithmetic. Therefore the test for carry can simply be written as "new value < old value" (i.e., a test for modular wrap-around). This can then be made branchless so that on modern compilers it produces the same machine instructions as native 128-bit addition, making it significantly simpler and faster. Similarly, the test for carry in int128_add_int64() can be written in much the same way, but with an extra term to compensate for the sign of the value being added. Again, on modern compilers this leads to branchless code, often identical to the native 128-bit integer addition machine code. Author: Dean Rasheed <dean.a.rasheed@gmail.com> Reviewed-by: John Naylor <johncnaylorls@gmail.com> Discussion: https://postgr.es/m/CAEZATCWgBMc9ZwKMYqQpaQz2X6gaamYRB+RnMsUNcdMcL2Mj_w@mail.gmail.com	2025-08-07 09:20:02 +01:00
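The carry tests described in `d9bb8ef093` can be sketched as follows. This is an illustration of the technique, not the code in int128.h: the struct layout and the `sketch_` names are hypothetical, and the sign compensation is written here as `- (v < 0)`, which is one possible form of the "extra term" the message mentions.

```c
#include <stdint.h>

/* Hypothetical two-word representation of a signed 128-bit integer. */
typedef struct
{
    uint64_t lo;    /* least significant 64 bits */
    int64_t  hi;    /* most significant 64 bits, carries the sign */
} sketch_int128;

/* Add an unsigned 64-bit value: a carry occurred iff the low word wrapped. */
static inline void
sketch_add_uint64(sketch_int128 *i128, uint64_t v)
{
    uint64_t oldlo = i128->lo;

    i128->lo += v;                      /* modular (wrap-around) arithmetic */
    i128->hi += (i128->lo < oldlo);     /* "new value < old value" means carry */
}

/* Add a signed 64-bit value: same wrap test, minus 1 when v is negative. */
static inline void
sketch_add_int64(sketch_int128 *i128, int64_t v)
{
    uint64_t oldlo = i128->lo;

    i128->lo += (uint64_t) v;
    /* A negative v sign-extends to an all-ones high word, i.e. adds -1 to hi. */
    i128->hi += (i128->lo < oldlo) - (v < 0);
}
```

Both comparisons yield 0 or 1 and feed straight into an addition, so no branch is needed to propagate the carry.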
Michael Paquier	572c0f1b0e	Improve tests of date_trunc() with infinity and unsupported units Commit `d85ce012f9` has added some new error handling code to date_trunc() of timestamp, timestamptz, and interval with infinite values. However, the new test cases added by that commit did not actually test all of the new code, missing coverage for the following cases: 1) For timestamp without time zone: 1-1) infinite value with valid unit 1-2) infinite value with unsupported unit 1-3) finite value with unsupported unit, for a code path older than `d85ce012f9`. 2) For timestamp with time zone, without a time zone specified for the truncation: 2-1) infinite value with valid unit 2-2) infinite value with unsupported unit 2-3) finite value with unsupported unit, for a code path older than `d85ce012f9`. 3) For timestamp with time zone, with a time zone specified for the truncation: 3-1) infinite value with valid unit. 3-2) infinite value with unsupported unit. This commit also provides coverage for the bug fixed in `2242b26ce4`, through cases 2-1) and 3-1), when using an infinite value with a valid unit, with[out] the optional time zone parameter used for the truncation. Author: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/2d320b6f-b4af-4fbc-9eec-5d0fa15d187b@eisentraut.org Discussion: https://postgr.es/m/4bf60a84-2862-4a53-acd5-8eddf134a60e@eisentraut.org Backpatch-through: 18	2025-08-07 11:49:25 +09:00
Michael Paquier	2242b26ce4	Fix incorrect Datum conversion in timestamptz_trunc_internal() The code used a PG_RETURN_TIMESTAMPTZ() where the return type is TimestampTz and not a Datum. On 64-bit systems, there is no effect since this just ends up casting 64-bit integers back and forth. On 32-bit systems, timestamptz is pass-by-reference. PG_RETURN_TIMESTAMPTZ() allocates new memory and returns the address, meaning that the caller could interpret this address as a timestamp value. The effect is that "date_trunc(..., 'infinity'::timestamptz)" will return random values (instead of the correct return value 'infinity'). Bug introduced in commit `d85ce012f9`. Author: Peter Eisentraut <peter@eisentraut.org> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/2d320b6f-b4af-4fbc-9eec-5d0fa15d187b@eisentraut.org Discussion: https://postgr.es/m/4bf60a84-2862-4a53-acd5-8eddf134a60e@eisentraut.org Backpatch-through: 18	2025-08-07 11:02:04 +09:00
Nathan Bossart	9ea3b6f751	Expand usage of macros for protocol characters. This commit makes use of the existing PqMsg_* macros in more places and adds new PqReplMsg_* and PqBackupMsg_* macros for use in special replication and backup messages, respectively. Author: Dave Cramer <davecramer@gmail.com> Co-authored-by: Fabrízio de Royes Mello <fabriziomello@gmail.com> Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Euler Taveira <euler@eulerto.com> Discussion: https://postgr.es/m/aIECfYfevCUpenBT@nathan Discussion: https://postgr.es/m/CAFcNs%2Br73NOUb7%2BqKrV4HHEki02CS96Z%2Bx19WaFgE087BWwEng%40mail.gmail.com	2025-08-06 13:37:00 -05:00
Nathan Bossart	35baa60cc7	Rename transformRelOptions()'s "namspace" parameter to "nameSpace". The name "namspace" looks like a typo, but it was presumably meant to avoid using the "namespace" C++ keyword. This commit renames the parameter to "nameSpace" to prevent future confusion while still avoiding the keyword. Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Discussion: https://postgr.es/m/aJJxpfsDfiQ1VbJ5%40nathan	2025-08-06 12:08:07 -05:00
Fujii Masao	99139c46cb	Fix typo in comment. Author: Chao Li <lic@highgo.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/CD9B2247-617A-4761-8338-2705C8728E2A@highgo.com	2025-08-06 22:52:13 +09:00
Dean Rasheed	5761d991c9	Refactor int128.h, bringing the native and non-native code together. This rearranges the code in src/include/common/int128.h, so that the native and non-native implementations of each function are together inside the function body (as they are in src/include/common/int.h), rather than being in separate parts of the file. This improves readability and maintainability, making it easier to compare the native and non-native implementations, and avoiding the need to duplicate every function comment and declaration. Author: Dean Rasheed <dean.a.rasheed@gmail.com> Reviewed-by: John Naylor <johncnaylorls@gmail.com> Discussion: https://postgr.es/m/CAEZATCWgBMc9ZwKMYqQpaQz2X6gaamYRB+RnMsUNcdMcL2Mj_w@mail.gmail.com	2025-08-06 11:37:07 +01:00
Dean Rasheed	811633105a	Fix printf format specifiers in test_int128 module. Compiler warnings introduced by `8c7445a008`. Author: Dean Rasheed <dean.a.rasheed@gmail.com>	2025-08-06 10:16:14 +01:00
Peter Eisentraut	73d33be4da	Remove INT64_HEX_FORMAT and UINT64_HEX_FORMAT These were introduced (commit `efdc7d7475`) at the same time as we were moving to using the standard inttypes.h format macros (commit `a0ed19e0a9`). It doesn't seem useful to keep a new already-deprecated interface like this with only a few users, so remove the new symbols again and have the callers use PRIx64. (Also, INT64_HEX_FORMAT was kind of a misnomer, since hex formats all use unsigned types.) Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Reviewed-by: Thomas Munro <thomas.munro@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/0ac47b5d-e5ab-4cac-98a7-bdee0e2831e4%40eisentraut.org	2025-08-06 11:08:10 +02:00
Dean Rasheed	8c7445a008	Convert src/tools/testint128.c into a test module. This creates a new test module src/test/modules/test_int128 and moves src/tools/testint128.c into it so that it can be built using the normal build system, allowing the 128-bit integer arithmetic functions in src/include/common/int128.h to be tested automatically. For now, the tests are skipped on platforms that don't have native int128 support. While at it, fix the test128 union in the test code: the "hl" member of test128 was incorrectly defined to be a union instead of a struct, which meant that the tests were only ever setting and checking half of each 128-bit integer value. Author: Dean Rasheed <dean.a.rasheed@gmail.com> Reviewed-by: John Naylor <johncnaylorls@gmail.com> Discussion: https://postgr.es/m/CAEZATCWgBMc9ZwKMYqQpaQz2X6gaamYRB+RnMsUNcdMcL2Mj_w@mail.gmail.com	2025-08-06 09:41:11 +01:00
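The test128 bug fixed in `8c7445a008` comes down to the difference between a union and a struct for the two-halves member. A minimal sketch, using hypothetical type and function names rather than the test module's actual definitions:

```c
#include <stdbool.h>
#include <stdint.h>

/* Broken shape: as a union, hi and lo share the same 8 bytes. */
typedef union
{
    int64_t  hi;
    uint64_t lo;
} sketch_hl_broken;

/* Fixed shape: as a struct, hi and lo occupy separate halves. */
typedef struct
{
    int64_t  hi;
    uint64_t lo;
} sketch_hl_fixed;

/* Writing lo after hi overwrites hi in the union version. */
static bool
sketch_halves_independent_broken(void)
{
    sketch_hl_broken v;

    v.hi = 1;
    v.lo = 2;
    return v.hi == 1;
}

/* In the struct version the two halves keep their own values. */
static bool
sketch_halves_independent_fixed(void)
{
    sketch_hl_fixed v;

    v.hi = 1;
    v.lo = 2;
    return v.hi == 1 && v.lo == 2;
}
```

With the union shape only 64 bits are ever set or compared, which matches the commit's remark that the tests were checking half of each 128-bit value.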
Michael Paquier	225ebfe30a	Add regression test for short varlenas saved in TOAST relations toast_save_datum() has had for a very long time some code able to handle short varlenas (values up to 126 bytes reduced to a 1-byte header), converting such varlenas to an external on-disk TOAST pointer with the value saved uncompressed in the secondary TOAST relation. There was zero coverage for this code path. This commit adds a test able to exercise it, relying on two external attributes, one with a low toast_tuple_target, so that it is possible to trigger the threshold for the insertion of short varlenas into the TOAST relation. Author: Nikhil Kumar Veldanda <veldanda.nikhilkumar17@gmail.com> Co-authored-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/aJAl7-NvIk0kZByz@paquier.xyz	2025-08-06 17:22:03 +09:00
Fujii Masao	0b6aea0384	doc: Recommend ANALYZE after ALTER TABLE ... SET EXPRESSION AS. ALTER TABLE ... SET EXPRESSION AS removes statistics for the target column, so running ANALYZE afterward is recommended. But this was previously not documented, even though a similar recommendation exists for ALTER TABLE ... SET DATA TYPE, which also clears the column's statistics. This commit updates the documentation to include the ANALYZE recommendation for SET EXPRESSION AS. Since v18, virtual generated columns are supported, and these columns never have statistics. Therefore, ANALYZE is not needed after SET DATA TYPE or SET EXPRESSION AS when used on virtual generated columns. This commit also updates the documentation to clarify that ANALYZE is unnecessary in such cases. Back-patch the ANALYZE recommendation for SET EXPRESSION AS to v17 where the feature was introduced, and the note about virtual generated columns to v18 where those columns were added. Author: Yugo Nagata <nagata@sraoss.co.jp> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/20250804151418.0cf365bd2855d606763443fe@sraoss.co.jp Backpatch-through: 17	2025-08-06 16:47:20 +09:00
Masahiko Sawada	b5c53b403c	Suppress maybe-uninitialized warning. Following commit `e035863c9a`, building with -O0 began triggering warnings about potentially uninitialized 'workbuf' usage. While theoretically the initialization isn't necessary since VARDATA() doesn't access the contents of the pointed-to object, this commit explicitly initializes the workbuf variable to suppress the warning. Buildfarm members adder and flaviventris have shown the warning. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAD21AoCOZxfqnNgfM5yVKJZYnOq5m2Q96fBGy1fovEqQ9V4OZA@mail.gmail.com	2025-08-05 15:30:28 -07:00
Tom Lane	80c758a2e1	Fix incorrect return value in brin_minmax_multi_distance_numeric(). The result of "DirectFunctionCall1(numeric_float8, d)" is already in Datum form, but the code was incorrectly applying PG_RETURN_FLOAT8() to it. On machines where float8 is pass-by-reference, this would result in complete garbage, since an unpredictable pointer value would be treated as an integer and then converted to float. It's not entirely clear how much of a problem would ensue on 64-bit hardware, but certainly interpreting a float8 bitpattern as uint64 and then converting that to float isn't the intended behavior. As luck would have it, even the complete-garbage case doesn't break BRIN indexes, since the results are only used to make choices about how to merge values into ranges: at worst, we'd make poor choices resulting in an inefficient index. Doubtless that explains the lack of field complaints. However, users with BRIN indexes that use the numeric_minmax_multi_ops opclass may wish to reindex in hopes of making their indexes more efficient. Author: Peter Eisentraut <peter@eisentraut.org> Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/2093712.1753983215@sss.pgh.pa.us Backpatch-through: 14	2025-08-05 16:51:10 -04:00
Álvaro Herrera	455a040d96	Put PG_TEST_EXTRA doc items back in alphabetical order A few items appear to have been added in random order over the years.	2025-08-05 20:22:32 +02:00
Álvaro Herrera	37fc1803cc	Hide expensive pg_upgrade test behind PG_TEST_EXTRA This new test is very expensive. Make it opt-in. Discussion: https://postgr.es/m/202508051433.ebznuqrxt4b2@alvherre.pgsql	2025-08-05 20:09:42 +02:00
Masahiko Sawada	deb674454c	Add backup_type column to pg_stat_progress_basebackup. This commit introduces a new column backup_type that indicates the type of backup being performed: either 'full' or 'incremental'. Bump catalog version. Author: Shinya Kato <shinya11.kato@gmail.com> Reviewed-by: Yugo Nagata <nagata@sraoss.co.jp> Discussion: https://postgr.es/m/CAOzEurQuzbHwTj1ehk1a+eeQDidJPyrE5s6mYumkjwjZnurhkQ@mail.gmail.com	2025-08-05 10:50:45 -07:00
Jeff Davis	295a39770e	Don't copy datlocale from template unless provider matches. During CREATE DATABASE, if changing the locale provider, require that a new locale is specified rather than trying to reinterpret the template's locale using the new provider. This only affects the behavior when the template uses the builtin provider and CREATE DATABASE specifies the ICU provider without specifying the locale. Previously, that may have succeeded due to loose validation by ICU, whereas now that will cause an error. Because it can cause an error, backport only to unreleased versions. Discussion: https://postgr.es/m/5038b33a6dc639009f4b3d43fa6ae0c5ba9e04f7.camel@j-davis.com Backpatch-through: 18	2025-08-05 09:25:23 -07:00
Tom Lane	f291751ef8	Mop-up for commit `e035863c9`. Neither Peter nor I had tried this with USE_VALGRIND ... Per buildfarm member skink.	2025-08-05 12:11:33 -04:00
Peter Eisentraut	e035863c9a	Convert varatt.h access macros to static inline functions. We've only bothered converting the external interfaces, not the endian-dependent internal macros (which should not be used by any callers other than the interface functions in this header, anyway). The VARTAG_1B_E() changes are required for C++ compatibility. Author: Peter Eisentraut <peter@eisentraut.org> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/928ea48f-77c6-417b-897c-621ef16685a6@eisentraut.org	2025-08-05 17:01:25 +02:00
Peter Eisentraut	0f5ade7a36	Fix varatt versus Datum type confusions Macros like VARDATA() and VARSIZE() should be thought of as taking values of type pointer to struct varlena or some other related struct. The way they are implemented, you can pass anything to it and it will cast it right. But this is in principle incorrect. To fix, add the required DatumGetPointer() calls. Or in a couple of cases, remove superfluous PointerGetDatum() calls. It is planned in a subsequent patch to change macros like VARDATA() and VARSIZE() to inline functions, which will enforce stricter typing. This is in preparation for that. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/928ea48f-77c6-417b-897c-621ef16685a6%40eisentraut.org	2025-08-05 12:11:36 +02:00
Peter Eisentraut	2ad6e80de9	Fix various hash function uses These instances were using Datum-returning functions where a lower-level function returning uint32 would be more appropriate. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/8246d7ff-f4b7-4363-913e-827dadfeb145%40eisentraut.org	2025-08-05 11:47:23 +02:00
Amit Kapila	c9a5860f7a	Throw ERROR when publish_generated_columns is specified without a value. Previously, specifying the publication option 'publish_generated_columns' without an explicit value would incorrectly default to 'stored', which is not the intended behavior. This patch fixes the issue by raising an ERROR when no value is provided for 'publish_generated_columns', ensuring that users must explicitly specify a valid option. Author: Peter Smith <smithpb2250@gmail.com> Reviewed-by: vignesh C <vignesh21@gmail.com> Backpatch-through: 18, where it was introduced Discussion: https://postgr.es/m/CAHut+PsCUCWiEKmB10DxhoPfXbF6jw5RD9ib2LuaQeA_XraW7w@mail.gmail.com	2025-08-05 09:34:22 +00:00
Peter Eisentraut	1469e31297	Fix mixups of FooGetDatum() vs. DatumGetFoo() Some of these were accidentally reversed, but there was no ill effect. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/8246d7ff-f4b7-4363-913e-827dadfeb145%40eisentraut.org	2025-08-05 10:53:49 +02:00
Melanie Plageman	6551a05d9c	Minor test fixes in 035_standby_logical_decoding.pl Import usleep, which, due to an oversight in commit `48796a98d5`, was used but not imported. Correct the comparison string used in two logfile checks. Previously, it was incorrect and thus the test could never have failed. Also wordsmith a comment to make it clear when hot_standby_feedback is meant to be on during the test scenarios. Reported-by: Melanie Plageman <melanieplageman@gmail.com> Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Discussion: https://postgr.es/m/flat/CAAKRu_YO2mEm%3DZWZKPjTMU%3DgW5Y83_KMi_1cr51JwavH0ctd7w%40mail.gmail.com Backpatch-through: 16	2025-08-04 15:07:32 -04:00
Dean Rasheed	88f0fdabea	Fix typo in create_index.sql. Introduced by `578b229718`. Author: Dean Rasheed <dean.a.rasheed@gmail.com> Reviewed-by: Tender Wang <tndrwang@gmail.com> Discussion: https://postgr.es/m/CAEZATCV_CzRSOPMf1gbHQ7xTmyrV6kE7ViCBD6B81WF7GfTAEA@mail.gmail.com Backpatch-through: 13	2025-08-04 16:18:59 +01:00
Andrew Dunstan	4e23c9ef65	Split func.sgml into more manageable pieces func.sgml has grown over the years to the point where it is very difficult to manage. This commit splits out each sect1 piece into its own file, which is then included in the main file, so that the built documentation should be identical to the pre-split documentation. All these new files are placed in a new "func" subdirectory, and the previous func.sgml is removed. Done using scripts developed by: Author: jian he <jian.universality@gmail.com> Discussion: https://postgr.es/m/CACJufxFgAh1--EMwOjMuANe=VTmjkNaZjH+AzSe04-8ZCGiESA@mail.gmail.com	2025-08-04 09:04:56 -04:00
Peter Eisentraut	6ae268cf28	Improve prep_buildtree When prep_buildtree is used to prepare a build tree when the source directory already contains another build tree, then it will produce the directory structure of the first build tree in the second one. For example, if there is postgresql/ postgresql/build1/ and a new build tree postgresql/build2/ is prepared, then this will produce postgresql/build2/build1/ because it just copies all subdirectories of the source tree. This is not harmful, but it's pretty stupid and can be confusing, and it slows down prep_buildtree when there are many build trees. When prep_buildtree was first created, it was more common for the build tree to be outside the source tree, in which case this is not a problem. But now with the arrival of meson, it appears to be more common (and also the way it is documented in the PostgreSQL documentation) to have the build tree inside the source tree. (To be clear: This change does not affect meson at all. But it would be an issue for example if you have a meson build tree and a configure build tree under the same source tree.) To fix this, change the "find" command to process only those top-level directories that we know about (namely config, contrib, doc, src). (I alternatively looked for ways to ignore directories that looked like build directories, but that seemed extremely complicated.) With that, we can also remove the code that ignores directories related to source-control management. In passing, also remove the workaround for handling prebuilt docs, since that has been obsolete since commit `54fac0e505`. Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/8b96b07f-1f48-46e9-b26e-01b2c9e4ac8d%40eisentraut.org	2025-08-04 14:06:58 +02:00
Álvaro Herrera	07684443b1	Rename XLogData protocol message to WALData This name is only used as documentation, and using this name is consistent with its byte being a 'w'. Renaming it would also make the use of a symbolic name based on the word "WAL" rather than the obsolete "XLog" term more consistent, per future commits along the lines of `37c7a7eeb6`, `4a68d50088`, `f4b54e1ed9`. Discussion: https://postgr.es/m/aIECfYfevCUpenBT@nathan	2025-08-04 14:03:01 +02:00
Fujii Masao	4614d53d4e	Avoid unexpected shutdown when sync_replication_slots is enabled. Previously, enabling sync_replication_slots while wal_level was not set to logical could cause the server to shut down. This was because the postmaster performed a configuration check before launching the slot synchronization worker and raised an ERROR if the settings were incompatible. Since ERROR is treated as FATAL in the postmaster, this resulted in the entire server shutting down unexpectedly. This commit changes the postmaster to log that message with a LOG-level instead of raising an ERROR, allowing the server to continue running even with the misconfiguration. Back-patch to v17, where slot synchronization was introduced. Reported-by: Hugo DUBOIS <hdubois@scaleway.com> Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Hugo DUBOIS <hdubois@scaleway.com> Reviewed-by: Shveta Malik <shveta.malik@gmail.com> Discussion: https://postgr.es/m/CAH0PTU_pc3oHi__XESF9ZigCyzai1Mo3LsOdFyQA4aUDkm01RA@mail.gmail.com Backpatch-through: 17	2025-08-04 20:51:42 +09:00
Álvaro Herrera	126665289f	doc: mention unusability of dropped CHECK to verify NOT NULL It's possible to use a CHECK (col IS NOT NULL) constraint to skip scanning a table for nulls when adding a NOT NULL constraint on the same column. However, if the CHECK constraint is dropped on the same command that the NOT NULL is added, this fails, i.e., makes the NOT NULL addition slow. The best we can do about it at this stage is to document this so that users aren't taken by surprise. (In Postgres 18 you can directly add the NOT NULL constraint as NOT VALID instead, so there's no longer much use for the CHECK constraint, therefore no point in building mechanism to support the case better.) Reported-by: Andrew <psy2000usa@yahoo.com> Reviewed-by: David G. Johnston <david.g.johnston@gmail.com> Discussion: https://postgr.es/m/175385113607.786.16774570234342968908@wrigleys.postgresql.org	2025-08-04 13:26:45 +02:00
David Rowley	bca9a1900c	Fix incorrect comment regarding mod_since_analyze Author: Yugo Nagata <nagata@sraoss.co.jp> Discussion: https://postgr.es/m/20250804140120.280c2d6a9d2ea687cd167743@sraoss.co.jp	2025-08-04 17:43:22 +12:00
Amit Kapila	fd5a1a0c3e	Detect and report update_deleted conflicts. This enhancement builds upon the infrastructure introduced in commit `228c370868`, which enables the preservation of deleted tuples and their origin information on the subscriber. This capability is crucial for handling concurrent transactions replicated from remote nodes. The update introduces support for detecting update_deleted conflicts during the application of update operations on the subscriber. When an update operation fails to locate the target row, typically because it has been concurrently deleted, we perform an additional table scan. This scan uses the SnapshotAny mechanism and we do this additional scan only when the retain_dead_tuples option is enabled for the relevant subscription. The goal of this scan is to locate the most recently deleted tuple (matching the old column values from the remote update) that has not yet been removed by VACUUM and is still visible according to our slot (i.e., its deletion is not older than conflict-detection-slot's xmin). If such a tuple is found, the system reports an update_deleted conflict, including the origin and transaction details responsible for the deletion. This provides the groundwork for a more robust and accurate conflict resolution process, preventing unexpected behavior by correctly identifying cases where a remote update clashes with a deletion from another origin. Author: Zhijie Hou <houzj.fnst@fujitsu.com> Reviewed-by: shveta malik <shveta.malik@gmail.com> Reviewed-by: Nisha Moond <nisha.moond412@gmail.com> Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com> Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/OS0PR01MB5716BE80DAEB0EE2A6A5D1F5949D2@OS0PR01MB5716.jpnprd01.prod.outlook.com	2025-08-04 04:02:47 +00:00
Tom Lane	5c8eda1f72	Take a little more care in set_backtrace(). Coverity complained that the "errtrace" string is leaked if we return early because backtrace_symbols fails. Another criticism that could be leveled at this is that not providing any hint of what happened is user-unfriendly. Fix that. The odds of a leak here are small, and typically it wouldn't matter anyway since the leak will be in ErrorContext which will soon get reset. So I'm not feeling a need to back-patch.	2025-08-03 13:01:17 -04:00
Tom Lane	4fbfdde58e	Avoid leakage of zero-length arrays in partition_bounds_copy(). If ndatums is zero, the code would allocate zero-length boundKinds and boundDatums chunks, which would have nothing pointing to them, leading to Valgrind complaints. Rearrange the code to avoid the useless pallocs, and also to not bother computing byval/typlen when they aren't used. I'm unsure why I didn't see this in my Valgrind testing back in May. This code hasn't changed since then, but maybe we added a regression test that reaches this edge case. Or possibly I just failed to notice the reports, which do say "0 bytes lost". Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/285483.1746756246@sss.pgh.pa.us	2025-08-02 21:59:46 -04:00
Tom Lane	b102c8c473	Silence complaints about leaks in PlanCacheComputeResultDesc. CompleteCachedPlan intentionally doesn't worry about small leaks from PlanCacheComputeResultDesc. However, Valgrind knows nothing of engineering tradeoffs and complains anyway. Silence it by doing things the hard way if USE_VALGRIND. I don't really love this patch, because it makes the handling of plansource->resultDesc different from the handling of query dependencies and search_path just above, which likewise are willing to accept small leaks into the cached plan's context. However, those cases aren't provoking Valgrind complaints. (Perhaps in a CLOBBER_CACHE_ALWAYS build, they would?) For the moment, this makes the src/pl/plpgsql tests leak-free according to Valgrind. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/285483.1746756246@sss.pgh.pa.us	2025-08-02 21:59:46 -04:00
Tom Lane	7f6ededa76	Suppress complaints about leaks in TS dictionary loading. Like the situation with function cache loading, text search dictionary loading functions tend to leak some cruft into the dictionary's long-lived cache context. To judge by the examples in the core regression tests, not very many bytes are at stake. Moreover, I don't see a way to prevent such leaks without changing the API for TS template initialization functions: right now they do not have to worry about making sure that their results are long-lived. Hence, I think we should install a suppression rule rather than trying to fix this completely. However, I did grab some low-hanging fruit: several places were leaking the result of get_tsearch_config_filename. This seems worth doing mostly because they are inconsistent with other dictionaries that were freeing it already. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/285483.1746756246@sss.pgh.pa.us	2025-08-02 21:59:46 -04:00
Tom Lane	2c7b4ad24d	Suppress complaints about leaks in function cache loading. PL/pgSQL and SQL-function parsing leak some stuff into the long-lived function cache context. This isn't really a huge practical problem, since it's not a large amount of data and the cruft will be recovered if we have to re-parse the function. It's not clear that it's worth working any harder than the previous patch did to eliminate these leak complaints, so instead silence them with a suppression rule. This suppression rule also hides the fact that CachedFunction structs are intentionally leaked in some cases because we're unsure if any fn_extra pointers remain. That might be nice to do something about eventually, but it's not clear how. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/285483.1746756246@sss.pgh.pa.us	2025-08-02 21:59:46 -04:00
Tom Lane	9f18fa9995	Reduce leakage during PL/pgSQL function compilation. format_procedure leaks memory, so run it in a short-lived context not the session-lifespan cache context for the PL/pgSQL function. parse_datatype called the core parser in the function's cache context, thus leaking potentially a lot of storage into that context. We were also being a bit careless with the TypeName structures made in that code path and others. Most of the time we don't need to retain the TypeName, so make sure it is made in the short-lived temp context, and copy it only if we do need to retain it. These are far from the only leaks in PL/pgSQL compilation, but they're the biggest as far as I've seen, and further improvement looks like it'd require delicate and bug-prone surgery. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/285483.1746756246@sss.pgh.pa.us	2025-08-02 21:59:46 -04:00
Tom Lane	db01c90b2f	Silence Valgrind leakage complaints in more-or-less-hackish ways. These changes don't actually fix any leaks. They just make sure that Valgrind will find pointers to data structures that remain allocated at process exit, and thus not falsely complain about leaks. In particular, we are trying to avoid situations where there is no pointer to the beginning of an allocated block (except possibly within the block itself, which Valgrind won't count). * Because dynahash.c never frees hashtable storage except by deleting the whole hashtable context, it doesn't bother to track the individual blocks of elements allocated by element_alloc(). This results in "possibly lost" complaints from Valgrind except when the first element of each block is actively in use. (Otherwise it'll be on a freelist, but very likely only reachable via "interior pointers" within element blocks, which doesn't satisfy Valgrind.) To fix, if we're building with USE_VALGRIND, expend an extra pointer's worth of space in each element block so that we can chain them all together from the HTAB header. Skip this in shared hashtables though: Valgrind doesn't track those, and we'd need additional locking to make it safe to manipulate a shared chain. While here, update a comment obsoleted by `9c911ec06`. * Put the dlist_node fields of catctup and catclist structs first. This ensures that the dlist pointers point to the starts of these palloc blocks, and thus that Valgrind won't consider them "possibly lost". * The postmaster's PMChild structs and the autovac launcher's avl_dbase structs also have the dlist_node-is-not-first problem, but putting it first still wouldn't silence the warning because we bulk-allocate those structs in an array, so that Valgrind sees a single allocation. Commonly the first array element will be pointed to only from some later element, so that the reference would be an interior pointer even if it pointed to the array start. 
(This is the same issue as for dynahash elements.) Since these are pretty simple data structures, I don't feel too bad about faking out Valgrind by just keeping a static pointer to the array start. (This is all quite hacky, and it's not hard to imagine usages where we'd need some other idea in order to have reasonable leak tracking of structures that are only accessible via dlist_node lists. But these changes seem to be enough to silence this class of leakage complaints for the moment.) * Free a couple of data structures manually near the end of an autovacuum worker's run when USE_VALGRIND, and ensure that the final vac_update_datfrozenxid() call is done in a non-permanent context. This doesn't have any real effect on the process's total memory consumption, since we're going to exit as soon as that last transaction is done. But it does pacify Valgrind. * Valgrind complains about the postmaster's socket-files and lock-files lists being leaked, which we can silence by just not nulling out the static pointers to them. * Valgrind seems not to consider the global "environ" variable as a valid root pointer; so when we allocate a new environment array, it claims that data is leaked. To fix that, keep our own statically-allocated copy of the pointer, similarly to the previous item. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/285483.1746756246@sss.pgh.pa.us	2025-08-02 21:59:46 -04:00
Tom Lane	e78d1d6d47	Fix assorted pretty-trivial memory leaks in the backend. In the current system architecture, none of these are worth obsessing over; most are once-per-process leaks. However, Valgrind complains about all of them, and if we get to using threads rather than processes for backend sessions, it will become more interesting to avoid per-session leaks. * Fix leaks in StartupXLOG() and ShutdownWalRecovery(). * Fix leakage of pq_mq_handle in a parallel worker. While at it, move mq_putmessage's "Assert(pq_mq_handle != NULL)" to someplace where it's not trivially useless. * Fix leak in logicalrep_worker_detach(). * Don't leak the startup-packet buffer in ProcessStartupPacket(). * Fix leak in evtcache.c's DecodeTextArrayToBitmapset(). If the presented array is toasted, this neglected to free the detoasted copy, which was then leaked into EventTriggerCacheContext. * I'm distressed by the amount of code that BuildEventTriggerCache is willing to run while switched into a long-lived cache context. Although the detoasted array is the only leak that Valgrind reports, let's tighten things up while we're here. (DecodeTextArrayToBitmapset is still run in the cache context, so doing this doesn't remove the need for the detoast fix. But it reduces the surface area for other leaks.) * load_domaintype_info() intentionally leaked some intermediate cruft into the long-lived DomainConstraintCache's memory context, reasoning that the amount of leakage will typically not be much so it's not worth doing a copyObject() of the final tree to avoid that. But Valgrind knows nothing of engineering tradeoffs and complains anyway. On the whole, the copyObject doesn't cost that much and this is surely not a performance-critical code path, so let's do it the clean way. * MarkGUCPrefixReserved didn't bother to clean up removed placeholder GUCs at all, which shows up as a leak in one regression test. 
It seems appropriate for it to do as much cleanup as define_custom_variable does when replacing placeholders, so factor that code out into a helper function. define_custom_variable's logic was one brick shy of a load too: it forgot to free the separate allocation for the placeholder's name. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Richard Guo <guofenglinux@gmail.com> Discussion: https://postgr.es/m/285483.1746756246@sss.pgh.pa.us	2025-08-02 21:59:46 -04:00
Tom Lane	9e9190154e	Fix MemoryContextAllocAligned's interaction with Valgrind. Arrange that only the "aligned chunk" part of the allocated space is included in a Valgrind vchunk. This suppresses complaints about that vchunk being possibly lost because PG is retaining only pointers to the aligned chunk. Also make sure that trailing wasted space is marked NOACCESS. As a tiny performance improvement, arrange that MCXT_ALLOC_ZERO zeroes only the returned "aligned chunk", not the wasted padding space. In passing, fix GetLocalBufferStorage to use MemoryContextAllocAligned instead of rolling its own implementation, which was equally broken according to Valgrind. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/285483.1746756246@sss.pgh.pa.us	2025-08-02 21:59:46 -04:00
Tom Lane	bb049a79d3	Improve our support for Valgrind's leak tracking. When determining whether an allocated chunk is still reachable, Valgrind will consider only pointers within what it believes to be allocated chunks. Normally, all of a block obtained from malloc() would be considered "allocated" --- but it turns out that if we use VALGRIND_MEMPOOL_ALLOC to designate sub-section(s) of a malloc'ed block as allocated, all the rest of that malloc'ed block is ignored. This leads to lots of false positives of course. In particular, in any multi-malloc-block context, all but the primary block were reported as leaked. We also had a problem with context "ident" strings, which were reported as leaked unless there was some other pointer to them besides the one in the context header. To fix, we need to use VALGRIND_MEMPOOL_ALLOC to designate a context's management structs (the context struct itself and any per-block headers) as allocated chunks. That forces moving the VALGRIND_CREATE_MEMPOOL/VALGRIND_DESTROY_MEMPOOL calls into the per-context-type code, so that the pool identifier can be made as soon as we've allocated the initial block, but otherwise it's fairly straightforward. Note that in Valgrind's eyes there is no distinction between these allocations and the allocations that the mmgr modules hand out to user code. That's fine for now, but perhaps someday we'll want to do better yet. When reading this patch, it's helpful to start with the comments added at the head of mcxt.c. Author: Andres Freund <andres@anarazel.de> Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/285483.1746756246@sss.pgh.pa.us Discussion: https://postgr.es/m/20210317181531.7oggpqevzz6bka3g@alap3.anarazel.de	2025-08-02 21:59:46 -04:00
Fujii Masao	12efa48978	Fix assertion failure in pgbench when handling multiple pipeline sync messages. Previously, when running pgbench in pipeline mode with a custom script that triggered retriable errors (e.g., serialization errors), an assertion failure could occur: Assertion failed: (res == ((void*)0)), function discardUntilSync, file pgbench.c, line 3515. The root cause was that pgbench incorrectly assumed only a single pipeline sync message would be received at the end. In reality, multiple pipeline sync messages can be sent and must be handled properly. This commit fixes the issue by updating pgbench to correctly process multiple pipeline sync messages, preventing the assertion failure. Back-patch to v15, where the bug was introduced. Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Tatsuo Ishii <ishii@postgresql.org> Discussion: https://postgr.es/m/CAHGQGwFAX56Tfx+1ppo431OSWiLLuW72HaGzZ39NkLkop6bMzQ@mail.gmail.com Backpatch-through: 15	2025-08-03 10:49:03 +09:00
Jeff Davis	6a46089e45	Simplify options in pg_dump and pg_restore. Remove redundant options --with-data and --with-schema, and rename --with-statistics to just --statistics. Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/f379d0aeefe8effe13302a436bc28f549f09e924.camel@j-davis.com Backpatch-through: 18	2025-08-02 07:51:42 -07:00
Michael Paquier	2106fe25a1	Fix typo in foreign_key.sql Introduced by `eec0040c4b`. Author: Chao Li <lic@highgo.com> Discussion: https://postgr.es/m/CAEoWx2kKMdtWKQiYNuwG2L41YwHA7G3sUsRfD9esPJwZyX1+Eg@mail.gmail.com Backpatch-through: 18	2025-08-02 19:54:23 +09:00
Etsuro Fujita	37e7744585	Doc: clarify the restrictions of AFTER triggers with transition tables. It was not very clear that the triggers are only allowed on plain tables (not foreign tables). Also, rephrase the documentation for better readability. Follow up to commit `9e6104c66`. Reported-by: Etsuro Fujita <etsuro.fujita@gmail.com> Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-by: Etsuro Fujita <etsuro.fujita@gmail.com> Discussion: https://postgr.es/m/CAPmGK16XBs9ptNr8Lk4f-tJZogf6y-Prz%3D8yhvJbb_4dpsc3mQ%40mail.gmail.com Backpatch-through: 13	2025-08-02 18:30:00 +09:00
Michael Paquier	3b3fa94900	Fix use-after-free with INSERT ON CONFLICT changes in reorderbuffer.c In ReorderBufferProcessTXN(), used to send the data of a transaction to an output plugin, INSERT ON CONFLICT changes (INTERNAL_SPEC_INSERT) are delayed until a confirmation record arrives (INTERNAL_SPEC_CONFIRM), updating the change being processed. `8c58624df4` has added an extra step after processing a change to update the progress of the transaction, by calling the callback update_progress_txn() based on the LSN stored in a change after a threshold of CHANGES_THRESHOLD (100) is reached. This logic has missed the fact that for an INSERT ON CONFLICT change the data is freed once processed, hence update_progress_txn() could be called pointing to an LSN value that's already been freed. This could result in random crashes, depending on the workload. Per discussion, this issue is fixed by reusing in update_progress_txn() the LSN from the change processed found at the beginning of the loop, meaning that for an INTERNAL_SPEC_CONFIRM change the progress is updated using the LSN of the INTERNAL_SPEC_CONFIRM change, and not the LSN from its INTERNAL_SPEC_INSERT change. This is actually more correct, as we want to update the progress to point to the INTERNAL_SPEC_CONFIRM change. Masahiko Sawada has found a nice trick to reproduce the issue: hardcode CHANGES_THRESHOLD at 1 and run test_decoding (test "ddl" being enough) on an instance running valgrind. The bug has been analyzed by Ethan Mertz, who also originally suggested the solution used in this patch. Issue introduced by `8c58624df4`, so backpatch down to v16. Author: Ethan Mertz <ethan.mertz@gmail.com> Co-authored-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Discussion: https://postgr.es/m/aIsQqDZ7x4LAQ6u1@paquier.xyz Backpatch-through: 16	2025-08-02 17:08:45 +09:00
Nathan Bossart	9eb6068fb6	Allow resetting unknown custom GUCs with reserved prefixes. Currently, ALTER DATABASE/ROLE/SYSTEM RESET [ALL] with an unknown custom GUC with a prefix reserved by MarkGUCPrefixReserved() errors (unless a superuser runs a RESET ALL variant). This is problematic for cases such as an extension library upgrade that removes a GUC. To fix, simply make sure the relevant code paths explicitly allow it. Note that we require superuser or privileges on the parameter to reset it. This is perhaps a bit more restrictive than is necessary, but it's not clear whether further relaxing the requirements is safe. Oversight in commit `88103567cb`. The ALTER SYSTEM fix is dependent on commit `2d870b4aef`, which first appeared in v17. Unfortunately, back-patching that commit would introduce ABI breakage, and while that breakage seems unlikely to bother anyone, it doesn't seem worth the risk. Hence, the ALTER SYSTEM part of this commit is omitted on v15 and v16. Reported-by: Mert Alev <mert@futo.org> Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at> Discussion: https://postgr.es/m/18964-ba09dea8c98fccd6%40postgresql.org Backpatch-through: 15	2025-08-01 16:52:11 -05:00
Masahiko Sawada	a2c6c4ed31	Fix typo in AutoVacLauncherMain(). Author: Yugo Nagata <nagata@sraoss.co.jp> Discussion: https://postgr.es/m/20250802002027.cd35c481f6c6bae7ca2a3e26@sraoss.co.jp	2025-08-01 18:02:41 +00:00
Jeff Davis	0ed92cf50c	pg_dump: reject combination of "only" and "with" Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/8ce896d1a05040905cc1a3afbc04e94d8e95669a.camel@j-davis.com Backpatch-through: 18	2025-08-01 10:06:57 -07:00
Heikki Linnakangas	a4801eb691	libpq: Complain about missing BackendKeyData later with PQgetCancel() PostgreSQL always sends the BackendKeyData message at connection startup, but there are some third party backend implementations out there that don't support cancellation, and don't send the message [1]. While the protocol docs left it up for interpretation if that is valid behavior, libpq in PostgreSQL 17 and below accepted it. It does not seem like the libpq behavior was intentional though, since it did so by sending CancelRequest messages with all zeros to such servers (instead of returning an error or making the cancel a no-op). In version 18 the behavior was changed to return an error when trying to create the cancel object with PQgetCancel() or PQcancelCreate(). This was done without any discussion, as part of supporting different lengths of cancel packets for the new 3.2 version of the protocol. This commit changes the behavior of PQgetCancel() / PQcancel() once more to only return an error when the cancel object is actually used to send a cancellation, instead of when merely creating the object. The reason to do so is that some clients [2] create a cancel object as part of their connection creation logic (thus having the cancel object ready for later when they need it), so if creating the cancel object returns an error, the whole connection attempt fails. By delaying the error, such clients will still be able to connect to the third party backend implementations in question, but when actually trying to cancel a query, the user will be notified that that is not possible for the server that they are connected to. This commit only changes the behavior of the older PQgetCancel() / PQcancel() functions, not the more modern PQcancelCreate() family of functions. I.e. PQcancelCreate() returns a failed connection object (CONNECTION_BAD) if the server didn't send a cancellation key. 
Unlike the old PQgetCancel() function, we're not aware of any clients in the field that use PQcancelCreate() during connection startup in a way that would prevent connecting to such servers. [1] AWS RDS Proxy is definitely one of them, and CockroachDB might be another. [2] psycopg2 (but not psycopg3). Author: Jelte Fennema-Nio <postgres@jeltef.nl> Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com> Backpatch-through: 18 Discussion: https://www.postgresql.org/message-id/20250617.101056.1437027795118961504.ishii%40postgresql.org	2025-08-01 18:24:19 +03:00
Amit Kapila	2ab2d6f970	Fix a deadlock during ALTER SUBSCRIPTION ... DROP PUBLICATION. A deadlock can occur when the DDL command and the apply worker acquire catalog locks in different orders while dropping replication origins. The issue is rare in PG16 and higher branches because, in most cases, the tablesync worker performs the origin drop in those branches, and its locking sequence does not conflict with DDL operations. This patch ensures consistent lock acquisition to prevent such deadlocks. As per buildfarm. Reported-by: Alexander Lakhin <exclusion@gmail.com> Author: Ajin Cherian <itsajin@gmail.com> Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: vignesh C <vignesh21@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Backpatch-through: 14, where it was introduced Discussion: https://postgr.es/m/bab95e12-6cc5-4ebb-80a8-3e41956aa297@gmail.com	2025-08-01 07:58:48 +00:00
Tomas Vondra	ca09ef3a6a	Fix tab completion for ALTER ROLE\|USER ... RESET Commit `c407d5426b` added tab completion for ALTER ROLE\|USER ... RESET, with the intent to offer only the variables actually set on the role. But as soon as the user started typing something, it would start to offer all possible matching variables. Fix this the same way ALTER DATABASE ... RESET does it, i.e. by properly considering the prefix. A second issue causing similar symptoms (offering variables that are not actually set for a role) was caused by a match to another pattern. The ALTER DATABASE ... RESET was already excluded, so do the same thing for ROLE/USER. Report and fix by Dagfinn Ilmari Mannsåker. Backpatch to 18, same as `c407d5426b`. Author: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> Reviewed-by: jian he <jian.universality@gmail.com> Discussion: https://postgr.es/m/87qzyghw2x.fsf%40wibble.ilmari.org Discussion: https://postgr.es/m/87tt4lumqz.fsf%40wibble.ilmari.org Backpatch-through: 18	2025-07-31 16:04:21 +02:00
Tomas Vondra	dbf5a83d46	Schema-qualify unnest() in ALTER DATABASE ... RESET Commit `9df8727c50` failed to schema-qualify the unnest() call in the query used to list the variables in ALTER DATABASE ... RESET. If there's another unnest() function in the search_path, this could cause either failures, or even security issues (when the tab-completion gets used by privileged accounts). Report and fix by Dagfinn Ilmari Mannsåker. Backpatch to 18, same as `9df8727c50`. Author: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> Reviewed-by: jian he <jian.universality@gmail.com> Discussion: https://postgr.es/m/87qzyghw2x.fsf%40wibble.ilmari.org Discussion: https://postgr.es/m/87tt4lumqz.fsf%40wibble.ilmari.org Backpatch-through: 18	2025-07-31 16:04:06 +02:00
Noah Misch	0decd5e89d	Sort dump objects independent of OIDs, for the 7 holdout object types. pg_dump sorts objects by their logical names, e.g. (nspname, relname, tgname), before dependency-driven reordering. That removes one source of logically-identical databases differing in their schema-only dumps. In other words, it helps with schema diffing. The logical name sort ignored essential sort keys for constraints, operators, PUBLICATION ... FOR TABLE, PUBLICATION ... FOR TABLES IN SCHEMA, operator classes, and operator families. pg_dump's sort then depended on object OID, yielding spurious schema diffs. After this change, OIDs affect dump order only in the event of catalog corruption. While pg_dump also wrongly ignored pg_collation.collencoding, CREATE COLLATION restrictions have been keeping that imperceptible in practical use. Use techniques like we use for object types already having full sort key coverage. Where the pertinent queries weren't fetching the ignored sort keys, this adds columns to those queries and stores those keys in memory for the long term. The ignorance of sort keys became more problematic when commit `172259afb5` added a schema diff test sensitive to it. Buildfarm member hippopotamus witnessed that. However, dump order stability isn't a new goal, and this might avoid other dump comparison failures. Hence, back-patch to v13 (all supported versions). Reviewed-by: Robert Haas <robertmhaas@gmail.com> Discussion: https://postgr.es/m/20250707192654.9e.nmisch@google.com Backpatch-through: 13	2025-07-31 06:37:56 -07:00
Michael Paquier	3357471cf9	pg_stat_statements: Add counters for generic and custom plans This patch adds two new counters to pg_stat_statements: - generic_plan_calls - custom_plan_calls These counters track how many times a prepared statement was executed using a generic or custom plan, respectively, providing a global equivalent at query level, for top and non-top levels, of pg_prepared_statements whose data is restricted to a single session. This commit builds upon `e125e36002`. The module is bumped to version 1.13. PGSS_FILE_HEADER is bumped as well, something that the latest patches touching the on-disk format of the PGSS file did not actually bother with since 2022. Author: Sami Imseih <samimseih@gmail.com> Reviewed-by: Ilia Evdokimov <ilya.evdokimov@tantorlabs.com> Reviewed-by: Andrei Lepikhov <lepihov@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Nikolay Samokhvalov <nik@postgres.ai> Discussion: https://postgr.es/m/CAA5RZ0uFw8Y9GCFvafhC=OA8NnMqVZyzXPfv_EePOt+iv1T-qQ@mail.gmail.com	2025-07-31 11:37:37 +09:00
Michael Paquier	e125e36002	Rename CachedPlanType to PlannedStmtOrigin for PlannedStmt Commit `719dcf3c42` introduced a field called CachedPlanType in PlannedStmt to allow extensions to determine whether a cached plan is generic or custom. After discussion, the concepts that we want to track are a bit wider than initially anticipated, as it is closer to knowing from which "source" or "origin" a PlannedStmt has been generated or retrieved. Custom and generic cached plans are a subset of that. Based on the state of HEAD, we have been able to define two more origins: - "standard", for the case where PlannedStmt is generated in standard_planner(), the most common case. - "internal", for the fake PlannedStmt generated internally by some query patterns. This could be tuned in the future depending on what is needed. This looks like a good starting point, at least. The default value is called "UNKNOWN", provided as fallback value. This value is not used in the core code, the idea is to let extensions building their own PlannedStmts know about this new field. Author: Michael Paquier <michael@paquier.xyz> Co-authored-by: Sami Imseih <samimseih@gmail.com> Discussion: https://postgr.es/m/aILaHupXbIGgF2wJ@paquier.xyz	2025-07-31 10:06:34 +09:00
Nathan Bossart	ee924698d5	doc: Adjust documentation for vacuumdb --missing-stats-only. The sentence in question gave readers the impression that vacuumdb removes statistics for a period of time while analyzing, but it's actually meant to convey that --analyze-in-stages temporarily replaces existing statistics with ones generated with lower statistics targets. Reported-by: Frédéric Yhuel <frederic.yhuel@dalibo.com> Reviewed-by: Frédéric Yhuel <frederic.yhuel@dalibo.com> Reviewed-by: "David G. Johnston" <david.g.johnston@gmail.com> Reviewed-by: Corey Huinker <corey.huinker@gmail.com> Reviewed-by: Jeff Davis <pgsql@j-davis.com> Discussion: https://postgr.es/m/4b94ca16-7a6d-4581-b2aa-4ea79dbc082a%40dalibo.com Backpatch-through: 18	2025-07-30 13:04:47 -05:00
Nathan Bossart	412036c22d	Teach pg_upgrade to handle in-place tablespaces. Presently, pg_upgrade assumes that all non-default tablespaces don't move to different directories during upgrade. Unfortunately, this isn't true for in-place tablespaces, which move to the new cluster's pg_tblspc directory. This commit teaches pg_upgrade to handle in-place tablespaces by retrieving the tablespace directories for both the old and new clusters. In turn, we can relax the prohibition on non-default tablespaces for same-version upgrades, i.e., if all non-default tablespaces are in-place, pg_upgrade may proceed. This change is primarily intended to enable additional pg_upgrade testing with non-default tablespaces, as is done in 006_transfer_modes.pl. Reviewed-by: Corey Huinker <corey.huinker@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/aA_uBLYMUs5D66Nb%40nathan	2025-07-30 10:48:41 -05:00
Andrew Dunstan	ce9a6244b5	Revert Non text modes for pg_dumpall, and pg_restore support Recent discussions of the mechanisms used to manage global data have raised concerns about their robustness and security. Rather than try to deal with those concerns at a very late stage of the release cycle, the conclusion is to revert these features and work on them for the next release. This reverts parts or all of the following commits: `1495eff7bd` Non text modes for pg_dumpall, correspondingly change pg_restore `5db3bf7391` Clean up from commit `1495eff7bd` `289f74d0cb` Add more TAP tests for pg_dumpall `2ef5790806` Fix a couple of error messages and tests for them `b52a4a5f28` Clean up error messages from `1495eff7bd` `4170298b6e` Further cleanup for directory creation on pg_dump/pg_dumpall `22cb6d2895` Fix memory leak in pg_restore.c `928394b664` Improve various new-to-v18 appendStringInfo calls `39729ec01d` Fix fat fingering in `22cb6d2895` `5822bf21d5` Add missing space in pg_restore documentation. `f09088a01d` Free memory properly in pg_restore.c `40b9c27014` pg_restore cleanups `4aad2cb770` Portability fix: isdigit() must be passed an unsigned char. `88e947136b` Fix typos and grammar in the code `f60420cff6` doc: Alphabetize long options for pg_dump[all]. `bc35adee8d` doc: Put new options in consistent order on man pages `a876464abc` Message style improvements `dec6643487` Improve pg_dump/pg_dumpall help synopses and terminology `0ebd242555` Run pgperltidy Discussion: https://postgr.es/m/20250708212819.09.nmisch@google.com Backpatch-to: 18 Reviewed-by: Noah Misch <noah@leadboat.com>	2025-07-30 11:31:54 -04:00
Peter Eisentraut	00c9771779	Fix whitespace	2025-07-30 09:51:45 +02:00
Michael Paquier	1a5212775e	Fix ./configure checks with __cpuidex() and __cpuid() The configure checks used two incorrect functions when checking the presence of some routines in an environment: - __get_cpuidex() for the check of __cpuidex(). - __get_cpuid() for the check of __cpuid(). This means that Postgres has never been able to detect the presence of these functions, impacting environments where these exist, like Windows. Simply fixing the function name does not work. For example, using configure with MinGW on Windows causes the checks to detect all four of __get_cpuid(), __get_cpuid_count(), __cpuidex() and __cpuid() to be available, causing a compilation failure as this messes up with the MinGW headers as we would include both <intrin.h> and <cpuid.h>. The Postgres code expects only one in { __get_cpuid() , __cpuid() } and one in { __get_cpuid_count() , __cpuidex() } to exist. This commit reshapes the configure checks to do exactly what meson is doing, which has been working well for us: check one, then the other, but never allow both to be detected in a given build. The logic is wrong since `3dc2d62d04` and `792752af4e` where these checks have been introduced (the second case is most likely a copy-pasto coming from the first case), with meson documenting that the configure checks were broken. As far as I can see, they are now applied consistently with what the code expects, but let's see if the buildfarm has something different to say. The comment in meson.build is adjusted as well, to reflect the new reality. Author: Lukas Fittl <lukas@fittl.com> Co-authored-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/aIgwNYGVt5aRAqTJ@paquier.xyz Backpatch-through: 13	2025-07-30 11:55:42 +09:00
Heikki Linnakangas	613f647122	Handle cancel requests with PID 0 gracefully If the client sent a query cancel request with backend PID 0, it tripped an assertion. With assertions disabled, you got this in the log instead: LOG: invalid cancel request with PID 0 LOG: wrong key in cancel request for process 0 Query cancellations don't even require authentication, so we better tolerate bogus requests. Fix by turning the assertion into a regular runtime check. Spotted while testing libpq behavior with a modified server that didn't send BackendKeyData to the client. Backpatch-through: 18	2025-07-30 00:39:49 +03:00
Tom Lane	4300d8b6a7	Don't put library-supplied -L/-I switches before user-supplied ones. For many optional libraries, we extract the -L and -l switches needed to link the library from a helper program such as llvm-config. In some cases we put the resulting -L switches into LDFLAGS ahead of -L switches specified via --with-libraries. That risks breaking the user's intention for --with-libraries. It's not such a problem if the library's -L switch points to a directory containing only that library, but on some platforms a library helper may "helpfully" offer a switch such as -L/usr/lib that points to a directory holding all standard libraries. If the user specified --with-libraries in hopes of overriding the standard build of some library, the -L/usr/lib switch prevents that from happening since it will come before the user-specified directory. To fix, avoid inserting these switches directly into LDFLAGS during configure, instead adding them to LIBDIRS or SHLIB_LINK. They will still eventually get added to LDFLAGS, but only after the switches coming from --with-libraries. The same problem exists for -I switches: those coming from --with-includes should appear before any coming from helper programs such as llvm-config. We have not heard field complaints about this case, but it seems certain that a user attempting to override a standard library could have issues. The changes for this go well beyond configure itself, however, because many Makefiles have occasion to manipulate CPPFLAGS to insert locally-desirable -I switches, and some of them got it wrong. The correct ordering is any -I switches pointing at within-the-source-tree-or-build-tree directories, then those from the tree-wide CPPFLAGS, then those from helper programs. There were several places that risked pulling in a system-supplied copy of libpq headers, for example, instead of the in-tree files. (Commit `cb36f8ec2` fixed one instance of that a few months ago, but this exercise found more.) 
The Meson build scripts may or may not have any comparable problems, but I'll leave it to someone else to investigate that. Reported-by: Charles Samborski <demurgos@demurgos.net> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/70f2155f-27ca-4534-b33d-7750e20633d7@demurgos.net Backpatch-through: 13	2025-07-29 15:17:40 -04:00
Peter Eisentraut	c3019bb778	Update comment The code being referred to was moved to a different function in commit `eb8312a22a`, so update the comment accordingly.	2025-07-29 18:57:14 +02:00
Tom Lane	902f922218	Remove unnecessary complication around xmlParseBalancedChunkMemory. When I prepared `71c0921b6` et al yesterday, I was thinking that the logic involving explicitly freeing the node_list output was still needed to dodge leakage bugs in libxml2. But I was misremembering: we introduced that only because with early 2.13.x releases we could not trust xmlParseBalancedChunkMemory's result code, so we had to look to see if a node list was returned or not. There's no reason to believe that xmlParseBalancedChunkMemory will fail to clean up the node list when required, so simplify. (This essentially completes reverting all the non-cosmetic changes in 6082b3d5d.) Reported-by: Jim Jones <jim.jones@uni-muenster.de> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/997668.1753802857@sss.pgh.pa.us Backpatch-through: 13	2025-07-29 12:47:38 -04:00
Nathan Bossart	0f3a26fedd	Add commit `1d1612aec7` to .git-blame-ignore-revs.	2025-07-29 10:32:53 -05:00
Tom Lane	74e121c8dc	Split up pgfdw_report_error so that we can mark it pg_noreturn. pgfdw_report_error has the same design fault as elog/ereport do, namely that it might or might not return depending on elevel. While those functions are too widely used to redesign, there are only about 30 call sites for pgfdw_report_error, and it's not exposed for extension use. So let's rethink it. Split it into pgfdw_report_error() which hard-wires ERROR elevel and is marked pg_noreturn, and pgfdw_report() which allows only elevels less than ERROR. (Thanks to Álvaro Herrera for suggesting this naming.) The motivation for doing this now is that in the wake of commit `80aa9848b`, which removed a bunch of PG_TRYs from postgres_fdw, we're seeing more thorough flow analysis there from C compilers and Coverity. Marking pgfdw_report_error as noreturn where appropriate should help prevent false-positive complaints. We could alternatively have invented a macro wrapper similar to what we use for elog/ereport, but that code is sufficiently fragile that I didn't find it appetizing to make another copy. Since `80aa9848b` already changed pgfdw_report_error's signature, this won't make back-patching any harder than it was already. Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/420221.1753714491@sss.pgh.pa.us	2025-07-29 10:35:01 -04:00
Tom Lane	b9ebb92bcb	Suppress uninitialized-variable warning. In the wake of commit `80aa9848b`, a few compilers think that postgresAcquireSampleRowsFunc's "reltuples" might be used uninitialized. The logic is visibly correct, both before and after that change; presumably what happened here is that the previous presence of a setjmp() in the function stopped them from attempting any flow analysis at all. Add a dummy initialization to silence the warning. Reported-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAExHW5tkerCufA_F6oct5dMJ61N+yVrVgYXL7M8dD-5_zXjrDw@mail.gmail.com	2025-07-29 09:42:22 -04:00
Robert Haas	1d1612aec7	Run pgindent. Per buildfarm member koel, Nathan Bossart, and David Rowley.	2025-07-29 09:10:41 -04:00
Fujii Masao	cc321b1d1c	Add regression test for background worker restart after crash. Previously, if a background worker crashed and the server restarted with restart_after_crash enabled, the worker was not restarted as expected. This issue was fixed by commit `b5d084c535`, which ensures that background workers without the never-restart flag are correctly restarted after a crash-and-restart cycle. To guard against regressions, this commit adds a test that verifies a background worker successfully restarts in such a scenario. Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: ChangAo Chen <cca5507@qq.com> Discussion: https://postgr.es/m/CAHGQGwHF-PdUOgiXCH_8K5qBm8b13h0Qt=dSoFXZybXQdbf-tw@mail.gmail.com	2025-07-29 19:43:10 +09:00
Michael Paquier	cb833c1b6d	Handle timeout in PostgreSQL::Test::Cluster::is_alive() This commit adds an extra --timeout=PG_TEST_TIMEOUT_DEFAULT to the call of pg_isready done in is_alive(), so that the call has more leeway on resource-constrained machines. By default the timeout is 180s, and it can be changed depending on the environment where the tests are run. Per buildfarm member mamba, where pg_isready's default timeout of 3s has proved insufficient, as the postmaster may not have the time it needs to reply to a ping request. Reported-by: Alexander Lakhin <exclusion@gmail.com> Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com> Discussion: https://postgr.es/m/29b637df-f818-4b52-986a-f11ba28300e9@gmail.com	2025-07-29 17:03:07 +09:00
Alexander Korotkov	c2c2c7e225	Clarify documentation for the initcap function This commit documents differences in the definition of word separators for the initcap function between libc and ICU locale providers. Backpatch to all supported branches. Discussion: https://postgr.es/m/804cc10ef95d4d3b298e76b181fd9437%40postgrespro.ru Author: Oleg Tselebrovskiy <o.tselebrovskiy@postgrespro.ru> Backpatch-through: 13	2025-07-29 10:41:13 +03:00
David Rowley	4bc62b8684	Display Memoize planner estimates in EXPLAIN There've been a few complaints that it can be overly difficult to figure out why the planner picked a Memoize plan. To help address that, here we adjust the EXPLAIN output to display the following additional details: 1) The estimated number of cache entries that can be stored at once 2) The estimated number of unique lookup keys that we expect to see 3) The number of lookups we expect 4) The estimated hit ratio Technically #4 can be calculated using #1, #2 and #3, but it's not a particularly obvious calculation, so we opt to display it explicitly. The original patch by Lukas Fittl only displayed the hit ratio, but there was a fear that might lead to more questions about how that was calculated. The idea with displaying all 4 is to be transparent which may allow queries to be tuned more easily. For example, if #2 isn't correct then maybe extended statistics or a manual n_distinct estimate can be used to help fix poor plan choices. Author: Ilia Evdokimov <ilya.evdokimov@tantorlabs.com> Author: Lukas Fittl <lukas@fittl.com> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Reviewed-by: Andrei Lepikhov <lepihov@gmail.com> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Discussion: https://postgr.es/m/CAP53Pky29GWAVVk3oBgKBDqhND0BRBN6yTPeguV_qSivFL5N_g%40mail.gmail.com	2025-07-29 15:18:01 +12:00
Tom Lane	71c0921b64	Avoid regression in the size of XML input that we will accept. This mostly reverts commit `6082b3d5d`, "Use xmlParseInNodeContext not xmlParseBalancedChunkMemory". It turns out that xmlParseInNodeContext will reject text chunks exceeding 10MB, while (in most libxml2 versions) xmlParseBalancedChunkMemory will not. The bleeding-edge libxml2 bug that we needed to work around a year ago is presumably no longer a factor, and the argument that xmlParseBalancedChunkMemory is semi-deprecated is not enough to justify a functionality regression. Hence, go back to doing it the old way. Reported-by: Michael Paquier <michael@paquier.xyz> Author: Michael Paquier <michael@paquier.xyz> Co-authored-by: Erik Wienhold <ewie@ewie.name> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/aIGknLuc8b8ega2X@paquier.xyz Backpatch-through: 13	2025-07-28 16:50:41 -04:00
Robert Haas	d5b9b2d402	Remove misleading hint for "unexpected data beyond EOF" error. Commit `ffae5cc5a6` added this hint in 2006, but it's now obsolete and doesn't reflect what users should really check in this situation. We were not able to agree on a new hint, so just delete the existing one and update the comments to mention one possibility that is known to cause problems of this kind: something other than PostgreSQL is modifying files in the PostgreSQL data directory. Author: Jakub Wartak <jakub.wartak@enterprisedb.com> Reviewed-by: Robert Haas <rhaas@postgresql.org> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Christoph Berg <myon@debian.org> Discussion: https://postgr.es/m/CAKZiRmxNbcaL76x=09Sxf7aUmrRQJBf8drzDdUHo+j9_eM+VMg@mail.gmail.com	2025-07-28 11:15:47 -04:00
Robert Haas	dcc9820a35	Avoid throwing away the error message in syncrep_yyerror. Commit `473a575e05` purported to make this function stash the error message in *syncrep_parse_result_p, but it didn't actually. As a result, an attempt to set synchronous_standby_names to any value that does not parse resulted in a generic "parser failed." message rather than anything more specific. This fixes that. Discussion: http://postgr.es/m/CA+TgmoYF9wPNZ-Q_EMfib_espgHycY-eX__6Tzo2GpYpVXqCdQ@mail.gmail.com Backpatch-through: 18	2025-07-28 10:35:05 -04:00
Michael Paquier	3151c264d4	ecpg: Fix memory leaks in ecpg_auto_prepare() This routine includes three code paths that can fail, with the allocated prepared statement name going out of scope. Per report from Coverity. Oversight in commit `a6eabec680`, which changed the order of some ecpg_strdup() calls in this code path.	2025-07-28 08:38:24 +09:00
Michael Paquier	793928c2d5	Fix performance regression with flush of pending fixed-numbered stats The callback added in `fc415edf8c` to check whether there is any pending data to flush for fixed-numbered statistics, which loops across all the builtin and custom stats kinds with a call to have_fixed_pending_cb, is proving to be visible in workloads that do not report any stats (read-only, no function calls, no WAL, no IO, etc). The code used in v17 was cheaper than what HEAD has introduced, relying on three boolean checks for WAL, SLRU and IO stats. This commit switches the code to use a more efficient approach than `fc415edf8c`, with a single boolean flag that can be switched to "true" by any fixed-numbered stats kind to force pgstat_report_stat() to go through one round of reports. The flag is reset by pgstat_report_stat() once a full round of reports is done. The flag being false means that fixed-numbered stats kinds saw no activity, and that there is no pending data to flush. `ac000fca74` took one step in improving the performance by reducing the number of stats kinds that the backend can hold. This commit takes a more drastic step by bringing back the code efficiency to what it was before v18 with a cheap check at the beginning of pgstat_report_stat() for its fast-exit path. The callback have_static_pending_cb is removed as an effect of all that. Reported-by: Andres Freund <andres@anarazel.de> Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://postgr.es/m/eb224uegsga2hgq7dfq3ps5cduhpqej7ir2hjxzzozjthrekx5@dysei6buqthe Backpatch-through: 18	2025-07-28 08:15:11 +09:00
Alexander Korotkov	258bf0a2ea	Process sync requests incrementally in AbsorbSyncRequests If the number of sync requests is big enough, the palloc() call in AbsorbSyncRequests() will attempt to allocate more than 1 GB of memory, resulting in failure. This can lead to an infinite loop in the checkpointer process, as it repeatedly fails to absorb the pending requests. This commit introduces the following changes to cope with this problem: 1. Turn pending checkpointer requests array in shared memory into a bounded ring buffer. 2. Limit maximum ring buffer size to 10M items. 3. Make AbsorbSyncRequests() process requests incrementally in 10K batches. Even #2 alone makes the whole queue fit within the maximum palloc() size of 1 GB; #3 additionally avoids long periods of continuous lock holding. This commit is for master only. A simpler fix, which just limits the request queue size to 10M, will be backpatched. Reported-by: Ekaterina Sokolova <e.sokolova@postgrespro.ru> Discussion: https://postgr.es/m/db4534f83a22a29ab5ee2566ad86ca92%40postgrespro.ru Author: Maxim Orlov <orlovmg@gmail.com> Co-authored-by: Xuneng Zhou <xunengzhou@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>	2025-07-27 15:07:47 +03:00
Michael Paquier	6f22a82a40	Add assertions for all the required index AM callbacks Similar checks are done for the mandatory table AM callbacks. A portion of the index AM callbacks are optional and can be NULL; the rest is mandatory and is documented as such in the documentation and in amapi.h. These checks are useful to detect quickly if all the mandatory callbacks are defined when implementing a new index access method, as the assertions are run when loading the AM. Author: Japin Li <japinli@hotmail.com> Discussion: https://postgr.es/m/ME0P300MB0445795D31CEAB92C58B41FDB651A@ME0P300MB0445.AUSP300.PROD.OUTLOOK.COM	2025-07-27 17:48:47 +09:00
Tom Lane	db6461b1c9	Add commit `73873805f` to .git-blame-ignore-revs.	2025-07-25 16:45:57 -04:00
Tom Lane	0f9d4d7c12	Silence leakage complaint about postgres_fdw's InitPgFdwOptions. Valgrind complains that the PQconninfoOption array returned by libpq is leaked. We apparently believed that we could suppress that warning by storing that array's address in a static variable. However, modern C compilers are bright enough to optimize the static variable away. We could escalate that arms race by making the variable global. But on the whole it seems better to revise the code so that it can free libpq's result properly. The only thing that costs us is copying the parameter-name keywords; which seems like a pretty negligible cost in a function that runs at most once per process. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com> Discussion: https://postgr.es/m/2976982.1748049023@sss.pgh.pa.us	2025-07-25 16:37:29 -04:00
Tom Lane	73873805fb	Run pgindent on the changes of the previous patch. This step can be checked mechanically. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com> Discussion: https://postgr.es/m/2976982.1748049023@sss.pgh.pa.us	2025-07-25 16:36:44 -04:00
Tom Lane	80aa9848be	Reap the benefits of not having to avoid leaking PGresults. Remove a bunch of PG_TRY constructs, de-volatilize related variables, remove some PQclear calls in error paths. Aside from making the code simpler and shorter, this should provide some marginal performance gains. For ease of review, I did not re-indent code within the removed PG_TRY constructs. That'll be done in a separate patch. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com> Discussion: https://postgr.es/m/2976982.1748049023@sss.pgh.pa.us	2025-07-25 16:31:43 -04:00
Tom Lane	7d8f595779	Create infrastructure to reliably prevent leakage of PGresults. Commit `232d8caea` fixed a case where postgres_fdw could lose track of a PGresult object, resulting in a process-lifespan memory leak. But I have little faith that there aren't other potential PGresult leakages, now or in future, in the backend modules that use libpq. Therefore, this patch proposes infrastructure that makes all PGresults returned from libpq act as though they are palloc'd in the CurrentMemoryContext (with the option to relocate them to another context later). This should greatly reduce the risk of careless leaks, and it also permits removal of a bunch of code that attempted to prevent such leaks via PG_TRY blocks. This patch adds infrastructure that wraps each PGresult in a "libpqsrv_PGresult" that provides a memory context reset callback to PQclear the PGresult. Code using this abstraction is inherently memory-safe to the same extent as we are accustomed to in most backend code. Furthermore, we add some macros that automatically redirect calls of the libpq functions concerned with PGresults to use this infrastructure, so that almost no source-code changes are needed to wheel this infrastructure into place in all the backend code that uses libpq. Perhaps in future we could create similar infrastructure for PGconn objects, but there seems less need for that. This patch just creates the infrastructure and makes relevant code use it, including reverting `232d8caea` in favor of this mechanism. A good deal of follow-on simplification is possible now that we don't have to be so cautious about freeing PGresults, but I'll put that in a separate patch. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com> Discussion: https://postgr.es/m/2976982.1748049023@sss.pgh.pa.us	2025-07-25 16:30:00 -04:00
Tom Lane	5457ea46d1	Fix dynahash's HASH_FIXED_SIZE ("isfixed") option. This flag was effectively a no-op in EXEC_BACKEND (ie, Windows) builds, because it was kept in the process-local HTAB struct, and it could only ever become set in the postmaster's copy. The simplest fix is to move it to the shared HASHHDR struct. We could keep a copy in HTAB as well, as we do with keysize and some other fields, but the "too much contention" argument doesn't seem to apply here: we only examine isfixed during element_alloc(), which had better not get hit very often for a shared hashtable. This oversight dates to `7c797e719` which invented the option. But back-patching doesn't seem appropriate given the lack of field complaints. If there is anyone running an affected workload on Windows, they might be unhappy about the behavior changing in a minor release. Author: Aidar Imamov <a.imamov@postgrespro.ru> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/4d0cb35ff01c5c74d2b9a582ecb73823@postgrespro.ru	2025-07-25 10:56:55 -04:00
Álvaro Herrera	1dfe3ef3f9	Refactor grammar to create opt_utility_option_list This changes the grammar for REINDEX, CHECKPOINT, CLUSTER, ANALYZE/ANALYSE; they still accept the same options as before, but the grammar is written differently for convenience of future development. Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Discussion: https://postgr.es/m/202507231538.ir7pjzoow6oe@alvherre.pgsql	2025-07-25 12:03:19 +02:00
Fujii Masao	b5d084c535	Fix background worker not restarting after crash-and-restart cycle. Previously, if a background worker crashed (e.g., due to a SIGKILL) and the server restarted due to restart_after_crash being enabled, the worker was not restarted as expected. Background workers without the never-restart flag should automatically restart in this case. This issue was introduced in commit `28a520c0b7`, which failed to reset the rw_pid field in the RegisteredBgWorker struct for the crashed worker. This commit fixes the problem by resetting rw_pid for all eligible background workers during the crash-and-restart cycle. Back-patched to v18, where the bug was introduced. Bug fix patches were proposed by Andrey Rudometov and ChangAo Chen, but this commit uses a different approach. Reported-by: Andrey Rudometov <unlimitedhikari@gmail.com> Reported-by: ChangAo Chen <cca5507@qq.com> Author: Andrey Rudometov <unlimitedhikari@gmail.com> Author: ChangAo Chen <cca5507@qq.com> Co-authored-by: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: ChangAo Chen <cca5507@qq.com> Reviewed-by: Shveta Malik <shveta.malik@gmail.com> Discussion: https://postgr.es/m/CAF6JsWiO=i24qYitWe6ns1sXqcL86rYxdyU+pNYk-WueKPSySg@mail.gmail.com Discussion: https://postgr.es/m/tencent_E00A056B3953EE6440F0F40F80EC30427D09@qq.com Backpatch-through: 18	2025-07-25 18:38:36 +09:00
Michael Paquier	641f20d4c4	Fix assertion failure with latch wait in single-user mode LatchWaitSetPostmasterDeathPos, the latch event position for the postmaster death event, is initialized under IsUnderPostmaster. WaitLatch() considered it as a valid wait target in single-user mode (!IsUnderPostmaster), which was incorrect. One code path found to fail with an assertion failure is a database drop in single-user mode while waiting in WaitForProcSignalBarrier() after the drop. Oversight in commit `84e5b2f07a`. Author: Patrick Stählin <me@packi.ch> Co-authored-by: Ronan Dunklau <ronan.dunklau@aiven.io> Discussion: https://postgr.es/m/18996-3a2744c8140488de@postgresql.org Backpatch-through: 18	2025-07-25 16:17:13 +09:00
Michael Paquier	ac000fca74	Lower bounds related to pgstats kinds This commit changes stats kinds to have the following bounds, making their handling in core cheaper by default: - PGSTAT_KIND_CUSTOM_MIN 128 -> 24 - PGSTAT_KIND_MAX 256 -> 32 The original numbers were rather high, and showed an impact on performance in pgstat_report_stat() for the case of simple queries with its early-exit path if there are no pending statistics to flush. This logic will be improved more in a follow-up commit to bring the performance of pgstat_report_stat() on par with v17 and older versions. Lowering the bounds is a change worth doing on its own, independently of the other improvement. These new numbers should be enough to leave some room for the following years for built-in and custom stats kinds, with stable ID numbers. At least that should be enough to start with this facility for extension developers. The bounds can always be increased in the tree if requirements call for it. Per discussion with Andres Freund and Bertrand Drouvot. Discussion: https://postgr.es/m/eb224uegsga2hgq7dfq3ps5cduhpqej7ir2hjxzzozjthrekx5@dysei6buqthe Backpatch-through: 18	2025-07-25 11:17:48 +09:00
Nathan Bossart	15d33eb192	Fix return value of visibilitymap_get_status(). This function is declared as returning a uint8, but it returns a bool in one code path. To fix, return (uint8) 0 instead of false there. This should behave exactly the same as before, but it might prevent future compiler complaints. Oversight in commit `a892234f83`. Author: Julien Rouhaud <rjuju123@gmail.com> Discussion: https://postgr.es/m/aIHluT2isN58jqHV%40jrouhaud	2025-07-24 10:13:45 -05:00
Amit Kapila	e1c3654839	Fix duplicate transaction replay during pg_createsubscriber. Previously, the tool could replay the same transaction twice, once during recovery, then again during replication after the subscriber was set up. This occurred because the same recovery_target_lsn was used both to finalize recovery and to start replication. If recovery_target_inclusive = true, the transaction at that LSN would be applied during recovery and then sent again by the publisher leading to duplication. To prevent this, we now set recovery_target_inclusive = false. This ensures the transaction at recovery_target_lsn is not reapplied during recovery, avoiding duplication when replication begins. Bug #18897 Reported-by: Zane Duffield <duffieldzane@gmail.com> Author: Shlok Kyal <shlok.kyal.oss@gmail.com> Reviewed-by: vignesh C <vignesh21@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Backpatch-through: 17, where it was introduced Discussion: https://postgr.es/m/18897-d3db67535860dddb@postgresql.org	2025-07-24 09:05:32 +00:00
Michael Paquier	719dcf3c42	Introduce field tracking cached plan type in PlannedStmt PlannedStmt gains a new field, called CachedPlanType, able to track if a given plan tree originates from the cache and if we are dealing with a generic or custom cached plan. This field can be used for monitoring or statistical purposes, in the executor hooks, for example, based on the planned statement attached to a QueryDesc. A patch is under discussion for pg_stat_statements to provide an equivalent of the counters in pg_prepared_statements for custom and generic plans, to provide a more global view of such data, as this data is now restricted to the current session. The concept introduced in this commit is useful on its own, and has been extracted from a larger patch by the same author. Author: Sami Imseih <samimseih@gmail.com> Reviewed-by: Andrei Lepikhov <lepihov@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CAA5RZ0uFw8Y9GCFvafhC=OA8NnMqVZyzXPfv_EePOt+iv1T-qQ@mail.gmail.com	2025-07-24 15:41:18 +09:00
Amit Kapila	df335618ed	Fix cfbot failure caused by commit `228c370868`. Ensure the test waits for the apply worker to exit after disabling the subscription. This is necessary to safely enable the retain_dead_tuples option. Also added a similar wait in another part of the test to prevent unintended apply worker activity that could lead to test failures post-subscription disable. Reported by Michael Paquier as per cfbot. Author: Zhijie Hou <houzj.fnst@fujitsu.com> Discussion: https://postgr.es/m/aIGLgfRJIBwExoPj@paquier.xyz	2025-07-24 03:51:55 +00:00
Fujii Masao	086b9a33aa	doc: Add missing index entries and fix title formatting in pg_buffercache docs. This commit adds missing index entries for the functions pg_buffercache_numa() and pg_buffercache_usage_counts() in the pg_buffercache documentation. It also makes the function titles consistent by adding parentheses after function names where they were previously missing. Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/7d19af4b-7da3-4862-9f52-ff958960bd8d@oss.nttdata.com Backpatch-through: 18	2025-07-24 11:43:20 +09:00
Tom Lane	e6dfd068ed	Fix build breakage on Solaris-alikes with late-model GCC. Solaris has never bothered to add "const" to the second argument of PAM conversation procs, as all other Unixen did decades ago. This resulted in an "incompatible pointer" compiler warning when building --with-pam, but had no more serious effect than that, so we never did anything about it. However, as of GCC 14 the case is an error not warning by default. To complicate matters, recent OpenIndiana (and maybe illumos in general?) does supply the "const" by default, so we can't just assume that platforms using our solaris template need help. What we can do, short of building a configure-time probe, is to make solaris.h #define _PAM_LEGACY_NONCONST, which causes OpenIndiana's pam_appl.h to revert to the traditional definition, and hopefully will have no effect anywhere else. Then we can use that same symbol to control whether we include "const" in the declaration of pam_passwd_conv_proc(). Bug: #18995 Reported-by: Andrew Watkins <awatkins1966@gmail.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/18995-82058da9ab4337a7@postgresql.org Backpatch-through: 13	2025-07-23 15:44:29 -04:00
Nathan Bossart	2047ad0681	Cross-check lists of built-in LWLock tranches. lwlock.c, lwlock.h, and wait_event_names.txt each contain a list of built-in LWLock tranches. It is easy to miss one or the other when adding or removing tranches, and discrepancies have adverse effects (e.g., breaking JOINs between pg_stat_activity and pg_wait_events). This commit moves the lists of built-in tranches in lwlock.{c,h} to lwlocklist.h and adds a cross-check to the script that generates lwlocknames.h. If the lists do not match exactly, building will fail. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/aHpOgwuFQfcFMZ/B%40ip-10-97-1-34.eu-west-3.compute.internal	2025-07-23 12:06:20 -05:00
Nathan Bossart	37c7a7eeb6	Use PqMsg_* macros in walsender.c Oversights in commits `f4b54e1ed9`, `dc21234005`, and `228c370868`. Author: Dave Cramer <davecramer@gmail.com> Discussion: https://postgr.es/m/CADK3HH%2BowWVdnbmWH4NHG8%3D%2BkXA_wjsyEVLoY719iJnb%3D%2BtT6A%40mail.gmail.com	2025-07-23 10:29:45 -05:00
Álvaro Herrera	196063d676	Move enum RecoveryTargetAction to xlogrecovery.h Commit `70e81861fa` split out xlogrecovery.c/h and moved some enums related to recovery targets to xlogrecovery.h. However, it seems that the enum RecoveryTargetAction was inadvertently left out by that commit. This commit moves it to xlogrecovery.h for consistency. Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Discussion: https://postgr.es/m/20240904.173013.1132986940678039655.horikyota.ntt@gmail.com	2025-07-23 11:02:13 +02:00
Amit Kapila	228c370868	Preserve conflict-relevant data during logical replication. Logical replication requires reliable conflict detection to maintain data consistency across nodes. To achieve this, we must prevent premature removal of tuples deleted by other origins and their associated commit_ts data by VACUUM, which could otherwise lead to incorrect conflict reporting and resolution. This patch introduces a mechanism to retain deleted tuples on the subscriber during the application of concurrent transactions from remote nodes. Retaining these tuples allows us to correctly ignore concurrent updates to the same tuple. Without this, an UPDATE might be misinterpreted as an INSERT during resolutions due to the absence of the original tuple. Additionally, we ensure that origin metadata is not prematurely removed by vacuum freeze, which is essential for detecting update_origin_differs and delete_origin_differs conflicts. To support this, a new replication slot named pg_conflict_detection is created and maintained by the launcher on the subscriber. Each apply worker tracks its own non-removable transaction ID, which the launcher aggregates to determine the appropriate xmin for the slot, thereby retaining necessary tuples. Conflict information retention (deleted tuples and commit_ts) can be enabled per subscription via the retain_conflict_info option. This is disabled by default to avoid unnecessary overhead for configurations that do not require conflict resolution or logging. During upgrades, if any subscription on the old cluster has retain_conflict_info enabled, a conflict detection slot will be created to protect relevant tuples from deletion when the new cluster starts. This is foundational work for correctly detecting update_deleted conflicts, which will be done in a follow-up patch. 
Author: Zhijie Hou <houzj.fnst@fujitsu.com> Reviewed-by: shveta malik <shveta.malik@gmail.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com> Reviewed-by: Nisha Moond <nisha.moond412@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/OS0PR01MB5716BE80DAEB0EE2A6A5D1F5949D2@OS0PR01MB5716.jpnprd01.prod.outlook.com	2025-07-23 02:56:00 +00:00
David Rowley	039f7ee0fe	Use strchr instead of strstr for single-char lookups Compilers such as gcc and clang seem to perform this rewrite automatically when the lookup string is known at compile-time to contain a single character. The MSVC compiler does not seem to apply the same optimization, and the code being adjusted here is within an #ifdef WIN32, so it seems worth adjusting this with the assumption that strchr() will be slightly more performant. There are a couple more instances in contrib/fuzzystrmatch that this commit could also have adjusted. After some discussion, we deemed those not important enough to bother with. Author: Dmitry Mityugov <d.mityugov@postgrespro.ru> Reviewed-by: Corey Huinker <corey.huinker@gmail.com> Reviewed-by: David Rowley <drowleyml@gmail.com> Discussion: https://postgr.es/m/9c1beea6c7a5e9fb6677f26620f1f257%40postgrespro.ru	2025-07-23 12:02:55 +12:00
Michael Paquier	a6eabec680	ecpg: Improve error detection around ecpg_strdup() Various code paths of the ECPG code did not check for memory allocation failures, including the specific case where ecpg_strdup() considers a NULL value given in input as a valid behavior. Since strdup() itself returns NULL on failure, there was no way to tell apart what could be valid and what should fail. With the different cases in mind, ecpg_strdup() is redesigned and gains a new optional argument, giving its callers the possibility to differentiate allocation failures and valid cases where the caller is giving a NULL value in input. Most of the ECPG code does not expect a NULL value, with the exception of ECPGget_desc() (setlocale) and ECPGconnect(), like dbname being unspecified, with repeated strdup calls. The code is adapted to work with this new routine. Note the case of ecpg_auto_prepare(), where the code order is switched so that we handle failures with ecpg_strdup() before manipulating any cached data, avoiding inconsistencies. This class of failure is unlikely to be a problem in practice, so no backpatch is done. Random OOM failures in ECPGconnect() could cause the driver to connect to a different server than the one wanted by the caller, because it could fall back to default values instead of the parameters defined depending on the combinations of allocation failures and successes. Author: Evgeniy Gorbanev <gorbanyoves@basealt.ru> Co-authored-by: Aleksander Alekseev <aleksander@tigerdata.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/a6b193c1-6994-4d9c-9059-aca4aaf41ddd@basealt.ru	2025-07-23 08:18:36 +09:00
Fujii Masao	a7ca73af66	Remove translation marker from libpq-be-fe-helpers.h. Commit `112faf1378` introduced a translation marker in libpq-be-fe-helpers.h, but this caused build failures on some platforms—such as the one reported by buildfarm member indri—due to linker issues with dblink. This is the same problem previously addressed in commit `213c959a29`. To fix the issue, this commit removes the translation marker from libpq-be-fe-helpers.h, following the approach used in `213c959a29`. It also removes the associated gettext_noop() calls added in commit `112faf1378`, as they are no longer needed. While reviewing this, a gettext_noop() call was also found in contrib/basic_archive. Since contrib modules don't support translation, this call has been removed as well. Per buildfarm member indri. Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/0e6299d9-608a-4ffa-aeb1-40cb8a99000b@oss.nttdata.com	2025-07-22 22:08:36 +09:00
Andres Freund	d3f97fd1dd	aio: Fix assertion, clarify README The assertion wouldn't have triggered for a long while yet, but this won't accidentally fail to detect the issue if/when it occurs. Author: Matthias van de Meent <boekewurm+postgres@gmail.com> Discussion: https://postgr.es/m/CAEze2Wj-43JV4YufW23gm=Uwr7Lkj+p0yKctKHxNm1rwFC+_DQ@mail.gmail.com Backpatch-through: 18	2025-07-22 08:30:52 -04:00
Amit Kapila	ce6513e96a	Doc: Fix logical replication examples. The definition of \dRp+ was modified in commit `7054186c4e`. This patch updates the column list and row filter examples to align with the revised definition. Author: Shlok Kyal <shlok.kyal.oss@gmail.com> Reviewed by: Peter Smith <smithpb2250@gmail.com> Backpatch-through: 18, where it was introduced Discussion: https://postgr.es/m/CANhcyEUvqkSO6b9zi_fs_BBPEge5acj4mf8QKmq2TX-7axa7EQ@mail.gmail.com	2025-07-22 06:00:21 +00:00
Michael Paquier	19179dbffc	doc: Inform about aminsertcleanup optional NULLness This index AM callback has been introduced in `c1ec02be1d` and it is optional, currently only being used by BRIN. Optional callbacks are documented with NULL as possible value in amapi.h and indexam.sgml, but this callback has missed this part of the description. Reported-by: Peter Smith <smithpb2250@gmail.com> Reviewed-by: Japin Li <japinli@hotmail.com> Discussion: https://postgr.es/m/CAHut+PvgYcPmPDi1YdHMJY5upnyGRpc0N8pk1xNB11xDSBwNog@mail.gmail.com Backpatch-through: 17	2025-07-22 14:34:15 +09:00
Fujii Masao	112faf1378	Log remote NOTICE, WARNING, and similar messages using ereport(). Previously, NOTICE, WARNING, and similar messages received from remote servers over replication, postgres_fdw, or dblink connections were printed directly to stderr on the local server (e.g., the subscriber). As a result, these messages lacked log prefixes (e.g., timestamp), making them harder to trace and correlate with other log entries. This commit addresses the issue by introducing a custom notice receiver for replication, postgres_fdw, and dblink connections. These messages are now logged via ereport(), ensuring they appear in the logs with proper formatting and context, which improves clarity and aids in debugging. Author: Vignesh C <vignesh21@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/CALDaNm2xsHpWRtLm-VL_HJCsaE3+1Y_n-jDEAr3-suxVqc3xoQ@mail.gmail.com	2025-07-22 14:16:45 +09:00
Michael Paquier	1b8bbee05d	ecpg: Fix NULL pointer dereference during connection lookup ECPGconnect() caches established connections to the server, supporting the case of a NULL connection name when a database name is not specified by its caller. A follow-up call to ECPGget_PGconn() to get an established connection from the cached set with a non-NULL name could cause a NULL pointer dereference if a NULL connection was listed in the cache and checked for a match. At least two connections are necessary to reproduce the issue: one with a NULL name and one with a non-NULL name. Author: Aleksander Alekseev <aleksander@tigerdata.com> Discussion: https://postgr.es/m/CAJ7c6TNvFTPUTZQuNAoqgzaSGz-iM4XR61D7vEj5PsQXwg2RyA@mail.gmail.com Backpatch-through: 13	2025-07-22 14:00:00 +09:00
Richard Guo	e2debb6438	Reduce "Var IS [NOT] NULL" quals during constant folding In commit `b262ad440`, we introduced an optimization that reduces an IS [NOT] NULL qual on a NOT NULL column to constant true or constant false, provided we can prove that the input expression of the NullTest is not nullable by any outer joins or grouping sets. This deduction happens quite late in the planner, during the distribution of quals to rels in query_planner. However, this approach has some drawbacks: we can't perform any further folding with the constant, and it turns out to be prone to bugs. Ideally, this deduction should happen during constant folding. However, the per-relation information about which columns are defined as NOT NULL is not available at that point. This information is currently collected from catalogs when building RelOptInfos for base or "other" relations. This patch moves the collection of NOT NULL attribute information for relations before pull_up_sublinks, storing it in a hash table keyed by relation OID. It then uses this information to perform the NullTest deduction for Vars during constant folding. This also makes it possible to leverage this information to pull up NOT IN subqueries. Note that this patch does not get rid of restriction_is_always_true and restriction_is_always_false. Removing them would prevent us from reducing some IS [NOT] NULL quals that we were previously able to reduce, because (a) the self-join elimination may introduce new IS NOT NULL quals after constant folding, and (b) if some outer joins are converted to inner joins, previously irreducible NullTest quals may become reducible. Author: Richard Guo <guofenglinux@gmail.com> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAMbWs4-bFJ1At4btk5wqbezdu8PLtQ3zv-aiaY3ry9Ymm=jgFQ@mail.gmail.com	2025-07-22 11:21:36 +09:00
Richard Guo	904f6a593a	Centralize collection of catalog info needed early in the planner There are several pieces of catalog information that need to be retrieved for a relation during the early stage of planning. These include relhassubclass, which is used to clear the inh flag if the relation has no children, as well as a column's attgenerated and default value, which are needed to expand virtual generated columns. More such information may be required in the future. Currently, these pieces of catalog data are collected in multiple places, resulting in repeated table_open/table_close calls for each relation in the rangetable. This patch centralizes the collection of all required early-stage catalog information into a single loop over the rangetable, allowing each relation to be opened and closed only once. Author: Richard Guo <guofenglinux@gmail.com> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAMbWs4-bFJ1At4btk5wqbezdu8PLtQ3zv-aiaY3ry9Ymm=jgFQ@mail.gmail.com	2025-07-22 11:20:40 +09:00
Richard Guo	e0d0529526	Expand virtual generated columns before sublink pull-up Currently, we expand virtual generated columns after we have pulled up any SubLinks within the query's quals. This ensures that the virtual generated column references within SubLinks that should be transformed into joins are correctly expanded. This approach works well and has posed no issues. In an upcoming patch, we plan to centralize the collection of catalog information needed early in the planner. This will help avoid repeated table_open/table_close calls for relations in the rangetable. Since this information is required during sublink pull-up, we are moving the expansion of virtual generated columns to occur beforehand. To achieve this, if any EXISTS SubLinks can be pulled up, their rangetables are processed just before pulling them up. Author: Richard Guo <guofenglinux@gmail.com> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAMbWs4-bFJ1At4btk5wqbezdu8PLtQ3zv-aiaY3ry9Ymm=jgFQ@mail.gmail.com	2025-07-22 11:19:17 +09:00
Alexander Korotkov	0810fbb02d	Update comment for ReplicationSlot.last_saved_restart_lsn Document that restart_lsn can go backwards and explain why this could happen. Discussion: https://postgr.es/m/1d12d2-67235980-35-19a406a0%4063439497 Discussion: https://postgr.es/m/CAPpHfdvuyMrUg0Vs5jPfwLOo1M9B-GP5j_My9URnBX0B%3DnrHKw%40mail.gmail.com Author: Hayato Kuroda <kuroda.hayato@fujitsu.com> Co-authored-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Vignesh C <vignesh21@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>	2025-07-21 15:07:54 +03:00
Álvaro Herrera	da71717f0a	pg_dump: include comments on not-null constraints on domains, too Commit `e5da0fe3c2` introduced catalog entries for not-null constraints on domains; but because commit `b0e96f3119` (the original work for catalogued not-null constraints on tables) forgot to teach pg_dump to process the comments for them, this one also forgot. Add that now. We also need to teach repairDependencyLoop() about the new type of constraints being possible for domains. Backpatch-through: 17 Co-authored-by: jian he <jian.universality@gmail.com> Co-authored-by: Álvaro Herrera <alvherre@kurilemu.de> Reported-by: jian he <jian.universality@gmail.com> Discussion: https://postgr.es/m/CACJufxF-0bqVR=j4jonS6N2Ka6hHUpFyu3_3TWKNhOW_4yFSSg@mail.gmail.com	2025-07-21 11:34:10 +02:00
Fujii Masao	cb937e48f0	doc: Document reopen of output file via SIGHUP in pg_recvlogical. When pg_recvlogical receives a SIGHUP signal, it closes the current output file and reopens a new one. This is useful since it allows us to rotate the output file by renaming the current file and sending a SIGHUP. This behavior was previously undocumented. This commit adds the missing documentation. Back-patch to all supported versions. Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Shinya Kato <shinya11.kato@gmail.com> Discussion: https://postgr.es/m/0977fc4f-1523-4ecd-8a0e-391af4976367@oss.nttdata.com Backpatch-through: 13	2025-07-20 11:58:31 +09:00
Tom Lane	aadf7db66e	Mostly-cosmetic adjustments to estimate_multivariate_bucketsize(). The only practical effect of these changes is to avoid a useless list_copy() operation when there is a single hashclause. That's never going to make any noticeable performance difference, but the code is arguably clearer this way, especially if we take the opportunity to add some comments so that readers don't have to reverse-engineer the usage of these local variables. Also add some braces for better/more consistent style. Author: Tender Wang <tndrwang@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAHewXNnHBOO9NEa=NBDYOrwZL4oHu2NOcTYvqyNyWEswo8f5OQ@mail.gmail.com	2025-07-19 14:23:02 -04:00
Alexander Korotkov	cdf1f5a607	Reintroduce test 046_checkpoint_logical_slot This commit is only for HEAD and v18, where the test has been removed. It also incorporates improvements below to stability and coverage of the original test, which were already backpatched to v17. - Add one pg_logical_emit_message() call to force the creation of a record that spans two pages. - Make the logic wait for the checkpoint completion. Author: Alexander Korotkov <akorotkov@postgresql.org> Co-authored-by: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Backpatch-through: 18	2025-07-19 13:59:17 +03:00
Alexander Korotkov	ccd9451593	Improve the stability of the recovery test 047_checkpoint_physical_slot Currently, the comments in 047_checkpoint_physical_slot show an incomplete intention to wait for checkpoint completion before performing an immediate database stop. However, an immediate node stop can occur both before and after checkpoint completion. Both cases should work correctly, but we would like the test to be more stable and deterministic. This is why this commit makes this test explicitly wait for the checkpoint completion log message. Discussion: https://postgr.es/m/CAPpHfdurV-j_e0pb%3DUFENAy3tyzxfF%2ByHveNDNQk2gM82WBU5A%40mail.gmail.com Discussion: https://postgr.es/m/aHXLep3OaX_vRTNQ%40paquier.xyz Author: Alexander Korotkov <akorotkov@postgresql.org> Reviewed-by: Michael Paquier <michael@paquier.xyz> Backpatch-through: 17	2025-07-19 13:51:07 +03:00
Alexander Korotkov	d3917d8f13	Fix infinite wait when reading a partially written WAL record If a crash occurs while writing a WAL record that spans multiple pages, the recovery process marks the page with the XLP_FIRST_IS_OVERWRITE_CONTRECORD flag. However, logical decoding currently attempts to read the full WAL record based on its expected size before checking this flag, which can lead to an infinite wait if the remaining data is never written (e.g., no activity after crash). This patch updates the logic first to read the page header and check for the XLP_FIRST_IS_OVERWRITE_CONTRECORD flag before attempting to reconstruct the full WAL record. If the flag is set, decoding correctly identifies the record as incomplete and avoids waiting for WAL data that will never arrive. Discussion: https://postgr.es/m/CAAKRu_ZCOzQpEumLFgG_%2Biw3FTa%2BhJ4SRpxzaQBYxxM_ZAzWcA%40mail.gmail.com Discussion: https://postgr.es/m/CALDaNm34m36PDHzsU_GdcNXU0gLTfFY5rzh9GSQv%3Dw6B%2BQVNRQ%40mail.gmail.com Author: Vignesh C <vignesh21@gmail.com> Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com> Backpatch-through: 13	2025-07-19 13:45:51 +03:00
Michael Paquier	1e9b5140c4	Check status of nodes after regression test run in 027_stream_regress This commit improves the recovery TAP test 027_stream_regress so that regression diffs are printed only if both the primary and the standby are still alive after the main regression test suite finishes, relying on `d4c9195eff` to do the job. Particularly, a crash of the primary could scribble the contents reported with mostly useless data, as the diffs would refer to queries that failed to run, not necessarily the cause of the crash. Suggested-by: Andres Freund <andres@anarazel.de> Author: Nazir Bilal Yavuz <byavuz81@gmail.com> Discussion: https://postgr.es/m/CAN55FZ1D6KXvjSs7YGsDeadqCxNF3UUhjRAfforzzP0k-cE=bA@mail.gmail.com	2025-07-19 15:03:14 +09:00
Michael Paquier	d4c9195eff	Add PostgreSQL::Test::Cluster::is_alive() This new routine acts as a wrapper of pg_isready, that can be run on a node to check its connection status. This will be used in a recovery test in a follow-up commit. Suggested-by: Andres Freund <andres@anarazel.de> Author: Nazir Bilal Yavuz <byavuz81@gmail.com> Discussion: https://postgr.es/m/CAN55FZ1D6KXvjSs7YGsDeadqCxNF3UUhjRAfforzzP0k-cE=bA@mail.gmail.com	2025-07-19 14:38:52 +09:00
Tom Lane	3683af6170	Speed up byteain by not parsing traditional-style input twice. Instead of laboriously computing the exact output length, use strlen to get an upper bound cheaply. (This is still O(N) of course, but the constant factor is a lot less.) This will typically result in overallocating the output datum, but that's of little concern since it's a short-lived allocation in just about all use-cases. A simple microbenchmark showed about 40% speedup for long input strings. While here, make some cosmetic cleanups and add a test case that covers the double-backslash code path in byteain and byteaout. Author: Steven Niu <niushiji@gmail.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Stepan Neretin <slpmcf@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/ca315729-140b-426e-81a6-6cd5cfe7ecc5@gmail.com	2025-07-18 16:42:10 -04:00
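The upper-bound argument in this commit can be shown with a simplified sketch. The decoder below is hypothetical and is not byteain itself: it handles only a doubled backslash and a backslash followed by three octal digits. Every output byte consumes at least one input character, so strlen() of the input is a safe allocation size and the input is scanned only once.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical single-pass decoder.  A doubled backslash yields one
 * backslash, a backslash followed by three octal digits (000-377) yields one
 * byte, and any other character is copied.  Returns a malloc'd buffer with
 * the decoded length in *outlen, or NULL on malformed input or allocation
 * failure.
 */
unsigned char *
decode_escape_sketch(const char *in, size_t *outlen)
{
	size_t		maxlen = strlen(in);	/* cheap upper bound on output size */
	unsigned char *out = malloc(maxlen > 0 ? maxlen : 1);
	unsigned char *dst = out;

	if (out == NULL)
		return NULL;

	while (*in != '\0')
	{
		if (in[0] != '\\')
			*dst++ = (unsigned char) *in++;
		else if (in[1] == '\\')
		{
			*dst++ = '\\';
			in += 2;
		}
		else if (in[1] >= '0' && in[1] <= '3' &&
				 in[2] >= '0' && in[2] <= '7' &&
				 in[3] >= '0' && in[3] <= '7')
		{
			*dst++ = (unsigned char) (((in[1] - '0') << 6) |
									  ((in[2] - '0') << 3) |
									  (in[3] - '0'));
			in += 4;
		}
		else
		{
			free(out);
			return NULL;		/* malformed escape sequence */
		}
	}

	*outlen = (size_t) (dst - out);
	return out;
}
```

The buffer may be larger than the decoded data, which matches the commit's observation that overallocating a short-lived result is of little concern.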
Nathan Bossart	84409ed640	Remove unused variable in generate-lwlocknames.pl. Oversight in commit `da952b415f`. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://postgr.es/m/aHpOgwuFQfcFMZ/B%40ip-10-97-1-34.eu-west-3.compute.internal	2025-07-18 11:27:19 -05:00
Nathan Bossart	161a3e8b68	pg_upgrade: Use COPY for large object metadata. Presently, pg_dump generates commands like SELECT pg_catalog.lo_create('5432'); ALTER LARGE OBJECT 5432 OWNER TO alice; GRANT SELECT ON LARGE OBJECT 5432 TO bob; for each large object. This is particularly slow at restore time, especially when there are tens or hundreds of millions of large objects. From reports and personal experience, such slow restores seem to be most painful when encountered during pg_upgrade. This commit teaches pg_dump to instead dump pg_largeobject_metadata and the corresponding pg_shdepend rows when in binary upgrade mode, i.e., pg_dump now generates commands like COPY pg_catalog.pg_largeobject_metadata (oid, lomowner, lomacl) FROM stdin; 5432 16384 {alice=rw/alice,bob=r/alice} \. COPY pg_catalog.pg_shdepend (dbid, classid, objid, objsubid, refclassid, refobjid, deptype) FROM stdin; 5 2613 5432 0 1260 16384 o 5 2613 5432 0 1260 16385 a \. Testing indicates the COPY approach can be significantly faster. To do any better, we'd probably need to find a way to copy/link pg_largeobject_metadata's files during pg_upgrade, which would be limited to upgrades from >= v16 (since commit `7b378237aa` changed the storage format for aclitem, which is used for pg_largeobject_metadata.lomacl). Note that this change only applies to binary upgrade mode (i.e., dumps initiated by pg_upgrade) since it inserts rows directly into catalogs. Also, this optimization can only be used for upgrades from >= v12 because pg_largeobject_metadata was created WITH OIDS in older versions, which prevents pg_dump from handling pg_largeobject_metadata.oid properly. With some extra effort, it might be possible to support upgrades from older versions, but the added complexity didn't seem worth it to support versions that will have been out-of-support for nearly 3 years by the time this change is released. 
Experienced hackers may remember that prior to v12, pg_upgrade copied/linked pg_largeobject_metadata's files (see commit `12a53c732c`). Besides the aforementioned storage format issues, this approach failed to transfer the relevant pg_shdepend rows, and pg_dump still had to generate an lo_create() command per large object so that creating the dependent comments and security labels worked. We could perhaps adopt a hybrid approach for upgrades from v16 and newer (i.e., generate lo_create() commands for each large object, copy/link pg_largeobject_metadata's files, and COPY the relevant pg_shdepend rows), but further testing is needed. Reported-by: Hannu Krosing <hannuk@google.com> Suggested-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Hannu Krosing <hannuk@google.com> Reviewed-by: Nitin Motiani <nitinmotiani@google.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAMT0RQSS-6qLH%2BzYsOeUbAYhop3wmQTkNmQpo5--QRDUR%2BqYmQ%40mail.gmail.com	2025-07-18 10:59:46 -05:00
Alexander Korotkov	4c5159a2d8	Fix a typo in the deparseArrayCoerceExpr() header comment Discussion: https://postgr.es/m/CAHewXNn%3D_ykCtcTw5SCfZ-eVr4m%2BCuc804rGeMsKuj%3DD4xpL4w%40mail.gmail.com Author: Tender Wang <tndrwang@gmail.com>	2025-07-18 18:40:07 +03:00
Dean Rasheed	5022ff250e	Fix concurrent update trigger issues with MERGE in a CTE. If a MERGE inside a CTE attempts an UPDATE or DELETE on a table with BEFORE ROW triggers, and a concurrent UPDATE or DELETE happens, the merge code would fail (crashing in the case of an UPDATE action, and potentially executing the wrong action for a DELETE action). This is the same issue that `9321c79c86` attempted to fix, except now for a MERGE inside a CTE. As noted in `9321c79c86`, what needs to happen is for the trigger code to exit early, returning the TM_Result and TM_FailureData information to the merge code, if a concurrent modification is detected, rather than attempting to do an EPQ recheck. The merge code will then do its own rechecking, and rescan the action list, potentially executing a different action in light of the concurrent update. In particular, the trigger code must never call ExecGetUpdateNewTuple() for MERGE, since that is bound to fail because MERGE has its own per-action projection information. Commit `9321c79c86` did this using estate->es_plannedstmt->commandType in the trigger code to detect that a MERGE was being executed, which is fine for a plain MERGE command, but does not work for a MERGE inside a CTE. Fix by passing that information to the trigger code as an additional parameter passed to ExecBRUpdateTriggers() and ExecBRDeleteTriggers(). Back-patch as far as v17 only, since MERGE cannot appear inside a CTE prior to that. Additionally, take care to preserve the trigger ABI in v17 (though not in v18, which is still in beta). Bug: #18986 Reported-by: Yaroslav Syrytsia <me@ys.lc> Author: Dean Rasheed <dean.a.rasheed@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/18986-e7a8aac3d339fa47@postgresql.org Backpatch-through: 17	2025-07-18 09:55:43 +01:00
Alexander Korotkov	62c3b4cd9d	Support for deparsing of ArrayCoerceExpr node in contrib/postgres_fdw When using a prepared statement to select data from a PostgreSQL foreign table (postgres_fdw) with the "field = ANY($1)" expression, the operation is not pushed down when an implicit type cast is applied, and a generic plan is used. This commit resolves the issue by supporting the push-down of ArrayCoerceExpr, which is used in this case. The support is quite straightforward and similar to other nodes, such as RelabelType. Discussion: https://postgr.es/m/4f0cea802476d23c6e799512ffd17aff%40postgrespro.ru Author: Alexander Pyhalov <a.pyhalov@postgrespro.ru> Reviewed-by: Maxim Orlov <orlovmg@gmail.com> Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>	2025-07-18 10:52:05 +03:00
Nathan Bossart	b597ae6cc1	Add a test harness for the binary heap code. binaryheap is heavily used and already has decent test coverage, but it lacks dedicated tests for its correctness. This commit changes that. Author: Aleksander Alekseev <aleksander@tigerdata.com> Discussion: https://postgr.es/m/CAJ7c6TMwp%2Bmb8MMoi%3DSMVMso2hYecoVu2Pwf2EOkesq0MiSKxw%40mail.gmail.com	2025-07-17 16:32:10 -05:00
Tom Lane	daf9bdc47d	Fix PQport to never return NULL unless the connection is NULL. This is the documented behavior, and it worked that way before v10. However, addition of the connhost[] array created cases where conn->connhost[conn->whichhost].port is NULL. The rest of libpq is careful to substitute DEF_PGPORT[_STR] for a null or empty port string, but we failed to do so here, leading to possibly returning NULL. As of v18 that causes psql's \conninfo command to segfault. Older psql versions avoid that, but it's pretty likely that other clients have trouble with this, so we'd better back-patch the fix. In stable branches, just revert to our historical behavior of returning an empty string when there was no user-given port specification. However, it seems substantially more useful and indeed more correct to hand back DEF_PGPORT_STR in such cases, so let's make v18 and master do that. Author: Daniele Varrazzo <daniele.varrazzo@gmail.com> Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CA+mi_8YTS8WPZPO0PAb2aaGLwHuQ0DEQRF0ZMnvWss4y9FwDYQ@mail.gmail.com Backpatch-through: 13	2025-07-17 12:46:57 -04:00
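The v18/master behavior described here, substituting the default port for a null or empty port string, can be sketched roughly as follows. The struct and function names are hypothetical, pared-down stand-ins rather than libpq's real definitions, and SKETCH_DEFAULT_PORT_STR merely plays the role of DEF_PGPORT_STR.

```c
#include <stddef.h>

#define SKETCH_DEFAULT_PORT_STR "5432"	/* stand-in for DEF_PGPORT_STR */

/* Hypothetical, simplified stand-ins for libpq's connection structures. */
typedef struct SketchHost
{
	const char *port;			/* NULL or "" when the user gave no port */
} SketchHost;

typedef struct SketchConn
{
	SketchHost *connhost;
	int			whichhost;
} SketchConn;

/* Return NULL only for a NULL connection; otherwise never return NULL. */
const char *
sketch_port(const SketchConn *conn)
{
	const char *port;

	if (conn == NULL)
		return NULL;
	if (conn->connhost == NULL)
		return SKETCH_DEFAULT_PORT_STR;

	port = conn->connhost[conn->whichhost].port;
	if (port == NULL || port[0] == '\0')
		return SKETCH_DEFAULT_PORT_STR;
	return port;
}
```

Per the commit message, the stable branches instead return an empty string when no port was specified; that variant would replace the default constant with "" in this sketch.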
Álvaro Herrera	b8926a5b4b	Remove assertion from PortalRunMulti We have an assertion to ensure that a command tag has been assigned by the time we're done executing, but if we happen to execute a command with no queries, the assertion would fail. Per discussion, rather than contort things to get a tag assigned, just remove the assertion. Oversight in `2f9661311b`. That commit also retained a comment that explained logic that had been adjacent to it but diffused into various places, leaving none apt to keep part of the comment. Remove that part, and rewrite what remains for extra clarity. Bug: #18984 Backpatch-through: 13 Reported-by: Aleksander Alekseev <aleksander@tigerdata.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Michaël Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/18984-0f4778a6599ac3ae@postgresql.org	2025-07-17 17:40:22 +02:00
Nathan Bossart	26cc96d452	doc: Add note about how to use pg_overexplain. This commit adds a note to the pg_overexplain page that describes how to use it (LOAD, session_preload_libraries, or shared_preload_libraries). The new text is mostly lifted from the auto_explain page. We should probably consider centralizing this information in the future. While at it, add a missing "module" to the opening sentence. Reviewed-by: "David G. Johnston" <david.g.johnston@gmail.com> Reviewed-by: Robert Treat <rob@xzilla.net> Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com> Discussion: https://postgr.es/m/aHVWKM8l8kLlZzgv%40nathan Backpatch-through: 18	2025-07-17 10:25:59 -05:00
Amit Langote	afa5c365ec	Remove duplicate line In `231b7d670b`, while copy-pasting some code into ExecEvalJsonCoercionFinish(), I (amitlan) accidentally introduced a duplicate line. Remove it. Reported-by: Jian He <jian.universality@gmail.com> Discussion: https://postgr.es/m/CACJufxHcf=BpmRAJcjgfjOUfV76MwKnyz1x3ErXsWL26EAFmng@mail.gmail.com	2025-07-17 14:37:06 +09:00
Michael Paquier	74a3fc36f3	Split regression tests for TOAST compression methods into two files The regression tests for TOAST compression methods are split into two independent files: one specific to LZ4 and interactions between two different TOAST compression methods, now called compression_lz4, and a second one for the "core" cases where only pglz is required. This saves 300 lines in diffs coming from the alternate output of compression.sql, required for builds where lz4 is not available. The new test is skipped if the build does not support LZ4 compression, relying on an \if and the values reported in pg_settings for the GUC default_toast_compression, "lz4" being available only under USE_LZ4. Another benefit of this split is that this facilitates the addition of more compression methods for TOAST, which are under discussion. Note the trick added for the tests of the GUC default_toast_compression, where VERBOSITY = terse is used to avoid the HINT printing the lists of values available in the GUC, which are environment-dependent. This makes compression.sql independent of the availability of LZ4. The code coverage of toast_compression.c is slightly improved, increased from 89% to 91%, with one new case covered in lz4_compress_datum() for incompressible data. Author: Nikhil Kumar Veldanda <veldanda.nikhilkumar17@gmail.com> Co-authored-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/aDlcU-ym9KfMj9sG@paquier.xyz	2025-07-17 14:08:55 +09:00
Michael Paquier	a493e741d3	Fix inconsistent LWLock tranche names for MultiXact* The terms used in wait_event_names.txt and lwlock.c were inconsistent for MultiXactOffsetSLRU and MultiXactMemberSLRU, which could cause joins between pg_wait_events and pg_stat_activity to fail. lwlock.c is adjusted in this commit to what the historical name of the event has always been, and what is documented. Oversight in `53c2a97a92`. `08b9b9e043` has fixed a similar inconsistency some time ago. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/aHdxN0D0hKXzHFQG@ip-10-97-1-34.eu-west-3.compute.internal Backpatch-through: 17	2025-07-17 09:30:26 +09:00
Daniel Gustafsson	f6ffbeda00	doc: Add example file for COPY The paragraph for introducing INSERT and COPY discussed how a file could be used for bulk loading with COPY, without actually showing what the file would look like. This adds a programlisting for the file contents. Backpatch to all supported branches since this example has lacked the file contents since PostgreSQL 7.2. Author: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/158017814191.19852.15019251381150731439@wrigleys.postgresql.org Backpatch-through: 13	2025-07-17 00:21:18 +02:00
Jeff Davis	5e6e42e44f	Force LC_COLLATE to C in postmaster. Avoid dependence on setlocale(). strcoll(), etc., are not called directly; all collation-sensitive calls should go through pg_locale.c and use the appropriate provider. By setting LC_COLLATE to C, we avoid accidentally depending on libc behavior when using a different provider. No behavior change in the backend, but it's possible that some extensions will be affected. Such extensions should be updated to use the pg_locale_t APIs. Discussion: https://postgr.es/m/9875f7f9-50f1-4b5d-86fc-ee8b03e8c162@eisentraut.org Reviewed-by: Peter Eisentraut <peter@eisentraut.org>	2025-07-16 14:13:18 -07:00
Álvaro Herrera	0858f0f96e	Fix dumping of comments on invalid constraints on domains We skip dumping constraints together with domains if they are invalid ('separate') so that they appear after data -- but their comments were dumped together with the domain definition, which in effect leads to the comment being dumped when the constraint does not yet exist. Delay them in the same way. Oversight in 7eca575d1c28; backpatch all the way back. Author: jian he <jian.universality@gmail.com> Discussion: https://postgr.es/m/CACJufxF_C2pe6J_+nPr6C5jf5rQnbYP8XOKr4HM8yHZtp2aQqQ@mail.gmail.com	2025-07-16 19:22:53 +02:00
Peter Geoghegan	4c8ad67a98	nbtree: Use only one notnullkey ScanKeyData. _bt_first need only store one ScanKeyData struct on the stack for the purposes of building an IS NOT NULL key based on an implied NOT NULL constraint. We don't need INDEX_MAX_KEYS-many ScanKeyData structs. This saves us a little over 2KB in stack space. It's possible that this has some performance benefit. It also seems simpler and more direct. It isn't possible for more than a single index attribute to need its own implied IS NOT NULL key: the first such attribute/IS NOT NULL key always makes _bt_first stop adding additional boundary keys to startKeys[]. Using INDEX_MAX_KEYS-many ScanKeyData entries was (at best) misleading. Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: Mircea Cadariu <cadariu.mircea@gmail.com> Discussion: https://postgr.es/m/CAH2-Wzm=1kJMSZhhTLoM5BPbwQNWxUj-ynOEh=89ptDZAVgauw@mail.gmail.com	2025-07-16 13:05:44 -04:00
Jeff Davis	48c2c7b4b4	pg_dumpall: Skip global objects with --statistics-only or --no-schema. Previously, pg_dumpall would still dump global objects such as roles and tablespaces even when --statistics-only or --no-schema was specified. Since these global objects are treated as schema-level data, they should be skipped in these cases. This commit fixes the issue by ensuring that global objects are not dumped when either --statistics-only or --no-schema is used. Author: Fujii Masao <masao.fujii@oss.nttdata.com> Reviewed-by: Corey Huinker <corey.huinker@gmail.com> Discussion: https://postgr.es/m/08129593-6f3c-4fb9-94b7-5aa2eefb99b0@oss.nttdata.com Backpatch-through: 18	2025-07-16 09:57:12 -07:00
Nathan Bossart	ecc5161a0b	psql: Fix note on project naming in output of \copyright. This adjusts the wording to match the changes in commits `5987553fde`, `a233a603ba`, and pgweb commit 2d764dbc08. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://postgr.es/m/aHVo791guQR6uqwT%40nathan Backpatch-through: 13	2025-07-16 11:50:34 -05:00
Michael Paquier	1dbe6f7667	Refactor non-supported compression error message in toast_compression.c This code used a NO_LZ4_SUPPORT() macro to issue an error in the code paths where LZ4 [de]compression is attempted but the build does not support it. This commit refactors the code to use a more flexible error message so that it can be used for other compression methods, where the method is given as input to the macro. Extracted from a larger patch by the same author. Author: Nikhil Kumar Veldanda <veldanda.nikhilkumar17@gmail.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Discussion: https://postgr.es/m/CAFAfj_HX84EK4hyRYw50AOHOcdVi-+FFwAAPo7JHx4aShCvunQ@mail.gmail.com	2025-07-16 11:59:22 +09:00
Fujii Masao	b8341ae856	pgoutput: Initialize missing default for "origin" parameter. The pgoutput plugin initializes optional parameters like "binary" with default values at the start of processing. However, the "origin" parameter was previously missed and left without explicit initialization. Although the PGOutputData struct, which holds these settings, is zero-initialized at allocation (resulting in publish_no_origin field for "origin" parameter being false by default), this default was not set explicitly, unlike other parameters. This commit adds explicit initialization of the "origin" parameter to ensure consistency and clarity in how defaults are handled. Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Euler Taveira <euler@eulerto.com> Discussion: https://postgr.es/m/d2790f10-238d-4cb5-a743-d9d2a9dd900f@oss.nttdata.com	2025-07-16 10:31:51 +09:00
Fujii Masao	d8425811b6	doc: Document default values for pgoutput options in protocol.sgml. The pgoutput plugin options are described in the logical streaming replication protocol documentation, but their default values were previously not mentioned. This made it less convenient for users, for example, when specifying those options to use pg_recvlogical with pgoutput plugin. This commit adds the explanations of the default values for pgoutput options to improve clarity and usability. Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Euler Taveira <euler@eulerto.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/d2790f10-238d-4cb5-a743-d9d2a9dd900f@oss.nttdata.com	2025-07-16 08:51:04 +09:00
Fujii Masao	09fcc652fe	doc: Fix confusing description of streaming option in START_REPLICATION. Previously, the documentation described the streaming option as a boolean, which is outdated since it's no longer a boolean as of protocol version 4. This could confuse users. This commit updates the description to remove the "boolean" reference and clearly list the valid values for the streaming option. Back-patch to v16, where the streaming option changed to a non-boolean. Author: Euler Taveira <euler@eulerto.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/8d21fb98-5c25-4dee-8387-e5a62b01ea7d@app.fastmail.com Backpatch-through: 16	2025-07-16 08:32:52 +09:00
Fujii Masao	7c3b591af3	doc: Clarify that total_vacuum_time excludes VACUUM FULL. The last_vacuum and vacuum_count fields in pg_stat_all_tables already state that they do not include VACUUM FULL. However, total_vacuum_time, which also excludes VACUUM FULL, did not mention this. This could mislead users into thinking VACUUM FULL time is included. To address this, this commit updates the documentation for pg_stat_all_tables to explicitly state that total_vacuum_time does not count VACUUM FULL. Back-patched to v18, where total_vacuum_time was introduced. Additionally, this commit clarifies that n_ins_since_vacuum also excludes VACUUM FULL. Although n_ins_since_vacuum was added in v13, we are not back-patching this change to stable branches, as it is a documentation improvement, not a bug fix. Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Robert Treat <rob@xzilla.net> Reviewed-by: David G. Johnston <david.g.johnston@gmail.com> Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at> Discussion: https://postgr.es/m/2ac375d1-591b-4f1b-a2af-f24335567866@oss.nttdata.com Backpatch-through: 18	2025-07-16 08:03:36 +09:00
Tom Lane	5fe55a0fe4	Doc: clarify description of regexp fields in pg_ident.conf. The grammar was a little shaky and confusing here, so word-smith it a bit. Also, adjust the comments in pg_ident.conf.sample to use the same terminology as the SGML docs, in particular "DATABASE-USERNAME" not "PG-USERNAME". Back-patch appropriate subsets. I did not risk changing pg_ident.conf.sample in released branches, but it still seems OK to change it in v18. Reported-by: Alexey Shishkin <alexey.shishkin@enterprisedb.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: David G. Johnston <david.g.johnston@gmail.com> Discussion: https://postgr.es/m/175206279327.3157504.12519088928605422253@wrigleys.postgresql.org Backpatch-through: 13	2025-07-15 18:53:00 -04:00
Tom Lane	2a3a396432	Clarify the ra != rb case in compareJsonbContainers(). It's impossible to reach this case with either ra or rb being WJB_DONE, because our earlier checks that the structure and length of the inputs match should guarantee that we reach their ends simultaneously. However, the comment completely fails to explain this, and the Asserts don't cover it either. The comment is pretty obscure anyway, so rewrite it, and extend the Asserts to reject WJB_DONE. This is only cosmetic, so no need for back-patch. Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/0c623e8a204187b87b4736792398eaf1@postgrespro.ru	2025-07-15 18:21:12 -04:00
Tom Lane	aad1617b76	Silence uninitialized-value warnings in compareJsonbContainers(). Because not every path through JsonbIteratorNext() sets val->type, some compilers complain that compareJsonbContainers() is comparing possibly-uninitialized values. The paths that don't set it return WJB_DONE, WJB_END_ARRAY, or WJB_END_OBJECT, so it's clear by manual inspection that the "(ra == rb)" code path is safe, and indeed we aren't seeing warnings about that. But the (ra != rb) case is much less obviously safe. In Assert-enabled builds it seems that the asserts rejecting WJB_END_ARRAY and WJB_END_OBJECT persuade gcc 15.x not to warn, which makes little sense because it's impossible to believe that the compiler can prove of its own accord that ra/rb aren't WJB_DONE here. (In fact they never will be, so the code isn't wrong, but why is there no warning?) Without Asserts, the appearance of warnings is quite unsurprising. We discussed fixing this by converting those two Asserts into pg_assume, but that seems not very satisfactory when it's so unclear why the compiler is or isn't warning: the warning could easily reappear with some other compiler version. Let's fix it in a less magical, more future-proof way by changing JsonbIteratorNext() so that it always does set val->type. The cost of that should be pretty negligible, and it makes the function's API spec less squishy. Reported-by: Erik Rijkers <er@xs4all.nl> Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/988bf1bc-3f1f-99f3-bf98-222f1cd9dc5e@xs4all.nl Discussion: https://postgr.es/m/0c623e8a204187b87b4736792398eaf1@postgrespro.ru Backpatch-through: 13	2025-07-15 18:11:18 -04:00
Tom Lane	8ffd9ac3b2	Doc: clarify description of current-date/time functions. Minor wordsmithing of the func.sgml paragraph describing statement_timestamp() and allied functions: don't switch between "statement" and "command" when those are being used to mean about the same thing. Also, add some text to protocol.sgml describing the perhaps-surprising behavior these functions have in a multi-statement Query message. Reported-by: P M <petermittere@gmail.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at> Reviewed-by: David G. Johnston <david.g.johnston@gmail.com> Discussion: https://postgr.es/m/175223006802.3157505.14764328206246105568@wrigleys.postgresql.org Backpatch-through: 13	2025-07-15 16:35:42 -04:00
Fujii Masao	ff0bcb248e	psql: Fix tab-completion after GRANT/REVOKE on LARGE OBJECT and FOREIGN SERVER. Previously, when pressing Tab after GRANT or REVOKE ... ON LARGE OBJECT or ON FOREIGN SERVER, TO or FROM was incorrectly suggested by psql's tab-completion. This was not appropriate, as those clauses are not valid at that point. This commit fixes the issue by preventing TO and FROM from being offered immediately after those specific GRANT/REVOKE statements. Author: Yugo Nagata <nagata@sraoss.co.jp> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/20250408122857.b2b06dde4e6a08290af02336@sraoss.co.jp	2025-07-15 18:51:17 +09:00
Michael Paquier	006fc975a2	Fix comments in index.c This comment paragraph referred to text_eq(), but the name of the function in charge of "text" comparisons is called texteq(). Author: Jian He <jian.universality@gmail.com> Discussion: https://postgr.es/m/CACJufxHL--XNcCCO1LgKsygzYGiVHZMfTcAxOSG8+ezxWtjddw@mail.gmail.com	2025-07-15 16:05:59 +09:00
Fujii Masao	88a658a42e	amcheck: Improve error message for partitioned index target. Previously, amcheck could produce a misleading error message when a partitioned index was passed to functions like bt_index_check(). For example, bt_index_check() with a partitioned btree index produced: ERROR: expected "btree" index as targets for verification DETAIL: Relation ... is a btree index. Reporting "expected btree index as targets" even when the specified index was a btree was confusing. In this case, the function should fail because the specified partitioned index is not a valid target. This commit improves the error reporting to better reflect the actual issue. Now, when bt_index_check() is called with a partitioned index, the error message is: ERROR: expected index as targets for verification DETAIL: This operation is not supported for partitioned indexes. This commit also applies the following minor changes: - Simplifies index_checkable() by using get_am_name() to retrieve the access method name. - Changes index_checkable() from extern to static, as it is only used in verify_common.c. - Updates the error code for invalid indexes to ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE, aligning with usage in similar modules like pgstattuple. Author: Masahiro Ikeda <ikedamsh@oss.nttdata.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/8829854bbfc8635ddecd0846bb72dfda@oss.nttdata.com	2025-07-14 20:05:10 +09:00
Michael Paquier	6b1c4d326b	psql: Add variable SERVICEFILE This new psql variable can be used to check which service file has been used for a connection. Like other variables, this can be set in a PROMPT or reported by an \echo, like these commands: \echo :SERVICEFILE \set PROMPT1 '=(%:SERVICEFILE:)%# ' This relies on commits `092f3c63ef` and `fef6da9e9c` to retrieve this information from the connection's PQconninfoOption. Author: Ryo Kanbayashi <kanbayashi.dev@gmail.com> Discussion: https://postgr.es/m/CAKkG4_nCjx3a_F3gyXHSPWxD8Sd8URaM89wey7fG_9g7KBkOCQ@mail.gmail.com	2025-07-14 09:08:46 +09:00
Tom Lane	3c4e26a62c	In username-map substitution, cope with more than one \1. If the system-name field of a pg_ident.conf line is a regex containing capturing parentheses, you can write \1 in the user-name field to represent the captured part of the system name. But what happens if you write \1 more than once? The only reasonable expectation IMO is that each \1 gets replaced, but presently our code replaces only the first. Fix that. Also, improve the tests for this feature to exercise cases where a non-empty string needs to be substituted for \1. The previous testing didn't inspire much faith that it was verifying correct operation of the substitution code. Given the lack of field complaints about this, I don't feel a need to back-patch. Reported-by: David G. Johnston <david.g.johnston@gmail.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAKFQuwZu6kZ8ZPvJ3pWXig+6UX4nTVK-hdL_ZS3fSdps=RJQQQ@mail.gmail.com	2025-07-13 13:52:32 -04:00
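The behavior described in `3c4e26a62c` (every \1 in the user-name field is replaced, not only the first) can be illustrated with a minimal sketch. This is not PostgreSQL's actual code: the function name `expand_backref_all` and its malloc-based interface are hypothetical, chosen only to show the replace-all loop.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not the PostgreSQL implementation): return a newly
 * malloc'd copy of "pattern" in which every occurrence of the two-character
 * sequence \1 is replaced by "match".  The caller frees the result.
 */
char *
expand_backref_all(const char *pattern, const char *match)
{
    size_t      match_len;
    size_t      count = 0;
    const char *p;
    char       *result;
    char       *out;

    assert(pattern != NULL && match != NULL);
    match_len = strlen(match);

    /* First pass: count occurrences so the output can be sized exactly. */
    for (p = pattern; (p = strstr(p, "\\1")) != NULL; p += 2)
        count++;

    result = malloc(strlen(pattern) + count * match_len - count * 2 + 1);
    if (result == NULL)
        return NULL;

    /* Second pass: copy literal text and substitute each \1 in turn. */
    out = result;
    while (*pattern != '\0')
    {
        if (pattern[0] == '\\' && pattern[1] == '1')
        {
            memcpy(out, match, match_len);
            out += match_len;
            pattern += 2;
        }
        else
            *out++ = *pattern++;
    }
    *out = '\0';
    return result;
}
```

With this sketch, a user-name field of `\1@\1.example` and a captured string `alice` would expand to `alice@alice.example`; code that stops after the first \1 would leave the second one unexpanded.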
Michael Paquier	092f3c63ef	libpq: Add "servicefile" connection option This commit adds the possibility to specify a service file in a connection string, using a new option called "servicefile". The service file to parse is chosen in this order of priority: - The servicefile connection option. - Environment variable PGSERVICEFILE. - Default path, depending on the HOME environment. Note that in the last default case, we need to fill in "servicefile" for the connection's PQconninfoOption to let clients know which service file has been used for the connection. Some TAP tests are added, with a few tweaks required for Windows when using URIs or connection option values, for the location paths. Author: Torsten Förtsch <tfoertsch123@gmail.com> Co-authored-by: Ryo Kanbayashi <kanbayashi.dev@gmail.com> Discussion: https://postgr.es/m/CAKkG4_nCjx3a_F3gyXHSPWxD8Sd8URaM89wey7fG_9g7KBkOCQ@mail.gmail.com	2025-07-13 16:52:19 +09:00
Nathan Bossart	8893c3ab36	Remove XLogCtl->ckptFullXid. A few code paths set this variable, but its value is never used. Oversight in commit `2fc7af5e96`. Reviewed-by: Aleksander Alekseev <aleksander@tigerdata.com> Discussion: https://postgr.es/m/aHFyE1bs9YR93dQ1%40nathan	2025-07-12 14:34:57 -05:00
Tom Lane	84ce258707	Replace float8 with int in date2isoweek() and date2isoyear(). The values of the "result" variables in these functions are always integers; using a float8 variable accomplishes nothing except to incur useless conversions to and from float. While that wastes a few nanoseconds, these functions aren't all that time-critical. But it seems worth fixing to remove possible reader confusion. Also, in the case of date2isoyear(), "result" is a very poorly chosen variable name because it is not the function's result. Rename it to "week", and do the same in date2isoweek() for consistency. Since this is mostly cosmetic, there seems little need for back-patch. Author: Sergey Fukanchik <s.fukanchik@postgrespro.ru> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/6323a-68726500-1-7def9d00@137821581	2025-07-12 11:50:37 -04:00
Andres Freund	f2c87ac04e	Remove long-unused TransactionIdIsActive() TransactionIdIsActive() has not been used since `bb38fb0d43`, in 2014. There are no known uses in extensions either and it's hard to see valid uses for it. Therefore remove TransactionIdIsActive(). Discussion: https://postgr.es/m/odgftbtwp5oq7cxjgf4kjkmyq7ypoftmqy7eqa7w3awnouzot6@hrwnl5tdqrgu	2025-07-12 11:00:44 -04:00
Thomas Munro	b8e1f2d96b	aio: Fix configuration reload in IO workers. method_worker.c installed SignalHandlerForConfigReload, but it failed to actually process reload requests. That hasn't yet produced any concrete problem reports in terms of GUC changes it should have cared about in v18, but it was inconsistent. It did cause problems for a couple of patches in development that need IO workers to react to ALTER SYSTEM + pg_reload_conf(). Fix extracted from one of those patches. Back-patch to 18. Reported-by: Dmitry Dolgov <9erthalion6@gmail.com> Discussion: https://postgr.es/m/sh5uqe4a4aqo5zkkpfy5fobe2rg2zzouctdjz7kou4t74c66ql%40yzpkxb7pgoxf	2025-07-12 16:33:02 +12:00
Thomas Munro	177c1f0593	aio: Remove obsolete IO worker ID references. In an ancient ancestor of this code, the postmaster assigned IDs to IO workers. Now it tracks them in an unordered array and doesn't know their IDs, so it might be confusing to readers that it still referred to their indexes as IDs. No change in behavior, just variable name and error message cleanup. Back-patch to 18. Discussion: https://postgr.es/m/CA%2BhUKG%2BwbaZZ9Nwc_bTopm4f-7vDmCwLk80uKDHj9mq%2BUp0E%2Bg%40mail.gmail.com	2025-07-12 14:44:22 +12:00
Thomas Munro	01d618bcd7	aio: Regularize IO worker internal naming. Adopt PgAioXXX convention for pgaio module type names. Rename a function that didn't use a pgaio_worker_ submodule prefix. Rename the internal submit function's arguments to match the indirectly relevant function pointer declaration and nearby examples. Rename the array of handle IDs in PgAioSubmissionQueue to sqes, a term of art seen in the systems it emulates, also clarifying that they're not IO handle pointers as the old name might imply. No change in behavior, just type, variable and function name cleanup. Back-patch to 18. Discussion: https://postgr.es/m/CA%2BhUKG%2BwbaZZ9Nwc_bTopm4f-7vDmCwLk80uKDHj9mq%2BUp0E%2Bg%40mail.gmail.com	2025-07-12 14:44:09 +12:00
Thomas Munro	40e105042a	Fix stale idle flag when IO workers exit. Otherwise we could choose a worker that has exited and crash while trying to wake it up. Back-patch to 18. Reported-by: Tomas Vondra <tomas@vondra.me> Reported-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/t5aqjhkj6xdkido535pds7fk5z4finoxra4zypefjqnlieevbg%40357aaf6u525j	2025-07-12 13:11:47 +12:00
Tom Lane	64840e4624	Fix inconsistent quoting of role names in ACLs. getid() and putid(), which parse and deparse role names within ACL input/output, applied isalnum() to see if a character within a role name requires quoting. They did this even for non-ASCII characters, which is problematic because the results would depend on encoding, locale, and perhaps even platform. So it's possible that putid() could elect not to quote some string that, later in some other environment, getid() will decide is not a valid identifier, causing dump/reload or similar failures. To fix this in a way that won't risk interoperability problems with unpatched versions, make getid() treat any non-ASCII as a legitimate identifier character (hence not requiring quotes), while making putid() treat any non-ASCII as requiring quoting. We could remove the resulting excess quoting once we feel that no unpatched servers remain in the wild, but that'll be years. A lesser problem is that getid() did the wrong thing with an input consisting of just two double quotes (""). That has to represent an empty string, but getid() read it as a single double quote instead. The case cannot arise in the normal course of events, since we don't allow empty-string role names. But let's fix it while we're here. Although we've not heard field reports of problems with non-ASCII role names, there's clearly a hazard there, so back-patch to all supported versions. Reported-by: Peter Eisentraut <peter@eisentraut.org> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/3792884.1751492172@sss.pgh.pa.us Backpatch-through: 13	2025-07-11 18:50:13 -04:00
Jacob Champion	990571a08b	oauth: Run Autoconf tests with correct compiler flags Commit `b0635bfda` split off the CPPFLAGS/LDFLAGS/LDLIBS for libcurl into their own separate Makefile variables, but I neglected to move the existing AC_CHECKs for Curl into a place where they would make use of those variables. They instead tested the system libcurl, which 1) is unhelpful if a different Curl is being used for the build and 2) will fail the build entirely if no system libcurl exists. Correct the order of operations here. Reported-by: Ivan Kush <ivan.kush@tantorlabs.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Ivan Kush <ivan.kush@tantorlabs.com> Discussion: https://postgr.es/m/8a611028-51a1-408c-b592-832e2e6e1fc9%40tantorlabs.com Backpatch-through: 18	2025-07-11 10:06:41 -07:00
Nathan Bossart	8d33fbacba	Add FLUSH_UNLOGGED option to CHECKPOINT command. This option, which is disabled by default, can be used to request the checkpoint also flush dirty buffers of unlogged relations. As with the MODE option, the server may consolidate the options for concurrently requested checkpoints. For example, if one session uses (FLUSH_UNLOGGED FALSE) and another uses (FLUSH_UNLOGGED TRUE), the server may perform one checkpoint with FLUSH_UNLOGGED enabled. Author: Christoph Berg <myon@debian.org> Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at> Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com> Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com> Discussion: https://postgr.es/m/aDnaKTEf-0dLiEfz%40msg.df7cb.de	2025-07-11 11:51:25 -05:00
Nathan Bossart	2f698d7f4b	Add MODE option to CHECKPOINT command. This option may be set to FAST (the default) to request the checkpoint be completed as fast as possible, or SPREAD to request the checkpoint be spread over a longer interval (based on the checkpoint-related configuration parameters). Note that the server may consolidate the options for concurrently requested checkpoints. For example, if one session requests a "fast" checkpoint and another requests a "spread" checkpoint, the server may perform one "fast" checkpoint. Author: Christoph Berg <myon@debian.org> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com> Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at> Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com> Discussion: https://postgr.es/m/aDnaKTEf-0dLiEfz%40msg.df7cb.de	2025-07-11 11:51:25 -05:00
Nathan Bossart	a4f126516e	Add option list to CHECKPOINT command. This commit adds the boilerplate code for supporting a list of options in CHECKPOINT commands. No actual options are supported yet, but follow-up commits will add support for MODE and FLUSH_UNLOGGED. While at it, this commit refactors the code for executing CHECKPOINT commands to its own function since it's about to become significantly larger. Author: Christoph Berg <myon@debian.org> Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com> Discussion: https://postgr.es/m/aDnaKTEf-0dLiEfz%40msg.df7cb.de	2025-07-11 11:51:25 -05:00
Nathan Bossart	bb938e2c3c	Rename CHECKPOINT_IMMEDIATE to CHECKPOINT_FAST. The new name more accurately reflects the effects of this flag on a requested checkpoint. Checkpoint-related log messages (i.e., those controlled by the log_checkpoints configuration parameter) will now say "fast" instead of "immediate", too. Likewise, references to "immediate" checkpoints in the documentation have been updated to say "fast". This is preparatory work for a follow-up commit that will add a MODE option to the CHECKPOINT command. Author: Christoph Berg <myon@debian.org> Discussion: https://postgr.es/m/aDnaKTEf-0dLiEfz%40msg.df7cb.de	2025-07-11 11:51:25 -05:00
Nathan Bossart	cd8324cc89	Rename CHECKPOINT_FLUSH_ALL to CHECKPOINT_FLUSH_UNLOGGED. The new name more accurately reflects the effects of this flag on a requested checkpoint. Checkpoint-related log messages (i.e., those controlled by the log_checkpoints configuration parameter) will now say "flush-unlogged" instead of "flush-all", too. This is preparatory work for a follow-up commit that will add a FLUSH_UNLOGGED option to the CHECKPOINT command. Author: Christoph Berg <myon@debian.org> Discussion: https://postgr.es/m/aDnaKTEf-0dLiEfz%40msg.df7cb.de	2025-07-11 11:51:25 -05:00
Tom Lane	f25792c541	Force LC_NUMERIC to C while running TAP tests. We already forced LC_MESSAGES to C in order to get consistent message output, but that isn't enough to stabilize messages that include %f or similar formatting. I'm a bit surprised that this hasn't come up before. Perhaps we ought to back-patch this change, but I'll refrain for now. Reported-by: Bernd Helmle <mailings@oopsware.de> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/6f024eaa7885eddf5e0eb4ba1d095fbc7146519b.camel@oopsware.de	2025-07-11 12:49:07 -04:00
Amit Kapila	72e6c08fea	Fix the handling of two GUCs during upgrade. Previously, the check_hook functions for max_slot_wal_keep_size and idle_replication_slot_timeout would incorrectly raise an ERROR for values set in postgresql.conf during upgrade, even though those values were not actively used in the upgrade process. To prevent logical slot invalidation during upgrade, we used to set special values for these GUCs. Now, instead of relying on those values, we directly prevent WAL removal and logical slot invalidation caused by max_slot_wal_keep_size and idle_replication_slot_timeout. Note: PostgreSQL 17 does not include the idle_replication_slot_timeout GUC, so related changes were not backported. BUG #18979 Reported-by: jorsol <jorsol@gmail.com> Author: Dilip Kumar <dilipbalaut@gmail.com> Reviewed by: vignesh C <vignesh21@gmail.com> Reviewed by: Alvaro Herrera <alvherre@alvh.no-ip.org> Backpatch-through: 17, where it was introduced Discussion: https://postgr.es/m/219561.1751826409@sss.pgh.pa.us Discussion: https://postgr.es/m/18979-a1b7fdbb7cd181c6@postgresql.org	2025-07-11 10:46:43 +05:30
Tatsuo Ishii	4cff01c4a3	Doc: fix outdated protocol version. In the description of StartupMessage, the protocol version was left at 3.0. Instead of just updating it, this commit removes the hard-coded protocol version and shows the numbers as an example. This way, that part of the doc does not need to be updated when the version changes in the future. Author: Jelte Fennema-Nio <postgres@jeltef.nl> Reviewed-by: Tatsuo Ishii <ishii@postgresql.org> Reviewed-by: Aleksander Alekseev <aleksander@timescale.com> Discussion: https://postgr.es/m/20250626.155608.568829483879866256.ishii%40postgresql.org	2025-07-11 10:34:57 +09:00
Fujii Masao	110e6dcaa6	doc: Clarify meaning of "idle" in idle_replication_slot_timeout. This commit updates the documentation to clarify that "idle" in idle_replication_slot_timeout means the replication slot is inactive, that is, not currently used by any replication connection. Without this clarification, "idle" could be misinterpreted to mean that the slot is not advancing or that no data is being streamed, even if a connection exists. Back-patch to v18 where idle_replication_slot_timeout was added. Author: Laurenz Albe <laurenz.albe@cybertec.at> Reviewed-by: David G. Johnston <david.g.johnston@gmail.com> Reviewed-by: Gunnar Morling <gunnar.morling@googlemail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/CADGJaX_0+FTguWpNSpgVWYQP_7MhoO0D8=cp4XozSQgaZ40Odw@mail.gmail.com Backpatch-through: 18	2025-07-11 08:44:32 +09:00
Fujii Masao	05dedf43d3	Change unit of idle_replication_slot_timeout to seconds. Previously, the idle_replication_slot_timeout parameter used minutes as its unit, based on the assumption that values would typically exceed one minute in production environments. However, this caused unexpected behavior: specifying a value below 30 seconds would round down to 0, effectively disabling the timeout. This could be surprising to users. To allow finer-grained control and avoid such confusion, this commit changes the unit of idle_replication_slot_timeout to seconds. Larger values can still be specified easily using standard time suffixes, for example, '24h' for 24 hours. Back-patch to v18 where idle_replication_slot_timeout was added. Reported-by: Gunnar Morling <gunnar.morling@googlemail.com> Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at> Reviewed-by: David G. Johnston <david.g.johnston@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CADGJaX_0+FTguWpNSpgVWYQP_7MhoO0D8=cp4XozSQgaZ40Odw@mail.gmail.com Backpatch-through: 18	2025-07-11 08:39:24 +09:00
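The surprise described in `05dedf43d3` comes down to rounding a seconds value to a whole number of minutes. A minimal sketch of that arithmetic follows; the function name is hypothetical and round-half-up is an assumption, since the commit message only states that values below 30 seconds became 0 and does not say what happens at exactly 30.

```c
#include <assert.h>

/*
 * Hypothetical illustration: convert a timeout given in seconds to a
 * minute-based setting by rounding to the nearest minute.  Any input
 * below 30 seconds yields 0, which a minute-based parameter would
 * interpret as "timeout disabled".
 */
int
seconds_to_nearest_minute(int seconds)
{
    assert(seconds >= 0);
    return (seconds + 30) / 60;
}
```

Under these assumptions, a setting such as '20s' collapses to 0 minutes, whereas a seconds-based parameter keeps it as 20.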
Daniel Gustafsson	a6c0bf9303	Fix sslkeylogfile error handling logging When sslkeylogfile has been set but the file fails to open in an otherwise successful connection, the log entry added to the conn object is never printed. Instead print the error on stderr for increased visibility. This is a debugging tool so using stderr for logging is appropriate. Also while there, remove the umask call in the callback as it's not useful. Issues noted by Peter Eisentraut in post-commit review, backpatch down to 18 where support for sslkeylogfile was added. Author: Daniel Gustafsson <daniel@yesql.se> Reported-by: Peter Eisentraut <peter@eisentraut.org> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/70450bee-cfaa-48ce-8980-fc7efcfebb03@eisentraut.org Backpatch-through: 18	2025-07-10 23:26:51 +02:00
Nathan Bossart	fb6c860bbd	pg_dump: Fix object-type sort priority for large objects. Commit `a45c78e328` moved large object metadata from SECTION_PRE_DATA to SECTION_DATA but neglected to move PRIO_LARGE_OBJECT in dbObjectTypePriorities accordingly. While this hasn't produced any known live bugs, it causes problems for a proposed patch that optimizes upgrades with many large objects. Fixing the priority might also make the topological sort step marginally faster by reducing the number of ordering violations that have to be fixed. Reviewed-by: Nitin Motiani <nitinmotiani@google.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/aBkQLSkx1zUJ-LwJ%40nathan Discussion: https://postgr.es/m/aG_5DBCjdDX6KAoD%40nathan Backpatch-through: 17	2025-07-10 15:52:41 -05:00
Michael Paquier	b41c430846	btree_gist: Merge the last two versions into version 1.8 During the development cycle of v18, btree_gist has been bumped once to 1.8 for the addition of translate_cmptype support functions (originally `7406ab623f`, renamed in `32edf732e8`). 1.9 has added sortsupport functions (`e4309f73f6`). There is no need for two version bumps in a module for a single major release of PostgreSQL. This commit unifies both upgrades to a single SQL script, downgrading btree_gist to 1.8. Author: Paul A. Jungwirth <pj@illuminatedcomputing.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/13c61807-f702-4afe-9a8d-795e2fd40923@illuminatedcomputing.com Backpatch-through: 18	2025-07-10 12:23:04 +09:00
Michael Paquier	4eca711bc9	injection_points: Add injection_points_list() This function can be used to retrieve the information about all the injection points attached to a cluster, providing coverage for InjectionPointList() introduced in `7b2eb72b1b`. The original proposal revolved around a system function, but that would not be backpatchable to stable branches. It was also a bit weird to have a system function that fails depending on whether the build allows injection points or not. Reviewed-by: Aleksander Alekseev <aleksander@timescale.com> Reviewed-by: Rahila Syed <rahilasyed90@gmail.com> Discussion: https://postgr.es/m/Z_xYkA21KyLEHvWR@paquier.xyz	2025-07-10 10:01:20 +09:00
Andres Freund	48a23f6eae	Use pg_assume() to avoid compiler warning below exec_set_found() The warning, visible when building with -O3 and a recent-ish gcc, is due to gcc not realizing that found is a byvalue type and therefore will never be interpreted as a varlena type. Discussion: https://postgr.es/m/3prdb6hkep3duglhsujrn52bkvnlkvhc54fzvph2emrsm4vodl@77yy6j4hkemb Discussion: https://postgr.es/m/20230316172818.x6375uvheom3ibt2%40awork3.anarazel.de Discussion: https://postgr.es/m/20240207203138.sknifhlppdtgtxnk%40awork3.anarazel.de	2025-07-09 18:40:54 -04:00
Andres Freund	d65eb5b1b8	Add pg_assume(expr) macro This macro can be used to avoid compiler warnings, particularly when using -O3 and not using assertions, and to get the compiler to generate better code. A subsequent commit introduces a first user. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/3prdb6hkep3duglhsujrn52bkvnlkvhc54fzvph2emrsm4vodl@77yy6j4hkemb Discussion: https://postgr.es/m/20230316172818.x6375uvheom3ibt2%40awork3.anarazel.de Discussion: https://postgr.es/m/20240207203138.sknifhlppdtgtxnk%40awork3.anarazel.de	2025-07-09 18:38:05 -04:00
Tom Lane	4df477153a	Link libpq with libdl if the platform needs that. Since `b0635bfda`, libpq uses dlopen() and related functions. On some platforms these are not supplied by libc, but by a separate library libdl, in which case we need to make sure that that dependency is known to the linker. Meson seems to take care of that automatically, but the Makefile didn't cater for it. Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/1328170.1752082586@sss.pgh.pa.us Backpatch-through: 18	2025-07-09 14:21:06 -04:00
Jeff Davis	53cd0b71ee	Change wchar2char() and char2wchar() to accept a locale_t. These are libc-specific functions, so should require a locale_t rather than a pg_locale_t (which could use another provider). Discussion: https://postgr.es/m/a8666c391dfcabe79868d95f7160eac533ace718.camel%40j-davis.com	2025-07-09 08:45:34 -07:00
Tom Lane	9dcc764144	Minor tweaks for pg_test_timing. Increase the size of the "direct" histogram to 10K elements, so that we can precisely track loop times up to 10 microseconds. (Going further than that seems pretty uninteresting, even for very old and slow machines.) Relabel "Per loop time" as "Average loop time" for clarity. Pre-zero the histogram arrays to make sure that they are loaded into processor cache and any copy-on-write overhead has happened before we enter the timing loop. Also use unlikely() to keep the compiler from thinking that the clock-went-backwards case is part of the hot loop. Neither of these hacks made a lot of difference on my own machine, but they seem like they might help on some platforms. Discussion: https://postgr.es/m/be0339cc-1ae1-4892-9445-8e6d8995a44d@eisentraut.org	2025-07-09 11:26:53 -04:00
Nathan Bossart	167ed8082f	Introduce pg_dsm_registry_allocations view. This commit adds a new system view that provides information about entries in the dynamic shared memory (DSM) registry. Specifically, it returns the name, type, and size of each entry. Note that since we cannot discover the size of dynamic shared memory areas (DSAs) and hash tables backed by DSAs (dshashes) without first attaching to them, the size column is left as NULL for those. Bumps catversion. Author: Florents Tselai <florents.tselai@gmail.com> Reviewed-by: Sungwoo Chang <swchangdev@gmail.com> Discussion: https://postgr.es/m/4D445D3E-81C5-4135-95BB-D414204A0AB4%40gmail.com	2025-07-09 09:17:56 -05:00
Masahiko Sawada	f5a987c0e5	Fix tab-completion for COPY and \copy options. Commit `c273d9d8ce` reworked tab-completion of COPY and \copy in psql and added support for completing options within WITH clauses. However, the same COPY options were suggested for both COPY TO and COPY FROM commands, even though some options are only valid for one or the other. This commit separates the COPY options for COPY FROM and COPY TO commands to provide more accurate auto-completion suggestions. Back-patch to v14 where tab-completion for COPY and \copy options within WITH clauses was first supported. Author: Atsushi Torikoshi <torikoshia@oss.nttdata.com> Reviewed-by: Yugo Nagata <nagata@sraoss.co.jp> Discussion: https://postgr.es/m/079e7a2c801f252ae8d522b772790ed7@oss.nttdata.com Backpatch-through: 14	2025-07-09 05:45:34 -07:00
Fujii Masao	86c539c5af	psql: Improve psql tab completion for GRANT/REVOKE on large objects. This commit enhances psql's tab completion to support TO/FROM after "GRANT/REVOKE ... ON LARGE OBJECT ...". Additionally, since "ALTER DEFAULT PRIVILEGES" now supports large objects, tab completion is also updated for "GRANT/REVOKE ... ON LARGE OBJECTS" with TO/FROM. Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Yugo Nagata <nagata@sraoss.co.jp> Discussion: https://postgr.es/m/ade0ab29-777f-47f6-9d0d-1af67728a86e@oss.nttdata.com	2025-07-09 20:33:50 +09:00
John Naylor	ed26c4e25a	Hide ICU C++ APIs from pg_locale.h The cpluspluscheck script wraps our headers in `extern "C"`. This disables name mangling, which is necessary for the C++ templates in system ICU headers. cpluspluscheck thus fails when the build is configured with ICU (the default). CI worked around this by disabling ICU, but let's make it work so others can run the script. We can specify we only want the C APIs by defining U_SHOW_CPLUSPLUS_API to be 0 in pg_locale.h. Extensions that want the C++ APIs can include ICU headers separately before including PostgreSQL headers. ICU documentation: https://github.com/unicode-org/icu/blob/main/docs/processes/release/tasks/healthy-code.md#test-icu4c-headers Suggested-by: Andres Freund <andres@anarazel.de> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/20220323002024.f2g6tivduzrktgfa%40alap3.anarazel.de Discussion: https://postgr.es/m/CANWCAZbgiaz1_0-F4SD%2B%3D-e9onwAnQdBGJbhg94EqUu4Gb7WyA%40mail.gmail.com	2025-07-09 14:20:22 +07:00
Michael Paquier	df286a5b83	libpq: Add TAP test for nested service file This test corresponds to the case of a "service" defined in a service file, that libpq is not able to support in parseServiceFile(). This has come up during the review of a patch to add more features in this area, useful on its own. Piece extracted from a larger patch by the same author. Author: Ryo Kanbayashi <kanbayashi.dev@gmail.com> Discussion: https://postgr.es/m/Zz2AE7NKKLIZTtEh@paquier.xyz	2025-07-09 15:46:31 +09:00
Amit Kapila	24f608625f	Doc: Improve logical replication failover documentation. Clarified that the failover steps apply to a specific PostgreSQL subscriber and added guidance for verifying replication slot synchronization during planned failover. Additionally, corrected the standby query to avoid false positives by checking invalidation_reason IS NULL instead of conflicting. Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Author: Shveta Malik <shveta.malik@gmail.com> Backpatch-through: 17, where it was introduced Discussion: https://www.postgresql.org/message-id/CAExHW5uiZ-fF159=jwBwPMbjZeZDtmcTbN+hd4mrURLCg2uzJg@mail.gmail.com	2025-07-09 09:44:27 +05:30
Michael Paquier	fef6da9e9c	libpq: Remove PQservice() This routine has been introduced as a shortcut to be able to retrieve a service name from an active connection, for psql. Per discussion, and as it is only used by psql, let's remove it to not clutter the libpq API more than necessary. The logic in psql is replaced by lookups of PQconninfoOption for the active connection, instead, updated each time the variables are synced by psql, the prompt shortcut relying on the variable synced. Reported-by: Noah Misch <noah@leadboat.com> Discussion: https://postgr.es/m/20250706161319.c1.nmisch@google.com Backpatch-through: 18	2025-07-09 12:46:13 +09:00
Tom Lane	93001888d8	Fix up misuse of "volatile" in contrib/xml2. What we want in these places is "xmlChar *volatile ptr", not "volatile xmlChar *ptr". The former means that the pointer variable itself needs to be treated as volatile, while the latter says that what it points to is volatile. Since the point here is to ensure that the pointer variables don't go crazy after a longjmp, it's the former semantics that we need. The misplacement of "volatile" also led to needing to cast away volatile in some places. Also fix a number of places where variables that are assigned to within a PG_TRY and then used after it were not initialized or not marked as volatile. (A few buildfarm members were issuing "may be used uninitialized" warnings about some of these variables, which is what drew my attention to this area.) In most cases these variables were being set as the last step within the PG_TRY block, which might mean that we could get away without the "volatile" marking. But doing that seems unsafe and is definitely not per our coding conventions. These problems seem to have come in with `732061150`, so no need for back-patch.	2025-07-08 17:00:34 -04:00
Tom Lane	e03c952877	Fix low-probability memory leak in XMLSERIALIZE(... INDENT). xmltotext_with_options() did not consider the possibility that pg_xml_init() could fail --- most likely due to OOM. If that happened, the already-parsed xmlDoc structure would be leaked. Oversight in commit `483bdb2af`. Bug: #18981 Author: Dmitry Kovalenko <d.kovalenko@postgrespro.ru> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/18981-9bc3c80f107ae925@postgresql.org Backpatch-through: 16	2025-07-08 12:50:33 -04:00
Álvaro Herrera	aa39b4e35a	Fix a couple more places in docs for pg_lsn change Also, revert Unicode linestyle to ASCII. Reported-by: Japin Li <japinli@hotmail.com> Discussion: https://postgr.es/m/ME0P300MB04453A39931F95805C4205A8B64FA@ME0P300MB0445.AUSP300.PROD.OUTLOOK.COM	2025-07-08 18:38:47 +02:00
Tom Lane	0b096e379e	Change pg_test_timing to measure in nanoseconds not microseconds. Most of our platforms have better-than-microsecond timing resolution, so the original definition of this program is getting less and less useful. Make it report nanoseconds not microseconds. Also, add a second output table that reports the exact observed timing durations, up to a limit of 1024 ns; and be sure to report the largest observed duration. The documentation for this program included a lot of system-specific details that now seem largely obsolete. Move all that text to the PG wiki, where perhaps it will be easier to maintain and update. Also, improve the TAP test so that it actually runs a short standard run, allowing most of the code to be exercised; its coverage before was abysmal. Author: Hannu Krosing <hannuk@google.com> Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/be0339cc-1ae1-4892-9445-8e6d8995a44d@eisentraut.org	2025-07-08 11:23:15 -04:00
Michael Paquier	a27893df45	pg_walsummary: Improve stability of test checking statistics Per buildfarm member culicidae, the query checking for stats reported by the WAL summarizer related to WAL reads is proving to be unstable. Instead of a one-time query, this commit replaces the logic with a polling query checking for the WAL read stats, making the test more reliable on machines that could be slow with the stats reports. This test has been introduced in `f4694e0f35`, so backpatch down to v18. Reported-by: Alexander Lakhin <exclusion@gmail.com> Reviewed-by: Alexander Lakhin <exclusion@gmail.com> Discussion: https://postgr.es/m/f35ba3db-fca7-4693-bc35-6db64488e4b1@gmail.com Backpatch-through: 18	2025-07-08 13:48:49 +09:00
Andres Freund	f54af9f267	aio: Combine io_uring memory mappings, if supported By default io_uring creates a shared memory mapping for each io_uring instance, leading to a large number of memory mappings. Unfortunately a large number of memory mappings slows things down, backend exit is particularly affected. To address that, newer kernels (6.5) support using user-provided memory for the memory. By putting the relevant memory into shared memory we don't need any additional mappings. On a system with a new enough kernel and liburing, there is no discernible overhead when doing a pgbench -S -C anymore. Reported-by: MARK CALLAGHAN <mdcallag@gmail.com> Reviewed-by: "Burd, Greg" <greg@burd.me> Reviewed-by: Jim Nasby <jnasby@upgrade.com> Discussion: https://postgr.es/m/CAFbpF8OA44_UG+RYJcWH9WjF7E3GA6gka3gvH6nsrSnEe9H0NA@mail.gmail.com Backpatch-through: 18	2025-07-07 22:57:07 -04:00
Richard Guo	55a780e947	Consider explicit incremental sort for Append and MergeAppend For an ordered Append or MergeAppend, we need to inject an explicit sort into any subpath that is not already well enough ordered. Currently, only explicit full sorts are considered; incremental sorts are not yet taken into account. In this patch, for subpaths of an ordered Append or MergeAppend, we choose to use explicit incremental sort if it is enabled and there are presorted keys. The rationale is based on the assumption that incremental sort is always faster than full sort when there are presorted keys, a premise that has been applied in various parts of the code. In addition, the current cost model tends to favor incremental sort as being cheaper than full sort in the presence of presorted keys, making it reasonable not to consider full sort in such cases. No backpatch as this could result in plan changes. Author: Richard Guo <guofenglinux@gmail.com> Reviewed-by: Andrei Lepikhov <lepihov@gmail.com> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Discussion: https://postgr.es/m/CAMbWs4_V7a2enTR+T3pOY_YZ-FU8ZsFYym2swOz4jNMqmSgyuw@mail.gmail.com	2025-07-08 10:21:44 +09:00
Jacob Champion	7376e60854	oauth: Fix kqueue detection on OpenBSD In `b0635bfda`, I added an early header check to the Meson OAuth support, which was intended to duplicate the later checks for HAVE_SYS_[EVENT|EPOLL]_H. However, I implemented the new test via check_header() -- which tries to compile -- rather than has_header(), which just looks for the file's existence. The distinction matters on OpenBSD, where <sys/event.h> can't be compiled without including prerequisite headers, so -Dlibcurl=enabled failed on that platform. Switch to has_header() to fix this. Note that reviewers expressed concern about the difference between our Autoconf feature tests (which compile headers) and our Meson feature tests (which do not). I'm not opposed to aligning the two, but I want to avoid making bigger changes as part of this fix. Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/flat/CAOYmi+kdR218ke2zu74oTJvzYJcqV1MN5=mGAPqZQuc79HMSVA@mail.gmail.com Backpatch-through: 18	2025-07-07 13:41:55 -07:00
Álvaro Herrera	3adcf9fbd8	Adapt pg_upgrade test to pg_lsn output format difference Commit `2633dae2e4` added some zero padding to various LSN output routines so that the low word is always 8 hex digits long, for easy human consumption. This included the pg_lsn datatype, which breaks the pg_upgrade test when it compares the pg_dump output of an older version. Silence this problem by setting the pg_lsn columns to NULL before the upgrade. Discussion: https://postgr.es/m/202507071504.xm2r26u7lmzr@alvherre.pgsql	2025-07-07 22:38:40 +02:00
Tom Lane	87b05fdc73	Restore the ability to run pl/pgsql expression queries in parallel. pl/pgsql's notion of an "expression" is very broad, encompassing any SQL SELECT query that returns a single column and no more than one row. So there are cases, for example evaluation of an aggregate function, where the query involves significant work and it'd be useful to run it with parallel workers. This used to be possible, but commits `3eea7a0c9` et al unintentionally disabled it. The simplest fix is to make exec_eval_expr() pass maxtuples = 0 rather than 2 to exec_run_select(). This avoids the new rule that we will never use parallelism when a nonzero "count" limit is passed to ExecutorRun(). (Note that the pre-3eea7a0c9 behavior was indeed unsafe, so reverting that rule is not in the cards.) The reason for passing 2 before was that exec_eval_expr() will throw an error if it gets more than one returned row, so we figured that as soon as we have two rows we know that will happen and we might as well stop running the query. That choice was cost-free when it was made; but disabling parallelism is far from cost-free, so now passing 2 amounts to optimizing a failure case at the expense of useful cases. An expression query that can return more than one row is certainly broken. People might now need to wait a bit longer to discover such breakage; but hopefully few will use enormously expensive cases as their first test of new pl/pgsql logic. Author: Dipesh Dhameliya <dipeshdhameliya125@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CABgZEgdfbnq9t6xXJnmXbChNTcWFjeM_6nuig41tm327gYi2ig@mail.gmail.com Backpatch-through: 13	2025-07-07 14:33:20 -04:00
Álvaro Herrera	c616785516	Refactor some repetitive SLRU code Functions to bootstrap and zero pages in various SLRU callers were fairly duplicative. We can slash almost two hundred lines with a couple of simple helpers: - SimpleLruZeroAndWritePage: Does the equivalent of SimpleLruZeroPage followed by flushing the page to disk - XLogSimpleInsertInt64: Does a XLogBeginInsert followed by XLogInsert of a trivial record whose data is just an int64. Author: Evgeny Voropaev <evgeny.voropaev@tantorlabs.com> Reviewed by: Álvaro Herrera <alvherre@kurilemu.de> Reviewed by: Andrey Borodin <x4mmm@yandex-team.ru> Reviewed by: Aleksander Alekseev <aleksander@timescale.com> Discussion: https://www.postgresql.org/message-id/flat/97820ce8-a1cd-407f-a02b-47368fadb14b%40tantorlabs.com	2025-07-07 16:49:19 +02:00
Álvaro Herrera	2633dae2e4	Standardize LSN formatting by zero padding This commit standardizes the output format for LSNs to ensure consistent representation across various tools and messages. Previously, LSNs were inconsistently printed as `%X/%X` in some contexts, while others used zero-padding. This often led to confusion when comparing. To address this, the LSN format is now uniformly set to `%X/%08X`, ensuring the lower 32-bit part is always zero-padded to eight hexadecimal digits. Author: Japin Li <japinli@hotmail.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/ME0P300MB0445CA53CA0E4B8C1879AF84B641A@ME0P300MB0445.AUSP300.PROD.OUTLOOK.COM	2025-07-07 13:57:43 +02:00
Michael Paquier	62a17a9283	Integrate FullTransactionIds deeper into two-phase code This refactoring is a follow-up of the work done in `5a1dfde833`, that has switched 2PC file names to use FullTransactionIds when written on disk. This will help with the integration of a follow-up solution related to the handling of two-phase files during recovery, to address older defects while reading these from disk after a crash. This change is useful in itself as it reduces the need to build the file names from epoch numbers and TransactionIds, because we can use directly FullTransactionIds from which the 2PC file names are guessed. So this avoids a lot of back-and-forth between the FullTransactionIds retrieved from the file names and how these are passed around in the internal 2PC logic. Note that the core of the change is the use of a FullTransactionId instead of a TransactionId in GlobalTransactionData, that tracks 2PC file information in shared memory. The change in TwoPhaseCallback makes this commit unfit for stable branches. Noah has contributed a good chunk of this patch. I have spent some time on it as well while working on the issues with two-phase state files and recovery. Author: Noah Misch <noah@leadboat.com> Co-Authored-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/Z5sd5O9JO7NYNK-C@paquier.xyz Discussion: https://postgr.es/m/20250116205254.65.nmisch@google.com	2025-07-07 12:50:40 +09:00
Michael Paquier	8aa54aa7ee	Fix incompatibility with libxml2 >= 2.14 libxml2 has deprecated the members of xmlBuffer, and it is recommended to access them with dedicated routines. We have only one case in the tree where this shows an impact: xml2/xpath.c where "content" was getting directly accessed. The rest of the code looked fine, checking the PostgreSQL code with libxml2 close to the top of its "2.14" branch. xmlBufferContent() exists since year 2000 based on a check of the upstream libxml2 tree, so let's switch to it. Like `400928b83b`, backpatch all the way down as this can have an impact on all the branches already released once newer versions of libxml2 get more popular. Reported-by: Walid Ibrahim <walidib@amazon.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/aGdSdcR4QTjEHX6s@paquier.xyz Backpatch-through: 13	2025-07-07 08:53:57 +09:00
Etsuro Fujita	21c9756db6	postgres_fdw: Add Assert to estimate_path_cost_size(). When estimating the cost/size of a pre-sorted path for a given upper relation using local stats, this function dereferences the passed-in PgFdwPathExtraData pointer without checking that it is not NULL. But that is not a bug as the pointer is guaranteed to be non-NULL in that case; to avoid confusion, add an Assert to ensure that it is not NULL before dereferencing it. Reported-by: Ranier Vilela <ranier.vf@gmail.com> Author: Etsuro Fujita <etsuro.fujita@gmail.com> Reviewed-by: Ranier Vilela <ranier.vf@gmail.com> Discussion: https://postgr.es/m/CAEudQArgiALbV1akQpeZOgim7XP05n%3DbDP1%3DTcOYLA43nRX_vA%40mail.gmail.com	2025-07-06 17:15:00 +09:00
Álvaro Herrera	144ad723a4	Fix new pg_upgrade query not to rely on regnamespace That was invented in 9.5, and pg_upgrade claims to support back to 9.0. But we don't need that with a simple query change, tested by Tom Lane. Discussion: https://postgr.es/m/202507041645.afjl5rssvrgu@alvherre.pgsql	2025-07-04 21:30:05 +02:00
Álvaro Herrera	90a85fce5e	pg_upgrade: Add missing newline in error message Minor oversight in `347758b120`	2025-07-04 18:31:35 +02:00
Álvaro Herrera	f295494d33	pg_upgrade: check for inconsistencies in not-null constraints w/inheritance With tables defined like this, CREATE TABLE ip (id int PRIMARY KEY); CREATE TABLE ic (id int) INHERITS (ip); ALTER TABLE ic ALTER id DROP NOT NULL; pg_upgrade fails during the schema restore phase due to this error: ERROR: column "id" in child table must be marked NOT NULL This can only be fixed by marking the child column as NOT NULL before the upgrade, which could take an arbitrary amount of time (because ic's data must be scanned). Have pg_upgrade's check mode warn if that condition is found, so that users know what to adjust before running the upgrade for real. Author: Ali Akbar <the.apaan@gmail.com> Reviewed-by: Justin Pryzby <pryzby@telsasoft.com> Backpatch-through: 13 Discussion: https://postgr.es/m/CACQjQLoMsE+1pyLe98pi0KvPG2jQQ94LWJ+PTiLgVRK4B=i_jg@mail.gmail.com	2025-07-04 18:05:43 +02:00
Fujii Masao	d64d68fddf	amcheck: Remove unused IndexCheckableCallback typedef. Commit `d70b17636d` introduced the IndexCheckableCallback typedef for a callback function, but it was never used. This commit removes the unused typedef to clean up dead code. Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru> Discussion: https://postgr.es/m/e1ea4e14-3b21-4e01-a5f2-0686883265df@oss.nttdata.com	2025-07-04 23:25:40 +09:00
Michael Paquier	5a6c39b6df	Disable commit timestamps during bootstrap Attempting to use commit timestamps during bootstrapping leads to an assertion failure, that can be reached for example with an initdb -c that enables track_commit_timestamp. It makes little sense to register a commit timestamp for a BootstrapTransactionId, so let's disable the activation of the module in this case. This problem has been independently reported once by each author of this commit. Each author has proposed basically the same patch, relying on IsBootstrapProcessingMode() to skip the use of commit_ts during bootstrap. The test addition is a suggestion by me, and is applied down to v16. Author: Hayato Kuroda <kuroda.hayato@fujitsu.com> Author: Andy Fan <zhihuifan1213@163.com> Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/OSCPR01MB14966FF9E4C4145F37B937E52F5102@OSCPR01MB14966.jpnprd01.prod.outlook.com Discussion: https://postgr.es/m/87plejmnpy.fsf@163.com Backpatch-through: 13	2025-07-04 15:09:24 +09:00
Fujii Masao	78ebda66bf	Speed up truncation of temporary relations. Previously, truncating a temporary relation required scanning the entire local buffer pool once per relation fork to invalidate buffers. This could be slow, especially with a large number of local buffers, as the scan was repeated multiple times. A similar issue with regular tables (shared buffers) was addressed in commit `6d05086c0a` by scanning the buffer pool only once for all forks. This commit applies the same optimization to temporary relations, improving truncation performance. Author: Daniil Davydov <3danissimo@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com> Reviewed-by: Maxim Orlov <orlovmg@gmail.com> Discussion: https://postgr.es/m/CAJDiXggNqsJOH7C5co4jA8nDk8vw-=sokyh5s1_TENWnC6Ofcg@mail.gmail.com	2025-07-04 09:03:58 +09:00
Tom Lane	931766aaec	Simplify COALESCE() with one surviving argument. If, after removal of useless null-constant arguments, a CoalesceExpr has exactly one remaining argument, we can just take that argument as the result, without bothering to wrap a new CoalesceExpr around it. This isn't likely to produce any great improvement in runtime per se, but it can lead to better plans since the planner no longer has to treat the expression as non-strict. However, there were a few regression test cases that intentionally wrote COALESCE(x) as a shorthand way of creating a non-strict subexpression. To avoid ruining the intent of those tests, write COALESCE(x,x) instead. (If anyone ever proposes de-duplicating COALESCE arguments, we'll need another iteration of this arms race. But it seems pretty unlikely that such an optimization would be worthwhile.) Author: Maksim Milyutin <maksim.milyutin@tantorlabs.ru> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/8e8573c3-1411-448d-877e-53258b7b2be0@tantorlabs.ru	2025-07-03 17:39:53 -04:00
Tom Lane	fc896821c4	Add more cross-type comparisons to contrib/btree_gin. Using the just-added infrastructure, extend btree_gin to support cross-type operators in its other opclasses. All of the cross-type comparison operators supported by the core btree opclasses for these datatypes are now available for btree_gin indexes as well. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Arseniy Mukhin <arseniy.mukhin.dev@gmail.com> Discussion: https://postgr.es/m/262624.1738460652@sss.pgh.pa.us	2025-07-03 16:30:38 -04:00
Tom Lane	e2b64fcef3	Add cross-type comparisons to contrib/btree_gin. Extend the infrastructure in btree_gin.c to permit cross-type operators, and add the code to support them for the int2, int4, and int8 opclasses. (To keep this patch digestible, I left the other datatypes for a separate patch.) This improves the usability of btree_gin indexes by allowing them to support the same set of queries that a regular btree index does. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Arseniy Mukhin <arseniy.mukhin.dev@gmail.com> Discussion: https://postgr.es/m/262624.1738460652@sss.pgh.pa.us	2025-07-03 16:24:31 -04:00
Tom Lane	0059bbe1ec	Break out xxx2yyy_opt_overflow APIs for more datetime conversions. Previous commits invented timestamp2timestamptz_opt_overflow, date2timestamp_opt_overflow, and date2timestamptz_opt_overflow functions to perform non-error-throwing conversions between datetime types. This patch completes the set by adding timestamp2date_opt_overflow, timestamptz2date_opt_overflow, and timestamptz2timestamp_opt_overflow. In addition, adjust timestamp2timestamptz_opt_overflow so that it doesn't throw error if timestamp2tm fails, but treats that as an overflow case. The situation probably can't arise except with an invalid timestamp value, and I can't think of a way that that would happen except data corruption. However, it's pretty silly to have a function whose entire reason for existence is to not throw errors for out-of-range inputs nonetheless throw an error for out-of-range input. The new APIs are not used in this patch, but will be needed in upcoming btree_gin changes. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Arseniy Mukhin <arseniy.mukhin.dev@gmail.com> Discussion: https://postgr.es/m/262624.1738460652@sss.pgh.pa.us	2025-07-03 16:17:08 -04:00
Tom Lane	a10f21e6ce	Obtain required table lock during cross-table updates, redux. Commits `8319e5cb5` et al missed the fact that ATPostAlterTypeCleanup contains three calls to ATPostAlterTypeParse, and the other two also need protection against passing a relid that we don't yet have lock on. Add similar logic to those code paths, and add some test cases demonstrating the need for it. In v18 and master, the test cases demonstrate that there's a behavioral discrepancy between stored generated columns and virtual generated columns: we disallow changing the expression of a stored column if it's used in any composite-type columns, but not that of a virtual column. Since the expression isn't actually relevant to either sort of composite-type usage, this prohibition seems unnecessary; but changing it is a matter for separate discussion. For now we are just documenting the existing behavior. Reported-by: jian he <jian.universality@gmail.com> Author: jian he <jian.universality@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: CACJufxGKJtGNRRSXfwMW9SqVOPEMdP17BJ7DsBf=tNsv9pWU9g@mail.gmail.com Backpatch-through: 13	2025-07-03 13:46:07 -04:00
Álvaro Herrera	a604affade	Add tab-completion for ALTER TABLE not-nulls The command is: ALTER TABLE x ADD [CONSTRAINT y] NOT NULL z This syntax was added in 18, but I got pushback for getting commit `dbf42b84ac` in 18 (also tab-completion for new syntax) after the feature freeze, so I'll put this in master only for now. Author: Álvaro Herrera <alvherre@kurilemu.de> Reported-by: Fujii Masao <masao.fujii@oss.nttdata.com> Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com> Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> Discussion: https://postgr.es/m/d4f14c6b-086b-463c-b15f-01c7c9728eab@oss.nttdata.com Discussion: https://postgr.es/m/202505111448.bwbfomrymq4b@alvherre.pgsql	2025-07-03 16:54:36 +02:00
Fujii Masao	c84698ceae	Remove leftover dead code from commit_ts.h. Commit `08aa89b326` removed the COMMIT_TS_SETTS WAL record, leaving xl_commit_ts_set and SizeOfCommitTsSet unused. However, it missed removing these definitions. This commit cleans up the leftover code. Since this is a cleanup rather than a bug fix, it is applied only to the master branch. Author: Andy Fan <zhihuifan1213@163.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/87ecuzmkqf.fsf@163.com	2025-07-03 23:39:45 +09:00
Álvaro Herrera	81a2625eb2	Fix broken XML I messed this up in commit `87251e1149`. Per buildfarm member alabio, via Daniel Gustafsson. Discussion: https://postgr.es/m/B94D82D1-7AF4-4412-AC02-82EAA6154957@yesql.se	2025-07-03 16:27:09 +02:00
Fujii Masao	ff3007c66d	doc: Update outdated descriptions of wal_status in pg_replication_slots. The documentation for pg_replication_slots previously mentioned only max_slot_wal_keep_size as a condition under which the wal_status column could show unreserved or lost. However, since commit `be87200`, replication slots can also be invalidated due to horizon or wal_level, and since commit `ac0e33136a`, idle_replication_slot_timeout can also trigger this state. This commit updates the description of the wal_status column to reflect that max_slot_wal_keep_size is not the only cause of the lost state. Back-patched to v16, where the additional invalidation cases were introduced. Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Nisha Moond <nisha.moond412@gmail.com> Discussion: https://postgr.es/m/78b34e84-2195-4f28-a151-5d204a382fdd@oss.nttdata.com Backpatch-through: 16	2025-07-03 23:07:23 +09:00
Álvaro Herrera	647cffd2f3	Prevent creation of duplicate not-null constraints for domains This was previously harmless, but now that we create pg_constraint rows for those, duplicates are not welcome anymore. Backpatch to 18. Co-authored-by: jian he <jian.universality@gmail.com> Co-authored-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/CACJufxFSC0mcQ82bSk58sO-WJY4P-o4N6RD2M0D=DD_u_6EzdQ@mail.gmail.com	2025-07-03 11:46:12 +02:00
Álvaro Herrera	87251e1149	Fix bogus grammar for a CREATE CONSTRAINT TRIGGER error If certain constraint characteristic clauses (NO INHERIT, NOT VALID, NOT ENFORCED) are given to CREATE CONSTRAINT TRIGGER, the resulting error message is ERROR: TRIGGER constraints cannot be marked NO INHERIT which is a bit silly, because these aren't "constraints of type TRIGGER". Hardcode a better error message to prevent it. This is a cosmetic fix for quite a fringe problem with no known complaints from users, so no backpatch. While at it, silently accept ENFORCED if given. Author: Amul Sul <sulamul@gmail.com> Reviewed-by: jian he <jian.universality@gmail.com> Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/CAAJ_b97hd-jMTS7AjgU6TDBCzDx_KyuKxG+K-DtYmOieg+giyQ@mail.gmail.com Discussion: https://postgr.es/m/CACJufxHSp2puxP=q8ZtUGL1F+heapnzqFBZy5ZNGUjUgwjBqTQ@mail.gmail.com	2025-07-03 11:25:39 +02:00
Michael Paquier	8ec04c8577	Refactor subtype field of AlterDomainStmt AlterDomainStmt.subtype used characters for its subtypes of commands, SET|DROP DEFAULT|NOT NULL and ADD|DROP|VALIDATE CONSTRAINT, which were hardcoded in a couple of places of the code. The code is improved by using an enum instead, with the same character values as the original code. Note that the field was documented in parsenodes.h and that it forgot to mention 'V' (VALIDATE CONSTRAINT). Author: Quan Zongliang <quanzongliang@yeah.net> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Reviewed-by: wenhui qiu <qiuwenhuifx@gmail.com> Reviewed-by: Tender Wang <tndrwang@gmail.com> Discussion: https://postgr.es/m/41ff310b-16bd-44b9-a3ef-97e20f14b709@yeah.net	2025-07-03 16:34:28 +09:00
Fujii Masao	170673a22f	doc: Remove incorrect note about wal_status in pg_replication_slots. The documentation previously stated that the wal_status column is NULL if restart_lsn is NULL in the pg_replication_slots view. This is incorrect, and wal_status can be "lost" even when restart_lsn is NULL. This commit removes the incorrect description. Back-patched to all supported versions. Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Nisha Moond <nisha.moond412@gmail.com> Discussion: https://postgr.es/m/c9d23cdc-b5dd-455a-8ee9-f1f24d701d89@oss.nttdata.com Backpatch-through: 13	2025-07-03 16:03:19 +09:00
Fujii Masao	bc2f348e87	Support multi-line headers in COPY FROM command. The COPY FROM command now accepts a non-negative integer for the HEADER option, allowing multiple header lines to be skipped. This is useful when the input contains multi-line headers that should be ignored during data import. Author: Shinya Kato <shinya11.kato@gmail.com> Co-authored-by: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Yugo Nagata <nagata@sraoss.co.jp> Discussion: https://postgr.es/m/CAOzEurRPxfzbxqeOPF_AGnAUOYf=Wk0we+1LQomPNUNtyZGBZw@mail.gmail.com	2025-07-03 15:27:26 +09:00
Michael Paquier	fd7d7b7191	Improve checks for GUC recovery_target_timeline Currently check_recovery_target_timeline() converts any value that is not "current", "latest", or a valid integer to 0. So, for example, the following configuration added to postgresql.conf followed by a startup: recovery_target_timeline = 'bogus' recovery_target_timeline = '9999999999' ... results in the following error patterns: FATAL: 22023: recovery target timeline 0 does not exist FATAL: 22023: recovery target timeline 1410065407 does not exist This is confusing, because the server does not reflect the intention of the user, and just reports incorrect data unrelated to the GUC. The origin of the problem is that we do not perform a range check in the GUC value passed-in for recovery_target_timeline. This commit improves the situation by using strtou64() and by providing stricter range checks. Some test cases are added for the cases of an incorrect, an upper-bound and a lower-bound timeline value, checking the sanity of the reports based on the contents of the server logs. Author: David Steele <david@pgmasters.net> Discussion: https://postgr.es/m/e5d472c7-e9be-4710-8dc4-ebe721b62cea@pgbackrest.org	2025-07-03 11:14:20 +09:00
Richard Guo	0da29e4cb1	Enable use of Memoize for ANTI joins Currently, we do not support Memoize for SEMI and ANTI joins because nested loop SEMI/ANTI joins do not scan the inner relation to completion, which prevents Memoize from marking the cache entry as complete. One might argue that we could mark the cache entry as complete after fetching the first inner tuple, but that would not be safe: if the first inner tuple and the current outer tuple do not satisfy the join clauses, a second inner tuple matching the parameters would find the cache entry already marked as complete. However, if the inner side is provably unique, this issue doesn't arise, since there would be no second matching tuple. That said, this doesn't help in the case of SEMI joins, because a SEMI join with a provably unique inner side would already have been reduced to an inner join by reduce_unique_semijoins. Therefore, in this patch, we check whether the inner relation is provably unique for ANTI joins and enable the use of Memoize in such cases. Author: Richard Guo <guofenglinux@gmail.com> Reviewed-by: wenhui qiu <qiuwenhuifx@gmail.com> Reviewed-by: Andrei Lepikhov <lepihov@gmail.com> Discussion: https://postgr.es/m/CAMbWs48FdLiMNrmJL-g6mDvoQVt0yNyJAqMkv4e2Pk-5GKCZLA@mail.gmail.com	2025-07-03 10:57:26 +09:00
Michael Paquier	7b2eb72b1b	Add InjectionPointList() to retrieve list of injection points This routine has come as a useful piece to be able to know the list of injection points currently attached in a system. One area would be to use it in a set-returning function, or just let out-of-core code play with it. This hides the internals of the shared memory array lookup holding the information about the injection points (point name, library and function name), allocating the result in a palloc'd List consumable by the caller. Reviewed-by: Jeff Davis <pgsql@j-davis.com> Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Rahila Syed <rahilasyed90@gmail.com> Discussion: https://postgr.es/m/Z_xYkA21KyLEHvWR@paquier.xyz Discussion: https://postgr.es/m/aBG2rPwl3GE7m1-Q@paquier.xyz	2025-07-03 08:41:25 +09:00
Tom Lane	fe05430ace	Correctly copy the target host identification in PQcancelCreate. PQcancelCreate failed to copy struct pg_conn_host's "type" field, instead leaving it zero (a/k/a CHT_HOST_NAME). This seemingly has no great ill effects if it should have been CHT_UNIX_SOCKET instead, but if it should have been CHT_HOST_ADDRESS then a null-pointer dereference will occur when the cancelConn is used. Bug: #18974 Reported-by: Maxim Boguk <maxim.boguk@gmail.com> Author: Sergei Kornilov <sk@zsrv.org> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/18974-575f02b2168b36b3@postgresql.org Backpatch-through: 17	2025-07-02 15:48:02 -04:00
Nathan Bossart	0c2b7174c3	Fix cross-version upgrade test breakage from commit `fe07100e82`. In commit `fe07100e82`, I renamed a couple of functions in test_dsm_registry to make it clear what they are testing. However, the buildfarm's cross-version upgrade tests run pg_upgrade with the test modules installed, so this caused errors like: ERROR: could not find function "get_val_in_shmem" in file ".../test_dsm_registry.so" To fix, revert those renames. I could probably get away with only un-renaming the C symbols, but I figured I'd avoid introducing function name mismatches. Also, AFAICT the buildfarm's cross-version upgrade tests do not run the test module tests post-upgrade, else we'll need to properly version the extension. Per buildfarm member crake. Discussion: https://postgr.es/m/aGVuYUNW23tStUYs%40nathan	2025-07-02 13:26:33 -05:00
Nathan Bossart	bb109382ef	Make more use of RELATION_IS_OTHER_TEMP(). A few places were open-coding it instead of using this handy macro. Author: Junwang Zhao <zhjwpku@gmail.com> Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Discussion: https://postgr.es/m/CAEG8a3LjTGJcOcxQx-SUOGoxstG4XuCWLH0ATJKKt_aBTE5K8w%40mail.gmail.com	2025-07-02 12:32:19 -05:00
Nathan Bossart	fe07100e82	Add GetNamedDSA() and GetNamedDSHash(). Presently, the dynamic shared memory (DSM) registry only provides GetNamedDSMSegment(), which allocates a fixed-size segment. To use the DSM registry for more sophisticated things like dynamic shared memory areas (DSAs) or a hash table backed by a DSA (dshash), users need to create a DSM segment that stores various handles and LWLock tranche IDs and to write fairly complicated initialization code. Furthermore, there is likely little variation in this initialization code between libraries. This commit introduces functions that simplify allocating a DSA or dshash within the DSM registry. These functions are very similar to GetNamedDSMSegment(). Notable differences include the lack of an initialization callback parameter and the prohibition of calling the functions more than once for a given entry in each backend (which should be trivially avoidable in most circumstances). While at it, this commit bumps the maximum DSM registry entry name length from 63 bytes to 127 bytes. Also note that even though one could presumably detach/destroy the DSAs and dshashes created in the registry, such use-cases are not yet well-supported, if for no other reason than the associated DSM registry entries cannot be removed. Adding such support is left as a future exercise. The test_dsm_registry test module contains tests for the new functions and also serves as a complete usage example. Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> Reviewed-by: Sami Imseih <samimseih@gmail.com> Reviewed-by: Florents Tselai <florents.tselai@gmail.com> Reviewed-by: Rahila Syed <rahilasyed90@gmail.com> Discussion: https://postgr.es/m/aEC8HGy2tRQjZg_8%40nathan	2025-07-02 11:50:52 -05:00
Peter Geoghegan	9ca30a0b04	Update obsolete row compare preprocessing comments. Restore nbtree preprocessing comments describing how we mark nbtree row compare members required to how they were prior to 2016 bugfix commit `a298a1e0`. Oversight in commit `bd3f59fd`, which made nbtree preprocessing revert to the original 2006 rules, but neglected to revert these comments. Backpatch-through: 18	2025-07-02 12:36:35 -04:00
Tom Lane	7374b3a536	Allow width_bucket()'s "operand" input to be NaN. The array-based variant of width_bucket() has always accepted NaN inputs, treating them as equal but larger than any non-NaN, as we do in ordinary comparisons. But up to now, the four-argument variants threw errors for a NaN operand. This is inconsistent and unnecessary, since we can perfectly well regard NaN as falling after the last bucket. We do still throw error for NaN or infinity histogram-bound inputs, since there's no way to compute sensible bucket boundaries. Arguably this is a bug fix, but given the lack of field complaints I'm content to fix it in master. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com> Discussion: https://postgr.es/m/2822872.1750540911@sss.pgh.pa.us	2025-07-02 11:34:40 -04:00
Álvaro Herrera	c989affb52	Fix error message for ALTER CONSTRAINT ... NOT VALID Trying to alter a constraint so that it becomes NOT VALID results in an error that assumes the constraint is a foreign key. This is potentially wrong, so give a more generic error message. While at it, give CREATE CONSTRAINT TRIGGER a better error message as well. Co-authored-by: jian he <jian.universality@gmail.com> Co-authored-by: Fujii Masao <masao.fujii@oss.nttdata.com> Co-authored-by: Álvaro Herrera <alvherre@kurilemu.de> Co-authored-by: Amul Sul <sulamul@gmail.com> Discussion: https://postgr.es/m/CACJufxHSp2puxP=q8ZtUGL1F+heapnzqFBZy5ZNGUjUgwjBqTQ@mail.gmail.com	2025-07-02 17:02:27 +02:00
Peter Geoghegan	bd3f59fdb7	Make row compares robust during nbtree array scans. Recent nbtree bugfix commit `5f4d98d4` added a special case to the code that sets up a page-level prefix of keys that are definitely satisfied by every tuple on the page: whenever _bt_set_startikey reached a row compare key, we'd refuse to apply the pstate.forcenonrequired behavior in scans where that usually happens (scans with a higher-order array key). That hack made the scan avoid essentially the same infinite cycling behavior that also affected nbtree scans with redundant keys (keys that preprocessing could not eliminate) prior to commit `f09816a0`. There are now serious doubts about this row compare workaround. Testing has shown that a scan with a row compare key and an array key could still read the same leaf page twice (without the scan's direction changing), which isn't supposed to be possible following the SAOP enhancements added by Postgres 17 commit `5bf748b8`. Also, we still allowed a required row compare key to be used with forcenonrequired mode when its header key happened to be beyond the pstate.ikey set by _bt_set_startikey, which was complicated and brittle. The underlying problem was that row compares had inconsistent rules around how scans start (which keys can be used for initial positioning purposes) and how scans end (which keys can set continuescan=false). Quals with redundant keys that could not be eliminated by preprocessing also had that same quality to them prior to today's bugfix `f09816a0`. It now seems prudent to bring row compare keys in line with the new charter for required keys, by making the start and end rules symmetric. This commit fixes two points of disagreement between _bt_first and _bt_check_rowcompare. Firstly, _bt_check_rowcompare was capable of ending the scan at the point where it needed to compare an ISNULL-marked row compare member that came immediately after a required row compare member. 
_bt_first now has symmetric handling for NULL row compares. Secondly, _bt_first had its own ideas about which keys were safe to use for initial positioning purposes. It could use fewer or more keys than _bt_check_rowcompare. _bt_first now uses the same requiredness markings as _bt_check_rowcompare for this. Now that _bt_first and _bt_check_rowcompare agree on how to start and end scans, we can get rid of the forcenonrequired special case, without any risk of infinite cycling. This approach also makes row compare keys behave more like regular scalar keys, particularly within _bt_first. Fixing these inconsistencies necessitates dealing with a related issue with the way that row compares were marked required by preprocessing: we didn't mark any lower-order row members required following 2016 bugfix commit `a298a1e0`. That approach was over broad. The bug in question was actually an oversight in how _bt_check_rowcompare dealt with tuple NULL values that failed to satisfy a scan key marked required in the opposite scan direction (it was a bug in 2011 commits `6980f817` and `882368e8`, not a bug in 2006 commit `3a0a16cb`). Go back to marking row compare members as required using the original 2006 rules, and fix the 2016 bug in a more principled way: by limiting use of the "set continuescan=false with a key required in the opposite scan direction upon encountering a NULL tuple value" optimization to the first/most significant row member key. While it isn't safe to use an implied IS NOT NULL qualifier to end the scan when it comes from a required lower-order row compare member key, it _is_ generally safe for such a required member key to end the scan -- provided the key is marked required in the _current_ scan direction. This fixes what was arguably an oversight in either commit `5f4d98d4` or commit `8a510275`. It is a direct follow-up to today's commit `f09816a0`. 
Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: Heikki Linnakangas <heikki.linnakangas@iki.fi> Discussion: https://postgr.es/m/CAH2-Wz=pcijHL_mA0_TJ5LiTB28QpQ0cGtT-ccFV=KzuunNDDQ@mail.gmail.com Backpatch-through: 18	2025-07-02 09:48:15 -04:00
Peter Geoghegan	f09816a0a7	Make handling of redundant nbtree keys more robust. nbtree preprocessing's handling of redundant (and contradictory) keys created problems for scans with = arrays. It was just about possible for a scan with an = array key and one or more redundant keys (keys that preprocessing could not eliminate due to an incomplete opfamily and a cross-type key) to get stuck. Testing has shown that infinite cycling where the scan never manages to make forward progress was possible. This could happen when the scan's arrays were reset in _bt_readpage's forcenonrequired=true path (added by bugfix commit `5f4d98d4`) when the arrays weren't at least advanced up to the same point that they were in at the start of the _bt_readpage call. Earlier redundant keys prevented the finaltup call to _bt_advance_array_keys from reaching lower-order keys that needed to be used to sufficiently advance the scan's arrays. To fix, make preprocessing leave the scan's keys in a state that is as close as possible to how it'll usually leave them (in the common case where there's no redundant keys that preprocessing failed to eliminate). Now nbtree preprocessing _reliably_ leaves behind at most one required >/>= key per index column, and at most one required </<= key per index column. Columns that have one or more = keys that are eligible to be marked required (based on the traditional rules) prioritize the = keys over redundant inequality keys; they'll _reliably_ be left with only one of the = keys as the index column's only required key. Keys that are not marked required (whether due to the new preprocessing step running or for some other reason) are relocated to the end of the so->keyData[] array as needed. That way they'll always be evaluated after the scan's required keys, and so cannot prevent code in places like _bt_advance_array_keys and _bt_first from reaching a required key. 
Also teach _bt_first to decide which initial positioning keys to use based on the same requiredness markings that have long been used by _bt_checkkeys/_bt_advance_array_keys. This is a necessary condition for reliably avoiding infinite cycling. _bt_advance_array_keys expects to be able to reason about what'll happen in the next _bt_first call should it start another primitive index scan, by evaluating inequality keys that were marked required in the opposite-to-scan scan direction only. Now everybody (_bt_first, _bt_checkkeys, and _bt_advance_array_keys) will always agree on which exact key will be used on each index column to start and/or end the scan (except when row compare keys are involved, which have similar problems not addressed by this commit). An upcoming commit will finish off the work started by this commit by harmonizing how _bt_first, _bt_checkkeys, and _bt_advance_array_keys apply row compare keys to start and end scans. This fixes what was arguably an oversight in either commit `5f4d98d4` or commit `8a510275`. Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: Heikki Linnakangas <heikki.linnakangas@iki.fi> Discussion: https://postgr.es/m/CAH2-Wz=ds4M+3NXMgwxYxqU8MULaLf696_v5g=9WNmWL2=Uo2A@mail.gmail.com Backpatch-through: 18	2025-07-02 09:40:49 -04:00
Daniel Gustafsson	8eede2c720	doc: pg_buffercache documentation wordsmithing A word seemed to have gone missing in the leading paragraphs. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Co-authored-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://postgr.es/m/aGTQYZz9L0bjlzVL@ip-10-97-1-34.eu-west-3.compute.internal Backpatch-through: 18	2025-07-02 11:42:36 +02:00
Peter Eisentraut	f039c22441	meson: Increase minimum version to 0.57.2 The previous minimum was to maintain support for Python 3.5, but we now require Python 3.6 anyway (commit `45363fca63`), so that reason is obsolete. A small raise to Meson 0.57 allows getting rid of a fair amount of version conditionals and silences some future-deprecated warnings. With the version bump, the following deprecation warnings appeared and are fixed: WARNING: Project targets '>=0.57' but uses feature deprecated since '0.55.0': ExternalProgram.path. use ExternalProgram.full_path() instead WARNING: Project targets '>=0.57' but uses feature deprecated since '0.56.0': meson.build_root. use meson.project_build_root() or meson.global_build_root() instead. It turns out that meson 0.57.0 and 0.57.1 are buggy for our use, so the minimum is actually set to 0.57.2. This is specific to this version series; in the future we won't necessarily need to be this precise. Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://www.postgresql.org/message-id/flat/42e13eb0-862a-441e-8d84-4f0fd5f6def0%40eisentraut.org	2025-07-02 11:14:53 +02:00
Peter Eisentraut	de5aa15209	Reformat some node comments Use per-field comments for IndexInfo, instead of one big header comment listing all the fields. This makes the relevant comments easier to find, and it will also make it less likely that comments are not updated when fields are added or removed, as has happened in the past. Author: Japin Li <japinli@hotmail.com> Discussion: https://www.postgresql.org/message-id/flat/ME0P300MB04453E6C7EA635F0ECF41BFCB6832%40ME0P300MB0445.AUSP300.PROD.OUTLOOK.COM	2025-07-02 09:50:51 +02:00
Masahiko Sawada	3811ca3600	Fix missing FSM vacuum opportunities on tables without indexes. Commit `c120550edb` optimized the vacuuming of relations without indexes (a.k.a. one-pass strategy) by directly marking dead item IDs as LP_UNUSED. However, the periodic FSM vacuum was still checking if dead item IDs had been marked as LP_DEAD when attempting to vacuum the FSM every VACUUM_FSM_EVERY_PAGES blocks. This condition was never met due to the optimization, resulting in missed FSM vacuum opportunities. This commit modifies the periodic FSM vacuum condition to use the number of tuples deleted during HOT pruning. This count includes items marked as either LP_UNUSED or LP_REDIRECT, both of which are expected to result in new free space to report. Back-patch to v17 where the vacuum optimization for tables with no indexes was introduced. Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Discussion: https://postgr.es/m/CAD21AoBL8m6B9GSzQfYxVaEgvD7-Kr3AJaS-hJPHC+avm-29zw@mail.gmail.com Backpatch-through: 17	2025-07-01 23:25:20 -07:00
John Naylor	9adb58a3cc	Remove implicit cast from 'void *' Commit `e2809e3a10` added code to a header which assigns a pointer to void to a pointer to unsigned char. This causes build errors for extensions written in C++. Fix by adding an explicit cast. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CANWCAZaCq9AHBuhs%3DMx7Gg_0Af9oRU7iAqr0itJCtfmsWwVmnQ%40mail.gmail.com Backpatch-through: 18	2025-07-02 11:51:10 +07:00
Michael Paquier	3369a3b49b	Fix bug in archive streamer with LZ4 decompression When decompressing some input data, the calculation for the initial starting point and the initial size were incorrect, potentially leading to failures when decompressing contents with LZ4. These initialization points are fixed in this commit, bringing the logic closer to what exists for gzip and zstd. The contents of the compressed data is clear (for example backups taken with LZ4 can still be decompressed with a "lz4" command), only the decompression part reading the input data was impacted by this issue. This code path impacts pg_basebackup and pg_verifybackup, which can use the LZ4 decompression routines with an archive streamer, or any tools that try to use the archive streamers in src/fe_utils/. The issue is easier to reproduce with files that have a low-compression rate, like ones filled with random data, for a size of at least 512kB, but this could happen with anything as long as it is stored in a data folder. Some tests are added based on this idea, with a file filled with random bytes grabbed from the backend, written at the root of the data folder. This is proving good enough to reproduce the original problem. Author: Mikhail Gribkov <youzhick@gmail.com> Discussion: https://postgr.es/m/CAMEv5_uQS1Hg6KCaEP2JkrTBbZ-nXQhxomWrhYQvbdzR-zy-wA@mail.gmail.com Backpatch-through: 15	2025-07-02 13:48:36 +09:00
Michael Paquier	b45242fd30	Move code for the bytea data type from varlena.c to new bytea.c This commit moves all the routines related to the bytea data type into its own new file, called bytea.c, clearing some of the bloat in varlena.c. This includes the routines for: - Input, output, receive and send - Comparison - Casts to integer types - bytea-specific functions The internals of the routines moved here are unchanged, with one exception. This comes with a twist in bytea_string_agg_transfn(), where the call to makeStringAggState() is replaced by the internals of this routine, still located in varlena.c. This simplifies the move to the new file by not having to expose makeStringAggState(). Author: Aleksander Alekseev <aleksander@timescale.com> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/CAJ7c6TMPVPJ5DL447zDz5ydctB8OmuviURtSwd=PHCRFEPDEAQ@mail.gmail.com	2025-07-02 09:52:21 +09:00
Michael Paquier	bee23ea4dd	Show sizes of FETCH queries as constants in pg_stat_statements Prior to this patch, every FETCH call would generate a unique queryId with a different size specified. Depending on the workloads, this could lead to a significant bloat in pg_stat_statements, as repeatedly calling a specific cursor would result in a new queryId each time. For example, FETCH 1 c1; and FETCH 2 c1; would produce different queryIds. This patch improves the situation by normalizing the fetch size, so as semantically similar statements generate the same queryId. As a result, statements like the below, which differ syntactically but have the same effect, will now share a single queryId: FETCH FROM c1 FETCH NEXT c1 FETCH 1 c1 In order to do a normalization based on the keyword used in FETCH, FetchStmt is tweaked with a new FetchDirectionKeywords. This matters for "howMany", which could be set to a negative value depending on the direction, and we want to normalize the queries with enough information about the direction keywords provided, including RELATIVE, ABSOLUTE or all the ALL variants. Author: Sami Imseih <samimseih@gmail.com> Discussion: https://postgr.es/m/CAA5RZ0tA6LbHCg2qSS+KuM850BZC_+ZgHV7Ug6BXw22TNyF+MA@mail.gmail.com	2025-07-02 08:39:25 +09:00
Peter Eisentraut	184595836b	Update comment for IndexInfo.ii_NullsNotDistinct Commit `7a7b3e11e6` added the ii_NullsNotDistinct field, but the comment was not updated. Author: Japin Li <japinli@hotmail.com> Reviewed-by: Richard Guo <guofenglinux@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/ME0P300MB04453E6C7EA635F0ECF41BFCB6832%40ME0P300MB0445.AUSP300.PROD.OUTLOOK.COM	2025-07-01 23:12:24 +02:00
Nathan Bossart	aa268cbaad	Add commit `9e345415bc` to .git-blame-ignore-revs.	2025-07-01 14:30:16 -05:00
Nathan Bossart	32bcf568cb	Make more use of binaryheap_empty() and binaryheap_size(). A few places were accessing bh_size directly instead of via these handy macros. Author: Aleksander Alekseev <aleksander@timescale.com> Discussion: https://postgr.es/m/CAJ7c6TPQMVL%2B028T4zuw9ZqL5Du9JavOLhBQLkJeK0RznYx_6w%40mail.gmail.com	2025-07-01 14:19:07 -05:00
Nathan Bossart	e6115394d4	Document pg_get_multixact_members(). Oversight in commit `0ac5ad5134`. Author: Sami Imseih <samimseih@gmail.com> Co-authored-by: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Discussion: https://postgr.es/m/20150619215231.GT133018%40postgresql.org Discussion: https://postgr.es/m/CAA5RZ0sjQDDwJfMRb%3DZ13nDLuRpF13ME2L_BdGxi0op8RKjmDg%40mail.gmail.com Backpatch-through: 13	2025-07-01 13:54:38 -05:00
Peter Eisentraut	7a7b3e11e6	Update comment for IndexInfo.ii_WithoutOverlaps Commit `fc0438b4e8` added the ii_WithoutOverlaps field, but the comment was not updated. Author: Japin Li <japinli@hotmail.com> Reviewed-by: Richard Guo <guofenglinux@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/ME0P300MB04453E6C7EA635F0ECF41BFCB6832%40ME0P300MB0445.AUSP300.PROD.OUTLOOK.COM	2025-07-01 20:37:24 +02:00
Peter Eisentraut	9e5fee8228	Fix outdated comment for IndexInfo Commit `7841623571` removed the ii_OpclassOptions field, but the comment was not updated. Author: Japin Li <japinli@hotmail.com> Reviewed-by: Richard Guo <guofenglinux@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/ME0P300MB04453E6C7EA635F0ECF41BFCB6832%40ME0P300MB0445.AUSP300.PROD.OUTLOOK.COM	2025-07-01 20:12:36 +02:00
Peter Eisentraut	fff0d1edf5	Improve code comment The previous wording was potentially confusing about the impact of the OVERRIDING clause on generated columns. Reword slightly to avoid that. Reported-by: jian he <jian.universality@gmail.com> Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CACJufxFMBe0nPXOQZMLTH4Ry5Gyj4m%2B2Z05mRi9KB4hk8rGt9w%40mail.gmail.com	2025-07-01 18:42:07 +02:00
Tom Lane	1fd772d192	Make sure IOV_MAX is defined. We stopped defining IOV_MAX on non-Windows systems in `75357ab94`, on the assumption that every non-Windows system defines it in <limits.h> as required by X/Open. GNU Hurd, however, doesn't follow that standard either. Put back the old logic to assume 16 if it's not defined. Author: Michael Banck <mbanck@gmx.net> Co-authored-by: Christoph Berg <myon@debian.org> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/6862e8d1.050a0220.194b8d.76fa@mx.google.com Discussion: https://postgr.es/m/6846e0c3.df0a0220.39ef9b.c60e@mx.google.com Backpatch-through: 16	2025-07-01 12:40:35 -04:00
Tom Lane	29213636e6	Make safeguard against incorrect flags for fsync more portable. The existing code assumed that O_RDONLY is defined as 0, but this is not required by POSIX and is not true on GNU Hurd. We can avoid the assumption by relying on O_ACCMODE to mask the fcntl() result. (Hopefully, all supported platforms define that.) Author: Michael Banck <mbanck@gmx.net> Co-authored-by: Samuel Thibault Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/6862e8d1.050a0220.194b8d.76fa@mx.google.com Discussion: https://postgr.es/m/68480868.5d0a0220.1e214d.68a6@mx.google.com Backpatch-through: 13	2025-07-01 12:08:20 -04:00
Jeff Davis	8af0d0ab01	Remove provider field from pg_locale_t. The behavior of pg_locale_t is specified by methods, so a separate provider field is no longer necessary. Reviewed-by: Andreas Karlsson <andreas@proxel.se> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/2830211e1b6e6a2e26d845780b03e125281ea17b.camel%40j-davis.com	2025-07-01 07:50:46 -07:00
Jeff Davis	5a38104b36	Control ctype behavior internally with a method table. Previously, pattern matching and case mapping behavior branched based on the provider. Refactor to use a method table, which is less error-prone. This is also a step toward multiple provider versions, which we may want to support in the future. Reviewed-by: Andreas Karlsson <andreas@proxel.se> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/2830211e1b6e6a2e26d845780b03e125281ea17b.camel%40j-davis.com	2025-07-01 07:44:47 -07:00
Jeff Davis	d81dcc8d62	Use pg_ascii_tolower()/pg_ascii_toupper() where appropriate. Avoids unnecessary dependence on setlocale(). No behavior change. This commit reverts `e1458f2f1b`, which reverted some changes unintentionally committed before the branch for 19. Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/a8666c391dfcabe79868d95f7160eac533ace718.camel@j-davis.com Discussion: https://postgr.es/m/7efaaa645aa5df3771bb47b9c35df27e08f3520e.camel@j-davis.com	2025-07-01 07:24:23 -07:00
Tomas Vondra	9e345415bc	Fix indentation in pg_numa code Broken by commits `7fe2f67c7c`, `81f287dc92` and `bf1119d74a`. Backpatch to 18, same as the offending commits. Backpatch-through: 18	2025-07-01 15:23:07 +02:00
Tomas Vondra	bf1119d74a	Add CHECK_FOR_INTERRUPTS into pg_numa_query_pages Querying the NUMA status can be quite time consuming, especially with large shared buffers. `8cc139bec3` called numa_move_pages() once, for all buffers, and we had to wait for the syscall to complete. But with the chunking, introduced by `7fe2f67c7c` to work around a kernel bug, we can do CHECK_FOR_INTERRUPTS() after each chunk, allowing users to abort the execution. Reviewed-by: Christoph Berg <myon@debian.org> Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://postgr.es/m/aEtDozLmtZddARdB@msg.df7cb.de Backpatch-through: 18	2025-07-01 12:58:35 +02:00
Tomas Vondra	81f287dc92	Silence valgrind about pg_numa_touch_mem_if_required When querying NUMA status of pages in shared memory, we need to touch the memory first to get valid results. This may trigger valgrind reports, because some of the memory (e.g. unpinned buffers) may be marked as noaccess. Solved by adding a valgrind suppression. An alternative would be to adjust the access/noaccess status before touching the memory, but that seems far too invasive. It would require all those places to have detailed knowledge of what the shared memory stores. The pg_numa_touch_mem_if_required() macro is replaced with a function. Macros are invisible to suppressions, so it'd have to suppress reports for the caller - e.g. pg_get_shmem_allocations_numa(). So we'd suppress reports for the whole function, and that seems too heavy-handed. It might easily hide other valid issues. Reviewed-by: Christoph Berg <myon@debian.org> Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://postgr.es/m/aEtDozLmtZddARdB@msg.df7cb.de Backpatch-through: 18	2025-07-01 12:32:23 +02:00
Peter Eisentraut	953050236a	amcheck: Improve confusing message The way it was worded, the %u placeholder could be read as the table OID. Rearrange slightly to avoid the possible confusion. Reported-by: jian he <jian.universality@gmail.com> Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CACJufxFx-25XQV%2Br23oku7ZnL958P30hyb9cFeYPv6wv7yzCCw%40mail.gmail.com	2025-07-01 12:24:17 +02:00
Tomas Vondra	7fe2f67c7c	Limit the size of numa_move_pages requests There's a kernel bug in do_pages_stat(), affecting systems combining 64-bit kernel and 32-bit user space. The function splits the request into chunks of 16 pointers, but forgets the pointers are 32-bit when advancing to the next chunk. Some of the pointers get skipped, and memory after the array is interpreted as pointers. The result is that the produced status of memory pages is mostly bogus. Systems combining 64-bit and 32-bit environments like this might seem rare, but that's not the case - all 32-bit Debian packages are built in a 32-bit chroot on a system with a 64-bit kernel. This is a long-standing kernel bug (since 2010), affecting pretty much all kernels, so it'll take time until all systems get a fixed kernel. Luckily, we can work around the issue by chunking the requests the same way do_pages_stat() does, at least on affected systems. We don't know what kernel a 32-bit build will run on, so all 32-bit builds use chunks of 16 elements (the largest chunk before hitting the issue). 64-bit builds are not affected by this issue, and so could work without the chunking. But chunking has other advantages, so we apply chunking even for 64-bit builds, with chunks of 1024 elements. Reported-by: Christoph Berg <myon@debian.org> Author: Christoph Berg <myon@debian.org> Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://postgr.es/m/aEtDozLmtZddARdB@msg.df7cb.de Context: https://marc.info/?l=linux-mm&m=175077821909222&w=2 Backpatch-through: 18	2025-07-01 12:02:31 +02:00
Amit Kapila	b5cd0ecd4d	Fix typo in pg_publication.h. Author: shveta malik <shveta.malik@gmail.com> Discussion: https://postgr.es/m/CAJpy0uAyFN9o7vU_ZkZFv5-6ysXDNKNx_fC0gwLLKg=8==E3ow@mail.gmail.com	2025-07-01 15:17:03 +05:30
Peter Eisentraut	8338983882	doc: TOAST not toast There are different capitalizations of "TOAST" around the documentation and code. This just changes a few places that were more obviously inconsistent with similar phrases elsewhere. Author: Peter Smith <peter.b.smith@fujitsu.com> Discussion: https://www.postgresql.org/message-id/flat/CAHut+PtxXLJFhwJFvx+M=Ux8WGHU85XbT3nDqk-aAUS3E5ANCw@mail.gmail.com	2025-07-01 10:19:52 +02:00
Peter Eisentraut	8fd9bb1d96	Enable MSVC conforming preprocessor Switch MSVC to use the conforming preprocessor, using the /Zc:preprocessor option. This allows us to drop the alternative implementation of VA_ARGS_NARGS() for the previous "traditional" preprocessor. This also prepares the way for enabling C11 mode in the future, which enables the conforming preprocessor by default. This now requires Visual Studio 2019. The installation documentation is adjusted accordingly. Discussion: https://www.postgresql.org/message-id/flat/01a69441-af54-4822-891b-ca28e05b215a%40eisentraut.org	2025-07-01 09:41:40 +02:00
Michael Paquier	732061150b	xml2: Improve error handling of libxml2 calls The contrib module xml2/ has always been fuzzy with the cleanup of the memory allocated by the calls internal to libxml2, even if there are APIs in place giving a lot of control over the error behavior, all located in the backend's xml.c. The code paths fixed in the commit address multiple defects, while sanitizing the code: - In xpath.c, several allocations are done by libxml2 for xpath_workspace, whose memory cleanup could go out of sight as it relied on a single TRY/CATCH block done in pgxml_xpath(). workspace->res is allocated by libxml2, and may finish by not being freed at all upon a failure outside of a TRY area. This code is refactored so as the TRY/CATCH block of pgxml_xpath() is moved one level higher to its callers, which are responsible for cleaning up the contents of a workspace on failure. cleanup_workspace() now requires a volatile workspace, forcing as a rule that a TRY/CATCH block should be used. - Several calls, like xmlStrdup(), xmlXPathNewContext(), xmlXPathCtxtCompile(), etc. can return NULL on failures (for most of them allocation failures). These forgot to check for failures, or missed that pg_xml_error_occurred() should be called, to check if an error is already on the stack. - Some memory allocated by libxml2 calls was freed in an incorrect way, "resstr" in xslt_process() being one example. The class of errors fixed here are for problems that are unlikely going to happen in practice, so no backpatch is done. The changes have finished by being rather invasive, so it is perhaps not a bad thing to be conservative and to keep these changes only on HEAD anyway. Author: Michael Paquier <michael@paquier.xyz> Reported-by: Karavaev Alexey <maralist86@mail.ru> Reviewed-by: Jim Jones <jim.jones@uni-muenster.de> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/18943-2f2a04ab03904598@postgresql.org	2025-07-01 15:48:32 +09:00
Amit Langote	c67989789c	Fix typos in comments Commit `19d8e2308b` added enum values with the prefix TU_, but a few comments still referred to TUUI_, which was used in development versions of the patches committed as `19d8e2308b`. Author: Yugo Nagata <nagata@sraoss.co.jp> Discussion: https://postgr.es/m/20250701110216.8ac8a9e4c6f607f1d954f44a@sraoss.co.jp Backpatch-through: 16	2025-07-01 13:13:48 +09:00
Michael Paquier	a3df0d43d9	Fix typo in system_views.sql's definition of pg_stat_activity backend_xmin used a lower-character 's' instead of the upper-character 'S' like the other attributes. This is harmless, but let's be consistent. Issue introduced in `dd1a3bccca`. Author: Daisuke Higuchi <higuchi.daisuke11@gmail.com> Discussion: https://postgr.es/m/CAEVT6c8M39cqWje-df39wWr0KWcDgGKd5fMvQo84zvCXKoEL9Q@mail.gmail.com	2025-07-01 09:41:42 +09:00
Michael Paquier	2e94721747	Improve error handling of libxml2 calls in xml.c This commit fixes some defects in the backend's xml.c, found upon inspection of the internals of libxml2: - xmlEncodeSpecialChars() can fail on malloc(), returning NULL back to the caller. xmltext() assumed that this could never happen. Like other code paths, a TRY/CATCH block is added there, covering also the fact that cstring_to_text_with_len() could fail a memory allocation, where the backend would miss to free the buffer allocated by xmlEncodeSpecialChars(). - Some libxml2 routines called in xmlelement() can return NULL, like xmlAddChildList() or xmlTextWriterStartElement(). Dedicated errors are added for them. - xml_xmlnodetoxmltype() missed that xmlXPathCastNodeToString() can fail on an allocation failure. In this case, the call can just be moved to the existing TRY/CATCH block. All these code paths would cause the server to crash. As this is unlikely a problem in practice, no backpatch is done. Jim and I have caught these defects, not sure who has scored the most. The contrib module xml2/ has similar defects, which will be addressed in a separate change. Reported-by: Jim Jones <jim.jones@uni-muenster.de> Reviewed-by: Jim Jones <jim.jones@uni-muenster.de> Discussion: https://postgr.es/m/aEEingzOta_S_Nu7@paquier.xyz	2025-07-01 08:57:05 +09:00
Tom Lane	0836683a89	Improve error report for PL/pgSQL reserved word used as a field name. The current code in resolve_column_ref (dating to commits `01f7d2990` and `fe24d7816`) believes that not finding a RECFIELD datum is a can't-happen case, in consequence of which I didn't spend a whole lot of time considering what to do if it did happen. But it turns out that it can happen if the would-be field name is a fully-reserved PL/pgSQL keyword. Change the error message to describe that situation, and add a test case demonstrating it. This might need further refinement if anyone can find other ways to trigger a failure here; but without an example it's not clear what other error to throw. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com> Discussion: https://postgr.es/m/2185258.1745617445@sss.pgh.pa.us	2025-06-30 17:06:39 -04:00
Tom Lane	999f172ded	De-reserve keywords EXECUTE and STRICT in PL/pgSQL. On close inspection, there does not seem to be a strong reason why these should be fully-reserved keywords. I guess they just escaped consideration in previous attempts to minimize PL/pgSQL's list of reserved words. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com> Discussion: https://postgr.es/m/2185258.1745617445@sss.pgh.pa.us	2025-06-30 16:59:36 -04:00
Nathan Bossart	bd09f024a1	Add new OID alias type regdatabase. This provides a convenient way to look up a database's OID. For example, the query SELECT * FROM pg_shdepend WHERE dbid = (SELECT oid FROM pg_database WHERE datname = current_database()); can now be simplified to SELECT * FROM pg_shdepend WHERE dbid = current_database()::regdatabase; Like the regrole type, regdatabase has cluster-wide scope, so we disallow regdatabase constants from appearing in stored expressions. Bumps catversion. Author: Ian Lawrence Barwick <barwick@gmail.com> Reviewed-by: Greg Sabino Mullane <htamfids@gmail.com> Reviewed-by: Jian He <jian.universality@gmail.com> Reviewed-by: Fabrízio de Royes Mello <fabriziomello@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/aBpjJhyHpM2LYcG0%40nathan	2025-06-30 15:38:54 -05:00
Andres Freund	f20a347e1a	aio: Fix reference to outdated name Reported-by: Antonin Houska <ah@cybertec.at> Author: Antonin Houska <ah@cybertec.at> Discussion: https://postgr.es/m/5250.1751266701@localhost Backpatch-through: 18, where `da7226993f` introduced this	2025-06-30 10:22:02 -04:00
Andrew Dunstan	c3e28e9fd9	Avoid uninitialized value error in TAP tests' Cluster->psql If the method is called in scalar context and we didn't pass in a stderr handle, one won't be created. However, some error paths assume that it exists, so in this case create a dummy stderr to avoid the resulting perl error. Per gripe from Oleg Tselebrovskiy <o.tselebrovskiy@postgrespro.ru> and adapted from his patch. Discussion: https://postgr.es/m/378eac5de4b8ecb5be7bcdf2db9d2c4d@postgrespro.ru	2025-06-30 09:49:50 -04:00
Peter Eisentraut	40a96cd148	pgflex: propagate environment to flex subprocess Python's subprocess.run docs say that if the env argument is not None, it will be used "instead of the default behavior of inheriting the current process’ environment". However, the environment should be preserved, only adding FLEX_TMP_DIR to it. Author: Javier Maestro <jjmaestro@ieee.org> Discussion: https://www.postgresql.org/message-id/flat/CABvji06GUpmrTqqiCr6_F9vRL2-JUSVAh8ChgWa6k47FUCvYmA%40mail.gmail.com	2025-06-30 12:24:48 +02:00
Peter Eisentraut	cc2ac0e6f9	Remove unused #include's in src/backend/utils/adt/* Author: Aleksander Alekseev <aleksander@timescale.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAJ7c6TOowVbR-0NEvvDm6a_mag18krR0XJ2FKrc9DHXj7hFRtQ%40mail.gmail.com	2025-06-30 12:00:00 +02:00
Peter Eisentraut	a6a4641252	Fix whitespace	2025-06-30 11:38:18 +02:00
Fujii Masao	a4c10de929	psql: Improve tab completion for COPY command. Previously, tab completion for COPY only suggested plain tables and partitioned tables, even though materialized views are also valid for COPY TO (since commit `534874fac0`), and foreign tables are valid for COPY FROM. This commit enhances tab completion for COPY to also include materialized views and foreign tables. Views with INSTEAD OF INSERT triggers are supported with COPY FROM but rarely used, so plain views are intentionally excluded from completion. Author: jian he <jian.universality@gmail.com> Co-authored-by: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: David G. Johnston <david.g.johnston@gmail.com> Discussion: https://postgr.es/m/CACJufxFxnSkikp+GormAGHcMTX1YH2HRXW1+3dJM9w7yY9hdsg@mail.gmail.com	2025-06-30 18:36:24 +09:00
Peter Eisentraut	9601351146	doc: explain pgstatindex fragmentation It was quite hard to guess what leaf_fragmentation meant without looking at pgstattuple's code. This patch aims to give to the user a better idea of what it means. Author: Frédéric Yhuel <frederic.yhuel@dalibo.com> Author: Laurenz Albe <laurenz.albe@cybertec.at> Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Reviewed-by: Benoit Lobréau <benoit.lobreau@dalibo.com> Discussion: https://postgr.es/m/bf110561-f774-4957-a890-bb6fab6804e0%40dalibo.com Discussion: https://postgr.es/m/4c5dee3a-8381-4e0f-b882-d1bd950e8972@dalibo.com	2025-06-30 11:30:56 +02:00
Peter Eisentraut	3431e3e4aa	pgbench: Use standard option handling test routines Run program_XXX tests instead of its own tests. This ensures consistency with the test suites of other programs and enforces common policies, such as help line length. Author: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com> Discussion: https://www.postgresql.org/message-id/flat/OSCPR01MB14966247015B7E3D8D340D022F56FA@OSCPR01MB14966.jpnprd01.prod.outlook.com	2025-06-30 10:47:42 +02:00
Peter Eisentraut	2e640a0fa2	doc: Some copy-editing around prefix operators When postfix operators were dropped in `1ed6b8956`, the CREATE OPERATOR docs were not updated to make the RIGHTARG argument mandatory in the grammar. While at it, make the RIGHTARG docs more concise. Also, the operator docs were mentioning "infix" in the introduction, while using "binary" everywhere else. Author: Christoph Berg <myon@debian.org> Discussion: https://www.postgresql.org/message-id/flat/aAtpbnQphv4LWAye@msg.df7cb.de	2025-06-30 10:38:43 +02:00
Daniel Gustafsson	c5c4fbb4d4	doc: Fix typo in pg_sync_replication_slots documentation Commit `1546e17f9d` accidentally misspelled additionally as additionaly. Backpatch to v17 to match where the original commit was backpatched. Author: Daniel Gustafsson <daniel@yesql.se> Backpatch-through: 17	2025-06-30 10:12:31 +02:00
Michael Paquier	2252fcd427	Rationalize handling of VacuumParams This commit refactors the vacuum routines that rely on VacuumParams, adding const markers where necessary to force a new policy in the code. This structure should not use a pointer as it may be used across multiple relations, and its contents should never be updated. vacuum_rel() stands as an exception as it touches the "index_cleanup" and "truncate" options. VacuumParams has been introduced in `0d83138974`, and `661643deda` has fixed a bug impacting VACUUM operating on multiple relations. The changes done in tableam.h break ABI compatibility, so this commit can only happen on HEAD. Author: Shihao Zhong <zhong950419@gmail.com> Co-authored-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Reviewed-by: Junwang Zhao <zhjwpku@gmail.com> Discussion: https://postgr.es/m/CAGRkXqTo+aK=GTy5pSc-9cy8H2F2TJvcrZ-zXEiNJj93np1UUw@mail.gmail.com	2025-06-30 15:42:50 +09:00
Michael Paquier	5ba00e175a	Align log_line_prefix in CI and TAP tests with pg_regress.c log_line_prefix is changed to include "%b", the backend type in the TAP test configuration. %v and %x are removed from the CI configuration, with the format around %b changed. The lack of backend type in postgresql.conf set by Cluster.pm for the TAP test configuration was something that has been bugging me, beginning the discussion that has led to this change. The change in the CI has come up during the discussion, to become consistent with pg_regress.c, %v and %x not being that useful to have. Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/aC0VaIWAXLgXcHVP@paquier.xyz	2025-06-30 13:56:31 +09:00
Joe Conway	2652835d3e	Stamp HEAD as 19devel. Let the hacking begin ...	2025-06-29 22:28:10 -04:00
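The pgflex commit in the list above turns on a detail of Python's `subprocess.run`: passing `env=` replaces the child's environment rather than extending it. The sketch below illustrates the general pattern of copying the parent environment and adding one variable. It is not the code from that commit; the helper name `run_with_extra_env`, the demo variable `INHERITED_VAR`, and the `/tmp/flex` path are assumptions made for illustration.

```python
import os
import subprocess
import sys


def run_with_extra_env(cmd, extra):
    """Run cmd with the parent's environment plus the entries in `extra`.

    Hypothetical helper: passing only `extra` as env= would drop PATH and
    every other inherited variable, which is the pitfall described above.
    """
    env = dict(os.environ)  # copy, so the parent's own environment is untouched
    env.update(extra)       # add (or override) only the requested variables
    return subprocess.run(cmd, env=env, capture_output=True, text=True, check=True)


if __name__ == "__main__":
    os.environ["INHERITED_VAR"] = "kept"  # demo variable, not used by pgflex
    child = [
        sys.executable,
        "-c",
        "import os; print(os.environ.get('INHERITED_VAR'), os.environ.get('FLEX_TMP_DIR'))",
    ]
    result = run_with_extra_env(child, {"FLEX_TMP_DIR": "/tmp/flex"})
    print(result.stdout.strip())
```

Copying `os.environ` into a fresh dict keeps the change local to the child process; the parent never sees `FLEX_TMP_DIR`.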

4248 changed files with 340771 additions and 180636 deletions

88

.cirrus.star

 @ -7,7 +7,7 @@ https://github.com/bazelbuild/starlark/blob/master/spec.md
 See also .cirrus.yml and src/tools/ci/README
 """
 load("cirrus", "env", "fs")
 load("cirrus", "env", "fs", "re", "yaml")
 def main():
 @ -18,19 +18,36 @@ def main():
 1) the contents of .cirrus.yml
 2) if defined, the contents of the file referenced by the, repository
 2) computed environment variables
 3) if defined, the contents of the file referenced by the, repository
        level, REPO_CI_CONFIG_GIT_URL variable (see
        https://cirrus-ci.org/guide/programming-tasks/#fs for the accepted
        format)
 3) .cirrus.tasks.yml
 4) .cirrus.tasks.yml
     """
     output = ""
     # 1) is evaluated implicitly
     # Add 2)
     additional_env = compute_environment_vars()
     env_fmt = """
 ###
 # Computed environment variables start here
 ###
 {0}
 ###
 # Computed environment variables end here
 ###
 """
     output += env_fmt.format(yaml.dumps({'env': additional_env}))
     # Add 3)
     repo_config_url = env.get("REPO_CI_CONFIG_GIT_URL")
     if repo_config_url != None:
         print("loading additional configuration from \"{}\"".format(repo_config_url))
 @ -38,12 +55,75 @@ def main():
     else:
         output += "\n# REPO_CI_CONFIG_URL was not set\n"
     # Add 3)
     # Add 4)
     output += config_from(".cirrus.tasks.yml")
     return output
 def compute_environment_vars():
     cenv = {}
     ###
     # Some tasks are manually triggered by default because they might use too
     # many resources for users of free Cirrus credits, but they can be
     # triggered automatically by naming them in an environment variable e.g.
     # REPO_CI_AUTOMATIC_TRIGGER_TASKS="task_name other_task" under "Repository
     # Settings" on Cirrus CI's website.
     default_manual_trigger_tasks = ['mingw', 'netbsd', 'openbsd']
     repo_ci_automatic_trigger_tasks = env.get('REPO_CI_AUTOMATIC_TRIGGER_TASKS', '')
     for task in default_manual_trigger_tasks:
         name = 'CI_TRIGGER_TYPE_' + task.upper()
         if repo_ci_automatic_trigger_tasks.find(task) != -1:
             value = 'automatic'
         else:
             value = 'manual'
         cenv[name] = value
     ###
     ###
     # Parse "ci-os-only:" tag in commit message and set
     # CI_{$OS}_ENABLED variable for each OS
     # We want to disable SanityCheck if testing just a specific OS. This
     # shortens push-wait-for-ci cycle time a bit when debugging operating
     # system specific failures. Just treating it as an OS in that case
     # suffices.
     operating_systems = [
       'compilerwarnings',
       'freebsd',
       'linux',
       'macos',
       'mingw',
       'netbsd',
       'openbsd',
       'sanitycheck',
       'windows',
     ]
     commit_message = env.get('CIRRUS_CHANGE_MESSAGE')
     match_re = r"(^|.*\n)ci-os-only: ([^\n]+)($|\n.*)"
     # re.match() returns an array with a tuple of (matched-string, match_1, ...)
     m = re.match(match_re, commit_message)
     if m and len(m) > 0:
         os_only = m[0][2]
         os_only_list = re.split(r'[, ]+', os_only)
     else:
         os_only_list = operating_systems
     for os in operating_systems:
         os_enabled = os in os_only_list
         cenv['CI_{0}_ENABLED'.format(os.upper())] = os_enabled
     ###
     return cenv
 def config_from(config_src):
     """return contents of config file `config_src`, surrounded by markers
     indicating start / end of the included file
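
The `compute_environment_vars()` logic in the `.cirrus.star` hunk above can be tried outside Cirrus CI with a plain-Python port. This is only a sketch under stated assumptions: Cirrus's Starlark `re.match()` returns a list of tuples, whereas the port uses Python's `re.search()` with `re.MULTILINE` and a simplified pattern, and it takes the commit message and the `REPO_CI_AUTOMATIC_TRIGGER_TASKS` value as function arguments instead of reading them through `env.get()`.

```python
import re

OPERATING_SYSTEMS = [
    "compilerwarnings", "freebsd", "linux", "macos", "mingw",
    "netbsd", "openbsd", "sanitycheck", "windows",
]
DEFAULT_MANUAL_TRIGGER_TASKS = ["mingw", "netbsd", "openbsd"]


def compute_environment_vars(commit_message, automatic_trigger_tasks=""):
    """Python approximation of the Starlark function in the diff above."""
    cenv = {}

    # Tasks that are manual by default become automatic when named in the
    # setting; substring test, mirroring the .find(task) != -1 check.
    for task in DEFAULT_MANUAL_TRIGGER_TASKS:
        automatic = task in automatic_trigger_tasks
        cenv["CI_TRIGGER_TYPE_" + task.upper()] = "automatic" if automatic else "manual"

    # Look for a "ci-os-only: ..." line anywhere in the commit message.
    # Simplified pattern (assumption): one whole line, matched in MULTILINE mode.
    m = re.search(r"^ci-os-only: ([^\n]+)$", commit_message, flags=re.MULTILINE)
    if m:
        os_only_list = re.split(r"[, ]+", m.group(1))
    else:
        os_only_list = OPERATING_SYSTEMS  # no tag: everything stays enabled

    for os_name in OPERATING_SYSTEMS:
        cenv["CI_{0}_ENABLED".format(os_name.upper())] = os_name in os_only_list
    return cenv


if __name__ == "__main__":
    env_vars = compute_environment_vars("Fix a bug\n\nci-os-only: linux, macos\n")
    for key in sorted(env_vars):
        print(key, env_vars[key])
```

Because `sanitycheck` is treated as just another entry in the list, a `ci-os-only:` tag that omits it sets `CI_SANITYCHECK_ENABLED` to false, which is what the `skip: $CI_SANITYCHECK_ENABLED == false` line in `.cirrus.tasks.yml` relies on.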

									
										227

.cirrus.tasks.yml
									
				@ -31,6 +31,31 @@ env:

				  TEMP_CONFIG: ${CIRRUS_WORKING_DIR}/src/tools/ci/pg_ci_base.conf

				  PG_TEST_EXTRA: kerberos ldap ssl libpq_encryption load_balance oauth

				  # Postgres config args for the meson builds, shared between all meson tasks

				  # except the 'SanityCheck' task

				  MESON_COMMON_PG_CONFIG_ARGS: -Dcassert=true -Dinjection_points=true

				  # Meson feature flags shared by all meson tasks, except:

				  # SanityCheck: uses almost no dependencies.

				  # Windows - VS: has fewer dependencies than listed here, so defines its own.

				  # Linux: uses the 'auto' feature option to test meson feature autodetection.

				  MESON_COMMON_FEATURES: >-

				    -Dauto_features=disabled

				    -Dldap=enabled

				    -Dssl=openssl

				    -Dtap_tests=enabled

				    -Dplperl=enabled

				    -Dplpython=enabled

				    -Ddocs=enabled

				    -Dicu=enabled

				    -Dlibxml=enabled

				    -Dlibxslt=enabled

				    -Dlz4=enabled

				    -Dpltcl=enabled

				    -Dreadline=enabled

				    -Dzlib=enabled

				    -Dzstd=enabled

				# What files to preserve in case tests fail

				on_failure_ac: &on_failure_ac

				@ -72,13 +97,13 @@ task:

				  # push-wait-for-ci cycle time a bit when debugging operating system specific

				  # failures. Uses skip instead of only_if, as cirrus otherwise warns about

				  # only_if conditions not matching.

				  skip: $CIRRUS_CHANGE_MESSAGE =~ '.*\nci-os-only:.*'

				  skip: $CI_SANITYCHECK_ENABLED == false

				  env:

				    CPUS: 4

				    BUILD_JOBS: 8

				    TEST_JOBS: 8

				    IMAGE_FAMILY: pg-ci-bookworm

				    IMAGE_FAMILY: pg-ci-trixie

				    CCACHE_DIR: ${CIRRUS_WORKING_DIR}/ccache_dir

				    # no options enabled, should be small

				    CCACHE_MAXSIZE: "150M"

				@ -104,14 +129,17 @@ task:

				  configure_script: |

				    su postgres <<-EOF

				      set -e

				      meson setup \

				        --buildtype=debug \

				        --auto-features=disabled \

				        -Ddefault_library=shared \

				        -Dtap_tests=enabled \

				        build

				    EOF

				  build_script: |

				    su postgres <<-EOF

				      set -e

				      ninja -C build -j${BUILD_JOBS} ${MBUILD_TARGET}

				    EOF

				  upload_caches: ccache

				@ -121,6 +149,7 @@ task:

				  # tap test that exercises both a frontend binary and the backend.

				  test_minimal_script: |

				    su postgres <<-EOF

				      set -e

				      ulimit -c unlimited

				      meson test $MTEST_ARGS --suite setup

				      meson test $MTEST_ARGS --num-processes ${TEST_JOBS} \

				@ -164,10 +193,19 @@ task:

				      -c debug_parallel_query=regress

				    PG_TEST_PG_UPGRADE_MODE: --link

				    MESON_FEATURES: >-

				      -Ddtrace=enabled

				      -Dgssapi=enabled

				      -Dlibcurl=enabled

				      -Dnls=enabled

				      -Dpam=enabled

				      -Dtcl_version=tcl86

				      -Duuid=bsd

				  <<: *freebsd_task_template

				  depends_on: SanityCheck

				  only_if: $CIRRUS_CHANGE_MESSAGE !=~ '.*\nci-os-only:.*' || $CIRRUS_CHANGE_MESSAGE =~ '.*\nci-os-only:[^\n]*freebsd.*'

				  only_if: $CI_FREEBSD_ENABLED

				  sysinfo_script: |

				    id

				@ -195,11 +233,12 @@ task:

				  # already takes longer than other platforms except for windows.

				  configure_script: |

				    su postgres <<-EOF

				      set -e

				      meson setup \

				        ${MESON_COMMON_PG_CONFIG_ARGS} \

				        --buildtype=debug \

				        -Dcassert=true -Dinjection_points=true \

				        -Duuid=bsd -Dtcl_version=tcl86 -Ddtrace=auto \

				        -Dextra_lib_dirs=/usr/local/lib -Dextra_include_dirs=/usr/local/include/ \

				        ${MESON_COMMON_FEATURES} ${MESON_FEATURES} \

				        build

				    EOF

				  build_script: su postgres -c 'ninja -C build -j${BUILD_JOBS} ${MBUILD_TARGET}'

				@ -207,6 +246,7 @@ task:

				  test_world_script: |

				    su postgres <<-EOF

				      set -e

				      ulimit -c unlimited

				      meson test $MTEST_ARGS --num-processes ${TEST_JOBS}

				    EOF

				@ -231,6 +271,7 @@ task:

				    # during upload, as it doesn't expect artifacts to change size

				    stop_running_script: |

				      su postgres <<-EOF

				        set -e

				        build/tmp_install/usr/local/pgsql/bin/pg_ctl -D build/runningcheck stop || true

				      EOF

				    <<: *on_failure_meson

				@ -239,7 +280,6 @@ task:

				task:

				  depends_on: SanityCheck

				  trigger_type: manual

				  env:

				    # Below are experimentally derived to be a decent choice.

				@ -257,7 +297,9 @@ task:

				  matrix:

				    - name: NetBSD - Meson

				      only_if: $CIRRUS_CHANGE_MESSAGE !=~ '.*\nci-os-only:.*' || $CIRRUS_CHANGE_MESSAGE =~ '.*\nci-os-only:[^\n]*netbsd.*'

				      # See REPO_CI_AUTOMATIC_TRIGGER_TASKS in .cirrus.star

				      trigger_type: $CI_TRIGGER_TYPE_NETBSD

				      only_if: $CI_NETBSD_ENABLED

				      env:

				        OS_NAME: netbsd

				        IMAGE_FAMILY: pg-ci-netbsd-postgres

				@ -269,18 +311,32 @@ task:

				        LC_ALL: "C"

				        # -Duuid is not set for the NetBSD, see the comment below, above

				        # configure_script, for more information.

				        MESON_FEATURES: >-

				          -Dgssapi=enabled

				          -Dlibcurl=enabled

				          -Dnls=enabled

				          -Dpam=enabled

				      setup_additional_packages_script: |

				        #pkgin -y install ...

				      <<: *netbsd_task_template

				    - name: OpenBSD - Meson

				      only_if: $CIRRUS_CHANGE_MESSAGE !=~ '.*\nci-os-only:.*' || $CIRRUS_CHANGE_MESSAGE =~ '.*\nci-os-only:[^\n]*openbsd.*'

				      # See REPO_CI_AUTOMATIC_TRIGGER_TASKS in .cirrus.star

				      trigger_type: $CI_TRIGGER_TYPE_OPENBSD

				      only_if: $CI_OPENBSD_ENABLED

				      env:

				        OS_NAME: openbsd

				        IMAGE_FAMILY: pg-ci-openbsd-postgres

				        PKGCONFIG_PATH: '/usr/lib/pkgconfig:/usr/local/lib/pkgconfig'

				        UUID: -Duuid=e2fs

				        TCL: -Dtcl_version=tcl86

				        CORE_DUMP_EXECUTABLE_DIR: $CIRRUS_WORKING_DIR/build/tmp_install/usr/local/pgsql/bin

				        MESON_FEATURES: >-

				          -Dbsd_auth=enabled

				          -Dlibcurl=enabled

				          -Dtcl_version=tcl86

				          -Duuid=e2fs

				      setup_additional_packages_script: |

				        #pkg_add -I ...

				      # Always core dump to ${CORE_DUMP_DIR}

				@ -313,12 +369,12 @@ task:

				  # And other uuid options are not available on NetBSD.

				  configure_script: |

				    su postgres <<-EOF

				      set -e

				      meson setup \

				        ${MESON_COMMON_PG_CONFIG_ARGS} \

				        --buildtype=debugoptimized \

				        --pkg-config-path ${PKGCONFIG_PATH} \

				        -Dcassert=true -Dinjection_points=true \

				        -Dssl=openssl ${UUID} ${TCL} \

				        -DPG_TEST_EXTRA="$PG_TEST_EXTRA" \

				        ${MESON_COMMON_FEATURES} ${MESON_FEATURES} \

				        build

				    EOF

				@ -327,10 +383,8 @@ task:

				  test_world_script: |

				    su postgres <<-EOF

				      set -e

				      ulimit -c unlimited

				      # Otherwise tests will fail on OpenBSD, due to inability to start enough

				      # processes.

				      ulimit -p 256

				      meson test $MTEST_ARGS --num-processes ${TEST_JOBS}

				    EOF

				@ -341,7 +395,7 @@ task:

				      # ${CORE_DUMP_DIR}, they may not obey this. So, move core files to the

				      # ${CORE_DUMP_DIR} directory.

				      find build/ -type f -name '*.core' -exec mv '{}' ${CORE_DUMP_DIR} \;

				      src/tools/ci/cores_backtrace.sh ${OS_NAME} ${CORE_DUMP_DIR}

				      src/tools/ci/cores_backtrace.sh ${OS_NAME} ${CORE_DUMP_DIR} ${CORE_DUMP_EXECUTABLE_DIR}

				# configure feature flags, shared between the task running the linux tests and

				@ -365,10 +419,6 @@ LINUX_CONFIGURE_FEATURES: &LINUX_CONFIGURE_FEATURES >-

				  --with-uuid=ossp

				  --with-zstd

				LINUX_MESON_FEATURES: &LINUX_MESON_FEATURES >-

				  -Dllvm=enabled

				  -Duuid=e2fs

				# Check SPECIAL in the matrix: below

				task:

				@ -376,7 +426,7 @@ task:

				    CPUS: 4

				    BUILD_JOBS: 4

				    TEST_JOBS: 8 # experimentally derived to be a decent choice

				    IMAGE_FAMILY: pg-ci-bookworm

				    IMAGE_FAMILY: pg-ci-trixie

				    CCACHE_DIR: /tmp/ccache_dir

				    DEBUGINFOD_URLS: "https://debuginfod.debian.net"

				@ -397,7 +447,7 @@ task:

				    # print_stacktraces=1,verbosity=2, duh

				    # detect_leaks=0: too many uninteresting leak errors in short-lived binaries

				    UBSAN_OPTIONS: print_stacktrace=1:disable_coredump=0:abort_on_error=1:verbosity=2

				    ASAN_OPTIONS: print_stacktrace=1:disable_coredump=0:abort_on_error=1:detect_leaks=0

				    ASAN_OPTIONS: print_stacktrace=1:disable_coredump=0:abort_on_error=1:detect_leaks=0:detect_stack_use_after_return=0

				    # SANITIZER_FLAGS is set in the tasks below

				    CFLAGS: -Og -ggdb -fno-sanitize-recover=all $SANITIZER_FLAGS

				@ -405,16 +455,15 @@ task:

				    LDFLAGS: $SANITIZER_FLAGS

				    CC: ccache gcc

				    CXX: ccache g++

				    # GCC emits a warning for llvm-14, so switch to a newer one.

				    LLVM_CONFIG: llvm-config-16

				    LINUX_CONFIGURE_FEATURES: *LINUX_CONFIGURE_FEATURES

				    LINUX_MESON_FEATURES: *LINUX_MESON_FEATURES

				    LINUX_MESON_FEATURES: >-

				      -Duuid=e2fs

				  <<: *linux_task_template

				  depends_on: SanityCheck

				  only_if: $CIRRUS_CHANGE_MESSAGE !=~ '.*\nci-os-only:.*' || $CIRRUS_CHANGE_MESSAGE =~ '.*\nci-os-only:[^\n]*linux.*'

				  only_if: $CI_LINUX_ENABLED

				  ccache_cache:

				    folder: ${CCACHE_DIR}

				@ -453,7 +502,7 @@ task:

				    # - Uses address sanitizer, sanitizer failures are typically printed in

				    #   the server log

				    # - Configures postgres with a small segment size

				    - name: Linux - Debian Bookworm - Autoconf

				    - name: Linux - Debian Trixie - Autoconf

				      env:

				        SANITIZER_FLAGS: -fsanitize=address

				@ -467,6 +516,7 @@ task:

				      # that.

				      configure_script: |

				        su postgres <<-EOF

				          set -e

				          ./configure \

				            --enable-cassert --enable-injection-points --enable-debug \

				            --enable-tap-tests --enable-nls \

				@ -476,13 +526,14 @@ task:

				            \

				            ${LINUX_CONFIGURE_FEATURES} \

				            \

				            CLANG="ccache clang-16"

				            CLANG="ccache clang"

				        EOF

				      build_script: su postgres -c "make -s -j${BUILD_JOBS} world-bin"

				      upload_caches: ccache

				      test_world_script: |

				        su postgres <<-EOF

				          set -e

				          ulimit -c unlimited # default is 0

				          make -s ${CHECK} ${CHECKFLAGS} -j${TEST_JOBS}

				        EOF

				@ -495,7 +546,8 @@ task:

				    #   are typically printed in the server log

				    # - Test both 64bit and 32 bit builds

				    # - uses io_method=io_uring

				    - name: Linux - Debian Bookworm - Meson

				    # - Uses meson feature autodetection

				    - name: Linux - Debian Trixie - Meson

				      env:

				        CCACHE_MAXSIZE: "400M" # tests two different builds

				@ -505,10 +557,11 @@ task:

				      configure_script: |

				        su postgres <<-EOF

				          set -e

				          meson setup \

				            ${MESON_COMMON_PG_CONFIG_ARGS} \

				            --buildtype=debug \

				            -Dcassert=true -Dinjection_points=true \

				            ${LINUX_MESON_FEATURES} \

				            ${LINUX_MESON_FEATURES} -Dllvm=enabled \

				            build

				        EOF

				@ -516,26 +569,28 @@ task:

				      # locally.

				      configure_32_script: |

				        su postgres <<-EOF

				          set -e

				          export CC='ccache gcc -m32'

				          export CXX='ccache g++ -m32'

				          meson setup \

				            ${MESON_COMMON_PG_CONFIG_ARGS} \

				            --buildtype=debug \

				            -Dcassert=true -Dinjection_points=true \

				            ${LINUX_MESON_FEATURES} \

				            -Dllvm=disabled \

				            --pkg-config-path /usr/lib/i386-linux-gnu/pkgconfig/ \

				            -DPERL=perl5.36-i386-linux-gnu \

				            -Dlibnuma=disabled \

				            -DPERL=perl5.40-i386-linux-gnu \

				            ${LINUX_MESON_FEATURES} -Dlibnuma=disabled \

				            build-32

				        EOF

				      build_script: |

				        su postgres <<-EOF

				          set -e

				          ninja -C build -j${BUILD_JOBS} ${MBUILD_TARGET}

				          ninja -C build -t missingdeps

				        EOF

				      build_32_script: |

				        su postgres <<-EOF

				          set -e

				          ninja -C build-32 -j${BUILD_JOBS} ${MBUILD_TARGET}

				          ninja -C build -t missingdeps

				        EOF

				@ -544,6 +599,7 @@ task:

				      test_world_script: |

				        su postgres <<-EOF

				          set -e

				          ulimit -c unlimited

				          meson test $MTEST_ARGS --num-processes ${TEST_JOBS}

				        EOF

				@ -556,6 +612,7 @@ task:

				      # from C, prevent that with PYTHONCOERCECLOCALE.

				      test_world_32_script: |

				        su postgres <<-EOF

				          set -e

				          ulimit -c unlimited

				          PYTHONCOERCECLOCALE=0 LANG=C meson test $MTEST_ARGS -C build-32 --num-processes ${TEST_JOBS}

				        EOF

				@ -573,21 +630,29 @@ task:

				# SPECIAL:

				# - Enables --clone for pg_upgrade and pg_combinebackup

				task:

				  name: macOS - Sonoma - Meson

				  name: macOS - Sequoia - Meson

				  env:

				    CPUS: 4 # always get that much for cirrusci macOS instances

				    BUILD_JOBS: $CPUS

				    # Test performance regresses noticably when using all cores. 8 seems to

				    # Test performance regresses noticeably when using all cores. 8 seems to

				    # work OK. See

				    # https://postgr.es/m/20220927040208.l3shfcidovpzqxfh%40awork3.anarazel.de

				    TEST_JOBS: 8

				    IMAGE: ghcr.io/cirruslabs/macos-runner:sonoma

				    IMAGE: ghcr.io/cirruslabs/macos-runner:sequoia

				    CIRRUS_WORKING_DIR: ${HOME}/pgsql/

				    CCACHE_DIR: ${HOME}/ccache

				    MACPORTS_CACHE: ${HOME}/macports-cache

				    MESON_FEATURES: >-

				      -Dbonjour=enabled

				      -Ddtrace=enabled

				      -Dgssapi=enabled

				      -Dlibcurl=enabled

				      -Dnls=enabled

				      -Duuid=e2fs

				    MACOS_PACKAGE_LIST: >-

				      ccache

				      icu

				@ -613,7 +678,7 @@ task:

				  <<: *macos_task_template

				  depends_on: SanityCheck

				  only_if: $CIRRUS_CHANGE_MESSAGE !=~ '.*\nci-os-only:.*' || $CIRRUS_CHANGE_MESSAGE =~ '.*\nci-os-only:[^\n]*(macos|darwin|osx).*'

				  only_if: $CI_MACOS_ENABLED

				  sysinfo_script: |

				    id

				@ -657,11 +722,11 @@ task:

				  configure_script: |

				    export PKG_CONFIG_PATH="/opt/local/lib/pkgconfig/"

				    meson setup \

				      ${MESON_COMMON_PG_CONFIG_ARGS} \

				      --buildtype=debug \

				      -Dextra_include_dirs=/opt/local/include \

				      -Dextra_lib_dirs=/opt/local/lib \

				      -Dcassert=true -Dinjection_points=true \

				      -Duuid=e2fs -Ddtrace=auto \

				      ${MESON_COMMON_FEATURES} ${MESON_FEATURES} \

				      build

				  build_script: ninja -C build -j${BUILD_JOBS} ${MBUILD_TARGET}

				@ -701,7 +766,7 @@ WINDOWS_ENVIRONMENT_BASE: &WINDOWS_ENVIRONMENT_BASE

				task:

				  name: Windows - Server 2019, VS 2019 - Meson & ninja

				  name: Windows - Server 2022, VS 2019 - Meson & ninja

				  << : *WINDOWS_ENVIRONMENT_BASE

				  env:

				@ -716,10 +781,19 @@ task:

				    # 0x8001 is SEM_FAILCRITICALERRORS | SEM_NOOPENFILEERRORBOX

				    CIRRUS_WINDOWS_ERROR_MODE: 0x8001

				    MESON_FEATURES:

				      -Dcpp_args=/std:c++20

				      -Dauto_features=disabled

				      -Dldap=enabled

				      -Dssl=openssl

				      -Dtap_tests=enabled

				      -Dplperl=enabled

				      -Dplpython=enabled

				  <<: *windows_task_template

				  depends_on: SanityCheck

				  only_if: $CIRRUS_CHANGE_MESSAGE !=~ '.*\nci-os-only:.*' || $CIRRUS_CHANGE_MESSAGE =~ '.*\nci-os-only:[^\n]*windows.*'

				  only_if: $CI_WINDOWS_ENABLED

				  setup_additional_packages_script: |

				    REM choco install -y --no-progress ...

				@ -730,10 +804,9 @@ task:

				    echo 127.0.0.3 pg-loadbalancetest >> c:\Windows\System32\Drivers\etc\hosts

				    type c:\Windows\System32\Drivers\etc\hosts

				  # Use /DEBUG:FASTLINK to avoid high memory usage during linking

				  configure_script: |

				    vcvarsall x64

				    meson setup --backend ninja --buildtype debug -Dc_link_args=/DEBUG:FASTLINK -Dcassert=true -Dinjection_points=true -Db_pch=true -Dextra_lib_dirs=c:\openssl\1.1\lib -Dextra_include_dirs=c:\openssl\1.1\include -DTAR=%TAR% build

				    meson setup --backend ninja %MESON_COMMON_PG_CONFIG_ARGS% --buildtype debug -Db_pch=true -Dextra_lib_dirs=c:\openssl\1.1\lib -Dextra_include_dirs=c:\openssl\1.1\include -DTAR=%TAR% %MESON_FEATURES% build

				  build_script: |

				    vcvarsall x64

				@ -753,15 +826,13 @@ task:

				task:

				  << : *WINDOWS_ENVIRONMENT_BASE

				  name: Windows - Server 2019, MinGW64 - Meson

				  name: Windows - Server 2022, MinGW64 - Meson

				  # See REPO_CI_AUTOMATIC_TRIGGER_TASKS in .cirrus.star.

				  trigger_type: $CI_TRIGGER_TYPE_MINGW

				  # due to resource constraints we don't run this task by default for now

				  trigger_type: manual

				  # worth using only_if despite being manual, otherwise this task will show up

				  # when e.g. ci-os-only: linux is used.

				  only_if: $CIRRUS_CHANGE_MESSAGE !=~ '.*\nci-os-only:.*' || $CIRRUS_CHANGE_MESSAGE =~ '.*\nci-os-only:[^\n]*mingw.*'

				  # otherwise it'll be sorted before other tasks

				  depends_on: SanityCheck

				  only_if: $CI_MINGW_ENABLED

				  env:

				    TEST_JOBS: 4 # higher concurrency causes occasional failures

				@ -777,6 +848,11 @@ task:

				    CHERE_INVOKING: 1

				    BASH: C:\msys64\usr\bin\bash.exe -l

				    # Keep -Dnls explicitly disabled, as the number of files it creates causes a

				    # noticeable slowdown.

				    MESON_FEATURES: >-

				      -Dnls=disabled

				  <<: *windows_task_template

				  ccache_cache:

				@ -791,9 +867,8 @@ task:

				    %BASH% -c "where perl"

				    %BASH% -c "perl --version"

				  # disable -Dnls as the number of files it creates cause a noticable slowdown

				  configure_script: |

				    %BASH% -c "meson setup -Ddebug=true -Doptimization=g -Dcassert=true -Dinjection_points=true -Db_pch=true -Dnls=disabled -DTAR=%TAR% build"

				    %BASH% -c "meson setup %MESON_COMMON_PG_CONFIG_ARGS% -Ddebug=true -Doptimization=g -Db_pch=true %MESON_COMMON_FEATURES% %MESON_FEATURES% -DTAR=%TAR% build"

				  build_script: |

				    %BASH% -c "ninja -C build ${MBUILD_TARGET}"

				@ -815,15 +890,14 @@ task:

				  # To limit unnecessary work only run this once the SanityCheck

				  # succeeds. This is particularly important for this task as we intentionally

				  # use always: to continue after failures. Task that did not run count as a

				  # success, so we need to recheck SanityChecks's condition here ...

				  # use always: to continue after failures.

				  depends_on: SanityCheck

				  only_if: $CIRRUS_CHANGE_MESSAGE !=~ '.*\nci-os-only:.*'

				  only_if: $CI_COMPILERWARNINGS_ENABLED

				  env:

				    CPUS: 4

				    BUILD_JOBS: 4

				    IMAGE_FAMILY: pg-ci-bookworm

				    IMAGE_FAMILY: pg-ci-trixie

				    # Use larger ccache cache, as this task compiles with multiple compilers /

				    # flag combinations

				@ -831,10 +905,6 @@ task:

				    CCACHE_DIR: "/tmp/ccache_dir"

				    LINUX_CONFIGURE_FEATURES: *LINUX_CONFIGURE_FEATURES

				    LINUX_MESON_FEATURES: *LINUX_MESON_FEATURES

				    # GCC emits a warning for llvm-14, so switch to a newer one.

				    LLVM_CONFIG: llvm-config-16

				  <<: *linux_task_template

				@ -871,7 +941,7 @@ task:

				        --cache gcc.cache \

				        --enable-dtrace \

				        ${LINUX_CONFIGURE_FEATURES} \

				        CC="ccache gcc" CXX="ccache g++" CLANG="ccache clang-16"

				        CC="ccache gcc" CXX="ccache g++" CLANG="ccache clang"

				      make -s -j${BUILD_JOBS} clean

				      time make -s -j${BUILD_JOBS} world-bin

				@ -882,7 +952,7 @@ task:

				        --cache gcc.cache \

				        --enable-cassert \

				        ${LINUX_CONFIGURE_FEATURES} \

				        CC="ccache gcc" CXX="ccache g++" CLANG="ccache clang-16"

				        CC="ccache gcc" CXX="ccache g++" CLANG="ccache clang"

				      make -s -j${BUILD_JOBS} clean

				      time make -s -j${BUILD_JOBS} world-bin

				@ -892,7 +962,7 @@ task:

				      time ./configure \

				        --cache clang.cache \

				        ${LINUX_CONFIGURE_FEATURES} \

				        CC="ccache clang" CXX="ccache clang++-16" CLANG="ccache clang-16"

				        CC="ccache clang" CXX="ccache clang++" CLANG="ccache clang"

				      make -s -j${BUILD_JOBS} clean

				      time make -s -j${BUILD_JOBS} world-bin

				@ -904,7 +974,7 @@ task:

				        --enable-cassert \

				        --enable-dtrace \

				        ${LINUX_CONFIGURE_FEATURES} \

				        CC="ccache clang" CXX="ccache clang++-16" CLANG="ccache clang-16"

				        CC="ccache clang" CXX="ccache clang++" CLANG="ccache clang"

				      make -s -j${BUILD_JOBS} clean

				      time make -s -j${BUILD_JOBS} world-bin

				@ -912,11 +982,11 @@ task:

				  always:

				    mingw_cross_warning_script: |

				      time ./configure \

				        --host=x86_64-w64-mingw32 \

				        --host=x86_64-w64-mingw32ucrt \

				        --enable-cassert \

				        --without-icu \

				        CC="ccache x86_64-w64-mingw32-gcc" \

				        CXX="ccache x86_64-w64-mingw32-g++"

				        CC="ccache x86_64-w64-mingw32ucrt-gcc" \

				        CXX="ccache x86_64-w64-mingw32ucrt-g++"

				      make -s -j${BUILD_JOBS} clean

				      time make -s -j${BUILD_JOBS} world-bin

				@ -928,30 +998,25 @@ task:

				    docs_build_script: |

				      time ./configure \

				        --cache gcc.cache \

				        CC="ccache gcc" CXX="ccache g++" CLANG="ccache clang-16"

				        CC="ccache gcc" CXX="ccache g++" CLANG="ccache clang"

				      make -s -j${BUILD_JOBS} clean

				      time make -s -j${BUILD_JOBS} -C doc

				  ###

				  # Verify headerscheck / cpluspluscheck succeed

				  #

				  # - Don't use ccache, the files are uncacheable, polluting ccache's

				  #   cache

				  # - Run both in same script to increase parallelism, use -k to get result of both

				  # - Use -fmax-errors, as particularly cpluspluscheck can be very verbose

				  # - XXX have to disable ICU to avoid errors:

				  #   https://postgr.es/m/20220323002024.f2g6tivduzrktgfa%40alap3.anarazel.de

				  ###

				  always:

				    headers_headerscheck_script: |

				      time ./configure \

				        ${LINUX_CONFIGURE_FEATURES} \

				        --without-icu \

				        --cache gcc.cache \

				        --quiet \

				        CC="gcc" CXX"=g++" CLANG="clang-16"

				        CC="ccache gcc" CXX="ccache g++" CLANG="ccache clang"

				      make -s -j${BUILD_JOBS} clean

				      time make -s headerscheck EXTRAFLAGS='-fmax-errors=10'

				    headers_cpluspluscheck_script: |

				      time make -s cpluspluscheck EXTRAFLAGS='-fmax-errors=10'

				      time make -s -j${BUILD_JOBS} -k ${CHECKFLAGS} headerscheck cpluspluscheck EXTRAFLAGS='-fmax-errors=10'

				  always:

				    upload_caches: ccache
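
The `only_if` lines replaced in the hunks above filtered tasks by a `ci-os-only:` line in the commit message. The following is a minimal Python sketch of that filter for the macOS task. It assumes that Cirrus's `=~` has to match the whole message and that `.` may span newlines; neither is stated in this diff. `macos_task_runs` is a hypothetical helper name used only for illustration.

```python
import re

# Patterns copied from the old macOS only_if expression.
ANY_OS_ONLY = r'.*\nci-os-only:.*'
MACOS_OS_ONLY = r'.*\nci-os-only:[^\n]*(macos|darwin|osx).*'


def macos_task_runs(commit_message: str) -> bool:
    """Mirror `msg !=~ ANY || msg =~ MACOS` (hypothetical helper).

    Assumption: whole-string matching with '.' allowed to cross newlines.
    """
    has_filter = re.fullmatch(ANY_OS_ONLY, commit_message, re.DOTALL) is not None
    names_macos = re.fullmatch(MACOS_OS_ONLY, commit_message, re.DOTALL) is not None
    # Run when no ci-os-only line exists, or when that line names macOS.
    return (not has_filter) or names_macos
```

Under these assumptions a message without any `ci-os-only:` line runs the task, and a message with `ci-os-only: linux` skips it. The new `only_if: $CI_MACOS_ENABLED` form moves this decision into a computed variable.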

									
										12

.cirrus.yml
									
										
				@ -10,12 +10,20 @@

				#

				# 1) the contents of this file

				#

				# 2) if defined, the contents of the file referenced by the, repository

				# 2) computed environment variables

				#

				#    Used to enable/disable tasks based on the execution environment. See

				#    .cirrus.star: compute_environment_vars()

				#

				# 3) if defined, the contents of the file referenced by the, repository

				#    level, REPO_CI_CONFIG_GIT_URL variable (see

				#    https://cirrus-ci.org/guide/programming-tasks/#fs for the accepted

				#    format)

				#

				# 3) .cirrus.tasks.yml

				#    This allows running tasks in a different execution environment than the

				#    default, e.g. to have sufficient resources for cfbot.

				#

				# 4) .cirrus.tasks.yml

				#

				# This composition is done by .cirrus.star
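
The ordering described in the updated comment can be pictured with a small Python sketch. This is not the code in `.cirrus.star`; it assumes the fragments are YAML strings joined in the listed order, and `compose_config` and its parameters are hypothetical names.

```python
from typing import Optional


def compose_config(computed_env_yaml: str,
                   repo_config_yaml: Optional[str],
                   tasks_yaml: str) -> str:
    """Join fragments 2, 3 and 4 in the order the comment lists them.

    Assumption: fragment 1 (the contents of .cirrus.yml itself) is supplied
    by the CI system, so it is not handled here.
    """
    parts = [computed_env_yaml]          # 2) computed environment variables
    if repo_config_yaml is not None:     # 3) applies only "if defined"
        parts.append(repo_config_yaml)
    parts.append(tasks_yaml)             # 4) .cirrus.tasks.yml
    return "\n".join(parts)
```

For example, `compose_config("env: {}", None, "task: {}")` skips step 3 and returns the two remaining fragments in order.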

									
										6

.editorconfig
									
										
				@ -85,6 +85,12 @@ insert_final_newline = true

				indent_style = unset

				tab_width = unset

				[src/backend/utils/misc/postgresql.conf.sample]

				trim_trailing_whitespace = true

				insert_final_newline = true

				indent_style = space

				tab_width = unset

				[*.out]

				indent_style = unset

				indent_size = unset

159

.git-blame-ignore-revs


 @ -14,6 +14,66 @@
 #
 # $ git log --pretty=format:"%H # %cd%n# %s" $PGINDENTGITHASH -1 --date=iso
 fe0779d8d116bf8ee06056fed4c1257f1744c # 2026-05-13 10:41:33 -0400
 # Pre-beta mechanical code beautification, step 3: run reformat-dat-files.
 a97bddd16f0511dc62b7e4770376a34f10114 # 2026-05-13 10:37:42 -0400
 # Pre-beta mechanical code beautification, step 2: run pgperltidy.
 ee42a3413b416898e7931a8a3a5b43e9ab # 2026-05-13 10:34:17 -0400
 # Pre-beta mechanical code beautification, step 1: run pgindent.
 a1f9542792c6533ef74c2e7aefad0da1d9a7a # 2026-04-17 17:46:27 -0400
 # pg_plan_advice: pgindent
 acd1be59e407a62dbe6a5240d9d1dcb8cd062 # 2026-04-04 21:50:54 +0700
 # Fix indentation
 da8b1f6143808a4433df645c1e81f6a8bbd1e # 2026-03-26 20:10:13 -0400
 # pg_plan_advice: pgindent
 d32016d845f8a29b3ec3ab7fa98a69cea1a0f # 2026-03-19 13:42:24 +0900
 # test_saslprep: Apply proper indentation
 b6eb8dde6be32e394a4420dfeb671b5891a87c8b # 2026-03-11 15:14:46 +0100
 # Fix indentation from commit 29a0fb21577
 b24959434629970c14fc5ee668585e491e565e4 # 2026-02-23 16:22:49 -0500
 # Fix indentation from commit b380a56a3f9
 c37015d49665c52ae7eabd5852af36851aede4 # 2026-01-27 00:26:36 +0100
 # pgindent fix for 3fccbd94cba
 f2231a30cb5002219888eef14f4dfce5b0391 # 2026-01-15 14:57:45 -0500
 # pgindent fix for 8077649907d
 b276a4a9b4b2c63ef00765f0e2867e1bcac4ca # 2025-11-19 10:41:28 +0100
 # Fix indentation
 f63ae72bbcea057534144eaf27ffe3f6e9267511 # 2025-11-18 10:28:36 -0600
 # Switch from tabs to spaces in postgresql.conf.sample.
 c2b0e3a0351e021dea9b61fe2f759570d3fedb70 # 2025-11-13 14:25:21 +0900
 # Fix indentation issue
 e94a7afe44bfa1bd8dc929204a2d4ac8b3fa9854 # 2025-10-21 09:56:26 -0500
 # Re-pgindent brin.c.
 e9c216b5236cc61f677787b35e8c8f28f5f6959 # 2025-09-13 14:50:02 -0500
 # Re-pgindent nbtpreprocesskeys.c after commit 796962922e.
 d1612aec7688139e1a5506df1366b4b6a69605d # 2025-07-29 09:10:41 -0400
 # Run pgindent.
 73873805fb3627cb23937c750fa83ffd8f16fc6c # 2025-07-25 16:36:44 -0400
 # Run pgindent on the changes of the previous patch.
 e345415bcd3c4358350b89edfd710469b8bfaf9 # 2025-07-01 15:23:07 +0200
 # Fix indentation in pg_numa code
 ebd24255581837f9a5b189ef15147b769df116b # 2025-06-29 21:14:21 -0400
 # Run pgperltidy
 b27644bade0348d0dafd3036c47880a349fe9332 # 2025-06-15 13:04:24 -0400
 # Sync typedefs.list with the buildfarm.
 @ -26,6 +86,9 @@ b27644bade0348d0dafd3036c47880a349fe9332 # 2025-06-15 13:04:24 -0400
 e1a8b1ad587112e67fdc5aa7b388631dde4dbdda # 2025-04-04 09:38:22 -0500
 # Re-pgindent pg_largeobject.c after commit 0d6c477664.
 ec4327d106be745534592e8aff14effb716f4dc9 # 2025-03-29 16:47:44 +0100
 # amcheck: Fix indentation in verify_gin.c
 bdda484c838313959f65e2b700f14ac7c0e66 # 2025-03-18 09:02:36 -0400
 # Fix indentation again.
 @ -38,6 +101,9 @@ b955df443405e056fd9047ef819a1465654f9d79 # 2025-03-13 12:41:44 +1300
 aa615943049c04efd36ab4765c06eda89cdfea # 2025-01-31 16:44:24 +0900
 # Fix bad indentation introduced in commit d47cbf474
 be31ac25191b26a8a1db345a727545959654f4cb # 2025-01-22 10:15:32 +0900
 # Run perltidy
 e826278f1ebd9967c0f8adda29c8960a812e344 # 2025-01-13 11:27:32 +0900
 # Fix pgindent damage
 @ -50,6 +116,9 @@ b955df443405e056fd9047ef819a1465654f9d79 # 2025-03-13 12:41:44 +1300
 a7f2f6adc240a2823c2344b89e90bb630dea8803 # 2024-10-16 12:21:13 -0700
 # Whitespace fixup from generated unicode tables.
 ad00e4b59a04263fd79fb115aecce2fb0851b # 2024-10-14 11:25:03 +0200
 # Run pgperltidy on newly-added test code
 f7474a8e4002ac9fd4979cc7b16b50b70b70c28 # 2024-09-27 11:14:31 -0400
 # Reindent pg_verifybackup.c.
 @ -71,15 +140,27 @@ c883453cb29cb40c1e59c3c54d159c5e744da8a9 # 2024-07-26 12:00:04 -0400
 ecbfdfcc71e41d7dcc35f0be04f8adbe88397f # 2024-07-15 15:17:04 -0700
 # Fix bad indentation introduced in 43cd30bcd1c
 dcc6f8e6d7a0eb0ce90802311278723843b4bbd # 2024-07-01 07:35:01 +0900
 # Run pgperltidy
 b48f275f18d7da4f4863888ad047cbd699698880 # 2024-06-28 10:51:05 -0400
 # pgindent, because I forgot to do that.
 da256a4a7fdcca35fe7ca808686ad3de6ee22306 # 2024-05-14 16:34:50 -0400
 # Pre-beta mechanical code beautification.
 9301308bd196f614696e0e9492cf0c52f7857f83 # 2024-04-03 09:44:47 +0200
 # Fix indentation from cafe1056558f
 b0be28761ec5958bb7bbf9a03d94ee6e1bc59849 # 2024-03-27 13:21:29 -0700
 # Run perltidy on generate-unicode_version.pl.
 e401b62b1559d617db5c1e1070d7a05e794c27 # 2024-03-25 14:18:33 +0100
 # Fix indentation from a11f330b5
 b0289574bdf1202248201a3143d1459bdf5727fd # 2024-03-09 11:51:31 -0800
 # Run perltidy on 002_pg_upgrade.pl.
 a3b851abe89bec6c3eff51b03038808e1997 # 2024-03-05 11:16:23 -0800
 # Run pgindent again on the same file.
 @ -98,6 +179,24 @@ dd7ea37c435e10f9c5aa3fb257a05c08814a4ad2 # 2024-03-04 14:37:35 -0500
 c019ffa3f883d139709ea0cfe76dc1bce0f1f8 # 2024-01-13 13:54:11 -0500
 # Re-pgindent catcache.c after previous commit.
 faffa0434b484772782ff4763c0b2080222dde0 # 2024-01-11 13:24:35 -0500
 # Reindent after commit d9ef650fca7bc574586f4171cd929cfd5240326e.
 d0cf0b05defcee985d5af38cb0db2b9c2f8dbae # 2023-12-05 15:54:59 +0100
 # Fix indentation
 d93ce31a584ff454f39df9f94f84376756e074b # 2023-12-01 17:58:18 +0100
 # pgindent fix
 b2caf7c0e1ebada614b6aa3004d826080a07e7e4 # 2023-11-26 21:35:32 +0100
 # Fix brin.c indentation issues introduced by c1ec02be1d
 b30e266eaa74f38bdda45067c9a5a63cd24c75 # 2023-10-30 14:52:35 -0400
 # pgindent run to fix commits de64268561 and 5ae2087202a
 fed4df5db4e78d40a0ce9cb785cfba9fa480f # 2023-10-30 10:35:03 +0200
 # Fix indentation in contrib/amcheck/verify_nbtree.c
 cd726fc92989ac918eac48fd8d684869c7 # 2023-10-26 09:20:54 +0200
 # Add trailing commas to enum definitions
 @ -128,6 +227,9 @@ bc6041b61f6678d32a5cfb70744653cd8f8d01c0 # 2023-08-30 15:56:22 +0900
 f492d2565cfbe383f13a69425d751fd79415f # 2023-07-13 22:26:10 +0900
 # Fix code indentation violation in commit b6e1157e7d
 a44d96add2eb377ab70055a54b713c5c78380383 # 2023-07-10 12:05:32 +0200
 # Fix pgindent
 a674a170058e63e8178aec8a36a673efce8801 # 2023-07-06 11:49:18 +0530
 # Fix code indentation vioaltion introduced in commit cc32ec24fd.
 @ -140,18 +242,30 @@ b334612b8aee9f9a34378982d8938b201dfad323 # 2023-06-20 09:50:43 -0400
 f8db36f375326c2bae0c3420d3c77714e72d # 2023-05-19 17:24:48 -0400
 # Pre-beta mechanical code beautification.
 b8ace8d773257fffeaceda196ed94877c2b74df # 2023-05-19 10:52:04 +0200
 # Reindent some comments
 b6dfee28f2b44e28b123b77a91fb05c47da63501 # 2023-03-09 15:09:45 +0900
 # Run pgindent on libpq's fe-auth.c, fe-auth-scram.c and fe-connect.c
 dbdea02d74c72db2d1a57d5299f94f91fa975 # 2023-01-23 23:08:38 +1300
 # Run pgindent on heapam.c
 c8ee07ede8a104ae1471f6ebca204d94267dd # 2023-01-13 15:23:17 -0800
 # Manual cleanup and pgindent of pgstat and bufmgr related code
 c4dafe1eed511c5af92bcea5311cf627673377 # 2022-12-04 14:25:53 -0500
 # Re-pgindent a few files.
 b2e6e768230be334b12dae536ba4c147fba4e9c9 # 2022-09-08 14:01:13 +0700
 # Run perltidy over Catalog.pm
 d0ffae3219e4bc153a1306ce23013d168e04a2 # 2022-06-30 11:03:03 -0400
 # pgindent run prior to branching v15.
 ed71e423ee63b263730b86326da2a629a29f84 # 2022-05-13 07:17:29 +0200
 # Indent C code in flex and bison files
 e7b38bfe396f919fdb66057174d29e17086418 # 2022-05-12 15:17:30 -0400
 # Pre-beta mechanical code beautification.
 @ -209,6 +323,9 @@ d8421390996dcd762383a28e57d1f3f16cc5f76f # 2018-06-30 12:28:55 -0400
 e9c8580904625576871eeb2efec7f04d4c3bc1c # 2018-06-30 12:25:49 -0400
 # pgindent run prior to branching
 bb240e1c8e279efa2d805c7f700abfb771925 # 2018-05-09 10:05:35 -0400
 # perltidy some recent code changes before changing perltidy settings
 bdf46af748d0f15f257c99bf06e9e25aba6a24f9 # 2018-04-26 14:47:16 -0400
 # Post-feature-freeze pgindent run.
 @ -242,21 +359,36 @@ ce554810329b9b8e862eade08b598148931eb456 # 2017-05-17 19:01:23 -0400
 a6fd7b7a5f7bf3a8aa3f3d076cf09d922c1c6dd2 # 2017-05-17 16:31:56 -0400
 # Post-PG 10 beta1 pgindent run
 cdb3414b3fb4c8fcc069572568390450bb04c9 # 2017-01-24 10:20:02 -0500
 # Reindent table partitioning code.
 fcf70e0dbca1432959be5f3557acd546d639c9ba # 2016-11-17 14:36:59 -0500
 # Re-pgindent src/bin/pg_dump/*
 b5bce6c1ec6061c8a4f730d927e162db7e2ce365 # 2016-08-15 13:42:51 -0400
 # Final pgindent + perltidy run for 9.6.
 fe7360afdc0bb1820743ea5bfe3fc7d522f6c4 # 2016-08-05 15:14:19 -0400
 # Re-pgindent tsvector_op.c.
 be0a62ffe58f0753d190cbe22acbeb8b4926b85 # 2016-06-12 04:19:56 -0400
 # Finish pgindent run for 9.6: Perl files.
 bc424b968058c7f0aa685821d7039e86faac99c # 2016-06-09 18:02:36 -0400
 # pgindent run for 9.6
 dd318d277b8e1d8269b030f545240193943162f # 2016-04-09 13:33:33 -0400
 # Run pgindent on generic_xlog.c.
 de94e2af184e25576b13cbda8cf825118835d1cd # 2016-04-06 11:34:02 -0400
 # Run pgindent on a batch of (mostly-planner-related) source files.
 be060cbcd42737693f6fd425db4c139121181cce # 2016-03-09 13:51:11 -0500
 # Re-pgindent vacuumlazy.c.
 a361490806435fda6340fa13c0a881767c57c87a # 2016-02-12 13:36:13 -0500
 # Re-pgindent isolationtester.c.
 f838565d2921a0960407c4240237ba1d56ae # 2016-02-08 15:17:40 -0500
 # Re-pgindent varlena.c.
 @ -266,6 +398,9 @@ d0cd7bda97a626049aa7d247374909c52399c413 # 2016-02-04 22:30:08 -0500
 d290c8ec6c182a4df1d089c21fe84c7912f01fe # 2016-01-17 19:13:18 -0500
 # Re-pgindent a few files.
 e009babe6020fddcf3820e57e2f87c5539c # 2016-01-13 15:48:54 -0500
 # Run pgindent on src/bin/pg_dump/*
 befa3e648ce018d84cd2a0df701927c56fe3da4e # 2015-05-24 21:45:01 -0400
 # Revert 9.5 pgindent changes to atomics directory files
 @ -284,18 +419,36 @@ befa3e648ce018d84cd2a0df701927c56fe3da4e # 2015-05-24 21:45:01 -0400
 a7832005792fa6dad171f9cadb8d587fe0dd800 # 2014-05-06 12:12:18 -0400
 # pgindent run for 9.4
 ee266339bd4a049ff92e101010242169b7287 # 2014-01-22 08:46:51 -0500
 # Reindent json.c and jsonfuncs.c.
 af4159fce6654aa0e081b00d02bca40b978745c # 2013-05-29 16:58:43 -0400
 # pgindent run for release 9.3 This is the first run of the Perl-based pgindent script.  Also update pgindent instructions.
 d9ffc282a8c796d2a5babc600c1a6db150dac # 2012-07-04 21:47:49 -0400
 # Run newly-configured perltidy script on Perl files.
 d61eeff78363ea3938c818d07e511ebaf75cf # 2012-06-10 15:20:04 -0400
 # Run pgindent on 9.2 source tree in preparation for first 9.3 commit-fest.
 cdaa45fd4b09c64d634818e52ef7a2191ce40667 # 2011-11-14 12:08:48 -0500
 # Run pgindent on range type files, per request from Tom.
 fd6913a18955b0f89ca994b5036c103bcea23f28 # 2011-07-12 15:25:08 +0100
 # perltidy run over msvc build system
 6560407c7db2c7e32926a46f5fb52175ac10d9e5 # 2011-06-09 14:32:50 -0400
 # Pgindent run before 9.1 beta2.
 bf50caf105a901c4f83ac1df3cdaf910c26694a4 # 2011-04-10 11:42:00 -0400
 # pgindent run before PG 9.1 beta 1.
 f7b51d175a02a3b6589f091ca732959618844232 # 2011-02-17 22:20:39 -0300
 # pgindent run on plperl.c
 c0e96b49e588b2a5ab501a2acc03b96ff76cf288 # 2011-01-03 10:44:56 +0100
 # perltidy run on the MSVC build system
 d769e7e05e0a5ef3bd6828e93e22ef3962780 # 2010-07-06 19:19:02 +0000
 # pgindent run for 9.0, second run
 @ -305,6 +458,9 @@ bf50caf105a901c4f83ac1df3cdaf910c26694a4 # 2011-04-10 11:42:00 -0400
 d7471402794266078953f1bd113dab4913d631a1 # 2009-06-11 14:49:15 +0000
 # 8.4 pgindent run, with new combined Linux/FreeBSD/MinGW typedef list provided by Andrew.
 f0bf6cb0d1f1922a7da68392e50d214b1c2abe3 # 2007-11-16 01:12:24 +0000
 # Run pgindent on remaining files now that LOOPBYTE is a usable macro.
 f6e8730d11ddfc720eda1dde23794d262ad8cc08 # 2007-11-15 22:25:18 +0000
 # Re-run pgindent with updated list of typedefs.  (Updated README should avoid this problem in the future.)
 @ -347,6 +503,9 @@ ea08e6cd5542cb269ecd3e735f1dfa3bb61fbc4f # 2001-11-05 17:46:40 +0000
 b81844b1738c584d92330a5ccd0fbd8b603d2886 # 2001-10-25 05:50:21 +0000
 # pgindent run on all C files.  Java run to follow.  initdb/regression tests pass.
 c551683cb99d94ebb44be0b94a0e03dd1d9a0f8 # 2001-05-08 17:12:36 +0000
 # Run pgindent on ODBC code only, to reformat new comments.
 d49da0a34ad92f61f791ea1039dec5d20f41 # 2001-03-22 06:16:21 +0000
 # Remove dashes in comments that don't need them, rewrap with pgindent.
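
Git's `--ignore-revs-file` format expects one unabbreviated object name per line, with `#` starting a comment. Several hashes as rendered on this page look shorter than 40 characters, so a quick check of the real file may be worthwhile. The sketch below is one way to do that; `short_hashes` is a hypothetical helper, and it is meant for the file itself, not for diff output containing `@` hunk headers.

```python
import re

FULL_SHA1 = re.compile(r'[0-9a-f]{40}')


def short_hashes(text: str) -> list[str]:
    """Return entries whose hash is not a full 40-character SHA-1."""
    bad = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue  # blank lines and comment lines carry no hash
        # Drop the trailing "# date" comment, keep the hash candidate.
        candidate = line.split('#', 1)[0].strip()
        if not FULL_SHA1.fullmatch(candidate):
            bad.append(candidate)
    return bad
```

Calling `short_hashes(open('.git-blame-ignore-revs').read())` on a checkout should return an empty list if every entry is a full hash.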

5

.gitattributes


 @ -12,13 +12,14 @@
 *.xsl		whitespace=space-before-tab,trailing-space,tab-in-indent
 # Avoid confusing ASCII underlines with leftover merge conflict markers
 README		conflict-marker-size=32
 README.*	conflict-marker-size=32
 README		conflict-marker-size=48
 README.*	conflict-marker-size=48
 # Certain data files that contain special whitespace, and other special cases
 *.data						-whitespace
 contrib/pgcrypto/sql/pgp-armor.sql		whitespace=-blank-at-eol
 src/backend/catalog/sql_features.txt		whitespace=space-before-tab,blank-at-eof,-blank-at-eol
 src/backend/utils/misc/postgresql.conf.sample	whitespace=space-before-tab,trailing-space,tab-in-indent
 # Test output files that contain extra whitespace
 *.out					-whitespace

2

COPYRIGHT


 @ -1,7 +1,7 @@
 PostgreSQL Database Management System
 (also known as Postgres, formerly known as Postgres95)
 Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
 Portions Copyright (c) 1996-2026, PostgreSQL Global Development Group
 Portions Copyright (c) 1994, The Regents of the University of California

									
										2

Makefile
									
										
				@ -13,8 +13,6 @@

				# AIX make defaults to building *every* target of the first rule.  Start with

				# a single-target, empty rule to make the other targets non-default.

				# (We don't support AIX anymore, but if someone tries to build on AIX anyway,

				# at least they'll get the instructions to run 'configure' first.)

				all:

				all check install installdirs installcheck installcheck-parallel uninstall clean distclean maintainer-clean dist distcheck world check-world install-world installcheck-world:

221

config/c-compiler.m4


 @ -7,10 +7,10 @@
 # Select the format archetype to be used by gcc to check printf-type functions.
 # We prefer "gnu_printf", as that most closely matches the features supported
 # by src/port/snprintf.c (particularly the %m conversion spec).  However,
 # on some NetBSD versions, that doesn't work while "__syslog__" does.
 # If all else fails, use "printf".
 # on clang and on some NetBSD versions, that doesn't work while "__syslog__"
 # does.  If all else fails, use "printf".
 AC_DEFUN([PGAC_PRINTF_ARCHETYPE],
 [AC_CACHE_CHECK([for printf format archetype], pgac_cv_printf_archetype,
 [AC_CACHE_CHECK([for C printf format archetype], pgac_cv_printf_archetype,
 [pgac_cv_printf_archetype=gnu_printf
 PGAC_TEST_PRINTF_ARCHETYPE
 if [[ "$ac_archetype_ok" = no ]]; then
 @ -20,8 +20,8 @@ if [[ "$ac_archetype_ok" = no ]]; then
     pgac_cv_printf_archetype=printf
   fi
 fi])
 AC_DEFINE_UNQUOTED([PG_PRINTF_ATTRIBUTE], [$pgac_cv_printf_archetype],
 [Define to best printf format archetype, usually gnu_printf if available.])
 AC_DEFINE_UNQUOTED([PG_C_PRINTF_ATTRIBUTE], [$pgac_cv_printf_archetype],
 [Define to best C printf format archetype, usually gnu_printf if available.])
 ])# PGAC_PRINTF_ARCHETYPE
 # Subroutine: test $pgac_cv_printf_archetype, set $ac_archetype_ok to yes or no
 @ -38,6 +38,42 @@ ac_c_werror_flag=$ac_save_c_werror_flag
 ])# PGAC_TEST_PRINTF_ARCHETYPE
 # PGAC_CXX_PRINTF_ARCHETYPE
 # -------------------------
 # Because we support using gcc as C compiler with clang as C++ compiler,
 # we have to be prepared to use different printf archetypes in C++ code.
 # So, do the above test all over in C++.
 AC_DEFUN([PGAC_CXX_PRINTF_ARCHETYPE],
 [AC_CACHE_CHECK([for C++ printf format archetype], pgac_cv_cxx_printf_archetype,
 [pgac_cv_cxx_printf_archetype=gnu_printf
 PGAC_TEST_CXX_PRINTF_ARCHETYPE
 if [[ "$ac_archetype_ok" = no ]]; then
   pgac_cv_cxx_printf_archetype=__syslog__
   PGAC_TEST_CXX_PRINTF_ARCHETYPE
   if [[ "$ac_archetype_ok" = no ]]; then
     pgac_cv_cxx_printf_archetype=printf
   fi
 fi])
 AC_DEFINE_UNQUOTED([PG_CXX_PRINTF_ATTRIBUTE], [$pgac_cv_cxx_printf_archetype],
 [Define to best C++ printf format archetype, usually gnu_printf if available.])
 ])# PGAC_CXX_PRINTF_ARCHETYPE
 # Subroutine: test $pgac_cv_cxx_printf_archetype, set $ac_archetype_ok to yes or no
 AC_DEFUN([PGAC_TEST_CXX_PRINTF_ARCHETYPE],
 [ac_save_cxx_werror_flag=$ac_cxx_werror_flag
 ac_cxx_werror_flag=yes
 AC_LANG_PUSH(C++)
 AC_COMPILE_IFELSE([AC_LANG_PROGRAM(
 [extern void pgac_write(int ignore, const char *fmt,...)
 __attribute__((format($pgac_cv_cxx_printf_archetype, 2, 3)));],
 [pgac_write(0, "error %s: %m", "foo");])],
                   [ac_archetype_ok=yes],
                   [ac_archetype_ok=no])
 AC_LANG_POP([])
 ac_cxx_werror_flag=$ac_save_cxx_werror_flag
 ])# PGAC_TEST_CXX_PRINTF_ARCHETYPE
 # PGAC_TYPE_128BIT_INT
 # --------------------
 # Check if __int128 is a working 128 bit integer type, and if so
 @ -83,7 +119,7 @@ if test x"$pgac_cv__128bit_int" = xyes ; then
   AC_CACHE_CHECK([for __int128 alignment bug], [pgac_cv__128bit_int_bug],
   [AC_RUN_IFELSE([AC_LANG_PROGRAM([
 /* This must match the corresponding code in c.h: */
 #if defined(__GNUC__) || defined(__SUNPRO_C)
 #if defined(__GNUC__)
 #define pg_attribute_aligned(a) __attribute__((aligned(a)))
 #elif defined(_MSC_VER)
 #define pg_attribute_aligned(a) __declspec(align(a))
 @ -114,26 +150,6 @@ fi])# PGAC_TYPE_128BIT_INT
 # PGAC_C_STATIC_ASSERT
 # --------------------
 # Check if the C compiler understands _Static_assert(),
 # and define HAVE__STATIC_ASSERT if so.
 #
 # We actually check the syntax ({ _Static_assert(...) }), because we need
 # gcc-style compound expressions to be able to wrap the thing into macros.
 AC_DEFUN([PGAC_C_STATIC_ASSERT],
 [AC_CACHE_CHECK(for _Static_assert, pgac_cv__static_assert,
 [AC_LINK_IFELSE([AC_LANG_PROGRAM([],
 [({ _Static_assert(1, "foo"); })])],
 [pgac_cv__static_assert=yes],
 [pgac_cv__static_assert=no])])
 if test x"$pgac_cv__static_assert" = xyes ; then
 AC_DEFINE(HAVE__STATIC_ASSERT, 1,
           [Define to 1 if your compiler understands _Static_assert.])
 fi])# PGAC_C_STATIC_ASSERT
 # PGAC_C_TYPEOF
 # -------------
 # Check if the C compiler understands typeof or a variant.  Define
 @ -160,6 +176,96 @@ if test "$pgac_cv_c_typeof" != no; then
 fi])# PGAC_C_TYPEOF
 # PGAC_C_TYPEOF_UNQUAL
 # --------------------
 # Check if the C compiler understands typeof_unqual or a variant.  Define
 # HAVE_TYPEOF_UNQUAL if so, and define 'typeof_unqual' to the actual key word.
 #
 AC_DEFUN([PGAC_C_TYPEOF_UNQUAL],
 [AC_CACHE_CHECK(for typeof_unqual, pgac_cv_c_typeof_unqual,
 [pgac_cv_c_typeof_unqual=no
 # Test with a void pointer, because MSVC doesn't handle that, and we
 # need that for copyObject().
 for pgac_kw in typeof_unqual __typeof_unqual__; do
   AC_COMPILE_IFELSE([AC_LANG_PROGRAM([],
 [int x = 0;
 $pgac_kw(x) y;
 const void *a;
 void *b;
 y = x;
 b = ($pgac_kw(*a) *) a;
 return y;])],
 [pgac_cv_c_typeof_unqual=$pgac_kw])
   test "$pgac_cv_c_typeof_unqual" != no && break
 done])
 if test "$pgac_cv_c_typeof_unqual" != no; then
   AC_DEFINE(HAVE_TYPEOF_UNQUAL, 1,
             [Define to 1 if your compiler understands `typeof_unqual' or something similar.])
   if test "$pgac_cv_c_typeof_unqual" != typeof_unqual; then
     AC_DEFINE_UNQUOTED(typeof_unqual, $pgac_cv_c_typeof_unqual, [Define to how the compiler spells `typeof_unqual'.])
   fi
 fi])# PGAC_C_TYPEOF_UNQUAL
 # PGAC_CXX_TYPEOF
 # ----------------
 # Check if the C++ compiler understands typeof or a variant.  Define
 # HAVE_CXX_TYPEOF if so, and define 'pg_cxx_typeof' to the actual key word.
 #
 AC_DEFUN([PGAC_CXX_TYPEOF],
 [AC_CACHE_CHECK(for C++ typeof, pgac_cv_cxx_typeof,
 [pgac_cv_cxx_typeof=no
 AC_LANG_PUSH(C++)
 for pgac_kw in typeof __typeof__; do
   AC_COMPILE_IFELSE([AC_LANG_PROGRAM([],
 [int x = 0;
 $pgac_kw(x) y;
 y = x;
 return y;])],
 [pgac_cv_cxx_typeof=$pgac_kw])
   test "$pgac_cv_cxx_typeof" != no && break
 done
 AC_LANG_POP([])])
 if test "$pgac_cv_cxx_typeof" != no; then
   AC_DEFINE(HAVE_CXX_TYPEOF, 1,
             [Define to 1 if your C++ compiler understands `typeof' or something similar.])
   if test "$pgac_cv_cxx_typeof" != typeof; then
     AC_DEFINE_UNQUOTED(pg_cxx_typeof, $pgac_cv_cxx_typeof, [Define to how the C++ compiler spells `typeof'.])
   fi
 fi])# PGAC_CXX_TYPEOF
 # PGAC_CXX_TYPEOF_UNQUAL
 # ----------------------
 # Check if the C++ compiler understands typeof_unqual or a variant.  Define
 # HAVE_CXX_TYPEOF_UNQUAL if so, and define 'pg_cxx_typeof_unqual' to the actual key word.
 #
 AC_DEFUN([PGAC_CXX_TYPEOF_UNQUAL],
 [AC_CACHE_CHECK(for C++ typeof_unqual, pgac_cv_cxx_typeof_unqual,
 [pgac_cv_cxx_typeof_unqual=no
 AC_LANG_PUSH(C++)
 for pgac_kw in typeof_unqual __typeof_unqual__; do
   AC_COMPILE_IFELSE([AC_LANG_PROGRAM([],
 [int x = 0;
 $pgac_kw(x) y;
 const void *a;
 void *b;
 y = x;
 b = ($pgac_kw(*a) *) a;
 return y;])],
 [pgac_cv_cxx_typeof_unqual=$pgac_kw])
   test "$pgac_cv_cxx_typeof_unqual" != no && break
 done
 AC_LANG_POP([])])
 if test "$pgac_cv_cxx_typeof_unqual" != no; then
   AC_DEFINE(HAVE_CXX_TYPEOF_UNQUAL, 1,
             [Define to 1 if your C++ compiler understands `typeof_unqual' or something similar.])
   if test "$pgac_cv_cxx_typeof_unqual" != typeof_unqual; then
     AC_DEFINE_UNQUOTED(pg_cxx_typeof_unqual, $pgac_cv_cxx_typeof_unqual, [Define to how the C++ compiler spells `typeof_unqual'.])
   fi
 fi])# PGAC_CXX_TYPEOF_UNQUAL
 # PGAC_C_TYPES_COMPATIBLE
 # -----------------------
 @ -581,6 +687,31 @@ fi
 undefine([Ac_cachevar])dnl
 ])# PGAC_SSE42_CRC32_INTRINSICS
 # PGAC_AVX2_SUPPORT
 # ---------------------------
 # Check if the compiler supports AVX2 as a target
 #
 # If AVX2 target attribute is supported, sets pgac_avx2_support.
 #
 # There is deliberately not a guard for __has_attribute here
 AC_DEFUN([PGAC_AVX2_SUPPORT],
 [define([Ac_cachevar], [AS_TR_SH([pgac_cv_avx2_support])])dnl
 AC_CACHE_CHECK([for AVX2 target attribute support], [Ac_cachevar],
 [AC_COMPILE_IFELSE([AC_LANG_PROGRAM([
     __attribute__((target("avx2")))
     static int avx2_test(void)
     {
       return 0;
     }],
   [return avx2_test();])],
   [Ac_cachevar=yes],
   [Ac_cachevar=no])])
 if test x"$Ac_cachevar" = x"yes"; then
   pgac_avx2_support=yes
 fi
 undefine([Ac_cachevar])dnl
 ])# PGAC_AVX2_SUPPORT
 # PGAC_AVX512_PCLMUL_INTRINSICS
 # ---------------------------
 # Check if the compiler supports AVX-512 carryless multiplication
 @ -653,6 +784,44 @@ fi
 undefine([Ac_cachevar])dnl
 ])# PGAC_ARMV8_CRC32C_INTRINSICS
 # PGAC_ARM_PLMULL
 # ---------------------------
 # Check if the compiler supports Arm CRYPTO PMULL (carryless multiplication)
 # instructions used for vectorized CRC.
 #
 # If the instructions are supported, sets pgac_arm_pmull.
 AC_DEFUN([PGAC_ARM_PLMULL],
 [define([Ac_cachevar], [AS_TR_SH([pgac_cv_arm_pmull_$1])])dnl
 AC_CACHE_CHECK([for pmull and pmull2], [Ac_cachevar],
 [AC_LINK_IFELSE([AC_LANG_PROGRAM([#include <arm_acle.h>
 #include <arm_neon.h>
 uint64x2_t  a;
 uint64x2_t  b;
 uint64x2_t  c;
 uint64x2_t  r1;
 uint64x2_t  r2;
   #if defined(__has_attribute) && __has_attribute (target)
   __attribute__((target("+crypto")))
   #endif
   static int pmull_test(void)
   {
     __asm("pmull  %0.1q, %2.1d, %3.1d\neor %0.16b, %0.16b, %1.16b\n":"=w"(r1), "+w"(c):"w"(a), "w"(b));
     __asm("pmull2 %0.1q, %2.2d, %3.2d\neor %0.16b, %0.16b, %1.16b\n":"=w"(r2), "+w"(c):"w"(a), "w"(b));
     r1 = veorq_u64(r1, r2);
     /* return computed value, to prevent the above being optimized away */
     return (int) vgetq_lane_u64(r1, 0);
   }],
   [return pmull_test();])],
   [Ac_cachevar=yes],
   [Ac_cachevar=no])])
 if test x"$Ac_cachevar" = x"yes"; then
   pgac_arm_pmull=yes
 fi
 undefine([Ac_cachevar])dnl
 ])# PGAC_ARM_PLMULL
 # PGAC_LOONGARCH_CRC32C_INTRINSICS
 # ---------------------------
 # Check if the compiler supports the LoongArch CRCC instructions, using

									
2

config/check_modules.pl

 @ -1,5 +1,5 @@
 # Copyright (c) 2024-2025, PostgreSQL Global Development Group
 # Copyright (c) 2024-2026, PostgreSQL Global Development Group
 #
 # Verify that required Perl modules are available,

17

config/config.guess vendored

 @ -1,10 +1,10 @@
 #! /bin/sh
 # Attempt to guess a canonical system name.
 #   Copyright 1992-2024 Free Software Foundation, Inc.
 #   Copyright 1992-2025 Free Software Foundation, Inc.
 # shellcheck disable=SC2006,SC2268 # see below for rationale
 timestamp='2024-07-27'
 timestamp='2025-07-10'
 # This file is free software; you can redistribute it and/or modify it
 # under the terms of the GNU General Public License as published by
 @ -60,7 +60,7 @@ version="\
 GNU config.guess ($timestamp)
 Originally written by Per Bothner.
 Copyright 1992-2024 Free Software Foundation, Inc.
 Copyright 1992-2025 Free Software Foundation, Inc.
 This is free software; see the source for copying conditions.  There is NO
 warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE."
 @ -1597,8 +1597,11 @@ EOF
     *:Unleashed:*:*)
 	GUESS=$UNAME_MACHINE-unknown-unleashed$UNAME_RELEASE
 	;;
     *:Ironclad:*:*)
 	GUESS=$UNAME_MACHINE-unknown-ironclad
     x86_64:[Ii]ronclad:*:*|i?86:[Ii]ronclad:*:*)
 	GUESS=$UNAME_MACHINE-pc-ironclad-mlibc
 	;;
     *:[Ii]ronclad:*:*)
 	GUESS=$UNAME_MACHINE-unknown-ironclad-mlibc
 	;;
 esac
 @ -1808,8 +1811,8 @@ fi
 exit 1
 # Local variables:
 # eval: (add-hook 'before-save-hook 'time-stamp)
 # eval: (add-hook 'before-save-hook 'time-stamp nil t)
 # time-stamp-start: "timestamp='"
 # time-stamp-format: "%:y-%02m-%02d"
 # time-stamp-format: "%Y-%02m-%02d"
 # time-stamp-end: "'"
 # End:

28

config/config.sub vendored

 @ -1,10 +1,10 @@
 #! /bin/sh
 # Configuration validation subroutine script.
 #   Copyright 1992-2024 Free Software Foundation, Inc.
 #   Copyright 1992-2025 Free Software Foundation, Inc.
 # shellcheck disable=SC2006,SC2268,SC2162 # see below for rationale
 timestamp='2024-05-27'
 timestamp='2025-07-10'
 # This file is free software; you can redistribute it and/or modify it
 # under the terms of the GNU General Public License as published by
 @ -76,7 +76,7 @@ Report bugs and patches to <config-patches@gnu.org>."
 version="\
 GNU config.sub ($timestamp)
 Copyright 1992-2024 Free Software Foundation, Inc.
 Copyright 1992-2025 Free Software Foundation, Inc.
 This is free software; see the source for copying conditions.  There is NO
 warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE."
 @ -145,6 +145,7 @@ case $1 in
 			| kfreebsd*-gnu* \
 			| knetbsd*-gnu* \
 			| kopensolaris*-gnu* \
 			| ironclad-* \
 			| linux-* \
 			| managarm-* \
 			| netbsd*-eabi* \
 @ -242,7 +243,6 @@ case $1 in
 					| rombug \
 					| semi \
 					| sequent* \
 					| siemens \
 					| sgi* \
 					| siemens \
 					| sim \
 @ -261,7 +261,7 @@ case $1 in
 						basic_machine=$field1-$field2
 						basic_os=
 						;;
 					zephyr*)
 					tock* | zephyr*)
 						basic_machine=$field1-unknown
 						basic_os=$field2
 						;;
 @ -1194,7 +1194,7 @@ case $cpu-$vendor in
 	xscale-* | xscalee[bl]-*)
 		cpu=`echo "$cpu" | sed 's/^xscale/arm/'`
 		;;
 	arm64-* | aarch64le-*)
 	arm64-* | aarch64le-* | arm64_32-*)
 		cpu=aarch64
 		;;
 @ -1321,6 +1321,7 @@ case $cpu-$vendor in
 			| i960 \
 			| ia16 \
 			| ia64 \
 			| intelgt \
 			| ip2k \
 			| iq2000 \
 			| javascript \
 @ -1522,6 +1523,10 @@ EOF
 		kernel=nto
 		os=`echo "$basic_os" | sed -e 's|nto|qnx|'`
 		;;
 	ironclad*)
 		kernel=ironclad
 		os=`echo "$basic_os" | sed -e 's|ironclad|mlibc|'`
 		;;
 	linux*)
 		kernel=linux
 		os=`echo "$basic_os" | sed -e 's|linux|gnu|'`
 @ -1976,6 +1981,7 @@ case $os in
 	| atheos* \
 	| auroraux* \
 	| aux* \
 	| banan_os* \
 	| beos* \
 	| bitrig* \
 	| bme* \
 @ -2022,7 +2028,6 @@ case $os in
 	| ios* \
 	| iris* \
 	| irix* \
 	| ironclad* \
 	| isc* \
 	| its* \
 	| l4re* \
 @ -2118,6 +2123,7 @@ case $os in
 	| sysv* \
 	| tenex* \
 	| tirtos* \
 	| tock* \
 	| toppers* \
 	| tops10* \
 	| tops20* \
 @ -2214,6 +2220,8 @@ case $kernel-$os-$obj in
 		;;
 	uclinux-uclibc*- | uclinux-gnu*- )
 		;;
 	ironclad-mlibc*-)
 		;;
 	managarm-mlibc*- | managarm-kernel*- )
 		;;
 	windows*-msvc*-)
 @ -2249,6 +2257,8 @@ case $kernel-$os-$obj in
 		;;
 	*-eabi*- | *-gnueabi*-)
 		;;
 	ios*-simulator- | tvos*-simulator- | watchos*-simulator- )
 		;;
 	none--*)
 		# None (no kernel, i.e. freestanding / bare metal),
 		# can be paired with an machine code file format
 @ -2347,8 +2357,8 @@ echo "$cpu-$vendor${kernel:+-$kernel}${os:+-$os}${obj:+-$obj}"
 exit
 # Local variables:
 # eval: (add-hook 'before-save-hook 'time-stamp)
 # eval: (add-hook 'before-save-hook 'time-stamp nil t)
 # time-stamp-start: "timestamp='"
 # time-stamp-format: "%:y-%02m-%02d"
 # time-stamp-format: "%Y-%02m-%02d"
 # time-stamp-end: "'"
 # End:
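
Note on the config.sub hunks above: the new `ironclad*)` arm splits the OS field into a kernel and a libc-style OS name, the same way the neighbouring `linux*)` arm does. A minimal sketch of that mapping follows. The function name `split_basic_os` is hypothetical and not part of config.sub, and the fallback arm is a simplification assumed for illustration.

```shell
#!/bin/sh
# split_basic_os: hypothetical helper, not part of config.sub.
# It mirrors the ironclad and linux arms shown in the diff above.
split_basic_os() {
    basic_os=$1
    case $basic_os in
        ironclad*)
            kernel=ironclad
            os=`echo "$basic_os" | sed -e 's|ironclad|mlibc|'`
            ;;
        linux*)
            kernel=linux
            os=`echo "$basic_os" | sed -e 's|linux|gnu|'`
            ;;
        *)
            # Simplified fallback (assumption): no separate kernel part.
            kernel=
            os=$basic_os
            ;;
    esac
    # Join the parts as kernel-os, omitting an empty kernel.
    echo "${kernel:+$kernel-}$os"
}

split_basic_os ironclad
split_basic_os linux
```

Under these assumptions a bare `ironclad` becomes `ironclad-mlibc`, which is the pair the new `ironclad-mlibc*-)` validation arm accepts.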

21

config/llvm.m4

 @ -4,7 +4,7 @@
 # -----------------
 #
 # Look for the LLVM installation, check that it's new enough, set the
 # corresponding LLVM_{CFLAGS,CXXFLAGS,BINPATH} and LDFLAGS
 # corresponding LLVM_{CFLAGS,CXXFLAGS,BINPATH,LIBS}
 # variables. Also verify that CLANG is available, to transform C
 # into bitcode.
 #
 @ -55,7 +55,7 @@ AC_DEFUN([PGAC_LLVM_SUPPORT],
   for pgac_option in `$LLVM_CONFIG --ldflags`; do
     case $pgac_option in
       -L*) LDFLAGS="$LDFLAGS $pgac_option";;
       -L*) LLVM_LIBS="$LLVM_LIBS $pgac_option";;
     esac
   done
 @ -101,20 +101,3 @@ dnl LLVM_CONFIG, CLANG are already output via AC_ARG_VAR
   AC_SUBST(LLVM_BINPATH)
 ])# PGAC_LLVM_SUPPORT
 # PGAC_CHECK_LLVM_FUNCTIONS
 # -------------------------
 #
 # Check presence of some optional LLVM functions.
 # (This shouldn't happen until we're ready to run AC_CHECK_DECLS tests;
 # because PGAC_LLVM_SUPPORT runs very early, it's not an appropriate place.)
 #
 AC_DEFUN([PGAC_CHECK_LLVM_FUNCTIONS],
 [
   # Check which functionality is present
   SAVE_CPPFLAGS="$CPPFLAGS"
   CPPFLAGS="$CPPFLAGS $LLVM_CPPFLAGS"
   AC_CHECK_DECLS([LLVMCreateGDBRegistrationListener, LLVMCreatePerfJITEventListener], [], [], [[#include <llvm-c/ExecutionEngine.h>]])
   CPPFLAGS="$SAVE_CPPFLAGS"
 ])# PGAC_CHECK_LLVM_FUNCTIONS
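
Note on the PGAC_LLVM_SUPPORT hunk above: the `-L` options reported by `llvm-config --ldflags` are now collected into `LLVM_LIBS` rather than appended to the global `LDFLAGS`. A rough sketch of that filtering, assuming a made-up option string in place of real `llvm-config` output. `collect_libdirs` is a hypothetical name.

```shell
#!/bin/sh
# collect_libdirs: hypothetical helper that keeps only -L options from a
# whitespace-separated option list, like the case statement in the hunk.
collect_libdirs() {
    out=
    for opt in $1; do
        case $opt in
            -L*) out="${out:+$out }$opt";;
        esac
    done
    echo "$out"
}

# Made-up stand-in for the output of `$LLVM_CONFIG --ldflags`.
fake_ldflags="-L/usr/lib/llvm/lib -Wl,-z,now -L/opt/llvm/lib"

LDFLAGS="-O2"                                  # global flags stay untouched
LLVM_LIBS=`collect_libdirs "$fake_ldflags"`    # LLVM search paths kept apart
echo "LDFLAGS=$LDFLAGS"
echo "LLVM_LIBS=$LLVM_LIBS"
```

Keeping the search paths in a separate variable means other link tests in configure do not pick up LLVM's library directory by accident.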

8

config/prep_buildtree

 @ -22,18 +22,14 @@ sourcetree=`cd $1 && pwd`
 buildtree=`cd ${2:-'.'} && pwd`
 # We must not auto-create the subdirectories holding built documentation.
 # If we did, it would interfere with installation of prebuilt docs from
 # the source tree, if a VPATH build is done from a distribution tarball.
 # See bug #5595.
 for item in `find "$sourcetree" -type d \( \( -name CVS -prune \) -o \( -name .git -prune \) -o -print \) | grep -v "$sourcetree/doc/src/sgml/\+"`; do
 for item in `find "$sourcetree"/config "$sourcetree"/contrib "$sourcetree"/doc "$sourcetree"/src -type d -print`; do
     subdir=`expr "$item" : "$sourcetree\(.*\)"`
     if test ! -d "$buildtree/$subdir"; then
         mkdir -p "$buildtree/$subdir" || exit 1
     fi
 done
 for item in `find "$sourcetree" -name Makefile -print -o -name GNUmakefile -print | grep -v "$sourcetree/doc/src/sgml/images/"`; do
 for item in "$sourcetree"/Makefile `find "$sourcetree"/config "$sourcetree"/contrib "$sourcetree"/doc "$sourcetree"/src -name Makefile -print -o -name GNUmakefile -print`; do
     filename=`expr "$item" : "$sourcetree\(.*\)"`
     if test ! -f "${item}.in"; then
         if cmp "$item" "$buildtree/$filename" >/dev/null 2>&1; then : ; else
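
Note on the prep_buildtree hunk above: both loops rely on `expr STRING : REGEX` to strip the source-tree prefix from each path found by `find`. A small sketch of that idiom, using a hypothetical function name `rel_path` and made-up paths. It assumes the prefix contains no regular-expression metacharacters.

```shell
#!/bin/sh
# rel_path: hypothetical wrapper around the expr idiom used in prep_buildtree.
# $1 = source tree root, $2 = a path below it.
# expr anchors the pattern at the start of the string and prints the
# text captured by \( \), i.e. everything after the prefix.
rel_path() {
    expr "$2" : "$1\(.*\)"
}

sourcetree=/tmp/pgsrc                        # made-up location
item=$sourcetree/src/backend/Makefile
rel_path "$sourcetree" "$item"
```

The result keeps its leading slash, which is why the script can concatenate it directly as `"$buildtree/$subdir"`.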

18

config/programs.m4

 @ -284,20 +284,26 @@ AC_DEFUN([PGAC_CHECK_STRIP],
 AC_DEFUN([PGAC_CHECK_LIBCURL],
 [
   # libcurl compiler/linker flags are kept separate from the global flags, so
   # they have to be added back temporarily for the following tests.
   pgac_save_CPPFLAGS=$CPPFLAGS
   pgac_save_LDFLAGS=$LDFLAGS
   pgac_save_LIBS=$LIBS
   CPPFLAGS="$CPPFLAGS $LIBCURL_CPPFLAGS"
   LDFLAGS="$LDFLAGS $LIBCURL_LDFLAGS"
   AC_CHECK_HEADER(curl/curl.h, [],
 				  [AC_MSG_ERROR([header file <curl/curl.h> is required for --with-libcurl])])
   # LIBCURL_LDLIBS is determined here. Like the compiler flags, it should not
   # pollute the global LIBS setting.
   AC_CHECK_LIB(curl, curl_multi_init, [
 				 AC_DEFINE([HAVE_LIBCURL], [1], [Define to 1 if you have the `curl' library (-lcurl).])
 				 AC_SUBST(LIBCURL_LDLIBS, -lcurl)
 			   ],
 			   [AC_MSG_ERROR([library 'curl' does not provide curl_multi_init])])
   pgac_save_CPPFLAGS=$CPPFLAGS
   pgac_save_LDFLAGS=$LDFLAGS
   pgac_save_LIBS=$LIBS
   CPPFLAGS="$LIBCURL_CPPFLAGS $CPPFLAGS"
   LDFLAGS="$LIBCURL_LDFLAGS $LDFLAGS"
   LIBS="$LIBCURL_LDLIBS $LIBS"
   # Check to see whether the current platform supports threadsafe Curl

5

config/python.m4

 @ -97,6 +97,11 @@ python_ldlibrary=`${PYTHON} -c "import sysconfig; print(' '.join(filter(None,sys
 # If LDLIBRARY exists and has a shlib extension, use it verbatim.
 ldlibrary=`echo "${python_ldlibrary}" | sed -e 's/\.so$//' -e 's/\.dll$//' -e 's/\.dylib$//' -e 's/\.sl$//'`
 if test "$PORTNAME" = "aix"; then
   # On AIX, '.a' should also be believed to be a shlib.
   ldlibrary=`echo "${ldlibrary}" | sed -e 's/\.a$//'`
 fi
 if test -e "${python_libdir}/${python_ldlibrary}" -a x"${python_ldlibrary}" != x"${ldlibrary}"
 then
 	ldlibrary=`echo "${ldlibrary}" | sed "s/^lib//"`

1572

configure vendored

File diff suppressed because it is too large

315

configure.ac

 @ -17,13 +17,13 @@ dnl Read the Autoconf manual for details.
 dnl
 m4_pattern_forbid(^PGAC_)dnl to catch undefined macros
 AC_INIT([PostgreSQL], [18beta1], [pgsql-bugs@lists.postgresql.org], [], [https://www.postgresql.org/])
 AC_INIT([PostgreSQL], [19devel], [pgsql-bugs@lists.postgresql.org], [], [https://www.postgresql.org/])
 m4_if(m4_defn([m4_PACKAGE_VERSION]), [2.69], [], [m4_fatal([Autoconf version 2.69 is required.
 Untested combinations of 'autoconf' and PostgreSQL versions are not
 recommended.  You can remove the check from 'configure.ac' but it is then
 your responsibility whether the result works or not.])])
 AC_COPYRIGHT([Copyright (c) 1996-2025, PostgreSQL Global Development Group])
 AC_COPYRIGHT([Copyright (c) 1996-2026, PostgreSQL Global Development Group])
 AC_CONFIG_SRCDIR([src/backend/access/common/heaptuple.c])
 AC_CONFIG_AUX_DIR(config)
 AC_PREFIX_DEFAULT(/usr/local/pgsql)
 @ -62,6 +62,7 @@ PGAC_ARG_REQ(with, template, [NAME], [override operating system template],
 # --with-template not given
 case $host_os in
      aix*) template=aix ;;
   cygwin*|msys*) template=cygwin ;;
   darwin*) template=darwin ;;
 dragonfly*) template=netbsd ;;
 @ -95,12 +96,6 @@ AC_MSG_RESULT([$template])
 PORTNAME=$template
 AC_SUBST(PORTNAME)
 # Initialize default assumption that we do not need separate assembly code
 # for TAS (test-and-set).  This can be overridden by the template file
 # when it's executed.
 need_tas=no
 tas_file=dummy.s
 # Default, works for most platforms, override in template file if needed
 DLSUFFIX=".so"
 @ -364,16 +359,75 @@ pgac_cc_list="gcc cc"
 pgac_cxx_list="g++ c++"
 AC_PROG_CC([$pgac_cc_list])
 AC_PROG_CC_C99()
 # Error out if the compiler does not support C99, as the codebase
 # relies on that.
 if test "$ac_cv_prog_cc_c99" = no; then
     AC_MSG_ERROR([C compiler "$CC" does not support C99])
 # Detect option needed for C11
 # loosely modeled after code in later Autoconf versions
 AC_MSG_CHECKING([for $CC option to accept ISO C11])
 AC_CACHE_VAL([pgac_cv_prog_cc_c11],
 [pgac_cv_prog_cc_c11=no
 pgac_save_CC=$CC
 for pgac_arg in '' '-std=gnu11' '-std=c11'; do
   CC="$pgac_save_CC $pgac_arg"
   AC_COMPILE_IFELSE([AC_LANG_SOURCE([[#if !defined __STDC_VERSION__ || __STDC_VERSION__ < 201112L
 # error "Compiler does not advertise C11 conformance"
 #endif]])], [[pgac_cv_prog_cc_c11=$pgac_arg]])
   test x"$pgac_cv_prog_cc_c11" != x"no" && break
 done
 CC=$pgac_save_CC])
 if test x"$pgac_cv_prog_cc_c11" = x"no"; then
   AC_MSG_RESULT([unsupported])
   AC_MSG_ERROR([C compiler "$CC" does not support C11])
 elif test x"$pgac_cv_prog_cc_c11" = x""; then
   AC_MSG_RESULT([none needed])
 else
   AC_MSG_RESULT([$pgac_cv_prog_cc_c11])
   CC="$CC $pgac_cv_prog_cc_c11"
 fi
 AC_PROG_CXX([$pgac_cxx_list])
 # Check if it actually found a C++ compiler.
 AC_LANG_PUSH([C++])
 AC_COMPILE_IFELSE([AC_LANG_PROGRAM([], [])],
   [have_cxx=yes],
   [have_cxx=no])
 AC_LANG_POP([C++])
 AC_SUBST(have_cxx)
 if test "$have_cxx" = yes; then
 # Detect option needed for C++11
 AC_MSG_CHECKING([for $CXX option to accept ISO C++11])
 AC_CACHE_VAL([pgac_cv_prog_cxx_cxx11],
 [pgac_cv_prog_cxx_cxx11=no
 pgac_save_CXX=$CXX
 AC_LANG_PUSH([C++])
 for pgac_arg in '' '-std=gnu++11' '-std=c++11'; do
   CXX="$pgac_save_CXX $pgac_arg"
   AC_COMPILE_IFELSE([AC_LANG_SOURCE([[#if !defined __cplusplus || __cplusplus < 201103L
 # error "Compiler does not advertise C++11 conformance"
 #endif]])], [[pgac_cv_prog_cxx_cxx11=$pgac_arg]])
   test x"$pgac_cv_prog_cxx_cxx11" != x"no" && break
 done
 AC_LANG_POP([C++])
 CXX=$pgac_save_CXX])
 if test x"$pgac_cv_prog_cxx_cxx11" = x"no"; then
   AC_MSG_RESULT([unsupported])
   AC_MSG_WARN([C++ compiler "$CXX" does not support C++11])
   have_cxx=no
 elif test x"$pgac_cv_prog_cxx_cxx11" = x""; then
   AC_MSG_RESULT([none needed])
 else
   AC_MSG_RESULT([$pgac_cv_prog_cxx_cxx11])
   CXX="$CXX $pgac_cv_prog_cxx_cxx11"
 fi
 fi  # have_cxx
 # Check if it's Intel's compiler, which (usually) pretends to be gcc,
 # but has idiosyncrasies of its own.  We assume icc will define
 # __INTEL_COMPILER regardless of CFLAGS.
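
Note on the C11 and C++11 probes in the hunk above: both loops try an empty option first, then the GNU dialect, then the strict ISO dialect, and stop at the first one the compiler accepts. The sketch below reproduces only that control flow. `find_c11_option` and `fake_probe` are hypothetical, and the probe simulates a compiler instead of invoking one.

```shell
#!/bin/sh
# fake_probe: hypothetical stand-in for the AC_COMPILE_IFELSE test.
# It simulates a compiler that defaults to an older standard and
# needs an explicit -std option to reach C11.
fake_probe() {
    case $1 in
        -std=gnu11|-std=c11) return 0;;
        *) return 1;;
    esac
}

# find_c11_option: hypothetical; $1 is the name of a probe function.
find_c11_option() {
    probe=$1
    result=no
    for arg in '' '-std=gnu11' '-std=c11'; do
        if $probe "$arg"; then
            result=$arg
            break
        fi
    done
    # Report the outcome the way the configure messages do.
    case $result in
        no) echo "unsupported";;
        '') echo "none needed";;
        *)  echo "$result";;
    esac
}

find_c11_option fake_probe
```

Trying the empty option first means a compiler that already defaults to C11 or later gets no extra flag, and preferring `-std=gnu11` over `-std=c11` keeps GNU extensions available where both are accepted.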
 @ -381,14 +435,6 @@ AC_COMPILE_IFELSE([AC_LANG_PROGRAM([], [@%:@ifndef __INTEL_COMPILER
 choke me
 @%:@endif])], [ICC=yes], [ICC=no])
 # Check if it's Sun Studio compiler. We assume that
 # __SUNPRO_C will be defined for Sun Studio compilers
 AC_COMPILE_IFELSE([AC_LANG_PROGRAM([], [@%:@ifndef __SUNPRO_C
 choke me
 @%:@endif])], [SUN_STUDIO_CC=yes], [SUN_STUDIO_CC=no])
 AC_SUBST(SUN_STUDIO_CC)
 #
 # LLVM
 @ -501,18 +547,32 @@ if test "$GCC" = yes -a "$ICC" = no; then
     PERMIT_DECLARATION_AFTER_STATEMENT=-Wno-declaration-after-statement
   fi
   AC_SUBST(PERMIT_DECLARATION_AFTER_STATEMENT)
   # Really don't want VLAs to be used in our dialect of C
   # Really don't want VLAs to be used
   PGAC_PROG_CC_CFLAGS_OPT([-Werror=vla])
   PGAC_PROG_CXX_CFLAGS_OPT([-Werror=vla])
   # On macOS, complain about usage of symbols newer than the deployment target
   PGAC_PROG_CC_CFLAGS_OPT([-Werror=unguarded-availability-new])
   PGAC_PROG_CXX_CFLAGS_OPT([-Werror=unguarded-availability-new])
   # -Wvla is not applicable for C++
   PGAC_PROG_CC_CFLAGS_OPT([-Wendif-labels])
   PGAC_PROG_CXX_CFLAGS_OPT([-Wendif-labels])
   PGAC_PROG_CC_CFLAGS_OPT([-Wmissing-format-attribute])
   PGAC_PROG_CXX_CFLAGS_OPT([-Wmissing-format-attribute])
   PGAC_PROG_CC_CFLAGS_OPT([-Wimplicit-fallthrough=3])
   PGAC_PROG_CXX_CFLAGS_OPT([-Wimplicit-fallthrough=3])
   PGAC_PROG_CC_CFLAGS_OPT([-Wold-style-declaration])
   # -Wold-style-declaration is not applicable for C++
   # To require fallthrough attribute annotations, use
   # -Wimplicit-fallthrough=5 with gcc and -Wimplicit-fallthrough with
   # clang.  The latter is also accepted on gcc but does not enforce
   # attribute annotations, so test the former first.
   save_CFLAGS=$CFLAGS
   PGAC_PROG_CC_CFLAGS_OPT([-Wimplicit-fallthrough=5])
   if test x"$save_CFLAGS" = x"$CFLAGS"; then
     PGAC_PROG_CC_CFLAGS_OPT([-Wimplicit-fallthrough])
   fi
   save_CXXFLAGS=$CXXFLAGS
   PGAC_PROG_CXX_CFLAGS_OPT([-Wimplicit-fallthrough=5])
   if test x"$save_CXXFLAGS" = x"$CXXFLAGS"; then
     PGAC_PROG_CXX_CFLAGS_OPT([-Wimplicit-fallthrough])
   fi
   PGAC_PROG_CC_CFLAGS_OPT([-Wcast-function-type])
   PGAC_PROG_CXX_CFLAGS_OPT([-Wcast-function-type])
   PGAC_PROG_CC_CFLAGS_OPT([-Wshadow=compatible-local])
 @ -599,7 +659,7 @@ fi
 # __attribute__((visibility("hidden"))) is supported, if we encounter a
 # compiler that supports one of the supported variants of -fvisibility=hidden
 # but uses a different syntax to mark a symbol as exported.
 if test "$GCC" = yes -o "$SUN_STUDIO_CC" = yes ; then
 if test "$GCC" = yes; then
   PGAC_PROG_CC_VAR_OPT(CFLAGS_SL_MODULE, [-fvisibility=hidden])
   # For C++ we additionally want -fvisibility-inlines-hidden
   PGAC_PROG_VARCXX_VARFLAGS_OPT(CXX, CXXFLAGS_SL_MODULE, [-fvisibility=hidden])
 @ -726,13 +786,6 @@ AC_LINK_IFELSE([AC_LANG_PROGRAM([], [return 0;])],
   [AC_MSG_RESULT(no)
    AC_MSG_ERROR([cannot proceed])])
 # Defend against gcc -ffast-math
 if test "$GCC" = yes; then
 AC_COMPILE_IFELSE([AC_LANG_PROGRAM([], [@%:@ifdef __FAST_MATH__
 choke me
 @%:@endif])], [], [AC_MSG_ERROR([do not put -ffast-math in CFLAGS])])
 fi
 # Defend against clang being used on x86-32 without SSE2 enabled.  As current
 # versions of clang do not understand -fexcess-precision=standard, the use of
 # x87 floating point operations leads to problems like isinf possibly returning
 @ -755,19 +808,6 @@ AC_PROG_CPP
 AC_SUBST(GCC)
 #
 # Set up TAS assembly code if needed; the template file has now had its
 # chance to request this.
 #
 AC_CONFIG_LINKS([src/backend/port/tas.s:src/backend/port/tas/${tas_file}])
 if test "$need_tas" = yes ; then
   TAS=tas.o
 else
   TAS=""
 fi
 AC_SUBST(TAS)
 AC_SUBST(DLSUFFIX)dnl
 AC_DEFINE_UNQUOTED([DLSUFFIX], ["$DLSUFFIX"],
                    [Define to the file name extension of dynamically-loadable modules.])
 @ -1103,12 +1143,12 @@ if test "$with_libxml" = yes ; then
   # Note the user could also set XML2_CFLAGS/XML2_LIBS directly
   for pgac_option in $XML2_CFLAGS; do
     case $pgac_option in
       -I*|-D*) CPPFLAGS="$CPPFLAGS $pgac_option";;
       -I*|-D*) INCLUDES="$INCLUDES $pgac_option";;
     esac
   done
   for pgac_option in $XML2_LIBS; do
     case $pgac_option in
       -L*) LDFLAGS="$LDFLAGS $pgac_option";;
       -L*) LIBDIRS="$LIBDIRS $pgac_option";;
     esac
   done
 fi
 @ -1152,12 +1192,12 @@ if test "$with_lz4" = yes; then
   # note that -llz4 will be added by AC_CHECK_LIB below.
   for pgac_option in $LZ4_CFLAGS; do
     case $pgac_option in
       -I*|-D*) CPPFLAGS="$CPPFLAGS $pgac_option";;
       -I*|-D*) INCLUDES="$INCLUDES $pgac_option";;
     esac
   done
   for pgac_option in $LZ4_LIBS; do
     case $pgac_option in
       -L*) LDFLAGS="$LDFLAGS $pgac_option";;
       -L*) LIBDIRS="$LIBDIRS $pgac_option";;
     esac
   done
 fi
 @ -1177,12 +1217,12 @@ if test "$with_zstd" = yes; then
   # note that -lzstd will be added by AC_CHECK_LIB below.
   for pgac_option in $ZSTD_CFLAGS; do
     case $pgac_option in
       -I*|-D*) CPPFLAGS="$CPPFLAGS $pgac_option";;
       -I*|-D*) INCLUDES="$INCLUDES $pgac_option";;
     esac
   done
   for pgac_option in $ZSTD_LIBS; do
     case $pgac_option in
       -L*) LDFLAGS="$LDFLAGS $pgac_option";;
       -L*) LIBDIRS="$LIBDIRS $pgac_option";;
     esac
   done
 fi
 @ -1222,6 +1262,8 @@ case $MKDIR_P in
   *install-sh*) MKDIR_P='\${SHELL} \${top_srcdir}/config/install-sh -c -d';;
 esac
 AC_PATH_PROG(NM, nm)
 AC_SUBST(NM)
 PGAC_PATH_BISON
 PGAC_PATH_FLEX
 @ -1401,7 +1443,7 @@ if test "$with_ssl" = openssl ; then
   # Function introduced in OpenSSL 1.0.2, not in LibreSSL.
   AC_CHECK_FUNCS([SSL_CTX_set_cert_cb])
   # Function introduced in OpenSSL 1.1.1, not in LibreSSL.
   AC_CHECK_FUNCS([X509_get_signature_info SSL_CTX_set_num_tickets SSL_CTX_set_keylog_callback])
   AC_CHECK_FUNCS([X509_get_signature_info SSL_CTX_set_num_tickets SSL_CTX_set_keylog_callback SSL_CTX_set_client_hello_cb])
   AC_DEFINE([USE_OPENSSL], 1, [Define to 1 to build with OpenSSL support. (--with-ssl=openssl)])
 elif test "$with_ssl" != no ; then
   AC_MSG_ERROR([--with-ssl must specify openssl])
 @ -1420,6 +1462,13 @@ if test "$with_libxslt" = yes ; then
   AC_CHECK_LIB(xslt, xsltCleanupGlobals, [], [AC_MSG_ERROR([library 'xslt' is required for XSLT support])])
 fi
 if test "$with_liburing" = yes; then
   _LIBS="$LIBS"
   LIBS="$LIBURING_LIBS $LIBS"
   AC_CHECK_FUNCS([io_uring_queue_init_mem])
   LIBS="$_LIBS"
 fi
 if test "$with_lz4" = yes ; then
   AC_CHECK_LIB(lz4, LZ4_compress_default, [], [AC_MSG_ERROR([library 'lz4' is required for LZ4 support])])
 fi
 @ -1428,7 +1477,8 @@ if test "$with_zstd" = yes ; then
   AC_CHECK_LIB(zstd, ZSTD_compress, [], [AC_MSG_ERROR([library 'zstd' is required for ZSTD support])])
 fi
 # Note: We can test for libldap_r only after we know PTHREAD_LIBS
 # Note: We can test for libldap_r only after we know PTHREAD_LIBS;
 # also, on AIX, we may need to have openssl in LIBS for this step.
 if test "$with_ldap" = yes ; then
   _LIBS="$LIBS"
   if test "$PORTNAME" != "win32"; then
 @ -1500,12 +1550,10 @@ AC_SUBST(UUID_LIBS)
 ##
 AC_CHECK_HEADERS(m4_normalize([
 	atomic.h
 	copyfile.h
 	execinfo.h
 	getopt.h
 	ifaddrs.h
 	mbarrier.h
 	sys/epoll.h
 	sys/event.h
 	sys/personality.h
 @ -1514,6 +1562,7 @@ AC_CHECK_HEADERS(m4_normalize([
 	sys/signalfd.h
 	sys/ucred.h
 	termios.h
 	uchar.h
 	ucred.h
 	xlocale.h
 ]))
 @ -1672,10 +1721,12 @@ fi
 m4_defun([AC_PROG_CC_STDC], []) dnl We don't want that.
 AC_C_BIGENDIAN
 AC_C_INLINE
 PGAC_PRINTF_ARCHETYPE
 PGAC_C_STATIC_ASSERT
 PGAC_CXX_PRINTF_ARCHETYPE
 PGAC_C_TYPEOF
 PGAC_CXX_TYPEOF
 PGAC_C_TYPEOF_UNQUAL
 PGAC_CXX_TYPEOF_UNQUAL
 PGAC_C_TYPES_COMPATIBLE
 PGAC_C_BUILTIN_CONSTANT_P
 PGAC_C_BUILTIN_OP_OVERFLOW
 @ -1688,8 +1739,8 @@ PGAC_STRUCT_SOCKADDR_SA_LEN
 # MSVC doesn't cope well with defining restrict to __restrict, the
 # spelling it understands, because it conflicts with
 # __declspec(restrict). Therefore we define pg_restrict to the
 # appropriate definition, which presumably won't conflict.
 # __declspec(restrict) in C++ mode. Therefore we define pg_restrict to
 # the appropriate definition, which presumably won't conflict.
 AC_C_RESTRICT
 if test "$ac_cv_c_restrict" = "no"; then
   pg_restrict=""
 @ -1761,11 +1812,29 @@ AC_CHECK_SIZEOF([off_t])
 # If we don't have largefile support, can't handle segment size >= 2GB.
 if test "$ac_cv_sizeof_off_t" -lt 8; then
   if expr $RELSEG_SIZE '*' $blocksize '>=' 2 '*' 1024 '*' 1024; then
   if expr $RELSEG_SIZE '*' $blocksize '>=' 2 '*' 1024 '*' 1024 >/dev/null; then
     AC_MSG_ERROR([Large file support is not enabled. Segment size cannot be larger than 1GB.])
   fi
 fi
 # Check for SA_SIGINFO extended signal handler availability
 AC_CACHE_CHECK([for SA_SIGINFO], [ac_cv_have_sa_siginfo], [
     AC_COMPILE_IFELSE([
         AC_LANG_PROGRAM([[
             #include <signal.h>
             #include <stddef.h>
         ]], [[
             struct sigaction sa;
             sa.sa_flags = SA_SIGINFO;
         ]])
     ],
     [ac_cv_have_sa_siginfo=yes],
     [ac_cv_have_sa_siginfo=no])
 ])
 if test "x$ac_cv_have_sa_siginfo" = "xyes"; then
     AC_DEFINE([HAVE_SA_SIGINFO], 1, [Define to 1 if you have SA_SIGINFO available.])
 fi
 ##
 ## Functions, global variables
 @ -1785,6 +1854,7 @@ AC_CHECK_FUNCS(m4_normalize([
 	copyfile
 	copy_file_range
 	elf_aux_info
 	explicit_memset
 	getauxval
 	getifaddrs
 	getpeerucred
 @ -1792,6 +1862,7 @@ AC_CHECK_FUNCS(m4_normalize([
 	kqueue
 	localeconv_l
 	mbstowcs_l
 	memset_explicit
 	posix_fallocate
 	ppoll
 	pthread_is_threaded_np
 @ -1811,7 +1882,6 @@ PGAC_CHECK_BUILTIN_FUNC([__builtin_bswap64], [long int x])
 # We assume that we needn't test all widths of these explicitly:
 PGAC_CHECK_BUILTIN_FUNC([__builtin_clz], [unsigned int x])
 PGAC_CHECK_BUILTIN_FUNC([__builtin_ctz], [unsigned int x])
 PGAC_CHECK_BUILTIN_FUNC([__builtin_popcount], [unsigned int x])
 # __builtin_frame_address may draw a diagnostic for non-constant argument,
 # so it needs a different test function.
 PGAC_CHECK_BUILTIN_FUNC_PTR([__builtin_frame_address], [0])
 @ -1830,7 +1900,7 @@ AC_CHECK_DECLS(posix_fadvise, [], [], [#include <fcntl.h>])
 ]) # fi
 AC_CHECK_DECLS(fdatasync, [], [], [#include <unistd.h>])
 AC_CHECK_DECLS([strlcat, strlcpy, strnlen, strsep, timingsafe_bcmp])
 AC_CHECK_DECLS([strlcat, strlcpy, strsep, timingsafe_bcmp])
 # We can't use AC_CHECK_FUNCS to detect these functions, because it
 # won't handle deployment target restrictions on macOS
 @ -1851,7 +1921,6 @@ AC_REPLACE_FUNCS(m4_normalize([
 	mkdtemp
 	strlcat
 	strlcpy
 	strnlen
 	strsep
 	timingsafe_bcmp
 ]))
 @ -1937,7 +2006,7 @@ fi
 if test "$with_icu" = yes; then
   ac_save_CPPFLAGS=$CPPFLAGS
   CPPFLAGS="$ICU_CFLAGS $CPPFLAGS"
   CPPFLAGS="$CPPFLAGS $ICU_CFLAGS"
   # Verify we have ICU's header files
   AC_CHECK_HEADER(unicode/ucol.h, [],
 @ -1946,10 +2015,6 @@ if test "$with_icu" = yes; then
   CPPFLAGS=$ac_save_CPPFLAGS
 fi
 if test "$with_llvm" = yes; then
   PGAC_CHECK_LLVM_FUNCTIONS()
 fi
 # Lastly, restore full LIBS list and check for readline/libedit symbols
 LIBS="$LIBS_including_readline"
 @ -1980,6 +2045,14 @@ related to locating shared libraries.  Check the file 'config.log'
 for the exact reason.]])],
 [AC_MSG_RESULT([cross-compiling])])
 # These flags are supported in all C11-capable GCC/Clang versions,
 # so no capability test is needed.  Added here to avoid affecting configure probes,
 # particularly PGAC_PRINTF_ARCHETYPE which uses -Werror and would fail to detect
 # gnu_printf if -Wstrict-prototypes is active.
 if test "$GCC" = yes -a "$ICC" = no; then
   CFLAGS="$CFLAGS -Wstrict-prototypes -Wold-style-definition"
 fi
 # --------------------
 # Run tests below here
 # --------------------
 @ -1989,38 +2062,29 @@ AC_CHECK_SIZEOF([void *])
 AC_CHECK_SIZEOF([size_t])
 AC_CHECK_SIZEOF([long])
 AC_CHECK_SIZEOF([long long])
 AC_CHECK_SIZEOF([intmax_t])
 # Determine memory alignment requirements for the basic C data types.
 AC_CHECK_ALIGNOF(short)
 AC_CHECK_ALIGNOF(int)
 AC_CHECK_ALIGNOF(long)
 AC_CHECK_ALIGNOF(int64_t)
 AC_CHECK_ALIGNOF(double)
 # Compute maximum alignment of any basic type.
 #
 # We require 'double' to have the strictest alignment among the basic types,
 # because otherwise the C ABI might impose 8-byte alignment on some of the
 # other C types that correspond to TYPALIGN_DOUBLE SQL types.  That could
 # cause a mismatch between the tuple layout and the C struct layout of a
 # catalog tuple.  We used to carefully order catalog columns such that any
 # fixed-width, attalign=4 columns were at offsets divisible by 8 regardless
 # of MAXIMUM_ALIGNOF to avoid that, but we no longer support any platforms
 # where TYPALIGN_DOUBLE != MAXIMUM_ALIGNOF.
 #
 # We assume without checking that long's alignment is at least as strong as
 # char, short, or int.  Note that we intentionally do not consider any types
 # wider than 64 bits, as allowing MAXIMUM_ALIGNOF to exceed 8 would be too
 # much of a penalty for disk and memory space.
 # We assume without checking that the maximum alignment requirement is that
 # of int64_t and/or double.  (On most platforms those are the same, but not
 # everywhere.)  For historical reasons, both int8 and float8 datatypes have
 # typalign 'd', and therefore will be aligned per ALIGNOF_DOUBLE in database
 # tuples even if ALIGNOF_INT64_T is more.  Note that we intentionally do not
 # consider any types wider than 64 bits, as allowing MAXIMUM_ALIGNOF to exceed
 # 8 would be too much of a penalty for disk and memory space.
 MAX_ALIGNOF=$ac_cv_alignof_double
 if test $ac_cv_alignof_long -gt $MAX_ALIGNOF ; then
   AC_MSG_ERROR([alignment of 'long' is greater than the alignment of 'double'])
 fi
 if test $ac_cv_alignof_int64_t -gt $MAX_ALIGNOF ; then
   AC_MSG_ERROR([alignment of 'int64_t' is greater than the alignment of 'double'])
 if test $ac_cv_alignof_int64_t -gt $ac_cv_alignof_double ; then
   MAX_ALIGNOF=$ac_cv_alignof_int64_t
 else
   MAX_ALIGNOF=$ac_cv_alignof_double
 fi
 AC_DEFINE_UNQUOTED(MAXIMUM_ALIGNOF, $MAX_ALIGNOF, [Define as the maximum alignment requirement of any C data type.])
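
Note on the MAX_ALIGNOF hunk above: reading the if/else form as the post-change side, the script no longer errors out when `int64_t` is more strictly aligned than `double` and instead takes the larger of the two. A sketch of that selection. `max_alignof` is a hypothetical name and the alignment values are illustrative, not measured.

```shell
#!/bin/sh
# max_alignof: hypothetical helper.
# $1 = alignment of int64_t, $2 = alignment of double.
max_alignof() {
    if test "$1" -gt "$2"; then
        echo "$1"
    else
        echo "$2"
    fi
}

# Illustrative case where int64_t is stricter than double.
MAX_ALIGNOF=`max_alignof 8 4`
echo "MAXIMUM_ALIGNOF=$MAX_ALIGNOF"
```

As the new comment in the hunk notes, database tuples still align int8 and float8 per ALIGNOF_DOUBLE, so a larger `MAXIMUM_ALIGNOF` does not by itself change that.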
 @ -2037,7 +2101,7 @@ PGAC_HAVE_GCC__ATOMIC_INT32_CAS
 PGAC_HAVE_GCC__ATOMIC_INT64_CAS
 # Check for x86 cpuid instruction
 # Check for __get_cpuid() and __cpuid()
 AC_CACHE_CHECK([for __get_cpuid], [pgac_cv__get_cpuid],
 [AC_LINK_IFELSE([AC_LANG_PROGRAM([#include <cpuid.h>],
   [[unsigned int exx[4] = {0, 0, 0, 0};
 @ -2047,8 +2111,21 @@ AC_CACHE_CHECK([for __get_cpuid], [pgac_cv__get_cpuid],
   [pgac_cv__get_cpuid="no"])])
 if test x"$pgac_cv__get_cpuid" = x"yes"; then
   AC_DEFINE(HAVE__GET_CPUID, 1, [Define to 1 if you have __get_cpuid.])
 else
   # __cpuid()
   AC_CACHE_CHECK([for __cpuid], [pgac_cv__cpuid],
   [AC_LINK_IFELSE([AC_LANG_PROGRAM([#include <intrin.h>],
     [[unsigned int exx[4] = {0, 0, 0, 0};
     __cpuid(exx, 1);
     ]])],
     [pgac_cv__cpuid="yes"],
     [pgac_cv__cpuid="no"])])
   if test x"$pgac_cv__cpuid" = x"yes"; then
     AC_DEFINE(HAVE__CPUID, 1, [Define to 1 if you have __cpuid.])
   fi
 fi
 # Check for __get_cpuid_count()
 AC_CACHE_CHECK([for __get_cpuid_count], [pgac_cv__get_cpuid_count],
 [AC_LINK_IFELSE([AC_LANG_PROGRAM([#include <cpuid.h>],
   [[unsigned int exx[4] = {0, 0, 0, 0};
 @ -2060,21 +2137,15 @@ if test x"$pgac_cv__get_cpuid_count" = x"yes"; then
   AC_DEFINE(HAVE__GET_CPUID_COUNT, 1, [Define to 1 if you have __get_cpuid_count.])
 fi
 AC_CACHE_CHECK([for __cpuid], [pgac_cv__cpuid],
 [AC_LINK_IFELSE([AC_LANG_PROGRAM([#include <intrin.h>],
   [[unsigned int exx[4] = {0, 0, 0, 0};
   __get_cpuid(exx[0], 1);
   ]])],
   [pgac_cv__cpuid="yes"],
   [pgac_cv__cpuid="no"])])
 if test x"$pgac_cv__cpuid" = x"yes"; then
   AC_DEFINE(HAVE__CPUID, 1, [Define to 1 if you have __cpuid.])
 fi
 # Check for __cpuidex()
 AC_CACHE_CHECK([for __cpuidex], [pgac_cv__cpuidex],
 [AC_LINK_IFELSE([AC_LANG_PROGRAM([#include <intrin.h>],
 [AC_LINK_IFELSE([AC_LANG_PROGRAM([#ifdef _MSC_VER
     #include <intrin.h>
     #else
     #include <cpuid.h>
     #endif],
   [[unsigned int exx[4] = {0, 0, 0, 0};
   __get_cpuidex(exx[0], 7, 0);
   __cpuidex((int *) exx, 7, 0);
   ]])],
   [pgac_cv__cpuidex="yes"],
   [pgac_cv__cpuidex="no"])])
 @ -2082,6 +2153,15 @@ if test x"$pgac_cv__cpuidex" = x"yes"; then
   AC_DEFINE(HAVE__CPUIDEX, 1, [Define to 1 if you have __cpuidex.])
 fi
 # Check for AVX2 target support
 #
 if test x"$host_cpu" = x"x86_64"; then
   PGAC_AVX2_SUPPORT()
   if test x"$pgac_avx2_support" = x"yes"; then
     AC_DEFINE(USE_AVX2_WITH_RUNTIME_CHECK, 1, [Define to 1 to use AVX2 instructions with a runtime check.])
   fi
 fi
 # Check for XSAVE intrinsics
 #
 PGAC_XSAVE_INTRINSICS()
 @ -2205,17 +2285,17 @@ fi
 AC_MSG_CHECKING([which CRC-32C implementation to use])
 if test x"$USE_SSE42_CRC32C" = x"1"; then
   AC_DEFINE(USE_SSE42_CRC32C, 1, [Define to 1 use Intel SSE 4.2 CRC instructions.])
   PG_CRC32C_OBJS="pg_crc32c_sse42.o pg_crc32c_sse42_choose.o"
   PG_CRC32C_OBJS="pg_crc32c_sse42.o"
   AC_MSG_RESULT(SSE 4.2)
 else
   if test x"$USE_SSE42_CRC32C_WITH_RUNTIME_CHECK" = x"1"; then
     AC_DEFINE(USE_SSE42_CRC32C_WITH_RUNTIME_CHECK, 1, [Define to 1 to use Intel SSE 4.2 CRC instructions with a runtime check.])
     PG_CRC32C_OBJS="pg_crc32c_sse42.o pg_crc32c_sb8.o pg_crc32c_sse42_choose.o"
     PG_CRC32C_OBJS="pg_crc32c_sse42.o pg_crc32c_sb8.o"
     AC_MSG_RESULT(SSE 4.2 with runtime check)
   else
     if test x"$USE_ARMV8_CRC32C" = x"1"; then
       AC_DEFINE(USE_ARMV8_CRC32C, 1, [Define to 1 to use ARMv8 CRC Extension.])
       PG_CRC32C_OBJS="pg_crc32c_armv8.o"
       PG_CRC32C_OBJS="pg_crc32c_armv8.o pg_crc32c_armv8_choose.o"
       AC_MSG_RESULT(ARMv8 CRC instructions)
     else
       if test x"$USE_ARMV8_CRC32C_WITH_RUNTIME_CHECK" = x"1"; then
 @ -2242,6 +2322,10 @@ AC_SUBST(PG_CRC32C_OBJS)
 #
 if test x"$host_cpu" = x"x86_64"; then
   PGAC_AVX512_PCLMUL_INTRINSICS()
 else
   if test x"$host_cpu" = x"aarch64"; then
     PGAC_ARM_PLMULL()
   fi
 fi
 AC_MSG_CHECKING([for vectorized CRC-32C])
 @ -2249,7 +2333,12 @@ if test x"$pgac_avx512_pclmul_intrinsics" = x"yes"; then
   AC_DEFINE(USE_AVX512_CRC32C_WITH_RUNTIME_CHECK, 1, [Define to 1 to use AVX-512 CRC algorithms with a runtime check.])
   AC_MSG_RESULT(AVX-512 with runtime check)
 else
   AC_MSG_RESULT(none)
   if test x"$pgac_arm_pmull" = x"yes"; then
     AC_DEFINE(USE_PMULL_CRC32C_WITH_RUNTIME_CHECK, 1, [Define to 1 to use Arm PMULL CRC algorithms with a runtime check.])
     AC_MSG_RESULT(CRYPTO PMULL with runtime check)
   else
     AC_MSG_RESULT(none)
   fi
 fi
 # Select semaphore implementation type.
 @ -2337,7 +2426,7 @@ Use --without-tcl to disable building PL/Tcl.])
     fi
     # now that we have TCL_INCLUDE_SPEC, we can check for <tcl.h>
     ac_save_CPPFLAGS=$CPPFLAGS
     CPPFLAGS="$TCL_INCLUDE_SPEC $CPPFLAGS"
     CPPFLAGS="$CPPFLAGS $TCL_INCLUDE_SPEC"
     AC_CHECK_HEADER(tcl.h, [], [AC_MSG_ERROR([header file <tcl.h> is required for Tcl])])
     CPPFLAGS=$ac_save_CPPFLAGS
 fi
 @ -2374,7 +2463,7 @@ fi
 # check for <Python.h>
 if test "$with_python" = yes; then
   ac_save_CPPFLAGS=$CPPFLAGS
   CPPFLAGS="$python_includespec $CPPFLAGS"
   CPPFLAGS="$CPPFLAGS $python_includespec"
   AC_CHECK_HEADER(Python.h, [], [AC_MSG_ERROR([header file <Python.h> is required for Python])])
   CPPFLAGS=$ac_save_CPPFLAGS
 fi
 @ -2449,8 +2538,6 @@ AC_SUBST(LDFLAGS_EX_BE)
 if test x"$GCC" = x"yes" ; then
   cc_string=`${CC} --version | sed q`
   case $cc_string in [[A-Za-z]]*) ;; *) cc_string="GCC $cc_string";; esac
 elif test x"$SUN_STUDIO_CC" = x"yes" ; then
   cc_string=`${CC} -V 2>&1 | sed q`
 else
   cc_string=$CC
 fi

									
										2

contrib/Makefile
									
										
				@ -34,7 +34,9 @@ SUBDIRS = \

						pg_freespacemap \

						pg_logicalinspect \

						pg_overexplain \

						pg_plan_advice \

						pg_prewarm	\

						pg_stash_advice	\

						pg_stat_statements \

						pg_surgery	\

						pg_trgm		\

8

contrib/amcheck/expected/check_btree.out


 @ -60,6 +60,14 @@ SELECT bt_index_parent_check('bttest_a_brin_idx');
 ERROR:  expected "btree" index as targets for verification
 DETAIL:  Relation "bttest_a_brin_idx" is a brin index.
 ROLLBACK;
 -- verify partitioned indexes are rejected (error)
 BEGIN;
 CREATE TABLE bttest_partitioned (a int, b int) PARTITION BY list (a);
 CREATE INDEX bttest_btree_partitioned_idx ON bttest_partitioned USING btree (b);
 SELECT bt_index_parent_check('bttest_btree_partitioned_idx');
 ERROR:  expected index as targets for verification
 DETAIL:  This operation is not supported for partitioned indexes.
 ROLLBACK;
 -- normal check outside of xact
 SELECT bt_index_check('bttest_a_idx');
  bt_index_check

									
										2

contrib/amcheck/meson.build
									
										
				@ -1,4 +1,4 @@

				# Copyright (c) 2022-2025, PostgreSQL Global Development Group

				# Copyright (c) 2022-2026, PostgreSQL Global Development Group

				amcheck_sources = files(

				  'verify_common.c',

									
										7

contrib/amcheck/sql/check_btree.sql
									
										
				@ -52,6 +52,13 @@ CREATE INDEX bttest_a_brin_idx ON bttest_a USING brin(id);

				SELECT bt_index_parent_check('bttest_a_brin_idx');

				ROLLBACK;

				-- verify partitioned indexes are rejected (error)

				BEGIN;

				CREATE TABLE bttest_partitioned (a int, b int) PARTITION BY list (a);

				CREATE INDEX bttest_btree_partitioned_idx ON bttest_partitioned USING btree (b);

				SELECT bt_index_parent_check('bttest_btree_partitioned_idx');

				ROLLBACK;

				-- normal check outside of xact

				SELECT bt_index_check('bttest_a_idx');

				-- more expansive tests

									
										2

contrib/amcheck/t/001_verify_heapam.pl
									
										
				@ -1,5 +1,5 @@

				# Copyright (c) 2021-2025, PostgreSQL Global Development Group

				# Copyright (c) 2021-2026, PostgreSQL Global Development Group

				use strict;

				use warnings FATAL => 'all';

									
										26

contrib/amcheck/t/002_cic.pl
									
										
				@ -1,5 +1,5 @@

				# Copyright (c) 2021-2025, PostgreSQL Global Development Group

				# Copyright (c) 2021-2026, PostgreSQL Global Development Group

				# Test CREATE INDEX CONCURRENTLY with concurrent modifications

				use strict;

				@ -64,5 +64,29 @@ $node->pgbench(

						  )

					});

				# Test bt_index_parent_check() with indexes created with

				# CREATE INDEX CONCURRENTLY.

				$node->safe_psql('postgres', q(CREATE TABLE quebec(i int primary key)));

				# Insert two rows into index

				$node->safe_psql('postgres',

					q(INSERT INTO quebec SELECT i FROM generate_series(1, 2) s(i);));

				# start background transaction

				my $in_progress_h = $node->background_psql('postgres');

				$in_progress_h->query_safe(q(BEGIN; SELECT pg_current_xact_id();));

				# delete one row from table, while background transaction is in progress

				$node->safe_psql('postgres', q(DELETE FROM quebec WHERE i = 1;));

				# create index concurrently, which will skip the deleted row

				$node->safe_psql('postgres',

					q(CREATE INDEX CONCURRENTLY oscar ON quebec(i);));

				# check index using bt_index_parent_check

				my $result = $node->psql('postgres',

					q(SELECT bt_index_parent_check('oscar', heapallindexed => true)));

				is($result, '0', 'bt_index_parent_check for CIC after removed row');

				$in_progress_h->quit;

				$node->stop;

				done_testing();

									
										2

contrib/amcheck/t/003_cic_2pc.pl
									
										
				@ -1,5 +1,5 @@

				# Copyright (c) 2021-2025, PostgreSQL Global Development Group

				# Copyright (c) 2021-2026, PostgreSQL Global Development Group

				# Test CREATE INDEX CONCURRENTLY with concurrent prepared-xact modifications

				use strict;

									
										22

contrib/amcheck/t/004_verify_nbtree_unique.pl
									
										
				@ -1,5 +1,5 @@

				# Copyright (c) 2023-2025, PostgreSQL Global Development Group

				# Copyright (c) 2023-2026, PostgreSQL Global Development Group

				# This regression test checks the behavior of the btree validation in the

				# presence of breaking sort order changes.

				@ -159,7 +159,9 @@ $node->safe_psql(

					'postgres', q(

					SELECT bt_index_check('bttest_unique_idx1', true, true);

				));

				ok( $stderr =~ /index uniqueness is violated for index "bttest_unique_idx1"/,

				like(

					$stderr,

					qr/index uniqueness is violated for index "bttest_unique_idx1"/,

					'detected uniqueness violation for index "bttest_unique_idx1"');

				#

				@ -177,7 +179,9 @@ ok( $stderr =~ /index uniqueness is violated for index "bttest_unique_idx1"/,

					'postgres', q(

					SELECT bt_index_check('bttest_unique_idx2', true, true);

				));

				ok( $stderr =~ /item order invariant violated for index "bttest_unique_idx2"/,

				like(

					$stderr,

					qr/item order invariant violated for index "bttest_unique_idx2"/,

					'detected item order invariant violation for index "bttest_unique_idx2"');

				$node->safe_psql(

				@ -191,7 +195,9 @@ $node->safe_psql(

					'postgres', q(

					SELECT bt_index_check('bttest_unique_idx2', true, true);

				));

				ok( $stderr =~ /index uniqueness is violated for index "bttest_unique_idx2"/,

				like(

					$stderr,

					qr/index uniqueness is violated for index "bttest_unique_idx2"/,

					'detected uniqueness violation for index "bttest_unique_idx2"');

				#

				@ -208,7 +214,9 @@ ok( $stderr =~ /index uniqueness is violated for index "bttest_unique_idx2"/,

					'postgres', q(

					SELECT bt_index_check('bttest_unique_idx3', true, true);

				));

				ok( $stderr =~ /item order invariant violated for index "bttest_unique_idx3"/,

				like(

					$stderr,

					qr/item order invariant violated for index "bttest_unique_idx3"/,

					'detected item order invariant violation for index "bttest_unique_idx3"');

				# For unique index deduplication is possible only for same values, but

				@ -237,7 +245,9 @@ $node->safe_psql(

					'postgres', q(

					SELECT bt_index_check('bttest_unique_idx3', true, true);

				));

				ok( $stderr =~ /index uniqueness is violated for index "bttest_unique_idx3"/,

				like(

					$stderr,

					qr/index uniqueness is violated for index "bttest_unique_idx3"/,

					'detected uniqueness violation for index "bttest_unique_idx3"');

				$node->stop;

									
										2

contrib/amcheck/t/005_pitr.pl
									
										
				@ -1,4 +1,4 @@

				# Copyright (c) 2021-2025, PostgreSQL Global Development Group

				# Copyright (c) 2021-2026, PostgreSQL Global Development Group

				# Test integrity of intermediate states by PITR to those states

				use strict;

									
										2

contrib/amcheck/t/006_verify_gin.pl
									
										
				@ -1,5 +1,5 @@

				# Copyright (c) 2021-2025, PostgreSQL Global Development Group

				# Copyright (c) 2021-2026, PostgreSQL Global Development Group

				use strict;

				use warnings FATAL => 'all';

									
										42

contrib/amcheck/verify_common.c
									
										
				@ -3,7 +3,7 @@

				 * verify_common.c

				 *		Utility functions common to all access methods.

				 *

				 * Copyright (c) 2016-2025, PostgreSQL Global Development Group

				 * Copyright (c) 2016-2026, PostgreSQL Global Development Group

				 *

				 * IDENTIFICATION

				 *	  contrib/amcheck/verify_common.c

				@ -18,11 +18,13 @@

				#include "verify_common.h"

				#include "catalog/index.h"

				#include "catalog/pg_am.h"

				#include "commands/defrem.h"

				#include "commands/tablecmds.h"

				#include "utils/guc.h"

				#include "utils/syscache.h"

				static bool amcheck_index_mainfork_expected(Relation rel);

				static bool index_checkable(Relation rel, Oid am_id);

				/*

				@ -48,14 +50,14 @@ amcheck_index_mainfork_expected(Relation rel)

				}

				/*

				* Amcheck main workhorse.

				* Given index relation OID, lock relation.

				* Next, take a number of standard actions:

				* 1) Make sure the index can be checked

				* 2) change the context of the user,

				* 3) keep track of GUCs modified via index functions

				* 4) execute callback function to verify integrity.

				*/

				 * Amcheck main workhorse.

				 * Given index relation OID, lock relation.

				 * Next, take a number of standard actions:

				 * 1) Make sure the index can be checked

				 * 2) change the context of the user,

				 * 3) keep track of GUCs modified via index functions

				 * 4) execute callback function to verify integrity.

				 */

				void

				amcheck_lock_relation_and_check(Oid indrelid,

												Oid am_id,

				@ -155,23 +157,21 @@ amcheck_lock_relation_and_check(Oid indrelid,

				 * callable by non-superusers. If granted, it's useful to be able to check a

				 * whole cluster.

				 */

				bool

				static bool

				index_checkable(Relation rel, Oid am_id)

				{

					if (rel->rd_rel->relkind != RELKIND_INDEX ||

						rel->rd_rel->relam != am_id)

					{

						HeapTuple	amtup;

						HeapTuple	amtuprel;

					if (rel->rd_rel->relkind != RELKIND_INDEX)

						ereport(ERROR,

								(errcode(ERRCODE_WRONG_OBJECT_TYPE),

								 errmsg("expected index as targets for verification"),

								 errdetail_relkind_not_supported(rel->rd_rel->relkind)));

						amtup = SearchSysCache1(AMOID, ObjectIdGetDatum(am_id));

						amtuprel = SearchSysCache1(AMOID, ObjectIdGetDatum(rel->rd_rel->relam));

					if (rel->rd_rel->relam != am_id)

						ereport(ERROR,

								(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),

								 errmsg("expected \"%s\" index as targets for verification", NameStr(((Form_pg_am) GETSTRUCT(amtup))->amname)),

								 errmsg("expected \"%s\" index as targets for verification", get_am_name(am_id)),

								 errdetail("Relation \"%s\" is a %s index.",

										   RelationGetRelationName(rel), NameStr(((Form_pg_am) GETSTRUCT(amtuprel))->amname))));

					}

										   RelationGetRelationName(rel), get_am_name(rel->rd_rel->relam))));

					if (RELATION_IS_OTHER_TEMP(rel))

						ereport(ERROR,

				@ -182,7 +182,7 @@ index_checkable(Relation rel, Oid am_id)

					if (!rel->rd_index->indisvalid)

						ereport(ERROR,

								(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),

								(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),

								 errmsg("cannot check index \"%s\"",

										RelationGetRelationName(rel)),

								 errdetail("Index is not valid.")));

									
										11

contrib/amcheck/verify_common.h
									
										
				@ -1,12 +1,12 @@

				/*-------------------------------------------------------------------------

				 *

				 * amcheck.h

				 * verify_common.h

				 *		Shared routines for amcheck verifications.

				 *

				 * Copyright (c) 2016-2025, PostgreSQL Global Development Group

				 * Copyright (c) 2016-2026, PostgreSQL Global Development Group

				 *

				 * IDENTIFICATION

				 *	  contrib/amcheck/amcheck.h

				 *	  contrib/amcheck/verify_common.h

				 *

				 *-------------------------------------------------------------------------

				 */

				@ -16,8 +16,7 @@

				#include "utils/relcache.h"

				#include "miscadmin.h"

				/* Typedefs for callback functions for amcheck_lock_relation_and_check */

				typedef void (*IndexCheckableCallback) (Relation index);

				/* Typedef for callback function for amcheck_lock_relation_and_check */

				typedef void (*IndexDoCheckCallback) (Relation rel,

													  Relation heaprel,

													  void *state,

				@ -27,5 +26,3 @@ extern void amcheck_lock_relation_and_check(Oid indrelid,

															Oid am_id,

															IndexDoCheckCallback check,

															LOCKMODE lockmode, void *state);

				extern bool index_checkable(Relation rel, Oid am_id);

									
										26

contrib/amcheck/verify_gin.c
									
										
				@ -13,7 +13,7 @@

				 *   can reference either only leaf pages or only internal pages.

				 *

				 *

				 * Copyright (c) 2016-2025, PostgreSQL Global Development Group

				 * Copyright (c) 2016-2026, PostgreSQL Global Development Group

				 *

				 * IDENTIFICATION

				 *	  contrib/amcheck/verify_gin.c

				@ -107,7 +107,7 @@ ginReadTupleWithoutState(IndexTuple itup, int *nitems)

					{

						if (nipd > 0)

						{

							ipd = ginPostingListDecode((GinPostingList *) ptr, &ndecoded);

							ipd = ginPostingListDecode(ptr, &ndecoded);

							if (nipd != ndecoded)

								elog(ERROR, "number of items mismatch in GIN entry tuple, %d in tuple header, %d decoded",

									 nipd, ndecoded);

				@ -117,7 +117,7 @@ ginReadTupleWithoutState(IndexTuple itup, int *nitems)

					}

					else

					{

						ipd = (ItemPointer) palloc(sizeof(ItemPointerData) * nipd);

						ipd = palloc_array(ItemPointerData, nipd);

						memcpy(ipd, ptr, sizeof(ItemPointerData) * nipd);

					}

					*nitems = nipd;

				@ -152,7 +152,7 @@ gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting

					leafdepth = -1;

					/* Start the scan at the root page */

					stack = (GinPostingTreeScanItem *) palloc0(sizeof(GinPostingTreeScanItem));

					stack = palloc0_object(GinPostingTreeScanItem);

					stack->depth = 0;

					ItemPointerSetInvalid(&stack->parentkey);

					stack->parentblk = InvalidBlockNumber;

				@ -174,7 +174,7 @@ gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting

						buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,

													RBM_NORMAL, strategy);

						LockBuffer(buffer, GIN_SHARE);

						page = (Page) BufferGetPage(buffer);

						page = BufferGetPage(buffer);

						Assert(GinPageIsData(page));

				@ -354,7 +354,7 @@ gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting

													stack->blkno, i)));

								/* This is an internal page, recurse into the child. */

								ptr = (GinPostingTreeScanItem *) palloc(sizeof(GinPostingTreeScanItem));

								ptr = palloc_object(GinPostingTreeScanItem);

								ptr->depth = stack->depth + 1;

								/*

				@ -368,8 +368,7 @@ gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting

								stack->next = ptr;

							}

						}

						LockBuffer(buffer, GIN_UNLOCK);

						ReleaseBuffer(buffer);

						UnlockReleaseBuffer(buffer);

						/* Step to next item in the queue */

						stack_next = stack->next;

				@ -412,7 +411,7 @@ gin_check_parent_keys_consistency(Relation rel,

					leafdepth = -1;

					/* Start the scan at the root page */

					stack = (GinScanItem *) palloc0(sizeof(GinScanItem));

					stack = palloc0_object(GinScanItem);

					stack->depth = 0;

					stack->parenttup = NULL;

					stack->parentblk = InvalidBlockNumber;

				@ -434,7 +433,7 @@ gin_check_parent_keys_consistency(Relation rel,

						buffer = ReadBufferExtended(rel, MAIN_FORKNUM, stack->blkno,

													RBM_NORMAL, strategy);

						LockBuffer(buffer, GIN_SHARE);

						page = (Page) BufferGetPage(buffer);

						page = BufferGetPage(buffer);

						maxoff = PageGetMaxOffsetNumber(page);

						rightlink = GinPageGetOpaque(page)->rightlink;

				@ -473,7 +472,7 @@ gin_check_parent_keys_consistency(Relation rel,

								elog(DEBUG3, "split detected for blk: %u, parent blk: %u", stack->blkno, stack->parentblk);

								ptr = (GinScanItem *) palloc(sizeof(GinScanItem));

								ptr = palloc_object(GinScanItem);

								ptr->depth = stack->depth;

								ptr->parenttup = CopyIndexTuple(stack->parenttup);

								ptr->parentblk = stack->parentblk;

				@ -601,7 +600,7 @@ gin_check_parent_keys_consistency(Relation rel,

							{

								GinScanItem *ptr;

								ptr = (GinScanItem *) palloc(sizeof(GinScanItem));

								ptr = palloc_object(GinScanItem);

								ptr->depth = stack->depth + 1;

								/* last tuple in layer has no high key */

								if (i == maxoff && rightlink == InvalidBlockNumber)

				@ -642,8 +641,7 @@ gin_check_parent_keys_consistency(Relation rel,

							prev_attnum = current_attnum;

						}

						LockBuffer(buffer, GIN_UNLOCK);

						ReleaseBuffer(buffer);

						UnlockReleaseBuffer(buffer);

						/* Step to next item in the queue */

						stack_next = stack->next;
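
Several hunks in this file replace a cast plus `palloc(sizeof(...))` with `palloc_object()`, `palloc0_object()`, or `palloc_array()`. The standalone sketch below illustrates that idiom outside the PostgreSQL tree. Everything prefixed `demo_` or `Demo` is a hypothetical stand-in built on `malloc`/`calloc`, not the real `palloc` family, and the struct is a simplified scan-stack item rather than the actual `GinScanItem`.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical stand-ins for the type-safe allocation macros. Each macro
 * names the type once and yields a correctly typed pointer, so the cast
 * and the sizeof() operand cannot drift apart.
 */
#define demo_alloc_object(type)        ((type *) malloc(sizeof(type)))
#define demo_alloc0_object(type)       ((type *) calloc(1, sizeof(type)))
#define demo_alloc_array(type, count)  ((type *) malloc(sizeof(type) * (count)))

/* Simplified scan-stack item (assumption: not the real GinScanItem). */
typedef struct DemoScanItem
{
	int			depth;
	unsigned int blkno;
	struct DemoScanItem *next;
} DemoScanItem;

/* Allocate a child item one level deeper and link it right after parent. */
static DemoScanItem *
demo_push_child(DemoScanItem *parent, unsigned int blkno)
{
	DemoScanItem *child;

	assert(parent != NULL);

	child = demo_alloc_object(DemoScanItem);
	if (child == NULL)
		return NULL;

	child->depth = parent->depth + 1;
	child->blkno = blkno;
	child->next = parent->next;
	parent->next = child;
	return child;
}
```

A caller would typically create the root with `demo_alloc0_object(DemoScanItem)` so that `depth` and `next` start out zeroed, then call `demo_push_child()` for each downlink it wants to visit later.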

									
										88

contrib/amcheck/verify_heapam.c
									
										
				@ -3,7 +3,7 @@

				 * verify_heapam.c

				 *	  Functions to check postgresql heap relations for corruption

				 *

				 * Copyright (c) 2016-2025, PostgreSQL Global Development Group

				 * Copyright (c) 2016-2026, PostgreSQL Global Development Group

				 *

				 *	  contrib/amcheck/verify_heapam.c

				 *-------------------------------------------------------------------------

				@ -24,11 +24,13 @@

				#include "funcapi.h"

				#include "miscadmin.h"

				#include "storage/bufmgr.h"

				#include "storage/lwlock.h"

				#include "storage/procarray.h"

				#include "storage/read_stream.h"

				#include "utils/builtins.h"

				#include "utils/fmgroids.h"

				#include "utils/rel.h"

				#include "utils/tuplestore.h"

				PG_FUNCTION_INFO_V1(verify_heapam);

				@ -73,7 +75,7 @@ typedef enum SkipPages

				 */

				typedef struct ToastedAttribute

				{

					struct varatt_external toast_pointer;

					varatt_external toast_pointer;

					BlockNumber blkno;			/* block in main table */

					OffsetNumber offnum;		/* offset in main table */

					AttrNumber	attnum;			/* attribute in main table */

				@ -526,17 +528,17 @@ verify_heapam(PG_FUNCTION_ARGS)

								if (rdoffnum < FirstOffsetNumber)

								{

									report_corruption(&ctx,

													  psprintf("line pointer redirection to item at offset %u precedes minimum offset %u",

															   (unsigned) rdoffnum,

															   (unsigned) FirstOffsetNumber));

													  psprintf("line pointer redirection to item at offset %d precedes minimum offset %d",

															   rdoffnum,

															   FirstOffsetNumber));

									continue;

								}

								if (rdoffnum > maxoff)

								{

									report_corruption(&ctx,

													  psprintf("line pointer redirection to item at offset %u exceeds maximum offset %u",

															   (unsigned) rdoffnum,

															   (unsigned) maxoff));

													  psprintf("line pointer redirection to item at offset %d exceeds maximum offset %d",

															   rdoffnum,

															   maxoff));

									continue;

								}

				@ -550,22 +552,22 @@ verify_heapam(PG_FUNCTION_ARGS)

								if (!ItemIdIsUsed(rditem))

								{

									report_corruption(&ctx,

													  psprintf("redirected line pointer points to an unused item at offset %u",

															   (unsigned) rdoffnum));

													  psprintf("redirected line pointer points to an unused item at offset %d",

															   rdoffnum));

									continue;

								}

								else if (ItemIdIsDead(rditem))

								{

									report_corruption(&ctx,

													  psprintf("redirected line pointer points to a dead item at offset %u",

															   (unsigned) rdoffnum));

													  psprintf("redirected line pointer points to a dead item at offset %d",

															   rdoffnum));

									continue;

								}

								else if (ItemIdIsRedirected(rditem))

								{

									report_corruption(&ctx,

													  psprintf("redirected line pointer points to another redirected line pointer at offset %u",

															   (unsigned) rdoffnum));

													  psprintf("redirected line pointer points to another redirected line pointer at offset %d",

															   rdoffnum));

									continue;

								}

				@ -601,10 +603,10 @@ verify_heapam(PG_FUNCTION_ARGS)

							if (ctx.lp_off + ctx.lp_len > BLCKSZ)

							{

								report_corruption(&ctx,

												  psprintf("line pointer to page offset %u with length %u ends beyond maximum page offset %u",

												  psprintf("line pointer to page offset %u with length %u ends beyond maximum page offset %d",

														   ctx.lp_off,

														   ctx.lp_len,

														   (unsigned) BLCKSZ));

														   BLCKSZ));

								continue;

							}

				@ -678,16 +680,16 @@ verify_heapam(PG_FUNCTION_ARGS)

								if (!HeapTupleHeaderIsHeapOnly(next_htup))

								{

									report_corruption(&ctx,

													  psprintf("redirected line pointer points to a non-heap-only tuple at offset %u",

															   (unsigned) nextoffnum));

													  psprintf("redirected line pointer points to a non-heap-only tuple at offset %d",

															   nextoffnum));

								}

								/* HOT chains should not intersect. */

								if (predecessor[nextoffnum] != InvalidOffsetNumber)

								{

									report_corruption(&ctx,

													  psprintf("redirect line pointer points to offset %u, but offset %u also points there",

															   (unsigned) nextoffnum, (unsigned) predecessor[nextoffnum]));

													  psprintf("redirect line pointer points to offset %d, but offset %d also points there",

															   nextoffnum, predecessor[nextoffnum]));

									continue;

								}

				@ -719,8 +721,8 @@ verify_heapam(PG_FUNCTION_ARGS)

							if (predecessor[nextoffnum] != InvalidOffsetNumber)

							{

								report_corruption(&ctx,

												  psprintf("tuple points to new version at offset %u, but offset %u also points there",

														   (unsigned) nextoffnum, (unsigned) predecessor[nextoffnum]));

												  psprintf("tuple points to new version at offset %d, but offset %d also points there",

														   nextoffnum, predecessor[nextoffnum]));

								continue;

							}

				@ -743,15 +745,15 @@ verify_heapam(PG_FUNCTION_ARGS)

								HeapTupleHeaderIsHeapOnly(next_htup))

							{

								report_corruption(&ctx,

												  psprintf("non-heap-only update produced a heap-only tuple at offset %u",

														   (unsigned) nextoffnum));

												  psprintf("non-heap-only update produced a heap-only tuple at offset %d",

														   nextoffnum));

							}

							if ((curr_htup->t_infomask2 & HEAP_HOT_UPDATED) &&

								!HeapTupleHeaderIsHeapOnly(next_htup))

							{

								report_corruption(&ctx,

												  psprintf("heap-only update produced a non-heap only tuple at offset %u",

														   (unsigned) nextoffnum));

												  psprintf("heap-only update produced a non-heap only tuple at offset %d",

														   nextoffnum));

							}

							/*

				@ -772,10 +774,10 @@ verify_heapam(PG_FUNCTION_ARGS)

								TransactionIdIsInProgress(curr_xmin))

							{

								report_corruption(&ctx,

												  psprintf("tuple with in-progress xmin %u was updated to produce a tuple at offset %u with committed xmin %u",

														   (unsigned) curr_xmin,

														   (unsigned) ctx.offnum,

														   (unsigned) next_xmin));

												  psprintf("tuple with in-progress xmin %u was updated to produce a tuple at offset %d with committed xmin %u",

														   curr_xmin,

														   ctx.offnum,

														   next_xmin));

							}

							/*

				@ -788,16 +790,16 @@ verify_heapam(PG_FUNCTION_ARGS)

							{

								if (xmin_commit_status[nextoffnum] == XID_IN_PROGRESS)

									report_corruption(&ctx,

													  psprintf("tuple with aborted xmin %u was updated to produce a tuple at offset %u with in-progress xmin %u",

															   (unsigned) curr_xmin,

															   (unsigned) ctx.offnum,

															   (unsigned) next_xmin));

													  psprintf("tuple with aborted xmin %u was updated to produce a tuple at offset %d with in-progress xmin %u",

															   curr_xmin,

															   ctx.offnum,

															   next_xmin));

								else if (xmin_commit_status[nextoffnum] == XID_COMMITTED)

									report_corruption(&ctx,

													  psprintf("tuple with aborted xmin %u was updated to produce a tuple at offset %u with committed xmin %u",

															   (unsigned) curr_xmin,

															   (unsigned) ctx.offnum,

															   (unsigned) next_xmin));

													  psprintf("tuple with aborted xmin %u was updated to produce a tuple at offset %d with committed xmin %u",

															   curr_xmin,

															   ctx.offnum,

															   next_xmin));

							}

						}

				@ -1660,11 +1662,11 @@ static bool

				check_tuple_attribute(HeapCheckContext *ctx)

				{

					Datum		attdatum;

					struct varlena *attr;

					varlena    *attr;

					char	   *tp;				/* pointer to the tuple data */

					uint16		infomask;

					CompactAttribute *thisatt;

					struct varatt_external toast_pointer;

					varatt_external toast_pointer;

					infomask = ctx->tuphdr->t_infomask;

					thisatt = TupleDescCompactAttr(RelationGetDescr(ctx->rel), ctx->attnum);

				@ -1754,7 +1756,7 @@ check_tuple_attribute(HeapCheckContext *ctx)

					 * We go further, because we need to check if the toast datum is corrupt.

					 */

					attr = (struct varlena *) DatumGetPointer(attdatum);

					attr = (varlena *) DatumGetPointer(attdatum);

					/*

					 * Now we follow the logic of detoast_external_attr(), with the same

				@ -1838,7 +1840,7 @@ check_tuple_attribute(HeapCheckContext *ctx)

					{

						ToastedAttribute *ta;

						ta = (ToastedAttribute *) palloc0(sizeof(ToastedAttribute));

						ta = palloc0_object(ToastedAttribute);

						VARATT_EXTERNAL_GET_POINTER(ta->toast_pointer, attr);

						ta->blkno = ctx->blkno;

				@ -1942,7 +1944,7 @@ check_tuple(HeapCheckContext *ctx, bool *xmin_commit_status_ok,

					if (RelationGetDescr(ctx->rel)->natts < ctx->natts)

					{

						report_corruption(ctx,

										  psprintf("number of attributes %u exceeds maximum expected for table %u",

										  psprintf("number of attributes %u exceeds maximum %u expected for table",

												   ctx->natts,

												   RelationGetDescr(ctx->rel)->natts));

						return;
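
Many hunks in this file drop an `(unsigned)` cast and switch the format from `%u` to `%d` for offset numbers. A likely reason this is safe is C's default argument promotions: a 16-bit unsigned value passed through a variadic argument list is promoted to `int` on platforms where `int` is wider than 16 bits, so `%d` matches it without a cast. The sketch below shows that behavior in isolation. `DemoOffsetNumber` and `demo_format_offset` are hypothetical names, and the assumption that the offset type is a 16-bit unsigned integer is stated here rather than taken from the diff.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a 16-bit unsigned offset type. */
typedef uint16_t DemoOffsetNumber;

/*
 * Format an offset into buf. The uint16_t argument is promoted to int
 * when passed through the variadic part of snprintf(), so "%d" is the
 * matching conversion and no cast is needed.
 */
static int
demo_format_offset(char *buf, size_t buflen, DemoOffsetNumber off)
{
	return snprintf(buf, buflen, "item at offset %d", off);
}
```

The same reasoning would not apply to a 32-bit unsigned type such as a transaction ID, which is why those arguments keep `%u` in the hunks above.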

									
										150

contrib/amcheck/verify_nbtree.c
									
										
				@ -14,7 +14,7 @@

				 * that every visible heap tuple has a matching index tuple.

				 *

				 *

				 * Copyright (c) 2017-2025, PostgreSQL Global Development Group

				 * Copyright (c) 2017-2026, PostgreSQL Global Development Group

				 *

				 * IDENTIFICATION

				 *	  contrib/amcheck/verify_nbtree.c

				@ -92,9 +92,11 @@ typedef struct BtreeCheckState

					BufferAccessStrategy checkstrategy;

					/*

					 * Info for uniqueness checking. Fill these fields once per index check.

					 * Info for uniqueness checking. Fill this field and the one below once

					 * per index check.

					 */

					IndexInfo  *indexinfo;

					/* Table scan snapshot for heapallindexed and checkunique */

					Snapshot	snapshot;

					/*

				@ -382,7 +384,6 @@ bt_check_every_level(Relation rel, Relation heaprel, bool heapkeyspace,

					BTMetaPageData *metad;

					uint32		previouslevel;

					BtreeLevel	current;

					Snapshot	snapshot = SnapshotAny;

					if (!readonly)

						elog(DEBUG1, "verifying consistency of tree structure for index \"%s\"",

				@ -400,7 +401,7 @@ bt_check_every_level(Relation rel, Relation heaprel, bool heapkeyspace,

					/*

					 * Initialize state for entire verification operation

					 */

					state = palloc0(sizeof(BtreeCheckState));

					state = palloc0_object(BtreeCheckState);

					state->rel = rel;

					state->heaprel = heaprel;

					state->heapkeyspace = heapkeyspace;

				@ -433,54 +434,46 @@ bt_check_every_level(Relation rel, Relation heaprel, bool heapkeyspace,

						state->heaptuplespresent = 0;

						/*

						 * Register our own snapshot in !readonly case, rather than asking

						 * Register our own snapshot for heapallindexed, rather than asking

						 * table_index_build_scan() to do this for us later.  This needs to

						 * happen before index fingerprinting begins, so we can later be

						 * certain that index fingerprinting should have reached all tuples

						 * returned by table_index_build_scan().

						 */

						if (!state->readonly)

						{

							snapshot = RegisterSnapshot(GetTransactionSnapshot());

						state->snapshot = RegisterSnapshot(GetTransactionSnapshot());

							/*

							 * GetTransactionSnapshot() always acquires a new MVCC snapshot in

							 * READ COMMITTED mode.  A new snapshot is guaranteed to have all

							 * the entries it requires in the index.

							 *

							 * We must defend against the possibility that an old xact

							 * snapshot was returned at higher isolation levels when that

							 * snapshot is not safe for index scans of the target index.  This

							 * is possible when the snapshot sees tuples that are before the

							 * index's indcheckxmin horizon.  Throwing an error here should be

							 * very rare.  It doesn't seem worth using a secondary snapshot to

							 * avoid this.

							 */

							if (IsolationUsesXactSnapshot() && rel->rd_index->indcheckxmin &&

								!TransactionIdPrecedes(HeapTupleHeaderGetXmin(rel->rd_indextuple->t_data),

													   snapshot->xmin))

								ereport(ERROR,

										(errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),

										 errmsg("index \"%s\" cannot be verified using transaction snapshot",

												RelationGetRelationName(rel))));

						}

						/*

						 * GetTransactionSnapshot() always acquires a new MVCC snapshot in

						 * READ COMMITTED mode.  A new snapshot is guaranteed to have all the

						 * entries it requires in the index.

						 *

						 * We must defend against the possibility that an old xact snapshot

						 * was returned at higher isolation levels when that snapshot is not

						 * safe for index scans of the target index.  This is possible when

						 * the snapshot sees tuples that are before the index's indcheckxmin

						 * horizon.  Throwing an error here should be very rare.  It doesn't

						 * seem worth using a secondary snapshot to avoid this.

						 */

						if (IsolationUsesXactSnapshot() && rel->rd_index->indcheckxmin &&

							!TransactionIdPrecedes(HeapTupleHeaderGetXmin(rel->rd_indextuple->t_data),

												   state->snapshot->xmin))

							ereport(ERROR,

									errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),

									errmsg("index \"%s\" cannot be verified using transaction snapshot",

										   RelationGetRelationName(rel)));

					}

					/*

					 * We need a snapshot to check the uniqueness of the index. For better

					 * performance take it once per index check. If snapshot already taken

					 * reuse it.

					 * We need a snapshot to check the uniqueness of the index.  For better

					 * performance, take it once per index check.  If one was already taken

					 * above, use that.

					 */

					if (state->checkunique)

					{

						state->indexinfo = BuildIndexInfo(state->rel);

						if (state->indexinfo->ii_Unique)

						{

							if (snapshot != SnapshotAny)

								state->snapshot = snapshot;

							else

								state->snapshot = RegisterSnapshot(GetTransactionSnapshot());

						}

						if (state->indexinfo->ii_Unique && state->snapshot == InvalidSnapshot)

							state->snapshot = RegisterSnapshot(GetTransactionSnapshot());

					}

					Assert(!state->rootdescend || state->readonly);

				@ -555,13 +548,12 @@ bt_check_every_level(Relation rel, Relation heaprel, bool heapkeyspace,

						/*

						 * Create our own scan for table_index_build_scan(), rather than

						 * getting it to do so for us.  This is required so that we can

						 * actually use the MVCC snapshot registered earlier in !readonly

						 * case.

						 * actually use the MVCC snapshot registered earlier.

						 *

						 * Note that table_index_build_scan() calls heap_endscan() for us.

						 */

						scan = table_beginscan_strat(state->heaprel,	/* relation */

													 snapshot,	/* snapshot */

													 state->snapshot,	/* snapshot */

													 0, /* number of keys */

													 NULL,	/* scan key */

													 true,	/* buffer access strategy OK */

				@ -569,16 +561,15 @@ bt_check_every_level(Relation rel, Relation heaprel, bool heapkeyspace,

						/*

						 * Scan will behave as the first scan of a CREATE INDEX CONCURRENTLY

						 * behaves in !readonly case.

						 * behaves.

						 *

						 * It's okay that we don't actually use the same lock strength for the

						 * heap relation as any other ii_Concurrent caller would in !readonly

						 * case.  We have no reason to care about a concurrent VACUUM

						 * operation, since there isn't going to be a second scan of the heap

						 * that needs to be sure that there was no concurrent recycling of

						 * TIDs.

						 * heap relation as any other ii_Concurrent caller would.  We have no

						 * reason to care about a concurrent VACUUM operation, since there

						 * isn't going to be a second scan of the heap that needs to be sure

						 * that there was no concurrent recycling of TIDs.

						 */

						indexinfo->ii_Concurrent = !state->readonly;

						indexinfo->ii_Concurrent = true;

						/*

						 * Don't wait for uncommitted tuple xact commit/abort when index is a

				@ -602,14 +593,11 @@ bt_check_every_level(Relation rel, Relation heaprel, bool heapkeyspace,

												 state->heaptuplespresent, RelationGetRelationName(heaprel),

												 100.0 * bloom_prop_bits_set(state->filter))));

						if (snapshot != SnapshotAny)

							UnregisterSnapshot(snapshot);

						bloom_free(state->filter);

					}

					/* Be tidy: */

					if (snapshot == SnapshotAny && state->snapshot != InvalidSnapshot)

					if (state->snapshot != InvalidSnapshot)

						UnregisterSnapshot(state->snapshot);

					MemoryContextDelete(state->targetcontext);

				}

				@ -721,7 +709,7 @@ bt_check_level_from_leftmost(BtreeCheckState *state, BtreeLevel level)

											 errmsg("block %u is not leftmost in index \"%s\"",

													current, RelationGetRelationName(state->rel))));

								if (level.istruerootlevel && !P_ISROOT(opaque))

								if (level.istruerootlevel && (!P_ISROOT(opaque) && !P_INCOMPLETE_SPLIT(opaque)))

									ereport(ERROR,

											(errcode(ERRCODE_INDEX_CORRUPTED),

											 errmsg("block %u is not true root in index \"%s\"",

				@ -876,7 +864,7 @@ heap_entry_is_visible(BtreeCheckState *state, ItemPointer tid)

				}

				/*

				 * Prepare an error message for unique constrain violation in

				 * Prepare an error message for unique constraint violation in

				 * a btree index and report ERROR.

				 */

				static void

				@ -913,7 +901,7 @@ bt_report_duplicate(BtreeCheckState *state,

							(errcode(ERRCODE_INDEX_CORRUPTED),

							 errmsg("index uniqueness is violated for index \"%s\"",

									RelationGetRelationName(state->rel)),

							 errdetail("Index %s%s and%s%s (point to heap %s and %s) page lsn=%X/%X.",

							 errdetail("Index %s%s and%s%s (point to heap %s and %s) page lsn=%X/%08X.",

									   itid, pposting, nitid, pnposting, htid, nhtid,

									   LSN_FORMAT_ARGS(state->targetlsn))));

				}

				@ -1058,7 +1046,7 @@ bt_leftmost_ignoring_half_dead(BtreeCheckState *state,

									(errcode(ERRCODE_NO_DATA),

									 errmsg_internal("harmless interrupted page deletion detected in index \"%s\"",

													 RelationGetRelationName(state->rel)),

									 errdetail_internal("Block=%u right block=%u page lsn=%X/%X.",

									 errdetail_internal("Block=%u right block=%u page lsn=%X/%08X.",

														reached, reached_from,

														LSN_FORMAT_ARGS(pagelsn))));

				@ -1283,7 +1271,7 @@ bt_target_page_check(BtreeCheckState *state)

									(errcode(ERRCODE_INDEX_CORRUPTED),

									 errmsg("wrong number of high key index tuple attributes in index \"%s\"",

											RelationGetRelationName(state->rel)),

									 errdetail_internal("Index block=%u natts=%u block type=%s page lsn=%X/%X.",

									 errdetail_internal("Index block=%u natts=%u block type=%s page lsn=%X/%08X.",

														state->targetblock,

														BTreeTupleGetNAtts(itup, state->rel),

														P_ISLEAF(topaque) ? "heap" : "index",

				@ -1332,7 +1320,7 @@ bt_target_page_check(BtreeCheckState *state)

									(errcode(ERRCODE_INDEX_CORRUPTED),

									 errmsg("index tuple size does not equal lp_len in index \"%s\"",

											RelationGetRelationName(state->rel)),

									 errdetail_internal("Index tid=(%u,%u) tuple size=%zu lp_len=%u page lsn=%X/%X.",

									 errdetail_internal("Index tid=(%u,%u) tuple size=%zu lp_len=%u page lsn=%X/%08X.",

														state->targetblock, offset,

														tupsize, ItemIdGetLength(itemid),

														LSN_FORMAT_ARGS(state->targetlsn)),

				@ -1356,7 +1344,7 @@ bt_target_page_check(BtreeCheckState *state)

									(errcode(ERRCODE_INDEX_CORRUPTED),

									 errmsg("wrong number of index tuple attributes in index \"%s\"",

											RelationGetRelationName(state->rel)),

									 errdetail_internal("Index tid=%s natts=%u points to %s tid=%s page lsn=%X/%X.",

									 errdetail_internal("Index tid=%s natts=%u points to %s tid=%s page lsn=%X/%08X.",

														itid,

														BTreeTupleGetNAtts(itup, state->rel),

														P_ISLEAF(topaque) ? "heap" : "index",

				@ -1406,7 +1394,7 @@ bt_target_page_check(BtreeCheckState *state)

									(errcode(ERRCODE_INDEX_CORRUPTED),

									 errmsg("could not find tuple using search from root page in index \"%s\"",

											RelationGetRelationName(state->rel)),

									 errdetail_internal("Index tid=%s points to heap tid=%s page lsn=%X/%X.",

									 errdetail_internal("Index tid=%s points to heap tid=%s page lsn=%X/%08X.",

														itid, htid,

														LSN_FORMAT_ARGS(state->targetlsn))));

						}

				@ -1435,7 +1423,7 @@ bt_target_page_check(BtreeCheckState *state)

											(errcode(ERRCODE_INDEX_CORRUPTED),

											 errmsg_internal("posting list contains misplaced TID in index \"%s\"",

															 RelationGetRelationName(state->rel)),

											 errdetail_internal("Index tid=%s posting list offset=%d page lsn=%X/%X.",

											 errdetail_internal("Index tid=%s posting list offset=%d page lsn=%X/%08X.",

																itid, i,

																LSN_FORMAT_ARGS(state->targetlsn))));

								}

				@ -1488,7 +1476,7 @@ bt_target_page_check(BtreeCheckState *state)

									(errcode(ERRCODE_INDEX_CORRUPTED),

									 errmsg("index row size %zu exceeds maximum for index \"%s\"",

											tupsize, RelationGetRelationName(state->rel)),

									 errdetail_internal("Index tid=%s points to %s tid=%s page lsn=%X/%X.",

									 errdetail_internal("Index tid=%s points to %s tid=%s page lsn=%X/%08X.",

														itid,

														P_ISLEAF(topaque) ? "heap" : "index",

														htid,

				@ -1595,7 +1583,7 @@ bt_target_page_check(BtreeCheckState *state)

									(errcode(ERRCODE_INDEX_CORRUPTED),

									 errmsg("high key invariant violated for index \"%s\"",

											RelationGetRelationName(state->rel)),

									 errdetail_internal("Index tid=%s points to %s tid=%s page lsn=%X/%X.",

									 errdetail_internal("Index tid=%s points to %s tid=%s page lsn=%X/%08X.",

														itid,

														P_ISLEAF(topaque) ? "heap" : "index",

														htid,

				@ -1641,9 +1629,7 @@ bt_target_page_check(BtreeCheckState *state)

									(errcode(ERRCODE_INDEX_CORRUPTED),

									 errmsg("item order invariant violated for index \"%s\"",

											RelationGetRelationName(state->rel)),

									 errdetail_internal("Lower index tid=%s (points to %s tid=%s) "

														"higher index tid=%s (points to %s tid=%s) "

														"page lsn=%X/%X.",

									 errdetail_internal("Lower index tid=%s (points to %s tid=%s) higher index tid=%s (points to %s tid=%s) page lsn=%X/%08X.",

														itid,

														P_ISLEAF(topaque) ? "heap" : "index",

														htid,

				@ -1760,7 +1746,7 @@ bt_target_page_check(BtreeCheckState *state)

										(errcode(ERRCODE_INDEX_CORRUPTED),

										 errmsg("cross page item order invariant violated for index \"%s\"",

												RelationGetRelationName(state->rel)),

										 errdetail_internal("Last item on page tid=(%u,%u) page lsn=%X/%X.",

										 errdetail_internal("Last item on page tid=(%u,%u) page lsn=%X/%08X.",

															state->targetblock, offset,

															LSN_FORMAT_ARGS(state->targetlsn))));

							}

				@ -1813,7 +1799,7 @@ bt_target_page_check(BtreeCheckState *state)

												(errcode(ERRCODE_INDEX_CORRUPTED),

												 errmsg("right block of leaf block is non-leaf for index \"%s\"",

														RelationGetRelationName(state->rel)),

												 errdetail_internal("Block=%u page lsn=%X/%X.",

												 errdetail_internal("Block=%u page lsn=%X/%08X.",

																	state->targetblock,

																	LSN_FORMAT_ARGS(state->targetlsn))));

				@ -2237,7 +2223,7 @@ bt_child_highkey_check(BtreeCheckState *state,

									(errcode(ERRCODE_INDEX_CORRUPTED),

									 errmsg("the first child of leftmost target page is not leftmost of its level in index \"%s\"",

											RelationGetRelationName(state->rel)),

									 errdetail_internal("Target block=%u child block=%u target page lsn=%X/%X.",

									 errdetail_internal("Target block=%u child block=%u target page lsn=%X/%08X.",

														state->targetblock, blkno,

														LSN_FORMAT_ARGS(state->targetlsn))));

				@ -2270,7 +2256,7 @@ bt_child_highkey_check(BtreeCheckState *state,

						 * If we visit page with high key, check that it is equal to the

						 * target key next to corresponding downlink.

						 */

						if (!rightsplit && !P_RIGHTMOST(opaque))

						if (!rightsplit && !P_RIGHTMOST(opaque) && !P_ISHALFDEAD(opaque))

						{

							BTPageOpaque topaque;

							IndexTuple	highkey;

				@ -2323,7 +2309,7 @@ bt_child_highkey_check(BtreeCheckState *state,

												(errcode(ERRCODE_INDEX_CORRUPTED),

												 errmsg("child high key is greater than rightmost pivot key on target level in index \"%s\"",

														RelationGetRelationName(state->rel)),

												 errdetail_internal("Target block=%u child block=%u target page lsn=%X/%X.",

												 errdetail_internal("Target block=%u child block=%u target page lsn=%X/%08X.",

																	state->targetblock, blkno,

																	LSN_FORMAT_ARGS(state->targetlsn))));

									pivotkey_offset = P_HIKEY;

				@ -2353,7 +2339,7 @@ bt_child_highkey_check(BtreeCheckState *state,

											(errcode(ERRCODE_INDEX_CORRUPTED),

											 errmsg("can't find left sibling high key in index \"%s\"",

													RelationGetRelationName(state->rel)),

											 errdetail_internal("Target block=%u child block=%u target page lsn=%X/%X.",

											 errdetail_internal("Target block=%u child block=%u target page lsn=%X/%08X.",

																state->targetblock, blkno,

																LSN_FORMAT_ARGS(state->targetlsn))));

								itup = state->lowkey;

				@ -2365,7 +2351,7 @@ bt_child_highkey_check(BtreeCheckState *state,

										(errcode(ERRCODE_INDEX_CORRUPTED),

										 errmsg("mismatch between parent key and child high key in index \"%s\"",

												RelationGetRelationName(state->rel)),

										 errdetail_internal("Target block=%u child block=%u target page lsn=%X/%X.",

										 errdetail_internal("Target block=%u child block=%u target page lsn=%X/%08X.",

															state->targetblock, blkno,

															LSN_FORMAT_ARGS(state->targetlsn))));

							}

				@ -2505,7 +2491,7 @@ bt_child_check(BtreeCheckState *state, BTScanInsert targetkey,

								(errcode(ERRCODE_INDEX_CORRUPTED),

								 errmsg("downlink to deleted page found in index \"%s\"",

										RelationGetRelationName(state->rel)),

								 errdetail_internal("Parent block=%u child block=%u parent page lsn=%X/%X.",

								 errdetail_internal("Parent block=%u child block=%u parent page lsn=%X/%08X.",

													state->targetblock, childblock,

													LSN_FORMAT_ARGS(state->targetlsn))));

				@ -2546,7 +2532,7 @@ bt_child_check(BtreeCheckState *state, BTScanInsert targetkey,

									(errcode(ERRCODE_INDEX_CORRUPTED),

									 errmsg("down-link lower bound invariant violated for index \"%s\"",

											RelationGetRelationName(state->rel)),

									 errdetail_internal("Parent block=%u child index tid=(%u,%u) parent page lsn=%X/%X.",

									 errdetail_internal("Parent block=%u child index tid=(%u,%u) parent page lsn=%X/%08X.",

														state->targetblock, childblock, offset,

														LSN_FORMAT_ARGS(state->targetlsn))));

					}

				@ -2616,7 +2602,7 @@ bt_downlink_missing_check(BtreeCheckState *state, bool rightsplit,

								(errcode(ERRCODE_NO_DATA),

								 errmsg_internal("harmless interrupted page split detected in index \"%s\"",

												 RelationGetRelationName(state->rel)),

								 errdetail_internal("Block=%u level=%u left sibling=%u page lsn=%X/%X.",

								 errdetail_internal("Block=%u level=%u left sibling=%u page lsn=%X/%08X.",

													blkno, opaque->btpo_level,

													opaque->btpo_prev,

													LSN_FORMAT_ARGS(pagelsn))));

				@ -2638,7 +2624,7 @@ bt_downlink_missing_check(BtreeCheckState *state, bool rightsplit,

								(errcode(ERRCODE_INDEX_CORRUPTED),

								 errmsg("leaf index block lacks downlink in index \"%s\"",

										RelationGetRelationName(state->rel)),

								 errdetail_internal("Block=%u page lsn=%X/%X.",

								 errdetail_internal("Block=%u page lsn=%X/%08X.",

													blkno,

													LSN_FORMAT_ARGS(pagelsn))));

				@ -2704,7 +2690,7 @@ bt_downlink_missing_check(BtreeCheckState *state, bool rightsplit,

								(errcode(ERRCODE_INDEX_CORRUPTED),

								 errmsg_internal("downlink to deleted leaf page found in index \"%s\"",

												 RelationGetRelationName(state->rel)),

								 errdetail_internal("Top parent/target block=%u leaf block=%u top parent/under check lsn=%X/%X.",

								 errdetail_internal("Top parent/target block=%u leaf block=%u top parent/under check lsn=%X/%08X.",

													blkno, childblk,

													LSN_FORMAT_ARGS(pagelsn))));

				@ -2730,7 +2716,7 @@ bt_downlink_missing_check(BtreeCheckState *state, bool rightsplit,

							(errcode(ERRCODE_INDEX_CORRUPTED),

							 errmsg("internal index block lacks downlink in index \"%s\"",

									RelationGetRelationName(state->rel)),

							 errdetail_internal("Block=%u level=%u page lsn=%X/%X.",

							 errdetail_internal("Block=%u level=%u page lsn=%X/%08X.",

												blkno, opaque->btpo_level,

												LSN_FORMAT_ARGS(pagelsn))));

				}

				@ -3023,7 +3009,6 @@ static bool

				bt_rootdescend(BtreeCheckState *state, IndexTuple itup)

				{

					BTScanInsert key;

					BTStack		stack;

					Buffer		lbuf;

					bool		exists;

				@ -3040,7 +3025,7 @@ bt_rootdescend(BtreeCheckState *state, IndexTuple itup)

					 */

					Assert(state->readonly && state->rootdescend);

					exists = false;

					stack = _bt_search(state->rel, NULL, key, &lbuf, BT_READ);

					_bt_search(state->rel, NULL, key, &lbuf, BT_READ, false);

					if (BufferIsValid(lbuf))

					{

				@ -3067,7 +3052,6 @@ bt_rootdescend(BtreeCheckState *state, IndexTuple itup)

						_bt_relbuf(state->rel, lbuf);

					}

					_bt_freestack(stack);

					pfree(key);

					return exists;
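
Most hunks in this file change only the LSN format in the error detail, from `page lsn=%X/%X` to `page lsn=%X/%08X`. The following is a minimal standalone sketch of what that change does to the printed text, assuming (as in PostgreSQL's `LSN_FORMAT_ARGS`) that the two arguments are the high and low 32-bit halves of a 64-bit LSN. The helper `format_lsn` is hypothetical and is not part of the patch or of PostgreSQL.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper (illustration only): format a 64-bit LSN as
 * "high/low", zero-padding the low half to 8 hex digits, which mirrors
 * the "%X/%08X" format used by the updated messages.
 */
int
format_lsn(char *buf, size_t buflen, uint64_t lsn)
{
	uint32_t	hi = (uint32_t) (lsn >> 32);	/* upper 32 bits */
	uint32_t	lo = (uint32_t) lsn;			/* lower 32 bits */

	return snprintf(buf, buflen, "%" PRIX32 "/%08" PRIX32, hi, lo);
}
```

With this padding, an LSN whose low half is `0xA0` and high half is `1` is rendered as `1/000000A0` rather than `1/A0`, so the low half always occupies eight hex digits.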

									

contrib/auth_delay/auth_delay.c
									
										
				@ -2,7 +2,7 @@

				 *

				 * auth_delay.c

				 *

				 * Copyright (c) 2010-2025, PostgreSQL Global Development Group

				 * Copyright (c) 2010-2026, PostgreSQL Global Development Group

				 *

				 * IDENTIFICATION

				 *		contrib/auth_delay/auth_delay.c

									

contrib/auth_delay/meson.build
									
										
				@ -1,4 +1,4 @@

				# Copyright (c) 2022-2025, PostgreSQL Global Development Group

				# Copyright (c) 2022-2026, PostgreSQL Global Development Group

				auth_delay_sources = files(

				  'auth_delay.c',

									

contrib/auto_explain/Makefile
									
										
				@ -6,6 +6,9 @@ OBJS = \

					auto_explain.o

				PGFILEDESC = "auto_explain - logging facility for execution plans"

				EXTRA_INSTALL = contrib/pg_overexplain

				REGRESS = alter_reset extension_options

				TAP_TESTS = 1

				ifdef USE_PGXS

									

contrib/auto_explain/auto_explain.c
									
										
				@ -3,7 +3,7 @@

				 * auto_explain.c

				 *

				 *

				 * Copyright (c) 2008-2025, PostgreSQL Global Development Group

				 * Copyright (c) 2008-2026, PostgreSQL Global Development Group

				 *

				 * IDENTIFICATION

				 *	  contrib/auto_explain/auto_explain.c

				@ -15,12 +15,17 @@

				#include <limits.h>

				#include "access/parallel.h"

				#include "commands/defrem.h"

				#include "commands/explain.h"

				#include "commands/explain_format.h"

				#include "commands/explain_state.h"

				#include "common/pg_prng.h"

				#include "executor/instrument.h"

				#include "nodes/makefuncs.h"

				#include "nodes/value.h"

				#include "parser/scansup.h"

				#include "utils/guc.h"

				#include "utils/varlena.h"

				PG_MODULE_MAGIC_EXT(

									.name = "auto_explain",

				@ -33,6 +38,7 @@ static int	auto_explain_log_parameter_max_length = -1; /* bytes or -1 */

				static bool auto_explain_log_analyze = false;

				static bool auto_explain_log_verbose = false;

				static bool auto_explain_log_buffers = false;

				static bool auto_explain_log_io = false;

				static bool auto_explain_log_wal = false;

				static bool auto_explain_log_triggers = false;

				static bool auto_explain_log_timing = true;

				@ -41,6 +47,31 @@ static int	auto_explain_log_format = EXPLAIN_FORMAT_TEXT;

				static int	auto_explain_log_level = LOG;

				static bool auto_explain_log_nested_statements = false;

				static double auto_explain_sample_rate = 1;

				static char *auto_explain_log_extension_options = NULL;

				/*

				 * Parsed form of one option from auto_explain.log_extension_options.

				 */

				typedef struct auto_explain_option

				{

					char	   *name;

					char	   *value;

					NodeTag		type;

				} auto_explain_option;

				/*

				 * Parsed form of the entirety of auto_explain.log_extension_options, stored

				 * as GUC extra. The options[] array will have pointers into the string

				 * following the array.

				 */

				typedef struct auto_explain_extension_options

				{

					int			noptions;

					auto_explain_option options[FLEXIBLE_ARRAY_MEMBER];

					/* a null-terminated copy of the GUC string follows the array */

				} auto_explain_extension_options;

				static auto_explain_extension_options *extension_options = NULL;

				static const struct config_enum_entry format_options[] = {

					{"text", EXPLAIN_FORMAT_TEXT, false},

				@ -88,6 +119,15 @@ static void explain_ExecutorRun(QueryDesc *queryDesc,

				static void explain_ExecutorFinish(QueryDesc *queryDesc);

				static void explain_ExecutorEnd(QueryDesc *queryDesc);

				static bool check_log_extension_options(char **newval, void **extra,

														GucSource source);

				static void assign_log_extension_options(const char *newval, void *extra);

				static void apply_extension_options(ExplainState *es,

													auto_explain_extension_options *ext);

				static char *auto_explain_scan_literal(char **endp, char **nextp);

				static int	auto_explain_split_options(char *rawstring,

													   auto_explain_option *options,

													   int maxoptions, char **errmsg);

				/*

				 * Module load callback

				@ -164,6 +204,17 @@ _PG_init(void)

											 NULL,

											 NULL);

					DefineCustomBoolVariable("auto_explain.log_io",

											 "Log I/O statistics.",

											 NULL,

											 &auto_explain_log_io,

											 false,

											 PGC_SUSET,

											 0,

											 NULL,

											 NULL,

											 NULL);

					DefineCustomBoolVariable("auto_explain.log_wal",

											 "Log WAL usage.",

											 NULL,

				@ -232,6 +283,17 @@ _PG_init(void)

											 NULL,

											 NULL);

					DefineCustomStringVariable("auto_explain.log_extension_options",

											   "Extension EXPLAIN options to be added.",

											   NULL,

											   &auto_explain_log_extension_options,

											   NULL,

											   PGC_SUSET,

											   0,

											   check_log_extension_options,

											   assign_log_extension_options,

											   NULL);

					DefineCustomRealVariable("auto_explain.sample_rate",

											 "Fraction of queries to process.",

											 NULL,

				@ -284,6 +346,9 @@ explain_ExecutorStart(QueryDesc *queryDesc, int eflags)

					if (auto_explain_enabled())

					{

						/* We're always interested in runtime */

						queryDesc->query_instr_options |= INSTRUMENT_TIMER;

						/* Enable per-node instrumentation iff log_analyze is required. */

						if (auto_explain_log_analyze && (eflags & EXEC_FLAG_EXPLAIN_ONLY) == 0)

						{

				@ -293,6 +358,8 @@ explain_ExecutorStart(QueryDesc *queryDesc, int eflags)

								queryDesc->instrument_options |= INSTRUMENT_ROWS;

							if (auto_explain_log_buffers)

								queryDesc->instrument_options |= INSTRUMENT_BUFFERS;

							if (auto_explain_log_io)

								queryDesc->instrument_options |= INSTRUMENT_IO;

							if (auto_explain_log_wal)

								queryDesc->instrument_options |= INSTRUMENT_WAL;

						}

				@ -302,23 +369,6 @@ explain_ExecutorStart(QueryDesc *queryDesc, int eflags)

						prev_ExecutorStart(queryDesc, eflags);

					else

						standard_ExecutorStart(queryDesc, eflags);

					if (auto_explain_enabled())

					{

						/*

						 * Set up to track total elapsed time in ExecutorRun.  Make sure the

						 * space is allocated in the per-query context so it will go away at

						 * ExecutorEnd.

						 */

						if (queryDesc->totaltime == NULL)

						{

							MemoryContext oldcxt;

							oldcxt = MemoryContextSwitchTo(queryDesc->estate->es_query_cxt);

							queryDesc->totaltime = InstrAlloc(1, INSTRUMENT_ALL, false);

							MemoryContextSwitchTo(oldcxt);

						}

					}

				}

				/*

				@ -370,7 +420,7 @@ explain_ExecutorFinish(QueryDesc *queryDesc)

				static void

				explain_ExecutorEnd(QueryDesc *queryDesc)

				{

					if (queryDesc->totaltime && auto_explain_enabled())

					if (queryDesc->query_instr && auto_explain_enabled())

					{

						MemoryContext oldcxt;

						double		msec;

				@ -381,14 +431,8 @@ explain_ExecutorEnd(QueryDesc *queryDesc)

						 */

						oldcxt = MemoryContextSwitchTo(queryDesc->estate->es_query_cxt);

						/*

						 * Make sure stats accumulation is done.  (Note: it's okay if several

						 * levels of hook all do this.)

						 */

						InstrEndLoop(queryDesc->totaltime);

						/* Log plan if duration is exceeded. */

						msec = queryDesc->totaltime->total * 1000.0;

						msec = INSTR_TIME_GET_MILLISEC(queryDesc->query_instr->total);

						if (msec >= auto_explain_log_min_duration)

						{

							ExplainState *es = NewExplainState();

				@ -396,6 +440,7 @@ explain_ExecutorEnd(QueryDesc *queryDesc)

							es->analyze = (queryDesc->instrument_options && auto_explain_log_analyze);

							es->verbose = auto_explain_log_verbose;

							es->buffers = (es->analyze && auto_explain_log_buffers);

							es->io = (es->analyze && auto_explain_log_io);

							es->wal = (es->analyze && auto_explain_log_wal);

							es->timing = (es->analyze && auto_explain_log_timing);

							es->summary = es->analyze;

				@ -404,6 +449,8 @@ explain_ExecutorEnd(QueryDesc *queryDesc)

							es->format = auto_explain_log_format;

							es->settings = auto_explain_log_settings;

							apply_extension_options(es, extension_options);

							ExplainBeginOutput(es);

							ExplainQueryText(es, queryDesc);

							ExplainQueryParameters(es, queryDesc->params, auto_explain_log_parameter_max_length);

				@ -412,6 +459,12 @@ explain_ExecutorEnd(QueryDesc *queryDesc)

								ExplainPrintTriggers(es, queryDesc);

							if (es->costs)

								ExplainPrintJITSummary(es, queryDesc);

							if (explain_per_plan_hook)

								(*explain_per_plan_hook) (queryDesc->plannedstmt,

														  NULL, es,

														  queryDesc->sourceText,

														  queryDesc->params,

														  queryDesc->estate->es_queryEnv);

							ExplainEndOutput(es);

							/* Remove last line break */

				@ -445,3 +498,332 @@ explain_ExecutorEnd(QueryDesc *queryDesc)

					else

						standard_ExecutorEnd(queryDesc);

				}

				/*

				 * GUC check hook for auto_explain.log_extension_options.

				 */

				static bool

				check_log_extension_options(char **newval, void **extra, GucSource source)

				{

					char	   *rawstring;

					auto_explain_extension_options *result;

					auto_explain_option *options;

					int			maxoptions = 8;

					Size		rawstring_len;

					Size		allocsize;

					char	   *errmsg;

					/* NULL or empty string means no options. */

					if (*newval == NULL || (*newval)[0] == '\0')

					{

						*extra = NULL;

						return true;

					}

					rawstring_len = strlen(*newval) + 1;

				retry:

					/* Try to allocate an auto_explain_extension_options object. */

					allocsize = offsetof(auto_explain_extension_options, options) +

						sizeof(auto_explain_option) * maxoptions +

						rawstring_len;

					result = (auto_explain_extension_options *) guc_malloc(LOG, allocsize);

					if (result == NULL)

						return false;

					/* Copy the string after the options array. */

					rawstring = (char *) &result->options[maxoptions];

					memcpy(rawstring, *newval, rawstring_len);

					/* Parse. */

					options = result->options;

					result->noptions = auto_explain_split_options(rawstring, options,

																  maxoptions, &errmsg);

					if (result->noptions < 0)

					{

						GUC_check_errdetail("%s", errmsg);

						guc_free(result);

						return false;

					}

					/*

					 * Retry with a larger array if needed.

					 *

					 * It should be impossible for this to loop more than once, because

					 * auto_explain_split_options tells us how many entries are needed.

					 */

					if (result->noptions > maxoptions)

					{

						maxoptions = result->noptions;

						guc_free(result);

						goto retry;

					}

					/* Validate each option against its registered check handler. */

					for (int i = 0; i < result->noptions; i++)

					{

						if (!GUCCheckExplainExtensionOption(options[i].name, options[i].value,

															options[i].type))

						{

							guc_free(result);

							return false;

						}

					}

					*extra = result;

					return true;

				}

				/*

				 * GUC assign hook for auto_explain.log_extension_options.

				 */

				static void

				assign_log_extension_options(const char *newval, void *extra)

				{

					extension_options = (auto_explain_extension_options *) extra;

				}

				/*

				 * Apply parsed extension options to an ExplainState.

				 */

				static void

				apply_extension_options(ExplainState *es, auto_explain_extension_options *ext)

				{

					if (ext == NULL)

						return;

					for (int i = 0; i < ext->noptions; i++)

					{

						auto_explain_option *opt = &ext->options[i];

						DefElem    *def;

						Node	   *arg;

						if (opt->value == NULL)

							arg = NULL;

						else if (opt->type == T_Integer)

							arg = (Node *) makeInteger(strtol(opt->value, NULL, 0));

						else if (opt->type == T_Float)

							arg = (Node *) makeFloat(opt->value);

						else

							arg = (Node *) makeString(opt->value);

						def = makeDefElem(opt->name, arg, -1);

						ApplyExtensionExplainOption(es, def, NULL);

					}

				}

				/*

				 * auto_explain_scan_literal - In-place scanner for single-quoted string

				 * literals.

				 *

				 * This is the single-quote analog of scan_quoted_identifier from varlena.c.

				 */

				static char *

				auto_explain_scan_literal(char **endp, char **nextp)

				{

					char	   *token = *nextp + 1;

					for (;;)

					{

						*endp = strchr(*nextp + 1, '\'');

						if (*endp == NULL)

							return NULL;		/* mismatched quotes */

						if ((*endp)[1] != '\'')

							break;				/* found end of literal */

						/* Collapse adjacent quotes into one quote, and look again */

						memmove(*endp, *endp + 1, strlen(*endp));

						*nextp = *endp;

					}

					/* *endp now points at the terminating quote */

					*nextp = *endp + 1;

					return token;

				}

				/*

				 * auto_explain_split_options - Parse an option string into an array of

				 * auto_explain_option structs.

				 *

				 * Much of this logic is similar to SplitIdentifierString and friends, but our

				 * needs are different enough that we roll our own parsing logic. The goal here

				 * is to accept the same syntax that the main parser would accept inside of

				 * an EXPLAIN option list. While we can't do that perfectly without adding a

				 * lot more code, the goal of this implementation is to be close enough that

				 * users don't really notice the differences.

				 *

				 * The input string is modified in place (null-terminated, downcased, quotes

				 * collapsed).  All name and value pointers in the output array refer into

				 * this string, so the caller must ensure the string outlives the array.

				 *

				 * Returns the full number of options in the input string, but stores no

				 * more than maxoptions into the caller-provided array. If a syntax error

				 * occurs, returns -1 and sets *errmsg.

				 */

				static int

				auto_explain_split_options(char *rawstring, auto_explain_option *options,

										   int maxoptions, char **errmsg)

				{

					char	   *nextp = rawstring;

					int			noptions = 0;

					bool		done = false;

					*errmsg = NULL;

					while (scanner_isspace(*nextp))

						nextp++;				/* skip leading whitespace */

					if (*nextp == '\0')

						return 0;				/* empty string is fine */

					while (!done)

					{

						char	   *name;

						char	   *name_endp;

						char	   *value = NULL;

						char	   *value_endp = NULL;

						NodeTag		type = T_Invalid;

						/* Parse the option name. */

						name = scan_identifier(&name_endp, &nextp, ',', true);

						if (name == NULL || name_endp == name)

						{

							*errmsg = "option name missing or empty";

							return -1;

						}

						/* Skip whitespace after the option name. */

						while (scanner_isspace(*nextp))

							nextp++;

						/*

						 * Determine whether we have an option value.  A comma or end of

						 * string means no value; otherwise we have one.

						 */

						if (*nextp != '\0' && *nextp != ',')

						{

							if (*nextp == '\'')

							{

								/* Single-quoted string literal. */

								type = T_String;

								value = auto_explain_scan_literal(&value_endp, &nextp);

								if (value == NULL)

								{

									*errmsg = "unterminated single-quoted string";

									return -1;

								}

							}

							else if (isdigit((unsigned char) *nextp) ||

									 ((*nextp == '+' || *nextp == '-') &&

									  isdigit((unsigned char) nextp[1])))

							{

								char	   *endptr;

								long		intval;

								char		saved;

								/* Remember the start of the next token, and find the end. */

								value = nextp;

								while (*nextp && *nextp != ',' && !scanner_isspace(*nextp))

									nextp++;

								value_endp = nextp;

								/* Temporarily '\0'-terminate so we can use strtol/strtod. */

								saved = *value_endp;

								*value_endp = '\0';

								/*

								 * Integer, float, or neither?

								 *

								 * NB: Since we use strtol and strtod here rather than

								 * pg_strtoint64_safe, some syntax that would be accepted by

								 * the main parser is not accepted here, e.g. 100_000. On the

								 * plus side, strtol and strtod won't allocate, and

								 * pg_strtoint64_safe might. For now, it seems better to keep

								 * things simple here.

								 */

								errno = 0;

								intval = strtol(value, &endptr, 0);

								if (errno == 0 && *endptr == '\0' && endptr != value &&

									intval == (int) intval)

									type = T_Integer;

								else

								{

									type = T_Float;

									(void) strtod(value, &endptr);

									if (*endptr != '\0')

									{

										*value_endp = saved;

										*errmsg = "invalid numeric value";

										return -1;

									}

								}

								/* Remove temporary terminator. */

								*value_endp = saved;

							}

							else

							{

								/* Identifier, possibly double-quoted. */

								type = T_String;

								value = scan_identifier(&value_endp, &nextp, ',', true);

								if (value == NULL)

								{

									/*

									 * scan_identifier will return NULL if it finds an

									 * unterminated double-quoted identifier or it finds no

									 * identifier at all because the next character is

									 * whitespace or the separator character, here a comma.

									 * But the latter case is impossible here because the code

									 * above has skipped whitespace and checked for commas.

									 */

									*errmsg = "unterminated double-quoted string";

									return -1;

								}

							}

						}

						/* Skip trailing whitespace. */

						while (scanner_isspace(*nextp))

							nextp++;

						/* Expect comma or end of string. */

						if (*nextp == ',')

						{

							nextp++;

							while (scanner_isspace(*nextp))

								nextp++;

							if (*nextp == '\0')

							{

								*errmsg = "trailing comma in option list";

								return -1;

							}

						}

						else if (*nextp == '\0')

							done = true;

						else

						{

							*errmsg = "expected comma or end of option list";

							return -1;

						}

						/*

						 * Now safe to null-terminate the name and value.  We couldn't do this

						 * earlier because in the unquoted case, the null terminator position

						 * may coincide with a character that the scanning logic above still

						 * needed to read.

						 */

						*name_endp = '\0';

						if (value_endp != NULL)

							*value_endp = '\0';

						/* Always count this option, and store the details if there is room. */

						if (noptions < maxoptions)

						{

							options[noptions].name = name;

							options[noptions].type = type;

							options[noptions].value = value;

						}

						noptions++;

					}

					return noptions;

				}
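
The retry logic in check_log_extension_options depends on auto_explain_split_options returning the full option count even when the caller's array is too small. The sketch below shows that count-then-retry pattern in isolation. It is a minimal illustration, not the committed code: `count_and_store` and `split_with_retry` are hypothetical names, plain `malloc` stands in for `guc_malloc`, and the "parser" only splits on commas.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for the real parser: records where each
 * comma-separated item starts, storing at most "max" entries, but always
 * returns the total number of items present in the input.
 */
static int
count_and_store(const char *input, const char **items, int max)
{
	int			n = 0;
	const char *p = input;

	if (*p == '\0')
		return 0;				/* empty string: no items */

	for (;;)
	{
		if (n < max)
			items[n] = p;		/* store only while there is room */
		n++;					/* but always count */
		p = strchr(p, ',');
		if (p == NULL)
			break;
		p++;					/* step past the comma */
	}
	return n;
}

/*
 * Try an initial array size; if the parser reports more items than fit,
 * retry with exactly the size it reported.  Because the first pass returns
 * the true count, the second pass always fits.  Returns a malloc'd array
 * that the caller frees, or NULL on allocation failure.
 */
static const char **
split_with_retry(const char *input, int initial_max, int *nitems, int *passes)
{
	int			max = initial_max > 0 ? initial_max : 1;

	*passes = 0;
	for (;;)
	{
		const char **items = malloc(sizeof(const char *) * max);

		if (items == NULL)
			return NULL;
		(*passes)++;
		*nitems = count_and_store(input, items, max);
		if (*nitems <= max)
			return items;
		max = *nitems;			/* exact size needed, known from pass one */
		free(items);
	}
}
```

Calling `split_with_retry("a,b,c,d,e", 2, &n, &passes)` takes two passes and fills five entries. This mirrors the final regression test in this commit, which supplies enough options to force the array to be reallocated.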

19

contrib/auto_explain/expected/alter_reset.out Normal file


 @ -0,0 +1,19 @@
 --
 -- This tests resetting unknown custom GUCs with reserved prefixes.  There's
 -- nothing specific to auto_explain; this is just a convenient place to put
 -- this test.
 --
 SELECT current_database() AS datname \gset
 CREATE ROLE regress_ae_role;
 ALTER DATABASE :"datname" SET auto_explain.bogus = 1;
 ALTER ROLE regress_ae_role SET auto_explain.bogus = 1;
 ALTER ROLE regress_ae_role IN DATABASE :"datname" SET auto_explain.bogus = 1;
 ALTER SYSTEM SET auto_explain.bogus = 1;
 LOAD 'auto_explain';
 WARNING:  invalid configuration parameter name "auto_explain.bogus", removing it
 DETAIL:  "auto_explain" is now a reserved prefix.
 ALTER DATABASE :"datname" RESET auto_explain.bogus;
 ALTER ROLE regress_ae_role RESET auto_explain.bogus;
 ALTER ROLE regress_ae_role IN DATABASE :"datname" RESET auto_explain.bogus;
 ALTER SYSTEM RESET auto_explain.bogus;
 DROP ROLE regress_ae_role;

49

contrib/auto_explain/expected/extension_options.out Normal file


 @ -0,0 +1,49 @@
 --
 -- Tests for auto_explain.log_extension_options.
 --
 LOAD 'auto_explain';
 LOAD 'pg_overexplain';
 -- Various legal values with assorted quoting and whitespace choices.
 SET auto_explain.log_extension_options = '';
 SET auto_explain.log_extension_options = 'debug, RANGE_TABLE';
 SET auto_explain.log_extension_options = 'debug TRUE  ';
 SET auto_explain.log_extension_options = '   debug 1,RAnge_table "off"';
 SET auto_explain.log_extension_options = $$"debug" tRuE, range_table 'false'$$;
 -- Syntax errors.
 SET auto_explain.log_extension_options = ',';
 ERROR:  invalid value for parameter "auto_explain.log_extension_options": ","
 DETAIL:  option name missing or empty
 SET auto_explain.log_extension_options = ', range_table';
 ERROR:  invalid value for parameter "auto_explain.log_extension_options": ", range_table"
 DETAIL:  option name missing or empty
 SET auto_explain.log_extension_options = 'range_table, ';
 ERROR:  invalid value for parameter "auto_explain.log_extension_options": "range_table, "
 DETAIL:  trailing comma in option list
 SET auto_explain.log_extension_options = 'range_table true false';
 ERROR:  invalid value for parameter "auto_explain.log_extension_options": "range_table true false"
 DETAIL:  expected comma or end of option list
 SET auto_explain.log_extension_options = '"range_table';
 ERROR:  invalid value for parameter "auto_explain.log_extension_options": ""range_table"
 DETAIL:  option name missing or empty
 SET auto_explain.log_extension_options = 'range_table 3.1415nine';
 ERROR:  invalid value for parameter "auto_explain.log_extension_options": "range_table 3.1415nine"
 DETAIL:  invalid numeric value
 SET auto_explain.log_extension_options = 'range_table "true';
 ERROR:  invalid value for parameter "auto_explain.log_extension_options": "range_table "true"
 DETAIL:  unterminated double-quoted string
 SET auto_explain.log_extension_options = $$range_table 'true$$;
 ERROR:  invalid value for parameter "auto_explain.log_extension_options": "range_table 'true"
 DETAIL:  unterminated single-quoted string
 SET auto_explain.log_extension_options = $$'$$;
 ERROR:  unrecognized EXPLAIN option "'"
 -- Unacceptable option values.
 SET auto_explain.log_extension_options = 'range_table maybe';
 ERROR:  EXPLAIN option "range_table" requires a Boolean value
 SET auto_explain.log_extension_options = 'range_table 2';
 ERROR:  EXPLAIN option "range_table" requires a Boolean value
 SET auto_explain.log_extension_options = 'range_table "0"';
 ERROR:  EXPLAIN option "range_table" requires a Boolean value
 SET auto_explain.log_extension_options = 'range_table 3.14159';
 ERROR:  EXPLAIN option "range_table" requires a Boolean value
 -- Supply enough options to force the option array to be reallocated.
 SET auto_explain.log_extension_options = 'debug, debug, debug, debug, debug, debug, debug, debug, debug, debug false';
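
The numeric cases in this expected output (`2`, `3.14159`, `3.1415nine`) follow from how the parser classifies an unquoted token that starts with a digit or sign: try `strtol` first, fall back to `strtod`, and reject the token if neither consumes all of it. The sketch below reproduces that decision on an already-isolated token. `token_kind` and `classify_numeric` are hypothetical names, and the extra `endptr != token` check on the `strtod` branch is an addition for standalone safety, not something the diff contains.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

typedef enum
{
	TOKEN_INTEGER,
	TOKEN_FLOAT,
	TOKEN_INVALID
} token_kind;

/*
 * Classify a NUL-terminated token the way the parser's numeric branch does:
 * an integer must be fully consumed by strtol (base 0, so 0x.. and 0..
 * prefixes are honored) and fit in an int; otherwise the token must be fully
 * consumed by strtod.
 */
static token_kind
classify_numeric(const char *token)
{
	char	   *endptr;
	long		intval;

	errno = 0;
	intval = strtol(token, &endptr, 0);
	if (errno == 0 && endptr != token && *endptr == '\0' &&
		intval >= INT_MIN && intval <= INT_MAX)
		return TOKEN_INTEGER;

	(void) strtod(token, &endptr);
	if (endptr != token && *endptr == '\0')
		return TOKEN_FLOAT;

	return TOKEN_INVALID;		/* trailing junk such as "3.1415nine" */
}
```

As the source comment notes, this approach rejects some spellings the main parser accepts, such as `100_000`. That is the price of using the C library routines, which do not allocate.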

									
										8

contrib/auto_explain/meson.build
									
										
				@ -1,4 +1,4 @@

				-# Copyright (c) 2022-2025, PostgreSQL Global Development Group
				+# Copyright (c) 2022-2026, PostgreSQL Global Development Group

				auto_explain_sources = files(

				  'auto_explain.c',

				@ -20,6 +20,12 @@ tests += {

				  'name': 'auto_explain',

				  'sd': meson.current_source_dir(),

				  'bd': meson.current_build_dir(),

				+  'regress': {
				+    'sql': [
				+      'alter_reset',
				+      'extension_options',
				+    ],
				+  },

				  'tap': {

				    'tests': [

				      't/001_auto_explain.pl',

									
										22

contrib/auto_explain/sql/alter_reset.sql
									
										Normal file
									
										
				@ -0,0 +1,22 @@

				--

				-- This tests resetting unknown custom GUCs with reserved prefixes.  There's

				-- nothing specific to auto_explain; this is just a convenient place to put

				-- this test.

				--

				SELECT current_database() AS datname \gset

				CREATE ROLE regress_ae_role;

				ALTER DATABASE :"datname" SET auto_explain.bogus = 1;

				ALTER ROLE regress_ae_role SET auto_explain.bogus = 1;

				ALTER ROLE regress_ae_role IN DATABASE :"datname" SET auto_explain.bogus = 1;

				ALTER SYSTEM SET auto_explain.bogus = 1;

				LOAD 'auto_explain';

				ALTER DATABASE :"datname" RESET auto_explain.bogus;

				ALTER ROLE regress_ae_role RESET auto_explain.bogus;

				ALTER ROLE regress_ae_role IN DATABASE :"datname" RESET auto_explain.bogus;

				ALTER SYSTEM RESET auto_explain.bogus;

				DROP ROLE regress_ae_role;

									
										33

contrib/auto_explain/sql/extension_options.sql
									
										Normal file
									
										
				@ -0,0 +1,33 @@

				--

				-- Tests for auto_explain.log_extension_options.

				--

				LOAD 'auto_explain';

				LOAD 'pg_overexplain';

				-- Various legal values with assorted quoting and whitespace choices.

				SET auto_explain.log_extension_options = '';

				SET auto_explain.log_extension_options = 'debug, RANGE_TABLE';

				SET auto_explain.log_extension_options = 'debug TRUE  ';

				SET auto_explain.log_extension_options = '   debug 1,RAnge_table "off"';

				SET auto_explain.log_extension_options = $$"debug" tRuE, range_table 'false'$$;

				-- Syntax errors.

				SET auto_explain.log_extension_options = ',';

				SET auto_explain.log_extension_options = ', range_table';

				SET auto_explain.log_extension_options = 'range_table, ';

				SET auto_explain.log_extension_options = 'range_table true false';

				SET auto_explain.log_extension_options = '"range_table';

				SET auto_explain.log_extension_options = 'range_table 3.1415nine';

				SET auto_explain.log_extension_options = 'range_table "true';

				SET auto_explain.log_extension_options = $$range_table 'true$$;

				SET auto_explain.log_extension_options = $$'$$;

				-- Unacceptable option values.

				SET auto_explain.log_extension_options = 'range_table maybe';

				SET auto_explain.log_extension_options = 'range_table 2';

				SET auto_explain.log_extension_options = 'range_table "0"';

				SET auto_explain.log_extension_options = 'range_table 3.14159';

				-- Supply enough options to force the option array to be reallocated.

				SET auto_explain.log_extension_options = 'debug, debug, debug, debug, debug, debug, debug, debug, debug, debug false';

									
										18

contrib/auto_explain/t/001_auto_explain.pl
									
										
				@ -1,5 +1,5 @@

				-# Copyright (c) 2021-2025, PostgreSQL Global Development Group
				+# Copyright (c) 2021-2026, PostgreSQL Global Development Group

				use strict;

				use warnings FATAL => 'all';

				@ -30,7 +30,7 @@ sub query_log

				my $node = PostgreSQL::Test::Cluster->new('main');

				$node->init(auth_extra => [ '--create-role' => 'regress_user1' ]);

				$node->append_conf('postgresql.conf',
				-	"session_preload_libraries = 'auto_explain'");
				+	"session_preload_libraries = 'pg_overexplain,auto_explain'");

				$node->append_conf('postgresql.conf', "auto_explain.log_min_duration = 0");

				$node->append_conf('postgresql.conf', "auto_explain.log_analyze = on");

				$node->start;

				@ -172,6 +172,20 @@ like(

					qr/"Node Type": "Index Scan"[^}]*"Index Name": "pg_class_relname_nsp_index"/s,

					"index scan logged, json mode");

				+# Extension options.
				+$log_contents = query_log($node, "SELECT 1;",
				+	{ "auto_explain.log_extension_options" => "debug" });
				+like(
				+	$log_contents,
				+	qr/Parallel Safe:/,
				+	"extension option produces per-node output");
				+like(
				+	$log_contents,
				+	qr/Command Type: select/,
				+	"extension option produces per-plan output");

				# Check that PGC_SUSET parameters can be set by non-superuser if granted,

				# otherwise not

									
										4

contrib/basebackup_to_shell/basebackup_to_shell.c
									
										
				@ -3,7 +3,7 @@

				 * basebackup_to_shell.c

				 *	  target base backup files to a shell command

				 *

				- * Copyright (c) 2016-2025, PostgreSQL Global Development Group
				+ * Copyright (c) 2016-2026, PostgreSQL Global Development Group

				 *

				 *	  contrib/basebackup_to_shell/basebackup_to_shell.c

				 *-------------------------------------------------------------------------

				@ -136,7 +136,7 @@ shell_get_sink(bbsink *next_sink, void *detail_arg)

					 * We remember the current value of basebackup_to_shell.shell_command to

					 * be certain that it can't change under us during the backup.

					 */

				-	sink = palloc0(sizeof(bbsink_shell));
				+	sink = palloc0_object(bbsink_shell);

					*((const bbsink_ops **) &sink->base.bbs_ops) = &bbsink_shell_ops;

					sink->base.bbs_next = next_sink;

					sink->target_detail = detail_arg;

									
										6

contrib/basebackup_to_shell/meson.build
									
										
				@ -1,4 +1,4 @@

				-# Copyright (c) 2022-2025, PostgreSQL Global Development Group
				+# Copyright (c) 2022-2026, PostgreSQL Global Development Group

				basebackup_to_shell_sources = files(

				  'basebackup_to_shell.c',

				@ -24,7 +24,7 @@ tests += {

				    'tests': [

				      't/001_basic.pl',

				    ],

				-    'env': {'GZIP_PROGRAM': gzip.found() ? gzip.path() : '',
				-            'TAR': tar.found() ? tar.path() : '' },
				+    'env': {'GZIP_PROGRAM': gzip.found() ? gzip.full_path() : '',
				+            'TAR': tar.found() ? tar.full_path() : '' },

				  },

				}

									
										2

contrib/basebackup_to_shell/t/001_basic.pl
									
										
				@ -1,4 +1,4 @@

				-# Copyright (c) 2021-2025, PostgreSQL Global Development Group
				+# Copyright (c) 2021-2026, PostgreSQL Global Development Group

				use strict;

				use warnings FATAL => 'all';

									
										21

contrib/basic_archive/basic_archive.c
									
										
				@ -17,7 +17,7 @@

				 * a file is successfully archived and then the system crashes before

				 * a durable record of the success has been made.

				 *

				- * Copyright (c) 2022-2025, PostgreSQL Global Development Group
				+ * Copyright (c) 2022-2026, PostgreSQL Global Development Group

				 *

				 * IDENTIFICATION

				 *	  contrib/basic_archive/basic_archive.c

				@ -65,7 +65,7 @@ void

				_PG_init(void)

				{

					DefineCustomStringVariable("basic_archive.archive_directory",

				-							   gettext_noop("Archive file destination directory."),
				+							   "Archive file destination directory.",

											   NULL,

											   &archive_directory,

											   "",

				@ -90,19 +90,17 @@ _PG_archive_module_init(void)

				/*

				 * check_archive_directory

				 *

				- * Checks that the provided archive directory exists.
				+ * Checks that the provided archive directory path isn't too long.

				 */

				static bool

				check_archive_directory(char **newval, void **extra, GucSource source)

				{

				-	struct stat st;

					/*

					 * The default value is an empty string, so we have to accept that value.

					 * Our check_configured callback also checks for this and prevents

					 * archiving from proceeding if it is still empty.

					 */

				-	if (*newval == NULL || *newval[0] == '\0')
				+	if (*newval == NULL || (*newval)[0] == '\0')

						return true;

					/*

				@ -115,17 +113,6 @@ check_archive_directory(char **newval, void **extra, GucSource source)

						return false;

					}

				-	/*
				-	 * Do a basic sanity check that the specified archive directory exists. It
				-	 * could be removed at some point in the future, so we still need to be
				-	 * prepared for it not to exist in the actual archiving logic.
				-	 */
				-	if (stat(*newval, &st) != 0 || !S_ISDIR(st.st_mode))
				-	{
				-		GUC_check_errdetail("Specified archive directory does not exist.");
				-		return false;
				-	}

					return true;

				}
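
The change from `*newval[0]` to `(*newval)[0]` in check_archive_directory turns on C operator precedence: postfix `[]` binds tighter than unary `*`, so `*newval[0]` parses as `*(newval[0])`. For index 0 both spellings read the same character, so the edit reads as a clarity fix rather than a behavior change. That reading is an inference from C's precedence rules and is not stated in the diff. The sketch below uses hypothetical helper names and a two-element array to show where the two forms diverge.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * The intended test: dereference the char ** first, then look at the first
 * character of the resulting string.
 */
static bool
is_null_or_empty(char **newval)
{
	return *newval == NULL || (*newval)[0] == '\0';
}

/* *strings[1] parses as *(strings[1]): first character of the SECOND string. */
static char
first_char_of_second_string(char **strings)
{
	return *strings[1];
}

/* (*strings)[1]: second character of the FIRST string. */
static char
second_char_of_first_string(char **strings)
{
	return (*strings)[1];
}
```

With `char *arr[] = {"ab", "cd"}`, the unparenthesized form yields `'c'` and the parenthesized form yields `'b'`. A GUC check hook receives a pointer to a single string, so only the parenthesized form is safe to index beyond 0.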

									
										2

contrib/basic_archive/meson.build
									
										
				@ -1,4 +1,4 @@

				-# Copyright (c) 2022-2025, PostgreSQL Global Development Group
				+# Copyright (c) 2022-2026, PostgreSQL Global Development Group

				basic_archive_sources = files(

				  'basic_archive.c',

									
										5

contrib/bloom/blcost.c
									
										
				@ -3,7 +3,7 @@

				 * blcost.c

				 *		Cost estimate function for bloom indexes.

				 *

				- * Copyright (c) 2016-2025, PostgreSQL Global Development Group
				+ * Copyright (c) 2016-2026, PostgreSQL Global Development Group

				 *

				 * IDENTIFICATION

				 *	  contrib/bloom/blcost.c

				@ -30,6 +30,9 @@ blcostestimate(PlannerInfo *root, IndexPath *path, double loop_count,

					/* We have to visit all index tuples anyway */

					costs.numIndexTuples = index->tuples;

				+	/* As in btcostestimate, count only the metapage as non-leaf */
				+	costs.numNonLeafPages = 1;

					/* Use generic estimate */

					genericcostestimate(root, path, loop_count, &costs);

									
										4

contrib/bloom/blinsert.c
									
										
				@ -3,7 +3,7 @@

				 * blinsert.c

				 *		Bloom index build and insert functions.

				 *

				- * Copyright (c) 2016-2025, PostgreSQL Global Development Group
				+ * Copyright (c) 2016-2026, PostgreSQL Global Development Group

				 *

				 * IDENTIFICATION

				 *	  contrib/bloom/blinsert.c

				@ -151,7 +151,7 @@ blbuild(Relation heap, Relation index, IndexInfo *indexInfo)

					MemoryContextDelete(buildstate.tmpCtx);

				-	result = (IndexBuildResult *) palloc(sizeof(IndexBuildResult));
				+	result = palloc_object(IndexBuildResult);

					result->heap_tuples = reltuples;

					result->index_tuples = buildstate.indtuples;
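
Several hunks in this commit range replace `(T *) palloc(sizeof(T))` with `palloc_object(T)` and similar macros. The point of that style is that the type is written once, so the cast and the `sizeof` cannot disagree. The sketch below shows the idea with `malloc`/`calloc`-based stand-ins. The `alloc_*` macro names and the `BuildResult` struct are hypothetical; PostgreSQL's own macros allocate from a memory context, and their definitions are not shown in this diff.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins: each names the type once and casts the result. */
#define alloc_object(type)			((type *) malloc(sizeof(type)))
#define alloc0_object(type)			((type *) calloc(1, sizeof(type)))
#define alloc0_array(type, count)	((type *) calloc((count), sizeof(type)))

typedef struct BuildResult
{
	double		heap_tuples;
	double		index_tuples;
} BuildResult;

/* Allocate and fill one result struct; returns NULL on allocation failure. */
static BuildResult *
make_result(double heap_tuples, double index_tuples)
{
	BuildResult *result = alloc_object(BuildResult);

	if (result == NULL)
		return NULL;
	result->heap_tuples = heap_tuples;
	result->index_tuples = index_tuples;
	return result;
}

/* Allocate a zero-filled signature of nwords 32-bit words. */
static uint32_t *
make_signature(size_t nwords)
{
	return alloc0_array(uint32_t, nwords);
}
```

Assigning `alloc_object(BuildResult)` to a pointer of an incompatible type draws a compiler diagnostic, whereas a bare `malloc(sizeof(...))` with a mismatched `sizeof` compiles silently.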

									
										4

contrib/bloom/bloom.h
									
										
				@ -3,7 +3,7 @@

				 * bloom.h

				 *	  Header for bloom index.

				 *

				- * Copyright (c) 2016-2025, PostgreSQL Global Development Group
				+ * Copyright (c) 2016-2026, PostgreSQL Global Development Group

				 *

				 * IDENTIFICATION

				 *	  contrib/bloom/bloom.h

				@ -72,7 +72,7 @@ typedef BloomPageOpaqueData *BloomPageOpaque;

					((BloomTuple *)(PageGetContents(page) \

						+ (state)->sizeOfBloomTuple * ((offset) - 1)))

				#define BloomPageGetNextTuple(state, tuple) \

				-	((BloomTuple *)((Pointer)(tuple) + (state)->sizeOfBloomTuple))
				+	((BloomTuple *)((char *)(tuple) + (state)->sizeOfBloomTuple))

				/* Preserved page numbers */

				#define BLOOM_METAPAGE_BLKNO	(0)

									
										36

contrib/bloom/blscan.c
									
										
				@ -3,7 +3,7 @@

				 * blscan.c

				 *		Bloom index scan functions.

				 *

				- * Copyright (c) 2016-2025, PostgreSQL Global Development Group
				+ * Copyright (c) 2016-2026, PostgreSQL Global Development Group

				 *

				 * IDENTIFICATION

				 *	  contrib/bloom/blscan.c

				@ -14,9 +14,11 @@

				#include "access/relscan.h"

				#include "bloom.h"

				+#include "executor/instrument_node.h"

				#include "miscadmin.h"

				#include "pgstat.h"

				#include "storage/bufmgr.h"

				+#include "storage/read_stream.h"

				/*

				 * Begin scan of bloom index.

				@ -29,7 +31,7 @@ blbeginscan(Relation r, int nkeys, int norderbys)

					scan = RelationGetIndexScan(r, nkeys, norderbys);

				-	so = (BloomScanOpaque) palloc(sizeof(BloomScanOpaqueData));
				+	so = (BloomScanOpaque) palloc_object(BloomScanOpaqueData);

					initBloomState(&so->state, scan->indexRelation);

					so->sign = NULL;

				@ -75,18 +77,20 @@ int64

				blgetbitmap(IndexScanDesc scan, TIDBitmap *tbm)

				{

					int64		ntids = 0;

				-	BlockNumber blkno = BLOOM_HEAD_BLKNO,
				+	BlockNumber blkno,

								npages;

					int			i;

					BufferAccessStrategy bas;

					BloomScanOpaque so = (BloomScanOpaque) scan->opaque;

				+	BlockRangeReadStreamPrivate p;
				+	ReadStream *stream;

					if (so->sign == NULL)

					{

						/* New search: have to calculate search signature */

						ScanKey		skey = scan->keyData;

				-		so->sign = palloc0(sizeof(BloomSignatureWord) * so->state.opts.bloomLength);
				+		so->sign = palloc0_array(BloomSignatureWord, so->state.opts.bloomLength);

						for (i = 0; i < scan->numberOfKeys; i++)

						{

				@ -119,14 +123,29 @@ blgetbitmap(IndexScanDesc scan, TIDBitmap *tbm)

					if (scan->instrument)

						scan->instrument->nsearches++;

				+	/* Scan all blocks except the metapage using streaming reads */
				+	p.current_blocknum = BLOOM_HEAD_BLKNO;
				+	p.last_exclusive = npages;
				+	/*
				+	 * It is safe to use batchmode as block_range_read_stream_cb takes no
				+	 * locks.
				+	 */
				+	stream = read_stream_begin_relation(READ_STREAM_FULL |
				+										READ_STREAM_USE_BATCHING,
				+										bas,
				+										scan->indexRelation,
				+										MAIN_FORKNUM,
				+										block_range_read_stream_cb,
				+										&p,
				+										0);

					for (blkno = BLOOM_HEAD_BLKNO; blkno < npages; blkno++)

					{

						Buffer		buffer;

						Page		page;

				-		buffer = ReadBufferExtended(scan->indexRelation, MAIN_FORKNUM,
				-									blkno, RBM_NORMAL, bas);
				+		buffer = read_stream_next_buffer(stream, NULL);

						LockBuffer(buffer, BUFFER_LOCK_SHARE);

						page = BufferGetPage(buffer);

				@ -162,6 +181,9 @@ blgetbitmap(IndexScanDesc scan, TIDBitmap *tbm)

						UnlockReleaseBuffer(buffer);

						CHECK_FOR_INTERRUPTS();

					}

				+	Assert(read_stream_next_buffer(stream, NULL) == InvalidBuffer);
				+	read_stream_end(stream);

					FreeAccessStrategy(bas);

					return ntids;
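
The blgetbitmap change replaces per-block `ReadBufferExtended` calls with a read stream whose callback hands out block numbers from `BLOOM_HEAD_BLKNO` up to, but not including, `npages`. The consumer loop still runs once per block, and the trailing `Assert` checks that the stream is exhausted afterwards. The sketch below models only that contract, with no PostgreSQL types. `BlockNum`, `INVALID_BLOCK`, `BlockRange`, `next_block` and `drain` are all hypothetical names chosen for illustration, not the server's API.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t BlockNum;

#define INVALID_BLOCK ((BlockNum) 0xFFFFFFFF)

/* Private state for the callback: a half-open range [current, last_exclusive). */
typedef struct BlockRange
{
	BlockNum	current;
	BlockNum	last_exclusive;
} BlockRange;

/*
 * Callback in the style of a block-range stream source: return the next
 * block number, or INVALID_BLOCK once the range is used up.  It takes no
 * locks and touches only its private state.
 */
static BlockNum
next_block(void *private_data)
{
	BlockRange *range = (BlockRange *) private_data;

	if (range->current < range->last_exclusive)
		return range->current++;
	return INVALID_BLOCK;
}

/*
 * Consumer: pull up to "cap" blocks into "out" and return how many were
 * produced.  A real consumer would lock and inspect each buffer here.
 */
static int
drain(BlockRange *range, BlockNum *out, int cap)
{
	int			n = 0;

	while (n < cap)
	{
		BlockNum	blk = next_block(range);

		if (blk == INVALID_BLOCK)
			break;
		out[n++] = blk;
	}
	return n;
}
```

Starting the range at 1 rather than 0 matches the "all blocks except the metapage" comment in the diff, under the assumption that the metapage is block 0.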

									
										117

contrib/bloom/blutils.c
									
										
				@ -3,7 +3,7 @@

				 * blutils.c

				 *		Bloom index utilities.

				 *

				- * Portions Copyright (c) 2016-2025, PostgreSQL Global Development Group
				+ * Portions Copyright (c) 2016-2026, PostgreSQL Global Development Group

				 * Portions Copyright (c) 1990-1993, Regents of the University of California

				 *

				 * IDENTIFICATION

				@ -86,7 +86,7 @@ makeDefaultBloomOptions(void)

					BloomOptions *opts;

					int			i;

				-	opts = (BloomOptions *) palloc0(sizeof(BloomOptions));
				+	opts = palloc0_object(BloomOptions);

					/* Convert DEFAULT_BLOOM_LENGTH from # of bits to # of words */

					opts->bloomLength = (DEFAULT_BLOOM_LENGTH + SIGNWORDBITS - 1) / SIGNWORDBITS;

					for (i = 0; i < INDEX_MAX_KEYS; i++)

				@ -102,61 +102,62 @@ makeDefaultBloomOptions(void)

				Datum

				blhandler(PG_FUNCTION_ARGS)

				{

				-	IndexAmRoutine *amroutine = makeNode(IndexAmRoutine);
				+	static const IndexAmRoutine amroutine = {
				+		.type = T_IndexAmRoutine,
				+		.amstrategies = BLOOM_NSTRATEGIES,
				+		.amsupport = BLOOM_NPROC,
				+		.amoptsprocnum = BLOOM_OPTIONS_PROC,
				+		.amcanorder = false,
				+		.amcanorderbyop = false,
				+		.amcanhash = false,
				+		.amconsistentequality = false,
				+		.amconsistentordering = false,
				+		.amcanbackward = false,
				+		.amcanunique = false,
				+		.amcanmulticol = true,
				+		.amoptionalkey = true,
				+		.amsearcharray = false,
				+		.amsearchnulls = false,
				+		.amstorage = false,
				+		.amclusterable = false,
				+		.ampredlocks = false,
				+		.amcanparallel = false,
				+		.amcanbuildparallel = false,
				+		.amcaninclude = false,
				+		.amusemaintenanceworkmem = false,
				+		.amparallelvacuumoptions =
				+		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_CLEANUP,
				+		.amkeytype = InvalidOid,
				-	amroutine->amstrategies = BLOOM_NSTRATEGIES;
				-	amroutine->amsupport = BLOOM_NPROC;
				-	amroutine->amoptsprocnum = BLOOM_OPTIONS_PROC;
				-	amroutine->amcanorder = false;
				-	amroutine->amcanorderbyop = false;
				-	amroutine->amcanhash = false;
				-	amroutine->amconsistentequality = false;
				-	amroutine->amconsistentordering = false;
				-	amroutine->amcanbackward = false;
				-	amroutine->amcanunique = false;
				-	amroutine->amcanmulticol = true;
				-	amroutine->amoptionalkey = true;
				-	amroutine->amsearcharray = false;
				-	amroutine->amsearchnulls = false;
				-	amroutine->amstorage = false;
				-	amroutine->amclusterable = false;
				-	amroutine->ampredlocks = false;
				-	amroutine->amcanparallel = false;
				-	amroutine->amcanbuildparallel = false;
				-	amroutine->amcaninclude = false;
				-	amroutine->amusemaintenanceworkmem = false;
				-	amroutine->amparallelvacuumoptions =
				-		VACUUM_OPTION_PARALLEL_BULKDEL | VACUUM_OPTION_PARALLEL_CLEANUP;
				-	amroutine->amkeytype = InvalidOid;
				+		.ambuild = blbuild,
				+		.ambuildempty = blbuildempty,
				+		.aminsert = blinsert,
				+		.aminsertcleanup = NULL,
				+		.ambulkdelete = blbulkdelete,
				+		.amvacuumcleanup = blvacuumcleanup,
				+		.amcanreturn = NULL,
				+		.amcostestimate = blcostestimate,
				+		.amgettreeheight = NULL,
				+		.amoptions = bloptions,
				+		.amproperty = NULL,
				+		.ambuildphasename = NULL,
				+		.amvalidate = blvalidate,
				+		.amadjustmembers = NULL,
				+		.ambeginscan = blbeginscan,
				+		.amrescan = blrescan,
				+		.amgettuple = NULL,
				+		.amgetbitmap = blgetbitmap,
				+		.amendscan = blendscan,
				+		.ammarkpos = NULL,
				+		.amrestrpos = NULL,
				+		.amestimateparallelscan = NULL,
				+		.aminitparallelscan = NULL,
				+		.amparallelrescan = NULL,
				+		.amtranslatestrategy = NULL,
				+		.amtranslatecmptype = NULL,
				+	};
				-	amroutine->ambuild = blbuild;
				-	amroutine->ambuildempty = blbuildempty;
				-	amroutine->aminsert = blinsert;
				-	amroutine->aminsertcleanup = NULL;
				-	amroutine->ambulkdelete = blbulkdelete;
				-	amroutine->amvacuumcleanup = blvacuumcleanup;
				-	amroutine->amcanreturn = NULL;
				-	amroutine->amcostestimate = blcostestimate;
				-	amroutine->amgettreeheight = NULL;
				-	amroutine->amoptions = bloptions;
				-	amroutine->amproperty = NULL;
				-	amroutine->ambuildphasename = NULL;
				-	amroutine->amvalidate = blvalidate;
				-	amroutine->amadjustmembers = NULL;
				-	amroutine->ambeginscan = blbeginscan;
				-	amroutine->amrescan = blrescan;
				-	amroutine->amgettuple = NULL;
				-	amroutine->amgetbitmap = blgetbitmap;
				-	amroutine->amendscan = blendscan;
				-	amroutine->ammarkpos = NULL;
				-	amroutine->amrestrpos = NULL;
				-	amroutine->amestimateparallelscan = NULL;
				-	amroutine->aminitparallelscan = NULL;
				-	amroutine->amparallelrescan = NULL;
				-	amroutine->amtranslatestrategy = NULL;
				-	amroutine->amtranslatecmptype = NULL;
				-	PG_RETURN_POINTER(amroutine);
				+	PG_RETURN_POINTER(&amroutine);

				}

				/*

				@ -324,7 +325,7 @@ BloomPageAddItem(BloomState *state, Page page, BloomTuple *tuple)

				{

					BloomTuple *itup;

					BloomPageOpaque opaque;

				-	Pointer		ptr;
				+	char	   *ptr;

					/* We shouldn't be pointed to an invalid page */

					Assert(!PageIsNew(page) && !BloomPageIsDeleted(page));

				@ -336,11 +337,11 @@ BloomPageAddItem(BloomState *state, Page page, BloomTuple *tuple)

					/* Copy new tuple to the end of page */

					opaque = BloomPageGetOpaque(page);

					itup = BloomPageGetTuple(state, page, opaque->maxoff + 1);

				-	memcpy((Pointer) itup, (Pointer) tuple, state->sizeOfBloomTuple);
				+	memcpy(itup, tuple, state->sizeOfBloomTuple);

					/* Adjust maxoff and pd_lower */

					opaque->maxoff++;

				-	ptr = (Pointer) BloomPageGetTuple(state, page, opaque->maxoff + 1);
				+	ptr = (char *) BloomPageGetTuple(state, page, opaque->maxoff + 1);

					((PageHeader) page)->pd_lower = ptr - page;

					/* Assert we didn't overrun available space */
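
The blhandler hunk earlier in this file's diff swaps a freshly allocated, field-by-field populated routine struct for a single `static const` struct built with designated initializers, and returns its address. The sketch below shows the same shape on a small struct. `HandlerRoutine`, `demo_build` and `demo_handler` are hypothetical names, not the index AM API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct HandlerRoutine
{
	int			nstrategies;
	bool		can_multicol;
	int			(*build) (int);
	int			(*cleanup) (int);
} HandlerRoutine;

static int
demo_build(int x)
{
	return x * 2;
}

/*
 * Return a pointer to one immutable, statically initialized routine table.
 * No allocation happens per call, and members not named in the initializer
 * (here, cleanup) are zero-initialized, so the pointer is NULL.
 */
static const HandlerRoutine *
demo_handler(void)
{
	static const HandlerRoutine routine = {
		.nstrategies = 1,
		.can_multicol = true,
		.build = demo_build,
	};

	return &routine;
}
```

Every call returns the same address, and callers cannot modify the table through the `const` pointer. The diff still spells out its `NULL` and `false` members explicitly, which keeps the full field list visible even though C would zero them anyway.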

									
										68

contrib/bloom/blvacuum.c
									
										
				@ -3,7 +3,7 @@

				 * blvacuum.c

				 *		Bloom VACUUM functions.

				 *

				- * Copyright (c) 2016-2025, PostgreSQL Global Development Group
				+ * Copyright (c) 2016-2026, PostgreSQL Global Development Group

				 *

				 * IDENTIFICATION

				 *	  contrib/bloom/blvacuum.c

				@ -17,6 +17,7 @@

				#include "commands/vacuum.h"

				#include "storage/bufmgr.h"

				#include "storage/indexfsm.h"

				+#include "storage/read_stream.h"

				/*

				@ -40,9 +41,11 @@ blbulkdelete(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,

					Page		page;

					BloomMetaPageData *metaData;

					GenericXLogState *gxlogState;

				+	BlockRangeReadStreamPrivate p;
				+	ReadStream *stream;

					if (stats == NULL)

				-		stats = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));
				+		stats = palloc0_object(IndexBulkDeleteResult);

					initBloomState(&state, index);

				@ -51,6 +54,25 @@ blbulkdelete(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,

					 * they can't contain tuples to delete.

					 */

					npages = RelationGetNumberOfBlocks(index);

					/* Scan all blocks except the metapage using streaming reads */

					p.current_blocknum = BLOOM_HEAD_BLKNO;

					p.last_exclusive = npages;

					/*

					 * It is safe to use batchmode as block_range_read_stream_cb takes no

					 * locks.

					 */

					stream = read_stream_begin_relation(READ_STREAM_MAINTENANCE |

														READ_STREAM_FULL |

														READ_STREAM_USE_BATCHING,

														info->strategy,

														index,

														MAIN_FORKNUM,

														block_range_read_stream_cb,

														&p,

														0);

					for (blkno = BLOOM_HEAD_BLKNO; blkno < npages; blkno++)

					{

						BloomTuple *itup,

				@ -59,8 +81,7 @@ blbulkdelete(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,

						vacuum_delay_point(false);

						buffer = ReadBufferExtended(index, MAIN_FORKNUM, blkno,

													RBM_NORMAL, info->strategy);

						buffer = read_stream_next_buffer(stream, NULL);

						LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);

						gxlogState = GenericXLogStart(index);

				@ -94,8 +115,7 @@ blbulkdelete(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,

							{

								/* No; copy it to itupPtr++, but skip copy if not needed */

								if (itupPtr != itup)

									memmove((Pointer) itupPtr, (Pointer) itup,

											state.sizeOfBloomTuple);

									memmove(itupPtr, itup, state.sizeOfBloomTuple);

								itupPtr = BloomPageGetNextTuple(&state, itupPtr);

							}

				@ -122,7 +142,7 @@ blbulkdelete(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,

							if (BloomPageGetMaxOffset(page) == 0)

								BloomPageSetDeleted(page);

							/* Adjust pd_lower */

							((PageHeader) page)->pd_lower = (Pointer) itupPtr - page;

							((PageHeader) page)->pd_lower = (char *) itupPtr - page;

							/* Finish WAL-logging */

							GenericXLogFinish(gxlogState);

						}

				@ -134,6 +154,9 @@ blbulkdelete(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,

						UnlockReleaseBuffer(buffer);

					}

					Assert(read_stream_next_buffer(stream, NULL) == InvalidBuffer);

					read_stream_end(stream);

					/*

					 * Update the metapage's notFullPage list with whatever we found.  Our

					 * info could already be out of date at this point, but blinsert() will

				@ -167,12 +190,14 @@ blvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)

					Relation	index = info->index;

					BlockNumber npages,

								blkno;

					BlockRangeReadStreamPrivate p;

					ReadStream *stream;

					if (info->analyze_only)

						return stats;

					if (stats == NULL)

						stats = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));

						stats = palloc0_object(IndexBulkDeleteResult);

					/*

					 * Iterate over the pages: insert deleted pages into FSM and collect

				@ -182,6 +207,25 @@ blvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)

					stats->num_pages = npages;

					stats->pages_free = 0;

					stats->num_index_tuples = 0;

					/* Scan all blocks except the metapage using streaming reads */

					p.current_blocknum = BLOOM_HEAD_BLKNO;

					p.last_exclusive = npages;

					/*

					 * It is safe to use batchmode as block_range_read_stream_cb takes no

					 * locks.

					 */

					stream = read_stream_begin_relation(READ_STREAM_MAINTENANCE |

														READ_STREAM_FULL |

														READ_STREAM_USE_BATCHING,

														info->strategy,

														index,

														MAIN_FORKNUM,

														block_range_read_stream_cb,

														&p,

														0);

					for (blkno = BLOOM_HEAD_BLKNO; blkno < npages; blkno++)

					{

						Buffer		buffer;

				@ -189,10 +233,9 @@ blvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)

						vacuum_delay_point(false);

						buffer = ReadBufferExtended(index, MAIN_FORKNUM, blkno,

													RBM_NORMAL, info->strategy);

						buffer = read_stream_next_buffer(stream, NULL);

						LockBuffer(buffer, BUFFER_LOCK_SHARE);

						page = (Page) BufferGetPage(buffer);

						page = BufferGetPage(buffer);

						if (PageIsNew(page) || BloomPageIsDeleted(page))

						{

				@ -207,6 +250,9 @@ blvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)

						UnlockReleaseBuffer(buffer);

					}

					Assert(read_stream_next_buffer(stream, NULL) == InvalidBuffer);

					read_stream_end(stream);

					IndexFreeSpaceMapVacuum(info->index);

					return stats;

									
										2

contrib/bloom/blvalidate.c
									
										
				@ -3,7 +3,7 @@

				 * blvalidate.c

				 *	  Opclass validator for bloom.

				 *

				 * Copyright (c) 2016-2025, PostgreSQL Global Development Group

				 * Copyright (c) 2016-2026, PostgreSQL Global Development Group

				 *

				 * IDENTIFICATION

				 *	  contrib/bloom/blvalidate.c

									
										2

contrib/bloom/meson.build
									
										
				@ -1,4 +1,4 @@

				# Copyright (c) 2022-2025, PostgreSQL Global Development Group

				# Copyright (c) 2022-2026, PostgreSQL Global Development Group

				bloom_sources = files(

				  'blcost.c',

									
										2

contrib/bloom/t/001_wal.pl
									
										
				@ -1,5 +1,5 @@

				# Copyright (c) 2021-2025, PostgreSQL Global Development Group

				# Copyright (c) 2021-2026, PostgreSQL Global Development Group

				# Test generic xlog record work for bloom index replication.

				use strict;

									
										2

contrib/bool_plperl/meson.build
									
										
				@ -1,4 +1,4 @@

				# Copyright (c) 2022-2025, PostgreSQL Global Development Group

				# Copyright (c) 2022-2026, PostgreSQL Global Development Group

				if not perl_dep.found()

				  subdir_done()

									
										2

contrib/btree_gin/Makefile
									
										
				@ -7,7 +7,7 @@ OBJS = \

				EXTENSION = btree_gin

				DATA = btree_gin--1.0.sql btree_gin--1.0--1.1.sql btree_gin--1.1--1.2.sql \

					 btree_gin--1.2--1.3.sql

					 btree_gin--1.2--1.3.sql btree_gin--1.3--1.4.sql

				PGFILEDESC = "btree_gin - B-tree equivalent GIN operator classes"

				REGRESS = install_btree_gin int2 int4 int8 float4 float8 money oid \

									
										151

contrib/btree_gin/btree_gin--1.3--1.4.sql
									
										Normal file
									
										
				@ -0,0 +1,151 @@

				/* contrib/btree_gin/btree_gin--1.3--1.4.sql */

				-- complain if script is sourced in psql, rather than via CREATE EXTENSION

				\echo Use "ALTER EXTENSION btree_gin UPDATE TO '1.4'" to load this file. \quit

				--

				-- Cross-type operator support is new in 1.4.  We only need to worry

				-- about this for cross-type operators that exist in core.

				--

				-- Because the opclass extractQuery and consistent methods don't directly

				-- get any information about the datatype of the RHS value, we have to

				-- encode that in the operator strategy numbers.  The strategy numbers

				-- are the operator's normal btree strategy (1-5) plus 16 times a code

				-- for the RHS datatype.

				--

				ALTER OPERATOR FAMILY int2_ops USING gin

				ADD

				    -- Code 1: RHS is int4

				    OPERATOR        0x11    < (int2, int4),

				    OPERATOR        0x12    <= (int2, int4),

				    OPERATOR        0x13    = (int2, int4),

				    OPERATOR        0x14    >= (int2, int4),

				    OPERATOR        0x15    > (int2, int4),

				    -- Code 2: RHS is int8

				    OPERATOR        0x21    < (int2, int8),

				    OPERATOR        0x22    <= (int2, int8),

				    OPERATOR        0x23    = (int2, int8),

				    OPERATOR        0x24    >= (int2, int8),

				    OPERATOR        0x25    > (int2, int8)

				;

				ALTER OPERATOR FAMILY int4_ops USING gin

				ADD

				    -- Code 1: RHS is int2

				    OPERATOR        0x11    < (int4, int2),

				    OPERATOR        0x12    <= (int4, int2),

				    OPERATOR        0x13    = (int4, int2),

				    OPERATOR        0x14    >= (int4, int2),

				    OPERATOR        0x15    > (int4, int2),

				    -- Code 2: RHS is int8

				    OPERATOR        0x21    < (int4, int8),

				    OPERATOR        0x22    <= (int4, int8),

				    OPERATOR        0x23    = (int4, int8),

				    OPERATOR        0x24    >= (int4, int8),

				    OPERATOR        0x25    > (int4, int8)

				;

				ALTER OPERATOR FAMILY int8_ops USING gin

				ADD

				    -- Code 1: RHS is int2

				    OPERATOR        0x11    < (int8, int2),

				    OPERATOR        0x12    <= (int8, int2),

				    OPERATOR        0x13    = (int8, int2),

				    OPERATOR        0x14    >= (int8, int2),

				    OPERATOR        0x15    > (int8, int2),

				    -- Code 2: RHS is int4

				    OPERATOR        0x21    < (int8, int4),

				    OPERATOR        0x22    <= (int8, int4),

				    OPERATOR        0x23    = (int8, int4),

				    OPERATOR        0x24    >= (int8, int4),

				    OPERATOR        0x25    > (int8, int4)

				;

				ALTER OPERATOR FAMILY float4_ops USING gin

				ADD

				    -- Code 1: RHS is float8

				    OPERATOR        0x11    < (float4, float8),

				    OPERATOR        0x12    <= (float4, float8),

				    OPERATOR        0x13    = (float4, float8),

				    OPERATOR        0x14    >= (float4, float8),

				    OPERATOR        0x15    > (float4, float8)

				;

				ALTER OPERATOR FAMILY float8_ops USING gin

				ADD

				    -- Code 1: RHS is float4

				    OPERATOR        0x11    < (float8, float4),

				    OPERATOR        0x12    <= (float8, float4),

				    OPERATOR        0x13    = (float8, float4),

				    OPERATOR        0x14    >= (float8, float4),

				    OPERATOR        0x15    > (float8, float4)

				;

				ALTER OPERATOR FAMILY text_ops USING gin

				ADD

				    -- Code 1: RHS is name

				    OPERATOR        0x11    < (text, name),

				    OPERATOR        0x12    <= (text, name),

				    OPERATOR        0x13    = (text, name),

				    OPERATOR        0x14    >= (text, name),

				    OPERATOR        0x15    > (text, name)

				;

				ALTER OPERATOR FAMILY name_ops USING gin

				ADD

				    -- Code 1: RHS is text

				    OPERATOR        0x11    < (name, text),

				    OPERATOR        0x12    <= (name, text),

				    OPERATOR        0x13    = (name, text),

				    OPERATOR        0x14    >= (name, text),

				    OPERATOR        0x15    > (name, text)

				;

				ALTER OPERATOR FAMILY date_ops USING gin

				ADD

				    -- Code 1: RHS is timestamp

				    OPERATOR        0x11    < (date, timestamp),

				    OPERATOR        0x12    <= (date, timestamp),

				    OPERATOR        0x13    = (date, timestamp),

				    OPERATOR        0x14    >= (date, timestamp),

				    OPERATOR        0x15    > (date, timestamp),

				    -- Code 2: RHS is timestamptz

				    OPERATOR        0x21    < (date, timestamptz),

				    OPERATOR        0x22    <= (date, timestamptz),

				    OPERATOR        0x23    = (date, timestamptz),

				    OPERATOR        0x24    >= (date, timestamptz),

				    OPERATOR        0x25    > (date, timestamptz)

				;

				ALTER OPERATOR FAMILY timestamp_ops USING gin

				ADD

				    -- Code 1: RHS is date

				    OPERATOR        0x11    < (timestamp, date),

				    OPERATOR        0x12    <= (timestamp, date),

				    OPERATOR        0x13    = (timestamp, date),

				    OPERATOR        0x14    >= (timestamp, date),

				    OPERATOR        0x15    > (timestamp, date),

				    -- Code 2: RHS is timestamptz

				    OPERATOR        0x21    < (timestamp, timestamptz),

				    OPERATOR        0x22    <= (timestamp, timestamptz),

				    OPERATOR        0x23    = (timestamp, timestamptz),

				    OPERATOR        0x24    >= (timestamp, timestamptz),

				    OPERATOR        0x25    > (timestamp, timestamptz)

				;

				ALTER OPERATOR FAMILY timestamptz_ops USING gin

				ADD

				    -- Code 1: RHS is date

				    OPERATOR        0x11    < (timestamptz, date),

				    OPERATOR        0x12    <= (timestamptz, date),

				    OPERATOR        0x13    = (timestamptz, date),

				    OPERATOR        0x14    >= (timestamptz, date),

				    OPERATOR        0x15    > (timestamptz, date),

				    -- Code 2: RHS is timestamp

				    OPERATOR        0x21    < (timestamptz, timestamp),

				    OPERATOR        0x22    <= (timestamptz, timestamp),

				    OPERATOR        0x23    = (timestamptz, timestamp),

				    OPERATOR        0x24    >= (timestamptz, timestamp),

				    OPERATOR        0x25    > (timestamptz, timestamp)

				;

									
										650

contrib/btree_gin/btree_gin.c
									
										
				@ -6,6 +6,8 @@

				#include <limits.h>

				#include "access/stratnum.h"

				#include "mb/pg_wchar.h"

				#include "nodes/miscnodes.h"

				#include "utils/builtins.h"

				#include "utils/date.h"

				#include "utils/float.h"

				@ -13,20 +15,36 @@

				#include "utils/numeric.h"

				#include "utils/timestamp.h"

				#include "utils/uuid.h"

				#include "varatt.h"

				PG_MODULE_MAGIC_EXT(

									.name = "btree_gin",

									.version = PG_VERSION

				);

				/*

				 * Our opclasses use the same strategy numbers as btree (1-5) for same-type

				 * comparison operators.  For cross-type comparison operators, the

				 * low 4 bits of our strategy numbers are the btree strategy number,

				 * and the upper bits are a code for the right-hand-side data type.

				 */

				#define BTGIN_GET_BTREE_STRATEGY(strat)		((strat) & 0x0F)

				#define BTGIN_GET_RHS_TYPE_CODE(strat)		((strat) >> 4)

				/* extra data passed from gin_btree_extract_query to gin_btree_compare_prefix */

				typedef struct QueryInfo

				{

					StrategyNumber strategy;

					Datum		datum;

					bool		is_varlena;

					Datum		(*typecmp) (FunctionCallInfo);

					StrategyNumber strategy;	/* operator strategy number */

					Datum		orig_datum;		/* original query (comparison) datum */

					Datum		entry_datum;	/* datum we reported as the entry value */

					PGFunction	typecmp;		/* appropriate btree comparison function */

				} QueryInfo;

				typedef Datum (*btree_gin_convert_function) (Datum input);

				typedef Datum (*btree_gin_leftmost_function) (void);

				/*** GIN support functions shared by all datatypes ***/

				static Datum

				@ -34,8 +52,9 @@ gin_btree_extract_value(FunctionCallInfo fcinfo, bool is_varlena)

				{

					Datum		datum = PG_GETARG_DATUM(0);

					int32	   *nentries = (int32 *) PG_GETARG_POINTER(1);

					Datum	   *entries = (Datum *) palloc(sizeof(Datum));

					Datum	   *entries = palloc_object(Datum);

					/* Ensure that values stored in the index are not toasted */

					if (is_varlena)

						datum = PointerGetDatum(PG_DETOAST_DATUM(datum));

					entries[0] = datum;

				@ -44,42 +63,54 @@ gin_btree_extract_value(FunctionCallInfo fcinfo, bool is_varlena)

					PG_RETURN_POINTER(entries);

				}

				/*

				 * For BTGreaterEqualStrategyNumber, BTGreaterStrategyNumber, and

				 * BTEqualStrategyNumber we want to start the index scan at the

				 * supplied query datum, and work forward. For BTLessStrategyNumber

				 * and BTLessEqualStrategyNumber, we need to start at the leftmost

				 * key, and work forward until the supplied query datum (which must be

				 * sent along inside the QueryInfo structure).

				 */

				static Datum

				gin_btree_extract_query(FunctionCallInfo fcinfo,

										bool is_varlena,

										Datum (*leftmostvalue) (void),

										Datum (*typecmp) (FunctionCallInfo))

										btree_gin_leftmost_function leftmostvalue,

										const bool *rhs_is_varlena,

										const btree_gin_convert_function *cvt_fns,

										const PGFunction *cmp_fns)

				{

					Datum		datum = PG_GETARG_DATUM(0);

					int32	   *nentries = (int32 *) PG_GETARG_POINTER(1);

					StrategyNumber strategy = PG_GETARG_UINT16(2);

					bool	  **partialmatch = (bool **) PG_GETARG_POINTER(3);

					Pointer   **extra_data = (Pointer **) PG_GETARG_POINTER(4);

					Datum	   *entries = (Datum *) palloc(sizeof(Datum));

					QueryInfo  *data = (QueryInfo *) palloc(sizeof(QueryInfo));

					bool	   *ptr_partialmatch;

					Datum	   *entries = palloc_object(Datum);

					QueryInfo  *data = palloc_object(QueryInfo);

					bool	   *ptr_partialmatch = palloc_object(bool);

					int			btree_strat,

								rhs_code;

					*nentries = 1;

					ptr_partialmatch = *partialmatch = (bool *) palloc(sizeof(bool));

					*ptr_partialmatch = false;

					if (is_varlena)

					/*

					 * Extract the btree strategy code and the RHS data type code from the

					 * given strategy number.

					 */

					btree_strat = BTGIN_GET_BTREE_STRATEGY(strategy);

					rhs_code = BTGIN_GET_RHS_TYPE_CODE(strategy);

					/*

					 * Detoast the comparison datum.  This isn't necessary for correctness,

					 * but it can save repeat detoastings within the comparison function.

					 */

					if (rhs_is_varlena[rhs_code])

						datum = PointerGetDatum(PG_DETOAST_DATUM(datum));

					data->strategy = strategy;

					data->datum = datum;

					data->is_varlena = is_varlena;

					data->typecmp = typecmp;

					*extra_data = (Pointer *) palloc(sizeof(Pointer));

					**extra_data = (Pointer) data;

					switch (strategy)

					/* Prep single comparison key with possible partial-match flag */

					*nentries = 1;

					*partialmatch = ptr_partialmatch;

					*ptr_partialmatch = false;

					/*

					 * For BTGreaterEqualStrategyNumber, BTGreaterStrategyNumber, and

					 * BTEqualStrategyNumber we want to start the index scan at the supplied

					 * query datum, and work forward.  For BTLessStrategyNumber and

					 * BTLessEqualStrategyNumber, we need to start at the leftmost key, and

					 * work forward until the supplied query datum (which we'll send along

					 * inside the QueryInfo structure).  Use partial match rules except for

					 * BTEqualStrategyNumber without a conversion function.  (If there is a

					 * conversion function, comparison to the entry value is not trustworthy.)

					 */

					switch (btree_strat)

					{

						case BTLessStrategyNumber:

						case BTLessEqualStrategyNumber:

				@ -89,77 +120,108 @@ gin_btree_extract_query(FunctionCallInfo fcinfo,

						case BTGreaterEqualStrategyNumber:

						case BTGreaterStrategyNumber:

							*ptr_partialmatch = true;

							/* FALLTHROUGH */

							pg_fallthrough;

						case BTEqualStrategyNumber:

							entries[0] = datum;

							/* If we have a conversion function, apply it */

							if (cvt_fns && cvt_fns[rhs_code])

							{

								entries[0] = (*cvt_fns[rhs_code]) (datum);

								*ptr_partialmatch = true;

							}

							else

								entries[0] = datum;

							break;

						default:

							elog(ERROR, "unrecognized strategy number: %d", strategy);

					}

					/* Fill "extra" data */

					data->strategy = strategy;

					data->orig_datum = datum;

					data->entry_datum = entries[0];

					data->typecmp = cmp_fns[rhs_code];

					*extra_data = palloc_object(Pointer);

					**extra_data = (Pointer) data;

					PG_RETURN_POINTER(entries);

				}

				/*

				 * Datum a is a value from extract_query method and for BTLess*

				 * strategy it is a left-most value.  So, use original datum from QueryInfo

				 * to decide to stop scanning or not.  Datum b is always from index.

				 */

				static Datum

				gin_btree_compare_prefix(FunctionCallInfo fcinfo)

				{

					Datum		a = PG_GETARG_DATUM(0);

					Datum		b = PG_GETARG_DATUM(1);

					Datum		partial_key PG_USED_FOR_ASSERTS_ONLY = PG_GETARG_DATUM(0);

					Datum		key = PG_GETARG_DATUM(1);

					QueryInfo  *data = (QueryInfo *) PG_GETARG_POINTER(3);

					int32		res,

								cmp;

					/*

					 * partial_key is only an approximation to the real comparison value,

					 * especially if it's a leftmost value.  We can get an accurate answer by

					 * doing a possibly-cross-type comparison to the real comparison value.

					 * (Note that partial_key and key are of the indexed datatype while

					 * orig_datum is of the query operator's RHS datatype.)

					 *

					 * But just to be sure that things are what we expect, let's assert that

					 * partial_key is indeed what gin_btree_extract_query reported, so that

					 * we'll notice if anyone ever changes the core code in a way that breaks

					 * our assumptions.

					 */

					Assert(partial_key == data->entry_datum);

					cmp = DatumGetInt32(CallerFInfoFunctionCall2(data->typecmp,

																 fcinfo->flinfo,

																 PG_GET_COLLATION(),

																 (data->strategy == BTLessStrategyNumber ||

																  data->strategy == BTLessEqualStrategyNumber)

																 ? data->datum : a,

																 b));

																 data->orig_datum,

																 key));

					switch (data->strategy)

					/*

					 * Convert the comparison result to the correct thing for the search

					 * operator strategy.  When dealing with cross-type comparisons, an

					 * imprecise entry datum could lead GIN to start the scan just before the

					 * first possible match, so we must continue the scan if the current index

					 * entry doesn't satisfy the search condition for >= and > cases.  But if

					 * that happens in an = search we can stop, because an imprecise entry

					 * datum means that the search value is unrepresentable in the indexed

					 * data type, so that there will be no exact matches.

					 */

					switch (BTGIN_GET_BTREE_STRATEGY(data->strategy))

					{

						case BTLessStrategyNumber:

							/* If original datum > indexed one then return match */

							if (cmp > 0)

								res = 0;

							else

								res = 1;

								res = 1;		/* end scan */

							break;

						case BTLessEqualStrategyNumber:

							/* The same except equality */

							/* If original datum >= indexed one then return match */

							if (cmp >= 0)

								res = 0;

							else

								res = 1;

								res = 1;		/* end scan */

							break;

						case BTEqualStrategyNumber:

							if (cmp != 0)

								res = 1;

							else

							/* If original datum = indexed one then return match */

							/* See above about why we can end scan when cmp < 0 */

							if (cmp == 0)

								res = 0;

							else

								res = 1;		/* end scan */

							break;

						case BTGreaterEqualStrategyNumber:

							/* If original datum <= indexed one then return match */

							if (cmp <= 0)

								res = 0;

							else

								res = 1;

								res = -1;		/* keep scanning */

							break;

						case BTGreaterStrategyNumber:

							/* If original datum <= indexed one then return match */

							/* If original datum == indexed one then continue scan */

							/* If original datum < indexed one then return match */

							if (cmp < 0)

								res = 0;

							else if (cmp == 0)

								res = -1;

							else

								res = 1;

								res = -1;		/* keep scanning */

							break;

						default:

							elog(ERROR, "unrecognized strategy number: %d",

				@ -182,19 +244,20 @@ gin_btree_consistent(PG_FUNCTION_ARGS)

				/*** GIN_SUPPORT macro defines the datatype specific functions ***/

				#define GIN_SUPPORT(type, is_varlena, leftmostvalue, typecmp)				\

				#define GIN_SUPPORT(type, leftmostvalue, is_varlena, cvtfns, cmpfns)		\

				PG_FUNCTION_INFO_V1(gin_extract_value_##type);								\

				Datum																		\

				gin_extract_value_##type(PG_FUNCTION_ARGS)									\

				{																			\

					return gin_btree_extract_value(fcinfo, is_varlena);						\

					return gin_btree_extract_value(fcinfo, is_varlena[0]);					\

				}	\

				PG_FUNCTION_INFO_V1(gin_extract_query_##type);								\

				Datum																		\

				gin_extract_query_##type(PG_FUNCTION_ARGS)									\

				{																			\

					return gin_btree_extract_query(fcinfo,									\

												   is_varlena, leftmostvalue, typecmp);		\

												   leftmostvalue, is_varlena,				\

												   cvtfns, cmpfns);							\

				}	\

				PG_FUNCTION_INFO_V1(gin_compare_prefix_##type);								\

				Datum																		\

				@ -206,13 +269,66 @@ gin_compare_prefix_##type(PG_FUNCTION_ARGS)									\

				/*** Datatype specifications ***/

				/* Function to produce the least possible value of the indexed datatype */

				static Datum

				leftmostvalue_int2(void)

				{

					return Int16GetDatum(SHRT_MIN);

				}

				GIN_SUPPORT(int2, false, leftmostvalue_int2, btint2cmp)

				/*

				 * For cross-type support, we must provide conversion functions that produce

				 * a Datum of the indexed datatype, since GIN requires the "entry" datums to

				 * be of that type.  If an exact conversion is not possible, produce a value

				 * that will lead GIN to find the first index entry that is greater than

				 * or equal to the actual comparison value.  (But rounding down is OK, so

				 * sometimes we might find an index entry that's just less than the

				 * comparison value.)

				 *

				 * For integer values, it's sufficient to clamp the input to be in-range.

				 *

				 * Note: for out-of-range input values, we could in theory detect that the

				 * search condition matches all or none of the index, and avoid a useless

				 * index descent in the latter case.  Such searches are probably rare though,

				 * so we don't contort this code enough to do that.

				 */

				static Datum

				cvt_int4_int2(Datum input)

				{

					int32		val = DatumGetInt32(input);

					val = Max(val, SHRT_MIN);

					val = Min(val, SHRT_MAX);

					return Int16GetDatum((int16) val);

				}

				static Datum

				cvt_int8_int2(Datum input)

				{

					int64		val = DatumGetInt64(input);

					val = Max(val, SHRT_MIN);

					val = Min(val, SHRT_MAX);

					return Int16GetDatum((int16) val);

				}

				/*

				 * RHS-type-is-varlena flags, conversion and comparison function arrays,

				 * indexed by high bits of the operator strategy number.  A NULL in the

				 * conversion function array indicates that no conversion is needed, which

				 * will always be the case for the zero'th entry.  Note that the cross-type

				 * comparison functions should be the ones with the indexed datatype second.

				 */

				static const bool int2_rhs_is_varlena[] =

				{false, false, false};

				static const btree_gin_convert_function int2_cvt_fns[] =

				{NULL, cvt_int4_int2, cvt_int8_int2};

				static const PGFunction int2_cmp_fns[] =

				{btint2cmp, btint42cmp, btint82cmp};

				GIN_SUPPORT(int2, leftmostvalue_int2, int2_rhs_is_varlena, int2_cvt_fns, int2_cmp_fns)

				static Datum

				leftmostvalue_int4(void)

				@ -220,7 +336,34 @@ leftmostvalue_int4(void)

					return Int32GetDatum(INT_MIN);

				}

				GIN_SUPPORT(int4, false, leftmostvalue_int4, btint4cmp)

				static Datum

				cvt_int2_int4(Datum input)

				{

					int16		val = DatumGetInt16(input);

					return Int32GetDatum((int32) val);

				}

				static Datum

				cvt_int8_int4(Datum input)

				{

					int64		val = DatumGetInt64(input);

					val = Max(val, INT_MIN);

					val = Min(val, INT_MAX);

					return Int32GetDatum((int32) val);

				}

				static const bool int4_rhs_is_varlena[] =

				{false, false, false};

				static const btree_gin_convert_function int4_cvt_fns[] =

				{NULL, cvt_int2_int4, cvt_int8_int4};

				static const PGFunction int4_cmp_fns[] =

				{btint4cmp, btint24cmp, btint84cmp};

				GIN_SUPPORT(int4, leftmostvalue_int4, int4_rhs_is_varlena, int4_cvt_fns, int4_cmp_fns)

				static Datum

				leftmostvalue_int8(void)

				@ -228,7 +371,32 @@ leftmostvalue_int8(void)

					return Int64GetDatum(PG_INT64_MIN);

				}

				GIN_SUPPORT(int8, false, leftmostvalue_int8, btint8cmp)

				static Datum

				cvt_int2_int8(Datum input)

				{

					int16		val = DatumGetInt16(input);

					return Int64GetDatum((int64) val);

				}

				static Datum

				cvt_int4_int8(Datum input)

				{

					int32		val = DatumGetInt32(input);

					return Int64GetDatum((int64) val);

				}

				static const bool int8_rhs_is_varlena[] =

				{false, false, false};

				static const btree_gin_convert_function int8_cvt_fns[] =

				{NULL, cvt_int2_int8, cvt_int4_int8};

				static const PGFunction int8_cmp_fns[] =

				{btint8cmp, btint28cmp, btint48cmp};

				GIN_SUPPORT(int8, leftmostvalue_int8, int8_rhs_is_varlena, int8_cvt_fns, int8_cmp_fns)

				static Datum

				leftmostvalue_float4(void)

				@ -236,7 +404,34 @@ leftmostvalue_float4(void)

					return Float4GetDatum(-get_float4_infinity());

				}

				GIN_SUPPORT(float4, false, leftmostvalue_float4, btfloat4cmp)

				static Datum

				cvt_float8_float4(Datum input)

				{

					float8		val = DatumGetFloat8(input);

					float4		result;

					/*

					 * Assume that ordinary C conversion will produce a usable result.

					 * (Compare dtof(), which raises error conditions that we don't need.)

					 * Note that for inputs that aren't exactly representable as float4, it

					 * doesn't matter whether the conversion rounds up or down.  That might

					 * cause us to scan a few index entries that we'll reject as not matching,

					 * but we won't miss any that should match.

					 */

					result = (float4) val;

					return Float4GetDatum(result);

				}

				static const bool float4_rhs_is_varlena[] =

				{false, false};

				static const btree_gin_convert_function float4_cvt_fns[] =

				{NULL, cvt_float8_float4};

				static const PGFunction float4_cmp_fns[] =

				{btfloat4cmp, btfloat84cmp};

				GIN_SUPPORT(float4, leftmostvalue_float4, float4_rhs_is_varlena, float4_cvt_fns, float4_cmp_fns)

				static Datum

				leftmostvalue_float8(void)

				@ -244,7 +439,24 @@ leftmostvalue_float8(void)

					return Float8GetDatum(-get_float8_infinity());

				}

				GIN_SUPPORT(float8, false, leftmostvalue_float8, btfloat8cmp)

				static Datum

				cvt_float4_float8(Datum input)

				{

					float4		val = DatumGetFloat4(input);

					return Float8GetDatum((float8) val);

				}

				static const bool float8_rhs_is_varlena[] =

				{false, false};

				static const btree_gin_convert_function float8_cvt_fns[] =

				{NULL, cvt_float4_float8};

				static const PGFunction float8_cmp_fns[] =

				{btfloat8cmp, btfloat48cmp};

				GIN_SUPPORT(float8, leftmostvalue_float8, float8_rhs_is_varlena, float8_cvt_fns, float8_cmp_fns)

				static Datum

				leftmostvalue_money(void)

				@ -252,7 +464,13 @@ leftmostvalue_money(void)

					return Int64GetDatum(PG_INT64_MIN);

				}

				GIN_SUPPORT(money, false, leftmostvalue_money, cash_cmp)

				static const bool money_rhs_is_varlena[] =

				{false};

				static const PGFunction money_cmp_fns[] =

				{cash_cmp};

				GIN_SUPPORT(money, leftmostvalue_money, money_rhs_is_varlena, NULL, money_cmp_fns)

				static Datum

				leftmostvalue_oid(void)

				@ -260,7 +478,13 @@ leftmostvalue_oid(void)

					return ObjectIdGetDatum(0);

				}

				GIN_SUPPORT(oid, false, leftmostvalue_oid, btoidcmp)

				static const bool oid_rhs_is_varlena[] =

				{false};

				static const PGFunction oid_cmp_fns[] =

				{btoidcmp};

				GIN_SUPPORT(oid, leftmostvalue_oid, oid_rhs_is_varlena, NULL, oid_cmp_fns)

				static Datum

				leftmostvalue_timestamp(void)

				@ -268,9 +492,75 @@ leftmostvalue_timestamp(void)

					return TimestampGetDatum(DT_NOBEGIN);

				}

				GIN_SUPPORT(timestamp, false, leftmostvalue_timestamp, timestamp_cmp)

				static Datum

				cvt_date_timestamp(Datum input)

				{

					DateADT		val = DatumGetDateADT(input);

					Timestamp	result;

					ErrorSaveContext escontext = {T_ErrorSaveContext};

				GIN_SUPPORT(timestamptz, false, leftmostvalue_timestamp, timestamp_cmp)

					result = date2timestamp_safe(val, (Node *) &escontext);

					/* We can ignore errors, since result is useful as-is */

					return TimestampGetDatum(result);

				}

				static Datum

				cvt_timestamptz_timestamp(Datum input)

				{

					TimestampTz val = DatumGetTimestampTz(input);

					ErrorSaveContext escontext = {T_ErrorSaveContext};

					Timestamp	result;

					result = timestamptz2timestamp_safe(val, (Node *) &escontext);

					/* We can ignore errors, since result is useful as-is */

					return TimestampGetDatum(result);

				}

				static const bool timestamp_rhs_is_varlena[] =

				{false, false, false};

				static const btree_gin_convert_function timestamp_cvt_fns[] =

				{NULL, cvt_date_timestamp, cvt_timestamptz_timestamp};

				static const PGFunction timestamp_cmp_fns[] =

				{timestamp_cmp, date_cmp_timestamp, timestamptz_cmp_timestamp};

				GIN_SUPPORT(timestamp, leftmostvalue_timestamp, timestamp_rhs_is_varlena, timestamp_cvt_fns, timestamp_cmp_fns)

				static Datum

				cvt_date_timestamptz(Datum input)

				{

					DateADT		val = DatumGetDateADT(input);

					ErrorSaveContext escontext = {T_ErrorSaveContext};

					TimestampTz result;

					result = date2timestamptz_safe(val, (Node *) &escontext);

					/* We can ignore errors, since result is useful as-is */

					return TimestampTzGetDatum(result);

				}

				static Datum

				cvt_timestamp_timestamptz(Datum input)

				{

					Timestamp	val = DatumGetTimestamp(input);

					ErrorSaveContext escontext = {T_ErrorSaveContext};

					TimestampTz result;

					result = timestamp2timestamptz_safe(val, (Node *) &escontext);

					/* We can ignore errors, since result is useful as-is */

					return TimestampTzGetDatum(result);

				}

				static const bool timestamptz_rhs_is_varlena[] =

				{false, false, false};

				static const btree_gin_convert_function timestamptz_cvt_fns[] =

				{NULL, cvt_date_timestamptz, cvt_timestamp_timestamptz};

				static const PGFunction timestamptz_cmp_fns[] =

				{timestamp_cmp, date_cmp_timestamptz, timestamp_cmp_timestamptz};

				GIN_SUPPORT(timestamptz, leftmostvalue_timestamp, timestamptz_rhs_is_varlena, timestamptz_cvt_fns, timestamptz_cmp_fns)

				static Datum

				leftmostvalue_time(void)

				@ -278,12 +568,18 @@ leftmostvalue_time(void)

					return TimeADTGetDatum(0);

				}

				GIN_SUPPORT(time, false, leftmostvalue_time, time_cmp)

				static const bool time_rhs_is_varlena[] =

				{false};

				static const PGFunction time_cmp_fns[] =

				{time_cmp};

				GIN_SUPPORT(time, leftmostvalue_time, time_rhs_is_varlena, NULL, time_cmp_fns)

				static Datum

				leftmostvalue_timetz(void)

				{

					TimeTzADT  *v = palloc(sizeof(TimeTzADT));

					TimeTzADT  *v = palloc_object(TimeTzADT);

					v->time = 0;

					v->zone = -24 * 3600;		/* XXX is that true? */

				@ -291,7 +587,13 @@ leftmostvalue_timetz(void)

					return TimeTzADTPGetDatum(v);

				}

				GIN_SUPPORT(timetz, false, leftmostvalue_timetz, timetz_cmp)

				static const bool timetz_rhs_is_varlena[] =

				{false};

				static const PGFunction timetz_cmp_fns[] =

				{timetz_cmp};

				GIN_SUPPORT(timetz, leftmostvalue_timetz, timetz_rhs_is_varlena, NULL, timetz_cmp_fns)

				static Datum

				leftmostvalue_date(void)

				@ -299,39 +601,90 @@ leftmostvalue_date(void)

					return DateADTGetDatum(DATEVAL_NOBEGIN);

				}

				GIN_SUPPORT(date, false, leftmostvalue_date, date_cmp)

				static Datum

				cvt_timestamp_date(Datum input)

				{

					Timestamp	val = DatumGetTimestamp(input);

					ErrorSaveContext escontext = {T_ErrorSaveContext};

					DateADT		result;

					result = timestamp2date_safe(val, (Node *) &escontext);

					/* We can ignore errors, since result is useful as-is */

					return DateADTGetDatum(result);

				}

				static Datum

				cvt_timestamptz_date(Datum input)

				{

					TimestampTz val = DatumGetTimestampTz(input);

					ErrorSaveContext escontext = {T_ErrorSaveContext};

					DateADT		result;

					result = timestamptz2date_safe(val, (Node *) &escontext);

					/* We can ignore errors, since result is useful as-is */

					return DateADTGetDatum(result);

				}

				static const bool date_rhs_is_varlena[] =

				{false, false, false};

				static const btree_gin_convert_function date_cvt_fns[] =

				{NULL, cvt_timestamp_date, cvt_timestamptz_date};

				static const PGFunction date_cmp_fns[] =

				{date_cmp, timestamp_cmp_date, timestamptz_cmp_date};

				GIN_SUPPORT(date, leftmostvalue_date, date_rhs_is_varlena, date_cvt_fns, date_cmp_fns)

				static Datum

				leftmostvalue_interval(void)

				{

					Interval   *v = palloc(sizeof(Interval));

					Interval   *v = palloc_object(Interval);

					INTERVAL_NOBEGIN(v);

					return IntervalPGetDatum(v);

				}

				GIN_SUPPORT(interval, false, leftmostvalue_interval, interval_cmp)

				static const bool interval_rhs_is_varlena[] =

				{false};

				static const PGFunction interval_cmp_fns[] =

				{interval_cmp};

				GIN_SUPPORT(interval, leftmostvalue_interval, interval_rhs_is_varlena, NULL, interval_cmp_fns)

				static Datum

				leftmostvalue_macaddr(void)

				{

					macaddr    *v = palloc0(sizeof(macaddr));

					macaddr    *v = palloc0_object(macaddr);

					return MacaddrPGetDatum(v);

				}

				GIN_SUPPORT(macaddr, false, leftmostvalue_macaddr, macaddr_cmp)

				static const bool macaddr_rhs_is_varlena[] =

				{false};

				static const PGFunction macaddr_cmp_fns[] =

				{macaddr_cmp};

				GIN_SUPPORT(macaddr, leftmostvalue_macaddr, macaddr_rhs_is_varlena, NULL, macaddr_cmp_fns)

				static Datum

				leftmostvalue_macaddr8(void)

				{

					macaddr8   *v = palloc0(sizeof(macaddr8));

					macaddr8   *v = palloc0_object(macaddr8);

					return Macaddr8PGetDatum(v);

				}

				GIN_SUPPORT(macaddr8, false, leftmostvalue_macaddr8, macaddr8_cmp)

				static const bool macaddr8_rhs_is_varlena[] =

				{false};

				static const PGFunction macaddr8_cmp_fns[] =

				{macaddr8_cmp};

				GIN_SUPPORT(macaddr8, leftmostvalue_macaddr8, macaddr8_rhs_is_varlena, NULL, macaddr8_cmp_fns)

				static Datum

				leftmostvalue_inet(void)

				@ -339,9 +692,21 @@ leftmostvalue_inet(void)

					return DirectFunctionCall1(inet_in, CStringGetDatum("0.0.0.0/0"));

				}

				GIN_SUPPORT(inet, true, leftmostvalue_inet, network_cmp)

				static const bool inet_rhs_is_varlena[] =

				{true};

				GIN_SUPPORT(cidr, true, leftmostvalue_inet, network_cmp)

				static const PGFunction inet_cmp_fns[] =

				{network_cmp};

				GIN_SUPPORT(inet, leftmostvalue_inet, inet_rhs_is_varlena, NULL, inet_cmp_fns)

				static const bool cidr_rhs_is_varlena[] =

				{true};

				static const PGFunction cidr_cmp_fns[] =

				{network_cmp};

				GIN_SUPPORT(cidr, leftmostvalue_inet, cidr_rhs_is_varlena, NULL, cidr_cmp_fns)

				static Datum

				leftmostvalue_text(void)

				@ -349,9 +714,32 @@ leftmostvalue_text(void)

					return PointerGetDatum(cstring_to_text_with_len("", 0));

				}

				GIN_SUPPORT(text, true, leftmostvalue_text, bttextcmp)

				static Datum

				cvt_name_text(Datum input)

				{

					Name		val = DatumGetName(input);

				GIN_SUPPORT(bpchar, true, leftmostvalue_text, bpcharcmp)

					return PointerGetDatum(cstring_to_text(NameStr(*val)));

				}

				static const bool text_rhs_is_varlena[] =

				{true, false};

				static const btree_gin_convert_function text_cvt_fns[] =

				{NULL, cvt_name_text};

				static const PGFunction text_cmp_fns[] =

				{bttextcmp, btnametextcmp};

				GIN_SUPPORT(text, leftmostvalue_text, text_rhs_is_varlena, text_cvt_fns, text_cmp_fns)

				static const bool bpchar_rhs_is_varlena[] =

				{true};

				static const PGFunction bpchar_cmp_fns[] =

				{bpcharcmp};

				GIN_SUPPORT(bpchar, leftmostvalue_text, bpchar_rhs_is_varlena, NULL, bpchar_cmp_fns)

				static Datum

				leftmostvalue_char(void)

				@ -359,9 +747,21 @@ leftmostvalue_char(void)

					return CharGetDatum(0);

				}

				GIN_SUPPORT(char, false, leftmostvalue_char, btcharcmp)

				static const bool char_rhs_is_varlena[] =

				{false};

				GIN_SUPPORT(bytea, true, leftmostvalue_text, byteacmp)

				static const PGFunction char_cmp_fns[] =

				{btcharcmp};

				GIN_SUPPORT(char, leftmostvalue_char, char_rhs_is_varlena, NULL, char_cmp_fns)

				static const bool bytea_rhs_is_varlena[] =

				{true};

				static const PGFunction bytea_cmp_fns[] =

				{byteacmp};

				GIN_SUPPORT(bytea, leftmostvalue_text, bytea_rhs_is_varlena, NULL, bytea_cmp_fns)

				static Datum

				leftmostvalue_bit(void)

				@ -372,7 +772,13 @@ leftmostvalue_bit(void)

											   Int32GetDatum(-1));

				}

				GIN_SUPPORT(bit, true, leftmostvalue_bit, bitcmp)

				static const bool bit_rhs_is_varlena[] =

				{true};

				static const PGFunction bit_cmp_fns[] =

				{bitcmp};

				GIN_SUPPORT(bit, leftmostvalue_bit, bit_rhs_is_varlena, NULL, bit_cmp_fns)

				static Datum

				leftmostvalue_varbit(void)

				@ -383,7 +789,13 @@ leftmostvalue_varbit(void)

											   Int32GetDatum(-1));

				}

				GIN_SUPPORT(varbit, true, leftmostvalue_varbit, bitcmp)

				static const bool varbit_rhs_is_varlena[] =

				{true};

				static const PGFunction varbit_cmp_fns[] =

				{bitcmp};

				GIN_SUPPORT(varbit, leftmostvalue_varbit, varbit_rhs_is_varlena, NULL, varbit_cmp_fns)

				/*

				 * Numeric type hasn't a real left-most value, so we use PointerGetDatum(NULL)

				@ -428,7 +840,13 @@ leftmostvalue_numeric(void)

					return PointerGetDatum(NULL);

				}

				GIN_SUPPORT(numeric, true, leftmostvalue_numeric, gin_numeric_cmp)

				static const bool numeric_rhs_is_varlena[] =

				{true};

				static const PGFunction numeric_cmp_fns[] =

				{gin_numeric_cmp};

				GIN_SUPPORT(numeric, leftmostvalue_numeric, numeric_rhs_is_varlena, NULL, numeric_cmp_fns)

				/*

				 * Use a similar trick to that used for numeric for enums, since we don't

				@ -477,7 +895,13 @@ leftmostvalue_enum(void)

					return ObjectIdGetDatum(InvalidOid);

				}

				GIN_SUPPORT(anyenum, false, leftmostvalue_enum, gin_enum_cmp)

				static const bool enum_rhs_is_varlena[] =

				{false};

				static const PGFunction enum_cmp_fns[] =

				{gin_enum_cmp};

				GIN_SUPPORT(anyenum, leftmostvalue_enum, enum_rhs_is_varlena, NULL, enum_cmp_fns)

				static Datum

				leftmostvalue_uuid(void)

				@ -486,12 +910,18 @@ leftmostvalue_uuid(void)

					 * palloc0 will create the UUID with all zeroes:

					 * "00000000-0000-0000-0000-000000000000"

					 */

					pg_uuid_t  *retval = (pg_uuid_t *) palloc0(sizeof(pg_uuid_t));

					pg_uuid_t  *retval = palloc0_object(pg_uuid_t);

					return UUIDPGetDatum(retval);

				}

				GIN_SUPPORT(uuid, false, leftmostvalue_uuid, uuid_cmp)

				static const bool uuid_rhs_is_varlena[] =

				{false};

				static const PGFunction uuid_cmp_fns[] =

				{uuid_cmp};

				GIN_SUPPORT(uuid, leftmostvalue_uuid, uuid_rhs_is_varlena, NULL, uuid_cmp_fns)

				static Datum

				leftmostvalue_name(void)

				@ -501,7 +931,37 @@ leftmostvalue_name(void)

					return NameGetDatum(result);

				}

				GIN_SUPPORT(name, false, leftmostvalue_name, btnamecmp)

				static Datum

				cvt_text_name(Datum input)

				{

					text	   *val = DatumGetTextPP(input);

					NameData   *result = (NameData *) palloc0(NAMEDATALEN);

					int			len = VARSIZE_ANY_EXHDR(val);

					/*

					 * Truncate oversize input.  We're assuming this will produce a result

					 * considered less than the original.  That could be a bad assumption in

					 * some collations, but fortunately an index on "name" is generally going

					 * to use C collation.

					 */

					if (len >= NAMEDATALEN)

						len = pg_mbcliplen(VARDATA_ANY(val), len, NAMEDATALEN - 1);

					memcpy(NameStr(*result), VARDATA_ANY(val), len);

					return NameGetDatum(result);

				}

				static const bool name_rhs_is_varlena[] =

				{false, true};

				static const btree_gin_convert_function name_cvt_fns[] =

				{NULL, cvt_text_name};

				static const PGFunction name_cmp_fns[] =

				{btnamecmp, bttextnamecmp};

				GIN_SUPPORT(name, leftmostvalue_name, name_rhs_is_varlena, name_cvt_fns, name_cmp_fns)

				static Datum

				leftmostvalue_bool(void)

				@ -509,4 +969,10 @@ leftmostvalue_bool(void)

					return BoolGetDatum(false);

				}

				GIN_SUPPORT(bool, false, leftmostvalue_bool, btboolcmp)

				static const bool bool_rhs_is_varlena[] =

				{false};

				static const PGFunction bool_cmp_fns[] =

				{btboolcmp};

				GIN_SUPPORT(bool, leftmostvalue_bool, bool_rhs_is_varlena, NULL, bool_cmp_fns)

2

contrib/btree_gin/btree_gin.control

 @ -1,6 +1,6 @@
 # btree_gin extension
 comment = 'support for indexing common datatypes in GIN'
 default_version = '1.3'
 default_version = '1.4'
 module_pathname = '$libdir/btree_gin'
 relocatable = true
 trusted = true

362

contrib/btree_gin/expected/date.out

 @ -49,3 +49,365 @@ SELECT * FROM test_date WHERE i>'2004-10-26'::date ORDER BY i;
  10-28-2004
 (2 rows)
 explain (costs off)
 SELECT * FROM test_date WHERE i<'2004-10-26'::timestamp ORDER BY i;
                                        QUERY PLAN
 -----------------------------------------------------------------------------------------
  Sort
    Sort Key: i
    ->  Bitmap Heap Scan on test_date
          Recheck Cond: (i < 'Tue Oct 26 00:00:00 2004'::timestamp without time zone)
          ->  Bitmap Index Scan on idx_date
                Index Cond: (i < 'Tue Oct 26 00:00:00 2004'::timestamp without time zone)
 (6 rows)
 SELECT * FROM test_date WHERE i<'2004-10-26'::timestamp ORDER BY i;
      i
 ------------
  10-23-2004
  10-24-2004
  10-25-2004
 (3 rows)
 SELECT * FROM test_date WHERE i<='2004-10-26'::timestamp ORDER BY i;
      i
 ------------
  10-23-2004
  10-24-2004
  10-25-2004
  10-26-2004
 (4 rows)
 SELECT * FROM test_date WHERE i='2004-10-26'::timestamp ORDER BY i;
      i
 ------------
  10-26-2004
 (1 row)
 SELECT * FROM test_date WHERE i>='2004-10-26'::timestamp ORDER BY i;
      i
 ------------
  10-26-2004
  10-27-2004
  10-28-2004
 (3 rows)
 SELECT * FROM test_date WHERE i>'2004-10-26'::timestamp ORDER BY i;
      i
 ------------
  10-27-2004
  10-28-2004
 (2 rows)
 explain (costs off)
 SELECT * FROM test_date WHERE i<'2004-10-26'::timestamptz ORDER BY i;
                                         QUERY PLAN
 ------------------------------------------------------------------------------------------
  Sort
    Sort Key: i
    ->  Bitmap Heap Scan on test_date
          Recheck Cond: (i < 'Tue Oct 26 00:00:00 2004 PDT'::timestamp with time zone)
          ->  Bitmap Index Scan on idx_date
                Index Cond: (i < 'Tue Oct 26 00:00:00 2004 PDT'::timestamp with time zone)
 (6 rows)
 SELECT * FROM test_date WHERE i<'2004-10-26'::timestamptz ORDER BY i;
      i
 ------------
  10-23-2004
  10-24-2004
  10-25-2004
 (3 rows)
 SELECT * FROM test_date WHERE i<='2004-10-26'::timestamptz ORDER BY i;
      i
 ------------
  10-23-2004
  10-24-2004
  10-25-2004
  10-26-2004
 (4 rows)
 SELECT * FROM test_date WHERE i='2004-10-26'::timestamptz ORDER BY i;
      i
 ------------
  10-26-2004
 (1 row)
 SELECT * FROM test_date WHERE i>='2004-10-26'::timestamptz ORDER BY i;
      i
 ------------
  10-26-2004
  10-27-2004
  10-28-2004
 (3 rows)
 SELECT * FROM test_date WHERE i>'2004-10-26'::timestamptz ORDER BY i;
      i
 ------------
  10-27-2004
  10-28-2004
 (2 rows)
 -- Check endpoint and out-of-range cases
 INSERT INTO test_date VALUES ('-infinity'), ('infinity');
 SELECT gin_clean_pending_list('idx_date');
  gin_clean_pending_list
 ------------------------
 
 (1 row)
 SELECT * FROM test_date WHERE i<'-infinity'::timestamp ORDER BY i;
  i
 ---
 (0 rows)
 SELECT * FROM test_date WHERE i<='-infinity'::timestamp ORDER BY i;
      i
 -----------
  -infinity
 (1 row)
 SELECT * FROM test_date WHERE i='-infinity'::timestamp ORDER BY i;
      i
 -----------
  -infinity
 (1 row)
 SELECT * FROM test_date WHERE i>='-infinity'::timestamp ORDER BY i;
      i
 ------------
  -infinity
  10-23-2004
  10-24-2004
  10-25-2004
  10-26-2004
  10-27-2004
  10-28-2004
  infinity
 (8 rows)
 SELECT * FROM test_date WHERE i>'-infinity'::timestamp ORDER BY i;
      i
 ------------
  10-23-2004
  10-24-2004
  10-25-2004
  10-26-2004
  10-27-2004
  10-28-2004
  infinity
 (7 rows)
 SELECT * FROM test_date WHERE i<'infinity'::timestamp ORDER BY i;
      i
 ------------
  -infinity
  10-23-2004
  10-24-2004
  10-25-2004
  10-26-2004
  10-27-2004
  10-28-2004
 (7 rows)
 SELECT * FROM test_date WHERE i<='infinity'::timestamp ORDER BY i;
      i
 ------------
  -infinity
  10-23-2004
  10-24-2004
  10-25-2004
  10-26-2004
  10-27-2004
  10-28-2004
  infinity
 (8 rows)
 SELECT * FROM test_date WHERE i='infinity'::timestamp ORDER BY i;
     i
 ----------
  infinity
 (1 row)
 SELECT * FROM test_date WHERE i>='infinity'::timestamp ORDER BY i;
     i
 ----------
  infinity
 (1 row)
 SELECT * FROM test_date WHERE i>'infinity'::timestamp ORDER BY i;
  i
 ---
 (0 rows)
 SELECT * FROM test_date WHERE i<'-infinity'::timestamptz ORDER BY i;
  i
 ---
 (0 rows)
 SELECT * FROM test_date WHERE i<='-infinity'::timestamptz ORDER BY i;
      i
 -----------
  -infinity
 (1 row)
 SELECT * FROM test_date WHERE i='-infinity'::timestamptz ORDER BY i;
      i
 -----------
  -infinity
 (1 row)
 SELECT * FROM test_date WHERE i>='-infinity'::timestamptz ORDER BY i;
      i
 ------------
  -infinity
  10-23-2004
  10-24-2004
  10-25-2004
  10-26-2004
  10-27-2004
  10-28-2004
  infinity
 (8 rows)
 SELECT * FROM test_date WHERE i>'-infinity'::timestamptz ORDER BY i;
      i
 ------------
  10-23-2004
  10-24-2004
  10-25-2004
  10-26-2004
  10-27-2004
  10-28-2004
  infinity
 (7 rows)
 SELECT * FROM test_date WHERE i<'infinity'::timestamptz ORDER BY i;
      i
 ------------
  -infinity
  10-23-2004
  10-24-2004
  10-25-2004
  10-26-2004
  10-27-2004
  10-28-2004
 (7 rows)
 SELECT * FROM test_date WHERE i<='infinity'::timestamptz ORDER BY i;
      i
 ------------
  -infinity
  10-23-2004
  10-24-2004
  10-25-2004
  10-26-2004
  10-27-2004
  10-28-2004
  infinity
 (8 rows)
 SELECT * FROM test_date WHERE i='infinity'::timestamptz ORDER BY i;
     i
 ----------
  infinity
 (1 row)
 SELECT * FROM test_date WHERE i>='infinity'::timestamptz ORDER BY i;
     i
 ----------
  infinity
 (1 row)
 SELECT * FROM test_date WHERE i>'infinity'::timestamptz ORDER BY i;
  i
 ---
 (0 rows)
 -- Check rounding cases
 -- '2004-10-25 00:00:01' rounds to '2004-10-25' for date.
 -- '2004-10-25 23:59:59' also rounds to '2004-10-25',
 -- so it's the same case as '2004-10-25 00:00:01'
 SELECT * FROM test_date WHERE i < '2004-10-25 00:00:01'::timestamp ORDER BY i;
      i
 ------------
  -infinity
  10-23-2004
  10-24-2004
  10-25-2004
 (4 rows)
 SELECT * FROM test_date WHERE i <= '2004-10-25 00:00:01'::timestamp ORDER BY i;
      i
 ------------
  -infinity
  10-23-2004
  10-24-2004
  10-25-2004
 (4 rows)
 SELECT * FROM test_date WHERE i = '2004-10-25 00:00:01'::timestamp ORDER BY i;
  i
 ---
 (0 rows)
 SELECT * FROM test_date WHERE i > '2004-10-25 00:00:01'::timestamp ORDER BY i;
      i
 ------------
  10-26-2004
  10-27-2004
  10-28-2004
  infinity
 (4 rows)
 SELECT * FROM test_date WHERE i >= '2004-10-25 00:00:01'::timestamp ORDER BY i;
      i
 ------------
  10-26-2004
  10-27-2004
  10-28-2004
  infinity
 (4 rows)
 SELECT * FROM test_date WHERE i < '2004-10-25 00:00:01'::timestamptz ORDER BY i;
      i
 ------------
  -infinity
  10-23-2004
  10-24-2004
  10-25-2004
 (4 rows)
 SELECT * FROM test_date WHERE i <= '2004-10-25 00:00:01'::timestamptz ORDER BY i;
      i
 ------------
  -infinity
  10-23-2004
  10-24-2004
  10-25-2004
 (4 rows)
 SELECT * FROM test_date WHERE i = '2004-10-25 00:00:01'::timestamptz ORDER BY i;
  i
 ---
 (0 rows)
 SELECT * FROM test_date WHERE i > '2004-10-25 00:00:01'::timestamptz ORDER BY i;
      i
 ------------
  10-26-2004
  10-27-2004
  10-28-2004
  infinity
 (4 rows)
 SELECT * FROM test_date WHERE i >= '2004-10-25 00:00:01'::timestamptz ORDER BY i;
      i
 ------------
  10-26-2004
  10-27-2004
  10-28-2004
  infinity
 (4 rows)

321

contrib/btree_gin/expected/float4.out

 @ -42,3 +42,324 @@ SELECT * FROM test_float4 WHERE i>1::float4 ORDER BY i;
  3
 (2 rows)
 explain (costs off)
 SELECT * FROM test_float4 WHERE i<1::float8 ORDER BY i;
                       QUERY PLAN
 -------------------------------------------------------
  Sort
    Sort Key: i
    ->  Bitmap Heap Scan on test_float4
          Recheck Cond: (i < '1'::double precision)
          ->  Bitmap Index Scan on idx_float4
                Index Cond: (i < '1'::double precision)
 (6 rows)
 SELECT * FROM test_float4 WHERE i<1::float8 ORDER BY i;
  i
 ----
  -2
  -1
   0
 (3 rows)
 SELECT * FROM test_float4 WHERE i<=1::float8 ORDER BY i;
  i
 ----
  -2
  -1
   0
   1
 (4 rows)
 SELECT * FROM test_float4 WHERE i=1::float8 ORDER BY i;
  i
 ---
  1
 (1 row)
 SELECT * FROM test_float4 WHERE i>=1::float8 ORDER BY i;
  i
 ---
  1
  2
  3
 (3 rows)
 SELECT * FROM test_float4 WHERE i>1::float8 ORDER BY i;
  i
 ---
  2
  3
 (2 rows)
 -- Check endpoint and out-of-range cases
 INSERT INTO test_float4 VALUES ('NaN'), ('Inf'), ('-Inf');
 SELECT gin_clean_pending_list('idx_float4');
  gin_clean_pending_list
 ------------------------
 
 (1 row)
 SELECT * FROM test_float4 WHERE i<'-Inf'::float8 ORDER BY i;
  i
 ---
 (0 rows)
 SELECT * FROM test_float4 WHERE i<='-Inf'::float8 ORDER BY i;
      i
 -----------
  -Infinity
 (1 row)
 SELECT * FROM test_float4 WHERE i='-Inf'::float8 ORDER BY i;
      i
 -----------
  -Infinity
 (1 row)
 SELECT * FROM test_float4 WHERE i>='-Inf'::float8 ORDER BY i;
      i
 -----------
  -Infinity
         -2
         -1
          0
          1
          2
          3
   Infinity
        NaN
 (9 rows)
 SELECT * FROM test_float4 WHERE i>'-Inf'::float8 ORDER BY i;
     i
 ----------
        -2
        -1
         0
         1
         2
         3
  Infinity
       NaN
 (8 rows)
 SELECT * FROM test_float4 WHERE i<'Inf'::float8 ORDER BY i;
      i
 -----------
  -Infinity
         -2
         -1
          0
          1
          2
          3
 (7 rows)
 SELECT * FROM test_float4 WHERE i<='Inf'::float8 ORDER BY i;
      i
 -----------
  -Infinity
         -2
         -1
          0
          1
          2
          3
   Infinity
 (8 rows)
 SELECT * FROM test_float4 WHERE i='Inf'::float8 ORDER BY i;
     i
 ----------
  Infinity
 (1 row)
 SELECT * FROM test_float4 WHERE i>='Inf'::float8 ORDER BY i;
     i
 ----------
  Infinity
       NaN
 (2 rows)
 SELECT * FROM test_float4 WHERE i>'Inf'::float8 ORDER BY i;
   i
 -----
  NaN
 (1 row)
 SELECT * FROM test_float4 WHERE i<'1e300'::float8 ORDER BY i;
      i
 -----------
  -Infinity
         -2
         -1
          0
          1
          2
          3
 (7 rows)
 SELECT * FROM test_float4 WHERE i<='1e300'::float8 ORDER BY i;
      i
 -----------
  -Infinity
         -2
         -1
          0
          1
          2
          3
 (7 rows)
 SELECT * FROM test_float4 WHERE i='1e300'::float8 ORDER BY i;
  i
 ---
 (0 rows)
 SELECT * FROM test_float4 WHERE i>='1e300'::float8 ORDER BY i;
     i
 ----------
  Infinity
       NaN
 (2 rows)
 SELECT * FROM test_float4 WHERE i>'1e300'::float8 ORDER BY i;
     i
 ----------
  Infinity
       NaN
 (2 rows)
 SELECT * FROM test_float4 WHERE i<'NaN'::float8 ORDER BY i;
      i
 -----------
  -Infinity
         -2
         -1
          0
          1
          2
          3
   Infinity
 (8 rows)
 SELECT * FROM test_float4 WHERE i<='NaN'::float8 ORDER BY i;
      i
 -----------
  -Infinity
         -2
         -1
          0
          1
          2
          3
   Infinity
        NaN
 (9 rows)
 SELECT * FROM test_float4 WHERE i='NaN'::float8 ORDER BY i;
   i
 -----
  NaN
 (1 row)
 SELECT * FROM test_float4 WHERE i>='NaN'::float8 ORDER BY i;
   i
 -----
  NaN
 (1 row)
 SELECT * FROM test_float4 WHERE i>'NaN'::float8 ORDER BY i;
  i
 ---
 (0 rows)
 -- Check rounding cases
 -- 1e-300 rounds to 0 for float4 but not for float8
 SELECT * FROM test_float4 WHERE i < -1e-300::float8 ORDER BY i;
      i
 -----------
  -Infinity
         -2
         -1
 (3 rows)
 SELECT * FROM test_float4 WHERE i <= -1e-300::float8 ORDER BY i;
      i
 -----------
  -Infinity
         -2
         -1
 (3 rows)
 SELECT * FROM test_float4 WHERE i = -1e-300::float8 ORDER BY i;
  i
 ---
 (0 rows)
 SELECT * FROM test_float4 WHERE i > -1e-300::float8 ORDER BY i;
     i
 ----------
         0
         1
         2
         3
  Infinity
       NaN
 (6 rows)
 SELECT * FROM test_float4 WHERE i >= -1e-300::float8 ORDER BY i;
     i
 ----------
         0
         1
         2
         3
  Infinity
       NaN
 (6 rows)
 SELECT * FROM test_float4 WHERE i < 1e-300::float8 ORDER BY i;
      i
 -----------
  -Infinity
         -2
         -1
          0
 (4 rows)
 SELECT * FROM test_float4 WHERE i <= 1e-300::float8 ORDER BY i;
      i
 -----------
  -Infinity
         -2
         -1
          0
 (4 rows)
 SELECT * FROM test_float4 WHERE i = 1e-300::float8 ORDER BY i;
  i
 ---
 (0 rows)
 SELECT * FROM test_float4 WHERE i > 1e-300::float8 ORDER BY i;
     i
 ----------
         1
         2
         3
  Infinity
       NaN
 (5 rows)
 SELECT * FROM test_float4 WHERE i >= 1e-300::float8 ORDER BY i;
     i
 ----------
         1
         2
         3
  Infinity
       NaN
 (5 rows)

50

contrib/btree_gin/expected/float8.out

 @ -42,3 +42,53 @@ SELECT * FROM test_float8 WHERE i>1::float8 ORDER BY i;
  3
 (2 rows)
 explain (costs off)
 SELECT * FROM test_float8 WHERE i<1::float4 ORDER BY i;
                  QUERY PLAN
 ---------------------------------------------
  Sort
    Sort Key: i
    ->  Bitmap Heap Scan on test_float8
          Recheck Cond: (i < '1'::real)
          ->  Bitmap Index Scan on idx_float8
                Index Cond: (i < '1'::real)
 (6 rows)
 SELECT * FROM test_float8 WHERE i<1::float4 ORDER BY i;
  i
 ----
  -2
  -1
   0
 (3 rows)
 SELECT * FROM test_float8 WHERE i<=1::float4 ORDER BY i;
  i
 ----
  -2
  -1
   0
   1
 (4 rows)
 SELECT * FROM test_float8 WHERE i=1::float4 ORDER BY i;
  i
 ---
  1
 (1 row)
 SELECT * FROM test_float8 WHERE i>=1::float4 ORDER BY i;
  i
 ---
  1
  2
  3
 (3 rows)
 SELECT * FROM test_float8 WHERE i>1::float4 ORDER BY i;
  i
 ---
  2
  3
 (2 rows)

190

contrib/btree_gin/expected/int2.out

 @ -42,3 +42,193 @@ SELECT * FROM test_int2 WHERE i>1::int2 ORDER BY i;
  3
 (2 rows)
 explain (costs off)
 SELECT * FROM test_int2 WHERE i<1::int4 ORDER BY i;
                 QUERY PLAN
 -------------------------------------------
  Sort
    Sort Key: i
    ->  Bitmap Heap Scan on test_int2
          Recheck Cond: (i < 1)
          ->  Bitmap Index Scan on idx_int2
                Index Cond: (i < 1)
 (6 rows)
 SELECT * FROM test_int2 WHERE i<1::int4 ORDER BY i;
  i
 ----
  -2
  -1
   0
 (3 rows)
 SELECT * FROM test_int2 WHERE i<=1::int4 ORDER BY i;
  i
 ----
  -2
  -1
   0
   1
 (4 rows)
 SELECT * FROM test_int2 WHERE i=1::int4 ORDER BY i;
  i
 ---
  1
 (1 row)
 SELECT * FROM test_int2 WHERE i>=1::int4 ORDER BY i;
  i
 ---
  1
  2
  3
 (3 rows)
 SELECT * FROM test_int2 WHERE i>1::int4 ORDER BY i;
  i
 ---
  2
  3
 (2 rows)
 explain (costs off)
 SELECT * FROM test_int2 WHERE i<1::int8 ORDER BY i;
                  QUERY PLAN
 ---------------------------------------------
  Sort
    Sort Key: i
    ->  Bitmap Heap Scan on test_int2
          Recheck Cond: (i < '1'::bigint)
          ->  Bitmap Index Scan on idx_int2
                Index Cond: (i < '1'::bigint)
 (6 rows)
 SELECT * FROM test_int2 WHERE i<1::int8 ORDER BY i;
  i
 ----
  -2
  -1
   0
 (3 rows)
 SELECT * FROM test_int2 WHERE i<=1::int8 ORDER BY i;
  i
 ----
  -2
  -1
   0
   1
 (4 rows)
 SELECT * FROM test_int2 WHERE i=1::int8 ORDER BY i;
  i
 ---
  1
 (1 row)
 SELECT * FROM test_int2 WHERE i>=1::int8 ORDER BY i;
  i
 ---
  1
  2
  3
 (3 rows)
 SELECT * FROM test_int2 WHERE i>1::int8 ORDER BY i;
  i
 ---
  2
  3
 (2 rows)
 -- Check endpoint and out-of-range cases
 INSERT INTO test_int2 VALUES ((-32768)::int2),(32767);
 SELECT gin_clean_pending_list('idx_int2');
  gin_clean_pending_list
 ------------------------
 
 (1 row)
 SELECT * FROM test_int2 WHERE i<(-32769)::int4 ORDER BY i;
  i
 ---
 (0 rows)
 SELECT * FROM test_int2 WHERE i<=(-32769)::int4 ORDER BY i;
  i
 ---
 (0 rows)
 SELECT * FROM test_int2 WHERE i=(-32769)::int4 ORDER BY i;
  i
 ---
 (0 rows)
 SELECT * FROM test_int2 WHERE i>=(-32769)::int4 ORDER BY i;
    i
 --------
  -32768
      -2
      -1
       0
       1
       2
       3
   32767
 (8 rows)
 SELECT * FROM test_int2 WHERE i>(-32769)::int4 ORDER BY i;
    i
 --------
  -32768
      -2
      -1
       0
       1
       2
       3
   32767
 (8 rows)
 SELECT * FROM test_int2 WHERE i<32768::int4 ORDER BY i;
    i
 --------
  -32768
      -2
      -1
       0
       1
       2
       3
   32767
 (8 rows)
 SELECT * FROM test_int2 WHERE i<=32768::int4 ORDER BY i;
    i
 --------
  -32768
      -2
      -1
       0
       1
       2
       3
   32767
 (8 rows)
 SELECT * FROM test_int2 WHERE i=32768::int4 ORDER BY i;
  i
 ---
 (0 rows)
 SELECT * FROM test_int2 WHERE i>=32768::int4 ORDER BY i;
  i
 ---
 (0 rows)
 SELECT * FROM test_int2 WHERE i>32768::int4 ORDER BY i;
  i
 ---
 (0 rows)

100

contrib/btree_gin/expected/int4.out

 @ -42,3 +42,103 @@ SELECT * FROM test_int4 WHERE i>1::int4 ORDER BY i;
  3
 (2 rows)
 explain (costs off)
 SELECT * FROM test_int4 WHERE i<1::int2 ORDER BY i;
                   QUERY PLAN
 -----------------------------------------------
  Sort
    Sort Key: i
    ->  Bitmap Heap Scan on test_int4
          Recheck Cond: (i < '1'::smallint)
          ->  Bitmap Index Scan on idx_int4
                Index Cond: (i < '1'::smallint)
 (6 rows)
 SELECT * FROM test_int4 WHERE i<1::int2 ORDER BY i;
  i
 ----
  -2
  -1
   0
 (3 rows)
 SELECT * FROM test_int4 WHERE i<=1::int2 ORDER BY i;
  i
 ----
  -2
  -1
   0
   1
 (4 rows)
 SELECT * FROM test_int4 WHERE i=1::int2 ORDER BY i;
  i
 ---
  1
 (1 row)
 SELECT * FROM test_int4 WHERE i>=1::int2 ORDER BY i;
  i
 ---
  1
  2
  3
 (3 rows)
 SELECT * FROM test_int4 WHERE i>1::int2 ORDER BY i;
  i
 ---
  2
  3
 (2 rows)
 explain (costs off)
 SELECT * FROM test_int4 WHERE i<1::int8 ORDER BY i;
                  QUERY PLAN
 ---------------------------------------------
  Sort
    Sort Key: i
    ->  Bitmap Heap Scan on test_int4
          Recheck Cond: (i < '1'::bigint)
          ->  Bitmap Index Scan on idx_int4
                Index Cond: (i < '1'::bigint)
 (6 rows)
 SELECT * FROM test_int4 WHERE i<1::int8 ORDER BY i;
  i
 ----
  -2
  -1
   0
 (3 rows)
 SELECT * FROM test_int4 WHERE i<=1::int8 ORDER BY i;
  i
 ----
  -2
  -1
   0
   1
 (4 rows)
 SELECT * FROM test_int4 WHERE i=1::int8 ORDER BY i;
  i
 ---
  1
 (1 row)
 SELECT * FROM test_int4 WHERE i>=1::int8 ORDER BY i;
  i
 ---
  1
  2
  3
 (3 rows)
 SELECT * FROM test_int4 WHERE i>1::int8 ORDER BY i;
  i
 ---
  2
  3
 (2 rows)

100

contrib/btree_gin/expected/int8.out

 @ -42,3 +42,103 @@ SELECT * FROM test_int8 WHERE i>1::int8 ORDER BY i;
  3
 (2 rows)
 explain (costs off)
 SELECT * FROM test_int8 WHERE i<1::int2 ORDER BY i;
                   QUERY PLAN
 -----------------------------------------------
  Sort
    Sort Key: i
    ->  Bitmap Heap Scan on test_int8
          Recheck Cond: (i < '1'::smallint)
          ->  Bitmap Index Scan on idx_int8
                Index Cond: (i < '1'::smallint)
 (6 rows)
 SELECT * FROM test_int8 WHERE i<1::int2 ORDER BY i;
  i
 ----
  -2
  -1
   0
 (3 rows)
 SELECT * FROM test_int8 WHERE i<=1::int2 ORDER BY i;
  i
 ----
  -2
  -1
   0
   1
 (4 rows)
 SELECT * FROM test_int8 WHERE i=1::int2 ORDER BY i;
  i
 ---
  1
 (1 row)
 SELECT * FROM test_int8 WHERE i>=1::int2 ORDER BY i;
  i
 ---
  1
  2
  3
 (3 rows)
 SELECT * FROM test_int8 WHERE i>1::int2 ORDER BY i;
  i
 ---
  2
  3
 (2 rows)
 explain (costs off)
 SELECT * FROM test_int8 WHERE i<1::int4 ORDER BY i;
                 QUERY PLAN
 -------------------------------------------
  Sort
    Sort Key: i
    ->  Bitmap Heap Scan on test_int8
          Recheck Cond: (i < 1)
          ->  Bitmap Index Scan on idx_int8
                Index Cond: (i < 1)
 (6 rows)
 SELECT * FROM test_int8 WHERE i<1::int4 ORDER BY i;
  i
 ----
  -2
  -1
   0
 (3 rows)
 SELECT * FROM test_int8 WHERE i<=1::int4 ORDER BY i;
  i
 ----
  -2
  -1
   0
   1
 (4 rows)
 SELECT * FROM test_int8 WHERE i=1::int4 ORDER BY i;
  i
 ---
  1
 (1 row)
 SELECT * FROM test_int8 WHERE i>=1::int4 ORDER BY i;
  i
 ---
  1
  2
  3
 (3 rows)
 SELECT * FROM test_int8 WHERE i>1::int4 ORDER BY i;
  i
 ---
  2
  3
 (2 rows)

59

contrib/btree_gin/expected/name.out

 @ -95,3 +95,62 @@ EXPLAIN (COSTS OFF) SELECT * FROM test_name WHERE i>'abc' ORDER BY i;
                Index Cond: (i > 'abc'::name)
 (6 rows)
 explain (costs off)
 SELECT * FROM test_name WHERE i<'abc'::text ORDER BY i;
                  QUERY PLAN
 ---------------------------------------------
  Sort
    Sort Key: i
    ->  Bitmap Heap Scan on test_name
          Recheck Cond: (i < 'abc'::text)
          ->  Bitmap Index Scan on idx_name
                Index Cond: (i < 'abc'::text)
 (6 rows)
 SELECT * FROM test_name WHERE i<'abc'::text ORDER BY i;
   i
 -----
  a
  ab
  abb
 (3 rows)
 SELECT * FROM test_name WHERE i<='abc'::text ORDER BY i;
   i
 -----
  a
  ab
  abb
  abc
 (4 rows)
 SELECT * FROM test_name WHERE i='abc'::text ORDER BY i;
   i
 -----
  abc
 (1 row)
 SELECT * FROM test_name WHERE i>='abc'::text ORDER BY i;
   i
 -----
  abc
  axy
  xyz
 (3 rows)
 SELECT * FROM test_name WHERE i>'abc'::text ORDER BY i;
   i
 -----
  axy
  xyz
 (2 rows)
 SELECT * FROM test_name WHERE i<=repeat('abc', 100) ORDER BY i;
   i
 -----
  a
  ab
  abb
  abc
 (4 rows)

50

contrib/btree_gin/expected/text.out

 @@ -42,3 +42,53 @@ SELECT * FROM test_text WHERE i>'abc' ORDER BY i;
  xyz
 (2 rows)
 explain (costs off)
 SELECT * FROM test_text WHERE i<'abc'::name COLLATE "default" ORDER BY i;
                           QUERY PLAN
 ---------------------------------------------------------------
  Sort
    Sort Key: i
    ->  Bitmap Heap Scan on test_text
          Recheck Cond: (i < 'abc'::name COLLATE "default")
          ->  Bitmap Index Scan on idx_text
                Index Cond: (i < 'abc'::name COLLATE "default")
 (6 rows)
 SELECT * FROM test_text WHERE i<'abc'::name COLLATE "default" ORDER BY i;
   i
 -----
  a
  ab
  abb
 (3 rows)
 SELECT * FROM test_text WHERE i<='abc'::name COLLATE "default" ORDER BY i;
   i
 -----
  a
  ab
  abb
  abc
 (4 rows)
 SELECT * FROM test_text WHERE i='abc'::name COLLATE "default" ORDER BY i;
   i
 -----
  abc
 (1 row)
 SELECT * FROM test_text WHERE i>='abc'::name COLLATE "default" ORDER BY i;
   i
 -----
  abc
  axy
  xyz
 (3 rows)
 SELECT * FROM test_text WHERE i>'abc'::name COLLATE "default" ORDER BY i;
   i
 -----
  axy
  xyz
 (2 rows)

306

contrib/btree_gin/expected/timestamp.out

 @@ -7,8 +7,8 @@ INSERT INTO test_timestamp VALUES
 	( '2004-10-26 04:55:08' ),
 	( '2004-10-26 05:55:08' ),
 	( '2004-10-26 08:55:08' ),
 	( '2004-10-26 09:55:08' ),
 	( '2004-10-26 10:55:08' )
 	( '2004-10-27 09:55:08' ),
 	( '2004-10-27 10:55:08' )
 ;
 CREATE INDEX idx_timestamp ON test_timestamp USING gin (i);
 SELECT * FROM test_timestamp WHERE i<'2004-10-26 08:55:08'::timestamp ORDER BY i;
 @@ -38,14 +38,308 @@ SELECT * FROM test_timestamp WHERE i>='2004-10-26 08:55:08'::timestamp ORDER BY
             i
 --------------------------
  Tue Oct 26 08:55:08 2004
  Tue Oct 26 09:55:08 2004
  Tue Oct 26 10:55:08 2004
  Wed Oct 27 09:55:08 2004
  Wed Oct 27 10:55:08 2004
 (3 rows)
 SELECT * FROM test_timestamp WHERE i>'2004-10-26 08:55:08'::timestamp ORDER BY i;
             i
 --------------------------
  Tue Oct 26 09:55:08 2004
  Tue Oct 26 10:55:08 2004
  Wed Oct 27 09:55:08 2004
  Wed Oct 27 10:55:08 2004
 (2 rows)
 explain (costs off)
 SELECT * FROM test_timestamp WHERE i<'2004-10-27'::date ORDER BY i;
                      QUERY PLAN
 ----------------------------------------------------
  Sort
    Sort Key: i
    ->  Bitmap Heap Scan on test_timestamp
          Recheck Cond: (i < '10-27-2004'::date)
          ->  Bitmap Index Scan on idx_timestamp
                Index Cond: (i < '10-27-2004'::date)
 (6 rows)
 SELECT * FROM test_timestamp WHERE i<'2004-10-27'::date ORDER BY i;
             i
 --------------------------
  Tue Oct 26 03:55:08 2004
  Tue Oct 26 04:55:08 2004
  Tue Oct 26 05:55:08 2004
  Tue Oct 26 08:55:08 2004
 (4 rows)
 SELECT * FROM test_timestamp WHERE i<='2004-10-27'::date ORDER BY i;
             i
 --------------------------
  Tue Oct 26 03:55:08 2004
  Tue Oct 26 04:55:08 2004
  Tue Oct 26 05:55:08 2004
  Tue Oct 26 08:55:08 2004
 (4 rows)
 SELECT * FROM test_timestamp WHERE i='2004-10-27'::date ORDER BY i;
  i
 ---
 (0 rows)
 SELECT * FROM test_timestamp WHERE i>='2004-10-27'::date ORDER BY i;
             i
 --------------------------
  Wed Oct 27 09:55:08 2004
  Wed Oct 27 10:55:08 2004
 (2 rows)
 SELECT * FROM test_timestamp WHERE i>'2004-10-27'::date ORDER BY i;
             i
 --------------------------
  Wed Oct 27 09:55:08 2004
  Wed Oct 27 10:55:08 2004
 (2 rows)
 explain (costs off)
 SELECT * FROM test_timestamp WHERE i<'2004-10-26 08:55:08'::timestamptz ORDER BY i;
                                         QUERY PLAN
 ------------------------------------------------------------------------------------------
  Sort
    Sort Key: i
    ->  Bitmap Heap Scan on test_timestamp
          Recheck Cond: (i < 'Tue Oct 26 08:55:08 2004 PDT'::timestamp with time zone)
          ->  Bitmap Index Scan on idx_timestamp
                Index Cond: (i < 'Tue Oct 26 08:55:08 2004 PDT'::timestamp with time zone)
 (6 rows)
 SELECT * FROM test_timestamp WHERE i<'2004-10-26 08:55:08'::timestamptz ORDER BY i;
             i
 --------------------------
  Tue Oct 26 03:55:08 2004
  Tue Oct 26 04:55:08 2004
  Tue Oct 26 05:55:08 2004
 (3 rows)
 SELECT * FROM test_timestamp WHERE i<='2004-10-26 08:55:08'::timestamptz ORDER BY i;
             i
 --------------------------
  Tue Oct 26 03:55:08 2004
  Tue Oct 26 04:55:08 2004
  Tue Oct 26 05:55:08 2004
  Tue Oct 26 08:55:08 2004
 (4 rows)
 SELECT * FROM test_timestamp WHERE i='2004-10-26 08:55:08'::timestamptz ORDER BY i;
             i
 --------------------------
  Tue Oct 26 08:55:08 2004
 (1 row)
 SELECT * FROM test_timestamp WHERE i>='2004-10-26 08:55:08'::timestamptz ORDER BY i;
             i
 --------------------------
  Tue Oct 26 08:55:08 2004
  Wed Oct 27 09:55:08 2004
  Wed Oct 27 10:55:08 2004
 (3 rows)
 SELECT * FROM test_timestamp WHERE i>'2004-10-26 08:55:08'::timestamptz ORDER BY i;
             i
 --------------------------
  Wed Oct 27 09:55:08 2004
  Wed Oct 27 10:55:08 2004
 (2 rows)
 -- Check endpoint and out-of-range cases
 INSERT INTO test_timestamp VALUES ('-infinity'), ('infinity');
 SELECT gin_clean_pending_list('idx_timestamp');
  gin_clean_pending_list
 ------------------------
 
 (1 row)
 SELECT * FROM test_timestamp WHERE i<'-infinity'::date ORDER BY i;
  i
 ---
 (0 rows)
 SELECT * FROM test_timestamp WHERE i<='-infinity'::date ORDER BY i;
      i
 -----------
  -infinity
 (1 row)
 SELECT * FROM test_timestamp WHERE i='-infinity'::date ORDER BY i;
      i
 -----------
  -infinity
 (1 row)
 SELECT * FROM test_timestamp WHERE i>='-infinity'::date ORDER BY i;
             i
 --------------------------
  -infinity
  Tue Oct 26 03:55:08 2004
  Tue Oct 26 04:55:08 2004
  Tue Oct 26 05:55:08 2004
  Tue Oct 26 08:55:08 2004
  Wed Oct 27 09:55:08 2004
  Wed Oct 27 10:55:08 2004
  infinity
 (8 rows)
 SELECT * FROM test_timestamp WHERE i>'-infinity'::date ORDER BY i;
             i
 --------------------------
  Tue Oct 26 03:55:08 2004
  Tue Oct 26 04:55:08 2004
  Tue Oct 26 05:55:08 2004
  Tue Oct 26 08:55:08 2004
  Wed Oct 27 09:55:08 2004
  Wed Oct 27 10:55:08 2004
  infinity
 (7 rows)
 SELECT * FROM test_timestamp WHERE i<'infinity'::date ORDER BY i;
             i
 --------------------------
  -infinity
  Tue Oct 26 03:55:08 2004
  Tue Oct 26 04:55:08 2004
  Tue Oct 26 05:55:08 2004
  Tue Oct 26 08:55:08 2004
  Wed Oct 27 09:55:08 2004
  Wed Oct 27 10:55:08 2004
 (7 rows)
 SELECT * FROM test_timestamp WHERE i<='infinity'::date ORDER BY i;
             i
 --------------------------
  -infinity
  Tue Oct 26 03:55:08 2004
  Tue Oct 26 04:55:08 2004
  Tue Oct 26 05:55:08 2004
  Tue Oct 26 08:55:08 2004
  Wed Oct 27 09:55:08 2004
  Wed Oct 27 10:55:08 2004
  infinity
 (8 rows)
 SELECT * FROM test_timestamp WHERE i='infinity'::date ORDER BY i;
     i
 ----------
  infinity
 (1 row)
 SELECT * FROM test_timestamp WHERE i>='infinity'::date ORDER BY i;
     i
 ----------
  infinity
 (1 row)
 SELECT * FROM test_timestamp WHERE i>'infinity'::date ORDER BY i;
  i
 ---
 (0 rows)
 SELECT * FROM test_timestamp WHERE i<'-infinity'::timestamptz ORDER BY i;
  i
 ---
 (0 rows)
 SELECT * FROM test_timestamp WHERE i<='-infinity'::timestamptz ORDER BY i;
      i
 -----------
  -infinity
 (1 row)
 SELECT * FROM test_timestamp WHERE i='-infinity'::timestamptz ORDER BY i;
      i
 -----------
  -infinity
 (1 row)
 SELECT * FROM test_timestamp WHERE i>='-infinity'::timestamptz ORDER BY i;
             i
 --------------------------
  -infinity
  Tue Oct 26 03:55:08 2004
  Tue Oct 26 04:55:08 2004
  Tue Oct 26 05:55:08 2004
  Tue Oct 26 08:55:08 2004
  Wed Oct 27 09:55:08 2004
  Wed Oct 27 10:55:08 2004
  infinity
 (8 rows)
 SELECT * FROM test_timestamp WHERE i>'-infinity'::timestamptz ORDER BY i;
             i
 --------------------------
  Tue Oct 26 03:55:08 2004
  Tue Oct 26 04:55:08 2004
  Tue Oct 26 05:55:08 2004
  Tue Oct 26 08:55:08 2004
  Wed Oct 27 09:55:08 2004
  Wed Oct 27 10:55:08 2004
  infinity
 (7 rows)
 SELECT * FROM test_timestamp WHERE i<'infinity'::timestamptz ORDER BY i;
             i
 --------------------------
  -infinity
  Tue Oct 26 03:55:08 2004
  Tue Oct 26 04:55:08 2004
  Tue Oct 26 05:55:08 2004
  Tue Oct 26 08:55:08 2004
  Wed Oct 27 09:55:08 2004
  Wed Oct 27 10:55:08 2004
 (7 rows)
 SELECT * FROM test_timestamp WHERE i<='infinity'::timestamptz ORDER BY i;
             i
 --------------------------
  -infinity
  Tue Oct 26 03:55:08 2004
  Tue Oct 26 04:55:08 2004
  Tue Oct 26 05:55:08 2004
  Tue Oct 26 08:55:08 2004
  Wed Oct 27 09:55:08 2004
  Wed Oct 27 10:55:08 2004
  infinity
 (8 rows)
 SELECT * FROM test_timestamp WHERE i='infinity'::timestamptz ORDER BY i;
     i
 ----------
  infinity
 (1 row)
 SELECT * FROM test_timestamp WHERE i>='infinity'::timestamptz ORDER BY i;
     i
 ----------
  infinity
 (1 row)
 SELECT * FROM test_timestamp WHERE i>'infinity'::timestamptz ORDER BY i;
  i
 ---
 (0 rows)
 -- This PST timestamptz will underflow if converted to timestamp
 SELECT * FROM test_timestamp WHERE i<='4714-11-23 17:00 BC'::timestamptz ORDER BY i;
      i
 -----------
  -infinity
 (1 row)
 SELECT * FROM test_timestamp WHERE i>'4714-11-23 17:00 BC'::timestamptz ORDER BY i;
             i
 --------------------------
  Tue Oct 26 03:55:08 2004
  Tue Oct 26 04:55:08 2004
  Tue Oct 26 05:55:08 2004
  Tue Oct 26 08:55:08 2004
  Wed Oct 27 09:55:08 2004
  Wed Oct 27 10:55:08 2004
  infinity
 (7 rows)

111

contrib/btree_gin/expected/timestamptz.out

 @@ -7,8 +7,8 @@ INSERT INTO test_timestamptz VALUES
 	( '2004-10-26 04:55:08' ),
 	( '2004-10-26 05:55:08' ),
 	( '2004-10-26 08:55:08' ),
 	( '2004-10-26 09:55:08' ),
 	( '2004-10-26 10:55:08' )
 	( '2004-10-27 09:55:08' ),
 	( '2004-10-27 10:55:08' )
 ;
 CREATE INDEX idx_timestamptz ON test_timestamptz USING gin (i);
 SELECT * FROM test_timestamptz WHERE i<'2004-10-26 08:55:08'::timestamptz ORDER BY i;
 @@ -38,14 +38,113 @@ SELECT * FROM test_timestamptz WHERE i>='2004-10-26 08:55:08'::timestamptz ORDER
               i
 ------------------------------
  Tue Oct 26 08:55:08 2004 PDT
  Tue Oct 26 09:55:08 2004 PDT
  Tue Oct 26 10:55:08 2004 PDT
  Wed Oct 27 09:55:08 2004 PDT
  Wed Oct 27 10:55:08 2004 PDT
 (3 rows)
 SELECT * FROM test_timestamptz WHERE i>'2004-10-26 08:55:08'::timestamptz ORDER BY i;
               i
 ------------------------------
  Tue Oct 26 09:55:08 2004 PDT
  Tue Oct 26 10:55:08 2004 PDT
  Wed Oct 27 09:55:08 2004 PDT
  Wed Oct 27 10:55:08 2004 PDT
 (2 rows)
 explain (costs off)
 SELECT * FROM test_timestamptz WHERE i<'2004-10-27'::date ORDER BY i;
                      QUERY PLAN
 ----------------------------------------------------
  Sort
    Sort Key: i
    ->  Bitmap Heap Scan on test_timestamptz
          Recheck Cond: (i < '10-27-2004'::date)
          ->  Bitmap Index Scan on idx_timestamptz
                Index Cond: (i < '10-27-2004'::date)
 (6 rows)
 SELECT * FROM test_timestamptz WHERE i<'2004-10-27'::date ORDER BY i;
               i
 ------------------------------
  Tue Oct 26 03:55:08 2004 PDT
  Tue Oct 26 04:55:08 2004 PDT
  Tue Oct 26 05:55:08 2004 PDT
  Tue Oct 26 08:55:08 2004 PDT
 (4 rows)
 SELECT * FROM test_timestamptz WHERE i<='2004-10-27'::date ORDER BY i;
               i
 ------------------------------
  Tue Oct 26 03:55:08 2004 PDT
  Tue Oct 26 04:55:08 2004 PDT
  Tue Oct 26 05:55:08 2004 PDT
  Tue Oct 26 08:55:08 2004 PDT
 (4 rows)
 SELECT * FROM test_timestamptz WHERE i='2004-10-27'::date ORDER BY i;
  i
 ---
 (0 rows)
 SELECT * FROM test_timestamptz WHERE i>='2004-10-27'::date ORDER BY i;
               i
 ------------------------------
  Wed Oct 27 09:55:08 2004 PDT
  Wed Oct 27 10:55:08 2004 PDT
 (2 rows)
 SELECT * FROM test_timestamptz WHERE i>'2004-10-27'::date ORDER BY i;
               i
 ------------------------------
  Wed Oct 27 09:55:08 2004 PDT
  Wed Oct 27 10:55:08 2004 PDT
 (2 rows)
 explain (costs off)
 SELECT * FROM test_timestamptz WHERE i<'2004-10-26 08:55:08'::timestamp ORDER BY i;
                                        QUERY PLAN
 -----------------------------------------------------------------------------------------
  Sort
    Sort Key: i
    ->  Bitmap Heap Scan on test_timestamptz
          Recheck Cond: (i < 'Tue Oct 26 08:55:08 2004'::timestamp without time zone)
          ->  Bitmap Index Scan on idx_timestamptz
                Index Cond: (i < 'Tue Oct 26 08:55:08 2004'::timestamp without time zone)
 (6 rows)
 SELECT * FROM test_timestamptz WHERE i<'2004-10-26 08:55:08'::timestamp ORDER BY i;
               i
 ------------------------------
  Tue Oct 26 03:55:08 2004 PDT
  Tue Oct 26 04:55:08 2004 PDT
  Tue Oct 26 05:55:08 2004 PDT
 (3 rows)
 SELECT * FROM test_timestamptz WHERE i<='2004-10-26 08:55:08'::timestamp ORDER BY i;
               i
 ------------------------------
  Tue Oct 26 03:55:08 2004 PDT
  Tue Oct 26 04:55:08 2004 PDT
  Tue Oct 26 05:55:08 2004 PDT
  Tue Oct 26 08:55:08 2004 PDT
 (4 rows)
 SELECT * FROM test_timestamptz WHERE i='2004-10-26 08:55:08'::timestamp ORDER BY i;
               i
 ------------------------------
  Tue Oct 26 08:55:08 2004 PDT
 (1 row)
 SELECT * FROM test_timestamptz WHERE i>='2004-10-26 08:55:08'::timestamp ORDER BY i;
               i
 ------------------------------
  Tue Oct 26 08:55:08 2004 PDT
  Wed Oct 27 09:55:08 2004 PDT
  Wed Oct 27 10:55:08 2004 PDT
 (3 rows)
 SELECT * FROM test_timestamptz WHERE i>'2004-10-26 08:55:08'::timestamp ORDER BY i;
               i
 ------------------------------
  Wed Oct 27 09:55:08 2004 PDT
  Wed Oct 27 10:55:08 2004 PDT
 (2 rows)

									
3

contrib/btree_gin/meson.build

 @@ -1,4 +1,4 @@
 # Copyright (c) 2022-2025, PostgreSQL Global Development Group
 # Copyright (c) 2022-2026, PostgreSQL Global Development Group
 btree_gin_sources = files(
   'btree_gin.c',
 @@ -22,6 +22,7 @@ install_data(
   'btree_gin--1.0--1.1.sql',
   'btree_gin--1.1--1.2.sql',
   'btree_gin--1.2--1.3.sql',
   'btree_gin--1.3--1.4.sql',
   kwargs: contrib_data_args,
 )

									
64

contrib/btree_gin/sql/date.sql

 @@ -20,3 +20,67 @@ SELECT * FROM test_date WHERE i<='2004-10-26'::date ORDER BY i;
 SELECT * FROM test_date WHERE i='2004-10-26'::date ORDER BY i;
 SELECT * FROM test_date WHERE i>='2004-10-26'::date ORDER BY i;
 SELECT * FROM test_date WHERE i>'2004-10-26'::date ORDER BY i;
 explain (costs off)
 SELECT * FROM test_date WHERE i<'2004-10-26'::timestamp ORDER BY i;
 SELECT * FROM test_date WHERE i<'2004-10-26'::timestamp ORDER BY i;
 SELECT * FROM test_date WHERE i<='2004-10-26'::timestamp ORDER BY i;
 SELECT * FROM test_date WHERE i='2004-10-26'::timestamp ORDER BY i;
 SELECT * FROM test_date WHERE i>='2004-10-26'::timestamp ORDER BY i;
 SELECT * FROM test_date WHERE i>'2004-10-26'::timestamp ORDER BY i;
 explain (costs off)
 SELECT * FROM test_date WHERE i<'2004-10-26'::timestamptz ORDER BY i;
 SELECT * FROM test_date WHERE i<'2004-10-26'::timestamptz ORDER BY i;
 SELECT * FROM test_date WHERE i<='2004-10-26'::timestamptz ORDER BY i;
 SELECT * FROM test_date WHERE i='2004-10-26'::timestamptz ORDER BY i;
 SELECT * FROM test_date WHERE i>='2004-10-26'::timestamptz ORDER BY i;
 SELECT * FROM test_date WHERE i>'2004-10-26'::timestamptz ORDER BY i;
 -- Check endpoint and out-of-range cases
 INSERT INTO test_date VALUES ('-infinity'), ('infinity');
 SELECT gin_clean_pending_list('idx_date');
 SELECT * FROM test_date WHERE i<'-infinity'::timestamp ORDER BY i;
 SELECT * FROM test_date WHERE i<='-infinity'::timestamp ORDER BY i;
 SELECT * FROM test_date WHERE i='-infinity'::timestamp ORDER BY i;
 SELECT * FROM test_date WHERE i>='-infinity'::timestamp ORDER BY i;
 SELECT * FROM test_date WHERE i>'-infinity'::timestamp ORDER BY i;
 SELECT * FROM test_date WHERE i<'infinity'::timestamp ORDER BY i;
 SELECT * FROM test_date WHERE i<='infinity'::timestamp ORDER BY i;
 SELECT * FROM test_date WHERE i='infinity'::timestamp ORDER BY i;
 SELECT * FROM test_date WHERE i>='infinity'::timestamp ORDER BY i;
 SELECT * FROM test_date WHERE i>'infinity'::timestamp ORDER BY i;
 SELECT * FROM test_date WHERE i<'-infinity'::timestamptz ORDER BY i;
 SELECT * FROM test_date WHERE i<='-infinity'::timestamptz ORDER BY i;
 SELECT * FROM test_date WHERE i='-infinity'::timestamptz ORDER BY i;
 SELECT * FROM test_date WHERE i>='-infinity'::timestamptz ORDER BY i;
 SELECT * FROM test_date WHERE i>'-infinity'::timestamptz ORDER BY i;
 SELECT * FROM test_date WHERE i<'infinity'::timestamptz ORDER BY i;
 SELECT * FROM test_date WHERE i<='infinity'::timestamptz ORDER BY i;
 SELECT * FROM test_date WHERE i='infinity'::timestamptz ORDER BY i;
 SELECT * FROM test_date WHERE i>='infinity'::timestamptz ORDER BY i;
 SELECT * FROM test_date WHERE i>'infinity'::timestamptz ORDER BY i;
 -- Check rounding cases
 -- '2004-10-25 00:00:01' rounds to '2004-10-25' for date.
 -- '2004-10-25 23:59:59' also rounds to '2004-10-25',
 -- so it's the same case as '2004-10-25 00:00:01'
 SELECT * FROM test_date WHERE i < '2004-10-25 00:00:01'::timestamp ORDER BY i;
 SELECT * FROM test_date WHERE i <= '2004-10-25 00:00:01'::timestamp ORDER BY i;
 SELECT * FROM test_date WHERE i = '2004-10-25 00:00:01'::timestamp ORDER BY i;
 SELECT * FROM test_date WHERE i > '2004-10-25 00:00:01'::timestamp ORDER BY i;
 SELECT * FROM test_date WHERE i >= '2004-10-25 00:00:01'::timestamp ORDER BY i;
 SELECT * FROM test_date WHERE i < '2004-10-25 00:00:01'::timestamptz ORDER BY i;
 SELECT * FROM test_date WHERE i <= '2004-10-25 00:00:01'::timestamptz ORDER BY i;
 SELECT * FROM test_date WHERE i = '2004-10-25 00:00:01'::timestamptz ORDER BY i;
 SELECT * FROM test_date WHERE i > '2004-10-25 00:00:01'::timestamptz ORDER BY i;
 SELECT * FROM test_date WHERE i >= '2004-10-25 00:00:01'::timestamptz ORDER BY i;

									
53

contrib/btree_gin/sql/float4.sql

 @@ -13,3 +13,56 @@ SELECT * FROM test_float4 WHERE i<=1::float4 ORDER BY i;
 SELECT * FROM test_float4 WHERE i=1::float4 ORDER BY i;
 SELECT * FROM test_float4 WHERE i>=1::float4 ORDER BY i;
 SELECT * FROM test_float4 WHERE i>1::float4 ORDER BY i;
 explain (costs off)
 SELECT * FROM test_float4 WHERE i<1::float8 ORDER BY i;
 SELECT * FROM test_float4 WHERE i<1::float8 ORDER BY i;
 SELECT * FROM test_float4 WHERE i<=1::float8 ORDER BY i;
 SELECT * FROM test_float4 WHERE i=1::float8 ORDER BY i;
 SELECT * FROM test_float4 WHERE i>=1::float8 ORDER BY i;
 SELECT * FROM test_float4 WHERE i>1::float8 ORDER BY i;
 -- Check endpoint and out-of-range cases
 INSERT INTO test_float4 VALUES ('NaN'), ('Inf'), ('-Inf');
 SELECT gin_clean_pending_list('idx_float4');
 SELECT * FROM test_float4 WHERE i<'-Inf'::float8 ORDER BY i;
 SELECT * FROM test_float4 WHERE i<='-Inf'::float8 ORDER BY i;
 SELECT * FROM test_float4 WHERE i='-Inf'::float8 ORDER BY i;
 SELECT * FROM test_float4 WHERE i>='-Inf'::float8 ORDER BY i;
 SELECT * FROM test_float4 WHERE i>'-Inf'::float8 ORDER BY i;
 SELECT * FROM test_float4 WHERE i<'Inf'::float8 ORDER BY i;
 SELECT * FROM test_float4 WHERE i<='Inf'::float8 ORDER BY i;
 SELECT * FROM test_float4 WHERE i='Inf'::float8 ORDER BY i;
 SELECT * FROM test_float4 WHERE i>='Inf'::float8 ORDER BY i;
 SELECT * FROM test_float4 WHERE i>'Inf'::float8 ORDER BY i;
 SELECT * FROM test_float4 WHERE i<'1e300'::float8 ORDER BY i;
 SELECT * FROM test_float4 WHERE i<='1e300'::float8 ORDER BY i;
 SELECT * FROM test_float4 WHERE i='1e300'::float8 ORDER BY i;
 SELECT * FROM test_float4 WHERE i>='1e300'::float8 ORDER BY i;
 SELECT * FROM test_float4 WHERE i>'1e300'::float8 ORDER BY i;
 SELECT * FROM test_float4 WHERE i<'NaN'::float8 ORDER BY i;
 SELECT * FROM test_float4 WHERE i<='NaN'::float8 ORDER BY i;
 SELECT * FROM test_float4 WHERE i='NaN'::float8 ORDER BY i;
 SELECT * FROM test_float4 WHERE i>='NaN'::float8 ORDER BY i;
 SELECT * FROM test_float4 WHERE i>'NaN'::float8 ORDER BY i;
 -- Check rounding cases
 -- 1e-300 rounds to 0 for float4 but not for float8
 SELECT * FROM test_float4 WHERE i < -1e-300::float8 ORDER BY i;
 SELECT * FROM test_float4 WHERE i <= -1e-300::float8 ORDER BY i;
 SELECT * FROM test_float4 WHERE i = -1e-300::float8 ORDER BY i;
 SELECT * FROM test_float4 WHERE i > -1e-300::float8 ORDER BY i;
 SELECT * FROM test_float4 WHERE i >= -1e-300::float8 ORDER BY i;
 SELECT * FROM test_float4 WHERE i < 1e-300::float8 ORDER BY i;
 SELECT * FROM test_float4 WHERE i <= 1e-300::float8 ORDER BY i;
 SELECT * FROM test_float4 WHERE i = 1e-300::float8 ORDER BY i;
 SELECT * FROM test_float4 WHERE i > 1e-300::float8 ORDER BY i;
 SELECT * FROM test_float4 WHERE i >= 1e-300::float8 ORDER BY i;

									
9

contrib/btree_gin/sql/float8.sql

 @@ -13,3 +13,12 @@ SELECT * FROM test_float8 WHERE i<=1::float8 ORDER BY i;
 SELECT * FROM test_float8 WHERE i=1::float8 ORDER BY i;
 SELECT * FROM test_float8 WHERE i>=1::float8 ORDER BY i;
 SELECT * FROM test_float8 WHERE i>1::float8 ORDER BY i;
 explain (costs off)
 SELECT * FROM test_float8 WHERE i<1::float4 ORDER BY i;
 SELECT * FROM test_float8 WHERE i<1::float4 ORDER BY i;
 SELECT * FROM test_float8 WHERE i<=1::float4 ORDER BY i;
 SELECT * FROM test_float8 WHERE i=1::float4 ORDER BY i;
 SELECT * FROM test_float8 WHERE i>=1::float4 ORDER BY i;
 SELECT * FROM test_float8 WHERE i>1::float4 ORDER BY i;

									
35

contrib/btree_gin/sql/int2.sql

 @@ -13,3 +13,38 @@ SELECT * FROM test_int2 WHERE i<=1::int2 ORDER BY i;
 SELECT * FROM test_int2 WHERE i=1::int2 ORDER BY i;
 SELECT * FROM test_int2 WHERE i>=1::int2 ORDER BY i;
 SELECT * FROM test_int2 WHERE i>1::int2 ORDER BY i;
 explain (costs off)
 SELECT * FROM test_int2 WHERE i<1::int4 ORDER BY i;
 SELECT * FROM test_int2 WHERE i<1::int4 ORDER BY i;
 SELECT * FROM test_int2 WHERE i<=1::int4 ORDER BY i;
 SELECT * FROM test_int2 WHERE i=1::int4 ORDER BY i;
 SELECT * FROM test_int2 WHERE i>=1::int4 ORDER BY i;
 SELECT * FROM test_int2 WHERE i>1::int4 ORDER BY i;
 explain (costs off)
 SELECT * FROM test_int2 WHERE i<1::int8 ORDER BY i;
 SELECT * FROM test_int2 WHERE i<1::int8 ORDER BY i;
 SELECT * FROM test_int2 WHERE i<=1::int8 ORDER BY i;
 SELECT * FROM test_int2 WHERE i=1::int8 ORDER BY i;
 SELECT * FROM test_int2 WHERE i>=1::int8 ORDER BY i;
 SELECT * FROM test_int2 WHERE i>1::int8 ORDER BY i;
 -- Check endpoint and out-of-range cases
 INSERT INTO test_int2 VALUES ((-32768)::int2),(32767);
 SELECT gin_clean_pending_list('idx_int2');
 SELECT * FROM test_int2 WHERE i<(-32769)::int4 ORDER BY i;
 SELECT * FROM test_int2 WHERE i<=(-32769)::int4 ORDER BY i;
 SELECT * FROM test_int2 WHERE i=(-32769)::int4 ORDER BY i;
 SELECT * FROM test_int2 WHERE i>=(-32769)::int4 ORDER BY i;
 SELECT * FROM test_int2 WHERE i>(-32769)::int4 ORDER BY i;
 SELECT * FROM test_int2 WHERE i<32768::int4 ORDER BY i;
 SELECT * FROM test_int2 WHERE i<=32768::int4 ORDER BY i;
 SELECT * FROM test_int2 WHERE i=32768::int4 ORDER BY i;
 SELECT * FROM test_int2 WHERE i>=32768::int4 ORDER BY i;
 SELECT * FROM test_int2 WHERE i>32768::int4 ORDER BY i;

									
18

contrib/btree_gin/sql/int4.sql

 @@ -13,3 +13,21 @@ SELECT * FROM test_int4 WHERE i<=1::int4 ORDER BY i;
 SELECT * FROM test_int4 WHERE i=1::int4 ORDER BY i;
 SELECT * FROM test_int4 WHERE i>=1::int4 ORDER BY i;
 SELECT * FROM test_int4 WHERE i>1::int4 ORDER BY i;
 explain (costs off)
 SELECT * FROM test_int4 WHERE i<1::int2 ORDER BY i;
 SELECT * FROM test_int4 WHERE i<1::int2 ORDER BY i;
 SELECT * FROM test_int4 WHERE i<=1::int2 ORDER BY i;
 SELECT * FROM test_int4 WHERE i=1::int2 ORDER BY i;
 SELECT * FROM test_int4 WHERE i>=1::int2 ORDER BY i;
 SELECT * FROM test_int4 WHERE i>1::int2 ORDER BY i;
 explain (costs off)
 SELECT * FROM test_int4 WHERE i<1::int8 ORDER BY i;
 SELECT * FROM test_int4 WHERE i<1::int8 ORDER BY i;
 SELECT * FROM test_int4 WHERE i<=1::int8 ORDER BY i;
 SELECT * FROM test_int4 WHERE i=1::int8 ORDER BY i;
 SELECT * FROM test_int4 WHERE i>=1::int8 ORDER BY i;
 SELECT * FROM test_int4 WHERE i>1::int8 ORDER BY i;

									
18

contrib/btree_gin/sql/int8.sql

 @@ -13,3 +13,21 @@ SELECT * FROM test_int8 WHERE i<=1::int8 ORDER BY i;
 SELECT * FROM test_int8 WHERE i=1::int8 ORDER BY i;
 SELECT * FROM test_int8 WHERE i>=1::int8 ORDER BY i;
 SELECT * FROM test_int8 WHERE i>1::int8 ORDER BY i;
 explain (costs off)
 SELECT * FROM test_int8 WHERE i<1::int2 ORDER BY i;
 SELECT * FROM test_int8 WHERE i<1::int2 ORDER BY i;
 SELECT * FROM test_int8 WHERE i<=1::int2 ORDER BY i;
 SELECT * FROM test_int8 WHERE i=1::int2 ORDER BY i;
 SELECT * FROM test_int8 WHERE i>=1::int2 ORDER BY i;
 SELECT * FROM test_int8 WHERE i>1::int2 ORDER BY i;
 explain (costs off)
 SELECT * FROM test_int8 WHERE i<1::int4 ORDER BY i;
 SELECT * FROM test_int8 WHERE i<1::int4 ORDER BY i;
 SELECT * FROM test_int8 WHERE i<=1::int4 ORDER BY i;
 SELECT * FROM test_int8 WHERE i=1::int4 ORDER BY i;
 SELECT * FROM test_int8 WHERE i>=1::int4 ORDER BY i;
 SELECT * FROM test_int8 WHERE i>1::int4 ORDER BY i;

									
11

contrib/btree_gin/sql/name.sql

 @@ -19,3 +19,14 @@ EXPLAIN (COSTS OFF) SELECT * FROM test_name WHERE i<='abc' ORDER BY i;
 EXPLAIN (COSTS OFF) SELECT * FROM test_name WHERE i='abc' ORDER BY i;
 EXPLAIN (COSTS OFF) SELECT * FROM test_name WHERE i>='abc' ORDER BY i;
 EXPLAIN (COSTS OFF) SELECT * FROM test_name WHERE i>'abc' ORDER BY i;
 explain (costs off)
 SELECT * FROM test_name WHERE i<'abc'::text ORDER BY i;
 SELECT * FROM test_name WHERE i<'abc'::text ORDER BY i;
 SELECT * FROM test_name WHERE i<='abc'::text ORDER BY i;
 SELECT * FROM test_name WHERE i='abc'::text ORDER BY i;
 SELECT * FROM test_name WHERE i>='abc'::text ORDER BY i;
 SELECT * FROM test_name WHERE i>'abc'::text ORDER BY i;
 SELECT * FROM test_name WHERE i<=repeat('abc', 100) ORDER BY i;

									
9

contrib/btree_gin/sql/text.sql

 @@ -13,3 +13,12 @@ SELECT * FROM test_text WHERE i<='abc' ORDER BY i;
 SELECT * FROM test_text WHERE i='abc' ORDER BY i;
 SELECT * FROM test_text WHERE i>='abc' ORDER BY i;
 SELECT * FROM test_text WHERE i>'abc' ORDER BY i;
 explain (costs off)
 SELECT * FROM test_text WHERE i<'abc'::name COLLATE "default" ORDER BY i;
 SELECT * FROM test_text WHERE i<'abc'::name COLLATE "default" ORDER BY i;
 SELECT * FROM test_text WHERE i<='abc'::name COLLATE "default" ORDER BY i;
 SELECT * FROM test_text WHERE i='abc'::name COLLATE "default" ORDER BY i;
 SELECT * FROM test_text WHERE i>='abc'::name COLLATE "default" ORDER BY i;
 SELECT * FROM test_text WHERE i>'abc'::name COLLATE "default" ORDER BY i;

									
55

contrib/btree_gin/sql/timestamp.sql

 @@ -9,8 +9,8 @@ INSERT INTO test_timestamp VALUES
 	( '2004-10-26 04:55:08' ),
 	( '2004-10-26 05:55:08' ),
 	( '2004-10-26 08:55:08' ),
 	( '2004-10-26 09:55:08' ),
 	( '2004-10-26 10:55:08' )
 	( '2004-10-27 09:55:08' ),
 	( '2004-10-27 10:55:08' )
 ;
 CREATE INDEX idx_timestamp ON test_timestamp USING gin (i);
 @@ -20,3 +20,54 @@ SELECT * FROM test_timestamp WHERE i<='2004-10-26 08:55:08'::timestamp ORDER BY
 SELECT * FROM test_timestamp WHERE i='2004-10-26 08:55:08'::timestamp ORDER BY i;
 SELECT * FROM test_timestamp WHERE i>='2004-10-26 08:55:08'::timestamp ORDER BY i;
 SELECT * FROM test_timestamp WHERE i>'2004-10-26 08:55:08'::timestamp ORDER BY i;
 explain (costs off)
 SELECT * FROM test_timestamp WHERE i<'2004-10-27'::date ORDER BY i;
 SELECT * FROM test_timestamp WHERE i<'2004-10-27'::date ORDER BY i;
 SELECT * FROM test_timestamp WHERE i<='2004-10-27'::date ORDER BY i;
 SELECT * FROM test_timestamp WHERE i='2004-10-27'::date ORDER BY i;
 SELECT * FROM test_timestamp WHERE i>='2004-10-27'::date ORDER BY i;
 SELECT * FROM test_timestamp WHERE i>'2004-10-27'::date ORDER BY i;
 explain (costs off)
 SELECT * FROM test_timestamp WHERE i<'2004-10-26 08:55:08'::timestamptz ORDER BY i;
 SELECT * FROM test_timestamp WHERE i<'2004-10-26 08:55:08'::timestamptz ORDER BY i;
 SELECT * FROM test_timestamp WHERE i<='2004-10-26 08:55:08'::timestamptz ORDER BY i;
 SELECT * FROM test_timestamp WHERE i='2004-10-26 08:55:08'::timestamptz ORDER BY i;
 SELECT * FROM test_timestamp WHERE i>='2004-10-26 08:55:08'::timestamptz ORDER BY i;
 SELECT * FROM test_timestamp WHERE i>'2004-10-26 08:55:08'::timestamptz ORDER BY i;
 -- Check endpoint and out-of-range cases
 INSERT INTO test_timestamp VALUES ('-infinity'), ('infinity');
 SELECT gin_clean_pending_list('idx_timestamp');
 SELECT * FROM test_timestamp WHERE i<'-infinity'::date ORDER BY i;
 SELECT * FROM test_timestamp WHERE i<='-infinity'::date ORDER BY i;
 SELECT * FROM test_timestamp WHERE i='-infinity'::date ORDER BY i;
 SELECT * FROM test_timestamp WHERE i>='-infinity'::date ORDER BY i;
 SELECT * FROM test_timestamp WHERE i>'-infinity'::date ORDER BY i;
 SELECT * FROM test_timestamp WHERE i<'infinity'::date ORDER BY i;
 SELECT * FROM test_timestamp WHERE i<='infinity'::date ORDER BY i;
 SELECT * FROM test_timestamp WHERE i='infinity'::date ORDER BY i;
 SELECT * FROM test_timestamp WHERE i>='infinity'::date ORDER BY i;
 SELECT * FROM test_timestamp WHERE i>'infinity'::date ORDER BY i;
 SELECT * FROM test_timestamp WHERE i<'-infinity'::timestamptz ORDER BY i;
 SELECT * FROM test_timestamp WHERE i<='-infinity'::timestamptz ORDER BY i;
 SELECT * FROM test_timestamp WHERE i='-infinity'::timestamptz ORDER BY i;
 SELECT * FROM test_timestamp WHERE i>='-infinity'::timestamptz ORDER BY i;
 SELECT * FROM test_timestamp WHERE i>'-infinity'::timestamptz ORDER BY i;
 SELECT * FROM test_timestamp WHERE i<'infinity'::timestamptz ORDER BY i;
 SELECT * FROM test_timestamp WHERE i<='infinity'::timestamptz ORDER BY i;
 SELECT * FROM test_timestamp WHERE i='infinity'::timestamptz ORDER BY i;
 SELECT * FROM test_timestamp WHERE i>='infinity'::timestamptz ORDER BY i;
 SELECT * FROM test_timestamp WHERE i>'infinity'::timestamptz ORDER BY i;
 -- This PST timestamptz will underflow if converted to timestamp
 SELECT * FROM test_timestamp WHERE i<='4714-11-23 17:00 BC'::timestamptz ORDER BY i;
 SELECT * FROM test_timestamp WHERE i>'4714-11-23 17:00 BC'::timestamptz ORDER BY i;

									
22

contrib/btree_gin/sql/timestamptz.sql

 @@ -9,8 +9,8 @@ INSERT INTO test_timestamptz VALUES
 	( '2004-10-26 04:55:08' ),
 	( '2004-10-26 05:55:08' ),
 	( '2004-10-26 08:55:08' ),
 	( '2004-10-26 09:55:08' ),
 	( '2004-10-26 10:55:08' )
 	( '2004-10-27 09:55:08' ),
 	( '2004-10-27 10:55:08' )
 ;
 CREATE INDEX idx_timestamptz ON test_timestamptz USING gin (i);
 @@ -20,3 +20,21 @@ SELECT * FROM test_timestamptz WHERE i<='2004-10-26 08:55:08'::timestamptz ORDER
 SELECT * FROM test_timestamptz WHERE i='2004-10-26 08:55:08'::timestamptz ORDER BY i;
 SELECT * FROM test_timestamptz WHERE i>='2004-10-26 08:55:08'::timestamptz ORDER BY i;
 SELECT * FROM test_timestamptz WHERE i>'2004-10-26 08:55:08'::timestamptz ORDER BY i;
 explain (costs off)
 SELECT * FROM test_timestamptz WHERE i<'2004-10-27'::date ORDER BY i;
 SELECT * FROM test_timestamptz WHERE i<'2004-10-27'::date ORDER BY i;
 SELECT * FROM test_timestamptz WHERE i<='2004-10-27'::date ORDER BY i;
 SELECT * FROM test_timestamptz WHERE i='2004-10-27'::date ORDER BY i;
 SELECT * FROM test_timestamptz WHERE i>='2004-10-27'::date ORDER BY i;
 SELECT * FROM test_timestamptz WHERE i>'2004-10-27'::date ORDER BY i;
 explain (costs off)
 SELECT * FROM test_timestamptz WHERE i<'2004-10-26 08:55:08'::timestamp ORDER BY i;
 SELECT * FROM test_timestamptz WHERE i<'2004-10-26 08:55:08'::timestamp ORDER BY i;
 SELECT * FROM test_timestamptz WHERE i<='2004-10-26 08:55:08'::timestamp ORDER BY i;
 SELECT * FROM test_timestamptz WHERE i='2004-10-26 08:55:08'::timestamp ORDER BY i;
 SELECT * FROM test_timestamptz WHERE i>='2004-10-26 08:55:08'::timestamp ORDER BY i;
 SELECT * FROM test_timestamptz WHERE i>'2004-10-26 08:55:08'::timestamp ORDER BY i;

									
										5

contrib/btree_gist/Makefile
									
				@ -31,10 +31,11 @@ OBJS =  \

				EXTENSION = btree_gist

				DATA = btree_gist--1.0--1.1.sql \

				       btree_gist--1.1--1.2.sql btree_gist--1.2.sql btree_gist--1.2--1.3.sql \

				       btree_gist--1.1--1.2.sql btree_gist--1.2--1.3.sql \

				       btree_gist--1.3--1.4.sql btree_gist--1.4--1.5.sql \

				       btree_gist--1.5--1.6.sql btree_gist--1.6--1.7.sql \

				       btree_gist--1.7--1.8.sql btree_gist--1.8--1.9.sql

				       btree_gist--1.7--1.8.sql btree_gist--1.8--1.9.sql \

				       btree_gist--1.9.sql

				PGFILEDESC = "btree_gist - B-tree equivalent GiST operator classes"

				REGRESS = init int2 int4 int8 float4 float8 cash oid timestamp timestamptz \

									
										6

contrib/btree_gist/btree_bit.c
									
				@ -8,6 +8,7 @@

				#include "utils/fmgrprotos.h"

				#include "utils/sortsupport.h"

				#include "utils/varbit.h"

				#include "varatt.h"

				/* GiST support functions */

				PG_FUNCTION_INFO_V1(gbt_bit_compress);

				@ -138,8 +139,9 @@ gbt_bit_consistent(PG_FUNCTION_ARGS)

					GISTENTRY  *entry = (GISTENTRY *) PG_GETARG_POINTER(0);

					void	   *query = DatumGetByteaP(PG_GETARG_DATUM(1));

					StrategyNumber strategy = (StrategyNumber) PG_GETARG_UINT16(2);

					/* Oid		subtype = PG_GETARG_OID(3); */

				#ifdef NOT_USED

					Oid			subtype = PG_GETARG_OID(3);

				#endif

					bool	   *recheck = (bool *) PG_GETARG_POINTER(4);

					bool		retval;

					GBT_VARKEY *key = (GBT_VARKEY *) DatumGetPointer(entry->key);
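
Review note on the hunk above: the commented-out `subtype` declaration appears to be replaced by a real declaration wrapped in `#ifdef NOT_USED`, a pattern repeated in the other `*_consistent` and `*_distance` functions in this diff. A minimal sketch of the pattern, assuming `NOT_USED` is never defined in the build; `demo_consistent` and its parameters are hypothetical and not part of btree_gist:

```c
#include <assert.h>

/*
 * Hypothetical function showing the "#ifdef NOT_USED" idiom: the declaration
 * stays in the source as real, syntax-checked-when-enabled code, but it is
 * compiled out as long as NOT_USED is not defined.
 */
static int
demo_consistent(int query, int strategy, int subtype_arg)
{
#ifdef NOT_USED
	int			subtype = subtype_arg;	/* kept for documentation only */
#endif

	(void) subtype_arg;			/* silence unused-parameter warnings */
	return query + strategy;
}
```

Compared with a comment, the guarded declaration can be re-enabled by defining the macro, so the compiler can still check it when needed.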

									
										6

contrib/btree_gist/btree_bool.c
									
				@ -5,6 +5,7 @@

				#include "btree_gist.h"

				#include "btree_utils_num.h"

				#include "utils/rel.h"

				#include "utils/sortsupport.h"

				typedef struct boolkey

				@ -107,8 +108,9 @@ gbt_bool_consistent(PG_FUNCTION_ARGS)

					GISTENTRY  *entry = (GISTENTRY *) PG_GETARG_POINTER(0);

					bool		query = PG_GETARG_INT16(1);

					StrategyNumber strategy = (StrategyNumber) PG_GETARG_UINT16(2);

					/* Oid		subtype = PG_GETARG_OID(3); */

				#ifdef NOT_USED

					Oid			subtype = PG_GETARG_OID(3);

				#endif

					bool	   *recheck = (bool *) PG_GETARG_POINTER(4);

					boolKEY    *kkk = (boolKEY *) DatumGetPointer(entry->key);

					GBT_NUMKEY_R key;

									
										5

contrib/btree_gist/btree_bytea.c
									
				@ -101,8 +101,9 @@ gbt_bytea_consistent(PG_FUNCTION_ARGS)

					GISTENTRY  *entry = (GISTENTRY *) PG_GETARG_POINTER(0);

					void	   *query = DatumGetByteaP(PG_GETARG_DATUM(1));

					StrategyNumber strategy = (StrategyNumber) PG_GETARG_UINT16(2);

					/* Oid		subtype = PG_GETARG_OID(3); */

				#ifdef NOT_USED

					Oid			subtype = PG_GETARG_OID(3);

				#endif

					bool	   *recheck = (bool *) PG_GETARG_POINTER(4);

					bool		retval;

					GBT_VARKEY *key = (GBT_VARKEY *) DatumGetPointer(entry->key);

									
										11

contrib/btree_gist/btree_cash.c
									
				@ -7,6 +7,7 @@

				#include "btree_utils_num.h"

				#include "common/int.h"

				#include "utils/cash.h"

				#include "utils/rel.h"

				#include "utils/sortsupport.h"

				typedef struct

				@ -138,8 +139,9 @@ gbt_cash_consistent(PG_FUNCTION_ARGS)

					GISTENTRY  *entry = (GISTENTRY *) PG_GETARG_POINTER(0);

					Cash		query = PG_GETARG_CASH(1);

					StrategyNumber strategy = (StrategyNumber) PG_GETARG_UINT16(2);

					/* Oid		subtype = PG_GETARG_OID(3); */

				#ifdef NOT_USED

					Oid			subtype = PG_GETARG_OID(3);

				#endif

					bool	   *recheck = (bool *) PG_GETARG_POINTER(4);

					cashKEY    *kkk = (cashKEY *) DatumGetPointer(entry->key);

					GBT_NUMKEY_R key;

				@ -160,8 +162,9 @@ gbt_cash_distance(PG_FUNCTION_ARGS)

				{

					GISTENTRY  *entry = (GISTENTRY *) PG_GETARG_POINTER(0);

					Cash		query = PG_GETARG_CASH(1);

					/* Oid		subtype = PG_GETARG_OID(3); */

				#ifdef NOT_USED

					Oid			subtype = PG_GETARG_OID(3);

				#endif

					cashKEY    *kkk = (cashKEY *) DatumGetPointer(entry->key);

					GBT_NUMKEY_R key;

									
										11

contrib/btree_gist/btree_date.c
									
				@ -7,6 +7,7 @@

				#include "btree_utils_num.h"

				#include "utils/fmgrprotos.h"

				#include "utils/date.h"

				#include "utils/rel.h"

				#include "utils/sortsupport.h"

				typedef struct

				@ -153,8 +154,9 @@ gbt_date_consistent(PG_FUNCTION_ARGS)

					GISTENTRY  *entry = (GISTENTRY *) PG_GETARG_POINTER(0);

					DateADT		query = PG_GETARG_DATEADT(1);

					StrategyNumber strategy = (StrategyNumber) PG_GETARG_UINT16(2);

					/* Oid		subtype = PG_GETARG_OID(3); */

				#ifdef NOT_USED

					Oid			subtype = PG_GETARG_OID(3);

				#endif

					bool	   *recheck = (bool *) PG_GETARG_POINTER(4);

					dateKEY    *kkk = (dateKEY *) DatumGetPointer(entry->key);

					GBT_NUMKEY_R key;

				@ -175,8 +177,9 @@ gbt_date_distance(PG_FUNCTION_ARGS)

				{

					GISTENTRY  *entry = (GISTENTRY *) PG_GETARG_POINTER(0);

					DateADT		query = PG_GETARG_DATEADT(1);

					/* Oid		subtype = PG_GETARG_OID(3); */

				#ifdef NOT_USED

					Oid			subtype = PG_GETARG_OID(3);

				#endif

					dateKEY    *kkk = (dateKEY *) DatumGetPointer(entry->key);

					GBT_NUMKEY_R key;

									
										10

contrib/btree_gist/btree_enum.c
									
				@ -8,6 +8,7 @@

				#include "fmgr.h"

				#include "utils/fmgrprotos.h"

				#include "utils/fmgroids.h"

				#include "utils/rel.h"

				#include "utils/sortsupport.h"

				/* enums are really Oids, so we just use the same structure */

				@ -125,8 +126,9 @@ gbt_enum_consistent(PG_FUNCTION_ARGS)

					GISTENTRY  *entry = (GISTENTRY *) PG_GETARG_POINTER(0);

					Oid			query = PG_GETARG_OID(1);

					StrategyNumber strategy = (StrategyNumber) PG_GETARG_UINT16(2);

					/* Oid		subtype = PG_GETARG_OID(3); */

				#ifdef NOT_USED

					Oid			subtype = PG_GETARG_OID(3);

				#endif

					bool	   *recheck = (bool *) PG_GETARG_POINTER(4);

					oidKEY	   *kkk = (oidKEY *) DatumGetPointer(entry->key);

					GBT_NUMKEY_R key;

				@ -193,8 +195,8 @@ gbt_enum_ssup_cmp(Datum x, Datum y, SortSupport ssup)

					return DatumGetInt32(CallerFInfoFunctionCall2(enum_cmp,

																  ssup->ssup_extra,

																  InvalidOid,

																  arg1->lower,

																  arg2->lower));

																  ObjectIdGetDatum(arg1->lower),

																  ObjectIdGetDatum(arg2->lower)));

				}

				Datum
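
Review note on the `gbt_enum_ssup_cmp` hunk above: the change appears to wrap the `Oid` values in `ObjectIdGetDatum()` before passing them to a function that takes `Datum` arguments. The sketch below illustrates why the explicit conversion is the safer form. The typedefs and helpers are simplified local stand-ins (an assumption, not the real PostgreSQL definitions), and `cmp_oid_datums` is a hypothetical comparator that orders raw Oid values, which is not what `enum_cmp` does:

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-ins for PostgreSQL's Oid and Datum (assumption). */
typedef uint32_t Oid;
typedef uintptr_t Datum;

/* Local stand-ins shaped like ObjectIdGetDatum() / DatumGetObjectId(). */
static inline Datum
ObjectIdGetDatum(Oid x)
{
	return (Datum) x;
}

static inline Oid
DatumGetObjectId(Datum d)
{
	return (Oid) d;
}

/* Hypothetical comparator with Datum arguments; compares raw Oid values. */
static int
cmp_oid_datums(Datum a, Datum b)
{
	Oid			oa = DatumGetObjectId(a);
	Oid			ob = DatumGetObjectId(b);

	return (oa > ob) - (oa < ob);
}
```

Going through the conversion helper at every call site keeps the code correct even if the representation of `Datum` changes, whereas passing a bare `Oid` relies on an implicit integer conversion.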

									
										11

contrib/btree_gist/btree_float4.c
									
				@ -6,6 +6,7 @@

				#include "btree_gist.h"

				#include "btree_utils_num.h"

				#include "utils/float.h"

				#include "utils/rel.h"

				#include "utils/sortsupport.h"

				typedef struct float4key

				@ -132,8 +133,9 @@ gbt_float4_consistent(PG_FUNCTION_ARGS)

					GISTENTRY  *entry = (GISTENTRY *) PG_GETARG_POINTER(0);

					float4		query = PG_GETARG_FLOAT4(1);

					StrategyNumber strategy = (StrategyNumber) PG_GETARG_UINT16(2);

					/* Oid		subtype = PG_GETARG_OID(3); */

				#ifdef NOT_USED

					Oid			subtype = PG_GETARG_OID(3);

				#endif

					bool	   *recheck = (bool *) PG_GETARG_POINTER(4);

					float4KEY  *kkk = (float4KEY *) DatumGetPointer(entry->key);

					GBT_NUMKEY_R key;

				@ -154,8 +156,9 @@ gbt_float4_distance(PG_FUNCTION_ARGS)

				{

					GISTENTRY  *entry = (GISTENTRY *) PG_GETARG_POINTER(0);

					float4		query = PG_GETARG_FLOAT4(1);

					/* Oid		subtype = PG_GETARG_OID(3); */

				#ifdef NOT_USED

					Oid			subtype = PG_GETARG_OID(3);

				#endif

					float4KEY  *kkk = (float4KEY *) DatumGetPointer(entry->key);

					GBT_NUMKEY_R key;

									
										11

contrib/btree_gist/btree_float8.c
									
				@ -6,6 +6,7 @@

				#include "btree_gist.h"

				#include "btree_utils_num.h"

				#include "utils/float.h"

				#include "utils/rel.h"

				#include "utils/sortsupport.h"

				typedef struct float8key

				@ -140,8 +141,9 @@ gbt_float8_consistent(PG_FUNCTION_ARGS)

					GISTENTRY  *entry = (GISTENTRY *) PG_GETARG_POINTER(0);

					float8		query = PG_GETARG_FLOAT8(1);

					StrategyNumber strategy = (StrategyNumber) PG_GETARG_UINT16(2);

					/* Oid		subtype = PG_GETARG_OID(3); */

				#ifdef NOT_USED

					Oid			subtype = PG_GETARG_OID(3);

				#endif

					bool	   *recheck = (bool *) PG_GETARG_POINTER(4);

					float8KEY  *kkk = (float8KEY *) DatumGetPointer(entry->key);

					GBT_NUMKEY_R key;

				@ -162,8 +164,9 @@ gbt_float8_distance(PG_FUNCTION_ARGS)

				{

					GISTENTRY  *entry = (GISTENTRY *) PG_GETARG_POINTER(0);

					float8		query = PG_GETARG_FLOAT8(1);

					/* Oid		subtype = PG_GETARG_OID(3); */

				#ifdef NOT_USED

					Oid			subtype = PG_GETARG_OID(3);

				#endif

					float8KEY  *kkk = (float8KEY *) DatumGetPointer(entry->key);

					GBT_NUMKEY_R key;

									
										197

contrib/btree_gist/btree_gist--1.7--1.8.sql
									
				@ -3,6 +3,203 @@

				-- complain if script is sourced in psql, rather than via CREATE EXTENSION

				\echo Use "ALTER EXTENSION btree_gist UPDATE TO '1.8'" to load this file. \quit

				-- Add sortsupport functions

				CREATE FUNCTION gbt_bit_sortsupport(internal)

				RETURNS void

				AS 'MODULE_PATHNAME'

				LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;

				CREATE FUNCTION gbt_varbit_sortsupport(internal)

				RETURNS void

				AS 'MODULE_PATHNAME'

				LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;

				CREATE FUNCTION gbt_bool_sortsupport(internal)

				RETURNS void

				AS 'MODULE_PATHNAME'

				LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;

				CREATE FUNCTION gbt_bytea_sortsupport(internal)

				RETURNS void

				AS 'MODULE_PATHNAME'

				LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;

				CREATE FUNCTION gbt_cash_sortsupport(internal)

				RETURNS void

				AS 'MODULE_PATHNAME'

				LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;

				CREATE FUNCTION gbt_date_sortsupport(internal)

				RETURNS void

				AS 'MODULE_PATHNAME'

				LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;

				CREATE FUNCTION gbt_enum_sortsupport(internal)

				RETURNS void

				AS 'MODULE_PATHNAME'

				LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;

				CREATE FUNCTION gbt_float4_sortsupport(internal)

				RETURNS void

				AS 'MODULE_PATHNAME'

				LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;

				CREATE FUNCTION gbt_float8_sortsupport(internal)

				RETURNS void

				AS 'MODULE_PATHNAME'

				LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;

				CREATE FUNCTION gbt_inet_sortsupport(internal)

				RETURNS void

				AS 'MODULE_PATHNAME'

				LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;

				CREATE FUNCTION gbt_int2_sortsupport(internal)

				RETURNS void

				AS 'MODULE_PATHNAME'

				LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;

				CREATE FUNCTION gbt_int4_sortsupport(internal)

				RETURNS void

				AS 'MODULE_PATHNAME'

				LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;

				CREATE FUNCTION gbt_int8_sortsupport(internal)

				RETURNS void

				AS 'MODULE_PATHNAME'

				LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;

				CREATE FUNCTION gbt_intv_sortsupport(internal)

				RETURNS void

				AS 'MODULE_PATHNAME'

				LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;

				CREATE FUNCTION gbt_macaddr_sortsupport(internal)

				RETURNS void

				AS 'MODULE_PATHNAME'

				LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;

				CREATE FUNCTION gbt_macad8_sortsupport(internal)

				RETURNS void

				AS 'MODULE_PATHNAME'

				LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;

				CREATE FUNCTION gbt_numeric_sortsupport(internal)

				RETURNS void

				AS 'MODULE_PATHNAME'

				LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;

				CREATE FUNCTION gbt_oid_sortsupport(internal)

				RETURNS void

				AS 'MODULE_PATHNAME'

				LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;

				CREATE FUNCTION gbt_text_sortsupport(internal)

				RETURNS void

				AS 'MODULE_PATHNAME'

				LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;

				CREATE FUNCTION gbt_bpchar_sortsupport(internal)

				RETURNS void

				AS 'MODULE_PATHNAME'

				LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;

				CREATE FUNCTION gbt_time_sortsupport(internal)

				RETURNS void

				AS 'MODULE_PATHNAME'

				LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;

				CREATE FUNCTION gbt_ts_sortsupport(internal)

				RETURNS void

				AS 'MODULE_PATHNAME'

				LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;

				CREATE FUNCTION gbt_uuid_sortsupport(internal)

				RETURNS void

				AS 'MODULE_PATHNAME'

				LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;

				ALTER OPERATOR FAMILY gist_bit_ops USING gist ADD

					FUNCTION	11  (bit, bit) gbt_bit_sortsupport (internal) ;

				ALTER OPERATOR FAMILY gist_vbit_ops USING gist ADD

					FUNCTION	11  (varbit, varbit) gbt_varbit_sortsupport (internal) ;

				ALTER OPERATOR FAMILY gist_bool_ops USING gist ADD

					FUNCTION	11  (bool, bool) gbt_bool_sortsupport (internal) ;

				ALTER OPERATOR FAMILY gist_bytea_ops USING gist ADD

					FUNCTION	11  (bytea, bytea) gbt_bytea_sortsupport (internal) ;

				ALTER OPERATOR FAMILY gist_cash_ops USING gist ADD

					FUNCTION	11  (money, money) gbt_cash_sortsupport (internal) ;

				ALTER OPERATOR FAMILY gist_date_ops USING gist ADD

					FUNCTION	11  (date, date) gbt_date_sortsupport (internal) ;

				ALTER OPERATOR FAMILY gist_enum_ops USING gist ADD

					FUNCTION	11  (anyenum, anyenum) gbt_enum_sortsupport (internal) ;

				ALTER OPERATOR FAMILY gist_float4_ops USING gist ADD

					FUNCTION	11  (float4, float4) gbt_float4_sortsupport (internal) ;

				ALTER OPERATOR FAMILY gist_float8_ops USING gist ADD

					FUNCTION	11  (float8, float8) gbt_float8_sortsupport (internal) ;

				ALTER OPERATOR FAMILY gist_inet_ops USING gist ADD

					FUNCTION	11  (inet, inet) gbt_inet_sortsupport (internal) ;

				ALTER OPERATOR FAMILY gist_cidr_ops USING gist ADD

					FUNCTION	11  (cidr, cidr) gbt_inet_sortsupport (internal) ;

				ALTER OPERATOR FAMILY gist_int2_ops USING gist ADD

					FUNCTION	11  (int2, int2) gbt_int2_sortsupport (internal) ;

				ALTER OPERATOR FAMILY gist_int4_ops USING gist ADD

					FUNCTION	11  (int4, int4) gbt_int4_sortsupport (internal) ;

				ALTER OPERATOR FAMILY gist_int8_ops USING gist ADD

					FUNCTION	11  (int8, int8) gbt_int8_sortsupport (internal) ;

				ALTER OPERATOR FAMILY gist_interval_ops USING gist ADD

					FUNCTION	11  (interval, interval) gbt_intv_sortsupport (internal) ;

				ALTER OPERATOR FAMILY gist_macaddr_ops USING gist ADD

					FUNCTION	11  (macaddr, macaddr) gbt_macaddr_sortsupport (internal) ;

				ALTER OPERATOR FAMILY gist_macaddr8_ops USING gist ADD

					FUNCTION	11  (macaddr8, macaddr8) gbt_macad8_sortsupport (internal) ;

				ALTER OPERATOR FAMILY gist_numeric_ops USING gist ADD

					FUNCTION	11  (numeric, numeric) gbt_numeric_sortsupport (internal) ;

				ALTER OPERATOR FAMILY gist_oid_ops USING gist ADD

					FUNCTION	11  (oid, oid) gbt_oid_sortsupport (internal) ;

				ALTER OPERATOR FAMILY gist_text_ops USING gist ADD

					FUNCTION	11  (text, text) gbt_text_sortsupport (internal) ;

				ALTER OPERATOR FAMILY gist_bpchar_ops USING gist ADD

					FUNCTION	11  (bpchar, bpchar) gbt_bpchar_sortsupport (internal) ;

				ALTER OPERATOR FAMILY gist_time_ops USING gist ADD

					FUNCTION	11  (time, time) gbt_time_sortsupport (internal) ;

				ALTER OPERATOR FAMILY gist_timetz_ops USING gist ADD

					FUNCTION	11  (timetz, timetz) gbt_time_sortsupport (internal) ;

				ALTER OPERATOR FAMILY gist_timestamp_ops USING gist ADD

					FUNCTION	11  (timestamp, timestamp) gbt_ts_sortsupport (internal) ;

				ALTER OPERATOR FAMILY gist_timestamptz_ops USING gist ADD

					FUNCTION	11  (timestamptz, timestamptz) gbt_ts_sortsupport (internal) ;

				ALTER OPERATOR FAMILY gist_uuid_ops USING gist ADD

					FUNCTION	11  (uuid, uuid) gbt_uuid_sortsupport (internal) ;

				-- Add translate_cmptype functions

				CREATE FUNCTION gist_translate_cmptype_btree(int)

				RETURNS smallint

				AS 'MODULE_PATHNAME'

									
										221

contrib/btree_gist/btree_gist--1.8--1.9.sql
									
				@ -1,197 +1,40 @@

				/* contrib/btree_gist/btree_gist--1.7--1.8.sql */

				/* contrib/btree_gist/btree_gist--1.8--1.9.sql */

				-- complain if script is sourced in psql, rather than via CREATE EXTENSION

				\echo Use "ALTER EXTENSION btree_gist UPDATE TO '1.9'" to load this file. \quit

				CREATE FUNCTION gbt_bit_sortsupport(internal)

				RETURNS void

				AS 'MODULE_PATHNAME'

				LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;

				--

				-- Mark gist_inet_ops and gist_cidr_ops opclasses as non-default.

				-- This is the first step on the way to eventually removing them.

				--

				-- There's no SQL command for this, so fake it with a manual update on

				-- pg_opclass.

				--

				DO LANGUAGE plpgsql

				$$

				DECLARE

				  my_schema pg_catalog.text := pg_catalog.quote_ident(pg_catalog.current_schema());

				  old_path pg_catalog.text := pg_catalog.current_setting('search_path');

				BEGIN

				-- for safety, transiently set search_path to just pg_catalog+pg_temp

				PERFORM pg_catalog.set_config('search_path', 'pg_catalog, pg_temp', true);

				CREATE FUNCTION gbt_varbit_sortsupport(internal)

				RETURNS void

				AS 'MODULE_PATHNAME'

				LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;

				UPDATE pg_catalog.pg_opclass

				SET opcdefault = false

				WHERE opcmethod = (SELECT oid FROM pg_catalog.pg_am WHERE amname = 'gist') AND

				      opcname IN ('gist_inet_ops', 'gist_cidr_ops') AND

				      opcnamespace = my_schema::pg_catalog.regnamespace;

				CREATE FUNCTION gbt_bool_sortsupport(internal)

				RETURNS void

				AS 'MODULE_PATHNAME'

				LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;

				PERFORM pg_catalog.set_config('search_path', old_path, true);

				END

				$$;

				CREATE FUNCTION gbt_bytea_sortsupport(internal)

				RETURNS void

				AS 'MODULE_PATHNAME'

				LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;

				CREATE FUNCTION gbt_cash_sortsupport(internal)

				RETURNS void

				AS 'MODULE_PATHNAME'

				LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;

				CREATE FUNCTION gbt_date_sortsupport(internal)

				RETURNS void

				AS 'MODULE_PATHNAME'

				LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;

				CREATE FUNCTION gbt_enum_sortsupport(internal)

				RETURNS void

				AS 'MODULE_PATHNAME'

				LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;

				CREATE FUNCTION gbt_float4_sortsupport(internal)

				RETURNS void

				AS 'MODULE_PATHNAME'

				LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;

				CREATE FUNCTION gbt_float8_sortsupport(internal)

				RETURNS void

				AS 'MODULE_PATHNAME'

				LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;

				CREATE FUNCTION gbt_inet_sortsupport(internal)

				RETURNS void

				AS 'MODULE_PATHNAME'

				LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;

				CREATE FUNCTION gbt_int2_sortsupport(internal)

				RETURNS void

				AS 'MODULE_PATHNAME'

				LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;

				CREATE FUNCTION gbt_int4_sortsupport(internal)

				RETURNS void

				AS 'MODULE_PATHNAME'

				LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;

				CREATE FUNCTION gbt_int8_sortsupport(internal)

				RETURNS void

				AS 'MODULE_PATHNAME'

				LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;

				CREATE FUNCTION gbt_intv_sortsupport(internal)

				RETURNS void

				AS 'MODULE_PATHNAME'

				LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;

				CREATE FUNCTION gbt_macaddr_sortsupport(internal)

				RETURNS void

				AS 'MODULE_PATHNAME'

				LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;

				CREATE FUNCTION gbt_macad8_sortsupport(internal)

				RETURNS void

				AS 'MODULE_PATHNAME'

				LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;

				CREATE FUNCTION gbt_numeric_sortsupport(internal)

				RETURNS void

				AS 'MODULE_PATHNAME'

				LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;

				CREATE FUNCTION gbt_oid_sortsupport(internal)

				RETURNS void

				AS 'MODULE_PATHNAME'

				LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;

				CREATE FUNCTION gbt_text_sortsupport(internal)

				RETURNS void

				AS 'MODULE_PATHNAME'

				LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;

				CREATE FUNCTION gbt_bpchar_sortsupport(internal)

				RETURNS void

				AS 'MODULE_PATHNAME'

				LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;

				CREATE FUNCTION gbt_time_sortsupport(internal)

				RETURNS void

				AS 'MODULE_PATHNAME'

				LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;

				CREATE FUNCTION gbt_ts_sortsupport(internal)

				RETURNS void

				AS 'MODULE_PATHNAME'

				LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;

				CREATE FUNCTION gbt_uuid_sortsupport(internal)

				RETURNS void

				AS 'MODULE_PATHNAME'

				LANGUAGE C IMMUTABLE PARALLEL SAFE STRICT;

				ALTER OPERATOR FAMILY gist_bit_ops USING gist ADD

					FUNCTION	11  (bit, bit) gbt_bit_sortsupport (internal) ;

				ALTER OPERATOR FAMILY gist_vbit_ops USING gist ADD

					FUNCTION	11  (varbit, varbit) gbt_varbit_sortsupport (internal) ;

				ALTER OPERATOR FAMILY gist_bool_ops USING gist ADD

					FUNCTION	11  (bool, bool) gbt_bool_sortsupport (internal) ;

				ALTER OPERATOR FAMILY gist_bytea_ops USING gist ADD

					FUNCTION	11  (bytea, bytea) gbt_bytea_sortsupport (internal) ;

				ALTER OPERATOR FAMILY gist_cash_ops USING gist ADD

					FUNCTION	11  (money, money) gbt_cash_sortsupport (internal) ;

				ALTER OPERATOR FAMILY gist_date_ops USING gist ADD

					FUNCTION	11  (date, date) gbt_date_sortsupport (internal) ;

				ALTER OPERATOR FAMILY gist_enum_ops USING gist ADD

					FUNCTION	11  (anyenum, anyenum) gbt_enum_sortsupport (internal) ;

				ALTER OPERATOR FAMILY gist_float4_ops USING gist ADD

					FUNCTION	11  (float4, float4) gbt_float4_sortsupport (internal) ;

				ALTER OPERATOR FAMILY gist_float8_ops USING gist ADD

					FUNCTION	11  (float8, float8) gbt_float8_sortsupport (internal) ;

				ALTER OPERATOR FAMILY gist_inet_ops USING gist ADD

					FUNCTION	11  (inet, inet) gbt_inet_sortsupport (internal) ;

				ALTER OPERATOR FAMILY gist_cidr_ops USING gist ADD

					FUNCTION	11  (cidr, cidr) gbt_inet_sortsupport (internal) ;

				ALTER OPERATOR FAMILY gist_int2_ops USING gist ADD

					FUNCTION	11  (int2, int2) gbt_int2_sortsupport (internal) ;

				ALTER OPERATOR FAMILY gist_int4_ops USING gist ADD

					FUNCTION	11  (int4, int4) gbt_int4_sortsupport (internal) ;

				ALTER OPERATOR FAMILY gist_int8_ops USING gist ADD

					FUNCTION	11  (int8, int8) gbt_int8_sortsupport (internal) ;

				ALTER OPERATOR FAMILY gist_interval_ops USING gist ADD

					FUNCTION	11  (interval, interval) gbt_intv_sortsupport (internal) ;

				ALTER OPERATOR FAMILY gist_macaddr_ops USING gist ADD

					FUNCTION	11  (macaddr, macaddr) gbt_macaddr_sortsupport (internal) ;

				ALTER OPERATOR FAMILY gist_macaddr8_ops USING gist ADD

					FUNCTION	11  (macaddr8, macaddr8) gbt_macad8_sortsupport (internal) ;

				ALTER OPERATOR FAMILY gist_numeric_ops USING gist ADD

					FUNCTION	11  (numeric, numeric) gbt_numeric_sortsupport (internal) ;

				ALTER OPERATOR FAMILY gist_oid_ops USING gist ADD

					FUNCTION	11  (oid, oid) gbt_oid_sortsupport (internal) ;

				ALTER OPERATOR FAMILY gist_text_ops USING gist ADD

					FUNCTION	11  (text, text) gbt_text_sortsupport (internal) ;

				ALTER OPERATOR FAMILY gist_bpchar_ops USING gist ADD

					FUNCTION	11  (bpchar, bpchar) gbt_bpchar_sortsupport (internal) ;

				ALTER OPERATOR FAMILY gist_time_ops USING gist ADD

					FUNCTION	11  (time, time) gbt_time_sortsupport (internal) ;

				ALTER OPERATOR FAMILY gist_timetz_ops USING gist ADD

					FUNCTION	11  (timetz, timetz) gbt_time_sortsupport (internal) ;

				ALTER OPERATOR FAMILY gist_timestamp_ops USING gist ADD

					FUNCTION	11  (timestamp, timestamp) gbt_ts_sortsupport (internal) ;

				ALTER OPERATOR FAMILY gist_timestamptz_ops USING gist ADD

					FUNCTION	11  (timestamptz, timestamptz) gbt_ts_sortsupport (internal) ;

				ALTER OPERATOR FAMILY gist_uuid_ops USING gist ADD

					FUNCTION	11  (uuid, uuid) gbt_uuid_sortsupport (internal) ;

				-- Fix parallel-safety markings overlooked in btree_gist--1.6--1.7.sql.

				ALTER FUNCTION gbt_bool_consistent(internal, bool, smallint, oid, internal) PARALLEL SAFE;

				ALTER FUNCTION gbt_bool_compress(internal) PARALLEL SAFE;

				ALTER FUNCTION gbt_bool_fetch(internal) PARALLEL SAFE;

				ALTER FUNCTION gbt_bool_penalty(internal, internal, internal) PARALLEL SAFE;

				ALTER FUNCTION gbt_bool_picksplit(internal, internal) PARALLEL SAFE;

				ALTER FUNCTION gbt_bool_union(internal, internal) PARALLEL SAFE;

				ALTER FUNCTION gbt_bool_same(gbtreekey2, gbtreekey2, internal) PARALLEL SAFE;

1009

contrib/btree_gist/btree_gist--1.2.sql → contrib/btree_gist/btree_gist--1.9.sql

File diff suppressed because it is too large.

									
										6

contrib/btree_gist/btree_gist.c
									
				@ -49,9 +49,9 @@ gbtreekey_out(PG_FUNCTION_ARGS)

				/*

				** GiST DeCompress methods

				** do not do anything.

				*/

				 * GiST DeCompress methods

				 * do not do anything.

				 */

				Datum

				gbt_decompress(PG_FUNCTION_ARGS)

				{

									
										10

contrib/btree_gist/btree_inet.c
									
				@ -7,6 +7,7 @@

				#include "btree_utils_num.h"

				#include "catalog/pg_type.h"

				#include "utils/builtins.h"

				#include "utils/rel.h"

				#include "utils/sortsupport.h"

				typedef struct inetkey

				@ -96,10 +97,10 @@ gbt_inet_compress(PG_FUNCTION_ARGS)

					if (entry->leafkey)

					{

						inetKEY    *r = (inetKEY *) palloc(sizeof(inetKEY));

						inetKEY    *r = palloc_object(inetKEY);

						bool		failure = false;

						retval = palloc(sizeof(GISTENTRY));

						retval = palloc_object(GISTENTRY);

						r->lower = convert_network_to_scalar(entry->key, INETOID, &failure);

						Assert(!failure);

						r->upper = r->lower;

				@ -119,8 +120,9 @@ gbt_inet_consistent(PG_FUNCTION_ARGS)

					GISTENTRY  *entry = (GISTENTRY *) PG_GETARG_POINTER(0);

					Datum		dquery = PG_GETARG_DATUM(1);

					StrategyNumber strategy = (StrategyNumber) PG_GETARG_UINT16(2);

					/* Oid		subtype = PG_GETARG_OID(3); */

				#ifdef NOT_USED

					Oid			subtype = PG_GETARG_OID(3);

				#endif

					bool	   *recheck = (bool *) PG_GETARG_POINTER(4);

					inetKEY    *kkk = (inetKEY *) DatumGetPointer(entry->key);

					GBT_NUMKEY_R key;
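
Review note on the `gbt_inet_compress` hunk above: the `palloc(sizeof(T))` calls with a manual cast appear to be replaced by `palloc_object(T)`. The sketch below shows the idea of such a typed allocation macro outside the server. `demo_palloc`, `demo_palloc_object`, `demo_key`, and `make_leaf_key` are hypothetical names, and `malloc()` stands in for PostgreSQL's memory-context allocator (an assumption made only so the example builds standalone):

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in allocator: aborts on failure instead of returning NULL. */
static void *
demo_palloc(size_t size)
{
	void	   *p = malloc(size);

	if (p == NULL)
		abort();
	return p;
}

/* Typed allocation: the size and the result cast both derive from "type". */
#define demo_palloc_object(type) ((type *) demo_palloc(sizeof(type)))

/* Hypothetical range key with lower and upper bounds. */
typedef struct demo_key
{
	double		lower;
	double		upper;
} demo_key;

/* Build a leaf key whose lower and upper bounds are the same value. */
static demo_key *
make_leaf_key(double value)
{
	demo_key   *r = demo_palloc_object(demo_key);

	r->lower = value;
	r->upper = r->lower;
	return r;
}
```

Because the type name appears once, the allocated size and the pointer type cannot drift apart, which is the kind of mismatch the `palloc(sizeof(...))` form allows.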

									
										11

contrib/btree_gist/btree_int2.c
									
				@ -6,6 +6,7 @@

				#include "btree_gist.h"

				#include "btree_utils_num.h"

				#include "common/int.h"

				#include "utils/rel.h"

				#include "utils/sortsupport.h"

				typedef struct int16key

				@ -138,8 +139,9 @@ gbt_int2_consistent(PG_FUNCTION_ARGS)

					GISTENTRY  *entry = (GISTENTRY *) PG_GETARG_POINTER(0);

					int16		query = PG_GETARG_INT16(1);

					StrategyNumber strategy = (StrategyNumber) PG_GETARG_UINT16(2);

					/* Oid		subtype = PG_GETARG_OID(3); */

				#ifdef NOT_USED

					Oid			subtype = PG_GETARG_OID(3);

				#endif

					bool	   *recheck = (bool *) PG_GETARG_POINTER(4);

					int16KEY   *kkk = (int16KEY *) DatumGetPointer(entry->key);

					GBT_NUMKEY_R key;

				@ -159,8 +161,9 @@ gbt_int2_distance(PG_FUNCTION_ARGS)

				{

					GISTENTRY  *entry = (GISTENTRY *) PG_GETARG_POINTER(0);

					int16		query = PG_GETARG_INT16(1);

					/* Oid		subtype = PG_GETARG_OID(3); */

				#ifdef NOT_USED

					Oid			subtype = PG_GETARG_OID(3);

				#endif

					int16KEY   *kkk = (int16KEY *) DatumGetPointer(entry->key);

					GBT_NUMKEY_R key;

									
										11

contrib/btree_gist/btree_int4.c
									
				@ -5,6 +5,7 @@

				#include "btree_gist.h"

				#include "btree_utils_num.h"

				#include "common/int.h"

				#include "utils/rel.h"

				#include "utils/sortsupport.h"

				typedef struct int32key

				@ -136,8 +137,9 @@ gbt_int4_consistent(PG_FUNCTION_ARGS)

					GISTENTRY  *entry = (GISTENTRY *) PG_GETARG_POINTER(0);

					int32		query = PG_GETARG_INT32(1);

					StrategyNumber strategy = (StrategyNumber) PG_GETARG_UINT16(2);

					/* Oid		subtype = PG_GETARG_OID(3); */

				#ifdef NOT_USED

					Oid			subtype = PG_GETARG_OID(3);

				#endif

					bool	   *recheck = (bool *) PG_GETARG_POINTER(4);

					int32KEY   *kkk = (int32KEY *) DatumGetPointer(entry->key);

					GBT_NUMKEY_R key;

				@ -157,8 +159,9 @@ gbt_int4_distance(PG_FUNCTION_ARGS)

				{

					GISTENTRY  *entry = (GISTENTRY *) PG_GETARG_POINTER(0);

					int32		query = PG_GETARG_INT32(1);

					/* Oid		subtype = PG_GETARG_OID(3); */

				#ifdef NOT_USED

					Oid			subtype = PG_GETARG_OID(3);

				#endif

					int32KEY   *kkk = (int32KEY *) DatumGetPointer(entry->key);

					GBT_NUMKEY_R key;

									
										11

contrib/btree_gist/btree_int8.c
									
				@ -6,6 +6,7 @@

				#include "btree_gist.h"

				#include "btree_utils_num.h"

				#include "common/int.h"

				#include "utils/rel.h"

				#include "utils/sortsupport.h"

				typedef struct int64key

				@ -138,8 +139,9 @@ gbt_int8_consistent(PG_FUNCTION_ARGS)

					GISTENTRY  *entry = (GISTENTRY *) PG_GETARG_POINTER(0);

					int64		query = PG_GETARG_INT64(1);

					StrategyNumber strategy = (StrategyNumber) PG_GETARG_UINT16(2);

					/* Oid		subtype = PG_GETARG_OID(3); */

				#ifdef NOT_USED

					Oid			subtype = PG_GETARG_OID(3);

				#endif

					bool	   *recheck = (bool *) PG_GETARG_POINTER(4);

					int64KEY   *kkk = (int64KEY *) DatumGetPointer(entry->key);

					GBT_NUMKEY_R key;

				@ -159,8 +161,9 @@ gbt_int8_distance(PG_FUNCTION_ARGS)

				{

					GISTENTRY  *entry = (GISTENTRY *) PG_GETARG_POINTER(0);

					int64		query = PG_GETARG_INT64(1);

					/* Oid		subtype = PG_GETARG_OID(3); */

				#ifdef NOT_USED

					Oid			subtype = PG_GETARG_OID(3);

				#endif

					int64KEY   *kkk = (int64KEY *) DatumGetPointer(entry->key);

					GBT_NUMKEY_R key;

Compare commits

2773 commits REL_18_4 ... master

88 .cirrus.star
227 .cirrus.tasks.yml
12 .cirrus.yml
6 .editorconfig
159 .git-blame-ignore-revs
5 .gitattributes vendored
2 COPYRIGHT
2 Makefile
221 config/c-compiler.m4
2 config/check_modules.pl
17 config/config.guess vendored
28 config/config.sub vendored
21 config/llvm.m4
8 config/prep_buildtree
18 config/programs.m4
5 config/python.m4
1572 configure vendored
315 configure.ac
2 contrib/Makefile
8 contrib/amcheck/expected/check_btree.out
2 contrib/amcheck/meson.build
7 contrib/amcheck/sql/check_btree.sql
2 contrib/amcheck/t/001_verify_heapam.pl
26 contrib/amcheck/t/002_cic.pl
2 contrib/amcheck/t/003_cic_2pc.pl
22 contrib/amcheck/t/004_verify_nbtree_unique.pl
2 contrib/amcheck/t/005_pitr.pl
2 contrib/amcheck/t/006_verify_gin.pl
42 contrib/amcheck/verify_common.c
11 contrib/amcheck/verify_common.h
26 contrib/amcheck/verify_gin.c
88 contrib/amcheck/verify_heapam.c
150 contrib/amcheck/verify_nbtree.c
2 contrib/auth_delay/auth_delay.c
2 contrib/auth_delay/meson.build
3 contrib/auto_explain/Makefile
434 contrib/auto_explain/auto_explain.c
19 contrib/auto_explain/expected/alter_reset.out Normal file
49 contrib/auto_explain/expected/extension_options.out Normal file
8 contrib/auto_explain/meson.build
22 contrib/auto_explain/sql/alter_reset.sql Normal file
33 contrib/auto_explain/sql/extension_options.sql Normal file
18 contrib/auto_explain/t/001_auto_explain.pl
4 contrib/basebackup_to_shell/basebackup_to_shell.c
6 contrib/basebackup_to_shell/meson.build
2 contrib/basebackup_to_shell/t/001_basic.pl
21 contrib/basic_archive/basic_archive.c
2 contrib/basic_archive/meson.build
5 contrib/bloom/blcost.c
4 contrib/bloom/blinsert.c
4 contrib/bloom/bloom.h
36 contrib/bloom/blscan.c
117 contrib/bloom/blutils.c
68 contrib/bloom/blvacuum.c
2 contrib/bloom/blvalidate.c
2 contrib/bloom/meson.build
2 contrib/bloom/t/001_wal.pl
2 contrib/bool_plperl/meson.build
2 contrib/btree_gin/Makefile
151 contrib/btree_gin/btree_gin--1.3--1.4.sql Normal file
650 contrib/btree_gin/btree_gin.c
2 contrib/btree_gin/btree_gin.control
362 contrib/btree_gin/expected/date.out
321 contrib/btree_gin/expected/float4.out
50 contrib/btree_gin/expected/float8.out
190 contrib/btree_gin/expected/int2.out
100 contrib/btree_gin/expected/int4.out
100 contrib/btree_gin/expected/int8.out
59 contrib/btree_gin/expected/name.out
50 contrib/btree_gin/expected/text.out
306 contrib/btree_gin/expected/timestamp.out
111 contrib/btree_gin/expected/timestamptz.out
3 contrib/btree_gin/meson.build
64 contrib/btree_gin/sql/date.sql
53 contrib/btree_gin/sql/float4.sql
9 contrib/btree_gin/sql/float8.sql
35 contrib/btree_gin/sql/int2.sql
18 contrib/btree_gin/sql/int4.sql


18 contrib/btree_gin/sql/int8.sql