postgresql

mirror of https://github.com/postgres/postgres.git synced 2026-04-09 11:06:21 -04:00

Author	SHA1	Message	Date
Jacob Champion	b977bd308a	oauth: Allow validators to register custom HBA options OAuth validators can already use custom GUCs to configure behavior globally. But we currently provide no ability to adjust settings for individual HBA entries, because the original design focused on a world where a provider covered a "single audience" of users for one database cluster. This assumption does not apply to multitenant use cases, where a single validator may be controlling access for wildly different user groups. To improve this use case, add two new API calls for use by validator callbacks: RegisterOAuthHBAOptions() and GetOAuthHBAOption(). Registering options "foo" and "bar" allows a user to set "validator.foo" and "validator.bar" in an oauth HBA entry. These options are stringly typed (syntax validation is solely the responsibility of the defining module), and names are restricted to a subset of ASCII to avoid tying our hands with future HBA syntax improvements. Unfortunately, we can't check the custom option names during a reload of the configuration, like we do with standard HBA options, without requiring all validators to be loaded via shared_preload_libraries. (I consider this to be a nonstarter: most validators should probably use session_preload_libraries at most, since requiring a full restart just to update authentication behavior will be unacceptable to many users.) Instead, the new validator.* options are checked against the registered list at connection time. Multiple alternatives were proposed and/or prototyped, including extending the GUC system to allow per-HBA overrides, joining forces with recent refactoring work on the reloptions subsystem, and giving the ability to customize HBA options to all PostgreSQL extensions. I personally believe per-HBA GUC overrides are the best option, because several existing GUCs like authentication_timeout and pre_auth_delay would fit there usefully. 
But the recent addition of SNI per-host settings in `4f433025f` indicates that a more general solution is needed, and I expect that to take multiple releases' worth of discussion. This compromise patch, then, is intentionally designed to be an architectural dead end: simple to describe, cheap to maintain, and providing just enough functionality to let validators move forward for PG19. The hope is that it will be replaced in the future by a solution that can handle per-host, per-HBA, and other per-context configuration with the same functionality that GUCs provide today. In the meantime, the bulk of the code in this patch consists of strict guardrails on the simple API, to try to ensure that we don't have any reason to regret its existence during its unknown lifespan. I owe particular thanks here to Zsolt Parragi, who prototyped several approaches that guided the final design. Suggested-by: Zsolt Parragi <zsolt.parragi@percona.com> Suggested-by: VASUKI M <vasukianand0119@gmail.com> Reviewed-by: Zsolt Parragi <zsolt.parragi@percona.com> Discussion: https://postgr.es/m/CAN4CZFM3b8u5uNNNsY6XCya257u%2BDofms3su9f11iMCxvCacag%40mail.gmail.com	2026-04-07 08:15:19 -07:00
Jacob Champion	6d00fb9048	libpq: Split PGOAUTHDEBUG=UNSAFE into multiple options PGOAUTHDEBUG is a blunt instrument: you get all the debugging features, or none of them. The most annoying consequence during manual use is the Curl debug trace, which tends to obscure the device flow prompt entirely. The promotion of PGOAUTHCAFILE into its own feature in `993368113` improved the situation somewhat, but there's still the discomfort of knowing you have to opt into many dangerous behaviors just to get the single debug feature you wanted. Explode the PGOAUTHDEBUG syntax into a comma-separated list. The old "UNSAFE" value enables everything, like before. Any individual unsafe features still require the envvar to begin with an "UNSAFE:" prefix, to try to interrupt the flow of someone who is about to do something they should not. So now, rather than PGOAUTHDEBUG=UNSAFE # enable all the unsafe things a developer can say PGOAUTHDEBUG=call-count # only show me the call count. safe! PGOAUTHDEBUG=UNSAFE:trace # print secrets, but don't allow HTTP To avoid adding more build system scaffolding to libpq-oauth, implement this entirely in a small private header. This unfortunately can't be standalone, so it needs a headerscheck exception. Author: Zsolt Parragi <zsolt.parragi@percona.com> Co-authored-by: Jacob Champion <jacob.champion@enterprisedb.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Zsolt Parragi <zsolt.parragi@percona.com> Discussion: https://postgr.es/m/CAOYmi%2B%3DfbZNJSkHVci%3DGpR8XPYObK%3DH%2B2ERRha0LDTS%2BifsWnw%40mail.gmail.com Discussion: https://postgr.es/m/CAN4CZFMmDZMH56O9vb_g7vHqAk8ryWFxBMV19C39PFghENg8kA%40mail.gmail.com	2026-04-07 08:15:14 -07:00
Álvaro Herrera	e76d8c749c	Reserve replication slots specifically for REPACK Add a new GUC max_repack_replication_slots, which lets the user reserve some additional replication slots for concurrent repack (and only concurrent repack). With this, the user doesn't have to worry about changing the max_replication_slots in order to cater for use of concurrent repack. (We still use the same pool of bgworkers though, but that's less commonly a problem than slots.) Author: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Srinath Reddy Sadipiralla <srinath2133@gmail.com> Discussion: https://postgr.es/m/202604012148.nnnmyxxrr6nh@alvherre.pgsql	2026-04-07 16:55:29 +02:00
Heikki Linnakangas	979387f188	Fix harmless leftover in _hash_kill_items() Checking for 'havePin' is sufficient here. An earlier version of the patch didn't have the 'havePin' variable and used 'so->hashso_bucket_buf == so->currPos.buf' as the condition when both locking and unlocking the page. The havePin variable was added later during development, but the unlocking condition wasn't fully updated. Tidy it up. Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://www.postgresql.org/message-id/b9de8d05-3b02-4a27-9b0b-03972fa4bfd3@iki.fi	2026-04-07 17:38:11 +03:00
Andrew Dunstan	55890a9194	Add errdetail() with PID and UID about source of termination signal. When a backend is terminated via pg_terminate_backend() or an external SIGTERM, the error message now includes the sender's PID and UID as errdetail, making it easier to identify the source of unexpected terminations in multi-user environments. On platforms that support SA_SIGINFO (Linux, FreeBSD, and most modern Unix systems), the signal handler captures si_pid and si_uid from the siginfo_t structure. On platforms without SA_SIGINFO, the detail is simply omitted. Author: Jakub Wartak <jakub.wartak@enterprisedb.com> Reviewed-by: Andrew Dunstan <andrew@dunslane.net> Reviewed-by: Chao Li <1356863904@qq.com> Discussion: https://postgr.es/m/CAKZiRmyrOWovZSdixpLd3PGMQXuQL_zw2Ght5XhHCkQ1uDsxjw@mail.gmail.com	2026-04-07 10:22:33 -04:00
Robert Haas	c10edb102a	pg_stash_advice: Allow stashed advice to be persisted to disk. If pg_stash_advice.persist = true, stashed advice will be written to pg_stash_advice.tsv in the data directory, periodically and at shutdown. On restart, stash modifications are locked out until this file has been reloaded, but queries will not be, so there may be a short window after startup during which previously-stashed advice is not automatically applied. Author: Robert Haas <rhaas@postgresql.org> Co-authored-by: Lukas Fittl <lukas@fittl.com> Discussion: https://postgr.es/m/CA+Tgmob87qsWa-VugofU6epuV0H5XjWZGMbQas4Q-ADKmvSyBg@mail.gmail.com	2026-04-07 10:11:36 -04:00
Andres Freund	29e7dbf5e4	Minimal fix for WAIT FOR ... MODE 'standby_flush' The investigation into the negative test performance impact of `7e8aeb9e48` led to discovering that there are a few issues with WAIT FOR. This commit is just a minimal fix to prevent hangs in standby_flush mode, due to WAIT FOR ... 'standby_flush' seeing a 0 LSN if a newly started walreceiver does not receive any writes, because the standby is already caught up. There are several other issues and this isn't necessarily the best fix. But this way we get the hangs out of the way. Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/zqbppucpmkeqecfy4s5kscnru4tbk6khp3ozqz6ad2zijz354k@w4bdf4z3wqoz	2026-04-07 09:48:09 -04:00
Álvaro Herrera	8fb95a8ab6	doc: Add an example of REPACK (CONCURRENTLY) Suggested-by: vignesh C <vignesh21@gmail.com> Discussion: https://postgr.es/m/CALDaNm3tiKhtegx5Cawi34UjbHmNGEDNAtScGM1RgWRtV-5_0Q@mail.gmail.com	2026-04-07 15:33:55 +02:00
Heikki Linnakangas	9480c585df	Tidy up #ifdef USE_INJECTION_POINTS guards Remove unnecessary #ifdef guard around the function prototypes; they are already inside a larger #ifdef block. Move #include "subsystems.h" inside the USE_INJECTION_POINTS guard; it's needed for InjectionPointShmemCallbacks, which is also inside the guard. Reported-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> Discussion: https://www.postgresql.org/message-id/87y0iz2c1v.fsf@wibble.ilmari.org	2026-04-07 16:18:31 +03:00
Álvaro Herrera	be142fa008	Fix tests under wal_level=minimal Buildfarm members which have specifically configured to use wal_level=minimal fail the repack regression tests, which require wal_level=replica. Add a temp config file to fix that.	2026-04-07 15:14:32 +02:00
Heikki Linnakangas	257c8231bf	Modernize and optimize pg_buffercache_pages() Refactor pg_buffercache_pages() to use SFRM_Materialize mode and construct a tuplestore directly. That's simpler and more efficient than collecting all the data to a custom array first. Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Author: Palak Chaturvedi <chaturvedipalak1911@gmail.com> Discussion: https://www.postgresql.org/message-id/CAExHW5sMsaz1j+hrdhyo-DJp7JCgJx87=q2iJfOc_9mwYWyvmw@mail.gmail.com	2026-04-07 16:04:48 +03:00
Heikki Linnakangas	9f3755ea07	Optimize sorting and deduplicating trigrams Use templated qsort() so that the comparison function can be inlined. To speed up qunique(), use a specialized comparison function that only checks for equality. Author: David Geier <geidav.pg@gmail.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Discussion: https://www.postgresql.org/message-id/2a76b5ef-4b12-4023-93a1-eed6e64968f3@gmail.com	2026-04-07 14:11:25 +03:00
Tomas Vondra	884f9b3c76	Use add_size/mul_size for index instrumentation size calculations Use overflow-safe size arithmetic in the Index[Only]Scan and parallel instrumentation functions, consistent with other executor nodes (Hash, Sort, Agg, Memoize). This was an oversight in `dd78e69cfc`. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Tomas Vondra <tomas@vondra.me> Reviewed-by: Lukas Fittl <lukas@fittl.com> Discussion: https://postgr.es/m/flat/a177a6dd-240b-455a-8f25-aca0b1c08c6e%40vondra.me	2026-04-07 12:47:28 +02:00
Tomas Vondra	9c18b47e61	Fix BitmapHeapScan non-parallel-aware EXPLAIN ANALYZE Allocates shared bitmap table scan instrumentation for all parallel scans. Previously, the instrumentation was only allocated for parallel-aware scans, other bitmap heap scans in the parallel query had no shared instrumentation and EXPLAIN didn't report exact/lossy pages. This affected cases like scans on the outside of a parallel join or queries run with debug_parallel_query=regress. Fixed by allocating a separate DSM chunk for shared instrumentation and doing so regardless of parallel-awareness. The instrumentation is allocated in its own DSM chunk, separate from ParallelBitmapHeapState. Report and initial patch by me. The approach with a separate DSM was proposed and implemented by Melanie. Not backpatched. The issue affects Postgres 18 (since `5a1e6df3b8`), but having multiple DSM chunks is possible only since `dd78e69cfc`. If we decide to fix this in backbranches too, it will need to be done in a less invasive way. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Tomas Vondra <tomas@vondra.me> Reviewed-by: Lukas Fittl <lukas@fittl.com> Discussion: https://postgr.es/m/flat/a177a6dd-240b-455a-8f25-aca0b1c08c6e%40vondra.me	2026-04-07 12:47:13 +02:00
Álvaro Herrera	0d3dba38c7	Allow logical replication snapshots to be database-specific By default, the logical decoding assumes access to shared catalogs, so the snapshot builder needs to consider cluster-wide XIDs during startup. That in turn means that, if any transaction is already running (and has XID assigned), the snapshot builder needs to wait for its completion, as it does not know if that transaction performed catalog changes earlier. A possible problem with this concept is that if REPACK (CONCURRENTLY) is running in some database, backends running the same command in other databases get stuck until the first one has committed. Thus only a single backend in the cluster can run REPACK (CONCURRENTLY) at any time. Likewise, REPACK (CONCURRENTLY) can block walsenders starting on behalf of subscriptions throughout the cluster. This patch adds a new option to logical replication output plugin, to declare that it does not use shared catalogs (i.e. catalogs that can be changed by transactions running in other databases in the cluster). In that case, no snapshot the backend will use during the decoding needs to contain information about transactions running in other databases. Thus the snapshot builder only needs to wait for completion of transactions in the current database. Currently we only use this option in the REPACK background worker. It could possibly be used in the plugin for logical replication too, however that would need thorough analysis of that plugin. Bump WAL version number, due to a new field in xl_running_xacts. Author: Antonin Houska <ah@cybertec.at> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/90475.1775218118@localhost	2026-04-07 12:31:18 +02:00
Álvaro Herrera	a3b069ef90	Avoid different-size pointer-to-integer cast Buildfarm member mamba is unhappy that I wrote "(Datum) NULL" in commit `28d534e2ae`: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mamba&dt=2026-04-07%2005%3A08%3A08 Use "(Datum) 0" which is what we do everywhere else. Discussion: https://postgr.es/m/CANWCAZaOs_+WPH13ow33Q==+FwBwVZkqzm4vND=WEB4_NBmv1Q@mail.gmail.com	2026-04-07 12:28:05 +02:00
Heikki Linnakangas	6f5ad00ab7	Optimize sort and deduplication in ginExtractEntries() Remove NULLs from the array first, and use qsort to deduplicate only the non-NULL items. This simplifies the comparison function. Also replace qsort_arg() with a templated version so that the comparison function can be inlined. These changes make ginExtractEntries() a little faster especially for simple datatypes like integers. Author: David Geier <geidav.pg@gmail.com> Discussion: https://www.postgresql.org/message-id/6d16b6bd-a1ff-4469-aefb-a1c8274e561a@iki.fi	2026-04-07 13:26:39 +03:00
Peter Eisentraut	b6ccd30d8f	Add isolation tests for UPDATE/DELETE FOR PORTION OF Add documentation about concurrency issues related to UPDATE/DELETE FOR PORTION OF as well as supporting isolation tests. Author: Paul A. Jungwirth <pj@illuminatedcomputing.com> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://www.postgresql.org/message-id/flat/ec498c3d-5f2b-48ec-b989-5561c8aa2024%40illuminatedcomputing.com	2026-04-07 11:22:11 +02:00
Álvaro Herrera	5bcc3fbd19	Fix valgrind failure Buildfarm member skink reports that the new REPACK code is trying to write uninitialized bytes to disk, which correspond to padding space in the SerializedSnapshotData struct. Silence that by initializing the memory in SerializeSnapshot() to all zeroes. Co-authored-by: Srinath Reddy Sadipiralla <srinath2133@gmail.com> Co-authored-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/1976915.1775537087@sss.pgh.pa.us	2026-04-07 11:13:50 +02:00
John Naylor	8c3e22a8f8	Use .h for the file containing the page checksum code fragment Commit `5e13b0f24` used a .c file for a file containing a code fragment, to avoid adding an exception to headerscheck. That turned out to be too clever, since it meant installation didn't happen by the usual mechanism. Make it look like a normal header and add the requisite exception. Bug: #19450 Reported-by: RekGRpth <rekgrpth@gmail.com> Discussion: https://postgr.es/m/19450-bb0612c50c6786e5@postgresql.org	2026-04-07 15:52:55 +07:00
John Naylor	30229be755	Simplify SortSupport for the macaddr data type As of commit `6aebedc38` Datums are 64-bit values. Since MAC addresses have only 6 bytes, the abbreviated key always contains the entire MAC address and is thus authoritative (for practical purposes -- the tuple sort machinery has no way of knowing that). Abbreviating this datatype is cheap, and aborting abbreviation prevents optimizations like radix sort, so remove cardinality estimation. Author: Aleksander Alekseev <aleksander@tigerdata.com> Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru> Reviewed-by: Michael Paquier <michael@paquier.xyz> Suggested-by: John Naylor <johncnaylorls@gmail.com> Discussion: https://postgr.es/m/CAJ7c6TMk10rF_LiMz6j9rRy1rqk-5s+wBPuBefLix4cY+-4s1w@mail.gmail.com	2026-04-07 13:29:27 +07:00
Michael Paquier	49cc0d4148	Mark JumbleState as a const in the post_parse_analyze hook This commit changes the post_parse_analyze_hook_type() hook to take a const JumbleState, to tell external modules that they are not allowed to touch the JumbleState that has been compiled by the core code. This fixes a pretty old problem with pg_stat_statements, that had always the idea of modifying the lengths of the constants stored in the JumbleState. The previous state could confuse extensions that need to look at a JumbleState depending on the loading order, if pg_stat_statements is part of the stack loaded. Another piece included in this commit is the move of the routine fill_in_constant_lengths() to queryjumblefuncs.c, to give an option to extensions to compile the lengths of the constants, if necessary. I was surprised by the number of external code that carries a copy of this routine (see the thread for details). Previously, this routine modified JumbleState. It now copies the set of LocationLens from JumbleState, and fills the constant lengths for separate use. pg_stat_statements is updated to use the new ComputeConstantLengths(). JumbleState is now marked with a const in the module, where relevant. Author: Sami Imseih <samimseih@gmail.com> Co-authored-by: Lukas Fittl <lukas@fittl.com> Discussion: https://postgr.es/m/CAA5RZ0tZp5qU0ikZEEqJnxvdSNGh1DWv80sb-k4QAUmiMoOp_Q@mail.gmail.com	2026-04-07 15:22:49 +09:00
John Naylor	51098839cf	Split CREATE STATISTICS error reasons out into errdetails Some errmsgs in statscmds.c were phrased as "...cannot be used because...". Put the reasons into errdetails. While at it, switch from passive voice to "cannot create..." for the errmsg. Author: Yugo Nagata <nagata@sraoss.co.jp> Suggested-by: John Naylor <johncnaylorls@gmail.com> Discussion: https://postgr.es/m/CANWCAZaZeX0omWNh_ZbD_JVujzYQdRUW8UZOQ4dWh9Sg7OcAow@mail.gmail.com	2026-04-07 11:37:48 +07:00
Michael Paquier	3284e3f63c	Fix injection point detach timing problem in TAP test for lock stats injection_points_detach() could fail because of a concurrent cleanup triggered by injection_points_set_local() when a session finishes. This problem could be reproduced by adding a hardcoded sleep in InjectionPointDetach(), and has been detected by the CI. As the test is designed so that the injection point is detached before being woken up, there is no need for it to be local, similarly to test 010_index_concurrently_upsert. This commit removes injection_points_set_local(), replacing it with a confirmation that the point has been attached in the session expected to block on a lock. With this removal, the detach cannot happen concurrently anymore, only before the point is woken up. Issue introduced by `557a9f1e3e`, where the test has been added. Reported-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/rp6wz4lnz5qn4zlh7uxtavzfrmqvycy2g42z4zasfss2gxi54f@zzcsjdvdflwp	2026-04-07 13:17:13 +09:00
Michael Paquier	17132f55c5	Fix shmem allocation of fixed-sized custom stats kind StatsShmemSize(), that computes the shmem size needed for pgstats, includes the amount of shared memory wanted by all the custom stats kinds registered. However, the shared memory allocation was done by ShmemAlloc() in StatsShmemInit(), meaning that the space reserved was not used, wasting some memory. These extra allocations would show up under "<anonymous>" in pg_shmem_allocations, as the allocations done by ShmemAlloc() are not tracked by ShmemIndexEnt. Issue introduced by `7949d95945`. Author: Heikki Linnakangas <hlinnaka@iki.fi> Discussion: https://postgr.es/m/04b04387-92f5-476c-90b0-4064e71c5f37@iki.fi Backpatch-through: 18	2026-04-07 11:59:49 +09:00
Amit Langote	5c54c3ed1b	Fix deferred FK check batching introduced by commit `b7b27eb41a` That commit introduced AfterTriggerIsActive() to detect whether we are inside the after-trigger firing machinery, so that RI trigger functions can take the batched fast path. It was implemented using query_depth >= 0, which correctly identified immediate trigger firing but missed the deferred case where query_depth is -1 at COMMIT via AfterTriggerFireDeferred(). This caused deferred FK checks to fall back to the per-row fast path instead of the batched path. The correct check is whether we are inside an after-trigger firing loop specifically. Introduce afterTriggerFiringDepth, a counter incremented around the trigger-firing loops in AfterTriggerEndQuery, AfterTriggerFireDeferred, and AfterTriggerSetState, and decremented after FireAfterTriggerBatchCallbacks() returns. AfterTriggerIsActive() now returns afterTriggerFiringDepth > 0. Reported-by: Chao Li <li.evan.chao@gmail.com> Author: Chao Li <li.evan.chao@gmail.com> Co-authored-by: Amit Langote <amitlangote09@gmail.com> Discussion: https://postgr.es/m/C2133B47-79CD-40FF-B088-02D20D654806@gmail.com	2026-04-07 10:45:59 +09:00
Michael Paquier	9897957805	Fix shared memory size of template code for custom fixed-sized pgstats On HEAD, the template code for custom fixed-sized pgstats is in the test module test_custom_stats. On REL_18_STABLE, this code lives in the test module injection_points. Both cases were underestimating the size of the shared memory area required for the storage of the stats data, using a single entry rather than the whole area. This underestimation meant that there was no memory allocated for the LWLock required for the stats, and even more. This problem would be also misleading for extension developers looking at this code. This issue has been noticed while digging into a different bug reported by Heikki Linnakangas, showing that the underestimation was causing failures in the TAP tests of the test modules for 32-bit builds. The other issue reported, related to the memory allocation of custom fixed-sized pgstats, will be fixed in a follow-up commit. Discussion: https://postgr.es/m/adMk_lWbnz3HDOA8@paquier.xyz Backpatch-through: 18	2026-04-07 08:24:32 +09:00
Melanie Plageman	dd78e69cfc	Allocate separate DSM chunk for parallel Index[Only]Scan instrumentation Previously, parallel index and index-only scans packed the parallel scan descriptor and shared instrumentation (for EXPLAIN ANALYZE) into a single DSM allocation. Since scans may be instrumented without being parallel-aware, and vice versa, using separate DSM chunks -- each with its own TOC key -- is cleaner. A future commit will extend this pattern to other scan node types. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Tomas Vondra <tomas@vondra.me> Discussion: https://postgr.es/m/flat/a177a6dd-240b-455a-8f25-aca0b1c08c6e%40vondra.me	2026-04-06 19:10:19 -04:00
Melanie Plageman	43222b8e53	Assert no duplicate keys in shm_toc_insert() shm_toc_insert() silently accepts duplicate keys. Since shm_toc_lookup() returns the first matching entry, any later entry with the same key would be unreachable. Add an assertion to catch this. Author: Melanie Plageman <melanieplageman@gmail.com> Discussion: https://postgr.es/m/flat/a177a6dd-240b-455a-8f25-aca0b1c08c6e%40vondra.me	2026-04-06 18:41:47 -04:00
Nathan Bossart	87f61f0c82	Add pg_stat_autovacuum_scores system view. This view contains one row for each table in the current database, showing the current autovacuum scores for that specific table. It also shows whether autovacuum would vacuum or analyze the table. Bumps catversion. Author: Sami Imseih <samimseih@gmail.com> Reviewed-by: Satyanarayana Narlapuram <satyanarlapuram@gmail.com> Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> Reviewed-by: Robert Treat <rob@xzilla.net> Discussion: https://postgr.es/m/CAA5RZ0s4xjMrB-VAnLccC7kY8d0-4806-Lsac-czJsdA1LXtAw%40mail.gmail.com	2026-04-06 16:56:33 -05:00
Daniel Gustafsson	b3a37ffbc5	Use PG_DATA_CHECKSUM_OFF instead of hardcoded value For a long time, the online checksums patchset kept the "off" state as literal zero without a label to be consistent with the previous coding which only had a label for the "on" state. Later, when an "off" label was made, not all uses in the code got the memo. Fix by setting these to PG_DATA_CHECKSUM_OFF. While there, fix a duplicate word in a comment introduced by the same commit. Author: Aleksander Alekseev <aleksander@tigerdata.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://postgr.es/m/CAJ7c6TPRTnQFXXX1CRcYoTLXw2swtDH==uSz1MYoMKdLrKZHjA@mail.gmail.com	2026-04-06 22:11:53 +02:00
Álvaro Herrera	28d534e2ae	Add CONCURRENTLY option to REPACK When this flag is specified, REPACK no longer acquires access-exclusive lock while the new copy of the table is being created; instead, it creates the initial copy under share-update-exclusive lock only (same as vacuum, etc), and it follows an MVCC snapshot; it sets up a replication slot starting at that snapshot, and uses a concurrent background worker to do logical decoding starting at the snapshot to populate a stash of concurrent data changes. Those changes can then be re-applied to the new copy of the table just before swapping the relfilenodes. Applications can continue to access the original copy of the table normally until just before the swap, which is the only point at which the access-exclusive lock is needed. There are some loose ends in this commit: 1. concurrent repack needs its own replication slot in order to apply logical decoding, which are a scarce resource and easy to run out of. 2. due to the way the historic snapshot is initially set up, only one REPACK process can be running at any one time on the whole system. 3. there's a danger of deadlocking (and thus abort) due to the lock upgrade required at the final phase. These issues will be addressed in upcoming commits. The design and most of the code are by Antonin Houska, heavily based on his own pg_squeeze third-party implementation. 
Author: Antonin Houska <ah@cybertec.at> Co-authored-by: Mihail Nikalayeu <mihailnikalayeu@gmail.com> Co-authored-by: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com> Reviewed-by: Srinath Reddy Sadipiralla <srinath2133@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Jim Jones <jim.jones@uni-muenster.de> Reviewed-by: Robert Treat <rob@xzilla.net> Reviewed-by: Noriyoshi Shinoda <noriyoshi.shinoda@hpe.com> Reviewed-by: vignesh C <vignesh21@gmail.com> Discussion: https://postgr.es/m/5186.1706694913@antos Discussion: https://postgr.es/m/202507262156.sb455angijk6@alvherre.pgsql	2026-04-06 21:55:08 +02:00
Alexander Korotkov	10484c2cc7	Document that WAIT FOR may be interrupted by recovery conflicts Add a note to the WAIT FOR documentation explaining that sessions using this command on a standby server may be interrupted by recovery conflicts. Some conflicts are unavoidable - for example, replaying a tablespace drop terminates all backends unconditionally. Discussion: https://postgr.es/m/CAPpHfds7oSCbZqob7ytT_Lso8fv-NW8LnedUTE4Krde%2B3rkJeA%40mail.gmail.com Author: Alexander Korotkov <aekorotkov@gmail.com> Reviewed-by: Xuneng Zhou <xunengzhou@gmail.com>	2026-04-06 22:47:26 +03:00
Alexander Korotkov	7e8aeb9e48	Use WAIT FOR LSN in PostgreSQL::Test::Cluster::wait_for_catchup() When the standby is passed as a PostgreSQL::Test::Cluster instance, use the WAIT FOR LSN command on the standby server to implement wait_for_catchup() for replay, write, and flush modes. This is more efficient than polling pg_stat_replication on the upstream, as the WAIT FOR LSN command uses a latch-based wakeup mechanism. The optimization applies when: - The standby is passed as a Cluster object (not just a name string) - The mode is 'replay', 'write', or 'flush' (not 'sent') Rather than pre-checking pg_is_in_recovery() on the standby (which would add an extra round-trip on every call), we issue WAIT FOR LSN directly and handle the 'not in recovery' result as a signal to fall back to polling. For 'sent' mode, when the standby is passed as a string (e.g., a subscription name for logical replication), when the standby has been promoted, or when WAIT FOR LSN is interrupted by a recovery conflict, the function falls back to the original polling-based approach using pg_stat_replication on the upstream. The recovery conflict fallback is necessary because some conflicts are unavoidable - for example, ResolveRecoveryConflictWithTablespace() kills all backends unconditionally, regardless of what they are doing. The recovery conflict detection matches the English error message "conflict with recovery", which is reliable because the test suite runs with LC_MESSAGES=C. Discussion: https://postgr.es/m/CABPTF7UiArgW-sXj9CNwRzUhYOQrevLzkYcgBydmX5oDes1sjg%40mail.gmail.com Author: Xuneng Zhou <xunengzhou@gmail.com> Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Alvaro Herrera <alvherre@kurilemu.de>	2026-04-06 22:47:26 +03:00
Alexander Korotkov	834038c1f8	Avoid syscache lookup while building a WAIT FOR tuple descriptor Use TupleDescInitBuiltinEntry instead of TupleDescInitEntry when building the result tuple descriptor for the WAIT FOR command. This avoids a syscache access that could re-establish a catalog snapshot after we've explicitly released all snapshots before the wait. Discussion: https://postgr.es/m/CABPTF7U%2BSUnJX_woQYGe%3D%3DR9Oz%2B-V6X0VO2stBLPGfJmH_LEhw%40mail.gmail.com Author: Xuneng Zhou <xunengzhou@gmail.com> Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>	2026-04-06 22:47:26 +03:00
Nathan Bossart	775fe51daa	Remove recheck_relation_needs_vacanalyze(). This function is a thin wrapper around relation_needs_vacanalyze() that handles fetching and freeing the pgstat entry for the table. Since all callers of relation_needs_vacanalyze() do that anyway, we can teach that function to fetch/free the pgstat entry and use it instead. Suggested-by: Álvaro Herrera <alvherre@kurilemu.de> Author: Sami Imseih <samimseih@gmail.com> Co-authored-by: Nathan Bossart <nathandbossart@gmail.com> Discussion: https://postgr.es/m/CAA5RZ0s4xjMrB-VAnLccC7kY8d0-4806-Lsac-czJsdA1LXtAw%40mail.gmail.com	2026-04-06 14:30:52 -05:00
Robert Haas	e972dff6c3	auto_explain: Add new GUC, auto_explain.log_extension_options. The associated value should look like something that could be part of an EXPLAIN options list, but restricted to EXPLAIN options added by extensions. For example, if pg_overexplain is loaded, you could set auto_explain.log_extension_options = 'DEBUG, RANGE_TABLE'. You can also specify arguments to these options in the same manner as normal e.g. 'DEBUG 1, RANGE_TABLE false'. Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com> Reviewed-by: Lukas Fittl <lukas@fittl.com> Discussion: http://postgr.es/m/CA+Tgmob-0W8306mvrJX5Urtqt1AAasu8pi4yLrZ1XfwZU-Uj1w@mail.gmail.com	2026-04-06 15:19:42 -04:00
Tom Lane	d516974840	Support more object types within CREATE SCHEMA. Having rejected the principle that we should know how to re-order the sub-commands of CREATE SCHEMA, there is not really anything except a little coding to stop us from supporting more object types. This patch adds support for creating functions (including procedures and aggregates), operators, types (including domains), collations, and text search objects. SQL:2021 specifies that we should allow functions, procedures, types, domains, and collations, so this moves us a great deal closer to full SQL compatibility of CREATE SCHEMA. What remains missing from their list are casts, transforms, roles, and some object types we don't support yet (e.g. CREATE CHARACTER SET). Supporting casts or transforms would be problematic because they don't have names at all, let alone schema-qualified names, so it'd be quite a stretch to say that they belong to a schema. Roles likewise are not schema-qualified, plus they are global to a cluster, making it even less reasonable to consider them as belonging to a schema. So I don't see us trying to complete the list. User-defined aggregates and operators are outside the spec's ken, as are text search objects, so adding them does not do anything for spec compatibility. But they go along with these other object types, plus it takes no additional code to support them since they are represented as DefineStmts like some variants of CREATE TYPE. It would indeed take some effort to reject them. Author: Kirill Reshke <reshkekirill@gmail.com> Author: Jian He <jian.universality@gmail.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CALdSSPh4jUSDsWu3K58hjO60wnTRR0DuO4CKRcwa8EVuOSfXxg@mail.gmail.com	2026-04-06 15:16:25 -04:00
Tom Lane	404db8f9ed	Execute foreign key constraints in CREATE SCHEMA at the end. The previous patch simplified CREATE SCHEMA's behavior to "execute all subcommands in the order they are written". However, that's a bit too simple, as the spec clearly requires forward references in foreign key constraint clauses to work, see feature F311-01. (Most other SQL implementations seem to read more into the spec than that, but it's not clear that there's justification for more in the text, and this is the only case that doesn't introduce unresolvable issues.) We never implemented that before, but let's do so now. To fix it, transform FOREIGN KEY clauses into ALTER TABLE ... ADD FOREIGN KEY commands and append them to the end of the CREATE SCHEMA's subcommand list. This works because the foreign key constraints are independent and don't affect any other DDL that might be in CREATE SCHEMA. For simplicity, we do this for all FOREIGN KEY clauses even if they would have worked where they were. Author: Jian He <jian.universality@gmail.com> Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/1075425.1732993688@sss.pgh.pa.us	2026-04-06 15:16:25 -04:00
Tom Lane	a9c350d9ee	Don't try to re-order the subcommands of CREATE SCHEMA. transformCreateSchemaStmtElements has always believed that it is supposed to re-order the subcommands of CREATE SCHEMA into a safe execution order. However, it is nowhere near being capable of doing that correctly. Nor is there reason to think that it ever will be, or that that is a well-defined requirement. (The SQL standard does say that it should be possible to do foreign-key forward references within CREATE SCHEMA, but it's not clear that the text requires anything more than that.) Moreover, the problem will get worse as we add more subcommand types. Let's just drop the whole idea and execute the commands in the order given, which seems like a much less astonishment-prone definition anyway. The foreign-key issue will be handled in a follow-up patch. This will result in a release-note-worthy incompatibility, which is that forward references like CREATE SCHEMA myschema CREATE VIEW myview AS SELECT * FROM mytable CREATE TABLE mytable (...); used to work and no longer will. Considering how many closely related variants never worked, this isn't much of a loss. Along the way, pass down a ParseState so that we can provide an error cursor for "wrong schema name" and related errors, and fix transformCreateSchemaStmtElements so that it doesn't scribble on the parsetree passed to it. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Jian He <jian.universality@gmail.com> Discussion: https://postgr.es/m/1075425.1732993688@sss.pgh.pa.us	2026-04-06 15:16:25 -04:00
Masahiko Sawada	1ff3180ca0	Allow autovacuum to use parallel vacuum workers. Previously, autovacuum always disabled parallel vacuum regardless of the table's index count or configuration. This commit enables autovacuum workers to use parallel index vacuuming and index cleanup, using the same parallel vacuum infrastructure as manual VACUUM. Two new configuration options control the feature. The GUC autovacuum_max_parallel_workers sets the maximum number of parallel workers a single autovacuum worker may launch; it defaults to 0, preserving existing behavior unless explicitly enabled. The per-table storage parameter autovacuum_parallel_workers provides per-table limits. A value of 0 disables parallel vacuum for the table, a positive value caps the worker count (still bounded by the GUC), and -1 (the default) defers to the GUC. To handle cases where autovacuum workers receive a SIGHUP and update their cost-based vacuum delay parameters mid-operation, a new propagation mechanism is added to vacuumparallel.c. The leader stores its effective cost parameters in a DSM segment. Parallel vacuum workers poll for changes in vacuum_delay_point(); if an update is detected, they apply the new values locally via VacuumUpdateCosts(). A new test module, src/test/modules/test_autovacuum, is added to verify that parallel autovacuum workers are correctly launched and that cost-parameter updates are propagated as expected. The patch was originally proposed by Maxim Orlov, but the implementation has undergone significant architectural changes since then during the review process. 
Author: Daniil Davydov <3danissimo@gmail.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Sami Imseih <samimseih@gmail.com> Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com> Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com> Reviewed-by: zengman <zengman@halodbtech.com> Discussion: https://postgr.es/m/CACG=ezZOrNsuLoETLD1gAswZMuH2nGGq7Ogcc0QOE5hhWaw=cw@mail.gmail.com	2026-04-06 11:48:29 -07:00
Álvaro Herrera	c0b53ec063	Rename cluster.c to repack.c (and corresponding .h). CLUSTER is no longer the favored way to invoke this functionality, and the code is about to shift its focus to REPACK more ambitiously. Rename the file to avoid leaving an unnecessary historical artifact around. Author: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/202603271635.owyhm7btgoic@alvherre.pgsql	2026-04-06 20:11:01 +02:00
Tom Lane	21c69dc73f	Disallow system columns in COPY FROM WHERE conditions. These columns haven't been computed yet when the filtering happens (since we've not written the candidate tuple into the table); so any check on them is wrong or useless. Worse, since `aa606b931` such a reference results in an access off the end of a TupleDesc, potentially causing a phony "generated columns are not supported in COPY FROM WHERE conditions" error; and since `c98ad086a` it throws an Assert instead. Actually we could allow tableoid, which has been set to the OID of the table named as the COPY target. However, plausible uses for tests of tableoid would involve a partitioned target table, and the user would wish it to read as the OID of the destination partition. There has been some discussion of changing things to make it work like that, but pending that happening we should just disallow tableoid along with other system columns. It seems best though to install this prohibition only in HEAD. In the back branches we'll just guard the unsafe TupleDesc access, and people will keep getting whatever semantics they got before. Reported-by: Alexander Lakhin <exclusion@gmail.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/6f435023-8ab6-47c2-ba07-035d0c4212f9@gmail.com	2026-04-06 14:05:01 -04:00
Tom Lane	f7da81f68b	Add missing .gitignore files. contrib/pg_stash_advice and src/test/modules/test_shmem missed these, leading to complaints from git after an in-tree check-world run. Use our standard boilerplate list of ignorable subdirectories, although the two modules presently create different subsets of that.	2026-04-06 13:25:29 -04:00
Tom Lane	6582010c80	Fix null-bitmap combining in array_agg_array_combine(). This code missed the need to update the combined state's nullbitmap if state1 already had a bitmap but state2 didn't. We need to extend the existing bitmap with 1's but didn't. This could result in wrong output from a parallelized array_agg(anyarray) calculation, if the input has a mix of null and non-null elements. The errors depended on timing of the parallel workers, and therefore would vary from one run to another. Also install guards against integer overflow when calculating the combined object's sizes, and make some trivial cosmetic improvements. Author: Dmytro Astapov <dastapov@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAFQUnFj2pQ1HbGp69+w2fKqARSfGhAi9UOb+JjyExp7kx3gsqA@mail.gmail.com Backpatch-through: 16	2026-04-06 13:14:53 -04:00
Robert Haas	0442f1c9ef	Add a guc_check_handler to the EXPLAIN extension mechanism. It would be useful to be able to tell auto_explain to set a custom EXPLAIN option, but it would be bad if it tried to do so and the option name or value wasn't valid, because then every query would fail with a complaint about the EXPLAIN option. So add a guc_check_handler that auto_explain will be able to use to only try to set option name/value/type combinations that have been determined to be legal, and to emit useful messages about ones that aren't. Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com> Reviewed-by: Lukas Fittl <lukas@fittl.com> Discussion: http://postgr.es/m/CA+Tgmob-0W8306mvrJX5Urtqt1AAasu8pi4yLrZ1XfwZU-Uj1w@mail.gmail.com	2026-04-06 12:31:47 -04:00
Nathan Bossart	e3481edfd1	Remove autoanalyze corner case. The restructuring in commit `53b8ca6881` revealed an interesting corner case: if a table needs vacuuming for wraparound prevention and autovacuum is disabled for it, we might still choose to analyze it. Research seems to indicate this was an accidental addition by commit `48188e1621`, and further discussion indicates there is consensus that it is unnecessary and can be removed. Reviewed-by: Robert Treat <rob@xzilla.net> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Sami Imseih <samimseih@gmail.com> Reviewed-by: Shinya Kato <shinya11.kato@gmail.com> Discussion: https://postgr.es/m/adB9nSsm_S0D9708%40nathan	2026-04-06 11:28:46 -05:00
Robert Haas	e0e819cc08	Expose helper functions scan_quoted_identifier and scan_identifier. Previously, this logic was embedded within SplitIdentifierString, SplitDirectoriesString, and SplitGUCList. Factoring it out saves a bit of duplicated code, and also makes it available to extensions that might want to do similar things without necessarily wanting to do exactly the same thing. Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com> Reviewed-by: Lukas Fittl <lukas@fittl.com> Discussion: http://postgr.es/m/CA+Tgmob-0W8306mvrJX5Urtqt1AAasu8pi4yLrZ1XfwZU-Uj1w@mail.gmail.com	2026-04-06 11:13:25 -04:00
Fujii Masao	ca2b5443e2	Add TAP tests for log_lock_waits. This commit updates 011_lock_stats.pl to verify log_lock_waits behavior. The tests check that messages are emitted both when a wait occurs and when the lock is acquired, and that the "still waiting for" message is logged exactly once per wait, even if the backend wakes up during the wait. The latter covers the behavior introduced by commit `fd6ecbfa75`. Author: Hüseyin Demir <huseyin.d3r@gmail.com> Co-authored-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/CAB5wL7YB1my9W5k5i=SY+=sTjeozyJ0YkvGXrVfeDNzuRkoTPg@mail.gmail.com	2026-04-06 23:49:40 +09:00
Fujii Masao	93dc1ace20	Release postmaster working memory context in slotsync worker. Child processes do not need the postmaster's working memory context and normally release it at the start of their main entry point. However, the slotsync worker forgot to do so. This commit makes the slotsync worker release the postmaster's working memory context at startup, preventing unintended use. Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Tiancheng Ge <getiancheng_2012@163.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CAHGQGwHO05JaUpgKF8FBDmPdBUJsK22axRRcgmAUc2Jyi8OK8g@mail.gmail.com	2026-04-06 23:04:18 +09:00

63991 commits