postgresql

mirror of https://github.com/postgres/postgres.git synced 2026-04-15 22:10:45 -04:00

Author	SHA1	Message	Date
Etsuro Fujita	28972b6fc3	Add support for importing statistics from remote servers. Add a new FDW callback routine that allows importing remote statistics for a foreign table directly to the local server, instead of collecting statistics locally. The new callback routine is called at the beginning of the ANALYZE operation on the table, and if the FDW failed to import the statistics, the existing callback routine is called on the table to collect statistics locally. Also implement this for postgres_fdw. It is enabled by "restore_stats" option both at the server and table level. Currently, it is the user's responsibility to ensure remote statistics to import are up-to-date, so the default is false. Author: Corey Huinker <corey.huinker@gmail.com> Co-authored-by: Etsuro Fujita <etsuro.fujita@gmail.com> Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Etsuro Fujita <etsuro.fujita@gmail.com> Discussion: https://postgr.es/m/CADkLM%3DchrYAx%3DX2KUcDRST4RLaRLivYDohZrkW4LLBa0iBhb5w%40mail.gmail.com	2026-04-08 19:15:00 +09:00
Andres Freund	2c16deee2f	instrumentation: Allocate query level instrumentation in ExecutorStart Until now extensions that wanted to measure overall query execution could create QueryDesc->totaltime, which the core executor would then start and stop. That's a bit odd and composes badly, e.g. extensions always had to use INSTRUMENT_ALL, because otherwise another extension might not get what they need. Instead this introduces a new field, QueryDesc->query_instr_options, that extensions can use to indicate whether they need query level instrumentation populated, and with which instrumentation options. Extensions should take care to only add options they need, instead of replacing the options of others. The prior name of the field, totaltime, sounded like it would only measure time, but these days the instrumentation infrastructure can track more resources. The secondary benefit is that this will make it obvious to extensions that they may not create the Instrumentation struct themselves anymore (often extensions build only against a postgres build without assertions). Adjust pg_stat_statements and auto_explain to match, and lower the requested instrumentation level for auto_explain to INSTRUMENT_TIMER, since the summary instrumentation it needs is only runtime. The reason to push this now, rather than in the PG 20 cycle, is that `5a79e78501` already required extensions using query level instrumentations to adjust their code, and it seemed undesirable to require them to do so again for 20. Author: Lukas Fittl <lukas@fittl.com> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/CAP53Pkyqsht+exJQYRsjhSWYKu+vFGHhPub7m6PmFD6Or0=p1g@mail.gmail.com	2026-04-08 00:06:45 -04:00
Tomas Vondra	61c36a34a4	auto_explain: Add new GUC auto_explain.log_io Allows enabling the new EXPLAIN "IO" option for auto_explain. Author: Tomas Vondra <tomas@vondra.me> Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Lukas Fittl <lukas@fittl.com> Discussion: https://postgr.es/m/flat/a177a6dd-240b-455a-8f25-aca0b1c08c6e%40vondra.me	2026-04-07 22:49:44 +02:00
Robert Haas	c10edb102a	pg_stash_advice: Allow stashed advice to be persisted to disk. If pg_stash_advice.persist = true, stashed advice will be written to pg_stash_advice.tsv in the data directory, periodically and at shutdown. On restart, stash modifications are locked out until this file has been reloaded, but queries will not be, so there may be a short window after startup during which previously-stashed advice is not automatically applied. Author: Robert Haas <rhaas@postgresql.org> Co-authored-by: Lukas Fittl <lukas@fittl.com> Discussion: https://postgr.es/m/CA+Tgmob87qsWa-VugofU6epuV0H5XjWZGMbQas4Q-ADKmvSyBg@mail.gmail.com	2026-04-07 10:11:36 -04:00
Heikki Linnakangas	257c8231bf	Modernize and optimize pg_buffercache_pages() Refactor pg_buffercache_pages() to use SFRM_Materialize mode and construct a tuplestore directly. That's simpler and more efficient than collecting all the data to a custom array first. Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Author: Palak Chaturvedi <chaturvedipalak1911@gmail.com> Discussion: https://www.postgresql.org/message-id/CAExHW5sMsaz1j+hrdhyo-DJp7JCgJx87=q2iJfOc_9mwYWyvmw@mail.gmail.com	2026-04-07 16:04:48 +03:00
Heikki Linnakangas	9f3755ea07	Optimize sorting and deduplicating trigrams Use templated qsort() so that the comparison function can be inlined. To speed up qunique(), use a specialized comparison function that only checks for equality. Author: David Geier <geidav.pg@gmail.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Discussion: https://www.postgresql.org/message-id/2a76b5ef-4b12-4023-93a1-eed6e64968f3@gmail.com	2026-04-07 14:11:25 +03:00
Álvaro Herrera	0d3dba38c7	Allow logical replication snapshots to be database-specific By default, the logical decoding assumes access to shared catalogs, so the snapshot builder needs to consider cluster-wide XIDs during startup. That in turn means that, if any transaction is already running (and has XID assigned), the snapshot builder needs to wait for its completion, as it does not know if that transaction performed catalog changes earlier. A possible problem with this concept is that if REPACK (CONCURRENTLY) is running in some database, backends running the same command in other databases get stuck until the first one has committed. Thus only a single backend in the cluster can run REPACK (CONCURRENTLY) at any time. Likewise, REPACK (CONCURRENTLY) can block walsenders starting on behalf of subscriptions throughout the cluster. This patch adds a new option to logical replication output plugin, to declare that it does not use shared catalogs (i.e. catalogs that can be changed by transactions running in other databases in the cluster). In that case, no snapshot the backend will use during the decoding needs to contain information about transactions running in other databases. Thus the snapshot builder only needs to wait for completion of transactions in the current database. Currently we only use this option in the REPACK background worker. It could possibly be used in the plugin for logical replication too, however that would need thorough analysis of that plugin. Bump WAL version number, due to a new field in xl_running_xacts. Author: Antonin Houska <ah@cybertec.at> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/90475.1775218118@localhost	2026-04-07 12:31:18 +02:00
Michael Paquier	49cc0d4148	Mark JumbleState as a const in the post_parse_analyze hook This commit changes the post_parse_analyze_hook_type() hook to take a const JumbleState, to tell external modules that they are not allowed to touch the JumbleState that has been compiled by the core code. This fixes a pretty old problem with pg_stat_statements, that had always the idea of modifying the lengths of the constants stored in the JumbleState. The previous state could confuse extensions that need to look at a JumbleState depending on the loading order, if pg_stat_statements is part of the stack loaded. Another piece included in this commit is the move of the routine fill_in_constant_lengths() to queryjumblefuncs.c, to give an option to extensions to compile the lengths of the constants, if necessary. I was surprised by the number of external code that carries a copy of this routine (see the thread for details). Previously, this routine modified JumbleState. It now copies the set of LocationLens from JumbleState, and fills the constant lengths for separate use. pg_stat_statements is updated to use the new ComputeConstantLengths(). JumbleState is now marked with a const in the module, where relevant. Author: Sami Imseih <samimseih@gmail.com> Co-authored-by: Lukas Fittl <lukas@fittl.com> Discussion: https://postgr.es/m/CAA5RZ0tZp5qU0ikZEEqJnxvdSNGh1DWv80sb-k4QAUmiMoOp_Q@mail.gmail.com	2026-04-07 15:22:49 +09:00
Robert Haas	e972dff6c3	auto_explain: Add new GUC, auto_explain.log_extension_options. The associated value should look like something that could be part of an EXPLAIN options list, but restricted to EXPLAIN options added by extensions. For example, if pg_overexplain is loaded, you could set auto_explain.log_extension_options = 'DEBUG, RANGE_TABLE'. You can also specify arguments to these options in the same manner as normal e.g. 'DEBUG 1, RANGE_TABLE false'. Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com> Reviewed-by: Lukas Fittl <lukas@fittl.com> Discussion: http://postgr.es/m/CA+Tgmob-0W8306mvrJX5Urtqt1AAasu8pi4yLrZ1XfwZU-Uj1w@mail.gmail.com	2026-04-06 15:19:42 -04:00
Tom Lane	f7da81f68b	Add missing .gitignore files. contrib/pg_stash_advice and src/test/modules/test_shmem missed these, leading to complaints from git after an in-tree check-world run. Use our standard boilerplate list of ignorable subdirectories, although the two modules presently create different subsets of that.	2026-04-06 13:25:29 -04:00
Robert Haas	0442f1c9ef	Add a guc_check_handler to the EXPLAIN extension mechanism. It would be useful to be able to tell auto_explain to set a custom EXPLAIN option, but it would be bad if it tried to do so and the option name or value wasn't valid, because then every query would fail with a complaint about the EXPLAIN option. So add a guc_check_handler that auto_explain will be able to use to only try to set option name/value/type combinations that have been determined to be legal, and to emit useful messages about ones that aren't. Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com> Reviewed-by: Lukas Fittl <lukas@fittl.com> Discussion: http://postgr.es/m/CA+Tgmob-0W8306mvrJX5Urtqt1AAasu8pi4yLrZ1XfwZU-Uj1w@mail.gmail.com	2026-04-06 12:31:47 -04:00
Robert Haas	e8ec19aa32	Add pg_stash_advice contrib module. This module allows plan advice strings to be provided automatically from an in-memory advice stash. Advice stashes are stored in dynamic shared memory and must be recreated and repopulated after a server restart. If pg_stash_advice.stash_name is set to the name of an advice stash, and if query identifiers are enabled, the query identifier for each query will be looked up in the advice stash and the associated advice string, if any, will be used each time that query is planned. Reviewed-by: Lukas Fittl <lukas@fittl.com> Reviewed-by: Alexandra Wang <alexandra.wang.oss@gmail.com> Reviewed-by: David G. Johnston <david.g.johnston@gmail.com> Reviewed-by: Jakub Wartak <jakub.wartak@enterprisedb.com> Discussion: http://postgr.es/m/CA+TgmoaeNuHXQ60P3ZZqJLrSjP3L1KYokW9kPfGbWDyt+1t=Ng@mail.gmail.com	2026-04-06 07:41:28 -04:00
Heikki Linnakangas	d4885af3d6	Convert pg_stat_statements to use the new shmem allocation functions As part of this, embed the LWLock it needs in the shared memory struct itself, so that we don't need to use RequestNamedLWLockTranche() anymore. LWLockNewTrancheId() + LWLockInitialize() is more convenient to use in extensions. Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://www.postgresql.org/message-id/CAExHW5vM1bneLYfg0wGeAa=52UiJ3z4vKd3AJ72X8Fw6k3KKrg@mail.gmail.com	2026-04-06 02:12:53 +03:00
Andres Freund	5a79e78501	instrumentation: Separate per-node logic from other uses Previously, different places (e.g. query "total time") were repurposing the Instrumentation struct initially introduced for capturing per-node statistics during execution. This overuse of the same struct is confusing, e.g. by cluttering calls of InstrStartNode/InstrStopNode in unrelated code paths, and prevents future refactorings. Instead, simplify the Instrumentation struct to only track time and WAL/buffer usage. Similarly, drop the use of InstrEndLoop outside of per-node instrumentation - these calls were added without any apparent benefit since the relevant fields were never read. Introduce the NodeInstrumentation struct to carry forward the per-node instrumentation information. WorkerInstrumentation is renamed to WorkerNodeInstrumentation for clarity. In passing, clarify that InstrAggNode is expected to only run after InstrEndLoop (as it does in practice), and drop unused code. This also fixes a consequence-less bug: Previously ->async_mode was only set when a non-zero instrument_option was passed. That turns out to be harmless right now, as ->async_mode only affects a timing related field. Author: Lukas Fittl <lukas@fittl.com> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/CAP53PkzdBK8VJ1fS4AZ481LgMN8f9mJiC39ZRHqkFUSYq6KWmg@mail.gmail.com	2026-04-05 19:04:24 -04:00
Andres Freund	7d9b74df53	instrumentation: Separate trigger logic from other uses Introduce TriggerInstrumentation to capture trigger timing and firings (previously counted in "ntuples"), to aid a future refactoring that splits out all Instrumentation fields beyond timing and WAL/buffers into more specific structs. In passing, drop the "n" argument to InstrAlloc, as all remaining callers need exactly one Instrumentation struct. The duplication between InstrAlloc() and InstrInit(), as well as the conditional initialization of async_mode will be addressed in a subsequent commit. Author: Lukas Fittl <lukas@fittl.com> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://www.postgresql.org/message-id/flat/CAP53PkzdBK8VJ1fS4AZ481LgMN8f9mJiC39ZRHqkFUSYq6KWmg@mail.gmail.com	2026-04-05 16:56:50 -04:00
Etsuro Fujita	de28140ded	postgres_fdw: Inherit the local transaction's access/deferrable modes. READ ONLY transactions should prevent modifications to foreign data as well as local data, but postgres_fdw transactions declared as READ ONLY that reference foreign tables mapped to a remote view executing volatile functions would modify data on remote servers, as it would open remote transactions in READ WRITE mode. Similarly, DEFERRABLE transactions should not abort due to a serialization failure even when accessing foreign data, but postgres_fdw transactions declared as DEFERRABLE would abort due to that failure in a remote server, as it would open remote transactions in NOT DEFERRABLE mode. To fix, modify postgres_fdw to open remote transactions in the same access/deferrable modes as the local transaction. This commit also modifies it to open remote subtransactions in the same access mode as the local subtransaction. This commit changes the behavior of READ ONLY/DEFERRABLE transactions using postgres_fdw; in particular, it doesn't allow the READ ONLY transactions to modify data on remote servers anymore, so such transactions should be redeclared as READ WRITE or rewritten using other tools like dblink. The release notes should note this as an incompatibility. These issues exist since the introduction of postgres_fdw, but to avoid the incompatibility in the back branches, fix them in master only. Author: Etsuro Fujita <etsuro.fujita@gmail.com> Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/CAPmGK16n_hcUUWuOdmeUS%2Bw4Q6dZvTEDHb%3DOP%3D5JBzo-M3QmpQ%40mail.gmail.com Discussion: https://postgr.es/m/E1uLe9X-000zsY-2g%40gemulon.postgresql.org	2026-04-05 18:55:00 +09:00
Heikki Linnakangas	9ebe1c4f2c	Merge init and max size options on shmem hash tables Replace the separate init and max size options with a single size option. We didn't make much use of the feature, all callers except the ones in wait_event.c already used the same size for both, and the hash tables in wait_event.c are small so there's little harm in just allocating them to the max size. The only reason why you might want to not reserve the max size upfront is to make the memory available for other hash tables to grow beyond their max size. Letting hash tables grow much beyond their max size is bad for performance, however, because we cannot resize the directory, and we never had very much "wiggle room" to grow to anyway so you couldn't really rely on it. We recently marked the LOCK and PROCLOCK tables with HAS_FIXED_SIZE, so there's nothing left in core that would benefit from more unallocated shared memory. Reviewed-by: Tomas Vondra <tomas@vondra.me> Discussion: https://www.postgresql.org/message-id/01ab1d41-3eda-4705-8bbd-af898f5007f1@iki.fi	2026-04-04 02:40:20 +03:00
Peter Eisentraut	8e72d914c5	Add UPDATE/DELETE FOR PORTION OF This is an extension of the UPDATE and DELETE commands to do a "temporal update/delete" based on a range or multirange column. The user can say UPDATE t FOR PORTION OF valid_at FROM '2001-01-01' TO '2002-01-01' SET ... (or likewise with DELETE) where valid_at is a range or multirange column. The command is automatically limited to rows overlapping the targeted portion, and only history within those bounds is changed. If a row represents history partly inside and partly outside the bounds, then the command truncates the row's application time to fit within the targeted portion, then it inserts one or more "temporal leftovers": new rows containing all the original values, except with the application-time column changed to only represent the untouched part of history. To compute the temporal leftovers that are required, we use the *_minus_multi set-returning functions defined in `5eed8ce50c`. - Added bison support for FOR PORTION OF syntax. The bounds must be constant, so we forbid column references, subqueries, etc. We do accept functions like NOW(). - Added logic to executor to insert new rows for the "temporal leftover" part of a record touched by a FOR PORTION OF query. - Documented FOR PORTION OF. - Added tests. Author: Paul A. Jungwirth <pj@illuminatedcomputing.com> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://www.postgresql.org/message-id/flat/ec498c3d-5f2b-48ec-b989-5561c8aa2024%40illuminatedcomputing.com	2026-04-01 19:06:03 +02:00
Nathan Bossart	771fe0948c	Avoid including vacuum.h in tableam.h and heapam.h. Commit `2252fcd427` modified some function prototypes in tableam.h and heapam.h to take a VacuumParams argument instead of a pointer, which required including vacuum.h in those headers. vacuum.h has a reasonably large dependency tree, and headers like tableam.h are widely included, so this is not ideal. To fix, change the functions in question to accept a "const VacuumParams *" argument instead. That allows us to use a forward declaration for VacuumParams and avoid including vacuum.h. Since vacuum_rel() needs to scribble on the params argument, we still pass it by value to that function so that the original struct is not modified. Reported-by: Andres Freund <andres@anarazel.de> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/rzxpxod4c4la62yvutyrvgoyilrl2fx55djaf2suidy7np5m6c%403l2ln476eadh	2026-03-31 12:43:52 -05:00
Nathan Bossart	bab2f27eaa	Remove bits* typedefs. In addition to removing the bits8, bits16, and bits32 typedefs, this commit replaces all uses with uint8, uint16, or uint32. bits* provided little benefit beyond establishing the intent of the variable, and they were inconsistently used for that purpose. Third-party code should instead use the corresponding uint* typedef. Suggested-by: Andres Freund <andres@anarazel.de> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> Discussion: https://postgr.es/m/absbX33E4eaA0Ity%40nathan	2026-03-30 16:12:08 -05:00
Melanie Plageman	dcd8cc1c85	Thread flags through begin-scan APIs Add an AM user-settable flags parameter to several of the table scan functions, one table AM callback, and index_beginscan(). This allows users to pass additional context to be used when building the scan descriptors. For index scans, a new flags field is added to IndexFetchTableData, and the heap AM saves the caller-provided flags there. This introduces an extension point for follow-up work to pass per-scan information (such as whether the relation is read-only for the current query) from the executor to the AM layer. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Tomas Vondra <tomas@vondra.me> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/2be31f17-5405-4de9-8d73-90ebc322f7d8%40vondra.me	2026-03-30 12:27:24 -04:00
Robert Haas	e2ee95233c	pg_plan_advice: Avoid assertion failure with partitionwise aggregate. An Append node that is part of a partitionwise aggregate has no apprelids. If such a node was elided, the previous coding would attempt to call unique_nonjoin_rtekind() on a NULL pointer, which leads to an assertion failure. Insert a NULL check to prevent that. Reported-by: Alexander Lakhin <exclusion@gmail.com> Discussion: http://postgr.es/m/0afba1ce-c946-4131-972d-191d9a1c097c@gmail.com	2026-03-30 09:58:25 -04:00
Melanie Plageman	39dcd10a2c	Remove PlannedStmt->resultRelations in favor of resultRelationRelids PlannedStmt->resultRelations was an integer list of range table indexes because at the time it was added (to Query), the Bitmapset data type did not yet exist in Postgres. `0f4c170cf3` added a Bitmapset of result relations, so remove the integer list of RTIs and use the more compact resultRelationRelids. Discussion: https://postgr.es/m/CAApHDvqAOeOwCKh9g0gfxWa040%3DHyc7_oA%3DC59rjod8kXJDWyw%40mail.gmail.com	2026-03-30 09:51:28 -04:00
Fujii Masao	7bff9f106a	psql: Make \d+ partition list formatting consistent with other objects Previously, \d+ <table> displayed partitions differently from other object lists: the first partition appeared on the same line as the "Partitions" header. For example: Partitions: pt12 FOR VALUES IN (1, 2), pt34 FOR VALUES IN (3, 4) This commit updates the output so that partitions are listed consistently with other objects, with each entry on its own line starting below the header: Partitions: pt12 FOR VALUES IN (1, 2) pt34 FOR VALUES IN (3, 4) Author: Peter Smith <smithpb2250@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Neil Chen <carpenter.nail.cz@gmail.com> Reviewed-by: Greg Sabino Mullane <htamfids@gmail.com> Reviewed-by: Soumya S Murali <soumyamurali.work@gmail.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/CAHut+Pu1puO00C-OhgLnAcECzww8MB3Q8DCsvx0cZWHRfs4gBQ@mail.gmail.com	2026-03-30 11:06:42 +09:00
Andres Freund	8df3c48e46	Use UnlockReleaseBuffer() in more places An upcoming commit will make UnlockReleaseBuffer() considerably faster and more scalable than doing LockBuffer(BUFFER_LOCK_UNLOCK); ReleaseBuffer();. But it's a small performance benefit even as-is. Most of the callsites changed in this patch are not performance sensitive, however some, like the nbtree ones, are in critical paths. This patch changes all the easily convertible places over to UnlockReleaseBuffer() mainly because I needed to check all of them anyway, and reducing cases where the operations are done separately makes the checking easier. Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Discussion: https://postgr.es/m/5ubipyssiju5twkb7zgqwdr7q2vhpkpmuelxfpanetlk6ofnop@hvxb4g2amb2d	2026-03-27 15:56:29 -04:00
Heikki Linnakangas	3c74cb5762	Avoid memory leak on error while parsing pg_stat_statements dump file By using palloc() instead of raw malloc(). Reported-by: Gaurav Singh <gaurav.singh@yugabyte.com> Reviewed-by: Lukas Fittl <lukas@fittl.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://www.postgresql.org/message-id/CAEcQ1bYR9s4eQLFDjzzJHU8fj-MTbmRpW-9J-r2gsCn+HEsynw@mail.gmail.com Backpatch-through: 14	2026-03-27 12:25:10 +02:00
Robert Haas	874da8b1f6	pg_plan_advice: pgindent Reported-by: Lukas Fittl <lukas@fittl.com>	2026-03-26 20:10:13 -04:00
Robert Haas	6455e55b0d	pg_plan_advice: Invent DO_NOT_SCAN(relation_identifier). The premise of src/test/modules/test_plan_advice is that if we plan a query once, generate plan advice, and then replan it using that same advice, all of that advice should apply cleanly, since the settings and everything else are the same. Unfortunately, that's not the case: the test suite is the main regression tests, and concurrent activity can change the statistics on tables involved in the query, especially system catalogs. That's OK as long as it only affects costing, but in a few cases, it affects which relations appear in the final plan at all. In the buildfarm failures observed to date, this happens because we consider alternative subplans for the same portion of the query; in theory, MinMaxAggPath is vulnerable to a similar hazard. In both cases, the planner clones an entire subquery, and the clone has a different plan name, and therefore different range table identifiers, than the original. If a cost change results in flipping between one of these plans and the other, the test_plan_advice tests will fail, because the range table identifiers to which advice was applied won't even be present in the output of the second planning cycle. To fix, invent a new DO_NOT_SCAN advice tag. When generating advice, emit it for relations that should not appear in the final plan at all, because some alternative version of that relation was used instead. When DO_NOT_SCAN is supplied, disable all scan methods for that relation. To make this work, we reuse a bunch of the machinery that previously existed for the purpose of ensuring that we build the same set of relation identifiers during planning as we do from the final PlannedStmt. In the process, this commit slightly weakens the cross-check mechanism: before this commit, it would fire whenever the pg_plan_advice module was loaded, even if pg_plan_advice wasn't actually doing anything; now, it will only engage when we have some other reason to create a pgpa_planner_state. The old way was complex and didn't add much useful test coverage, so this seems like an acceptable sacrifice. Discussion: http://postgr.es/m/CA+TgmoYuWmN-00Ec5pY7zAcpSFQUQLbgAdVWGR9kOR-HM-fHrA@mail.gmail.com Reviewed-by: Lukas Fittl <lukas@fittl.com>	2026-03-26 17:09:57 -04:00
Robert Haas	5dcb15e89a	pg_plan_advice: Refactor to invent pgpa_planner_info pg_plan_advice tracks two pieces of per-PlannerInfo data: (1) for each RTI, the corresponding relation identifier, for purposes of cross-checking those calculations against the final plan; and (2) the set of semijoins seen during planning for which the strategy of making one side unique was considered. The former is tracked using a hash table that uses <plan_name, RTI> as the key, and the latter is tracked using a List of <plan_name, relids>. It seems better to track both of these things in the same way and to try to reuse some code instead of having everything be completely separate, so invent pgpa_planner_info; we'll create one every time we see a new PlannerInfo and need to associate some data with it, and we'll use the plan_name field to distinguish between PlannerInfo objects, as it should always be unique. Then, refactor the two systems mentioned above to use this new infrastructure. (Note that the adjustment in pgpa_plan_walker is necessary in order to avoid spuriously triggering the sanity check in that function, in the case where a pgpa_planner_info is created for a purpose not related to sj_unique_rels.) Discussion: https://postgr.es/m/CA+TgmoaK=4w7-qknUo3QhUJ53pXZq=c=KgZmRyD+k7ytqfmgSg@mail.gmail.com Reviewed-by: Lukas Fittl <lukas@fittl.com>	2026-03-26 11:57:33 -04:00
Álvaro Herrera	2102ebb195	Don't include storage/lock.h in so many headers Since storage/locktags.h was added by commit `322bab7974`, many headers can be made leaner by depending on that instead of on storage/lock.h, which has many other dependencies. (In fact, some of these changes were possible even before that.) Author: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://postgr.es/m/abvrRZo52Yx9ZzWQ@ip-10-97-1-34.eu-west-3.compute.internal	2026-03-24 17:11:12 +01:00
Álvaro Herrera	5f2350a043	Fix dereference in a couple of GUC check hooks check_backtrace_functions() and check_archive_directory() were doing an empty-string check this way: *newval[0] == '\0' which, because of operator precedence, is interpreted as *(newval[0]) instead of (*newval)[0] -- but these variables are pointers to C-strings and we want to check the first character therein, rather than check the first pointer of the array, so that interpretation is wrong. This would be wrong for any index element other than 0, as evidenced by every other dereference of the same variable in check_backtrace_functions, which use parentheses. Add parentheses to make the intended dereference explicit. This is just cosmetic at this stage, so no backpatch, although it's been "wrong" for a long time. Author: Zhang Hu <kongbaik228@gmail.com> Reviewed-by: Junwang Zhao <zhjwpku@gmail.com> Reviewed-by: Chao Li <lic@highgo.com> Discussion: https://postgr.es/m/CAB5m2QssN6UO+ckr6ZCcV0A71mKUB6WdiTw1nHo43v4DTW1Dfg@mail.gmail.com	2026-03-24 16:45:39 +01:00
Fujii Masao	1c162c965a	Report detailed errors from XLogFindNextRecord() failures. Previously, XLogFindNextRecord() did not return detailed error information when it failed to find a valid WAL record. As a result, callers such as the WAL summarizer, pg_waldump, and pg_walinspect could only report generic errors (e.g., "could not find a valid record after ..."), making troubleshooting difficult. This commit fixes the issue by extending XLogFindNextRecord() to return detailed error information on failure, and updating its callers to include those details in their error messages. For example, when pg_waldump is run on a WAL file with an invalid magic number, it now reports not only the generic error but also the specific cause (e.g., "invalid magic number"). Author: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com> Reviewed-by: Mircea Cadariu <cadariu.mircea@gmail.com> Reviewed-by: Japin Li <japinli@hotmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/CAO6_XqoxJXddcT4wkd9Xd+cD6Sz-fyspRGuV4Bq-wbXG4pVNzA@mail.gmail.com	2026-03-24 22:33:09 +09:00
Tom Lane	ce8d5fe0e2	plpgsql: optimize "SELECT simple-expression INTO var". Previously, we always fed SELECT ... INTO to the SPI machinery. While that works for all cases, it's a great deal slower than the otherwise-equivalent "var := expression" if the expression is "simple" and the INTO target is a single variable. Users coming from MSSQL or T_SQL are likely to be surprised by this; they are used to writing SELECT ... INTO since there is no "var := expression" syntax in those dialects. Hence, check for a simple expression and use the faster code path if possible. (Here, "simple" means whatever exec_is_simple_query accepts, which basically means "SELECT scalar-expression" without any input tables, aggregates, qual clauses, etc.) This optimization is not entirely transparent. Notably, one of the reasons it's faster is that the hooks that pg_stat_statements uses aren't called in this path, so that the evaluated expression no longer appears in pg_stat_statements output as it did before. There may be some other minor behavioral changes too, although I tried hard to make error reporting look the same. Hopefully, none of them are significant enough to not be acceptable as routine changes in a PG major version. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com> Discussion: https://postgr.es/m/CAFj8pRDieSQOPDHD_svvR75875uRejS9cN87FoAC3iXMXS1saQ@mail.gmail.com	2026-03-20 18:23:45 -04:00
Tom Lane	733f20df53	Discount the metapage when estimating number of index pages visited. genericcostestimate() estimates the number of index leaf pages to be visited as a pro-rata fraction of the total number of leaf pages. Or at least that was the intention. What it actually used in the calculation was the total number of index pages, so that non-leaf pages were also counted. In a decent-sized index the error is probably small, since we expect upper page fanout to be high. But in a small index that's not true; in the worst case with one data-bearing page plus a metapage, we had 100% relative error. This led to surprising planning choices such as not using a small partial index. To fix, ask genericcostestimate's caller to supply an estimate of the number of non-leaf pages, and subtract that. For the built-in index AMs, it seems sufficient to count the index metapage (if the AM uses one) as non-leaf. Per the above argument, counting upper index pages shouldn't change the estimate much, and in most cases we don't have any easy way of estimating the number of upper pages. This might be an area for further research in future. Any external genericcostestimate callers that do not set the new field GenericCosts.numNonLeafPages will see the same behavior as before, assuming they followed the advice to zero out that whole struct. Unsurprisingly, this change affects a number of plans seen in the core regression tests. I hacked up the existing tests to keep the tests' plans the same, since in each case it appeared that the test's intent was to test exactly that plan. Also add one new test case demonstrating that a better index choice is now made. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Henson Choi <assam258@gmail.com> Discussion: https://postgr.es/m/870521.1745860752@sss.pgh.pa.us	2026-03-20 14:50:53 -04:00
Peter Eisentraut	57ee397953	Update Unicode data to Unicode 17.0.0 Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Alexander Borisov <lex.borisov@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://www.postgresql.org/message-id/flat/2a668979-ed92-49a3-abf9-a3ec2d460ec2%40eisentraut.org	2026-03-20 08:42:50 +01:00
Robert Haas	b335fe56f3	pg_plan_advice: Fix multiple copy-and-paste-errors in test case. The second half of this file is meant to test feedback, not generated advice, and is meant to use the statements that it prepares, not leftover prepared statements from earlier in the file. These mistakes resulted in failures under debug_discard_caches = 1, because re-executing pt2 instead of executing pt4 for the first time resulted in different output depending on whether the query was replanned. Reported-by: Tom Lane <tgl@sss.pgh.pa.us> (per BF member avocet)	2026-03-18 18:24:39 -04:00
Jeff Davis	1c5bf1185a	ALTER SUBSCRIPTION ... SERVER test. Test ALTER SUBSCRIPTION ... SERVER and ALTER SUBSCRIPTION ... CONNECTION, including invalidation. Also run perltidy on the test file. Discussion: https://postgr.es/m/CAExHW5vV5znEvecX=ra2-v7UBj9-M6qvdDzuB78M-TxbYD1PEA@mail.gmail.com Suggested-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>	2026-03-18 10:15:51 -07:00
Peter Eisentraut	1b0c269f2e	Implement unaccent Unicode data update in meson The meson/ninja update-unicode target did not cover the required updates in contrib/unaccent/. This is fixed now. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Alexander Borisov <lex.borisov@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/2a668979-ed92-49a3-abf9-a3ec2d460ec2%40eisentraut.org	2026-03-18 13:42:05 +01:00
Robert Haas	01b02c0eca	pg_plan_advice: Avoid a crash under GEQO. The previous code could allocate pgpa_sj_unique_rel objects in a context that had too short a lifespan. Fix by allocating them (and any associated List-related allocations) in the same context as the pgpa_planner_state to which they are attached. We also need to copy uniquerel->relids, because the associated RelOptInfo may also be allocated within a short-lived context. Reported-by: Alexander Lakhin <exclusion@gmail.com> Discussion: http://postgr.es/m/a6e6d603-e847-44dc-acd5-879fb4570062@gmail.com	2026-03-17 14:25:43 -04:00
Robert Haas	59dcc19b39	pg_plan_advice: Always install pg_plan_advice.h, and in the right place The Makefile failed to set HEADERS_pg_plan_advice, so the header wasn't installed. Fixing that reveals another problem: since this is just a loadable module, not an extension, the header file is installed into $(includedir_server)/contrib rather than $(includedir_server)/extension. While we have no existing cases of installing header files there, it appears to be the intent of pgxs.mk. However, this is inconsistent with meson.build, which was using dir_include_extension. Changing that to dir_include_server / 'contrib' makes the install locations consistent across the two builds. Author: Zsolt Parragi <zsolt.parragi@percona.com> Discussion: http://postgr.es/m/CAN4CZFP6NOjv__4Mx+iQD8StdpbHvzDAatEQn2n15UKJ=MySSQ@mail.gmail.com	2026-03-17 12:53:13 -04:00
Robert Haas	7560995a38	pg_plan_advice: Fix variable type confusion. pgs_mask values should always be uint64, but in a couple of places I incorrectly used uint32. Fix that. Reported-by: David Rowley <dgrowleyml@gmail.com> Discussion: http://postgr.es/m/CAApHDvquH6wnp4fhpaCOkC4R3KAvr2BOTbhhDPDQCBNR3YbLMQ@mail.gmail.com	2026-03-17 11:34:26 -04:00
Jeff Davis	f4af7849b3	Clean up postgres_fdw/t/010_subscription.pl. The test was based on test/subscription/002_rep_changes.pl, but had some leftover copy+paste problems that were useless and/or distracting. Discussion: https://postgr.es/m/CAA4eK1+=V_UFNHwcoMFqzy0F4AtS9_GyXhQDUzizgieQPWr=0A@mail.gmail.com Reported-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>	2026-03-16 13:42:55 -07:00
Robert Haas	5e72ce2467	pg_plan_advice: Fix failures to accept identifier keywords. TOK_IDENT allows only non-keywords; identifier should be used any place where either keywords or non-keywords should be accepted. Hence, without this commit, any string that happens to be a keyword can't be used as a partition schema, partition name, or plan name, which is incorrect. Author: Lukas Fittl <lukas@fittl.com> Discussion: http://postgr.es/m/CAP53PkzKeD=t90OfeMsniYrcRe2THQbUx3g6wV17Y=ZtiwmWTQ@mail.gmail.com	2026-03-16 14:46:50 -04:00
Peter Eisentraut	4f888d0f94	Fix whitespace	2026-03-16 19:33:13 +01:00
Álvaro Herrera	fba4233c83	Reduce header inclusions via execnodes.h Remove a bunch of #include lines from execnodes.h. Most of these require suitable typedefs to be added, so that it still compiles standalone. In one case, the fix is to move a struct definition to the one .c file where it is needed. Also some light clean up in plannodes.h and genam.h, though not as extensive as in execnodes.h. Author: Álvaro Herrera <alvherre@kurilemu.de> Author: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/202603131240.ihwqdxnj7w2o@alvherre.pgsql	2026-03-16 14:34:57 +01:00
Peter Eisentraut	2f094e7ac6	SQL Property Graph Queries (SQL/PGQ) Implementation of SQL property graph queries, according to SQL/PGQ standard (ISO/IEC 9075-16:2023). This adds: - GRAPH_TABLE table function for graph pattern matching - DDL commands CREATE/ALTER/DROP PROPERTY GRAPH - several new system catalogs and information schema views - psql \dG command - pg_get_propgraphdef() function for pg_dump and psql A property graph is a relation with a new relkind RELKIND_PROPGRAPH. It acts like a view in many ways. It is rewritten to a standard relational query in the rewriter. Access privileges act similar to a security invoker view. (The security definer variant is not currently implemented.) Starting documentation can be found in doc/src/sgml/ddl.sgml and doc/src/sgml/queries.sgml. Author: Peter Eisentraut <peter@eisentraut.org> Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-by: Junwang Zhao <zhjwpku@gmail.com> Reviewed-by: Ajay Pal <ajay.pal.k@gmail.com> Reviewed-by: Henson Choi <assam258@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/a855795d-e697-4fa5-8698-d20122126567@eisentraut.org	2026-03-16 10:14:18 +01:00
Fujii Masao	d8879d34b9	file_fdw: Add regression test for file_fdw with ON_ERROR='set_null' Commit `2a525cc97e` introduced the ON_ERROR = 'set_null' option for COPY, allowing it to be used with foreign tables backed by file_fdw. However, unlike ON_ERROR = 'ignore', no regression test was added to verify this behavior for file_fdw. This commit adds a regression test to ensure that foreign tables using file_fdw work correctly with ON_ERROR = 'set_null', improving test coverage. Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Yi Ding <dingyi_yale@163.com> Discussion: https://postgr.es/m/CAHGQGwGmPc6aHpA5=WxKreiDePiOEitfOFsW2dSo5m81xWXgRA@mail.gmail.com	2026-03-16 12:13:11 +09:00
David Rowley	503620311e	Add all required calls to TupleDescFinalize() As of this commit all TupleDescs must have TupleDescFinalize() called on them once the TupleDesc is set up and before BlessTupleDesc() is called. In this commit, TupleDescFinalize() does nothing. This change has only been separated out from the commit that properly implements this function to make the change more obvious. Any extension which makes its own TupleDesc will need to be modified to call the new function. The follow-up commit which properly implements TupleDescFinalize() will cause any code which forgets to do this to fail in assert-enabled builds in BlessTupleDesc(). It may still be worth mentioning this change in the release notes so that extension authors update their code. Author: David Rowley <dgrowleyml@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: John Naylor <johncnaylorls@gmail.com> Reviewed-by: Amit Langote <amitlangote09@gmail.com> Reviewed-by: Zsolt Parragi <zsolt.parragi@percona.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Junwang Zhao <zhjwpku@gmail.com> Discussion: https://postgr.es/m/CAApHDvpoFjaj3%2Bw_jD5uPnGazaw41A71tVJokLDJg2zfcigpMQ%40mail.gmail.com	2026-03-16 11:45:49 +13:00
Michael Paquier	ae58189a4d	pgstattuple: Optimize pgstattuple_approx() with streaming read This commit plugs into pgstattuple_approx(), the SQL function faster than pgstattuple() that returns approximate results, the streaming read APIs. A callback is used to be able to skip all-visible pages via VM lookup, to match with the logic prior to this commit. Under test conditions similar to `6c228755ad` (some dm_delay and debug_io_direct=data), this can substantially improve the execution time of the function, particularly for large relations. Author: Xuneng Zhou <xunengzhou@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com> Discussion: https://postgr.es/m/CABPTF7VrqfbcDXqGrdLQ2xaQ=K0RzExNuw6U_GGqzSJu32wfdQ@mail.gmail.com	2026-03-14 15:06:13 +09:00
Nathan Bossart	be43c48c22	Initialize variable to placate compiler. Since commit `5883ff30b0`, some compilers have been warning that the rtekind variable in unique_nonjoin_rtekind() may be used uninitialized. There doesn't appear to be any actual risk, so let's just initialize it to something to silence the compiler warnings. Author: Sami Imseih <samimseih@gmail.com> Discussion: https://postgr.es/m/CAA5RZ0sieVNfniCKMDdDjuXGd1OuzMQfTS5%3D9vX3sa-iiujKUA%40mail.gmail.com	2026-03-13 11:32:14 -05:00
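
The arithmetic behind commit 733f20df53 (the metapage discount in genericcostestimate()) can be illustrated with a minimal sketch. This is not PostgreSQL's actual code: the function name and parameters below are hypothetical, rounding is omitted, and only the pro-rata idea described in the commit message is modeled. It assumes the caller supplies the number of non-leaf pages, with 0 reproducing the old behavior.

```c
#include <assert.h>

/*
 * Hypothetical, simplified pro-rata estimate of index leaf pages visited.
 * Not the real genericcostestimate(); names and signature are assumptions
 * made for illustration only.
 *
 * tuples_fetched      - index tuples the scan is expected to visit
 * total_index_pages   - all pages in the index, including any metapage
 * total_index_tuples  - all tuples in the index
 * non_leaf_pages      - pages to discount (e.g. 1 for a metapage);
 *                       passing 0 gives the pre-fix behavior
 */
double
estimate_leaf_pages_visited(double tuples_fetched,
                            double total_index_pages,
                            double total_index_tuples,
                            double non_leaf_pages)
{
    assert(total_index_tuples > 0);
    assert(non_leaf_pages >= 0 && non_leaf_pages < total_index_pages);

    /* Only leaf pages hold the tuples, so scale by leaf pages alone. */
    double leaf_pages = total_index_pages - non_leaf_pages;

    return tuples_fetched * leaf_pages / total_index_tuples;
}
```

For the worst case named in the commit message, an index with one data-bearing page plus a metapage, scanning all 100 tuples of a 100-tuple index yields an estimate of 2 pages when nothing is discounted and 1 page when the metapage is subtracted, which is the 100% relative error the commit describes.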

5185 commits