The error message added in 379695d3cc referred to the public key being
too long. This is confusing, as it is in fact the session key included
in a PGP message that is too long. This is harmless, but let's be
precise about what is wrong.
Per offline report.
Reported-by: Zsolt Parragi <zsolt.parragi@percona.com>
Backpatch-through: 14
Commit 1e7fe06c10 changed
pg_mbstrlen_with_len() to ereport(ERROR) if the input ends in an
incomplete character. Most callers want that. text_substring() does
not. It detoasts the most bytes it could possibly need to get the
requested number of characters. For example, to extract up to 2 chars
from UTF8, it needs to detoast 8 bytes. In a string of 3-byte UTF8
chars, 8 bytes spans 2 complete chars and 1 partial char.
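For illustration (hypothetical table and column names; a sketch of the
failing pattern, not the reporter's exact query):
-- col holds 3-byte UTF8 chars; before this fix, the 8-byte slice
-- detoasted for a 2-char substring could end mid-character and raise
-- an encoding error even though the requested chars were all complete
SELECT substring(col FROM 1 FOR 2) FROM mytable;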
Fix this by replacing that pg_mbstrlen_with_len() call with a string
traversal that stops upon finding as many chars as the substring could
need. This also makes SUBSTRING() stop raising an encoding error when
the incomplete char is past the end of the substring.
This is consistent with the general philosophy of the above commit,
which was to raise errors on a just-in-time basis. Before the above
commit, SUBSTRING() never raised an encoding error.
SUBSTRING() has long detoasted enough bytes for one more char than
needed, because it did not distinguish exclusive and inclusive end
positions. For the avoidance of doubt, stop detoasting the extra char.
Back-patch to v14, like the above commit. For applications using
SUBSTRING() on non-ASCII column values, consider applying this to your
copy of any of the February 12, 2026 releases.
Reported-by: SATŌ Kentarō <ranvis@gmail.com>
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Bug: #19406
Discussion: https://postgr.es/m/19406-9867fddddd724fca@postgresql.org
Backpatch-through: 14
The prior order caused spurious Valgrind errors. They're spurious
because the ereport(ERROR) non-local exit discards the pointer in
question. pg_mblen_cstr() ordered the checks correctly, but these other
two did not. Back-patch to v14, like commit
1e7fe06c10.
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/20260214053821.fa.noahmisch@microsoft.com
Backpatch-through: 14
Radix sort can be much faster than quicksort, but for our purposes it
is limited to sequences of unsigned bytes. To make tuples with other
types amenable to this technique, several features of tuple comparison
must be accounted for, i.e. the sort key must be "normalized" (see the
sketch after this list):
1. Signedness -- It's possible to modify a signed integer such that
it can be compared as unsigned. For example, a signed char has range
-128 to 127. If we cast that to unsigned char and add 128, the range
of values becomes 0 to 255 while preserving order.
2. Direction -- SQL allows specification of ASC or DESC. The
descending case is easily handled by taking the complement of the
unsigned representation.
3. NULL values -- NULLS FIRST and NULLS LAST must work correctly.
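To illustrate the first two points, here is a SQL sketch of the
mapping (the real normalization operates on Datums in C):
-- Point 1: shifting a signed range into an unsigned one keeps order
SELECT i AS signed, i + 128 AS normalized
FROM (VALUES (-128), (-1), (0), (127)) AS t(i)
ORDER BY i;
-- normalized comes out as 0, 127, 128, 255: the same relative order
-- Point 2: for DESC, compare the complement, i.e. 255 - normalized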
This commit only handles the case where datum1 is a pass-by-value
Datum (possibly abbreviated) that compares like an ordinary
integer. (Abbreviations of values of type "numeric" are a convenient
counterexample.) First, tuples are partitioned by nullness in the
correct NULL ordering. Then the NOT NULL tuples are sorted with radix
sort on datum1. For tiebreaks on subsequent sortkeys (including the
first sort key if abbreviated), we divert to the usual qsort.
With radix sort, ORDER BY queries on pre-warmed buffers are up to 2x
faster on high-cardinality inputs than with the sort specializations
added by commit 697492434, so get rid of them. It's sufficient to fall
back to qsort_tuple() for small arrays. Moderately low cardinality
inputs show more modest improvements. Our qsort is strongly optimized
for very low cardinality inputs, but radix sort is usually equal or
very close in those cases.
The changes to the regression tests are caused by under-specified sort
orders, e.g. "SELECT a, b from mytable order by a;". For unstable
sorts, such as our qsort and this in-place radix sort, there is no
guarantee of the order of "b" within each group of "a".
The implementation is taken from ska_byte_sort() (Boost licensed),
which is similar to American flag sort (an in-place radix sort) with
modifications to make it better suited for modern pipelined CPUs.
The technique of normalization described above can also be extended
to the case of multiple keys. That is left for future work (Thanks
to Peter Geoghegan for the suggestion to look into this area).
Reviewed-by: Chengpeng Yan <chengpeng_yan@outlook.com>
Reviewed-by: zengman <zengman@halodbtech.com>
Reviewed-by: ChangAo Chen <cca5507@qq.com>
Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de>
Reviewed-by: Chao Li <li.evan.chao@gmail.com> (earlier version)
Discussion: https://postgr.es/m/CANWCAZYzx7a7E9AY16Jt_U3+GVKDADfgApZ-42SYNiig8dTnFA@mail.gmail.com
Commit dfd79e2d added a TODO comment to update this paragraph
when support for PASSING was added. Commit 6185c9737c added
PASSING but missed resolving this TODO. Fix by expanding the
paragraph with a reference to PASSING.
Author: Aditya Gollamudi <adigollamudi@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/20260117051406.sx6pss4ryirn2x4v@pgs
The URL for Ditaa linked to the old Sourceforge version, which is
too old for what we need; the fork over on Github is the correct
version to use for re-generating the SVG files for the docs. The
required Ditaa version is 0.11.0, as that is when SVG support was
added. Running the version found on Sourceforge produces the error
below:
$ ditaa -E -S --svg in.txt out.txt
Unrecognized option: --svg
usage: ditaa <INPFILE> [OUTFILE] [-A] [-b <BACKGROUND>] [-d] [-E] [-e
<ENCODING>] [-h] [--help] [-o] [-r] [-S] [-s <SCALE>] [-T] [-t
<TABS>] [-v] [-W]
While there, also mention that meson rules exist for building
images.
Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Paul A Jungwirth <pj@illuminatedcomputing.com>
Discussion: https://postgr.es/m/CAN55FZ2O-23xERF2NYcvv9DM_1c9T16y6mi3vyP=O1iuXS0ASA@mail.gmail.com
This adds an 'images' target to the meson build system in order to
be able to regenerate the images used in the docs.
Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reported-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/CAN55FZ0c0Tcjx9=e-YibWGHa1-xmdV63p=THH4YYznz+pYcfig@mail.gmail.com
The idea is to further encourage the use of these allocation routines
across the tree, as these offer stronger type safety guarantees than
pg_malloc() & co (type cast in the result, sizeof() embedded). This set
of changes is dedicated to the pg_dump code.
Similar work was done in 31d3847a37, as one example.
Author: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Aleksander Alekseev <aleksander@tigerdata.com>
Discussion: https://postgr.es/m/CAHut+PvpGPDLhkHAoxw_g3jdrYxA1m16a8uagbgH3TGWSKtXNQ@mail.gmail.com
Use BackgroundPsql's published API for automatically restarting
its timer for each query, rather than manually reaching into it
to achieve the same thing.
010_tab_completion.pl's logic for this predates the invention
of BackgroundPsql (and 664d75753 missed the opportunity to
make it cleaner). 030_pager.pl copied-and-pasted the code.
Author: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Andrew Dunstan <andrew@dunslane.net>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/1100715.1712265845@sss.pgh.pa.us
This log message was referring to conflicts, but it is about checksum
failures. The log message improved in this commit should never show up,
because pgstat_prepare_report_checksum_failure() should always be
called before pgstat_report_checksum_failures_in_db(), with a stats
entry already created in the pgstats shared hash table. The three
code paths able to report database-level checksum failures already
follow this requirement.
Oversight in b96d3c3897.
Author: Wang Peng <215722532@qq.com>
Discussion: https://postgr.es/m/tencent_9B6CD6D9D34AE28CDEADEC6188DB3BA1FE07@qq.com
Backpatch-through: 18
It's currently only used in the server, but it was placed in src/port
with the idea that it might be useful in client programs too. However,
it will currently fail to link if used in a client program, because
CHECK_FOR_INTERRUPTS() is not usable in client programs. Fix that by
wrapping it in "#ifndef FRONTEND".
Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de>
Discussion: https://www.postgresql.org/message-id/21cc7a48-99d9-4f69-9a3f-2c2de61ac8e5%40iki.fi
Backpatch-through: 18
The uses of these functions do not justify the level of
micro-optimization we've done and may even hurt performance in some
cases (e.g., due to using function pointers). This commit removes
all architecture-specific implementations of pg_popcount{32,64} and
converts the portable ones to inlined functions in pg_bitutils.h.
These inlined versions should produce the same code as before (but
inlined), so in theory this is a net gain for many machines. A
follow-up commit will replace the remaining loops over these
word-length popcount functions with calls to pg_popcount(), further
reducing the need for architecture-specific implementations.
Suggested-by: John Naylor <johncnaylorls@gmail.com>
Reviewed-by: John Naylor <johncnaylorls@gmail.com>
Reviewed-by: Greg Burd <greg@burd.me>
Discussion: https://postgr.es/m/CANWCAZY7R%2Biy%2Br9YM_sySNydHzNqUirx1xk0tB3ej5HO62GdgQ%40mail.gmail.com
Over the past few releases, we've added a huge amount of complexity
to our popcount implementations. Commits fbe327e5b4, 79e232ca01,
8c6653516c, and 25dc485074 did some preliminary refactoring, but
many opportunities remain. In particular, if we disclaim interest
in micro-optimizing this code for 32-bit builds and in unnecessary
alignment checks on x86-64, we can remove a decent chunk of code.
I cannot find public discussion or benchmarks for the code this
commit removes, but it seems unlikely that this change will
noticeably impact performance on affected systems.
Suggested-by: John Naylor <johncnaylorls@gmail.com>
Reviewed-by: John Naylor <johncnaylorls@gmail.com>
Discussion: https://postgr.es/m/CANWCAZY7R%2Biy%2Br9YM_sySNydHzNqUirx1xk0tB3ej5HO62GdgQ%40mail.gmail.com
This adds a new ON CONFLICT action DO SELECT [FOR UPDATE/SHARE], which
returns the pre-existing rows when conflicts are detected. The INSERT
statement must have a RETURNING clause when DO SELECT is specified.
The optional FOR UPDATE/SHARE clause allows the rows to be locked
before they are returned. As with a DO UPDATE conflict action, an
optional WHERE clause may be used to prevent rows from being selected
for return (but as with a DO UPDATE action, rows filtered out by the
WHERE clause are still locked).
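A sketch of the new syntax (hypothetical table and columns):
-- return the existing row on conflict, locking it before returning;
-- an optional WHERE clause could filter (but still lock) the rows
INSERT INTO users (id, name) VALUES (1, 'alice')
ON CONFLICT (id) DO SELECT FOR UPDATE
RETURNING id, name;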
Bumps catversion as stored rules change.
Author: Andreas Karlsson <andreas@proxel.se>
Author: Marko Tiikkaja <marko@joh.to>
Author: Viktor Holmberg <v@viktorh.net>
Reviewed-by: Joel Jacobson <joel@compiler.org>
Reviewed-by: Kirill Reshke <reshkekirill@gmail.com>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Reviewed-by: Jian He <jian.universality@gmail.com>
Discussion: https://postgr.es/m/d631b406-13b7-433e-8c0b-c6040c4b4663@Spark
Discussion: https://postgr.es/m/5fca222d-62ae-4a2f-9fcb-0eca56277094@Spark
Discussion: https://postgr.es/m/2b5db2e6-8ece-44d0-9890-f256fdca9f7e@proxel.se
Discussion: https://postgr.es/m/CAL9smLCdV-v3KgOJX3mU19FYK82N7yzqJj2HAwWX70E=P98kgQ@mail.gmail.com
Following e68b6adad9, the reason for skipping slot synchronization is
stored as a slot property. This commit removes redundant function
parameters that previously tracked this state, instead relying directly on
the slot property.
Additionally, this change centralizes the logic for skipping
synchronization when required WAL has not yet been received or flushed. By
consolidating this check, we reduce code duplication and the risk of
inconsistent state updates across different code paths.
In passing, add an assertion to ensure a slot is marked as temporary if a
consistent point has not been reached during synchronization.
Author: Zhijie Hou <houzj.fnst@fujitsu.com>
Reviewed-by: Shveta Malik <shveta.malik@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/TY4PR01MB16907DD16098BE3B20486D4569463A@TY4PR01MB16907.jpnprd01.prod.outlook.com
Discussion: https://postgr.es/m/CAFPTHDZAA+gWDntpa5ucqKKba41=tXmoXqN3q4rpjO9cdxgQrw@mail.gmail.com
The only place that used p_is_insert was transformAssignedExpr(),
which used it to distinguish INSERT from UPDATE when handling
indirection on assignment target columns -- see commit c1ca3a19df.
However, this information is already available to
transformAssignedExpr() via its exprKind parameter, which is always
either EXPR_KIND_INSERT_TARGET or EXPR_KIND_UPDATE_TARGET.
As noted in the commit message for c1ca3a19df, this use of
p_is_insert isn't particularly pretty, so have transformAssignedExpr()
use the exprKind parameter instead. This then allows p_is_insert to be
removed entirely, which simplifies state management in a few other
places across the parser.
Author: Viktor Holmberg <v@viktorh.net>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Discussion: https://postgr.es/m/badc3b4c-da73-4000-b8d3-638a6f53a769@Spark
For a LEFT JOIN, if any var from the right-hand side (RHS) is forced
to null by upper-level quals but is known to be non-null for any
matching row, the only way the upper quals can be satisfied is if the
join fails to match, producing a null-extended row. Thus, we can
treat this left join as an anti-join.
Previously, this transformation was limited to cases where the join's
own quals were strict for the var forced to null by upper qual levels.
This patch extends the logic to check table constraints, leveraging
the NOT NULL attribute information already available thanks to the
infrastructure introduced by e2debb643. If a forced-null var belongs
to the RHS and is defined as NOT NULL in the schema (and is not
nullable due to lower-level outer joins), we know that the left join
can be reduced to an anti-join.
Note that to ensure the var is not nullable by any lower-level outer
joins within the current subtree, we collect the relids of base rels
that are nullable within each subtree during the first pass of the
reduce-outer-joins process. This allows us to verify in the second
pass that a NOT NULL var is indeed safe to treat as non-nullable.
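For example (hypothetical tables, with b.aid declared NOT NULL and b
not nullable by any lower-level outer join):
-- the join qual is not strict for b.aid, so the old logic could not
-- reduce this; given the schema's NOT NULL on b.aid, the WHERE clause
-- can only pass for null-extended rows, making this an anti-join
SELECT a.* FROM a LEFT JOIN b ON a.id = b.bid WHERE b.aid IS NULL;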
Based on a proposal by Nicolas Adenis-Lamarre, but this is not the
original patch.
Suggested-by: Nicolas Adenis-Lamarre <nicolas.adenis.lamarre@gmail.com>
Author: Tender Wang <tndrwang@gmail.com>
Co-authored-by: Richard Guo <guofenglinux@gmail.com>
Discussion: https://postgr.es/m/CACPGbctKMDP50PpRH09in+oWbHtZdahWSroRstLPOoSDKwoFsw@mail.gmail.com
If the variable's value is null, exec_stmt_return() missed filling
in estate->rettype. This is a pretty old bug, but we'd managed not
to notice because that value isn't consulted for a null result ...
unless we have to cast it to a domain. That case led to a failure
with "cache lookup failed for type 0".
The correct way to assign the data type is known by exec_eval_datum.
While we could copy-and-paste that logic, it seems like a better
idea to just invoke exec_eval_datum, as the ROW case already does.
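A minimal sketch of the failure (hypothetical names, along the lines
of the report):
CREATE DOMAIN d_int AS int;
CREATE FUNCTION f() RETURNS d_int LANGUAGE plpgsql AS $$
DECLARE v int;   -- v is NULL, and estate->rettype was left unset
BEGIN
  RETURN v;      -- formerly: cache lookup failed for type 0
END $$;
SELECT f();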
Reported-by: Pavel Stehule <pavel.stehule@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAFj8pRBT_ahexDf-zT-cyH8bMR_qcySKM8D5nv5MvTWPiatYGA@mail.gmail.com
Backpatch-through: 14
The pg_stat_activity view shows information for aux processes, but the
pg_stat_get_backend_wait_event() and
pg_stat_get_backend_wait_event_type() functions did not. To fix, call
AuxiliaryPidGetProc(pid) if BackendPidGetProc(pid) returns NULL, like
we do in pg_stat_get_activity().
In version 17 and above, it's a little silly to use those functions
when we already have the ProcNumber at hand, but it was necessary
before v17 because the backend ID was different from ProcNumber. I
have other plans for wait_event_info on master, so it doesn't seem
worth applying a different fix on different versions now.
Reviewed-by: Sami Imseih <samimseih@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://www.postgresql.org/message-id/c0320e04-6e85-4c49-80c5-27cfb3a58108@iki.fi
Backpatch-through: 14
This commit adds a new parameter called
password_expiration_warning_threshold that controls when the server
begins emitting imminent-password-expiration warnings upon
successful password authentication. By default, this parameter is
set to 7 days, but this functionality can be disabled by setting it
to 0. This patch also introduces a new "connection warning"
infrastructure that can be reused elsewhere. For example, we may
want to warn about the use of MD5 passwords for a couple of
releases before removing MD5 password support.
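For example (a sketch, assuming the parameter accepts the usual GUC
time units and can be set via ALTER SYSTEM):
-- warn at authentication starting 14 days before password expiry;
-- setting the threshold to 0 disables the warning entirely
ALTER SYSTEM SET password_expiration_warning_threshold = '14d';
SELECT pg_reload_conf();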
Author: Gilles Darold <gilles@darold.net>
Co-authored-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Japin Li <japinli@hotmail.com>
Reviewed-by: songjinzhou <tsinghualucky912@foxmail.com>
Reviewed-by: liu xiaohui <liuxh.zj.cn@gmail.com>
Reviewed-by: Yuefei Shi <shiyuefei1004@gmail.com>
Reviewed-by: Steven Niu <niushiji@gmail.com>
Reviewed-by: Soumya S Murali <soumyamurali.work@gmail.com>
Reviewed-by: Euler Taveira <euler@eulerto.com>
Reviewed-by: Zsolt Parragi <zsolt.parragi@percona.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Greg Sabino Mullane <htamfids@gmail.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/129bcfbf-47a6-e58a-190a-62fc21a17d03%40migops.com
The buildfarm occasionally shows a variant row order in the output
of this UPDATE ... RETURNING, implying that the preceding INSERT
dropped one of the rows into some free space within the table rather
than appending them all at the end. It's not entirely clear why that
happens some times and not other times, but we have established that
it's affected by concurrent activity in other databases of the
cluster. In any case, the behavior is not wrong; the test is at fault
for presuming that a seqscan will give deterministic row ordering.
Add an ORDER BY atop the update to stop the buildfarm noise.
The buildfarm seems to have shown this only in v18 and master
branches, but just in case the cause is older, back-patch to
all supported branches.
Discussion: https://postgr.es/m/3866274.1770743162@sss.pgh.pa.us
Backpatch-through: 14
* Remove an unused variable
* Use "default log level" consistently (instead of "generic")
* Keep the process types in alphabetical order (missed one place in the
SGML docs)
* Since the type of log_min_messages was changed from enum to string,
it is a good idea to add single quotes when printing it out. Otherwise
it fails if the user copies and pastes from the SHOW output to SET,
except in the simplest case (see the sketch after this list). Using
single quotes reduces confusion.
* Use lowercase string for the burned-in default value, to keep the same
output as previous versions.
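A sketch of the copy-and-paste hazard (the per-process syntax shown
here is illustrative):
SET log_min_messages = warning;                        -- fine unquoted
SET log_min_messages = checkpointer:debug1, warning;   -- syntax error
SET log_min_messages = 'checkpointer:debug1, warning'; -- works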
Author: Euler Taveira <euler@eulerto.com>
Author: Man Zeng <zengman@halodbtech.com>
Author: Noriyoshi Shinoda <noriyoshi.shinoda@hpe.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/202602091250.genyflm2d5dw@alvherre.pgsql
It protects the freeProcs and some other fields in ProcGlobal, so
let's move it there. It's good for cache locality to have it next to
the thing it protects, and just makes more sense anyway. I believe it
was allocated as a separate shared memory area just for historical
reasons.
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Discussion: https://www.postgresql.org/message-id/b78719db-0c54-409f-b185-b0d59261143f@iki.fi
On the INSERT page, mention that SELECT privileges are also required
for any columns mentioned in the arbiter clause, including those
referred to by the constraint, and clarify that this applies to all
forms of ON CONFLICT, not just ON CONFLICT DO UPDATE.
Author: Dean Rasheed <dean.a.rasheed@gmail.com>
Reviewed-by: Viktor Holmberg <v@viktorh.net>
Discussion: https://postgr.es/m/CAEZATCXGwMQ+x00YY9XYG46T0kCajH=21QaYL9Xatz0dLKii+g@mail.gmail.com
Backpatch-through: 14
On the CREATE POLICY page, the description of per-command policies
stated that SELECT policies are applied when an INSERT has an ON
CONFLICT DO NOTHING clause. However, that is only the case if it
includes an arbiter clause, so clarify that.
While at it, also clarify the comment in the regression tests that
cover this.
Author: Dean Rasheed <dean.a.rasheed@gmail.com>
Reviewed-by: Viktor Holmberg <v@viktorh.net>
Discussion: https://postgr.es/m/CAEZATCXGwMQ+x00YY9XYG46T0kCajH=21QaYL9Xatz0dLKii+g@mail.gmail.com
Backpatch-through: 14
An extension (or core code) might want to reconstruct the planner's
decisions about whether and where to perform partitionwise joins from
the final plan. To do so, it must be possible to find all of the RTIs
of partitioned tables appearing in the plan. But when an AppendPath
or MergeAppendPath pulls up child paths from a subordinate AppendPath
or MergeAppendPath, the RTIs of the subordinate path do not appear
in the final plan, making this kind of reconstruction impossible.
To avoid this, propagate the RTI sets that would have been present
in the 'apprelids' field of the subordinate Append or MergeAppend
nodes that would have been created into the surviving Append or
MergeAppend node, using a new 'child_append_relid_sets' field for
that purpose. The value of this field is a list of Bitmapsets,
because each relation whose append-list was pulled up had its own
set of RTIs: just one, if it was a partitionwise scan, or more than
one, if it was a partitionwise join. Since our goal is to see where
partitionwise joins were done, it is essential to avoid losing the
information about how the RTIs were grouped in the pulled-up
relations.
This commit also updates pg_overexplain so that EXPLAIN (RANGE_TABLE)
will display the saved RTI sets.
Co-authored-by: Robert Haas <rhaas@postgresql.org>
Co-authored-by: Lukas Fittl <lukas@fittl.com>
Reviewed-by: Lukas Fittl <lukas@fittl.com>
Reviewed-by: Jakub Wartak <jakub.wartak@enterprisedb.com>
Reviewed-by: Greg Burd <greg@burd.me>
Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com>
Reviewed-by: Amit Langote <amitlangote09@gmail.com>
Reviewed-by: Haibo Yan <tristan.yim@gmail.com>
Reviewed-by: Alexandra Wang <alexandra.wang.oss@gmail.com>
Discussion: http://postgr.es/m/CA+TgmoZ-Jh1T6QyWoCODMVQdhTUPYkaZjWztzP1En4=ZHoKPzw@mail.gmail.com
This commit changes the definition of varlena to a typedef, so that it
becomes possible to remove "struct" markers from various declarations in
the code base. Historically, "struct" markers are not the project style
for variable declarations, so this update simplifies the code and makes
it more consistent across the board.
This change has an impact on the following structures, simplifying
declarations using them:
- varlena
- varatt_indirect
- varatt_external
This cleanup came up in a different patch set that played with
TOAST and varatt.h, and is independently worth doing on its own.
Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de>
Reviewed-by: Andreas Karlsson <andreas@proxel.se>
Reviewed-by: Shinya Kato <shinya11.kato@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/aW8xvVbovdhyI4yo@paquier.xyz
An extension (or core code) might want to reconstruct the planner's
choice of join order from the final plan. To do so, it must be possible
to find all of the RTIs that were part of the join problem in that plan.
Commit adbad833f3, together with the
earlier work in 8c49a484e8, is enough to
let us match up RTIs we see in the final plan with RTIs that we see
during the planning cycle, but we still have a problem if the planner
decides to drop some RTIs out of the final plan altogether.
To fix that, when setrefs.c removes a SubqueryScan, single-child Append,
or single-child MergeAppend from the final Plan tree, record the type of
the removed node and the RTIs that the removed node would have scanned
in the final plan tree. It would be natural to record this information
on the child of the removed plan node, but that would require adding an
additional pointer field to type Plan, which seems undesirable. So,
instead, store the information in a separate list that the executor need
never consult, and use the plan_node_id to identify the plan node with
which the removed node is logically associated.
Also, update pg_overexplain to display these details.
Reviewed-by: Lukas Fittl <lukas@fittl.com>
Reviewed-by: Jakub Wartak <jakub.wartak@enterprisedb.com>
Reviewed-by: Greg Burd <greg@burd.me>
Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com>
Reviewed-by: Amit Langote <amitlangote09@gmail.com>
Reviewed-by: Haibo Yan <tristan.yim@gmail.com>
Reviewed-by: Alexandra Wang <alexandra.wang.oss@gmail.com>
Discussion: http://postgr.es/m/CA+TgmoZ-Jh1T6QyWoCODMVQdhTUPYkaZjWztzP1En4=ZHoKPzw@mail.gmail.com
Suppose that we're currently planning a query and, when that same
query was previously planned and executed, we learned something about
how a certain table within that query should be planned. We want to
take note when that same table is being planned during the current
planning cycle, but this is difficult to do, because the RTI of the
table from the previous plan won't necessarily be equal to the RTI
that we see during the current planning cycle. This is because each
subquery has a separate range table during planning, but these are
flattened into one range table when constructing the final plan,
changing RTIs.
Commit 8c49a484e8 allows us to match up
subqueries seen in the previous planning cycles with the subqueries
currently being planned just by comparing textual names, but that's
not quite enough to let us deduce anything about individual tables,
because we don't know where each subquery's range table appears in
the final, flattened range table.
To fix that, store a list of SubPlanRTInfo objects in the final
planned statement, each including the name of the subplan, the offset
at which it begins in the flattened range table, and whether or not
it was a dummy subplan -- if it was, some RTIs may have been dropped
from the final range table, but also there's no need to control how
a dummy subquery gets planned. The toplevel subquery has no name and
always begins at rtoffset 0, so we make no entry for it.
This commit teaches pg_overexplain's RANGE_TABLE option to make use
of this new data to display the subquery name for each range table
entry.
Reviewed-by: Lukas Fittl <lukas@fittl.com>
Reviewed-by: Jakub Wartak <jakub.wartak@enterprisedb.com>
Reviewed-by: Greg Burd <greg@burd.me>
Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com>
Reviewed-by: Amit Langote <amitlangote09@gmail.com>
Reviewed-by: Haibo Yan <tristan.yim@gmail.com>
Reviewed-by: Alexandra Wang <alexandra.wang.oss@gmail.com>
Discussion: http://postgr.es/m/CA+TgmoZ-Jh1T6QyWoCODMVQdhTUPYkaZjWztzP1En4=ZHoKPzw@mail.gmail.com
Commit 94f3ad3961 failed to do this
because I couldn't think of a use for the information, but this has
proven to be short-sighted. Best to fix it before this code is
officially released.
Now, the only argument to standard_planner that isn't passed to
planner_setup_hook is boundParams, but that is accessible via
glob->boundParams, and so doesn't need to be passed separately.
Discussion: https://www.postgresql.org/message-id/CA+TgmoYS4ZCVAF2jTce=bMP0Oq_db_srocR4cZyO0OBp9oUoGg@mail.gmail.com
Commit 4020b370f2 supposed that it
would be a good idea to handle testing PGS_CONSIDER_NONPARTIAL within
cost_material() to save callers the trouble, but that turns out not to be
a very good idea. One concern is that it makes cost_material() dependent
on the caller having initialized certain fields in the MaterialPath,
which is a bit awkward for materialize_finished_plan, which wants to use
a dummy path.
Another problem is that it can result in generating materialized nested
loops where the Materialize node is disabled, contrary to the intention
of joinpath.c's logic in match_unsorted_outer() and
consider_parallel_nestloop(), which aims to consider such paths only
when they would not need to be disabled. In the previous coding, it was
possible for the pgs_mask on the joinrel to have PGS_CONSIDER_NONPARTIAL
set, while the inner rel had the same bit clear. In that case, we'd
generate and then disable a Materialize path.
That seems wrong, so instead, pull up the logic to test the
PGS_CONSIDER_NONPARTIAL bit into joinpath.c, restoring the historical
behavior that either we don't generate a given materialized nested loop
in the first place, or we don't disable it.
Discussion: http://postgr.es/m/CA+TgmoawzvCoZAwFS85tE5+c8vBkqgcS8ZstQ_ohjXQ9wGT9sw@mail.gmail.com
Discussion: http://postgr.es/m/CA+TgmoYS4ZCVAF2jTce=bMP0Oq_db_srocR4cZyO0OBp9oUoGg@mail.gmail.com
Two changes here:
1. Introduce a separate RECOVERY_CONFLICT_BUFFERPIN_DEADLOCK flag to
indicate a suspected deadlock that involves a buffer pin. Previously
the startup process used the same flag for a deadlock involving just
regular locks, and to check for deadlocks involving the buffer
pin. The cases are handled separately in the startup process, but the
receiving backend had to deduce which one it was based on
HoldingBufferPinThatDelaysRecovery(). With a separate flag, the
receiver doesn't need to guess.
2. Rewrite the ProcessRecoveryConflictInterrupt() function to not rely
on fall-through between the cases of its switch statement. That was
difficult to read.
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://www.postgresql.org/message-id/4cc13ba1-4248-4884-b6ba-4805349e7f39@iki.fi
The log messages used in this file applied too much quoting logic:
- No need for quote_identifier(), which it is fine not to use in the
context of a log entry.
- The usual project style is to group the namespace and object together
in a quoted string when mentioned in a log message. This code quoted
the namespace name and the extended statistics object name separately,
which was confusing.
Reported-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/20260210.143752.1113524465620875233.horikyota.ntt@gmail.com
This commit adds three attributes to the system view pg_stats_ext_exprs,
whose data can exist when an expression involves a range type:
range_length_histogram
range_empty_frac
range_bounds_histogram
These statistics fields have existed since 918eee0c49, and later became
viewable in pg_stats in bc3c8db8ae. This puts the definition of
pg_stats_ext_exprs on par with pg_stats.
This issue showed up during the discussion about restoring extended
statistics for expressions, which requires being able to query the
stats data to restore from the catalogs. Having access to this data
is useful on its own, without the restore part.
Some documentation and some tests are added, written by me. Corey has
authored the part in system_views.sql.
Bump catalog version.
Author: Corey Huinker <corey.huinker@gmail.com>
Co-authored-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/aYmCUx9VvrKiZQLL@paquier.xyz
In the spirit of 8d19d0e13, this patch teaches the planner about the
principle that NullTest with !argisrow is fully equivalent to SQL's IS
[NOT] DISTINCT FROM NULL.
The parser already performs this transformation for literal NULLs.
However, a DistinctExpr expression with one input evaluating to NULL
during planning (e.g., via const-folding of "1 + NULL" or parameter
substitution in custom plans) currently remains as a DistinctExpr
node.
This patch closes the gap for const-folded NULLs. It specifically
targets the case where one input is a constant NULL and the other is a
nullable non-constant expression. (If the other input were otherwise,
the DistinctExpr node would have already been simplified to a constant
TRUE or FALSE.)
This transformation can be beneficial because NullTest is much more
amenable to optimization than DistinctExpr, since the planner knows a
good deal about the former and next to nothing about the latter.
Author: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Discussion: https://postgr.es/m/CAMbWs49BMAOWvkdSHxpUDnniqJcEcGq3_8dd_5wTR4xrQY8urA@mail.gmail.com
The BooleanTest construct (IS [NOT] TRUE/FALSE/UNKNOWN) treats a NULL
input as the logical value "unknown". However, when the input is
proven to be non-nullable, this special handling becomes redundant.
In such cases, the construct can be simplified directly to a boolean
expression or a constant.
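For example (hypothetical table, with flag declared boolean NOT NULL):
-- with flag provably non-null, "flag IS TRUE" simplifies to just
-- "flag", and "flag IS UNKNOWN" folds to constant false
SELECT * FROM t WHERE flag IS TRUE;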
Author: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Discussion: https://postgr.es/m/CAMbWs49BMAOWvkdSHxpUDnniqJcEcGq3_8dd_5wTR4xrQY8urA@mail.gmail.com
The IS DISTINCT FROM construct compares values acting as though NULL
were a normal data value, rather than "unknown". Semantically, "x IS
DISTINCT FROM y" yields true if the values differ or if exactly one is
NULL, and false if they are equal or both NULL. Unlike ordinary
comparison operators, it never returns NULL.
Previously, the planner only simplified this construct if all inputs
were constants, folding it to a constant boolean result. This patch
extends the optimization to cases where inputs are non-constant but
proven to be non-nullable. Specifically, "x IS DISTINCT FROM NULL"
folds to constant TRUE if "x" is known to be non-nullable. For cases
where both inputs are guaranteed not to be NULL, the expression
becomes semantically equivalent to "x <> y", and the DistinctExpr is
converted into an inequality OpExpr.
This transformation provides several benefits. It converts the
comparison into a standard operator, allowing the use of partial
indexes and constraint exclusion. Furthermore, if the clause is
negated (i.e., "IS NOT DISTINCT FROM"), it simplifies to an equality
operator. This enables the planner to generate better plans using
index scans, merge joins, hash joins, and EC-based qual deduction.
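For example (hypothetical table, with x and y both declared NOT NULL):
-- "x IS DISTINCT FROM NULL" folds to constant TRUE, and the clause
-- below becomes "x = y", enabling index scans, merge/hash joins, and
-- EC-based qual deduction
SELECT * FROM t WHERE x IS NOT DISTINCT FROM y;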
Author: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Discussion: https://postgr.es/m/CAMbWs49BMAOWvkdSHxpUDnniqJcEcGq3_8dd_5wTR4xrQY8urA@mail.gmail.com
For binary upgrades from v16 or newer, pg_upgrade transfers the
files for pg_largeobject_metadata from the old cluster, as opposed
to using COPY or ordinary SQL commands to reconstruct its contents.
While this approach adds complexity, it can greatly reduce
pg_upgrade's runtime when there are many large objects.
Large objects with comments or security labels are one source of
complexity for this approach. During pg_upgrade, schema
restoration happens before files are transferred. Comments and
security labels are transferred in the former step, but the COMMENT
and SECURITY LABEL commands will fail if their corresponding large
objects do not exist. To deal with this, pg_upgrade first copies
only the rows of pg_largeobject_metadata that are needed to avoid
failures. Later, pg_upgrade overwrites those rows by replacing
pg_largeobject_metadata's files with its files in the old cluster.
Unfortunately, there's a subtle problem here. Simply put, there's
no guarantee that pg_upgrade will overwrite all of
pg_largeobject_metadata's files on the new cluster. For example,
the new cluster's version might more aggressively extend relations
or create visibility maps, and pg_upgrade's file transfer code is
not sophisticated enough to remove files that lack counterparts in
the old cluster. These extra files could cause problems
post-upgrade.
More fortunately, we can simultaneously fix the aforementioned
problem and further optimize binary upgrades for clusters with many
large objects. If we teach the COMMENT and SECURITY LABEL commands
to allow nonexistent large objects during binary upgrades,
pg_upgrade no longer needs to transfer pg_largeobject_metadata's
contents beforehand. This approach also allows us to remove the
associated dependency tracking from pg_dump, even for upgrades from
v12-v15 that use COPY to transfer pg_largeobject_metadata's
contents.
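For instance, during a binary upgrade a restored command along these
lines (hypothetical OID and comment) no longer requires the large
object's metadata row to exist yet:
-- the row arrives later, when pg_largeobject_metadata's files are
-- transferred from the old cluster
COMMENT ON LARGE OBJECT 16401 IS 'scanned contract';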
In addition to the above, this
commit modifies the query in getLOs() to only retrieve LOs with
comments or security labels for upgrades from v12 or newer. We
have long assumed that such usage is rare, so this should reduce
pg_upgrade's memory usage and runtime in many cases. We might also
be able to remove the "upgrades from v12 or newer" restriction on
the recent batch of optimizations by adding special handling for
pg_largeobject_metadata's hidden OID column on older versions
(since this catalog previously used the now-removed WITH OIDS
feature), but that is left as a future exercise.
Reported-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/3yd2ss6n7xywo6pmhd7jjh3bqwgvx35bflzgv3ag4cnzfkik7m%40hiyadppqxx6w
Clean up a few leftovers from when the deadlock checker was called
from a signal handler. We stopped doing that in commit 6753333f55, in
the year 2015.
- CheckDeadLock can return its result directly to the caller;
there's no need to use a global variable for that.
- Remove outdated comments that claimed that CheckDeadLock "signals
ProcSleep".
- It should be OK to ereport() from DeadLockCheck now. I considered
getting rid of InitDeadLockChecking() and moving the workspace
allocations into DeadLockCheck, but it's still good to avoid doing
the allocations while we're holding all the partition locks. So just
update the comment to give that as the reason we do the allocations
up front.
They are not and never have been used by any known code -- apparently we
just cargo-culted them in commit 37484ad2aa (or their ancestor macros
anyway, which begat these functions in commit 34694ec888). Allegedly
they're also potentially dangerous; users are better off going through
HeapTupleSetHintBits instead.
Author: Andy Fan <zhihuifan1213@163.com>
Discussion: https://postgr.es/m/87sejogt4g.fsf@163.com
While the preceding commit prevented such attachments from occurring
in future, this one aims to prevent further abuse of any already-
created operator that exposes _int_matchsel to the wrong data types.
(No other contrib module has a vulnerable selectivity estimator.)
We need only check that the Const we've found in the query is indeed
of the type we expect (query_int), but there's a difficulty: as an
extension type, query_int doesn't have a fixed OID that we could
hard-code into the estimator.
Therefore, the bulk of this patch consists of infrastructure to let
an extension function securely look up the OID of a datatype
belonging to the same extension. (Extension authors have requested
such functionality before, so we anticipate that this code will
have additional non-security uses, and may soon be extended to allow
looking up other kinds of SQL objects.)
This is done by first finding the extension that owns the calling
function (there can be only one), and then thumbing through the
objects owned by that extension to find a type that has the desired
name. This is relatively expensive, especially for large extensions,
so a simple cache is put in front of these lookups.
Reported-by: Daniel Firer as part of zeroday.cloud
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Noah Misch <noah@leadboat.com>
Security: CVE-2026-2004
Backpatch-through: 14
Selectivity estimators come in two flavors: those that make specific
assumptions about the data types they are working with, and those
that don't. Most of the built-in estimators are of the latter kind
and are meant to be safely attachable to any operator. If the
operator does not behave as the estimator expects, you might get a
poor estimate, but it won't crash.
However, estimators that do make datatype assumptions can malfunction
if they are attached to the wrong operator, since then the data they
get from pg_statistic may not be of the type they expect. This can
rise to the level of a security problem, even permitting arbitrary
code execution by a user who has the ability to create SQL objects.
To close this hole, establish a rule that built-in estimators are
required to protect themselves against being called on the wrong type
of data. It does not seem practical however to expect estimators in
extensions to reach a similar level of security, at least not in the
near term. Therefore, also establish a rule that superuser privilege
is required to attach a non-built-in estimator to an operator.
We expect that this restriction will have little negative impact on
extensions, since estimators generally have to be written in C and
thus superuser privilege is required to create them in the first
place.
This commit changes the privilege checks in CREATE/ALTER OPERATOR
to enforce the rule about superuser privilege, and fixes a couple
of built-in estimators that were making datatype assumptions without
sufficiently checking that they're valid.
Reported-by: Daniel Firer as part of zeroday.cloud
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Noah Misch <noah@leadboat.com>
Security: CVE-2026-2004
Backpatch-through: 14
These data types are represented like full-fledged arrays, but
functions that deal specifically with these types assume that the
array is 1-dimensional and contains no nulls. However, there are
cast pathways that allow general oid[] or int2[] arrays to be cast
to these types, allowing these expectations to be violated. This
can be exploited to cause server memory disclosure or SIGSEGV.
Fix by installing explicit checks in functions that accept these
types.
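For illustration, a sketch of one such pathway (per the cast pathways
mentioned above):
-- a general int2[] containing a NULL can reach int2vector via casts;
-- functions consuming int2vector/oidvector now verify a 1-D,
-- null-free array before using such values
SELECT array[1, NULL]::int2[]::int2vector;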
Reported-by: Altan Birler <altan.birler@tum.de>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Noah Misch <noah@leadboat.com>
Security: CVE-2026-2003
Backpatch-through: 14
pgp_sym_decrypt() and pgp_pub_decrypt() will raise such errors, while
bytea variants will not. The existing "dat3" test decrypted to non-UTF8
text, so switch that query to bytea.
The long-term intent is for type "text" to always be valid in the
database encoding. pgcrypto has long been known as a source of
exceptions to that intent, but a report about exploiting invalid values
of type "text" brought this module to the forefront. This particular
exception is straightforward to fix, with reasonable effect on user
queries. Back-patch to v14 (all supported versions).
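For example (hypothetical table and key):
-- pgp_sym_decrypt() now raises an error if the plaintext is not valid
-- in the database encoding; use the bytea variant for raw byte strings
SELECT pgp_sym_decrypt_bytea(msg, 'seCr3t') FROM vault;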
Reported-by: Paul Gerste (as part of zeroday.cloud)
Reported-by: Moritz Sanft (as part of zeroday.cloud)
Author: shihao zhong <zhong950419@gmail.com>
Reviewed-by: cary huang <hcary328@gmail.com>
Discussion: https://postgr.es/m/CAGRkXqRZyo0gLxPJqUsDqtWYBbgM14betsHiLRPj9mo2=z9VvA@mail.gmail.com
Backpatch-through: 14
Security: CVE-2026-2006