postgresql

mirror of https://github.com/postgres/postgres.git synced 2026-06-13 18:50:17 -04:00

Author	SHA1	Message	Date
Noah Misch	4399d10556	Test restartpoints in archive recovery. v14 commit `1f95181b44` and its v13 equivalent caused timing-dependent failures in archive recovery, at restartpoints. The symptom was "invalid magic number 0000 in log segment X, offset 0", "unexpected pageaddr X in log segment Y, offset 0" [X < Y], or an assertion failure. Commit `3635a0a35a` and predecessors back-patched v15 changes to fix that. This test reproduces the problem probabilistically, typically in less than 1000 iterations of the test. Hence, buildfarm and CI runs would have surfaced enough failures to get attention within a day. Reported-by: Arun Thirupathi <arunth@google.com> Discussion: https://postgr.es/m/20250306193013.36.nmisch@google.com Backpatch-through: 13	2025-04-20 08:28:53 -07:00
Noah Misch	d34b671a63	Avoid ERROR at ON COMMIT DELETE ROWS after relhassubclass=f. Commit `7102070329` fixed a similar bug, but it missed the case of database-wide ANALYZE ("use_own_xacts" mode). Commit `a07e03fd8f` changed consequences from silent discard of a pg_class stats (relpages et al.) update to ERROR "tuple to be updated was already modified". Losing a relpages update of an ON COMMIT DELETE ROWS table was negligible, but a COMMIT-time error isn't negligible. Back-patch to v13 (all supported versions). Reported-by: Richard Guo <guofenglinux@gmail.com Reported-by: Robins Tharakan <tharakan@gmail.com> Discussion: https://postgr.es/m/CAMbWs4-XwMKMKJ_GT=p3_-_=j9rQSEs1FbDFUnW9zHuKPsPNEQ@mail.gmail.com Backpatch-through: 13	2025-04-20 08:28:53 -07:00
Tom Lane	e276b58293	Fix parse_cte.c's failure to examine sub-WITHs in DML statements. makeDependencyGraphWalker thought that only SelectStmt nodes could contain a WithClause. Which was true in our original implementation of WITH, but astonishingly we missed updating this code when we added the ability to attach WITH to INSERT/UPDATE/DELETE (and later MERGE). Moreover, since it was coded to deliberately block recursion to a WithClause, even updating raw_expression_tree_walker didn't save it. The upshot of this was that we didn't see references to outer CTE names appearing within an inner WITH, and would neither complain about disallowed recursion nor account for such references when sorting CTEs into a usable order. The lack of complaints about this is perhaps not so surprising, because typical usage of WITH wouldn't hit either case. Still, it's pretty broken; failing to detect recursion here leads to assert failures or worse later on. Fix by factoring out the processing of sub-WITHs into a new function WalkInnerWith, and invoking that for all the statement types that can have WITH. Bug: #18878 Reported-by: Yu Liang <luy70@psu.edu> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/18878-a26fa5ab6be2f2cf@postgresql.org Backpatch-through: 13	2025-04-05 15:01:33 -04:00
Fujii Masao	a11a823aa8	Fix logical decoding test to correctly check slot removal on standby. The regression test for logical decoding verifies whether a logical slot is correctly dropped on a standby when its associated database is dropped. However, the test mistakenly retrieved slot information from the primary instead of the standby, causing incorrect behavior. This commit fixes the issue by ensuring the test correctly checks the slot on the standby. Back-patch to all supported versions. Author: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/1fdfd020-a509-403c-bd8f-a04664aba148@oss.nttdata.com Backpatch-through: 13	2025-04-04 13:34:30 +09:00
Fujii Masao	b29a183c67	Fix logical decoding regression tests to correctly check slot existence. The regression tests for logical decoding verify whether a logical slot exists or has been dropped. Previously, these tests attempted to retrieve "slot_name" from the result of slot(), but since "slot_name" was not included in the result, slot()->{'slot_name'} always returned undef, leading to incorrect behavior. This commit fixes the issue by checking the "plugin" field in the result of slot() instead, ensuring the tests properly verify slot existence. Back-patch to all supported versions. Author: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/OSCPR01MB149667EC4E738769CA80B7EA5F5AE2@OSCPR01MB14966.jpnprd01.prod.outlook.com Backpatch-through: 13	2025-04-04 13:12:17 +09:00
Tom Lane	dd34cbfce2	Need to do CommandCounterIncrement after StoreAttrMissingVal. Without this, an additional change to the same pg_attribute row within the same command will fail. This is possible at least with ALTER TABLE ADD COLUMN on a multiple-inheritance-pathway structure. (Another potential hazard is that immediately-following operations might not see the missingval.) Introduced by `95f650674`, which split the former coding that used a single pg_attribute update to change both atthasdef and atthasmissing/attmissingval into two updates, but missed that this should entail two CommandCounterIncrements as well. Like that fix, back-patch through v13. Reported-by: Alexander Lakhin <exclusion@gmail.com> Author: Tender Wang <tndrwang@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/025a3ffa-5eff-4a88-97fb-8f583b015965@gmail.com Backpatch-through: 13	2025-04-02 11:13:01 -04:00
Tom Lane	474aee3dfe	Fix ARRAY_SUBLINK and ARRAY[] for int2vector and oidvector input. If the given input_type yields valid results from both get_element_type and get_array_type, initArrayResultAny believed the former and treated the input as an array type. However this is inconsistent with what get_promoted_array_type does, leading to situations where the output of an ARRAY() subquery is labeled with the wrong type: it's labeled as oidvector[] but is really a 2-D array of OID. That at least results in strange output, and can result in crashes if further processing such as unnest() is applied. AFAIK this is only possible with the int2vector and oidvector types, which are special-cased to be treated mostly as true arrays even though they aren't quite. Fix by switching the logic to match get_promoted_array_type by testing get_array_type not get_element_type, and remove an Assert thereby made pointless. (We need not introduce a symmetrical check for get_element_type in the other if-branch, because initArrayResultArr will check it.) This restores the behavior that existed before `bac27394a` introduced initArrayResultAny: the output really is int2vector[] or oidvector[]. Comparable confusion exists when an input of an ARRAY[] construct is int2vector or oidvector: transformArrayExpr decides it's dealing with a multidimensional array constructor, and we end up with something that's a multidimensional OID array but is alleged to be of type oidvector. I have not found a crashing case here, but it's easy to demonstrate totally-wrong results. Adjust that code so that what you get is an oidvector[] instead, for consistency with ARRAY() subqueries. (This change also makes these types work like domains-over-arrays in this context, which seems correct.) Bug: #18840 Reported-by: yang lei <ylshiyu@126.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/18840-fbc9505f066e50d6@postgresql.org Backpatch-through: 13	2025-03-13 16:07:55 -04:00
Tom Lane	39af32f788	Build whole-row Vars the same way during parsing and planning. makeWholeRowVar() has different rules for constructing a whole-row Var depending on the kind of RTE it's representing. This turns out to be problematic because the rewriter and planner can convert view RTEs and set-returning-function RTEs into subquery RTEs; so a whole-row Var made during planning might look different from one made by the parser. In isolation this doesn't cause any problem, but if a query contains Vars made both ways for the same varno, there are cross-checks in the executor that will complain. This manifests for UPDATE, DELETE, and MERGE queries that use whole-row table references. To fix, we need makeWholeRowVar() to produce the same result from an inlined RTE as it would have for the original. For an inlined view, we can use RangeTblEntry.relid to detect that this had been a view RTE. For inlined SRFs, make a data structure definition change akin to commit `47bb9db75`, and say that we won't clear RangeTblEntry.functions until the end of planning. That allows makeWholeRowVar() to repeat what it would have done with the unmodified RTE. Reported-by: Duncan Sands <duncan.sands@deepbluecap.com> Reported-by: Dean Rasheed <dean.a.rasheed@gmail.com> Diagnosed-by: Tender Wang <tndrwang@gmail.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com> Discussion: https://postgr.es/m/3518c50a-ab18-482f-b916-a37263622501@deepbluecap.com Backpatch-through: 13	2025-03-12 11:47:19 -04:00
Álvaro Herrera	0f354e0805	Fix ALTER TABLE error message This bogus error message was introduced in 2013 by commit `f177cbfe67`, because of misunderstanding the processCASbits() API; at the time, no test cases were added that would be affected by this change. Only in `ca87c415e2` was one added (along with a couple of typos), with an XXX note that the error message was bogus. Fix the whole, add some test cases. Backpatch all the way back. Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Discussion: https://postgr.es/m/202503041822.aobpqke3igvb@alvherre.pgsql	2025-03-04 20:07:30 +01:00
Tom Lane	aac07b5625	Fix broken handling of domains in atthasmissing logic. If a domain type has a default, adding a column of that type (without any explicit DEFAULT clause) failed to install the domain's default value in existing rows, instead leaving the new column null. This is unexpected, and it used to work correctly before v11. The cause is confusion in the atthasmissing mechanism about which default value to install: we'd only consider installing an explicitly-specified default, and then we'd decide that no table rewrite is needed. To fix, take the responsibility for filling attmissingval out of StoreAttrDefault, and instead put it into ATExecAddColumn's existing logic that derives the correct value to fill the new column with. Also, centralize the logic that determines the need for default-related table rewriting there, instead of spreading it over four or five places. In the back branches, we'll leave the attmissingval-filling code in StoreAttrDefault even though it's now dead, for fear that some extension may be depending on that functionality to exist there. A separate HEAD-only patch will clean up the now-useless code. Reported-by: jian he <jian.universality@gmail.com> Author: jian he <jian.universality@gmail.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CACJufxHFssPvkP1we7WMhPD_1kwgbG52o=kQgL+TnVoX5LOyCQ@mail.gmail.com Backpatch-through: 13	2025-03-03 12:43:29 -05:00
Michael Paquier	bc01859540	test_escape: Fix output of --help The short option name -f was not listed, only its long option name --force-unsupported. Author: Japin Li Discussion: https://postgr.es/m/ME0P300MB04452BD1FB1B277D4C1C20B9B6C52@ME0P300MB0445.AUSP300.PROD.OUTLOOK.COM Backpatch-through: 13	2025-02-20 09:31:08 +09:00
Andres Freund	bb2bf22761	tests: BackgroundPsql: Fix potential for lost errors on windows This addresses various corner cases in BackgroundPsql: - On windows stdout and stderr may arrive out of order, leading to errors not being reported, or attributed to the wrong statement. To fix, emit the "query-separation banner" on both stdout and stderr and wait for both. - Very occasionally the "query-separation banner" would not get removed, because we waited until the banner arrived, but then replaced the banner plus newline. To fix, wait for banner and newline. - For interactive psql replacing $banner\n is not sufficient, interactive psql outputs \r\n. - For interactive psql, where commands are echoed to stdout, the \echo command, rather than its output, would be matched. This would sometimes lead to output from the prior query, or wait_connect(), being returned in the next command. This also affected wait_connect(), leading to sometimes sending queries to psql before the connection actually was established. While debugging these issues I also found that it's hard to know whether a query separation banner was attributed to the right query. Make that easier by counting the queries each BackgroundPsql instance has emitted and include the number in the banner. Also emit psql stdout/stderr in query() and wait_connect() as Test::More notes, without that it's rather hard to debug some issues in CI and buildfarm. As this can cause issues not just to-be-added tests, but also existing ones, backpatch the fix to all supported versions. Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Noah Misch <noah@leadboat.com> Discussion: https://postgr.es/m/wmovm6xcbwh7twdtymxuboaoarbvwj2haasd3sikzlb3dkgz76@n45rzycluzft Backpatch-through: 13	2025-02-19 10:46:49 -05:00
Andres Freund	6af51bf05a	backport: Extend background_psql() to be able to start asynchronously This is a backport of `ba08edb065`. Originally it was only applied to master, but I (Andres) am planning to fix a few bugs in BackgroundPsql, which would be somewhat harder with the behavioural differences across branches. It's also generally good for test infrastructure to behave similarly across branches, to avoid pain during backpatching. Discussion: https://postgr.es/m/ilcctzb5ju2gulvnadjmhgatnkxsdpac652byb2u3d3wqziyvx@fbuqcglker46 Michael's original commit message: This commit extends the constructor routine of BackgroundPsql.pm with a new "wait" parameter. If set to 0, the routine returns without waiting for psql to start, ready to consume input. background_psql() in Cluster.pm gains the same "wait" parameter. The default behavior is still to wait for psql to start. It becomes now possible to not wait, giving to TAP scripts the possibility to perform actions between a BackgroundPsql startup and its wait_connect() call. Author: Jacob Champion Discussion: https://postgr.es/m/CAOYmi+=60deN20WDyCoHCiecgivJxr=98s7s7-C8SkXwrCfHXg@mail.gmail.com	2025-02-19 10:11:35 -05:00
Andres Freund	3c562b58c2	backport: Improve handling of empty query results in BackgroundPsql This is a backport of `70291a3c66`. Originally it was only applied to master, but I (Andres) am planning to fix a few bugs in BackgroundPsql that are harder to fix in the backbranches with the old behavior. It's also generally good for test infrastructure to behave similarly across branches, to avoid pain during backpatching. `70291a3c66` changes the behavior in some cases, but after discussing it, we are ok with that, it seems unlikely that there are out-of-core tests relying on the prior behavior. Discussion: https://postgr.es/m/ilcctzb5ju2gulvnadjmhgatnkxsdpac652byb2u3d3wqziyvx@fbuqcglker46 Michael's original commit message: A newline is not added at the end of an empty query result, causing the banner of the hardcoded \echo to not be discarded. This would reflect on scripts that expect an empty result by showing the "QUERY_SEPARATOR" in the output returned back to the caller, which was confusing. This commit changes BackgroundPsql::query() so as empty results are able to work correctly, making the first newline before the banner optional, bringing more flexibility. Note that this change affects 037_invalid_database.pl, where three queries generated an empty result, with the script relying on the data from the hardcoded banner to exist in the expected output. These queries are changed to use query_safe(), leading to a simpler script. The author has also proposed a test in a different patch where empty results would exist when using BackgroundPsql. Author: Jacob Champion Reviewed-by: Andrew Dunstan, Michael Paquier Discussion: https://postgr.es/m/CAOYmi+=60deN20WDyCoHCiecgivJxr=98s7s7-C8SkXwrCfHXg@mail.gmail.com	2025-02-19 09:52:32 -05:00
Michael Paquier	e9c95c6d36	test_escape: Fix handling of short options in getopt_long() This addresses two errors in the module, based on the set of options supported: - '-c', for --conninfo, was not listed. - '-f', for --force-unsupported, was not listed. While on it, these are now listed in an alphabetical order. Author: Japin Li Discussion: https://postgr.es/m/ME0P300MB04451FB20CE0346A59C25CADB6FA2@ME0P300MB0445.AUSP300.PROD.OUTLOOK.COM Backpatch-through: 13	2025-02-19 09:46:00 +09:00
Andres Freund	1f7a053245	Fix PQescapeLiteral()/PQescapeIdentifier() length handling In `5dc1e42b4f` I fixed bugs in various escape functions, unfortunately as part of that I introduced a new bug in PQescapeLiteral()/PQescapeIdentifier(). The bug is that I made PQescapeInternal() just use strlen(), rather than taking the specified input length into account. That's bad, because it can lead to including input that wasn't intended to be included (in case len is shorter than null termination of the string) and because it can lead to reading invalid memory if the input string is not null terminated. Expand test_escape to this kind of bug: a) for escape functions with length support, append data that should not be escaped and check that it is not b) add valgrind requests to detect access of bytes that should not be touched Author: Tom Lane <tgl@sss.pgh.pa.us> Author: Andres Freund <andres@anarazel.de Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Noah Misch <noah@leadboat.com> Discussion: https://postgr.es/m/Z64jD3u46gObCo1p@pryzbyj2023 Backpatch: 13	2025-02-14 18:09:27 -05:00
Michael Paquier	5209058240	Fix MakeTransitionCaptureState() to return a consistent result When an UPDATE trigger referencing a new table and a DELETE trigger referencing an old table are both present, MakeTransitionCaptureState() returns an inconsistent result for UPDATE commands in its set of flags and tuplestores holding the TransitionCaptureState for transition tables. As proved by the test added here, this issue causes a crash in v14 and earlier versions (down to 11, actually, older versions do not support triggers on partitioned tables) during cross-partition updates on a partitioned table. v15 and newer versions are safe thanks to `7103ebb7aa`. This commit fixes the function so that it returns a consistent state by using portions of the changes made in commit `7103ebb7aa` for v13 and v14. v15 and newer versions are slightly tweaked to match with the older versions, mainly for consistency across branches. Author: Kyotaro Horiguchi Discussion: https://postgr.es/m/20250207.150238.968446820828052276.horikyota.ntt@gmail.com Backpatch-through: 13	2025-02-13 16:31:12 +09:00
Andres Freund	7beb2af5e6	Fix type in test_escape test On machines where char is unsigned this could lead to option parsing looping endlessly. It's also too narrow a type on other hardware. Found via Tom Lane's monitoring of the buildfarm. Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Security: CVE-2025-1094 Backpatch-through: 13	2025-02-10 12:13:02 -05:00
Andres Freund	4ea3f5ef35	Add test of various escape functions As highlighted by the prior commit, writing correct escape functions is less trivial than one might hope. This test module tries to verify that different escaping functions behave reasonably. It e.g. tests: - Invalidly encoded input to an escape function leads to invalidly encoded output - Trailing incomplete multi-byte characters are handled sensibly - Escaped strings are parsed as single statement by psql's parser (which derives from the backend parser) There are further tests that would be good to add. But even in the current state it was rather useful for writing the fix in the prior commit. Reviewed-by: Noah Misch <noah@leadboat.com> Backpatch-through: 13 Security: CVE-2025-1094	2025-02-10 10:03:40 -05:00
Andres Freund	db3eb0e825	Add pg_encoding_set_invalid() There are cases where we cannot / do not want to error out for invalidly encoded input. In such cases it can be useful to replace e.g. an incomplete multi-byte characters with bytes that will trigger an error when getting validated as part of a larger string. Unfortunately, until now, for some encoding no such sequence existed. For those encodings this commit removes one previously accepted input combination - we consider that to be ok, as the chosen bytes are outside of the valid ranges for the encodings, we just previously failed to detect that. As we cannot add a new field to pg_wchar_table without breaking ABI, this is implemented "in-line" in the newly added function. Author: Noah Misch <noah@leadboat.com> Reviewed-by: Andres Freund <andres@anarazel.de> Backpatch-through: 13 Security: CVE-2025-1094	2025-02-10 10:03:40 -05:00
Tom Lane	dee1e86d0e	Avoid using timezone Asia/Manila in regression tests. The freshly-released 2025a version of tzdata has a refined estimate for the longitude of Manila, changing their value for LMT in pre-standardized-timezone days. This changes the output of one of our test cases. Since we need to be able to run with system tzdata files that may or may not contain this update, we'd better stop making that specific test. I switched it to use Asia/Singapore, which has a roughly similar UTC offset. That LMT value hasn't changed in tzdb since 2003, so we can hope that it's well established. I also noticed that this set of make_timestamptz tests only exercises zones east of Greenwich, which seems rather sad, and was not the original intent AFAICS. (We've already changed these tests once to stabilize their results across tzdata updates, cf 66b737cd9; it looks like I failed to consider the UTC-offset-sign aspect then.) To improve that, add a test with Pacific/Honolulu. That LMT offset is also quite old in tzdb, so we'll cross our fingers that it doesn't get improved. Reported-by: Christoph Berg <cb@df7cb.de> Discussion: https://postgr.es/m/Z46inkznCxesvDEb@msg.df7cb.de Backpatch-through: 13	2025-01-20 15:47:53 -05:00
Michael Paquier	0f0431e919	Fix header check for continuation records where standbys could be stuck XLogPageRead() checks immediately for an invalid WAL record header on a standby, to be able to handle the case of continuation records that need to be read across two different sources. As written, the check was too generic, applying to any target LSN. Based on an analysis by Kyotaro Horiguchi, what really matters is to make sure that the page header is checked when attempting to read a LSN at the boundary of a segment, to handle the case of a continuation record that spawns across multiple pages when dealing with multiple segments, as WAL receivers are spawned they request WAL from the beginning of a segment. This fix has been proposed by Kyotaro Horiguchi. This could cause standbys to loop infinitely when dealing with a continuation record during a timeline jump, in the case where the contents of the record in the follow-up page are invalid. Some regression tests are added to check such scenarios, able to reproduce the original problem. In the test, the contents of a continuation record are overwritten with junk zeros on its follow-up page, and replayed on standbys. This is inspired by 039_end_of_wal.pl, and is enough to show how standbys should react on promotion by not being stuck. Without the fix, the test would fail with a timeout. The test to reproduce the problem has been written by Alexander Kukushkin. The original check has been introduced in `0668719801`, for a similar problem. Author: Kyotaro Horiguchi, Alexander Kukushkin Reviewed-by: Michael Paquier Discussion: https://postgr.es/m/CAFh8B=mozC+e1wGJq0H=0O65goZju+6ab5AU7DEWCSUA2OtwDg@mail.gmail.com Backpatch-through: 13	2025-01-20 09:30:40 +09:00
Tom Lane	45004f527a	Fix setrefs.c's failure to do expression processing on prune steps. We should run the expression subtrees of PartitionedRelPruneInfo structs through fix_scan_expr. Failure to do so means that AlternativeSubPlans within those expressions won't be cleaned up properly, resulting in "unrecognized node type" errors since v14. It seems fairly likely that at least some of the other steps done by fix_scan_expr are important here as well, resulting in as-yet- undetected bugs. Therefore, I've chosen to back-patch this to all supported branches including v13, even though the known symptom doesn't manifest in v13. Per bug #18778 from Alexander Lakhin. Discussion: https://postgr.es/m/18778-24cd399df6c806af@postgresql.org	2025-01-16 20:40:07 -05:00
Michael Paquier	c9e50ce2a0	Move routines to manipulate WAL into PostgreSQL::Test::Cluster These facilities were originally in the recovery TAP test 039_end_of_wal.pl. A follow-up bug fix with a TAP test doing similar WAL manipulations requires them, and all these had better not be duplicated due to their complexity. The routine names are tweaked to use "wal" more consistently, similarly to the existing "advance_wal". In v14 and v13, the new routines are moved to PostgresNode.pm. 039_end_of_wal.pl is updated to use the refactored routines, without changing its coverage. Reviewed-by: Alexander Kukushkin Discussion: https://postgr.es/m/CAFh8B=mozC+e1wGJq0H=0O65goZju+6ab5AU7DEWCSUA2OtwDg@mail.gmail.com Backpatch-through: 13	2025-01-16 09:26:31 +09:00
Dean Rasheed	8f137f0382	Fix XMLTABLE() deparsing to quote namespace names if necessary. When deparsing an XMLTABLE() expression, XML namespace names were not quoted. However, since they are parsed as ColLabel tokens, some names require double quotes to ensure that they are properly interpreted. Fix by using quote_identifier() in the deparsing code. Back-patch to all supported versions. Dean Rasheed, reviewed by Tom Lane. Discussion: https://postgr.es/m/CAEZATCXTpAS%3DncfLNTZ7YS6O5puHeLg_SUYAit%2Bcs7wsrd9Msg%40mail.gmail.com	2025-01-12 13:02:56 +00:00
Noah Misch	1025463225	In REASSIGN OWNED of a database, lock the tuple as mandated. Commit `aac2c9b4fd` mandated such locking and attempted to fulfill that mandate, but it missed REASSIGN OWNED. Hence, it remained possible to lose VACUUM's inplace update of datfrozenxid if a REASSIGN OWNED processed that database at the same time. This didn't affect the other inplace-updated catalog, pg_class. For pg_class, REASSIGN OWNED calls ATExecChangeOwner() instead of the generic AlterObjectOwner_internal(), and ATExecChangeOwner() fulfills the locking mandate. Like in GRANT, implement this by following the locking protocol for any catalog subject to the generic AlterObjectOwner_internal(). It would suffice to do this for IsInplaceUpdateOid() catalogs only. Back-patch to v13 (all supported versions). Kirill Reshke. Reported by Alexander Kukushkin. Discussion: https://postgr.es/m/CAFh8B=mpKjAy4Cuun-HP-f_vRzh2HSvYFG3rhVfYbfEBUhBAGg@mail.gmail.com	2024-12-28 07:16:27 -08:00
David Rowley	2c7887c9d6	Fix Assert failure in WITH RECURSIVE UNION queries If the non-recursive part of a recursive CTE ended up using TTSOpsBufferHeapTuple as the table slot type, then a duplicate value could cause an Assert failure in CheckOpSlotCompatibility() when checking the hash table for the duplicate value. The expected slot type for the deform step was TTSOpsMinimalTuple so the Assert failed when the TTSOpsBufferHeapTuple slot was used. This is a long-standing bug which we likely didn't notice because it seems much more likely that the non-recursive term would have required projection and used a TTSOpsVirtual slot, which CheckOpSlotCompatibility is ok with. There doesn't seem to be any harm done here other than the Assert failure. Both TTSOpsMinimalTuple and TTSOpsBufferHeapTuple slot types require tuple deformation, so the EEOP__FETCHSOME ExprState step would have properly existed in the ExprState. The solution is to pass NULL for the ExecBuildGroupingEqual's 'lops' parameter. This means the ExprState's EEOP__FETCHSOME step won't expect a fixed slot type. This makes CheckOpSlotCompatibility() happy as no checking is performed when the ExprEvalStep is not expecting a fixed slot type. Reported-by: Richard Guo Reviewed-by: Tom Lane Discussion: https://postgr.es/m/CAMbWs4-8U9q2LAtf8+ghV11zeUReA3AmrYkxzBEv0vKnDxwkKA@mail.gmail.com Backpatch-through: 13, all supported versions	2024-12-19 13:13:51 +13:00
Heikki Linnakangas	e438b0f424	Make 009_twophase.pl test pass with recovery_min_apply_delay set The test failed if you ran the regression tests with TEMP_CONFIG with recovery_min_apply_delay = '500ms'. Fix the race condition by waiting for transaction to be applied in the replica, like in a few other tests. The failing test was introduced in commit `cbfbda7841`. Backpatch to all supported versions like that commit (except v12, which is no longer supported). Reported-by: Alexander Lakhin Discussion: https://www.postgresql.org/message-id/09e2a70a-a6c2-4b5c-aeae-040a7449c9f2@gmail.com	2024-12-16 15:59:38 +02:00
Tom Lane	f2eba400bc	Fix is_digit labeling of to_timestamp's FFn format codes. These format codes produce or consume strings of digits, so they should be labeled with is_digit = true, but they were not. This has effect in only one place, where is_next_separator() is checked to see if the preceding format code should slurp up all the available digits. Thus, with a format such as '...SSFF3' with remaining input '12345', the 'SS' code would consume all five digits (and then complain about seconds being out of range) when it should eat only two digits. Per report from Nick Davies. This bug goes back to `d589f9446` where the FFn codes were introduced, so back-patch to v13. Discussion: https://postgr.es/m/AM8PR08MB6356AC979252CFEA78B56678B6312@AM8PR08MB6356.eurprd08.prod.outlook.com	2024-12-07 13:12:32 -05:00
Tom Lane	48a6cd1ae6	Fix NULLIF()'s handling of read-write expanded objects. If passed a read-write expanded object pointer, the EEOP_NULLIF code would hand that same pointer to the equality function and then (unless equality was reported) also return the same pointer as its value. This is no good, because a function that receives a read-write expanded object pointer is fully entitled to scribble on or even delete the object, thus corrupting the NULLIF output. (This problem is likely unobservable with the equality functions provided in core Postgres, but it's easy to demonstrate with one coded in plpgsql.) To fix, make sure the pointer passed to the equality function is read-only. We can still return the original read-write pointer as the NULLIF result, allowing optimization of later operations. Per bug #18722 from Alexander Lakhin. This has been wrong since we invented expanded objects, so back-patch to all supported branches. Discussion: https://postgr.es/m/18722-fd9e645448cc78b4@postgresql.org	2024-11-25 18:09:11 -05:00
Noah Misch	01745fb04d	Avoid "you don't own a lock of type ExclusiveLock" in GRANT TABLESPACE. This WARNING appeared because SearchSysCacheLocked1() read cc_relisshared before catcache initialization, when the field is false unconditionally. On the basis of reading false there, it constructed a locktag as though pg_tablespace weren't relisshared. Only shared catalogs could be affected, and only GRANT TABLESPACE was affected in practice. SearchSysCacheLocked1() callers use one other shared-relation syscache, DATABASEOID. DATABASEOID is initialized by the end of CheckMyDatabase(), making the problem unreachable for pg_database. Back-patch to v13 (all supported versions). This has no known impact before v16, where ExecGrant_common() first appeared. Earlier branches avoid trouble by having a separate ExecGrant_Tablespace() that doesn't use LOCKTAG_TUPLE. However, leaving this unfixed in v15 could ensnare a future back-patch of a SearchSysCacheLocked1() call. Reported by Aya Iwata. Discussion: https://postgr.es/m/OS7PR01MB11964507B5548245A7EE54E70EA212@OS7PR01MB11964.jpnprd01.prod.outlook.com	2024-11-25 14:42:40 -08:00
Noah Misch	07c6e0f613	Fix per-session activation of ALTER {ROLE\|DATABASE} SET role. After commit `5a2fed911a`, the catalog state resulting from these commands ceased to affect sessions. Restore the longstanding behavior, which is like beginning the session with a SET ROLE command. If cherry-picking the CVE-2024-10978 fixes, default to including this, too. (This fixes an unintended side effect of fixing CVE-2024-10978.) Back-patch to v12, like that commit. The release team decided to include v12, despite the original intent to halt v12 commits earlier this week. Tom Lane and Noah Misch. Reported by Etienne LAFARGE. Discussion: https://postgr.es/m/CADOZwSb0UsEr4_UTFXC5k7=fyyK8uKXekucd+-uuGjJsGBfxgw@mail.gmail.com	2024-11-15 20:40:00 -08:00
Tom Lane	76123ded6e	Fix improper interactions between session_authorization and role. The SQL spec mandates that SET SESSION AUTHORIZATION implies SET ROLE NONE. We tried to implement that within the lowest-level functions that manipulate these settings, but that was a bad idea. In particular, guc.c assumes that it doesn't matter in what order it applies GUC variable updates, but that was not the case for these two variables. This problem, compounded by some hackish attempts to work around it, led to some security-grade issues: * Rolling back a transaction that had done SET SESSION AUTHORIZATION would revert to SET ROLE NONE, even if that had not been the previous state, so that the effective user ID might now be different from what it had been. * The same for SET SESSION AUTHORIZATION in a function SET clause. * If a parallel worker inspected current_setting('role'), it saw "none" even when it should see something else. Also, although the parallel worker startup code intended to cope with the current role's pg_authid row having disappeared, its implementation of that was incomplete so it would still fail. Fix by fully separating the miscinit.c functions that assign session_authorization from those that assign role. To implement the spec's requirement, teach set_config_option itself to perform "SET ROLE NONE" when it sets session_authorization. (This is undoubtedly ugly, but the alternatives seem worse. In particular, there's no way to do it within assign_session_authorization without incompatible changes in the API for GUC assign hooks.) Also, improve ParallelWorkerMain to directly set all the relevant user-ID variables instead of relying on some of them to get set indirectly. That allows us to survive not finding the pg_authid row during worker startup. In v16 and earlier, this includes back-patching `9987a7bf3` which fixed a violation of GUC coding rules: SetSessionAuthorization is not an appropriate place to be throwing errors from. Security: CVE-2024-10978	2024-11-11 10:29:54 -05:00
Nathan Bossart	952ff31e2a	Ensure cached plans are correctly marked as dependent on role. If a CTE, subquery, sublink, security invoker view, or coercion projection references a table with row-level security policies, we neglected to mark the plan as potentially dependent on which role is executing it. This could lead to later executions in the same session returning or hiding rows that should have been hidden or returned instead. Reported-by: Wolfgang Walther Reviewed-by: Noah Misch Security: CVE-2024-10976 Backpatch-through: 12	2024-11-11 09:00:00 -06:00
Noah Misch	e428cd058f	Block environment variable mutations from trusted PL/Perl. Many process environment variables (e.g. PATH), bypass the containment expected of a trusted PL. Hence, trusted PLs must not offer features that achieve setenv(). Otherwise, an attacker having USAGE privilege on the language often can achieve arbitrary code execution, even if the attacker lacks a database server operating system user. To fix PL/Perl, replace trusted PL/Perl %ENV with a tied hash that just replaces each modification attempt with a warning. Sites that reach these warnings should evaluate the application-specific implications of proceeding without the environment modification: Can the application reasonably proceed without the modification? If no, switch to plperlu or another approach. If yes, the application should change the code to stop attempting environment modifications. If that's too difficult, add "untie %main::ENV" in any code executed before the warning. For example, one might add it to the start of the affected function or even to the plperl.on_plperl_init setting. In passing, link to Perl's guidance about the Perl features behind the security posture of PL/Perl. Back-patch to v12 (all supported versions). Andrew Dunstan and Noah Misch Security: CVE-2024-10979	2024-11-11 06:23:48 -08:00
Amit Langote	054701a2b7	Disallow partitionwise join when collations don't match If the collation of any join key column doesn’t match the collation of the corresponding partition key, partitionwise joins can yield incorrect results. For example, rows that would match under the join key collation might be located in different partitions due to the partitioning collation. In such cases, a partitionwise join would yield different results from a non-partitionwise join, so disallow it in such cases. Reported-by: Tender Wang <tndrwang@gmail.com> Author: Jian He <jian.universality@gmail.com> Reviewed-by: Tender Wang <tndrwang@gmail.com> Reviewed-by: Junwang Zhao <zhjwpku@gmail.com> Discussion: https://postgr.es/m/CAHewXNno_HKiQ6PqyLYfuqDtwp7KKHZiH1J7Pqyz0nr+PS2Dwg@mail.gmail.com Backpatch-through: 12	2024-11-08 17:18:21 +09:00
Amit Langote	ff65f695c0	Disallow partitionwise grouping when collations don't match If the collation of any grouping column doesn’t match the collation of the corresponding partition key, partitionwise grouping can yield incorrect results. For example, rows that would be grouped under the grouping collation may end up in different partitions under the partitioning collation. In such cases, full partitionwise grouping would produce results that differ from those without partitionwise grouping, so disallowed that. Partial partitionwise aggregation is still allowed, as the Finalize step reconciles partition-level aggregates with grouping requirements across all partitions, ensuring that the final output remains consistent. This commit also fixes group_by_has_partkey() by ensuring the RelabelType node is stripped from grouping expressions when matching them to partition key expressions to avoid false mismatches. Bug: #18568 Reported-by: Webbo Han <1105066510@qq.com> Author: Webbo Han <1105066510@qq.com> Reviewed-by: Tender Wang <tndrwang@gmail.com> Reviewed-by: Aleksander Alekseev <aleksander@timescale.com> Reviewed-by: Jian He <jian.universality@gmail.com> Discussion: https://postgr.es/m/18568-2a9afb6b9f7e6ed3@postgresql.org Discussion: https://postgr.es/m/tencent_9D9103CDA420C07768349CC1DFF88465F90A@qq.com Discussion: https://postgr.es/m/CAHewXNno_HKiQ6PqyLYfuqDtwp7KKHZiH1J7Pqyz0nr+PS2Dwg@mail.gmail.com Backpatch-through: 12	2024-11-08 16:06:12 +09:00
Peter Eisentraut	ebbfa2ae34	Message style improvement Backpatch the part of `edee0c621d` that applies to `a90bdd7a44`, which was also backpatched. That way, the message is consistent in all branches.	2024-11-08 07:32:14 +01:00
Noah Misch	fe8091c9e3	Revert "For inplace update, send nontransactional invalidations." This reverts commit `95c5acb3fc` (v17) and counterparts in each other non-master branch. If released, that commit would have caused a worst-in-years minor release regression, via undetected LWLock self-deadlock. This commit and its self-deadlock fix warrant more bake time in the master branch. Reported by Alexander Lakhin. Discussion: https://postgr.es/m/10ec0bc3-5933-1189-6bb8-5dec4114558e@gmail.com	2024-11-02 09:05:07 -07:00
Noah Misch	0ea9d40a66	For inplace update, send nontransactional invalidations. The inplace update survives ROLLBACK. The inval didn't, so another backend's DDL could then update the row without incorporating the inplace update. In the test this fixes, a mix of CREATE INDEX and ALTER TABLE resulted in a table with an index, yet relhasindex=f. That is a source of index corruption. Back-patch to v12 (all supported versions). The back branch versions don't change WAL, because those branches just added end-of-recovery SIResetAll(). All branches change the ABI of extern function PrepareToInvalidateCacheTuple(). No PGXN extension calls that, and there's no apparent use case in extensions. Reviewed by Nitin Motiani and (in earlier versions) Andres Freund. Discussion: https://postgr.es/m/20240523000548.58.nmisch@google.com	2024-10-25 06:51:08 -07:00
Noah Misch	3e5ea478d8	Stop reading uninitialized memory in heap_inplace_lock(). Stop computing a never-used value. This removes the read; the read had no functional implications. Back-patch to v12, like commit `a07e03fd8f`. Reported by Alexander Lakhin. Discussion: https://postgr.es/m/6c92f59b-f5bc-e58c-9bdd-d1f21c17c786@gmail.com	2024-10-24 09:16:19 -07:00
Álvaro Herrera	d20194cead	Restructure foreign key handling code for ATTACH/DETACH ... to fix bugs when the referenced table is partitioned. The catalog representation we chose for foreign keys connecting partitioned tables (in commit `f56f8f8da6`) is inconvenient, in the sense that a standalone table has a different way to represent the constraint when referencing a partitioned table, than when the same table becomes a partition (and vice versa). Because of this, we need to create additional catalog rows on detach (pg_constraint and pg_trigger), and remove them on attach. We were doing some of those things, but not all of them, leading to missing catalog rows in certain cases. The worst problem seems to be that we are missing action triggers after detaching a partition, which means that you could update/delete rows from the referenced partitioned table that still had referencing rows on that table, the server failing to throw the required errors. !!! Note that this means existing databases with FKs that reference partitioned tables might have rows that break relational integrity, on tables that were once partitions on the referencing side of the FK. Another possible problem is that trying to reattach a table that had been detached would fail indicating that internal triggers cannot be found, which from the user's point of view is nonsensical. In branches 15 and above, we fix this by creating a new helper function addFkConstraint() which is in charge of creating a standalone pg_constraint row, and repurposing addFkRecurseReferencing() and addFkRecurseReferenced() so that they're only the recursive routine for each side of the FK, and they call addFkConstraint() to create pg_constraint at each partitioning level and add the necessary triggers. These new routines can be used during partition creation, partition attach and detach, and foreign key creation. This reduces redundant code and simplifies the flow. In branches 14 and 13, we have a much simpler fix that consists on simply removing the constraint on detach. The reason is that those branches are missing commit `f4566345cf`, which reworked the way this works in a way that we didn't consider back-patchable at the time. We opted to leave branch 12 alone, because it's different from branch 13 enough that the fix doesn't apply; and because it is going in EOL mode very soon, patching it now might be worse since there's no way to undo the damage if it goes wrong. Existing databases might need to be repaired. In the future we might want to rethink the catalog representation to avoid this problem, but for now the code seems to do what's required to make the constraints operate correctly. Co-authored-by: Jehan-Guillaume de Rorthais <jgdr@dalibo.com> Co-authored-by: Tender Wang <tndrwang@gmail.com> Co-authored-by: Alvaro Herrera <alvherre@alvh.no-ip.org> Reported-by: Guillaume Lelarge <guillaume@lelarge.info> Reported-by: Jehan-Guillaume de Rorthais <jgdr@dalibo.com> Reported-by: Thomas Baehler (SBB CFF FFS) <thomas.baehler2@sbb.ch> Discussion: https://postgr.es/m/20230420144344.40744130@karst Discussion: https://postgr.es/m/20230705233028.2f554f73@karst Discussion: https://postgr.es/m/GVAP278MB02787E7134FD691861635A8BC9032@GVAP278MB0278.CHEP278.PROD.OUTLOOK.COM Discussion: https://postgr.es/m/18541-628a61bc267cd2d3@postgresql.org	2024-10-22 16:01:18 +02:00
Tom Lane	beab395a42	Fix wrong assertion and poor error messages in "COPY (query) TO". If the query is rewritten into a NOTIFY command by a DO INSTEAD rule, we'd get an assertion failure, or in non-assert builds issue a rather confusing error message. Improve that. Also fix a longstanding grammar mistake in a nearby error message. Per bug #18664 from Alexander Lakhin. Back-patch to all supported branches. Tender Wang and Tom Lane Discussion: https://postgr.es/m/18664-ffd0ebc2386598df@postgresql.org	2024-10-21 15:08:22 -04:00
Tom Lane	76de4b182c	Correctly identify which EC members are computable at a plan node. find_computable_ec_member() had the wrong mental model of what its primary caller prepare_sort_from_pathkeys() would do with the selected EquivalenceClass member expression. We will not compute the EC expression in a plan node atop the one returning the passed-in targetlist; rather, the EC expression will be computed as an additional column of that targetlist. So any Var or quasi-Var used in the given tlist is also available to the EC expression. In simple cases this makes no difference because the given tlist is just a list of Vars or quasi-Vars --- but if we are considering an appendrel member produced by flattening a UNION ALL, the tlist may contain expressions, resulting in failure to match and a "could not find pathkey item to sort" error. To fix, we can flatten both the tlist and the EC members with pull_var_clause(), and then just check for subset-ness, so that the code is actually shorter than before. While this bug is quite old, the present patch only works back to v13. We could possibly make it work in v12 by back-patching parts of `375398244`. On the whole though I don't like the risk/reward ratio of that idea. v12's final release is next month, meaning there would be no chance to correct matters if the patch causes a regression. Since this failure has escaped notice for 14 years, it's likely nobody will hit it in the field with v12. Per bug #18652 from Alexander Lakhin. Andrei Lepikhov and Tom Lane Discussion: https://postgr.es/m/18652-deaa782ebcca85d1@postgresql.org	2024-10-12 14:56:08 -04:00
Noah Misch	118dfd1213	Avoid 037_invalid_database.pl hang under debug_discard_caches. Back-patch to v12 (all supported versions).	2024-09-27 15:29:01 -07:00
Andres Freund	f232d7c682	tests: Restrict pg_locks queries in advisory_locks.sql to current database Otherwise testing an existing installation can fail, if there are other locks, e.g. from one of the isolation tests. This was originally applied as `c3315a7da5` in 16~, but it is possible to see this test fail depending on the concurrent activity for older active branches. Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/20221003234111.4ob7yph6r4g4ywhu@awork3.anarazel.de Backpatch-through: 12	2024-09-26 13:46:40 +09:00
Noah Misch	14c57cb639	For inplace update durability, make heap_update() callers wait. The previous commit fixed some ways of losing an inplace update. It remained possible to lose one when a backend working toward a heap_update() copied a tuple into memory just before inplace update of that tuple. In catalogs eligible for inplace update, use LOCKTAG_TUPLE to govern admission to the steps of copying an old tuple, modifying it, and issuing heap_update(). This includes MERGE commands. To avoid changing most of the pg_class DDL, don't require LOCKTAG_TUPLE when holding a relation lock sufficient to exclude inplace updaters. Back-patch to v12 (all supported versions). In v13 and v12, "UPDATE pg_class" or "UPDATE pg_database" can still lose an inplace update. The v14+ UPDATE fix needs commit `86dc90056d`, and it wasn't worth reimplementing that fix without such infrastructure. Reviewed by Nitin Motiani and (in earlier versions) Heikki Linnakangas. Discussion: https://postgr.es/m/20231027214946.79.nmisch@google.com	2024-09-24 15:25:24 -07:00
Noah Misch	a8ad1929d2	Fix data loss at inplace update after heap_update(). As previously-added tests demonstrated, heap_inplace_update() could instead update an unrelated tuple of the same catalog. It could lose the update. Losing relhasindex=t was a source of index corruption. Inplace-updating commands like VACUUM will now wait for heap_update() commands like GRANT TABLE and GRANT DATABASE. That isn't ideal, but a long-running GRANT already hurts VACUUM progress more just by keeping an XID running. The VACUUM will behave like a DELETE or UPDATE waiting for the uncommitted change. For implementation details, start at the systable_inplace_update_begin() header comment and README.tuplock. Back-patch to v12 (all supported versions). In back branches, retain a deprecated heap_inplace_update(), for extensions. Reported by Smolkin Grigory. Reviewed by Nitin Motiani, (in earlier versions) Heikki Linnakangas, and (in earlier versions) Alexander Lakhin. Discussion: https://postgr.es/m/CAMp+ueZQz3yDk7qg42hk6-9gxniYbp-=bG2mgqecErqR5gGGOA@mail.gmail.com	2024-09-24 15:25:24 -07:00
Daniel Gustafsson	e43e71b6cd	Drop global objects after completed test Project policy is to not leave global objects behind after a regress test run. This was found as a result of the development of a patch to make pg_regress detect such leftovers automatically, which in the end was withdrawn due to issues with parallel runs. This was originally committed as `936e3fa378`, but the issue also exists in the 12~16 range. Discussion: https://postgr.es/m/E1phvk7-000VAH-7k@gemulon.postgresql.org Backpatch-through: 12	2024-09-24 12:10:33 +09:00
Noah Misch	916b8ae475	Don't enter parallel mode when holding interrupts. Doing so caused the leader to hang in wait_event=ParallelFinish, which required an immediate shutdown to resolve. Back-patch to v12 (all supported versions). Francesco Degrassi Discussion: https://postgr.es/m/CAC-SaSzHUKT=vZJ8MPxYdC_URPfax+yoA1hKTcF4ROz_Q6z0_Q@mail.gmail.com	2024-09-17 19:54:26 -07:00

1 2 3 4 5 ...

5902 commits