postgresql

mirror of https://github.com/postgres/postgres.git synced 2026-03-02 13:24:01 -05:00

Author	SHA1	Message	Date
Tom Lane	ffbe9dec13	Remove orphaned expected-result file. This should have been removed in `43e084197`, which removed the corresponding spec file. Noted while fooling about with the isolationtester.	2021-06-14 21:28:21 -04:00
Michael Paquier	2d689babe3	Improve handling of dropped objects in pg_event_trigger_ddl_commands() An object found as dropped when digging into the list of objects returned by pg_event_trigger_ddl_commands() could cause a cache lookup error, as the calls grabbing for the object address and the type name would fail if the object was missing. Those lookup errors could be seen with combinations of ALTER TABLE sub-commands involving identity columns. The lookup logic is changed in this code path to get a behavior similar to any other SQL-callable function by ignoring objects that are not found, taking advantage of `2a10fdc`. The back-branches are not changed, as they require this commit that is too invasive for stable branches. While on it, add test cases to exercise event triggers with identity columns, and stress more cases with the event ddl_command_end for relations. Author: Sven Klemm, Aleksander Alekseev, Michael Paquier Discussion: https://postgr.es/m/CAMCrgp2R1cEXU53iYKtW6yVEp2_yKUz+z=3-CTrYpPP+xryRtg@mail.gmail.com	2021-06-14 14:57:22 +09:00
Michael Paquier	dbab0c07e5	Remove forced toast recompression in VACUUM FULL/CLUSTER The extra checks added by the recompression of toast data introduced in `bbe0a81` is proving to have a performance impact on VACUUM or CLUSTER even if no recompression is done. This is more noticeable with more toastable columns that contain non-NULL values. Improvements could be done to make those extra checks less expensive, but that's not material for 14 at this stage, and we are not sure either if the code path of VACUUM FULL/CLUSTER is adapted for this job. Per discussion with several people, including Andres Freund, Robert Haas, Álvaro Herrera, Tom Lane and myself. Discussion: https://postgr.es/m/20210527003144.xxqppojoiwurc2iz@alap3.anarazel.de	2021-06-14 09:25:50 +09:00
Andrew Dunstan	9d97c34083	Further tweaks to stuck_on_old_timeline recovery test Translate path slashes on target directory path. This was confusing old branches, but is applied to all branches for the sake of uniformity. Perl is perfectly able to understand paths with forward slashes. Along the way, restore the previous archive_wait query, for the sake of uniformity with other tests, per gripe from Tom Lane.	2021-06-13 07:19:34 -04:00
Michael Paquier	a9e0b3b08f	Ignore more environment variables in pg_regress.c This is similar to the work done in `8279f68` for TestLib.pm, where environment variables set may cause unwanted failures if using a temporary installation with pg_regress. The list of variables reset is adjusted in each stable branch depending on what is supported. Comments are added to remember that the lists in TestLib.pm and pg_regress.c had better be kept in sync. Reviewed-by: Álvaro Herrera Discussion: https://postgr.es/m/YMNR9GYDn+fHlMta@paquier.xyz Backpatch-through: 9.6	2021-06-13 20:07:39 +09:00
Tom Lane	f452aaf7d4	Restore robustness of TAP tests that wait for postmaster restart. Several TAP tests use poll_query_until() to wait for the postmaster to restart. They were checking to see if a trivial query (e.g. "SELECT 1") succeeds. However, that's problematic in the wake of commit `11e9caff8`, because now that we feed said query to psql via stdin, we risk IPC::Run whining about a SIGPIPE failure if psql quits before reading the query. Hence, we can't use a nonempty query in cases where we need to wait for connection failures to stop happening. Per the precedent of commits `c757a3da0` and `6d41dd045`, we can pass "undef" as the query in such cases to ensure that IPC::Run has nothing to write. However, then we have to say that the expected output is empty, and this exposes a deficiency in poll_query_until: if psql fails altogether and returns empty stdout, poll_query_until will treat that as a success! That's because, contrary to its documentation, it makes no actual check for psql failure, looking neither at the exit status nor at stderr. To fix that, adjust poll_query_until to insist on empty stderr as well as a stdout match. (I experimented with checking exit status instead, but it seems that psql often does exit(1) in cases that we need to consider successes. That might be something to fix someday, but it would be a non-back-patchable behavior change.) Back-patch to v10. The test cases needing this exist only as far back as v11, but it seems wise to keep poll_query_until's behavior the same in v10, in case we back-patch another such test case in future. (9.6 does not currently need this change, because in that branch poll_query_until can't be told to accept empty stdout as a success case.) Per assorted buildfarm failures, mostly on hoverfly. Discussion: https://postgr.es/m/CAA4eK1+zM6L4QSA1XMvXY_qqWwdUmqkOS1+hWvL8QcYEBGA1Uw@mail.gmail.com	2021-06-12 15:12:10 -04:00
Andrew Dunstan	c3652f976b	Fix new recovery test for use under msys Commit `caba8f0d43` wasn't quite right for msys, as demonstrated by several buildfarm animals, including jacana and fairywren. We need to use the msys perl in the archive command, but call it in such a way that Windows will understand the path. Furthermore, inside the copy script we need to convert a Windows path to an msys path.	2021-06-12 08:43:54 -04:00
Michael Paquier	bfd96b7a3d	Improve log pattern detection in recently-added TAP tests `ab55d74` has introduced some tests with rows found as missing in logical replication subscriptions for partitioned tables, relying on a logic with a lookup of the logs generated, scanning the whole file. This commit makes the logic more precise, by scanning the logs only from the position before the key queries are run to the position where we check for the logs. This will reduce the risk of issues with log patterns overlapping with each other if those tests get more complicated in the future. Per discussion with Tom Lane. Discussion: https://postgr.es/m/YMP+Gx2S8meYYHW4@paquier.xyz Backpatch-through: 13	2021-06-12 15:29:48 +09:00
Tom Lane	ab55d742eb	Fix multiple crasher bugs in partitioned-table replication logic. apply_handle_tuple_routing(), having detected and reported that the tuple it needed to update didn't exist, tried to update that tuple anyway, leading to a null-pointer dereference. logicalrep_partition_open() failed to ensure that the LogicalRepPartMapEntry it built for a partition was fully independent of that for the partition root, leading to trouble if the root entry was later freed or rebuilt. Meanwhile, on the publisher's side, pgoutput_change() sometimes attempted to apply execute_attr_map_tuple() to a NULL tuple. The first of these was reported by Sergey Bernikov in bug #17055; I found the other two while developing some test cases for this sadly under-tested code. Diagnosis and patch for the first issue by Amit Langote; patches for the others by me; new test cases by me. Back-patch to v13 where this logic came in. Discussion: https://postgr.es/m/17055-9ba800ec8522668b@postgresql.org	2021-06-11 16:12:41 -04:00
Alvaro Herrera	4efcf47053	Add 'Portal Close' message to pipelined PQsendQuery() Commit `acb7e4eb6b` added a new implementation for PQsendQuery so that it works in pipeline mode (by using extended query protocol), but it behaves differently from the 'Q' message (in simple query protocol) used by regular implementation: the new one doesn't close the unnamed portal. Change the new code to have identical behavior to the old. Reported-by: Yura Sokolov <y.sokolov@postgrespro.ru> Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Discussion: https://postgr.es/m/202106072107.d4i55hdscxqj@alvherre.pgsql	2021-06-11 16:05:50 -04:00
Noah Misch	d0e750c0ac	Rename PQtraceSetFlags() to PQsetTraceFlags(). We have a dozen PQset*() functions. PQresultSetInstanceData() and this were the libpq setter functions having a different word order. Adopt the majority word order. Reviewed by Alvaro Herrera and Robert Haas, though this choice of name was not unanimous. Discussion: https://postgr.es/m/20210605060555.GA216695@rfd.leadboat.com	2021-06-10 21:56:13 -07:00
Tom Lane	e56bce5d43	Reconsider the handling of procedure OUT parameters. Commit `2453ea142` redefined pg_proc.proargtypes to include the types of OUT parameters, for procedures only. While that had some advantages for implementing the SQL-spec behavior of DROP PROCEDURE, it was pretty disastrous from a number of other perspectives. Notably, since the primary key of pg_proc is name + proargtypes, this made it possible to have multiple procedures with identical names + input arguments and differing output argument types. That would make it impossible to call any one of the procedures by writing just NULL (or "?", or any other data-type-free notation) for the output argument(s). The change also seems likely to cause grave confusion for client applications that examine pg_proc and expect the traditional definition of proargtypes. Hence, revert the definition of proargtypes to what it was, and undo a number of complications that had been added to support that. To support the SQL-spec behavior of DROP PROCEDURE, when there are no argmode markers in the command's parameter list, we perform the lookup both ways (that is, matching against both proargtypes and proallargtypes), succeeding if we get just one unique match. In principle this could result in ambiguous-function failures that would not happen when using only one of the two rules. However, overloading of procedure names is thought to be a pretty rare usage, so this shouldn't cause many problems in practice. Postgres-specific code such as pg_dump can defend against any possibility of such failures by being careful to specify argmodes for all procedure arguments. This also fixes a few other bugs in the area of CALL statements with named parameters, and improves the documentation a little. catversion bump forced because the representation of procedures with OUT arguments changes. Discussion: https://postgr.es/m/3742981.1621533210@sss.pgh.pa.us	2021-06-10 17:11:36 -04:00
Robert Haas	4dcb1d087a	Adjust new test case to set wal_keep_size. Per buildfarm member conchuela and Kyotaro Horiguchi, it's possible for the WAL segment that the cascading standby needs to be removed too quickly. Hopefully this will prevent that. Kyotaro Horiguchi Discussion: http://postgr.es/m/20210610.101240.1270925505780628275.horikyota.ntt@gmail.com	2021-06-10 09:46:08 -04:00
Robert Haas	caba8f0d43	Fix corner case failure of new standby to follow new primary. This only happens if (1) the new standby has no WAL available locally, (2) the new standby is starting from the old timeline, (3) the promotion happened in the WAL segment from which the new standby is starting, (4) the timeline history file for the new timeline is available from the archive but the WAL files for are not (i.e. this is a race), (5) the WAL files for the new timeline are available via streaming, and (6) recovery_target_timeline='latest'. Commit `ee994272ca` introduced this logic and was an improvement over the previous code, but it mishandled this case. If recovery_target_timeline='latest' and restore_command is set, validateRecoveryParameters() can change recoveryTargetTLI to be different from receiveTLI. If streaming is then tried afterward, expectedTLEs gets initialized with the history of the wrong timeline. It's supposed to be a list of entries explaining how to get to the target timeline, but in this case it ends up with a list of entries explaining how to get to the new standby's original timeline, which isn't right. Dilip Kumar and Robert Haas, reviewed by Kyotaro Horiguchi. Discussion: http://postgr.es/m/CAFiTN-sE-jr=LB8jQuxeqikd-Ux+jHiXyh4YDiZMPedgQKup0g@mail.gmail.com	2021-06-09 16:17:00 -04:00
Tom Lane	bfeede9fa4	Don't crash on empty statements in SQL-standard function bodies. gram.y should discard NULL pointers (empty statements) when assembling a routine_body_stmt_list, as it does for other sorts of statement lists. Julien Rouhaud and Tom Lane, per report from Noah Misch. Discussion: https://postgr.es/m/20210606044418.GA297923@rfd.leadboat.com	2021-06-08 11:59:34 -04:00
Michael Paquier	68a6d8a870	Fix portability issue in test indirect_toast When run on a server using default_toast_compression set to LZ4, this test would fail because of a consistency issue with the order of the tuples treated. LZ4 causes one tuple to be stored inline instead of getting externalized. As the goal of this test is to check after data stored externally, stick to pglz as the compression algorithm used, so as all data of this test is stored the way it should. Analyzed-by: Dilip Kumar Discussion: https://postgr.es/m/YLrDWxJgM8WWMoCg@paquier.xyz	2021-06-07 18:12:29 +09:00
Andrew Dunstan	11e9caff82	In PostgresNode.pm, don't pass SQL to psql on the command line The Msys shell mangles certain patterns in its command line, so avoid handing arbitrary SQL to psql on the command line and instead use IPC::Run's redirection facility for stdin. This pattern is already mostly whats used, but query_poll_until() was not doing the right thing. Problem discovered on the buildfarm when a new TAP test failed on msys.	2021-06-03 16:14:06 -04:00
Michael Paquier	8279f68a1b	Ignore more environment variables in TAP tests Various environment variables were not getting reset in the TAP tests, which would cause failures depending on the tests or the environment variables involved. For example, PGSSL{MAX,MIN}PROTOCOLVERSION could cause failures in the SSL tests. Even worse, a junk value of PGCLIENTENCODING makes a server startup fail. The list of variables reset is adjusted in each stable branch depending on what is supported. While on it, simplify a bit the code per a suggestion from Andrew Dunstan, using a list of variables instead of doing single deletions. Reviewed-by: Andrew Dunstan, Daniel Gustafsson Discussion: https://postgr.es/m/YLbjjRpucIeZ78VQ@paquier.xyz Backpatch-through: 9.6	2021-06-03 11:50:56 +09:00
Tom Lane	2955c2be79	Re-allow custom GUC names that have more than two components. Commit `3db826bd5` disallowed this case, but it turns out that some people are depending on it. Since the core grammar has allowed it since `3dc37cd8d`, it seems like this code should fall in line. Per bug #17045 from Robert Sosinski. Discussion: https://postgr.es/m/17045-6a4a9f0d1513f72b@postgresql.org	2021-06-02 18:50:23 -04:00
Fujii Masao	df466d30c6	Remove unnecessary use of Time::HiRes from 013_crash_restart.pl. The regression test 013_crash_restart.pl included "use Time::HiRes qw(usleep)", but the usleep was not used there. Author: Fujii Masao Reviewed-by: Kyotaro Horiguchi Discussion: https://postgr.es/m/63ad1368-18e2-8900-8443-524bdfb1bef5@oss.nttdata.com	2021-06-02 12:20:15 +09:00
Fujii Masao	6bbc5c5e96	Add regression test for recovery pause. Previously there was no regression test for recovery pause feature. This commit adds the test that checks - recovery can be paused or resumed expectedly - pg_get_wal_replay_pause_state() reports the correct pause state - the paused state ends and promotion continues if a promotion is triggered while recovery is paused Suggested-by: Michael Paquier Author: Fujii Masao Reviewed-by: Kyotaro Horiguchi, Dilip Kumar Discussion: https://postgr.es/m/YKNirzqM1HYyk5h4@paquier.xyz	2021-06-02 12:19:39 +09:00
Noah Misch	42344ad0ed	Add Windows file version information to libpq_pipeline.exe.	2021-06-01 18:04:15 -07:00
Tom Lane	1103033aed	Reject SELECT ... GROUP BY GROUPING SETS (()) FOR UPDATE. This case should be disallowed, just as FOR UPDATE with a plain GROUP BY is disallowed; FOR UPDATE only makes sense when each row of the query result can be identified with a single table row. However, we missed teaching CheckSelectLocking() to check groupingSets as well as groupClause, so that it would allow degenerate grouping sets. That resulted in a bad plan and a null-pointer dereference in the executor. Looking around for other instances of the same bug, the only one I found was in examine_simple_variable(). That'd just lead to silly estimates, but it should be fixed too. Per private report from Yaoguang Chen. Back-patch to all supported branches.	2021-06-01 11:12:56 -04:00
Amit Kapila	eb89cb43a0	pgoutput: Fix memory leak due to RelationSyncEntry.map. Release memory allocated when creating the tuple-conversion map and its component TupleDescs when its owning sync entry is invalidated. TupleDescs must also be freed when no map is deemed necessary, to begin with. Reported-by: Andres Freund Author: Amit Langote Reviewed-by: Takamichi Osumi, Amit Kapila Backpatch-through: 13, where it was introduced Discussion: https://postgr.es/m/MEYP282MB166933B1AB02B4FE56E82453B64D9@MEYP282MB1669.AUSP282.PROD.OUTLOOK.COM	2021-06-01 14:27:14 +05:30
Tom Lane	6ee41a301e	Fix mis-planning of repeated application of a projection. create_projection_plan contains a hidden assumption (here made explicit by an Assert) that a projection-capable Path will yield a projection-capable Plan. Unfortunately, that assumption is violated only a few lines away, by create_projection_plan itself. This means that two stacked ProjectionPaths can yield an outcome where we try to jam the upper path's tlist into a non-projection-capable child node, resulting in an invalid plan. There isn't any good reason to have stacked ProjectionPaths; indeed the whole concept is faulty, since the set of Vars/Aggs/etc needed by the upper one wouldn't necessarily be available in the output of the lower one, nor could the lower one create such values if they weren't available from its input. Hence, we can fix this by adjusting create_projection_path to strip any top-level ProjectionPath from the subpath it's given. (This amounts to saying "oh, we changed our minds about what we need to project here".) The test case added here only fails in v13 and HEAD; before that, we don't attempt to shove the Sort into the parallel part of the plan, for reasons that aren't entirely clear to me. However, all the directly-related code looks generally the same as far back as v11, where the hazard was introduced (by `d7c19e62a`). So I've got no faith that the same type of bug doesn't exist in v11 and v12, given the right test case. Hence, back-patch the code changes, but not the irrelevant test case, into those branches. Per report from Bas Poot. Discussion: https://postgr.es/m/534fca83789c4a378c7de379e9067d4f@politie.nl	2021-05-31 12:03:00 -04:00
Noah Misch	d03eeab886	Raise a timeout to 180s, in test 010_logical_decoding_timelines.pl. Per buildfarm member hornet. Also, update Pod documentation showing the lower value. Back-patch to v10, where the test first appeared.	2021-05-31 00:29:58 -07:00
Michael Paquier	12cc956664	Improve some error wording with multirange type parsing Braces were referred in some error messages as only brackets (not curly brackets or curly braces), which can be confusing as other types of brackets could be used. While on it, add one test to check after the case of junk characters detected after a right brace. Author: Kyotaro Horiguchi Discussion: https://postgr.es/m/20210514.153153.1814935914483287479.horikyota.ntt@gmail.com	2021-05-31 11:35:00 +09:00
Tom Lane	e6241d8e03	Rethink definition of pg_attribute.attcompression. Redefine '\0' (InvalidCompressionMethod) as meaning "if we need to compress, use the current setting of default_toast_compression". This allows '\0' to be a suitable default choice regardless of datatype, greatly simplifying code paths that initialize tupledescs and the like. It seems like a more user-friendly approach as well, because now the default compression choice doesn't migrate into table definitions, meaning that changing default_toast_compression is usually sufficient to flip an installation's behavior; one needn't tediously issue per-column ALTER SET COMPRESSION commands. Along the way, fix a few minor bugs and documentation issues with the per-column-compression feature. Adopt more robust APIs for SetIndexStorageProperties and GetAttributeCompression. Bump catversion because typical contents of attcompression will now be different. We could get away without doing that, but it seems better to ensure v14 installations all agree on this. (We already forced initdb for beta2, anyway.) Discussion: https://postgr.es/m/626613.1621787110@sss.pgh.pa.us	2021-05-27 13:24:27 -04:00
Peter Eisentraut	a717e5c771	Fix vpath build in libpq_pipeline test The path needs to be set to refer to the build directory, not the current directory, because that's actually the source directory at that point. fix for `6abc8c2596`	2021-05-27 16:40:52 +02:00
Peter Eisentraut	6abc8c2596	Add NO_INSTALL option to pgxs Apply in libpq_pipeline test makefile, so that the test file is not installed into tmp_install. Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/cb9d16a6-760f-cd44-28d6-b091d5fb6ca7%40enterprisedb.com	2021-05-27 13:58:29 +02:00
Alvaro Herrera	eb43bdbf51	Make detach-partition-concurrently-4 less timing sensitive Same as `5e0b1aeb2d`, for the companion test file. This one seems lower probability (only two failures in a month of runs); I was hardly able to reproduce a failure without a patch, so the fact that I was also unable to reproduce one with it doesn't say anything. We'll have to wait for further buildfarm results to see if we need any further adjustments. Discussion: https://postgr.es/m/20210524090712.GA3771394@rfd.leadboat.com	2021-05-25 19:44:55 -04:00
Alvaro Herrera	5e0b1aeb2d	Make detach-partition-concurrently-3 less timing-sensitive This recently added test has shown to be too sensitive to timing when sending a cancel to a session waiting for a lock. We fix this by running a no-op query in the blocked session immediately after the cancel; this avoids the session that sent the cancel sending another query immediately before the cancel has been reported. Idea by Noah Misch. With that fix, we sometimes see that the cancel error report is shown only relative to the step that is cancelled, instead of together with the step that sends the cancel. To increase the probability that both steps are shown togeter, add a 0.1s sleep to the cancel. In normal conditions this appears sufficient to silence most failures, but we'll see that the slower buildfarm members say about it. Reported-by: Takamichi Osumi <osumi.takamichi@fujitsu.com> Discussion: https://postgr.es/m/OSBPR01MB4888C4ABA361C7E81094AC66ED269@OSBPR01MB4888.jpnprd01.prod.outlook.com	2021-05-25 12:53:57 -04:00
Amit Kapila	0734b0e983	Improve docs and error messages for parallel vacuum. The error messages, docs, and one of the options were using 'parallel degree' to indicate parallelism used by vacuum command. We normally use 'parallel workers' at other places so change it for parallel vacuum accordingly. Author: Bharath Rupireddy Reviewed-by: Dilip Kumar, Amit Kapila Backpatch-through: 13 Discussion: https://postgr.es/m/CALj2ACWz=PYrrFXVsEKb9J1aiX4raA+UBe02hdRp_zqDkrWUiw@mail.gmail.com	2021-05-25 09:26:53 +05:30
David Rowley	cba5c70b95	Fix setrefs.c code for Result Cache nodes Result Cache, added in `9eacee2e6` neglected to properly adjust the plan references in setrefs.c. This could lead to the following error during EXPLAIN: ERROR: cannot decompile join alias var in plan tree Fix that. Bug: 17030 Reported-by: Hans Buschmann Discussion: https://postgr.es/m/17030-5844aecae42fe223@postgresql.org	2021-05-25 12:50:22 +12:00
Tom Lane	4b10074453	Disallow whole-row variables in GENERATED expressions. This was previously allowed, but I think that was just an oversight. It's a clear violation of the rule that a generated column cannot depend on itself or other generated columns. Moreover, because the code was relying on the assumption that no such cross-references exist, it was pretty easy to crash ALTER TABLE and perhaps other places. Even if you managed not to crash, you got quite unstable, implementation-dependent results. Per report from Vitaly Ustinov. Back-patch to v12 where GENERATED came in. Discussion: https://postgr.es/m/CAM_DEiWR2DPT6U4xb-Ehigozzd3n3G37ZB1+867zbsEVtYoJww@mail.gmail.com	2021-05-21 15:12:08 -04:00
Tom Lane	2b0ee126bb	Fix usage of "tableoid" in GENERATED expressions. We consider this supported (though I've got my doubts that it's a good idea, because tableoid is not immutable). However, several code paths failed to fill the field in soon enough, causing such a GENERATED expression to see zero or the wrong value. This occurred when ALTER TABLE adds a new GENERATED column to a table with existing rows, and during regular INSERT or UPDATE on a foreign table with GENERATED columns. Noted during investigation of a report from Vitaly Ustinov. Back-patch to v12 where GENERATED came in. Discussion: https://postgr.es/m/CAM_DEiWR2DPT6U4xb-Ehigozzd3n3G37ZB1+867zbsEVtYoJww@mail.gmail.com	2021-05-21 15:02:06 -04:00
Tom Lane	84f5c2908d	Restore the portal-level snapshot after procedure COMMIT/ROLLBACK. COMMIT/ROLLBACK necessarily destroys all snapshots within the session. The original implementation of intra-procedure transactions just cavalierly did that, ignoring the fact that this left us executing in a rather different environment than normal. In particular, it turns out that handling of toasted datums depends rather critically on there being an outer ActiveSnapshot: otherwise, when SPI or the core executor pop whatever snapshot they used and return, it's unsafe to dereference any toasted datums that may appear in the query result. It's possible to demonstrate "no known snapshots" and "missing chunk number N for toast value" errors as a result of this oversight. Historically this outer snapshot has been held by the Portal code, and that seems like a good plan to preserve. So add infrastructure to pquery.c to allow re-establishing the Portal-owned snapshot if it's not there anymore, and add enough bookkeeping support that we can tell whether it is or not. We can't, however, just re-establish the Portal snapshot as part of COMMIT/ROLLBACK. As in normal transaction start, acquiring the first snapshot should wait until after SET and LOCK commands. Hence, teach spi.c about doing this at the right time. (Note that this patch doesn't fix the problem for any PLs that try to run intra-procedure transactions without using SPI to execute SQL commands.) This makes SPI's no_snapshots parameter rather a misnomer, so in HEAD, rename that to allow_nonatomic. replication/logical/worker.c also needs some fixes, because it wasn't careful to hold a snapshot open around AFTER trigger execution. That code doesn't use a Portal, which I suspect someday we're gonna have to fix. But for now, just rearrange the order of operations. This includes back-patching the recent addition of finish_estate() to centralize the cleanup logic there. This also back-patches commit `2ecfeda3e` into v13, to improve the test coverage for worker.c (it was that test that exposed that worker.c's snapshot management is wrong). Per bug #15990 from Andreas Wicht. Back-patch to v11 where intra-procedure COMMIT was added. Discussion: https://postgr.es/m/15990-eee2ac466b11293d@postgresql.org	2021-05-21 14:03:59 -04:00
Amit Kapila	6d0eb38557	Fix deadlock for multiple replicating truncates of the same table. While applying the truncate change, the logical apply worker acquires RowExclusiveLock on the relation being truncated. This allowed truncate on the relation at a time by two apply workers which lead to a deadlock. The reason was that one of the workers after updating the pg_class tuple tries to acquire SHARE lock on the relation and started to wait for the second worker which has acquired RowExclusiveLock on the relation. And when the second worker tries to update the pg_class tuple, it starts to wait for the first worker which leads to a deadlock. Fix it by acquiring AccessExclusiveLock on the relation before applying the truncate change as we do for normal truncate operation. Author: Peter Smith, test case by Haiying Tang Reviewed-by: Dilip Kumar, Amit Kapila Backpatch-through: 11 Discussion: https://postgr.es/m/CAHut+PsNm43p0jM+idTvWwiGZPcP0hGrHMPK9TOAkc+a4UpUqw@mail.gmail.com	2021-05-21 07:54:27 +05:30
Tom Lane	f21fadafaf	Avoid detoasting failure after COMMIT inside a plpgsql FOR loop. exec_for_query() normally tries to prefetch a few rows at a time from the query being iterated over, so as to reduce executor entry/exit overhead. Unfortunately this is unsafe if we have COMMIT or ROLLBACK within the loop, because there might be TOAST references in the data that we prefetched but haven't yet examined. Immediately after the COMMIT/ROLLBACK, we have no snapshots in the session, meaning that VACUUM is at liberty to remove recently-deleted TOAST rows. This was originally reported as a case triggering the "no known snapshots" error in init_toast_snapshot(), but even if you miss hitting that, you can get "missing toast chunk", as illustrated by the added isolation test case. To fix, just disable prefetching in non-atomic contexts. Maybe there will be performance complaints prompting us to work harder later, but it's not clear at the moment that this really costs much, and I doubt we'd want to back-patch any complicated fix. In passing, adjust that error message in init_toast_snapshot() to be a little clearer about the likely cause of the problem. Patch by me, based on earlier investigation by Konstantin Knizhnik. Per bug #15990 from Andreas Wicht. Back-patch to v11 where intra-procedure COMMIT was added. Discussion: https://postgr.es/m/15990-eee2ac466b11293d@postgresql.org	2021-05-20 18:32:37 -04:00
Andrew Dunstan	bdbb2ce7d5	Install PostgresVersion.pm A lamentable oversight on my part meant that when PostgresVersion.pm was added in commit `4c4eaf3d19` provision to install it was not added to the Makefile, so it was not installed along with the other perl modules.	2021-05-20 15:11:17 -04:00
Andrew Dunstan	8bdd6f563a	Use a more portable way to get the version string in PostgresNode Older versions of perl on Windows don't like the list form of pipe open, and perlcritic doesn't like the string form of open, so we avoid both with a simpler formulation using qx{}. Per complaint from Amit Kapila.	2021-05-20 08:07:08 -04:00
Tom Lane	413c1ef98e	Avoid creating testtablespace directories where not wanted. Recently we refactored things so that pg_regress makes the "testtablespace" subdirectory used by the core regression tests, instead of doing that in the makefiles. That had the undesirable side effect of making such a subdirectory in every directory that has "input" or "output" test files. Since these subdirectories remain empty, git doesn't complain about them, but nonetheless they're clutter. To fix, invent an explicit --make-testtablespace-dir switch, so that pg_regress only makes the subdirectory when explicitly told to. Discussion: https://postgr.es/m/2854388.1621284789@sss.pgh.pa.us	2021-05-19 14:04:01 -04:00
Amit Kapila	0a442a408b	Fix 020_messages.pl test. We were not waiting for a publisher to catch up with the subscriber after creating a subscription. Now, it can happen that apply worker starts replication even after we have disabled the subscription in the test. This will make the test expect that there is no active slot whereas there exists one. Fix this symptom by allowing the publisher to wait for catching up with the subscription. It is not a good idea to ensure if the slot is still active by checking for walsender existence as we release the slot after we clean up the walsender related memory. Fix that by checking the slot status in pg_replication_slots. Also, it is better to avoid repeated enabling/disabling of the subscription. Finally, we make autovacuum off for this test to avoid any empty transaction appearing in the test while consuming changes. Reported-by: as per buildfarm Author: Vignesh C Reviewed-by: Amit Kapila, Michael Paquier Discussion: https://postgr.es/m/CAA4eK1+uW1UGDHDz-HWMHMen76mKP7NJebOTZN4uwbyMjaYVww@mail.gmail.com	2021-05-19 08:54:46 +05:30
Tom Lane	c3c35a733c	Prevent infinite insertion loops in spgdoinsert(). Formerly we just relied on operator classes that assert longValuesOK to eventually shorten the leaf value enough to fit on an index page. That fails since the introduction of INCLUDE-column support (commit `09c1c6ab4`), because the INCLUDE columns might alone take up more than a page, meaning no amount of leaf-datum compaction will get the job done. At least with spgtextproc.c, that leads to an infinite loop, since spgtextproc.c won't throw an error for not being able to shorten the leaf datum anymore. To fix without breaking cases that would otherwise work, add logic to spgdoinsert() to verify that the leaf tuple size is decreasing after each "choose" step. Some opclasses might not decrease the size on every single cycle, and in any case, alignment roundoff of the tuple size could obscure small gains. Therefore, allow up to 10 cycles without additional savings before throwing an error. (Perhaps this number will need adjustment, but it seems quite generous right now.) As long as we've developed this logic, let's back-patch it. The back branches don't have INCLUDE columns to worry about, but this seems like a good defense against possible bugs in operator classes. We already know that an infinite loop here is pretty unpleasant, so having a defense seems to outweigh the risk of breaking things. (Note that spgtextproc.c is actually the only known opclass with longValuesOK support, so that this is all moot for known non-core opclasses anyway.) Per report from Dilip Kumar. Discussion: https://postgr.es/m/CAFiTN-uxP_soPhVG840tRMQTBmtA_f_Y8N51G7DKYYqDh7XN-A@mail.gmail.com	2021-05-14 15:07:34 -04:00
Tom Lane	def5b065ff	Initial pgindent and pgperltidy run for v14. Also "make reformat-dat-files". The only change worthy of note is that pgindent messed up the formatting of launcher.c's struct LogicalRepWorkerId, which led me to notice that that struct wasn't used at all anymore, so I just took it out.	2021-05-12 13:14:10 -04:00
Tom Lane	e135743ef0	Reduce runtime of privileges.sql test under CLOBBER_CACHE_ALWAYS. Several queries in the privileges regression test cause the planner to apply the plpgsql function "leak()" to every element of the histogram for atest12.b. Since commit `0c882e52a` increased the size of that histogram to 10000 entries, the test invokes that function over 100000 times, which takes an absolutely unreasonable amount of time in clobber-cache-always mode. However, there's no real reason why that has to be a plpgsql function: for the purposes of this test, all that matters is that it not be marked leakproof. So we can replace the plpgsql implementation with a direct call of int4lt, which has the same behavior and is orders of magnitude faster. This is expected to cut several hours off the buildfarm cycle time for CCA animals. It has some positive impact in normal builds too, though that's probably lost in the noise. Back-patch to v13 where `0c882e52a` came in. Discussion: https://postgr.es/m/575884.1620626638@sss.pgh.pa.us	2021-05-11 20:59:58 -04:00
Tom Lane	1df3555acc	Get rid of the separate serial_schedule list of tests. Having to maintain two lists of regression test scripts has proven annoyingly error-prone. We can achieve the effect of the serial_schedule by running the parallel_schedule with "--max_connections=1"; so do that and remove serial_schedule. This causes cosmetic differences in the progress output, but it doesn't seem worth restructuring pg_regress to avoid that. Discussion: https://postgr.es/m/899209.1620759506@sss.pgh.pa.us	2021-05-11 17:52:13 -04:00
Tom Lane	6303a57309	Replace opr_sanity test's binary_coercible() function with C code. opr_sanity's binary_coercible() function has always been meant to match the parser's notion of binary coercibility, but it also has always been a rather poor approximation of the parser's real rules (as embodied in IsBinaryCoercible()). That hasn't bit us so far, but it's predictable that it will eventually. It also now emerges that implementing this check in plpgsql performs absolutely horribly in clobber-cache-always testing. (Perhaps we could do something about that, but I suspect it just means that plpgsql is exploiting catalog caching to the hilt.) Hence, let's replace binary_coercible() with a C shim that directly invokes IsBinaryCoercible(), eliminating both the semantic hazard and the performance issue. Most of regress.c's C functions are declared in create_function_1, but we can't simply move that to before opr_sanity/type_sanity since those tests would complain about the resulting shell types. I chose to split it into create_function_0 and create_function_1. Since create_function_0 now runs as part of a parallel group while create_function_1 doesn't, reduce the latter to create just those functions that opr_sanity and type_sanity would whine about. To make room for create_function_0 in the second parallel group of tests, move tstypes to the third parallel group. In passing, clean up some ordering deviations between parallel_schedule and serial_schedule. Discussion: https://postgr.es/m/292305.1620503097@sss.pgh.pa.us	2021-05-11 14:28:11 -04:00
Peter Eisentraut	6d177e2813	Fix typo	2021-05-11 09:06:49 +02:00
Tom Lane	049e1e2edb	Fix mishandling of resjunk columns in ON CONFLICT ... UPDATE tlists. It's unusual to have any resjunk columns in an ON CONFLICT ... UPDATE list, but it can happen when MULTIEXPR_SUBLINK SubPlans are present. If it happens, the ON CONFLICT UPDATE code path would end up storing tuples that include the values of the extra resjunk columns. That's fairly harmless in the short run, but if new columns are added to the table then the values would become accessible, possibly leading to malfunctions if they don't match the datatypes of the new columns. This had escaped notice through a confluence of missing sanity checks, including * There's no cross-check that a tuple presented to heap_insert or heap_update matches the table rowtype. While it's difficult to check that fully at reasonable cost, we can easily add assertions that there aren't too many columns. * The output-column-assignment cases in execExprInterp.c lacked any sanity checks on the output column numbers, which seems like an oversight considering there are plenty of assertion checks on input column numbers. Add assertions there too. * We failed to apply nodeModifyTable's ExecCheckPlanOutput() to the ON CONFLICT UPDATE tlist. That wouldn't have caught this specific error, since that function is chartered to ignore resjunk columns; but it sure seems like a bad omission now that we've seen this bug. In HEAD, the right way to fix this is to make the processing of ON CONFLICT UPDATE tlists work the same as regular UPDATE tlists now do, that is don't add "SET x = x" entries, and use ExecBuildUpdateProjection to evaluate the tlist and combine it with old values of the not-set columns. This adds a little complication to ExecBuildUpdateProjection, but allows removal of a comparable amount of now-dead code from the planner. In the back branches, the most expedient solution seems to be to (a) use an output slot for the ON CONFLICT UPDATE projection that actually matches the target table, and then (b) invent a variant of ExecBuildProjectionInfo that can be told to not store values resulting from resjunk columns, so it doesn't try to store into nonexistent columns of the output slot. (We can't simply ignore the resjunk columns altogether; they have to be evaluated for MULTIEXPR_SUBLINK to work.) This works back to v10. In 9.6, projections work much differently and we can't cheaply give them such an option. The 9.6 version of this patch works by inserting a JunkFilter when it's necessary to get rid of resjunk columns. In addition, v11 and up have the reverse problem when trying to perform ON CONFLICT UPDATE on a partitioned table. Through a further oversight, adjust_partition_tlist() discarded resjunk columns when re-ordering the ON CONFLICT UPDATE tlist to match a partition. This accidentally prevented the storing-bogus-tuples problem, but at the cost that MULTIEXPR_SUBLINK cases didn't work, typically crashing if more than one row has to be updated. Fix by preserving resjunk columns in that routine. (I failed to resist the temptation to add more assertions there too, and to do some minor code beautification.) Per report from Andres Freund. Back-patch to all supported branches. Security: CVE-2021-32028	2021-05-10 11:02:29 -04:00

1 2 3 4 5 ...

5916 commits