postgresql

mirror of https://github.com/postgres/postgres.git synced 2026-04-29 10:11:47 -04:00

Author	SHA1	Message	Date
Tom Lane	d67354d870	Fix limitations on what SQL commands can be issued to a walsender. In logical replication mode, a WalSender is supposed to be able to execute any regular SQL command, as well as the special replication commands. Poor design of the replication-command parser caused it to fail in various cases, notably: * semicolons embedded in a command, or multiple SQL commands sent in a single message; * dollar-quoted literals containing odd numbers of single or double quote marks; * commands starting with a comment. The basic problem here is that we're trying to run repl_scanner.l across the entire input string even when it's not a replication command. Since repl_scanner.l does not understand all of the token types known to the core lexer, this is doomed to have failure modes. We certainly don't want to make repl_scanner.l as big as scan.l, so instead rejigger stuff so that we only lex the first token of a non-replication command. That will usually look like an IDENT to repl_scanner.l, though a comment would end up getting reported as a '-' or '/' single-character token. If the token is a replication command keyword, we push it back and proceed normally with repl_gram.y parsing. Otherwise, we can drop out of exec_replication_command() without examining the rest of the string. (It's still theoretically possible for repl_scanner.l to fail on the first token; but that could only happen if it's an unterminated single- or double-quoted string, in which case you'd have gotten largely the same error from the core lexer too.) In this way, repl_gram.y isn't involved at all in handling general SQL commands, so we can get rid of the SQLCmd node type. (In the back branches, we can't remove it because renumbering enum NodeTag would be an ABI break; so just leave it sit there unused.) I failed to resist the temptation to clean up some other sloppy coding in repl_scanner.l while at it. The only externally-visible behavior change from that is it now accepts \r and \f as whitespace, same as the core lexer. Per bug #17379 from Greg Rychlewski. Back-patch to all supported branches. Discussion: https://postgr.es/m/17379-6a5c6cfb3f1f5e77@postgresql.org	2022-01-24 15:33:34 -05:00
Tom Lane	c94c6612da	Remember to reset yy_start state when firing up repl_scanner.l. Without this, we get odd behavior when the previous cycle of lexing exited in a non-default exclusive state. Every other copy of this code is aware that it has to do BEGIN(INITIAL), but repl_scanner.l did not get that memo. The real-world impact of this is probably limited, since most replication clients would abandon their connection after getting a syntax error. Still, it's a bug. This mistake is old, so back-patch to all supported branches. Discussion: https://postgr.es/m/1874781.1643035952@sss.pgh.pa.us	2022-01-24 12:09:46 -05:00
Tom Lane	0cbc507378	Suppress variable-set-but-not-used warning from clang 13. In the normal configuration where GEQO_DEBUG isn't defined, recent clang versions have started to complain that geqo_main.c accumulates the edge_failures count but never does anything with it. As a minimal back-patchable fix, insert a void cast to silence this warning. (I'd speculated about ripping out the GEQO_DEBUG logic altogether, but I don't think we'd wish to back-patch that.) Per recently-established project policy, this is a candidate for back-patching into out-of-support branches: it suppresses an annoying compiler warning but changes no behavior. Hence, back-patch all the way to 9.2. Discussion: https://postgr.es/m/CA+hUKGLTSZQwES8VNPmWO9AO0wSeLt36OCPDAZTccT1h7Q7kTQ@mail.gmail.com	2022-01-23 11:09:25 -05:00
Tomas Vondra	4d8c4d2061	Correct type of front_pathkey to PathKey In sort_inner_and_outer we iterate a list of PathKey elements, but the variable is declared as (List *). This mistake is benign, because we only pass the pointer to lcons() and never dereference it. This exists since ~2004, but it's confusing. So fix and backpatch to all supported branches. Backpatch-through: 10 Discussion: https://postgr.es/m/bf3a6ea1-a7d8-7211-0669-189d5c169374%40enterprisedb.com	2022-01-23 04:02:22 +01:00
Tomas Vondra	267ccc38ba	Check syscache result in AlterStatistics The syscache lookup may return NULL even for valid OID, for example due to a concurrent DROP STATISTICS, so a HeapTupleIsValid is necessary. Without it, it may fail with a segfault. Reported by Alexander Lakhin, patch by me. Backpatch to 13, where ALTER STATISTICS ... SET STATISTICS was introduced. Backpatch-through: 13 Discussion: https://postgr.es/m/17372-bf3b6e947e35ae77%40postgresql.org	2022-01-23 03:20:32 +01:00
Tom Lane	31b7b4d26e	Flush table's relcache during ALTER TABLE ADD PRIMARY KEY USING INDEX. Previously, unless we had to add a NOT NULL constraint to the column, this command resulted in updating only the index's relcache entry. That's problematic when replication behavior is being driven off the existence of a primary key: other sessions (and ours too for that matter) failed to recalculate their opinion of whether the table can be replicated. Add a relcache invalidation to fix it. This has been broken since pg_class.relhaspkey was removed in v11. Before that, updating the table's relhaspkey value sufficed to cause a cache flush. Hence, backpatch to v11. Report and patch by Hou Zhijie Discussion: https://postgr.es/m/OS0PR01MB5716EBE01F112C62F8F9B786947B9@OS0PR01MB5716.jpnprd01.prod.outlook.com	2022-01-22 13:32:40 -05:00
Tom Lane	64ebb43df0	Fix race condition in gettext() initialization in libpq and ecpglib. In libpq and ecpglib, multiple threads can concurrently enter the initialization logic for message localization. Since we set the its-done flag before actually doing the work, it'd be possible for some threads to reach gettext() before anyone has called bindtextdomain(). Barring bugs in libintl itself, this would not result in anything worse than failure to localize some early messages. Nonetheless, it's a bug, and an easy one to fix. Noted while investigating bug #17299 from Clemens Zeidler (much thanks to Liam Bowen for followup investigation on that). It currently appears that that actually is a bug in libintl itself, but that doesn't let us off the hook for this bit. Back-patch to all supported versions. Discussion: https://postgr.es/m/17299-7270741958c0b1ab@postgresql.org Discussion: https://postgr.es/m/CAE7q7Eit4Eq2=bxce=Fm8HAStECjaXUE=WBQc-sDDcgJQ7s7eg@mail.gmail.com	2022-01-21 15:36:28 -05:00
Andres Freund	fd48e5f5d3	fsync pg_logical/mappings in CheckPointLogicalRewriteHeap(). While individual logical rewrite files were synced to disk, the directory was not. On some filesystems that could lead to loosing directory entries after a crash. Reported-By: Tom Lane <tgl@sss.pgh.pa.us> Author: Nathan Bossart <bossartn@amazon.com> Discussion: https://postgr.es/m/867F2E29-2782-4869-970E-B984C6D35A8F@amazon.com Backpatch: 10-	2022-01-21 11:24:12 -08:00
Michael Paquier	b5f634116e	Fix one-off bug causing missing commit timestamps for subtransactions The logic in charge of writing commit timestamps (enabled with track_commit_timestamp) for subtransactions had a one-bug bug, where it would be possible that commit timestamps go missing for the last subtransaction committed. While on it, simplify a bit the iteration logic in the loop writing the commit timestamps, as per suggestions from Kyotaro Horiguchi and Tom Lane, so as some variable initializations are not part of the loop itself. Issue introduced in `73c986a`. Analyzed-by: Alex Kingsborough Author: Alex Kingsborough, Kyotaro Horiguchi Discussion: https://postgr.es/m/73A66172-4050-4F2A-B7F1-13508EDA2144@amazon.com Backpatch-through: 10	2022-01-21 14:54:51 +09:00
Tom Lane	4828a80069	Tighten TAP tests' tracking of postmaster state some more. Commits `6c4a8903b` et al. had a couple of deficiencies: * The logic I added to Cluster::start to see if a PID file is present could be fooled by a stale PID file left over from a previous postmaster. To fix, if we're not sure whether we expect to find a running postmaster or not, validate the PID using "kill 0". * 017_shm.pl has a loop in which it just issues repeated Cluster::start calls; this will fail if some invocation fails but leaves self->_pid set. Per buildfarm results, the above fix is not enough to make this safe: we might have "validated" a PID for a postmaster that exits immediately after we look. Hence, match each failed start call with a stop call that will get us back to the self->_pid == undef state. Add a fail_ok option to Cluster::stop to make this work. Discussion: https://postgr.es/m/CA+hUKGKV6fOHvfiPt8=dOKzvswjAyLoFoJF1iQXMNpi7+hD1JQ@mail.gmail.com	2022-01-20 17:28:07 -05:00
Andrew Dunstan	31680730ef	Allow clean.bat to be run from anywhere This was omitted from `c3879a7b4c` which modified the other msvc .bat files. Per request from Juan José Santamaría Flecha Discussion: https://postgr.es/m/CAC+AXB0_fxYGbQoaYjCA8um7TTbOVP4L9aXnVmHwK8WzaT4gdA@mail.gmail.com Backpatch to all live branches.	2022-01-20 10:20:51 -05:00
Thomas Munro	8e8e253a7b	Try to stabilize reloptions test, again. Since the test requires reproducible behavior from VACUUM, and since DISABLE_PAGE_SKIPPING doesn't actually disable all forms of page skipping, let's use a temporary table to avoid contention. Back-patch to 12, like commit `3414099c`. Discussion: https://postgr.es/m/20220120052404.sonrhq3f3qgplpzj%40alap3.anarazel.de	2022-01-20 23:31:01 +13:00
Tom Lane	6eec809fbc	TAP tests: check for postmaster.pid anyway when "pg_ctl start" fails. "pg_ctl start" might start a new postmaster and then return failure anyway, for example if PGCTLTIMEOUT is exceeded. If there is a postmaster there, it's still incumbent on us to shut it down at script end, so check for the PID file even though we are about to fail. This has been broken all along, so back-patch to all supported branches. Discussion: https://postgr.es/m/647439.1642622744@sss.pgh.pa.us	2022-01-19 16:29:09 -05:00
Thomas Munro	639a9293d4	Try to stabilize the reloptions test. Where we test vacuum_truncate's effects, sometimes this is failing to truncate as expected on the build farm. That could be explained by page skipping, so disable it explicitly, with the theory that commit `fe246d1c` didn't go far enough. Back-patch to 12, where the vacuum_truncate tests were added. Discussion: https://postgr.es/m/CA%2BhUKGLT2UL5_JhmBzUgkdyKfc%3D5J-gJSQJLysMs4rqLUKLAzw%40mail.gmail.com	2022-01-19 07:34:38 +13:00
Tom Lane	90e0f9fd8c	Fix psql \d's query for identifying parent triggers. The original coding (from `c33869cc3`) failed with "more than one row returned by a subquery used as an expression" if there were unrelated triggers of the same tgname on parent partitioned tables. (That's possible because statement-level triggers don't get inherited.) Fix by applying LIMIT 1 after sorting the candidates by inheritance level. Also, wrap the subquery in a CASE so that we don't have to execute it at all when the trigger is visibly non-inherited. Aside from saving some cycles, this avoids the need for a confusing and undocumented NULLIF(). While here, tweak the format of the emitted query to look a bit nicer for "psql -E", and add some explanation of this subquery, because it badly needs it. Report and patch by Justin Pryzby (with some editing by me). Back-patch to v13 where the faulty code came in. Discussion: https://postgr.es/m/20211217154356.GJ17618@telsasoft.com	2022-01-17 21:18:49 -05:00
Tom Lane	d18ec312f9	Avoid calling gettext() in signal handlers. It seems highly unlikely that gettext() can be relied on to be async-signal-safe. psql used to understand that, but someone got it wrong long ago in the src/bin/scripts/ version of handle_sigint, and then the bad idea was perpetuated when those two versions were unified into src/fe_utils/cancel.c. I'm unsure why there have not been field complaints about this ... maybe gettext() is signal-safe once it's translated at least one message? But we have no business assuming any such thing. In cancel.c (v13 and up), I preserved our ability to localize "Cancel request sent" messages by invoking gettext() before the signal handler is set up. In earlier branches I just made src/bin/scripts/ not localize those messages, as psql did then. (Just for extra unsafety, the src/bin/scripts/ version was invoking fprintf() from a signal handler. Sigh.) Noted while fixing signal-safety issues in PQcancel() itself. Back-patch to all supported branches. Discussion: https://postgr.es/m/2937814.1641960929@sss.pgh.pa.us	2022-01-17 13:30:04 -05:00
Tom Lane	f27af7b880	Avoid calling strerror[_r] in PQcancel(). PQcancel() is supposed to be safe to call from a signal handler, and indeed psql uses it that way. All of the library functions it uses are specified to be async-signal-safe by POSIX ... except for strerror. Neither plain strerror nor strerror_r are considered safe. When this code was written, back in the dark ages, we probably figured "oh, strerror will just index into a constant array of strings" ... but in any locale except C, that's unlikely to be true. Probably the reason we've not heard complaints is that (a) this error-handling code is unlikely to be reached in normal use, and (b) in many scenarios, localized error strings would already have been loaded, after which maybe it's safe to call strerror here. Still, this is clearly unacceptable. The best we can do without relying on strerror is to print the decimal value of errno, so make it do that instead. (This is probably not much loss of user-friendliness, given that it is hard to get a failure here.) Back-patch to all supported branches. Discussion: https://postgr.es/m/2937814.1641960929@sss.pgh.pa.us	2022-01-17 12:52:44 -05:00
Tom Lane	90a847e6dc	Fix psql's tab-completion of enum label values. Since enum labels have to be single-quoted, this part of the tab completion machinery got side-swiped by commit `cd69ec66c`. A side-effect of that commit is that (at least with some versions of Readline) the text string passed for completion will omit the leading quote mark of the enum label literal. Libedit still acts the same as before, though, so adapt COMPLETE_WITH_ENUM_VALUE so that it can cope with either convention. Also, when we fail to find any valid completion, set rl_completion_suppress_quote = 1. Otherwise readline will go ahead and append a closing quote, which is unwanted. Per report from Peter Eisentraut. Back-patch to v13 where `cd69ec66c` came in. Discussion: https://postgr.es/m/8ca82d89-ec3d-8b28-8291-500efaf23b25@enterprisedb.com	2022-01-16 14:59:20 -05:00
Tomas Vondra	d6817032d2	Build inherited extended stats on partitioned tables Commit `859b3003de` disabled building of extended stats for inheritance trees, to prevent updating the same catalog row twice. While that resolved the issue, it also means there are no extended stats for declaratively partitioned tables, because there are no data in the non-leaf relations. That also means declaratively partitioned tables were not affected by the issue `859b3003de` addressed, which means this is a regression affecting queries that calculate estimates for the whole inheritance tree as a whole (which includes e.g. GROUP BY queries). But because partitioned tables are empty, we can invert the condition and build statistics only for the case with inheritance, without losing anything. And we can consider them when calculating estimates. It may be necessary to run ANALYZE on partitioned tables, to collect proper statistics. For declarative partitioning there should no prior statistics, and it might take time before autoanalyze is triggered. For tables partitioned by inheritance the statistics may include data from child relations (if built `859b3003de`), contradicting the current code. Report and patch by Justin Pryzby, minor fixes and cleanup by me. Backpatch all the way back to PostgreSQL 10, where extended statistics were introduced (same as `859b3003de`). Author: Justin Pryzby Reported-by: Justin Pryzby Backpatch-through: 10 Discussion: https://postgr.es/m/20210923212624.GI831%40telsasoft.com	2022-01-15 19:14:00 +01:00
Tomas Vondra	acfde7c583	Ignore extended statistics for inheritance trees Since commit `859b3003de` we only build extended statistics for individual relations, ignoring the child relations. This resolved the issue with updating catalog tuple twice, but we still tried to use the statistics when calculating estimates for the whole inheritance tree. When the relations contain very distinct data, it may produce bogus estimates. This is roughly the same issue `427c6b5b9` addressed ~15 years ago, and we fix it the same way - by ignoring extended statistics when calculating estimates for the inheritance tree as a whole. We still consider extended statistics when calculating estimates for individual child relations, of course. This may result in plan changes due to different estimates, but if the old statistics were not describing the inheritance tree particularly well it's quite likely the new plans is actually better. Report and patch by Justin Pryzby, minor fixes and cleanup by me. Backpatch all the way back to PostgreSQL 10, where extended statistics were introduced (same as `859b3003de`). Author: Justin Pryzby Reported-by: Justin Pryzby Backpatch-through: 10 Discussion: https://postgr.es/m/20210923212624.GI831%40telsasoft.com	2022-01-15 02:30:06 +01:00
Tom Lane	ca14c4184b	Fix ruleutils.c's dumping of whole-row Vars in more contexts. Commit `7745bc352` intended to ensure that whole-row Vars would be printed with "::type" decoration in all contexts where plain "var.*" notation would result in star-expansion, notably in ROW() and VALUES() constructs. However, it missed the case of INSERT with a single-row VALUES, as reported by Timur Khanjanov. Nosing around ruleutils.c, I found a second oversight: the code for RowCompareExpr generates ROW() notation without benefit of an actual RowExpr, and naturally it wasn't in sync :-(. (The code for FieldStore also does this, but we don't expect that to generate strictly parsable SQL anyway, so I left it alone.) Back-patch to all supported branches. Discussion: https://postgr.es/m/efaba6f9-4190-56be-8ff2-7a1674f9194f@intrans.baku.az	2022-01-13 17:49:26 -05:00
Andrew Dunstan	32cd4264cc	Avoid warning about uninitialized value in MSVC python3 tests Juan José Santamaría Flecha Backpatch to all live branches	2022-01-10 10:12:31 -05:00
Andrew Dunstan	f3ded9c460	Allow MSVC .bat wrappers to be called from anywhere Instead of using a hardcoded or default path to the perl file the .bat file is a wrapper for, we use a path that means the file is found in the same directory as the .bat file. Patch by Anton Voloshin, slightly tweaked by me. Backpatch to all live branches Discussion: https://postgr.es/m/2b7a674b-5fb0-d264-75ef-ecc7a31e54f8@postgrespro.ru	2022-01-07 16:14:16 -05:00
Tom Lane	86d4bbb56a	Prevent altering partitioned table's rowtype, if it's used elsewhere. We disallow altering a column datatype within a regular table, if the table's rowtype is used as a column type elsewhere, because we lack code to go around and rewrite the other tables. This restriction should apply to partitioned tables as well, but it was not checked because ATRewriteTables and ATPrepAlterColumnType were not on the same page about who should do it for which relkinds. Per bug #17351 from Alexander Lakhin. Back-patch to all supported branches. Discussion: https://postgr.es/m/17351-6db1870f3f4f612a@postgresql.org	2022-01-06 16:46:46 -05:00
Michael Paquier	3f8062bcf7	Reduce relcache access in WAL sender streaming logical changes get_rel_sync_entry(), which is called each time a change needs to be logically replicated, is a rather hot code path in the WAL sender sending logical changes. This code path was doing a relcache access on relkind and relpartition for each logical change, but we only need to know this information when building or re-building the cached information for a relation. Some measurements prove that this is noticeable in perf profiles, particularly when attempting to replicate changes from relations that are not published as these cause less overhead in the WAL sender, delaying further the replication of changes for relations that are published. Issue introduced in `83fd453`. Author: Hou Zhijie Reviewed-by: Kyotaro Horiguchi, Euler Taveira Discussion: https://postgr.es/m/OS0PR01MB5716E863AA9E591C1F010F7A947D9@OS0PR01MB5716.jpnprd01.prod.outlook.com Backpatch-through: 13	2022-01-05 10:27:53 +09:00
Alvaro Herrera	33fdd9f854	Fix silly mistake in Assert	2022-01-04 13:21:23 -03:00
Alvaro Herrera	29f9fb8fe8	Allow special SKIP LOCKED condition in Assert() Under concurrency, it is possible for two sessions to be merrily locking and releasing a tuple and marking it again as HEAP_XMAX_INVALID all the while a third session attempts to lock it, miserably fails at it, and then contemplates life, the universe and everything only to eventually fail an assertion that said bit is not set. Before SKIP LOCKED that was indeed a reasonable expectation, but alas! commit `df630b0dd5` falsified it. This bug is as old as time itself, and even older, if you think time begins with the oldest supported branch. Therefore, backpatch to all supported branches. Author: Simon Riggs <simon.riggs@enterprisedb.com> Discussion: https://postgr.es/m/CANbhV-FeEwMnN8yuMyss7if1ZKjOKfjcgqB26n8pqu1e=q0ebg@mail.gmail.com	2022-01-04 13:01:05 -03:00
Tom Lane	20d08b2c61	Fix index-only scan plans, take 2. Commit `4ace45677` failed to fix the problem fully, because the same issue of attempting to fetch a non-returnable index column can occur when rechecking the indexqual after using a lossy index operator. Moreover, it broke EXPLAIN for such indexquals (which indicates a gap in our test cases :-(). Revert the code changes of `4ace45677` in favor of adding a new field to struct IndexOnlyScan, containing a version of the indexqual that can be executed against the index-returned tuple without using any non-returnable columns. (The restrictions imposed by check_index_only guarantee this is possible, although we may have to recompute indexed expressions.) Support construction of that during setrefs.c processing by marking IndexOnlyScan.indextlist entries as resjunk if they can't be returned, rather than removing them entirely. (We could alternatively require setrefs.c to look up the IndexOptInfo again, but abusing resjunk this way seems like a reasonably safe way to avoid needing to do that.) This solution isn't great from an API-stability standpoint: if there are any extensions out there that build IndexOnlyScan structs directly, they'll be broken in the next minor releases. However, only a very invasive extension would be likely to do such a thing. There's no change in the Path representation, so typical planner extensions shouldn't have a problem. As before, back-patch to all supported branches. Discussion: https://postgr.es/m/3179992.1641150853@sss.pgh.pa.us Discussion: https://postgr.es/m/17350-b5bdcf476e5badbb@postgresql.org	2022-01-03 15:42:27 -05:00
Tom Lane	45ae427141	Fix index-only scan plans when not all index columns can be returned. If an index has both returnable and non-returnable columns, and one of the non-returnable columns is an expression using a Var that is in a returnable column, then a query returning that expression could result in an index-only scan plan that attempts to read the non-returnable column, instead of recomputing the expression from the returnable column as intended. To fix, redefine the "indextlist" list of an IndexOnlyScan plan node as containing null Consts in place of any non-returnable columns. This solves the problem by preventing setrefs.c from falsely matching to such entries. The executor is happy since it only cares about the exposed types of the entries, and ruleutils.c doesn't care because a correct plan won't reference those entries. I considered some other ways to prevent setrefs.c from doing the wrong thing, but this way seems good since (a) it allows a very localized fix, (b) it makes the indextlist structure more compact in many cases, and (c) the indextlist is now a more faithful representation of what the index AM will actually produce, viz. nulls for any non-returnable columns. This is easier to hit since we introduced included columns, but it's possible to construct failing examples without that, as per the added regression test. Hence, back-patch to all supported branches. Per bug #17350 from Louis Jachiet. Discussion: https://postgr.es/m/17350-b5bdcf476e5badbb@postgresql.org	2022-01-01 16:12:03 -05:00
Thomas Munro	cadd98cd30	Fix overly generic name in with.sql test. Avoid the name "test". In the 10 branch, this could clash with alter_table.sql, as seen in the build farm. That other instance was already renamed in later branches by commit `2cf8c7aa`, but it's good to future-proof the name here too. Back-patch to 10. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CA%2BhUKGJf4RAXUyAYVUcQawcptX%3DnhEco3SYpuPK5cCbA-F1eLA%40mail.gmail.com	2021-12-30 17:11:20 +13:00
Michael Paquier	28e1e5c2a9	Correct comment and some documentation about REPLICA_IDENTITY_INDEX catalog/pg_class.h was stating that REPLICA_IDENTITY_INDEX with a dropped index is equivalent to REPLICA_IDENTITY_DEFAULT. The code tells a different story, as it is equivalent to REPLICA_IDENTITY_NOTHING. The behavior exists since the introduction of replica identities, and `fe7fd4e` even added tests for this case but I somewhat forgot to fix this comment. While on it, this commit reorganizes the documentation about replica identities on the ALTER TABLE page, and a note is added about the case of dropped indexes with REPLICA_IDENTITY_INDEX. Author: Michael Paquier, Wei Wang Reviewed-by: Euler Taveira Discussion: https://postgr.es/m/OS3PR01MB6275464AD0A681A0793F56879E759@OS3PR01MB6275.jpnprd01.prod.outlook.com Backpatch-through: 10	2021-12-22 16:38:42 +09:00
Tom Lane	da0d8a4545	Ensure casting to typmod -1 generates a RelabelType. Fix the code changed by commit `5c056b0c2` so that we always generate RelabelType, not something else, for a cast to unspecified typmod. Otherwise planner optimizations might not happen. It appears we missed this point because the previous experiments were done on type numeric: the parser undesirably generates a call on the numeric() length-coercion function, but then numeric_support() optimizes that down to a RelabelType, so that everything seems fine. It misbehaves for types that have a non-optimized length coercion function, such as bpchar. Per report from John Naylor. Back-patch to all supported branches, as the previous patch eventually was. Unfortunately, that no longer includes 9.6 ... we really shouldn't put this type of change into a nearly-EOL branch. Discussion: https://postgr.es/m/CAFBsxsEfbFHEkouc+FSj+3K1sHipLPbEC67L0SAe-9-da8QtYg@mail.gmail.com	2021-12-16 15:36:02 -05:00
Michael Paquier	aadbf825b7	Adjust behavior of some env settings for the TAP tests of MSVC `edc2332` has introduced in vcregress.pl some control on the environment variables LZ4, TAR and GZIP_PROGRAM to allow any TAP tests to be able use those commands. This makes the settings more consistent with src/Makefile.global.in, as the same default gets used for Make and MSVC builds. Each parameter can be changed in buildenv.pl, but as a default gets assigned after loading buldenv.pl, it is not possible to unset any of these, and using an empty value would not work with "\|\|=" either. As some environments may not have a compatible command in their PATH (tar coming from MinGW is an issue, for one), this could break tests without an exit path to bypass any failing test. This commit changes things so as the default values for LZ4, TAR and GZIP_PROGRAM are assigned before loading buildenv.pl, not after. This way, we keep the same amount of compatibility as a GNU build with the same defaults, and it becomes possible to unset any of those values. While on it, this adds some documentation about those three variables in the section dedicated to the TAP tests for MSVC. Per discussion with Andrew Dunstan. Discussion: https://postgr.es/m/YbGYe483803il3X7@paquier.xyz Backpatch-through: 10	2021-12-15 10:40:12 +09:00
Tom Lane	0a5682041d	Fix datatype confusion in logtape.c's right_offset(). This could only matter if (a) long is wider than int, and (b) the heap of free blocks exceeds UINT_MAX entries, which seems pretty unlikely. Still, it's a theoretical bug, so backpatch to v13 where the typo came in (in commit `c02fdc922`). In passing, also make swap_nodes() use consistent datatypes. Ma Liangzhu Discussion: https://postgr.es/m/17336-fc4e522d26a750fd@postgresql.org	2021-12-14 11:46:48 -05:00
Michael Paquier	3f710fc2b4	Remove assertion for replication origins in PREPARE TRANSACTION When using replication origins, pg_replication_origin_xact_setup() is an optional choice to be able to set a LSN and a timestamp to mark the origin, which would be additionally added to WAL for transaction commits or aborts (including 2PC transactions). An assertion in the code path of PREPARE TRANSACTION assumed that this data should always be set, so it would trigger when using replication origins without setting up an origin LSN. Some tests are added to cover more this kind of scenario. Oversight in commit `1eb6d65`. Per discussion with Amit Kapila and Masahiko Sawada. Discussion: https://postgr.es/m/YbbBfNSvMm5nIINV@paquier.xyz Backpatch-through: 11	2021-12-14 10:58:29 +09:00
Andres Freund	65f860e78a	isolationtester: append session name to application_name. When writing / debugging an isolation test it sometimes is useful to see which session holds what lock etc. To make it easier, both as part of spec files and interactively, append the session name to application_name. Since `b1907d688` application_name already contains the test name, this appends the session's name to that. insert-conflict-specconflict did something like this manually, which can now be removed. As we have done lately with other test infrastructure improvements, backpatch this change, to make it easier to backpatch tests. Author: Andres Freund <andres@anarazel.de> Reviewed-By: Michael Paquier <michael@paquier.xyz> Reviewed-By: Andrew Dunstan <andrew@dunslane.net> Discussion: https://postgr.es/m/20211211012052.2blmzcmxnxqawd2z@alap3.anarazel.de Backpatch: 10-, to make backpatching of tests easier.	2021-12-13 12:02:45 -08:00
Amit Kapila	3f06c00cf6	Fix double publish of child table's data. We publish the child table's data twice for a publication that has both child and parent tables and is published with publish_via_partition_root as true. This happens because subscribers will initiate synchronization using both parent and child tables, since it gets both as separate tables in the initial table list. Ensure that pg_publication_tables returns only parent tables in such cases. Author: Hou Zhijie Reviewed-by: Greg Nancarrow, Amit Langote, Vignesh C, Amit Kapila Backpatch-through: 13 Discussion: https://postgr.es/m/OS0PR01MB57167F45D481F78CDC5986F794B99@OS0PR01MB5716.jpnprd01.prod.outlook.com	2021-12-09 09:00:35 +05:30
Michael Paquier	9acea52ea3	Fix corruption of toast indexes with REINDEX CONCURRENTLY REINDEX CONCURRENTLY run on a toast index or a toast relation could corrupt the target indexes rebuilt, as a backend running in parallel that manipulates toast values would directly release the lock on the toast relation when its local operation is done, rather than releasing the lock once the transaction that manipulated the toast values committed. The fix done here is simple: we now hold a ROW EXCLUSIVE lock on the toast relation when saving or deleting a toast value until the transaction working on them is committed, so as a concurrent reindex happening in parallel would be able to wait for any activity and see any new rows inserted (or deleted). An isolation test is added to check after the case fixed here, which is a bit fancy by design as it relies on allow_system_table_mods to rename the toast table and its index to fixed names. This way, it is possible to reindex them directly without any dependency on the OID of the underlying relation. Note that this could not use a DO block either, as REINDEX CONCURRENTLY cannot be run in a transaction block. The test is backpatched down to 13, where it is possible, thanks to `c4a7a39`, to use allow_system_table_mods in a test suite. Reported-by: Alexey Ermakov Analyzed-by: Andres Freund, Noah Misch Author: Michael Paquier Reviewed-by: Nathan Bossart Discussion: https://postgr.es/m/17268-d2fb426e0895abd4@postgresql.org Backpatch-through: 12	2021-12-08 11:01:19 +09:00
Andrew Dunstan	14c54e40f5	Enable settings used in TAP tests for MSVC builds Certain settings from configuration or the Makefile infrastructure are used by the TAP tests, but were not being set up by vcregress.pl. This remedies those omissions. This should increase test coverage, especially on the buildfarm. Reviewed by Noah Misch Discussion: https://postgr.es/m/17093da5-e40d-8335-d53a-2bd803fc38b0@dunslane.net Backpatch to all live branches.	2021-12-07 15:05:33 -05:00
Tom Lane	a8a983e829	On Windows, also call shutdown() while closing the client socket. Further experimentation shows that commit `6051857fc` is not sufficient when using (some versions of?) OpenSSL. The reason is obscure, but calling shutdown(socket, SD_SEND) improves matters. Per testing by Andrew Dunstan and Alexander Lakhin. Back-patch as before. Discussion: https://postgr.es/m/af5e0bf3-6a61-bb97-6cba-061ddf22ff6b@dunslane.net	2021-12-07 13:34:19 -05:00
Tom Lane	6251f86241	On Windows, close the client socket explicitly during backend shutdown. It turns out that this is necessary to keep Winsock from dropping any not-yet-sent data, such as an error message explaining the reason for process termination. It's pretty weird that the implicit close done by the kernel acts differently from an explicit close, but it's hard to argue with experimental results. Independently submitted by Alexander Lakhin and Lars Kanis (comments by me, though). Back-patch to all supported branches. Discussion: https://postgr.es/m/90b34057-4176-7bb0-0dbb-9822a5f6425b@greiz-reinsdorf.de Discussion: https://postgr.es/m/16678-253e48d34dc0c376@postgresql.org	2021-12-02 17:15:01 -05:00
Michael Paquier	fae5f08e17	Move into separate file all the SQL queries used in pg_upgrade tests The existing pg_upgrade/test.sh and the buildfarm code have been holding the same set of SQL queries when doing cross-version upgrade tests to adapt the objects created by the regression tests before the upgrade (mostly, incompatible or non-existing objects need to be dropped from the origin, perhaps re-created). This moves all those SQL queries into a new, separate, file with a set of \if clauses to handle the version checks depending on the old version of the cluster to-be-upgraded. The long-term plan is to make the buildfarm code re-use this new SQL file, so as committers are able to fix any compatibility issues in the tests of pg_upgrade with a refresh of the core code, without having to poke at the buildfarm client. Note that this is only able to handle the main regression test suite, and that nothing is done yet for contrib modules yet (these have more issues like their database names). A backpatch down to 10 is done, adapting the version checks as this script needs to be only backward-compatible, so as it becomes possible to clean up a maximum amount of code within the buildfarm client. Author: Justin Pryzby, Michael Paquier Discussion: https://postgr.es/m/20201206180248.GI24052@telsasoft.com Backpatch-through: 10	2021-12-02 10:31:34 +09:00
Tom Lane	7413caabe6	Avoid leaking memory during large-scale REASSIGN OWNED BY operations. The various ALTER OWNER routines tend to leak memory in CurrentMemoryContext. That's not a problem when they're only called once per command; but in this usage where we might be touching many objects, it can amount to a serious memory leak. Fix that by running each call in a short-lived context. (DROP OWNED BY likely has a similar issue, except that you'll probably run out of lock table space before noticing. REASSIGN is worth fixing since for most non-table object types, it won't take any lock.) Back-patch to all supported branches. Unfortunately, in the back branches this helps to only a limited extent, since the sinval message queue bloats quite a lot in this usage before commit `3aafc030a`, consuming memory more or less comparable to what's actually leaked. Still, it's clearly a leak with a simple fix, so we might as well fix it. Justin Pryzby, per report from Guillaume Lelarge Discussion: https://postgr.es/m/CAECtzeW2DAoioEGBRjR=CzHP6TdL=yosGku8qZxfX9hhtrBB0Q@mail.gmail.com	2021-12-01 13:44:47 -05:00
Alvaro Herrera	f76fd05bae	Harden be-gssapi-common.h for headerscheck Surround the contents with a test that the feature is enabled by configure, to silence header checking tools on systems without GSSAPI installed. Backpatch to 12, where the file appeared. Discussion: https://postgr.es/m/202111161709.u3pbx5lxdimt@alvherre.pgsql	2021-11-26 17:00:29 -03:00
Alvaro Herrera	ef41c3fd6c	Fix determination of broken LSN in OVERWRITTEN_CONTRECORD In commit `ff9f111bce` I mixed up inconsistent definitions of the LSN of the first record in a page, when the previous record ends exactly at the page boundary. The correct LSN is adjusted to skip the WAL page header; I failed to use that when setting XLogReaderState->overwrittenRecPtr, so at WAL replay time VerifyOverwriteContrecord would refuse to let replay continue past that record. Backpatch to 10. 9.6 also contains this bug, but it's no longer being maintained. Discussion: https://postgr.es/m/45597.1637694259@sss.pgh.pa.us	2021-11-26 11:14:27 -03:00
Peter Eisentraut	04875ae92f	Remove unneeded Python includes Inluding <compile.h> and <eval.h> has not been necessary since Python 2.4, since they are included via <Python.h>. Morever, <eval.h> is being removed in Python 3.11. So remove these includes. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://www.postgresql.org/message-id/flat/84884.1637723223%40sss.pgh.pa.us	2021-11-25 14:30:46 +01:00
Michael Paquier	37827de430	Block ALTER TABLE .. DROP NOT NULL on columns in replica identity index Replica identities that depend directly on an index rely on a set of properties, one of them being that all the columns defined in this index have to be marked as NOT NULL. There was a hole in the logic with ALTER TABLE DROP NOT NULL, where it was possible to remove the NOT NULL property of a column part of an index used as replica identity, so block it to avoid problems with logical decoding down the road. The same check was already done columns part of a primary key, so the fix is straight-forward. Author: Haiying Tang, Hou Zhijie Reviewed-by: Dilip Kumar, Michael Paquier Discussion: https://postgr.es/m/OS0PR01MB6113338C102BEE8B2FFC5BD9FB619@OS0PR01MB6113.jpnprd01.prod.outlook.com Backpatch-through: 10	2021-11-25 15:05:28 +09:00
Michael Paquier	baef657d3c	Add support for Visual Studio 2022 in build scripts Documentation and any code paths related to VS are updated to keep the whole consistent. Similarly to 2017 and 2019, the version of VS and the version of nmake that we use to determine which code paths to use for the build are still inconsistent in their own way. Backpatch down to 10, so as buildfarm members are able to use this new version of Visual Studio on all the stable branches supported. Author: Hans Buschmann Discussion: https://postgr.es/m/1633101364685.39218@nidsa.net Backpatch-through: 10	2021-11-24 13:03:59 +09:00
Tom Lane	d4f6a36d82	Adjust pg_dump's priority ordering for casts. When a stored expression depends on a user-defined cast, the backend records the dependency as being on the cast's implementation function --- or indeed, if there's no cast function involved but just RelabelType or CoerceViaIO, no dependency is recorded at all. This is problematic for pg_dump, which is at risk of dumping things in the wrong order leading to restore failures. Given the lack of previous reports, the risk isn't that high, but it can be demonstrated if the cast is used in some view whose rowtype is then used as an input or result type for some other function. (That results in the view getting hoisted into the functions portion of the dump, ahead of the cast.) A logically bulletproof fix for this would require including the cast's OID in the parsed form of the expression, whence it could be extracted by dependency.c, and then the stored dependency would force pg_dump to do the right thing. Such a change would be fairly invasive, and certainly not back-patchable. Moreover, since we'd prefer that an expression using cast syntax be equal() to one doing the same thing by explicit function call, the cast OID field would have to have special ignored-by-comparisons semantics, making things messy. So, let's instead fix this by a very simple hack in pg_dump: change the object-type priority order so that casts are initially sorted before functions, immediately after types. This fixes the problem in a fairly direct way for casts that have no implementation function. For those that do, the implementation function will be hoisted to just before the cast by the dependency sorting step, so that we still have a valid dump order. (I'm not sure that this provides a full guarantee of no problems; but since it's been like this for many years without any previous reports, this is probably enough to fix it in practice.) Per report from Дмитрий Иванов. Back-patch to all supported branches. Discussion: https://postgr.es/m/CAPL5KHoGa3uvyKp6z6m48LwCnTsK+LRQ_mcA4uKGfqAVSEjV_A@mail.gmail.com	2021-11-22 17:16:29 -05:00
Tom Lane	6fc8b145e7	Fix pg_dump --inserts mode for generated columns with dropped columns. If a table contains a generated column that's preceded by a dropped column, dumpTableData_insert failed to account for the dropped column, and would emit DEFAULT placeholder(s) in the wrong column(s). This resulted in failures at restore time. The default COPY code path did not have this bug, likely explaining why it wasn't noticed sooner. While we're fixing this, we can be a little smarter about the situation: (1) avoid unnecessarily fetching the values of generated columns, (2) omit generated columns from the output, too, if we're using --column-inserts. While these modes aren't expected to be as high-performance as the COPY path, we might as well be as efficient as we can; it doesn't add much complexity. Per report from Дмитрий Иванов. Back-patch to v12 where generated columns came in. Discussion: https://postgr.es/m/CAPL5KHrkBniyQt5e1rafm5DdXvbgiiqfEQEJ9GjtVzN71Jj5pA@mail.gmail.com	2021-11-22 15:25:48 -05:00
Tom Lane	33edf4a3ca	pg_receivewal, pg_recvlogical: allow canceling initial password prompt. Previously it was impossible to terminate these programs via control-C while they were prompting for a password. We can fix that trivially for their initial password prompts, by moving setup of the SIGINT handler from just before to just after their initial GetConnection() calls. This fix doesn't permit escaping out of later re-prompts, but those should be exceedingly rare, since the user's password or the server's authentication setup would have to have changed meanwhile. We considered applying a fix similar to commit `46d665bc2`, but that seemed more complicated than it'd be worth. Moreover, this way is back-patchable, which that wasn't. The misbehavior exists in all supported versions, so back-patch to all. Tom Lane and Nathan Bossart Discussion: https://postgr.es/m/747443.1635536754@sss.pgh.pa.us	2021-11-21 14:13:35 -05:00
Amit Kapila	33b6dd83e2	Fix parallel operations that prevent oldest xmin from advancing. While determining xid horizons, we skip over backends that are running Vacuum. We also ignore Create Index Concurrently, or Reindex Concurrently for the purposes of computing Xmin for Vacuum. But we were not setting the flags corresponding to these operations when they are performed in parallel which was preventing Xid horizon from advancing. The optimization related to skipping Create Index Concurrently, or Reindex Concurrently operations was implemented in PG-14 but the fix is the same for the Parallel Vacuum as well so back-patched till PG-13. Author: Masahiko Sawada Reviewed-by: Amit Kapila Backpatch-through: 13 Discussion: https://postgr.es/m/CAD21AoCLQqgM1sXh9BrDFq0uzd3RBFKi=Vfo6cjjKODm0Onr5w@mail.gmail.com	2021-11-19 09:24:00 +05:30
Michael Paquier	49f2b1168a	Fix quoting of ACL item in table for upgrade binary compatibility checks Per buildfarm member prion, that runs the regression tests under a role name that uses a hyphen. Issue introduced by `835bcba`. Discussion: https://postgr.es/m/YZW4MvzCZ+hQ34vw@paquier.xyz Backpatch-through: 12	2021-11-18 12:53:02 +09:00
Michael Paquier	755f04c72e	Add table to regression tests for binary-compatibility checks in pg_upgrade This commit adds to the main regression test suite a table with all the in-core data types (some exceptions apply). This table is not dropped, so as pg_upgrade would be able to check the binary compatibility of the types tracked in the table. If a new type is added in core, this part of the tests would need a refresh but the tests are designed to fail if that were to happen. As this is useful for upgrades and that these rely on the objects created in the regression test suite of the old version upgraded from, a backpatch down to 12 is done, which is the last point where a binary incompatible change has been done (`7c15cef`). This will hopefully be enough to find out if something gets broken during the development of a new version of Postgres, so as it is possible to take actions in pg_upgrade itself in this case (like `0ccfc28` for sql_identifier). An area that is not covered yet is related to external modules, which may create their own types. The testing infrastructure of pg_upgrade is not integrated yet with the external modules stored in core (src/test/modules/ or contrib/, all use the same database name for their tests so there would be an overlap). This could be improved in the future. Author: Justin Pryzby Reviewed-by: Jacob Champion, Peter Eisentraut, Tom Lane, Michael Paquier Discussion: https://postgr.es/m/20201206180248.GI24052@telsasoft.com Backpatch-through: 12	2021-11-18 10:37:39 +09:00
Tom Lane	c8b5221b57	Clean up error handling in pg_basebackup's walmethods.c. The error handling here was a mess, as a result of a fundamentally bad design (relying on errno to keep its value much longer than is safe to assume) as well as a lot of just plain sloppiness, both as to noticing errors at all and as to reporting the correct errno. Moreover, the recent addition of LZ4 compression broke things completely, because liblz4 doesn't use errno to report errors. To improve matters, keep the error state in the DirectoryMethodData or TarMethodData struct, and add a string field so we can handle cases that don't set errno. (The tar methods already had a version of this, but it can be done more efficiently since all these cases use a constant error string.) Make the dir and tar methods handle errors in basically identical ways, which they didn't before. This requires copying errno into the state struct in a lot of places, which is a bit tedious, but it has the virtue that we can get rid of ad-hoc code to save and restore errno in a number of places ... not to mention that it fixes other places that should've saved/restored errno but neglected to. In passing, fix some pointlessly static buffers to be ordinary local variables. There remains an issue about exactly how to handle errors from fsync(), but that seems like material for its own patch. While the LZ4 problems are new, all the rest of this is fixes for old bugs, so backpatch to v10 where walmethods.c was introduced. Patch by me; thanks to Michael Paquier for review. Discussion: https://postgr.es/m/1343113.1636489231@sss.pgh.pa.us	2021-11-17 14:16:34 -05:00
Tom Lane	bbda88c338	Handle close() failures more robustly in pg_dump and pg_basebackup. Coverity complained that applying get_gz_error after a failed gzclose, as we did in one place in pg_basebackup, is unsafe. I think it's right: it's entirely likely that the call is touching freed memory. Change that to inspect errno, as we do for other gzclose calls. Also, be careful to initialize errno to zero immediately before any gzclose() call where we care about the error status. (There are some calls where we don't, because we already failed at some previous step.) This ensures that we don't get a misleadingly irrelevant error code if gzclose() fails in a way that doesn't set errno. We could work harder at that, but it looks to me like all such cases are basically can't-happen if we're not misusing zlib, so it's not worth the extra notational cruft that would be required. Also, fix several places that simply failed to check for close-time errors at all, mostly at some remove from the close or gzclose itself; and one place that did check but didn't bother to report the errno. Back-patch to v12. These mistakes are older than that, but between the frontend logging API changes that happened in v12 and the fact that frontend code can't rely on %m before that, the patch would need substantial revision to work in older branches. It doesn't quite seem worth the trouble given the lack of related field complaints. Patch by me; thanks to Michael Paquier for review. Discussion: https://postgr.es/m/1343113.1636489231@sss.pgh.pa.us	2021-11-17 13:08:25 -05:00
Amit Kapila	63c3eeddc2	Invalidate relcache when changing REPLICA IDENTITY index. When changing REPLICA IDENTITY INDEX to another one, the target table's relcache was not being invalidated. This leads to skipping update/delete operations during apply on the subscriber side as the columns required to search corresponding rows won't get logged. Author: Tang Haiying, Hou Zhijie Reviewed-by: Euler Taveira, Amit Kapila Backpatch-through: 10 Discussion: https://postgr.es/m/OS0PR01MB61133CA11630DAE45BC6AD95FB939@OS0PR01MB6113.jpnprd01.prod.outlook.com	2021-11-16 08:46:12 +05:30
Tom Lane	843925fadb	Make psql's \password default to CURRENT_USER, not PQuser(conn). The documentation says plainly that \password acts on "the current user" by default. What it actually acted on, or tried to, was the username used to log into the current session. This is not the same thing if one has since done SET ROLE or SET SESSION AUTHENTICATION. Aside from the possible surprise factor, it's quite likely that the current role doesn't have permissions to set the password of the original role. To fix, use "SELECT CURRENT_USER" to get the role name to act on. (This syntax works with servers at least back to 7.0.) Also, in hopes of reducing confusion, include the role name that will be acted on in the password prompt. The discrepancy from the documentation makes this a bug, so back-patch to all supported branches. Patch by me; thanks to Nathan Bossart for review. Discussion: https://postgr.es/m/747443.1635536754@sss.pgh.pa.us	2021-11-12 14:55:32 -05:00
Michael Paquier	a691a22983	Fix memory overrun when querying pg_stat_slru pg_stat_get_slru() in pgstatfuncs.c would point to one element after the end of the array PgStat_SLRUStats when finishing to scan its entries. This had no direct consequences as no data from the extra memory area was read, but static analyzers would rightfully complain here. So let's be clean. While on it, this adds one regression test in the area reserved for system views. Reported-by: Alexander Kozhemyakin, via AddressSanitizer Author: Kyotaro Horiguchi Discussion: https://postgr.es/m/17280-37da556e86032070@postgresql.org Backpatch-through: 13	2021-11-12 21:50:08 +09:00
Noah Misch	d4e9d69469	Report any XLogReadRecord() error in XlogReadTwoPhaseData(). Buildfarm members kittiwake and tadarida have witnessed errors at this site. The site discarded key facts. Back-patch to v10 (all supported versions). Reviewed by Michael Paquier and Tom Lane. Discussion: https://postgr.es/m/20211107013157.GB790288@rfd.leadboat.com	2021-11-11 17:11:01 -08:00
Michael Paquier	13c8adf90e	Fix buffer overrun in unicode string normalization with empty input PostgreSQL 13 and newer versions are directly impacted by that through the SQL function normalize(), which would cause a call of this function to write one byte past its allocation if using in input an empty string after recomposing the string with NFC and NFKC. Older versions (v10~v12) are not directly affected by this problem as the only code path using normalization is SASLprep in SCRAM authentication that forbids the case of an empty string, but let's make the code more robust anyway there so as any out-of-core callers of this function are covered. The solution chosen to fix this issue is simple, with the addition of a fast-exit path if the decomposed string is found as empty. This would only happen for an empty string as at its lowest level a codepoint would be decomposed as itself if it has no entry in the decomposition table or if it has a decomposition size of 0. Some tests are added to cover this issue in v13~. Note that an empty string has always been considered as normalized (grammar "IS NF[K]{C,D} NORMALIZED", through the SQL function is_normalized()) for all the operations allowed (NFC, NFD, NFKC and NFKD) since this feature has been introduced as of `2991ac5`. This behavior is unchanged but some tests are added in v13~ to check after that. I have also checked "make normalization-check" in src/common/unicode/, while on it (works in 13~, and breaks in older stable branches independently of this commit). The release notes should just mention this commit for v13~. Reported-by: Matthijs van der Vleuten Discussion: https://postgr.es/m/17277-0c527a373794e802@postgresql.org Backpatch-through: 10	2021-11-11 15:01:54 +09:00
Tom Lane	78f058411b	Doc: improve protocol spec for logical replication Type messages. protocol.sgml documented the layout for Type messages, but completely dropped the ball otherwise, failing to explain what they are, when they are sent, or what they're good for. While at it, do a little copy-editing on the description of Relation messages. In passing, adjust the comment for apply_handle_type() to make it clearer that we choose not to do anything when receiving a Type message, not that we think it has no use whatsoever. Per question from Stefen Hillman. Discussion: https://postgr.es/m/CAPgW8pMknK5pup6=T4a_UG=Cz80Rgp=KONqJmTdHfaZb0RvnFg@mail.gmail.com	2021-11-10 13:12:58 -05:00
Tom Lane	5a3240cf6b	Fix instability in 026_overwrite_contrecord.pl test. We've seen intermittent failures in this test on slower buildfarm machines, which I think can be explained by assuming that autovacuum emitted some additional WAL. Disable autovacuum to stabilize it. In passing, use stringwise not numeric comparison to compare WAL file names. Doesn't matter at present, but they are hex strings not decimal ... Discussion: https://postgr.es/m/1372189.1636499287@sss.pgh.pa.us	2021-11-09 18:40:19 -05:00
Tom Lane	844b316920	libpq: reject extraneous data after SSL or GSS encryption handshake. libpq collects up to a bufferload of data whenever it reads data from the socket. When SSL or GSS encryption is requested during startup, any additional data received with the server's yes-or-no reply remained in the buffer, and would be treated as already-decrypted data once the encryption handshake completed. Thus, a man-in-the-middle with the ability to inject data into the TCP connection could stuff some cleartext data into the start of a supposedly encryption-protected database session. This could probably be abused to inject faked responses to the client's first few queries, although other details of libpq's behavior make that harder than it sounds. A different line of attack is to exfiltrate the client's password, or other sensitive data that might be sent early in the session. That has been shown to be possible with a server vulnerable to CVE-2021-23214. To fix, throw a protocol-violation error if the internal buffer is not empty after the encryption handshake. Our thanks to Jacob Champion for reporting this problem. Security: CVE-2021-23222	2021-11-08 11:14:56 -05:00
Tom Lane	e92ed93e8e	Reject extraneous data after SSL or GSS encryption handshake. The server collects up to a bufferload of data whenever it reads data from the client socket. When SSL or GSS encryption is requested during startup, any additional data received with the initial request message remained in the buffer, and would be treated as already-decrypted data once the encryption handshake completed. Thus, a man-in-the-middle with the ability to inject data into the TCP connection could stuff some cleartext data into the start of a supposedly encryption-protected database session. This could be abused to send faked SQL commands to the server, although that would only work if the server did not demand any authentication data. (However, a server relying on SSL certificate authentication might well not do so.) To fix, throw a protocol-violation error if the internal buffer is not empty after the encryption handshake. Our thanks to Jacob Champion for reporting this problem. Security: CVE-2021-23214	2021-11-08 11:01:43 -05:00
Alvaro Herrera	7c0a78f089	Fix typo Introduced in `1d97d3d086`. Co-authored-by: Alexander Lakhin <exclusion@gmail.com> Discussion: https://postgr.es/m/83641f59-d566-b33e-ef21-a272a98675aa@gmail.com	2021-11-08 09:17:24 -03:00
Peter Eisentraut	98da5cd0d1	Translation updates Source-Git-URL: git://git.postgresql.org/git/pgtranslation/messages.git Source-Git-Hash: 027ff7dad8afb1a907cb4c59da4e13c3ace8d376	2021-11-08 10:08:56 +01:00
Alexander Korotkov	e1fee28a04	Reset lastOverflowedXid on standby when needed Currently, lastOverflowedXid is never reset. It's just adjusted on new transactions known to be overflowed. But if there are no overflowed transactions for a long time, snapshots could be mistakenly marked as suboverflowed due to wraparound. This commit fixes this issue by resetting lastOverflowedXid when needed altogether with KnownAssignedXids. Backpatch to all supported versions. Reported-by: Stan Hu Discussion: https://postgr.es/m/CAMBWrQ%3DFp5UAsU_nATY7EMY7NHczG4-DTDU%3DmCvBQZAQ6wa2xQ%40mail.gmail.com Author: Kyotaro Horiguchi, Alexander Korotkov Reviewed-by: Stan Hu, Simon Riggs, Nikolay Samokhvalov, Andrey Borodin, Dmitry Dolgov	2021-11-06 18:34:19 +03:00
Alvaro Herrera	bf5cdcfd5e	Avoid crash in rare case of concurrent DROP When a role being dropped contains is referenced by catalog objects that are concurrently also being dropped, a crash can result while trying to construct the string that describes the objects. Suppress that by ignoring objects whose descriptions are returned as NULL. The majority of relevant codesites were already cautious about this already; we had just missed a couple. This is an old bug, so backpatch all the way back. Reported-by: Alexander Lakhin <exclusion@gmail.com> Discussion: https://postgr.es/m/17126-21887f04508cb5c8@postgresql.org	2021-11-05 12:29:34 -03:00
Heikki Linnakangas	b7299b6646	Update alternative expected output file. Previous commit added a test to 'largeobject', but neglected the alternative expected output file 'largeobject_1.source'. Per failure on buildfarm animal 'hamerkop'. Discussion: https://www.postgresql.org/message-id/DBA08346-9962-4706-92D1-230EE5201C10@yesql.se	2021-11-03 19:41:38 +02:00
Heikki Linnakangas	07070c0082	Fix snapshot reference leak if lo_export fails. If lo_export() fails to open the target file or to write to it, it leaks the created LargeObjectDesc and its snapshot in the top-transaction context and resource owner. That's pretty harmless, it's a small leak after all, but it gives the user a "Snapshot reference leak" warning. Fix by using a short-lived memory context and no resource owner for transient LargeObjectDescs that are opened and closed within one function call. The leak is easiest to reproduce with lo_export() on a directory that doesn't exist, but in principle the other lo_* functions could also fail. Backpatch to all supported versions. Reported-by: Andrew B Reviewed-by: Alvaro Herrera Discussion: https://www.postgresql.org/message-id/32bf767a-2d65-71c4-f170-122f416bab7e@iki.fi	2021-11-03 10:54:36 +02:00
Tom Lane	ada667b454	Fix variable lifespan in ExecInitCoerceToDomain(). This undoes a mistake in `1ec7679f1`: domainval and domainnull were meant to live across loop iterations, but they were incorrectly moved inside the loop. The effect was only to emit useless extra EEOP_MAKE_READONLY steps, so it's not a big deal; nonetheless, back-patch to v13 where the mistake was introduced. Ranier Vilela Discussion: https://postgr.es/m/CAEudQAqXuhbkaAp-sGH6dR6Nsq7v28_0TPexHOm6FiDYqwQD-w@mail.gmail.com	2021-11-02 13:36:57 -04:00
Tom Lane	0151af40cd	Avoid O(N^2) behavior in SyncPostCheckpoint(). As in commits `6301c3ada` and `e9d9ba2a4`, avoid doing repetitive list_delete_first() operations, since that would be expensive when there are many files waiting to be unlinked. This is a slightly larger change than in those cases. We have to keep the list state valid for calls to AbsorbSyncRequests(), so it's necessary to invent a "canceled" field instead of immediately deleting PendingUnlinkEntry entries. Also, because we might not be able to process all the entries, we need a new list primitive list_delete_first_n(). list_delete_first_n() is almost list_copy_tail(), but it modifies the input List instead of making a new copy. I found a couple of existing uses of the latter that could profitably use the new function. (There might be more, but the other callers look like they probably shouldn't overwrite the input List.) As before, back-patch to v13. Discussion: https://postgr.es/m/CD2F0E7F-9822-45EC-A411-AE56F14DEA9F@amazon.com	2021-11-02 11:31:54 -04:00
Tom Lane	e477642a1b	Avoid some other O(N^2) hazards in list manipulation. In the same spirit as `6301c3ada`, fix some more places where we were using list_delete_first() in a loop and thereby risking O(N^2) behavior. It's not clear that the lists manipulated in these spots can get long enough to be really problematic ... but it's not clear that they can't, either, and the fixes are simple enough. As before, back-patch to v13. Discussion: https://postgr.es/m/CD2F0E7F-9822-45EC-A411-AE56F14DEA9F@amazon.com	2021-11-01 16:24:40 -04:00
Alvaro Herrera	17227825ca	Handle XLOG_OVERWRITE_CONTRECORD in DecodeXLogOp Failing to do so results in inability of logical decoding to process the WAL stream. Handle it by doing nothing. Backpatch all the way back. Reported-by: Petr Jelínek <petr.jelinek@enterprisedb.com>	2021-11-01 13:07:23 -03:00
Michael Paquier	77f7909a40	Preserve opclass parameters across REINDEX CONCURRENTLY The opclass parameter Datums from the old index are fetched in the same way as for predicates and expressions, by grabbing them directly from the system catalogs. They are then copied into the new IndexInfo that will be used for the creation of the new copy. This caused the new index to be rebuilt with default parameters rather than the ones pre-defined by a user. The only way to get back a new index with correct opclass parameters would be to recreate a new index from scratch. The issue has been introduced by `911e702`. Author: Michael Paquier Reviewed-by: Zhihong Yu Discussion: https://postgr.es/m/YX0CG/QpLXcPr8HJ@paquier.xyz Backpatch-through: 13	2021-11-01 11:40:29 +09:00
Tom Lane	df238aed10	Avoid O(N^2) behavior when the standby process releases many locks. When replaying a transaction that held many exclusive locks on the primary, a standby server's startup process would expend O(N^2) effort on manipulating the list of locks. This code was fine when written, but commit `1cff1b95a` made repetitive list_delete_first() calls inefficient, as explained in its commit message. Fix by just iterating the list normally, and releasing storage only when done. (This'd be inadequate if we needed to recover from an error occurring partway through; but we don't.) Back-patch to v13 where `1cff1b95a` came in. Nathan Bossart Discussion: https://postgr.es/m/CD2F0E7F-9822-45EC-A411-AE56F14DEA9F@amazon.com	2021-10-31 15:31:44 -04:00
Tom Lane	4cd72add05	Update time zone data files to tzdata release 2021e. DST law changes in Fiji, Jordan, Palestine, and Samoa. Historical corrections for Barbados, Cook Islands, Guyana, Niue, Portugal, and Tonga. Also, the Pacific/Enderbury zone has been renamed to Pacific/Kanton. The following zones have been merged into nearby, more-populous zones whose clocks have agreed since 1970: Africa/Accra, America/Atikokan, America/Blanc-Sablon, America/Creston, America/Curacao, America/Nassau, America/Port_of_Spain, Antarctica/DumontDUrville, and Antarctica/Syowa.	2021-10-29 11:38:38 -04:00
Tom Lane	5a4b8a8a72	Improve contrib/amcheck's tests for CREATE INDEX CONCURRENTLY. Commits `fdd965d07` and `3cd9c3b92` tested CREATE INDEX CONCURRENTLY by launching two separate pgbench runs concurrently. This was needed so that only a single client thread would run CREATE INDEX CONCURRENTLY, avoiding deadlock between two CICs. However, there's a better way, which is to use an advisory lock to prevent concurrent CICs. That's better in part because the test code is shorter and more readable, but mostly because it automatically scales things to launch an appropriate number of CICs relative to the number of INSERT transactions. As committed, typically half to three-quarters of the CIC transactions were pointless because the INSERT transactions had already stopped. In passing, remove background_pgbench, which was added to support these tests and isn't needed anymore. We can always put it back if we find a use for it later. Back-patch to v12; older pgbench versions lack the conditional-execution features needed for this method. Tom Lane and Andrey Borodin Discussion: https://postgr.es/m/139687.1635277318@sss.pgh.pa.us	2021-10-28 11:45:14 -04:00
Peter Geoghegan	d5a2ffbce5	Fix ordering of items in nbtree error message. Oversight in commit `a5213adf`. Backpatch: 13-, just like commit `a5213adf`.	2021-10-27 13:09:00 -07:00
Peter Geoghegan	f8cce4a3d8	Further harden nbtree posting split code. Add more defensive checks around posting list split code. These should detect corruption involving duplicate table TIDs earlier and more reliably than any existing check. Follow up to commit `8f72bbac`. Discussion: https://postgr.es/m/CAH2-WzkrSY_kjyd1_M5xJK1uM0govJXMxPn8JUSvwcUOiHuWVw@mail.gmail.com Backpatch: 13-, where nbtree deduplication was introduced.	2021-10-27 12:10:43 -07:00
Magnus Hagander	dd111887fb	Clarify that --system reindexes system catalogs only Make this more clear both in the help message and docs. Reviewed-By: Michael Paquier Backpatch-through: 9.6 Discussion: https://postgr.es/m/CABUevEw6Je0WUFTLhPKOk4+BoBuDrE-fKw3N4ckqgDBMFu4paA@mail.gmail.com	2021-10-27 16:28:54 +02:00
Thomas Munro	24b7cf8a5c	Reject huge_pages=on if shared_memory_type=sysv. It doesn't work (it could, but hasn't been implemented). Back-patch to 12, where shared_memory_type arrived. Reported-by: Alexander Lakhin <exclusion@gmail.com> Reviewed-by: Alexander Lakhin <exclusion@gmail.com> Discussion: https://postgr.es/m/163271880203.22789.1125998876173795966@wrigleys.postgresql.org	2021-10-26 13:04:40 +13:00
Noah Misch	a9d0a54094	Fix CREATE INDEX CONCURRENTLY for the newest prepared transactions. The purpose of commit `8a54e12a38` was to fix this, and it sufficed when the PREPARE TRANSACTION completed before the CIC looked for lock conflicts. Otherwise, things still broke. As before, in a cluster having used CIC while having enabled prepared transactions, queries that use the resulting index can silently fail to find rows. It may be necessary to reindex to recover from past occurrences; REINDEX CONCURRENTLY suffices. Fix this for future index builds by making CIC wait for arbitrarily-recent prepared transactions and for ordinary transactions that may yet PREPARE TRANSACTION. As part of that, have PREPARE TRANSACTION transfer locks to its dummy PGPROC before it calls ProcArrayClearTransaction(). Back-patch to 9.6 (all supported versions). Andrey Borodin, reviewed (in earlier versions) by Andres Freund. Discussion: https://postgr.es/m/01824242-AA92-4FE9-9BA7-AEBAFFEA3D0C@yandex-team.ru	2021-10-23 18:36:42 -07:00
Noah Misch	2e33b43599	Avoid race in RelationBuildDesc() affecting CREATE INDEX CONCURRENTLY. CIC and REINDEX CONCURRENTLY assume backends see their catalog changes no later than each backend's next transaction start. That failed to hold when a backend absorbed a relevant invalidation in the middle of running RelationBuildDesc() on the CIC index. Queries that use the resulting index can silently fail to find rows. Fix this for future index builds by making RelationBuildDesc() loop until it finishes without accepting a relevant invalidation. It may be necessary to reindex to recover from past occurrences; REINDEX CONCURRENTLY suffices. Back-patch to 9.6 (all supported versions). Noah Misch and Andrey Borodin, reviewed (in earlier versions) by Andres Freund. Discussion: https://postgr.es/m/20210730022548.GA1940096@gust.leadboat.com	2021-10-23 18:36:42 -07:00
Tom Lane	2e01d050d9	Fix frontend version of sh_error() in simplehash.h. The code does not expect sh_error() to return, but the patch that made this header usable in frontend didn't get that memo. While here, plaster unlikely() on the tests that decide whether to invoke sh_error(), and add our standard copyright notice. Noted by Andres Freund. Back-patch to v13 where this frontend support came in. Discussion: https://postgr.es/m/0D54435C-1199-4361-9D74-2FBDCF8EA164@anarazel.de	2021-10-22 16:43:38 -04:00
Tom Lane	4760060235	pg_dump: fix mis-dumping of non-global default privileges. Non-global default privilege entries should be dumped as-is, not made relative to the default ACL for their object type. This would typically only matter if one had revoked some on-by-default privileges in a global entry, and then wanted to grant them again in a non-global entry. Per report from Boris Korzun. This is an old bug, so back-patch to all supported branches. Neil Chen, test case by Masahiko Sawada Discussion: https://postgr.es/m/111621616618184@mail.yandex.ru Discussion: https://postgr.es/m/CAA3qoJnr2+1dVJObNtfec=qW4Z0nz=A9+r5bZKoTSy5RDjskMw@mail.gmail.com	2021-10-22 15:22:26 -04:00
Amit Kapila	23469b867a	Back-patch "Add parent table name in an error in reorderbuffer.c." This was originally done in commit `5e77625b26` for 15 only, as a troubleshooting aid but multiple people showed interest in back-patching this. Author: Jeremy Schneider Reviewed-by: Amit Kapila Backpatch-through: 9.6 Discussion: https://postgr.es/m/808ed65b-994c-915a-361c-577f088b837f@amazon.com	2021-10-21 09:36:27 +05:30
Alvaro Herrera	a73a3671da	Protect against collation variations in test Discussion: https://postgr.es/m/YW/MYdSRQZtPFBWR@paquier.xyz	2021-10-20 13:05:42 -03:00
Michael Paquier	abb9ee92c5	Fix build of MSVC with OpenSSL 3.0.0 The build scripts of Visual Studio would fail to detect properly a 3.0.0 build as the check on the second digit was failing. This is adjusted where needed, allowing the builds to complete. Note that the MSIs of OpenSSL mentioned in the documentation have not changed any library names for Win32 and Win64, making this change straight-forward. Reported-by: htalaco, via github Reviewed-by: Daniel Gustafsson Discussion: https://postgr.es/m/YW5XKYkq6k7OtrFq@paquier.xyz Backpatch-through: 9.6	2021-10-20 16:49:00 +09:00
Alvaro Herrera	842fe6123c	Ensure correct lock level is used in ALTER ... RENAME Commit `1b5d797cd4` intended to relax the lock level used to rename indexes, but inadvertently allowed any relation to be renamed with a lowered lock level, as long as the command is spelled ALTER INDEX. That's undesirable for other relation types, so retry the operation with the higher lock if the relation turns out not to be an index. After this fix, ALTER INDEX <sometable> RENAME will require access exclusive lock, which it didn't before. Author: Nathan Bossart <bossartn@amazon.com> Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Reported-by: Onder Kalaci <onderk@microsoft.com> Discussion: https://postgr.es/m/PH0PR21MB1328189E2821CDEC646F8178D8AE9@PH0PR21MB1328.namprd21.prod.outlook.com	2021-10-19 19:08:45 -03:00
Andres Freund	246a035f05	Adapt src/test/ldap/t/001_auth.pl to work with openldap 2.5. ldapsearch's deprecated -h/-p arguments were removed, need to use -H now - which has been around for over 20 years. As perltidy insists on reflowing the parameters anyway, change order and "phrasing" to yield a less confusing layout (per suggestion from Tom Lane). Discussion: https://postgr.es/m/20211009233850.wvr6apcrw2ai6cnj@alap3.anarazel.de Backpatch: 11-, where the tests were added.	2021-10-19 11:15:45 -07:00
Tom Lane	30e61a8cdc	Fix assignment to array of domain over composite. An update such as "UPDATE ... SET fld[n].subfld = whatever" failed if the array elements were domains rather than plain composites. That's because isAssignmentIndirectionExpr() failed to cope with the CoerceToDomain node that would appear in the expression tree in this case. The result would typically be a crash, and even if we accidentally didn't crash, we'd not correctly preserve other fields of the same array element. Per report from Onder Kalaci. Back-patch to v11 where arrays of domains came in. Discussion: https://postgr.es/m/PH0PR21MB132823A46AA36F0685B7A29AD8BD9@PH0PR21MB1328.namprd21.prod.outlook.com	2021-10-19 13:54:46 -04:00
Tom Lane	cf33fb7f4a	Remove bogus assertion in transformExpressionList(). I think when I added this assertion (in commit `8f889b108`), I was only thinking of the use of transformExpressionList at top level of INSERT and VALUES. But it's also called by transformRowExpr(), which can certainly occur in an UPDATE targetlist, so it's inappropriate to suppose that p_multiassign_exprs must be empty. Besides, since the input is not expected to contain ResTargets, there's no reason it should contain MultiAssignRefs either. Hence this code need not be concerned about the state of p_multiassign_exprs, and we should just drop the assertion. Per bug #17236 from ocean_li_996. It's been wrong for years, so back-patch to all supported branches. Discussion: https://postgr.es/m/17236-3210de9bcba1d7ca@postgresql.org	2021-10-19 11:35:15 -04:00
Daniel Gustafsson	687fe8a9d7	Fix bug in TOC file error message printing If the blob TOC file cannot be parsed, the error message was failing to print the filename as the variable holding it was shadowed by the destination buffer for parsing. When the filename fails to parse, the error will print an empty string: ./pg_restore -d foo -F d dump pg_restore: error: invalid line in large object TOC file "": .. ..instead of the intended error message: ./pg_restore -d foo -F d dump pg_restore: error: invalid line in large object TOC file "dump/blobs.toc": .. Fix by renaming both variables as the shared name was too generic to store either and still convey what the variable held. Backpatch all the way down to 9.6. Reviewed-by: Tom Lane Discussion: https://postgr.es/m/A2B151F5-B32B-4F2C-BA4A-6870856D9BDE@yesql.se Backpatch-through: 9.6	2021-10-19 12:59:54 +02:00
Daniel Gustafsson	d3a4c1eb3d	Fix sscanf limits in pg_basebackup and pg_dump Make sure that the string parsing is limited by the size of the destination buffer. In pg_basebackup the available values sent from the server is limited to two characters so there was no risk of overflow. In pg_dump the buffer is bounded by MAXPGPATH, and thus the limit must be inserted via preprocessor expansion and the buffer increased by one to account for the terminator. There is no risk of overflow here, since in this case, the buffer scanned is smaller than the destination buffer. Backpatch the pg_basebackup fix to 11 where it was introduced, and the pg_dump fix all the way down to 9.6. Reviewed-by: Tom Lane Discussion: https://postgr.es/m/B14D3D7B-F98C-4E20-9459-C122C67647FB@yesql.se Backpatch-through: 11 and 9.6	2021-10-19 12:59:50 +02:00
Michael Paquier	85dc4292a7	Block ALTER INDEX/TABLE index_name ALTER COLUMN colname SET (options) The grammar of this command run on indexes with column names has always been authorized by the parser, and it has never been documented. Since `911e702`, it is possible to define opclass parameters as of CREATE INDEX, which actually broke the old case of ALTER INDEX/TABLE where relation-level parameters n_distinct and n_distinct_inherited could be defined for an index (see `76a47c0` and its thread where this point has been touched, still remained unused). Attempting to do that in v13~ would cause the index to become unusable, as there is a new dedicated code path to load opclass parameters instead of the relation-level ones previously available. Note that it is possible to fix things with a manual catalog update to bring the relation back online. This commit disables this command for now as the use of column names for indexes does not make sense anyway, particularly when it comes to index expressions where names are automatically computed. One way to properly support this case properly in the future would be to use column numbers when it comes to indexes, in the same way as ALTER INDEX .. ALTER COLUMN .. SET STATISTICS. Partitioned indexes were already blocked, but not indexes. Some tests are added for both cases. There was some code in ANALYZE to enforce n_distinct to be used for an index expression if the parameter was defined, but just remove it for now until/if there is support for this (note that index-level parameters never had support in pg_dump either, previously), so this was just dead code. Reported-by: Matthijs van der Vleuten Author: Nathan Bossart, Michael Paquier Reviewed-by: Vik Fearing, Dilip Kumar Discussion: https://postgr.es/m/17220-15d684c6c2171a83@postgresql.org Backpatch-through: 13	2021-10-19 11:04:04 +09:00
Alvaro Herrera	fe35528a5e	Invalidate partitions of table being attached/detached Failing to do that, any direct inserts/updates of those partitions would fail to enforce the correct constraint, that is, one that considers the new partition constraint of their parent table. Backpatch to 10. Reported by: Hou Zhijie <houzj.fnst@fujitsu.com> Author: Amit Langote <amitlangote09@gmail.com> Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Reviewed-by: Nitin Jadhav <nitinjadhavpostgres@gmail.com> Reviewed-by: Pavel Borisov <pashkin.elfe@gmail.com> Discussion: https://postgr.es/m/OS3PR01MB5718DA1C4609A25186D1FBF194089%40OS3PR01MB5718.jpnprd01.prod.outlook.com	2021-10-18 19:08:25 -03:00
Michael Paquier	8f4fe8d7f8	Reset properly snapshot export state during transaction abort During a replication slot creation, an ERROR generated in the same transaction as the one creating a to-be-exported snapshot would have left the backend in an inconsistent state, as the associated static export snapshot state was not being reset on transaction abort, but only on the follow-up command received by the WAL sender that created this snapshot on replication slot creation. This would trigger inconsistency failures if this session tried to export again a snapshot, like during the creation of a replication slot. Note that a snapshot export cannot happen in a transaction block, so there is no need to worry resetting this state for subtransaction aborts. Also, this inconsistent state would very unlikely show up to users. For example, one case where this could happen is an out-of-memory error when building the initial snapshot to-be-exported. Dilip found this problem while poking at a different patch, that caused an error in this code path for reasons unrelated to HEAD. Author: Dilip Kumar Reviewed-by: Michael Paquier, Zhihong Yu Discussion: https://postgr.es/m/CAFiTN-s0zA1Kj0ozGHwkYkHwa5U0zUE94RSc_g81WrpcETB5=w@mail.gmail.com Backpatch-through: 9.6	2021-10-18 11:56:52 +09:00
Tom Lane	0b5f557b7e	Avoid core dump in pg_dump when dumping from pre-8.3 server. Commit `f0e21f2f6` missed adding a tgisinternal output column to getTriggers' query for pre-8.3 servers. Back-patch to v11, like that commit.	2021-10-16 15:03:10 -04:00
Tom Lane	6a262ba8c8	Make pg_dump acquire lock on partitioned tables that are to be dumped. It was clearly the intent to do so all along, but the original coding fat-fingered this by checking the wrong array element. We fixed it in passing in `403a3d91c`, but that later got reverted, and we forgot to keep this bug fix. Most of the time this'd be relatively harmless, since once we lock any of the partitioned table's leaf partitions, that would suffice to prevent major DDL on the partitioned table itself. However, a childless partitioned table would get dumped with no relevant lock whatsoever, possibly allowing dump failure or inconsistent output. Unlike `403a3d91c`, there are no versioning concerns, since every server version that has partitioned tables will allow you to lock one. Back-patch to v10 where partitioned tables were introduced. Discussion: https://postgr.es/m/1018205.1634346327@sss.pgh.pa.us	2021-10-16 12:24:17 -04:00
Jeff Davis	20f785732f	Check criticalSharedRelcachesBuilt in GetSharedSecurityLabel(). An extension may want to call GetSecurityLabel() on a shared object before the shared relcaches are fully initialized. For instance, a ClientAuthentication_hook might want to retrieve the security label on a role. Discussion: https://postgr.es/m/ecb7af0b26e3be1d96d291c8453a86f1f82d9061.camel@j-davis.com Backpatch-through: 9.6	2021-10-14 12:24:47 -07:00
Tom Lane	fdd6a4d8d9	Fix planner error with pulling up subquery expressions into function RTEs. If a function-in-FROM laterally references the output of some sub-SELECT earlier in the FROM clause, and we are able to flatten that sub-SELECT into the outer query, the expression(s) copied into the function RTE missed being processed by eval_const_expressions. This'd lead to trouble and probable crashes at execution if such expressions contained named-argument function call syntax or functions with defaulted arguments. The bug is masked if the query contains any explicit JOIN syntax, which may help explain why we'd not noticed. Per bug #17227 from Bernd Dorn. This is an oversight in commit `7266d0997`, so back-patch to v13 where that came in. Discussion: https://postgr.es/m/17227-5a28ed1512189fa4@postgresql.org	2021-10-14 12:43:43 -04:00
Alvaro Herrera	2cdf97fd1e	Change recently added test code for stability The test code added with `ff9f111bce` fails under valgrind, and probably other slow cases too, because if (say) autovacuum runs in between and produces WAL of its own, the large INSERT fails to account for that in the LSN calculations. Rewrite to use a DO loop. Per complaint from Andres Freund Backpatch to all branches. Discussion: https://postgr.es/m/20211013180338.5guyqzpkcisqugrl@alap3.anarazel.de	2021-10-13 18:49:27 -03:00
Michael Paquier	2a8dee6a67	Fix tests of pg_upgrade across different major versions This fixes a set of issues that cause different breakages or annoyances when using pg_upgrade's test.sh to do upgrades across different major versions: - test.sh is completely broken when using v14 as new version because of the removal of testtablespace/ as Makefile rule. Older versions of pg_regress don't support --make-tablespacedir, blocking the creation of the tablespace. In order to fix that, it is simple enough to create those directories in the script itself, but only do that when an old version is involved. This fix is needed on HEAD and REL_14_STABLE. - The script would fail when using PG <= v11 as old version because of WITH OIDS relations not supported in v12. In order to fix this, this steals a method from the buildfarm that uses a DO block to change all the relations marked as WITH OIDS, allowing pg_upgrade to pass. This is more portable than using ALTER TABLE queries on the relations causing issues. This is fixed down to v12, and authored originally by Andrew Dunstan. - Not using --extra-float-digits=0 with v11 as old version causes a lot of diffs in the dumps, making the whole unreadable. This gets only done when using v11 as old version. This is fixed down to v12. The buildfarm code uses that already. Note that the addition of --wal-segsize and --allow-group-access breaks the script when using v10 or older at initdb time as these got added in 11. 10 would be EOL'd next year and nobody has complained about those problems yet, so nothing is done about that. This means that this commit fixes upgrade tests using test.sh with v11 as minimum older version, up to HEAD, and that it is enough to apply this change down to 12. The old and new dumps still generate diffs, still require manual checks, and more could be done to reduce the noise, but this allows the tests to run with a rather minimal amount of them. I have tested this commit and test.sh with v11 as minimum across all the branches where this is applied. Note that this commit has no impact on the normal pg_upgrade test run with a simple "make check". Author: Justin Pryzby, Andrew Dunstan, Michael Paquier Discussion: https://postgr.es/m/20201206180248.GI24052@telsasoft.com Backpatch-through: 12	2021-10-13 09:22:38 +09:00
Michael Paquier	bab0ff2e44	Add more $Test::Builder::Level in the TAP tests Incrementing the level of the call stack reported is useful for debugging purposes as it allows to control which part of the test is exactly failing, especially if a test is structured with subroutines that call routines from Test::More. This adds more incrementations of $Test::Builder::Level where debugging gets improved (for example it does not make sense for some paths like pg_rewind where long subroutines are used). A note is added to src/test/perl/README about that, based on a suggestion from Andrew Dunstan and a wording coming from both of us. Usage of Test::Builder::Level has spread in 12, so a backpatch down to this version is done. Reviewed-by: Andrew Dunstan, Peter Eisentraut, Daniel Gustafsson Discussion: https://postgr.es/m/YV1CCFwgM1RV1LeS@paquier.xyz Backpatch-through: 12	2021-10-12 11:16:25 +09:00
Etsuro Fujita	08f37e2592	Add missing word to comment in joinrels.c. Author: Amit Langote Backpatch-through: 13 Discussion: https://postgr.es/m/CA%2BHiwqGQNbtamQ_9DU3osR1XiWR4wxWFZurPmN6zgbdSZDeWmw%40mail.gmail.com	2021-10-07 17:45:03 +09:00
Dean Rasheed	9ab94ccb15	Fix corner-case loss of precision in numeric_power(). This fixes a loss of precision that occurs when the first input is very close to 1, so that its logarithm is very small. Formerly, during the initial low-precision calculation to estimate the result weight, the logarithm was computed to a local rscale that was capped to NUMERIC_MAX_DISPLAY_SCALE (1000). However, the base may be as close as 1e-16383 to 1, hence its logarithm may be as small as 1e-16383, and so the local rscale needs to be allowed to exceed 16383, otherwise all precision is lost, leading to a poor choice of rscale for the full-precision calculation. Fix this by removing the cap on the local rscale during the initial low-precision calculation, as we already do in the full-precision calculation. This doesn't change the fact that the initial calculation is a low-precision approximation, computing the logarithm to around 8 significant digits, which is very fast, especially when the base is very close to 1. Patch by me, reviewed by Alvaro Herrera. Discussion: https://postgr.es/m/CAEZATCV-Ceu%2BHpRMf416yUe4KKFv%3DtdgXQAe5-7S9tD%3D5E-T1g%40mail.gmail.com	2021-10-06 13:20:23 +01:00
Michael Paquier	d6d68e2233	Fix warning in TAP test of pg_verifybackup Oversight in `a3fcbcd`. Reported-by: Thomas Munro Discussion: https://postgr.es/m/CA+hUKGKnajZEwe91OTjro9kQLCMGGFHh2vvFn8tgHgbyn4bF9w@mail.gmail.com Backpatch-through: 13	2021-10-06 13:28:35 +09:00
Andres Freund	5ba397f740	Fix TestLib::slurp_file() with offset on windows. `3c5b0685b9` used setFilePointer() to set the position of the filehandle, but passed the wrong filehandle, always leaving the position at 0. Instead of just fixing that, remove use of setFilePointer(), we have a perl fd at this point, so we can just use perl's seek(). Additionally, the perl filehandle wasn't closed, just the windows filehandle. Reviewed-By: Andrew Dunstan <andrew@dunslane.net> Author: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/20211003173038.64mmhgxctfqn7wl6@alap3.anarazel.de Backpatch: 9.6-, like `3c5b0685b9`	2021-10-04 13:32:35 -07:00
Tom Lane	c53ff69e1f	Update our mapping of Windows time zone names some more. Per discussion, let's just follow CLDR's default zone mappings faithfully. There are two changes here that are clear improvements: * Mapping "Greenwich Standard Time" to Atlantic/Reykjavik is actually a better fit than using London, because Iceland hasn't observed DST since 1968, so this is more nearly what people might expect. * Since the "Samoa" zone is specified to be UTC+13:00, we must map it to Pacific/Apia not Pacific/Samoa; the latter refers to American Samoa which is now on the other side of the date line. The rest of these changes look like they're choosing the most populous IANA zone as representative. Whatever the details, we're just going to say "if you don't like this mapping, complain to CLDR". Discussion: https://postgr.es/m/3266414.1633045628@sss.pgh.pa.us	2021-10-04 14:52:17 -04:00
Michael Paquier	194e535a07	Fix snapshot builds during promotion of hot standby node with 2PC Some specific logic is done at the end of recovery when involving 2PC transactions: 1) Call RecoverPreparedTransactions(), to recover the state of 2PC transactions into memory (re-acquire locks, etc.). 2) ShutdownRecoveryTransactionEnvironment(), to move back to normal operations, mainly cleaning up recovery locks and KnownAssignedXids (including any 2PC transaction tracked previously). 3) Switch XLogCtl->SharedRecoveryState to RECOVERY_STATE_DONE, which is the tipping point for any process calling RecoveryInProgress() to check if the cluster is still in recovery or not. Any snapshot taken between steps 2) and 3) would be empty, causing any transaction relying on a snapshot at this point to potentially corrupt data as there could still be some 2PC transactions to track, with RecentXmin moving backwards on successive calls to GetSnapshotData() in the same transaction. As SharedRecoveryState is the point to take into account to know if it is safe to discard KnownAssignedXids, this commit moves step 2) after step 3), so as we can never finish with empty snapshots. This exists since the introduction of hot standby, so backpatch all the way down. The window with incorrect snapshots is extremely small, but I have seen it when running 023_pitr_prepared_xact.pl, as did buildfarm member fairywren. Thomas Munro also found it independently. Special thanks to Andres Freund for taking the time to analyze this issue. Reported-by: Thomas Munro, Michael Paquier Analyzed-by: Andres Freund Discussion: https://postgr.es/m/20210422203603.fdnh3fu2mmfp2iov@alap3.anarazel.de Backpatch-through: 9.6	2021-10-04 14:05:52 +09:00
Tom Lane	9c76689de4	Update our mapping of Windows time zone names using CLDR info. This corrects a bunch of entries in win32_tzmap[], and adds a few new ones, based on the CLDR project's windowsZones.xml file. Non-cosmetic changes fall into four main categories: * Flat-out errors: US/Aleutan doesn't exist America/Salvador doesn't exist Asia/Baku is wrong for Yerevan Asia/Dhaka (Bangladesh) is wrong for Astana (Kazakhstan) Europe/Bucharest is wrong for Chisinau America/Mexico_City is wrong for Chetumal America/Buenos_Aires is wrong for Cayenne America/Caracas has its own zone, so poor fit for La Paz US/Eastern is wrong for Haiti US/Eastern is wrong for Indiana (East) Asia/Karachi is wrong for Tashkent Etc/UTC+12 doesn't exist Signs of Etc/GMT zones were backwards * Judgment calls: (These changes follow CLDR's choices, except for the first one) Use Europe/London for "Greenwich Standard Time", since that seems much more likely than Africa/Casablanca to be what people will think that zone name means. CLDR has Atlantic/Reykjavik here, but that's no better. Asia/Shanghai seems a better fit than Hong Kong for "China Standard Time". Europe/Sarajevo is now a link to Belgrade, ie "Central Europe Standard Time"; so use Warsaw for "Central European Standard Time". America/Sao_Paulo seems more representative than Araguaina for "E. South America Standard Time". Africa/Johannesburg seems more representative than Harare for "South Africa Standard Time". * New Windows zone names: "Israel Standard Time" "Kaliningrad Standard Time" "Russia Time Zone N" for various N "Singapore Standard Time" "South Sudan Standard Time" "W. Central Africa Standard Time" "West Bank Standard Time" "Yukon Standard Time" Some of these replace older spellings, but I kept the older spellings too in case our code runs on a machine with the older data. * Replace aliases (tzdb Links) with underlying city-named zones: (This tracks tzdb's longstanding practice, and reduces inconsistency with the rest of the entries, as well as with CLDR.) US/Alaska Asia/Kuwait Asia/Muscat Canada/Atlantic Australia/Canberra Canada/Saskatchewan US/Central US/Eastern US/Hawaii US/Mountain Canada/Newfoundland US/Pacific Back-patch to all supported branches, as is our usual practice for time zone data updates. Discussion: https://postgr.es/m/3266414.1633045628@sss.pgh.pa.us	2021-10-02 16:06:23 -04:00
Tom Lane	7ba8eb81f6	Re-alphabetize the win32_tzmap[] array. The original intent seems to have been to sort case-insensitively by the Windows zone name, but various changes over the years did not get that memo. This commit just moves a few entries to restore exact alphabetic order, to ease comparison to the outputs of processing scripts. Back-patch to all supported branches, as is our usual practice for time zone data updates. Discussion: https://postgr.es/m/3266414.1633045628@sss.pgh.pa.us	2021-10-02 16:06:23 -04:00
Alvaro Herrera	170206e458	Error out if SKIP LOCKED and WITH TIES are both specified Both bugs #16676[1] and #17141[2] illustrate that the combination of SKIP LOCKED and FETCH FIRST WITH TIES break expectations when it comes to rows returned to other sessions accessing the same row. Since this situation is detectable from the syntax and hard to fix otherwise, forbid for now, with the potential to fix in the future. [1] https://postgr.es/m/16676-fd62c3c835880da6@postgresql.org [2] https://postgr.es/m/17141-913d78b9675aac8e@postgresql.org Backpatch-through: 13, where WITH TIES was introduced Author: David Christensen <david.christensen@crunchydata.com> Discussion: https://postgr.es/m/CAOxo6XLPccCKru3xPMaYDpa+AXyPeWFs+SskrrL+HKwDjJnLhg@mail.gmail.com	2021-10-01 18:29:18 -03:00
Tom Lane	7adbe186f7	Avoid believing incomplete MCV-only stats in get_variable_range(). get_variable_range() would incautiously believe that statistics containing only an MCV list are sufficient to derive a range estimate. That's okay for an enum-like column that contains only MCVs, but otherwise the estimate could be pretty bad. Make it report that the range is indeterminate unless the MCVs plus nullfrac account for the whole table. I don't think this needs a dedicated test case, since a quick code coverage check verifies that the existing regression tests traverse all the alternatives. There is room to doubt that a future-proof test case could be built anyway, given that the submitted example accidentally doesn't fail before v11. Per bug #17207 from Simon Perepelitsa. Back-patch to v10. In principle this has been broken all along, but I'm hesitant to make such changes in 9.6, since if anyone is unhappy with 9.6.24's behavior there will be no second chance to fix it. Discussion: https://postgr.es/m/17207-5265aefa79e333b4@postgresql.org	2021-10-01 14:59:35 -04:00
Tom Lane	04ef2021e3	Fix Portal snapshot tracking to handle subtransactions properly. Commit `84f5c2908` forgot to consider the possibility that EnsurePortalSnapshotExists could run inside a subtransaction with lifespan shorter than the Portal's. In that case, the new active snapshot would be popped at the end of the subtransaction, leaving a dangling pointer in the Portal, with mayhem ensuing. To fix, make sure the ActiveSnapshot stack entry is marked with the same subtransaction nesting level as the associated Portal. It's certainly safe to do so since we won't be here at all unless the stack is empty; hence we can't create an out-of-order stack. Let's also apply this logic in the case where PortalRunUtility sets portalSnapshot, just to be sure that path can't cause similar problems. It's slightly less clear that that path can't create an out-of-order stack, so add an assertion guarding it. Report and patch by Bertrand Drouvot (with kibitzing by me). Back-patch to v11, like the previous commit. Discussion: https://postgr.es/m/ff82b8c5-77f4-3fe7-6028-fcf3303e82dd@amazon.com	2021-10-01 11:10:12 -04:00
Tom Lane	649e561f65	Remove gratuitous environment dependency in 002_types.pl test. Computing related timestamps by subtracting "N days" is sensitive to the prevailing timezone, since we interpret that as "same local time on the N'th prior day". Even though the intervals in question are only two to four days, through remarkable bad luck they managed to cross the end of Ramadan in 2014, causing the test's output to change if timezone is set to Africa/Casablanca. (Maybe in other Muslim areas as well; I didn't check.) There's absolutely no reason for this test to exercise interval subtraction, so just get rid of that and use plain timestamptz constants representing the intended values. Per report from Andres Freund. Back-patch to v10 where this test script came in. Discussion: https://postgr.es/m/20210930183641.7lh4jhvpipvromca@alap3.anarazel.de	2021-09-30 16:23:26 -04:00
Alvaro Herrera	1d97d3d086	Fix WAL replay in presence of an incomplete record Physical replication always ships WAL segment files to replicas once they are complete. This is a problem if one WAL record is split across a segment boundary and the primary server crashes before writing down the segment with the next portion of the WAL record: WAL writing after crash recovery would happily resume at the point where the broken record started, overwriting that record ... but any standby or backup may have already received a copy of that segment, and they are not rewinding. This causes standbys to stop following the primary after the latter crashes: LOG: invalid contrecord length 7262 at A8/D9FFFBC8 because the standby is still trying to read the continuation record (contrecord) for the original long WAL record, but it is not there and it will never be. A workaround is to stop the replica, delete the WAL file, and restart it -- at which point a fresh copy is brought over from the primary. But that's pretty labor intensive, and I bet many users would just give up and re-clone the standby instead. A fix for this problem was already attempted in commit `515e3d84a0`, but it only addressed the case for the scenario of WAL archiving, so streaming replication would still be a problem (as well as other things such as taking a filesystem-level backup while the server is down after having crashed), and it had performance scalability problems too; so it had to be reverted. This commit fixes the problem using an approach suggested by Andres Freund, whereby the initial portion(s) of the split-up WAL record are kept, and a special type of WAL record is written where the contrecord was lost, so that WAL replay in the replica knows to skip the broken parts. With this approach, we can continue to stream/archive segment files as soon as they are complete, and replay of the broken records will proceed across the crash point without a hitch. Because a new type of WAL record is added, users should be careful to upgrade standbys first, primaries later. Otherwise they risk the standby being unable to start if the primary happens to write such a record. A new TAP test that exercises this is added, but the portability of it is yet to be seen. This has been wrong since the introduction of physical replication, so backpatch all the way back. In stable branches, keep the new XLogReaderState members at the end of the struct, to avoid an ABI break. Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Reviewed-by: Nathan Bossart <bossartn@amazon.com> Discussion: https://postgr.es/m/202108232252.dh7uxf6oxwcy@alvherre.pgsql	2021-09-29 11:21:51 -03:00
Fujii Masao	8cf4f71185	pgbench: Fix handling of socket errors during benchmark. Previously socket errors such as invalid socket or socket wait method failures during benchmark caused pgbench to exit with status 0. Instead, errors during the run should result in exit status 2. Back-patch to v12 where pgbench started reporting exit status. Original complaint and patch by Hayato Kuroda. Author: Yugo Nagata, Fabien COELHO Reviewed-by: Kyotaro Horiguchi, Fujii Masao Discussion: https://postgr.es/m/TYCPR01MB5870057375ACA8A73099C649F5349@TYCPR01MB5870.jpnprd01.prod.outlook.com	2021-09-29 21:49:36 +09:00
Fujii Masao	3cc85d7d53	pgbench: Correct log level of message output when socket wait method fails. The failure of socket wait method like "select()" doesn't terminate pgbench. So the log level of error message when that failure happens should be ERROR. But previously FATAL was used in that case. Back-patch to v13 where pgbench started using common logging API. Author: Yugo Nagata, Fabien COELHO Reviewed-by: Kyotaro Horiguchi, Fujii Masao Discussion: https://postgr.es/m/20210617005934.8bd37bf72efd5f1b38e6f482@sraoss.co.jp	2021-09-29 21:47:31 +09:00
Alexander Korotkov	fd22aec631	Split macros from visibilitymap.h into a separate header That allows to include just visibilitymapdefs.h from file.c, and in turn, remove include of postgres.h from relcache.h. Reported-by: Andres Freund Discussion: https://postgr.es/m/20210913232614.czafiubr435l6egi%40alap3.anarazel.de Author: Alexander Korotkov Reviewed-by: Andres Freund, Tom Lane, Alvaro Herrera Backpatch-through: 13	2021-09-23 20:12:25 +03:00
Tomas Vondra	c0386f403a	Release memory allocated by dependency_degree Calculating degree of a functional dependency may allocate a lot of memory - we have released mot of the explicitly allocated memory, but e.g. detoasted varlena values were left behind. That may be an issue, because we consider a lot of dependencies (all combinations), and the detoasting may happen for each one again. Fixed by calling dependency_degree() in a dedicated context, and resetting it after each call. We only need the calculated dependency degree, so we don't need to copy anything. Backpatch to PostgreSQL 10, where extended statistics were introduced. Backpatch-through: 10 Discussion: https://www.postgresql.org/message-id/20210915200928.GP831%40telsasoft.com	2021-09-23 18:34:01 +02:00
Tomas Vondra	b564eb0181	Free memory after building each statistics object Until now, all extended statistics on a given relation were built in the same memory context, without resetting. Some of the memory was released explicitly, but not all of it - for example memory allocated while detoasting values is hard to free. This is how it worked since extended statistics were introduced in PostgreSQL 10, but adding support for extended stats on expressions made the issue somewhat worse as it increases the number of statistics to build. Fixed by adding a memory context which gets reset after building each statistics object (all the statistics kinds included in it). Resetting it after building each statistics kind would be even better, but it would require more invasive changes and copying of results, making it harder to backpatch. Backpatch to PostgreSQL 10, where extended statistics were introduced. Author: Justin Pryzby Reported-by: Justin Pryzby Reviewed-by: Tomas Vondra Backpatch-through: 10 Discussion: https://www.postgresql.org/message-id/20210915200928.GP831%40telsasoft.com	2021-09-23 18:33:59 +02:00
Amit Kapila	f09a81f1c0	Invalidate all partitions for a partitioned table in publication. Updates/Deletes on a partition were allowed even without replica identity after the parent table was added to a publication. This would later lead to an error on subscribers. The reason was that we were not invalidating the partition's relcache and the publication information for partitions was not getting rebuilt. Similarly, we were not invalidating the partitions' relcache after dropping a partitioned table from a publication which will prohibit Updates/Deletes on its partition without replica identity even without any publication. Reported-by: Haiying Tang Author: Hou Zhijie and Vignesh C Reviewed-by: Vignesh C and Amit Kapila Backpatch-through: 13 Discussion: https://postgr.es/m/OS0PR01MB6113D77F583C922F1CEAA1C3FBD29@OS0PR01MB6113.jpnprd01.prod.outlook.com	2021-09-22 08:24:20 +05:30
Michael Paquier	583e15af99	Fix places in TestLib.pm in need of adaptation to the output of Msys perl Contrary to the output of native perl, Msys perl generates outputs with CRLFs characters. There are already places in the TAP code where CRLFs (\r\n) are automatically converted to LF (\n) on Msys, but we missed a couple of places when running commands and using their output for comparison, that would lead to failures. This problem has been found thanks to the test added in `5adb067` using TestLib::command_checks_all(), but after a closer look more code paths were missing a filter. This is backpatched all the way down to prevent any surprises if a new test is introduced in stable branches. Reviewed-by: Andrew Dunstan, Álvaro Herrera Discussion: https://postgr.es/m/1252480.1631829409@sss.pgh.pa.us Backpatch-through: 9.6	2021-09-22 08:43:10 +09:00
Tom Lane	5f0a073cbb	Fix misevaluation of STABLE parameters in CALL within plpgsql. Before commit `84f5c2908`, a STABLE function in a plpgsql CALL statement's argument list would see an up-to-date snapshot, because exec_stmt_call would push a new snapshot. I got rid of that because the possibility of the snapshot disappearing within COMMIT made it too hard to manage a snapshot across the CALL statement. That's fine so far as the procedure itself goes, but I forgot to think about the possibility of STABLE functions within the CALL argument list. As things now stand, those'll be executed with the Portal's snapshot as ActiveSnapshot, keeping them from seeing updates more recent than Portal startup. (VOLATILE functions don't have a problem because they take their own snapshots; which indeed is also why the procedure itself doesn't have a problem. There are no STABLE procedures.) We can fix this by pushing a new snapshot transiently within ExecuteCallStmt itself. Popping the snapshot before we get into the procedure proper eliminates the management problem. The possibly-useless extra snapshot-grab is slightly annoying, but it's no worse than what happened before `84f5c2908`. Per bug #17199 from Alexander Nawratil. Back-patch to v11, like the previous patch. Discussion: https://postgr.es/m/17199-1ab2561f0d94af92@postgresql.org	2021-09-21 19:06:33 -04:00
Peter Geoghegan	a1708ab652	Remove overzealous index deletion assertion. A broken HOT chain is not an unexpected condition, even when the offset number points past the end of the page's line pointer array. heap_prune_chain() does not (and never has) treated this condition as unexpected, so derivative code in heap_index_delete_tuples() shouldn't do so either. Oversight in commit `4228817449`. The assertion can probably only fail on Postgres 14 and master. Earlier releases don't have commit `3c3b8a4b`, which taught VACUUM to truncate the line pointer array of heap pages. Backpatch all the same, just to be consistent. Author: Peter Geoghegan <pg@bowt.ie> Reported-By: Alexander Lakhin <exclusion@gmail.com> Discussion: https://postgr.es/m/17197-9438f31f46705182@postgresql.org Backpatch: 12-, just like commit `4228817449`.	2021-09-20 14:26:22 -07:00
Tom Lane	dede143997	Don't elide casting to typmod -1. Casting a value that's already of a type with a specific typmod to an unspecified typmod doesn't do anything so far as run-time behavior is concerned. However, it really ought to change the exposed type of the expression to match. Up to now, coerce_type_typmod hasn't bothered with that, which creates gotchas in contexts such as recursive unions. If for example one side of the union is numeric(18,3), but it needs to be plain numeric to match the other side, there's no direct way to express that. This is easy enough to fix, by inserting a RelabelType to update the exposed type of the expression. However, it's a bit nervous-making to change this behavior, because it's stood for a really long time. But no complaints have emerged about 14beta3, so go ahead and back-patch. Back-patch of `5c056b0c2` into previous supported branches. Discussion: https://postgr.es/m/CABNQVagu3bZGqiTjb31a8D5Od3fUMs7Oh3gmZMQZVHZ=uWWWfQ@mail.gmail.com Discussion: https://postgr.es/m/1488389.1631984807@sss.pgh.pa.us	2021-09-20 11:48:52 -04:00
Tom Lane	e0b0d1eab4	Fix pull_varnos to cope with translated PlaceHolderVars. Commit `55dc86eca` changed pull_varnos to use (if possible) the associated ph_eval_at for a PlaceHolderVar. I missed a fine point though: we might be looking at a PHV in the quals or tlist of a child appendrel, in which case we need to compute a ph_eval_at value that's been translated in the same way that the PHV itself has been (cf. adjust_appendrel_attrs). Fortunately, enough info is available in the PlaceHolderInfo to make such translation possible without additional outside data, so we don't need another round of uglification of planner APIs. This is a little bit complicated, but since it's a hard-to-hit corner case, I'm not much worried about adding cycles here. Per report from Jaime Casanova. Back-patch to v12, like the previous commit. Discussion: https://postgr.es/m/20210915230959.GB17635@ahch-to	2021-09-17 15:41:16 -04:00
Fujii Masao	8767a86fa2	Fix variable shadowing in procarray.c. ProcArrayGroupClearXid function has a parameter named "proc", but the same name was used for its local variables. This commit fixes this variable shadowing, to improve code readability. Back-patch to all supported versions, to make future back-patching easy though this patch is classified as refactoring only. Reported-by: Ranier Vilela Author: Ranier Vilela, Aleksander Alekseev https://postgr.es/m/CAEudQAqyoTZC670xWi6w-Oe2_Bk1bfu2JzXz6xRfiOUzm7xbyQ@mail.gmail.com	2021-09-16 13:07:29 +09:00
Tom Lane	e06cc024bd	Disallow LISTEN in background workers. It's possible to execute user-defined SQL in some background processes; for example, logical replication workers can fire triggers. This opens the possibility that someone would try to execute LISTEN in such a context. But since only regular backends ever call ProcessNotifyInterrupt, no messages would actually be received, and thus the registered listener would simply prevent the message queue from being cleaned. Eventually NOTIFY would stop working, which is bad. Perhaps someday somebody will invent infrastructure to make listening in a background worker actually useful. In the meantime, forbid it. Back-patch to v13, which is where we introduced the MyBackendType variable. It'd be a lot harder to implement the check without that, and it doesn't seem worth the trouble. Discussion: https://postgr.es/m/153243441449.1404.2274116228506175596@wrigleys.postgresql.org	2021-09-15 12:31:56 -04:00
Tom Lane	63f28776cb	Send NOTIFY signals during CommitTransaction. Formerly, we sent signals for outgoing NOTIFY messages within ProcessCompletedNotifies, which was also responsible for sending relevant ones of those messages to our connected client. It therefore had to run during the main-loop processing that occurs just before going idle. This arrangement had two big disadvantages: * Now that procedures allow intra-command COMMITs, it would be useful to send NOTIFYs to other sessions immediately at COMMIT (though, for reasons of wire-protocol stability, we still shouldn't forward them to our client until end of command). * Background processes such as replication workers would not send NOTIFYs at all, since they never execute the client communication loop. We've had requests to allow triggers running in replication workers to send NOTIFYs, so that's a problem. To fix these things, move transmission of outgoing NOTIFY signals into AtCommit_Notify, where it will happen during CommitTransaction. Also move the possible call of asyncQueueAdvanceTail there, to ensure we don't bloat the async SLRU if a background worker sends many NOTIFYs with no one listening. We can also drop the call of asyncQueueReadAllNotifications, allowing ProcessCompletedNotifies to go away entirely. That's because commit `790026972` added a call of ProcessNotifyInterrupt adjacent to PostgresMain's call of ProcessCompletedNotifies, and that does its own call of asyncQueueReadAllNotifications, meaning that we were uselessly doing two such calls (inside two separate transactions) whenever inbound notify signals coincided with an outbound notify. We need only set notifyInterruptPending to ensure that ProcessNotifyInterrupt runs, and we're done. The existing documentation suggests that custom background workers should call ProcessCompletedNotifies if they want to send NOTIFY messages. To avoid an ABI break in the back branches, reduce it to an empty routine rather than removing it entirely. Removal will occur in v15. Although the problems mentioned above have existed for awhile, I don't feel comfortable back-patching this any further than v13. There was quite a bit of churn in adjacent code between 12 and 13. At minimum we'd have to also backpatch `51004c717`, and a good deal of other adjustment would also be needed, so the benefit-to-risk ratio doesn't look attractive. Per bug #15293 from Michael Powers (and similar gripes from others). Artur Zakirov and Tom Lane Discussion: https://postgr.es/m/153243441449.1404.2274116228506175596@wrigleys.postgresql.org	2021-09-14 17:18:25 -04:00
Andres Freund	c49e6f9d97	jit: Do not try to shut down LLVM state in case of LLVM triggered errors. If an allocation failed within LLVM it is not safe to call back into LLVM as LLVM is not generally safe against exceptions / stack-unwinding. Thus errors while in LLVM code are promoted to FATAL. However llvm_shutdown() did call back into LLVM even in such cases, while llvm_release_context() was careful not to do so. We cannot generally skip shutting down LLVM, as that can break profiling. But it's OK to do so if there was an error from within LLVM. Reported-By: Jelte Fennema <Jelte.Fennema@microsoft.com> Author: Andres Freund <andres@anarazel.de> Author: Justin Pryzby <pryzby@telsasoft.com> Discussion: https://postgr.es/m/AM5PR83MB0178C52CCA0A8DEA0207DC14F7FF9@AM5PR83MB0178.EURPRD83.prod.outlook.com Backpatch: 11-, where jit was introduced	2021-09-13 18:26:18 -07:00
Tom Lane	745abdd951	Fix EXIT out of outermost block in plpgsql. Ordinarily, using EXIT this way would draw "control reached end of function without RETURN". However, if the function is one where we don't require an explicit RETURN (such as a DO block), that should not happen. It did anyway, because add_dummy_return() neglected to account for the case. Per report from Herwig Goemans. Back-patch to all supported branches. Discussion: https://postgr.es/m/868ae948-e3ca-c7ec-95a6-83cfc08ef750@gmail.com	2021-09-13 12:42:03 -04:00
Amit Kapila	58cf794ca6	Fix reorder buffer memory accounting for toast changes. While processing toast changes in logical decoding, we rejigger the tuple change to point to in-memory toast tuples instead to on-disk toast tuples. And, to make sure the memory accounting is correct, we were subtracting the old change size and then after re-computing the new tuple, re-adding its size at the end. Now, if there is any error before we add the new size, we will release the changes and that will update the accounting info (subtracting the size from the counters). And we were underflowing there which leads to an assertion failure in assert enabled builds and wrong memory accounting in reorder buffer otherwise. Author: Bertrand Drouvot Reviewed-by: Amit Kapila Backpatch-through: 13, where memory accounting was introduced Discussion: https://postgr.es/m/92b0ee65-b8bd-e42d-c082-4f3f4bf12d34@amazon.com	2021-09-13 10:46:58 +05:30
Michael Paquier	b589d212fd	Fix error handling with threads on OOM in ECPG connection logic An out-of-memory failure happening when allocating the structures to store the connection parameter keywords and values would mess up with the set of connections saved, as on failure the pthread mutex would still be hold with the new connection object listed but free()'d. Rather than just unlocking the mutex, which would leave the static list of connections into an inconsistent state, move the allocation for the structures of the connection parameters before beginning the test manipulation. This ensures that the list of connections and the connection mutex remain consistent all the time in this code path. This error is unlikely going to happen, but this could mess up badly with ECPG clients in surprising ways, so backpatch all the way down. Reported-by: ryancaicse Discussion: https://postgr.es/m/17186-b4cfd8f0eb4d1dee@postgresql.org Backpatch-through: 9.6	2021-09-13 13:24:20 +09:00
Tom Lane	7e420072ea	Make pg_regexec() robust against out-of-range search_start. If search_start is greater than the length of the string, we should just return REG_NOMATCH immediately. (Note that the equality case should not be rejected, since the pattern might be able to match zero characters.) This guards various internal assumptions that the min of a range of string positions is not more than the max. Violation of those assumptions could allow an attempt to fetch string[search_start-1], possibly causing a crash. Jaime Casanova pointed out that this situation is reachable with the new regexp_xxx functions that accept a user-specified start position. I don't believe it's reachable via any in-core call site in v14 and below. However, extensions could possibly call pg_regexec with an out-of-range search_start, so let's back-patch the fix anyway. Discussion: https://postgr.es/m/20210911180357.GA6870@ahch-to	2021-09-11 15:19:49 -04:00
Tom Lane	fa5d0415f1	Fix some anomalies with NO SCROLL cursors. We have long forbidden fetching backwards from a NO SCROLL cursor, but the prohibition didn't extend to cases in which we rewind the query altogether and then re-fetch forwards. I think the reason is that this logic was mainly meant to protect plan nodes that can't be run in the reverse direction. However, re-reading the query output is problematic if the query is volatile (which includes SELECT FOR UPDATE, not just queries with volatile functions): the re-read can produce different results, which confuses the cursor navigation logic completely. Another reason for disliking this approach is that some code paths will either fetch backwards or rewind-and-fetch-forwards depending on the distance to the target row; so that seemingly identical use-cases may or may not draw the "cursor can only scan forward" error. Hence, let's clean things up by disallowing rewind as well as fetch-backwards in a NO SCROLL cursor. Ordinarily we'd only make such a definitional change in HEAD, but there is a third reason to consider this change now. Commit `ba2c6d6ce` created some new user-visible anomalies for non-scrollable cursors WITH HOLD, in that navigation in the cursor result got confused if the cursor had been partially read before committing. The only good way to resolve those anomalies is to forbid rewinding such a cursor, which allows removal of the incorrect cursor state manipulations that `ba2c6d6ce` added to PersistHoldablePortal. To minimize the behavioral change in the back branches (including v14), refuse to rewind a NO SCROLL cursor only when it has a holdStore, ie has been held over from a previous transaction due to WITH HOLD. This should avoid breaking most applications that have been sloppy about whether to declare cursors as scrollable. We'll enforce the prohibition across-the-board beginning in v15. Back-patch to v11, as `ba2c6d6ce` was. Discussion: https://postgr.es/m/3712911.1631207435@sss.pgh.pa.us	2021-09-10 13:18:32 -04:00
Tom Lane	d8d93bc8ba	Avoid fetching from an already-terminated plan. Some plan node types don't react well to being called again after they've already returned NULL. PortalRunSelect() has long dealt with this by calling the executor with NoMovementScanDirection if it sees that we've already run the portal to the end. However, commit `ba2c6d6ce` overlooked this point, so that persisting an already-fully-fetched cursor would fail if it had such a plan. Per report from Tomas Barton. Back-patch to v11, as the faulty commit was. (I've omitted a test case because the type of plan that causes a problem isn't all that stable.) Discussion: https://postgr.es/m/CAPV2KRjd=ErgVGbvO2Ty20tKTEZZr6cYsYLxgN_W3eAo9pf5sw@mail.gmail.com	2021-09-09 13:36:31 -04:00
Tom Lane	04118de78f	Check for relation length overrun soon enough. We don't allow relations to exceed 2^32-1 blocks, because block numbers are 32 bits and the last possible block number is reserved to mean InvalidBlockNumber. There is a check for this in mdextend, but that's really way too late, because the smgr API requires us to create a buffer for the block-to-be-added, and we do not want to have any buffer with blocknum InvalidBlockNumber. (Such a case can trigger assertions in bufmgr.c, plus I think it might confuse ReadBuffer's logic for data-past-EOF later on.) So put the check into ReadBuffer. Per report from Christoph Berg. It's been like this forever, so back-patch to all supported branches. Discussion: https://postgr.es/m/YTn1iTkUYBZfcODk@msg.credativ.de	2021-09-09 11:45:48 -04:00
Fujii Masao	dd9b3fced8	Fix issue with WAL archiving in standby. Previously, walreceiver always closed the currently-opened WAL segment and created its archive notification file, after it finished writing the current segment up and received any WAL data that should be written into the next segment. If walreceiver exited just before any WAL data in the next segment arrived at standby, it did not create the archive notification file of the current segment even though that's known completed. This behavior could cause WAL archiving of the segment to be delayed until subsequent restartpoints or checkpoints created its notification file. To fix the issue, this commit changes walreceiver so that it creates an archive notification file of a current WAL segment immediately if that's known completed before receiving next WAL data. Back-patch to all supported branches. Reported-by: Kyotaro Horiguchi Author: Fujii Masao Reviewed-by: Kyotaro Horiguchi Discussion: https://postgr.es/m/20200630.165503.1465894182551545886.horikyota.ntt@gmail.com	2021-09-09 23:58:26 +09:00
Tom Lane	da7d81dd05	Avoid useless malloc/free traffic around getFormattedTypeName(). Coverity complained that one caller of getFormattedTypeName() failed to free the returned string. Which is true, but rather than fixing that one, let's get rid of this tedious and error-prone requirement. Now that getFormattedTypeName() caches its result, strdup'ing that result and expecting the caller to free it accomplishes little except to waste cycles. We do create a leak in the case where getTypes didn't make a TypeInfo for the type, but that basically shouldn't ever happen. Back-patch, as commit `6c450a861` was. This isn't a particularly interesting bug fix, but the API change seems like a hazard for future back-patching activity if we don't back-patch it.	2021-09-08 15:09:42 -04:00
Tom Lane	cbba6ba3a0	Fix rewriter to set hasModifyingCTE correctly on rewritten queries. If we copy data-modifying CTEs from the original query to a replacement query (from a DO INSTEAD rule), we must set hasModifyingCTE properly in the replacement query. Failure to do this can cause various unpleasantness, such as unsafe usage of parallel plans. The code also neglected to propagate hasRecursive, though that's only cosmetic at the moment. A difficulty arises if the rule action is an INSERT...SELECT. We attach the original query's RTEs and CTEs to the sub-SELECT Query, but data-modifying CTEs are only allowed to appear in the topmost Query. For the moment, throw an error in such cases. It would probably be possible to avoid this error by attaching the CTEs to the top INSERT Query instead; but that would require a bunch of new code to adjust ctelevelsup references. Given the narrowness of the use-case, and the need to back-patch this fix, it does not seem worth the trouble for now. We can revisit this if we get field complaints. Per report from Greg Nancarrow. Back-patch to all supported branches. (The test case added here does not fail before v10, but there are plenty of places checking top-level hasModifyingCTE in 9.6, so I have no doubt that this code change is necessary there too.) Greg Nancarrow and Tom Lane Discussion: https://postgr.es/m/CAJcOf-f68DT=26YAMz_i0+Au3TcLO5oiHY5=fL6Sfuits6r+_w@mail.gmail.com Discussion: https://postgr.es/m/CAJcOf-fAdj=nDKMsRhQzndm-O13NY4dL6xGcEvdX5Xvbbi0V7g@mail.gmail.com	2021-09-08 12:05:43 -04:00
Amit Kapila	ddfc7299d0	Invalidate relcache for publications defined for all tables. Updates/Deletes on a relation were allowed even without replica identity after we define the publication for all tables. This would later lead to an error on subscribers. The reason was that for such publications we were not invalidating the relcache and the publication information for relations was not getting rebuilt. Similarly, we were not invalidating the relcache after dropping of such publications which will prohibit Updates/Deletes without replica identity even without any publication. Author: Vignesh C and Hou Zhijie Reviewed-by: Hou Zhijie, Kyotaro Horiguchi, Amit Kapila Backpatch-through: 10, where it was introduced Discussion: https://postgr.es/m/CALDaNm0pF6zeWqCA8TCe2sDuwFAy8fCqba=nHampCKag-qLixg@mail.gmail.com	2021-09-08 12:14:59 +05:30
Noah Misch	aae398a87c	AIX: Fix missing libpq symbols by respecting SHLIB_EXPORTS. We make each AIX shared library export all globals found in .o files that originate in the library. That doesn't include symbols acquired by -lpgcommon_shlib. That is good on average, but it became a problem for libpq when commit `e6afa8918c` moved five official libpq API symbols into src/common. Fix this by implementing the SHLIB_EXPORTS mechanism for AIX, so affected libraries export the same symbols that they export on Linux. This reintroduces symbols pg_encoding_to_char, pg_utf_mblen, pg_char_to_encoding, pg_valid_server_encoding, and pg_valid_server_encoding_id. Back-patch to v13, where the aforementioned commit first appeared. While a minor release is usually the wrong time to add or remove symbol exports in libpq or libecpg, we should expect users to want each documented symbol. Tony Reix Discussion: https://postgr.es/m/PR3PR02MB6396742E2FC3E77D37A920BC86C79@PR3PR02MB6396.eurprd02.prod.outlook.com	2021-09-06 11:28:02 -07:00
Tom Lane	d8a266c5e1	Fix bogus timetz_zone() results for DYNTZ abbreviations. timetz_zone() delivered completely wrong answers if the zone was specified by a dynamic TZ abbreviation, because it failed to account for the difference between the POSIX conventions for field values in struct pg_tm and the conventions used in PG-specific datetime code. As a stopgap fix, just adjust the tm_year and tm_mon fields to match PG conventions. This is fixed in a different way in HEAD (`388e71af8`) but I don't want to back-patch the change of reference point. Discussion: https://postgr.es/m/CAJ7c6TOMG8zSNEZtCn5SPe+cCk3Lfxb71ZaQwT2F4T7PJ_t=KA@mail.gmail.com	2021-09-06 11:29:52 -04:00
Peter Eisentraut	9f9ae019d1	Fix pkg-config files for static linking Since `ea53100d5` (PostgreSQL 12), the shipped pkg-config files have been broken for statically linking libpq because libpgcommon and libpgport are missing. This patch adds those two missing private dependencies (in a non-hardcoded way). Reported-by: Filip Gospodinov <f@gospodinov.ch> Discussion: https://www.postgresql.org/message-id/flat/c7108bde-e051-11d5-a234-99beec01ce2a@gospodinov.ch	2021-09-06 09:43:05 +02:00
Tom Lane	2c0dd669c3	Further portability tweaks for float4/float8 hash functions. Attempting to make hashfloat4() look as much as possible like hashfloat8(), I'd figured I could replace NaNs with get_float4_nan() before widening to float8. However, results from protosciurus and topminnow show that on some platforms that produces a different bit-pattern from get_float8_nan(), breaking the intent of `ce773f230`. Rearrange so that we use the result of get_float8_nan() for all NaN cases. As before, back-patch.	2021-09-04 16:29:08 -04:00
Alvaro Herrera	518621c40b	Revert "Avoid creating archive status ".ready" files too early" This reverts commit `515e3d84a0` and equivalent commits in back branches. This solution to the problem has a number of problems, so we'll try again with a different approach. Per note from Andres Freund Discussion: https://postgr.es/m/20210831042949.52eqp5xwbxgrfank@alap3.anarazel.de	2021-09-04 12:14:30 -04:00
Tom Lane	742b30caee	Remove arbitrary MAXPGPATH limit on command lengths in pg_ctl. Replace fixed-length command buffers with psprintf() calls. We didn't have anything as convenient as psprintf() when this code was written, but now that we do, there's little reason for the limitation to stand. Removing it eliminates some corner cases where (for example) starting the postmaster with a whole lot of options fails. Most individual file names that pg_ctl deals with are still restricted to MAXPGPATH, but we've seldom had complaints about that limitation so long as it only applies to one filename. Back-patch to all supported branches. Phil Krylov Discussion: https://postgr.es/m/567e199c6b97ee19deee600311515b86@krylov.eu	2021-09-03 21:04:44 -04:00
Tom Lane	132be60006	Disallow creating an ICU collation if the DB encoding won't support it. Previously this was allowed, but the collation effectively vanished into the ether because of the way lookup_collation() works: you could not use the collation, nor even drop it. Seems better to give an error up front than to leave the user wondering why it doesn't work. (Because this test is in DefineCollation not CreateCollation, it does not prevent pg_import_system_collations from creating ICU collations, regardless of the initially-chosen encoding.) Per bug #17170 from Andrew Bille. Back-patch to v10 where ICU support was added. Discussion: https://postgr.es/m/17170-95845cf3f0a9c36d@postgresql.org	2021-09-03 16:38:55 -04:00
Tom Lane	9089f1543e	Fix portability issue in tests from commit `ce773f230`. Modern POSIX seems to require strtod() to accept "-NaN", but there's nothing about NaN in SUSv2, and some of our oldest buildfarm members don't like it. Let's try writing it as -'NaN' instead; that seems to produce the same result, at least on Intel hardware. Per buildfarm.	2021-09-03 10:01:02 -04:00
Tom Lane	be2beadaff	Fix float4/float8 hash functions to produce uniform results for NaNs. The IEEE 754 standard allows a wide variety of bit patterns for NaNs, of which at least two ("NaN" and "-NaN") are pretty easy to produce from SQL on most machines. This is problematic because our btree comparison functions deem all NaNs to be equal, but our float hash functions know nothing about NaNs and will happily produce varying hash codes for them. That causes unexpected results from queries that hash a column containing different NaN values. It could also produce unexpected lookup failures when using a hash index on a float column, i.e. "WHERE x = 'NaN'" will not find all the rows it should. To fix, special-case NaN in the float hash functions, not too much unlike the existing special case that forces zero and minus zero to hash the same. I arranged for the most vanilla sort of NaN (that coming from the C99 NAN constant) to still have the same hash code as before, to reduce the risk to existing hash indexes. I dithered about whether to back-patch this into stable branches, but ultimately decided to do so. It's a clear improvement for queries that hash internally. If there is anybody who has -NaN in a hash index, they'd be well advised to re-index after applying this patch ... but the misbehavior if they don't will not be much worse than the misbehavior they had before. Per bug #17172 from Ma Liangzhu. Discussion: https://postgr.es/m/17172-7505bea9e04e230f@postgresql.org	2021-09-02 17:24:42 -04:00
Amit Kapila	b51985d8a0	Fix the random test failure in 001_rep_changes. The check to test whether the subscription workers were restarting after a change in the subscription was failing. The reason was that the test was assuming the walsender started before it reaches the 'streaming' state and the walsender was exiting due to an error before that. Now, the walsender was erroring out before reaching the 'streaming' state because it tries to acquire the slot before the previous walsender has exited. In passing, improve the die messages so that it is easier to investigate the failures in the future if any. Reported-by: Michael Paquier, as per buildfarm Author: Ajin Cherian Reviewed-by: Masahiko Sawada, Amit Kapila Backpatch-through: 10, where this test was introduced Discussion: https://postgr.es/m/YRnhFxa9bo73wfpV@paquier.xyz	2021-09-01 09:16:35 +05:30
Tom Lane	db11b4a3db	In pg_dump, avoid doing per-table queries for RLS policies. For no particularly good reason, getPolicies() queried pg_policy separately for each table. We can collect all the policies in a single query instead, and attach them to the correct TableInfo objects using findTableByOid() lookups. On the regression database, this reduces the number of queries substantially, and provides a visible savings even when running against a local server. Per complaint from Hubert Depesz Lubaczewski. Since this is such a simple fix and can have a visible performance benefit, back-patch to all supported branches. Discussion: https://postgr.es/m/20210826084430.GA26282@depesz.com	2021-08-31 15:04:05 -04:00
Tom Lane	904ce45bfa	Cache the results of format_type() queries in pg_dump. There's long been a "TODO: there might be some value in caching the results" annotation on pg_dump's getFormattedTypeName function; but we hadn't gotten around to checking what it was costing us to repetitively look up type names. It turns out that when dumping the current regression database, about 10% of the total number of queries issued are duplicative format_type() queries. However, Hubert Depesz Lubaczewski reported a not-unusual case where these account for over half of the queries issued by pg_dump. Individually these queries aren't expensive, but when network lag is a factor, they add up to a problem. We can very easily add some caching to getFormattedTypeName to solve it. Since this is such a simple fix and can have a visible performance benefit, back-patch to all supported branches. Discussion: https://postgr.es/m/20210826084430.GA26282@depesz.com	2021-08-31 13:53:50 -04:00
Tomas Vondra	c8213aa949	Rename the role in stats_ext to have regress_ prefix Commit `5be8ce82e8` added a new role to the stats_ext regression suite, but the role name did not start with regress_ causing failures when running with ENFORCE_REGRESSION_TEST_NAME_RESTRICTIONS. Fixed by renaming the role to start with the expected regress_ prefix. Backpatch-through: 10, same as the new regression test Discussion: https://postgr.es/m/1F238937-7CC2-4703-A1B1-6DC225B8978A%40enterprisedb.com	2021-08-31 19:36:03 +02:00
Tomas Vondra	1fe1a04af8	Fix lookup error in extended stats ownership check When an ownership check on extended statistics object failed, the code was calling aclcheck_error_type to report the failure, which is clearly wrong, resulting in cache lookup errors. Fix by calling aclcheck_error. This issue exists since the introduction of extended statistics, so backpatch all the way back to PostgreSQL 10. It went unnoticed because there were no tests triggering the error, so add one. Reported-by: Mark Dilger Backpatch-through: 10, where extended stats were introduced Discussion: https://postgr.es/m/1F238937-7CC2-4703-A1B1-6DC225B8978A%40enterprisedb.com	2021-08-31 18:38:11 +02:00
Alvaro Herrera	6197d7b538	Report tuple address in data-corruption error message Most data-corruption reports mention the location of the problem, but this one failed to. Add it. Backpatch all the way back. In 12 and older, also assign the ERRCODE_DATA_CORRUPTED error code as was done in commit `fd6ec93bf8` for 13 and later. Discussion: https://postgr.es/m/202108191637.oqyzrdtnheir@alvherre.pgsql	2021-08-30 16:29:12 -04:00
Amit Kapila	8ba3bad4c3	Fix incorrect error code in StartupReplicationOrigin(). ERRCODE_CONFIGURATION_LIMIT_EXCEEDED was used for checksum failure, use ERRCODE_DATA_CORRUPTED instead. Reported-by: Tatsuhito Kasahara Author: Tatsuhito Kasahara Backpatch-through: 9.6, where it was introduced Discussion: https://postgr.es/m/CAP0=ZVLHtYffs8SOWcFJWrBGoRzT9QQbk+_aP+E5AHLNXiOorA@mail.gmail.com	2021-08-30 09:26:49 +05:30
Alvaro Herrera	9a33ed8fa1	psql \dP: reference regclass with "pg_catalog." prefix Strictly speaking this isn't a bug, but since all references to catalog objects are schema-qualified, we might as well be consistent. The omission first appeared in commit `1c5d9270e3`, so backpatch to 12. Author: Justin Pryzby <pryzbyj@telsasoft.com> Discussion: https://postgr.es/m/20210827193151.GN26465@telsasoft.com	2021-08-28 11:45:47 -04:00
Noah Misch	b18669f5e6	Fix data loss in wal_level=minimal crash recovery of CREATE TABLESPACE. If the system crashed between CREATE TABLESPACE and the next checkpoint, the result could be some files in the tablespace unexpectedly containing no rows. Affected files would be those for which the system did not write WAL; see the wal_skip_threshold documentation. Before v13, a different set of conditions governed the writing of WAL; see v12's <sect2 id="populate-pitr">. (The v12 conditions were broader in some ways and narrower in others.) Users may want to audit non-default tablespaces for unexpected short files. The bug could have truncated an index without affecting the associated table, and reindexing the index would fix that particular problem. This fixes the bug by making create_tablespace_directories() more like TablespaceCreateDbspace(). create_tablespace_directories() was recursively removing tablespace contents, reasoning that WAL redo would recreate everything removed that way. That assumption holds for other wal_level values. Under wal_level=minimal, the old approach could delete files for which no other copy existed. Back-patch to 9.6 (all supported versions). Reviewed by Robert Haas and Prabhat Sahu. Reported by Robert Haas. Discussion: https://postgr.es/m/CA+TgmoaLO9ncuwvr2nN-J4VEP5XyAcy=zKiHxQzBbFRxxGxm0w@mail.gmail.com	2021-08-27 23:33:27 -07:00
Tom Lane	dbb239d518	Count SP-GiST index scans in pg_stat statistics. Somehow, spgist overlooked the need to call pgstat_count_index_scan(). Hence, pg_stat_all_indexes.idx_scan and equivalent columns never became nonzero for an SP-GiST index, although the related per-tuple counters worked fine. This fix works a bit differently from other index AMs, in that the counter increment occurs in spgrescan not spggettuple/spggetbitmap. It looks like this won't make the user-visible semantics noticeably different, so I won't go to the trouble of introducing an is-this- the-first-call flag just to make the counter bumps happen in the same places. Per bug #17163 from Christian Quest. Back-patch to all supported versions. Discussion: https://postgr.es/m/17163-b8c5cc88322a5e92@postgresql.org	2021-08-27 19:42:42 -04:00
Robert Haas	bc062cb938	Fix broken snapshot handling in parallel workers. Pengchengliu reported an assertion failure in a parallel woker while performing a parallel scan using an overflowed snapshot. The proximate cause is that TransactionXmin was set to an incorrect value. The underlying cause is incorrect snapshot handling in parallel.c. In particular, InitializeParallelDSM() was unconditionally calling GetTransactionSnapshot(), because I (rhaas) mistakenly thought that was always retrieving an existing snapshot whereas, at isolation levels less than REPEATABLE READ, it's actually taking a new one. So instead do this only at higher isolation levels where there actually is a single snapshot for the whole transaction. By itself, this is not a sufficient fix, because we still need to guarantee that TransactionXmin gets set properly in the workers. The easiest way to do that seems to be to install the leader's active snapshot as the transaction snapshot if the leader did not serialize a transaction snapshot. This doesn't affect the results of future GetTrasnactionSnapshot() calls since those have to take a new snapshot anyway; what we care about is the side effect of setting TransactionXmin. Report by Pengchengliu. Patch by Greg Nancarrow, except for some comment text which I supplied. Discussion: https://postgr.es/m/002f01d748ac$eaa781a0$bff684e0$@tju.edu.cn	2021-08-25 08:40:52 -04:00
Amit Kapila	794025eff0	Fix toast rewrites in logical decoding. Commit `325f2ec555` introduced pg_class.relwrite to skip operations on tables created as part of a heap rewrite during DDL. It links such transient heaps to the original relation OID via this new field in pg_class but forgot to do anything about toast tables. So, logical decoding was not able to skip operations on internally created toast tables. This leads to an error when we tried to decode the WAL for the next operation for which it appeared that there is a toast data where actually it didn't have any toast data. To fix this, we set pg_class.relwrite for internally created toast tables as well which allowed skipping operations on them during logical decoding. Author: Bertrand Drouvot Reviewed-by: David Zhang, Amit Kapila Backpatch-through: 11, where it was introduced Discussion: https://postgr.es/m/b5146fb1-ad9e-7d6e-f980-98ed68744a7c@amazon.com	2021-08-25 09:23:27 +05:30
Fujii Masao	7d9026cbfd	Avoid using ambiguous word "positive" in error message. There are two identical error messages about valid value of modulus for hash partition, in PostgreSQL source code. Commit `0e1275fb07` improved only one of them so that ambiguous word "positive" was avoided there, and forgot to improve the other. This commit improves the other. Which would reduce translator burden. Back-pach to v11 where the error message exists. Author: Kyotaro Horiguchi Reviewed-by: Fujii Masao Discussion: https://postgr.es/m/20210819.170315.1413060634876301811.horikyota.ntt@gmail.com	2021-08-25 11:47:46 +09:00
Fujii Masao	81fa1bce27	Improve error message about valid value for distance in phrase operator. The distance in phrase operator must be an integer value between zero and MAXENTRYPOS inclusive. But previously the error message about its valid value included the information about its upper limit but not lower limit (i.e., zero). This commit improves the error message so that it also includes the information about its lower limit. Back-patch to v9.6 where full-text phrase search was supported. Author: Kyotaro Horiguchi Reviewed-by: Fujii Masao Discussion: https://postgr.es/m/20210819.170315.1413060634876301811.horikyota.ntt@gmail.com	2021-08-25 11:45:15 +09:00
Tom Lane	071146184a	Fix regexp misbehavior with capturing parens inside "{0}". Regexps like "(.){0}...\1" drew an "invalid backreference number". That's not unreasonable on its face, since the capture group will never be matched if it's iterated zero times. However, other engines such as Perl's don't complain about this, nor do we throw an error for related cases such as "(.)\|\1", even though that backref can never succeed either. Also, if the zero-iterations case happens at runtime rather than compile time --- say, "(x)*...\1" when there's no "x" to be found --- that's not an error, we just deem the backref to not match. Making this even less defensible, no error was thrown for nested cases such as "((.)){0}...\2"; and to add insult to injury, those cases could result in assertion failures instead. (It seems that nothing especially bad happened in non-assert builds, though.) Let's just fix it so that no error is thrown and instead the backref is deemed to never match, so that compile-time detection of no iterations behaves the same as run-time detection. Per report from Mark Dilger. This appears to be an aboriginal error in Spencer's library, so back-patch to all supported versions. Pre-v14, it turns out to also be necessary to back-patch one aspect of commits cb76fbd7e/00116dee5, namely to create capture-node subREs with the begin/end states of their subexpressions, not the current lp/rp of the outer parseqatom invocation. Otherwise delsub complains that we're trying to disconnect a state from itself. This is a bit scary but code examination shows that it's safe: in the pre-v14 code, if we want to wrap iteration around the subexpression, the first thing we do is overwrite the atom's begin/end fields with new states. So the bogus values didn't survive long enough to be used for anything, except if no iteration is required, in which case it doesn't matter. Discussion: https://postgr.es/m/A099E4A8-4377-4C64-A98C-3DEDDC075502@enterprisedb.com	2021-08-24 16:37:27 -04:00
Tom Lane	9a327179c8	Prevent regexp back-refs from sometimes matching when they shouldn't. The recursion in cdissect() was careless about clearing match data for capturing parentheses after rejecting a partial match. This could allow a later back-reference to succeed when by rights it should fail for lack of a defined referent. To fix, think a little more rigorously about what the contract between different levels of cdissect's recursion needs to be. With the right spec, we can fix this using fewer rather than more resets of the match data; the key decision being that a failed sub-match is now explicitly responsible for clearing any matches it may have set. There are enough other cross-checks and optimizations in the code that it's not especially easy to exhibit this problem; usually, the match will fail as-expected. Plus, regexps that are even potentially vulnerable are most likely user errors, since there's just not much point in writing a back-ref that doesn't always have a referent. These facts perhaps explain why the issue hasn't been detected, even though it's almost certainly a couple of decades old. Discussion: https://postgr.es/m/151435.1629733387@sss.pgh.pa.us	2021-08-23 17:41:07 -04:00
Alvaro Herrera	ad1231171f	Avoid creating archive status ".ready" files too early WAL records may span multiple segments, but XLogWrite() does not wait for the entire record to be written out to disk before creating archive status files. Instead, as soon as the last WAL page of the segment is written, the archive status file is created, and the archiver may process it. If PostgreSQL crashes before it is able to write and flush the rest of the record (in the next WAL segment), the wrong version of the first segment file lingers in the archive, which causes operations such as point-in-time restores to fail. To fix this, keep track of records that span across segments and ensure that segments are only marked ready-for-archival once such records have been completely written to disk. This has always been wrong, so backpatch all the way back. Author: Nathan Bossart <bossartn@amazon.com> Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Reviewed-by: Ryo Matsumura <matsumura.ryo@fujitsu.com> Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru> Discussion: https://postgr.es/m/CBDDFA01-6E40-46BB-9F98-9340F4379505@amazon.com	2021-08-23 15:50:35 -04:00
Michael Paquier	29f9423251	Fix backup manifests to generate correct WAL-Ranges across timelines In a backup manifest, WAL-Ranges stores the range of WAL that is required for the backup to be valid. pg_verifybackup would then internally use pg_waldump for the checks based on this data. When the timeline where the backup started was more than 1 with a history file looked at for the manifest data generation, the calculation of the WAL range for the first timeline to check was incorrect. The previous logic used as start LSN the start position of the first timeline, but it needs to use the start LSN of the backup. This would cause failures with pg_verifybackup, or any tools making use of the backup manifests. This commit adds a test based on a logic using a self-promoted node, making it rather cheap. Author: Kyotaro Horiguchi Discussion: https://postgr.es/m/20210818.143031.1867083699202617521.horikyota.ntt@gmail.com Backpatch-through: 13	2021-08-23 11:09:57 +09:00
Tom Lane	b30f7f399e	Fix performance bug in regexp's citerdissect/creviterdissect. After detecting a sub-match "dissect" failure (i.e., a backref match failure) in the i'th sub-match of an iteration node, we should proceed by adjusting the attempted length of the i'th submatch. As coded, though, these functions changed the attempted length of the last sub-match, and only after exhausting all possibilities for that would they back up to adjust the next-to-last sub-match, and then the second-from-last, etc; all of which is wasted effort, since only changing the start or length of the i'th sub-match can possibly make it succeed. This oversight creates the possibility for exponentially bad performance. Fortunately the problem is masked in most cases by optimizations or constraints applied elsewhere; which explains why we'd not noticed it before. But it is possible to reach the problem with fairly simple, if contrived, regexps. Oversight in my commit `173e29aa5`. That's pretty ancient now, so back-patch to all supported branches. Discussion: https://postgr.es/m/1808998.1629412269@sss.pgh.pa.us	2021-08-20 14:19:04 -04:00
Tom Lane	7fa367d96b	Avoid trying to lock OLD/NEW in a rule with FOR UPDATE. transformLockingClause neglected to exclude the pseudo-RTEs for OLD/NEW when processing a rule's query. This led to odd errors or even crashes later on. This bug is very ancient, but it's not terribly surprising that nobody noticed, since the use-case for SELECT FOR UPDATE in a non-view rule is somewhere between thin and non-existent. Still, crashing is not OK. Per bug #17151 from Zhiyong Wu. Thanks to Masahiko Sawada for analysis of the problem. Discussion: https://postgr.es/m/17151-c03a3e6e4ec9aadb@postgresql.org	2021-08-19 12:12:35 -04:00
Tom Lane	ecd4dd9f1d	Fix check_agg_arguments' examination of aggregate FILTER clauses. Recursion into the FILTER clause was mis-implemented, such that a relevant Var or Aggref at the very top of the FILTER clause would be ignored. (Of course, that'd have to be a plain boolean Var or boolean-returning aggregate.) The consequence would be mis-identification of the correct semantic level of the aggregate, which could lead to not-per-spec query behavior. If the FILTER expression is an aggregate, this could also lead to failure to issue an expected "aggregate function calls cannot be nested" error, which would likely result in a core dump later on, since the planner and executor aren't expecting such cases to appear. The root cause is that commit `b560ec1b0` blindly copied some code that assumed it's recursing into a List, and thus didn't examine the top-level node. To forestall questions about why this call doesn't look like the others, as well as possible future copy-and-paste mistakes, let's change all three check_agg_arguments_walker calls in check_agg_arguments, even though only the one for the filter clause is really broken. Per bug #17152 from Zhiyong Wu. This has been wrong since we implemented FILTER, so back-patch to all supported versions. (Testing suggests that pre-v11 branches manage to avoid crashing in the bad-Aggref case, thanks to "redundant" checks in ExecInitAgg. But I'm not sure how thorough that protection is, and anyway the wrong-behavior issue remains, so fix 9.6 and 10 too.) Discussion: https://postgr.es/m/17152-c7f906cc1a88e61b@postgresql.org	2021-08-18 18:12:51 -04:00
Tom Lane	7b01246e1d	Prevent ALTER TYPE/DOMAIN/OPERATOR from changing extension membership. If recordDependencyOnCurrentExtension is invoked on a pre-existing, free-standing object during an extension update script, that object will become owned by the extension. In our current code this is possible in three cases: * Replacing a "shell" type or operator. * CREATE OR REPLACE overwriting an existing object. * ALTER TYPE SET, ALTER DOMAIN SET, and ALTER OPERATOR SET. The first of these cases is intentional behavior, as noted by the existing comments for GenerateTypeDependencies. It seems like appropriate behavior for CREATE OR REPLACE too; at least, the obvious alternatives are not better. However, the fact that it happens during ALTER is an artifact of trying to share code (GenerateTypeDependencies and makeOperatorDependencies) between the CREATE and ALTER cases. Since an extension script would be unlikely to ALTER an object that didn't already belong to the extension, this behavior is not very troubling for the direct target object ... but ALTER TYPE SET will recurse to dependent domains, and it is very uncool for those to become owned by the extension if they were not already. Let's fix this by redefining the ALTER cases to never change extension membership, full stop. We could minimize the behavioral change by only changing the behavior when ALTER TYPE SET is recursing to a domain, but that would complicate the code and it does not seem like a better definition. Per bug #17144 from Alex Kozhemyakin. Back-patch to v13 where ALTER TYPE SET was added. (The other cases are older, but since they only affect the directly-named object, there's not enough of a problem to justify changing the behavior further back.) Discussion: https://postgr.es/m/17144-e67d7a8f049de9af@postgresql.org	2021-08-17 14:29:22 -04:00
Daniel Gustafsson	e15f32f0ed	Set type identifier on BIO In OpenSSL there are two types of BIO's (I/O abstractions): source/sink and filters. A source/sink BIO is a source and/or sink of data, ie one acting on a socket or a file. A filter BIO takes a stream of input from another BIO and transforms it. In order for BIO_find_type() to be able to traverse the chain of BIO's and correctly find all BIO's of a certain type they shall have the type bit set accordingly, source/sink BIO's (what PostgreSQL implements) use BIO_TYPE_SOURCE_SINK and filter BIO's use BIO_TYPE_FILTER. In addition to these, file descriptor based BIO's should have the descriptor bit set, BIO_TYPE_DESCRIPTOR. The PostgreSQL implementation didn't set the type bits, which went unnoticed for a long time as it's only really relevant for code auditing the OpenSSL installation, or doing similar tasks. It is required by the API though, so this fixes it. Backpatch through 9.6 as this has been wrong for a long time. Author: Itamar Gafni Discussion: https://postgr.es/m/SN6PR06MB39665EC10C34BB20956AE4578AF39@SN6PR06MB3966.namprd06.prod.outlook.com Backpatch-through: 9.6	2021-08-17 14:31:00 +02:00
Michael Paquier	7f0873f328	Refresh apply delay on reload of recovery_min_apply_delay at recovery This commit ensures that the wait interval in the replay delay loop waiting for an amount of time defined by recovery_min_apply_delay is correctly handled on reload, recalculating the delay if this GUC value is updated, based on the timestamp of the commit record being replayed. The previous behavior would be problematic for example with replay still waiting even if the delay got reduced or just cancelled. If the apply delay was increased to a larger value, the wait would have just respected the old value set, finishing earlier. Author: Soumyadeep Chakraborty, Ashwin Agrawal Reviewed-by: Kyotaro Horiguchi, Michael Paquier Discussion: https://postgr.es/m/CAE-ML+93zfr-HLN8OuxF0BjpWJ17O5dv1eMvSE5jsj9jpnAXZA@mail.gmail.com Backpatch-through: 9.6	2021-08-16 12:11:53 +09:00
Tom Lane	48695decc2	Add RISC-V spinlock support in s_lock.h. Like the ARM case, just use gcc's __sync_lock_test_and_set(); that will compile into AMOSWAP.W.AQ which does what we need. At some point it might be worth doing some work on atomic ops for RISC-V, but this should be enough for a creditable port. Back-patch to all supported branches, just in case somebody wants to try them on RISC-V. Marek Szuba Discussion: https://postgr.es/m/dea97b6d-f55f-1f6d-9109-504aa7dfa421@gentoo.org	2021-08-13 13:59:06 -04:00
David Rowley	4873da79da	Fix incorrect hash table resizing code in simplehash.h This fixes a bug in simplehash.h which caused an incorrect size mask to be used when the hash table grew to SH_MAX_SIZE (2^32). The code was incorrectly setting the size mask to 0 when the hash tables reached the maximum possible number of buckets. This would result always trying to use the 0th bucket causing an infinite loop of trying to grow the hash table due to there being too many collisions. Seemingly it's not that common for simplehash tables to ever grow this big as this bug dates back to v10 and nobody seems to have noticed it before. However, probably the most likely place that people would notice it would be doing a large in-memory Hash Aggregate with something close to at least 2^31 groups. After this fix, the code now works correctly with up to within 98% of 2^32 groups and will fail with the following error when trying to insert any more items into the hash table: ERROR: hash table size exceeded However, the work_mem (or hash_mem_multiplier in newer versions) settings will generally cause Hash Aggregates to spill to disk long before reaching that many groups. The minimal test case I did took a work_mem setting of over 192GB to hit the bug. simplehash hash tables are used in a few other places such as Bitmap Index Scans, however, again the size that the hash table can become there is also limited to work_mem and it would take a relation of around 16TB (2^31) pages and a very large work_mem setting to hit this. With smaller work_mem values the table would become lossy and never grow large enough to hit the problem. Author: Yura Sokolov Reviewed-by: David Rowley, Ranier Vilela Discussion: https://postgr.es/m/b1f7f32737c3438136f64b26f4852b96@postgrespro.ru Backpatch-through: 10, where simplehash.h was added	2021-08-13 16:42:35 +12:00
Thomas Munro	2c62754235	Make EXEC_BACKEND more convenient on macOS. It's hard to disable ASLR on current macOS releases, for testing with -DEXEC_BACKEND. You could already set the environment variable PG_SHMEM_ADDR to something not likely to collide with mappings created earlier in process startup. Let's also provide a default value that works on current releases and architectures, for developer convenience. As noted in the pre-existing comment, this is a horrible hack, but -DEXEC_BACKEND is only used by Unix-based PostgreSQL developers for testing some otherwise Windows-only code paths, so it seems excusable. Back-patch to all supported branches. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/20210806032944.m4tz7j2w47mant26%40alap3.anarazel.de	2021-08-13 11:11:38 +12:00
Peter Eisentraut	dc10035ecc	Translation updates Source-Git-URL: git://git.postgresql.org/git/pgtranslation/messages.git Source-Git-Hash: 9bb123c161ac8f773572e112ced524b99e81c1d9	2021-08-09 11:56:40 +02:00
Tom Lane	ba9f665a44	Really fix the ambiguity in REFRESH MATERIALIZED VIEW CONCURRENTLY. Rather than trying to pick table aliases that won't conflict with any possible user-defined matview column name, adjust the queries' syntax so that the aliases are only used in places where they can't be mistaken for column names. Mostly this consists of writing "alias." not just "alias", which adds clarity for humans as well as machines. We do have the issue that "SELECT alias." acts differently from "SELECT alias", but we can use the same hack ruleutils.c uses for whole-row variables in SELECT lists: write "alias.*::compositetype". We might as well revert to the original aliases after doing this; they're a bit easier to read. Like `75d66d10e`, back-patch to all supported branches. Discussion: https://postgr.es/m/2488325.1628261320@sss.pgh.pa.us	2021-08-07 13:29:32 -04:00
Dean Rasheed	da188b9934	Adjust the integer overflow tests in the numeric code. Formerly, the numeric code tested whether an integer value of a larger type would fit in a smaller type by casting it to the smaller type and then testing if the reverse conversion produced the original value. That's perfectly fine, except that it caused a test failure on buildfarm animal castoroides, most likely due to a compiler bug. Instead, do these tests by comparing against PG_INT16/32_MIN/MAX. That matches existing code in other places, such as int84(), which is more widely tested, and so is less likely to go wrong. While at it, add regression tests covering the numeric-to-int8/4/2 conversions, and adjust the recently added tests to the style of `434ddfb79a` (on the v11 branch) to make failures easier to diagnose. Per buildfarm via Tom Lane, reviewed by Tom Lane. Discussion: https://postgr.es/m/2394813.1628179479%40sss.pgh.pa.us	2021-08-06 21:31:20 +01:00
Peter Eisentraut	d3ad6566a1	Fix wording	2021-08-06 22:05:41 +02:00
Dean Rasheed	a72ad63154	Fix division-by-zero error in to_char() with 'EEEE' format. This fixes a long-standing bug when using to_char() to format a numeric value in scientific notation -- if the value's exponent is less than -NUMERIC_MAX_DISPLAY_SCALE-1 (-1001), it produced a division-by-zero error. The reason for this error was that get_str_from_var_sci() divides its input by 10^exp, which it produced using power_var_int(). However, the underflow test in power_var_int() causes it to return zero if the result scale is too small. That's not a problem for power_var_int()'s only other caller, power_var(), since that limits the rscale to 1000, but in get_str_from_var_sci() the exponent can be much smaller, requiring a much larger rscale. Fix by introducing a new function to compute 10^exp directly, with no rscale limit. This also allows 10^exp to be computed more efficiently, without any numeric multiplication, division or rounding. Discussion: https://postgr.es/m/CAEZATCWhojfH4whaqgUKBe8D5jNHB8ytzemL-PnRx+KCTyMXmg@mail.gmail.com	2021-08-05 09:29:13 +01:00
Bruce Momjian	47a573d911	C comment: correct heading of extension query Reported-by: Justin Pryzby Discussion: https://postgr.es/m/20210803161345.GZ12533@telsasoft.com Backpatch-through: 9.6	2021-08-03 12:26:08 -04:00
Bruce Momjian	a81c71e3a8	pg_upgrade: warn about extensions that need updating Also create a script that can be run to update them. Reported-by: Dave Cramer Discussion: https://postgr.es/m/CADK3HHKawwbOcGwMGnDuAf3-U8YfvTcS8jqDv3UM=niijs3MMA@mail.gmail.com Backpatch-through: 9.6	2021-08-03 11:58:15 -04:00
Tom Lane	93f99693f9	Use elog, not Assert, to report failure to provide an outer snapshot. As of commit `84f5c2908`, executing SQL commands (via SPI or otherwise) requires having either an active Portal, or a caller-established active snapshot. We were simply Assert'ing that that's the case. But we've now had a couple different reports of people testing extensions that didn't meet this requirement, and were confused by the resulting crash. Let's convert the Assert to a test-and-elog, in hopes of making the issue clearer for extension authors. Per gripes from Liu Huailing and RekGRpth. Back-patch to v11, like the prior commit. Discussion: https://postgr.es/m/OSZPR01MB6215671E3C5956A034A080DFBEEC9@OSZPR01MB6215.jpnprd01.prod.outlook.com Discussion: https://postgr.es/m/17035-14607d308ac8643c@postgresql.org	2021-07-31 11:50:14 -04:00
Dean Rasheed	053ec4e0c4	Fix corner-case errors and loss of precision in numeric_power(). This fixes a couple of related problems that arise when raising numbers to very large powers. Firstly, when raising a negative number to a very large integer power, the result should be well-defined, but the previous code would only cope if the exponent was small enough to go through power_var_int(). Otherwise it would throw an internal error, attempting to take the logarithm of a negative number. Fix this by adding suitable handling to the general case in power_var() to cope with negative bases, checking for integer powers there. Next, when raising a (positive or negative) number whose absolute value is slightly less than 1 to a very large power, the result should approach zero as the power is increased. However, in some cases, for sufficiently large powers, this would lose all precision and return 1 instead of 0. This was due to the way that the local_rscale was being calculated for the final full-precision calculation: local_rscale = rscale + (int) val - ln_dweight + 8 The first two terms on the right hand side are meant to give the number of significant digits required in the result ("val" being the estimated result weight). However, this failed to account for the fact that rscale is clipped to a maximum of NUMERIC_MAX_DISPLAY_SCALE (1000), and the result weight might be less then -1000, causing their sum to be negative, leading to a loss of precision. Fix this by forcing the number of significant digits calculated to be nonnegative. It's OK for it to be zero (when the result weight is less than -1000), since the local_rscale value then includes a few extra digits to ensure an accurate result. Finally, add additional underflow checks to exp_var() and power_var(), so that they consistently return zero for cases like this where the result is indistinguishable from zero. Some paths through this code already returned zero in such cases, but others were throwing overflow errors. Dean Rasheed, reviewed by Yugo Nagata. Discussion: http://postgr.es/m/CAEZATCW6Dvq7+3wN3tt5jLj-FyOcUgT5xNoOqce5=6Su0bCR0w@mail.gmail.com	2021-07-31 11:25:39 +01:00
John Naylor	171bf1cea5	Fix range check in ECPG numeric to int conversion The previous coding guarded against -INT_MAX instead of INT_MIN, leading to -2147483648 being rejected as out of range. Per bug #17128 from Kevin Sweet Discussion: https://www.postgresql.org/message-id/flat/17128-55a8a879727a3e3a%40postgresql.org Reviewed-by: Tom Lane Backpatch to all supported branches	2021-07-30 16:18:59 -04:00
Alvaro Herrera	41d27ee7b8	Close yet another race condition in replication slot test code Buildfarm shows that this test has a further failure mode when a checkpoint starts earlier than expected, so we detect a "checkpoint completed" line that's not the one we want. Change the config to try and prevent this. Per buildfarm Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Discussion: https://postgr.es/m/20210729.162038.534808353849568395.horikyota.ntt@gmail.com	2021-07-29 17:26:25 -04:00
Michael Paquier	efe169c900	Add missing exit() in pg_verifybackup when failing to find pg_waldump pg_verifybackup needs by default pg_waldump to check after a range of WAL segments required for a backup, except if --no-parse-wal is specified. The code checked for the presence of the binary pg_waldump in an installation and reported an error, but it forgot to properly exit(). This could lead to confusing errors reported. Reviewed-by: Robert Haas, Fabien Coelho Discussion: https://postgr.es/m/YQDMdB+B68yePFeT@paquier.xyz Backpatch-through: 13	2021-07-29 11:00:00 +09:00
Fujii Masao	a66b05b422	Update minimum recovery point on truncation during WAL replay of abort record. If a file is truncated, we must update minRecoveryPoint. Once a file is truncated, there's no going back; it would not be safe to stop recovery at a point earlier than that anymore. Commit `7bffc9b7bf` changed xact_redo_commit() so that it updates minRecoveryPoint on truncation, but forgot to change xact_redo_abort(). Back-patch to all supported versions. Reported-by: mengjuan.cmj@alibaba-inc.com Author: Fujii Masao Reviewed-by: Heikki Linnakangas Discussion: https://postgr.es/m/b029fce3-4fac-4265-968e-16f36ff4d075.mengjuan.cmj@alibaba-inc.com	2021-07-29 01:34:13 +09:00
Alvaro Herrera	b8f91d7f92	Set pg_setting.pending_restart when pertinent config lines are removed This changes the behavior of examining the pg_file_settings view after changing a config option that requires restart. The user needs to know that any change of such options does not take effect until a restart, and this worked correctly if the line is edited without removing it. However, for the case where the line is removed altogether, the flag doesn't get set, because a flag was only set in set_config_option, but that's not called for lines removed. Repair. (Ref.: commits `62d16c7fc5` and `a486e35706`) Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/202107262302.xsfdfc5sb7sh@alvherre.pgsql	2021-07-27 15:44:12 -04:00
Fujii Masao	92913fc290	Avoid using ambiguous word "non-negative" in error messages. The error messages using the word "non-negative" are confusing because it's ambiguous about whether it accepts zero or not. This commit improves those error messages by replacing it with less ambiguous word like "greater than zero" or "greater than or equal to zero". Also this commit added the note about the word "non-negative" to the error message style guide, to help writing the new error messages. When postgres_fdw option fetch_size was set to zero, previously the error message "fetch_size requires a non-negative integer value" was reported. This error message was outright buggy. Therefore back-patch to all supported versions where such buggy error message could be thrown. Reported-by: Hou Zhijie Author: Bharath Rupireddy Reviewed-by: Kyotaro Horiguchi, Fujii Masao Discussion: https://postgr.es/m/OS0PR01MB5716415335A06B489F1B3A8194569@OS0PR01MB5716.jpnprd01.prod.outlook.com	2021-07-28 01:21:52 +09:00
Bruce Momjian	0a5e708e2b	pg_resetxlog: add option to set oldest xid & use by pg_upgrade Add pg_resetxlog -u option to set the oldest xid in pg_control. Previously -x set this value be -2 billion less than the -x value. However, this causes the server to immediately scan all relation's relfrozenxid so it can advance pg_control's oldest xid to be inside the autovacuum_freeze_max_age range, which is inefficient and might disrupt diagnostic recovery. pg_upgrade will use this option to better create the new cluster to match the old cluster. Reported-by: Jason Harvey, Floris Van Nee Discussion: https://postgr.es/m/20190615183759.GB239428@rfd.leadboat.com, 87da83168c644fd9aae38f546cc70295@opammb0562.comp.optiver.com Author: Bertrand Drouvot Backpatch-through: 9.6	2021-07-26 22:38:14 -04:00
Michael Paquier	2c7395aad7	Fix a couple of memory leaks in src/bin/pg_basebackup/ These have been introduced by `7fbe0c8`, and could happen for pg_basebackup and pg_receivewal. Per report from Coverity for the ones in walmethods.c, I have spotted the ones in receivelog.c after more review. Backpatch-through: 10	2021-07-26 11:14:11 +09:00
Tom Lane	2b8f3f5a7c	Get rid of artificial restriction on hash table sizes on Windows. The point of introducing the hash_mem_multiplier GUC was to let users reproduce the old behavior of hash aggregation, i.e. that it could use more than work_mem at need. However, the implementation failed to get the job done on Win64, where work_mem is clamped to 2GB to protect various places that calculate memory sizes using "long int". As written, the same clamp was applied to hash_mem. This resulted in severe performance regressions for queries requiring a bit more than 2GB for hash aggregation, as they now spill to disk and there's no way to stop that. Getting rid of the work_mem restriction seems like a good idea, but it's a big job and could not conceivably be back-patched. However, there's only a fairly small number of places that are concerned with the hash_mem value, and it turns out to be possible to remove the restriction there without too much code churn or any ABI breaks. So, let's do that for now to fix the regression, and leave the larger task for another day. This patch does introduce a bit more infrastructure that should help with the larger task, namely pg_bitutils.h support for working with size_t values. Per gripe from Laurent Hasson. Back-patch to v13 where the behavior change came in. Discussion: https://postgr.es/m/997817.1627074924@sss.pgh.pa.us Discussion: https://postgr.es/m/MN2PR15MB25601E80A9B6D1BA6F592B1985E39@MN2PR15MB2560.namprd15.prod.outlook.com	2021-07-25 14:02:27 -04:00
Fujii Masao	8d091922ff	Make the standby server promptly handle interrupt signals. This commit changes the startup process in the standby server so that it handles the interrupt signals after waiting for wal_retrieve_retry_interval on the latch and resetting it, before entering another wait on the latch. This change causes the standby server to promptly handle interrupt signals. Otherwise, previously, there was the case where the standby needs to wait extra five seconds to shutdown when the shutdown request arrived while the startup process was waiting for wal_retrieve_retry_interval on the latch. Author: Fujii Masao, but implementation idea is from Soumyadeep Chakraborty Reviewed-by: Soumyadeep Chakraborty Discussion: https://postgr.es/m/9d7e6ab0-8a53-ddb9-63cd-289bcb25fe0e@oss.nttdata.com Per discussion of BUG #17073, back-patch to all supported versions. Discussion: https://postgr.es/m/17073-1a5fdaed0fa5d4d0@postgresql.org	2021-07-25 11:15:30 +09:00

... 2 3 4 5 6 ...

36508 commits