postgresql

mirror of https://github.com/postgres/postgres.git synced 2026-04-15 22:10:45 -04:00

Author	SHA1	Message	Date
Tom Lane	3a02d68a96	Avoid divide-by-zero in regex_selectivity() with long fixed prefix. Given a regex pattern with a very long fixed prefix (approaching 500 characters), the result of pow(FIXED_CHAR_SEL, fixed_prefix_len) can underflow to zero. Typically the preceding selectivity calculation would have underflowed as well, so that we compute 0/0 and get NaN. In released branches this leads to an assertion failure later on. That doesn't happen in HEAD, for reasons I've not explored yet, but it's surely still a bug. To fix, just skip the division when the pow() result is zero, so that we'll (most likely) return a zero selectivity estimate. In the edge cases where "sel" didn't yet underflow, perhaps this isn't desirable, but I'm not sure that the case is worth spending a lot of effort on. The results of regex_selectivity_sub() are barely worth the electrons they're written on anyway :-( Per report from Alexander Lakhin. Back-patch to all supported versions. Discussion: https://postgr.es/m/6de0a0c3-ada9-cd0c-3e4e-2fa9964b41e3@gmail.com	2021-02-12 16:26:47 -05:00
Michael Paquier	8493831385	Preserve pg_attribute.attstattarget across REINDEX CONCURRENTLY For an index, attstattarget can be updated using ALTER INDEX SET STATISTICS. This data was lost on the new index after REINDEX CONCURRENTLY. The update of this field is done when the old and new indexes are swapped to make the fix back-patchable. Another approach we could look after in the long-term is to change index_create() to pass the wanted values of attstattarget when creating the new relation, but, as this would cause an ABI breakage this can be done only on HEAD. Reported-by: Ronan Dunklau Author: Michael Paquier Reviewed-by: Ronan Dunklau, Tomas Vondra Discussion: https://postgr.es/m/16628084.uLZWGnKmhe@laptop-ronand Backpatch-through: 12	2021-02-10 13:09:09 +09:00
Peter Eisentraut	0f966d56b0	Translation updates Source-Git-URL: https://git.postgresql.org/git/pgtranslation/messages.git Source-Git-Hash: 0da38e9f43d2b931a5efb3b64aac53c2beccd3b6	2021-02-08 17:41:32 +01:00
Tom Lane	d525fbcfd1	Fix mishandling of column-level SELECT privileges for join aliases. scanNSItemForColumn, expandNSItemAttrs, and ExpandSingleTable would pass the wrong RTE to markVarForSelectPriv when dealing with a join ParseNamespaceItem: they'd pass the join RTE, when what we need to mark is the base table that the join column came from. The end result was to not fill the base table's selectedCols bitmap correctly, resulting in an understatement of the set of columns that are read by the query. The executor would still insist on there being at least one selectable column; but with a correctly crafted query, a user having SELECT privilege on just one column of a table would nonetheless be allowed to read all its columns. To fix, make markRTEForSelectPriv fetch the correct RTE for itself, ignoring the possibly-mismatched RTE passed by the caller. Later, we'll get rid of some now-unused RTE arguments, but that risks API breaks so we won't do it in released branches. This problem was introduced by commit `9ce77d75c`, so back-patch to v13 where that came in. Thanks to Sven Klemm for reporting the problem. Security: CVE-2021-20229	2021-02-08 10:14:09 -05:00
Heikki Linnakangas	8e56684d54	Fix permission checks on constraint violation errors on partitions. If a cross-partition UPDATE violates a constraint on the target partition, and the columns in the new partition are in different physical order than in the parent, the error message can reveal columns that the user does not have SELECT permission on. A similar bug was fixed earlier in commit `804b6b6db4`. The cause of the bug is that the callers of the ExecBuildSlotValueDescription() function got confused when constructing the list of modified columns. If the tuple was routed from a parent, we converted the tuple to the parent's format, but the list of modified columns was grabbed directly from the child's RTE entry. ExecUpdateLockMode() had a similar issue. That lead to confusion on which columns are key columns, leading to wrong tuple lock being taken on tables referenced by foreign keys, when a row is updated with INSERT ON CONFLICT UPDATE. A new isolation test is added for that corner case. With this patch, the ri_RangeTableIndex field is no longer set for partitions that don't have an entry in the range table. Previously, it was set to the RTE entry of the parent relation, but that was confusing. NOTE: This modifies the ResultRelInfo struct, replacing the ri_PartitionRoot field with ri_RootResultRelInfo. That's a bit risky to backpatch, because it breaks any extensions accessing the field. The change that ri_RangeTableIndex is not set for partitions could potentially break extensions, too. The ResultRelInfos are visible to FDWs at least, and this patch required small changes to postgres_fdw. Nevertheless, this seem like the least bad option. I don't think these fields widely used in extensions; I don't think there are FDWs out there that uses the FDW "direct update" API, other than postgres_fdw. If there is, you will get a compilation error, so hopefully it is caught quickly. Backpatch to 11, where support for both cross-partition UPDATEs, and unique indexes on partitioned tables, were added. Reviewed-by: Amit Langote Security: CVE-2021-3393	2021-02-08 11:01:55 +02:00
Tom Lane	ac1df003f2	Revert "Propagate CTE property flags when copying a CTE list into a rule." This reverts commit `ed29089633` and equivalent back-branch commits. The issue is subtler than I thought, and it's far from new, so just before a release deadline is no time to be fooling with it. We'll consider what to do at a bit more leisure. Discussion: https://postgr.es/m/CAJcOf-fAdj=nDKMsRhQzndm-O13NY4dL6xGcEvdX5Xvbbi0V7g@mail.gmail.com	2021-02-07 12:54:08 -05:00
Tom Lane	739375174a	Propagate CTE property flags when copying a CTE list into a rule. rewriteRuleAction() neglected this step, although it was careful to propagate other similar flags such as hasSubLinks or hasRowSecurity. Omitting to transfer hasRecursive is just cosmetic at the moment, but omitting hasModifyingCTE is a live bug, since the executor certainly looks at that. The proposed test case only fails back to v10, but since the executor examines hasModifyingCTE in 9.x as well, I suspect that a test case could be devised that fails in older branches. Given the nearness of the release deadline, though, I'm not going to spend time looking for a better test. Report and patch by Greg Nancarrow, cosmetic changes by me Discussion: https://postgr.es/m/CAJcOf-fAdj=nDKMsRhQzndm-O13NY4dL6xGcEvdX5Xvbbi0V7g@mail.gmail.com	2021-02-06 19:28:39 -05:00
Tom Lane	4353bc8781	Disallow converting an inheritance child table to a view. Generally, members of inheritance trees must be plain tables (or, in more recent versions, foreign tables). ALTER TABLE INHERIT rejects creating an inheritance relationship that has a view at either end. When DefineQueryRewrite attempts to convert a relation to a view, it already had checks prohibiting doing so for partitioning parents or children as well as traditional-inheritance parents ... but it neglected to check that a traditional-inheritance child wasn't being converted. Since the planner assumes that any inheritance child is a table, this led to making plans that tried to do a physical scan on a view, causing failures (or even crashes, in recent versions). One could imagine trying to support such a case by expanding the view normally, but since the rewriter runs before the planner does inheritance expansion, it would take some very fundamental refactoring to make that possible. There are probably a lot of other parts of the system that don't cope well with such a situation, too. For now, just forbid it. Per bug #16856 from Yang Lin. Back-patch to all supported branches. (In versions before v10, this includes back-patching the portion of commit `501ed02cf` that added has_superclass(). Perhaps the lack of that infrastructure partially explains the missing check.) Discussion: https://postgr.es/m/16856-0363e05c6e1612fd@postgresql.org	2021-02-06 15:17:01 -05:00
Heikki Linnakangas	c87cbd51ee	Fix backslash-escaping multibyte chars in COPY FROM. If a multi-byte character is escaped with a backslash in TEXT mode input, and the encoding is one of the client-only encodings where the bytes after the first one can have an ASCII byte "embedded" in the char, we didn't skip the character correctly. After a backslash, we only skipped the first byte of the next character, so if it was a multi-byte character, we would try to process its second byte as if it was a separate character. If it was one of the characters with special meaning, like '\n', '\r', or another '\\', that would cause trouble. One such exmple is the byte sequence '\x5ca45c2e666f6f' in Big5 encoding. That's supposed to be [backslash][two-byte character][.][f][o][o], but because the second byte of the two-byte character is 0x5c, we incorrectly treat it as another backslash. And because the next character is a dot, we parse it as end-of-copy marker, and throw an "end-of-copy marker corrupt" error. Backpatch to all supported versions. Reviewed-by: John Naylor, Kyotaro Horiguchi Discussion: https://www.postgresql.org/message-id/a897f84f-8dca-8798-3139-07da5bb38728%40iki.fi	2021-02-05 11:19:48 +02:00
Tom Lane	6467661b6d	Fix bug in HashAgg's selective-column-spilling logic. Commit `230230223` taught nodeAgg.c that, when spilling tuples from memory in an oversized hash aggregation, it only needed to spill input columns referenced in the node's tlist and quals. Unfortunately, that's wrong: we also have to save the grouping columns. The error is masked in common cases because the grouping columns also appear in the tlist, but that's not necessarily true. The main category of plans where it's not true seem to come from semijoins ("WHERE outercol IN (SELECT innercol FROM innertable)") where the innercol needs an implicit promotion to make it comparable to the outercol. The grouping column will be "innercol::promotedtype", but that expression appears nowhere in the Agg node's own tlist and quals; only the bare "innercol" is found in the tlist. I spent quite a bit of time looking for a suitable regression test case for this, without much success. If the number of distinct values of the innercol is large enough to make spilling happen, the planner tends to prefer a non-HashAgg plan, at least for problem sizes that are reasonable to use in the regression tests. So, no new regression test. However, this patch does demonstrably fix the originally-reported test case. Per report from s.p.e (at) gmx-topmail.de. Backpatch to v13 where the troublesome code came in. Discussion: https://postgr.es/m/trinity-1c565d44-159f-488b-a518-caf13883134f-1611835701633@3c-app-gmx-bap78	2021-02-04 23:01:33 -05:00
Tom Lane	10fcb83da6	Fix YA incremental sort bug. switchToPresortedPrefixMode() did the wrong thing if it detected a batch boundary just at the last tuple of a fullsort group. The initially-reported symptom was a "retrieved too many tuples in a bounded sort" error, but the test case added here just silently gives the wrong answer without this patch. I (tgl) am not really happy about committing this patch without review from the incremental-sort authors, but they seem AWOL and we are hard against a release deadline. This does demonstrably make some cases better, anyway. Per bug #16846 from Yoran Heling. Back-patch to v13 where incremental sort was introduced. Neil Chen Discussion: https://postgr.es/m/16846-ae49f51ac379a4cb@postgresql.org	2021-02-04 19:12:09 -05:00
Tom Lane	57868d957e	Avoid crash when rolling back within a prepared statement. If a portal is used to run a prepared CALL or DO statement that contains a ROLLBACK, PortalRunMulti fails because the portal's statement list gets cleared by the rollback. (Since the grammar doesn't allow CALL/DO in PREPARE, the only easy way to get to this is via extended query protocol, which treats all inputs as prepared statements.) It's difficult to avoid resetting the portal early because of resource-management issues, so work around this by teaching PortalRunMulti to be wary of portal->stmts having suddenly become NIL. The crash has only been seen to occur in v13 and HEAD (as a consequence of commit `1cff1b95a` having added an extra touch of portal->stmts). But even before that, the code involved touching a List that the portal no longer has any claim on. In the test case at hand, the List will still exist because of another refcount on the cached plan; but I'm far from convinced that it's impossible for the cached plan to have been dropped by the time control gets back to PortalRunMulti. Hence, backpatch to v11 where nested transactions were added. Thomas Munro and Tom Lane, per bug #16811 from James Inform Discussion: https://postgr.es/m/16811-c1b599b2c6c2d622@postgresql.org	2021-02-03 19:38:29 -05:00
Noah Misch	d798ea750d	Revive "snapshot too old" with wal_level=minimal and SET TABLESPACE. Given a permanent relation rewritten in the current transaction, the old_snapshot_threshold mechanism assumed the relation had never been subject to early pruning. Hence, a query could fail to report "snapshot too old" when the rewrite followed an early truncation. ALTER TABLE SET TABLESPACE is probably the only rewrite mechanism capable of exposing this bug. REINDEX sets indcheckxmin, avoiding the problem. CLUSTER has zeroed page LSNs since before old_snapshot_threshold existed, so old_snapshot_threshold has never cooperated with it. ALTER TABLE ... SET DATA TYPE makes the table look empty to every past snapshot, which is strictly worse. Back-patch to v13, where commit `c6b92041d3` broke this. Kyotaro Horiguchi and Noah Misch Discussion: https://postgr.es/m/20210113.160705.2225256954956139776.horikyota.ntt@gmail.com	2021-01-30 00:13:58 -08:00
Noah Misch	e8e3e67490	Fix error with CREATE PUBLICATION, wal_level=minimal, and new tables. CREATE PUBLICATION has failed spuriously when applied to a permanent relation created or rewritten in the current transaction. Make the same change to another site having the same semantic intent; the second instance has no user-visible consequences. Back-patch to v13, where commit `c6b92041d3` broke this. Kyotaro Horiguchi Discussion: https://postgr.es/m/20210113.160705.2225256954956139776.horikyota.ntt@gmail.com	2021-01-30 00:12:05 -08:00
Noah Misch	86a5b309c9	Fix CREATE INDEX CONCURRENTLY for simultaneous prepared transactions. In a cluster having used CREATE INDEX CONCURRENTLY while having enabled prepared transactions, queries that use the resulting index can silently fail to find rows. Fix this for future CREATE INDEX CONCURRENTLY by making it wait for prepared transactions like it waits for ordinary transactions. This expands the VirtualTransactionId structure domain to admit prepared transactions. It may be necessary to reindex to recover from past occurrences. Back-patch to 9.5 (all supported versions). Andrey Borodin, reviewed (in earlier versions) by Tom Lane and Michael Paquier. Discussion: https://postgr.es/m/2E712143-97F7-4890-B470-4A35142ABC82@yandex-team.ru	2021-01-30 00:00:42 -08:00
Tom Lane	4a9ce085ab	Silence another gcc 11 warning. Per buildfarm and local experimentation, bleeding-edge gcc isn't convinced that the MemSet in reorder_function_arguments() is safe. Shut it up by adding an explicit check that pronargs isn't negative, and by changing MemSet to memset. (It appears that either change is enough to quiet the warning at -O2, but let's do both to be sure.)	2021-01-28 17:19:17 -05:00
Alvaro Herrera	16f69062e5	Remove bogus restriction from BEFORE UPDATE triggers In trying to protect the user from inconsistent behavior, commit `487e9861d0` "Enable BEFORE row-level triggers for partitioned tables" tried to prevent BEFORE UPDATE FOR EACH ROW triggers from moving the row from one partition to another. However, it turns out that the restriction is wrong in two ways: first, it fails spuriously, preventing valid situations from working, as in bug #16794; and second, they don't protect from any misbehavior, because tuple routing would cope anyway. Fix by removing that restriction. We keep the same restriction on BEFORE INSERT FOR EACH ROW triggers, though. It is valid and useful there. In the future we could remove it by having tuple reroute work for inserts as it does for updates. Backpatch to 13. Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Reported-by: Phillip Menke <pg@pmenke.de> Discussion: https://postgr.es/m/16794-350a655580fbb9ae@postgresql.org	2021-01-28 16:56:07 -03:00
Tom Lane	7f1921cb92	Fix hash partition pruning with asymmetric partition sets. perform_pruning_combine_step() was not taught about the number of partition indexes used in hash partitioning; more embarrassingly, get_matching_hash_bounds() also had it wrong. These errors are masked in the common case where all the partitions have the same modulus and no partition is missing. However, with missing or unequal-size partitions, we could erroneously prune some partitions that need to be scanned, leading to silently wrong query answers. While a minimal-footprint fix for this could be to export get_partition_bound_num_indexes and make the incorrect functions use it, I'm of the opinion that that function should never have existed in the first place. It's not reasonable data structure design that PartitionBoundInfoData lacks any explicit record of the length of its indexes[] array. Perhaps that was all right when it could always be assumed equal to ndatums, but something should have been done about it as soon as that stopped being true. Putting in an explicit "nindexes" field makes both partition_bounds_equal() and partition_bounds_copy() simpler, safer, and faster than before, and removes explicit knowledge of the number-of-partition-indexes rules from some other places too. This change also makes get_hash_partition_greatest_modulus obsolete. I left that in place in case any external code uses it, but no core code does anymore. Per bug #16840 from Michał Albrycht. Back-patch to v11 where the hash partitioning code came in. (In the back branches, add the new field at the end of PartitionBoundInfoData to minimize ABI risks.) Discussion: https://postgr.es/m/16840-571a22976f829ad4@postgresql.org	2021-01-28 13:41:55 -05:00
Andrew Gierth	75e3cca42d	Don't add bailout adjustment for non-strict deserialize calls. When building aggregate expression steps, strict checks need a bailout jump for when a null value is encountered, so there is a list of steps that require later adjustment. Adding entries to that list for steps that aren't actually strict would be harmless, except that there is an Assert which catches them. This leads to spurious errors on asserts builds, for data sets that trigger parallel aggregation of an aggregate with a non-strict deserialization function (no such aggregates exist in the core system). Repair by not adding the adjustment entry when it's not needed. Backpatch back to 11 where the code was introduced. Per a report from Darafei (Komzpa) of the PostGIS project; analysis and patch by me. Discussion: https://postgr.es/m/87mty7peb3.fsf@news-spur.riddles.org.uk	2021-01-28 11:09:40 +00:00
Tom Lane	a26194f22b	Fix broken ruleutils support for function TRANSFORM clauses. I chanced to notice that this dumped core due to a faulty Assert. To add insult to injury, the output has been misformatted since v11. Obviously we need some regression testing here. Discussion: https://postgr.es/m/d1cc628c-3953-4209-957b-29427acc38c8@www.fastmail.com	2021-01-25 13:03:11 -05:00
David Rowley	7632ef5a71	Fix hypothetical bug in heap backward scans Both heapgettup() and heapgettup_pagemode() incorrectly set the first page to scan in a backward scan in which the number of pages to scan was specified by heap_setscanlimits(). The code incorrectly started the scan at the end of the relation when startBlk was 0, or otherwise at startBlk - 1, neither of which is correct when only scanning a subset of pages. The fix here checks if heap_setscanlimits() has changed the number of pages to scan and if so we set the first page to scan as the final page in the specified range during backward scans. Proper adjustment of this code was forgotten when heap_setscanlimits() was added in `7516f5259` back in 9.5. However, practice, nowhere in core code performs backward scans after having used heap_setscanlimits(), yet, it is possible an extension uses the heap functions in this way, hence backpatch. An upcoming patch does use heap_setscanlimits() with backward scans, so this must be fixed before that can go in. Author: David Rowley Discussion: https://postgr.es/m/CAApHDvpGc9h0_oVD2CtgBcxCS1N-qDYZSeBRnUh+0CWJA9cMaA@mail.gmail.com Backpatch-through: 9.5, all supported versions	2021-01-25 19:52:52 +13:00
Tom Lane	73fc2e5bab	Fix pull_varnos' miscomputation of relids set for a PlaceHolderVar. Previously, pull_varnos() took the relids of a PlaceHolderVar as being equal to the relids in its contents, but that fails to account for the possibility that we have to postpone evaluation of the PHV due to outer joins. This could result in a malformed plan. The known cases end up triggering the "failed to assign all NestLoopParams to plan nodes" sanity check in createplan.c, but other symptoms may be possible. The right value to use is the join level we actually intend to evaluate the PHV at. We can get that from the ph_eval_at field of the associated PlaceHolderInfo. However, there are some places that call pull_varnos() before the PlaceHolderInfos have been created; in that case, fall back to the conservative assumption that the PHV will be evaluated at its syntactic level. (In principle this might result in missing some legal optimization, but I'm not aware of any cases where it's an issue in practice.) Things are also a bit ticklish for calls occurring during deconstruct_jointree(), but AFAICS the ph_eval_at fields should have reached their final values by the time we need them. The main problem in making this work is that pull_varnos() has no way to get at the PlaceHolderInfos. We can fix that easily, if a bit tediously, in HEAD by passing it the planner "root" pointer. In the back branches that'd cause an unacceptable API/ABI break for extensions, so leave the existing entry points alone and add new ones with the additional parameter. (If an old entry point is called and encounters a PHV, it'll fall back to using the syntactic level, again possibly missing some valid optimization.) Back-patch to v12. The computation is surely also wrong before that, but it appears that we cannot reach a bad plan thanks to join order restrictions imposed on the subquery that the PlaceHolderVar came from. The error only became reachable when commit `4be058fe9` allowed trivial subqueries to be collapsed out completely, eliminating their join order restrictions. Per report from Stephan Springl. Discussion: https://postgr.es/m/171041.1610849523@sss.pgh.pa.us	2021-01-21 15:37:23 -05:00
Heikki Linnakangas	b8403d140f	Fix bug in detecting concurrent page splits in GiST insert In commit `9eb5607e69`, I got the condition on checking for split or deleted page wrong: I used && instead of \|\|. The comment correctly said "concurrent split _or_ deletion". As a result, GiST insertion could miss a concurrent split, and insert to wrong page. Duncan Sands demonstrated this with a test script that did a lot of concurrent inserts. Backpatch to v12, where this was introduced. REINDEX is required to fix indexes that were affected by this bug. Backpatch-through: 12 Reported-by: Duncan Sands Discussion: https://www.postgresql.org/message-id/a9690483-6c6c-3c82-c8ba-dc1a40848f11%40deepbluecap.com	2021-01-20 11:58:25 +02:00
Michael Paquier	31e0f9d76b	Fix ALTER DEFAULT PRIVILEGES with duplicated objects Specifying duplicated objects in this command would lead to unique constraint violations in pg_default_acl or "tuple already updated by self" errors. Similarly to GRANT/REVOKE, increment the command ID after each subcommand processing to allow this case to work transparently. A regression test is added by tweaking one of the existing queries of privileges.sql to stress this case. Reported-by: Andrus Author: Michael Paquier Reviewed-by: Álvaro Herrera Discussion: https://postgr.es/m/ae2a7dc1-9d71-8cba-3bb9-e4cb7eb1f44e@hot.ee Backpatch-through: 9.5	2021-01-20 11:39:14 +09:00
Tom Lane	188cd4f440	Remove faulty support for MergeAppend plan with WHERE CURRENT OF. Somebody extended search_plan_tree() to treat MergeAppend exactly like Append, which is 100% wrong, because unlike Append we can't assume that only one input node is actively returning tuples. Hence a cursor using a MergeAppend across a UNION ALL or inheritance tree could falsely match a WHERE CURRENT OF query at a row that isn't actually the cursor's current output row, but coincidentally has the same TID (in a different table) as the current output row. Delete the faulty code; this means that such a case will now return an error like 'cursor "foo" is not a simply updatable scan of table "bar"', instead of silently misbehaving. Users should not find that surprising though, as the same cursor query could have failed that way already depending on the chosen plan. (It would fail like that if the sort were done with an explicit Sort node instead of MergeAppend.) Expand the clearly-inadequate commentary to be more explicit about what this code is doing, in hopes of forestalling future mistakes. It's been like this for awhile, so back-patch to all supported branches. Discussion: https://postgr.es/m/482865.1611075182@sss.pgh.pa.us	2021-01-19 13:25:40 -05:00
Tom Lane	f0f53195b5	Avoid crash with WHERE CURRENT OF and a custom scan plan. execCurrent.c's search_plan_tree() assumed that ForeignScanStates and CustomScanStates necessarily have a valid ss_currentRelation. This is demonstrably untrue for postgres_fdw's remote join and remote aggregation plans, and non-leaf custom scans might not have an identifiable scan relation either. Avoid crashing by ignoring such nodes when the field is null. This solution will lead to errors like 'cursor "foo" is not a simply updatable scan of table "bar"' in cases where maybe we could have allowed WHERE CURRENT OF to work. That's not an issue for postgres_fdw's usages, since joins or aggregations would render WHERE CURRENT OF invalid anyway. But an otherwise-transparent upper level custom scan node might find this annoying. When and if someone cares to expend work on such a scenario, we could invent a custom-scan-provider callback to determine what's safe. Report and patch by David Geier, commentary by me. It's been like this for awhile, so back-patch to all supported branches. Discussion: https://postgr.es/m/0253344d-9bdd-11c4-7f0d-d88c02cd7991@swarm64.com	2021-01-18 18:32:40 -05:00
Noah Misch	6eb3fc7fcd	Prevent excess SimpleLruTruncate() deletion. Every core SLRU wraps around. With the exception of pg_notify, the wrap point can fall in the middle of a page. Account for this in the PagePrecedes callback specification and in SimpleLruTruncate()'s use of said callback. Update each callback implementation to fit the new specification. This changes SerialPagePrecedesLogically() from the style of asyncQueuePagePrecedes() to the style of CLOGPagePrecedes(). (Whereas pg_clog and pg_serial share a key space, pg_serial is nothing like pg_notify.) The bug fixed here has the same symptoms and user followup steps as `592a589a04`. Back-patch to 9.5 (all supported versions). Reviewed by Andrey Borodin and (in earlier versions) by Tom Lane. Discussion: https://postgr.es/m/20190202083822.GC32531@gust.leadboat.com	2021-01-16 12:21:38 -08:00
Tomas Vondra	d26d4c717d	Disallow CREATE STATISTICS on system catalogs Add a check that CREATE STATISTICS does not add extended statistics on system catalogs, similarly to indexes etc. It can be overriden using the allow_system_table_mods GUC. This bug exists since `7b504eb282`, adding the extended statistics, so backpatch all the way back to PostgreSQL 10. Author: Tomas Vondra Reported-by: Dean Rasheed Backpatch-through: 10 Discussion: https://postgr.es/m/CAEZATCXAPrrOKwEsyZKQ4uzzJQWBCt6QAvOcgqRGdWwT1zb%2BrQ%40mail.gmail.com	2021-01-15 23:32:54 +01:00
Fujii Masao	60369db86f	Fix calculation of how much shared memory is required to store a TOC. Commit `ac883ac453` refactored shm_toc_estimate() but changed its calculation of shared memory size for TOC incorrectly. Previously this could cause too large memory to be allocated. Back-patch to v11 where the bug was introduced. Author: Takayuki Tsunakawa Discussion: https://postgr.es/m/TYAPR01MB2990BFB73170E2C4921E2C4DFEA80@TYAPR01MB2990.jpnprd01.prod.outlook.com	2021-01-15 12:45:25 +09:00
Alvaro Herrera	5b01a6f13f	Prevent drop of tablespaces used by partitioned relations When a tablespace is used in a partitioned relation (per commits `ca4103025d` in pg12 for tables and `33e6c34c32` in pg11 for indexes), it is possible to drop the tablespace, potentially causing various problems. One such was reported in bug #16577, where a rewriting ALTER TABLE causes a server crash. Protect against this by using pg_shdepend to keep track of tablespaces when used for relations that don't keep physical files; we now abort a tablespace if we see that the tablespace is referenced from any partitioned relations. Backpatch this to 11, where this problem has been latent all along. We don't try to create pg_shdepend entries for existing partitioned indexes/tables, but any ones that are modified going forward will be protected. Note slight behavior change: when trying to drop a tablespace that contains both regular tables as well as partitioned ones, you'd previously get ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE and now you'll get ERRCODE_DEPENDENT_OBJECTS_STILL_EXIST. Arguably, the latter is more correct. It is possible to add protecting pg_shdepend entries for existing tables/indexes, by doing ALTER TABLE ONLY some_partitioned_table SET TABLESPACE pg_default; ALTER TABLE ONLY some_partitioned_table SET TABLESPACE original_tablespace; for each partitioned table/index that is not in the database default tablespace. Because these partitioned objects do not have storage, no file needs to be actually moved, so it shouldn't take more time than what's required to acquire locks. This query can be used to search for such relations: SELECT ... FROM pg_class WHERE relkind IN ('p', 'I') AND reltablespace <> 0 Reported-by: Alexander Lakhin <exclusion@gmail.com> Discussion: https://postgr.es/m/16577-881633a9f9894fd5@postgresql.org Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Reviewed-by: Michael Paquier <michael@paquier.xyz>	2021-01-14 15:32:14 -03:00
Fujii Masao	94f52929a0	Ensure that a standby is able to follow a primary on a newer timeline. Commit `709d003fbd` refactored WAL-reading code, but accidentally caused WalSndSegmentOpen() to fail to follow a timeline switch while reading from a historic timeline. This issue caused a standby to fail to follow a primary on a newer timeline when WAL archiving is enabled. If there is a timeline switch within the segment, WalSndSegmentOpen() should read from the WAL segment belonging to the new timeline. But previously since it failed to follow a timeline switch, it tried to read the WAL segment with old timeline. When WAL archiving is enabled, that WAL segment with old timeline doesn't exist because it's renamed to .partial. This leads a primary to have tried to read non-existent WAL segment, and which caused replication to faill with the error "ERROR: requested WAL segment ... has already been removed". This commit fixes WalSndSegmentOpen() so that it's able to follow a timeline switch, to ensure that a standby is able to follow a primary on a newer timeline even when WAL archiving is enabled. This commit also adds the regression test to check whether a standby is able to follow a primary on a newer timeline when WAL archiving is enabled. Back-patch to v13 where the bug was introduced. Reported-by: Kyotaro Horiguchi Author: Kyotaro Horiguchi, tweaked by Fujii Masao Reviewed-by: Alvaro Herrera, Fujii Masao Discussion: https://postgr.es/m/20201209.174314.282492377848029776.horikyota.ntt@gmail.com	2021-01-14 12:28:47 +09:00
Amit Kapila	0685c5c1b9	Fix memory leak in SnapBuildSerialize. The memory for the snapshot was leaked while serializing it to disk during logical decoding. This memory will be freed only once walsender stops streaming the changes. This can lead to a huge memory increase when master logs Standby Snapshot too frequently say when the user is trying to create many replication slots. Reported-by: funnyxj.fxj@alibaba-inc.com Diagnosed-by: funnyxj.fxj@alibaba-inc.com Author: Amit Kapila Backpatch-through: 9.5 Discussion: https://postgr.es/m/033ab54c-6393-42ee-8ec9-2b399b5d8cde.funnyxj.fxj@alibaba-inc.com	2021-01-13 08:31:45 +05:30
Alvaro Herrera	ee69833e6e	Fix thinko in comment This comment has been wrong since its introduction in commit `2c03216d83`. Author: Masahiko Sawada <sawada.mshk@gmail.com> Discussion: https://postgr.es/m/CAD21AoAzz6qipFJBbGEaHmyWxvvNDp8httbwLR9tUQWaTjUs2Q@mail.gmail.com	2021-01-12 11:48:45 -03:00
Amit Kapila	decf3fe61c	Fix relation descriptor leak. We missed closing the relation descriptor while sending changes via the root of partitioned relations during logical replication. Author: Amit Langote and Mark Zhao Reviewed-by: Amit Kapila and Ashutosh Bapat Backpatch-through: 13, where it was introduced Discussion: https://postgr.es/m/tencent_41FEA657C206F19AB4F406BE9252A0F69C06@qq.com Discussion: https://postgr.es/m/tencent_6E296D2F7D70AFC90D83353B69187C3AA507@qq.com	2021-01-12 08:30:16 +05:30
Tom Lane	49c928c0c0	Fix ancient bug in parsing of BRE-mode regular expressions. brenext(), when parsing a '' quantifier, forgot to return any "value" for the token; per the equivalent case in next(), it should return value 1 to indicate that greedy rather than non-greedy behavior is wanted. The result is that the compiled regexp could behave like 'x?' rather than the intended 'x', if we were unlucky enough to have a zero in v->nextvalue at this point. That seems to happen with some reliability if we have '.' at the beginning of a BRE-mode regexp, although that depends on the initial contents of a stack-allocated struct, so it's not guaranteed to fail. Found by Alexander Lakhin using valgrind testing. This bug seems to be aboriginal in Spencer's code, so back-patch all the way. Discussion: https://postgr.es/m/16814-6c5e3edd2bdf0d50@postgresql.org	2021-01-08 12:16:00 -05:00
Tom Lane	5db4fdc224	Further second thoughts about idle_session_timeout patch. On reflection, the order of operations in PostgresMain() is wrong. These timeouts ought to be shut down before, not after, we do the post-command-read CHECK_FOR_INTERRUPTS, to guarantee that any timeout error will be detected there rather than at some ill-defined later point (possibly after having wasted a lot of work). This is really an error in the original idle_in_transaction_timeout patch, so back-patch to 9.6 where that was introduced.	2021-01-07 11:45:09 -05:00
Fujii Masao	0f8977b3f2	Detect the deadlocks between backends and the startup process. The deadlocks that the recovery conflict on lock is involved in can happen between hot-standby backends and the startup process. If a backend takes an access exclusive lock on the table and which finally triggers the deadlock, that deadlock can be detected as expected. On the other hand, previously, if the startup process took an access exclusive lock and which finally triggered the deadlock, that deadlock could not be detected and could remain even after deadlock_timeout passed. This is a bug. The cause of this bug was that the code for handling the recovery conflict on lock didn't take care of deadlock case at all. It assumed that deadlocks involving the startup process and backends were able to be detected by the deadlock detector invoked within backends. But this assumption was incorrect. The startup process also should have invoked the deadlock detector if necessary. To fix this bug, this commit makes the startup process invoke the deadlock detector if deadlock_timeout is reached while handling the recovery conflict on lock. Specifically, in that case, the startup process requests all the backends holding the conflicting locks to check themselves for deadlocks. Back-patch to v9.6. v9.5 has also this bug, but per discussion we decided not to back-patch the fix to v9.5. Because v9.5 doesn't have some infrastructure codes (e.g., `37c54863cf`) that this bug fix patch depends on. We can apply those codes for the back-patch, but since the next minor version release is the final one for v9.5, it's risky to do that. If we unexpectedly introduce new bug to v9.5 by the back-patch, there is no chance to fix that. We determined that the back-patch to v9.5 would give more risk than gain. Author: Fujii Masao Reviewed-by: Bertrand Drouvot, Masahiko Sawada, Kyotaro Horiguchi Discussion: https://postgr.es/m/4041d6b6-cf24-a120-36fa-1294220f8243@oss.nttdata.com	2021-01-06 12:30:37 +09:00
Dean Rasheed	5777b6ea29	Add an explicit cast to double when using fabs(). Commit `bc43b7c2c0` used fabs() directly on an int variable, which apparently requires an explicit cast on some platforms. Per buildfarm.	2021-01-05 11:51:21 +00:00
Dean Rasheed	e15c384d7a	Fix numeric_power() when the exponent is INT_MIN. In power_var_int(), the computation of the number of significant digits to use in the computation used log(Abs(exp)), which isn't safe because Abs(exp) returns INT_MIN when exp is INT_MIN. Use fabs() instead of Abs(), so that the exponent is cast to a double before the absolute value is taken. Back-patch to 9.6, where this was introduced (by `7d9a4737c2`). Discussion: https://postgr.es/m/CAEZATCVd6pMkz=BrZEgBKyqqJrt2xghr=fNc8+Z=5xC6cgWrWA@mail.gmail.com	2021-01-05 11:08:59 +00:00
Tom Lane	9e7d87ca84	Fix integer-overflow corner cases in substring() functions. If the substring start index and length overflow when added together, substring() misbehaved, either throwing a bogus "negative substring length" error on a case that should succeed, or failing to complain that a negative length is negative (and instead returning the whole string, in most cases). Unsurprisingly, the text, bytea, and bit variants of the function all had this issue. Rearrange the logic to ensure that negative lengths are always rejected, and add an overflow check to handle the other case. Also install similar guards into detoast_attr_slice() (nee heap_tuple_untoast_attr_slice()), since it's far from clear that no other code paths leading to that function could pass it values that would overflow. Patch by myself and Pavel Stehule, per bug #16804 from Rafi Shamim. Back-patch to v11. While these bugs are old, the common/int.h infrastructure for overflow-detecting arithmetic didn't exist before commit `4d6ad3125`, and it doesn't seem like these misbehaviors are bad enough to justify developing a standalone fix for the older branches. Discussion: https://postgr.es/m/16804-f4eeeb6c11ba71d4@postgresql.org	2021-01-04 18:32:40 -05:00
Peter Geoghegan	55e5352266	Get heap page max offset with buffer lock held. On further reflection it seems better to call PageGetMaxOffsetNumber() after acquiring a buffer lock on the page. This shouldn't really matter, but doing it this way is cleaner. Follow-up to commit `42288174`. Backpatch: 12-, just like commit `42288174`	2020-12-30 17:21:41 -08:00
Peter Geoghegan	7a57960ba6	Fix index deletion latestRemovedXid bug. The logic for determining the latest removed XID for the purposes of generating recovery conflicts in REDO routines was subtly broken. It failed to follow links from HOT chains, and so failed to consider all relevant heap tuple headers in some cases. To fix, expand the loop that deals with LP_REDIRECT line pointers to also deal with HOT chains. The new version of the loop is loosely based on a similar loop from heap_prune_chain(). The impact of this bug is probably quite limited, since the horizon code necessarily deals with heap tuples that are pointed to by LP_DEAD-set index tuples. The process of setting LP_DEAD index tuples (e.g. within the kill_prior_tuple mechanism) is highly correlated with opportunistic pruning of pointed-to heap tuples. Plus the question of generating a recovery conflict usually comes up some time after index tuple LP_DEAD bits were initially set, unlike heap pruning, where a latestRemovedXid is generated at the point of the pruning operation (heap pruning has no deferred "would-be page split" style processing that produces conflicts lazily). Only backpatch to Postgres 12, the first version where this logic runs during original execution (following commit `558a9165e0`). The index latestRemovedXid mechanism has had the same bug since it first appeared over 10 years ago (in commit `a760893d`), but backpatching to all supported versions now seems like a bad idea on balance. Running the new improved code during recovery seems risky, especially given the lack of complaints from the field. Author: Peter Geoghegan <pg@bowt.ie> Discussion: https://postgr.es/m/CAH2-Wz=Eib393+HHcERK_9MtgNS7Ew1HY=RDC_g6GL46zM5C6Q@mail.gmail.com Backpatch: 12-	2020-12-30 16:29:03 -08:00
Tom Lane	861e967176	Fix up usage of krb_server_keyfile GUC parameter. secure_open_gssapi() installed the krb_server_keyfile setting as KRB5_KTNAME unconditionally, so long as it's not empty. However, pg_GSS_recvauth() only installed it if KRB5_KTNAME wasn't set already, leading to a troubling inconsistency: in theory, clients could see different sets of server principal names depending on whether they use GSSAPI encryption. Always using krb_server_keyfile seems like the right thing, so make both places do that. Also fix up secure_open_gssapi()'s lack of a check for setenv() failure --- it's unlikely, surely, but security-critical actions are no place to be sloppy. Also improve the associated documentation. This patch does nothing about secure_open_gssapi()'s use of setenv(), and indeed causes pg_GSS_recvauth() to use it too. That's nominally against project portability rules, but since this code is only built with --with-gssapi, I do not feel a need to do something about this in the back branches. A fix will be forthcoming for HEAD though. Back-patch to v12 where GSSAPI encryption was introduced. The dubious behavior in pg_GSS_recvauth() goes back further, but it didn't have anything to be inconsistent with, so let it be. Discussion: https://postgr.es/m/2187460.1609263156@sss.pgh.pa.us	2020-12-30 11:38:42 -05:00
Tom Lane	d05e14d786	Improve log messages related to pg_hba.conf not matching a connection. Include details on whether GSS encryption has been activated; since we added "hostgssenc" type HBA entries, that's relevant info. Kyotaro Horiguchi and Tom Lane. Back-patch to v12 where GSS encryption was introduced. Discussion: https://postgr.es/m/e5b0b6ed05764324a2f3fe7acfc766d5@smhi.se	2020-12-28 17:58:58 -05:00
Tom Lane	c1c88bf03e	Fix assorted issues in backend's GSSAPI encryption support. Unrecoverable errors detected by GSSAPI encryption can't just be reported with elog(ERROR) or elog(FATAL), because attempting to send the error report to the client is likely to lead to infinite recursion or loss of protocol sync. Instead make this code do what the SSL encryption code has long done, which is to just report any such failure to the server log (with elevel COMMERROR), then pretend we've lost the connection by returning errno = ECONNRESET. Along the way, fix confusion about whether message translation is done by pg_GSS_error() or its callers (the latter should do it), and make the backend version of that function work more like the frontend version. Avoid allocating the port->gss struct until it's needed; we surely don't need to allocate it in the postmaster. Improve logging of "connection authorized" messages with GSS enabled. (As part of this, I back-patched the code changes from dc11f31a1.) Make BackendStatusShmemSize() account for the GSS-related space that will be allocated by CreateSharedBackendStatus(). This omission could possibly cause out-of-shared-memory problems with very high max_connections settings. Remove arbitrary, pointless restriction that only GSS authentication can be used on a GSS-encrypted connection. Improve documentation; notably, document the fact that libpq now prefers GSS encryption over SSL encryption if both are possible. Per report from Mikael Gustavsson. Back-patch to v12 where this code was introduced. Discussion: https://postgr.es/m/e5b0b6ed05764324a2f3fe7acfc766d5@smhi.se	2020-12-28 17:44:17 -05:00
Michael Paquier	30803bd1cd	Fix inconsistent code with shared invalidations of snapshots The code in charge of processing a single invalidation message has been using since `568d413` the structure for relation mapping messages. This had fortunately no consequence as both locate the database ID at the same location, but it could become a problem in the future if this area of the code changes. Author: Konstantin Knizhnik Discussion: https://postgr.es/m/8044c223-4d3a-2cdb-42bf-29940840ce94@postgrespro.ru Backpatch-through: 9.5	2020-12-28 22:16:57 +09:00
Jeff Davis	7b8692eaf1	Fix bug #16784 in Disk-based Hash Aggregation. Before processing tuples, agg_refill_hash_table() was setting all pergroup pointers to NULL to signal to advance_aggregates() that it should not attempt to advance groups that had spilled. The problem was that it also set the pergroups for sorted grouping sets to NULL, which caused rescanning to fail. Instead, change agg_refill_hash_table() to only set the pergroups for hashed grouping sets to NULL; and when compiling the expression, pass doSort=false. Reported-by: Alexander Lakhin Discussion: https://postgr.es/m/16784-7ff169bf2c3d1588%40postgresql.org Backpatch-through: 13	2020-12-26 17:27:50 -08:00
Noah Misch	9f8a48bb2c	Invalidate acl.c caches when pg_authid changes. This makes existing sessions reflect "ALTER ROLE ... [NO]INHERIT" as quickly as they have been reflecting "GRANT role_name". Back-patch to 9.5 (all supported versions). Reviewed by Nathan Bossart. Discussion: https://postgr.es/m/20201221095028.GB3777719@rfd.leadboat.com	2020-12-25 10:42:02 -08:00
Tom Lane	0217ad8066	Fix race condition between shutdown and unstarted background workers. If a database shutdown (smart or fast) is commanded between the time some process decides to request a new background worker and the time that the postmaster can launch that worker, then nothing happens because the postmaster won't launch any bgworkers once it's exited PM_RUN state. This is fine ... unless the requesting process is waiting for that worker to finish (or even for it to start); in that case the requestor is stuck, and only manual intervention will get us to the point of being able to shut down. To fix, cancel pending requests for workers when the postmaster sends shutdown (SIGTERM) signals, and similarly cancel any new requests that arrive after that point. (We can optimize things slightly by only doing the cancellation for workers that have waiters.) To fit within the existing bgworker APIs, the "cancel" is made to look like the worker was started and immediately stopped, causing deregistration of the bgworker entry. Waiting processes would have to deal with premature worker exit anyway, so this should introduce no bugs that weren't there before. We do have a side effect that registration records for restartable bgworkers might disappear when theoretically they should have remained in place; but since we're shutting down, that shouldn't matter. Back-patch to v10. There might be value in putting this into 9.6 as well, but the management of bgworkers is a bit different there (notably see `8ff518699`) and I'm not convinced it's worth the effort to validate the patch for that branch. Discussion: https://postgr.es/m/661570.1608673226@sss.pgh.pa.us	2020-12-24 17:00:43 -05:00
Michael Paquier	1e54664eee	Fix portability issues with parsing of recovery_target_xid The parsing of this parameter has been using strtoul(), which is not portable across platforms. On most Unix platforms, unsigned long has a size of 64 bits, while on Windows it is 32 bits. It is common in recovery scenarios to rely on the output of txid_current() or even the newer pg_current_xact_id() to get a transaction ID for setting up recovery_target_xid. The value returned by those functions includes the epoch in the computed result, which would cause strtoul() to fail where unsigned long has a size of 32 bits once the epoch is incremented. WAL records and 2PC data include only information about 32-bit XIDs and it is not possible to have XIDs across more than one epoch, so discarding the high bits from the transaction ID set has no impact on recovery. On the contrary, the use of strtoul() prevents a consistent behavior across platforms depending on the size of unsigned long. This commit changes the parsing of recovery_target_xid to use pg_strtouint64() instead, available down to 9.6. There is one TAP test stressing recovery with recovery_target_xid, where a tweak based on pg_reset{xlog,wal} is added to bump the XID epoch so as this change gets tested, as per an idea from Alexander Lakhin. Reported-by: Alexander Lakhin Discussion: https://postgr.es/m/16780-107fd0c0385b1035@postgresql.org Backpatch-through: 9.6	2020-12-23 12:51:35 +09:00

1 2 3 4 5 ...

20880 commits