Commit c8ea87e4b introduced a temporary conversion buffer for
substrings extracted during regexp splits. Unfortunately the code that
sized it was failing to ignore the effects of ignored degenerate
regexp matches, so for regexp_split_* calls it could under-size the
buffer in such cases.
Fix, and add some regression test cases (though those will only catch
the bug if run in a multibyte encoding).
Backpatch to 9.3 as the faulty code was.
Thanks to the PostGIS project, Regina Obe and Paul Ramsey for the
report (via IRC) and assistance in analysis. Patch by me.
Up to now, get_const_expr() insisted on prefixing BIT and VARBIT
literals with 'B'. That's not really necessary, because we always
append explicit-cast syntax to identify the constant's type.
Moreover, it's subtly wrong for VARBIT, because the parser will
interpret B'...' as '...'::"bit"; see make_const() which explicitly
assigns type BITOID for a T_BitString literal. So what had been
a simple VARBIT literal is reconstructed as ('...'::"bit")::varbit,
which is not the same thing, at least not before constant folding.
This results in odd differences after dump/restore, as complained
of by the patch submitter, and it could result in actual failures in
partitioning or inheritance DDL operations (see commit 542320c2b,
which repaired similar misbehaviors for some other data types).
Fixing it is pretty easy: just remove the special case and let the
default code path handle these types. We could have kept the special
case for BIT only, but there seems little point in that.
Like the previous patch, I judge that back-patching this into stable
branches wouldn't be a good idea. However, it seems not quite too
late for v11, so let's fix it there.
Paul Guo, reviewed by Davy Machado and John Naylor, minor adjustments
by me
Discussion: https://postgr.es/m/CABQrizdTra=2JEqA6+Ms1D1k1Kqw+aiBBhC9TreuZRX2JzxLAA@mail.gmail.com
spgrescan would first reset traversalCxt, and then traverse a
potentially non-empty stack containing pointers to traversalValues
which had been allocated in those contexts, freeing them a second
time. This bug originates in commit ccd6eb49a where traversalValue was
introduced.
Repair by traversing the stack before the context reset; this isn't
ideal, since it means doing retail pfree in a context that's about to
be reset, but the freeing of a stack entry is also done in other
places in the code during the scan so it's not worth trying to
refactor it further. Regression test added.
Backpatch to 9.6 where the problem was introduced.
Per bug #15378; analysis and patch by me, originally from a report on
IRC by user velix; see also PostGIS ticket #4174; review by Alexander
Korotkov.
Discussion: https://postgr.es/m/153663176628.23136.11901365223750051490@wrigleys.postgresql.org
Buildfarm member tern failed src/bin/pg_ctl/t/001_start_stop.pl when a
check_mode_recursive() call overlapped a server's startup-time deletion
of pg_stat/global.stat. Just warn. Also, include errno in the message.
Back-patch to v11, where check_mode_recursive() first appeared.
It's been true for a long time that we expect names of table and domain
constraints to be unique among the constraints of that table or domain.
However, the enforcement of that has been pretty haphazard, and it missed
some corner cases such as creating a CHECK constraint and then an index
constraint of the same name (as per recent report from André Hänsel).
Also, due to the lack of an actual unique index enforcing this, duplicates
could be created through race conditions.
Moreover, the code that searches pg_constraint has been quite inconsistent
about how to handle duplicate names if one did occur: some places checked
and threw errors if there was more than one match, while others just
processed the first match they came to.
To fix, create a unique index on (conrelid, contypid, conname). Since
either conrelid or contypid is zero, this will separately enforce
uniqueness of constraint names among constraints of any one table and any
one domain. (If we ever implement SQL assertions, and put them into this
catalog, more thought might be needed. But it'd be at least as reasonable
to put them into a new catalog; having overloaded this one catalog with
two kinds of constraints was a mistake already IMO.) This index can replace
the existing non-unique index on conrelid, though we need to keep the one
on contypid for query performance reasons.
Having done that, we can simplify the logic in various places that either
coped with duplicates or neglected to, as well as potentially improve
lookup performance when searching for a constraint by name.
Also, as per our usual practice, install a preliminary check so that you
get something more friendly than a unique-index violation report in the
case complained of by André. And teach ChooseIndexName to avoid choosing
autogenerated names that would draw such a failure.
While it's not possible to make such a change in the back branches,
it doesn't seem quite too late to put this into v11, so do so.
Discussion: https://postgr.es/m/0c1001d4428f$0942b430$1bc81c90$@webkr.de
workers.
Allowing window function calculation in workers leads to inconsistent
results because if the input row ordering is not fully deterministic, the
output of window functions might vary across workers. The fix is to treat
them as parallel-restricted.
In the passing, improve the coding pattern in max_parallel_hazard_walker
so that it has a chain of mutually-exclusive if ... else if ... else if
... else if ... IsA tests.
Reported-by: Marko Tiikkaja
Bug: 15324
Author: Amit Kapila
Reviewed-by: Tom Lane
Backpatch-through: 9.6
Discussion: https://postgr.es/m/CAL9smLAnfPJCDUUG4ckX2iznj53V7VSMsYefzZieN93YxTNOcw@mail.gmail.com
This column was added in commit 8224de4f42 ("Indexes with INCLUDE
columns and their support in B-tree") to ease writing the ruleutils.c
supporting code for that feature, but it turns out to be unnecessary --
we can do the same thing with just one more syscache lookup.
Even the documentation for the new column being removed in this commit
is awkward.
Discussion: https://postgr.es/m/20180902165018.33otxftp3olgtu4t@alvherre.pgsql
Commit f49842d, which added support for partitionwise joins, built the
child's tlist by applying adjust_appendrel_attrs() to the parent's. So in
the case where the parent's included a whole-row Var for the parent, the
child's contained a ConvertRowtypeExpr. To cope with that, that commit
added code to the planner, such as setrefs.c, but some code paths still
assumed that the tlist for a scan (or join) rel would only include Vars
and PlaceHolderVars, which was true before that commit, causing errors:
* When creating an explicit sort node for an input path for a mergejoin
path for a child join, prepare_sort_from_pathkeys() threw the 'could not
find pathkey item to sort' error.
* When deparsing a relation participating in a pushed down child join as a
subquery in contrib/postgres_fdw, get_relation_column_alias_ids() threw
the 'unexpected expression in subquery output' error.
* When performing set_plan_references() on a local join plan generated by
contrib/postgres_fdw for EvalPlanQual support for a pushed down child
join, fix_join_expr() threw the 'variable not found in subplan target
lists' error.
To fix these, two approaches have been proposed: one by Ashutosh Bapat and
one by me. While the former keeps building the child's tlist with a
ConvertRowtypeExpr, the latter builds it with a whole-row Var for the
child not to violate the planner assumption, and tries to fix it up later,
But both approaches need more work, so refuse to generate partitionwise
join paths when whole-row Vars are involved, instead. We don't need to
handle ConvertRowtypeExprs in the child's tlists for now, so this commit
also removes the changes to the planner.
Previously, partitionwise join computed attr_needed data for each child
separately, and built the child join's tlist using that data, which also
required an extra step for adding PlaceHolderVars to that tlist, but it
would be more efficient to build it from the parent join's tlist through
the adjust_appendrel_attrs() transformation. So this commit builds that
list that way, and simplifies build_joinrel_tlist() and placeholder.c as
well as part of set_append_rel_size() to basically what they were before
partitionwise join went in.
Back-patch to PG11 where partitionwise join was introduced.
Report by Rajkumar Raghuwanshi. Analysis by Ashutosh Bapat, who also
provided some of regression tests. Patch by me, reviewed by Robert Haas.
Discussion: https://postgr.es/m/CAKcux6ktu-8tefLWtQuuZBYFaZA83vUzuRd7c1YHC-yEWyYFpg@mail.gmail.com
Commits c6b3c939b (which fixed the precedence of >=, <=, <> operators)
and 865f14a2d (which added support for the standard => notation for
named arguments) created a class of lexer tokens which look like
multi-character operators but which have their own token IDs distinct
from Op. However, longest-match rules meant that following any of
these tokens with another operator character, as in (1<>-1), would
cause them to be incorrectly returned as Op.
The error here isn't immediately obvious, because the parser would
usually still find the correct operator via the Op token, but there
were more subtle problems:
1. If immediately followed by a comment or +-, >= <= <> would be given
the old precedence of Op rather than the correct new precedence;
2. If followed by a comment, != would be returned as Op rather than as
NOT_EQUAL, causing it not to be found at all;
3. If followed by a comment or +-, the => token for named arguments
would be lexed as Op, causing the argument to be mis-parsed as a
simple expression, usually causing an error.
Fix by explicitly checking for the operators in the {operator} code
block in addition to all the existing special cases there.
Backpatch to 9.5 where the problem was introduced.
Analysis and patch by me; review by Tom Lane.
Discussion: https://postgr.es/m/87va851ppl.fsf@news-spur.riddles.org.uk
Since procedures are now a different thing from functions, change the
CREATE TRIGGER and CREATE EVENT TRIGGER syntax to use FUNCTION in the
clause that specifies the function. PROCEDURE is still accepted for
compatibility.
pg_dump and ruleutils.c output is not changed yet, because that would
require a change in information_schema.sql and thus a catversion change.
Reported-by: Peter Geoghegan <pg@bowt.ie>
Reviewed-by: Jonathan S. Katz <jonathan.katz@excoventures.com>
Since procedures are now a different thing from functions, change the
CREATE OPERATOR syntax to use FUNCTION in the clause that specifies the
function. PROCEDURE is still accepted for compatibility.
Reported-by: Peter Geoghegan <pg@bowt.ie>
Reviewed-by: Jonathan S. Katz <jonathan.katz@excoventures.com>
Historically, the term procedure was used as a synonym for function in
Postgres/PostgreSQL. Now we have procedures as separate objects from
functions, so we need to clean up the documentation to not mix those
terms.
In particular, mentions of "trigger procedures" are changed to "trigger
functions", and access method "support procedures" are changed to
"support functions". (The latter already used FUNCTION in the SQL
syntax anyway.) Also, the terminology in the SPI chapter has been
cleaned up.
A few tests, examples, and code comments are also adjusted to be
consistent with documentation changes, but not everything.
Reported-by: Peter Geoghegan <pg@bowt.ie>
Reviewed-by: Jonathan S. Katz <jonathan.katz@excoventures.com>
When executing a SubPlan in an expression, the EState's direction
field was left alone, resulting in an attempt to execute the subplan
backwards if it was encountered during a backwards scan of a cursor.
Also, though much less likely, it was possible to reach the execution
of an InitPlan while in backwards-scan state.
Repair by saving/restoring estate->es_direction and forcing forward
scan mode in the relevant places.
Backpatch all the way, since this has been broken since 8.3 (prior to
commit c7ff7663e, SubPlans had their own EStates rather than sharing
the parent plan's, so there was no confusion over scan direction).
Per bug #15336 reported by Vladimir Baranoff; analysis and patch by
me, review by Tom Lane.
Discussion: https://postgr.es/m/153449812167.1304.1741624125628126322@wrigleys.postgresql.org
In a multi-layer partitioning setup, if at plan time all the
sub-partitions are pruned but the intermediate one remains, the executor
later throws a spurious error that there's nothing to prune. That is
correct, but there's no reason to throw an error. Therefore, don't.
Reported-by: Andreas Seltenreich <seltenreich@gmx.de>
Author: David Rowley <david.rowley@2ndquadrant.com>
Discussion: https://postgr.es/m/87in4h98i0.fsf@ansel.ydns.eu
Fix a small number of places that were testing the result of snprintf()
but doing so incorrectly. The right test for buffer overrun, per C99,
is "result >= bufsize" not "result > bufsize". Some places were also
checking for failure with "result == -1", but the standard only says
that a negative value is delivered on failure.
(Note that this only makes these places correct if snprintf() delivers
C99-compliant results. But at least now these places are consistent
with all the other places where we assume that.)
Also, make psql_start_test() and isolation_start_test() check for
buffer overrun while constructing their shell commands. There seems
like a higher risk of overrun, with more severe consequences, here
than there is for the individual file paths that are made elsewhere
in the same functions, so this seemed like a worthwhile change.
Also fix guc.c's do_serialize() to initialize errno = 0 before
calling vsnprintf. In principle, this should be unnecessary because
vsnprintf should have set errno if it returns a failure indication ...
but the other two places this coding pattern is cribbed from don't
assume that, so let's be consistent.
These errors are all very old, so back-patch as appropriate. I think
that only the shell command overrun cases are even theoretically
reachable in practice, but there's not much point in erroneous error
checks.
Discussion: https://postgr.es/m/17245.1534289329@sss.pgh.pa.us
The pump_nb() step might've already received the desired data, so we must
check for that at the top of the loop not the bottom. Otherwise, the
call to pump() will sit with nothing to do until the timeout elapses.
pump_until then falls out with apparent success ... but the timeout has
been used up, causing the next call of pump_until to report a timeout
failure. I believe this explains the intermittent timeout failures
we've seen in the buildfarm ever since this test went in. I was able
to reproduce the problem on gaur semi-repeatably, and this appears to
fix it.
In passing, remove a duplicate assignment, fix one stdin-assignment to
look like the rest, and document the test's dependency on test_decoding.
When considering a partitioning parent rel, we should stop processing that
subroot as soon as we've done adjust_appendrel_attrs and any securityQuals
updates. The rest of this is unnecessary, and indeed adding duplicate
subquery RTEs to the subroot is *wrong*. As the code stood, the children
of that partition ended up with two sets of copied subquery RTEs, confusing
matters greatly. Even more hilarity ensued if all of the children got
excluded by constraint exclusion, so that the extra RTEs didn't make it
back into the parent rtable.
Per fuzz testing by Andreas Seltenreich. Back-patch to v11 where this
got broken (by commit 0a480502b, it looks like).
Discussion: https://postgr.es/m/87va8g7vq0.fsf@ansel.ydns.eu
Commit 1c2cb2744 added some code that tried to detect whether two
RelOptInfos were the "same" rel by pointer comparison; but it turns
out that inheritance_planner breaks that, through its shenanigans
with copying some relations forward into new subproblems. Compare
relid sets instead. Add a regression test case to exercise this
area.
Problem reported by Rushabh Lathia; diagnosis and fix by Amit Langote,
modified a bit by me.
Discussion: https://postgr.es/m/CAGPqQf3anJGj65bqAQ9edDr8gF7qig6_avRgwMT9MsZ19COUPw@mail.gmail.com
There are some problems with the tls-unique channel binding type. It's not
supported by all SSL libraries, and strictly speaking it's not defined for
TLS 1.3 at all, even though at least in OpenSSL, the functions used for it
still seem to work with TLS 1.3 connections. And since we had no
mechanism to negotiate what channel binding type to use, there would be
awkward interoperability issues if a server only supported some channel
binding types. tls-server-end-point seems feasible to support with any SSL
library, so let's just stick to that.
This removes the scram_channel_binding libpq option altogether, since there
is now only one supported channel binding type.
This also removes all the channel binding tests from the SSL test suite.
They were really just testing the scram_channel_binding option, which
is now gone. Channel binding is used if both client and server support it,
so it is used in the existing tests. It would be good to have some tests
specifically for channel binding, to make sure it really is used, and the
different combinations of a client and a server that support or doesn't
support it. The current set of settings we have make it hard to write such
tests, but I did test those things manually, by disabling
HAVE_BE_TLS_GET_CERTIFICATE_HASH and/or
HAVE_PGTLS_GET_PEER_CERTIFICATE_HASH.
I also removed the SCRAM_CHANNEL_BINDING_TLS_END_POINT constant. This is a
matter of taste, but IMO it's more readable to just use the
"tls-server-end-point" string.
Refactor the checks on whether the SSL library supports the functions
needed for tls-server-end-point channel binding. Now the server won't
advertise, and the client won't choose, the SCRAM-SHA-256-PLUS variant, if
compiled with an OpenSSL version too old to support it.
In the passing, add some sanity checks to check that the chosen SASL
mechanism, SCRAM-SHA-256 or SCRAM-SHA-256-PLUS, matches whether the SCRAM
exchange used channel binding or not. For example, if the client selects
the non-channel-binding variant SCRAM-SHA-256, but in the SCRAM message
uses channel binding anyway. It's harmless from a security point of view,
I believe, and I'm not sure if there are some other conditions that would
cause the connection to fail, but it seems better to be strict about these
things and check explicitly.
Discussion: https://www.postgresql.org/message-id/ec787074-2305-c6f4-86aa-6902f98485a4%40iki.fi
When expanding an updatable view that is an INSERT's target, the rewriter
failed to rewrite Vars in the ON CONFLICT UPDATE clause. This accidentally
worked if the view was just "SELECT * FROM ...", as the transformation
would be a no-op in that case. With more complicated view targetlists,
this omission would often lead to "attribute ... has the wrong type" errors
or even crashes, as reported by Mario De Frutos Dieguez.
Fix by adding code to rewriteTargetView to fix up the data structure
correctly. The easiest way to update the exclRelTlist list is to rebuild
it from scratch looking at the new target relation, so factor the code
for that out of transformOnConflictClause to make it sharable.
In passing, avoid duplicate permissions checks against the EXCLUDED
pseudo-relation, and prevent useless view expansion of that relation's
dummy RTE. The latter is only known to happen (after this patch) in cases
where the query would fail later due to not having any INSTEAD OF triggers
for the view. But by exactly that token, it would create an unintended
and very poorly tested state of the query data structure, so it seems like
a good idea to prevent it from happening at all.
This has been broken since ON CONFLICT was introduced, so back-patch
to 9.5.
Dean Rasheed, based on an earlier patch by Amit Langote;
comment-kibitzing and back-patching by me
Discussion: https://postgr.es/m/CAFYwGJ0xfzy8jaK80hVN2eUWr6huce0RU8AgU04MGD00igqkTg@mail.gmail.com
This suite malfunctioned if the canonical name of "localhost" was
something other than "localhost", such as "localhost.localdomain". Use
hostaddr=127.0.0.1 and a fictitious host=, so the resolver's answers for
"localhost" don't affect the outcome. Back-patch to v11, which
introduced this test suite.
Discussion: https://postgr.es/m/20180801050903.GA1392916@rfd.leadboat.com
The previous coding here supposed that if run-time partitioning applied to
a particular Append/MergeAppend plan, then all child plans of that node
must be members of a single partitioning hierarchy. This is totally wrong,
since an Append could be formed from a UNION ALL: we could have multiple
hierarchies sharing the same Append, or child plans that aren't part of any
hierarchy.
To fix, restructure the related plan-time and execution-time data
structures so that we can have a separate list or array for each
partitioning hierarchy. Also track subplans that are not part of any
hierarchy, and make sure they don't get pruned.
Per reports from Phil Florent and others. Back-patch to v11, since
the bug originated there.
David Rowley, with a lot of cosmetic adjustments by me; thanks also
to Amit Langote for review.
Discussion: https://postgr.es/m/HE1PR03MB17068BB27404C90B5B788BCABA7B0@HE1PR03MB1706.eurprd03.prod.outlook.com
Commits 742869946 et al turn out to be a couple bricks shy of a load.
We were dumping the stored values of GUC_LIST_QUOTE variables as they
appear in proconfig or setconfig catalog columns. However, although that
quoting rule looks a lot like SQL-identifier double quotes, there are two
critical differences: empty strings ("") are legal, and depending on which
variable you're considering, values longer than NAMEDATALEN might be valid
too. So the current technique fails altogether on empty-string list
entries (as reported by Steven Winfield in bug #15248) and it also risks
truncating file pathnames during dump/reload of GUC values that are lists
of pathnames.
To fix, split the stored value without any downcasing or truncation,
and then emit each element as a SQL string literal.
This is a tad annoying, because we now have three copies of the
comma-separated-string splitting logic in varlena.c as well as a fourth
one in dumputils.c. (Not to mention the randomly-different-from-those
splitting logic in libpq...) I looked at unifying these, but it would
be rather a mess unless we're willing to tweak the API definitions of
SplitIdentifierString, SplitDirectoriesString, or both. That might be
worth doing in future; but it seems pretty unsafe for a back-patched
bug fix, so for now accept the duplication.
Back-patch to all supported branches, as the previous fix was.
Discussion: https://postgr.es/m/7585.1529435872@sss.pgh.pa.us
In commit 84940644de, bms_add_range was added with an API to fail with
an error if an empty range was specified. This seems arbitrary and
unhelpful, so turn that case into a no-op instead. Callers that require
further verification on the arguments or result can apply them by
themselves.
This fixes the bug that partition pruning throws an API error for a case
involving the default partition of a default partition, as in the
included test case.
Reported-by: Rajkumar Raghuwanshi <rajkumar.raghuwanshi@enterprisedb.com>
Diagnosed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/16590.1532622503@sss.pgh.pa.us
transformPartitionSpec rejected duplicate simple partition columns
(e.g., "PARTITION BY RANGE (x,x)") but paid no attention to expression
columns, resulting in inconsistent behavior. Worse, cases like
"PARTITION BY RANGE (x,(x))") were accepted but would then result in
dump/reload failures, since the expression (x) would get simplified
to a plain column later.
There seems no better reason for this restriction than there was for
the one against duplicate included index columns (cf commit 701fd0bbc),
so let's just remove it.
Back-patch to v10 where this code was added.
Report and patch by Yugo Nagata.
Discussion: https://postgr.es/m/20180712165939.36b12aff.nagata@sraoss.co.jp
This is essential information when looking at an index that has
"included" columns. Per discussion, follow the style used in \dC
and some other places: column header is "Key?" and values are "yes"
or "no" (all translatable).
While at it, revise describeOneTableDetails to be a bit more maintainable:
avoid hard-wired column numbers and multiple repetitions of what needs
to be identical test logic. This also results in the emitted catalog
query corresponding more closely to what we print, which should be a
benefit to users of ECHO_HIDDEN mode, and perhaps a bit faster too
(the old logic sometimes asked for values it would not print, even
ones that are fairly expensive to get).
Discussion: https://postgr.es/m/21724.1531943735@sss.pgh.pa.us
The multi-argument form of pg_get_indexdef() failed to print anything when
asked to print a single index column that is an included column rather than
a key column. This seems an unintentional result of someone having tried
to take a short-cut and use the attrsOnly flag for two different purposes.
To fix, split said flag into two flags, attrsOnly which suppresses
non-attribute info, and keysOnly which suppresses included columns.
Add a test case using psql's \d command, which relies on that function.
(It's mighty tempting at this point to replace pg_get_indexdef_worker's
mess of boolean flag arguments with a single bitmask-of-flags argument,
which would allow making the call sites much more self-documenting.
But I refrained for the moment.)
Discussion: https://postgr.es/m/21724.1531943735@sss.pgh.pa.us
This script supposed that if it turned hot_standby_feedback on and then
shut down the standby server, at least one feedback message would be
guaranteed to be sent before the standby stops. But there is no such
guarantee, if the standby's walreceiver process is slow enough --- and
we've seen multiple failures in the buildfarm showing that that does
happen in practice. While we could rearrange the walreceiver logic to
make it less likely, it seems probably impossible to create a really
bulletproof guarantee of that sort; and if we tried, we might create
situations where the walreceiver wouldn't react in a timely manner to
shutdown commands. It seems better instead to remove the script's
assumption that feedback will occur before shutdown.
But once we do that, these last few tests seem quite redundant with
the earlier tests in the script. So let's just drop them altogether
and save some buildfarm cycles.
Backpatch to v10 where these tests were added.
Discussion: https://postgr.es/m/1922.1531592205@sss.pgh.pa.us
The initial version of the included-index-column feature stated that
included columns couldn't be the same as any key column of the index.
While it'd be pretty silly to do that, since the included column would be
entirely redundant, we've never prohibited redundant index columns before
so it's not very consistent to do so here. Moreover, the prohibition
was itself badly implemented, so that it failed to reject columns that
were effectively identical but not spelled quite alike, as reported by
Aditya Toshniwal.
(Moreover, it's not hard to imagine that for some non-btree index types,
such cases would be non-silly anyhow: the index might use a lossy
representation for key columns but be able to support retrieval of the
original form of included columns.)
Hence, let's just drop the prohibition.
In passing, do some copy-editing on the documentation for the
included-column feature.
Yugo Nagata; documentation and test corrections by me
Discussion: https://postgr.es/m/CAM9w-_mhBCys4fQNfaiQKTRrVWtoFrZ-wXmDuE9Nj5y-Y7aDKQ@mail.gmail.com
A collection of typos I happened to spot while reading code, as well as
grepping for common mistakes.
Backpatch to all supported versions, as applicable, to avoid conflicts
when backporting other commits in the future.
The existing error message was complaining that the column is not an
expression, which is not correct. Introduce a suitable wording
variation and a test.
Co-authored-by: Yugo Nagata <nagata@sraoss.co.jp>
Discussion: https://postgr.es/m/20180628182803.e4632d5a.nagata@sraoss.co.jp
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
The original code was unable to prune partitions that could not possibly
contain NULL values, when the query specified less than all columns in a
multicolumn partition key. Reorder the if-tests so that it is, and add
more commentary and regression tests.
Reported-by: Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>
Co-authored-by: Dilip Kumar <dilipbalaut@gmail.com>
Co-authored-by: Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>
Co-authored-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>
Reviewed-by: amul sul <sulamul@gmail.com>
Discussion: https://postgr.es/m/CAFjFpRc7qjLUfXLVBBC_HAnx644sjTYM=qVoT3TJ840HPbsTXw@mail.gmail.com
As of commit 37a795a60, populate_recordset_worker() tried to pass back
(as rsi.setDesc) a tupdesc that it also had cached in its fn_extra.
But the core executor would free the passed-back tupdesc, risking a
crash if the function were called again in the same query. The safest
and least invasive way to fix that is to make an extra tupdesc copy
to pass back.
While at it, I failed to resist the temptation to get rid of unnecessary
get_fn_expr_argtype() calls here and in populate_record_worker().
Per report from Dmitry Dolgov; thanks to Michael Paquier and
Andrew Gierth for investigation and discussion.
Discussion: https://postgr.es/m/CA+q6zcWzN9ztCfR47ZwgTr1KLnuO6BAY6FurxXhovP4hxr+yOQ@mail.gmail.com
When truncating a table that is referenced by foreign keys in
partitioned tables, the check to ensure the referencing table are also
truncated spuriously failed. This is because it was relying on
relhastriggers as a proxy for the table having FKs, and that's wrong for
partitioned tables. Fix it to consider such tables separately. There
may be a better way ... but this code is pretty inefficient already.
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Michael Paquiër <michael@paquier.xyz>
Discussion: https://postgr.es/m/20180711000624.zmeizicibxeehhsg@alvherre.pgsql
An update that causes the tuple to be moved to a different partition was
missing out on re-constructing the to-be-updated tuple, based on the latest
tuple in the update chain. Instead, it's simply deleting the latest tuple
and inserting a new tuple in the new partition based on the old tuple.
Commit 2f17844104 didn't consider this case, so some of the updates were
getting lost.
In passing, change the argument order for output parameter in ExecDelete
and add some commentary about it.
Reported-by: Pavan Deolasee
Author: Amit Khandekar, with minor changes by me
Reviewed-by: Dilip Kumar, Amit Kapila and Alvaro Herrera
Backpatch-through: 11
Discussion: https://postgr.es/m/CAJ3gD9fRbEzDqdeDq1jxqZUb47kJn+tQ7=Bcgjc8quqKsDViKQ@mail.gmail.com
WAL senders sending logically-decoded data fail to properly report in
"streaming" state when starting up, hence as long as one extra record is
not replayed, such WAL senders would remain in a "catchup" state, which
is inconsistent with the physical cousin.
This can be easily reproduced by for example using pg_recvlogical and
restarting the upstream server. The TAP tests have been slightly
modified to detect the failure and strengthened so as future tests also
make sure that a node is in streaming state when waiting for its
catchup.
Backpatch down to 9.4 where this code has been introduced.
Reported-by: Sawada Masahiko
Author: Simon Riggs, Sawada Masahiko
Reviewed-by: Petr Jelinek, Michael Paquier, Vaishnavi Prabakaran
Discussion: https://postgr.es/m/CAD21AoB2ZbCCqOx=bgKMcLrAvs1V0ZMqzs7wBTuDySezTGtMZA@mail.gmail.com
We should only run apply_pathtarget_labeling_to_tlist if CP_LABEL_TLIST
was specified, because only in that case has use_physical_tlist checked
that the labeling will succeed; otherwise we may get an "ORDER/GROUP BY
expression not found in targetlist" error. (This subsumes the previous
test about gating_clauses, because we reset "flags" to zero earlier
if there are gating clauses to apply.)
The only known case in which a failure can occur is with a ProjectSet
path directly atop a table scan path, although it seems likely that there
are other cases or will be such in future. This means that the failure
is currently only visible in the v10 branch: 9.6 didn't have ProjectSet,
while in v11 and HEAD, apply_scanjoin_target_to_paths for some weird
reason is using create_projection_path not apply_projection_to_path,
masking the problem because there's a ProjectionPath in between.
Nonetheless this code is clearly wrong on its own terms, so back-patch
to 9.6 where this logic was introduced.
Per report from Regina Obe.
Discussion: https://postgr.es/m/001501d40f88$75186950$5f493bf0$@pcorp.us
nodeWindowAgg.c failed to cope with the possibility that no ordering
columns are defined in the window frame for GROUPS mode or RANGE OFFSET
mode, leading to assertion failures or odd errors, as reported by Masahiko
Sawada and Lukas Eder. In RANGE OFFSET mode, an ordering column is really
required, so add an Assert about that. In GROUPS mode, the code would
work, except that the node initialization code wasn't in sync with the
execution code about when to set up tuplestore read pointers and spare
slots. Fix the latter for consistency's sake (even though I think the
changes described below make the out-of-sync cases unreachable for now).
Per SQL spec, a single ordering column is required for RANGE OFFSET mode,
and at least one ordering column is required for GROUPS mode. The parser
enforced the former but not the latter; add a check for that.
We were able to reach the no-ordering-column cases even with fully spec
compliant queries, though, because the planner would drop partitioning
and ordering columns from the generated plan if they were redundant with
earlier columns according to the redundant-pathkey logic, for instance
"PARTITION BY x ORDER BY y" in the presence of a "WHERE x=y" qual.
While in principle that's an optimization that could save some pointless
comparisons at runtime, it seems unlikely to be meaningful in the real
world. I think this behavior was not so much an intentional optimization
as a side-effect of an ancient decision to construct the plan node's
ordering-column info by reverse-engineering the PathKeys of the input
path. If we give up redundant-column removal then it takes very little
code to generate the plan node info directly from the WindowClause,
ensuring that we have the expected number of ordering columns in all
cases. (If anyone does complain about this, the planner could perhaps
be taught to remove redundant columns only when it's safe to do so,
ie *not* in RANGE OFFSET mode. But I doubt anyone ever will.)
With these changes, the WindowAggPath.winpathkeys field is not used for
anything anymore, so remove it.
The test cases added here are not actually very interesting given the
removal of the redundant-column-removal logic, but they would represent
important corner cases if anyone ever tries to put that back.
Tom Lane and Masahiko Sawada. Back-patch to v11 where RANGE OFFSET
and GROUPS modes were added.
Discussion: https://postgr.es/m/CAD21AoDrWqycq-w_+Bx1cjc+YUhZ11XTj9rfxNiNDojjBx8Fjw@mail.gmail.com
Discussion: https://postgr.es/m/153086788677.17476.8002640580496698831@wrigleys.postgresql.org
We fail to handle polymorphic types properly when they are used as
partition keys: we were unnecessarily adding a RelabelType node on top,
which confuses code examining the nodes. In particular, this makes
predtest.c-based partition pruning not to work, and ruleutils.c to emit
expressions that are uglier than needed. Fix it by not adding RelabelType
when not needed.
In master/11 the new pruning code is separate so it doesn't suffer from
this problem, since we already fixed it (in essentially the same way) in
e5dcbb88a1, which also added a few tests; back-patch those tests to
pg10 also. But since UPDATE/DELETE still uses predtest.c in pg11, this
change improves partitioning for those cases too. Add tests for this.
The ruleutils.c behavior change is relevant in pg11/master too.
Co-authored-by: Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>
Co-authored-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://postgr.es/m/54745d13-7ed4-54ac-97d8-ea1eec95ae25@lab.ntt.co.jp
While we probably don't want to split up all error messages into
function and procedure variants, this one is a very prominent one, so
it's helpful to be more specific here.
If a standby crashes after promotion before having completed its first
post-recovery checkpoint, then the minimal recovery point which marks
the LSN position where the cluster is able to reach consistency may be
set to a position older than the first end-of-recovery checkpoint while
all the WAL available should be replayed. This leads to the instance
thinking that it contains inconsistent pages, causing a PANIC and a hard
instance crash even if all the WAL available has not been replayed for
certain sets of records replayed. When in crash recovery,
minRecoveryPoint is expected to always be set to InvalidXLogRecPtr,
which forces the recovery to replay all the WAL available, so this
commit makes sure that the local copy of minRecoveryPoint from the
control file is initialized properly and stays as it is while crash
recovery is performed. Once switching to archive recovery or if crash
recovery finishes, then the local copy minRecoveryPoint can be safely
updated.
Pavan Deolasee has reported and diagnosed the failure in the first
place, and the base fix idea to rely on the local copy of
minRecoveryPoint comes from Kyotaro Horiguchi, which has been expanded
into a full-fledged patch by me. The test included in this commit has
been written by Álvaro Herrera and Pavan Deolasee, which I have modified
to make it faster and more reliable with sleep phases.
Backpatch down to all supported versions where the bug appears, aka 9.3
which is where the end-of-recovery checkpoint is not run by the startup
process anymore. The test gets easily supported down to 10, still it
has been tested on all branches.
Reported-by: Pavan Deolasee
Diagnosed-by: Pavan Deolasee
Reviewed-by: Pavan Deolasee, Kyotaro Horiguchi
Author: Michael Paquier, Kyotaro Horiguchi, Pavan Deolasee, Álvaro
Herrera
Discussion: https://postgr.es/m/CABOikdPOewjNL=05K5CbNMxnNtXnQjhTx2F--4p4ruorCjukbA@mail.gmail.com
Since recent commit 1c7c317c, temporary relations cannot be mixed with
permanent relations within the same partition tree, and the same counts
for temporary relations created by other sessions, which the planner
simply discarded. Instead be paranoid and issue an error, as those
should be blocked at definition time, at least for now.
At the same time, a test case is added to stress what has been moved
when expand_partitioned_rtentry gets called recursively but bumps on a
partitioned relation with no partitions which should be handled the same
way as the non-inheritance case. This code may be reworked in a close
future, and covering this code path will limit surprises.
Reported-by: David Rowley
Author: David Rowley
Reviewed-by: Amit Langote, Robert Haas, Michael Paquier
Discussion: https://postgr.es/m/CAKJS1f_HyV1txn_4XSdH5EOhBMYaCwsXyAj6bHXk9gOu4JKsbw@mail.gmail.com
When these programs call pg_catalog.set_config, they need to check for
PGRES_TUPLES_OK instead of PGRES_COMMAND_OK. Fix for
5770172cb0.
Reported-by: Ideriha, Takeshi <ideriha.takeshi@jp.fujitsu.com>