VALIDATE CONSTRAINT
CLUSTER ON
SET WITHOUT CLUSTER
ALTER COLUMN SET STATISTICS
ALTER COLUMN SET ()
ALTER COLUMN RESET ()
All other sub-commands use AccessExclusiveLock
Simon Riggs and Noah Misch
Reviews by Robert Haas and Andres Freund
Also add a regression test for a GIN index with enough items with the same
key, so that a GIN posting tree gets created. Apparently none of the
existing GIN tests were large enough for that.
This code is new, no backpatching required.
This is needed because Windows services may get started with a different
current directory than where pg_ctl is executed. We want relative -D
paths to be interpreted relative to pg_ctl's CWD, similarly to what
happens on other platforms.
In support of this, move the backend's make_absolute_path() function
into src/port/path.c (where it probably should have been long since)
and get rid of the rather inferior version in pg_regress.
Kumar Rajeev Rastogi, reviewed by MauMau
Any OS user able to access the socket can connect as the bootstrap
superuser and in turn execute arbitrary code as the OS user running the
test. Protect against that by placing the socket in the temporary data
directory, which has mode 0700 thanks to initdb. Back-patch to 8.4 (all
supported versions). The hazard remains wherever the temporary cluster
accepts TCP connections, notably on Windows.
Attempts to run "make check" from a directory with a long name will now
fail. An alternative not sharing that problem was to place the socket
in a subdirectory of /tmp, but that is only secure if /tmp is sticky.
The PG_REGRESS_SOCK_DIR environment variable is available as a
workaround when testing from long directory paths.
As a convenient side effect, this lets testing proceed smoothly in
builds that override DEFAULT_PGSOCKET_DIR. Popular non-default values
like /var/run/postgresql are often unwritable to the build user.
Security: CVE-2014-0067
The original coding of EquivalenceClasses didn't foresee that appendrel
child relations might themselves be appendrels; but this is possible for
example when a UNION ALL subquery scans a table with inheritance children.
The oversight led to failure to optimize ordering-related issues very well
for the grandchild tables. After some false starts involving explicitly
flattening the appendrel representation, we found that this could be fixed
easily by removing a few implicit assumptions about appendrel parent rels
not being children themselves.
Kyotaro Horiguchi and Tom Lane, reviewed by Noah Misch
Display "replica identity" only for \d plus mode, exclude system schema
objects, and display all possible values, not just non-default,
non-index ones.
Set function parameter names and defaults. Add jsonb versions (which the
code already provided for so the actual new code is trivial). Add jsonb
regression tests and docs.
Bump catalog version (which I apparently forgot to do when jsonb was
committed).
The new format accepts exactly the same data as the json type. However, it is
stored in a format that does not require reparsing the orgiginal text in order
to process it, making it much more suitable for indexing and other operations.
Insignificant whitespace is discarded, and the order of object keys is not
preserved. Neither are duplicate object keys kept - the later value for a given
key is the only one stored.
The new type has all the functions and operators that the json type has,
with the exception of the json generation functions (to_json, json_agg etc.)
and with identical semantics. In addition, there are operator classes for
hash and btree indexing, and two classes for GIN indexing, that have no
equivalent in the json type.
This feature grew out of previous work by Oleg Bartunov and Teodor Sigaev, which
was intended to provide similar facilities to a nested hstore type, but which
in the end proved to have some significant compatibility issues.
Authors: Oleg Bartunov, Teodor Sigaev, Peter Geoghegan and Andrew Dunstan.
Review: Andres Freund
This covers all the SQL-standard trigger types supported for regular
tables; it does not cover constraint triggers. The approach for
acquiring the old row mirrors that for view INSTEAD OF triggers. For
AFTER ROW triggers, we spool the foreign tuples to a tuplestore.
This changes the FDW API contract; when deciding which columns to
populate in the slot returned from data modification callbacks, writable
FDWs will need to check for AFTER ROW triggers in addition to checking
for a RETURNING clause.
In support of the feature addition, refactor the TriggerFlags bits and
the assembly of old tuples in ModifyTable.
Ronan Dunklau, reviewed by KaiGai Kohei; some additional hacking by me.
With the GIN "fast scan" feature, GIN can skip items without fetching all
the keys for them, if it can prove that they don't match regardless of
those keys. So far, it has done the proving by calling the boolean
consistent function with all combinations of TRUE/FALSE for the unfetched
keys, but since that's O(n^2), it becomes unfeasible with more than a few
keys. We can avoid calling consistent with all the combinations, if we can
tell the operator class implementation directly which keys are unknown.
This commit includes a triConsistent function for the built-in array and
tsvector opclasses.
Alexander Korotkov, with some changes by me.
We should allow this so that matviews can be referenced in UPDATE/DELETE
statements in READ COMMITTED isolation level. The requirement for that
is that a re-fetch by TID will see the same row version the query saw
earlier, which is true of matviews, so there's no reason for the
restriction. Per bug #9398.
Michael Paquier, after a suggestion by me
This forces an input field containing the quoted null string to be
returned as a NULL. Without this option, only unquoted null strings
behave this way. This helps where some CSV producers insist on quoting
every field, whether or not it is needed. The option takes a list of
fields, and only applies to those columns. There is an equivalent
column-level option added to file_fdw.
Ian Barwick, with some tweaking by Andrew Dunstan, reviewed by Payal
Singh.
Author: Pavel Stěhule, editorialized somewhat by Álvaro Herrera
Reviewed-by: Tomáš Vondra, Marko Tiikkaja
With input from Fabrízio de Royes Mello, Jim Nasby
This feature, building on previous commits, allows the write-ahead log
stream to be decoded into a series of logical changes; that is,
inserts, updates, and deletes and the transactions which contain them.
It is capable of handling decoding even across changes to the schema
of the effected tables. The output format is controlled by a
so-called "output plugin"; an example is included. To make use of
this in a real replication system, the output plugin will need to be
modified to produce output in the format appropriate to that system,
and to perform filtering.
Currently, information can be extracted from the logical decoding
system only via SQL; future commits will add the ability to stream
changes via walsender.
Andres Freund, with review and other contributions from many other
people, including Álvaro Herrera, Abhijit Menon-Sen, Peter Gheogegan,
Kevin Grittner, Robert Haas, Heikki Linnakangas, Fujii Masao, Abhijit
Menon-Sen, Michael Paquier, Simon Riggs, Craig Ringer, and Steve
Singer.
Additional non-security issues/improvements spotted by Coverity.
In backend/libpq, no sense trying to protect against port->hba being
NULL after we've already dereferenced it in the switch() statement.
Prevent against possible overflow due to 32bit arithmitic in
basebackup throttling (not yet released, so no security concern).
Remove nonsensical check of array pointer against NULL in procarray.c,
looks to be a holdover from 9.1 and earlier when there were pointers
being used but now it's just an array.
Remove pointer check-against-NULL in tsearch/spell.c as we had already
dereferenced it above (in the strcmp()).
Remove dead code from adt/orderedsetaggs.c, isnull is checked
immediately after each tuplesort_getdatum() call and if true we return,
so no point checking it again down at the bottom.
Remove recently added minor error-condition memory leak in pg_regress.
A number of issues were identified by the Coverity scanner and are
addressed in this patch. None of these appear to be security issues
and many are mostly cosmetic changes.
Short comments for each of the changes follows.
Correct the semi-colon placement in be-secure.c regarding SSL retries.
Remove a useless comparison-to-NULL in proc.c (value is dereferenced
prior to this check and therefore can't be NULL).
Add checking of chmod() return values to initdb.
Fix a couple minor memory leaks in initdb.
Fix memory leak in pg_ctl- involves free'ing the config file contents.
Use an int to capture fgetc() return instead of an enum in pg_dump.
Fix minor memory leaks in pg_dump.
(note minor change to convertOperatorReference()'s API)
Check fclose()/remove() return codes in psql.
Check fstat(), find_my_exec() return codes in psql.
Various ECPG memory leak fixes.
Check find_my_exec() return in ECPG.
Explicitly ignore pqFlush return in libpq error-path.
Change PQfnumber() to avoid doing an strdup() when no changes required.
Remove a few useless check-against-NULL's (value deref'd beforehand).
Check rmtree(), malloc() results in pg_regress.
Also check get_alternative_expectfile() return in pg_regress.
Change input function error messages to be more consistent with what is
done elsewhere. Remove a bunch of redundant type casts, so that the
compiler will warn us if we screw up. Don't pass LSNs by value on
platforms where a Datum is only 32 bytes, per buildfarm. Move macros
for packing and unpacking LSNs to pg_lsn.h so that we can include
access/xlogdefs.h, to avoid an unsatisfied dependency on XLogRecPtr.
Coverity identified a number of places in which it couldn't prove that a
string being copied into a fixed-size buffer would fit. We believe that
most, perhaps all of these are in fact safe, or are copying data that is
coming from a trusted source so that any overrun is not really a security
issue. Nonetheless it seems prudent to forestall any risk by using
strlcpy() and similar functions.
Fixes by Peter Eisentraut and Jozef Mlich based on Coverity reports.
In addition, fix a potential null-pointer-dereference crash in
contrib/chkpass. The crypt(3) function is defined to return NULL on
failure, but chkpass.c didn't check for that before using the result.
The main practical case in which this could be an issue is if libc is
configured to refuse to execute unapproved hashing algorithms (e.g.,
"FIPS mode"). This ideally should've been a separate commit, but
since it touches code adjacent to one of the buffer overrun changes,
I included it in this commit to avoid last-minute merge issues.
This issue was reported by Honza Horak.
Security: CVE-2014-0065 for buffer overruns, CVE-2014-0066 for crypt()
Many server functions use the MAXDATELEN constant to size a buffer for
parsing or displaying a datetime value. It was much too small for the
longest possible interval output and slightly too small for certain
valid timestamp input, particularly input with a long timezone name.
The long input was rejected needlessly; the long output caused
interval_out() to overrun its buffer. ECPG's pgtypes library has a copy
of the vulnerable functions, which bore the same vulnerabilities along
with some of its own. In contrast to the server, certain long inputs
caused stack overflow rather than failing cleanly. Back-patch to 8.4
(all supported versions).
Reported by Daniel Schüssler, reviewed by Tom Lane.
Security: CVE-2014-0063
Granting a role without ADMIN OPTION is supposed to prevent the grantee
from adding or removing members from the granted role. Issuing SET ROLE
before the GRANT bypassed that, because the role itself had an implicit
right to add or remove members. Plug that hole by recognizing that
implicit right only when the session user matches the current role.
Additionally, do not recognize it during a security-restricted operation
or during execution of a SECURITY DEFINER function. The restriction on
SECURITY DEFINER is not security-critical. However, it seems best for a
user testing his own SECURITY DEFINER function to see the same behavior
others will see. Back-patch to 8.4 (all supported versions).
The SQL standards do not conflate roles and users as PostgreSQL does;
only SQL roles have members, and only SQL users initiate sessions. An
application using PostgreSQL users and roles as SQL users and roles will
never attempt to grant membership in the role that is the session user,
so the implicit right to add or remove members will never arise.
The security impact was mostly that a role member could revoke access
from others, contrary to the wishes of his own grantor. Unapproved role
member additions are less notable, because the member can still largely
achieve that by creating a view or a SECURITY DEFINER function.
Reviewed by Andres Freund and Tom Lane. Reported, independently, by
Jonas Sundman and Noah Misch.
Security: CVE-2014-0060
We used to have externs for getopt() and its API variables scattered
all over the place. Now that we find we're going to need to tweak the
variable declarations for Cygwin, it seems like a good idea to have
just one place to tweak.
In this commit, the variables are declared "#ifndef HAVE_GETOPT_H".
That may or may not work everywhere, but we'll soon find out.
Andres Freund
Providing this information as plain text was doubtless worth the trouble
ten years ago, but it seems likely that hardly anyone reads it in this
format anymore. And the effort required to maintain these files (in the
form of extra-complex markup rules in the relevant parts of the SGML
documentation) is significant. So, let's stop doing that and rely solely
on the other documentation formats.
Per discussion, the plain-text INSTALL instructions might still be worth
their keep, so we continue to generate that file.
Rather than remove HISTORY and src/test/regress/README from distribution
tarballs entirely, replace them with simple stub files that tell the reader
where to find the relevant documentation. This is mainly to avoid possibly
breaking packaging recipes that expect these files to exist.
Back-patch to all supported branches, because simplifying the markup
requirements for release notes won't help much unless we do it in all
branches.
We may process relcache flush requests during transaction startup or
shutdown. In general it's not terribly safe to do catalog access at those
times, so the code's habit of trying to immediately revalidate unflushable
relcache entries is risky. Although there are no field trouble reports
that are positively traceable to this, we have been able to demonstrate
failure of the assertions recently added in RelationIdGetRelation() and
SearchCatCache(). On the other hand, it seems safe to just postpone
revalidation of the cache entry until we're inside a valid transaction.
The one case where this is questionable is where we're exiting a
subtransaction and the outer transaction is holding the relcache entry open
--- but if we made any significant changes to the rel inside such a
subtransaction, we've got problems anyway. There are mechanisms in place
to prevent that (to wit, locks for cross-session cases and
CheckTableNotInUse() for intra-session cases), so let's trust to those
mechanisms to keep us out of trouble.
Given a composite-type parameter named x, "$1.*" worked fine, but "x.*"
not so much. This has been broken since named parameter references were
added in commit 9bff0780cf, so patch back
to 9.2. Per bug #9085 from Hardy Falk.
Previously an input array string that started with a single-element
array dimension would then later accept a multi-dimensional segment.
BACKWARD INCOMPATIBILITY
Replication slots are a crash-safe data structure which can be created
on either a master or a standby to prevent premature removal of
write-ahead log segments needed by a standby, as well as (with
hot_standby_feedback=on) pruning of tuples whose removal would cause
replication conflicts. Slots have some advantages over existing
techniques, as explained in the documentation.
In a few places, we refer to the type of replication slots introduced
by this patch as "physical" slots, because forthcoming patches for
logical decoding will also have slots, but with somewhat different
properties.
Andres Freund and Robert Haas
When pulling a "postponed" qual from a LATERAL subquery up into the quals
of an outer join, we must make sure that the postponed qual is included
in those seen by make_outerjoininfo(). Otherwise we might compute a
too-small min_lefthand or min_righthand for the outer join, leading to
"JOIN qualification cannot refer to other relations" failures from
distribute_qual_to_rels. Subtler errors in the created plan seem possible,
too, if the extra qual would only affect join ordering constraints.
Per bug #9041 from David Leverton. Back-patch to 9.3.
json_build_array() and json_build_object allow for the construction of
arbitrarily complex json trees. json_object() turns a one or two
dimensional array, or two separate arrays, into a json_object of
name/value pairs, similarly to the hstore() function.
json_object_agg() aggregates its two arguments into a single json object
as name value pairs.
Catalog version bumped.
Andrew Dunstan, reviewed by Marko Tiikkaja.
Some cases were still reporting errors and aborting, instead of a NOTICE
that the object was being skipped. This makes it more difficult to
cleanly handle pg_dump --clean, so change that to instead skip missing
objects properly.
Per bug #7873 reported by Dave Rolsky; apparently this affects a large
number of users.
Authors: Pavel Stehule and Dean Rasheed. Some tweaks by Álvaro Herrera
Unlike our other array functions, this considers the total number of
elements across all dimensions, and returns 0 rather than NULL when the
array has no elements. But it seems that both of those behaviors are
almost universally disliked, so hopefully that's OK.
Marko Tiikkaja, reviewed by Dean Rasheed and Pavel Stehule
When there are consecutive spaces (or other non-format-code characters) in
the format, we should advance over exactly that many characters of input.
The previous coding mistakenly did a "skip whitespace" action between such
characters, possibly allowing more input to be skipped than the user
intended. We only need to skip whitespace just before an actual field.
This is really a bug fix, but given the minimal number of field complaints
and the risk of breaking applications coded to expect the old behavior,
let's not back-patch it.
Jeevan Chalke
Tablespaces have a few options which can be set on them to give PG hints
as to how the tablespace behaves (perhaps it's faster for sequential
scans, or better able to handle random access, etc). These options were
only available through the ALTER TABLESPACE command.
This adds the ability to set these options at CREATE TABLESPACE time,
removing the need to do both a CREATE TABLESPACE and ALTER TABLESPACE to
get the correct options set on the tablespace.
Vik Fearing, reviewed by Michael Paquier.
This adds a 'MOVE' sub-command to ALTER TABLESPACE which allows moving sets of
objects from one tablespace to another. This can be extremely handy and avoids
a lot of error-prone scripting. ALTER TABLESPACE ... MOVE will only move
objects the user owns, will notify the user if no objects were found, and can
be used to move ALL objects or specific types of objects (TABLES, INDEXES, or
MATERIALIZED VIEWS).
On second thought, commit 0c051c9008 was
over-hasty: rather than allowing this case, we ought to reject it for now.
That leaves the field clear for a future feature that allows the target
table to be re-specified in the FROM (or USING) clause, which will enable
left-joining the target table to something else. We can then also allow
LATERAL references to such an explicitly re-specified target table.
But allowing them right now will create ambiguities or worse for such a
feature, and it isn't something we documented 9.3 as supporting.
While at it, add a convenience subroutine to avoid having several copies
of the ereport for disalllowed-LATERAL-reference cases.
Add a query that lists all the functions that are operator implementation
functions and have a SQL comment that doesn't just say "implementation of
XYZ operator". (Note that the preceding test checks that such functions'
comments exactly match the corresponding operators' comments.)
While it's not forbidden to add more functions to this list, that should
only be done when we're encouraging users to use either the function or
operator syntax for the functionality, which is a fairly rare situation.
In commit c1352052ef, I implemented an
optimization that assumed that a function's argument expressions would
either always return a set (ie multiple rows), or always not. This is
wrong however: we allow CASE expressions in which some arms return a set
of some type and others just return a scalar of that type. There may be
other examples as well. To fix, replace the run-time test of whether an
argument returned a set with a static precheck (expression_returns_set).
This adds a little bit of query startup overhead, but it seems barely
measurable.
Per bug #8228 from David Johnston. This has been broken since 8.0,
so patch all supported branches.
I failed to think much about UPDATE/DELETE when implementing LATERAL :-(.
The implemented behavior ended up being that subqueries in the FROM or
USING clause (respectively) could access the update/delete target table as
though it were a lateral reference; which seems fine if they said LATERAL,
but certainly ought to draw an error if they didn't. Fix it so you get a
suitable error when you omit LATERAL. Per report from Emre Hasegeli.
It's possible to extract a restriction OR clause from a join clause that
has the form of an OR-of-ANDs, if each sub-AND includes a clause that
mentions only one specific relation. While PG has been aware of that idea
for many years, the code previously only did it if it could extract an
indexable OR clause. On reflection, though, that seems a silly limitation:
adding a restriction clause can be a win by reducing the number of rows
that have to be filtered at the join step, even if we have to test the
clause as a plain filter clause during the scan. This should be especially
useful for foreign tables, where the change can cut the number of rows that
have to be retrieved from the foreign server; but testing shows it can win
even on local tables. Per a suggestion from Robert Haas.
As a heuristic, I made the code accept an extracted restriction clause
if its estimated selectivity is less than 0.9, which will probably result
in accepting extracted clauses just about always. We might need to tweak
that later based on experience.
Since the code no longer has even a weak connection to Path creation,
remove orindxpath.c and create a new file optimizer/util/orclauses.c.
There's some additional janitorial cleanup of now-dead code that needs
to happen, but it seems like that's a fit subject for a separate commit.
This patch introduces generic support for ordered-set and hypothetical-set
aggregate functions, as well as implementations of the instances defined in
SQL:2008 (percentile_cont(), percentile_disc(), rank(), dense_rank(),
percent_rank(), cume_dist()). We also added mode() though it is not in the
spec, as well as versions of percentile_cont() and percentile_disc() that
can compute multiple percentile values in one pass over the data.
Unlike the original submission, this patch puts full control of the sorting
process in the hands of the aggregate's support functions. To allow the
support functions to find out how they're supposed to sort, a new API
function AggGetAggref() is added to nodeAgg.c. This allows retrieval of
the aggregate call's Aggref node, which may have other uses beyond the
immediate need. There is also support for ordered-set aggregates to
install cleanup callback functions, so that they can be sure that
infrastructure such as tuplesort objects gets cleaned up.
In passing, make some fixes in the recently-added support for variadic
aggregates, and make some editorial adjustments in the recent FILTER
additions for aggregates. Also, simplify use of IsBinaryCoercible() by
allowing it to succeed whenever the target type is ANY or ANYELEMENT.
It was inconsistent that it dealt with other polymorphic target types
but not these.
Atri Sharma and Andrew Gierth; reviewed by Pavel Stehule and Vik Fearing,
and rather heavily editorialized upon by Tom Lane
This ensures that all stdout output is flushed immediately, to match
stderr. This eliminates the need for fflush(stdout) calls sprinkled all
over the place.
Per Daniel Wood in message 519A79C6.90308@salesforce.com
If a tuple was locked by transaction A, and transaction B updated it,
the new version of the tuple created by B would be locked by A, yet
visible only to B; due to an oversight in HeapTupleSatisfiesUpdate, the
lock held by A wouldn't get checked if transaction B later deleted (or
key-updated) the new version of the tuple. This might cause referential
integrity checks to give false positives (that is, allow deletes that
should have been rejected).
This is an easy oversight to have made, because prior to improved tuple
locks in commit 0ac5ad5134 it wasn't possible to have tuples created by
our own transaction that were also locked by remote transactions, and so
locks weren't even considered in that code path.
It is recommended that foreign keys be rechecked manually in bulk after
installing this update, in case some referenced rows are missing with
some referencing row remaining.
Per bug reported by Daniel Wood in
CAPweHKe5QQ1747X2c0tA=5zf4YnS2xcvGf13Opd-1Mq24rF1cQ@mail.gmail.com
This fixes a problem noted as a followup to bug #8648: if a query has a
semantically-empty target list, e.g. SELECT * FROM zero_column_table,
ruleutils.c will dump it as a syntactically-empty target list, which was
not allowed. There doesn't seem to be any reliable way to fix this by
hacking ruleutils (note in particular that the originally zero-column table
might since have had columns added to it); and even if we had such a fix,
it would do nothing for existing dump files that might contain bad syntax.
The best bet seems to be to relax the syntactic restriction.
Also, add parse-analysis errors for SELECT DISTINCT with no columns (after
*-expansion) and RETURNING with no columns. These cases previously
produced unexpected behavior because the parsed Query looked like it had
no DISTINCT or RETURNING clause, respectively. If anyone ever offers
a plausible use-case for this, we could work a bit harder on making the
situation distinguishable.
Arguably this is a bug fix that should be back-patched, but I'm worried
that there may be client apps or PLs that expect "SELECT ;" to throw a
syntax error. The issue doesn't seem important enough to risk changing
behavior in minor releases.
Fix an oversight in commit b3aaf9081a: we do
indeed need to process the planner's append_rel_list when copying RTE
subqueries, because if any of them were flattenable UNION ALL subqueries,
the append_rel_list shows which subquery RTEs were pulled up out of which
other ones. Without this, UNION ALL subqueries aren't correctly inserted
into the update plans for inheritance child tables after the first one,
typically resulting in no update happening for those child table(s).
Per report from Victor Yegorov.
Experimentation with this case also exposed a fault in commit
a7b965382c: if an inherited UPDATE/DELETE
was proven totally dummy by constraint exclusion, we might arrive at
add_rtes_to_flat_rtable with root->simple_rel_array being NULL. This
should be interpreted as not having any RelOptInfos. I chose to code
the guard as a check against simple_rel_array_size, so as to also
provide some protection against indexing off the end of the array.
Back-patch to 9.2 where the faulty code was added.
Make the COPY test, which loads most of the large static tables used in
the tests, also explicitly ANALYZE those tables. This allows us to get
rid of various ad-hoc, and rather redundant, ANALYZE commands that had
gotten stuck into various test scripts over time to ensure we got
consistent plan choices. (We could have done a database-wide ANALYZE,
but that would cause stats to get attached to the small static tables
too, which results in plan changes compared to the historical behavior.
I'm not sure that's a good idea, so not going that far for now.)
Back-patch to 9.0, since 9.0 and 9.1 are currently sometimes failing
regression tests for lack of an "ANALYZE tenk1" in the subselect test.
There's no need for this in 8.4 since we didn't print any plans back
then.
The test only needs the one table to be vacuumed. Vacuuming the
database may affect other tests.
Per gripe from Tom Lane. Back-patch to 9.3, where the test was
was added.
An expression such as WHERE (... x IN (SELECT ...) ...) IN (SELECT ...)
could produce an invalid plan that results in a crash at execution time,
if the planner attempts to flatten the outer IN into a semi-join.
This happens because convert_testexpr() was not expecting any nested
SubLinks and would wrongly replace any PARAM_SUBLINK Params belonging
to the inner SubLink. (I think the comment denying that this case could
happen was wrong when written; it's certainly been wrong for quite a long
time, since very early versions of the semijoin flattening logic.)
Per report from Teodor Sigaev. Back-patch to all supported branches.
SQL-standard TABLE() is a subset of UNNEST(); they deal with arrays and
other collection types. This feature, however, deals with set-returning
functions. Use a different syntax for this feature to keep open the
possibility of implementing the standard TABLE().
In 247c76a989, I added some code to do fine-grained checking of
MultiXact status of locking/updating transactions when traversing an
update chain. There was a thinko in that patch which would have the
traversing abort, that is return HeapTupleUpdated, when the other
transaction is a committed lock-only. In this case we should ignore it
and return success instead. Of course, in the case where there is a
committed update, HeapTupleUpdated is the correct return value.
A user-visible symptom of this bug is that in REPEATABLE READ and
SERIALIZABLE transaction isolation modes spurious serializability errors
can occur:
ERROR: could not serialize access due to concurrent update
In order for this to happen, there needs to be a tuple that's key-share-
locked and also updated, and the update must abort; a subsequent
transaction trying to acquire a new lock on that tuple would abort with
the above error. The reason is that the initial FOR KEY SHARE is seen
as committed by the new locking transaction, which triggers this bug.
(If the UPDATE commits, then the serialization error is correctly
reported.)
When running a query in READ COMMITTED mode, what happens is that the
locking is aborted by the HeapTupleUpdated return value, then
EvalPlanQual fetches the newest version of the tuple, which is then the
only version that gets locked. (The second time the tuple is checked
there is no misbehavior on the committed lock-only, because it's not
checked by the code that traverses update chains; so no bug.) Only the
newest version of the tuple is locked, not older ones, but this is
harmless.
The isolation test added by this commit illustrates the desired
behavior, including the proper serialization errors that get thrown.
Backpatch to 9.3.
HeapTupleSatisfiesUpdate can very easily "forget" tuple locks while
checking the contents of a multixact and finding it contains an aborted
update, by setting the HEAP_XMAX_INVALID bit. This would lead to
concurrent transactions not noticing any previous locks held by
transactions that might still be running, and thus being able to acquire
subsequent locks they wouldn't be normally able to acquire.
This bug was introduced in commit 1ce150b7bb; backpatch this fix to 9.3,
like that commit.
This change reverts the change to the delete-abort-savept isolation test
in 1ce150b7bb, because that behavior change was caused by this bug.
Noticed by Andres Freund while investigating a different issue reported
by Noah Misch.
It is dangerous to do so, because some code expects to be able to see what's
the true Xmax even if it is aborted (particularly while traversing HOT
chains). So don't do it, and instead rely on the callers to verify for
abortedness, if necessary.
Several race conditions and bugs fixed in the process. One isolation test
changes the expected output due to these.
This also reverts commit c235a6a589, which is no longer necessary.
Backpatch to 9.3, where this function was introduced.
Andres Freund
Although user-defined relations can't be directly created in
pg_catalog, it's possible for them to end up there, because you can
create them in some other schema and then use ALTER TABLE .. SET SCHEMA
to move them there. Previously, such relations couldn't afterwards
be manipulated, because IsSystemRelation()/IsSystemClass() rejected
all attempts to modify objects in the pg_catalog schema, regardless
of their origin. With this patch, they now reject only those
objects in pg_catalog which were created at initdb-time, allowing
most operations on user-created tables in pg_catalog to proceed
normally.
This patch also adds new functions IsCatalogRelation() and
IsCatalogClass(), which is similar to IsSystemRelation() and
IsSystemClass() but with a slightly narrower definition: only TOAST
tables of system catalogs are included, rather than *all* TOAST tables.
This is currently used only for making decisions about when
invalidation messages need to be sent, but upcoming logical decoding
patches will find other uses for this information.
Andres Freund, with some modifications by me.
Reviewed-by: Ali Dar <ali.munir.dar@gmail.com>
Reviewed-by: Amit Khandekar <amit.khandekar@enterprisedb.com>
Reviewed-by: Rodolfo Campero <rodolfo.campero@anachronics.com>
Change SET LOCAL/CONSTRAINTS/TRANSACTION behavior outside of a
transaction block from error (post-9.3) to warning. (Was nothing in <=
9.3.) Also change ABORT outside of a transaction block from notice to
warning.
pullup_replace_vars()'s decisions about whether a pulled-up replacement
expression needs to be wrapped in a PlaceHolderVar depend on the assumption
that what looks like a Var behaves like a Var. However, if the Var is a
join alias reference, later flattening of join aliases might replace the
Var with something that's not a Var at all, and should have been wrapped.
To fix, do a forcible pass of flatten_join_alias_vars() on the subquery
targetlist before we start to copy items out of it. We'll re-run that
processing on the pulled-up expressions later, but that's harmless.
Per report from Ken Tanzer; the added regression test case is based on his
example. This bug has been there since the PlaceHolderVar mechanism was
invented, but has escaped detection because the circumstances that trigger
it are fairly narrow. You need a flattenable query underneath an outer
join, which contains another flattenable query inside a join of its own,
with a dangerous expression (a constant or something else non-strict)
in that one's targetlist.
Having seen this, I'm wondering if it wouldn't be prudent to do all
alias-variable flattening earlier, perhaps even in the rewriter.
But that would probably not be a back-patchable change.
This patch adds the ability to write TABLE( function1(), function2(), ...)
as a single FROM-clause entry. The result is the concatenation of the
first row from each function, followed by the second row from each
function, etc; with NULLs inserted if any function produces fewer rows than
others. This is believed to be a much more useful behavior than what
Postgres currently does with multiple SRFs in a SELECT list.
This syntax also provides a reasonable way to combine use of column
definition lists with WITH ORDINALITY: put the column definition list
inside TABLE(), where it's clear that it doesn't control the ordinality
column as well.
Also implement SQL-compliant multiple-argument UNNEST(), by turning
UNNEST(a,b,c) into TABLE(unnest(a), unnest(b), unnest(c)).
The SQL standard specifies TABLE() with only a single function, not
multiple functions, and it seems to require an implicit UNNEST() which is
not what this patch does. There may be something wrong with that reading
of the spec, though, because if it's right then the spec's TABLE() is just
a pointless alternative spelling of UNNEST(). After further review of
that, we might choose to adopt a different syntax for what this patch does,
but in any case this functionality seems clearly worthwhile.
Andrew Gierth, reviewed by Zoltán Böszörményi and Heikki Linnakangas, and
significantly revised by me
This patch improves performance of most built-in aggregates that formerly
used a NUMERIC or NUMERIC array as their transition type; this includes
not only aggregates on numeric inputs, but some aggregates on integer
inputs where overflow of an int8 value is a possibility. The code now
uses a special-purpose data structure to avoid array construction and
deconstruction overhead, as well as packing and unpacking overhead for
numeric values.
These aggregates' transition type is now declared as INTERNAL, since
it doesn't correspond to any SQL data type. To keep the planner from
thinking that that means a lot of storage will be used, we make use
of the just-added pg_aggregate.aggtransspace feature. The space estimate
is set to 128 bytes, which is at least in the right ballpark.
Hadi Moshayedi, reviewed by Pavel Stehule and Tomas Vondra
Formerly the planner had a hard-wired rule of thumb for guessing the amount
of space consumed by an aggregate function's transition state data. This
estimate is critical to deciding whether it's OK to use hash aggregation,
and in many situations the built-in estimate isn't very good. This patch
adds a column to pg_aggregate wherein a per-aggregate estimate can be
provided, overriding the planner's default, and infrastructure for setting
the column via CREATE AGGREGATE.
It may be that additional smarts will be required in future, perhaps even
a per-aggregate estimation function. But this is already a step forward.
This is extracted from a larger patch to improve the performance of numeric
and int8 aggregates. I (tgl) thought it was worth reviewing and committing
this infrastructure separately. In this commit, all built-in aggregates
are given aggtransspace = 0, so no behavior should change.
Hadi Moshayedi, reviewed by Pavel Stehule and Tomas Vondra
Bug #8591 from Claudio Freire demonstrates that get_eclass_for_sort_expr
must be able to compute valid em_nullable_relids for any new equivalence
class members it creates. I'd worried about this in the commit message
for db9f0e1d9a, but claimed that it wasn't a
problem because multi-member ECs should already exist when it runs. That
is transparently wrong, though, because this function is also called by
initialize_mergeclause_eclasses, which runs during deconstruct_jointree.
The example given in the bug report (which the new regression test item
is based upon) fails because the COALESCE() expression is first seen by
initialize_mergeclause_eclasses rather than process_equivalence.
Fixing this requires passing the appropriate nullable_relids set to
get_eclass_for_sort_expr, and it requires new code to compute that set
for top-level expressions such as ORDER BY, GROUP BY, etc. We store
the top-level nullable_relids in a new field in PlannerInfo to avoid
computing it many times. In the back branches, I've added the new
field at the end of the struct to minimize ABI breakage for planner
plugins. There doesn't seem to be a good alternative to changing
get_eclass_for_sort_expr's API signature, though. There probably aren't
any third-party extensions calling that function directly; moreover,
if there are, they probably need to think about what to pass for
nullable_relids anyway.
Back-patch to 9.2, like the previous patch in this area.
plpgsql likes to cache query plans and simple-expression execution state
trees across calls. This is a considerable win for multiple executions
of the same function. However, it's useless for DO blocks, since by
definition those are executed only once and discarded. Nonetheless,
we were allowing a DO block's expression execution trees to survive
until end of transaction, resulting in a significant intra-transaction
memory leak, as reported by Yeb Havinga. Worse, if the DO block exited
with an error, the compiled form of the block's code was leaked till
end of session --- along with subsidiary plancache entries.
To fix, make DO blocks keep their expression execution trees in a private
EState that's deleted at exit from the block, and add a PG_TRY block
to plpgsql_inline_handler to make sure that memory cleanup happens
even on error exits. Also add a regression test covering error handling
in a DO block, because my first try at this broke that. (The test is
not meant to prove that we don't leak memory anymore, though it could
be used for that with a much larger loop count.)
Ideally we'd back-patch this into all versions supporting DO blocks;
but the patch needs to add a field to struct PLpgSQL_execstate, and that
would break ABI compatibility for third-party plugins such as the plpgsql
debugger. Given the small number of complaints so far, fixing this in
HEAD only seems like an acceptable choice.
Commit 061b88c732 saved argv0 to a
global buffer without ensuring that it was zero terminated,
allowing references to it to overrun the buffer and access other
memory. This probably would not have presented any security risk,
but could have resulted in very confusing failures if the path to
the executable was very long.
Reported by David Rowley
It's a trivial amount of RAM held until the end of the regression
test run; but it's probably worth fixing to silence future warnings
from code analyzers.
This was the only memory leak pointed out by clang's static code
analysis tool.
We can't search for the isolationtester binary until after we've set
up the environment, because otherwise when find_other_exec() tries
to invoke it with the -V option, it might fail for inability to
locate a working libpq. So postpone that step.
Andres Freund
The pretty-printing logic in ruleutils.c operates by inserting a newline
and some indentation whitespace into strings that are already valid SQL.
This naturally results in leaving some trailing whitespace before the
newline in many cases; which can be annoying when processing the output
with other tools, as complained of by Joe Abbate. We can fix that in
a pretty localized fashion by deleting any trailing whitespace before
we append a pretty-printing newline. In addition, we have to modify the
code inserted by commit 2f582f76b1 so that
we also delete trailing whitespace when transposing items from temporary
buffers into the main result string, when a temporary item starts with a
newline.
This results in rather voluminous changes to the regression test results,
but it's easily verified that they are only removal of trailing whitespace.
Back-patch to 9.3, because the aforementioned commit resulted in many
more cases of trailing whitespace than had occurred in earlier branches.
Although the SQL spec forbids duplicate table aliases, historically
we've allowed queries like
SELECT ... FROM tab1 x CROSS JOIN (tab2 x CROSS JOIN tab3 y) z
on the grounds that the aliased join (z) hides the aliases within it,
therefore there is no conflict between the two RTEs named "x". The
LATERAL patch broke this, on the misguided basis that "x" could be
ambiguous if tab3 were a LATERAL subquery. To avoid breaking existing
queries, it's better to allow this situation and complain only if
tab3 actually does contain an ambiguous reference. We need only remove
the check that was throwing an error, because the column lookup code
is already prepared to handle ambiguous references. Per bug #8444.
Pending patches for logical replication will use this to determine
which columns of a tuple ought to be considered as its candidate key.
Andres Freund, with minor, mostly cosmetic adjustments by me
This change prevents us from doing inappropriate subquery flattening in
cases such as dangerous functions hidden inside a sub-SELECT in the
targetlist of another sub-SELECT. That could result in unexpected behavior
due to multiple evaluations of a volatile function, as in a recent
complaint from Etienne Dube. It's been questionable from the very
beginning whether these functions should look into subqueries (as noted in
their comments), and this case seems to provide proof that they should.
Because the new code only descends into SubLinks, not SubPlans or
InitPlans, the change only affects the planner's behavior during
prepjointree processing and not later on --- for example, you can still get
it to use a volatile function in an indexqual if you wrap the function in
(SELECT ...). That's a historical behavior, for sure, but it's reasonable
given that the executor's evaluation rules for subplans don't depend on
whether there are volatile functions inside them. In any case, we need to
constrain the behavioral change as narrowly as we can to make this
reasonable to back-patch.
ExecBuildSlotValueDescription() printed "null" for each dropped column in
a row being complained of by ExecConstraints(). This has some sanity in
terms of the underlying implementation, but is of course pretty surprising
to users. To fix, we must pass the target relation's descriptor to
ExecBuildSlotValueDescription(), because the slot descriptor it had been
using doesn't get labeled with attisdropped markers.
Per bug #8408 from Maxim Boguk. Back-patch to 9.2 where the feature of
printing row values in NOT NULL and CHECK constraint violation messages
was introduced.
Michael Paquier and Tom Lane
Before jamming a desired targetlist into a plan node, one really ought to
make sure the plan node can handle projections, and insert a buffering
Result plan node if not. planagg.c forgot to do this, which is a hangover
from the days when it only dealt with IndexScan plan types. MergeAppend
doesn't project though, not to mention that it gets unhappy if you remove
its possibly-resjunk sort columns. The code accidentally failed to fail
for cases in which the min/max argument was a simple Var, because the new
targetlist would be equivalent to the original "flat" tlist anyway.
For any more complex case, it's been broken since 9.1 where we introduced
the ability to optimize min/max using MergeAppend, as reported by Raphael
Bauduin. Fix by duplicating the logic from grouping_planner that decides
whether we need a Result node.
In 9.2 and 9.1, this requires back-porting the tlist_same_exprs() function
introduced in commit 4387cf956b, else we'd
uselessly add a Result node in cases that worked before. It's rather
tempting to back-patch that whole commit so that we can avoid extra Result
nodes in mainline cases too; but I'll refrain, since that code hasn't
really seen all that much field testing yet.
These things didn't work because the planner omitted to do the necessary
preprocessing of a WindowFunc's argument list. Add the few dozen lines
of code needed to handle that.
Although this sounds like a feature addition, it's really a bug fix because
the default-argument case was likely to crash previously, due to lack of
checking of the number of supplied arguments in the built-in window
functions. It's not a security issue because there's no way for a
non-superuser to create a window function definition with defaults that
refers to a built-in C function, but nonetheless people might be annoyed
that it crashes rather than producing a useful error message. So
back-patch as far as the patch applies easily, which turns out to be 9.2.
I'll put a band-aid in earlier versions as a separate patch.
(Note that these features still don't work for aggregates, and fixing that
case will be harder since we represent aggregate arg lists as target lists
not bare expression lists. There's no crash risk though because CREATE
AGGREGATE doesn't accept defaults, and we reject named-argument notation
when parsing an aggregate call.)
A subquery reference to a matview should be allowed by CREATE
MATERIALIZED VIEW WITH NO DATA, just like a direct reference is.
Per bug report from Laurent Sartran.
Backpatch to 9.3.
These variables no longer have any useful purpose, since there's no reason
to special-case brute force timezones now that we have a valid
session_timezone setting for them. Remove the variables, and remove the
SET/SHOW TIME ZONE code that deals with them.
The user-visible impact of this is that SHOW TIME ZONE will now show a
POSIX-style zone specification, in the form "<+-offset>-+offset", rather
than an interval value when a brute-force zone has been set. While perhaps
less intuitive, this is a better definition than before because it's
actually possible to give that string back to SET TIME ZONE and get the
same behavior, unlike what used to happen.
We did not previously mention the angle-bracket syntax when describing
POSIX timezone specifications; add some documentation so that people
can figure out what these strings do. (There's still quite a lot of
undocumented functionality there, but anybody who really cares can
go read the POSIX spec to find out about it. In practice most people
seem to prefer Olsen-style city names anyway.)
The only remaining places where we actually look at CTimeZone/HasCTZSet
are abstime2tm() and timestamp2tm(). Now that session_timezone is always
valid, we can remove these special cases. The caller-visible impact of
this is that these functions now always return a valid zone abbreviation
if requested, whereas before they'd return a NULL pointer if a brute-force
timezone was in use. In the existing code, the only place I can find that
changes behavior is to_char(), whose TZ format code will now print
something useful rather than nothing for such zones. (In the places where
the returned zone abbreviation is passed to EncodeDateTime, the lack of
visible change is because we've chosen the abbreviation used for these
zones to match what EncodeTimezone would have printed.)
It's likely that there is now a fair amount of removable dead code around
the call sites, namely anything that's meant to cope with getting a NULL
timezone abbreviation, but I've not made an effort to root that out.
This could be back-patched if we decide we'd like to fix to_char()'s
behavior in the back branches, but there doesn't seem to be much
enthusiasm for that at present.
Formerly, when using a SQL-spec timezone setting with a fixed GMT offset
(called a "brute force" timezone in the code), the session_timezone
variable was not updated to match the nominal timezone; rather, all code
was expected to ignore session_timezone if HasCTZSet was true. This is
of course obviously fragile, though a search of the code finds only
timeofday() failing to honor the rule. A bigger problem was that
DetermineTimeZoneOffset() supposed that if its pg_tz parameter was
pointer-equal to session_timezone, then HasCTZSet should override the
parameter. This would cause datetime input containing an explicit zone
name to be treated as referencing the brute-force zone instead, if the
zone name happened to match the session timezone that had prevailed
before installing the brute-force zone setting (as reported in bug #8572).
The same malady could affect AT TIME ZONE operators.
To fix, set up session_timezone so that it matches the brute-force zone
specification, which we can do using the POSIX timezone definition syntax
"<abbrev>offset", and get rid of the bogus lookaside check in
DetermineTimeZoneOffset(). Aside from fixing the erroneous behavior in
datetime parsing and AT TIME ZONE, this will cause the timeofday() function
to print its result in the user-requested time zone rather than some
previously-set zone. It might also affect results in third-party
extensions, if there are any that make use of session_timezone without
considering HasCTZSet, but in all cases the new behavior should be saner
than before.
Back-patch to all supported branches.
With these, one need no longer manipulate large object descriptors and
extract numeric constants from header files in order to read and write
large object contents from SQL.
Pavel Stehule, reviewed by Rushabh Lathia.
The rules regression test prints all known views and rules, which is a set
that changes regularly. Previously, a change in one rule would frequently
lead to whitespace changes across the entire output of this query, which is
painful to verify and causes undesirable conflicts between unrelated patch
sets. Use \a mode to improve matters. Also use \t mode to suppress the
total-rows count, which was also a source of unnecessary patch conflicts.
Likewise modify the output mode for the list of indexed tables generated
in sanity_check.sql. There might be other places where we should use this
idea, but these are the ones that have caused the most problems.
Andres Freund
This reverts commit a0a546f0d9.
It seems better to tweak the code to suppress -0 results during
line_construct_pts(), which I'll do in the next commit.
Previously, unless all columns were auto-updateable, we wouldn't
inserts, updates, or deletes, or at least not without a rule or trigger;
now, we'll allow inserts and updates that target only the auto-updateable
columns, and deletes even if there are no auto-updateable columns at
all provided the view definition is otherwise suitable.
Dean Rasheed, reviewed by Marko Tiikkaja
Add asprintf(), pg_asprintf(), and psprintf() to simplify string
allocation and composition. Replacement implementations taken from
NetBSD.
Reviewed-by: Álvaro Herrera <alvherre@2ndquadrant.com>
Reviewed-by: Asif Naeem <anaeem.it@gmail.com>
Change the input/output format to {A,B,C}, to match the internal
representation.
Complete the implementations of line_in, line_out, line_recv, line_send.
Remove comments and error messages about the line type not being
implemented. Add regression tests for existing line operators and
functions.
Reviewed-by: rui hua <365507506hua@gmail.com>
Reviewed-by: Álvaro Herrera <alvherre@2ndquadrant.com>
Reviewed-by: Jeevan Chalke <jeevan.chalke@enterprisedb.com>
REFRESH MATERIALIZED VIEW CONCURRENTLY was broken for any matview
containing a column of a type without a default btree operator
class. It also did not produce results consistent with a non-
concurrent REFRESH or a normal view if any column was of a type
which allowed user-visible differences between values which
compared as equal according to the type's default btree opclass.
Concurrent matview refresh was modified to use the new operators
to solve these problems.
Documentation was added for record comparison, both for the
default btree operator class for record, and the newly added
operators. Regression tests now check for proper behavior both
for a matview with a box column and a matview containing a citext
column.
Reviewed by Steve Singer, who suggested some of the doc language.
This option provides more detailed error messages when STRICT is used
and the number of rows returned is not one.
Marko Tiikkaja, reviewed by Ian Lawrence Barwick
Previously, isolationtester would forbid returning tuples in
session-specific teardown (but not global teardown), as well as in
global setup. Allow these places to return tuples, too.
DISCARD ALL will now discard cached sequence information, as well.
Fabrízio de Royes Mello, reviewed by Zoltán Böszörményi, with some
further tweaks by me.
Though @libdir@ almost always matches @abs_builddir@ in this context,
the test could only fail if they differed. Back-patch to 9.1, where the
test was introduced.
Hamid Quddus Akhtar
Previously, arbitray system columns could be mentioned in table
constraints, but they were not correctly checked at runtime, because
the values weren't actually set correctly in the tuple. Since it
seems easy enough to initialize the table OID properly, do that,
and continue allowing that column, but disallow the rest unless and
until someone figures out a way to make them work properly.
No back-patch, because this doesn't seem important enough to take the
risk of destabilizing the back branches. In fact, this will pose a
dump-and-reload hazard for those upgrading from previous versions:
constraints that were accepted before but were not correctly enforced
will now either be enforced correctly or not accepted at all. Either
could result in restore failures, but in practice I think very few
users will notice the difference, since the use case is pretty
marginal anyway and few users will be relying on features that have
not historically worked.
Amit Kapila, reviewed by Rushabh Lathia, with doc changes by me.
The previous coding attempted to activate all the GUC settings specified
in SET clauses, so that the function validator could operate in the GUC
environment expected by the function body. However, this is problematic
when restoring a dump, since the SET clauses might refer to database
objects that don't exist yet. We already have the parameter
check_function_bodies that's meant to prevent forward references in
function definitions from breaking dumps, so let's change CREATE FUNCTION
to not install the SET values if check_function_bodies is off.
Authors of function validators were already advised not to make any
"context sensitive" checks when check_function_bodies is off, if indeed
they're checking anything at all in that mode. But extend the
documentation to point out the GUC issue in particular.
(Note that we still check the SET clauses to some extent; the behavior
with !check_function_bodies is now approximately equivalent to what ALTER
DATABASE/ROLE have been doing for awhile with context-dependent GUCs.)
This problem can be demonstrated in all active branches, so back-patch
all the way.
There's no inherent reason why an aggregate function can't be variadic
(even VARIADIC ANY) if its transition function can handle the case.
Indeed, this patch to add the feature touches none of the planner or
executor, and little of the parser; the main missing stuff was DDL and
pg_dump support.
It is true that variadic aggregates can create the same sort of ambiguity
about parameters versus ORDER BY keys that was complained of when we
(briefly) had both one- and two-argument forms of string_agg(). However,
the policy formed in response to that discussion only said that we'd not
create any built-in aggregates with varying numbers of arguments, not that
we shouldn't allow users to do it. So the logical extension of that is
we can allow users to make variadic aggregates as long as we're wary about
shipping any such in core.
In passing, this patch allows aggregate function arguments to be named, to
the extent of remembering the names in pg_proc and dumping them in pg_dump.
You can't yet call an aggregate using named-parameter notation. That seems
like a likely future extension, but it'll take some work, and it's not what
this patch is really about. Likewise, there's still some work needed to
make window functions handle VARIADIC fully, but I left that for another
day.
initdb forced because of new aggvariadic field in Aggref parse nodes.
It's possible that inlining of SQL functions (or perhaps other changes?)
has exposed typmod information not known at parse time. In such cases,
Vars generated by query_planner might have valid typmod values while the
original grouping columns only have typmod -1. This isn't a semantic
problem since the behavior of grouping only depends on type not typmod,
but it breaks locate_grouping_columns' use of tlist_member to locate the
matching entry in query_planner's result tlist.
We can fix this without an excessive amount of new code or complexity by
relying on the fact that locate_grouping_columns only gets called when
make_subplanTargetList has set need_tlist_eval == false, and that can only
happen if all the grouping columns are simple Vars. Therefore we only need
to search the sub_tlist for a matching Var, and we can reasonably define a
"match" as being a match of the Var identity fields
varno/varattno/varlevelsup. The code still Asserts that vartype matches,
but ignores vartypmod.
Per bug #8393 from Evan Martin. The added regression test case is
basically the same as his example. This has been broken for a very long
time, so back-patch to all supported branches.
In an example such as
SELECT * FROM
i LEFT JOIN LATERAL (SELECT * FROM j WHERE i.n = j.n) j ON true;
it is safe to pull up the LATERAL subquery into its parent, but we must
then treat the "i.n = j.n" clause as a qual clause of the LEFT JOIN. The
previous coding in deconstruct_recurse mistakenly labeled the clause as
"is_pushed_down", resulting in wrong semantics if the clause were applied
at the join node, as per an example submitted awhile ago by Jeremy Evans.
To fix, postpone processing of such clauses until we return back up to
the appropriate recursion depth in deconstruct_recurse.
In addition, tighten the is-safe-to-pull-up checks in is_simple_subquery;
we previously missed the possibility that the LATERAL subquery might itself
contain an outer join that makes lateral references in lower quals unsafe.
A regression test case equivalent to Jeremy's example was already in my
commit of yesterday, but was giving the wrong results because of this
bug. This patch fixes the expected output for that, and also adds a
test case for the second problem.
The planner largely failed to consider the possibility that a
PlaceHolderVar's expression might contain a lateral reference to a Var
coming from somewhere outside the PHV's syntactic scope. We had a previous
report of a problem in this area, which I tried to fix in a quick-hack way
in commit 4da6439bd8, but Antonin Houska
pointed out that there were still some problems, and investigation turned
up other issues. This patch largely reverts that commit in favor of a more
thoroughly thought-through solution. The new theory is that a PHV's
ph_eval_at level cannot be higher than its original syntactic level. If it
contains lateral references, those don't change the ph_eval_at level, but
rather they create a lateral-reference requirement for the ph_eval_at join
relation. The code in joinpath.c needs to handle that.
Another issue is that createplan.c wasn't handling nested PlaceHolderVars
properly.
In passing, push knowledge of lateral-reference checks for join clauses
into join_clause_is_movable_to. This is mainly so that FDWs don't need
to deal with it.
This patch doesn't fix the original join-qual-placement problem reported by
Jeremy Evans (and indeed, one of the new regression test cases shows the
wrong answer because of that). But the PlaceHolderVar problems need to be
fixed before that issue can be addressed, so committing this separately
seems reasonable.
The planner logic that attempted to make a preliminary estimate of the
ph_needed levels for PlaceHolderVars seems to be completely broken by
lateral references. Fortunately, the potential join order optimization
that this code supported seems to be of relatively little value in
practice; so let's just get rid of it rather than trying to fix it.
Getting rid of this allows fairly substantial simplifications in
placeholder.c, too, so planning in such cases should be a bit faster.
Issue noted while pursuing bugs reported by Jeremy Evans and Antonin
Houska, though this doesn't in itself fix either of their reported cases.
What this does do is prevent an Assert crash in the kind of query
illustrated by the added regression test. (I'm not sure that the plan for
that query is stable enough across platforms to be usable as a regression
test output ... but we'll soon find out from the buildfarm.)
Back-patch to 9.3. The problem case can't arise without LATERAL, so
no need to touch older branches.
My tweak of these error messages in commit c359a1b082 contained the
thinko that a query would always have rowMarks set for a query
containing a locking clause. Not so: when declaring a cursor, for
instance, rowMarks isn't set at the point we're checking, so we'd be
dereferencing a NULL pointer.
The fix is to pass the lock strength to the function raising the error,
instead of trying to reverse-engineer it. The result not only is more
robust, but it also seems cleaner overall.
Per report from Robert Haas.
As pointed out by Tom Lane, we can allow other users of the error
handler callbacks to provide their own memory context by adding
the context to use to ErrorData and using that instead of explicitly
using ErrorContext.
This then allows GetErrorContextStack() to be called from inside
exception handlers, so modify plpgsql to take advantage of that and
add an associated regression test for it.
We'd find the same match twice if it was of zero length and not immediately
adjacent to the previous match. replace_text_regexp() got similar cases
right, so adjust this search logic to match that. Note that even though
the regexp_split_to_xxx() functions share this code, they did not display
equivalent misbehavior, because the second match would be considered
degenerate and ignored.
Jeevan Chalke, with some cosmetic changes by me.
Refactoring as part of commit 8ceb245680
had the unintended effect of making REINDEX TABLE and REINDEX DATABASE
no longer validate constraints enforced by the indexes in question;
REINDEX INDEX still did so. Indexes marked invalid remained so, and
constraint violations arising from data corruption went undetected.
Back-patch to 9.0, like the causative commit.
As GetErrorContextStack() borrowed setup and tear-down code from other
places, it was less than clear that it must only be called as a
top-level entry point into the error system and can't be called by an
exception handler (unlike the rest of the error system, which is set up
to be reentrant-safe).
Being called from an exception handler is outside the charter of
GetErrorContextStack(), so add a bit more protection against it,
improve the comments addressing why we have to set up an errordata
stack for this function at all, and add a few more regression tests.
Lack of clarity pointed out by Tom Lane; all bugs are mine.
This adds the ability to get the call stack as a string from within a
PL/PgSQL function, which can be handy for logging to a table, or to
include in a useful message to an end-user.
Pavel Stehule, reviewed by Rushabh Lathia and rather heavily whacked
around by Stephen Frost.
After further thought about implicit coercions appearing in a joinaliasvars
list, I realized that they represent an additional reason why we might need
to reference the join output column directly instead of referencing an
underlying column. Consider SELECT x FROM t1 LEFT JOIN t2 USING (x) where
t1.x is of type date while t2.x is of type timestamptz. The merged output
variable is of type timestamptz, but it won't go to null when t2 does,
therefore neither t1.x nor t2.x is a valid substitute reference.
The code in get_variable() actually gets this case right, since it knows
it shouldn't look through a coercion, but we failed to ensure that the
unqualified output column name would be globally unique. To fix, modify
the code that trawls for a dangerous situation so that it actually scans
through an unnamed join's joinaliasvars list to see if there are any
non-simple-Var entries.
It's possible to drop a column from an input table of a JOIN clause in a
view, if that column is nowhere actually referenced in the view. But it
will still be there in the JOIN clause's joinaliasvars list. We used to
replace such entries with NULL Const nodes, which is handy for generation
of RowExpr expansion of a whole-row reference to the view. The trouble
with that is that it can't be distinguished from the situation after
subquery pull-up of a constant subquery output expression below the JOIN.
Instead, replace such joinaliasvars with null pointers (empty expression
trees), which can't be confused with pulled-up expressions. expandRTE()
still emits the old convention, though, for convenience of RowExpr
generation and to reduce the risk of breaking extension code.
In HEAD and 9.3, this patch also fixes a problem with some new code in
ruleutils.c that was failing to cope with implicitly-casted joinaliasvars
entries, as per recent report from Feike Steenbergen. That oversight was
because of an inadequate description of the data structure in parsenodes.h,
which I've now corrected. There were some pre-existing oversights of the
same ilk elsewhere, which I believe are now all fixed.
Future patches are expected to introduce logical replication that
works by decoding WAL. WAL contains relfilenodes rather than relation
OIDs, so this infrastructure will be needed to find the relation OID
based on WAL contents.
If logical replication does not make it into this release, we probably
should consider reverting this, since it will add some overhead to DDL
operations that create new relations. One additional index insert per
pg_class row is not a large overhead, but it's more than zero.
Another way of meeting the needs of logical replication would be to
the relation OID to WAL, but that would burden DML operations, not
only DDL.
Andres Freund, with some changes by me. Design review, in earlier
versions, by Álvaro Herrera.
An ancient logic error in cfindloop() could cause the regex engine to fail
to find matches that begin later than the start of the string. This
function is only used when the regex pattern contains a back reference,
and so far as we can tell the error is only reachable if the pattern is
non-greedy (i.e. its first quantifier uses the ? modifier). Furthermore,
the actual match must begin after some potential match that satisfies the
DFA but then fails the back-reference's match test.
Reported and fixed by Jeevan Chalke, with cosmetic adjustments by me.
For simple views which are automatically updatable, this patch allows
the user to specify what level of checking should be done on records
being inserted or updated. For 'LOCAL CHECK', new tuples are validated
against the conditionals of the view they are being inserted into, while
for 'CASCADED CHECK' the new tuples are validated against the
conditionals for all views involved (from the top down).
This option is part of the SQL specification.
Dean Rasheed, reviewed by Pavel Stehule
This is more efficient and simpler . It does mean that an untyped NULL
can no longer be used in such cases, which should be mentioned in
Release Notes, but doesn't seem a terrible loss. The workaround is to
cast the NULL to some array type.
Pavel Stehule, reviewed by Jeevan Chalke.
This is SQL-standard with a few extensions, namely support for
subqueries and outer references in clause expressions.
catversion bump due to change in Aggref and WindowFunc.
David Fetter, reviewed by Dean Rasheed.
This allows reads to continue without any blocking while a REFRESH
runs. The new data appears atomically as part of transaction
commit.
Review questioned the Assert that a matview was not a system
relation. This will be addressed separately.
Reviewed by Hitoshi Harada, Robert Haas, Andres Freund.
Merged after review with security patch f3ab5d4.
The code in set_append_rel_pathlist() for building parameterized paths
for append relations (inheritance and UNION ALL combinations) supposed
that the cheapest regular path for a child relation would still be cheapest
when reparameterized. Which might not be the case, particularly if the
added join conditions are expensive to compute, as in a recent example from
Jeff Janes. Fix it to compare child path costs *after* reparameterizing.
We can short-circuit that if the cheapest pre-existing path is already
parameterized correctly, which seems likely to be true often enough to be
worth checking for.
Back-patch to 9.2 where parameterized paths were introduced.
Commit 31a891857a added some tests in
plpgsql.sql that used a function rather unthinkingly named "foo()".
However, rangefuncs.sql has some much older tests that create a function
of that name, and since these test scripts run in parallel, there is a
chance of failures if the timing is just right. Use another name to
avoid that. Per buildfarm (failure seen today on "hamerkop", but
probably it's happened before and not been noticed).
This value, now pg_stat_all_tables.n_mod_since_analyze, was already
tracked and used by autovacuum, but not exposed to the user.
Mark Kirkwood, review by Laurenz Albe
Treat TOAST index just the same as normal one and get the OID
of TOAST index from pg_index but not pg_class.reltoastidxid.
This change allows us to handle multiple TOAST indexes, and
which is required infrastructure for upcoming
REINDEX CONCURRENTLY feature.
Patch by Michael Paquier, reviewed by Andres Freund and me.
The collate.linux.utf8 test covers some of the same territory, but
isn't portable and so probably does not get run often, or on
non-Linux platforms. If this approach turns out to be sufficiently
portable, we may want to look at trimming the redundant tests out
of that file to avoid duplication.
Robins Tharakan, reviewed by Michael Paquier and Fabien Coelho,
with further changes and cleanup by me.
An INSERT into such a view should work just like an INSERT into its base
table, ie the insertion should go directly into that table ... not be
duplicated into each child table, as was happening before, per bug #8275
from Rushabh Lathia. On the other hand, the current behavior for
UPDATE/DELETE seems reasonable: the update/delete traverses the child
tables, or not, depending on whether the view specifies ONLY or not.
Add some regression tests covering this area.
Dean Rasheed
Specifically, permit attaching them to the error in RAISE and retrieving
them from a caught error in GET STACKED DIAGNOSTICS. RAISE enforces
nothing about the content of the fields; for its purposes, they are just
additional string fields. Consequently, clarify in the protocol and
libpq documentation that the usual relationships between error fields,
like a schema name appearing wherever a table name appears, are not
universal. This freedom has other applications; consider a FDW
propagating an error from an RDBMS having no schema support.
Back-patch to 9.3, where core support for the error fields was
introduced. This prevents the confusion of having a release where libpq
exposes the fields and PL/pgSQL does not.
Pavel Stehule, lexical revisions by Noah Misch.
To that end, support tags rather than lengths for external datums.
As an example of how this can be used, add support or "indirect"
tuples which point to some externally allocated memory containing
a toast tuple. Similar infrastructure could be used for other
purposes, including, perhaps, support for alternative compression
algorithms.
Andres Freund, reviewed by Hitoshi Harada and myself
The dependencies on the spi and dummy_seclabel contrib modules were
incomplete, because they did not pick up automatically generated
dependencies on header files. This will manifest itself especially when
switching major versions, where the contrib modules would not be
recompiled to contain the new version number, leading to regression test
failures.
To fix this, use the submake approach already in use elsewhere, so that
the contrib modules are built using their full rules.
This results in a slightly less specific error message when OVER
is used in a context where we don't accept window functions, but
per discussion, it's worth it to get the benefit of not needing
to reserve this keyword any more. This same refactoring will
also let us avoid reserving some other keywords that we expect
to add in upcoming patches (specifically, IGNORE, RESPECT, and
FILTER).
Troels Nielsen, with minor changes by me
In Danish collations, there are letter combinations which sort
higher than 'Z'. A test for values > 'WA' was picking up rows
where the value started with 'AA', causing the test to fail.
Backpatch to 9.2, where the failing test was added.
Per report from Svenne Krap and analysis by Jeff Janes
Per discussion on -hackers. We treat Unicode escapes when unescaping
them similarly to the way we treat them in PostgreSQL string literals.
Escapes in the ASCII range are always accepted, no matter what the
database encoding. Escapes for higher code points are only processed in
UTF8 databases, and attempts to process them in other databases will
result in an error. \u0000 is never unescaped, since it would result in
an impermissible null byte.
When the existing code here was written, it made sense to special-case
RowExprs because that was the only way that we could handle row comparisons
at all. Now that we have record_eq() and arrays of composites, the generic
logic for "scalar" types will in fact work on RowExprs too, so there's no
reason to throw error for combinations of RowExprs and other ways of
forming composite values, nor to ignore the possibility of using a
ScalarArrayOpExpr. But keep using the old logic when comparing two
RowExprs, for consistency with the main transformAExprOp() logic. (This
allows some cases with not-quite-identical rowtypes to succeed, so we might
get push-back if we removed it.) Per bug #8198 from Rafal Rzepecki.
Back-patch to all supported branches, since this works fine as far back as
8.4.
Rafal Rzepecki and Tom Lane
In 9.2, Unicode escape sequences are not analysed at all other than
to make sure that they are in the form \uXXXX. But in 9.3 many of the
new operators and functions try to turn JSON text values into text in
the server encoding, and this includes de-escaping Unicode escape
sequences. This processing had not taken into account the possibility
that this might contain a surrogate pair to designate a character
outside the BMP. That is now handled correctly.
This also enforces correct use of surrogate pairs, something that is not
done by the type's input routines. This fact is noted in the docs.
The planner is aware that it mustn't push down upper-level quals into
subqueries if the quals reference subquery output columns that contain
set-returning functions or volatile functions, or are non-DISTINCT outputs
of a DISTINCT ON subquery. However, it missed making this check when
there were one or more levels of UNION or INTERSECT above the dangerous
expression. This could lead to "set-valued function called in context that
cannot accept a set" errors, as seen in bug #8213 from Eric Soroos, or to
silently wrong answers in the other cases.
To fix, refactor the checks so that we make the column-is-unsafe checks
during subquery_is_pushdown_safe(), which already has to recursively
inspect all arms of a set-operation tree. This makes
qual_is_pushdown_safe() considerably simpler, at the cost that we will
spend some cycles checking output columns that possibly aren't referenced
in any upper qual. But the cases where this code gets executed at all
are already nontrivial queries, so it's unlikely anybody will notice any
slowdown of planning.
This has been broken since commit 05f916e6ad,
which makes the bug over ten years old. A bit surprising nobody noticed it
before now.
euc_* and mule_internal test cases were identical to the ones in
src/test/mb. sql_ascii didn't exist elsewhere, but has been broken since
2001, and doesn't seem very interesting anyway. drop.sql hasn't been used
since 2000, when regress.sh was removed.
Fixes oversight in commit 2ffa740be9.
Per report from Josh Kupershmidt.
I think we've broken this case before, so let's add a regression test
this time.
In a construct like "select plain_function(set_returning_function(...))",
the plain function is applied to each output row of the SRF successively.
If some of the SRF outputs are NULL, and the plain function is strict,
you'd expect to get NULL results for such rows ... but what actually
happened was that such rows were omitted entirely from the result set.
This was due to confusion of this case with what should happen for nested
set-returning functions; a strict SRF is indeed supposed to yield an empty
set for null input. Per bug #8150 from Erwin Brandstetter.
Although this has been broken forever, we're not back-patching because
of the possibility that some apps out there expect the incorrect behavior.
This change should be listed as a possible incompatibility in the 9.3
release notes.
This reverts the code changes in 50c137487c,
which turned out to induce crashes and not completely fix the problem
anyway. That commit only considered single subqueries that were excluded
by constraint-exclusion logic, but actually the problem also exists for
subqueries that are appendrel members (ie part of a UNION ALL list). In
such cases we can't add a dummy subpath to the appendrel's AppendPath list
without defeating the logic that recognizes when an appendrel is completely
excluded. Instead, fix the problem by having setrefs.c scan the rangetable
an extra time looking for subqueries that didn't get into the plan tree.
(This approach depends on the 9.2 change that made set_subquery_pathlist
generate dummy paths for excluded single subqueries, so that the exclusion
behavior is the same for single subqueries and appendrel members.)
Note: it turns out that the appendrel form of the missed-permissions-checks
bug exists as far back as 8.4. However, since the practical effect of that
bug seems pretty minimal, consensus is to not attempt to fix it in the back
branches, at least not yet. Possibly we could back-port this patch once
it's gotten a reasonable amount of testing in HEAD. For the moment I'm
just going to revert the previous patch in 9.2.
What we have implemented is a radix tree (or a radix trie or a patricia
trie), but the docs and code comments incorrectly called it a "suffix tree".
Alexander Korotkov
Previously this state was represented by whether the view's disk file had
zero or nonzero size, which is problematic for numerous reasons, since it's
breaking a fundamental assumption about heap storage. This was done to
allow unlogged matviews to revert to unpopulated status after a crash
despite our lack of any ability to update catalog entries post-crash.
However, this poses enough risk of future problems that it seems better to
not support unlogged matviews until we can find another way. Accordingly,
revert that choice as well as a number of existing kluges forced by it
in favor of creating a pg_class.relispopulated flag column.
The initial implementation of this feature was really unsupportable,
because it's relying on the physical size of an on-disk file to carry the
relation's populated/unpopulated state, which is at least a modularity
violation and could have serious long-term consequences. We could say that
an unlogged matview goes to empty on crash, but not everybody likes that
definition, so let's just remove the feature for 9.3. We can add it back
when we have a less klugy implementation.
I left the grammar and tab-completion support for CREATE UNLOGGED
MATERIALIZED VIEW in place, since it's harmless and allows delivering a
more specific error message about the unsupported feature.
I'm committing this separately to ease identification of what should be
reverted when/if we are able to re-enable the feature.
A view defined as "select <something> where false" had the curious property
that the system wouldn't check whether users had the privileges necessary
to select from it. More generally, permissions checks could be skipped
for tables referenced in sub-selects or views that were proven empty by
constraint exclusion (although some quick testing suggests this seldom
happens in cases of practical interest). This happened because the planner
failed to include rangetable entries for such tables in the finished plan.
This was noticed in connection with erroneous handling of materialized
views, but actually the issue is quite unrelated to matviews. Therefore,
revert commit 200ba1667b in favor of a more
direct test for the real problem.
Back-patch to 9.2 where the bug was introduced (by commit
7741dd6590).
This patch gets rid of the concept of, and infrastructure for,
non-canonical PathKeys; we now only ever create canonical pathkey lists.
The need for non-canonical pathkeys came from the desire to have
grouping_planner initialize query_pathkeys and related pathkey lists before
calling query_planner. However, since query_planner didn't actually *do*
anything with those lists before they'd been made canonical, we can get rid
of the whole mess by just not creating the lists at all until the point
where we formerly canonicalized them.
There are several ways in which we could implement that without making
query_planner itself deal with grouping/sorting features (which are
supposed to be the province of grouping_planner). I chose to add a
callback function to query_planner's API; other alternatives would have
required adding more fields to PlannerInfo, which while not bad in itself
would create an ABI break for planner-related plugins in the 9.2 release
series. This still breaks ABI for anything that calls query_planner
directly, but it seems somewhat unlikely that there are any such plugins.
I had originally conceived of this change as merely a step on the way to
fixing bug #8049 from Teun Hoogendoorn; but it turns out that this fixes
that bug all by itself, as per the added regression test. The reason is
that now get_eclass_for_sort_expr is adding the ORDER BY expression at the
end of EquivalenceClass creation not the start, and so anything that is in
a multi-member EquivalenceClass has already been created with correct
em_nullable_relids. I am suspicious that there are related scenarios in
which we still need to teach get_eclass_for_sort_expr to compute correct
nullable_relids, but am not eager to risk destabilizing either 9.2 or 9.3
to fix bugs that are only hypothetical. So for the moment, do this and
stop here.
Back-patch to 9.2 but not to earlier branches, since they don't exhibit
this bug for lack of join-clause-movement logic that depends on
em_nullable_relids being correct. (We might have to revisit that choice
if any related bugs turn up.) In 9.2, don't change the signature of
make_pathkeys_for_sortclauses nor remove canonicalize_pathkeys, so as
not to risk more plugin breakage than we have to.
ORDER BY expressions were being treated the same as regular aggregate
arguments for purposes of collation determination, but really they should
not affect the aggregate's collation at all; only collations of the
aggregate's regular arguments should affect it.
In many cases this mistake would lead to incorrectly throwing a "collation
conflict" error; but in some cases the corrected code will silently assign
a different collation to the aggregate than before, for example
agg(foo ORDER BY bar COLLATE "x")
which will now use foo's collation rather than "x" for the aggregate.
Given this risk and the lack of field complaints about the issue, it
doesn't seem prudent to back-patch.
In passing, rearrange code in assign_collations_walker so that we don't
need multiple copies of the standard logic for computing collation of a
node with children. (Previously, CaseExpr duplicated the standard logic,
and we would have needed a third copy for Aggref without this change.)
Andrew Gierth and David Fetter
In most cases, these were just references to the SQL standard in
general. In a few cases, a contrast was made between SQL92 and later
standards -- those have been kept unchanged.
Notice and complain about PQcancel() failures. Also, don't dump core if
an error PGresult doesn't contain severity and message subfields, as it
might not if it was generated by libpq itself. (We have a longstanding
TODO item to improve that, but in the meantime isolationtester had better
cope.)
I tripped across the latter item while investigating a trouble report on
buildfarm member spoonbill. As for the former, there's no evidence that
PQcancel failure is actually involved in spoonbill's problem, but it still
seems like a bad idea to ignore an error return code.