Commit graph

7953 commits

Author SHA1 Message Date
Peter Eisentraut
086c84b23d Fix error code for referential action RESTRICT
According to the SQL standard, if the referential action RESTRICT is
triggered, it has its own error code.  We previously didn't use that,
we just used the error code for foreign key violation.  But RESTRICT
is not necessarily an actual foreign key violation.  The foreign key
might still be satisfied in theory afterwards, but the RESTRICT
setting prevents the action even then.  So it's a separate kind of
error condition.

Discussion: https://www.postgresql.org/message-id/ea5b2777-266a-46fa-852f-6fca6ec480ad@eisentraut.org
2024-12-02 08:22:34 +01:00
Tom Lane
e032e4c7dd Avoid mislabeling of lateral references, redux.
As I'd feared, commit 5c9d8636d was still a few bricks shy of a load.
We can't just leave pulled-up lateral-reference Vars with no new
nullingrels: we have to carefully compute what subset of the
to-be-replaced Var's nullingrels apply to them, else we still get
"wrong varnullingrels" errors.  This is a bit tedious, but it looks
like we can use the nullingrel data this patch computes for other
purposes, enabling better optimization.  We don't want to inject
unnecessary plan changes into stable branches though, so leave that
idea for a later HEAD-only patch.

Patch by me, but thanks to Richard Guo for devising a test case that
broke 5c9d8636d, and for preliminary investigation about how to fix
it.  As before, back-patch to v16.

Discussion: https://postgr.es/m/E1tGn4j-0003zi-MP@gemulon.postgresql.org
2024-11-30 12:42:19 -05:00
Peter Eisentraut
4a2dbfc6be Add tests for foreign keys with case-insensitive collations
Some of the behaviors of the different referential actions, such as
the difference between NO ACTION and RESTRICT are best illustrated
using a case-insensitive collation.  So add some tests for that.

(What is actually being tested here is the behavior with values that
are "distinct" (binary different) but compare as equal.  Another way
to do that would be with positive and negative zeroes with float
types.  But this way seems nicer and more flexible.)

Discussion: https://www.postgresql.org/message-id/ea5b2777-266a-46fa-852f-6fca6ec480ad@eisentraut.org
2024-11-29 08:52:55 +01:00
Alexander Korotkov
5bba0546ee Skip not SOAP-supported indexes while transforming an OR clause into SAOP
There is no point in transforming OR-clauses into SAOP's if the target index
doesn't support SAOP scans anyway.  This commit adds corresponding checks
to match_orclause_to_indexcol() and group_similar_or_args().  The first check
fixes the actual bug, while the second just saves some cycles.

Reported-by: Alexander Lakhin
Discussion: https://postgr.es/m/8174de69-9e1a-0827-0e81-ef97f56a5939%40gmail.com
Author: Alena Rybakina
Reviewed-by: Ranier Vilela, Alexander Korotkov, Andrei Lepikhov
2024-11-29 09:52:12 +02:00
Tom Lane
5c9d8636d3 Avoid mislabeling of lateral references when pulling up a subquery.
If we are pulling up a subquery that's under an outer join, and
the subquery's target list contains a strict expression that uses
both a subquery variable and a lateral-reference variable, it's okay
to pull up the expression without wrapping it in a PlaceHolderVar.
That's safe because if the subquery variable is forced to NULL
by the outer join, the expression result will come out as NULL too,
so we don't have to force that outcome by evaluating the expression
below the outer join.  It'd be correct to wrap in a PHV, but that can
lead to very significantly worse plans, since we'd then have to use
a nestloop plan to pass down the lateral reference to where the
expression will be evaluated.

However, when we do that, we should not mark the lateral reference
variable as being nulled by the outer join, because it isn't after
we pull up the expression in this way.  So the marking logic added
by cb8e50a4a was incorrect in this detail, leading to "wrong
varnullingrels" errors from the consistency-checking logic in
setrefs.c.  It seems to be sufficient to just not mark lateral
references at all in this case.  (I have a nagging feeling that more
complexity may be needed in cases where there are several levels of
outer join, but some attempts to break it with that didn't succeed.)

Per report from Bertrand Mamasam.  Back-patch to v16, as the previous
patch was.

Discussion: https://postgr.es/m/CACZ67_UA_EVrqiFXJu9XK50baEpH=ofEPJswa2kFxg6xuSw-ww@mail.gmail.com
2024-11-28 17:33:16 -05:00
Peter Eisentraut
7f798aca1d Remove useless casts to (void *)
Many of them just seem to have been copied around for no real reason.
Their presence causes (small) risks of hiding actual type mismatches
or silently discarding qualifiers

Discussion: https://www.postgresql.org/message-id/flat/461ea37c-8b58-43b4-9736-52884e862820@eisentraut.org
2024-11-28 08:27:20 +01:00
Andrew Dunstan
5c32c21afe jsonapi: add lexer option to keep token ownership
Commit 0785d1b8b adds support for libpq as a JSON client, but
allocations for string tokens can still be leaked during parsing
failures. This is tricky to fix for the object_field semantic callbacks:
the field name must remain valid until the end of the object, but if a
parsing error is encountered partway through, object_field_end() won't
be invoked and the client won't get a chance to free the field name.

This patch adds a flag to switch the ownership of parsed tokens to the
lexer. When this is enabled, the client must make a copy of any tokens
it wants to persist past the callback lifetime, but the lexer will
handle necessary cleanup on failure.

Backend uses of the JSON parser don't need to use this flag, since the
parser's allocations will occur in a short lived memory context.

A -o option has been added to test_json_parser_incremental to exercise
the new setJsonLexContextOwnsTokens() API, and the test_json_parser TAP
tests make use of it. (The test program now cleans up allocated memory,
so that tests can be usefully run under leak sanitizers.)

Author: Jacob Champion

Discussion: https://postgr.es/m/CAOYmi+kb38EciwyBQOf9peApKGwraHqA7pgzBkvoUnw5BRfS1g@mail.gmail.com
2024-11-27 12:07:14 -05:00
Álvaro Herrera
09d09d4297
Fix pg_get_constraintdef for NOT NULL constraints on domains
We added pg_constraint rows for all not-null constraints, first for
tables and later for domains; but while the ones for tables were
reverted, the ones for domains were not.  However, we did accidentally
revert ruleutils.c support for the ones on domains in 6f8bb7c1e9,
which breaks running pg_get_constraintdef() on them.  Put that back.

This is only needed in branch 17, because we've reinstated this code in
branch master with commit 14e87ffa5c.  Add some new tests in both
branches.

I couldn't find anything else that needs de-reverting.

Reported-by: Erki Eessaar <erki.eessaar@taltech.ee>
Reviewed-by: Magnus Hagander <magnus@hagander.net>
Discussion: https://postgr.es/m/AS8PR01MB75110350415AAB8BBABBA1ECFE222@AS8PR01MB7511.eurprd01.prod.exchangelabs.com
2024-11-27 13:50:27 +01:00
Amit Kapila
9d7aa406d0 Fix buildfarm failure from commit 8fcd80258b.
The test case was incorrectly matching the error code.

Author: Vignesh C
Discussion: https://postgr.es/m/CALDaNm0C5LPiTxkdqsxiyeaL=nuUP8t6ne81sp9jE0=MFz=-ew@mail.gmail.com
2024-11-27 14:54:26 +05:30
Peter Eisentraut
85b7efa1cd Support LIKE with nondeterministic collations
This allows for example using LIKE with case-insensitive collations.
There was previously no internal implementation of this, so it was met
with a not-supported error.  This adds the internal implementation and
removes the error.  The implementation follows the specification of
the SQL standard for this.

Unlike with deterministic collations, the LIKE matching cannot go
character by character but has to go substring by substring.  For
example, if we are matching against LIKE 'foo%bar', we can't start by
looking for an 'f', then an 'o', but instead with have to find
something that matches 'foo'.  This is because the collation could
consider substrings of different lengths to be equal.  This is all
internal to MatchText() in like_match.c.

The changes in GenericMatchText() in like.c just pass through the
locale information to MatchText(), which was previously not needed.
This matches exactly Generic_Text_IC_like() below.

ILIKE is not affected.  (It's unclear whether ILIKE makes sense under
nondeterministic collations.)

This also updates match_pattern_prefix() in like_support.c to support
optimizing the case of an exact pattern with nondeterministic
collations.  This was already alluded to in the previous code.

(includes documentation examples from Daniel Vérité and test cases
from Paul A Jungwirth)

Reviewed-by: Jian He <jian.universality@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/700d2e86-bf75-4607-9cf2-f5b7802f6e88@eisentraut.org
2024-11-27 08:19:42 +01:00
Amit Kapila
8fcd80258b Improve error message for replication of generated columns.
Currently, logical replication produces a generic error message when
targeting a subscriber-side table column that is either missing or
generated. The error message can be misleading for generated columns.

This patch introduces a specific error message to clarify the issue when
generated columns are involved.

Author: Shubham Khanna
Reviewed-by: Peter Smith, Vignesh C, Amit Kapila
Discussion: https://postgr.es/m/CAHv8RjJBvYtqU7OAofBizOmQOK2Q8h+w9v2_cQWxT_gO7er3Aw@mail.gmail.com
2024-11-27 09:09:20 +05:30
Richard Guo
e15e567137 Fix test case from a8ccf4e93
Commit a8ccf4e93 uses the same table name "distinct_tbl" in both
select_distinct.sql and select_distinct_on.sql, which could cause
conflicts when these two test scripts are run in parallel.

Fix by renaming the table in select_distinct_on.sql to
"distinct_on_tbl".

Per buildfarm (via Tom Lane)

Discussion: https://postgr.es/m/1572004.1732583549@sss.pgh.pa.us
2024-11-26 11:12:57 +09:00
Richard Guo
a8ccf4e93a Reordering DISTINCT keys to match input path's pathkeys
The ordering of DISTINCT items is semantically insignificant, so we
can reorder them as needed.  In fact, in the parser, we absorb the
sorting semantics of the sortClause as much as possible into the
distinctClause, ensuring that one clause is a prefix of the other.
This can help avoid a possible need to re-sort.

In this commit, we attempt to adjust the DISTINCT keys to match the
input path's pathkeys.  This can likewise help avoid re-sorting, or
allow us to use incremental-sort to save efforts.

For DISTINCT ON expressions, the parser already ensures that they
match the initial ORDER BY expressions.  When reordering the DISTINCT
keys, we must ensure that the resulting pathkey list matches the
initial distinctClause pathkeys.

This introduces a new GUC, enable_distinct_reordering, which allows
the optimization to be disabled if needed.

Author: Richard Guo
Reviewed-by: Andrei Lepikhov
Discussion: https://postgr.es/m/CAMbWs48dR26cCcX0f=8bja2JKQPcU64136kHk=xekHT9xschiQ@mail.gmail.com
2024-11-26 09:25:18 +09:00
Tom Lane
5b8728cd7f Fix NULLIF()'s handling of read-write expanded objects.
If passed a read-write expanded object pointer, the EEOP_NULLIF
code would hand that same pointer to the equality function
and then (unless equality was reported) also return the same
pointer as its value.  This is no good, because a function that
receives a read-write expanded object pointer is fully entitled
to scribble on or even delete the object, thus corrupting the
NULLIF output.  (This problem is likely unobservable with the
equality functions provided in core Postgres, but it's easy to
demonstrate with one coded in plpgsql.)

To fix, make sure the pointer passed to the equality function
is read-only.  We can still return the original read-write
pointer as the NULLIF result, allowing optimization of later
operations.

Per bug #18722 from Alexander Lakhin.  This has been wrong
since we invented expanded objects, so back-patch to all
supported branches.

Discussion: https://postgr.es/m/18722-fd9e645448cc78b4@postgresql.org
2024-11-25 18:09:09 -05:00
Noah Misch
4ba84de459 Avoid "you don't own a lock of type ExclusiveLock" in GRANT TABLESPACE.
This WARNING appeared because SearchSysCacheLocked1() read
cc_relisshared before catcache initialization, when the field is false
unconditionally.  On the basis of reading false there, it constructed a
locktag as though pg_tablespace weren't relisshared.  Only shared
catalogs could be affected, and only GRANT TABLESPACE was affected in
practice.  SearchSysCacheLocked1() callers use one other shared-relation
syscache, DATABASEOID.  DATABASEOID is initialized by the end of
CheckMyDatabase(), making the problem unreachable for pg_database.

Back-patch to v13 (all supported versions).  This has no known impact
before v16, where ExecGrant_common() first appeared.  Earlier branches
avoid trouble by having a separate ExecGrant_Tablespace() that doesn't
use LOCKTAG_TUPLE.  However, leaving this unfixed in v15 could ensnare a
future back-patch of a SearchSysCacheLocked1() call.

Reported by Aya Iwata.

Discussion: https://postgr.es/m/OS7PR01MB11964507B5548245A7EE54E70EA212@OS7PR01MB11964.jpnprd01.prod.outlook.com
2024-11-25 14:42:35 -08:00
Alexander Korotkov
d4d11940df Remove the wrong assertion from match_orclause_to_indexcol()
Obviously, the constant could be zero.  Also, add the relevant check to
regression tests.

Reported-by: Richard Guo
Discussion: https://postgr.es/m/CAMbWs4-siKJdtWhcbqk4Y-xG12do2Ckm1qw672GNsSnDqL9FQg%40mail.gmail.com
2024-11-25 09:07:30 +02:00
Noah Misch
5de08f136a Test "options=-crole=" and "ALTER DATABASE SET role".
Commit 7b88529f43 fixed a regression
spanning these features, but it didn't test them.  It did test code
paths sufficient for their present implementations, so no back-patch.

Reported by Matthew Woodcraft.

Discussion: https://postgr.es/m/87iksnsbhx.fsf@golux.woodcraft.me.uk
2024-11-24 12:49:53 -08:00
Alexander Korotkov
ae4569161a Teach bitmap path generation about transforming OR-clauses to SAOP's
When optimizer generates bitmap paths, it considers breaking OR-clause
arguments one-by-one.  But now, a group of similar OR-clauses can be
transformed into SAOP during index matching.  So, bitmap paths should
keep up.

This commit teaches bitmap paths generation machinery to group similar
OR-clauses into dedicated RestrictInfos.  Those RestrictInfos are considered
both to match index as a whole (as SAOP), or to match as a set of individual
OR-clause argument one-by-one (the old way).

Therefore, bitmap path generation will takes advantage of OR-clauses to SAOP's
transformation.  The old way of handling them is also considered.  So, there
shouldn't be planning regression.

Discussion: https://postgr.es/m/CAPpHfdu5iQOjF93vGbjidsQkhHvY2NSm29duENYH_cbhC6x%2BMg%40mail.gmail.com
Author: Alexander Korotkov, Andrey Lepikhov
Reviewed-by: Alena Rybakina, Andrei Lepikhov, Jian he, Robert Haas
Reviewed-by: Peter Geoghegan
2024-11-24 01:41:45 +02:00
Alexander Korotkov
d4378c0005 Transform OR-clauses to SAOP's during index matching
This commit makes match_clause_to_indexcol() match
"(indexkey op C1) OR (indexkey op C2) ... (indexkey op CN)" expression
to the index while transforming it into "indexkey op ANY(ARRAY[C1, C2, ...])"
(ScalarArrayOpExpr node).

This transformation allows handling long OR-clauses with single IndexScan
avoiding diving them into a slower BitmapOr.

We currently restrict Ci to be either Const or Param to apply this
transformation only when it's clearly beneficial.  However, in the future,
we might switch to a liberal understanding of constants, as it is in other
cases.

Discussion: https://postgr.es/m/567ED6CA.2040504%40sigaev.ru
Author: Alena Rybakina, Andrey Lepikhov, Alexander Korotkov
Reviewed-by: Peter Geoghegan, Ranier Vilela, Alexander Korotkov, Robert Haas
Reviewed-by: Jian He, Tom Lane, Nikolay Shaplov
2024-11-24 01:40:20 +02:00
Jeff Davis
869ee4f10e Disallow modifying statistics on system columns.
Reported-by: Heikki Linnakangas
Discussion: https://postgr.es/m/df3e1c41-4e6c-40ad-9636-98deefe488cd@iki.fi
2024-11-22 12:40:24 -08:00
Nathan Bossart
efdc7d7475 Add INT64_HEX_FORMAT and UINT64_HEX_FORMAT to c.h.
Like INT64_FORMAT and UINT64_FORMAT, these macros produce format
strings for 64-bit integers.  However, INT64_HEX_FORMAT and
UINT64_HEX_FORMAT generate the output in hexadecimal instead of
decimal.  Besides introducing these macros, this commit makes use
of them in several places.  This was originally intended to be part
of commit 5d6187d2a2, but I left it out because I felt there was a
nonzero chance that back-patching these new macros into c.h could
cause problems with third-party code.  We tend to be less cautious
with such changes in new major versions.

Note that UINT64_HEX_FORMAT was originally added in commit
ee1b30f128, but it was placed in test_radixtree.c, so it wasn't
widely available.  This commit moves UINT64_HEX_FORMAT to c.h.

Discussion: https://postgr.es/m/ZwQvtUbPKaaRQezd%40nathan
2024-11-22 12:41:57 -06:00
Michael Paquier
c06e71d1ac Add write_to_file to PgStat_KindInfo for pgstats kinds
This new field controls if entries of a stats kind should be written or
not to the on-disk pgstats file when shutting down an instance.  This
affects both fixed and variable-numbered kinds.

This is useful for custom statistics by itself, and a patch is under
discussion to add a new builtin stats kind where the write of the stats
is not necessary.  All the built-in stats kinds, as well as the two
custom stats kinds in the test module injection_points, set this flag to
"true" for now, so as stats entries are written to the on-disk pgstats
file.

Author: Bertrand Drouvot
Reviewed-by: Nazir Bilal Yavuz
Discussion: https://postgr.es/m/Zz7T47nHwYgeYwOe@ip-10-97-1-34.eu-west-3.compute.internal
2024-11-22 10:12:26 +09:00
Peter Eisentraut
79b575d3bc Fix ALTER TABLE / REPLICA IDENTITY for temporal tables
REPLICA IDENTITY USING INDEX did not accept a GiST index.  This should
be allowed when used as a temporal primary key.

Author: Paul Jungwirth <pj@illuminatedcomputing.com>
Discussion: https://www.postgresql.org/message-id/04579cbf-b134-45e1-8f2d-8c54c849c1ee@illuminatedcomputing.com
2024-11-21 13:50:18 +01:00
Noah Misch
7b88529f43 Fix per-session activation of ALTER {ROLE|DATABASE} SET role.
After commit 5a2fed911a, the catalog state
resulting from these commands ceased to affect sessions.  Restore the
longstanding behavior, which is like beginning the session with a SET
ROLE command.  If cherry-picking the CVE-2024-10978 fixes, default to
including this, too.  (This fixes an unintended side effect of fixing
CVE-2024-10978.)  Back-patch to v12, like that commit.  The release team
decided to include v12, despite the original intent to halt v12 commits
earlier this week.

Tom Lane and Noah Misch.  Reported by Etienne LAFARGE.

Discussion: https://postgr.es/m/CADOZwSb0UsEr4_UTFXC5k7=fyyK8uKXekucd+-uuGjJsGBfxgw@mail.gmail.com
2024-11-15 20:39:56 -08:00
Tom Lane
b69bdcee9c Avoid assertion due to disconnected NFA sub-graphs in regex parsing.
In commit 08c0d6ad6 which introduced "rainbow" arcs in regex NFAs,
I didn't think terribly hard about what to do when creating the color
complement of a rainbow arc.  Clearly, the complement cannot match any
characters, and I took the easy way out by just not building any arcs
at all in the complement arc set.  That mostly works, but Nikolay
Shaplov found a case where it doesn't: if we decide to delete that
sub-NFA later because it's inside a "{0}" quantifier, delsub()
suffered an assertion failure.  That's because delsub() relies on
the target sub-NFA being fully connected.  That was always true
before, and the best fix seems to be to restore that property.
Hence, invent a new arc type CANTMATCH that can be generated in
place of an empty color complement, and drop it again later when we
start NFA optimization.  (At that point we don't need to do delsub()
any more, and besides there are other cases where NFA optimization can
lead to disconnected subgraphs.)

It appears that this bug has no consequences in a non-assert-enabled
build: there will be some transiently leaked NFA states/arcs, but
they'll get cleaned up eventually.  Still, we don't like assertion
failures, so back-patch to v14 where rainbow arcs were introduced.

Per bug #18708 from Nikolay Shaplov.

Discussion: https://postgr.es/m/18708-f94f2599c9d2c005@postgresql.org
2024-11-15 18:23:38 -05:00
Peter Eisentraut
9321d2fdf8 Fix collation handling for foreign keys
Allowing foreign keys where the referenced and the referencing columns
have collations with different notions of equality is problematic.
This can only happen when using nondeterministic collations, for
example, if the referencing column is case-insensitive and the
referenced column is not, or vice versa.  It does not happen if both
collations are deterministic.

To show one example:

    CREATE COLLATION case_insensitive (provider = icu, deterministic = false, locale = 'und-u-ks-level2');

    CREATE TABLE pktable (x text COLLATE "C" PRIMARY KEY);
    CREATE TABLE fktable (x text COLLATE case_insensitive REFERENCES pktable ON UPDATE CASCADE ON DELETE CASCADE);
    INSERT INTO pktable VALUES ('A'), ('a');
    INSERT INTO fktable VALUES ('A');

    BEGIN; DELETE FROM pktable WHERE x = 'a'; TABLE fktable; ROLLBACK;
    BEGIN; DELETE FROM pktable WHERE x = 'A'; TABLE fktable; ROLLBACK;

Both of these DELETE statements delete the one row from fktable.  So
this means that one row from fktable references two rows in pktable,
which should not happen.  (That's why a primary key or unique
constraint is required on pktable.)

When nondeterministic collations were implemented, the SQL standard
available to yours truly said that referential integrity checks should
be performed with the collation of the referenced column, and so
that's how we implemented it.  But this turned out to be a mistake in
the SQL standard, for the same reasons as above, that was later
(SQL:2016) fixed to require both collations to be the same.  So that's
what we are aiming for here.

We don't have to be quite so strict.  We can allow different
collations if they are both deterministic.  This is also good for
backward compatibility.

So the new rule is that the collations either have to be the same or
both deterministic.  Or in other words, if one of them is
nondeterministic, then both have to be the same.

Users upgrading from before that have affected setups will need to
make changes to their schemas (i.e., change one or both collations in
affected foreign-key relationships) before the upgrade will succeed.

Some of the nice test cases for the previous situation in
collate.icu.utf8.sql are now obsolete.  They are changed to just check
the error checking of the new rule.  Note that collate.sql already
contained a test for foreign keys with different deterministic
collations.

A bunch of code in ri_triggers.c that added a COLLATE clause to
enforce the referenced column's collation can be removed, because both
columns now have to have the same notion of equality, so it doesn't
matter which one to use.

Reported-by: Paul Jungwirth <pj@illuminatedcomputing.com>
Reviewed-by: Jian He <jian.universality@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/78d824e0-b21e-480d-a252-e4b84bc2c24b@illuminatedcomputing.com
2024-11-15 14:55:54 +01:00
Peter Eisentraut
d31bbfb659 Proper object locking for GRANT/REVOKE
Refactor objectNamesToOids() to use get_object_address() internally if
possible.  Not only does this save a lot of code, it also allows us to
use the object locking provided by get_object_address() for
GRANT/REVOKE.  There was previously a code comment that complained
about the lack of locking in objectNamesToOids(), which is now fixed.

The check in ExecGrant_Type_check() is obsolete because
get_object_address_type() already does the same check.

Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/bf72b82c-124d-4efa-a484-bb928e9494e4@eisentraut.org
2024-11-15 11:03:48 +01:00
Heikki Linnakangas
bb861414fe Kill dead-end children when there's nothing else left
Previously, the postmaster would never try to kill dead-end child
processes, even if there were no other processes left. A dead-end
backend will eventually exit, when authentication_timeout expires, but
if a dead-end backend is the only thing that's preventing the server
from shutting down, it seems better to kill it immediately. It's
particularly important, if there was a bug in the early startup code
that prevented a dead-end child from timing out and exiting normally.

Includes a test for that case where a dead-end backend previously
prevented the server from shutting down.

Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/a102f15f-eac4-4ff2-af02-f9ff209ec66f@iki.fi
2024-11-14 16:12:04 +02:00
Tom Lane
5a2fed911a Fix improper interactions between session_authorization and role.
The SQL spec mandates that SET SESSION AUTHORIZATION implies
SET ROLE NONE.  We tried to implement that within the lowest-level
functions that manipulate these settings, but that was a bad idea.
In particular, guc.c assumes that it doesn't matter in what order
it applies GUC variable updates, but that was not the case for these
two variables.  This problem, compounded by some hackish attempts to
work around it, led to some security-grade issues:

* Rolling back a transaction that had done SET SESSION AUTHORIZATION
would revert to SET ROLE NONE, even if that had not been the previous
state, so that the effective user ID might now be different from what
it had been.

* The same for SET SESSION AUTHORIZATION in a function SET clause.

* If a parallel worker inspected current_setting('role'), it saw
"none" even when it should see something else.

Also, although the parallel worker startup code intended to cope
with the current role's pg_authid row having disappeared, its
implementation of that was incomplete so it would still fail.

Fix by fully separating the miscinit.c functions that assign
session_authorization from those that assign role.  To implement the
spec's requirement, teach set_config_option itself to perform "SET
ROLE NONE" when it sets session_authorization.  (This is undoubtedly
ugly, but the alternatives seem worse.  In particular, there's no way
to do it within assign_session_authorization without incompatible
changes in the API for GUC assign hooks.)  Also, improve
ParallelWorkerMain to directly set all the relevant user-ID variables
instead of relying on some of them to get set indirectly.  That
allows us to survive not finding the pg_authid row during worker
startup.

In v16 and earlier, this includes back-patching 9987a7bf3 which
fixed a violation of GUC coding rules: SetSessionAuthorization
is not an appropriate place to be throwing errors from.

Security: CVE-2024-10978
2024-11-11 10:29:54 -05:00
Nathan Bossart
cd7ab57532 Ensure cached plans are correctly marked as dependent on role.
If a CTE, subquery, sublink, security invoker view, or coercion
projection references a table with row-level security policies, we
neglected to mark the plan as potentially dependent on which role
is executing it.  This could lead to later executions in the same
session returning or hiding rows that should have been hidden or
returned instead.

Reported-by: Wolfgang Walther
Reviewed-by: Noah Misch
Security: CVE-2024-10976
Backpatch-through: 12
2024-11-11 09:00:00 -06:00
Noah Misch
b7e3a52a87 Block environment variable mutations from trusted PL/Perl.
Many process environment variables (e.g. PATH), bypass the containment
expected of a trusted PL.  Hence, trusted PLs must not offer features
that achieve setenv().  Otherwise, an attacker having USAGE privilege on
the language often can achieve arbitrary code execution, even if the
attacker lacks a database server operating system user.

To fix PL/Perl, replace trusted PL/Perl %ENV with a tied hash that just
replaces each modification attempt with a warning.  Sites that reach
these warnings should evaluate the application-specific implications of
proceeding without the environment modification:

  Can the application reasonably proceed without the modification?

    If no, switch to plperlu or another approach.

    If yes, the application should change the code to stop attempting
    environment modifications.  If that's too difficult, add "untie
    %main::ENV" in any code executed before the warning.  For example,
    one might add it to the start of the affected function or even to
    the plperl.on_plperl_init setting.

In passing, link to Perl's guidance about the Perl features behind the
security posture of PL/Perl.

Back-patch to v12 (all supported versions).

Andrew Dunstan and Noah Misch

Security: CVE-2024-10979
2024-11-11 06:23:43 -08:00
Michael Paquier
e7a9496de9 Add two attributes to pg_stat_database for parallel workers activity
Two attributes are added to pg_stat_database:
* parallel_workers_to_launch, counting the total number of parallel
workers that were planned to be launched.
* parallel_workers_launched, counting the total number of parallel
workers actually launched.

The ratio of both fields can provide hints that there are not enough
slots available when launching parallel workers, also useful when
pg_stat_statements is not deployed on an instance (i.e. cf54a2c002).

This commit relies on de3a2ea3b2, that has added two fields to EState,
that get incremented when executing Gather or GatherMerge nodes.

A test is added in select_parallel, where parallel workers are spawned.

Bump catalog version.

Author: Benoit Lobréau
Discussion: https://postgr.es/m/783bc7f7-659a-42fa-99dd-ee0565644e25@dalibo.com
2024-11-11 10:40:48 +09:00
Álvaro Herrera
14e87ffa5c
Add pg_constraint rows for not-null constraints
We now create contype='n' pg_constraint rows for not-null constraints on
user tables.  Only one such constraint is allowed for a column.

We propagate these constraints to other tables during operations such as
adding inheritance relationships, creating and attaching partitions and
creating tables LIKE other tables.  These related constraints mostly
follow the well-known rules of conislocal and coninhcount that we have
for CHECK constraints, with some adaptations: for example, as opposed to
CHECK constraints, we don't match not-null ones by name when descending
a hierarchy to alter or remove it, instead matching by the name of the
column that they apply to.  This means we don't require the constraint
names to be identical across a hierarchy.

The inheritance status of these constraints can be controlled: now we
can be sure that if a parent table has one, then all children will have
it as well.  They can optionally be marked NO INHERIT, and then children
are free not to have one.  (There's currently no support for altering a
NO INHERIT constraint into inheriting down the hierarchy, but that's a
desirable future feature.)

This also opens the door for having these constraints be marked NOT
VALID, as well as allowing UNIQUE+NOT NULL to be used for functional
dependency determination, as envisioned by commit e49ae8d3bc.  It's
likely possible to allow DEFERRABLE constraints as followup work, as
well.

psql shows these constraints in \d+, though we may want to reconsider if
this turns out to be too noisy.  Earlier versions of this patch hid
constraints that were on the same columns of the primary key, but I'm
not sure that that's very useful.  If clutter is a problem, we might be
better off inventing a new \d++ command and not showing the constraints
in \d+.

For now, we omit these constraints on system catalog columns, because
they're unlikely to achieve anything.

The main difference to the previous attempt at this (b0e96f3119) is
that we now require that such a constraint always exists when a primary
key is in the column; we didn't require this previously which had a
number of unpalatable consequences.  With this requirement, the code is
easier to reason about.  For example:

- We no longer have "throwaway constraints" during pg_dump.  We needed
  those for the case where a table had a PK without a not-null
  underneath, to prevent a slow scan of the data during restore of the
  PK creation, which was particularly problematic for pg_upgrade.

- We no longer have to cope with attnotnull being set spuriously in
  case a primary key is dropped indirectly (e.g., via DROP COLUMN).

Some bits of code in this patch were authored by Jian He.

Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Author: Bernd Helmle <mailings@oopsware.de>
Reviewed-by: 何建 (jian he) <jian.universality@gmail.com>
Reviewed-by: 王刚 (Tender Wang) <tndrwang@gmail.com>
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Discussion: https://postgr.es/m/202408310358.sdhumtyuy2ht@alvherre.pgsql
2024-11-08 13:28:48 +01:00
Amit Langote
075acdd933 Disallow partitionwise join when collations don't match
If the collation of any join key column doesn’t match the collation of
the corresponding partition key, partitionwise joins can yield incorrect
results. For example, rows that would match under the join key collation
might be located in different partitions due to the partitioning
collation. In such cases, a partitionwise join would yield different
results from a non-partitionwise join, so disallow it in such cases.

Reported-by: Tender Wang <tndrwang@gmail.com>
Author: Jian He <jian.universality@gmail.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Reviewed-by: Junwang Zhao <zhjwpku@gmail.com>
Discussion: https://postgr.es/m/CAHewXNno_HKiQ6PqyLYfuqDtwp7KKHZiH1J7Pqyz0nr+PS2Dwg@mail.gmail.com
Backpatch-through: 12
2024-11-08 17:25:24 +09:00
Amit Langote
90fe6251c8 Disallow partitionwise grouping when collations don't match
If the collation of any grouping column doesn’t match the collation of
the corresponding partition key, partitionwise grouping can yield
incorrect results. For example, rows that would be grouped under the
grouping collation may end up in different partitions under the
partitioning collation. In such cases, full partitionwise grouping
would produce results that differ from those without partitionwise
grouping, so disallowed that.

Partial partitionwise aggregation is still allowed, as the Finalize
step reconciles partition-level aggregates with grouping requirements
across all partitions, ensuring that the final output remains
consistent.

This commit also fixes group_by_has_partkey() by ensuring the
RelabelType node is stripped from grouping expressions when matching
them to partition key expressions to avoid false mismatches.

Bug: #18568
Reported-by: Webbo Han <1105066510@qq.com>
Author: Webbo Han <1105066510@qq.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Reviewed-by: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-by: Jian He <jian.universality@gmail.com>
Discussion: https://postgr.es/m/18568-2a9afb6b9f7e6ed3@postgresql.org
Discussion: https://postgr.es/m/tencent_9D9103CDA420C07768349CC1DFF88465F90A@qq.com
Discussion: https://postgr.es/m/CAHewXNno_HKiQ6PqyLYfuqDtwp7KKHZiH1J7Pqyz0nr+PS2Dwg@mail.gmail.com
Backpatch-through: 12
2024-11-08 16:07:22 +09:00
Richard Guo
f00ab1fd15 Fix inconsistent RestrictInfo serial numbers
When we generate multiple clones of the same qual condition to cope
with outer join identity 3, we need to ensure that all the clones get
the same serial number.  To achieve this, we reset the
root->last_rinfo_serial counter each time we produce RestrictInfo(s)
from the qual list (see deconstruct_distribute_oj_quals).  This
approach works only if we ensure that we are not changing the qual
list in any way that'd affect the number of RestrictInfos built from
it.

However, with b262ad440, an IS NULL qual on a NOT NULL column might
result in an additional constant-FALSE RestrictInfo.  And different
versions of the same qual clause can lead to different conclusions
about whether it can be reduced to constant-FALSE.  This would affect
the number of RestrictInfos built from the qual list for different
versions, causing inconsistent RestrictInfo serial numbers across
multiple clones of the same qual.  This inconsistency can confuse
users of these serial numbers, such as rebuild_joinclause_attr_needed,
and lead to planner errors such as "ERROR:  variable not found in
subplan target lists".

To fix, reset the root->last_rinfo_serial counter after generating the
additional constant-FALSE RestrictInfo.

Back-patch to v17 where the issue crept in.  In v17, I failed to make
a test case that would expose this bug, so no test case for v17.

Author: Richard Guo
Discussion: https://postgr.es/m/CAMbWs4-B6kafn+LmPuh-TYFwFyEm-vVj3Qqv7Yo-69CEv14rRg@mail.gmail.com
2024-11-08 11:21:11 +09:00
Peter Eisentraut
d7a2b5bd87 Clarify a foreign key error message
Clarify the message about type mismatch in foreign key definition to
indicate which column the referencing and which is the referenced one.

Reported-by: Jian He <jian.universality@gmail.com>
Discussion: https://www.postgresql.org/message-id/CACJufxEL82ao-aXOa=d_-Xip0bix-qdSyNc9fcWxOdkEZFko8w@mail.gmail.com
2024-11-07 11:13:06 +01:00
Amit Kapila
7054186c4e Replicate generated columns when 'publish_generated_columns' is set.
This patch builds on the work done in commit 745217a051 by enabling the
replication of generated columns alongside regular column changes through
a new publication parameter: publish_generated_columns.

Example usage:
CREATE PUBLICATION pub1 FOR TABLE tab_gencol WITH (publish_generated_columns = true);

The column list takes precedence. If the generated columns are specified
in the column list, they will be replicated even if
'publish_generated_columns' is set to false. Conversely, if generated
columns are not included in the column list (assuming the user specifies a
column list), they will not be replicated even if
'publish_generated_columns' is true.

Author: Vignesh C, Shubham Khanna
Reviewed-by: Peter Smith, Amit Kapila, Hayato Kuroda, Shlok Kyal, Ajin Cherian, Hou Zhijie, Masahiko Sawada
Discussion: https://postgr.es/m/B80D17B2-2C8E-4C7D-87F2-E5B4BE3C069E@gmail.com
2024-11-07 08:58:49 +05:30
Michael Paquier
70291a3c66 Improve handling of empty query results in BackgroundPsql::query()
A newline is not added at the end of an empty query result, causing the
banner of the hardcoded \echo to not be discarded.  This would reflect
on scripts that expect an empty result by showing the "QUERY_SEPARATOR"
in the output returned back to the caller, which was confusing.

This commit changes BackgroundPsql::query() so as empty results are able
to work correctly, making the first newline before the banner optional,
bringing more flexibility.

Note that this change affects 037_invalid_database.pl, where three
queries generated an empty result, with the script relying on the data
from the hardcoded banner to exist in the expected output.  These
queries are changed to use query_safe(), leading to a simpler script.

The author has also proposed a test in a different patch where empty
results would exist when using BackgroundPsql.

Author: Jacob Champion
Reviewed-by: Andrew Dunstan, Michael Paquier
Discussion: https://postgr.es/m/CAOYmi+=60deN20WDyCoHCiecgivJxr=98s7s7-C8SkXwrCfHXg@mail.gmail.com
2024-11-07 12:11:27 +09:00
Michael Paquier
ba08edb065 Extend Cluster.pm's background_psql() to be able to start asynchronously
This commit extends the constructor routine of BackgroundPsql.pm with a
new "wait" parameter.  If set to 0, the routine returns without waiting
for psql to start, ready to consume input.

background_psql() in Cluster.pm gains the same "wait" parameter.  The
default behavior is still to wait for psql to start.  It becomes now
possible to not wait, giving to TAP scripts the possibility to perform
actions between a BackgroundPsql startup and its wait_connect() call.

Author: Jacob Champion
Discussion: https://postgr.es/m/CAOYmi+=60deN20WDyCoHCiecgivJxr=98s7s7-C8SkXwrCfHXg@mail.gmail.com
2024-11-06 15:31:14 +09:00
Alexander Korotkov
3a7ae6b3d9 Revert pg_wal_replay_wait() stored procedure
This commit reverts 3c5db1d6b0, and subsequent improvements and fixes
including 8036d73ae3, 867d396ccd, 3ac3ec580c, 0868d7ae70, 85b98b8d5a,
2520226c95, 014f9f34d2, e658038772, e1555645d7, 5035172e4a, 6cfebfe88b,
73da6b8d1b, and e546989a26.

The reason for reverting is a set of remaining issues.  Most notably, the
stored procedure appears to need more effort than the utility statement
to turn the backend into a "snapshot-less" state.  This makes an approach
to use stored procedures questionable.

Catversion is bumped.

Discussion: https://postgr.es/m/Zyhj2anOPRKtb0xW%40paquier.xyz
2024-11-04 22:47:57 +02:00
Heikki Linnakangas
99b937a44f Add PG_TEST_EXTRA configure option to the Make builds
The Meson builds have PG_TEST_EXTRA as a configure-time variable,
which was not available in the Make builds. To ensure both build
systems are in sync, PG_TEST_EXTRA is now added as a configure-time
variable. It can be set like this:

    ./configure PG_TEST_EXTRA="kerberos, ssl, ..."

Note that to preserve the old behavior, this configure-time variable
is overridden by the PG_TEST_EXTRA environment variable when you run
the tests.

Author: Jacob Champion
Reviewed by: Ashutosh Bapat, Nazir Bilal Yavuz
2024-11-04 14:09:38 +02:00
Michael Paquier
027124a872 Add missing newlines at the end of two SQL files
arrays.sql was already missing it before 49d6c7d8da, and I have just
noticed it thanks to this commit.  The second one in test_slru has been
introduced by 768a9fd553.
2024-11-03 19:42:51 +09:00
Michael Paquier
49d6c7d8da Add SQL function array_reverse()
This function takes in input an array, and reverses the position of all
its elements.  This operation only affects the first dimension of the
array, like array_shuffle().

The implementation structure is inspired by array_shuffle(), with a
subroutine called array_reverse_n() that may come in handy in the
future, should more functions able to reverse portions of arrays be
introduced.

Bump catalog version.

Author: Aleksander Alekseev
Reviewed-by: Ashutosh Bapat, Tom Lane, Vladlen Popolitov
Discussion: https://postgr.es/m/CAJ7c6TMpeO_ke+QGOaAx9xdJuxa7r=49-anMh3G5476e3CX1CA@mail.gmail.com
2024-11-01 10:32:19 +09:00
Tom Lane
2d8bff603c Make all ereport() calls within gram.y provide error locations.
This patch responds to a comment that I (tgl) made in the
discussion leading up to 774171c4f, that really all errors
occurring during raw parsing should provide error cursors.
Syntax errors reported by Bison will have one, and most of
the handwritten ereport's in gram.y already provide one,
but there were a few stragglers.

(It is not claimed that this handles every failure reachable
during raw parsing --- out-of-memory is an obvious exception.
But this makes a good start on cases that are likely to occur.)

While we're at it, clean up the reported positions for errors
associated with LIMIT/OFFSET clauses.  Previously we were
relying on applying exprLocation() to the contained expressions,
but that leads to slightly odd cursor placement, e.g.

regression=# (select * from foo limit 10) limit 10;
ERROR:  multiple LIMIT clauses not allowed
LINE 1: (select * from foo limit 10) limit 10;
                                           ^

We can afford to keep a little more state in the transient
SelectLimit structs in order to make that better.

Jian He and Tom Lane (extracted from a larger patch by Jian,
with some additional work by me)

Discussion: https://postgr.es/m/CACJufxEmONE3P2En=jopZy1m=cCCUs65M4+1o52MW5og9oaUPA@mail.gmail.com
2024-10-31 16:09:27 -04:00
Tom Lane
89e51abcb2 Add a parse location field to struct FunctionParameter.
This allows an error cursor to be supplied for a bunch of
bad-function-definition errors that previously lacked one,
or that cheated a bit by pointing at the contained type name
when the error isn't really about that.

Bump catversion from an abundance of caution --- I don't think
this node type can actually appear in stored views/rules, but
better safe than sorry.

Jian He and Tom Lane (extracted from a larger patch by Jian,
with some additional work by me)

Discussion: https://postgr.es/m/CACJufxEmONE3P2En=jopZy1m=cCCUs65M4+1o52MW5og9oaUPA@mail.gmail.com
2024-10-31 16:09:27 -04:00
Michael Paquier
baa1ae0429 injection_points: Improve comment about disabled isolation permutation
9f00edc228 has disabled a permutation due to failures in the CI for
FreeBSD environments, but this is a matter of timing.  Let's document
properly why this type of permutation is a bad idea if relying on a wait
done in a SQL function, so as this can be avoided when implementing new
tests (this spec is also a template).

Reviewed-by: Bertrand Drouvot
Discussion: https://postgr.es/m/ZyCa2qsopKaw3W3K@paquier.xyz
2024-10-31 08:28:20 +09:00
Tom Lane
af21152268 Stabilize jsonb_path_query test case.
An operation like '12:34:56'::time_tz takes the UTC offset from
the prevailing time zone, which means that the results change
across DST transitions.  One of the test cases added in ed055d249
failed to consider this.

Per report from Bernhard Wiedemann.  Back-patch to v17, as the
test case was.

Discussion: https://postgr.es/m/ba8e1bc0-8a99-45b7-8397-3f2e94415e03@suse.de
2024-10-30 11:42:34 -04:00
Álvaro Herrera
2d5fe51405
Fix some more bugs in foreign keys connecting partitioned tables
* In DetachPartitionFinalize() we were applying a tuple conversion map
  to tuples that didn't need one, which can lead to erratic behavior if
  a partitioned table has a partition with a different column order, as
  reported by Alexander Lakhin. This was introduced by 53af9491a0.
  Don't do that.  Also, modify a recently added test case to exercise
  this.

* The same function as well as CloneFkReferenced() were acquiring
  AccessShareLock on a partition, only to have CreateTrigger() later
  acquire ShareRowExclusiveLock on it.  This can lead to deadlock by
  lock escalation, unnecessarily.  Avoid that by acquiring the stronger
  lock to begin with.  This probably dates back to branch 12, but I have
  never seen a report of this being a problem in the field.

* Innocuous but wasteful: also introduced by 53af9491a0, we were
  reading a pg_constraint tuple from syscache that we don't need, as
  reported by Tender Wang.  Don't.

Backpatch to 15.

Discussion: https://postgr.es/m/461e9c26-2076-8224-e119-84998b6a784e@gmail.com
2024-10-30 10:54:03 +01:00
Amit Kapila
745217a051 Replicate generated columns when specified in the column list.
This commit allows logical replication to publish and replicate generated
columns when explicitly listed in the column list. We also ensured that
the generated columns were copied during the initial tablesync when they
were published.

We will allow to replicate generated columns even when they are not
specified in the column list (via a new publication option) in a separate
commit.

The motivation of this work is to allow replication for cases where the
client doesn't have generated columns. For example, the case where one is
trying to replicate data from Postgres to the non-Postgres database.

Author: Shubham Khanna, Vignesh C, Hou Zhijie
Reviewed-by: Peter Smith, Hayato Kuroda, Shlok Kyal, Amit Kapila
Discussion: https://postgr.es/m/B80D17B2-2C8E-4C7D-87F2-E5B4BE3C069E@gmail.com
2024-10-30 12:36:26 +05:30