Commit graph

8902 commits

Author SHA1 Message Date
Nathan Bossart
efdc7d7475 Add INT64_HEX_FORMAT and UINT64_HEX_FORMAT to c.h.
Like INT64_FORMAT and UINT64_FORMAT, these macros produce format
strings for 64-bit integers.  However, INT64_HEX_FORMAT and
UINT64_HEX_FORMAT generate the output in hexadecimal instead of
decimal.  Besides introducing these macros, this commit makes use
of them in several places.  This was originally intended to be part
of commit 5d6187d2a2, but I left it out because I felt there was a
nonzero chance that back-patching these new macros into c.h could
cause problems with third-party code.  We tend to be less cautious
with such changes in new major versions.

Note that UINT64_HEX_FORMAT was originally added in commit
ee1b30f128, but it was placed in test_radixtree.c, so it wasn't
widely available.  This commit moves UINT64_HEX_FORMAT to c.h.

Discussion: https://postgr.es/m/ZwQvtUbPKaaRQezd%40nathan
2024-11-22 12:41:57 -06:00
Michael Paquier
c06e71d1ac Add write_to_file to PgStat_KindInfo for pgstats kinds
This new field controls if entries of a stats kind should be written or
not to the on-disk pgstats file when shutting down an instance.  This
affects both fixed and variable-numbered kinds.

This is useful for custom statistics by itself, and a patch is under
discussion to add a new builtin stats kind where the write of the stats
is not necessary.  All the built-in stats kinds, as well as the two
custom stats kinds in the test module injection_points, set this flag to
"true" for now, so as stats entries are written to the on-disk pgstats
file.

Author: Bertrand Drouvot
Reviewed-by: Nazir Bilal Yavuz
Discussion: https://postgr.es/m/Zz7T47nHwYgeYwOe@ip-10-97-1-34.eu-west-3.compute.internal
2024-11-22 10:12:26 +09:00
Tom Lane
94131cd53c Avoid assertion failure if a setop leaf query contains setops.
Ordinarily transformSetOperationTree will collect all UNION/
INTERSECT/EXCEPT steps into the setOperations tree of the topmost
Query, so that leaf queries do not contain any setOperations.
However, it cannot thus flatten a subquery that also contains
WITH, ORDER BY, FOR UPDATE, or LIMIT.  I (tgl) forgot that in
commit 07b4c48b6 and wrote an assertion in rule deparsing that
a leaf's setOperations would always be empty.

If it were nonempty then we would want to parenthesize the subquery
to ensure that the output represents the setop nesting correctly
(e.g. UNION below INTERSECT had better get parenthesized).  So
rather than just removing the faulty Assert, let's change it into
an additional case to check to decide whether to add parens.  We
don't expect that the additional case will ever fire, but it's
cheap insurance.

Man Zeng and Tom Lane

Discussion: https://postgr.es/m/tencent_7ABF9B1F23B0C77606FC5FE3@qq.com
2024-11-20 12:03:47 -05:00
Noah Misch
7b88529f43 Fix per-session activation of ALTER {ROLE|DATABASE} SET role.
After commit 5a2fed911a, the catalog state
resulting from these commands ceased to affect sessions.  Restore the
longstanding behavior, which is like beginning the session with a SET
ROLE command.  If cherry-picking the CVE-2024-10978 fixes, default to
including this, too.  (This fixes an unintended side effect of fixing
CVE-2024-10978.)  Back-patch to v12, like that commit.  The release team
decided to include v12, despite the original intent to halt v12 commits
earlier this week.

Tom Lane and Noah Misch.  Reported by Etienne LAFARGE.

Discussion: https://postgr.es/m/CADOZwSb0UsEr4_UTFXC5k7=fyyK8uKXekucd+-uuGjJsGBfxgw@mail.gmail.com
2024-11-15 20:39:56 -08:00
Peter Eisentraut
9321d2fdf8 Fix collation handling for foreign keys
Allowing foreign keys where the referenced and the referencing columns
have collations with different notions of equality is problematic.
This can only happen when using nondeterministic collations, for
example, if the referencing column is case-insensitive and the
referenced column is not, or vice versa.  It does not happen if both
collations are deterministic.

To show one example:

    CREATE COLLATION case_insensitive (provider = icu, deterministic = false, locale = 'und-u-ks-level2');

    CREATE TABLE pktable (x text COLLATE "C" PRIMARY KEY);
    CREATE TABLE fktable (x text COLLATE case_insensitive REFERENCES pktable ON UPDATE CASCADE ON DELETE CASCADE);
    INSERT INTO pktable VALUES ('A'), ('a');
    INSERT INTO fktable VALUES ('A');

    BEGIN; DELETE FROM pktable WHERE x = 'a'; TABLE fktable; ROLLBACK;
    BEGIN; DELETE FROM pktable WHERE x = 'A'; TABLE fktable; ROLLBACK;

Both of these DELETE statements delete the one row from fktable.  So
this means that one row from fktable references two rows in pktable,
which should not happen.  (That's why a primary key or unique
constraint is required on pktable.)

When nondeterministic collations were implemented, the SQL standard
available to yours truly said that referential integrity checks should
be performed with the collation of the referenced column, and so
that's how we implemented it.  But this turned out to be a mistake in
the SQL standard, for the same reasons as above, that was later
(SQL:2016) fixed to require both collations to be the same.  So that's
what we are aiming for here.

We don't have to be quite so strict.  We can allow different
collations if they are both deterministic.  This is also good for
backward compatibility.

So the new rule is that the collations either have to be the same or
both deterministic.  Or in other words, if one of them is
nondeterministic, then both have to be the same.

Users upgrading from before that have affected setups will need to
make changes to their schemas (i.e., change one or both collations in
affected foreign-key relationships) before the upgrade will succeed.

Some of the nice test cases for the previous situation in
collate.icu.utf8.sql are now obsolete.  They are changed to just check
the error checking of the new rule.  Note that collate.sql already
contained a test for foreign keys with different deterministic
collations.

A bunch of code in ri_triggers.c that added a COLLATE clause to
enforce the referenced column's collation can be removed, because both
columns now have to have the same notion of equality, so it doesn't
matter which one to use.

Reported-by: Paul Jungwirth <pj@illuminatedcomputing.com>
Reviewed-by: Jian He <jian.universality@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/78d824e0-b21e-480d-a252-e4b84bc2c24b@illuminatedcomputing.com
2024-11-15 14:55:54 +01:00
Michael Paquier
818119afcc Fix race conditions with drop of reused pgstats entries
This fixes a set of race conditions with cumulative statistics where a
shared stats entry could be dropped while it should still be valid in
the event when it is reused: an entry may refer to a different object
but requires the same hash key.  This can happen with various stats
kinds, like:
- Replication slots that compute internally an index number, for
different slot names.
- Stats kinds that use an OID in the object key, where a wraparound
causes the same key to be used if an OID is used for the same object.
- As of PostgreSQL 18, custom pgstats kinds could also be an issue,
depending on their implementation.

This issue is fixed by introducing a counter called "generation" in the
shared entries via PgStatShared_HashEntry, initialized at 0 when an
entry is created and incremented when the same entry is reused, to avoid
concurrent issues on drop because of other backends still holding a
reference to it.  This "generation" is copied to the local copy that a
backend holds when looking at an object, then cross-checked with the
shared entry to make sure that the entry is not dropped even if its
"refcount" justifies that if it has been reused.

This problem could show up when a backend shuts down and needs to
discard any entries it still holds, causing statistics to be removed
when they should not, or even an assertion failure.  Another report
involved a failure in a standby after an OID wraparound, where the
startup process would FATAL on a "can only drop stats once", stopping
recovery abruptly.  The buildfarm has been sporadically complaining
about the problem, as well, but the window is hard to reach with the
in-core tests.

Note that the issue can be reproduced easily by adding a sleep before
dshash_find() in pgstat_release_entry_ref() to enlarge the problematic
window while repeating test_decoding's isolation test oldest_xmin a
couple of times, for example, as pointed out by Alexander Lakhin.

Reported-by: Alexander Lakhin, Peter Smith
Author: Kyotaro Horiguchi, Michael Paquier
Reviewed-by: Bertrand Drouvot
Discussion: https://postgr.es/m/CAA4eK1KxuMVyAryz_Vk5yq3ejgKYcL6F45Hj9ZnMNBS-g+PuZg@mail.gmail.com
Discussion: https://postgr.es/m/17947-b9554521ad963c9c@postgresql.org
Backpatch-through: 15
2024-11-15 11:31:58 +09:00
Heikki Linnakangas
18d67a8d7d Replace postmaster.c's own backend type codes with BackendType
Introduce a separate BackendType for dead-end children, so that we
don't need a separate dead_end flag.

Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/a102f15f-eac4-4ff2-af02-f9ff209ec66f@iki.fi
2024-11-14 16:06:16 +02:00
Michael Paquier
d74b590983 Fix comment in injection_point.c
InjectionPointEntry->name was described as a hash key, which was fine
when introduced in d86d20f0ba, but it is not now.

Oversight in 86db52a506, that has changed the way injection points are
stored in shared memory from a hash table to an array.

Backpatch-through: 17
2024-11-13 13:58:09 +09:00
Tom Lane
73c9f91a1b Parallel workers use AuthenticatedUserId for connection privilege checks.
Commit 5a2fed911 had an unexpected side-effect: the parallel worker
launched for the new test case would fail if it couldn't use a
superuser-reserved connection slot.  The reason that test failed
while all our pre-existing ones worked is that the connection
privilege tests in InitPostgres had been based on the superuserness
of the leader's AuthenticatedUserId, but after the rearrangements
of 5a2fed911 we were testing the superuserness of CurrentUserId,
which the new test case deliberately made to be a non-superuser.

This all seems very accidental and probably not the behavior we really
want, but a security patch is no time to be redesigning things.
Pending some discussion about desirable semantics, hack it so that
InitPostgres continues to pay attention to the superuserness of
AuthenticatedUserId when starting a parallel worker.

Nathan Bossart and Tom Lane, per buildfarm member sawshark.

Security: CVE-2024-10978
2024-11-11 17:05:53 -05:00
Tom Lane
5a2fed911a Fix improper interactions between session_authorization and role.
The SQL spec mandates that SET SESSION AUTHORIZATION implies
SET ROLE NONE.  We tried to implement that within the lowest-level
functions that manipulate these settings, but that was a bad idea.
In particular, guc.c assumes that it doesn't matter in what order
it applies GUC variable updates, but that was not the case for these
two variables.  This problem, compounded by some hackish attempts to
work around it, led to some security-grade issues:

* Rolling back a transaction that had done SET SESSION AUTHORIZATION
would revert to SET ROLE NONE, even if that had not been the previous
state, so that the effective user ID might now be different from what
it had been.

* The same for SET SESSION AUTHORIZATION in a function SET clause.

* If a parallel worker inspected current_setting('role'), it saw
"none" even when it should see something else.

Also, although the parallel worker startup code intended to cope
with the current role's pg_authid row having disappeared, its
implementation of that was incomplete so it would still fail.

Fix by fully separating the miscinit.c functions that assign
session_authorization from those that assign role.  To implement the
spec's requirement, teach set_config_option itself to perform "SET
ROLE NONE" when it sets session_authorization.  (This is undoubtedly
ugly, but the alternatives seem worse.  In particular, there's no way
to do it within assign_session_authorization without incompatible
changes in the API for GUC assign hooks.)  Also, improve
ParallelWorkerMain to directly set all the relevant user-ID variables
instead of relying on some of them to get set indirectly.  That
allows us to survive not finding the pg_authid row during worker
startup.

In v16 and earlier, this includes back-patching 9987a7bf3 which
fixed a violation of GUC coding rules: SetSessionAuthorization
is not an appropriate place to be throwing errors from.

Security: CVE-2024-10978
2024-11-11 10:29:54 -05:00
Michael Paquier
e7a9496de9 Add two attributes to pg_stat_database for parallel workers activity
Two attributes are added to pg_stat_database:
* parallel_workers_to_launch, counting the total number of parallel
workers that were planned to be launched.
* parallel_workers_launched, counting the total number of parallel
workers actually launched.

The ratio of both fields can provide hints that there are not enough
slots available when launching parallel workers, also useful when
pg_stat_statements is not deployed on an instance (i.e. cf54a2c002).

This commit relies on de3a2ea3b2, that has added two fields to EState,
that get incremented when executing Gather or GatherMerge nodes.

A test is added in select_parallel, where parallel workers are spawned.

Bump catalog version.

Author: Benoit Lobréau
Discussion: https://postgr.es/m/783bc7f7-659a-42fa-99dd-ee0565644e25@dalibo.com
2024-11-11 10:40:48 +09:00
Álvaro Herrera
14e87ffa5c
Add pg_constraint rows for not-null constraints
We now create contype='n' pg_constraint rows for not-null constraints on
user tables.  Only one such constraint is allowed for a column.

We propagate these constraints to other tables during operations such as
adding inheritance relationships, creating and attaching partitions and
creating tables LIKE other tables.  These related constraints mostly
follow the well-known rules of conislocal and coninhcount that we have
for CHECK constraints, with some adaptations: for example, as opposed to
CHECK constraints, we don't match not-null ones by name when descending
a hierarchy to alter or remove it, instead matching by the name of the
column that they apply to.  This means we don't require the constraint
names to be identical across a hierarchy.

The inheritance status of these constraints can be controlled: now we
can be sure that if a parent table has one, then all children will have
it as well.  They can optionally be marked NO INHERIT, and then children
are free not to have one.  (There's currently no support for altering a
NO INHERIT constraint into inheriting down the hierarchy, but that's a
desirable future feature.)

This also opens the door for having these constraints be marked NOT
VALID, as well as allowing UNIQUE+NOT NULL to be used for functional
dependency determination, as envisioned by commit e49ae8d3bc.  It's
likely possible to allow DEFERRABLE constraints as followup work, as
well.

psql shows these constraints in \d+, though we may want to reconsider if
this turns out to be too noisy.  Earlier versions of this patch hid
constraints that were on the same columns of the primary key, but I'm
not sure that that's very useful.  If clutter is a problem, we might be
better off inventing a new \d++ command and not showing the constraints
in \d+.

For now, we omit these constraints on system catalog columns, because
they're unlikely to achieve anything.

The main difference to the previous attempt at this (b0e96f3119) is
that we now require that such a constraint always exists when a primary
key is in the column; we didn't require this previously which had a
number of unpalatable consequences.  With this requirement, the code is
easier to reason about.  For example:

- We no longer have "throwaway constraints" during pg_dump.  We needed
  those for the case where a table had a PK without a not-null
  underneath, to prevent a slow scan of the data during restore of the
  PK creation, which was particularly problematic for pg_upgrade.

- We no longer have to cope with attnotnull being set spuriously in
  case a primary key is dropped indirectly (e.g., via DROP COLUMN).

Some bits of code in this patch were authored by Jian He.

Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Author: Bernd Helmle <mailings@oopsware.de>
Reviewed-by: 何建 (jian he) <jian.universality@gmail.com>
Reviewed-by: 王刚 (Tender Wang) <tndrwang@gmail.com>
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Discussion: https://postgr.es/m/202408310358.sdhumtyuy2ht@alvherre.pgsql
2024-11-08 13:28:48 +01:00
Michael Paquier
7d85d87f4d Clear padding of PgStat_HashKey when handling pgstats entries
PgStat_HashKey is currently initialized in a way that could result in
random data if the structure has any padding bytes.  The structure
has no padding bytes currently, fortunately, but it could become a
problem should the structure change at some point in the future.

The code is changed to use some memset(0) so as any padding would be
handled properly, as it would be surprising to see random failures in
the pgstats entry lookups.  PgStat_HashKey is a structure internal to
pgstats, and an ABI change could be possible in the scope of a bug fix,
so backpatch down to 15 where this has been introduced.

Author: Bertrand Drouvot
Reviewed-by: Jelte Fennema-Nio, Michael Paquier
Discussion: https://postgr.es/m/Zyb7RW1y9dVfO0UH@ip-10-97-1-34.eu-west-3.compute.internal
Backpatch-through: 15
2024-11-05 09:39:43 +09:00
Alexander Korotkov
3a7ae6b3d9 Revert pg_wal_replay_wait() stored procedure
This commit reverts 3c5db1d6b0, and subsequent improvements and fixes
including 8036d73ae3, 867d396ccd, 3ac3ec580c, 0868d7ae70, 85b98b8d5a,
2520226c95, 014f9f34d2, e658038772, e1555645d7, 5035172e4a, 6cfebfe88b,
73da6b8d1b, and e546989a26.

The reason for reverting is a set of remaining issues.  Most notably, the
stored procedure appears to need more effort than the utility statement
to turn the backend into a "snapshot-less" state.  This makes an approach
to use stored procedures questionable.

Catversion is bumped.

Discussion: https://postgr.es/m/Zyhj2anOPRKtb0xW%40paquier.xyz
2024-11-04 22:47:57 +02:00
Noah Misch
0bada39c83 Fix inplace update buffer self-deadlock.
A CacheInvalidateHeapTuple* callee might call
CatalogCacheInitializeCache(), which needs a relcache entry.  Acquiring
a valid relcache entry might scan pg_class.  Hence, to prevent
undetected LWLock self-deadlock, CacheInvalidateHeapTuple* callers must
not hold BUFFER_LOCK_EXCLUSIVE on buffers of pg_class.  Move the
CacheInvalidateHeapTupleInplace() before the BUFFER_LOCK_EXCLUSIVE.  No
back-patch, since I've reverted commit
243e9b40f1 from non-master branches.

Reported by Alexander Lakhin.  Reviewed by Alexander Lakhin.

Discussion: https://postgr.es/m/10ec0bc3-5933-1189-6bb8-5dec4114558e@gmail.com
2024-11-02 09:04:56 -07:00
Michael Paquier
07e9e28b56 Add pg_memory_is_all_zeros() in memutils.h
This new function tests if a memory region starting at a given location
for a defined length is made only of zeroes.  This unifies in a single
path the all-zero checks that were happening in a couple of places of
the backend code:
- For pgstats entries of relation, checkpointer and bgwriter, where
some "all_zeroes" variables were previously used with memcpy().
- For all-zero buffer pages in PageIsVerifiedExtended().

This new function uses the same forward scan as the check for all-zero
buffer pages, applying it to the three pgstats paths mentioned above.

Author: Bertrand Drouvot
Reviewed-by: Peter Eisentraut, Heikki Linnakangas, Peter Smith
Discussion: https://postgr.es/m/ZupUDDyf1hHI4ibn@ip-10-97-1-34.eu-west-3.compute.internal
2024-11-01 11:35:46 +09:00
Michael Paquier
49d6c7d8da Add SQL function array_reverse()
This function takes in input an array, and reverses the position of all
its elements.  This operation only affects the first dimension of the
array, like array_shuffle().

The implementation structure is inspired by array_shuffle(), with a
subroutine called array_reverse_n() that may come in handy in the
future, should more functions able to reverse portions of arrays be
introduced.

Bump catalog version.

Author: Aleksander Alekseev
Reviewed-by: Ashutosh Bapat, Tom Lane, Vladlen Popolitov
Discussion: https://postgr.es/m/CAJ7c6TMpeO_ke+QGOaAx9xdJuxa7r=49-anMh3G5476e3CX1CA@mail.gmail.com
2024-11-01 10:32:19 +09:00
Heikki Linnakangas
b82c877e76 Fix refreshing physical relfilenumber on shared index
Buildfarm member 'prion', which is configured with
-DRELCACHE_FORCE_RELEASE -DCATCACHE_FORCE_RELEASE, failed with errors
like this:

    ERROR:  could not read blocks 0..0 in file "global/2672": read only 0 of 8192 bytes

while running a parallel test group that includes VACUUM FULL on some
catalog tables among other things. I was not able to reproduce that
just by running the tests with -DRELCACHE_FORCE_RELEASE
-DCATCACHE_FORCE_RELEASE, even though 'prion' hit it on first run
after commit 2b9b8ebbf8, so there might be something else that makes
it more susceptible to the race. However, I was able to reproduce it
by adding another test to the same test group that runs "vacuum full
pg_database" repeatedly.

The problem is that RelationReloadIndexInfo() no longer calls
RelationInitPhysicalAddr() on a nailed, shared index, when an
invalidation happens early during backend startup, before the critical
relcaches have been built. Before commit 2b9b8ebbf8, that was done by
RelationReloadNailed(), but it went missing from that path. Add it
back as an explicit step.

Broken by commit 2b9b8ebbf8, which refactored these functions.

Discussion: https://www.postgresql.org/message-id/db876575-8f5b-4193-a538-df7e1f92d47a%40iki.fi
2024-10-31 18:24:48 +02:00
Heikki Linnakangas
2b9b8ebbf8 Split RelationClearRelation into three different functions
The old RelationClearRelation function did different things depending
on the arguments and circumstances. It could:

a) remove the relation completely from relcache (rebuild == false),
b) mark the entry as invalid (rebuild == true, but not in xact), or
c) rebuild the entry (rebuild == true).

Different callers used it for different purposes, and often assumed a
particular behavior, which was confusing. Split it into three
different functions, one for each of the above actions (one of them,
RelationInvalidateRelation, was already added in commit e6cd857726).
Move the responsibility of choosing the action and calling the right
function to the callers.

Reviewed-by: jian he <jian.universality@gmail.com>
Discussion: https://www.postgresql.org/message-id/9c9e8908-7b3e-4ce7-85a8-00c0e165a3d6%40iki.fi
2024-10-31 10:09:40 +02:00
Heikki Linnakangas
8e2e266221 Simplify call to rebuild relcache entry for indexes
RelationClearRelation(rebuild == true) calls RelationReloadIndexInfo()
for indexes. We can rely on that in RelationIdGetRelation(), instead
of calling RelationReloadIndexInfo() directly. That simplifies the
code a little.

In the passing, add a comment in RelationBuildLocalRelation()
explaining why it doesn't call RelationInitIndexAccessInfo(). It's
because at index creation, it's called before the pg_index row has
been created. That's also the reason that RelationClearRelation()
still needs a special case to go through the full-blown rebuild if the
index support information in the relcache entry hasn't been populated
yet.

Reviewed-by: jian he <jian.universality@gmail.com>
Discussion: https://www.postgresql.org/message-id/9c9e8908-7b3e-4ce7-85a8-00c0e165a3d6%40iki.fi
2024-10-31 10:02:58 +02:00
Peter Eisentraut
e18512c000 Remove unused #include's from backend .c files
as determined by IWYU

These are mostly issues that are new since commit dbbca2cf29.

Discussion: https://www.postgresql.org/message-id/flat/0df1d5b1-8ca8-4f84-93be-121081bde049%40eisentraut.org
2024-10-27 08:26:50 +01:00
Jeff Davis
3aa2373c11 Refactor the code to create a pg_locale_t into new function.
Reviewed-by: Andreas Karlsson
Discussion: https://postgr.es/m/59da7ee4-5e1a-4727-b464-a603c6ed84cd@proxel.se
2024-10-25 16:31:08 -07:00
Noah Misch
243e9b40f1 For inplace update, send nontransactional invalidations.
The inplace update survives ROLLBACK.  The inval didn't, so another
backend's DDL could then update the row without incorporating the
inplace update.  In the test this fixes, a mix of CREATE INDEX and ALTER
TABLE resulted in a table with an index, yet relhasindex=f.  That is a
source of index corruption.  Back-patch to v12 (all supported versions).
The back branch versions don't change WAL, because those branches just
added end-of-recovery SIResetAll().  All branches change the ABI of
extern function PrepareToInvalidateCacheTuple().  No PGXN extension
calls that, and there's no apparent use case in extensions.

Reviewed by Nitin Motiani and (in earlier versions) Andres Freund.

Discussion: https://postgr.es/m/20240523000548.58.nmisch@google.com
2024-10-25 06:51:02 -07:00
Daniel Gustafsson
45188c2ea2 Support configuring TLSv1.3 cipher suites
The ssl_ciphers GUC can only set cipher suites for TLSv1.2, and lower,
connections. For TLSv1.3 connections a different OpenSSL API must be
used.  This adds a new GUC, ssl_tls13_ciphers, which can be used to
configure a colon separated list of cipher suites to support when
performing a TLSv1.3 handshake.

Original patch by Erica Zhang with additional hacking by me.

Author: Erica Zhang <ericazhangy2021@qq.com>
Author: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl>
Discussion: https://postgr.es/m/tencent_063F89FA72CCF2E48A0DF5338841988E9809@qq.com
2024-10-24 15:20:32 +02:00
Daniel Gustafsson
3d1ef3a15c Support configuring multiple ECDH curves
The ssl_ecdh_curve GUC only accepts a single value, but the TLS
handshake can list multiple curves in the groups extension (the
extension has been renamed to contain more than elliptic curves).
This changes the GUC to accept a colon-separated list of curves.
This commit also renames the GUC to ssl_groups to match the new
nomenclature for the TLS extension.

Original patch by Erica Zhang with additional hacking by me.

Author: Erica Zhang <ericazhangy2021@qq.com>
Author: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl>
Discussion: https://postgr.es/m/tencent_063F89FA72CCF2E48A0DF5338841988E9809@qq.com
2024-10-24 15:20:28 +02:00
Alexander Korotkov
b85a9d046e Avoid looping over all type cache entries in TypeCacheRelCallback()
Currently, when a single relcache entry gets invalidated,
TypeCacheRelCallback() has to loop over all type cache entries to find
appropriate typentry to invalidate.  Unfortunately, using the syscache here
is impossible, because this callback could be called outside a transaction
and this makes impossible catalog lookups.  This is why present commit
introduces RelIdToTypeIdCacheHash to map relation OID to its composite type
OID.

We are keeping RelIdToTypeIdCacheHash entry while corresponding type cache
entry have something to clean.  Therefore, RelIdToTypeIdCacheHash shouldn't
get bloat in the case of temporary tables flood.

There are many places in lookup_type_cache() where syscache invalidation,
user interruption, or even error could occur.  In order to handle this, we
keep an array of in-progress type cache entries.  In the case of
lookup_type_cache() interruption this array is processed to keep
RelIdToTypeIdCacheHash in a consistent state.

Discussion: https://postgr.es/m/5812a6e5-68ae-4d84-9d85-b443176966a1%40sigaev.ru
Author: Teodor Sigaev
Reviewed-by: Aleksander Alekseev, Tom Lane, Michael Paquier, Roman Zharkov
Reviewed-by: Andrei Lepikhov, Pavel Borisov, Jian He, Alexander Lakhin
Reviewed-by: Artur Zakirov
2024-10-24 14:35:52 +03:00
Alexander Korotkov
c1500a1ba7 Update header comment for lookup_type_cache()
Describe the way we handle concurrent invalidation messages.

Discussion: https://postgr.es/m/CAPpHfdsQhwUrnB3of862j9RgHoJM--eRbifvBMvtQxpC57dxCA%40mail.gmail.com
Reviewed-by: Andrei Lepikhov, Artur Zakirov, Pavel Borisov
2024-10-24 14:34:16 +03:00
Jeff Davis
eecd9138a0 Improve ThrowErrorData() comments for use with soft errors.
Reviewed-by: Corey Huinker
Discussion: https://postgr.es/m/901ab7cf01957f92ea8b30b6feeb0eacfb7505fc.camel@j-davis.com
2024-10-17 14:56:44 -07:00
Masahiko Sawada
7cdfeee320 Add contrib/pg_logicalinspect.
This module provides SQL functions that allow to inspect logical
decoding components.

It currently allows to inspect the contents of serialized logical
snapshots of a running database cluster, which is useful for debugging
or educational purposes.

Author: Bertrand Drouvot
Reviewed-by: Amit Kapila, Shveta Malik, Peter Smith, Peter Eisentraut
Reviewed-by: David G. Johnston
Discussion: https://postgr.es/m/ZscuZ92uGh3wm4tW%40ip-10-97-1-34.eu-west-3.compute.internal
2024-10-14 17:22:02 -07:00
Jeff Davis
66ac94cdc7 Move libc-specific code from pg_locale.c into pg_locale_libc.c.
Move implementation of pg_locale_t code for libc collations into
pg_locale_libc.c. Other locale-related code, such as
pg_perm_setlocale(), remains in pg_locale.c for now.

Discussion: https://postgr.es/m/flat/2830211e1b6e6a2e26d845780b03e125281ea17b.camel@j-davis.com
2024-10-14 12:48:43 -07:00
Jeff Davis
f244a2bb4c Move ICU-specific code from pg_locale.c into pg_locale_icu.c.
Discussion: https://postgr.es/m/flat/2830211e1b6e6a2e26d845780b03e125281ea17b.camel@j-davis.com
2024-10-14 12:13:26 -07:00
Masahiko Sawada
4681ad4b2f Use construct_array_builtin for FLOAT8OID instead of construct_array.
Commit d746021de1 introduced construct_array_builtin() for built-in
data types, but forgot some replacements linked to FLOAT8OID.

Author: Bertrand Drouvot
Reviewed-by: Peter Eisentraut
Discussion: https://postgr.es/m/CAD21AoCERkwmttY44dqUw%3Dm_9QCctu7W%2Bp6B7w_VqxRJA1Qq_Q%40mail.gmail.com
2024-10-14 09:49:29 -07:00
Peter Eisentraut
0d2aa4d493 Track sort direction in SortGroupClause
Functions make_pathkey_from_sortop() and transformWindowDefinitions(),
which receive a SortGroupClause, were determining the sort order
(ascending vs. descending) by comparing that structure's operator
strategy to BTLessStrategyNumber, but could just as easily have gotten
it from the SortGroupClause object, if it had such a field, so add
one.  This reduces the number of places that hardcode the assumption
that the strategy refers specifically to a btree strategy, rather than
some other index AM's operators.

Author: Mark Dilger <mark.dilger@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/E72EAA49-354D-4C2E-8EB9-255197F55330@enterprisedb.com
2024-10-14 15:36:02 +02:00
Michael Paquier
c0b74323dc Use MAX_PARALLEL_WORKER_LIMIT for max_parallel_maintenance_workers
max_parallel_maintenance_workers has been introduced in 9da0cc3528,
and used a hardcoded limit of 1024 rather than this variable.

max_parallel_workers and max_parallel_workers_per_gather already used
MAX_PARALLEL_WORKER_LIMIT (1024) as their upper-bound since
6599c9ac33.

Author: Matthias van de Meent
Reviewed-by: Zhang Mingli
Discussion: https://postgr.es/m/CAEze2WiCiJD+8Wig_wGPyn4vgdPjbnYXy2Rw+9KYi6izTMuP=w@mail.gmail.com
2024-10-13 11:20:30 +09:00
Jeff Davis
98c5b191e7 Fix missed case for builtin collation provider.
A missed check for the builtin collation provider could result in
falling through to call isalpha().

This does not appear to have practical consequences because it only
happens for characters in the ASCII range. Regardless, the builtin
provider should not be calling libc functions, so backpatch.

Discussion: https://postgr.es/m/1bd5a0a5192f82c22ee7527e825b18ab0028b2c7.camel@j-davis.com
Backpatch-through: 17
2024-10-11 16:59:29 -07:00
Nathan Bossart
4e1fad3787 Add pg_ls_summariesdir().
This function returns the name, size, and last modification time of
each regular file in pg_wal/summaries.  This allows administrators
to grant privileges to view the contents of this directory without
granting privileges on pg_ls_dir(), which allows listing the
contents of many other directories.  This commit also gives the
pg_monitor predefined role EXECUTE privileges on the new
pg_ls_summariesdir() function.

Bumps catversion.

Author: Yushi Ogiwara
Reviewed-by: Michael Paquier, Fujii Masao
Discussion: https://postgr.es/m/a0a3af15a9b9daa107739eb45aa9a9bc%40oss.nttdata.com
2024-10-11 11:02:09 -05:00
Tom Lane
5a4416192d Avoid crash in estimate_array_length with null root pointer.
Commit 9391f7152 added a "PlannerInfo *root" parameter to
estimate_array_length, but failed to consider the possibility that
NULL would be passed for that, leading to a null pointer dereference.

We could rectify the particular case shown in the bug report by fixing
simplify_function/inline_function to pass through the root pointer.
However, as long as eval_const_expressions is documented to accept
NULL for root, similar hazards would remain.  For now, let's just do
the narrow fix of hardening estimate_array_length to not crash.
Its behavior with NULL root will be the same as it was before
9391f7152, so this is not too awful.

Per report from Fredrik Widlert (via Paul Ramsey).  Back-patch to v17
where 9391f7152 came in.

Discussion: https://postgr.es/m/518339E7-173E-45EC-A0FF-9A4A62AA4F40@cleverelephant.ca
2024-10-09 17:07:53 -04:00
Michael Paquier
f3f06b1330 Apply GUC name from central table in more places of guc.c
The name extracted from the record of the GUC tables is applied to more
internal places of guc.c.  This change has the advantage to simplify
parse_and_validate_value(), where the "name" was only used in elog
messages, while it was required to match with the name from the GUC
record.

pg_parameter_aclcheck() now passes the name of the GUC from its record
in two places rather than the caller's argument.  The value given to
this function goes through convert_GUC_name_for_parameter_acl() that
does a simple ASCII downcasing.

Few GUCs mix character casing in core; one test is added for one of
these code paths with "IntervalStyle".

Author: Peter Smith, Michael Paquier
Discussion: https://postgr.es/m/ZwNh4vkc2NHJHnND@paquier.xyz
2024-10-09 18:47:34 +09:00
Tom Lane
2d24fd942c Add min and max aggregates for bytea type.
Similar to a0f1fce80, although we chose to duplicate logic
rather than invoke byteacmp, primarily to avoid repeat detoasting.

Marat Buharov, Aleksander Alekseev

Discussion: https://postgr.es/m/CAPCEVGXiASjodos4P8pgyV7ixfVn-ZgG9YyiRZRbVqbGmfuDyg@mail.gmail.com
2024-10-08 13:52:14 -04:00
Nathan Bossart
5d6187d2a2 Fix Y2038 issues with MyStartTime.
Several places treat MyStartTime as a "long", which is only 32 bits
wide on some platforms.  In reality, MyStartTime is a pg_time_t,
i.e., a signed 64-bit integer.  This will lead to interesting bugs
on the aforementioned systems in 2038 when signed 32-bit integers
are no longer sufficient to store Unix time (e.g., "pg_ctl start"
hanging).  To fix, ensure that MyStartTime is handled as a 64-bit
value everywhere.  (Of course, users will need to ensure that
time_t is 64 bits wide on their system, too.)

Co-authored-by: Max Johnson
Discussion: https://postgr.es/m/CO1PR07MB905262E8AC270FAAACED66008D682%40CO1PR07MB9052.namprd07.prod.outlook.com
Backpatch-through: 12
2024-10-07 13:51:03 -05:00
Michael Paquier
2e7c4abe5a Use camel case for "DateStyle" in some error messages
This GUC is written as camel-case in most of the documentation and the
GUC table (but not postgresql.conf.sample), and two error messages
hardcoded it with lower case characters.  Let's use a style more
consistent.

Most of the noise comes from the regression tests, updated to reflect
the GUC name in these error messages.

Author: Peter Smith
Reviewed-by: Peter Eisentraut, Álvaro Herrera
Discussion: https://postgr.es/m/CAHut+Pv-kSN8SkxSdoHano_wPubqcg5789ejhCDZAcLFceBR-w@mail.gmail.com
2024-10-07 12:36:00 +09:00
Tom Lane
f8d9a9f21e Ignore not-yet-defined Portals in pg_cursors view.
pg_cursor() supposed that any Portal it finds in the hash table must
have sourceText set up, but there's an edge case where that is not so.
A newly-created Portal has sourceText = NULL, and that doesn't change
until PortalDefineQuery is called.  In SPI_cursor_open_internal,
we perform GetCachedPlan between CreatePortal and PortalDefineQuery,
and it's possible for user-defined code to execute during that
planning and cause a fetch from the pg_cursors view, resulting in a
null-pointer-dereference crash.  (It looks like the same could happen
in exec_bind_message, but I've not tried to provoke a failure there.)

I considered trying to fix this by setting sourceText sooner, but
there may be instances of this same calling pattern in extensions,
and we couldn't be sure they'd get the memo promptly.  It seems
better to redefine pg_cursor as not showing Portals that have
not yet had PortalDefineQuery called on them, which we can do by
just skipping them if sourceText is still NULL.

(Before a1c692358, pg_cursor would instead return a row with NULL
in the statement column.  We could revert to that behavior but it
doesn't really seem like a better definition, especially since our
documentation doesn't suggest that the column could be NULL.)

Per report from PetSerAl.  Back-patch to all supported branches.

Discussion: https://postgr.es/m/CAKygsHTBXLXjwV43kpZa+Cs+XTiaeeJiZdL4cPBm9f4MTdw7wg@mail.gmail.com
2024-10-06 16:03:48 -04:00
Thomas Munro
adbb27ac89 Reject non-ASCII locale names.
Commit bf03cfd1 started scanning all available BCP 47 locale names on
Windows.  This caused an abort/crash in the Windows runtime library if
the default locale name contained non-ASCII characters, because of our
use of the setlocale() save/restore pattern with "char" strings.  After
switching to another locale with a different encoding, the saved name
could no longer be understood, and setlocale() would abort.

"Turkish_Türkiye.1254" is the example from recent reports, but there are
other examples of countries and languages with non-ASCII characters in
their names, and they appear in Windows' (old style) locale names.

To defend against this:

1.  In initdb, reject non-ASCII locale names given explicity on the
command line, or returned by the operating system environment with
setlocale(..., ""), or "canonicalized" by the operating system when we
set it.

2.  In initdb only, perform the save-and-restore with Windows'
non-standard wchar_t variant of setlocale(), so that it is not subject
to round trip failures stemming from char string encoding confusion.

3.  In the backend, we don't have to worry about the save-and-restore
problem because we have already vetted the defaults, so we just have to
make sure that CREATE DATABASE also rejects non-ASCII names in any new
databases.  SET lc_XXX doesn't suffer from the problem, but the ban
applies to it too because it uses check_locale().  CREATE COLLATION
doesn't suffer from the problem either, but it doesn't use
check_locale() so it is not included in the new ban for now, to minimize
the change.

Anyone who encounters the new error message should either create a new
duplicated locale with an ASCII-only name using Windows Locale Builder,
or consider using BCP 47 names like "tr-TR".  Users already couldn't
initialize a cluster with "Turkish_Türkiye.1254" on PostgreSQL 16+, but
the new failure mode is an error message that explains why, instead of a
crash.

Back-patch to 16, where bf03cfd1 landed.  Older versions are affected
in theory too, but only 16 and later are causing crash reports.

Reviewed-by: Andrew Dunstan <andrew@dunslane.net> (the idea, not the patch)
Reported-by: Haifang Wang (Centific Technologies Inc) <v-haiwang@microsoft.com>
Discussion: https://postgr.es/m/PH8PR21MB3902F334A3174C54058F792CE5182%40PH8PR21MB3902.namprd21.prod.outlook.com
2024-10-05 13:50:02 +13:00
Dean Rasheed
9428c001f6 Speed up numeric division by always using the "fast" algorithm.
Formerly there were two internal functions in numeric.c to perform
numeric division, div_var() and div_var_fast(). div_var() performed
division exactly to a specified rscale using Knuth's long division
algorithm, while div_var_fast() used the algorithm from the "FM"
library, which approximates each quotient digit using floating-point
arithmetic, and computes a truncated quotient with DIV_GUARD_DIGITS
extra digits. div_var_fast() could be many times faster than
div_var(), but did not guarantee correct results in all cases, and was
therefore only suitable for use in transcendental functions, where
small errors are acceptable.

This commit merges div_var() and div_var_fast() together into a single
function with an extra "exact" boolean parameter, which can be set to
false if the caller is OK with an approximate result. The new function
uses the faster algorithm from the "FM" library, except that when
"exact" is true, it does not truncate the computation with
DIV_GUARD_DIGITS extra digits, but instead performs the full-precision
computation, subtracting off complete multiples of the divisor for
each quotient digit. However, it is able to retain most of the
performance benefits of div_var_fast(), by delaying the propagation of
carries, allowing the inner loop to be auto-vectorized.

Since this may still lead to an inaccurate result, when "exact" is
true, it then inspects the remainder and uses that to adjust the
quotient, if necessary, to make it correct. In practice, the quotient
rarely needs to be adjusted, and never by more than one in the final
digit, though it's difficult to prove that, so the code allows for
larger adjustments, just in case.

In addition, use base-NBASE^2 arithmetic and a 64-bit dividend array,
similar to mul_var(), so that the number of iterations of the outer
loop is roughly halved. Together with the faster algorithm, this makes
div_var() up to around 20 times as fast as the old Knuth algorithm
when "exact" is true, and up to 2 or 3 times as fast as the old
div_var_fast() function when "exact" is false.

Dean Rasheed, reviewed by Joel Jacobson.

Discussion: https://postgr.es/m/CAEZATCVHR10BPDJSANh0u2+Sg6atO3mD0G+CjKDNRMD-C8hKzQ@mail.gmail.com
2024-10-04 09:49:24 +01:00
Fujii Masao
17cc5f666f Fix inconsistent reporting of checkpointer stats.
Previously, the pg_stat_checkpointer view and the checkpoint completion
log message could show different numbers for buffers written
during checkpoints. The view only counted shared buffers,
while the log message included both shared and SLRU buffers,
causing inconsistencies.

This commit resolves the issue by updating both the view and the log message
to separately report shared and SLRU buffers written during checkpoints.
A new slru_written column is added to the pg_stat_checkpointer view
to track SLRU buffers, while the existing buffers_written column now
tracks only shared buffers. This change would help users distinguish
between the two types of buffers, in the pg_stat_checkpointer view and
the checkpoint complete log message, respectively.

Bump catalog version.

Author: Nitin Jadhav
Reviewed-by: Bharath Rupireddy, Michael Paquier, Kyotaro Horiguchi, Robert Haas
Reviewed-by: Andres Freund, vignesh C, Fujii Masao
Discussion: https://postgr.es/m/CAMm1aWb18EpT0whJrjG+-nyhNouXET6ZUw0pNYYAe+NezpvsAA@mail.gmail.com
2024-10-02 11:17:47 +09:00
Fujii Masao
559efce1d6 Add num_done counter to the pg_stat_checkpointer view.
Checkpoints can be skipped when the server is idle. The existing num_timed and
num_requested counters in pg_stat_checkpointer track both completed and
skipped checkpoints, but there was no way to count only the completed ones.

This commit introduces the num_done counter, which tracks only completed
checkpoints, making it easier to see how many were actually performed.

Bump catalog version.

Author: Anton A. Melnikov
Reviewed-by: Fujii Masao
Discussion: https://postgr.es/m/9ea77f40-818d-4841-9dee-158ac8f6e690@oss.nttdata.com
2024-09-30 11:56:05 +09:00
Tom Lane
147bbc90f7 Modernize to_char's Roman-numeral code, fixing overflow problems.
int_to_roman() only accepts plain "int" input, which is fine since
we're going to produce '###############' for any value above 3999
anyway.  However, the numeric and int8 variants of to_char() would
throw an error if the given input exceeded the integer range, while
the float-input variants invoked undefined-per-C-standard behavior.
Fix things so that you uniformly get '###############' for out of
range input.

Also add test cases covering this code, plus the equally-untested
EEEE, V, and PL format codes.

Discussion: https://postgr.es/m/2956175.1725831136@sss.pgh.pa.us
2024-09-26 11:02:31 -04:00
Noah Misch
aac2c9b4fd For inplace update durability, make heap_update() callers wait.
The previous commit fixed some ways of losing an inplace update.  It
remained possible to lose one when a backend working toward a
heap_update() copied a tuple into memory just before inplace update of
that tuple.  In catalogs eligible for inplace update, use LOCKTAG_TUPLE
to govern admission to the steps of copying an old tuple, modifying it,
and issuing heap_update().  This includes MERGE commands.  To avoid
changing most of the pg_class DDL, don't require LOCKTAG_TUPLE when
holding a relation lock sufficient to exclude inplace updaters.
Back-patch to v12 (all supported versions).  In v13 and v12, "UPDATE
pg_class" or "UPDATE pg_database" can still lose an inplace update.  The
v14+ UPDATE fix needs commit 86dc90056d,
and it wasn't worth reimplementing that fix without such infrastructure.

Reviewed by Nitin Motiani and (in earlier versions) Heikki Linnakangas.

Discussion: https://postgr.es/m/20231027214946.79.nmisch@google.com
2024-09-24 15:25:18 -07:00
Jeff Davis
ac30021356 Allow length=-1 for NUL-terminated input to pg_strncoll(), etc.
Like ICU, allow a length of -1 to be specified for NUL-terminated
arguments to pg_strncoll(), pg_strnxfrm(), and pg_strnxfrm_prefix().

Simplifies the code and comments.

Discussion: https://postgr.es/m/2d758e07dff26bcc7cbe2aec57431329bfe3679a.camel@j-davis.com
2024-09-24 15:15:18 -07:00
Jeff Davis
ceeaaed87a Tighten up make_libc_collator() and make_icu_collator().
Ensure that error paths within these functions do not leak a collator,
and return the result rather than using an out parameter. (Error paths
in the caller may still result in a leaked collator, which will be
addressed separately.)

In make_libc_collator(), if the first newlocale() succeeds and the
second one fails, close the first locale_t object.

The function make_icu_collator() doesn't have any external callers, so
change it to be static.

Discussion: https://postgr.es/m/54d20e812bd6c3e44c10eddcd757ec494ebf1803.camel@j-davis.com
2024-09-24 12:01:45 -07:00