Commit graph

11659 commits

Author SHA1 Message Date
Heikki Linnakangas
96e61b2792 Fix catcache invalidation of a list entry that's being built
If a new catalog tuple is inserted that belongs to a catcache list
entry, and cache invalidation happens while the list entry is being
built, the list entry might miss the newly inserted tuple.

To fix, change the way we detect concurrent invalidations while a
catcache entry is being built. Keep a stack of entries that are being
built, and apply cache invalidation to those entries in addition to
the real catcache entries. This is similar to the in-progress list in
relcache.c.

Back-patch to all supported versions.

Reviewed-by: Noah Misch
Discussion: https://www.postgresql.org/message-id/2234dc98-06fe-42ed-b5db-ac17384dc880@iki.fi
2025-01-14 14:35:11 +02:00
Álvaro Herrera
ffd9b81346
Fix SLRU bank selection code
The originally submitted code (using bit masking) was correct when the
number of slots was restricted to be a power of two -- but that
limitation was removed during development that led to commit
53c2a97a92, which made the bank selection code incorrect.  This led to
always using a smaller number of banks than available.  Change said code
to use integer modulo instead, which works correctly with an arbitrary
number of banks.

It's likely that we could improve on this to avoid runtime use of
integer division.  But with this change we're, at least, not wasting
memory on unused banks, and more banks mean less contention, which is
likely to have a much higher performance impact than a single
instruction's latency.

Author: Yura Sokolov <y.sokolov@postgrespro.ru>
Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru>
Discussion: https://postgr.es/m/9444dc46-ca47-43ed-9058-89c456316306@postgrespro.ru
2025-01-09 07:44:30 +01:00
Thomas Munro
af109e3399 Provide 64-bit ftruncate() and lseek() on Windows.
Change our ftruncate() macro to use the 64-bit variant of chsize(), and
add a new macro to redirect lseek() to _lseeki64().

Back-patch to all supported releases, in preparation for a bug fix.

Tested-by: Davinder Singh <davinder.singh@enterprisedb.com>
Discussion: https://postgr.es/m/CAKZiRmyM4YnokK6Oenw5JKwAQ3rhP0YTz2T-tiw5dAQjGRXE3Q%40mail.gmail.com
2025-01-09 15:00:23 +13:00
Thomas Munro
66aaabe7a1 Restore smgrtruncate() prototype in back-branches.
It's possible that external code is calling smgrtruncate().  Any
external callers might like to consider the recent changes to
RelationTruncate(), but commit 38c579b0 should not have changed the
function prototype in the back-branches, per ABI stability policy.

Restore smgrtruncate()'s traditional argument list in the back-branches,
but make it a wrapper for a new function smgrtruncate2().  The three
callers in core can use smgrtruncate2() directly.  In master (18-to-be),
smgrtruncate2() is effectively renamed to smgrtruncate(), so this wart
is cleaned up.

Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CA%2BhUKG%2BThae6x6%2BjmQiuALQBT2Ae1ChjMh1%3DkMvJ8y_SBJZrvA%40mail.gmail.com
2025-01-08 10:43:40 +13:00
Heikki Linnakangas
e998901b88 Remove duplicate definitions in proc.h
These are also present in procnumber.h

Reported-by: Peter Eisentraut
Discussion: https://www.postgresql.org/message-id/bd04d675-4672-4f87-800a-eb5d470c15fc@eisentraut.org
2025-01-06 11:56:28 +02:00
Tom Lane
15b4c46c32 Exclude parallel workers from connection privilege/limit checks.
Cause parallel workers to not check datallowconn, rolcanlogin, and
ACL_CONNECT privileges.  The leader already checked these things
(except for rolcanlogin which might have been checked for a different
role).  Re-checking can accomplish little except to induce unexpected
failures in applications that might not even be aware that their query
has been parallelized.  We already had the principle that parallel
workers rely on their leader to pass a valid set of authorization
information, so this change just extends that a bit further.

Also, modify the ReservedConnections, datconnlimit and rolconnlimit
logic so that these limits are only enforced against regular backends,
and only regular backends are counted while checking if the limits
were already reached.  Previously, background processes that had an
assigned database or role were subject to these limits (with rather
random exclusions for autovac workers and walsenders), and the set of
existing processes that counted against each limit was quite haphazard
as well.  The point of these limits, AFAICS, is to ensure the
availability of PGPROC slots for regular backends.  Since all other
types of processes have their own separate pools of PGPROC slots, it
makes no sense either to enforce these limits against them or to count
them while enforcing the limit.

While edge-case failures of these sorts have been possible for a
long time, the problem got a good deal worse with commit 5a2fed911
(CVE-2024-10978), which caused parallel workers to make some of these
checks using the leader's current role where before we had used its
AuthenticatedUserId, thus allowing parallel queries to fail after
SET ROLE.  The previous behavior was fairly accidental and I have
no desire to return to it.

This patch includes reverting 73c9f91a1, which was an emergency hack
to suppress these same checks in some cases.  It wasn't complete,
as shown by a recent bug report from Laurenz Albe.  We can also revert
fd4d93d26 and 492217301, which hacked around the same problems in one
regression test.

In passing, remove the special case for autovac workers in
CheckMyDatabase; it seems cleaner to have AutoVacWorkerMain pass
the INIT_PG_OVERRIDE_ALLOW_CONNS flag, now that that does what's
needed.

Like 5a2fed911, back-patch to supported branches (which sadly no
longer includes v12).

Discussion: https://postgr.es/m/1808397.1735156190@sss.pgh.pa.us
2024-12-28 16:08:50 -05:00
Tom Lane
14141bbbc1 Reserve a PGPROC slot and semaphore for the slotsync worker process.
The need for this was missed in commit 93db6cbda, with the result
being that if we launch a slotsync worker it would consume one of
the PGPROCs in the max_connections pool.  That could lead to inability
to launch the worker, or to subsequent failures of connection requests
that should have succeeded according to the configured settings.

Rather than create some one-off infrastructure to support this,
let's group the slotsync worker with the existing autovac launcher
in a new category of "special worker" processes.  These are kind of
like auxiliary processes, but they cannot use that infrastructure
because they need to be able to run transactions.

For the moment, make these processes share the PGPROC freelist
used for autovac workers (which previously supplied the autovac
launcher too).  This is partly to avoid an ABI change in v17,
and partly because it seems silly to have a freelist with
at most two members.  This might be worth revisiting if we grow
enough workers in this category.

Tom Lane and Hou Zhijie.  Back-patch to v17.

Discussion: https://postgr.es/m/1808397.1735156190@sss.pgh.pa.us
2024-12-28 12:30:42 -05:00
Noah Misch
fa61313770 In REASSIGN OWNED of a database, lock the tuple as mandated.
Commit aac2c9b4fd mandated such locking
and attempted to fulfill that mandate, but it missed REASSIGN OWNED.
Hence, it remained possible to lose VACUUM's inplace update of
datfrozenxid if a REASSIGN OWNED processed that database at the same
time.  This didn't affect the other inplace-updated catalog, pg_class.
For pg_class, REASSIGN OWNED calls ATExecChangeOwner() instead of the
generic AlterObjectOwner_internal(), and ATExecChangeOwner() fulfills
the locking mandate.

Like in GRANT, implement this by following the locking protocol for any
catalog subject to the generic AlterObjectOwner_internal().  It would
suffice to do this for IsInplaceUpdateOid() catalogs only.  Back-patch
to v13 (all supported versions).

Kirill Reshke.  Reported by Alexander Kukushkin.

Discussion: https://postgr.es/m/CAFh8B=mpKjAy4Cuun-HP-f_vRzh2HSvYFG3rhVfYbfEBUhBAGg@mail.gmail.com
2024-12-28 07:16:26 -08:00
Thomas Munro
0350b876b0 Fix corruption when relation truncation fails.
RelationTruncate() does three things, while holding an
AccessExclusiveLock and preventing checkpoints:

1. Logs the truncation.
2. Drops buffers, even if they're dirty.
3. Truncates some number of files.

Step 2 could previously be canceled if it had to wait for I/O, and step
3 could and still can fail in file APIs.  All orderings of these
operations have data corruption hazards if interrupted, so we can't give
up until the whole operation is done.  When dirty pages were discarded
but the corresponding blocks were left on disk due to ERROR, old page
versions could come back from disk, reviving deleted data (see
pgsql-bugs #18146 and several like it).  When primary and standby were
allowed to disagree on relation size, standbys could panic (see
pgsql-bugs #18426) or revive data unknown to visibility management on
the primary (theorized).

Changes:

 * WAL is now unconditionally flushed first
 * smgrtruncate() is now called in a critical section, preventing
   interrupts and causing PANIC on file API failure
 * smgrtruncate() has a new parameter for existing fork sizes,
   because it can't call smgrnblocks() itself inside a critical section

The changes apply to RelationTruncate(), smgr_redo() and
pg_truncate_visibility_map().  That last is also brought up to date with
other evolutions of the truncation protocol.

The VACUUM FileTruncate() failure mode had been discussed in older
reports than the ones referenced below, with independent analysis from
many people, but earlier theories on how to fix it were too complicated
to back-patch.  The more recently invented cancellation bug was
diagnosed by Alexander Lakhin.  Other corruption scenarios were spotted
by me while iterating on this patch and earlier commit 75818b3a.

Back-patch to all supported releases.

Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reported-by: rootcause000@gmail.com
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/18146-04e908c662113ad5%40postgresql.org
Discussion: https://postgr.es/m/18426-2d18da6586f152d6%40postgresql.org
2024-12-20 23:57:18 +13:00
Masahiko Sawada
a12b0f3a9d Include necessary header files in radixtree.h.
When #include'ing radixtree.h with RT_SHMEM, it could happen to raise
compiler errors due to missing some declarations of types and
functions.

This commit also removes the inclusion of postgres.h since it's
against our usual convention.

Backpatch to v17, where radixtree.h was introduced.

Reviewed-by: Heikki Linnakangas, Álvaro Herrera
Discussion: https://postgr.es/m/CAD21AoCU9YH%2Bb9Rr8YRw7UjmB%3D1zh8GKQkWNiuN9mVhMvkyrRg%40mail.gmail.com
Backpatch-through: 17
2024-12-09 13:07:03 -08:00
Tom Lane
556f7b7bc1 Simplify executor's determination of whether to use parallelism.
Our parallel-mode code only works when we are executing a query
in full, so ExecutePlan must disable parallel mode when it is
asked to do partial execution.  The previous logic for this
involved passing down a flag (variously named execute_once or
run_once) from callers of ExecutorRun or PortalRun.  This is
overcomplicated, and unsurprisingly some of the callers didn't
get it right, since it requires keeping state that not all of
them have handy; not to mention that the requirements for it were
undocumented.  That led to assertion failures in some corner
cases.  The only state we really need for this is the existing
QueryDesc.already_executed flag, so let's just put all the
responsibility in ExecutePlan.  (It could have been done in
ExecutorRun too, leading to a slightly shorter patch -- but if
there's ever more than one caller of ExecutePlan, it seems better
to have this logic in the subroutine than the callers.)

This makes those ExecutorRun/PortalRun parameters unnecessary.
In master it seems okay to just remove them, returning the
API for those functions to what it was before parallelism.
Such an API break is clearly not okay in stable branches,
but for them we can just leave the parameters in place after
documenting that they do nothing.

Per report from Yugo Nagata, who also reviewed and tested
this patch.  Back-patch to all supported branches.

Discussion: https://postgr.es/m/20241206062549.710dc01cf91224809dd6c0e1@sraoss.co.jp
2024-12-09 14:38:19 -05:00
Tom Lane
ec7b89cc53 Ensure that pg_amop/amproc entries depend on their lefttype/righttype.
Usually an entry in pg_amop or pg_amproc does not need a dependency on
its amoplefttype/amoprighttype/amproclefttype/amprocrighttype types,
because there is an indirect dependency via the argument types of its
referenced operator or procedure, or via the opclass it belongs to.
However, for some support procedures in some index AMs, the argument
types of the support procedure might not mention the column data type
at all.  Also, the amop/amproc entry might be treated as "loose" in
the opfamily, in which case it lacks a dependency on any particular
opclass; or it might be a cross-type entry having a reference to a
datatype that is not its opclass' opcintype.

The upshot of all this is that there are cases where a datatype can
be dropped while leaving behind amop/amproc entries that mention it,
because there is no path in pg_depend showing that those entries
depend on that type.  Such entries are harmless in normal activity,
because they won't get used, but they cause problems for maintenance
actions such as dropping the operator family.  They also cause pg_dump
to produce bogus output.  The previous commit put a band-aid on the
DROP OPERATOR FAMILY failure, but a real fix is needed.

To fix, add pg_depend entries showing that a pg_amop/pg_amproc entry
depends on its lefttype/righttype.  To avoid bloating pg_depend too
much, skip this if the referenced operator or function has that type
as an input type.  (I did not bother with considering the possible
indirect dependency via the opclass' opcintype; at least in the
reported case, that wouldn't help anyway.)

Probably, the reason this has escaped notice for so long is that
add-on datatypes and relevant opclasses/opfamilies are usually
packaged as extensions nowadays, so that there's no way to drop
a type without dropping the referencing opclasses/opfamilies too.
Still, in the absence of pg_depend entries there's nothing that
constrains DROP EXTENSION to drop the opfamily entries before the
datatype, so it seems possible for a DROP failure to occur anyway.

The specific case that was reported doesn't fail in v13, because
v13 prefers to attach the support procedure to the opclass not the
opfamily.  But it's surely possible to construct other edge cases
that do fail in v13, so patch that too.

Per report from Yoran Heling.  Back-patch to all supported branches.

Discussion: https://postgr.es/m/Z1MVCOh1hprjK5Sf@gmai021
2024-12-07 15:56:28 -05:00
Álvaro Herrera
b0e572819d
Clean up newlines following left parentheses
Most came in during the 17 cycle, so backpatch there.  Some
(particularly reorderbuffer.h) are very old, but backpatching doesn't
seem useful.

Like commits c9d2977519, c4f113e8fe.
2024-11-26 17:10:07 +01:00
Peter Eisentraut
f3834c12af Fix C23 compiler warning
The approach of declaring a function pointer with an empty argument
list and hoping that the compiler will not complain about casting it
to another type no longer works with C23, because foo() is now
equivalent to foo(void).

We don't need to do this here.  With a few struct forward declarations
we can supply a correct argument list without having to pull in
another header file.

(This is the only new warning with C23.  Together with the previous
fix a67a49648d, this makes the whole code compile cleanly under C23.)

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/95c6a9bf-d306-43d8-b880-664ef08f2944%40eisentraut.org
Discussion: https://www.postgresql.org/message-id/flat/87o72eo9iu.fsf%40gentoo.org
2024-11-26 13:36:49 +01:00
Tom Lane
97be02ad00 Fix NULLIF()'s handling of read-write expanded objects.
If passed a read-write expanded object pointer, the EEOP_NULLIF
code would hand that same pointer to the equality function
and then (unless equality was reported) also return the same
pointer as its value.  This is no good, because a function that
receives a read-write expanded object pointer is fully entitled
to scribble on or even delete the object, thus corrupting the
NULLIF output.  (This problem is likely unobservable with the
equality functions provided in core Postgres, but it's easy to
demonstrate with one coded in plpgsql.)

To fix, make sure the pointer passed to the equality function
is read-only.  We can still return the original read-write
pointer as the NULLIF result, allowing optimization of later
operations.

Per bug #18722 from Alexander Lakhin.  This has been wrong
since we invented expanded objects, so back-patch to all
supported branches.

Discussion: https://postgr.es/m/18722-fd9e645448cc78b4@postgresql.org
2024-11-25 18:09:10 -05:00
Thomas Munro
0813a5e691 Assume that <stdbool.h> conforms to the C standard.
Previously we checked "for <stdbool.h> that conforms to C99" using
autoconf's AC_HEADER_STDBOOL macro.  We've required C99 since PostgreSQL
12, so the test was redundant, and under C23 it was broken: autoconf
2.69's implementation doesn't understand C23's new empty header (the
macros it's looking for went away, replaced by language keywords).
Later autoconf versions fixed that, but let's just remove the
anachronistic test.

HAVE_STDBOOL_H and HAVE__BOOL will no longer be defined, but they
weren't directly tested in core or likely extensions (except in 11, see
below).  PG_USE_STDBOOL (or USE_STDBOOL in 11 and 12) is still defined
when sizeof(bool) is 1, which should be true on all modern systems.
Otherwise we define our own bool type and values of size 1, which would
fail to compile under C23 as revealed by the broken test.  (We'll
probably clean that dead code up in master, but here we want a minimal
back-patchable change.)

This came to our attention when GCC 15 recently started using using C23
by default and failed to compile the replacement code, as reported by
Sam James and build farm animal alligator.

Back-patch to all supported releases, and then two older versions that
also know about <stdbool.h>, per the recently-out-of-support policy[1].
12 requires C99 so it's much like the supported releases, but 11 only
assumes C89 so it now uses AC_CHECK_HEADERS instead of the overly picky
AC_HEADER_STDBOOL.  (I could find no discussion of which historical
systems had <stdbool.h> but failed the conformance test; if they ever
existed, they surely aren't relevant to that policy's goals.)

[1] https://wiki.postgresql.org/wiki/Committing_checklist#Policies

Reported-by: Sam James <sam@gentoo.org>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org> (master version)
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> (approach)
Discussion: https://www.postgresql.org/message-id/flat/87o72eo9iu.fsf%40gentoo.org
2024-11-25 20:54:15 +13:00
Amit Kapila
5f46439d59 Doc: Clarify the inactive_since field description.
Updated to specify that it represents the exact time a slot became
inactive, rather than the period of inactivity.

Reported-by: Peter Smith
Author: Bruce Momjian, Nisha Moond
Reviewed-by: Amit Kapila, Peter Smith
Backpatch-through: 17
Discussion: https://postgr.es/m/CAHut+PuvsyA5v8y7rYoY9mkDQzUhwaESM05yCByTMaDoRh30tA@mail.gmail.com
2024-11-25 10:58:06 +05:30
Heikki Linnakangas
e3b249c6d8 Make the memory layout of Port struct independent of USE_OPENSSL
Commit d39a49c1e4 added new fields to the struct, but missed the "keep
these last" comment on the previous fields. Add placeholder variables
so that the offsets of the fields are the same whether you build with
USE_OPENSSL or not. This is a courtesy to extensions that might peek
at the fields, to make the ABI the same regardless of the options used
to build PostgreSQL.

In reality, I don't expect any extensions to look at the 'raw_buf'
fields. Firstly, they are new in v17, so no one's written such
extensions yet. Secondly, extensions should have no business poking at
those fields anyway. Nevertheless, fix this properly on 'master'. On
v17, we mustn't change the memory layout, so just fix the comments.

Author: Jacob Champion
Discussion: https://www.postgresql.org/message-id/raw/CAOYmi%2BmKVJNzn5_TD_MK%3DhqO64r_w8Gb0FHCLk0oAkW-PJv8jQ@mail.gmail.com
2024-11-22 17:43:34 +02:00
Tom Lane
6bfacd368b Undo unintentional ABI break in struct ResultRelInfo.
Commits aac2c9b4f et al. added a bool field to struct ResultRelInfo.
That's no problem in the master branch, but in released branches
care must be taken when modifying publicly-visible structs to avoid
an ABI break for extensions.  Frequently we solve that by adding the
new field at the end of the struct, and that's what was done here.
But ResultRelInfo has stricter constraints than just about any other
node type in Postgres.  Some executor APIs require extensions to index
into arrays of ResultRelInfo, which means that any change whatever in
sizeof(ResultRelInfo) causes a fatal ABI break.

Fortunately, this is easy to fix, because the new field can be
squeezed into available padding space instead --- indeed, that's where
it was put in master, so this fix also removes a cross-branch coding
variation.

Per report from Pavan Deolasee.  Patch v14-v17 only; earlier versions
did not gain the extra field, nor is there any problem in master.

Discussion: https://postgr.es/m/CABOikdNmVBC1LL6pY26dyxAS2f+gLZvTsNt=2XbcyG7WxXVBBQ@mail.gmail.com
2024-11-16 12:58:26 -05:00
Tom Lane
5f28e6ba7f Avoid assertion due to disconnected NFA sub-graphs in regex parsing.
In commit 08c0d6ad6 which introduced "rainbow" arcs in regex NFAs,
I didn't think terribly hard about what to do when creating the color
complement of a rainbow arc.  Clearly, the complement cannot match any
characters, and I took the easy way out by just not building any arcs
at all in the complement arc set.  That mostly works, but Nikolay
Shaplov found a case where it doesn't: if we decide to delete that
sub-NFA later because it's inside a "{0}" quantifier, delsub()
suffered an assertion failure.  That's because delsub() relies on
the target sub-NFA being fully connected.  That was always true
before, and the best fix seems to be to restore that property.
Hence, invent a new arc type CANTMATCH that can be generated in
place of an empty color complement, and drop it again later when we
start NFA optimization.  (At that point we don't need to do delsub()
any more, and besides there are other cases where NFA optimization can
lead to disconnected subgraphs.)

It appears that this bug has no consequences in a non-assert-enabled
build: there will be some transiently leaked NFA states/arcs, but
they'll get cleaned up eventually.  Still, we don't like assertion
failures, so back-patch to v14 where rainbow arcs were introduced.

Per bug #18708 from Nikolay Shaplov.

Discussion: https://postgr.es/m/18708-f94f2599c9d2c005@postgresql.org
2024-11-15 18:23:38 -05:00
Michael Paquier
1d6a03ea41 Fix race conditions with drop of reused pgstats entries
This fixes a set of race conditions with cumulative statistics where a
shared stats entry could be dropped while it should still be valid in
the event when it is reused: an entry may refer to a different object
but requires the same hash key.  This can happen with various stats
kinds, like:
- Replication slots that compute internally an index number, for
different slot names.
- Stats kinds that use an OID in the object key, where a wraparound
causes the same key to be used if an OID is used for the same object.
- As of PostgreSQL 18, custom pgstats kinds could also be an issue,
depending on their implementation.

This issue is fixed by introducing a counter called "generation" in the
shared entries via PgStatShared_HashEntry, initialized at 0 when an
entry is created and incremented when the same entry is reused, to avoid
concurrent issues on drop because of other backends still holding a
reference to it.  This "generation" is copied to the local copy that a
backend holds when looking at an object, then cross-checked with the
shared entry to make sure that the entry is not dropped even if its
"refcount" justifies that if it has been reused.

This problem could show up when a backend shuts down and needs to
discard any entries it still holds, causing statistics to be removed
when they should not, or even an assertion failure.  Another report
involved a failure in a standby after an OID wraparound, where the
startup process would FATAL on a "can only drop stats once", stopping
recovery abruptly.  The buildfarm has been sporadically complaining
about the problem, as well, but the window is hard to reach with the
in-core tests.

Note that the issue can be reproduced easily by adding a sleep before
dshash_find() in pgstat_release_entry_ref() to enlarge the problematic
window while repeating test_decoding's isolation test oldest_xmin a
couple of times, for example, as pointed out by Alexander Lakhin.

Reported-by: Alexander Lakhin, Peter Smith
Author: Kyotaro Horiguchi, Michael Paquier
Reviewed-by: Bertrand Drouvot
Discussion: https://postgr.es/m/CAA4eK1KxuMVyAryz_Vk5yq3ejgKYcL6F45Hj9ZnMNBS-g+PuZg@mail.gmail.com
Discussion: https://postgr.es/m/17947-b9554521ad963c9c@postgresql.org
Backpatch-through: 15
2024-11-15 11:32:13 +09:00
Tom Lane
cd82afdda5 Fix improper interactions between session_authorization and role.
The SQL spec mandates that SET SESSION AUTHORIZATION implies
SET ROLE NONE.  We tried to implement that within the lowest-level
functions that manipulate these settings, but that was a bad idea.
In particular, guc.c assumes that it doesn't matter in what order
it applies GUC variable updates, but that was not the case for these
two variables.  This problem, compounded by some hackish attempts to
work around it, led to some security-grade issues:

* Rolling back a transaction that had done SET SESSION AUTHORIZATION
would revert to SET ROLE NONE, even if that had not been the previous
state, so that the effective user ID might now be different from what
it had been.

* The same for SET SESSION AUTHORIZATION in a function SET clause.

* If a parallel worker inspected current_setting('role'), it saw
"none" even when it should see something else.

Also, although the parallel worker startup code intended to cope
with the current role's pg_authid row having disappeared, its
implementation of that was incomplete so it would still fail.

Fix by fully separating the miscinit.c functions that assign
session_authorization from those that assign role.  To implement the
spec's requirement, teach set_config_option itself to perform "SET
ROLE NONE" when it sets session_authorization.  (This is undoubtedly
ugly, but the alternatives seem worse.  In particular, there's no way
to do it within assign_session_authorization without incompatible
changes in the API for GUC assign hooks.)  Also, improve
ParallelWorkerMain to directly set all the relevant user-ID variables
instead of relying on some of them to get set indirectly.  That
allows us to survive not finding the pg_authid row during worker
startup.

In v16 and earlier, this includes back-patching 9987a7bf3 which
fixed a violation of GUC coding rules: SetSessionAuthorization
is not an appropriate place to be throwing errors from.

Security: CVE-2024-10978
2024-11-11 10:29:54 -05:00
Nathan Bossart
0a883a067b Fix sign-compare warnings in pg_iovec.h.
The code in question (pg_preadv() and pg_pwritev()) has been around
for a while, but commit 15c9ac3629 moved it to a header file.  If
third-party code that includes this header file is built with
-Wsign-compare on a system without preadv() or pwritev(), warnings
ensue.  This commit fixes said warnings by casting the result of
pg_pread()/pg_pwrite() to size_t, which should be safe because we
will have already checked for a negative value.

Author: Wolfgang Walther
Discussion: https://postgr.es/m/16989737-1aa8-48fd-8dfe-b7ada06509ab%40technowledgy.de
Backpatch-through: 17
2024-11-08 16:11:08 -06:00
Thomas Munro
b7467ab71c Monkey-patch LLVM code to fix ARM relocation bug.
Supply a new memory manager for RuntimeDyld, to avoid crashes in
generated code caused by memory placement that can overflow a 32 bit
data type.  This is a drop-in replacement for the
llvm::SectionMemoryManager class in the LLVM library, with Michael
Smith's proposed fix from
https://www.github.com/llvm/llvm-project/pull/71968.

We hereby slurp it into our own source tree, after moving into a new
namespace llvm::backport and making some minor adjustments so that it
can be compiled with older LLVM versions as far back as 12.  It's harder
to make it work on even older LLVM versions, but it doesn't seem likely
that people are really using them so that is not investigated for now.

The problem could also be addressed by switching to JITLink instead of
RuntimeDyld, and that is the LLVM project's recommended solution as
the latter is about to be deprecated.  We'll have to do that soon enough
anyway, and then when the LLVM version support window advances far
enough in a few years we'll be able to delete this code.  Unfortunately
that wouldn't be enough for PostgreSQL today: in most relevant versions
of LLVM, JITLink is missing or incomplete.

Several other projects have already back-ported this fix into their fork
of LLVM, which is a vote of confidence despite the lack of commit into
LLVM as of today.  We don't have our own copy of LLVM so we can't do
exactly what they've done; instead we have a copy of the whole patched
class so we can pass an instance of it to RuntimeDyld.

The LLVM project hasn't chosen to commit the fix yet, and even if it
did, it wouldn't be back-ported into the releases of LLVM that most of
our users care about, so there is not much point in waiting any longer
for that.  If they make further changes and commit it to LLVM 19 or 20,
we'll still need this for older versions, but we may want to
resynchronize our copy and update some comments.

The changes that we've had to make to our copy can be seen by diffing
our SectionMemoryManager.{h,cpp} files against the ones in the tree of
the pull request.  Per the LLVM project's license requirements, a copy
is in SectionMemoryManager.LICENSE.

This should fix the spate of crash reports we've been receiving lately
from users on large memory ARM systems.

Back-patch to all supported releases.

Co-authored-by: Thomas Munro <thomas.munro@gmail.com>
Co-authored-by: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>
Reviewed-by: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se> (license aspects)
Reported-by: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>
Discussion: https://postgr.es/m/CAO6_Xqr63qj%3DSx7HY6ZiiQ6R_JbX%2B-p6sTPwDYwTWZjUmjsYBg%40mail.gmail.com
2024-11-06 23:07:34 +13:00
Noah Misch
c1099dd745 Revert "For inplace update, send nontransactional invalidations."
This reverts commit 95c5acb3fc (v17) and
counterparts in each other non-master branch.  If released, that commit
would have caused a worst-in-years minor release regression, via
undetected LWLock self-deadlock.  This commit and its self-deadlock fix
warrant more bake time in the master branch.

Reported by Alexander Lakhin.

Discussion: https://postgr.es/m/10ec0bc3-5933-1189-6bb8-5dec4114558e@gmail.com
2024-11-02 09:05:00 -07:00
Noah Misch
9aef6f19ac Unpin buffer before inplace update waits for an XID to end.
Commit a07e03fd8f changed inplace updates
to wait for heap_update() commands like GRANT TABLE and GRANT DATABASE.
By keeping the pin during that wait, a sequence of autovacuum workers
and an uncommitted GRANT starved one foreground LockBufferForCleanup()
for six minutes, on buildfarm member sarus.  Prevent, at the cost of a
bit of complexity.  Back-patch to v12, like the earlier commit.  That
commit and heap_inplace_lock() have not yet appeared in any release.

Discussion: https://postgr.es/m/20241026184936.ae.nmisch@google.com
2024-10-29 09:39:58 -07:00
Michael Paquier
709ce29b16 doc: Add better description for rewrite functions in event triggers
There are two functions that can be used in event triggers to get more
details about a rewrite happening on a relation.  Both had a limited
documentation:
- pg_event_trigger_table_rewrite_reason() and
pg_event_trigger_table_rewrite_oid() were not mentioned in the main
event trigger section in the paragraph dedicated to the event
table_rewrite.
- pg_event_trigger_table_rewrite_reason() returns an integer which is a
bitmap of the reasons why a rewrite happens.  There was no explanation
about the meaning of these values, forcing the reader to look at the
code to find out that these are defined in event_trigger.h.

While on it, let's add a comment in event_trigger.h where the
AT_REWRITE_* are defined, telling to update the documentation when
these values are changed.

Backpatch down to 13 as a consequence of 1ad23335f3, where this area
of the documentation has been heavily reworked.

Author: Greg Sabino Mullane
Discussion: https://postgr.es/m/CAKAnmmL+Z6j-C8dAx1tVrnBmZJu+BSoc68WSg3sR+CVNjBCqbw@mail.gmail.com
Backpatch-through: 13
2024-10-29 15:35:14 +09:00
Noah Misch
95c5acb3fc For inplace update, send nontransactional invalidations.
The inplace update survives ROLLBACK.  The inval didn't, so another
backend's DDL could then update the row without incorporating the
inplace update.  In the test this fixes, a mix of CREATE INDEX and ALTER
TABLE resulted in a table with an index, yet relhasindex=f.  That is a
source of index corruption.  Back-patch to v12 (all supported versions).
The back branch versions don't change WAL, because those branches just
added end-of-recovery SIResetAll().  All branches change the ABI of
extern function PrepareToInvalidateCacheTuple().  No PGXN extension
calls that, and there's no apparent use case in extensions.

Reviewed by Nitin Motiani and (in earlier versions) Andres Freund.

Discussion: https://postgr.es/m/20240523000548.58.nmisch@google.com
2024-10-25 06:51:06 -07:00
Noah Misch
a4668c99f0 At end of recovery, reset all sinval-managed caches.
An inplace update's invalidation messages are part of its transaction's
commit record.  However, the update survives even if its transaction
aborts or we stop recovery before replaying its transaction commit.
After recovery, a backend that started in recovery could update the row
without incorporating the inplace update.  That could result in a table
with an index, yet relhasindex=f.  That is a source of index corruption.

This bulk invalidation avoids the functional consequences.  A future
change can fix the !RecoveryInProgress() scenario without changing the
WAL format.  Back-patch to v17 - v12 (all supported versions).  v18 will
instead add invalidations to WAL.

Discussion: https://postgr.es/m/20240618152349.7f.nmisch@google.com
2024-10-25 06:51:06 -07:00
Amit Langote
7148cb3e30 SQL/JSON: Fix some oversights in commit b6e1157e7
The decision in b6e1157e7 to ignore raw_expr when evaluating a
JsonValueExpr was incorrect.  While its value is not ultimately
used (since formatted_expr's value is), failing to initialize it
can lead to problems, for instance,  when the expression tree in
raw_expr contains Aggref nodes, which must be initialized to
ensure the parent Agg node works correctly.

Also, optimize eval_const_expressions_mutator()'s handling of
JsonValueExpr a bit.  Currently, when formatted_expr cannot be folded
into a constant, we end up processing it twice -- once directly in
eval_const_expressions_mutator() and again recursively via
ece_generic_processing().  This recursive processing is required to
handle raw_expr. To avoid the redundant processing of formatted_expr,
we now  process raw_expr directly in eval_const_expressions_mutator().

Finally, update the comment of JsonValueExpr to describe the roles of
raw_expr and formatted_expr more clearly.

Bug: #18657
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Diagnosed-by: Fabio R. Sluzala <fabio3rs@gmail.com>
Diagnosed-by: Tender Wang <tndrwang@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/18657-1b90ccce2b16bdb8@postgresql.org
Backpatch-through: 16
2024-10-20 12:21:12 +09:00
Nathan Bossart
8aaca07851 Add type cast to foreach_internal's loop variable.
C++ requires explicitly casting void pointers to the appropriate
pointer type, which means the foreach_ptr macro cannot be used in
C++ code without this change.

Author: Jelte Fennema-Nio
Reviewed-by: Bruce Momjian
Discussion: https://postgr.es/m/CAGECzQSYG3QfHrc-rOk2KbnB9iJOd7Qu-Xii1s-GTA%3D3JFt49Q%40mail.gmail.com
Backpatch-through: 17
2024-10-15 16:20:49 -05:00
Noah Misch
3b7a689e1a For inplace update durability, make heap_update() callers wait.
The previous commit fixed some ways of losing an inplace update.  It
remained possible to lose one when a backend working toward a
heap_update() copied a tuple into memory just before inplace update of
that tuple.  In catalogs eligible for inplace update, use LOCKTAG_TUPLE
to govern admission to the steps of copying an old tuple, modifying it,
and issuing heap_update().  This includes MERGE commands.  To avoid
changing most of the pg_class DDL, don't require LOCKTAG_TUPLE when
holding a relation lock sufficient to exclude inplace updaters.
Back-patch to v12 (all supported versions).  In v13 and v12, "UPDATE
pg_class" or "UPDATE pg_database" can still lose an inplace update.  The
v14+ UPDATE fix needs commit 86dc90056d,
and it wasn't worth reimplementing that fix without such infrastructure.

Reviewed by Nitin Motiani and (in earlier versions) Heikki Linnakangas.

Discussion: https://postgr.es/m/20231027214946.79.nmisch@google.com
2024-09-24 15:25:22 -07:00
Noah Misch
fd27b878c2 Fix data loss at inplace update after heap_update().
As previously-added tests demonstrated, heap_inplace_update() could
instead update an unrelated tuple of the same catalog.  It could lose
the update.  Losing relhasindex=t was a source of index corruption.
Inplace-updating commands like VACUUM will now wait for heap_update()
commands like GRANT TABLE and GRANT DATABASE.  That isn't ideal, but a
long-running GRANT already hurts VACUUM progress more just by keeping an
XID running.  The VACUUM will behave like a DELETE or UPDATE waiting for
the uncommitted change.

For implementation details, start at the systable_inplace_update_begin()
header comment and README.tuplock.  Back-patch to v12 (all supported
versions).  In back branches, retain a deprecated heap_inplace_update(),
for extensions.

Reported by Smolkin Grigory.  Reviewed by Nitin Motiani, (in earlier
versions) Heikki Linnakangas, and (in earlier versions) Alexander
Lakhin.

Discussion: https://postgr.es/m/CAMp+ueZQz3yDk7qg42hk6-9gxniYbp-=bG2mgqecErqR5gGGOA@mail.gmail.com
2024-09-24 15:25:21 -07:00
Thomas Munro
ec1d545c8f Allow ReadStream to be consumed as raw block numbers.
Commits 041b9680 and 6377e12a changed the interface of
scan_analyze_next_block() to take a ReadStream instead of a BlockNumber
and a BufferAccessStrategy, and to return a value to indicate when the
stream has run out of blocks.

This caused integration problems for at least one known extension that
uses specially encoded BlockNumber values that map to different
underlying storage, because acquire_sample_rows() sets up the stream so
that read_stream_next_buffer() reads blocks from the main fork of the
relation's SMgrRelation.

Provide read_stream_next_block(), as a way for such an extension to
access the stream of raw BlockNumbers directly and forward them to its
own ReadBuffer() calls after decoding, as it could in earlier releases.
The new function returns the BlockNumber and BufferAccessStrategy that
were previously passed directly to scan_analyze_next_block().
Alternatively, an extension could wrap the stream of BlockNumbers in
another ReadStream with a callback that performs any decoding required
to arrive at real storage manager BlockNumber values, so that it could
benefit from the I/O combining and concurrency provided by
read_stream.c.

Another class of table access method that does nothing in
scan_analyze_next_block() because it is not block-oriented could use
this function to control the number of block sampling loops.  It could
match the previous behavior with "return read_stream_next_block(stream,
&bas) != InvalidBlockNumber".

Ongoing work is expected to provide better ANALYZE support for table
access methods that don't behave like heapam with respect to storage
blocks, but that will be for future releases.

Back-patch to 17.

Reported-by: Mats Kindahl <mats@timescale.com>
Reviewed-by: Mats Kindahl <mats@timescale.com>
Discussion: https://postgr.es/m/CA%2B14425%2BCcm07ocG97Fp%2BFrD9xUXqmBKFvecp0p%2BgV2YYR258Q%40mail.gmail.com
2024-09-18 11:29:58 +12:00
Amit Langote
fe32343814 Update comment about ExprState.escontext
The updated comment provides more helpful guidance by mentioning that
escontext should be set when soft error handling is needed.

Reported-by: Jian He <jian.universality@gmail.com>
Discussion: https://postgr.es/m/CACJufxEo4sUjKCYtda0_qt9tazqqKPmF1cqhW9KBOUeJFqQd2g@mail.gmail.com
Backpatch-through: 17
2024-09-06 12:01:49 +09:00
Alexander Korotkov
84f594da35 Revert support for ALTER TABLE ... MERGE/SPLIT PARTITION(S) commands
This commit reverts 1adf16b8fb, 87c21bb941, and subsequent fixes and
improvements including df64c81ca9, c99ef1811a, 9dfcac8e15, 885742b9f8,
842c9b2705, fcf80c5d5f, 96c7381c4c, f4fc7cb54b, 60ae37a8bc, 259c96fa8f,
449cdcd486, 3ca43dbbb6, 2a679ae94e, 3a82c689fd, fbd4321fd5, d53a4286d7,
c086896625, 4e5d6c4091, 04158e7fa3.

The reason for reverting is security issues related to repeatable name lookups
(CVE-2014-0062).  Even though 04158e7fa3 solved part of the problem, there
are still remaining issues, which aren't feasible to even carefully analyze
before the RC deadline.

Reported-by: Noah Misch, Robert Haas
Discussion: https://postgr.es/m/20240808171351.a9.nmisch%40google.com
Backpatch-through: 17
2024-08-24 19:18:24 +03:00
Alexander Korotkov
f636ab41ab Avoid repeated table name lookups in createPartitionTable()
Currently, createPartitionTable() opens newly created table using its name.
This approach is prone to privilege escalation attack, because we might end
up opening another table than we just created.

This commit address the issue above by opening newly created table by its
OID.  It appears to be tricky to get a relation OID out of ProcessUtility().
We have to extend TableLikeClause with new newRelationOid field, which is
filled within ProcessUtility() to be further accessed by caller.

Security: CVE-2014-0062
Reported-by: Noah Misch
Discussion: https://postgr.es/m/20240808171351.a9.nmisch%40google.com
Reviewed-by: Pavel Borisov, Dmitry Koval
2024-08-22 09:52:54 +03:00
Amit Kapila
915aafe82a Don't advance origin during apply failure.
We advance origin progress during abort on successful streaming and
application of ROLLBACK in parallel streaming mode. But the origin
shouldn't be advanced during an error or unsuccessful apply due to
shutdown. Otherwise, it will result in a transaction loss as such a
transaction won't be sent again by the server.

Reported-by: Hou Zhijie
Author: Hayato Kuroda and Shveta Malik
Reviewed-by: Amit Kapila
Backpatch-through: 16
Discussion: https://postgr.es/m/TYAPR01MB5692FAC23BE40C69DA8ED4AFF5B92@TYAPR01MB5692.jpnprd01.prod.outlook.com
2024-08-21 09:08:16 +05:30
Tom Lane
2b8d33f66c Allow adjusting session_authorization and role in parallel workers.
The code intends to allow GUCs to be set within parallel workers
via function SET clauses, but not otherwise.  However, doing so fails
for "session_authorization" and "role", because the assign hooks for
those attempt to set the subsidiary "is_superuser" GUC, and that call
falls foul of the "not otherwise" prohibition.  We can't switch to
using GUC_ACTION_SAVE for this, so instead add a new GUC variable
flag GUC_ALLOW_IN_PARALLEL to mark is_superuser as being safe to set
anyway.  (This is okay because is_superuser has context PGC_INTERNAL
and thus only hard-wired calls can change it.  We'd need more thought
before applying the flag to other GUCs; but maybe there are other
use-cases.)  This isn't the prettiest fix perhaps, but other
alternatives we thought of would be much more invasive.

While here, correct a thinko in commit 059de3ca4: when rejecting
a GUC setting within a parallel worker, we should return 0 not -1
if the ereport doesn't longjmp.  (This seems to have no consequences
right now because no caller cares, but it's inconsistent.)  Improve
the comments to try to forestall future confusion of the same kind.

Despite the lack of field complaints, this seems worth back-patching.
Thanks to Nathan Bossart for the idea to invent a new flag,
and for review.

Discussion: https://postgr.es/m/2833457.1723229039@sss.pgh.pa.us
2024-08-10 15:51:28 -04:00
Masahiko Sawada
fdf218f1d5 Restrict accesses to non-system views and foreign tables during pg_dump.
When pg_dump retrieves the list of database objects and performs the
data dump, there was possibility that objects are replaced with others
of the same name, such as views, and access them. This vulnerability
could result in code execution with superuser privileges during the
pg_dump process.

This issue can arise when dumping data of sequences, foreign
tables (only 13 or later), or tables registered with a WHERE clause in
the extension configuration table.

To address this, pg_dump now utilizes the newly introduced
restrict_nonsystem_relation_kind GUC parameter to restrict the
accesses to non-system views and foreign tables during the dump
process. This new GUC parameter is added to back branches too, but
these changes do not require cluster recreation.

Back-patch to all supported branches.

Reviewed-by: Noah Misch
Security: CVE-2024-7348
Backpatch-through: 12
2024-08-05 06:05:30 -07:00
Etsuro Fujita
eb39497eed Update comment in portal.h.
We store tuples into the portal's tuple store for a PORTAL_ONE_MOD_WITH
query as well.

Back-patch to all supported branches.

Reviewed by Andy Fan.

Discussion: https://postgr.es/m/CAPmGK14HVYBZYZtHabjeCd-e31VT%3Dwx6rQNq8QfehywLcpZ2Hw%40mail.gmail.com
2024-08-01 17:45:00 +09:00
Amit Langote
f95c5090d9 SQL/JSON: Fix casting for integer EXISTS columns in JSON_TABLE
The current method of coercing the boolean result value of
JsonPathExists() to the target type specified for an EXISTS column,
which is to call the type's input function via json_populate_type(),
leads to an error when the target type is integer, because the
integer input function doesn't recognize boolean literal values as
valid.

Instead use the boolean-to-integer cast function for coercion in that
case so that using integer or domains thereof as type for EXISTS
columns works. Note that coercion for ON ERROR values TRUE and FALSE
already works like that because the parser creates a cast expression
including the cast function, but the coercion of the actual result
value is not handled by the parser.

Tests by Jian He.

Reported-by: Jian He <jian.universality@gmail.com>
Author: Jian He <jian.universality@gmail.com>
Author: Amit Langote <amitlangote09@gmail.com>
Discussion: https://postgr.es/m/CACJufxEo4sUjKCYtda0_qt9tazqqKPmF1cqhW9KBOUeJFqQd2g@mail.gmail.com
Backpatch-through: 17
2024-07-30 10:39:28 +09:00
Robert Haas
53b327f83e Wait for WAL summarization to catch up before creating .partial file.
When a standby is promoted, CleanupAfterArchiveRecovery() may decide
to rename the final WAL file from the old timeline by adding ".partial"
to the name. If WAL summarization is enabled and this file is renamed
before its partial contents are summarized, WAL summarization breaks:
the summarizer gets stuck at that point in the WAL stream and just
errors out.

To fix that, first make the startup process wait for WAL summarization
to catch up before renaming the file. Generally, this should be quick,
and if it's not, the user can shut off summarize_wal and try again.
To make this fix work, also teach the WAL summarizer that after a
promotion has occurred, no more WAL can appear on the previous
timeline: previously, the WAL summarizer wouldn't switch to the new
timeline until we actually started writing WAL there, but that meant
that when the startup process was waiting for the WAL summarizer, it
was waiting for an action that the summarizer wasn't yet prepared to
take.

In the process of fixing these bugs, I realized that the logic to wait
for WAL summarization to catch up was spread out in a way that made
it difficult to reuse properly, so this code refactors things to make
it easier.

Finally, add a test case that would have caught this bug and the
previously-fixed bug that WAL summarization sometimes needs to back up
when the timeline changes.

Discussion: https://postgr.es/m/CA+TgmoZGEsZodXC4f=XZNkAeyuDmWTSkpkjCEOcF19Am0mt_OA@mail.gmail.com
2024-07-26 14:51:10 -04:00
Daniel Gustafsson
1272cfb727 Fix macro placement in pg_config.h.in
Commit 274bbced85 accidentally placed the pg_config.h.in
for SSL_CTX_set_num_tickets on the wrong line wrt where autoheader
places it.  Fix by re-arranging and backpatch to the same level as
the original commit.

Reported-by: Marina Polyakova <m.polyakova@postgrespro.ru>
Discussion: https://postgr.es/m/48cebe8c3eaf308bae253b1dbf4e4a75@postgrespro.ru
Backpatch-through: v12
2024-07-26 16:25:56 +02:00
Heikki Linnakangas
f19beba3e3 Fix using injection points at backend startup in EXEC_BACKEND mode
Commit 86db52a506 changed the locking of injection points to use only
atomic ops and spinlocks, to make it possible to define injection
points in processes that don't have a PGPROC entry (yet). However, it
didn't work in EXEC_BACKEND mode, because the pointer to shared memory
area was not initialized until the process "attaches" to all the
shared memory structs. To fix, pass the pointer to the child process
along with other global variables that need to be set up early.

Backpatch-through: 17
2024-07-26 15:12:12 +03:00
Daniel Gustafsson
3df7f44a8c Disable all TLS session tickets
OpenSSL supports two types of session tickets for TLSv1.3, stateless
and stateful. The option we've used only turns off stateless tickets
leaving stateful tickets active. Use the new API introduced in 1.1.1
to disable all types of tickets.

Backpatch to all supported versions.

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reported-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20240617173803.6alnafnxpiqvlh3g@awork3.anarazel.de
Backpatch-through: v12
2024-07-26 11:09:45 +02:00
Michael Paquier
3b279d89cb Improve comments in slru.{c,h} about segment name format
slru.h described incorrectly how SLRU segment names are formatted
depending on the segment number and if long or short segment names are
used.  This commit closes the gap with a better description, fitting
with the reality.

Reported-by: Noah Misch
Author: Aleksander Alekseev
Discussion: https://postgr.es/m/20240626002747.dc.nmisch@google.com
Backpatch-through: 17
2024-07-23 16:55:09 +09:00
Heikki Linnakangas
b0a8a7ddd3 Move resowner from common JitContext to LLVM specific
Only the LLVM specific code uses it since resource owners were made
extensible in commit b8bff07daa. This is
new in v17, so backpatch there to keep the branches from diverging
just yet.

Author: Andreas Karlsson <andreas@proxel.se>
Discussion: https://www.postgresql.org/message-id/fd3a2a00-6605-4e30-a118-48418b478e6e@proxel.se
2024-07-19 10:27:54 +03:00
Robert Haas
2b5819e2b4 Do not summarize WAL if generated with wal_level=minimal.
To do this, we must include the wal_level in the first WAL record
covered by each summary file; so add wal_level to struct Checkpoint
and the payload of XLOG_CHECKPOINT_REDO and XLOG_END_OF_RECOVERY.

This, in turn, requires bumping XLOG_PAGE_MAGIC and, since the
Checkpoint is also stored in the control file, also
PG_CONTROL_VERSION. It's not great to do that so late in the release
cycle, but the alternative seems to ship v17 without robust
protections against this scenario, which could result in corrupted
incremental backups.

A side effect of this patch is that, when a server with
wal_level=replica is started with summarize_wal=on for the first time,
summarization will no longer begin with the oldest WAL that still
exists in pg_wal, but rather from the first checkpoint after that.
This change should be harmless, because a WAL summary for a partial
checkpoint cycle can never make an incremental backup possible when
it would otherwise not have been.

Report by Fujii Masao. Patch by me. Review and/or testing by Jakub
Wartak and Fujii Masao.

Discussion: http://postgr.es/m/6e30082e-041b-4e31-9633-95a66de76f5d@oss.nttdata.com
2024-07-18 12:19:29 -04:00
Nathan Bossart
925479b8d8 Use PqMsg_* macros in more places.
Commit f4b54e1ed9, which introduced macros for protocol characters,
missed updating a few places.  It also did not introduce macros for
messages sent from parallel workers to their leader processes.
This commit adds a new section in protocol.h for those.

Author: Aleksander Alekseev
Discussion: https://postgr.es/m/CAJ7c6TNTd09AZq8tGaHS3LDyH_CCnpv0oOz2wN1dGe8zekxrdQ%40mail.gmail.com
Backpatch-through: 17
2024-07-17 10:51:00 -05:00
Jeff Davis
b4da732fd6 When creating materialized views, use REFRESH to load data.
Previously, CREATE MATERIALIZED VIEW ... WITH DATA populated the MV
the same way as CREATE TABLE ... AS.

Instead, reuse the REFRESH logic, which locks down security-restricted
operations and restricts the search_path. This reduces the chance that
a subsequent refresh will fail.

Reported-by: Noah Misch
Backpatch-through: 17
Discussion: https://postgr.es/m/20240630222344.db.nmisch@google.com
2024-07-16 15:41:22 -07:00
Masahiko Sawada
068674f4ab Fix possibility of logical decoding partial transaction changes.
When creating and initializing a logical slot, the restart_lsn is set
to the latest WAL insertion point (or the latest replay point on
standbys). Subsequently, WAL records are decoded from that point to
find the start point for extracting changes in the
DecodingContextFindStartpoint() function. Since the initial
restart_lsn could be in the middle of a transaction, the start point
must be a consistent point where we won't see the data for partial
transactions.

Previously, when not building a full snapshot, serialized snapshots
were restored, and the SnapBuild jumps to the consistent state even
while finding the start point. Consequently, the slot's restart_lsn
and confirmed_flush could be set to the middle of a transaction. This
could lead to various unexpected consequences. Specifically, there
were reports of logical decoding decoding partial transactions, and
assertion failures occurred because only subtransactions were decoded
without decoding their top-level transaction until decoding the commit
record.

To resolve this issue, the changes prevent restoring the serialized
snapshot and jumping to the consistent state while finding the start
point.

On v17 and HEAD, a flag indicating whether snapshot restores should be
skipped has been added to the SnapBuild struct, and SNAPBUILD_VERSION
has been bumpded.

On backbranches, the flag is stored in the LogicalDecodingContext
instead, preserving on-disk compatibility.

Backpatch to all supported versions.

Reported-by: Drew Callahan
Reviewed-by: Amit Kapila, Hayato Kuroda
Discussion: https://postgr.es/m/2444AA15-D21B-4CCE-8052-52C7C2DAFE5C%40amazon.com
Backpatch-through: 12
2024-07-11 22:48:21 +09:00
Thomas Munro
9c273679b3 Cope with <regex.h> name clashes.
macOS 15's SDK pulls in headers related to <regex.h> when we include
<xlocale.h>.  This causes our own regex_t implementation to clash with
the OS's regex_t implementation.  Luckily our function names already had
pg_ prefixes, but the macros and typenames did not.

Include <regex.h> explicitly on all POSIX systems, and fix everything
that breaks.  Then we can prove that we are capable of fully hiding and
replacing the system regex API with our own.

1.  Deal with standard-clobbering macros by undefining them all first.
POSIX says they are "symbolic constants".  If they are macros, this
allows us to redefine them.  If they are enums or variables, our macros
will hide them.

2.  Deal with standard-clobbering types by giving our types pg_
prefixes, and then using macros to redirect xxx_t -> pg_xxx_t.

After including our "regex/regex.h", the system <regex.h> is hidden,
because we've replaced all the standard names.  The PostgreSQL source
tree and extensions can continue to use standard prefix-less type and
macro names, but reach our implementation, if they included our
"regex/regex.h" header.

Back-patch to all supported branches, so that macOS 15's tool chain can
build them.

Reported-by: Stan Hu <stanhu@gmail.com>
Suggested-by: Tom Lane <tgl@sss.pgh.pa.us>
Tested-by: Aleksander Alekseev <aleksander@timescale.com>
Discussion: https://postgr.es/m/CAMBWrQnEwEJtgOv7EUNsXmFw2Ub4p5P%2B5QTBEgYwiyjy7rAsEQ%40mail.gmail.com
2024-07-06 11:23:40 +12:00
Alvaro Herrera
3a9d0d774d
Remove bogus assertion in pg_atomic_monotonic_advance_u64
This code wanted to ensure that the 'exchange' variable passed to
pg_atomic_compare_exchange_u64 has correct alignment, but apparently
platforms don't actually require anything that doesn't come naturally.

While messing with pg_atomic_monotonic_advance_u64: instead of using
Max() to determine the value to return, just use
pg_atomic_compare_exchange_u64()'s return value to decide; also, use
pg_atomic_compare_exchange_u64 instead of the _impl version; also remove
the unnecessary underscore at the end of variable name "target".

Backpatch to 17, where this code was introduced by commit bf3ff7bf83bc.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/36796438-a718-cf9b-2071-b2c1b947c1b5@gmail.com
2024-07-04 13:25:31 +02:00
Amit Kapila
0f934b0739 Rename standby_slot_names to synchronized_standby_slots.
The standby_slot_names GUC allows the specification of physical standby
slots that must be synchronized before the logical walsenders associated
with logical failover slots. However, for this purpose, the GUC name is
too generic.

Author: Hou Zhijie
Reviewed-by: Bertrand Drouvot, Masahiko Sawada
Backpatch-through: 17
Discussion: https://postgr.es/m/ZnWeUgdHong93fQN@momjian.us
2024-07-01 11:33:55 +05:30
Amit Langote
716bd12d22 SQL/JSON: Always coerce JsonExpr result at runtime
Instead of looking up casts at parse time for converting the result
of JsonPath* query functions to the specified or the default
RETURNING type, always perform the conversion at runtime using either
the target type's input function or the function
json_populate_type().

There are two motivations for this change:

1. json_populate_type() coerces to types with typmod such that any
   string values that exceed length limit cause an error instead of
   silent truncation, which is necessary to be standard-conforming.

2. It was possible to end up with a cast expression that doesn't
   support soft handling of errors causing bugs in the of handling
   ON ERROR clause.

JsonExpr.coercion_expr which would store the cast expression is no
longer necessary, so remove.

Bump catversion because stored rules change because of the above
removal.

Reported-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Jian He <jian.universality@gmail.com>
Discussion: Discussion: https://postgr.es/m/202405271326.5a5rprki64aw%40alvherre.pgsql
2024-06-28 21:58:13 +09:00
Robert Haas
065583cf46 Prevent summarizer hang when summarize_wal turned off and back on.
Before this commit, when the WAL summarizer started up or recovered
from an error, it would resume summarization from wherever it left
off. That was OK normally, but wrong if summarize_wal=off had been
turned off temporary, allowing some WAL to be removed, and then turned
back on again. In such cases, the WAL summarizer would simply hang
forever. This commit changes the reinitialization sequence for WAL
summarizer to rederive the starting position in the way we were
already doing at initial startup, fixing the problem.

Per report from Israel Barth Rubio. Reviewed by Tom Lane.

Discussion: http://postgr.es/m/CA+TgmoYN6x=YS+FoFOS6=nr6=qkXZFWhdiL7k0oatGwug2hcuA@mail.gmail.com
2024-06-28 08:29:05 -04:00
Noah Misch
4a7f91b3d3 Remove comment about xl_heap_inplace "AT END OF STRUCT".
Commit 2c03216d83 moved the tuple data
from there to the buffer-0 data.  Back-patch to v12 (all supported
versions), the plan for the next change to this struct.

Discussion: https://postgr.es/m/20240523000548.58.nmisch@google.com
2024-06-27 19:21:06 -07:00
Noah Misch
f9f47f0d93 Cope with inplace update making catcache stale during TOAST fetch.
This extends ad98fb1422 to invals of
inplace updates.  Trouble requires an inplace update of a catalog having
a TOAST table, so only pg_database was at risk.  (The other catalog on
which core code performs inplace updates, pg_class, has no TOAST table.)
Trouble would require something like the inplace-inval.spec test.
Consider GRANT ... ON DATABASE fetching a stale row from cache and
discarding a datfrozenxid update that vac_truncate_clog() has already
relied upon.  Back-patch to v12 (all supported versions).

Reviewed (in an earlier version) by Robert Haas.

Discussion: https://postgr.es/m/20240114201411.d0@rfd.leadboat.com
Discussion: https://postgr.es/m/20240512232923.aa.nmisch@google.com
2024-06-27 19:21:06 -07:00
Noah Misch
0cecc908e9 Lock before setting relhassubclass on RELKIND_PARTITIONED_INDEX.
Commit 5b562644fe added a comment that
SetRelationHasSubclass() callers must hold this lock.  When commit
17f206fbc8 extended use of this column to
partitioned indexes, it didn't take the lock.  As the latter commit
message mentioned, we currently never reset a partitioned index to
relhassubclass=f.  That largely avoids harm from the lock omission.  The
cause for fixing this now is to unblock introducing a rule about locks
required to heap_update() a pg_class row.  This might cause more
deadlocks.  It gives minor user-visible benefits:

- If an ALTER INDEX SET TABLESPACE runs concurrently with ALTER TABLE
  ATTACH PARTITION or CREATE PARTITION OF, one transaction blocks
  instead of failing with "tuple concurrently updated".  (Many cases of
  DDL concurrency still fail that way.)

- Match ALTER INDEX ATTACH PARTITION in choosing to lock the index.

While not user-visible today, we'll need this if we ever make something
set the flag to false for a partitioned index, like ANALYZE does today
for tables.  Back-patch to v12 (all supported versions), the plan for
the commit relying on the new rule.  In back branches, add
LockOrStrongerHeldByMe() instead of adding a LockHeldByMe() parameter.

Reviewed (in an earlier version) by Robert Haas.

Discussion: https://postgr.es/m/20240611024525.9f.nmisch@google.com
2024-06-27 19:21:05 -07:00
Noah Misch
bb93640a68 Add wait event type "InjectionPoint", a custom type like "Extension".
Both injection points and customization of type "Extension" are new in
v17, so this just changes a detail of an unreleased feature.

Reported by Robert Haas.  Reviewed by Michael Paquier.

Discussion: https://postgr.es/m/CA+TgmobfMU5pdXP36D5iAwxV5WKE_vuDLtp_1QyH+H5jMMt21g@mail.gmail.com
2024-06-27 19:21:05 -07:00
Heikki Linnakangas
cbfbda7841 Fix MVCC bug with prepared xact with subxacts on standby
We did not recover the subtransaction IDs of prepared transactions
when starting a hot standby from a shutdown checkpoint. As a result,
such subtransactions were considered as aborted, rather than
in-progress. That would lead to hint bits being set incorrectly, and
the subtransactions suddenly becoming visible to old snapshots when
the prepared transaction was committed.

To fix, update pg_subtrans with prepared transactions's subxids when
starting hot standby from a shutdown checkpoint. The snapshots taken
from that state need to be marked as "suboverflowed", so that we also
check the pg_subtrans.

Backport to all supported versions.

Discussion: https://www.postgresql.org/message-id/6b852e98-2d49-4ca1-9e95-db419a2696e0@iki.fi
2024-06-27 21:09:58 +03:00
Alexander Korotkov
70a845c04a Remove extra comment at TableAmRoutine.scan_analyze_next_block
The extra comment was accidentally copied here by 6377e12a from
heapam_scan_analyze_next_block().

Reported-by: Matthias van de Meent
Discussion: https://postgr.es/m/CAEze2WjC5PiweG-4oe0hB_Zr59iF3tRE1gURm8w4Cg5b6JEBGw%40mail.gmail.com
2024-06-22 16:17:50 +03:00
John Naylor
fd49e8f323 Prevent access of uninitialized memory in radix tree nodes
RT_NODE_16_SEARCH_EQ() performs comparisions using vector registers
on x64-64 and aarch64. We apply a mask to the resulting bitfield
to eliminate irrelevant bits that may be set. This ensures correct
behavior, but Valgrind complains of the partially-uninitialised
values. So far the warnings have only occurred on aarch64, which
explains why this hasn't been seen earlier.

To fix this warning, initialize the whole fixed-sized part of the nodes
upon allocation, rather than just do the minimum initialization to
function correctly. The initialization for node48 is a bit different
in that the 256-byte slot index array must be populated with "invalid
index" rather than zero. Experimentation has shown that compilers
tend to emit code that uselessly memsets that array twice. To avoid
pessimizing this path, swap the order of the slot_idxs[] and isset[]
arrays so we can initialize with two non-overlapping memset calls.

Reported by Tomas Vondra
Analysis and patch by Tom Lane, reviewed by Masahiko Sawada. I
investigated the behavior of memset calls to overlapping regions,
leading to the above tweaks to node48 as discussed in the thread.

Discussion: https://postgr.es/m/120c63ad-3d12-415f-a7bf-3da451c31bf6%40enterprisedb.com
2024-06-21 17:29:39 +07:00
Peter Eisentraut
02bbc3c83a parse_manifest: Use const char *
This adapts the manifest parsing code to take advantage of the
const-ified jsonapi.

Reviewed-by: Andrew Dunstan <andrew@dunslane.net>
Discussion: https://www.postgresql.org/message-id/flat/f732b014-f614-4600-a437-dba5a2c3738b%40eisentraut.org
2024-06-21 07:53:30 +02:00
Peter Eisentraut
15cd9a3881 jsonapi: Use const char *
Apply const qualifiers to char * arguments and fields throughout the
jsonapi.  This allows the top-level APIs such as
pg_parse_json_incremental() to declare their input argument as const.
It also reduces the number of unconstify() calls.

Reviewed-by: Andrew Dunstan <andrew@dunslane.net>
Discussion: https://www.postgresql.org/message-id/flat/f732b014-f614-4600-a437-dba5a2c3738b%40eisentraut.org
2024-06-21 07:53:30 +02:00
Peter Eisentraut
0b06bf9fa9 jsonapi: Use size_t
Use size_t instead of int for object sizes in the jsonapi.  This makes
the API better self-documenting.

Reviewed-by: Andrew Dunstan <andrew@dunslane.net>
Discussion: https://www.postgresql.org/message-id/flat/f732b014-f614-4600-a437-dba5a2c3738b%40eisentraut.org
2024-06-21 07:53:30 +02:00
Amit Langote
0f271e8e8d SQL/JSON: Correct jsonpath variable name matching
Previously, GetJsonPathVar() allowed a jsonpath expression to
reference any prefix of a PASSING variable's name. For example, the
following query would incorrectly work:

SELECT JSON_QUERY(context_item, jsonpath '$xy' PASSING val AS xyz);

The fix ensures that the length of the variable name mentioned in a
jsonpath expression matches exactly with the name of the PASSING
variable before comparing the strings using strncmp().

Reported-by: Alvaro Herrera (off-list)
Discussion: https://postgr.es/m/CA+HiwqFGkLWMvELBH6E4SQ45qUHthgcRH6gCJL20OsYDRtFx_w@mail.gmail.com
2024-06-19 15:22:06 +09:00
Tom Lane
92c49d1062 Fix insertion of SP-GiST REDIRECT tuples during REINDEX CONCURRENTLY.
Reconstruction of an SP-GiST index by REINDEX CONCURRENTLY may
insert some REDIRECT tuples.  This will typically happen in
a transaction that lacks an XID, which leads either to assertion
failure in spgFormDeadTuple or to insertion of a REDIRECT tuple
with zero xid.  The latter's not good either, since eventually
VACUUM will apply GlobalVisTestIsRemovableXid() to the zero xid,
resulting in either an assertion failure or a garbage answer.

In practice, since REINDEX CONCURRENTLY locks out index scans
till it's done, it doesn't matter whether it inserts REDIRECTs
or PLACEHOLDERs; and likewise it doesn't matter how soon VACUUM
reduces such a REDIRECT to a PLACEHOLDER.  So in non-assert builds
there's no observable problem here, other than perhaps a little
index bloat.  But it's not behaving as intended.

To fix, remove the failing Assert in spgFormDeadTuple, acknowledging
that we might sometimes insert a zero XID; and guard VACUUM's
GlobalVisTestIsRemovableXid() call with a test for valid XID,
ensuring that we'll reduce such a REDIRECT the first time VACUUM
sees it.  (Versions before v14 use TransactionIdPrecedes here,
which won't fail on zero xid, so they really have no bug at all
in non-assert builds.)

Another solution could be to not create REDIRECTs at all during
REINDEX CONCURRENTLY, making the relevant code paths treat that
case like index build (which likewise knows that no concurrent
index scans can be happening).  That would allow restoring the
Assert in spgFormDeadTuple, but we'd still need the VACUUM change
because redirection tuples with zero xid may be out there already.
But there doesn't seem to be a nice way for spginsert() to tell that
it's being called in REINDEX CONCURRENTLY without some API changes,
so we'll leave that as a possible future improvement.

In HEAD, also rename the SpGistState.myXid field to redirectXid,
which seems less misleading (since it might not in fact be our
transaction's XID) and is certainly less uninformatively generic.

Per bug #18499 from Alexander Lakhin.  Back-patch to all supported
branches.

Discussion: https://postgr.es/m/18499-8a519c280f956480@postgresql.org
2024-06-17 14:30:59 -04:00
Tom Lane
35dd40d34c Improve tracking of role dependencies of pg_init_privs entries.
Commit 534287403 invented SHARED_DEPENDENCY_INITACL entries in
pg_shdepend, but installed them only for non-owner roles mentioned
in a pg_init_privs entry.  This turns out to be the wrong thing,
because there is nothing to cue REASSIGN OWNED to go and update
pg_init_privs entries when the object's ownership is reassigned.
That leads to leaving dangling entries in pg_init_privs, as
reported by Hannu Krosing.  Instead, install INITACL entries for
all roles mentioned in pg_init_privs entries (except pinned roles),
and change ALTER OWNER to not touch them, just as it doesn't
touch pg_init_privs entries.

REASSIGN OWNED will now substitute the new owner OID for the old
in pg_init_privs entries.  This feels like perhaps not quite the
right thing, since pg_init_privs ought to be a historical record
of the state of affairs just after CREATE EXTENSION.  However,
it's hard to see what else to do, if we don't want to disallow
dropping the object's original owner.  In any case this is
better than the previous do-nothing behavior, and we're unlikely
to come up with a superior solution in time for v17.

While here, tighten up some coding rules about how ACLs in
pg_init_privs should never be null or empty.  There's not any
obvious reason to allow that, and perhaps asserting that it's
not so will catch some bugs.  (We were previously inconsistent
on the point, with some code paths taking care not to store
empty ACLs and others not.)

This leaves recordExtensionInitPrivWorker not doing anything
with its ownerId argument, but we'll deal with that separately.

catversion bump forced because of change of expected contents
of pg_shdepend when pg_init_privs entries exist.

Discussion: https://postgr.es/m/CAMT0RQSVgv48G5GArUvOVhottWqZLrvC5wBzBa4HrUdXe9VRXw@mail.gmail.com
2024-06-17 12:55:10 -04:00
Masahiko Sawada
f1affb6705 Reintroduce dead tuple counter in pg_stat_progress_vacuum.
Commit 667e65aac3 changed both num_dead_tuples and max_dead_tuples
columns to dead_tuple_bytes and max_dead_tuple_bytes columns,
respectively. But as per discussion, the number of dead tuples
collected still provides meaningful insights for users.

This commit reintroduces the column for the count of dead tuples,
renamed as num_dead_item_ids. It avoids confusion with the number of
dead tuples removed by VACUUM, which includes dead heap-only tuples
but excludes any pre-existing LP_DEAD items left behind by
opportunistic pruning.

Bump catalog version.

Reviewed-by: Peter Geoghegan, Álvaro Herrera, Andrey Borodin
Discussion: https://postgr.es/m/CAD21AoBL5sJE9TRWPyv%2Bw7k5Ee5QAJqDJEDJBUdAaCzGWAdvZw%40mail.gmail.com
2024-06-14 10:08:15 +09:00
Peter Geoghegan
6207f08f70 Harmonize function parameter names for Postgres 17.
Make sure that function declarations use names that exactly match the
corresponding names from function definitions in a few places.  These
inconsistencies were all introduced during Postgres 17 development.

pg_bsd_indent still has a couple of similar inconsistencies, which I
(pgeoghegan) have left untouched for now.

This commit was written with help from clang-tidy, by mechanically
applying the same rules as similar clean-up commits (the earliest such
commit was commit 035ce1fe).
2024-06-12 17:01:51 -04:00
Peter Eisentraut
3482bab5e3 meson: Restore implicit warning/debug/optimize flags for extensions
Meson uses warning/debug/optimize flags such as "-Wall", "-g", and
"-O2" automatically based on "--warnlevel" and "--buildtype" options.
And we use "--warning_level=1" and "--buildtype=debugoptimized" by
default.

But we need these flags for Makefile.global (for extensions) and
pg_config, so we need to compute them manually based on the
higher-level options.

Without this change, extensions building using pgxs wouldn't get -Wall
or optimization options.

Author: Sutou Kouhei <kou@clear-code.com>
Discussion: https://www.postgresql.org/message-id/flat/20240122.141139.931086145628347157.kou%40clear-code.com
2024-06-07 09:36:26 +02:00
Alexander Korotkov
505c008ca3 Restore preprocess_groupclause()
0452b461bc made optimizer explore alternative orderings of group-by pathkeys.
It eliminated preprocess_groupclause(), which was intended to match items
between GROUP BY and ORDER BY.  Instead, get_useful_group_keys_orderings()
function generates orderings of GROUP BY elements at the time of grouping
paths generation.  The get_useful_group_keys_orderings() function takes into
account 3 orderings of GROUP BY pathkeys and clauses: original order as written
in GROUP BY, matching ORDER BY clauses as much as possible, and matching the
input path as much as possible.  Given that even before 0452b461b,
preprocess_groupclause() could change the original order of GROUP BY clauses
we don't need to consider it apart from ordering matching ORDER BY clauses.

This commit restores preprocess_groupclause() to provide an ordering of
GROUP BY elements matching ORDER BY before generation of paths.  The new
version of preprocess_groupclause() takes into account an incremental sort.
The get_useful_group_keys_orderings() function now takes into 2 orderings of
GROUP BY elements: the order generated preprocess_groupclause() and the order
matching the input path as much as possible.

Discussion: https://postgr.es/m/CAPpHfdvyWLMGwvxaf%3D7KAp-z-4mxbSH8ti2f6mNOQv5metZFzg%40mail.gmail.com
Author: Alexander Korotkov
Reviewed-by: Andrei Lepikhov, Pavel Borisov
2024-06-06 13:44:34 +03:00
Alexander Korotkov
0c1af2c35c Rename PathKeyInfo to GroupByOrdering
0452b461bc made optimizer explore alternative orderings of group-by pathkeys.
The PathKeyInfo data structure was used to store the particular ordering of
group-by pathkeys and corresponding clauses.  It turns out that PathKeyInfo
is not the best name for that purpose.  This commit renames this data structure
to GroupByOrdering, and revises its comment.

Discussion: https://postgr.es/m/db0fc3a4-966c-4cec-a136-94024d39212d%40postgrespro.ru
Reported-by: Tom Lane
Author: Andrei Lepikhov
Reviewed-by: Alexander Korotkov, Pavel Borisov
2024-06-06 13:43:24 +03:00
Alexander Korotkov
199012a3d8 Fix asymmetry in setting EquivalenceClass.ec_sortref
0452b461bc made get_eclass_for_sort_expr() always set
EquivalenceClass.ec_sortref if it's not done yet.  This leads to an asymmetric
situation when whoever first looks for the EquivalenceClass sets the
ec_sortref.  It is also counterintuitive that get_eclass_for_sort_expr()
performs modification of data structures.

This commit makes make_pathkeys_for_sortclauses_extended() responsible for
setting EquivalenceClass.ec_sortref.  Now we set the
EquivalenceClass.ec_sortref's needed to explore alternative GROUP BY ordering
specifically during building pathkeys by the list of grouping clauses.

Discussion: https://postgr.es/m/17037754-f187-4138-8285-0e2bfebd0dea%40postgrespro.ru
Reported-by: Tom Lane
Author: Andrei Lepikhov
Reviewed-by: Alexander Korotkov, Pavel Borisov
2024-06-06 13:41:34 +03:00
Dean Rasheed
5c5bccef21 Fix another couple of outdated comments for MERGE RETURNING.
Oversights in c649fa24a4 which added RETURNING support to MERGE.

Discussion: https://postgr.es/m/CAApHDvpqp6vtUzG-_josUEiBGyqnrnVxJ-VdF+hJLXjHdHzsyQ@mail.gmail.com
2024-06-04 09:29:42 +01:00
David Rowley
8559252095 Fix a couple of outdated comments now that we have MERGE RETURNING
This has been supported since c649fa24a.

Discussion: https://postgr.es/m/CAApHDvpqp6vtUzG-_josUEiBGyqnrnVxJ-VdF+hJLXjHdHzsyQ@mail.gmail.com
2024-05-23 15:24:54 +12:00
Robert Haas
12933dc604 Re-allow planner to use Merge Append to efficiently implement UNION.
This reverts commit 7204f35919,
thus restoring 66c0185a3 (Allow planner to use Merge Append to
efficiently implement UNION) as well as the follow-on commits
d5d2205c8, 3b1a7eb28, 7487044d6.

Per further discussion on pgsql-release, we wish to ship beta1 with
this feature, and patch the bug that was found just before wrap,
rather than shipping beta1 with the feature reverted.
2024-05-21 12:44:51 -04:00
Tom Lane
7204f35919 Revert commit 66c0185a3 and follow-on patches.
This reverts 66c0185a3 (Allow planner to use Merge Append to
efficiently implement UNION) as well as the follow-on commits
d5d2205c8, 3b1a7eb28, 7487044d6.  In addition to those, 07746a8ef
had to be removed then re-applied in a different place, because
66c0185a3 moved the relevant code.

The reason for this last-minute thrashing is that depesz found a
case in which the patched code creates a completely wrong plan
that silently gives incorrect query results.  It's unclear what
the cause is or how many cases are affected, but with beta1 wrap
staring us in the face, there's no time for closer investigation.
After we figure that out, we can decide whether to un-revert this
for beta2 or hold it for v18.

Discussion: https://postgr.es/m/Zktzf926vslR35Fv@depesz.com
(also some private discussion among pgsql-release)
2024-05-20 15:08:30 -04:00
Peter Eisentraut
be5942aee7 Remove unused typedefs
There were a few typedefs that were never used to define a variable or
field.  This had the effect that the buildfarm's typedefs.list would
not include them, and so they would need to be re-added manually to
keep the overall pgindent result perfectly clean.

We can easily get rid of these typedefs to avoid the issue.  In a few
cases, we just let the enum or struct stand on its own without a
typedef around it.  In one case, an enum was abused to define flag
bits; that's better done with macro definitions.

This fixes all the remaining issues with missing entries in the
buildfarm's typedefs.list.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/1919000.1715815925@sss.pgh.pa.us
2024-05-17 07:36:12 +02:00
Michael Paquier
110eb4aefb Remove enum WaitEventExtension
This enum was used to determine the first ID to use when assigning a
custom wait event for extensions, which is always 1.  It was kept so
as it would be possible to add new in-core wait events in the category
"Extension".  There is no such thing currently, so let's remove this
enum until a case justifying it pops up.  This makes the code simpler
and easier to understand.

This has as effect to switch back autoprewarm.c to use PG_WAIT_EXTENSION
rather than WAIT_EVENT_EXTENSION, on par with v16 and older stable
branches.

Thinko in c9af054653.

Reported-by: Peter Eisentraut
Discussion: https://postgr.es/m/195c6c45-abce-4331-be6a-e87724e1d060@eisentraut.org
2024-05-17 12:29:57 +09:00
Peter Eisentraut
8aee330af5 Revert temporal primary keys and foreign keys
This feature set did not handle empty ranges correctly, and it's now
too late for PostgreSQL 17 to fix it.

The following commits are reverted:

    6db4598fcb Add stratnum GiST support function
    46a0cd4cef Add temporal PRIMARY KEY and UNIQUE constraints
    86232a49a4 Fix comment on gist_stratnum_btree
    030e10ff1a Rename pg_constraint.conwithoutoverlaps to conperiod
    a88c800deb Use daterange and YMD in without_overlaps tests instead of tsrange.
    5577a71fb0 Use half-open interval notation in without_overlaps tests
    34768ee361 Add temporal FOREIGN KEY contraints
    482e108cd3 Add test for REPLICA IDENTITY with a temporal key
    c3db1f30cb doc:  clarify PERIOD and WITHOUT OVERLAPS in CREATE TABLE
    144c2ce0cc Fix ON CONFLICT DO NOTHING/UPDATE for temporal indexes

Discussion: https://www.postgresql.org/message-id/d0b64a7a-dfe4-4b84-a906-c7dedfa40a3e@eisentraut.org
2024-05-16 08:17:46 +02:00
David Rowley
8cb653b245 Fix incorrect year in some copyright notices added this year
A few patches that were written <= 2023 and committed in 2024 still
contained 2023 copyright year.  Fix that.

Discussion: https://postgr.es/m/CAApHDvr5egyW3FmHbAg-Uq2p_Aizwco1Zjs6Vbq18KqN64-hRA@mail.gmail.com
2024-05-15 15:01:21 +12:00
Tom Lane
245f1cec59 Do pre-release housekeeping on catalog data.
Run renumber_oids.pl to move high-numbered OIDs down, as per pre-beta
tasks specified by RELEASE_CHANGES.  For reference, the command was

./renumber_oids.pl --first-mapped-oid 8000 --target-oid 6300
2024-05-14 16:46:38 -04:00
Tom Lane
da256a4a7f Pre-beta mechanical code beautification.
Run pgindent, pgperltidy, and reformat-dat-files.

The pgindent part of this is pretty small, consisting mainly of
fixing up self-inflicted formatting damage from patches that
hadn't bothered to add their new typedefs to typedefs.list.
In order to keep it from making anything worse, I manually added
a dozen or so typedefs that appeared in the existing typedefs.list
but not in the buildfarm's list.  Perhaps we should formalize that,
or better find a way to get those typedefs into the automatic list.

pgperltidy is as opinionated as always, and reformat-dat-files too.
2024-05-14 16:34:50 -04:00
Tom Lane
1f7452fa59 Remove COMMAND_TAG_NEXTTAG from enum CommandTag.
COMMAND_TAG_NEXTTAG isn't really a valid command tag.  Declaring it
as if it were one prompts complaints from Coverity and perhaps other
static analyzers.  Our only use of it is in an entirely-unnecessary
array sizing declaration, so let's just drop it.

Ranier Vilela

Discussion: https://postgr.es/m/CAEudQAoY0xrKuTAX7W10zsjjUpKBPFRtdCyScb3Z0FB2v6HNmQ@mail.gmail.com
2024-05-13 13:52:17 -04:00
Alvaro Herrera
6f8bb7c1e9
Revert structural changes to not-null constraints
There are some problems with the new way to handle these constraints
that were detected at the last minute, and require fixes that appear too
invasive to be doing this late in the cycle.  Revert this (again) for
now, we'll try again with these problems fixed.

The following commits are reverted:

    b0e96f3119  Catalog not-null constraints
    9b581c5341  Disallow changing NO INHERIT status of a not-null constraint
    d0ec2ddbe0  Fix not-null constraint test
    ac22a9545c  Move privilege check to the right place
    b0f7dd915b  Check stack depth in new recursive functions
    3af7217942  Update information_schema definition for not-null constraints
    c3709100be  Fix propagating attnotnull in multiple inheritance
    d9f686a72e  Fix restore of not-null constraints with inheritance
    d72d32f52d  Don't try to assign smart names to constraints
    0cd711271d  Better handle indirect constraint drops
    13daa33fa5  Disallow NO INHERIT not-null constraints on partitioned tables
    d45597f72f  Disallow direct change of NO INHERIT of not-null constraints
    21ac38f498  Fix inconsistencies in error messages

Discussion: https://postgr.es/m/202405110940.joxlqcx4dogd@alvherre.pgsql
2024-05-13 11:31:09 +02:00
Michael Paquier
33181b48fd Introduce private data area for injection points
This commit extends the backend-side infrastructure of injection points
so as it becomes possible to register some input data when attaching a
point.  This private data can be registered with the function name and
the library name of the callback when attaching a point, then it is
given as input argument to the callback.  This gives the possibility for
modules to pass down custom data at runtime when attaching a point
without managing that internally, in a manner consistent with the
callback entry retrieved from the hash shmem table storing the injection
point data.

InjectionPointAttach() gains two arguments, to be able to define the
private data contents and its size.

A follow-up commit will rely on this infrastructure to close a race
condition with the injection point detach in the module
injection_points.

While on it, this changes InjectionPointDetach() to return a boolean,
returning false if a point cannot be detached.  This has been mentioned
by Noah as useful when it comes to implement more complex tests with
concurrent point detach, solid with the automatic detach done for local
points in the test module.

Documentation is adjusted in consequence.

Per discussion with Noah Misch.

Reviewed-by: Noah Misch
Discussion: https://postgr.es/m/20240509031553.47@rfd.leadboat.com
2024-05-12 18:53:06 +09:00
Heikki Linnakangas
407e0b023c Change ALPN protocol ID to IANA-approved "postgresql"
"TBD-pgsql" was a placeholder until the IANA registration was
approved.

Discussion: https://www.postgresql.org/message-id/87jzk2hj2n.fsf%40wibble.ilmari.org
Discussion: https://mailarchive.ietf.org/arch/msg/tls-reg-review/9LWPzQfOpbc8dTT7vc9ahNeNaiw/
2024-05-11 18:48:19 +03:00
Tom Lane
9effc4608e Repair ALTER EXTENSION ... SET SCHEMA.
It turns out that we broke this in commit e5bc9454e, because
the code was assuming that no dependent types would appear
among the extension's direct dependencies, and now they do.

This isn't terribly hard to fix: just skip dependent types,
expecting that we will recurse to them when we process the parent
object (which should also be among the direct dependencies).
But a little bit of refactoring is needed so that we can avoid
duplicating logic about what is a dependent type.

Although there is some testing of ALTER EXTENSION SET SCHEMA,
it failed to cover interesting cases, so add more tests.

Discussion: https://postgr.es/m/930191.1715205151@sss.pgh.pa.us
2024-05-09 12:19:52 -04:00
Peter Eisentraut
509199587d Fix assorted bugs related to identity column in partitioned tables
When changing the data type of a column of a partitioned table, craft
the ALTER SEQUENCE command only once.  Partitions do not have identity
sequences of their own and thus do not need a ALTER SEQUENCE command
for each partition.

Fix getIdentitySequence() to fetch the identity sequence associated
with the top-level partitioned table when a Relation of a partition is
passed to it.  While doing so, translate the attribute number of the
partition into the attribute number of the partitioned table.

Author: Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Reviewed-by: Dmitry Dolgov <9erthalion6@gmail.com>
Discussion: https://www.postgresql.org/message-id/3b8a9dc1-bbc7-0ef5-6863-c432afac7d59@gmail.com
2024-05-07 22:50:00 +02:00
Nathan Bossart
521a7156ab Fix privilege checks in pg_stats_ext and pg_stats_ext_exprs.
The catalog view pg_stats_ext fails to consider privileges for
expression statistics.  The catalog view pg_stats_ext_exprs fails
to consider privileges and row-level security policies.  To fix,
restrict the data in these views to table owners or roles that
inherit privileges of the table owner.  It may be possible to apply
less restrictive privilege checks in some cases, but that is left
as a future exercise.  Furthermore, for pg_stats_ext_exprs, do not
return data for tables with row-level security enabled, as is
already done for pg_stats_ext.

On the back-branches, a fix-CVE-2024-4317.sql script is provided
that will install into the "share" directory.  This file can be
used to apply the fix to existing clusters.

Bumps catversion on 'master' branch only.

Reported-by: Lukas Fittl
Reviewed-by: Noah Misch, Tomas Vondra, Tom Lane
Security: CVE-2024-4317
Backpatch-through: 14
2024-05-06 09:00:00 -05:00
Alexander Korotkov
d1d286d83c Revert: Remove useless self-joins
This commit reverts d3d55ce571 and subsequent fixes 2b26a69455, 93c85db3b5,
b44a1708ab, b7f315c9d7, 8a8ed916f7, b5fb6736ed, 0a93f803f4, e0477837ce,
a7928a57b9, 5ef34a8fc3, 30b4955a46, 8c441c0827, 028b15405b, fe093994db,
489072ab7a, and 466979ef03.

We are quite late in the release cycle and new bugs continue to appear.  Even
though we have fixes for all known bugs, there is a risk of throwing many
bugs to end users.

The plan for self-join elimination would be to do more review and testing,
then re-commit in the early v18 cycle.

Reported-by: Tom Lane
Discussion: https://postgr.es/m/2422119.1714691974%40sss.pgh.pa.us
2024-05-06 14:36:36 +03:00
David Rowley
7d2c7f08d9 Fix query pullup issue with WindowClause runCondition
94985c210 added code to detect when WindowFuncs were monotonic and
allowed additional quals to be "pushed down" into the subquery to be
used as WindowClause runConditions in order to short-circuit execution
in nodeWindowAgg.c.

The Node representation of runConditions wasn't well selected and
because we do qual pushdown before planning the subquery, the planning
of the subquery could perform subquery pull-up of nested subqueries.
For WindowFuncs with args, the arguments could be changed after pushing
the qual down to the subquery.

This was made more difficult by the fact that the code duplicated the
WindowFunc inside an OpExpr to include in the WindowClauses runCondition
field.  This could result in duplication of subqueries and a pull-up of
such a subquery could result in another initplan parameter being issued
for the 2nd version of the subplan.  This could result in errors such as:

ERROR:  WindowFunc not found in subplan target lists

To fix this, we change the node representation of these run conditions
and instead of storing an OpExpr containing the WindowFunc in a list
inside WindowClause, we now store a new node type named
WindowFuncRunCondition within a new field in the WindowFunc.  These get
transformed into OpExprs later in planning once subquery pull-up has been
performed.

This problem did exist in v15 and v16, but that was fixed by 9d36b883b
and e5d20bbd.

Cat version bump due to new node type and modifying WindowFunc struct.

Bug: #18305
Reported-by: Zuming Jiang
Discussion: https://postgr.es/m/18305-33c49b4c830b37b3%40postgresql.org
2024-05-05 12:54:46 +12:00
David Rowley
a42fc1c903 Fix an assortment of typos
Author: Alexander Lakhin
Discussion: https://postgr.es/m/ae9f2fcb-4b24-5bb0-4240-efbbbd944ca1@gmail.com
2024-05-04 02:33:25 +12:00
Alvaro Herrera
d45597f72f
Disallow direct change of NO INHERIT of not-null constraints
We support changing NO INHERIT constraint to INHERIT for constraints in
child relations when adding a constraint to some ancestor relation, and
also during pg_upgrade's schema restore; but other than those special
cases, command ALTER TABLE ADD CONSTRAINT should not be allowed to
change an existing constraint from NO INHERIT to INHERIT, as that would
require to process child relations so that they also acquire an
appropriate constraint, which we may not be in a position to do.  (It'd
also be surprising behavior.)

It is conceivable that we want to allow ALTER TABLE SET NOT NULL to make
such a change; but in that case some more code is needed to implement it
correctly, so for now I've made that throw the same error message.

Also, during the prep phase of ALTER TABLE ADD CONSTRAINT, acquire locks
on all descendant tables; otherwise we might operate on child tables on
which no locks are held, particularly in the mode where a primary key
causes not-null constraints to be created on children.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/7d923a66-55f0-3395-cd40-81c142b5448b@gmail.com
2024-05-02 17:26:30 +02:00
David Rowley
a63224be49 Ensure we allocate NAMEDATALEN bytes for names in Index Only Scans
As an optimization, we store "name" columns as cstrings in btree
indexes.

Here we modify it so that Index Only Scans convert these cstrings back
to names with NAMEDATALEN bytes rather than storing the cstring in the
tuple slot, as was happening previously.

Bug: #17855
Reported-by: Alexander Lakhin
Reviewed-by: Alexander Lakhin, Tom Lane
Discussion: https://postgr.es/m/17855-5f523e0f9769a566@postgresql.org
Backpatch-through: 12, all supported versions
2024-05-01 13:21:21 +12:00
Tom Lane
5342874039 Fix failure to track role dependencies of pg_init_privs entries.
If an ACL recorded in pg_init_privs mentions a non-pinned role,
that reference must also be noted in pg_shdepend so that we know
that the role can't go away without removing the ACL reference.
Otherwise, DROP ROLE could succeed and leave dangling entries
behind, which is what's causing the recent upgrade-check failures
on buildfarm member copperhead.

This has been wrong since pg_init_privs was introduced, but it's
escaped notice because typical pg_init_privs entries would only
mention the bootstrap superuser (pinned) or at worst the owner
of the extension (who can't go away before the extension does).

We lack even a representation of such a role reference for
pg_shdepend.  My first thought for a solution was entries listing
pg_init_privs in classid, but that doesn't work because then there's
noplace to put the granted-on object's classid.  Rather than adding
a new column to pg_shdepend, let's add a new deptype code
SHARED_DEPENDENCY_INITACL.  Much of the associated boilerplate
code can be cribbed from code for SHARED_DEPENDENCY_ACL.

A lot of the bulk of this patch just stems from the new need to pass
the object's owner ID to recordExtensionInitPriv, so that we can
consult it while updating pg_shdepend.  While many callers have that
at hand already, a few places now need to fetch the owner ID of an
arbitrary privilege-bearing object.  For that, we assume that there
is a catcache on the relevant catalog's OID column, which is an
assumption already made in ExecGrant_common so it seems okay here.

We do need an entirely new routine RemoveRoleFromInitPriv to perform
cleanup of pg_init_privs ACLs during DROP OWNED BY.  It's analogous
to RemoveRoleFromObjectACL, but we can't share logic because that
function operates by building a command parsetree and invoking
existing GRANT/REVOKE infrastructure.  There is of course no SQL
command that would update pg_init_privs entries when we're not in
process of creating their extension, so we need a routine that can
do the updates directly.

catversion bump because this changes the expected contents of
pg_shdepend.  For the same reason, there's no hope of back-patching
this, even though it fixes a longstanding bug.  Fortunately, the
case where it's a problem seems to be near nonexistent in the field.
If it weren't for the buildfarm breakage, I'd have been content to
leave this for v18.

Patch by me; thanks to Daniel Gustafsson for review and discussion.

Discussion: https://postgr.es/m/1745535.1712358659@sss.pgh.pa.us
2024-04-29 19:26:19 -04:00
Peter Eisentraut
592a228372 Revert "Add GUC backtrace_on_internal_error"
This reverts commit a740b213d4.

Subsequent discussion showed that there was interest in a more general
facility to configure when server log events would produce backtraces,
and this existing limited way couldn't be extended in a compatible
way.  So the consensus was to revert this for PostgreSQL 17 and
reconsider this topic for PostgreSQL 18.

Discussion: https://www.postgresql.org/message-id/flat/CAGECzQTChkvn5Xj772LB3%3Dxo2x_LcaO5O0HQvXqobm1xVp6%2B4w%40mail.gmail.com#764bcdbb73e162787e1ad984935e51e3
2024-04-29 10:49:42 +02:00