This commit adds support for the restore of extended statistics of the
kind "dependencies", for the following input data:
[{"attributes": [2], "dependency": 3, "degree": 1.000000},
{"attributes": [3], "dependency": 2, "degree": 1.000000}]
This relies on the existing routines of "dependencies" to cross-check
the input data with the definition of the extended statistics objects
for the attribute numbers. An input argument of type "pg_dependencies"
is required for this new option.
Thanks to the work done in 0e80f3f88d for the restore function and
e1405aa5e3 for the input handling of data type pg_dependencies, this
addition is straight-forward. This will be used so as it is possible to
transfer these statistics across dumps and upgrades, removing the need
for a post-operation ANALYZE for these kinds of statistics.
Author: Corey Huinker <corey.huinker@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CADkLM=dpz3KFnqP-dgJ-zvRvtjsa8UZv8wDAQdqho=qN3kX0Zg@mail.gmail.com
The psql documentation for the \d and \d+ meta-commands lists objects
that may be shown, but previously the wording could be read as exhaustive
even though additional objects can also appear in the output.
This commit clarifies the description by adding phrasing such as "for example"
or "such as", making it clear that the listed objects are illustrative
rather than a complete list. While the change is small, it helps avoid
potential user confusion.
As this is a documentation clarification rather than a bug fix,
it is not backpatched.
Author: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Andreas Karlsson <andreas@proxel.se>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/CAHut+Pt1DBtaUqfJftkkaQLJJJenYJBtb6Ec6s6vu82KEMh46A@mail.gmail.com
This function closely mirror its relation and attribute counterparts,
but for extended statistics (i.e. CREATE STATISTICS) objects, being
able to restore extended statistics for an extended stats object. Like
the other functions, the goal of this feature is to ease the dump or
upgrade of clusters so as ANALYZE would not be required anymore after
these operations, stats being directly loaded into the target cluster
without any post-dump/upgrade computation.
The caller of this function needs the following arguments for the
extended stats to restore:
- The name of the relation.
- The schema name of the relation.
- The name of the extended stats object.
- The schema name of the extended stats object.
- If the stats are inherited or not.
- One or more extended stats kind with its data.
This commit adds only support for the restore of the extended statistics
kind "n_distinct", building the basic infrastructure for the restore
of more extended statistics kinds in follow-up commits, including MVC
and dependencies.
The support for "n_distinct" is eased in this commit thanks to the
previous work done particularly in commits 1f927cce44 and
44eba8f06e, that have added the input function for the type
pg_ndistinct, used as data type in input of this new restore function.
Bump catalog version.
Author: Corey Huinker <corey.huinker@gmail.com>
Co-authored-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/CADkLM=dpz3KFnqP-dgJ-zvRvtjsa8UZv8wDAQdqho=qN3kX0Zg@mail.gmail.com
Previously, a WAL receiver freshly started would set its state to
WALRCV_STREAMING immediately at startup, before actually establishing a
replication connection.
This commit introduces a new state called WALRCV_CONNECTING, which is
the state used when the WAL receiver freshly starts, or when a restart
is requested, with a switch to WALRCV_STREAMING once the connection to
the upstream server has been established with COPY_BOTH, meaning that
the WAL receiver is ready to stream changes. This change is useful for
monitoring purposes, especially in environments with a high latency
where a connection could take some time to be established, giving some
room between the [re]start phase and the streaming activity.
From the point of view of the startup process, that flips the shared
memory state of the WAL receiver when it needs to be stopped, the
existing WALRCV_STREAMING and the new WALRCV_CONNECTING states have the
same semantics: the WAL receiver has started and it can be stopped.
Based on an initial suggestion from Noah Misch, with some input from me
about the design.
Author: Xuneng Zhou <xunengzhou@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Rahila Syed <rahilasyed90@gmail.com>
Discussion: https://postgr.es/m/CABPTF7VQ5tGOSG5TS-Cg+Fb8gLCGFzxJ_eX4qg+WZ3ZPt=FtwQ@mail.gmail.com
When a range type is created, several construction functions are also
created, two for the range type and three for the multirange type.
These have an internal dependency, so they "belong" to the range type.
But there was no way to identify those functions when given a range
type. An upcoming patch needs access to the two- or possibly the
three-argument range constructor function for a given range type. The
only way to do that would be with fragile workarounds like matching
names and argument types. The correct way to do that kind of thing is
to record to the links in the system catalogs. This is what this
patch does, it records the OIDs of these five constructor functions in
the pg_range catalog. (Currently, there is no code that makes use of
this.)
Reviewed-by: Paul A Jungwirth <pj@illuminatedcomputing.com>
Reviewed-by: Kirill Reshke <reshkekirill@gmail.com>
Discussion: https://www.postgresql.org/message-id/7d63ddfa-c735-4dfe-8c7a-4f1e2a621058%40eisentraut.org
There were many PG_GETARG_* calls, mostly around gin, gist, spgist
code, that were commented out, presumably to indicate that the
argument was unused and to indicate that it wasn't forgotten or
miscounted. But keeping commented-out code updated with refactorings
and style changes is annoying. So this commit changes them to
#ifdef NOT_USED
blocks, which is a style already in use. That way, at least the
indentation and syntax highlighting works correctly, making some of
these blocks much easier to read.
An alternative would be to just delete that code, but there is some
value in making unused arguments explicit, and some of this arguably
serves as example code for index AM APIs.
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: David Geier <geidav.pg@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/328e4371-9a4c-4196-9df9-1f23afc900df%40eisentraut.org
The possible values of pg_stat_wal_receiver.status have never been
documented. Note that the status "stopped" will never show up in this
view, hence there is no need to document it.
Issue noticed while discussing a patch that aims to add a new status to
WAL receiver.
Author: Xuneng Zhou <xunengzhou@gmail.com>
Discussion: https://postgr.es/m/CABPTF7X_Jgmyk1FBVNf3tyAOKqU55LLpLMzWkGtEAb_jQWVN=g@mail.gmail.com
Commit bc2f348 introduced multi-line HEADER support for COPY. This commit
extends this capability to file_fdw, allowing multiple header lines to be
skipped.
Because CREATE/ALTER FOREIGN TABLE requires option values to be single-quoted,
this commit also updates defGetCopyHeaderOption() to accept integer values
specified as strings for HEADER option.
Author: Shinya Kato <shinya11.kato@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: songjinzhou <tsinghualucky912@foxmail.com>
Reviewed-by: Japin Li <japinli@hotmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/CAOzEurT+iwC47VHPMS+uJ4WSzvOLPsZ2F2_wopm8M7O+CZa3Xw@mail.gmail.com
Continuing to support this backwards-compatibility feature has
nontrivial costs; in particular it is potentially a security hazard
if an application somehow gets confused about which setting the
server is using. We changed the default to ON fifteen years ago,
which seems like enough time for applications to have adapted.
Let's remove support for the legacy string syntax.
We should not remove the GUC altogether, since client-side code will
still test it, pg_dump scripts will attempt to set it to ON, etc.
Instead, just prevent it from being set to OFF. There is precedent
for this approach (see commit de66987ad).
This patch does remove the related GUC escape_string_warning, however.
That setting does nothing when standard_conforming_strings is on,
so it's now useless. We could leave it in place as a do-nothing
setting to avoid breaking clients that still set it, if there are any.
But it seems likely that any such client is also trying to turn off
standard_conforming_strings, so it'll need work anyway.
The client-side changes in this patch are pretty minimal, because even
though we are dropping the server's support, most of our clients still
need to be able to talk to older server versions. We could remove
dead client code only once we disclaim compatibility with pre-v19
servers, which is surely years away. One change of note is that
pg_dump/pg_dumpall now set standard_conforming_strings = on in their
source session, rather than accepting the source server's default.
This ensures that literals in view definitions and such will be
printed in a way that's acceptable to v19+. In particular,
pg_upgrade will work transparently even if the source installation has
standard_conforming_strings = off. (However, pg_restore will behave
the same as before if given an archive file containing
standard_conforming_strings = off. Such an archive will not be safely
restorable into v19+, but we shouldn't break the ability to extract
valid data from it for use with an older server.)
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/3279216.1767072538@sss.pgh.pa.us
This change enhances the clarity and usefulness of error detail messages
generated during logical replication conflicts. The following improvements
have been made:
1. Eliminate redundant output: Avoid printing duplicate remote row and
replica identity values for the multiple_unique_conflicts conflict type.
2. Improve message structure: Append tuple values directly to the main
error message, separated by a colon (:), for better readability.
3. Simplify local row terminology: Remove the word 'existing' when
referring to the local row, as this is already implied by context.
4. General code refinements: Apply miscellaneous code cleanups to improve
how conflict detail messages are constructed and formatted.
Author: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Shveta Malik <shveta.malik@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Zhijie Hou <houzj.fnst@fujitsu.com>
Discussion: https://postgr.es/m/CAHut+Psgkwy5-yGRJC15izecySGGysrbCszv_z93ess8XtCDOQ@mail.gmail.com
This function is able to clear the data associated to an extended
statistics object, making things so as the object looks as
newly-created.
The caller of this function needs the following arguments for the
extended stats to clear:
- The name of the relation.
- The schema name of the relation.
- The name of the extended stats object.
- The schema name of the extended stats object.
- If the stats are inherited or not.
The first two parameters are especially important to ensure a consistent
lookup and ACL checks for the relation on which is based the extended
stats object that will be cleared, relying first on a RangeVar lookup
where permissions are checked without locking a relation, critical to
prevent denial-of-service attacks when using this kind of function (see
also 688dc6299a for a similar concern). The third to fifth arguments
give a way to target the extended stats records to clear.
This has been extracted from a larger patch by the same author, for a
piece which is again useful on its own. I have rewritten large portions
of it. The tests have been extended while discussing this piece,
resulting on what this commit includes. The intention behind this
feature is to add support for the import of extended statistics across
dumps and upgrades, this change building one piece that we will be able
to rely on for the rest of the changes.
Bump catalog version.
Author: Corey Huinker <corey.huinker@gmail.com>
Co-authored-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/CADkLM=dpz3KFnqP-dgJ-zvRvtjsa8UZv8wDAQdqho=qN3kX0Zg@mail.gmail.com
Commit 9f8377f7a introduced the DEFAULT option for file_fdw but did not
update the documentation. This commit adds the missing description of
the DEFAULT option to the file_fdw documentation.
Backpatch to v16, where the DEFAULT option was introduced.
Author: Shinya Kato <shinya11.kato@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/CAOzEurT_PE7QEh5xAdb7Cja84Rur5qPv2Fzt3Tuqi=NU0WJsbg@mail.gmail.com
Backpatch-through: 16
The parameter name used for the option value was named "number-of-jobs",
which was inconsistent with what all the other tools with an option
called --jobs use. This commit updates the parameter name to "njobs".
Author: Tatsuro Yamada <yamatattsu@gmail.com>
Discussion: https://postgr.es/m/CAOKkKFvHqA6Tny0RKkezWVfVV91nPJyj4OGtMi3C1RznDVXqrg@mail.gmail.com
Reword publish_via_partition_root's opening paragraph. Describe its
behavior more clearly, and directly state that its default is false.
Per complaint by Peter Smith; final text of the patch made in
collaboration with Chao Li.
Author: Chao Li <li.evan.chao@gmail.com>
Author: Peter Smith <peter.b.smith@fujitsu.com>
Reported-by: Peter Smith <peter.b.smith@fujitsu.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/CAHut%2BPu7SpK%2BctOYoqYR3V4w5LKc9sCs6c_qotk9uTQJQ4zp6g%40mail.gmail.com
Backpatch-through: 14
This patch completes the transition to making inet_ops be default for
inet/cidr columns, rather than btree_gist's opclasses. Once we do
that, though, pg_upgrade has a big problem. A dump from an older
version will see btree_gist's opclasses as being default, so it will
not mention the opclass explicitly in CREATE INDEX commands, which
would cause the restore to create the indexes using inet_ops. Since
that's not compatible with what's actually in the files, havoc would
ensue.
This isn't readily fixable, because the CREATE INDEX command strings
are built by the older server's pg_get_indexdef() function; pg_dump
hasn't nearly enough knowledge to modify those strings successfully.
Even if we cared to put in the work to make that happen in pg_dump,
it would be counterproductive because the end goal here is to get
people off of these opclasses. Allowing such indexes to persist
through pg_upgrade wouldn't advance that goal.
Therefore, this patch just adds code to pg_upgrade to detect indexes
that would be problematic and refuse to upgrade.
There's another issue too: even without any indexes to worry about,
pg_dump in binary-upgrade mode will reproduce the "CREATE OPERATOR
CLASS ... DEFAULT" commands for btree_gist's opclasses, and those
will fail because now we have a built-in opclass that provides a
conflicting default. We could ask users to drop the btree_gist
extension altogether before upgrading, but that would carry very
severe penalties. It would affect perfectly-valid indexes for other
data types, and it would drop operators that might be relied on in
views or other database objects. Instead, put a hack in DefineOpClass
to ignore the DEFAULT clauses for these opclasses when in
binary-upgrade mode. This will result in installing a version of
btree_gist that isn't quite the version it claims to be, but that can
be fixed by issuing ALTER EXTENSION UPDATE afterwards.
Since we don't apply that hack when not in binary-upgrade mode,
it is now impossible to install any version of btree_gist
less than 1.9 via CREATE EXTENSION.
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: https://postgr.es/m/2483812.1754072263@sss.pgh.pa.us
When repurposing a standby to a logical replica, pg_createsubscriber
uses for the new replica a set of configuration parameters saved into
postgresql.auto.conf, to force recovery patterns when the physical
replica is promoted.
While not wrong in practice, this approach can cause issues when forcing
again recovery on a logical replica or its base backup as the recovery
parameters are not reset on the target server once pg_createsubscriber
is done with the node.
This commit aims at improving the situation, by changing the way
recovery parameters are saved on the target node. Instead of writing
all the configuration to postgresql.auto.conf, this file now uses an
include_if_exists, that points to a pg_createsubscriber.conf. This new
file contains all the recovery configuration, and is renamed to
pg_createsubscriber.conf.disabled when pg_createsubscriber exits. This
approach resets the recovery parameters, and offers the benefit to keep
a trace of the setup used when the target node got promoted, for
debugging purposes. If pg_createsubscriber.conf cannot be renamed
(unlikely scenario), a warning is issued to inform users that a manual
intervention may be required to reset this configuration.
This commit includes a test case to demonstrate the problematic case: a
standby node created from a base backup of what was the target node of
pg_createsubscriber does not get confused when started. If removing
this new logic, the test fails with the standby not able to start due
to an incorrect recovery target setup, where the startup process fails
quickly with a FATAL.
I have provided the design idea for the patch, that Alyona has written
(with some code adjustments from me). This could be considered as a
bug, but after discussion this is put into the bucket for improvements.
Redesigning pg_createsubscriber would not be acceptable in the stable
branches anyway.
Author: Alyona Vinter <dlaaren8@gmail.com>
Reviewed-by: Ilyasov Ian <ianilyasov@outlook.com>
Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Andrey Rudometov <unlimitedhikari@gmail.com>
Discussion: https://postgr.es/m/CAGWv16K6L6Pzm99i1KiXLjFWx2bUS3DVsR6yV87-YR9QO7xb3A@mail.gmail.com
This commit does the following to get tests passing for
MSVC/AArch64:
* Implements spin_delay() with an ISB instruction (like we do for
gcc/clang on AArch64).
* Sets USE_ARMV8_CRC32C unconditionally. Vendor-supported versions
of Windows for AArch64 require at least ARMv8.1, which is where CRC
extension support became mandatory.
* Implements S_UNLOCK() with _InterlockedExchange(). The existing
implementation for MSVC uses _ReadWriteBarrier() (a compiler
barrier), which is insufficient for this purpose on non-TSO
architectures.
There are likely other changes required to take full advantage of
the hardware (e.g., atomics/arch-arm.h, simd.h,
pg_popcount_aarch64.c), but those can be dealt with later.
Author: Niyas Sait <niyas.sait@linaro.org>
Co-authored-by: Greg Burd <greg@burd.me>
Co-authored-by: Dave Cramer <davecramer@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: John Naylor <johncnaylorls@gmail.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Tested-by: Andrew Dunstan <andrew@dunslane.net>
Discussion: https://postgr.es/m/A6152C7C-F5E3-4958-8F8E-7692D259FF2F%40greg.burd.me
Discussion: https://postgr.es/m/CAFPTBD-74%2BAEuN9n7caJ0YUnW5A0r-KBX8rYoEJWqFPgLKpzdg%40mail.gmail.com
The USER and IN GROUP clauses of CREATE ROLE are deprecated, and
commit 8e78f0a1 removed them from the CREATE ROLE syntax syntax
synopsis in the docs. However, previously CREATE USER and
CREATE GROUP docs still listed these clauses.
Since CREATE USER is equivalent to CREATE ROLE ... WITH LOGIN and
CREATE GROUP is equivalent to CREATE ROLE, their documented syntax
synopsis should match CREATE ROLE to avoid confusion.
Therefore this commit removes the deprecated USER and IN GROUP
clauses from the CREATE USER and CREATE GROUP syntax synopsis
in the docs.
Author: Japin Li <japinli@hotmail.com>
Reviewed-by: wenhui qiu <qiuwenhuifx@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/MEAPR01MB3031C30E72EF16CFC08C8565B687A@MEAPR01MB3031.ausprd01.prod.outlook.com
Commit c7eab0e97 changed the default password_encryption setting to
'scram-sha-256', so update the example for creating a user with an
assigned password.
In addition, commit 08951a7c9 added new options that in turn pass
default tokens NOBYPASSRLS and NOREPLICATION to the CREATE ROLE
command, so fix this omission as well for v16 and later.
Reported-by: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: https://postgr.es/m/cff1ea60-c67d-4320-9e33-094637c2c4fb%40iki.fi
Backpatch-through: 14
GUC is a commonly used term but previously appeared only
in the acronym documentation. This commit adds glossary and
documentation index entries for GUC to make it easier to find
and understand.
Author: Robert Treat <rob@xzilla.net>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/CABV9wwPQnkeo_G6-orMGnHPK9SXGVWm7ajJPzsbE6944tDx=hQ@mail.gmail.com
This new identifier type provides support for 64-bit unsigned values,
to be used in catalogs, like OIDs. An advantage of a new data type is
that it becomes easier to grep for it in the code when assigning this
type to a catalog attribute, linking it to dedicated APIs and internal
structures.
The following operators are added in this commit, with dedicated tests:
- Casts with integer types and OID.
- btree and hash operators
- min/max functions.
- C type with related macros and defines, named around "Oid8".
This has been mentioned as useful on its own on the thread to add
support for 64-bit TOAST values, so as it becomes possible to attach
this data type to the TOAST code and catalog definitions. However, as
this concept can apply to many more areas, it is implemented as its own
independent change. This is based on a discussion with Andres Freund
and Tom Lane.
Bump catalog version.
Author: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Greg Burd <greg@burd.me>
Reviewed-by: Nikhil Kumar Veldanda <veldanda.nikhilkumar17@gmail.com>
Discussion: https://postgr.es/m/1891064.1754681536@sss.pgh.pa.us
Background workers gain a new flag, called BGWORKER_INTERRUPTIBLE, that
offers the possibility to terminate the workers when these are connected
to a database that is involved in one of the following commands:
ALTER DATABASE RENAME TO
ALTER DATABASE SET TABLESPACE
CREATE DATABASE
DROP DATABASE
This is useful to give background workers the same behavior as backends
and autovacuum workers, which are stopped when these commands are
executed. The default behavior, that exists since 9.3, is still to
never terminate bgworkers connected to the database involved in any of
these commands. The new flag has to be set to terminate the workers.
A couple of tests are added to worker_spi to track the commands that
impact the termination of the workers. There is a test case for a
non-interruptible worker, additionally, that relies on an injection
point to make the wait time in CountOtherDBBackends() reduced from 5s to
0.3s for faster test runs. The tests rely on the contents of the server
logs to check if a worker has been started or terminated:
- LOG generated by worker_spi_main() at startup, once connection to
database is done.
- FATAL in bgworker_die() when terminated.
A couple of tests run in the CI have showed that this method is stable
enough. The safe_psql() calls that scan pg_stat_activity could be
replaced with some poll_query_until() for more stability, if the current
method proves to be an issue in the buildfarm.
Author: Aya Iwata <iwata.aya@fujitsu.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Ryo Matsumura <matsumura.ryo@fujitsu.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Discussion: https://postgr.es/m/OS7PR01MB11964335F36BE41021B62EAE8EAE4A@OS7PR01MB11964.jpnprd01.prod.outlook.com
Update pg_rewind documentation to reflect the change that data checksums are
now enabled by default during initdb.
Backpatch to v18, where data checksums were changed to be enabled by default.
Author: Zhijie Hou <houzj.fnst@fujitsu.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/TY4PR01MB16907D62F3A0A377B30FDBEA794B2A@TY4PR01MB16907.jpnprd01.prod.outlook.com
Backpatch-through: 18
It's been almost a year since we last did this, and upstream has
been busy. They've added stemmers for Polish and Esperanto,
and also deprecated their old Dutch stemmer in favor of the
Kraaij-Pohlmann algorithm. (The "dutch" stemmer is now the
latter, and "dutch_porter" is the old algorithm.)
Upstream also decided to rename their internal header "header.h"
to something less generic: "snowball_runtime.h". Seems like a good
thing, but it complicates this patch a bit because we were relying on
interposing our own version of "header.h" to control system header
inclusion order. (We're partially failing at that now, because now the
generated stemmer files include <stddef.h> before snowball_runtime.h.
I think that'll be okay, but if the buildfarm complains then we'll
have to do more-extensive editing of the generated files.)
I realized that we weren't documenting the available stemmers in
any user-visible place, except indirectly through sample \dFd output.
That's incomplete because we only provide built-in dictionaries for
the recommended stemmers for each language, not alternative stemmers
such as dutch_porter. So I added a list to the documentation.
I did not do anything with the stopword lists. If those are still
available from snowballstem.org, they are mighty well hidden.
Discussion: https://postgr.es/m/1185975.1767569534@sss.pgh.pa.us
This commit extends the WAIT FOR LSN command with an optional MODE option in
the WITH clause that specifies which LSN type to wait for:
WAIT FOR LSN '<lsn>' [WITH (MODE '<mode>', ...)]
where mode can be:
- 'standby_replay' (default): Wait for WAL to be replayed to the specified
LSN,
- 'standby_write': Wait for WAL to be written (received) to the specified
LSN,
- 'standby_flush': Wait for WAL to be flushed to disk at the specified LSN,
- 'primary_flush': Wait for WAL to be flushed to disk on the primary server.
The default mode is 'standby_replay', matching the original behavior when MODE
is not specified. This follows the pattern used by COPY and EXPLAIN
commands, where options are specified as string values in the WITH clause.
Modes are explicitly named to distinguish between primary and standby
operations:
- Standby modes ('standby_replay', 'standby_write', 'standby_flush') can only
be used during recovery (on a standby server),
- Primary mode ('primary_flush') can only be used on a primary server.
The 'standby_write' and 'standby_flush' modes are useful for scenarios where
applications need to ensure WAL has been received or persisted on the standby
without necessarily waiting for replay to complete. The 'primary_flush' mode
allows waiting for WAL to be flushed on the primary server.
This commit also includes includes:
- Documentation updates for the new syntax and mode descriptions,
- Test coverage for all four modes, including error cases and concurrent
waiters,
- Wakeup logic in walreceiver for standby write/flush waiters,
- Wakeup logic in WAL writer for primary flush waiters.
Discussion: https://postgr.es/m/CABPTF7UiArgW-sXj9CNwRzUhYOQrevLzkYcgBydmX5oDes1sjg%40mail.gmail.com
Author: Xuneng Zhou <xunengzhou@gmail.com>
Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Alvaro Herrera <alvherre@kurilemu.de>
This change is a cocktail of harmonization of function argument names,
grammar typos, renames for better consistency and unused code (see
ltree). All of these have been spotted by the author.
Author: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/b2c0d0b7-3944-487d-a03d-d155851958ff@gmail.com
Add a new "location" column to the pg_available_extensions and
pg_available_extension_versions views, exposing the directory where
the extension is located.
The default system location is shown as '$system', the same value
that can be used to configure the extension_control_path GUC.
User-defined locations are only visible for super users, otherwise
'<insufficient privilege>' is returned as a column value, the same
behaviour that we already use in pg_stat_activity.
I failed to resist the temptation to do a little extra editorializing of
the TAP test script.
Catalog version bumped.
Author: Matheus Alcantara <mths.dev@pm.me>
Reviewed-By: Chao Li <li.evan.chao@gmail.com>
Reviewed-By: Rohit Prasad <rohit.prasad@arm.com>
Reviewed-By: Michael Banck <mbanck@gmx.net>
Reviewed-By: Manni Wood <manni.wood@enterprisedb.com>
Reviewed-By: Euler Taveira <euler@eulerto.com>
Reviewed-By: Quan Zongliang <quanzongliang@yeah.net>
This <note> was originally written to describe the double levels
of de-backslashing encountered when a backslash-aware string
literal is used to hold the text representation of a composite
value. It still made sense when we switched to mostly using E'...'
syntax for that type of literal. However, commit f77de4b0c mangled
it completely by changing the example literal to be SQL-standard.
The extra pass of de-backslashing described in the text doesn't
actually occur with the example as written, unless you happen to
be using standard_conforming_strings = off.
We could restore this <note> to self-consistency by reverting the
change from f77de4b0c, but on the whole I judge that its time has
passed. standard_conforming_strings = off is nearly obsolete,
and may soon be fully so. But without that, the behavior isn't
so complicated as to justify a discursive note. I observe that
the nearby section about array I/O syntax has no equivalent text,
although that syntax is equally subject to this issue.
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/2998401.1767038920@sss.pgh.pa.us
Discussion: https://postgr.es/m/3279216.1767072538@sss.pgh.pa.us
Up to now, index amhandlers were expected to produce a new, palloc'd
struct on each call. That requires palloc/pfree overhead, and creates
a risk of memory leaks if the caller fails to pfree, and the time
taken to fill such a large structure isn't nil. Moreover, we were
storing these things in the relcache, eating several hundred bytes for
each cached index. There is not anything in these structs that needs
to vary at runtime, so let's change the definition so that an
amhandler can return a pointer to a "static const" struct of which
there's only one copy per index AM. Mark all the core code's
IndexAmRoutine pointers const so that we catch anyplace that might
still try to change or pfree one.
(This is similar to the way we were already handling TableAmRoutine
structs. This commit does fix one comment that was infelicitously
copied-and-pasted into tableamapi.c.)
This commit needs to be called out in the v19 release notes as an API
change for extension index AMs. An un-updated AM will still work
(as of now, anyway) but it risks memory leaks and will be slower than
necessary.
Author: Matthias van de Meent <boekewurm+postgres@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAEoWx2=vApYk2LRu8R0DdahsPNEhWUxGBZ=rbZo1EXE=uA+opQ@mail.gmail.com
This new function exposes at SQL level some information related to
multixacts, not available until now. This data is useful for monitoring
purposes, especially for workloads that make a heavy use of multixacts:
- num_mxids, number of MultiXact IDs in use.
- num_members, number of member entries in use.
- members_size, bytes used by num_members in pg_multixact/members/.
- oldest_multixact: oldest MultiXact still needed.
This patch has been originally proposed when MultiXactOffset was still
32 bits, to monitor wraparound. This part is not relevant anymore since
bd8d9c9bdf that has widen MultiXactOffset to 64 bits. The monitoring
of disk space usage for the members is still relevant.
Some tests are added to check this function, in the shape of one
isolation test with concurrent transactions that take a ROW SHARE lock,
and some SQL tests for pg_read_all_stats. Some documentation is added
to explain some patterns that can come from the information provided by
the function.
Bump catalog version.
Author: Naga Appani <nagnrik@gmail.com>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Atsushi Torikoshi <torikoshia@oss.nttdata.com>
Discussion: https://postgr.es/m/CA+QeY+AAsYK6WvBW4qYzHz4bahHycDAY_q5ECmHkEV_eB9ckzg@mail.gmail.com
The correct spelling is Beijing, fix in regression test
and docs.
Author: JiaoShuntian <jiaoshuntian@gmail.com>
Reviewed-by: Kirill Reshke <reshkekirill@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/ebfa3ec2-dc3c-4adb-be2a-4a882c2e85a7@gmail.com
The pg_overexplain documentation previously used the <literal> tag for
some file names, struct names, and commands. Update the markup to
use the more appropriate tags: <filename>, <structname>, and <command>.
Backpatch to v18, where pg_overexplain was introduced.
Author: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Shixin Wang <wang-shi-xin@outlook.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/CAHGQGwEyYUzz0LjBV_fMcdwU3wgmu0NCoT+JJiozPa8DG6eeog@mail.gmail.com
Backpatch-through: 18
Update the logical replication documentation to explicitly outline the
privilege requirements for each publication syntax. This will ensure users
understand the necessary permissions when creating or managing
publications.
Author: Shlok Kyal <shlok.kyal.oss@gmail.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: David G. Johnston <david.g.johnston@gmail.com>
Discussion: https://postgr.es/m/CANhcyEXODen4U0XLk0aAwFTwGxjAfE9eRaynREenLp-JBSaFHw@mail.gmail.com
Previously logical decoding required wal_level to be set to 'logical'
at server start. This meant that users had to incur the overhead of
logical-level WAL logging even when no logical replication slots were
in use.
This commit adds functionality to automatically control logical
decoding availability based on logical replication slot presence. The
newly introduced module logicalctl.c allows logical decoding to be
dynamically activated when needed when wal_level is set to
'replica'.
When the first logical replication slot is created, the system
automatically increases the effective WAL level to maintain
logical-level WAL records. Conversely, after the last logical slot is
dropped or invalidated, it decreases back to 'replica' WAL level.
While activation occurs synchronously right after creating the first
logical slot, deactivation happens asynchronously through the
checkpointer process. This design avoids a race condition at the end
of recovery; a concurrent deactivation could happen while the startup
process enables logical decoding at the end of recovery, but WAL
writes are still not permitted until recovery fully completes. The
checkpointer will handle it after recovery is done. Asynchronous
deactivation also avoids excessive toggling of the logical decoding
status in workloads that repeatedly create and drop a single logical
slot. On the other hand, this lazy approach can delay changes to
effective_wal_level and the disabling logical decoding, especially
when the checkpointer is busy with other tasks. We chose this lazy
approach in all deactivation paths to keep the implementation simple,
even though laziness is strictly required only for end-of-recovery
cases. Future work might address this limitation either by using a
dedicated worker instead of the checkpointer, or by implementing
synchronous waiting during slot drops if workloads are significantly
affected by the lazy deactivation of logical decoding.
The effective WAL level, determined internally by XLogLogicalInfo, is
allowed to change within a transaction until an XID is assigned. Once
an XID is assigned, the value becomes fixed for the remainder of the
transaction. This behavior ensures that the logging mode remains
consistent within a writing transaction, similar to the behavior of
GUC parameters.
A new read-only GUC parameter effective_wal_level is introduced to
monitor the actual WAL level in effect. This parameter reflects the
current operational WAL level, which may differ from the configured
wal_level setting.
Bump PG_CONTROL_VERSION as it adds a new field to CheckPoint struct.
Reviewed-by: Shveta Malik <shveta.malik@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Shlok Kyal <shlok.kyal.oss@gmail.com>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Discussion: https://postgr.es/m/CAD21AoCVLeLYq09pQPaWs+Jwdni5FuJ8v2jgq-u9_uFbcp6UbA@mail.gmail.com
Correct the referenced location of the RangeTblEntry definition
in the pg_overexplain documentation.
Backpatched to v18, where pg_overexplain was introduced.
Author: Julien Tachoires <julien@tachoires.me>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/20251218092319.tht64ffmcvzqdz7u@poseidon.home.virt
Backpatch-through: 18
In the wake of commit db6a4a985, remove most use of 'md5' from the
example configuration file. The only remainder is an example exception
for a client that doesn't support SCRAM.
Author: Mikael Gustavsson <mikael.gustavsson@smhi.se>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Andreas Karlsson <andreas@proxel.se>
Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at>
Discussion: https://postgr.es/m/176595607507.978865.11597773194269211255@wrigleys.postgresql.org
Discussion: https://postgr.es/m/4ed268473fdb4cf9b0eced6c8019d353@smhi.se
Backpatch-through: 18
The definition of PGoauthBearerRequest uses a temporary SOCKTYPE macro
to hide the difference between Windows and Berkeley socket handles,
since we don't surface pgsocket in our public API. This macro doesn't
need to escape the header, because implementers will choose the correct
socket type based on their platform, so I #undef'd it immediately after
use.
I didn't namespace that helper, though, so if anyone else needs a
SOCKTYPE macro, libpq-fe.h will now unhelpfully get rid of it. This
doesn't seem too far-fetched, given its proximity to existing POSIX
macro names.
Add a PQ_ prefix to avoid collisions, update and improve the surrounding
documentation, and backpatch.
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/CAOYmi%2BmrGg%2Bn_X2MOLgeWcj3v_M00gR8uz_D7mM8z%3DdX1JYVbg%40mail.gmail.com
Backpatch-through: 18
Allow pg_createsubscriber to reuse existing publications instead of
failing when they already exist on the publisher.
Previously, pg_createsubscriber would fail if any specified publication
already existed. Now, existing publications are reused as-is with their
current configuration, and non-existing publications are created
automatically with FOR ALL TABLES.
This change provides flexibility when working with mixed scenarios of
existing and new publications. Users should verify that existing
publications have the desired configuration before reusing them, and can
use --dry-run with verbose mode to see which publications will be reused
and which will be created.
Only publications created by pg_createsubscriber are cleaned up during
error cleanup operations. Pre-existing publications are preserved unless
'--clean=publications' is explicitly specified, which drops all
publications.
This feature would be helpful for pub-sub configurations where users want
to subscribe to a subset of tables from the publisher.
Author: Shubham Khanna <khannashubham1197@gmail.com>
Reviewed-by: Euler Taveira <euler@eulerto.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Zhijie Hou (Fujitsu) <houzj.fnst@fujitsu.com
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: vignesh C <vignesh21@gmail.com>
Reviewed-by: tianbing <tian_bing_0531@163.com>
Discussion: https://postgr.es/m/CAHv8Rj%2BsxWutv10WiDEAPZnygaCbuY2RqiLMj2aRMH-H3iZwyA%40mail.gmail.com
Commit 119fc30 moved CompareType to cmptype.h but the mention in
the docs still refered to primnodes.h
Author: Daisuke Higuchi <higuchi.daisuke11@gmail.com>
Reviewed-by: Paul A Jungwirth <pj@illuminatedcomputing.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/CAEVT6c8guXe5P=L_Un5NUUzCgEgbHnNcP+Y3TV2WbQh-xjiwqA@mail.gmail.com
Backpatch-through: 18
This commit adds a new "void *arg" parameter to
GetNamedDSMSegment() that is passed to the initialization callback
function. This is useful for reusing an initialization callback
function for multiple DSM segments.
Author: Zsolt Parragi <zsolt.parragi@percona.com>
Reviewed-by: Sami Imseih <samimseih@gmail.com>
Discussion: https://postgr.es/m/CAN4CZFMjh8TrT9ZhWgjVTzBDkYZi2a84BnZ8bM%2BfLPuq7Cirzg%40mail.gmail.com
Previously, pg_sync_replication_slots() would finish without synchronizing
slots that didn't meet requirements, rather than failing outright. This
could leave some failover slots unsynchronized if required catalog rows or
WAL segments were missing or at risk of removal, while the standby
continued removing needed data.
To address this, the function now waits for the primary slot to advance to
a position where all required data is available on the standby before
completing synchronization. It retries cyclically until all failover slots
that existed on the primary at the start of the call are synchronized.
Slots created after the function begins are not included. If the standby
is promoted during this wait, the function exits gracefully and the
temporary slots will be removed.
Author: Ajin Cherian <itsajin@gmail.com>
Author: Hou Zhijie <houzj.fnst@fujitsu.com>
Reviewed-by: Shveta Malik <shveta.malik@gmail.com>
Reviewed-by: Japin Li <japinli@hotmail.com>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Reviewed-by: Ashutosh Sharma <ashu.coek88@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Yilin Zhang <jiezhilove@126.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/CAFPTHDZAA%2BgWDntpa5ucqKKba41%3DtXmoXqN3q4rpjO9cdxgQrw%40mail.gmail.com
This new DDL command splits a single partition into several partitions. Just
like the ALTER TABLE ... MERGE PARTITIONS ... command, new partitions are
created using the createPartitionTable() function with the parent partition
as the template.
This commit comprises a quite naive implementation which works in a single
process and holds the ACCESS EXCLUSIVE LOCK on the parent table during all
the operations, including the tuple routing. This is why the new DDL command
can't be recommended for large, partitioned tables under high load. However,
this implementation comes in handy in certain cases, even as it is. Also, it
could serve as a foundation for future implementations with less locking and
possibly parallelism.
Discussion: https://postgr.es/m/c73a1746-0cd0-6bdd-6b23-3ae0b7c0c582%40postgrespro.ru
Author: Dmitry Koval <d.koval@postgrespro.ru>
Co-authored-by: Alexander Korotkov <aekorotkov@gmail.com>
Co-authored-by: Tender Wang <tndrwang@gmail.com>
Co-authored-by: Richard Guo <guofenglinux@gmail.com>
Co-authored-by: Dagfinn Ilmari Mannsaker <ilmari@ilmari.org>
Co-authored-by: Fujii Masao <masao.fujii@gmail.com>
Co-authored-by: Jian He <jian.universality@gmail.com>
Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com>
Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at>
Reviewed-by: Zhihong Yu <zyu@yugabyte.com>
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Robert Haas <rhaas@postgresql.org>
Reviewed-by: Stephane Tachoires <stephane.tachoires@gmail.com>
Reviewed-by: Jian He <jian.universality@gmail.com>
Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>
Reviewed-by: Pavel Borisov <pashkin.elfe@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Alexander Lakhin <exclusion@gmail.com>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: Daniel Gustafsson <dgustafsson@postgresql.org>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Noah Misch <noah@leadboat.com>
This new DDL command merges several partitions into a single partition of the
target table. The target partition is created using the new
createPartitionTable() function with the parent partition as the template.
This commit comprises a quite naive implementation which works in a single
process and holds the ACCESS EXCLUSIVE LOCK on the parent table during all
the operations, including the tuple routing. This is why this new DDL
command can't be recommended for large partitioned tables under a high load.
However, this implementation comes in handy in certain cases, even as it is.
Also, it could serve as a foundation for future implementations with less
locking and possibly parallelism.
Discussion: https://postgr.es/m/c73a1746-0cd0-6bdd-6b23-3ae0b7c0c582%40postgrespro.ru
Author: Dmitry Koval <d.koval@postgrespro.ru>
Co-authored-by: Alexander Korotkov <aekorotkov@gmail.com>
Co-authored-by: Tender Wang <tndrwang@gmail.com>
Co-authored-by: Richard Guo <guofenglinux@gmail.com>
Co-authored-by: Dagfinn Ilmari Mannsaker <ilmari@ilmari.org>
Co-authored-by: Fujii Masao <masao.fujii@gmail.com>
Co-authored-by: Jian He <jian.universality@gmail.com>
Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com>
Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at>
Reviewed-by: Zhihong Yu <zyu@yugabyte.com>
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Robert Haas <rhaas@postgresql.org>
Reviewed-by: Stephane Tachoires <stephane.tachoires@gmail.com>
Reviewed-by: Jian He <jian.universality@gmail.com>
Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>
Reviewed-by: Pavel Borisov <pashkin.elfe@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Alexander Lakhin <exclusion@gmail.com>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: Daniel Gustafsson <dgustafsson@postgresql.org>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Noah Misch <noah@leadboat.com>
The reference to the test module test_custom_stats should have been
added under the section "Custom Cumulative Statistics", but the section
"Injection Points" has been updated instead, reversing the references
for both test modules.
d52c24b0f8 has removed a paragraph that was correct, and 31280d96a6
has added a paragraph that was incorrect.
Author: Sami Imseih <samimseih@gmail.com>
Discussion: https://postgr.es/m/CAA5RZ0s4heX926+ZNh63u12gLd9jgauU6yiirKc7xGo1G01PXQ@mail.gmail.com
This new option instructs vacuumdb to print, but not execute, the
VACUUM and ANALYZE commands that would've been sent to the server.
Author: Corey Huinker <corey.huinker@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Kirill Reshke <reshkekirill@gmail.com>
Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de>
Discussion: https://postgr.es/m/CADkLM%3DckHkX7Of5SrK7g0LokPUwJ%3Dkk8JU1GXGF5pZ1eBVr0%3DQ%40mail.gmail.com
The new column, started_by, indicates the initiator of the
analyze ('manual' or 'autovacuum'), helping users and monitoring tools
to better understand ANALYZE behavior.
Bump catalog version.
Author: Shinya Kato <shinya11.kato@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Sami Imseih <samimseih@gmail.com>
Reviewed-by: Yu Wang <wangyu_runtime@163.com>
Discussion: https://postgr.es/m/CAA5RZ0suoicwxFeK_eDkUrzF7s0BVTaE7M%2BehCpYcCk5wiECpw%40mail.gmail.com
The new columns, mode and started_by, indicate the vacuum
mode ('normal', 'aggressive', or 'failsafe') and the initiator of the
vacuum ('manual', 'autovacuum', or 'autovacuum_wraparound'),
respectively. This allows users and monitoring tools to better
understand VACUUM behavior.
Bump catalog version.
Author: Shinya Kato <shinya11.kato@gmail.com>
Reviewed-by: Kirill Reshke <reshkekirill@gmail.com>
Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Robert Treat <rob@xzilla.net>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Sami Imseih <samimseih@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Yu Wang <wangyu_runtime@163.com>
Discussion: https://postgr.es/m/CAOzEurQcOY-OBL_ouEVfEaFqe_md3vB5pXjR_m6L71Dcp1JKCQ@mail.gmail.com
This eliminates MultiXactOffset wraparound and the 2^32 limit on the
total number of multixid members. Multixids are still limited to 2^31,
but this is a nice improvement because 'members' can grow much faster
than the number of multixids. On such systems, you can now run longer
before hitting hard limits or triggering anti-wraparound vacuums.
Not having to deal with MultiXactOffset wraparound also simplifies the
code and removes some gnarly corner cases.
We no longer need to perform emergency anti-wraparound freezing
because of running out of 'members' space, so the offset stop limit is
gone. But you might still not want 'members' to consume huge amounts
of disk space. For that reason, I kept the logic for lowering vacuum's
multixid freezing cutoff if a large amount of 'members' space is
used. The thresholds for that are roughly the same as the "safe" and
"danger" thresholds used before, 2 billion transactions and 4 billion
transactions. This keeps the behavior for the freeze cutoff roughly
the same as before. It might make sense to make this smarter or
configurable, now that the threshold is only needed to manage disk
usage, but that's left for the future.
Add code to pg_upgrade to convert multitransactions from the old to
the new format, rewriting the pg_multixact SLRU files. Because
pg_upgrade now rewrites the files, we can get rid of some hacks we had
put in place to deal with old bugs and upgraded clusters. Bump catalog
version for the pg_multixact/offsets format change.
Author: Maxim Orlov <orlovmg@gmail.com>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>
Reviewed-by: wenhui qiu <qiuwenhuifx@gmail.com>
Discussion: https://www.postgresql.org/message-id/CACG%3DezaWg7_nt-8ey4aKv2w9LcuLthHknwCawmBgEeTnJrJTcw@mail.gmail.com
The description of deferrable constraints in create_table.sgml states
that deferrable constraints cannot be used as conflict arbitrators in
an INSERT with an ON CONFLICT DO UPDATE clause, but in fact this
restriction applies to all ON CONFLICT clauses, not just those with DO
UPDATE. Fix this, and while at it, change the word "arbitrators" to
"arbiters", to match the terminology used elsewhere.
Author: Dean Rasheed <dean.a.rasheed@gmail.com>
Discussion: https://postgr.es/m/CAEZATCWsybvZP3ce8rGcVNx-QHuDOJZDz8y=p1SzqHwjRXyV4Q@mail.gmail.com
Backpatch-through: 14
The idea is to encourage more the use of these new routines across the
tree, as these offer stronger type safety guarantees than palloc().
The following paths are included in this batch, treating all the areas
proposed by the author for the most trivial changes, except src/backend
(by far the largest batch):
src/bin/
src/common/
src/fe_utils/
src/include/
src/pl/
src/test/
src/tutorial/
Similar work has been done in 31d3847a37.
The code compiles the same before and after this commit, with the
following exceptions due to changes in line numbers because some of the
new allocation formulas are shorter:
blkreftable.c
pgfnames.c
pl_exec.c
Author: David Geier <geidav.pg@gmail.com>
Discussion: https://postgr.es/m/ad0748d4-3080-436e-b0bc-ac8f86a3466a@gmail.com
Plus a similar fix to the README.
Backpatch as far back as the sgml issue exists. The README issue does
exist in v14, but that seems unlikely to harm anyone.
Author: David Geier <geidav.pg@gmail.com>
Discussion: https://postgr.es/m/ed3db7ea-55b4-4809-86af-81ad3bb2c7d3@gmail.com
Backpatch-through: 15
This test module acts as a replacement that existed prior to
d52c24b0f8 in the test module injection_points. It uses a more
flexible structure than its ancestor:
- Two libraries are built, one for fixed-sized stats and one for
variable-sized stats.
- No GUCs required. The stats are enabled only if one or both libraries
are loaded with shared_preload_libraries.
- Same kind IDs reserved: 25 (variable-sized) and 26 (fixed-sized)
The goal of this redesign is to be able to easier extend the code
coverage provided by this module for other changes that are currently
under discussion, and injection_points was not suited for these.
Injection points are also now widely used in the tree now, so extending
more the test coverage for custom pgstats in the test module
injection_points would be a riskier long-term move.
The new code is mostly a copy of what existed previously in the test
module injection_points, with the same callbacks defined for fixed-sized
and variable-sized stats, but a simpler overall structure in terms of
the stats counters updated.
The test coverage should remain the same as previously: one TAP test is
used to check data reports, crash recovery and clean restart scenarios.
Tests are added for the manual reset of fixed-sized stats, something
not tested until now.
Author: Sami Imseih <samimseih@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CAA5RZ0sJgO6GAwgFxmzg9MVP=rM7Us8KKcWpuqxe-f5qxmpE0g@mail.gmail.com
The test module injection_points has been used as a landing spot to
provide coverage for the custom pgstats APIs, for both fixed-sized and
variable-sized stats kinds. Some recent work related to pgstats is
proving that this structure makes the implementation of new tests
harder.
This commit removes the code related to pgstats from injection_points,
and an equivalent will be reintroduced as a separate test module in a
follow-up commit. This removal is done in its own commit for clarity.
Using injection_points for this test coverage was perhaps not the best
way to design things, but this was good enough while working on the
first flavor of the custom pgstats APIs. Using a new test module will
make easier the introduction of new tests, and we will not need to worry
about the impact of new changes related to custom pgstats could have
with the internals of injection_points.
Author: Sami Imseih <samimseih@gmail.com>
Discussion: https://postgr.es/m/CAA5RZ0sJgO6GAwgFxmzg9MVP=rM7Us8KKcWpuqxe-f5qxmpE0g@mail.gmail.com
This function reads the lines from a file and filters its contents to
report its head and tail contents. The amount of contents to read from
a file can be tuned by the environment variable PG_TEST_FILE_READ_LINES,
that can be used to override the default of 50 lines. If the file whose
content is read has less lines than two times PG_TEST_FILE_READ_LINES,
the whole file is returned.
This will be used in a follow-up commit to limit the amount of
information reported by some of the TAP tests on failure, where we have
noticed that the contents reported by the buildfarm can be heavily
bloated in some cases, with the head and tail contents of a report being
able to provide enough information to be useful for debugging.
Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
Co-authored-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CAN55FZ1D6KXvjSs7YGsDeadqCxNF3UUhjRAfforzzP0k-cE=bA@mail.gmail.com
Commit 76b78721ca introduced two new columns in pg_stat_replication_slots
to improve monitoring of slot synchronization. One of these columns was
named slotsync_skip_at, which is inconsistent with the naming convention
used for similar columns in other system views.
Columns that store timestamps of the most recent event typically use the
'last_' in the column name (e.g., last_autovacuum, checksum_last_failure).
Renaming slotsync_skip_at to slotsync_last_skip aligns with this pattern,
making the purpose of the column clearer and improving overall consistency
across the views.
Author: Shlok Kyal <shlok.kyal.oss@gmail.com>
Reviewed-by: Michael Banck <mbanck@gmx.net>
Discussion: https://postgr.es/m/20251128091552.GB13635@p46.dedyn.io;lightning.p46.dedyn.io
Discussion: https://postgr.es/m/CAE9k0PkhfKrTEAsGz4DjOhEj1nQ+hbQVfvWUxNacD38ibW3a1g@mail.gmail.com
We were using SnapshotAny to do some index checks, but that's wrong and
causes spurious errors when used on indexes created by CREATE INDEX
CONCURRENTLY. Fix it to use an MVCC snapshot, and add a test for it.
This problem came in with commit 5ae2087202, which introduced
uniqueness check. Backpatch to 17.
Author: Mihail Nikalayeu <mihailnikalayeu@gmail.com>
Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru>
Backpatch-through: 17
Discussion: https://postgr.es/m/CANtu0ojmVd27fEhfpST7RG2KZvwkX=dMyKUqg0KM87FkOSdz8Q@mail.gmail.com
In an upcoming patch more wait events will be added to the wait event
class (for buffer locking), making the current name too
specific. Alternatively we could introduce a dedicated wait event class for
those, but it seems somewhat confusing to have a BUFFERPIN and a BUFFER wait
event class.
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/fvfmkr5kk4nyex56ejgxj3uzi63isfxovp2biecb4bspbjrze7@az2pljabhnff
The majority of cases already used "restartpoint" with just a few
instances of "restart point". Changing the latter spelling to the
former ensures consistency in the user facing documentation. Code
comments are not affected by this since it is not worth the churn
to change anything there.
Author: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/0F6E38D0-649F-4489-B2C1-43CD937E6636@yesql.se
Presently, this view reports NULL for the size of DSAs and dshash
tables because 1) the current backend might not be attached to them
and 2) the registry doesn't save the pointers to the dsa_area or
dshash_table in local memory. Also, the view doesn't show
partially-initialized entries to avoid ambiguity, since those
entries would report a NULL size as well.
This commit introduces a function that looks up the size of a DSA
given its handle (transiently attaching to the control segment if
needed) and teaches pg_dsm_registry_allocations to use it to show
the size of successfully-initialized DSA and dshash entries.
Furthermore, the view now reports partially-initialized entries
with a NULL size.
Reviewed-by: Rahila Syed <rahilasyed90@gmail.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/aSeEDeznAsHR1_YF%40nathan
Introduce a new column, slotsync_skip_reason, in the pg_replication_slots
view. This column records the reason why the last slot synchronization was
skipped. It is primarily relevant for logical replication slots on standby
servers where the 'synced' field is true. The value is NULL when
synchronization succeeds.
Author: Shlok Kyal <shlok.kyal.oss@gmail.com>
Reviewed-by: shveta malik <shveta.malik@gmail.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Ashutosh Sharma <ashu.coek88@gmail.com>
Reviewed-by: Hou Zhijie <houzj.fnst@fujitsu.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/CAE9k0PkhfKrTEAsGz4DjOhEj1nQ+hbQVfvWUxNacD38ibW3a1g@mail.gmail.com
This commit introduces three new functions for marking shared buffers as
dirty by using the functions introduced in 9660906dbd:
* pg_buffercache_mark_dirty() for one shared buffer.
- pg_buffercache_mark_dirt_relation() for all the shared buffers in a
relation.
* pg_buffercache_mark_dirty_all() for all the shared buffers in pool.
The "_all" and "_relation" flavors are designed to address the
inefficiency of repeatedly calling pg_buffercache_mark_dirty() for each
individual buffer, which can be time-consuming when dealing with with
large shared buffers pool.
These functions are intended as developer tools and are available only
to superusers. There is no need to bump the version of pg_buffercache,
4b203d499c having done this job in this release cycle.
Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Aidar Imamov <a.imamov@postgrespro.ru>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Joseph Koshakow <koshy44@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Yuhang Qiu <iamqyh@gmail.com>
Reviewed-by: Xuneng Zhou <xunengzhou@gmail.com>
Discussion: https://postgr.es/m/CAN55FZ0h_YoSqqutxV6DES1RW8ig6wcA8CR9rJk358YRMxZFmw@mail.gmail.com
The documentation for CREATE/ALTER PUBLICATION previously showed:
[ ONLY ] table_name [ * ] [ ( column_name [, ... ] ) ] [ WHERE ( expression ) ] [, ... ]
to indicate that the table/column specification could be repeated.
However, placing [, ... ] directly after a multi-part construct was
misleading and made it unclear which portion was repeatable.
This commit introduces a new term, table_and_columns, to represent:
[ ONLY ] table_name [ * ] [ ( column_name [, ... ] ) ] [ WHERE ( expression ) ]
and updates the synopsis to use:
table_and_columns [, ... ]
which clearly identifies the repeatable element.
Backpatched to v15, where the misleading syntax was introduced.
Author: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Chao Li <lic@highgo.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/CAHut+PtsyvYL3KmA6C8f0ZpXQ=7FEqQtETVy-BOF+cm9WPvfMQ@mail.gmail.com
Backpatch-through: 15
In v14, bb437f995 added support for scanning for ranges of TIDs using a
dedicated executor node for the purpose. Here, we allow these scans to
be parallelized. The range of blocks to scan is divvied up similarly to
how a Parallel Seq Scans does that, where 'chunks' of blocks are
allocated to each worker and the size of those chunks is slowly reduced
down to 1 block per worker by the time we're nearing the end of the
scan. Doing that means workers finish at roughly the same time.
Allowing TID Range Scans to be parallelized removes the dilemma from the
planner as to whether a Parallel Seq Scan will cost less than a
non-parallel TID Range Scan due to the CPU concurrency of the Seq Scan
(disk costs are not divided by the number of workers). It was possible
the planner could choose the Parallel Seq Scan which would result in
reading additional blocks during execution than the TID Scan would have.
Allowing Parallel TID Range Scans removes the trade-off the planner
makes when choosing between reduced CPU costs due to parallelism vs
additional I/O from the Parallel Seq Scan due to it scanning blocks from
outside of the required TID range. There is also, of course, the
traditional parallelism performance benefits to be gained as well, which
likely doesn't need to be explained here.
Author: Cary Huang <cary.huang@highgo.ca>
Author: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: Junwang Zhao <zhjwpku@gmail.com>
Reviewed-by: Rafia Sabih <rafia.pghackers@gmail.com>
Reviewed-by: Steven Niu <niushiji@gmail.com>
Discussion: https://postgr.es/m/18f2c002a24.11bc2ab825151706.3749144144619388582@highgo.ca
There is no straightforward way to determine if a cluster is running
in EXEC_BACKEND mode or not, which is useful for tests to know. This
adds a GUC debug_exec_backend similar to debug_assertions which will
be true when the server is running in EXEC_BACKEND mode.
Author: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/5F301096-921A-427D-8EC1-EBAEC2A35082@yesql.se
When running on Windows (or EXEC_BACKEND) the SSL configuration will
be reloaded on each backend start, so the passphrase command will be
reloaded along with it. This implies that passphrase command reload
must be enabled on Windows for connections to work at all. Document
this since it wasn't mentioned explicitly, and will there add markup
for parameter value to match the rest of the docs.
Backpatch to all supported versions.
Author: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/5F301096-921A-427D-8EC1-EBAEC2A35082@yesql.se
Backpatch-through: 14
This patch adds two new columns to the pg_stat_replication_slots view:
slotsync_skip_count - the total number of times a slotsync operation was
skipped.
slotsync_skip_at - the timestamp of the most recent skip.
These additions provide better visibility into replication slot
synchronization behavior.
A future patch will introduce the slotsync_skip_reason column in
pg_replication_slots to capture the reason for skip.
Author: Shlok Kyal <shlok.kyal.oss@gmail.com>
Reviewed-by: shveta malik <shveta.malik@gmail.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Ashutosh Sharma <ashu.coek88@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/CAE9k0PkhfKrTEAsGz4DjOhEj1nQ+hbQVfvWUxNacD38ibW3a1g@mail.gmail.com
ba2a3c2302 has added a way to check if a buffer is spread across
multiple pages with some NUMA information, via a new view
pg_buffercache_numa that depends on pg_buffercache_numa_pages(), a SQL
function. These can only be queried when support for libnuma exists,
generating an error if not.
However, it can be useful to know how shared buffers and OS pages map
when NUMA is not supported or not available. This commit expands the
capabilities around pg_buffercache_numa:
- pg_buffercache_numa_pages() is refactored as an internal function,
able to optionally process NUMA. Its SQL definition prior to this
commit is still around to ensure backward-compatibility with v1.6.
- A SQL function called pg_buffercache_os_pages() is added, able to work
with or without NUMA.
- The view pg_buffercache_numa is redefined to use
pg_buffercache_os_pages().
- A new view is added, called pg_buffercache_os_pages. This ignores
NUMA for its result processing, for a better efficiency.
The implementation is done so as there is no code duplication between
the NUMA and non-NUMA views/functions, relying on one internal function
that does the job for all of them. The module is bumped to v1.7.
Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Mircea Cadariu <cadariu.mircea@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/Z/fFA2heH6lpSLlt@ip-10-97-1-34.eu-west-3.compute.internal
This request allows a support function to replace a function call
appearing in FROM (typically a set-returning function) with an
equivalent SELECT subquery. The subquery will then be subject
to the planner's usual optimizations, potentially allowing a much
better plan to be generated. While the planner has long done this
automatically for simple SQL-language functions, it's now possible
for extensions to do it for functions outside that group.
Notably, this could be useful for functions that are presently
implemented in PL/pgSQL and work by generating and then EXECUTE'ing
a SQL query.
Author: Paul A Jungwirth <pj@illuminatedcomputing.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/09de6afa-c33d-4d94-a5cb-afc6cea0d2bb@illuminatedcomputing.com
The existing range_minus function raises an exception when the range is
"split", because then the result can't be represented by a single range.
For example '[0,10)'::int4range - '[4,5)' would be '[0,4)' and '[5,10)'.
This commit adds new set-returning functions so that callers can get
results even in the case of splits. There is no risk of an exception for
multiranges, but a set-returning function lets us handle them the same
way we handle ranges.
Both functions return zero results if the subtraction would give an
empty range/multirange.
The main use-case for these functions is to implement UPDATE/DELETE FOR
PORTION OF, which must compute the application-time of "temporal
leftovers": the part of history in an updated/deleted row that was not
changed. To preserve the untouched history, we will implicitly insert
one record for each result returned by range/multirange_minus_multi.
Using a set-returning function will also let us support user-defined
types for application-time update/delete in the future.
Author: Paul A. Jungwirth <pj@illuminatedcomputing.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/ec498c3d-5f2b-48ec-b989-5561c8aa2024%40illuminatedcomputing.com
A set of wording improvements and spelling fixes.
Author: Oleg Sibiryakov <o.sibiryakov@postgrespro.ru>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/e62bedb5-c26f-4d37-b4ed-ce9b55f1e980@postgrespro.ru
Commit 792353f7d5 updated the pg_dump and pg_dumpall documentation to
clarify which statistics are not included in their output. The pg_upgrade
documentation contained a nearly identical description, but it was not updated
at the same time.
This commit updates the pg_upgrade documentation to match those changes.
Backpatch to v18, where commit 792353f7d5 was backpatched to.
Author: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Bruce Momjian <bruce@momjian.us>
Discussion: https://postgr.es/m/CAHGQGwFnfgdGz8aGWVzgFCFwoWQU7KnFFjmxinf4RkQAkzmR+w@mail.gmail.com
Backpatch-through: 18
This patch renames the sync_error_count column to sync_table_error_count
in the pg_stat_subscription_stats view. The new name makes the purpose
explicit now that a separate column exists to track sequence
synchronization errors.
Additionally, the column seq_sync_error_count is renamed to
sync_seq_error_count to maintain a consistent naming pattern, making it
easier for users to group, and query synchronization related counters.
Author: Vignesh C <vignesh21@gmail.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/CALDaNm3WwJmz=-4ybTkhniB-Nf3qmFG9Zx1uKjyLLoPF5NYYXA@mail.gmail.com
The documentation did not previously mention the default values for
the --fsync-interval and --plugin options, even though pg_recvlogical --help
shows them. This omission made it harder for users to understand
the tool's behavior from the documentation alone.
This commit adds the missing default value descriptions for both options
to the pg_recvlogical documentation.
Author: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at>
Discussion: https://postgr.es/m/CAHGQGwFqssPBjkWMFofGq32e_tANOeWN-cM=6biAP3nnFUXMRw@mail.gmail.com
The existing format of pg_dependencies uses a single-object JSON
structure, with each key value embedding all the knowledge about the
set attributes tracked, like:
{"1 => 5": 1.000000, "5 => 1": 0.423130}
While this is a very compact format, it is confusing to read and it is
difficult to manipulate the values within the object, particularly when
tracking multiple attributes.
The new output format introduced in this commit is a JSON array of
objects, with:
- A key named "degree", with a float value.
- A key named "attributes", with an array of attribute numbers.
- A key named "dependency", with an attribute number.
The values use the same underlying type as previously when printed, with
a new output format that shows now as follows:
[{"degree": 1.000000, "attributes": [1], "dependency": 5},
{"degree": 0.423130, "attributes": [5], "dependency": 1}]
This new format will become handy for a follow-up set of changes, so as
it becomes possible to inject extended statistics rather than require an
ANALYZE, like in a dump/restore sequence or after pg_upgrade on a new
cluster.
This format has been suggested by Tomas Vondra. The key names are
defined in the header introduced by 1f927cce44, to ease the
integration of frontend-specific changes that are still under
discussion. (Again a personal note: if anybody comes up with better
name for the keys, of course feel free.)
The bulk of the changes come from the regression tests, where
jsonb_pretty() is now used to make the outputs generated easier to
parse.
Author: Corey Huinker <corey.huinker@gmail.com>
Reviewed-by: Jian He <jian.universality@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CADkLM=dpz3KFnqP-dgJ-zvRvtjsa8UZv8wDAQdqho=qN3kX0Zg@mail.gmail.com
The existing format of pg_ndistinct uses a single-object JSON structure
where each key is itself a comma-separated list of attnums, like:
{"3, 4": 11, "3, 6": 11, "4, 6": 11, "3, 4, 6": 11}
While this is a very compact format, it is confusing to read and it is
difficult to manipulate the values within the object.
The new output format introduced in this commit is an array of objects,
with:
- A key named "attributes", that contains an array of attribute numbers.
- A key named "ndistinct", represented as an integer.
The values use the same underlying type as previously when printed, with
a new output format that shows now as follows:
[{"ndistinct": 11, "attributes": [3,4]},
{"ndistinct": 11, "attributes": [3,6]},
{"ndistinct": 11, "attributes": [4,6]},
{"ndistinct": 11, "attributes": [3,4,6]}]
This new format will become handy for a follow-up set of changes, so as
it becomes possible to inject extended statistics rather than require an
ANALYZE, like in a dump/restore sequence or after pg_upgrade on a new
cluster.
This format has been suggested by Tomas Vondra. The key names are
defined in a new header, to ease with the integration of
frontend-specific changes that are still under discussion. (Personal
note: I am not specifically wedded to these key names, but if there are
better name suggestions for this release, feel free.)
The bulk of the changes come from the regression tests, where
jsonb_pretty() is now used to make the outputs generated easier to
parse.
Author: Corey Huinker <corey.huinker@gmail.com>
Reviewed-by: Jian He <jian.universality@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CADkLM=dpz3KFnqP-dgJ-zvRvtjsa8UZv8wDAQdqho=qN3kX0Zg@mail.gmail.com
Path expansion might expose characters like spaces which would cause
command failure, so double-quote the examples. While %f doesn't need
quoting since it uses a fixed character set, it is best to be
consistent.
Discussion: https://postgr.es/m/aROPCQCfvKp9Htk4@momjian.us
Backpatch-through: master
This reverts commit 1fd981f053, based on concerns that the logging
improvements do not justify the protocol breakage of dropping an unnamed
portal once its execution has completed.
It seems unlikely that one would try to send an execute or describe
message after the portal has been used, but if they do such
post-completion messages would not be able to process as the previous
versions. Let's revert this change for now so as we keep compatibility
and consider a different solution.
The tests added by 76bba03312 track the pre-1fd981f05369 behavior, and
are still valid.
Discussion: https://postgr.es/m/CA+TgmoYFJyJNQw3RT7veO3M2BWRE9Aw4hprC5rOcawHZti-f8g@mail.gmail.com
Much of the "Replication Slot" chapter applies to physical and logical
slots, but it was sloppy in mentioning mostly physical slots. This
patch clarified which parts of the text apply to which slot types.
This chapter is referenced from the logical slot/subscriber chapter, so
it needs to do double duty.
Backpatch-through: master
Also mention that logical replication slots are created by default when
subscriptions are created. This should clarify the text.
Backpatch-through: master
Previously it was not clear that "physical" replication slots were being
discussed, and that they needed to be created on the primary and not the
standby.
Backpatch-through: master
On the CREATE POLICY page, the "Policies Applied by Command Type"
table was missing MERGE ... THEN DELETE and some of the policies
applied during INSERT ... ON CONFLICT and MERGE. Fix that, and try to
improve readability by listing the various MERGE cases separately,
rather than together with INSERT/UPDATE/DELETE. Mention COPY ... TO
along with SELECT, since it behaves in the same way. In addition,
document which policy violations cause errors to be thrown, and which
just cause rows to be silently ignored.
Also, a paragraph above the table states that INSERT ... ON CONFLICT
DO UPDATE only checks the WITH CHECK expressions of INSERT policies
for rows appended to the relation by the INSERT path, which is
incorrect -- all rows proposed for insertion are checked, regardless
of whether they end up being inserted. Fix that, and also mention that
the same applies to INSERT ... ON CONFLICT DO NOTHING.
In addition, in various other places on that page, clarify how the
different types of policy are applied to different commands, and
whether or not errors are thrown when policy checks do not pass.
Backpatch to all supported versions. Prior to v17, MERGE did not
support RETURNING, and so MERGE ... THEN INSERT would never check new
rows against SELECT policies. Prior to v15, MERGE was not supported at
all.
Author: Dean Rasheed <dean.a.rasheed@gmail.com>
Reviewed-by: Viktor Holmberg <v@viktorh.net>
Reviewed-by: Jian He <jian.universality@gmail.com>
Discussion: https://postgr.es/m/CAEZATCWqnfeChjK=n1V_dYZT4rt4mnq+ybf9c0qXDYTVMsy8pg@mail.gmail.com
Backpatch-through: 14
Explicitly document that privileges are transferred along with the
ownership. Backpatch to all supported versions since this behavior
has always been present.
Author: Laurenz Albe <laurenz.albe@cybertec.at>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: David G. Johnston <david.g.johnston@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Josef Šimánek <josef.simanek@gmail.com>
Reported-by: Gilles Parc <gparc@free.fr>
Discussion: https://postgr.es/m/2023185982.281851219.1646733038464.JavaMail.root@zimbra15-e2.priv.proxad.net
Backpatch-through: 14
Add documentation describing sequence synchronization support in logical
replication. It explains how sequence changes are synchronized from the
publisher to the subscriber, the configuration requirements, and provide
examples illustrating setup and usage.
Additionally, document the pg_get_sequence_data() function, which allows
users to query sequence details on the publisher to determine when to
refresh corresponding sequences on the subscriber.
Author: Vignesh C <vignesh21@gmail.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: shveta malik <shveta.malik@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/CAA4eK1LC+KJiAkSrpE_NwvNdidw9F2os7GERUeSxSKv71gXysQ@mail.gmail.com
The synopsis for the ALTER PUBLICATION ... DROP ... command incorrectly
implied that a column list and WHERE clause could be specified as part of
the publication object. However, these options are not allowed for
DROP operations, making the documentation misleading.
This commit corrects the synopsis to clearly show only the valid forms
of publication objects.
Backpatched to v15, where the incorrect synopsis was introduced.
Author: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/CAHut+PsPu+47Q7b0o6h1r-qSt90U3zgbAHMHUag5o5E1Lo+=uw@mail.gmail.com
Backpatch-through: 15
The documentation stated that the directory specified by --file
must not exist, but pg_dump does allow for empty directories to
be specified and used.
Author: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Bruce Momjian <bruce@momjian.us>
Discussion: https://postgr.es/m/534AA60D-CF6B-432F-9882-E9737B33D1B7@gmail.com
This commit adds the --continue-on-error option, allowing pgbench clients
to continue running even when SQL statements fail for reasons other than
serialization or deadlock errors. Without this option (by default),
the clients aborts in such cases, which was the only available behavior
previously.
This option is useful for benchmarks using custom scripts that may
raise errors, such as unique constraint violations, where users want
pgbench to complete the run despite individual statement failures.
Author: Rintaro Ikeda <ikedarintarof@oss.nttdata.com>
Co-authored-by: Yugo Nagata <nagata@sraoss.co.jp>
Co-authored-by: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Stepan Neretin <slpmcf@gmail.com>
Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com>
Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com>
Reviewed-by: Srinath Reddy Sadipiralla <srinath2133@gmail.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>
Reviewed-by: Chao Li <lic@highgo.com>
Discussion: https://postgr.es/m/44334231a4d214fac382a69cceb7d9fc@oss.nttdata.com
This commit adds a new column, seq_sync_error_count, to the
pg_stat_subscription_stats view. This counter tracks the number of errors
encountered by the sequence synchronization worker during operation.
Since a single worker handles the synchronization of all sequences, this
value may reflect errors from multiple sequences. This addition improves
observability of sequence synchronization behavior and helps monitor
potential issues during replication.
Author: Vignesh C <vignesh21@gmail.com>
Reviewed-by: shveta malik <shveta.malik@gmail.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/CAA4eK1LC+KJiAkSrpE_NwvNdidw9F2os7GERUeSxSKv71gXysQ@mail.gmail.com
The following parameters can only be set at server start because
their context is PGC_POSTMASTER, but this information was missing
or incorrectly documented. This commit adds or corrects
that information for the following parameters:
* debug_io_direct
* dynamic_shared_memory_type
* event_source
* huge_pages
* io_max_combine_limit
* max_notify_queue_pages
* shared_memory_type
* track_commit_timestamp
* wal_decode_buffer_size
Backpatched to all supported branches.
Author: Karina Litskevich <litskevichkarina@gmail.com>
Reviewed-by: Chao Li <lic@highgo.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/CAHGQGwGfPzcin-_6XwPgVbWTOUFVZgHF5g9ROrwLUdCTfjy=0A@mail.gmail.com
Backpatch-through: 13
If these parameters are set without units, the values are interpreted
as blocks. This detail was previously missing from the documentation,
so this commit adds it.
Backpatch to v17 where io_combine_limit was added.
Author: Karina Litskevich <litskevichkarina@gmail.com>
Reviewed-by: Chao Li <lic@highgo.com>
Reviewed-by: Xuneng Zhou <xunengzhou@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/CACiT8iZCDkz1bNYQNQyvGhXWJExSnJULRTYT894u4-Ti7Yh6jw@mail.gmail.com
Backpatch-through: 17
Use uppercase SQL keywords consistently throughout the documentation to
ease reading. Also add whitespace in a couple of places where it
improves readability.
Author: Erik Wienhold <ewie@ewie.name>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/82eb512b-8ed2-46be-b311-54ffd26978c4%40ewie.name
This section introduces temporal tables, with a focus on Application
Time (which we support) and only a brief mention of System Time (which
we don't). It covers temporal primary keys, unique constraints, and
temporal foreign keys. We will document temporal update/delete and
periods as we add those features.
This commit also adds glossary entries for temporal table, application
time, and system time.
Author: Paul A. Jungwirth <pj@illuminatedcomputing.com>
Discussion: https://www.postgresql.org/message-id/flat/ec498c3d-5f2b-48ec-b989-5561c8aa2024@illuminatedcomputing.com
WAIT FOR is to be used on standby and specifies waiting for
the specific WAL location to be replayed. This option is useful when
the user makes some data changes on primary and needs a guarantee to see
these changes are on standby.
WAIT FOR needs to wait without any snapshot held. Otherwise, the snapshot
could prevent the replay of WAL records, implying a kind of self-deadlock.
This is why separate utility command seems appears to be the most robust
way to implement this functionality. It's not possible to implement this as
a function. Previous experience shows that stored procedures also have
limitation in this aspect.
Discussion: https://www.postgresql.org/message-id/flat/CAPpHfdsjtZLVzxjGT8rJHCYbM0D5dwkO+BBjcirozJ6nYbOW8Q@mail.gmail.com
Discussion: https://www.postgresql.org/message-id/flat/CABPTF7UNft368x-RgOXkfj475OwEbp%2BVVO-wEXz7StgjD_%3D6sw%40mail.gmail.com
Author: Kartyshov Ivan <i.kartyshov@postgrespro.ru>
Author: Alexander Korotkov <aekorotkov@gmail.com>
Author: Xuneng Zhou <xunengzhou@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Alexander Lakhin <exclusion@gmail.com>
Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Reviewed-by: Euler Taveira <euler@eulerto.com>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: jian he <jian.universality@gmail.com>
Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de>
Reviewed-by: Xuneng Zhou <xunengzhou@gmail.com>
Previously, unnamed portals were kept until the next Bind message or the
end of the transaction. This could cause temporary files to persist
longer than expected and make logging not reflect the actual SQL
responsible for the temporary file.
This patch changes exec_execute_message() to drop unnamed portals
immediately after execution to completion at the end of an Execute
message, making their removal more aggressive. This forces temporary
file cleanups to happen at the same time as the completion of the portal
execution, with statement logging correctly reflecting to which
statements these temporary files were attached to (see the diffs in the
TAP test updated by this commit for an idea).
The documentation is updated to describe the lifetime of unnamed
portals, and test cases are updated to verify temporary file removal and
proper statement logging after unnamed portal execution. This changes
how unnamed portals are handled in the protocol, hence no backpatch is
done.
Author: Frédéric Yhuel <frederic.yhuel@dalibo.com>
Co-Authored-by: Sami Imseih <samimseih@gmail.com>
Co-Authored-by: Mircea Cadariu <cadariu.mircea@gmail.com>
Discussion: https://postgr.es/m/CAA5RZ0tTrTUoEr3kDXCuKsvqYGq8OOHiBwoD-dyJocq95uEOTQ%40mail.gmail.com
We have never had a SET syntax that allows setting a GUC_LIST_INPUT
parameter to be an empty list. A locution such as
SET search_path = '';
doesn't mean that; it means setting the GUC to contain a single item
that is an empty string. (For search_path the net effect is much the
same, because search_path ignores invalid schema names and '' must be
invalid.) This is confusing, not least because configuration-file
entries and the set_config() function can easily produce empty-list
values.
We considered making the empty-string syntax do this, but that would
foreclose ever allowing empty-string items to be valid in list GUCs.
While there isn't any obvious use-case for that today, it feels like
the kind of restriction that might hurt someday. Instead, let's
accept the forbidden-up-to-now value NULL and treat that as meaning an
empty list. (An objection to this could be "what if we someday want
to allow NULL as a GUC value?". That seems unlikely though, and even
if we did allow it for scalar GUCs, we could continue to treat it as
meaning an empty list for list GUCs.)
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Andrei Klychkov <andrew.a.klychkov@gmail.com>
Reviewed-by: Jim Jones <jim.jones@uni-muenster.de>
Discussion: https://postgr.es/m/CA+mfrmwsBmYsJayWjc8bJmicxc3phZcHHY=yW5aYe=P-1d_4bg@mail.gmail.com
Several functions in the codebase accept "Datum *" parameters but do
not modify the pointed-to data. These have been updated to take
"const Datum *" instead, improving type safety and making the
interfaces clearer about their intent. This change helps the compiler
catch accidental modifications and better documents immutability of
arguments.
Most of "Datum *" parameters have a pairing "bool *isnull" parameter,
they are constified as well.
No functional behavior is changed by this patch.
Author: Chao Li <lic@highgo.com>
Discussion: https://www.postgresql.org/message-id/flat/CAEoWx2msfT0knvzUa72ZBwu9LR_RLY4on85w2a9YpE-o2By5HQ@mail.gmail.com
The docs for max_protocol_version suggested PQprotocolVersion()
instead of PQfullProtocolVersion() to find out the exact protocol
version. Since PQprotocolVersion() only returns the major protocol
version, that is bad advice.
Author: Jelte Fennema-Nio <postgres@jeltef.nl>
Reviewed-by: Shinya Kato <shinya11.kato@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAGECzQSKFxQsYAgr11PhdOr-RtPZEdAXZnHx6U3avLuk3xQaTQ%40mail.gmail.com
The new wal_fpi_bytes counter calculates the total amount of full page
images inserted in WAL records, in bytes. This commit exposes this
information in EXPLAIN (ANALYZE, WAL) alongside the existing counters,
for both the text and JSON/YAML outputs, building upon f9a09aa295.
Author: Shinya Kato <shinya11.kato@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discusssion: https://postgr.es/m/CAOzEurQtZEAfg6P0kU3Wa-f9BWQOi0RzJEMPN56wNTOmJLmfaQ@mail.gmail.com
The new %S substitution shows the current value of search_path.
Note that this only works when connected to Postgres v18 or newer,
since search_path was first marked as GUC_REPORT in commit
28a1121fd9. On older versions that don't report search_path, %S is
replaced with a question mark.
Suggested-by: Lauri Siltanen <lauri.siltanen@gmail.com>
Author: Florents Tselai <florents.tselai@gmail.com>
Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl>
Reviewed-by: Jim Jones <jim.jones@uni-muenster.de>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/CANsM767JhTKCRagTaq5Lz52fVwLPVkhSpyD1C%2BOrridGv0SO0A%40mail.gmail.com
This new counter, called "wal_fpi_bytes", tracks the total amount in
bytes of full page images (FPIs) generated in WAL. This data becomes
available globally via pg_stat_wal, and for backend statistics via
pg_stat_get_backend_wal().
Previously, this information could only be retrieved with pg_waldump or
pg_walinspect, which may not be available depending on the environment,
and are expensive to execute. It offers hints about how much FPIs
impact the WAL generated, which could be a large percentage for some
workloads, as well as the effects of wal_compression or page holes.
Bump catalog version.
Bump PGSTAT_FILE_FORMAT_ID, due to the addition of wal_fpi_bytes in
PgStat_WalCounters.
Author: Shinya Kato <shinya11.kato@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CAOzEurQtZEAfg6P0kU3Wa-f9BWQOi0RzJEMPN56wNTOmJLmfaQ@mail.gmail.com
This commit makes the way WAL segments are handled from the source to
the target server slightly smarter: the copy of the WAL segments is now
skipped if these have been created before the point where source and
target have diverged (the WAL segment where the point of divergence
exists is still copied), because we know that such segments exist on
both the target and source. Note that the on-disk size of the WAL
segments on the source and target need to match. Hence, only the
segments generated after the point of divergence are now copied. A
segment existing on the source but not the target is copied.
Previously, all the WAL segments were just copied in full. This change
can make the rewind operation cheaper in some configurations, especially
for setups where some WAL retention causes many segments to remain on
the source server even after the promotion of a standby used as source
to rewind a previous primary.
A TAP test is added to track these new behaviors. The file map printed
with --debug now includes all the information related to WAL segments,
to be able to track if these are copied or skipped, and the test relies
on the debug output generated.
Author: John Hsu <johnhyvr@gmail.com>
Author: Justin Kwan <justinpkwan@outlook.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>
Reviewed-by: Japin Li <japinli@hotmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Srinath Reddy Sadipiralla <srinath2133@gmail.com>
Discussion: https://postgr.es/m/181b4c6fa9c.b8b725681941212.7547232617810891479@viggy28.dev
The project Git server hasn't supported cloning with the Git protocol
in a very long time, but the documentation never got the memo. Remove
the mention of using the Git protocol, and while there wrap a mention
of Git in <productname> tags.
Backpatch down to all supported versions.
Author: Daniel Gustafsson <daniel@yesql.se>
Reported-by: Gurjeet Singh <gurjeet@singh.im>
Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com>
Reviewed-by: Gurjeet Singh <gurjeet@singh.im>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CABwTF4WMiMb-KT2NRcib5W0C8TQF6URMb+HK9a_=rnZnY8Q42w@mail.gmail.com
Backpatch-through: 13
This patch adds support for a new SQL command:
ALTER SUBSCRIPTION ... REFRESH SEQUENCES
This command updates the sequence entries present in the
pg_subscription_rel catalog table with the INIT state to trigger
resynchronization.
In addition to the new command, the following subscription commands have
been enhanced to automatically refresh sequence mappings:
ALTER SUBSCRIPTION ... REFRESH PUBLICATION
ALTER SUBSCRIPTION ... ADD PUBLICATION
ALTER SUBSCRIPTION ... DROP PUBLICATION
ALTER SUBSCRIPTION ... SET PUBLICATION
These commands will perform the following actions:
Add newly published sequences that are not yet part of the subscription.
Remove sequences that are no longer included in the publication.
This ensures that sequence replication remains aligned with the current
state of the publication on the publisher side.
Note that the actual synchronization of sequence data/values will be
handled in a subsequent patch that introduces a dedicated sequence sync
worker.
Author: Vignesh C <vignesh21@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: shveta malik <shveta.malik@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Nisha Moond <nisha.moond412@gmail.com>
Reviewed-by: Shlok Kyal <shlok.kyal.oss@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Hou Zhijie <houzj.fnst@fujitsu.com>
Discussion: https://postgr.es/m/CAA4eK1LC+KJiAkSrpE_NwvNdidw9F2os7GERUeSxSKv71gXysQ@mail.gmail.com
Previously, COPY TO command didn't support directly specifying
partitioned tables so users had to use COPY (SELECT ...) TO variant.
This commit adds direct COPY TO support for partitioned
tables, improving both usability and performance. Performance tests
show it's faster than the COPY (SELECT ...) TO variant as it avoids
the overheads of query processing and sending results to the COPY TO
command.
When used with partitioned tables, COPY TO copies the same rows as
SELECT * FROM table. Row-level security policies of the partitioned
table are applied in the same way as when executing COPY TO on a plain
table.
Author: jian he <jian.universality@gmail.com>
Reviewed-by: vignesh C <vignesh21@gmail.com>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: Melih Mutlu <m.melihmutlu@gmail.com>
Reviewed-by: Kirill Reshke <reshkekirill@gmail.com>
Reviewed-by: Atsushi Torikoshi <torikoshia@oss.nttdata.com>
Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/CACJufxEZt%2BG19Ors3bQUq-42-61__C%3Dy5k2wk%3DsHEFRusu7%3DiQ%40mail.gmail.com
Previously, attempting to use pg_checksums on a cluster with a control
file whose version does not match with what thetool is able to support
would lead to the following error:
pg_checksums: error: pg_control CRC value is incorrect
This is confusing, because it would look like the control file is
corrupted. However, the contents of the control file are correct,
pg_checksums not being able to understand how the past control file is
shaped.
This commit adds a check based on PG_VERSION, using the facility added
by cd0be131ba, using the same error message as some of the other
frontend tools. A note is added in the documentation about the major
version requirement.
Author: Michael Banck <mbanck@gmx.net>
Discussion: https://postgr.es/m/68f1ff21.170a0220.2c9b5f.4df5@mx.google.com
Improve the documentation of pg_stat_replication to explain when
the backend_xmin column becomes NULL. This happens when
a replication slot is used (the xmin is then shown in pg_replication_slots)
or when hot_standby_feedback is disabled.
Author: Renzo Dani <arons7@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/CA+XOKQAMXzskpdUmj2sg03_5fmiXc2Gs0r3TX1_rmcFcqh+=xQ@mail.gmail.com
The log output functionality of log_autovacuum_min_duration applies to
both VACUUM and ANALYZE, so it is not possible to separate the VACUUM
and ANALYZE log output thresholds. Logs are likely to be output only for
VACUUM and not for ANALYZE.
Therefore, we decided to separate the threshold for log output of VACUUM
by autovacuum (log_autovacuum_min_duration) and the threshold for log
output of ANALYZE by autovacuum (log_autoanalyze_min_duration).
Author: Shinya Kato <shinya11.kato@gmail.com>
Reviewed-by: Kasahara Tatsuhito <kasaharatt@oss.nttdata.com>
Discussion: https://www.postgresql.org/message-id/flat/CAOzEurQtfV4MxJiWT-XDnimEeZAY+rgzVSLe8YsyEKhZcajzSA@mail.gmail.com
There was some confusion around how to adjust the n_distinct estimates
for partitioned tables. Here we try and clarify that
n_distinct_inherited needs to be adjusted rather than n_distinct.
Also fix some slightly misleading text which was talking about table
size rather than table rows, fix a grammatical error, and adjust some
text which indicated that ANALYZE was performing calculations based on
the n_distinct settings. Really it's the query planner that does this
and ANALYZE only stores the overridden n_distinct estimate value in
pg_statistic.
Author: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: David G. Johnston <david.g.johnston@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Backpatch-through: 13
Discussion: https://postgr.es/m/CAApHDvrL7a-ZytM1SP8Uk9nEw9bR2CPzVb+uP+bcNj=_q-ZmVw@mail.gmail.com
Since protocol version 3.2 the CancelRequest does not have a fixed size
length anymore. The protocol docs still listed the length field to be a
constant number though. This fixes that.
Author: Jelte Fennema-Nio <postgres@jeltef.nl>
Reported-by: Dmitry Igrishin <dmitigr@gmail.com>
Backpatch-through: 18
This patch adds support for the ALL SEQUENCES clause in publications,
enabling synchronization/replication of all sequences that is useful for
upgrades.
Publications can now include all sequences via FOR ALL SEQUENCES.
psql enhancements:
\d shows publications for a given sequence.
\dRp indicates if a publication includes all sequences.
ALL SEQUENCES can be combined with ALL TABLES, but not with other options
like TABLE or TABLES IN SCHEMA. We can extend support for more granular
clauses in future.
The view pg_publication_sequences provides information about the mapping
between publications and sequences.
This patch enables publishing of sequences; subscriber-side support will
be added in upcoming patches.
Author: vignesh C <vignesh21@gmail.com>
Author: Tomas Vondra <tomas@vondra.me>
Reviewed-by: shveta malik <shveta.malik@gmail.com>
Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Nisha Moond <nisha.moond412@gmail.com>
Reviewed-by: Shlok Kyal <shlok.kyal.oss@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/CAA4eK1LC+KJiAkSrpE_NwvNdidw9F2os7GERUeSxSKv71gXysQ@mail.gmail.com
This commit introduces a new column mem_exceeded_count to the
pg_stat_replication_slots view. This counter tracks how often the
memory used by logical decoding exceeds the logical_decoding_work_mem
limit. The new statistic helps users determine whether exceeding the
logical_decoding_work_mem limit is a rare occurrences or a frequent
issue, information that wasn't available through existing statistics.
Bumps catversion.
Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: shveta malik <shveta.malik@gmail.com>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/978D21E8-9D3B-40EA-A4B1-F87BABE7868C@yesql.se
Eager aggregation is a query optimization technique that partially
pushes aggregation past a join, and finalizes it once all the
relations are joined. Eager aggregation may reduce the number of
input rows to the join and thus could result in a better overall plan.
In the current planner architecture, the separation between the
scan/join planning phase and the post-scan/join phase means that
aggregation steps are not visible when constructing the join tree,
limiting the planner's ability to exploit aggregation-aware
optimizations. To implement eager aggregation, we collect information
about aggregate functions in the targetlist and HAVING clause, along
with grouping expressions from the GROUP BY clause, and store it in
the PlannerInfo node. During the scan/join planning phase, this
information is used to evaluate each base or join relation to
determine whether eager aggregation can be applied. If applicable, we
create a separate RelOptInfo, referred to as a grouped relation, to
represent the partially-aggregated version of the relation and
generate grouped paths for it.
Grouped relation paths can be generated in two ways. The first method
involves adding sorted and hashed partial aggregation paths on top of
the non-grouped paths. To limit planning time, we only consider the
cheapest or suitably-sorted non-grouped paths in this step.
Alternatively, grouped paths can be generated by joining a grouped
relation with a non-grouped relation. Joining two grouped relations
is currently not supported.
To further limit planning time, we currently adopt a strategy where
partial aggregation is pushed only to the lowest feasible level in the
join tree where it provides a significant reduction in row count.
This strategy also helps ensure that all grouped paths for the same
grouped relation produce the same set of rows, which is important to
support a fundamental assumption of the planner.
For the partial aggregation that is pushed down to a non-aggregated
relation, we need to consider all expressions from this relation that
are involved in upper join clauses and include them in the grouping
keys, using compatible operators. This is essential to ensure that an
aggregated row from the partial aggregation matches the other side of
the join if and only if each row in the partial group does. This
ensures that all rows within the same partial group share the same
"destiny", which is crucial for maintaining correctness.
One restriction is that we cannot push partial aggregation down to a
relation that is in the nullable side of an outer join, because the
NULL-extended rows produced by the outer join would not be available
when we perform the partial aggregation, while with a
non-eager-aggregation plan these rows are available for the top-level
aggregation. Pushing partial aggregation in this case may result in
the rows being grouped differently than expected, or produce incorrect
values from the aggregate functions.
If we have generated a grouped relation for the topmost join relation,
we finalize its paths at the end. The final paths will compete in the
usual way with paths built from regular planning.
The patch was originally proposed by Antonin Houska in 2017. This
commit reworks various important aspects and rewrites most of the
current code. However, the original patch and reviews were very
useful.
Author: Richard Guo <guofenglinux@gmail.com>
Author: Antonin Houska <ah@cybertec.at> (in an older version)
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Jian He <jian.universality@gmail.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: Tomas Vondra <tomas@vondra.me> (in an older version)
Reviewed-by: Andy Fan <zhihuifan1213@163.com> (in an older version)
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> (in an older version)
Discussion: https://postgr.es/m/CAMbWs48jzLrPt1J_00ZcPZXWUQKawQOFE8ROc-ADiYqsqrpBNw@mail.gmail.com
This patch reuses the existing aggtransspace in pg_aggregate to
signal that an aggregate's transition state can grow unboundedly. If
aggtransspace is set to a negative value, it now indicates that the
transition state may consume unpredictable or large amounts of memory,
such as in aggregates like array_agg or string_agg that accumulate
input rows.
This information can be used by the planner to avoid applying
memory-sensitive optimizations (e.g., eager aggregation) when there is
a risk of excessive memory usage during partial aggregation.
Bump catalog version.
Per idea from Robert Haas, though applied differently than originally
suggested.
Discussion: https://postgr.es/m/CA+TgmoYbkvYwLa+1vOP7RDY7kO2=A7rppoPusoRXe44VDOGBPg@mail.gmail.com
It is possible to call pg_stat_reset_single_function_counters() for a
single function, but the reset time was missing the system view showing
its statistics. Like all the fields of pg_stat_user_functions, the GUC
track_functions needs to be enabled to show the statistics about
function executions.
Bump catalog version.
Bump PGSTAT_FILE_FORMAT_ID, as a result of the new field added to
PgStat_StatFuncEntry.
Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/aONjnsaJSx-nEdfU@paquier.xyz