The COMMENT should depend on the separately-dumped constraint, not the
domain. Sufficient restore parallelism might fail the COMMENT command
by issuing it before the constraint exists. Back-patch to v13, like
commit 0858f0f96e.
Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de>
Discussion: https://postgr.es/m/20250913020233.fa.nmisch@google.com
Backpatch-through: 13
Previously, pg_dump incorrectly queried pg_seclabel to retrieve security labels
for subscriptions, which are stored in pg_shseclabel as they are global objects.
This could result in security labels for subscriptions not being dumped.
This commit fixes the issue by updating pg_dump to query the pg_seclabels view,
which aggregates entries from both pg_seclabel and pg_shseclabel.
While querying pg_shseclabel directly for subscriptions was an alternative,
using pg_seclabels is simpler and sufficient.
In addition, pg_dump is updated to dump security labels on event triggers,
which were previously omitted.
Backpatch to all supported versions.
Author: Jian He <jian.universality@gmail.com>
Co-authored-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/CACJufxHCt00pR9h51AVu6+yPD5J7JQn=7dQXxqacj0XyDhc-fA@mail.gmail.com
Backpatch-through: 13
A malicious server could inject psql meta-commands into plain-text
dump output (i.e., scripts created with pg_dump --format=plain,
pg_dumpall, or pg_restore --file) that are run at restore time on
the machine running psql. To fix, introduce a new "restricted"
mode in psql that blocks all meta-commands (except for \unrestrict
to exit the mode), and teach pg_dump, pg_dumpall, and pg_restore to
use this mode in plain-text dumps.
While at it, encourage users to only restore dumps generated from
trusted servers or to inspect it beforehand, since restoring causes
the destination to execute arbitrary code of the source superusers'
choice. However, the client running the dump and restore needn't
trust the source or destination superusers.
Reported-by: Martin Rakhmanov
Reported-by: Matthieu Denais <litezeraw@gmail.com>
Reported-by: RyotaK <ryotak.mail@gmail.com>
Suggested-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Noah Misch <noah@leadboat.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Security: CVE-2025-8714
Backpatch-through: 13
Maliciously-crafted object names could achieve SQL injection during
restore. CVE-2012-0868 fixed this class of problem at the time, but
later work reintroduced three cases. Commit
bc8cd50fef (back-patched to v11+ in
2023-05 releases) introduced the pg_dump case. Commit
6cbdbd9e8d (v12+) introduced the two
pg_dumpall cases. Move sanitize_line(), unchanged, to dumputils.c so
pg_dumpall has access to it in all supported versions. Back-patch to
v13 (all supported versions).
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Backpatch-through: 13
Security: CVE-2025-8715
pg_dump sorts objects by their logical names, e.g. (nspname, relname,
tgname), before dependency-driven reordering. That removes one source
of logically-identical databases differing in their schema-only dumps.
In other words, it helps with schema diffing. The logical name sort
ignored essential sort keys for constraints, operators, PUBLICATION
... FOR TABLE, PUBLICATION ... FOR TABLES IN SCHEMA, operator classes,
and operator families. pg_dump's sort then depended on object OID,
yielding spurious schema diffs. After this change, OIDs affect dump
order only in the event of catalog corruption. While pg_dump also
wrongly ignored pg_collation.collencoding, CREATE COLLATION restrictions
have been keeping that imperceptible in practical use.
Use techniques like we use for object types already having full sort key
coverage. Where the pertinent queries weren't fetching the ignored sort
keys, this adds columns to those queries and stores those keys in memory
for the long term.
The ignorance of sort keys became more problematic when commit
172259afb5 added a schema diff test
sensitive to it. Buildfarm member hippopotamus witnessed that.
However, dump order stability isn't a new goal, and this might avoid
other dump comparison failures. Hence, back-patch to v13 (all supported
versions).
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://postgr.es/m/20250707192654.9e.nmisch@google.com
Backpatch-through: 13
Recent discussions of the mechanisms used to manage global data have
raised concerns about their robustness and security. Rather than try
to deal with those concerns at a very late stage of the release cycle,
the conclusion is to revert these features and work on them for the
next release.
This reverts parts or all of the following commits:
1495eff7bd Non text modes for pg_dumpall, correspondingly change pg_restore
5db3bf7391 Clean up from commit 1495eff7bd289f74d0cb Add more TAP tests for pg_dumpall
2ef5790806 Fix a couple of error messages and tests for them
b52a4a5f28 Clean up error messages from 1495eff7bd4170298b6e Further cleanup for directory creation on pg_dump/pg_dumpall
22cb6d2895 Fix memory leak in pg_restore.c
928394b664 Improve various new-to-v18 appendStringInfo calls
39729ec01d Fix fat fingering in 22cb6d28955822bf21d5 Add missing space in pg_restore documentation.
f09088a01d Free memory properly in pg_restore.c
40b9c27014 pg_restore cleanups
4aad2cb770 Portability fix: isdigit() must be passed an unsigned char.
88e947136b Fix typos and grammar in the code
f60420cff6 doc: Alphabetize long options for pg_dump[all].
bc35adee8d doc: Put new options in consistent order on man pages
a876464abc Message style improvements
dec6643487 Improve pg_dump/pg_dumpall help synopses and terminology
0ebd242555 Run pgperltidy
Discussion: https://postgr.es/m/20250708212819.09.nmisch@google.com
Backpatch-to: 18
Reviewed-by: Noah Misch <noah@leadboat.com>
Commit e5da0fe3c2 introduced catalog entries for not-null constraints
on domains; but because commit b0e96f3119 (the original work for
catalogued not-null constraints on tables) forgot to teach pg_dump to
process the comments for them, this one also forgot. Add that now.
We also need to teach repairDependencyLoop() about the new type of
constraints being possible for domains.
Backpatch-through: 17
Co-authored-by: jian he <jian.universality@gmail.com>
Co-authored-by: Álvaro Herrera <alvherre@kurilemu.de>
Reported-by: jian he <jian.universality@gmail.com>
Discussion: https://postgr.es/m/CACJufxF-0bqVR=j4jonS6N2Ka6hHUpFyu3_3TWKNhOW_4yFSSg@mail.gmail.com
We skip dumping constraints together with domains if they are invalid
('separate') so that they appear after data -- but their comments were
dumped together with the domain definition, which in effect leads to the
comment being dumped when the constraint does not yet exist. Delay
them in the same way.
Oversight in 7eca575d1c28; backpatch all the way back.
Author: jian he <jian.universality@gmail.com>
Discussion: https://postgr.es/m/CACJufxF_C2pe6J_+nPr6C5jf5rQnbYP8XOKr4HM8yHZtp2aQqQ@mail.gmail.com
We were missing collecting comments for not-null constraints that are
dumped inline with the table definition (i.e., valid ones), because they
aren't represented by a separately dumpable object. Fix by creating
separate TocEntries for the comments.
Co-Authored-By: Jian He <jian.universality@gmail.com>
Co-Authored-By: Álvaro Herrera <alvherre@kurilemu.de>
Reported-By: Fujii Masao <masao.fujii@oss.nttdata.com>
Reviewed-By: Fujii Masao <masao.fujii@oss.nttdata.com>
Discussion: https://postgr.es/m/d50ff977-c728-4e9e-8488-fc2688e08754@oss.nttdata.com
Increase consistency of --help and man page synopses between pg_dump
and pg_dumpall. These should now be very similar, as pg_dumpall can
now also produce non-text dump output. But actually, they had drifted
further apart.
- Use verb "export" consistently, instead of "dump" or "extract".
- Use "SQL script" instead of just "script" or "text file".
- Maintain consistent distinction between SQL script and other
formats/archives (which is relevant for pg_restore).
Reviewed-by: Robert Treat <rob@xzilla.net>
Discussion: https://www.postgresql.org/message-id/flat/3f71d8a7-095b-4829-9b0b-fce09e9866b3%40eisentraut.org
Commit 1fd1bd8710 introduced support for dumping statistics with
pg_dump and pg_dumpall, covering tables, materialized views, and indexes.
However, it overlooked foreign tables, even though functions like
pg_restore_relation_stats() support them.
This commit fixes that oversight by allowing pg_dump and pg_dumpall
to include statistics for foreign tables.
Author: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Corey Huinker <corey.huinker@gmail.com>
Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Discussion: https://postgr.es/m/3772e4e4-ef39-4deb-bb76-aa8165f33fb6@oss.nttdata.com
Before v14, a reltuples value of 0 was ambiguous: it could either
mean the relation is empty, or it could mean that it hadn't yet
been vacuumed or analyzed. (Commit 3d351d916b taught v14 and newer
to use -1 for the latter case.) This ambiguity allegedly can cause
the planner to choose inefficient plans after restoring to v18 or
newer. To fix, let's just dump reltuples as -1 in that case. This
will cause some truly empty tables to be seen as not-yet-processed,
but that seems unlikely to cause too much trouble in practice.
Note that we could alternatively teach pg_restore_relation_stats()
to translate reltuples based on the version argument, but since
that function doesn't exist until v18, there's no particular
advantage to that approach. That is, there's no chance of
restoring stats dumped from a pre-v14 server to another pre-v14
server. Per discussion, the current policy is to fix pre-v18
behavior differences during export and everything else during
import.
Commit 9879105024 fixed a similar problem for vacuumdb by removing
the check for reltuples != 0. Presumably we could reinstate that
check now, but I've chosen to leave it in place in case reltuples
isn't accurate. As before, processing some empty tables seems
relatively harmless.
Author: Hari Krishna Sunder <hari.db.pg@gmail.com>
Reviewed-by: Jeff Davis <pgsql@j-davis.com>
Reviewed-by: Corey Huinker <corey.huinker@gmail.com>
Discussion: https://postgr.es/m/CAAeiqZ0o2p4SX5_xPcuAbbsmXjg6MJLNuPYSLUjC%3DWh-VeW64A%40mail.gmail.com
Presently, fetchAttributeStats() builds array literals by treating
the elements as SQL identifiers. This is incorrect for a couple of
reasons:
* Array literal content must match the external text representation
of the array, i.e., what array_out() would return. One notable
problem is that double quotes are escaped with "" in identifiers
but with \" in array literals. To fix, build the array content
using the pre-existing appendPGArray() function.
* Array literals must be written as string constants. A notable
problem here is that single quotes are escaped via '' in strings
but are not escaped in the text representation of an array. To
fix, append the aforementioned array literal content to the query
with appendStringLiteralAH().
While at it, modify a test case to use an identifier that would
cause the test to fail without this change.
Oversight in commit 9c02e3a986.
Reported-by: Philippe Beaudoin <pbh.emaj@free.fr>
Author: Jian He <jian.universality@gmail.com>
Co-authored-by: Nathan Bossart <nathandbossart@gmail.com>
Co-authored-by: Stepan Neretin <slpmcf@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Bug: #18923
Discussion: https://postgr.es/m/18923-e79273f87c6bed69%40postgresql.org
When a child constraint is validated and the parent constraint it
derives from isn't, pg_dump must be coerced into printing the child
constraint; failing to do would result in a dump that restores the
constraint as not valid, which would be incorrect.
Co-authored-by: jian he <jian.universality@gmail.com>
Co-authored-by: Álvaro Herrera <alvherre@kurilemu.de>
Reported-by: jian he <jian.universality@gmail.com>
Message-id: https://postgr.es/m/CACJufxGHNNMc0E2JphUqJMzD3=bwRSuAEVBF5ekgkG8uY0Q3hg@mail.gmail.com
Use appendPQExpBufferStr when there are no parameters and
appendPQExpBufferChar when the string length is 1.
Unlike 3fae25cbb, which fixed this issue for code that was new to v18,
this one fixes up instances which exist in the backbranches. We've
historically tried to maintain this standard and if we're going to
continue doing that, then we won't be doing that selectively based on
when the code was introduced. Now seems like a good time to flush out the
existing misuses. Waiting until v19 just prolongs their existence in
terms of released versions that the misuses exist in.
Author: David Rowley <drowleyml@gmail.com>
Discussion: https://postgr.es/m/CAApHDvoARMvPeXTTC0HnpARBHn-WgVstc8XFCyMGOzvgu_1HvQ@mail.gmail.com
We'd try to drop the partitions of a partitioned index separately,
which is disallowed by the backend, leading to an error during
restore. While the error is harmless, it causes problems if you
try to use --single-transaction mode.
Fortunately, there seems no need to do a DROP at all, since the
partition will go away silently when we drop either the parent index
or the partition's table. So just make the DROP conditional on not
being a partition.
Reported-by: jian he <jian.universality@gmail.com>
Author: jian he <jian.universality@gmail.com>
Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CACJufxF0QSdkjFKF4di-JGWN6CSdQYEAhGPmQJJCdkSZtd=oLg@mail.gmail.com
Backpatch-through: 13
This allows them to be added without scanning the table, and validating
them afterwards without holding access exclusive lock on the table after
any violating rows have been deleted or fixed.
Doing ALTER TABLE ... SET NOT NULL for a column that has an invalid
not-null constraint validates that constraint. ALTER TABLE .. VALIDATE
CONSTRAINT is also supported. There are various checks on whether an
invalid constraint is allowed in a child table when the parent table has
a valid constraint; this should match what we do for enforced/not
enforced constraints.
pg_attribute.attnotnull is now only an indicator for whether a not-null
constraint exists for the column; whether it's valid or invalid must be
queried in pg_constraint. Applications can continue to query
pg_attribute.attnotnull as before, but now it's possible that NULL rows
are present in the column even when that's set to true.
For backend internal purposes, we cache the nullability status in
CompactAttribute->attnullability that each tuple descriptor carries
(replacing CompactAttribute.attnotnull, which was a mirror of
Form_pg_attribute.attnotnull). During the initial tuple descriptor
creation, based on the pg_attribute scan, we set this to UNRESTRICTED if
pg_attribute.attnotnull is false, or to UNKNOWN if it's true; then we
update the latter to VALID or INVALID depending on the pg_constraint
scan. This flag is also copied when tupledescs are copied.
Comparing tuple descs for equality must also compare the
CompactAttribute.attnullability flag and return false in case of a
mismatch.
pg_dump deals with these constraints by storing the OIDs of invalid
not-null constraints in a separate array, and running a query to obtain
their properties. The regular table creation SQL omits them entirely.
They are then dealt with in the same way as "separate" CHECK
constraints, and dumped after the data has been loaded. Because no
additional pg_dump infrastructure was required, we don't bump its
version number.
I decided not to bump catversion either, because the old catalog state
works perfectly in the new world. (Trying to run with new catalog state
and the old server version would likely run into issues, however.)
System catalogs do not support invalid not-null constraints (because
commit 14e87ffa5c didn't allow them to have pg_constraint rows
anyway.)
Author: Rushabh Lathia <rushabh.lathia@gmail.com>
Author: Jian He <jian.universality@gmail.com>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Tested-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Discussion: https://postgr.es/m/CAGPqQf0KitkNack4F5CFkFi-9Dqvp29Ro=EpcWt=4_hs-Rt+bQ@mail.gmail.com
Commit 9c02e3a986 taught pg_dump to retrieve attribute statistics
for 64 relations at a time. pg_dump supports dumping from v9.2 and
newer versions, but our query for retrieving statistics for
multiple relations uses WITH ORDINALITY and multi-argument
UNNEST(), both of which were introduced in v9.4. To fix, we resort
to gathering statistics for a single relation at a time on versions
older than v9.4.
Per buildfarm member crake.
Author: Corey Huinker <corey.huinker@gmail.com>
Discussion: https://postgr.es/m/Z_BcWVMvlUIJ_iuZ%40nathan
pg_dumpall acquires a new -F/--format option, with the same meanings as
pg_dump. The default is p, meaning plain text. For any other value, a
directory is created containing two files, globals.data and map.dat. The
first contains SQL for restoring the global data, and the second
contains a map from oids to database names. It will also contain a
subdirectory called databases, inside which it will create archives in
the specified format, named using the database oids.
In these casess the -f argument is required.
If pg_restore encounters a directory containing globals.dat, and no
toc.dat, it restores the global settings and then restores each
database.
pg_restore acquires two new options: -g/--globals-only which suppresses
restoration of any databases, and --exclude-database which inhibits
restoration of particualr database(s) in the same way the same option
works in pg_dumpall.
Author: Mahendra Singh Thalor <mahi6run@gmail.com>
Co-authored-by: Andrew Dunstan <andrew@dunslane.net>
Reviewed-by: jian he <jian.universality@gmail.com>
Reviewed-by: Srinath Reddy <srinath2133@gmail.com>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/cb103623-8ee6-4ba5-a2c9-f32e3a4933fa@dunslane.net
ConnectDatabase is used by pg_dumpall, pg_restore and pg_dump so move
common code to new file.
new file name: connectdb.c
Author: Mahendra Singh Thalor <mahi6run@gmail.com>
Currently, pg_dump gathers attribute statistics with a query per
relation, which can cause pg_dump to take significantly longer,
especially when there are many relations. This commit addresses
this by teaching pg_dump to gather attribute statistics for 64
relations at a time. Some simple tests showed this was the optimal
batch size, but performance may vary depending on the workload.
Our lookahead code determines the next batch of relations by
searching the TOC sequentially for relevant entries. This approach
assumes that we will dump all such entries in TOC order, which
unfortunately isn't true for dump formats that use
RestoreArchive(). RestoreArchive() does multiple passes through
the TOC and selectively dumps certain groups of entries each time.
This is particularly problematic for index stats and a subset of
matview stats; both are in SECTION_POST_DATA, but matview stats
that depend on matview data are dumped in RESTORE_PASS_POST_ACL,
while all other stats are dumped in RESTORE_PASS_MAIN. To handle
this, this commit moves all statistics data entries in
SECTION_POST_DATA to RESTORE_PASS_POST_ACL, which ensures that we
always dump them in TOC order. A convenient side effect of this
change is that we can revert a decent chunk of commit a0a4601765,
but that is left for a follow-up commit.
Author: Corey Huinker <corey.huinker@gmail.com>
Co-authored-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Jeff Davis <pgsql@j-davis.com>
Discussion: https://postgr.es/m/CADkLM%3Dc%2Br05srPy9w%2B-%2BnbmLEo15dKXYQ03Q_xyK%2BriJerigLQ%40mail.gmail.com
Right now, pg_dump stores all generated commands for statistics in
memory. These commands can be quite large and therefore can
significantly increase pg_dump's memory footprint. To fix, wait
until we are about to write out the commands before generating
them, and be sure to free the commands after writing. This is
implemented via a new defnDumper callback that works much like the
dataDumper one but is specifically designed for TOC entries.
Custom dumps that include data might write the TOC twice (to update
data offset information), which would ordinarily cause pg_dump to
run the attribute statistics queries twice. However, as a hack, we
save the length of the written-out entry in the first pass and skip
over it in the second. While there is no known technical issue
with executing the queries multiple times and rewriting the
results, it's expensive and feels risky, so let's avoid it.
As an exception, we _do_ execute the queries twice for the tar
format. This format does a second pass through the TOC to generate
the restore.sql file. pg_restore doesn't use this file, so even if
the second round of queries returns different results than the
first, it won't corrupt the output; the archive and restore.sql
file will just have different content. A follow-up commit will
teach pg_dump to gather attribute statistics in batches, which our
testing indicates more than makes up for the added expense of
running the queries twice.
Author: Corey Huinker <corey.huinker@gmail.com>
Co-authored-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Jeff Davis <pgsql@j-davis.com>
Discussion: https://postgr.es/m/CADkLM%3Dc%2Br05srPy9w%2B-%2BnbmLEo15dKXYQ03Q_xyK%2BriJerigLQ%40mail.gmail.com
Previously, ALTER DEFAULT PRIVILEGES did not support large objects.
This meant that to grant privileges to users other than the owner,
permissions had to be manually assigned each time a large object
was created, which was inconvenient.
This commit extends ALTER DEFAULT PRIVILEGES to allow defining default
access privileges for large objects. With this change, specified privileges
will automatically apply to newly created large objects, making privilege
management more efficient.
As a side effect, this commit introduces the new keyword OBJECTS
since it's used in the syntax of ALTER DEFAULT PRIVILEGES.
Original patch by Haruka Takatsuka, with some fixes and tests by Yugo Nagata,
and rebased by Laurenz Albe.
Author: Takatsuka Haruka <harukat@sraoss.co.jp>
Co-authored-by: Yugo Nagata <nagata@sraoss.co.jp>
Co-authored-by: Laurenz Albe <laurenz.albe@cybertec.at>
Reviewed-by: Masao Fujii <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/20240424115242.236b499b2bed5b7a27f7a418@sraoss.co.jp
This expands the NOT ENFORCED constraint flag, previously only
supported for CHECK constraints (commit ca87c415e2), to foreign key
constraints.
Normally, when a foreign key constraint is created on a table, action
and check triggers are added to maintain data integrity. With this
patch, if a constraint is marked as NOT ENFORCED, integrity checks are
no longer required, making these triggers unnecessary. Consequently,
when creating a NOT ENFORCED foreign key constraint, triggers will not
be created, and the constraint will be marked as NOT VALID.
Similarly, if an existing foreign key constraint is changed to NOT
ENFORCED, the associated triggers will be dropped, and the constraint
will also be marked as NOT VALID. Conversely, if a NOT ENFORCED
foreign key constraint is changed to ENFORCED, the necessary triggers
will be created, and the will be changed to VALID by performing
necessary validation.
Since not-enforced foreign key constraints have no triggers, the
shortcut used for example in psql and pg_dump to skip looking for
foreign keys if the relation is known not to have triggers no longer
applies. (It already didn't work for partitioned tables.)
Author: Amul Sul <sulamul@gmail.com>
Reviewed-by: Joel Jacobson <joel@compiler.org>
Reviewed-by: Andrew Dunstan <andrew@dunslane.net>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: jian he <jian.universality@gmail.com>
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Reviewed-by: Isaac Morland <isaac.morland@gmail.com>
Reviewed-by: Alexandra Wang <alexandra.wang.oss@gmail.com>
Tested-by: Triveni N <triveni.n@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/CAAJ_b962c5AcYW9KUt_R_ER5qs3fUGbe4az-SP-vuwPS-w-AGA@mail.gmail.com
REFRESH MATERIALIZED VIEW replaces the storage, which resets
statistics, so statistics must be restored afterward.
If both statistics and data are being dumped for a materialized view,
add a dependency from the former to the latter. Defer the statistics
to SECTION_POST_DATA, and use RESTORE_PASS_POST_ACL.
Reported-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Discussion: https://postgr.es/m/CAExHW5s47kmubpbbRJzSM-Zfe0Tj2O3GBagB7YAyE8rQ-V24Uw@mail.gmail.com
By adding the positive variants of options, in addition to the
negative variants that already exist, users can be explicit about what
pg_dump should produce.
Discussion: https://postgr.es/m/bd0513e4b1ea2b2f2d06f02720c6579711cb62a6.camel@j-davis.com
Reviewed-by: Corey Huinker <corey.huinker@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
This new option instructs pg_dump to dump sequence data when the
--no-data, --schema-only, or --statistics-only option is specified.
This was originally considered for commit a7e5457db8, but it was
left out at that time because there was no known use-case. A
follow-up commit will use this to optimize pg_upgrade's file
transfer step.
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://postgr.es/m/Zyvop-LxLXBLrZil%40nathan
For import and export, use schemaname/relname rather than
regclass.
This is more natural during export, fits with the other arguments
better, and it gives better control over error handling in case we
need to downgrade more errors to warnings.
Also, use text for the argument types for schemaname, relname, and
attname so that casts to "name" are not required.
Author: Corey Huinker <corey.huinker@gmail.com>
Discussion: https://postgr.es/m/CADkLM=ceOSsx_=oe73QQ-BxUFR2Cwqum7-UP_fPe22DBY0NerA@mail.gmail.com
Add --no-policies option to control row level security policy handling
in dump and restore operations. When this option is used, both CREATE
POLICY commands and ALTER TABLE ... ENABLE ROW LEVEL SECURITY commands
are excluded from dumps and skipped during restores.
This is useful in scenarios where policies need to be redefined in the
target system or when moving data between environments with different
security requirements.
Author: Nikolay Samokhvalov <nik@postgres.ai>
Reviewed-by: Greg Sabino Mullane <htamfids@gmail.com>
Reviewed-by: Jim Jones <jim.jones@uni-muenster.de>
Reviewed-by: newtglobal postgresql_contributors <postgresql_contributors@newtglobalcorp.com>
Discussion: https://postgr.es/m/CAM527d8kG2qPKvbfJ=OYJkT7iRNd623Bk+m-a4ngm+nyHYsHog@mail.gmail.com
Commit 8f427187db improved performance by remembering relation stats
as native types rather than issuing a new query for each relation.
Using native types is fine for integers like relpages; but reltuples
is floating point. The commit controllled for that complexity by using
setlocale(LC_NUMERIC, "C"). After that, Alexander Lakhin found a
problem in pg_strtof(), fixed in 00d61a08c5.
While we aren't aware of any more problems with that approach, it
seems wise to just use a string the whole way for floating point
values, as Corey's original patch did, and get rid of the
setlocale(). Integers are still converted to native types to avoid
wasting memory.
Co-authored-by: Corey Huinker <corey.huinker@gmail.com>
Discussion: https://postgr.es/m/3049348.1740855411@sss.pgh.pa.us
Discussion: https://postgr.es/m/560cca3781740bd69881bb07e26eb8f65b09792c.camel%40j-davis.com
Previously we used attname for both table and index columns, but
that is problematic for indexes because their attnames are assigned
by internal rules that don't guarantee to preserve the names across
dump and reload. (This is what's causing the remaining buildfarm
failures in cross-version-upgrade tests.) Fortunately we can use
attnum instead, since there's no such thing as adding or dropping
columns in an existing index. We met this same problem previously
with ALTER INDEX ... SET STATISTICS, and solved it the same way,
cf commit 5b6d13eec.
In pg_restore_attribute_stats() itself, we accept either attnum or
attname, but the policy used by pg_dump is to always use attname
for tables and attnum for indexes.
Author: Tom Lane <tgl@sss.pgh.pa.us>
Author: Corey Huinker <corey.huinker@gmail.com>
Discussion: https://postgr.es/m/1457469.1740419458@sss.pgh.pa.us
Follow precedent in pg_dump for preparing queries to improve
performance. Also, simplify the query by removing unnecessary joins.
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reported-by: Andres Freund <andres@anarazel.de>
Co-authored-by: Corey Huinker <corey.huinker@gmail.com>
Co-authored-by: Jeff Davis <pgsql@j-davis.com>
Discussion: https://postgr.es/m/CADkLM=dRMC6t8gp9GVf6y6E_r5EChQjMAAh_vPyih_zMiq0zvA@mail.gmail.com
The few fields we need can be easily collected in getTables() and
getIndexes() and stored in RelStatsInfo.
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reported-by: Andres Freund <andres@anarazel.de>
Co-authored-by: Corey Huinker <corey.huinker@gmail.com>
Co-authored-by: Jeff Davis <pgsql@j-davis.com>
Discussion: https://postgr.es/m/CADkLM=f0a43aTd88xW4xCFayEF25g-7hTrHX_WhV40HyocsUGg@mail.gmail.com
Add support to pg_dump for dumping stats, and use that during
pg_upgrade so that statistics are transferred during upgrade. In most
cases this removes the need for a costly re-analyze after upgrade.
Some statistics are not transferred, such as extended statistics or
statistics with a custom stakind.
Now pg_dump accepts the options --schema-only, --no-schema,
--data-only, --no-data, --statistics-only, and --no-statistics; which
allow all combinations of schema, data, and/or stats. The options are
named this way to preserve compatibility with the previous
--schema-only and --data-only options.
Statistics are in SECTION_DATA, unless the object itself is in
SECTION_POST_DATA.
The stats are represented as calls to pg_restore_relation_stats() and
pg_restore_attribute_stats().
Author: Corey Huinker, Jeff Davis
Reviewed-by: Jian He
Discussion: https://postgr.es/m/CADkLM=fzX7QX6r78fShWDjNN3Vcr4PVAnvXxQ4DiGy6V=0bCUA@mail.gmail.com
Discussion: https://postgr.es/m/CADkLM%3DcB0rF3p_FuWRTMSV0983ihTRpsH%2BOCpNyiqE7Wk0vUWA%40mail.gmail.com
The current implementation inconsistently includes public schema but not
information_schema when those are specified in FOR TABLES IN SCHMEA ...
Apart from that, the current behavior for publications w.r.t exclude table
and schema (--exclude-table, --exclude-schema) option differs from what we
do at other places. We try to avoid including publications for
corresponding tables or schemas when an exclude-table or exclude-schema
option is given, unlike what we do for views using functions defined in a
particular schema or a subscription pointing to publications with their
corresponding exclude options.
I decided not to backpatch this as it leads to a behavior change and we don't
see any field report for current behavior.
Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Author: Vignesh C <vignesh21@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/1270733.1734134272@sss.pgh.pa.us
Coverity reported the two oversights in getPublicationTables.
Valgrind found the one in determineNotNullFlags.
The mistakes in getPublicationTables seem too minor to be worth
back-patching. determineNotNullFlags could be run enough times
to matter, but that code is new in v18. So, no back-patch.
This commit adds fmtIdEnc() and fmtQualifiedIdEnc(), which allow to specify
the encoding as an explicit argument. Additionally setFmtEncoding() is
provided, which defines the encoding when no explicit encoding is provided, to
avoid breaking all code using fmtId().
All users of fmtId()/fmtQualifiedId() are either converted to the explicit
version or a call to setFmtEncoding() has been added.
This commit does not yet utilize the now well-defined encoding, that will
happen in a subsequent commit.
Reviewed-by: Noah Misch <noah@leadboat.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Backpatch-through: 13
Security: CVE-2025-1094
This adds a new variant of generated columns that are computed on read
(like a view, unlike the existing stored generated columns, which are
computed on write, like a materialized view).
The syntax for the column definition is
... GENERATED ALWAYS AS (...) VIRTUAL
and VIRTUAL is also optional. VIRTUAL is the default rather than
STORED to match various other SQL products. (The SQL standard makes
no specification about this, but it also doesn't know about VIRTUAL or
STORED.) (Also, virtual views are the default, rather than
materialized views.)
Virtual generated columns are stored in tuples as null values. (A
very early version of this patch had the ambition to not store them at
all. But so much stuff breaks or gets confused if you have tuples
where a column in the middle is completely missing. This is a
compromise, and it still saves space over being forced to use stored
generated columns. If we ever find a way to improve this, a bit of
pg_upgrade cleverness could allow for upgrades to a newer scheme.)
The capabilities and restrictions of virtual generated columns are
mostly the same as for stored generated columns. In some cases, this
patch keeps virtual generated columns more restricted than they might
technically need to be, to keep the two kinds consistent. Some of
that could maybe be relaxed later after separate careful
considerations.
Some functionality that is currently not supported, but could possibly
be added as incremental features, some easier than others:
- index on or using a virtual column
- hence also no unique constraints on virtual columns
- extended statistics on virtual columns
- foreign-key constraints on virtual columns
- not-null constraints on virtual columns (check constraints are supported)
- ALTER TABLE / DROP EXPRESSION
- virtual column cannot have domain type
- virtual columns are not supported in logical replication
The tests in generated_virtual.sql have been copied over from
generated_stored.sql with the keyword replaced. This way we can make
sure the behavior is mostly aligned, and the differences can be
visible. Some tests for currently not supported features are
currently commented out.
Reviewed-by: Jian He <jian.universality@gmail.com>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Tested-by: Shlok Kyal <shlok.kyal.oss@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/a368248e-69e4-40be-9c07-6c3b5880b0a6@eisentraut.org
The column added in commit e65dbc9927, pubgencols_type, was inconsistent
with the naming conventions of other columns in the pg_publication
catalog.
Author: Vignesh C <vignesh21@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Discussion: https://postgr.es/m/CALDaNm1u-ufVOW-RUsXSooqzkpohxfZYy=z78fbcr_9Pq5hbCg@mail.gmail.com
The current boolean publish_generated_columns option only supports a
binary choice, which is insufficient for future enhancements where
generated columns can be of different types (e.g., stored or virtual). The
supported values for the publish_generated_columns option are 'none' and
'stored'.
Author: Vignesh C <vignesh21@gmail.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/d718d219-dd47-4a33-bb97-56e8fc4da994@eisentraut.org
Discussion: https://postgr.es/m/B80D17B2-2C8E-4C7D-87F2-E5B4BE3C069E@gmail.com