Commit graph

1324 commits

Author SHA1 Message Date
Noah Misch
4ad846445d Fix pg_dump COMMENT dependency for separate domain constraints.
The COMMENT should depend on the separately-dumped constraint, not the
domain.  Sufficient restore parallelism might fail the COMMENT command
by issuing it before the constraint exists.  Back-patch to v13, like
commit 0858f0f96e.

Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de>
Discussion: https://postgr.es/m/20250913020233.fa.nmisch@google.com
Backpatch-through: 13
2025-09-16 09:40:48 -07:00
Fujii Masao
176002c5bf pg_dump: Fix dumping of security labels on subscriptions and event triggers.
Previously, pg_dump incorrectly queried pg_seclabel to retrieve security labels
for subscriptions, which are stored in pg_shseclabel as they are global objects.
This could result in security labels for subscriptions not being dumped.

This commit fixes the issue by updating pg_dump to query the pg_seclabels view,
which aggregates entries from both pg_seclabel and pg_shseclabel.
While querying pg_shseclabel directly for subscriptions was an alternative,
using pg_seclabels is simpler and sufficient.

In addition, pg_dump is updated to dump security labels on event triggers,
which were previously omitted.

Backpatch to all supported versions.

Author: Jian He <jian.universality@gmail.com>
Co-authored-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/CACJufxHCt00pR9h51AVu6+yPD5J7JQn=7dQXxqacj0XyDhc-fA@mail.gmail.com
Backpatch-through: 13
2025-09-16 16:46:28 +09:00
Nathan Bossart
67a2fbb8f9 Restrict psql meta-commands in plain-text dumps.
A malicious server could inject psql meta-commands into plain-text
dump output (i.e., scripts created with pg_dump --format=plain,
pg_dumpall, or pg_restore --file) that are run at restore time on
the machine running psql.  To fix, introduce a new "restricted"
mode in psql that blocks all meta-commands (except for \unrestrict
to exit the mode), and teach pg_dump, pg_dumpall, and pg_restore to
use this mode in plain-text dumps.

While at it, encourage users to only restore dumps generated from
trusted servers or to inspect it beforehand, since restoring causes
the destination to execute arbitrary code of the source superusers'
choice.  However, the client running the dump and restore needn't
trust the source or destination superusers.

Reported-by: Martin Rakhmanov
Reported-by: Matthieu Denais <litezeraw@gmail.com>
Reported-by: RyotaK <ryotak.mail@gmail.com>
Suggested-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Noah Misch <noah@leadboat.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Security: CVE-2025-8714
Backpatch-through: 13
2025-08-11 09:00:00 -05:00
Noah Misch
13a67ce603 Convert newlines to spaces in names written in v11+ pg_dump comments.
Maliciously-crafted object names could achieve SQL injection during
restore.  CVE-2012-0868 fixed this class of problem at the time, but
later work reintroduced three cases.  Commit
bc8cd50fef (back-patched to v11+ in
2023-05 releases) introduced the pg_dump case.  Commit
6cbdbd9e8d (v12+) introduced the two
pg_dumpall cases.  Move sanitize_line(), unchanged, to dumputils.c so
pg_dumpall has access to it in all supported versions.  Back-patch to
v13 (all supported versions).

Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Backpatch-through: 13
Security: CVE-2025-8715
2025-08-11 06:19:03 -07:00
Jeff Davis
a3e8dc1438 Simplify options in pg_dump and pg_restore.
Remove redundant options --with-data and --with-schema, and rename
--with-statistics to just --statistics.

Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/f379d0aeefe8effe13302a436bc28f549f09e924.camel@j-davis.com
Backpatch-through: 18
2025-08-02 07:51:35 -07:00
Jeff Davis
60121890f7 pg_dump: reject combination of "only" and "with"
Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de>
Discussion: https://postgr.es/m/8ce896d1a05040905cc1a3afbc04e94d8e95669a.camel@j-davis.com
Backpatch-through: 18
2025-08-01 10:06:50 -07:00
Noah Misch
c0ae03384f Sort dump objects independent of OIDs, for the 7 holdout object types.
pg_dump sorts objects by their logical names, e.g. (nspname, relname,
tgname), before dependency-driven reordering.  That removes one source
of logically-identical databases differing in their schema-only dumps.
In other words, it helps with schema diffing.  The logical name sort
ignored essential sort keys for constraints, operators, PUBLICATION
... FOR TABLE, PUBLICATION ... FOR TABLES IN SCHEMA, operator classes,
and operator families.  pg_dump's sort then depended on object OID,
yielding spurious schema diffs.  After this change, OIDs affect dump
order only in the event of catalog corruption.  While pg_dump also
wrongly ignored pg_collation.collencoding, CREATE COLLATION restrictions
have been keeping that imperceptible in practical use.

Use techniques like we use for object types already having full sort key
coverage.  Where the pertinent queries weren't fetching the ignored sort
keys, this adds columns to those queries and stores those keys in memory
for the long term.

The ignorance of sort keys became more problematic when commit
172259afb5 added a schema diff test
sensitive to it.  Buildfarm member hippopotamus witnessed that.
However, dump order stability isn't a new goal, and this might avoid
other dump comparison failures.  Hence, back-patch to v13 (all supported
versions).

Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://postgr.es/m/20250707192654.9e.nmisch@google.com
Backpatch-through: 13
2025-07-31 06:37:59 -07:00
Andrew Dunstan
4a9ee867bf Revert Non text modes for pg_dumpall, and pg_restore support
Recent discussions of the mechanisms used to manage global data have
raised concerns about their robustness and security. Rather than try
to deal with those concerns at a very late stage of the release cycle,
the conclusion is to revert these features and work on them for the
next release.

This reverts parts or all of the following commits:

1495eff7bd Non text modes for pg_dumpall, correspondingly change pg_restore
5db3bf7391 Clean up from commit 1495eff7bd
289f74d0cb Add more TAP tests for pg_dumpall
2ef5790806 Fix a couple of error messages and tests for them
b52a4a5f28 Clean up error messages from 1495eff7bd
4170298b6e Further cleanup for directory creation on pg_dump/pg_dumpall
22cb6d2895 Fix memory leak in pg_restore.c
928394b664 Improve various new-to-v18 appendStringInfo calls
39729ec01d Fix fat fingering in 22cb6d2895
5822bf21d5 Add missing space in pg_restore documentation.
f09088a01d Free memory properly in pg_restore.c
40b9c27014 pg_restore cleanups
4aad2cb770 Portability fix: isdigit() must be passed an unsigned char.
88e947136b Fix typos and grammar in the code
f60420cff6 doc: Alphabetize long options for pg_dump[all].
bc35adee8d doc: Put new options in consistent order on man pages
a876464abc Message style improvements
dec6643487 Improve pg_dump/pg_dumpall help synopses and terminology
0ebd242555 Run pgperltidy

Discussion: https://postgr.es/m/20250708212819.09.nmisch@google.com

Backpatch-to: 18
Reviewed-by: Noah Misch <noah@leadboat.com>
2025-07-30 11:32:16 -04:00
Álvaro Herrera
f9545e95c5
pg_dump: include comments on not-null constraints on domains, too
Commit e5da0fe3c2 introduced catalog entries for not-null constraints
on domains; but because commit b0e96f3119 (the original work for
catalogued not-null constraints on tables) forgot to teach pg_dump to
process the comments for them, this one also forgot.  Add that now.

We also need to teach repairDependencyLoop() about the new type of
constraints being possible for domains.

Backpatch-through: 17
Co-authored-by: jian he <jian.universality@gmail.com>
Co-authored-by: Álvaro Herrera <alvherre@kurilemu.de>
Reported-by: jian he <jian.universality@gmail.com>
Discussion: https://postgr.es/m/CACJufxF-0bqVR=j4jonS6N2Ka6hHUpFyu3_3TWKNhOW_4yFSSg@mail.gmail.com
2025-07-21 11:34:10 +02:00
Álvaro Herrera
dca0e9693b
Fix dumping of comments on invalid constraints on domains
We skip dumping constraints together with domains if they are invalid
('separate') so that they appear after data -- but their comments were
dumped together with the domain definition, which in effect leads to the
comment being dumped when the constraint does not yet exist.  Delay
them in the same way.

Oversight in 7eca575d1c28; backpatch all the way back.

Author: jian he <jian.universality@gmail.com>
Discussion: https://postgr.es/m/CACJufxF_C2pe6J_+nPr6C5jf5rQnbYP8XOKr4HM8yHZtp2aQqQ@mail.gmail.com
2025-07-16 19:22:53 +02:00
Álvaro Herrera
47fb87563b
pg_dump: include comments on valid not-null constraints, too
We were missing collecting comments for not-null constraints that are
dumped inline with the table definition (i.e., valid ones), because they
aren't represented by a separately dumpable object.  Fix by creating
separate TocEntries for the comments.

Co-Authored-By: Jian He <jian.universality@gmail.com>
Co-Authored-By: Álvaro Herrera <alvherre@kurilemu.de>
Reported-By: Fujii Masao <masao.fujii@oss.nttdata.com>
Reviewed-By: Fujii Masao <masao.fujii@oss.nttdata.com>
Discussion: https://postgr.es/m/d50ff977-c728-4e9e-8488-fc2688e08754@oss.nttdata.com
2025-06-26 18:24:12 +02:00
Peter Eisentraut
dec6643487 Improve pg_dump/pg_dumpall help synopses and terminology
Increase consistency of --help and man page synopses between pg_dump
and pg_dumpall.  These should now be very similar, as pg_dumpall can
now also produce non-text dump output.  But actually, they had drifted
further apart.

- Use verb "export" consistently, instead of "dump" or "extract".
- Use "SQL script" instead of just "script" or "text file".
- Maintain consistent distinction between SQL script and other
  formats/archives (which is relevant for pg_restore).

Reviewed-by: Robert Treat <rob@xzilla.net>
Discussion: https://www.postgresql.org/message-id/flat/3f71d8a7-095b-4829-9b0b-fce09e9866b3%40eisentraut.org
2025-06-19 13:57:16 +02:00
Fujii Masao
c2e2589ab9 pg_dump: Allow pg_dump to dump the statistics for foreign tables.
Commit 1fd1bd8710 introduced support for dumping statistics with
pg_dump and pg_dumpall, covering tables, materialized views, and indexes.
However, it overlooked foreign tables, even though functions like
pg_restore_relation_stats() support them.

This commit fixes that oversight by allowing pg_dump and pg_dumpall
to include statistics for foreign tables.

Author: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Corey Huinker <corey.huinker@gmail.com>
Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Discussion: https://postgr.es/m/3772e4e4-ef39-4deb-bb76-aa8165f33fb6@oss.nttdata.com
2025-06-18 14:53:55 +09:00
Peter Eisentraut
a876464abc Message style improvements
Some message style improvements in new code, and some small
refactorings to make translations easier.
2025-06-16 11:14:39 +02:00
Nathan Bossart
5d6eac80cd pg_dump: Adjust reltuples from 0 to -1 for dumps of older versions.
Before v14, a reltuples value of 0 was ambiguous: it could either
mean the relation is empty, or it could mean that it hadn't yet
been vacuumed or analyzed.  (Commit 3d351d916b taught v14 and newer
to use -1 for the latter case.)  This ambiguity allegedly can cause
the planner to choose inefficient plans after restoring to v18 or
newer.  To fix, let's just dump reltuples as -1 in that case.  This
will cause some truly empty tables to be seen as not-yet-processed,
but that seems unlikely to cause too much trouble in practice.

Note that we could alternatively teach pg_restore_relation_stats()
to translate reltuples based on the version argument, but since
that function doesn't exist until v18, there's no particular
advantage to that approach.  That is, there's no chance of
restoring stats dumped from a pre-v14 server to another pre-v14
server.  Per discussion, the current policy is to fix pre-v18
behavior differences during export and everything else during
import.

Commit 9879105024 fixed a similar problem for vacuumdb by removing
the check for reltuples != 0.  Presumably we could reinstate that
check now, but I've chosen to leave it in place in case reltuples
isn't accurate.  As before, processing some empty tables seems
relatively harmless.

Author: Hari Krishna Sunder <hari.db.pg@gmail.com>
Reviewed-by: Jeff Davis <pgsql@j-davis.com>
Reviewed-by: Corey Huinker <corey.huinker@gmail.com>
Discussion: https://postgr.es/m/CAAeiqZ0o2p4SX5_xPcuAbbsmXjg6MJLNuPYSLUjC%3DWh-VeW64A%40mail.gmail.com
2025-05-22 10:23:26 -05:00
Nathan Bossart
a6060f1cbe pg_dump: Fix array literals in fetchAttributeStats().
Presently, fetchAttributeStats() builds array literals by treating
the elements as SQL identifiers.  This is incorrect for a couple of
reasons:

* Array literal content must match the external text representation
  of the array, i.e., what array_out() would return.  One notable
  problem is that double quotes are escaped with "" in identifiers
  but with \" in array literals.  To fix, build the array content
  using the pre-existing appendPGArray() function.

* Array literals must be written as string constants.  A notable
  problem here is that single quotes are escaped via '' in strings
  but are not escaped in the text representation of an array.  To
  fix, append the aforementioned array literal content to the query
  with appendStringLiteralAH().

While at it, modify a test case to use an identifier that would
cause the test to fail without this change.

Oversight in commit 9c02e3a986.

Reported-by: Philippe Beaudoin <pbh.emaj@free.fr>
Author: Jian He <jian.universality@gmail.com>
Co-authored-by: Nathan Bossart <nathandbossart@gmail.com>
Co-authored-by: Stepan Neretin <slpmcf@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Bug: #18923
Discussion: https://postgr.es/m/18923-e79273f87c6bed69%40postgresql.org
2025-05-20 16:31:00 -05:00
Álvaro Herrera
0e13b13d26
Fix pg_dump for inherited validated not-null constraints
When a child constraint is validated and the parent constraint it
derives from isn't, pg_dump must be coerced into printing the child
constraint; failing to do would result in a dump that restores the
constraint as not valid, which would be incorrect.

Co-authored-by: jian he <jian.universality@gmail.com>
Co-authored-by: Álvaro Herrera <alvherre@kurilemu.de>
Reported-by: jian he <jian.universality@gmail.com>
Message-id: https://postgr.es/m/CACJufxGHNNMc0E2JphUqJMzD3=bwRSuAEVBF5ekgkG8uY0Q3hg@mail.gmail.com
2025-04-28 16:25:06 +02:00
David Rowley
1bd08f6ba5 Fixup various older misuses of appendPQExpBuffer
Use appendPQExpBufferStr when there are no parameters and
appendPQExpBufferChar when the string length is 1.

Unlike 3fae25cbb, which fixed this issue for code that was new to v18,
this one fixes up instances which exist in the backbranches.  We've
historically tried to maintain this standard and if we're going to
continue doing that, then we won't be doing that selectively based on
when the code was introduced.  Now seems like a good time to flush out the
existing misuses.  Waiting until v19 just prolongs their existence in
terms of released versions that the misuses exist in.

Author: David Rowley <drowleyml@gmail.com>
Discussion: https://postgr.es/m/CAApHDvoARMvPeXTTC0HnpARBHn-WgVstc8XFCyMGOzvgu_1HvQ@mail.gmail.com
2025-04-18 12:15:08 +12:00
David Rowley
3fae25cbb3 Fixup various new-to-v18 usages of appendPQExpBuffer
Use appendPQExpBufferStr when there are no parameters and
appendPQExpBufferChar when the string length is 1.

Author: David Rowley <drowleyml@gmail.com>
Discussion: https://postgr.es/m/CAApHDvoARMvPeXTTC0HnpARBHn-WgVstc8XFCyMGOzvgu_1HvQ@mail.gmail.com
2025-04-17 11:37:55 +12:00
Tom Lane
1fc3403626 Fix pg_dump --clean with partitioned indexes.
We'd try to drop the partitions of a partitioned index separately,
which is disallowed by the backend, leading to an error during
restore.  While the error is harmless, it causes problems if you
try to use --single-transaction mode.

Fortunately, there seems no need to do a DROP at all, since the
partition will go away silently when we drop either the parent index
or the partition's table.  So just make the DROP conditional on not
being a partition.

Reported-by: jian he <jian.universality@gmail.com>
Author: jian he <jian.universality@gmail.com>
Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CACJufxF0QSdkjFKF4di-JGWN6CSdQYEAhGPmQJJCdkSZtd=oLg@mail.gmail.com
Backpatch-through: 13
2025-04-16 13:31:59 -04:00
Peter Eisentraut
9ad19295e9 Fix incorrect format placeholders
for commits 8f427187db, 6ee3b91bad
2025-04-10 08:04:35 +02:00
Álvaro Herrera
a379061a22
Allow NOT NULL constraints to be added as NOT VALID
This allows them to be added without scanning the table, and validating
them afterwards without holding access exclusive lock on the table after
any violating rows have been deleted or fixed.

Doing ALTER TABLE ... SET NOT NULL for a column that has an invalid
not-null constraint validates that constraint.  ALTER TABLE .. VALIDATE
CONSTRAINT is also supported.  There are various checks on whether an
invalid constraint is allowed in a child table when the parent table has
a valid constraint; this should match what we do for enforced/not
enforced constraints.

pg_attribute.attnotnull is now only an indicator for whether a not-null
constraint exists for the column; whether it's valid or invalid must be
queried in pg_constraint.  Applications can continue to query
pg_attribute.attnotnull as before, but now it's possible that NULL rows
are present in the column even when that's set to true.

For backend internal purposes, we cache the nullability status in
CompactAttribute->attnullability that each tuple descriptor carries
(replacing CompactAttribute.attnotnull, which was a mirror of
Form_pg_attribute.attnotnull).  During the initial tuple descriptor
creation, based on the pg_attribute scan, we set this to UNRESTRICTED if
pg_attribute.attnotnull is false, or to UNKNOWN if it's true; then we
update the latter to VALID or INVALID depending on the pg_constraint
scan.  This flag is also copied when tupledescs are copied.

Comparing tuple descs for equality must also compare the
CompactAttribute.attnullability flag and return false in case of a
mismatch.

pg_dump deals with these constraints by storing the OIDs of invalid
not-null constraints in a separate array, and running a query to obtain
their properties.  The regular table creation SQL omits them entirely.
They are then dealt with in the same way as "separate" CHECK
constraints, and dumped after the data has been loaded.  Because no
additional pg_dump infrastructure was required, we don't bump its
version number.

I decided not to bump catversion either, because the old catalog state
works perfectly in the new world.  (Trying to run with new catalog state
and the old server version would likely run into issues, however.)

System catalogs do not support invalid not-null constraints (because
commit 14e87ffa5c didn't allow them to have pg_constraint rows
anyway.)

Author: Rushabh Lathia <rushabh.lathia@gmail.com>
Author: Jian He <jian.universality@gmail.com>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Tested-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Discussion: https://postgr.es/m/CAGPqQf0KitkNack4F5CFkFi-9Dqvp29Ro=EpcWt=4_hs-Rt+bQ@mail.gmail.com
2025-04-07 19:19:50 +02:00
Nathan Bossart
f0d0083f52 pg_dump: Fix query for gathering attribute stats on older versions.
Commit 9c02e3a986 taught pg_dump to retrieve attribute statistics
for 64 relations at a time.  pg_dump supports dumping from v9.2 and
newer versions, but our query for retrieving statistics for
multiple relations uses WITH ORDINALITY and multi-argument
UNNEST(), both of which were introduced in v9.4.  To fix, we resort
to gathering statistics for a single relation at a time on versions
older than v9.4.

Per buildfarm member crake.

Author: Corey Huinker <corey.huinker@gmail.com>
Discussion: https://postgr.es/m/Z_BcWVMvlUIJ_iuZ%40nathan
2025-04-04 21:05:30 -05:00
Andrew Dunstan
1495eff7bd Non text modes for pg_dumpall, correspondingly change pg_restore
pg_dumpall acquires a new -F/--format option, with the same meanings as
pg_dump. The default is p, meaning plain text. For any other value, a
directory is created containing two files, globals.data and map.dat. The
first contains SQL for restoring the global data, and the second
contains a map from oids to database names. It will also contain a
subdirectory called databases, inside which it will create archives in
the specified format, named using the database oids.

In these casess the -f argument is required.

If pg_restore encounters a directory containing globals.dat, and no
toc.dat, it restores the global settings and then restores each
database.

pg_restore acquires two new options: -g/--globals-only which suppresses
restoration of any databases, and --exclude-database which inhibits
restoration of particualr database(s) in the same way the same option
works in pg_dumpall.

Author: Mahendra Singh Thalor <mahi6run@gmail.com>
Co-authored-by:  Andrew Dunstan <andrew@dunslane.net>
Reviewed-by: jian he <jian.universality@gmail.com>
Reviewed-by: Srinath Reddy <srinath2133@gmail.com>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>

Discussion: https://postgr.es/m/cb103623-8ee6-4ba5-a2c9-f32e3a4933fa@dunslane.net
2025-04-04 16:01:22 -04:00
Andrew Dunstan
c1da728106 Move common pg_dump code related to connections to a new file
ConnectDatabase is used by pg_dumpall, pg_restore and pg_dump so move
common code to new file.

new file name: connectdb.c

Author:    Mahendra Singh Thalor <mahi6run@gmail.com>
2025-04-04 16:01:22 -04:00
Nathan Bossart
9c02e3a986 pg_dump: Retrieve attribute statistics in batches.
Currently, pg_dump gathers attribute statistics with a query per
relation, which can cause pg_dump to take significantly longer,
especially when there are many relations.  This commit addresses
this by teaching pg_dump to gather attribute statistics for 64
relations at a time.  Some simple tests showed this was the optimal
batch size, but performance may vary depending on the workload.

Our lookahead code determines the next batch of relations by
searching the TOC sequentially for relevant entries.  This approach
assumes that we will dump all such entries in TOC order, which
unfortunately isn't true for dump formats that use
RestoreArchive().  RestoreArchive() does multiple passes through
the TOC and selectively dumps certain groups of entries each time.
This is particularly problematic for index stats and a subset of
matview stats; both are in SECTION_POST_DATA, but matview stats
that depend on matview data are dumped in RESTORE_PASS_POST_ACL,
while all other stats are dumped in RESTORE_PASS_MAIN.  To handle
this, this commit moves all statistics data entries in
SECTION_POST_DATA to RESTORE_PASS_POST_ACL, which ensures that we
always dump them in TOC order.  A convenient side effect of this
change is that we can revert a decent chunk of commit a0a4601765,
but that is left for a follow-up commit.

Author: Corey Huinker <corey.huinker@gmail.com>
Co-authored-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Jeff Davis <pgsql@j-davis.com>
Discussion: https://postgr.es/m/CADkLM%3Dc%2Br05srPy9w%2B-%2BnbmLEo15dKXYQ03Q_xyK%2BriJerigLQ%40mail.gmail.com
2025-04-04 14:51:08 -05:00
Nathan Bossart
7d5c83b4e9 pg_dump: Reduce memory usage of dumps with statistics.
Right now, pg_dump stores all generated commands for statistics in
memory.  These commands can be quite large and therefore can
significantly increase pg_dump's memory footprint.  To fix, wait
until we are about to write out the commands before generating
them, and be sure to free the commands after writing.  This is
implemented via a new defnDumper callback that works much like the
dataDumper one but is specifically designed for TOC entries.

Custom dumps that include data might write the TOC twice (to update
data offset information), which would ordinarily cause pg_dump to
run the attribute statistics queries twice.  However, as a hack, we
save the length of the written-out entry in the first pass and skip
over it in the second.  While there is no known technical issue
with executing the queries multiple times and rewriting the
results, it's expensive and feels risky, so let's avoid it.

As an exception, we _do_ execute the queries twice for the tar
format.  This format does a second pass through the TOC to generate
the restore.sql file.  pg_restore doesn't use this file, so even if
the second round of queries returns different results than the
first, it won't corrupt the output; the archive and restore.sql
file will just have different content.  A follow-up commit will
teach pg_dump to gather attribute statistics in batches, which our
testing indicates more than makes up for the added expense of
running the queries twice.

Author: Corey Huinker <corey.huinker@gmail.com>
Co-authored-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Jeff Davis <pgsql@j-davis.com>
Discussion: https://postgr.es/m/CADkLM%3Dc%2Br05srPy9w%2B-%2BnbmLEo15dKXYQ03Q_xyK%2BriJerigLQ%40mail.gmail.com
2025-04-04 14:51:08 -05:00
Fujii Masao
0d6c477664 Extend ALTER DEFAULT PRIVILEGES to define default privileges for large objects.
Previously, ALTER DEFAULT PRIVILEGES did not support large objects.
This meant that to grant privileges to users other than the owner,
permissions had to be manually assigned each time a large object
was created, which was inconvenient.

This commit extends ALTER DEFAULT PRIVILEGES to allow defining default
access privileges for large objects. With this change, specified privileges
will automatically apply to newly created large objects, making privilege
management more efficient.

As a side effect, this commit introduces the new keyword OBJECTS
since it's used in the syntax of ALTER DEFAULT PRIVILEGES.

Original patch by Haruka Takatsuka, with some fixes and tests by Yugo Nagata,
and rebased by Laurenz Albe.

Author: Takatsuka Haruka <harukat@sraoss.co.jp>
Co-authored-by: Yugo Nagata <nagata@sraoss.co.jp>
Co-authored-by: Laurenz Albe <laurenz.albe@cybertec.at>
Reviewed-by: Masao Fujii <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/20240424115242.236b499b2bed5b7a27f7a418@sraoss.co.jp
2025-04-04 19:02:17 +09:00
Álvaro Herrera
f104192e52
Remove duplicate set of print_notnull
I inserted the second one by mistake in commit 14e87ffa5c.

Reported-by: jian he <jian.universality@gmail.com>
Confirmed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Discussion: https://postgr.es/m/CACJufxFqckBFxPfCixHHbOr0zMLksviTj2m3o12-tErfx_PvTg@mail.gmail.com
2025-04-03 17:34:25 +02:00
Peter Eisentraut
eec0040c4b Add support for NOT ENFORCED in foreign key constraints
This expands the NOT ENFORCED constraint flag, previously only
supported for CHECK constraints (commit ca87c415e2), to foreign key
constraints.

Normally, when a foreign key constraint is created on a table, action
and check triggers are added to maintain data integrity.  With this
patch, if a constraint is marked as NOT ENFORCED, integrity checks are
no longer required, making these triggers unnecessary.  Consequently,
when creating a NOT ENFORCED foreign key constraint, triggers will not
be created, and the constraint will be marked as NOT VALID.
Similarly, if an existing foreign key constraint is changed to NOT
ENFORCED, the associated triggers will be dropped, and the constraint
will also be marked as NOT VALID.  Conversely, if a NOT ENFORCED
foreign key constraint is changed to ENFORCED, the necessary triggers
will be created, and the will be changed to VALID by performing
necessary validation.

Since not-enforced foreign key constraints have no triggers, the
shortcut used for example in psql and pg_dump to skip looking for
foreign keys if the relation is known not to have triggers no longer
applies.  (It already didn't work for partitioned tables.)

Author: Amul Sul <sulamul@gmail.com>
Reviewed-by: Joel Jacobson <joel@compiler.org>
Reviewed-by: Andrew Dunstan <andrew@dunslane.net>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: jian he <jian.universality@gmail.com>
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Reviewed-by: Isaac Morland <isaac.morland@gmail.com>
Reviewed-by: Alexandra Wang <alexandra.wang.oss@gmail.com>
Tested-by: Triveni N <triveni.n@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/CAAJ_b962c5AcYW9KUt_R_ER5qs3fUGbe4az-SP-vuwPS-w-AGA@mail.gmail.com
2025-04-02 13:36:44 +02:00
Jeff Davis
4694aedf63 Add relallfrozen to pg_dump statistics.
Author: Corey Huinker <corey.huinker@gmail.com>
Discussion: https://postgr.es/m/CADkLM=desCuf3dVHasADvdUVRmb-5gO0mhMO5u9nzgv6i7U86Q@mail.gmail.com
2025-03-30 22:14:06 -07:00
Jeff Davis
a0a4601765 Matview statistics depend on matview data.
REFRESH MATERIALIZED VIEW replaces the storage, which resets
statistics, so statistics must be restored afterward.

If both statistics and data are being dumped for a materialized view,
add a dependency from the former to the latter. Defer the statistics
to SECTION_POST_DATA, and use RESTORE_PASS_POST_ACL.

Reported-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Discussion: https://postgr.es/m/CAExHW5s47kmubpbbRJzSM-Zfe0Tj2O3GBagB7YAyE8rQ-V24Uw@mail.gmail.com
2025-03-28 16:12:55 -07:00
Jeff Davis
bde2fb797a Add pg_dump --with-{schema|data|statistics} options.
By adding the positive variants of options, in addition to the
negative variants that already exist, users can be explicit about what
pg_dump should produce.

Discussion: https://postgr.es/m/bd0513e4b1ea2b2f2d06f02720c6579711cb62a6.camel@j-davis.com
Reviewed-by: Corey Huinker <corey.huinker@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
2025-03-25 17:36:38 -07:00
Nathan Bossart
9c49f0e8cd pg_dump: Add --sequence-data.
This new option instructs pg_dump to dump sequence data when the
--no-data, --schema-only, or --statistics-only option is specified.
This was originally considered for commit a7e5457db8, but it was
left out at that time because there was no known use-case.  A
follow-up commit will use this to optimize pg_upgrade's file
transfer step.

Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://postgr.es/m/Zyvop-LxLXBLrZil%40nathan
2025-03-25 16:02:35 -05:00
Jeff Davis
650ab8aaf1 Stats: use schemaname/relname instead of regclass.
For import and export, use schemaname/relname rather than
regclass.

This is more natural during export, fits with the other arguments
better, and it gives better control over error handling in case we
need to downgrade more errors to warnings.

Also, use text for the argument types for schemaname, relname, and
attname so that casts to "name" are not required.

Author: Corey Huinker <corey.huinker@gmail.com>
Discussion: https://postgr.es/m/CADkLM=ceOSsx_=oe73QQ-BxUFR2Cwqum7-UP_fPe22DBY0NerA@mail.gmail.com
2025-03-25 11:16:06 -07:00
Tom Lane
cd3c45125d pg_dump, pg_dumpall, pg_restore: Add --no-policies option.
Add --no-policies option to control row level security policy handling
in dump and restore operations. When this option is used, both CREATE
POLICY commands and ALTER TABLE ... ENABLE ROW LEVEL SECURITY commands
are excluded from dumps and skipped during restores.

This is useful in scenarios where policies need to be redefined in the
target system or when moving data between environments with different
security requirements.

Author: Nikolay Samokhvalov <nik@postgres.ai>
Reviewed-by: Greg Sabino Mullane <htamfids@gmail.com>
Reviewed-by: Jim Jones <jim.jones@uni-muenster.de>
Reviewed-by: newtglobal postgresql_contributors <postgresql_contributors@newtglobalcorp.com>
Discussion: https://postgr.es/m/CAM527d8kG2qPKvbfJ=OYJkT7iRNd623Bk+m-a4ngm+nyHYsHog@mail.gmail.com
2025-03-16 18:08:15 -04:00
Jeff Davis
1852aea3f5 Don't convert to and from floats in pg_dump.
Commit 8f427187db improved performance by remembering relation stats
as native types rather than issuing a new query for each relation.

Using native types is fine for integers like relpages; but reltuples
is floating point. The commit controllled for that complexity by using
setlocale(LC_NUMERIC, "C"). After that, Alexander Lakhin found a
problem in pg_strtof(), fixed in 00d61a08c5.

While we aren't aware of any more problems with that approach, it
seems wise to just use a string the whole way for floating point
values, as Corey's original patch did, and get rid of the
setlocale(). Integers are still converted to native types to avoid
wasting memory.

Co-authored-by: Corey Huinker <corey.huinker@gmail.com>
Discussion: https://postgr.es/m/3049348.1740855411@sss.pgh.pa.us
Discussion: https://postgr.es/m/560cca3781740bd69881bb07e26eb8f65b09792c.camel%40j-davis.com
2025-03-08 11:25:36 -08:00
Jeff Davis
424ededc58 Adjust pg_dump tag for relation stats.
Do not use fmtId(), just use dobj->name directly, like for table data.
2025-02-27 20:42:12 -08:00
Tom Lane
40e27d04b4 Use attnum to identify index columns in pg_restore_attribute_stats().
Previously we used attname for both table and index columns, but
that is problematic for indexes because their attnames are assigned
by internal rules that don't guarantee to preserve the names across
dump and reload.  (This is what's causing the remaining buildfarm
failures in cross-version-upgrade tests.)  Fortunately we can use
attnum instead, since there's no such thing as adding or dropping
columns in an existing index.  We met this same problem previously
with ALTER INDEX ... SET STATISTICS, and solved it the same way,
cf commit 5b6d13eec.

In pg_restore_attribute_stats() itself, we accept either attnum or
attname, but the policy used by pg_dump is to always use attname
for tables and attnum for indexes.

Author: Tom Lane <tgl@sss.pgh.pa.us>
Author: Corey Huinker <corey.huinker@gmail.com>
Discussion: https://postgr.es/m/1457469.1740419458@sss.pgh.pa.us
2025-02-26 16:36:20 -05:00
Jeff Davis
6ee3b91bad pg_dump: prepare attribute stats query.
Follow precedent in pg_dump for preparing queries to improve
performance. Also, simplify the query by removing unnecessary joins.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reported-by: Andres Freund <andres@anarazel.de>
Co-authored-by: Corey Huinker <corey.huinker@gmail.com>
Co-authored-by: Jeff Davis <pgsql@j-davis.com>
Discussion: https://postgr.es/m/CADkLM=dRMC6t8gp9GVf6y6E_r5EChQjMAAh_vPyih_zMiq0zvA@mail.gmail.com
2025-02-25 19:52:11 -08:00
Jeff Davis
8f427187db Avoid unnecessary relation stats query in pg_dump.
The few fields we need can be easily collected in getTables() and
getIndexes() and stored in RelStatsInfo.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reported-by: Andres Freund <andres@anarazel.de>
Co-authored-by: Corey Huinker <corey.huinker@gmail.com>
Co-authored-by: Jeff Davis <pgsql@j-davis.com>
Discussion: https://postgr.es/m/CADkLM=f0a43aTd88xW4xCFayEF25g-7hTrHX_WhV40HyocsUGg@mail.gmail.com
2025-02-25 19:51:45 -08:00
Jeff Davis
1fd1bd8710 Transfer statistics during pg_upgrade.
Add support to pg_dump for dumping stats, and use that during
pg_upgrade so that statistics are transferred during upgrade. In most
cases this removes the need for a costly re-analyze after upgrade.

Some statistics are not transferred, such as extended statistics or
statistics with a custom stakind.

Now pg_dump accepts the options --schema-only, --no-schema,
--data-only, --no-data, --statistics-only, and --no-statistics; which
allow all combinations of schema, data, and/or stats. The options are
named this way to preserve compatibility with the previous
--schema-only and --data-only options.

Statistics are in SECTION_DATA, unless the object itself is in
SECTION_POST_DATA.

The stats are represented as calls to pg_restore_relation_stats() and
pg_restore_attribute_stats().

Author: Corey Huinker, Jeff Davis
Reviewed-by: Jian He
Discussion: https://postgr.es/m/CADkLM=fzX7QX6r78fShWDjNN3Vcr4PVAnvXxQ4DiGy6V=0bCUA@mail.gmail.com
Discussion: https://postgr.es/m/CADkLM%3DcB0rF3p_FuWRTMSV0983ihTRpsH%2BOCpNyiqE7Wk0vUWA%40mail.gmail.com
2025-02-20 01:29:06 -08:00
Amit Kapila
4aa6fa3cd0 Include schema/table publications even with exclude options in dump.
The current implementation inconsistently includes public schema but not
information_schema when those are specified in FOR TABLES IN SCHMEA ...
Apart from that, the current behavior for publications w.r.t exclude table
and schema (--exclude-table, --exclude-schema) option differs from what we
do at other places. We try to avoid including publications for
corresponding tables or schemas when an exclude-table or exclude-schema
option is given, unlike what we do for views using functions defined in a
particular schema or a subscription pointing to publications with their
corresponding exclude options.

I decided not to backpatch this as it leads to a behavior change and we don't
see any field report for current behavior.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Author: Vignesh C <vignesh21@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/1270733.1734134272@sss.pgh.pa.us
2025-02-20 11:25:29 +05:30
Tom Lane
fcd77a6873 Fix minor memory leaks in pg_dump.
Coverity reported the two oversights in getPublicationTables.
Valgrind found the one in determineNotNullFlags.

The mistakes in getPublicationTables seem too minor to be worth
back-patching.  determineNotNullFlags could be run enough times
to matter, but that code is new in v18.  So, no back-patch.
2025-02-12 15:46:31 -05:00
Andres Freund
3e98c8ce50 Specify the encoding of input to fmtId()
This commit adds fmtIdEnc() and fmtQualifiedIdEnc(), which allow to specify
the encoding as an explicit argument.  Additionally setFmtEncoding() is
provided, which defines the encoding when no explicit encoding is provided, to
avoid breaking all code using fmtId().

All users of fmtId()/fmtQualifiedId() are either converted to the explicit
version or a call to setFmtEncoding() has been added.

This commit does not yet utilize the now well-defined encoding, that will
happen in a subsequent commit.

Reviewed-by: Noah Misch <noah@leadboat.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Backpatch-through: 13
Security: CVE-2025-1094
2025-02-10 10:03:37 -05:00
Peter Eisentraut
83ea6c5402 Virtual generated columns
This adds a new variant of generated columns that are computed on read
(like a view, unlike the existing stored generated columns, which are
computed on write, like a materialized view).

The syntax for the column definition is

    ... GENERATED ALWAYS AS (...) VIRTUAL

and VIRTUAL is also optional.  VIRTUAL is the default rather than
STORED to match various other SQL products.  (The SQL standard makes
no specification about this, but it also doesn't know about VIRTUAL or
STORED.)  (Also, virtual views are the default, rather than
materialized views.)

Virtual generated columns are stored in tuples as null values.  (A
very early version of this patch had the ambition to not store them at
all.  But so much stuff breaks or gets confused if you have tuples
where a column in the middle is completely missing.  This is a
compromise, and it still saves space over being forced to use stored
generated columns.  If we ever find a way to improve this, a bit of
pg_upgrade cleverness could allow for upgrades to a newer scheme.)

The capabilities and restrictions of virtual generated columns are
mostly the same as for stored generated columns.  In some cases, this
patch keeps virtual generated columns more restricted than they might
technically need to be, to keep the two kinds consistent.  Some of
that could maybe be relaxed later after separate careful
considerations.

Some functionality that is currently not supported, but could possibly
be added as incremental features, some easier than others:

- index on or using a virtual column
- hence also no unique constraints on virtual columns
- extended statistics on virtual columns
- foreign-key constraints on virtual columns
- not-null constraints on virtual columns (check constraints are supported)
- ALTER TABLE / DROP EXPRESSION
- virtual column cannot have domain type
- virtual columns are not supported in logical replication

The tests in generated_virtual.sql have been copied over from
generated_stored.sql with the keyword replaced.  This way we can make
sure the behavior is mostly aligned, and the differences can be
visible.  Some tests for currently not supported features are
currently commented out.

Reviewed-by: Jian He <jian.universality@gmail.com>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Tested-by: Shlok Kyal <shlok.kyal.oss@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/a368248e-69e4-40be-9c07-6c3b5880b0a6@eisentraut.org
2025-02-07 09:46:59 +01:00
Amit Kapila
75eb9766ec Rename pubgencols_type to pubgencols in pg_publication.
The column added in commit e65dbc9927, pubgencols_type, was inconsistent
with the naming conventions of other columns in the pg_publication
catalog.

Author: Vignesh C <vignesh21@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Discussion: https://postgr.es/m/CALDaNm1u-ufVOW-RUsXSooqzkpohxfZYy=z78fbcr_9Pq5hbCg@mail.gmail.com
2025-01-28 10:42:46 +05:30
Amit Kapila
b35434b134 Fix buildfarm failure introduced by commit e65dbc9927.
The patch had incorrectly specified the default value for
publish_generated_columns during the query formation in pg_dump.

Author: Vignesh C <vignesh21@gmail.com>
Discussion: https://postgr.es/m/CAA4eK1KfZYTD8Hpi9TD1KaB8rNUBR9baUvTxa5wYyZDGbEaa6g@mail.gmail.com
2025-01-23 17:47:15 +05:30
Amit Kapila
e65dbc9927 Change publication's publish_generated_columns option type to enum.
The current boolean publish_generated_columns option only supports a
binary choice, which is insufficient for future enhancements where
generated columns can be of different types (e.g., stored or virtual). The
supported values for the publish_generated_columns option are 'none' and
'stored'.

Author: Vignesh C <vignesh21@gmail.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/d718d219-dd47-4a33-bb97-56e8fc4da994@eisentraut.org
Discussion: https://postgr.es/m/B80D17B2-2C8E-4C7D-87F2-E5B4BE3C069E@gmail.com
2025-01-23 15:28:37 +05:30
David Rowley
11012c5037 Fix an assortment of spelling mistakes and typos
Author: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/5812a0b9-b0cf-4151-9a14-d9f00e4f2858@gmail.com
2025-01-02 12:42:01 +13:00