Newest versions of gcc+glibc are able to detect cases where code
implicitly casts away const by assigning the result of strchr() or
a similar function applied to a "const char *" value to a target
variable that's just "char *". This of course creates a hazard of
not getting a compiler warning about scribbling on a string one was
not supposed to, so fixing up such cases is good.
This patch fixes a dozen or so places where we were doing that.
Most are trivial additions of "const" to the target variable,
since no actually-hazardous change was occurring.
Thanks to Bertrand Drouvot for finding a couple more spots than
I had.
This commit back-patches relevant portions of 8f1791c61 and
9f7565c6c into supported branches. However, there are two
places in ecpg (in v18 only) where a proper fix is more
complicated than seems appropriate for a back-patch. I opted
to silence those two warnings by adding casts.
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/1324889.1764886170@sss.pgh.pa.us
Discussion: https://postgr.es/m/3988414.1771950285@sss.pgh.pa.us
Backpatch-through: 14-18
The test added by commit 4b760a181 assumed that a table's physical
row order would be predictable after an UPDATE. But a non-heap table
AM might produce some other order. Even with heap AM, the assumption
seems risky; compare a3fd53bab for instance. Adding an ORDER BY is
cheap insurance and doesn't break any goal of the test.
Author: Pavel Borisov <pashkin.elfe@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CALT9ZEHcE6tpvumScYPO6pGk_ASjTjWojLkodHnk33dvRPHXVw@mail.gmail.com
Backpatch-through: 14
The grease patch in 4966bd3ed found its first problem: prior to the
February 2018 patch releases, no server knew how to negotiate protocol
versions, so pg_upgrade needs to take that into account when speaking to
those older servers.
This will be true even after the grease feature is reverted; we don't
need anyone to trip over this again in the future. Backpatch so that all
supported versions of pg_upgrade can gracefully handle an update to the
default protocol version. (This is needed for any distributions that
link older binaries against newer libpqs, such as Debian.) Branches
prior to 18 need an additional version check, for the existence of
max_protocol_version.
Per buildfarm member crake.
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAOYmi%2B%3D4QhCjssfNEoZVK8LPtWxnfkwT5p-PAeoxtG9gpNjqOQ%40mail.gmail.com
Backpatch-through: 14
Commit c67bef3f32 introduced this test helper function for use by
src/test/regress/sql/encoding.sql, but its logic was incorrect. It
confused an encoding ID for a boolean so it gave the wrong results for
some inputs, and also forgot the usual return macro. The mistake didn't
affect values actually used in the test, so there is no change in
behavior.
Also drop it and another missed function at the end of the test, for
consistency.
Backpatch-through: 14
Author: Zsolt Parragi <zsolt.parragi@percona.com>
Various buildfarm members, having compilers like gcc 8.5 and 6.3, fail
to deduce that text_substring() variable "E" is initialized if
slice_size!=-1. This suppression approach quiets gcc 8.5; I did not
reproduce the warning elsewhere. Back-patch to v14, like commit
9f4fd119b2.
Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/1157953.1771266105@sss.pgh.pa.us
Backpatch-through: 14
The receive function of hstore was not able to handle correctly
duplicate key values when a new duplicate links to a NULL value, where a
pfree() could be attempted on a NULL pointer, crashing due to a pointer
dereference.
This problem would happen for a COPY BINARY, when stacking values like
that:
aa => 5
aa => null
The second key/value pair is discarded and pfree() calls are attempted
on its key and its value, leading to a pointer dereference for the value
part as the value is NULL. The first key/value pair takes priority when
a duplicate is found.
Per offline report.
Reported-by: "Anemone" <vergissmeinnichtzh@gmail.com>
Reported-by: "A1ex" <alex000young@gmail.com>
Backpatch-through: 14
'latest_page_number' is set to the correct value, according to
nextOffset, early at system startup. Contrary to the comment, it hence
should be set up correctly by the time we get to WAL replay.
This fixes a failure to replay WAL generated on older minor versions,
before commit 789d65364c (18.2, 17.8, 16.12, 15.16, 14.21). The
failure occurs after a truncation record has been replayed and looks
like this:
FATAL: could not access status of transaction 858112
DETAIL: Could not read from file "pg_multixact/offsets/000D" at offset 24576: read too few bytes.
CONTEXT: WAL redo at 3/2A3AB408 for MultiXact/CREATE_ID: 858111 offset 6695072 nmembers 5: 1048228 (sh) 1048271 (keysh) 1048316 (sh) 1048344 (keysh) 1048370 (sh)
Reported-by: Sebastian Webber <sebastian@swebber.me>
Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru>
Reviewed-by: Kirill Reshke <reshkekirill@gmail.com>
Discussion: https://www.postgresql.org/message-id/20260214090150.GC2297@p46.dedyn.io;lightning.p46.dedyn.io
Backpatch-through: 14-18
The error message added in 379695d3cc referred to the public key being
too long. This is confusing as it is in fact the session key included
in a PGP message which is too long. This is harmless, but let's be
precise about what is wrong.
Per offline report.
Reported-by: Zsolt Parragi <zsolt.parragi@percona.com>
Backpatch-through: 14
Commit 1e7fe06c10 changed
pg_mbstrlen_with_len() to ereport(ERROR) if the input ends in an
incomplete character. Most callers want that. text_substring() does
not. It detoasts the most bytes it could possibly need to get the
requested number of characters. For example, to extract up to 2 chars
from UTF8, it needs to detoast 8 bytes. In a string of 3-byte UTF8
chars, 8 bytes spans 2 complete chars and 1 partial char.
Fix this by replacing this pg_mbstrlen_with_len() call with a string
traversal that differs by stopping upon finding as many chars as the
substring could need. This also makes SUBSTRING() stop raising an
encoding error if the incomplete char is past the end of the substring.
This is consistent with the general philosophy of the above commit,
which was to raise errors on a just-in-time basis. Before the above
commit, SUBSTRING() never raised an encoding error.
SUBSTRING() has long been detoasting enough for one more char than
needed, because it did not distinguish exclusive and inclusive end
position. For avoidance of doubt, stop detoasting extra.
Back-patch to v14, like the above commit. For applications using
SUBSTRING() on non-ASCII column values, consider applying this to your
copy of any of the February 12, 2026 releases.
Reported-by: SATŌ Kentarō <ranvis@gmail.com>
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Bug: #19406
Discussion: https://postgr.es/m/19406-9867fddddd724fca@postgresql.org
Backpatch-through: 14
The prior order caused spurious Valgrind errors. They're spurious
because the ereport(ERROR) non-local exit discards the pointer in
question. pg_mblen_cstr() ordered the checks correctly, but these other
two did not. Back-patch to v14, like commit
1e7fe06c10.
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/20260214053821.fa.noahmisch@microsoft.com
Backpatch-through: 14
If the variable's value is null, exec_stmt_return() missed filling
in estate->rettype. This is a pretty old bug, but we'd managed not
to notice because that value isn't consulted for a null result ...
unless we have to cast it to a domain. That case led to a failure
with "cache lookup failed for type 0".
The correct way to assign the data type is known by exec_eval_datum.
While we could copy-and-paste that logic, it seems like a better
idea to just invoke exec_eval_datum, as the ROW case already does.
Reported-by: Pavel Stehule <pavel.stehule@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAFj8pRBT_ahexDf-zT-cyH8bMR_qcySKM8D5nv5MvTWPiatYGA@mail.gmail.com
Backpatch-through: 14
The pg_stat_activity view shows information for aux processes, but the
pg_stat_get_backend_wait_event() and
pg_stat_get_backend_wait_event_type() functions did not. To fix, call
AuxiliaryPidGetProc(pid) if BackendPidGetProc(pid) returns NULL, like
we do in pg_stat_get_activity().
In version 17 and above, it's a little silly to use those functions
when we already have the ProcNumber at hand, but it was necessary
before v17 because the backend ID was different from ProcNumber. I
have other plans for wait_event_info on master, so it doesn't seem
worth applying a different fix on different versions now.
Reviewed-by: Sami Imseih <samimseih@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://www.postgresql.org/message-id/c0320e04-6e85-4c49-80c5-27cfb3a58108@iki.fi
Backpatch-through: 14
The buildfarm occasionally shows a variant row order in the output
of this UPDATE ... RETURNING, implying that the preceding INSERT
dropped one of the rows into some free space within the table rather
than appending them all at the end. It's not entirely clear why that
happens some times and not other times, but we have established that
it's affected by concurrent activity in other databases of the
cluster. In any case, the behavior is not wrong; the test is at fault
for presuming that a seqscan will give deterministic row ordering.
Add an ORDER BY atop the update to stop the buildfarm noise.
The buildfarm seems to have shown this only in v18 and master
branches, but just in case the cause is older, back-patch to
all supported branches.
Discussion: https://postgr.es/m/3866274.1770743162@sss.pgh.pa.us
Backpatch-through: 14
On the INSERT page, mention that SELECT privileges are also required
for any columns mentioned in the arbiter clause, including those
referred to by the constraint, and clarify that this applies to all
forms of ON CONFLICT, not just ON CONFLICT DO UPDATE.
Author: Dean Rasheed <dean.a.rasheed@gmail.com>
Reviewed-by: Viktor Holmberg <v@viktorh.net>
Discussion: https://postgr.es/m/CAEZATCXGwMQ+x00YY9XYG46T0kCajH=21QaYL9Xatz0dLKii+g@mail.gmail.com
Backpatch-through: 14
On the CREATE POLICY page, the description of per-command policies
stated that SELECT policies are applied when an INSERT has an ON
CONFLICT DO NOTHING clause. However, that is only the case if it
includes an arbiter clause, so clarify that.
While at it, also clarify the comment in the regression tests that
cover this.
Author: Dean Rasheed <dean.a.rasheed@gmail.com>
Reviewed-by: Viktor Holmberg <v@viktorh.net>
Discussion: https://postgr.es/m/CAEZATCXGwMQ+x00YY9XYG46T0kCajH=21QaYL9Xatz0dLKii+g@mail.gmail.com
Backpatch-through: 14
While the preceding commit prevented such attachments from occurring
in future, this one aims to prevent further abuse of any already-
created operator that exposes _int_matchsel to the wrong data types.
(No other contrib module has a vulnerable selectivity estimator.)
We need only check that the Const we've found in the query is indeed
of the type we expect (query_int), but there's a difficulty: as an
extension type, query_int doesn't have a fixed OID that we could
hard-code into the estimator.
Therefore, the bulk of this patch consists of infrastructure to let
an extension function securely look up the OID of a datatype
belonging to the same extension. (Extension authors have requested
such functionality before, so we anticipate that this code will
have additional non-security uses, and may soon be extended to allow
looking up other kinds of SQL objects.)
This is done by first finding the extension that owns the calling
function (there can be only one), and then thumbing through the
objects owned by that extension to find a type that has the desired
name. This is relatively expensive, especially for large extensions,
so a simple cache is put in front of these lookups.
Reported-by: Daniel Firer as part of zeroday.cloud
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Noah Misch <noah@leadboat.com>
Security: CVE-2026-2004
Backpatch-through: 14
Selectivity estimators come in two flavors: those that make specific
assumptions about the data types they are working with, and those
that don't. Most of the built-in estimators are of the latter kind
and are meant to be safely attachable to any operator. If the
operator does not behave as the estimator expects, you might get a
poor estimate, but it won't crash.
However, estimators that do make datatype assumptions can malfunction
if they are attached to the wrong operator, since then the data they
get from pg_statistic may not be of the type they expect. This can
rise to the level of a security problem, even permitting arbitrary
code execution by a user who has the ability to create SQL objects.
To close this hole, establish a rule that built-in estimators are
required to protect themselves against being called on the wrong type
of data. It does not seem practical however to expect estimators in
extensions to reach a similar level of security, at least not in the
near term. Therefore, also establish a rule that superuser privilege
is required to attach a non-built-in estimator to an operator.
We expect that this restriction will have little negative impact on
extensions, since estimators generally have to be written in C and
thus superuser privilege is required to create them in the first
place.
This commit changes the privilege checks in CREATE/ALTER OPERATOR
to enforce the rule about superuser privilege, and fixes a couple
of built-in estimators that were making datatype assumptions without
sufficiently checking that they're valid.
Reported-by: Daniel Firer as part of zeroday.cloud
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Noah Misch <noah@leadboat.com>
Security: CVE-2026-2004
Backpatch-through: 14
An upcoming patch requires this cache so that it can track updates
in the pg_extension catalog. So far though, the EXTENSIONOID cache
only exists in v18 and up (see 490f869d9). We can add it in older
branches without an ABI break, if we are careful not to disturb the
numbering of existing syscache IDs.
In v16 and before, that just requires adding the new ID at the end
of the hand-assigned enum list, ignoring our convention about
alphabetizing the IDs. But in v17, genbki.pl enforces alphabetical
order of the IDs listed in MAKE_SYSCACHE macros. We can fake it
out by calling the new cache ZEXTENSIONOID.
Note that adding a syscache does change the required contents of the
relcache init file (pg_internal.init). But that isn't problematic
since we blow those away at postmaster start for other reasons.
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Noah Misch <noah@leadboat.com>
Security: CVE-2026-2004
Backpatch-through: 14-17
These data types are represented like full-fledged arrays, but
functions that deal specifically with these types assume that the
array is 1-dimensional and contains no nulls. However, there are
cast pathways that allow general oid[] or int2[] arrays to be cast
to these types, allowing these expectations to be violated. This
can be exploited to cause server memory disclosure or SIGSEGV.
Fix by installing explicit checks in functions that accept these
types.
Reported-by: Altan Birler <altan.birler@tum.de>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Noah Misch <noah@leadboat.com>
Security: CVE-2026-2003
Backpatch-through: 14
pgp_sym_decrypt() and pgp_pub_decrypt() will raise such errors, while
bytea variants will not. The existing "dat3" test decrypted to non-UTF8
text, so switch that query to bytea.
The long-term intent is for type "text" to always be valid in the
database encoding. pgcrypto has long been known as a source of
exceptions to that intent, but a report about exploiting invalid values
of type "text" brought this module to the forefront. This particular
exception is straightforward to fix, with reasonable effect on user
queries. Back-patch to v14 (all supported versions).
Reported-by: Paul Gerste (as part of zeroday.cloud)
Reported-by: Moritz Sanft (as part of zeroday.cloud)
Author: shihao zhong <zhong950419@gmail.com>
Reviewed-by: cary huang <hcary328@gmail.com>
Discussion: https://postgr.es/m/CAGRkXqRZyo0gLxPJqUsDqtWYBbgM14betsHiLRPj9mo2=z9VvA@mail.gmail.com
Backpatch-through: 14
Security: CVE-2026-2006
A security patch changed them today, so close the coverage gap now.
Test that buffer overrun is avoided when pg_mblen*() requires more
than the number of bytes remaining.
This does not cover the calls in dict_thesaurus.c or in dict_synonym.c.
That code is straightforward. To change that code's input, one must
have access to modify installed OS files, so low-privilege users are not
a threat. Testing this would likewise require changing installed
share/postgresql/tsearch_data, which was enough of an obstacle to not
bother.
Security: CVE-2026-2006
Backpatch-through: 14
Co-authored-by: Thomas Munro <thomas.munro@gmail.com>
Co-authored-by: Noah Misch <noah@leadboat.com>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
A corrupted string could cause code that iterates with pg_mblen() to
overrun its buffer. Fix, by converting all callers to one of the
following:
1. Callers with a null-terminated string now use pg_mblen_cstr(), which
raises an "illegal byte sequence" error if it finds a terminator in the
middle of the sequence.
2. Callers with a length or end pointer now use either
pg_mblen_with_len() or pg_mblen_range(), for the same effect, depending
on which of the two seems more convenient at each site.
3. A small number of cases pre-validate a string, and can use
pg_mblen_unbounded().
The traditional pg_mblen() function and COPYCHAR macro still exist for
backward compatibility, but are no longer used by core code and are
hereby deprecated. The same applies to the t_isXXX() functions.
Security: CVE-2026-2006
Backpatch-through: 14
Co-authored-by: Thomas Munro <thomas.munro@gmail.com>
Co-authored-by: Noah Misch <noah@leadboat.com>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reported-by: Paul Gerste (as part of zeroday.cloud)
Reported-by: Moritz Sanft (as part of zeroday.cloud)
When converting multibyte to pg_wchar, the UTF-8 implementation would
silently ignore an incomplete final character, while the other
implementations would cast a single byte to pg_wchar, and then repeat
for the remaining byte sequence. While it didn't overrun the buffer, it
was surely garbage output.
Make all encodings behave like the UTF-8 implementation. A later change
for master only will convert this to an error, but we choose not to
back-patch that behavior change on the off-chance that someone is
relying on the existing UTF-8 behavior.
Security: CVE-2026-2006
Backpatch-through: 14
Author: Thomas Munro <thomas.munro@gmail.com>
Reported-by: Noah Misch <noah@leadboat.com>
Reviewed-by: Noah Misch <noah@leadboat.com>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
While EUC_CN supports only 1- and 2-byte sequences (CS0, CS1), the
mb<->wchar conversion functions allow 3-byte sequences beginning SS2,
SS3.
Change pg_encoding_max_length() to return 3, not 2, to close a
hypothesized buffer overrun if a corrupted string is converted to wchar
and back again in a newly allocated buffer. We might reconsider that in
master (ie harmonizing in a different direction), but this change seems
better for the back-branches.
Also change pg_euccn_mblen() to report SS2 and SS3 characters as having
length 3 (following the example of EUC_KR). Even though such characters
would not pass verification, it's remotely possible that invalid bytes
could be used to compute a buffer size for use in wchar conversion.
Security: CVE-2026-2006
Backpatch-through: 14
Author: Thomas Munro <thomas.munro@gmail.com>
Reviewed-by: Noah Misch <noah@leadboat.com>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
pgp_pub_decrypt_bytea() was missing a safeguard for the session key
length read from the message data, that can be given in input of
pgp_pub_decrypt_bytea(). This can result in the possibility of a buffer
overflow for the session key data, when the length specified is longer
than PGP_MAX_KEY, which is the maximum size of the buffer where the
session data is copied to.
A script able to rebuild the message and key data that can trigger the
overflow is included in this commit, based on some contents provided by
the reporter, heavily editted by me. A SQL test is added, based on the
data generated by the script.
Reported-by: Team Xint Code as part of zeroday.cloud
Author: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Noah Misch <noah@leadboat.com>
Security: CVE-2026-2005
Backpatch-through: 14
(This is a cherry-pick of 390b3cbbb, which I hadn't realized wasn't
backpatched. It was originally reported to security@ and determined not
to be a vulnerability; thanks to Stanislav Osipov for noticing the
omission in the back branches.)
In case of torn UTF8 in the input data we might end up going
past the end of the string since we don't account for length.
While validation won't be performed on a sequence with a NULL
byte it's better to avoid going past the end to beging with.
Fix by taking the length into consideration.
Reported-by: Stanislav Osipov <stasos24@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/CAOYmi+mTnmM172g=_+Yvc47hzzeAsYPy2C4UBY3HK9p-AXNV0g@mail.gmail.com
Backpatch-through: 14
These errors are very unlikely going to show up, but in the event that
they happen, some incorrect information would have been provided:
- In pg_rewind, a stat() failure was reported as an open() failure.
- In pg_combinebackup, a check for the new directory of a tablespace
mapping was referred as the old directory.
- In pg_combinebackup, a failure in reading a source file when copying
blocks referred to the destination file.
The changes for pg_combinebackup affect v17 and newer versions. For
pg_rewind, all the stable branches are affected.
Author: Man Zeng <zengman@halodbtech.com>
Discussion: https://postgr.es/m/tencent_1EE1430B1E6C18A663B8990F@qq.com
Backpatch-through: 14
Mostly this involves checking for NULL pointer before doing operations
that add a non-zero offset.
The exception is an overflow warning in heap_fetch_toast_slice(). This
was caused by unneeded parentheses forcing an expression to be
evaluated to a negative integer, which then got cast to size_t.
Per clang 21 undefined behavior sanitizer.
Backpatch to all supported versions.
Co-authored-by: Alexander Lakhin <exclusion@gmail.com>
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/777bd201-6e3a-4da0-a922-4ea9de46a3ee@gmail.com
Backpatch-through: 14
BackgroundPsql needs to wait for all the output from an interactive
psql command to come back. To make sure that's happened, it issues
the command, then issues \echo and \warn psql commands that echo
a "banner" string (which we assume won't appear in the command's
output), then waits for the banner strings to appear. The hazard
in this approach is that the banner will also appear in the echoed
psql commands themselves, so we need to distinguish those echoes from
the desired output. Commit 8b886a4e3 tried to do that by positing
that the desired output would be directly preceded and followed by
newlines, but it turns out that that assumption is timing-sensitive.
In particular, it tends to fail in builds made --without-readline,
wherein the command echoes will be made by the pty driver and may
be interspersed with prompts issued by psql proper.
It does seem safe to assume that the banner output we want will be
followed by a newline, since that should be the last output before
things quiesce. Therefore, we can improve matters by putting quotes
around the banner strings in the \echo and \warn psql commands, so
that their echoes cannot include banner directly followed by newline,
and then checking for just banner-and-newline in the match pattern.
While at it, spruce up the pump() call in sub query() to look like
the neater version in wait_connect(), and don't die on timeout
until after printing whatever we got.
Reported-by: Oleg Tselebrovskiy <o.tselebrovskiy@postgrespro.ru>
Diagnosed-by: Oleg Tselebrovskiy <o.tselebrovskiy@postgrespro.ru>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Soumya S Murali <soumyamurali.work@gmail.com>
Discussion: https://postgr.es/m/db6fdb35a8665ad3c18be01181d44b31@postgrespro.ru
Backpatch-through: 14
ExecInitModifyTable() unconditionally required a ctid junk column even
when the target was a partitioned table. This led to spurious "could
not find junk ctid column" errors when all children were excluded and
only the dummy root result relation remained.
A partitioned table only appears in the result relations list when all
leaf partitions have been pruned, leaving the dummy root as the sole
entry. Assert this invariant (nrels == 1) and skip the ctid requirement.
Also adjust ExecModifyTable() to tolerate invalid ri_RowIdAttNo for
partitioned tables, which is safe since no rows will be processed in
this case.
Bug: #19099
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Author: Amit Langote <amitlangote09@gmail.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Reviewed-by: Kirill Reshke <reshkekirill@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/19099-e05dcfa022fe553d%40postgresql.org
Backpatch-through: 14
Commit f16241bef mistakenly supposed that INSERT...ON CONFLICT DO
UPDATE rejects partitioned target tables. (This may have been
accurate when the patch was written, but it was already obsolete
when committed.) Hence, there's an assertion that we can't see
ItemPointerIndicatesMovedPartitions() in that path, but the assertion
is triggerable.
Some other places throw error if they see a moved-across-partitions
tuple, but there seems no need for that here, because if we just retry
then we get the same behavior as in the update-within-partition case,
as demonstrated by the new isolation test. So fix by deleting the
faulty Assert. (The fact that this is the fix doubtless explains
why we've heard no field complaints: the behavior of a non-assert
build is fine.)
The TM_Deleted case contains a cargo-culted copy of the same Assert,
which I also deleted to avoid confusion, although I believe that one
is actually not triggerable.
Per our code coverage report, neither the TM_Updated nor the
TM_Deleted case were reached at all by existing tests, so this
patch adds tests for both.
Reported-by: Dmitry Koval <d.koval@postgrespro.ru>
Author: Joseph Koshakow <koshy44@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/f5fffe4b-11b2-4557-a864-3587ff9b4c36@postgrespro.ru
Backpatch-through: 14
With LLVM >= 17, transform passes are provided as a string to
LLVMRunPasses. Only two strings were used: "default<O3>" and
"default<O0>,mem2reg".
With previous LLVM versions, an additional inline pass was added when
JIT inlining was enabled without optimization. With LLVM >= 17, the code
would go through llvm_inline, prepare the functions for inlining, but
the generated bitcode would be the same due to the missing inline pass.
This patch restores the previous behavior by adding an inline pass when
inlining is enabled but no optimization is done.
This fixes an oversight introduced by 76200e5e when support for LLVM 17
was added.
Backpatch-through: 14
Author: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Reviewed-by: Andreas Karlsson <andreas@proxel.se>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de>
Reviewed-by: Pierre Ducroquet <p.psql@pinaraf.info>
Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com>
Discussion: https://postgr.es/m/CAO6_XqrNjJnbn15ctPv7o4yEAT9fWa-dK15RSyun6QNw9YDtKg%40mail.gmail.com
We were using SnapshotAny to do some index checks, but that's wrong and
causes spurious errors when used on indexes created by CREATE INDEX
CONCURRENTLY. Fix it to use an MVCC snapshot, and add a test for it.
Backpatch of 6bd469d26a to branches 14-16. I previously misidentified
the bug's origin: it came in with commit 7f563c09f8 (pg11-era, not
5ae2087202 as claimed previously), so all live branches are affected.
Also take the opportunity to fix some comments that we failed to update
in the original commits and apply pgperltidy. In branch 14, remove the
unnecessary test plan specification (which would have need to have been
changed anyway; c.f. commit 549ec201d613.)
Diagnosed-by: Donghang Lin <donghanglin@gmail.com>
Author: Mihail Nikalayeu <mihailnikalayeu@gmail.com>
Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru>
Backpatch-through: 17
Discussion: https://postgr.es/m/CANtu0ojmVd27fEhfpST7RG2KZvwkX=dMyKUqg0KM87FkOSdz8Q@mail.gmail.com
This is pretty pro-forma for our purposes, as the only change
is a historical correction for pre-1976 DST laws in
Baja California. (Upstream made this release mostly to update
their leap-second data, which we don't use.) But with minor
releases coming up, we should be up-to-date.
Backpatch-through: 14
The code adding the WAL information included in a backup manifest is
cross-checked with the contents of the timeline history file of the end
timeline. A check based on the end timeline, when it fails, reported
the value of the start timeline in the error message. This error is
fixed to show the correct timeline number in the report.
This error report would be confusing for users if seen, because it would
provide an incorrect information, so backpatch all the way down.
Oversight in 0d8c9c1210.
Author: Man Zeng <zengman@halodbtech.com>
Discussion: https://postgr.es/m/tencent_0F2949C4594556F672CF4658@qq.com
Backpatch-through: 14
If a FATAL error occurs while holding a lock in a DSM segment (such
as a dshash lock) and the process is not in a transaction, a
segmentation fault can occur during process exit.
The problem sequence is:
1. Process acquires a lock in a DSM segment (e.g., via dshash)
2. FATAL error occurs outside transaction context
3. proc_exit() begins, calling before_shmem_exit callbacks
4. dsm_backend_shutdown() detaches all DSM segments
5. Later, on_shmem_exit callbacks run
6. ProcKill() calls LWLockReleaseAll()
7. Segfault: the lock being released is in unmapped memory
This only manifests outside transaction contexts because
AbortTransaction() calls LWLockReleaseAll() during transaction
abort, releasing locks before DSM cleanup. Background workers and
other non-transactional code paths are vulnerable.
Fix by calling LWLockReleaseAll() unconditionally at the start of
shmem_exit(), before any callbacks run. Releasing locks before
callbacks prevents the segfault - locks must be released before
dsm_backend_shutdown() detaches their memory. This is safe because
after an error, held locks are protecting potentially inconsistent
data anyway, and callbacks can acquire fresh locks if needed.
Also add a comment noting that LWLockReleaseAll() must be safe to
call before LWLock initialization (which it is, since
num_held_lwlocks will be 0), plus an Assert for the post-condition.
This fix aligns with the original design intent from commit
001a573a2, which noted that backends must clean up shared memory
state (including releasing lwlocks) before unmapping dynamic shared
memory segments.
Reported-by: Rahila Syed <rahilasyed90@gmail.com>
Author: Rahila Syed <rahilasyed90@gmail.com>
Reviewed-by: Amit Langote <amitlangote09@gmail.com>
Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de>
Discussion: https://postgr.es/m/CAH2L28uSvyiosL+kaic9249jRVoQiQF6JOnaCitKFq=xiFzX3g@mail.gmail.com
Backpatch-through: 14
On restart, a replica can fail with an error like 'unexpected data
beyond EOF in block 200 of relation T/D/R'. These are the steps to
reproduce it:
- A relation has a size of 400 blocks.
- Blocks 201 to 400 are empty.
- Block 200 has two rows.
- Blocks 100 to 199 are empty.
- A restartpoint is done
- Vacuum truncates the relation to 200 blocks
- A FPW deletes a row in block 200
- A checkpoint is done
- A FPW deletes the last row in block 200
- Vacuum truncates the relation to 100 blocks
- The replica restarts
When the replica restarts:
- The relation on disk starts at 100 blocks, because all the
truncations were applied before restart.
- The first truncate to 200 blocks is replayed. It silently fails, but
it will still (incorrectly!) update the cache size to 200 blocks
- The first FPW on block 200 is applied. XLogReadBufferForRead relies
on the cached size and incorrectly assumes that the page already
exists in the file, and thus won't extend the relation.
- The online checkpoint record is replayed, calling smgrdestroyall
which causes the cached size to be discarded
- The second FPW on block 200 is applied. This time, the detected size
is 100 blocks, an extend is attempted. However, the block 200 is
already present in the buffer cache due to the first FPW. This
triggers the 'unexpected data beyond EOF'.
To fix, update the cached size in SmgrRelation with the current size
rather than the requested new size, when the requested new size is
greater.
Author: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>
Discussion: https://www.postgresql.org/message-id/CAO6_Xqrv-snNJNhbj1KjQmWiWHX3nYGDgAc=vxaZP3qc4g1Siw@mail.gmail.com
Backpatch-through: 14
If a multixid with zero offset is left behind after a crash, and that
multixid later becomes the oldest multixid, truncation might try to
look up its offset and read the zero value. In the worst case, we
might incorrectly use the zero offset to truncate valid SLRU segments
that are still needed. I'm not sure if that can happen in practice, or
if there are some other lower-level safeguards or incidental reasons
that prevent the caller from passing an unwritten multixid as the
oldest multi. But better safe than sorry, so let's add an explicit
check for it.
In stable branches, we should perhaps do the same check for
'oldestOffset', i.e. the offset of the old oldest multixid (in master,
'oldestOffset' is gone). But if the old oldest multixid has an invalid
offset, the damage has been done already, and we would never advance
past that point. It's not clear what we should do in that case. The
check that this commit adds will prevent such an multixid with invalid
offset from becoming the oldest multixid in the first place, which
seems enough for now.
Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru>
Discussion: Discussion: https://www.postgresql.org/message-id/000301b2-5b81-4938-bdac-90f6eb660843@iki.fi
Backpatch-through: 14
Reword publish_via_partition_root's opening paragraph. Describe its
behavior more clearly, and directly state that its default is false.
Per complaint by Peter Smith; final text of the patch made in
collaboration with Chao Li.
Author: Chao Li <li.evan.chao@gmail.com>
Author: Peter Smith <peter.b.smith@fujitsu.com>
Reported-by: Peter Smith <peter.b.smith@fujitsu.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/CAHut%2BPu7SpK%2BctOYoqYR3V4w5LKc9sCs6c_qotk9uTQJQ4zp6g%40mail.gmail.com
Backpatch-through: 14
When creating a partition for a RANGE partitioned table, the reporting
of errors relating to converting the specified range values into
constant values for the partition key's type could display the name of a
previous partition key column when an earlier range was specified as
MINVALUE or MAXVALUE.
This was caused by the code not correctly incrementing the index that
tracks which partition key the foreach loop was working on after
processing MINVALUE/MAXVALUE ranges.
Fix by using foreach_current_index() to ensure the index variable is
always set to the List element being worked on.
Author: myzhen <zhenmingyang@yeah.net>
Reviewed-by: zhibin wang <killerwzb@gmail.com>
Discussion: https://postgr.es/m/273cab52.978.19b96fc75e7.Coremail.zhenmingyang@yeah.net
Backpatch-through: 14