infer_arbiter_indexes failed to renumber varnos in index expressions
or predicates that it got from the catalogs. This escaped detection
up to now because the stored varnos in such trees will be 1, and an
INSERT's result relation is usually the first rangetable entry,
so that that was fine. However, in cases such as inserting through
an updatable view, it's not fine, leading to failure to match the
expressions to the query with ensuing "there is no unique or exclusion
constraint matching the ON CONFLICT specification" errors.
Fix by copy-and-paste from get_relation_info().
Per bug #18502 from Michael Wang. Back-patch to all supported
versions.
Discussion: https://postgr.es/m/18502-545b53f5b81e54e0@postgresql.org
test_predtest() neglected to consider the possibility that
SPI_plan_get_cached_plan would return NULL. This led to a core
dump if the input (incorrectly) contains more than one SQL
command.
While here, let's expend more than zero effort on the error
message for this case and nearby ones.
Per (half of) bug #18483 from Alexander Kozhemyakin.
Back-patch to all supported branches, not because this is
very significant (it's merely test scaffolding) but to make
our world a bit safer for fuzz testing.
Discussion: https://postgr.es/m/18483-30bfff42de238000@postgresql.org
Normally this case isn't even reachable by non-superusers, since
permissions checks prevent naming such a table. However, it is
possible to make it happen by altering a parent table whose child
is another session's temp table.
We definitely can't support any such ALTER that requires modifying
the contents of such a table, since we lack access to the other
session's temporary-buffer pool. But there seems no good reason
to allow it even if it'd only require changing catalog contents.
One reason not to allow it is that we'd rather not expose the
implementation-dependent behavior of whether a specific ALTER
requires touching the table contents. Another is that there may
be (in future, even if not today) optimizations that assume that
a session's own temp tables won't be modified by other sessions.
Hence, add a RELATION_IS_OTHER_TEMP() check to all the places
where ALTER TABLE currently does CheckTableNotInUse(). (I looked
through all other callers of CheckTableNotInUse(), and they seem
OK already.)
Per bug #18492 from Alexander Lakhin. Back-patch to all supported
branches.
Discussion: https://postgr.es/m/18492-c7a2634bf4968763@postgresql.org
If the CALL is within an atomic context (e.g. there's an outer
transaction block), _SPI_execute_plan should acquire a fresh snapshot
to execute any such functions with. We failed to do that and instead
passed them the Portal snapshot, which had been acquired at the start
of the current SQL command. This'd lead to seeing stale values of
rows modified since the start of the command.
This is arguably a bug in 84f5c2908: I failed to see that "are we in
non-atomic mode" needs to be defined the same way as it is further
down in _SPI_execute_plan, i.e. check !_SPI_current->atomic not just
options->allow_nonatomic. Alternatively the blame could be laid on
plpgsql, which is unconditionally passing allow_nonatomic = true
for CALL/DO even when it knows it's in an atomic context. However,
fixing it in spi.c seems like a better idea since that will also fix
the problem for any extensions that may have copied plpgsql's coding
pattern.
While here, update an obsolete comment about _SPI_execute_plan's
snapshot management.
Per report from Victor Yegorov. Back-patch to all supported versions.
Discussion: https://postgr.es/m/CAGnEboiRe+fG2QxuBO2390F7P8e2MQ6UyBjZSL_w1Cej+E4=Vw@mail.gmail.com
Before the v13-era commit 913bbd88d, check_sql_fn_retval fails to
resolve polymorphic output types and then just throws up its hands and
assumes the check will be made at runtime. I think that's true for
ordinary functions returning RECORD, but it doesn't happen in CALL,
potentially resulting in crashes if the actual output of the SQL
procedure's SELECT doesn't match the type inferred from polymorphism.
With a little bit of rearrangement, we can use get_call_result_type
instead of get_func_result_type and thereby infer the correct types.
I'm still unwilling to back-patch all of 913bbd88d, so if the types
don't match you'll get an error rather than perhaps silently inserting
a cast as v13 and later can. That's consistent with prior behavior
though, so it seems fine.
Prior to 70ffb27b2, you'd typically get other errors due to other
shortcomings of CALL's management of polymorphism. Nonetheless,
this is an independent bug.
Although there is no bug in v13 and up, it seems prudent to add
the test case for this to the newer branches too. It's clearly
an under-tested area.
Per report from Andrew Bille.
Discussion: https://postgr.es/m/CAJnzarw9EeWHAQRm76dXd=7j+rgw6ERqC=nCay8jeFqTwKwhqQ@mail.gmail.com
In a procedure or function returning tuple, we use that function to
parse the Tcl script's result, which is supposed to be a Tcl list.
If it isn't, you get an error. Commit 26abb50c4 incautiously
supposed that we could use throw_tcl_error() to report such an error.
That doesn't actually work, because low-level functions like
Tcl_ListObjGetElements() don't fill Tcl's errorInfo variable.
The result is either a null-pointer-dereference crash or emission
of misleading context information describing the previous Tcl error.
Back off to just reporting the interpreter's result string, and
improve throw_tcl_error()'s comment to explain when to use it.
Also, although the similar code in pltcl_trigger_handler() avoided
this mistake, it was using a fairly confusing wording of the
error message. Improve that while we're here.
Per report from A. Kozhemyakin. Back-patch to all supported
branches.
Erik Wienhold and Tom Lane
Discussion: https://postgr.es/m/6a2a1c40-2b2c-4a33-8b72-243c0766fcda@postgrespro.ru
Coverity complains that ECPGdebug is accessing debugstream without
holding debug_mutex, which is a fair complaint: we should take
debug_mutex while changing the settings ecpg_log looks at.
In some branches it also complains about unlocked use of simple_debug.
I think it's intentional and safe to have a quick unlocked check of
simple_debug at the start of ecpg_log, since that early exit will
always be taken in non-debug cases. But we should recheck
simple_debug after acquiring the mutex. In the worst case, calling
ECPGdebug concurrently with ecpg_log in another thread could result
in a null-pointer dereference due to debugstream transiently being
NULL while simple_debug isn't 0.
This is largely hypothetical, since it's unlikely anybody uses
ECPGdebug() at all in the field, and our own regression tests
don't seem to be hitting the theoretical race conditions either.
Still, if we're going to the trouble of having mutexes here, we ought
to be using them in a way that's actually safe not just almost safe.
Hence, back-patch to all supported branches.
We are capable of optimizing MIN() and MAX() aggregates on indexed
columns into subqueries that exploit the index, rather than the normal
thing of scanning the whole table. When we do this, we replace the
Aggref node(s) with Params referencing subquery outputs. Such Params
really ought to be included in the per-plan-node extParam/allParam
sets computed by SS_finalize_plan. However, we've never done so
up to now because of an ancient implementation choice to perform
that substitution during set_plan_references, which runs after
SS_finalize_plan, so that SS_finalize_plan never sees these Params.
The cleanest fix would be to perform a separate tree walk to do
these substitutions before SS_finalize_plan runs. That seems
unattractive, first because a whole-tree mutation pass is expensive,
and second because we lack infrastructure for visiting expression
subtrees in a Plan tree, so that we'd need a new function knowing
as much as SS_finalize_plan knows about that. I also considered
swapping the order of SS_finalize_plan and set_plan_references,
but that fell foul of various assumptions that seem tricky to fix.
So the approach adopted here is to teach SS_finalize_plan itself
to check for such Aggrefs. I refactored things a bit in setrefs.c
to avoid having three copies of the code that does that.
Back-patch of v17 commits d0d44049d and 779ac2c74. When d0d44049d
went in, there was no evidence that it was fixing a reachable bug,
so I refrained from back-patching. Now we have such evidence.
Per bug #18465 from Hal Takahara. Back-patch to all supported
branches.
Discussion: https://postgr.es/m/18465-2fae927718976b22@postgresql.org
Discussion: https://postgr.es/m/2391880.1689025003@sss.pgh.pa.us
Commit 695b4a113a added a dependency on retrieving oldestxid from
pg_control, which only exists in 9.0 and onwards, but the check for
8.4 as the oldest version was retained. Since there has been few if
any complaints of 8.4 upgrades not working, fix by setting 9.0 as
the oldest version supported rather than resurrecting 8.4 support.
Backpatch to all supported versions.
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/1973418.1657040382@sss.pgh.pa.us
Backpatch-through: v12
Most of the infrastructure for procedure arguments was already
okay with polymorphic output arguments, but it turns out that
CallStmtResultDesc() was a few bricks shy of a load here. It thought
all it needed to do was call build_function_result_tupdesc_t, but
that function specifically disclaims responsibility for resolving
polymorphic arguments. Failing to handle that doesn't seem to be
a problem for CALL in plpgsql, but CALL from plain SQL would get
errors like "cannot display a value of type anyelement", or even
crash outright.
In v14 and later we can simply examine the exposed types of the
CallStmt.outargs nodes to get the right type OIDs. But it's a lot
more complicated to fix in v12/v13, because those versions don't
have CallStmt.outargs, nor do they do expand_function_arguments
until ExecuteCallStmt runs. We have to duplicatively run
expand_function_arguments, and then re-determine which elements
of the args list are output arguments.
Per bug #18463 from Drew Kimball. Back-patch to all supported
versions, since it's busted in all of them.
Discussion: https://postgr.es/m/18463-f8cd77e12564d8a2@postgresql.org
Presently, when this function is called for an unlogged sequence on
a standby server, it will error out with a message like
ERROR: could not open file "base/5/16388": No such file or directory
Since the pg_sequences system view uses pg_sequence_last_value(),
it can error similarly. To fix, modify the function to return NULL
for unlogged sequences on standby servers. Since this bug is
present on all versions since v15, this approach is preferable to
making the ERROR nicer because we need to repair the pg_sequences
view without modifying its definition on released versions. For
consistency, this commit also modifies the function to return NULL
for other sessions' temporary sequences. The pg_sequences view
already appropriately filters out such sequences, so there's no bug
there, but we might as well offer some defense in case someone
invokes this function directly.
Unlogged sequences were first introduced in v15, but temporary
sequences are much older, so while the fix for unlogged sequences
is only back-patched to v15, the temporary sequence portion is
back-patched to all supported versions.
We could also remove the privilege check in the pg_sequences view
definition in v18 if we modify this function to return NULL for
sequences for which the current user lacks privileges, but that is
left as a future exercise for when v18 development begins.
Reviewed-by: Tom Lane, Michael Paquier
Discussion: https://postgr.es/m/20240501005730.GA594666%40nathanxps13
Backpatch-through: 12
If we recursed to a new call of the same function, with a different
coldeflist (AS clause), it would fail because the inner call would
overwrite the outer call's idea of what to return. This is vaguely
like 1d2fe56e4 and c5bec5426, but it's not due to any API decisions:
it's just that we computed the actual output rowtype at the start of
the call, and saved it in the per-procedure data structure. We can
fix it at basically zero cost by doing the computation at the end
of each call instead of the start.
It's not clear that there's any real-world use-case for such a
function, but given that it doesn't cost anything to fix,
it'd be silly not to.
Per report from Andreas Karlsson. Back-patch to all supported
branches.
Discussion: https://postgr.es/m/1651a46d-3c15-4028-a8c1-d74937b54e19@proxel.se
If -l was specified together with selective-restore options such as -n
or -N, dependent TOC entries such as comments would be omitted from
the listing, even when an actual restore would have selected them.
This happened because PrintTOCSummary neglected to update the te->reqs
marking of the entry they depended on.
Per report from Justin Pryzby. This has been wrong since 0d4e6ed30
taught _tocEntryRequired to sometimes look at the "reqs" marking of
other TOC entries, so back-patch to all supported branches.
Discussion: https://postgr.es/m/ZjoeirG7yxODdC4P@pryzbyj2023
If a plpython-language trigger caused another one to be invoked,
the "TD" dictionary created for the inner one would overwrite the
outer one's "TD" dictionary. This is more or less the same problem
that 1d2fe56e4 fixed for ordinary functions in plpython, so fix it
the same way, by saving and restoring "TD" during a recursive
invocation.
This fix makes an ABI-incompatible change in struct PLySavedArgs.
I'm not too worried about that because it seems highly unlikely that
any extension is messing with those structs. We could imagine doing
something weird to preserve nominal ABI compatibility in the back
branches, like keeping the saved TD object in an extra element of
namedargs[]. However, that would only be very nominal compatibility:
if anything *is* touching PLySavedArgs, it would likely do the wrong
thing due to not knowing about the additional value. So I judge it
not worth the ugliness to do something different there.
(I also changed struct PLyProcedure, but its added field fits
into formerly-padding space, so that should be safe.)
Per bug #18456 from Jacques Combrink. This bug is very ancient,
so back-patch to all supported branches.
Discussion: https://postgr.es/m/3008982.1714853799@sss.pgh.pa.us
As an optimization, we store "name" columns as cstrings in btree
indexes.
Here we modify it so that Index Only Scans convert these cstrings back
to names with NAMEDATALEN bytes rather than storing the cstring in the
tuple slot, as was happening previously.
Bug: #17855
Reported-by: Alexander Lakhin
Reviewed-by: Alexander Lakhin, Tom Lane
Discussion: https://postgr.es/m/17855-5f523e0f9769a566@postgresql.org
Backpatch-through: 12, all supported versions
We have long disallowed all forms of ALTER TABLE if the table is
already opened by some outer SQL command in the same session.
This has the same purpose as obtaining AccessExclusiveLock, but
since a session's own locks don't conflict the lock only blocks use
of the table by other sessions, not our own. Without this check,
the ALTER might confuse the outer SQL command since any previous
inspection of the table would potentially become invalid.
However, the RelisBecomingView code path in DefineQueryRewrite never
got that memo, and assumed that AccessExclusiveLock is sufficient
for performing something morally equivalent to a rather invasive
ALTER TABLE. Unsurprisingly, this can confuse an outer command
that is trying to do something with the table.
This was submitted as a security issue, but the security team
has been unable to identify any consequence worse than a null
pointer dereference (from trying to access rd_tableam methods
that the relation no longer has). Therefore, in accordance
with our usual policy, it's not security material and should
just be fixed as a routine bug.
Fix by disallowing the operation if the table is open locally,
exactly as ALTER TABLE does it.
Per an anonymous security researcher, via Bundesamt für Sicherheit
in der Informationstechnik.
Patch v12-v15 only. In v16 and later, we removed this code
altogether (cf. commit b23cd185f), so that there's no issue.
vac_update_datfrozenxid() did multiple loads of relfrozenxid and
relminmxid from buffer memory, and it assumed each would get the same
value. Not so if a concurrent vac_update_relstats() did an inplace
update. Commit 2d2e40e3be fixed the same
kind of bug in vac_truncate_clog(). Today's bug could cause the
rel-level field and XIDs in the rel's rows to precede the db-level
field. A cluster having such values should VACUUM affected tables.
Back-patch to v12 (all supported versions).
Discussion: https://postgr.es/m/20240423003956.e7.nmisch@google.com
In commit 25cd2d640 I (tgl) opined that "The additions of the months
and microseconds fields could also overflow, of course. However,
I believe we need no additional checks there; the existing range
checks should catch such cases". This is demonstrably wrong however
for the microseconds field, and given that discovery it seems prudent
to be paranoid about the months addition as well.
Report and patch by Joseph Koshakow. As before, back-patch to all
supported branches. (However, the test case doesn't work before
v15 because we didn't allow wider-than-int32 numbers in interval
literals. A variant test could probably be built that fits within
that restriction, but it didn't seem worth the trouble.)
Discussion: https://postgr.es/m/CAAvxfHf77sRHKoEzUw9_cMYSpbpNS2C+J_+8Dq4+0oi8iKopeA@mail.gmail.com
While back-patching commit 6f0cef935, I forgot that the MSVC
build scripts would also need adjustment in the back branches.
This is a blind attempt at a fix, but it's basically copying
nearby code so I think it will work.
Per buildfarm (via Andrew Dunstan)
Discussion: https://postgr.es/m/4cc4dc47-ca2b-4129-8784-db69b5f82777@dunslane.net
The code associated with EXEC SQL DEFINE was unreadable and full of
bugs, notably:
* It'd attempt to free a non-malloced string if the ecpg program
tries to redefine a macro that was defined on the command line.
* Possible memory stomp if user writes "-D=foo".
* Undef'ing or redefining a macro defined on the command line would
change the state visible to the next file, when multiple files are
specified on the command line. (While possibly that could have been
an intentional choice, the code clearly intends to revert to the
original macro state; it's just failing to consider this interaction.)
* Missing "break" in defining a new macro meant that redefinition
of an existing name would cause an extra entry to be added to the
definition list. While not immediately harmful, a subsequent undef
would result in the prior entry becoming visible again.
* The interactions with input buffering are subtle and were entirely
undocumented.
It's not that surprising that we hadn't noticed these bugs,
because there was no test coverage at all of either the -D
command line switch or multiple input files. This patch adds
such coverage (in a rather hacky way I guess).
In addition to the code bugs, the user documentation was confused
about whether the -D switch defines a C macro or an ecpg one, and
it failed to mention that you can write "-Dsymbol=value".
These problems are old, so back-patch to all supported branches.
Discussion: https://postgr.es/m/998011.1713217712@sss.pgh.pa.us
get_baserel_parampathinfo previously assumed without checking that
the results of generate_join_implied_equalities "necessarily satisfy
join_clause_is_movable_into". This turns out to be wrong in the
presence of outer joins, because the generated clauses could include
Vars that mustn't be evaluated below a relevant outer join. That
led to applying clauses at the wrong plan level and possibly getting
incorrect query results. We must check each clause's nullable_relids,
and really the right thing to do is test join_clause_is_movable_into.
However, trying to fix it that way exposes an oversight in
equivclass.c: it wasn't careful about marking join clauses for
appendrel children with the correct clause_relids. That caused the
modified get_baserel_parampathinfo code to reject some clauses it
still needs to accept. (See parallel commit for HEAD/v16 for more
commentary about that.)
Per bug #18429 from Benoît Ryder. This misbehavior existed for
a long time before commit 2489d76c4, so patch v12-v15 this way.
Discussion: https://postgr.es/m/18429-8982d4a348cc86c6@postgresql.org
Commit 2ed8f9a01 intended to institute a policy that if a
RangeTblFunction has a coldeflist, then the function return type is
certainly RECORD, and we should use the coldeflist as the source of
truth about what the columns of the record type are. When the
original function has been folded to a constant, inspection of the
constant might give a different answer. This situation will lead to
a tuple-type-mismatch error at execution, but up until that point we
need to consistently believe the coldeflist, or we'll have problems
from different bits of code reaching different conclusions.
expandRTE didn't get that memo though, and would try to produce a
tupdesc based on the constant in this situation, leading to an
assertion failure. (Desultory testing suggests that non-assert
builds often manage to give the expected error, although I also
saw a "cache lookup failed for type 0" error, and it seems at
least possible that a crash could happen.)
Some other callers of get_expr_result_type and get_expr_result_tupdesc
were also being incautious about this. While none of them seem to
have actual bugs, they're working harder than necessary in this case,
besides which it seems safest to have an explicit policy of not using
those functions on an RTE with a coldeflist. Adjust the code
accordingly, and add commentary to funcapi.c about this policy.
Also fix an obsolete comment that claimed "get_expr_result_type()
doesn't know how to extract type info from a RECORD constant".
That hasn't been true since commit d57534740.
Per bug #18422 from Alexander Lakhin.
As with the previous commit, back-patch to all supported branches.
Discussion: https://postgr.es/m/18422-89ca86c8eac5246d@postgresql.org
This function would have the same issue we solved in commit 501cfd07d:
If an error is thrown after calling CreateWaitEventSet(), the file
descriptor (on epoll- or kqueue-based systems) or handles (on Windows)
that the WaitEventSet contains are leaked.
Like that commit, use PG_TRY-PG_FINALLY (PG_TRY-PG_CATCH in v12) to make
sure the WaitEventSet is freed properly.
Back-patch to all supported versions, but as we do not have this issue
in HEAD (cf. commit 50c67c201), no need to apply this patch to it.
Discussion: https://postgr.es/m/CAPmGK16MqdDoD8oatp8SQWaEa4vS3nfQqDN_Sj9YRuu5J3Lj9g%40mail.gmail.com
Up to now, read_sql_construct() has collected all the source text from
the statement or expression's initial token up to the character just
before the "until" token. It normally tries to strip trailing
whitespace from that, largely for neatness. If there was a "-- text"
comment after the expression, this resulted in removing the newline
that terminates the comment, which creates a hazard if we try to paste
the collected text into a larger SQL construct without inserting a
newline after it. In particular this caused our handling of CASE
constructs to fail if there's a comment after a WHEN expression.
Commit 4adead1d2 noticed a similar problem with cursor arguments,
and worked around it through the rather crude hack of suppressing
the whitespace-trimming behavior for those. Rather than do that
and leave the hazard open for future hackers to trip over, let's
fix it properly. pl_scanner.c already has enough infrastructure
to report the end location of the expression's last token, so
we can copy up to that location and never collect any trailing
whitespace or comment to begin with.
Erik Wienhold and Tom Lane, per report from Michal Bartak.
Back-patch to all supported branches.
Discussion: https://postgr.es/m/CAAVzF_FjRoi8fOVuLCZhQJx6HATQ7MKm=aFOHWZODFnLmjX-xA@mail.gmail.com
Commit 72559438 started copying more attributes from AttributeTemplate
to the functions we generate on the fly. In the case of deform
functions, which return void, this meant that "noundef", from
AttributeTemplate's return value (a Datum) was copied to a void type.
Older LLVM releases were OK with that, but LLVM 18 crashes.
Update our llvm_copy_attributes() function to skip copying the attribute
for the return value, if the target function returns void.
Thanks to Dmitry Dolgov for help chasing this down.
Back-patch to all supported releases, like 72559438.
Reported-by: Pavel Stehule <pavel.stehule@gmail.com>
Reviewed-by: Dmitry Dolgov <9erthalion6@gmail.com>
Discussion: https://postgr.es/m/CAFj8pRACpVFr7LMdVYENUkScG5FCYMZDDdSGNU-tch%2Bw98OxYg%40mail.gmail.com
While SH_STAT() is only used for debugging, the allocated array can be large,
and therefore should be freed.
It's unclear why coverity started warning now.
Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Reported-by: Coverity
Discussion: https://postgr.es/m/3005248.1712538233@sss.pgh.pa.us
Backpatch: 12-
If the OpenLDAP installation directory is not found, set $setup to 0
so that the LDAP tests are skipped. The macOS checks were already
doing that, but the checks on other OS's were not. While we're at it,
improve the error message when the tests are skipped, to specify
whether the OS is supported at all, or if we just didn't find the
installation directory.
This was accidentally "working" without this, i.e. we were skipping
the tests if the OpenLDAP installation was not found, because of a bug
in the LdapServer test module: the END block clobbered the exit code
so if the script die()s before running the first subtest, the whole
test script was marked as SKIPped. The next commit will fix that bug,
but we need to fix the setup code first.
These checks should probably go into configure/meson, but this is
better than nothing and allows fixing the bug in the END block.
Backpatch to all supported versions.
Discussion: https://www.postgresql.org/message-id/fb898a70-3a88-4629-88e9-f2375020061d@iki.fi
ecpg wants to emit a warning if it parses a SQL construct that the
backend can parse but will immediately throw a FEATURE_NOT_SUPPORTED
error for. The way it was testing for this was to see if the string
ERRCODE_FEATURE_NOT_SUPPORTED appeared anywhere in the gram.y code.
This is, of course, not nearly good enough, as there are plenty of
rules in gram.y that throw that error only conditionally. There was
a hack dating to 2008 to suppress the warning in one rule that
doesn't even exist anymore, but nothing for other cases we've created
since then. End result was that you could get "unsupported feature
will be passed to server" warnings while compiling perfectly good SQL
code in ecpg. Somehow we'd not heard complaints about this, but
it was exposed by the recent addition of an ecpg test for a SQL/JSON
construct.
To fix, suppress the warning if the rule contains any "if" statement.
Manual comparison of gram.y with the generated preproc.y file shows
that the warning is now emitted only in rules where it's sensible.
This problem has existed for a long time, so back-patch to all
supported branches.
Discussion: https://postgr.es/m/603615.1712245382@sss.pgh.pa.us
If temp tables have dependencies (such as sequences) then it's
possible for autovacuum's cleanup of orphan temp tables to deadlock
against an incoming backend that's trying to clean out the temp
namespace for its own use. That can happen because RemoveTempRelations'
performDeletion call can visit objects within the namespace in
an order different from the order in which a per-table deletion
will visit them.
To fix, observe that performDeletion will begin by taking an exclusive
lock on the temp namespace (even though it won't actually delete it).
So, if we can get a shared lock on the namespace, we can be sure we're
not running concurrently with RemoveTempRelations, while also not
conflicting with ordinary use of the namespace. This requires
introducing a conditional version of LockDatabaseObject, but that's no
big deal. (It's surprising we've got along without that this long.)
Report and patch by Mikhail Zhilin. Back-patch to all supported
branches.
Discussion: https://postgr.es/m/c43ce028-2bc2-4865-9b89-3f706246eed5@postgrespro.ru
If we are building with openssl but USE_SSL_ENGINE didn't get set,
initialize_SSL's variable "pkey" is declared but used nowhere.
Apparently this combination hasn't been exercised in the buildfarm
before now, because I've not seen this warning before, even though
the code has been like this a long time. Move the declaration
to silence the warning (and remove its useless initialization).
Per buildfarm member sawshark. Back-patch to all supported branches.
The "pltargs" variable wasn't marked volatile, which makes it unsafe
to change its value within the PG_TRY block. It looks like the worst
outcome would be to fail to release a refcount on Py_None during an
(improbable) error exit, which would likely go unnoticed in the field.
Still, it's a bug. A one-liner fix could be to mark pltargs volatile,
but on the whole it seems cleaner to arrange things so that we don't
change its value within PG_TRY.
Per report from Xing Guo. This has been there for quite awhile,
so back-patch to all supported branches.
Discussion: https://postgr.es/m/CACpMh+DLrk=fDv07MNpBT4J413fDAm+gmMXgi8cjPONE+jvzuw@mail.gmail.com
When a plain aggregate is used as a window function, and the window
frame start is specified as UNBOUNDED PRECEDING, the frame's head
cannot move so we do not need to use moving-aggregate mode. The check
for that was put into initialize_peragg(), failing to notice that
ExecInitWindowAgg() calls that function before it's filled in
winstate->frameOptions. Since makeNode() would have zeroed the field,
this didn't provoke uninitialized-value complaints, nor would the
erroneous decision have resulted in more than a little inefficiency.
Still, it's wrong, so move the initialization of
winstate->frameOptions earlier to make it work properly.
While here, also fix a thinko in a comment. Both errors crept in in
commit a9d9acbf2 which introduced the moving-aggregate mode.
Spotted by Vallimaharajan G. Back-patch to all supported branches.
Discussion: https://postgr.es/m/18e7f2a5167.fe36253866818.977923893562469143@zohocorp.com
Ordinary ALTER TABLE SET SCHEMA will also move any owned sequences
into the new schema. We failed to do likewise for foreign tables,
because AlterTableNamespaceInternal believed that only certain
relkinds could have indexes, owned sequences, or constraints.
We could simply add foreign tables to that relkind list, but it
seems likely that the same oversight could be made again in
future. Instead let's remove the relkind filter altogether.
These functions shouldn't cost much when there are no objects
that they need to process, and surely this isn't an especially
performance-critical case anyway.
Per bug #18407 from Vidushi Gupta. Back-patch to all supported
branches.
Discussion: https://postgr.es/m/18407-4fd07373d252c6a0@postgresql.org
The musl dynamic linker saves a pointer to the process' environment
value of LD_LIBRARY_PATH very early in startup. When we move/clobber
the environment to make more room for ps status strings, we clobber
that value and thereby prevent libraries from being found via
LD_LIBRARY_PATH, which breaks the use of a temporary installation
for testing purposes. To fix, stop collecting usable space for
ps status if we notice that the variable we are about to clobber
is LD_LIBRARY_PATH. This will result in some reduction in how long
the ps status can be, but it's only likely to occur in temporary
test contexts, so it doesn't seem like a big problem. In any case,
we don't have to do it if we see we are on glibc, which surely is
where the majority of our Linux testing is done.
Thomas Munro, Bruce Momjian, and Tom Lane, per report from Wolfgang
Walther. Back-patch to all supported branches, with the hope that
we'll set up a buildfarm animal to test on this platform.
Discussion: https://postgr.es/m/fddd1cd6-dc16-40a2-9eb5-d7fef2101488@technowledgy.de
Some last minute polish of the patch managed to break the SQL
query for extracting the role comments due to fat-fingering.
Per the buildfarm Xversion tests.
Commit a3c7a993d fixed some cases involving target columns that are
arrays or composites by applying transformAssignedExpr to the VALUES
entries, and then stripping off any assignment ArrayRefs or
FieldStores that the transformation added. But I forgot about domains
over arrays or composites :-(. Such cases would either fail with
surprising complaints about mismatched datatypes, or insert unexpected
coercions that could lead to odd results. To fix, extend the
stripping logic to get rid of CoerceToDomain if it's atop an ArrayRef
or FieldStore.
While poking at this, I realized that there's a poorly documented and
not-at-all-tested behavior nearby: we coerce each VALUES column to
the domain type separately, and rely on the rewriter to merge those
operations so that the domain constraints are checked only once.
If that merging did not happen, it's entirely possible that we'd get
unexpected domain constraint failures due to checking a
partially-updated container value. There's no bug there, but while
we're here let's improve the commentary about it and add some test
cases that explicitly exercise that behavior.
Per bug #18393 from Pablo Kharo. Back-patch to all supported
branches.
Discussion: https://postgr.es/m/18393-65fedb1a0de9260d@postgresql.org
There is a very ancient hack in check_sql_fn_retval that allows a
single SELECT targetlist entry of composite type to be taken as
supplying all the output columns of a function returning composite.
(This is grotty and fundamentally ambiguous, but it's really hard
to do nested composite-returning functions without it.)
As far as I know, that doesn't cause any problems in ordinary
functions. It's disastrous for procedures however. All procedures
that have any output parameters are labeled with prorettype RECORD,
and the CALL code expects it will get back a record with one column
per output parameter, regardless of whether any of those parameters
is composite. Doing something else leads to an assertion failure
or core dump.
This is simple enough to fix: we just need to not apply that rule
when considering procedures. However, that requires adding another
argument to check_sql_fn_retval, which at least in principle might be
getting called by external callers. Therefore, in the back branches
convert check_sql_fn_retval into an ABI-preserving wrapper around a
new function check_sql_fn_retval_ext.
Per report from Yahor Yuzefovich. This has been broken since we
implemented procedures, so back-patch to all supported branches.
Discussion: https://postgr.es/m/CABz5gWHSjj2df6uG0NRiDhZ_Uz=Y8t0FJP-_SVSsRsnrQT76Gg@mail.gmail.com
Commit 387da18874 moved the code to put socket into non-blocking mode
from socket_set_nonblocking() into the one-time initialization
function, pq_init(). In socket_set_nonblocking(), there indeed was a
risk of recursion on failure like the comment said, but in pq_init(),
ERROR or FATAL is fine. There's even another elog(FATAL) just after
this, if setting FD_CLOEXEC fails.
Note that COMMERROR merely logged the error, it did not close the
connection, so if putting the socket to non-blocking mode failed we
would use the connection anyway. You might not immediately notice,
because most socket operations in a regular backend wait for the
socket to become readable/writable anyway. But e.g. replication will
be quite broken.
Backpatch to all supported versions.
Discussion: https://www.postgresql.org/message-id/d40a5cd0-2722-40c5-8755-12e9e811fa3c@iki.fi
In OpenSSL 3.0.0 and later, ERR_reason_error_string randomly refuses
to provide a string for error codes representing system errno values
(e.g., "No such file or directory"). There is a poorly-documented way
to extract the errno from the SSL error code in this case, so do that
and apply strerror, rather than falling back to reporting the error
code's numeric value as we were previously doing.
Problem reported by David Zhang, although this is not his proposed
patch; it's instead based on a suggestion from Heikki Linnakangas.
Back-patch to all supported branches, since any of them are likely
to be used with recent OpenSSL.
Discussion: https://postgr.es/m/b6fb018b-f05c-4afd-abd3-318c649faf18@highgo.ca
This reverts commit eae7be600b, following a discussion with Tom Lane,
due to concerns that this impacts the decisions made by the planner for
the number of workers spawned based on the inlining and const-folding of
index expressions and predicate for cases that would have worked until
this commit.
Discussion: https://postgr.es/m/162802.1709746091@sss.pgh.pa.us
Backpatch-through: 12
In the corner case where a function returning RECORD has been
simplified to a RECORD constant or an inlined ROW() expression,
ExecInitFunctionScan failed to cross-check the function's result
rowtype against the coldeflist provided by the calling query.
That happened because get_expr_result_type is able to extract a
tupdesc from such expressions, which led ExecInitFunctionScan to
ignore the coldeflist. (Instead, it used the extracted tupdesc
to check the function's output, which of course always succeeds.)
I have not been able to demonstrate any really serious consequences
from this, because if some column of the result is of the wrong
type and is directly referenced by a Var of the calling query,
CheckVarSlotCompatibility will catch it. However, we definitely do
fail to report the case where the function returns more columns than
the coldeflist expects, and in the converse case where it returns
fewer columns, we get an assert failure (but, seemingly, no worse
results in non-assert builds).
To fix, always build the expected tupdesc from the coldeflist if there
is one, and consult get_expr_result_type only when there isn't one.
Also remove the failing Assert, even though it is no longer reached
after this fix. It doesn't seem to be adding anything useful, since
later checking will deal with cases with the wrong number of columns.
The only other place I could find that is doing something similar
is inline_set_returning_function. There's no live bug there because
we cannot be looking at a Const or RowExpr, but for consistency
change that code to agree with ExecInitFunctionScan.
Per report from PetSerAl. After some debate I've concluded that
this should be back-patched. There is a small risk that somebody
has been relying on such a case not throwing an error, but I judge
this outweighed by the risk that I've missed some way in which the
failure to cross-check has worse consequences than sketched above.
Discussion: https://postgr.es/m/CAKygsHSerA1eXsJHR9wft3Gn3wfHQ5RfP8XHBzF70_qcrrRvEg@mail.gmail.com
As coded, the planner logic that calculates the number of parallel
workers to use for a parallel index build uses expressions and
predicates from the relcache, which are flattened for the planner by
eval_const_expressions().
As reported in the bug, an immutable parallel-unsafe function flattened
in the relcache would become a Const, which would be considered as
parallel-safe, even if the predicate or the expressions including the
function are not safe in parallel workers. Depending on the expressions
or predicate used, this could cause the parallel build to fail.
Tests are included that check parallel index builds with parallel-unsafe
predicate and expressions. Two routines are added to lsyscache.h to be
able to retrieve expressions and predicate of an index from its pg_index
data.
Reported-by: Alexander Lakhin
Author: Tender Wang
Reviewed-by: Jian He, Michael Paquier
Discussion: https://postgr.es/m/CAHewXN=UaAaNn9ruHDH3Os8kxLVmtWqbssnf=dZN_s9=evHUFA@mail.gmail.com
Backpatch-through: 12
The error message(s) were reporting the stats kind of 'f', which is not
correct as that's for the "dependencies" statistics kind.
Reported-by: Horst Reiterer
Reviewed-by: Richard Guo
Discussion: https://postgr.es/m/18375-ba99383eb9062d6a@postgresql.org
Backpatch-through: 12, where MCV extended stats were added.
dsa_dump would print a large negative number instead of zero for
segment bin 0. Fix by explicitly checking for underflow and add
special case for bin 0. Backpatch to all supported versions.
Author: Ian Ilyasov <ianilyasov@outlook.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://postgr.es/m/GV1P251MB1004E0D09D117D3CECF9256ECD502@GV1P251MB1004.EURP251.PROD.OUTLOOK.COM
Backpatch-through: v12
When this assertion was installed (in commit d2f60a3ab), I thought
it was only for catching server logic errors that caused accesses to
catalogs that were undergoing index rebuilds. However, it will also
fire in case of a user-defined index expression that attempts to
access its own table. We occasionally see reports of people trying
to do that, and typically getting unintelligible low-level errors
as a result. We can provide a more on-point message by making this
a regular runtime check.
While at it, adjust the similar error check in
systable_beginscan_ordered to use the same message text. That one
is (probably) not reachable without a coding bug, but we might as
well use a translatable message if we have one.
Per bug #18363 from Alexander Lakhin. Back-patch to all supported
branches.
Discussion: https://postgr.es/m/18363-e3598a5a572d0699@postgresql.org
build_child_join_sjinfo creates a derived SpecialJoinInfo in
the short-lived GEQO context, but afterwards the semi_rhs_exprs
from that may be used in a UniquePath for a child base relation.
This breaks the expectation that all base-relation-level structures
are in the planning-lifespan context, leading to use of a dangling
pointer with probable ensuing crash later on in create_unique_plan.
To fix, copy the expression trees when making a UniquePath.
Per bug #18360 from Alexander Lakhin. This has been broken since
partitionwise joins were added, so back-patch to all supported
branches.
Discussion: https://postgr.es/m/18360-a23caf3157f34e62@postgresql.org
Partition pruning wrongly assumed that, for a table partitioned on a
boolean column, a clause in the form "boolcol IS NOT false" and "boolcol
IS NOT true" could be inverted to correspondingly become "boolcol IS true"
and "boolcol IS false". These are not equivalent as the NOT version
matches the opposite boolean value *and* NULLs. This incorrect assumption
meant that partition pruning pruned away partitions that could contain
NULL values.
Here we fix this by correctly not pruning partitions which could store
NULLs.
To be affected by this, the table must be partitioned by a NULLable boolean
column and queries would have to contain "boolcol IS NOT false" or "boolcol
IS NOT true". This could result in queries filtering out NULL values
with a LIST partitioned table and "ERROR: invalid strategy number 0"
for RANGE and HASH partitioned tables.
Reported-by: Alexander Lakhin
Bug: #18344
Discussion: https://postgr.es/m/18344-8d3f00bada6d09c6@postgresql.org
Backpatch-through: 12
intoasc(), a wrapper for PGTYPESinterval_to_asc that converts an
interval to its textual representation, used a plain memcpy() when
copying its result. This could miss a zero-termination in the result
string, leading to an incorrect result.
The routines in informix.c do not provide the length of their result
buffer, which would allow a replacement of strcpy() to safer strlcpy()
calls, but this requires an ABI breakage and that cannot happen in
back-branches.
Author: Oleg Tselebrovskiy
Reviewed-by: Ashutosh Bapat
Discussion: https://postgr.es/m/bf47888585149f83b276861a1662f7e4@postgrespro.ru
Backpatch-through: 12
The macOS Finder application creates .DS_Store files in directories
when opened, which creates problems for serverside utilities which
expect all files to be PostgreSQL specific files. Skip these files
when encountered in pg_checksums, pg_rewind and pg_basebackup.
This was extracted from a larger patchset for skipping hidden files
and system files, where the concencus was to just skip these. Since
this is equally likely to happen in every version, backpatch to all
supported versions.
Reported-by: Mark Guertin <markguertin@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Tobias Bussmann <t.bussmann@gmx.net>
Discussion: https://postgr.es/m/E258CE50-AB0E-455D-8AAD-BB4FE8F882FB@gmail.com
Backpatch-through: v12
Since its introduction, pg_get_expr() has intended to silently
return NULL if called with an invalid relation OID, as can happen
when scanning the catalogs concurrently with relation drops.
However, there is a race condition: we check validity of the OID
at the start, but it could get dropped just afterward, leading to
failures. This is the cause of some intermittent instability we're
seeing in a proposed new test case, and presumably it's a hazard in
the field as well.
We can fix this by AccessShareLock-ing the target relation for the
duration of pg_get_expr(). Since we don't require any permissions
on the target relation, this is semantically a bit undesirable. But
it turns out that the set_relation_column_names() subroutine already
takes a transient AccessShareLock on that relation, and has done since
commit 2ffa740be in 2012. Given the lack of complaints about that, it
seems like there should be no harm in holding the lock a bit longer.
Back-patch to all supported branches.
Discussion: https://postgr.es/m/31ddcc01-a71b-4e8c-9948-01d1c47293ca@eisentraut.org
We previously supposed that it was okay for different threads to
call bindtextdomain() concurrently (cf. commit 1f655fdc3).
It now emerges that there's at least one gettext implementation
in which that triggers an abort() crash, so let's stop doing that.
Add mutexes guarding libpq's and ecpglib's calls, which are the
only ones that need worry about multithreaded callers.
Note: in libpq, we could perhaps have piggybacked on
default_threadlock() to avoid defining a new mutex variable.
I judge that not terribly safe though, since libpq_gettext could
be called from code that is holding the default mutex. If that
were the first such call in the process, it'd fail. An extra
mutex is cheap insurance against unforeseen interactions.
Per bug #18312 from Christian Maurer. Back-patch to all
supported versions.
Discussion: https://postgr.es/m/18312-bbbabc8113592b78@postgresql.org
Discussion: https://postgr.es/m/264860.1707163416@sss.pgh.pa.us
Fix pthread-win32.h and pthread-win32.c to provide a more complete
emulation of POSIX pthread mutexes: define PTHREAD_MUTEX_INITIALIZER
and make sure that pthread_mutex_lock() can operate on a mutex
object that's been initialized that way. Then we don't need the
duplicative platform-specific logic in default_threadlock() and
pgtls_init(), which we'd otherwise need yet a third copy of for
an upcoming bug fix.
Also, since default_threadlock() supposes that pthread_mutex_lock()
cannot fail, try to ensure that that's actually true, by getting
rid of the malloc call that was formerly involved in initializing
an emulated mutex. We can define an extra state for the spinlock
field instead.
Also, replace the similar code in ecpglib/misc.c with this version.
While ecpglib's version at least had a POSIX-compliant API, it
also had the potential of failing during mutex init (but here,
because of CreateMutex failure rather than malloc failure). Since
all of misc.c's callers ignore failures, it seems like a wise idea
to avoid failures here too.
A further improvement in this area could be to unify libpq's and
ecpglib's implementations into a src/port/pthread-win32.c file.
But that doesn't seem like a bug fix, so I'll desist for now.
In preparation for the aforementioned bug fix, back-patch to all
supported branches.
Discussion: https://postgr.es/m/264860.1707163416@sss.pgh.pa.us
The TransactionIdInRecentPast() should return false for all the transactions
older than TransamVariables->oldestClogXid. However, the function contains
a bug in comparison FullTransactionId to TransactionID allowing full
transactions between nextXid - 2^32 and oldestClogXid - 2^31.
This commit fixes TransactionIdInRecentPast() by turning the oldestClogXid into
FullTransactionId first, then performing the comparison.
Backpatch to all supported versions.
Reported-by: Egor Chindyaskin
Bug: 18212
Discussion: https://postgr.es/m/18212-547307f8adf57262%40postgresql.org
Author: Karina Litskevich
Reviewed-by: Kyotaro Horiguchi
Backpatch-through: 12
When assertions are disabled, the built SQL statement is invalid and
you get a "syntax error". So this isn't a serious problem, but let's
avoid the assertion failure.
Backpatch to all supported versions.
Reviewed-by: Noah Misch
The internal commands in REFRESH MATERIALIZED VIEW CONCURRENTLY are
correctly executed in SECURITY_RESTRICTED_OPERATION mode, except for
creating the temporary "diff" table, because you cannot create
temporary tables in SRO mode. But creating the temporary "diff" table
is a pretty complex CTAS command that selects from another temporary
table created earlier in the command. If you can cajole that CTAS
command to execute code defined by the table owner, the table owner
can run code with the privileges of the user running the REFRESH
command.
The proof-of-concept reported to the security team relied on CREATE
RULE to convert the internally-built temp table to a view. That's not
possible since commit b23cd185fd, and I was not able to find a
different way to turn the SELECT on the temp table into code
execution, so as far as I know this is only exploitable in v15 and
below. That's a fiddly assumption though, so apply this patch to
master and all stable versions.
Thanks to Pedro Gallegos for the report.
Security: CVE-2023-5869
Reviewed-by: Noah Misch
DST law changes in Ittoqqortoormiit, Greenland (America/Scoresbysund),
Kazakhstan (Asia/Almaty and Asia/Qostanay) and Palestine; as well as
updates for the Antarctic stations Casey and Vostok.
Historical corrections for Vietnam, Toronto, and Miquelon.
Further fallout from commit 6ee26c6a4b. To keep code in sync and avoid
issues on older releases with different package names, simply use the
unqualified name like many other places in our code.
The path we wish to reparameterize is not a standalone object:
in particular, it implicitly references baserestrictinfo clauses
in the associated RelOptInfo, and if it's a SampleScan path then
there is also the TableSampleClause in the RTE to worry about.
Both of those could contain lateral references to the join partner
relation, which would need to be modified to refer to its child.
Since we aren't doing that, affected queries can give wrong answers,
or odd failures such as "variable not found in subplan target list",
or executor crashes. But we can't just summarily modify those
expressions, because they are shared with other paths for the rel.
We'd break things if we modify them and then end up using some
non-partitioned-join path.
In HEAD, we plan to fix this by postponing reparameterization
until create_plan(), when we know that those other paths are
no longer of interest, and then adjusting those expressions along
with the ones in the path itself. That seems like too big a change
for stable branches however. In the back branches, let's just detect
whether any troublesome lateral references actually exist in those
expressions, and fail reparameterization if so. This will result in
not performing a partitioned join in such cases. Given the lack of
field complaints, nobody's likely to miss the optimization.
Report and patch by Richard Guo. Apply to 12-16 only, since
the intended fix for HEAD looks quite different. We're not quite
ready to push the HEAD fix, but with back-branch releases coming
up soon, it seems wise to get this stopgap fix in place there.
Discussion: https://postgr.es/m/CAMbWs496+N=UAjOc=rcD3P7B6oJe4rZw08e_TZRUsWbPxZW3Tw@mail.gmail.com
This commit addresses a set of issues when changing token type mappings
in a text search configuration when using duplicated token names:
- ADD MAPPING would fail on insertion because of a constraint failure
after inserting the same mapping.
- ALTER MAPPING with an "overridden" configuration failed with "tuple
already updated by self" when the token mappings are removed.
- DROP MAPPING failed with "tuple already updated by self", like
previously, but in a different code path.
The code is refactored so the token names (with their numbers) are
handled as a List with unique members rather than an array with numbers,
ensuring that no duplicates mess up with the catalog inserts, updates
and deletes. The list is generated by getTokenTypes(), with the same
error handling as previously while duplicated tokens are discarded from
the list used to work on the catalogs.
Regression tests are expanded to cover much more ground for the cases
fixed by this commit, as there was no coverage for the code touched in
this commit. A bit more is done regarding the fact that a token name
not supported by a configuration's parser should result in an error even
if IF EXISTS is used in a DROP MAPPING clause. This is implied in the
code but there was no coverage for that, and it was very easy to miss.
These issues exist since at least their introduction in core with
140d4ebcb4, so backpatch all the way down.
Reported-by: Alexander Lakhin
Author: Tender Wang, Michael Paquier
Discussion: https://postgr.es/m/18310-1eb233c5908189c8@postgresql.org
Backpatch-through: 12
File::Find converts backslashes to slashes in the newer Perl versions.
See: 414f14df98
So, do the same conversion for Windows before comparing paths. To
support all Perl versions, always convert them on Windows regardless of
the Perl's version.
Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
Backpatch to all live branches
We seem to have only documented a foreign key can reference the columns of
a primary key or unique constraint. Here we adjust the documentation
to mention columns in a non-partial unique index can be mentioned too.
The header comment for transformFkeyCheckAttrs() also didn't mention
unique indexes, so fix that too. In passing make that header comment
reflect reality in the various other aspects where it deviated from it.
Bug: 18295
Reported-by: Gilles PARC
Author: Laurenz Albe, David Rowley
Discussion: https://www.postgresql.org/message-id/18295-0ed0fac5c9f7b17b%40postgresql.org
Backpatch-through: 12
libxml2 changed the required signature of error handler callbacks
to make the passed xmlError struct "const". This is causing build
failures on buildfarm member caiman, and no doubt will start showing
up in the field quite soon. Add a version check to adjust the
declaration of xml_errorHandler() according to LIBXML_VERSION.
2.12.x also produces deprecation warnings for contrib/xml2/xpath.c's
assignment to xmlLoadExtDtdDefaultValue. I see no good reason for
that to still be there, seeing that we disabled external DTDs (at a
lower level) years ago for security reasons. Let's just remove it.
Back-patch to all supported branches, since they might all get built
with newer libxml2 once it gets a bit more popular. (The back
branches produce another deprecation warning about xpath.c's use of
xmlSubstituteEntitiesDefault(). We ought to consider whether to
back-patch all or part of commit 65c5864d7 to silence that. It's
less urgent though, since it won't break the buildfarm.)
Discussion: https://postgr.es/m/1389505.1706382262@sss.pgh.pa.us
ginFinishSplit() expects the caller to hold an exclusive lock on the
buffer, but when finishing an earlier "leftover" incomplete split of
an internal page, the caller held a shared lock. That caused an
assertion failure in MarkBufferDirty(). Without assertions, it could
lead to corruption if two backends tried to complete the split at the
same time.
On master, add a test case using the new injection point facility.
Report and analysis by Fei Changhong. Backpatch the fix to all
supported versions.
Reviewed-by: Fei Changhong, Michael Paquier
Discussion: https://www.postgresql.org/message-id/tencent_A3CE810F59132D8E230475A5F0F7A08C8307@qq.com
We perform addition of the days field of an interval via
arithmetic on the Julian-date representation of the timestamp's date.
This step is subject to int32 overflow, and we also should not let
the Julian date become very negative, for fear of weird results from
j2date. (In the timestamptz case, allow a Julian date of -1 to pass,
since it might convert back to zero after timezone rotation.)
The additions of the months and microseconds fields could also
overflow, of course. However, I believe we need no additional
checks there; the existing range checks should catch such cases.
The difficulty here is that j2date's magic modular arithmetic could
produce something that looks like it's in-range.
Per bug #18313 from Christian Maurer. This has been wrong for
a long time, so back-patch to all supported branches.
Discussion: https://postgr.es/m/18313-64d2c8952d81e84b@postgresql.org
This command, when used to add a column on a parent table with a complex
inheritance tree, tried to update multiple times the same tuple in
pg_attribute for a child table when incrementing attinhcount, causing
failures with "tuple already updated by self" because of a missing
CommandCounterIncrement() between two updates.
This exists for a rather long time, so backpatch all the way down.
Reported-by: Alexander Lakhin
Author: Tender Wang
Reviewed-by: Richard Guo
Discussion: https://postgr.es/m/18297-b04cd83a55b51e35@postgresql.org
Backpatch-through: 12
Until now LWLockDequeueSelf() sequentially searched the list of waiters to see
if the current proc is still is on the list of waiters, or has already been
removed. In extreme workloads, where the wait lists are very long, this leads
to a quadratic behavior. #backends iterating over a list #backends
long. Additionally, the likelihood of needing to call LWLockDequeueSelf() in
the first place also increases with the increased length of the wait queue, as
it becomes more likely that a lock is released while waiting for the wait list
lock, which is held for longer during lock release.
Due to the exponential back-off in perform_spin_delay() this is surprisingly
hard to detect. We should make that easier, e.g. by adding a wait event around
the pg_usleep() - but that's a separate patch.
The fix is simple - track whether a proc is currently waiting in the wait list
or already removed but waiting to be woken up in PGPROC->lwWaiting.
In some workloads with a lot of clients contending for a small number of
lwlocks (e.g. WALWriteLock), the fix can substantially increase throughput.
This has been originally fixed for 16~ with a4adc31f69 without a
backpatch, and we have heard complaints from users impacted by this
quadratic behavior in older versions as well.
Author: Andres Freund <andres@anarazel.de>
Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Discussion: https://postgr.es/m/20221027165914.2hofzp4cvutj6gin@awork3.anarazel.de
Discussion: https://postgr.es/m/CALj2ACXktNbG=K8Xi7PSqbofTZozavhaxjatVc14iYaLu4Maag@mail.gmail.com
Backpatch-through: 12
If configuring the newly created socket non-blocking fails we
error out and return INVALID_SOCKET, but the socket that had
been created wasn't closed. Fix by issuing closesocket in the
errorpath.
Backpatch to all supported branches.
Author: Ranier Vilela <ranier.vf@gmail.com>
Discussion: https://postgr.es/m/CAEudQApmU5CrKefH85VbNYE2y8H=-qqEJbg6RAPU65+vCe+89A@mail.gmail.com
Backpatch-through: v12
We've long had a policy that any toasted fields in a catalog tuple
should be pulled in-line before entering the tuple in a catalog cache.
However, that requires access to the catalog's toast table, and we'll
typically do AcceptInvalidationMessages while opening the toast table.
So it's possible that the catalog tuple is outdated by the time we
finish detoasting it. Since no cache entry exists yet, we can't
mark the entry stale during AcceptInvalidationMessages, and instead
we'll press forward and build an apparently-valid cache entry. The
upshot is that we have a race condition whereby an out-of-date entry
could be made in a backend's catalog cache, and persist there
indefinitely causing indeterminate misbehavior.
To fix, use the existing systable_recheck_tuple code to recheck
whether the catalog tuple is still up-to-date after we finish
detoasting it. If not, loop around and restart the process of
searching the catalog and constructing cache entries from the top.
The case is rare enough that this shouldn't create any meaningful
performance penalty, even in the SearchCatCacheList case where
we need to tear down and reconstruct the whole list.
Indeed, the case is so rare that AFAICT it doesn't occur during
our regression tests, and there doesn't seem to be any easy way
to build a test that would exercise it reliably. To allow
testing of the retry code paths, add logic (in USE_ASSERT_CHECKING
builds only) that randomly pretends that the recheck failed about
one time out of a thousand. This is enough to ensure that we'll
pass through the retry paths during most regression test runs.
By adding an extra level of looping, this commit creates a need
to reindent most of SearchCatCacheMiss and SearchCatCacheList.
I'll do that separately, to allow putting those changes in
.git-blame-ignore-revs.
Patch by me; thanks to Alexander Lakhin for having built a test
case to prove the bug is real, and to Xiaoran Wang for review.
Back-patch to all supported branches.
Discussion: https://postgr.es/m/1393953.1698353013@sss.pgh.pa.us
Discussion: https://postgr.es/m/CAGjhLkOoBEC9mLsnB42d3CO1vcMx71MLSEuigeABbQ8oRdA6gw@mail.gmail.com
This is similar to 9886744a36, to prevent the execution of other
programs due to autorun configurations which could influence the
postmaster startup.
This was originally applied on HEAD as of 83c75ac7fb69 without a
backpatch, but the patch has survived CI and buildfarm cycles. I have
checked that cmd /d exists down to Windows XP, which should make this
change work correctly in the oldest branches still supported.
Discussion: https://postgr.es/m/20230922.161551.320043332510268554.horikyota.ntt@gmail.com
Backpatch-through: 12
On Windows, cmd.exe is used to launch the postmaster process to ease its
redirection setup. However, cmd.exe may execute other programs at
startup due to autorun configurations, which could influence the
postmaster startup. This patch adds /D flag to the launcher cmd.exe
command line to disable autorun settings written in the registry.
This was originally applied on HEAD as of 9886744a36 without a
backpatch, but the patch has survived CI and buildfarm cycles. I have
checked that cmd /d exists down to Windows XP, which should make this
change work correctly in the oldest branches still supported.
Reported-by: Hayato Kuroda
Author: Kyotaro Horiguchi
Reviewed-by: Robert Haas, Michael Paquier
Discussion: https://postgr.es/m/20230922.161551.320043332510268554.horikyota.ntt@gmail.com
Backpatch-through: 12
The code for wrapping subquery output expressions in PlaceHolderVars
believed that if the expression already was a PlaceHolderVar, it was
never necessary to wrap that in another one. That's wrong if the
expression is underneath an outer join and involves a lateral
reference to outside that scope: failing to add an additional PHV
risks evaluating the expression at the wrong place and hence not
forcing it to null when the outer join should do so. This is an
oversight in commit 9e7e29c75, which added logic to forcibly wrap
lateral-reference Vars in PlaceHolderVars, but didn't see that the
adjacent case for PlaceHolderVars needed the same treatment.
The test case we have for this doesn't fail before 4be058fe9, but now
that I see the problem I wonder if it is possible to demonstrate
related errors before that. That's moot though, since all such
branches are out of support.
Per bug #18284 from Holger Reise. Back-patch to all supported
branches.
Discussion: https://postgr.es/m/18284-47505a20c23647f8@postgresql.org
During the calculations of the maximum for the number of buckets, take into
account that later we round that to the next power of 2.
Reported-by: Karen Talarico
Bug: #16925
Discussion: https://postgr.es/m/16925-ec96d83529d0d629%40postgresql.org
Author: Thomas Munro, Andrei Lepikhov, Alexander Korotkov
Reviewed-by: Alena Rybakina
Backpatch-through: 12
CheckPWChallengeAuth() would return STATUS_ERROR if the user does not
exist or has no password assigned, even if the client disconnected
without responding to the password challenge (as libpq often will,
for example). We should return STATUS_EOF in that case, and the
lower-level functions do, but this code level got it wrong since the
refactoring done in 7ac955b34. This breaks the intent of not logging
anything for EOF cases (cf. comments in auth_failed()) and might
also confuse users of ClientAuthentication_hook.
Per report from Liu Lang. Back-patch to all supported versions.
Discussion: https://postgr.es/m/b725238c-539d-cb09-2bff-b5e6cb2c069c@esgyn.cn
If the underlying table isn't being dumped, it's useless to dump
an extended statistics object; it'll just cause errors at restore.
We have always applied similar policies to, say, indexes.
(When and if we get cross-table stats objects, it might be profitable
to think a little harder about what to do with them. But for now
there seems no point in considering a stats object as anything but
an appendage of its table.)
Rian McGuire and Tom Lane, per report from Rian McGuire.
Back-patch to supported branches.
Discussion: https://postgr.es/m/7075d3aa-3f05-44a5-b68f-47dc6a8a0550@buildkite.com
set_config_option() bails out early if it detects that the option to
be set is PGC_BACKEND or PGC_SU_BACKEND class and we're reading the
config file in a postmaster child; we don't want to apply any new
value in such a case. That's fine as far as it goes, but it fails
to consider the requirements of the pg_file_settings view: for that,
we need to check validity of the value even though we have no
intention to apply it. Because we didn't, even very silly values
for affected GUCs would be reported as valid by the view. There
are only half a dozen such GUCs, which perhaps explains why this
got overlooked for so long.
Fix by continuing when changeVal is false; this parallels the logic
in some other early-exit paths.
Also, the check added by commit 924bcf4f1 to prevent GUC changes in
parallel workers seems a few bricks shy of a load: it's evidently
assuming that ereport(elevel, ...) won't return. Make sure we
bail out if it does. The lack of trouble reports suggests that
this is only a latent bug, i.e. parallel workers don't actually
reach here with elevel < ERROR. (Per the code coverage report,
we never reach here at all in the regression suite.) But we clearly
don't want to risk proceeding if that does happen.
Per report from Rıdvan Korkmaz. These are ancient bugs, so back-patch
to all supported branches.
Discussion: https://postgr.es/m/2089235.1703617353@sss.pgh.pa.us
Like commit 388e80132, use "#pragma GCC system_header" to silence
warnings appearing within the Python headers, since newer Python
versions no longer worry about some restrictions we still use like
-Wdeclaration-after-statement.
This patch improves on 388e80132 by inventing a separate wrapper
header file, allowing the pragma to be tightly scoped to just
the Python headers and not other stuff we have laying about in
plpython.h. I applied the same technique to plperl for the same
reason: the original patch suppressed warnings for a good deal
of our own code, not only the Perl headers.
Like the previous commit, back-patch to supported branches.
Peter Eisentraut and Tom Lane
Discussion: https://postgr.es/m/ae523163-6d2a-4b81-a875-832e48dec502@eisentraut.org
This is necessary when spgcanreturn() is invoked on a partitioned
index, and the failure might be reachable in other scenarios as
well. The rest of what spgGetCache() does is perfectly sensible
for a partitioned index, so we should allow it to go through.
I think the main takeaway from this is that we lack sufficient test
coverage for non-btree partitioned indexes. Therefore, I added
simple test cases for brin and gin as well as spgist (hash and
gist AMs were covered already in indexing.sql).
Per bug #18256 from Alexander Lakhin. Although the known test case
only fails since v16 (3c569049b), I've got no faith at all that there
aren't other ways to reach this problem; so back-patch to all
supported branches.
Discussion: https://postgr.es/m/18256-0b0e1b6e4a620f1b@postgresql.org
In v16 and up (since commit afbfc0298), large object ownership
checking has been broken because object_ownercheck() didn't take care
of the discrepancy between our object-address representation of large
objects (classId == LargeObjectRelationId) and the catalog where their
ownership info is actually stored (LargeObjectMetadataRelationId).
This resulted in failures such as "unrecognized class ID: 2613"
when trying to update blob properties as a non-superuser.
Poking around for related bugs, I found that AlterObjectOwner_internal
would pass the wrong classId to the PostAlterHook in the no-op code
path where the large object already has the desired owner. Also,
recordExtObjInitPriv checked for the wrong classId; that bug is only
latent because the stanza is dead code anyway, but as long as we're
carrying it around it should be less wrong. These bugs are quite old.
In HEAD, we can reduce the scope for future bugs of this ilk by
changing AlterObjectOwner_internal's API to let the translation happen
inside that function, rather than requiring callers to know about it.
A more bulletproof fix, perhaps, would be to start using
LargeObjectMetadataRelationId as the dependency and object-address
classId for blobs. However that has substantial risk of breaking
third-party code; even within our own code, it'd create hassles
for pg_dump which would have to cope with a version-dependent
representation. For now, keep the status quo.
Discussion: https://postgr.es/m/2650449.1702497209@sss.pgh.pa.us
Dead tuples are ignored and are not marked as dead during recovery, as
it can lead to MVCC issues on a standby because its xmin may not match
with the primary. This information is tracked by a field called
"xactStartedInRecovery" in the transaction state data, switched on when
starting a transaction in recovery.
Unfortunately, this information was not correctly tracked when starting
a subtransaction, because the transaction state used for the
subtransaction did not update "xactStartedInRecovery" based on the state
of its parent. This would cause index scans done in subtransactions to
return inconsistent data, depending on how the xmin of the primary
and/or the standby evolved.
This is broken since the introduction of hot standby in efc16ea520, so
backpatch all the way down.
Author: Fei Changhong
Reviewed-by: Kyotaro Horiguchi
Discussion: https://postgr.es/m/tencent_C4D907A5093C071A029712E73B43C6512706@qq.com
Backpatch-through: 12