As transam's README documents, the general order of actions recommended
when WAL-logging a buffer is to unlock and unpin buffers after leaving a
critical section. This pattern was not being followed by some code
paths of GIN and GiST, adjusted in this commit, where buffers were
either unlocked or unpinned inside a critical section. Based on my
analysis of each code path updated here, there is no reason not to
follow the recommended pattern of unlocking and unpinning outside
the critical section.
These inconsistencies are rather old, coming mainly from ecaa4708e5
and ff301d6e69. The guidelines in the README predate these commits,
being introduced in 6d61cdec07.
Author: Kirill Reshke <reshkekirill@gmail.com>
Discussion: https://postgr.es/m/CALdSSPgBPnpNNzxv0Y+_GNFzW6PmzRZYh+_hpf06Y1N2zLhZaQ@mail.gmail.com
Up to now, to create such a function, one had to make a pg_proc.dat
entry and then overwrite it with a CREATE OR REPLACE command in
system_functions.sql. That's error-prone (cf. bug #19409) and
results in leaving dead rows in the initial contents of pg_proc.
Manual maintenance of pg_node_tree strings seems entirely impractical,
and parsing expressions during bootstrap would be extremely difficult
as well. But Andres Freund observed that all the current use-cases
are simple constants, and building a Const node is well within the
capabilities of bootstrap mode. So this patch invents a special case:
if bootstrap mode is asked to ingest a non-null value for
pg_proc.proargdefaults (which would otherwise fail in
pg_node_tree_in), it parses the value as an array literal and then
feeds the element strings to the input functions for the corresponding
parameter types. Then we can build a suitable pg_node_tree string
with just a few more lines of code.
This allows removing all the system_functions.sql entries that are
just there to set up default arguments, replacing them with
proargdefaults fields in pg_proc.dat entries. The old technique
remains available in case someone needs a non-constant default.
The initial contents of pg_proc are demonstrably the same after
this patch, except that (1) json_strip_nulls and jsonb_strip_nulls
now have the correct provolatile setting, as per bug #19409;
(2) pg_terminate_backend, make_interval, and drandom_normal
now have defaults that don't include a type coercion, which is
how they should have been all along.
In passing, remove some unused entries from bootstrap.c's TypInfo[]
array. I had to add some new ones because we'll now need an entry for
each default-possessing system function parameter, but we shouldn't
carry more than we need there; it's just a maintenance gotcha.
Bug: #19409
Reported-by: Lucio Chiessi <lucio.chiessi@trustly.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Author: Andrew Dunstan <andrew@dunslane.net>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/183292bb-4891-4c96-a3ca-e78b5e0e1358@dunslane.net
Discussion: https://postgr.es/m/19409-e16cd2605e59a4af@postgresql.org
The previous default bgworker_die() signal would exit with elog(FATAL)
directly from the signal handler. That could cause deadlocks or
crashes if the signal handler runs while we're e.g. holding a spinlock
or in the middle of a memory allocation.
All the built-in background workers overrode that to use the normal
die() handler and CHECK_FOR_INTERRUPTS(). Let's make that the default
for all background workers. Some extensions relying on the old
behavior might need to adapt, but the new default is much safer and is
the right thing to do for most background workers.
Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Kirill Reshke <reshkekirill@gmail.com>
Discussion: https://www.postgresql.org/message-id/5238fe45-e486-4c62-a7f3-c7d8d416e812@iki.fi
Previously, if --stamp_file was specified, libpq_check.pl would create a
new stamp file only if none could be found. If there was already a
stamp file, the script would do nothing, leaving the previous stamp file
in place. This logic could cause unnecessary rebuilds because meson
relies on the timestamp of the output files to determine if a rebuild
should happen. In this case, a stamp file generated during an older
check would be kept, but we need a stamp file from the latest run of
the libpq check so that correct rebuild decisions can be made.
This commit changes libpq_check.pl so that a fresh stamp file is
created each time it is run when --stamp_file is specified.
Oversight in commit 4a8e6f43a6.
Reported-by: Andres Freund <andres@anarazel.de>
Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
Reviewed-by: VASUKI M <vasukim1992002@gmail.com>
Discussion: https://postgr.es/m/CAN55FZ22rrN6gCn7urtmTR=_5z7ArZLUJu-TsMChdXwmRTaquA@mail.gmail.com
The main purpose of this change is to allow an ABI checker to understand
when the list of SysCacheIdentifier changes, by switching all the
routine declarations that relied on a signed integer for a syscache ID
to this new type. This is going to be useful in the long term for
versions newer than v19, as we will be able to check when the list of
values in SysCacheIdentifier is updated in a non-ABI-compliant fashion.
Most of the changes of this commit are due to the new definition of
SyscacheCallbackFunction, where a SysCacheIdentifier is now required for
the syscache ID. It is a mechanical change, still slightly invasive.
There are more areas in the tree that could be improved with an ABI
checker in mind; this takes care of only one area.
Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Author: Andreas Karlsson <andreas@proxel.se>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/289125.1770913057@sss.pgh.pa.us
This commit tweaks the generation of the syscache IDs for the enum
SysCacheIdentifier to include an invalid value, assigned -1. The
concept of an invalid syscache ID exists when handling lookups of an
ObjectAddress, based on their set of properties in ObjectPropertyType.
-1 is used for the case where an object type has no option for a
syscache lookup.
This has been found as independently useful while discussing a switch of
SysCacheIdentifier to a typedef, as we already have places that want to
know about the concept of an invalid value when dealing with
ObjectAddresses.
Reviewed-by: Andreas Karlsson <andreas@proxel.se>
Discussion: https://postgr.es/m/aZQRnmp9nVjtxAHS@paquier.xyz
get_catalog_object_by_oid_extended() has been doing a syscache lookup
when given a cache ID strictly higher than 0, which is wrong because the
first valid value of SysCacheIdentifier is 0.
This issue had no consequences, as the first value assigned in the
enum SysCacheIdentifier is AGGFNOID, which is not used in the object
type properties listed in objectaddress.c. Even if an ID of 0 were
hypothetically given, the code would still work, with a less efficient
heap-or-index scan.
Discussion: https://postgr.es/m/aZTr_R6JGmqokUBb@paquier.xyz
... instead of passing a bunch of separate booleans.
Also, rearrange the argument list in a hopefully more sensible order.
Discussion: https://postgr.es/m/202602111846.xpvuccb3inbx@alvherre.pgsql
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Fabrízio de Royes Mello <fabriziomello@gmail.com> (older version)
Commit 38e0190ced forgot to pfree() an allocation (freed in other
places of the same function) in one of several spots in
check_log_min_messages(). Per Coverity. Add the missing pfree().
While at it, avoid open-coding guc_strdup(). The new coding does a
strlen() that wasn't there before, but I doubt it's measurable.
Previously, SIGINT was treated the same as SIGTERM in walwriter and
walsummarizer. That decision goes back to when the walwriter process
was introduced (commit ad4295728e), and was later copied to
walsummarizer. It was a pretty arbitrary decision back then, and we
haven't adopted that convention in all the other processes that have
been introduced later.
Summary of how other processes respond to SIGINT:
- Autovacuum launcher: Cancel the current iteration of launching
- bgworker: Ignore (unless connected to a database)
- checkpointer: Request shutdown checkpoint
- bgwriter: Ignore
- pgarch: Ignore
- startup process: Ignore
- walreceiver: Ignore
- IO worker: die()
IO workers are a notable exception in that they exit on SIGINT, and
there's a documented reason for that: IO workers ignore SIGTERM, so
SIGINT provides a way to manually kill them. (They do respond to
SIGUSR2, though, like all the other processes that we don't want to
exit immediately on SIGTERM on operating system shutdown.)
To make this a little more consistent, ignore SIGINT in walwriter and
walsummarizer. They have no "query" to cancel, and they react to
SIGTERM just fine.
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/818bafaf-1e77-4c78-8037-d7120878d87c@iki.fi
Most of the StaticAssert macros already worked in C++ with Clang and
GCC (the only compilers we're currently testing C++ extension support
for). This adds a regression test for them in our test C++ extension,
so we can safely change their implementation without accidentally
breaking C++.
The only StaticAssert macros that don't work yet are
StaticAssertVariableIsOfType and StaticAssertVariableIsOfTypeMacro.
These will be added in a follow-on commit.
Author: Jelte Fennema-Nio <postgres@jeltef.nl>
Discussion: https://www.postgresql.org/message-id/flat/CAGECzQR21OnnKiZO_1rLWO0-16kg1JBxnVq-wymYW0-_1cUNtg@mail.gmail.com
All of these macros already work in C++ with Clang and GCC (the only
compilers we're currently testing C++ extension support for). This
adds a regression test for them in our test C++ extension, so we can
safely change their implementation without accidentally breaking C++.
Some of the List macros didn't work in C++ in the past (see commit
d5ca15ee5), and this would have caught that.
Author: Jelte Fennema-Nio <postgres@jeltef.nl>
Discussion: https://www.postgresql.org/message-id/flat/CAGECzQR21OnnKiZO_1rLWO0-16kg1JBxnVq-wymYW0-_1cUNtg@mail.gmail.com
Commit c67bef3f32 introduced this test helper function for use by
src/test/regress/sql/encoding.sql, but its logic was incorrect. It
confused an encoding ID with a boolean, so it gave the wrong results for
some inputs, and also forgot the usual return macro. The mistake didn't
affect values actually used in the test, so there is no change in
behavior.
Also drop it and another missed function at the end of the test, for
consistency.
Backpatch-through: 14
Author: Zsolt Parragi <zsolt.parragi@percona.com>
Various buildfarm members, having compilers like gcc 8.5 and 6.3, fail
to deduce that text_substring() variable "E" is initialized if
slice_size!=-1. This suppression approach quiets gcc 8.5; I did not
reproduce the warning elsewhere. Back-patch to v14, like commit
9f4fd119b2.
Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/1157953.1771266105@sss.pgh.pa.us
Backpatch-through: 14
The receive function of hstore was not able to correctly handle
duplicate keys when a new duplicate links to a NULL value: a pfree()
could be attempted on a NULL pointer, crashing on a NULL pointer
dereference.
This problem would happen for a COPY BINARY when stacking values like
this:
aa => 5
aa => null
The second key/value pair is discarded and pfree() calls are attempted
on its key and its value, leading to a NULL pointer dereference for
the value part as the value is NULL. The first key/value pair takes
priority when a duplicate is found.
Per offline report.
Reported-by: "Anemone" <vergissmeinnichtzh@gmail.com>
Reported-by: "A1ex" <alex000young@gmail.com>
Backpatch-through: 14
Before v12, pg_largeobject_metadata was defined WITH OIDS, so
unlike newer versions, the "oid" column was a hidden system column
that pg_dump's getTableAttrs() will not pick up. Thus, for commit
161a3e8b68, we did not bother trying to use COPY for
pg_largeobject_metadata for upgrades from older versions. This
commit removes that restriction by adjusting the query in
getTableAttrs() to pick up the "oid" system column and by teaching
dumpTableData_copy() to use COPY (SELECT ...) for this catalog,
since system columns cannot be used in COPY's column list.
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/aYzuAz_ITUpd9ZvH%40nathan
syscache_info.h was installed into $installdir/include/server/catalog
if you use a non-VPATH autoconf build, but not if you use a VPATH
build or meson. That happened because the makefiles blindly install
src/include/catalog/*.h, and in a non-VPATH build the generated
header files would be swept up in that. While it's hard to conjure
a reason to need syscache_info.h outside of a backend build, it's
also hard to get the makefiles to skip syscache_info.h, so let's
go the other way and install it in the other two cases too.
Another problem, new in v19, was that meson builds install a copy of
src/include/catalog/README, while autoconf builds do not. The issue
here is that that file is new and wasn't added to meson.build's
exclusion list.
While it's clearly a bug if different build methods don't install
the same set of files, I doubt anyone would thank us for changing
the behavior in released branches. Hence, fix in master only.
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/946828.1771185367@sss.pgh.pa.us
The X25519 curve is not allowed when OpenSSL is configured for
FIPS mode, so add a note to the documentation that the default
setting must be altered for such setups.
Author: Daniel Gustafsson <daniel@yesql.se>
Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/3521653.1770666093@sss.pgh.pa.us
The X25519 curve is disallowed when OpenSSL is configured for
FIPS mode which makes the testsuite fail. Since X25519 isn't
required for the tests we can remove it to allow FIPS enabled
configurations to run the tests.
Author: Daniel Gustafsson <daniel@yesql.se>
Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/3521653.1770666093@sss.pgh.pa.us
This completes the work started by commit 75f49221c2.
In basebackup.c, changing the StaticAssertStmt to StaticAssertDecl
results in having the same StaticAssertDecl() in 2 functions. So, it
makes more sense to move it to file scope instead.
Also, as it depends on some computations based on 2 tar blocks, define
TAR_NUM_TERMINATION_BLOCKS.
In deadlock.c, change the StaticAssertStmt to StaticAssertDecl and
keep it in the function scope. Add new braces to avoid warning from
-Wdeclaration-after-statement.
In aset.c, change the StaticAssertStmt to StaticAssertDecl and move it
to file scope.
Finally, update the comments in c.h a bit.
Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Co-authored-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://www.postgresql.org/message-id/aYH6ii46AvGVCB84%40ip-10-97-1-34.eu-west-3.compute.internal
When both standby.signal and recovery.signal are present, standby.signal
takes precedence and the server runs in standby mode. Previously,
in this case, recovery.signal was not removed at the end of standby mode
(i.e., on promotion) or at the end of archive recovery, while standby.signal
was removed. As a result, a leftover recovery.signal could cause
a subsequent restart to enter archive recovery unexpectedly, potentially
preventing the server from starting. This behavior was surprising and
confusing to users.
This commit fixes the issue by updating the recovery code to remove
recovery.signal alongside standby.signal when both files are present and
recovery completes.
Because this code path is particularly sensitive and changes in recovery
behavior can be risky for stable branches, this change is applied only to
the master branch.
Reported-by: Nikolay Samokhvalov <nik@postgres.ai>
Author: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: David Steele <david@pgbackrest.org>
Discussion: https://postgr.es/m/CAM527d8PVAQFLt_ndTXE19F-XpDZui861882L0rLY3YihQB8qA@mail.gmail.com
The error message added in 379695d3cc referred to the public key being
too long. This is confusing as it is in fact the session key included
in a PGP message which is too long. This is harmless, but let's be
precise about what is wrong.
Per offline report.
Reported-by: Zsolt Parragi <zsolt.parragi@percona.com>
Backpatch-through: 14
Commit 1e7fe06c10 changed
pg_mbstrlen_with_len() to ereport(ERROR) if the input ends in an
incomplete character. Most callers want that. text_substring() does
not. It detoasts the most bytes it could possibly need to get the
requested number of characters. For example, to extract up to 2 chars
from UTF8, it needs to detoast 8 bytes. In a string of 3-byte UTF8
chars, 8 bytes spans 2 complete chars and 1 partial char.
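The capped traversal that replaces the full-length check can be
sketched like this (a hypothetical helper, not the actual
text_substring() code):

```c
/* Count at most max_chars multibyte characters in buf[0..len), using a
 * caller-supplied lead-byte length function, and stop as soon as enough
 * characters have been seen.  A trailing incomplete character beyond
 * the cap is never examined.  Illustrative only. */
static int
count_chars_capped(const unsigned char *buf, int len, int max_chars,
                   int (*seq_len) (unsigned char))
{
    int     nchars = 0;
    int     i = 0;

    while (i < len && nchars < max_chars)
    {
        i += seq_len(buf[i]);
        nchars++;
    }
    return nchars;
}

/* Simplified UTF-8 sequence length derived from the lead byte. */
static int
utf8_seq_len(unsigned char b)
{
    if (b < 0x80)
        return 1;
    if (b < 0xE0)
        return 2;
    if (b < 0xF0)
        return 3;
    return 4;
}
```

For instance, with two complete 3-byte UTF8 characters followed by a
truncated one, asking for 2 characters never touches the partial bytes.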
Fix this by replacing the pg_mbstrlen_with_len() call with a string
traversal that stops upon finding as many chars as the substring could
need. This also makes SUBSTRING() stop raising an encoding error when
the incomplete char is past the end of the substring.
This is consistent with the general philosophy of the above commit,
which was to raise errors on a just-in-time basis. Before the above
commit, SUBSTRING() never raised an encoding error.
SUBSTRING() has long been detoasting enough for one more char than
needed, because it did not distinguish exclusive and inclusive end
position. For avoidance of doubt, stop detoasting extra.
Back-patch to v14, like the above commit. For applications using
SUBSTRING() on non-ASCII column values, consider applying this to your
copy of any of the February 12, 2026 releases.
Reported-by: SATŌ Kentarō <ranvis@gmail.com>
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Bug: #19406
Discussion: https://postgr.es/m/19406-9867fddddd724fca@postgresql.org
Backpatch-through: 14
The prior order caused spurious Valgrind errors. They're spurious
because the ereport(ERROR) non-local exit discards the pointer in
question. pg_mblen_cstr() ordered the checks correctly, but these other
two did not. Back-patch to v14, like commit
1e7fe06c10.
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/20260214053821.fa.noahmisch@microsoft.com
Backpatch-through: 14
Radix sort can be much faster than quicksort, but for our purposes it
is limited to sequences of unsigned bytes. To make tuples with other
types amenable to this technique, several features of tuple comparison
must be accounted for, i.e. the sort key must be "normalized":
1. Signedness -- It's possible to modify a signed integer such that
it can be compared as unsigned. For example, a signed char has range
-128 to 127. If we cast that to unsigned char and add 128, the range
of values becomes 0 to 255 while preserving order.
2. Direction -- SQL allows specification of ASC or DESC. The
descending case is easily handled by taking the complement of the
unsigned representation.
3. NULL values -- NULLS FIRST and NULLS LAST must work correctly.
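Transforms (1) and (2) can be sketched in a few lines of C
(illustrative, not the patch's actual code):

```c
#include <stdint.h>

/* Normalize a signed 32-bit key so that comparing the results as
 * unsigned values matches the desired sort order.  Flipping the sign
 * bit maps INT32_MIN..INT32_MAX onto 0..UINT32_MAX while preserving
 * order; complementing reverses the order for DESC. */
static uint32_t
normalize_key(int32_t key, int descending)
{
    uint32_t    u = (uint32_t) key ^ UINT32_C(0x80000000);

    return descending ? ~u : u;
}
```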
This commit only handles the case where datum1 is a pass-by-value
Datum (possibly abbreviated) that compares like an ordinary
integer. (Abbreviations of values of type "numeric" are a convenient
counterexample.) First, tuples are partitioned by nullness in the
correct NULL ordering. Then the NOT NULL tuples are sorted with radix
sort on datum1. For tiebreaks on subsequent sortkeys (including the
first sort key if abbreviated), we divert to the usual qsort.
ORDER BY queries on pre-warmed buffers are up to 2x faster on high
cardinality inputs with radix sort than the sort specializations added
by commit 697492434, so get rid of them. It's sufficient to fall back
to qsort_tuple() for small arrays. Moderately low cardinality inputs
show more modest improvements. Our qsort is strongly optimized for very
low cardinality inputs, but radix sort is usually equal or very close
in those cases.
The changes to the regression tests are caused by under-specified sort
orders, e.g. "SELECT a, b from mytable order by a;". For unstable
sorts, such as our qsort and this in-place radix sort, there is no
guarantee of the order of "b" within each group of "a".
The implementation is taken from ska_byte_sort() (Boost licensed),
which is similar to American flag sort (an in-place radix sort) with
modifications to make it better suited for modern pipelined CPUs.
The technique of normalization described above can also be extended
to the case of multiple keys. That is left for future work (Thanks
to Peter Geoghegan for the suggestion to look into this area).
Reviewed-by: Chengpeng Yan <chengpeng_yan@outlook.com>
Reviewed-by: zengman <zengman@halodbtech.com>
Reviewed-by: ChangAo Chen <cca5507@qq.com>
Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de>
Reviewed-by: Chao Li <li.evan.chao@gmail.com> (earlier version)
Discussion: https://postgr.es/m/CANWCAZYzx7a7E9AY16Jt_U3+GVKDADfgApZ-42SYNiig8dTnFA@mail.gmail.com
Commit dfd79e2d added a TODO comment to update this paragraph
when support for PASSING was added. Commit 6185c9737c added
PASSING but missed resolving this TODO. Fix by expanding the
paragraph with a reference to PASSING.
Author: Aditya Gollamudi <adigollamudi@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/20260117051406.sx6pss4ryirn2x4v@pgs
The URL for Ditaa linked to the old Sourceforge version, which is
too old for what we need; the fork over on Github is the correct
version to use for regenerating the SVG files for the docs. The
required Ditaa version is 0.11.0, as that is when SVG support was
added. Running the version found on Sourceforge produces the error
below:
$ ditaa -E -S --svg in.txt out.txt
Unrecognized option: --svg
usage: ditaa <INPFILE> [OUTFILE] [-A] [-b <BACKGROUND>] [-d] [-E] [-e
<ENCODING>] [-h] [--help] [-o] [-r] [-S] [-s <SCALE>] [-T] [-t
<TABS>] [-v] [-W]
While there, also mention that meson rules exist for building the
images.
Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Paul A Jungwirth <pj@illuminatedcomputing.com>
Discussion: https://postgr.es/m/CAN55FZ2O-23xERF2NYcvv9DM_1c9T16y6mi3vyP=O1iuXS0ASA@mail.gmail.com
This adds an 'images' target to the meson build system in order to
be able to regenerate the images used in the docs.
Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reported-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/CAN55FZ0c0Tcjx9=e-YibWGHa1-xmdV63p=THH4YYznz+pYcfig@mail.gmail.com
The idea is to further encourage the use of these allocation routines
across the tree, as they offer stronger type safety guarantees than
pg_malloc() & co (type cast in the result, sizeof() embedded). This set
of changes is dedicated to the pg_dump code.
Similar work has been done as of 31d3847a37, as one example.
Author: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Aleksander Alekseev <aleksander@tigerdata.com>
Discussion: https://postgr.es/m/CAHut+PvpGPDLhkHAoxw_g3jdrYxA1m16a8uagbgH3TGWSKtXNQ@mail.gmail.com
Use BackgroundPsql's published API for automatically restarting
its timer for each query, rather than manually reaching into it
to achieve the same thing.
010_tab_completion.pl's logic for this predates the invention
of BackgroundPsql (and 664d75753 missed the opportunity to
make it cleaner). 030_pager.pl copied-and-pasted the code.
Author: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Andrew Dunstan <andrew@dunslane.net>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/1100715.1712265845@sss.pgh.pa.us
This log message was referring to conflicts, but it is about checksum
failures. The log message improved in this commit should never show
up, because pgstat_prepare_report_checksum_failure() should always be
called before pgstat_report_checksum_failures_in_db(), with a stats
entry already created in the pgstats shared hash table. The three
code paths able to report database-level checksum failures already
follow this requirement.
Oversight in b96d3c3897.
Author: Wang Peng <215722532@qq.com>
Discussion: https://postgr.es/m/tencent_9B6CD6D9D34AE28CDEADEC6188DB3BA1FE07@qq.com
Backpatch-through: 18
It's currently only used in the server, but it was placed in src/port
with the idea that it might be useful in client programs too. However,
it will currently fail to link if used in a client program, because
CHECK_FOR_INTERRUPTS() is not usable in client programs. Fix that by
wrapping it in "#ifndef FRONTEND".
Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de>
Discussion: https://www.postgresql.org/message-id/21cc7a48-99d9-4f69-9a3f-2c2de61ac8e5%40iki.fi
Backpatch-through: 18
The uses of these functions do not justify the level of
micro-optimization we've done and may even hurt performance in some
cases (e.g., due to using function pointers). This commit removes
all architecture-specific implementations of pg_popcount{32,64} and
converts the portable ones to inlined functions in pg_bitutils.h.
These inlined versions should produce the same code as before (but
inlined), so in theory this is a net gain for many machines. A
follow-up commit will replace the remaining loops over these
word-length popcount functions with calls to pg_popcount(), further
reducing the need for architecture-specific implementations.
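A portable word-length popcount of the kind that suits inlining is
small; this is the classic SWAR bit-counting formulation, shown as a
sketch rather than PostgreSQL's exact code:

```c
#include <stdint.h>

/* Count set bits in a 64-bit word with parallel bit sums: pairs, then
 * nibbles, then a multiply to gather the per-byte counts into the top
 * byte.  Branch-free and easy for compilers to inline. */
static inline int
popcount64_sketch(uint64_t x)
{
    x = x - ((x >> 1) & UINT64_C(0x5555555555555555));
    x = (x & UINT64_C(0x3333333333333333)) +
        ((x >> 2) & UINT64_C(0x3333333333333333));
    x = (x + (x >> 4)) & UINT64_C(0x0F0F0F0F0F0F0F0F);
    return (int) ((x * UINT64_C(0x0101010101010101)) >> 56);
}
```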
Suggested-by: John Naylor <johncnaylorls@gmail.com>
Reviewed-by: John Naylor <johncnaylorls@gmail.com>
Reviewed-by: Greg Burd <greg@burd.me>
Discussion: https://postgr.es/m/CANWCAZY7R%2Biy%2Br9YM_sySNydHzNqUirx1xk0tB3ej5HO62GdgQ%40mail.gmail.com
Over the past few releases, we've added a huge amount of complexity
to our popcount implementations. Commits fbe327e5b4, 79e232ca01,
8c6653516c, and 25dc485074 did some preliminary refactoring, but
many opportunities remain. In particular, if we disclaim interest
in micro-optimizing this code for 32-bit builds and in unnecessary
alignment checks on x86-64, we can remove a decent chunk of code.
I cannot find public discussion or benchmarks for the code this
commit removes, but it seems unlikely that this change will
noticeably impact performance on affected systems.
Suggested-by: John Naylor <johncnaylorls@gmail.com>
Reviewed-by: John Naylor <johncnaylorls@gmail.com>
Discussion: https://postgr.es/m/CANWCAZY7R%2Biy%2Br9YM_sySNydHzNqUirx1xk0tB3ej5HO62GdgQ%40mail.gmail.com
This adds a new ON CONFLICT action DO SELECT [FOR UPDATE/SHARE], which
returns the pre-existing rows when conflicts are detected. The INSERT
statement must have a RETURNING clause when DO SELECT is specified.
The optional FOR UPDATE/SHARE clause allows the rows to be locked
before they are returned. As with a DO UPDATE conflict action, an
optional WHERE clause may be used to prevent rows from being selected
for return (but as with a DO UPDATE action, rows filtered out by the
WHERE clause are still locked).
Bumps catversion as stored rules change.
Author: Andreas Karlsson <andreas@proxel.se>
Author: Marko Tiikkaja <marko@joh.to>
Author: Viktor Holmberg <v@viktorh.net>
Reviewed-by: Joel Jacobson <joel@compiler.org>
Reviewed-by: Kirill Reshke <reshkekirill@gmail.com>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Reviewed-by: Jian He <jian.universality@gmail.com>
Discussion: https://postgr.es/m/d631b406-13b7-433e-8c0b-c6040c4b4663@Spark
Discussion: https://postgr.es/m/5fca222d-62ae-4a2f-9fcb-0eca56277094@Spark
Discussion: https://postgr.es/m/2b5db2e6-8ece-44d0-9890-f256fdca9f7e@proxel.se
Discussion: https://postgr.es/m/CAL9smLCdV-v3KgOJX3mU19FYK82N7yzqJj2HAwWX70E=P98kgQ@mail.gmail.com
Following e68b6adad9, the reason for skipping slot synchronization is
stored as a slot property. This commit removes redundant function
parameters that previously tracked this state, instead relying directly on
the slot property.
Additionally, this change centralizes the logic for skipping
synchronization when required WAL has not yet been received or flushed. By
consolidating this check, we reduce code duplication and the risk of
inconsistent state updates across different code paths.
In passing, add an assertion to ensure a slot is marked as temporary if a
consistent point has not been reached during synchronization.
Author: Zhijie Hou <houzj.fnst@fujitsu.com>
Reviewed-by: Shveta Malik <shveta.malik@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/TY4PR01MB16907DD16098BE3B20486D4569463A@TY4PR01MB16907.jpnprd01.prod.outlook.com
Discussion: https://postgr.es/m/CAFPTHDZAA+gWDntpa5ucqKKba41=tXmoXqN3q4rpjO9cdxgQrw@mail.gmail.com
The only place that used p_is_insert was transformAssignedExpr(),
which used it to distinguish INSERT from UPDATE when handling
indirection on assignment target columns -- see commit c1ca3a19df.
However, this information is already available to
transformAssignedExpr() via its exprKind parameter, which is always
either EXPR_KIND_INSERT_TARGET or EXPR_KIND_UPDATE_TARGET.
As noted in the commit message for c1ca3a19df, this use of
p_is_insert isn't particularly pretty, so have transformAssignedExpr()
use the exprKind parameter instead. This then allows p_is_insert to be
removed entirely, which simplifies state management in a few other
places across the parser.
Author: Viktor Holmberg <v@viktorh.net>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Discussion: https://postgr.es/m/badc3b4c-da73-4000-b8d3-638a6f53a769@Spark
For a LEFT JOIN, if any var from the right-hand side (RHS) is forced
to null by upper-level quals but is known to be non-null for any
matching row, the only way the upper quals can be satisfied is if the
join fails to match, producing a null-extended row. Thus, we can
treat this left join as an anti-join.
Previously, this transformation was limited to cases where the join's
own quals were strict for the var forced to null by upper qual levels.
This patch extends the logic to check table constraints, leveraging
the NOT NULL attribute information already available thanks to the
infrastructure introduced by e2debb643. If a forced-null var belongs
to the RHS and is defined as NOT NULL in the schema (and is not
nullable due to lower-level outer joins), we know that the left join
can be reduced to an anti-join.
Note that to ensure the var is not nullable by any lower-level outer
joins within the current subtree, we collect the relids of base rels
that are nullable within each subtree during the first pass of the
reduce-outer-joins process. This allows us to verify in the second
pass that a NOT NULL var is indeed safe to treat as non-nullable.
Based on a proposal by Nicolas Adenis-Lamarre, but this is not the
original patch.
Suggested-by: Nicolas Adenis-Lamarre <nicolas.adenis.lamarre@gmail.com>
Author: Tender Wang <tndrwang@gmail.com>
Co-authored-by: Richard Guo <guofenglinux@gmail.com>
Discussion: https://postgr.es/m/CACPGbctKMDP50PpRH09in+oWbHtZdahWSroRstLPOoSDKwoFsw@mail.gmail.com
If the variable's value is null, exec_stmt_return() missed filling
in estate->rettype. This is a pretty old bug, but we'd managed not
to notice because that value isn't consulted for a null result ...
unless we have to cast it to a domain. That case led to a failure
with "cache lookup failed for type 0".
The correct way to assign the data type is known by exec_eval_datum.
While we could copy-and-paste that logic, it seems like a better
idea to just invoke exec_eval_datum, as the ROW case already does.
Reported-by: Pavel Stehule <pavel.stehule@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAFj8pRBT_ahexDf-zT-cyH8bMR_qcySKM8D5nv5MvTWPiatYGA@mail.gmail.com
Backpatch-through: 14
The pg_stat_activity view shows information for aux processes, but the
pg_stat_get_backend_wait_event() and
pg_stat_get_backend_wait_event_type() functions did not. To fix, call
AuxiliaryPidGetProc(pid) if BackendPidGetProc(pid) returns NULL, like
we do in pg_stat_get_activity().
In version 17 and above, it's a little silly to use those functions
when we already have the ProcNumber at hand, but it was necessary
before v17 because the backend ID was different from ProcNumber. I
have other plans for wait_event_info on master, so it doesn't seem
worth applying a different fix on different versions now.
Reviewed-by: Sami Imseih <samimseih@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://www.postgresql.org/message-id/c0320e04-6e85-4c49-80c5-27cfb3a58108@iki.fi
Backpatch-through: 14
This commit adds a new parameter called
password_expiration_warning_threshold that controls when the server
begins emitting imminent-password-expiration warnings upon
successful password authentication. By default, this parameter is
set to 7 days, but this functionality can be disabled by setting it
to 0. This patch also introduces a new "connection warning"
infrastructure that can be reused elsewhere. For example, we may
want to warn about the use of MD5 passwords for a couple of
releases before removing MD5 password support.
Author: Gilles Darold <gilles@darold.net>
Co-authored-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Japin Li <japinli@hotmail.com>
Reviewed-by: songjinzhou <tsinghualucky912@foxmail.com>
Reviewed-by: liu xiaohui <liuxh.zj.cn@gmail.com>
Reviewed-by: Yuefei Shi <shiyuefei1004@gmail.com>
Reviewed-by: Steven Niu <niushiji@gmail.com>
Reviewed-by: Soumya S Murali <soumyamurali.work@gmail.com>
Reviewed-by: Euler Taveira <euler@eulerto.com>
Reviewed-by: Zsolt Parragi <zsolt.parragi@percona.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Greg Sabino Mullane <htamfids@gmail.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/129bcfbf-47a6-e58a-190a-62fc21a17d03%40migops.com
The buildfarm occasionally shows a variant row order in the output
of this UPDATE ... RETURNING, implying that the preceding INSERT
dropped one of the rows into some free space within the table rather
than appending them all at the end. It's not entirely clear why that
happens sometimes and not others, but we have established that
it's affected by concurrent activity in other databases of the
cluster. In any case, the behavior is not wrong; the test is at fault
for presuming that a seqscan will give deterministic row ordering.
Add an ORDER BY atop the update to stop the buildfarm noise.
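Since UPDATE ... RETURNING does not accept ORDER BY directly, the usual regression-test idiom (sketched here with invented names) wraps it in an outer query:

```sql
-- Sort the RETURNING output so the result is deterministic
-- regardless of where the rows landed physically.
WITH u AS (
  UPDATE t SET val = val + 1 RETURNING id, val
)
SELECT * FROM u ORDER BY id;
```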
The buildfarm seems to have shown this only in v18 and master
branches, but just in case the cause is older, back-patch to
all supported branches.
Discussion: https://postgr.es/m/3866274.1770743162@sss.pgh.pa.us
Backpatch-through: 14
* Remove an unused variable
* Use "default log level" consistently (instead of "generic")
* Keep the process types in alphabetical order (missed one place in the
SGML docs)
* Since log_min_messages' type was changed from enum to string, it is
  a good idea to add single quotes when printing the value out.
  Otherwise, copying and pasting from the SHOW output into SET fails
  except in the simplest case. Using single quotes reduces confusion.
* Use lowercase string for the burned-in default value, to keep the same
output as previous versions.
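For example (the per-process value syntax shown here is an assumption), quoting makes the SHOW output round-trip cleanly:

```sql
SHOW log_min_messages;
-- e.g. 'warning, checkpointer:debug1'
SET log_min_messages = 'warning, checkpointer:debug1';  -- needs the quotes
```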
Author: Euler Taveira <euler@eulerto.com>
Author: Man Zeng <zengman@halodbtech.com>
Author: Noriyoshi Shinoda <noriyoshi.shinoda@hpe.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/202602091250.genyflm2d5dw@alvherre.pgsql
It protects the freeProcs and some other fields in ProcGlobal, so
let's move it there. It's good for cache locality to have it next to
the thing it protects, and just makes more sense anyway. I believe it
was allocated as a separate shared memory area just for historical
reasons.
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Discussion: https://www.postgresql.org/message-id/b78719db-0c54-409f-b185-b0d59261143f@iki.fi
On the INSERT page, mention that SELECT privileges are also required
for any columns mentioned in the arbiter clause, including those
referred to by the constraint, and clarify that this applies to all
forms of ON CONFLICT, not just ON CONFLICT DO UPDATE.
Author: Dean Rasheed <dean.a.rasheed@gmail.com>
Reviewed-by: Viktor Holmberg <v@viktorh.net>
Discussion: https://postgr.es/m/CAEZATCXGwMQ+x00YY9XYG46T0kCajH=21QaYL9Xatz0dLKii+g@mail.gmail.com
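As a hypothetical illustration of the documented requirement (names invented):

```sql
-- INSERT alone is not enough: evaluating the arbiter requires
-- reading the conflict-target column, so SELECT on it is also needed.
GRANT INSERT, SELECT (id) ON t TO app_user;

INSERT INTO t (id, val) VALUES (1, 'x')
ON CONFLICT (id) DO NOTHING;
```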
Backpatch-through: 14
On the CREATE POLICY page, the description of per-command policies
stated that SELECT policies are applied when an INSERT has an ON
CONFLICT DO NOTHING clause. However, that is only the case if it
includes an arbiter clause, so clarify that.
While at it, also clarify the comment in the regression tests that
cover this.
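To illustrate the distinction being documented (schema hypothetical):

```sql
-- No arbiter clause: no existing row is examined, so SELECT
-- policies are not applied.
INSERT INTO t VALUES (1, 'x') ON CONFLICT DO NOTHING;

-- With an arbiter, the conflicting row must be examined, so SELECT
-- policies on "t" apply as well.
INSERT INTO t VALUES (1, 'x') ON CONFLICT (id) DO NOTHING;
```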
Author: Dean Rasheed <dean.a.rasheed@gmail.com>
Reviewed-by: Viktor Holmberg <v@viktorh.net>
Discussion: https://postgr.es/m/CAEZATCXGwMQ+x00YY9XYG46T0kCajH=21QaYL9Xatz0dLKii+g@mail.gmail.com
Backpatch-through: 14
An extension (or core code) might want to reconstruct the planner's
decisions about whether and where to perform partitionwise joins from
the final plan. To do so, it must be possible to find all of the RTIs
of partitioned tables appearing in the plan. But when an AppendPath
or MergeAppendPath pulls up child paths from a subordinate AppendPath
or MergeAppendPath, the RTIs of the subordinate path do not appear
in the final plan, making this kind of reconstruction impossible.
To avoid this, propagate the RTI sets that would have appeared in
the 'apprelids' fields of the subordinate Append or MergeAppend
nodes, had those nodes been created, into the surviving Append or
MergeAppend node, using a new 'child_append_relid_sets' field for
that purpose. The value of this field is a list of Bitmapsets,
because each relation whose append-list was pulled up had its own
set of RTIs: just one, if it was a partitionwise scan, or more than
one, if it was a partitionwise join. Since our goal is to see where
partitionwise joins were done, it is essential to avoid losing the
information about how the RTIs were grouped in the pulled-up
relations.
This commit also updates pg_overexplain so that EXPLAIN (RANGE_TABLE)
will display the saved RTI sets.
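A sketch of how the new output might be inspected (query and table names invented):

```sql
LOAD 'pg_overexplain';

-- The RANGE_TABLE output now also shows the child RTI sets
-- preserved in the surviving Append/MergeAppend node.
EXPLAIN (RANGE_TABLE, COSTS OFF)
SELECT * FROM part_tab1 t1 JOIN part_tab2 t2 USING (id);
```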
Co-authored-by: Robert Haas <rhaas@postgresql.org>
Co-authored-by: Lukas Fittl <lukas@fittl.com>
Reviewed-by: Lukas Fittl <lukas@fittl.com>
Reviewed-by: Jakub Wartak <jakub.wartak@enterprisedb.com>
Reviewed-by: Greg Burd <greg@burd.me>
Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com>
Reviewed-by: Amit Langote <amitlangote09@gmail.com>
Reviewed-by: Haibo Yan <tristan.yim@gmail.com>
Reviewed-by: Alexandra Wang <alexandra.wang.oss@gmail.com>
Discussion: http://postgr.es/m/CA+TgmoZ-Jh1T6QyWoCODMVQdhTUPYkaZjWztzP1En4=ZHoKPzw@mail.gmail.com