LLVM 22 has the fix that we copied into our tree in commit 9044fc1d and
a new function to reach it[1][2], so we only need to use our copy for
Aarch64 + LLVM < 22. The only change to the final version that our copy
didn't get is a new LLVM_ABI macro, but that isn't appropriate for us.
Our copy is hopefully now frozen and would only need maintenance if bugs
are found in the upstream code.
Non-Aarch64 systems now also use the new API with LLVM 22. It allocates
all sections with one contiguous mmap() instead of one per
section. We could have done that earlier, but commit 9044fc1d wanted to
limit the blast radius to the affected systems. We might as well
benefit from that small improvement everywhere now that it is available
out of the box.
We can't delete our copy until LLVM 22 is our minimum supported version,
or we switch to the newer JITLink API for at least Aarch64.
[1] https://github.com/llvm/llvm-project/pull/71968
[2] https://github.com/llvm/llvm-project/pull/174307
Backpatch-through: 14
Discussion: https://postgr.es/m/CA%2BhUKGJTumad75o8Zao-LFseEbt%3DenbUFCM7LZVV%3Dc8yg2i7dg%40mail.gmail.com
Buildfarm testing shows that OpenSUSE (and perhaps related platforms?)
configures GNU tar in such a way that it'll archive sparse WAL files
by default, thus triggering the pax-extension detection code added by
bc30c704a. Thus, we need something similar to 852de579a but for
GNU tar's option set. "--format=ustar" seems to do the trick.
Moreover, the buildfarm shows that pg_verifybackup's 003_corruption.pl
test script is also triggering creation of pax-format tar files on
that platform. We had not noticed because those test cases all fail
(intentionally) before getting to the point of trying to verify WAL
data.
Since that means two TAP scripts need this option-selection logic, and
plausibly more will do so in future, factor it out into a subroutine
in Test::Utils. We also need to back-patch the 003_corruption.pl fix
into v18, where it's also failing.
While at it, clean up some places where guards for $tar being empty
or undefined were incomplete or even outright backwards. Presumably,
we missed noticing because the set of machines that run TAP tests
and don't have tar installed is empty. But if we're going to try
to handle that scenario, we should do it correctly.
Reported-by: Tomas Vondra <tomas@vondra.me>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/02770bea-b3f3-4015-8a43-443ae345379c@vondra.me
Backpatch-through: 18
Add the following jsonpath methods:
* l/r/btrim()
* lower(), upper()
* initcap()
* replace()
* split_part()
Each simply dispatches to the standard string processing functions.
These depend on the locale, but since it's set at `initdb`, they can be
considered immutable and therefore allowed in any jsonpath expression.
Author: Florents Tselai <florents.tselai@gmail.com>
Co-authored-by: David E. Wheeler <david@justatheory.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Andrew Dunstan <andrew@dunslane.net>
Discussion: https://postgr.es/m/CA+v5N40sJF39m0v7h=QN86zGp0CUf9F1WKasnZy9nNVj_VhCZQ@mail.gmail.com
This is just cleanup in the jsonpath grammar.
Rename the `csv_` tokens to `int_`, because they represent signed or
unsigned integers, as follows:
* `csv_elem` => `int_elem`
* `csv_list` => `int_list`
* `opt_csv_list` => `opt_int_list`
Rename the `datetime_precision` tokens to `uint_arg`, as they represent
unsigned integers and will be useful for other methods in the future, as
follows:
* `datetime_precision` => `uint_elem`
* `opt_datetime_precision` => `opt_uint_arg`
Rename the `datetime_template` tokens to `str_arg`, as they represent
strings and will be useful for other methods in the future, as follows:
* `datetime_template` => `str_elem`
* `opt_datetime_template` => `opt_str_arg`
Author: David E. Wheeler <david@justatheory.com>
Reviewed-by: Andrew Dunstan <andrew@dunslane.net>
Discussion: https://postgr.es/m/CA+v5N40sJF39m0v7h=QN86zGp0CUf9F1WKasnZy9nNVj_VhCZQ@mail.gmail.com
When a tablesync worker checks whether a specific table is published,
it previously issued a query to the publisher calling
pg_get_publication_tables() and filtering the result by relid via a
WHERE clause. Because the function itself was fully evaluated before
the filter was applied, this forced the publisher to enumerate all
tables in the publication. For publications covering a large number of
tables, this resulted in expensive catalog scans and unnecessary CPU
overhead on the publisher.
This commit adds a new overloaded form of pg_get_publication_tables()
that accepts an array of publication names and a target table
OID. Instead of enumerating all published tables, it evaluates
membership for the specified relation via syscache lookups, using the
new is_table_publishable_in_publication() helper. This helper
correctly accounts for publish_via_partition_root, ALL TABLES with
EXCEPT clauses, schema publications, and partition inheritance, while
avoiding the overhead of building the complete published table list.
The existing VARIADIC array form of pg_get_publication_tables() is
preserved for backward compatibility. Tablesync workers use the new
two-argument form when connected to a publisher running PostgreSQL 19
or later.
Bump catalog version.
Reported-by: Marcos Pegoraro <marcos@f10.com.br>
Reviewed-by: Zhijie Hou <houzj.fnst@fujitsu.com>
Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Haoyan Wang <wanghaoyan20@163.com>
Discussion: https://postgr.es/m/CAB-JLwbBFNuASyEnZWP0Tck9uNkthBZqi6WoXNevUT6+mV8XmA@mail.gmail.com
Previously, there was essentially no verification in this code that
the input is a tar file at all, let alone that it fits into the
subset of valid tar files that we can handle. This was exposed by
the discovery that we couldn't handle files that FreeBSD's tar
makes, because it's fairly aggressive about converting sparse WAL
files into sparse tar entries. To fix:
* Bail out if we find a pax extension header. This covers the
sparse-file case, and also protects us against scenarios where
the pax header changes other file properties that we care about.
(Eventually we may extend the logic to actually handle such
headers, but that won't happen in time for v19.)
* Be more wary about tar file type codes in general: do not assume
that anything that's neither a directory nor a symlink must be a
regular file. Instead, we just ignore entries that are none of the
three supported types.
* Apply pg_dump's isValidTarHeader to verify that a purported
header block is actually in tar format. To make this possible,
move isValidTarHeader into src/port/tar.c, which is probably where
it should have been since that file was created.
I also took the opportunity to const-ify the arguments of
isValidTarHeader and tarChecksum, and to use symbols not hard-wired
constants inside tarChecksum.
Back-patch to v18 but not further. Although this code exists inside
pg_basebackup in older branches, it's not really exposed in that
usage to tar files that weren't generated by our own code, so it
doesn't seem worth back-porting these changes across 3c9056981
and f80b09bac. I did choose to include a back-patch of 5868372bb
into v18 though, to minimize cosmetic differences between these
two branches.
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/3049460.1775067940@sss.pgh.pa.us>
Backpatch-through: 18
Interrupt handling functions (e.g., HandleCatchupInterrupt(),
HandleParallelApplyMessageInterrupt()) are called only by
procsignal_sigusr1_handler(), which already calls SetLatch()
for the current process at the end of its processing.
Therefore, these interrupt handling functions do not need to
call SetLatch() themselves.
However, previously, some of these functions redundantly
called SetLatch(). This commit removes those unnecessary
calls.
While duplicate SetLatch() calls are redundant, they are
harmless, so this change is not backpatched.
Author: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com>
Discussion: https://postgr.es/m/CALj2ACWd5apddj6Cd885WwJ6LquYu_G81C4GoR4xSoDV1x-FEA@mail.gmail.com
Previously we would only check for the availability of __cpuidex if
the related __get_cpuid_count was not available on a platform.
Future commits will need to access hypervisor information about
the TSC frequency of x86 CPUs. For that case __cpuidex is the only
viable option for accessing a high leaf (e.g. 0x40000000), since
__get_cpuid_count does not allow that.
__cpuidex is defined in cpuid.h for gcc/clang, but in intrin.h
for MSVC, so adjust tests to suite. We also need to cast the array
of unsigned ints to signed, since gcc (with -Wall) and clang emit
warnings otherwise.
Author: Lukas Fittl <lukas@fittl.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: John Naylor <john.naylor@postgresql.org>
Discussion: https://postgr.es/m/CAP53PkyooCeR8YV0BUD_xC7oTZESHz8OdA=tP7pBRHFVQ9xtKg@mail.gmail.com
Now that command_ok() captures and displays failure output, use it
instead of system() plus manual diff-dumping in these two tests. This
simplifies both scripts and produces consistent, truncated output on
failure.
Author: Jelte Fennema-Nio <postgres@jeltef.nl>
Reviewed-by: Andrew Dunstan <andrew@dunslane.net>
Reviewed-by: Corey Huinker <corey.huinker@gmail.com>
Discussion: https://postgr.es/m/DFYFWM053WHS.10K8ZPJ605UFK@jeltef.nl
Replace die with croak throughout Cluster.pm and Utils.pm (except in
INIT blocks and signal handlers, where die is correct) so that error
messages report the test script's line number rather than the helper
module's.
Add @CARP_NOT in Utils.pm listing PostgreSQL::Test::Cluster, so that
when a Utils function is called through a Cluster.pm wrapper, croak
skips both packages and reports the actual test-script caller.
Author: Jelte Fennema-Nio <postgres@jeltef.nl>
Reviewed-by: Andrew Dunstan <andrew@dunslane.net>
Reviewed-by: Corey Huinker <corey.huinker@gmail.com>
Reviewed-by: Zsolt Parragi <zsolt.parragi@percona.com>
Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/DFYFWM053WHS.10K8ZPJ605UFK@jeltef.nl
Install a $SIG{__DIE__} handler in the INIT block of Utils.pm that emits
the die message as a TAP diagnostic. Previously, an unexpected die
(e.g. from safe_psql) produced only "no plan was declared" with no
indication of the actual error. The handler also calls done_testing()
to suppress that confusing message.
Dies during compilation ($^S undefined) and inside eval ($^S == 1) are
left alone.
Author: Jelte Fennema-Nio <postgres@jeltef.nl>
Reviewed-by: Andrew Dunstan <andrew@dunslane.net>
Reviewed-by: Corey Huinker <corey.huinker@gmail.com>
Reviewed-by: Zsolt Parragi <zsolt.parragi@percona.com>
Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/DFYFWM053WHS.10K8ZPJ605UFK@jeltef.nl
Discussion: https://postgr.es/m/20220222181924.eehi7o4pmneeb4hm%40alap3.anarazel.de
Capture stdout and stderr from command_ok() and command_fails() and emit
them as TAP diagnostics on failure. Output is truncated to the first
and last 30 lines per channel to avoid flooding.
A new helper _diag_command_output() is introduced in Utils.pm so
both functions share the same truncation and formatting logic.
Author: Jelte Fennema-Nio <postgres@jeltef.nl>
Reviewed-by: Andrew Dunstan <andrew@dunslane.net>
Reviewed-by: Corey Huinker <corey.huinker@gmail.com>
Reviewed-by: Zsolt Parragi <zsolt.parragi@percona.com>
Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/DFYFWM053WHS.10K8ZPJ605UFK@jeltef.nl
When pg_regress fails it is often tedious to find the actual diffs,
especially in CI where you must navigate a file browser. Emit the first
80 lines of the combined regression.diffs as TAP diagnostics so the
failure reason is visible directly in the test output.
The line limit is across all failing tests in a single pg_regress run to
avoid flooding when a crash causes every subsequent test to fail.
New DIAG_DETAIL / DIAG_END tap output types are added, mirroring the
existing NOTE_DETAIL / NOTE_END pair, so that long diff lines can be
emitted without spurious '#' prefixes on continuation lines.
Author: Jelte Fennema-Nio <postgres@jeltef.nl>
Reviewed-by: Andrew Dunstan <andrew@dunslane.net>
Reviewed-by: Corey Huinker <corey.huinker@gmail.com>
Reviewed-by: Zsolt Parragi <zsolt.parragi@percona.com>
Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/DFYFWM053WHS.10K8ZPJ605UFK@jeltef.nl
While JIT can speed up large analytical queries, it can also cause
serious performance issues on otherwise very fast queries. Compiling
and optimizing the expressions may be so expensive, it completely
outweighs the JIT benefits for shorter queries.
Ideally, we'd address this in the cost model, but the part deciding
whether to enable JIT for a query is rather simple, partially because we
don't have any reliable estimates of how expensive the LLVM compilation
and optimization is.
Sometimes seemingly unrelated changes (for example a couple additional
INSERTs into a table) increase the cost just enough to enable JIT,
resulting in a performance cliff.
Because of these risks, most large-scale deployments already disable JIT
by default. Notably, this includes all hyperscalers.
This commit changes our default to align with that established practice.
If we improve the JIT (be it better costing or cheaper execution), we
can consider enabling it by default again.
Author: Jelte Fennema-Nio <postgres@jeltef.nl>
Discussion: https://postgr.es/m/DG1VZJEX1AQH.2EH4OKGRUDB71@jeltef.nl
c456e3911 added various optimizations to the tuple deformation routines.
One optimization assumed that heap tuples would never contain cstrings.
That optimization also made its way into nocachegetattr(), which isn't
correct as ROW() types get formed into HeapTuples by ExecEvalRow() and
those can contain cstring Datums. nocachegetattr() gets used to extract
Datums from those tuples.
Here we remove the pg_assume(), which was there to instruct the compiler
to omit the attlen == -2 related code in att_addlength_pointer().
Author: David Rowley <dgrowleyml@gmail.com>
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/80aeac57-8f50-4732-a5b4-c2373c3f8149@gmail.com
The pg_test_timing program was previously using INSTR_TIME_GET_NANOSEC on an
absolute instr_time value in order to do a diff, which goes against the spirit
of how the GET_* macros are supposed to be used, and will cause overhead in a
future change that assumes these macros are typically used on intervals only.
Additionally the program was doing unnecessary work in the test loop by
measuring the time elapsed, instead of checking the existing current time
measurement against a target end time. To support that, introduce a new
INSTR_TIME_ADD_NANOSEC macro that allows adding user-defined nanoseconds
to an instr_time variable.
While modifying the relevant code anyway, simplify it by not handling
durations <= 0 in test_timing(), since duration is unsigned and 0 is
disallowed by the caller.
Author: Lukas Fittl <lukas@fittl.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CAP53Pkyxv3-3gX+aOxC5tX0p2v9RHU+XH0iyvb64+ZnBXj92vg@mail.gmail.com
Until now we reduced the look-ahead distance by 1 on every hit, and doubled it
on every miss. That is problematic because there are very common IO patterns
where this prevents us from ever reaching a sufficiently high distance (e.g. a
miss followed by a hit will never have the distance grow beyond 2). In many
such cases, if we had ever reached a sufficient look-ahead distance, things
would have been fine, because we grow the distance faster than we decrease it.
One might think that the most obvious answer to this problem would be to never
reduce the distance. However, that would not work well, as (particularly with
upcoming users of read streams), it is reasonably common to at first have a
lot of misses and then to transition to a fully cached workload, e.g. because
the same blocks are needed repeatedly within one stream. Doing unnecessarily
deep readahead can be costly, due to having to pin a lot more buffers, which
increases CPU overhead.
Because the cost of a synchronously handled miss can be very high (multiple
milliseconds for every IO with commonly used storage) compared to the CPU
overhead of keeping the distance too high, we want to err on the side of not
reducing the distance too early.
The insight that a decrease of the distance by 1 at ever hit may be ok at
large distances, but not at low distances, shows a way out: If we only allow
decreasing the distance once there were no misses for our maximum look-ahead
distance, we will keep the distance high as long as readahead has a chance to
do IO asynchronously, but not commonly when not.
Several folks have written variants of this patch, including at least Thomas
Munro, Melanie Plageman and I.
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://postgr.es/m/f3xxfrkafjxpyqxywcxricxgyizjirfceychyxsgn7bwjp5eda@kwbduhy7tfmu
Discussion: https://postgr.es/m/CA+hUKGL2PhFyDoqrHefqasOnaXhSg48t1phs3VM8BAdrZqKZkw@mail.gmail.com
Discussion: https://postgr.es/m/CAH2-Wz%3DkMg3PNay96cHMT0LFwtxP-cQSRZTZzh1Cixxf8G%3Dzrw%40mail.gmail.com
While in fast-path, execute any IO that we might encounter synchronously.
Because we are, in that moment, not reading ahead, dispatching any occasional
IO to workers has the dispatch overhead, without any realistic chance of the
IO completing before we need it.
This helps io_method=worker performance for workloads that have only
occasional cache misses, but where those occasional misses still take long
enough to matter. It is likely this is only measurable with fast local
storage or workloads with the data in the kernel page cache, as with remote
storage the IO latency, not the dispatch-to-worker latency, is the determining
factor.
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://postgr.es/m/f3xxfrkafjxpyqxywcxricxgyizjirfceychyxsgn7bwjp5eda@kwbduhy7tfmu
Discussion: https://postgr.es/m/CAH2-Wz%3DkMg3PNay96cHMT0LFwtxP-cQSRZTZzh1Cixxf8G%3Dzrw%40mail.gmail.com
The tuple_insert() method already has an equivalent argument, so this
makes sense just on consistency grounds, for future growth.
table_delete() can immediately use it to carry the 'changingPart'
boolean; for table_update we don't have any options at present.
Author: Álvaro Herrera <alvherre@kurilemu.de>
Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> (older version)
Reviewed-by: Zsolt Parragi <zsolt.parragi@percona.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Antonin Houska <ah@cybertec.at>
Discussion: https://postgr.es/m/202603171606.kf6pmhscqbqz@alvherre.pgsql
This is an extension of the UPDATE and DELETE commands to do a
"temporal update/delete" based on a range or multirange column. The
user can say UPDATE t FOR PORTION OF valid_at FROM '2001-01-01' TO
'2002-01-01' SET ... (or likewise with DELETE) where valid_at is a
range or multirange column.
The command is automatically limited to rows overlapping the targeted
portion, and only history within those bounds is changed. If a row
represents history partly inside and partly outside the bounds, then
the command truncates the row's application time to fit within the
targeted portion, then it inserts one or more "temporal leftovers":
new rows containing all the original values, except with the
application-time column changed to only represent the untouched part
of history.
To compute the temporal leftovers that are required, we use the *_minus_multi
set-returning functions defined in 5eed8ce50c.
- Added bison support for FOR PORTION OF syntax. The bounds must be
constant, so we forbid column references, subqueries, etc. We do
accept functions like NOW().
- Added logic to executor to insert new rows for the "temporal
leftover" part of a record touched by a FOR PORTION OF query.
- Documented FOR PORTION OF.
- Added tests.
Author: Paul A. Jungwirth <pj@illuminatedcomputing.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://www.postgresql.org/message-id/flat/ec498c3d-5f2b-48ec-b989-5561c8aa2024%40illuminatedcomputing.com
Oversight in commit 1bd6f22f43: I was way too optimistic about the
compiler letting me know what variables needed to be updated, and missed
a few of them. Clean it up.
Author: Álvaro Herrera <alvherre@kurilemu.de>
Reported-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/40E570EE-5A60-49D8-B8F7-2F8F2B7C8DFA@gmail.com
This allows both univariate and multivariate statistics to be built on
virtual generated columns and expressions that refer to virtual
generated columns. The restriction disallowing extended statistics on
a single column is lifted in the case of a single virtual generated
column, since it is treated as a single expression.
In the catalogs, references to virtual generated columns are stored
as-is. They are expanded at ANALYZE time to build the statistics, and
at planning time to allow the optimizer to make use of the statistics.
This allows the statistics to be correctly rebuilt using ANALYZE, if a
column's generation expression is altered (which causes any existing
statistics data to be deleted).
Author: Yugo Nagata <nagata@sraoss.co.jp>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Discussion: https://postgr.es/m/20250422181006.dd6f9d1d81299f5b2ad55e1a@sraoss.co.jp
Until now pgaio_wref_check_done() with io_method=io_uring would not detect if
IOs are known to have completed to the kernel, but the completion has not yet
been consumed by userspace. This can lead to inferior performance and also
makes it harder to use smarter feedback logic in read_stream, because we
cannot use knowledge about whether an IO completed to control the readahead
distance.
This commit just adds the io_uring specific infrastructure. Later commits will
return whether a wait was needed from WaitReadBuffers() and then use that
knowledge.
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://postgr.es/m/f3xxfrkafjxpyqxywcxricxgyizjirfceychyxsgn7bwjp5eda@kwbduhy7tfmu
Discussion: https://postgr.es/m/CAH2-Wz%3DkMg3PNay96cHMT0LFwtxP-cQSRZTZzh1Cixxf8G%3Dzrw%40mail.gmail.com
FastPathMeta stored pointers into ri_compare_cache entries via
compare_entries[], creating a dependency on that cache remaining
stable. If ri_compare_cache entries were invalidated after fpmeta
was populated, the pointers would dangle.
Replace compare_entries[] with inline copies of the two FmgrInfo
fields actually needed (cast_func_finfo and eq_opr_finfo), copied
at populate time via fmgr_info_copy(). fpmeta now depends only on
riinfo remaining valid, which is already handled by the invalidation
callback.
Introduced by commit 2da86c1ef9 ("Add fast path for foreign key
constraint checks"), noticed while reviewing code for robustness
under CLOBBER_CACHE_ALWAYS.
Discussion: https://postgr.es/m/CA+HiwqFQ+ZA7hSOygv4uv_t75B3r0_gosjadetCsAEoaZwTu6g@mail.gmail.com
First, under CLOBBER_CACHE_ALWAYS, the RI_ConstraintInfo entry can
be invalidated by relcache callbacks triggered inside table_open()
or index_open(), leaving ri_FastPathCheck() calling
ri_populate_fastpath_metadata() with a stale entry whose valid flag
is false. Fix by moving the fpmeta initialization to after
ri_CheckPermissions(), reloading riinfo first to ensure it is
valid, then calling ri_ExtractValues() and build_index_scankeys()
immediately after before any further operations that could trigger
invalidation.
Second, fpmeta allocated in TopMemoryContext was not freed when the
entry was invalidated in InvalidateConstraintCacheCallBack(),
leaking memory each time the constraint cache entry was recycled.
Fix by freeing and NULLing fpmeta at invalidation time.
Noticed locally when testing with CLOBBER_CACHE_ALWAYS.
Discussion: https://postgr.es/m/CA+HiwqGBU__7-VZZhQWQ3EQuwLYNPd9==ngnzduhGWKHMj9mvw@mail.gmail.com
During the counting step, keep track of the bits that are the same
for the entire input. If we counted only a single distinct byte,
the next recursion will start at the next byte position that has
more than one distinct byte in the input. This allows us to skip over
multiple passes where the byte is the same for the entire input.
This provides a significant speedup for integers that have some upper
bytes with all-zeros or all-ones, which is common.
Reviewed-by: Chengpeng Yan <chengpeng_yan@outlook.com>
Reviewed-by: ChangAo Chen <cca5507@qq.com>
Discussion: https://postgr.es/m/CANWCAZYpGMDSSwAa18fOxJGXaPzVdyPsWpOkfCX32DWh3Qznzw@mail.gmail.com
Previously some logical decoding messages (e.g., "logical decoding found
consistent point") were logged at level LOG, even though they provided
low-level, developer-oriented information that DBAs were typically not
interested in.
Since these messages can occur routinely (for example, when keeping calling
pg_logical_slot_get_changes() to obtain the changes from logical decoding),
logging them at LOG can be overly verbose.
This commit reduces their log level to DEBUG1 to avoid unnecessary log noise.
This change applies to a small set of messages for now. Additional messages
may be adjusted similarly in the future.
Even with this change, if these messages from walsender still need to be
observed, enabling DEBUG1 logging selectively for walsender (e.g.,
log_min_messages = 'warning,walsender:debug1') would be helpful to avoid
increasing overall log volume.
Author: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Discussion: https://postgr.es/m/CAHGQGwGTyHgtD9tyN664x6vQ8Q1G53H7ZUCgBU9_X=nLt3f1QA@mail.gmail.com
Use the standard C23 and C++ attributes [[nodiscard]], [[noreturn]],
and [[maybe_unused]], if available.
This makes pg_nodiscard and pg_attribute_unused() available in
not-GCC-compatible compilers that support C23 as well as in C++.
For pg_noreturn, we can now drop the GCC-specific and MSVC-specific
fallbacks, because the C11 and the C++ implementation will now cover
all required cases.
Note, in a few places, we need to change the position of the attribute
because it's not valid in that place in C23.
Discussion: https://www.postgresql.org/message-id/flat/pxr5b3z7jmkpenssra5zroxi7qzzp6eswuggokw64axmdixpnk@zbwxuq7gbbcw
The test_cplusplusext test module has so far been disabled on MSVC.
The only remaining problem now is that designated initializers, as
used in PG_MODULE_MAGIC, require C++20. (With GCC and Clang they work
in older C++ versions as well.)
This adds another test in the top-level meson.build to check that the
compiler supports C++20 designated initializers. This is not
required, we are just checking and recording the answer. If yes, we
can enable the test module.
Most current compilers likely won't be in C++20 mode by default. This
doesn't change that; we are not doing anything to try to switch the
compiler into that mode. This might be a separate project, but for
now we'll leave that for the user or the test scaffolding.
The VS task on Cirrus CI is changed to provide the required flag to
turn on C++20 mode.
There is no equivalent change in configure, since this change mainly
targets MSVC.
Co-authored-by: Jelte Fennema-Nio <postgres@jeltef.nl>
Discussion: https://www.postgresql.org/message-id/flat/CAGECzQR21OnnKiZO_1rLWO0-16kg1JBxnVq-wymYW0-_1cUNtg%40mail.gmail.com
The new test fails with an error about a missing WAL file on that
stack, because it is archived in GNU tar's --sparse --format=posix
format. BSD tar uses that format by default, unlike GNU tar itself, and
ZFS triggers it by implicitly creating sparse files when it sees a lot
of zeroes.
The problem will surely also affect real users of the new tar support in
pg_waldump (commit b15c1513) and pg_verifybackup (commit b3cf461b3) on
such systems. Ideas under discussion, but for now the test is made to
pass by disabling sparse file detection in BSD tar.
Diagnosed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Andrew Dunstan <andrew@dunslane.net>
Discussion: https://postgr.es/m/1624716.1774736283%40sss.pgh.pa.us
The check for skip_if_not_valid added in 819dc118c0 was put at the start of
the loop. A CAS loop in theory does allow to make that check in a race free
manner. However, just after the check, there's a
old_buf_state = WaitBufHdrUnlocked(buf);
which introduces a race, because it would allow BM_VALID to be cleared, after
the skip_if_not_valid check.
Fix by restarting the loop after WaitBufHdrUnlocked().
Reported-by: Yura Sokolov <y.sokolov@postgrespro.ru>
Discussion: https://postgr.es/m/5bf667f3-5270-4b19-a08f-0facbecdff68@postgrespro.ru
In a production build, StartReadBuffers() doesn't populate all fields
of a ReadBuffersOperation for a buffer hit because no callers use them
(they are populated in assert builds).
read_buffers() (a test-only function) relied on some of these fields, so
AIO tests failed on non-assert builds (discovered on the buildfarm after
commit 020c02bd90).
Fix by tracking the required information ourselves in read_buffers() and
avoiding reliance on the ReadBuffersOperation unless we know that we did
IO.
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/9ce8f5d8-8ab2-4aa2-b062-c5d74161069c%40gmail.com
Currently, when the client sends a parameter discovery request within
OAUTHBEARER, the server logs the attempt with
FATAL: OAuth bearer authentication failed for user
These log entries are difficult to distinguish from true authentication
failures, and by default, libpq sends a discovery request as part of
every OAuth connection, making them annoyingly noisy. Use the new
PG_SASL_EXCHANGE_ABANDONED status to suppress them.
Patch by Zsolt Parragi, with some additional comments added by me.
Author: Zsolt Parragi <zsolt.parragi@percona.com>
Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/CAN4CZFPim7hUiyb7daNKQPSZ8CvQRBGkVhbvED7yZi8VktSn4Q%40mail.gmail.com
Introduce PG_SASL_EXCHANGE_ABANDONED, which allows CheckSASLAuth to
suppress the failing log entry for any SASL exchange that isn't actually
an authentication attempt. This is desirable for OAUTHBEARER's discovery
exchanges (and a subsequent commit will make use of it there).
This might have some overlap in the future with in-band aborts for SASL
exchanges, but it's intentionally not named _ABORTED to avoid confusion.
(We don't currently support clientside aborts in our SASL profile.)
Adapted from a patch by Zsolt Parragi.
Author: Zsolt Parragi <zsolt.parragi@percona.com>
Co-authored-by: Jacob Champion <jacob.champion@enterprisedb.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/CAN4CZFPim7hUiyb7daNKQPSZ8CvQRBGkVhbvED7yZi8VktSn4Q%40mail.gmail.com
SASL exchanges must end with either an AuthenticationOk or an
ErrorResponse from the server, and the standard way to produce an
ErrorResponse packet is for auth_failed() to call ereport(FATAL). This
means that there's no way for a SASL mechanism to suppress the server
log entry if the "authentication attempt" was really just a query for
authentication metadata, as is done with OAUTHBEARER.
Following the example of 1f9158ba4, add a FATAL_CLIENT_ONLY elevel. This
will allow ClientAuthentication() to choose not to log a particular
failure, while still correctly ending the authentication exchange before
process exit.
(The provenance of this patch is convoluted: since it's a mechanical
copy-paste of 1f9158ba4, both Zsolt Parragi and I produced nearly
identical versions independently, and Andrey Borodin reviewed Zsolt's
version. Tom Lane is the author of 1f9158ba4, but I don't want to imply
that he's signed off on this adaptation. See Discussion.)
Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru>
Discussion: https://postgr.es/m/CAN4CZFPim7hUiyb7daNKQPSZ8CvQRBGkVhbvED7yZi8VktSn4Q%40mail.gmail.com
For PG19, since we won't have the ability to officially switch out flow
plugins, relax the flow-loading code to not require the internal init
function. Modules that don't have one will be treated as custom user
flows in error messages.
This will let bleeding-edge developers more easily test out the API and
provide feedback for PG20, by telling the runtime linker to find a
different libpq-oauth. It remains undocumented for end users.
Reviewed-by: Zsolt Parragi <zsolt.parragi@percona.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/CAOYmi%2BmrGg%2Bn_X2MOLgeWcj3v_M00gR8uz_D7mM8z%3DdX1JYVbg%40mail.gmail.com
The new PGoauthBearerRequestV2 API (which has similarities to the
"subclass" pointer architecture in use by the backend, for Nodes)
carries the risk of a developer ignoring the type of hook in use and
just casting directly to the V2 struct. This will appear to work fine in
19, but crash (or worse) when speaking to libpq 18.
However, we're in a unique position to catch this problem, because we
have tight control over the struct. Add poisoning code to the v1 path
which does the following:
- masks the v2 request->issuer pointer, to hopefully point at nonsense
memory
- abort()s if the v2 request->error is assigned by the hook
- attempts to cover both with VALGRIND_MAKE_MEM_NOACCESS for the
duration of the callback (a potential AddressSanitizer implementation
is left for future work)
The struct is unpoisoned after the call, so we can switch back to the v2
internal implementation when necessary.
Reviewed-by: Zsolt Parragi <zsolt.parragi@percona.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/CAOYmi%2BnCg5upBVOo_UCSjMfO%3DYMkZXcSEsgaADKXqerr5wahZQ%40mail.gmail.com
Commit 2252fcd427 modified some function prototypes in tableam.h
and heapam.h to take a VacuumParams argument instead of a pointer,
which required including vacuum.h in those headers. vacuum.h has a
reasonably large dependency tree, and headers like tableam.h are
widely included, so this is not ideal. To fix, change the
functions in question to accept a "const VacuumParams *" argument
instead. That allows us to use a forward declaration for
VacuumParams and avoid including vacuum.h. Since vacuum_rel()
needs to scribble on the params argument, we still pass it by value
to that function so that the original struct is not modified.
Reported-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/rzxpxod4c4la62yvutyrvgoyilrl2fx55djaf2suidy7np5m6c%403l2ln476eadh
The docs previously didn't explain that leaf and non-leaf keys
could be treated differently, even though many of our opclasses
do exactly that. It also wasn't explained how that relates to
the STORAGE option, particularly since only one storage type
can be specified for both leaf and non-leaf keys.
While here, reorganize the text slightly, rather than sticking
additional detail into what's supposed to be a brief summary
paragraph.
Author: Paul A Jungwirth <pj@illuminatedcomputing.com>
Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CA+renyWs5Np+FLSYfL+eu20S4U671A3fQGb-+7e22HLrD1NbYw@mail.gmail.com
When converting the WHERE clause in an element pattern,
generate_query_for_graph_path() calls replace_property_refs() to
replace the property references in it. Only the current graph element
pattern is passed as the context for replacement. If there are
references to variables from other element patterns, it causes a
segmentation fault (an assertion failure in an Assert enabled build)
since it does not find path_element object corresponding to those
variables.
We do not support forward and backward variable references within a
graph table clause. Hence prohibit all the cross references.
Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Reported-by: Man Zeng <zengman@halodbtech.com>
Reviewed-by: Henson Choi <assam258@gmail.com>
Reviewed-by: Junwang Zhao <zhjwpku@gmail.com>
Discussion: https://www.postgresql.org/message-id/CAExHW5u6AoDfNg4%3DR5eVJn_bJn%3DC%3DwVPrto02P_06fxy39fniA%40mail.gmail.com
When a ColumnRef can be resolved as a graph table property reference
and a lateral table column reference prefer the graph table property
reference since element pattern variables in the GRAPH_TABLE clause
form the innermost namespace.
Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Reviewed-by: Henson Choi <assam258@gmail.com>
Discussion: https://www.postgresql.org/message-id/CAExHW5u6AoDfNg4%3DR5eVJn_bJn%3DC%3DwVPrto02P_06fxy39fniA%40mail.gmail.com
XLOG_CHECKPOINT_REDO only contains the wal_level copied straight in
without an encapsulating record structure. While it works, it makes
future uses of XLOG_CHECKPOINT_REDO hard as there is nowhere to put
new data items. This fix this was inspired by the online checksums
patch which adds data to this record, but this change has value on
its own.
Author: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: https://postgr.es/m/c92b5d8b-bc03-47bc-b209-2e4a719eee32@iki.fi