postgresql

mirror of https://github.com/postgres/postgres.git synced 2026-07-16 13:12:58 -04:00

Author	SHA1	Message	Date
Michael Paquier	2d4ead6f4b	Fix order of columns in pg_stat_recovery recovery_last_xact_time is listed before current_chunk_start_time in the documentation, the function definition and the view definition, but their order was reversed in the code. Thinko in `01d485b142`. Mea culpa. Author: Shinya Kato <shinya11.kato@gmail.com> Discussion: https://postgr.es/m/CAOzEurQQ1naKmPJhfE5WOUQjtf5tu08Kw3QCGY5UY=7Rt9fE=w@mail.gmail.com	2026-03-06 14:41:41 +09:00
Amit Kapila	f1ddaa1535	Fix inconsistent elevel in pg_sync_replication_slots() retry logic. The commit `0d2d4a0ec3` allowed pg_sync_replication_slots() to retry sync attempts, but missed a case, when WAL prior to a slot's confirmed_flush_lsn is not yet flushed locally. By changing the elevel from ERROR to LOG, we allow the sync loop to continue. This provides the opportunity for the slot to be synchronized once the standby catches up with the necessary WAL. Author: Zhijie Hou <houzj.fnst@fujitsu.com> Reviewed-by: shveta malik <shveta.malik@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/CAFPTHDZAA+gWDntpa5ucqKKba41=tXmoXqN3q4rpjO9cdxgQrw@mail.gmail.com	2026-03-06 10:51:32 +05:30
Michael Paquier	01d485b142	Add system view pg_stat_recovery This commit introduces pg_stat_recovery, that exposes at SQL level the state of recovery as tracked by XLogRecoveryCtlData in shared memory, maintained by the startup process. This new view includes the following fields, that are useful for monitoring purposes on a standby, once it has reached a consistent state (making the execution of the SQL function possible): - Last-successfully replayed WAL record LSN boundaries and its timeline. - Currently replaying WAL record end LSN and its timeline. - Current WAL chunk start time. - Promotion trigger state. - Timestamp of latest processed commit/abort. - Recovery pause state. Some of this data can already be recovered from different system functions, but not all of it. See pg_get_wal_replay_pause_state or pg_last_xact_replay_timestamp. This new view offers a stronger consistency guarantee, by grabbing the recovery state for all fields through one spinlock acquisition. The system view relies on a new function, called pg_stat_get_recovery(). Querying this data requires the pg_read_all_stats privilege. The view returns no rows if the node is not in recovery. This feature originates from a suggestion I have made while discussing the addition of a CONNECTING state to the WAL receiver's shared memory state, because we lacked access to some of the state data. The author has taken the time to implement it, so thanks for that. Bump catalog version. Author: Xuneng Zhou <xunengzhou@gmail.com> Discussion: https://postgr.es/m/CABPTF7W+Nody-+P9y4PNk37-QWuLpfUrEonHuEhrX+Vx9Kq+Kw@mail.gmail.com Discussion: https://postgr.es/m/aW13GJn_RfTJIFCa@paquier.xyz	2026-03-06 12:37:40 +09:00
Michael Paquier	42a12856a6	Refactor code retrieving string for RecoveryPauseState This refactoring is going to be useful in an upcoming commit, to avoid some code duplication with the function pg_get_wal_replay_pause_state(), that returns a string for the recovery pause state. Refactoring opportunity noticed while hacking on a different patch. Discussion: https://postgr.es/m/CABPTF7W+Nody-+P9y4PNk37-QWuLpfUrEonHuEhrX+Vx9Kq+Kw@mail.gmail.com	2026-03-06 11:53:23 +09:00
Tom Lane	f95d73ed43	Simplify creation of built-in functions with non-default ACLs. Up to now, to create such a function, one had to make a pg_proc.dat entry and then modify it with GRANT/REVOKE commands, which we put in system_functions.sql. That seems a little ugly though, because it violates the idea of having a single source of truth about the initial contents of pg_proc, and it results in leaving dead rows in the initial contents of pg_proc. This patch improves matters by allowing aclitemin to work during early bootstrap, before pg_authid has been loaded. On the same principle that we use for early access to pg_type details, put a table of known built-in role names into bootstrap.c, and use that in bootstrap mode. To create a built-in function with a non-default ACL, one should write the desired ACL list in its pg_proc.dat entry, using a simplified version of aclitemout's notation: omit the grantor (if it is the bootstrap superuser, which it pretty much always should be) and spell the bootstrap superuser's name as POSTGRES, similarly to the notation used elsewhere in src/include/catalog. This results in entries like proacl => '{POSTGRES=X,pg_monitor=X}' which shows that we've revoked public execute permissions and instead granted that to pg_monitor. In addition to fixing up pg_proc.dat entries, I got rid of some role grants that had been stuck into system_functions.sql, and instead put them into a new file pg_auth_members.dat; that seems like a far less random place to put the information. The correctness of the data changes can be verified by comparing the initial contents of pg_proc and pg_auth_members before and after. pg_proc should match exactly, but the OID column of pg_auth_members will probably be different because those OIDs now get assigned a little earlier in bootstrap. (I forced a catversion bump out of caution, but it wasn't really necessary.) Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/183292bb-4891-4c96-a3ca-e78b5e0e1358@dunslane.net	2026-03-05 17:43:09 -05:00
Tom Lane	7664319ccb	Be more wary of false matches in initdb's replace_token(). Do not replace the target string unless the occurrence is surrounded by whitespace or line start/end. This avoids potential false match to a substring of a field. While we've not had trouble with that up to now, the next patch creates hazards of false matches to POSTGRES within an ACL field. There is one call site that needs adjustment, as it was presuming it could write "::1" and have that match "::1/128". For all the others, this restriction is okay and strictly safer. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/183292bb-4891-4c96-a3ca-e78b5e0e1358@dunslane.net	2026-03-05 17:22:31 -05:00
Melanie Plageman	34cb4254bd	Prefix PruneState->all_{visible,frozen} with set_ The PruneState had members called "all_visible" and "all_frozen" which reflect not the current state of the page but the state it could be in once pruning and freezing have been executed. These are then saved in the PruneFreezeResult so the caller can set the VM accordingly. Prefix the PruneState members as well as the corresponding PruneFreezeResult members with "set_" to clarify that they represent the proposed state of the all-visible and all-frozen bits for a heap page in the visibility map, not the current state. Author: Melanie Plageman <melanieplageman@gmail.com> Suggested-by: Andres Freund <andres@anarazel.de> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/bqc4kh5midfn44gnjiqez3bjqv4zogydguvdn446riw45jcf3y%404ez66il7ebvk	2026-03-05 16:55:00 -05:00
Melanie Plageman	68c2dcb913	Add PageGetPruneXid() helper This is similar to the other page accessors in bufpage.h. It improves readability and avoids long lines. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/BD8B69E7-26D8-4706-9164-597C6AE57812%40gmail.com	2026-03-05 16:22:57 -05:00
Melanie Plageman	59663e4207	Move commonly used context into PruneState and simplify helpers heap_page_prune_and_freeze() and many of its helpers use the heap buffer, block number, and page. Other helpers took the heap page and didn't use it. Initializing these values once during prune_freeze_setup() simplifies the helpers' interfaces and avoids any repeated calls to BufferGetBlockNumber() and BufferGetPage(). While updating PruneState, also reorganize its fields to make the layout and member documentation more consistent. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/BD8B69E7-26D8-4706-9164-597C6AE57812%40gmail.com	2026-03-05 16:10:29 -05:00
Tom Lane	ac0accafd6	Exit after fatal errors in client-side compression code. It looks like whoever wrote the astreamer (nee bbstreamer) code thought that pg_log_error() is equivalent to elog(ERROR), but it's not; it just prints a message. So all these places tried to continue on after a compression or decompression error return, with the inevitable result being garbage output and possibly cascading error messages. We should use pg_fatal() instead. These error conditions are probably pretty unlikely in practice, which no doubt accounts for the lack of field complaints. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/1531718.1772644615@sss.pgh.pa.us Backpatch-through: 15	2026-03-05 14:43:21 -05:00
Jacob Champion	a6483f5ac9	oauth: Add TLS support for oauth_validator tests The oauth_validator tests don't currently support HTTPS, which makes testing PGOAUTHCAFILE difficult. Add a localhost certificate to src/test/ssl and make use of it in oauth_server.py. In passing, explain the hardcoded use of IPv4 in our issuer identifier, after intermittent failures on NetBSD led to commit `8d9d5843b`. (The new certificate is still set up for IPv6, to make it easier to improve that behavior in the future.) Patch by Jonathan Gonzalez V., with some additional tests and tweaks by me. Author: Jonathan Gonzalez V. <jonathan.abdiel@gmail.com> Discussion: https://postgr.es/m/8a296a2c128aba924bff0ae48af2b88bf8f9188d.camel@gmail.com	2026-03-05 10:04:53 -08:00
Jacob Champion	b8d7685835	libpq: Add PQgetThreadLock() to mirror PQregisterThreadLock() Allow libpq clients to retrieve the current pg_g_threadlock pointer with PQgetThreadLock(). Single-threaded applications could already do this in a convoluted way: pgthreadlock_t tlock; tlock = PQregisterThreadLock(NULL); PQregisterThreadLock(tlock); /* re-register the callback */ /* use tlock */ But a generic library can't do that without potentially breaking concurrent libpq connections. The motivation for doing this now is the libpq-oauth plugin, which currently relies on direct injection of pg_g_threadlock, and should ideally not. Reviewed-by: Zsolt Parragi <zsolt.parragi@percona.com> Discussion: https://postgr.es/m/CAOYmi%2BmEU_q9sr1PMmE-4rLwFN%3DOjyndDwFZvpsMU3RNJLrM9g%40mail.gmail.com Discussion: https://postgr.es/m/CAOYmi%2B%3DMHD%2BWKD4rsTn0v8220mYfyLGhEc5EfhmtqrAb7SmC5g%40mail.gmail.com	2026-03-05 10:04:48 -08:00
Jacob Champion	f8c0b91a60	oauth: Report cleanup errors as warnings on stderr Using conn->errorMessage for these "shouldn't-happen" cases will only work if the connection itself fails. Our SSL and password callbacks print WARNINGs when they find themselves in similar situations, so follow their lead. Reviewed-by: Zsolt Parragi <zsolt.parragi@percona.com> Discussion: https://postgr.es/m/CAOYmi%2BmEU_q9sr1PMmE-4rLwFN%3DOjyndDwFZvpsMU3RNJLrM9g%40mail.gmail.com	2026-03-05 10:04:36 -08:00
Alexander Korotkov	177037341a	Fix handling of updated tuples in the MERGE statement This branch missed the IsolationUsesXactSnapshot() check. That led to EPQ on repeatable read and serializable isolation levels. This commit fixes the issue and provides a simple isolation check for that. Backpatch through v15 where MERGE statement was introduced. Reported-by: Tender Wang <tndrwang@gmail.com> Discussion: https://postgr.es/m/CAPpHfdvzZSaNYdj5ac-tYRi6MuuZnYHiUkZ3D-AoY-ny8v%2BS%2Bw%40mail.gmail.com Author: Tender Wang <tndrwang@gmail.com> Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com> Backpatch-through: 15	2026-03-05 19:49:28 +02:00
Fujii Masao	bffd7130e9	Improve validation of recovery_target_xid GUC values. Previously, the recovery_target_xid GUC values were not sufficiently validated. As a result, clearly invalid inputs such as the string "bogus", a decimal value like "1.1", or 0 (a transaction ID smaller than the minimum valid value of 3) were unexpectedly accepted. In these cases, the value was interpreted as transaction ID 0, which could cause recovery to behave unexpectedly. This commit improves validation of recovery_target_xid GUC so that invalid values are rejected with an error. This prevents recovery from proceeding with misconfigured recovery_target_xid settings. Also this commit updates the documentation to clarify the allowed values for recovery_target_xid GUC. Author: David Steele <david@pgbackrest.org> Reviewed-by: Hüseyin Demir <huseyin.d3r@gmail.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/f14463ab-990b-4ae9-a177-998d2677aae0@pgbackrest.org	2026-03-05 21:40:32 +09:00
Fujii Masao	9b0e5bd532	doc: Clarify that COLUMN is optional in ALTER TABLE ... ADD/DROP COLUMN. In ALTER TABLE ... ADD/DROP COLUMN, the COLUMN keyword is optional. However, part of the documentation could be read as if COLUMN were required, which may mislead users about the command syntax. This commit updates the ALTER TABLE documentation to clearly state that COLUMN is optional for ADD and DROP. Also this commit adds regression tests covering ALTER TABLE ... ADD/DROP without the COLUMN keyword. Backpatch to all supported versions. Author: Chao Li <lic@highgo.com> Reviewed-by: Robert Treat <rob@xzilla.net> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/CAEoWx2n6ShLMOnjOtf63TjjgGbgiTVT5OMsSOFmbjGb6Xue1Bw@mail.gmail.com Backpatch-through: 14	2026-03-05 12:55:52 +09:00
Michael Paquier	5f8124a0cf	Move definition of XLogRecoveryCtlData to xlogrecovery.h XLogRecoveryCtlData is the structure that stores the shared-memory state of WAL recovery, including information such as promotion requests, the timeline ID (TLI), and the LSNs of replayed records. This refactoring is independently useful because it allows code outside of core to access the recovery state in live. It will be used by an upcoming patch that introduces a SQL function for querying this information, that can be accessed on a standby once a consistent state has been reached. This only moves code around, changing nothing functionally. Author: Xuneng Zhou <xunengzhou@gmail.com> Discussion: https://postgr.es/m/CABPTF7W+Nody-+P9y4PNk37-QWuLpfUrEonHuEhrX+Vx9Kq+Kw@mail.gmail.com	2026-03-05 12:17:47 +09:00
Michael Paquier	ea4744782b	Fix rare instability in recovery TAP test 004_timeline_switch This fixes a problem similar to `ad8c86d22c`. In this case, the test could fail under the following circumstances: - The primary is stopped with teardown_node(), meaning that it may not be able to send all its WAL records to standby_1 and standby_2. - If standby_2 receives more records than standby_1, attempting to reconnect standby_2 to the promoted standby_1 would fail because of a timeline fork. This race condition is fixed with a simple trick: instead of tearing down the primary, it is stopped cleanly so as all the WAL records of the primary are received and flushed by both standby_1 and standby_2. Once we do that, there is no need for a wait_for_catchup() before stopping the node. The test wants to check that a timeline jump can be achieved when reconnecting a standby to a promoted standby in the same cluster, hence an immediate stop of the primary is not required. This failure is harder to reach than the previous instability of 009_twophase, still the buildfarm has been able to detect this failure at least once. I have tried Alexander Lakhin's test trick with the bgwriter and very aggressive standby snapshots, but I could not reproduce it directly. It is reachable, as the buildfarm has proved. Backpatch down to all supported branches, and this problem can lead to spurious failures in the buildfarm. Discussion: https://postgr.es/m/493401a8-063f-436a-8287-a235d9e065fc@gmail.com Backpatch-through: 14	2026-03-05 10:05:44 +09:00
Michael Paquier	34dfca2934	Change default value of default_toast_compression to "lz4", take two The default value for default_toast_compression was "pglz". The main reason for this choice is that this option is always available, pglz code being embedded in Postgres. However, it is known that LZ4 is more efficient than pglz: less CPU required, more compression on average. As of this commit, the default value of default_toast_compression becomes "lz4", if available. By switching to LZ4 as the default, users should see natural speedups on TOAST data reads and/or writes. Support for LZ4 in TOAST compression was added in Postgres v14, or 5 releases ago. This should be long enough to consider this feature as stable. While at it, quotes are removed from default_toast_compression in postgresql.conf.sample. Quotes are not required in this case. The in-place value replacement done by initdb if the build supports LZ4 would not use them in the postgresql.conf file added to a freshly-initialized cluster. Note that this is a version lighter than `7c1849311e`, that included a replacement of --with-lz4 by --without-lz4 in configure builds, forcing a requirement for LZ4 in all environments. The buildfarm did not like it, at all. This commit switches default_toast_compression to lz4 as default only when --with-lz4 is defined, which should keep the buildfarm at bay while still allowing users to benefit from LZ4 compression in TOAST as long as the code is compiled with it. Author: Euler Taveira <euler@eulerto.com> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Reviewed-by: Aleksander Alekseev <aleksander@tigerdata.com> Discussion: https://posgr.es/m/435df33a-129e-4f0c-a803-f3935c5a5ecb@eisentraut.org	2026-03-05 09:24:35 +09:00
Michael Paquier	4f0b3afab4	Revert "Change default value of default_toast_compression to "lz4"" This reverts commit `7c1849311e`, due to the fact that more than 60% of the buildfarm members do not have lz4 installed. As we are in the last commit fest of the development cycle, and that it could take a couple of weeks to stabilize things, this change is reverted for now. This commit will be reworked in a lighter version, as default_toast_compression's default can be changed to "lz4" without the switch from --with-lz4 to --without-lz4. This approach will keep the buildfarm at bay, and still allow builds to take advantage of LZ4 in TOAST by default, as long as the code is compiled with LZ4 support. A harder requirement based on LZ4 should be achievable at some point, but it is going to require some work from the buildfarm owners first. Perhaps this part could be revisited at the beginning of the next development cycle. Discussion: https://postgr.es/m/CAOYmi+meTT0NbLbnVqOJD5OKwCtHL86PQ+RZZTrn6umfmHyWaw@mail.gmail.com	2026-03-05 08:25:35 +09:00
Andrew Dunstan	3c19983cc0	pg_restore: add --no-globals option to skip globals This is a followup to commit `763aaa06f0` Add non-text output formats to pg_dumpall. Add a --no-globals option to pg_restore that skips restoring global objects (roles and tablespaces) when restoring from a pg_dumpall archive. When -C/--create is not specified, databases that do not already exist on the target server are also skipped. This is useful when restoring only specific databases from a pg_dumpall archive without needing the global objects to be restored first. Author: Mahendra Singh Thalor <mahi6run@gmail.com> With small tweaks by me. Discussion: https://postgr.es/m/CAKYtNArdcc5kx1MdTtTKFNYiauo3=zCA-NB0LmBCW-RU_kSb3A@mail.gmail.com	2026-03-04 16:53:29 -05:00
Andrew Dunstan	c7572cd48d	Improve writing map.dat preamble Fix code from commit `763aaa06f0` Suggestion from Alvaro Herrera following a bug discovered by Coverity.	2026-03-04 16:08:04 -05:00
Andrew Dunstan	01c729e0c7	Fix casting away const-ness in pg_restore.c This was introduced in commit `763aaa06f0` per gripe from Peter Eisentraut. Author: Mahendra Singh Thalor <mahi6run@gmail.com> Slightly tweaked by me. Discussion: https://postgr.es/m/016819c0-666e-42a8-bfc8-2b93fd8d0176@eisentraut.org	2026-03-04 15:54:02 -05:00
Tom Lane	e6a1d8f5ac	Fix estimate_hash_bucket_stats's correction for skewed data. The previous idea was "scale up the bucketsize estimate by the ratio of the MCV's frequency to the average value's frequency". But we should have been suspicious of that plan, since it frequently led to impossible (> 1) values which we had to apply an ad-hoc clamp to. Joel Jacobson demonstrated that it sometimes leads to making the wrong choice about which side of the hash join should be inner. Instead, drop the whole business of estimating average frequency, and just clamp the bucketsize estimate to be at least the MCV's frequency. This corresponds to the bucket size we'd get if only the MCV appears in a bucket, and the MCV's frequency is not affected by the WHERE-clause filters. (We were already making the latter assumption.) This also matches the coding used since `4867d7f62` in the case where only a default ndistinct estimate is available. Interestingly, this change affects no existing regression test cases. Add one to demonstrate that it helps pick the smaller table to be hashed when the MCV is common enough to affect the results. This leaves estimate_hash_bucket_stats not considering the effects of null join keys at all, which we should probably improve. However, I have a different patch in the queue that will change the executor's handling of null join keys, so it seems appropriate to wait till that's in before doing anything more here. Reported-by: Joel Jacobson <joel@compiler.org> Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Joel Jacobson <joel@compiler.org> Discussion: https://postgr.es/m/341b723c-da45-4058-9446-1514dedb17c1@app.fastmail.com	2026-03-04 15:33:15 -05:00
Tom Lane	c70f6dc6bd	Fix yet another bug in archive streamer with LZ4 decompression. The code path in astreamer_lz4_decompressor_content() that updated the output pointers when the output buffer isn't full was wrong. It advanced next_out by bytes_written, which could include previous decompression output not just that of the current cycle. The correct amount to advance is out_size. While at it, make the output pointer updates look more like the input pointer updates. This bug is pretty hard to reach, as it requires consecutive compression frames that are too small to fill the output buffer. pg_dump could have produced such data before `66ec01dc4`, but I'm unsure whether any files we use astreamer with would be likely to contain problematic data. Author: Chao Li <lic@highgo.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/0594CC79-1544-45DD-8AA4-26270DE777A7@gmail.com Backpatch-through: 15	2026-03-04 12:08:37 -05:00
Álvaro Herrera	ce4fbe1ac6	Don't malloc(0) in EventTriggerCollectAlterTSConfig Author: Florin Irion <florin.irion@enterprisedb.com> Discussion: https://postgr.es/m/c6fff161-9aee-4290-9ada-71e21e4d84de@gmail.com	2026-03-04 15:04:53 +01:00
Amit Kapila	fd366065e0	Allow table exclusions in publications via EXCEPT TABLE. Extend CREATE PUBLICATION ... FOR ALL TABLES to support the EXCEPT TABLE syntax. This allows one or more tables to be excluded. The publisher will not send the data of excluded tables to the subscriber. To support this, pg_publication_rel now includes a prexcept column to flag excluded relations. For partitioned tables, the exclusion is applied at the root level; specifying a root table excludes all current and future partitions in that tree. Follow-up work will implement ALTER PUBLICATION support for managing these exclusions. Author: vignesh C <vignesh21@gmail.com> Author: Shlok Kyal <shlok.kyal.oss@gmail.com> Reviewed-by: shveta malik <shveta.malik@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Peter Smith <smithpb2250@gmail.com> Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com> Reviewed-by: Zhijie Hou <houzj.fnst@fujitsu.com> Reviewed-by: Nisha Moond <nisha.moond412@gmail.com> Reviewed-by: David G. Johnston <david.g.johnston@gmail.com> Reviewed-by: Ashutosh Sharma <ashu.coek88@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Andrei Lepikhov <lepihov@gmail.com> Discussion: https://postgr.es/m/CALDaNm3=JrucjhiiwsYQw5-PGtBHFONa6F7hhWCXMsGvh=tamA@mail.gmail.com	2026-03-04 15:56:48 +05:30
Heikki Linnakangas	fe08113aef	Add test for row-locking and multixids with prepared transactions This is a repro for the issue fixed in commit `ccae90abdb`. Backpatch to v17 like that commit, although that's a little arbitrary as this test would work on older versions too. Author: Sami Imseih <samimseih@gmail.com> Discussion: https://www.postgresql.org/message-id/CAA5RZ0twq5bNMq0r0QNoopQnAEv+J3qJNCrLs7HVqTEntBhJ=g@mail.gmail.com Backpatch-through: 17	2026-03-04 11:29:02 +02:00
Heikki Linnakangas	19615a44b3	Skip prepared_xacts test if max_prepared_transactions < 2 This reduces maintenance overhead, as we no longer need to update the dummy expected output file every time the .sql file changes. Discussion: https://www.postgresql.org/message-id/1009073.1772551323@sss.pgh.pa.us Backpatch-through: 14	2026-03-04 11:06:43 +02:00
Michael Paquier	ad8c86d22c	Fix rare instability in recovery TAP test 009_twophase The phase of the test where we want to check that 2PC transactions prepared on a primary can be committed on a promoted standby relied on an immediate stop of the primary. This logic has a race condition: it could be possible that some records (most likely standby snapshot records) are generated on the primary before it finishes its shutdown, without the promoted standby knowing about them. When the primary is recycled as new standby, the test could fail because of a timeline fork as an effect of these extra records. This fix takes care of the instability by doing a clean stop of the primary instead of a teardown (aka immediate stop), so as all records generated on the primary are sent to the promoted standby and flushed there. There is no need for a teardown of the primary in this test scenario: the commit of 2PC transactions on a promoted standby does not care about the state of the primary, only of the standby. This race is very hard to hit in practice, even slow buildfarm members like skink have a very low rate of reproduction. Alexander Lakhin has come up with a recipe to improve the reproduction rate a lot: - Enable -DWAL_DEBUG. - Patch the bgwriter so as standby snapshots are generated every millisecond. - Run 009_twophase tests under heavy parallelism. With this method, the failure appears after a couple of iterations. With the fix in place, I have been able to run more than 50 iterations of the parallel test sequence, without seeing a failure. Issue introduced in `30820982b2`, due to a copy-pasto coming from the surrounding tests. Thanks also to Hayato Kuroda for digging into the details of the failure. He has proposed a fix different than the one of this commit. Unfortunately, it relied on injection points, feature only available in v17. The solution of this commit is simpler, and can be applied to v14~v16. Reported-by: Alexander Lakhin <exclusion@gmail.com> Discussion: https://postgr.es/m/b0102688-6d6c-c86a-db79-e0e91d245b1a@gmail.com Backpatch-through: 14	2026-03-04 16:30:51 +09:00
Michael Paquier	7c1849311e	Change default value of default_toast_compression to "lz4", when available The default value for default_toast_compression was "pglz". The main reason for this choice is that this option is always available, pglz code being embedded in Postgres. However, it is known that LZ4 is more efficient than pglz: less CPU required, more compression on average. As of this commit, the default value of default_toast_compression becomes "lz4", if available. By switching to LZ4 as the default, users should see natural speedups on TOAST data reads and/or writes. Support for LZ4 in TOAST compression was added in Postgres v14, or 5 releases ago. This should be long enough to consider this feature as stable. --with-lz4 is removed, replaced by a --without-lz4 to disable LZ4 in the builds on an option-basis, following a practice similar to readline or ICU. References to --with-lz4 are removed from the documentation. While at it, quotes are removed from default_toast_compression in postgresql.conf.sample. Quotes are not required in this case. The in-place value replacement done by initdb if the build supports LZ4 would not use them in the postgresql.conf file added to a freshly-initialized cluster. For the reference, a similar switch has been done with ICU in `fcb21b3acd`. Some of the changes done in this commit are consistent with that. Note: this is going to create some disturbance in the buildfarm, in environments where lz4 is not installed. Author: Euler Taveira <euler@eulerto.com> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Reviewed-by: Aleksander Alekseev <aleksander@tigerdata.com> Discussion: https://posgr.es/m/435df33a-129e-4f0c-a803-f3935c5a5ecb@eisentraut.org	2026-03-04 13:05:31 +09:00
Richard Guo	1f4f87d794	Remove redundant restriction checks in apply_child_basequals In apply_child_basequals, after translating a parent relation's restriction quals for a child relation, we simplify each child qual by calling eval_const_expressions. Historically, the code then called restriction_is_always_false and restriction_is_always_true to reduce NullTest quals that are provably false or true. However, since commit `e2debb643`, the planner natively performs NullTest deduction during constant folding. Therefore, calling restriction_is_always_false and restriction_is_always_true immediately afterward is redundant and wastes CPU cycles. We can safely remove them and simply rely on the constant folding to handle the deduction. Author: Richard Guo <guofenglinux@gmail.com> Discussion: https://postgr.es/m/CAMbWs4-vLmGXaUEZyOMacN0BVfqWCt2tM-eDVWdDfJnOQaauGg@mail.gmail.com	2026-03-04 10:57:43 +09:00
Richard Guo	ce1c17a316	Remove obsolete SAMESIGN macro The SAMESIGN macro was historically used as a helper for manual integer overflow checks. However, since commit `4d6ad3125` introduced overflow-aware integer operations, this manual sign-checking logic is no longer necessary. The macro remains defined in brin_minmax_multi.c and timestamp.c, but is not used in either file. This patch removes these definitions to clean things up. Author: Richard Guo <guofenglinux@gmail.com> Discussion: https://postgr.es/m/CAMbWs4-NL3J3hQ3LzrwV-YUkQC18P+jM7ZiegQyAHzgdZev2qg@mail.gmail.com	2026-03-04 10:56:06 +09:00
Michael Paquier	9ef6381829	Add some tests for CREATE OR REPLACE VIEW with column additions When working on an already-defined view with matching attributes, CREATE OR REPLACE VIEW would internally generate an ALTER TABLE command with a set of AT_AddColumnToView sub-commands, one for each attribute added. Such a command is stored in event triggers twice: - Once as a simple command. - Once as an ALTER TABLE command, as it has sub-commands. There was no test coverage to track this command pattern in terms of event triggers and DDL deparsing: - For the test module test_ddl_deparse, two command notices are issued. - For event triggers, a CREATE VIEW command is logged twice, which may look a bit weird first, but again this maps with the internal behavior of how the commands are built, and how the event trigger code reacts in terms of commands gathered. While on it, this adds a test for CREATE SCHEMA with a CREATE VIEW command embedded in it, case supported by the grammar but not covered yet. This hole in the test coverage has been found while digging into what would be a similar behavior for sequences if adding attributes to them with ALTER TABLE variants, after the initial relation creation. Discussion: https://postgr.es/m/aaFG9bqkEn0RhLJG@paquier.xyz	2026-03-04 09:55:58 +09:00
Melanie Plageman	38229cb905	Add read_stream_{pause,resume}() Read stream users can now pause lookahead when no blocks are currently available. After resuming, subsequent read_stream_next_buffer() calls continue lookahead with the previous lookahead distance. This is especially useful for read stream users with self-referential access patterns (where consuming already-read buffers can produce additional block numbers). Author: Thomas Munro <thomas.munro@gmail.com> Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com> Discussion: https://postgr.es/m/CA%2BhUKGJLT2JvWLEiBXMbkSSc5so_Y7%3DN%2BS2ce7npjLw8QL3d5w%40mail.gmail.com	2026-03-03 16:03:09 -05:00
Peter Eisentraut	b30656ce00	doc: Add restart on failure to example systemd file The documentation previously had a systemd unit file that would not attempt to recover from process failures such as OOM's, segfaults, etc. This commit adds "Restart=on-failure", which tells systemd to attempt to restart the process after failure. This is the recommended configuration per the systemd documentation: "Setting this to on-failure is the recommended choice for long-running services". Many PostgreSQL users will simply copy/paste what the PostgreSQL documentation recommends and will probably do their own research and change the service file to restart on failure, so might as well set this as the default in the PostgreSQL documentation. Author: Andrew Jackson <andrewjackson947@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CAKK5BkFfMpAQnv8CLs%3Di%3DrZwurtCV_gmfRb0uZi-V%2Bd6wcryqg%40mail.gmail.com	2026-03-03 13:18:53 +01:00
Álvaro Herrera	cece37c984	Reduce scope of for-loop-local variables to avoid shadowing Adjust a couple of for-loops where a local variable was shadowed by another in the same scope, by renaming it as well as reducing its scope to the containing for-loop. Author: Chao Li <lic@highgo.com> Reviewed-by: Peter Smith <smithpb2250@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/CAEoWx2kQ2x5gMaj8tHLJ3=jfC+p5YXHkJyHrDTiQw2nn2FJTmQ@mail.gmail.com	2026-03-03 11:24:11 +01:00
Peter Eisentraut	f2d7570cdd	Reduce the scope of volatile qualifiers Commit `c66a7d75e6` introduced a new "cast discards ‘volatile’" warning (-Wcast-qual) in vac_truncate_clog(). Instead of making use of unvolatize(), remove the warning by reducing the scope of the volatile qualifier (added in commit `2d2e40e3be`) to only 2 fields. Also do the same for vac_update_datfrozenxid(), since the intent of commit `f65ab862e3` was to prevent the same kind of race condition that commit `2d2e40e3be` was fixing. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Suggested-by: Peter Eisentraut <peter@eisentraut.org> Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/aZ3a%2BV82uSfEjDmD%40ip-10-97-1-34.eu-west-3.compute.internal	2026-03-03 10:02:28 +01:00
Peter Eisentraut	2a525cc97e	Add COPY (on_error set_null) option If ON_ERROR SET_NULL is specified during COPY FROM, any data type conversion errors will result in the affected column being set to a null value. A column's not-null constraints are still enforced, and attempting to set a null value in such columns will raise a constraint violation error. This applies to a column whose data type is a domain with a NOT NULL constraint. Author: Jian He <jian.universality@gmail.com> Author: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com> Reviewed-by: Jim Jones <jim.jones@uni-muenster.de> Reviewed-by: "David G. Johnston" <david.g.johnston@gmail.com> Reviewed-by: Yugo NAGATA <nagata@sraoss.co.jp> Reviewed-by: torikoshia <torikoshia@oss.nttdata.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Atsushi Torikoshi <torikoshia@oss.nttdata.com> Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://www.postgresql.org/message-id/flat/CAKFQuwawy1e6YR4S%3Dj%2By7pXqg_Dw1WBVrgvf%3DBP3d1_aSfe_%2BQ%40mail.gmail.com	2026-03-03 07:37:12 +01:00
Michael Paquier	a1bd0c1615	doc: Fix sentence of pg_walsummary page Author: Peter Smith <smithpb2250@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Robert Treat <rob@xzilla.net> Discussion: https://postgr.es/m/CAHut+PvfYBL-ppX-i8DPeRu7cakYCZz+QYBhrmQzicx7z_Tj5w@mail.gmail.com Backpatch-through: 17	2026-03-03 15:27:50 +09:00
Fujii Masao	bae42a54e3	doc: Clarify that empty COMMENT string removes the comment. Clarify the documentation of COMMENT ON to state that specifying an empty string is treated as NULL, meaning that the comment is removed. This makes the behavior explicit and avoids possible confusion about how empty strings are handled. Also adds regress test cases that use empty string to remove a comment. Backpatch to all supported versions. Author: Chao Li <lic@highgo.com> Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-by: David G. Johnston <david.g.johnston@gmail.com> Reviewed-by: Shengbin Zhao <zshengbin91@gmail.com> Reviewed-by: Jim Jones <jim.jones@uni-muenster.de> Reviewed-by: zhangqiang <zhang_qiang81@163.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/26476097-B1C1-4BA8-AA92-0AD0B8EC7190@gmail.com Backpatch-through: 14	2026-03-03 14:45:52 +09:00
Michael Paquier	ba97bf9cb7	Add support for "exprs" in pg_restore_extended_stats() This commit adds support for the restore of extended statistics of the kind "exprs", counting for the statistics data computed for expressions. The input format consists of a jsonb object which must be an array of objects which are keyed by statistics parameter names, like this: [{"stat_type1": "...", "stat_type2": "...", ...}, {"stat_type1": "...", "stat_type2": "...", ...}, ...] The outer array must have as many elements as there are expressions defined in the statistics object, mapping with the way extended statistics are built with one pg_statistic tuple stored for each expression whose statistics have been computed. The elements of the array must be either objects or null values (equivalent of invalid data, case also supported by the stats computations when its data is inserted in the catalogs). The keys of the inner objects are names of the statistical columns in pg_stats_ext_exprs (i.e. everything after "inherited"). Not all parameter keys need to be provided, those omitted being silently ignored. Key values that do not match a statistical column name will cause a warning to be issued, but do not otherwise fail the expression or the import as a whole. The expected value type for all parameters is jbvString, which allows us to validate the values using the input function specific to that parameter. Any parameters with a null value are silently ignored, same as if they were not provided in the first place. This commit includes a battery of test cases: - Sanity checks for what-should-be-all the failures in restore code paths, including parsing errors, parameter sanity checks depending on the extended stats object definition, etc. - Value injection, for scalar, array, range, multi-range cases. - Stats data cloning, with differential checks between the source relation and its target. The source and the target should hold the same stats data after restore. 
- While expressions are supported in extended statistics since v14, range_length_histogram, range_empty_frac, and range_bounds_histogram have been added to pg_stats_ext_exprs only in v19. A test case has been added to emulate a dump taken from v18, with expression stats restored for a range data type where these three fields are NULL. Support for pg_dump is included, with expressions supported since v14, inherited since v15, and data for range types in expressions in v19. pg_upgrade is the main use-case of this feature; it is also possible to inject statistics, same as for the other extstat kinds. As of this commit, ANALYZE should not be required after pg_upgrade when the cluster being upgraded uses extended statistics, as MCV, dependencies, expressions and ndistinct stats are all covered. The stats data related to range types used in expressions requires v19, whose support has also been added. Author: Corey Huinker <corey.huinker@gmail.com> Co-authored-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CADkLM=fPcci6oPyuyEZ0F4bWqAA7HzaWO+ZPptufuX5_uWt6kw@mail.gmail.com	2026-03-03 14:19:54 +09:00
Jeff Davis	11171fe1fc	style: define parameterless functions as foo(void). Change pg_icu_unicode_version() to pg_icu_unicode_version(void), introduced by commit `af2d4ca191`. See commit `9b05e2ec08`, which fixed similar cases. Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/aaEhpwrj1FY/8/7n@ip-10-97-1-34.eu-west-3.compute.internal	2026-03-02 20:12:38 -08:00
Tom Lane	cdaa675658	Fix local-variable shadowing in pg_trgm's printSourceNFA(). We hadn't noticed this violation of -Wshadow=compatible-local because this function isn't compiled without -DTRGM_REGEXP_DEBUG. As long as we have to clean it up, let's do so by converting all this function's loops to use C99 loop-local control variables. Reported-by: Sergei Kornilov <sk@zsrv.org> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/3009911772478436@08341ecb-668d-43a9-af4d-b45f00c72521	2026-03-02 14:40:29 -05:00
Nathan Bossart	f191dc6766	Add commit `7b24959434` to .git-blame-ignore-revs.	2026-03-02 13:23:28 -06:00
Nathan Bossart	cc774c543b	basic_archive: Allow archive directory to be missing at startup. Presently, the GUC check hook for basic_archive.archive_directory checks that the specified directory exists. Consequently, if the directory does not exist at server startup, archiving will be stuck indefinitely, even if it appears later. To fix, remove this check from the hook so that archiving will resume automatically once the directory is present. basic_archive must already be prepared to deal with the directory disappearing at any time, so no additional special handling is required. Reported-by: Олег Самойлов <splarv@ya.ru> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Sergei Kornilov <sk@zsrv.org> Discussion: https://postgr.es/m/73271769675212%40mail.yandex.ru Backpatch-through: 15	2026-03-02 13:12:25 -06:00
Heikki Linnakangas	ccae90abdb	Fix OldestMemberMXactId and OldestVisibleMXactId array usage Commit `ab355e3a88` changed how the OldestMemberMXactId array is indexed. It's no longer indexed by synthetic dummyBackendId, but with ProcNumber. The PGPROC entries for prepared xacts come after auxiliary processes in the allProcs array, which rendered the calculation for MaxOldestSlot and the indexes into the array incorrect. (The OldestVisibleMXactId array is not used for prepared xacts, and thus never accessed with ProcNumber's greater than MaxBackends, so this only affects the OldestMemberMXactId array.) As a result, a prepared xact would store its value past the end of the OldestMemberMXactId array, overflowing into the OldestVisibleMXactId array. That could cause a transaction's row lock to appear invisible to other backends, or other such visibility issues. With a very small max_connections setting, the store could even go beyond the OldestVisibleMXactId array, stomping over the first element in the BufferDescriptor array. To fix, calculate the array sizes more precisely, and introduce helper functions to calculate the array indexes correctly. Author: Yura Sokolov <y.sokolov@postgrespro.ru> Reviewed-by: Sami Imseih <samimseih@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/7acc94b0-ea82-4657-b1b0-77842cb7a60c@postgrespro.ru Backpatch-through: 17	2026-03-02 19:19:22 +02:00
Álvaro Herrera	344b572e3e	psql: Tab-complete ALTER ROLE ... IN DATABASE SET/RESET Detailed completion of the RESET clause is still missing. Not sure a detailed implementation is worth the trouble. Author: Ian Lawrence Barwick <barwick@gmail.com> Author: Vasuki M <vasukianand0119@gmail.com> Reviewed-by: zengman <zengman@halodbtech.com> Reviewed-by: Dharin Shah <dharinshah95@gmail.com> Reviewed-by: Surya Poondla <suryapoondla4@gmail.com> Discussion: https://postgr.es/m/CAB8KJ=iH_v1YB2ss1A=BqvOAf28OVYiWRqUdE6TJ3pP-RdsPig@mail.gmail.com	2026-03-02 18:03:44 +01:00
Tom Lane	74b4438a70	In pg_dumpall, don't skip role GRANTs with dangling grantor OIDs. In commits `29d75b25b` et al, I made pg_dumpall's dumpRoleMembership logic treat a dangling grantor OID the same as dangling role and member OIDs: print a warning and skip emitting the GRANT. This wasn't terribly well thought out; instead, we should handle the case by emitting the GRANT without the GRANTED BY clause. When the source database is pre-v16, such cases are somewhat expected because those versions didn't prevent dropping the grantor role; so don't even print a warning that we did this. (This change therefore restores pg_dumpall's pre-v16 behavior for these cases.) The case is not expected in >= v16, so then we do print a warning, but soldiering on with no GRANTED BY clause still seems like a reasonable strategy. Per complaint from Robert Haas that we were now dropping GRANTs altogether in easily-reachable scenarios. Reported-by: Robert Haas <robertmhaas@gmail.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CA+TgmoauoiW4ydDhdrseg+DD4Kwha=+TSZp18BrJeHKx3o1Fdw@mail.gmail.com Backpatch-through: 16	2026-03-02 11:15:10 -05:00
Melanie Plageman	8b9d42bf6b	Save prune cycles by consistently clearing prune hints on all-visible pages All-visible pages can't contain prunable tuples. We already clear the prune hint (pd_prune_xid) during pruning of all-visible pages, but we were not doing so in vacuum phase three, nor initializing it for all-frozen pages created by COPY FREEZE, and we were not clearing it on standbys. Because page hints are not WAL-logged, pages on a standby carry stale pd_prune_xid values. After promotion, that stale hint triggers unnecessary on-access pruning. Fix this by clearing the prune hint everywhere we currently mark a heap page all-visible. Clearing it when setting PD_ALL_VISIBLE ensures no extra overhead. Author: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/flat/CAAKRu_b-BMOyu0X-0jc_8bWNSbQ5K6JTEueayEhcQuw-OkCSKg%40mail.gmail.com	2026-03-02 11:05:59 -05:00
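
The "exprs" input format described in commit ba97bf9cb7 above can be sketched outside the server as a small validator. This is only an illustration of the rules stated in that commit message, not the C implementation: `check_exprs` and `KNOWN_KEYS` are hypothetical names, `KNOWN_KEYS` is an illustrative subset of the pg_stats_ext_exprs columns, and rejecting a wrong-length array or a non-string value with an error is an assumption (the message only says what the input "must" look like).

```python
import json

# Illustrative subset of the statistical column names in pg_stats_ext_exprs
# (an assumption for this sketch; the real list is every column after "inherited").
KNOWN_KEYS = {"null_frac", "avg_width", "n_distinct"}


def check_exprs(payload, n_expressions):
    """Validate an "exprs" payload; return (cleaned, warnings).

    cleaned holds one entry per expression: a dict of accepted
    parameters, or None where the input element was null.
    """
    data = json.loads(payload)
    if not isinstance(data, list):
        raise ValueError("outer value must be an array")
    # One element per expression defined in the statistics object.
    # Treating a mismatch as a hard error is an assumption of this sketch.
    if len(data) != n_expressions:
        raise ValueError("array length must match the number of expressions")

    cleaned, warnings = [], []
    for i, elem in enumerate(data):
        if elem is None:
            # A null element stands for "no valid data" for that expression.
            cleaned.append(None)
            continue
        if not isinstance(elem, dict):
            raise ValueError(f"element {i} must be an object or null")
        kept = {}
        for key, value in elem.items():
            if key not in KNOWN_KEYS:
                # Unknown keys produce a warning but do not fail the import.
                warnings.append(f"element {i}: unrecognized key {key!r}")
                continue
            if value is None:
                # Null values are ignored, as if the key were omitted.
                continue
            if not isinstance(value, str):
                # Values are expected to be strings; erroring is an assumption.
                raise ValueError(f"element {i}: {key!r} must be a string")
            kept[key] = value
        cleaned.append(kept)
    return cleaned, warnings
```

For example, `check_exprs('[{"null_frac": "0.5", "bogus": "x"}, null]', 2)` keeps `null_frac` for the first expression, records one warning for `bogus`, and passes the null element through unchanged.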


63475 commits