postgresql

mirror of https://github.com/postgres/postgres.git synced 2026-04-14 05:27:20 -04:00

Author	SHA1	Message	Date
Amit Kapila	a1cdb81201	Update comments to reflect changes in `8e0d32a4a1`. Commit `8e0d32a4a1` fixed an issue by allowing the replication origin to be created while marking the table sync state as SUBREL_STATE_DATASYNC. Update the comment in check_old_cluster_subscription_state() to accurately describe this corrected behavior. Author: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Backpatch-through: 17, where the code was introduced Discussion: https://postgr.es/m/CAA4eK1+KaSf5nV_tWy+SDGV6MnFnKMhdt41jJjSDWm6yCyOcTw@mail.gmail.com Discussion: https://postgr.es/m/aUTekQTg4OYnw-Co@paquier.xyz	2025-12-24 09:55:03 +00:00
Amit Kapila	0ed8f1afb1	Don't advance origin during apply failure. The logical replication parallel apply worker could incorrectly advance the origin progress during an error or failed apply. This behavior risks transaction loss because such transactions will not be resent by the server. Commit `3f28b2fcac` addressed a similar issue for both the apply worker and the table sync worker by registering a before_shmem_exit callback to reset origin information. This prevents the worker from advancing the origin during transaction abortion on shutdown. This patch applies the same fix to the parallel apply worker, ensuring consistent behavior across all worker types. As with `3f28b2fcac`, we are backpatching through version 16, since parallel apply mode was introduced there and the issue only occurs when changes are applied before the transaction end record (COMMIT or ABORT) is received. Author: Hou Zhijie <houzj.fnst@fujitsu.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Backpatch-through: 16 Discussion: https://postgr.es/m/TY4PR01MB169078771FB31B395AB496A6B94B4A@TY4PR01MB16907.jpnprd01.prod.outlook.com Discussion: https://postgr.es/m/TYAPR01MB5692FAC23BE40C69DA8ED4AFF5B92@TYAPR01MB5692.jpnprd01.prod.outlook.com	2025-12-24 04:05:06 +00:00
Heikki Linnakangas	bb87d7fef1	Fix bug in following update chain when locking a heap tuple After waiting for a concurrent updater to finish, heap_lock_tuple() followed the update chain to lock all tuple versions. However, when stepping from the initial tuple to the next one, it failed to check that the next tuple's XMIN matches the initial tuple's XMAX. That's an important check whenever following an update chain, and the recursive part that follows the chain did it, but the initial step missed it. Without the check, if the updating transaction aborts and the updated tuple is vacuumed away and replaced by an unrelated tuple, the unrelated tuple might get incorrectly locked. Author: Jasper Smit <jasper.smit@servicenow.com> Discussion: https://www.postgresql.org/message-id/CAOG+RQ74x0q=kgBBQ=mezuvOeZBfSxM1qu_o0V28bwDz3dHxLw@mail.gmail.com Backpatch-through: 14	2025-12-23 13:37:25 +02:00
Tom Lane	4c9f262ba8	Add missing .gitignore for src/test/modules/test_cloexec.	2025-12-23 15:00:13 +09:00
Michael Paquier	e063ccc722	Fix orphaned origin in shared memory after DROP SUBSCRIPTION Since `ce0fdbfe97`, a replication slot and an origin are created by each tablesync worker, whose information is stored in both a catalog and shared memory (once the origin is set up in the latter case). The transaction where the origin is created is the same as the one that runs the initial COPY, with the catalog state of the origin becoming visible for other sessions only once the COPY transaction has committed. The catalog state is coupled with a state in shared memory, initialized at the same time as the origin created in the catalogs. Note that the transaction doing the initial data sync can take a long time, time that depends on the amount of data to transfer from a publication node to its subscriber node. Now, when a DROP SUBSCRIPTION is executed, all its workers are stopped with the origins removed. The removal of each origin relies on a catalog lookup. A worker still running the initial COPY would fail its transaction, with the catalog state of the origin rolled back while the shared memory state remains around. The session running the DROP SUBSCRIPTION should be in charge of cleaning up the catalog and the shared memory state, but as there is no data in the catalogs the shared memory state is not removed. This issue would leave orphaned origin data in shared memory, leading to a confusing state as it would still show up in pg_replication_origin_status. Note that this shared memory data is sticky, being flushed to disk in replorigin_checkpoint at checkpoint. This prevents other origins from reusing a slot position in the shared memory data. To address this problem, the commit moves the creation of the origin to the end of the transaction that precedes the one executing the initial COPY, making the origin immediately visible in the catalogs for other sessions, giving DROP SUBSCRIPTION a way to know about it. 
A different solution would have been to clean up the shared memory state using an abort callback within the tablesync worker. The solution of this commit is more consistent with the apply worker that creates an origin in a short transaction. A test is added in the subscription test 004_sync.pl, which was able to display the problem. The test fails when this commit is reverted. Reported-by: Tenglong Gu <brucegu@amazon.com> Reported-by: Daisuke Higuchi <higudai@amazon.com> Analyzed-by: Michael Paquier <michael@paquier.xyz> Author: Hou Zhijie <houzj.fnst@fujitsu.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Discussion: https://postgr.es/m/aUTekQTg4OYnw-Co@paquier.xyz Backpatch-through: 14	2025-12-23 14:32:21 +09:00
Thomas Munro	d4549176ea	Fix printf format string warning on MinGW. This is a back-patch of `1319997d` to branches 14-17 to fix an old warning about a printf type mismatch on MinGW, in anticipation of a potential expansion of the scope of CI's CompilerWarnings checks. Though CI began in 15, BF animal fairywren also shows the warning in 14, so we might as well fix that too. Original commit message (except for new "Backpatch-through" tag): Commit `517bf2d91` changed a printf format string to placate MinGW, which at the time warned about "%lld". Current MinGW is now warning about the replacement "%I64d". Reverting the change clears the warning on the MinGW CI task, and hopefully it will clear it on build farm animal fairywren too. Backpatch-through: 14-17 Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reported-by: "Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com> Discussion: https://postgr.es/m/TYAPR01MB5866A71B744BE01B3BF71791F5AEA%40TYAPR01MB5866.jpnprd01.prod.outlook.com	2025-12-21 21:28:12 +13:00
Thomas Munro	0451859131	Clean up test_cloexec.c and Makefile. An unused variable caused a compiler warning on BF animal fairywren, an snprintf() call was redundant, and some buffer sizes were inconsistent. Per code review from Tom Lane. The Makefile's test ifeq ($(PORTNAME), win32) never succeeded due to a circularity, so only Meson builds were actually compiling the new test code, partially explaining why CI didn't tell us about the warning sooner (the other problem being that CompilerWarnings only makes world-bin, a problem for another commit). Simplify. Backpatch-through: 16, like commit `c507ba55` Author: Bryan Green <dbryan.green@gmail.com> Co-authored-by: Thomas Munro <tmunro@gmail.com> Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/1086088.1765593851%40sss.pgh.pa.us	2025-12-21 17:35:46 +13:00
Fujii Masao	699293d274	Add guard to prevent recursive memory context logging. Previously, if memory context logging was triggered repeatedly and rapidly while a previous request was still being processed, it could result in recursive calls to ProcessLogMemoryContextInterrupt(). This could lead to infinite recursion and potentially crash the process. This commit adds a guard to prevent such recursion. If ProcessLogMemoryContextInterrupt() is already in progress and logging memory contexts, subsequent calls will exit immediately, avoiding unintended recursive calls. While this scenario is unlikely in practice, it's not impossible. This change adds a safety check to prevent such failures. Back-patch to v14, where memory context logging was introduced. Reported-by: Robert Haas <robertmhaas@gmail.com> Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Atsushi Torikoshi <torikoshia@oss.nttdata.com> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Reviewed-by: Artem Gavrilov <artem.gavrilov@percona.com> Discussion: https://postgr.es/m/CA+TgmoZMrv32tbNRrFTvF9iWLnTGqbhYSLVcrHGuwZvCtph0NA@mail.gmail.com Backpatch-through: 14	2025-12-19 12:07:21 +09:00
Noah Misch	1cdc07ad5a	Sort DO_SUBSCRIPTION_REL dump objects independent of OIDs. Commit `0decd5e89d` missed DO_SUBSCRIPTION_REL, leading to assertion failures. In the unlikely use case of diffing "pg_dump --binary-upgrade" output, spurious diffs were possible. As part of fixing that, align the DumpableObject naming and sort order with DO_PUBLICATION_REL. The overall effect of this commit is to change sort order from (subname, srsubid) to (rel, subname). Since DO_SUBSCRIPTION_REL is only for --binary-upgrade, accept that larger-than-usual dump order change. Back-patch to v17, where commit `9a17be1e24` introduced DO_SUBSCRIPTION_REL. Reported-by: vignesh C <vignesh21@gmail.com> Author: vignesh C <vignesh21@gmail.com> Discussion: https://postgr.es/m/CALDaNm2x3rd7C0_HjUpJFbxpAqXgm=QtoKfkEWDVA8h+JFpa_w@mail.gmail.com Backpatch-through: 17	2025-12-18 10:23:51 -08:00
Heikki Linnakangas	4b6d096a0f	Do not emit WAL for unlogged BRIN indexes Operations on unlogged relations should not be WAL-logged. The brin_initialize_empty_new_buffer() function didn't get the memo. The function is only called when a concurrent update to a brin page uses up space that we're just about to insert to, which makes it pretty hard to hit. If you do manage to hit it, a full-page WAL record is erroneously emitted for the unlogged index. If you then crash, crash recovery will fail on that record with an error like this: FATAL: could not create file "base/5/32819": File exists Author: Kirill Reshke <reshkekirill@gmail.com> Discussion: https://www.postgresql.org/message-id/CALdSSPhpZXVFnWjwEBNcySx_vXtXHwB2g99gE6rK0uRJm-3GgQ@mail.gmail.com Backpatch-through: 14	2025-12-18 15:09:23 +02:00
Noah Misch	bcb784e7d2	Assert lack of hazardous buffer locks before possible catalog read. Commit `0bada39c83` fixed a bug of this kind, which existed in all branches for six days before detection. While the probability of reaching the trouble was low, the disruption was extreme. No new backends could start, and service restoration needed an immediate shutdown. Hence, add this to catch the next bug like it. The new check in RelationIdGetRelation() suffices to make autovacuum detect the bug in commit `243e9b40f1` that led to commit `0bada39`. This also checks in a number of similar places. It replaces each Assert(IsTransactionState()) that pertained to a conditional catalog read. Back-patch to v14 - v17. This is a back-patch of commit `f4ece891fc` (from before v18 branched) to all supported branches, to accompany the back-patch of commits `243e9b4` and `0bada39`. For catalog indexes, the bttextcmp() behavior that motivated IsCatalogTextUniqueIndexOid() was v18-specific. Hence, this back-patch doesn't need that or its correction from commit `4a4ee0c2c1`. Reported-by: Alexander Lakhin <exclusion@gmail.com> Discussion: https://postgr.es/m/20250410191830.0e.nmisch@google.com Discussion: https://postgr.es/m/10ec0bc3-5933-1189-6bb8-5dec4114558e@gmail.com Backpatch-through: 14-17	2025-12-16 16:13:54 -08:00
Noah Misch	d3e5d89504	WAL-log inplace update before revealing it to other sessions. A buffer lock won't stop a reader having already checked tuple visibility. If a vac_update_datfrozenxid() and then a crash happened during inplace update of a relfrozenxid value, datfrozenxid could overtake relfrozenxid. That could lead to "could not access status of transaction" errors. Back-patch to v14 - v17. This is a back-patch of commits: - `8e7e672cda` (main change, on master, before v18 branched) - `8180136652` (defect fix, on master, before v18 branched) It reverses commit `bc6bad8857`, my revert of the original back-patch. In v14, this also back-patches the assertion removal from commit `7fcf2faf9c`. Discussion: https://postgr.es/m/20240620012908.92.nmisch@google.com Backpatch-through: 14-17	2025-12-16 16:13:54 -08:00
Noah Misch	0f69beddea	For inplace update, send nontransactional invalidations. The inplace update survives ROLLBACK. The inval didn't, so another backend's DDL could then update the row without incorporating the inplace update. In the test this fixes, a mix of CREATE INDEX and ALTER TABLE resulted in a table with an index, yet relhasindex=f. That is a source of index corruption. Back-patch to v14 - v17. This is a back-patch of commits: - `243e9b40f1` (main change, on master, before v18 branched) - `0bada39c83` (defect fix, on master, before v18 branched) - `bae8ca82fd` (cosmetics from post-commit review, on REL_18_STABLE) It reverses commit `c1099dd745`, my revert of the original back-patch of `243e9b4`. This back-patch omits the non-comment heap_decode() changes. I find those changes removed harmless code that was last necessary in v13. See discussion thread for details. The back branches aren't the place to remove such code. Like the original back-patch, this doesn't change WAL, because these branches use end-of-recovery SIResetAll(). All branches change the ABI of extern function PrepareToInvalidateCacheTuple(). No PGXN extension calls that, and there's no apparent use case in extensions. Expect ".abi-compliance-history" edits to follow. Reviewed-by: Paul A Jungwirth <pj@illuminatedcomputing.com> Reviewed-by: Surya Poondla <s_poondla@apple.com> Reviewed-by: Ilyasov Ian <ianilyasov@outlook.com> Reviewed-by: Nitin Motiani <nitinmotiani@google.com> (in earlier versions) Reviewed-by: Andres Freund <andres@anarazel.de> (in earlier versions) Discussion: https://postgr.es/m/20240523000548.58.nmisch@google.com Backpatch-through: 14-17	2025-12-16 16:13:54 -08:00
Robert Haas	1d0fc2499f	Switch memory contexts in ReinitializeParallelDSM. We already do this in CreateParallelContext, InitializeParallelDSM, and LaunchParallelWorkers. I suspect the reason why the matching logic was omitted from ReinitializeParallelDSM is that I failed to realize that any memory allocation was happening here -- but shm_mq_attach does allocate, which could result in a shm_mq_handle being allocated in a shorter-lived context than the ParallelContext which points to it. That could result in a crash if the shorter-lived context is freed before the parallel context is destroyed. As far as I am currently aware, there is no way to reach a crash using only code that is present in core PostgreSQL, but extensions could potentially trip over this. Fixing this in the back-branches appears low-risk, so back-patch to all supported versions. Author: Jakub Wartak <jakub.wartak@enterprisedb.com> Co-authored-by: Jeevan Chalke <jeevan.chalke@enterprisedb.com> Backpatch-through: 14 Discussion: http://postgr.es/m/CAKZiRmwfVripa3FGo06=5D1EddpsLu9JY2iJOTgbsxUQ339ogQ@mail.gmail.com	2025-12-16 10:59:04 -05:00
Michael Paquier	f5927da4ff	Fail recovery when missing redo checkpoint record without backup_label This commit adds an extra check at the beginning of recovery to ensure that the redo record of a checkpoint exists before attempting WAL replay, logging a PANIC if the redo record referenced by the checkpoint record could not be found. This is the same level of failure as when a checkpoint record is missing. This check is added when a cluster is started without a backup_label, after retrieving its checkpoint record. The redo LSN used for the check is retrieved from the checkpoint record successfully read. In the case where a backup_label exists, the startup process already fails if the redo record cannot be found after reading a checkpoint record at the beginning of recovery. Previously, the presence of the redo record was not checked. If the redo and checkpoint records were located on different WAL segments, it would be possible to miss an entire range of WAL records that should have been replayed but were just ignored. The consequences of missing the redo record depend on the version dealt with, these becoming worse the older the version used: - On HEAD, v18 and v17, recovery fails with a pointer dereference at the beginning of the redo loop, as the redo record is expected but cannot be found. These versions are good students, because we detect a failure before doing anything, even if the failure is misleading in the shape of a segmentation fault, giving no information that the redo record is missing. - In v16 and v15, problems show at the end of recovery within FinishWalRecovery(), the startup process using a buggy LSN to decide from where to start writing WAL. The cluster gets corrupted, still it is noisy about it. - v14 and older versions are worse: a cluster gets corrupted but it is entirely silent about the matter. The missing redo record causes the startup process to skip recovery entirely, because a missing record is treated the same as no redo being required at all. 
This leads to data loss, as everything is missed between the redo record and the checkpoint record. Note that I have tested that down to 9.4, reproducing the issue with a slightly modified version of the author's reproducer. The code is wrong since at least 9.2, but I did not look at the exact point of origin. This problem has been found by debugging a cluster where the WAL segment including the redo record was missing due to an operator error, leading to a crash, based on an investigation in v15. Requesting archive recovery with the creation of a recovery.signal or a standby.signal even without a backup_label would mitigate the issue: if the record cannot be found in pg_wal/, the missing segment can be retrieved with a restore_command when checking that the redo record exists. This was already the case without this commit, where recovery would re-fetch the WAL segment that includes the redo record. The check introduced by this commit causes the segment to be retrieved earlier to make sure that the redo record can be found. On HEAD, the code will be slightly changed in a follow-up commit to not rely on a PANIC, to include a test able to emulate the original problem. This is a minimal backpatchable fix, kept separate for clarity. Reported-by: Andres Freund <andres@anarazel.de> Analyzed-by: Andres Freund <andres@anarazel.de> Author: Nitin Jadhav <nitinjadhavpostgres@gmail.com> Discussion: https://postgr.es/m/20231023232145.cmqe73stvivsmlhs@awork3.anarazel.de Discussion: https://postgr.es/m/CAMm1aWaaJi2w49c0RiaDBfhdCL6ztbr9m=daGqiOuVdizYWYaA@mail.gmail.com Backpatch-through: 14	2025-12-16 13:29:39 +09:00
Heikki Linnakangas	cd1a887fe9	Clarify comment on multixid offset wraparound check Coverity complained that offset cannot be 0 here because there's an explicit check for "offset == 0" earlier in the function, but it didn't see the possibility that offset could've wrapped around to 0. The code is correct, but clarify the comment about it. The same code exists in backbranches in the server GetMultiXactIdMembers() function and in 'master' in the pg_upgrade GetOldMultiXactIdSingleMember function. In backbranches Coverity didn't complain about it because the check was merely an assertion, but change the comment in all supported branches for consistency. Per Tom Lane's suggestion. Discussion: https://www.postgresql.org/message-id/1827755.1765752936@sss.pgh.pa.us	2025-12-15 11:47:58 +02:00
Michael Paquier	0bab0c3b74	Fix allocation formula in llvmjit_expr.c An array of LLVMBasicBlockRef is allocated with the size used for an element being "LLVMBasicBlockRef *" rather than "LLVMBasicBlockRef". LLVMBasicBlockRef is a type that refers to a pointer, so this did not directly cause a problem because both should have the same size, still it is incorrect. This issue has been spotted while reviewing a different patch, and exists since `2a0faed9d7`, so backpatch all the way down. Discussion: https://postgr.es/m/CA+hUKGLngd9cKHtTUuUdEo2eWEgUcZ_EQRbP55MigV2t_zTReg@mail.gmail.com Backpatch-through: 14	2025-12-11 10:25:46 +09:00
Heikki Linnakangas	807b2f261d	Fix bogus extra arguments to query_safe in test The test seemed to incorrectly think that query_safe() takes an argument that describes what the query does, similar to e.g. command_ok(). Until commit `bd8d9c9bdf` the extra arguments were harmless and were just ignored, but when commit `bd8d9c9bdf` introduced a new optional argument to query_safe(), the extra arguments started clashing with that, causing the test to fail. Backpatch to v17, that's the oldest branch where the test exists. The extra arguments didn't cause any trouble on the older branches, but they were clearly bogus anyway.	2025-12-10 19:39:00 +02:00
Heikki Linnakangas	998d100cdb	Fix some near-bugs related to ResourceOwner function arguments These functions took a ResourceOwner argument, but only checked if it was NULL, and then used CurrentResourceOwner for the actual work. Surely the intention was to use the passed-in resource owner. All current callers passed CurrentResourceOwner or NULL, so this has no consequences at the moment, but it's an accident waiting to happen for future callers and extensions. Author: Matthias van de Meent <boekewurm+postgres@gmail.com> Discussion: https://www.postgresql.org/message-id/CAEze2Whnfv8VuRZaohE-Af+GxBA1SNfD_rXfm84Jv-958UCcJA@mail.gmail.com Backpatch-through: 17	2025-12-10 11:44:07 +02:00
Michael Paquier	d0518e965e	Fix failures with cross-version pg_upgrade tests Buildfarm members skimmer and crake have reported that pg_upgrade running from v18 fails due to the changes of `d52c24b0f8`, with the expectation that the objects removed in the test module injection_points should still be present post upgrades, but the test module does not have them anymore. The origin of the issue is that the following test modules depend on injection_points, but they do not drop the extension once the tests finish, leaving its traces in the dumps used for the upgrades: - gin, down to v17 - typcache, down to v18 - nbtree, HEAD-only Test modules have no upgrade requirements, as they are used only for tests, so there is no point in keeping them around. An alternative solution would be to drop the databases created by these modules in AdjustUpgrade.pm, but the solution of this commit to drop the extension is simpler. Note that there would be a catch if using a solution based on AdjustUpgrade.pm as the database name used for the test runs differs between configure and meson: - configure relies on USE_MODULE_DB for the database name uniqueness, that would build a database name based on the first entry of REGRESS, that lists all the SQL tests. - meson relies on a "name" field. For example, for the test module "gin", the regression database is named "regression_gin" under meson, while it is more complex for configure, as of "contrib_regression_gin_incomplete_splits". So a solution based on AdjustUpgrade.pm would need a set of DROP DATABASE IF EXISTS to solve this issue, to cope with each build system. The failure has been caused by `d52c24b0f8`, and the problem can happen with upgrade dumps from v17 and v18 to HEAD. This problem is not currently reachable in the back-branches, but it could be possible that a future change in injection_points in stable branches invalidates this theory, so this commit is applied down to v17 in the test modules that matter. 
Per discussion with Tom Lane and Heikki Linnakangas. Discussion: https://postgr.es/m/2899652.1765167313@sss.pgh.pa.us Backpatch-through: 17	2025-12-10 12:47:23 +09:00
Thomas Munro	f24af0e04c	Fix O_CLOEXEC flag handling in Windows port. PostgreSQL's src/port/open.c has always set bInheritHandle = TRUE when opening files on Windows, making all file descriptors inheritable by child processes. This meant the O_CLOEXEC flag, added to many call sites by commit `1da569ca1f` (v16), was silently ignored. The original commit included a comment suggesting that our open() replacement doesn't create inheritable handles, but it was a misunderstanding of the code path. In practice, the code was creating inheritable handles in all cases. This hasn't caused widespread problems because most child processes (archive_command, COPY PROGRAM, etc.) operate on file paths passed as arguments rather than inherited file descriptors. Even if a child wanted to use an inherited handle, it would need to learn the numeric handle value, which isn't passed through our IPC mechanisms. Nonetheless, the current behavior is wrong. It violates documented O_CLOEXEC semantics, contradicts our own code comments, and makes PostgreSQL behave differently on Windows than on Unix. It also creates potential issues with future code or security auditing tools. To fix, define O_CLOEXEC to _O_NOINHERIT in master, previously used by O_DSYNC. We use different values in the back branches to preserve existing values. In pgwin32_open_handle() we set bInheritHandle according to whether O_CLOEXEC is specified, for the same atomic semantics as POSIX in multi-threaded programs that create processes. Backpatch-through: 16 Author: Bryan Green <dbryan.green@gmail.com> Co-authored-by: Thomas Munro <thomas.munro@gmail.com> (minor adjustments) Discussion: https://postgr.es/m/e2b16375-7430-4053-bda3-5d2194ff1880%40gmail.com	2025-12-10 09:10:31 +13:00
Amit Kapila	f2818868ae	Fix LOCK_TIMEOUT handling in slotsync worker. Previously, the slotsync worker relied on SIGINT for graceful shutdown during promotion. However, SIGINT is also used by the LOCK_TIMEOUT handler to cancel queries. Since the slotsync worker can lock catalog tables while parsing libpq tuples, this overlap caused it to ignore LOCK_TIMEOUT signals and potentially wait indefinitely on locks. This patch replaces the slotsync worker's SIGINT handler with StatementCancelHandler to correctly process query-cancel interrupts. Additionally, the startup process now uses SIGUSR1 to signal the slotsync worker to stop during promotion. The worker exits after detecting that the shared memory flag stopSignaled is set. Author: Hou Zhijie <houzj.fnst@fujitsu.com> Reviewed-by: shveta malik <shveta.malik@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Backpatch-through: 17, where it was introduced Discussion: https://postgr.es/m/TY4PR01MB169078F33846E9568412D878C94A2A@TY4PR01MB16907.jpnprd01.prod.outlook.com	2025-12-09 07:02:08 +00:00
David Rowley	ca98d8ba10	Doc: fix typo in hash index documentation Plus a similar fix to the README. Backpatch as far back as the sgml issue exists. The README issue does exist in v14, but that seems unlikely to harm anyone. Author: David Geier <geidav.pg@gmail.com> Discussion: https://postgr.es/m/ed3db7ea-55b4-4809-86af-81ad3bb2c7d3@gmail.com Backpatch-through: 15	2025-12-09 14:42:40 +13:00
Heikki Linnakangas	cad40cec24	Fix setting next multixid's offset at offset wraparound In commit `789d65364c`, we started updating the next multixid's offset too when recording a multixid, so that it can always be used to calculate the number of members. I got it wrong at offset wraparound: we need to skip over offset 0. Fix that. Discussion: https://www.postgresql.org/message-id/d9996478-389a-4340-8735-bfad456b313c@iki.fi Backpatch-through: 14	2025-12-05 11:35:44 +02:00
Michael Paquier	9d4f6d17f5	Show version of nodes in output of TAP tests This commit adds the version information of a node initialized by Cluster.pm, which may vary depending on the install_path given by the test. The code was written such that the node information, which includes the version number, was dumped before the version number was set. This is particularly useful for the pg_upgrade TAP tests, which may mix several versions for cross-version runs. The TAP infrastructure also allows mixing nodes with different versions, so this information can be useful for out-of-core tests. Backpatch down to v15, where Cluster.pm and the pg_upgrade TAP tests have been introduced. Author: Potapov Alexander <a.potapov@postgrespro.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://postgr.es/m/e59bb-692c0a80-5-6f987180@170377126 Backpatch-through: 15	2025-12-05 09:21:18 +09:00
Heikki Linnakangas	8ba61bc063	Set next multixid's offset when creating a new multixid With this commit, the next multixid's offset will always be set on the offsets page, by the time that a backend might try to read it, so we no longer need the waiting mechanism with the condition variable. In other words, this eliminates "corner case 2" mentioned in the comments. The waiting mechanism was broken in a few scenarios: - When nextMulti was advanced without WAL-logging the next multixid. For example, if a later multixid was already assigned and WAL-logged before the previous one was WAL-logged, and then the server crashed. In that case the next offset would never be set in the offsets SLRU, and a query trying to read it would get stuck waiting for it. Same thing could happen if pg_resetwal was used to forcibly advance nextMulti. - In hot standby mode, a deadlock could happen where one backend waits for the next multixid assignment record, but WAL replay is not advancing because of a recovery conflict with the waiting backend. The old TAP test used carefully placed injection points to exercise the old waiting code, but now that the waiting code is gone, much of the old test is no longer relevant. Rewrite the test to reproduce the IPC/MultixactCreation hang after crash recovery instead, and to verify that previously recorded multixids stay readable. Backpatch to all supported versions. In back-branches, we still need to be able to read WAL that was generated before this fix, so in the back-branches this includes a hack to initialize the next offsets page when replaying XLOG_MULTIXACT_CREATE_ID for the last multixid on a page. On 'master', bump XLOG_PAGE_MAGIC instead to indicate that the WAL is not compatible. 
Author: Andrey Borodin <amborodin@acm.org> Reviewed-by: Dmitry Yurichev <dsy.075@yandex.ru> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Ivan Bykov <i.bykov@modernsys.ru> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/172e5723-d65f-4eec-b512-14beacb326ce@yandex.ru Backpatch-through: 14	2025-12-03 19:15:21 +02:00
Dean Rasheed	c090965036	Avoid rewriting data-modifying CTEs more than once. Formerly, when updating an auto-updatable view, or a relation with rules, if the original query had any data-modifying CTEs, the rewriter would rewrite those CTEs multiple times as RewriteQuery() recursed into the product queries. In most cases that was harmless, because RewriteQuery() is mostly idempotent. However, if the CTE involved updating an always-generated column, it would trigger an error because any subsequent rewrite would appear to be attempting to assign a non-default value to the always-generated column. This could perhaps be fixed by attempting to make RewriteQuery() fully idempotent, but that looks quite tricky to achieve, and would probably be quite fragile, given that more generated-column-type features might be added in the future. Instead, fix by arranging for RewriteQuery() to rewrite each CTE exactly once (by tracking the number of CTEs already rewritten as it recurses). This has the advantage of being simpler and more efficient, but it does make RewriteQuery() dependent on the order in which rewriteRuleAction() joins the CTE lists from the original query and the rule action, so care must be taken if that is ever changed. Reported-by: Bernice Southey <bernice.southey@gmail.com> Author: Bernice Southey <bernice.southey@gmail.com> Author: Dean Rasheed <dean.a.rasheed@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Discussion: https://postgr.es/m/CAEDh4nyD6MSH9bROhsOsuTqGAv_QceU_GDvN9WcHLtZTCYM1kA@mail.gmail.com Backpatch-through: 14	2025-11-29 12:32:12 +00:00
Tom Lane	e79b276621	Allow indexscans on partial hash indexes with implied quals. Normally, if a WHERE clause is implied by the predicate of a partial index, we drop that clause from the set of quals used with the index, since it's redundant to test it if we're scanning that index. However, if it's a hash index (or any !amoptionalkey index), this could result in dropping all available quals for the index's first key, preventing us from generating an indexscan. It's fair to question the practical usefulness of this case. Since hash only supports equality quals, the situation could only arise if the index's predicate is "WHERE indexkey = constant", implying that the index contains only one hash value, which would make hash a really poor choice of index type. However, perhaps there are other !amoptionalkey index AMs out there with which such cases are more plausible. To fix, just don't filter the candidate indexquals this way if the index is !amoptionalkey. That's a bit hokey because it may result in testing quals we didn't need to test, but to do it more accurately we'd have to redundantly identify which candidate quals are actually usable with the index, something we don't know at this early stage of planning. Doesn't seem worth the effort. Reported-by: Sergei Glukhov <s.glukhov@postgrespro.ru> Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/e200bf38-6b45-446a-83fd-48617211feff@postgrespro.ru Backpatch-through: 14	2025-11-27 13:09:59 -05:00
Amit Langote	b5511fed50	Fix error reporting for SQL/JSON path type mismatches transformJsonFuncExpr() used exprType()/exprLocation() on the possibly coerced path expression, which could be NULL when coercion to jsonpath failed, leading to "cache lookup failed for type 0" errors. Preserve the original expression node so that type and location in the "must be of type jsonpath" error are reported correctly. Add regression tests to cover these cases. Reported-by: Jian He <jian.universality@gmail.com> Author: Jian He <jian.universality@gmail.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Discussion: https://postgr.es/m/CACJufxHunVg81JMuNo8Yvv_hJD0DicgaVN2Wteu8aJbVJPBjZA@mail.gmail.com Backpatch-through: 17	2025-11-27 11:59:36 +09:00
Nathan Bossart	2fc5c50622	Teach DSM registry to retry entry initialization if needed. If DSM registry entry initialization fails, backends could try to use an uninitialized DSM segment, DSA, or dshash table (since the entry is still added to the registry). To fix, restructure the code so that the registry retries initialization as needed. This commit also modifies pg_get_dsm_registry_allocations() to leave out partially-initialized entries, as they shouldn't have any allocated memory. DSM registry entry initialization shouldn't fail often in practice, but retrying was deemed better than leaving entries in a permanently failed state (as was done by commit `1165a933aa`, which has since been reverted). Suggested-by: Robert Haas <robertmhaas@gmail.com> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Discussion: https://postgr.es/m/E1vJHUk-006I7r-37%40gemulon.postgresql.org Backpatch-through: 17	2025-11-26 15:12:25 -06:00
Nathan Bossart	c7e0f263d6	Revert "Teach DSM registry to ERROR if attaching to an uninitialized entry." This reverts commit `1165a933aa` (and the corresponding commits on the back-branches). In a follow-up commit, we'll teach the registry to retry entry initialization instead of leaving it in a permanently failed state. Reviewed-by: Robert Haas <robertmhaas@gmail.com> Discussion: https://postgr.es/m/E1vJHUk-006I7r-37%40gemulon.postgresql.org Backpatch-through: 17	2025-11-26 11:37:21 -06:00
Andres Freund	427e886a79	lwlock: Fix, currently harmless, bug in LWLockWakeup() Accidentally the code in LWLockWakeup() checked the list of to-be-woken up processes to see if LW_FLAG_HAS_WAITERS should be unset. That means that HAS_WAITERS would not get unset immediately, but only during the next, unnecessary, call to LWLockWakeup(). Luckily, as the code stands, this is just a small efficiency issue. However, if there were (as in a patch of mine) a case in which LWLockWakeup() would not find any backend to wake, despite the wait list not being empty, we'd wrongly unset LW_FLAG_HAS_WAITERS, leading to potentially hanging. While the consequences in the backbranches are limited, the code as-is is confusing, and it is possible that there are workloads where the additional wait list lock acquisitions hurt, therefore backpatch. Discussion: https://postgr.es/m/fvfmkr5kk4nyex56ejgxj3uzi63isfxovp2biecb4bspbjrze7@az2pljabhnff Backpatch-through: 14	2025-11-24 17:39:57 -05:00
David Rowley	232e0f5de4	Fix incorrect IndexOptInfo header comment The comment incorrectly indicated that indexcollations[] stored collations for both key columns and INCLUDE columns, but in reality it only has elements for the key columns. canreturn[] didn't get a mention, so add that while we're here. Author: Junwang Zhao <zhjwpku@gmail.com> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/CAEG8a3LwbZgMKOQ9CmZarX5DEipKivdHp5PZMOO-riL0w%3DL%3D4A%40mail.gmail.com Backpatch-through: 14	2025-11-24 17:01:13 +13:00
Thomas Munro	60215eae7c	jit: Adjust AArch64-only code for LLVM 21. LLVM 21 changed the arguments of RTDyldObjectLinkingLayer's constructor, breaking compilation with the backported SectionMemoryManager from commit `9044fc1d`. `cd585864c0` Backpatch-through: 14 Author: Holger Hoffstätte <holger@applied-asynchrony.com> Reviewed-by: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com> Discussion: https://postgr.es/m/d25e6e4a-d1b4-84d3-2f8a-6c45b975f53d%40applied-asynchrony.com	2025-11-22 21:22:37 +13:00
Heikki Linnakangas	f2e0ca0af9	Print new OldestXID value in pg_resetwal when it's being changed Commit `74cf7d46a9` added the --oldest-transaction-id option to pg_resetwal, but forgot to update the code that prints all the new values that are being set. Fix that. Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://www.postgresql.org/message-id/5461bc85-e684-4531-b4d2-d2e57ad18cba@iki.fi Backpatch-through: 14	2025-11-19 18:06:23 +02:00
Tom Lane	075a763e2d	Don't allow CTEs to determine semantic levels of aggregates. The fix for bug #19055 (commit `b0cc0a71e`) allowed CTE references in sub-selects within aggregate functions to affect the semantic levels assigned to such aggregates. It turns out this broke some related cases, leading to assertion failures or strange planner errors such as "unexpected outer reference in CTE query". After experimenting with some alternative rules for assigning the semantic level in such cases, we've come to the conclusion that changing the level is more likely to break things than be helpful. Therefore, this patch undoes what `b0cc0a71e` changed, and instead installs logic to throw an error if there is any reference to a CTE that's below the semantic level that standard SQL rules would assign to the aggregate based on its contained Var and Aggref nodes. (The SQL standard disallows sub-selects within aggregate functions, so it can't reach the troublesome case and hence has no rule for what to do.) Perhaps someone will come along with a legitimate query that this logic rejects, and if so probably the example will help us craft a level-adjustment rule that works better than what `b0cc0a71e` did. I'm not holding my breath for that though, because the previous logic had been there for a very long time before bug #19055 without complaints, and that bug report sure looks to have originated from fuzzing not from real usage. Like `b0cc0a71e`, back-patch to all supported branches, though sadly that no longer includes v13. Bug: #19106 Reported-by: Kamil Monicz <kamil@monicz.dev> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/19106-9dd3668a0734cd72@postgresql.org Backpatch-through: 14	2025-11-18 12:56:55 -05:00
Thomas Munro	d66a922f92	Define PS_USE_CLOBBER_ARGV on GNU/Hurd. Until `d2ea2d310d`, the PS_USE_PS_STRINGS option was used on the GNU/Hurd. As this option got removed and PS_USE_CLOBBER_ARGV appears to work fine nowadays on the Hurd, define this one to re-enable process title changes on this platform. In the 14 and 15 branches, the existing test for __hurd__ (added 25 years ago by commit `209aa77d`, removed in 16 by the above commit) is left unchanged for now as it was activating slightly different code paths and would need investigation by a Hurd user. Author: Michael Banck <mbanck@debian.org> Discussion: https://postgr.es/m/CA%2BhUKGJMNGUAqf27WbckYFrM-Mavy0RKJvocfJU%3DJ2XcAZyv%2Bw%40mail.gmail.com Backpatch-through: 16	2025-11-17 12:48:37 +13:00
Dean Rasheed	d6c415c4b4	Fix Assert failure in EXPLAIN ANALYZE MERGE with a concurrent update. When instrumenting a MERGE command containing both WHEN NOT MATCHED BY SOURCE and WHEN NOT MATCHED BY TARGET actions using EXPLAIN ANALYZE, a concurrent update of the target relation could lead to an Assert failure in show_modifytable_info(). In a non-assert build, this would lead to an incorrect value for "skipped" tuples in the EXPLAIN output, rather than a crash. This could happen if the concurrent update caused a matched row to no longer match, in which case ExecMerge() treats the single originally matched row as a pair of not matched rows, and potentially executes 2 not-matched actions for the single source row. This could then lead to a state where the number of rows processed by the ModifyTable node exceeds the number of rows produced by its source node, causing "skipped_path" in show_modifytable_info() to be negative, triggering the Assert. Fix this in ExecMergeMatched() by incrementing the instrumentation tuple count on the source node whenever a concurrent update of this kind is detected, if both kinds of merge actions exist, so that the number of source rows matches the number of actions potentially executed, and the "skipped" tuple count is correct. Back-patch to v17, where support for WHEN NOT MATCHED BY SOURCE actions was introduced. Bug: #19111 Reported-by: Dilip Kumar <dilipbalaut@gmail.com> Author: Dean Rasheed <dean.a.rasheed@gmail.com> Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com> Discussion: https://postgr.es/m/19111-5b06624513d301b3@postgresql.org Backpatch-through: 17	2025-11-16 22:15:45 +00:00
Nathan Bossart	505ce19a20	Add note about CreateStatistics()'s selective use of check_rights. Commit `5e4fcbe531` added a check_rights parameter to this function for use by ALTER TABLE commands that re-create statistics objects. However, we intentionally ignore check_rights when verifying relation ownership because this function's lookup could return a different answer than the caller's. This commit adds a note to this effect so that we remember it down the road. Reviewed-by: Noah Misch <noah@leadboat.com> Backpatch-through: 14	2025-11-14 13:20:09 -06:00
Fujii Masao	5bc251b288	pgbench: Fix assertion failure with multiple \syncpipeline in pipeline mode. Previously, when pgbench ran a custom script that triggered retriable errors (e.g., deadlocks) followed by multiple \syncpipeline commands in pipeline mode, the following assertion failure could occur: Assertion failed: (res == ((void*)0)), function discardUntilSync, file pgbench.c, line 3594. The issue was that discardUntilSync() assumed a pipeline sync result (PGRES_PIPELINE_SYNC) would always be followed by either another sync result or NULL. This assumption was incorrect: when multiple sync requests were sent, a sync result could instead be followed by another result type. In such cases, discardUntilSync() mishandled the results, leading to the assertion failure. This commit fixes the issue by making discardUntilSync() correctly handle cases where a pipeline sync result is followed by other result types. It now continues discarding results until another pipeline sync followed by NULL is reached. Backpatched to v17, where support for \syncpipeline command in pgbench was introduced. Author: Yugo Nagata <nagata@sraoss.co.jp> Reviewed-by: Chao Li <lic@highgo.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/20251111105037.f3fc554616bc19891f926c5b@sraoss.co.jp Backpatch-through: 17	2025-11-14 22:42:02 +09:00
Nathan Bossart	ac2800ddc1	Teach DSM registry to ERROR if attaching to an uninitialized entry. If DSM entry initialization fails, backends could try to use an uninitialized DSM segment, DSA, or dshash table (since the entry is still added to the registry). To fix, keep track of whether initialization completed, and ERROR if a backend tries to attach to an uninitialized entry. We could instead retry initialization as needed, but that seemed complicated, error prone, and unlikely to help most cases. Furthermore, such problems probably indicate a coding error. Reported-by: Alexander Lakhin <exclusion@gmail.com> Reviewed-by: Sami Imseih <samimseih@gmail.com> Discussion: https://postgr.es/m/dd36d384-55df-4fc2-825c-5bc56c950fa9%40gmail.com Backpatch-through: 17	2025-11-12 14:30:11 -06:00
Heikki Linnakangas	d80d5f0995	Clear 'xid' in dummy async notify entries written to fill up pages Before we started to freeze async notify entries (commit `8eeb4a0f7c`), no one looked at the 'xid' on an entry with invalid 'dboid'. But now we might actually need to freeze it later. Initialize them with InvalidTransactionId to begin with, to avoid that work later. Álvaro pointed this out in review of commit `8eeb4a0f7c`, but I forgot to include this change there. Author: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://www.postgresql.org/message-id/202511071410.52ll56eyixx7@alvherre.pgsql Backpatch-through: 14	2025-11-12 21:27:02 +02:00
Heikki Linnakangas	c2682810ab	Fix remaining race condition with CLOG truncation and LISTEN/NOTIFY Previous commit fixed a bug where VACUUM would truncate the CLOG that's still needed to check the commit status of XIDs in the async notify queue, but as mentioned in the commit message, it wasn't a full fix. If a backend is executing asyncQueueReadAllNotifications() and has just made a local copy of an async SLRU page which contains old XIDs, vacuum can concurrently truncate the CLOG covering those XIDs, and the backend still gets an error when it calls TransactionIdDidCommit() on those XIDs in the local copy. This commit fixes that race condition. To fix, hold the SLRU bank lock across the TransactionIdDidCommit() calls in NOTIFY processing. Per Tom Lane's idea. Backpatch to all supported versions. Reviewed-by: Joel Jacobson <joel@compiler.org> Reviewed-by: Arseniy Mukhin <arseniy.mukhin.dev@gmail.com> Discussion: https://www.postgresql.org/message-id/2759499.1761756503@sss.pgh.pa.us Backpatch-through: 14	2025-11-12 21:01:16 +02:00
Heikki Linnakangas	d02c03ddc5	Fix bug where we truncated CLOG that was still needed by LISTEN/NOTIFY The async notification queue contains the XID of the sender, and when processing notifications we call TransactionIdDidCommit() on the XID. But we had no safeguards to prevent the CLOG segments containing those XIDs from being truncated away. As a result, if a backend didn't for some reason process its notifications for a long time, or when a new backend issued LISTEN, you could get an error like: test=# listen c21; ERROR: 58P01: could not access status of transaction 14279685 DETAIL: Could not open file "pg_xact/000D": No such file or directory. LOCATION: SlruReportIOError, slru.c:1087 To fix, make VACUUM "freeze" the XIDs in the async notification queue before truncating the CLOG. Old XIDs are replaced with FrozenTransactionId or InvalidTransactionId. Note: This commit is not a full fix. A race condition remains, where a backend is executing asyncQueueReadAllNotifications() and has just made a local copy of an async SLRU page which contains old XIDs, while vacuum concurrently truncates the CLOG covering those XIDs. When the backend then calls TransactionIdDidCommit() on those XIDs from the local copy, you still get the error. The next commit will fix that remaining race condition. This was first reported by Sergey Zhuravlev in 2021, with many other people hitting the same issue later. Thanks to: - Alexandra Wang, Daniil Davydov, Andrei Varashen and Jacques Combrink for investigating and providing reproducible test cases, - Matheus Alcantara and Arseniy Mukhin for review and earlier proposed patches to fix this, - Álvaro Herrera and Masahiko Sawada for reviews, - Yura Sokolov aka funny-falcon for the idea of marking transactions as committed in the notification queue, and - Joel Jacobson for the final patch version. I hope I didn't forget anyone. Backpatch to all supported versions. 
I believe the bug goes back all the way to commit `d1e027221d`, which introduced the SLRU-based async notification queue. Discussion: https://www.postgresql.org/message-id/16961-25f29f95b3604a8a@postgresql.org Discussion: https://www.postgresql.org/message-id/18804-bccbbde5e77a68c2@postgresql.org Discussion: https://www.postgresql.org/message-id/CAK98qZ3wZLE-RZJN_Y%2BTFjiTRPPFPBwNBpBi5K5CU8hUHkzDpw@mail.gmail.com Backpatch-through: 14	2025-11-12 21:01:13 +02:00
Heikki Linnakangas	b821c92920	Escalate ERRORs during async notify processing to FATAL Previously, if async notify processing encountered an error, we would report the error to the client and advance our read position past the offending entry to prevent trying to process it over and over again. Trying to continue after an error has a few problems however: - We have no way of telling the client that a notification was lost. They get an ERROR, but that doesn't tell you much. As such, it's not clear if keeping the connection alive after losing a notification is a good thing. Depending on the application logic, missing a notification could cause the application to get stuck waiting, for example. - If the connection is idle, PqCommReadingMsg is set and any ERROR is turned into FATAL anyway. - We bailed out of the notification processing loop on first error without processing any subsequent notifications. The subsequent notifications would not be processed until another notify interrupt arrives. For example, if there were two notifications pending, and processing the first one caused an ERROR, the second notification would not be processed until someone sent a new NOTIFY. This commit changes the behavior so that any ERROR while processing async notifications is turned into FATAL, causing the client connection to be terminated. That makes the behavior more consistent as that's what happened in idle state already, and terminating the connection is a clear signal to the application that it might've missed some notifications. The reason to do this now is that the next commits will change the notification processing code in a way that would make it harder to skip over just the offending notification entry on error. 
Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Arseniy Mukhin <arseniy.mukhin.dev@gmail.com> Discussion: https://www.postgresql.org/message-id/fedbd908-4571-4bbe-b48e-63bfdcc38f64@iki.fi Backpatch-through: 14	2025-11-12 21:01:08 +02:00
Daniel Gustafsson	1aa5a029fc	Fix range for commit_siblings in sample conf The range for commit_siblings was incorrectly listed as starting on 1 instead of 0 in the sample configuration file. Backpatch down to all supported branches. Author: Man Zeng <zengman@halodbtech.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://postgr.es/m/tencent_53B70BA72303AE9C6889E78E@qq.com Backpatch-through: 14	2025-11-12 13:51:53 +01:00
Heikki Linnakangas	cb2ef0e92e	Fix pg_upgrade around multixid and mxoff wraparound pg_resetwal didn't accept multixid 0 or multixact offset UINT32_MAX, but they are both valid values that can appear in the control file. That caused pg_upgrade to fail if you tried to upgrade a cluster exactly at multixid or offset wraparound, because pg_upgrade calls pg_resetwal to restore multixid/offset on the new cluster to the values from the old cluster. To fix, allow those values in pg_resetwal. Fixes bugs #18863 and #18865 reported by Dmitry Kovalenko. Backpatch down to v15. Version 14 has the same bug, but the patch doesn't apply cleanly there. It could be made to work but it doesn't seem worth the effort given how rare it is to hit this problem with pg_upgrade, and how few people are upgrading to v14 anymore. Author: Maxim Orlov <orlovmg@gmail.com> Discussion: https://www.postgresql.org/message-id/CACG%3DezaApSMTjd%3DM2Sfn5Ucuggd3FG8Z8Qte8Xq9k5-%2BRQis-g@mail.gmail.com Discussion: https://www.postgresql.org/message-id/18863-72f08858855344a2@postgresql.org Discussion: https://www.postgresql.org/message-id/18865-d4c66cf35c2a67af@postgresql.org Backpatch-through: 15	2025-11-12 12:23:55 +02:00
Michael Paquier	f30cd34b3f	Report better object limits in error messages for injection points Previously, error messages for oversized injection point names, libraries, and functions showed buffer sizes (64, 128, 128) instead of the usable character limits (63, 127, 127) as it did not account for the zero-terminated byte, which was confusing. These messages are adjusted to better reflect reality. The limit enforced for the private area was also too strict by one byte, as specifying a zone worth exactly INJ_PRIVATE_MAXLEN should be able to work because there is no zero-terminated byte in this case. This is a stylistic change (well, mostly, a private_area size of exactly 1024 bytes can be defined with this change, something that nobody seems to care about based on the lack of complaints). However, this is a testing facility, so let's keep the logic consistent across all the branches where this code exists, as there is an argument in favor of out-of-core extensions that use injection points. Author: Xuneng Zhou <xunengzhou@gmail.com> Co-authored-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CABPTF7VxYp4Hny1h+7ejURY-P4O5-K8WZg79Q3GUx13cQ6B2kg@mail.gmail.com Backpatch-through: 17	2025-11-12 10:19:20 +09:00
Nathan Bossart	e2fb3dfa81	Check for CREATE privilege on the schema in CREATE STATISTICS. This omission allowed table owners to create statistics in any schema, potentially leading to unexpected naming conflicts. For ALTER TABLE commands that require re-creating statistics objects, skip this check in case the user has since lost CREATE on the schema. The addition of a second parameter to CreateStatistics() breaks ABI compatibility, but we are unaware of any impacted third-party code. Reported-by: Jelte Fennema-Nio <postgres@jeltef.nl> Author: Jelte Fennema-Nio <postgres@jeltef.nl> Co-authored-by: Nathan Bossart <nathandbossart@gmail.com> Reviewed-by: Noah Misch <noah@leadboat.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Security: CVE-2025-12817 Backpatch-through: 13	2025-11-10 09:00:00 -06:00
Jacob Champion	f5999f0181	libpq: Prevent some overflows of int/size_t Several functions could overflow their size calculations, when presented with very large inputs from remote and/or untrusted locations, and then allocate buffers that were too small to hold the intended contents. Switch from int to size_t where appropriate, and check for overflow conditions when the inputs could have plausibly originated outside of the libpq trust boundary. (Overflows from within the trust boundary are still possible, but these will be fixed separately.) A version of add_size() is ported from the backend to assist with code that performs more complicated concatenation. Reported-by: Aleksey Solovev (Positive Technologies) Reviewed-by: Noah Misch <noah@leadboat.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Security: CVE-2025-12818 Backpatch-through: 13	2025-11-10 06:03:03 -08:00

44080 commits