mirror of
https://github.com/postgres/postgres.git
synced 2026-04-13 21:17:00 -04:00
HeapTupleSatisfiesVacuum() didn't properly discern between
DELETE_IN_PROGRESS and INSERT_IN_PROGRESS for rows that have been
inserted in the current transaction and deleted in an aborted
subtransaction of the current backend. At the very least that caused
problems for CLUSTER and CREATE INDEX in transactions that had
aborting subtransactions producing rows, leading to warnings like:
WARNING: concurrent delete in progress within table "..."
possibly in an endless, uninterruptible, loop.
Instead of treating *InProgress xmins the same as *IsCurrent ones,
treat them as being distinct like the other visibility routines. As
implemented, this separation can cause a behaviour change for rows
that have been inserted and deleted in another, still running,
transaction. HTSV will now return INSERT_IN_PROGRESS instead of
DELETE_IN_PROGRESS for those. That's both more in line with the other
visibility routines and arguably more correct. The latter because an
INSERT_IN_PROGRESS will make callers look at/wait for xmin, instead of
xmax.
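The decision order described above can be modeled in a few lines of C. This is a minimal sketch, not the actual HeapTupleSatisfiesVacuum() code: the `SketchTuple` struct, its boolean fields, the `SKETCH_*` result codes, and `sketch_htsv()` are hypothetical names that flatten the real xmin/xmax checks into flags. The branch that returns a delete-in-progress result when the current transaction itself is the live deleter is an assumption, since the message above only spells out the aborted-subtransaction and other-transaction cases.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical result codes, named after those in the commit message. */
typedef enum
{
    SKETCH_INSERT_IN_PROGRESS,
    SKETCH_DELETE_IN_PROGRESS,
    SKETCH_OTHER
} SketchResult;

/* Hypothetical, flattened view of a tuple's inserter/deleter state. */
typedef struct
{
    bool xmin_is_current;   /* inserted by the current transaction */
    bool xmin_in_progress;  /* inserted by another, still running, transaction */
    bool xmax_valid;        /* some deleter xid is recorded on the tuple */
    bool xmax_is_current;   /* deleter is a live part of the current transaction */
} SketchTuple;

SketchResult
sketch_htsv(const SketchTuple *t)
{
    if (t->xmin_is_current)
    {
        if (!t->xmax_valid)
            return SKETCH_INSERT_IN_PROGRESS;
        if (t->xmax_is_current)
            return SKETCH_DELETE_IN_PROGRESS;   /* assumed: live self-delete */
        /* A deleter is recorded but is not current: it must have aborted. */
        return SKETCH_INSERT_IN_PROGRESS;
    }

    /* Distinct branch: inserter is some other running transaction. */
    if (t->xmin_in_progress)
        return SKETCH_INSERT_IN_PROGRESS;       /* callers look at xmin, not xmax */

    return SKETCH_OTHER;                        /* committed/aborted cases omitted */
}
```

In this model, a row inserted by the current transaction and deleted by an aborted subtransaction has `xmin_is_current` and `xmax_valid` set but `xmax_is_current` clear, so it is reported as an insert in progress.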
The only current caller where that's possibly worse than the old
behaviour is heap_prune_chain() which now won't mark the page as
prunable if a row has concurrently been inserted and deleted. That's
harmless enough.
As a cautionary measure, also insert an interrupt check before the gotos
in IndexBuildHeapScan() that lead to the uninterruptible loop. There
are other possible causes, like a row that several sessions try to
update and all fail, for repeated loops and the cost of doing so in
the retry case is low.
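The effect of placing an interrupt check before a retry goto can be illustrated with a toy loop. This is only a sketch under stated assumptions: `sketch_scan_row()`, its parameters, and the local `cancel_pending` flag are hypothetical and stand in for the real scan and interrupt machinery, which is not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model of a retry loop.
 *
 * busy_rounds: how many times the row reports "still in progress"
 *              (negative means it never settles).
 * cancel_at:   recheck count at which a cancel request arrives
 *              (negative means no cancel ever arrives).
 *
 * Returns the number of rechecks performed, or -1 if interrupted.
 * With both arguments negative the loop would never end, which is
 * the situation the interrupt check is meant to make escapable.
 */
int
sketch_scan_row(int busy_rounds, int cancel_at)
{
    int  rechecks = 0;
    bool cancel_pending = false;

recheck:
    if (busy_rounds < 0 || rechecks < busy_rounds)
    {
        /* Simulate a cancel request arriving while we are looping. */
        if (cancel_at >= 0 && rechecks >= cancel_at)
            cancel_pending = true;

        /* The interrupt check sits right before the goto. */
        if (cancel_pending)
            return -1;

        rechecks++;
        goto recheck;
    }

    return rechecks;
}
```

A row that never settles (`busy_rounds` negative) still returns once a cancel request is seen, and the extra check costs one flag test per retry.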
As this bug goes back all the way to the introduction of
subtransactions in
| File |
|---|
| .gitignore |
| aclchk.c |
| catalog.c |
| Catalog.pm |
| dependency.c |
| genbki.pl |
| heap.c |
| index.c |
| indexing.c |
| information_schema.sql |
| Makefile |
| namespace.c |
| objectaddress.c |
| pg_aggregate.c |
| pg_collation.c |
| pg_constraint.c |
| pg_conversion.c |
| pg_db_role_setting.c |
| pg_depend.c |
| pg_enum.c |
| pg_inherits.c |
| pg_largeobject.c |
| pg_namespace.c |
| pg_operator.c |
| pg_proc.c |
| pg_range.c |
| pg_shdepend.c |
| pg_type.c |
| README |
| sql_feature_packages.txt |
| sql_features.txt |
| storage.c |
| system_views.sql |
| toasting.c |
src/backend/catalog/README

System Catalog
==============

This directory contains .c files that manipulate the system catalogs;
src/include/catalog contains the .h files that define the structure
of the system catalogs.

When the compile-time scripts (Gen_fmgrtab.pl and genbki.pl)
execute, they grep the DATA statements out of the .h files and munge
these in order to generate the postgres.bki file. The .bki file is then
used as input to initdb (which is just a wrapper around postgres
running single-user in bootstrapping mode) in order to generate the
initial (template) system catalog relation files.

-----------------------------------------------------------------

People who are going to hose around with the .h files should be aware
of the following facts:

- It is very important that the DATA statements be properly formatted
(e.g., no broken lines, proper use of white-space and _null_). The
scripts are line-oriented and break easily. In addition, the only
documentation on the proper format for them is the code in the
bootstrap/ directory. Just be careful when adding new DATA
statements.

- Some catalogs require that OIDs be preallocated to tuples because
of cross-references from other pre-loaded tuples. For example, pg_type
contains pointers into pg_proc (e.g., pg_type.typinput), and pg_proc
contains back-pointers into pg_type (pg_proc.proargtypes). For such
cases, the OID assigned to a tuple may be explicitly set by use of the
"OID = n" clause of the .bki insert statement. If no such pointers are
required to a given tuple, then the OID = n clause may be omitted
(then the system generates an OID in the usual way, or leaves it 0 in a
catalog that has no OIDs). In practice we usually preassign OIDs
for all or none of the pre-loaded tuples in a given catalog, even if only
some of them are actually cross-referenced.

- We also sometimes preallocate OIDs for catalog tuples whose OIDs must
be known directly in the C code.
In such cases, put a #define in the catalog's .h file, and use the
#define symbol in the C code. Writing the actual numeric value of any
OID in C code is considered very bad form. Direct references to pg_proc
OIDs are common enough that there's a special mechanism to create the
necessary #define's automatically: see backend/utils/Gen_fmgrtab.pl.
We also have standard conventions for setting up #define's for the
pg_class OIDs of system catalogs and indexes. For all the other
system catalogs, you have to manually create any #define's you need.

- If you need to find a valid OID for a new predefined tuple,
use the unused_oids script. It generates inclusive ranges of
*unused* OIDs (e.g., the line "45-900" means OIDs 45 through 900 have
not been allocated yet). Currently, OIDs 1-9999 are reserved for manual
assignment; the unused_oids script simply looks through the
include/catalog headers to see which ones do not appear in "OID ="
clauses in DATA lines. (As of Postgres 8.1, it also looks at CATALOG
and DECLARE_INDEX lines.) You can also use the duplicate_oids script
to check for mistakes.

- The OID counter starts at 10000 at bootstrap. If a catalog row is in
a table that requires OIDs, but no OID was preassigned by an "OID ="
clause, then it will receive an OID of 10000 or above.

- To create a "BOOTSTRAP" table you have to do a lot of extra work:
these tables are not created through a normal CREATE TABLE operation,
but spring into existence when first written to during initdb.
Therefore, you must manually create appropriate entries for them in the
pre-loaded contents of pg_class, pg_attribute, and pg_type. Avoid
making new catalogs be bootstrap catalogs if at all possible;
generally, only tables that must be written to in order to create a
table should be bootstrapped.
- Certain BOOTSTRAP tables must be at the start of the Makefile
POSTGRES_BKI_SRCS variable, as these cannot be created through the
standard heap_create_with_catalog process, because it needs these
tables to exist already. The list of files this currently includes is:
	pg_proc.h pg_type.h pg_attribute.h pg_class.h
Within this list, pg_type.h must come before pg_attribute.h.
Also, indexing.h must be last, since the indexes can't be created until
all the tables are in place, and toasting.h should probably be
next-to-last (or at least after all the tables that need toast tables).
There are reputedly some other order dependencies in the .bki list, too.

-----------------------------------------------------------------

When munging the .c files, you should be aware of certain conventions:

- The system catalog cache code (and most catalog-munging code in
general) assumes that the fixed-length portions of all system catalog
tuples are in fact present, because it maps C struct declarations onto
them. Thus, the variable-length fields must all be at the end, and
only the variable-length fields of a catalog tuple are permitted to be
NULL. For example, if you set pg_type.typrelid to be NULL, a
piece of code will likely perform "typetup->typrelid" (or, worse,
"typetup->typelem", which follows typrelid). This will result in
random errors or even segmentation violations. Hence, do NOT insert
catalog tuples that contain NULL attributes except in their
variable-length portions! (The bootstrapping code is fairly good about
marking NOT NULL each of the columns that can legally be referenced via
C struct declarations ... but those markings won't be enforced against
DATA commands, so you must get it right in a DATA line.)

- Modification of the catalogs must be performed with the proper
updating of catalog indexes!
That is, most catalogs have indexes on them; when you munge them using
the executor, the executor will take care of doing the index updates,
but if you make direct access method calls to insert new or modified
tuples into a heap, you must also make the calls to insert the tuple
into ALL of its indexes! If not, the new tuple will generally be
"invisible" to the system because most of the accesses to the catalogs
in question will be through the associated indexes.
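
The index-update convention above can be illustrated with a toy model. This is a minimal sketch, not PostgreSQL code: `SketchCatalog` and the `sketch_*` functions are hypothetical, the "heap" and "index" are plain arrays of OIDs, and a real catalog may have several indexes that all need the same treatment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_ROWS 16

/* Hypothetical catalog with one heap and one index, both arrays of OIDs. */
typedef struct
{
    unsigned int heap_oids[SKETCH_MAX_ROWS];    /* every stored row */
    size_t       heap_count;
    unsigned int index_oids[SKETCH_MAX_ROWS];   /* rows reachable by lookup */
    size_t       index_count;
} SketchCatalog;

/* Direct "access method" insert: touches the heap only. */
bool
sketch_heap_insert(SketchCatalog *cat, unsigned int oid)
{
    if (cat->heap_count >= SKETCH_MAX_ROWS)
        return false;
    cat->heap_oids[cat->heap_count++] = oid;
    return true;
}

/* Companion call that the caller must remember to make. */
bool
sketch_index_insert(SketchCatalog *cat, unsigned int oid)
{
    if (cat->index_count >= SKETCH_MAX_ROWS)
        return false;
    cat->index_oids[cat->index_count++] = oid;
    return true;
}

/* Lookups go through the index, never by scanning the heap. */
bool
sketch_lookup_by_index(const SketchCatalog *cat, unsigned int oid)
{
    for (size_t i = 0; i < cat->index_count; i++)
    {
        if (cat->index_oids[i] == oid)
            return true;
    }
    return false;
}
```

In this model, a row added with `sketch_heap_insert()` alone is stored but `sketch_lookup_by_index()` cannot find it until `sketch_index_insert()` is also called, which mirrors the "invisible" tuple described above.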