Commit graph

466 commits

Author SHA1 Message Date
Tom Lane
2b74fcedc4 Widen the nLocks counts in local lock tables from int to int64. This
forestalls potential overflow when the same table (or other object, but
usually tables) is accessed by very many successive queries within a single
transaction.  Per report from Michael Milligan.

Back-patch to 8.0, which is as far back as the patch conveniently applies.
There have been no reports of overflow in pre-8.3 releases, but clearly the
risk existed all along.  (Michael's report suggests that 8.3 may consume lock
counts faster than prior releases, but with no test case to look at it's hard
to be sure about that.  Widening the counts seems a good future-proofing
measure in any event.)
2008-09-16 01:56:56 +00:00
Tom Lane
9fad6e338b It turns out that TablespaceCreateDbspace fails badly if a relcache flush
occurs when it tries to heap_open pg_tablespace.  When control returns to
smgrcreate, that routine will be holding a dangling pointer to a closed
SMgrRelation, resulting in mayhem.  This is of course a consequence of
the violation of proper module layering inherent in having smgr.c call
a tablespace command routine, but the simplest fix seems to be to change
the locking mechanism.  There's no real need for TablespaceCreateDbspace
to touch pg_tablespace at all --- it's only opening it as a way of locking
against a parallel DROP TABLESPACE command.  A much better answer is to
create a special-purpose LWLock to interlock these two operations.
This drops TablespaceCreateDbspace quite a few layers down the food chain
and makes it something reasonably safe for smgr to call.
2006-01-19 04:45:58 +00:00
Tom Lane
846ed05de6 Sigh, looks like you need '.set mips2' before you can access MIPS
SYNC instruction.
2005-08-29 00:41:44 +00:00
Tom Lane
9f70dce4ce Add a SYNC instruction to the S_UNLOCK sequence for MIPS. 2005-08-28 18:26:07 +00:00
Tom Lane
5677c28c91 Get the MIPS assembler syntax right. Also add a separate sync command;
the reference I consulted yesterday said SC does a SYNC, but apparently
this is not true on newer MIPS processors, so be safe.
2005-08-27 16:22:58 +00:00
Tom Lane
51aebb07c3 Another try at the inlined MIPS spinlock code. Can't test this myself,
but for sure it's not any more broken than the prior version.
2005-08-26 22:04:53 +00:00
Tom Lane
8c3cf25225 Back-port recent MIPS and M68K spinlock improvements to 8.0 branch. 2005-08-26 14:48:13 +00:00
Tom Lane
817bc021b7 Need to reset local buffer pin counts, not only shared buffer pins,
before we attempt any file deletions in ShutdownPostgres.  Per Tatsuo.
2005-03-18 16:16:20 +00:00
Tom Lane
c8460d571e Ensure that all details of the ARC algorithm are hidden within freelist.c.
This refactoring does not change any algorithms or data structures, just
remove visibility of the ARC datastructures from other source files.
2005-02-03 23:30:12 +00:00
Tom Lane
fc299179df Separate the functions of relcache entry flush and smgr cache entry flush
so that we can get the size of a shared inval message back down to what it
was in 7.4 (and simplify the logic too).  Phase 2 of fixing the
'SMgrRelation hashtable corrupted' problem.
2005-01-10 21:57:19 +00:00
Tom Lane
0ce4d56924 Phase 1 of fix for 'SMgrRelation hashtable corrupted' problem. This
is the minimum required fix.  I want to look next at taking advantage of
it by simplifying the message semantics in the shared inval message queue,
but that part can be held over for 8.1 if it turns out too ugly.
2005-01-10 20:02:24 +00:00
PostgreSQL Daemon
2ff501590b Tag appropriate files for rc3
Also performed an initial run through of upgrading our Copyright date to
extend to 2005 ... first run here was very simple ... change everything
where: grep 1996-2004 && the word 'Copyright' ... scanned through the
generated list with 'less' first, and after, to make sure that I only
picked up the right entries ...
2004-12-31 22:04:05 +00:00
Tom Lane
eee5abce46 Refactor EXEC_BACKEND code so that postmaster child processes reattach
to shared memory as soon as possible, ie, right after read_backend_variables.
The effective difference from the original code is that this happens
before instead of after read_nondefault_variables(), which loads GUC
information and is apparently capable of expanding the backend's memory
allocation more than you'd think it should.  This should fix the
failure-to-attach-to-shared-memory reports we've been seeing on Windows.
Also clean up a few bits of unnecessarily grotty EXEC_BACKEND code.
2004-12-29 21:36:09 +00:00
Tom Lane
8f6278d907 Put in place some defenses against being fooled by accidental match of
shared memory segment ID.  If we can't access the existing shmem segment,
it must not be relevant to our data directory.  If we can access it,
then attach to it and check for an actual match to the data directory.
This should avoid some cases of failure-to-restart-after-boot without
introducing any significant risk of failing to detect a still-running
old backend.
2004-11-09 21:30:18 +00:00
Tom Lane
fdd13f1568 Give the ResourceOwner mechanism full responsibility for releasing buffer
pins at end of transaction, and reduce AtEOXact_Buffers to an Assert
cross-check that this was done correctly.  When not USE_ASSERT_CHECKING,
AtEOXact_Buffers is a complete no-op.  This gets rid of an O(NBuffers)
bottleneck during transaction commit/abort, which recent testing has shown
becomes significant above a few tens of thousands of shared buffers.
2004-10-16 18:57:26 +00:00
Tom Lane
1c2de47746 Remove BufferLocks[] array in favor of a single pointer to the buffer
(if any) currently waited for by LockBufferForCleanup(), which is all
that we were using it for anymore.  Saves some space and eliminates
proportional-to-NBuffers slowdown in UnlockBuffers().
2004-10-16 18:05:07 +00:00
Tom Lane
9ffc8ed58b Repair possible failure to update hint bits back to disk, per
http://archives.postgresql.org/pgsql-hackers/2004-10/msg00464.php.
This fix is intended to be permanent: it moves the responsibility for
calling SetBufferCommitInfoNeedsSave() into the tqual.c routines,
eliminating the requirement for callers to test whether t_infomask changed.
Also, tighten validity checking on buffer IDs in bufmgr.c --- several
routines were paranoid about out-of-range shared buffer numbers but not
about out-of-range local ones, which seems a tad pointless.
2004-10-15 22:40:29 +00:00
Neil Conway
f629583f94 Document what the "rep; nop" x86 assembler sequence is actually equivalent
to, and what it is intended to do.
2004-10-06 23:41:59 +00:00
Tom Lane
0fb3152ea9 Minor adjustments to improve the accuracy of our computation of required
shared memory size.
2004-09-29 15:15:56 +00:00
Tom Lane
3a246cc285 Arrange to preallocate all required space for the buffer and FSM hash
tables in shared memory.  This ensures that overflow of the lock table
creates no long-lasting problems.  Per discussion with Merlin Moncure.
2004-09-28 20:46:37 +00:00
Tom Lane
682598139e Get rid of /*-inside-comment warning. My fault. 2004-09-24 01:48:43 +00:00
Tom Lane
409b38f514 Fix TAS assembly stuff for Solaris/386. (I'm not in a position to
actually test this, but it couldn't be broken any worse than it was...)
2004-09-24 00:21:32 +00:00
Tom Lane
8f9f198603 Restructure subtransaction handling to reduce resource consumption,
as per recent discussions.  Invent SubTransactionIds that are managed like
CommandIds (ie, counter is reset at start of each top transaction), and
use these instead of TransactionIds to keep track of subtransaction status
in those modules that need it.  This means that a subtransaction does not
need an XID unless it actually inserts/modifies rows in the database.
Accordingly, don't assign it an XID nor take a lock on the XID until it
tries to do that.  This saves a lot of overhead for subtransactions that
are only used for error recovery (eg plpgsql exceptions).  Also, arrange
to release a subtransaction's XID lock as soon as the subtransaction
exits, in both the commit and abort cases.  This avoids holding many
unique locks after a long series of subtransactions.  The price is some
additional overhead in XactLockTableWait, but that seems acceptable.
Finally, restructure the state machine in xact.c to have a more orthogonal
set of states for subtransactions.
2004-09-16 16:58:44 +00:00
Tom Lane
5042985fb4 Add s_lock support for HPUX on IA64, per Shinji Teragaito. 2004-09-02 17:10:58 +00:00
Tom Lane
17364edce6 slock_t must be int not char for MIPS. 7.4 got this right, but the
info was apparently mistranscribed in s_lock code rearrangement.
2004-08-30 22:49:07 +00:00
Bruce Momjian
b6b71b85bc Pgindent run for 8.0. 2004-08-29 05:07:03 +00:00
Bruce Momjian
da9a8649d8 Update copyright to 2004. 2004-08-29 04:13:13 +00:00
Tom Lane
1785acebf2 Introduce local hash table for lock state, as per recent proposal.
PROCLOCK structs in shared memory now have only a bitmask for held
locks, rather than counts (making them 40 bytes smaller, which is a
good thing).  Multiple locks within a transaction are counted in the
local hash table instead, and we have provision for tracking which
ResourceOwner each count belongs to.  Solves recently reported problem
with memory leakage within long transactions.
2004-08-27 17:07:42 +00:00
Tom Lane
51d7e25651 Improve some comments. 2004-08-26 17:22:28 +00:00
Tom Lane
4dbb880d3c Rearrange pg_subtrans handling as per recent discussion. pg_subtrans
updates are no longer WAL-logged nor even fsync'd; we do not need to,
since after a crash no old pg_subtrans data is needed again.  We truncate
pg_subtrans to RecentGlobalXmin at each checkpoint.  slru.c's API is
refactored a little bit to separate out the necessary decisions.
2004-08-23 23:22:45 +00:00
Tom Lane
3fdf649f4f Fix failure to guarantee that a checkpoint will write out pg_clog updates
for transaction commits that occurred just before the checkpoint.  This is
an EXTREMELY serious bug --- kudos to Satoshi Okada for creating a
reproducible test case to prove its existence.
2004-08-11 04:07:16 +00:00
Tom Lane
fcbc438727 Label CVS tip as 8.0devel instead of 7.5devel. Adjust various comments
and documentation to reference 8.0 instead of 7.5.
2004-08-04 21:34:35 +00:00
Tom Lane
efcaf1e868 Some mop-up work for savepoints (nested transactions). Store a small
number of active subtransaction XIDs in each backend's PGPROC entry,
and use this to avoid expensive probes into pg_subtrans during
TransactionIdIsInProgress.  Extend EOXactCallback API to allow add-on
modules to get control at subxact start/end.  (This is deliberately
not compatible with the former API, since any uses of that API probably
need manual review anyway.)  Add basic reference documentation for
SAVEPOINT and related commands.  Minor other cleanups to check off some
of the open issues for subtransactions.
Alvaro Herrera and Tom Lane.
2004-08-01 17:32:22 +00:00
Tom Lane
1bf3d61504 Fix subtransaction behavior for large objects, temp namespace, files,
password/group files.  Also allow read-only subtransactions of a read-write
parent, but not vice versa.  These are the reasonably noncontroversial
parts of Alvaro's recent mop-up patch, plus further work on large objects
to minimize use of the TopTransactionResourceOwner.
2004-07-28 14:23:31 +00:00
Tom Lane
2042b3428d Invent WAL timelines, as per recent discussion, to make point-in-time
recovery more manageable.  Also, undo recent change to add FILE_HEADER
and WASTED_SPACE records to XLOG; instead make the XLOG page header
variable-size with extra fields in the first page of an XLOG file.
This should fix the boundary-case bugs observed by Mark Kirkwood.
initdb forced due to change of XLOG representation.
2004-07-21 22:31:26 +00:00
Bruce Momjian
7a55ba7615 Back out pg_autovacuum commit after cvs clean failure causes commit. 2004-07-21 20:34:50 +00:00
Bruce Momjian
8dec0c1bf2 lease find enclosed a patch that matches the PL/Perl documentation
(fairly closely, I hope) to the current PL/Perl implementation.

David Fetter
2004-07-21 20:23:05 +00:00
Tom Lane
66ec2db728 XLOG file archiving and point-in-time recovery. There are still some
loose ends and a glaring lack of documentation, but it basically works.

Simon Riggs with some editorialization by Tom Lane.
2004-07-19 02:47:16 +00:00
Tom Lane
fe548629c5 Invent ResourceOwner mechanism as per my recent proposal, and use it to
keep track of portal-related resources separately from transaction-related
resources.  This allows cursors to work in a somewhat sane fashion with
nested transactions.  For now, cursor behavior is non-subtransactional,
that is a cursor's state does not roll back if you abort a subtransaction
that fetched from the cursor.  We might want to change that later.
2004-07-17 03:32:14 +00:00
Tom Lane
573a71a5da Nested transactions. There is still much left to do, especially on the
performance front, but with feature freeze upon us I think it's time to
drive a stake in the ground and say that this will be in 7.5.

Alvaro Herrera, with some help from Tom Lane.
2004-07-01 00:52:04 +00:00
Tom Lane
1098677482 Adjust TAS assembly as per recent discussions: use "+m"(*lock) everywhere
to reference the spinlock variable, and specify "memory" as a clobber
operand to be sure gcc does not try to keep shared-memory values in
registers across a spinlock acquisition.  Also tighten the S/390 asm
sequence, which was apparently written with only minimal study of the
gcc asm documentation.  I have personally tested i386, ia64, ppc, hppa,
and s390 variants --- there is some small chance that I broke the others,
but I doubt it.
2004-06-19 23:02:32 +00:00
Tom Lane
2467394ee1 Tablespaces. Alternate database locations are dead, long live tablespaces.
There are various things left to do: contrib dbsize and oid2name modules
need work, and so does the documentation.  Also someone should think about
COMMENT ON TABLESPACE and maybe RENAME TABLESPACE.  Also initlocation is
dead, it just doesn't know it yet.

Gavin Sherry and Tom Lane.
2004-06-18 06:14:31 +00:00
Tom Lane
e6cba71503 Add some code to Assert that when we release pin on a buffer, we are
not holding the buffer's cntx_lock or io_in_progress_lock.  A recent
report from Litao Wu makes me wonder whether it is ever possible for
us to drop a buffer and forget to release its cntx_lock.  The Assert
does not fire in the regression tests, but that proves little ...
2004-06-11 16:43:24 +00:00
Tom Lane
24a1e20f14 Adjust PageGetMaxOffsetNumber to ensure sane behavior on uninitialized
pages, even when the macro's result is stored into an unsigned variable.
2004-06-05 17:42:46 +00:00
Bruce Momjian
e8d9d68ca4 Per previous discussions, here are two functions to send INT and TERM
(cancel and terminate) signals to other backends.   They permit only INT
and TERM, and permits sending only to postgresql backends.

Magnus Hagander
2004-06-02 21:29:29 +00:00
Tom Lane
2095206de1 Adjust btree index build to not use shared buffers, thereby avoiding the
locking conflict against concurrent CHECKPOINT that was discussed a few
weeks ago.  Also, if not using WAL archiving (which is always true ATM
but won't be if PITR makes it into this release), there's no need to
WAL-log the index build process; it's sufficient to force-fsync the
completed index before commit.  This seems to gain about a factor of 2
in my tests, which is consistent with writing half as much data.  I did
not try it with WAL on a separate drive though --- probably the gain would
be a lot less in that scenario.
2004-06-02 17:28:18 +00:00
Tom Lane
91d20ff7aa Additional mop-up for sync-to-fsync changes: avoid issuing fsyncs for
temp tables, and avoid WAL-logging truncations of temp tables.  Do issue
fsync on truncated files (not sure this is necessary but it seems like
a good idea).
2004-05-31 20:31:33 +00:00
Tom Lane
e674707968 Minor code rationalization: FlushRelationBuffers just returns void,
rather than an error code, and does elog(ERROR) not elog(WARNING)
when it detects a problem.  All callers were simply elog(ERROR)'ing on
failure return anyway, and I find it hard to envision a caller that would
not, so we may as well simplify the callers and produce the more useful
error message directly.
2004-05-31 19:24:05 +00:00
Tom Lane
9b178555fc Per previous discussions, get rid of use of sync(2) in favor of
explicitly fsync'ing every (non-temp) file we have written since the
last checkpoint.  In the vast majority of cases, the burden of the
fsyncs should fall on the bgwriter process not on backends.  (To this
end, we assume that an fsync issued by the bgwriter will force out
blocks written to the same file by other processes using other file
descriptors.  Anyone have a problem with that?)  This makes the world
safe for WIN32, which ain't even got sync(2), and really makes the world
safe for Unixen as well, because sync(2) never had the semantics we need:
it offers no way to wait for the requested I/O to finish.

Along the way, fix a bug I recently introduced in xlog recovery:
file truncation replay failed to clear bufmgr buffers for the dropped
blocks, which could result in 'PANIC:  heap_delete_redo: no block'
later on in xlog replay.
2004-05-31 03:48:10 +00:00
Tom Lane
076a055acf Separate out bgwriter code into a logically separate module, rather
than being random pieces of other files.  Give bgwriter responsibility
for all checkpoint activity (other than a post-recovery checkpoint);
so this child process absorbs the functionality of the former transient
checkpoint and shutdown subprocesses.  While at it, create an actual
include file for postmaster.c, which for some reason never had its own
file before.
2004-05-29 22:48:23 +00:00