page when it's read in, per pghackers discussion around 17-Feb. Add a
GUC variable zero_damaged_pages that causes the response to be a WARNING
followed by zeroing the page, rather than the normal ERROR; this is per
Hiroshi's suggestion that there needs to be a way to get at the data
in the rest of the table.
(materialization into a tuple store) discussed on pgsql-hackers earlier.
I've updated the documentation and the regression tests.
Notes on the implementation:
- I needed to change the tuple store API slightly -- it assumes that it
won't be used to hold data across transaction boundaries, so the temp
files that it uses for on-disk storage are automatically reclaimed at
end-of-transaction. I added a flag to tuplestore_begin_heap() to control
this behavior. Is changing the tuple store API in this fashion OK?
- in order to store executor results in a tuple store, I added a new
CommandDest. This works well for the most part, with one exception: the
current DestFunction API doesn't provide enough information to allow the
Executor to store results into an arbitrary tuple store (where the
particular tuple store to use is chosen by the call site of
ExecutorRun). To workaround this, I've temporarily hacked up a solution
that works, but is not ideal: since the receiveTuple DestFunction is
passed the portal name, we can use that to lookup the Portal data
structure for the cursor and then use that to get at the tuple store the
Portal is using. This unnecessarily ties the Portal code with the
tupleReceiver code, but it works...
The proper fix for this is probably to change the DestFunction API --
Tom suggested passing the full QueryDesc to the receiveTuple function.
In that case, callers of ExecutorRun could "subclass" QueryDesc to add
any additional fields that their particular CommandDest needed to get
access to. This approach would work, but I'd like to think about it for
a little bit longer before deciding which route to go. In the mean time,
the code works fine, so I don't think a fix is urgent.
- (semi-related) I added a NO SCROLL keyword to DECLARE CURSOR, and
adjusted the behavior of SCROLL in accordance with the discussion on
-hackers.
- (unrelated) Cleaned up some SGML markup in sql.sgml, copy.sgml
Neil Conway
at database shutdown, and then load it again at database startup. This
preserves our hard-won knowledge of free space across restarts (given
an orderly shutdown, that is).
Adjustable threshold is gone in favor of keeping track of total requested
page storage and doling out proportional fractions to each relation
(with a minimum amount per relation, and some quantization of the results
to avoid thrashing with small changes in page counts). Provide special-
case code for indexes so as not to waste space storing useless page
free space counts. Restructure internal data storage to be a flat array
instead of list-of-chunks; this may cost a little more work in data
copying when reorganizing, but allows binary search to be used during
lookup_fsm_page_entry().
RelOid_pg_class, and transaction locks XactLockTableId. RelId is renamed
to objId.
- LockObject() and UnlockObject() functions created, and their use
sprinkled throughout the code to do descent locking for domains and
types. They accept lock modes AccessShare and AccessExclusive, as we
only really need a 'read' and 'write' lock at the moment. Most locking
cases are held until the end of the transaction.
This fixes the cases Tom mentioned earlier in regards to locking with
Domains. If the patch is good, I'll work on cleaning up issues with
other database objects that have this problem (most of them).
Rod Taylor
>
> ... he is now about to write an inlined version that can go into
> s_lock.h . I'll send the new patch later on...
OK, here it comes:
An inlined version of tas(), that works for both, powerpc and
powerpc64. The patch is against 7.3b5 and passes the test suite on
both architectures.
Reinhard Max
between signal handler and enable/disable code, avoid accumulation of
timing error due to trying to maintain remaining-time instead of
absolute-end-time, disable timeout before commit not after.
during the regression test. The problem has been reproduced on two machine
but both of these are the same type of hardware and software. I also tried
to recreate the problem on other machines, on older version of AIX but I
couldn't.
After looked through pgsql-hackers mailing list, I focused on spin lock
issue to solve the problem. The easiest and may not be the best solution
for the problem is to give up HAS_TEST_AND_SET. This actually works.
One another and better solution for the problem is to use _check_lock() and
_clear_lock() as spin lock. Important thing here is to define S_UNLOCK()
with _clear_lock(). This will solve the so called "Compiler bug" issue
someone wrote on the mailing list.
We have some other API such as cs(), compare_and_swap() and fetch_and_or()
to do test and set on AIX, but any of these didn't solve my problem. I
wrote tiny testing program to see if we have any bug of these API of AIX,
but I couldn't see any problem except for compare_and_swap(). It seems that
you can not use compare_and_swap() for the purpose, as it would not work as
spin lock on any SMP machines I tested. I don't know the reason why cs()
nor fetch_and_or()/fetch_and_and() will not work with PostgreSQL on p690.
These worked with my testing program on all machines I tested.
Tomoyuki Niijima
(overlaying low byte of page size) and add HEAP_HASOID bit to t_infomask,
per earlier discussion. Simplify scheme for overlaying fields in tuple
header (no need for cmax to live in more than one place). Don't try to
clear infomask status bits in tqual.c --- not safe to do it there. Don't
try to force output table of a SELECT INTO to have OIDs, either. Get rid
of unnecessarily complex three-state scheme for TupleDesc.tdhasoids, which
has already caused one recent failure. Improve documentation.
available (else there's no way to interpret the list links). Change
pg_locks view to show transaction ID locks separately from ordinary
relation locks. Avoid showing N duplicate rows when the same lock is
held multiple times (seems unlikely that users care about exact hold
count). Improve documentation.
connections by the superuser only.
This patch replaces the last patch I sent a couple of days ago.
It closes a connection that has not been authorised by a superuser if it would
leave less than the GUC variable ReservedBackends
(superuser_reserved_connections in postgres.conf) backend process slots free
in the SISeg. This differs to the first patch which only reserved the last
ReservedBackends slots in the procState array. This has made the free slot
test more expensive due to the use of a lock.
After thinking about a comment on the first patch I've also made it a fatal
error if the number of reserved slots is not less than the maximum number of
connections.
Nigel J. Andrews
This patch is an updated version of the lock listing patch. I've made
the following changes:
- write documentation
- wrap the SRF in a view called 'pg_locks': all user-level
access should be done through this view
- re-diff against latest CVS
One thing I chose not to do is adapt the SRF to use the anonymous
composite type code from Joe Conway. I'll probably do that eventually,
but I'm not really convinced it's a significantly cleaner way to
bootstrap SRF builtins than the method this patch uses (of course, it
has other uses...)
Neil Conway
The local buffer manager is no longer used for newly-created relations
(unless they are TEMP); a new non-TEMP relation goes through the shared
bufmgr and thus will participate normally in checkpoints. But TEMP relations
use the local buffer manager throughout their lifespan. Also, operations
in TEMP relations are not logged in WAL, thus improving performance.
Since it's no longer necessary to fsync relations as they move out of the
local buffers into shared buffers, quite a lot of smgr.c/md.c/fd.c code
is no longer needed and has been removed: there's no concept of a dirty
relation anymore in md.c/fd.c, and we never fsync anything but WAL.
Still TODO: improve local buffer management algorithms so that it would
be reasonable to increase NLocBuffer.
>usefulness.
>
>> [...] should I post a patch that puts pagesize directly into
>> PageHeaderData?
>
>If you're so inclined. Given that pd_opaque is hidden in those macros,
>there wouldn't be much of any gain in readability either, so I haven't
>worried about changing the declaration.
Thanks for the clarification. Here is the patch. Not much gain, but at
least it saves the next junior hacker from scratching his head ...
Manfred Koizar
all places, where pd_linp is accessed. Also introduce new macros
SizeOfPageHeaderData and BTMaxItemSize. This is just source code
cosmetic, no behaviour changed.
Manfred Koizar
lines of code into internal routines (drop_relfilenode_buffers,
release_buffer) and by hiding unused routines (PrintBufferDescs,
PrintPinnedBufs) behind #ifdef NOT_USED. Remove AbortBufferIO()
declaration from bufmgr.c (already declared in bufmgr.h)
Manfred Koizar
> Changes to avoid collisions with WIN32 & MFC names...
> 1. Renamed:
> a. PROC => PGPROC
> b. GetUserName() => GetUserNameFromId()
> c. GetCurrentTime() => GetCurrentDateTime()
> d. IGNORE => IGNORE_DTF in include/utils/datetime.h & utils/adt/datetim
>
> 2. Added _P to some lex/yacc tokens:
> CONST, CHAR, DELETE, FLOAT, GROUP, IN, OUT
Jan
As proof of concept, provide an alternate implementation based on POSIX
semaphores. Also push the SysV shared-memory implementation into a
separate file so that it can be replaced conveniently.
was in the thread "make BufferGetBlockNumber() a macro". Tom
objected to the original patch, so I prepared a new one which
doesn't change BufferGetBlockNumber() into a macro, it just
cleans up some comments and fixes an assertion. The patch
is attached.
Neil Conway
speed up repetitive failed searches; per pghackers discussion in late
January. inval.c logic substantially simplified, since we can now treat
inserts and deletes alike as far as inval events are concerned. Some
repair work needed in heap_create_with_catalog, which turns out to have
been doing CommandCounterIncrement at a point where the new relation has
non-self-consistent catalog entries. With the new inval code, that
resulted in assert failures during a relcache entry rebuild.
Improve 'pg_internal.init' relcache entry preload mechanism so that it is
safe to use for all system catalogs, and arrange to preload a realistic
set of system-catalog entries instead of only the three nailed-in-cache
indexes that were formerly loaded this way. Fix mechanism for deleting
out-of-date pg_internal.init files: this must be synchronized with transaction
commit, not just done at random times within transactions. Drive it off
relcache invalidation mechanism so that no special-case tests are needed.
Cache additional information in relcache entries for indexes (their pg_index
tuples and index-operator OIDs) to eliminate repeated lookups. Also cache
index opclass info at the per-opclass level to avoid repeated lookups during
relcache load.
Generalize 'systable scan' utilities originally developed by Hiroshi,
move them into genam.c, use in a number of places where there was formerly
ugly code for choosing either heap or index scan. In particular this allows
simplification of the logic that prevents infinite recursion between syscache
and relcache during startup: we can easily switch to heapscans in relcache.c
when and where needed to avoid recursion, so IndexScanOK becomes simpler and
does not need any expensive initialization.
Eliminate useless opening of a heapscan data structure while doing an indexscan
(this saves an mdnblocks call and thus at least one kernel call).