Commit graph

4407 commits

Author SHA1 Message Date
Marc G. Fournier
2a174e45dd update files for beta3 2007-11-16 04:29:45 +00:00
Bruce Momjian
f6e8730d11 Re-run pgindent with updated list of typedefs. (Updated README should
avoid this problem in the future.)
2007-11-15 22:25:18 +00:00
Bruce Momjian
fdf5a5efb7 pgindent run for 8.3. 2007-11-15 21:14:46 +00:00
Tom Lane
6cc4451b5c Prevent re-use of a deleted relation's relfilenode until after the next
checkpoint.  This guards against an unlikely data-loss scenario in which
we re-use the relfilenode, then crash, then replay the deletion and
recreation of the file.  Even then we'd be OK if all insertions into the
new relation had been WAL-logged ... but that's not guaranteed given all
the no-WAL-logging optimizations that have recently been added.

Patch by Heikki Linnakangas, per a discussion last month.
2007-11-15 20:36:40 +00:00
Tom Lane
4394c1b09c Resurrect the code for the rewrite(ARRAY[...]) aggregate function,
and put it into contrib/tsearch2 compatibility module.
2007-11-13 22:14:50 +00:00
Tom Lane
0bd4da23a4 Ensure that typmod decoration on a datatype name is validated in all cases,
even in code paths where we don't pay any subsequent attention to the typmod
value.  This seems needed in view of the fact that 8.3's generalized typmod
support will accept a lot of bogus syntax, such as "timestamp(foo)" or
"record(int, 42)" --- if we allow such things to pass without comment,
users will get confused.  Per a recent example from Greg Stark.

To implement this in a way that's not very vulnerable to future
bugs-of-omission, refactor the API of parse_type.c's TypeName lookup routines
so that typmod validation is folded into the base lookup operation.  Callers
can still choose not to receive the encoded typmod, but we'll check the
decoration anyway if it's present.
2007-11-11 19:22:49 +00:00
Tom Lane
654dcfb9e4 Clean up ts_locale.h/.c. Fix broken and not-consistent-across-platforms
behavior of wchar2char/char2wchar; this should resolve bug #3730.  Avoid
excess computations of pg_mblen in t_isalpha and friends.  Const-ify
APIs where possible.
2007-11-09 22:37:35 +00:00
Magnus Hagander
4b606ee444 Add parameter krb_realm used by GSSAPI, SSPI and Kerberos
to validate the realm of the connecting user. By default
it's empty meaning no verification, which is the way
Kerberos authentication has traditionally worked in
PostgreSQL.
2007-11-09 17:31:07 +00:00
Tom Lane
c291203ca3 Fix EquivalenceClass code to handle volatile sort expressions in a more
predictable manner; in particular that if you say ORDER BY output-column-ref,
it will in fact sort by that specific column even if there are multiple
syntactic matches.  An example is
	SELECT random() AS a, random() AS b FROM ... ORDER BY b, a;
While the use-case for this might be a bit debatable, it worked as expected
in earlier releases, so we should preserve the behavior for 8.3.  Per my
recent proposal.

While at it, fix convert_subquery_pathkeys() to handle RelabelType stripping
in both directions; it needs this for the same reasons make_sort_from_pathkeys
does.
2007-11-08 21:49:48 +00:00
Tom Lane
1be0601681 Last week's patch for make_sort_from_pathkeys wasn't good enough: it has
to be able to discard top-level RelabelType nodes on *both* sides of the
equivalence-class-to-target-list comparison, since make_pathkey_from_sortinfo
might either add or remove a RelabelType.  Also fix the latter to do the
removal case cleanly.  Per example from Peter.
2007-11-08 19:25:37 +00:00
Tom Lane
2de946be6a Improve the performance of LIKE/regex estimation in non-C locales, by making
make_greater_string() try harder to generate a string that's actually greater
than its input string.  Before we just assumed that making a string that was
memcmp-greater was enough, but it is easy to generate examples where this is
not so when the locale is not C.  Instead, loop until the relevant comparison
function agrees that the generated string is greater than the input.

Unfortunately this is probably not enough to guarantee that the generated
string is greater than all extensions of the input, so we cannot relax the
restriction to C locale for the LIKE/regex index optimization.  But it should
at least improve the odds of getting a useful selectivity estimate in
prefix_selectivity().  Per example from Guillaume Smet.

Backpatch to 8.1, mainly because that's what the complainant is using...
2007-11-07 22:37:24 +00:00
Peter Eisentraut
5f9869d0ee Use "alternative" instead of "alternate" where it is clearer. 2007-11-07 12:24:24 +00:00
Bruce Momjian
c1a03bee08 Document that configure option only affects contrib:
--with-ossp-uuid        use OSSP UUID library when building /contrib/uuid-ossp
2007-11-05 17:43:20 +00:00
Tom Lane
a06ce21c72 Add a note about another issue that needs to be considered before
changing the TOAST size thresholds.
2007-11-05 14:11:17 +00:00
Tom Lane
b17b7fae8c Remove the hack in the grammar that "optimized away" DEFAULT NULL clauses.
Instead put in a test to drop a NULL default at the last moment before
storing the catalog entry.  This changes the behavior in a couple of ways:
* Specifying DEFAULT NULL when creating an inheritance child table will
  successfully suppress inheritance of any default expression from the
  parent's column, where formerly it failed to do so.
* Specifying DEFAULT NULL for a column of a domain type will correctly
  override any default belonging to the domain; likewise for a sub-domain.
The latter change happens because by the time the clause is checked,
it won't be a simple null Const but a CoerceToDomain expression.

Personally I think this should be back-patched, but there doesn't seem to
be consensus for that on pgsql-hackers, so refraining.
2007-10-29 19:40:40 +00:00
Magnus Hagander
6f14effbf9 New versions of mingw have gettimeofday(), so add an autoconf test
for this.
2007-10-29 11:25:42 +00:00
Tom Lane
5b5a70aedf Stamp 8.3beta2. 2007-10-27 00:22:42 +00:00
Magnus Hagander
bb98b2e27e Change win32 child-death tracking code to use a threadpool to wait for
childprocess deaths instead of using one thread per child. This drastastically
reduces the address space usage and should allow for more backends running.

Also change the win32_waitpid functionality to use an IO Completion Port for
queueing child death notices instead of using a fixed-size array.
2007-10-26 21:50:10 +00:00
Alvaro Herrera
acac68b2bc Allow an autovacuum worker to be interrupted automatically when it is found
to be locking another process (except when it's working to prevent Xid
wraparound problems).
2007-10-26 20:45:10 +00:00
Tom Lane
048efc25e4 Disallow scrolling of FOR UPDATE/FOR SHARE cursors, so as to avoid problems
in corner cases such as re-fetching a just-deleted row.  We may be able to
relax this someday, but let's find out how many people really care before
we invest a lot of work in it.  Per report from Heikki and subsequent
discussion.

While in the neighborhood, make the combination of INSENSITIVE and FOR UPDATE
throw an error, since they are semantically incompatible.  (Up to now we've
accepted but just ignored the INSENSITIVE option of DECLARE CURSOR.)
2007-10-24 23:27:08 +00:00
Alvaro Herrera
745c1b2c2a Rearrange vacuum-related bits in PGPROC as a bitmask, to better support
having several of them.  Add two more flags: whether the process is
executing an ANALYZE, and whether a vacuum is for Xid wraparound (which
is obviously only set by autovacuum).

Sneakily move the worker's recently-acquired PostAuthDelay to a more useful
place.
2007-10-24 20:55:36 +00:00
Tom Lane
c29a9c37bf Fix UPDATE/DELETE WHERE CURRENT OF to support repeated update and update-
then-delete on the current cursor row.  The basic fix is that nodeTidscan.c
has to apply heap_get_latest_tid() to the current-scan-TID obtained from the
cursor query; this ensures we get the latest row version to work with.
However, since that only works if the query plan is a TID scan, we also have
to hack the planner to make sure only that type of plan will be selected.
(Formerly, the planner might decide to apply a seqscan if the table is very
small.  This change is probably a Good Thing anyway, since it's hard to see
how a seqscan could really win.)  That means the execQual.c code to support
CurrentOfExpr as a regular expression type is dead code, so replace it with
just an elog().  Also, add regression tests covering these cases.  Note
that the added tests expose the fact that re-fetching an updated row
misbehaves if the cursor used FOR UPDATE.  That's an independent bug that
should be fixed later.  Per report from Dharmendra Goyal.
2007-10-24 18:37:09 +00:00
Tom Lane
592c88a0d2 Remove the aggregate form of ts_rewrite(), since it doesn't work as desired
if there are zero rows to aggregate over, and the API seems both conceptually
and notationally ugly anyway.  We should look for something that improves
on the tsquery-and-text-SELECT version (which is also pretty ugly but at
least it works...), but it seems that will take query infrastructure that
doesn't exist today.  (Hm, I wonder if there's anything in or near SQL2003
window functions that would help?)  Per discussion.
2007-10-24 02:24:49 +00:00
Tom Lane
07d0a370c1 Make configure probe for the location of the <uuid.h> header file.
Needed to accommodate different layout on some platforms (Debian for
one).  Heikki Linnakangas
2007-10-23 21:38:16 +00:00
Tom Lane
dbaec70c15 Rename and slightly redefine the default text search parser's "word"
categories, as per discussion.  asciiword (formerly lword) is still
ASCII-letters-only, and numword (formerly word) is still the most general
mixed-alpha-and-digits case.  But word (formerly nlword) is now
any-group-of-letters-with-at-least-one-non-ASCII, rather than all-non-ASCII as
before.  This is no worse than before for parsing mixed Russian/English text,
which seems to have been the design center for the original coding; and it
should simplify matters for parsing most European languages.  In particular
it will not be necessary for any language to accept strings containing digits
as being regular "words".  The hyphenated-word categories are adjusted
similarly.
2007-10-23 20:46:12 +00:00
Tom Lane
12f25e70a6 Fix two-argument form of ts_rewrite() so it actually works for cases where
a later rewrite rule should change a subtree modified by an earlier one.
Per my gripe of a few days ago.
2007-10-23 01:44:40 +00:00
Tom Lane
3e17ef1cfa Adjust ts_debug's output as per my proposal of yesterday: show the
active dictionary and its output lexemes as separate columns, instead
of smashing them into one text column, and lowercase the column names.
Also, define the output rowtype using OUT parameters instead of a
composite type, to be consistent with the other built-in functions.
2007-10-22 20:13:37 +00:00
Tom Lane
1ea47dd8cb Fix shared tsvector/tsquery input code so that we don't say "syntax error in
tsvector" when we are really parsing a tsquery.  Report the bogus input,
too.  Make styles of some related error messages more consistent.
2007-10-21 22:29:56 +00:00
Tom Lane
638bd34f89 Found another small glitch in tsearch API: the two versions of ts_lexize()
are really redundant, since we invented a regdictionary alias type.
We can have just one function, declared as taking regdictionary, and
it will handle both behaviors.  Noted while working on documentation.
2007-10-19 22:01:45 +00:00
Tom Lane
ba6b0bfd63 ts_rewrite() does not return a set, only one row; fix mislabeling in
pg_proc.h.
2007-10-19 19:48:34 +00:00
Tom Lane
febd60bf5d Fix pg_wchar_table[] to match revised ordering of the encoding ID enum.
Add some comments so hopefully the next poor sod doesn't fall into the
same trap.  (Wrong comments are worse than none at all...)
2007-10-15 22:46:27 +00:00
Tom Lane
7cf3ff109d make install is supposed to install everything under src/include/,
but it was missing a bunch of recently-added subdirectories.
2007-10-14 17:07:51 +00:00
Tom Lane
18e3fcc31e Migrate the former contrib/txid module into core. This will make it easier
for Slony and Skytools to depend on it.  Per discussion.
2007-10-13 23:06:28 +00:00
Tom Lane
8468146b03 Fix the inadvertent libpq ABI breakage discovered by Martin Pitt: the
renumbering of encoding IDs done between 8.2 and 8.3 turns out to break 8.2
initdb and psql if they are run with an 8.3beta1 libpq.so.  For the moment
we can rearrange the order of enum pg_enc to keep the same number for
everything except PG_JOHAB, which isn't a problem since there are no direct
references to it in the 8.2 programs anyway.  (This does force initdb
unfortunately.)

Going forward, we want to fix things so that encoding IDs can be changed
without an ABI break, and this commit includes the changes needed to allow
libpq's encoding IDs to be treated as fully independent of the backend's.
The main issue is that libpq clients should not include pg_wchar.h or
otherwise assume they know the specific values of libpq's encoding IDs,
since they might encounter version skew between pg_wchar.h and the libpq.so
they are using.  To fix, have libpq officially export functions needed for
encoding name<=>ID conversion and validity checking; it was doing this
anyway unofficially.

It's still the case that we can't renumber backend encoding IDs until the
next bump in libpq's major version number, since doing so will break the
8.2-era client programs.  However the code is now prepared to avoid this
type of problem in future.

Note that initdb is no longer a libpq client: we just pull in the two
source files we need directly.  The patch also fixes a few places that
were being sloppy about checking for an unrecognized encoding name.
2007-10-13 20:18:42 +00:00
Tom Lane
537e92e41f Fix ALTER COLUMN TYPE to preserve the tablespace and reloptions of indexes
it affects.  The original coding neglected tablespace entirely (causing
the indexes to move to the database's default tablespace) and for an index
belonging to a UNIQUE or PRIMARY KEY constraint, it would actually try to
assign the parent table's reloptions to the index :-(.  Per bug #3672 and
subsequent investigation.

8.0 and 8.1 did not have reloptions, but the tablespace bug is present.
2007-10-13 15:55:40 +00:00
Tom Lane
82d8ab6fc4 Fix the plan-invalidation mechanism to treat regclass constants that refer to
a relation as a reason to invalidate a plan when the relation changes.  This
handles scenarios such as dropping/recreating a sequence that is referenced by
nextval('seq') in a cached plan.  Rather than teach plancache.c all about
digging through plan trees to find regclass Consts, we charge the planner's
setrefs.c with making a list of the relation OIDs on which each plan depends.
That way the list can be built cheaply during a plan tree traversal that has
to happen anyway.  Per bug #3662 and subsequent discussion.
2007-10-11 18:05:27 +00:00
Tom Lane
c4db0d9ae1 Adjust regcustom.h so that all those assert() calls in the regex package
are converted to Postgres Assert() macros, instead of using <assert.h>
as formerly.  No difference in production builds, but --enable-cassert
debug builds will get better coverage for regex testing.
2007-10-06 16:01:51 +00:00
Tom Lane
89db887b1e Keep the planner from failing on "WHERE false AND something IN (SELECT ...)".
eval_const_expressions simplifies this to just "WHERE false", but we have
already done pull_up_IN_clauses so the IN join will be done, or at least
planned, anyway.  The trouble case comes when the sub-SELECT is itself a join
and we decide to implement the IN by unique-ifying the sub-SELECT outputs:
with no remaining reference to the output Vars in WHERE, we won't have
propagated the Vars up to the upper join point, leading to "variable not found
in subplan target lists" error.  Fix by adding an extra scan of in_info_list
and forcing all Vars mentioned therein to be propagated up to the IN join
point.  Per bug report from Miroslav Sulc.
2007-10-04 20:44:47 +00:00
Tom Lane
87dfa0d9ae Stamp 8.3beta1, except in configure.in/configure. 2007-10-04 19:12:04 +00:00
Tom Lane
f1d37a9997 Cope with ERR_set_mark() and ERR_pop_to_mark() not existing in older
OpenSSL libraries --- just don't call them if they're not there.  This
might possibly lead to misleading error messages, but we'll just have
to live with that.
2007-10-02 00:25:20 +00:00
Tom Lane
b526462f9e Avoid assuming that struct varattrib_pointer doesn't get padded by the
compiler --- at least on ARM, it does.  I suspect that the varvarlena patch
has been creating larger-than-intended toast pointers all along on ARM,
but it wasn't exposed until the latest tweak added some Asserts that
calculated the expected size in a different way.  We could probably have
fixed this by adding __attribute__((packed)) as is done for ItemPointerData,
but struct varattrib_pointer isn't really all that useful anyway, so it
seems cleanest to just get rid of it and have only struct varattrib_1b_e.
Per results from buildfarm member quagga.
2007-10-01 16:25:56 +00:00
Magnus Hagander
4164e6636e Enable __FUNCTION__ on MSVC builds.
Hannes Eder
2007-10-01 10:54:29 +00:00
Tom Lane
27b8922221 Add an extra header byte to TOAST-pointer datums to represent their size
explicitly.  This means a TOAST pointer takes 18 bytes instead of 17 --- still
smaller than in 8.2 --- which seems a good tradeoff to ensure we won't have
painted ourselves into a corner if we want to support multiple types of TOAST
pointer later on.  Per discussion with Greg Stark.
2007-09-30 19:54:58 +00:00
Tom Lane
70b9b9b788 Change initdb and CREATE DATABASE to actively reject attempts to create
databases with encodings that are incompatible with the server's LC_CTYPE
locale, when we can determine that (which we can on most modern platforms,
I believe).  C/POSIX locale is compatible with all encodings, of course,
so there is still some usefulness to CREATE DATABASE's ENCODING option,
but this will insulate us against all sorts of recurring complaints
caused by mismatched settings.

I moved initdb's existing LC_CTYPE-to-encoding mapping knowledge into
a new src/port/ file so it could be shared by CREATE DATABASE.
2007-09-28 22:25:49 +00:00
Tom Lane
aedc5ed571 Fix typos in two comments. Spotted by Brendan Jurd 2007-09-27 21:01:59 +00:00
Tom Lane
314ed5de6d Define the FRONTEND symbol in postgres_fe.h, which allows us to eliminate
duplicative -DFRONTEND flags from many Makefiles.  We still need Makefile
control of the symbol in a few places that compile frontend-or-backend
src/port/ files, but it's a lot cleaner than before.

Hiroshi Saito
2007-09-27 19:53:44 +00:00
Tom Lane
f18dfc4835 Minor improvements in backup and recovery:
- create a separate archive_mode GUC, on which archive_command is dependent

- %r option in recovery.conf sends last restartpoint to recovery command

- %r used in pg_standby, updated README

- minor other code cleanup in pg_standby

- doc on Warm Standby now mentions pg_standby and %r

- log_restartpoints recovery option emits LOG message at each restartpoint

- end of recovery now displays last transaction end time, as requested
  by Warren Little; also shown at each restartpoint

- restart archiver if needed to carry away WAL files at shutdown

Simon Riggs
2007-09-26 22:36:30 +00:00
Tom Lane
cdf0231c88 Create a function variable "join_search_hook" to let plugins override the
join search order portion of the planner; this is specifically intended to
simplify developing a replacement for GEQO planning.  Patch by Julius
Stroffek, editorialized on by me.  I renamed make_one_rel_by_joins to
standard_join_search and make_rels_by_joins to join_search_one_level to better
reflect their place within this scheme.
2007-09-26 18:51:51 +00:00
Tom Lane
f828f878e9 Change on-disk representation of NUMERIC datatype so that the sign_dscale
word comes before the weight instead of after.  This will allow future
binary-compatible extension of the representation to support compact formats,
as discussed on pgsql-hackers around 2007/06/18.  The reason to do it now is
that we've already pretty well broken any chance of simple in-place upgrade
from 8.2 to 8.3, but it's possible that 8.3 to 8.4 (or whenever we get around
to squeezing NUMERIC) could otherwise be data-compatible.
2007-09-25 22:21:55 +00:00
Tom Lane
6f5c38dcd0 Just-in-time background writing strategy. This code avoids re-scanning
buffers that cannot possibly need to be cleaned, and estimates how many
buffers it should try to clean based on moving averages of recent allocation
requests and density of reusable buffers.  The patch also adds a couple
more columns to pg_stat_bgwriter to help measure the effectiveness of the
bgwriter.

Greg Smith, building on his own work and ideas from several other people,
in particular a much older patch from Itagaki Takahiro.
2007-09-25 20:03:38 +00:00
Tom Lane
48f7e64395 Simplify and rename some GUC variables, per various recent discussions:
* stats_start_collector goes away; we always start the collector process,
unless prevented by a problem with setting up the stats UDP socket.

* stats_reset_on_server_start goes away; it seems useless in view of the
availability of pg_stat_reset().

* stats_block_level and stats_row_level are merged into a single variable
"track_counts", which controls all reports sent to the collector process.

* stats_command_string is renamed to track_activities.

* log_autovacuum is renamed to log_autovacuum_min_duration to better reflect
its meaning.

The log_autovacuum change is not a compatibility issue since it didn't exist
before 8.3 anyway.  The other changes need to be release-noted.
2007-09-24 03:12:23 +00:00
Andrew Dunstan
02138357ff Remove "convert 'blah' using conversion_name" facility, because if it
produces text it is an encoding hole and if not it's incompatible
with the spec, whatever the spec means (which we're not sure about anyway).
2007-09-24 01:29:30 +00:00
Tom Lane
7125687511 Fix cost estimates for EXISTS subqueries that are evaluated as initPlans
(because they are uncorrelated with the immediate parent query).  We were
charging the full run cost to the parent node, disregarding the fact that
only one row need be fetched for EXISTS.  While this would only be a
cosmetic issue in most cases, it might possibly affect planning outcomes
if the parent query were itself a subquery to some upper query.
Per recent discussion with Steve Crawford.
2007-09-22 21:36:40 +00:00
Tom Lane
571340a00e Parenthesize macro arguments safely. I see no bug among the current
uses of PG_DETOAST_DATUM_SLICE, but it's clearly trouble waiting to
happen.
2007-09-22 04:41:19 +00:00
Tom Lane
7583f9a7ca Fix regex, LIKE, and some other second-rank text-manipulation functions
to not cause needless copying of text datums that have 1-byte headers.
Greg Stark, in response to performance gripe from Guillaume Smet and
ITAGAKI Takahiro.
2007-09-21 22:52:52 +00:00
Tom Lane
cc59049daf Improve handling of prune/no-prune decisions by storing a page's oldest
unpruned XMAX in its header.  At the cost of 4 bytes per page, this keeps us
from performing heap_page_prune when there's no chance of pruning anything.
Seems to be necessary per Heikki's preliminary performance testing.
2007-09-21 21:25:42 +00:00
Tom Lane
017daed0dd If we're gonna provide an --enable-profiling configure option, surely
it ought to know that you need -DLINUX_PROFILE on Linux.
2007-09-21 02:33:46 +00:00
Tom Lane
282d2a03dd HOT updates. When we update a tuple without changing any of its indexed
columns, and the new version can be stored on the same heap page, we no longer
generate extra index entries for the new version.  Instead, index searches
follow the HOT-chain links to ensure they find the correct tuple version.

In addition, this patch introduces the ability to "prune" dead tuples on a
per-page basis, without having to do a complete VACUUM pass to recover space.
VACUUM is still needed to clean up dead index entries, however.

Pavan Deolasee, with help from a bunch of other people.
2007-09-20 17:56:33 +00:00
Andrew Dunstan
55613bf9cd Close previously open holes for invalidly encoded data to enter the
database via builtin functions, as recently discussed on -hackers.

chr() now returns a character in the database encoding. For UTF8 encoded databases
the argument is treated as a Unicode code point. For other multi-byte encodings
the argument must designate a strict ascii character, or an error is raised,
as is also the case if the argument is 0.

ascii() is adjusted so that it remains the inverse of chr().

The two argument form of convert() is gone, and the three argument form now
takes a bytea first argument and returns a bytea. To cover this loss three new
functions are introduced:
. convert_from(bytea, name) returns text - converts the first argument from the
  named encoding to the database encoding
. convert_to(text, name) returns bytea - converts the first argument from the
  database encoding to the named encoding
. length(bytea, name) returns int - gives the length of the first argument in
  characters in the named encoding
2007-09-18 17:41:17 +00:00
Tom Lane
6889303531 Redefine the lp_flags field of item pointers as having four states, rather
than two independent bits (one of which was never used in heap pages anyway,
or at least hadn't been in a very long time).  This gives us flexibility to
add the HOT notions of redirected and dead item pointers without requiring
anything so klugy as magic values of lp_off and lp_len.  The state values
are chosen so that for the states currently in use (pre-HOT) there is no
change in the physical representation.
2007-09-12 22:10:26 +00:00
Teodor Sigaev
476045a21b Remove QueryOperand->istrue flag, it was used only in cover ranking
(ts_rank_cd). Use palloc'ed array in ranking instead of flag.
2007-09-11 16:01:40 +00:00
Teodor Sigaev
13553cbbff Fix header's size of structs defines in ispell.
Backpatch is needed for contrib version.
2007-09-11 12:57:05 +00:00
Teodor Sigaev
57cafe7982 Refactor from Heikki Linnakangas <heikki@enterprisedb.com>:
* Defined new struct WordEntryPosVector that holds a uint16 length and a
variable size array of WordEntries. This replaces the previous
convention of a variable size uint16 array, with the first element
implying the length. WordEntryPosVector has the same layout in memory,
but is more readable in source code. The POSDATAPTR and POSDATALEN
macros are still used, though it would now be more readable to access
the fields in WordEntryPosVector directly.

* Removed needfree field from DocRepresentation. It was always set to false.

* Miscellaneous other commenting and refactoring
2007-09-11 08:46:29 +00:00
Tom Lane
ef4d38c86c Rename recently-added pg_stat_activity column from txn_start to xact_start,
for consistency with other column names such as in pg_stat_database.
2007-09-11 03:28:05 +00:00
Tom Lane
82a47982f3 Arrange for SET LOCAL's effects to persist until the end of the current top
transaction, unless rolled back or overridden by a SET clause for the same
variable attached to a surrounding function call.  Per discussion, these
seem the best semantics.  Note that this is an INCOMPATIBLE CHANGE: in 8.0
through 8.2, SET LOCAL's effects disappeared at subtransaction commit
(leading to behavior that made little sense at the SQL level).

I took advantage of the opportunity to rewrite and simplify the GUC variable
save/restore logic a little bit.  The old idea of a "tentative" value is gone;
it was a hangover from before we had a stack.  Also, we no longer need a stack
entry for every nesting level, but only for those in which a variable's value
actually changed.
2007-09-11 00:06:42 +00:00
Teodor Sigaev
f7379f5c88 Heikki Linnakangas <heikki@enterprisedb.com>:
Add tsearch subdirectory is added to Makefile to allow
compile  custom tsearch dictionary as an external module.
2007-09-10 20:37:36 +00:00
Teodor Sigaev
d982daae0b Change void* opaque argument to Datum type, add argument's
name to PushFunction type definition.

Per suggestion by Tome Lane <tgl@sss.pgh.pa.us>
2007-09-10 12:36:41 +00:00
Tom Lane
40fda15dce Code review for GUC revert-values-if-removed-from-postgresql.conf patch;
and in passing, fix some bogosities dating from the custom_variable_classes
patch.  Fix guc-file.l to correctly check changes in custom_variable_classes
that are attempted concurrently with additions/removals of custom variables,
and don't allow the new setting to be applied in advance of checking it.
Clean up messy and undocumented situation for string variables with NULL
boot_val.  Fix DefineCustomVariable functions to initialize boot_val
correctly.  Prevent find_option from inserting bogus placeholders for custom
variables that are simply inquired about rather than being set.
2007-09-10 00:57:22 +00:00
Tom Lane
6bd4f401b0 Replace the former method of determining snapshot xmax --- to wit, calling
ReadNewTransactionId from GetSnapshotData --- with a "latestCompletedXid"
variable that is updated during transaction commit or abort.  Since
latestCompletedXid is written only in places that had to lock ProcArrayLock
exclusively anyway, and is read only in places that had to lock ProcArrayLock
shared anyway, it adds no new locking requirements to the system despite being
cluster-wide.  Moreover, removing ReadNewTransactionId from snapshot
acquisition eliminates the need to take both XidGenLock and ProcArrayLock at
the same time.  Since XidGenLock is sometimes held across I/O this can be a
significant win.  Some preliminary benchmarking suggested that this patch has
no effect on average throughput but can significantly improve the worst-case
transaction times seen in pgbench.  Concept by Florian Pflug, implementation
by Tom Lane.
2007-09-08 20:31:15 +00:00
Teodor Sigaev
978de9d06d Improvements from Heikki Linnakangas <heikki@enterprisedb.com>
- change the alignment requirement of lexemes in TSVector slightly.
Lexeme strings were always padded to 2-byte aligned length to make sure
that if there's position array (uint16[]) it has the right alignment.
The patch changes that so that the padding is not done when there's no
positions. That makes the storage of tsvectors without positions
slightly more compact.

- added some #include "miscadmin.h" lines I missed in the earlier when I
added calls to check_stack_depth().

- Reimplement the send/recv functions, and added a comment
above them describing the on-wire format. The CRC is now recalculated in
tsquery as well per previous discussion.
2007-09-07 16:03:40 +00:00
Teodor Sigaev
8983852e34 Improving various checks by Heikki Linnakangas <heikki@enterprisedb.com>
- add code to check that the query tree is well-formed. It was indeed
  possible to send malformed queries in binary mode, which produced all
  kinds of strange results.

- make the left-field a uint32. There's no reason to
  arbitrarily limit it to 16-bits, and it won't increase the disk/memory
  footprint either now that QueryOperator and QueryOperand are separate
  structs.

- add check_stack_depth() call to all recursive functions I found.
  Some of them might have a natural limit so that you can't force
  arbitrarily deep recursions, but check_stack_depth() is cheap enough
  that seems best to just stick it into anything that might be a problem.
2007-09-07 15:35:11 +00:00
Teodor Sigaev
e5be89981f Refactoring by Heikki Linnakangas <heikki@enterprisedb.com> with
small editorization by me

- Brake the QueryItem struct into QueryOperator and QueryOperand.
  Type was really the only common field between them. QueryItem still
  exists, and is used in the TSQuery struct as before, but it's now a
  union of the two. Many other changes fell from that, like separation
  of pushval_asis function into pushValue, pushOperator and pushStop.

- Moved some structs that were for internal use only from header files
  to the right .c-files.

- Moved tsvector parser to a new tsvector_parser.c file. Parser code was
  about half of the size of tsvector.c, it's also used from tsquery.c, and
  it has some data structures of its own, so it seems better to separate
  it. Cleaned up the API so that TSVectorParserState is not accessed from
  outside tsvector_parser.c.

- Separated enumerations (#defines, really) used for QueryItem.type
  field and as return codes from gettoken_query. It was just accidental
  code sharing.

- Removed ParseQueryNode struct used internally by makepol and friends.
  push*-functions now construct QueryItems directly.

- Changed int4 variables to just ints for variables like "i" or "array
  size", where the storage-size was not significant.
2007-09-07 15:09:56 +00:00
Tom Lane
cd1aae5864 Allow CREATE INDEX CONCURRENTLY to disregard transactions in other
databases, per gripe from hubert depesz lubaczewski.  Patch from
Simon Riggs.
2007-09-07 00:58:57 +00:00
Tom Lane
f8942f4a15 Make eval_const_expressions() preserve typmod when simplifying something like
null::char(3) to a simple Const node.  (It already worked for non-null values,
but not when we skipped evaluation of a strict coercion function.)  This
prevents loss of typmod knowledge in situations such as exhibited in bug
#3598.  Unfortunately there seems no good way to fix that bug in 8.1 and 8.2,
because they simply don't carry a typmod for a plain Const node.

In passing I made all the other callers of makeNullConst supply "real" typmod
values too, though I think it probably doesn't matter anywhere else.
2007-09-06 17:31:58 +00:00
Tom Lane
295e63983d Implement lazy XID allocation: transactions that do not modify any database
rows will normally never obtain an XID at all.  We already did things this way
for subtransactions, but this patch extends the concept to top-level
transactions.  In applications where there are lots of short read-only
transactions, this should improve performance noticeably; not so much from
removal of the actual XID-assignments, as from reduction of overhead that's
driven by the rate of XID consumption.  We add a concept of a "virtual
transaction ID" so that active transactions can be uniquely identified even
if they don't have a regular XID.  This is a much lighter-weight concept:
uniqueness of VXIDs is only guaranteed over the short term, and no on-disk
record is made about them.

Florian Pflug, with some editorialization by Tom.
2007-09-05 18:10:48 +00:00
Andrew Dunstan
2e74c53ec1 Provide for binary input/output of enums, to fix complaint from Merlin Moncure.
This just provides text values, we're not exposing the underlying Oid representation.
Catalog version bumped.
2007-09-04 16:41:43 +00:00
Tom Lane
e7889b83b7 Support SET FROM CURRENT in CREATE/ALTER FUNCTION, ALTER DATABASE, ALTER ROLE.
(Actually, it works as a plain statement too, but I didn't document that
because it seems a bit useless.)  Unify VariableResetStmt with
VariableSetStmt, and clean up some ancient cruft in the representation of
same.
2007-09-03 18:46:30 +00:00
Tom Lane
7ab43b88d7 Improve stylistic consistency of descriptions of built-in objects by avoiding
initcap style --- the vast majority of the existing descriptions do not use
an initial cap.  I didn't change places where the first word was all-cap.

initdb not forced because this doesn't change any regression test results.
2007-09-03 02:30:45 +00:00
Tom Lane
a4df52f95f Fix breakage of GIN support for varchar[] and cidr[] that I introduced in the
operator-family rewrite.  I had mistakenly supposed that these could use the
pg_amproc entries for text[] and inet[] respectively.  However, binary
compatibility of the underlying types does not make two array types binary
compatible (since they must differ in the header field that gives the element
type OID), and so the index support code doesn't consider those entries
applicable.  Add back the missing pg_amproc entries, and add an opr_sanity
query to try to catch such mistakes in future.  Per report from Gregory
Maxwell.
2007-09-03 01:18:33 +00:00
Tom Lane
2abae34a2e Implement function-local GUC parameter settings, as per recent discussion.
There are still some loose ends: I didn't do anything about the SET FROM
CURRENT idea yet, and it's not real clear whether we are happy with the
interaction of SET LOCAL with function-local settings.  The documentation
is a bit spartan, too.
2007-09-03 00:39:26 +00:00
Tom Lane
0ee5a39862 Apply a band-aid fix for the problem that 8.2 and up completely misestimate
the number of rows likely to be produced by a query such as
	SELECT * FROM t1 LEFT JOIN t2 USING (key) WHERE t2.key IS NULL;
What this is doing is selecting for t1 rows with no match in t2, and thus
it may produce a significant number of rows even if the t2.key table column
contains no nulls at all.  8.2 thinks the table column's null fraction is
relevant and thus may estimate no rows out, which results in terrible plans
if there are more joins above this one.  A proper fix for this will involve
passing much more information about the context of a clause to the selectivity
estimator functions than we ever have.  There's no time left to write such a
patch for 8.3, and it wouldn't be back-patchable into 8.2 anyway.  Instead,
put in an ad-hoc test to defeat the normal table-stats-based estimation when
an IS NULL test is evaluated at an outer join, and just use a constant
estimate instead --- I went with 0.5 for lack of a better idea.  This won't
catch every case but it will catch the typical ways of writing such queries,
and it seems unlikely to make things worse for other queries.
2007-08-31 23:35:22 +00:00
Tom Lane
b4c806faa8 Rewrite make_outerjoininfo's construction of min_lefthand and min_righthand
sets for outer joins, in the light of bug #3588 and additional thought and
experimentation.  The original methodology was fatally flawed for nests of
more than two outer joins: it got the relationships between adjacent joins
right, but didn't always come to the right conclusions about whether a join
could be interchanged with one two or more levels below it.  This was largely
caused by a mistaken idea that we should use the min_lefthand + min_righthand
sets of a sub-join as the minimum left or right input set of an upper join
when we conclude that the sub-join can't commute with the upper one.  If
there's a still-lower join that the sub-join *can* commute with, this method
led us to think that that one could commute with the topmost join; which it
can't.  Another problem (not directly connected to bug #3588) was that
make_outerjoininfo's processing-order-dependent method for enforcing outer
join identity #3 didn't work right: if we decided that join A could safely
commute with lower join B, we dropped all information about sub-joins under B
that join A could perhaps not safely commute with, because we removed B's
entire min_righthand from A's.

To fix, make an explicit computation of all inner join combinations that occur
below an outer join, and add to that the full syntactic relsets of any lower
outer joins that we determine it can't commute with.  This method gives much
more direct enforcement of the outer join rearrangement identities, and it
turns out not to cost a lot of additional bookkeeping.

Thanks to Richard Harris for the bug report and test case.
2007-08-31 01:44:06 +00:00
Tom Lane
862861ee77 Fix a couple of misbehaviors rooted in the fact that the default creation
namespace isn't necessarily first in the search path (there could be implicit
schemas ahead of it).  Examples are

test=# set search_path TO s1;

test=# create view pg_timezone_names as select * from pg_timezone_names();
ERROR:  "pg_timezone_names" is already a view

test=# create table pg_class (f1 int primary key);
ERROR:  permission denied: "pg_class" is a system catalog

You'd expect these commands to create the requested objects in s1, since
names beginning with pg_ aren't supposed to be reserved anymore.  What is
happening is that we create the requested base table and then execute
additional commands (here, CREATE RULE or CREATE INDEX), and that code is
passed the same RangeVar that was in the original command.  Since that
RangeVar has schemaname = NULL, the secondary commands think they should do a
path search, and that means they find system catalogs that are implicitly in
front of s1 in the search path.

This is perilously close to being a security hole: if the secondary command
failed to apply a permission check then it'd be possible for unprivileged
users to make schema modifications to system catalogs.  But as far as I can
find, there is no code path in which a check doesn't occur.  Which makes it
just a weird corner-case bug for people who are silly enough to want to
name their tables the same as a system catalog.

The relevant code has changed quite a bit since 8.2, which means this patch
wouldn't work as-is in the back branches.  Since it's a corner case no one
has reported from the field, I'm not going to bother trying to back-patch.
2007-08-27 03:36:08 +00:00
Tom Lane
6c96188cb5 Remove the 'not in' operator (!!=). This was a hangover from Berkeley
days that was obsolete the moment we had IN (SELECT ...) capability.
It's arguably a security hole since it applied no permissions check to
the table it searched, and since it was never documented anywhere,
removing it seems more appropriate than fixing it.
2007-08-27 01:39:25 +00:00
Tom Lane
67bf7b919e Make ARRAY(SELECT ...) return an empty array, rather than a NULL, when the
sub-select returns zero rows.  Per complaint from Jens Schicke.  Since this
is more in the nature of a definition change than a bug, not back-patched.
2007-08-26 21:44:25 +00:00
Tom Lane
93eab9312f Rename built-in Snowball stemmer dictionaries to be english_stem,
russian_stem, etc.  Per discussion.
2007-08-25 01:06:25 +00:00
Tom Lane
7351b5fa17 Cleanup for some problems in tsearch patch:
- ispell initialization crashed on empty dictionary file
- ispell initialization crashed on affix file with prefixes but no suffixes
- stop words file was run through pg_verify_mbstr, with database
  encoding, but it's supposed to be UTF-8; similar bug for synonym files
- bunch of comments added, typos fixed, and other cleanup

Introduced consistent encoding checking/conversion of data read from tsearch
configuration files, by doing this in a single t_readline() subroutine
(replacing direct usages of fgets).  Cleaned up API for readstopwords too.

Heikki Linnakangas
2007-08-25 00:03:59 +00:00
Tom Lane
8a5592daf1 Remove option to change parser of an existing text search configuration.
This prevents needing to do complex and poorly-defined updates of the
mapping table if the new parser has different token types than the old.
Per discussion.
2007-08-22 05:13:50 +00:00
Tom Lane
d321421d0a Simplify the syntax of CREATE/ALTER TEXT SEARCH DICTIONARY by treating the
init options of the template as top-level options in the syntax.  This also
makes ALTER a bit easier to use, since options can be replaced individually.
I also made these statements verify that the tmplinit method will accept
the new settings before they get stored; in the original coding you didn't
find out about mistakes until the dictionary got invoked.

Under the hood, init methods now get options as a List of DefElem instead
of a raw text string --- that lets tsearch use existing options-pushing code
instead of duplicating functionality.
2007-08-22 01:39:46 +00:00
Tom Lane
140d4ebcb4 Tsearch2 functionality migrates to core. The bulk of this work is by
Oleg Bartunov and Teodor Sigaev, but I did a lot of editorializing,
so anything that's broken is probably my fault.

Documentation is nonexistent as yet, but let's land the patch so we can
get some portability testing done.
2007-08-21 01:11:32 +00:00
Andrew Dunstan
fd801f4faa Provide for logfiles in machine readable CSV format. In consequence, rename
redirect_stderr to logging_collector.
Original patch from Arul Shaji, subsequently modified by Greg Smith, and then
heavily modified by me.
2007-08-19 01:41:25 +00:00
Tom Lane
817946bb04 Arrange to cache a ResultRelInfo in the executor's EState for relations that
are not one of the query's defined result relations, but nonetheless have
triggers fired against them while the query is active.  This was formerly
impossible but can now occur because of my recent patch to fix the firing
order for RI triggers.  Caching a ResultRelInfo avoids duplicating work by
repeatedly opening and closing the same relation, and also allows EXPLAIN
ANALYZE to "see" and report on these extra triggers.  Use the same mechanism
to cache open relations when firing deferred triggers at transaction shutdown;
this replaces the former one-element-cache strategy used in that case, and
should improve performance a bit when there are deferred triggers on a number
of relations.
2007-08-15 21:39:50 +00:00
Tom Lane
9cb8409762 Repair problems occurring when multiple RI updates have to be done to the same
row within one query: we were firing check triggers before all the updates
were done, leading to bogus failures.  Fix by making the triggers queued by
an RI update go at the end of the outer query's trigger event list, thereby
effectively making the processing "breadth-first".  This was indeed how it
worked pre-8.0, so the bug does not occur in the 7.x branches.
Per report from Pavel Stehule.
2007-08-15 19:15:47 +00:00
Tom Lane
67f99d216a Fix oversight in async-commit patch: there were some places in heapam.c
that still thought they could set HEAP_XMAX_COMMITTED immediately after
seeing the other transaction commit.  Make them use the same logic as
tqual.c does to determine if the hint bit can be set yet.
2007-08-14 17:35:18 +00:00
Neil Conway
849ec99753 Adjust the output of MemoryContextStats() so that the stats for a
child memory contexts is indented two spaces to the right of its
parent context.  This should make it easier to deduce the memory
context hierarchy from the output of MemoryContextStats().
2007-08-07 06:25:14 +00:00
Tom Lane
c8b7e811f3 Apparently icc doesn't always define __ICC, and it's more correct to
check for __INTEL_COMPILER.  Per report from Dirk Tilger.
Not back-patched since I don't fully trust it yet ...
2007-08-05 15:11:40 +00:00
Tom Lane
8d30337566 Fix up bad layout of some comments (probably pg_indent's fault), and
improve grammar a tad.  Per Greg Stark.
2007-08-04 21:53:00 +00:00
Tom Lane
4fd8d6b3e7 Fix crash caused by log_timezone patch if we attempt to emit any elog messages
between the setting of log_line_prefix and the setting of log_timezone.  We
can't realistically set log_timezone any earlier than we do now, so the best
behavior seems to be to use GMT zone if any timestamps are to be logged during
early startup.  Create a dummy zone variable with a minimal definition of GMT
(in particular it will never know about leap seconds), so that we can set it
up without reference to any external files.
2007-08-04 19:29:25 +00:00
Tom Lane
bdd6b62245 Switch over to using the src/timezone functions for formatting timestamps
displayed in the postmaster log.  This avoids Windows-specific problems with
localized time zone names that are in the wrong encoding, and generally seems
like a good idea to forestall other potential platform-dependent issues.
To preserve the existing behavior that all backends will log in the same time
zone, create a new GUC variable log_timezone that can only be changed on a
system-wide basis, and reference log-related calculations to that zone instead
of the TimeZone variable.

This fixes the issue reported by Hiroshi Saito that timestamps printed by
xlog.c startup could be improperly localized on Windows.  We still need a
simpler patch for that problem in the back branches, however.
2007-08-04 01:26:54 +00:00
Andrew Dunstan
63872601e8 Move session_start out of MyProcPort stucture and make it a global called MyStartTime,
so that we will be able to create a cookie for all processes for CSVlogs.
It is set wherever MyProcPid is set. Take the opportunity to remove the now
unnecessary session-only restriction on the %s and %c escapes in log_line_prefix.
2007-08-02 23:39:45 +00:00