plpgsql often just remembers SPI-result tuple tables in local variables,
and has no mechanism for freeing them if an ereport(ERROR) causes an escape
out of the execution function whose local variable it is. In the original
coding, that wasn't a problem because the tuple table would be cleaned up
when the function's SPI context went away during transaction abort.
However, once plpgsql grew the ability to trap exceptions, repeated
trapping of errors within a function could result in significant
intra-function-call memory leakage, as illustrated in bug #8279 from
Chad Wagner.
We could fix this locally in plpgsql with a bunch of PG_TRY/PG_CATCH
coding, but that would be tedious, probably slow, and prone to bugs of
omission; moreover it would do nothing for similar risks elsewhere.
What seems like a better plan is to make SPI itself responsible for
freeing tuple tables at subtransaction abort. This patch attacks the
problem that way, keeping a list of live tuple tables within each SPI
function context. Currently, such freeing is automatic for tuple tables
made within the failed subtransaction. We might later add a SPI call to
mark a tuple table as not to be freed this way, allowing callers to opt
out; but until someone exhibits a clear use-case for such behavior, it
doesn't seem worth bothering.
A very useful side-effect of this change is that SPI_freetuptable() can
now defend itself against bad calls, such as duplicate free requests;
this should make things more robust in many places. (In particular,
this reduces the risks involved if a third-party extension contains
now-redundant SPI_freetuptable() calls in error cleanup code.)
Even though the leakage problem is of long standing, it seems imprudent
to back-patch this into stable branches, since it does represent an API
semantics change for SPI users. We'll patch this in 9.3, but live with
the leakage in older branches.
If an error is thrown out of the datatype I/O functions called by this
function, we need to do subtransaction cleanup, which the previous coding
entirely failed to do. Fortunately, both existing callers of this function
already have proper cleanup logic, so re-throwing the exception is enough.
Also, postpone creation of the resultset tupdesc until after the I/O
conversions are complete, so that we won't leak memory in TopMemoryContext
when such an error happens.
The old implementation converted PostgreSQL numeric to Python float,
which was always considered a shortcoming. Now numeric is converted to
the Python Decimal object. Either the external cdecimal module or the
standard library decimal module are supported.
From: Szymon Guz <mabewlun@gmail.com>
From: Ronan Dunklau <rdunklau@gmail.com>
Reviewed-by: Steve Singer <steve@ssinger.info>
With -Wtype-limits, gcc correctly points out that size_t can never be < 0.
Backpatch to 9.3 and 9.2. It's been like this forever, but in <= 9.1 you got
a lot other warnings with -Wtype-limits anyway (at least with my version of
gcc).
Andres Freund
Memory was allocated based on the sizeof a type that was not the type of
the pointer that the result was being assigned to. The types happen to
be of the same size, but it's still wrong.
This confused Cygwin's make because of the colon in the path. The
DLL isn't likely to change under us so preserving the dependency
doesn't gain us much, and it's useful to be able to do a native
Windows build with the Cygwin mingw toolset.
Noah Misch.
If a PL/Python function raises an SPIError (or one if its subclasses)
directly with python's raise statement, treat it the same as an SPIError
generated internally. In particular, if the user sets the sqlstate
attribute, preserve that.
Oskari Saarenmaa and Jan Urbański, reviewed by Karl O. Pinc.
plpython tried to use a single cache entry for a trigger function, but it
needs a separate cache entry for each table the trigger is applied to,
because there is table-dependent data in there. This was done correctly
before 9.1, but commit 46211da1b8 broke it
by simplifying the lookup key from "function OID and triggered table OID"
to "function OID and is-trigger boolean". Go back to using both OIDs
as the lookup key. Per bug report from Sandro Santilli.
Andres Freund
The PL/Python build on OS X was previously hardcoded to use the system
installation of Python, ignoring whatever was specified to configure.
Except that it would use the header files from configure, which could
lead to mismatches. It was not possible to build against a custom
Python installation.
Now, we check in configure how the specified Python installation was
built and use that, supporting framework and non-framework builds.
This reverts commit be0dfbad36.
The previous information that Py_RETURN_TRUE and Py_RETURN_FALSE are
supported in Python 2.3 is wrong. They require Python 2.4. Update the
comment about that.
This was used in a time when a shared libperl or libpython was difficult
to come by. That is obsolete, and the idea behind the flag was never
fully portable anyway and will likely fail on more modern CPU
architectures.
Currently, we are making mangled copies of plpython/{expected,sql} to
plpython/python3/{expected,sql}, and run the tests in
plpython/python3. This has the disadvantage that the regression.diffs
file, if any, ends up in plpython/python3, which is not the normal
location. If we instead make the mangled copies in
plpython/{expected,sql}/python3/, we can run the tests from the normal
directory, regression.diffs ends up the normal place, and the
pg_regress invocation also becomes a lot simpler. It's also more
obvious at run time what's going on, because the tests end up being
named "python3/something" in the test output.
Commit 2cfb1c6f77 fixed some issues caused
by Python 3.3 choosing to iterate through dict entries in a different order
than before. But here's another one: the test cases adjusted here made two
bad entries in a dict and expected the one complained of would always be
the same.
Possibly this should be back-patched further than 9.2, but there seems
little point unless the earlier fix is too.
This reduces unnecessary exposure of other headers through htup.h, which
is very widely included by many files.
I have chosen to move the function prototypes to the new file as well,
because that means htup.h no longer needs to include tupdesc.h. In
itself this doesn't have much effect in indirect inclusion of tupdesc.h
throughout the tree, because it's also required by execnodes.h; but it's
something to explore in the future, and it seemed best to do the htup.h
change now while I'm busy with it.
We used to convert the unicode object directly to a string in the server
encoding by calling Python's PyUnicode_AsEncodedString function. In other
words, we used Python's routines to do the encoding. However, that has a
few problems. First of all, it required keeping a mapping table of Python
encoding names and PostgreSQL encodings. But the real killer was that Python
doesn't support EUC_TW and MULE_INTERNAL encodings at all.
Instead, convert the Python unicode object to UTF-8, and use PostgreSQL's
encoding conversion functions to convert from UTF-8 to server encoding. We
were already doing the same in the other direction in PLyUnicode_FromString,
so this is more consistent, too.
Note: This makes SQL_ASCII to behave more leniently. We used to map
SQL_ASCII to Python's 'ascii', which on Python means strict 7-bit ASCII
only, so you got an error if the python string contained anything but pure
ASCII. You no longer get an error; you get the UTF-8 representation of the
string instead.
Backpatch to 9.0, where these conversions were introduced.
Jan Urbański
That caused the plpython_unicode regression test to fail on SQL_ASCII
encoding, as evidenced by the buildfarm. The reason is that with the patch,
you don't get the detail in the error message that you got before. That
detail is actually very informative, so rather than just adjust the expected
output, let's revert that part of the patch for now to make the buildfarm
green again, and figure out some other way to avoid the recursion of
PLy_elog() that doesn't lose the detail.
Windows encodings, "win1252" and so forth, are named differently in Python,
like "cp1252". Also, if the PyUnicode_AsEncodedString() function call fails
for some reason, use a plain ereport(), not a PLy_elog(), to report that
error. That avoids recursion and crash, if PLy_elog() tries to call
PLyUnicode_Bytes() again.
This fixes bug reported by Asif Naeem. Backpatch down to 9.0, before that
plpython didn't even try these conversions.
Jan Urbański, with minor comment improvements by me.
The string representation of ImportError changed. Remove printing
that; it's not necessary for the test.
The order in which members of a dict are printed changed. But this
was always implementation-dependent, so we have just been lucky for a
long time. Do the printing the hard way to ensure sorted order.
The old way of implementing slicing support by implementing
PySequenceMethods.sq_slice no longer works in Python 3. You now have
to implement PyMappingMethods.mp_subscript. Do this by simply
proxying the call to the wrapped list of result dictionaries.
Consolidate some of the subscripting regression tests.
Jan Urbański
It was already on its last legs, and it turns out that it was
accidentally broken in commit 89e850e6fd
and no one cared. So remove the rest the support for it and update
the documentation to indicate that Python 2.3 is now required.
Add test cases for inline handler of plython2u (when using that
language name), and for result object element assignment. There is
now at least one test case for every top-level functionality, except
plpy.Fatal (annoying to use in regression tests) and result object
slice retrieval and slice assignment (which are somewhat broken).
Allocate PLyResultObject.tupdesc in TopMemoryContext, because its
lifetime is the lifetime of the Python object and it shouldn't be
freed by some other memory context, such as one controlled by SPI. We
trust that the Python object will clean up its own memory.
Before, this would crash the included regression test case by trying
to use memory that was already freed.
reported by Asif Naeem, analysis by Tom Lane
Before 9.1, PL/Python functions returning composite types could return
a string and it would be parsed using record_in. The 9.1 changes made
PL/Python only expect dictionaries, tuples, or objects supporting
getattr as output of composite functions, resulting in a regression
and a confusing error message, as the strings were interpreted as
sequences and the code for transforming lists to database tuples was
used. Fix this by treating strings separately as before, before
checking for the other types.
The reason why it's important to support string to database tuple
conversion is that trigger functions on tables with composite columns
get the composite row passed in as a string (from record_out).
Without supporting converting this back using record_in, this makes it
impossible to implement pass-through behavior for these columns, as
PL/Python no longer accepts strings for composite values.
A better solution would be to fix the code that transforms composite
inputs into Python objects to produce dictionaries that would then be
correctly interpreted by the Python->PostgreSQL counterpart code. But
that would be too invasive to backpatch to 9.1, and it is too late in
the 9.2 cycle to attempt it. It should be revisited in the future,
though.
Reported as bug #6559 by Kirill Simonov.
Jan Urbański
The result object methods colnames() etc. would crash when called
after a command that did not produce a result set. Now they throw an
exception.
discovery and initial patch by Jean-Baptiste Quenot
Dave Malcolm of Red Hat is working on a static code analysis tool for
Python-related C code. It reported a number of problems in plpython,
most of which were failures to check for NULL results from object-creation
functions, so would only be an issue in very-low-memory situations.
Patch in HEAD and 9.1. We could go further back but it's not clear that
these issues are important enough to justify the work.
Jan Urbański
This replaces the former global variable PLy_curr_procedure, and provides
a place to stash per-call-level information. In particular we create a
per-call-level scratch memory context.
For the moment, the scratch context is just used to avoid leaking memory
from datatype output function calls in PLyDict_FromTuple. There probably
will be more use-cases in future.
Although this is a fix for a pre-existing memory leakage bug, it seems
sufficiently invasive to not want to back-patch; it feels better as part
of the major rearrangement of plpython code that we've already done as
part of 9.2.
Jan Urbański
Don't quote the output of format_procedure(); it's already quoted quite
enough. Remove the fn_name field, which was now just dead weight. Fix
remaining expected-output files.
Add result object functions .colnames, .coltypes, .coltypmods to
obtain information about the result column names and types, which was
previously not possible in the PL/Python SPI interface.
reviewed by Abhijit Menon-Sen
This moves the code around from one huge file into hopefully logical
and more manageable modules. For the most part, the code itself was
not touched, except: PLy_function_handler and PLy_trigger_handler were
renamed to PLy_exec_function and PLy_exec_trigger, because they were
not actually handlers in the PL handler sense, and it makes the naming
more similar to the way PL/pgSQL is organized. The initialization of
the procedure caches was separated into a new function
init_procedure_caches to keep the hash tables private to
plpy_procedures.c.
Jan Urbański and Peter Eisentraut
Add a function plpy.cursor that is similar to plpy.execute but uses an
SPI cursor to avoid fetching the entire result set into memory.
Jan Urbański, reviewed by Steve Singer
The old expression sed 's,$(srcdir),python3,' would normally resolve
as sed 's,.,python3,', which is not really what we wanted. While it
doesn't actually break anything right now, it's still wrong, so put in
a bit more work to make it more robust.
exception handler. This was a regression in 9.1, when the capability
to catch specific SPI errors was added, so backpatch to 9.1.
Mika Eloranta, with some editing by Jan Urbański.
Rewrite plancache.c so that a "cached plan" (which is rather a misnomer
at this point) can support generation of custom, parameter-value-dependent
plans, and can make an intelligent choice between using custom plans and
the traditional generic-plan approach. The specific choice algorithm
implemented here can probably be improved in future, but this commit is
all about getting the mechanism in place, not the policy.
In addition, restructure the API to greatly reduce the amount of extraneous
data copying needed. The main compromise needed to make that possible was
to split the initial creation of a CachedPlanSource into two steps. It's
worth noting in particular that SPI_saveplan is now deprecated in favor of
SPI_keepplan, which accomplishes the same end result with zero data
copying, and no need to then spend even more cycles throwing away the
original SPIPlan. The risk of long-term memory leaks while manipulating
SPIPlans has also been greatly reduced. Most of this improvement is based
on use of the recently-added MemoryContextSetParent primitive.
Module initialization functions in Python 3 must have external
linkage, because PyMODINIT_FUNC does dllexport on Windows-like
platforms. Without this change, the build with Python 3 fails on
Windows.
Dropped columns within a composite type were not handled correctly.
Also, we did not check for whether a composite result type had changed
since we cached the information about it.
Jan Urbański, per a bug report from Jean-Baptiste Quenot
To avoid having the python headers hijack various definitions,
we now include them after all the system headers we want, having
first undefined some of the things they want to define. After that's
done we restore the things they scribbled on that matter, namely our
snprintf and vsnprintf macros, if we're using them.
There may be some other places where we should use errdetail_internal,
but they'll have to be evaluated case-by-case. This commit just hits
a bunch of places where invoking gettext is obviously a waste of cycles.
Certain subdirectories do not get built if corresponding options are not
selected at configure time. However, "make distprep" should visit such
directories anyway, so that constructing derived files to be included in
the tarball happens without requiring all configure options to be given
in the tarball build script. Likewise, it's better if cleanup actions
unconditionally visit all directories (for example, this ensures proper
cleanup if someone has done a manual make in such a subdirectory).
To handle this, set up a convention that subdirectories that are
conditionally included in SUBDIRS should be added to ALWAYS_SUBDIRS
instead when they are excluded.
Back-patch to 9.1, so that plpython's spiexceptions.h will get provided
in 9.1 tarballs. There don't appear to be any instances where distprep
actions got missed in previous releases, and anyway this fix requires
gmake 3.80 so we don't want to apply it before 9.1.
The --flag argument can be used to tell xgettext the arguments of
which functions should be flagged with c-format in the PO files,
instead of guessing based on the presence of format specifiers, which
fails if no format specifiers are present but the translation
accidentally introduces one.
Appropriate flag settings have been added for each message catalog.
based on a patch by Christoph Berg for bug #6066
Put gettext trigger words that are common to the backend and backend
modules into a makefile variable to include everywhere, to avoid
error-prone repetitions.
It currently doesn't make a difference, but it's inconsistent with
most other usage, and it might interfere with a future patch, so I'll
change it all in a separate commit.
Also, replace tabs with spaces for alignment.
install-sh can install multiple files at once, so for loops are not
necessary. This was already changed for the rest of the code some
time ago, but pgxs.mk was apparently forgotten, and the obsolete
coding style has now been copied to the PLs as well.
This also fixes the problem that the for loops in question did not
catch errors.
The style is set to "printf" for backwards compatibility everywhere except
on Windows, where it is set to "gnu_printf", which eliminates hundreds of
false error messages from modern versions of gcc arising from %m and %ll{d,u}
formats.
It assumed that the lineno from the traceback always refers to the
PL/Python function. If you created a PL/Python function that imports
some code, runs it, and that code raises an exception, PLy_traceback
would get utterly confused.
Now we look at the file name reported with the traceback and only
print the source line if it came from the PL/Python function.
Jan Urbański
This improves reporting, as the error string now includes the actual
Python exception. As a side effect, this no longer sets the errcode to
ERRCODE_DATA_EXCEPTION, which might be considered a feature, as it's
not documented and not clear why iterator errors should be treated
differently.
Jan Urbański
Seen with an older gcc version. I'm not sure these represent any real
risk factor, but still a bit scary. Anyway we have lots of other
volatile-marked variables in this code, so a couple more won't hurt.
The original scheme for this was to symlink plpython.$DLSUFFIX to
plpython2.$DLSUFFIX, but that doesn't work on Windows, and only
accidentally failed to fail because of the way that CREATE LANGUAGE created
or didn't create new C functions. My changes of yesterday exposed the
weakness of that approach. To fix, get rid of the symlink and make
pg_pltemplate show what's really going on.
This mostly just involves creating control, install, and
update-from-unpackaged scripts for them. However, I had to adjust plperl
and plpython to not share the same support functions between variants,
because we can't put the same function into multiple extensions.
catversion bump forced due to new contents of pg_pltemplate, and because
initdb now installs plpgsql as an extension not a bare language.
Add support for regression testing these as extensions not bare
languages.
Fix a couple of other issues that popped up while testing this: my initial
hack at pg_dump binary-upgrade support didn't work right, and we don't want
an extra schema permissions test after all.
Documentation changes still to come, but I'm committing now to see
whether the MSVC build scripts need work (likely they do).
This provides a separate exception class for each error code that the
backend defines, as well as the ability to get the SQLSTATE from the
exception object.
Jan Urbański, reviewed by Steve Singer
Adds a context manager, obtainable by plpy.subtransaction(), to run a
group of statements in a subtransaction.
Jan Urbański, reviewed by Steve Singer, additional scribbling by me
We don't have complete expected coverage for Python 2.2 anyway, so it
doesn't seem worth keeping this one around that no one appears to be
updating anyway. Visual inspection of the differences ought to be
good enough for those few who care about this obsolete Python version.