Doc: improve user docs and code comments about EXISTS(SELECT * ...).

Point out that Postgres automatically optimizes away the target list
of an EXISTS subquery, except in weird cases such as target lists
containing set-returning functions.  Thus, both common conventions
EXISTS(SELECT * FROM ...) and EXISTS(SELECT 1 FROM ...) are
overhead-free and there's little reason to prefer one over the other.
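
As a minimal sketch of the semantic point only, the two spellings can be checked for agreement on a toy schema. The tables and rows below are hypothetical, and SQLite (through Python's standard sqlite3 module) is used purely as a stand-in so the snippet runs without a server. It says nothing about the Postgres planner or about the set-returning-function exception, which SQLite cannot model.

```python
import sqlite3


def exists_forms_agree():
    """Run the same query with both EXISTS spellings and return both results."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical toy schema: customer 2 has no orders.
    conn.executescript("""
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
        INSERT INTO customers VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
        INSERT INTO orders VALUES (10, 1), (11, 1), (12, 3);
    """)
    # Convention 1: EXISTS(SELECT * FROM ...)
    star = conn.execute("""
        SELECT name FROM customers c
        WHERE EXISTS (SELECT * FROM orders o WHERE o.customer_id = c.id)
        ORDER BY name
    """).fetchall()
    # Convention 2: EXISTS(SELECT 1 FROM ...)
    one = conn.execute("""
        SELECT name FROM customers c
        WHERE EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.id)
        ORDER BY name
    """).fetchall()
    conn.close()
    return star, one


if __name__ == "__main__":
    star, one = exists_forms_agree()
    print(star == one, star)
```

Both queries select the customers that have at least one order, because EXISTS depends only on whether the subquery returns any row, not on what the row contains.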

In the code comments, mention that the SQL spec says that
EXISTS(SELECT * FROM ...) should be interpreted as EXISTS(SELECT
some-literal FROM ...), but we don't choose to do it exactly that way.

Author: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/9b301c70-3909-4f0f-98ca-9e3c4d142f3e@eisentraut.org
Tom Lane 2026-02-27 15:20:16 -05:00
parent 98616ac18b
commit 65a3ff8f1b
2 changed files with 15 additions and 3 deletions


@@ -70,8 +70,14 @@ EXISTS (<replaceable>subquery</replaceable>)
 and not on the contents of those rows, the output list of the
 subquery is normally unimportant. A common coding convention is
 to write all <literal>EXISTS</literal> tests in the form
-<literal>EXISTS(SELECT 1 WHERE ...)</literal>. There are exceptions to
-this rule however, such as subqueries that use <token>INTERSECT</token>.
+<literal>EXISTS(SELECT * FROM ... WHERE ...)</literal>, another common
+convention is to write <literal>EXISTS(SELECT 1 FROM ... WHERE
+...)</literal> or some other dummy constant. These conventions are
+actually equivalent in <productname>PostgreSQL</productname>, which
+will optimize away evaluation of the subquery's output list altogether
+when it cannot affect the number of rows returned. (An example
+that cannot be optimized away is an output list containing a
+set-returning function, since the function might return zero rows.)
 </para>
 <para>


@@ -1643,7 +1643,13 @@ convert_EXISTS_sublink_to_join(PlannerInfo *root, SubLink *sublink,
  * Note: by suppressing the targetlist we could cause an observable behavioral
  * change, namely that any errors that might occur in evaluating the tlist
  * won't occur, nor will other side-effects of volatile functions. This seems
- * unlikely to bother anyone in practice.
+ * unlikely to bother anyone in practice. Note that any column privileges are
+ * still checked even if the reference is removed here.
+ *
+ * The SQL standard specifies that a SELECT * immediately inside EXISTS
+ * expands to not all columns but an arbitrary literal. That is kind of the
+ * same idea, but our optimization goes further in that it throws away the
+ * entire targetlist, and not only if it was written as *.
  *
  * Returns true if was able to discard the targetlist, else false.
  */