1998-02-12 22:46:56 -05:00
|
|
|
/*-------------------------------------------------------------------------
|
|
|
|
|
*
|
1999-02-13 18:22:53 -05:00
|
|
|
* subselect.h
|
Avoid sharing PARAM_EXEC slots between different levels of NestLoop.
Up to now, createplan.c attempted to share PARAM_EXEC slots for
NestLoopParams across different plan levels, if the same underlying Var
was being fed down to different righthand-side subplan trees by different
NestLoops. This was, I think, more of an artifact of using subselect.c's
PlannerParamItem infrastructure than an explicit design goal, but anyway
that was the end result.
This works well enough as long as the plan tree is executing synchronously,
but the feature whereby Gather can execute the parallelized subplan locally
breaks it. An upper NestLoop node might execute for a row retrieved from
a parallel worker, and assign a value for a PARAM_EXEC slot from that row,
while the leader's copy of the parallelized subplan is suspended with a
different active value of the row the Var comes from. When control
eventually returns to the leader's subplan, it gets the wrong answers if
the same PARAM_EXEC slot is being used within the subplan, as reported
in bug #15577 from Bartosz Polnik.
This is pretty reminiscent of the problem fixed in commit 46c508fbc, and
the proper fix seems to be the same: don't try to share PARAM_EXEC slots
across different levels of controlling NestLoop nodes.
This requires decoupling NestLoopParam handling from PlannerParamItem
handling, although the logic remains somewhat similar. To avoid bizarre
division of labor between subselect.c and createplan.c, I decided to move
all the param-slot-assignment logic for both cases out of those files
and put it into a new file paramassign.c. Hopefully it's a bit better
documented now, too.
A regression test case for this might be nice, but we don't know a
test case that triggers the problem with a suitably small amount
of data.
Back-patch to 9.6 where we added Gather nodes. It's conceivable that
related problems exist in older branches; but without some evidence
for that, I'll leave the older branches alone.
Discussion: https://postgr.es/m/15577-ca61ab18904af852@postgresql.org
2019-01-11 15:53:34 -05:00
|
|
|
* Planning routines for subselects.
|
1998-02-12 22:46:56 -05:00
|
|
|
*
|
2025-01-01 11:21:55 -05:00
|
|
|
* Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
|
2003-01-20 13:55:07 -05:00
|
|
|
* Portions Copyright (c) 1994, Regents of the University of California
|
|
|
|
|
*
|
2010-09-20 16:08:53 -04:00
|
|
|
* src/include/optimizer/subselect.h
|
2003-01-20 13:55:07 -05:00
|
|
|
*
|
1998-02-12 22:46:56 -05:00
|
|
|
*-------------------------------------------------------------------------
|
|
|
|
|
*/
|
|
|
|
|
#ifndef SUBSELECT_H
|
|
|
|
|
#define SUBSELECT_H
|
|
|
|
|
|
2019-01-29 16:49:25 -05:00
|
|
|
#include "nodes/pathnodes.h"
|
1999-07-13 21:20:30 -04:00
|
|
|
#include "nodes/plannodes.h"
|
2005-06-05 18:32:58 -04:00
|
|
|
|
2008-10-04 17:56:55 -04:00
|
|
|
extern void SS_process_ctes(PlannerInfo *root);
|
2025-04-04 09:01:50 -04:00
|
|
|
extern ScalarArrayOpExpr *convert_VALUES_to_ANY(PlannerInfo *root,
|
|
|
|
|
Node *testexpr,
|
|
|
|
|
Query *values);
|
2009-02-24 22:30:38 -05:00
|
|
|
extern JoinExpr *convert_ANY_sublink_to_join(PlannerInfo *root,
|
|
|
|
|
SubLink *sublink,
|
|
|
|
|
Relids available_rels);
|
|
|
|
|
extern JoinExpr *convert_EXISTS_sublink_to_join(PlannerInfo *root,
|
|
|
|
|
SubLink *sublink,
|
|
|
|
|
bool under_not,
|
|
|
|
|
Relids available_rels);
|
2007-02-19 02:03:34 -05:00
|
|
|
extern Node *SS_replace_correlation_vars(PlannerInfo *root, Node *expr);
|
|
|
|
|
extern Node *SS_process_sublinks(PlannerInfo *root, Node *expr, bool isQual);
|
2015-08-11 23:48:37 -04:00
|
|
|
extern void SS_identify_outer_params(PlannerInfo *root);
|
Make the upper part of the planner work by generating and comparing Paths.
I've been saying we needed to do this for more than five years, and here it
finally is. This patch removes the ever-growing tangle of spaghetti logic
that grouping_planner() used to use to try to identify the best plan for
post-scan/join query steps. Now, there is (nearly) independent
consideration of each execution step, and entirely separate construction of
Paths to represent each of the possible ways to do that step. We choose
the best Path or set of Paths using the same add_path() logic that's been
used inside query_planner() for years.
In addition, this patch removes the old restriction that subquery_planner()
could return only a single Plan. It now returns a RelOptInfo containing a
set of Paths, just as query_planner() does, and the parent query level can
use each of those Paths as the basis of a SubqueryScanPath at its level.
This allows finding some optimizations that we missed before, wherein a
subquery was capable of returning presorted data and thereby avoiding a
sort in the parent level, making the overall cost cheaper even though
delivering sorted output was not the cheapest plan for the subquery in
isolation. (A couple of regression test outputs change in consequence of
that. However, there is very little change in visible planner behavior
overall, because the point of this patch is not to get immediate planning
benefits but to create the infrastructure for future improvements.)
There is a great deal left to do here. This patch unblocks a lot of
planner work that was basically impractical in the old code structure,
such as allowing FDWs to implement remote aggregation, or rewriting
plan_set_operations() to allow consideration of multiple implementation
orders for set operations. (The latter will likely require a full
rewrite of plan_set_operations(); what I've done here is only to fix it
to return Paths not Plans.) I have also left unfinished some localized
refactoring in createplan.c and planner.c, because it was not necessary
to get this patch to a working state.
Thanks to Robert Haas, David Rowley, and Amit Kapila for review.
2016-03-07 15:58:22 -05:00
|
|
|
extern void SS_charge_for_initplans(PlannerInfo *root, RelOptInfo *final_rel);
|
Allow plan nodes with initPlans to be considered parallel-safe.
If the plan itself is parallel-safe, and the initPlans are too,
there's no reason anymore to prevent the plan from being marked
parallel-safe. That restriction (dating to commit ab77a5a45) was
really a special case of the fact that we couldn't transmit subplans
to parallel workers at all. We fixed that in commit 5e6d8d2bb and
follow-ons, but this case never got addressed.
We still forbid attaching initPlans to a Gather node that's
inserted pursuant to debug_parallel_query = regress. That's because,
when we hide the Gather from EXPLAIN output, we'd hide the initPlans
too, causing cosmetic regression diffs. It seems inadvisable to
kluge EXPLAIN to the extent required to make the output look the
same, so just don't do it in that case.
Along the way, this also takes care of some sloppiness about updating
path costs to match when we move initplans from one place to another
during createplan.c and setrefs.c. Since all the planning decisions
are already made by that point, this is just cosmetic; but it seems
good to keep EXPLAIN output consistent with where the initplans are.
The diff in query_planner() might be worth remarking on. I found that
one because after fixing things to allow parallel-safe initplans, one
partition_prune test case changed plans (as shown in the patch) ---
but only when debug_parallel_query was active. The reason proved to
be that we only bothered to mark Result nodes as potentially
parallel-safe when debug_parallel_query is on. This neglects the fact
that parallel-safety may be of interest for a sub-query even though
the Result itself doesn't parallelize.
Discussion: https://postgr.es/m/1129530.1681317832@sss.pgh.pa.us
2023-07-13 17:30:14 -04:00
|
|
|
extern void SS_compute_initplan_cost(List *init_plans,
|
|
|
|
|
Cost *initplan_cost_p,
|
|
|
|
|
bool *unsafe_initplans_p);
|
2015-08-11 23:48:37 -04:00
|
|
|
extern void SS_attach_initplans(PlannerInfo *root, Plan *plan);
|
|
|
|
|
extern void SS_finalize_plan(PlannerInfo *root, Plan *plan);
|
Make the upper part of the planner work by generating and comparing Paths.
I've been saying we needed to do this for more than five years, and here it
finally is. This patch removes the ever-growing tangle of spaghetti logic
that grouping_planner() used to use to try to identify the best plan for
post-scan/join query steps. Now, there is (nearly) independent
consideration of each execution step, and entirely separate construction of
Paths to represent each of the possible ways to do that step. We choose
the best Path or set of Paths using the same add_path() logic that's been
used inside query_planner() for years.
In addition, this patch removes the old restriction that subquery_planner()
could return only a single Plan. It now returns a RelOptInfo containing a
set of Paths, just as query_planner() does, and the parent query level can
use each of those Paths as the basis of a SubqueryScanPath at its level.
This allows finding some optimizations that we missed before, wherein a
subquery was capable of returning presorted data and thereby avoiding a
sort in the parent level, making the overall cost cheaper even though
delivering sorted output was not the cheapest plan for the subquery in
isolation. (A couple of regression test outputs change in consequence of
that. However, there is very little change in visible planner behavior
overall, because the point of this patch is not to get immediate planning
benefits but to create the infrastructure for future improvements.)
There is a great deal left to do here. This patch unblocks a lot of
planner work that was basically impractical in the old code structure,
such as allowing FDWs to implement remote aggregation, or rewriting
plan_set_operations() to allow consideration of multiple implementation
orders for set operations. (The latter will likely require a full
rewrite of plan_set_operations(); what I've done here is only to fix it
to return Paths not Plans.) I have also left unfinished some localized
refactoring in createplan.c and planner.c, because it was not necessary
to get this patch to a working state.
Thanks to Robert Haas, David Rowley, and Amit Kapila for review.
2016-03-07 15:58:22 -05:00
|
|
|
extern Param *SS_make_initplan_output_param(PlannerInfo *root,
|
|
|
|
|
Oid resulttype, int32 resulttypmod,
|
|
|
|
|
Oid resultcollation);
|
|
|
|
|
extern void SS_make_initplan_from_plan(PlannerInfo *root,
|
2015-08-11 23:48:37 -04:00
|
|
|
PlannerInfo *subroot, Plan *plan,
|
Make the upper part of the planner work by generating and comparing Paths.
I've been saying we needed to do this for more than five years, and here it
finally is. This patch removes the ever-growing tangle of spaghetti logic
that grouping_planner() used to use to try to identify the best plan for
post-scan/join query steps. Now, there is (nearly) independent
consideration of each execution step, and entirely separate construction of
Paths to represent each of the possible ways to do that step. We choose
the best Path or set of Paths using the same add_path() logic that's been
used inside query_planner() for years.
In addition, this patch removes the old restriction that subquery_planner()
could return only a single Plan. It now returns a RelOptInfo containing a
set of Paths, just as query_planner() does, and the parent query level can
use each of those Paths as the basis of a SubqueryScanPath at its level.
This allows finding some optimizations that we missed before, wherein a
subquery was capable of returning presorted data and thereby avoiding a
sort in the parent level, making the overall cost cheaper even though
delivering sorted output was not the cheapest plan for the subquery in
isolation. (A couple of regression test outputs change in consequence of
that. However, there is very little change in visible planner behavior
overall, because the point of this patch is not to get immediate planning
benefits but to create the infrastructure for future improvements.)
There is a great deal left to do here. This patch unblocks a lot of
planner work that was basically impractical in the old code structure,
such as allowing FDWs to implement remote aggregation, or rewriting
plan_set_operations() to allow consideration of multiple implementation
orders for set operations. (The latter will likely require a full
rewrite of plan_set_operations(); what I've done here is only to fix it
to return Paths not Plans.) I have also left unfinished some localized
refactoring in createplan.c and planner.c, because it was not necessary
to get this patch to a working state.
Thanks to Robert Haas, David Rowley, and Amit Kapila for review.
2016-03-07 15:58:22 -05:00
|
|
|
Param *prm);
|
2001-10-28 01:26:15 -05:00
|
|
|
|
1998-02-12 22:46:56 -05:00
|
|
|
#endif /* SUBSELECT_H */
|