postgresql

mirror of https://github.com/postgres/postgres.git synced 2026-04-15 22:10:45 -04:00

Author	SHA1	Message	Date
Tom Lane	ec94bc1473	Undo mistaken tightening in join_is_legal(). One of the changes I made in commit `8703059c6b` turns out not to have been such a good idea: we still need the exception in join_is_legal() that allows a join if both inputs already overlap the RHS of the special join we're checking. Otherwise we can miss valid plans, and might indeed fail to find a plan at all, as in recent report from Andreas Seltenreich. That code was added way back in commit `c17117649b`, but I failed to include a regression test case then; my bad. Put it back with a better explanation, and a test this time. The logic does end up a bit different than before though: I now believe it's appropriate to make this check first, thereby allowing such a case whether or not we'd consider the previous SJ(s) to commute with this one. (Presumably, we already decided they did; but it was confusing to have this consideration in the middle of the code that was handling the other case.) Back-patch to all active branches, like the previous patch.	2015-08-12 21:19:04 -04:00
Tom Lane	df3b0f47b9	Further fixes for degenerate outer join clauses. Further testing revealed that commit `f69b4b9495` was still a few bricks shy of a load: minor tweaking of the previous test cases resulted in the same wrong-outer-join-order problem coming back. After study I concluded that my previous changes in make_outerjoininfo() were just accidentally masking the problem, and should be reverted in favor of forcing syntactic join order whenever an upper outer join's predicate doesn't mention a lower outer join's LHS. This still allows the chained-outer-joins style that is the normally optimizable case. I also tightened things up some more in join_is_legal(). It seems to me on review that what's really happening in the exception case where we ignore a mismatched special join is that we're allowing the proposed join to associate into the RHS of the outer join we're comparing it to. As such, we should always insist that the proposed join be a left join, which eliminates a bunch of rather dubious argumentation. The case where we weren't enforcing that was the one that was already known buggy anyway (it had a violatable Assert before the aforesaid commit) so it hardly deserves a lot of deference. Back-patch to all active branches, like the previous patch. The added regression test case failed in all branches back to 9.1, and I think it's only an unrelated change in costing calculations that kept 9.0 from choosing a broken plan.	2015-08-06 15:35:47 -04:00
Tom Lane	06663971bb	Make real sure we don't reassociate joins into or out of SEMI/ANTI joins. Per the discussion in optimizer/README, it's unsafe to reassociate anything into or out of the RHS of a SEMI or ANTI join. An example from Piotr Stefaniak showed that join_is_legal() wasn't sufficiently enforcing this rule, so lock it down a little harder. I couldn't find a reasonably simple example of the optimizer trying to do this, so no new regression test. (Piotr's example involved the random search in GEQO accidentally trying an invalid case and triggering a sanity check way downstream in clause selectivity estimation, which did not seem like a sequence of events that would be useful to memorialize in a regression test as-is.) Back-patch to all active branches.	2015-08-05 14:39:07 -04:00
Tom Lane	7968238eb1	Fix some planner issues with degenerate outer join clauses. An outer join clause that didn't actually reference the RHS (perhaps only after constant-folding) could confuse the join order enforcement logic, leading to wrong query results. Also, nested occurrences of such things could trigger an Assertion that on reflection seems incorrect. Per fuzz testing by Andreas Seltenreich. The practical use of such cases seems thin enough that it's not too surprising we've not heard field reports about it. This has been broken for a long time, so back-patch to all active branches.	2015-08-01 20:57:41 -04:00
Tom Lane	edf26ed033	Fix an oversight in checking whether a join with LATERAL refs is legal. In many cases, we can implement a semijoin as a plain innerjoin by first passing the righthand-side relation through a unique-ification step. However, one of the cases where this does NOT work is where the RHS has a LATERAL reference to the LHS; that makes the RHS dependent on the LHS so that unique-ification is meaningless. joinpath.c understood this, and so would not generate any join paths of this kind ... but join_is_legal neglected to check for the case, so it would think that we could do it. The upshot would be a "could not devise a query plan for the given query" failure once we had failed to generate any join paths at all for the bogus join pair. Back-patch to 9.3 where LATERAL was added.	2015-07-31 19:26:33 -04:00
Tom Lane	b55722692b	Improve planner's cost estimation in the presence of semijoins. If we have a semijoin, say SELECT * FROM x WHERE x1 IN (SELECT y1 FROM y) and we're estimating the cost of a parameterized indexscan on x, the number of repetitions of the indexscan should not be taken as the size of y; it'll really only be the number of distinct values of y1, because the only valid plan with y on the outside of a nestloop would require y to be unique-ified before joining it to x. Most of the time this doesn't make that much difference, but sometimes it can lead to drastically underestimating the cost of the indexscan and hence choosing a bad plan, as pointed out by David Kubečka. Fixing this is a bit difficult because parameterized indexscans are costed out quite early in the planning process, before we have the information that would be needed to call estimate_num_groups() and thereby estimate the number of distinct values of the join column(s). However we can move the code that extracts a semijoin RHS's unique-ification columns, so that it's done in initsplan.c rather than on-the-fly in create_unique_path(). That shouldn't make any difference speed-wise and it's really a bit cleaner too. The other bit of information we need is the size of the semijoin RHS, which is easy if it's a single relation (we make those estimates before considering indexscan costs) but problematic if it's a join relation. The solution adopted here is just to use the product of the sizes of the join component rels. That will generally be an overestimate, but since estimate_num_groups() only uses this input as a clamp, an overestimate shouldn't hurt us too badly. In any case we don't allow this new logic to produce a value larger than we would have chosen before, so that at worst an overestimate leaves us no wiser than we were before.	2015-03-11 21:21:00 -04:00
Bruce Momjian	4baaf863ec	Update copyright for 2015 Backpatch certain files through 9.0	2015-01-06 11:43:47 -05:00
Bruce Momjian	0a78320057	pgindent run for 9.4 This includes removing tabs after periods in C comments, which was applied to back branches, so this change should not effect backpatching.	2014-05-06 12:12:18 -04:00
Bruce Momjian	7e04792a1c	Update copyright for 2014 Update all files in head, and files COPYRIGHT and legal.sgml in all back branches.	2014-01-07 16:05:30 -05:00
Tom Lane	9e7e29c75a	Fix planner problems with LATERAL references in PlaceHolderVars. The planner largely failed to consider the possibility that a PlaceHolderVar's expression might contain a lateral reference to a Var coming from somewhere outside the PHV's syntactic scope. We had a previous report of a problem in this area, which I tried to fix in a quick-hack way in commit `4da6439bd8`, but Antonin Houska pointed out that there were still some problems, and investigation turned up other issues. This patch largely reverts that commit in favor of a more thoroughly thought-through solution. The new theory is that a PHV's ph_eval_at level cannot be higher than its original syntactic level. If it contains lateral references, those don't change the ph_eval_at level, but rather they create a lateral-reference requirement for the ph_eval_at join relation. The code in joinpath.c needs to handle that. Another issue is that createplan.c wasn't handling nested PlaceHolderVars properly. In passing, push knowledge of lateral-reference checks for join clauses into join_clause_is_movable_to. This is mainly so that FDWs don't need to deal with it. This patch doesn't fix the original join-qual-placement problem reported by Jeremy Evans (and indeed, one of the new regression test cases shows the wrong answer because of that). But the PlaceHolderVar problems need to be fixed before that issue can be addressed, so committing this separately seems reasonable.	2013-08-17 20:22:37 -04:00
Bruce Momjian	bd61a623ac	Update copyrights for 2013 Fully update git head, and update back branches in ./COPYRIGHT and legal.sgml files.	2013-01-01 17:15:01 -05:00
Tom Lane	9ff79b9d4e	Fix up planner infrastructure to support LATERAL properly. This patch takes care of a number of problems having to do with failure to choose valid join orders and incorrect handling of lateral references pulled up from subqueries. Notable changes: * Add a LateralJoinInfo data structure similar to SpecialJoinInfo, to represent join ordering constraints created by lateral references. (I first considered extending the SpecialJoinInfo structure, but the semantics are different enough that a separate data structure seems better.) Extend join_is_legal() and related functions to prevent trying to form unworkable joins, and to ensure that we will consider joins that satisfy lateral references even if the joins would be clauseless. * Fill in the infrastructure needed for the last few types of relation scan paths to support parameterization. We'd have wanted this eventually anyway, but it is necessary now because a relation that gets pulled up out of a UNION ALL subquery may acquire a reltargetlist containing lateral references, meaning that its paths have to be parameterized whether or not we have any code that can push join quals down into the scan. * Compute data about lateral references early in query_planner(), and save in RelOptInfo nodes, to avoid repetitive calculations later. * Assorted corner-case bug fixes. There's probably still some bugs left, but this is a lot closer to being real than it was before.	2012-08-26 22:50:23 -04:00
Tom Lane	eb919e8fde	Resurrect the "last ditch" code path in join_search_one_level(). This essentially reverts commit `e54b10a62d`, in which I'd decided that the "last ditch" join logic was useless. The folly of that is now exposed by a report from Pavel Stehule: although the function should always find at least one join in a self-contained join problem, it can still fail to do so in a sub-problem created by artificial from_collapse_limit or join_collapse_limit constraints. Adjust the comments to describe this, and simplify the code a bit to match the new coding of the earlier loop in the function. I'm not terribly happy about this: I still subscribe to the opinion stated in the previous commit message that the "last ditch" code can obscure logic bugs elsewhere. But the alternative seems to be to complicate the earlier tests for does-this-relation-have-a-join-clause to the point where they can tell whether the join clauses link outside the current join sub-problem. And that looks messy, slow, and possibly a source of bugs in itself. In any case, now is not the time to be inserting experimental code into 9.2, so let's just go back to the time-tested solution.	2012-08-15 00:08:13 -04:00
Bruce Momjian	927d61eeff	Run pgindent on 9.2 source tree in preparation for first 9.3 commit-fest.	2012-06-10 15:20:04 -04:00
Tom Lane	1f03630011	Adjust join_search_one_level's handling of clauseless joins. For an initial relation that lacks any join clauses (that is, it has to be cartesian-product-joined to the rest of the query), we considered only cartesian joins with initial rels appearing later in the initial-relations list. This creates an undesirable dependency on FROM-list order. We would never fail to find a plan, but perhaps we might not find the best available plan. Noted while discussing the logic with Amit Kapila. Improve the comments a bit in this area, too. Arguably this is a bug fix, but given the lack of complaints from the field I'll refrain from back-patching.	2012-04-20 20:10:46 -04:00
Tom Lane	5b7b5518d0	Revise parameterized-path mechanism to fix assorted issues. This patch adjusts the treatment of parameterized paths so that all paths with the same parameterization (same set of required outer rels) for the same relation will have the same rowcount estimate. We cache the rowcount estimates to ensure that property, and hopefully save a few cycles too. Doing this makes it practical for add_path_precheck to operate without a rowcount estimate: it need only assume that paths with different parameterizations never dominate each other, which is close enough to true anyway for coarse filtering, because normally a more-parameterized path should yield fewer rows thanks to having more join clauses to apply. In add_path, we do the full nine yards of comparing rowcount estimates along with everything else, so that we can discard parameterized paths that don't actually have an advantage. This fixes some issues I'd found with add_path rejecting parameterized paths on the grounds that they were more expensive than not-parameterized ones, even though they yielded many fewer rows and hence would be cheaper once subsequent joining was considered. To make the same-rowcounts assumption valid, we have to require that any parameterized path enforce all join clauses that could be obtained from the particular set of outer rels, even if not all of them are useful for indexing. This is required at both base scans and joins. It's a good thing anyway since the net impact is that join quals are checked at the lowest practical level in the join tree. Hence, discard the original rather ad-hoc mechanism for choosing parameterization joinquals, and build a better one that has a more principled rule for when clauses can be moved. The original rule was actually buggy anyway for lack of knowledge about which relations are part of an outer join's outer side; getting this right requires adding an outer_relids field to RestrictInfo.	2012-04-19 15:53:47 -04:00
Tom Lane	e54b10a62d	Remove the "last ditch" code path in join_search_one_level(). So far as I can tell, it is no longer possible for this heuristic to do anything useful, because the new weaker definition of have_relevant_joinclause means that any relation with a joinclause must be considered joinable to at least one other relation. It would still be possible for the code block to be entered, for example if there are join order restrictions that prevent any join of the current level from being formed; but in that case it's just a waste of cycles to attempt to form cartesian joins, since the restrictions will still apply. Furthermore, IMO the existence of this code path can mask bugs elsewhere; we would have noticed the problem with cartesian joins a lot sooner if this code hadn't compensated for it in the simplest case. Accordingly, let's remove it and see what happens. I'm committing this separately from the prerequisite changes in have_relevant_joinclause, just to make the question easier to revisit if there is some fault in my logic.	2012-04-13 16:07:18 -04:00
Tom Lane	e3ffd05b02	Weaken the planner's tests for relevant joinclauses. We should be willing to cross-join two small relations if that allows us to use an inner indexscan on a large relation (that is, the potential indexqual for the large table requires both smaller relations). This worked in simple cases but fell apart as soon as there was a join clause to a fourth relation, because the existence of any two-relation join clause caused the planner to not consider clauseless joins between other base relations. The added regression test shows an example case adapted from a recent complaint from Benoit Delbosc. Adjust have_relevant_joinclause, have_relevant_eclass_joinclause, and has_relevant_eclass_joinclause to consider that a join clause mentioning three or more relations is sufficient grounds for joining any subset of those relations, even if we have to do so via a cartesian join. Since such clauses are relatively uncommon, this shouldn't affect planning speed on typical queries; in fact it should help a bit, because the latter two functions in particular get significantly simpler. Although this is arguably a bug fix, I'm not going to risk back-patching it, since it might have currently-unforeseen consequences.	2012-04-13 16:07:17 -04:00
Tom Lane	e2fa76d80b	Use parameterized paths to generate inner indexscans more flexibly. This patch fixes the planner so that it can generate nestloop-with- inner-indexscan plans even with one or more levels of joining between the indexscan and the nestloop join that is supplying the parameter. The executor was fixed to handle such cases some time ago, but the planner was not ready. This should improve our plans in many situations where join ordering restrictions formerly forced complete table scans. There is probably a fair amount of tuning work yet to be done, because of various heuristics that have been added to limit the number of parameterized paths considered. However, we are not going to find out what needs to be adjusted until the code gets some real-world use, so it's time to get it in there where it can be tested easily. Note API change for index AM amcostestimate functions. I'm not aware of any non-core index AMs, but if there are any, they will need minor adjustments.	2012-01-27 19:26:38 -05:00
Bruce Momjian	e126958c2e	Update copyright notices for year 2012.	2012-01-01 18:01:58 -05:00
Bruce Momjian	6560407c7d	Pgindent run before 9.1 beta2.	2011-06-09 14:32:50 -04:00
Tom Lane	eca75a12a2	Ensure mark_dummy_rel doesn't create dangling pointers in RelOptInfos. When we are doing GEQO join planning, the current memory context is a short-lived context that will be reset at the end of geqo_eval(). However, the RelOptInfos for base relations are set up before that and then re-used across many GEQO cycles. Hence, any code that modifies a baserel during join planning has to be careful not to put pointers to the short-lived context into the baserel struct. mark_dummy_rel got this wrong, leading to easy-to-reproduce-once-you-know-how crashes in 8.4, as reported off-list by Leo Carson of SDSC. Some improvements made in 9.0 make it difficult to demonstrate the crash in 9.0 or HEAD; but there's no doubt that there's still a risk factor here, so patch all branches that have the function. (Note: 8.3 has a similar function, but it's only applied to joinrels and thus is not a hazard.)	2011-04-13 18:56:40 -04:00
Bruce Momjian	bf50caf105	pgindent run before PG 9.1 beta 1.	2011-04-10 11:42:00 -04:00
Bruce Momjian	5d950e3b0c	Stamp copyrights for year 2011.	2011-01-01 13:18:15 -05:00
Tom Lane	f4e4b32743	Support RIGHT and FULL OUTER JOIN in hash joins. This is advantageous first because it allows us to hash the smaller table regardless of the outer-join type, and second because hash join can be more flexible than merge join in dealing with arbitrary join quals in a FULL join. For merge join all the join quals have to be mergejoinable, but hash join will work so long as there's at least one hashjoinable qual --- the others can be any condition. (This is true essentially because we don't keep per-inner-tuple match flags in merge join, while hash join can do so.) To do this, we need a has-it-been-matched flag for each tuple in the hashtable, not just one for the current outer tuple. The key idea that makes this practical is that we can store the match flag in the tuple's infomask, since there are lots of bits there that are of no interest for a MinimalTuple. So we aren't increasing the size of the hashtable at all for the feature. To write this without turning the hash code into even more of a pile of spaghetti than it already was, I rewrote ExecHashJoin in a state-machine style, similar to ExecMergeJoin. Other than that decision, it was pretty straightforward.	2010-12-30 20:26:08 -05:00
Magnus Hagander	9f2e211386	Remove cvs keywords from all files.	2010-09-20 22:08:53 +02:00
Tom Lane	4e97631e6a	Fix join-removal logic for pseudoconstant and outerjoin-delayed quals. In these cases a qual can get marked with the removable rel in its required_relids, but this is just to schedule its evaluation correctly, not because it really depends on the rel. We were assuming that, in effect, we could throw away all quals so marked, which is nonsense. Tighten up the logic to be a little more paranoid about which quals belong to the outer join being considered for removal, and arrange for all quals that don't belong to be updated so they will still get evaluated correctly. Also fix another problem that happened to be exposed by this test case, which was that make_join_rel() was failing to notice some cases where a constant-false qual could be used to prove a join relation empty. If it's a pushed-down constant false, then the relation is empty even if it's an outer join, because the qual applies after the outer join expansion. Per report from Nathan Grange. Back-patch into 9.0.	2010-09-14 23:15:29 +00:00
Bruce Momjian	65e806cba1	pgindent run for 9.0	2010-02-26 02:01:40 +00:00
Bruce Momjian	0239800893	Update copyright for the year 2010.	2010-01-02 16:58:17 +00:00
Tom Lane	1a95f12702	Eliminate a lot of list-management overhead within join_search_one_level by adding a requirement that build_join_rel add new join RelOptInfos to the appropriate list immediately at creation. Per report from Robert Haas, the list_concat_unique_ptr() calls that this change eliminates were taking the lion's share of the runtime in larger join problems. This doesn't do anything to fix the fundamental combinatorial explosion in large join problems, but it should push out the threshold of pain a bit further. Note: because this changes the order in which joinrel lists are built, it might result in changes in selected plans in cases where different alternatives have exactly the same costs. There is one example in the regression tests.	2009-11-28 00:46:19 +00:00
Tom Lane	1ca695db38	Fix another thinko in join_is_legal's handling of semijoins: we have to test for the case that the semijoin was implemented within either input by unique-ifying its RHS before we test to see if it appears to match the current join situation. The previous coding would select semijoin logic in situations where we'd already unique-ified the RHS and joined it to some unrelated relation(s), and then came to join it to the semijoin's LHS. That still gave the right answer as far as the semijoin itself was concerned, but would lead to incorrectly examining only an arbitrary one of the matchable rows from the unrelated relation(s). The cause of this thinko was incorrect unification of the pre-8.4 logic for IN joins and OUTER joins --- the comparable case for outer joins can be handled after making the match test, but that's because there is nothing like the unique-ification escape hatch for outer joins. Per bug #4934 from Benjamin Reed.	2009-07-23 17:42:06 +00:00
Tom Lane	a43b190e3c	Fix a thinko in join_is_legal: when we decide we can implement a semijoin by unique-ifying the RHS and then inner-joining to some other relation, that is not grounds for violating the RHS of some other outer join. Noticed while regression-testing new GEQO code, which will blindly follow any path that join_is_legal says is legal, and then complain later if that leads to a dead end. I'm not certain that this can result in any visible failure in 8.4: the mistake may always be masked by the fact that subsequent attempts to join the rest of the RHS of the other join will fail. But I'm not certain it can't, either, and it's definitely not operating as intended. So back-patch. The added regression test depends on the new no-failures-allowed logic that I'm about to commit in GEQO, so no point back-patching that.	2009-07-19 20:32:48 +00:00
Bruce Momjian	d747140279	8.4 pgindent run, with new combined Linux/FreeBSD/MinGW typedef list provided by Andrew.	2009-06-11 14:49:15 +00:00
Tom Lane	75c85bd199	Tighten up join ordering rules to account for recent more-careful analysis of the associativity of antijoins. Also improve optimizer/README discussion of outer join ordering rules.	2009-02-27 22:41:38 +00:00
Tom Lane	233b8a99ad	Improve comments about semijoin implementation strategy, per a question from Robert Haas.	2009-02-19 20:32:45 +00:00
Bruce Momjian	511db38ace	Update copyright for 2009.	2009-01-01 17:24:05 +00:00
Tom Lane	213256cfa9	My recent fix for semijoin planning didn't actually work for a semijoin with a RHS that can't be unique-ified --- join_is_legal has to check that before deciding to build a join, else we'll have an unimplementable joinrel. Per report from Greg Stark.	2008-11-28 19:29:07 +00:00
Tom Lane	8309d006cb	Switch the planner over to treating qualifications of a JOIN_SEMI join as though it is an inner rather than outer join type. This essentially means that we don't bother to separate "pushed down" qual conditions from actual join quals at a semijoin plan node; which is okay because the restrictions of SQL syntax make it impossible to have a pushed-down qual that references the inner side of a semijoin. This allows noticeably better optimization of IN/EXISTS cases than we had before, since the equivalence-class machinery can now use those quals. Also fix a couple of other mistakes that had essentially disabled the ability to unique-ify the inner relation and then join it to just a subset of the left-hand relations. An example case using the regression database is select * from tenk1 a, tenk1 b where (a.unique1,b.unique2) in (select unique1,unique2 from tenk1 c); which is planned reasonably well by 8.3 and earlier but had been forcing a cartesian join of a/b in CVS HEAD.	2008-11-22 22:47:06 +00:00
Tom Lane	719012e013	Add some defenses against constant-FALSE outer join conditions. Since eval_const_expressions will generally throw away anything that's ANDed with constant FALSE, what we're left with given an example like select * from tenk1 a where (unique1,0) in (select unique2,1 from tenk1 b); is a cartesian product computation, which is really not acceptable. This is a regression in CVS HEAD compared to previous releases, which were able to notice the impossible join condition in this case --- though not in some related cases that are also improved by this patch, such as select * from tenk1 a left join tenk1 b on (a.unique1=b.unique2 and 0=1); Fix by skipping evaluation of the appropriate side of the outer join in cases where it's demonstrably unnecessary.	2008-08-17 19:40:11 +00:00
Tom Lane	e006a24ad1	Implement SEMI and ANTI joins in the planner and executor. (Semijoins replace the old JOIN_IN code, but antijoins are new functionality.) Teach the planner to convert appropriate EXISTS and NOT EXISTS subqueries into semi and anti joins respectively. Also, LEFT JOINs with suitable upper-level IS NULL filters are recognized as being anti joins. Unify the InClauseInfo and OuterJoinInfo infrastructure into "SpecialJoinInfo". With that change, it becomes possible to associate a SpecialJoinInfo with every join attempt, which permits some cleanup of join selectivity estimation. That needs to be taken much further than this patch does, but the next step is to change the API for oprjoin selectivity functions, which seems like material for a separate patch. So for the moment the output size estimates for semi and especially anti joins are quite bogus.	2008-08-14 18:48:00 +00:00
Tom Lane	fd791e7b5a	When a relation has been proven empty by constraint exclusion, propagate that knowledge up through any joins it participates in. We were doing that already in some special cases but not in the general case. Also, defend against zero row estimates for the input relations in cost_mergejoin --- this fix may have eliminated the only scenario in which that can happen, but be safe. Per report from Alex Solovey.	2008-03-24 21:53:04 +00:00
Tom Lane	59fc64acee	Fix a conceptual error in my patch of 2007-10-26 that avoided considering clauseless joins of relations that have unexploited join clauses. Rather than looking at every other base relation in the query, the correct thing is to examine the other relations in the "initial_rels" list of the current make_rel_from_joinlist() invocation, because those are what we actually have the ability to join against. This might be a subset of the whole query in cases where join_collapse_limit or from_collapse_limit or full joins have prevented merging the whole query into a single join problem. This is a bit untidy because we have to pass those rels down through a new PlannerInfo field, but it's necessary. Per bug #3865 from Oleg Kharin.	2008-01-11 04:02:18 +00:00
Bruce Momjian	9098ab9e32	Update copyrights in source tree to 2008.	2008-01-01 19:46:01 +00:00
Bruce Momjian	fdf5a5efb7	pgindent run for 8.3.	2007-11-15 21:14:46 +00:00
Tom Lane	cd2a2ce904	Change have_join_order_restriction() so that we do not force a clauseless join if either of the input relations can legally be joined to any other rels using join clauses. This avoids uselessly (and expensively) considering a lot of really stupid join paths when there is a join restriction with a large footprint, that is, lots of relations inside its LHS or RHS. My patch of 15-Feb-2007 had been causing the code to consider joining every combination of rels inside such a group, which is exponentially bad :-(. With this behavior, clauseless bushy joins will be done if necessary, but they'll be put off as long as possible. Per report from Jakub Ouhrabka. Backpatch to 8.2. We might someday want to backpatch to 8.1 as well, but 8.1 does not have the problem for OUTER JOIN nests, only for IN-clauses, so it's not clear anyone's very likely to hit it in practice; and the current patch doesn't apply cleanly to 8.1.	2007-10-26 18:10:50 +00:00
Tom Lane	cdf0231c88	Create a function variable "join_search_hook" to let plugins override the join search order portion of the planner; this is specifically intended to simplify developing a replacement for GEQO planning. Patch by Julius Stroffek, editorialized on by me. I renamed make_one_rel_by_joins to standard_join_search and make_rels_by_joins to join_search_one_level to better reflect their place within this scheme.	2007-09-26 18:51:51 +00:00
Tom Lane	6bef118b01	Restructure code that is responsible for ensuring that clauseless joins are considered when it is necessary to do so because of a join-order restriction (that is, an outer-join or IN-subselect construct). The former coding was a bit ad-hoc and inconsistent, and it missed some cases, as exposed by Mario Weilguni's recent bug report. His specific problem was that an IN could be turned into a "clauseless" join due to constant-propagation removing the IN's joinclause, and if the IN's subselect involved more than one relation and there was more than one such IN linking to the same upper relation, then the only valid join orders involve "bushy" plans but we would fail to consider the specific paths needed to get there. (See the example case added to the join regression test.) On examining the code I wonder if there weren't some other problem cases too; in particular it seems that GEQO was defending against a different set of corner cases than the main planner was. There was also an efficiency problem, in that when we did realize we needed a clauseless join because of an IN, we'd consider clauseless joins against every other relation whether this was sensible or not. It seems a better design is to use the outer-join and in-clause lists as a backup heuristic, just as the rule of joining only where there are joinclauses is a heuristic: we'll join two relations if they have a usable joinclause or this might be necessary to satisfy an outer-join or IN-clause join order restriction. I refactored the code to have just one place considering this instead of three, and made sure that it covered all the cases that any of them had been considering. Backpatch as far as 8.1 (which has only the IN-clause form of the disease). By rights 8.0 and 7.4 should have the bug too, but they accidentally fail to fail, because the joininfo structure used in those releases preserves some memory of there having once been a joinclause between the inner and outer sides of an IN, and so it leads the code in the right direction anyway. I'll be conservative and not touch them.	2007-02-16 00:14:01 +00:00
Tom Lane	c17117649b	Repair bug in 8.2's new logic for planning outer joins: we have to allow joins that overlap an outer join's min_righthand but aren't fully contained in it, to support joining within the RHS after having performed an outer join that can commute with this one. Aside from the direct fix in make_join_rel(), fix has_join_restriction() and GEQO's desirable_join() to consider this possibility. Per report from Ian Harding.	2007-02-13 02:31:03 +00:00
Tom Lane	f41803bb39	Refactor planner's pathkeys data structure to create a separate, explicit representation of equivalence classes of variables. This is an extensive rewrite, but it brings a number of benefits: * planner no longer fails in the presence of "incomplete" operator families that don't offer operators for every possible combination of datatypes. * avoid generating and then discarding redundant equality clauses. * remove bogus assumption that derived equalities always use operators named "=". * mergejoins can work with a variety of sort orders (e.g., descending) now, instead of tying each mergejoinable operator to exactly one sort order. * better recognition of redundant sort columns. * can make use of equalities appearing underneath an outer join.	2007-01-20 20:45:41 +00:00
Bruce Momjian	29dccf5fe0	Update CVS HEAD for 2007 copyright. Back branches are typically not back-stamped for this.	2007-01-05 22:20:05 +00:00

1 2 3

132 commits