Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
menu search
person
Welcome To Ask or Share your Answers For Others

Categories

Do any queries exist that require RIGHT JOIN, or can they always be re-written with LEFT JOIN?

And more specifically, how do you re-write this one without the right join (and I guess implicitly without any subqueries or other fanciness):


SELECT *
FROM t1
LEFT JOIN t2 ON t1.k2 = t2.k2
RIGHT JOIN t3 ON t3.k3 = t2.k3
See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
246 views
Welcome To Ask or Share your Answers For Others

1 Answer

You can always re-write them to get the same result set. However, sometimes the execution plan may be different in significant ways (performance) and sometimes a right join let's you express the query in a way that makes more sense.

Let me illustrate the performance difference. Programmers tend to think in terms of an sql statement happening all at once. However, it's useful to keep a mental model that complicated queries happen in a series of steps where tables are typically joined in the order listed. So you may have a query like this:

SELECT * /* example: don't care what's returned */
FROM LargeTable L
LEFT JOIN MediumTable M ON M.L_ID=L.ID
LEFT JOIN SmallTable S ON S.M_ID=M.ID
WHERE ...

The server will normally start by applying anything it can from the WHERE clause to the first table listed (LargeTable, in this case), to reduce what it needs to load into memory. Then it will join the next table (MediumTable), and then the one after that (SmallTable), and so on.

What we want to do is use a strategy that accounts for the expected impact of each joined table on the results. In general you want to keep the result set as small as possible for as long as possible. Apply that principle to the example query above, and we see it's obviously much slower than it needs to be. It starts with the larger sets (tables) and works down. We want to begin with the smaller sets and work up. That means using SmallTable first, and the way to do that is via a RIGHT JOIN.

Another key here is that the server usually can't know which rows from SmallTable will be needed until the join is completed. Therefore it only matters if SmallTable is so much smaller than LargeTable that loading the entire SmallTable into memory is cheaper than whatever you would start with from LargeTable (which, being a large table, is probably well-indexed and probably filters on a field or three in the where clause).

It's important to also point out that in the vast majority of cases the optimizer will look at this and handle things in the most efficient way possible, and most of the time the optimizer is going to do a better job at this than you could.

But the optimizer isn't perfect. Sometimes you need to help it along: especially if one or more of your "tables" is a view (perhaps into a linked server!) or a nested select statement, for example. A nested sub-query is also a good case of where you might want to use a right join for expressive reasons: it lets you move the nested portion of the query around so you can group things better.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
...