We have a general perception that EXISTS is faster then IN. This is not always true. There can be some cases where IN performs better then EXISTS.
This depends on the size of the tables (and data queried) Using "IN", the optimizer generally runs the subquery first, then joins with the main dataset (nested loops, merge, whatever). If the table in the subquery is small, and the table in the main query is large, then IN usually makes sense.
Using "EXISTS", the optimizer generally runs the main query first, then applies that dataset to the subquery. If the table in the subquery is large, and the table in the main query is small, then EXISTS usually makes sense.
By the way, Full Table Scan are not necessarily bad. If you are querying more than, say 15 - 20%, of the data in a table the Full Table Scan may be faster. Two Full Table Scans with a Hash Join are frequently gobs faster than a Nested Loops type of plan.
No comments:
Post a Comment