SQL for Smarties | SQL Programming Style | Trees and Hierarchies in SQL | SQL Puzzles and Answers | Data and Databases


Tuesday, May 02, 2006

UNION ALL and SELECT DISTINCT

SQL Apprentice Question
The nice folks who gave us the SQL language adopted two different
conventions regarding duplicates in result tables.

In the SELECT operator, they chose to have SELECT mean the same as SELECT
ALL. If you want SELECT DISTINCT, you have to say so.


Notice that UNION is just the reverse. If you want UNION ALL, you have to
say so. If you just say UNION, you get only distinct rows in the result.


I think it would have been nicer if they had adopted the same convention in
both cases. I like the way UNION does it. So I'd like it if SELECT, by
default, eliminated duplicates. SELECT ALL could be used for what is now
meant by SELECT.


Of course, it's too late to change SQL at this juncture.


I'd also like it if the optimizer could figure out when SELECT ALL and
SELECT DISTINCT are equivalent, and make use of that in picking the optimal
strategy. If the optimizer has to make a mistake, I'd prefer it to pick a
strategy that's correct but not optimal, and not pick a fast but incorrect
strategy.

Celko Answers
>>The nice folks who gave us the SQL language adopted two different conventions regarding duplicates in result tables. <<


Well, someone did not do their homework!! The ALL keyword preserves
duplcates and DISTINCT removes redundantrdups in many places in SQL.
But it is the assumption/option and not the requirement.in places in
the language.


>> I'd also like it if the optimizer could figure out when SELECT ALL and SELECT DISTINCT are equivalent, and make use of that in picking the optimal strategy. <<


Most products do ..


>> I'd prefer it to pick a strategy that's correct but not optimal, and not pick a fast but incorrect strategy. <<


I do not see a conflict. Correct first, fast next.

No comments: