SQL for Smarties | SQL Programming Style | Trees and Hierarchies in SQL | SQL Puzzles and Answers | Data and Databases


Friday, April 27, 2007

joining on same table multiple times on different values

SQL Apprentice Question
Having perfomance problems with this query (returning to many rows and
takening to long), related I think
to joining on the DomainDetail table 3 times.
Is there a better way to do this?


SELECT SO.SystemOrderID AS Order_ID,
SO.SystemOrderTypeID AS Order_TypeID,
SO.OrderStatusID AS Order_StatusID,
SO.CreateDate AS Order_CreateDate,
DD.[Name] AS Detail_Status
DD1.Name AS Detail_Type
DD2.Name AS Order_Status
FROM MyDatabase3.dbo..SystemOrder SO
INNER JOIN MyDatabase2.dbo..SystemOrderDetail SOD
ON SOD.SystemOrderID = SO.SystemOrderID
INNER JOIN MyDatabase1.dbo.DomainDetail DD
ON SOD.DetailStatusID = DD.DomainValue
INNER JOIN MyDatabase1.dbo.DomainDetail DD1
ON SOD.DetailTypeID = DD1.DomainValue
INNER JOIN MyDatabase1.dbo.DomainDetail DD2
ON SO.OrderStatusID = DD2.DomainValue
WHERE SO.CreateDate > dateadd(month,-13,getdate())AND
SOD.CreateDate > dateadd(month,-13,getdate())AND
DD.DomainID IN(125,126,127)


thanks


Celko Answers
I first re-wrote your query to bring the names closer to ISO-11179,
remove the display headers in the SELECT list and get rid of redundant
predicates like "SOD.system_order_id = SO.system_order_id" (that
happens a lot with the infixed syntax ). Untested I got this.

SELECT SO.system_order_id, SO.system_order_type,
SO.order_status, SO.create_date,
DD.name AS detail_status
DD1.name AS detail_type
DD2.name AS order_status
FROM Mydatabase3.DBO..System_Order AS SO,
Mydatabase2.DBO..System_Orderdetail AS SOD,
Mydatabase1.DBO.Domaindetail AS DD,
Mydatabase1.DBO.Domaindetail AS DD1,
Mydatabase1.DBO.Domaindetail AS DD2
WHERE SOD.detail_status = DD.domain_value
AND SOD.detail_type = DD1.domain_value
AND SO.order_status = DD2.domain_value
AND SOD.system_order_id = SO.system_order_id
AND SO.create_date > DATEADD(MONTH, -13, CURRENT_TIMESTAMP)
AND SOD.create_date > DATEADD(MONTH, -13, CURRENT_TIMESTAMP)
AND DD.domain_id IN (125, 126, 127);


What you posted scares me. First of all, there is no such thing as a
"type_id" in a valid data model-- an attribute can be an identifier or
a type, but NEVER both at once. Ditto for "status_id" data element.


Is your order information spread across three databases (mydatabase3,
mydatabase2, mydatabase1)? That just does not work; a data model
needs to be in one place. I am still trying to make sense of the
"system-" prefix -- what is a "System_orders" entity and why is it
logically different from a plain "Orders" table?


There is a magical column with the uselessly vague name "name" that is
equivalent to "detail_status", "detail_type" and "order_status". But
a type and a status are very different kinds of attributes, just as
orders and details are very different kinds of entities. Ditto for
the equally magical "domain_value" data element. We need to know
"name of WHAT!?" to make sense of this.


I am going to make a guess that this schema:


1) Mixes data and metadata. The choice of names implies this strongly


2) Has non-scalar columns. We know that already from your magical
changing columns. Do you find the meaning of the magical domain_value
from the domain_id?


3) Has split entities across tables, which is probably why you have
multiple databases.


Please post DDL, so that people do not have to guess what the keys,
constraints, Declarative Referential Integrity, data types, etc. in
your schema are. Sample data is also a good idea, along with clear
specifications. It is very hard to debug code when you do not let us
see it.

No comments: