SQL for Smarties | SQL Programming Style | Trees and Hierarchies in SQL | SQL Puzzles and Answers | Data and Databases


Tuesday, April 25, 2006

GROUP BY Question

SQL Apprentice Question
I need to generate daily aggregates (sums) from some hourly data and although
I have query that works, I think I've gone overboard on the GROUP BY
statement. Can anyone give me some feedback on this? I'm sure there is a
better way to do this, but I'm at a loss.

SELECT
max(LP.NO_FINISH_LINE) NO_FINISH_LINE,
CONVERT(CHAR(10),max(LP.DT_FINISHED),126) DT_FINISHED,
CAST(sum(LP.NO_WEIGHT_PROD) as DECIMAL(8,2)) NO_WEIGHT_PROD,
CAST(sum(LP.NO_WEIGHT_REJECT) as DECIMAL(8,2)) NO_WEIGHT_REJECT,
FROM LOT_PRODUCTION LP
WHERE (LP.DT_FINISHED BETWEEN CONVERT(CHAR(10), DATEADD(mm, - 1,
(DATEADD(dd, 1 - DAY(GETDATE()), GETDATE()))), 126) AND GETDATE()) AND
LP.NO_FINISH_LINE = 'L2'
GROUP BY DAY(DT_FINISHED),MONTH(DT_FINISHED),YEAR(DT_FINISHED)
ORDER BY DT_FINISHED ASC;


Celko Answers
Why are you formatting the data in the query? That is always done in
the front end. Why are you computing all those silly things in SQL,
which is not a computational language?

Because you are thinking like a COBOL programmer whose data is stored
in strings; think like an SQL programmer who works with abstract data
types.


First build an auxiliary table with your reporting periods. Remember,
tables and not computations:


CREATE TABLE ReportPeriods
(report_period_name CHAR(10) NOT NULL PRIMARY KEY,
rpt_start_date DATETIME NOT NULL,
rpt_end_date DATETIME NOT NULL,
CHECK (rpt_start_date < rpt_end_date));


Next, do a simple join to get your reports:


SELECT R.report_period_name,
MAX(LP.finish_line_nbr)AS finish_line_max,
SUM(LP.prod_wgt) AS prod_wgt_tot,
SUM(LP.reject_wgt) AS reject_wgt_tot,
FROM LotProduction AS LP,
ReportPeriods AS R
WHERE LP.finish_date
BETWEEN R.rpt_start_date AND R.rpt_end_date
GROUP BY R.report_period_name;


Since your data element did not follow ISO-11179 rules, I tried to
guess at corrections. You might also want to start writing Standard SQL
instead of dialect CURRENT_TIMESTAMP instead of the old getdate(),
CAST() instead of CONVERT(), etc.

No comments: