SQL for Smarties | SQL Programming Style | Trees and Hierarchies in SQL | SQL Puzzles and Answers | Data and Databases


Friday, November 24, 2006

complex(?) query

SQL Apprentice Question
I'm a novice sql writer and need some help in writing a query to
extract applicable data from the following table (titled EMPLOYEE):


--
ID_NUMBER CODE DATE
------------------ --------- --------
12 VO 20060914
12 XD 20060913
12 AD 20060912
12 WR 20060911
12 AT 20060910
45 VO 20060914
45 XR 20060913
45 AT 20060912
45 AD 20060911
45 AT 20060910
78 AD 20060914
78 AT 20060913
78 VO 20060912
78 AD 20060911
78 AT 20060910


I need to select ID_NUMBER
from EMPLOYEE
where CODE = 'VO'


caveat: I only want the ID_NUMBER(s) where the CODE = 'VO'
and the previous CODE (by DATE) = 'AD'
or the previous CODE (by DATE) = 'AD' with any CODE in between
except 'AT';


E.g., in the above example, the appropriate code should select
ID_NUMBER(s) 12 and 78 because
1. a VO code exists
2. an AD code (by DATE) precedes it
3. although 'AD' does not come immediately before 'VO' (in the
case of ID_NUMBER 12) 'AT' cannot be found in between


I hope I haven't confused anyone. Any help would be appreciated.



Celko Answers

>>I'm a novice sql writer and need some help in writing a query to extract applicable data from the following table (titled EMPLOYEE): <<


First, let's clean up your missing DDL. The table name should tell us
what set of entities is modeled in the table; do you really have one
employee? Small firm! Try Personnel -- the collective name of the set
or something that tells us what the set is. Code is too vague --
postal code? Date is both too vague *and* a reserved word. A name
like "id_number" is also uselessly general; emp_id would be a better
choice. Since you did not post DDL, we have to guess at constaints and
keys. A skeleton of what you need is something like this:

CREATE TABLE PersonnelActions
(emp_id INTEGER NOT NULL,
action_date action_dateTIME DEFAULT CURRENT_TIMESTAMP NOT NULL,
PRIMARY KEY (emp_id, foobar_date),
action_code CHAR(2) NOT NULL
CHECK (foobar_code IN ('VO', 'XD'))
);


You need to read a book on data modeling and ISO-11179 rules for names.
I would also look up the use of UPPERCASE for names -- it is the worst
way to code, being about 8-12% harder to detect misspellings. That is
why books and newspapers use lowercase.



>> I only want the emp_id(s) where the action_code = 'VO'


and the previous action_code (by action_date) = 'AD'
or the previous action_code (by action_date) = 'AD' with any
action_code in between
except 'AT'; <<

SELECT DISTINCT emp_id
FROM PersonnelAction AS PVO,
PersonnelAction AS PAD
WHERE PVO.emp_id = PAD.emp_id
AND PVO.action_code = 'VO'
AND PAD.action_code = 'AD'
AND PAD.action_date < PVO.action_date
AND NOT EXISTS
(SELECT *
FROM PersonnelAction AS PAT
WHERE PAT.action_code = 'AT'
AND PAT.emp_id = PVO.emp_id
AND PAT_action_date BETWEEN PAD.action_date AND
PVO.action_date);

No comments: