SQL Apprentice Question
I have two tables:
pat01 record patient status1 from date to date.
patientID startdate enddate status1
1 2000-01-01 2000-01-25 Good
1 2000-01-26 2000-02-25 not so Good
1 2000-02-26 2000-03-01 BAD
2 2000-01-01 2000-04-01 BAD
and another table pat02 record patient status2 from date to date
patientID startdate2 enddate2 status2
1 2000-01-01 2000-01-10 CHECKED
1 2000-01-11 2000-02-01 UNCHECKED
Please note that there are overlap between date range in two tables for
the same patient.Now I want to merge those two tables into one:
PATIENTID STARTDATE ENDDATE STATUS1 STATUS2
1 2000-01-01 2000-01-10 Good
CHECKED
1 2000-01-11 2000-01-25 Good
UNCHECKED
1 2000-01-26 2000-02-01 NOT SO GOOD
unchecked
1 2000-02-02 2000-02-25 not so good
null
1 2000-02-26 2000-03-01 BAD
null
2 2000-01-01 2000-04-01 BAD
null
Both table have about 1 million rows, more than 10 years data. I know I
can use number table to create row for each patient each day then
merge them but I am worry about the performance. Does anyone have other
ideas?
Thanks
DDL:
create table pat01
(PatientID int,
StartDate datetime,
EndDate datetime,
Status1 varchar(20)
)
insert into pat01 values (1, '2000-01-01','2000-01-25', 'Good')
insert into pat01 values (1, '2000-01-26','2000-02-25', 'not so Good')
insert into pat01 values (1, '2000-02-26','2000-03-01', 'BAD')
insert into pat01 values (2, '2000-01-01','2000-04-01', 'BAD')
create table pat02
(PatientID int,
StartDate2 datetime,
EndDate2 datetime,
Status2 varchar(20)
)
insert into pat01 values (1, '2000-01-01','2000-01-10', 'CHECKED')
insert into pat01 values (1, '2000-01-11','2000-02-01', 'UNCHECKED')
Celko Answers
>> Both table have about 1 million rows, more than 10 years data. I know I can use number table to create row for each patient each day then merge them but I am worry about the performance. <<
So basically something like this:
SELECT Pat01.patient_id, C1.cal_date, Pat01.status1, Pat02.status2
FROM Pat01, Pat02, Calendar AS C1
WHERE Pat01.patient_id = Pat02.patient_id
AND C1.cal_date BETWEEN Pat01.start_date AND Pat01.end_date
AND C1.cal_date BETWEEN Pat02.start_date AND Pat02.end_date
AND C1.cal_date BETWEEN @my_start_date AND @my_end_date;
so figure we have 3652+ rows per patient_id, assuming they all stay in
the study for 10 years. How many patients? Could be ugly. But I
also assume this is a one-shot data warehousing kind of thing, so this
might not be a bad way to go.
I cannot think of an answer right now, but I have the feeling that you
can use the new OLAP functions to build the ranges..
Monday, July 24, 2006
Subscribe to:
Post Comments (Atom)
No comments:
Post a Comment