SQL for Smarties | SQL Programming Style | Trees and Hierarchies in SQL | SQL Puzzles and Answers | Data and Databases


Thursday, August 09, 2007

A Strange Query Options

SQL Apprentice Question
Hi guys,

I'm trying to work this query out in my head see if you can help me with it.
I'll give you a little background information so you have the big picture.
This is a real estate database, every home has an APN number it's a unique
number that always means the same property. When a company wants to list a
home on the MLS to sell the home it gets an MLS number which is only unique
to the instance the entity wants to sell that property, so in a span of a
few years the same APN could go for sale several times and have several MLS
Numbers, but maintain the same APN. This database is setup so each row
is an MLS Number.


I need to make a list of all entries that qualify:


1. Find APN's with multiple entries


2. Narrow it down to only solds


3. Only solds that have been modified in the last 3 years


and it needs to generate a list of just DISTINCT APN's that qualify.


Thanks,




Celko Answers
Please post DDL, so that people do not have to guess what the keys,
constraints, Declarative Referential Integrity, data types, etc. in
your schema are. We don't even know your data types, column or table
names much less your codes! Sample data is also a good idea, along
with clear specifications. It is very hard to debug code when you do
not let us see it.


>> This is a real estate database, every home has an APN number it's a unique number that always means the same property. <<


CREATE TABLE Properties
(apn INTEGER NOT NULL PRIMARY KEY,
etc);


>> When a company wants to list a home on the MLS to sell the home it gets an MLS number which is only unique to the instance the entity wants to sell that property, so in a span of a few years the same APN could go for sale several times and have several MLS Numbers, but maintain the same APN. <<


CREATE TABLE Listings
(apn INTEGER NOT NULL
REFERENCES Properties(apn)
ON DELETE CASCADE
ON UPDATE CASCADE,
start_date DATETIME DEFAULT CURRENT_TIMESTAMP NOT NULL,
end_date DATETIME, -- null means current
CHECK (start_date < end_date),
PRIMARY KEY (apn, start_date),
mls INTEGER NOT NULL,
listing_status CHAR(10) DEFAULT 'listed' NOT NULL
CHECK (listing_status IN ('listed', 'sold', 'reduced', etc)),
asking_price DECIMAL(12,2) NOT NULL,
etc);

You also need a Calendar table, since this is a history schema. Google
that.



>> I need to make a list of all entries that qualify:


1. Find APN's with multiple entries
2. Narrow it down to only solds
3. Only solds that have been modified in the last 3 years and it
needs to generate a list of just DISTINCT APN's that qualify. <<

Did you notice that you gave a step by step **procedure**, rather than
a **declarative statement**? You do not thinking SQL yet. This is
steps 1 and 2 as a query:


SELECT apn, COUNT(*) AS listed_cnt
FROM Listings
WHERE listing_status = 'sold'
GROUP BY apn
HAVING COUNT(*) > 1;


Step 3 is impossible with what you posted. What does "modified" mean
and where is it in the tables? I assume with the Properties, which
means I need a history schema on it, too. My mental image is that
"modified" is going to involve the bedroom counts, kitchen appliances,
roofing, etc. and NOT one column with a simple code in it.