SQL for Smarties | SQL Programming Style | Trees and Hierarchies in SQL | SQL Puzzles and Answers | Data and Databases


Tuesday, May 02, 2006

Box query

SQL Apprentice Question
Given a set of n-dimensional boxes

table boxes (
dimension# integer,
low integer,
high integer
)


find all the pairs that intersect. Example


the cubes A={(x,0,2),(y,0,2),(z,0,2)}


and B={(x,1,3),(y,1,3),(z,1,3)} intersect, while the box
C={(x,10,12),(y,0,4),(z,0,100)}


intersect neither A nor B.


Bonus point: anything special about this kind of query?


Celko Answers
Try a slightly different approach. Begin with one dimension and
stronger DDL:

CREATE TABLE Boxes
(box_id CHAR (1) NOT NULL,
dim CHAR(1) NOT NULL,
PRIMARY KEY (box_id, dim),
low INTEGER NOT NULL,
high INTEGER NOT NULL,
CHECK (low < high));


INSERT INTO Boxes VALUES ('A', 'x', 0, 2);
INSERT INTO Boxes VALUES ('B', 'x', 1, 3);
INSERT INTO Boxes VALUES ('C', 'x', 10, 12);


--in 1 dimension
SELECT B1.box_id, B2.box_id
FROM Boxes AS B1, Boxes AS B2
WHERE B1.box_id < B2.box_id
AND (B1.high - B1.low) + (B2.high - B2.low)
> ABS(B1.high - B2.low);


This says that two lines segements overlap when their combined lengths
are less than their span in the dimension. Math rather than
between-ness.


/* the cubes A={(x,0,2),(y,0,2),(z,0,2)}
and B={(x,1,3),(y,1,3),(z,1,3)} intersect, while the box
C={(x,10,12),(y,0,4),(z,0,100)} */


let's go to 2D


INSERT INTO Boxes VALUES ('A', 'y', 0, 2);
INSERT INTO Boxes VALUES ('B', 'y', 1, 3);
INSERT INTO Boxes VALUES ('C', 'y', 0, 4);


--in 2 dimension: first shot:
SELECT B1.box_id, B2.box_id, B1.dim
FROM Boxes AS B1, Boxes AS B2
WHERE B1.box_id < B2.box_id
AND B1.dim = B2.dim
AND (B1.high - B1.low) + (B2.high - B2.low)
> ABS(B1.high - B2.low);


Now look for a common area in (x,y) by having overlaps in both
dimensions:


SELECT B1.box_id, B2.box_id
FROM Boxes AS B1, Boxes AS B2
WHERE B1.box_id < B2.box_id
AND B1.dim = B2.dim
AND (B1.high - B1.low) + (B2.high - B2.low)
> ABS(B1.high - B2.low)
GROUP BY B1.box_id, B2.box_id
HAVING COUNT(B1.dim) = 2;


--3 dimensions:


INSERT INTO Boxes VALUES ('A', 'z', 0, 2);
INSERT INTO Boxes VALUES ('B', 'z', 1, 3);
INSERT INTO Boxes VALUES ('C', 'z', 0, 100);


Now change the HAVING clause to COUNT(B1.dim) = 3 or (SELECT COUNT
(DISTINCT dim) FROM Boxes) if you do not know the space you are using.

No comments: