**SQL Apprentice Question**

Given a set of n-dimensional boxes

table boxes (

dimension# integer,

low integer,

high integer

)

find all the pairs that intersect. Example

the cubes A={(x,0,2),(y,0,2),(z,0,2)}

and B={(x,1,3),(y,1,3),(z,1,3)} intersect, while the box

C={(x,10,12),(y,0,4),(z,0,100)}

intersect neither A nor B.

Bonus point: anything special about this kind of query?

**Celko Answers**

Try a slightly different approach. Begin with one dimension and

stronger DDL:

CREATE TABLE Boxes

(box_id CHAR (1) NOT NULL,

dim CHAR(1) NOT NULL,

PRIMARY KEY (box_id, dim),

low INTEGER NOT NULL,

high INTEGER NOT NULL,

CHECK (low < high));

INSERT INTO Boxes VALUES ('A', 'x', 0, 2);

INSERT INTO Boxes VALUES ('B', 'x', 1, 3);

INSERT INTO Boxes VALUES ('C', 'x', 10, 12);

--in 1 dimension

SELECT B1.box_id, B2.box_id

FROM Boxes AS B1, Boxes AS B2

WHERE B1.box_id < B2.box_id

AND (B1.high - B1.low) + (B2.high - B2.low)

> ABS(B1.high - B2.low);

This says that two lines segements overlap when their combined lengths

are less than their span in the dimension. Math rather than

between-ness.

/* the cubes A={(x,0,2),(y,0,2),(z,0,2)}

and B={(x,1,3),(y,1,3),(z,1,3)} intersect, while the box

C={(x,10,12),(y,0,4),(z,0,100)} */

let's go to 2D

INSERT INTO Boxes VALUES ('A', 'y', 0, 2);

INSERT INTO Boxes VALUES ('B', 'y', 1, 3);

INSERT INTO Boxes VALUES ('C', 'y', 0, 4);

--in 2 dimension: first shot:

SELECT B1.box_id, B2.box_id, B1.dim

FROM Boxes AS B1, Boxes AS B2

WHERE B1.box_id < B2.box_id

AND B1.dim = B2.dim

AND (B1.high - B1.low) + (B2.high - B2.low)

> ABS(B1.high - B2.low);

Now look for a common area in (x,y) by having overlaps in both

dimensions:

SELECT B1.box_id, B2.box_id

FROM Boxes AS B1, Boxes AS B2

WHERE B1.box_id < B2.box_id

AND B1.dim = B2.dim

AND (B1.high - B1.low) + (B2.high - B2.low)

> ABS(B1.high - B2.low)

GROUP BY B1.box_id, B2.box_id

HAVING COUNT(B1.dim) = 2;

--3 dimensions:

INSERT INTO Boxes VALUES ('A', 'z', 0, 2);

INSERT INTO Boxes VALUES ('B', 'z', 1, 3);

INSERT INTO Boxes VALUES ('C', 'z', 0, 100);

Now change the HAVING clause to COUNT(B1.dim) = 3 or (SELECT COUNT

(DISTINCT dim) FROM Boxes) if you do not know the space you are using.

## No comments:

Post a Comment