SQL for Smarties | SQL Programming Style | Trees and Hierarchies in SQL | SQL Puzzles and Answers | Data and Databases


Tuesday, May 09, 2006

locating set of points close to one another (within a threshold)

SQL Apprentice Question
I have a large data set of points situated in 3d space. I have a simple
primary key and an x, y and z value.


What I would like is an efficient method for finding the group of
points within a threshold.


So far I have tested the following however it is very slow.


---------------
select *
from locations a full outer join locations b
on a.ID < b.ID and a.X-b.X<2 and a.Y-b.Y<2 and a.Z-b.Z<2
where a.ID is not null and b.ID is not null
---------------


If anyone knows of a more efficient method to arrive at this results it
would be most appreciated.


Celko Answers
Why would you have a NULL identifier? I guess that two things can have
the same location, otherwise (x,y,z) would be a key. And should your
query look like this, since you want to go in both directions.on an
axis? The OUTER JOIN makes no sense.

SELECT A.node_id, B.node_id
FROM Locations AS A, Locations AS b
WHERE A.node_id < B.node_id
AND ABS(A.x-B.x) <= 2
AND ABS(A.y-B.y) <= 2
AND ABS(A.z-B.z) <= 2


I did something like in 2-D for neighborhoods based on quadrants. If a
node was in one quadrant, I assumed that the nine adjacent quads
centered on his quad would have the nearest neighbor, so i only did a
distanve formula on popint in those quads. It meant carrying an extra
pair of quad co-ordinates, but it saved searching all the nodes -- 9
quads versus ~15,000 quads.

Square both sides to avoid conversions to FLOAT that SQRT() will give
you.

POWER ((A.x - B.x), 2) + POWER ((A.y - B.y), 2)+ POWER ((A.z - B.z), 2)
< POWER(cutoff , 2)

No comments: