**SQL Apprentice Question**

I have a large data set of points situated in 3d space. I have a simple

primary key and an x, y and z value.

What I would like is an efficient method for finding the group of

points within a threshold.

So far I have tested the following however it is very slow.

---------------

select *

from locations a full outer join locations b

on a.ID < b.ID and a.X-b.X<2 and a.Y-b.Y<2 and a.Z-b.Z<2

where a.ID is not null and b.ID is not null

---------------

If anyone knows of a more efficient method to arrive at this results it

would be most appreciated.

**Celko Answers**

Why would you have a NULL identifier? I guess that two things can have

the same location, otherwise (x,y,z) would be a key. And should your

query look like this, since you want to go in both directions.on an

axis? The OUTER JOIN makes no sense.

SELECT A.node_id, B.node_id

FROM Locations AS A, Locations AS b

WHERE A.node_id < B.node_id

AND ABS(A.x-B.x) <= 2

AND ABS(A.y-B.y) <= 2

AND ABS(A.z-B.z) <= 2

I did something like in 2-D for neighborhoods based on quadrants. If a

node was in one quadrant, I assumed that the nine adjacent quads

centered on his quad would have the nearest neighbor, so i only did a

distanve formula on popint in those quads. It meant carrying an extra

pair of quad co-ordinates, but it saved searching all the nodes -- 9

quads versus ~15,000 quads.

Square both sides to avoid conversions to FLOAT that SQRT() will give

you.

POWER ((A.x - B.x), 2) + POWER ((A.y - B.y), 2)+ POWER ((A.z - B.z), 2)

< POWER(cutoff , 2)

## No comments:

Post a Comment