SQL for Smarties | SQL Programming Style | Trees and Hierarchies in SQL | SQL Puzzles and Answers | Data and Databases


Friday, November 24, 2006

one to one random mapping between int

SQL Apprentice Question
I need to one-to-one map a range of integer (A), say 1 to 10000, randomly to
1 to 10000 (B). And I need to get A if a B is given. Any existed good
algorithm in T-SQL?

Celko Answers
Here is an implementation of the additive congruential method of
generating values in pseudo-random order and is due to Roy Hann of
Rational Commerce Limited, a CA-Ingres consulting firm. It is based on
a shift-register and an XOR-gate, and it has its origins in
cryptography. While there are other ways to do this, this code is nice
because:

1) The algorithm can be written in C or another low level language for
speed. But math is fairly simple even in base ten.


2) The algorithm tends to generate successive values that are (usually)
"far apart", which is handy for improving the performance of tree
indexes. You will tend to put data on separate physical data pages in
storage.


3) The algorithm does not cycle until it has generated every possible
value, so we don't have to worry about duplicates. Just count how many
calls have been made to the generator.


4) The algorithm produces uniformly distributed values, which is a nice
mathematical property to have. It also does not include zero.


Generalizing the algorithm to arbitrary binary word sizes, and
therefore longer number sequences, is not as easy as you might think.
Finding the "tap" positions where bits are extracted for feedback
varies according to the word-size in an extremely non-obvious way.
Choosing incorrect tap positions results in an incomplete and usually
very short cycle, which is unusable. If you want the details and tap
positions for words of one to 100 bits, see E. J. Watson, "Primitive
Polynomials (Mod 2)", Mathematics of Computation, v.16, 1962,
p.368-369. Here is code for a 31-bit integer, which you can use:


see the details at:


http://www.rationalcommerce.com/resources/surrogates.htm

UPDATE generator31
SET keyval
= keyval/2 + MOD(MOD(keyval, 2) + MOD(keyval/2, 2), 2) * 8;


T-SQL version:


CREATE TABLE Generator31 (keyval INTEGER NOT NULL);
INSERT INTO Generator31 VALUES (1); -- any start value


UPDATE Generator31
SET keyval =
keyval/2 + ((keyval % 2) + (keyval/8 % 2) % 2)* 8
SELECT * FROM Generator31;


Or if you prefer, the algorithm in C:


int Generator31 ()
{static int n = 1;
n = n >> 1 ((n^n >> 3) & 1) << 30;
return n;

1 comment:

Anonymous said...

The SQL samples are incorrect - the feedback term should be multiplied by 2^30 (1073741824), not 8.