November 30, 2006 --- Class 25 --- Pseudorandom Numbers Generators,
Serial Correlation Tests
Activities:
Properties of Pseudo Random Number Generators
We looked at some properties of Linear Congruential PRNGs.
x_{n+1} = (a x_n + c) mod m
a is the multiplier
c is the increment
m is the modulus.
If c=0, this is a multiplicative generator, otherwise, it is mixed.
We start the sequence with a seed x_0.
m determines the maximum possible length of the sequence. It is
necessary to pick a and c carefully to get the maximum length
sequence.
Here is a theorem: The length of the period is m if the following
three conditions all hold.
a) c is relatively prime to m
b) (a-1) is a multiple of p for every prime factor of m
c) if m is a multiple of 4, (a-1) is a multiple of 4.
We looked at the case, m=32, a=3, c=4. This does not obey the
theorem and there is no sequence longer than 8. There is even a
sequence of length one. So much for randomness! To explore this
generator use the Mathematica definition
myrand[x_] := Mod[ 3x+4, 32]
The NestList command can be used to study a sequence:
seed = 3
NestList[myrand, seed, 32]
Adjusting a and c, we can easily find sequences of length 32;
however, that is not enough to ensure that the sequence looks random.
Problem with ran1 from first edition of Numerical Recipes
What is the problem with ran1? Here are some details:
Ran1 can be found in ~sg/src/ran1.c.
It is based on three linear congruential random number generators.
The first determines the most significant part the of the random
numbers, the second determines the least significant part, and
the third is just used to mixup the order.
Use the Mathematica FactorInteger command to see if each of the
three linear congruential random number generators obeys our
theorem. They do, but the problem with the random number generator
is that the high order bits of the float and the low order bits
come from two sequences with length 259,200 and 134,456,
respectively. These can be factored:
In[3]:= FactorInteger[134456]
Out[3]= {{2, 3}, {7, 5}}
In[4]:= FactorInteger[259200]
Out[4]= {{2, 7}, {3, 4}, {5, 2}}
It looks like the lowest common multiple is 4,356,374,000, which is a
reasonably big number. However, in many applications (percolation
or numerical integration, for instance ) the lower order bits may
not be that significant, so the period of the high order bits,
259,200, a relatively small number is more relevant. The mixing of
the order of the random numbers from the third linear congruential
RNG does not help that much.
Serial Correlation Tests
In addition to having a uniform probability density, a pseudo random
number generator should have zero autocorrelation. There are tests
based on looking at pairs or triplets of numbers. For linear
congruential random number generators, there is a well known
problem that like the rain in Spain they tend to fall mainly in
the planes. This can be demonstrated by showing a two dimensional
scatter plot of (r_n, r_{n+1}). If you would like to see this, there
is an executable in ~sg/bin called random_lc:
random_lc | axis m 0 | xplot
will show you the correlations. The code is in ~sg/src and this
makes use of a poor linear congruential random number generator. You
would not see the same pattern with the fsr random number generator.
We looked in detail at another test of correlations based on
triplets of numbers. If we divide the interval (0,1) into L bins,
and look at triplets in the resulting L^3 cells, we derived that
the probability of having a cell remain unoccupied is exponentially
falling. Specifially, if we examine the occupancy after tL^3
triplets have been generated, the number of unoccupied cells should
be L^3 Exp[-t].
The box_rand.c code in ~sg/src implements this code for L=10.
To compile the code use the command:
cc -o box_rand box_rand.c new_fsr.o
This assumes that the new_fsr.c code has already been complied with
cc -c new_fsr.c
In our next class, we will run this code multiple times to collect
statistics, and examine how the feedback shift register code works.