April 3, 2013 --- Class 24 --- Serial Correlation Tests, Autocorrelations
of PRNG, Feedback Shift Register Random Number Generator, SPRNG
Activities:
Review of Serial Correlation Test
---------------------------------
In out last class, we looked in detail at a test of correlations
based on triplets of numbers. If we divided the interval (0,1) into L
bins, and looked at triplets in the resulting L^3 cells, we derived that
the probability of having a cell remain unoccupied is exponentially
falling. Specifially, if we examine the occupancy after tL^3
triplets have been generated, the number of unoccupied cells should
be L^3 Exp[-t].
The box_rand.c code in ~sg/src implements this code for L=10.
(You can change the define statement near the top of the code to
use other values of L.) To compile the code use the command:
cc -o box_rand box_rand.c
Running box_rand
----------------
We ran a few tests with the fsr random number generator and saw that
the number of unoccupied cells was decreasing. Then using a
while shell command, we ran the code 50 times, putting the reults
in a file box_rand.apr3. Using awk and aver, it was easy to get
the average of the 50 trials and plot the result with a semilog
scale using axis. The result clearly shows exponential decay of
the number of unoccupied boxes.
Autocorrelations of Pseudo Random Number Generators, or
comparing drand48 and fsr
-------------------------------------------------------
Before class I prepared codes to test the drand48 random number
generator that comes with Unix. The codes can be found in my src/misc
directory. random_drand48.c puts out two numbers per line and
can be used for a scatter plot or to calculate autocorrelations.
box_drand48.c can be used to test triplets of random numbers as
we did before with box_rand.c for the fsr random number generator.
We compared the autocorrelation of the two random number generators
out to a lag of 20. Doing about 50 trials in each case, we found
that drand48 seemed to have a smaller dispersion than fsr, but that
it seemed to have more negative values of autocorrelation than
positive values. (I also passed around a similar graph from a test
of a linear congruential random number generator that had some much
larger values of the autorcorrelations for some lags.)
We created lists of random numbers as long as 250,000 and looked
at autocorrelations with lags as large as 250. We also histogrammed
the autocorrelations and found a distribution that appears Gaussian.
Understanding the Feedback Shift Register Random Number Generator
-----------------------------------------------------------------
We examined how the feedback shift register (FSR) code makes a new
random 28 bit integer every time it is called. The reason this random
number generator has a period longer than 2^28 is because the period
is determined by the entire state of the feedback shift register
which has about 140 bits.
We also used wikipedia to find a description of IEEE floating point
format. 23 bits are used to store the fractional value of a number.
Feedback shift register random generators are based on bitwise exclusive
or operations. The recursion relation is
b_n = b_{n-p} ^ b_{n-q}
So the new bit depends on bits earlier in the sequence by p and q.
There are certain values of p and q that result in good randomness.
You can find good pairs of p and q in the book "Seminumerical
Algorithms" by Donald Knuth in Table 1. Our pair (p,q) = (97,127).
Within the code, 28 bits of each 32 bit word are used to store the
state of the FSR. rshift[1] to rshift[5] hold 140 bits of state.
The next 28 bits, numbered 141-168 are created in rshift[0].
Using b_n = b_{n-97} ^ b_{n-127}, we see that with n=141 we need
28 bit words starting with bit 44 (=141-97) and bit 14 (= 141-127).
Here are how the bits are stored in the rightmost 28 bits of rshift[i].
rshift[5] = 1-28
rshift[4] = 29-56
rshift[3] = 57-84
rshift[2] = 85-112
rshift[1] = 113-140
From the code, t2= (rshift[5]<<13))|(rshift[4]>>15)
contains the bits 14-41 that we need. In the C language x<> denotes a shift to the right.
We can get bits 44-56 from t1 = (rshift[4]<<15)|(rshift[3]>>13).
We then take the exclusive or of t1 and t2 and mask of the rightmost
28 bits. Finally, the shift register is moved to higher values,
dropping off the old value of rshift[5], and the new 28 bit integer
we created is divided by 2^28 to get a float between 0 and 1. All
the details can be found in ~sg/src/newfsr.c.
Other Sources of Random Numbers
-------------------------------
We briefly looked at Gnu Scientific Library documentation for their
random number generators. The URL is
http://www.gnu.org/software/gsl/manual/html_node/Random-Number-Generation.html
We also used Google to find the web pages
for SPRNG a set of Scalable Parallel Pseudo Random Number Generators:
http://sprng.cs.fsu.edu/
These routines can be used in parallel programs that require a different
(independent) stream of random numbers on each node.
I have a copy of the code in ~sg/src/sprng4.tar. The code is unpacked
in ~sg/src/sprng4/ . In the EXAMPLES subdirectory, I modified
sprng.cpp to create a large number of random numbers. The output of
a test is in the file apr3. I suggest learning more about sprng,
especially if you need random numbers on a parallel computer where
it is important to have independent random number streams.