April 6, 2017 --- Class 24 --- Feedback Shift Register Random Number Generator,
SPRNG, Fitting Data
Activities:
Understanding the Feedback Shift Register Random Number Generator
-----------------------------------------------------------------
We examined how the feedback shift register (FSR) code makes a new
random 28 bit integer every time it is called. The reason this random
number generator has a period longer than 2^28 is because the period
is determined by the entire state of the feedback shift register
which has about 140 bits.
You may use wikipedia to find a description of IEEE floating point
format. 23 bits are used to store the fractional value of a number.
Feedback shift register random generators are based on bitwise exclusive
or operations. The recursion relation is
b_n = b_{n-p} ^ b_{n-q}
So the new bit depends on bits earlier in the sequence by p and q.
There are certain values of p and q that result in good randomness.
You can find good pairs of p and q in the book "Seminumerical
Algorithms" by Donald Knuth in Table 1. Our pair (p,q) = (97,127).
Within the code, 28 bits of each 32 bit word are used to store the
state of the FSR. rshift[1] to rshift[5] hold 140 bits of state.
The next 28 bits, numbered 141-168 are created in rshift[0].
Using b_n = b_{n-97} ^ b_{n-127}, we see that with n=141 we need
28 bit words starting with bit 44 (=141-97) and bit 14 (= 141-127).
Here are how the bits are stored in the rightmost 28 bits of rshift[i].
rshift[5] = 1-28
rshift[4] = 29-56
rshift[3] = 57-84
rshift[2] = 85-112
rshift[1] = 113-140
From the code, t2= (rshift[5]<<13))|(rshift[4]>>15)
contains the bits 14-41 that we need. In the C language x<> denotes a shift to the right.
We can get bits 44-56 from t1 = (rshift[4]<<15)|(rshift[3]>>13).
We then take the exclusive or of t1 and t2 and mask of the rightmost
28 bits. Finally, the shift register is moved to higher values,
dropping off the old value of rshift[5], and the new 28 bit integer
we created is divided by 2^28 to get a float between 0 and 1. All
the details can be found in ~sg/src/newfsr.c.
I have written a code based on (p,q) = (250,103). It is called newfsr250.c and
in my src/misc directory. Other values of (p,q) can be found in
CSM page 237 or Knuth's book on Seminumerial Algorithms.
There are some handwritten notes done by Alexei Bazavov that are linked
to the class web page.
Other Sources of Random Numbers
-------------------------------
We briefly looked at Gnu Scientific Library documentation for their
random number generators. The URL is
http://www.gnu.org/software/gsl/manual/html_node/Random-Number-Generation.html
You will find the web pages for SPRNG a set of Scalable Parallel
Pseudo Random Number Generators at:
http://www.sprng.org
These routines can be used in parallel programs that require a different
(independent) stream of random numbers on each node.
I have an old copy of the code in ~sg/src/sprng4.tar. The code is unpacked
in ~sg/src/sprng4/ . In the EXAMPLES subdirectory, I modified
sprng.cpp to create a large number of random numbers. The output of
a test is in the file apr3. I suggest learning more about sprng,
especially if you need random numbers on a parallel computer where
it is important to have independent random number streams.
SPRNG 5.0 is the current version of code. I have been able to
compile it under linux, but had problems trying to compile in the Mac
cluster.
The latest edition of Numerical Recipes has a chapter on random number
generation that is updated from previous editions. It would be a
good place to learn more about the topic.
If you are going to do serious work with Monte Carlo methods, you
should have a look at ``Monte Carlo Simulations: Hidden Errors from
`Good' Random Number Generators'' by A.M. Ferrenberg, D.P. Landau and
Y.J. Wong., Phys. Rev. Lett. 69, 3382, 1992. There is a link on
the class web page.
Fitting Data
------------
We looked at what one should do to fit data to a model function.
If the data has a Gaussian distribution about its true value, then
we maximize the probability for the set of observations by
adjusting the parameters of the model to minimize \chi^2.
\chi^2= \Sum_{i=0}^N} (y_i- y_model(x_i)^2/ \sigma_i^2
Linear Model Function
---------------------
The case of a linear model was considered in great detail. We
obtained specific formulae for the slope and intercept parameters of
the fit. However, even with a formula for the slope and intercept,
there are two important questions:
1) How large is the error in the fitted parameters?
2) How good is the fit?
Using a standard propagation of errors approach, we fill find the error
on the slope. and intercept, and the covariance of those quantities
in our next class.