March 29, 2012 --- Class 22---Inverse Transform Method, Box-Muller Method
Activities:
Inverse Transform Method
Sometimes we are interested in generating random numbers with a non
uniform probabilty density. There are a number of ways of doing
this and we shall study a few. This is covered in CSM section
11.5, starting on page 358. The first method we will study is
called inverse transform. You may be used to the idea of the
probability density function. For a real variable, we have
$\int_{-\Infinity}^{+\Infinity} p(x) dx = 1 $.
We can define the monotonically increasing function called the
cumulative probability distribution by
$ P(x) = \int_{-\Infinity}^{x} p(x') dx' $.
Of course, $ d P(x)/dx = p(x)$. P(x) is a number between 0 and 1.
If we can solve the equation $ P(x)=r$ for $x(r)$, then if we
generate uniform random numbers r, then the numbers x(r) have the
distribution we want. So, the key is we start with p(x), and
the method only is useful if we can find P(x) and then solve
P(x) = r
for x as a function of r.
A very simple example is a constant probablility for x to be between
two numbers a and b. In this case, we easily showed that
P(x) = (x-a)/(b-a) for ab. Solving P(x)=r, we find
x = a + (b-a)r
This simple relation is exactly what one would intitutively select for
generating numbers uniformly distributed between a and b.
A useful and more interesting example of the inverse tranform method
involves an exponential distribution. Here
p(x)= exp(-x/lambda)/lambda.
We were able to show that
P(x) = 1 - exp(-x/lambda)
Setting P(x)=r and solving for r, we find
x = - lambda * ln(1-r)
Since 1-r and r are both uniformly distributed between 0 and 1,
we can simplify the above and replace ln(1-r) by ln(r).
Thus, to generate numbers with an exponential distribution, we
first generate a uniform random variable between 0 and 1. Then
x = -lambda * ln(r)
has an exponential distribution. We used Mathematica to generate
such numbers.
v = Table[ -Log[ Random[] ], {1000}] ;
We used ListPlot to look at the numbers.
We can also make use of the Mathematica Histogram function:
Needs["Graphics`Graphics"]
Histogram[v]
I find it easy to use my histogram package and axis to make a
histogram as a semilog plot. This is useful because an exponential
distribution turns into a straight line on a semilog plot. First we
need to get the data out of Mathematica with the Export function.
Note the use of double quotes in the first and third arguments.
Export["exponential.dat", v, "List"]
This will put the vector v in a file called exponential.dat.
Then use hist to plot the distribution. (If you want to use a
logarithmic y axis, make sure to use -g 2 option for hist and don't
let axis try to plot down to 0.)
We can use hist to make a histogram of the numbers. If we use the
-g 2 option to hist, and the logarithmic option to axis, we find
a plot that looks rather linear, as we would expect with an
exponential distribution.
hist -n 20 -g 2 < exponential2.dat |axis y l 1 500 |xplot
Another way to get output from Mathematica is with the Write function.
Do[ Write[ "exponential2.dat", v[[i]] ], {i,Length[v]} ]
will do the trick. The first argument of the Write function is
the name of the file. It must be included in double quotes.
You may have additional arguments to specify what to write. Here
we only want a single element of the vector v. Note the use of
the function Length to determine the length of the vector v. We
thus don't need to explicitly remember the length of v in the
iterator expression.
Box-Muller Method
The Box-Muller method is a clever way to generate random
numbers with a Gaussian Distribution. It is described on page
360 of CSM. For homework you will get to try out this method and
verify that the distribution is Gaussian. You can also find
this method described in Numerical Recipes. We derived the Box-Muller
method.
Making Gaussian Random Numbers in Mathematica
Here is an implementation of the Box-Muller method with sigma=1:
gaus[] := Module[ {theta,rho}, theta=2.*Pi*Random[];
rho = -Log[Random[]];
N[{Sqrt[2 rho]Cos[theta],Sqrt[2 rho]Sin[theta]}] ]
gaus is now a function with no argument. With the above definition,
you can just type
gaus[]
and you will get a list with two Gaussian random numbers in it.
Of course, we want a lot to test the distribution, so try
v = Table[ gaus[], {10000} ];
Don't forget the semicolon at the end of the command! Otherwise
you are going to be looking at a lot of output.
Check some values:
v[[4]]
Now export v to a file:
Export["myfile", v, "Table"]
This will result in a pair of numbers on each line of the file myfile.
axis m 0 myfile.2
To take a file with one number per line and then print pairs of numbers
is also pretty easy.
awk '{ if( NR % 2 == 1) x=$i; else print x,$1}' myfile.2 >myfile.3
In Mathematica we could instead have used
Do[Write["myfile", v[[i,1]], " ", v[[i,2]]], {i, Length[v]}]
to print two words per line. Note that we really do need the
string of blanks in double quotes or the two numbers will have
no space between them. How annoying! It is good to learn the options
of the Export function.
The file that has two words per line will work for either the
scatter plot or histogram. However, it is useful to know these simple
awk commands.
We prepared one graph with the x and y
numbers from the Box-Muller method histogrammed separately. here are
appropriate commands:
#!/bin/csh
(awk '{print $1}' myfile | hist -n 40 -x -4 4 -g 2 -m 1;\
awk '{print $2}' myfile | hist -n 40 -x -4 4 -g 2 -m 2) | axis |
xplot
Autocorrelation
We briefly introduced the autocorrelation at the end of the class.