March 18, 2013 --- Class 19 --- Simpson's Rule Rederived, Hit or Miss Method,
Sample Mean Method
We indicated how Simpson's rule can be derived simply by removing
the leading 1/N^2 error from the trapezoid rule evaluated with N
and 2N points. This technique can be extended into the method known
as Romberg integration. Both topics are expanded upon in Numerical
Recipes, where practical codes may be found.
Simpson's Rule Rederived
The leading error in the trapezoid rule is of order 1/N^2. We can
use an idea similar to the one we used to remove sample-size bias,
but taking into account that the dependence of the error on N is
different. Starting from the trapezoid rule, we used the formulas
below to remove the leading 1/N^2 error.
S_N  = A + E/N^2
S_2N = A + E/(2N)^2
so A = (4 S_2N - S_N)/3 = 4/3 S_2N - 1/3 S_N
Applying this to a one-dimensional integral, one easily finds that
the weighted sum of the two trapezoid estimates on the right-hand
side coincides exactly with the Simpson's rule formula. We previously
found that Simpson's Rule has a leading error of order 1/N^4. Since
the approach above removes the leading 1/N^2 error, and since the
error of the trapezoid rule contains only even powers of 1/N
(asserted, but not proved), this is not surprising.
Testing Trapezoid, Simpson and Romberg Methods
In ~sg/integration_test/double_precision, you will find source code
and output files. The source code has been modified so we can see
how the methods converge. We went over the source code in detail.
In particular, we discussed how static variables are used in the
evaluation of the trapezoid rule to reduce the bin width and
evaluate the function only at the new points. I also showed how to
compile the code and what to do when the compiler complained about
missing functions.
The integral of cos(x) from x=0 to x=pi/2 is calculated by xqsimp2.
Here is the output:
1 7.853981633974e-01 3.333333333333e+29
2 9.480594489685e-01 1.002279877492e+00
3 9.871158009728e-01 1.000134584974e+00
4 9.967851718862e-01 1.000008295524e+00
5 9.991966804851e-01 1.000000516685e+00
6 9.997991943200e-01 1.000000032265e+00
7 9.999498000921e-01 1.000000002016e+00
8 9.999874501175e-01 1.000000000126e+00
9 9.999968625353e-01 1.000000000008e+00
10 9.999992156342e-01 1.000000000000e+00
The first column is the level at which we call the routine, i.e., with
2^(j-1) intervals in the integration range. The second column is the
trapezoid rule and the third column is for Simpson's rule. Note that
the j=1 value for Simpson's rule is bogus since we need to go at least
to j=2 to use Simpson's rule. In cos.ax, you will find the error
plotted vs. j. The crosses are for the trapezoid rule, the diamonds
are for Simpson's rule. You will also find a single point from
the Romberg routine.
We found that from j=1 to j=10 the error in the trapezoid rule
falls by roughly 6 orders of magnitude. This corresponds to
increasing the number of intervals by a factor of 2^9 = 512. Since
the error for the trapezoid rule is supposed to fall like n^{-2},
and 512^2 is about 2.6 x 10^5, this makes sense.
Simpson's rule is seen to converge much more quickly and falls
12 orders of magnitude. This is the expected behavior. With the
Romberg method, the error is about 10^{-12} for j=5. Quite an
impressive improvement!
A Second Example (not done in class)
A second example integrates sqrt(1-x^2) from 0 to 1. Curiously, it
looks very different. This is because of the singularity in the
derivative of the integrand at x=1. Simpson's rule does not converge
with a higher power of j. Plot sqrt_1minus_xsqr.ax or plot both graphs
together. We recalled that the expected error for the trapezoid rule
depends on the average value of the second derivative of the function.
Higher Dimensional Integrals
Monte Carlo methods like hit or miss and the sample mean methods
have errors that fall like n^{-1/2}. This is not very good. To
cut the error in half you need to do 4 times as much work. For
Simpson's rule the error goes like n^{-4}, so if you do twice as
much work, the error falls by a factor of 16. However, for
multidimensional integrals, a method with an error like h^a,
where h is the spacing between grid points, will have an error like
n^{-a/d} because the number of points on the grid goes like h^{-d}.
As an example, if a=4 and d=10, a/d=0.4, so the error falls more
slowly than n^{-1/2}; in general Monte Carlo wins once d > 2a.
So, for high-dimension integrals, Monte Carlo methods may be bad,
but other methods are worse.
Hit-or-Miss and Sample Mean Methods
The Hit-or-Miss method and Sample Mean methods are described in CSM
Sec. 11.2. I introduced the hit-or-miss method first. We will
consider the sample mean method in a future class.
Testing Hit-or-Miss Method
Program ~sg/src/misc/pi3.c implements the hit-or-miss method using a
feedback shift register random number generator. Compile it with
this command:
cc -o pi3 pi3.c newfsr.c
You have to tell the program how many "pebbles you want to throw."
This is the first command line argument. You may also have it
repeat the experiment multiple times. This is the optional second
argument. Look at the code to see how this is done if you are
unfamiliar with this type of feature. Also note how the random
number generator gets its seed through a call to pid(). This is
the "process id" of the calling routine. It is essentially a
random number. If you like, you can modify the code so that you
specify the seed yourself. This can be very useful when you are
trying to debug the code and want to see if results are
reproducible. Sometimes I set up my code so that it asks for a
seed, but if the seed I give is negative, it calls pid() to get a
random seed. That way, I can do many runs without having to think
up new seeds, and when I want to reproduce runs with a known seed I
can also do that.
I ran the pi3 code a few times with this command
pi3 10
The output ranged from 2.4 to 3.6. Clearly the output is a random
variable. I used the second argument to repeat the experiment
multiple times and made histograms of the result. With larger sample
sizes, the histograms were narrower.
We have to answer several questions.
Do we get the right answer on average?
Can we estimate the error?
How does the error depend on n, the number of trials?
We expect the error to fall like 1/sqrt(n), but can we show that?
I asked everyone to run the program to see if the error falls like
1/sqrt(n). I also showed a simple shell script and introduced a
program aver that is very useful for data analysis. We will explore
its use in the next class.