December 9, 2004 --- Class 29 --- Non-Linear Fitting
Activities:
Nonlinear Fitting
We next generalized the discussion to include two
types of non-linearities in the model. For the case of a model that
is a non-linear function of x, but still linear in its parameters,
it is still easy to find the minimum of \chi^2 because \chi^2 is
only a quadratic function of the parameters. Numerical Recipes has
a function lfit to deal with this case. You have to supply a
routine that computes the x dependence of the model function.
y_model(x) = \sum_{k=1}^N \lambda_k f_k(x)
Your routine calculates the f_k and the NR routine takes care of
varying lambda_k. Of course, you also have to supply the data with
(optionally) error bars.
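The idea can be sketched in a few lines of Python (a toy version, not NR's lfit; the names linear_fit and solve are made up for this sketch): you supply the basis functions f_k, and the fit solves the normal equations for the lambda_k.

```python
def linear_fit(x, y, sigma, basis):
    """Least-squares fit y_model(x) = sum_k lambda_k f_k(x).

    basis is a list of functions f_k; sigma gives the error bars.
    Builds the weighted design matrix A[i][k] = f_k(x_i)/sigma_i and
    solves the normal equations A^T A lambda = A^T b for the lambda_k.
    """
    n, m = len(x), len(basis)
    A = [[basis[k](x[i]) / sigma[i] for k in range(m)] for i in range(n)]
    b = [y[i] / sigma[i] for i in range(n)]
    ata = [[sum(A[i][j] * A[i][k] for i in range(n)) for k in range(m)]
           for j in range(m)]
    atb = [sum(A[i][j] * b[i] for i in range(n)) for j in range(m)]
    return solve(ata, atb)

def solve(M, v):
    """Gaussian elimination with partial pivoting (fine for small m)."""
    m = len(v)
    M = [row[:] + [v[i]] for i, row in enumerate(M)]
    for c in range(m):
        p = max(range(c, m), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, m):
            f = M[r][c] / M[c][c]
            for k in range(c, m + 1):
                M[r][k] -= f * M[c][k]
    lam = [0.0] * m
    for r in range(m - 1, -1, -1):
        lam[r] = (M[r][m] - sum(M[r][k] * lam[k]
                                for k in range(r + 1, m))) / M[r][r]
    return lam

# Example: recover a quadratic through exact data.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [2.0 + 0.5 * t - 0.25 * t * t for t in x]
lam = linear_fit(x, y, [1.0] * len(x),
                 [lambda t: 1.0, lambda t: t, lambda t: t * t])
print(lam)   # close to [2.0, 0.5, -0.25]
```

Because chi^2 is quadratic in the lambda_k, this single linear solve lands directly on the minimum; no iteration is needed.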
Mathematica can also do fits like this if there are no error bars.
Here is the description of the Fit function of Mathematica:
Fit[data,funs,vars]
finds a least-squares fit to a list of data as a linear
combination of the functions funs of variables vars. The data can have
the form {{x1, y1, ... , f1}, {x2, y2, ... , f2}, ... }, where the number of
coordinates x, y, ... is equal to the number of variables in the list
vars. The data can also be of the form {f1, f2, ... }, with a single
coordinate assumed to take values 1, 2, ... . The argument funs can be any
list of functions that depend only on the objects vars.
Polyfit
Polyfit is a routine that fits data with an nth order polynomial.
You are welcome to use it. You will find the code and a makefile
in ~sg/fitting.
Things get more interesting when the model is non-linear in its
parameters. In this case, we have to iteratively solve a
non-linear minimization problem. NR chapter 10 describes how to do
minimization when you only know the value of the function you are
trying to minimize. However, in our case, we know the model
function explicitly, so we can find \chi^2 and its first and second
derivatives, if we know the first and second derivatives of the
model function with respect to its parameters. The Levenberg-Marquardt
method is based on this idea and is described in Numerical Recipes. You
have to supply a routine that computes the model function and its first
derivatives.
Much of what was derived in class can be found in Numerical
Recipes.
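The iteration can be made concrete with a minimal Levenberg-Marquardt sketch in Python (a toy version, not NR's mrqmin; the names here are made up). As with the NR routine, you supply the model and its first derivatives with respect to the parameters.

```python
import math

def solve(M, v):
    """Tiny Gaussian elimination with partial pivoting."""
    m = len(v)
    M = [row[:] + [v[i]] for i, row in enumerate(M)]
    for c in range(m):
        p = max(range(c, m), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, m):
            f = M[r][c] / M[c][c]
            for k in range(c, m + 1):
                M[r][k] -= f * M[c][k]
    out = [0.0] * m
    for r in range(m - 1, -1, -1):
        out[r] = (M[r][m] - sum(M[r][k] * out[k]
                                for k in range(r + 1, m))) / M[r][r]
    return out

def levmar(x, y, sigma, model, grad, p, eps=1e-8, itmax=100):
    """Minimal Levenberg-Marquardt iteration to minimize chi^2."""
    def chisq(q):
        return sum(((y[i] - model(x[i], q)) / sigma[i]) ** 2
                   for i in range(len(x)))

    lam, c2 = 1e-3, chisq(p)
    for _ in range(itmax):
        m = len(p)
        # Curvature matrix alpha and gradient vector beta from the
        # first derivatives of the model.
        alpha = [[0.0] * m for _ in range(m)]
        beta = [0.0] * m
        for i in range(len(x)):
            d = grad(x[i], p)
            w = 1.0 / sigma[i] ** 2
            r = (y[i] - model(x[i], p)) * w
            for j in range(m):
                beta[j] += r * d[j]
                for k in range(m):
                    alpha[j][k] += w * d[j] * d[k]
        # Marquardt's trick: inflate the diagonal by (1 + lam).
        for j in range(m):
            alpha[j][j] *= 1.0 + lam
        dp = solve(alpha, beta)
        trial = [p[j] + dp[j] for j in range(m)]
        c2_new = chisq(trial)
        if c2_new < c2:
            converged = c2 - c2_new < eps
            p, c2, lam = trial, c2_new, lam * 0.1
            if converged:
                break
        else:
            lam *= 10.0   # failed step: back off toward gradient descent
    return p

# Example: fit y = b0 * exp(b1 * x) to exact data from a rough guess.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * math.exp(-0.5 * t) for t in xs]
fit = levmar(xs, ys, [1.0] * len(xs),
             lambda t, q: q[0] * math.exp(q[1] * t),
             lambda t, q: [math.exp(q[1] * t),
                           q[0] * t * math.exp(q[1] * t)],
             [1.0, -1.0])
print(fit)   # close to [2.0, -0.5]
```

When a step lowers chi^2, the damping lam shrinks and the method behaves like Newton's method; when a step fails, lam grows and the step turns into a small gradient-descent move.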
Even more interesting is to fit the data to a finite size scaling
form. In my directory ~sg/fitting you will find a nonlinear fitting
routine.
finitesize pc_vs_size.ax
will start the fitting if finitesize is in your path and you are in
the directory that contains pc_vs_size.ax. If not, use full pathnames.
Here is an example of the output file:
mohawk% mo pc_vs_size.ax
4 5.134219e-01 4.651702e-03
8 5.425469e-01 2.965517e-03
16 5.637734e-01 1.863987e-03
32 5.775273e-01 1.165397e-03
50 5.812461e-01 7.796417e-04
64 5.816485e-01 6.925400e-04
128 5.864141e-01 4.070379e-04
The code tells you about the model and prompts for input:
Finite size scaling fit.
model: y=b[0]*(1.-b[1]*x**b[2])
Enter epsilon, maximum iterations
You need to type the numbers below, or similar values:
0.001 100
Enter 1 for verbose, 0 for terse
0
Enter range of x to be fitted
0 256
Enter Number of parameters
3
Enter number of parameters held fixed
0
0 of 3 parameters are held fixed
Initial values:
Enter initial value for parameter 0.
To save space I will put all the initial parameter guesses on one
line. Then the fitting proceeds:
0.5 0.5 0.5
x[0] = 5.000000e-01
Enter initial value for parameter 1
x[1] = 5.000000e-01
Enter initial value for parameter 2
x[2] = 5.000000e-01
read in 7 data points
chi-square with 4 degrees of freedom is 5.280679e+00
confidence of fit is 2.596911e-01
final parameters:
x[0] = 5.914963e-01 +- 1.222739e-03
x[1] = 4.089219e-01 +- 4.563046e-02
x[2] = -7.937788e-01 +- 5.917352e-02
The confidence level of the fit is 25.9%. That is not too bad. The
parameter x[0] is p_c for infinite volume:
x[0] = 5.914963e-01 +- 1.222739e-03
The accepted value is 0.59275, which agrees with our result to
within about one error bar!
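The "confidence of fit" quoted by the code is the probability Q of getting a chi-square this large or larger by chance. For an even number of degrees of freedom, Q has a simple closed form, so the number in the transcript is easy to check (a quick sketch):

```python
import math

def confidence(chisq, dof):
    """Goodness-of-fit Q = P(chi-square >= chisq) for even dof.

    With x = chisq/2, the upper tail of the chi-square distribution
    for dof = 2k degrees of freedom is exp(-x) * sum_{j<k} x^j / j!.
    """
    x = chisq / 2.0
    return math.exp(-x) * sum(x ** j / math.factorial(j)
                              for j in range(dof // 2))

# The fit above: chi-square 5.280679 with 4 degrees of freedom.
print(confidence(5.280679, 4))   # about 0.2597
```

For odd degrees of freedom there is no such elementary form, and one uses the incomplete gamma function (gammq in Numerical Recipes).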
Discussion of a Homework Problem
We discussed a recent homework problem. The problem is to see how the
error in Monte Carlo integration depends on the sample size. In
the book, it is suggested that one can get a rough estimate of this by
using a single trial for each sample size. However, one can do a
much better job by doing many samples at each sample size. That
is why my programs such as pi1 are written to take as an optional
argument the number of batches or trials. In ~sg/monte/pi1_fsr_trial.csh
you will find the commands to run off 1000 trials at each of
a dozen different sample sizes from 10 to 100000. This script
was run before class because some of the larger sample sizes
take several minutes to run.
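A stripped-down version of this experiment is easy to reproduce (a sketch using Python's built-in generator as a stand-in for the feedback shift register code):

```python
import math
import random

def pi_hit_or_miss(n, rng):
    """One hit-or-miss estimate of pi from n random points in the unit square."""
    hits = sum(1 for _ in range(n)
               if rng.random() ** 2 + rng.random() ** 2 < 1.0)
    return 4.0 * hits / n

rng = random.Random(1)   # seeded so the run is reproducible
for n in (10, 100, 1000, 10000):
    trials = [pi_hit_or_miss(n, rng) for _ in range(200)]
    mean_abs_err = sum(abs(t - math.pi) for t in trials) / len(trials)
    print(n, mean_abs_err)   # the error falls roughly as 1/sqrt(n)
```

Averaging the absolute error over many trials at each sample size gives a far smoother curve than the single-trial estimate suggested in the book.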
The shell script actually runs the hit-or-miss code with the feedback
shift register random number generator. This method
is discussed in CSM on page 402. There is a typo in the last
full paragraph on that page: in the next-to-last sentence of that
paragraph, m raised to the power n should be replaced by
m ^ n. In C, the caret symbol (^) is used for the bitwise exclusive
or operation. In the typesetting program TeX, the caret represents
raising to a power and the authors forgot that they were using
TeX when they composed that sentence.
The output of pi1_fsr consists of one line for each
trial. The line contains the sample size and the estimate of the
integral from that sample. However, we want to know the error.
There is a file ~sg/monte/error.a with this command:
{ err=3.1415926-$2; if(err<0) err= (-err); print $1,err}
It gives us the absolute value of the error. We can now use
aver to give us the average value of the absolute value of
the error. For example:
set trials = 1000
awk -f error.a pi1_fsr_trial_$trials |aver
Note that if we did not take the absolute value of the error and
just tried to average the error itself, it would average to zero, so we
don't want to do that. The root mean square error is also a
good estimate of the average error. If you use the command:
aver pi1_fsr_trial_$trials
you will be averaging the value of the integral, not the error.
So, the average value that comes out of aver should be close to
pi, but the error reported by aver is the error in the mean value.
This is the standard deviation of the distribution of values we
got for the integral divided by sqrt(number of trials), so
if we multiply the third column of output from aver by sqrt(1000),
we will have our value for the error in the integral (but with no
error bar).
aver pi1_fsr_trial_$trials |awk '{print $1,$3*sqrt(1000.)}'
Alternatively, we discussed an awk script rms_error.a
that calculates the error directly. Here is a simplified version:
BEGIN{sumsq=0.; count=0; n=0}
NR==1 {n=$1}
$1 != n { print n, sqrt(sumsq/count); count =0; sumsq=0.; n=$1}
{ err=$2-3.1415926; sumsq += err*err; ++count}
END{print n, sqrt(sumsq/count)}
The shell script creates an axis file with both errors. The rms
error has no error bar. It is consistently larger than the average
of the absolute value of the error by a constant factor (easily seen in the
log-log plot).
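That constant factor is no accident: if the fluctuations are roughly Gaussian with zero mean, the rms error exceeds the mean absolute error by exactly sqrt(pi/2), about 1.2533. A quick numerical check (a sketch, not part of the class scripts):

```python
import math
import random

# Draw many zero-mean Gaussian "errors" and compare the two measures.
rng = random.Random(2)
errs = [rng.gauss(0.0, 1.0) for _ in range(200000)]
rms = math.sqrt(sum(e * e for e in errs) / len(errs))
mean_abs = sum(abs(e) for e in errs) / len(errs)
print(rms / mean_abs, math.sqrt(math.pi / 2.0))  # both about 1.2533
```

This follows from E|X| = sigma*sqrt(2/pi) for a Gaussian of standard deviation sigma, while the rms error is sigma itself.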
It is trivial to change pi1 to pi2 in the script and produce a similar
graph for the sample mean method. Then one can see that for the
same sample size the sample mean method has smaller errors than the
hit or miss method.
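A sketch of that comparison, with Python's generator standing in for the pi1 and pi2 programs:

```python
import math
import random

def hit_or_miss(n, rng):
    """Hit-or-miss estimate: fraction of random points inside the quarter circle."""
    hits = sum(1 for _ in range(n)
               if rng.random() ** 2 + rng.random() ** 2 < 1.0)
    return 4.0 * hits / n

def sample_mean(n, rng):
    """Sample-mean estimate of the same integral of 4*sqrt(1-x^2) on [0,1]."""
    return sum(4.0 * math.sqrt(1.0 - rng.random() ** 2)
               for _ in range(n)) / n

def rms_error(estimator, n, trials, rng):
    """Root mean square error over many trials at fixed sample size n."""
    return math.sqrt(sum((estimator(n, rng) - math.pi) ** 2
                         for _ in range(trials)) / trials)

rng = random.Random(3)
print(rms_error(hit_or_miss, 1000, 200, rng))
print(rms_error(sample_mean, 1000, 200, rng))   # noticeably smaller
```

Both errors scale as 1/sqrt(n); the sample-mean method simply has a smaller variance per sample, so its error curve sits below the hit-or-miss curve by a constant factor.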
Please note that we can also demonstrate that ran1 is not a good
random number generator. I have an axis file hit_or_miss.ax
that uses different plotting symbols for
the two programs pi1 and pi1_fsr. On a log-log plot, pi1_fsr
was perfectly straight: when the sample size grew by four orders
of magnitude, the error fell by two orders of magnitude. This is
exactly what is expected for inverse square root dependence on
sample size. However, for pi1, the curve is not a straight line
and the error falls faster. This is because of the poor quality of
the random numbers produced by ran1. The different trials from
ran1 are not independent and therefore the error is underestimated.
This can easily be seen by the following command:
tail -1000 pi1_trial_1000 |awk '{print $2}' |axis a |xplot
This shows a time history for the largest sample size. Notice
how this looks periodic with a period of about 440. (Note the
matching extreme values.) Note how different the time history is
with pi1_fsr_trial_1000.