December 9, 2004 --- Class 29 --- Non-Linear Fitting Activities: Nonlinear Fitting We next generalized the discussion to include two types of non-linearities in the model. For the case of a model that is a non-linear function of x, but still linear in its parameters, it is still easy to find the minimum of \chi^2 because \chi^2 is only a quadratic function of the parameters. Numerical Recipes has a function lfit to deal with this case. You have to supply a routine that computes the x dependence of the model function. y_model (x) = \Sum_{k=1}^N =lambda_k f_k(x) Your routine calculates the f_k and the NR routine takes care of varying lambda_k. Of course, you also have to supply the data with (optionally) error bars. Mathematica can also do fits like this if there are no error bars. Here is the description of the Fit function of Mathematica: Fit[data,funs,vars] finds a least-squares fit to a list of data as a linear combination of the functions funs of variables vars. The data can have the form {{x1, y1, ... , f1}, {x2, y2, ... , f2}, ... }, where the number of coordinates x, y, ... is equal to the number of variables in the list vars. The data can also be of the form {f1, f2, ... }, with a single coordinate assumed to take values 1, 2, ... . The argument funs can be any list of functions that depend only on the objects vars. Polyfit Polyfit is a routine that fits data with an nth order polynomial. You are welcome to use it. You will find the code and a makefile in ~sg/fitting. Things get more interesting when the model is non-linear in its parameters. In this case, we have to interatively solve a non-linear minimization problem. NR chapter 10 describes how to do minimization when you only know the value of the function you are trying to minimize. However, in our case, we know the model function explicitly, so we can find \chi^2 and its first and second derivatives, if we know the first and second derivatives of the model function wrt its parameters. The Levenberg-Marquardt method is based on this notion and described in Numerical Recipes. You have to supply a routine the computes the model function and its first derivatives. Much of what was derived in class can be found in Numerical Recipes. Even more interesting is to fit the data to a finite size scaling form. In my directory ~sg/fitting you will find a nonlinear fitting routine. finitesize pc_vs_size.ax will start the fitting if finitesize is in your path and you are in the directory that contains pc_vs_size.ax. If not, use full pathnames. Here is an example of the output file: mohawk% mo pc_vs_size.ax 4 5.134219e-01 4.651702e-03 8 5.425469e-01 2.965517e-03 16 5.637734e-01 1.863987e-03 32 5.775273e-01 1.165397e-03 50 5.812461e-01 7.796417e-04 64 5.816485e-01 6.925400e-04 128 5.864141e-01 4.070379e-04 The code tells you about the model and prompts for input: Finite size scaling fit. model: y=b[0]*(1.-b[1]*x**b[2]) Enter epsilon, maximum iterations You need to type the numbers below, or similar values: 0.001 100 Enter 1 for verbose, 0 for terse 0 Enter range of x to be fitted 0 256 Enter Number of parameters 3 Enter number of parameters held fixed 0 0 of 3 parameters are held fixed Initial values: Enter initial value for parameter 0. To save space I will put in all the initial guesses for the parameters on one line. Then the fitting proceeds: 0.5 0.5 0.5 x[0] = 5.000000e-01 Enter initial value for parameter 1 x[1] = 5.000000e-01 Enter initial value for parameter 2 x[2] = 5.000000e-01 read in 7 data points chi-square with 4 degrees of freedom is 5.280679e+00 confidence of fit is 2.596911e-01 final parameters: x[0] = 5.914963e-01 +- 1.222739e-03 x[1] = 4.089219e-01 +- 4.563046e-02 x[2] = -7.937788e-01 +- 5.917352e-02 The confidence level of the fit is 25.9%. That is not too bad. The parameter x[0] is p_c for inifinite volume: x[0] = 5.914963e-01 +- 1.222739e-03 The accepted value for this result is 0.59275, which agrees quite nicely with what we have!! Discussion of a Homework Problem We discussed a recent homework problem. The problem is to see how the error in Monte Carlo integration depends on the sample size. In the book, it is suggested that one can get a rough estimate of this by using a single trial for each sample size. However, one can do a much better job by doing many samples at each sample size. That is why my programs such as pi1 are written to take an optional argument the number of batches or trials. In ~sg/monte/pi1_fsr_trial.csh you will find the commands to run off 1000 trials at each of a dozen different sample sizes from 10 to 100000. This script was run before class because some of the larger sample sizes take several minutes to run. The shell script actually runs hit or miss code with the feedback shift register random number generator. This method is discussed in CSM on page 402. There is a typo in the last full paragraph on the page In the next to last sentence of that paragraph m raised to the power n should be replaced by m ^ n. In C, the caret symbol (^) is used for the bitwise exclusive or operation. In the typesetting program TeX, the caret represents raising to a power and the authors forgot that they were using TeX when they composed that sentence. The output of pi1_fsr consists of one line for each trial. The line contains the sample size and the estimate of the integral from that sample. However, we want to know the error. There is a file ~sg/monte/error.a with this command: { err=3.1415926-$2; if(err<0) err= (-err); print $1,err} It gives us the absolute value of the error. We can now use aver to give us the average value of the absolute value of the error. For example: set trials = 1000 awk -f error.a pi1_fsr_trial_$trials |aver Note, that if we did not use the absolute value of the error, and just tried to average the error, it should average to zero, so we don't want to do that. The root mean square error is also a good estimate of the average error. If you use the command: aver pi1_fsr_trial_$trials you will be averaging the value of the integral, not the error. So, the average value that comes out of aver should be close to pi, but the error reported by aver is the error in the mean value. This is the standard deviation of the distribution of values we got for the integral divided by sqrt(number of trials), so if we multiple the third column of output from aver by sqrt(1000), we will have our value for the error in the integral (but with no error bar). aver pi1_fsr_trial_$trials |awk '{print $1,$3*sqrt(1000.)}' Alternatively, we discussed an awk script rms_error.a that calculations the error directly. Here is a simplified version: BEGIN{sumsq=0.; count=0; n=0} NR==1 {n=$1} $1 != n { print n, sqrt(sumsq/count); count =0; sumsq=0.; n=$1} { err=$2-3.1415926; sumsq += err*err; ++count} END{print n, sqrt(sumsq/count)} The shell script creates an axis file with both errors. The rms error has no error bar. It is consistently larger than the average of absolute value of error by a constant factor (easily seen in the log-log plot). It is trivial to change pi1 to pi2 in the script and produce a similar graph for the sample mean method. Then one can see that for the same sample size the sample mean method has smaller errors than the hit or miss method. Please note that we can also demonstrate that ran1 is not a good random number generator. I have an axis file hit_or_miss.ax that uses different plotting symbols for the two programs pi1 and pi1_fsr. On a log-log plot pi1_fsr was perfectly straight and when the sample size grew by four orders of magnitude, the error by two orders of magnitude. This is exactly what is expected for inverse square root dependence on sample size. However, for pi1, the curve is not a straight line and the error falls faster. This is beacause of the poor quality of the random numbers produced by ran1. The different trials from ran1 are not independent and therefore the error is underestimated. This can easily be seen by the following command: tail -1000 pi1_trial_1000 |awk '{print $2}' |axis a |xplot This shows a time history for the largest sample size. Notice how this looks periodic with a period of about 440. (Note the matching extreme values.) Note how different the time history is with pi1_fsr_trial_1000.