December 7, 2004 --- Class 28 --- More on Chi-squared Distribution

Activities:

We showed how the chi-squared distribution can be found by integrating over n normalized Gaussian random variables in spherical coordinates. If we integrate over all the variables but the radial one, we find that the radial variable r is equivalent to the random variable chi-squared, with \chi^2 = r^2. In fact, we don't even have to do the angular integral, because it only contributes to the normalization, and we can simply normalize our probability distribution function so that integrating it from 0 to infinity gives one. However, we did derive the angular integral, which is the surface area of an n-dimensional sphere, by completing the integration over the radial variable and using the fact that we started with a normalized integral.

You can also find in ~sg/gausdist/dec2.math a Mathematica expression for the chi-squared distribution:

    chisq[x_, n_] := (x^((n - 2)/2)*Exp[-(x/2)])/(Gamma[n/2]*2^(n/2))

When making histograms of chi-squared or graphs in Mathematica, we find that the peak of the function occurs where x is about n, the number of degrees of freedom. (More precisely, for n > 2 the maximum of the expression above is at x = n - 2, while the mean is n.) We compared a histogram from a numerical experiment with a Mathematica plot of the chisq function above and found good agreement.

Confidence Levels

We considered the fact that the quantity \chi^2 can be used to determine the goodness of fit. Since \chi^2 should be the sum of the squares of N normalized Gaussian variables, it has a known distribution. If our fit gives a small value of \chi^2 (comparable to or less than the number of degrees of freedom), that is good. If we get a large value, maybe our model is not the correct description of the data. If the model has p parameters that we adjust, the fit is said to have N-p degrees of freedom.

A program confidence2, whose source code is in ~sg/src/confidence2_2.c, can be used to determine the confidence level of a fit. The confidence level depends on \chi^2 and the number of degrees of freedom.
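The statement that \chi^2 is a sum of squares of normalized Gaussian variables, with the distribution given by chisq, can be checked with a numerical experiment like the histogram comparison mentioned under Activities. What follows is only a minimal Python sketch, not the program used in class. The function names, the seed, the number of trials, and the bin settings are all assumptions chosen for illustration.

```python
import math
import random

def chisq_pdf(x, n):
    """Chi-squared density for n degrees of freedom (same formula as chisq[x, n])."""
    return x ** ((n - 2) / 2) * math.exp(-x / 2) / (math.gamma(n / 2) * 2 ** (n / 2))

def sample_chisq(n, trials, seed=12345):
    """Draw `trials` values of chi^2, each the sum of squares of n unit Gaussians."""
    rng = random.Random(seed)  # fixed seed (an arbitrary choice) for repeatability
    return [sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n)) for _ in range(trials)]

def histogram(samples, bin_width, n_bins):
    """Return estimated density per bin for bins of equal width starting at 0."""
    counts = [0] * n_bins
    for s in samples:
        k = int(s / bin_width)
        if k < n_bins:  # values beyond the last bin are dropped from the counts
            counts[k] += 1
    # Divide by (total samples * bin width) so heights are comparable to a density.
    return [c / (len(samples) * bin_width) for c in counts]

if __name__ == "__main__":
    n = 5
    samples = sample_chisq(n, 20000)
    heights = histogram(samples, 1.0, 15)
    # Columns: bin center, histogram height, chisq_pdf at the bin center.
    for k, h in enumerate(heights):
        center = k + 0.5
        print(f"{center:5.1f}  {h:.4f}  {chisq_pdf(center, n):.4f}")
```

The two right-hand columns should track each other to within the statistical scatter of the histogram, and the average of the samples should come out close to n.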
The confidence level is the probability that \chi^2 would exceed the value for which we are computing the confidence level. As an example:

    confidence2
    6.3 5
    6.300000e+00 5.000000e+00 2.781123e-01

This represents the screen display for considering \chi^2=6.3 for 5 degrees of freedom. The confidence level is 0.278, or nearly 28%. Thus, even for a correct model, the computed value of \chi^2 would be larger than 6.3 about 28% of the time, so a fit with this confidence level gives no reason to doubt the model. Usually, statisticians like to rule things out at the 90% or 95% confidence level. So, when your confidence level gets below 5-10%, you begin to wonder if you really have the right model for describing your data.

BTW, confidence2 takes its input from the standard input, so hit Control-D or Control-C to terminate it.

Confidence Level From Mathematica

Since we have already considered how to get Mathematica to calculate the normalized chisq distribution, we can just use numerical integration to compute the confidence level:

    chisq[x_, n_] := (x^((n - 2)/2)*Exp[-(x/2)])/(Gamma[n/2]*2^(n/2))

    confidence[chisquared_, n_] := NIntegrate[ chisq[y,n], {y,chisquared,Infinity}]
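The same tail integral can be cross-checked outside Mathematica. The following is a rough Python sketch and not the algorithm used by confidence2 or NIntegrate. It assumes that the infinite upper limit can be replaced by a finite cutoff (the hypothetical width parameter), which is adequate for modest n because the integrand falls off like exp(-x/2), and it uses the composite Simpson rule for the finite integral.

```python
import math

def chisq_pdf(x, n):
    """Chi-squared density for n degrees of freedom (same formula as chisq[x, n])."""
    return x ** ((n - 2) / 2) * math.exp(-x / 2) / (math.gamma(n / 2) * 2 ** (n / 2))

def confidence(chisquared, n, width=200.0, steps=20000):
    """Approximate the integral of chisq_pdf from chisquared to infinity.

    Assumptions: chisquared > 0, `steps` is even, and the tail beyond
    chisquared + width is negligible (true for modest n).
    """
    h = width / steps
    # Simpson's rule: endpoints weight 1, odd points weight 4, even points weight 2.
    total = chisq_pdf(chisquared, n) + chisq_pdf(chisquared + width, n)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * chisq_pdf(chisquared + i * h, n)
    return total * h / 3

if __name__ == "__main__":
    print(confidence(6.3, 5))
```

Calling confidence(6.3, 5) should agree with the confidence2 result quoted above, about 0.278. For n = 2 the density is just exp(-x/2)/2, so the result can also be compared with the closed form exp(-chisquared/2).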