February 18, 2016 --- Class 12 --- Biased Estimators,
Bias Reduction and the Jackknife Method
Activities:
Biased Estimators
We continued our discussion of Mathematica from Section H of
~sg/mathematica_notes_sg_extended.pdf .
Several Mathematica notebooks from previous years
are available for your review:
~sg/Documents/feb16_v1.nb , ~sg/Documents/feb16_v2.nb , and
~sg/Documents/October_10_2006.nb .
We plotted the values obtained for different sample sizes vs.
1/(sample size) and found that the points approach 2 linearly
as 1/(sample size) approaches zero.
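As a rough illustration of the plot described above, the following
Mathematica sketch averages a statistic over many samples for several
sample sizes and plots the result against 1/n. It assumes the statistic
is 1/(sample mean) of uniform random numbers on [0, 1], which matches
the jackknife code later in these notes. The name meanStat, the list of
sample sizes, and the number of trials are hypothetical choices, not
taken from the class notebooks.

```mathematica
(* Hypothetical helper: average of 1/(sample mean) over many samples of size n. *)
(* RandomReal[1, {trials, n}] is a trials x n array of uniform numbers on [0, 1]. *)
meanStat[n_Integer, trials_Integer] :=
  Mean[1/(Mean /@ RandomReal[1, {trials, n}])]

sizes = {5, 10, 20, 50, 100};          (* assumed sample sizes *)
points = Table[{1/n, meanStat[n, 20000]}, {n, sizes}];

(* Plot the averaged statistic against 1/n *)
ListPlot[points, AxesLabel -> {"1/n", "average statistic"},
  PlotRange -> All]
```

If the error is close to linear in 1/n, the points should fall near a
straight line whose intercept at 1/n = 0 is the unbiased value.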
Bias reduction
A plot of the result vs. 1/n shows that the error is linear in
1/n, where n is the sample size:
S_n = A + e/n
where A is the limit of S_n as n -> infinity and e determines the
size of the error. Writing this relation for sample sizes n and
n-1 gives two equations, from which e can be eliminated to solve
for A:
A = n S_n - (n-1) S_{n-1}
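For completeness, the algebra behind this formula can be written out
as follows. This is a short derivation that assumes the error is
exactly linear in 1/n, with the same A and e at both sample sizes:

```latex
\begin{align*}
n\,S_n &= nA + e, \\
(n-1)\,S_{n-1} &= (n-1)A + e, \\
n\,S_n - (n-1)\,S_{n-1} &= A .
\end{align*}
```

Subtracting the second line from the first cancels e, which is why the
combination removes the 1/n term of the bias.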
The jackknife method allows us to resample our n values to also
get samples of size (n-1). If we have a sample of size n, we use
it to compute the statistic S_n. Instead of getting a new sample
with n-1 new variables, we throw one of the values out of our
sample of size n to create a sample of size n-1 and use that to
compute S_{n-1}. However, we can throw any of the values out, so
we might as well take the average over throwing out each of the n
values. Let
S'^j_{n-1}
represent the statistic on the sample of size n-1 from which the jth
element has been eliminated. Our estimate for S_{n-1} is
(1/n) \sum_{j=1}^n S'^j_{n-1}
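Putting this average into the formula for A gives the jackknife
estimate
A = n S_n - ((n-1)/n) \sum_{j=1}^n S'^j_{n-1}
The following Mathematica sketch applies that formula to an arbitrary
statistic on a list of data. The function name jackknifeEstimate and
its arguments are hypothetical and are not the code used in class; it
recomputes the statistic from scratch for each deleted element, which
is simple but not the most efficient approach.

```mathematica
(* Hypothetical helper, not the code used in class. *)
(* stat: a function applied to a list; data: the sample as a list. *)
jackknifeEstimate[stat_, data_List] :=
  Module[{n = Length[data], full, leaveOneOut},
    full = stat[data];                                  (* S_n *)
    leaveOneOut = Table[stat[Delete[data, j]], {j, n}]; (* S'^j_{n-1}, j = 1..n *)
    n*full - (n - 1)*Mean[leaveOneOut]                  (* n S_n - ((n-1)/n) Sum_j S'^j_{n-1} *)
  ]

(* Example: the statistic 1/(sample mean) on 100 uniform random numbers *)
sample = RandomReal[1, 100];
{1/Mean[sample], jackknifeEstimate[1/Mean[#] &, sample]}
```

The last line returns the biased statistic and its jackknife-corrected
value side by side for one sample.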
Biased Estimators and the Jackknife method
We have a choice of which tool to use for applying the jackknife
method. I will start by using Mathematica. Later we will turn to
a compiled language.
Jackknife Method Via Mathematica
Mathematica can be used to generate samples of random numbers and
to apply the jackknife technique. This is a good opportunity to
introduce a new Mathematica construct, the "Module", which is like
a subroutine. The code to implement this method is in
~sg/jackknife/jackn.math .
First we need a way to create our set of random variables. This is
done via the newvars command:
newvars[num_] := Do[z[i] = Random[], {i, num}]
Once this is defined, you may create as many new random numbers
as you like in the vector z by calling newvars with the desired
number as the argument.
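A short usage sketch, repeating the definition so that it stands on
its own. Note that z here is a set of indexed values z[1], z[2], ...
rather than a Mathematica list. The variable name zlist and the
alternative using RandomReal are my own additions, assuming
Mathematica 6 or later, where RandomReal is the documented
replacement for the older Random[].

```mathematica
newvars[num_] := Do[z[i] = Random[], {i, num}]  (* definition from the notes *)

newvars[5];               (* fills z[1] ... z[5] with uniform random numbers *)
Table[z[i], {i, 5}]       (* list the five values just generated *)

(* Alternative (assumption: Mathematica 6 or later): a true list of random reals *)
zlist = RandomReal[1, 5]
```

Calling newvars again overwrites the stored values, so each call
provides a fresh sample.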
A Module in Mathematica takes two arguments. The first is a list
of variables local to the module, and the second is an expression,
typically a sequence of statements separated by semicolons. The
value of the last statement is the value returned by the Module.
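To make the two arguments concrete, here is a small sketch of a Module
that computes the mean of values stored as x[1], x[2], .... The names
samplemean, total, and y are hypothetical and chosen only for this
illustration.

```mathematica
(* Hypothetical example: mean of the first num values stored in x[1], x[2], ... *)
samplemean[x_, num_Integer] :=
  Module[{total, i},                 (* first argument: local variables *)
    total = Sum[x[i], {i, 1, num}];  (* second argument: statements separated by ; *)
    total/num                        (* the value of the last statement is returned *)
  ]

total = 42;                    (* a global variable with the same name *)
y[1] = 1; y[2] = 2; y[3] = 6;
samplemean[y, 3]               (* gives 3 *)
total                          (* still 42: the Module's total is local *)
```

The global total is untouched because the Module works with its own
local copy of that name.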
Using a Module, we define jackknife[x_, num_Integer], where x holds
the random numbers (accessed as x[i]) and num is the sample size.
Here is the code:
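jackknife[x_, num_Integer] :=
  Module[{mysum, mysumpr, stat, statpr, estimate, fctr, i},
    fctr = N[(num - 1)/num];          (* weight (n-1)/n on each leave-one-out term *)
    mysum = Sum[x[i], {i, 1, num}];   (* sum of all n values *)
    stat = num/mysum;                 (* S_n = 1/(sample mean) *)
    estimate = num*stat;              (* start from n S_n *)
    Do[mysumpr = mysum - x[i];        (* sum with the ith value removed *)
       statpr = (num - 1)/mysumpr;    (* S'^i_{n-1} *)
       estimate -= fctr*statpr, {i, 1, num}];
    {stat, estimate}]                 (* biased statistic, jackknife estimate *)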
Successive calls to newvars and jackknife are equivalent to what
was done in the C code in the last class. I made a table from many
calls of newvars and jackknife, from which a scatterplot was
produced using ListPlot.
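A minimal sketch of how such a table and scatterplot might be built is
given below. The definitions are repeated so that the block stands on
its own. The names nsamp, ntrials, and results, and the values 10 and
200, are assumptions made for illustration and are not taken from the
class notebook.

```mathematica
(* Definitions repeated from the notes *)
newvars[num_] := Do[z[i] = Random[], {i, num}]
jackknife[x_, num_Integer] :=
  Module[{mysum, mysumpr, stat, statpr, estimate, fctr, i},
    fctr = N[(num - 1)/num]; mysum = Sum[x[i], {i, 1, num}];
    stat = num/mysum; estimate = num*stat;
    Do[mysumpr = mysum - x[i]; statpr = (num - 1)/mysumpr;
      estimate -= fctr*statpr, {i, 1, num}]; {stat, estimate}]

nsamp = 10;      (* sample size; an arbitrary choice *)
ntrials = 200;   (* number of independent samples; also arbitrary *)

(* Each entry is {stat, estimate} for a fresh sample of size nsamp *)
results = Table[newvars[nsamp]; jackknife[z, nsamp], {ntrials}];

(* Scatterplot of jackknife estimate vs. biased statistic *)
ListPlot[results, AxesLabel -> {"S_n", "jackknife estimate"}]

Mean[results]    (* averages of the biased statistic and of the jackknife estimate *)
```

Comparing the two averages returned by Mean[results] gives a quick
check of how much of the bias the jackknife removes at this sample
size.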
~sg/Documents/Feb18_2016.nb is the Mathematica notebook that I
used in class.
You may also consult the Mathematica notebook ~sg/feb27_jackknife.nb to
follow some of the calculations done in a previous semester.