Homework V February 25, 2002. MT - Welcome | Department of Statistics and Data...
8.17.c When σ is known, µ̂ = X̄ is an unbiased estimator for µ. If you can show that its
variance attains the Cramer-Rao lower bound, then no other unbiased estimator can have
lower variance. Calculations for the Fisher information:
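\[
I_n(\mu) = n I(\mu), \qquad
I(\mu) = E\left[\frac{\partial}{\partial\mu} \log f(X_1 \mid \mu)\right]^2,
\]
\[
\frac{\partial}{\partial\mu} \log f(x \mid \mu) = \frac{x - \mu}{\sigma^2},
\]
\[
I(\mu) = \frac{E(X_1 - \mu)^2}{\sigma^4} = \frac{1}{\sigma^2}, \qquad
I_n(\mu) = \frac{n}{\sigma^2}.
\]
Hence the Cramer-Rao lower bound is $\sigma^2/n$, which is just the variance of $\hat\mu = \bar X$.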
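As a rough numerical sanity check of this conclusion, the sketch below estimates Var(X̄) by simulation and compares it with the bound σ²/n. The function names, the seed, and the parameter values are illustrative assumptions, not part of the original solution.

```python
import random
import statistics


def variance_of_sample_mean(mu, sigma, n, replicates, seed=0):
    """Monte Carlo estimate of Var(X-bar) for n IID N(mu, sigma^2) draws."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    means = [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(replicates)
    ]
    return statistics.variance(means)


def cramer_rao_bound(sigma, n):
    """Lower bound 1 / I_n(mu) = sigma^2 / n for unbiased estimators of mu."""
    return sigma ** 2 / n


if __name__ == "__main__":
    est = variance_of_sample_mean(mu=1.0, sigma=2.0, n=10, replicates=20000)
    print("simulated Var(X-bar):", est)
    print("Cramer-Rao bound:    ", cramer_rao_bound(2.0, 10))
```

The two printed numbers should agree up to Monte Carlo error, which shrinks as `replicates` grows.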
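8.40 $X_1, \dots, X_n$ IID from a Poisson(λ) distribution. Define
\[
Y = \sum_{i=1}^{n} 1_{\{X_i = 0\}},
\]
which is the sum of n IID Bernoulli random variables with probability of success $p_0 = P(X = 0)$. Thus $Y \sim \mathrm{Bin}(n, p_0)$. Let $W = Y/n$ for convenience. The first two moments of W are given by
\[
EW = p_0 = e^{-\lambda}, \qquad
\mathrm{Var}(W) = \frac{p_0(1 - p_0)}{n} = \frac{e^{-\lambda}(1 - e^{-\lambda})}{n},
\]
\[
EW^2 = \mathrm{Var}(W) + (EW)^2 = \frac{e^{-\lambda}\,[1 + (n - 1)e^{-\lambda}]}{n}.
\]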
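The motivation for the estimator $\tilde\lambda$ is that, for large n, values of Y/n should be close to their average $E(Y/n) = e^{-\lambda}$. Thus $\tilde\lambda$ solves $Y/n = e^{-\tilde\lambda}$, that is, $\tilde\lambda = -\log(Y/n)$. Let $g(t) = -\log t$, so that $g'(t) = -1/t$ and $g''(t) = 1/t^2$, and write $\mu_W = EW$ and $\sigma_W^2 = \mathrm{Var}(W)$. The first order approximation
\[
\tilde\lambda = g(W) \approx g(\mu_W) + (W - \mu_W)\, g'(\mu_W) = \lambda - (W - e^{-\lambda})\, e^{\lambda}
\]
does not give any information about the bias $E\tilde\lambda - \lambda$, as $EW - e^{-\lambda} = 0$, but it does allow you to get an approximate variance term
\[
\mathrm{Var}(\tilde\lambda) \approx \sigma_W^2\, [g'(\mu_W)]^2 = \frac{e^{-\lambda}(1 - e^{-\lambda})}{n}\, e^{2\lambda} = \frac{e^{\lambda} - 1}{n}.
\]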
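For the bias term, you need to use the second order approximation
\[
\tilde\lambda = g(W) \approx g(\mu_W) + (W - \mu_W)\, g'(\mu_W) + \frac{1}{2}(W - \mu_W)^2\, g''(\mu_W).
\]
Obtain
\[
\mathrm{Bias}(\tilde\lambda) \approx \frac{1}{2n}(e^{\lambda} - 1), \qquad
\mathrm{Var}(\tilde\lambda) \approx \frac{1}{n}(e^{\lambda} - 1),
\]
\[
\mathrm{MSE}(\tilde\lambda) \approx \frac{1}{n}(e^{\lambda} - 1)\left[\frac{1}{4n}(e^{\lambda} - 1) + 1\right].
\]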
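The MLE for this problem is easily checked to correspond to $\hat\lambda_{MLE} = \bar X$, since, up to a term not dependent on λ, the (1/n)-loglikelihood is $-\lambda + \bar x \log\lambda$. Of course, $\bar X$ is unbiased and has variance $\lambda/n$. Hence the efficiency of $\tilde\lambda$ relative to $\hat\lambda_{MLE}$ is
\[
\mathrm{eff}(\tilde\lambda, \hat\lambda_{MLE}) = \frac{\lambda}{(e^{\lambda} - 1)\left[\frac{1}{4n}(e^{\lambda} - 1) + 1\right]}.
\]
For large n this behaves like $\lambda/(e^{\lambda} - 1)$. As $\lambda \to 0$ ($p_0 \to 1$), $\lambda/(e^{\lambda} - 1) \to 1$. As λ is increased to $+\infty$ ($p_0 \to 0$), $\lambda/(e^{\lambda} - 1) \to 0$.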
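To get a feel for how quickly the efficiency drops as λ grows, the approximations above can be evaluated numerically. The following is a minimal sketch; the function names are hypothetical, and the quantities are the delta-method approximations from this solution, not exact moments of the estimator.

```python
import math


def delta_method_summary(lam, n):
    """Approximate bias, variance and MSE of lambda-tilde = -log(Y/n)."""
    bias = (math.exp(lam) - 1.0) / (2.0 * n)
    var = (math.exp(lam) - 1.0) / n
    mse = var + bias ** 2
    return bias, var, mse


def efficiency_vs_mle(lam, n):
    """Var(X-bar) = lam / n divided by the approximate MSE of lambda-tilde."""
    _, _, mse = delta_method_summary(lam, n)
    return (lam / n) / mse


def limiting_efficiency(lam):
    """Large-n limit lam / (e^lam - 1); expm1 keeps small lam accurate."""
    return lam / math.expm1(lam)


if __name__ == "__main__":
    for lam in (0.1, 0.5, 1.0, 2.0, 4.0):
        print(lam, efficiency_vs_mle(lam, n=50), limiting_efficiency(lam))
```

For any fixed λ the finite-n value sits slightly below the limiting value, because the squared bias adds to the MSE of the estimator based on the zero count.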
8.41 Preliminaries: The method-of-moments estimator $\hat\alpha_{MME} = 3\bar X$ is unbiased for α and has variance
\[
\mathrm{Var}(\hat\alpha_{MME}) = \frac{3 - \alpha^2}{n}.
\]
The maximum likelihood estimator cannot be given in closed form (see example D, §8.4). However you can calculate the asymptotic variance through the Fisher information
\[
I_n(\alpha) = n I(\alpha) = n E\left[\frac{X_1^2}{(1 + \alpha X_1)^2}\right].
\]
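If α = 0, $I_n(\alpha) = n/3$. For α ≠ 0, the second moment is computed by changing variable y = 1 + αx and evaluating the integral, where $f(x \mid \alpha) = (1 + \alpha x)/2$ on [−1, 1]:
\[
\int_{-1}^{1} \frac{x^2}{(1 + \alpha x)^2}\, f(x \mid \alpha)\, dx
= \frac{1}{2} \int_{1-\alpha}^{1+\alpha} \frac{(y - 1)^2/\alpha^2}{y}\, \frac{dy}{\alpha}
= -\frac{1}{\alpha^2} + \frac{1}{2\alpha^3} \log\left(\frac{1 + \alpha}{1 - \alpha}\right).
\]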
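Therefore, whenever α ≠ 0,
\[
\mathrm{Var}(\hat\alpha_{MLE}) \approx \frac{1}{I_n(\alpha)}
= \frac{1}{-\dfrac{n}{\alpha^2} + \dfrac{n}{2\alpha^3} \log\left(\dfrac{1 + \alpha}{1 - \alpha}\right)}.
\]
Now, recording only $Y_i = 1_{\{X_i > 0\}}$, which are Bernoulli random variables with probability of success $p = \frac{1}{2}(1 + \alpha/2)$, $Y = \sum_{i=1}^{n} Y_i \sim \mathrm{Bin}(n, p)$ has $EY = np = \frac{n}{2}(1 + \alpha/2)$. Hence an unbiased estimator for α can be obtained by setting $Y = \frac{n}{2}(1 + \hat\alpha/2)$. Thus
\[
\hat\alpha = 4(Y/n) - 2, \qquad
\mathrm{Var}(\hat\alpha) = \frac{4 - \alpha^2}{n}.
\]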
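Relative to the MME, the efficiency of $\hat\alpha$ is
\[
\mathrm{eff}(\hat\alpha, \hat\alpha_{MME}) = \frac{(3 - \alpha^2)/n}{(4 - \alpha^2)/n} = \frac{3 - \alpha^2}{4 - \alpha^2},
\]
while, relative to the MLE, the efficiency of $\hat\alpha$ is
\[
\mathrm{eff}(\hat\alpha, \hat\alpha_{MLE}) = \frac{\left\{-\dfrac{1}{\alpha^2} + \dfrac{1}{2\alpha^3} \log\left(\dfrac{1 + \alpha}{1 - \alpha}\right)\right\}^{-1}}{4 - \alpha^2}
\]
for α ≠ 0. When α = 0, $\mathrm{eff}(\hat\alpha, \hat\alpha_{MLE}) = 3/(4 - \alpha^2) = 3/4$. The following table compares the efficiency of $\hat\alpha$ relative to the MME with its efficiency relative to the MLE for different values of α.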
α 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9
eff(α̂, α̂MME) 0.750 0.749 0.747 0.744 0.740 0.733 0.725 0.715 0.702 0.687
eff(α̂, α̂MLE) 0.750 0.747 0.739 0.725 0.705 0.676 0.637 0.584 0.510 0.399
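The two efficiency formulas are easy to evaluate directly. The sketch below recomputes both rows of the table from the closed-form expressions in this solution; the function names are hypothetical, and the density f(x | α) = (1 + αx)/2 on [−1, 1] is the one implied by the integral above.

```python
import math


def fisher_information(alpha):
    """Per-observation Fisher information for f(x | alpha) = (1 + alpha*x)/2."""
    if alpha == 0.0:
        return 1.0 / 3.0  # limiting value E[X^2] at alpha = 0
    log_term = math.log((1.0 + alpha) / (1.0 - alpha))
    return -1.0 / alpha ** 2 + log_term / (2.0 * alpha ** 3)


def eff_vs_mme(alpha):
    """Var(alpha-hat_MME) / Var(alpha-hat) = (3 - alpha^2) / (4 - alpha^2)."""
    return (3.0 - alpha ** 2) / (4.0 - alpha ** 2)


def eff_vs_mle(alpha):
    """Asymptotic Var(alpha-hat_MLE) / Var(alpha-hat); the factor n cancels."""
    return 1.0 / (fisher_information(alpha) * (4.0 - alpha ** 2))


if __name__ == "__main__":
    for k in range(10):
        a = k / 10
        print(f"{a:.1f}  {eff_vs_mme(a):.3f}  {eff_vs_mle(a):.3f}")
```

For α very close to zero the closed form subtracts two large numbers, so a series expansion would be the safer choice there; for the grid used in the table the direct formula is accurate.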
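8.45 $X_1, \dots, X_n$ IID from a Uniform[0, θ]. (a) We have $EX_i = \theta/2$ for each i. Hence the method-of-moments estimator is
\[
\hat\theta_{MME} = 2\bar X,
\]
which is unbiased and has variance
\[
\mathrm{Var}(\hat\theta_{MME}) = \frac{\theta^2}{3n}.
\]
(b) $\hat\theta_{MLE} = \max(X_1, \dots, X_n) = X_{(n)}$ since the likelihood function is
\[
\frac{1}{\theta^n} \times 1_{\{0 \le x_{(1)} \le x_{(n)} \le \theta\}}.
\]
(c) The density function of $X_{(n)}$ is
\[
f(x) = \frac{n x^{n-1}}{\theta^n} \times 1_{\{0 \le x \le \theta\}}.
\]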
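Thus
\[
E(\hat\theta_{MLE}) = n \int_0^1 y^{n}\, \theta\, dy = \frac{\theta n}{n + 1}, \qquad
E(\hat\theta_{MLE}^2) = n \int_0^1 y^{n+1}\, \theta^2\, dy = \frac{\theta^2 n}{n + 2},
\]
\[
\mathrm{Var}(\hat\theta_{MLE}) = \frac{\theta^2 n}{(n + 1)^2 (n + 2)}.
\]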
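The following table compares the MME and MLE:
\[
\begin{array}{lcc}
\text{Estimator} & \hat\theta_{MME} & \hat\theta_{MLE} \\
\hline
\text{Bias} & 0 & -\dfrac{\theta}{n + 1} \\
\text{Variance} & \dfrac{\theta^2}{3n} & \dfrac{\theta^2 n}{(n + 1)^2 (n + 2)} \\
\text{MSE} & \dfrac{\theta^2}{3n} & \dfrac{2\theta^2}{(n + 1)(n + 2)}
\end{array}
\]
\[
\mathrm{eff}(\hat\theta_{MME}, \hat\theta_{MLE}) = \frac{6}{(n + 1)(1 + 2/n)}.
\]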
(d) The estimator $\tilde\theta = (1 + 1/n)\hat\theta_{MLE}$ is unbiased for θ, with variance
\[
\mathrm{Var}(\tilde\theta) = \frac{\theta^2}{n(n + 2)}.
\]
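The algebra behind these formulas can be double-checked with exact rational arithmetic. The sketch below rebuilds the MSE of the MLE from its first two moments and compares the three estimators; the function names are hypothetical, and θ defaults to 1 purely for illustration, since every quantity scales with θ².

```python
from fractions import Fraction


def mse_mme(n, theta=Fraction(1)):
    """MSE of 2 * X-bar, which is unbiased with variance theta^2 / (3n)."""
    return theta ** 2 / (3 * n)


def moments_mle(n, theta=Fraction(1)):
    """Mean and variance of X_(n) from E X_(n) and E X_(n)^2."""
    mean = theta * Fraction(n, n + 1)
    second = theta ** 2 * Fraction(n, n + 2)
    return mean, second - mean ** 2


def mse_mle(n, theta=Fraction(1)):
    """Variance plus squared bias of X_(n)."""
    mean, var = moments_mle(n, theta)
    return var + (mean - theta) ** 2


def var_unbiased_mle(n, theta=Fraction(1)):
    """Variance of (1 + 1/n) * X_(n)."""
    _, var = moments_mle(n, theta)
    return Fraction(n + 1, n) ** 2 * var


if __name__ == "__main__":
    for n in (1, 2, 5, 10, 100):
        print(n, mse_mme(n), mse_mle(n), var_unbiased_mle(n),
              mse_mle(n) / mse_mme(n))
```

The last column is the efficiency of the MME relative to the MLE; it equals 1 for n = 1 and n = 2 and decays roughly like 6/n afterwards.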
8.49 (a) Using the very handy fact stated at the start of the problem, we have
\[
E\left(\frac{(n - 1)s^2}{\sigma^2}\right) = n - 1, \qquad
\mathrm{Var}\left(\frac{(n - 1)s^2}{\sigma^2}\right) = 2(n - 1).
\]
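Hence,
\[
E s^2 = \sigma^2 \quad \text{and} \quad E\hat\sigma^2 = (1 - 1/n)\sigma^2,
\]
which means $s^2$ is unbiased for $\sigma^2$ while $\hat\sigma^2$ has bias equal to $-\sigma^2/n$. For (b),
\[
\mathrm{Var}(s^2) = \frac{2\sigma^4}{n - 1} \quad \text{and} \quad \mathrm{Var}(\hat\sigma^2) = \frac{2\sigma^4 (n - 1)}{n^2},
\]
therefore
\[
\mathrm{MSE}(s^2) = \frac{2\sigma^4}{n - 1} \quad \text{while} \quad \mathrm{MSE}(\hat\sigma^2) = \frac{2\sigma^4 (n - 1/2)}{n^2}.
\]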
It is easy to check that the latter is smaller than the former. Indeed, for every n ≥ 2 (so that $s^2$ is defined),
\[
\frac{\mathrm{MSE}(\hat\sigma^2)}{\mathrm{MSE}(s^2)} = \frac{(n - 1/2)(n - 1)}{n^2} \le \frac{n^2}{n^2} = 1.
\]
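(c) Let $s_\rho^2 = \rho (n - 1) s^2$. Then $E s_\rho^2 = \rho (n - 1) E s^2 = \rho (n - 1) \sigma^2$, so
\[
\mathrm{Bias}(s_\rho^2) = [\rho (n - 1) - 1]\, \sigma^2.
\]
On the other hand,
\[
\mathrm{Var}(s_\rho^2) = \rho^2 (n - 1)^2\, \mathrm{Var}(s^2) = 2\rho^2 (n - 1)\, \sigma^4.
\]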
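Hence
\[
\mathrm{MSE}(s_\rho^2) = \left\{[\rho (n - 1) - 1]^2 + 2\rho^2 (n - 1)\right\} \sigma^4
= \left\{(n^2 - 1)\left(\rho - \frac{1}{n + 1}\right)^2 + \frac{2}{n + 1}\right\} \sigma^4,
\]
which is minimized at $\rho = 1/(n + 1)$, taking the minimum value of $2\sigma^4/(n + 1)$. (Note that the MSE for $s^2$ is $2\sigma^4/(n - 1)$.)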
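The three estimators compared in this problem correspond to ρ = 1/(n − 1), 1/n and 1/(n + 1). A small exact-arithmetic sketch of the MSE as a function of ρ, in units of σ⁴, makes the ordering visible; the function name and the choice n = 10 are illustrative assumptions.

```python
from fractions import Fraction


def mse_rho(rho, n):
    """MSE of rho * (n - 1) * s^2 for a normal sample, in units of sigma^4."""
    bias = rho * (n - 1) - 1          # bias / sigma^2
    var = 2 * rho ** 2 * (n - 1)      # variance / sigma^4
    return bias ** 2 + var


if __name__ == "__main__":
    n = 10
    candidates = [
        ("s^2          (rho = 1/(n-1))", Fraction(1, n - 1)),
        ("sigma-hat^2  (rho = 1/n)", Fraction(1, n)),
        ("minimum MSE  (rho = 1/(n+1))", Fraction(1, n + 1)),
    ]
    for label, rho in candidates:
        print(label, mse_rho(rho, n))
```

With n = 10 this prints 2/9, 19/100 and 2/11, matching 2/(n − 1), (2n − 1)/n² and 2/(n + 1).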
8.52 (a) The (1/n)-loglikelihood is $-\log\tau - \bar x/\tau$. Therefore the MLE is the solution to the first order condition $0 = -1/\hat\tau + \bar X/\hat\tau^2$, which is simply $\hat\tau = \bar X$.
(b) $\sum_{i=1}^{n} X_i$ is a sum of independent exponentially distributed random variables with parameter τ, hence it follows a Gamma(n, 1/τ) distribution. Hence $\hat\tau = \bar X \sim \mathrm{Gamma}(n, n/\tau)$ is the exact sampling distribution.
(c) For large n, we can derive an approximate sampling distribution by invoking the central limit theorem together with the facts that $E\bar X = EX_1 = \tau$ and $\mathrm{Var}(\bar X) = \mathrm{Var}(X_1)/n = \tau^2/n$. Thus, for large n, $\hat\tau = \bar X \overset{d}{\approx} N(\tau, \tau^2/n)$ is the approximate sampling distribution.
(d) $E\bar X = \tau$ and $\mathrm{Var}(\bar X) = \tau^2/n$.
(e) To determine whether the MLE has minimum variance, we compare its variance to the Cramer-Rao lower bound. The latter is computed via
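\[
I(\tau) = E\left[\frac{\partial}{\partial\tau} \log f(X_1 \mid \tau)\right]^2
= E\left[-\frac{1}{\tau} + \frac{X_1}{\tau^2}\right]^2
= \frac{1}{\tau^2}.
\]
Hence the Cramer-Rao lower bound is
\[
\frac{1}{I_n(\tau)} = \frac{1}{n I(\tau)} = \frac{\tau^2}{n},
\]
which is the variance of the MLE: no other unbiased estimator can have smaller variance.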
(f) For an approximate 100(1 − α)% confidence interval, use the approximate sampling distribution
\[
\frac{\sqrt{n}\,(\bar X - \tau)}{\tau} \approx N(0, 1).
\]
(g) For an exact 100(1 − α)% confidence interval, use the fact that
\[
\frac{2n\bar X}{\tau} \sim \mathrm{Gamma}(n, 1/2) \equiv \chi^2_{2n}.
\]
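One way to turn the approximate pivot in (f) into an interval is to solve $-z \le \sqrt{n}(\bar X - \tau)/\tau \le z$ for τ, which gives $\bar X/(1 + z/\sqrt{n}) \le \tau \le \bar X/(1 - z/\sqrt{n})$ whenever $z < \sqrt{n}$, with z the upper α/2 standard normal quantile. The sketch below implements that inversion; the function name and the example numbers are assumptions made for illustration, and the exact interval in (g) is not covered because it needs chi-square quantiles.

```python
import math
from statistics import NormalDist


def approx_ci_tau(xbar, n, level=0.95):
    """Approximate confidence interval for tau from sqrt(n)(X-bar - tau)/tau ~ N(0,1)."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # upper (1 - level)/2 quantile
    half = z / math.sqrt(n)
    if half >= 1.0:
        # The inequality cannot be inverted into a finite upper limit.
        raise ValueError("n is too small for this pivot at the requested level")
    return xbar / (1.0 + half), xbar / (1.0 - half)


if __name__ == "__main__":
    lower, upper = approx_ci_tau(xbar=2.0, n=100)
    print(f"approximate 95% interval for tau: ({lower:.4f}, {upper:.4f})")
```

The interval is not symmetric about X̄, because the standard error τ/√n itself depends on the unknown τ.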
8.62 Simply decompose as follows
Var(V1) = Var(E(V1 | T2)) + EVar(V1 | T2) ≥ Var(E(V1 | T2)) = Var(V2).
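The decomposition used here is the law of total variance. As a small illustration, the sketch below checks it exactly on a made-up finite joint distribution of (T, V); the pmf and the function name are hypothetical and serve only to show that the conditional-mean term never exceeds the total variance.

```python
from fractions import Fraction

# Hypothetical joint pmf of (T, V): keys are (t, v), values are probabilities.
JOINT = {
    (0, 0): Fraction(1, 4),
    (0, 2): Fraction(1, 4),
    (1, 1): Fraction(1, 8),
    (1, 5): Fraction(3, 8),
}


def total_variance_parts(joint):
    """Return Var(V), Var(E(V | T)) and E Var(V | T) for a finite joint pmf."""
    p_t, s1, s2 = {}, {}, {}
    for (t, v), p in joint.items():
        p_t[t] = p_t.get(t, 0) + p          # P(T = t)
        s1[t] = s1.get(t, 0) + p * v        # E[V; T = t]
        s2[t] = s2.get(t, 0) + p * v * v    # E[V^2; T = t]
    mean = sum(s1.values())
    var_v = sum(s2.values()) - mean ** 2
    cond_mean = {t: s1[t] / p_t[t] for t in p_t}
    var_of_cond_mean = sum(p_t[t] * cond_mean[t] ** 2 for t in p_t) - mean ** 2
    mean_of_cond_var = sum(s2[t] - p_t[t] * cond_mean[t] ** 2 for t in p_t)
    return var_v, var_of_cond_mean, mean_of_cond_var


if __name__ == "__main__":
    total, between, within = total_variance_parts(JOINT)
    print(total, between, within, total == between + within)
```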