
Theory of Statistics.

Homework V

February 25, 2002. MT

8.17.c When σ is known, µ̂ = X̄ is an unbiased estimator for µ. If you can show that its variance attains the Cramer-Rao lower bound, then no other unbiased estimator can have lower variance. Calculations for the Fisher information:

\[
I_n(\mu) = nI(\mu), \qquad I(\mu) = E\left[\frac{\partial}{\partial\mu}\log f(X_1\mid\mu)\right]^2,
\]
\[
\frac{\partial}{\partial\mu}\log f(x\mid\mu) = \frac{x-\mu}{\sigma^2}, \qquad
I(\mu) = \frac{E(X_1-\mu)^2}{\sigma^4} = \frac{1}{\sigma^2}, \qquad
I_n(\mu) = \frac{n}{\sigma^2}.
\]

Hence the Cramer-Rao lower bound is $\sigma^2/n$, which is just the variance of $\hat\mu = \bar X$.
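
A quick numerical check, not part of the original solution: the sketch below (assuming NumPy is available; the values of µ, σ and n are arbitrary choices) simulates repeated samples and compares the empirical variance of X̄ with the Cramer-Rao bound σ²/n.

import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n, reps = 2.0, 1.5, 50, 200_000   # arbitrary illustrative values

# Simulate `reps` samples of size n and record the sample mean of each.
means = rng.normal(mu, sigma, size=(reps, n)).mean(axis=1)

print("empirical Var(Xbar):", means.var())   # should be close to sigma**2 / n
print("Cramer-Rao bound   :", sigma**2 / n)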

8.40 $X_1, \ldots, X_n$ IID from a Poisson($\lambda$) distribution. Define $Y = \sum_{i=1}^{n} 1_{\{X_i = 0\}}$, which is the sum of $n$ IID Bernoulli random variables with probability of success $p_0 = P(X = 0)$. Thus $Y \sim \mathrm{Bin}(n, p_0)$. Let $W = Y/n$ for convenience. First two moments of $W$ are given by

\[
EW = p_0 = e^{-\lambda}, \qquad
\operatorname{Var}(W) = \frac{p_0(1-p_0)}{n} = \frac{e^{-\lambda}(1-e^{-\lambda})}{n}, \qquad
EW^2 = \frac{e^{-\lambda}\left[1 + (n-1)e^{-\lambda}\right]}{n}.
\]

The motivation for the estimator $\tilde\lambda$ is that, for large $n$, values of $Y/n$ should be close to their average $E(Y/n) = e^{-\lambda}$. Thus $Y/n = e^{-\tilde\lambda}$, or $\log(Y/n) = -\tilde\lambda$. Let $g(t) = -\log t$. The first order approximation
\[
\tilde\lambda = g(W) \approx g(\mu_W) + (W - \mu_W)g'(\mu_W) = \lambda - (W - e^{-\lambda})e^{\lambda}
\]

does not give any information about the bias $E\tilde\lambda - \lambda$, as $EW - e^{-\lambda} = 0$, but it does allow you to get an approximate variance term
\[
\operatorname{Var}(\tilde\lambda) \approx \sigma_W^2\,[g'(\mu_W)]^2
= \frac{e^{-\lambda}(1-e^{-\lambda})e^{2\lambda}}{n} = \frac{e^{\lambda}-1}{n}.
\]

For the bias term, you need to use the second order approximation
\[
\tilde\lambda = g(W) \approx g(\mu_W) + (W - \mu_W)g'(\mu_W) + \tfrac12 (W - \mu_W)^2 g''(\mu_W).
\]


Obtain
\[
\operatorname{Bias}(\tilde\lambda) \approx \frac{1}{2n}(e^{\lambda}-1), \qquad
\operatorname{Var}(\tilde\lambda) \approx \frac{1}{n}(e^{\lambda}-1), \qquad
\operatorname{MSE}(\tilde\lambda) \approx \frac{1}{n}(e^{\lambda}-1)\left[\frac{1}{4n}(e^{\lambda}-1) + 1\right].
\]

The MLE for this problem is easily checked to correspond to $\hat\lambda_{\mathrm{MLE}} = \bar X$, since, up to a term not dependent on $\lambda$, the $(1/n)$-loglikelihood is $-\lambda + \bar x\log\lambda$. Of course, $\bar X$ is unbiased and has variance $\lambda/n$. Hence the efficiency of $\tilde\lambda$ relative to $\hat\lambda_{\mathrm{MLE}}$ is
\[
\operatorname{eff}(\tilde\lambda, \hat\lambda_{\mathrm{MLE}})
= \frac{\lambda}{(e^{\lambda}-1)\left[\frac{1}{4n}(e^{\lambda}-1) + 1\right]}.
\]
For large $n$ this behaves like $\lambda/(e^{\lambda}-1)$. As $\lambda \to 0$ ($p_0 \to 1$), $\lambda/(e^{\lambda}-1) \to 1$. As $\lambda$ is increased to $+\infty$ ($p_0 \to 0$), $\lambda/(e^{\lambda}-1) \to 0$.
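
As a sanity check on these delta-method approximations (not part of the original solution; NumPy is assumed, and the values of λ and n are arbitrary choices), one can simulate and compare the empirical bias and variance of λ̃ = −log(Y/n) with the formulas above, and the variance of the MLE with λ/n.

import numpy as np

rng = np.random.default_rng(1)
lam, n, reps = 1.0, 200, 100_000     # arbitrary illustrative values

x = rng.poisson(lam, size=(reps, n))
w = (x == 0).mean(axis=1)            # W = Y/n, fraction of zero counts
w = np.clip(w, 1.0 / n, None)        # guard: keep the log defined if Y happens to be 0
lam_tilde = -np.log(w)               # estimator based on the zero counts
lam_mle = x.mean(axis=1)             # MLE: the sample mean

print("bias approx:", (np.exp(lam) - 1) / (2 * n), "empirical:", lam_tilde.mean() - lam)
print("var  approx:", (np.exp(lam) - 1) / n, "empirical:", lam_tilde.var())
print("MLE variance lam/n:", lam / n, "empirical:", lam_mle.var())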

8.41 Preliminaries: The method-of-moments estimator $\hat\alpha_{\mathrm{MME}} = 3\bar X$ is unbiased for $\alpha$ and has variance
\[
\operatorname{Var}(\hat\alpha_{\mathrm{MME}}) = \frac{3-\alpha^2}{n}.
\]

The maximum likelihood estimator cannot be given in closed form (see Example D, §8.4). However you can calculate the asymptotic variance through the Fisher information
\[
I_n(\alpha) = nI(\alpha) = nE\left[\frac{X_1^2}{(1+\alpha X_1)^2}\right].
\]

If $\alpha = 0$, $I_n(\alpha) = n/3$. For $\alpha \neq 0$, the second moment is computed by changing variable $y = 1 + \alpha x$ and evaluating the integral
\[
\int_{-1}^{1}\frac{x^2}{(1+\alpha x)^2}\,f(x\mid\alpha)\,dx
= \frac{1}{2}\int_{1-\alpha}^{1+\alpha}\frac{(y-1)^2/\alpha^2}{y}\,\frac{dy}{\alpha}
= -\frac{1}{\alpha^2} + \frac{1}{2\alpha^3}\log\!\left(\frac{1+\alpha}{1-\alpha}\right).
\]

Therefore, whenever $\alpha \neq 0$,
\[
\operatorname{Var}(\hat\alpha_{\mathrm{MLE}}) \approx \frac{1}{I_n(\alpha)}
= \frac{1}{-\dfrac{n}{\alpha^2} + \dfrac{n}{2\alpha^3}\log\!\left(\dfrac{1+\alpha}{1-\alpha}\right)}.
\]
Now, recording only $Y_i = 1_{\{X_i > 0\}}$, which are Bernoulli random variables with probability of success $p = \frac12(1+\alpha/2)$, the sum $Y = \sum_{i=1}^{n} Y_i \sim \mathrm{Bin}(n, p)$ has $EY = np = \frac{n}{2}(1+\alpha/2)$. Hence an unbiased estimator for $\alpha$ can be obtained by setting $Y = \frac{n}{2}(1+\hat\alpha/2)$. Thus
\[
\hat\alpha = 4(Y/n) - 2, \qquad \operatorname{Var}(\hat\alpha) = \frac{4-\alpha^2}{n}.
\]


Relative to the MME, the efficiency of $\hat\alpha$ is
\[
\operatorname{eff}(\hat\alpha, \hat\alpha_{\mathrm{MME}}) = \frac{(3-\alpha^2)/n}{(4-\alpha^2)/n} = \frac{3-\alpha^2}{4-\alpha^2},
\]
while, relative to the MLE, the efficiency of $\hat\alpha$ is
\[
\operatorname{eff}(\hat\alpha, \hat\alpha_{\mathrm{MLE}})
= \frac{\left\{-\dfrac{1}{\alpha^2} + \dfrac{1}{2\alpha^3}\log\!\left(\dfrac{1+\alpha}{1-\alpha}\right)\right\}^{-1}}{4-\alpha^2}
\]
for $\alpha \neq 0$. When $\alpha = 0$, $\operatorname{eff}(\hat\alpha, \hat\alpha_{\mathrm{MLE}}) = 3/(4-\alpha^2) = 3/4$. The following table compares the efficiency of α̂ relative to the MME with that relative to the MLE for different values of α.

α                0.0    0.1    0.2    0.3    0.4    0.5    0.6    0.7    0.8    0.9
eff(α̂, α̂MME)    0.750  0.749  0.747  0.744  0.740  0.733  0.725  0.715  0.702  0.687
eff(α̂, α̂MLE)    0.750  0.747  0.739  0.725  0.705  0.676  0.637  0.584  0.509  0.396
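
The table entries follow directly from the two closed-form expressions; a short script that reproduces them (not part of the original solution; NumPy assumed):

import numpy as np

def eff_mme(a):
    # eff(alpha-hat, alpha-hat_MME) = (3 - a^2) / (4 - a^2)
    return (3 - a**2) / (4 - a**2)

def eff_mle(a):
    # eff(alpha-hat, alpha-hat_MLE); the a = 0 case uses I(0) = 1/3
    if a == 0:
        return 3 / (4 - a**2)
    fisher = -1 / a**2 + np.log((1 + a) / (1 - a)) / (2 * a**3)
    return 1 / (fisher * (4 - a**2))

for a in np.arange(0.0, 1.0, 0.1):
    print(f"{a:.1f}  {eff_mme(a):.3f}  {eff_mle(a):.3f}")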

8.45 $X_1, \ldots, X_n$ IID from a Uniform$[0, \theta]$. (a) Have $EX_i = \theta/2$ for each $i$. Hence the method-of-moments estimator is
\[
\hat\theta_{\mathrm{MME}} = 2\bar X,
\]
which is unbiased and has variance
\[
\operatorname{Var}(\hat\theta_{\mathrm{MME}}) = \frac{\theta^2}{3n}.
\]

(b) $\hat\theta_{\mathrm{MLE}} = \max(X_1, \ldots, X_n) = X_{(n)}$, since the likelihood function is
\[
\frac{1}{\theta^n} \times 1_{\{0 \le x_{(1)} \le x_{(n)} \le \theta\}}.
\]

(c) The density function of $X_{(n)}$ is
\[
f(x) = \frac{n x^{n-1}}{\theta^n} \times 1_{\{0 \le x \le \theta\}}.
\]
Thus
\[
E(\hat\theta_{\mathrm{MLE}}) = n\int_0^1 y^n \theta\,dy = \frac{\theta n}{n+1}, \qquad
E(\hat\theta_{\mathrm{MLE}}^2) = n\int_0^1 y^{n+1}\theta^2\,dy = \frac{\theta^2 n}{n+2}, \qquad
\operatorname{Var}(\hat\theta_{\mathrm{MLE}}) = \frac{\theta^2 n}{(n+1)^2(n+2)}.
\]

The following table compares the MME and MLE:

Estimator    θ̂MME          θ̂MLE
Bias         0              −θ/(n+1)
Variance     θ²/(3n)        θ²n/[(n+1)²(n+2)]
MSE          θ²/(3n)        2θ²/[(n+1)(n+2)]

\[
\operatorname{eff}(\hat\theta_{\mathrm{MME}}, \hat\theta_{\mathrm{MLE}}) = \frac{6}{(n+1)(1+2/n)}
\]

(d) The estimator $\tilde\theta = (1 + 1/n)\hat\theta_{\mathrm{MLE}}$ is unbiased for $\theta$, with variance
\[
\operatorname{Var}(\tilde\theta) = \frac{\theta^2}{n(n+2)}.
\]
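
A small simulation (not part of the original solution; NumPy assumed, θ and n are arbitrary choices) comparing the three estimators confirms the bias and MSE comparison above:

import numpy as np

rng = np.random.default_rng(2)
theta, n, reps = 5.0, 20, 100_000   # arbitrary illustrative values

x = rng.uniform(0, theta, size=(reps, n))
mme = 2 * x.mean(axis=1)            # method of moments: twice the sample mean
mle = x.max(axis=1)                 # MLE: the sample maximum
tilde = (1 + 1 / n) * mle           # bias-corrected maximum from part (d)

for name, est in [("MME", mme), ("MLE", mle), ("(1+1/n)*MLE", tilde)]:
    print(name, "bias:", est.mean() - theta, "MSE:", ((est - theta) ** 2).mean())

print("theoretical MSE(MME)  :", theta**2 / (3 * n))
print("theoretical MSE(MLE)  :", 2 * theta**2 / ((n + 1) * (n + 2)))
print("theoretical Var(tilde):", theta**2 / (n * (n + 2)))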

8.49 (a) Using the very handy fact stated at the start of the problem, we have
\[
E\!\left(\frac{(n-1)s^2}{\sigma^2}\right) = n-1, \qquad
\operatorname{Var}\!\left(\frac{(n-1)s^2}{\sigma^2}\right) = 2(n-1).
\]
Hence,
\[
Es^2 = \sigma^2 \quad\text{and}\quad E\hat\sigma^2 = (1 - 1/n)\sigma^2,
\]
which means $s^2$ is unbiased for $\sigma^2$ while $\hat\sigma^2$ has bias equal to $-\sigma^2/n$. For (b),

\[
\operatorname{Var}(s^2) = \frac{2\sigma^4}{n-1} \quad\text{and}\quad \operatorname{Var}(\hat\sigma^2) = \frac{2\sigma^4(n-1)}{n^2},
\]
therefore
\[
\operatorname{MSE}(s^2) = \frac{2\sigma^4}{n-1} \quad\text{while}\quad \operatorname{MSE}(\hat\sigma^2) = \frac{2\sigma^4(n-1/2)}{n^2}.
\]

It is easy to check that the latter is smaller than the former. Indeed, as long as $n \ge 1$,
\[
\frac{\operatorname{MSE}(\hat\sigma^2)}{\operatorname{MSE}(s^2)} = \frac{(n-1/2)(n-1)}{n^2} \le \frac{n^2}{n^2} = 1.
\]

(c) Let $s_\rho^2 = \rho(n-1)s^2$. Then $E s_\rho^2 = \rho(n-1)Es^2 = \rho(n-1)\sigma^2$, so
\[
\operatorname{Bias}(s_\rho^2) = [\rho(n-1) - 1]\sigma^2.
\]
On the other hand,
\[
\operatorname{Var}(s_\rho^2) = \rho^2(n-1)^2\operatorname{Var}(s^2) = 2\rho^2(n-1)\sigma^4.
\]


Hence
\[
\operatorname{MSE}(s_\rho^2) = \Big\{[\rho(n-1)-1]^2 + 2\rho^2(n-1)\Big\}\sigma^4
= \left\{(n^2-1)\left(\rho - \frac{1}{n+1}\right)^2 + \frac{2}{n+1}\right\}\sigma^4,
\]
which is minimized at $\rho = 1/(n+1)$, taking the minimum value $2\sigma^4/(n+1)$. (Note that the MSE for $s^2$ is $2\sigma^4/(n-1)$.)
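
A simulation sketch (not part of the original solution; NumPy assumed, σ and n are arbitrary choices) comparing the three divisors n−1, n and n+1, and checking that dividing by n+1 gives the smallest MSE:

import numpy as np

rng = np.random.default_rng(3)
sigma, n, reps = 2.0, 10, 200_000   # arbitrary illustrative values

x = rng.normal(0.0, sigma, size=(reps, n))
ss = ((x - x.mean(axis=1, keepdims=True)) ** 2).sum(axis=1)   # sum of squared deviations

for name, divisor in [("divide by n-1 (s^2)", n - 1),
                      ("divide by n   (MLE)", n),
                      ("divide by n+1      ", n + 1)]:
    est = ss / divisor
    print(name, "MSE:", ((est - sigma**2) ** 2).mean())

print("predicted minimum 2*sigma^4/(n+1):", 2 * sigma**4 / (n + 1))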

8.52 (a) The $(1/n)$-loglikelihood is $-\log\tau - \bar x/\tau$. Therefore the MLE is the solution to the first order condition $0 = -1/\hat\tau + \bar X/\hat\tau^2$, which is simply $\hat\tau = \bar X$. (b) $\sum_{i=1}^{n} X_i$ is a sum of independent exponentially distributed random variables with parameter $\tau$, hence it follows a Gamma$(n, 1/\tau)$ distribution. Hence $\hat\tau = \bar X \sim \mathrm{Gamma}(n, n/\tau)$ is the exact sampling distribution. (c) For large $n$, we can derive an approximate sampling distribution by invoking the central limit theorem together with the facts that $E\bar X = EX_1 = \tau$ and $\operatorname{Var}(\bar X) = \operatorname{Var}(X_1)/n = \tau^2/n$. Thus, for large $n$, $\hat\tau = \bar X \overset{d}{\approx} \mathcal{N}(\tau, \tau^2/n)$ is the approximate sampling distribution. (d) $E\bar X = \tau$ and $\operatorname{Var}(\bar X) = \tau^2/n$. (e) To determine whether the MLE has minimum variance, we compare its variance to the Cramer-Rao lower bound. The latter is computed via

\[
I(\tau) = E\left[\frac{\partial}{\partial\tau}\log f(X_1\mid\tau)\right]^2
= E\left[-\frac{1}{\tau} + \frac{X_1}{\tau^2}\right]^2
= \frac{1}{\tau^2}.
\]

Hence the Cramer-Rao lower bound is
\[
\frac{1}{I_n(\tau)} = \frac{1}{nI(\tau)} = \frac{\tau^2}{n},
\]
which is the variance of the MLE: no other unbiased estimator can have smaller variance.

(f) For an approximate $100(1-\alpha)\%$ confidence interval, use the approximate sampling distribution
\[
\frac{\sqrt{n}(\bar X - \tau)}{\tau} \approx \mathcal{N}(0, 1).
\]

(g) For an exact $100(1-\alpha)\%$ confidence interval, use the fact that
\[
\frac{2n\bar X}{\tau} \sim \operatorname{Gamma}(n, 1/2) \equiv \chi^2_{2n}.
\]
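
A sketch of both intervals in code (not part of the original solution; NumPy and SciPy assumed, and the values of τ, n and α are arbitrary choices):

import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
tau, n, alpha = 3.0, 25, 0.05        # arbitrary illustrative values

x = rng.exponential(tau, size=n)
tau_hat = x.mean()                   # MLE

# Exact interval: 2 n Xbar / tau ~ chi-square with 2n degrees of freedom
lo = 2 * n * tau_hat / stats.chi2.ppf(1 - alpha / 2, df=2 * n)
hi = 2 * n * tau_hat / stats.chi2.ppf(alpha / 2, df=2 * n)
print("exact  95% CI:", (lo, hi))

# Approximate interval: sqrt(n)(Xbar - tau)/tau is approximately N(0, 1)
z = stats.norm.ppf(1 - alpha / 2)
print("approx 95% CI:", (tau_hat / (1 + z / np.sqrt(n)), tau_hat / (1 - z / np.sqrt(n))))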

8.62 Simply decompose as follows:
\[
\operatorname{Var}(V_1) = \operatorname{Var}(E(V_1\mid T_2)) + E\operatorname{Var}(V_1\mid T_2)
\ge \operatorname{Var}(E(V_1\mid T_2)) = \operatorname{Var}(V_2).
\]
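
Here $V_1$, $V_2$ and $T_2$ are as defined in the problem statement, with $V_2 = E(V_1\mid T_2)$. As a generic numerical illustration of the same decomposition (my own example, not the book's), one can Rao-Blackwellize the crude estimator $1_{\{X_1 = 0\}}$ of $e^{-\lambda}$ for Poisson data, conditioning on the sufficient statistic $T = \sum_i X_i$:

import numpy as np

rng = np.random.default_rng(5)
lam, n, reps = 1.0, 10, 200_000      # arbitrary illustrative values

x = rng.poisson(lam, size=(reps, n))
t = x.sum(axis=1)

v1 = (x[:, 0] == 0).astype(float)    # crude unbiased estimator of exp(-lam)
v2 = ((n - 1) / n) ** t              # E(V1 | T): X1 given T = t is Bin(t, 1/n)

print("Var(V1):", v1.var(), "Var(V2):", v2.var())          # Var(V2) should be smaller
print("means vs exp(-lam):", v1.mean(), v2.mean(), np.exp(-lam))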
