Source: galton.uchicago.edu/~eichler/stat24600/Handouts/s02add.pdf (University of Chicago)
Example
Fitting a Poisson distribution (correctly specified case)
Suppose that X1, . . . , Xn are independent and Poisson distributed,
\[
X_i \overset{\text{iid}}{\sim} \text{Poisson}(\lambda_0).
\]
The log-likelihood function is
\[
l_n(\lambda|X) = \log(\lambda) \sum_{i=1}^{n} X_i - \sum_{i=1}^{n} \log(X_i!) - n\,\lambda.
\]
Differentiating with respect to λ, we obtain the score function
\[
S(\lambda|X) = \frac{\partial l_n(\lambda|X)}{\partial \lambda}
= \frac{1}{\lambda} \sum_{i=1}^{n} X_i - n
\]
and the ML estimator
\[
\hat{\lambda}_{ML} = \frac{1}{n} \sum_{i=1}^{n} X_i.
\]
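As a quick numerical check (not part of the handout; the rate and sample size below are assumed for illustration), the Poisson ML estimator is simply the sample mean:

```python
import numpy as np

# Minimal sketch (values assumed for illustration): the Poisson ML
# estimator is just the sample mean of the observations.
rng = np.random.default_rng(0)
lam0 = 3.0        # true rate lambda_0
n = 10_000
x = rng.poisson(lam0, size=n)

lam_hat = x.mean()   # lambda_hat_ML = (1/n) * sum_i X_i
print(lam_hat)       # close to lam0 = 3.0
```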
The second derivative of the log-likelihood function is
\[
\frac{\partial^2 l_n(\lambda|X)}{\partial \lambda^2} = -\frac{1}{\lambda^2} \sum_{i=1}^{n} X_i,
\]
which yields the observed Fisher information
\[
I(\lambda|X) = -\frac{\partial^2 l_n(\lambda|X)}{\partial \lambda^2} = \frac{1}{\lambda^2} \sum_{i=1}^{n} X_i
\]
and the (expected) Fisher information
\[
I(\lambda) = -\mathbb{E}\left( \frac{\partial^2 l_n(\lambda|X)}{\partial \lambda^2} \right)
= \frac{n\,\lambda}{\lambda^2} = \frac{n}{\lambda}.
\]
Therefore the MLE is approximately normally distributed with mean λ0 and variance λ0/n.
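The asymptotic variance λ0/n can be checked by Monte Carlo; the sketch below (values for λ0, n, and the number of replications are assumed) compares the empirical variance of the MLE across replications to λ0/n:

```python
import numpy as np

# Monte Carlo sketch (values assumed): the empirical variance of the
# Poisson MLE across replications is close to lambda_0 / n.
rng = np.random.default_rng(1)
lam0, n, reps = 2.0, 200, 20_000

# One ML estimate (sample mean) per replication.
lam_hats = rng.poisson(lam0, size=(reps, n)).mean(axis=1)

print(lam_hats.mean())   # ~ lambda_0 = 2.0
print(lam_hats.var())    # ~ lambda_0 / n = 0.01
```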
Maximum Likelihood Estimation (Addendum), Apr 8, 2004 - 1 -
Example
Fitting a Poisson distribution (misspecified case)
Now suppose that the variables Xi are binomially distributed,
\[
X_i \overset{\text{iid}}{\sim} \text{Bin}(m, \theta_0).
\]
How does the MLE λ̂ML of the fitted Poisson model relate to the true
distribution?
The “distance” between the fitted model and the true model can be measured by the Kullback-Leibler distance,
\[
\begin{aligned}
\mathbb{E}\left( \log \frac{f_{Bin}(X|\theta_0)}{f_{Poiss}(X|\lambda)} \right)
&= \mathbb{E}\big( \log f_{Bin}(X|\theta_0) \big) - \mathbb{E}\big( \log f_{Poiss}(X|\lambda) \big) \\
&= \mathbb{E}\big( \lambda - X \log(\lambda) \big) + \text{terms constant in } \lambda \\
&= \lambda - m\,\theta_0 \log(\lambda) + \text{terms constant in } \lambda.
\end{aligned}
\]
Differentiating with respect to λ and setting the derivative to zero, we obtain
\[
1 - \frac{m\,\theta_0}{\lambda} = 0
\quad \Leftrightarrow \quad
\lambda = m\,\theta_0.
\]
Thus the MLE λ̂ML converges to λ0 = m θ0.
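This limit is easy to see in simulation; in the sketch below (m, θ0, and n are assumed demo values), the Poisson MLE fitted to binomial data lands near m θ0:

```python
import numpy as np

# Minimal sketch (m, theta0, n assumed): fitting the Poisson model to
# binomial data, the ML estimate approaches lambda_0 = m * theta_0.
rng = np.random.default_rng(2)
m, theta0, n = 10, 0.3, 50_000
x = rng.binomial(m, theta0, size=n)

lam_hat = x.mean()          # the Poisson MLE is still the sample mean
print(lam_hat, m * theta0)  # lam_hat is close to m * theta0 = 3.0
```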
Asymptotic Properties of the MLE
Let θ̂ be the MLE for θ0. Taylor expansion of the score function at θ̂ about
θ0 yields
\[
\frac{\partial l_n(\hat\theta|Y)}{\partial \theta}
\approx \frac{\partial l_n(\theta_0|Y)}{\partial \theta}
+ \frac{\partial^2 l_n(\theta_0|Y)}{\partial \theta^2}\,(\hat\theta - \theta_0)
\qquad (1)
\]
and hence
\[
\hat\theta - \theta_0 \approx
-\left( \frac{\partial^2 l_n(\theta_0|Y)}{\partial \theta^2} \right)^{-1}
\frac{\partial l_n(\theta_0|Y)}{\partial \theta},
\]
since the left-hand side of (1) is zero at the maximizer θ̂. Furthermore, since
\[
-\frac{\partial^2 l_n(\theta_0|Y)}{\partial \theta^2} \to I(\theta_0|Y)
\]
and
\[
\mathbb{E}\left( \frac{\partial l_n(\theta_0|Y)}{\partial \theta} \right)
= \mathbb{E}\left( \frac{\partial \log f(Y|\theta_0)}{\partial \theta} \right) = 0,
\]
this suggests that
\[
\mathrm{var}\big( \hat\theta - \theta_0 \big)
\approx I(\theta_0)^{-1}\,
\mathbb{E}\left[ \left( \frac{\partial l_n(\theta_0|Y)}{\partial \theta} \right)^2 \right]
I(\theta_0)^{-1}.
\]
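In practice this sandwich variance is estimated by plugging in per-observation scores at the MLE. A minimal sketch for the handout's Poisson example (the data-generating values below are assumed for illustration); for a correctly specified model the sandwich estimate should agree with the model-based one:

```python
import numpy as np

# Minimal sketch (demo values assumed): plug-in sandwich variance
# J^{-1} K J^{-1} for the Poisson fit, using per-observation scores.
rng = np.random.default_rng(4)
x = rng.poisson(3.0, size=5_000)

lam = x.mean()              # ML estimate
s = x / lam - 1             # scores d log f(x_i|lam)/d lam at lam_hat
J = x.sum() / lam**2        # observed information -d^2 l_n / d lam^2
K = (s**2).sum()            # sum of squared scores

print(K / J**2)             # sandwich variance estimate
print(lam / len(x))         # model-based estimate lambda_hat / n
```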
If the model is correctly specified, we have
\[
\begin{aligned}
\mathbb{E}\left( \frac{\partial^2 l_n(\theta_0|Y)}{\partial \theta^2} \right)
&= \mathbb{E}\left( \frac{\partial^2 \log f(Y|\theta_0)}{\partial \theta^2} \right)
= \mathbb{E}\left[ \frac{\partial}{\partial \theta}
\left( \frac{\partial f(Y|\theta_0)}{\partial \theta} \frac{1}{f(Y|\theta_0)} \right) \right] \\
&= \mathbb{E}\left[ \frac{\partial^2 f(Y|\theta_0)}{\partial \theta^2} \frac{1}{f(Y|\theta_0)} \right]
- \mathbb{E}\left[ \left( \frac{\partial f(Y|\theta_0)}{\partial \theta} \right)^2
\left( \frac{1}{f(Y|\theta_0)} \right)^2 \right].
\end{aligned}
\]
Noting that
\[
\mathbb{E}\left[ \frac{\partial^2 f(Y|\theta_0)}{\partial \theta^2} \frac{1}{f(Y|\theta_0)} \right]
= \int \frac{\partial^2 f(y|\theta_0)}{\partial \theta^2}\, dy
= \frac{\partial^2}{\partial \theta^2} \int f(y|\theta_0)\, dy = 0,
\]
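The two identities used here (the score has mean zero, and its second moment equals minus the expected second derivative) can be verified numerically for the Poisson model by summing over a truncated support; the rate below is assumed for illustration:

```python
import math

# Numerical check (lam assumed for illustration): for Poisson(lam),
# E[score] = 0 and E[score^2] = -E[d^2 log f / d lam^2] = 1/lam.
lam = 2.5

def pmf(x):
    return math.exp(-lam) * lam**x / math.factorial(x)

xs = range(60)   # truncated support; the omitted tail mass is negligible
e_score  = sum((x / lam - 1) * pmf(x) for x in xs)       # E[d log f / d lam]
e_score2 = sum((x / lam - 1)**2 * pmf(x) for x in xs)    # E[(d log f / d lam)^2]
e_hess   = sum((-x / lam**2) * pmf(x) for x in xs)       # E[d^2 log f / d lam^2]

print(e_score)             # ~ 0
print(e_score2, -e_hess)   # both ~ 1/lam = 0.4
```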
we obtain
\[
I(\theta_0)
= \mathbb{E}\left[ \left( \frac{\partial f(Y|\theta_0)}{\partial \theta}
\frac{1}{f(Y|\theta_0)} \right)^2 \right]
= \mathbb{E}\left[ \left( \frac{\partial \log f(Y|\theta_0)}{\partial \theta} \right)^2 \right]
= \mathbb{E}\left[ \left( \frac{\partial l_n(\theta_0|Y)}{\partial \theta} \right)^2 \right]
= \mathbb{E}\big[ S(\theta_0|Y)^2 \big].
\]
Example
Fitting a Poisson distribution (misspecified case)
The variance of the MLE λ̂ can be approximated by
\[
I(\lambda_0)^{-1}\, \mathbb{E}\big[ S(\lambda_0|Y)^2 \big]\, I(\lambda_0)^{-1}.
\]
Using the formulas for the first and second derivatives, we find that
\[
\begin{aligned}
\mathbb{E}\big[ S(\lambda_0|Y)^2 \big]
&= \mathbb{E}\left[ \left( \frac{1}{\lambda_0} \sum_{i=1}^{n} X_i - n \right)^2 \right] \\
&= \frac{n\,m\,\theta_0 (1-\theta_0)}{\lambda_0^2}
+ \frac{n^2 m^2 \theta_0^2}{\lambda_0^2}
- \frac{2\,n^2 m\,\theta_0}{\lambda_0} + n^2 \\
&= n\,\frac{1-\theta_0}{\lambda_0}
= n \left( \frac{1}{m\,\theta_0} - \frac{1}{m} \right),
\end{aligned}
\]
where the last three terms cancel since λ0 = m θ0,
and
\[
I(\lambda_0)
= \mathbb{E}\left( \frac{1}{\lambda_0^2} \sum_{i=1}^{n} X_i \right)
= \frac{n\,m\,\theta_0}{\lambda_0^2}
= \frac{n}{m\,\theta_0},
\]
where we used that
\[
\sum_{i=1}^{n} X_i \sim \text{Bin}(m\,n, \theta_0).
\]
Hence the variance of λ̂ becomes
\[
\frac{m^2 \theta_0^2}{n} \left( \frac{1}{m\,\theta_0} - \frac{1}{m} \right)
= \frac{m\,\theta_0 (1-\theta_0)}{n}.
\]
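A Monte Carlo sketch confirms the sandwich result (all parameter values below are assumed for the demo); note that the model-based value λ0/n would overstate the variance here:

```python
import numpy as np

# Monte Carlo sketch (all values assumed): for a Poisson fit to
# Bin(m, theta0) data, var(lambda_hat) ~ m*theta0*(1-theta0)/n,
# not the model-based lambda_0/n.
rng = np.random.default_rng(3)
m, theta0, n, reps = 10, 0.3, 100, 20_000

# One Poisson ML estimate (sample mean) per replication of binomial data.
lam_hats = rng.binomial(m, theta0, size=(reps, n)).mean(axis=1)

sandwich = m * theta0 * (1 - theta0) / n   # 0.021
naive = m * theta0 / n                     # lambda_0 / n = 0.03
print(lam_hats.var(), sandwich, naive)
```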