Example: Fitting a Poisson distribution (University of Chicago, galton.uchicago.edu/~eichler/stat24600/Handouts/s02add.pdf)


Example

Fitting a Poisson distribution (correctly specified case)

Suppose that X1, . . . , Xn are independent and Poisson distributed,

X_i \overset{iid}{\sim} \mathrm{Poisson}(\lambda_0).

The log-likelihood function is

l_n(\lambda \mid X) = \log(\lambda) \sum_{i=1}^{n} X_i - \sum_{i=1}^{n} \log(X_i!) - n\lambda.

Differentiating with respect to λ, we obtain the score function

S(\lambda \mid X) = \frac{\partial l_n(\lambda \mid X)}{\partial \lambda} = \frac{1}{\lambda} \sum_{i=1}^{n} X_i - n

and the ML estimator

\hat{\lambda}_{ML} = \frac{1}{n} \sum_{i=1}^{n} X_i.
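The closed-form MLE can be sanity-checked numerically. Below is a minimal Python sketch, assuming a made-up sample of counts and a simple grid search (both illustrative choices, not part of the handout):

```python
import math

# Hypothetical sample of counts (any nonnegative integers work).
data = [3, 1, 4, 1, 5, 2, 2, 6, 0, 3]
n = len(data)

def log_likelihood(lam):
    # l_n(lambda | X) = log(lambda) * sum(X_i) - sum(log(X_i!)) - n*lambda,
    # with log(k!) computed as lgamma(k + 1).
    return (math.log(lam) * sum(data)
            - sum(math.lgamma(x + 1) for x in data)
            - n * lam)

# Closed-form MLE: the sample mean.
mle = sum(data) / n

# A grid search over lambda should land next to the same value,
# since the log-likelihood is concave in lambda.
grid = [0.01 * k for k in range(1, 1001)]  # lambda in (0, 10]
best = max(grid, key=log_likelihood)
print(mle, best)
```

The grid maximizer agrees with the sample mean up to the grid spacing.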

The second derivative of the log-likelihood function is

\frac{\partial^2 l_n(\lambda \mid X)}{\partial \lambda^2} = -\frac{1}{\lambda^2} \sum_{i=1}^{n} X_i,

which yields the observed Fisher information

I(\lambda \mid X) = -\frac{\partial^2 l_n(\lambda \mid X)}{\partial \lambda^2} = \frac{1}{\lambda^2} \sum_{i=1}^{n} X_i

and the expected Fisher information

I(\lambda) = -\mathbb{E}\left( \frac{\partial^2 l_n(\lambda \mid X)}{\partial \lambda^2} \right) = \frac{n\lambda}{\lambda^2} = \frac{n}{\lambda}.

Therefore the MLE is approximately normally distributed with mean λ0 and variance λ0/n.
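This approximation can be checked by simulation. The sketch below draws Poisson variates in pure Python via Knuth's product-of-uniforms method; the parameter values (λ0 = 4, n = 50, 5000 replications) are arbitrary illustrative choices:

```python
import math
import random

random.seed(0)

def sample_poisson(lam):
    # Knuth's method: count uniforms until their product drops below exp(-lam).
    limit = math.exp(-lam)
    k, prod = 0, random.random()
    while prod > limit:
        k += 1
        prod *= random.random()
    return k

lam0, n, reps = 4.0, 50, 5000
estimates = []
for _ in range(reps):
    sample = [sample_poisson(lam0) for _ in range(n)]
    estimates.append(sum(sample) / n)  # MLE = sample mean

mean_hat = sum(estimates) / reps
var_hat = sum((e - mean_hat) ** 2 for e in estimates) / reps
print(mean_hat, var_hat)  # mean close to lam0, variance close to lam0 / n
```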

Maximum Likelihood Estimation (Addendum), Apr 8, 2004

Example

Fitting a Poisson distribution (misspecified case)

Now suppose that the variables X_i are binomially distributed,

X_i \overset{iid}{\sim} \mathrm{Bin}(m, \theta_0).

How does the MLE λ̂ML of the fitted Poisson model relate to the true distribution?

The “distance” between the fitted model and the true model can be measured by the Kullback-Leibler distance,

\mathbb{E}\left( \log \frac{f_{Bin}(X \mid \theta_0)}{f_{Poiss}(X \mid \lambda)} \right) = \mathbb{E}\left( \log f_{Bin}(X \mid \theta_0) \right) - \mathbb{E}\left( \log f_{Poiss}(X \mid \lambda) \right)

= \mathbb{E}\left( \lambda - X \log(\lambda) \right) + \text{terms constant in } \lambda

= \lambda - m\,\theta_0 \log(\lambda) + \text{terms constant in } \lambda.

Differentiating with respect to λ, we obtain

1 - \frac{m\theta_0}{\lambda} = 0 \quad\Leftrightarrow\quad \lambda = m\theta_0.

Thus the MLE λ̂ML converges to λ0 = m θ0.
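A small simulation illustrates this convergence. The sketch draws binomial data as sums of Bernoulli trials in pure Python; m = 10, θ0 = 0.3 and the sample size are illustrative choices, giving λ0 = mθ0 = 3:

```python
import random

random.seed(0)

def sample_binomial(m, theta):
    # Sum of m Bernoulli(theta) trials.
    return sum(random.random() < theta for _ in range(m))

m, theta0, n = 10, 0.3, 20000
data = [sample_binomial(m, theta0) for _ in range(n)]

# The fitted Poisson MLE is still the sample mean; for large n it
# approaches lambda_0 = m * theta_0 = 3.
lam_hat = sum(data) / n
print(lam_hat)
```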


Asymptotic Properties of the MLE

Let θ̂ be the MLE for θ0. Taylor expansion of the score function at θ̂ about θ0 yields

\frac{\partial l_n(\hat{\theta} \mid Y)}{\partial \theta} \approx \frac{\partial l_n(\theta_0 \mid Y)}{\partial \theta} + \frac{\partial^2 l_n(\theta_0 \mid Y)}{\partial \theta^2} (\hat{\theta} - \theta_0) \qquad (1)

and hence

\hat{\theta} - \theta_0 \approx -\left( \frac{\partial^2 l_n(\theta_0 \mid Y)}{\partial \theta^2} \right)^{-1} \frac{\partial l_n(\theta_0 \mid Y)}{\partial \theta},

since the left side of (1) is zero. Furthermore since

-\frac{\partial^2 l_n(\theta_0 \mid Y)}{\partial \theta^2} = I(\theta_0 \mid Y) \approx I(\theta_0)

and

\mathbb{E}\left( \frac{\partial l_n(\theta_0 \mid Y)}{\partial \theta} \right) = \mathbb{E}\left( \frac{\partial \log f(Y \mid \theta_0)}{\partial \theta} \right) = 0,

this suggests that

\mathrm{var}\left( \hat{\theta} - \theta_0 \right) \approx I(\theta_0)^{-1}\, \mathbb{E}\left[ \left( \frac{\partial l_n(\theta_0 \mid Y)}{\partial \theta} \right)^2 \right] I(\theta_0)^{-1}.

If the model is correctly specified, we have

\mathbb{E}\left( \frac{\partial^2 l_n(\theta_0 \mid Y)}{\partial \theta^2} \right) = \mathbb{E}\left( \frac{\partial^2 \log f(Y \mid \theta_0)}{\partial \theta^2} \right) = \mathbb{E}\left[ \frac{\partial}{\partial \theta} \left( \frac{\partial f(Y \mid \theta_0)}{\partial \theta} \frac{1}{f(Y \mid \theta_0)} \right) \right]

= \mathbb{E}\left[ \frac{\partial^2 f(Y \mid \theta_0)}{\partial \theta^2} \frac{1}{f(Y \mid \theta_0)} \right] - \mathbb{E}\left[ \left( \frac{\partial f(Y \mid \theta_0)}{\partial \theta} \right)^2 \left( \frac{1}{f(Y \mid \theta_0)} \right)^2 \right].

Noting that

\mathbb{E}\left[ \frac{\partial^2 f(Y \mid \theta_0)}{\partial \theta^2} \frac{1}{f(Y \mid \theta_0)} \right] = \int \frac{\partial^2 f(y \mid \theta_0)}{\partial \theta^2}\, dy = \frac{\partial^2}{\partial \theta^2} \int f(y \mid \theta_0)\, dy = 0,

we obtain

I(\theta_0) = \mathbb{E}\left[ \left( \frac{\partial f(Y \mid \theta_0)}{\partial \theta} \frac{1}{f(Y \mid \theta_0)} \right)^2 \right] = \mathbb{E}\left[ \left( \frac{\partial \log f(Y \mid \theta_0)}{\partial \theta} \right)^2 \right] = \mathbb{E}\left[ \left( \frac{\partial l_n(\theta_0 \mid Y)}{\partial \theta} \right)^2 \right] = \mathbb{E}\left[ S(\theta_0 \mid Y)^2 \right].
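Both facts used above, the mean-zero score and the information identity I(θ0) = E[S(θ0|Y)²], can be verified by simulation in the correctly specified Poisson case. The sketch below uses Knuth's Poisson sampler and arbitrary illustrative parameters λ0 = 2, n = 40:

```python
import math
import random

random.seed(0)

def sample_poisson(lam):
    # Knuth's method: count uniforms until their product drops below exp(-lam).
    limit = math.exp(-lam)
    k, prod = 0, random.random()
    while prod > limit:
        k += 1
        prod *= random.random()
    return k

lam0, n, reps = 2.0, 40, 10000
scores = []
for _ in range(reps):
    total = sum(sample_poisson(lam0) for _ in range(n))
    scores.append(total / lam0 - n)  # S(lambda_0 | X) = sum(X_i)/lambda_0 - n

mean_score = sum(scores) / reps
second_moment = sum(s * s for s in scores) / reps
print(mean_score, second_moment)  # ~0 and ~I(lambda_0) = n / lambda_0 = 20
```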


Example

Fitting a Poisson distribution (misspecified case)

The variance of the MLE λ̂ can be approximated by

I(\lambda_0)^{-1}\, \mathbb{E}\left[ S(\lambda_0 \mid Y)^2 \right] I(\lambda_0)^{-1}.

Using the formulas for the first and second derivative, we find that

\mathbb{E}\left[ S(\lambda_0 \mid Y)^2 \right] = \mathbb{E}\left[ \left( \frac{1}{\lambda_0} \sum_{i=1}^{n} X_i - n \right)^2 \right] = \frac{n\,m\,\theta_0\,(1-\theta_0)}{\lambda_0^2} + \frac{n^2 m^2 \theta_0^2}{\lambda_0^2} - \frac{2\,n^2 m\,\theta_0}{\lambda_0} + n^2

and, using \lambda_0 = m\theta_0, the last three terms cancel, so that

\mathbb{E}\left[ S(\lambda_0 \mid Y)^2 \right] = n\, \frac{1-\theta_0}{\lambda_0} = n \left( \frac{1}{m\theta_0} - \frac{1}{m} \right)

and

I(\lambda_0) = \mathbb{E}\left( \frac{1}{\lambda_0^2} \sum_{i=1}^{n} X_i \right) = \frac{n\,m\,\theta_0}{\lambda_0^2} = \frac{n}{m\theta_0},

where we used that

\sum_{i=1}^{n} X_i \sim \mathrm{Bin}(mn, \theta_0).

Hence the variance of λ̂ becomes

\frac{m^2 \theta_0^2}{n} \left( \frac{1}{m\theta_0} - \frac{1}{m} \right) = \frac{m\,\theta_0\,(1-\theta_0)}{n}.
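This sandwich variance can be checked against a simulation of the misspecified setting. The sketch below repeats the binomial-data experiment many times (illustrative parameters m = 10, θ0 = 0.3, n = 50) and compares the empirical variance of λ̂ with the sandwich value mθ0(1−θ0)/n and the naive Poisson value λ0/n = mθ0/n:

```python
import random

random.seed(0)

def sample_binomial(m, theta):
    # Sum of m Bernoulli(theta) trials.
    return sum(random.random() < theta for _ in range(m))

m, theta0, n, reps = 10, 0.3, 50, 5000
estimates = []
for _ in range(reps):
    data = [sample_binomial(m, theta0) for _ in range(n)]
    estimates.append(sum(data) / n)  # Poisson MLE = sample mean

mean_hat = sum(estimates) / reps
var_hat = sum((e - mean_hat) ** 2 for e in estimates) / reps

sandwich = m * theta0 * (1 - theta0) / n  # m*theta0*(1-theta0)/n = 0.042
naive = m * theta0 / n                    # lambda_0/n = 0.06, too large here
print(var_hat, sandwich, naive)
```

Binomial counts are underdispersed relative to a Poisson with the same mean (variance mθ0(1−θ0) versus mean mθ0), so the naive Poisson variance overstates the true sampling variance of λ̂.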
