Hypothesis Testing - University of...

Introduction Large Sample Testing Composite Hypotheses

Hypothesis Testing

Daniel Schmierer, Econ 312

March 30, 2007


Basics

Parameter of interest: θ ∈ Θ

Structure of the test:

H0: θ ∈ Θ0

H1: θ ∈ Θ1

for some sets Θ0, Θ1 ⊂ Θ where Θ0 ∩ Θ1 = ∅ (often Θ1 = Θ − Θ0).


Basics

Type I error occurs when we reject a true null hypothesis.

Type II error occurs when we fail to reject a false null hypothesis.

Size of a test (α): Pr(Type I error).

Power of a test (1− β): 1− Pr(Type II error).

A test is better if (all else equal) it has a higher power and/or lower size.
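As a small illustration of size and power, the sketch below computes both for a one-sided z-test of a normal mean with known variance. The test, the function name `z_test_power`, and all numerical values are assumptions chosen for this example, not part of the original slides.

```python
from statistics import NormalDist

def z_test_power(mu_alt, mu_null=0.0, sigma=1.0, n=25, alpha=0.05):
    """Rejection probability of a one-sided z-test of H0: mu = mu_null
    against H1: mu > mu_null when the true mean is mu_alt (sigma known)."""
    std_normal = NormalDist()
    critical_value = std_normal.inv_cdf(1.0 - alpha)   # reject if z > critical_value
    shift = (mu_alt - mu_null) * n ** 0.5 / sigma       # mean of z when the truth is mu_alt
    return 1.0 - std_normal.cdf(critical_value - shift)

# Hypothetical values: the null mean is 0, the alternative of interest is 0.5.
size = z_test_power(mu_alt=0.0)    # truth equals the null value: Pr(Type I error)
power = z_test_power(mu_alt=0.5)   # truth lies in the alternative: 1 - Pr(Type II error)
print(f"size = {size:.3f}, power at mu = 0.5: {power:.3f}")
```

Evaluating the same function at the null value returns the size, and evaluating it at a point of the alternative returns the power there.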


Consistency

For a given critical (rejection) region Cn at sample size n, the size of the test is

αn = Pr(y ∈ Cn|θ ∈ Θ0)

and the power is

πn(θ) = Pr(y ∈ Cn|θ) for θ ∈ Θ1

Size generally doesn’t depend on the alternative hypothesis, but power does.

A test is called consistent if $\lim_{n\to\infty} \pi_n(\theta) = 1$ for all θ ∈ Θ1 (that is, if in the limit the test always rejects false null hypotheses).
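Consistency can be seen in a small Monte Carlo experiment: at a fixed alternative, the rejection frequency should approach one as n grows. The sketch below assumes a two-sided z-test of H0: μ = 0 with unit variance and a true mean of 0.3. The test, the values, and the helper `rejection_rate` are hypothetical choices for illustration.

```python
import random
from statistics import NormalDist

def rejection_rate(n, true_mean, reps=400, alpha=0.05, seed=0):
    """Share of simulated samples of size n in which a two-sided z-test
    of H0: mu = 0 (unit variance assumed known) rejects."""
    rng = random.Random(seed)
    critical_value = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    rejections = 0
    for _ in range(reps):
        sample_mean = sum(rng.gauss(true_mean, 1.0) for _ in range(n)) / n
        z = sample_mean * n ** 0.5          # standardized sample mean under H0
        if abs(z) > critical_value:
            rejections += 1
    return rejections / reps

# Estimated power at the fixed alternative mu = 0.3 for growing sample sizes.
rates = {n: rejection_rate(n, true_mean=0.3) for n in (10, 50, 200, 1000)}
for n, rate in rates.items():
    print(n, rate)
```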


Standard case – Maximum Likelihood

Objective function is $\ln L(y;\theta) = \sum_{i=1}^{n} \ln f(y_i;\theta)$ with $\theta \in \mathbb{R}^k$.

Let the test be

H0: θ = θ0

H1: θ ≠ θ0

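Note that

$\sqrt{n}(\hat\theta - \theta_0) \xrightarrow{d} N(0, I_{\theta_0}^{-1})$

and

$\sqrt{n}\left(\frac{1}{n}\left.\frac{\partial \ln L(y;\theta)}{\partial\theta}\right|_{\theta_0}\right) \xrightarrow{d} N(0, I_{\theta_0})$

where

$I_{\theta_0} = -E\left[\frac{\partial^2 \ln L}{\partial\theta\,\partial\theta'}\right]$

So standardizing and squaring these quantities will give asymptotic $\chi^2$ distributions.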
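The "standardize and square" step rests on a standard result for quadratic forms in normal vectors. It is written out here as a sketch, with $Z_n$ and $V$ as generic symbols introduced only for this note and $V$ assumed nonsingular:

$$Z_n \xrightarrow{d} N(0, V),\quad V \text{ a nonsingular } k\times k \text{ matrix} \;\Longrightarrow\; Z_n' V^{-1} Z_n \xrightarrow{d} \chi^2_k$$

Taking $Z_n = \sqrt{n}(\hat\theta - \theta_0)$ with $V = I_{\theta_0}^{-1}$ gives a quadratic form weighted by $I_{\theta_0}$. Taking $Z_n$ to be the scaled score with $V = I_{\theta_0}$ gives one weighted by $I_{\theta_0}^{-1}$.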


Trinity

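Wald test statistic:

$\sqrt{n}(\hat\theta - \theta_0)'\, I_{\hat\theta}\, \sqrt{n}(\hat\theta - \theta_0) \overset{a}{\sim} \chi^2_k$

Rao’s Score (LM) test statistic:

$\sqrt{n}\left(\frac{1}{n}\left.\frac{\partial \ln L(y;\theta)}{\partial\theta}\right|_{\theta_0}\right)' I_{\hat\theta}^{-1}\, \sqrt{n}\left(\frac{1}{n}\left.\frac{\partial \ln L(y;\theta)}{\partial\theta}\right|_{\theta_0}\right) \overset{a}{\sim} \chi^2_k$

Likelihood Ratio test statistic:

$-2\ln\left[\frac{L(y;\theta_0)}{L(y;\hat\theta)}\right] \overset{a}{\sim} \chi^2_k$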
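To make the three statistics concrete, here is a minimal sketch for an i.i.d. Bernoulli(p) sample with H0: p = p0, so k = 1. The model, the data (60 successes in 100 draws) and the function names are assumptions for illustration. The information used is the per-observation information 1/(p(1−p)). The score statistic evaluates it at p0 rather than at the unrestricted estimate, a common variant that is consistent for the same quantity under H0.

```python
import math

def trinity_bernoulli(successes, n, p0):
    """Wald, score (LM) and LR statistics for H0: p = p0 in an i.i.d.
    Bernoulli(p) sample; requires 0 < successes < n."""
    if not 0 < successes < n:
        raise ValueError("MLE on the boundary; statistics are undefined")
    p_hat = successes / n                        # unrestricted MLE

    def info(p):                                 # per-observation Fisher information
        return 1.0 / (p * (1.0 - p))

    def loglik(p):
        return successes * math.log(p) + (n - successes) * math.log(1.0 - p)

    wald = n * (p_hat - p0) ** 2 * info(p_hat)
    avg_score = (p_hat - p0) / (p0 * (1.0 - p0))  # (1/n) d lnL / dp evaluated at p0
    score = n * avg_score ** 2 / info(p0)         # information at p0 (assumption of this sketch)
    lr = -2.0 * (loglik(p0) - loglik(p_hat))
    return wald, score, lr

def chi2_1_pvalue(stat):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))

# Hypothetical data: 60 successes in 100 trials, testing p0 = 0.5.
wald, score, lr = trinity_bernoulli(successes=60, n=100, p0=0.5)
for name, stat in (("Wald", wald), ("LM", score), ("LR", lr)):
    print(f"{name}: {stat:.3f}  p-value: {chi2_1_pvalue(stat):.4f}")
```

The three values differ in finite samples even though they share the same limiting distribution under H0.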


Graphical Interpretation


Composite Hypotheses

A test has a composite null hypothesis if the null hypothesis contains more than one possible value of θ.

In particular, say $\theta = (\theta_1, \theta_2)$ with $\theta_1$ of dimension k×1 and $\theta_2$ of dimension l×1, then a common test is

H0: $\theta_1 = \theta_1^0$, $\theta_2$ unrestricted

H1: θ unrestricted

Can go over an example of this for maximum likelihood.


ML Example

Let true parameter vector be θ0 ≡ (θ̄1, θ̄2)

Define estimated parameter vector (unconstrained) as $\hat\theta \equiv (\hat\theta_1, \hat\theta_2)$

Define estimated parameter vector (constrained, i.e. under the null hypothesis setting $\theta_1 = \theta_1^0$) as $\tilde\theta \equiv (\theta_1^0, \tilde\theta_2)$

Decompose the information matrix into the following blocks

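$I_{\theta_0} \equiv \begin{pmatrix} I_{\theta_1\theta_1} & I_{\theta_1\theta_2} \\ I_{\theta_2\theta_1} & I_{\theta_2\theta_2} \end{pmatrix}$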


ML Example

From asymptotic normality results we know that

$\sqrt{n}(\hat\theta_1 - \theta_1^0) \xrightarrow{d} N(0, \text{upper left block of } I_{\theta_0}^{-1})$

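By partitioned inverse results, the upper left block of $I_{\theta_0}^{-1}$ is

$(I^{11}_{\theta_0})^{-1} = \Big( \underset{(k\times k)}{I_{\theta_1\theta_1}} - \underset{(k\times l)}{I_{\theta_1\theta_2}}\; \underset{(l\times l)}{I^{-1}_{\theta_2\theta_2}}\; \underset{(l\times k)}{I_{\theta_2\theta_1}} \Big)^{-1}$

where the (k×k) matrix $I^{11}_{\theta_0}$ is defined as

$I^{11}_{\theta_0} = I_{\theta_1\theta_1} - I_{\theta_1\theta_2} I^{-1}_{\theta_2\theta_2} I_{\theta_2\theta_1}$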
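The partitioned-inverse identity, that the upper left block of $I_{\theta_0}^{-1}$ equals $(I_{\theta_1\theta_1} - I_{\theta_1\theta_2} I^{-1}_{\theta_2\theta_2} I_{\theta_2\theta_1})^{-1}$, can be checked numerically. The sketch below does this for the simplest case k = l = 1, where every block is a scalar. The matrix entries and function names are made up for illustration.

```python
def upper_left_of_inverse(a, b, d):
    """Upper-left entry of the inverse of the symmetric 2x2 matrix
    [[a, b], [b, d]], read off the explicit 2x2 inverse."""
    det = a * d - b * b
    return d / det

def schur_complement(a, b, d):
    """I_11 - I_12 * I_22^{-1} * I_21 when all blocks are scalars."""
    return a - b * (1.0 / d) * b

# Hypothetical information matrix with k = l = 1 (values chosen for illustration).
a, b, d = 2.0, 0.5, 1.0
block_of_inverse = upper_left_of_inverse(a, b, d)
via_schur = 1.0 / schur_complement(a, b, d)
print(block_of_inverse, via_schur)   # the two numbers should agree
```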


ML Example

Wald test statistic:

$W = \sqrt{n}(\hat\theta_1 - \theta_1^0)'\, I^{11}_{\theta_0}\, \sqrt{n}(\hat\theta_1 - \theta_1^0)$

Wald uses unrestricted estimate.

It checks whether the null hypothesis and the relevant portion of the unrestricted estimate (which is the best choice of parameters under the alternative hypothesis) are very far apart.

Intuitively, if the null hypothesis were true, then by the consistency of the ML estimator, the best choice under the alternative hypothesis should be getting close to the null hypothesis.


ML Example

Score test statistic:

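$LM = \left(\frac{1}{\sqrt{n}}\left.\frac{\partial \ln L}{\partial\theta_1}\right|_{\tilde\theta}\right)' (I^{11}_{\theta_0})^{-1} \left(\frac{1}{\sqrt{n}}\left.\frac{\partial \ln L}{\partial\theta_1}\right|_{\tilde\theta}\right)$

LM uses the restricted estimate.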

It checks whether the relevant portion of the score vector at the restricted estimate (which is the best choice of parameters under the null hypothesis) is close to zero.

Intuitively, if the null hypothesis were true, then the gradient of the likelihood should be zero in the population at that parameter value and so restricting ourselves to that parameter value should produce a gradient close to zero.


ML Example

Likelihood ratio test statistic:

$LR = -2\ln\left[\frac{L(y;\tilde\theta)}{L(y;\hat\theta)}\right]$

Compares the highest value of the likelihood under the null hypothesis with the highest value of the likelihood under the alternative hypothesis.

Intuitively, if the null hypothesis were true (and under our regularity conditions about the uniform convergence of the likelihood function), then the maximum of the likelihood under the null hypothesis and the maximum of the likelihood under the alternative hypothesis should be close.


ML Example

All three are asymptotically distributed as $\chi^2_k$.

Show asymptotic equivalence using Taylor expansions (see Asymptotic Theory Part IV notes).
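A worked instance of the composite case, offered as a sketch: an i.i.d. N(μ, σ²) sample with H0: μ = μ0 and σ² unrestricted, so θ1 = μ and θ2 = σ² (k = l = 1). The model and data are assumptions for illustration. Since the true θ0 is unknown in practice, the information block 1/σ² is estimated at the unrestricted estimate for the Wald statistic and at the restricted estimate for the LM statistic, which is a choice made here rather than something the slides specify.

```python
import math

def normal_mean_tests(y, mu0):
    """Wald, LM and LR statistics for H0: mu = mu0 with the variance
    unrestricted, in an i.i.d. N(mu, sigma^2) sample."""
    n = len(y)
    mu_hat = sum(y) / n                                   # unrestricted MLE of mu
    var_hat = sum((v - mu_hat) ** 2 for v in y) / n       # unrestricted MLE of sigma^2
    var_tilde = sum((v - mu0) ** 2 for v in y) / n        # restricted MLE of sigma^2
    # The information matrix is block diagonal in this model, so the relevant
    # block is 1 / sigma^2, estimated at theta-hat (Wald) or theta-tilde (LM).
    wald = n * (mu_hat - mu0) ** 2 / var_hat
    score_mu = math.sqrt(n) * (mu_hat - mu0) / var_tilde  # (1/sqrt(n)) d lnL / d mu at theta-tilde
    lm = score_mu ** 2 * var_tilde
    lr = n * math.log(var_tilde / var_hat)                # -2 ln[L(theta-tilde) / L(theta-hat)]
    return wald, lm, lr

y = [0.8, 1.9, 0.3, 1.2, 2.4, -0.5, 1.1, 0.7, 1.6, 0.9]   # made-up data
wald, lm, lr = normal_mean_tests(y, mu0=0.0)
print(f"W = {wald:.3f}, LM = {lm:.3f}, LR = {lr:.3f}")
```

In this model the three statistics are exact functions of one another (LR = n ln(1 + W/n) and LM = W / (1 + W/n)), which gives W ≥ LR ≥ LM in every sample while the differences vanish as n grows under H0.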


Good references:

Engle, R. (1984), “Wald, Likelihood Ratio and Lagrange Multiplier Tests in Econometrics,” Handbook of Econometrics, Vol. II, Ch. 13.

Newey, W. and McFadden, D. (1994), “Large Sample Estimation and Hypothesis Testing,” Handbook of Econometrics, Vol. IV, Ch. 36.

Both are on Google Scholar.
