Hypothesis Testing Under General Linear Model


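Previously we derived the sampling property results assuming normality:

$Y = X\beta + e$, where $e_t \sim N(0, \sigma^2)$ → $Y \sim N(X\beta, \sigma^2 I_T)$

$\beta_s = (X'X)^{-1}X'Y$, $E(\beta_s) = \beta$, $\mathrm{Cov}(\beta_s) = \Sigma_{\beta_s} = \sigma^2 (X'X)^{-1}$

$\beta_l \sim N(\beta, \sigma^2 (X'X)^{-1})$

$\hat\sigma_U^2 = \dfrac{e_l' e_l}{T-K}$ is an unbiased estimate of $\sigma^2$, where $e_l = y - X\beta_l$

An estimate of $\mathrm{Cov}(\beta_s)$ is $\hat\Sigma_{\beta_s} = \hat\sigma_U^2 (X'X)^{-1}$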


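Single Parameter ($\beta_{k,l}$) Hypothesis Test

$\beta_{k,l} \sim N(\beta_k, \mathrm{Var}(\beta_k))$, where $\mathrm{Var}(\beta_k)$ is the kth diagonal element of $\Sigma_{\beta_s}$ and $\beta_k$ is the unknown true coefficient.

When $\sigma^2$ is known:

$z = \dfrac{\beta_{k,l} - \beta_k}{\sqrt{\mathrm{var}(\beta_k)}} \sim N(0,1)$

When $\sigma^2$ is not known:

$t = \dfrac{\beta_{k,l} - \beta_k}{\sqrt{\widehat{\mathrm{var}}(\beta_k)}} \sim t_{T-K}$, where $\widehat{\mathrm{var}}(\beta_k)$ is the kth diagonal element of $\hat\Sigma_{\beta_s} = \hat\sigma_U^2 (X'X)^{-1}$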
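As a small numerical sketch of the t ratio, the Python function below computes (estimate - hypothesized value) / standard error. The function name `t_statistic` is an illustrative assumption, not something from the slides. The inputs are the lnPB and ln(INC) rows of the beer consumption table reported later in this deck, where T - K = 25.

```python
def t_statistic(estimate, std_error, hypothesized=0.0):
    """t ratio for H0: beta_k = hypothesized, using the estimated std. error."""
    return (estimate - hypothesized) / std_error

# lnPB row of the beer example: H0 is beta_1 = 0
t_pb = t_statistic(-1.020, 0.239)

# ln(INC) row: H0 is beta_4 = 1 (unitary income elasticity)
t_inc = t_statistic(0.923, 0.416, hypothesized=1.0)

print(round(t_pb, 2), round(t_inc, 3))
```

The first ratio reproduces the T-Stat column of that table; the second shows that testing a non-zero hypothesized value only changes the numerator.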


Can obtain a $(1-\alpha)$ CI for $\beta_k$:

$\beta_{k,l} - t_{\alpha/2,\,T-K}\sqrt{\widehat{\mathrm{var}}(\beta_k)} \;\le\; \beta_k \;\le\; \beta_{k,l} + t_{\alpha/2,\,T-K}\sqrt{\widehat{\mathrm{var}}(\beta_k)}$

There is a $(1-\alpha)$ probability that the true unknown value of $\beta_k$ is within this range.

Does this interval contain our hypothesized value? If it does, then we cannot reject H0.
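The interval check can be sketched in a few lines of Python. The helper names `confidence_interval` and `contains` are assumptions made for illustration. The estimate and standard error are the lnPB row of the beer regression reported later in the deck, and 2.060 is the tabulated two-sided 5% critical value of t with 25 degrees of freedom.

```python
def confidence_interval(estimate, std_error, t_crit):
    """(1 - alpha) interval: estimate -/+ t_crit * std_error."""
    half_width = t_crit * std_error
    return estimate - half_width, estimate + half_width

def contains(interval, value):
    """True when the hypothesized value lies inside the interval."""
    lower, upper = interval
    return lower <= value <= upper

# lnPB: estimate -1.020, std. error 0.239, t_{0.025, 25} = 2.060
ci_pb = confidence_interval(-1.020, 0.239, 2.060)
reject_zero = not contains(ci_pb, 0.0)   # H0: beta_1 = 0

print(ci_pb, reject_zero)
```

Because zero lies outside the interval, H0: β1 = 0 is rejected at the 5% level, which agrees with the t ratio for that coefficient.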


Testing More Than One Linear Combination of Estimated Coefficients

Assume we have a-priori information about the value of β.

We can represent this information via a set of J linear hypotheses (or restrictions):

$\sum_{k=1}^{K} R_{jk}\beta_k = r_j, \quad j = 1, 2, \ldots, J \quad (J \le K)$

In matrix notation:

$\underset{(J \times K)}{R}\;\underset{(K \times 1)}{\beta} = \underset{(J \times 1)}{r}$


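$\underset{(J \times K)}{R}\;\underset{(K \times 1)}{\beta} = \underset{(J \times 1)}{r}$

$R = \begin{bmatrix} R_{11} & R_{12} & \cdots & R_{1K} \\ R_{21} & R_{22} & \cdots & R_{2K} \\ \vdots & \vdots & & \vdots \\ R_{J1} & R_{J2} & \cdots & R_{JK} \end{bmatrix}, \qquad r = \begin{bmatrix} r_1 \\ r_2 \\ \vdots \\ r_J \end{bmatrix}$

R and r contain known coefficients.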


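Assume we have a model with 5 parameters to be estimated.

Joint hypotheses: β1 = 8 and β2 = β3 (that is, β2 - β3 = 0), so J = 2, K = 5.

$R\beta = \begin{bmatrix} 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & -1 & 0 \end{bmatrix} \begin{bmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \\ \beta_3 \\ \beta_4 \end{bmatrix} = \begin{bmatrix} 8 \\ 0 \end{bmatrix} = r$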
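To make the mapping from hypotheses to R and r concrete, here is a minimal Python sketch. The second row carries a -1 on β3 because the restriction is β2 - β3 = 0. The helper `restriction_gap` and the two β vectors are hypothetical, chosen only to show one vector that satisfies H0 and one that does not.

```python
# H0: beta_1 = 8 and beta_2 - beta_3 = 0, with beta = (beta_0, ..., beta_4)
R = [
    [0, 1, 0, 0, 0],    # picks out beta_1
    [0, 0, 1, -1, 0],   # beta_2 - beta_3
]
r = [8, 0]

def restriction_gap(R, beta, r):
    """Return R*beta - r; all zeros means beta satisfies every restriction."""
    return [sum(Rjk * bk for Rjk, bk in zip(row, beta)) - rj
            for row, rj in zip(R, r)]

beta_ok = [2.0, 8.0, 1.5, 1.5, -3.0]    # hypothetical values satisfying H0
beta_bad = [2.0, 7.0, 1.5, 0.5, -3.0]   # hypothetical values violating H0

print(restriction_gap(R, beta_ok, r))
print(restriction_gap(R, beta_bad, r))
```

Each row of R holds the weights of one linear combination, and the matching entry of r is the value that combination takes under H0.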


How do we obtain parameter estimates if the J hypotheses are true? Constrained (Restricted) Least Squares.

$\beta_R$ is the β that minimizes $S = (Y - X\beta)'(Y - X\beta) = e'e$ s.t. $R\beta = r$, i.e., we act as if H0 is true.

$S^* = (Y - X\beta)'(Y - X\beta) + \lambda'(r - R\beta)$

λ is the (J x 1) vector of Lagrangian multipliers associated with the J joint hypotheses.

We want to choose β such that we minimize SSE but also satisfy the J constraints (hypotheses); the solution is $\beta_R$.


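Min. $S^* = (Y - X\beta)'(Y - X\beta) + \lambda'(r - R\beta)$

What and how many FOC's? K+J FOC's:

$\dfrac{\partial S^*}{\partial \beta} = \left( \dfrac{\partial S^*}{\partial \beta_1}, \ldots, \dfrac{\partial S^*}{\partial \beta_K} \right)' = 0$ (K FOC's)

$\dfrac{\partial S^*}{\partial \lambda} = r - R\beta = 0$ (J FOC's)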


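What are the FOC's for $S^* = (Y - X\beta)'(Y - X\beta) + \lambda'(r - R\beta)$?

$\dfrac{\partial S^*}{\partial \beta} = -2X'Y + 2X'X\beta_R - R'\lambda = 0$ (K x 1)

Move $-2X'Y$ and $-R'\lambda$ to the RHS and divide by 2:

$X'X\beta_R = X'Y + \dfrac{R'\lambda}{2}$

$\beta_R = (X'X)^{-1}X'Y + (X'X)^{-1}\dfrac{R'\lambda}{2} = \beta_s + (X'X)^{-1}\dfrac{R'\lambda}{2}$

where $(X'X)^{-1}X'Y = \beta_s$ is the CRM estimator.

Substitute these FOC into the 2nd set, $\partial S^*/\partial\lambda = (r - R\beta_R) = 0_J$ →

$r = R\beta_s + R(X'X)^{-1}\dfrac{R'\lambda}{2}$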


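$r - R\beta_s = \left[ R(X'X)^{-1}R' \right] \dfrac{\lambda}{2}$

$\dfrac{\lambda}{2} = \left[ R(X'X)^{-1}R' \right]^{-1} (r - R\beta_s)$

The 1st FOC:

$\beta_R = (X'X)^{-1}X'Y + (X'X)^{-1}\dfrac{R'\lambda}{2}$

Substitute the expression for λ/2 into the 1st FOC:

$\beta_R = \beta_s + (X'X)^{-1}R'\left[ R(X'X)^{-1}R' \right]^{-1}(r - R\beta_s)$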
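The closed form $\beta_R = \beta_s + (X'X)^{-1}R'[R(X'X)^{-1}R']^{-1}(r - R\beta_s)$ can be tried out numerically. The sketch below uses only the Python standard library; the helper functions and the small data set are hypothetical and serve only to illustrate the formula, not to reproduce any result from the slides.

```python
def transpose(a):
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def inverse(a):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(aug[i][c]))
        aug[c], aug[p] = aug[p], aug[c]
        piv = aug[c][c]
        aug[c] = [v / piv for v in aug[c]]
        for i in range(n):
            if i != c:
                f = aug[i][c]
                aug[i] = [v - f * w for v, w in zip(aug[i], aug[c])]
    return [row[n:] for row in aug]

def restricted_ls(X, y, R, r):
    """Return (beta_s, beta_R): unrestricted and restricted LS estimates."""
    Xt = transpose(X)
    XtX_inv = inverse(matmul(Xt, X))
    beta_s = matmul(XtX_inv, matmul(Xt, y))            # (X'X)^{-1} X'Y
    A = matmul(XtX_inv, transpose(R))                  # (X'X)^{-1} R'
    middle = inverse(matmul(R, A))                     # [R (X'X)^{-1} R']^{-1}
    Rb = matmul(R, beta_s)
    gap = [[ri[0] - rb[0]] for ri, rb in zip(r, Rb)]   # r - R beta_s
    adj = matmul(A, matmul(middle, gap))
    beta_R = [[b[0] + d[0]] for b, d in zip(beta_s, adj)]
    return beta_s, beta_R

def sse(X, y, beta):
    fitted = matmul(X, beta)
    return sum((yi[0] - fi[0]) ** 2 for yi, fi in zip(y, fitted))

# Hypothetical data: intercept plus two regressors, T = 6
X = [[1, 1, 2], [1, 2, 1], [1, 3, 4], [1, 4, 3], [1, 5, 6], [1, 6, 5]]
y = [[3.1], [2.9], [7.2], [6.8], [11.1], [10.9]]
R = [[0, 1, -1]]   # H0: beta_1 = beta_2
r = [[0]]

beta_s, beta_R = restricted_ls(X, y, R, r)
print(beta_s, beta_R, sse(X, y, beta_s), sse(X, y, beta_R))
```

Two properties are worth checking on any data set: the restricted estimate satisfies Rβ = r exactly (up to rounding), and its SSE is never below the unrestricted SSE.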


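$\beta_R$ is the restricted LS estimator of β as well as the restricted ML estimator.

Properties of the Restricted Least Squares Estimator

$E(\beta_R) = \beta + (X'X)^{-1}R'\left[ R(X'X)^{-1}R' \right]^{-1}(r - R\beta)$, where β is the true but unknown value

→ $E(\beta_R) = \beta$ if $R\beta = r$

$V(\beta_R) \le V(\beta_s)$ → $[V(\beta_s) - V(\beta_R)]$ is positive semi-definite

$\mathrm{diag}(V(\beta_R)) \le \mathrm{diag}(V(\beta_s))$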


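From above, if Y is multivariate normal and H0 is true:

$\beta_{l,R} \sim N(\beta, \sigma^2 M^*(X'X)^{-1}M^{*\prime}) \sim N(\beta, \sigma^2 M^*(X'X)^{-1})$

where $M^* = I_K - (X'X)^{-1}R'\left[ R(X'X)^{-1}R' \right]^{-1}R$

From previous results, if $r - R\beta \ne 0$ (e.g., not all H0 true), the estimate of β is biased if we continue to assume $r - R\beta = 0$:

$E(\beta_{l,R}) = \beta + (X'X)^{-1}R'\left[ R(X'X)^{-1}R' \right]^{-1}(r - R\beta)$, where the second term is the bias and is $\ne 0$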


The variance is the same regardless of the correctness of the restrictions and the biasedness of $\beta_R$ → $\beta_R$ has a variance that is smaller when compared to $\beta_s$, which only uses the sample information.
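The variance comparison can be verified directly from the two covariance matrices, $\sigma^2(X'X)^{-1}$ for $\beta_s$ and $\sigma^2 M^*(X'X)^{-1}$ for $\beta_R$. The following is a sketch that assumes R has full row rank J, so that $R(X'X)^{-1}R'$ is invertible and positive definite; the shorthand A and B is introduced here only for this argument.

```latex
\begin{aligned}
V(\beta_s) - V(\beta_R)
  &= \sigma^2 (X'X)^{-1} - \sigma^2 M^* (X'X)^{-1} \\
  &= \sigma^2 (X'X)^{-1} R' \left[ R (X'X)^{-1} R' \right]^{-1} R (X'X)^{-1} \\
  &= \sigma^2 A' B A,
  \qquad A = R (X'X)^{-1}, \quad B = \left[ R (X'X)^{-1} R' \right]^{-1}
\end{aligned}
```

Since B is positive definite, $c'A'BAc \ge 0$ for every vector c, so the difference is positive semi-definite. Nothing in this argument uses $R\beta = r$, which is why the variance reduction holds whether or not the restrictions are correct.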


Beer Consumption Example:

$q_B = \alpha P_B^{\beta_1} P_L^{\beta_2} P_O^{\beta_3} INC^{\beta_4} \exp(e)$

qB ≡ quantity of beer purchased
PB ≡ price of beer
PL ≡ price of other alcoholic bev.
PO ≡ price of other goods
INC ≡ household income

Real Prices Matter? All prices and INC change by 10%: β1 + β2 + β3 + β4 = 0

Equal Price Impacts? Liquor and Other Goods: β2 = β3

Unitary Income Elasticity? β4 = 1

Data used in the analysis


$\ln q_{B,t} = \beta_0 + \beta_1 \ln(P_{B,t}) + \beta_2 \ln(P_{L,t}) + \beta_3 \ln(P_{O,t}) + \beta_4 \ln(Inc_t) + e_t$

Given the above, what do the R matrix and r vector look like for these joint tests?

$R = \begin{bmatrix} 0 & 1 & 1 & 1 & 1 \\ 0 & 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}, \qquad r = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$

Let's develop a test statistic to test these joint hypotheses.

We are going to use the Likelihood Ratio (LR) to test the joint hypotheses.
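As a quick check on the R matrix and r vector for the three beer hypotheses, the sketch below evaluates the discrepancy $R\beta_s - r$ at the unrestricted estimates reported in the log-log regression table later in the deck. The -1 in the second row encodes β2 - β3 = 0. The variable names are illustrative assumptions.

```python
# Rows of R and r for the three joint hypotheses; beta = (b0, b1, b2, b3, b4)
R = [
    [0, 1, 1, 1, 1],    # b1 + b2 + b3 + b4 = 0
    [0, 0, 1, -1, 0],   # b2 - b3 = 0
    [0, 0, 0, 0, 1],    # b4 = 1
]
r = [0, 0, 1]

# Unrestricted estimates: intercept, lnPB, lnPL, lnPO, ln(INC)
beta_s = [-3.243, -1.020, -0.583, 0.210, 0.923]

# Discrepancy vector R*beta_s - r, one entry per hypothesis
gap = [sum(Rjk * bk for Rjk, bk in zip(row, beta_s)) - rj
       for row, rj in zip(R, r)]

print([round(g, 3) for g in gap])
```

The discrepancies are not zero in the sample; a joint test asks whether they are large relative to their sampling variability.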


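$LR = l_U^* / l_R^*$

$l_U^* = \mathrm{Max}\,[\,l(\theta \mid y_1, \ldots, y_T);\ \theta = (\beta, \sigma)\,]$ = "unrestricted" maximum likelihood function

$l_R^* = \mathrm{Max}\,[\,l(\theta \mid y_1, \ldots, y_T);\ \theta = (\beta, \sigma);\ R\beta = r\,]$ = "restricted" maximum likelihood function

Again, because we are possibly restricting the parameter space via our null hypotheses, LR ≥ 1.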


If $l_U^*$ is large relative to $l_R^*$ → the data show evidence that the restrictions (hypotheses) are not true (i.e., reject the null hypothesis).

How much should LR exceed 1 before we reject H0?

We reject H0 when $LR \ge LR_C$, where $LR_C$ is a constant chosen on the basis of the relative cost of the Type I vs. Type II errors.

When implementing the LR test you need to know the PDF of the dependent variable, which determines the density of the test statistic.


For the LR test, assume Y has a normal distribution → $e \sim N(0, \sigma^2 I_T)$

This implies the following LR test statistic (LR*)

What are the distributional characteristics of LR*? Will address this in a bit.
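One way to see how normality leads from the likelihood ratio to a statistic built from sums of squared errors is sketched below. It is a standard argument rather than a transcription of the slide; $\tilde\sigma^2 = SSE/T$ denotes the ML variance estimate and is notation introduced here.

```latex
\begin{aligned}
l^* &= (2\pi\tilde\sigma^2)^{-T/2} \exp(-T/2), \qquad \tilde\sigma^2 = SSE/T \\
LR &= \frac{l_U^*}{l_R^*} = \left( \frac{SSE_R}{SSE_U} \right)^{T/2} \\
LR^* &= \frac{T-K}{J}\left( LR^{2/T} - 1 \right)
      = \frac{(SSE_R - SSE_U)/J}{SSE_U/(T-K)}
      = \frac{SSE_R - SSE_U}{J\,\hat\sigma_U^2}
\end{aligned}
```

Because LR* is an increasing function of LR, rejecting for large LR is equivalent to rejecting for large LR*.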


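We can derive alternative specifications of the LR test statistic:

(ver. 1) $LR^* = \dfrac{SSE_R - SSE_U}{J\,\hat\sigma_U^2}$

(ver. 2) $LR^* = \dfrac{(R\beta_e - r)'\left[ R(X'X)^{-1}R' \right]^{-1}(R\beta_e - r)}{J\,\hat\sigma_U^2}$

(ver. 3) $LR^* = \dfrac{(\beta_R - \beta_e)'(X'X)(\beta_R - \beta_e)}{J\,\hat\sigma_U^2}$

where $\beta_e = \beta_s = \beta_l$

What are the Distributional Characteristics of LR*? (JHGLL p. 255)

$LR^* \sim F_{J,\,T-K}$

J = # of hypotheses, K = # of parameters (including intercept)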


Proposed Test Procedure

Choose α = P(reject H0 | H0 true) = P(Type-I error)

Calculate the test statistic LR* based on sample information

Find the critical value $LR_{crit}$ in an F-table such that $\alpha = P(F_{J,\,T-K} \ge LR_{crit})$

[Figure: density f(LR*) of the $F_{J,\,T-K}$ distribution, with upper-tail area α to the right of $LR_{crit}$]


Reject H0 if $LR^* \ge LR_{crit}$

Don't reject H0 if $LR^* < LR_{crit}$


Beer Consumption Example

$\ln q_{B,t} = \beta_0 + \beta_1 \ln(P_{B,t}) + \beta_2 \ln(P_{L,t}) + \beta_3 \ln(P_{O,t}) + \beta_4 \ln(Inc_t) + e_t$

Does the regression do a better job of explaining variation in beer consumption than if we assumed the mean response across all obs.?

Remember $SSE = (T-K)\,\hat\sigma_U^2$

Under H0: All slope coefficients = 0

Under H0, TSS = SSE, given that there is no RSS and TSS = RSS + SSE


Log-Log Beer Consumption Model

Unconstrained Model

R2        0.8254
Adj. R2   0.7975
σU        0.05997
Obs       30

Variable    Coeff    Std Error   T-Stat
Intercept   -3.243   3.743       -0.87
lnPB        -1.020   0.239       -4.27
lnPL        -0.583   0.560       -1.04
lnPO         0.210   0.080        2.63
ln(INC)      0.923   0.416        2.22

SSEU = 0.05997² × 25 = 0.08992

Constrained Model

σ         0.13326

Variable    Coeff    Std Error   T-Stat
Intercept   4.019    0.0243      165.17    (mean of LN(Beer))

SSER = 0.13326² × 29 = 0.51497, and TSS = SSER

R2 = 1 - 0.08992/0.51497
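The overall-significance statistic follows from the two SSE values by the SSE form of LR*, $[(SSE_R - SSE_U)/J] / [SSE_U/(T-K)]$. The short Python sketch below redoes that arithmetic with the rounded SSE figures reported for the two models; the function name `f_statistic` is an illustrative assumption.

```python
def f_statistic(sse_r, sse_u, J, T, K):
    """LR* = [(SSE_R - SSE_U) / J] / [SSE_U / (T - K)]."""
    return ((sse_r - sse_u) / J) / (sse_u / (T - K))

sse_u = 0.08992   # unconstrained SSE
sse_r = 0.51497   # constrained (intercept-only) SSE, equal to TSS
T, K, J = 30, 5, 4

F = f_statistic(sse_r, sse_u, J, T, K)
r_squared = 1 - sse_u / sse_r

print(round(F, 3), round(r_squared, 4))
```

Up to rounding, the result matches the value 29.544 passed to CDFFC on the next slide, and the R2 line reproduces the 0.8254 reported for the unconstrained model.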


Results of our test of overall significance of the regression model

Let's look at the following GAUSS code

GAUSS command: CDFFC(29.544,4,25) = 3.799e-009

CDFFC computes the complement of the cdf of the F distribution ($1 - F_{df1,df2}$)

This is an unlikely value of F if the hypothesis is true, that is, if the exogenous variables have no impact on beer consumption

Reject the null hypothesis

An alternative look


Beer Consumption Example

$\ln q_{B,t} = \beta_0 + \beta_1 \ln(P_{B,t}) + \beta_2 \ln(P_{L,t}) + \beta_3 \ln(P_{O,t}) + \beta_4 \ln(Inc_t) + e_t$

Three joint hypotheses example:

Price and Income Elasticities Sum to 0 (i.e., β1 + β2 + β3 + β4 = 0)

Other Liquor and Other Goods Price Elasticities are Equal (i.e., β2 = β3)

Income Elasticity = 1 (i.e., β4 = 1)

cdffc(0.84,3,25) = 0.4848
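For readers without GAUSS, the upper-tail F probability can be approximated with the Python standard library. The sketch below is not the GAUSS implementation: it relies on the identity $P(F_{df1,df2} \ge x) = I_z(df2/2,\, df1/2)$ with $z = df2/(df2 + df1 \cdot x)$, where $I_z$ is the regularized incomplete beta function, and evaluates the integral with Simpson's rule. The function name `f_sf` and the step count are assumptions, and the code assumes x > 0 and df2 > 2.

```python
import math

def f_sf(x, df1, df2, steps=2000):
    """Approximate P(F(df1, df2) >= x) for x > 0 and df2 > 2.

    Integrates the Beta(df2/2, df1/2) density from 0 to
    z = df2 / (df2 + df1 * x) with composite Simpson's rule
    (steps must be even).
    """
    a, b = df2 / 2.0, df1 / 2.0
    z = df2 / (df2 + df1 * x)
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

    def density(t):
        if t <= 0.0:
            return 0.0   # valid because a > 1 when df2 > 2
        return math.exp((a - 1) * math.log(t)
                        + (b - 1) * math.log1p(-t) - log_beta)

    h = z / steps
    total = density(0.0) + density(z)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(i * h)
    return total * h / 3.0

p_joint = f_sf(0.84, 3, 25)       # three joint hypotheses
p_overall = f_sf(29.544, 4, 25)   # overall significance test

alpha = 0.05
print(p_joint, p_joint < alpha)
print(p_overall, p_overall < alpha)
```

Comparing each tail probability with α gives the same decision as comparing LR* with $LR_{crit}$: the three joint restrictions are not rejected, while the hypothesis that all slope coefficients are zero is rejected.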


[Figure: PDF of the $F_{3,25}$ distribution, showing the location of our calculated test statistic (F = 0.84) and the upper-tail area = 0.4848]


A side note: How do you estimate the variance of an elasticity and therefore test H0 about this elasticity?

Suppose you have the following model:

$FDX_t = \beta_0 + \beta_1 Inc_t + \beta_2 Inc_t^2 + e_t$

FDX = food expenditure, Inc = household income

Want to estimate the impacts of a change in income on expenditures. Use an elasticity measure evaluated at the mean of the data. That is:


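Income Elasticity (Γ) for the model $FDX_t = \beta_0 + \beta_1 Inc_t + \beta_2 Inc_t^2 + e_t$ is:

$\Gamma = \dfrac{\partial FDX}{\partial Inc}\,\dfrac{\overline{Inc}}{\overline{FDX}} = \left( \beta_1 + 2\beta_2\,\overline{Inc} \right)\dfrac{\overline{Inc}}{\overline{FDX}}$

How do you calculate the variance of Γ?

We know that: Var(α′Z) = α′Var(Z)α, where Z is a column vector of RV's and α is a column vector of constants.

Treat β0, β1 and β2 as RV's, so that Γ = α′Z is a linear combination of Z = (β0, β1, β2)′. The α vector is:

$\alpha = \begin{bmatrix} 0 \\ \overline{Inc}/\overline{FDX} \\ 2\,\overline{Inc}^2/\overline{FDX} \end{bmatrix}$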


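This implies var(Γ) is:

$\mathrm{var}(\Gamma) = \underset{(1 \times 3)}{\alpha'}\;\underset{(3 \times 3)}{\sigma^2(X'X)^{-1}}\;\underset{(3 \times 1)}{\alpha}$

$= \begin{bmatrix} 0 & \dfrac{\overline{Inc}}{\overline{FDX}} & \dfrac{2\,\overline{Inc}^2}{\overline{FDX}} \end{bmatrix} \begin{bmatrix} \mathrm{Var}(\beta_0) & \mathrm{Cov}(\beta_0,\beta_1) & \mathrm{Cov}(\beta_0,\beta_2) \\ \mathrm{Cov}(\beta_0,\beta_1) & \mathrm{Var}(\beta_1) & \mathrm{Cov}(\beta_1,\beta_2) \\ \mathrm{Cov}(\beta_0,\beta_2) & \mathrm{Cov}(\beta_1,\beta_2) & \mathrm{Var}(\beta_2) \end{bmatrix} \begin{bmatrix} 0 \\ \dfrac{\overline{Inc}}{\overline{FDX}} \\ \dfrac{2\,\overline{Inc}^2}{\overline{FDX}} \end{bmatrix}$

Due to the 0 value in α, this reduces to a (1 x 2)(2 x 2)(2 x 1) product, which is (1 x 1):

$= \begin{bmatrix} \dfrac{\overline{Inc}}{\overline{FDX}} & \dfrac{2\,\overline{Inc}^2}{\overline{FDX}} \end{bmatrix} \begin{bmatrix} \mathrm{Var}(\beta_1) & \mathrm{Cov}(\beta_1,\beta_2) \\ \mathrm{Cov}(\beta_1,\beta_2) & \mathrm{Var}(\beta_2) \end{bmatrix} \begin{bmatrix} \dfrac{\overline{Inc}}{\overline{FDX}} \\ \dfrac{2\,\overline{Inc}^2}{\overline{FDX}} \end{bmatrix}$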


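This implies var(Γ) is:

$\mathrm{Var}(\Gamma) = \left( \dfrac{\overline{Inc}}{\overline{FDX}} \right)^2 \mathrm{Var}(\beta_1) + \left( \dfrac{2\,\overline{Inc}^2}{\overline{FDX}} \right)^2 \mathrm{Var}(\beta_2) + 2\left( \dfrac{\overline{Inc}}{\overline{FDX}} \right)\left( \dfrac{2\,\overline{Inc}^2}{\overline{FDX}} \right)\mathrm{Cov}(\beta_1,\beta_2)$

that is, $C_1^2\,\mathrm{Var}(\beta_1) + C_2^2\,\mathrm{Var}(\beta_2) + 2C_1C_2\,\mathrm{Cov}(\beta_1,\beta_2)$ with $C_1 = \overline{Inc}/\overline{FDX}$ and $C_2 = 2\,\overline{Inc}^2/\overline{FDX}$, so

$\mathrm{Var}(\Gamma) = \left( \dfrac{\overline{Inc}}{\overline{FDX}} \right)^2 \mathrm{Var}(\beta_1) + \left( \dfrac{2\,\overline{Inc}^2}{\overline{FDX}} \right)^2 \mathrm{Var}(\beta_2) + 4\,\dfrac{\overline{Inc}^3}{\overline{FDX}^2}\,\mathrm{Cov}(\beta_1,\beta_2)$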
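The matrix form α′Var(Z)α and the expanded three-term expression should give the same number. The Python sketch below checks this on made-up inputs: the sample means and the coefficient covariance matrix are hypothetical values chosen for illustration, not estimates from any data set, and `quadratic_form` is an assumed helper name.

```python
def quadratic_form(alpha, cov):
    """alpha' * cov * alpha for a list alpha and a square list-of-lists cov."""
    n = len(alpha)
    return sum(alpha[i] * cov[i][j] * alpha[j]
               for i in range(n) for j in range(n))

# Hypothetical sample means and covariance matrix of (beta_0, beta_1, beta_2)
inc_bar, fdx_bar = 50.0, 20.0
cov_beta = [
    [1.0,     -1.0e-2,  5.0e-5],
    [-1.0e-2,  4.0e-4, -3.0e-6],
    [5.0e-5,  -3.0e-6,  2.5e-8],
]

c1 = inc_bar / fdx_bar             # weight on beta_1
c2 = 2 * inc_bar ** 2 / fdx_bar    # weight on beta_2
alpha = [0.0, c1, c2]              # zero weight on the intercept

var_matrix = quadratic_form(alpha, cov_beta)
var_expanded = (c1 ** 2 * cov_beta[1][1] + c2 ** 2 * cov_beta[2][2]
                + 2 * c1 * c2 * cov_beta[1][2])
std_error = var_matrix ** 0.5

print(var_matrix, var_expanded, std_error)
```

The square root of this variance is the standard error one would use in a t ratio for a hypothesis about Γ, treating the sample means as constants as the slides do.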