Panel Data Lecture Rome

Dynamic Panel Data Methods Lecture II

Microeconometrics Lectures

Richard Blundell

UCL and IFS March 2005

Dynamic Panel Data Methods

Background

The standard panel data model is

\[
y_{it} = \beta_0 + \beta_1 x_{it1} + \beta_2 x_{it2} + \dots + \beta_k x_{itk} + \eta_i + v_{it}
       = x_{it}'\beta + \eta_i + v_{it},
\]

where the $\eta_i$ are the unobserved constant individual effects, $i = 1,\dots,N$; $t = 1,\dots,T$, with N large and T small. Often lagged values of y are included in x.

An Example: Company Investment Rates

The panel data model is

\[
\frac{I_{it}}{K_{it}} = \beta \frac{I_{i,t-1}}{K_{i,t-1}} + \eta_i + \lambda_t + v_{it}.
\]

Unbalanced panel of company-level data, T = 4-10, N = 700.

Example

                 OLS Levels    Within Groups    DIF 2SLS
(I/K)_{i,t-1}    0.2669        -0.0094          0.1626
                 (.0185)       (.0181)          (.0362)
Instruments                                     (I/K)_{i,t-2}

STATA command for GMM: xtabond2 [email protected]

On the CeMMAP website http://cemmap.ifs.org.uk/ (resources page), the Windmeijer course is available together with the computer exercises and some of the data sets.

Three common specifications to deal with $\eta_i$:

1. Random effects
2. Fixed effects
3. First differences

In the model

\[
y_{it} = x_{it}'\beta + u_{it}, \qquad u_{it} = \eta_i + v_{it},
\]

we assume that

\[
E(v_{it}) = 0; \qquad E(v_{it} \mid x_{it}) = 0.
\]

The Random Effects specification further assumes that

\[
E(\eta_i) = 0; \qquad E(\eta_i \mid x_{it}) = 0,
\]

i.e. it assumes that the individual effect $\eta_i$ is uncorrelated with the regressors $x_{it}$. Therefore

\[
E(y_{it} \mid x_{it}) = x_{it}'\beta + E(\eta_i \mid x_{it}) + E(v_{it} \mid x_{it}) = x_{it}'\beta
\]

and therefore the simple OLS estimator on the pooled data is unbiased. However, it is not efficient, and its estimated standard errors are wrong, as they do not take account of the dependence of the error term within individuals over time.

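Let $u_{it} = \eta_i + v_{it}$ and assume independence of $v_{is}$ and $v_{it}$, $s \neq t$, and of $\eta_i$ and the $v_{it}$. Then

\[
E(u_{is} u_{it}) = E(\eta_i^2) = \sigma_\eta^2, \qquad s \neq t,
\]

and therefore the $u_{is}$ and $u_{it}$ are correlated. With $u_i' = (u_{i1}, u_{i2}, \dots, u_{iT})$, the within individual variance-covariance matrix is given by

\[
\Omega = E(u_i u_i') =
\begin{bmatrix}
\sigma_\eta^2 + \sigma_v^2 & \sigma_\eta^2 & \cdots & \sigma_\eta^2 \\
\sigma_\eta^2 & \sigma_\eta^2 + \sigma_v^2 & \cdots & \sigma_\eta^2 \\
\vdots & \vdots & \ddots & \vdots \\
\sigma_\eta^2 & \sigma_\eta^2 & \cdots & \sigma_\eta^2 + \sigma_v^2
\end{bmatrix}
\]

and the Random Effects estimator is

\[
\hat\beta_{RE} = \left( \sum_{i=1}^{N} X_i' \Omega^{-1} X_i \right)^{-1} \sum_{i=1}^{N} X_i' \Omega^{-1} y_i .
\]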

Fixed Effects

The more likely and interesting case is when the unobserved individual effects are correlated with the regressors:

\[
E(\eta_i \mid x_{it}) \neq 0.
\]

Clearly, in this case OLS and the Random Effects estimator are biased and inconsistent, as

\[
E(y_{it} \mid x_{it}) = x_{it}'\beta + E(\eta_i \mid x_{it}) + E(v_{it} \mid x_{it})
= x_{it}'\beta + E(\eta_i \mid x_{it}) \neq x_{it}'\beta .
\]

A solution is to estimate the model by OLS with a separate intercept for every individual. As

\[
\eta_i = \bar y_i - \bar x_i'\beta - \bar v_i ,
\]

where $\bar y_i$, $\bar x_i$ and $\bar v_i$ denote individual means over time, this is equivalent, for the $\beta$ parameters, to estimating the transformed, within group model by OLS:

\[
y_{it} - \bar y_i = (x_{it} - \bar x_i)'\beta + (v_{it} - \bar v_i).
\]

Therefore, for the fixed effects, or within group, estimator only the effects of variables that change over time can be estimated. (OLS standard errors in this model are again wrong, as they ignore the fact that N intercepts have been estimated.)
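A minimal sketch of the within transformation, assuming a made-up noiseless panel in which the individual effect rises with the regressor. The data, the true slope of 2 and all names are illustrative assumptions, not taken from the lecture. Pooled OLS picks up the correlation between the individual effect and x, while OLS on the demeaned data recovers the slope.

```python
# Toy panel: y_it = 2 * x_it + eta_i, with eta_i positively related to x_it.
# All numbers are invented for illustration.
panel = {
    "a": {"eta": 0.0, "x": [1.0, 2.0, 3.0]},
    "b": {"eta": 5.0, "x": [4.0, 5.0, 7.0]},
    "c": {"eta": 9.0, "x": [8.0, 9.0, 11.0]},
}
BETA = 2.0


def slope(xs, ys):
    """OLS slope of y on x (with an intercept)."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


pooled_x, pooled_y, within_x, within_y = [], [], [], []
for unit in panel.values():
    x = unit["x"]
    y = [BETA * xi + unit["eta"] for xi in x]
    xbar, ybar = sum(x) / len(x), sum(y) / len(y)
    pooled_x += x
    pooled_y += y
    # Within transformation: subtract the individual means.
    within_x += [xi - xbar for xi in x]
    within_y += [yi - ybar for yi in y]

beta_pooled = slope(pooled_x, pooled_y)   # biased upwards here
beta_within = slope(within_x, within_y)   # eta_i has been differenced out
print(beta_pooled, beta_within)
```

Because the toy data contain no idiosyncratic noise, the within estimate equals the true slope up to rounding; with real data it would only be consistent under the assumptions discussed next.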

For the fixed effects estimator to be unbiased, the $x_{it}$ in all periods must be uncorrelated with the $v_{is}$ in all periods:

\[
E(v_{is} x_{it}) = 0; \qquad s = 1,\dots,T, \quad t = 1,\dots,T.
\]

When $x_{it}$ satisfies this condition, we call it strictly exogenous. Assuming strict exogeneity, the Hausman test can be used to test whether the unobserved heterogeneity is correlated with the regressors. When they are not correlated, the RE estimator is efficient. If they are correlated, the FE estimator is consistent, but the RE estimator is not.

\[
H = (\hat\beta_{FE} - \hat\beta_{RE})' \left[ Var(\hat\beta_{FE}) - Var(\hat\beta_{RE}) \right]^{-1} (\hat\beta_{FE} - \hat\beta_{RE})
\]

If H is large, RE is rejected in favour of FE. For large samples, $H \sim \chi^2_k$ under the null of no correlation, with k the number of elements in $\beta$.
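As a rough illustration of the Hausman statistic, the sketch below handles the special case of a single coefficient (k = 1), where the quadratic form reduces to a ratio. The estimates and variances are hypothetical numbers chosen only to show the arithmetic, and the p-value uses the fact that a chi-squared variable with one degree of freedom is a squared standard normal.

```python
from statistics import NormalDist


def hausman_scalar(b_fe, var_fe, b_re, var_re):
    """Hausman statistic for one coefficient: (b_FE - b_RE)^2 / (Var_FE - Var_RE)."""
    diff_var = var_fe - var_re
    if diff_var <= 0:
        raise ValueError("Var(FE) - Var(RE) must be positive for this formula")
    return (b_fe - b_re) ** 2 / diff_var


def chi2_1_pvalue(stat):
    """P(chi2 with 1 d.o.f. > stat), via the standard normal distribution."""
    return 2.0 * (1.0 - NormalDist().cdf(stat ** 0.5))


# Hypothetical estimates, for illustration only.
h = hausman_scalar(b_fe=0.30, var_fe=0.010, b_re=0.50, var_re=0.006)
p = chi2_1_pvalue(h)
print(round(h, 2), round(p, 4))
```

With several coefficients the same idea applies, but the variance difference becomes a matrix that has to be inverted.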

First Differencing

Again consider the model

\[
y_{it} = x_{it}'\beta + u_{it}, \qquad u_{it} = \eta_i + v_{it},
\]

where the unobserved individual effects $\eta_i$ are correlated with $x_{it}$. Taking first differences eliminates $\eta_i$:

\[
y_{it} - y_{i,t-1} = (x_{it} - x_{i,t-1})'\beta + (u_{it} - u_{i,t-1}) = (x_{it} - x_{i,t-1})'\beta + (v_{it} - v_{i,t-1})
\]

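and therefore OLS is unbiased if $(v_{it} - v_{i,t-1})$ and $(x_{it} - x_{i,t-1})$ are uncorrelated. This is a weaker assumption than the strict exogeneity assumption of the fixed effects estimator. Again, OLS estimated standard errors are wrong, as they do not take account of the correlation between $(v_{it} - v_{i,t-1})$ and $(v_{i,t-1} - v_{i,t-2})$:

\[
E[(v_{it} - v_{i,t-1})(v_{i,t-1} - v_{i,t-2})] = -\sigma_v^2 ,
\]

\[
E(\Delta v_i \Delta v_i') = \sigma_v^2
\begin{bmatrix}
2 & -1 & \cdots & 0 \\
-1 & 2 & \ddots & \vdots \\
\vdots & \ddots & \ddots & -1 \\
0 & \cdots & -1 & 2
\end{bmatrix}
\]

(when the $v_{it}$ themselves are not correlated over time), where $\Delta v_i$ is the vector of first-differenced errors for individual i.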
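The tridiagonal pattern can be checked mechanically. If D is the matrix that maps the level errors into their first differences and the level errors are uncorrelated with common variance, the covariance matrix of the differences is that variance times D D'. The sketch below builds D for an assumed T = 5; the function names are illustrative.

```python
def difference_matrix(T):
    """(T-1) x T matrix D with D v = (v_2 - v_1, ..., v_T - v_{T-1})."""
    return [[-1 if c == r else 1 if c == r + 1 else 0 for c in range(T)]
            for r in range(T - 1)]


def times_own_transpose(D):
    """Return D D'."""
    return [[sum(a * b for a, b in zip(row_r, row_s)) for row_s in D]
            for row_r in D]


D = difference_matrix(5)
cov_over_sigma2 = times_own_transpose(D)   # covariance of differenced errors / sigma_v^2
for row in cov_over_sigma2:
    print(row)
```

The result has 2 on the diagonal and -1 on the first off-diagonals, the same matrix that reappears as the one-step GMM weighting structure.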

Endogenous Variables

Consider again the model in first differences

\[
y_{it} - y_{i,t-1} = (x_{it} - x_{i,t-1})'\beta + (v_{it} - v_{i,t-1}).
\]

The regressor $x_{it}$ is endogenous if it is correlated with $v_{it}$. There can also be feedback from $v_{i,t-1}$ to $x_{it}$ such that $E(x_{it} v_{i,t-1}) \neq 0$. In this case we call $x_{it}$ predetermined or weakly exogenous. In both cases $E[(x_{it} - x_{i,t-1})(v_{it} - v_{i,t-1})] \neq 0$ and OLS is biased. If the $v_{it}$ are not correlated over time, lagged values of $x_{it}$ can be used as instruments for the endogenous differences, and the model can be estimated by the Instrumental Variables estimator.

If $x_{it}$ is endogenous, $E(x_{it} v_{it}) \neq 0$ and $E(x_{i,t-1} v_{i,t-1}) \neq 0$. Valid instruments are $x_{is}$, with $s = 1,\dots,t-2$, as $E(x_{i,t-2} \Delta v_{it}) = 0$, where $\Delta v_{it} = v_{it} - v_{i,t-1}$.

If $x_{it}$ is predetermined, $E(x_{it} v_{i,t-1}) \neq 0$ but $E(x_{i,t-1} v_{i,t-1}) = 0$. Valid instruments therefore are $x_{is}$, with $s = 1,\dots,t-1$.

Treatment Effects in Panels

Suppose the model is:

\[
y_{it} = \alpha d_{it} + x_{it}'\beta + \lambda_t + \eta_i + v_{it},
\]

where $d_{it} = 1$ if the program impacts on group i in period t. Typically, once the program is in place this dummy is set to unity for all remaining time periods. If the time effects, the group effects and the x are sufficient to render $d_{it}$ exogenous, then within groups (fixed effects) will be consistent for the ATT impact of the treatment. In this case, if the treatment occurs at the same time for all groups that are treated, then diff-in-diff and within groups are identical estimators.

Dynamic Panel Data Models

A dynamic panel data model is specified as

\[
y_{it} = \alpha y_{i,t-1} + x_{it}'\beta + \eta_i + v_{it}.
\]

Consider a model without other explanatory variables:

\[
y_{it} = \alpha y_{i,t-1} + \eta_i + v_{it}.
\]

Clearly, $y_{i,t-1} = \alpha y_{i,t-2} + \eta_i + v_{i,t-1}$ is correlated with $\eta_i$. The OLS estimator is biased upwards. The Fixed Effects estimator is biased downwards (this bias gets smaller for larger T).

For the first differenced model

\[
y_{it} - y_{i,t-1} = \alpha (y_{i,t-1} - y_{i,t-2}) + (v_{it} - v_{i,t-1}),
\]

$y_{i,t-1}$ is of course correlated with $v_{i,t-1}$ (y is predetermined), and the OLS estimator in the differenced model is severely downward biased. Valid instruments for $(y_{i,t-1} - y_{i,t-2})$ are the lagged levels $y_{i,t-2}, y_{i,t-3}, \dots, y_{i1}$, as $E[y_{i,t-2}(v_{it} - v_{i,t-1})] = 0$. An Instrumental Variables estimator that uses this information optimally is the Generalised Method of Moments (GMM) estimator.
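The direction of these biases can be seen in a small simulation. The sketch below is only an illustration under assumed settings (alpha = 0.5, unit variances, normal draws, a stationary start, N = 5000, seven periods); none of these values come from the lecture. It compares pooled OLS in levels, OLS in first differences, and the simplest instrumental variables estimator, which uses only the second lag of the level as an instrument.

```python
import random


def simulate_ar1_panel(n, t_periods, alpha, seed=0):
    """y_it = alpha * y_{i,t-1} + eta_i + v_it, started from its stationary distribution."""
    rng = random.Random(seed)
    panel = []
    for _ in range(n):
        eta = rng.gauss(0.0, 1.0)
        y = [eta / (1.0 - alpha) + rng.gauss(0.0, 1.0) / (1.0 - alpha ** 2) ** 0.5]
        for _ in range(t_periods - 1):
            y.append(alpha * y[-1] + eta + rng.gauss(0.0, 1.0))
        panel.append(y)
    return panel


def ols_levels(panel):
    """Pooled OLS of y_it on y_{i,t-1} (no intercept; the simulated data have mean zero)."""
    num = sum(y[t] * y[t - 1] for y in panel for t in range(1, len(y)))
    den = sum(y[t - 1] ** 2 for y in panel for t in range(1, len(y)))
    return num / den


def ols_differences(panel):
    """OLS of the differenced y on its own lag."""
    num = sum((y[t] - y[t - 1]) * (y[t - 1] - y[t - 2]) for y in panel for t in range(2, len(y)))
    den = sum((y[t - 1] - y[t - 2]) ** 2 for y in panel for t in range(2, len(y)))
    return num / den


def iv_differences(panel):
    """IV in first differences with y_{i,t-2} as the single instrument."""
    num = sum(y[t - 2] * (y[t] - y[t - 1]) for y in panel for t in range(2, len(y)))
    den = sum(y[t - 2] * (y[t - 1] - y[t - 2]) for y in panel for t in range(2, len(y)))
    return num / den


ALPHA = 0.5
data = simulate_ar1_panel(n=5000, t_periods=7, alpha=ALPHA)
est_ols = ols_levels(data)
est_dif = ols_differences(data)
est_iv = iv_differences(data)
print(round(est_ols, 3), round(est_dif, 3), round(est_iv, 3))
```

Under these assumptions the levels estimate should lie well above 0.5, the differenced OLS estimate well below it, and the IV estimate close to it, up to sampling error.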

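Let $\Delta v_i$ be the vector of errors for individual i in the first differenced equation:

\[
\Delta v_i =
\begin{bmatrix}
v_{i3} - v_{i2} \\
v_{i4} - v_{i3} \\
\vdots \\
v_{iT} - v_{i,T-1}
\end{bmatrix}
=
\begin{bmatrix}
\Delta y_{i3} - \alpha \Delta y_{i2} \\
\Delta y_{i4} - \alpha \Delta y_{i3} \\
\vdots \\
\Delta y_{iT} - \alpha \Delta y_{i,T-1}
\end{bmatrix}
\]

and let $Z_i$ be the matrix of instruments for individual i:

\[
Z_i =
\begin{bmatrix}
y_{i1} & 0 & 0 & \cdots & 0 & \cdots & 0 \\
0 & y_{i1} & y_{i2} & \cdots & 0 & \cdots & 0 \\
\vdots & & & \ddots & & & \vdots \\
0 & 0 & 0 & \cdots & y_{i1} & \cdots & y_{i,T-2}
\end{bmatrix}
\]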
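To make the block structure of the instrument matrix concrete, the sketch below builds it for one individual from a list of levels y_1, ..., y_T. The function name and the example values are illustrative assumptions. Each row belongs to one differenced equation (t = 3, ..., T) and holds the levels dated t-2 and earlier in its own block of columns, so the matrix has (T-1)(T-2)/2 columns.

```python
def instrument_matrix(y):
    """Instrument matrix for one individual.

    y is the list [y_1, ..., y_T]. The row for period t (t = 3..T) contains
    y_1, ..., y_{t-2} in its own block of columns and zeros elsewhere.
    """
    T = len(y)
    n_cols = (T - 1) * (T - 2) // 2
    rows = []
    col = 0
    for t in range(3, T + 1):
        row = [0.0] * n_cols
        for s in range(1, t - 1):          # s = 1, ..., t-2
            row[col] = y[s - 1]
            col += 1
        rows.append(row)
    return rows


z_example = instrument_matrix([1.0, 2.0, 3.0, 4.0])   # T = 4, made-up levels
for row in z_example:
    print(row)
```

For T = 4 this gives two rows and three columns: the first equation is instrumented by y_1 only, the second by y_1 and y_2.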

Then

\[
E(Z_i' \Delta v_i) = 0,
\]

a total of $(T-1)(T-2)/2$ moment conditions. The GMM estimator uses these moment conditions to estimate the parameters consistently and efficiently in two steps. For a given weight matrix $W_N$, the estimator minimises

\[
J_N = \left( \frac{1}{N} \sum_{i=1}^{N} Z_i' \Delta v_i \right)' W_N \left( \frac{1}{N} \sum_{i=1}^{N} Z_i' \Delta v_i \right).
\]

Choosing

\[
W_N = \left( \frac{1}{N} \sum_{i=1}^{N} Z_i' Z_i \right)^{-1}
\]

results in the Two-Stage Least Squares estimator.

The one-step GMM estimator uses as the weight matrix

\[
W_{N1} = \left( \frac{1}{N} \sum_{i=1}^{N} Z_i' A_N Z_i \right)^{-1},
\qquad
A_N =
\begin{bmatrix}
2 & -1 & \cdots & 0 \\
-1 & 2 & \ddots & \vdots \\
\vdots & \ddots & \ddots & -1 \\
0 & \cdots & -1 & 2
\end{bmatrix},
\]

and is efficient when the errors are homoscedastic and not correlated over time. This is often too restrictive. However, the one-step results are consistent, and robust standard errors that adjust for heteroscedasticity and autocorrelation are easily obtained.
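A compact sketch of the one-step estimator for the pure AR(1) panel follows. It is an illustration under assumed simulation settings (alpha = 0.5, unit variances, normal draws, stationary start, N = 3000, T = 6), written with plain Python lists rather than any particular econometrics package. Because there is a single parameter, minimising the GMM criterion has the closed form alpha = (S_zx' W S_zy) / (S_zx' W S_zx), where S_zx and S_zy sum the products of the instruments with the lagged and current differences and W is the inverse of the summed Z' A Z.

```python
import random


def simulate_ar1_panel(n, t_periods, alpha, seed=1):
    """y_it = alpha * y_{i,t-1} + eta_i + v_it with a stationary start."""
    rng = random.Random(seed)
    panel = []
    for _ in range(n):
        eta = rng.gauss(0.0, 1.0)
        y = [eta / (1.0 - alpha) + rng.gauss(0.0, 1.0) / (1.0 - alpha ** 2) ** 0.5]
        for _ in range(t_periods - 1):
            y.append(alpha * y[-1] + eta + rng.gauss(0.0, 1.0))
        panel.append(y)
    return panel


def instrument_matrix(y):
    """Row for period t (t = 3..T) holds y_1..y_{t-2} in its own column block."""
    T = len(y)
    rows, col = [], 0
    for t in range(3, T + 1):
        row = [0.0] * ((T - 1) * (T - 2) // 2)
        for s in range(1, t - 1):
            row[col] = y[s - 1]
            col += 1
        rows.append(row)
    return rows


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= f * aug[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (aug[r][n] - sum(aug[r][k] * x[k] for k in range(r + 1, n))) / aug[r][r]
    return x


def one_step_gmm(panel):
    """One-step first-differenced GMM estimate of alpha in the AR(1) panel model."""
    T = len(panel[0])
    rows, m = T - 2, (T - 1) * (T - 2) // 2
    s_zx, s_zy = [0.0] * m, [0.0] * m
    zaz = [[0.0] * m for _ in range(m)]
    for y in panel:
        z = instrument_matrix(y)
        dy = [y[t] - y[t - 1] for t in range(2, T)]          # differences for periods 3..T
        dy_lag = [y[t - 1] - y[t - 2] for t in range(2, T)]  # their first lags
        az = []                                              # A z, A tridiagonal (2, -1)
        for r in range(rows):
            az_row = []
            for j in range(m):
                value = 2.0 * z[r][j]
                if r > 0:
                    value -= z[r - 1][j]
                if r < rows - 1:
                    value -= z[r + 1][j]
                az_row.append(value)
            az.append(az_row)
        for j in range(m):
            s_zy[j] += sum(z[r][j] * dy[r] for r in range(rows))
            s_zx[j] += sum(z[r][j] * dy_lag[r] for r in range(rows))
            for k in range(m):
                zaz[j][k] += sum(z[r][j] * az[r][k] for r in range(rows))
    w_szx = solve(zaz, s_zx)                                 # W S_zx (scale factors cancel)
    num = sum(w_szx[j] * s_zy[j] for j in range(m))
    den = sum(w_szx[j] * s_zx[j] for j in range(m))
    return num / den


ALPHA = 0.5
alpha_hat = one_step_gmm(simulate_ar1_panel(n=3000, t_periods=6, alpha=ALPHA))
print(round(alpha_hat, 3))
```

The two-step estimator would repeat the last stage with the weight matrix rebuilt from the one-step residuals; that step is omitted here to keep the sketch short.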

The two-step estimator is efficient under more general conditions, such as heteroscedasticity. The efficient weight matrix is computed as

\[
W_{N2} = \left( \frac{1}{N} \sum_{i=1}^{N} Z_i' \Delta\hat v_i \Delta\hat v_i' Z_i \right)^{-1},
\qquad
\Delta\hat v_i = \Delta y_i - \hat\alpha_1 \Delta y_{i,-1},
\]

where $\hat\alpha_1$ is the one-step GMM estimator. A problem is that in small samples (small number of individuals) the estimated standard errors of the two-step GMM estimator tend to be too small.

Sargan test for overidentifying restrictions

The null hypothesis for this test is that the instruments are valid in the sense that they are not correlated with the errors in the first-differenced equation. It is computed as

\[
S = N J_N(\hat\alpha_2) = N \left( \frac{1}{N} \sum_{i=1}^{N} Z_i' \Delta\hat v_{2i} \right)' W_{N2} \left( \frac{1}{N} \sum_{i=1}^{N} Z_i' \Delta\hat v_{2i} \right),
\]

where $\hat\alpha_2$ is the two-step estimator and $\Delta\hat v_{2i}$ are the two-step residuals. Under the null, this test statistic has a $\chi^2_q$ distribution, with q equal to the total number of instruments minus the number of parameters in the model. Only use the two-step result for the Sargan test. Also test for serial correlation in the errors.

An Example: Investment Rates across Firms

The estimated model is

\[
\left( \frac{I}{K} \right)_{it} = \lambda_t + \alpha \left( \frac{I}{K} \right)_{i,t-1} + \eta_i + v_{it}
\]

and results are presented in Table 1 for OLS, within groups, just identified Two-Stage Least Squares for a differenced model, with $(I/K)_{i,t-2}$ as an instrument for $\Delta (I/K)_{i,t-1}$, and two GMM estimates for $\alpha$ in the differenced model, one using $(I/K)_{i,t-2}$ and $(I/K)_{i,t-3}$, the other using $(I/K)_{i,t-2}, \dots, (I/K)_{i1}$ as instruments.

Table 1

                 OLS       Within    2SLS DIF        GMM1 DIF         GMM1 DIF
                           Groups
(I/K)_{i,t-1}    0.2669    -0.0094   0.1626          0.1593           0.1560
                 (.0185)   (.0181)   (.0362)         (.0327)          (.0318)
m1               -4.71     -11.36    -10.56          -10.91           -11.12
m2               2.52      -2.02     0.61            0.52             0.46
Sargan (p)                                           0.36             0.43
Instruments                          (I/K)_{i,t-2}   (I/K)_{i,t-2},   (I/K)_{i,t-2}, ...,
                                                     (I/K)_{i,t-3}    (I/K)_{i1}

Exogeneity/Endogeneity of additional regressors and instrument set

Consider again the dynamic model with one other explanatory variable:

\[
y_{it} = \alpha y_{i,t-1} + \beta x_{it} + \eta_i + v_{it}
\]

and the model in first differences:

\[
\Delta y_{it} = \alpha \Delta y_{i,t-1} + \beta \Delta x_{it} + \Delta v_{it}.
\]

Consider the case with T = 4. When x is strictly exogenous w.r.t. v, the instruments are

\[
Z_i =
\begin{bmatrix}
y_{i1}, x_{i1}, \dots, x_{i4} & 0 \\
0 & y_{i1}, y_{i2}, x_{i1}, \dots, x_{i4}
\end{bmatrix}.
\]

When x is predetermined

\[
Z_i =
\begin{bmatrix}
y_{i1}, x_{i1}, x_{i2} & 0 \\
0 & y_{i1}, y_{i2}, x_{i1}, x_{i2}, x_{i3}
\end{bmatrix}.
\]

And when x is endogenous

\[
Z_i =
\begin{bmatrix}
y_{i1}, x_{i1} & 0 \\
0 & y_{i1}, y_{i2}, x_{i1}, x_{i2}
\end{bmatrix}.
\]

An example and finite sample inference

Arellano and Bond (1991) estimate dynamic employment equations using a sample of 140 UK quoted firms over the years 1976-1984. One model was specified as

\[
n_{it} = \alpha_1 n_{i,t-1} + \alpha_2 n_{i,t-2} + \beta_0 w_{it} + \beta_1 w_{i,t-1} + \gamma k_{it} + \delta_0 ys_{it} + \delta_1 ys_{i,t-1} + \lambda_t + \eta_i + u_{it},
\]

where $n_{it}$ is the logarithm of UK employment in company i at the end of period t, $w_{it}$ is the log of the real product wage, $k_{it}$ is the log of gross capital and $ys_{it}$ is the log of industry output. The table presents estimation results for the one- and two-step GMM estimators.

              One-Step               Two-Step
              coeff     std err      coeff     std err   std errc
n_{i,t-1}     .535      .166         .474      .085      .185
n_{i,t-2}     -.075     .068         -.052     .027      .052
w_{it}        -.592     .168         -.513     .049      .146
w_{i,t-1}     .292      .142         .225      .080      .142
k_{it}        .359      .054         .293      .040      .063
ys_{it}       .597      .172         .610      .109      .156
ys_{i,t-1}    -.612     .212         -.446     .125      .217
m1            -2.493                 -2.826              -1.999
m2            -0.359                 -0.327              -0.316
Wald          219.6                  372.0               142.0

Another test statistic with reasonable finite sample properties is the difference between the Sargan test statistics in the models with and without the restriction imposed. Imposing $\alpha_2 = 0$ (keeping time periods and instruments the same) results in a Sargan test of 30.58. The difference between the Sargan tests is therefore 0.47, which is much smaller than the 5% critical value of the chi-squared distribution with one degree of freedom (3.84). $H_0: \alpha_2 = 0$ is therefore not rejected.
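The comparison with the critical value can be reproduced with the standard library alone, since a chi-squared variable with one degree of freedom is the square of a standard normal. In the sketch below only the value 0.47 comes from the text; the helper names are illustrative.

```python
from statistics import NormalDist


def chi2_1_critical(level=0.05):
    """Upper critical value of chi2 with 1 d.o.f.: squared two-sided normal quantile."""
    return NormalDist().inv_cdf(1.0 - level / 2.0) ** 2


def chi2_1_pvalue(stat):
    """P(chi2 with 1 d.o.f. > stat)."""
    return 2.0 * (1.0 - NormalDist().cdf(stat ** 0.5))


dif_sargan = 0.47                      # difference between the two Sargan statistics
critical = chi2_1_critical(0.05)
p_value = chi2_1_pvalue(dif_sargan)
reject = dif_sargan > critical
print(round(critical, 2), round(p_value, 2), reject)
```

For restrictions on more than one parameter the same logic applies with the corresponding degrees of freedom, which needs a general chi-squared distribution function.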

Weak Instruments and Dynamic Panels

Remember that instruments have to satisfy two conditions:

1. They are not correlated with the error term in the equation of interest.
2. They are correlated with the endogenous explanatory variable.

Whether the instruments are correlated with the error term is tested by means of the Sargan test. If the Sargan test rejects the null of no correlation, the IV estimator is biased and inconsistent. However, even if the instruments are not correlated with the error term, a serious small sample bias can occur if they are only weakly correlated with the endogenous explanatory variable.

For the dynamic panel data model in first differences

\[
\Delta y_{it} = \alpha \Delta y_{i,t-1} + \Delta v_{it},
\]

the lagged levels $y_{i,t-2}, \dots, y_{i1}$ become less informative as instruments for $\Delta y_{i,t-1}$ as $\alpha$ increases. (In the extreme unit root case, $y_{it} = y_{i,t-1} + v_{it}$, $\alpha$ is not identified in the first differenced GMM model.) The weak instrument bias tends to go in the direction of the within groups bias (i.e. downward). This occurs for any highly persistent endogenous r.h.s. variable, such as capital.
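One way to see the loss of information is to look at the population first-stage coefficient from regressing the lagged difference on the second lag of the level. The sketch assumes a stationary AR(1) process with homoscedastic, serially uncorrelated errors and an individual effect independent of them; under those assumptions the covariance between the two is minus sigma_v^2 / (1 + alpha) and the variance of the level is sigma_eta^2 / (1 - alpha)^2 + sigma_v^2 / (1 - alpha^2). The function name and the variance ratio of one are illustrative choices.

```python
def first_stage_plim(alpha, var_ratio=1.0):
    """Population coefficient from regressing the lagged difference on y_{i,t-2}.

    var_ratio is sigma_eta^2 / sigma_v^2. Derived under stationarity of the
    AR(1) process (an assumption of this sketch).
    """
    k = (1.0 - alpha) / (1.0 + alpha)
    return (alpha - 1.0) * k / (var_ratio + k)


for alpha in (0.0, 0.5, 0.8, 0.95):
    print(alpha, round(first_stage_plim(alpha), 5))
```

The coefficient shrinks towards zero as alpha approaches one, which is the sense in which the lagged levels become weak instruments for the differences.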

The $T - 2$ additional moment conditions (additional to the moment conditions for the model in first differences) for the model in levels are

\[
E(u_{it} \Delta y_{i,t-1}) = E[(\eta_i + v_{it}) \Delta y_{i,t-1}] = E[(y_{it} - \alpha y_{i,t-1}) \Delta y_{i,t-1}] = 0 .
\]

These additional moment conditions are available if the initial conditions satisfy $E(\eta_i \Delta y_{i2}) = 0$, which holds when the process is mean stationary:

\[
y_{i1} = \frac{\eta_i}{1 - \alpha} + \varepsilon_{i1}, \qquad E(\varepsilon_{i1}) = E(\eta_i \varepsilon_{i1}) = 0 .
\]

The GMM estimator that combines the moment conditions for the differenced model with those for the levels model is called the SYSTEM estimator (Blundell and Bond (1998)). It has been shown to perform much better (less bias and more precision) than the first-differenced GMM estimator, especially when $\alpha$ is large, i.e. when the series are persistent. This is because $\Delta y_{i,t-1}$ is a good instrument for $y_{i,t-1}$: it explains $y_{i,t-1}$ well, irrespective of the value of $\alpha$. Whether the additional moment conditions are valid of course has to be tested, using the Sargan test.
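The claim that the lagged difference explains the lagged level irrespective of the degree of persistence can be illustrated with a short calculation. It is a sketch under the stronger assumption that the AR(1) component is covariance stationary with homoscedastic, serially uncorrelated errors; the symbols $w_{it}$, $\gamma_0$ and $\pi_{lev}$ are introduced here for the derivation only.

```latex
\begin{aligned}
y_{i,t-1} &= \frac{\eta_i}{1-\alpha} + w_{i,t-1}, \qquad
w_{it} = \alpha w_{i,t-1} + v_{it}, \qquad
\gamma_0 = Var(w_{it}) = \frac{\sigma_v^2}{1-\alpha^2}, \\
Cov(\Delta y_{i,t-1},\, y_{i,t-1}) &= Cov(w_{i,t-1} - w_{i,t-2},\, w_{i,t-1})
 = (1-\alpha)\gamma_0 = \frac{\sigma_v^2}{1+\alpha}, \\
Var(\Delta y_{i,t-1}) &= 2(1-\alpha)\gamma_0 = \frac{2\sigma_v^2}{1+\alpha}, \\
\pi_{lev} &= \frac{Cov(\Delta y_{i,t-1},\, y_{i,t-1})}{Var(\Delta y_{i,t-1})} = \frac{1}{2}.
\end{aligned}
```

Under these assumptions the population first-stage coefficient for the levels equation is one half for every value of $\alpha$, in contrast to the first-stage coefficient for the differenced equation, which vanishes as $\alpha$ approaches one.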

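The model is

\[
\begin{aligned}
y_{it} &= \alpha y_{i,t-1} + \beta x_{it} + \eta_i + v_{it}, \\
x_{it} &= \rho x_{i,t-1} + \tau \eta_i + \theta v_{it} + e_{it}.
\end{aligned}
\]

$T = 8$, $N = 500$, $\beta = 1$, $\tau = 0.25$, $\theta = -0.1$, $\sigma_\eta^2 = 1$, $\sigma_v^2 = 1$, $\sigma_e^2 = 0.16$ (Normal), 10,000 replications.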

ρ = 0.5
                    OLS             WG              DIF             SYS
                    Mean    St D    Mean    St D    Mean    St D    Mean    St D
            ρ       0.762   .012    0.265   .018    0.494   .034    0.501   .024
α = 0.5     α       0.820   .007    0.311   .017    0.480   .040    0.511   .027
            β       0.775   .034    0.490   .045    0.930   .136    0.997   .124
α = 0.95    α       0.990   .001    0.662   .016    0.548   .177    0.979   .011
            β       0.581   .035    0.388   .044    0.226   .356    0.983   .101

ρ = 0.95
                    OLS             WG              DIF             SYS
                    Mean    St D    Mean    St D    Mean    St D    Mean    St D
            ρ       0.997   .001    0.591   .017    0.676   .222    0.958   .031
α = 0.5     α       0.650   .009    0.396   .015    0.480   .033    0.518   .021
            β       0.830   .022    0.796   .040    0.800   .290    1.075   .059
α = 0.95    α       0.962   .001    0.882   .009    0.927   .025    0.957   .003
            β       0.902   .017    0.745   .040    0.615   .400    1.019   .031

An Example: Company Capital Stock

The estimated model is

\[
k_{it} = \lambda_t + \alpha k_{i,t-1} + \eta_i + v_{it}
\]

              OLS       Within    GMM1 DIF   GMM1 SYS
                        Groups    (t-3)      (t-3)
k_{i,t-1}     0.987     0.733     0.768      0.925
              (.002)    (.027)    (.070)     (.021)
m1            7.72      -6.82     -5.80      -6.51
m2            2.29      -1.73     -1.73      -1.81
Sargan (p)                        .563       0.627
Dif-Sar                                      0.562

Count Data Models

Often the dependent variable is an integer valued non-negative count variable, like the number of visits to the doctor, the number of patents granted or the average daily number of cigarettes smoked. A standard model for analysing such data is the Poisson regression model. The Poisson density for a count variable $y_i$ given $x_i$ is

\[
f(y_i \mid x_i) = \frac{e^{-\mu_i} \mu_i^{y_i}}{y_i!},
\]

where

\[
\mu_i = E(y_i \mid x_i) = \exp(x_i'\beta)
\]

is the conditional mean of $y_i$ given $x_i$, which is positive.

As $\ln(\mu_i) = x_i'\beta$, the model is often called a log-linear model. The Poisson distribution has the property that the conditional variance is equal to the conditional mean (equidispersion):

\[
Var(y_i \mid x_i) = E(y_i \mid x_i) = \exp(x_i'\beta).
\]

Consider the regression model

\[
y_i = \exp(x_i'\beta) + u_i
\]

with $E(u_i \mid x_i) = 0$, from which it follows that $E(x_i u_i) = 0$. As long as these conditions are valid in the population, the Poisson estimator for $\beta$ is consistent, even if the true distribution is not Poisson.

Parameter Interpretation

The partial effects are given by

\[
\frac{\partial E(y \mid x)}{\partial x_j} = \exp(x'\beta)\, \beta_j .
\]

Further,

\[
\beta_j = \frac{\partial E(y \mid x)}{\partial x_j} \frac{1}{E(y \mid x)}
\]

and so $\beta_j$ is a semi-elasticity: it equals (approximately, for small changes) the proportionate change in the conditional mean if the j-th regressor changes by one unit. If $x_j$ is replaced by $\ln(x_j)$, $\beta_j$ is the elasticity of $E(y \mid x)$ with respect to $x_j$.
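A small numerical check of this interpretation, using hypothetical coefficient and regressor values: the ratio of the partial effect to the conditional mean reproduces the coefficient, while the exact proportionate change from a full one-unit increase is exp(beta_j) - 1, which is close to beta_j only when beta_j is small.

```python
import math


def cond_mean(x, beta):
    """E(y | x) = exp(x'beta) in the exponential mean model."""
    return math.exp(sum(xj * bj for xj, bj in zip(x, beta)))


def partial_effect(x, beta, j):
    """Derivative of E(y | x) with respect to x_j: exp(x'beta) * beta_j."""
    return cond_mean(x, beta) * beta[j]


# Hypothetical values, for illustration only.
beta = [0.5, -0.2]
x = [1.0, 2.0]
j = 1

semi_elasticity = partial_effect(x, beta, j) / cond_mean(x, beta)

x_plus = list(x)
x_plus[j] += 1.0                                   # one-unit increase in x_j
exact_change = cond_mean(x_plus, beta) / cond_mean(x, beta) - 1.0
print(semi_elasticity, exact_change)
```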


Overdispersion In many applications there is overdispersion, i.e. the conditional variance is larger than the conditional mean (and sometimes there is underdispersion). The Poisson maximum likelihood estimated standard errors are then wrong, but they can easily be corrected by using robust standard errors that allow for general heteroskedasticity.

Overdispersion can be introduced directly into the model by introducing an unobserved heterogeneity term, $\eta_i$. Conditional on $x_i$ and $\eta_i$, the $y_i$ are Poisson distributed with

\[
E(y_i \mid x_i, \eta_i) = \exp(x_i'\beta + \eta_i) = \exp(x_i'\beta)\, \varepsilon_i ,
\]

where $\varepsilon_i = \exp(\eta_i)$. If $\varepsilon_i$ is independent of $x_i$ and has a gamma distribution with $E(\varepsilon_i) = 1$ and $Var(\varepsilon_i) = \delta^2$, then the conditional distribution of $y_i$ given $x_i$ is negative binomial with

\[
E(y_i \mid x_i) = \exp(x_i'\beta), \qquad
Var(y_i \mid x_i) = \exp(x_i'\beta) + \delta^2 \left[ \exp(x_i'\beta) \right]^2 .
\]

Note again that, as the conditional mean has not changed, the Poisson ML estimator is consistent in this case, but not efficient.
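The variance expression follows from the law of total variance, using only the Poisson property conditional on the heterogeneity term and the first two moments of that term. Writing $\mu_i = \exp(x_i'\beta)$ as shorthand:

```latex
\begin{aligned}
Var(y_i \mid x_i)
 &= E\left[ Var(y_i \mid x_i, \varepsilon_i) \mid x_i \right]
  + Var\left[ E(y_i \mid x_i, \varepsilon_i) \mid x_i \right] \\
 &= E\left[ \mu_i \varepsilon_i \mid x_i \right] + Var\left[ \mu_i \varepsilon_i \mid x_i \right] \\
 &= \mu_i + \delta^2 \mu_i^2 .
\end{aligned}
```

The second term is what makes the conditional variance exceed the conditional mean whenever $\delta^2 > 0$.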

Panel Data

To allow for general correlation between $x_{it}$ and $\eta_i$, the fixed effects Poisson estimator is obtained by including N individual specific dummies in the model. This is similar to the within groups estimator in the linear model, and therefore the Poisson model does not suffer from the incidental parameter problem. This fixed effects, within groups mean scaling estimator is obtained from the regression

\[
y_{it} = \mu_{it} \frac{\bar y_i}{\bar\mu_i} + w_{it},
\]

where $\mu_{it} = \exp(x_{it}'\beta)$, $\bar y_i = \frac{1}{T} \sum_{t=1}^{T} y_{it}$ and $\bar\mu_i = \frac{1}{T} \sum_{t=1}^{T} \mu_{it}$.
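The role of the mean scaling can be illustrated on a single made-up individual with one regressor. In the sketch below, counts are replaced by their noiseless conditional means so that the algebra is visible: whatever the size of the multiplicative individual effect, the ratio of the individual means absorbs it and the residual is zero at the true coefficient. All names and values are hypothetical.

```python
import math


def mean_scaling_fit(x, y, beta):
    """Fitted values mu_it * ybar_i / mubar_i for one individual (scalar regressor)."""
    mu = [math.exp(beta * xt) for xt in x]
    ybar = sum(y) / len(y)
    mubar = sum(mu) / len(mu)
    return [m * ybar / mubar for m in mu]


# Hypothetical individual: y_it = exp(eta_i) * exp(beta * x_it), no noise.
beta_true, eta = 0.3, 1.7
x = [0.0, 1.0, 2.0, 4.0]
y = [math.exp(eta) * math.exp(beta_true * xt) for xt in x]

fitted = mean_scaling_fit(x, y, beta_true)
residuals = [yt - ft for yt, ft in zip(y, fitted)]
print(residuals)
```

Changing eta leaves the residuals at zero, which mirrors how demeaning removes the additive effect in the linear within groups estimator.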

#cigs        Pooled Poisson       Fixed Effects Poisson
             coeff     se         coeff     se
age          .789      .052       .725      .078
age2         -.102     .006       -.095     .002
lrhi         -.112     .020       .019      .004
hsownd       -.586     .028       -.132     .008
unemp        .218      .030       -.016     .008
dkid04       -.115     .028       -.041     .006
female       -.202     .028
# obs        89844                32043
# indiv      19070                5657

Weak exogeneity

For both the random and fixed effects estimators, the $x_{it}$ have to be strictly exogenous. This assumption can be relaxed, and quasi-differencing techniques can be used to allow for endogenous and/or predetermined explanatory variables. Write the panel data model with unobserved heterogeneity as

\[
y_{it} = \exp(x_{it}'\beta)\, u_{it}, \qquad u_{it} = \varepsilon_i v_{it},
\]

where $\varepsilon_i$ is not correlated with $v_{it}$.

Then

\[
\frac{y_{it}}{\mu_{it}} - \frac{y_{i,t-1}}{\mu_{i,t-1}} = u_{it} - u_{i,t-1} = \varepsilon_i (v_{it} - v_{i,t-1}).
\]

If, for example, $x_{it}$ is endogenous, $E(x_{it} v_{it}) \neq 0$, valid moment conditions are

\[
E\left[ x_{i,t-s} \left( \frac{y_{it}}{\mu_{it}} - \frac{y_{i,t-1}}{\mu_{i,t-1}} \right) \right] = 0, \quad \text{for } s \geq 2 .
\]

Alternative moment conditions that are only valid when $x_{it}$ is predetermined are

\[
E\left[ x_{i,t-s} \left( y_{it} \frac{\mu_{i,t-1}}{\mu_{it}} - y_{i,t-1} \right) \right]
= E\left[ x_{i,t-s} \left( y_{it} \exp\big( (x_{i,t-1} - x_{it})'\beta \big) - y_{i,t-1} \right) \right] = 0, \quad \text{for } s \geq 1 .
\]

These moment conditions can be used for estimation by GMM.
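The two transformations can be compared on a pair of made-up observations for one individual with a scalar regressor. The sketch is purely illustrative: the values of the coefficient, the individual effect and the errors are hypothetical. It confirms that the first transformation leaves the individual effect times the change in the error, and that the second equals the first multiplied by the lagged mean function.

```python
import math


def ratio_difference(y_t, y_lag, x_t, x_lag, beta):
    """y_it / mu_it - y_{i,t-1} / mu_{i,t-1}, with mu = exp(beta * x)."""
    return y_t / math.exp(beta * x_t) - y_lag / math.exp(beta * x_lag)


def scaled_difference(y_t, y_lag, x_t, x_lag, beta):
    """y_it * exp((x_{i,t-1} - x_it) * beta) - y_{i,t-1}."""
    return y_t * math.exp((x_lag - x_t) * beta) - y_lag


# Hypothetical values, for illustration only.
beta, eps_i = 0.4, 2.5
x_lag, x_t = 1.0, 1.5
v_lag, v_t = 0.9, 1.2
y_lag = math.exp(beta * x_lag) * eps_i * v_lag
y_t = math.exp(beta * x_t) * eps_i * v_t

qd = ratio_difference(y_t, y_lag, x_t, x_lag, beta)
sd = scaled_difference(y_t, y_lag, x_t, x_lag, beta)
print(qd, sd)
```

In estimation, sample averages of these quantities multiplied by the lagged regressors would be stacked into the GMM criterion in the same way as the moment conditions of the linear model.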
