Panel Data Lecture Rome
Dynamic Panel Data Methods: Lecture II
Microeconometrics Lectures
Richard Blundell, UCL and IFS
March 2005
Dynamic Panel Data Methods: Background
The standard panel data model is
$y_{it} = \beta_0 + x_{it1}\beta_1 + x_{it2}\beta_2 + \dots + x_{itk}\beta_k + \eta_i + v_{it} = x_{it}'\beta + \eta_i + v_{it}$
where the $\eta_i$ are the unobserved constant individual effects, $i = 1, \dots, N$; $t = 1, \dots, T$, with N large and T small. Often lagged values of y are included in x.
An Example: Company Investment Rates
The panel data model is
$\frac{I_{it}}{K_{it}} = \alpha \frac{I_{it-1}}{K_{it-1}} + \eta_i + \lambda_t + v_{it}$
Unbalanced panel of company-level data, T = 4-10, N = 700.
Example
                   OLS Levels   Within Groups   DIF 2SLS
(I/K)_{it-1}       0.2669       -0.0094         0.1626
                   (.0185)      (.0181)         (.0362)
(I/K)_{it-2}
Stata command for GMM: xtabond2
On the CeMMAP website http://cemmap.ifs.org.uk/ (resources page), the Windmeijer course is available, together with the computer exercises and some of the data sets.
Three common specifications to deal with $\eta_i$:
1. Random effects
2. Fixed effects
3. First differences
In the model
$y_{it} = x_{it}'\beta + u_{it}$, $u_{it} = \eta_i + v_{it}$
we assume that
$E(v_{it}) = 0$; $E(v_{it} \mid x_{it}) = 0$
The Random Effects specification further assumes that
$E(\eta_i) = 0$; $E(\eta_i \mid x_{it}) = 0$
i.e. it assumes that the individual effect $\eta_i$ is uncorrelated with the regressors $x_{it}$. Therefore
$E(y_{it} \mid x_{it}) = x_{it}'\beta + E(\eta_i \mid x_{it}) + E(v_{it} \mid x_{it}) = x_{it}'\beta$
and therefore the simple OLS estimator on the pooled data is unbiased. However, it is not efficient, and the estimated standard errors are wrong, because they do not take account of the dependence of the error term within individuals over time.
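Let $u_{it} = \eta_i + v_{it}$ and assume independence of $v_{is}$ and $v_{it}$, $s \neq t$, and of $\eta_i$ and the $v_{it}$. Then
$E(u_{is} u_{it}) = E(\eta_i^2) = \sigma_\eta^2$ for $s \neq t$, and $E(u_{it}^2) = \sigma_\eta^2 + \sigma_v^2$
and therefore the $u_{is}$ and $u_{it}$ are correlated. With $u_i = (u_{i1}, u_{i2}, \dots, u_{iT})'$, the within-individual variance-covariance matrix is given by
$\Omega = E(u_i u_i') = \begin{bmatrix} \sigma_\eta^2 + \sigma_v^2 & \sigma_\eta^2 & \cdots & \sigma_\eta^2 \\ \sigma_\eta^2 & \sigma_\eta^2 + \sigma_v^2 & \cdots & \sigma_\eta^2 \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_\eta^2 & \sigma_\eta^2 & \cdots & \sigma_\eta^2 + \sigma_v^2 \end{bmatrix}$
The Random Effects (GLS) estimator is
$\hat\beta_{RE} = \left( \sum_{i=1}^{N} X_i' \Omega^{-1} X_i \right)^{-1} \sum_{i=1}^{N} X_i' \Omega^{-1} y_i$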
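The random effects covariance structure and the GLS formula can be illustrated with a small simulation. This is a minimal sketch under stated assumptions: the panel is balanced, the variances of the individual effect and of the idiosyncratic error are treated as known (in practice they would be estimated), and the helper names `re_omega` and `gls` are hypothetical, not taken from the lecture.

```python
import numpy as np

def re_omega(T, sigma2_eta, sigma2_v):
    """T x T covariance matrix of u_i = (u_i1, ..., u_iT)' under random effects."""
    return sigma2_eta * np.ones((T, T)) + sigma2_v * np.eye(T)

def gls(X, y, omega):
    """GLS for a balanced panel. X has shape (N, T, k), y has shape (N, T)."""
    omega_inv = np.linalg.inv(omega)
    # sum_i X_i' Omega^{-1} X_i and sum_i X_i' Omega^{-1} y_i
    XtOX = np.einsum("itk,ts,isl->kl", X, omega_inv, X)
    XtOy = np.einsum("itk,ts,is->k", X, omega_inv, y)
    return np.linalg.solve(XtOX, XtOy)

rng = np.random.default_rng(0)
N, T = 2000, 5
beta = np.array([1.0, 0.5])                   # assumed intercept and slope
x = rng.normal(size=(N, T))
X = np.stack([np.ones((N, T)), x], axis=2)    # (N, T, 2)
eta = rng.normal(size=(N, 1))                 # individual effect, independent of x
v = rng.normal(size=(N, T))                   # idiosyncratic error
y = X @ beta + eta + v

omega = re_omega(T, sigma2_eta=1.0, sigma2_v=1.0)
beta_gls = gls(X, y, omega)
print(omega)
print(beta_gls)
```

Because the simulated individual effect is drawn independently of x, the random effects assumption holds here by construction and the GLS estimates should lie close to the assumed coefficients.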
Fixed Effects
The more likely and interesting case is when the unobserved individual effects are correlated with the regressors:
$E(\eta_i \mid x_{it}) \neq 0$.
Clearly, in this case OLS and the Random Effects estimator are biased and inconsistent, as
$E(y_{it} \mid x_{it}) = x_{it}'\beta + E(\eta_i \mid x_{it}) + E(v_{it} \mid x_{it}) = x_{it}'\beta + E(\eta_i \mid x_{it}) \neq x_{it}'\beta$
A solution is to estimate the model by OLS with a separate intercept for every individual. As
$\eta_i = \bar y_i - \bar x_i'\beta - \bar v_i$
this is equivalent, as far as the parameters $\beta$ are concerned, to estimating the transformed within-group model by OLS:
$y_{it} - \bar y_i = (x_{it} - \bar x_i)'\beta + (v_{it} - \bar v_i)$
Therefore, with the fixed effects (within-group) estimator, only the effects of variables that change over time can be estimated. (OLS standard errors in the transformed model are again wrong, as they ignore the fact that N intercepts have been estimated.)
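The equivalence between OLS with a separate intercept per individual and OLS on the within-group transformed data can be checked numerically. The sketch below uses simulated data with a single regressor that is correlated with the individual effect; the data-generating values are illustrative assumptions only.

```python
import numpy as np

rng = np.random.default_rng(1)
N, T = 50, 4
eta = rng.normal(size=(N, 1))
x = eta + rng.normal(size=(N, T))             # regressor correlated with eta
y = 2.0 * x + eta + rng.normal(size=(N, T))   # assumed true slope of 2

# Within-group estimator: subtract individual means, then OLS.
x_w = x - x.mean(axis=1, keepdims=True)
y_w = y - y.mean(axis=1, keepdims=True)
beta_within = (x_w * y_w).sum() / (x_w ** 2).sum()

# Dummy-variable estimator: OLS of y on x and N individual intercepts.
D = np.kron(np.eye(N), np.ones((T, 1)))       # (N*T, N) matrix of individual dummies
Z = np.column_stack([x.reshape(-1), D])
coef, *_ = np.linalg.lstsq(Z, y.reshape(-1), rcond=None)
beta_lsdv = coef[0]

print(beta_within, beta_lsdv)
```

The two slope estimates agree up to floating-point error, which is the sense in which the transformation is "equivalent for the parameters".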
For the fixed effects estimator to be unbiased, the $x_{it}$ in all periods must be uncorrelated with the $v_{is}$ in all periods:
$E(v_{is} x_{it}) = 0$; $s = 1, \dots, T$, $t = 1, \dots, T$
When $x_{it}$ satisfies this condition, we call it strictly exogenous. Assuming strict exogeneity, the Hausman test can be used to test whether the unobserved heterogeneity is correlated with the regressors. When they are not correlated, the RE estimator is efficient. If they are correlated, the FE estimator is consistent, but the RE estimator is not.
$H = (\hat\beta_{FE} - \hat\beta_{RE})' \left[ \widehat{Var}(\hat\beta_{FE}) - \widehat{Var}(\hat\beta_{RE}) \right]^{-1} (\hat\beta_{FE} - \hat\beta_{RE})$
If H is large, RE is rejected in favour of FE. For large samples $H \sim \chi^2_k$, with k the number of elements in $\beta$.
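As a worked illustration of the Hausman statistic, the sketch below evaluates the quadratic form for given estimates and variance matrices. The function name `hausman` and the numbers passed to it are made up for illustration; they are not results from the lecture.

```python
import numpy as np

def hausman(b_fe, b_re, V_fe, V_re):
    """Return (H, k): the Hausman statistic and its degrees of freedom."""
    diff = np.asarray(b_fe, dtype=float) - np.asarray(b_re, dtype=float)
    V_diff = np.asarray(V_fe, dtype=float) - np.asarray(V_re, dtype=float)
    H = float(diff @ np.linalg.solve(V_diff, diff))
    return H, diff.size

# Hypothetical one-parameter example: H = 0.2^2 / (0.02 - 0.01) = 4
H, k = hausman(b_fe=[1.0], b_re=[0.8], V_fe=[[0.02]], V_re=[[0.01]])
print(H, k)
```

With one degree of freedom the 5% critical value of the chi-squared distribution is about 3.84, so in this made-up example RE would be rejected in favour of FE.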
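Again consider the model
$y_{it} = x_{it}'\beta + u_{it}$, $u_{it} = \eta_i + v_{it}$
where the unobserved individual effects $\eta_i$ are correlated with $x_{it}$. Taking first differences eliminates $\eta_i$:
$y_{it} - y_{it-1} = (x_{it} - x_{it-1})'\beta + (u_{it} - u_{it-1}) = (x_{it} - x_{it-1})'\beta + (v_{it} - v_{it-1})$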
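and therefore OLS is unbiased if $v_{it} - v_{it-1}$ and $x_{it} - x_{it-1}$ are uncorrelated. This is a weaker assumption than the strict exogeneity assumption of the fixed effects estimator. Again, the OLS estimated standard errors are wrong, as they do not take account of the correlation between $v_{it} - v_{it-1}$ and $v_{it-1} - v_{it-2}$:
$E[(v_{it} - v_{it-1})(v_{it-1} - v_{it-2})] = -\sigma_v^2 \neq 0$
With $\Delta v_i$ the vector of first-differenced errors for individual i,
$E(\Delta v_i \Delta v_i') = \sigma_v^2 \begin{bmatrix} 2 & -1 & & 0 \\ -1 & 2 & \ddots & \\ & \ddots & \ddots & -1 \\ 0 & & -1 & 2 \end{bmatrix}$
(when the $v_{it}$ themselves are not correlated over time).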
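The tridiagonal covariance matrix of the differenced errors follows directly from applying the first-difference operator to uncorrelated, homoscedastic errors. A minimal sketch, assuming an error variance of 1 and a hypothetical helper name `first_difference_matrix`:

```python
import numpy as np

def first_difference_matrix(T):
    """(T-1) x T matrix D such that D @ v = (v_2 - v_1, ..., v_T - v_{T-1})."""
    D = np.zeros((T - 1, T))
    idx = np.arange(T - 1)
    D[idx, idx] = -1.0
    D[idx, idx + 1] = 1.0
    return D

T = 5
sigma2_v = 1.0                       # assumed error variance
D = first_difference_matrix(T)
# If E(v v') = sigma2_v * I, then E(Dv (Dv)') = sigma2_v * D D'.
cov_dv = sigma2_v * D @ D.T
print(cov_dv)
```

The result has 2 on the diagonal, -1 on the first off-diagonals and 0 elsewhere, matching the matrix above.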
Consider again the model in first differences
$y_{it} - y_{it-1} = (x_{it} - x_{it-1})'\beta + (v_{it} - v_{it-1})$
Here $x_{it}$ is endogenous if it is correlated with $v_{it}$.
There can also be feedback from $v_{it-1}$ to $x_{it}$ such that $E(x_{it} v_{it-1}) \neq 0$. In this case we call $x_{it}$ predetermined or weakly exogenous. In both cases $E[(x_{it} - x_{it-1})(v_{it} - v_{it-1})] \neq 0$ and OLS is biased. If the $v_{it}$ are not correlated over time, lagged values of $x_{it}$ can be used as instruments for the endogenous differences, and the model can be estimated by the Instrumental Variables estimator.
If $x_{it}$ is endogenous, $E(x_{it} v_{it}) \neq 0$ and $E(x_{it-1} v_{it-1}) \neq 0$. Valid instruments are $x_{is}$, with $s = 1, \dots, t-2$, as $E[x_{it-2}(v_{it} - v_{it-1})] = 0$.
If $x_{it}$ is predetermined, $E(x_{it} v_{it-1}) \neq 0$ but $E(x_{it-1} v_{it-1}) = 0$. Valid instruments therefore are $x_{is}$, with $s = 1, \dots, t-1$.
Treatment Effects in Panels
Suppose the model is:
$y_{it} = \delta_i d_{it} + x_{it}'\beta + \lambda_t + \eta_i + v_{it}$
where $d_{it} = 1$ if the program impacts on group i in period t. Typically, once the program is in place, this dummy is set to unity for all remaining time periods. If the time effects, the group effects and the x are sufficient to render $d_{it}$ exogenous, then within groups (fixed effects) will be consistent for the ATT (average treatment effect on the treated) impact of the treatment. In this case, if the treatment occurs at the same time for all groups that are treated, then diff-in-diff and within groups are identical estimators.
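The claim that diff-in-diff and within groups coincide when treatment starts at the same time for all treated groups can be checked on simulated data. This sketch makes several simplifying assumptions: a balanced panel, no x variables, a homogeneous treatment effect of 1.5, and group and time effects removed by two-way demeaning (which in a balanced panel is equivalent to including group and time dummies).

```python
import numpy as np

rng = np.random.default_rng(2)
N, T, T0 = 40, 6, 3                     # treatment starts at period index T0
treated = np.arange(N) < N // 2         # first half of the groups is treated
post = np.arange(T) >= T0
d = (treated[:, None] & post[None, :]).astype(float)
y = (1.5 * d + rng.normal(size=(N, 1))  # group effects
     + rng.normal(size=(1, T))          # time effects
     + rng.normal(size=(N, T)))         # idiosyncratic errors

# Diff-in-diff of the four cell means.
did = ((y[treated][:, post].mean() - y[treated][:, ~post].mean())
       - (y[~treated][:, post].mean() - y[~treated][:, ~post].mean()))

def two_way_demean(a):
    """Remove group and time means from a balanced N x T array."""
    return a - a.mean(axis=1, keepdims=True) - a.mean(axis=0, keepdims=True) + a.mean()

# Within-groups estimator with group and time effects.
d_w, y_w = two_way_demean(d), two_way_demean(y)
beta_wg = (d_w * y_w).sum() / (d_w ** 2).sum()

print(did, beta_wg)
```

The two numbers agree up to rounding error. With staggered treatment timing or an unbalanced panel the equality would not be expected to hold in general.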
Dynamic Panel Data Models
A dynamic panel data model is specified as
$y_{it} = \alpha y_{it-1} + x_{it}'\beta + \eta_i + v_{it}$
Consider a model without other explanatory variables
$y_{it} = \alpha y_{it-1} + \eta_i + v_{it}$
Clearly, $y_{it-1} = \alpha y_{it-2} + \eta_i + v_{it-1}$ is correlated with $\eta_i$. The OLS estimator is biased upwards. The Fixed Effects estimator is biased downwards (this bias gets smaller for larger T).
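The direction of the two biases can be seen in a short Monte Carlo style experiment. This is a sketch under assumed values (autoregressive coefficient 0.5, unit variances, T = 6, a burn-in so the process is close to stationary); the function `simulate_ar1_panel` is a hypothetical helper written for this illustration.

```python
import numpy as np

def simulate_ar1_panel(N, T, alpha, rng, burn_in=50):
    """Simulate y_it = alpha * y_it-1 + eta_i + v_it and keep the last T periods."""
    eta = rng.normal(size=N)
    y = eta / (1.0 - alpha)             # start each individual at its long-run mean
    out = np.empty((N, T))
    for t in range(burn_in + T):
        y = alpha * y + eta + rng.normal(size=N)
        if t >= burn_in:
            out[:, t - burn_in] = y
    return out

rng = np.random.default_rng(3)
alpha = 0.5
y = simulate_ar1_panel(N=5000, T=6, alpha=alpha, rng=rng)
y_lag, y_cur = y[:, :-1], y[:, 1:]

# Pooled OLS in levels (with a common intercept).
xc = y_lag - y_lag.mean()
alpha_ols = (xc * (y_cur - y_cur.mean())).sum() / (xc ** 2).sum()

# Within-groups (fixed effects) estimator.
xw = y_lag - y_lag.mean(axis=1, keepdims=True)
yw = y_cur - y_cur.mean(axis=1, keepdims=True)
alpha_wg = (xw * yw).sum() / (xw ** 2).sum()

print(alpha_ols, alpha_wg)
```

In this design the pooled OLS estimate lies well above 0.5 and the within-groups estimate well below it, so the two can serve as rough upper and lower bounds when judging other estimators.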
For the first differenced model
$y_{it} - y_{it-1} = \alpha (y_{it-1} - y_{it-2}) + (v_{it} - v_{it-1})$
$y_{it-1}$ is of course correlated with $v_{it-1}$ (y is predetermined), and the OLS estimator in the differenced model is severely downward biased.
Valid instruments for $y_{it-1} - y_{it-2}$ are the lagged levels $y_{it-2}, y_{it-3}, \dots, y_{i1}$, as $E[y_{it-2}(v_{it} - v_{it-1})] = 0$.
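A simple way to see the instrument at work is to use only the single lagged level $y_{it-2}$ as an instrument for $y_{it-1} - y_{it-2}$, which is the Anderson-Hsiao type of estimator rather than the full GMM estimator. The sketch below compares it with OLS on the differenced equation; the simulation design and the helper `simulate_ar1_panel` are assumptions made for illustration.

```python
import numpy as np

def simulate_ar1_panel(N, T, alpha, rng, burn_in=50):
    """Simulate y_it = alpha * y_it-1 + eta_i + v_it and keep the last T periods."""
    eta = rng.normal(size=N)
    y = eta / (1.0 - alpha)
    out = np.empty((N, T))
    for t in range(burn_in + T):
        y = alpha * y + eta + rng.normal(size=N)
        if t >= burn_in:
            out[:, t - burn_in] = y
    return out

rng = np.random.default_rng(4)
alpha = 0.5
y = simulate_ar1_panel(N=10000, T=6, alpha=alpha, rng=rng)

dy = np.diff(y, axis=1)     # dy[:, j] = y[:, j+1] - y[:, j]
dep = dy[:, 1:]             # differenced dependent variable
reg = dy[:, :-1]            # lagged difference (the endogenous regressor)
inst = y[:, :-2]            # level lagged twice, used as the instrument

# OLS on the first-differenced equation (no intercept).
alpha_fd_ols = (reg * dep).sum() / (reg ** 2).sum()

# Just-identified IV: sum(z * dep) / sum(z * reg).
alpha_iv = (inst * dep).sum() / (inst * reg).sum()

print(alpha_fd_ols, alpha_iv)
```

In this design the differenced OLS estimate is far below the assumed value of 0.5, while the IV estimate is close to it.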
An Instrumental Variables estimator that uses this information optimally is the Generalised Method of Moments (GMM) estimator.
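Let $\Delta v_i$ be the vector of errors for individual i in the first differenced equation:
$\Delta v_i = \begin{bmatrix} v_{i3} - v_{i2} \\ v_{i4} - v_{i3} \\ \vdots \\ v_{iT} - v_{iT-1} \end{bmatrix} = \begin{bmatrix} \Delta y_{i3} - \alpha \Delta y_{i2} \\ \Delta y_{i4} - \alpha \Delta y_{i3} \\ \vdots \\ \Delta y_{iT} - \alpha \Delta y_{iT-1} \end{bmatrix}$
and let $Z_i$ be the matrix of instruments for individual i
$Z_i = \begin{bmatrix} y_{i1} & 0 & 0 & \cdots & 0 & \cdots & 0 \\ 0 & y_{i1} & y_{i2} & \cdots & 0 & \cdots & 0 \\ \vdots & & & \ddots & & & \vdots \\ 0 & 0 & 0 & \cdots & y_{i1} & \cdots & y_{iT-2} \end{bmatrix}$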
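The block structure of the instrument matrix is easiest to see by building it for one individual. The helper `instrument_matrix` below is a hypothetical illustration that assumes a balanced series of levels with no missing periods.

```python
import numpy as np

def instrument_matrix(y_i):
    """Build Z_i from the levels y_i1, ..., y_iT of one individual.

    Row r corresponds to the differenced equation for period t = r + 3 and
    holds the instruments y_i1, ..., y_i,t-2 in its own block of columns.
    """
    y_i = np.asarray(y_i, dtype=float)
    T = y_i.size
    n_cols = (T - 1) * (T - 2) // 2
    Z = np.zeros((T - 2, n_cols))
    col = 0
    for r in range(T - 2):
        n_inst = r + 1                       # number of lagged levels available
        Z[r, col:col + n_inst] = y_i[:n_inst]
        col += n_inst
    return Z

Z = instrument_matrix([10.0, 20.0, 30.0, 40.0, 50.0])   # T = 5
print(Z)
# [[10.  0.  0.  0.  0.  0.]
#  [ 0. 10. 20.  0.  0.  0.]
#  [ 0.  0.  0. 10. 20. 30.]]
```

With T = 5 there are T - 2 = 3 differenced equations and (T - 1)(T - 2)/2 = 6 instrument columns, one moment condition per column.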
$E(Z_i' \Delta v_i) = 0$, a total of $(T-1)(T-2)/2$ moment conditions. The GMM estimator uses these moment conditions to estimate the parameters consistently and efficiently in two steps. The one-step estimator minimises
$J_N(\alpha) = \left( \frac{1}{N} \sum_{i=1}^{N} Z_i' \Delta v_i \right)' W_N \left( \frac{1}{N} \sum_{i=1}^{N} Z_i' \Delta v_i \right)$
where $W_N$ is a weight matrix. Choosing $W_N = \left( \frac{1}{N} \sum_{i=1}^{N} Z_i' Z_i \right)^{-1}$ results in the Two-Stage Least Squares estimator.
The one-step GMM estimator uses as the weight matrix
$W_{N1} = \left( \frac{1}{N} \sum_{i=1}^{N} Z_i' A Z_i \right)^{-1}$, $A = \begin{bmatrix} 2 & -1 & & 0 \\ -1 & 2 & \ddots & \\ & \ddots & \ddots & -1 \\ 0 & & -1 & 2 \end{bmatrix}$
and is efficient when the errors are homoscedastic and not correlated over time. This is often too restrictive. However, the one-step results are consistent, and robust standard errors that adjust for heteroscedasticity and autocorrelation are easily obtained.
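For the model with only a lagged dependent variable, the minimiser of the GMM criterion has a closed form, so the one-step estimator can be sketched in a few lines. Everything here is an illustrative assumption: the simulated design, the helper names, and the restriction to a balanced panel with a single parameter. It is not the implementation used by xtabond2.

```python
import numpy as np

def simulate_ar1_panel(N, T, alpha, rng, burn_in=50):
    """Simulate y_it = alpha * y_it-1 + eta_i + v_it and keep the last T periods."""
    eta = rng.normal(size=N)
    y = eta / (1.0 - alpha)
    out = np.empty((N, T))
    for t in range(burn_in + T):
        y = alpha * y + eta + rng.normal(size=N)
        if t >= burn_in:
            out[:, t - burn_in] = y
    return out

def instrument_matrix(y_i):
    """Block-structured matrix of lagged levels for one individual."""
    T = y_i.size
    Z = np.zeros((T - 2, (T - 1) * (T - 2) // 2))
    col = 0
    for r in range(T - 2):
        Z[r, col:col + r + 1] = y_i[:r + 1]
        col += r + 1
    return Z

rng = np.random.default_rng(5)
alpha = 0.5
N, T = 5000, 5
y = simulate_ar1_panel(N, T, alpha, rng)
dy = np.diff(y, axis=1)
q = (T - 1) * (T - 2) // 2

# A: 2 on the diagonal, -1 on the first off-diagonals.
A = 2 * np.eye(T - 2) - np.eye(T - 2, k=1) - np.eye(T - 2, k=-1)

Zdy = np.zeros(q)        # (1/N) sum_i Z_i' (differenced y)
Zdy_lag = np.zeros(q)    # (1/N) sum_i Z_i' (lagged differenced y)
ZAZ = np.zeros((q, q))   # (1/N) sum_i Z_i' A Z_i
for i in range(N):
    Z_i = instrument_matrix(y[i])
    Zdy += Z_i.T @ dy[i, 1:] / N
    Zdy_lag += Z_i.T @ dy[i, :-1] / N
    ZAZ += Z_i.T @ A @ Z_i / N

W1 = np.linalg.inv(ZAZ)
# Minimising (Zdy - a * Zdy_lag)' W1 (Zdy - a * Zdy_lag) over a gives:
alpha_1 = (Zdy_lag @ W1 @ Zdy) / (Zdy_lag @ W1 @ Zdy_lag)
print(alpha_1)
```

The simulated errors are homoscedastic and serially uncorrelated, which is exactly the case in which the one-step weight matrix is efficient.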
The two-step estimator is efficient under more general conditions, like heteroscedasticity. The efficient weight matrix is computed as
$W_{N2} = \left( \frac{1}{N} \sum_{i=1}^{N} Z_i' \widehat{\Delta v}_i \widehat{\Delta v}_i' Z_i \right)^{-1}$
$\widehat{\Delta v}_i = \Delta y_i - \hat\alpha_1 \Delta y_{i,-1}$,
where $\hat\alpha_1$ is the one-step GMM estimator. A problem is that in small samples (small number of individuals) the estimated standard errors of the two-step GMM estimator tend to be too small.
Sargan test for overidentifying restrictions:
The null hypothesis for this test is that the instruments are valid in the sense that they are not correlated with the errors in the first-differenced equation. It is computed as
$S = N J_N(\hat\alpha_2) = N \left( \frac{1}{N} \sum_{i=1}^{N} Z_i' \widehat{\Delta v}_{i2} \right)' W_{N2} \left( \frac{1}{N} \sum_{i=1}^{N} Z_i' \widehat{\Delta v}_{i2} \right)$.
Under the null, this test statistic has a $\chi^2_q$ distribution, with q equal to the total number of instruments minus the number of parameters in the model. Only use the two-step result for the Sargan test. Note
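The two-step estimator and the Sargan statistic can be sketched together, since the statistic reuses the two-step weight matrix and residuals. As before this is a hedged illustration on simulated data with a single parameter; the helper names and the design (autoregressive coefficient 0.5, N = 2000, T = 5) are assumptions, and no small-sample correction of the standard errors is attempted.

```python
import numpy as np

def simulate_ar1_panel(N, T, alpha, rng, burn_in=50):
    """Simulate y_it = alpha * y_it-1 + eta_i + v_it and keep the last T periods."""
    eta = rng.normal(size=N)
    y = eta / (1.0 - alpha)
    out = np.empty((N, T))
    for t in range(burn_in + T):
        y = alpha * y + eta + rng.normal(size=N)
        if t >= burn_in:
            out[:, t - burn_in] = y
    return out

def instrument_matrix(y_i):
    """Block-structured matrix of lagged levels for one individual."""
    T = y_i.size
    Z = np.zeros((T - 2, (T - 1) * (T - 2) // 2))
    col = 0
    for r in range(T - 2):
        Z[r, col:col + r + 1] = y_i[:r + 1]
        col += r + 1
    return Z

rng = np.random.default_rng(6)
alpha = 0.5
N, T = 2000, 5
y = simulate_ar1_panel(N, T, alpha, rng)
Z = np.stack([instrument_matrix(row) for row in y])   # (N, T-2, q)
dy = np.diff(y, axis=1)
dep, lag = dy[:, 1:], dy[:, :-1]
A = 2 * np.eye(T - 2) - np.eye(T - 2, k=1) - np.eye(T - 2, k=-1)

Zdep = np.einsum("irq,ir->q", Z, dep) / N
Zlag = np.einsum("irq,ir->q", Z, lag) / N

def gmm(W):
    """Closed-form minimiser of the GMM criterion for one parameter."""
    return (Zlag @ W @ Zdep) / (Zlag @ W @ Zlag)

# Step 1: weight matrix based on A.
W1 = np.linalg.inv(np.einsum("irq,rs,isp->qp", Z, A, Z) / N)
alpha_1 = gmm(W1)

# Step 2: weight matrix based on one-step residuals.
res1 = dep - alpha_1 * lag
g1 = np.einsum("irq,ir->iq", Z, res1)      # row i holds Z_i' times residuals of i
W2 = np.linalg.inv(g1.T @ g1 / N)
alpha_2 = gmm(W2)

# Sargan statistic from the two-step residuals.
res2 = dep - alpha_2 * lag
gbar = np.einsum("irq,ir->q", Z, res2) / N
sargan = N * gbar @ W2 @ gbar
df = Z.shape[2] - 1                         # instruments minus parameters
print(alpha_1, alpha_2, sargan, df)
```

Here the instruments are valid by construction, so the statistic should behave like a draw from a chi-squared distribution with 5 degrees of freedom.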