Differences-in-Differences

Jörn-Steffen Pischke

LSE

October 10, 2016

Pischke (LSE) Differences-in-Differences October 10, 2016 1 / 12

Fixed effects versus differences-in-differences

Recall how the fixed effects model assumes

$$E(Y_{0it} \mid A_i, t) = \alpha + \lambda_t + \gamma A_i$$

or

$$E(Y_{0it} \mid i, t) = \alpha_i + \lambda_t$$

The differences-in-differences (DD) model makes a very similar assumption but conditions on a group level instead of an individual level effect

$$E(Y_{0ist} \mid s, t) = \gamma_s + \lambda_t$$

where $s$ could be, for example, a state. This is sufficient for any treatment that happens at the state-time level.

While the basic strategy is the same, the data requirements are much less. We don’t need repeated observations on unit $i$ (i.e. a panel). Repeated cross-sections sampling from the same aggregate units $s$ are sufficient.

Compulsory schooling laws

An example for a differences-in-differences setup would be the effect of compulsory schooling laws on schooling obtained in the US. These laws are set at the state level, and different states change the compulsory schooling laws at different times. For example, Florida raised its compulsory schooling requirement from 5 to 7 grades in 1935. Neighboring Georgia required 6 grades both before and after 1935.

We can think of FL as the treatment state and GA as the control state.

1934 is a control period and 1935 is the treatment period.


Differences-in-differences

Let $D_{st}$ denote a dummy for the treatment, i.e. the higher compulsory schooling requirement in FL in 1935.

$$Y_{ist} = \gamma_s + \lambda_t + \beta D_{st} + e_{ist}$$

Using this it is easy to see that

$$E[Y_{ist} \mid s = GA, t = 1935] - E[Y_{ist} \mid s = GA, t = 1934] = \lambda_{1935} - \lambda_{1934}$$

$$E[Y_{ist} \mid s = FL, t = 1935] - E[Y_{ist} \mid s = FL, t = 1934] = \lambda_{1935} - \lambda_{1934} + \beta$$

The population difference-in-difference is

$$[E(Y_{ist} \mid s = FL, t = 1935) - E(Y_{ist} \mid s = FL, t = 1934)] - [E(Y_{ist} \mid s = GA, t = 1935) - E(Y_{ist} \mid s = GA, t = 1934)] = \beta$$
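The double difference above can be computed directly from four cell means. The following is a minimal sketch in Python, assuming made-up schooling values for individuals sampled in each state and year; the numbers and the `diff_in_diff` helper are illustrative and not taken from the lecture.

```python
from statistics import mean

# Hypothetical micro data: years of schooling for individuals sampled in each
# (state, year) cell. Different people appear in each cell, as in repeated
# cross-sections.
schooling = {
    ("FL", 1934): [6.0, 7.0, 8.0],
    ("FL", 1935): [8.0, 9.0, 8.5],
    ("GA", 1934): [7.0, 7.5, 8.0],
    ("GA", 1935): [7.5, 8.0, 8.5],
}

def diff_in_diff(cells, treated, control, before, after):
    """Change in the treated group's mean minus change in the control group's mean."""
    change_treated = mean(cells[(treated, after)]) - mean(cells[(treated, before)])
    change_control = mean(cells[(control, after)]) - mean(cells[(control, before)])
    return change_treated - change_control

beta_hat = diff_in_diff(schooling, "FL", "GA", 1934, 1935)
print(f"DD estimate: {beta_hat:.2f}")
```

With these assumed numbers, schooling rises by 1.5 years in FL and by 0.5 years in GA, so the DD estimate is 1.0.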


Identification in the differences-in-differences model

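[Figure: schooling (vertical axis) against time (horizontal axis; before: 1934, after: 1935). Lines show the schooling trend in the control state (GA), the schooling trend in the treatment state (FL), and the counterfactual schooling trend in the treatment state. The effect of the compulsory schooling law is marked in the treatment state.]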


Regression DD

$$Y_{ist} = \gamma_s + \lambda_t + \beta D_{st} + e_{ist}$$

$$= \alpha + \gamma 1(s = FL) + \lambda 1(t = 1935) + \beta 1(s = FL) \cdot 1(t = 1935) + e_{ist}$$

where $1(\cdot)$ is the indicator function. Taking conditional expectations for different states and periods, and subtracting, yields

$$\alpha = E(Y_{ist} \mid s = GA, t = 1934) = \gamma_{GA} + \lambda_{1934}$$

$$\gamma = E(Y_{ist} \mid s = FL, t = 1934) - E(Y_{ist} \mid s = GA, t = 1934) = \gamma_{FL} - \gamma_{GA}$$

$$\lambda = E(Y_{ist} \mid s = GA, t = 1935) - E(Y_{ist} \mid s = GA, t = 1934) = \lambda_{1935} - \lambda_{1934}$$

$$\beta = [E(Y_{ist} \mid s = FL, t = 1935) - E(Y_{ist} \mid s = FL, t = 1934)] - [E(Y_{ist} \mid s = GA, t = 1935) - E(Y_{ist} \mid s = GA, t = 1934)]$$
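To see how the four regression coefficients map onto cell means, the 2x2 regression can be run on a small data set. This is a sketch under stated assumptions: the twelve observations are invented for illustration, and `numpy.linalg.lstsq` is used as a plain least-squares solver rather than a full regression package.

```python
import numpy as np

# Hypothetical micro data: one row per individual (state, year, schooling).
rows = [
    ("FL", 1934, 6.0), ("FL", 1934, 7.0), ("FL", 1934, 8.0),
    ("FL", 1935, 8.0), ("FL", 1935, 9.0), ("FL", 1935, 8.5),
    ("GA", 1934, 7.0), ("GA", 1934, 7.5), ("GA", 1934, 8.0),
    ("GA", 1935, 7.5), ("GA", 1935, 8.0), ("GA", 1935, 8.5),
]
fl = np.array([1.0 if s == "FL" else 0.0 for s, _, _ in rows])
post = np.array([1.0 if t == 1935 else 0.0 for _, t, _ in rows])
y = np.array([v for _, _, v in rows])

# Columns: constant, 1(s = FL), 1(t = 1935), and their interaction.
X = np.column_stack([np.ones_like(y), fl, post, fl * post])
alpha, gamma, lam, beta = np.linalg.lstsq(X, y, rcond=None)[0]

print(f"alpha={alpha:.2f} gamma={gamma:.2f} lambda={lam:.2f} beta={beta:.2f}")
```

Because the model is saturated in the 2x2 case, the fitted coefficients reproduce the cell means exactly: the constant is the GA 1934 mean, and the interaction coefficient is the double difference of means (1.0 for these assumed numbers).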


Advantages of the regression formulation

$$Y_{ist} = \gamma_s + \lambda_t + \beta D_{st} + e_{ist}$$

1 The regression gives you a standard error and t-statistic on $\beta$.

2 Can easily extend the DD framework to more than two states and periods: e.g. Acemoglu and Angrist (2000) use individuals born between 1910 and 1940 in any of the lower 48 states.

If we use more than 2x2 states/periods:

3 Can use a multivalued treatment indicator

$$Y_{ist} = \gamma_s + \lambda_t + \beta CS_{st} + e_{ist}$$

where $CS_{st}$ takes on values from 6 to 9 for the number of grades required, or multiple dummies for the different treatments.

4 Can add covariates

$$Y_{ist} = \gamma_s + \lambda_t + \beta CS_{st} + X_{st}\delta + e_{ist}$$
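The extension to several states and periods with a multivalued treatment amounts to a regression on state dummies, year dummies, and the treatment variable. Below is a rough sketch; the three states, four years, required-grade values, and effect sizes are all assumptions chosen for illustration, and the outcome is generated without noise so that the coefficient is recovered exactly.

```python
import numpy as np

def dummies(labels):
    """Return a 0/1 matrix with one column per distinct label, dropping the first."""
    levels = sorted(set(labels))
    return np.array([[1.0 if lab == lev else 0.0 for lev in levels[1:]]
                     for lab in labels])

# Hypothetical state-by-year cells: required grades CS_st change at different dates.
states = ["FL", "GA", "AL"]
years = [1933, 1934, 1935, 1936]
required = {  # assumed values, for illustration only
    "FL": [6, 6, 7, 7],
    "GA": [6, 6, 6, 6],
    "AL": [6, 7, 7, 8],
}
state_effect = {"FL": 0.3, "GA": 0.0, "AL": -0.2}
year_effect = {1933: 0.0, 1934: 0.1, 1935: 0.2, 1936: 0.4}
true_beta = 0.25

s_col, t_col, cs, y = [], [], [], []
for s in states:
    for k, t in enumerate(years):
        s_col.append(s)
        t_col.append(t)
        cs.append(float(required[s][k]))
        # Outcome generated from the additive model, without noise.
        y.append(7.0 + state_effect[s] + year_effect[t] + true_beta * required[s][k])

# Columns: constant, state dummies, year dummies, and the treatment CS_st.
X = np.column_stack([np.ones(len(y)), dummies(s_col), dummies(t_col), np.array(cs)])
coef = np.linalg.lstsq(X, np.array(y), rcond=None)[0]
beta_hat = coef[-1]
print(f"beta_hat = {beta_hat:.3f}")
```

The treatment coefficient is identified here only because the required grades change at different times in different states; if every state changed its law in the same year, the treatment would be absorbed by the year dummies.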


Covariates in the DD model

Start with the regression

$$Y_{ist} = \gamma_s + \lambda_t + \beta CS_{st} + e_{ist}.$$

We could run this on the micro data or aggregate to the state level

$$Y_{st} = \gamma_s + \lambda_t + \beta CS_{st} + e_{st}.$$

Both regressions (the 2nd weighted by the number of observations in the cell) give the same estimates since regressors just vary at the group level.

For the same reason, only covariates at the state/year level matter for identification

$$Y_{ist} = \gamma_s + \lambda_t + \beta CS_{st} + X_{st}\delta + e_{ist}$$

We might want to include individual level covariates

$$Y_{ist} = \gamma_s + \lambda_t + \beta CS_{st} + X_{ist}\delta + e_{ist}.$$

The within state variation doesn’t matter for identification but may reduce standard errors.
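The equivalence between the micro-data regression and the cell-size-weighted regression on state-year means can be checked numerically. The sketch below assumes a made-up design with three states, three years, a binary treatment, and unequal cell sizes; the weighted regression is implemented by scaling each cell's row by the square root of its size.

```python
import numpy as np

rng = np.random.default_rng(0)

states, years = range(3), range(3)
# Hypothetical design: state 1 is treated from year 1 on, state 2 from year 2 on.
treated = {(1, 1), (1, 2), (2, 2)}

def regressors(s, t):
    """Constant, state dummies, year dummies, and the treatment dummy."""
    return ([1.0]
            + [float(s == k) for k in (1, 2)]
            + [float(t == k) for k in (1, 2)]
            + [float((s, t) in treated)])

X_micro, y_micro, X_cell, y_cell, n_cell = [], [], [], [], []
for s in states:
    for t in years:
        n = int(rng.integers(3, 10))  # unequal cell sizes
        x = regressors(s, t)
        y = 7.0 + 0.4 * s + 0.2 * t + 1.0 * x[-1] + rng.normal(size=n)
        X_micro += [x] * n            # every individual shares the cell's regressors
        y_micro += y.tolist()
        X_cell.append(x)
        y_cell.append(y.mean())
        n_cell.append(n)

def ols(X, y):
    return np.linalg.lstsq(np.asarray(X), np.asarray(y), rcond=None)[0]

b_micro = ols(X_micro, y_micro)

# Weighted least squares on cell means: scale each row by sqrt(cell size).
w = np.sqrt(np.array(n_cell, dtype=float))
b_cell = ols(np.asarray(X_cell) * w[:, None], np.asarray(y_cell) * w)

print(np.allclose(b_micro, b_cell))
```

The two coefficient vectors coincide up to floating-point error because both regressions solve the same normal equations. The standard errors generally differ, since the aggregated regression has far fewer observations.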

Assessing DD identification

The key identifying assumption in DD models is that the treatment states have similar trends to the control states in the absence of treatment.

With only one treatment and control group, graph your results, and look at trends in the periods before the treatment.

With many treatment and control groups:

and a binary treatment, estimate treatment impacts at different dates

$$Y_{ist} = \gamma_s + \lambda_t + \sum_{j=-m}^{q} \beta_j D_{s,t+j} + e_{ist}$$

where $D_{st}$ is now an indicator for whether the treatment got switched on in year $t$. This estimates $q$ leads and $m$ lags of the treatment. The leads should all be zero.

include state-specific trends.
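The lead and lag regression above can be sketched on simulated data. Everything in this example is an assumption made for illustration: four states, ten years, two adopting states with different switch dates, an effect that lasts three years, and no noise. The terms with $j > 0$ are the leads, since they refer to a switch that happens after year $t$.

```python
import numpy as np

m, q = 2, 2  # number of lags and leads (assumed for this example)
years = list(range(10))
# Hypothetical adoption dates; states C and D never adopt.
switch_year = {"A": 3, "B": 6, "C": None, "D": None}
# Assumed effects 0, 1 and 2 years after adoption, zero otherwise.
lag_effect = {0: 1.0, 1: 0.6, 2: 0.3}

def switched_on(s, t):
    """D_st: 1 if state s switches the treatment on in year t."""
    return 1.0 if switch_year[s] == t else 0.0

state_names = sorted(switch_year)
X, y = [], []
for s in state_names:
    for t in years:
        state_d = [float(s == k) for k in state_names[1:]]
        year_d = [float(t == k) for k in years[1:]]
        # One column per j = -m, ..., q holding D_{s,t+j}; j > 0 are leads.
        event_d = [switched_on(s, t + j) for j in range(-m, q + 1)]
        X.append([1.0] + state_d + year_d + event_d)
        since = None if switch_year[s] is None else t - switch_year[s]
        y.append(5.0 + 0.5 * state_names.index(s) + 0.1 * t
                 + lag_effect.get(since, 0.0))

coef = np.linalg.lstsq(np.array(X), np.array(y), rcond=None)[0]
beta = dict(zip(range(-m, q + 1), coef[-(m + q + 1):]))
for j, b in beta.items():
    print(f"j = {j:+d}: beta_j = {b:.3f}")
```

In this noise-free setup the lead coefficients come out as zero and the lag coefficients reproduce the assumed effects. With real data, lead coefficients that differ clearly from zero would suggest that treatment and control states were already on different paths before the policy change.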


Graph for two states
Florida raises compulsory schooling from 5 to 7 grades in 1935

[Figure: fraction completing grade 8 or more (vertical axis, 0.6 to 1) against the year the student turns 14 (horizontal axis, 1920 to 1960), with separate lines for Florida and Georgia.]


Some DD estimates
Years of schooling on child labor laws

Regressor   (1)           (2)           (3)            (4)           (5)
CL7         0.51 (0.43)   0.80 (0.36)   0.15 (0.36)    0.09 (0.10)   0.04 (0.04)
CL8         1.37 (0.28)   1.11 (0.22)   0.77 (0.33)    0.20 (0.10)   0.05 (0.05)
CL9         1.56 (0.31)   2.39 (0.21)   −0.28 (0.37)   0.37 (0.12)   0.06 (0.04)

State effects X X X
Year effects X X X
State trends X


Some more DD estimates
Years of schooling on child labor laws

Dependent variable: completes 8+ years in columns (1) and (2), completes 10+ years in columns (3) and (4)

Regressor   (1)           (2)             (3)           (4)
CL7         0.01 (0.02)   0.009 (0.004)   0.01 (0.02)   0.001 (0.008)
CL8         0.03 (0.02)   0.007 (0.005)   0.03 (0.01)   0.002 (0.009)
CL9         0.05 (0.02)   0.010 (0.004)   0.05 (0.02)   0.005 (0.009)

State effects X X X X
Year effects X X X X
State trends X X

