Selectivity Econometria
-
Upload
roger-asencios-nunez -
Category
Documents
-
view
5 -
download
1
description
Transcript of Selectivity Econometria
-
Sample Selection
Walter Sosa-Escudero
Econ 507. Econometric Analysis. Spring 2012
April 23, 2012
Walter Sosa-Escudero Sample Selection
-
Preliminaries 1) Truncated normal distribution
X f(x), X|X < a: X truncated in a. Then
f(x|X < a) = f(x)Pr(X < a)
If X N(, 2), and recalling that
Pr(X < a) = Pr(
1/(X ) < 1/(a ))
= Pr(z < ),
(a )/, z (x )/.
f(x|X < a) =1
12pi
exp[12(x
)2]()
=(1/)(z)
()
Walter Sosa-Escudero Sample Selection
-
Result (no proof): if X N(, 2), then:
E(X|X < a) = ()()
Truncated to the right: expected value moves to the left(general).
How much? Depends on and 2
(z) (z)/(z) is known as the inverse Mills ratio
Walter Sosa-Escudero Sample Selection
-
Preliminaries 2) Latent variables and probits
Recall the probit model:
Pr(y = 1|x) = (x)
can be consistently estimated by MLE based on a randomsample (yi, xi), i = 1, . . . , n.
Consider the regression model
y = x + u, u N(0, 2)y not directly observable (a latent variable) but, instead, weobserve y = 1[y > 0].
Which parameters of the regression models can be estimatedconsistently with this information?
Walter Sosa-Escudero Sample Selection
-
Note that:
P (y = 1|x) = P (y > 0|x)= P (u > x|x)= P (u < x|x)= P (u/ < x/ | x)= (x)
This is a probit model with /.
Based on (yi, xi), we can estimate consistenly by MLE, evenwhen we cannot estimate and 2 separately.
Walter Sosa-Escudero Sample Selection
-
2 and are not identified with the sample (yi, xi). Forexample = 10 and 2 = 2 are observationally equivalent tothe case = 5 y 2 = 1. / is identified.
Walter Sosa-Escudero Sample Selection
-
Sample selectivity
Consider the regression model
yi = xi + ui
si is a selectivity variable: si = 1 observed, 0 if not.
We can think that there is a super sample of size N ofyi , xi, si and that we observe the sub sample y
i , xi, only
when si = 1.
Example: female labor productivity
Walter Sosa-Escudero Sample Selection
-
OLS under selectivity
With a random sample (yi , xi), consistency relies on:
E(ui|xi) = 0
which implies E(yi|xi) = xi.
Now we do not have a random sample, but one conditioned onsi = 1. Taking conditional expectations:
E(yi|xi, si = 1) = xi + E(ui|xi, si = 1)OLS based on the selected sample will be inconsistent, unlessE(ui|xi, si = 1) = 0.
Walter Sosa-Escudero Sample Selection
-
Not every selectivity mechanism makes OLS inconsistent.
If u is independent of x, OLS still consistent (why?).
If s = g(x), OLS still consistent.
Examples: wages and education. Males with odd SSN. Maleswith formal education?
Walter Sosa-Escudero Sample Selection
-
An estimable model under selectivity
Consider the following equations:{y1i = x
1i 1 + u1i (regression)
y2i = x2i 2 + u2i (selectivity)
Let y2i = 1[y2i > 0].
Example: y1i = wage, regression equantion determines wages based onpersons characteristics (x1i). y2i = net utility of work, x2i are observeddeterminantes of utility.
Walter Sosa-Escudero Sample Selection
-
Assumptions:
1 (y2i, x2i) observed for everyone.
2 (y1i, x1i) observed only if y2i = 1 (selected sample).
3 (u1i, u2i) are independent of x2i and have zero mean.
4 u2i N(0, 22).5 E(u1i|u2i) = u2. No-observables can be related.
Walter Sosa-Escudero Sample Selection
-
Selectivity bias
Here si y2i.
E(y1i|x1i, y2i = 1) = x1i1 + E(u1i | x1i, y2i = 1)= x1i1 + E
[E(u1i|u2i) | x1i, y2i = 1
]= x1i1 + E
[u2i | x1i, y2i = 1
]= x1i1 + E
[u2i | x1i, y2i > 0
]= x1i1 + E
[u2i | x1i , u2i < x2i2
]= x1i1 2 (x2i2/2)= x1i1 2 zi 6= x1i1
with zi (x2i2/2). OLS with the selected sample isinconsistent.
Walter Sosa-Escudero Sample Selection
-
E(y1i|x1i, y2i = 1) = x1i1 2 zi 6= x1i1
Inconsistency: omission of zi. Heckman (1979): selectivitybias as misspecification.
Inconsistency due to the correlation between u1i y u2i, , thatis 6= 0.
Walter Sosa-Escudero Sample Selection
-
Heckmans two-step estimator
Define u1i y1i x1i1 zi, 2. Solving:
y1i = x1i +
z + u1i
where, by construction E(u1i|x1i, y2i = 1) = 0.x1i, zi observable when y2i = 1: OLS of y1i on x1i and ziusing the selected sample gives consistent esimates of 1 and.Problem: zi (x2i2/2) is NOT observable, it depends on2 and 2.
Walter Sosa-Escudero Sample Selection
-
Note that u2i N(0, 22), hence:
P (y2i = 1) = P (y2i > 0) = P (u2i/2 < x
2i2/2) = (x
2i)
P (y2i = 1) is a probit model with unknown coefficients .
x2i and y2i are observed for everybody: can be estimatedusing probit.
Important: we cannot identify 2i and 2 separately but = 2i/2i.
Walter Sosa-Escudero Sample Selection
-
This suggests the following two-stage method:
Stage 1: Obtain estimates based on the probit modelP (y2i = 1) = (x
2i) using the full sample. Predict zi using
zi = (x2i).
Stage 2: Regress y2i on x1i and zi using the selected sample.This produces consistent estimates of 1 and
.
Walter Sosa-Escudero Sample Selection
-
Practical remarks
The method is consistent and asymptotically normal (methodof moments estimator). Standard inference works fine.
Careful with asympototic variance. The second stage isheteroskedastic. Requires correction. See Greene (Ch. 20).
A test of H0 : = 0 may be used to check sample selectivity.Under H0, the regression model with the selected sample ishomoskedastic, test can be based in a model without takingcare of heteroskedasticity.
Classic issue: low power when x1 is very similar to x2.
MLE? Requires bivariate normality. Complicated likelihood,rarely used.
Walter Sosa-Escudero Sample Selection