Selectivity Econometria

17
Sample Selection Walter Sosa-Escudero Econ 507. Econometric Analysis. Spring 2012 April 23, 2012 Walter Sosa-Escudero Sample Selection

description

Econometria

Transcript of Selectivity Econometria

  • Sample Selection

    Walter Sosa-Escudero

    Econ 507. Econometric Analysis. Spring 2012

    April 23, 2012

    Walter Sosa-Escudero Sample Selection

  • Preliminaries 1) Truncated normal distribution

    X f(x), X|X < a: X truncated in a. Then

    f(x|X < a) = f(x)Pr(X < a)

    If X N(, 2), and recalling that

    Pr(X < a) = Pr(

    1/(X ) < 1/(a ))

    = Pr(z < ),

    (a )/, z (x )/.

    f(x|X < a) =1

    12pi

    exp[12(x

    )2]()

    =(1/)(z)

    ()

    Walter Sosa-Escudero Sample Selection

  • Result (no proof): if X N(, 2), then:

    E(X|X < a) = ()()

    Truncated to the right: expected value moves to the left(general).

    How much? Depends on and 2

    (z) (z)/(z) is known as the inverse Mills ratio

    Walter Sosa-Escudero Sample Selection

  • Preliminaries 2) Latent variables and probits

    Recall the probit model:

    Pr(y = 1|x) = (x)

    can be consistently estimated by MLE based on a randomsample (yi, xi), i = 1, . . . , n.

    Consider the regression model

    y = x + u, u N(0, 2)y not directly observable (a latent variable) but, instead, weobserve y = 1[y > 0].

    Which parameters of the regression models can be estimatedconsistently with this information?

    Walter Sosa-Escudero Sample Selection

  • Note that:

    P (y = 1|x) = P (y > 0|x)= P (u > x|x)= P (u < x|x)= P (u/ < x/ | x)= (x)

    This is a probit model with /.

    Based on (yi, xi), we can estimate consistenly by MLE, evenwhen we cannot estimate and 2 separately.

    Walter Sosa-Escudero Sample Selection

  • 2 and are not identified with the sample (yi, xi). Forexample = 10 and 2 = 2 are observationally equivalent tothe case = 5 y 2 = 1. / is identified.

    Walter Sosa-Escudero Sample Selection

  • Sample selectivity

    Consider the regression model

    yi = xi + ui

    si is a selectivity variable: si = 1 observed, 0 if not.

    We can think that there is a super sample of size N ofyi , xi, si and that we observe the sub sample y

    i , xi, only

    when si = 1.

    Example: female labor productivity

    Walter Sosa-Escudero Sample Selection

  • OLS under selectivity

    With a random sample (yi , xi), consistency relies on:

    E(ui|xi) = 0

    which implies E(yi|xi) = xi.

    Now we do not have a random sample, but one conditioned onsi = 1. Taking conditional expectations:

    E(yi|xi, si = 1) = xi + E(ui|xi, si = 1)OLS based on the selected sample will be inconsistent, unlessE(ui|xi, si = 1) = 0.

    Walter Sosa-Escudero Sample Selection

  • Not every selectivity mechanism makes OLS inconsistent.

    If u is independent of x, OLS still consistent (why?).

    If s = g(x), OLS still consistent.

    Examples: wages and education. Males with odd SSN. Maleswith formal education?

    Walter Sosa-Escudero Sample Selection

  • An estimable model under selectivity

    Consider the following equations:{y1i = x

    1i 1 + u1i (regression)

    y2i = x2i 2 + u2i (selectivity)

    Let y2i = 1[y2i > 0].

    Example: y1i = wage, regression equantion determines wages based onpersons characteristics (x1i). y2i = net utility of work, x2i are observeddeterminantes of utility.

    Walter Sosa-Escudero Sample Selection

  • Assumptions:

    1 (y2i, x2i) observed for everyone.

    2 (y1i, x1i) observed only if y2i = 1 (selected sample).

    3 (u1i, u2i) are independent of x2i and have zero mean.

    4 u2i N(0, 22).5 E(u1i|u2i) = u2. No-observables can be related.

    Walter Sosa-Escudero Sample Selection

  • Selectivity bias

    Here si y2i.

    E(y1i|x1i, y2i = 1) = x1i1 + E(u1i | x1i, y2i = 1)= x1i1 + E

    [E(u1i|u2i) | x1i, y2i = 1

    ]= x1i1 + E

    [u2i | x1i, y2i = 1

    ]= x1i1 + E

    [u2i | x1i, y2i > 0

    ]= x1i1 + E

    [u2i | x1i , u2i < x2i2

    ]= x1i1 2 (x2i2/2)= x1i1 2 zi 6= x1i1

    with zi (x2i2/2). OLS with the selected sample isinconsistent.

    Walter Sosa-Escudero Sample Selection

  • E(y1i|x1i, y2i = 1) = x1i1 2 zi 6= x1i1

    Inconsistency: omission of zi. Heckman (1979): selectivitybias as misspecification.

    Inconsistency due to the correlation between u1i y u2i, , thatis 6= 0.

    Walter Sosa-Escudero Sample Selection

  • Heckmans two-step estimator

    Define u1i y1i x1i1 zi, 2. Solving:

    y1i = x1i +

    z + u1i

    where, by construction E(u1i|x1i, y2i = 1) = 0.x1i, zi observable when y2i = 1: OLS of y1i on x1i and ziusing the selected sample gives consistent esimates of 1 and.Problem: zi (x2i2/2) is NOT observable, it depends on2 and 2.

    Walter Sosa-Escudero Sample Selection

  • Note that u2i N(0, 22), hence:

    P (y2i = 1) = P (y2i > 0) = P (u2i/2 < x

    2i2/2) = (x

    2i)

    P (y2i = 1) is a probit model with unknown coefficients .

    x2i and y2i are observed for everybody: can be estimatedusing probit.

    Important: we cannot identify 2i and 2 separately but = 2i/2i.

    Walter Sosa-Escudero Sample Selection

  • This suggests the following two-stage method:

    Stage 1: Obtain estimates based on the probit modelP (y2i = 1) = (x

    2i) using the full sample. Predict zi using

    zi = (x2i).

    Stage 2: Regress y2i on x1i and zi using the selected sample.This produces consistent estimates of 1 and

    .

    Walter Sosa-Escudero Sample Selection

  • Practical remarks

    The method is consistent and asymptotically normal (methodof moments estimator). Standard inference works fine.

    Careful with asympototic variance. The second stage isheteroskedastic. Requires correction. See Greene (Ch. 20).

    A test of H0 : = 0 may be used to check sample selectivity.Under H0, the regression model with the selected sample ishomoskedastic, test can be based in a model without takingcare of heteroskedasticity.

    Classic issue: low power when x1 is very similar to x2.

    MLE? Requires bivariate normality. Complicated likelihood,rarely used.

    Walter Sosa-Escudero Sample Selection