chapter3_econometrics

Multiple Regression Analysis

Võ Đức Hoàng Vũ

University of Economics HCMC

June 2015

Võ Đức Hoàng Vũ (UEH) Applied Econometrics June 2015 1 / 17

Estimation

The model is : y = β0 + β1x1 + β2x2 + . . . + βkxk

β0 is still the intercept

β1 to βk all called slope parameters

u is still the error term (or disturbance)

Still need to make a zero conditional mean assumption, so nowassume that E (u|x1, x2, . . . , xk) = 0

Still minimizing the sum of squared residuals, so have k + 1 firstorder conditions


Interpreting Multiple Regression

β0 + β1x1 + . . . + βkxk , so

∆y = ∆β1x1 + ∆β2x2 + . . . + ∆βkxk ,

so holding x2, . . . , xk fixed implies that ∆y = ∆β1x1, that iseach β has a ceteris paribus interpretation.


"Partialling Out" Interpretation

Consider the case where k = 2, i.e.

y = β0 + β1x1 + β2x2, then

β1 = (∑

ri1yi)/∑

r 2i1, where ri1 are the residuals from the

estimated regression x1 = γ0 + γ2x2

Previous equation implies that regressing y on x1 and x2 givessame effect of x1 as regressing y on residuals from a regressionof x1 on x2.

This means only the part of xi1 that is uncorrelated with xi2 arebeing related to yi so we’re estimating the effect of x1 on y afterx2 has been "partialled out".


Simple vs Multiple Regression Estimate

Compare the simple regression y = β0 + β1x1 with the multipleregression haty = β0 + β1x1 + β2x2

Generally, β1 6= β1 unless: β2 = 0 (i.e. no partial effect of x2) ORx1 and x2 are uncorrelated in the sample


Goodness-of-fit

We can think of each observation as being made up of anexplained part, and an unexplained part - yi = yi + ui - We thendefine the following:∑

(yi − y)2 is the total sum of squares (SST)∑(yi − y)2 is the explained sum of squares (SSE)∑u2i is the residual sum of squares (SSR)

Then SST = SSE + SSR

How do we think about how well our sample regression line fitsour sample data?

Can compute the fraction of the total sum of squares (SST) thatis explained by the model, call this the R-squared of regression

R2 = SSE/SST = 1− SSR/SST


Goodness-of-fit (cont)

We can also think of R2 as being equal to the squaredcorrelation coefficient between the actual yi and the value yi

R2 =(∑

(yi − y)(yi − ¯y))2

(∑

(yi − y)2)(∑

(yi − ¯y)2)

R2 can never decrease when another independent variable isadded to regression, and usually will increase

Because R2 will usually increase with the number of independentvariables, it is not a good way to compare models.


Assumption for Unbiasedness

Population model is linear in parameters:y = β0 + β1x1 + β2x2 + . . . + βkxk + u

We can use a random sample of sizen, {(xi1, xi2, . . . , xik , yi) : i = 1, 2, . . . , n}, from the populationmodel, so that the sample model isyi = β0 + β1xi1 + β2xi2 + . . . + βkxik + ui

E (u|x1, x2, . . . , xk) = 0, implying that all of the explanatoryvariables are exogenous.

None of the x’s is constant, and there are no exact linearrelationships among them.


Too Many or Too Few Variables

What happens if we include variables in our specification thatdon’t belong?

There is no effect on our parameter estimate, and OLS remainsunbiased

What if we exclude a variable from our specification that doesbelong?

OLS will usually be biased.


Omitted Variable Bias

Suppose the true model is given asy = β0 + β1x1 + β2x2 + u, but we estimate y = +β1x1 + u, then

β1 =

∑(xi1 − x1)yi∑(xi1 − x1)2

Recall the true model, so that yi = β0 + β1xi1 + β2xi2 + ui , so thenumerator becomes∑

(xi1 − x1)(β0 + β1xi1 + β2xi2 + ui) =β1

∑(xi1 − x1)2 + β2

∑(xi1 − x1)xi2 +

∑(xi1 − x)ui


Omitted Variable Bias (cont)

β = β1 + β2

∑(xi1 − x1)xi2∑(xi1 − x1)2

+

∑(xi1 − x1)ui∑(xi1 − x1)2

since E (ui) = 0, taking expectations we have

E () = β1 + β2

∑(xi1 − x1)xi2∑(xi1 − x1)2

Consider the regression of x2 on x1

x2 = δ0 + δ1x1 then δ1 =

∑(xi1 − x1)xi2∑(xi1 − x1)2

so E (β1) = β1 + β2δ1


Summary of Direction of Bias

Corr(x1, x2) > 0 Corr(x1, x2) < 0β2 > 0 Positive bias Negative biasβ2 < 0 Negative bias Positive bias


Omitted Variable Bias Summary

Two cases where bias is equal to zero

β2 = 0 that is x2 doesn’t really bebong in modelx1 and x2 are uncorrelated in the sample

If correlation between x2, x1 and x2, y is the same direction, biaswill be positive

If correlation between x2, x1 and x2, y is the opposite direction,bias will be negative

Technically, can only sign the bias for the more general case if alof the included x’s are uncorrelated.

Typpically, then, we work through the bias assuming the x’s areuncorrelated, as a useful guide even if this assumption is notstrictly true.


Variance of the OLS Estimators

Now we know that the sampling distribution of our estimate iscentered around the true parameter

Want to think about how spread out this distribution is

Much easier to think about this variance under an additionalassumption, so

Assume Var(u|x1, x2, . . . , xk) = σ2 (Homoskedasticity)

Let xxx stand for (x1, x2, . . . , xk)

Assuming that Var(u|x) = σ2 also implies that Var(y |x) = σ2

The 4 assumptions for unbiasedness, plus this homoskedasticityassumption are known as the Gauss-Markov assumptions


Variance of OLS (cont.)

Given the Gauss-Markov assumptions

Var(βj) =σ2

SSTj(1− R2j ), where

SSTj =∑

(xij − xj)2 and R2

j is the R2 fromregressioing xj on all other x’s


Components of OLS Variances

The error variance: a larger σ2 implies a larger variance for theOLS estimators

The total sample variation: a larger SSTj implies a smallervariance for the estimators

Linear relationships among the independent variables: a largerR2j implies a larger variance for the estimators


Misspecified Models

Consider again the misspecified model

y = β0 + β1x1, so that Var(β1) =σ2

SST1

Thus, Var(β1) < Var(β1) unless x1 and x2 areuncorrelated, then they’re the same


Misspecified Models (cont.)

While the variance of the estimator is smaller for the misspecifiedmodel, unless β2 = 0 the misspecified model is biased

As the sample size grows, the variance of each estimator shrinksto zero, making the variance difference less important


Estimating the Error Variance

We don’t know what the error variance, σ2, is, because we don’tobserve the errors, ui

What we observe are the residuals, ui

We can use the residuals to form an estimate of the errorvariance

σ2 = (∑

u2i )/(n − k − 1) ≡ SSR/df

thus, se(βj) = σ/[SSTj(1− R2j )]1/2

df = n − (k + 1)


The Gauss-Markov Theorem

Given our 5 Gauss-Markov Assumption it can be shown thatOLS is "BLUE"

Best

Linear

Unbiased

Estimator

Thus, if the assumptions hold, use OLS


chapter3_econometrics

Documents

Transcript of chapter3_econometrics