• date post: 07-Feb-2018

### Transcript of SIMPLE REGRESSION MODEL - Yıldız Teknik tastan/teaching/Handout 02 Simple Regression Model.pdf

SIMPLE REGRESSION MODEL

Huseyin Tastan¹

¹Yıldız Technical University, Department of Economics

These presentation notes are based on Introductory Econometrics: A Modern Approach (2nd ed.)

by J. Wooldridge.

14 October 2012


Simple Regression Model

Simple (Bivariate) Regression Model

y = β₀ + β₁x + u

- y: dependent variable

- x: explanatory variable

- Also called the bivariate linear regression model or two-variable linear regression model

- Purpose: to explain the dependent variable y by the independent variable x


Simple Regression Model

Terminology

| y | x |
| --- | --- |
| Dependent variable | Independent variable |
| Explained variable | Explanatory variable |
| Response variable | Control variable |
| Predicted variable | Predictor variable |
| Regressand | Regressor |


y = β₀ + β₁x + u

u: random error term (disturbance term). Represents factors other than x that affect y. These factors are treated as unobserved in the simple regression model.

Slope parameter β₁

- If the other factors in u are held fixed, i.e. Δu = 0, then the linear effect of x on y is:

Δy = β₁Δx

- β₁: slope parameter.

Intercept term (also called constant term): β₀ is the value of y when x = 0.


Simple Regression Model: Examples

Agricultural production and fertilizer usage

yield = β₀ + β₁fertilizer + u

yield: quantity of wheat production

Slope parameter β₁

Δyield = β₁Δfertilizer

Ceteris paribus, a one-unit change in fertilizer leads to a β₁-unit change in wheat yield.

Random error term u: contains factors such as land quality, rainfall, etc., which are assumed to be unobserved. Ceteris paribus (holding all other factors fixed) means Δu = 0.


Simple Regression Model: Examples

A Simple Wage Equation

wage = β₀ + β₁educ + u

wage: hourly wage (in dollars); educ: education level (in years)

Slope parameter β₁

Δwage = β₁Δeduc

β₁ measures the change in hourly wage given another year of education, holding all other factors fixed (ceteris paribus).

Random error term u: other factors include labor force experience, innate ability, tenure with current employer, gender, quality of education, marital status, number of children, etc. Any factor that may potentially affect worker productivity.


Linearity

- The linearity of the simple regression model means: a one-unit change in x has the same effect on y regardless of the initial value of x.

- This is unrealistic for many economic applications.

- For example, if there are increasing or decreasing returns to scale, then this model is inappropriate.

- In the wage equation, the next year of education may have a larger effect on wages than the previous year did.

- An extra year of experience may also have similar increasing returns.

- We will see how to allow for such possibilities in the following classes.


Assumptions for Ceteris Paribus conclusions

1. Expected value of the error term u is zero

- If the model includes a constant term (β₀), then we can assume:

E(u) = 0

- This assumption is about the distribution of u (the unobservables). Some u terms will be positive and some will be negative, but on average u is zero.

- This assumption can always be guaranteed to hold by redefining β₀.


Assumptions for Ceteris Paribus conclusions

2. Conditional mean of u is zero

- How can we be sure that the ceteris paribus notion is valid (which means that Δu = 0)?

- For this to hold, x and u must be uncorrelated. But since the correlation coefficient measures only the linear association between two variables, zero correlation alone is not enough.

- u must also be uncorrelated with functions of x (e.g. x², √x, etc.).

- The Zero Conditional Mean assumption ensures this:

E(u|x) = E(u) = 0

- This equation says that the average value of the unobservables is the same across all slices of the population determined by the value of x.
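The "same average across slices" idea can be checked in a small simulation (a sketch, not part of the handout): draw u independently of x, so that E(u|x) = 0 holds by construction, and then compare the sample mean of u within each slice of the population defined by a value of x.

```python
# Minimal simulation sketch: verify that when u is drawn independently
# of x, the average of u is close to zero within every slice of x.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
x = rng.integers(8, 17, size=n)      # e.g. years of education, 8..16
u = rng.normal(0.0, 1.0, size=n)     # independent of x, so E(u|x) = 0

for value in (8, 12, 16):
    slice_mean = u[x == value].mean()   # sample analog of E(u | x = value)
    print(f"mean of u given x = {value}: {slice_mean:+.3f}")
# Each printed mean should be close to 0.
```

If instead u were generated to depend on x (say, u with mean increasing in x), the slice means would drift away from zero, which is exactly the violation discussed in the ability example below.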


Zero Conditional Mean Assumption

Conditional mean of u given x is zero

E(u|x) = E(u) = 0

- Both u and x are random variables. Thus, we can define the conditional distribution of u given a value of x.

- A given value of x represents a slice of the population. The conditional mean of u for this specific slice of the population can be defined.

- This assumption means that the average value of u does not depend on x.

- For a given value of x, the unobservable factors have zero mean. Also, the unconditional mean of the unobservables is zero.


Zero Conditional Mean Assumption: Example

Wage equation:

wage = β₀ + β₁educ + u

- Suppose that u represents the innate ability of employees, denoted abil.

- The E(u|x) = 0 assumption implies that average innate ability is the same across all levels of education in the population:

E(abil|educ = 8) = E(abil|educ = 12) = · · · = 0

- If we believe that average ability increases with years of education, this assumption will not hold.

- Since we cannot observe ability, we cannot determine whether average ability is the same for all education levels.


Zero Conditional Mean Assumption: Example

Agricultural production-fertilizer model:

- Recall the fertilizer experiment: the land is divided into equal plots and different amounts of fertilizer are applied to each plot.

- If the amount of fertilizer is assigned to land plots independently of the quality of the land, then the zero conditional mean assumption will hold.

- However, if larger amounts of fertilizer are assigned to land plots with higher quality, then the expected value of the error term will increase with the amount of fertilizer.

- In this case the zero conditional mean assumption is false.


Population Regression Function (PRF)

- Expected value of y given x:

E(y|x) = β₀ + β₁x + E(u|x) = β₀ + β₁x

(using E(u|x) = 0)

- This is called the PRF. The conditional expectation of the dependent variable is a linear function of x.

- Linearity of the PRF: for a one-unit change in x, the conditional expectation of y changes by β₁.

- The center of the conditional distribution of y for a given value of x is E(y|x).

Population Regression Function (PRF)


Systematic and Unsystematic Parts of Dependent Variable

- In the simple regression model

y = β₀ + β₁x + u

under E(u|x) = 0, the dependent variable y can be decomposed into two parts:

- Systematic part: β₀ + β₁x. This is the part of y explained by x.

- Unsystematic part: u. This is the part of y that cannot be explained by x.
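A quick numeric illustration of the decomposition (hypothetical parameter values, assuming β₀ and β₁ are known): generate y as the sum of the two parts and verify that it splits exactly into systematic part plus error.

```python
import numpy as np

rng = np.random.default_rng(1)
beta0, beta1 = 1.0, 0.5              # hypothetical population parameters
x = rng.uniform(0.0, 10.0, size=1000)
u = rng.normal(0.0, 1.0, size=1000)  # unsystematic part

y = beta0 + beta1 * x + u            # the simple regression model
systematic = beta0 + beta1 * x       # part of y explained by x

# y decomposes exactly into the systematic part plus u
print(np.allclose(y, systematic + u))  # True
```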


Estimation of Unknown Parameters

- How can we estimate the unknown population parameters (β₀, β₁) given a cross-sectional data set?

- Suppose that we have a random sample of n observations:

{(yᵢ, xᵢ) : i = 1, 2, . . . , n}

- The regression model can be written for each observation as follows:

yᵢ = β₀ + β₁xᵢ + uᵢ,  i = 1, 2, . . . , n

- Now we have a system of n equations with two unknowns.


Estimation of unknown population parameters (β₀, β₁)

yᵢ = β₀ + β₁xᵢ + uᵢ,  i = 1, 2, . . . , n

n equations with 2 unknowns:

y₁ = β₀ + β₁x₁ + u₁

y₂ = β₀ + β₁x₂ + u₂

y₃ = β₀ + β₁x₃ + u₃

⋮

yₙ = β₀ + β₁xₙ + uₙ

Random Sample Example: Savings and income for 15 families (figure)


Estimation of unknown population parameters (β₀, β₁): Method of Moments

We just made two assumptions for ceteris paribus conclusions to be valid:

E(u) = 0

Cov(x, u) = E(xu) = 0

NOTE: If E(u|x) = 0 then by definition Cov(x, u) = 0, but the reverse may not be true. Since E(u) = 0, by definition Cov(x, u) = E(xu). Using these assumptions and u = y − β₀ − β₁x, the population moment conditions can be written as:

E(y − β₀ − β₁x) = 0

E[x(y − β₀ − β₁x)] = 0

Now we have 2 equations with 2 unknowns.


Method of Moments: Sample Moment Conditions

Population moment conditions:

E(y − β₀ − β₁x) = 0

E[x(y − β₀ − β₁x)] = 0

Replacing these with their sample analogs, we obtain:

(1/n) ∑ᵢ₌₁ⁿ (yᵢ − β̂₀ − β̂₁xᵢ) = 0

(1/n) ∑ᵢ₌₁ⁿ xᵢ(yᵢ − β̂₀ − β̂₁xᵢ) = 0

This system can easily be solved for β̂₀ and β̂₁ using sample data. Note that β̂₀ and β̂₁ have hats on them: they are estimators, not fixed quantities. They change as the data change.
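The two sample moment conditions are linear in β̂₀ and β̂₁, so they can be rearranged into a 2×2 linear system and solved directly. A sketch with made-up numbers (not the savings-income sample from the slides):

```python
import numpy as np

# Hypothetical sample, for illustration only
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
n = len(x)

# Sample moment conditions (after multiplying through by n):
#   sum(y_i - b0 - b1*x_i)        = 0
#   sum(x_i*(y_i - b0 - b1*x_i))  = 0
# Rearranged as a 2x2 linear system A @ [b0, b1] = c:
A = np.array([[n,       x.sum()],
              [x.sum(), (x**2).sum()]])
c = np.array([y.sum(), (x * y).sum()])

b0_hat, b1_hat = np.linalg.solve(A, c)
print(round(b0_hat, 2), round(b1_hat, 2))  # prints: 0.14 1.96
```

For degree-1 fits these moment conditions coincide with least squares, so the same estimates come out of `np.polyfit(x, y, 1)`.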


Method of Moments: Sample Moment Conditions

Using the properties of the summation operator, from the first sample moment condition:

ȳ = β̂₀ + β̂₁x̄

where ȳ and x̄ are the sample means. Using this we can write

β̂₀ = ȳ − β̂₁x̄

Substituting this into the second sample moment condition, we can solve for β̂₁.


Method of Moments

Substituting β̂₀ into the second moment condition, after multiplying it through by n:

∑ᵢ₌₁ⁿ xᵢ(yᵢ − (ȳ − β̂₁x̄) − β̂₁xᵢ) = 0

This expression can be written as

∑ᵢ₌₁ⁿ xᵢ(yᵢ − ȳ) = β̂₁ ∑ᵢ₌₁ⁿ xᵢ(xᵢ − x̄)


Slope Estimator

β̂₁ = ∑ᵢ₌₁ⁿ (xᵢ − x̄)(yᵢ − ȳ) / ∑ᵢ₌₁ⁿ (xᵢ − x̄)²

The following properties have been used in deriving the expression above:

∑ᵢ₌₁ⁿ xᵢ(xᵢ − x̄) = ∑ᵢ₌₁ⁿ (xᵢ − x̄)²

∑ᵢ₌₁ⁿ xᵢ(yᵢ − ȳ) = ∑ᵢ₌₁ⁿ (xᵢ − x̄)(yᵢ − ȳ)
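The slope formula and the two summation identities above can be verified numerically on a small hypothetical sample (illustrative numbers only, not from the handout):

```python
import numpy as np

# Hypothetical sample, for illustration only
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
xbar, ybar = x.mean(), y.mean()

# Slope estimator: sample covariance of (x, y) over sample variance of x
b1_hat = ((x - xbar) * (y - ybar)).sum() / ((x - xbar) ** 2).sum()

# The two summation identities used in the derivation
assert np.isclose((x * (x - xbar)).sum(), ((x - xbar) ** 2).sum())
assert np.isclose((x * (y - ybar)).sum(), ((x - xbar) * (y - ybar)).sum())

print(round(b1_hat, 4))  # prints: 1.96
```

Both identities follow from ∑(xᵢ − x̄) = 0, which is why the x̄ term can be dropped from the first factor of each sum.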


Slope Estimator

β̂₁ = ∑ᵢ₌₁ⁿ (xᵢ − x̄)(yᵢ − ȳ) / ∑ᵢ₌₁ⁿ (xᵢ − x̄)²

1. The slope estimator is the ratio of the sample covariance between x and y to the sample variance of x.

2. The sign of β̂₁ depends on the sign of the sample covariance. If x and y are positively correlated in the sample, β̂₁ is positive; if x and y are negatively correlated, then β̂₁ is negative.

3. To be able to calculate β̂₁, x must have enough variability:

∑ᵢ₌₁ⁿ (xᵢ − x̄)² > 0

If all x values are the same, then the sample variance will be 0. In this case, β̂₁ will be undefined. For example, if all employees have the same level of education, say 12 years, then it is not possible to measure the impact of edu