Download - Simple Linear Regression (OLS). Types of Correlation Positive correlationNegative correlationNo correlation.

Transcript

Simple Linear Regression (OLS)

Types of Correlation

Positive correlation Negative correlation No correlation

Simple linear regression describes the linear relationship between an independent variable, plotted on the x-axis, and a dependent variable, plotted on the y-axis.

[Figure: scatter plot with Independent Variable (X) on the x-axis and Dependent Variable (Y) on the y-axis]

$Y = \beta_0 + \beta_1 X$

[Figure: the regression line on the X-Y plot, marking the intercept $\beta_0$, the slope $\beta_1$ (change in Y per 1.0 unit of X), and the residual $\varepsilon$ for an observed Y]

Fitting data to a linear model

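$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$

where $\beta_0$ is the intercept, $\beta_1$ is the slope, and $\varepsilon_i$ are the residuals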


How to fit data to a linear model?

The Ordinary Least Squares Method (OLS)


Least Squares Regression

Model line: $\hat{Y} = \beta_0 + \beta_1 X$

Residual (ε) = $Y - \hat{Y}$

Sum of squares of residuals = $\sum (Y - \hat{Y})^2$

• we must find values of $\beta_0$ and $\beta_1$ that minimise $\sum (Y - \hat{Y})^2$


$Y = a + b X$

• $b = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sum (X - \bar{X})^2}$

• $a = \bar{Y} - b \bar{X}$
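The two formulas above can be sketched in a few lines of plain Python. The function name `fit_simple_ols` and the data are illustrative assumptions, not part of the slides.

```python
def fit_simple_ols(x, y):
    """Return (a, b) for the least-squares line Y = a + b X."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Numerator and denominator of the slope formula
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b = s_xy / s_xx          # slope
    a = y_bar - b * x_bar    # intercept
    return a, b


# Illustrative data (assumed for this example)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
a, b = fit_simple_ols(x, y)
print(f"a = {a:.2f}, b = {b:.2f}")  # a = 2.20, b = 0.60
```

For this small data set the means are 3 and 4, the cross-product sum is 6 and the sum of squared X deviations is 10, which gives the slope 0.6 and intercept 2.2.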


Regression Coefficients

$b_1 = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sum (X - \bar{X})^2}$

$b_0 = \bar{Y} - b_1 \bar{X}$


Required Statistics

$n$ = number of observations


Descriptive Statistics

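$S_{xx} = \sum (X - \bar{X})^2$, $Var(X) = \frac{S_{xx}}{n - 1}$

$S_{yy} = \sum (Y - \bar{Y})^2 = SST$, $Var(Y) = \frac{S_{yy}}{n - 1}$

$S_{xy} = \sum (X - \bar{X})(Y - \bar{Y})$, $Covar(X, Y) = \frac{S_{xy}}{n - 1}$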


Regression Statistics

$SST = \sum (Y - \bar{Y})^2$

$SSR = \sum (\hat{Y} - \bar{Y})^2$

$SSE = \sum (Y - \hat{Y})^2$


Variance to be explained by predictors

(SST)


Variance NOT explained by X1

(SSE)

Variance explained by X1

(SSR)


Regression Statistics

$SST = SSR + SSE$


Regression Statistics

$R^2 = \frac{SSR}{SST}$

Coefficient of Determination: used to judge the adequacy of the regression model
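A short sketch of how SST, SSR, SSE and the coefficient of determination relate for a fitted line. The helper name `sums_of_squares` and the data are assumptions made for illustration.

```python
def sums_of_squares(x, y):
    """Return (SST, SSR, SSE) for the least-squares line through (x, y)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
          / sum((xi - x_bar) ** 2 for xi in x))
    b0 = y_bar - b1 * x_bar
    y_hat = [b0 + b1 * xi for xi in x]                     # fitted values
    sst = sum((yi - y_bar) ** 2 for yi in y)               # total
    ssr = sum((yh - y_bar) ** 2 for yh in y_hat)           # explained
    sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # unexplained
    return sst, ssr, sse


# Illustrative data (assumed for this example)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
sst, ssr, sse = sums_of_squares(x, y)
r_squared = ssr / sst
print(f"SST = {sst:.2f}, SSR = {ssr:.2f}, SSE = {sse:.2f}, R^2 = {r_squared:.2f}")
# SST = 6.00, SSR = 3.60, SSE = 2.40, R^2 = 0.60
```

The printed values also confirm the decomposition SST = SSR + SSE for this example (6.0 = 3.6 + 2.4).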


Regression Statistics

Correlation

$R = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}}$

measures the strength of the linear association between two variables.

Regression Statistics

Standard Error for the regression model

$S_e = \sqrt{MSE}$

$S_e^2 = MSE = \frac{SSE}{n - 2}$

$SSE = \sum (Y - \hat{Y})^2$
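As a rough illustration, the correlation and the standard error of a simple regression can be computed from the three sums $S_{xx}$, $S_{yy}$ and $S_{xy}$. The function name and data below are assumptions, and the residual degrees of freedom are taken to be n - 2, as in the simple regression ANOVA table.

```python
import math


def correlation_and_standard_error(x, y):
    """Return (R, S_e) for a simple linear regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)   # equals SST
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    r = s_xy / math.sqrt(s_xx * s_yy)
    b1 = s_xy / s_xx
    sse = s_yy - b1 * s_xy                      # SSE = SST - SSR, with SSR = b1 * S_xy
    s_e = math.sqrt(sse / (n - 2))              # square root of MSE
    return r, s_e


# Illustrative data (assumed for this example)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r, s_e = correlation_and_standard_error(x, y)
print(f"R = {r:.4f}, S_e = {s_e:.4f}")
```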


ANOVA to test significance of regression

Source       df     SS     MS         F           P-value
Regression   1      SSR    SSR / df   MSR / MSE   P(F)
Residual     n-2    SSE    SSE / df
Total        n-1    SST

If P(F) < α, then we get significantly better prediction of Y from the regression model than by just predicting the mean of Y.
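The F column of the table can be sketched as below. The function name is hypothetical, and the SSR and SSE inputs are assumed example values. The P-value is left out because it needs the F distribution, which the Python standard library does not provide; a statistics package or an F table would be used for that step.

```python
def regression_f_statistic(ssr, sse, n, k=1):
    """F = MSR / MSE with k regression df and n - k - 1 residual df."""
    msr = ssr / k
    mse = sse / (n - k - 1)
    return msr / mse


# Assumed example values: SSR = 3.6, SSE = 2.4, n = 5 observations, one predictor
f_stat = regression_f_statistic(ssr=3.6, sse=2.4, n=5)
print(f"F = {f_stat:.2f}")  # F = 4.50
```

With k = 1 the residual degrees of freedom reduce to n - 2, matching the table for simple regression.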


Hypothesis Tests for Regression Coefficients

$t_{(n-k-1)} = \frac{b_i - \beta_i}{S_{b_i}}$


Hypothesis Tests for Regression Coefficients

$t_{(n-k-1)} = \frac{b_1 - \beta_1}{S_e(b_1)}$, where $S_e(b_1) = \frac{S_e}{\sqrt{S_{xx}}}$


Hypothesis Tests on Regression Coefficients

$t_{(n-k-1)} = \frac{b_0 - \beta_0}{S_e(b_0)}$, where $S_e(b_0) = S_e \sqrt{\frac{1}{n} + \frac{\bar{X}^2}{S_{xx}}}$
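A minimal sketch of the t statistics for the intercept and slope in simple regression (k = 1), testing the null hypotheses that each coefficient is zero. The standard error expressions used here are the standard textbook ones and are an assumption about what the slides showed; the function name and data are illustrative.

```python
import math


def coefficient_t_statistics(x, y):
    """Return (t for b0, t for b1) under H0: beta_0 = 0 and H0: beta_1 = 0."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s_e = math.sqrt(sse / (n - 2))                       # standard error of the model
    se_b1 = s_e / math.sqrt(s_xx)                        # standard error of the slope
    se_b0 = s_e * math.sqrt(1 / n + x_bar ** 2 / s_xx)   # standard error of the intercept
    return b0 / se_b0, b1 / se_b1


# Illustrative data (assumed for this example)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
t_b0, t_b1 = coefficient_t_statistics(x, y)
print(f"t(b0) = {t_b0:.3f}, t(b1) = {t_b1:.3f}")
```

Each statistic is compared with a t distribution on n - 2 degrees of freedom. In simple regression the squared slope statistic equals the ANOVA F statistic.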


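Hypothesis Test for the Correlation Coefficient

$H_0: \rho = 0$

$T_0 = \frac{R \sqrt{n - 2}}{\sqrt{1 - R^2}}$

We would reject the null hypothesis if the observed value $t_0$ of $T_0$ satisfies $|t_0| > t_{\alpha/2,\, n-2}$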
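The test can be sketched as follows. The correlation value and sample size are assumed example inputs, and the critical value 3.182 is the tabulated two-sided 5% point of the t distribution with 3 degrees of freedom, entered by hand because the standard library has no t distribution.

```python
import math


def correlation_t_statistic(r, n):
    """Test statistic for H0: rho = 0, on n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


# Assumed example: R^2 = 0.6 from n = 5 observations
r = math.sqrt(0.6)
t0 = correlation_t_statistic(r, n=5)

t_critical = 3.182          # t_{0.025, 3}, taken from a t table (alpha = 0.05)
reject_h0 = abs(t0) > t_critical
print(f"t0 = {t0:.3f}, reject H0: {reject_h0}")
```

In this example the statistic is below the critical value, so the null hypothesis of zero correlation would not be rejected at the 5% level.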


Diagnostic Tests For Regressions

Expected distribution of residuals for a linear model with normally distributed residuals (errors).


Diagnostic Tests For Regressions

Residuals for a non-linear fit


Diagnostic Tests For Regressions

Residuals for a quadratic function or polynomial


Diagnostic Tests For Regressions

Residuals are not homogeneous (increasing in variance)


Regression – important points

1. Ensure that the range of values sampled for the predictor variable is large enough to capture the full range of responses by the response variable.


Regression – important points

2. Ensure that the distribution of predictor values is approximately uniform within the sampled range.


Assumptions of Regression

1. The linear model correctly describes the functional relationship between X and Y.


Assumptions of Regression

2. The X variable is measured without error


Assumptions of Regression

3. For any given value of X, the sampled Y values are independent

4. Residuals (errors) are normally distributed.

5. Variances are constant along the regression line.


Multiple Linear Regression (MLR)


The linear model with a single predictor variable X can easily be extended to two or more predictor variables.

$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_p X_p + \varepsilon$


Variance NOT explained by X1 and X2

Unique variance explained by X1

Unique variance explained by X2

Common variance explained by X1 and X2


X1 X2

A “good” model


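Partial Regression Coefficients (slopes): the regression coefficient of a predictor X after controlling for the influence of the other predictors on both X and Y (that is, holding all other predictors constant).

$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_p X_p + \varepsilon$

where $\beta_0$ is the intercept, $\beta_1, \dots, \beta_p$ are the partial regression coefficients, and $\varepsilon$ are the residuals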


The matrix algebra of Ordinary Least Squares

Intercept and Slopes: $b = (X'X)^{-1} X'Y$

Predicted Values: $\hat{Y} = Xb$

Residuals: $e = Y - \hat{Y}$
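The matrix form can be sketched without external libraries by solving the normal equations $(X'X) b = X'Y$ with Gauss-Jordan elimination, which gives the same $b$ as forming the inverse explicitly. All helper names and the data are assumptions for illustration; in practice a linear algebra library would normally be used.

```python
def solve_linear_system(a, rhs):
    """Solve a @ beta = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [row[:] + [value] for row, value in zip(a, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit_ols_matrix(x_rows, y):
    """Return coefficients b from the normal equations (X'X) b = X'Y."""
    design = [[1.0] + list(row) for row in x_rows]   # leading 1 for the intercept
    columns = list(zip(*design))                     # rows of X'
    xtx = [[sum(p * q for p, q in zip(ci, cj)) for cj in columns] for ci in columns]
    xty = [sum(p * yi for p, yi in zip(ci, y)) for ci in columns]
    return solve_linear_system(xtx, xty)


# Illustrative data generated exactly from Y = 1 + 2*X1 + 3*X2 (assumed)
x_rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6)]
y = [9, 8, 19, 18, 29]
b = fit_ols_matrix(x_rows, y)
y_hat = [b[0] + b[1] * x1 + b[2] * x2 for x1, x2 in x_rows]   # predicted values
residuals = [yi - yh for yi, yh in zip(y, y_hat)]              # e = Y - Y_hat
print([round(v, 6) for v in b])
```

Because the example response is an exact linear function of the two predictors, the recovered coefficients are close to 1, 2 and 3 and the residuals are close to zero.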


Regression Statistics

How good is our model?

$SST = \sum (Y - \bar{Y})^2$

$SSR = \sum (\hat{Y} - \bar{Y})^2$

$SSE = \sum (Y - \hat{Y})^2$


Regression Statistics

$R^2 = \frac{SSR}{SST}$

Coefficient of Determination: used to judge the adequacy of the regression model


Regression Statistics

Adjusted $R^2$ is less biased than $R^2$: it penalises the number of predictors.

$R^2_{adj} = 1 - (1 - R^2) \frac{n - 1}{n - k - 1}$

n = sample size
k = number of independent variables
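A small sketch of the adjustment, using the usual form $1 - (1 - R^2)(n - 1)/(n - k - 1)$. The function name and input values are assumptions for illustration.

```python
def adjusted_r_squared(r_squared, n, k):
    """Adjusted R^2 for sample size n and k independent variables."""
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)


# Assumed example: R^2 = 0.6 with n = 5 observations
one_predictor = adjusted_r_squared(0.6, n=5, k=1)
two_predictors = adjusted_r_squared(0.6, n=5, k=2)
print(f"k = 1: {one_predictor:.3f}, k = 2: {two_predictors:.3f}")
```

For the same $R^2$, the adjusted value drops as k grows, so an extra predictor has to improve the fit enough to offset the penalty.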


Regression Statistics

Standard Error for the regression model

$S_e = \sqrt{MSE}$

$S_e^2 = MSE = \frac{SSE}{n - k - 1}$

$SSE = \sum (Y - \hat{Y})^2$


ANOVA to test significance of regression

$H_0: \beta_1 = \beta_2 = \dots = \beta_k = 0$

$H_1$: at least one $\beta_i \neq 0$

Source       df       SS     MS         F           P-value
Regression   k        SSR    SSR / df   MSR / MSE   P(F)
Residual     n-k-1    SSE    SSE / df
Total        n-1      SST

If P(F) < α, then we get significantly better prediction of Y from the regression model than by just predicting the mean of Y.


Hypothesis Tests for Regression Coefficients

$t_{(n-k-1)} = \frac{b_i - \beta_i}{S_{b_i}}$


Hypothesis Tests for Regression Coefficients

$H_0: \beta_i = 0$

$t_{(n-k-1)} = \frac{b_i - \beta_i}{S_e(b_i)}$, where $S_e(b_i)$ is the standard error of $b_i$


Diagnostic Tests For Regressions

Expected distribution of residuals for a linear model with normally distributed residuals (errors).

[Figure: X Residual Plot, with residuals on the vertical axis and X (0 to 8) on the horizontal axis]


Standardized Residuals

[Figure: Standard Residuals plot, vertical axis from -2 to 2.5, horizontal axis from 0 to 25]


Model Selection

Avoiding predictors (Xs) that do not contribute significantly to model prediction


Model Selection

- Forward selection: the ‘best’ predictor variables are entered, one by one.

- Backward elimination: the ‘worst’ predictor variables are eliminated, one by one.


Forward Selection


Backward Elimination


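Model Selection: The General Case

$H_0: \beta_{q+1} = \beta_{q+2} = \dots = \beta_k = 0$

$H_1$: at least one is not zero

$F = \frac{\left[ SSE(x_1, x_2, \dots, x_q) - SSE(x_1, x_2, \dots, x_q, x_{q+1}, \dots, x_k) \right] / (k - q)}{SSE(x_1, x_2, \dots, x_q, x_{q+1}, \dots, x_k) / (n - k - 1)}$

Reject $H_0$ if: $F > F_{\alpha,\, k-q,\, n-k-1}$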
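The partial F statistic compares a reduced model with q predictors against a full model with k predictors. The sketch below assumes both SSE values are already available; the function name and the numbers are hypothetical.

```python
def partial_f_statistic(sse_reduced, sse_full, n, k, q):
    """F statistic for H0: the extra k - q coefficients are all zero."""
    numerator = (sse_reduced - sse_full) / (k - q)   # SSE drop per added predictor
    denominator = sse_full / (n - k - 1)             # MSE of the full model
    return numerator / denominator


# Hypothetical values: reduced model (q = 2) vs full model (k = 4), n = 25
f_partial = partial_f_statistic(sse_reduced=120.0, sse_full=80.0, n=25, k=4, q=2)
print(f"F = {f_partial:.2f}")  # F = 5.00
```

The result is compared with the critical value of the F distribution on k - q and n - k - 1 degrees of freedom, here 2 and 20, looked up in a table or a statistics package.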


Multicollinearity

The degree of correlation between Xs.

A high degree of multicollinearity produces unacceptable uncertainty (large variance) in regression coefficient estimates (i.e., large sampling variation).

Estimates of slopes are imprecise, and even the signs of the coefficients may be misleading.

t-tests may fail to reveal significant factors.


Scatter Plot


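Multicollinearity

If the F-test for significance of regression is significant, but tests on the individual regression coefficients are not, multicollinearity may be present.

Variance Inflation Factors (VIFs) are very useful measures of multicollinearity. If any VIF exceeds 5, multicollinearity is a problem.

$VIF(b_i) = C_{ii} = \frac{1}{1 - R_i^2}$

where $R_i^2$ is the coefficient of determination from regressing $X_i$ on the other predictors, and $C_{ii}$ is the i-th diagonal element of the inverse correlation matrix of the predictors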
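For the special case of exactly two predictors, $R_i^2$ is simply the squared correlation between them, so the VIF can be sketched in a few lines. The function name and data are assumptions; with more than two predictors, each $X_i$ would have to be regressed on all the others to obtain $R_i^2$.

```python
def vif_two_predictors(x1, x2):
    """VIF shared by both predictors when the model has exactly two Xs."""
    n = len(x1)
    m1 = sum(x1) / n
    m2 = sum(x2) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    r_squared = s12 ** 2 / (s11 * s22)   # R_i^2 from regressing one X on the other
    return 1 / (1 - r_squared)


# Illustrative predictor values (assumed for this example)
x1 = [1, 2, 3, 4, 5]
x2 = [2, 4, 5, 4, 5]
vif = vif_two_predictors(x1, x2)
print(f"VIF = {vif:.2f}")  # VIF = 2.50
```

Here the squared correlation between the predictors is 0.6, giving a VIF of 2.5, which is below the threshold of 5 quoted above.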