Page 1

Chapter 6

Multiple Regression I

許湘伶

Applied Linear Regression Models (Kutner, Nachtsheim, Neter, Li)

Page 2

Multiple regression analysis is one of the most widely used of all statistical methods.

a variety of multiple regression models

basic statistical results in matrix form 

Page 3

6.1 Multiple Regression Models

First-Order Models with Two Predictor Variables

Two predictor variables: X1, X2

Yi = β0 + β1Xi1 + β2Xi2 + εi

first-order model with two predictor variables: linear in the predictor variables

Assume E{εi} = 0:

E{Y} = β0 + β1X1 + β2X2

called a regression surface or a response surface
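To make this concrete, here is a minimal Python sketch (not from the slides) that evaluates the response plane on a small grid; the coefficient values are made up for illustration:

```python
import numpy as np

# A minimal sketch of the response plane E{Y} = b0 + b1*X1 + b2*X2
# with made-up coefficients (assumed values, not from the example).
beta0, beta1, beta2 = 10.0, 2.0, -1.5
X1, X2 = np.meshgrid(np.linspace(0, 4, 3), np.linspace(0, 4, 3))
EY = beta0 + beta1 * X1 + beta2 * X2   # mean response on a small grid
print(EY)  # a plane: each unit step in X1 adds beta1, in X2 adds beta2
```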

Page 4

6.1 Multiple Regression Models

Figure: Response function is a plane (Sales Promotion example).

Page 5

6.1 Multiple Regression Models

Meaning of Regression Coefficients

β0: the intercept of the regression plane; the mean response when X1 = 0 and X2 = 0

β1: the change in the mean response per unit increase in X1 (∆X1 = 1) when X2 is held constant; ∂E{Y}/∂X1 = β1

β2: the change in the mean response per unit increase in X2 (∆X2 = 1) when X1 is held constant; ∂E{Y}/∂X2 = β2

β1, β2: partial regression coefficients, because they reflect the partial effect of one predictor when the other is held constant

X1, X2: have additive effects; they do not interact

Yi = β0 + β1Xi1 + β2Xi2 + εi: the effects are additive (the predictors do not interact)

Page 6

6.1 Multiple Regression Models

First-Order with More than Two Predictor Variables

The first-order regression model with p − 1 predictor variables:

$$Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \cdots + \beta_{p-1} X_{i,p-1} + \varepsilon_i = \beta_0 + \sum_{k=1}^{p-1} \beta_k X_{ik} + \varepsilon_i = \sum_{k=0}^{p-1} \beta_k X_{ik} + \varepsilon_i, \quad \text{where } X_{i0} \equiv 1$$

Assume E{εi} = 0; the response function (a hyperplane) is:

E{Y} = β0 + β1X1 + β2X2 + · · · + βp−1Xp−1

Page 7

6.1 Multiple Regression Models

First-Order with More than Two Predictor Variables (cont.)

A hyperplane: in more than two dimensions it is no longer possible to picture the response surface

βk: the change in the mean response E{Y} with ∆Xk = 1 when all other predictor variables are held constant

additive and do not interact

Page 8

6.1 Multiple Regression Models

General Linear Regression Model

The general linear regression model with normal error terms:

$$Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \cdots + \beta_{p-1} X_{i,p-1} + \varepsilon_i = \sum_{k=0}^{p-1} \beta_k X_{ik} + \varepsilon_i, \quad X_{i0} \equiv 1, \quad i = 1, \ldots, n$$

parameters: β0, β1, . . . , βp−1

known constants: Xi1, . . . , Xi,p−1

εi: independent N(0, σ²)

Page 9

6.1 Multiple Regression Models

Qualitative predictor variables: predictors may be quantitative or qualitative

Polynomial regression: X, X²

Transformed variables: log Y

Interaction effects: X1X2

Combinations of cases: X1², X2², X1X2

Page 10

6.1 Multiple Regression Models

Qualitative Predictor Variables

Illustration: Y = length of hospital stay; X1 = age; X2 = gender

Yi = β0 + β1Xi1 + β2Xi2 + εi

Xi1: patient’s age

Xi2 = 1 if the patient is female, 0 if the patient is male

Page 11

6.1 Multiple Regression Models

Qualitative Predictor Variables (cont.)

The response function: E{Y} = β0 + β1X1 + β2X2

The response function for male patients (X2 = 0): E{Y} = β0 + β1X1

The response function for female patients (X2 = 1): E{Y} = (β0 + β2) + β1X1

(Chap. 8)
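A minimal Python sketch of this illustration, with made-up coefficients (β0 = 5, β1 = 0.1, β2 = 2 are assumptions, not from the text), shows that the indicator only shifts the intercept:

```python
import numpy as np

# A sketch of the hospital-stay illustration with made-up coefficients.
beta0, beta1, beta2 = 5.0, 0.1, 2.0
age = np.array([30.0, 50.0, 70.0])

for x2 in (0, 1):                               # 0 = male, 1 = female
    mean_stay = beta0 + beta1 * age + beta2 * x2
    print(f"X2 = {x2}: E{{Y}} = {mean_stay}")
# Two parallel lines: the indicator only shifts the intercept
# from beta0 (male) to beta0 + beta2 (female).
```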

Page 12

6.1 Multiple Regression Models

Polynomial Regression (Chap. 8)

Special cases of the general linear regression model

Squared and higher-order terms of the predictor variables ⇒ curvilinear response functions

Yi = β0 + β1Xi + β2Xi² + εi = β0 + β1Xi1 + β2Xi2 + εi, where Xi1 = Xi and Xi2 = Xi²

Page 13

6.1 Multiple Regression Models

Transformed Variables

Complex, curvilinear response functions can often be handled by transforming Y:

log Yi = β0 + β1Xi1 + β2Xi2 + β3Xi3 + εi   (Y′i = log Yi)

$$Y_i = \frac{1}{\beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \varepsilon_i} \quad \left(Y_i' = \frac{1}{Y_i}\right)$$

Page 14

6.1 Multiple Regression Models

Interaction Effects and Combination of Cases

An example of a nonadditive regression model with two predictor variables X1 and X2:

Yi = β0 + β1Xi1 + β2Xi2 + β3Xi1Xi2 + εi

= β0 + β1Xi1 + β2Xi2 + β3Xi3 + εi, (Xi3 = Xi1Xi2)

With cross-product and squared terms (a combination of cases):

Yi = β0 + β1Xi1 + β2Xi1² + β3Xi2 + β4Xi2² + β5Xi1Xi2 + εi
   = β0 + β1Zi1 + β2Zi2 + β3Zi3 + β4Zi4 + β5Zi5 + εi

where Zi1 = Xi1, Zi2 = Xi1², Zi3 = Xi2, Zi4 = Xi2², Zi5 = Xi1Xi2
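A short Python sketch (made-up data) of how such cross-product and squared columns can be assembled into a design matrix; the model remains linear in the parameters:

```python
import numpy as np

# A sketch of building the Z columns above from made-up X1, X2 data.
rng = np.random.default_rng(0)
X1 = rng.uniform(0, 10, size=8)
X2 = rng.uniform(0, 5, size=8)

# Columns: 1, Z1 = X1, Z2 = X1^2, Z3 = X2, Z4 = X2^2, Z5 = X1*X2
X = np.column_stack([np.ones_like(X1), X1, X1**2, X2, X2**2, X1 * X2])
print(X.shape)   # (8, 6): still linear in beta0, ..., beta5
```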

Page 15

6.1 Multiple Regression Models

Interaction Effects and Combination of Cases (cont.)

Figure: Additional examples of response functions.

Page 16

6.1 Multiple Regression Models

Meaning of Linear in General Linear Regression Model

A regression model is linear in the parameters:

Yi = ci0β0 + ci1β1 + ci2β2 + · · · + ci,p−1βp−1 + εi

where cik, k = 0, . . . , p − 1, are coefficients involving the predictor variables

Illustration: nonlinear

Yi = β0 exp(β1Xi) + εi

Page 17

6.2 General Linear Regression Model in Matrix Terms

General Linear Regression Model in Matrix Terms

The general linear regression model:

Yi = β0 + β1Xi1 + β2Xi2 + · · ·+ βp−1Xi,p−1 + εi, i = 1, . . . , n

$$\mathbf{Y}_{n \times 1} = \begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix}, \qquad \mathbf{X}_{n \times p} = \begin{bmatrix} 1 & X_{11} & X_{12} & \cdots & X_{1,p-1} \\ 1 & X_{21} & X_{22} & \cdots & X_{2,p-1} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & X_{n1} & X_{n2} & \cdots & X_{n,p-1} \end{bmatrix},$$

$$\boldsymbol{\beta}_{p \times 1} = \begin{bmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_{p-1} \end{bmatrix}, \qquad \boldsymbol{\varepsilon}_{n \times 1} = \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{bmatrix}$$
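A minimal Python sketch of the matrix form with made-up data and an assumed parameter vector:

```python
import numpy as np

# A sketch of Y = X beta + eps in matrix form with made-up data
# (n = 6 cases, p = 3: an intercept plus two predictors).
rng = np.random.default_rng(1)
n = 6
X1, X2 = rng.normal(size=n), rng.normal(size=n)
X = np.column_stack([np.ones(n), X1, X2])       # n x p, first column = 1
beta = np.array([10.0, 2.0, -1.5])              # assumed parameter vector
eps = rng.normal(scale=0.5, size=n)             # independent N(0, sigma^2)
Y = X @ beta + eps
print(X.shape, Y.shape)                         # (6, 3) (6,)
```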

Page 18

6.2 General Linear Regression Model in Matrix Terms

General Linear Regression Model in Matrix Terms (cont.)

$$\mathbf{Y}_{n \times 1} = \mathbf{X}_{n \times p} \, \boldsymbol{\beta}_{p \times 1} + \boldsymbol{\varepsilon}_{n \times 1}$$

Y: vector of responses; β: vector of parameters

X: matrix of constants

ε: vector of independent normal random variables with E{ε} = 0 and σ²{ε}n×n = σ²I

Expectation and variance-covariance matrix of Y:

E{Y}n×1 = Xβ;  σ²{Y}n×n = σ²I

Page 19

6.3 Estimation of Regression Coefficients

Estimation of Regression Coefficients

The general linear regression model:

Yi = β0 + β1Xi1 + β2Xi2 + · · ·+ βp−1Xi,p−1 + εi, i = 1, . . . , n

The least squares criterion: the estimates of β0, . . . , βp−1 minimize

$$Q = \sum_{i=1}^{n} (Y_i - \beta_0 - \beta_1 X_{i1} - \cdots - \beta_{p-1} X_{i,p-1})^2$$

Page 20

6.3 Estimation of Regression Coefficients

Estimation of Regression Coefficients (cont.)

The vector of least squares estimated regression coefficients:

$$\mathbf{b}_{p \times 1} = \begin{bmatrix} b_0 \\ b_1 \\ \vdots \\ b_{p-1} \end{bmatrix}$$

The normal equations:

X′Xb = X′Y

LSE (when X′X is nonsingular):

$$\mathbf{b}_{p \times 1} = (\mathbf{X}'\mathbf{X})^{-1} \mathbf{X}' \mathbf{Y}$$
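A minimal Python sketch (made-up data) of the LSE via the normal equations; np.linalg.lstsq is shown as a numerically preferable cross-check:

```python
import numpy as np

# A sketch of the LSE b = (X'X)^{-1} X'Y on made-up data. Solving the
# normal equations is shown for transparency; numerically, lstsq is
# usually preferable to forming an explicit inverse.
rng = np.random.default_rng(2)
n = 30
X = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
Y = X @ np.array([10.0, 2.0, -1.5]) + rng.normal(scale=0.5, size=n)

b = np.linalg.solve(X.T @ X, X.T @ Y)            # X'X b = X'Y
b_lstsq, *_ = np.linalg.lstsq(X, Y, rcond=None)  # same answer
print(b, np.allclose(b, b_lstsq))
```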

Page 21

6.3 Estimation of Regression Coefficients

Estimation of Regression Coefficients (cont.)

For the normal error regression model, the method of maximum likelihood leads to the same estimators as least squares.

The likelihood function:

$$L(\boldsymbol{\beta}, \sigma^2) = \frac{1}{(2\pi\sigma^2)^{n/2}} \exp\left[ -\frac{1}{2\sigma^2} \sum_{i=1}^{n} (Y_i - \beta_0 - \beta_1 X_{i1} - \cdots - \beta_{p-1} X_{i,p-1})^2 \right]$$

Maximizing the likelihood function with respect to β0, β1, . . . , βp−1 leads to the estimators b.

Page 22

6.4 Fitted Values and Residuals

Fitted Values and Residuals

The vector of fitted values Ŷi and the vector of residuals ei = Yi − Ŷi:

$$\hat{\mathbf{Y}}_{n \times 1} = \begin{bmatrix} \hat{Y}_1 \\ \hat{Y}_2 \\ \vdots \\ \hat{Y}_n \end{bmatrix} = \mathbf{X}\mathbf{b} = \mathbf{H}\mathbf{Y}, \qquad \mathbf{H}_{n \times n} = \mathbf{X}(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'$$

$$\mathbf{e}_{n \times 1} = \begin{bmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{bmatrix} = \mathbf{Y} - \hat{\mathbf{Y}} = \mathbf{Y} - \mathbf{X}\mathbf{b} = (\mathbf{I} - \mathbf{H})\mathbf{Y} = (\mathbf{I} - \mathbf{H})\boldsymbol{\varepsilon}$$
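A short Python sketch (made-up data) of the hat matrix, fitted values, and residuals:

```python
import numpy as np

# A sketch of fitted values and residuals via the hat matrix
# H = X (X'X)^{-1} X' on made-up data (H is n x n, so this is for
# illustration only, not for large n).
rng = np.random.default_rng(3)
n = 12
X = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
Y = X @ np.array([1.0, 0.5, -0.25]) + rng.normal(scale=0.3, size=n)

H = X @ np.linalg.solve(X.T @ X, X.T)
Y_hat = H @ Y                                   # fitted values Xb = HY
e = Y - Y_hat                                   # residuals
print(np.allclose(e, (np.eye(n) - H) @ Y))      # e = (I - H) Y
print(np.allclose(H @ H, H))                    # H is idempotent
```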

Page 23

6.4 Fitted Values and Residuals

Fitted Values and Residuals (cont.)

the variance-covariance matrix of the residuals:

σ²{e}n×n = σ²(I − H)

estimated by: s²{e}n×n = MSE (I − H)

Page 24

6.5 Analysis of Variance Results

Analysis of Variance Results

The sums of squares for ANOVA in matrix terms:

$$SSTO = \mathbf{Y}'\mathbf{Y} - \left(\frac{1}{n}\right)\mathbf{Y}'\mathbf{J}\mathbf{Y} = \mathbf{Y}'\left[\mathbf{I} - \frac{1}{n}\mathbf{J}\right]\mathbf{Y}$$

$$SSE = \mathbf{e}'\mathbf{e} = (\mathbf{Y} - \mathbf{X}\mathbf{b})'(\mathbf{Y} - \mathbf{X}\mathbf{b}) = \mathbf{Y}'\mathbf{Y} - \mathbf{b}'\mathbf{X}'\mathbf{Y} = \mathbf{Y}'(\mathbf{I} - \mathbf{H})\mathbf{Y}$$

$$SSR = \mathbf{b}'\mathbf{X}'\mathbf{Y} - \left(\frac{1}{n}\right)\mathbf{Y}'\mathbf{J}\mathbf{Y} = \mathbf{Y}'\left[\mathbf{H} - \frac{1}{n}\mathbf{J}\right]\mathbf{Y}$$

$$MSR = \frac{SSR}{p-1}, \qquad MSE = \frac{SSE}{n-p}$$
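A short Python sketch (made-up data) of these matrix-form sums of squares:

```python
import numpy as np

# A sketch of the matrix-form sums of squares on made-up data
# (J denotes the n x n matrix of ones).
rng = np.random.default_rng(4)
n, p = 25, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
Y = X @ np.array([5.0, 1.0, -2.0]) + rng.normal(size=n)

b = np.linalg.solve(X.T @ X, X.T @ Y)
J = np.ones((n, n))
SSTO = Y @ Y - Y @ J @ Y / n
SSE = Y @ Y - b @ (X.T @ Y)
SSR = b @ (X.T @ Y) - Y @ J @ Y / n
print(np.allclose(SSTO, SSR + SSE))             # SSTO = SSR + SSE
print(SSR / (p - 1), SSE / (n - p))             # MSR, MSE
```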

Page 25

6.5 Analysis of Variance Results

Analysis of Variance Results (cont.)

Table: ANOVA table for the general linear regression model (6.19).

Source of Variation | SS                       | df    | MS
Regression          | SSR = b′X′Y − (1/n)Y′JY  | p − 1 | MSR = SSR/(p − 1)
Error               | SSE = Y′Y − b′X′Y        | n − p | MSE = SSE/(n − p)
Total               | SSTO = Y′Y − (1/n)Y′JY   | n − 1 |

E{MSE} = σ²

For p − 1 = 2:

$$E\{MSR\} = \sigma^2 + \frac{1}{2}\left[\beta_1^2 \sum (X_{i1} - \bar{X}_1)^2 + \beta_2^2 \sum (X_{i2} - \bar{X}_2)^2 + 2\beta_1\beta_2 \sum (X_{i1} - \bar{X}_1)(X_{i2} - \bar{X}_2)\right]$$

If β1 = β2 = 0, then E{MSR} = σ²; otherwise E{MSR} > σ².

Page 26

6.5 Analysis of Variance Results

F Test for Regression Relation

Test whether there is a regression relation between the response Y and the set of X variables:

H0 : β1 = β2 = · · · = βp−1 = 0

Ha : not all βk (k = 1, . . . , p − 1) equal zero

⇒ test statistic: F∗ = MSR/MSE

The decision rule to control the Type I error at α:

If F∗ ≤ F(1 − α; p − 1, n − p), conclude H0

If F∗ > F(1 − α; p − 1, n − p), conclude Ha
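A minimal Python sketch of the test at an assumed α = 0.05, using made-up data:

```python
import numpy as np
from scipy import stats

# A sketch of the overall F test (alpha and the data are assumptions).
rng = np.random.default_rng(5)
n, p, alpha = 25, 3, 0.05
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
Y = X @ np.array([5.0, 1.0, -2.0]) + rng.normal(size=n)

b = np.linalg.solve(X.T @ X, X.T @ Y)
SSE = Y @ Y - b @ (X.T @ Y)
SSR = b @ (X.T @ Y) - Y.sum() ** 2 / n          # b'X'Y - (1/n) Y'JY
F_star = (SSR / (p - 1)) / (SSE / (n - p))
F_crit = stats.f.ppf(1 - alpha, p - 1, n - p)
print(F_star, F_crit, F_star > F_crit)          # conclude Ha if True
```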

Page 27

6.5 Analysis of Variance Results

Coefficient of Multiple Determination

The coefficient of multiple determination: R2

R² = SSR/SSTO = 1 − SSE/SSTO

Measures the proportionate reduction of total variation in Y associated with X1, . . . , Xp−1

0 ≤ R² ≤ 1

Adding another X variable can only increase R²: SSE can never become larger when an X variable is added, while SSTO stays the same for a given set of responses

Page 28

6.5 Analysis of Variance Results

Coefficient of Multiple Determination (cont.)

The adjusted coefficient of multiple determination: R²a

$$R_a^2 = 1 - \frac{SSE/(n-p)}{SSTO/(n-1)} = 1 - \left(\frac{n-1}{n-p}\right)\frac{SSE}{SSTO}$$

R²a may become smaller when another X variable is introduced into the model: the decrease in SSE may be more than offset by the loss of a degree of freedom in n − p

Coefficient of multiple correlation: R = √R²
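A short Python sketch (made-up data) of R², R²a, and R:

```python
import numpy as np

# A sketch of R^2 and the adjusted R^2_a on made-up data.
rng = np.random.default_rng(6)
n, p = 25, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
Y = X @ np.array([5.0, 1.0, -2.0]) + rng.normal(size=n)

b = np.linalg.solve(X.T @ X, X.T @ Y)
SSE = Y @ Y - b @ (X.T @ Y)
SSTO = Y @ Y - Y.sum() ** 2 / n
R2 = 1 - SSE / SSTO
R2a = 1 - (n - 1) / (n - p) * SSE / SSTO        # penalizes extra X's
print(R2, R2a, np.sqrt(R2))                     # R2, R2a, R
```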

Page 29

6.6 Inferences about Regression Parameters

Inferences about Regression Parameters

Unbiased: E{b} = β

The variance-covariance matrix σ²{b}:

$$\sigma^2\{\mathbf{b}\}_{p \times p} = \begin{bmatrix} \sigma^2\{b_0\} & \sigma\{b_0, b_1\} & \cdots & \sigma\{b_0, b_{p-1}\} \\ \sigma\{b_1, b_0\} & \sigma^2\{b_1\} & \cdots & \sigma\{b_1, b_{p-1}\} \\ \vdots & \vdots & & \vdots \\ \sigma\{b_{p-1}, b_0\} & \sigma\{b_{p-1}, b_1\} & \cdots & \sigma^2\{b_{p-1}\} \end{bmatrix} = \sigma^2 (\mathbf{X}'\mathbf{X})^{-1}$$

Page 30

6.6 Inferences about Regression Parameters

Inferences about Regression Parameters (cont.)

The estimated variance-covariance matrix s²{b}:

$$s^2\{\mathbf{b}\}_{p \times p} = \begin{bmatrix} s^2\{b_0\} & s\{b_0, b_1\} & \cdots & s\{b_0, b_{p-1}\} \\ s\{b_1, b_0\} & s^2\{b_1\} & \cdots & s\{b_1, b_{p-1}\} \\ \vdots & \vdots & & \vdots \\ s\{b_{p-1}, b_0\} & s\{b_{p-1}, b_1\} & \cdots & s^2\{b_{p-1}\} \end{bmatrix} = MSE \, (\mathbf{X}'\mathbf{X})^{-1}$$
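A minimal Python sketch (made-up data) of s²{b} and the standard errors s{bk}:

```python
import numpy as np

# A sketch of s2{b} = MSE (X'X)^{-1} on made-up data; the square roots
# of the diagonal are the standard errors s{bk}.
rng = np.random.default_rng(7)
n, p = 25, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
Y = X @ np.array([5.0, 1.0, -2.0]) + rng.normal(size=n)

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ Y
MSE = (Y - X @ b) @ (Y - X @ b) / (n - p)
s2_b = MSE * XtX_inv
print(np.sqrt(np.diag(s2_b)))                   # s{b0}, ..., s{b_{p-1}}
```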

Page 31

6.6 Inferences about Regression Parameters

Interval Estimation of βk

For the normal error regression model:

$$\frac{b_k - \beta_k}{s\{b_k\}} \sim t(n-p), \qquad k = 0, 1, \ldots, p-1$$

The confidence limits for βk with 1 − α confidence coefficient:

bk ± t(1 − α/2; n − p) s{bk}
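A short Python sketch of these confidence limits, recomputing the made-up data from the previous sketch; α = 0.05 is assumed:

```python
import numpy as np
from scipy import stats

# A sketch of bk ± t(1 - alpha/2; n - p) s{bk} on made-up data.
rng = np.random.default_rng(7)
n, p, alpha = 25, 3, 0.05
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
Y = X @ np.array([5.0, 1.0, -2.0]) + rng.normal(size=n)
XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ Y
s_b = np.sqrt(np.diag((Y - X @ b) @ (Y - X @ b) / (n - p) * XtX_inv))

t_mult = stats.t.ppf(1 - alpha / 2, n - p)      # t(1 - alpha/2; n - p)
print(np.column_stack([b - t_mult * s_b, b + t_mult * s_b]))
```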

Page 32

6.6 Inferences about Regression Parameters

Interval Estimation of βk (cont.)

Tests for βk:

H0 : βk = 0 vs. Ha : βk ≠ 0

⇒ test statistic: t∗ = bk / s{bk}

⇒ The decision rule:

If |t∗| ≤ t(1 − α/2; n − p), conclude H0

Otherwise conclude Ha

Page 33

6.6 Inferences about Regression Parameters

Joint Inferences

The Bonferroni joint confidence intervals for g parameters with family confidence coefficient 1 − α (g ≤ p):

bk ± B s{bk}, where B = t(1 − α/(2g); n − p)

(Chap. 7: tests concerning subsets of the regression parameters)

Page 34

6.7 Estimate of Mean Response and Prediction of New Observation

Interval estimation of E{Yh}

Given values Xh1, . . . , Xh,p−1 of X1, . . . , Xp−1:

$$\mathbf{X}_{h \, (p \times 1)} = \begin{bmatrix} 1 \\ X_{h1} \\ \vdots \\ X_{h,p-1} \end{bmatrix}$$

The mean response E{Yh}:

E{Yh} = X′hβ

Page 35

6.7 Estimate of Mean Response and Prediction of New Observation

Interval estimation of E{Yh} (cont.)

The estimated mean response Ŷh:

Ŷh = X′hb  ⇒  E{Ŷh} = X′hβ = E{Yh} (unbiased)

σ²{Ŷh} = σ² X′h(X′X)⁻¹Xh = X′h σ²{b} Xh

The estimated variance s²{Ŷh}:

s²{Ŷh} = MSE · X′h(X′X)⁻¹Xh = X′h s²{b} Xh

The 1 − α confidence limits for E{Yh}:

Ŷh ± t(1 − α/2; n − p) s{Ŷh}
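A minimal Python sketch (made-up data; Xh and α = 0.05 are assumptions) of the confidence limits for E{Yh}:

```python
import numpy as np
from scipy import stats

# A sketch of the 1 - alpha confidence limits for E{Yh} at an assumed
# level vector Xh (note the leading 1 for the intercept).
rng = np.random.default_rng(8)
n, p, alpha = 25, 3, 0.05
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
Y = X @ np.array([5.0, 1.0, -2.0]) + rng.normal(size=n)
XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ Y
MSE = (Y - X @ b) @ (Y - X @ b) / (n - p)

Xh = np.array([1.0, 0.5, -0.5])                 # assumed Xh
Yh_hat = Xh @ b
s_Yh = np.sqrt(MSE * (Xh @ XtX_inv @ Xh))       # s{Yh_hat}
t_mult = stats.t.ppf(1 - alpha / 2, n - p)
print(Yh_hat - t_mult * s_Yh, Yh_hat + t_mult * s_Yh)
```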

Page 36

6.7 Estimate of Mean Response and Prediction of New Observation

Confidence region for regression surface

The Working-Hotelling confidence band for the regression line extends to a confidence region for the regression surface.

Boundary points of the confidence region at Xh:

Ŷh ± W s{Ŷh}, where W² = p F(1 − α; p, n − p)

Page 37

6.7 Estimate of Mean Response and Prediction of New Observation

Simultaneous Confidence Intervals for Several Mean Responses

Estimates of E{Yh} for different Xh vectors with family confidence coefficient 1 − α:

Working-Hotelling confidence region bounds:

Ŷh ± W s{Ŷh}

Bonferroni simultaneous confidence intervals (g interval estimates):

Ŷh ± B s{Ŷh}, where B = t(1 − α/(2g); n − p)

Page 38

6.7 Estimate of Mean Response and Prediction of New Observation

Prediction of New Observation Yh(new)

The 1 − α prediction limits for a new observation Yh(new) at Xh:

Ŷh ± t(1 − α/2; n − p) s{pred}, where s²{pred} = MSE + s²{Ŷh} = MSE (1 + X′h(X′X)⁻¹Xh)

Prediction of the mean of m new observations at Xh:

Ŷh ± t(1 − α/2; n − p) s{predmean}, where s²{predmean} = MSE/m + s²{Ŷh} = MSE (1/m + X′h(X′X)⁻¹Xh)
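A short Python sketch (made-up data; Xh, α = 0.05, and m = 4 are assumptions) of both sets of prediction limits:

```python
import numpy as np
from scipy import stats

# A sketch of prediction limits for one new observation and for the
# mean of m new observations at an assumed Xh.
rng = np.random.default_rng(9)
n, p, alpha, m = 25, 3, 0.05, 4
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
Y = X @ np.array([5.0, 1.0, -2.0]) + rng.normal(size=n)
XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ Y
MSE = (Y - X @ b) @ (Y - X @ b) / (n - p)

Xh = np.array([1.0, 0.5, -0.5])
Yh_hat = Xh @ b
t_mult = stats.t.ppf(1 - alpha / 2, n - p)
s_pred = np.sqrt(MSE * (1 + Xh @ XtX_inv @ Xh))          # one new obs
s_predmean = np.sqrt(MSE * (1 / m + Xh @ XtX_inv @ Xh))  # mean of m
print(Yh_hat - t_mult * s_pred, Yh_hat + t_mult * s_pred)
print(Yh_hat - t_mult * s_predmean, Yh_hat + t_mult * s_predmean)
```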

Page 39

6.7 Estimate of Mean Response and Prediction of New Observation

Predictions of g new observations

g new observations at g different levels Xh with family confidence coefficient 1 − α:

Ŷh ± S s{pred}, where S² = g F(1 − α; g, n − p)

Bonferroni simultaneous prediction limits:

Ŷh ± B s{pred}, where B = t(1 − α/(2g); n − p)

Page 40

6.7 Estimate of Mean Response and Prediction of New Observation

Caution about Hidden Extrapolations (Chap. 10)

Danger: the model may not be appropriate when it is extended outside the region of the observations. In multiple regression, it is particularly easy to lose track of this region, since the levels of X1, . . . , Xp−1 jointly define it.

Figure: Region of observations on X1 and X2 jointly, compared with the ranges of X1 and X2 individually.

Page 41

6.8 Diagnostics and Remedial Measures

Diagnostics and Remedial Measures

Diagnostics and remedial measures are similar to those for the simple linear regression model given in Chap. 3.

Some important additional ones: Chaps. 10, 11

Page 42

6.8 Diagnostics and Remedial Measures

Diagnostics and Remedial Measures (cont.)

Figure: SYGRAPH scatter plot matrix and correlation matrix (Dwaine Studios example).

Scatter Plots

Residual Plots

Correlation Test for Normality

Test for Constancy of Error Variance

F Test for Lack of Fit

Box-Cox transformations

Page 43

6.8 Diagnostics and Remedial Measures

F test of lack of fit

To test whether the multiple regression response function

E{Y} = β0 + β1X1 + · · · + βp−1Xp−1

is an appropriate response surface

Requires repeat observations

SSE is decomposed into pure error (SSPE) and lack-of-fit (SSLF) components

Page 44

6.8 Diagnostics and Remedial Measures

F test of lack of fit (cont.)

Testing:

H0 : E{Y} = β0 + β1X1 + · · · + βp−1Xp−1

Ha : E{Y} ≠ β0 + β1X1 + · · · + βp−1Xp−1

The test statistic (c = number of distinct X levels):

$$F^* = \frac{SSLF}{c-p} \div \frac{SSPE}{n-c} = \frac{MSLF}{MSPE}$$

⇒ If F∗ ≤ F(1 − α; c − p, n − c), conclude H0

If F∗ > F(1 − α; c − p, n − c), conclude Ha
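A minimal Python sketch of the decomposition with made-up replicated data (c = 4 distinct levels, r = 5 replicates each are assumptions):

```python
import numpy as np
from scipy import stats

# A sketch of the lack-of-fit F test on made-up data with c = 4
# distinct X levels, each replicated r = 5 times (n = 20, p = 3).
rng = np.random.default_rng(10)
levels = np.array([[1, 0, 0], [1, 1, 0], [1, 0, 1], [1, 1, 1]], float)
r = 5
X = np.repeat(levels, r, axis=0)                 # replicates are contiguous
n, p, c = X.shape[0], X.shape[1], len(levels)
Y = X @ np.array([2.0, 1.0, -1.0]) + rng.normal(scale=0.4, size=n)

b = np.linalg.solve(X.T @ X, X.T @ Y)
SSE = (Y - X @ b) @ (Y - X @ b)
# Pure error: squared deviations of Y about its mean within each level
SSPE = sum(((Y[r*j : r*(j+1)] - Y[r*j : r*(j+1)].mean()) ** 2).sum()
           for j in range(c))
SSLF = SSE - SSPE                                # lack-of-fit component
F_star = (SSLF / (c - p)) / (SSPE / (n - c))
print(F_star, stats.f.ppf(0.95, c - p, n - c))   # conclude Ha if F* larger
```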
