ANOVA and Residuals in Regression - Department of Statistics · Statistics 850 Clayton January 20,...

Statistics 850 Clayton January 20, 2004

ANOVA and Residuals in Regression

For the model: Yi = β0 + β1xi1 + · · · + βkxik + εi we can write the ANOVA table as:

Source df SS MSRegression k β′X ′Y − ny2 SSReg/kError n − k − 1

∑ε2i = ε′ε SSErr/(n − k − 1) = s2

ε

Total n − 1∑

(yi − y)2 = Y ′Y − ny2

Note that F = MSReg/MSError is a test of H0 : β1 = β2 = · · · = βk = 0|β0. In that sense, we could writeSSReg = SS(β1, β2, . . . , βk|β0) and rewrite the ANOVA table as:

Source df SSβ0 1 ny2

SS(β1, . . . , βk|β0) k β′X ′Y − ny2

Error n − k − 1∑

ε2iTotalNCM n Y ′Y

but we usually do not. (Why?)When we want to focus on the sequential fitting order of parameters, we can write the ANOVA table as,

for example,Source dfβ1|β0 1β2|β0, β1 1...

...βk|β0, . . . , βk−1 1Error n − k − 1Total n − 1

Residuals

There are several different definitions of residual. Here is a summary table of these definitions, includingtheir name in R and SAS.

Definition R SASεi = yi − yi residual residual

ri = εi

sε

√1−hii

rstandard student

ti = εi

s(i)

√1−hii

rstudent rstudent

• Var(εi) = σ2ε (1 − hii) where hii is from the “hat” matrix H = X(X ′X)−1X ′.

• In simple linear regression,

hii =1n

+(xi − x)2∑(xi − x)2

• In the formula for ti, s(i) is the “leave one out” estimate of σε based on the regression that is conductedon all the data except the ith case.

• ti can also be calculated as

ti =ε(i)√

Var(ε(i))

See Rencher for details.

1

ANOVA and Residuals in Regression - Department of Statistics · Statistics 850 Clayton January 20,...

Documents

Transcript of ANOVA and Residuals in Regression - Department of Statistics · Statistics 850 Clayton January 20,...