ANOVA and Residuals in Regression - Department of Statistics · Statistics 850 Clayton January 20,...
Transcript of ANOVA and Residuals in Regression - Department of Statistics · Statistics 850 Clayton January 20,...
Statistics 850 Clayton January 20, 2004
ANOVA and Residuals in Regression
For the model: Yi = β0 + β1xi1 + · · · + βkxik + εi we can write the ANOVA table as:
Source df SS MSRegression k β′X ′Y − ny2 SSReg/kError n − k − 1
∑ε2i = ε′ε SSErr/(n − k − 1) = s2
ε
Total n − 1∑
(yi − y)2 = Y ′Y − ny2
Note that F = MSReg/MSError is a test of H0 : β1 = β2 = · · · = βk = 0|β0. In that sense, we could writeSSReg = SS(β1, β2, . . . , βk|β0) and rewrite the ANOVA table as:
Source df SSβ0 1 ny2
SS(β1, . . . , βk|β0) k β′X ′Y − ny2
Error n − k − 1∑
ε2iTotalNCM n Y ′Y
but we usually do not. (Why?)When we want to focus on the sequential fitting order of parameters, we can write the ANOVA table as,
for example,Source dfβ1|β0 1β2|β0, β1 1...
...βk|β0, . . . , βk−1 1Error n − k − 1Total n − 1
Residuals
There are several different definitions of residual. Here is a summary table of these definitions, includingtheir name in R and SAS.
Definition R SASεi = yi − yi residual residual
ri = εi
sε
√1−hii
rstandard student
ti = εi
s(i)
√1−hii
rstudent rstudent
• Var(εi) = σ2ε (1 − hii) where hii is from the “hat” matrix H = X(X ′X)−1X ′.
• In simple linear regression,
hii =1n
+(xi − x)2∑(xi − x)2
• In the formula for ti, s(i) is the “leave one out” estimate of σε based on the regression that is conductedon all the data except the ith case.
• ti can also be calculated as
ti =ε(i)√
Var(ε(i))
See Rencher for details.
1