Statistics 203: Introduction to Regression and Analysis of Variance
Multiple Linear Regression: Inference & Polynomial Regression
Jonathan Taylor
● Today
● Summary of last class
● R² for multiple regression
● Adjusted R²
● Inference in multiple regression
● Testing H0 : β2 = 0
● Testing H0 : β2 = 0 (continued)
● Some details
● Overall goodness of fit
● Dropping subsets
● General linear hypothesis
● Another fact about the multivariate normal
● Polynomial models
● Polynomial models (continued)
● Spline models
Today
■ Inference: trying to “reduce” the model.
■ Polynomial regression.
■ Splines and other bases.
Summary of last class
■ The model: Y_{n×1} = X_{n×p} β_{p×1} + ε_{n×1}.
■ Fitted values: Ŷ = HY, where H = X(XᵗX)⁻¹Xᵗ is the hat matrix.
■ Residuals: e = (I − H)Y.
■ ‖e‖² ∼ σ²χ²_{n−p}.
■ Generally, if P is a projection onto a subspace L such that P(Xβ) = 0, then
  ‖PY‖² = ‖P(Xβ + ε)‖² = ‖Pε‖² ∼ σ²χ²_{dim L}.
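The facts above are easy to check numerically. A minimal sketch in numpy (not part of the original slides; the design matrix and coefficients are invented for illustration) verifying that H is a projection and that the residuals are orthogonal to the column space of X:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 20, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])  # design with intercept
beta = np.array([1.0, 2.0, -1.0])
Y = X @ beta + rng.normal(size=n)

H = X @ np.linalg.inv(X.T @ X) @ X.T   # hat matrix H = X(X'X)^{-1}X'
Yhat = H @ Y                           # fitted values
e = (np.eye(n) - H) @ Y                # residuals

# H is a projection (H^2 = H) and e is orthogonal to every column of X
assert np.allclose(H @ H, H)
assert np.allclose(X.T @ e, 0)
```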
R2 for multiple regression
SSE = Σ_{i=1}^n (Yi − Ŷi)² = ‖Y − Ŷ‖²

SSR = Σ_{i=1}^n (Ŷi − Ȳ)² = ‖Ŷ − Ȳ1‖²

SST = Σ_{i=1}^n (Yi − Ȳ)² = ‖Y − Ȳ1‖²

R² = SSR / SST
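When the model contains an intercept, SST = SSR + SSE, so R² = 1 − SSE/SST. A small numpy check on simulated data (data and coefficients invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 30
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
Y = X @ np.array([1.0, 0.5, -0.5]) + rng.normal(size=n)

Yhat = X @ np.linalg.lstsq(X, Y, rcond=None)[0]
Ybar = Y.mean()
SSE = np.sum((Y - Yhat) ** 2)
SSR = np.sum((Yhat - Ybar) ** 2)
SST = np.sum((Y - Ybar) ** 2)
R2 = SSR / SST

# with an intercept in the model the sums of squares decompose exactly
assert np.isclose(SST, SSR + SSE)
assert np.isclose(R2, 1 - SSE / SST)
```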
Adjusted R2
■ As we add more and more variables to the model, even random ones, R² will approach 1.
■ Adjusted R² tries to take this into account by replacing sums of squares with “mean” squares:
  R²_a = 1 − [SSE/(n − p)] / [SST/(n − 1)] = 1 − MSE/MST.
■ Here is an example.
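One way to see the point (a numpy sketch with invented data, not the example from the original slides): add a column of pure noise to the design. Plain R² cannot decrease, while adjusted R² pays a penalty for the extra parameter.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 40
X = np.column_stack([np.ones(n), rng.normal(size=n)])
Y = X @ np.array([1.0, 2.0]) + rng.normal(size=n)

def fit_r2(X, Y):
    """Return (R2, adjusted R2) for an OLS fit of Y on X."""
    Yhat = X @ np.linalg.lstsq(X, Y, rcond=None)[0]
    SSE = np.sum((Y - Yhat) ** 2)
    SST = np.sum((Y - Y.mean()) ** 2)
    n, p = X.shape
    return 1 - SSE / SST, 1 - (SSE / (n - p)) / (SST / (n - 1))

R2_small, R2a_small = fit_r2(X, Y)
# add a column of pure noise: R2 cannot decrease
Xbig = np.column_stack([X, rng.normal(size=n)])
R2_big, R2a_big = fit_r2(Xbig, Y)

assert R2_big >= R2_small
assert R2a_big < R2_big   # adjusted R2 sits strictly below R2 when p > 1
```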
Inference in multiple regression
■ F-statistics.
■ Dropping a subset of variables.
■ General linear hypothesis.
Testing H0 : β2 = 0
■ Can be tested with a t-test:
  T = β̂2 / SE(β̂2).
■ Alternatively, use an F-test comparing a “full” and a “reduced” model:
  ◆ (F) Yi = β0 + β1Xi1 + β2Xi2 + εi
  ◆ (R) Yi = β0 + β1Xi1 + εi
■ F-statistic: under H0 : β2 = 0,
  SSE_F = ‖Y − Ŷ_F‖² ∼ σ²χ²_{n−3}
  SSE_R = ‖Y − Ŷ_R‖² ∼ σ²χ²_{n−2}
  SSE_R − SSE_F = ‖Ŷ_F − Ŷ_R‖² ∼ σ²χ²_1
  and SSE_R − SSE_F is independent of SSE_F (see details).
Testing H0 : β2 = 0
■ Under H0,
  F = [(SSE_R − SSE_F)/1] / [SSE_F/(n − 3)] ∼ F_{1,n−3}.
■ Reject H0 at level α if F > F_{1,n−3,1−α}.
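The t-test and the F-test here are equivalent: T² = F exactly. A numpy sketch on simulated data (all names and data invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 25
X1, X2 = rng.normal(size=n), rng.normal(size=n)
Y = 1.0 + 2.0 * X1 + rng.normal(size=n)      # H0 true: beta2 = 0

XF = np.column_stack([np.ones(n), X1, X2])   # full model
XR = np.column_stack([np.ones(n), X1])       # reduced model

def fit_sse(X, Y):
    """OLS coefficients and residual sum of squares."""
    b = np.linalg.lstsq(X, Y, rcond=None)[0]
    r = Y - X @ b
    return b, r @ r

bF, SSEF = fit_sse(XF, Y)
_, SSER = fit_sse(XR, Y)

sigma2 = SSEF / (n - 3)                              # MSE of the full model
se_b2 = np.sqrt(sigma2 * np.linalg.inv(XF.T @ XF)[2, 2])
T = bF[2] / se_b2                                    # t-statistic for beta2
F = (SSER - SSEF) / (SSEF / (n - 3))                 # F-statistic, 1 df

assert np.isclose(T ** 2, F)   # the two tests agree: T^2 = F
```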
Some details
■ SSE_F ∼ σ²χ²_{n−3} if the full model is correct, and SSE_R ∼ σ²χ²_{n−2} if H0 is correct, because
  H_F Y = H_F(Xβ + ε) = Xβ + H_F ε
  H_R Y = H_R(Xβ + ε) = Xβ + H_R ε  (under H0).
  If H0 is false, SSE_R is σ² times a non-central χ²_{n−2}.
■ Why is SSE_R − SSE_F independent of SSE_F?
  SSE_R − SSE_F = ‖Y − H_R Y‖² − ‖Y − H_F Y‖²
                = ‖H_R Y − H_F Y‖²  (Pythagoras)
                = ‖H_R ε − H_F ε‖²  (under H0).
  (H_R − H_F)ε lies in L_F, the subspace of the full model, while e_F = (I − H_F)ε lies in L_F^⊥, the orthogonal complement of the full model; therefore e_F is independent of (H_R − H_F)ε.
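The orthogonality behind the independence claim, (H_R − H_F)(I − H_F) = 0, can be checked directly in numpy (design matrices invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(9)
n = 20
XF = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])  # full design
XR = XF[:, :2]                                               # nested reduced design

def hat(X):
    """Projection onto the column space of X."""
    return X @ np.linalg.inv(X.T @ X) @ X.T

HF, HR = hat(XF), hat(XR)

# (HR - HF) maps into the full model's column space; I - HF projects onto
# its orthogonal complement, so the composition vanishes identically
assert np.allclose((HR - HF) @ (np.eye(n) - HF), 0)
```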
Overall goodness of fit
■ Testing H0 : β1 = β2 = 0.
■ Two models:
  ◆ (F) Yi = β0 + β1Xi1 + β2Xi2 + εi
  ◆ (R) Yi = β0 + εi
■ F-statistic, under H0:
  F = [(SSE_R − SSE_F)/2] / [SSE_F/(n − 3)]
    = [‖(H_R − H_F)Y‖²/2] / [‖(I − H_F)Y‖²/(n − 3)] ∼ F_{2,n−3}.
■ Reject H0 if F > F_{1−α,2,n−3}.
■ Details: same as before.
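The two expressions for F above, one in terms of sums of squares and one in terms of projections, agree exactly; a numpy sketch on simulated data (invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 30
X1, X2 = rng.normal(size=n), rng.normal(size=n)
Y = 1.0 + rng.normal(size=n)                 # H0 true: beta1 = beta2 = 0

XF = np.column_stack([np.ones(n), X1, X2])   # full design
XR = np.ones((n, 1))                         # intercept-only design

def hat(X):
    return X @ np.linalg.inv(X.T @ X) @ X.T

HF, HR = hat(XF), hat(XR)
SSEF = Y @ (np.eye(n) - HF) @ Y
SSER = Y @ (np.eye(n) - HR) @ Y

F_sse = ((SSER - SSEF) / 2) / (SSEF / (n - 3))
F_proj = (np.linalg.norm((HR - HF) @ Y) ** 2 / 2) / \
         (np.linalg.norm((np.eye(n) - HF) @ Y) ** 2 / (n - 3))

assert np.isclose(F_sse, F_proj)   # same statistic, two algebraic forms
```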
Dropping subsets
■ Suppose we have the model
  Yi = β0 + β1Xi1 + · · · + β_{p−1}X_{i,p−1} + εi
  and we want to test whether we can simplify it by dropping variables, i.e. testing
  H0 : β_{j1} = · · · = β_{jk} = 0.
■ Two models:
  ◆ (F) – the model above
  ◆ (R) – the model with columns X_{j1}, . . . , X_{jk} omitted from the design matrix.
■ Under H0,
  F = [(SSE_R − SSE_F)/(df_R − df_F)] / [SSE_F/df_F] ∼ F_{df_R−df_F, df_F},
  where df_F and df_R are the “residual” degrees of freedom of the two models.
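The degrees-of-freedom bookkeeping for a subset test can be sketched as follows (a numpy illustration with invented data, dropping k = 2 of p = 4 columns):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 40
X = np.column_stack([np.ones(n)] + [rng.normal(size=n) for _ in range(3)])  # p = 4
Y = X @ np.array([1.0, 2.0, 0.0, 0.0]) + rng.normal(size=n)

def sse_df(X, Y):
    """Residual sum of squares and residual degrees of freedom n - p."""
    b = np.linalg.lstsq(X, Y, rcond=None)[0]
    r = Y - X @ b
    return r @ r, X.shape[0] - X.shape[1]

SSEF, dfF = sse_df(X, Y)             # full model
SSER, dfR = sse_df(X[:, :2], Y)      # drop the last two columns (H0: beta2 = beta3 = 0)
F = ((SSER - SSEF) / (dfR - dfF)) / (SSEF / dfF)

assert dfR - dfF == 2 and dfF == n - 4   # numerator df = k, denominator df = n - p
assert SSER >= SSEF                      # nested models: dropping columns cannot reduce SSE
```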
General linear hypothesis
■ On the previous slide we had to fit two models, and we might want to test more than just whether some coefficients are zero.
■ Suppose we want to test
  H0 : C_{k×p} β_{p×1} = h_{k×1}.
  Specifying the reduced model explicitly can be difficult.
■ Under H0,
  Cβ̂ − h ∼ N(0, σ² C(XᵗX)⁻¹Cᵗ).
■ As long as C(XᵗX)⁻¹Cᵗ is invertible,
  (Cβ̂ − h)ᵗ (C(XᵗX)⁻¹Cᵗ)⁻¹ (Cβ̂ − h) = SSE_R − SSE_F ∼ σ²χ²_k.
■ F-statistic:
  F = [(SSE_R − SSE_F)/(df_R − df_F)] / [SSE_F/df_F] ∼ F_{df_R−df_F, df_F}.
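The identity relating the quadratic form to SSE_R − SSE_F can be verified numerically; a numpy sketch for H0: β1 = β2 = 0 written as Cβ = h (data and matrices invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 30
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])   # p = 3
Y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=n)

b = np.linalg.lstsq(X, Y, rcond=None)[0]
SSEF = np.sum((Y - X @ b) ** 2)

# H0: beta1 = beta2 = 0, i.e. C beta = h with C picking out the last two coordinates
C = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
h = np.zeros(2)
W = (C @ b - h) @ np.linalg.inv(C @ np.linalg.inv(X.T @ X) @ C.T) @ (C @ b - h)

# the same quantity via an explicit reduced fit (intercept only)
bR = np.linalg.lstsq(X[:, :1], Y, rcond=None)[0]
SSER = np.sum((Y - X[:, :1] @ bR) ** 2)

assert np.isclose(W, SSER - SSEF)   # no need to fit the reduced model in general
```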
Another fact about multivariate normal
■ Suppose that Z_{k×1} ∼ N(0, Σ_{k×k}) where Σ is invertible. Then
  ZᵗΣ⁻¹Z ∼ χ²_k.
■ Why? Let Σ^{−1/2} be a square root of Σ⁻¹, i.e. a symmetric matrix such that
  Σ^{−1/2} Σ Σ^{−1/2} = I_{k×k}  and  Σ^{−1/2} Σ^{−1/2} = Σ⁻¹.
■ Then
  Σ^{−1/2}Z ∼ N(0, I_{k×k})
  and
  ZᵗΣ⁻¹Z = ‖Σ^{−1/2}Z‖² ∼ χ²_k.
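One way to build the symmetric square root (this particular construction, via the eigendecomposition, is an assumption of this sketch, not specified on the slide) and check the algebra in numpy:

```python
import numpy as np

rng = np.random.default_rng(7)
k = 4
A = rng.normal(size=(k, k))
Sigma = A @ A.T + np.eye(k)          # an invertible covariance matrix
Z = rng.multivariate_normal(np.zeros(k), Sigma)

# symmetric square root of Sigma^{-1} via the eigendecomposition of Sigma
w, V = np.linalg.eigh(Sigma)
Sigma_inv_half = V @ np.diag(w ** -0.5) @ V.T

# it whitens Sigma and squares to Sigma^{-1}
assert np.allclose(Sigma_inv_half @ Sigma @ Sigma_inv_half, np.eye(k))
assert np.allclose(Sigma_inv_half @ Sigma_inv_half, np.linalg.inv(Sigma))

# the quadratic form is the squared length of the whitened vector
q = Z @ np.linalg.inv(Sigma) @ Z
assert np.isclose(q, np.linalg.norm(Sigma_inv_half @ Z) ** 2)
```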
Polynomial models
■ So far, we have considered models that are linear in the x’s.
■ The regression model can instead be linear in known functions of x, for example polynomials:
  Yi = β0 + β1Xi + β2Xi² + · · · + βkXi^k + εi.
■ Here is an example.
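Since the model is still linear in β, ordinary least squares applies directly once the design matrix holds the powers of x. A numpy sketch (not the example from the original slides; the data are simulated from an invented quadratic):

```python
import numpy as np

rng = np.random.default_rng(8)
n = 50
x = np.linspace(-1, 1, n)
Y = 1 + 2 * x - 3 * x ** 2 + rng.normal(scale=0.1, size=n)

# design matrix with columns 1, x, x^2: a *linear* model in beta
X = np.vander(x, 3, increasing=True)
beta_hat = np.linalg.lstsq(X, Y, rcond=None)[0]

# the true coefficients (1, 2, -3) are recovered up to noise
assert np.allclose(beta_hat, [1, 2, -3], atol=0.5)
```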
Polynomial models
■ Use caution in choosing the degree of the polynomial: it is easy to overfit the model.
■ Useful when there is reason to believe the relation is nonlinear.
■ Easy to add polynomials in more than one variable to the regression: interactions.
■ Although polynomials can approximate any continuous function (Bernstein polynomials), there are sometimes better bases. For instance, the regression function may not be polynomial, but only “piecewise” polynomial.
■ The design matrix X can become ill-conditioned, which can cause numerical problems.
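The ill-conditioning is easy to exhibit; a numpy sketch (invented data range) comparing the condition number of a raw power basis with a centered one, which already helps substantially:

```python
import numpy as np

# x values in a narrow range far from zero: the raw powers 1, x, ..., x^5
# are nearly collinear, so the design is badly conditioned
x = np.linspace(100, 101, 50)
X_raw = np.vander(x, 6, increasing=True)

# centering x before taking powers reduces the collinearity
X_cent = np.vander(x - x.mean(), 6, increasing=True)

cond_raw = np.linalg.cond(X_raw)
cond_cent = np.linalg.cond(X_cent)
assert cond_cent < cond_raw   # centering lowers the condition number
```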
Spline models
■ Splines are piecewise polynomial functions, i.e. on an interval between “knots” (t_i, t_{i+1}) the spline f(x) is a polynomial, but the coefficients change from interval to interval.
■ Example: a cubic spline with knots at t1 < t2 < · · · < t_h:
  f(x) = Σ_{j=0}^{3} β_{0j} x^j + Σ_{i=1}^{h} β_i (x − t_i)³₊
  where
  (x − t_i)₊ = x − t_i if x − t_i ≥ 0, and 0 otherwise.
■ Here is an example.
■ Conditioning problem again: B-splines are used to keep the model subspace the same but make the design less ill-conditioned.
■ Other bases one might use:
  ◆ Fourier: sin and cos waves.
  ◆ Wavelets: space/time-localized basis functions.
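The truncated power basis above translates directly into a design matrix; a minimal numpy sketch (function name and knot locations are invented for illustration):

```python
import numpy as np

def cubic_spline_design(x, knots):
    """Truncated power basis for a cubic spline: 1, x, x^2, x^3, (x - t_i)_+^3."""
    cols = [x ** j for j in range(4)]                      # global cubic part
    cols += [np.clip(x - t, 0, None) ** 3 for t in knots]  # one column per knot
    return np.column_stack(cols)

x = np.linspace(0, 1, 20)
knots = [0.3, 0.7]
X = cubic_spline_design(x, knots)

assert X.shape == (20, 4 + len(knots))
# each truncated-power column vanishes to the left of its knot
assert np.all(X[x < 0.3, 4] == 0)
```

Fitting a spline is then ordinary least squares on this design, which is exactly where the conditioning concern (and the case for B-splines) comes from.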