bbb+++ xxxe Lbme2.aut.ac.ir/~towhidkhah/MI/Resources/... · pp xx s bb bb-= = =++ 2 1 11 XXX eeI L...


Chapter 4

Multiple Linear Regression

The model:

$$y = \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p + e$$

$y$: the dependent variable
$x_1, \ldots, x_p$: the independent variables, or carriers
$e$: the residual, or error

A special case: $x_1 = 1$, $x_2 = x$, $\beta_1 = \alpha$, $\beta_2 = \beta$, so that

$$y = \beta_1 x_1 + \beta_2 x_2 + e \quad\text{or}\quad y = \alpha + \beta x + e.$$

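Polynomial regression

Example:

$$y = \beta_o + \beta_1 x + \beta_2 x^2 + \cdots + \beta_t x^t + e$$

The response function is $f(x) = \beta_o + \beta_1 x + \beta_2 x^2 + \cdots + \beta_t x^t$, and the carriers are $x_1 = x,\ x_2 = x^2,\ \ldots,\ x_t = x^t$.

Data model:

$$y_i = \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_p x_{ip} + e_i, \qquad i = 1, 2, \ldots, n$$

Vector model:

$$\mathbf{y} = \beta_1 \mathbf{x}_1 + \beta_2 \mathbf{x}_2 + \cdots + \beta_p \mathbf{x}_p + \mathbf{e},$$

where

$$\mathbf{y} = \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix}, \quad \mathbf{x}_1 = \begin{pmatrix} x_{11} \\ \vdots \\ x_{n1} \end{pmatrix}, \quad \mathbf{x}_2 = \begin{pmatrix} x_{12} \\ \vdots \\ x_{n2} \end{pmatrix}, \quad \ldots, \quad \mathbf{x}_p = \begin{pmatrix} x_{1p} \\ \vdots \\ x_{np} \end{pmatrix}, \quad \mathbf{e} = \begin{pmatrix} e_1 \\ \vdots \\ e_n \end{pmatrix}.$$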

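Matrix model:

$$\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \mathbf{e},$$

where

$$\mathbf{X}_{n \times p} = (\mathbf{x}_1, \ldots, \mathbf{x}_p) = \begin{pmatrix} x_{11} & \cdots & x_{1p} \\ x_{21} & \cdots & x_{2p} \\ \vdots & \ddots & \vdots \\ x_{n1} & \cdots & x_{np} \end{pmatrix}, \quad \boldsymbol{\beta}_{p \times 1} = \begin{pmatrix} \beta_1 \\ \vdots \\ \beta_p \end{pmatrix}, \quad \mathbf{y}_{n \times 1} = \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix}, \quad \mathbf{e}_{n \times 1} = \begin{pmatrix} e_1 \\ \vdots \\ e_n \end{pmatrix}.$$

Least squares estimates

We minimize

$$Q(\beta_1, \ldots, \beta_p) = \sum_{i=1}^{n} (y_i - \beta_1 x_{i1} - \beta_2 x_{i2} - \cdots - \beta_p x_{ip})^2$$

with respect to $\beta_1, \ldots, \beta_p$ to obtain the least squares estimates, denoted by $\hat\beta_1, \ldots, \hat\beta_p$.

The fitted response is

$$\hat y_i = \hat\beta_1 x_{i1} + \hat\beta_2 x_{i2} + \cdots + \hat\beta_p x_{ip}.$$

The fitted residuals are $\hat e_i = y_i - \hat y_i$ for $i = 1, \ldots, n$.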

Observation space representation

[Figure: observations plotted against axes $x_1$, $x_2$ and $y$, with the fitted plane $\hat y = \hat\alpha + \hat\beta_1 x_1 + \hat\beta_2 x_2$.]

When using least squares to estimate parameters, we obtain the plane such that the sum of squares of the distances, measured along the $y$ axis, from each observation to the plane is minimized.

The variable space representation

[Figure: the vector $\mathbf{y}$, the subspace spanned by $\mathbf{x}_1, \ldots, \mathbf{x}_p$, and the residual vector $\mathbf{y} - \hat{\mathbf{y}}$.]

$$\min_{\boldsymbol\beta} Q(\boldsymbol\beta) = \lVert \mathbf{y} - \beta_1 \mathbf{x}_1 - \cdots - \beta_p \mathbf{x}_p \rVert^2$$

Normal equations:

$$\mathbf{y} - \hat{\mathbf{y}} \perp \text{span}\{\mathbf{x}_1, \ldots, \mathbf{x}_p\}$$

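Written out, the normal equations are

$$\begin{aligned} \mathbf{x}_1^T(\mathbf{y} - \mathbf{X}\hat{\boldsymbol\beta}) &= 0 \\ \mathbf{x}_2^T(\mathbf{y} - \mathbf{X}\hat{\boldsymbol\beta}) &= 0 \\ &\ \vdots \\ \mathbf{x}_p^T(\mathbf{y} - \mathbf{X}\hat{\boldsymbol\beta}) &= 0 \end{aligned} \quad\Rightarrow\quad \mathbf{X}^T(\mathbf{y} - \mathbf{X}\hat{\boldsymbol\beta}) = \mathbf{0} \quad\Rightarrow\quad \mathbf{X}^T\mathbf{X}\hat{\boldsymbol\beta} = \mathbf{X}^T\mathbf{y}.$$

If $\mathbf{X}^T\mathbf{X}$ is nonsingular (i.e. $\boldsymbol\beta$ is estimable), then

$$\hat{\boldsymbol\beta} = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{y}.$$

Note: $\hat{\boldsymbol\beta}$ is a linear function of $\mathbf{y}$, i.e. $\hat{\boldsymbol\beta}$ is a linear estimator.

$\mathbf{X}^T\mathbf{X}$ is nonsingular if and only if the columns of $\mathbf{X} = (\mathbf{x}_1, \ldots, \mathbf{x}_p)$ are linearly independent.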
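As a rough illustration of the normal equations, the following Python sketch forms $\mathbf{X}^T\mathbf{X}$ and $\mathbf{X}^T\mathbf{y}$ and solves the resulting system by Gaussian elimination. The data values and the helper names (`transpose`, `matmul`, `solve`, `least_squares`) are invented for this example and do not come from the course material; in practice a numerical library would normally do this work.

```python
def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def solve(a, b):
    """Solve a z = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    aug = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("X^T X is singular: columns of X are dependent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    z = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][k] * z[k] for k in range(i + 1, n))
        z[i] = (aug[i][n] - s) / aug[i][i]
    return z


def least_squares(X, y):
    """Solve the normal equations X^T X beta = X^T y."""
    Xt = transpose(X)
    XtX = matmul(Xt, X)
    Xty = [sum(xj * yi for xj, yi in zip(col, y)) for col in Xt]
    return solve(XtX, Xty)


# Hypothetical data: the first carrier is the constant 1 (intercept model).
X = [[1.0, 1.0, 2.0],
     [1.0, 2.0, 1.0],
     [1.0, 3.0, 4.0],
     [1.0, 4.0, 3.0],
     [1.0, 5.0, 6.0]]
y = [5.1, 5.9, 11.2, 11.8, 17.0]

beta_hat = least_squares(X, y)
fitted = [sum(b * x for b, x in zip(beta_hat, row)) for row in X]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

# Normal equations: every column of X is orthogonal to the residual vector.
orth = [sum(x * e for x, e in zip(col, residuals)) for col in transpose(X)]
```

Each entry of `orth` should be zero up to rounding error, which is the statement $\mathbf{X}^T(\mathbf{y} - \mathbf{X}\hat{\boldsymbol\beta}) = \mathbf{0}$. The `ValueError` branch corresponds to the case where the columns of $\mathbf{X}$ are linearly dependent and $\boldsymbol\beta$ is not estimable.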

Multiple correlation and coefficient of determination

The sample multiple correlation between a variable $y$ and variables $x_1, \ldots, x_t$ is denoted by $R_{y;\,x_1,\ldots,x_t}$.

By definition,

$$R_{y;\,x_1,\ldots,x_t} = r_{y\hat y},$$

where $\hat y$ is the fitted value obtained from the intercept model

$$y = \alpha + \beta_1 x_1 + \cdots + \beta_t x_t + e.$$

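Geometrically,

$$R_{y;\,x_1,\ldots,x_t} = \cos\theta,$$

where $\theta$ is the angle between $\mathbf{y} - \bar{\mathbf{y}}$ and $\text{span}\{\mathbf{x}_1 - \bar{\mathbf{x}}_1, \ldots, \mathbf{x}_t - \bar{\mathbf{x}}_t\}$.

[Figure: the vector $\mathbf{y} - \bar{\mathbf{y}}$ at angle $\theta$ to the subspace spanned by $\mathbf{x}_1 - \bar{\mathbf{x}}_1, \ldots, \mathbf{x}_t - \bar{\mathbf{x}}_t$.]

If $t = 1$, then $R_{y;x} = |r_{xy}|$.

It can be shown that (problem 15)

$$S_y^2 = S_{\hat y}^2 + S_{\hat e}^2,$$

where $S_y^2$, $S_{\hat y}^2$ and $S_{\hat e}^2$ are the sample variances of $y$, $\hat y$ and $\hat e$.

Hints:

1. For the intercept model,

$$\hat{\mathbf{e}} \perp \mathbf{1} \ \Rightarrow\ \sum_i \hat e_i = 0 \ \Rightarrow\ \bar{\hat e} = 0 \qquad (*)$$

$$\sum_i (\hat y_i - \bar{\hat y})(\hat e_i - \bar{\hat e}) = 0 \quad \text{(why?)} \qquad (**)$$

2. Since $y_i = \hat y_i + \hat e_i$,

$$\sum_i (y_i - \bar y)^2 = \sum_i \left(\hat y_i + \hat e_i - \bar{\hat y} - \bar{\hat e}\right)^2.$$

Expand the RHS and use $(*)$ and $(**)$ to show that, after dividing by the common normalizing constant, RHS $= S_{\hat y}^2 + S_{\hat e}^2$.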

Coefficient of determination

$R^2$ is defined by

$$R^2 = \frac{S_{\hat y}^2}{S_y^2},$$

the proportion of variance explained by the regression of $y$ on $x_1, \ldots, x_t$.

$$R^2 = R^2_{y;\,x_1,\ldots,x_t} = r^2_{y\hat y} \quad \text{(problem 15)}$$

Hint: We want to show that $r^2_{y\hat y} = S_{\hat y}^2 / S_y^2$. Show that $S_{y\hat y} = S_{\hat y}^2$ by replacing $y = \hat y + \hat e$ and using the fact that $S_{\hat y \hat e} = 0$. Then

$$r^2_{y\hat y} = \frac{S_{y\hat y}^2}{S_y^2\, S_{\hat y}^2} = \frac{(S_{\hat y}^2)^2}{S_y^2\, S_{\hat y}^2} = \frac{S_{\hat y}^2}{S_y^2}.$$
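The variance decomposition and the identity $R^2 = r^2_{y\hat y}$ can be checked numerically. The sketch below does this for the one-carrier intercept model, using the closed-form simple regression estimates. The data are invented, and the sample variance is computed with divisor $n - 1$, which is an assumption on our part; the divisor cancels in $R^2$.

```python
from math import sqrt


def mean(v):
    return sum(v) / len(v)


def sample_var(v):
    """Sample variance with divisor n - 1 (assumed normalization)."""
    m = mean(v)
    return sum((a - m) ** 2 for a in v) / (len(v) - 1)


def sample_corr(u, v):
    """Sample correlation coefficient between two equal-length lists."""
    mu, mv = mean(u), mean(v)
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / sqrt(suu * svv)


# Hypothetical data for the intercept model y = alpha + beta * x + e.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.7]

xb, yb = mean(x), mean(y)
beta = sum((a - xb) * (b - yb) for a, b in zip(x, y)) / sum((a - xb) ** 2 for a in x)
alpha = yb - beta * xb

y_hat = [alpha + beta * a for a in x]          # fitted values
e_hat = [b - f for b, f in zip(y, y_hat)]      # fitted residuals

r_squared = sample_var(y_hat) / sample_var(y)  # S_yhat^2 / S_y^2
r_y_yhat = sample_corr(y, y_hat)               # multiple correlation R
```

With an intercept in the model, `sample_var(y)` should equal `sample_var(y_hat) + sample_var(e_hat)` up to rounding, and `r_squared` should equal `r_y_yhat ** 2`. Since there is a single carrier here, `r_y_yhat` also equals the absolute value of the correlation between `x` and `y`.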

$R^2$ for multiple regression has the same problems as those discussed in simple linear regression, and possibly more.

$R^2$ is not defined for models without an intercept:

$$S_y^2 = 0,\ S_{\hat y}^2 > 0 \ \Rightarrow\ R^2 = \infty\,!$$

[Figure: observations $y_1$, $y_2$ and the fitted value $\hat y$ plotted against $x$.]

$R^2$ is an index that measures the degree of linear relationship between $y$ and $x_1, \ldots, x_t$, if the relation is linear.

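When a model without an intercept is fit, statistical programs generally output a number in the interval $[0, 1]$. However, there is no commonly accepted definition of $R^2$ for models without an intercept. Beware!

A review of algebra of random vectors

Let $\mathbf{A}$ be an $n \times p$ matrix of nonrandom elements, and let

$$\hat{\boldsymbol\theta} = \begin{pmatrix} \hat\theta_1 \\ \vdots \\ \hat\theta_p \end{pmatrix}$$

be a vector of random elements. Then

$$E(\hat{\boldsymbol\theta}) = \begin{pmatrix} \theta_1 \\ \vdots \\ \theta_p \end{pmatrix} = \boldsymbol\theta \quad\text{and}\quad \text{Cov}(\hat{\boldsymbol\theta}) = \begin{pmatrix} \sigma_{11} & \sigma_{12} & \cdots & \sigma_{1p} \\ \vdots & \vdots & & \vdots \\ \sigma_{p1} & \sigma_{p2} & \cdots & \sigma_{pp} \end{pmatrix},$$

where $\theta_i = E(\hat\theta_i)$ and $\sigma_{kl} = \text{Cov}(\hat\theta_k, \hat\theta_l)$.

The following properties hold:

(1) $E(\mathbf{A}\hat{\boldsymbol\theta}) = \mathbf{A}\,E(\hat{\boldsymbol\theta})$.

(2) $\text{Cov}(\mathbf{A}\hat{\boldsymbol\theta}) = \mathbf{A}\,\text{Cov}(\hat{\boldsymbol\theta})\,\mathbf{A}^T$.

(3) If $\hat{\boldsymbol\theta} \sim N$, then $\mathbf{A}\hat{\boldsymbol\theta} \sim N$.

(4) If $\mathbf{Z}$ is a constant vector, then

$$E(\mathbf{Z} + \hat{\boldsymbol\theta}) = \mathbf{Z} + E(\hat{\boldsymbol\theta}) = \mathbf{Z} + \boldsymbol\theta, \qquad \text{Cov}(\mathbf{Z} + \hat{\boldsymbol\theta}) = \text{Cov}(\hat{\boldsymbol\theta}).$$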

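Statistical properties of LS estimates

Recall the data model

$$y_i = \beta_1 x_{i1} + \cdots + \beta_p x_{ip} + e_i, \qquad i = 1, \ldots, n,$$

where
$x_{i1}, \ldots, x_{ip}$: fixed (nonrandom)
$e_i$: random
$\beta_1, \ldots, \beta_p$: unknown and fixed.

Unbiased errors:

If $E(\mathbf{e}) = \mathbf{0}$, then $E(\hat{\boldsymbol\beta}) = \boldsymbol\beta$, provided that $\boldsymbol\beta$ is estimable.

Proof:

$$E(\hat{\boldsymbol\beta}) = E\big((\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{y}\big) = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T E(\mathbf{y}) = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{X}\boldsymbol\beta = \boldsymbol\beta.$$

Uncorrelated, constant variance errors:

This assumption is equivalent to having

$$\text{Cov}(\mathbf{e}) = \sigma^2 \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & \cdots & 0 & 1 \end{pmatrix} = \sigma^2 \mathbf{I}.$$

If $\boldsymbol\beta$ is estimable, then

$$\text{Cov}(\hat{\boldsymbol\beta}) = \sigma^2 (\mathbf{X}^T\mathbf{X})^{-1}.$$

Proof: Let $\mathbf{A} = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T$; then $\hat{\boldsymbol\beta} = \mathbf{A}\mathbf{y}$ and

$$\begin{aligned} \text{Cov}(\hat{\boldsymbol\beta}) &= \text{Cov}(\mathbf{A}\mathbf{y}) = \mathbf{A}\,\text{Cov}(\mathbf{y})\,\mathbf{A}^T \\ &= \mathbf{A}\,\text{Cov}(\mathbf{e})\,\mathbf{A}^T = \sigma^2 \mathbf{A}\mathbf{A}^T \\ &= \sigma^2 (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{X}(\mathbf{X}^T\mathbf{X})^{-1} \\ &= \sigma^2 (\mathbf{X}^T\mathbf{X})^{-1}. \end{aligned}$$

This, in particular, leads to the formulas of chapter 2 by using $\mathbf{X} = (\mathbf{1}, \mathbf{x})$ (Problem 16).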
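For orientation, the special case $\mathbf{X} = (\mathbf{1}, \mathbf{x})$ can be written out as follows. This is only a sketch of the standard simple-regression result; chapter 2 is not reproduced here, so the symbols $\hat\alpha$ and $\hat\beta$ for the intercept and slope estimates are assumed to match its notation.

```latex
\mathbf{X}^T\mathbf{X}
  = \begin{pmatrix} n & \sum_i x_i \\ \sum_i x_i & \sum_i x_i^2 \end{pmatrix},
\qquad
\det(\mathbf{X}^T\mathbf{X}) = n\sum_i x_i^2 - \Big(\sum_i x_i\Big)^2
  = n \sum_i (x_i - \bar x)^2 ,

(\mathbf{X}^T\mathbf{X})^{-1}
  = \frac{1}{n \sum_i (x_i - \bar x)^2}
    \begin{pmatrix} \sum_i x_i^2 & -\sum_i x_i \\ -\sum_i x_i & n \end{pmatrix},

\operatorname{Var}(\hat\alpha)
  = \sigma^2 \left( \frac{1}{n} + \frac{\bar x^2}{\sum_i (x_i - \bar x)^2} \right),
\qquad
\operatorname{Var}(\hat\beta) = \frac{\sigma^2}{\sum_i (x_i - \bar x)^2},
\qquad
\operatorname{Cov}(\hat\alpha, \hat\beta)
  = -\frac{\sigma^2\, \bar x}{\sum_i (x_i - \bar x)^2}.
```

The variance of $\hat\alpha$ follows from $\sum_i x_i^2 = \sum_i (x_i - \bar x)^2 + n\bar x^2$.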

$$\text{Rss} = \sum_{i=1}^{n} (y_i - \hat y_i)^2, \qquad \hat\sigma^2 = \text{RMS} = \frac{\text{Rss}}{n - p}, \qquad E(\hat\sigma^2) = \sigma^2.$$

Gauss-Markov Theorem

If $E(\mathbf{e}) = \mathbf{0}$ and $\text{Cov}(\mathbf{e}) = \sigma^2\mathbf{I}$, then $\hat{\boldsymbol\beta} = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{y}$ is the MVUE (minimum variance unbiased estimator) for $\boldsymbol\beta$ amongst all linear unbiased estimators of $\boldsymbol\beta$.

Additionally, $\hat f(\mathbf{x}) = \hat\beta_1 x_1 + \cdots + \hat\beta_p x_p$ is the MVUE for $f(\mathbf{x})$.

An example of using SAS for multiple regression:

STACK-LOSS DATA, Brownlee (1965): Data describe operation of a plant for the oxidation of ammonia to nitric acid. Four variables are observed over a period of 21 days.

The variation in the variable stackloss (LOSS) is to be explained by the independent variables:

AIR: airflow
TEMP: water temperature
ACID: acid concentration.

With the assumption of normality we have the following results:

If $\mathbf{e} \sim N(\mathbf{0}, \sigma^2\mathbf{I})$, then

$$\hat{\boldsymbol\beta} \sim N\big(\boldsymbol\beta,\ \sigma^2(\mathbf{X}^T\mathbf{X})^{-1}\big), \qquad \frac{\text{RSS}}{\sigma^2} \sim \chi^2_{n-p}.$$

Confidence and prediction intervals

We make the assumption $\mathbf{e} \sim N(\mathbf{0}, \sigma^2\mathbf{I})$.

Let $\text{Var}(\hat\beta_r) = \sigma^2 \big[(\mathbf{X}^T\mathbf{X})^{-1}\big]_{rr}$, let $\widehat{\text{Var}}(\hat\beta_r)$ be the estimate of $\text{Var}(\hat\beta_r)$, and let

$$\widehat{\text{Var}}(\hat\beta_r) = \hat\sigma^2 \big[(\mathbf{X}^T\mathbf{X})^{-1}\big]_{rr}, \qquad \widehat{\text{std}}(\hat\beta_r) = \sqrt{\widehat{\text{Var}}(\hat\beta_r)}.$$

Then, using the characterization of $t$ from Chapter 2, we have

$$\frac{\hat\beta_r - \beta_r}{\widehat{\text{std}}(\hat\beta_r)} \sim t_{(n-p)}.$$

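More generally, let $\mathbf{c} = (c_1, \ldots, c_p)^T$ be a vector of constant values. Then

$$\text{Var}(\mathbf{c}^T\hat{\boldsymbol\beta}) = \mathbf{c}^T\,\text{Var}(\hat{\boldsymbol\beta})\,\mathbf{c} = \sigma^2\, \mathbf{c}^T(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{c}.$$

If $\hat\gamma = c_1\hat\beta_1 + \cdots + c_p\hat\beta_p = \mathbf{c}^T\hat{\boldsymbol\beta}$, then

$$\frac{\hat\gamma - \gamma}{\widehat{\text{std}}(\hat\gamma)} \sim t_{(n-p)},$$

where $\gamma = \mathbf{c}^T\boldsymbol\beta$,

$$\widehat{\text{Var}}(\hat\gamma) = \hat\sigma^2\, \mathbf{c}^T(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{c} \quad\text{and}\quad \widehat{\text{std}}(\hat\gamma) = \sqrt{\widehat{\text{Var}}(\hat\gamma)}.$$

Example: Let $\hat\gamma = \hat\beta_1 - 2\hat\beta_2$ for the model $y = \beta_1 x_1 + \beta_2 x_2 + e$. Then

$$\hat\gamma = \mathbf{c}^T\hat{\boldsymbol\beta} = (1\ \ {-2}) \begin{pmatrix} \hat\beta_1 \\ \hat\beta_2 \end{pmatrix},$$

$$\text{Var}(\hat\gamma) = (1\ \ {-2}) \begin{pmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{21} & \sigma_{22} \end{pmatrix} \begin{pmatrix} 1 \\ -2 \end{pmatrix} = \sigma_{11} - 4\sigma_{12} + 4\sigma_{22},$$

where

$$\begin{pmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{21} & \sigma_{22} \end{pmatrix} = \sigma^2 (\mathbf{X}^T\mathbf{X})^{-1}.$$

To obtain $\widehat{\text{Var}}(\hat\gamma)$ we replace $\sigma^2$ by $\hat\sigma^2$ and accordingly replace $\sigma_{ij}$ by $\hat\sigma_{ij}$ in the above formula.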
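The quadratic form $\mathbf{c}^T\,\text{Cov}(\hat{\boldsymbol\beta})\,\mathbf{c}$ is easy to evaluate directly. The short Python sketch below does so for $\mathbf{c} = (1, -2)^T$; the function name `linear_combination_variance` and the covariance entries are hypothetical, chosen only to show that the general formula agrees with $\sigma_{11} - 4\sigma_{12} + 4\sigma_{22}$.

```python
def linear_combination_variance(c, cov):
    """Return c^T cov c for a coefficient vector c and covariance matrix cov."""
    p = len(c)
    return sum(c[i] * cov[i][j] * c[j] for i in range(p) for j in range(p))


# Hypothetical covariance matrix of (beta1_hat, beta2_hat); it is symmetric.
cov = [[0.50, 0.10],
       [0.10, 0.25]]
c = [1.0, -2.0]

var_gamma = linear_combination_variance(c, cov)

# The same quantity from the expanded formula sigma11 - 4*sigma12 + 4*sigma22.
expanded = cov[0][0] - 4 * cov[0][1] + 4 * cov[1][1]
```

Replacing `cov` by the estimated covariance matrix $\hat\sigma^2(\mathbf{X}^T\mathbf{X})^{-1}$ gives $\widehat{\text{Var}}(\hat\gamma)$ in the same way.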

In general: let $\mathbf{c} = \mathbf{x} = (x_1, \ldots, x_p)^T$. Then

$$\hat f(\mathbf{x}) = \hat\beta_1 x_1 + \cdots + \hat\beta_p x_p = \mathbf{c}^T\hat{\boldsymbol\beta} = \mathbf{x}^T\hat{\boldsymbol\beta}, \quad\text{where } \hat{\boldsymbol\beta} = (\hat\beta_1, \ldots, \hat\beta_p)^T,$$

$$\frac{\hat f(\mathbf{x}) - f(\mathbf{x})}{\widehat{\text{std}}\,\hat f(\mathbf{x})} \sim t_{(n-p)}, \qquad \widehat{\text{Var}}\,\hat f(\mathbf{x}) = \hat\sigma^2\, \mathbf{x}^T(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{x}.$$

Let $\gamma = \mathbf{c}^T\boldsymbol\beta$ and $\hat\gamma = \mathbf{c}^T\hat{\boldsymbol\beta}$, where $\mathbf{c} = (c_1, \ldots, c_p)^T$ is a constant vector.

A $100(1-\delta)\%$ confidence interval for $\gamma$ is given by

$$\hat\gamma \pm t^{(n-p)}_{\delta/2}\ \widehat{\text{std}}(\hat\gamma).$$

To test the hypothesis $\gamma = \gamma_o$ at the $100\delta\%$ level, compute the test statistic

$$t = \frac{\hat\gamma - \gamma_o}{\widehat{\text{std}}(\hat\gamma)}$$

and compare it with $t^{(n-p)}_{\delta/2}$ for two-sided tests, and with $t^{(n-p)}_{\delta}$ for one-sided tests.

To test the hypothesis

$$H_o: \beta_1 - 2\beta_2 = 5 \qquad H_a: \beta_1 - 2\beta_2 \neq 5$$

we use

$$t = \frac{\hat\beta_1 - 2\hat\beta_2 - 5}{\widehat{\text{std}}(\hat\gamma)}$$

and compare the observed value to $t^{(n-2)}_{\delta/2}$.

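Example:
(a) Compute a 95% confidence interval for the coefficient of AIR ($\beta_2$) in the Stack-loss data.
(b) Test $\beta_2 \geq 1$ at the 5% level.

(a)

$$\hat\beta_2 \pm t^{(17)}_{.025}\ \widehat{\text{std}}(\hat\beta_2) = 0.716 \pm 2.11(0.135) = (0.4312,\ 1.0009)$$

(b) $H_o: \beta_2 \geq 1$, $H_a: \beta_2 < 1$.

$$t = \frac{\hat\beta_2 - 1}{\widehat{\text{std}}(\hat\beta_2)} = \frac{0.716 - 1}{0.135} = -2.10, \qquad -2.10 < -t^{(17)}_{.05}.$$

Reject $H_o$. There is sufficient evidence at the 5% level to support $\beta_2 < 1$.

Example: For the stack-loss data (a) write a confidence interval for $3\beta_2 + \beta_3$ and (b) test $3\beta_2 + \beta_3 \leq 3.5$.

Note that $\gamma = 3\beta_2 + \beta_3 = \mathbf{c}^T\boldsymbol\beta$ with $\boldsymbol\beta = (\beta_1, \beta_2, \beta_3, \beta_4)^T$ and $\mathbf{c} = (0, 3, 1, 0)^T$.

$$\widehat{\text{Cov}}(\hat{\boldsymbol\beta}) = \begin{pmatrix} \hat\sigma_{11} & \hat\sigma_{12} & \hat\sigma_{13} & \hat\sigma_{14} \\ \hat\sigma_{21} & \hat\sigma_{22} & \hat\sigma_{23} & \hat\sigma_{24} \\ \hat\sigma_{31} & \hat\sigma_{32} & \hat\sigma_{33} & \hat\sigma_{34} \\ \hat\sigma_{41} & \hat\sigma_{42} & \hat\sigma_{43} & \hat\sigma_{44} \end{pmatrix}$$

$$\Rightarrow\ \widehat{\text{Var}}(\hat\gamma) = (0, 3, 1, 0)\ \widehat{\text{Cov}}(\hat{\boldsymbol\beta}) \begin{pmatrix} 0 \\ 3 \\ 1 \\ 0 \end{pmatrix} = (3\ \ 1) \begin{pmatrix} \hat\sigma_{22} & \hat\sigma_{23} \\ \hat\sigma_{32} & \hat\sigma_{33} \end{pmatrix} \begin{pmatrix} 3 \\ 1 \end{pmatrix}$$

$$= (3\ \ 1) \begin{pmatrix} .0182 & -0.0365 \\ -0.0365 & 0.135 \end{pmatrix} \begin{pmatrix} 3 \\ 1 \end{pmatrix} = 9(.0182) - 6(.0365) + .135 = 0.0798$$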

(a)

$$3\hat\beta_2 + \hat\beta_3 \pm t^{(17)}_{.025}\ \widehat{\text{std}}(3\hat\beta_2 + \hat\beta_3) = 3.72 \pm 2.11\sqrt{.0798} = 3.72 \pm 0.596.$$

The 95% CI is $(3.124,\ 4.316)$.

(b) $H_o: 3\beta_2 + \beta_3 \leq 3.5$, $H_a: 3\beta_2 + \beta_3 > 3.5$.

$$t = \frac{3.72 - 3.5}{\sqrt{.0798}} = .78, \qquad t^{(17)}_{.05} = 1.74.$$

Since $.78 < 1.74$, do not reject $H_o$ at the 5% level.
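The arithmetic of this example can be retraced with a few lines of Python. The sketch below takes the numbers quoted in the example (the estimate 3.72, the variance and covariance entries, and the tabulated values $t^{(17)}_{.025} = 2.11$ and $t^{(17)}_{.05} = 1.74$) as given; the helper names `t_interval` and `t_statistic` are ours, not part of SAS or the course material.

```python
from math import sqrt


def t_interval(estimate, variance, t_crit):
    """Return (lower, upper) for estimate +/- t_crit * sqrt(variance)."""
    half_width = t_crit * sqrt(variance)
    return estimate - half_width, estimate + half_width


def t_statistic(estimate, null_value, variance):
    """Return (estimate - null_value) / sqrt(variance)."""
    return (estimate - null_value) / sqrt(variance)


# Estimated variance of 3*beta2_hat + beta3_hat from the 2x2 block
# of the estimated covariance matrix quoted in the example.
var_gamma = 9 * 0.0182 - 6 * 0.0365 + 0.135

gamma_hat = 3.72          # estimate of 3*beta2 + beta3 as quoted in the example
lower, upper = t_interval(gamma_hat, var_gamma, t_crit=2.11)

# One-sided test of Ho: gamma <= 3.5 against Ha: gamma > 3.5 at the 5% level.
t_obs = t_statistic(gamma_hat, 3.5, var_gamma)
reject_h0 = t_obs > 1.74
```

The same two helpers cover any linear combination $\mathbf{c}^T\boldsymbol\beta$: only the estimate, its estimated variance and the critical value change.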

Example: Compute a 95% confidence interval for $f(\mathbf{x})$, where $\mathbf{x} = (82\ \ 27\ \ 89)^T$.

The method of the previous example can be applied. (Note: $\mathbf{c}^T = (1\ \ 82\ \ 27\ \ 89)$ and the computation is cumbersome.) But there is an easier way. Use

    Model Loss = AIR TEMP ACID / R;

The R option produces $\hat e_i$, $\hat y_i$ and their standard errors.

In this example $\mathbf{x} = (82\ \ 27\ \ 89)^T$ is the 1st observation.

A 95% confidence interval for $f(\mathbf{x})$ is:

$$\hat f(\mathbf{x}) \pm t^{(n-p)}_{.025}\ \widehat{\text{std}}\,\hat f(\mathbf{x}) = 38.765 \pm 2.11(1.781) = (35.007,\ 42.523)$$

$\widehat{\text{std}}\,\hat f(\mathbf{x})$ can also be used for tests of hypotheses.

Suppose that we want a 95% confidence interval for $f(\mathbf{x})$, where $\mathbf{x} = (60\ \ 20\ \ 90)^T$. This is not part of the data. So we add a data point

    . 60 20 90

to the end of the data, where "." is placed in the location of the dependent variable (see results next page).

The interval is again

$$\hat f(\mathbf{x}) \pm t^{(n-p)}_{.025}\ \widehat{\text{std}}\,\hat f(\mathbf{x}),$$

with $\hat f(\mathbf{x})$ and $\widehat{\text{std}}\,\hat f(\mathbf{x})$ read from the output row for the added point.

Perform the following tests of hypotheses for the Stack-Loss data:

a) $2\beta_2 + 3\beta_3 - 4\beta_4 = 7$
b) $\beta_1 + \beta_2 + \beta_3 + \beta_4 = -40$
c) $\beta_2 + \beta_3 = 0$

For each of the problems the value of $\mathbf{c}$ is

a) $\mathbf{c}^T = (0, 2, 3, -4)$
b) $\mathbf{c}^T = (1, 1, 1, 1)$
c) $\mathbf{c}^T = (0, 1, 1, 0)$

To test in SAS use

a) Test 2*AIR+3*TEMP-4*ACID=7/PRINT;
b) Test INTERCEPT+AIR+TEMP+ACID=-40/PRINT;
c) Test AIR+TEMP/PRINT;

In (a)-(c) we are testing

$$H_o: \mathbf{c}^T\boldsymbol\beta = b \qquad H_a: \mathbf{c}^T\boldsymbol\beta \neq b,$$

where $b$ is a constant scalar. Use

$$t = \frac{\mathbf{c}^T\hat{\boldsymbol\beta} - b}{\widehat{\text{std}}(\mathbf{c}^T\hat{\boldsymbol\beta})},$$

where

$$\widehat{\text{Var}}(\mathbf{c}^T\hat{\boldsymbol\beta}) = \mathbf{c}^T\,\widehat{\text{Var}}(\hat{\boldsymbol\beta})\,\mathbf{c} = \hat\sigma^2\, \mathbf{c}^T(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{c}.$$

The numerator $\mathbf{c}^T\hat{\boldsymbol\beta} - b$ and the factor $\mathbf{c}^T(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{c}$ of the squared denominator are given in the SAS output (see next page).

$\hat\sigma^2 = 10.5$ for the stack loss data; $t^{(17)}_{.025} = 2.11$.

a) $\mathbf{c}^T(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{c} = 0.129$ and $\mathbf{c}^T\hat{\boldsymbol\beta} - b = -1.07$, so

$$t = \frac{-1.07}{\sqrt{(10.5)(.129)}} = -0.919.$$

Do not reject $H_o$.

b)

$$t = \frac{1.94}{\sqrt{(10.5)(10.1)}} = 0.19.$$

Do not reject $H_o$.

c)

$$t = \frac{2.01}{\sqrt{(0.008)(10.5)}} = 6.9.$$

Reject $H_o$.
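The three statistics can be recomputed from the two quantities printed by the TEST statement and from $\hat\sigma^2$. The Python sketch below uses the values quoted above; the function name `t_from_test_output` and the dictionary layout are illustrative choices, not SAS terminology.

```python
from math import sqrt


def t_from_test_output(difference, quad_form, sigma2_hat):
    """t = (c^T beta_hat - b) / sqrt(sigma2_hat * c^T (X^T X)^{-1} c)."""
    return difference / sqrt(sigma2_hat * quad_form)


sigma2_hat = 10.5   # estimated error variance for the stack-loss fit
t_crit = 2.11       # t(17) critical value at .025, two-sided 5% test

# (c^T beta_hat - b, c^T (X^T X)^{-1} c) for hypotheses a), b), c).
cases = {
    "a": (-1.07, 0.129),
    "b": (1.94, 10.1),
    "c": (2.01, 0.008),
}

results = {}
for label, (difference, quad_form) in cases.items():
    t_obs = t_from_test_output(difference, quad_form, sigma2_hat)
    results[label] = (t_obs, abs(t_obs) > t_crit)   # (statistic, reject Ho?)
```

Each entry of `results` holds the observed statistic and the two-sided decision at the 5% level; only hypothesis c) is rejected.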

A trick to do (c) without using the TEST command:

Fit the model

$$y = \beta_1 + (\beta_2 + \beta_3)\,\text{AIR} + \beta_3\,(\text{TEMP} - \text{AIR}) + \beta_4\,\text{ACID} + e.$$

This is the SAME model, since $(\beta_2 + \beta_3)\,\text{AIR} + \beta_3(\text{TEMP} - \text{AIR}) = \beta_2\,\text{AIR} + \beta_3\,\text{TEMP}$, but the coefficient of AIR is now $\beta_2 + \beta_3$. The fit therefore gives the s.e. for $\hat\beta_2 + \hat\beta_3$.

Note: In general the TEST statement can be used to obtain $\mathbf{c}^T\hat{\boldsymbol\beta}$ and $\widehat{\text{Var}}(\mathbf{c}^T\hat{\boldsymbol\beta})$. These, for example, can be used to obtain confidence intervals for $\mathbf{c}^T\boldsymbol\beta$.