Econ 3790: Business and Economics Statistics

Econ 3790: Business and Economics Statistics Instructor: Yogesh Uppal [email protected]


Transcript of Econ 3790: Business and Economics Statistics

Page 1: Econ 3790: Business and Economics Statistics

Econ 3790: Business and Economics Statistics

Instructor: Yogesh Uppal [email protected]

Page 2: Econ 3790: Business and Economics Statistics

Sampling Distribution of b1

Expected value of \(b_1\):

\[ E(b_1) = \beta_1 \]

Variance of \(b_1\):

\[ \mathrm{Var}(b_1) = \sigma^2 / SS_x \]

Page 3: Econ 3790: Business and Economics Statistics

Estimate of \(\sigma^2\)

The mean square error (MSE) provides the estimate of \(\sigma^2\).

\[ s^2 = \mathrm{MSE} = \mathrm{SSE}/(n - 2) \]

where:

\[ \mathrm{SSE} = \sum (y_i - \hat{y}_i)^2 \]

Page 4: Econ 3790: Business and Economics Statistics

Sample variance of b1

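Estimate of variance of \(b_1\):

\[ \mathrm{Var}(b_1) = \frac{s^2}{SS_x} = \frac{\mathrm{MSE}}{SS_x} \]

Standard error of \(b_1\):

\[ SE(b_1) = \sqrt{\frac{s^2}{SS_x}} = \sqrt{\frac{\mathrm{MSE}}{SS_x}} = \frac{s}{\sqrt{SS_x}} \]

\(s\) is called the standard error of the estimate.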

Page 5: Econ 3790: Business and Economics Statistics

Interval Estimate of \(\beta_1\):

The \((1-\alpha)100\%\) confidence interval for \(\beta_1\) is:

\[ b_1 \pm t_{\alpha/2}\, SE(b_1) \]

where \(t_{\alpha/2}\) is the value from the t distribution with \((n-2)\) degrees of freedom such that the probability in the upper tail is \(\alpha/2\).

Page 6: Econ 3790: Business and Economics Statistics

Example: Reed Auto Sales

\[ s^2 = \mathrm{MSE} = \mathrm{SSE}/(n - 2) = 8.2/3 = 2.73 \]

\[ SE(b_1) = \sqrt{\frac{s^2}{SS_x}} = \sqrt{\frac{2.73}{4}} = 0.83 \]

95% confidence interval for \(\beta_1\):

\[ b_1 \pm t_{\alpha/2}\, SE(b_1) = 4.5 \pm 3.182(0.83) = 4.5 \pm 2.63 \]

We can say with 95% confidence that \(\beta_1\) lies between 1.87 and 7.13.
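The interval arithmetic above can be retraced in a few lines of Python. This is a minimal sketch that uses only the figures quoted on the slide; the sample size n = 5 is inferred from the divisor n - 2 = 3, and the critical value 3.182 is taken from the slide rather than computed.

```python
import math

# Values quoted for the Reed Auto Sales example
sse = 8.2        # sum of squared errors
n = 5            # ASSUMPTION: inferred from n - 2 = 3 on the slide
ss_x = 4.0       # sum of squared deviations of x
b1 = 4.5         # estimated slope
t_crit = 3.182   # t_{.025} with 3 degrees of freedom (from the slide)

mse = sse / (n - 2)              # s^2, the estimate of sigma^2
se_b1 = math.sqrt(mse / ss_x)    # standard error of b1
margin = t_crit * se_b1          # half-width of the interval
lower, upper = b1 - margin, b1 + margin

print(f"MSE = {mse:.2f}, SE(b1) = {se_b1:.2f}")
print(f"95% CI for beta_1: ({lower:.2f}, {upper:.2f})")
```

Carrying full precision instead of the rounded 0.83 reproduces the margin of 2.63 shown on the slide.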

Page 7: Econ 3790: Business and Economics Statistics

Testing for Significance: t Test

Hypotheses

\[ H_0: \beta_1 = 0 \]

\[ H_a: \beta_1 \neq 0 \]

Test Statistic

\[ t = \frac{b_1 - 0}{SE(b_1)} \]

where \(b_1\) is the slope estimate and \(SE(b_1)\) is the standard error of \(b_1\).

Page 8: Econ 3790: Business and Economics Statistics

Testing for Significance: t Test

Rejection Rule

Reject \(H_0\) if p-value \(\le \alpha\), or if \(t \le -t_{\alpha/2}\) or \(t \ge t_{\alpha/2}\),

where \(t_{\alpha/2}\) is based on a t distribution with \(n - 2\) degrees of freedom.

Page 9: Econ 3790: Business and Economics Statistics

Testing for Significance: t Test

1. Determine the hypotheses.

\[ H_0: \beta_1 = 0 \]

\[ H_a: \beta_1 \neq 0 \]

2. Specify the level of significance.

\[ \alpha = .05 \]

3. Select the test statistic.

\[ t = \frac{b_1}{SE(b_1)} \]

4. State the rejection rule.

Reject \(H_0\) if p-value \(\le .05\), or if \(t \le -3.182\) or \(t \ge 3.182\).

Page 10: Econ 3790: Business and Economics Statistics

Testing for Significance: t Test

5. Compute the value of the test statistic.

\[ t = \frac{b_1}{SE(b_1)} = \frac{4.5}{0.83} = 5.42 \]

6. Determine whether to reject \(H_0\).

\(t = 5.42 > t_{\alpha/2} = 3.182\). We can reject \(H_0\).
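Steps 5 and 6 can be sketched as a small helper. The function name `two_sided_t_test` is hypothetical, and the critical value is passed in from the slide rather than looked up from a t table.

```python
def two_sided_t_test(b, se, t_crit):
    """Return the t statistic for H0: beta = 0 and whether H0 is rejected."""
    t_stat = b / se
    reject = t_stat <= -t_crit or t_stat >= t_crit
    return t_stat, reject

# Reed Auto Sales figures from the slides: b1 = 4.5, SE(b1) = 0.83,
# t_{.025} = 3.182 with n - 2 = 3 degrees of freedom.
t_stat, reject_h0 = two_sided_t_test(4.5, 0.83, 3.182)
print(f"t = {t_stat:.2f}, reject H0: {reject_h0}")
```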

Page 11: Econ 3790: Business and Economics Statistics

Some Cautions about the Interpretation of Significance Tests

Just because we are able to reject \(H_0: \beta_1 = 0\) and demonstrate statistical significance does not enable us to conclude that there is a linear relationship between \(x\) and \(y\).

Rejecting \(H_0: \beta_1 = 0\) and concluding that the relationship between \(x\) and \(y\) is significant does not enable us to conclude that a cause-and-effect relationship is present between \(x\) and \(y\).

Page 12: Econ 3790: Business and Economics Statistics

Multiple Regression Model

The equation that describes how the dependent variable \(y\) is related to the independent variables \(x_1, x_2, \ldots, x_p\) and an error term is called the multiple regression model.

\[ y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p + \varepsilon \]

where:

\(\beta_0, \beta_1, \beta_2, \ldots, \beta_p\) are the parameters, and

\(\varepsilon\) is a random variable called the error term.

Page 13: Econ 3790: Business and Economics Statistics

Estimated Multiple Regression Equation

A simple random sample is used to compute sample statistics \(b_0, b_1, b_2, \ldots, b_p\) that are used as the point estimators of the parameters \(\beta_0, \beta_1, \beta_2, \ldots, \beta_p\).

The estimated multiple regression equation is:

\[ \hat{y} = b_0 + b_1 x_1 + b_2 x_2 + \cdots + b_p x_p \]

Page 14: Econ 3790: Business and Economics Statistics

Interpreting the Coefficients

In multiple regression analysis, we interpret each regression coefficient as follows:

\(b_i\) represents an estimate of the change in \(y\) corresponding to a 1-unit increase in \(x_i\) when all other independent variables are held constant.

Page 15: Econ 3790: Business and Economics Statistics

Example: Car Sales

Suppose we believe that the number of cars sold (\(y\)) is related not only to the number of ads (\(x_1\)), but also to the minimum down payment required (\(x_2\)). The regression model can be given by:

Multiple Regression Model

\[ y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon \]

where

\(y\) = number of cars sold

\(x_1\) = number of ads

\(x_2\) = minimum down payment required ('000)

Page 16: Econ 3790: Business and Economics Statistics

Estimated Regression Equation

\[ \hat{y} = 14.4 + 3.7 x_1 + 0.251 x_2 \]

Interpretation? Estimated values of y? Error? Prediction?

Page 17: Econ 3790: Business and Economics Statistics

Multiple Coefficient of Determination

Relationship Among SST, SSR, SSE

SST = SSR + SSE

\[ \sum (y_i - \bar{y})^2 = \sum (\hat{y}_i - \bar{y})^2 + \sum (y_i - \hat{y}_i)^2 \]

where:

SST = total sum of squares

SSR = sum of squares due to regression

SSE = sum of squares due to error

Page 18: Econ 3790: Business and Economics Statistics

Multiple Coefficient of Determination

\[ R^2 = \mathrm{SSR}/\mathrm{SST} \]

\[ R^2 = 84.63/89.2 = .949 \]

Adjusted Multiple Coefficient of Determination

\[ R_a^2 = 1 - (1 - R^2)\,\frac{n - 1}{n - p - 1} \]

Standard Error of Estimate

\[ s = \sqrt{\mathrm{MSE}} = \sqrt{\frac{\mathrm{SSE}}{n - p - 1}} \]

Page 19: Econ 3790: Business and Economics Statistics

Testing for Significance: t Test

Hypotheses

\[ H_0: \beta_i = 0 \]

\[ H_a: \beta_i \neq 0 \]

Test Statistics

\[ t = \frac{b_i}{SE(b_i)} \]

Rejection Rule

Reject \(H_0\) if p-value \(\le \alpha\), or if \(t \le -t_{\alpha/2}\) or \(t \ge t_{\alpha/2}\), where \(t_{\alpha/2}\) is based on a t distribution with \(n - p - 1\) degrees of freedom.

Page 20: Econ 3790: Business and Economics Statistics

Example: Testing for significance of coefficients

Hypotheses

\[ H_0: \beta_i = 0 \]

\[ H_a: \beta_i \neq 0 \]

Rejection Rule

For \(\alpha = .05\) and d.f. = ?, \(t_{.025}\) =

Test Statistics

\[ t = \frac{b_i}{SE(b_i)} \]

Page 21: Econ 3790: Business and Economics Statistics

Testing for Significance of Regression: F Test

Hypotheses

\[ H_0: \beta_1 = \beta_2 = \cdots = \beta_p = 0 \]

\(H_a\): One or more of the parameters is not equal to zero.

Test Statistics

\[ F = \mathrm{MSR}/\mathrm{MSE} \]

Rejection Rule

Reject \(H_0\) if p-value \(\le \alpha\) or if \(F \ge F_\alpha\), where \(F_\alpha\) is based on an F distribution with \(p\) d.f. in the numerator and \(n - p - 1\) d.f. in the denominator.

Page 22: Econ 3790: Business and Economics Statistics

Example 2: Programmer Salary Survey

Multiple Regression Model

A software firm collected data for a sample of 20 computer programmers. A suggestion was made that regression analysis could be used to determine if salary was related to the years of experience and the score on the firm's programmer aptitude test.

The years of experience, score on the aptitude test, and corresponding annual salary ($1000s) for the sample of 20 programmers are shown on the next slide.

Page 23: Econ 3790: Business and Economics Statistics

Multiple Regression Model

Exper.  Score  Salary      Exper.  Score  Salary
   4      78    24.0          9      88    38.0
   7     100    43.0          2      73    26.6
   1      86    23.7         10      75    36.2
   5      82    34.3          5      81    31.6
   8      86    35.8          6      74    29.0
  10      84    38.0          8      87    34.0
   0      75    22.2          4      79    30.1
   1      80    23.1          6      94    33.9
   6      83    30.0          3      70    28.2
   6      91    33.0          3      89    30.0

Page 24: Econ 3790: Business and Economics Statistics

Multiple Regression Model

Suppose we believe that salary (\(y\)) is related to the years of experience (\(x_1\)) and the score on the programmer aptitude test (\(x_2\)) by the following regression model:

\[ y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon \]

where

\(y\) = annual salary ($1000s)

\(x_1\) = years of experience

\(x_2\) = score on programmer aptitude test

Page 25: Econ 3790: Business and Economics Statistics

Solving for \(\beta_0\), \(\beta_1\) and \(\beta_2\):

              Coeffic.   Std. Err.
Intercept     3.17394    6.15607
Experience    1.4039     0.19857
Test Score    0.25089    0.07735
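The coefficients in this output can be reproduced from the 20 observations by ordinary least squares. The sketch below solves the normal equations \((X'X)b = X'y\) in plain Python; the helper `solve_linear_system` is a hypothetical name, and the slides themselves appear to rely on spreadsheet output rather than this procedure.

```python
# (experience in years, aptitude score, salary in $1000s) for the 20 programmers
data = [
    (4, 78, 24.0), (7, 100, 43.0), (1, 86, 23.7), (5, 82, 34.3), (8, 86, 35.8),
    (10, 84, 38.0), (0, 75, 22.2), (1, 80, 23.1), (6, 83, 30.0), (6, 91, 33.0),
    (9, 88, 38.0), (2, 73, 26.6), (10, 75, 36.2), (5, 81, 31.6), (6, 74, 29.0),
    (8, 87, 34.0), (4, 79, 30.1), (6, 94, 33.9), (3, 70, 28.2), (3, 89, 30.0),
]

def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    size = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(size):
        # Move the row with the largest entry in this column into pivot position.
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Eliminate this column from every other row.
        for r in range(size):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [aug[i][size] / aug[i][i] for i in range(size)]

# Design matrix rows [1, x1, x2]; the leading 1 carries the intercept b0.
rows = [(1.0, x1, x2) for x1, x2, _ in data]
y = [salary for _, _, salary in data]

# Normal equations: (X'X) b = X'y
xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]

b0, b1, b2 = solve_linear_system(xtx, xty)
print(f"b0 = {b0:.5f}, b1 = {b1:.5f}, b2 = {b2:.5f}")
```

For larger problems a numerical library is usually preferable to hand-rolled elimination; the plain version is used here only to keep the sketch self-contained.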

Page 26: Econ 3790: Business and Economics Statistics

Anova Table

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square   F-statistic
Regression            500.34           ……                   ……..          ……….
Error                 ……..             …….                  …….
Total                 599.8            ……..
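One way to fill in the blank cells is sketched below. It assumes n = 20 observations and p = 2 independent variables, as in the programmer salary example, and uses only the two sums of squares given in the table.

```python
ssr = 500.34    # regression sum of squares (from the table)
sst = 599.8     # total sum of squares (from the table)
n, p = 20, 2    # ASSUMPTION: sample size and number of predictors from the example

sse = sst - ssr               # error sum of squares
df_reg = p                    # regression degrees of freedom
df_err = n - p - 1            # error degrees of freedom
df_tot = n - 1                # total degrees of freedom
msr = ssr / df_reg            # mean square regression
mse = sse / df_err            # mean square error
f_stat = msr / mse            # F statistic

print(f"Regression  SS={ssr:.2f}  df={df_reg}  MS={msr:.2f}  F={f_stat:.2f}")
print(f"Error       SS={sse:.2f}  df={df_err}  MS={mse:.2f}")
print(f"Total       SS={sst:.2f}  df={df_tot}")
```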

Page 27: Econ 3790: Business and Economics Statistics

Estimated Regression Equation

SALARY = 3.174 + 1.404(EXPER) + 0.251(SCORE)

\(b_1 = 1.404\) implies that salary is expected to increase by $1,404 for each additional year of experience (when the variable score on programmer aptitude test is held constant).

\(b_2 = 0.251\) implies that salary is expected to increase by $251 for each additional point scored on the programmer aptitude test (when the variable years of experience is held constant).

Page 28: Econ 3790: Business and Economics Statistics

Prediction

Suppose Bob had 4 years of experience and scored 78 on the aptitude test. What would you estimate (or expect) his salary to be?

\[ \hat{y} = 3.174 + 1.404(4) + 0.251(78) \approx 28.358 \]

Bob's estimated salary is $28,358. (This figure reflects the unrounded coefficients 3.17394, 1.4039, and 0.25089, which give about 28.359; the rounded coefficients shown give 28.368.)
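The prediction is a direct substitution into the estimated equation. The sketch below evaluates it with both the rounded and the unrounded coefficients to show how much rounding moves the answer; `predict_salary` is a hypothetical helper name.

```python
def predict_salary(exper, score, b0, b1, b2):
    """Estimated salary in $1000s from the fitted equation."""
    return b0 + b1 * exper + b2 * score

# Bob: 4 years of experience, aptitude score of 78
rounded = predict_salary(4, 78, 3.174, 1.404, 0.251)
unrounded = predict_salary(4, 78, 3.17394, 1.4039, 0.25089)

print(f"rounded coefficients:   {rounded:.3f}")
print(f"unrounded coefficients: {unrounded:.3f}")
```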

Page 29: Econ 3790: Business and Economics Statistics

Error

Bob's actual salary is $24,000. How much error did we make in estimating his salary based on his experience and score?

\[ \text{error} = y - \hat{y} = 24000 - 28358 = -4358 \]

The error is negative, so we overestimate Bob's salary.
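The sign convention matters here: the error is the actual value minus the estimate, so a negative error means the estimate was too high. A minimal sketch using the dollar figures from the slide:

```python
def residual(actual, predicted):
    """Error of an estimate: actual value minus predicted value."""
    return actual - predicted

error = residual(24000, 28358)          # Bob's actual vs. estimated salary, in dollars
overestimated = error < 0               # negative error means the estimate was too high

print(f"error = {error}, overestimated: {overestimated}")
```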

Page 30: Econ 3790: Business and Economics Statistics

Multiple Coefficient of Determination

Relationship Among SST, SSR, SSE

SST = SSR + SSE

\[ \sum (y_i - \bar{y})^2 = \sum (\hat{y}_i - \bar{y})^2 + \sum (y_i - \hat{y}_i)^2 \]

where:

SST = total sum of squares

SSR = sum of squares due to regression

SSE = sum of squares due to error

Page 31: Econ 3790: Business and Economics Statistics

Multiple Coefficient of Determination

\[ R^2 = \mathrm{SSR}/\mathrm{SST} \]

\[ R^2 = 500.3285/599.7855 = .83418 \]

Adjusted Multiple Coefficient of Determination

\[ R_a^2 = 1 - (1 - R^2)\,\frac{n - 1}{n - p - 1} \]

\[ R_a^2 = 1 - (1 - .834179)\,\frac{20 - 1}{20 - 2 - 1} = .814671 \]
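Both quantities follow directly from the sums of squares. The sketch below recomputes them with n = 20 and p = 2 as in the slide; `adjusted_r2` is a hypothetical helper name.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R^2 for n observations and p independent variables."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

ssr = 500.3285   # sum of squares due to regression (from the slide)
sst = 599.7855   # total sum of squares (from the slide)

r2 = ssr / sst
r2_adj = adjusted_r2(r2, n=20, p=2)

print(f"R^2 = {r2:.5f}, adjusted R^2 = {r2_adj:.6f}")
```

The adjustment penalizes each added independent variable, so the adjusted value is always at or below the unadjusted one when p is at least 1.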

Page 32: Econ 3790: Business and Economics Statistics

Testing for Significance: t Test

Hypotheses

\[ H_0: \beta_i = 0 \]

\[ H_a: \beta_i \neq 0 \]

Test Statistics

\[ t = \frac{b_i}{SE(b_i)} \]

Rejection Rule

Reject \(H_0\) if p-value \(\le \alpha\), or if \(t \le -t_{\alpha/2}\) or \(t \ge t_{\alpha/2}\), where \(t_{\alpha/2}\) is based on a t distribution with \(n - p - 1\) degrees of freedom.

Page 33: Econ 3790: Business and Economics Statistics

Example

Hypotheses

\[ H_0: \beta_1 = 0 \]

\[ H_a: \beta_1 \neq 0 \]

Rejection Rule

For \(\alpha = .05\) and d.f. = 17, \(t_{.025} = 2.11\)

Reject \(H_0\) if p-value \(\le .05\), or if \(t \le -2.11\) or \(t \ge 2.11\)

Test Statistics

\[ t = \frac{b_1}{SE(b_1)} = \frac{1.4039}{0.19857} = 7.07 \]

Since \(t = 7.07 > t_{.025} = 2.11\), we reject \(H_0\).
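The same test can be applied to each coefficient in turn. This sketch uses the coefficients and standard errors from the regression output for the programmer salary example and the critical value 2.11 quoted above; applying it to the test-score coefficient is an extension the slide leaves implicit.

```python
# (coefficient, standard error) pairs from the regression output
coefficients = {
    "Experience": (1.4039, 0.19857),
    "Test Score": (0.25089, 0.07735),
}
t_crit = 2.11   # t_{.025} with n - p - 1 = 17 degrees of freedom (from the slide)

results = {}
for name, (b, se) in coefficients.items():
    t_stat = b / se
    # Two-sided rule: reject when t falls in either tail.
    results[name] = (t_stat, abs(t_stat) >= t_crit)
    print(f"{name}: t = {t_stat:.2f}, reject H0: {results[name][1]}")
```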

Page 34: Econ 3790: Business and Economics Statistics

Testing for Significance of Regression: F Test

Hypotheses

\[ H_0: \beta_1 = \beta_2 = \cdots = \beta_p = 0 \]

\(H_a\): One or more of the parameters is not equal to zero.

Test Statistics

\[ F = \mathrm{MSR}/\mathrm{MSE} \]

Rejection Rule

Reject \(H_0\) if p-value \(\le \alpha\) or if \(F \ge F_\alpha\), where \(F_\alpha\) is based on an F distribution with \(p\) d.f. in the numerator and \(n - p - 1\) d.f. in the denominator.

Page 35: Econ 3790: Business and Economics Statistics

Example

Hypotheses

\[ H_0: \beta_1 = \beta_2 = 0 \]

\(H_a\): One or both of the parameters is not equal to zero.

Rejection Rule

For \(\alpha = .05\) and d.f. = 2, 17; \(F_{.05} = 3.59\)

Reject \(H_0\) if p-value \(\le .05\) or \(F \ge 3.59\)

Test Statistics

\[ F = \mathrm{MSR}/\mathrm{MSE} = 250.17/5.85 = 42.8 \]

\(F = 42.8 > F_{.05} = 3.59\), so we can reject \(H_0\).
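The overall F test can be sketched as a single function of the sums of squares. The name `f_test` is hypothetical; the sums of squares are those used for \(R^2\) in this example, and the critical value 3.59 is taken from the slide rather than computed.

```python
def f_test(ssr, sst, n, p, f_crit):
    """Return the F statistic for overall significance and the reject decision."""
    sse = sst - ssr
    msr = ssr / p                # mean square due to regression
    mse = sse / (n - p - 1)      # mean square error
    f_stat = msr / mse
    return f_stat, f_stat >= f_crit

f_stat, reject_h0 = f_test(ssr=500.3285, sst=599.7855, n=20, p=2, f_crit=3.59)
print(f"F = {f_stat:.2f}, reject H0: {reject_h0}")
```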