# Chapter 5 Inference in the Simple Regression Model

Slide 5.1

Undergraduate Econometrics, 2nd Edition – Chapter 5

Chapter 5

Inference in the Simple Regression Model: Interval Estimation, Hypothesis Testing, and Prediction

Assumptions of the Simple Linear Regression Model

SR1. yt = β1 + β2xt + et

SR2. E(et) = 0 ⇔ E[yt] = β1 + β2xt

SR3. var(et) = σ2 = var(yt)

SR4. cov(ei, ej) = cov(yi, yj) = 0

SR5. xt is not random and takes at least two different values

SR6. et ~ N(0, σ2) ⇔ yt ~ N[(β1 + β2xt), σ2] (optional)

Slide 5.2


If all the above-mentioned assumptions are correct, then the least squares estimators b1

and b2 are normally distributed random variables and have, from Chapter 4.4, normal

distributions with means and variances as follows:

$$b_1 \sim N\!\left(\beta_1,\ \frac{\sigma^2 \sum x_t^2}{T \sum (x_t - \bar{x})^2}\right)$$

$$b_2 \sim N\!\left(\beta_2,\ \frac{\sigma^2}{\sum (x_t - \bar{x})^2}\right)$$

From Chapter 4.5 we know that the unbiased estimator of the error variance is as follows:

$$\hat{\sigma}^2 = \frac{\sum \hat{e}_t^2}{T - 2}$$

Slide 5.3


By replacing the unknown parameter σ2 with this estimator we can estimate the variances

of the least squares estimators and their covariance.
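As an illustration, here is a minimal numerical sketch of these formulas using simulated data (the parameter values β1 = 5, β2 = 1.5, σ = 2 and the design for x are made up for the example; they do not come from the text):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical simulated sample: T observations with true beta1 = 5, beta2 = 1.5, sigma = 2
T = 40
x = np.linspace(1.0, 10.0, T)
y = 5.0 + 1.5 * x + rng.normal(0.0, 2.0, T)

# Least squares point estimates b1 and b2 (Chapter 4)
xbar, ybar = x.mean(), y.mean()
b2 = np.sum((x - xbar) * (y - ybar)) / np.sum((x - xbar) ** 2)
b1 = ybar - b2 * xbar

# Unbiased estimator of the error variance: sigma_hat^2 = sum(e_hat_t^2) / (T - 2)
e_hat = y - b1 - b2 * x
sigma2_hat = np.sum(e_hat ** 2) / (T - 2)

# Estimated variances of b1 and b2, replacing sigma^2 with sigma_hat^2
var_b1 = sigma2_hat * np.sum(x ** 2) / (T * np.sum((x - xbar) ** 2))
var_b2 = sigma2_hat / np.sum((x - xbar) ** 2)
```

With the true σ² known to be 4 here, `sigma2_hat` should come out near 4, and the estimated variances plug that value into the variance formulas above.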

In Chapter 4 you learned how to calculate point estimates of the regression

parameters β1 and β2 using the best, linear unbiased estimation procedure. The estimates

represent an inference about the regression function E(y) = β1 + β2x of the population

from which the sample data was drawn.

In this chapter we introduce the additional tools of statistical inference: interval estimation, prediction, interval prediction, and hypothesis testing. A prediction is a forecast of a future value of the dependent variable y. Interval estimation and interval prediction create ranges of values, sometimes called confidence intervals, in which the unknown parameters, or the value of y, are likely to be located. Hypothesis testing procedures are a means of comparing a conjecture that we as economists might have about the regression parameters to the information about the parameters contained in a sample of data. Hypothesis tests allow

Slide 5.4


us to say that the data are compatible, or are not compatible, with a particular conjecture,

or hypothesis.

The procedures for interval estimation, prediction, and hypothesis testing depend heavily on assumption SR6 of the simple linear regression model and the resulting normality of the least squares estimators. If assumption SR6 is not made, then the sample size must be sufficiently large so that the least squares estimators' distributions are approximately normal, in which case the procedures we develop in this chapter are also

approximate. In developing the procedures in this chapter we will be using the normal

distribution, and distributions related to the normal, namely “Student’s” t-distribution and

the chi-square distribution.

Slide 5.5


5.1 Interval Estimation

5.1.1 The Theory

A standard normal random variable that we will use to construct an interval estimator is

based on the normal distribution of the least squares estimator. Consider, for example,

the normal distribution of b2, the least squares estimator of β2, which we denote as

$$b_2 \sim N\!\left(\beta_2,\ \frac{\sigma^2}{\sum (x_t - \bar{x})^2}\right)$$

A standardized normal random variable is obtained from b2 by subtracting its mean and

dividing by its standard deviation:

Slide 5.6


$$Z = \frac{b_2 - \beta_2}{\sqrt{\operatorname{var}(b_2)}} \sim N(0,1) \qquad (5.1.1)$$

That is, the standardized random variable Z is normally distributed with mean 0 and

variance 1.
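A small Monte Carlo sketch of this standardization (the values chosen for β1, β2, σ, and the x design are hypothetical, not from the text): repeated samples of b2, centered at β2 and scaled by its true standard deviation, should behave like draws from N(0,1).

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical design: fixed x values, true beta2 = 0.5, sigma = 1
T, beta1, beta2, sigma = 30, 1.0, 0.5, 1.0
x = np.linspace(0.0, 5.0, T)
sxx = np.sum((x - x.mean()) ** 2)
sd_b2 = sigma / np.sqrt(sxx)          # true standard deviation of b2

# Draw many samples, compute b2 each time, and standardize as in (5.1.1)
z = np.empty(5000)
for i in range(z.size):
    y = beta1 + beta2 * x + rng.normal(0.0, sigma, T)
    b2 = np.sum((x - x.mean()) * (y - y.mean())) / sxx
    z[i] = (b2 - beta2) / sd_b2

# z should look like a standard normal sample: mean near 0, std near 1
```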

5.1.1a The Chi-Square Distribution

• Chi-square random variables arise when standard normal, N(0,1), random variables are

squared.

If Z1, Z2, ..., Zm denote m independent N(0,1) random variables, then

$$V = Z_1^2 + Z_2^2 + \cdots + Z_m^2 \sim \chi^2_{(m)} \qquad (5.1.2)$$

Slide 5.7


The notation $V \sim \chi^2_{(m)}$ is read as: the random variable V has a chi-square distribution

with m degrees of freedom. The degrees of freedom parameter m indicates the

number of independent N(0,1) random variables that are squared and summed to form

V.

• The value of m determines the entire shape of the chi-square distribution, and its mean

and variance

$$E[V] = E[\chi^2_{(m)}] = m, \qquad \operatorname{var}[V] = \operatorname{var}[\chi^2_{(m)}] = 2m \qquad (5.1.3)$$

In Figure 5.1, graphs of the chi-square distribution for various degrees of freedom, m,

are presented.

Slide 5.8


• Since V is formed by squaring and summing m standardized normal [N(0,1)] random

variables, the value of V must be nonnegative, v ≥ 0.

• The distribution has a long tail, or is skewed, to the right.

• As the degrees of freedom m gets larger, the distribution becomes more symmetric and

“bell-shaped.”

• As m gets large, the chi-square distribution converges to, and essentially becomes, a

normal distribution.
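These properties are easy to check by simulation; a quick sketch (the choice m = 4 is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)

# V = sum of m squared independent N(0,1) draws is chi-square with m df
m, n = 4, 200_000
Z = rng.standard_normal((n, m))
V = np.sum(Z ** 2, axis=1)

# Sample moments should be close to E[V] = m and var[V] = 2m,
# and every draw is nonnegative, with a right-skewed distribution
```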

5.1.1b The Probability Distribution of $\hat{\sigma}^2$

• If SR6 holds, then the random error term et has a normal distribution, et ~ N(0,σ2).

• Standardize the random variable by dividing by its standard deviation, so that et/σ ~ N(0,1).

• The square of a standard normal random variable is a chi-square random variable with

one degree of freedom, so $(e_t/\sigma)^2 \sim \chi^2_{(1)}$.

Slide 5.9


• If all the random errors are independent then

$$\sum_t \left(\frac{e_t}{\sigma}\right)^2 = \left(\frac{e_1}{\sigma}\right)^2 + \left(\frac{e_2}{\sigma}\right)^2 + \cdots + \left(\frac{e_T}{\sigma}\right)^2 \sim \chi^2_{(T)} \qquad (5.1.4)$$

Since the true random errors are unobservable, we replace them by their sample counterparts, the least squares residuals $\hat{e}_t = y_t - b_1 - b_2 x_t$, to obtain

$$V = \frac{\sum_t \hat{e}_t^2}{\sigma^2} = \frac{(T-2)\hat{\sigma}^2}{\sigma^2} \qquad (5.1.5)$$

• The random variable V in Equation (5.1.5) does not have a $\chi^2_{(T)}$ distribution because

the least squares residuals are not independent random variables.

Slide 5.10


• All T residuals $\hat{e}_t = y_t - b_1 - b_2 x_t$ depend on the least squares estimators b1 and b2. It

can be shown that only T – 2 of the least squares residuals are independent in the

simple linear regression model. That is, when multiplied by the constant (T − 2)/σ², the random variable $\hat{\sigma}^2$ has a chi-square distribution with T − 2 degrees of freedom,

$$V = \frac{(T-2)\hat{\sigma}^2}{\sigma^2} \sim \chi^2_{(T-2)} \qquad (5.1.6)$$
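A Monte Carlo sketch of this result (all parameter values below are made up for illustration): repeatedly computing (T − 2)σ̂²/σ² should reproduce the mean T − 2 and variance 2(T − 2) of a chi-square random variable with T − 2 degrees of freedom.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical setup: T = 20, so V should be chi-square with 18 df
T, beta1, beta2, sigma = 20, 1.0, 0.5, 2.0
x = np.linspace(1.0, 10.0, T)
xbar, sxx = x.mean(), np.sum((x - x.mean()) ** 2)

v = np.empty(20_000)
for i in range(v.size):
    y = beta1 + beta2 * x + rng.normal(0.0, sigma, T)
    b2 = np.sum((x - xbar) * (y - y.mean())) / sxx
    b1 = y.mean() - b2 * xbar
    e_hat = y - b1 - b2 * x
    sigma2_hat = np.sum(e_hat ** 2) / (T - 2)
    v[i] = (T - 2) * sigma2_hat / sigma ** 2

# Chi-square with T - 2 = 18 df: mean 18, variance 36
```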

• We have not established the fact that the chi-square random variable V is statistically

independent of the least squares estimators b1 and b2, but it is. Now we turn our

attention to defining a t random variable.

Slide 5.11


5.1.1c The t-Distribution

• A “t” random variable (written with a lowercase t) is formed by dividing a standard normal random variable, Z ~ N(0,1), by the square root of an independent chi-square random variable, $V \sim \chi^2_{(m)}$, that has been divided by its degrees of freedom, m.

If Z ~ N(0,1) and $V \sim \chi^2_{(m)}$, and if Z and V are independent, then

$$t = \frac{Z}{\sqrt{V/m}} \sim t_{(m)} \qquad (5.1.7)$$

• The shape of the t-distribution is completely determined by the degrees of freedom

parameter, m, and the distribution is symbolized by t(m).
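Constructing a t random variable from its ingredients by simulation (the choice m = 3 is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(4)

# Independent ingredients: Z ~ N(0,1) and V ~ chi-square with m df
m, n = 3, 300_000
Z = rng.standard_normal(n)
V = np.sum(rng.standard_normal((n, m)) ** 2, axis=1)

# t = Z / sqrt(V/m) has a t-distribution with m degrees of freedom:
# symmetric around 0, but with heavier tails than the N(0,1)
t = Z / np.sqrt(V / m)
```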

Slide 5.12


• Figure 5.2 shows a graph of the t-distribution with m = 3 degrees of freedom, relative

to the N(0,1). Note that the t-distribution is less “peaked,” and more spread out than

the N(0,1).

• The t-distribution is symmetric, with mean E[t(m)] = 0 and variance var[t(m)] = m/(m−2).

• As the degrees of freedom parameter m → ∞, the t(m) distribution approaches the standard normal distribution, N(0,1).