Chapter 5 Inference in the Simple Regression Model ...web.thu.edu.tw/wichuang/www/Financial...


  • Slide 5.1

    Undergraduate Econometrics, 2nd Edition –Chapter 5

    Chapter 5

    Inference in the Simple Regression Model: Interval Estimation, Hypothesis Testing,

    and Prediction

    Assumptions of the Simple Linear Regression Model

    SR1. yt = β1 + β2xt + et

    SR2. E(et) = 0 ⇔ E[yt] = β1 + β2xt

    SR3. var(et) = σ2 = var(yt)

SR4. cov(ei, ej) = cov(yi, yj) = 0 for i ≠ j

    SR5. xt is not random and takes at least two different values

    SR6. et ~ N(0, σ2) ⇔ yt ~ N[(β1 + β2xt), σ2] (optional)

  • Slide 5.2


    If all the above-mentioned assumptions are correct, then the least squares estimators b1

    and b2 are normally distributed random variables and have, from Chapter 4.4, normal

    distributions with means and variances as follows:

    b1 ~ N(β1, σ²Σxt² / [TΣ(xt − x̄)²])

    b2 ~ N(β2, σ² / Σ(xt − x̄)²)

    From Chapter 4.5 we know that the unbiased estimator of the error variance is as follows:

    σ̂² = Σêt² / (T − 2)
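The formulas above can be checked numerically. Below is a minimal sketch (the parameter values β1 = 1, β2 = 0.5, σ = 2, T = 40 and the x values are assumptions for illustration, not from the text) that simulates one sample satisfying SR1–SR6 and computes b1, b2, and σ̂²:

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed illustration values (not from the text)
beta1, beta2, sigma, T = 1.0, 0.5, 2.0, 40
x = np.linspace(1.0, 20.0, T)        # fixed, non-random regressor (SR5)
e = rng.normal(0.0, sigma, T)        # et ~ N(0, sigma^2): SR2, SR3, SR6
y = beta1 + beta2 * x + e            # SR1

# Least squares estimates (Chapter 4 formulas)
xbar = x.mean()
b2 = np.sum((x - xbar) * (y - y.mean())) / np.sum((x - xbar) ** 2)
b1 = y.mean() - b2 * xbar

# Unbiased estimator of the error variance: sigma-hat^2 = sum(e_hat^2) / (T - 2)
e_hat = y - b1 - b2 * x
sigma2_hat = np.sum(e_hat ** 2) / (T - 2)

print(b1, b2, sigma2_hat)
```

Across repeated samples, b1 and b2 average out to β1 and β2, and σ̂² averages out to σ².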

  • Slide 5.3


    By replacing the unknown parameter σ2 with this estimator we can estimate the variances

    of the least squares estimators and their covariance.

    In Chapter 4 you learned how to calculate point estimates of the regression

    parameters β1 and β2 using the best linear unbiased estimation procedure. The estimates

    represent an inference about the regression function E(y) = β1 + β2x of the population

    from which the sample data was drawn.

    In this chapter we introduce additional tools of statistical inference: interval

    estimation, prediction, interval prediction, and hypothesis testing. A prediction is a

    forecast of a future value of the dependent variable y. Interval estimation and interval

    prediction produce ranges of values, sometimes called confidence intervals, in which the

    unknown parameters, or the value of y, are likely to be located. Hypothesis testing

    procedures are a means of comparing conjectures that we as economists might have about

    the regression parameters to the information about the parameters contained in a sample

    of data. Hypothesis tests allow

  • Slide 5.4


    us to say that the data are compatible, or are not compatible, with a particular conjecture,

    or hypothesis.

    The procedures for interval estimation, prediction, and hypothesis testing depend

    heavily on assumption SR6 of the simple linear regression model, and the resulting

    normality of the least squares estimators. If assumption SR6 is not made, then the sample

    size must be sufficiently large so that the least squares estimators' distributions are

    approximately normal, in which case the procedures we develop in this chapter are also

    approximate. In developing the procedures in this chapter we will be using the normal

    distribution, and distributions related to the normal, namely “Student’s” t-distribution and

    the chi-square distribution.

  • Slide 5.5


    5.1 Interval Estimation

    5.1.1 The Theory

    A standard normal random variable that we will use to construct an interval estimator is

    based on the normal distribution of the least squares estimator. Consider, for example,

    the normal distribution of b2, the least squares estimator of β2, which we denote as

    b2 ~ N(β2, σ² / Σ(xt − x̄)²)

    A standardized normal random variable is obtained from b2 by subtracting its mean and

    dividing by its standard deviation:

  • Slide 5.6


    Z = (b2 − β2) / √var(b2) ~ N(0, 1) (5.1.1)

    That is, the standardized random variable Z is normally distributed with mean 0 and

    variance 1.
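As a sketch of Equation (5.1.1) in action (the simulation settings below are assumptions for illustration, not from the text), we can draw many samples, compute b2 in each, and standardize using the true var(b2) = σ²/Σ(xt − x̄)²; the resulting Z values should have mean near 0 and variance near 1:

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed illustration values (not from the text)
beta1, beta2, sigma, T, R = 1.0, 0.5, 2.0, 30, 20_000
x = np.linspace(1.0, 15.0, T)
Sxx = np.sum((x - x.mean()) ** 2)
var_b2 = sigma ** 2 / Sxx                       # true variance of b2

# R replicated samples, vectorized: each row of y is one sample of size T
y = beta1 + beta2 * x + rng.normal(0.0, sigma, (R, T))
b2 = ((x - x.mean()) * (y - y.mean(axis=1, keepdims=True))).sum(axis=1) / Sxx

Z = (b2 - beta2) / np.sqrt(var_b2)              # Equation (5.1.1)
print(Z.mean(), Z.var())                        # near 0 and 1
```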

5.1.1a The Chi-Square Distribution

    • Chi-square random variables arise when standard normal, N(0,1), random variables are

    squared.

    If Z1, Z2, ..., Zm denote m independent N(0,1) random variables, then

    V = Z1² + Z2² + ... + Zm² ~ χ²(m) (5.1.2)

  • Slide 5.7


    The notation V ~ χ²(m) is read as: the random variable V has a chi-square distribution

    with m degrees of freedom. The degrees of freedom parameter m indicates the

    number of independent N(0,1) random variables that are squared and summed to form

    V.

    • The value of m determines the entire shape of the chi-square distribution, and its mean

    and variance:

    E[V] = E[χ²(m)] = m

    var[V] = var[χ²(m)] = 2m (5.1.3)

    In Figure 5.1, graphs of the chi-square distribution for various degrees of freedom, m,

    are presented.
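A quick Monte Carlo sketch (m and the number of draws are assumptions for illustration) confirms the moments in Equation (5.1.3): squaring and summing m independent N(0,1) draws gives a nonnegative variable with mean m and variance 2m:

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 4, 200_000          # assumed illustration values

# V = Z1^2 + ... + Zm^2 for independent N(0,1) draws (Equation 5.1.2)
Z = rng.standard_normal((n, m))
V = (Z ** 2).sum(axis=1)

print(V.min() >= 0.0)      # V is nonnegative
print(V.mean(), V.var())   # near m = 4 and 2m = 8
```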

  • Slide 5.8


    • Since V is formed by squaring and summing m standardized normal [N(0,1)] random

    variables, the value of V must be nonnegative, v ≥ 0.

    • The distribution has a long tail, or is skewed, to the right.

    • As the degrees of freedom m gets larger, the distribution becomes more symmetric and

    “bell-shaped.”

    • As m gets large, the chi-square distribution converges to, and essentially becomes, a

    normal distribution.

5.1.1b The Probability Distribution of σ̂²

    • If SR6 holds, then the random error term et has a normal distribution, et ~ N(0,σ2).

    • Standardize the random variable by dividing by its standard deviation, so that et/σ ~

    N(0,1).

    • The square of a standard normal random variable is a chi-square random variable with

    one degree of freedom, so (et/σ)² ~ χ²(1).

  • Slide 5.9


    • If all the random errors are independent then

    Σt (et/σ)² = (e1/σ)² + (e2/σ)² + ... + (eT/σ)² ~ χ²(T) (5.1.4)

    Since the true random errors are unobservable we replace them by their sample

    counterparts, the least squares residuals êt = yt − b1 − b2xt, to obtain

    V = Σêt²/σ² = (T − 2)σ̂²/σ² (5.1.5)

    • The random variable V in Equation (5.1.5) does not have a χ²(T) distribution because

    the least squares residuals are not independent random variables.

  • Slide 5.10


    • All T residuals êt = yt − b1 − b2xt depend on the least squares estimators b1 and b2. It

    can be shown that only T – 2 of the least squares residuals are independent in the

    simple linear regression model. That is, when multiplied by the constant (T − 2)/σ², the

    random variable σ̂² has a chi-square distribution with T − 2 degrees of freedom,

    V = (T − 2)σ̂²/σ² ~ χ²(T−2) (5.1.6)

    • We have not established the fact that the chi-square random variable V is statistically

    independent of the least squares estimators b1 and b2, but it is. Now we turn our

    attention to defining a t random variable.
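Equation (5.1.6) can also be checked by simulation (the settings below are assumptions for illustration, not from the text): across repeated samples, V = (T − 2)σ̂²/σ² behaves like a χ²(T−2) variable, with mean T − 2 and variance 2(T − 2):

```python
import numpy as np

rng = np.random.default_rng(3)

# Assumed illustration values (not from the text)
beta1, beta2, sigma, T, R = 1.0, 0.5, 2.0, 20, 5_000
x = np.linspace(1.0, 10.0, T)
Sxx = np.sum((x - x.mean()) ** 2)

V = np.empty(R)
for r in range(R):
    y = beta1 + beta2 * x + rng.normal(0.0, sigma, T)
    b2 = np.sum((x - x.mean()) * (y - y.mean())) / Sxx
    b1 = y.mean() - b2 * x.mean()
    sigma2_hat = np.sum((y - b1 - b2 * x) ** 2) / (T - 2)
    V[r] = (T - 2) * sigma2_hat / sigma ** 2    # Equation (5.1.6)

print(V.mean(), V.var())   # near T - 2 = 18 and 2(T - 2) = 36
```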

  • Slide 5.11


    5.1.1c The t-Distribution

    • A “t” random variable (no uppercase) is formed by dividing a standard normal, Z ~

    N(0,1), random variable by the square root of an independent chi-square random

    variable, V ~ χ²(m), that has been divided by its degrees of freedom, m.

    If Z ~ N(0,1) and V ~ χ²(m), and if Z and V are independent, then

    t = Z / √(V/m) ~ t(m) (5.1.7)

    • The shape of the t-distribution is completely determined by the degrees of freedom

    parameter, m, and the distribution is symbolized by t(m).
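Equation (5.1.7) gives a recipe for generating t(m) draws. The sketch below (m = 10 and the draw count are assumptions for illustration) builds t values from an independent Z and V and checks that the moments match the known values for the t-distribution, mean 0 and variance m/(m − 2) for m > 2:

```python
import numpy as np

rng = np.random.default_rng(4)
m, n = 10, 200_000                                  # assumed illustration values

Z = rng.standard_normal(n)                          # Z ~ N(0,1)
V = (rng.standard_normal((n, m)) ** 2).sum(axis=1)  # V ~ chi-square(m), independent of Z
t = Z / np.sqrt(V / m)                              # Equation (5.1.7)

print(t.mean(), t.var())   # near 0 and m/(m-2) = 1.25
```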

  • Slide 5.12


    • Figure 5.2 shows a graph of the t-distribution with m = 3 degrees of freedom, relative

    to the N(0,1). Note that the t-distribution is less “peaked,” and more spread out than

    the N(0,1).

    • The t-distribution is symmetric, with mean E[t(m)] = 0 and variance var[t(m)] = m/(m−2) for m > 2.

    • As the degrees of freedom parameter m→∞, the t(m) distribution approaches the standard normal distribution, N(0,1).