Chapter 5
Inference in the Simple Regression Model: Interval Estimation, Hypothesis Testing,
and Prediction
Assumptions of the Simple Linear Regression Model
SR1. y_t = β1 + β2x_t + e_t
SR2. E(e_t) = 0 ⇔ E[y_t] = β1 + β2x_t
SR3. var(e_t) = σ² = var(y_t)
SR4. cov(e_i, e_j) = cov(y_i, y_j) = 0
SR5. x_t is not random and takes at least two different values
SR6. e_t ~ N(0, σ²) ⇔ y_t ~ N(β1 + β2x_t, σ²) (optional)
If all of the above assumptions are correct, then the least squares estimators b1 and b2 are, from Chapter 4.4, normally distributed random variables with the following means and variances:
$$b_1 \sim N\!\left(\beta_1,\; \frac{\sigma^2 \sum x_t^2}{T \sum (x_t - \bar{x})^2}\right), \qquad b_2 \sim N\!\left(\beta_2,\; \frac{\sigma^2}{\sum (x_t - \bar{x})^2}\right)$$
From Chapter 4.5 we know that the unbiased estimator of the error variance is as follows:
$$\hat{\sigma}^2 = \frac{\sum \hat{e}_t^2}{T - 2}$$
By replacing the unknown parameter σ² with this estimator we can estimate the variances of the least squares estimators and their covariance.
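As a concrete numerical sketch of these formulas, the following code computes b1, b2, the unbiased estimate σ̂², and the estimated variances of both estimators for a small data set. The x and y values here are made up purely for illustration and do not come from the text.

```python
import numpy as np

# Illustrative data (made-up values, T = 8 observations)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1, 6.9, 8.2, 8.8])
T = len(y)
xbar, ybar = x.mean(), y.mean()

# Least squares estimates (Chapter 4 formulas)
b2 = np.sum((x - xbar) * (y - ybar)) / np.sum((x - xbar) ** 2)
b1 = ybar - b2 * xbar

# Least squares residuals and the unbiased estimator of sigma^2
e_hat = y - b1 - b2 * x
sigma2_hat = np.sum(e_hat ** 2) / (T - 2)

# Estimated variances, replacing sigma^2 with sigma2_hat in the formulas above
var_b2 = sigma2_hat / np.sum((x - xbar) ** 2)
var_b1 = sigma2_hat * np.sum(x ** 2) / (T * np.sum((x - xbar) ** 2))

print(b1, b2, sigma2_hat, var_b1, var_b2)
```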
In Chapter 4 you learned how to calculate point estimates of the regression parameters β1 and β2 using the best linear unbiased estimation procedure. The estimates represent an inference about the regression function E(y) = β1 + β2x of the population from which the sample data were drawn.
In this chapter we introduce the additional tools of statistical inference: interval estimation, prediction, interval prediction, and hypothesis testing. A prediction is a forecast of a future value of the dependent variable y. Interval estimation and interval prediction are procedures for creating ranges of values, sometimes called confidence intervals, in which the unknown parameters, or the value of y, are likely to be located. Hypothesis testing procedures are a means of comparing a conjecture that we as economists might have about the regression parameters to the information about the parameters contained in a sample of data. Hypothesis tests allow
us to say that the data are compatible, or are not compatible, with a particular conjecture,
or hypothesis.
The procedures for interval estimation, prediction, and hypothesis testing depend heavily on assumption SR6 of the simple linear regression model and the resulting normality of the least squares estimators. If assumption SR6 is not made, then the sample size must be sufficiently large so that the least squares estimators' distributions are
approximately normal, in which case the procedures we develop in this chapter are also
approximate. In developing the procedures in this chapter we will be using the normal
distribution, and distributions related to the normal, namely “Student’s” t-distribution and
the chi-square distribution.
5.1 Interval Estimation
5.1.1 The Theory
A standard normal random variable that we will use to construct an interval estimator is
based on the normal distribution of the least squares estimator. Consider, for example,
the normal distribution of b2, the least squares estimator of β2, which we denote as
$$b_2 \sim N\!\left(\beta_2,\; \frac{\sigma^2}{\sum (x_t - \bar{x})^2}\right)$$
A standardized normal random variable is obtained from b2 by subtracting its mean and
dividing by its standard deviation:
$$Z = \frac{b_2 - \beta_2}{\sqrt{\operatorname{var}(b_2)}} \sim N(0,1) \qquad (5.1.1)$$
That is, the standardized random variable Z is normally distributed with mean 0 and
variance 1.
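A quick Monte Carlo check of this result can be coded directly. In the sketch below the parameter values (β1 = 1, β2 = 0.5, σ = 2) and the fixed regressor are arbitrary assumptions chosen for illustration; under them, the standardized b2 should have sample mean near 0 and sample variance near 1.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed illustrative parameters (not from the text)
beta1, beta2, sigma = 1.0, 0.5, 2.0
x = np.linspace(1, 20, 20)                    # fixed regressor (SR5)
true_var_b2 = sigma**2 / np.sum((x - x.mean())**2)

z_draws = []
for _ in range(10_000):
    e = rng.normal(0.0, sigma, size=x.size)   # SR6: normal errors
    y = beta1 + beta2 * x + e                 # SR1
    b2 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean())**2)
    z_draws.append((b2 - beta2) / np.sqrt(true_var_b2))

z = np.array(z_draws)
print(z.mean(), z.var())                      # should be close to 0 and 1
```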
5.1.1a The Chi-Square Distribution
• Chi-square random variables arise when standard normal, N(0,1), random variables are
squared.
If Z1, Z2, ..., Zm denote m independent N(0,1) random variables, then

$$V = Z_1^2 + Z_2^2 + \cdots + Z_m^2 \sim \chi^2_{(m)} \qquad (5.1.2)$$
The notation $V \sim \chi^2_{(m)}$ is read as: the random variable V has a chi-square distribution
with m degrees of freedom. The degrees of freedom parameter m indicates the
number of independent N(0,1) random variables that are squared and summed to form
V.
• The value of m determines the entire shape of the chi-square distribution, and its mean and variance are

$$E[V] = E\!\left[\chi^2_{(m)}\right] = m, \qquad \operatorname{var}[V] = \operatorname{var}\!\left[\chi^2_{(m)}\right] = 2m \qquad (5.1.3)$$
In Figure 5.1, graphs of the chi-square distribution for various degrees of freedom, m,
are presented.
• Since V is formed by squaring and summing m standardized normal [N(0,1)] random
variables, the value of V must be nonnegative, v ≥ 0.
• The distribution has a long tail, or is skewed, to the right.
• As the degrees of freedom m gets larger, the distribution becomes more symmetric and
“bell-shaped.”
• As m gets large, the chi-square distribution converges to, and essentially becomes, a
normal distribution.
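The properties in the bullet points above are easy to verify by simulation. The sketch below (degrees of freedom and replication counts chosen arbitrarily) forms V by squaring and summing m independent N(0,1) draws, compares its sample moments with (5.1.3), and checks one quantile against scipy's chi-square distribution.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

for m in (2, 5, 30):
    # V = Z_1^2 + ... + Z_m^2 for independent N(0,1) draws
    Z = rng.standard_normal(size=(100_000, m))
    V = np.sum(Z**2, axis=1)
    # Moments from (5.1.3): E[V] = m, var[V] = 2m
    print(m, V.mean(), V.var())
    # Agreement with scipy's chi-square, e.g. at the 95th percentile
    print(np.quantile(V, 0.95), stats.chi2(df=m).ppf(0.95))
```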
5.1.1b The Probability Distribution of σ̂²
• If SR6 holds, then the random error term et has a normal distribution, et ~ N(0,σ2).
• Standardize the random variable by dividing by its standard deviation, so that e_t/σ ~ N(0,1).
• The square of a standard normal random variable is a chi-square random variable with one degree of freedom, so $(e_t/\sigma)^2 \sim \chi^2_{(1)}$.
• If all the random errors are independent then
$$\sum_t \left(\frac{e_t}{\sigma}\right)^2 = \left(\frac{e_1}{\sigma}\right)^2 + \left(\frac{e_2}{\sigma}\right)^2 + \cdots + \left(\frac{e_T}{\sigma}\right)^2 \sim \chi^2_{(T)} \qquad (5.1.4)$$
Since the true random errors are unobservable, we replace them by their sample counterparts, the least squares residuals $\hat{e}_t = y_t - b_1 - b_2 x_t$, to obtain
$$V = \frac{\sum_t \hat{e}_t^2}{\sigma^2} = \frac{(T-2)\,\hat{\sigma}^2}{\sigma^2} \qquad (5.1.5)$$
• The random variable V in Equation (5.1.5) does not have a $\chi^2_{(T)}$ distribution because
the least squares residuals are not independent random variables.
• All T residuals $\hat{e}_t = y_t - b_1 - b_2 x_t$ depend on the least squares estimators b1 and b2. It can be shown that only T − 2 of the least squares residuals are independent in the simple linear regression model. That is, when multiplied by the constant (T − 2)/σ², the random variable σ̂² has a chi-square distribution with T − 2 degrees of freedom,
$$V = \frac{(T-2)\,\hat{\sigma}^2}{\sigma^2} \sim \chi^2_{(T-2)} \qquad (5.1.6)$$
• We have not established the fact that the chi-square random variable V is statistically independent of the least squares estimators b1 and b2, but it is. Now we turn our attention to defining a t random variable.
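As a simulation sketch of (5.1.6), with all parameter values (β1, β2, σ, T) assumed for illustration: repeatedly draw samples from the model, compute σ̂² for each, and compare the simulated moments of (T − 2)σ̂²/σ² with those of a χ²(T−2) distribution.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

beta1, beta2, sigma, T = 1.0, 0.5, 2.0, 12   # assumed illustrative values
x = np.linspace(1, 10, T)                    # fixed regressor (SR5)

v_draws = []
for _ in range(20_000):
    y = beta1 + beta2 * x + rng.normal(0.0, sigma, size=T)
    b2 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean())**2)
    b1 = y.mean() - b2 * x.mean()
    sigma2_hat = np.sum((y - b1 - b2 * x)**2) / (T - 2)
    v_draws.append((T - 2) * sigma2_hat / sigma**2)

V = np.array(v_draws)
chi2 = stats.chi2(df=T - 2)
print(V.mean(), chi2.mean())                 # both near T - 2 = 10
print(V.var(), chi2.var())                   # both near 2(T - 2) = 20
```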
5.1.1c The t-Distribution
• A “t” random variable (no uppercase) is formed by dividing a standard normal, Z ~
N(0,1), random variable by the square root of an independent chi-square random
variable, 2( )~ mV χ , that has been divided by its degrees of freedom, m.
If Z ~ N(0,1) and $V \sim \chi^2_{(m)}$, and if Z and V are independent, then

$$t = \frac{Z}{\sqrt{V/m}} \sim t_{(m)} \qquad (5.1.7)$$
• The shape of the t-distribution is completely determined by the degrees of freedom
parameter, m, and the distribution is symbolized by t(m).
• Figure 5.2 shows a graph of the t-distribution with m = 3 degrees of freedom, relative
to the N(0,1). Note that the t-distribution is less “peaked,” and more spread out than
the N(0,1).
• The t-distribution is symmetric, with mean E[t(m)] = 0 and variance var[t(m)] = m/(m−2) (for m > 2).
• As the degrees of freedom parameter m→∞, the t(m) distribution approaches the standard normal distribution N(0,1).
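To close, a simulation sketch of definition (5.1.7), with degrees of freedom and sample sizes chosen arbitrarily: it constructs t random variables from independent Z and V, checks the variance formula m/(m − 2), and shows a tail quantile approaching the corresponding N(0,1) quantile (about 1.645 at the 95th percentile) as m grows.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n = 200_000

for m in (3, 10, 100):
    Z = rng.standard_normal(n)               # Z ~ N(0,1)
    V = rng.chisquare(df=m, size=n)          # V ~ chi-square(m), independent of Z
    t = Z / np.sqrt(V / m)                   # definition (5.1.7)
    print(m, t.var(), m / (m - 2))           # sample vs. theoretical variance
    # Tail quantile approaches the N(0,1) value as m grows
    print(np.quantile(t, 0.95), stats.t(df=m).ppf(0.95))
```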