4.3 Confidence Intervals
-Using our CLM assumptions, we can construct CONFIDENCE INTERVALS or CONFIDENCE INTERVAL ESTIMATES of the form:

$$CI = \hat{\beta}_j \pm t^* \cdot se(\hat{\beta}_j)$$

-Given a significance level α (which is used to determine t*), we construct 100(1−α)% confidence intervals
-Given random samples, 100(1−α)% of our confidence intervals contain the true value β_j
-we don’t know whether an individual confidence interval contains the true value

4.3 Confidence Intervals
-Confidence intervals are similar to 2-tailed tests in that α/2 is in each tail when finding t*
-if our hypothesis test and confidence interval use the same α:
1) we cannot reject the null hypothesis (at the given significance level) that β_j = a_j if a_j is within the confidence interval
2) we can reject the null hypothesis (at the given significance level) that β_j = a_j if a_j is not within the confidence interval

4.3 Confidence Example
-Going back to our Pepsi example, we now look at geekiness:

$$\widehat{Cool} = \underset{(2.1)}{4.3} + \underset{(0.25)}{0.3}\,Geek + \underset{(0.21)}{0.5}\,Pepsi \qquad N = 43,\; R^2 = 0.62$$
-From before, our 2-sided t* with α = 0.01 was t* = 2.704, therefore our 99% CI is:

$$CI = \hat{\beta}_j \pm t^* \cdot se(\hat{\beta}_j)$$
$$CI = 0.3 \pm 2.704(0.25)$$
$$CI = [-0.376,\; 0.976]$$
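The arithmetic above can be checked directly. Below is a minimal Python sketch (using scipy) that reproduces t* and the 99% CI; it assumes df = n − k − 1 = 43 − 2 − 1 = 40 for the two-regressor coolness model:

```python
from scipy import stats

beta_hat, se_beta = 0.3, 0.25      # estimate and standard error for Geek (from the slide)
alpha, df = 0.01, 43 - 2 - 1       # significance level and degrees of freedom

t_star = stats.t.ppf(1 - alpha / 2, df)     # two-tailed critical value, ~2.704
lower = beta_hat - t_star * se_beta
upper = beta_hat + t_star * se_beta
print(f"t* = {t_star:.3f}, 99% CI = [{lower:.3f}, {upper:.3f}]")   # ~[-0.376, 0.976]
```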

4.3 Confidence Intervals
-Remember that a CI is only as good as the 6 CLM assumptions:
1) Omitted variables cause the estimates (the β̂_j's) to be unreliable
-CI is not valid
2) If heteroskedasticity is present, the standard error is not a valid estimate of the standard deviation
-CI is not valid
3) If normality fails, the CI MAY not be valid if our sample size is too small

4.4 Complicated Single Tests
-In this section we will see how to test a single hypothesis involving more than one β_j
-Take again our coolness regression:

$$\widehat{Cool} = \underset{(2.1)}{4.3} + \underset{(0.25)}{0.3}\,Geek + \underset{(0.21)}{0.5}\,Pepsi \qquad N = 43,\; R^2 = 0.62$$

-If we wonder if geekiness has more impact on coolness than Pepsi consumption:

$$H_0: \beta_1 = \beta_2$$
$$H_a: \beta_1 > \beta_2$$

4.4 Complicated Single Tests
-This test is similar to our one-coefficient tests, but our standard error will be different
-We can rewrite our hypotheses for clarity:

$$H_0: \beta_1 - \beta_2 = 0$$
$$H_a: \beta_1 - \beta_2 > 0$$

-We can reject the null hypothesis if the estimated difference between β̂_1 and β̂_2 is positive enough

4.4 Complicated Single Tests
-Our new t statistic becomes:

$$t = \frac{\hat{\beta}_1 - \hat{\beta}_2}{se(\hat{\beta}_1 - \hat{\beta}_2)}$$

-And our test continues as before:
1) Calculate t
2) Pick α and calculate t*
3) Reject H0 if t > t*

4.4 Complicated Standard Errors
-The standard error in this test is more complicated than before
-If we simply subtract standard errors, we may end up with a negative value
-this is theoretically impossible: se must always be positive since it estimates a standard deviation

4.4 Complicated Standard Errors
-Using the properties of variances, we know that:

$$Var(\hat{\beta}_1 - \hat{\beta}_2) = Var(\hat{\beta}_1) + Var(\hat{\beta}_2) - 2\,Cov(\hat{\beta}_1, \hat{\beta}_2)$$

-Where the variances are always added and the covariance always subtracted
-translating to standard errors, this becomes:

$$se(\hat{\beta}_1 - \hat{\beta}_2) = \sqrt{[se(\hat{\beta}_1)]^2 + [se(\hat{\beta}_2)]^2 - 2 s_{12}}$$

-Where s_12 is an estimate of the covariance between the coefficients
-s_12 can either be calculated using matrix algebra or be supplied by econometrics programs
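To make the mechanics concrete, here is a small Python/NumPy sketch that builds se(β̂_1 − β̂_2) from the pieces above and then forms the t statistic from the previous slide. The coefficient values echo the coolness example, but s_12 is a made-up number since the notes do not report one:

```python
import numpy as np

# Inputs: coefficient estimates, their standard errors, and the estimated
# covariance s12 (in practice read off the software's coefficient covariance matrix)
b1_hat, b2_hat = 0.30, 0.50
se_b1, se_b2 = 0.25, 0.21
s12 = 0.01                      # hypothetical covariance estimate

# se(b1_hat - b2_hat) = sqrt(se(b1)^2 + se(b2)^2 - 2*s12)
se_diff = np.sqrt(se_b1**2 + se_b2**2 - 2 * s12)

# t statistic for H0: b1 - b2 = 0 against Ha: b1 - b2 > 0
t = (b1_hat - b2_hat) / se_diff
print(se_diff, t)
```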

4.4 Complicated Standard Errors
-To see how to find this standard error, take our typical regression:

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + u$$

-and consider the related equation where θ = β_1 − β_2, or β_1 = θ + β_2:

$$y = \beta_0 + (\theta + \beta_2)x_1 + \beta_2 x_2 + \beta_3 x_3 + u$$
$$y = \beta_0 + \theta x_1 + \beta_2(x_1 + x_2) + \beta_3 x_3 + u$$

-where x_1 and x_2 could be related concepts (e.g., sleep time and naps) and x_3 could be relatively unrelated (e.g., study time)

4.4 Complicated Standard Errors
-By running this new regression, we can find the standard error for our hypothesis test
-using an econometrics program is easier (a sketch follows below)
-Empirically:
1) β̂_0 and se(β̂_0) are the same for both regressions
2) β̂_2 and β̂_3 are the same for both regressions
3) Only the coefficient on x_1 changes: it now estimates θ = β_1 − β_2, and its standard error is the se we need
-given this new standard error, CIs are created as normal
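A hedged sketch of this trick in Python with statsmodels, using simulated data (the course data are not available), showing that the coefficient on x1 in the transformed regression delivers θ̂ = β̂_1 − β̂_2 together with its standard error:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated data purely to illustrate the mechanics
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({"x1": rng.normal(size=n),
                   "x2": rng.normal(size=n),
                   "x3": rng.normal(size=n)})
df["y"] = 1 + 0.8 * df.x1 + 0.5 * df.x2 + 0.3 * df.x3 + rng.normal(size=n)

# Original regression: y = b0 + b1*x1 + b2*x2 + b3*x3 + u
orig = smf.ols("y ~ x1 + x2 + x3", data=df).fit()

# Transformed regression: y = b0 + theta*x1 + b2*(x1 + x2) + b3*x3 + u
df["x1_plus_x2"] = df.x1 + df.x2
trans = smf.ols("y ~ x1 + x1_plus_x2 + x3", data=df).fit()

# The coefficient on x1 now estimates theta = b1 - b2,
# and its standard error is se(b1_hat - b2_hat)
print(trans.params["x1"], trans.bse["x1"])
print(orig.params["x1"] - orig.params["x2"])   # same point estimate
```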

4.5 Testing Multiple Restrictions
-Thus far we have tested whether a SINGLE variable is significant, or how two different variables' impacts compare
-In this section we will test whether a SET of variables is jointly significant, that is, whether the set has a partial effect on the dependent variable
-Even though a group of variables may be individually insignificant, they may be significant as a group due to multicollinearity

4.5 Testing Multiple Restrictions
-Consider our general true model and an example measuring reading week utility (rwu):

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + u$$
$$rwu = \beta_0 + \beta_1\,ski + \beta_2\,trips + \beta_3\,homework + u$$

-we want to test the hypothesis that β_1 and β_2 equal zero at the same time, that x_1 and x_2 have no partial effect simultaneously:

$$H_0: \beta_1 = 0,\; \beta_2 = 0$$

-in our example, we are testing that positive activities have no effect on r.w. utility

4.5 Testing Multiple Restrictions
-our null hypothesis has two EXCLUSION RESTRICTIONS
-this set of MULTIPLE RESTRICTIONS is tested using a MULTIPLE HYPOTHESIS TEST or JOINT HYPOTHESIS TEST
-the alternative hypothesis is unique:

$$H_a: H_0 \text{ is not true}$$

-note that we CANNOT use individual t tests to test this multiple restriction; we need to test the restrictions jointly

4.5 Testing Multiple Restrictions
-to test joint significance, we need to use SSR and R-squared values obtained from two different regressions
-we know that SSR increases and R^2 decreases when variables are dropped from the model
-in order to conduct our test, we need to regress two models (sketched below):
1) An UNRESTRICTED MODEL with all of the variables
2) A RESTRICTED MODEL that excludes the variables in the test
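As a sketch of this two-regression workflow (Python/statsmodels, with simulated stand-in data under the reading-week variable names, since the actual course data are not in the notes):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated stand-in for the reading-week data (illustration only)
rng = np.random.default_rng(3)
n = 572
df = pd.DataFrame({"ski": rng.normal(size=n),
                   "trips": rng.normal(size=n),
                   "homework": rng.normal(size=n)})
df["rwu"] = 5 + 0.4 * df.ski + 0.6 * df.trips - 0.2 * df.homework + rng.normal(size=n)

unrestricted = smf.ols("rwu ~ ski + trips + homework", data=df).fit()  # gives SSR_ur
restricted = smf.ols("rwu ~ homework", data=df).fit()                  # drops tested variables, gives SSR_r
print(unrestricted.ssr, restricted.ssr)    # SSR rises when variables are dropped
```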

4.5 Testing Multiple Restrictions
-Given a hypothesis test with q restrictions, we have the following null hypothesis and regressions:

$$H_0: \beta_{k-q+1} = 0,\; \ldots,\; \beta_k = 0 \qquad (4.35)$$
$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \ldots + \beta_k x_k + u \qquad (4.34)$$
$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \ldots + \beta_{k-q} x_{k-q} + u \qquad (4.36)$$

-Where 4.34 is the UNRESTRICTED MODEL giving us SSR_ur and 4.36 is the RESTRICTED MODEL giving us SSR_r

4.5 Testing Multiple Restrictions
-These SSR values combine to give us our F STATISTIC or TEST F STATISTIC:

$$F = \frac{(SSR_r - SSR_{ur})/q}{SSR_{ur}/(n-k-1)} \qquad (4.37)$$

-Where q is the number of restrictions in the null hypothesis; q = numerator degrees of freedom
-n−k−1 = denominator degrees of freedom (the denominator of F is the unbiased estimator of σ^2)
-since SSR_r ≥ SSR_ur, F is always positive

4.5 Testing Multiple Restrictions
-One can think of our test F stat as measuring the relative increase in SSR when moving from the unrestricted model to the restricted model
-a large F indicates that the excluded variables have much explanatory power
-using H0 and our CLM assumptions, we know that F has an F distribution with q and n−k−1 degrees of freedom: F ~ F_{q, n−k−1}
-we obtain F* from F tables and reject H0 if:

$$F > F^*$$

4.5 Multiple Example
-Given our previous example of reading week utility, an unrestricted and a restricted model give us:

$$\widehat{rwu} = \underset{(4.3)}{15.9} + \underset{(0.9)}{2.0}\,ski + \underset{(1.3)}{3.0}\,trips + \underset{(0.12)}{0.5}\,homework \qquad N = 572,\; SSR_{ur} = 141$$
$$\widehat{rwu} = \underset{(6.3)}{17.6} + \underset{(0.17)}{0.6}\,homework \qquad N = 572,\; SSR_r = 175$$

-Which correspond to the hypotheses:

$$H_0: \beta_1 = 0,\; \beta_2 = 0$$
$$H_a: H_0 \text{ is not true}$$

4.5 Multiple Example
-We use these SSRs to construct a test statistic:

$$F = \frac{(SSR_r - SSR_{ur})/q}{SSR_{ur}/(n-k-1)} = \frac{(175 - 141)/2}{141/(572-3-1)} \approx 68.5$$

-given α = 0.05, F*_{2, 568} ≈ 3.00
-since F > F*, reject H0 at the 5% significance level; positive activities have an impact on reading week utility
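The arithmetic can be verified directly; a minimal scipy sketch using the slide's SSRs and n − k − 1 = 568:

```python
from scipy import stats

ssr_r, ssr_ur = 175, 141
q, n, k = 2, 572, 3

F = ((ssr_r - ssr_ur) / q) / (ssr_ur / (n - k - 1))   # ~68.5
F_crit = stats.f.ppf(0.95, dfn=q, dfd=n - k - 1)       # ~3.01 at alpha = 0.05
print(F, F_crit, F > F_crit)                           # H0 is rejected
```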

4.5 Multiple Notes
-Once the degrees of freedom in F's denominator reach about 120, the F distribution is no longer sensitive to them; hence the infinity entry in the F table
-if H0 is rejected, the variables in question are JOINTLY (STATISTICALLY) SIGNIFICANT at the given alpha level
-if H0 is not rejected, the variables in question are JOINTLY INSIGNIFICANT at that alpha level
-due to multicollinearity, an F test can often reject even when the individual t tests do not

4.5 F, t’s secret identity?
-the F statistic can also be used to test the significance of a single variable
-in this case, q = 1
-it can be shown that F = t^2 in this case, or:

$$t^2_{n-k-1} \sim F_{1,\, n-k-1}$$

-this only applies to two-sided tests
-therefore the t statistic is more flexible since it allows for one-sided tests
-the t statistic is always best suited for testing a single hypothesis
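This identity is easy to verify numerically; a small scipy sketch (degrees of freedom chosen arbitrarily):

```python
from scipy import stats

df = 568                                    # arbitrary denominator degrees of freedom
alpha = 0.05

t_star = stats.t.ppf(1 - alpha / 2, df)     # two-sided t critical value
f_star = stats.f.ppf(1 - alpha, 1, df)      # F critical value with q = 1
print(t_star**2, f_star)                    # the two numbers coincide
```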

4.5 F tests and abuse
-we have already seen that individually insignificant variables may be jointly significant due to multicollinearity
-a significant variable can also prove to be jointly insignificant if grouped with enough insignificant variables
-an insignificant variable can also prove to be jointly significant if grouped with significant variables
-therefore t tests are much better than F tests at determining individual significance

4.5 R2 and F
-While SSR can be large, R^2 is bounded, often making it an easier way to calculate F:

$$F = \frac{(R^2_{ur} - R^2_r)/q}{(1 - R^2_{ur})/(n-k-1)} \qquad (4.41)$$

-Which is also called the R-SQUARED FORM OF THE F STATISTIC
-since R^2_ur ≥ R^2_r, F is still always positive
-this form is NOT valid for testing all linear restrictions (as seen later)
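A one-line version of (4.41), as a sketch with hypothetical R² values just to show the call:

```python
def f_stat_r2(r2_ur, r2_r, q, n, k):
    """R-squared form of the F statistic, equation (4.41)."""
    return ((r2_ur - r2_r) / q) / ((1 - r2_ur) / (n - k - 1))

print(f_stat_r2(r2_ur=0.41, r2_r=0.35, q=2, n=572, k=3))   # illustrative numbers
```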

4.5 F and p-values
-similar to t tests, F tests can produce p-values, which are defined as:

$$p\text{-value} = P(F' > F) \qquad (4.43)$$

-where F' denotes an F-distributed random variable with (q, n−k−1) degrees of freedom and F is the observed test statistic
-the p-value is the “probability of observing a value of F at least as large as we did, given that the null hypothesis is true”
-a small p-value is therefore evidence against H0
-as before, reject H0 if p < α
-p-values can give us a more complete view of significance
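Computing the p-value from an observed F is a one-line survival-function call; a sketch with scipy, reusing the reading-week numbers:

```python
from scipy import stats

F_obs, q, df_denom = 68.5, 2, 568
p_value = stats.f.sf(F_obs, q, df_denom)   # P(F-distributed variable >= F_obs)
print(p_value)                             # essentially zero, so H0 is rejected at any usual alpha
```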

4.5 Overall significance
-Often it is valid to test if the model is significant overall
-the hypothesis that NONE of the explanatory variables have an effect on y is given as:

$$H_0: \beta_1 = \beta_2 = \ldots = \beta_k = 0 \qquad (4.44)$$

-as before with multiple restrictions, we compare against the restricted model:

$$y = \beta_0 + u \qquad (4.45)$$

4.5 Overall significance
-Since our restricted model has no independent variables, its R^2 is zero and our F formula simplifies to:

$$F = \frac{R^2/k}{(1 - R^2)/(n-k-1)} \qquad (4.46)$$

-Which is only valid for this special test
-this test determines the OVERALL SIGNIFICANCE OF THE REGRESSION
-if this test fails (H0 is not rejected), we need to find other explanatory variables
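Most econometrics packages report this overall F automatically; a hedged sketch with statsmodels and simulated data, also checking it against (4.46):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated data purely for illustration
rng = np.random.default_rng(1)
n = 100
df = pd.DataFrame({"x1": rng.normal(size=n), "x2": rng.normal(size=n)})
df["y"] = 2 + 0.5 * df.x1 + rng.normal(size=n)

res = smf.ols("y ~ x1 + x2", data=df).fit()

# The overall-significance F test (H0: all slope coefficients are zero)
# is reported automatically by the package
print(res.fvalue, res.f_pvalue)

# Hand calculation with (4.46) gives the same number
k = 2
F = (res.rsquared / k) / ((1 - res.rsquared) / (n - k - 1))
print(F)
```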

4.5 Testing General Linear Restrictions
-Sometimes economic theory (generally using elasticity) requires us to test complicated joint restrictions, such as:

$$H_0: \beta_1 = 0,\; \beta_2 = 1,\; \beta_3 = 2$$

-Which expects our model:

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + u$$

-To be of the form:

$$y = \beta_0 + 0 \cdot x_1 + 1 \cdot x_2 + 2 \cdot x_3 + u$$

4.5 Testing General Linear Restrictions
-We rewrite this expected model to obtain a restricted model:

$$y - x_2 - 2x_3 = \beta_0 + u$$

-We then calculate the F statistic using the SSR formula
-note that since the dependent variable changes between the two models, the R^2 F formula is not valid in this case
-note that the number of restrictions (q) is simply equal to the number of equals signs in the null hypothesis
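A sketch of both routes in Python/statsmodels (simulated data; the restriction values follow the example above): the restricted regression with the re-defined dependent variable, and the package's built-in F test for comparison:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated data roughly consistent with H0: b1 = 0, b2 = 1, b3 = 2 (illustration only)
rng = np.random.default_rng(2)
n = 300
df = pd.DataFrame({"x1": rng.normal(size=n),
                   "x2": rng.normal(size=n),
                   "x3": rng.normal(size=n)})
df["y"] = 1 + 0 * df.x1 + 1 * df.x2 + 2 * df.x3 + rng.normal(size=n)

ur = smf.ols("y ~ x1 + x2 + x3", data=df).fit()   # unrestricted model

# Restricted model: impose the hypothesized values and move the known terms left
df["y_r"] = df.y - df.x2 - 2 * df.x3
r = smf.ols("y_r ~ 1", data=df).fit()             # intercept-only regression

# SSR form of F (the R-squared form is invalid here: the dependent variable changed)
q, k = 3, 3
F = ((r.ssr - ur.ssr) / q) / (ur.ssr / (n - k - 1))
print(F)
print(ur.f_test("x1 = 0, x2 = 1, x3 = 2"))        # same F, computed by the package
```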

4.6 Reporting Regression Results
-When reporting a single regression, the proper reporting method is:

$$\widehat{\ln(Taste_i)} = \underset{(0.9)}{3.7} + \underset{(0.15)}{0.2}\,\ln(Time_i) + \underset{(0.78)}{1.4}\,\ln(Skill_i) \qquad N = 431,\; R^2 = 0.41$$

-where R^2, estimated coefficients, and N MUST be reported (note also the ^ and the i subscripts)
-either standard errors or t-values must also be reported (se is more robust for tests other than β_k = 0)
-SSR and the standard error of the regression can also be reported

4.6 Reporting Regression Results
-When multiple, related regressions are run (often to test for joint significance), the results can be expressed in table format, as seen on the next slide
-whether a simple or a table reporting method is used, the meanings and scaling of all the included variables must always be explained in a proper project
-e.g., Price: average price, measured weekly, in American dollars; College: dummy variable, 0 if no college education, 1 if college education

4.6 Reporting Regression Results

Dependent variable: Midterm readiness

Ind. variables     (1)            (2)
Study Time         0.47 (0.12)    -
Intellect          1.89 (1.7)     2.36 (1.4)
Intercept          2.5 (0.03)     2.8 (0.02)
Observations       33             33
R^2                0.48           0.34