4.3 Confidence Intervals

Using our CLM assumptions, we can construct CONFIDENCE INTERVALS or CONFIDENCE INTERVAL ESTIMATES of the form:

CI = β̂j ± t*·se(β̂j)
Given a significance level α (which is used to determine t*), we construct 100(1−α)% confidence intervals.
Across random samples, 100(1−α)% of our confidence intervals contain the true value βj.
We don't know whether an individual confidence interval contains the true value.
4.3 Confidence Intervals

Confidence intervals are similar to two-tailed tests in that α/2 is in each tail when finding t*. If our hypothesis test and confidence interval use the same α:

1) we cannot reject the null hypothesis (at the given significance level) that βj = aj if aj is within the confidence interval
2) we can reject the null hypothesis (at the given significance level) that βj = aj if aj is not within the confidence interval
4.3 Confidence Example

Going back to our Pepsi example, we now look at geekiness:

Ĉool = 4.3 + 0.3·Geek + 0.5·Pepsi
      (2.1)  (0.25)     (0.21)
N = 43, R² = 0.62
From before, our two-sided t* with α = 0.01 was t* = 2.704; therefore our 99% CI is:

CI = β̂j ± t*·se(β̂j)
CI = 0.3 ± 2.704(0.25)
CI = [−0.376, 0.976]
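The interval arithmetic above, and its duality with a two-tailed test at the same α, can be checked directly. A minimal sketch using the slide's numbers (β̂ = 0.3, se = 0.25, t* = 2.704):

```python
# 99% CI for the Geek coefficient: CI = beta_hat +/- t* * se(beta_hat)
beta_hat, se, t_star = 0.3, 0.25, 2.704

ci_low = beta_hat - t_star * se
ci_high = beta_hat + t_star * se
print(round(ci_low, 3), round(ci_high, 3))  # -0.376 0.976

# Duality with the two-tailed test: H0: beta_j = 0 cannot be rejected
# at alpha = 0.01, because 0 lies inside the interval...
reject_by_ci = not (ci_low <= 0.0 <= ci_high)
# ...which agrees with |t| = 0.3/0.25 = 1.2 < 2.704.
reject_by_t = abs(beta_hat / se) > t_star
```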
4.3 Confidence Intervals

Remember that a CI is only as good as the 6 CLM assumptions:

1) Omitted variables cause the estimates (β̂j's) to be unreliable, so the CI is not valid
2) If heteroskedasticity is present, the standard error is not a valid estimate of the standard deviation, so the CI is not valid
3) If normality fails, the CI MAY not be valid if our sample size is too small
4.4 Complicated Single Tests

In this section we will see how to test a single hypothesis involving more than one βj. Take again our coolness regression:

Ĉool = 4.3 + 0.3·Geek + 0.5·Pepsi
      (2.1)  (0.25)     (0.21)
N = 43, R² = 0.62
If we wonder whether geekiness has more impact on coolness than Pepsi consumption:

H0: β1 = β2
Ha: β1 > β2
4.4 Complicated Single Tests

This test is similar to our one-coefficient tests, but our standard error will be different. We can rewrite our hypotheses for clarity:

H0: β1 − β2 = 0
Ha: β1 − β2 > 0
We can reject the null hypothesis if the estimated difference β̂1 − β̂2 is sufficiently positive.
4.4 Complicated Single Tests

Our new t statistic becomes:

t = (β̂1 − β̂2) / se(β̂1 − β̂2)

And our test continues as before:
1) Calculate t
2) Pick α and find t*
3) Reject H0 if t > t*
4.4 Complicated Standard Errors

The standard error in this test is more complicated than before. If we simply subtract standard errors, we may end up with a negative value, which is theoretically impossible: se must always be positive, since it estimates a standard deviation.
4.4 Complicated Standard Errors

Using the properties of variances, we know that:

Var(β̂1 − β̂2) = Var(β̂1) + Var(β̂2) − 2·Cov(β̂1, β̂2)

where the variances are always added and the covariance always subtracted. Transferring to standard errors, this becomes:

se(β̂1 − β̂2) = √( [se(β̂1)]² + [se(β̂2)]² − 2·s12 )

where s12 is an estimate of the covariance between the coefficients. s12 can either be calculated using matrix algebra or be supplied by econometrics programs.
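A small sketch of this formula, using made-up values for the standard errors and covariance (none of these numbers come from the slides):

```python
import math

# Hypothetical inputs: standard errors of beta1hat and beta2hat, and
# s12, the estimated covariance between the two estimates.
se1, se2 = 0.25, 0.21
s12 = 0.01

# se(beta1hat - beta2hat) = sqrt(se1^2 + se2^2 - 2*s12)
se_diff = math.sqrt(se1**2 + se2**2 - 2 * s12)

# Naively subtracting the standard errors (0.25 - 0.21 = 0.04) ignores
# the covariance term and could even come out negative.
naive = se1 - se2
```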
4.4 Complicated Standard Errors

To see how to find this standard error, take our typical regression:

y = β0 + β1x1 + β2x2 + β3x3 + u

and consider the related equation where θ = β1 − β2, or β1 = θ + β2:

y = β0 + (θ + β2)x1 + β2x2 + β3x3 + u
y = β0 + θx1 + β2(x1 + x2) + β3x3 + u

where x1 and x2 could be related concepts (e.g., sleep time and naps) and x3 could be relatively unrelated (e.g., study time).
4.4 Complicated Standard Errors

By running this new regression, we can find the standard error for our hypothesis test (using an econometric program is easier). Empirically:

1) β̂0 and se(β̂0) are the same for both regressions
2) β̂2 and β̂3 are the same for both regressions
3) only the coefficient on x1 changes: it is now θ̂, with standard error se(β̂1 − β̂2)

Given this new standard error, CIs are created as normal.
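The θ-regression trick can be verified numerically. This is a sketch on simulated data (all variable names and numbers are hypothetical, and the small `ols` helper is ours, not from the slides):

```python
import numpy as np

# Simulated data for y = b0 + b1*x1 + b2*x2 + b3*x3 + u.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)   # x1, x2 correlated (e.g. sleep, naps)
x3 = rng.normal(size=n)              # relatively unrelated (e.g. study time)
y = 1.0 + 0.8 * x1 + 0.3 * x2 + 0.5 * x3 + rng.normal(size=n)

def ols(X, y):
    """OLS coefficients and standard errors for y = X b + u."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ b
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return b, np.sqrt(np.diag(cov))

ones = np.ones(n)
# Original regression: y on (1, x1, x2, x3).
b, se = ols(np.column_stack([ones, x1, x2, x3]), y)
# Transformed regression: y on (1, x1, x1 + x2, x3). The coefficient
# on x1 is now theta = b1 - b2, with exactly the se we need.
bt, set_ = ols(np.column_stack([ones, x1, x1 + x2, x3]), y)

theta_hat, se_theta = bt[1], set_[1]
```

As the slide claims, the intercept (and its se) and the remaining slope estimates match across the two regressions; only the x1 coefficient changes, becoming θ̂ = β̂1 − β̂2.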
4.5 Testing Multiple Restrictions

Thus far we have tested whether a SINGLE variable is significant, and how two different variables' impacts compare. In this section we will test whether a SET of variables is significant, i.e., has a partial effect on the dependent variable. Even though a group of variables may be individually insignificant, they may be significant as a group due to multicollinearity.
4.5 Testing Multiple Restrictions

Consider our general true model and an example measuring reading week utility (rwu):

y = β0 + β1x1 + β2x2 + β3x3 + u
rwu = β0 + β1·ski + β2·trips + β3·homework + u

We want to test the hypothesis that β1 and β2 equal zero at the same time, that x1 and x2 have no partial effect simultaneously:

H0: β1 = 0, β2 = 0

In our example, we are testing that positive activities have no effect on r.w. utility.
4.5 Testing Multiple Restrictions

Our null hypothesis has two EXCLUSION RESTRICTIONS. This set of MULTIPLE RESTRICTIONS is tested using a MULTIPLE HYPOTHESIS TEST or JOINT HYPOTHESIS TEST. The alternate hypothesis is unique:

Ha: H0 is not true

Note that we CANNOT use individual t tests to test this multiple restriction; we need to test the restrictions jointly.
4.5 Testing Multiple Restrictions

To test joint significance, we need SSR and R-squared values obtained from two different regressions. We know that SSR increases and R² decreases when variables are dropped from the model. In order to conduct our test, we need to regress two models:

1) An UNRESTRICTED MODEL with all of the variables
2) A RESTRICTED MODEL that excludes the variables in the test
4.5 Testing Multiple Restrictions

Given a hypothesis test with q restrictions:

H0: βk−q+1 = 0, ....., βk = 0   (4.35)

we have the following regressions:

y = β0 + β1x1 + ... + βkxk + u   (4.34)
y = β0 + β1x1 + ... + βk−qxk−q + u   (4.36)

where 4.34 is the UNRESTRICTED MODEL, giving us SSRur, and 4.36 is the RESTRICTED MODEL, giving us SSRr.
4.5 Testing Multiple Restrictions

These SSR values combine to give us our F STATISTIC or TEST F STATISTIC:

F = [(SSRr − SSRur)/q] / [SSRur/(n−k−1)]   (4.37)

where q is the number of restrictions in the null hypothesis:
q = numerator degrees of freedom
n−k−1 = denominator degrees of freedom (the denominator of F is an unbiased estimator of σ²)

Since SSRr ≥ SSRur, F is always positive.
4.5 Testing Multiple Restrictions

One can think of our test F stat as measuring the relative increase in SSR when moving from the unrestricted model to the restricted model. A large F indicates that the excluded variables have much explanatory power. Using H0 and our CLM assumptions, we know that F has an F distribution with q and n−k−1 degrees of freedom:

F ~ F(q, n−k−1)

We obtain F* from F tables and reject H0 if F > F*.
4.5 Multiple Example

Given our previous example of reading week utility, an unrestricted and a restricted model give us:

r̂wu = 15.9 + 2.0·ski + 3.0·trips + 0.5·homework
      (4.3)  (0.9)     (1.3)       (0.12)
N = 572, SSRur = 141

r̂wu = 17.6 + 0.6·homework
      (6.3)  (0.17)
N = 572, SSRr = 175

which correspond to the hypotheses:

H0: β1 = 0, β2 = 0
Ha: H0 is not true
4.5 Multiple Example

We use these SSRs to construct a test statistic:

F = [(SSRr − SSRur)/q] / [SSRur/(n−k−1)]
F = [(175 − 141)/2] / [141/(572 − 3 − 1)] ≈ 68.5

Given α = 0.05, F*(2, 568) ≈ 3.00. Since F > F*, we reject H0 at the 5% significance level: positive activities have an impact on reading week utility.
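The F statistic for this example can be recomputed from the reported SSRs:

```python
# Reading-week example: SSR_r = 175 (restricted), SSR_ur = 141
# (unrestricted), q = 2 restrictions, n = 572, k = 3 regressors.
ssr_r, ssr_ur = 175.0, 141.0
q, n, k = 2, 572, 3

F = ((ssr_r - ssr_ur) / q) / (ssr_ur / (n - k - 1))
print(round(F, 1))  # 68.5

# The 5% critical value with (2, 568) degrees of freedom is about 3.00,
# so F is far above F* and H0 is rejected.
```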
4.5 Multiple Notes

Once the degrees of freedom in F's denominator reach about 120, the F distribution is no longer sensitive to them; hence the infinity entry in the F table.

If H0 is rejected, the variables in question are JOINTLY (STATISTICALLY) SIGNIFICANT at the given α level. If H0 is not rejected, the variables in question are JOINTLY INSIGNIFICANT at that α level.

Due to multicollinearity, an F test can often fail to reject even when individual t tests do reject.
4.5 F, t's secret identity?

The F statistic can also be used to test the significance of a single variable; in this case q = 1, and it can be shown that F = t²:

t²(n−k−1) ~ F(1, n−k−1)

This only applies to two-sided tests, so the t statistic is more flexible: it allows for one-sided tests. The t statistic is always best suited for testing a single hypothesis.
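The identity t² ~ F(1, df) can be illustrated by simulation. The construction mirrors the math: a t(df) variable is Z/√(V/df) with Z ~ N(0,1) and V ~ χ²(df), so squaring gives Z²/(V/df), and Z² is χ²(1), which is exactly an F(1, df) variable. A sketch with an arbitrary df of 30:

```python
import numpy as np

rng = np.random.default_rng(1)
df, n = 30, 200_000  # df = n - k - 1 in the regression setting

z = rng.standard_normal(n)
v = rng.chisquare(df, n)
t_sq = (z / np.sqrt(v / df)) ** 2      # squared t(df) draws

f = rng.chisquare(1, n) / (v / df)     # F(1, df) draws

# Same distribution, so matching upper quantiles (both near 2.042^2 = 4.17,
# the squared 97.5% t critical value for df = 30):
q95_t = np.quantile(t_sq, 0.95)
q95_f = np.quantile(f, 0.95)
```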
4.5 F tests and abuse

We have already seen that individually insignificant variables may be jointly significant due to multicollinearity. A significant variable can also prove jointly insignificant if grouped with enough insignificant variables, and an insignificant variable can appear significant if grouped with significant variables. Therefore t tests are much better than F tests at determining individual significance.
4.5 R² and F

While SSR can be large, R² is bounded, often making it an easier way to calculate F:

F = [(R²ur − R²r)/q] / [(1 − R²ur)/(n−k−1)]   (4.41)

This is called the R-SQUARED FORM OF THE F STATISTIC. Since R²ur ≥ R²r, F is still always positive. This form is NOT valid for testing all linear restrictions (as seen later).
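When both models share the same dependent variable, R² = 1 − SSR/SST with a common SST, so the two forms of F are algebraically identical. A quick check using the reading-week SSRs and an arbitrary SST (only the ratios matter):

```python
# Illustrative numbers: SSTs and SSRs chosen so that
# R^2 = 1 - SSR/SST with a common SST for both models.
sst = 200.0
ssr_ur, ssr_r = 141.0, 175.0
q, n, k = 2, 572, 3

r2_ur = 1 - ssr_ur / sst
r2_r = 1 - ssr_r / sst

f_ssr = ((ssr_r - ssr_ur) / q) / (ssr_ur / (n - k - 1))       # SSR form
f_r2 = ((r2_ur - r2_r) / q) / ((1 - r2_ur) / (n - k - 1))     # R^2 form
# The two forms agree to floating-point precision.
```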
4.5 F and p-values

Similar to t tests, F tests can produce p-values, defined as:

p-value = P(F′ > F)   (4.43)

where F′ denotes an F-distributed random variable. The p-value is the "probability of observing a value of F at least as large as we did, given that the null hypothesis is true". A small p-value is therefore evidence against H0: as before, reject H0 if p < α. p-values can give us a more complete view of significance.
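A sketch of the p-value computation for the reading-week example, using SciPy's F distribution (the survival function `sf` gives the upper-tail probability directly):

```python
from scipy import stats

# p-value for F = 68.5 with (q, n-k-1) = (2, 568) degrees of freedom:
# the probability, under H0, of an F at least this large.
p_value = stats.f.sf(68.5, 2, 568)

# p is far below alpha = 0.05, so H0 is rejected, matching the
# critical-value comparison F > F* = 3.00.
```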
4.5 Overall significance

Often it is valid to test whether the model is significant overall. The hypothesis that NONE of the explanatory variables have an effect on y is given as:

H0: β1 = β2 = .... = βk = 0   (4.44)

As before with multiple restrictions, we compare against the restricted model:

y = β0 + u   (4.45)
4.5 Overall significance

Since our restricted model has no independent variables, its R² is zero and our F formula simplifies to:

F = (R²/k) / [(1 − R²)/(n−k−1)]   (4.46)

which is only valid for this special test. This test determines the OVERALL SIGNIFICANCE OF THE REGRESSION; if this test fails, we need to find other explanatory variables.
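Applying formula 4.46 to the coolness regression from earlier (R² = 0.62, N = 43, k = 2):

```python
# Overall-significance F for the coolness regression:
r2, n, k = 0.62, 43, 2

F = (r2 / k) / ((1 - r2) / (n - k - 1))
print(round(F, 1))  # 32.6

# The 5% critical value F*(2, 40) is roughly 3.2, so the regression
# is overall significant.
```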
4.5 Testing General Linear Restrictions

Sometimes economic theory (generally using elasticity) requires us to test complicated joint restrictions, such as:

H0: β1 = 0, β2 = 1, β3 = 2

which expects our model

y = β0 + β1x1 + β2x2 + β3x3 + u

to be of the form:

y = β0 + 0·x1 + 1·x2 + 2·x3 + u
4.5 Testing General Linear Restrictions

We rewrite this expected model to obtain a restricted model:

y − x2 − 2x3 = β0 + u

We then calculate the F statistic using the SSR formula. Note that since the dependent variable changes between the two models, the R² form of the F formula is not valid in this case. Note also that the number of restrictions (q) is simply equal to the number of equals signs in the null hypothesis.
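A sketch of this procedure on simulated data (all numbers hypothetical, with the restrictions true by construction; the `ssr` helper is ours): the restricted regression uses the transformed dependent variable and, here, only a constant, and F is built from the SSR form with q = 3.

```python
import numpy as np

# Testing H0: beta1 = 0, beta2 = 1, beta3 = 2 via the SSR form of F.
rng = np.random.default_rng(2)
n = 300
x1, x2, x3 = (rng.normal(size=n) for _ in range(3))
y = 4.0 + 0.0 * x1 + 1.0 * x2 + 2.0 * x3 + rng.normal(size=n)  # H0 true

def ssr(X, y):
    """Sum of squared residuals from OLS of y on X."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ b
    return r @ r

ones = np.ones((n, 1))
# Unrestricted: y on (1, x1, x2, x3).
ssr_ur = ssr(np.column_stack([ones, x1, x2, x3]), y)
# Restricted: impose the restrictions, leaving y - x2 - 2*x3 = b0 + u,
# i.e. a regression of the new dependent variable on a constant only.
ssr_r = ssr(ones, y - x2 - 2 * x3)

q, k = 3, 3
F = ((ssr_r - ssr_ur) / q) / (ssr_ur / (n - k - 1))
# With H0 true, F should be a typical draw from F(3, 296), i.e. modest.
```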
4.6 Reporting Regression Results

When reporting single regressions, the proper reporting method is:

ln(T̂ea_i) = 3.7 + 0.2·ln(Time_i) + 1.4·ln(Skill_i)
           (0.9)  (0.15)          (0.78)
N = 431, R² = 0.41

R², the estimated coefficients, and N MUST be reported (note also the ^ and the i subscripts). Either standard errors or t-values must also be reported (se is more robust for tests other than βk = 0). SSR and the standard error of the regression can also be reported.
4.6 Reporting Regression Results

When multiple, related regressions are run (often to test for joint significance), the results can be expressed in table format, as seen on the next slide. Whether a simple or table reporting method is used, the meanings and scaling of all the included variables must always be explained in a proper project. E.g.:

Price: average price, measured weekly, in American dollars
College: dummy variable; 0 if no college education, 1 if college education
4.6 Reporting Regression Results

Dependent variable: Midterm readiness

Ind. variables    (1)            (2)
Study Time        0.47 (0.12)    -
Intellect         1.89 (1.7)     2.36 (1.4)
Intercept         2.5 (0.03)     2.8 (0.02)
Observations      33             33
R²                0.48           0.34