Topic 4: Statistical Inference
Outline
• Statistical inference
– confidence intervals
– significance tests
• Statistical inference for β1
• Statistical inference for β0
• Tower of Pisa example
Theory for Statistical Inference
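• Xi iid Normal(μ, σ²), parameters unknown
$$\bar{X} = \frac{\sum X_i}{n}, \qquad s^2 = \frac{\sum (X_i - \bar{X})^2}{n-1}$$
$$s^2(\bar{X}) = \frac{s^2}{n}, \qquad s(\bar{X}) = \sqrt{s^2(\bar{X})} = \frac{s}{\sqrt{n}}$$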
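As a minimal sketch of these summary formulas, the Python below computes the sample mean, the sample variance with divisor n-1, and the standard error of the mean using only the standard library. The data values are made up for illustration and are not from the course.

```python
import math
import statistics

# Hypothetical sample, chosen only to illustrate the formulas.
x = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n = len(x)

xbar = statistics.mean(x)        # sample mean: sum(X_i) / n
s2 = statistics.variance(x)      # sample variance, divisor n - 1
s2_xbar = s2 / n                 # estimated variance of the sample mean
s_xbar = math.sqrt(s2_xbar)      # standard error of the mean, s / sqrt(n)

print(f"xbar = {xbar:.4f}, s^2 = {s2:.4f}, s(xbar) = {s_xbar:.4f}")
```

Note that `statistics.variance` already uses the n-1 divisor, so no correction is needed before dividing by n.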
Theory for Statistical Inference
• Consider the variable
$$t = \frac{\bar{X} - \mu}{s(\bar{X})}$$
• t is distributed as t(n-1)
• Use this distribution in inference for μ
– confidence intervals
– significance tests
Confidence Intervals
$$\bar{X} \pm t_c\, s(\bar{X})$$
where tc = t(1-α/2, n-1), the upper (1-α/2)100 percentile of the t distribution with n-1 degrees of freedom
• 1-α is the confidence level
Confidence Intervals
• $\bar{X}$ is the sample mean (center of the interval)
• $s(\bar{X})$ is the estimated standard deviation of $\bar{X}$, sometimes called the standard error of the mean
• $t_c\, s(\bar{X})$ is the margin of error and describes the precision of the estimate
Confidence Intervals
• The procedure is such that (1-α)100% of the time, the interval will contain the true mean
• Do not know whether a single interval is one that contains the mean or not
• Confidence describes “long-run” behavior of procedure
• If the data are non-Normal, the procedure is only approximate (justified by the central limit theorem)
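The "long-run" interpretation can be checked by simulation. The sketch below repeatedly draws Normal samples, forms the t interval for each, and counts how often the interval contains the true mean. The values of `MU`, `SIGMA`, `N`, and `REPS` are arbitrary choices for illustration, and the critical value 2.262 is t(0.975, 9) read from a t table.

```python
import math
import random
import statistics

random.seed(1)                    # fixed seed so the run is repeatable

MU, SIGMA = 50.0, 10.0            # hypothetical "true" parameters
N = 10                            # sample size, so df = N - 1 = 9
T_C = 2.262                       # t(0.975, 9) from a t table, alpha = 0.05
REPS = 4000                       # number of simulated samples


def ci_for_mean(sample, t_c):
    """Return (lower, upper) for xbar +/- t_c * s(xbar)."""
    xbar = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(len(sample))
    return xbar - t_c * se, xbar + t_c * se


covered = 0
for _ in range(REPS):
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    lower, upper = ci_for_mean(sample, T_C)
    covered += lower <= MU <= upper   # True counts as 1

coverage = covered / REPS
print(f"fraction of intervals containing mu: {coverage:.3f}")
```

The printed fraction should be close to 0.95, although any single interval either contains μ or does not.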
Significance tests
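$$H_0: \mu = \mu_0 \quad \text{vs} \quad H_a: \mu \ne \mu_0$$
$$t^* = \frac{\bar{X} - \mu_0}{s(\bar{X})}$$
$$\text{Reject } H_0 \text{ if } |t^*| \ge t_c, \quad t_c = t(1-\alpha/2,\, n-1)$$
$$P = \text{Prob}(|t| > |t^*|), \quad \text{where } t \sim t(n-1)$$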
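A rough sketch of this test in Python, using only the standard library: the two-sided P-value is obtained by numerically integrating the t density with Simpson's rule, which is one possible approach and not necessarily how statistical software computes it. The function names, the sample, and the null value 5.0 are all hypothetical.

```python
import math
import statistics


def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t_star, df, steps=2000):
    """P(|t| > |t_star|) for t ~ t(df), via Simpson's rule on [0, |t_star|]."""
    b = abs(t_star)
    if b == 0:
        return 1.0
    h = b / steps                          # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    central_area = 2 * total * h / 3       # P(-b < t < b), using symmetry
    return 1 - central_area


def one_sample_t(sample, mu0):
    """Return (t_star, df, p_value) for H0: mu = mu0 vs Ha: mu != mu0."""
    n = len(sample)
    se = statistics.stdev(sample) / math.sqrt(n)
    t_star = (statistics.mean(sample) - mu0) / se
    return t_star, n - 1, two_sided_p(t_star, n - 1)


# Hypothetical data and null value, for illustration only.
sample = [5.1, 4.9, 5.6, 5.8, 6.0, 5.2, 5.7, 6.1, 5.4, 5.5]
t_star, df, p = one_sample_t(sample, mu0=5.0)
print(f"t* = {t_star:.3f}, df = {df}, P = {p:.4f}, reject H0 at 0.05: {p <= 0.05}")
```

Rejecting when P ≤ α is equivalent to rejecting when |t*| ≥ tc, so the two forms of the decision rule always agree.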
Significance tests
• Under H0, t* has the t(n-1) distribution
• P(reject H0 | H0 true) = α (Type I error)
• Under Ha, t* has a noncentral t(n-1) distribution
• P(do not reject H0 | Ha true) = β (Type II error)
• The Type II error rate determines the power of the test: power = 1 - β
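The two error rates can be estimated by simulation. The sketch below, under assumed settings (n = 10, σ = 1, and an alternative mean one σ away from the null value), counts how often the two-sided t test rejects when H0 is true and when Ha is true. The critical value 2.262 is t(0.975, 9) from a t table; all other names and values are hypothetical.

```python
import math
import random
import statistics

random.seed(2)      # fixed seed so the run is repeatable

N = 10              # sample size, df = 9
T_C = 2.262         # t(0.975, 9) from a t table, alpha = 0.05
MU0 = 0.0           # value of mu under H0
REPS = 4000         # simulated samples per scenario


def rejection_rate(true_mu, sigma=1.0):
    """Fraction of simulated samples in which |t*| >= T_C."""
    rejections = 0
    for _ in range(REPS):
        sample = [random.gauss(true_mu, sigma) for _ in range(N)]
        se = statistics.stdev(sample) / math.sqrt(N)
        t_star = (statistics.mean(sample) - MU0) / se
        rejections += abs(t_star) >= T_C
    return rejections / REPS


type_1_rate = rejection_rate(true_mu=MU0)   # H0 true: should be near alpha
power = rejection_rate(true_mu=1.0)         # Ha true: hypothetical shift of one sigma
print(f"estimated Type I error rate: {type_1_rate:.3f}")
print(f"estimated power: {power:.3f}, estimated Type II error rate: {1 - power:.3f}")
```

Increasing `N` or the size of the shift raises the estimated power, while the Type I error rate stays near α.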
NOTE
IN THIS COURSE USE α=.05
UNLESS SPECIFIED OTHERWISE
Theory for β1 Inference
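$$b_1 \sim N(\beta_1, \sigma^2(b_1)) \quad \text{where} \quad \sigma^2(b_1) = \frac{\sigma^2}{\sum (X_i - \bar{X})^2}$$
$$t^* = \frac{b_1 - \beta_1}{s(b_1)} \quad \text{where} \quad s^2(b_1) = \frac{s^2}{\sum (X_i - \bar{X})^2}$$
$$\text{Under } H_0,\ t^* \sim t(n-2)$$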
Confidence Interval for β1
b1 ± tc s(b1) where tc = t(1-α/2, n-2), the upper (1-α/2)100 percentile of the t distribution with n-2 degrees of freedom
• 1-α is the confidence level
Significance tests for β1
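$$H_0: \beta_1 = 0 \quad \text{vs} \quad H_a: \beta_1 \ne 0$$
$$t^* = \frac{b_1 - 0}{s(b_1)}$$
$$\text{Reject } H_0 \text{ if } |t^*| \ge t_c, \quad t_c = t(1-\alpha/2,\, n-2)$$
$$P = \text{Prob}(|t| > |t^*|), \quad \text{where } t \sim t(n-2)$$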
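The slope inference can be sketched end to end in a few lines of Python. The data below are invented for illustration, and the code assumes the usual definition s² = SSE/(n-2) for the estimate of σ². The critical value 3.182 is t(0.975, 3) from a t table, matching n = 5.

```python
import math

# Hypothetical (x, y) data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)
T_C = 3.182                      # t(0.975, n - 2 = 3) from a t table

xbar = sum(x) / n
ybar = sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))

b1 = sxy / sxx                   # least squares slope
b0 = ybar - b1 * xbar            # least squares intercept, needed for residuals
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s2 = sse / (n - 2)               # assumed estimate of sigma^2 (MSE)

s_b1 = math.sqrt(s2 / sxx)       # s(b1)
t_star = (b1 - 0) / s_b1         # test statistic for H0: beta1 = 0
ci = (b1 - T_C * s_b1, b1 + T_C * s_b1)

print(f"b1 = {b1:.4f}, s(b1) = {s_b1:.4f}, t* = {t_star:.2f}")
print(f"95% CI for beta1: ({ci[0]:.4f}, {ci[1]:.4f})")
print(f"reject H0: {abs(t_star) >= T_C}")
```

For these made-up data the slope is large relative to its standard error, so H0: β1 = 0 is rejected and the interval excludes 0.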
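Theory for β0 Inference
$$b_0 \sim N(\beta_0, \sigma^2(b_0)) \quad \text{where} \quad \sigma^2(b_0) = \sigma^2\left[\frac{1}{n} + \frac{\bar{X}^2}{\sum (X_i - \bar{X})^2}\right]$$
$$t^* = \frac{b_0 - \beta_0}{s(b_0)}$$
for s(b0), replace σ² by s² and take the square root
$$\text{Under } H_0,\ t^* \sim t(n-2)$$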
Confidence Interval for β0
b0 ± tc s(b0) where tc = t(1-α/2, n-2), the upper (1-α/2)100 percentile of the t distribution with n-2 degrees of freedom
• 1-α is the confidence level
Significance tests for β0
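$$H_0: \beta_0 = 0 \quad \text{vs} \quad H_a: \beta_0 \ne 0$$
$$t^* = \frac{b_0 - 0}{s(b_0)}$$
$$\text{Reject } H_0 \text{ if } |t^*| \ge t_c, \quad t_c = t(1-\alpha/2,\, n-2)$$
$$P = \text{Prob}(|t| > |t^*|), \quad \text{where } t \sim t(n-2)$$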
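A comparable sketch for the intercept follows. It uses invented data and the assumed definition s² = SSE/(n-2), and it builds s(b0) from the variance formula for b0 with σ² replaced by s². The critical value 3.182 is t(0.975, 3) from a t table.

```python
import math

# Hypothetical (x, y) data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)
T_C = 3.182                      # t(0.975, n - 2 = 3) from a t table

xbar = sum(x) / n
ybar = sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
b0 = ybar - b1 * xbar            # least squares intercept

sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s2 = sse / (n - 2)               # assumed estimate of sigma^2 (MSE)

# s^2(b0) = s^2 * [1/n + xbar^2 / sum((X_i - xbar)^2)], then take the square root
s_b0 = math.sqrt(s2 * (1 / n + xbar ** 2 / sxx))
t_star = (b0 - 0) / s_b0         # test statistic for H0: beta0 = 0
ci = (b0 - T_C * s_b0, b0 + T_C * s_b0)

print(f"b0 = {b0:.4f}, s(b0) = {s_b0:.4f}, t* = {t_star:.2f}")
print(f"95% CI for beta0: ({ci[0]:.4f}, {ci[1]:.4f})")
print(f"reject H0: {abs(t_star) >= T_C}")
```

Here the interval contains 0 and H0: β0 = 0 is not rejected, which shows how the same data can give a sharp answer about the slope and a vague one about the intercept.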
Notes
• The normality of b0 and b1 follows from the fact that each is a linear combination of the Yi, which are independent Normal random variables
• For b1 see KNNL p. 42
• For b0 try this as an exercise
Notes
• Usually the CI and significance test for β0 are not of interest
• If the εi are not Normal but are relatively symmetric, then the CIs and significance tests are reasonable approximations
Notes
• These procedures can easily be modified to produce one-sided confidence intervals and significance tests
• Because
$$\sigma^2(b_1) = \frac{\sigma^2}{\sum (X_i - \bar{X})^2}$$
we can make this quantity small by making
$$\sum_{i=1}^{n} (X_i - \bar{X})^2$$
large.
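To see the effect of the X spacing numerically, the sketch below compares σ²(b1) for two hypothetical designs with the same number of observations. Both sets of X values and the choice σ² = 1 are assumptions made only for illustration.

```python
def sum_sq_dev(xs):
    """Sum of squared deviations of the X values from their mean."""
    xbar = sum(xs) / len(xs)
    return sum((x - xbar) ** 2 for x in xs)


def var_b1(xs, sigma2=1.0):
    """sigma^2(b1) = sigma^2 / sum((X_i - xbar)^2)."""
    return sigma2 / sum_sq_dev(xs)


# Two hypothetical designs, five observations each.
clustered = [4.0, 4.5, 5.0, 5.5, 6.0]    # X values bunched near the middle
spread = [0.0, 0.0, 5.0, 10.0, 10.0]     # X values pushed toward the ends of a wider range

for name, xs in [("clustered", clustered), ("spread", spread)]:
    print(f"{name}: sum of squares = {sum_sq_dev(xs):.2f}, var(b1) = {var_b1(xs):.4f}")
```

With σ² held fixed, the spread-out design gives a variance for b1 that is 40 times smaller than the clustered design.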
SAS Proc Reg
proc reg data=a1;
  model lean=year / clb;
run;
clb option generates confidence intervals
Parameter Estimates
Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|   95% Confidence Limits
Intercept    1            -61.12088         25.12982     -2.43     0.0333   -116.43124    -5.81052
year         1              9.31868          0.30991     30.07     <.0001      8.63656    10.00080
The CIs are given under 95% Confidence Limits. The CI for the intercept is uninteresting.
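The reported limits can be reproduced from the estimates and standard errors in the Parameter Estimates output. The sketch below does this in Python, assuming 11 error degrees of freedom. That value is an inference from the width of the reported intervals, since the sample size is not stated in the output, and the critical value 2.200985 is t(0.975, 11).

```python
# Assumed critical value: t(0.975, 11). The 11 error df is inferred, not stated in the output.
T_C = 2.200985

# (estimate, standard error) copied from the Parameter Estimates output
estimates = {
    "Intercept": (-61.12088, 25.12982),
    "year": (9.31868, 0.30991),
}

limits = {}
for name, (b, se) in estimates.items():
    t_value = b / se                              # t* for H0: parameter = 0
    limits[name] = (b - T_C * se, b + T_C * se)   # b +/- tc * s(b)
    lower, upper = limits[name]
    print(f"{name}: t = {t_value:.2f}, 95% CI = ({lower:.5f}, {upper:.5f})")
```

The computed limits agree with the SAS output to about four decimal places, with the small differences coming from rounding in the printed estimates and standard errors.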
Review
• What is the default value of α that we will use in this class?
• What is the default confidence level that we use in this class?
• Suppose you could choose the X’s. How would you choose them if you wanted a precise estimate of the slope? intercept? both?
Background Reading
• Chapter 2
– 2.3: Considerations
• Chapter 16
– 16.10: Planning sample sizes with power
• Appendix A.6