Bootstrap Confidence Intervals
ST552 Lecture 13
Charlotte Wickham
2019-02-11

Motivation
The inferences we’ve covered so far relied on our assumption of Normal errors:
$\varepsilon \sim N(0, \sigma^2 I_{n \times n})$
For example, we’ve seen that under this assumption the least squares estimates are also Normally distributed:
$\hat{\beta} \sim N(\beta, \sigma^2 (X^T X)^{-1})$
If the errors aren’t truly Normally distributed, what distribution do the estimates have?

Warm-up: Your Turn
Imagine the errors are in fact $t_3$ distributed.
With your neighbour: design a simulation to understand the distribution of the least squares estimates.

Example: 1. Fix n, fix X
[Figure: Response plotted against x]

Example: 1. Fix β, find y
[Figure: Response plotted against x]

Example: 2. Simulate errors, find y
[Figure: Response plotted against x]

Example: 3. Find least squares line
[Figure: Response plotted against x]

Example: 4. Repeat #2. and #3. many times
[Figure: Response plotted against x]

Example: Examine distribution of estimates
[Figure: two density plots, "Estimate of intercept" and "Estimate of slope"]

Example: Compared to theory
[Figure: two density plots, "Estimate of intercept" and "Estimate of slope"]
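The simulation in steps 1 to 4, and the comparison to theory, can be sketched as follows. This is a minimal sketch, not the code behind the slides: the sample size, the x values, the "true" β, the number of repetitions and the seed are all assumptions chosen for illustration. The theoretical standard deviations use σ² = 3, the variance of a t distribution with 3 degrees of freedom.

```python
import numpy as np

rng = np.random.default_rng(552)  # arbitrary seed, for reproducibility only

# 1. Fix n, X and beta (all values here are illustrative assumptions)
n = 20
x = np.arange(1, n + 1, dtype=float)
X = np.column_stack([np.ones(n), x])  # design matrix with an intercept column
beta = np.array([2.0, 0.4])           # assumed "true" intercept and slope
mean_y = X @ beta                     # the part of y that does not change

# 2.-4. Simulate t_3 errors, find y, fit least squares; repeat many times
n_sim = 5000
estimates = np.empty((n_sim, 2))
for k in range(n_sim):
    eps = rng.standard_t(df=3, size=n)
    y = mean_y + eps
    estimates[k] = np.linalg.lstsq(X, y, rcond=None)[0]

# Examine the distribution of the estimates and compare to theory:
# sd implied by sigma^2 (X^T X)^{-1}, with sigma^2 = Var(t_3) = 3
sim_mean = estimates.mean(axis=0)
sim_sd = estimates.std(axis=0, ddof=1)
theory_sd = np.sqrt(3.0 * np.diag(np.linalg.inv(X.T @ X)))

print("simulated mean:", sim_mean)
print("simulated sd:  ", sim_sd)
print("theoretical sd:", theory_sd)
```

A histogram of each column of `estimates`, overlaid with a Normal density using `beta` and `theory_sd`, would give plots in the spirit of the two slides above.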

When the errors aren’t Normal: CLT
Think of our estimates as linear combinations of the errors, i.e. a sort of average of i.i.d. random variables. Some version of the Central Limit Theorem will apply.
For large samples, even when the errors aren’t Normal, approximately
$\hat{\beta} \mathrel{\dot\sim} N(\beta, \sigma^2 (X^T X)^{-1})$
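To see why the estimates are linear combinations of the errors, substitute the model $y = X\beta + \varepsilon$ into the least squares formula. This short derivation assumes X is fixed and has full column rank:

```latex
\begin{aligned}
\hat{\beta} &= (X^T X)^{-1} X^T y \\
            &= (X^T X)^{-1} X^T (X\beta + \varepsilon) \\
            &= \beta + (X^T X)^{-1} X^T \varepsilon
\end{aligned}
```

Each element of $\hat{\beta} - \beta$ is therefore a weighted sum of the i.i.d. errors $\varepsilon_1, \ldots, \varepsilon_n$, with weights determined entirely by X.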

Summary so far
If we knew the error distribution and the true parameters, we could use simulation to understand the sampling distribution of the least squares estimates.
Simulation can also be used to demonstrate the CLT at work in regression.

Bootstrap confidence intervals
In practice, with data in front of us, we don’t know the distribution of the errors (nor the true parameter values).
The bootstrap is one approach to estimating the sampling distribution of $\hat{\beta}$: use the simulation idea, substituting in our best guesses for the things we don’t know.

Bootstrapping regression
(Model-based resampling)
0. Fit the model and find the estimates, $\hat{\beta}$, and the residuals, $e_i$.
1. Fix $X$.
2. For $k = 1, \ldots, B$:
   i. Generate errors, $\varepsilon^*_i$, sampled with replacement from the $e_i$.
   ii. Construct $y^*$ using the model, $y^* = \hat{y} + \varepsilon^*$.
   iii. Use least squares to find $\hat{\beta}^{*(k)}$.
3. Examine the distribution of $\hat{\beta}^*$ and compare to $\hat{\beta}$.
One confidence interval for $\beta_j$ is given by the 2.5% and 97.5% quantiles of the distribution of $\hat{\beta}^*_j$. (This is known as the percentile method; there are other (better?) methods.)
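The model-based resampling steps above can be sketched in a few lines. This is an illustrative sketch under stated assumptions: the data are simulated stand-ins (not a real data set), and B, the seed and the variable names are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(13)  # arbitrary seed

# Hypothetical data standing in for a real data set
n = 30
x = rng.uniform(0, 10, size=n)
y = 1.0 + 0.5 * x + rng.standard_t(df=3, size=n)
X = np.column_stack([np.ones(n), x])

# 0. Fit the model: estimates, fitted values and residuals
beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
fitted = X @ beta_hat
resid = y - fitted

# 1. X stays fixed.  2. For k = 1, ..., B:
B = 2000
beta_star = np.empty((B, X.shape[1]))
for k in range(B):
    eps_star = rng.choice(resid, size=n, replace=True)        # i.   resample residuals
    y_star = fitted + eps_star                                # ii.  bootstrapped response
    beta_star[k] = np.linalg.lstsq(X, y_star, rcond=None)[0]  # iii. refit

# 3. Percentile method: 2.5% and 97.5% quantiles of each column
ci = np.percentile(beta_star, [2.5, 97.5], axis=0)  # row 0 = lower, row 1 = upper
print("estimates:", beta_hat)
print("95% percentile intervals (columns = intercept, slope):")
print(ci)
```

Only the response is regenerated in each replicate; the design matrix `X` is reused unchanged, which is what "fix X" means in step 1.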

Example: Faraway Galapagos Islands
(I’ll illustrate with simple linear regression; Faraway does the multiple regression case in 3.6.)
[Figure: Species plotted against Elevation, observed data]

Bootstrap: 1. Find $\hat{\beta}$, $\hat{y}$, and $e_i$.
[Figure: Species plotted against Elevation, observed data]

Bootstrap: Using fixed $X$ and $\hat{\beta}$ from the observed data
[Figure: Species plotted against Elevation, observed data]

Bootstrap: 2. Resample residuals to construct bootstrapped response
[Figure: Species plotted against Elevation, bootstrapped data]

Bootstrap: 3. Fit regression model to bootstrapped response
[Figure: Species plotted against Elevation, bootstrapped data]

Bootstrap: 4. Repeat #2. and #3. many times
[Figure: Species plotted against Elevation]

Examine distribution of estimates
[Figure: two density plots, "Bootstrap estimates of intercept" and "Bootstrap estimates of slope"]

High level: bootstrap idea
We don’t know the distribution of the errors, but our best guess is probably the empirical c.d.f. of the residuals.
Sampling from a random variable whose c.d.f. is the empirical c.d.f. of the residuals boils down to sampling with replacement from the residuals.
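A small numerical check of this equivalence, using inverse-c.d.f. sampling: for a uniform draw u, the inverse of the empirical c.d.f. returns one of the residuals, each with probability 1/n. The residual values below are made up for illustration, and the seed and number of draws are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed

resid = np.array([-1.2, -0.4, 0.1, 0.3, 1.2])  # hypothetical residuals
n = len(resid)
sorted_resid = np.sort(resid)
n_draws = 10_000

# Inverse-c.d.f. sampling from the empirical c.d.f.: for u in [0, 1),
# floor(n * u) picks the order statistic whose c.d.f. step contains u.
# The minimum guards against n * u rounding up to n in floating point.
u = rng.uniform(size=n_draws)
idx = np.minimum(np.floor(n * u).astype(int), n - 1)
draws_ecdf = sorted_resid[idx]

# Direct sampling with replacement from the residuals
draws_direct = rng.choice(resid, size=n_draws, replace=True)

# Both methods should return each residual about 1/n of the time
freq_ecdf = np.array([np.mean(draws_ecdf == r) for r in sorted_resid])
freq_direct = np.array([np.mean(draws_direct == r) for r in sorted_resid])
print("inverse-c.d.f. frequencies:", freq_ecdf)
print("resampling frequencies:    ", freq_direct)
```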

Limitations
We might rely on bootstrap confidence intervals when we are worried about the assumption of Normal errors. But there are limitations.
• We still rely on the assumption that the errors are independent and identically distributed.
• Generally, scaled residuals are used (the raw residuals don’t have the same variance; more later).
• An alternative bootstrap resamples the $(y_i, x_{i1}, \ldots, x_{ip})$ vectors, i.e. resamples the rows of the data, a.k.a. the resampling cases bootstrap.
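The resampling cases alternative in the last bullet can be sketched as follows. As before, this is only a sketch: the data are simulated stand-ins, and B, the seed and the variable names are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(24)  # arbitrary seed

# Hypothetical data standing in for a real data set
n = 30
x = rng.uniform(0, 10, size=n)
y = 1.0 + 0.5 * x + rng.normal(size=n)
X = np.column_stack([np.ones(n), x])

# Estimates from the observed data, for comparison
beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]

B = 2000
beta_star = np.empty((B, X.shape[1]))
for k in range(B):
    # Sample row indices with replacement: each (y_i, x_i) pair stays together
    rows = rng.integers(0, n, size=n)
    beta_star[k] = np.linalg.lstsq(X[rows], y[rows], rcond=None)[0]

# Percentile intervals from the resampled-cases estimates
ci_cases = np.percentile(beta_star, [2.5, 97.5], axis=0)  # row 0 = lower, row 1 = upper
print("estimates:", beta_hat)
print("95% percentile intervals (columns = intercept, slope):")
print(ci_cases)
```

Unlike model-based resampling, the design matrix changes from one replicate to the next here, because whole rows are resampled rather than residuals.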