Today: Quiz 11: review. Last quiz! Wednesday: Guest lecture – Multivariate Analysis

Today: Quiz 11: review. Last quiz!

Wednesday: Guest lecture – Multivariate Analysis

Friday: last lecture: review – Bring questions

DEC 8 – 9am

FINAL EXAM EN 2007

Resampling

Resampling: Introduction

We have relied on idealized models of the origins of our data (e.g., Normally distributed errors, ε ~ N) to make inferences

But these models can be inadequate

Resampling techniques allow us to base the analysis of a study solely on the design of that study, rather than on a poorly-fitting model

Resampling: Why resampling

• Fewer assumptions: e.g., resampling methods do not require that distributions be Normal or that sample sizes be large

• Generality: Resampling methods are remarkably similar for a wide range of statistics and do not require new formulas for every statistic

• Promote understanding: Bootstrap procedures build intuition by providing concrete analogies to theoretical concepts

Resampling

Collection of procedures to make statistical inferences without relying on parametric assumptions

- bias

- variance, measures of error

- Parameter estimation

- hypothesis testing

Resampling: Procedures

Permutation (randomization)

Bootstrap

Jackknife

Monte Carlo techniques

Resampling

With replacement

Without replacement
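
The two sampling schemes can be illustrated with a minimal sketch using Python's standard library. The observations are hypothetical values made up for the illustration. Drawing with replacement is what the bootstrap does; drawing without replacement (a reshuffle of the same values) is what a permutation test does.

```python
import random

rng = random.Random(1)  # fixed seed so the sketch is reproducible
sample = [2.1, 3.4, 1.9, 5.0, 4.2]  # hypothetical observations

# With replacement: a value may be drawn several times, others not at all
with_replacement = rng.choices(sample, k=len(sample))

# Without replacement: every value appears exactly once, in a new order
without_replacement = rng.sample(sample, k=len(sample))

print("with replacement:   ", with_replacement)
print("without replacement:", without_replacement)
```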

Resampling: Permutation

If number of samples in each group = 10:

and n groups = 2, number of permutations = 184 756

and n groups = 3, number of permutations > 5 000 billion
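
The counts above can be checked with binomial coefficients, and because full enumeration quickly becomes impractical, a permutation test is usually run on a random subset of relabellings. The following is a minimal sketch, not the lecture's own code: the function names, the choice of the difference in means as the test statistic, and the default of 10 000 random permutations are all assumptions made for illustration.

```python
import math
import random

def n_group_assignments(group_sizes):
    """Number of distinct ways to split the pooled observations into groups of these sizes."""
    remaining = sum(group_sizes)
    total = 1
    for size in group_sizes:
        total *= math.comb(remaining, size)
        remaining -= size
    return total

def permutation_test(a, b, n_perm=10_000, seed=0):
    """Two-sided Monte Carlo permutation test for a difference in group means."""
    rng = random.Random(seed)
    observed = abs(sum(a) / len(a) - sum(b) / len(b))
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign group labels (sampling without replacement)
        new_a, new_b = pooled[:len(a)], pooled[len(a):]
        diff = abs(sum(new_a) / len(new_a) - sum(new_b) / len(new_b))
        if diff >= observed - 1e-12:  # small tolerance for floating-point rounding
            extreme += 1
    # Add 1 to numerator and denominator so the observed arrangement is counted
    return (extreme + 1) / (n_perm + 1)

print(n_group_assignments([10, 10]))      # two groups of 10
print(n_group_assignments([10, 10, 10]))  # three groups of 10
```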

Resampling: Bootstrap

"to pull oneself up by one's bootstraps"

The Surprising Adventures of Baron Munchausen, (1781) by Rudolf Erich Raspe

Bradley Efron 1979

Resampling: Bootstrap

Hypothesis testing, parameter estimation, assigning measures of accuracy to sample estimates

e.g., standard error (SE), confidence interval (CI)

Useful when:

- formulas for parameter estimates are based on assumptions that are not met

- computational formulas are only valid for large samples

- computational formulas do not exist

Resampling: Bootstrap

Assume that the sample is representative of the population

Approximate repeated sampling from the population by repeatedly resampling (with replacement) from the sample
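
As a rough sketch of this idea, the function below resamples the observed data with replacement many times and recomputes a statistic each time; the spread of those replicates then serves as an estimate of the standard error. The function names and the default of 2 000 resamples are assumptions chosen for illustration.

```python
import random
import statistics

def bootstrap_distribution(sample, statistic, nboot=2000, seed=0):
    """Recompute `statistic` on nboot resamples drawn with replacement from `sample`."""
    rng = random.Random(seed)
    n = len(sample)
    return [statistic(rng.choices(sample, k=n)) for _ in range(nboot)]

def bootstrap_se(sample, statistic, nboot=2000, seed=0):
    """Bootstrap standard error: standard deviation of the bootstrap replicates."""
    return statistics.stdev(bootstrap_distribution(sample, statistic, nboot, seed))
```

For example, `bootstrap_se(data, statistics.median)` gives a standard error for the median, a statistic for which no simple formula is available.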

Resampling: Bootstrap

Non-parametric bootstrap: resample observations from the original sample

Parametric bootstrap: fit a particular model to the data and then use this model to produce bootstrap samples
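
The difference between the two variants lies only in how each bootstrap sample is generated. In the sketch below, the Normal model used for the parametric case is an assumption made for the example; in practice the model would be whatever was fitted to the data.

```python
import random
import statistics

def nonparametric_sample(data, rng):
    """Resample the observed values themselves, with replacement."""
    return rng.choices(data, k=len(data))

def parametric_sample(data, rng):
    """Fit a model (here, assumed Normal) and simulate a new sample from the fitted model."""
    mu = statistics.mean(data)
    sigma = statistics.stdev(data)
    return [rng.gauss(mu, sigma) for _ in data]
```

A non-parametric sample can only contain values that were actually observed, whereas a parametric sample can contain any value the fitted model allows.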

Confidence intervals: Non-parametric bootstrap

Large capelin

Small capelin

Other prey

Confidence intervals: Non-parametric bootstrap

74 lc: %Nlc = 48.4%

76 sc: %Nsc = 49.7%

3 ot: %Not = 2.0%

What about uncertainty around the point estimates?

Bootstrap

Confidence intervals: Non-parametric bootstrap

Bootstrap:

- 153 balls, each with a tag: 76 sc, 74 lc, 3 ot

- Draw 153 balls at random (with replacement) and record each tag

- Calculate %Nlc*1, %Nsc*1, %Not*1

- Repeat nboot = 50 000 times:

(%Nlc*1, %Nsc*1, %Not*1), (%Nlc*2, %Nsc*2, %Not*2), …, (%Nlc*nboot, %Nsc*nboot, %Not*nboot)

- Sort the %Ni*b for each prey type i:

UCL = 48 750th %Ni*b (0.975)

LCL = 1 250th %Ni*b (0.025)
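
The steps above can be sketched in a few lines of Python. The counts come from the slide; the function name, the seed, and the use of `random.choices` with the counts as weights (equivalent to drawing tagged balls with replacement) are choices made for this illustration.

```python
import random

def bootstrap_percent_ci(counts, nboot=50_000, alpha=0.05, seed=0):
    """Percentile bootstrap limits for the percentage of each tag."""
    rng = random.Random(seed)
    tags = list(counts)
    weights = [counts[t] for t in tags]
    n = sum(weights)  # 153 prey items in the capelin example
    replicates = {t: [] for t in tags}
    for _ in range(nboot):
        # Draw n balls with replacement; each tag is drawn in proportion to its count
        draw = rng.choices(tags, weights=weights, k=n)
        for t in tags:
            replicates[t].append(100.0 * draw.count(t) / n)
    lower_rank = max(round(nboot * alpha / 2), 1)   # 1 250th value when nboot = 50 000
    upper_rank = round(nboot * (1 - alpha / 2))     # 48 750th value when nboot = 50 000
    limits = {}
    for t in tags:
        ordered = sorted(replicates[t])
        limits[t] = (ordered[lower_rank - 1], ordered[upper_rank - 1])
    return limits

if __name__ == "__main__":
    # A smaller nboot keeps this demonstration fast; the slide uses 50 000
    for tag, (lcl, ucl) in bootstrap_percent_ci({"lc": 74, "sc": 76, "ot": 3}, nboot=5000).items():
        print(f"{tag}: LCL = {lcl:.1f}%, UCL = {ucl:.1f}%")
```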

Confidence intervals: Parametric bootstrap

Params: β, α

Params: β*1, α*1

Params: β*2, α*2

…

Params: β*nboot, α*nboot

Params: β*1, β*2, …, β*nboot

Construct Confidence Interval for β

1. Percentile Method

2. Bias-Corrected Percentile Confidence Limits

3. Accelerated Bias-Corrected Percentile Limits

4. Bootstrap-t

5, 6, …: Other methods
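
A minimal sketch of the parametric bootstrap with the percentile method (method 1) follows. The slides do not state which model α and β belong to, so a straight line y = α + βx with Normal errors is assumed here purely for illustration; the function names and data are hypothetical as well.

```python
import random
import statistics

def fit_line(x, y):
    """Ordinary least squares for y = alpha + beta * x; returns (alpha, beta)."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    beta = sxy / sxx
    return my - beta * mx, beta

def parametric_bootstrap_beta(x, y, nboot=2000, alpha_level=0.05, seed=0):
    """Percentile confidence limits for beta from a parametric bootstrap."""
    rng = random.Random(seed)
    a_hat, b_hat = fit_line(x, y)
    residuals = [yi - (a_hat + b_hat * xi) for xi, yi in zip(x, y)]
    # Residual standard deviation with n - 2 degrees of freedom
    sigma = (sum(r * r for r in residuals) / (len(x) - 2)) ** 0.5
    betas = []
    for _ in range(nboot):
        # Simulate new responses from the fitted model, then refit
        y_star = [a_hat + b_hat * xi + rng.gauss(0.0, sigma) for xi in x]
        betas.append(fit_line(x, y_star)[1])
    betas.sort()
    lcl = betas[max(round(nboot * alpha_level / 2), 1) - 1]
    ucl = betas[round(nboot * (1 - alpha_level / 2)) - 1]
    return b_hat, (lcl, ucl)

# Hypothetical data, roughly y = 1 + 2x plus noise
x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [1.2, 2.9, 5.1, 7.3, 8.8, 11.1, 12.7, 15.2, 17.1, 18.9]
b_hat, (lcl, ucl) = parametric_bootstrap_beta(x, y)
print(f"beta = {b_hat:.3f}, 95% CI = ({lcl:.3f}, {ucl:.3f})")
```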

Bootstrap: CI – Percentile Method

Bootstrap: Caveats

Independence

Incomplete data

Outliers

Cases where small perturbations to the data-generating process produce big swings in the sampling distribution

Resampling: Jackknife

Tukey 1958

Quenouille 1956

Estimate bias and variance of a statistic

Concept: Leave one observation out and recompute statistic
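
The leave-one-out idea can be sketched as follows, using the standard jackknife formulas for bias, (n − 1)(mean of leave-one-out estimates − full-sample estimate), and for variance, (n − 1)/n times the sum of squared deviations of the leave-one-out estimates from their mean. The function name is an assumption made for the example.

```python
import statistics

def jackknife(sample, statistic):
    """Jackknife estimates of the bias and variance of `statistic` on `sample` (a list)."""
    n = len(sample)
    theta_hat = statistic(sample)
    # Recompute the statistic n times, each time leaving one observation out
    leave_one_out = [statistic(sample[:i] + sample[i + 1:]) for i in range(n)]
    theta_bar = sum(leave_one_out) / n
    bias = (n - 1) * (theta_bar - theta_hat)
    variance = (n - 1) / n * sum((t - theta_bar) ** 2 for t in leave_one_out)
    return bias, variance

data = [1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical observations
print(jackknife(data, statistics.mean))       # the mean is unbiased, so bias is about 0
print(jackknife(data, statistics.pvariance))  # divide-by-n variance is biased downward
```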

Cross-validation: Jackknife

Assess the performance of the model

How accurately will the model predict a new observation?
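
One way to answer that question is leave-one-out cross-validation: fit the model without observation i, predict observation i, and average the squared prediction errors. The sketch below assumes a straight-line model and hypothetical function names, since the model on the original slides is not recoverable from the transcript.

```python
import statistics

def fit_line(x, y):
    """Ordinary least squares for y = alpha + beta * x; returns (alpha, beta)."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    beta = sxy / sxx
    return my - beta * mx, beta

def loo_cv_mse(x, y):
    """Leave-one-out cross-validated mean squared prediction error (x and y are lists)."""
    squared_errors = []
    for i in range(len(x)):
        # Fit on all observations except i, then predict the held-out one
        alpha, beta = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        squared_errors.append((y[i] - (alpha + beta * x[i])) ** 2)
    return sum(squared_errors) / len(squared_errors)
```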

Jackknife vs. Bootstrap: Differences

Both estimate the variability of a statistic from its variability between subsamples

The jackknife provides an estimate of the variance of an estimator

The bootstrap first estimates the distribution of the estimator; from this distribution, we can estimate the variance

Using the same data set:

bootstrap results will differ slightly from run to run (the resamples are drawn at random)

jackknife results will always be identical (the leave-one-out subsamples are fixed)
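
The last point can be seen in a small sketch that estimates the standard error of a mean both ways. The data and function names are hypothetical; the two seeds stand in for two separate runs of the bootstrap.

```python
import random
import statistics

def jackknife_se_mean(sample):
    """Jackknife standard error of the mean: deterministic for a given sample."""
    n = len(sample)
    loo_means = [statistics.mean(sample[:i] + sample[i + 1:]) for i in range(n)]
    centre = statistics.mean(loo_means)
    return ((n - 1) / n * sum((m - centre) ** 2 for m in loo_means)) ** 0.5

def bootstrap_se_mean(sample, nboot=1000, seed=None):
    """Bootstrap standard error of the mean: depends on the random resamples drawn."""
    rng = random.Random(seed)
    n = len(sample)
    means = [statistics.mean(rng.choices(sample, k=n)) for _ in range(nboot)]
    return statistics.stdev(means)

data = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.6]  # hypothetical observations
print(jackknife_se_mean(data), jackknife_se_mean(data))                # identical
print(bootstrap_se_mean(data, seed=1), bootstrap_se_mean(data, seed=2))  # slightly different
```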

Resampling: Monte Carlo

Stanisław Ulam

John von Neumann

Mid 1940s

“The first thoughts and attempts I made to practice [the Monte Carlo Method] were suggested by a question which occurred to me in 1946 as I was convalescing from an illness and playing solitaires. The question was what are the chances that a Canfield solitaire laid out with 52 cards will come out successfully? After spending a lot of time trying to estimate them by pure combinatorial calculations, I wondered whether a more practical method than "abstract thinking" might not be to lay it out say one hundred times and simply observe and count the number of successful plays.”

Resampling: Monte Carlo

Monte Carlo methods (plural):

not just one method; no clear consensus on how they should be defined

Commonality:

repeated sampling from populations with known characteristics,

i.e. we assume a distribution and create random samples that follow that distribution, then compare our estimated statistic to the distribution of outcomes
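
That common recipe can be sketched as a small Monte Carlo test: simulate a statistic many times under an assumed distribution, then see where the observed value falls among the simulated outcomes. The standard Normal population, the sample size of 20, and the function names are all assumptions chosen for the example.

```python
import random

def monte_carlo_p_value(observed, simulate_statistic, n_sim=5000, seed=0):
    """Share of simulated statistics at least as large as the observed one."""
    rng = random.Random(seed)
    simulated = [simulate_statistic(rng) for _ in range(n_sim)]
    n_extreme = sum(1 for s in simulated if s >= observed)
    # Add 1 to numerator and denominator so the observed value is counted
    return (n_extreme + 1) / (n_sim + 1)

def mean_of_standard_normal_sample(rng, n=20):
    """Statistic under the assumed population: mean of n draws from Normal(0, 1)."""
    return sum(rng.gauss(0.0, 1.0) for _ in range(n)) / n

# An observed mean of 1.0 would be very unusual if the data really came from Normal(0, 1)
print(monte_carlo_p_value(1.0, mean_of_standard_normal_sample))
```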

Monte Carlo

Goal: assess the robustness of constant escapement and constant harvest rate policies with respect to management error for Pacific salmon fisheries

Monte Carlo

Spawner-return dynamic model

Constant harvest rate

Constant escapement

Stochastic production variation

Management error

Average catch; CV of catch
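
The structure of such a simulation can be sketched as follows. This is a toy illustration only, not the study's model: the Ricker form of the spawner-return relationship, the lognormal error terms, the cap on the fraction of the run that can be caught, and every parameter value are assumptions invented for the example.

```python
import math
import random
import statistics

def simulate_policy(policy, n_years=50, n_runs=200, seed=0):
    """Return (average catch, CV of catch) for one harvest policy."""
    # All values below are hypothetical and chosen only for illustration
    a, b = 1.5, 1000.0            # assumed Ricker productivity and capacity
    sd_production = 0.5           # stochastic production variation (log scale)
    sd_management = 0.3           # management error (log scale)
    harvest_rate = 0.5            # target of the constant harvest rate policy
    escapement_target = 500.0     # target of the constant escapement policy
    max_fraction = 0.95           # assumed cap so the stock is never fished out

    rng = random.Random(seed)
    catches = []
    for _ in range(n_runs):
        spawners = escapement_target
        for _ in range(n_years):
            # Spawner-return dynamics with random production variation
            returns = spawners * math.exp(
                a * (1.0 - spawners / b) + rng.gauss(0.0, sd_production))
            if policy == "harvest_rate":
                intended = harvest_rate * returns
            elif policy == "escapement":
                intended = max(returns - escapement_target, 0.0)
            else:
                raise ValueError("policy must be 'harvest_rate' or 'escapement'")
            # Management error: the realized catch deviates from the intended catch
            realized = intended * math.exp(rng.gauss(0.0, sd_management))
            catch = min(realized, max_fraction * returns)
            catches.append(catch)
            spawners = returns - catch
    mean_catch = statistics.mean(catches)
    return mean_catch, statistics.stdev(catches) / mean_catch

if __name__ == "__main__":
    for name in ("harvest_rate", "escapement"):
        mean_catch, cv = simulate_policy(name)
        print(f"{name}: average catch = {mean_catch:.0f}, CV of catch = {cv:.2f}")
```

Repeating the run with different amounts of management error would show how sensitive each policy's average catch and CV are to that error.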

Discussion

Was the bootstrap example I showed parametric or non-parametric?

Can you think of an example of the other case?

So, what’s the difference between a bootstrap and a Monte Carlo simulation?

74 large capelin

76 small capelin

3 other prey

Bootstrap samples:

153 balls, each with a tag: sc, lc, ot

Draw 153 balls at random (with replacement) and record each tag

Repeat 50 000 times

Discussion

In principle both the parametric and the non-parametric bootstrap are special cases of Monte Carlo simulations used for a very specific purpose: estimating some characteristics of the sampling distribution.

The idea behind the bootstrap is that the sample is an estimate of the population, so an estimate of the sampling distribution can be obtained by drawing many samples (with replacement) from the observed sample and computing the statistic in each new sample.

Monte Carlo simulations are more general: basically the term refers to repeatedly creating random data in some way, doing something to that random data, and collecting some results.

This strategy could be used to estimate some quantity, as in the bootstrap, but also to theoretically investigate some general characteristic of an estimator which is hard to derive analytically.

In practice it would be pretty safe to presume that whenever someone speaks of a Monte Carlo simulation they are talking about a theoretical investigation, e.g. creating random data with no empirical content whatsoever to investigate whether an estimator can recover known characteristics of this random `data', while the (parametric) bootstrap refers to an empirical estimation. The fact that the parametric bootstrap implies a model should not worry you: any empirical estimate is based on a model.

Maarten L. Buis

http://www.stata.com/statalist/archive/2008-06/msg00802.html