Introduction to Bayesian Methods (I)

C. Shane Reese

Department of Statistics

Brigham Young University

Outline

Definitions

Classical or Frequentist

Bayesian

Comparison (Bayesian vs. Classical)

Bayesian Data Analysis

Examples

Definitions

Problem: an unknown population parameter (θ) must be estimated.

EXAMPLE #1: θ = probability that a randomly selected person will be a cancer survivor. Data are binary; the parameter is unknown and continuous.

EXAMPLE #2: θ = mean survival time of cancer patients. Data are continuous; the parameter is continuous.

Definitions

Step 1 of either formulation is to pose a statistical (or probability) model for the random variable that represents the phenomenon.

EXAMPLE #1: a reasonable choice for f(y|θ) (the sampling density or likelihood function) would be that the number of 6-month survivors (Y) follows a binomial distribution, with a total of n subjects followed and probability θ that any one subject survives.

EXAMPLE #2: a reasonable choice for f(y|θ) would be that survival time (Y) has an exponential distribution with mean θ.
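
The two sampling models above can be written out directly. The following is a minimal sketch using only the Python standard library; the function names and the data values are hypothetical and chosen purely for illustration.

```python
import math


def binomial_likelihood(y: int, n: int, theta: float) -> float:
    """f(y | theta) when Y ~ Binomial(n, theta): y survivors out of n subjects."""
    return math.comb(n, y) * theta**y * (1.0 - theta) ** (n - y)


def exponential_likelihood(times: list[float], theta: float) -> float:
    """Joint density of independent survival times, each Exponential with MEAN theta."""
    return math.prod(math.exp(-t / theta) / theta for t in times)


# Hypothetical data, not taken from the slides.
print(binomial_likelihood(y=7, n=10, theta=0.7))
print(exponential_likelihood([2.0, 5.0, 11.0], theta=6.0))
```

Evaluating either function over a range of θ values shows which parameter values make the observed data most plausible, which is the sense in which the data "enter through the likelihood".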

Classical (Frequentist) Approach

All pertinent information enters the problem through the likelihood function in the form of data (Y1, . . . , Yn)

objective in nature

software packages all have this capability

maximum likelihood, unbiased estimation, etc.

confidence intervals, difficult interpretation

Bayesian Data Analysis

data enter through the likelihood function, with allowance for other information as well

Bayes' theorem reads: the posterior distribution is a constant multiplied by the likelihood multiplied by the prior distribution

posterior distribution: our updated view of the parameter in light of the data

prior distribution: the view of the parameter before any data collection
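
To make "posterior = constant × likelihood × prior" concrete, here is a rough sketch that evaluates the product on a grid of θ values and normalizes it. The binomial likelihood follows Example #1; the flat prior, the data (7 successes in 10 trials), and the function name `grid_posterior` are assumptions made for illustration.

```python
import math


def grid_posterior(y, n, prior, grid_size=1001):
    """Approximate the posterior of theta on an evenly spaced grid over [0, 1]."""
    thetas = [i / (grid_size - 1) for i in range(grid_size)]
    # Likelihood times prior at each grid point (unnormalized posterior).
    unnorm = [math.comb(n, y) * t**y * (1 - t) ** (n - y) * prior(t) for t in thetas]
    total = sum(unnorm)  # dividing by this plays the role of the "constant"
    return thetas, [u / total for u in unnorm]


def flat_prior(theta):
    """A flat prior: every value of theta gets the same weight."""
    return 1.0


# Hypothetical data: 7 successes in 10 trials.
thetas, post = grid_posterior(y=7, n=10, prior=flat_prior)
post_mean = sum(t * p for t, p in zip(thetas, post))
print(round(post_mean, 3))
```

Swapping in a different `prior` function changes the posterior without touching the likelihood, which is exactly where the additional information enters.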

Additional Information

Prior Distributions

can come from expert opinion, historical studies, previous research, or general knowledge of a situation (see examples)

there exists a “flat” or “noninformative” prior, which represents a state of ignorance.

Controversial piece of Bayesian methods

Objective Bayes, Empirical Bayes

Bayesian Data Analysis

inherently subjective (prior is controversial)

few software packages have this capability

result is a probability distribution

credible intervals use the language that everyone uses anyway. (Probability that θ is in the interval is 0.95)

see examples for demonstration

Mammography

                    Test Result
Patient Status      Positive    Negative
Cancer              88%         12%
Healthy             24%         76%

Sensitivity: true positive, cancer ID’d!

Specificity: true negative, healthy not ID’d!

Mammography Illustration

My friend (40!!!) heads into her OB/GYN for a mammogram (on doctor’s orders) and gets a positive test result.

Does she have cancer?

Specificity, sensitivity both high! Seems likely ... or does it?

Important points: incidence of breast cancer in 40 year old women is 126.2 per 100,000 women.

Bayes Theorem for Mammography
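
The calculation can be sketched with Bayes' theorem, using the numbers given earlier in the talk: an 88% positive rate among cancer patients, a 24% positive rate among healthy patients, and an incidence of 126.2 per 100,000 treated as the prior probability of cancer. The function and parameter names are hypothetical.

```python
def posterior_given_positive(prior, sensitivity, false_positive_rate):
    """P(cancer | positive test) by Bayes' theorem."""
    true_pos = sensitivity * prior                   # P(positive and cancer)
    false_pos = false_positive_rate * (1.0 - prior)  # P(positive and healthy)
    return true_pos / (true_pos + false_pos)


p_friend = posterior_given_positive(
    prior=126.2 / 100_000,     # incidence among 40-year-old women
    sensitivity=0.88,          # P(positive | cancer) from the table
    false_positive_rate=0.24,  # P(positive | healthy) from the table
)
print(f"{p_friend:.4f}")  # about 0.0046, i.e. roughly 0.46%
```

The low prior probability dominates: most positive results come from the much larger group of healthy women.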

Mammography Tradeoffs

Impacts of false positive

Stress

Invasive follow-up procedures

Worth the trade-off with a less than 1% (0.46%) chance you actually have cancer???

Mammography Illustration

My mother-in-law had the same diagnosis in 2001.

Holden, UT is a “downwinder” city; she was 65.

Does she have cancer?

Specificity, sensitivity both high! Seems likely ... or does it?

Important points: incidence of breast cancer in 65 year old women is 470 per 100,000 women, and approx 43% in “downwinder” cities.

Does this change our assessment?

Downwinder Mammography
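
The same Bayes' theorem calculation can be repeated with the new prior information. This is a sketch under two stated assumptions: first, that the 470 per 100,000 incidence is used as the prior probability; second, that the "approx 43%" figure for downwinder cities is read as a prior probability of 0.43 (the slide does not say exactly how that figure enters the calculation). The function name is hypothetical.

```python
def posterior_given_positive(prior, sensitivity=0.88, false_positive_rate=0.24):
    """P(cancer | positive test), with test characteristics from the mammography table."""
    true_pos = sensitivity * prior
    false_pos = false_positive_rate * (1.0 - prior)
    return true_pos / (true_pos + false_pos)


# Prior from incidence among 65-year-old women.
p_age_65 = posterior_given_positive(prior=470 / 100_000)

# ASSUMPTION: "approx 43%" interpreted as the prior probability in downwinder cities.
p_downwinder = posterior_given_positive(prior=0.43)

print(f"{p_age_65:.3f} {p_downwinder:.3f}")
```

Under these assumptions the posterior probability moves from under 2% to well over one half, so the same test result supports very different conclusions depending on the prior.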

Modified Example #1

One person in the class stands at the back and throws the ball at the target on the board (10 times).

Before we have the person throw the ball ten times, does the choice of person change the a priori belief you have about the probability they will hit the target (θ)?

Before we have the person throw the ball ten times, does the choice of target size change the a priori belief you have about the probability they will hit the target (θ)?

Prior Distributions

A convenient choice for this prior information is the Beta distribution, where the parameters defining the distribution are the numbers of a priori successes and failures. For example, if you believe your prior opinions on the success or failure are worth 8 throws and you think the person selected can hit the target drawn on the board 6 times, we would say that θ has a Beta(6,2) distribution.

Bayes for Example #1

If our data are Binomial(n, θ), then we would calculate Y/n as our estimate and use a confidence interval formula for a proportion.

If our data are Binomial(n, θ) and our prior distribution is Beta(a, b), then our posterior distribution is Beta(a + y, b + n − y).

Thus, in our example: a = ___, b = ___, n = ___, y = ___

and so the posterior distribution is: Beta( ___ , ___ )
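
The conjugate update is simple enough to sketch in a few lines. The prior Beta(6, 2) and n = 10 throws come from the preceding slides; the number of hits, y = 7, is a hypothetical value used only to show the arithmetic, since the observed result is left blank in the slides. Function names are also hypothetical.

```python
def beta_binomial_update(a, b, y, n):
    """Posterior Beta parameters after y successes in n Binomial trials, prior Beta(a, b)."""
    if not 0 <= y <= n:
        raise ValueError("y must be between 0 and n")
    return a + y, b + n - y


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


# Prior Beta(6, 2) and n = 10 from the slides; y = 7 is HYPOTHETICAL.
post_a, post_b = beta_binomial_update(a=6, b=2, y=7, n=10)
print(post_a, post_b, round(beta_mean(post_a, post_b), 3))
```

The posterior mean sits between the prior mean (6/8) and the sample proportion (y/n), weighted by the prior "sample size" a + b and the actual sample size n.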

Bayesian Interpretation

Therefore we can say that the probability that θ is in the interval ( ___ , ___ ) is 0.95.

Notice that we don’t have to address the problem of “in repeated sampling”

this is a direct probability statement

relies on the prior distribution
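
One way to obtain such an interval is to simulate from the posterior and read off its quantiles. The sketch below uses only the Python standard library and a hypothetical Beta(13, 5) posterior (a Beta(6, 2) prior updated with an assumed 7 hits in 10 throws); the interval endpoints in the slides are left blank and are not reproduced here.

```python
import random


def beta_credible_interval(a, b, level=0.95, draws=100_000, seed=0):
    """Equal-tailed credible interval for Beta(a, b), estimated by Monte Carlo."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    samples = sorted(rng.betavariate(a, b) for _ in range(draws))
    tail = (1.0 - level) / 2.0
    lower = samples[int(tail * draws)]
    upper = samples[int((1.0 - tail) * draws) - 1]
    return lower, upper


# HYPOTHETICAL posterior, for illustration only.
lower, upper = beta_credible_interval(13, 5)
print(f"P({lower:.2f} < theta < {upper:.2f}) is approximately 0.95")
```

With a statistics library available, the same interval could be computed from the Beta quantile function instead of by simulation.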

Example: Phase II Dose Finding

Goal:

Fit models of the form:

Where

And d=1,…,D is the dose level

Definition of Terms

ED(Q): Lowest dose for which Q% of efficacy is achieved

Multiple definitions:

Def. 1

Def. 2

Example: Q=.95, ED95 dose is the lowest dose for which .95 efficacy is achieved

Classical Approach

Completely randomized design

Perform F-test for difference between groups

If significant at the chosen significance level, then call the trial a “success”, and determine the most effective dose as the lowest dose that achieves some pre-specified criterion (ED95)

Bayesian Adaptive Approach

Assign patients to doses adaptively based on the amount of information about the dose-response relationship.

Goal: maximize expected change in information gain:

Weighted average of the posterior variances and the probability that a particular dose is the ED95 dose.

Probability of Allocation

Assign patients to doses based on

Where is the probability of being assigned to dose

Four Decisions at Interim Looks

Stop trial for success: the trial is a success, let’s move on to the next phase.

Stop trial for futility: the trial is going nowhere, let’s stop now and cut our losses.

Stop trial because the maximum number of patients allowed is reached (stop for cap): trial outcome is still uncertain, but we can’t afford to continue the trial.

Continue

Stop for Futility

The dose-finding trial is stopped because there is insufficient evidence that any of the doses is efficacious.

If the posterior probability that the mean change for the most likely ED95 dose is within a “clinically meaningful amount” of the placebo response is greater than 0.99, then the trial stops for futility.

Stop for Success

The dose-finding trial is stopped when the current probability that the ED95* is sufficiently efficacious is sufficiently high.

If the posterior probability that the most likely ED95 dose is better than placebo reaches a high value (0.99 or higher), then the trial stops early for success.

Note: Posterior (after updated data) probability drives this decision.

Stop for Cap

Cap: if the sample size reaches the maximum (the cap) defined for all dose groups, the trial stops.

Refine definition based on application. Perhaps one dose group reaching max is of interest.

Almost always $$$ driven.

Continue

Continue: if none of the above three conditions holds, then the trial continues to accrue.

Decision to continue or stop is made at each interim look at the data (accrual is in batches)
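
The four interim decisions can be sketched as a single rule applied at each look. The 0.99 thresholds come from the slides; the order in which the conditions are checked, the names, and the assumption that the two posterior probabilities have already been computed from the fitted model are all illustrative choices rather than the trial's actual specification.

```python
from enum import Enum


class Decision(Enum):
    SUCCESS = "stop for success"
    FUTILITY = "stop for futility"
    CAP = "stop for cap"
    CONTINUE = "continue"


def interim_decision(p_better_than_placebo, p_within_margin_of_placebo,
                     n_enrolled, n_cap, threshold=0.99):
    """Apply the stopping rules at one interim look.

    p_better_than_placebo: posterior probability that the most likely ED95 dose
        is better than placebo.
    p_within_margin_of_placebo: posterior probability that its mean change is
        within a clinically meaningful amount of the placebo response.
    """
    if p_better_than_placebo >= threshold:      # "0.99 or higher"
        return Decision.SUCCESS
    if p_within_margin_of_placebo > threshold:  # "greater than 0.99"
        return Decision.FUTILITY
    if n_enrolled >= n_cap:                     # sample size has reached the cap
        return Decision.CAP
    return Decision.CONTINUE


# Hypothetical interim look.
print(interim_decision(0.95, 0.10, n_enrolled=120, n_cap=780).value)
```

Because accrual is in batches, a function like this would be called once per batch with the updated posterior probabilities.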

Benefits of Approach

Statistical: weighting by the variance of the response at each dose allows quicker resolution of the dose-response relationship.

Medical: Integrating over the probability that each dose is ED95 allows quicker allocation to more efficacious doses.

Example of Approach

Reduction in average number of events

Y=reduction of number of events

D=6 (5 active, 1 placebo)

Potential exists that there is a non-monotonic dose-response relationship.

Let be the dose value for dose d.

Model for Example

Dynamic Model Properties

Allows for flexibility.

Borrows strength from “neighboring” doses and similarity of response at neighboring doses.

Simplified version of Gaussian Process Models.

Potential problem: semi-parametric, thus only considers doses within dose range:

Example Curves?

Simulations

5000 simulated trials at each of the 5 scenarios

Fixed dose design,

Bayesian adaptive approach as outlined above

Compare two approaches for each of 5 cases with sample size, power, and type-I error

Results (power & alpha)

Case    Pr(S)   Pr(F)   Pr(cap)   P(Rej)
1       .018    .973    .009      .049
2       1       0       0         .235
3       1       0       0         .759
4       1       0       0         .241
5       1       0       0         .802

Results (n)

Case    0       10      20      40      80      120
1       51.6    26.1    26.2    31.2    33.5    36.8
2       28.4    10.9    13.8    18.9    22.5    19.2
3       27.7    11.3    14.5    25.2    17      15.2
4       31.2    10.8    13.3    19.6    22.2    27.8
5       28.9    18.0    22.3    21.1    14.5    10.7
Fixed   130     130     130     130     130     130
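
Summing each row gives a rough comparison of total sample size between the two designs. This sketch assumes the table entries are average numbers of patients per dose group (columns read as the six dose values) for each simulated case; the variable names are hypothetical.

```python
# Average patients per dose group for each case, copied from the table above.
adaptive_n = {
    1: [51.6, 26.1, 26.2, 31.2, 33.5, 36.8],
    2: [28.4, 10.9, 13.8, 18.9, 22.5, 19.2],
    3: [27.7, 11.3, 14.5, 25.2, 17.0, 15.2],
    4: [31.2, 10.8, 13.3, 19.6, 22.2, 27.8],
    5: [28.9, 18.0, 22.3, 21.1, 14.5, 10.7],
}
fixed_total = 130 * 6  # fixed design: 130 patients in each of 6 dose groups

# Total average sample size per case under the adaptive design.
totals = {case: sum(ns) for case, ns in adaptive_n.items()}

for case, total in totals.items():
    saving = 1.0 - total / fixed_total
    print(f"case {case}: {total:.1f} vs {fixed_total} ({saving:.0%} fewer patients)")
```

Under that reading, every adaptive case uses far fewer patients on average than the 780 required by the fixed design.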

Observations

Adaptive design serves two purposes:

Get patients to efficacious doses

More efficient statistical estimation

Sample size considerations

Dose expansion -- inclusion of safety considerations

Incorporation of uncertainties!!! Predictive inference is POWERFUL!!!

Conclusions

Science is subjective (what about the choice of a likelihood?)

Bayes uses all available information

Makes interpretation easier

BAD NEWS: I have shown very simple cases . . . they get much harder.

GOOD NEWS: They are possible (and practical) with advanced computational procedures