• Date posted: 27-Mar-2020
### Transcript of Bayesian methods for parameter estimation and model comparison

• Bayesian methods for parameter estimation and model comparison

Carson C Chow, LBM, NIDDK, NIH

Monday, April 26, 2010

• Task: Fit a model (ODE, PDE,...) to data - estimate parameters

$$\frac{dy}{dt} = f(t; \theta)$$

$$y(t) = g(t \mid y(0), \theta)$$

[Figure: two panels of y versus t; one panel is labeled y = at + b]

• Task: Fit a model (ODE, PDE,...) to data - estimate parameters

Questions:

What algorithm to use?

How good is the fit?

Is there a better model?

Sensitivity?

Answer: Use Bayesian inference and MCMC

Answer: Invert vs. infer

• Bayesian inference

Frequentist: Probability is a frequency (of a random variable)

Bayesian: Probability is a measure of uncertainty

Jaynes: Probability is extended logic


• Bayesian inference

Models and parameters have probabilities

Anything can be assigned a probability

Makes everything straightforward


• Parameter estimation

D = Data, θ = Parameters, y(θ,t): solution

e.g. Minimize mean square error

$$\sum_i (D_i - y_i(\theta))^2$$

i.e. Maximum likelihood estimation: θ that maximizes likelihood

$$P(D|\theta) \propto \exp\left(\sum_i -\frac{(D_i - y_i(\theta))^2}{2\sigma^2}\right)$$
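
The equivalence between minimizing the squared error and maximizing the Gaussian likelihood can be checked numerically. The following is a minimal sketch, not the speaker's code: the data values, the straight-line model y = at + b, and the noise level `sigma` are all made up for illustration.

```python
# Hypothetical data: noisy observations of a straight line y = a*t + b.
t = [0.0, 1.0, 2.0, 3.0, 4.0]
D = [1.1, 2.9, 5.2, 6.8, 9.1]
sigma = 1.0  # assumed known noise standard deviation


def model(theta, ti):
    """Model prediction y_i(theta) for theta = (a, b)."""
    a, b = theta
    return a * ti + b


def sum_sq(theta):
    """Sum of squared residuals, sum_i (D_i - y_i(theta))^2."""
    return sum((Di - model(theta, ti)) ** 2 for Di, ti in zip(D, t))


def log_likelihood(theta):
    """ln P(D|theta) up to an additive constant, for Gaussian noise."""
    return -sum_sq(theta) / (2.0 * sigma ** 2)


# Closed-form least-squares estimate for a straight line.
n = len(t)
t_mean = sum(t) / n
D_mean = sum(D) / n
s_tt = sum((ti - t_mean) ** 2 for ti in t)
s_tD = sum((ti - t_mean) * (Di - D_mean) for ti, Di in zip(t, D))
a_hat = s_tD / s_tt
b_hat = D_mean - a_hat * t_mean
theta_hat = (a_hat, b_hat)

# Any perturbation raises the squared error and lowers the log-likelihood.
for da, db in [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.1), (0.0, -0.1)]:
    perturbed = (a_hat + da, b_hat + db)
    assert sum_sq(perturbed) > sum_sq(theta_hat)
    assert log_likelihood(perturbed) < log_likelihood(theta_hat)
```

Because the log-likelihood is the squared error multiplied by a negative constant, the same θ is optimal for both, whatever value of `sigma` is assumed.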


• Statistics

But in frequentist statistics models and parameters cannot have probabilities

Confidence intervals of a parameter are with respect to sampling errors

“Natural” thing to do is to find the probability distribution for a model or parameters


• Bayes theorem

$$P(\theta, D) = P(\theta|D)P(D) = P(D|\theta)P(\theta)$$

$$P(\theta|D) = \frac{P(D|\theta)P(\theta)}{P(D)}$$

Posterior: P(θ|D)

Likelihood: P(D|θ)

Prior: P(θ)

Evidence: P(D)

Bayesian inference reduces statistics to one equation
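
A small discrete case makes the four terms concrete. This sketch is an illustration only: the coin-toss setting, the two candidate values of θ, and the equal prior weights are hypothetical and do not come from the talk.

```python
# Two hypothetical candidate values for a coin's probability of heads.
thetas = [0.5, 0.8]
prior = {0.5: 0.5, 0.8: 0.5}   # P(theta): equal weight, an assumption

# Hypothetical data D: 3 heads in 3 tosses.
heads, tosses = 3, 3


def likelihood(theta):
    """P(D|theta) for independent tosses (ordering fixed)."""
    return theta ** heads * (1.0 - theta) ** (tosses - heads)


# Evidence P(D): sum of likelihood times prior over all candidates.
evidence = sum(likelihood(th) * prior[th] for th in thetas)

# Posterior P(theta|D) from Bayes theorem.
posterior = {th: likelihood(th) * prior[th] / evidence for th in thetas}

for th in thetas:
    print(f"theta={th}: prior={prior[th]:.3f} posterior={posterior[th]:.3f}")
```

Dividing by the evidence is what makes the posterior sum to one; the ranking of the candidates is already fixed by likelihood times prior.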

• Parameter estimation

D = Data, θ = Parameters, y(θ,t): solution

$$P(\theta|D) = \frac{P(D|\theta)P(\theta)}{P(D)}, \qquad P(D) = \int P(D|\theta)P(\theta)\,d\theta$$

$$P(\theta|D) \propto P(D|\theta)P(\theta)$$

$$P(D|\theta) \propto \exp\left(\sum_i -\frac{(D_i - y_i(\theta))^2}{2\sigma^2}\right)$$

P(θ|D): Probability of parameter given the data

• Example

Model as exponential decay with Gaussian noise

[Figure: data and model, y versus t for t from 0 to 100; prior and posterior probability versus b for b from 0 to 0.1, with true b=0.05 marked]

Likelihood

$$P(y|b, t) \propto \prod_t e^{-(y_t - 10\exp(-bt))^2/2}$$

Posterior probability

$$P(b|y) \propto P(y|b, t)P(b)$$
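
With a single parameter, the posterior in this example can be evaluated directly on a grid. The sketch below follows the slide's model (amplitude 10, unit noise variance, true b = 0.05); the sampling times, the random seed, the flat prior on [0, 0.1], and the grid resolution are assumptions, and the data are synthetic rather than the data plotted in the talk.

```python
import math
import random

rng = random.Random(1)            # fixed seed so the sketch is reproducible
b_true = 0.05                     # value quoted on the slide
times = list(range(0, 101))       # assumed sampling times t = 0, 1, ..., 100

# Synthetic data: y_t = 10 exp(-b t) plus unit-variance Gaussian noise.
y = [10.0 * math.exp(-b_true * t) + rng.gauss(0.0, 1.0) for t in times]


def log_likelihood(b):
    """ln P(y|b, t) up to a constant: -sum_t (y_t - 10 exp(-b t))^2 / 2."""
    return -0.5 * sum((yt - 10.0 * math.exp(-b * t)) ** 2
                      for yt, t in zip(y, times))


# Grid over b; a flat prior on [0, 0.1] only adds a constant to the log.
n_grid = 1001
db = 0.1 / (n_grid - 1)
b_grid = [i * db for i in range(n_grid)]
log_post = [log_likelihood(b) for b in b_grid]

# Subtract the maximum before exponentiating to avoid underflow everywhere.
shift = max(log_post)
unnorm = [math.exp(lp - shift) for lp in log_post]

# Normalize with a Riemann sum so the posterior density integrates to one.
norm = sum(unnorm) * db
posterior = [u / norm for u in unnorm]

b_map = b_grid[posterior.index(max(posterior))]
print("posterior mode for b:", b_map)
```

Grid evaluation is practical here only because b is one-dimensional; with more parameters the number of grid points grows quickly, which is where sampling methods such as MCMC become attractive.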

• Sensitivity analysis

Deviation around M.L.

$$-\left[\frac{\partial^2}{\partial\theta^2}\ln P(D|\theta)\right]_{\max}$$

Fisher information

$$I(\theta) = -E\left[\frac{\partial^2}{\partial\theta^2}\ln P(D|\theta)\right]$$
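
The curvature of the log-likelihood at its maximum can be estimated with a finite difference and compared with the Fisher information. This sketch reuses the exponential-decay model y_t = 10 exp(-bt) with unit noise variance; the sampling times and the step size `h` are assumptions. The data are deliberately noise-free, so the maximum sits exactly at the true b and the observed curvature should coincide with the expected one.

```python
import math

b_true = 0.05
sigma = 1.0
times = list(range(0, 101))       # assumed sampling times t = 0, 1, ..., 100

# Noise-free synthetic data, so the likelihood peaks exactly at b_true.
y = [10.0 * math.exp(-b_true * t) for t in times]


def log_likelihood(b):
    """ln P(D|b) up to a constant for Gaussian noise of variance sigma^2."""
    return -sum((yt - 10.0 * math.exp(-b * t)) ** 2
                for yt, t in zip(y, times)) / (2.0 * sigma ** 2)


def neg_second_derivative(f, x, h=1e-4):
    """Central finite-difference estimate of -f''(x)."""
    return -(f(x + h) - 2.0 * f(x) + f(x - h)) / h ** 2


# Curvature at the maximum-likelihood point.
observed = neg_second_derivative(log_likelihood, b_true)

# Fisher information for Gaussian noise: sum_t (dy_t/db)^2 / sigma^2,
# with dy_t/db = -10 t exp(-b t).
fisher = sum((10.0 * t * math.exp(-b_true * t)) ** 2 for t in times) / sigma ** 2

# Approximate width of the likelihood peak around the maximum.
std_estimate = 1.0 / math.sqrt(fisher)
print(observed, fisher, std_estimate)
```

A large curvature means the likelihood falls off quickly away from the maximum, so the data constrain the parameter tightly; a small curvature signals an insensitive, poorly determined parameter.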


• Sensitivity analysis