Bayesian methods for parameter estimation and
model comparison
Carson C Chow, LBM, NIDDK, NIH
Monday, April 26, 2010
Task: Fit a model (ODE, PDE,...) to data - estimate parameters
Model: dy/dt = f(t; θ)
Solution: y(t) = g(t|y(0), θ)
y = at + b
[Figure: two plots of y versus t]
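For the straight-line model y = at + b, the fit has a closed form. The sketch below is one conventional way to do it (ordinary least squares), not something the slides spell out; the function name `fit_line` and the data are made up for illustration.

```python
def fit_line(times, values):
    """Closed-form ordinary least-squares fit of y = a*t + b; returns (a, b)."""
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(values) / n
    # Spread of t, and co-variation of t with y, around their means
    s_tt = sum((t - mean_t) ** 2 for t in times)
    s_ty = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, values))
    a = s_ty / s_tt
    b = mean_y - a * mean_t
    return a, b


# Made-up data lying exactly on y = 2t + 1 (an assumption for the demo)
times = [0.0, 1.0, 2.0, 3.0, 4.0]
values = [1.0, 3.0, 5.0, 7.0, 9.0]
a, b = fit_line(times, values)
print(a, b)
```

For an ODE or PDE model there is usually no closed form, which is where the questions on the next slide come from.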
Questions:
What algorithm to use?
How good is the fit?
Is there a better model?
Sensitivity?
Answer: Use Bayesian inference and MCMC
Answer: Invert vs. infer
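The slide names MCMC but this part of the talk does not say which variant. As a hedged illustration only, here is a minimal random-walk Metropolis sampler; the choice of Metropolis, the function name `metropolis`, the step size, and the toy Gaussian target are all assumptions made for the sketch.

```python
import math
import random


def metropolis(log_target, start, step, n_samples, rng):
    """Random-walk Metropolis: draws samples from exp(log_target), unnormalized."""
    samples = []
    current = start
    current_lp = log_target(current)
    for _ in range(n_samples):
        proposal = current + rng.gauss(0.0, step)  # symmetric proposal
        proposal_lp = log_target(proposal)
        # Accept with probability min(1, target(proposal) / target(current))
        if proposal_lp >= current_lp or rng.random() < math.exp(proposal_lp - current_lp):
            current, current_lp = proposal, proposal_lp
        samples.append(current)
    return samples


def toy_log_target(x):
    # Assumed toy target: Gaussian with mean 3 and unit variance (log density up to a constant)
    return -((x - 3.0) ** 2) / 2.0


rng = random.Random(1)  # fixed seed so the run is repeatable
chain = metropolis(toy_log_target, start=0.0, step=1.0, n_samples=20000, rng=rng)
kept = chain[2000:]  # discard an initial burn-in segment
mean = sum(kept) / len(kept)
var = sum((x - mean) ** 2 for x in kept) / len(kept)
print(mean, var)
```

Only ratios of the target enter the accept step, so the normalizing constant is never needed.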
Bayesian inference
Frequentist: Probability is a frequency (of a random variable)
Bayesian: Probability is a measure of uncertainty
Jaynes: Probability is extended logic
Bayesian inference
Models and parameters have probabilities
Anything can be assigned a probability
Makes everything straightforward
Parameter estimation
D = Data, θ = Parameters, y(θ,t): solution
i.e. Maximum likelihood estimation: θ that maximizes likelihood
$$P(D|\theta) \propto \exp\left(-\sum_i \frac{(D_i - y_i(\theta))^2}{2\sigma^2}\right)$$
e.g. Minimize mean square error
$$\sum_i (D_i - y_i(\theta))^2$$
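With Gaussian noise of fixed σ, the log-likelihood is the negative squared error divided by 2σ², so the θ that maximizes the likelihood is the θ that minimizes the squared error. A small sketch of that equivalence follows; the one-parameter model `line_through_origin`, the data, and the search grid are hypothetical choices for illustration.

```python
def sum_squared_error(theta, times, data, model):
    """Sum over i of (D_i - y_i(theta))^2."""
    return sum((d - model(theta, t)) ** 2 for t, d in zip(times, data))


def log_likelihood(theta, times, data, model, sigma=1.0):
    """ln P(D|theta) up to an additive constant, for Gaussian noise with fixed sigma."""
    return -sum_squared_error(theta, times, data, model) / (2.0 * sigma ** 2)


def line_through_origin(theta, t):
    # Hypothetical one-parameter model y = theta * t
    return theta * t


# Made-up data, roughly y = 2t
times = [0.0, 1.0, 2.0, 3.0, 4.0]
data = [0.1, 1.9, 4.2, 5.8, 8.1]

# Brute-force search over a grid of candidate theta values from 0 to 4
grid = [i / 100 for i in range(0, 401)]
theta_ml = max(grid, key=lambda th: log_likelihood(th, times, data, line_through_origin))
theta_ls = min(grid, key=lambda th: sum_squared_error(th, times, data, line_through_origin))
print(theta_ml, theta_ls)
```

Both searches pick the same grid point because one objective is a decreasing function of the other.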
Statistics
But in frequentist statistics, models and parameters cannot have probabilities
Confidence intervals for a parameter are defined with respect to sampling error
The “natural” thing to do is to find the probability distribution for a model or its parameters
Bayes theorem
Bayesian inference reduces statistics to one equation
$$P(\theta, D) = P(\theta|D)P(D) = P(D|\theta)P(\theta)$$
$$P(\theta|D) = \frac{P(D|\theta)P(\theta)}{P(D)}$$
Posterior: P(θ|D)
Likelihood: P(D|θ)
Prior: P(θ)
Evidence: P(D)
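To see the four terms of Bayes theorem at work, the sketch below applies it to a parameter that can take only two values. The helper name `posterior_from`, the labels `theta_1` and `theta_2`, and all numbers are invented for illustration.

```python
def posterior_from(prior, likelihood):
    """Return P(theta|D) for each candidate theta, given P(theta) and P(D|theta)."""
    # Joint probability P(theta, D) = P(D|theta) * P(theta)
    joint = {theta: likelihood[theta] * prior[theta] for theta in prior}
    # Evidence P(D): sum of the joint over all candidate values of theta
    evidence = sum(joint.values())
    return {theta: joint[theta] / evidence for theta in joint}


# Invented numbers: equal prior belief, data three times more likely under theta_2
prior = {"theta_1": 0.5, "theta_2": 0.5}
likelihood = {"theta_1": 0.2, "theta_2": 0.6}
post = posterior_from(prior, likelihood)
print(post)
```

With a continuous parameter the sum that gives the evidence becomes an integral over θ.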
Parameter estimation
D = Data, θ = Parameters, y(θ,t): solution
$$P(\theta|D) = \frac{P(D|\theta)P(\theta)}{P(D)}, \qquad P(D) = \int P(D|\theta)P(\theta)\,d\theta$$
$$P(\theta|D) \propto P(D|\theta)P(\theta)$$
$$P(D|\theta) \propto \exp\left(-\sum_i \frac{(D_i - y_i(\theta))^2}{2\sigma^2}\right)$$
P(θ|D): probability of parameter given the data
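Taking the logarithm of the posterior makes the link to least squares explicit. The step below is a worked consequence of the formulas on this slide, assuming independent Gaussian noise with a known, constant σ. The flat-prior remark in the comment is an added illustration, not a statement from the slide.

```latex
\begin{align*}
\ln P(\theta|D) &= \ln P(D|\theta) + \ln P(\theta) - \ln P(D) \\
                &= -\sum_i \frac{(D_i - y_i(\theta))^2}{2\sigma^2} + \ln P(\theta) + \text{const}
\end{align*}
% Assumption: if P(\theta) is constant over the allowed range (flat prior),
% the maximum of the posterior coincides with the maximum likelihood
% (least-squares) estimate; P(D) does not depend on \theta and only rescales.
```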
Example
Model as exponential decay with Gaussian noise
Likelihood: $P(y|b, t) \propto \prod_t e^{-(y_t - 10\exp(-bt))^2/2}$
Posterior probability: $P(b|y) \propto P(y|b, t)P(b)$
[Figure: data and model, y versus t, for t from 0 to 100]
[Figure: prior and posterior as functions of b, for b from 0 to 0.1; true b=0.05]
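The example can be reproduced approximately by evaluating the posterior on a grid of b values. The sketch below takes the amplitude 10, unit noise variance, and true b = 0.05 from the slide; the flat prior on [0, 0.1], the integer time points 0 to 100, the random seed, and the grid spacing are assumptions, since the slide does not state them.

```python
import math
import random


def model(b, t):
    return 10.0 * math.exp(-b * t)


def log_likelihood(b, times, ys):
    """ln P(y|b, t) up to a constant, for unit-variance Gaussian noise."""
    return -sum((y - model(b, t)) ** 2 for t, y in zip(times, ys)) / 2.0


random.seed(0)  # assumed seed, only for repeatability
true_b = 0.05
times = list(range(0, 101))  # assumed sampling times
ys = [model(true_b, t) + random.gauss(0.0, 1.0) for t in times]  # synthetic data

db = 0.0005
b_grid = [i * db for i in range(0, 201)]  # b from 0 to 0.1

# Flat prior (assumed): the log-posterior equals the log-likelihood plus a constant
log_post = [log_likelihood(b, times, ys) for b in b_grid]
shift = max(log_post)  # subtract the maximum before exponentiating to avoid underflow
unnorm = [math.exp(lp - shift) for lp in log_post]

# Normalizing constant by a Riemann sum; it plays the role of P(y) up to dropped constants
norm = sum(unnorm) * db
posterior = [u / norm for u in unnorm]

b_map = b_grid[posterior.index(max(posterior))]
print(b_map)
```

A grid works here because there is a single parameter; with many parameters the grid grows exponentially, which is the usual motivation for sampling methods such as MCMC.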
Sensitivity analysis
Deviation around M.L.
$$-\left[\frac{\partial^2}{\partial\theta^2} \ln P(D|\theta)\right]_{\max}$$
Fisher information
$$I(\theta) = -E\left[\frac{\partial^2}{\partial\theta^2} \ln P(D|\theta)\right]$$
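For the exponential decay example, both quantities can be checked numerically. The sketch below estimates the curvature of the log-likelihood by central differences and compares it with the Fisher information, which for Gaussian noise of fixed σ reduces to the sum of squared model derivatives divided by σ². Noise-free data, the time points 0 to 100, and the step size `h` are assumptions chosen so the two values should agree closely.

```python
import math

SIGMA = 1.0  # noise standard deviation, matching the unit variance in the example


def model(b, t):
    return 10.0 * math.exp(-b * t)


def log_likelihood(b, times, ys):
    return -sum((y - model(b, t)) ** 2 for t, y in zip(times, ys)) / (2.0 * SIGMA ** 2)


def observed_information(b, times, ys, h=1e-4):
    """Minus the second derivative of ln P(D|b), by central finite differences."""
    second = (log_likelihood(b + h, times, ys)
              - 2.0 * log_likelihood(b, times, ys)
              + log_likelihood(b - h, times, ys)) / h ** 2
    return -second


def fisher_information(b, times):
    """I(b) = sum_t (dy_t/db)^2 / sigma^2, valid for Gaussian noise with fixed sigma."""
    return sum((-10.0 * t * math.exp(-b * t)) ** 2 for t in times) / SIGMA ** 2


b_true = 0.05
times = list(range(0, 101))             # assumed sampling times
ys = [model(b_true, t) for t in times]  # noise-free data (assumption for the check)

obs = observed_information(b_true, times, ys)
fisher = fisher_information(b_true, times)
approx_std = 1.0 / math.sqrt(fisher)    # rough width of the peak in b
print(obs, fisher, approx_std)
```

A large information value means a sharply peaked likelihood, so the fit is sensitive to that parameter; a small value means the data constrain it poorly.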
Sensitivity analysis