Bayesian methods for parameter estimation and model comparison

  • Bayesian methods for parameter estimation and model comparison

    Carson C Chow, LBM, NIDDK, NIH

    Monday, April 26, 2010

  • Task: Fit a model (ODE, PDE,...) to data - estimate parameters

    $\frac{dy}{dt} = f(t; \theta)$

    $y(t) = g(t \mid y(0), \theta)$

    [Figure: two plots of y against t, with the example model $y = at + b$]

  • Task: Fit a model (ODE, PDE,...) to data - estimate parameters

    Questions:

    What algorithm to use?

    How good is the fit?

    Is there a better model?

    Sensitivity?

    Answer: Use Bayesian inference and MCMC

    Answer: Invert vs. infer

  • Bayesian inference

    Frequentist: Probability is a frequency (of a random variable)

    Bayesian: Probability is a measure of uncertainty

    Jaynes: Probability is extended logic

  • Bayesian inference

    Models and parameters have probabilities

    Anything can be assigned a probability

    Makes everything straightforward

  • Parameter estimation

    D = Data, θ = Parameters, y(θ,t): solution

    e.g. Minimize mean square error: $\sum_i (D_i - y_i(\theta))^2$

    i.e. Maximum likelihood estimation: θ that maximizes likelihood

    $P(D|\theta) \propto \exp\left(-\sum_i \frac{(D_i - y_i(\theta))^2}{2\sigma^2}\right)$
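
The equivalence between minimizing the squared error and maximizing the Gaussian likelihood can be checked numerically. The following is a minimal sketch, assuming the straight-line model y = at + b from the earlier slide and synthetic data; the values of `true_a`, `true_b`, `sigma`, the seed, and the helper names `fit_line` and `log_likelihood` are illustrative assumptions, not taken from the talk.

```python
import random

def fit_line(ts, ds):
    """Closed-form least-squares estimate of (a, b) for y = a*t + b."""
    n = len(ts)
    t_mean = sum(ts) / n
    d_mean = sum(ds) / n
    s_tt = sum((t - t_mean) ** 2 for t in ts)
    s_td = sum((t - t_mean) * (d - d_mean) for t, d in zip(ts, ds))
    a = s_td / s_tt
    b = d_mean - a * t_mean
    return a, b

def log_likelihood(a, b, ts, ds, sigma):
    """ln P(D|theta) up to an additive constant, for Gaussian noise of width sigma."""
    return -sum((d - (a * t + b)) ** 2 for t, d in zip(ts, ds)) / (2 * sigma ** 2)

# Synthetic data (illustrative values, not from the talk).
rng = random.Random(0)
true_a, true_b, sigma = 0.5, 2.0, 1.0
ts = [float(t) for t in range(21)]
ds = [true_a * t + true_b + rng.gauss(0.0, sigma) for t in ts]

a_hat, b_hat = fit_line(ts, ds)
best = log_likelihood(a_hat, b_hat, ts, ds, sigma)

# Moving away from the least-squares estimate lowers the log-likelihood.
worse = [log_likelihood(a_hat + da, b_hat + db, ts, ds, sigma)
         for da in (-0.05, 0.05) for db in (-0.5, 0.5)]
print(a_hat, b_hat, best, max(worse))
```

Because the log-likelihood is just the negative squared error scaled by 1/(2σ²), the least-squares estimate and the maximum likelihood estimate coincide whenever σ is the same for every data point.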

  • Statistics

    But in frequentist statistics, models and parameters cannot have probabilities

    Confidence intervals of a parameter are with respect to sampling errors

    “Natural” thing to do is to find the probability distribution for a model or parameters

  • Bayes theorem

    $P(\theta, D) = P(\theta|D)\,P(D) = P(D|\theta)\,P(\theta)$

    $P(\theta|D) = \frac{P(D|\theta)\,P(\theta)}{P(D)}$

    Posterior: $P(\theta|D)$, Likelihood: $P(D|\theta)$, Prior: $P(\theta)$, Evidence: $P(D)$

    Bayesian inference reduces statistics to one equation

  • Parameter estimation

    D = Data, θ = Parameters, y(θ,t): solution

    $P(\theta|D) = \frac{P(D|\theta)\,P(\theta)}{P(D)}$: Probability of parameter given the data

    $P(D) = \int P(D|\theta)\,P(\theta)\,d\theta$

    $P(\theta|D) \propto P(D|\theta)\,P(\theta)$

    $P(D|\theta) \propto \exp\left(-\sum_i \frac{(D_i - y_i(\theta))^2}{2\sigma^2}\right)$
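
Taking the logarithm of the posterior makes the link to least squares explicit. The following worked step assumes the Gaussian likelihood above with a known, fixed σ; "const" collects the terms that do not depend on θ.

```latex
\begin{align}
\ln P(\theta|D) &= \ln P(D|\theta) + \ln P(\theta) - \ln P(D) \\
                &= -\sum_i \frac{(D_i - y_i(\theta))^2}{2\sigma^2} + \ln P(\theta) + \text{const}
\end{align}
```

With a flat prior, $\ln P(\theta)$ is also constant, so the θ that maximizes the posterior is the same θ that minimizes the squared error. A non-flat prior adds a penalty term to the squared error.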

  • Example

    Model as exponential decay with Gaussian noise

    Likelihood: $P(y|b,t) \propto \prod_t e^{-(y_t - 10\exp(-bt))^2/2}$

    Posterior probability: $P(b|y) \propto P(y|b,t)\,P(b)$

    [Figure: model and data, y against t, for t from 0 to 100]

    [Figure: prior and posterior against b, for b from 0 to 0.1; true b=0.05]
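
For a single parameter, the posterior in this example can be evaluated directly on a grid. The following is a minimal sketch under stated assumptions: the data are synthetic, and the sampling times, the random seed, the flat prior on [0, 0.1], and the grid size are choices made here for illustration, not details given on the slides.

```python
import math
import random

def model(b, t):
    """Exponential decay with the amplitude of 10 that appears in the likelihood."""
    return 10.0 * math.exp(-b * t)

def log_likelihood(b, ts, ys):
    """ln P(y|b,t) up to a constant, for unit-variance Gaussian noise."""
    return -0.5 * sum((y - model(b, t)) ** 2 for t, y in zip(ts, ys))

# Synthetic data: the sampling times and the seed are assumptions.
rng = random.Random(1)
true_b = 0.05
ts = [float(t) for t in range(0, 101, 5)]
ys = [model(true_b, t) + rng.gauss(0.0, 1.0) for t in ts]

# Flat prior on [0, 0.1] (an assumption), evaluated on a uniform grid.
n_grid = 1001
db = 0.1 / (n_grid - 1)
grid = [i * db for i in range(n_grid)]
log_post = [log_likelihood(b, ts, ys) for b in grid]

# Subtract the maximum before exponentiating to avoid underflow.
shift = max(log_post)
unnorm = [math.exp(lp - shift) for lp in log_post]

# Riemann sum over the grid; plays the role of P(D) up to factors that cancel.
norm = sum(unnorm) * db
posterior = [u / norm for u in unnorm]

b_map = grid[posterior.index(max(posterior))]
b_mean = sum(b * p for b, p in zip(grid, posterior)) * db
print(b_map, b_mean)
```

A grid works here because there is only one parameter. With many parameters the number of grid points grows exponentially, which is the usual motivation for sampling methods such as MCMC.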

  • Sensitivity analysis

    Deviation around M.L.: $-\left[\frac{\partial^2}{\partial\theta^2} \ln P(D|\theta)\right]_{\max}$

    Fisher information: $I(\theta) = -E\left[\frac{\partial^2}{\partial\theta^2} \ln P(D|\theta)\right]$
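
Both quantities can be computed for the exponential decay example. The sketch below assumes unit-variance Gaussian noise, the same illustrative sampling times as a hypothetical data set, and noise-free data so that the likelihood maximum sits exactly at the true b; the helper names `curvature` and `fisher_information` are introduced here for illustration. It uses the standard result that, for Gaussian noise with known σ, the Fisher information is $\sum_i (\partial y_i/\partial\theta)^2/\sigma^2$.

```python
import math

def model(b, t):
    """Exponential decay with amplitude 10."""
    return 10.0 * math.exp(-b * t)

def log_likelihood(b, ts, ys, sigma=1.0):
    """ln P(D|b) up to an additive constant, for Gaussian noise of width sigma."""
    return -sum((y - model(b, t)) ** 2 for t, y in zip(ts, ys)) / (2 * sigma ** 2)

def curvature(f, x, h=1e-4):
    """Minus the central-difference second derivative of f at x."""
    return -(f(x + h) - 2.0 * f(x) + f(x - h)) / h ** 2

def fisher_information(b, ts, sigma=1.0):
    """For Gaussian noise, I(b) = sum_i (dy_i/db)^2 / sigma^2."""
    return sum((10.0 * t * math.exp(-b * t)) ** 2 for t in ts) / sigma ** 2

ts = [float(t) for t in range(0, 101, 5)]   # assumed sampling times
b0 = 0.05
ys = [model(b0, t) for t in ts]             # noise-free data, maximum exactly at b0

observed = curvature(lambda b: log_likelihood(b, ts, ys), b0)
expected = fisher_information(b0, ts)
print(observed, expected, 1.0 / math.sqrt(observed))
```

A large curvature means the likelihood is sharply peaked, so the data constrain b tightly. Under a Gaussian approximation around the maximum, the inverse square root of the curvature gives a rough width for the estimate of b.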

  • Sensitivity analysis