Efficient parameter estimation with the MCMC toolbox. Marko Laine, Finnish Meteorological Institute. DTU – MCMC lectures, part II – 17.12.2018


• Statistical analysis by simulation

  • We are interested in the uncertainty distribution of the unknown model parameter vector θ given the observational data y and the model: p(θ|y, M).
  • This distribution is typically analytically intractable.
  • We can still simulate observations / realizations from p(y|θ, M).

[Figure: observations y plotted against time.]

  • Statistical inference is used to define what is a good fit. Parameters that are consistent with the data and the modelling uncertainty are accepted.

M.Laine: Efficient parameter estimation with the MCMC toolbox 2/13

• MCMC - Markov chain Monte Carlo

  • Simulate the model while sampling the parameters from a proposal distribution.
  • Weight (or accept) the parameters according to a suitable goodness-of-fit criterion that depends on prior information and error statistics.
  • The resulting chain (i.e. a sample of parameters) is a sample from the Bayesian posterior distribution of parameter uncertainty.

[Figure: Random walk in probability distribution, with 80%, 90%, 95% and 99% contours.]


• Posterior distributions

[Figure: pairwise posterior distributions of model parameters; axis labels include µ3, σ3, θ3 and KP3.]

[Figure: predictive distribution of the model $dY/dt = (\mu\,\theta^{T-20} - \rho - Q/V - pZ)\,Y$, plotted as Y against time (Jun to Oct), with the 95% predictive envelope and the 95% envelope for the observations.]

  • Posterior distribution of model parameters.
  • Posterior distribution of model predictions.
  • Posterior distribution for model comparison.
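The posterior distribution of model predictions can be approximated by evaluating the model at parameter values drawn from the chain. The following is a minimal sketch in base MATLAB, not the toolbox's own routine. The decay model `f`, the stand-in `chain`, and the error variance `sigma2` are all hypothetical and serve only to make the example self-contained.

```matlab
% Sketch: pointwise 95% envelopes from posterior samples (all inputs hypothetical).
rng(1);
f = @(t, theta) theta(1) .* exp(-theta(2) .* t);      % hypothetical model f(t|theta)
chain = [3 + 0.1*randn(2000,1), 0.2 + 0.01*randn(2000,1)];  % stand-in for MCMC output
sigma2 = 0.1^2;                                       % assumed observation error variance

t = linspace(0, 10, 50)';
nsample = 500;
idx = randi(size(chain, 1), nsample, 1);              % rows of the chain to use

ypred = zeros(numel(t), nsample);
for j = 1:nsample
    ypred(:, j) = f(t, chain(idx(j), :));             % model response for one sample
end
yobs = ypred + sqrt(sigma2) * randn(size(ypred));     % add observation noise

% Simple empirical quantiles by sorting each time point across samples.
ilo = ceil(0.025 * nsample);
ihi = ceil(0.975 * nsample);
spred = sort(ypred, 2);
sobs  = sort(yobs, 2);
pred_lo = spred(:, ilo);  pred_hi = spred(:, ihi);    % 95% predictive envelope
obs_lo  = sobs(:, ilo);   obs_hi  = sobs(:, ihi);     % 95% envelope for observations

fprintf('mean width: model %.3f, observations %.3f\n', ...
        mean(pred_hi - pred_lo), mean(obs_hi - obs_lo));
```

The envelope for the observations is wider than the envelope for the model response because it also includes the observation error.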


• Terminology

The model in general form is

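$$y = f(x|\theta) + e, \qquad \text{observations} = \text{model} + \text{error}.$$

The likelihood function for Gaussian errors corresponds to a quadratic cost function, with

$$p(y|\theta) \propto \exp\left\{-\frac{1}{2}\,\frac{\sum_{i=1}^{n}\left(y_i - f(x_i|\theta)\right)^2}{\sigma^2}\right\} = \exp\left\{-\frac{1}{2}\,\frac{SS(\theta)}{\sigma^2}\right\},$$

where $SS(\theta) = \sum_{i=1}^{n}\left(y_i - f(x_i|\theta)\right)^2$. In general, $-2\log(p(y|\theta))$ is the likelihood written in "sum-of-squares" cost function format; in the Gaussian case it equals $SS(\theta)/\sigma^2$ up to an additive constant that does not depend on θ. For the posterior, we also need to account for $SS_{pri}(\theta) = -2\log(p(\theta))$, the prior "sum-of-squares".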
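As a minimal sketch of the two cost functions, the following base MATLAB script evaluates SS(θ) and SSpri(θ) for a toy problem. The exponential decay model, the synthetic data, and the independent Gaussian prior (`mu0`, `s0`) are assumptions made for illustration and do not come from the slides.

```matlab
% Sketch: likelihood and prior "sum-of-squares" for a hypothetical model.
rng(1);
f = @(x, theta) theta(1) .* exp(-theta(2) .* x);   % hypothetical f(x|theta)
x = (0:10)';
sigma2 = 0.1^2;                                    % assumed known error variance
y = f(x, [3; 0.2]) + sqrt(sigma2) * randn(size(x)); % synthetic observations

% SS(theta): sum of squared residuals between data and model.
ss = @(theta) sum((y - f(x, theta)).^2);

% SSpri(theta) = -2 log p(theta) up to a constant, for an independent
% Gaussian prior theta_j ~ N(mu0_j, s0_j^2) (an assumption for this sketch).
mu0 = [2; 0.5];
s0  = [1; 1];
sspri = @(theta) sum(((theta - mu0) ./ s0).^2);

% -2 log posterior, up to an additive constant.
m2logpost = @(theta) ss(theta) / sigma2 + sspri(theta);

theta = [3; 0.2];
fprintf('SS = %.4f, SSpri = %.4f, -2 log posterior = %.4f\n', ...
        ss(theta), sspri(theta), m2logpost(theta));
```

Working with −2 log densities means that the likelihood and prior contributions simply add, which is the form used in the acceptance probability of the Metropolis-Hastings algorithm.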


• Metropolis-Hastings algorithm

Random walk Metropolis-Hastings algorithm with Gaussian proposal distribution (and Gaussian likelihood).

  • Propose new parameter value $\theta_{prop} = \theta_{curr} + \xi$, where $\xi \sim N(0, \Sigma_{prop})$ is drawn from the proposal distribution.
  • Accept $\theta_{prop}$ with probability α,

$$\alpha(\theta_{curr}, \theta_{prop}) = 1 \wedge \exp\left\{-\frac{1}{2}\left(\frac{SS(\theta_{prop}) - SS(\theta_{curr})}{\sigma^2}\right) - \frac{1}{2}\left(SS_{pri}(\theta_{prop}) - SS_{pri}(\theta_{curr})\right)\right\}$$

  • Efficient proposal distribution ⇒ adaptive tuning of $\Sigma_{prop}$, AM, DRAM algorithms.
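The propose and accept steps can be sketched in a few lines of base MATLAB. This is an illustrative random walk sampler with a fixed proposal covariance, not the toolbox implementation. The decay model, the synthetic data, the flat prior, and the proposal scales are all assumptions chosen for the example.

```matlab
% Sketch: random walk Metropolis-Hastings with a fixed Gaussian proposal.
rng(1);
f = @(x, theta) theta(1) .* exp(-theta(2) .* x);    % hypothetical model
x = (0:10)';
sigma2 = 0.1^2;                                     % assumed known error variance
y = f(x, [3; 0.2]) + sqrt(sigma2) * randn(size(x)); % synthetic data
ss    = @(theta) sum((y - f(x, theta)).^2);         % SS(theta)
sspri = @(theta) 0;                                 % flat prior (assumption)

nsimu = 5000;
npar  = 2;
R = chol(diag([0.05, 0.01].^2));   % upper Cholesky factor, R'*R = Sigma_prop
chain = zeros(nsimu, npar);
theta_curr = [2.5; 0.3];
ss_curr = ss(theta_curr);
sspri_curr = sspri(theta_curr);
naccept = 0;

for i = 1:nsimu
    theta_prop = theta_curr + R' * randn(npar, 1);  % xi ~ N(0, Sigma_prop)
    ss_prop = ss(theta_prop);
    sspri_prop = sspri(theta_prop);
    alpha = min(1, exp(-0.5 * (ss_prop - ss_curr) / sigma2 ...
                       - 0.5 * (sspri_prop - sspri_curr)));
    if rand < alpha                                 % accept with probability alpha
        theta_curr = theta_prop;
        ss_curr = ss_prop;
        sspri_curr = sspri_prop;
        naccept = naccept + 1;
    end
    chain(i, :) = theta_curr';                      % store current value either way
end

fprintf('acceptance rate: %.2f\n', naccept / nsimu);
disp(mean(chain(nsimu/2+1:end, :)));                % posterior mean, first half discarded
```

A rejected proposal repeats the current value in the chain. The repeated values are part of the sample and must not be dropped.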


• Convergence diagnostics

  • 1d and 2d plots of the chain.
  • Mixing, rejection rate.
  • Monte Carlo error of the estimates, chain autocorrelation.
  • Efficient number of simulations, integrated autocorrelation time, batch mean standard error.
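Two of these diagnostics can be sketched in base MATLAB: the integrated autocorrelation time τ and the batch mean standard error. The sketch uses a synthetic AR(1) series as a stand-in for one column of a chain, and truncates the autocorrelation sum at the first non-positive lag. That truncation is a simple heuristic chosen for this example, and the toolbox may use a different windowing rule.

```matlab
% Sketch: integrated autocorrelation time and batch means standard error.
rng(1);
n = 20000;
phi = 0.9;                                   % AR(1) coefficient (synthetic example)
chain = filter(1, [1 -phi], randn(n, 1));    % x(i) = phi*x(i-1) + noise

% Sample autocorrelation up to maxlag.
xc = chain - mean(chain);
maxlag = 500;
rho = zeros(maxlag, 1);
for k = 1:maxlag
    rho(k) = sum(xc(1:n-k) .* xc(1+k:n)) / sum(xc.^2);
end

% tau = 1 + 2*sum(rho_k), summed up to the first non-positive lag.
cut = find(rho <= 0, 1);
if isempty(cut)
    cut = maxlag + 1;
end
tau  = 1 + 2 * sum(rho(1:cut-1));
neff = n / tau;                              % effective number of samples

% Batch means: standard error of the mean from nbatch consecutive batches.
nbatch = 50;
blen = floor(n / nbatch);
bmeans = mean(reshape(chain(1:nbatch*blen), blen, nbatch), 1);
mc_err = std(bmeans) / sqrt(nbatch);

fprintf('tau = %.1f, neff = %.0f, MC error = %.4f\n', tau, neff, mc_err);
```

For an AR(1) series the theoretical value is τ = (1 + φ)/(1 − φ), which is 19 for φ = 0.9, so the estimate can be checked against a known answer.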


• Short chains and adaptation

  • Important to make short chains as efficient as possible.
  • Efficient: produce estimates with small Monte Carlo error.

[Figure: two panels plotted against simulation index (0 to 5000), "mean of the 1. parameter" and "95% quantile", with legend entries mh, am, dram and ram.]

Short MCMC chain run 1000 times with different algorithms.
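Adaptation can be illustrated with a sketch of the adaptive Metropolis (AM) idea: the proposal covariance is periodically replaced by the scaled empirical covariance of the chain so far, using the scaling 2.4²/d for d parameters. This is an illustrative base MATLAB version, not the toolbox code. The correlated Gaussian target, the adaptation interval `adaptint`, and the regularization `epsi` are assumptions for the example.

```matlab
% Sketch: adaptive Metropolis on a hypothetical correlated Gaussian target.
rng(1);
Ctarget = [1 0.9; 0.9 1];                    % target covariance (assumption)
Cinv = inv(Ctarget);
m2logp = @(theta) theta' * Cinv * theta;     % -2 log target density, up to a constant

npar = 2;
nsimu = 5000;
adaptint = 100;                              % adapt every 100 steps (assumption)
sd = 2.4^2 / npar;                           % AM scaling factor
epsi = 1e-8;                                 % keeps the covariance positive definite

R = chol(0.01 * eye(npar));                  % deliberately poor initial proposal
chain = zeros(nsimu, npar);
theta_curr = zeros(npar, 1);
c_curr = m2logp(theta_curr);

for i = 1:nsimu
    theta_prop = theta_curr + R' * randn(npar, 1);
    c_prop = m2logp(theta_prop);
    if rand < min(1, exp(-0.5 * (c_prop - c_curr)))
        theta_curr = theta_prop;
        c_curr = c_prop;
    end
    chain(i, :) = theta_curr';
    if mod(i, adaptint) == 0
        % Proposal covariance from the chain history: sd*cov + sd*epsi*I.
        R = chol(sd * cov(chain(1:i, :)) + sd * epsi * eye(npar));
    end
end

disp(cov(chain(1001:end, :)));               % should be close to Ctarget
```

The initial proposal is much too narrow for this target, and the adaptation widens and rotates it towards the shape of the target as the chain history accumulates.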

• Chains have not mixed yet

[Figure: chain plots of θ1, θ2, θ3 and θ4 against simulation index (1 to 1000), and a plot of θ2 against θ1.]

First two dimensions of the chain with 50% and 95% target contours. c50 = 78.7%, c95 = 98.8%, τ = 95.8

>> chainstats(chain,names)
MCMC statistics, nsimu = 1000

                 mean         std      MC_err         tau      geweke
---------------------------------------------------------------------
theta_1      0.037146     0.29164    0.054374      67.086    0.017371
theta_2      -0.98318     0.39323    0.082073      112.74     0.43098
theta_3     0.0077533     0.37183    0.076218       101.3    0.013535
theta_4      0.058239     0.34665    0.069839      101.88    0.014742
---------------------------------------------------------------------


• Chains that have better mixing

[Figure: chain plots of θ1, θ2, θ3 and θ4 against simulation index (1 to 1000), and a plot of θ2 against θ1.]

First two dimensions of the chain with 50% and 95% target contours. c50 = 58.8%, c95 = 98.3%, τ = 44.0

>> chainstats(chain,names)
MCMC statistics, nsimu = 1000

                 mean         std      MC_err         tau      geweke
---------------------------------------------------------------------
theta_1      0.047117     0.42071    0.068536      43.274    0.027867
theta_2       -1.1139     0.49858    0.088196      47.542     0.60531
theta_3      0.050454     0.43527    0.069026      41.525    0.023323
theta_4      0.033922     0.42226    0.067013      43.516    0.038749
---------------------------------------------------------------------


• MCMC Toolbox for Matlab

https://github.com/mjlaine/mcmcstat


• MCMC toolbox for Matlab

%% MCMC toolbox
model.ssfun = @mycostfun;       % sum-of-squares cost function SS(theta)

data = load('datafile.dat');

% parameters: {name, initial value}
parameters = {
    {'par1', 2.3}
    {'par2', 1.2}
    };

options.nsimu = 5000;           % number of MCMC simulations
options.method = 'am';          % adaptive Metropolis

[results,chain] = mcmcrun(model,data,parameters,options);

mcmcplot(chain,[],results)
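The slide does not show `mycostfun`. As a hedged sketch, it could look like the function below, saved as `mycostfun.m`. It assumes that the cost function is called as `ss = mycostfun(theta, data)`, that `datafile.dat` loads as a two-column matrix `[x, y]`, and that the model is an exponential decay. The data layout and the model are hypothetical and not taken from the slides.

```matlab
function ss = mycostfun(theta, data)
% MYCOSTFUN  Sum-of-squares cost function SS(theta) for a hypothetical model.
%   Assumptions for this sketch (not from the slides):
%   - data is a two-column matrix [x, y]
%   - the model is f(x|theta) = theta(1) * exp(-theta(2) * x)
x = data(:, 1);
y = data(:, 2);
ymodel = theta(1) .* exp(-theta(2) .* x);   % model response at the observed x
ss = sum((y - ymodel).^2);                  % SS(theta) = sum of squared residuals
end
```

With this layout the function returns zero when the data lie exactly on the model curve, which gives a quick sanity check before starting a long run.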


• MCMC demos

Matlab source code: https://github.com/mjlaine/mcmcstat
Documentation: https://mjlaine.github.io/mcmcstat/
Installation by git:

git clone https://github.com/mjlaine/mcmcstat.git