Efficient parameter estimation with the MCMC toolboxpcha/UQ/MCMC-tbx.pdf · 2018. 12. 18. · MCMC...
Embed Size (px)
Transcript of Efficient parameter estimation with the MCMC toolboxpcha/UQ/MCMC-tbx.pdf · 2018. 12. 18. · MCMC...
-
Efficient parameter estimation with the MCMC toolbox
Marko [email protected]
Finnish Meteorological Institute
DTU – MCMC lectures, part II – 17.12.2018
-
Statistical analysis by simulation
• We are interested in the uncertainty distribution of the unknown modelparameter vector θ given the observational data y and the model: p(θ|y, M).• This distribution is typically analytically intractable.• We can still simulate observations / realizations from p(y|θ, M).
0 1 2 3 4 5 6 7 8 9 100.5
1
1.5
2
2.5
3
3.5
time
y
• Statistical inference is used to define what is a good fit. Parameters that areconsistent with the data and the modelling uncertainty are accepted.
M.Laine: Efficient parameter estimation with the MCMC toolbox 2/13
-
MCMC - Markov chain Monte Carlo
• Simulate the model while sampling theparameters from a proposal distribution.• Weight the (or accept) the parameters
according to a suitable goodness-of-fitcriteria depending on prior information anderror statistics.• The resulting chain (i.e. a sample of
parameters) is a sample from the Bayesianposterior distribution of parameteruncertainty.
−3 −2 −1 0 1 2 3−8
−6
−4
−2
0
2
4
99%
95%
90%
80%
Random walk in probability distribution
M.Laine: Efficient parameter estimation with the MCMC toolbox 3/13
-
Posterior distributions
0
0.5
1
σ 3
µ3
1
1.1
1.2
1.3
θ 3
σ3
0.2 0.4 0.6
0
5
10
15
KP
3
0 0.5 1 1 1.1 1.2 1.3
θ3
Jun Jul Aug Sep Oct0
5
10
15
20
25
30
35
40
45
50
time
Y
predictive distribution of model
dY/dt = (µθT−20−ρ−Q/V−pZ)Y
95% predictive envelope
95% envelope for the observations
• Posterior distribution of model parameters.• Posterior distribution of model predictions.• Posterior distribution for model comparison.
M.Laine: Efficient parameter estimation with the MCMC toolbox 4/13
-
Terminology
The model in general form is
y = f (x|θ) + e,observations = model + error.
Likelihood function for Gaussian errors corresponds to a quadratic cost function,with
p(y|θ) ∝ exp{−1
2∑ni (yi − f (xi|θ))
2
σ2
}
= exp{−1
2SS(θ)
σ2
},
where SS(θ) = −2 log(p(y|θ)) is the -2 times log-likelihood in "sum-of-squares"cost function format. For the posterior, we also need to accountSSpri(θ) = −2 log(p(θ)), prior "sum-of-squares".
M.Laine: Efficient parameter estimation with the MCMC toolbox 5/13
-
Metropolis-Hastings algorithm
Random walk Metropolis-Hastings algorithm with Gaussian proposal distribution(and Gaussian likelihood).
• Propose new parameter value θprop = θcurr + ξ, where ξ ∼ N(0, Σprop) isdrawn from the proposal distribution.• Accept θprop with probability α,
α(θcurr, θprop) = 1∧ exp{− 1
2
(SS(θprop)− SS(θcurr)
σ2
)− 1
2
(SSpri(θprop)− SSpri(θcurr)
)}• Efficient proposal distribution⇒ adaptive tuning of Σprop, AM, DRAM
algorithms.
M.Laine: Efficient parameter estimation with the MCMC toolbox 6/13
-
Convergence diagnostics
• 1d and 2d plots of the chain• Mixing, rejection rate.• Monte Carlo error of the estimates, chain autocorrelation.• Efficient number of simulations, integrated autocorrelation time, batch mean
standard error.
M.Laine: Efficient parameter estimation with the MCMC toolbox 7/13
-
Short chains and adaptation• Important to make short chains as efficient as possible.• Efficient: produce estimates with small Monte Carlo error.
0 1000 2000 3000 4000 50000
0.1
0.2
0.3
0.4
0.5
simulation index
mean of the 1. parameter
mhamdramram
0 1000 2000 3000 4000 50000.94
0.95
0.96
0.97
0.98
0.99
195% quantile
simulation index
mhamdramram
Short MCMC chain run 1000 times with different algorithms.M.Laine: Efficient parameter estimation with the MCMC toolbox 8/13
-
Chains have not mixed yet
−1
−0.5
0
0.5
1
θ1
−2.5
−2
−1.5
−1
−0.5
0
θ2
200 400 600 800 1000−1
−0.5
0
0.5
1
θ3
200 400 600 800 1000−1
−0.5
0
0.5
1
θ4
−2.5 −2 −1.5 −1 −0.5 0 0.5 1 1.5 2 2.5−4
−3.5
−3
−2.5
−2
−1.5
−1
−0.5
0
θ 2
θ1
2 first dimensions of the chain with 50% and 95% target contoursc50 = 78.7%, c95 = 98.8%, τ=95.8
>> chainstats(chain,names)MCMC statistics, nsimu = 1000
mean std MC_err tau geweke---------------------------------------------------------------------theta_1 0.037146 0.29164 0.054374 67.086 0.017371theta_2 -0.98318 0.39323 0.082073 112.74 0.43098theta_3 0.0077533 0.37183 0.076218 101.3 0.013535theta_4 0.058239 0.34665 0.069839 101.88 0.014742---------------------------------------------------------------------
M.Laine: Efficient parameter estimation with the MCMC toolbox 9/13
-
Chains that have better mixing
−1.5
−1
−0.5
0
0.5
1
1.5
θ1
−3
−2.5
−2
−1.5
−1
−0.5
0
θ2
200 400 600 800 1000−1.5
−1
−0.5
0
0.5
1
1.5
θ3
200 400 600 800 1000−1.5
−1
−0.5
0
0.5
1
1.5
θ4
−2.5 −2 −1.5 −1 −0.5 0 0.5 1 1.5 2 2.5−4
−3.5
−3
−2.5
−2
−1.5
−1
−0.5
0
θ 2
θ1
2 first dimensions of the chain with 50% and 95% target contoursc50 = 58.8%, c95 = 98.3%, τ=44.0
>> chainstats(chain,names)MCMC statistics, nsimu = 1000
mean std MC_err tau geweke---------------------------------------------------------------------theta_1 0.047117 0.42071 0.068536 43.274 0.027867theta_2 -1.1139 0.49858 0.088196 47.542 0.60531theta_3 0.050454 0.43527 0.069026 41.525 0.023323theta_4 0.033922 0.42226 0.067013 43.516 0.038749---------------------------------------------------------------------
M.Laine: Efficient parameter estimation with the MCMC toolbox 10/13
-
MCMC Toolbox for Matlab
https://github.com/mjlaine/mcmcstat
M.Laine: Efficient parameter estimation with the MCMC toolbox 11/13
https://github.com/mjlaine/mcmcstat
-
MCMC toolbox for matlab
%% MCMC toolboxmodel.ssfun = @mycostfun
data = load(’datafile.dat’);
parameters = {{’par1’, 2.3 }{’par2’, 1.2 }
};
options.nsimu = 5000;options.method = ’am’;
[results,chain] = mcmcrun(model,data,parameters,options);
mcmcplot(chain,[],results)
M.Laine: Efficient parameter estimation with the MCMC toolbox 12/13
-
MCMC demos
Matlab source code: https://github.com/mjlaine/mcmcstatDocumentation: https://mjlaine.github.io/mcmcstat/Installation by git:
git clone https://github.com/mjlaine/mcmcstat.git
M.Laine: Efficient parameter estimation with the MCMC toolbox 13/13
https://github.com/mjlaine/mcmcstathttps://mjlaine.github.io/mcmcstat/