Intro to ABC
Intro to ABC Example Conclusion
An Introduction to Approximate Bayesian Computation
Matt Moores
Mathematical Sciences School, Queensland University of Technology
Brisbane, Australia
ABC in Sydney, July 3, 2014
Motivation
Inference for a parameter θ when it is:
impossible
or very expensive
to evaluate the likelihood p(y|θ)
ABC is a likelihood-free method for approximating the posterior distribution
π(θ|y)
by generating pseudo-data from the model:
w ∼ f(·|θ)
Likelihood-free rejection sampler
Algorithm 1 Likelihood-free rejection sampler
1: Draw parameter value θ′ ∼ π(θ)
2: Generate w ∼ f(·|θ′)
3: if w = y (the observed data) then
4:   accept θ′
5: end if
But if the observations y are continuous
(or the space y ∈ Y is enormous)
then P(w = y) ≈ 0
Tavaré, Balding, Griffiths & Donnelly (1997) Genetics 145(2)
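As a minimal sketch of Algorithm 1, assume a toy discrete model that is not part of the slides: y ∼ Binomial(10, θ) with a Uniform(0, 1) prior. Because y is a single count, exact matches w = y occur often enough for the sampler to be usable, and the accepted draws can be compared with the known Beta posterior. The function name `lf_rejection` and all numbers are illustrative.

```r
# Likelihood-free rejection sampler (Algorithm 1) for a toy model.
# Assumed for illustration only: y ~ Binomial(size, theta), theta ~ Uniform(0, 1).
lf_rejection <- function(y_obs, size, n_prop = 100000) {
  theta_prop <- runif(n_prop)                          # 1: draw theta' from the prior
  w <- rbinom(n_prop, size = size, prob = theta_prop)  # 2: generate pseudo-data
  theta_prop[w == y_obs]                               # 3-4: accept only exact matches
}

set.seed(1)
accepted <- lf_rejection(y_obs = 7, size = 10)

# The exact posterior here is Beta(y + 1, size - y + 1) = Beta(8, 4), with mean 8/12.
length(accepted) / 100000   # acceptance rate, about 1/11 under a uniform prior
mean(accepted)              # should be close to 8/12
```

With continuous data the comparison `w == y_obs` would almost never be true, which is the problem the tolerance ε addresses.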
ABC tolerance
accept θ′ if δ(w,y) < ε
where
ε > 0 is the tolerance level
δ(·, ·) is a distance function (for an appropriate choice of norm)
Inference is more exact when ε is close to zero, but more proposed θ′ are rejected (a tradeoff between accuracy and computational cost)
Pritchard, Seielstad, Perez-Lezaun & Feldman (1999) Mol. Biol. Evol. 16(12)
Summary statistics
Computing δ(w,y) for w1, …, wn and y1, …, yn can be very expensive for large n
Instead, compute summary statistics s(y)
e.g. sufficient statistics (only available for the exponential family)
Sufficient statistics
Fisher-Neyman factorisation theorem:
if s(y) is sufficient for θ
then p(y|θ) = f(y) g (s(y)|θ)
for models with intractable likelihoods, this applies to e.g. the Potts model, the Ising model and exponential random graph models (ERGM)
otherwise, selection of suitable summary statistics can be a very difficult problem
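As a worked instance of the factorisation, take n i.i.d. observations from N(µ, 1), the model used in the example later in these slides. The split below uses the identity Σ(yᵢ − µ)² = Σ(yᵢ − ȳ)² + n(ȳ − µ)²; the labels f and g follow the theorem as stated above.

```latex
\begin{align*}
p(y \mid \mu)
  &= (2\pi)^{-n/2} \exp\!\Big( -\tfrac{1}{2} \sum_{i=1}^{n} (y_i - \mu)^2 \Big) \\
  &= \underbrace{(2\pi)^{-n/2} \exp\!\Big( -\tfrac{1}{2} \sum_{i=1}^{n} (y_i - \bar{y})^2 \Big)}_{f(y)}
     \;\times\;
     \underbrace{\exp\!\Big( -\tfrac{n}{2} (\bar{y} - \mu)^2 \Big)}_{g(s(y) \mid \mu)},
  \qquad s(y) = \bar{y}.
\end{align*}
```

Only the second factor involves µ, and it depends on the data through ȳ alone, so comparing s(w) with s(y) loses no information about µ in this model.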
ABC rejection sampler
Algorithm 2 ABC rejection sampler
1: for all iterations t ∈ 1 … T do
2:   Draw independent proposal θ′ ∼ π(θ)
3:   Generate w ∼ f(·|θ′)
4:   if ‖s(w) − s(y)‖ < ε then
5:     set θt ← θ′
6:   else
7:     set θt ← θt−1
8:   end if
9: end for
Approximates π(θ|y) by πε(θ | ‖s(w) − s(y)‖ < ε)
Marin, Pudlo, Robert & Ryder (2012) Stat. Comput. 22(6)
Marin & Robert (2014) Bayesian Essentials with R §8.3
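The ABC rejection sampler can be sketched as a generic R function. This is an illustration under stated assumptions, not the code from the cited references: the names `abc_rejection`, `r_prior`, `r_model` and `s` are hypothetical, the Poisson model with a Gamma prior is a toy chosen because its exact posterior is known, and the sketch keeps only accepted proposals rather than repeating θt−1 after a rejection.

```r
# ABC rejection sampler: a sketch that keeps only accepted proposals
# (Algorithm 2 instead repeats theta_{t-1} after a rejection).
abc_rejection <- function(y, r_prior, r_model, s, epsilon, n_iter) {
  s_obs <- s(y)
  accepted <- vector("list", n_iter)
  for (t in seq_len(n_iter)) {
    theta_prop <- r_prior()                  # draw an independent proposal from the prior
    w <- r_model(theta_prop)                 # generate pseudo-data
    # Euclidean norm between summary statistics; entries left NULL are rejections
    if (sqrt(sum((s(w) - s_obs)^2)) < epsilon) accepted[[t]] <- theta_prop
  }
  unlist(accepted)                           # unlist() drops the NULL entries
}

# Toy model (assumed): y_i ~ Poisson(theta), theta ~ Gamma(shape = 1, rate = 0.2).
set.seed(2014)
y <- rpois(20, lambda = 4)                   # made-up "observed" data
post <- abc_rejection(
  y       = y,
  r_prior = function() rgamma(1, shape = 1, rate = 0.2),
  r_model = function(theta) rpois(length(y), lambda = theta),
  s       = mean,
  epsilon = 0.25,
  n_iter  = 20000
)

length(post) / 20000                         # acceptance rate
mean(post)                                   # ABC estimate of the posterior mean
(1 + sum(y)) / (0.2 + length(y))             # exact conjugate posterior mean, for comparison
```

Passing the prior sampler, the simulator and the summary function as arguments keeps the sampler itself independent of any particular model.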
A trivial (counter) example
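Gaussian with unknown mean:
$y_i \sim \mathcal{N}(\mu, 1), \quad i = 1, \dots, n$
natural conjugate prior:
$\pi(\mu) \sim \mathcal{N}(0, 10^6)$
sufficient statistic:
$\bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i$
posterior is analytically tractable:
$\pi(\mu \mid y) \sim \mathcal{N}(m', s'^2)$
where
$\frac{1}{s'^2} = \frac{n}{1} + \frac{1}{10^6}$
$m' = s'^2 \left( \frac{n \bar{y}}{1} + 0 \right) = \frac{n \bar{y}}{n + 10^{-6}}$
∴ no need for ABC (nor MCMC) in practice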
R code
[Figure: density of the exact posterior π(µ|y), plotted for µ from 1.5 to 4.5]
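# simulate n = 5 observations from N(3, 1)
y <- rnorm(n = 5, mean = 3, sd = 1)
n <- length(y)
ybar <- sum(y) / n                    # sufficient statistic
# exact posterior variance and mean under the N(0, 1e6) prior
post_s <- 1 / (n + 1e-6)
post_m <- post_s * n * ybar
# 10000 draws from the exact posterior
post_sim <- rnorm(10000, post_m, sd = sqrt(post_s))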
now with ABC
[Figure: two panels. First panel: prior π(µ), plotted for µ from −4000 to 4000. Second panel: ABC posterior πε(µ | δ(s(w), s(y)) < ε), plotted for µ from 0 to 6]
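# 10000 proposals from the N(0, 1e6) prior
prop_mu <- rnorm(10000, 0, sqrt(1e6))
# n pseudo-observations per proposal (prop_mu is recycled, so row i uses prop_mu[i])
pseudo <- rnorm(n * 10000, prop_mu, 1)
pseudoMx <- matrix(pseudo, nrow = 10000, ncol = n)
ps_ybar <- rowMeans(pseudoMx)         # summary statistic of each pseudo-data set
ps_norm <- abs(ps_ybar - ybar)        # distance to the observed summary statistic
# set epsilon to the 20th smallest distance, so 20 proposals are kept
epsilon <- sort(ps_norm)[20]
prop_keep <- prop_mu[ps_norm <= epsilon]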
choice of ε
[Figure: ABC posterior density for four tolerance levels: (a) ε = 15.498, (b) ε = 3.47, (c) ε = 1.65, (d) ε = 1.11. The density concentrates over a narrower range of µ as ε decreases, from roughly −15 to 20 in panel (a) to 1.5 to 4.5 in panels (c) and (d)]
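The effect of ε can also be checked numerically. The sketch below reuses the Gaussian example but is self-contained; the grid of tolerances, the seed and the number of proposals are arbitrary choices, and it simulates the summary statistic directly from N(µ, 1/n) instead of generating full pseudo-data sets, a shortcut that is only valid because the sampling distribution of the mean is known here.

```r
set.seed(42)
n <- 5
y <- rnorm(n, mean = 3, sd = 1)
ybar <- mean(y)

n_prop  <- 100000
prop_mu <- rnorm(n_prop, 0, sqrt(1e6))            # proposals from the N(0, 1e6) prior
# Shortcut (assumption): the mean of n draws from N(mu, 1) is N(mu, 1/n)
ps_ybar <- rnorm(n_prop, prop_mu, sqrt(1 / n))
ps_norm <- abs(ps_ybar - ybar)

eps_grid <- c(15, 3.5, 1.5, 0.5)                  # illustrative tolerances
res <- t(sapply(eps_grid, function(eps) {
  keep <- prop_mu[ps_norm < eps]
  c(epsilon = eps, n_accepted = length(keep), abc_sd = sd(keep))
}))
print(res)

sqrt(1 / (n + 1e-6))                              # exact posterior sd, for comparison
```

Smaller tolerances bring the standard deviation of the accepted sample closer to the exact posterior value, at the price of fewer accepted proposals.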
Improvements to ABC
Alternatives to i.i.d. proposals:
  ABC-MCMC
  ABC-SMC
Regression adjustment
  compensates for larger ε
Validation of ABC approximation
ABC for model choice
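Regression adjustment can be illustrated on the Gaussian example. The sketch below uses one simple form of the idea, an unweighted linear regression of the accepted µ on s(w) − s(y), followed by shifting each accepted value to where the regression predicts it would lie at s(w) = s(y). The tolerance of 5, the seed and all variable names are assumptions made for illustration; practical implementations often weight the regression by distance.

```r
set.seed(7)
n <- 5
y <- rnorm(n, mean = 3, sd = 1)
ybar <- mean(y)

n_prop  <- 200000
prop_mu <- rnorm(n_prop, 0, sqrt(1e6))            # proposals from the N(0, 1e6) prior
pseudo  <- matrix(rnorm(n * n_prop, prop_mu, 1), nrow = n_prop)
diff_s  <- rowMeans(pseudo) - ybar                # s(w) - s(y)

epsilon <- 5                                      # deliberately large tolerance
keep    <- abs(diff_s) < epsilon
mu_k    <- prop_mu[keep]
d_k     <- diff_s[keep]

fit    <- lm(mu_k ~ d_k)                          # linear regression on accepted draws
mu_adj <- mu_k - coef(fit)[2] * d_k               # remove the trend in s(w) - s(y)

c(unadjusted = sd(mu_k),
  adjusted   = sd(mu_adj),
  exact      = sqrt(1 / (n + 1e-6)))
```

In this linear-Gaussian setting the adjusted sample has a spread close to the exact posterior standard deviation even though ε is large; for other models the linear correction is only an approximation.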
Summary
ABC is a method for likelihood-free inference
It enables inference for models that are otherwise computationally intractable
Main components of ABC:
  π(θ)     prior, used as the proposal density for θ′
  f(·|θ)   generative model for w
  ε        tolerance level
  δ(·, ·)  distance function
  s(y)     summary statistics
References
Jean-Michel Marin & Christian Robert. Bayesian Essentials with R. Springer-Verlag, 2014.
Jean-Michel Marin, Pierre Pudlo, Christian Robert & Robin Ryder. Approximate Bayesian computational methods. Statistics & Computing, 22(6): 1167–80, 2012.
Simon Tavaré, David Balding, Robert Griffiths & Peter Donnelly. Inferring coalescence times from DNA sequence data. Genetics, 145(2): 505–18, 1997.
Jonathan Pritchard, Mark Seielstad, Anna Perez-Lezaun & Marcus Feldman. Population Growth of Human Y Chromosomes: A Study of Y Chromosome Microsatellites. Mol. Biol. Evol. 16(12): 1791–98, 1999.