Intro to ABC


an introduction to Approximate Bayesian Computation

Matt Moores

Mathematical Sciences School, Queensland University of Technology

Brisbane, Australia

ABC in Sydney, July 3, 2014


Motivation

Inference for a parameter θ when it is impossible, or very expensive, to evaluate the likelihood p(y|θ)

ABC is a likelihood-free method for approximating the posterior distribution

π(θ|y)

by generating pseudo-data from the model:

w ∼ f(·|θ)


Likelihood-free rejection sampler

Algorithm 1 Likelihood-free rejection sampler

1: Draw parameter value θ′ ∼ π(θ)
2: Generate w ∼ f(·|θ′)
3: if w = y (the observed data) then
4:     accept θ′
5: end if

But if the observations y are continuous

(or the space y ∈ Y is enormous)

then P(w = y) ≈ 0

Tavaré, Balding, Griffiths & Donnelly (1997) Genetics 145(2)
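As a minimal illustration (not from the slides), Algorithm 1 can be run for a discrete toy model where exact matching is feasible; a Binomial(10, θ) likelihood with a uniform prior is assumed here:

set.seed(42)
y_obs <- 7                                      # observed count out of 10 trials
accepted <- numeric(0)
while (length(accepted) < 1000) {
  theta_prop <- runif(1)                        # 1: draw θ′ from the prior π(θ) = U(0,1)
  w <- rbinom(1, size = 10, prob = theta_prop)  # 2: generate pseudo-data w ∼ f(·|θ′)
  if (w == y_obs) {                             # 3: exact match with the observed data
    accepted <- c(accepted, theta_prop)         # 4: accept θ′
  }
}
# the accepted values are draws from the exact Beta(8, 4) posterior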


ABC tolerance

accept θ′ if δ(w,y) < ε

where

ε > 0 is the tolerance level

δ(·, ·) is a distance function (for an appropriate choice of norm)

Inference is more exact when ε is close to zero, but more proposed θ′ are rejected (a tradeoff between accuracy & computational cost)

Pritchard, Seielstad, Perez-Lezaun & Feldman (1999) Mol. Biol. Evol. 16(12)


Summary statistics

Computing δ(w,y) for w1, . . . , wn and y1, . . . , yn can be very expensive for large n

Instead, compute summary statistics s(y)

e.g. sufficient statistics (only available for the exponential family)


Sufficient statistics

Fisher-Neyman factorisation theorem:

if s(y) is sufficient for θ

then p(y|θ) = f(y) g(s(y)|θ)

only applies to the Potts, Ising, and exponential random graph models (ERGM)

otherwise, selection of suitable summary statistics can be a very difficult problem
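For example (anticipating the Gaussian model used later in the talk), with yᵢ ∼ N(µ, 1) the likelihood factorises as

p(y|µ) = (2π)^(−n/2) exp(−½ ∑ᵢ (yᵢ − ȳ)²) × exp(−(n/2)(ȳ − µ)²)

where everything except the final factor depends on y alone, so the sample mean ȳ plays the role of s(y) and is sufficient for µ.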


ABC rejection sampler

Algorithm 2 ABC rejection sampler

1: for all iterations t ∈ 1 . . . T do
2:     Draw independent proposal θ′ ∼ π(θ)
3:     Generate w ∼ f(·|θ′)
4:     if ‖s(w) − s(y)‖ < ε then
5:         set θt ← θ′
6:     else
7:         set θt ← θt−1
8:     end if
9: end for

Approximates π(θ|y) by πε(θ | ‖s(w) − s(y)‖ < ε)

Marin, Pudlo, Robert & Ryder (2012) Stat. Comput. 22(6)
Marin & Robert (2014) Bayesian Essentials with R §8.3
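A minimal R sketch of Algorithm 2 as a generic function (the argument names, the scalar summary statistic and the initialisation from the prior are assumptions, not from the slides):

abc_reject <- function(y, rprior, rmodel, sumstat, epsilon, n_iter = 10000) {
  s_obs <- sumstat(y)
  theta <- numeric(n_iter)
  theta[1] <- rprior()                    # initialise from the prior
  for (t in 2:n_iter) {
    theta_prop <- rprior()                # 2: independent proposal θ′ ∼ π(θ)
    w <- rmodel(theta_prop)               # 3: pseudo-data w ∼ f(·|θ′)
    if (abs(sumstat(w) - s_obs) < epsilon) {
      theta[t] <- theta_prop              # 4-5: accept θ′
    } else {
      theta[t] <- theta[t - 1]            # 6-7: keep the previous value
    }
  }
  theta
}

For the Gaussian example on the following slides, this could be called as
abc_reject(y, function() rnorm(1, 0, 1000), function(m) rnorm(length(y), m, 1), mean, epsilon = 0.5).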


A trivial (counter) example

Gaussian with unknown mean:

y ∼ N (µ, 1)

natural conjugate prior:

π(µ) ∼ N (0, 10⁶)

sufficient statistic:

ȳ = (1/n) ∑ᵢ₌₁ⁿ yᵢ

posterior is analytically tractable:

π(µ|y) ∼ N (m′, s′²)

where

1/s′² = n/1 + 1/10⁶ = n + 10⁻⁶

m′ = s′² (nȳ/1 + 0) = nȳ / (n + 10⁻⁶)

∴ no need for ABC (or MCMC) in practice


R code

[Figure: posterior density π(µ|y)]

y <- rnorm(n=5, mean=3, sd=1)          # simulate n = 5 observations from N(3, 1)
n <- length(y)
ybar <- sum(y)/n                       # sufficient statistic ȳ
post_s <- 1/(n + 1e-6)                 # posterior variance s′²
post_m <- post_s * n * ybar            # posterior mean m′
post_sim <- rnorm(10000, post_m, sd=sqrt(post_s))   # draws from π(µ|y)


now with ABC

[Figure: prior density π(µ)]

[Figure: ABC posterior πε(µ | δ(s(w), s(y)) < ε)]

prop_mu <- rnorm(10000, 0, sqrt(1e6))          # 10,000 proposals from the prior π(µ)
pseudo <- rnorm(n*10000, prop_mu, 1)           # pseudo-data for each proposal
pseudoMx <- matrix(pseudo, nrow=10000, ncol=n)
ps_ybar <- rowMeans(pseudoMx)                  # summary statistic s(w) for each proposal
ps_norm <- abs(ps_ybar - ybar)                 # distance to the observed s(y)
epsilon <- sort(ps_norm)[20]                   # tolerance: keep the 20 closest
prop_keep <- prop_mu[ps_norm <= epsilon]


choice of ε

[Figure: ABC posterior density for decreasing tolerance levels: (a) ε = 15.498, (b) ε = 3.47, (c) ε = 1.65, (d) ε = 1.11]
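A sketch of how panels like these could be produced, reusing prop_mu and ps_norm from the previous slide; the number of proposals kept for each panel is an assumption, not taken from the talk:

par(mfrow = c(2, 2))
for (k in c(5000, 1000, 100, 20)) {          # keep the k closest proposals
  eps <- sort(ps_norm)[k]                    # tolerance = k-th smallest distance
  hist(prop_mu[ps_norm <= eps], freq = FALSE,
       main = paste("epsilon =", round(eps, 2)), xlab = "mu")
}

As ε shrinks, the accepted sample concentrates around the analytic posterior, at the cost of discarding more proposals.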


Improvements to ABC

Alternatives to i.i.d. proposals:

ABC-MCMC (see the sketch after this list)

ABC-SMC

Regression adjustment

compensates for larger ε

Validation of ABC approximation

ABC for model choice
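As one illustration of these extensions, here is a hedged sketch of an ABC-MCMC move with a symmetric random-walk proposal for the same Gaussian toy model; the proposal standard deviation, chain length and tolerance are arbitrary choices, not values from the talk:

abc_mcmc <- function(y, n_iter = 10000, epsilon = 0.5, prop_sd = 1) {
  ybar <- mean(y)
  n <- length(y)
  theta <- numeric(n_iter)
  theta[1] <- rnorm(1, 0, sqrt(1e6))              # initialise from the prior N(0, 10^6)
  for (t in 2:n_iter) {
    theta_prop <- rnorm(1, theta[t - 1], prop_sd) # symmetric random-walk proposal
    w <- rnorm(n, theta_prop, 1)                  # pseudo-data from the model
    # with a symmetric proposal the Metropolis-Hastings ratio reduces to the
    # prior ratio; the move is only accepted if the summaries also match within ε
    log_ratio <- dnorm(theta_prop, 0, sqrt(1e6), log = TRUE) -
      dnorm(theta[t - 1], 0, sqrt(1e6), log = TRUE)
    if (abs(mean(w) - ybar) < epsilon && log(runif(1)) < log_ratio) {
      theta[t] <- theta_prop
    } else {
      theta[t] <- theta[t - 1]
    }
  }
  theta
}

Because the random walk stays near parameter values already supported by the data, far fewer pseudo-data sets are wasted than with independent draws from the diffuse prior.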


Summary

ABC is a method for likelihood-free inference

It enables inference for models that are otherwise computationally intractable

Main components of ABC:

π(θ)     proposal density for θ′
f(·|θ)   generative model for w
ε        tolerance level
δ(·, ·)  distance function
s(y)     summary statistics


References

Jean-Michel Marin & Christian Robert. Bayesian Essentials with R. Springer-Verlag, 2014.

Jean-Michel Marin, Pierre Pudlo, Christian Robert & Robin Ryder. Approximate Bayesian computational methods. Statistics & Computing, 22(6): 1167–80, 2012.

Simon Tavaré, David Balding, Robert Griffiths & Peter Donnelly. Inferring coalescence times from DNA sequence data. Genetics, 145(2): 505–18, 1997.

Jonathan Pritchard, Mark Seielstad, Anna Perez-Lezaun & Marcus Feldman. Population Growth of Human Y Chromosomes: A Study of Y Chromosome Microsatellites. Mol. Biol. Evol., 16(12): 1791–98, 1999.