
On Particle Gibbs Sampling

Sumeetpal S. Singh, Cambridge University Engineering Department

Joint work with N. Chopin (CREST-ENSAE)

MCMSKI, Chamonix, 7 January 2014


State-space Model

Running example (Yu & Meng 2011):

$X_0 \sim N(\mu, \sigma^2)$, $\quad X_{t+1} - \mu = \rho\,(X_t - \mu) + \sigma\,\varepsilon_t$, $\quad \varepsilon_t \overset{\text{i.i.d.}}{\sim} N(0, 1)$

$Y_t \mid X_t = x_t \sim \text{Poisson}(e^{x_t})$

In general:

$X_t \mid X_{t-1} = x_{t-1} \sim m_\theta(x_{t-1}, x_t)\,dx_t$

$Y_t \mid X_t = x_t \sim g_\theta(x_t, y_t)\,dy_t$

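As a concrete reference point, here is a minimal NumPy sketch of the running example; the parameter values at the end are illustrative, not taken from the slides.

```python
import numpy as np

def simulate_ssm(T, mu, rho, sigma, rng=None):
    """Simulate the running example (Yu & Meng 2011): an AR(1)
    latent state with Poisson(exp(x_t)) observations."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.empty(T + 1)
    x[0] = rng.normal(mu, sigma)            # X_0 ~ N(mu, sigma^2)
    for t in range(T):
        # X_{t+1} - mu = rho (X_t - mu) + sigma eps_t
        x[t + 1] = mu + rho * (x[t] - mu) + sigma * rng.normal()
    y = rng.poisson(np.exp(x))              # Y_t | X_t = x_t ~ Poisson(e^{x_t})
    return x, y

# Illustrative values, not from the slides:
x, y = simulate_ssm(T=399, mu=0.0, rho=0.9, sigma=0.5)
```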

Inference Objective

The posterior: $p(\theta, x_{0:T} \mid y_{0:T})$, with $\theta = (\mu, \rho, \sigma)$

The Gibbs sampler: $(\theta, x_{0:T}) \to (\theta', x'_{0:T})$

$\sigma' \mid (x_{0:T}, \mu, \rho) \sim \text{Gamma}(\cdots)$, ...

$\mu' \mid (x_{0:T}, \sigma', \rho') \sim \text{Normal}(\cdots)$

$x'_{0:T} \mid (\sigma', \mu', \rho')$

One at a time: $x_i \mid (x'_{0:i-1}, x_{i+1:T}, \sigma', \mu', \rho')$


Particle Gibbs kernel (Andrieu, Doucet and Holenstein, 2010):

Replace Gibbs step for x0:T with

$X'_{0:T} \mid (\theta', x_{0:T}) \sim P_{\theta',N}(x_{0:T}, dx'_{0:T})$

Invariant measure: $p(x_{0:T} \mid \theta', y_{0:T})$

The kernel is a randomly chosen output of a particle filter run with $N$ particles, one of which is fixed to $x_{0:T}$

Meant to emulate the Gibbs step (change the whole trajectory)

Typically:

$x'_{0:T} \neq x_{0:T}$ but $x'_i = x_i$ for small $i$.

Increasing $N$ fixes this.

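To make the sweep concrete, here is a minimal Python skeleton of the resulting sampler; it is a sketch, not the authors' code. `sample_theta_given_x` is a hypothetical stand-in for the Gamma/Normal parameter conditionals elided above, and `cpf_kernel` is the conditional particle filter kernel sketched after the "Sampling from $P_{\theta,N}$" slide below.

```python
def particle_gibbs(y, theta0, x_init, n_iter, N, rng):
    """Skeleton of a Particle Gibbs sweep: exact Gibbs updates for the
    parameters, with the x_{0:T} block updated by the cPF kernel
    P_{theta,N}, which leaves p(x_{0:T} | theta, y_{0:T}) invariant."""
    theta, x = theta0, x_init
    samples = []
    for _ in range(n_iter):
        # Hypothetical helper: draw (sigma', mu', rho') from their full
        # conditionals given the current trajectory x_{0:T}
        theta = sample_theta_given_x(y, x, theta, rng)
        # Replace the exact x_{0:T} Gibbs step with a draw from the
        # cPF kernel, conditioned on the current trajectory
        x = cpf_kernel(y, theta, x, N, rng)
        samples.append((theta, x.copy()))
    return samples
```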

Running Particle Gibbs

Sampling $p(\theta, x_{0:399} \mid y_{0:399})$

[Figure: two panels plotting update rate (0 to 1) against $t = 0, \ldots, 400$, for multinomial, residual, and systematic resampling, and multinomial + BS]

Statistic: the proportion of updates with $x'_i \neq x_i$, for $i = 0, \ldots, 399$


Running Particle Gibbs

Sampling $p(\theta, x_{0:400} \mid y_{0:400})$

[Figure: two panels plotting the ACF of $X_0$ (left) and $X_{399}$ (right) against lag $= 0, \ldots, 80$, for multinomial, residual, and systematic resampling, and multinomial + BS]

Statistic: autocorrelation of $X_0$ and $X_{399}$ (200 particles)


Particle filter

Input: $\{x^i_{0:t}\}_{i=1}^N \approx p(x_{0:t} \mid \theta, y_{0:t})$

Output: $\{x^i_{0:t+1}\}_{i=1}^N \approx p(x_{0:t+1} \mid \theta, y_{0:t+1})$

Append: $(x^i_{0:t}, X^i_{t+1})$ where $X^i_{t+1} \sim m_\theta(x^i_t, dx_{t+1})$

Weight: $x^i_{0:t+1}$ gets weight $w^i_{t+1} = g_\theta(x^i_{t+1}, y_{t+1})$

Output: the approximation of $p(x_{0:t+1} \mid \theta, y_{0:t+1})$ is $N$ independent samples from
$$\sum_{i=1}^{N} w^i_{t+1} \, \delta_{x^i_{0:t+1}}(dx'_{0:t+1})$$

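A minimal NumPy sketch of this step for the running example, with multinomial resampling (one of the schemes compared in the plots); the Poisson$(e^x)$ likelihood is evaluated in log space for numerical stability.

```python
import numpy as np

def pf_step(paths, y_next, theta, rng):
    """One bootstrap particle filter step for the running example.
    paths: (N, t+1) array of trajectories x^i_{0:t}.
    Returns an (N, t+2) array approximating p(x_{0:t+1} | theta, y_{0:t+1})."""
    mu, rho, sigma = theta
    N = paths.shape[0]
    # Append: X^i_{t+1} ~ m_theta(x^i_t, dx_{t+1})
    x_new = mu + rho * (paths[:, -1] - mu) + sigma * rng.normal(size=N)
    paths = np.column_stack([paths, x_new])
    # Weight: w^i_{t+1} = g_theta(x^i_{t+1}, y_{t+1}); the log Poisson(e^x)
    # mass is y*x - e^x up to a constant shared by all particles
    logw = y_next * x_new - np.exp(x_new)
    w = np.exp(logw - logw.max())
    w /= w.sum()
    # Multinomial resampling: N draws from sum_i w^i delta_{x^i_{0:t+1}}
    return paths[rng.choice(N, size=N, p=w)]
```

Iterating `pf_step` from an initial sample of $X_0$ gives the full filter.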

Sampling from $P_{\theta,N}(x^\star_{0:T}, dx_{0:T})$

Idea: an $N$-particle system, but force $x^1_{0:T} = x^\star_{0:T}$

Intermediate step $t$. Input: $\{x^i_{0:t}\}_{i=1}^N$ with $x^1_{0:t} = x^\star_{0:t}$

Output: $\{x^i_{0:t+1}\}_{i=1}^N$ with $x^1_{0:t+1} = x^\star_{0:t+1}$

Append particles $i = 2, \ldots, N$: $(x^i_{0:t}, X^i_{t+1})$ where $X^i_{t+1} \sim m_\theta(x^i_t, dx_{t+1})$

Weight: $x^i_{0:t+1}$ gets weight $w^i_{t+1} = g_\theta(x^i_{t+1}, y_{t+1})$

Output: $x^1_{0:t+1} = x^\star_{0:t+1}$ and $N - 1$ independent samples from
$$w^\star_{t+1} \, \delta_{x^\star_{0:t+1}}(dx'_{0:t+1}) + \sum_{i=2}^{N} w^i_{t+1} \, \delta_{x^i_{0:t+1}}(dx'_{0:t+1})$$

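Putting these steps together, here is a NumPy sketch of the whole kernel $P_{\theta,N}$ for the running example, again with multinomial resampling; the trajectory returned at the end is the "randomly chosen output" of the conditional filter. This is a sketch under those assumptions, not the authors' implementation.

```python
import numpy as np

def cpf_kernel(y, theta, x_star, N, rng):
    """cPF kernel P_{theta,N}(x*, dx'_{0:T}) for the running example:
    run a particle filter with particle 1 pinned to the reference
    trajectory x*, then draw one trajectory from the final weights."""
    mu, rho, sigma = theta
    x0 = mu + sigma * rng.normal(size=N)        # X_0 ~ N(mu, sigma^2)
    x0[0] = x_star[0]                           # force x^1_0 = x*_0
    paths = x0[:, None]
    logw = y[0] * x0 - np.exp(x0)               # log Poisson(e^x) weights
    w = np.exp(logw - logw.max())
    w /= w.sum()
    for t in range(len(y) - 1):
        # Resample ancestors for i = 2, ..., N; particle 1 keeps x*_{0:t}
        idx = np.empty(N, dtype=int)
        idx[0] = 0
        idx[1:] = rng.choice(N, size=N - 1, p=w)
        paths = paths[idx]
        # Append, forcing x^1_{t+1} = x*_{t+1}
        x_new = mu + rho * (paths[:, -1] - mu) + sigma * rng.normal(size=N)
        x_new[0] = x_star[t + 1]
        paths = np.column_stack([paths, x_new])
        # Weight with the Poisson(e^x) likelihood
        logw = y[t + 1] * x_new - np.exp(x_new)
        w = np.exp(logw - logw.max())
        w /= w.sum()
    # The kernel's output: one trajectory drawn from the final weights
    return paths[rng.choice(N, p=w)]
```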

cPF example

[Figure: particle trajectories over $t = 0, \ldots, 10$; vertical axis from $-2$ to $2$]


Uniform Ergodicity

Assumption

$g_\theta(x_t, y_t) \leq G_{t,\theta}$ (density upper bounded)

Predicted density lower bounded:
$$\int m_\theta(dx_0)\, g_\theta(x_0, y_0) \geq \frac{1}{G_{0,\theta}}, \qquad \int m_\theta(x_{t-1}, dx_t)\, g_\theta(x_t, y_t) \geq \frac{1}{G_{t,\theta}}$$

For any $\theta$ and $\epsilon > 0$, for $N$ large enough,
$$\left| (P_{\theta,N}\,\varphi)(x_{0:T}) - (P_{\theta,N}\,\varphi)(\check{x}_{0:T}) \right| \leq \epsilon$$
for all $x_{0:T}$, $\check{x}_{0:T}$, and $\varphi : \mathcal{X}^{T+1} \to [-1, 1]$

$\Rightarrow$ Large $N$ gives samples arbitrarily close to the target $p(x_{0:T} \mid \theta, y_{0:T})$
$\Rightarrow$ Uniform ergodicity


Proof Idea

Use coupling: define $(X^\star_{0:T}, \check{X}^\star_{0:T})$ such that
$$\mathrm{Law}(X^\star_{0:T}) = P_{\theta,N}(x_{0:T}, dx^\star_{0:T}), \qquad \mathrm{Law}(\check{X}^\star_{0:T}) = P_{\theta,N}(\check{x}_{0:T}, d\check{x}^\star_{0:T})$$

If
$$P\left(X^\star_{0:T} \neq \check{X}^\star_{0:T}\right) \leq \epsilon$$
then
$$P_{\theta,N}(x_{0:T}, A) - P_{\theta,N}(\check{x}_{0:T}, A) = E\left\{ \left( I_A(X^\star_{0:T}) - I_A(\check{X}^\star_{0:T}) \right) I_{\{X^\star_{0:T} \neq \check{X}^\star_{0:T}\}} \right\} \leq P\left(X^\star_{0:T} \neq \check{X}^\star_{0:T}\right) \leq \epsilon$$


Coupling cPFs

Aim: couple the outputs of $P_{\theta,N}(x_{0:T}, dx^\star_{0:T})$ and $P_{\theta,N}(\check{x}_{0:T}, d\check{x}^\star_{0:T})$

If the cPF outputs $\{x^i_{0:t}\}_{i=1}^N$ and $\{\check{x}^i_{0:t}\}_{i=1}^N$ satisfy
$$x^i_{0:t} = \check{x}^i_{0:t} \quad \text{for } i \in C_t,$$
where $C_0 = \{2, \ldots, N\}$, the same holds after the append move: $(x^i_{0:t}, x^i_{t+1}) = (\check{x}^i_{0:t}, \check{x}^i_{t+1})$ for $i \in C_t$

Resampling move: select particles in $C_t$ for survival if possible

Coupling probability determined by the law of $C_T$


Backward Sampling

N. Whiteley (RSS discussion of PMCMC) suggested an extra backward step for the cPF that tries to modify (recursively, backward in time) the ancestry of the selected trajectory.

There is a forward "version" by Lindsten and Schön (2012)

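A minimal NumPy sketch of the backward pass for the running example; it assumes the particle positions and filtering log-weights from the forward cPF run were stored (the array layout and names are illustrative, not from the slides).

```python
import numpy as np

def backward_sample(xs, logws, theta, rng):
    """Backward-sampling pass: given particle positions xs[t, i] and
    filtering log-weights logws[t, i] stored during the forward cPF run,
    redraw the output trajectory's ancestry backward in time."""
    mu, rho, sigma = theta
    T = xs.shape[0] - 1
    traj = np.empty(T + 1)
    # Draw the final state from the terminal weights
    w = np.exp(logws[T] - logws[T].max())
    w /= w.sum()
    traj[T] = xs[T, rng.choice(xs.shape[1], p=w)]
    for t in range(T - 1, -1, -1):
        # Reweight by the transition density m_theta(x^i_t, traj[t+1]);
        # for the AR(1) example this is N(mu + rho (x^i_t - mu), sigma^2)
        mean = mu + rho * (xs[t] - mu)
        logw = logws[t] - 0.5 * ((traj[t + 1] - mean) / sigma) ** 2
        w = np.exp(logw - logw.max())
        w /= w.sum()
        traj[t] = xs[t, rng.choice(xs.shape[1], p=w)]
    return traj
```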

Idea of Backward Sampling

[Figure: particle trajectories over $t = 0, \ldots, 10$; vertical axis from $-2$ to $2$]


Running Backward Sampling

[Figure: left panel, update rate against $t = 0, \ldots, 400$; right panel, ACF of $X_0$ against lag $= 0, \ldots, 80$; curves for multinomial, residual, and systematic resampling, and multinomial + BS]

Left-panel statistic: the proportion of updates with $x'_i \neq x_i$, for $i = 0, \ldots, 399$


Backward Sampling

Success depends on the state transition law $m_\theta(x_t, dx_{t+1})$

The cPF kernel $P_{\theta,N}$ with a BS step dominates the no-BS version in asymptotic efficiency,

i.e. it gives rise to a CLT with a smaller asymptotic variance (idea: self-adjoint operator + lag-1 domination; see Tierney (1998), Mira & Geyer (1999))

cPF kernel geometrically ergodic $\Rightarrow$ cPF kernel with BS is geometrically ergodic

(Idea: both kernels are positive operators)


Final Remarks

Established uniform ergodicity of cPF using coupling

Dependence on $N$ and $T$ not given, although in practice $N$ should scale linearly with $T$ (now proved by Andrieu, Lee & Vihola and by Douc, Lindsten & Moulines)

Whiteley's BS is better: it improves asymptotic efficiency and inherits geometric ergodicity
