Bayesian Parameter Inference in State-Space Models using...

112
Bayesian Parameter Inference in State-Space Models using Particle MCMC Arnaud Doucet Department of Statistics, Oxford University University College London 5 th October 2012 A. Doucet (UCL Masterclass Oct. 2012) 5 th October 2012 1 / 37

Transcript of Bayesian Parameter Inference in State-Space Models using...

Page 1: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Bayesian Parameter Inference in State-Space Modelsusing Particle MCMC

Arnaud DoucetDepartment of Statistics, Oxford University

University College London

5th October 2012

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 1 / 37

Page 2: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

State-Space Models with Unknown Parameters

In most scenarios of interest, the state-space model contains anunknown static parameter θ ∈ Θ so that

X1 ∼ µθ (·) and Xt | (Xt−1 = x) ∼ fθ ( ·| xt−1) .

The observations {Yt}t≥1 are conditionally independent given{Xt}t≥1 and θ

Yt | (Xt = xt ) ∼ gθ ( ·| x) .Aim: Given observations y1:T , we want to infer θ in a Bayesianframework.

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 2 / 37

Page 3: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

State-Space Models with Unknown Parameters

In most scenarios of interest, the state-space model contains anunknown static parameter θ ∈ Θ so that

X1 ∼ µθ (·) and Xt | (Xt−1 = x) ∼ fθ ( ·| xt−1) .

The observations {Yt}t≥1 are conditionally independent given{Xt}t≥1 and θ

Yt | (Xt = xt ) ∼ gθ ( ·| x) .

Aim: Given observations y1:T , we want to infer θ in a Bayesianframework.

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 2 / 37

Page 4: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

State-Space Models with Unknown Parameters

In most scenarios of interest, the state-space model contains anunknown static parameter θ ∈ Θ so that

X1 ∼ µθ (·) and Xt | (Xt−1 = x) ∼ fθ ( ·| xt−1) .

The observations {Yt}t≥1 are conditionally independent given{Xt}t≥1 and θ

Yt | (Xt = xt ) ∼ gθ ( ·| x) .Aim: Given observations y1:T , we want to infer θ in a Bayesianframework.

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 2 / 37

Page 5: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Bayesian Parameter Inference in State-Space Models

Set a prior p (θ) on θ so inference relies now on

p ( θ, x1:T | y1:T ) =p (θ, x1:T , y1:T )

p (y1:t )

wherep (θ, x1:T , y1:T ) = p (θ) pθ (x1:T , y1:T )

with

pθ (x1:T , y1:T ) = µθ (x1)T

∏k=2

fθ (xk | xk−1)T

∏k=1

gθ (yk | xk )

Standard approaches rely on MCMC.

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 3 / 37

Page 6: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Bayesian Parameter Inference in State-Space Models

Set a prior p (θ) on θ so inference relies now on

p ( θ, x1:T | y1:T ) =p (θ, x1:T , y1:T )

p (y1:t )

wherep (θ, x1:T , y1:T ) = p (θ) pθ (x1:T , y1:T )

with

pθ (x1:T , y1:T ) = µθ (x1)T

∏k=2

fθ (xk | xk−1)T

∏k=1

gθ (yk | xk )

Standard approaches rely on MCMC.

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 3 / 37

Page 7: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Common MCMC Approaches and Limitations

MCMC Idea: Simulate an ergodic Markov chain {θ (i) ,X1:T (i)}i≥0of invariant distribution p ( θ, x1:T | y1:T )... infinite number ofpossibilities.

Typical strategies consists of updating iteratively X1:T conditionalupon θ then θ conditional upon X1:T .

To update X1:T conditional upon θ, use MCMC kernels updatingsubblocks according to pθ (xt :t+K−1| yt :t+K−1, xt−1, xt+K ).Standard MCMC algorithms are ineffi cient if θ and X1:T are stronglycorrelated.

Strategy impossible to implement when it is only possible to samplefrom the prior but impossible to evaluate it pointwise.

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 4 / 37

Page 8: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Common MCMC Approaches and Limitations

MCMC Idea: Simulate an ergodic Markov chain {θ (i) ,X1:T (i)}i≥0of invariant distribution p ( θ, x1:T | y1:T )... infinite number ofpossibilities.

Typical strategies consists of updating iteratively X1:T conditionalupon θ then θ conditional upon X1:T .

To update X1:T conditional upon θ, use MCMC kernels updatingsubblocks according to pθ (xt :t+K−1| yt :t+K−1, xt−1, xt+K ).Standard MCMC algorithms are ineffi cient if θ and X1:T are stronglycorrelated.

Strategy impossible to implement when it is only possible to samplefrom the prior but impossible to evaluate it pointwise.

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 4 / 37

Page 9: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Common MCMC Approaches and Limitations

MCMC Idea: Simulate an ergodic Markov chain {θ (i) ,X1:T (i)}i≥0of invariant distribution p ( θ, x1:T | y1:T )... infinite number ofpossibilities.

Typical strategies consists of updating iteratively X1:T conditionalupon θ then θ conditional upon X1:T .

To update X1:T conditional upon θ, use MCMC kernels updatingsubblocks according to pθ (xt :t+K−1| yt :t+K−1, xt−1, xt+K ).

Standard MCMC algorithms are ineffi cient if θ and X1:T are stronglycorrelated.

Strategy impossible to implement when it is only possible to samplefrom the prior but impossible to evaluate it pointwise.

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 4 / 37

Page 10: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Common MCMC Approaches and Limitations

MCMC Idea: Simulate an ergodic Markov chain {θ (i) ,X1:T (i)}i≥0of invariant distribution p ( θ, x1:T | y1:T )... infinite number ofpossibilities.

Typical strategies consists of updating iteratively X1:T conditionalupon θ then θ conditional upon X1:T .

To update X1:T conditional upon θ, use MCMC kernels updatingsubblocks according to pθ (xt :t+K−1| yt :t+K−1, xt−1, xt+K ).Standard MCMC algorithms are ineffi cient if θ and X1:T are stronglycorrelated.

Strategy impossible to implement when it is only possible to samplefrom the prior but impossible to evaluate it pointwise.

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 4 / 37

Page 11: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Common MCMC Approaches and Limitations

MCMC Idea: Simulate an ergodic Markov chain {θ (i) ,X1:T (i)}i≥0of invariant distribution p ( θ, x1:T | y1:T )... infinite number ofpossibilities.

Typical strategies consists of updating iteratively X1:T conditionalupon θ then θ conditional upon X1:T .

To update X1:T conditional upon θ, use MCMC kernels updatingsubblocks according to pθ (xt :t+K−1| yt :t+K−1, xt−1, xt+K ).Standard MCMC algorithms are ineffi cient if θ and X1:T are stronglycorrelated.

Strategy impossible to implement when it is only possible to samplefrom the prior but impossible to evaluate it pointwise.

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 4 / 37

Page 12: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Metropolis-Hastings (MH) Sampling

To bypass these problems, we want to update jointly θ and X1:T .

Assume that the current state of our Markov chain is (θ, x1:T ), wepropose to update simultaneously the parameter and the states usinga proposal

q ( (θ∗, x∗1:T )| (θ, x1:T )) = q ( θ∗| θ) qθ∗ (x∗1:T | y1:T ) .

The proposal (θ∗, x∗1:T ) is accepted with MH acceptance probability

1∧ p ( θ∗, x∗1:T | y1:T )

p ( θ, x1:T | y1:T )

q ( (x1:T , θ)| (x∗1:T , θ∗))

q((x∗1:T , θ

∗)∣∣ (x1:T , θ)

)Problem: Designing a proposal qθ∗ (x

∗1:T | y1:T ) such that the

acceptance probability is not extremely small is very diffi cult.

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 5 / 37

Page 13: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Metropolis-Hastings (MH) Sampling

To bypass these problems, we want to update jointly θ and X1:T .

Assume that the current state of our Markov chain is (θ, x1:T ), wepropose to update simultaneously the parameter and the states usinga proposal

q ( (θ∗, x∗1:T )| (θ, x1:T )) = q ( θ∗| θ) qθ∗ (x∗1:T | y1:T ) .

The proposal (θ∗, x∗1:T ) is accepted with MH acceptance probability

1∧ p ( θ∗, x∗1:T | y1:T )

p ( θ, x1:T | y1:T )

q ( (x1:T , θ)| (x∗1:T , θ∗))

q((x∗1:T , θ

∗)∣∣ (x1:T , θ)

)Problem: Designing a proposal qθ∗ (x

∗1:T | y1:T ) such that the

acceptance probability is not extremely small is very diffi cult.

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 5 / 37

Page 14: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Metropolis-Hastings (MH) Sampling

To bypass these problems, we want to update jointly θ and X1:T .

Assume that the current state of our Markov chain is (θ, x1:T ), wepropose to update simultaneously the parameter and the states usinga proposal

q ( (θ∗, x∗1:T )| (θ, x1:T )) = q ( θ∗| θ) qθ∗ (x∗1:T | y1:T ) .

The proposal (θ∗, x∗1:T ) is accepted with MH acceptance probability

1∧ p ( θ∗, x∗1:T | y1:T )

p ( θ, x1:T | y1:T )

q ( (x1:T , θ)| (x∗1:T , θ∗))

q((x∗1:T , θ

∗)∣∣ (x1:T , θ)

)

Problem: Designing a proposal qθ∗ (x∗1:T | y1:T ) such that the

acceptance probability is not extremely small is very diffi cult.

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 5 / 37

Page 15: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Metropolis-Hastings (MH) Sampling

To bypass these problems, we want to update jointly θ and X1:T .

Assume that the current state of our Markov chain is (θ, x1:T ), wepropose to update simultaneously the parameter and the states usinga proposal

q ( (θ∗, x∗1:T )| (θ, x1:T )) = q ( θ∗| θ) qθ∗ (x∗1:T | y1:T ) .

The proposal (θ∗, x∗1:T ) is accepted with MH acceptance probability

1∧ p ( θ∗, x∗1:T | y1:T )

p ( θ, x1:T | y1:T )

q ( (x1:T , θ)| (x∗1:T , θ∗))

q((x∗1:T , θ

∗)∣∣ (x1:T , θ)

)Problem: Designing a proposal qθ∗ (x

∗1:T | y1:T ) such that the

acceptance probability is not extremely small is very diffi cult.

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 5 / 37

Page 16: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

“Idealized”Marginal MH Sampler

Consider the following so-called marginal Metropolis-Hastings (MH)algorithm which uses as a proposal

q ( (x∗1:T , θ∗)| (x1:T , θ)) = q ( θ∗| θ) pθ∗ (x

∗1:T | y1:T ) .

The MH acceptance probability is

1∧ p ( θ∗, x∗1:T | y1:T )

p ( θ, x1:T | y1:T )

q ( (x1:T , θ)| (x∗1:T , θ∗))

q((x∗1:T , θ

∗)∣∣ (x1:T , θ)

)= 1∧ pθ∗ (y1:T ) p (θ

∗)

pθ (y1:T ) p (θ)q ( θ| θ∗)q ( θ∗| θ)

In this MH algorithm, X1:T has been essentially integrated out.

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 6 / 37

Page 17: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

“Idealized”Marginal MH Sampler

Consider the following so-called marginal Metropolis-Hastings (MH)algorithm which uses as a proposal

q ( (x∗1:T , θ∗)| (x1:T , θ)) = q ( θ∗| θ) pθ∗ (x

∗1:T | y1:T ) .

The MH acceptance probability is

1∧ p ( θ∗, x∗1:T | y1:T )

p ( θ, x1:T | y1:T )

q ( (x1:T , θ)| (x∗1:T , θ∗))

q((x∗1:T , θ

∗)∣∣ (x1:T , θ)

)= 1∧ pθ∗ (y1:T ) p (θ

∗)

pθ (y1:T ) p (θ)q ( θ| θ∗)q ( θ∗| θ)

In this MH algorithm, X1:T has been essentially integrated out.

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 6 / 37

Page 18: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

“Idealized”Marginal MH Sampler

Consider the following so-called marginal Metropolis-Hastings (MH)algorithm which uses as a proposal

q ( (x∗1:T , θ∗)| (x1:T , θ)) = q ( θ∗| θ) pθ∗ (x

∗1:T | y1:T ) .

The MH acceptance probability is

1∧ p ( θ∗, x∗1:T | y1:T )

p ( θ, x1:T | y1:T )

q ( (x1:T , θ)| (x∗1:T , θ∗))

q((x∗1:T , θ

∗)∣∣ (x1:T , θ)

)= 1∧ pθ∗ (y1:T ) p (θ

∗)

pθ (y1:T ) p (θ)q ( θ| θ∗)q ( θ∗| θ)

In this MH algorithm, X1:T has been essentially integrated out.

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 6 / 37

Page 19: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Implementation Issues

Problem 1: We do not know pθ (y1:T ) =∫pθ (x1:T , y1:T ) dx1:T

analytically.

Problem 2: We do not know how to sample from pθ (x1:T | y1:T ) .

“Idea”: Use particle approximations of pθ (x1:T | y1:T ) and pθ (y1:T ).

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 7 / 37

Page 20: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Implementation Issues

Problem 1: We do not know pθ (y1:T ) =∫pθ (x1:T , y1:T ) dx1:T

analytically.

Problem 2: We do not know how to sample from pθ (x1:T | y1:T ) .

“Idea”: Use particle approximations of pθ (x1:T | y1:T ) and pθ (y1:T ).

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 7 / 37

Page 21: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Implementation Issues

Problem 1: We do not know pθ (y1:T ) =∫pθ (x1:T , y1:T ) dx1:T

analytically.

Problem 2: We do not know how to sample from pθ (x1:T | y1:T ) .

“Idea”: Use particle approximations of pθ (x1:T | y1:T ) and pθ (y1:T ).

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 7 / 37

Page 22: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Particle Filters

Given θ, particle methods provide approximations of pθ (x1:T | y1:T )and pθ (y1:T ).

At time T , we obtain the following approximation of the posterior ofinterest

p̂θ (x1:T | y1:T ) =1N

N

∑k=1

δX (k )1:T

(x1:T )

and an approximation of pθ (y1:T ) is given by

p̂θ (y1:T ) = p̂θ (y1)T

∏t=2p̂θ (yt | y1:t−1) =

T

∏t=1

(1N

N

∑k=1

(yt |X (k )t

))

if we use fθ (xt | xt−1) as a proposal.

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 8 / 37

Page 23: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Particle Filters

Given θ, particle methods provide approximations of pθ (x1:T | y1:T )and pθ (y1:T ).

At time T , we obtain the following approximation of the posterior ofinterest

p̂θ (x1:T | y1:T ) =1N

N

∑k=1

δX (k )1:T

(x1:T )

and an approximation of pθ (y1:T ) is given by

p̂θ (y1:T ) = p̂θ (y1)T

∏t=2p̂θ (yt | y1:t−1) =

T

∏t=1

(1N

N

∑k=1

(yt |X (k )t

))

if we use fθ (xt | xt−1) as a proposal.

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 8 / 37

Page 24: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

A Few Theoretical Results

We haveE [p̂θ (y1:T )] = pθ (y1:T ) .

Under mixing assumptions, we have

V [p̂θ (y1:T )]

p2θ (y1:T )≤ Dθ

TN.

Under mixing assumptions, we also have∫|E [p̂θ (x1:T | y1:T )]− pθ (x1:T | y1:T )| dx1:T ≤ Cθ

TN

so if I run a particle method to obtain p̂θ (x1:T | y1:T ) thenX1:T ∼ p̂θ (x1:T | y1:T ), unconditionally X1:T ∼ E [p̂θ ( ·| y1:T )].

Problem: We cannot compute analytically the particle filter proposalqθ (x1:T | y1:T ) = E [p̂θ (x1:T | y1:T )] as it involves an expectation w.r.tall the variables appearing in the particle algorithm...

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 9 / 37

Page 25: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

A Few Theoretical Results

We haveE [p̂θ (y1:T )] = pθ (y1:T ) .

Under mixing assumptions, we have

V [p̂θ (y1:T )]

p2θ (y1:T )≤ Dθ

TN.

Under mixing assumptions, we also have∫|E [p̂θ (x1:T | y1:T )]− pθ (x1:T | y1:T )| dx1:T ≤ Cθ

TN

so if I run a particle method to obtain p̂θ (x1:T | y1:T ) thenX1:T ∼ p̂θ (x1:T | y1:T ), unconditionally X1:T ∼ E [p̂θ ( ·| y1:T )].

Problem: We cannot compute analytically the particle filter proposalqθ (x1:T | y1:T ) = E [p̂θ (x1:T | y1:T )] as it involves an expectation w.r.tall the variables appearing in the particle algorithm...

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 9 / 37

Page 26: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

A Few Theoretical Results

We haveE [p̂θ (y1:T )] = pθ (y1:T ) .

Under mixing assumptions, we have

V [p̂θ (y1:T )]

p2θ (y1:T )≤ Dθ

TN.

Under mixing assumptions, we also have∫|E [p̂θ (x1:T | y1:T )]− pθ (x1:T | y1:T )| dx1:T ≤ Cθ

TN

so if I run a particle method to obtain p̂θ (x1:T | y1:T ) thenX1:T ∼ p̂θ (x1:T | y1:T ), unconditionally X1:T ∼ E [p̂θ ( ·| y1:T )].

Problem: We cannot compute analytically the particle filter proposalqθ (x1:T | y1:T ) = E [p̂θ (x1:T | y1:T )] as it involves an expectation w.r.tall the variables appearing in the particle algorithm...

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 9 / 37

Page 27: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

A Few Theoretical Results

We haveE [p̂θ (y1:T )] = pθ (y1:T ) .

Under mixing assumptions, we have

V [p̂θ (y1:T )]

p2θ (y1:T )≤ Dθ

TN.

Under mixing assumptions, we also have∫|E [p̂θ (x1:T | y1:T )]− pθ (x1:T | y1:T )| dx1:T ≤ Cθ

TN

so if I run a particle method to obtain p̂θ (x1:T | y1:T ) thenX1:T ∼ p̂θ (x1:T | y1:T ), unconditionally X1:T ∼ E [p̂θ ( ·| y1:T )].

Problem: We cannot compute analytically the particle filter proposalqθ (x1:T | y1:T ) = E [p̂θ (x1:T | y1:T )] as it involves an expectation w.r.tall the variables appearing in the particle algorithm...

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 9 / 37

Page 28: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

“Idealized”Marginal MH Sampler

At iteration i

Sample θ∗ ∼ q ( θ| θ (i − 1)).

Sample X ∗1:T ∼ pθ∗ (x1:T | y1:T ) .

With probability

1∧ pθ∗ (y1:T ) p (θ∗)

pθ(i−1) (y1:T ) p (θ (i − 1))q ( θ (i − 1)| θ∗)q ( θ∗| θ (i − 1))

set θ (i) = θ∗, X1:T (i) = X ∗1:T otherwise set θ (i) = θ (i − 1),X1:T (i) = X1:T (i − 1) .

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 10 / 37

Page 29: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

“Idealized”Marginal MH Sampler

At iteration i

Sample θ∗ ∼ q ( θ| θ (i − 1)).Sample X ∗1:T ∼ pθ∗ (x1:T | y1:T ) .

With probability

1∧ pθ∗ (y1:T ) p (θ∗)

pθ(i−1) (y1:T ) p (θ (i − 1))q ( θ (i − 1)| θ∗)q ( θ∗| θ (i − 1))

set θ (i) = θ∗, X1:T (i) = X ∗1:T otherwise set θ (i) = θ (i − 1),X1:T (i) = X1:T (i − 1) .

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 10 / 37

Page 30: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

“Idealized”Marginal MH Sampler

At iteration i

Sample θ∗ ∼ q ( θ| θ (i − 1)).Sample X ∗1:T ∼ pθ∗ (x1:T | y1:T ) .

With probability

1∧ pθ∗ (y1:T ) p (θ∗)

pθ(i−1) (y1:T ) p (θ (i − 1))q ( θ (i − 1)| θ∗)q ( θ∗| θ (i − 1))

set θ (i) = θ∗, X1:T (i) = X ∗1:T otherwise set θ (i) = θ (i − 1),X1:T (i) = X1:T (i − 1) .

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 10 / 37

Page 31: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Particle Marginal MH Sampler

At iteration i

Sample θ∗ ∼ q ( θ| θ (i − 1)) and run an particle filter to obtainp̂θ∗ (x1:T | y1:T ) and p̂θ∗ (y1:T ).

Sample X ∗1:T ∼ p̂θ∗ (x1:T | y1:T ) .

With probability

1∧ p̂θ∗ (y1:T ) p (θ∗)

p̂θ(i−1) (y1:T ) p (θ (i − 1))q ( θ (i − 1)| θ∗)q ( θ∗| θ (i − 1))

set θ (i) = θ∗, X1:T (i) = X ∗1:T otherwise set θ (i) = θ (i − 1),X1:T (i) = X1:T (i − 1) .

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 11 / 37

Page 32: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Particle Marginal MH Sampler

At iteration i

Sample θ∗ ∼ q ( θ| θ (i − 1)) and run an particle filter to obtainp̂θ∗ (x1:T | y1:T ) and p̂θ∗ (y1:T ).

Sample X ∗1:T ∼ p̂θ∗ (x1:T | y1:T ) .

With probability

1∧ p̂θ∗ (y1:T ) p (θ∗)

p̂θ(i−1) (y1:T ) p (θ (i − 1))q ( θ (i − 1)| θ∗)q ( θ∗| θ (i − 1))

set θ (i) = θ∗, X1:T (i) = X ∗1:T otherwise set θ (i) = θ (i − 1),X1:T (i) = X1:T (i − 1) .

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 11 / 37

Page 33: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Particle Marginal MH Sampler

At iteration i

Sample θ∗ ∼ q ( θ| θ (i − 1)) and run an particle filter to obtainp̂θ∗ (x1:T | y1:T ) and p̂θ∗ (y1:T ).

Sample X ∗1:T ∼ p̂θ∗ (x1:T | y1:T ) .

With probability

1∧ p̂θ∗ (y1:T ) p (θ∗)

p̂θ(i−1) (y1:T ) p (θ (i − 1))q ( θ (i − 1)| θ∗)q ( θ∗| θ (i − 1))

set θ (i) = θ∗, X1:T (i) = X ∗1:T otherwise set θ (i) = θ (i − 1),X1:T (i) = X1:T (i − 1) .

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 11 / 37

Page 34: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Validity of the Particle Marginal MH Sampler

Proposition. Assume that the ‘idealized’marginal MH sampler chainis ergodic then, under very weak assumptions, the PMMH samplerchain is ergodic and admits p( θ, x1:T | y1:T ) whatever being N ≥ 1.

It is easy to show the simpler result that the PMMH admitsp( θ| y1:T ) as invariant distribution whatever being N ≥ 1.Let U denote all the r.v. introduce to build the SMC estimate thenwrite p̂θ (y1:T ) = p̂θ (y1:T ;U) and from unbiasedness∫

p̂θ (y1:T ; u) qθ (u) du = pθ (y1:T ) .

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 12 / 37

Page 35: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Validity of the Particle Marginal MH Sampler

Proposition. Assume that the ‘idealized’marginal MH sampler chainis ergodic then, under very weak assumptions, the PMMH samplerchain is ergodic and admits p( θ, x1:T | y1:T ) whatever being N ≥ 1.It is easy to show the simpler result that the PMMH admitsp( θ| y1:T ) as invariant distribution whatever being N ≥ 1.

Let U denote all the r.v. introduce to build the SMC estimate thenwrite p̂θ (y1:T ) = p̂θ (y1:T ;U) and from unbiasedness∫

p̂θ (y1:T ; u) qθ (u) du = pθ (y1:T ) .

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 12 / 37

Page 36: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Validity of the Particle Marginal MH Sampler

Proposition. Assume that the ‘idealized’marginal MH sampler chainis ergodic then, under very weak assumptions, the PMMH samplerchain is ergodic and admits p( θ, x1:T | y1:T ) whatever being N ≥ 1.It is easy to show the simpler result that the PMMH admitsp( θ| y1:T ) as invariant distribution whatever being N ≥ 1.Let U denote all the r.v. introduce to build the SMC estimate thenwrite p̂θ (y1:T ) = p̂θ (y1:T ;U) and from unbiasedness∫

p̂θ (y1:T ; u) qθ (u) du = pθ (y1:T ) .

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 12 / 37

Page 37: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

An Incomplete But Trivial Proof

The PMMH targets the distribution

π̂ (θ, u) ∝ p (θ) p̂θ (y1:T ; u) qθ (u)

which satisfiesπ̂ (θ) = p( θ| y1:T ).

The PMMH sampler uses as a proposal

q ( (θ∗, u∗)| (θ, u)) = q ( θ∗| θ) qθ∗ (u∗)

and

π̂ (θ∗, u∗)π̂ (θ, u)

q ( (θ, u)| (θ∗, u∗))q ( (θ∗, u∗)| (θ, u)) =

p (θ∗) p̂θ∗ (y1:T ; u∗)p (θ) p̂θ (y1:T ; u)

q ( θ| θ∗)q ( θ∗| θ) .

Trivial but deep result: if you plug any unbiased likelihood estimatewithin a MCMC scheme, you do not perturb the invariant distribution.

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 13 / 37

Page 38: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

An Incomplete But Trivial Proof

The PMMH targets the distribution

π̂ (θ, u) ∝ p (θ) p̂θ (y1:T ; u) qθ (u)

which satisfiesπ̂ (θ) = p( θ| y1:T ).

The PMMH sampler uses as a proposal

q ( (θ∗, u∗)| (θ, u)) = q ( θ∗| θ) qθ∗ (u∗)

and

π̂ (θ∗, u∗)π̂ (θ, u)

q ( (θ, u)| (θ∗, u∗))q ( (θ∗, u∗)| (θ, u)) =

p (θ∗) p̂θ∗ (y1:T ; u∗)p (θ) p̂θ (y1:T ; u)

q ( θ| θ∗)q ( θ∗| θ) .

Trivial but deep result: if you plug any unbiased likelihood estimatewithin a MCMC scheme, you do not perturb the invariant distribution.

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 13 / 37

Page 39: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

An Incomplete But Trivial Proof

The PMMH targets the distribution

π̂ (θ, u) ∝ p (θ) p̂θ (y1:T ; u) qθ (u)

which satisfiesπ̂ (θ) = p( θ| y1:T ).

The PMMH sampler uses as a proposal

q ( (θ∗, u∗)| (θ, u)) = q ( θ∗| θ) qθ∗ (u∗)

and

π̂ (θ∗, u∗)π̂ (θ, u)

q ( (θ, u)| (θ∗, u∗))q ( (θ∗, u∗)| (θ, u)) =

p (θ∗) p̂θ∗ (y1:T ; u∗)p (θ) p̂θ (y1:T ; u)

q ( θ| θ∗)q ( θ∗| θ) .

Trivial but deep result: if you plug any unbiased likelihood estimatewithin a MCMC scheme, you do not perturb the invariant distribution.

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 13 / 37

Page 40: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Inference for Stochastic Kinetic Models

Two species X 1t (prey) and X2t (predator)

Pr(X 1t+dt=x

1t+1,X

2t+dt=x

2t

∣∣ x1t , x2t ) = α x1t dt + o (dt) ,Pr(X 1t+dt=x

1t−1,X 2t+dt=x2t+1

∣∣ x1t , x2t ) = β x1t x2t dt + o (dt) ,

Pr(X 1t+dt=x

1t ,X

2t+dt=x

2t−1

∣∣ x1t , x2t ) = γ x2t dt + o (dt) ,

withYk = X

1k∆T +Wk with Wk

i.i.d.∼ N(0, σ2

).

We are interested in the kinetic rate constants θ = (α, β,γ) a prioridistributed as (Boys et al., 2008; Kunsch, 2011)

α ∼ G(1, 10), β ∼ G(1, 0.25), γ ∼ G(1, 7.5).

MCMC methods require reversible jumps, Particle MCMC requiresonly forward simulation.

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 14 / 37

Page 41: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Inference for Stochastic Kinetic Models

Two species X 1t (prey) and X2t (predator)

Pr(X 1t+dt=x

1t+1,X

2t+dt=x

2t

∣∣ x1t , x2t ) = α x1t dt + o (dt) ,Pr(X 1t+dt=x

1t−1,X 2t+dt=x2t+1

∣∣ x1t , x2t ) = β x1t x2t dt + o (dt) ,

Pr(X 1t+dt=x

1t ,X

2t+dt=x

2t−1

∣∣ x1t , x2t ) = γ x2t dt + o (dt) ,

withYk = X

1k∆T +Wk with Wk

i.i.d.∼ N(0, σ2

).

We are interested in the kinetic rate constants θ = (α, β,γ) a prioridistributed as (Boys et al., 2008; Kunsch, 2011)

α ∼ G(1, 10), β ∼ G(1, 0.25), γ ∼ G(1, 7.5).

MCMC methods require reversible jumps, Particle MCMC requiresonly forward simulation.

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 14 / 37

Page 42: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Inference for Stochastic Kinetic Models

Two species X 1t (prey) and X2t (predator)

Pr(X 1t+dt=x

1t+1,X

2t+dt=x

2t

∣∣ x1t , x2t ) = α x1t dt + o (dt) ,Pr(X 1t+dt=x

1t−1,X 2t+dt=x2t+1

∣∣ x1t , x2t ) = β x1t x2t dt + o (dt) ,

Pr(X 1t+dt=x

1t ,X

2t+dt=x

2t−1

∣∣ x1t , x2t ) = γ x2t dt + o (dt) ,

withYk = X

1k∆T +Wk with Wk

i.i.d.∼ N(0, σ2

).

We are interested in the kinetic rate constants θ = (α, β,γ) a prioridistributed as (Boys et al., 2008; Kunsch, 2011)

α ∼ G(1, 10), β ∼ G(1, 0.25), γ ∼ G(1, 7.5).

MCMC methods require reversible jumps, Particle MCMC requiresonly forward simulation.

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 14 / 37

Page 43: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Experimental Results

­ 2 0

  0

  2 0

  4 0

  6 0

  8 0

  1 0 0

  1 2 0

  1 4 0

  0   1   2   3   4   5   6   7   8   9   1 0

p r e yp r e d a t o r

Simulated data

1 .5 2 2 .5 3 3 .5 4 4 .51 .5 2 2 .5 3 3 .5 4 4 .5 0 .0 6 0 .1 2 0 .1 80 .0 6 0 .1 2 0 .1 8 1 2 3 4 5 6 7 81 2 3 4 5 6 7 8

α

β

γ

Posterior distributions

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 15 / 37

Page 44: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Autocorrelation Functions

  0

  0 . 2

  0 . 4

  0 . 6

  0 . 8

  1

  0   1 0 0   2 0 0   3 0 0   4 0 0   5 0 0

α

    5 0   p a r t ic le s  1 0 0   p a r t ic le s  2 0 0   p a r t ic le s  5 0 0   p a r t ic le s

1 0 0 0   p a r t ic le s

  0

  0 . 2

  0 . 4

  0 . 6

  0 . 8

  1

  0   1 0 0   2 0 0   3 0 0   4 0 0   5 0 0

β

    5 0   p a r t ic le s  1 0 0   p a r t ic le s  2 0 0   p a r t ic le s  5 0 0   p a r t ic le s

1 0 0 0   p a r t ic le s

Autocorrelation of α (left) and β (right) for the PMMH sampler forvarious N.

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 16 / 37

Page 45: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Summary

Offl ine Bayesian parameter inference is feasible by using particle filterproposals within MCMC.

Particle MCMC allow us to perform Bayesian inference for dynamicmodels for which only forward simulation is possible.

Computationally intensive but several implementations on GPUalready available and applications in control, ecology, econometrics,biochemical systems, epidemiology, water resources research etc.

This approach does not suffer from degeneracy problem and N scalesroughly linearly with T .

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 17 / 37

Page 46: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Summary

Offl ine Bayesian parameter inference is feasible by using particle filterproposals within MCMC.

Particle MCMC allow us to perform Bayesian inference for dynamicmodels for which only forward simulation is possible.

Computationally intensive but several implementations on GPUalready available and applications in control, ecology, econometrics,biochemical systems, epidemiology, water resources research etc.

This approach does not suffer from degeneracy problem and N scalesroughly linearly with T .

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 17 / 37

Page 47: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Summary

Offl ine Bayesian parameter inference is feasible by using particle filterproposals within MCMC.

Particle MCMC allow us to perform Bayesian inference for dynamicmodels for which only forward simulation is possible.

Computationally intensive but several implementations on GPUalready available and applications in control, ecology, econometrics,biochemical systems, epidemiology, water resources research etc.

This approach does not suffer from degeneracy problem and N scalesroughly linearly with T .

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 17 / 37

Page 48: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Summary

Offl ine Bayesian parameter inference is feasible by using particle filterproposals within MCMC.

Particle MCMC allow us to perform Bayesian inference for dynamicmodels for which only forward simulation is possible.

Computationally intensive but several implementations on GPUalready available and applications in control, ecology, econometrics,biochemical systems, epidemiology, water resources research etc.

This approach does not suffer from degeneracy problem and N scalesroughly linearly with T .

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 17 / 37

Page 49: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

How to Select the Number of Samples

Selection of N is a key issue.

If N is too small, then the algorithm mixes poorly and will requiremany MCMC iterations.

If N is too large, then each MCMC iteration is expensive.

Aim: We would like to provide guidelines on how to select N.Joint work with Mike Pitt (Warwick) and Robert Kohn (UNSW).

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 18 / 37

Page 50: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

How to Select the Number of Samples

Selection of N is a key issue.

If N is too small, then the algorithm mixes poorly and will requiremany MCMC iterations.

If N is too large, then each MCMC iteration is expensive.

Aim: We would like to provide guidelines on how to select N.Joint work with Mike Pitt (Warwick) and Robert Kohn (UNSW).

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 18 / 37

Page 51: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

How to Select the Number of Samples

Selection of N is a key issue.

If N is too small, then the algorithm mixes poorly and will requiremany MCMC iterations.

If N is too large, then each MCMC iteration is expensive.

Aim: We would like to provide guidelines on how to select N.Joint work with Mike Pitt (Warwick) and Robert Kohn (UNSW).

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 18 / 37

Page 52: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

How to Select the Number of Samples

Selection of N is a key issue.

If N is too small, then the algorithm mixes poorly and will requiremany MCMC iterations.

If N is too large, then each MCMC iteration is expensive.

Aim: We would like to provide guidelines on how to select N.

Joint work with Mike Pitt (Warwick) and Robert Kohn (UNSW).

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 18 / 37

Page 53: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

How to Select the Number of Samples

Selection of N is a key issue.

If N is too small, then the algorithm mixes poorly and will requiremany MCMC iterations.

If N is too large, then each MCMC iteration is expensive.

Aim: We would like to provide guidelines on how to select N.Joint work with Mike Pitt (Warwick) and Robert Kohn (UNSW).

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 18 / 37

Page 54: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

MCMC with Intractable Likelihood Function

Let Z = log p̂θ (y ;U)− log pθ(y) be the error in log-likelihood.

The proposal from which Z arises is denoted g(z |θ).We can rewrite the extended target

π̂(θ, z) = p( θ| y) exp(z)g(z |θ)

which is directly related to π̂(θ, u) through the many-to-onetransformation from u to z .

The previous algorithm proposes θ′ ∼ q(·|θ) and Z ′ ∼ g(·|θ′),accepting

(θ′, z ′

)w.p.

αQ (θ,Z ; θ′,Z ′) = min

{1, exp(Z ′ − Z )ω(θ′; θ)/ω(θ; θ′)

},

where ω(θ′; θ) = π(θ′)/q(θ′|θ).

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 19 / 37

Page 55: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

MCMC with Intractable Likelihood Function

Let Z = log p̂θ (y ;U)− log pθ(y) be the error in log-likelihood.

The proposal from which Z arises is denoted g(z |θ).

We can rewrite the extended target

π̂(θ, z) = p( θ| y) exp(z)g(z |θ)

which is directly related to π̂(θ, u) through the many-to-onetransformation from u to z .

The previous algorithm proposes θ′ ∼ q(·|θ) and Z ′ ∼ g(·|θ′),accepting

(θ′, z ′

)w.p.

αQ (θ,Z ; θ′,Z ′) = min

{1, exp(Z ′ − Z )ω(θ′; θ)/ω(θ; θ′)

},

where ω(θ′; θ) = π(θ′)/q(θ′|θ).

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 19 / 37

Page 56: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

MCMC with Intractable Likelihood Function

Let Z = log p̂θ (y ;U)− log pθ(y) be the error in log-likelihood.

The proposal from which Z arises is denoted g(z |θ).We can rewrite the extended target

π̂(θ, z) = p( θ| y) exp(z)g(z |θ)

which is directly related to π̂(θ, u) through the many-to-onetransformation from u to z .

The previous algorithm proposes θ′ ∼ q(·|θ) and Z ′ ∼ g(·|θ′),accepting

(θ′, z ′

)w.p.

αQ (θ,Z ; θ′,Z ′) = min

{1, exp(Z ′ − Z )ω(θ′; θ)/ω(θ; θ′)

},

where ω(θ′; θ) = π(θ′)/q(θ′|θ).

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 19 / 37

Page 57: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

MCMC with Intractable Likelihood Function

Let Z = log p̂θ (y ;U)− log pθ(y) be the error in log-likelihood.

The proposal from which Z arises is denoted g(z |θ).We can rewrite the extended target

π̂(θ, z) = p( θ| y) exp(z)g(z |θ)

which is directly related to π̂(θ, u) through the many-to-onetransformation from u to z .

The previous algorithm proposes θ′ ∼ q(·|θ) and Z ′ ∼ g(·|θ′),accepting

(θ′, z ′

)w.p.

αQ (θ,Z ; θ′,Z ′) = min

{1, exp(Z ′ − Z )ω(θ′; θ)/ω(θ; θ′)

},

where ω(θ′; θ) = π(θ′)/q(θ′|θ).

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 19 / 37

Page 58: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Ineffi ciency Measure

Consider a stationary Markov chain {θj} with invariant density π(θ)and h : Θ→ R with Vπ [h(θ)] < ∞. Define

µh = Eπ [h(θ)] and µ̂h,n = n−1

n

∑j=1h(θj ).

Then, under regularity conditions, we have

limn−→∞

nV(µ̂h,n

)= Vπ [h(θ)] IFh with IFh = 1+ 2

∑τ=1

ρh(τ),

where ρh(τ) is the autocorrelation at lag τ of the stationary sequence{h(θj )}.

The IACT, IFh, quantifies how many times more samples are requiredfrom the Markov chain relative to using iid samples from π(θ) toachieve a given precision.

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 20 / 37

Page 59: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Ineffi ciency Measure

Consider a stationary Markov chain {θj} with invariant density π(θ)and h : Θ→ R with Vπ [h(θ)] < ∞. Define

µh = Eπ [h(θ)] and µ̂h,n = n−1

n

∑j=1h(θj ).

Then, under regularity conditions, we have

limn−→∞

nV(µ̂h,n

)= Vπ [h(θ)] IFh with IFh = 1+ 2

∑τ=1

ρh(τ),

where ρh(τ) is the autocorrelation at lag τ of the stationary sequence{h(θj )}.The IACT, IFh, quantifies how many times more samples are requiredfrom the Markov chain relative to using iid samples from π(θ) toachieve a given precision.

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 20 / 37

Page 60: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

A Bounding Chain

For the sake of analysis, we introduce an alternative Q∗ chain.

Let (θ,Z ) be the current state of the Markov chain.

1 Propose θ′ ∼ q(·|θ) and Z ′ ∼ g(·|θ′).2 Accept θ′ w.p. αQEX (θ; θ

′) = min{1,ω(θ′; θ)/ω(θ; θ′)

}.

3 Accept Z ′ w.p. αQ z(Z ;Z ′) = min {1, exp(Z ′ − Z )}.4(θ′,Z ′

)is accepted if and only if there is acceptance in both criteria.

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 21 / 37

Page 61: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

A Bounding Chain

For the sake of analysis, we introduce an alternative Q∗ chain.

Let (θ,Z ) be the current state of the Markov chain.

1 Propose θ′ ∼ q(·|θ) and Z ′ ∼ g(·|θ′).2 Accept θ′ w.p. αQEX (θ; θ

′) = min{1,ω(θ′; θ)/ω(θ; θ′)

}.

3 Accept Z ′ w.p. αQ z(Z ;Z ′) = min {1, exp(Z ′ − Z )}.4(θ′,Z ′

)is accepted if and only if there is acceptance in both criteria.

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 21 / 37

Page 62: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

A Bounding Chain

For the sake of analysis, we introduce an alternative Q∗ chain.

Let (θ,Z ) be the current state of the Markov chain.

1 Propose θ′ ∼ q(·|θ) and Z ′ ∼ g(·|θ′).

2 Accept θ′ w.p. αQEX (θ; θ′) = min

{1,ω(θ′; θ)/ω(θ; θ′)

}.

3 Accept Z ′ w.p. αQ z(Z ;Z ′) = min {1, exp(Z ′ − Z )}.4(θ′,Z ′

)is accepted if and only if there is acceptance in both criteria.

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 21 / 37

Page 63: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

A Bounding Chain

For the sake of analysis, we introduce an alternative Q∗ chain.

Let (θ,Z ) be the current state of the Markov chain.

1 Propose θ′ ∼ q(·|θ) and Z ′ ∼ g(·|θ′).2 Accept θ′ w.p. αQEX (θ; θ

′) = min{1,ω(θ′; θ)/ω(θ; θ′)

}.

3 Accept Z ′ w.p. αQ z(Z ;Z ′) = min {1, exp(Z ′ − Z )}.4(θ′,Z ′

)is accepted if and only if there is acceptance in both criteria.

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 21 / 37

Page 64: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

A Bounding Chain

For the sake of analysis, we introduce an alternative Q∗ chain.

Let (θ,Z ) be the current state of the Markov chain.

1 Propose θ′ ∼ q(·|θ) and Z ′ ∼ g(·|θ′).2 Accept θ′ w.p. αQEX (θ; θ

′) = min{1,ω(θ′; θ)/ω(θ; θ′)

}.

3 Accept Z ′ w.p. αQ z(Z ;Z ′) = min {1, exp(Z ′ − Z )}.

4(θ′,Z ′

)is accepted if and only if there is acceptance in both criteria.

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 21 / 37

Page 65: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

A Bounding Chain

For the sake of analysis, we introduce an alternative Q∗ chain.

Let (θ,Z ) be the current state of the Markov chain.

1 Propose θ′ ∼ q(·|θ) and Z ′ ∼ g(·|θ′).2 Accept θ′ w.p. αQEX (θ; θ

′) = min{1,ω(θ′; θ)/ω(θ; θ′)

}.

3 Accept Z ′ w.p. αQ z(Z ;Z ′) = min {1, exp(Z ′ − Z )}.4(θ′,Z ′

)is accepted if and only if there is acceptance in both criteria.

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 21 / 37

Page 66: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

A Bounding Chain

Lemma: The Markov chain Q∗ has the following properties:

1 We have

αQ (θ, z ; θ′, z ′) ≥ αQ ∗(θ, z ; θ

′, z ′) = αQEX (θ; θ′)× αQ z(z ; z

′).

2 Q∗ is a reversible Markov chain with invariant density π̂(θ, z).3 For any function h(θ) the IACT is higher for the Q∗ chain than forthe Q chain; i.e. IFQ

h ≥ IFQh (Peskun, 1973; Tierney 1998).

Remark: We have αQ ∗(θ, z ; θ′, z ′) = αQ (θ, z ; θ

′, z ′) = αQEX (θ; θ′)

when the likelihood is known exactly andαQ ∗(θ, z ; θ

′, z ′) = αQ (θ, z ; θ∗, z∗) = αQZ (z ; z

∗) when the proposal isperfect, i.e. q(θ′|θ) = π(θ′).

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 22 / 37

Page 67: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

A Bounding Chain

Lemma: The Markov chain Q∗ has the following properties:

1 We have

αQ (θ, z ; θ′, z ′) ≥ αQ ∗(θ, z ; θ

′, z ′) = αQEX (θ; θ′)× αQ z(z ; z

′).

2 Q∗ is a reversible Markov chain with invariant density π̂(θ, z).3 For any function h(θ) the IACT is higher for the Q∗ chain than forthe Q chain; i.e. IFQ

h ≥ IFQh (Peskun, 1973; Tierney 1998).

Remark: We have αQ ∗(θ, z ; θ′, z ′) = αQ (θ, z ; θ

′, z ′) = αQEX (θ; θ′)

when the likelihood is known exactly andαQ ∗(θ, z ; θ

′, z ′) = αQ (θ, z ; θ∗, z∗) = αQZ (z ; z

∗) when the proposal isperfect, i.e. q(θ′|θ) = π(θ′).

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 22 / 37

Page 68: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

A Bounding Chain

Lemma: The Markov chain Q∗ has the following properties:

1 We have

αQ (θ, z ; θ′, z ′) ≥ αQ ∗(θ, z ; θ

′, z ′) = αQEX (θ; θ′)× αQ z(z ; z

′).

2 Q∗ is a reversible Markov chain with invariant density π̂(θ, z).

3 For any function h(θ) the IACT is higher for the Q∗ chain than forthe Q chain; i.e. IFQ

h ≥ IFQh (Peskun, 1973; Tierney 1998).

Remark: We have αQ ∗(θ, z ; θ′, z ′) = αQ (θ, z ; θ

′, z ′) = αQEX (θ; θ′)

when the likelihood is known exactly andαQ ∗(θ, z ; θ

′, z ′) = αQ (θ, z ; θ∗, z∗) = αQZ (z ; z

∗) when the proposal isperfect, i.e. q(θ′|θ) = π(θ′).

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 22 / 37

Page 69: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

A Bounding Chain

Lemma: The Markov chain Q∗ has the following properties:

1 We have

αQ (θ, z ; θ′, z ′) ≥ αQ ∗(θ, z ; θ

′, z ′) = αQEX (θ; θ′)× αQ z(z ; z

′).

2 Q∗ is a reversible Markov chain with invariant density π̂(θ, z).3 For any function h(θ) the IACT is higher for the Q∗ chain than forthe Q chain; i.e. IFQ

h ≥ IFQh (Peskun, 1973; Tierney 1998).

Remark: We have αQ ∗(θ, z ; θ′, z ′) = αQ (θ, z ; θ

′, z ′) = αQEX (θ; θ′)

when the likelihood is known exactly andαQ ∗(θ, z ; θ

′, z ′) = αQ (θ, z ; θ∗, z∗) = αQZ (z ; z

∗) when the proposal isperfect, i.e. q(θ′|θ) = π(θ′).

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 22 / 37

Page 70: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

A Bounding Chain

Lemma: The Markov chain Q∗ has the following properties:

1 We have

αQ (θ, z ; θ′, z ′) ≥ αQ ∗(θ, z ; θ

′, z ′) = αQEX (θ; θ′)× αQ z(z ; z

′).

2 Q∗ is a reversible Markov chain with invariant density π̂(θ, z).3 For any function h(θ) the IACT is higher for the Q∗ chain than forthe Q chain; i.e. IFQ

h ≥ IFQh (Peskun, 1973; Tierney 1998).

Remark: We have αQ ∗(θ, z ; θ′, z ′) = αQ (θ, z ; θ

′, z ′) = αQEX (θ; θ′)

when the likelihood is known exactly andαQ ∗(θ, z ; θ

′, z ′) = αQ (θ, z ; θ∗, z∗) = αQZ (z ; z

∗) when the proposal isperfect, i.e. q(θ′|θ) = π(θ′).

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 22 / 37

Page 71: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Making Assumptions to Move Forward

Assumption. Let Z = log p̂θ (y ;U)− log pθ(y) be the error in theestimator of the log likelihood.

1 We have for N particles

g(z |θ) = φ(z ;−γ2(θ)/2N,γ2(θ)/N

),

π̂(z |θ) = exp(z)g(z |θ) = φ(z ;γ2(θ)/2N,γ2(θ)/N

)where φ(z ; a, b2) is a univariate normal of mean a, variance b2.

2 For a given value of σ2 we set N = Nσ2(θ) = γ(θ)2/σ2.

Under this assumption, both g(z |θ) and π̂(z |θ) are functions of σ2

only and we write g(z |θ) and π(z |θ) as

g(z |σ2) = φ(z ;−σ2/2, σ2), π(z |σ2) = φ(z ; σ2/2, σ2).

and θ and Z are independent under π̂(θ, z).

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 23 / 37

Page 72: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Making Assumptions to Move Forward

Assumption. Let Z = log p̂θ (y ;U)− log pθ(y) be the error in theestimator of the log likelihood.

1 We have for N particles

g(z |θ) = φ(z ;−γ2(θ)/2N,γ2(θ)/N

),

π̂(z |θ) = exp(z)g(z |θ) = φ(z ;γ2(θ)/2N,γ2(θ)/N

)where φ(z ; a, b2) is a univariate normal of mean a, variance b2.

2 For a given value of σ2 we set N = Nσ2(θ) = γ(θ)2/σ2.

Under this assumption, both g(z |θ) and π̂(z |θ) are functions of σ2

only and we write g(z |θ) and π(z |θ) as

g(z |σ2) = φ(z ;−σ2/2, σ2), π(z |σ2) = φ(z ; σ2/2, σ2).

and θ and Z are independent under π̂(θ, z).

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 23 / 37

Page 73: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Making Assumptions to Move Forward

Assumption. Let Z = log p̂θ (y ;U)− log pθ(y) be the error in theestimator of the log likelihood.

1 We have for N particles

g(z |θ) = φ(z ;−γ2(θ)/2N,γ2(θ)/N

),

π̂(z |θ) = exp(z)g(z |θ) = φ(z ;γ2(θ)/2N,γ2(θ)/N

)where φ(z ; a, b2) is a univariate normal of mean a, variance b2.

2 For a given value of σ2 we set N = Nσ2(θ) = γ(θ)2/σ2.

Under this assumption, both g(z |θ) and π̂(z |θ) are functions of σ2

only and we write g(z |θ) and π(z |θ) as

g(z |σ2) = φ(z ;−σ2/2, σ2), π(z |σ2) = φ(z ; σ2/2, σ2).

and θ and Z are independent under π̂(θ, z).

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 23 / 37

Page 74: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Making Assumptions to Move Forward

Assumption. Let Z = log p̂θ (y ;U)− log pθ(y) be the error in theestimator of the log likelihood.

1 We have for N particles

g(z |θ) = φ(z ;−γ2(θ)/2N,γ2(θ)/N

),

π̂(z |θ) = exp(z)g(z |θ) = φ(z ;γ2(θ)/2N,γ2(θ)/N

)where φ(z ; a, b2) is a univariate normal of mean a, variance b2.

2 For a given value of σ2 we set N = Nσ2(θ) = γ(θ)2/σ2.

Under this assumption, both g(z |θ) and π̂(z |θ) are functions of σ2

only and we write g(z |θ) and π(z |θ) as

g(z |σ2) = φ(z ;−σ2/2, σ2), π(z |σ2) = φ(z ; σ2/2, σ2).

and θ and Z are independent under π̂(θ, z).

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 23 / 37

Page 75: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Empirical vs Asymptotic Distribution of Log-LikelihoodEstimator

­4 ­2 0 2 4

0.1

0.2

0.3

0.4

0.5 Probit example: N=120, ρ = 0

­4 ­2 0 2 4

0.1

0.2

0.3

0.4

0.5 AR1 plus noise example: N=60, ρ = 0­4 ­2 0 2 4

0.1

0.2

0.3

0.4

0.5 Probit example: N=120, ρ = 0 .9 7

­4 ­2 0 2 4

0.1

0.2

0.3

0.4

0.5 AR1 plus noise example: N=60, ρ = 0 .9

Figure: Histograms of proposed (red) and accepted (pink) values of z in PMCMCscheme. Overlayed are Gaussian pdfs from our simplifying Assumption for atarget of σ = 0.92.

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 24 / 37

Page 76: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Bounding Chain with Perfect Proposal

Lemma (Pitt et al., 2012). Under the previous assumption, thefollowing results hold for the chain {θj ,Zj} arising from Q∗ that usesthe perfect proposal q(θ′|θ) = π(θ′) denoted QZ.

1 Let p(z ; σ2) be the probability of rejection given the current value z .Then

p(z ; σ2) = 1−∫

αQZ (z ; z′)g(z ′|σ2)dz ′

= Φ(z/σ+ σ/2)− exp(−z)Φ(z/σ− σ/2).

2

IF Z(σ2) = Eπ(·|σ2)

(1+ p(z ; σ2)1− p(z ; σ2)

)=∫ 1+ p∗(w , σ)1− p∗(w , σ)φ(w)dw ,

where p∗(w , σ) = Φ(w + σ)− exp(−wσ− σ2/2)Φ(w), Φ(·)standard normal cdf. In particular, IF Z(σ2) is independent of h.

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 25 / 37

Page 77: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Bounding Chain with Perfect Proposal

Lemma (Pitt et al., 2012). Under the previous assumption, thefollowing results hold for the chain {θj ,Zj} arising from Q∗ that usesthe perfect proposal q(θ′|θ) = π(θ′) denoted QZ.

1 Let p(z ; σ2) be the probability of rejection given the current value z .Then

p(z ; σ2) = 1−∫

αQZ (z ; z′)g(z ′|σ2)dz ′

= Φ(z/σ+ σ/2)− exp(−z)Φ(z/σ− σ/2).

2

IF Z(σ2) = Eπ(·|σ2)

(1+ p(z ; σ2)1− p(z ; σ2)

)=∫ 1+ p∗(w , σ)1− p∗(w , σ)φ(w)dw ,

where p∗(w , σ) = Φ(w + σ)− exp(−wσ− σ2/2)Φ(w), Φ(·)standard normal cdf. In particular, IF Z(σ2) is independent of h.

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 25 / 37

Page 78: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Bounding Chain with Perfect Proposal

Lemma (Pitt et al., 2012). Under the previous assumption, thefollowing results hold for the chain {θj ,Zj} arising from Q∗ that usesthe perfect proposal q(θ′|θ) = π(θ′) denoted QZ.

1 Let p(z ; σ2) be the probability of rejection given the current value z .Then

p(z ; σ2) = 1−∫

αQZ (z ; z′)g(z ′|σ2)dz ′

= Φ(z/σ+ σ/2)− exp(−z)Φ(z/σ− σ/2).

2

IF Z(σ2) = Eπ(·|σ2)

(1+ p(z ; σ2)1− p(z ; σ2)

)=∫ 1+ p∗(w , σ)1− p∗(w , σ)φ(w)dw ,

where p∗(w , σ) = Φ(w + σ)− exp(−wσ− σ2/2)Φ(w), Φ(·)standard normal cdf. In particular, IF Z(σ2) is independent of h.

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 25 / 37

Page 79: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Main Result

Under the previous assumption and additional regularity conditions,

IF Qh (σ

2) ≤ IF Q*h (σ

2) ≤ IF Uh (σ

2),

IF Uh (σ

2) =12(1+ IF EX

h )(1+ IFZ (σ2))− 1.

The inequality becomes exact when σ −→ 0 (as IF Z (σ2) −→ 1 forσ −→ 0) and when the proposal is perfect (as IF EX

h = 1 whenq(θ′|θ) = π(θ′)).

For the unconditional acceptance probability, we have

PQ(A|σ2) ≥ PQ*(A|σ2) = 2Φ(−σ/√2) PEX(A).

with the bound getting tighter if either PEX(A) −→ 1 orPZ(A|σ2) −→ 1.

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 26 / 37

Page 80: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Main Result

Under the previous assumption and additional regularity conditions,

IF Qh (σ

2) ≤ IF Q*h (σ

2) ≤ IF Uh (σ

2),

IF Uh (σ

2) =12(1+ IF EX

h )(1+ IFZ (σ2))− 1.

The inequality becomes exact when σ −→ 0 (as IF Z (σ2) −→ 1 forσ −→ 0) and when the proposal is perfect (as IF EX

h = 1 whenq(θ′|θ) = π(θ′)).

For the unconditional acceptance probability, we have

PQ(A|σ2) ≥ PQ*(A|σ2) = 2Φ(−σ/√2) PEX(A).

with the bound getting tighter if either PEX(A) −→ 1 orPZ(A|σ2) −→ 1.

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 26 / 37

Page 81: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Main Result

Under the previous assumption and additional regularity conditions,

IF Qh (σ

2) ≤ IF Q*h (σ

2) ≤ IF Uh (σ

2),

IF Uh (σ

2) =12(1+ IF EX

h )(1+ IFZ (σ2))− 1.

The inequality becomes exact when σ −→ 0 (as IF Z (σ2) −→ 1 forσ −→ 0) and when the proposal is perfect (as IF EX

h = 1 whenq(θ′|θ) = π(θ′)).

For the unconditional acceptance probability, we have

PQ(A|σ2) ≥ PQ*(A|σ2) = 2Φ(−σ/√2) PEX(A).

with the bound getting tighter if either PEX(A) −→ 1 orPZ(A|σ2) −→ 1.

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 26 / 37

Page 82: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Relative Ineffi ciency

We have

IF Qh (σ

2)

IF EXh

≤ IF Uh (σ

2)

IF EXh

= RIF Uh (σ

2),

RIF Uh (σ

2) =12(IF Z(σ2)− 1)

IF EXh

+12(1+ IF Z(σ2)).

For fixed σ, RIF Uh (σ

2) decreases as IF EXh increases from a value of

IF Z(σ2) for IF EXh = 1 to

RIF Uh (σ

2) −→ 12 (1+ IF

Z(σ2)) ≤ IF Z(σ2) as IF EXh −→ ∞.

The loss in effi ciency from using the estimated likelihood goes downas the proposal deteriorates.

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 27 / 37

Page 83: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Relative Ineffi ciency

We have

IF Qh (σ

2)

IF EXh

≤ IF Uh (σ

2)

IF EXh

= RIF Uh (σ

2),

RIF Uh (σ

2) =12(IF Z(σ2)− 1)

IF EXh

+12(1+ IF Z(σ2)).

For fixed σ, RIF Uh (σ

2) decreases as IF EXh increases from a value of

IF Z(σ2) for IF EXh = 1 to

RIF Uh (σ

2) −→ 12 (1+ IF

Z(σ2)) ≤ IF Z(σ2) as IF EXh −→ ∞.

The loss in effi ciency from using the estimated likelihood goes downas the proposal deteriorates.

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 27 / 37

Page 84: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Relative Ineffi ciency

We have

IF Qh (σ

2)

IF EXh

≤ IF Uh (σ

2)

IF EXh

= RIF Uh (σ

2),

RIF Uh (σ

2) =12(IF Z(σ2)− 1)

IF EXh

+12(1+ IF Z(σ2)).

For fixed σ, RIF Uh (σ

2) decreases as IF EXh increases from a value of

IF Z(σ2) for IF EXh = 1 to

RIF Uh (σ

2) −→ 12 (1+ IF

Z(σ2)) ≤ IF Z(σ2) as IF EXh −→ ∞.

The loss in effi ciency from using the estimated likelihood goes downas the proposal deteriorates.

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 27 / 37

Page 85: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Computing Time

The Computing Time (CT) for Q is defined as

CT Qh (σ

2) = IF Qh (σ

2)/σ2;

i.e. we take into account the computational efforts associated toσ2 ∝ 1/N.

1 CT Qh (σ

2) ≤ CT Uh (σ

2) where CT Uh (σ

2) := IF Uh (σ

2)/σ2.2 If IF EX

h = 1, then CT Uh (σ

2) is minimized at σUopt = 0.92 and

IF Z (σU2opt ) = 4.54, PZ (A|σU

opt ) = 0.5153.3 The minimizing σUopt rises with IF

EXh to σU

opt = 1.0206 as IFEX −→ ∞.

Define the relative computing time for the ineffi ciency bound IF Uh (σ

2)as

RCT Uh (σ

2) =RIF U

h (σ2)

σ2.

Both RIF Uh (σ

2) and RCT Uh (σ

2) are decreasing functions of IF EXh .

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 28 / 37

Page 86: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Computing Time

The Computing Time (CT) for Q is defined as

CT Qh (σ

2) = IF Qh (σ

2)/σ2;

i.e. we take into account the computational efforts associated toσ2 ∝ 1/N.

1 CT Qh (σ

2) ≤ CT Uh (σ

2) where CT Uh (σ

2) := IF Uh (σ

2)/σ2.

2 If IF EXh = 1, then CT U

h (σ2) is minimized at σU

opt = 0.92 andIF Z (σU2opt ) = 4.54, P

Z (A|σUopt ) = 0.5153.

3 The minimizing σUopt rises with IFEXh to σU

opt = 1.0206 as IFEX −→ ∞.

Define the relative computing time for the ineffi ciency bound IF Uh (σ

2)as

RCT Uh (σ

2) =RIF U

h (σ2)

σ2.

Both RIF Uh (σ

2) and RCT Uh (σ

2) are decreasing functions of IF EXh .

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 28 / 37

Page 87: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Computing Time

The Computing Time (CT) for Q is defined as

CT Qh (σ

2) = IF Qh (σ

2)/σ2;

i.e. we take into account the computational efforts associated toσ2 ∝ 1/N.

1 CT Qh (σ

2) ≤ CT Uh (σ

2) where CT Uh (σ

2) := IF Uh (σ

2)/σ2.2 If IF EX

h = 1, then CT Uh (σ

2) is minimized at σUopt = 0.92 and

IF Z (σU2opt ) = 4.54, PZ (A|σU

opt ) = 0.5153.

3 The minimizing σUopt rises with IFEXh to σU

opt = 1.0206 as IFEX −→ ∞.

Define the relative computing time for the ineffi ciency bound IF Uh (σ

2)as

RCT Uh (σ

2) =RIF U

h (σ2)

σ2.

Both RIF Uh (σ

2) and RCT Uh (σ

2) are decreasing functions of IF EXh .

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 28 / 37

Page 88: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Computing Time

The Computing Time (CT) for Q is defined as

CT Qh (σ

2) = IF Qh (σ

2)/σ2;

i.e. we take into account the computational efforts associated toσ2 ∝ 1/N.

1 CT Qh (σ

2) ≤ CT Uh (σ

2) where CT Uh (σ

2) := IF Uh (σ

2)/σ2.2 If IF EX

h = 1, then CT Uh (σ

2) is minimized at σUopt = 0.92 and

IF Z (σU2opt ) = 4.54, PZ (A|σU

opt ) = 0.5153.3 The minimizing σUopt rises with IF

EXh to σU

opt = 1.0206 as IFEX −→ ∞.

Define the relative computing time for the ineffi ciency bound IF Uh (σ

2)as

RCT Uh (σ

2) =RIF U

h (σ2)

σ2.

Both RIF Uh (σ

2) and RCT Uh (σ

2) are decreasing functions of IF EXh .

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 28 / 37

Page 89: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Computing Time

The Computing Time (CT) for Q is defined as

CT Qh (σ

2) = IF Qh (σ

2)/σ2;

i.e. we take into account the computational efforts associated toσ2 ∝ 1/N.

1 CT Qh (σ

2) ≤ CT Uh (σ

2) where CT Uh (σ

2) := IF Uh (σ

2)/σ2.2 If IF EX

h = 1, then CT Uh (σ

2) is minimized at σUopt = 0.92 and

IF Z (σU2opt ) = 4.54, PZ (A|σU

opt ) = 0.5153.3 The minimizing σUopt rises with IF

EXh to σU

opt = 1.0206 as IFEX −→ ∞.

Define the relative computing time for the ineffi ciency bound IF Uh (σ

2)as

RCT Uh (σ

2) =RIF U

h (σ2)

σ2.

Both RIF Uh (σ

2) and RCT Uh (σ

2) are decreasing functions of IF EXh .

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 28 / 37

Page 90: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Relative Upper Bounds on Ineffi ciency and ComputingTime

R I F ( I F E X = 1 )R I F ( I F E X = 4 )R I F ( I F E X = 1 0 0 )

R I F ( I F E X = 2 .5 )R I F ( I F E X = 2 0 )

0.0 2.5 5.0 7.5 10.0 12.5

10

20

30

R I F ( I F E X = 1 )R I F ( I F E X = 4 )R I F ( I F E X = 1 0 0 )

R I F ( I F E X = 2 .5 )R I F ( I F E X = 2 0 )

R C T ( I F E X = 1 )R C T ( I F E X = 4 )R C T ( I F E X = 1 0 0 )

R C T ( I F E X = 2 .5 )R C T ( I F E X = 2 0 )

0.25 0.5 0.75 1 1.25 1.5 1.75 2

10

20

30 R C T ( I F E X = 1 )R C T ( I F E X = 4 )R C T ( I F E X = 1 0 0 )

R C T ( I F E X = 2 .5 )R C T ( I F E X = 2 0 )

R I F ( I F E X = 1 )R I F ( I F E X = 4 )R I F ( I F E X = 1 0 0 )

R I F ( I F E X = 2 .5 )R I F ( I F E X = 2 0 )

0.0 2.5 5.0 7.5 10.0 12.5

50

100

150 R I F ( I F E X = 1 )R I F ( I F E X = 4 )R I F ( I F E X = 1 0 0 )

R I F ( I F E X = 2 .5 )R I F ( I F E X = 2 0 )

R C T ( I F E X = 1 )R C T ( I F E X = 4 )R C T ( I F E X = 1 0 0 )

R C T ( I F E X = 2 .5 )R C T ( I F E X = 2 0 )

0.25 0.5 0.75 1 1.25 1.5 1.75 2

50

100

150

R C T ( I F E X = 1 )R C T ( I F E X = 4 )R C T ( I F E X = 1 0 0 )

R C T ( I F E X = 2 .5 )R C T ( I F E X = 2 0 )

Figure: RCT Uh (top) and RIF

Uh (bottom) against 1/σ2 (left) and σ (right).

Different values of IFEXh are shown on each plot.

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 29 / 37

Page 91: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Example 1: Probit Model

We use a simple Bernoulli Probit model, where for t = 1, ...,T

Yt = I(Xt > 0), Xtiid∼ N(θ; 1).

The likelihood is known explicitly as Pr(Yt = 1) = Φ(θ) but isestimated through

P̂rθ(Yt = 1) =1N

N

∑k=1

I(X (k )t > 0), X (k )tiid∼ N(θ; 1)

and set θ ∼ N(0, σ2θ

)with σ2θ � 1.

Autoregressive Metropolis proposal

θ′ = θ̂ + ρ(θ − θ̂) +

√σ̂2(1− ρ2)

ν− 2 t5,

where θ̂ is the posterior mode and σ̂2 is chosen as the negativeinverse of the second derivative of the log posterior.We will consider ρ ∈ {0, 0.4, 0.6, 0.9, 0.97}.

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 30 / 37

Page 92: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Example 1: Probit Model

We use a simple Bernoulli Probit model, where for t = 1, ...,T

Yt = I(Xt > 0), Xtiid∼ N(θ; 1).

The likelihood is known explicitly as Pr(Yt = 1) = Φ(θ) but isestimated through

P̂rθ(Yt = 1) =1N

N

∑k=1

I(X (k )t > 0), X (k )tiid∼ N(θ; 1)

and set θ ∼ N(0, σ2θ

)with σ2θ � 1.

Autoregressive Metropolis proposal

θ′ = θ̂ + ρ(θ − θ̂) +

√σ̂2(1− ρ2)

ν− 2 t5,

where θ̂ is the posterior mode and σ̂2 is chosen as the negativeinverse of the second derivative of the log posterior.We will consider ρ ∈ {0, 0.4, 0.6, 0.9, 0.97}.

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 30 / 37

Page 93: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Example 1: Probit Model

We use a simple Bernoulli Probit model, where for t = 1, ...,T

Yt = I(Xt > 0), Xtiid∼ N(θ; 1).

The likelihood is known explicitly as Pr(Yt = 1) = Φ(θ) but isestimated through

P̂rθ(Yt = 1) =1N

N

∑k=1

I(X (k )t > 0), X (k )tiid∼ N(θ; 1)

and set θ ∼ N(0, σ2θ

)with σ2θ � 1.

Autoregressive Metropolis proposal

θ′ = θ̂ + ρ(θ − θ̂) +

√σ̂2(1− ρ2)

ν− 2 t5,

where θ̂ is the posterior mode and σ̂2 is chosen as the negativeinverse of the second derivative of the log posterior.

We will consider ρ ∈ {0, 0.4, 0.6, 0.9, 0.97}.

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 30 / 37

Page 94: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Example 1: Probit Model

We use a simple Bernoulli Probit model, where for t = 1, ...,T

Yt = I(Xt > 0), Xtiid∼ N(θ; 1).

The likelihood is known explicitly as Pr(Yt = 1) = Φ(θ) but isestimated through

P̂rθ(Yt = 1) =1N

N

∑k=1

I(X (k )t > 0), X (k )tiid∼ N(θ; 1)

and set θ ∼ N(0, σ2θ

)with σ2θ � 1.

Autoregressive Metropolis proposal

θ′ = θ̂ + ρ(θ − θ̂) +

√σ̂2(1− ρ2)

ν− 2 t5,

where θ̂ is the posterior mode and σ̂2 is chosen as the negativeinverse of the second derivative of the log posterior.We will consider ρ ∈ {0, 0.4, 0.6, 0.9, 0.97}.

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 30 / 37

Page 95: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Setting the Number of Samples

We want to assess experimentally whether our upper/lower boundsare sharp.

We select N so as σ to be roughly equal to a pre-specified value.

1 Choose a large initial value for the number of samples, NS .2 Run the MCMC scheme for a fixed number of iterates recording θ.3 Record the estimated variance of the log of the likelihood estimator,V (θ,NS ) = V̂ [log p̂θ(y ;U)].

4 Set Nθ = V (θ,NS )/σ2.

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 31 / 37

Page 96: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Setting the Number of Samples

We want to assess experimentally whether our upper/lower boundsare sharp.

We select N so as σ to be roughly equal to a pre-specified value.

1 Choose a large initial value for the number of samples, NS .2 Run the MCMC scheme for a fixed number of iterates recording θ.3 Record the estimated variance of the log of the likelihood estimator,V (θ,NS ) = V̂ [log p̂θ(y ;U)].

4 Set Nθ = V (θ,NS )/σ2.

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 31 / 37

Page 97: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Setting the Number of Samples

We want to assess experimentally whether our upper/lower boundsare sharp.

We select N so as σ to be roughly equal to a pre-specified value.

1 Choose a large initial value for the number of samples, NS .

2 Run the MCMC scheme for a fixed number of iterates recording θ.3 Record the estimated variance of the log of the likelihood estimator,V (θ,NS ) = V̂ [log p̂θ(y ;U)].

4 Set Nθ = V (θ,NS )/σ2.

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 31 / 37

Page 98: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Setting the Number of Samples

We want to assess experimentally whether our upper/lower boundsare sharp.

We select N so as σ to be roughly equal to a pre-specified value.

1 Choose a large initial value for the number of samples, NS .2 Run the MCMC scheme for a fixed number of iterates recording θ.

3 Record the estimated variance of the log of the likelihood estimator,V (θ,NS ) = V̂ [log p̂θ(y ;U)].

4 Set Nθ = V (θ,NS )/σ2.

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 31 / 37

Page 99: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Setting the Number of Samples

We want to assess experimentally whether our upper/lower boundsare sharp.

We select N so as σ to be roughly equal to a pre-specified value.

1 Choose a large initial value for the number of samples, NS .2 Run the MCMC scheme for a fixed number of iterates recording θ.3 Record the estimated variance of the log of the likelihood estimator,V (θ,NS ) = V̂ [log p̂θ(y ;U)].

4 Set Nθ = V (θ,NS )/σ2.

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 31 / 37

Page 100: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Setting the Number of Samples

We want to assess experimentally whether our upper/lower boundsare sharp.

We select N so as σ to be roughly equal to a pre-specified value.

1 Choose a large initial value for the number of samples, NS .2 Run the MCMC scheme for a fixed number of iterates recording θ.3 Record the estimated variance of the log of the likelihood estimator,V (θ,NS ) = V̂ [log p̂θ(y ;U)].

4 Set Nθ = V (θ,NS )/σ2.

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 31 / 37

Page 101: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Acceptance Probabilities for Probit Example

Pr(Accept) of ExactPr(Accept) of Bound

Pr(Accept) of PMCMC

0.25 0.50 0.75 1.00 1.25 1.50 1.75

0.5

1.0

ρ = 0

Pr(Accept) of ExactPr(Accept) of Bound

Pr(Accept) of PMCMC Pr(Accept) of ExactPr(Accept) of Bound

Pr(Accept) of PMCMC

0.5 1.0 1.5 2.0

0.5

1.0

ρ = 0 .4

Pr(Accept) of ExactPr(Accept) of Bound

Pr(Accept) of PMCMC

0.5 1.0 1.5 2.0

0.5

1.0

ρ = 0 .6

ρ = 0 .9 7

Pr(Accept) of ExactPr(Accept) of Bound

Pr(Accept) of PMCMC Pr(Accept) of ExactPr(Accept) of Bound

Pr(Accept) of PMCMC

0.5 1.0 1.5 2.0

0.5

1.0

ρ = 0 .9

Pr(Accept) of ExactPr(Accept) of Bound

Pr(Accept) of PMCMC

Pr(Accept) of ExactPr(Accept) of Bound

Pr(Accept) of PMCMC

0.5 1.0 1.5 2.0

0.5

1.0Pr(Accept) of ExactPr(Accept) of Bound

Pr(Accept) of PMCMC

Figure: Probit example T = 100, θ = 0.5. Accept. Proba vs σ(θ). Estim. probafor the exact MCMC scheme is shown (constant), estim. proba from thesimulated likelihood scheme (red) and lower bound given as proba exact schemetimes 2Φ(−σ/

√2) (blue).

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 32 / 37

Page 102: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Relative Ineffi ciency and Computing Time

RCT ( ρ = 0 )RCT  ( ρ = 0 .6 )RCT  ( ρ = 0 .9 7 )

RCT  ( ρ = 0 .4 )RCT  ( ρ = 0 .9 )

0 250 500 750 1000 1250

500

1000

1500RCT ( ρ = 0 )RCT  ( ρ = 0 .6 )RCT  ( ρ = 0 .9 7 )

RCT  ( ρ = 0 .4 )RCT  ( ρ = 0 .9 )

RCT  ( ρ = 0 )RCT  ( ρ = 0 .6 )RCT  ( ρ = 0 .9 7 )

RCT  ( ρ = 0 .4 )RCT  ( ρ = 0 .9 )

0.5 1.0 1.5 2.0

500

1000

1500

RCT  ( ρ = 0 )RCT  ( ρ = 0 .6 )RCT  ( ρ = 0 .9 7 )

RCT  ( ρ = 0 .4 )RCT  ( ρ = 0 .9 )

RIF  ( ρ = 0 )RIF  ( ρ = 0 .6 )RIF  ( ρ = 0 .9 7 )

RIF  ( ρ = 0 .4 )RIF  ( ρ = 0 .9 )

0 250 500 750 1000 1250

20

40

60RIF  ( ρ = 0 )RIF  ( ρ = 0 .6 )RIF  ( ρ = 0 .9 7 )

RIF  ( ρ = 0 .4 )RIF  ( ρ = 0 .9 )

RIF  ( ρ = 0 )RIF  ( ρ = 0 .6 )RIF  ( ρ = 0 .9 7 )

RIF  ( ρ = 0 .4 )RIF  ( ρ = 0 .9 )

0.5 1.0 1.5 2.0

20

40

60RIF  ( ρ = 0 )RIF  ( ρ = 0 .6 )RIF  ( ρ = 0 .9 7 )

RIF  ( ρ = 0 .4 )RIF  ( ρ = 0 .9 )

Figure: RCT Qh (top) and RIF

Qh (bottom) against N (left) and σ(θ) (right) for

various values of ρ.

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 33 / 37

Page 103: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Example 2: Noisy Autoregressive Example

We have

Xt+1 = µ(1− φ) + φXt + σηηt and Yt = Xt + σεWt

where ηt and Wt are independent standard normal and

θ =(

φ, µ, σ2η

).

The likelihood can be computed exactly using the Kalman filter.

Autoregressive Metropolis proposal for θ based on multivariatet-distribution.

N is selected in the same manner so as to obtain an approximatelyconstant σ

(θ).

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 34 / 37

Page 104: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Example 2: Noisy Autoregressive Example

We have

Xt+1 = µ(1− φ) + φXt + σηηt and Yt = Xt + σεWt

where ηt and Wt are independent standard normal and

θ =(

φ, µ, σ2η

).

The likelihood can be computed exactly using the Kalman filter.

Autoregressive Metropolis proposal for θ based on multivariatet-distribution.

N is selected in the same manner so as to obtain an approximatelyconstant σ

(θ).

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 34 / 37

Page 105: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Example 2: Noisy Autoregressive Example

We have

Xt+1 = µ(1− φ) + φXt + σηηt and Yt = Xt + σεWt

where ηt and Wt are independent standard normal and

θ =(

φ, µ, σ2η

).

The likelihood can be computed exactly using the Kalman filter.

Autoregressive Metropolis proposal for θ based on multivariatet-distribution.

N is selected in the same manner so as to obtain an approximatelyconstant σ

(θ).

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 34 / 37

Page 106: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Example 2: Noisy Autoregressive Example

We have

Xt+1 = µ(1− φ) + φXt + σηηt and Yt = Xt + σεWt

where ηt and Wt are independent standard normal and

θ =(

φ, µ, σ2η

).

The likelihood can be computed exactly using the Kalman filter.

Autoregressive Metropolis proposal for θ based on multivariatet-distribution.

N is selected in the same manner so as to obtain an approximatelyconstant σ

(θ).

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 34 / 37

Page 107: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Acceptance Probabilities

Pr(Accept) of ExactPr(Accept) of Bound

Pr(Accept) of PMCMC

0.5 1.0 1.5 2.0

0.25

0.50

0.75

ρ = 0

Pr(Accept) of ExactPr(Accept) of Bound

Pr(Accept) of PMCMC Pr(Accept) of ExactPr(Accept) of Bound

Pr(Accept) of PMCMC

0.5 1.0 1.5 2.0

0.25

0.50

0.75

ρ = 0 .4

Pr(Accept) of ExactPr(Accept) of Bound

Pr(Accept) of PMCMC

Pr(Accept) of ExactPr(Accept) of Bound

Pr(Accept) of PMCMC

0.5 1.0 1.5 2.0

0.25

0.50

0.75

ρ = 0 .6

Pr(Accept) of ExactPr(Accept) of Bound

Pr(Accept) of PMCMCPr(Accept) of ExactPr(Accept) of Bound

Pr(Accept) of PMCMC

0.5 1.0 1.5 2.0

0.25

0.50

0.75

ρ = 0 .9

Pr(Accept) of ExactPr(Accept) of Bound

Pr(Accept) of PMCMC

Figure: AR1 plus noise example with T = 300, φ = 0.8, µ = 0.5, σ2η = 1,

σ2ε = 0.5. Probabilities of acceptance displayed against σ(θ).

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 35 / 37

Page 108: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Relative Ineffi ciency and Computing Time

RCT

0 100 200 300

250

500

ρ = 0RCT

RCT

1 2

250

500 RCT RIF

0 100 200 300

25

50 RIF RIF

1 2

25

50 RIF

RCT

0 100 200 300

250

500ρ = 0 .4

RCT RCT

1 2

250

500 RCT RIF

0 100 200 300

25

50 RIFRIF

1 2

25

50 RIF

RCT

0 100 200 300

250

500ρ = 0 .6

ρ = 0 .9

RCT RCT

1 2

250

500RCT RIF

0 100 200 300

25

50RIF RIF

1 2

25

50 RIF

RCT

0 100 200 300

250

500RCT RCT

1 2

250

500RCT RIF

0 100 200 300

25

50 RIF RIF

1 2

25

50 RIF

Figure: From left to right: RCT Qh vs N, RCT

Qh vs σ(θ), RIF Qh against N and

RIF Qh against σ(θ) for various values of ρ and different parameters.

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 36 / 37

Page 109: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Discussion

We have provided an approximate analysis of the PMMH sampler.

For a general proposal and under simplifying assumptions on thelikelihood estimator, we can get guidelines on how to select σ: as longas σ is around 1 then you are fine.

Coming up with a nicer way to adapt N would be useful; e.g. Lee,2011.

Particle Gibbs samplers are a powerful alternative not yet wellunderstood (Andrieu, D. & Holenstein, 2010; Whiteley, Andrieu, D.,2010; Lindsten, Jordan, Schon, 2012).

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 37 / 37

Page 110: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Discussion

We have provided an approximate analysis of the PMMH sampler.

For a general proposal and under simplifying assumptions on thelikelihood estimator, we can get guidelines on how to select σ: as longas σ is around 1 then you are fine.

Coming up with a nicer way to adapt N would be useful; e.g. Lee,2011.

Particle Gibbs samplers are a powerful alternative not yet wellunderstood (Andrieu, D. & Holenstein, 2010; Whiteley, Andrieu, D.,2010; Lindsten, Jordan, Schon, 2012).

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 37 / 37

Page 111: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Discussion

We have provided an approximate analysis of the PMMH sampler.

For a general proposal and under simplifying assumptions on thelikelihood estimator, we can get guidelines on how to select σ: as longas σ is around 1 then you are fine.

Coming up with a nicer way to adapt N would be useful; e.g. Lee,2011.

Particle Gibbs samplers are a powerful alternative not yet wellunderstood (Andrieu, D. & Holenstein, 2010; Whiteley, Andrieu, D.,2010; Lindsten, Jordan, Schon, 2012).

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 37 / 37

Page 112: Bayesian Parameter Inference in State-Space Models using ...events.csml.ucl.ac.uk/userdata/masterclasses/doucet/...Bayesian Parameter Inference in State-Space Models using Particle

Discussion

We have provided an approximate analysis of the PMMH sampler.

For a general proposal and under simplifying assumptions on thelikelihood estimator, we can get guidelines on how to select σ: as longas σ is around 1 then you are fine.

Coming up with a nicer way to adapt N would be useful; e.g. Lee,2011.

Particle Gibbs samplers are a powerful alternative not yet wellunderstood (Andrieu, D. & Holenstein, 2010; Whiteley, Andrieu, D.,2010; Lindsten, Jordan, Schon, 2012).

A. Doucet (UCL Masterclass Oct. 2012) 5th October 2012 37 / 37