Sequential Importance Sampling Resampling

Transcript of Sequential Importance Sampling Resampling

Page 1: Sequential Importance Sampling Resampling

Sequential Importance Sampling Resampling

Arnaud Doucet
Departments of Statistics & Computer Science
University of British Columbia

Page 2: Sequential Importance Sampling Resampling

Sequential Importance Sampling

We use a structured IS distribution

q_n(x_{1:n}) = q_{n-1}(x_{1:n-1}) \, q_n(x_n \mid x_{1:n-1})
             = q_1(x_1) \, q_2(x_2 \mid x_1) \cdots q_n(x_n \mid x_{1:n-1})

so if X_{1:n-1}^{(i)} \sim q_{n-1}(x_{1:n-1}), then we only need to sample

X_n^{(i)} \mid X_{1:n-1}^{(i)} \sim q_n(x_n \mid X_{1:n-1}^{(i)})

to obtain X_{1:n}^{(i)} \sim q_n(x_{1:n}).

The importance weights are updated according to

w_n(x_{1:n}) = \frac{\gamma_n(x_{1:n})}{q_n(x_{1:n})}
             = w_{n-1}(x_{1:n-1}) \underbrace{\frac{\gamma_n(x_{1:n})}{\gamma_{n-1}(x_{1:n-1}) \, q_n(x_n \mid x_{1:n-1})}}_{\alpha_n(x_{1:n})}.


Page 4: Sequential Importance Sampling Resampling

Sequential Importance Sampling

At time n = 1, sample X_1^{(i)} \sim q_1(\cdot) and set

w_1(X_1^{(i)}) = \frac{\gamma_1(X_1^{(i)})}{q_1(X_1^{(i)})}.

At time n \geq 2,
- sample X_n^{(i)} \sim q_n(\cdot \mid X_{1:n-1}^{(i)}),
- compute w_n(X_{1:n}^{(i)}) = w_{n-1}(X_{1:n-1}^{(i)}) \, \alpha_n(X_{1:n}^{(i)}).

It follows that

\widehat{\pi}_n(dx_{1:n}) = \sum_{i=1}^{N} W_n^{(i)} \delta_{X_{1:n}^{(i)}}(dx_{1:n}), \qquad
\widehat{Z}_n = \frac{1}{N} \sum_{i=1}^{N} w_n(X_{1:n}^{(i)}).

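To make the recursion concrete, here is a minimal NumPy sketch of SIS for the toy Gaussian target revisited at the end of the deck (\gamma(x_{1:n}) = \prod_k \exp(-x_k^2/2), proposal \mathcal{N}(0, \sigma^2)); the constants N, T and sigma2 are illustrative assumptions. Since the incremental weight here involves only x_n, trajectories need not be stored.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T, sigma2 = 1000, 50, 1.2   # particles, time steps, proposal variance (toy values)

def log_alpha(x):
    # log alpha_n = log(gamma_n / gamma_{n-1}) - log q_n
    #             = -x^2/2 - log N(x; 0, sigma2)
    return -0.5 * x**2 + 0.5 * x**2 / sigma2 + 0.5 * np.log(2 * np.pi * sigma2)

logw = np.zeros(N)                                  # log w_n, updated recursively
for n in range(T):
    x = rng.normal(0.0, np.sqrt(sigma2), size=N)    # X_n^(i) ~ q_n(.)
    logw += log_alpha(x)                            # w_n = w_{n-1} * alpha_n

m = logw.max()                                      # log-sum-exp for stability
W = np.exp(logw - m); W /= W.sum()                  # normalized weights W_n^(i)
log_Z_hat = m + np.log(np.mean(np.exp(logw - m)))   # log of (1/N) sum_i w_n^(i)
print("ESS =", 1.0 / np.sum(W**2),
      "log Z_hat =", log_Z_hat,
      "exact =", T / 2 * np.log(2 * np.pi))         # Z = (2 pi)^{T/2} for this toy target
```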

Page 9: Sequential Importance Sampling Resampling

Sequential Importance Sampling for State-Space Models

State-space models:

Hidden Markov process: X_1 \sim \mu, \quad X_k \mid (X_{k-1} = x_{k-1}) \sim f(\cdot \mid x_{k-1})

Observation process: Y_k \mid (X_k = x_k) \sim g(\cdot \mid x_k)

Assume we have received y_{1:n}; we are interested in sampling from

\pi_n(x_{1:n}) = p(x_{1:n} \mid y_{1:n}) = \frac{p(x_{1:n}, y_{1:n})}{p(y_{1:n})}

and estimating p(y_{1:n}), where

\gamma_n(x_{1:n}) = p(x_{1:n}, y_{1:n}) = \mu(x_1) \prod_{k=2}^{n} f(x_k \mid x_{k-1}) \prod_{k=1}^{n} g(y_k \mid x_k),

Z_n = p(y_{1:n}) = \int \cdots \int \mu(x_1) \prod_{k=2}^{n} f(x_k \mid x_{k-1}) \prod_{k=1}^{n} g(y_k \mid x_k) \, dx_{1:n}.


Page 11: Sequential Importance Sampling Resampling

Locally Optimal Importance Distribution

The optimal IS distribution q_n(x_n \mid x_{1:n-1}) at time n, minimizing the variance of w_n(x_{1:n}), is given by

q_n^{opt}(x_n \mid x_{1:n-1}) = \pi_n(x_n \mid x_{1:n-1})

and yields an incremental importance weight of the form

\alpha_n(x_{1:n}) = \frac{\gamma_n(x_{1:n-1})}{\gamma_{n-1}(x_{1:n-1})}.

For state-space models, we have

q_n^{opt}(x_n \mid x_{1:n-1}) = p(x_n \mid y_n, x_{n-1}) = \frac{g(y_n \mid x_n) \, f(x_n \mid x_{n-1})}{p(y_n \mid x_{n-1})},
\qquad \alpha_n(x_{1:n}) = p(y_n \mid x_{n-1}).

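For the linear Gaussian model used later in the deck, f(x_n \mid x_{n-1}) = \mathcal{N}(x_n; \alpha x_{n-1}, \sigma_v^2) and g(y_n \mid x_n) = \mathcal{N}(y_n; x_n, \sigma_w^2), both q^{opt} and \alpha_n are available in closed form. The sketch below works them out by standard Gaussian conjugacy; it is a derivation under those model assumptions, not a formula taken from the slides.

```python
import numpy as np

def optimal_proposal_params(x_prev, y, a, sig_v2, sig_w2):
    """Parameters of q_opt = p(x_n | y_n, x_{n-1}) when
    f(x_n | x_{n-1}) = N(x_n; a x_prev, sig_v2) and g(y_n | x_n) = N(y_n; x_n, sig_w2).
    Gaussian conjugacy: precisions add, the mean is precision-weighted."""
    var = 1.0 / (1.0 / sig_v2 + 1.0 / sig_w2)
    mean = var * (a * x_prev / sig_v2 + y / sig_w2)
    return mean, var

def log_alpha(x_prev, y, a, sig_v2, sig_w2):
    """Incremental weight alpha_n = p(y_n | x_{n-1}) = N(y_n; a x_prev, sig_v2 + sig_w2);
    note that it does not depend on the newly sampled X_n."""
    s2 = sig_v2 + sig_w2
    return -0.5 * ((y - a * x_prev) ** 2 / s2 + np.log(2.0 * np.pi * s2))
```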

Page 13: Sequential Importance Sampling Resampling

Summary

Sequential Importance Sampling is a special case of Importance Sampling.

Importance Sampling only works decently for moderate-size problems.

Today, we discuss how to partially fix this problem.


Page 16: Sequential Importance Sampling Resampling

Resampling

Intuitive KEY idea: As the time index n increases, the variance of the unnormalized weights \{w_n(X_{1:n}^{(i)})\} tends to increase, and all the mass concentrates on a few random samples/particles. We propose to reset the approximation by getting rid, in a principled way, of the particles with low weights W_n^{(i)} (relative to 1/N) and multiplying the particles with high weights W_n^{(i)} (relative to 1/N).

The main reason is that if a particle has a low weight at time n, then typically it will still have a low weight at time n + 1 (though I can easily give you a counterexample).

You want to focus your computational efforts on the "promising" parts of the space.


Page 19: Sequential Importance Sampling Resampling

Multinomial Resampling

At time n, IS provides the following approximation of \pi_n(x_{1:n}):

\widehat{\pi}_n(dx_{1:n}) = \sum_{i=1}^{N} W_n^{(i)} \delta_{X_{1:n}^{(i)}}(dx_{1:n}).

The simplest resampling scheme consists of sampling N times \widetilde{X}_{1:n}^{(i)} \sim \widehat{\pi}_n(dx_{1:n}) to build the new approximation

\widetilde{\pi}_n(dx_{1:n}) = \frac{1}{N} \sum_{i=1}^{N} \delta_{\widetilde{X}_{1:n}^{(i)}}(dx_{1:n}).

The new resampled particles \{\widetilde{X}_{1:n}^{(i)}\} are approximately distributed according to \pi_n(x_{1:n}) but statistically dependent, which makes them theoretically more difficult to study.

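A minimal sketch of this scheme: draw N ancestor indices i.i.d. from the weighted empirical distribution and copy the corresponding particles. The array-based interface is an illustrative assumption.

```python
import numpy as np

def multinomial_resample(X, W, rng):
    """Draw N ancestor indices i.i.d. from Categorical(W) and copy the
    corresponding particles; the returned set carries equal weights 1/N."""
    A = rng.choice(len(W), size=len(W), p=W)
    return X[A]

# Usage: X_tilde = multinomial_resample(X, W, np.random.default_rng(1))
```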

Page 22: Sequential Importance Sampling Resampling

Note that we can rewrite

\widetilde{\pi}_n(dx_{1:n}) = \sum_{i=1}^{N} \frac{N_n^{(i)}}{N} \delta_{X_{1:n}^{(i)}}(dx_{1:n})

where (N_n^{(1)}, \ldots, N_n^{(N)}) \sim \mathcal{M}(N; W_n^{(1)}, \ldots, W_n^{(N)}), thus

E[N_n^{(i)}] = N W_n^{(i)}, \qquad var[N_n^{(i)}] = N W_n^{(i)} (1 - W_n^{(i)}).

It follows that the resampling step is an unbiased operation,

E[\widetilde{\pi}_n(dx_{1:n}) \mid \widehat{\pi}_n(dx_{1:n})] = \widehat{\pi}_n(dx_{1:n}),

but clearly it introduces some errors "locally" in time. That is, for any test function we have

var_{\widetilde{\pi}_n}[\varphi(X_{1:n})] \geq var_{\widehat{\pi}_n}[\varphi(X_{1:n})].

Resampling is beneficial for future time steps (sometimes).

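Since (N_n^{(1)}, \ldots, N_n^{(N)}) is exactly a multinomial draw, the two moment identities are easy to check numerically; a small sketch with made-up weights:

```python
import numpy as np

rng = np.random.default_rng(0)
W = np.array([0.5, 0.3, 0.15, 0.05])            # toy normalized weights
N = len(W)
counts = rng.multinomial(N, W, size=100_000)    # offspring counts over many replications
print("mean:", counts.mean(axis=0), "vs N W       =", N * W)
print("var :", counts.var(axis=0),  "vs N W (1-W) =", N * W * (1 - W))
```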

Page 25: Sequential Importance Sampling Resampling

Stratified Resampling

Better resampling steps can be designed such that E[N_n^{(i)}] = N W_n^{(i)} but V[N_n^{(i)}] < N W_n^{(i)} (1 - W_n^{(i)}).

A popular alternative to multinomial resampling consists of selecting

U_1 \sim \mathcal{U}\left(0, \frac{1}{N}\right)

and, for i = 2, \ldots, N,

U_i = U_1 + \frac{i-1}{N} = U_{i-1} + \frac{1}{N}.

Then we set

N_n^{(i)} = \#\left\{ U_j : \sum_{m=1}^{i-1} W_n^{(m)} \leq U_j < \sum_{m=1}^{i} W_n^{(m)} \right\}

where \sum_{m=1}^{0} := 0.

It is trivial to check that E[N_n^{(i)}] = N W_n^{(i)}.

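A sketch of the scheme above, using one uniform draw and a search over the cumulative weights. (This single-uniform variant is often called systematic resampling in the literature; the slide's name is kept here.)

```python
import numpy as np

def stratified_resample(X, W, rng):
    """Resampling exactly as on the slide: U_1 ~ U(0, 1/N), U_i = U_1 + (i-1)/N,
    and particle i receives N_n^(i) = #{U_j in [C_{i-1}, C_i)} offspring,
    where C_i is the cumulative sum of the weights."""
    N = len(W)
    U = rng.uniform(0.0, 1.0 / N) + np.arange(N) / N
    A = np.searchsorted(np.cumsum(W), U, side="right")  # ancestor index of each U_j
    return X[A]
```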

Page 29: Sequential Importance Sampling Resampling

An alternative approach to resampling

Assume \alpha_n(x_{1:n}) \leq 1 over E_n (rescale if necessary).

We have

\pi_n(x_{1:n}) = \frac{\alpha_n(x_{1:n}) \, q_n(x_n \mid x_{1:n-1}) \, \pi_{n-1}(x_{1:n-1})}{\int \alpha_n(x_{1:n}) \, q_n(dx_n \mid x_{1:n-1}) \, \pi_{n-1}(dx_{1:n-1})}

= \alpha_n(x_{1:n}) \, q_n(x_n \mid x_{1:n-1}) \, \pi_{n-1}(x_{1:n-1})
+ \left( 1 - \int \alpha_n(x_{1:n}) \, q_n(dx_n \mid x_{1:n-1}) \, \pi_{n-1}(dx_{1:n-1}) \right) \frac{\alpha_n(x_{1:n}) \, q_n(x_n \mid x_{1:n-1}) \, \pi_{n-1}(x_{1:n-1})}{\int \alpha_n(x_{1:n}) \, q_n(dx_n \mid x_{1:n-1}) \, \pi_{n-1}(dx_{1:n-1})}.

This looks like a measure-valued Metropolis-Hastings algorithm.


Page 32: Sequential Importance Sampling Resampling

Probabilistic interpretation

We have

\pi_n(x_{1:n}) = \underbrace{\alpha_n(x_{1:n})}_{\text{accept with proba } w_n} \underbrace{q_n(x_n \mid x_{1:n-1}) \, \pi_{n-1}(x_{1:n-1})}_{\text{trial distribution}}
+ \underbrace{\left( 1 - \int \alpha_n(x_{1:n}) \, q_n(dx_n \mid x_{1:n-1}) \, \pi_{n-1}(dx_{1:n-1}) \right)}_{\text{rejection probability}} \pi_n(x_{1:n}).

Say X_{1:n-1}^{(i)} \sim \pi_{n-1} and sample X_n^{(i)} \sim q_n(\cdot \mid X_{1:n-1}^{(i)}).

With probability \alpha_n(X_{1:n}^{(i)}), set \widetilde{X}_{1:n}^{(i)} = X_{1:n}^{(i)}; otherwise \widetilde{X}_{1:n}^{(i)} \sim \sum_{i=1}^{N} W_n^{(i)} \delta_{X_{1:n}^{(i)}}(dx_{1:n}).

Remark: This allows one to decrease the variance if \alpha_n(x_{1:n}) is "flat" over E_n, e.g. filtering with large observation noise.

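A sketch of this accept-or-resample move, assuming the \alpha_n(X_{1:n}^{(i)}) values have already been rescaled to lie in [0, 1] as required above:

```python
import numpy as np

def accept_or_resample(X, alpha, W, rng):
    """Keep particle i with probability alpha_i (assumed rescaled to [0, 1]);
    otherwise replace it by a draw from the weighted measure sum_i W_i delta_{X_i}."""
    N = len(W)
    keep = rng.uniform(size=N) < alpha                    # accept with proba alpha_n
    A = np.where(keep, np.arange(N),                      # accepted: keep own path
                 rng.choice(N, size=N, p=W))              # rejected: resample from W
    return X[A]
```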

Page 36: Sequential Importance Sampling Resampling

Degeneracy Measures

Resampling at each time step is harmful. We should resample only when necessary.

To measure the variation of the weights, we can use the Effective Sample Size (ESS) or the coefficient of variation CV:

ESS = \left( \sum_{i=1}^{N} \left( W_n^{(i)} \right)^2 \right)^{-1}, \qquad
CV = \left( \frac{1}{N} \sum_{i=1}^{N} \left( N W_n^{(i)} - 1 \right)^2 \right)^{1/2}.

We have ESS = N and CV = 0 if W_n^{(i)} = 1/N for all i.

We have ESS = 1 and CV = \sqrt{N - 1} if W_n^{(i)} = 1 and W_n^{(j)} = 0 for j \neq i.


Page 40: Sequential Importance Sampling Resampling

We can also use the entropy

Ent = - \sum_{i=1}^{N} W_n^{(i)} \log_2 \left( W_n^{(i)} \right).

We have Ent = \log_2(N) if W_n^{(i)} = 1/N for all i. We have Ent = 0 if W_n^{(i)} = 1 and W_n^{(j)} = 0 for j \neq i.

Dynamic Resampling: If the variation of the weights as measured by ESS, CV or Ent is too high, then resample the particles.

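A small sketch computing the three criteria on a vector of normalized weights; the ESS < N/2 trigger mentioned in the final comment is a common rule of thumb, not a prescription from the slides.

```python
import numpy as np

def degeneracy_measures(W):
    """ESS, coefficient of variation, and entropy of normalized weights W."""
    N = len(W)
    ess = 1.0 / np.sum(W**2)
    cv = np.sqrt(np.mean((N * W - 1.0) ** 2))
    ent = -np.sum(W * np.log2(np.clip(W, 1e-300, 1.0)))  # 0 log 0 treated as 0
    return ess, cv, ent

W = np.full(100, 1.0 / 100)
print(degeneracy_measures(W))   # (100.0, 0.0, log2(100)): no degeneracy
# A common dynamic-resampling rule: resample only when ESS < N / 2.
```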

Page 43: Sequential Importance Sampling Resampling

Generic Sequential Monte Carlo Scheme

At time n = 1,
- sample X_1^{(i)} \sim q_1(\cdot) and set w_1(X_1^{(i)}) = \gamma_1(X_1^{(i)}) / q_1(X_1^{(i)});
- resample \{X_1^{(i)}, W_1^{(i)}\} to obtain new particles also denoted \{X_1^{(i)}\}.

At time n \geq 2,
- sample X_n^{(i)} \sim q_n(\cdot \mid X_{1:n-1}^{(i)}) and compute w_n(X_{1:n}^{(i)}) = \alpha_n(X_{1:n}^{(i)});
- resample \{X_{1:n}^{(i)}, W_n^{(i)}\} to obtain new particles also denoted \{X_{1:n}^{(i)}\}.


Page 49: Sequential Importance Sampling Resampling

At any time n, we have two approximations of \pi_n(x_{1:n}):

\widehat{\pi}_n(dx_{1:n}) = \sum_{i=1}^{N} W_n^{(i)} \delta_{X_{1:n}^{(i)}}(dx_{1:n}) \quad \text{(before resampling)},

\widetilde{\pi}_n(dx_{1:n}) = \frac{1}{N} \sum_{i=1}^{N} \delta_{X_{1:n}^{(i)}}(dx_{1:n}) \quad \text{(after resampling)}.

We also have

\widehat{\frac{Z_n}{Z_{n-1}}} = \frac{1}{N} \sum_{i=1}^{N} w_n(X_{1:n}^{(i)}).


Page 51: Sequential Importance Sampling Resampling

Sequential Monte Carlo for Hidden Markov Models

At time n = 1,
- sample X_1^{(i)} \sim q_1(\cdot) and set

w_1(X_1^{(i)}) = \frac{\mu(X_1^{(i)}) \, g(y_1 \mid X_1^{(i)})}{q(X_1^{(i)} \mid y_1)};

- resample \{X_1^{(i)}, W_1^{(i)}\} to obtain new particles also denoted \{X_1^{(i)}\}.

At time n \geq 2,
- sample X_n^{(i)} \sim q(\cdot \mid y_n, X_{n-1}^{(i)}) and compute

w_n(X_{1:n}^{(i)}) = \frac{f(X_n^{(i)} \mid X_{n-1}^{(i)}) \, g(y_n \mid X_n^{(i)})}{q(X_n^{(i)} \mid y_n, X_{n-1}^{(i)})};

- resample \{X_{1:n}^{(i)}, W_n^{(i)}\} to obtain new particles also denoted \{X_{1:n}^{(i)}\}.

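A compact sketch of this scheme with multinomial resampling at every step; the proposal and weight functions are passed in as callables, an interface chosen here for illustration. Only current states are kept (enough for filtering), and log Z_n is accumulated from the ratio estimates (1/N) \sum_i w_n^{(i)} shown on the earlier slide.

```python
import numpy as np

def smc_hmm(y, N, sample_q1, logw1, sample_q, logw, rng):
    """SMC for an HMM with multinomial resampling at every step.
    sample_q1(N) -> initial draws; logw1(x, y1) -> log w_1(x);
    sample_q(x_prev, yn) -> draws from q(.| yn, x_prev);
    logw(x, x_prev, yn) -> log w_n = log f g / q.
    Returns the final resampled states and an estimate of log p(y_{1:T})."""
    x = sample_q1(N)
    lw = logw1(x, y[0])
    log_Z = 0.0
    for n in range(len(y)):
        if n > 0:
            x_prev = x
            x = sample_q(x_prev, y[n])                   # X_n^(i) ~ q(.| y_n, X_{n-1}^(i))
            lw = logw(x, x_prev, y[n])                   # log w_n(X_{1:n}^(i))
        m = lw.max()
        log_Z += m + np.log(np.mean(np.exp(lw - m)))     # log estimate of Z_n / Z_{n-1}
        W = np.exp(lw - m); W /= W.sum()
        x = x[rng.choice(N, size=N, p=W)]                # multinomial resampling
    return x, log_Z
```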

Page 57: Sequential Importance Sampling Resampling

Example: Linear Gaussian model

X_1 \sim \mathcal{N}(0, 1), \quad X_n = \alpha X_{n-1} + \sigma_v V_n,
Y_n = X_n + \sigma_w W_n

where V_n \sim \mathcal{N}(0, 1) and W_n \sim \mathcal{N}(0, 1).

We know that p(x_{1:n} \mid y_{1:n}) is Gaussian and its parameters can be computed using Kalman techniques. In particular, p(x_n \mid y_{1:n}) is also Gaussian and can be computed using the Kalman filter.

We apply the SMC method with

q(x_n \mid y_n, x_{n-1}) = f(x_n \mid x_{n-1}) = \mathcal{N}(x_n; \alpha x_{n-1}, \sigma_v^2).

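The exact Gaussian marginals p(x_n \mid y_{1:n}) are given by the scalar Kalman filter; the sketch below implements the standard recursion for this model and can serve as ground truth for the SMC output.

```python
import numpy as np

def kalman_filter(y, a, sig_v2, sig_w2):
    """Exact mean/variance of p(x_n | y_{1:n}) for X_1 ~ N(0, 1),
    X_n = a X_{n-1} + sig_v V_n, Y_n = X_n + sig_w W_n (scalar case)."""
    m, P = 0.0, 1.0                             # prior on X_1
    means, variances = [], []
    for n, yn in enumerate(y):
        if n > 0:                               # prediction step
            m, P = a * m, a * a * P + sig_v2
        K = P / (P + sig_w2)                    # Kalman gain for Y_n = X_n + noise
        m, P = m + K * (yn - m), (1.0 - K) * P  # update step
        means.append(m); variances.append(P)
    return np.array(means), np.array(variances)

# With the prior proposal q = f chosen on this slide, the SMC weight
# reduces to w_n = g(y_n | x_n) = N(y_n; x_n, sig_w2).
```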

Page 60: Sequential Importance Sampling Resampling

[Image omitted: three histograms of the importance weights (base 10 logarithm).]

Figure: Histograms of the base 10 logarithm of W_n^{(i)} for n = 1 (top), n = 50 (middle) and n = 100 (bottom).

By itself this graph does not mean that the procedure is efficient!

Page 61: Sequential Importance Sampling Resampling

This SMC strategy performs remarkably well in terms of estimation of the marginals p(x_k \mid y_{1:k}). Thankfully, this is all that is necessary in many applications.

However, the joint distribution p(x_{1:k} \mid y_{1:k}) is poorly estimated when k is large; e.g. in the previous example we have

\widehat{p}(x_{1:11} \mid y_{1:24}) = \delta_{X_{1:11}}(x_{1:11}).

The same conclusion holds for most sequences of distributions \pi_k(x_{1:k}).

Resampling only partially solves our problems.


Page 65: Sequential Importance Sampling Resampling

Another Illustration of the Degeneracy Phenomenon

For the linear Gaussian state-space model described before, we can compute in closed form

S_n = \frac{1}{n} \sum_{k=1}^{n} E\left[ X_k^2 \mid Y_{1:n} \right]

using Kalman techniques.

We compute the SMC estimate of this quantity, given by

\widehat{S}_n = \frac{1}{n} \sum_{k=1}^{n} \sum_{i=1}^{N} W_n^{(i)} \left( X_k^{(i)} \right)^2.

This estimate can be updated sequentially using our SMC approximation.

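Computing \widehat{S}_n needs the surviving ancestral trajectories, which is precisely where path degeneracy bites; a sketch assuming a hypothetical (N, n) array `paths` of trajectories and normalized weights W:

```python
import numpy as np

def smc_sufficient_statistic(paths, W):
    """S_hat_n = (1/n) sum_k sum_i W_n^(i) (X_k^(i))^2, with `paths` an
    (N, n) array of surviving trajectories and W the normalized weights."""
    return np.mean(W @ paths**2)   # weighted sum over particles, mean over time k
```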

Page 68: Sequential Importance Sampling Resampling

[Image omitted: the two estimates plotted against time, 0 to 5000.]

Figure: Sufficient statistics computed exactly through the Kalman smoother (blue) and by the SMC method (red).

Page 69: Sequential Importance Sampling Resampling

Some Convergence Results for SMC

We will discuss convergence results for SMC later; see (Del Moral, 2004).

In particular, we have for any bounded function \varphi and any p > 1

E\left[ \left| \int \varphi_n(x_{1:n}) \left( \widehat{\pi}_n(dx_{1:n}) - \pi_n(dx_{1:n}) \right) \right|^p \right]^{1/p} \leq \frac{C_n \|\varphi\|_\infty}{\sqrt{N}}.

It looks like a nice result, but it is rather useless as C_n increases polynomially/exponentially with time.

To achieve a fixed precision, this would require using a time-increasing number of particles N.


Page 73: Sequential Importance Sampling Resampling

You cannot hope to estimate with a fixed precision a target distribution of increasing dimension.

At best, you can expect results of the following form

E\left[ \left| \int \varphi(x_{n-L+1:n}) \left( \widehat{\pi}_n(dx_{n-L+1:n}) - \pi_n(dx_{n-L+1:n}) \right) \right|^p \right]^{1/p} \leq \frac{M_L \|\varphi\|_\infty}{\sqrt{N}}

if the model has nice forgetting/mixing properties, i.e.

\int \left| \pi_n(x_n \mid x_1) - \pi_n(x_n \mid x_1') \right| dx_n \leq 2 \lambda^{n-1}

with 0 \leq \lambda < 1.

In the HMM case, it means that

\int \left| p(x_n \mid y_{1:n}, x_1) - p(x_n \mid y_{1:n}, x_1') \right| dx_n \leq \lambda^{n-1}.


Page 76: Sequential Importance Sampling Resampling

Central Limit Theorems

For SIS we have

\sqrt{N} \left( E_{\widehat{\pi}_n}(\varphi_n(X_{1:n})) - E_{\pi_n}(\varphi_n(X_{1:n})) \right) \Rightarrow \mathcal{N}\left( 0, \sigma_{IS}^2(\varphi_n) \right)

where

\sigma_{IS}^2(\varphi_n) = \int \frac{\pi_n^2(x_{1:n})}{q_n(x_{1:n})} \left( \varphi_n(x_{1:n}) - E_{\pi_n}(\varphi_n(X_{1:n})) \right)^2 dx_{1:n}.

We also have

\sqrt{N} \left( \widehat{Z}_n - Z_n \right) / Z_n \Rightarrow \mathcal{N}\left( 0, \sigma_{IS}^2 \right)

where

\sigma_{IS}^2 = \int \frac{\pi_n^2(x_{1:n})}{q_n(x_{1:n})} \, dx_{1:n} - 1.


Page 78: Sequential Importance Sampling Resampling

For SMC, we have

\sigma_{SMC}^2(\varphi_n) = \int \frac{\pi_n^2(x_1)}{q_1(x_1)} \left( \int \varphi_n(x_{2:n}) \, \pi_n(x_{2:n} \mid x_1) \, dx_{2:n} - E_{\pi_n}(\varphi_n(X_{1:n})) \right)^2 dx_1

+ \sum_{k=2}^{n-1} \int \frac{\pi_n(x_{1:k})^2}{\pi_{k-1}(x_{1:k-1}) \, q_k(x_k \mid x_{k-1})} \left( \int \varphi_n(x_{1:n}) \, \pi_n(x_{k+1:n} \mid x_k) \, dx_{k+1:n} - E_{\pi_n}(\varphi_n(X_{1:n})) \right)^2 dx_{1:k}

+ \int \frac{\pi_n(x_{1:n})^2}{\pi_{n-1}(x_{1:n-1}) \, q_n(x_n \mid x_{n-1})} \left( \varphi_n(x_{1:n}) - E_{\pi_n}(\varphi_n(X_{1:n})) \right)^2 dx_{1:n}

and

\sigma_{SMC}^2 = \int \frac{\pi_n^2(x_1)}{q_1(x_1)} \, dx_1 + \sum_{k=2}^{n} \int \frac{\pi_n(x_{1:k})^2}{\pi_{k-1}(x_{1:k-1}) \, q_k(x_k \mid x_{k-1})} \, dx_{1:k} - n.

Page 79: Sequential Importance Sampling Resampling

Back to our toy example

Consider the case where the target is defined on R^n and

\pi(x_{1:n}) = \prod_{k=1}^{n} \mathcal{N}(x_k; 0, 1),

\gamma(x_{1:n}) = \prod_{k=1}^{n} \exp\left( -\frac{x_k^2}{2} \right), \qquad Z = (2\pi)^{n/2}.

We select an importance distribution

q(x_{1:n}) = \prod_{k=1}^{n} \mathcal{N}(x_k; 0, \sigma^2).


Page 81: Sequential Importance Sampling Resampling

For SMC, the asymptotic variance is finite only when \sigma^2 > \frac{1}{2}, and

\frac{V_{SMC}[\widehat{Z}_n]}{Z_n^2} \approx \frac{1}{N} \left[ \int \frac{\pi_n^2(x_1)}{q_1(x_1)} \, dx_1 - 1 + \sum_{k=2}^{n} \left( \int \frac{\pi_n^2(x_k)}{q_k(x_k)} \, dx_k - 1 \right) \right]
= \frac{n}{N} \left[ \left( \frac{\sigma^4}{2\sigma^2 - 1} \right)^{1/2} - 1 \right]

compared to

\frac{V_{IS}[\widehat{Z}_n]}{Z_n^2} = \frac{1}{N} \left[ \left( \frac{\sigma^4}{2\sigma^2 - 1} \right)^{n/2} - 1 \right]

for SIS.

If we select \sigma^2 = 1.2, then we saw that it is necessary to employ N \approx 2 \times 10^{23} particles in order to obtain V_{IS}[\widehat{Z}_n] / Z_n^2 = 10^{-2} for n = 1000.

To obtain the same performance, V_{SMC}[\widehat{Z}_n] / Z_n^2 = 10^{-2}, SMC requires just N \approx 10^4 particles: an improvement by 19 orders of magnitude.

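The two relative-variance formulas above are easy to tabulate as functions of (\sigma^2, n, N); a sketch implementing exactly the displayed expressions:

```python
import numpy as np

def rel_var_is(sigma2, n, N):
    """V_IS[Z_hat]/Z_n^2 = (1/N) [ (sigma^4 / (2 sigma^2 - 1))^{n/2} - 1 ]."""
    r = sigma2**2 / (2.0 * sigma2 - 1.0)   # finite only for sigma2 > 1/2
    return (r ** (n / 2.0) - 1.0) / N

def rel_var_smc(sigma2, n, N):
    """V_SMC[Z_hat]/Z_n^2 ~ (n/N) [ (sigma^4 / (2 sigma^2 - 1))^{1/2} - 1 ]."""
    r = sigma2**2 / (2.0 * sigma2 - 1.0)
    return n * (np.sqrt(r) - 1.0) / N
```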

Page 84: Sequential Importance Sampling Resampling

If you have nice mixing properties, then you can obtain

\sigma_{SMC}^2(\varphi) \leq \frac{C}{N}

for \varphi depending only on x_{n-L+1:n}.

Under the same assumptions, you can also obtain

\sigma_{SMC}^2 \leq \frac{D \, T}{N}.


Page 86: Sequential Importance Sampling Resampling

Summary

Resampling can drastically improve the performance of SIS in models having "good" mixing properties, e.g. state-space models; this can be verified experimentally and theoretically.

Resampling does not solve all our problems; only the SMC approximations of the most recent marginals \pi_n(x_{n-L+1:n}) are reliable, i.e. we can have uniform (in time) convergence bounds.

The SMC approximation of \pi_n(x_{1:n}) is only reliable for "small" n.
