
Part III

Appendices

A. Bain, D. Crisan, Fundamentals of Stochastic Filtering, DOI 10.1007/978-0-387-76896-0, © Springer Science+Business Media, LLC 2009

A

Measure Theory

A.1 Monotone Class Theorem

Let S be a set. A family C of subsets of S is called a π-system if it is closed under finite intersection. That is, for any A, B ∈ C we have that A ∩ B ∈ C.

Theorem A.1. Let H be a vector space of bounded functions from S into R containing the constant function 1. Assume that H has the property that for any sequence $(f_n)_{n \ge 1}$ of non-negative functions in H such that $f_n \nearrow f$, where f is a bounded function on S, we have f ∈ H. Also assume that H contains the indicator function of every set in some π-system C. Then H contains every bounded σ(C)-measurable function on S.

For a proof of Theorem A.1 and other related results see Williams [272] or Rogers and Williams [248].
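The notions of π-system and generated σ-algebra can be checked by brute force on a finite set. The following is a minimal sketch, not part of the original text: the four-point set `S`, the families `C` and `not_pi`, and the helper names are all hypothetical choices made for illustration.

```python
def is_pi_system(family):
    """Return True if the family is closed under pairwise (hence finite) intersection."""
    sets = set(family)
    return all((a & b) in sets for a in sets for b in sets)


def generated_sigma_algebra(space, family):
    """On a finite space, close `family` under complement and union to obtain sigma(family)."""
    space = frozenset(space)
    sigma = {frozenset(), space} | set(family)
    while True:
        new_sets = {space - a for a in sigma} | {a | b for a in sigma for b in sigma}
        if new_sets <= sigma:
            return sigma
        sigma |= new_sets


S = frozenset({1, 2, 3, 4})
C = {frozenset({1, 2}), frozenset({2, 3}), frozenset({2})}   # closed under intersection
not_pi = {frozenset({1, 2}), frozenset({2, 3})}              # {2} is missing
sigma_C = generated_sigma_algebra(S, C)
```

In this example the atoms of σ(C) are the singletons, so σ(C) is the whole power set of `S`; Theorem A.1 then says that a vector space H satisfying the hypotheses and containing the indicators of the three sets in `C` contains every function on `S`.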

A.2 Conditional Expectation

Let (Ω, F, P) be a probability space and G ⊂ F be a sub-σ-algebra of F. The conditional expectation of an integrable F-measurable random variable ξ given G is defined as the integrable G-measurable random variable, denoted by E[ξ | G], with the property that

$$\int_A \xi \, d\mathbb{P} = \int_A \mathbb{E}[\xi \mid \mathcal{G}] \, d\mathbb{P}, \quad \text{for all } A \in \mathcal{G}. \tag{A.1}$$

Then E[ξ | G] exists and is almost surely unique (for a proof of this result see, for example, Williams [272]). By this we mean that if $\bar{\xi}$ is another G-measurable integrable random variable such that

$$\int_A \bar{\xi} \, d\mathbb{P} = \int_A \mathbb{E}[\xi \mid \mathcal{G}] \, d\mathbb{P}, \quad \text{for all } A \in \mathcal{G},$$

then $\mathbb{E}[\xi \mid \mathcal{G}] = \bar{\xi}$, P-a.s.

The following are some of the important properties of the conditional expectation which are used throughout the text.

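a. If $\alpha_1, \alpha_2 \in \mathbb{R}$ and $\xi_1, \xi_2$ are F-measurable, then

$$\mathbb{E}[\alpha_1 \xi_1 + \alpha_2 \xi_2 \mid \mathcal{G}] = \alpha_1 \mathbb{E}[\xi_1 \mid \mathcal{G}] + \alpha_2 \mathbb{E}[\xi_2 \mid \mathcal{G}], \quad \mathbb{P}\text{-a.s.}$$

b. If ξ ≥ 0, then E[ξ | G] ≥ 0, P-a.s.

c. If $0 \le \xi_n \nearrow \xi$, then $\mathbb{E}[\xi_n \mid \mathcal{G}] \nearrow \mathbb{E}[\xi \mid \mathcal{G}]$, P-a.s.

d. If H is a sub-σ-algebra of G, then E[E[ξ | G] | H] = E[ξ | H], P-a.s.

e. If ξ is G-measurable, then E[ξη | G] = ξE[η | G], P-a.s.

f. If H is independent of σ(σ(ξ), G), then

$$\mathbb{E}[\xi \mid \sigma(\mathcal{G}, \mathcal{H})] = \mathbb{E}[\xi \mid \mathcal{G}], \quad \mathbb{P}\text{-a.s.}$$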
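On a finite probability space a sub-σ-algebra is generated by a partition, and the conditional expectation is the probability-weighted average over each block. The sketch below is illustrative only and rests on assumptions not in the text: a uniform six-point space, a hypothetical random variable ξ(ω) = ω², and two nested partitions `G` and `H`. It checks the defining property (A.1) on the generating blocks and the tower property (d) with exact rational arithmetic.

```python
from fractions import Fraction


def cond_exp(xi, prob, partition):
    """E[xi | G] when G is generated by a finite partition with blocks of positive mass."""
    result = {}
    for block in partition:
        mass = sum(prob[w] for w in block)
        average = sum(xi[w] * prob[w] for w in block) / mass
        for w in block:
            result[w] = average
    return result


omega = [0, 1, 2, 3, 4, 5]
prob = {w: Fraction(1, 6) for w in omega}
xi = {w: Fraction(w * w) for w in omega}

G = [{0, 1}, {2, 3}, {4, 5}]    # finer partition
H = [{0, 1, 2, 3}, {4, 5}]      # coarser partition: sigma(H) is a sub-sigma-algebra of sigma(G)

e_G = cond_exp(xi, prob, G)
e_H = cond_exp(xi, prob, H)
tower = cond_exp(e_G, prob, H)  # E[ E[xi | G] | H ]

# Defining property (A.1), checked on the blocks that generate G.
defining_ok = all(
    sum(xi[w] * prob[w] for w in A) == sum(e_G[w] * prob[w] for w in A) for A in G
)
```

Since every set in σ(G) is a disjoint union of blocks, checking (A.1) on the blocks is enough in this finite setting.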

The conditional probability of a set A ∈ F with respect to the σ-algebra G is the random variable denoted by P(A | G) defined as $\mathbb{P}(A \mid \mathcal{G}) \triangleq \mathbb{E}[I_A \mid \mathcal{G}]$, where $I_A$ is the indicator function of the set A. From (A.1),

$$\mathbb{P}(A \cap B) = \int_B \mathbb{P}(A \mid \mathcal{G}) \, d\mathbb{P}, \quad \text{for all } B \in \mathcal{G}. \tag{A.2}$$

This definition of conditional probability has the shortcoming that the conditional probability P(A | G) is only defined outside of a null set which depends upon the set A. As there may be an uncountable number of possible choices for A, P(· | G) may not be a probability measure.

Under certain conditions regular conditional probabilities as in Definition 2.28 exist. Regular conditional distributions (following the nomenclature in Breiman [23] whose proof we follow) exist under much less restrictive conditions.

Definition A.2. Let (Ω, F, P) be a probability space, $(E, \mathcal{E})$ be a measurable space, X : Ω → E be an $\mathcal{F}/\mathcal{E}$-measurable random element and G a sub-σ-algebra of F. A function Q(ω, B) defined for all ω ∈ Ω and $B \in \mathcal{E}$ is called a regular conditional distribution of X with respect to G if

(a) For each $B \in \mathcal{E}$, the map Q(·, B) is G-measurable.

(b) For each ω ∈ Ω, Q(ω, ·) is a probability measure on $(E, \mathcal{E})$.

(c) For any $B \in \mathcal{E}$,

$$Q(\cdot, B) = \mathbb{P}(X \in B \mid \mathcal{G}) \quad \mathbb{P}\text{-a.s.} \tag{A.3}$$

Theorem A.3. If the space $(E, \mathcal{E})$ in which X takes values is a Borel space, that is, if there exists a function ϕ : E → R such that ϕ is $\mathcal{E}$-measurable and $\varphi^{-1}$ is B(R)-measurable, then the regular conditional distribution of the variable X conditional upon G in the sense of Definition A.2 exists.


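Proof. Consider the case when $(E, \mathcal{E}) = (\mathbb{R}, \mathcal{B}(\mathbb{R}))$. First we construct a regular version of the distribution function P(X < x | G). Define a countable family of random variables by selecting, for each q ∈ Q, a version $Q_q(\omega) = \mathbb{P}(X < q \mid \mathcal{G})(\omega)$. Define for r, q ∈ Q,

$$M_{r,q} \triangleq \{\omega : Q_r < Q_q\}$$

and then define the set on which monotonicity of the distribution function fails

$$M \triangleq \bigcup_{\substack{r > q \\ r, q \in \mathbb{Q}}} M_{r,q}.$$

It is clear from property (b) of the conditional expectation that P(M) = 0. Similarly define for q ∈ Q,

$$N_q \triangleq \Big\{\omega : \lim_{r \uparrow q} Q_r \neq Q_q\Big\} \quad \text{and} \quad N \triangleq \bigcup_{q \in \mathbb{Q}} N_q;$$

by property (c) of conditional expectation it follows that $\mathbb{P}(N_q) = 0$, so P(N) = 0. Finally define

$$L_\infty \triangleq \Big\{\omega : \lim_{\substack{q \to \infty \\ q \in \mathbb{Q}}} Q_q \neq 1\Big\} \quad \text{and} \quad L_{-\infty} \triangleq \Big\{\omega : \lim_{\substack{q \to -\infty \\ q \in \mathbb{Q}}} Q_q \neq 0\Big\},$$

and again $\mathbb{P}(L_\infty) = \mathbb{P}(L_{-\infty}) = 0$.

Define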

$$F(x \mid \mathcal{G}) \triangleq \begin{cases} \lim_{r \uparrow x,\, r \in \mathbb{Q}} Q_r & \text{if } \omega \notin M \cup N \cup L_\infty \cup L_{-\infty}, \\ \Phi(x) & \text{otherwise,} \end{cases}$$

where Φ(x) is the distribution function of the normal N(0, 1) distribution (its choice is arbitrary). It follows using property (c) of conditional expectation applied to the functions $f_{r_i} = 1_{(-\infty, r_i)}$, with $r_i \in \mathbb{Q}$ a sequence such that $r_i \uparrow x$, that F(x | G) satisfies all the properties of a distribution function and is a version of P(X < x | G).

This distribution function can be extended to define a measure Q(· | G). Let H be the class of B ∈ B(R) such that Q(B | G) is a version of P(X ∈ B | G). It is clear that H contains all finite disjoint unions of intervals of the form [a, b) for a, b ∈ R so by the monotone class theorem A.1 the result follows.

In the general case, Y = ϕ(X) is a real-valued random variable and so has a regular conditional distribution $Q_Y$ such that for B ∈ B(R), $Q_Y(B \mid \mathcal{G}) = \mathbb{P}(Y \in B \mid \mathcal{G})$; thus define

$$Q(B \mid \mathcal{G}) \triangleq Q_Y(\varphi(B) \mid \mathcal{G}),$$

and since $\varphi^{-1}$ is measurable it follows that Q has the required properties. □


Lemma A.4. If X is as in the statement of Theorem A.3 and ψ is an $\mathcal{E}$-measurable function such that E[|ψ(X)|] < ∞, then if Q(· | G) is a regular conditional distribution for X given G it follows that

$$\mathbb{E}[\psi(X) \mid \mathcal{G}] = \int_E \psi(x) \, Q(dx \mid \mathcal{G}).$$

Proof. If ψ is the indicator function of a set $A \in \mathcal{E}$ then it is clear that the result follows from (A.3). By linearity this extends to simple functions, by monotone convergence to non-negative functions, and in general write $\psi = \psi^+ - \psi^-$. □

A.3 Topological Results

Definition A.5. A metric space (E, d) is said to be separable if it has a countable dense set. That is, for any x ∈ E, given ε > 0 we can find y in this countable set such that d(x, y) < ε.

Lemma A.6. Let (X, ρ) be a separable metric space. Then X is homeomorphic to a subspace of $[0, 1]^{\mathbb{N}}$, the space of sequences of real numbers in [0, 1] with the topology of co-ordinatewise convergence.

Proof. Define a bounded version of the metric, $\hat{\rho} \triangleq \rho/(1 + \rho)$; it is easily checked that this is a metric on X, and the space $(X, \hat{\rho})$ is also separable. Clearly the metric satisfies the bounds $0 \le \hat{\rho} \le 1$. As a consequence of separability we can choose a countable set $x_1, x_2, \ldots$ which is dense in $(X, \hat{\rho})$.

Define $J = [0, 1]^{\mathbb{N}}$ and endow this space with the metric d which generates the topology of co-ordinatewise convergence. Define α : X → J,

$$\alpha : x \mapsto (\hat{\rho}(x, x_1), \hat{\rho}(x, x_2), \ldots).$$

Suppose $x^{(n)} \to x$ in X; then by continuity of $\hat{\rho}$ it is immediate that $\hat{\rho}(x^{(n)}, x_k) \to \hat{\rho}(x, x_k)$ for each k ∈ N and thus $\alpha(x^{(n)}) \to \alpha(x)$.

Conversely if $\alpha(x^{(n)}) \to \alpha(x)$ then this implies that $\hat{\rho}(x^{(n)}, x_k) \to \hat{\rho}(x, x_k)$ for each k. Then by the triangle inequality

$$\hat{\rho}(x^{(n)}, x) \le \hat{\rho}(x^{(n)}, x_k) + \hat{\rho}(x_k, x)$$

and since $\hat{\rho}(x^{(n)}, x_k) \to \hat{\rho}(x, x_k)$ it is immediate that

$$\limsup_{n \to \infty} \hat{\rho}(x^{(n)}, x) \le 2 \hat{\rho}(x_k, x) \quad \forall k.$$

As this holds for all k ∈ N and the $x_k$s are dense in X we may pick a sequence $x_{m_k} \to x$ whence $\hat{\rho}(x^{(n)}, x) \to 0$ as n → ∞. Hence α is a homeomorphism from X onto its image α(X) ⊆ J. □
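The embedding used in the proof of Lemma A.6 can be tried out numerically on a finite truncation. The sketch below is illustrative only and makes several assumptions that are not in the text: X is the real line with ρ(x, y) = |x − y|, the dense set is a finite prefix of one enumeration of the rationals (`dense_prefix`, a hypothetical helper), and the topology of co-ordinatewise convergence is metrised by the weighted sum $d(a, b) = \sum_k 2^{-(k+1)} |a_k - b_k|$, which is one standard choice.

```python
from fractions import Fraction


def rho_hat(x, y):
    """Bounded metric rho / (1 + rho) built from rho(x, y) = |x - y|."""
    r = abs(x - y)
    return r / (1 + r)


def dense_prefix(level):
    """All rationals p/q with q <= level and |p/q| <= level, without repeats."""
    points = []
    for q in range(1, level + 1):
        for p in range(-level * q, level * q + 1):
            f = Fraction(p, q)
            if f not in points:
                points.append(f)
    return points


def alpha(x, points):
    """Truncated embedding x -> (rho_hat(x, x_1), rho_hat(x, x_2), ...)."""
    return [rho_hat(x, xk) for xk in points]


def product_metric(a, b):
    """Assumed metric for co-ordinatewise convergence, truncated to the available coordinates."""
    return sum(abs(u - v) / 2 ** (k + 1) for k, (u, v) in enumerate(zip(a, b)))


points = dense_prefix(3)
x = Fraction(1, 3)
sequence = [x + Fraction(1, n) for n in (1, 10, 100, 1000)]
gaps = [product_metric(alpha(xn, points), alpha(x, points)) for xn in sequence]
```

Each coordinate of α changes by at most $\hat{\rho}(x^{(n)}, x)$ by the triangle inequality, so the entries of `gaps` are bounded by $|x^{(n)} - x|$, which is the continuity half of the argument above.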

The following is a standard result and the proof is based on that in Rogers and Williams [248] who reference Bourbaki [22] Chapter IX, Section 6, No 1.

Theorem A.7. A complete separable metric space X is homeomorphic to a Borel subset of a compact metric space.

Proof. By Lemma A.6 there is a homeomorphism α of X onto α(X) ⊆ J. Let d denote the metric giving the topology of co-ordinatewise convergence on J. We must now consider α(X) and show that it is a countable intersection of open sets in J and hence belongs to the σ-algebra generated by the open sets, the Borel σ-algebra.

For ε > 0 and x ∈ X we can find δ(ε) such that for any y ∈ X, d(α(x), α(y)) < δ implies that ρ(x, y) < ε. For n ∈ N set ε = 1/(2n) and then consider the ball B(α(x), δ(ε) ∧ ε). It is immediate that the d-diameter of this ball is at most 1/n. But also, as a consequence of the choice of δ, the image under $\alpha^{-1}$ of the intersection of this ball with α(X) has ρ-diameter at most 1/n.

Let $\overline{\alpha(X)}$ be the closure of α(X) under the metric d in J. Define a set $U_n \subseteq \overline{\alpha(X)}$ to be the set of $x \in \overline{\alpha(X)}$ such that there exists an open ball $N_{x,n}$ about x of d-diameter less than 1/n, with ρ-diameter of the image under $\alpha^{-1}$ of the intersection of α(X) and this ball less than 1/n. By the argument of the previous paragraph we see that if x ∈ α(X) we can always find such a ball; hence $\alpha(X) \subseteq U_n$.

For $x \in \bigcap_n U_n$ choose $x_n \in \alpha(X) \cap \bigcap_{k \le n} N_{x,k}$. By construction $d(x, x_n) \le 1/n$, thus $x_n \to x$ as n → ∞ under the d metric on J. However, for r ≥ n both points $x_r$ and $x_n$ are in $N_{x,n}$, thus $\rho(\alpha^{-1}(x_r), \alpha^{-1}(x_n)) \le 1/n$, so $(\alpha^{-1}(x_r))_{r \ge 1}$ is a Cauchy sequence in (X, ρ). But this space is complete so there exists y ∈ X such that $\alpha^{-1}(x_n) \to y$. As α is a homeomorphism this implies that $d(x_n, \alpha(y)) \to 0$. Hence by uniqueness of limits x = α(y) and thus it is immediate that x ∈ α(X). Therefore $\bigcap_n U_n \subseteq \alpha(X)$; since $\alpha(X) \subseteq U_n$ it follows immediately that

$$\alpha(X) = \bigcap_n U_n. \tag{A.4}$$

It is now necessary to show that $U_n$ is relatively open in $\overline{\alpha(X)}$. From the definition of $U_n$, for any $x \in U_n$ we can find $N_{x,n}$ with diameter properties as above which is a subset of J containing x. For any $z \in N_{x,n} \cap \overline{\alpha(X)}$, taking $N_{z,n}$ to be an open ball about z contained in $N_{x,n}$, it is clear that $z \in U_n$. Therefore $N_{x,n} \cap \overline{\alpha(X)} \subseteq U_n$, from which we conclude that $U_n$ is relatively open in $\overline{\alpha(X)}$. Therefore we can write $U_n = \overline{\alpha(X)} \cap V_n$ where $V_n$ is open in J, so that

$$\alpha(X) = \bigcap_n U_n = \overline{\alpha(X)} \cap \Big(\bigcap_n V_n\Big), \tag{A.5}$$

where the $V_n$ are open subsets of J. It only remains to show that $\overline{\alpha(X)}$ can be expressed as a countable intersection of open sets; this is easily done since

$$\overline{\alpha(X)} = \bigcap_n \{x \in J : d(x, \alpha(X)) < 1/n\},$$

therefore it follows that $\overline{\alpha(X)}$ is a countable intersection of open sets in J. Together with (A.5) it follows that α(X) is a countable intersection of open sets. □

Theorem A.8. Any compact metric space X is separable.

Proof. Consider the open cover of X which is the uncountable union of all balls of radius 1/n centred on each point in X. As X is compact there exists a finite subcover. Let $x^n_1, \ldots, x^n_{N_n}$ be the centres of the balls in one such finite subcover. By a diagonal argument we can construct a countable set which is the union of all these centres for all n ∈ N. This set is clearly dense in X and countable, so X is separable. □
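The construction in the proof of Theorem A.8 can be made concrete for one particular compact space. The following sketch is an illustration under assumptions not made in the text: X is the unit square with the sup metric, and the finite subcover by balls of radius 1/n is taken to be the one centred on the grid of spacing 1/n (`centres`, a hypothetical helper). The check is that sampled points lie within 1/n of some centre at each level n, so the union of the centres over all n is dense.

```python
import itertools
import random


def centres(n):
    """Centres of one finite cover of [0, 1]^2 by open balls of radius 1/n (sup metric)."""
    grid = [i / n for i in range(n + 1)]
    return list(itertools.product(grid, grid))


def dist(p, q):
    """Sup metric on the plane."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))


def distance_to_level(p, n):
    """Distance from p to the nearest centre of the level-n cover."""
    return min(dist(p, c) for c in centres(n))


random.seed(0)
samples = [(random.random(), random.random()) for _ in range(200)]
worst = {n: max(distance_to_level(p, n) for p in samples) for n in (1, 2, 4, 8, 16)}
```

For a general compact space no explicit grid is available; compactness is what guarantees that finitely many centres suffice at each level.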

Theorem A.9. If E is a compact metric space then the set of continuous real-valued functions defined on E is separable.

Proof. By Theorem A.8, the space E is separable. Let $x_1, x_2, \ldots$ be a countable dense subset of E. Define $h_0(x) = 1$, and $h_n(x) = d(x, x_n)$, for n ≥ 1. Now define an algebra of polynomials in these $h_n$s with coefficients in the rationals

$$A = \Big\{ x \mapsto \sum q^{n_0, \ldots, n_r}_{k_0, \ldots, k_r} \, h^{n_0}_{k_0}(x) \cdots h^{n_r}_{k_r}(x) \; : \; q^{n_0, \ldots, n_r}_{k_0, \ldots, k_r} \in \mathbb{Q} \Big\},$$

where each sum is finite. The closure of A is an algebra containing constant functions and it is clear that it separates points in E, therefore by the Stone–Weierstrass theorem, it follows that A is dense in C(E). As A is countable, C(E) is separable. □

Corollary A.10. If E is a compact metric space then there exists a countable set $f_1, f_2, \ldots$ which is dense in C(E).

Proof. By Theorem A.8 E is separable, so by Theorem A.9 the space C(E) is separable and hence has a dense countable subset. □

A.4 Tulcea’s Theorem

Tulcea’s theorem (see Tulcea [265]) is frequently stated in the form for product spaces and their σ-algebras (for a very elegant proof in this vein see Ethier and Kurtz [95, Appendix 9]) and this form is sufficient to establish the existence of stochastic processes. We give the theorem in a more general form where the measures are defined on the same space X, but defined on an increasing family of σ-algebras $\mathcal{B}_n$ as this makes the important condition on the atoms of the σ-algebras clear. The approach taken here is based on that in Stroock and Varadhan [261].

Define the atom A(x) of the Borel σ-algebra $\mathcal{B}$ on the space X, for x ∈ X by

$$A(x) \triangleq \bigcap \{B : B \in \mathcal{B}, \; x \in B\}, \tag{A.6}$$

that is, A(x) is the smallest element of $\mathcal{B}$ which contains x.

Theorem A.11. Let $(X, \mathcal{B})$ be a measurable space and let $\mathcal{B}_n$ be an increasing family of sub-σ-algebras of $\mathcal{B}$ such that $\mathcal{B} = \sigma(\bigcup_{n=0}^\infty \mathcal{B}_n)$. Suppose that these σ-algebras satisfy the following constraint. If $A_n$ is a sequence of atoms such that $A_n \in \mathcal{B}_n$ and $A_0 \supseteq A_1 \supseteq \cdots$ then $\bigcap_{n=0}^\infty A_n \neq \emptyset$.

Let $\mathbb{P}_0$ be a probability measure defined on $\mathcal{B}_0$ and let $\pi_n$ be a family of probability kernels, where $\pi_n(x, \cdot)$ is a measure on $(X, \mathcal{B}_n)$ and the mapping $x \mapsto \pi_n(x, \cdot)$ is $\mathcal{B}_{n-1}$-measurable. Such a probability kernel allows us to define inductively a family of probability measures on $(X, \mathcal{B}_n)$ via

$$\mathbb{P}_n(A) \triangleq \int_X \pi_n(x, A) \, \mathbb{P}_{n-1}(dx), \tag{A.7}$$

with the starting point for the induction being given by the probability measure $\mathbb{P}_0$.

Suppose that the kernels $\pi_n(x, \cdot)$ satisfy the compatibility condition that for $x \notin N_n$, where $N_n$ is a $\mathbb{P}_n$-null set, the kernel $\pi_{n+1}(x, \cdot)$ is supported on $A_n(x)$ (i.e. if $B \in \mathcal{B}_{n+1}$ and $B \cap A_n(x) = \emptyset$ then $\pi_{n+1}(x, B) = 0$). That is, starting from a point x, the transition measure only contains with positive probability transitions to points y such that x and y belong to the same atom of $\mathcal{B}_n$.

Then there exists a unique probability measure $\mathbb{P}$ defined on $\mathcal{B}$ such that $\mathbb{P}|_{\mathcal{B}_n} = \mathbb{P}_n$ for all n ∈ N.
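A finite example may help fix the roles of the atoms and the kernels in (A.7). The sketch below is illustrative and built entirely on assumptions that are not in the text: X is the set of binary triples, $\mathcal{B}_n$ is generated by the first n + 1 coordinates (so its atoms are identified with prefixes of length n + 1), and the hypothetical rule `step` repeats the previous coordinate with probability 2/3. The kernel vanishes on atoms that do not extend the atom of the starting point, which is the support condition of the theorem, and the check is that $\mathbb{P}_{n}$ restricted to $\mathcal{B}_{n-1}$ agrees with $\mathbb{P}_{n-1}$.

```python
from fractions import Fraction
from itertools import product


def step(prefix, nxt):
    """Hypothetical transition rule: repeat the previous coordinate with probability 2/3."""
    return Fraction(2, 3) if nxt == prefix[-1] else Fraction(1, 3)


def kernel(n, prefix, atom):
    """pi_n(x, A) for x in the B_{n-1}-atom `prefix` (length n) and the B_n-atom `atom` (length n + 1).

    The kernel is supported on the atom of x: it is zero unless `atom` extends `prefix`.
    """
    if atom[:n] != prefix:
        return Fraction(0)
    return step(prefix, atom[n])


def next_measure(n, p_prev):
    """Equation (A.7) on atoms: P_n(A) = sum over B_{n-1}-atoms of pi_n(x, A) P_{n-1}(atom of x)."""
    return {
        atom: sum(kernel(n, pre, atom) * mass for pre, mass in p_prev.items())
        for atom in product((0, 1), repeat=n + 1)
    }


def restrict(p_n):
    """Restriction of P_n to B_{n-1}: add up the masses of atoms sharing a prefix."""
    out = {}
    for atom, mass in p_n.items():
        out[atom[:-1]] = out.get(atom[:-1], Fraction(0)) + mass
    return out


P = [{(0,): Fraction(1, 2), (1,): Fraction(1, 2)}]   # P_0 on B_0
for n in (1, 2):
    P.append(next_measure(n, P[-1]))
```

In this finite setting the consistent family already determines the measure on the whole space; the content of the theorem is that the same conclusion holds for infinitely many σ-algebras under the condition on decreasing sequences of atoms.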

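Proof. It is elementary to see that $\mathbb{P}_n$ as defined in (A.7) is a probability measure on $\mathcal{B}_n$ and that $\mathbb{P}_{n+1}$ agrees with $\mathbb{P}_n$ on $\mathcal{B}_n$. We can then define a set function $\mathbb{P}$ on $\bigcup_n \mathcal{B}_n$ by setting $\mathbb{P}(B_n) = \mathbb{P}_n(B_n)$ for $B_n \in \mathcal{B}_n$.

From the definition (A.7), for $B \in \mathcal{B}_n$ we have defined $\mathbb{P}_n$ inductively via the transition functions

$$\mathbb{P}_n(B) = \int_X \cdots \int_X \pi_n(q_{n-1}, B) \, \pi_{n-1}(q_{n-2}, dq_{n-1}) \cdots \pi_1(q_0, dq_1) \, \mathbb{P}_0(dq_0). \tag{A.8}$$

To simplify the notation define $\pi_{m,n}$ such that $\pi_{m,n}(x, \cdot)$ is a measure on $(X, \mathcal{B}_n)$ as follows.

If m ≥ n ≥ 0 and $B \in \mathcal{B}_n$, then define $\pi_{m,n}(x, B) = 1_B(x)$ which is clearly $\mathcal{B}_n$-measurable and hence, as $\mathcal{B}_m \supseteq \mathcal{B}_n$, $x \mapsto \pi_{m,n}(x, B)$ is also $\mathcal{B}_m$-measurable. If m < n define $\pi_{m,n}$ inductively using the transition kernel $\pi_n$,

$$\pi_{m,n}(x, B) \triangleq \int_X \pi_n(y_{n-1}, B) \, \pi_{m,n-1}(x, dy_{n-1}). \tag{A.9}$$

It is clear that in both cases $x \mapsto \pi_{m,n}(x, \cdot)$ is $\mathcal{B}_m$-measurable. Thus $\pi_{m,n}$ can be viewed as a transition kernel from $(X, \mathcal{B}_m)$ to $(X, \mathcal{B}_n)$. From these definitions, for m < n

$$\begin{aligned}
\pi_{m,n}(x, B) &= \int_X \cdots \int_X \pi_n(y_{n-1}, B) \cdots \pi_{m+1}(y_m, dy_{m+1}) \, \pi_{m,m}(x, dy_m) \\
&= \int_X \cdots \int_X \pi_n(y_{n-1}, B) \cdots \pi_{m+2}(y_{m+1}, dy_{m+2}) \, \pi_{m+1}(x, dy_{m+1}).
\end{aligned}$$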


It therefore follows from the above with m = 0 and (A.8) that for $B \in \mathcal{B}_n$,

$$\mathbb{P}(B) = \mathbb{P}_n(B) = \int_X \pi_{0,n}(y_0, B) \, \mathbb{P}_0(dy_0). \tag{A.10}$$

We must show that $\mathbb{P}$ is a probability measure on $\bigcup_{n=0}^\infty \mathcal{B}_n$, as then the Caratheodory extension theorem† establishes the existence of an extension to a probability measure on $(X, \sigma(\bigcup_{n=0}^\infty \mathcal{B}_n))$. The only non-trivial condition which must be verified for $\mathbb{P}$ to be a measure is countable additivity.

A necessary and sufficient condition for countable additivity of $\mathbb{P}$ is that if $B_n \in \bigcup_n \mathcal{B}_n$ are such that $B_1 \supseteq B_2 \supseteq \cdots$ and $\bigcap_n B_n = \emptyset$ then $\mathbb{P}(B_n) \to 0$ as n → ∞ (the proof can be found in many books on measure theory, see for example page 200 of Williams [272]). It is clear that the non-trivial cases are covered by considering $B_n \in \mathcal{B}_n$ for each n ∈ N.

We argue by contradiction; suppose that $\mathbb{P}(B_n) \ge \varepsilon > 0$ for all n ∈ N. We must exhibit a point of $\bigcap_n B_n$; as we started with the assumption that this intersection was empty, this is the desired contradiction.

Define

$$F^0_n \triangleq \{x \in X : \pi_{0,n}(x, B_n) \ge \varepsilon/2\}. \tag{A.11}$$

Since $x \mapsto \pi_{0,n}(x, \cdot)$ is $\mathcal{B}_0$-measurable, it follows that $F^0_n \in \mathcal{B}_0$. Then from (A.10) it is clear that

$$\mathbb{P}(B_n) \le \mathbb{P}_0(F^0_n) + \varepsilon/2.$$

As by assumption $\mathbb{P}(B_n) \ge \varepsilon$ for all n ∈ N, we conclude that $\mathbb{P}_0(F^0_n) \ge \varepsilon/2$ for all n ∈ N.

Suppose that $x \in F^0_{n+1}$; then $\pi_{0,n+1}(x, B_{n+1}) \ge \varepsilon/2$. But $B_{n+1} \subseteq B_n$, so $\pi_{0,n+1}(x, B_n) \ge \varepsilon/2$. From (A.9) it follows that

$$\pi_{0,n+1}(x, B_n) = \int_X \pi_{n+1}(y_n, B_n) \, \pi_{0,n}(x, dy_n);$$

for $y \notin N_n$, the probability measure $\pi_{n+1}(y, \cdot)$ is supported on $A_n(y)$. As $B_n \in \mathcal{B}_n$, from the definition of an atom, it follows that $y \in B_n$ if and only if $A_n(y) \subseteq B_n$, thus $\pi_{n+1}(y, B_n) = 1_{B_n}(y)$ for $y \notin N_n$. So on integration we obtain that $\pi_{0,n}(x, B_n) = \pi_{0,n+1}(x, B_n) \ge \varepsilon/2$. Thus $x \in F^0_n$. So we have shown that $F^0_{n+1} \subseteq F^0_n$.

Since $\mathbb{P}_0(F^0_n) \ge \varepsilon/2$ for all n and the $F^0_n$ form a non-increasing sequence, it is then immediate that $\mathbb{P}_0(\bigcap_{n=0}^\infty F^0_n) \ge \varepsilon/2$, whence we can find $x_0 \notin N_0$ such that $\pi_{0,n}(x_0, B_n) \ge \varepsilon/2$ for all n ∈ N.

† Caratheodory extension theorem: Let S be a set, $\mathcal{S}_0$ be an algebra of subsets of S and $\mathcal{S} = \sigma(\mathcal{S}_0)$. Let $\mu_0$ be a countably additive map $\mu_0 : \mathcal{S}_0 \to [0, \infty]$; then there exists a measure μ on $(S, \mathcal{S})$ such that $\mu = \mu_0$ on $\mathcal{S}_0$. Furthermore if $\mu_0(S) < \infty$, then this extension is unique. For a proof of the theorem see, for example, Williams [272] or Rogers and Williams [248].

Now we proceed inductively; suppose that we have found $x_0, x_1, \ldots, x_{m-1}$ such that $x_0 \notin N_0$ and $x_i \in A_{i-1}(x_{i-1}) \setminus N_i$ for i = 1, . . . , m − 1, and $\pi_{i,n}(x_i, B_n) \ge \varepsilon/2^{i+1}$ for all n ∈ N for each i = 0, . . . , m − 1. We have already established the result for the case m = 0. Now define

$$F^m_n \triangleq \{x \in X : \pi_{m,n}(x, B_n) \ge \varepsilon/2^{m+1}\};$$

from the integral representation for $\pi_{m,n}$,

$$\pi_{m-1,n}(x, B_n) = \int_X \pi_{m,n}(y_m, B_n) \, \pi_m(x, dy_m),$$

it follows by an argument analogous to that for $F^0_n$, that

$$\varepsilon/2^m \le \pi_{m-1,n}(x_{m-1}, B_n) \le \varepsilon/2^{m+1} + \pi_m(x_{m-1}, F^m_n),$$

where the inequality on the left hand side follows from the inductive hypothesis. As in the case for m = 0, we can deduce that $F^m_{n+1} \subseteq F^m_n$. Thus

$$\pi_m\Big(x_{m-1}, \bigcap_{n=0}^\infty F^m_n\Big) \ge \varepsilon/2^{m+1}, \tag{A.12}$$

which implies that we can choose $x_m \in \bigcap_{n=0}^\infty F^m_n$, such that $\pi_{m,n}(x_m, B_n) \ge \varepsilon/2^{m+1}$ for all n ∈ N, and from (A.12) as the set of suitable $x_m$ has strictly positive probability, it cannot be empty, and we can choose an $x_m$ not in the $\mathbb{P}_m$-null set $N_m$. Therefore, this choice can be made such that $x_m \in A_{m-1}(x_{m-1}) \setminus N_m$. This establishes the required inductive step.

Now consider the case of $\pi_{n,n}(x_n, B_n)$; we see from the definition that this is just $1_{B_n}(x_n)$, but by choice of the $x_n$s, $\pi_{n,n}(x_n, B_n) > 0$. Consequently as $x_n \notin N_n$, by the support property of the transition kernels, it follows that $A_n(x_n) \subseteq B_n$ for each n. Thus $\bigcap A_n(x_n) \subset \bigcap B_n$ and if we define $K_n \triangleq \bigcap_{i=0}^n A_i(x_i)$ it follows that $x_n \in K_n$ and $K_n$ is a descending sequence; by the σ-algebra property it is clear that $K_n \in \mathcal{B}_n$, and since $A_n(x_n)$ is an atom in $\mathcal{B}_n$ it follows that $K_n = A_n(x_n)$. We thus have a decreasing sequence of atoms; by the initial assumption, such an intersection is non-empty, that is, $\bigcap A_n(x_n) \neq \emptyset$ which implies that $\bigcap B_n \neq \emptyset$, but this is a contradiction, since we assumed that this intersection was empty. Therefore $\mathbb{P}$ is countably additive and the existence of an extension follows from the theorem of Caratheodory. □

A.4.1 The Daniell–Kolmogorov–Tulcea Theorem

The Daniell–Kolmogorov–Tulcea theorem gives conditions under which the law of a stochastic process can be extended from its finite-dimensional distributions to its full (infinite-dimensional) law.

The original form of this result due to Daniell and Kolmogorov (see Doob [81] or Rogers and Williams [248, section II.30]) requires topological conditions on the space X; the space X needs to be Borel, that is, homeomorphic to a Borel set in some space, which is the case if X is a complete separable metric space as a consequence of Theorem A.7.

It is possible to take an alternative probabilistic approach using Tulcea’s theorem. In this approach the finite-dimensional distributions are related to each other through the use of regular conditional probabilities as transition kernels; while this does not explicitly use topological conditions, such conditions may be required to establish the existence of these regular conditional probabilities (as was seen in Exercise 2.29 regular conditional probabilities are guaranteed to exist if X is a complete separable metric space).

We use the notation $X^I$ for the I-fold product space generated by X, that is, $X^I = \prod_{i \in I} X_i$ where the $X_i$s are identical copies of X, and let $\mathcal{B}^I$ denote the product σ-algebra on $X^I$; that is, $\mathcal{B}^I = \prod_{i \in I} \mathcal{B}_i$ where the $\mathcal{B}_i$ are copies of $\mathcal{B}$. If U and V are finite subsets of the index set I, let $\pi^V_U$ denote the restriction map from $X^V$ to $X^U$.

Theorem A.12. Let X be a complete separable metric space. Let $\mu_U$ be a family of probability measures on $(X^U, \mathcal{B}^U)$, for U any finite subset of I. Suppose that these measures satisfy the compatibility condition for U ⊂ V

$$\mu_U = \mu_V \circ (\pi^V_U)^{-1}.$$

Then there exists a unique probability measure μ on $(X^I, \mathcal{B}^I)$ such that $\mu_U = \mu \circ (\pi^I_U)^{-1}$ for any U a finite subset of I.
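The compatibility condition says that restricting the coordinates of a larger finite index set to a smaller one pushes the larger measure forward onto the smaller one. The sketch below checks this for one family on a two-point space; it is illustrative only, and the choice of family (independent Bernoulli(1/3) coordinates), the index sets and the helper names `mu` and `pushforward` are assumptions made here rather than anything in the text.

```python
from fractions import Fraction
from itertools import product

p = Fraction(1, 3)


def mu(U):
    """Hypothetical family: the coordinates indexed by U are i.i.d. Bernoulli(p)."""
    U = tuple(sorted(U))
    out = {}
    for point in product((0, 1), repeat=len(U)):
        mass = Fraction(1)
        for b in point:
            mass *= p if b else 1 - p
        out[point] = mass
    return out


def pushforward(mu_V, V, U):
    """Image of mu_V under the restriction map from X^V to X^U, for U a subset of V."""
    V = tuple(sorted(V))
    U = tuple(sorted(U))
    idx = [V.index(u) for u in U]
    out = {}
    for point, mass in mu_V.items():
        key = tuple(point[i] for i in idx)
        out[key] = out.get(key, Fraction(0)) + mass
    return out


V = {0, 2, 5}
U = {0, 5}
compatible = pushforward(mu(V), V, U) == mu(U)
```

A family that failed this check for some pair U ⊂ V could not arise as the finite-dimensional distributions of any single measure on the product space.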

Proof. Let Fin(I) denote the set of all finite subsets of I. It is immediate from the compatibility condition that we can find a finitely additive $\mu_0$ which is a probability measure on $(X^I, \bigcup_{F \in \mathrm{Fin}(I)} (\pi^I_F)^{-1}(\mathcal{B}^F))$, such that for U ∈ Fin(I), $\mu_U = \mu_0 \circ (\pi^I_U)^{-1}$. If we can show that $\mu_0$ is countably additive, then the Caratheodory extension theorem implies that $\mu_0$ can be extended to a measure μ on $(X^I, \mathcal{B}^I)$.

We cannot directly use Tulcea’s theorem to construct the extension measure; however we can use it to show that $\mu_0$ is countably additive. Suppose $A_n$ is a non-increasing family of sets $A_n \in \bigcup_{F \in \mathrm{Fin}(I)} (\pi^I_F)^{-1}(\mathcal{B}^F)$ such that $A_n \downarrow \emptyset$; we must show that $\mu_0(A_n) \to 0$.

Given the $A_i$s, we can find finite subsets $F_i$ of I such that $A_i \in (\pi^I_{F_i})^{-1}(\mathcal{B}^{F_i})$ for each i. Without loss of generality we can choose this sequence so that $F_0 \subset F_1 \subset F_2 \subset \cdots$. Define $\mathcal{F}_n \triangleq (\pi^I_{F_n})^{-1}(\mathcal{B}^{F_n}) \subset \mathcal{B}^I$. As a consequence of the product space structure, these σ-algebras satisfy the condition that the intersection of a decreasing family of atoms $Z_n \in \mathcal{F}_n$ is non-empty.

For $q \in X^I$ and $B \in \mathcal{F}_n$, let

$$\pi_n(q, B) \triangleq \Big(\mu_{F_n} \,\Big|\, (\pi^{F_n}_{F_{n-1}})^{-1}(\mathcal{B}^{F_{n-1}})\Big)\Big(\pi^I_{F_n}(q), \, (\pi^I_{F_n})^{-1}(B)\Big),$$

where $(\mu_{F_n} \mid \mathcal{G})(\omega, \cdot)$ for $\mathcal{G} \subset \mathcal{B}^{F_n}$ is the regular conditional probability distribution of $\mu_{F_n}$ given $\mathcal{G}$. We refer to the properties of regular conditional probability distribution using the nomenclature of Definition 2.28. This $\pi_n$ is a probability kernel from $(X^I, \mathcal{F}_{n-1})$ to $(X^I, \mathcal{F}_n)$, i.e. $\pi_n(q, \cdot)$ is a measure on $(X^I, \mathcal{F}_n)$ and the map $q \mapsto \pi_n(q, \cdot)$ is $\mathcal{F}_{n-1}$-measurable (which follows from property (b) of regular conditional distribution). In order to apply Tulcea’s theorem we must verify that the compatibility condition is satisfied, i.e. $\pi_n(q, \cdot)$ is supported for a.e. q on the atom in $\mathcal{F}_{n-1}$ containing q, which is denoted $A_{n-1}(q)$. This is readily established by computing $\pi_n(q, (A_{n-1}(q))^c)$ and using property (c) of regular conditional distribution and the fact that $q \notin (A_{n-1}(q))^c$. Thus we can apply Tulcea’s theorem to find a unique probability measure μ on $(X^I, \sigma(\bigcup_{n=0}^\infty \mathcal{F}_n))$ such that μ is equal to $\mu_0$ on $\bigcup_{n=0}^\infty \mathcal{F}_n$. Hence as $A_n \in \mathcal{F}_n$, it follows that $\mu(A_n) = \mu_0(A_n)$ for each n and therefore since μ is countably additive $\mu_0(A_n) \downarrow 0$, which establishes the required countable additivity of $\mu_0$. □

A.5 Cadlag Paths

A cadlag (continu à droite, limite à gauche) path is one which is right continuous with left limits; that is, $x_t$ has cadlag paths if for all t ∈ [0,∞), the limit $x_{t-}$ exists and $x_t = x_{t+}$. Such paths are sometimes described as RCLL (right continuous with left limits). The space of cadlag functions from [0,∞) to E is conventionally denoted $D_E[0,\infty)$.

Useful references for this material are Billingsley [19, Chapter 3], Ethier and Kurtz [95, Sections 3.5–3.9], and Whitt [269, Chapter 12].

A.5.1 Discontinuities of Cadlag Paths

Clearly cadlag paths can only have left discontinuities, i.e. points t where $x_t \neq x_{t-}$.

Lemma A.13. For any ε > 0, a cadlag path taking values in a metric space (E, d) has at most a finite number of discontinuities of size in the metric d greater than ε; that is, the set

$$D = \{t \in [0, T] : d(x_t, x_{t-}) > \varepsilon\}$$

contains at most a finite number of points.
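The statement of Lemma A.13 can be seen on a path with many jumps accumulating at a single time. The sketch below is illustrative and uses a hypothetical path that is not in the text: a real-valued step function on [0, 1] with a jump of size $2^{-k}$ at time 1 − 1/k for k = 2, . . . , 200. Although the jump times accumulate at 1, only a handful of jumps exceed any given ε.

```python
from fractions import Fraction

# Hypothetical cadlag step path: jump of size 2^{-k} at time 1 - 1/k, for k = 2, ..., 200.
jumps = [(1 - Fraction(1, k), Fraction(1, 2 ** k)) for k in range(2, 201)]


def path(t):
    """Right-continuous value x_t: a jump is included at its own time."""
    return sum(size for time, size in jumps if time <= t)


def left_limit(t):
    """Left limit x_{t-}: only jumps strictly before t are included."""
    return sum(size for time, size in jumps if time < t)


def big_jump_times(eps):
    """Times t at which |x_t - x_{t-}| > eps."""
    return [time for time, _ in jumps if abs(path(time) - left_limit(time)) > eps]


times_100 = big_jump_times(Fraction(1, 100))
times_1000 = big_jump_times(Fraction(1, 1000))
```

The jump sizes here are summable, which is what keeps the left limit at time 1 finite; jump sizes such as 1/k at the same times would not define a cadlag path on [0, 1].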

Proof. Let τ be the supremum of t ∈ [0, T] such that [0, t) can be finitely subdivided $0 = t_0 < t_1 < \cdots < t_k = t$ with the subdivision having the property that for i = 0, . . . , k − 1, $\sup_{s, r \in [t_i, t_{i+1})} d(x_s, x_r) < \varepsilon$. As right limits exist at 0 it is clear that τ > 0 and since a left limit exists at τ it is clear that the interval [0, τ) can be thus subdivided. Right continuity implies that there exists δ > 0 such that for 0 ≤ t′ − τ < δ, $d(x_{t'}, x_\tau) < \varepsilon$; consequently the result holds for [0, t′), which contradicts the fact that τ is the supremum unless τ = T; consequently τ = T. Therefore [0, T) can be so subdivided: jumps of size greater than ε can only occur at the $t_i$s, of which there are a finite number, and thus there must be at most a finite number of such jumps. □

Lemma A.14. Let X be a cadlag stochastic process taking values in a metric space (E, d); then

$$\{t \in [0, \infty) : \mathbb{P}(X_{t-} \neq X_t) > 0\}$$

contains at most countably many points.

Proof. For ε > 0, define

$$J_t(\varepsilon) \triangleq \{\omega : d(X_t(\omega), X_{t-}(\omega)) > \varepsilon\}.$$

Fix ε, then for any T > 0, δ > 0 we show that there are at most a finite number of points t ∈ [0, T] such that $\mathbb{P}(J_t(\varepsilon)) > \delta$. Suppose this is false, and an infinite sequence $t_i$ of distinct times $t_i \in [0, T]$ exists. Then by Fatou’s lemma

$$\mathbb{P}\Big(\liminf_{i \to \infty} (J_{t_i}(\varepsilon))^c\Big) \le \liminf_{i \to \infty} \mathbb{P}\big((J_{t_i}(\varepsilon))^c\big),$$

thus

$$\mathbb{P}\Big(\limsup_{i \to \infty} J_{t_i}(\varepsilon)\Big) \ge \limsup_{i \to \infty} \mathbb{P}(J_{t_i}(\varepsilon)) \ge \delta > 0,$$

so the event that $J_t(\varepsilon)$ occurs for an infinite number of the $t_i$s has strictly positive probability and is hence non-empty. This implies that there is a cadlag path with an infinite number of jumps in [0, T] of size greater than ε, which contradicts the conclusion of Lemma A.13. Taking the union over a countable sequence $\delta_n \downarrow 0$, it then follows that $\mathbb{P}(J_t(\varepsilon)) > 0$ for at most a countable set of t ∈ [0, T].

Clearly $\mathbb{P}(J_t(\varepsilon)) \to \mathbb{P}(X_t \neq X_{t-})$ as ε → 0, thus the set $\{t \in [0, T] : \mathbb{P}(X_t \neq X_{t-}) > 0\}$ contains at most a countable number of points. By taking the countable union over T ∈ N, it follows that $\{t \in [0, \infty) : \mathbb{P}(X_t \neq X_{t-}) > 0\}$ is at most countable. □

A.5.2 Skorohod Topology

Consider the sequence of functions $x_n(t) = 1_{\{t \ge 1/n\}}$, and the function $x(t) = 1_{\{t > 0\}}$ which are all elements of $D_E[0, \infty)$. In the uniform topology which we used on $C_E[0, \infty)$, as n → ∞ the sequence $x_n$ does not converge to x; yet considered as cadlag paths it appears natural that $x_n$ should converge to x since the location of the unit jump of $x_n$ converges to the location of the unit jump of x. A different topology is required. The Skorohod topology is the most frequently used topology on the space $D_E[0, \infty)$ which resolves this problem.

Let λ : [0,∞) → [0,∞), and define

$$\gamma(\lambda) \triangleq \operatorname*{ess\,sup}_{t \ge 0} |\log \lambda'(t)| = \sup_{s > t \ge 0} \left| \log \frac{\lambda(s) - \lambda(t)}{s - t} \right|.$$

Let Λ be the subspace of Lipschitz continuous increasing functions from [0,∞) → [0,∞) such that λ(0) = 0, $\lim_{t \to \infty} \lambda(t) = \infty$ and γ(λ) < ∞.

The Skorohod topology is most readily defined in terms of a metric which induces the topology. For $x, y \in D_E[0, \infty)$ define a metric $d_{D_E}(x, y)$ by

$$d_{D_E}(x, y) = \inf_{\lambda \in \Lambda} \left[ \gamma(\lambda) \vee \int_0^\infty e^{-u} d(x, y, \lambda, u) \, du \right],$$

where

$$d(x, y, \lambda, u) = \sup_{t \ge 0} d(x(t \wedge u), y(\lambda(t) \wedge u)).$$

It is of course necessary to verify that this satisfies the definition of a metric. This is straightforward, but details may be found in Ethier and Kurtz [95, Chapter 3, pages 117-118]. For the functions $x_n$ and x in the example, it is clear that $d_{D_{\mathbb{R}}}(x_n, x) \to 0$ as n → ∞. While there are other simpler topologies which have this property, the following proposition is the main reason why the Skorohod topology is the preferred choice of topology on $D_E$.
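The effect of the time change λ can be seen by bounding the metric for a pair of step functions. The sketch below is illustrative and departs from the example in the text in one respect, which is an assumption made here: the jumps are placed at 1 + 1/n and at 1 rather than near the origin, so that a time change with λ(0) = 0 can align them. The piecewise-linear `lam` is one hypothetical choice of λ ∈ Λ, so `skorohod_upper_bound` is only an upper bound on the infimum in the definition of the metric, using the closed form $\int_1^{1 + 1/n} e^{-u} \, du$ for the integral.

```python
import math


def x_n(t, n):
    """Step path with a unit jump at 1 + 1/n."""
    return 1.0 if t >= 1 + 1 / n else 0.0


def x(t):
    """Step path with a unit jump at 1."""
    return 1.0 if t >= 1 else 0.0


def lam(t, n):
    """Piecewise-linear time change with lam(0) = 0, lam(1 + 1/n) = 1 and slope 1 afterwards."""
    knee = 1 + 1 / n
    return t / knee if t <= knee else 1 + (t - knee)


def gamma(n):
    """sup |log slope| for lam: the two slopes are n / (n + 1) and 1."""
    return abs(math.log(n / (n + 1)))


def truncated_sup(n, u, grid):
    """Grid approximation of d(x_n, x, lam, u) = sup_t |x_n(t ^ u) - x(lam(t) ^ u)|."""
    return max(abs(x_n(min(t, u), n) - x(min(lam(t, n), u))) for t in grid)


def uniform_distance(n, grid):
    """Grid approximation of the uniform distance sup_t |x_n(t) - x(t)|."""
    return max(abs(x_n(t, n) - x(t)) for t in grid)


def skorohod_upper_bound(n):
    """gamma(lam) v integral, where the integrand equals 1 exactly for u in [1, 1 + 1/n)."""
    integral = math.exp(-1) * (1 - math.exp(-1 / n))
    return max(gamma(n), integral)


grid = [i / 100 for i in range(501)]
bounds = {n: skorohod_upper_bound(n) for n in (1, 10, 100)}
```

The uniform distance between the two paths stays equal to 1 for every n, whereas the bound on the Skorohod distance is smaller than 1/n.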

Proposition A.15. If the metric space (E, d) is complete and separable, then $(D_E[0, \infty), d_{D_E})$ is also complete and separable.

Proof. The following proof follows Ethier and Kurtz [95]. As E is separable, it has a countable dense set. Let $\{x_n\}_{n \ge 1}$ be such a set. Given n, $0 = t_0 < t_1 < \cdots < t_n$ where $t_j \in \mathbb{Q}_+$ and $i_j \in \mathbb{N}$ for j = 0, . . . , n define the piecewise constant function

$$x(t) = \begin{cases} x_{i_k} & t_k \le t < t_{k+1}, \\ x_{i_n} & t \ge t_n. \end{cases}$$

The set of all such functions forms a dense subset of $D_E[0, \infty)$, therefore the space is separable.

To show that the space is complete, suppose that $\{y_n\}_{n \ge 1}$ is a Cauchy sequence in $(D_E[0, \infty), d_{D_E})$, which implies that there exists an increasing sequence of numbers $N_k$ such that for $n, m \ge N_k$,

$$d_{D_E}(y_n, y_m) \le 2^{-k-1} e^{-k}.$$

Set $v_k = y_{N_k}$; then $d_{D_E}(v_k, v_{k+1}) \le 2^{-k-1} e^{-k}$. Thus there exists $\lambda_k$ such that

$$\int_0^\infty e^{-u} d(v_k, v_{k+1}, \lambda_k, u) \, du < 2^{-k} e^{-k}.$$

As d(x, y, λ, u) is monotonic increasing in u, it follows that for any v ≥ 0,

$$\int_0^\infty e^{-u} d(x, y, \lambda, u) \, du \ge d(x, y, \lambda, v) \int_v^\infty e^{-u} \, du = e^{-v} d(x, y, \lambda, v).$$

Therefore it is possible to find $\lambda_k \in \Lambda$ and $u_k > k$ such that

$$\max\big(\gamma(\lambda_k), \, d(v_k, v_{k+1}, \lambda_k, u_k)\big) \le 2^{-k}. \tag{A.13}$$

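Then form the limit of the composition of the functions $\lambda_i$

$$\mu_k \triangleq \lim_{n \to \infty} \lambda_{k+n} \circ \cdots \circ \lambda_{k+1} \circ \lambda_k.$$

It then follows that

$$\gamma(\mu_k) \le \sum_{i=k}^\infty \gamma(\lambda_i) \le \sum_{i=k}^\infty 2^{-i} = 2^{-k+1} < \infty;$$

thus $\mu_k \in \Lambda$. Using the bound (A.13) it follows that for k ∈ N,

$$\begin{aligned}
\sup_{t \ge 0} d\big(v_k(\mu_k^{-1}(t) \wedge u_k), \, v_{k+1}(\mu_{k+1}^{-1}(t) \wedge u_k)\big)
&= \sup_{t \ge 0} d\big(v_k(\mu_k^{-1}(t) \wedge u_k), \, v_{k+1}(\lambda_k \circ \mu_k^{-1}(t) \wedge u_k)\big) \\
&= \sup_{t \ge 0} d\big(v_k(t \wedge u_k), \, v_{k+1}(\lambda_k(t) \wedge u_k)\big) \\
&= d(v_k, v_{k+1}, \lambda_k, u_k) \\
&\le 2^{-k}.
\end{aligned}$$

Since (E, d) is complete, it now follows that $z_k = v_k \circ \mu_k^{-1}$ converges uniformly on compact sets of t to some limit, which we denote z. As each $z_k$ has cadlag paths, it follows that the limit also has cadlag paths and thus belongs to $D_E[0, \infty)$. It only remains to show that $v_k$ converges to z in the Skorohod topology. This follows since $\gamma(\mu_k^{-1}) \to 0$ as k → ∞ and for fixed T > 0,

$$\lim_{k \to \infty} \sup_{0 \le t \le T} d\big(v_k \circ \mu_k^{-1}(t), \, z(t)\big) = 0.$$

□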

A.6 Stopping Times

In this section, the notation $\mathcal{F}^o_t$ is used to emphasise that this filtration has not been augmented.

Definition A.16. A random variable T taking values in [0,∞) is said to be an $\mathcal{F}^o_t$-stopping time, if for all t ≥ 0, the event $\{T \le t\} \in \mathcal{F}^o_t$.


The subject of stopping times is too large to cover in any detail here. For more details see Rogers and Williams [248], or Dellacherie and Meyer [77, Section IV.3].

Lemma A.17. A random variable T taking values in [0,∞) is an $\mathcal{F}^o_{t+}$-stopping time if and only if $\{T < t\} \in \mathcal{F}^o_t$ for all t ≥ 0.

Proof. If $\{T < t\} \in \mathcal{F}^o_t$ for all t ≥ 0 then since

$$\{T \le t\} = \bigcap_{\varepsilon > 0} \{T < t + \varepsilon\},$$

it follows that $\{T \le t\} \in \mathcal{F}^o_{t+\varepsilon}$ for any t ≥ 0 and ε > 0, thus $\{T \le t\} \in \mathcal{F}^o_{t+}$. Thus T is an $\mathcal{F}^o_{t+}$-stopping time.

Conversely if T is an $\mathcal{F}^o_{t+}$-stopping time then since

$$\{T < t\} = \bigcup_{n=1}^\infty \{T \le t - 1/n\}$$

and each $\{T \le t - 1/n\} \in \mathcal{F}^o_{(t-1/n)+} \subseteq \mathcal{F}^o_t$, therefore $\{T < t\} \in \mathcal{F}^o_t$. □

Lemma A.18. Let $T_n$ be a sequence of $\mathcal{F}^o_t$-stopping times. Then $T = \inf_n T_n$ is an $\mathcal{F}^o_{t+}$-stopping time.

Proof. Write the event $\{\inf_n T_n < t\}$ as

$$\Big\{\inf_n T_n < t\Big\} = \bigcup_n \{T_n < t\}.$$

Each $T_n$ is in particular an $\mathcal{F}^o_{t+}$-stopping time, so by Lemma A.17 each term in this union belongs to $\mathcal{F}^o_t$; therefore so does the union, which again by Lemma A.17 implies that $\inf_n T_n$ is an $\mathcal{F}^o_{t+}$-stopping time. □

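Lemma A.19. Let X be a real-valued, continuous, adapted process and a ∈ R. Define $T_a \triangleq \inf\{t \ge 0 : X_t \ge a\}$. Then $T_a$ is an $\mathcal{F}_t$-stopping time.

Proof. The set $\{\omega : X_q(\omega) > b\}$ is $\mathcal{F}_q$-measurable for any $q \ge 0$ and b ∈ R as X is $\mathcal{F}_t$-adapted. Then using the path continuity of X,

$$\{T_a \le t\} = \Big\{\omega : \sup_{0 \le s \le t} X_s(\omega) \ge a\Big\} = \bigcap_{n=1}^\infty \; \bigcup_{q \in (\mathbb{Q}_+ \cap [0, t]) \cup \{t\}} \{\omega : X_q(\omega) > a - 1/n\}.$$

Thus $\{T_a \le t\}$ is obtained from countably many $\mathcal{F}_t$-measurable sets by countable unions and intersections and so is itself $\mathcal{F}_t$-measurable. Hence $T_a$ is an $\mathcal{F}_t$-stopping time. □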
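The key point of Lemma A.19 is that whether the level a has been reached by time t can be decided from the path up to time t alone, through the running supremum. The sketch below illustrates this on a time grid; it is a discretised stand-in rather than the continuous-time statement, and the sampled path t · sin(t), the grid spacing and the level a = 1 are hypothetical choices.

```python
import math


def first_passage(path, times, a):
    """Grid stand-in for T_a = inf{t >= 0 : X_t >= a}; math.inf if the level is never reached."""
    for t, value in zip(times, path):
        if value >= a:
            return t
    return math.inf


times = [i / 100 for i in range(301)]
path = [t * math.sin(t) for t in times]   # hypothetical continuous path sampled on the grid
a = 1.0
T_a = first_passage(path, times, a)


def event_by_running_max(t):
    """The event {T_a <= t} expressed as {max over s <= t of X_s >= a}, using the path up to t only."""
    return max(value for s, value in zip(times, path) if s <= t) >= a


checks = [(T_a <= t) == event_by_running_max(t) for t in (0.5, 1.0, 1.5, 2.0, 3.0)]
```

For the first time a process enters an open set the analogous event is {T < t} rather than {T ≤ t}, which is why right-continuous filtrations appear in Lemma A.17.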

Theorem A.20 (Debut Theorem). Let X be a process defined in some topological space S (with its associated Borel σ-algebra B(S)). Assume that X is progressively measurable relative to a filtration $\mathcal{F}^o_t$. Then for A ∈ B(S), the mapping $D_A = \inf\{t \ge 0 : X_t \in A\}$ defines an $\mathcal{F}_t$-stopping time, where $\mathcal{F}_t$ is the augmentation of $\mathcal{F}^o_t$.

For a proof see Theorem IV.50 of Dellacherie and Meyer [77]. See also Rogers and Williams [248, 249] for related results.

We require a technical result regarding the augmentation aspect of the usual conditions which is used during the innovations derivation of the filtering equations.

Lemma A.21. Let Gt be the filtration Fot ∨N where N contains all the P-null

sets. If T is a Gt-stopping time, then there exists a Fot+-stopping time T ′ such

that T = T ′ P-a.s. In addition if L ∈ GT then there exists M ∈ FoT+ such that

L =M P-a.s.

Proof. Consider a stopping time of the form $T = a 1_A + \infty 1_{A^c}$ where $a \in \mathbb{R}_+$ and $A \in \mathcal{G}_a$; in this case let B be an element of $\mathcal{F}^o_a$ such that the symmetric difference $A \triangle B$ is a P-null set and define $T' = a 1_B + \infty 1_{B^c}$. For a general $\mathcal{G}_t$-stopping time T use a dyadic approximation. Let

$$S^{(n)} \triangleq \sum_{k=0}^{\infty} k 2^{-n}\, 1_{\{(k-1)2^{-n} \le T < k 2^{-n}\}}.$$

Clearly $S^{(n)}$ is $\mathcal{G}_T$-measurable and by construction $S^{(n)} \ge T$. Thus $S^{(n)}$ is a $\mathcal{G}_t$-stopping time. But the stopping time $S^{(n)}$ takes values in a countable set, so

$$S^{(n)} = \inf_k \big\{ k 2^{-n}\, 1_{A_k} + \infty\, 1_{A_k^c} \big\},$$

where $A_k \triangleq \{S^{(n)} = k 2^{-n}\}$. The result has already been proved for stopping times of the form of those inside the infimum. As $T = \lim_n S^{(n)} = \inf_n S^{(n)}$, consequently the result holds for all $\mathcal{G}_t$-stopping times. As a consequence of this limiting operation $\mathcal{F}^o_{t+}$ appears instead of $\mathcal{F}^o_t$.
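The dyadic approximation used above can be checked numerically. The following is only a sketch for a single deterministic value of T (the function name `dyadic_upper` and the sample value 0.3 are illustrative assumptions, and the convention $S^{(n)} = \infty$ on $\{T = \infty\}$ is assumed here); it shows that the approximations lie above T, decrease in n, and converge to T.

```python
import math

def dyadic_upper(T: float, n: int) -> float:
    """Return k * 2**-n for the unique k with (k-1) * 2**-n <= T < k * 2**-n."""
    if math.isinf(T):
        # Assumed convention for this sketch: S^(n) = +infinity on {T = +infinity}.
        return math.inf
    k = math.floor(T * 2**n) + 1
    return k / 2**n

# Hypothetical value of the stopping time at one sample point.
T = 0.3
approximations = [dyadic_upper(T, n) for n in range(1, 11)]
```

Each approximation takes values in the countable set of dyadic rationals of level n, which is what allows it to be written as an infimum of two-valued stopping times in the proof.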

To prove the second assertion, let $L \in \mathcal{G}_T$. By the first part since $L \in \mathcal{G}_\infty$ there exists $L' \in \mathcal{F}^o_\infty$ such that L = L′ P-a.s. Let $V = T 1_L + \infty 1_{L^c}$ a.s. Using the first part again, $\mathcal{F}^o_{t+}$-stopping times V′ and T′ can be constructed such that V = V′ a.s. and T = T′ a.s. Define $M \triangleq (L' \cap \{T' = \infty\}) \cup \{V' = T' < \infty\}$. Clearly M is $\mathcal{F}^o_{T+}$-measurable and it follows that L = M P-a.s. □

The following lemma is trivial, but worth stating to avoid confusion in the more complex proof which follows.

Lemma A.22. Let $\mathcal{X}^o_t$ be the unaugmented σ-algebra generated by a process X. Then for T an $\mathcal{X}^o_t$-stopping time, if T(ω) ≤ t and $X_s(\omega) = X_s(\omega')$ for s ≤ t then T(ω′) ≤ t.

Proof. As T is a stopping time, $\{T \le t\} \in \mathcal{X}^o_t = \sigma(X_s : 0 \le s \le t)$ from which the result follows. □

Corollary A.23. Let $\mathcal{X}^o_t$ be the unaugmented σ-algebra generated by a process X. Then for T an $\mathcal{X}^o_t$-stopping time, if T(ω) ≤ t and $X_s(\omega) = X_s(\omega')$ for s ≤ t then T(ω′) = T(ω).


Proof. Apply Lemma A.22 with t = T(ω) to conclude T(ω′) ≤ T(ω). By symmetry, T(ω) ≤ T(ω′) from which the result follows. □
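A discrete-time analogue may help fix ideas. The sketch below is an illustration only: the function `first_hit` and the two sample paths are hypothetical, and time is a finite grid of indices rather than [0,∞). It takes a first hitting time as the stopping time and checks that two paths which agree up to time t, with T(ω) ≤ t, have the same value of T.

```python
from typing import Sequence

def first_hit(path: Sequence[float], a: float) -> float:
    """First grid index i with path[i] >= a, or +infinity if the level is never reached."""
    for i, x in enumerate(path):
        if x >= a:
            return float(i)
    return float("inf")

# Two hypothetical sample paths that agree on indices 0..3 and differ afterwards.
omega = [0.0, 0.4, 0.9, 1.2, 0.7, 0.1]
omega_prime = [0.0, 0.4, 0.9, 1.2, 2.5, 3.0]

t = 3
agree_up_to_t = omega[: t + 1] == omega_prime[: t + 1]
T = first_hit(omega, 1.0)
T_prime = first_hit(omega_prime, 1.0)
```

The behaviour of the paths after index t plays no role: the decision to stop by time t is made from the path up to t alone, which is the content of Lemma A.22.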

Lemma A.24. Let $\mathcal{X}^o_t$ be the unaugmented σ-algebra generated by a process X. Then for T an $\mathcal{X}^o_t$-stopping time, for all t ≥ 0,

$$\mathcal{X}^o_{t \wedge T} = \sigma\{X_{s \wedge T} : 0 \le s \le t\}.$$

Proof. Since T ∧ t is also an $\mathcal{X}^o_t$-stopping time, it suffices to show

$$\mathcal{X}^o_T = \sigma\{X_{s \wedge T} : s \ge 0\}.$$

The definition of the σ-algebra associated with a stopping time is that

$$\mathcal{X}^o_T \triangleq \{B \in \mathcal{X}^o_\infty : B \cap \{T \le s\} \in \mathcal{X}^o_s \text{ for all } s \ge 0\}.$$

If $A \in \mathcal{X}^o_T$ then it follows from this definition that

$$T_A = \begin{cases} T & \text{if } \omega \in A, \\ +\infty & \text{otherwise,} \end{cases}$$

defines an $\mathcal{X}^o_t$-stopping time. Conversely if for some set A, the time $T_A$ defined as above is a stopping time it follows that $A \in \mathcal{X}^o_T$. Therefore we will have established the result if we can show that $A \in \sigma\{X_{s \wedge T} : s \ge 0\}$ is a necessary and sufficient condition for $T_A$ to be a stopping time.

For the first implication, assume that $T_A$ is an $\mathcal{X}^o_t$-stopping time. It is necessary to show that $A \in \sigma\{X_{s \wedge T} : s \ge 0\}$. Suppose that ω, ω′ ∈ Ω are such that $X_s(\omega) = X_s(\omega')$ for s ≤ T(ω). We will establish that $A \in \sigma\{X_{s \wedge T} : s \ge 0\}$ if we show ω ∈ A implies that ω′ ∈ A.

If T(ω) = ∞ then it is immediate that the trajectories $X_s(\omega)$ and $X_s(\omega')$ are identical and hence ω′ ∈ A. Therefore consider T(ω) < ∞; if ω ∈ A then $T_A(\omega) = T(\omega)$ and since it was assumed that $T_A$ is an $\mathcal{X}^o_t$-stopping time the fact that $X_s(\omega)$ and $X_s(\omega')$ agree for $s \le T_A(\omega)$ implies by Corollary A.23 that $T_A(\omega') = T_A(\omega) = T(\omega) < \infty$ and from $T_A(\omega') < \infty$ it follows that ω′ ∈ A.

We must now prove the opposite implication; that is, given that T is a stopping time and $A \in \sigma\{X_{s \wedge T} : s \ge 0\}$, we must show that $T_A$ is a stopping time.

Given arbitrary t ≥ 0, if $T_A(\omega) \le t$ and $X_s(\omega) = X_s(\omega')$ for s ≤ t it follows that ω ∈ A (since $T_A(\omega) < \infty$). If T(ω) ≤ t and $X_s(\omega) = X_s(\omega')$ for s ≤ t, since T is a stopping time it follows from Corollary A.23 that T(ω) = T(ω′). Since we assumed $A \in \sigma\{X_{s \wedge T} : s \ge 0\}$ it follows that ω′ ∈ A from which we deduce $T_A(\omega) = T(\omega) = T(\omega') = T_A(\omega')$ whence

$$\{T_A(\omega) \le t,\ X_s(\omega) = X_s(\omega') \text{ for all } s \le t\} \Rightarrow T_A(\omega') \le t,$$

which implies that $\{T_A \le t\} \in \mathcal{X}^o_t$ and hence that $T_A$ is an $\mathcal{X}^o_t$-stopping time. □


For many arguments it is required that the augmentation of the filtration generated by a process be right continuous. While left continuity of sample paths does imply left continuity of the filtration, right continuity (or even continuity) of the sample paths does not imply that the augmentation of the generated filtration is right continuous. This can be seen by considering the event that a process has a local maximum at time t which may be in $\mathcal{X}_{t+}$ but not $\mathcal{X}_t$ (see the solution to Problem 7.1 (iii) in Chapter 2 of Karatzas and Shreve [149]). The following proposition gives an important class of process for which the right continuity does hold.

Proposition A.25. If X is a d-dimensional strong Markov process, then the augmentation of the filtration generated by X is right continuous.

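Proof. Denote by $\mathcal{X}^o$ the (unaugmented) filtration generated by the process X. If $0 \le t_0 < t_1 < \cdots < t_n \le s < t_{n+1} < \cdots < t_m$, then by application of the strong Markov property to the trivial $\mathcal{X}^o_{t+}$-stopping time s,

$$\mathbb{P}(X_{t_0} \in \Gamma_0, \ldots, X_{t_m} \in \Gamma_m \mid \mathcal{F}_{s+}) = 1_{\{X_{t_0} \in \Gamma_0, \ldots, X_{t_n} \in \Gamma_n\}}\, \mathbb{P}(X_{t_{n+1}} \in \Gamma_{n+1}, \ldots, X_{t_m} \in \Gamma_m \mid X_s).$$

The right-hand side in this equation is clearly $\mathcal{X}_s$-measurable and it is P-a.s. equal to $\mathbb{P}(X_{t_0} \in \Gamma_0, \ldots, X_{t_m} \in \Gamma_m \mid \mathcal{F}_{s+})$. As this holds for all cylinder sets, it follows that for all $F \in \mathcal{X}^o_\infty$ there exists an $\mathcal{X}^o_s$-measurable random variable which is P-a.s. equal to $\mathbb{P}(F \mid \mathcal{X}^o_{s+})$.

Suppose that $F \in \mathcal{X}^o_{s+} \subseteq \mathcal{X}^o_\infty$; then clearly $\mathbb{P}(F \mid \mathcal{X}^o_{s+}) = 1_F$. As above there exists an $\mathcal{X}^o_s$-measurable random variable $\hat{1}_F$ such that $\hat{1}_F = 1_F$ a.s. Define the event $G \triangleq \{\omega : \hat{1}_F(\omega) = 1\}$, then $G \in \mathcal{X}^o_s$ and the events G and F differ by at most a null set (i.e. the symmetric difference $G \triangle F$ is null). Therefore $F \in \mathcal{X}_s$, which establishes that $\mathcal{X}^o_{s+} \subseteq \mathcal{X}_s$ for all s ≥ 0.

It is clear that $\mathcal{X}_s \subseteq \mathcal{X}_{s+}$. Now prove the converse implication. Suppose that $F \in \mathcal{X}_{s+}$, which implies that for all n, $F \in \mathcal{X}_{s+1/n}$. Therefore there exists $G_n \in \mathcal{X}^o_{s+1/n}$ such that F and $G_n$ differ by a null set. Define $G \triangleq \bigcap_{m=1}^{\infty} \bigcup_{n=m}^{\infty} G_n$. Then clearly $G \in \mathcal{X}^o_{s+} \subseteq \mathcal{X}_s$ (by the result just proved). To show that $F \in \mathcal{X}_s$, it suffices to show that this G differs by at most a null set from F. Consider

$$G \setminus F \subseteq \Big( \bigcup_{n=1}^{\infty} G_n \Big) \setminus F = \bigcup_{n=1}^{\infty} (G_n \setminus F),$$

where the right-hand side is a countable union of null sets; thus $G \setminus F$ is null. Secondly

$$F \setminus G = F \cap \Big( \bigcap_{m=1}^{\infty} \bigcup_{n=m}^{\infty} G_n \Big)^c = F \cap \Big( \bigcup_{m=1}^{\infty} \bigcap_{n=m}^{\infty} G_n^c \Big) = \bigcup_{m=1}^{\infty} \Big( F \cap \Big( \bigcap_{n=m}^{\infty} G_n^c \Big) \Big) \subseteq \bigcup_{m=1}^{\infty} F \cap G_m^c = \bigcup_{m=1}^{\infty} (F \setminus G_m),$$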


and again the right-hand side is a countable union of null sets, thus $F \setminus G$ is null. Therefore $F \in \mathcal{X}_s$, which implies that $\mathcal{X}_{s+} \subseteq \mathcal{X}_s$; hence $\mathcal{X}_s = \mathcal{X}_{s+}$. □

A.7 The Optional Projection

Proof of Theorem 2.7

Proof. The proof uses a monotone class argument (Theorem A.1). Let $\mathcal{H}$ be the class of bounded measurable processes for which an optional projection exists. The class of functions $1_{[s,t)} 1_F$, where s < t and $F \in \mathcal{F}$ can readily be seen to form a π-system which generates the measurable processes. Define Z to be a cadlag version of the martingale $t \mapsto \mathbb{E}(1_F \mid \mathcal{F}_t)$ (which necessarily exists since we have assumed that the usual conditions hold); then we may set

$${}^o\big(1_{[s,t)} 1_F\big)(r, \omega) = 1_{[s,t)}(r)\, Z_r(\omega).$$

It is necessary to check that the defining condition (2.8) is satisfied. Let T be a stopping time. Then it follows from Doob's optional sampling theorem (which is applicable in this case without restrictions on T, because the martingale Z is bounded and hence uniformly integrable) that

$$\mathbb{E}[1_F \mid \mathcal{F}_T] = \mathbb{E}[Z_\infty \mid \mathcal{F}_T] = Z_T,$$

whence

$$\mathbb{E}[1_F 1_{\{T<\infty\}} \mid \mathcal{F}_T] = Z_T 1_{\{T<\infty\}} \quad \mathbb{P}\text{-a.s.}$$

To apply the monotone class theorem A.1 it is necessary to check that if $X_n$ is a bounded monotone sequence in $\mathcal{H}$ with limit X then the optional projections ${}^oX_n$ converge to the optional projection of X. Consider

$$Y \triangleq \liminf_{n \to \infty} {}^oX_n\, 1_{\{|\liminf_{n \to \infty} {}^oX_n| < \infty\}}.$$

We must check that Y is the optional projection of X. Thanks to property (c) of conditional expectation the condition (2.8) is immediate. Consequently $\mathcal{H}$ is a monotone class and thus by the monotone class theorem A.1 the optional projection exists for any bounded $\mathcal{B} \times \mathcal{F}$-measurable process. To extend to the unbounded non-negative case consider X ∧ n and pass to the limit.

In order to verify that the projection is unique up to indistinguishability, consider two candidates for the optional projection Y and Z. For any stopping time T from (2.8) it follows that

$$Y_T 1_{\{T<\infty\}} = Z_T 1_{\{T<\infty\}}, \quad \mathbb{P}\text{-a.s.} \tag{A.14}$$

Define $F \triangleq \{(t, \omega) : Z_t(\omega) \ne Y_t(\omega)\}$. Since both Z and Y are optional processes the set F is an optional subset of [0,∞) × Ω. Write π : [0,∞) × Ω → Ω for the canonical projection map π : (t, ω) ↦ ω. Now argue by contradiction. Suppose that Z and Y are not indistinguishable; this implies that P(π(F)) > 0. By the optional section theorem (see Dellacherie and Meyer [77, IV.84]) it follows that given ε > 0 there exists a stopping time U such that when U(ω) < ∞, (U(ω), ω) ∈ F and P(U < ∞) ≥ P(π(F)) − ε. As it has been assumed that P(π(F)) > 0, by choosing ε sufficiently small, P(U < ∞) > 0. It follows that on some set of non-zero probability $1_{\{U<\infty\}} Y_U \ne 1_{\{U<\infty\}} Z_U$. But from (A.14) this may only hold on a null set, which is a contradiction. Therefore P(π(F)) = 0 and it follows that Z and Y are indistinguishable. □

Lemma A.26. If almost surely $X_t \ge 0$ for all t ≥ 0 then ${}^oX_t \ge 0$ for all t ≥ 0 almost surely.

Proof. Use the monotone class argument (Theorem A.1) in the proof of the existence of the optional projection, noting that if $F \in \mathcal{F}$ then the cadlag version of $\mathbb{E}[1_F \mid \mathcal{F}_t]$ is non-negative a.s. Alternatively use the optional section theorem as in the proof of uniqueness. □

A.7.1 Path Regularity

Introduce the following notation for a one-sided limit, in this case the right limit

$$\limsup_{s \downarrow\downarrow t} x_s \triangleq \limsup_{s \to t :\, s > t} x_s = \inf_{v > t}\ \sup_{t < u \le v} x_u,$$

a similar notation with s ↑↑ t being used for the left limit.

The following lemma is required to establish right continuity. It can be applied to the optional projection since being optional it must also be progressively measurable.

Lemma A.27. Let X be a progressively measurable stochastic process taking values in R; then $\liminf_{s \downarrow\downarrow t} X_s$ and $\limsup_{s \downarrow\downarrow t} X_s$ are progressively measurable.

Proof. It is sufficient to consider the case of lim sup. Let b ∈ R be such that b > 0, then define

$$X^n_t \triangleq \begin{cases} \sup_{k b 2^{-n} \le s < (k+1) b 2^{-n}} X_s & \text{if } b(k-1)2^{-n} \le t < b k 2^{-n},\ k < 2^n, \\ \limsup_{s \downarrow\downarrow b} X_s & \text{if } b(1 - 2^{-n}) \le t \le b. \end{cases}$$

For every t ≤ b, the supremum in the above definition is $\mathcal{F}_b$-measurable since X is progressively measurable; thus the random variable $X^n_t$ is $\mathcal{F}_b$-measurable. For every ω ∈ Ω, $X^n_t(\omega)$ has trajectories which are right continuous for t ∈ [0, b]. Therefore $X^n$ is $\mathcal{B}([0,b]) \otimes \mathcal{F}_b$-measurable and is thus progressively measurable. On [0, b] it is clear that $\limsup_{n \to \infty} X^n_t = \limsup_{s \downarrow\downarrow t} X_s$, hence $\limsup_{s \downarrow\downarrow t} X_s$ is progressively measurable. □
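As a small worked instance of the one-sided limit notation appearing in Lemma A.27 (the deterministic path x below is an illustrative choice, not taken from the text), consider a function that oscillates immediately to the right of t. Its right lim sup and lim inf differ, so the right limit fails to exist at t.

```latex
\begin{aligned}
x_s &= \sin\!\Big(\frac{1}{s-t}\Big) \quad (s > t), \\
\sup_{t<u\le v} x_u &= 1, \qquad \inf_{t<u\le v} x_u = -1 \qquad \text{for every } v > t, \\
\limsup_{s \downarrow\downarrow t} x_s &= \inf_{v>t}\,\sup_{t<u\le v} x_u = 1,
\qquad
\liminf_{s \downarrow\downarrow t} x_s = \sup_{v>t}\,\inf_{t<u\le v} x_u = -1 .
\end{aligned}
```

Any pair of rationals a < b with −1 < a < b < 1 then separates the two one-sided limits, which is the kind of configuration that the proof of Theorem 2.9 has to rule out.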


In a similar vein, the following lemma is required in order to establish the existence of left limits. For the left limits the result is stronger and the lim inf and lim sup are previsible and thus also progressively measurable.

Lemma A.28. Let X be a progressively measurable stochastic process taking values in R; then $\liminf_{s \uparrow\uparrow t} X_s$ and $\limsup_{s \uparrow\uparrow t} X_s$ are previsible.

Proof. It suffices to consider $\limsup_{s \uparrow\uparrow t} X_s$. Define

$$X^n_t \triangleq \sum_{k > 0} 1_{\{k 2^{-n} < t \le (k+1) 2^{-n}\}}\ \sup_{(k-1) 2^{-n} < s \le k 2^{-n}} X_s,$$

from this definition it is clear that $X^n_t$ is previsible as it is a sum of left continuous, adapted, processes. But as $\limsup_{n \to \infty} X^n_t = \limsup_{s \uparrow\uparrow t} X_s$, it follows that $\limsup_{s \uparrow\uparrow t} X_s$ is previsible. □

Proof of Theorem 2.9

Proof. First observe that if $Y_t$ is bounded then ${}^oY_t$ must also be bounded. There are three things which must be established; first, the existence of right limits; second, right continuity; and third the existence of left limits. Because of the difference between Lemmas A.27 and A.28 the cases of left and right limits are not identical. The first part of the proof establishes the existence of right limits. It is sufficient to show that

$$\mathbb{P}\Big( \liminf_{s \downarrow\downarrow t} {}^oY_s < \limsup_{s \downarrow\downarrow t} {}^oY_s \ \text{for some } t \in [0,\infty) \Big) = 0. \tag{A.15}$$

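The following steps are familiar from the proof of Doob's martingale regularization theorem which is used to guarantee the existence of cadlag modifications of martingales. If the right limit does not exist at t ∈ [0,∞), that is, if $\liminf_{s \downarrow\downarrow t} {}^oY_s < \limsup_{s \downarrow\downarrow t} {}^oY_s$, then rationals a, b can be found such that $\liminf_{s \downarrow\downarrow t} {}^oY_s < a < b < \limsup_{s \downarrow\downarrow t} {}^oY_s$. The event that the right limit does not exist has thus been decomposed into a countable union over the rationals:

$$\Big\{ \omega : \liminf_{s \downarrow\downarrow t} {}^oY_s(\omega) < \limsup_{s \downarrow\downarrow t} {}^oY_s(\omega) \ \text{for some } t \in [0,\infty) \Big\} = \bigcup_{a,b \in \mathbb{Q}} \Big\{ \omega : \liminf_{s \downarrow\downarrow t} {}^oY_s(\omega) < a < b < \limsup_{s \downarrow\downarrow t} {}^oY_s(\omega) \ \text{for some } t \in [0,\infty) \Big\}.$$

The lim sup and lim inf processes are progressively measurable by Lemma A.27, therefore for rationals a < b, the set

$$E_{a,b} \triangleq \Big\{ (t, \omega) : \liminf_{s \downarrow\downarrow t} {}^oY_s < a < b < \limsup_{s \downarrow\downarrow t} {}^oY_s \Big\},$$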

is progressively measurable.

Now argue by contradiction; suppose that (A.15) is not true. Then from the decomposition into a countable union, it follows that we can find a, b ∈ Q such that a < b and

$$0 < \mathbb{P}\Big( \liminf_{s \downarrow\downarrow t} {}^oY_s < a < b < \limsup_{s \downarrow\downarrow t} {}^oY_s \ \text{for some } t \in [0,\infty) \Big) = \mathbb{P}(\pi(E_{a,b})),$$

where the projection π is defined for A ⊂ [0,∞) × Ω, by π(A) = {ω : (t, ω) ∈ A for some t ∈ [0,∞)}. Define

$$S_{a,b} \triangleq \inf\{t \ge 0 : (t, \omega) \in E_{a,b}\},$$

which is the debut of a progressively measurable set, and thus by the Debut theorem (Theorem A.20 applied to the progressive process $1_{E_{a,b}}(t, \omega)$) is a stopping time (and hence optional). For a given ω, this stopping time $S_{a,b}(\omega)$ is the first time where $\liminf_{s \downarrow\downarrow t} {}^oY_s$ and $\limsup_{s \downarrow\downarrow t} {}^oY_s$ straddle the interval [a, b] and thus the right limit fails to exist at this point.

If $\omega \in \pi(E_{a,b})$ then there exists t ∈ [0,∞) such that $(t, \omega) \in E_{a,b}$ and this implies $t \ge S_{a,b}(\omega)$, whence $S_{a,b}(\omega) < \infty$. Thus, if $\mathbb{P}(\pi(E_{a,b})) > 0$ then this implies $\mathbb{P}(S_{a,b} < \infty) > 0$. Thus a consequence of the assumption that (A.15) is false is that we can find a, b ∈ Q, with a < b such that $\mathbb{P}(S_{a,b} < \infty) > 0$. This will lead to a contradiction. For the remainder of the argument we can keep a and b fixed and consequently we write S in place of $S_{a,b}$. Define

$$A_0 \triangleq \{(t, \omega) : S(\omega) < t < S(\omega) + 1,\ {}^oY_t(\omega) < a\};$$

it then follows that the projection $\pi(A_0) = \{S < \infty\}$. Thus by the optional section theorem, since $A_0$ is optional (S is a stopping time and ${}^oY_t$ is a priori optional), we can find a stopping time $S_0$ such that on $\{S_0 < \infty\}$, $(S_0(\omega), \omega) \in A_0$ and

$$\mathbb{P}(S_0 < \infty) > (1 - 1/2)\, \mathbb{P}(S < \infty).$$

Define

$$A_1 \triangleq \{(t, \omega) : S(\omega) < t < (S(\omega) + 1/2) \wedge S_0(\omega),\ {}^oY_t(\omega) > b\}$$

and again by the optional section theorem we can find a stopping time $S_1$ such that on $\{S_1 < \infty\}$, $(S_1(\omega), \omega) \in A_1$ and

$$\mathbb{P}(S_1 < \infty) > (1 - 1/2^2)\, \mathbb{P}(S < \infty).$$

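We can carry on this construction inductively defining

$$A_{2k} \triangleq \big\{ (t, \omega) : S(\omega) < t < (S(\omega) + 2^{-2k}) \wedge S_{2k-1}(\omega),\ {}^oY_t(\omega) < a \big\},$$

and

$$A_{2k+1} \triangleq \big\{ (t, \omega) : S(\omega) < t < (S(\omega) + 2^{-(2k+1)}) \wedge S_{2k}(\omega),\ {}^oY_t(\omega) > b \big\}.$$

We can construct stopping times using the optional section theorem such that for each i, on $\{S_i < \infty\}$, $(S_i(\omega), \omega) \in A_i$, and such that

$$\mathbb{P}(S_i < \infty) > \big(1 - 2^{-(i+1)}\big)\, \mathbb{P}(S < \infty).$$

On the event $\{S_i < \infty\}$ it is clear that $S_i < S_{i-1}$ and $S_i < S + 2^{-i}$. Also if S = ∞ it follows that $S_i = \infty$ for all i, thus $S_i < \infty$ implies S < ∞, so $\mathbb{P}(S_i < \infty, S < \infty) = \mathbb{P}(S_i < \infty)$, whence

$$\mathbb{P}(S_i = \infty, S < \infty) = \mathbb{P}(S < \infty) - \mathbb{P}(S_i < \infty, S < \infty) = \mathbb{P}(S < \infty) - \mathbb{P}(S_i < \infty) \le \mathbb{P}(S < \infty) / 2^{i+1}.$$

Thus

$$\sum_{i=0}^{\infty} \mathbb{P}(S_i = \infty, S < \infty) \le \mathbb{P}(S < \infty) \le 1 < \infty,$$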

so by the first Borel–Cantelli lemma the probability that infinitely many of the events $\{S_i = \infty, S < \infty\}$ occur is zero. In other words for ω ∈ {S < ∞}, we can find an index $i_0(\omega)$ such that for $i \ge i_0$, the sequence $S_i$ converges in a decreasing fashion to S and ${}^oY_{S_i} < a$ for even i, and ${}^oY_{S_i} > b$ for odd i.

Define $R_i = \sup_{j \ge i} S_j$, which is a monotonically decreasing sequence. Almost surely, $R_i = S_i$ for i sufficiently large, therefore $\lim_{i \to \infty} R_i = S$ a.s. and on the event {S < ∞}, for i sufficiently large, ${}^oY_{R_i} < a$ for i even, and ${}^oY_{R_i} > b$ for i odd. Set $T_i = R_i \wedge N$. On {S < N}, for j sufficiently large $S_j < N$, hence using the boundedness of ${}^oY$ to enable interchange of limit and expectation

$$\limsup_{i \to \infty} \mathbb{E}[{}^oY_{T_{2i}}] \le a\, \mathbb{P}(S < N) + \mathbb{E}\big[{}^oY_N 1_{\{S \ge N\}}\big],$$

$$\liminf_{i \to \infty} \mathbb{E}[{}^oY_{T_{2i+1}}] \ge b\, \mathbb{P}(S < N) + \mathbb{E}\big[{}^oY_N 1_{\{S \ge N\}}\big].$$

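But since $T_i$ is bounded by N, from the definition of the optional projection (2.8) it is clear that

$$\mathbb{E}[{}^oY_{T_i}] = \mathbb{E}\big[ \mathbb{E}[Y_{T_i} 1_{\{T_i < \infty\}} \mid \mathcal{F}_{T_i}] \big] = \mathbb{E}[Y_{T_i}]. \tag{A.16}$$

Thus, since Y has right limits, by an application of the bounded convergence theorem $\mathbb{E}[Y_{T_i}] \to \mathbb{E}[Y_T]$, and so as i → ∞

$$\mathbb{E}[{}^oY_{T_i}] \to \mathbb{E}[{}^oY_T]. \tag{A.17}$$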

Thus

$$\limsup_{i \to \infty} \mathbb{E}[{}^oY_{T_i}] = \limsup_{i \to \infty} \mathbb{E}[{}^oY_{T_{2i}}] \quad \text{and} \quad \liminf_{i \to \infty} \mathbb{E}[{}^oY_{T_i}] = \liminf_{i \to \infty} \mathbb{E}[{}^oY_{T_{2i+1}}],$$

so, if P(S < N) > 0 we see that since a < b, $\limsup_{i \to \infty} \mathbb{E}[{}^oY_{T_i}] < \liminf_{i \to \infty} \mathbb{E}[{}^oY_{T_i}]$, which is a contradiction; therefore P(S < N) = 0. As N was chosen arbitrarily, this implies that P(S = ∞) = 1 which is a contradiction, since we assumed P(S < ∞) > 0. Thus a.s., right limits of ${}^oY_t$ exist.

Now we must show that ${}^oY_t$ is right continuous. Let ${}^oY_{t+}$ be the process of right limits. As this process is adapted and right continuous, it follows that it is optional. Consider for ε > 0, the set

$$A_\varepsilon \triangleq \{(t, \omega) : {}^oY_t(\omega) \ge {}^oY_{t+}(\omega) + \varepsilon\}.$$

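Suppose that $\mathbb{P}(\pi(A_\varepsilon)) > 0$, from which we deduce a contradiction. By the optional section theorem, for δ > 0, we can find a stopping time S such that on {S < ∞}, $(S(\omega), \omega) \in A_\varepsilon$, and $\mathbb{P}(S < \infty) \ge \mathbb{P}(\pi(A_\varepsilon)) - \delta$. We may choose δ such that P(S < ∞) > 0. Let $S_n = S + 1/n$, and bound these times by some N, which is chosen sufficiently large that P(S < N) > 0. Thus set $T_n \triangleq S_n \wedge N$ and $T \triangleq S \wedge N$. Hence by bounded convergence

$$\lim_{n \to \infty} \mathbb{E}[{}^oY_{T_n}] = \mathbb{E}[{}^oY_N 1_{\{S \ge N\}}] + \mathbb{E}[{}^oY_{T+} 1_{\{S < N\}}], \tag{A.18}$$

but

$$\mathbb{E}[{}^oY_T] = \mathbb{E}[{}^oY_N 1_{\{S \ge N\}}] + \mathbb{E}[{}^oY_T 1_{\{S < N\}}]. \tag{A.19}$$

As the right-hand sides of (A.18) and (A.19) are not equal we conclude that $\lim_{n \to \infty} \mathbb{E}({}^oY_{T_n}) \ne \mathbb{E}({}^oY_T)$, which contradicts (A.17). Therefore $\mathbb{P}(\pi(A_\varepsilon)) = 0$. The same argument can be applied to

$$B_\varepsilon = \{(t, \omega) : {}^oY_t(\omega) \le {}^oY_{t+}(\omega) - \varepsilon\},$$

which allows us to conclude that $\mathbb{P}(\pi(B_\varepsilon)) = 0$; hence

$$\mathbb{P}\big({}^oY_t = {}^oY_{t+},\ \forall t \in [0,\infty)\big) = 1,$$

and thus, up to indistinguishability, the process ${}^oY_t$ is right continuous.

The existence of left limits is approached in a similar fashion; by Lemma A.28, the processes $\liminf_{s \uparrow\uparrow t} {}^oY_s$ and $\limsup_{s \uparrow\uparrow t} {}^oY_s$ are previsible and hence optional. For a, b ∈ Q we define

$$F_{a,b} \triangleq \Big\{ (t, \omega) : \liminf_{s \uparrow\uparrow t} {}^oY_s(\omega) < a < b < \limsup_{s \uparrow\uparrow t} {}^oY_s(\omega) \Big\}.$$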

We assume $\mathbb{P}(\pi(F_{a,b})) > 0$ and deduce a contradiction. Since $F_{a,b}$ is optional, we may apply the optional section theorem to find an optional time T such that on {T < ∞}, the point $(T(\omega), \omega) \in F_{a,b}$ and with P(T < ∞) > ε for some ε > 0. Define

$$C_0 \triangleq \{(t, \omega) : t < T(\omega),\ {}^oY_t < a\},$$

which is itself optional; thus another application of the optional section theorem constructs a stopping time $R_0$ such that on $\{R_0 < \infty\}$, $(R_0(\omega), \omega) \in C_0$ and since $R_0 < T$ it is clear that $\mathbb{P}(R_0 < \infty) > \varepsilon$.


Then define

$$C_1 \triangleq \{(t, \omega) : R_0(\omega) < t < T(\omega),\ {}^oY_t > b\},$$

which is optional and by the optional section theorem we can find a stopping time $R_1$ such that on $\{R_1 < \infty\}$, $(R_1(\omega), \omega) \in C_1$ and again $R_1 < T$ implies that $\mathbb{P}(R_1 < \infty) > \varepsilon$. Proceed inductively.

We have constructed an increasing sequence of optional times $R_k$ such that on the event {T < ∞}, ${}^oY_{R_k} < a$ for even k, and ${}^oY_{R_k} > b$ for odd k. Define $L_k = R_k \wedge N$ for some N; then this is an increasing sequence of bounded stopping times and clearly on {T < N} the limit $\lim_n \mathbb{E}[{}^oY_{L_n}]$ does not exist. But since $L_n$ is bounded, from (A.16) it follows that this limit must exist a.s.; hence P(T < N) = 0, which as N was arbitrary implies P(T < ∞) = 0, which is a contradiction. □

The results used in the above proof are due to Doob and can be found in a very clear paper [82] which is discussed further in Benveniste [16]. These papers work in the context of separable processes, which are processes whose graph is the closure of the graph of the process with time restricted to some countable set D. That is, for every t ∈ [0,∞) there exists a sequence $t_i \in D$ such that $t_i \to t$ and $x_{t_i} \to x_t$. In these papers 'rules of play' disallow the use of the optional section theorem except when unavoidable and the above results are proved without its use. These results can be extended (with the addition of extra conditions) to optionally separable processes, which are similarly defined, but the set D consists of a countable collection of stopping times and by an application of the optional section theorem it can be shown that every optional process is optionally separable. The direct approach via the optional section theorems is used in Dellacherie and Meyer [79].

A.8 The Previsible Projection

The optional projection (called the projection bien measurable in some early articles) has been discussed extensively and is the projection which is of importance in the theory of filtering; a closely related concept is the previsible (or predictable) projection. Some of the very early theoretical papers make use of this projection. By convention we take $\mathcal{F}_{0-} = \mathcal{F}_0$.

Theorem A.29. Let X be a bounded measurable process; then there exists a previsible process ${}^pX$ called the previsible projection of X such that for every previsible stopping time T,

$${}^pX_T 1_{\{T<\infty\}} = \mathbb{E}\big[X_T 1_{\{T<\infty\}} \mid \mathcal{F}_{T-}\big]. \tag{A.20}$$

This process is unique up to indistinguishability, i.e. any processes which satisfy these conditions will be indistinguishable.


Proof. As in the proof of Theorem 2.7, let F be a measurable set, and define $Z_t$ to be a cadlag version of the martingale $\mathbb{E}[1_F \mid \mathcal{F}_t]$. Then we define the previsible projection of $1_{(s,t]} 1_F$ by

$${}^p\big(1_{(s,t]} 1_F\big)(r, \omega) = 1_{(s,t]}(r)\, Z_{r-}(\omega).$$

We must verify that this satisfies (A.20); let T be a previsible stopping time. Then we can find a sequence $T_n$ of stopping times such that $T_n \le T_{n+1} < T$ for all n. By Doob's optional sampling theorem applied to the uniformly integrable martingale Z,

$$\mathbb{E}[1_F \mid \mathcal{F}_{T_n}] = \mathbb{E}[Z_\infty \mid \mathcal{F}_{T_n}] = Z_{T_n};$$

now pass to the limit as n → ∞, using the martingale convergence theorem (see Theorem B.1), and we get

$$Z_{T-} = \mathbb{E}\big[Z_\infty \mid \textstyle\bigvee_{n=1}^{\infty} \mathcal{F}_{T_n}\big]$$

and from the definition of the σ-algebra of T− it follows that

$$Z_{T-} = \mathbb{E}[Z_\infty \mid \mathcal{F}_{T-}].$$

To complete the proof, apply the monotone class theorem A.1 as in the proof for the optional projection and use the same optional section theorem argument for uniqueness. □

The previsible and optional projection are actually very similar, as the following theorem illustrates.

Theorem A.30. Let X be a bounded measurable process; then the set

$$\{(t, \omega) : {}^oX_t(\omega) \ne {}^pX_t(\omega)\}$$

is a countable union of graphs of stopping times.

Proof. Again we use the monotone class argument. Consider the process $1_{[s,t)}(r) 1_F$; from (2.8) and (A.20) the set of points of difference is

$$\{(t, \omega) : Z_t(\omega) \ne Z_{t-}(\omega)\}$$

and since Z is a cadlag process we can define a sequence $T_n$ of stopping times corresponding to the nth discontinuity of Z, and by Lemma A.13 there are at most countably many such discontinuities, therefore the points of difference are contained in the countable union of the graphs of these $T_n$. □


A.9 The Optional Projection Without the Usual Conditions

The proof of the optional projection theorem in Section A.7 depends crucially on the usual conditions to construct a cadlag version of a martingale, both the augmentation by null sets and the right continuity of the filtration being used. The result can be proved on the uncompleted σ-algebra by making suitable modifications to the process constructed by Theorem 2.7. These results were first established in Dellacherie and Meyer [78] and take their definitive form in [77], the latter approach being followed here. The proofs in this section are of a more advanced nature and make use of a number of non-trivial results about σ-algebras of stopping times which are not proved here. These results and their proofs can be found in, for example, Rogers and Williams [249]. As usual let $\mathcal{F}^o_t$ denote the unaugmented σ-algebra corresponding to $\mathcal{F}_t$.

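Lemma A.31. Let $L \subset \mathbb{R}_+ \times \Omega$ be such that

$$L = \bigcup_n \{(S_n(\omega), \omega) : \omega \in \Omega\},$$

where the $S_n$ are positive $\mathcal{F}^o_t$-stopping times. We can find disjoint $\mathcal{F}^o_t$-stopping times $T_n$ to replace the $S_n$ such that

$$L = \bigcup_n \{(T_n(\omega), \omega) : \omega \in \Omega\}.$$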

Proof. Define $T_1 = S_1$ and define

$$A_n \triangleq \{\omega \in \Omega : S_1 \ne S_n,\ S_2 \ne S_n,\ \ldots,\ S_{n-1} \ne S_n\}.$$

Then it is clear that $A_n \in \mathcal{F}^o_{S_n}$. From the definition of this σ-algebra, if we define

$$T_n \triangleq S_n 1_{A_n} + \infty 1_{A_n^c},$$

then this $T_n$ is a stopping time. It is clear that this process may be continued inductively. The disjointness of the $T_n$ follows by construction. □
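The construction in this proof can be traced on a finite example. The sketch below is illustrative only: the three-point sample space and the values of the random times are hypothetical, and measurability plays no role in it. It checks just the set-level claims, namely that the union of the graphs is unchanged and that the new graphs are pairwise disjoint.

```python
import math
from typing import Dict, List, Set, Tuple

# Hypothetical random times S_1, S_2, S_3 on a three-point sample space.
S: List[Dict[str, float]] = [
    {"w1": 1.0, "w2": 2.0, "w3": 3.0},
    {"w1": 1.0, "w2": 5.0, "w3": 3.0},
    {"w1": 4.0, "w2": 5.0, "w3": 3.0},
]

def disjointify(S: List[Dict[str, float]]) -> List[Dict[str, float]]:
    """T_n = S_n on A_n = {S_j != S_n for all j < n}, and +infinity elsewhere."""
    T = []
    for n, S_n in enumerate(S):
        T_n = {}
        for w, value in S_n.items():
            in_A_n = all(S[j][w] != value for j in range(n))
            T_n[w] = value if in_A_n else math.inf
        T.append(T_n)
    return T

def graph_union(times: List[Dict[str, float]]) -> Set[Tuple[float, str]]:
    """Union of the graphs {(t(w), w)} restricted to finite values."""
    return {(t[w], w) for t in times for w in t if math.isfinite(t[w])}

T = disjointify(S)
```

Sending a time to +∞ on the complement of $A_n$ removes exactly those points of its graph that already lie on an earlier graph, so no point of the union is lost.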

Given this lemma the following result is useful when modifying a process as it allows us to break down the 'bad' set A of points in a useful fashion.

Lemma A.32. Let A be a subset of $\mathbb{R}_+ \times \Omega$ contained in a countable union of graphs of positive random variables; then A = K ∪ L where K and L are disjoint measurable sets such that K is contained in a disjoint union of graphs of optional times and L intersects the graph of any optional time on an evanescent set.†

† A set A ⊂ [0,∞) × Ω is evanescent if the projection π(A) = {ω : ∃t ∈ [0,∞) such that (t, ω) ∈ A} is contained in a P-null set. Two indistinguishable processes differ on an evanescent set.


Proof. Let $\mathcal{V}$ denote the set of all optional times. For Z a positive random variable define $V(Z) = \bigcup_{T \in \mathcal{V}} \{\omega : Z(\omega) = T(\omega)\}$; consequently there is a useful decomposition Z = Z′ ∧ Z″, where

$$Z' = Z 1_{V(Z)} + \infty 1_{V(Z)^c}, \qquad Z'' = Z 1_{V(Z)^c} + \infty 1_{V(Z)}.$$

From the definition of V(Z) the set {(Z′(ω), ω) : ω ∈ Ω} is contained in the graph of a countable number of optional times and if T is an optional time then P(Z″ = T < ∞) = 0. Let the covering of A by a countable family of graphs of random variables be written

$$A \subseteq \bigcup_{n=1}^{\infty} \{(Z_n(\omega), \omega) : \omega \in \Omega\}$$

and form a decomposition of each random variable $Z_n = Z'_n \wedge Z''_n$ as above. Clearly $\bigcup_{n=1}^{\infty} \{(Z'_n(\omega), \omega) : \omega \in \Omega\}$ is also covered by a countable union of graphs of optional times and by Lemma A.31 we can find a sequence of disjoint optional times $T_n$ such that

$$\bigcup_{n=1}^{\infty} \{(Z'_n(\omega), \omega) : \omega \in \Omega\} \subseteq \bigcup_{n=1}^{\infty} \{(T_n(\omega), \omega) : \omega \in \Omega\}.$$

Define

$$K = A \cap \bigcup_{n=1}^{\infty} \{(Z'_n(\omega), \omega) : \omega \in \Omega\} = A \cap \bigcup_n \{(T_n(\omega), \omega) : \omega \in \Omega\},$$

$$L = A \cap \bigcup_{n=1}^{\infty} \{(Z''_n(\omega), \omega) : \omega \in \Omega\} = A \setminus \bigcup_n \{(T_n(\omega), \omega) : \omega \in \Omega\}.$$

Clearly A = K ∪ L, hence this is a decomposition of A which has the required properties. □

Lemma A.33. For every $\mathcal{F}_t$-optional process $X_t$ there is an indistinguishable $\mathcal{F}^o_{t+}$-optional process.

Proof. Let T be an $\mathcal{F}_t$-stopping time. Consider the process $X_t = 1_{[0,T)}$, which is cadlag and $\mathcal{F}_t$-adapted, and hence $\mathcal{F}_t$-optional. By Lemma A.21 there exists an $\mathcal{F}^o_{t+}$-stopping time T′ such that T = T′ a.s. If we define $X'_t = 1_{[0,T')}$, then since this process is cadlag and $\mathcal{F}^o_{t+}$-adapted, it is clearly an $\mathcal{F}^o_{t+}$-optional process. Moreover

$$\mathbb{P}(\omega : X'_t(\omega) = X_t(\omega)\ \forall t) = \mathbb{P}(T = T') = 1,$$

which implies that the processes X′ and X are indistinguishable.

We extend from processes of the form $1_{[0,T)}$ to the whole of $\mathcal{O}$ using the monotone class framework (Theorem A.1) to extend to bounded optional processes, and use truncation to extend to the unbounded case. □


Lemma A.34. For every $\mathcal{F}_t$-previsible process, there is an indistinguishable $\mathcal{F}^o_t$-previsible process.

Proof. We first show that if T is $\mathcal{F}_t$-previsible, then there exists T′ which is $\mathcal{F}^o_t$-previsible, such that T = T′ a.s. As $\{T = 0\} \in \mathcal{F}_{0-}$, we need only consider the case where T > 0. Let $T_n$ be a sequence of $\mathcal{F}_t$-stopping times announcing† T. By Lemma A.33 it is clear that we can find $R_n$ an $\mathcal{F}^o_{t+}$-stopping time such that $R_n = T_n$ a.s. Define $L_n \triangleq \max_{i=1,\ldots,n} R_i$; clearly this is an increasing sequence of stopping times. Let this sequence have limit L.

Define $A_n \triangleq \{L_n = 0\} \cup \{L_n < L\}$ and define

$$M_n = \begin{cases} L_n \wedge n & \text{if } \omega \in A_n, \\ +\infty & \text{otherwise.} \end{cases}$$

Since the sets $A_n$ are decreasing, the stopping times $M_n$ form an increasing sequence and the sequence $M_n$ announces everywhere its limit T′. This limit is strictly positive. Because T′ is announced, T′ is an $\mathcal{F}^o_t$-previsible time and T = T′ a.s. Finish the proof by a monotone class argument as in Lemma A.33. □

The main result of this section is the following extension of the optional projection theorem which does not require the imposition of the usual conditions.

Theorem A.35. Given a stochastic process X, we can construct an $\mathcal{F}^o_t$-optional process $Z_t$ such that for every stopping time T,

$$Z_T 1_{\{T<\infty\}} = \mathbb{E}\big[X_T 1_{\{T<\infty\}} \mid \mathcal{F}_T\big], \tag{A.21}$$

and this process is unique up to indistinguishability.

Proof. By the optional projection theorem 2.7 we can construct an $\mathcal{F}_t$-optional process $\bar{Z}_t$ which satisfies (A.21). By Lemma A.33 we can find a process $\zeta_t$ which is indistinguishable from $\bar{Z}_t$ but which is $\mathcal{F}^o_{t+}$-optional. In general this process ζ will not be $\mathcal{F}^o_t$-optional. We must therefore modify it.

Similarly using Theorem A.29, we can construct an $\mathcal{F}_t$-previsible process $\bar{Y}_t$, and using Lemma A.34, we can find an $\mathcal{F}^o_t$-previsible process $Y_t$ which is indistinguishable from the process $\bar{Y}_t$.

Let $H = \{(t, \omega) : Y_t(\omega) \ne \zeta_t(\omega)\}$; then it follows by Theorem A.30, that this graph of differences is contained within a countable disjoint union of graphs of random variables. Thus by Lemma A.32 we may write H = K ∪ L such that for T any $\mathcal{F}^o_t$-stopping time P(ω : (T(ω), ω) ∈ L) = 0 and there exists a sequence of $\mathcal{F}^o_t$-stopping times $T_n$ such that

$$K \subset \bigcup_n \{(T_n(\omega), \omega) : \omega \in \Omega\}.$$

† A stopping time T is called announceable if there exists an announcing sequence $(T_n)_{n \ge 1}$ for T. This means that for any n ≥ 1 and ω ∈ Ω, $T_n(\omega) \le T_{n+1}(\omega) < T(\omega)$ and $T_n(\omega) \nearrow T(\omega)$. A stopping time T is announceable if and only if it is previsible. For details see Rogers and Williams [248].

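For each n let $Z_n$ be a version of $\mathbb{E}[X_{T_n} 1_{\{T_n<\infty\}} \mid \mathcal{F}^o_{T_n}]$; then we can define

$$Z_t(\omega) \triangleq \begin{cases} Y_t(\omega) & \text{if } (t, \omega) \notin \bigcup_n \{(T_n(\omega), \omega) : \omega \in \Omega\}, \\ Z_n(\omega) & \text{if } (t, \omega) \in \{(T_n(\omega), \omega) : \omega \in \Omega\}. \end{cases} \tag{A.22}$$

It is immediate that this $Z_t$ is $\mathcal{F}^o_t$-optional. Let us now show that it satisfies (A.21). Let T be an $\mathcal{F}^o_t$-optional time and let $A \in \mathcal{F}^o_T$. Set $A_n = A \cap \{T = T_n\}$; thus $A_n \in \mathcal{F}^o_{T_n}$. Let $B = A \setminus \bigcup_n A_n$ and thus $B \in \mathcal{F}^o_T$.

From the definition (A.22),

$$Z_T 1_{A_n} 1_{\{T<\infty\}} = Z_n 1_{A_n} 1_{\{T_n<\infty\}} = 1_{A_n} \mathbb{E}[1_{\{T_n<\infty\}} X_{T_n} \mid \mathcal{F}^o_{T_n}] = \mathbb{E}[X_{T_n} 1_{A_n} 1_{\{T_n<\infty\}} \mid \mathcal{F}^o_{T_n}] = \mathbb{E}[X_T 1_{A_n} 1_{\{T<\infty\}} \mid \mathcal{F}^o_T].$$

Consequently

$$\mathbb{E}[1_{A_n} Z_T 1_{\{T<\infty\}}] = \mathbb{E}\big[1_{A_n} \mathbb{E}[1_{\{T<\infty\}} X_T \mid \mathcal{F}^o_T]\big] = \mathbb{E}[1_{A_n} X_T 1_{\{T<\infty\}}].$$

So on $A_n$ the conditions are satisfied. Now consider B, on which a.s. $T \ne T_n$ for all n; hence (T(ω), ω) ∉ K. Since P((T(ω), ω) ∈ L) = 0, it follows that a.s. (T(ω), ω) ∉ H. Recalling the definition of H this implies that $Y_T(\omega) = \zeta_T(\omega)$ a.s.; from the definition (A.22) on B, $Z_T = Y_T$, thus

$$\mathbb{E}[1_B Z_T] = \mathbb{E}[1_B \zeta_T] = \mathbb{E}\big[1_B \mathbb{E}[X_T \mid \mathcal{F}^o_{T+}]\big] = \mathbb{E}[1_B X_T].$$

Thus on $A_n$ for each n and on B the process Z is an optional projection of X. The uniqueness argument using the optional section theorem is exactly analogous to that used in the proof of Theorem 2.7. □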

A.10 Convergence of Measure-valued Random Variables

Let (Ω, F, P) be a probability space and let $(\mu^n)_{n=1}^{\infty}$ be a sequence of random measures, $\mu^n : \Omega \to \mathcal{M}(S)$ and $\mu : \Omega \to \mathcal{M}(S)$ be another measure-valued random variable. In the following we define two types of convergence for sequences of measure-valued random variables:

1. $\lim_{n \to \infty} \mathbb{E}[|\mu^n f - \mu f|] = 0$ for all $f \in C_b(S)$.
2. $\lim_{n \to \infty} \mu^n = \mu$, P-a.s.

We call the first type of convergence convergence in expectation. If there exists an integrable random variable w : Ω → R such that $\mu^n(1) \le w$ for all n, then $\lim_{n \to \infty} \mu^n = \mu$, P-a.s., implies that $\mu^n$ converges to μ in expectation by the dominated convergence theorem. The extra condition is satisfied if $(\mu^n)_{n=1}^{\infty}$ is a sequence of random probability measures, since in this case, $\mu^n(1) = 1$ for all n. We also have the following.


Remark A.36. If $\mu^n$ converges in expectation to μ, then there exist sequences n(m) such that $\lim_{m \to \infty} \mu^{n(m)} = \mu$, P-a.s.

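Proof. Since $\mathcal{M}(S)$ is isomorphic to $(0,\infty) \times \mathcal{P}(S)$, with the isomorphism being given by

$$\nu \in \mathcal{M}(S) \mapsto (\nu(1), \nu/\nu(1)) \in (0,\infty) \times \mathcal{P}(S),$$

it follows from Theorem 2.18 that there exists a countable convergence determining set of functions†

$$\mathcal{M} \triangleq \{\varphi_0, \varphi_1, \varphi_2, \ldots\}, \tag{A.23}$$

where $\varphi_0$ is the constant function equal to 1 everywhere and $\varphi_i \in C_b(S)$ for any i > 0. Since

$$\lim_{n \to \infty} \mathbb{E}[|\mu^n f - \mu f|] = 0$$

for all $f \in \{\varphi_0, \varphi_1, \varphi_2, \ldots\}$ and the set $\{\varphi_0, \varphi_1, \varphi_2, \ldots\}$ is countable, one can find a subsequence n(m) such that, with probability one, $\lim_{m \to \infty} \mu^{n(m)} \varphi_i = \mu \varphi_i$ for all i ≥ 0, hence the claim. □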
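One way to make the subsequence selection explicit is a diagonal choice of indices combined with summability of the expected errors. The thresholds $2^{-m}$ below are a choice made for this sketch rather than part of the original proof; the indices n(m) exist because, for each m, only finitely many functions $\varphi_0, \ldots, \varphi_m$ are involved and each expected error tends to zero.

```latex
\begin{aligned}
&\text{Choose } n(1) < n(2) < \cdots \text{ such that }
  \mathbb{E}\big[\,|\mu^{n(m)}\varphi_i - \mu\varphi_i|\,\big] \le 2^{-m}
  \quad \text{for all } i \le m. \\
&\text{Then for each fixed } i:\quad
  \mathbb{E}\Big[\sum_{m \ge i} |\mu^{n(m)}\varphi_i - \mu\varphi_i|\Big]
  \le \sum_{m \ge i} 2^{-m} < \infty, \\
&\text{so } \sum_{m \ge i} |\mu^{n(m)}\varphi_i - \mu\varphi_i| < \infty
  \ \ \mathbb{P}\text{-a.s.},
  \quad\text{hence } \lim_{m\to\infty} \mu^{n(m)}\varphi_i = \mu\varphi_i
  \ \ \mathbb{P}\text{-a.s.}
\end{aligned}
```

The exceptional null set depends on i, but a countable union of null sets is null, so the convergence holds simultaneously for all i ≥ 0 with probability one.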

If a suitable bound on the rate of convergence for $\mathbb{E}[|\mu^n f - \mu f|]$ is known, then the sequence n(m) can be specified explicitly. For instance we have the following.

Remark A.37. Assume that there exists a countable convergence determining set $\mathcal{M}$ such that, for any $f \in \mathcal{M}$,

$$\mathbb{E}[|\mu^n f - \mu f|] \le \frac{c_f}{\sqrt{n}},$$

where $c_f$ is a positive constant independent of n, then $\lim_{m \to \infty} \mu^{m^3} = \mu$, P-a.s.

Proof. By Fatou’s lemma

E

[ ∞∑m=1

∣∣∣μm3f − μf

∣∣∣]≤ lim

n→∞

n∑m=1

E

[∣∣∣μm3f − μf

∣∣∣]

≤ cf

∞∑m=1

1m3/2

<∞.

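Hence

$$\sum_{m=1}^{\infty}\bigl|\mu^{m^3} f - \mu f\bigr| < \infty \quad \mathbb{P}\text{-a.s.},$$

therefore

$$\lim_{m\to\infty}\mu^{m^3} f = \mu f \quad \text{for any } f \in \mathcal{M}.$$

Since $\mathcal{M}$ is countable and convergence determining, it also follows that $\lim_{m\to\infty} \mu^{m^3} = \mu$, $\mathbb{P}$-a.s. □

† Recall that $\mathcal{M}$ is a convergence determining set if, for any sequence of finite measures $\nu^n$, $n = 1, 2, \ldots$ and $\nu$ being another finite measure for which $\lim_{n\to\infty} \nu^n f = \nu f$ for all $f \in \mathcal{M}$, it follows that $\lim_{n\to\infty} \nu^n = \nu$.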
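The role of the cubic subsequence in Remark A.37 can be read off from a short computation. The display below is a sketch of that arithmetic; the general exponent $k$ is our own illustrative notation and does not appear in the remark itself.

```latex
% Along the subsequence n(m) = m^k the rate bound of Remark A.37 becomes
\[
  \mathbb{E}\bigl[\,|\mu^{m^k} f - \mu f|\,\bigr]
  \;\le\; \frac{c_f}{\sqrt{m^k}} \;=\; c_f\, m^{-k/2},
\]
% and the series of bounds is summable exactly when the exponent exceeds one:
\[
  \sum_{m=1}^{\infty} m^{-k/2} < \infty
  \quad\Longleftrightarrow\quad \frac{k}{2} > 1
  \quad\Longleftrightarrow\quad k > 2 .
\]
% Hence k = 3 is the smallest integer power for which the argument applies.
```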

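Let $d : \mathcal{P}(S)\times\mathcal{P}(S) \to [0,\infty)$ be the metric defined in Theorem 2.19; that is, for $\mu, \nu \in \mathcal{P}(S)$,

$$d(\mu,\nu) = \sum_{i=1}^{\infty}\frac{|\mu\varphi_i - \nu\varphi_i|}{2^i},$$

where $\varphi_1, \varphi_2, \ldots$ are elements of $C_b(S)$ such that $\|\varphi_i\|_\infty = 1$, and let $\varphi_0 = \mathbf{1}$. We can extend $d$ to a metric on $\mathcal{M}(S)$ as follows:

$$d_{\mathcal{M}} : \mathcal{M}(S)\times\mathcal{M}(S) \to [0,\infty), \qquad d_{\mathcal{M}}(\mu,\nu) \triangleq \sum_{i=0}^{\infty}\frac{1}{2^i}|\mu\varphi_i - \nu\varphi_i|. \tag{A.24}$$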

The careful reader should check that $d_{\mathcal{M}}$ is a metric and that indeed $d_{\mathcal{M}}$ induces the weak topology on $\mathcal{M}(S)$. Using $d_{\mathcal{M}}$, the almost sure convergence 2. is equivalent to

2′. limn→∞ dM(μn, μ) = 0, P-a.s.

If there exists an integrable random variable w : Ω → R such that μn(1) ≤w for all n, then similarly, (1) implies

1′. limn→∞ E [dM(μn, μ)] = 0.

However, a stronger condition (such as tightness) must be imposed in order for condition (1) to be equivalent to condition (1′).

It is usually the case that convergence in expectation is easier to establish than almost sure convergence. However, if we have control on the higher moments of the error variables $\mu^n f - \mu f$ then we can deduce the almost sure convergence of $\mu^n$ to $\mu$. The following remark shows how this can be achieved and is used repeatedly in Chapters 8, 9 and 10.

Remark A.38. i. Assume that there exists a positive constant $p > 1$ and a countable convergence determining set $\mathcal{M}$ such that, for any $f \in \mathcal{M}$, we have

$$\mathbb{E}\bigl[|\mu^n f - \mu f|^{2p}\bigr] \le \frac{c_f}{n^p},$$

where $c_f$ is a positive constant independent of $n$. Then, for any $\varepsilon \in (0, 1/2 - 1/(2p))$ there exists a positive random variable $c_{f,\varepsilon}$, almost surely finite, such that

$$|\mu^n f - \mu f| \le \frac{c_{f,\varepsilon}}{n^{\varepsilon}}.$$

In particular, $\lim_{n\to\infty} \mu^n = \mu$, $\mathbb{P}$-a.s.


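ii. Similarly, assume that there exists a positive constant $p > 1$ and a countable convergence determining set $\mathcal{M}$ such that

$$\mathbb{E}\bigl[d_{\mathcal{M}}(\mu^n,\mu)^{2p}\bigr] \le \frac{c}{n^p},$$

where $d_{\mathcal{M}}$ is the metric defined in (A.24) and $c$ is a positive constant independent of $n$. Then, for any $\varepsilon \in (0, 1/2 - 1/(2p))$ there exists a positive random variable $c_\varepsilon$, almost surely finite, such that

$$|\mu^n f - \mu f| \le \frac{c_\varepsilon}{n^{\varepsilon}}, \quad \mathbb{P}\text{-a.s.}$$

In particular, $\lim_{n\to\infty} \mu^n = \mu$, $\mathbb{P}$-a.s.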

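Proof. As in the proof of Remark A.37,

$$\mathbb{E}\Bigl[\sum_{n=1}^{\infty} n^{2\varepsilon p}|\mu^n f - \mu f|^{2p}\Bigr] \le c_f\sum_{n=1}^{\infty}\frac{1}{n^{p-2\varepsilon p}} < \infty,$$

since $p - 2\varepsilon p > 1$. Let $c_{f,\varepsilon}$ be the random variable

$$c_{f,\varepsilon} = \Bigl(\sum_{n=1}^{\infty} n^{2\varepsilon p}|\mu^n f - \mu f|^{2p}\Bigr)^{1/2p}.$$

As $(c_{f,\varepsilon})^{2p}$ is integrable, $c_{f,\varepsilon}$ is almost surely finite and

$$n^{\varepsilon}|\mu^n f - \mu f| \le c_{f,\varepsilon}.$$

Therefore $\lim_{n\to\infty} \mu^n f = \mu f$ for any $f \in \mathcal{M}$. Again, since $\mathcal{M}$ is countable and convergence determining, it also follows that $\lim_{n\to\infty} \mu^n = \mu$, $\mathbb{P}$-a.s. Part (ii) of the remark follows in a similar manner. □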
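Two small steps in the proof of Remark A.38 are left implicit: why the admissible range of $\varepsilon$ makes the series converge, and why the single random constant controls every term. The following display is a sketch of both, using only the quantities defined in the remark.

```latex
% (1) The range of epsilon is exactly the summability condition:
\[
  \varepsilon < \frac12 - \frac{1}{2p}
  \;\Longleftrightarrow\; 2\varepsilon p < p - 1
  \;\Longleftrightarrow\; p - 2\varepsilon p > 1 .
\]
% (2) Each term of a series of non-negative terms is bounded by the whole sum:
\[
  n^{2\varepsilon p}\,|\mu^n f - \mu f|^{2p}
  \;\le\; \sum_{k=1}^{\infty} k^{2\varepsilon p}\,|\mu^k f - \mu f|^{2p}
  \;=\; (c_{f,\varepsilon})^{2p},
\]
% and taking 2p-th roots gives  n^{\varepsilon} |\mu^n f - \mu f| \le c_{f,\varepsilon}.
```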

A.11 Gronwall’s Lemma

An important and frequently used result in the theory of stochastic differential equations is Gronwall’s lemma.

Lemma A.39 (Gronwall). Let $x$, $y$ and $z$ be measurable non-negative functions on the real numbers. If $y$ is bounded and $z$ is integrable on $[0,T]$ for some $T \ge 0$, and for all $0 \le t \le T$,

$$x_t \le z_t + \int_0^t x_s y_s\,ds, \tag{A.25}$$

then for all $0 \le t \le T$,

$$x_t \le z_t + \int_0^t z_s y_s \exp\Bigl(\int_s^t y_r\,dr\Bigr)\,ds.$$


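Proof. Multiplying both sides of the inequality (A.25) by $y_t\exp\bigl(-\int_0^t y_s\,ds\bigr)$ yields

$$x_t y_t\exp\Bigl(-\int_0^t y_s\,ds\Bigr) - \Bigl(\int_0^t x_s y_s\,ds\Bigr) y_t\exp\Bigl(-\int_0^t y_s\,ds\Bigr) \le z_t y_t\exp\Bigl(-\int_0^t y_s\,ds\Bigr).$$

The left-hand side can be written as the derivative of a product,

$$\frac{d}{dt}\Bigl[\Bigl(\int_0^t x_s y_s\,ds\Bigr)\exp\Bigl(-\int_0^t y_s\,ds\Bigr)\Bigr] \le z_t y_t\exp\Bigl(-\int_0^t y_s\,ds\Bigr),$$

which can be integrated to give

$$\Bigl(\int_0^t x_s y_s\,ds\Bigr)\exp\Bigl(-\int_0^t y_s\,ds\Bigr) \le \int_0^t z_s y_s\exp\Bigl(-\int_0^s y_r\,dr\Bigr)\,ds,$$

or equivalently

$$\int_0^t x_s y_s\,ds \le \int_0^t z_s y_s\exp\Bigl(\int_s^t y_r\,dr\Bigr)\,ds.$$

Combining this with the original inequality (A.25) gives the desired result. □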

Corollary A.40. If $x$ is a real-valued function such that for all $t \ge 0$,

$$x_t \le A + B\int_0^t x_s\,ds,$$

then for all $t \ge 0$,

$$x_t \le A e^{Bt}.$$

Proof. We have for $t \ge 0$,

$$x_t \le A + \int_0^t AB e^{B(t-s)}\,ds \le A + AB e^{Bt}\,\frac{e^{-tB} - 1}{-B} = A e^{Bt}.$$

□
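As a sanity check on Lemma A.39 and Corollary A.40, the sketch below evaluates the Gronwall bound numerically and compares it with the closed form $Ae^{Bt}$ in the constant-coefficient case. The function name `gronwall_bound`, the midpoint quadrature and the particular values of `A`, `B` and `t` are illustrative assumptions, not part of the text.

```python
import math


def gronwall_bound(z, y, t, steps=2000):
    """Approximate z(t) + int_0^t z(s) y(s) exp(int_s^t y(r) dr) ds (midpoint rule)."""
    h = t / steps
    # cum[k] approximates int_0^{k h} y(r) dr
    cum = [0.0]
    for k in range(steps):
        cum.append(cum[-1] + y((k + 0.5) * h) * h)
    total = 0.0
    for k in range(steps):
        s = (k + 0.5) * h
        y_to_s = 0.5 * (cum[k] + cum[k + 1])  # int_0^s y, interpolated at the midpoint
        total += z(s) * y(s) * math.exp(cum[-1] - y_to_s) * h
    return z(t) + total


# Constant case of Corollary A.40 (hypothetical values): z = A, y = B.
A, B, t = 2.0, 0.7, 1.5
bound = gronwall_bound(lambda s: A, lambda s: B, t)
closed_form = A * math.exp(B * t)
print(bound, closed_form)
```

For constant coefficients the two numbers should agree up to quadrature error, which mirrors the computation in the proof of the corollary.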

A.12 Explicit Construction of the Underlying Sample Space for the Stochastic Filtering Problem

Let $(S, d)$ be a complete separable metric space (a Polish space) and $\Omega^1$ be the space of $S$-valued continuous functions defined on $[0,\infty)$, endowed with the topology of uniform convergence on compact intervals and with the associated Borel σ-algebra, denoted by $\mathcal{F}^1$,

$$\Omega^1 = C([0,\infty), S), \qquad \mathcal{F}^1 = \mathcal{B}(\Omega^1). \tag{A.26}$$

Let $X$ be an $S$-valued process defined on this space; $X_t(\omega^1) = \omega^1(t)$, $\omega^1 \in \Omega^1$. We observe that $X_t$ is measurable with respect to the σ-algebra $\mathcal{F}^1$ and consider the filtration associated with the process $X$,

$$\mathcal{F}^1_t = \sigma(X_s,\ s \in [0,t]). \tag{A.27}$$

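Let $A : C_b(S) \to C_b(S)$ be an unbounded operator with domain $D(A)$ with $\mathbf{1} \in D(A)$ and $A\mathbf{1} = 0$ and let $\mathbb{P}^1$ be a probability measure which is a solution of the martingale problem associated with the infinitesimal generator $A$ and the initial distribution $\pi_0 \in \mathcal{P}(S)$, i.e., under $\mathbb{P}^1$, the distribution of $X_0$ is $\pi_0$ and

$$M^{\varphi}_t = \varphi(X_t) - \varphi(X_0) - \int_0^t A\varphi(X_s)\,ds, \qquad \mathcal{F}^1_t, \quad 0 \le t < \infty, \tag{A.28}$$

is a martingale for any $\varphi \in D(A)$. Let also $\Omega^2$ be defined similarly to $\Omega^1$, but with $S = \mathbb{R}^m$. Hence

$$\Omega^2 = C([0,\infty), \mathbb{R}^m), \qquad \mathcal{F}^2 = \mathcal{B}(\Omega^2). \tag{A.29}$$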

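We consider also $V$ to be the canonical process in $\Omega^2$ (i.e. $V_t(\omega^2) = \omega^2(t)$, $\omega^2 \in \Omega^2$) and $\mathbb{P}^2$ to be a probability measure such that $V$ is an $m$-dimensional standard Brownian motion on $(\Omega^2, \mathcal{F}^2)$ with respect to it. We consider now the following:

$$\begin{aligned}
\Omega &\triangleq \Omega^1 \times \Omega^2,\\
\mathcal{F}' &\triangleq \mathcal{F}^1 \otimes \mathcal{F}^2,\\
\mathbb{P} &\triangleq \mathbb{P}^1 \otimes \mathbb{P}^2,\\
\mathcal{N} &\triangleq \{B \subset \Omega : B \subset A,\ A \in \mathcal{F},\ \mathbb{P}(A) = 0\},\\
\mathcal{F} &\triangleq \mathcal{F}' \vee \mathcal{N}.
\end{aligned}$$

So $(\Omega, \mathcal{F}, \mathbb{P})$ is a complete probability space and, under $\mathbb{P}$, $X$ and $V$ are two independent processes. They can be viewed as processes on the product space $(\Omega, \mathcal{F}, \mathbb{P})$ in the usual way: as projections onto their original spaces of definition. If $W$ is the canonical process on $\Omega$, then

$$\begin{aligned}
W(t) &= \omega(t) = (\omega^1(t), \omega^2(t)),\\
X &= p_1(\omega) \quad \text{where } p_1 : \Omega \to \Omega^1,\ p_1(\omega) = \omega^1,\\
V &= p_2(\omega) \quad \text{where } p_2 : \Omega \to \Omega^2,\ p_2(\omega) = \omega^2.
\end{aligned}$$

$M^{\varphi}_t$ is also a martingale with respect to the larger filtration $\mathcal{F}_t$, where

$$\mathcal{F}_t = \sigma(X_s, V_s,\ s \in [0,t]) \vee \mathcal{N}.$$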

Let $h : S \to \mathbb{R}^m$ be a Borel-measurable function with the property that

$$\mathbb{P}\Bigl(\int_0^T \|h(X_s)\|\,ds < \infty\Bigr) = 1 \quad \text{for all } T > 0.$$

Finally let $Y$ be the following stochastic process (usually called the observation process):

$$Y_t = \int_0^t h(X_s)\,ds + V_t, \qquad t \ge 0.$$
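To make the observation process concrete, here is a minimal discrete-time sketch of $Y_t = \int_0^t h(X_s)\,ds + V_t$ using an Euler scheme with scalar noise ($m = 1$). The helper name `simulate_observation`, the choice of a Brownian signal path and the sensor function `math.tanh` are all assumptions made for illustration; the text itself places no such restrictions on $X$ or $h$.

```python
import math
import random


def simulate_observation(h, signal_path, dt, rng):
    """Euler scheme for Y_t = int_0^t h(X_s) ds + V_t with a scalar Brownian motion V."""
    y = [0.0]
    for x in signal_path[:-1]:
        dv = rng.gauss(0.0, math.sqrt(dt))  # independent Brownian increment of V
        y.append(y[-1] + h(x) * dt + dv)
    return y


rng = random.Random(0)
dt, n = 0.01, 500

# Hypothetical signal: a Brownian path started at 0 (any continuous S-valued path would do).
x = [0.0]
for _ in range(n):
    x.append(x[-1] + rng.gauss(0.0, math.sqrt(dt)))

y = simulate_observation(math.tanh, x, dt, rng)
print(len(y), y[-1])
```

The signal increments and the noise increments are drawn separately, which reflects the independence of $X$ and $V$ under the product measure.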

B

Stochastic Analysis

B.1 Martingale Theory in Continuous Time

The subject of martingale theory is too large to cover in an appendix such as this. There are many useful references, for example, Rogers and Williams [248] or Doob [81].

Theorem B.1. If $M = \{M_t, t \ge 0\}$ is a right continuous martingale bounded in $L^p$ for $p \ge 1$, that is, $\sup_{t\ge 0}\mathbb{E}[|M_t|^p] < \infty$, then there exists an $L^p$-integrable random variable $M_\infty$ such that $M_t \to M_\infty$ almost surely as $t \to \infty$. Furthermore,

1. If $M$ is bounded in $L^p$ for $p > 1$, then $M_t \to M_\infty$ in $L^p$ as $t \to \infty$.
2. If $M$ is bounded in $L^1$ and $\{M_t, t \ge 0\}$ is uniformly integrable then $M_t \to M_\infty$ in $L^1$ as $t \to \infty$.

If either condition (1) or (2) holds then the extended process $\{M_t, t \in [0,\infty]\}$ is a martingale.

For a proof see Theorem 1.5 of Chung and Williams [53].

The following lemma provides a very useful test for identifying martingales.

Lemma B.2. Let $M = \{M_t, t \ge 0\}$ be a cadlag adapted process such that for each bounded stopping time $T$, $\mathbb{E}[|M_T|] < \infty$ and $\mathbb{E}[M_T] = \mathbb{E}[M_0]$. Then $M$ is a martingale.

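Proof. For $s < t$ and $A \in \mathcal{F}_s$ define

$$T(\omega) \triangleq \begin{cases} s & \text{if } \omega \in A,\\ t & \text{if } \omega \in A^c. \end{cases}$$

Then $T$ is a stopping time and

$$\mathbb{E}[M_0] = \mathbb{E}[M_T] = \mathbb{E}[M_s 1_A] + \mathbb{E}[M_t 1_{A^c}],$$

and trivially for the stopping time $t$,

$$\mathbb{E}[M_0] = \mathbb{E}[M_t] = \mathbb{E}[M_t 1_A] + \mathbb{E}[M_t 1_{A^c}],$$

so $\mathbb{E}[M_t 1_A] = \mathbb{E}[M_s 1_A]$, which implies that $M_s = \mathbb{E}[M_t \mid \mathcal{F}_s]$ a.s., which together with the integrability condition implies $M$ is a martingale. □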

By a straightforward change to this proof the following corollary may be established.

Corollary B.3. Let $\{M_t, t \ge 0\}$ be a cadlag adapted process such that for each stopping time (potentially infinite) $T$, $\mathbb{E}[|M_T|] < \infty$ and $\mathbb{E}[M_T] = \mathbb{E}[M_0]$. Then $M$ is a uniformly integrable martingale.

Definition B.4. Let $M$ be a stochastic process. If $M_0$ is $\mathcal{F}_0$-measurable and there exists an increasing sequence $T_n$ of stopping times such that $T_n \to \infty$ a.s. and such that

$$M^{T_n} = \{M_{t\wedge T_n} - M_0,\ t \ge 0\}$$

is an $\mathcal{F}_t$-adapted martingale for each $n \in \mathbb{N}$, then $M$ is called a local martingale and the sequence $T_n$ is called a reducing sequence for the local martingale $M$.

The initial condition $M_0$ is treated separately to avoid imposing integrability conditions on $M_0$.

B.2 Ito Integral

The stochastic integrals which arise in this book are the integrals of stochastic processes with respect to continuous local martingales. The following section contains a very brief overview of the construction of the Ito integral in this context and the necessary conditions on the integrands for the integral to be well defined.

The results are presented starting from the previsible integrands, since in the general theory of stochastic integration these form the natural class of integrands. The results then extend in the case of continuous martingale integrators to integrands in the class of progressively measurable processes and, if the quadratic variation of the continuous martingale is absolutely continuous with respect to Lebesgue measure (as for example in the case of integrals with respect to Brownian motion), then this extends further to all adapted, jointly measurable processes. It is also possible to construct directly the stochastic integral with a continuous martingale integrator on the space of progressively measurable processes (this approach is followed in e.g. Ethier and Kurtz [95]).

There are numerous references which describe the material in this section in much greater detail; examples include Chung and Williams [53], Karatzas and Shreve [149], Protter [247] and Dellacherie and Meyer [79].


Definition B.5. The previsible (predictable) σ-algebra denoted $\mathcal{P}$ is the σ-algebra of subsets of $[0,\infty)\times\Omega$ generated by left continuous processes valued in $\mathbb{R}$; that is, it is the smallest σ-algebra with respect to which all left continuous processes are measurable. A process is said to be previsible if it is $\mathcal{P}$-measurable.

Lemma B.6. Let $\mathcal{A}$ be the ring† of subsets of $[0,\infty)\times\Omega$ generated by the sets of the form $\{(s,t]\times A\}$ where $A \in \mathcal{F}_s$ and $0 \le s < t$ and the sets $\{0\}\times A$ for $A \in \mathcal{F}_0$. Then $\sigma(\mathcal{A}) = \mathcal{P}$.

Proof. It suffices to show that any adapted left continuous process (as a generator of $\mathcal{P}$) can be approximated by finite linear combinations of indicator functions of elements of $\mathcal{A}$. Let $H$ be a bounded adapted left continuous process; define

$$H_t = \lim_{k\to\infty}\lim_{n\to\infty}\sum_{i=2}^{nk} H_{(i-1)/n}\,1_{((i-1)/n,\ i/n]}(t).$$

As $H_t$ is adapted it follows that $H_{(i-1)/n} \in \mathcal{F}_{(i-1)/n}$, thus each term in the sum is $\mathcal{A}$-measurable, and therefore by linearity so is the whole sum. □

Definition B.7. Define the vector space of elementary functions $\mathcal{E}$ to be the space of finite linear combinations of indicator functions of elements of $\mathcal{A}$.

Definition B.8. For the indicator function $X = 1_{\{(s,t]\times A\}}$ for $A \in \mathcal{F}_s$, which is an element of $\mathcal{E}$, we can define the stochastic integral

$$\int_0^\infty X_r\,dM_r \triangleq 1_A(M_t - M_s).$$

For $X = 1_{\{0\}\times A}$ where $A \in \mathcal{F}_0$, define the integral to be identically zero. This definition can be extended by linearity to the space of elementary functions $\mathcal{E}$. Further define the integral between 0 and $t$ by

$$\int_0^t X_r\,dM_r \triangleq \int_0^\infty 1_{[0,t]}(r)X_r\,dM_r.$$
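The definition of the integral for elementary integrands is purely algebraic, so it can be sketched for a single sample path on an integer time grid. In the snippet below the names `elementary_integral` and `simple_integral`, the grid-index convention and the sample values are hypothetical; `in_A` stands for the value of $1_A(\omega)$ on the chosen path.

```python
def elementary_integral(path, s, t, in_A, upto):
    """Integral of X = 1_{(s,t] x A} against M over [0, upto] on one sample path.

    path[k] is the value of M at grid index k; s, t and upto are grid indices.
    Returns 1_A * (M_{t ^ upto} - M_{s ^ upto}).
    """
    if not in_A:
        return 0.0
    return path[min(t, upto)] - path[min(s, upto)]


def simple_integral(path, pieces, upto):
    """Extend by linearity: pieces is a list of (coefficient, s, t, in_A)."""
    return sum(c * elementary_integral(path, s, t, a, upto) for c, s, t, a in pieces)


# One hypothetical sample path of M on the grid 0, 1, 2, 3, 4.
path = [0.0, 1.0, -0.5, 2.0, 1.5]
pieces = [(2.0, 0, 2, True), (-1.0, 2, 4, True), (5.0, 1, 3, False)]
value = simple_integral(path, pieces, upto=3)
print(value)  # 2*(-0.5 - 0.0) - (2.0 - (-0.5)) + 0 = -3.5
```

Truncating with `min(., upto)` plays the role of the factor $1_{[0,t]}(r)$ in the definition of the integral between 0 and $t$.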

Lemma B.9. If $M$ is a martingale and $X \in \mathcal{E}$ then $\int_0^t X_r\,dM_r$ is an $\mathcal{F}_t$-adapted martingale.

Proof. Consider $X_t = 1_A 1_{(r,s]}(t)$ where $A \in \mathcal{F}_r$. From Definition B.8,

$$\int_0^t X_p\,dM_p = \int_0^\infty 1_{[0,t]}(p)X_p\,dM_p = 1_A(M_{s\wedge t} - M_{r\wedge t}),$$

and hence as $M$ is a martingale and $A \in \mathcal{F}_r$, then by considering separately the cases $0 \le p \le r$, $r < p \le s$ and $p > s$ it follows that

$$\mathbb{E}\Bigl[\int_0^t X_r\,dM_r \,\Big|\, \mathcal{F}_p\Bigr] = \mathbb{E}\bigl[1_A(M_{s\wedge t} - M_{r\wedge t}) \mid \mathcal{F}_p\bigr] = 1_A(M_{s\wedge p} - M_{r\wedge p}) = \int_0^p X_s\,dM_s.$$

By linearity, this result extends to $X \in \mathcal{E}$. □

† A ring is a class of subsets closed under finite unions and set differences $A \setminus B$ and which contains the empty set.

B.2.1 Quadratic Variation

The total variation is the variation which is used in the construction of the usual Lebesgue–Stieltjes integral. This cannot be used to define a non-trivial stochastic integral, as any continuous local martingale of finite variation is indistinguishable from zero.

Definition B.10. The quadratic variation process† $\langle M\rangle_t$ of a continuous square integrable martingale $M$ is a continuous increasing process $A_t$ starting from zero such that $M_t^2 - A_t$ is a martingale.

Theorem B.11. If $M$ is a continuous square integrable martingale then the quadratic variation process $\langle M\rangle_t$ exists and is unique.

The following proof is based on Theorem 4.3 of Chung and Williams [53], who attribute the argument to M. J. Sharpe.

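Proof. Without loss of generality consider a martingale starting from zero. The result is first proved for a martingale which is bounded by $C$. For given $n \in \mathbb{N}$, define $t^n_j \triangleq j2^{-n}$ and $\hat{t}^n_j \triangleq t \wedge t^n_j$ for $j \in \mathbb{N}$ and

$$S^n_t \triangleq \sum_{j=0}^{\infty}\bigl(M_{\hat{t}^n_{j+1}} - M_{\hat{t}^n_j}\bigr)^2.$$

By rearrangement of terms in the summations

$$M_t^2 = \sum_{k=0}^{\infty}\bigl(M^2_{\hat{t}^n_{k+1}} - M^2_{\hat{t}^n_k}\bigr) = 2\sum_{k=0}^{\infty} M_{\hat{t}^n_k}\bigl(M_{\hat{t}^n_{k+1}} - M_{\hat{t}^n_k}\bigr) + \sum_{k=0}^{\infty}\bigl(M_{\hat{t}^n_{k+1}} - M_{\hat{t}^n_k}\bigr)^2.$$

Therefore

$$S^n_t = M_t^2 - 2\sum_{k=0}^{\infty} M_{\hat{t}^n_k}\bigl(M_{\hat{t}^n_{k+1}} - M_{\hat{t}^n_k}\bigr). \tag{B.1}$$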

† Technically, if we were to study discontinuous processes what is being constructed here should be denoted $[M]_t$. The process $\langle M\rangle_t$, when it exists, is the dual previsible projection of $[M]_t$. In the continuous case, the two processes coincide, and historical precedent makes $\langle M\rangle_t$ the more common notation.


For fixed $n$ and $t$ the summation in (B.1) contains a finite number of non-zero terms each of which is a continuous martingale. It therefore follows that $S^n_t - M_t^2$ is a continuous martingale for each $n$.

It is now necessary to show that as $n \to \infty$, for fixed $t$, the sequence $\{S^n_t, n \in \mathbb{N}\}$ is a Cauchy sequence and therefore converges in $L^2$. If we consider fixed $m < n$ and for notational convenience write $t_j$ for $t^n_j$, then it is possible to relate the points on the two dyadic meshes by setting $t'_j = 2^{-m}[t_j 2^m]$ and $\hat{t}'_j \triangleq t \wedge t'_j$; that is, $t'_j$ is the closest point on the coarser mesh to the left of $t_j$. It follows from (B.1) that

$$S^n_t - S^m_t = -2\sum_{j=0}^{[2^n t]}\bigl(M_{t_j} - M_{t'_j}\bigr)\bigl(M_{t_{j+1}} - M_{t_j}\bigr). \tag{B.2}$$

Define $Z_j \triangleq M_{t_j} - M_{t'_j}$; as $t'_j \le t_j$ it follows that $Z_j$ is $\mathcal{F}_{t_j}$-measurable. For $j < k$, since $Z_j(M_{t_{j+1}} - M_{t_j})Z_k$ is $\mathcal{F}_{t_k}$-measurable it follows that

$$\mathbb{E}\bigl[Z_j\bigl(M_{t_{j+1}} - M_{t_j}\bigr)Z_k\bigl(M_{t_{k+1}} - M_{t_k}\bigr)\bigr] = 0. \tag{B.3}$$

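Hence using (B.3) and the Cauchy–Schwarz inequality

$$\begin{aligned}
\mathbb{E}\bigl[(S^n_t - S^m_t)^2\bigr] &= 4\,\mathbb{E}\Bigl[\sum_{j=0}^{[2^n t]} Z_j^2\bigl(M_{t_{j+1}} - M_{t_j}\bigr)^2\Bigr]\\
&\le 4\,\mathbb{E}\Bigl[\sup_{\substack{0\le r\le s\le t\\ s-r<2^{-m}}}(M_r - M_s)^2\sum_{j=0}^{[2^n t]}\bigl(M_{t_{j+1}} - M_{t_j}\bigr)^2\Bigr]\\
&\le 4\sqrt{\mathbb{E}\Bigl[\Bigl(\sup_{\substack{0\le r\le s\le t\\ s-r<2^{-m}}}(M_r - M_s)^2\Bigr)^2\Bigr]}\times\sqrt{\mathbb{E}\Bigl[\Bigl(\sum_{j=0}^{[2^n t]}\bigl(M_{t_{j+1}} - M_{t_j}\bigr)^2\Bigr)^2\Bigr]}.
\end{aligned}$$

The first term tends to zero using the fact that $M$, being continuous, is uniformly continuous on the bounded time interval $[0,t]$. It remains to show that the second term is bounded. Write $a_j \triangleq (M_{t_{j+1}} - M_{t_j})^2$, for $j \in \mathbb{N}$; then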


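$$\begin{aligned}
\mathbb{E}\Bigl[\Bigl(\sum_{j=0}^{[2^n t]}\bigl(M_{t_{j+1}} - M_{t_j}\bigr)^2\Bigr)^2\Bigr] &= \mathbb{E}\Bigl[\Bigl(\sum_{j=0}^{[2^n t]} a_j\Bigr)^2\Bigr]\\
&= \mathbb{E}\Bigl[\sum_{j=0}^{[2^n t]} a_j^2 + 2\sum_{j=0}^{[2^n t]} a_j\sum_{k=j+1}^{[2^n t]} a_k\Bigr]\\
&= \mathbb{E}\Bigl[\sum_{j=0}^{[2^n t]} a_j^2\Bigr] + 2\,\mathbb{E}\Bigl[\sum_{j=0}^{[2^n t]} a_j\,\mathbb{E}\Bigl[\sum_{k=j+1}^{[2^n t]} a_k \,\Big|\, \mathcal{F}_{t_{j+1}}\Bigr]\Bigr].
\end{aligned}$$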

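It is clear that since the $a_j$s are non-negative and $M$ is bounded by $C$ that

$$\sum_{j=0}^{[2^n t]} a_j^2 \le \max_{l=0,\ldots,[2^n t]} a_l\sum_{j=0}^{[2^n t]} a_j \le 4C^2\sum_{j=0}^{[2^n t]} a_j$$

and

$$\begin{aligned}
\mathbb{E}\Bigl[\sum_{k=j+1}^{[2^n t]} a_k \,\Big|\, \mathcal{F}_{t_{j+1}}\Bigr] &= \sum_{k=j+1}^{\infty}\mathbb{E}\bigl[\bigl(M_{t_{k+1}} - M_{t_k}\bigr)^2 \mid \mathcal{F}_{t_{j+1}}\bigr]\\
&= \sum_{k=j+1}^{\infty}\mathbb{E}\bigl[M^2_{t_{k+1}} - M^2_{t_k} \mid \mathcal{F}_{t_{j+1}}\bigr]\\
&= \mathbb{E}\bigl[M_t^2 - M^2_{t_{j+1}} \mid \mathcal{F}_{t_{j+1}}\bigr] \le C^2.
\end{aligned}$$

From these two bounds

$$\mathbb{E}\Bigl[\Bigl(\sum_{j=0}^{[2^n t]}\bigl(M_{t_{j+1}} - M_{t_j}\bigr)^2\Bigr)^2\Bigr] \le (4C^2 + 2C^2)\,\mathbb{E}\Bigl[\sum_{j=0}^{[2^n t]} a_j\Bigr] = 6C^2\,\mathbb{E}\bigl[M_t^2\bigr] < \infty.$$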

As this bound holds uniformly in $n$, $m$, as $n$ and $m \to \infty$ it follows that $S^n_t - S^m_t \to 0$ in the $L^2$ sense and hence the sequence $\{S^n_t, n \in \mathbb{N}\}$ converges in $L^2$ to a limit which we denote $S_t$. As the martingale property is preserved by $L^2$ limits, it follows that $\{M_t^2 - S_t, t \ge 0\}$ is a martingale.

It is necessary to show that $S_t$ is increasing. Let $s < t$,

$$S_t - S_s = \lim_{n\to\infty}(S^n_t - S^n_s) \quad \text{in } L^2.$$

Then writing $k \triangleq \inf\{j : t_j > s\}$,

$$S^n_t - S^n_s = \sum_{t_j > s}\bigl(M_{t_{j+1}} - M_{t_j}\bigr)^2 + \bigl(M_{t_k} - M_{t_{k-1}}\bigr)^2 - \bigl(M_s - M_{t_{k-1}}\bigr)^2.$$


Clearly

$$\Bigl|\bigl(M_{t_k} - M_{t_{k-1}}\bigr)^2 - \bigl(M_s - M_{t_{k-1}}\bigr)^2\Bigr| \le 2\sup_{\substack{0\le r\le u\le t\\ u-r<2^{-n}}}(M_r - M_u)^2,$$

where the bound on the right-hand side tends to zero in $L^2$ as $n \to \infty$. Therefore in $L^2$

$$S_t - S_s = \lim_{n\to\infty}\sum_{t_j > s}\bigl(M_{t_{j+1}} - M_{t_j}\bigr)^2$$

and hence $S_t - S_s \ge 0$ almost surely, so the process $S$ is a.s. increasing.

It remains to show that a version of $S_t$ can be chosen which is almost surely continuous. By Doob’s $L^2$-inequality applied to the martingale (B.2) it follows that

$$\mathbb{E}\Bigl[\sup_{t\le a}|S^n_t - S^m_t|^2\Bigr] \le 4\,\mathbb{E}\bigl[(S^n_a - S^m_a)^2\bigr];$$

thus a suitable subsequence $n_k$ can be chosen such that $S^{n_k}_t$ converges a.s. uniformly on compact time intervals to a limit $S$ which from the continuity of $M$ must be continuous a.s.

Uniqueness follows from the result that a continuous local martingale of finite variation is everywhere zero. Suppose the process $A$ in the above definition were not unique. That is, suppose that also for some $B_t$ continuous increasing from zero, $M_t^2 - B_t$ is a martingale. Then as $M_t^2 - A_t$ is also a martingale, by subtracting these two equations we get that $A_t - B_t$ is a martingale, null at zero. It clearly must have finite variation, and hence be zero.

To extend to the general case where the martingale $M$ is not bounded use a sequence of stopping times

$$T_n \triangleq \inf\{t \ge 0 : |M_t| > n\};$$

then $\{M^{T_n}_t, t \ge 0\}$ is a bounded martingale to which the proof can be applied to construct $\langle M^{T_n}\rangle$. By uniqueness it follows that $\langle M^{T_n}\rangle$ and $\langle M^{T_{n+1}}\rangle$ agree on $[0, T_n]$ so a process $\langle M\rangle$ may be defined. □
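The dyadic sums $S^n_t$ used in the proof of Theorem B.11 are easy to experiment with. The sketch below simulates one Brownian path on a dyadic grid (Brownian motion is our illustrative choice of $M$; it is not bounded, so this is only a heuristic companion to the proof), checks the algebraic identity (B.1) on that path, and compares $S^n_t$ with $\langle B\rangle_t = t$. The function names are hypothetical.

```python
import math
import random


def dyadic_sum(path):
    """S^n_t: the sum of squared increments of the path along the grid."""
    return sum((b - a) ** 2 for a, b in zip(path, path[1:]))


def b1_right_side(path):
    """M_t^2 - 2 * sum_k M_{t_k} (M_{t_{k+1}} - M_{t_k}), the right-hand side of (B.1)."""
    cross = sum(a * (b - a) for a, b in zip(path, path[1:]))
    return path[-1] ** 2 - 2.0 * cross


rng = random.Random(1)
n, t = 14, 1.0
steps = 2 ** n          # number of dyadic intervals of length 2^-n in [0, t]
dt = t / steps

path = [0.0]            # the martingale starts from zero, as in the proof
for _ in range(steps):
    path.append(path[-1] + rng.gauss(0.0, math.sqrt(dt)))

s_n = dyadic_sum(path)
print(s_n, b1_right_side(path))
```

The identity (B.1) holds path by path for any sequence of values starting at zero, whereas the closeness of $S^n_t$ to $t$ is a statistical statement that improves as $n$ grows.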

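Definition B.12. Define a measure on $([0,\infty)\times\Omega, \mathcal{P})$ in terms of the quadratic variation of $M$ via

$$\mu_M(A) \triangleq \mathbb{E}\Bigl[\int_0^\infty 1_A(s,\omega)\,d\langle M\rangle_s\Bigr]. \tag{B.4}$$

In terms of this measure we can define an associated norm on a $\mathcal{P}$-measurable process $X$ via

$$\|X\|_M \triangleq \int_{[0,\infty)\times\Omega} X^2\,d\mu_M. \tag{B.5}$$

This norm can be written using (B.4) and (B.5) more simply as

$$\|X\|_M = \mathbb{E}\Bigl[\int_0^\infty X_s^2\,d\langle M\rangle_s\Bigr].$$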

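Definition B.13. Define $\mathcal{L}^2_{\mathcal{P}} \triangleq \{X \in \mathcal{P} : \|X\|_M < \infty\}$.

This space $\mathcal{L}^2_{\mathcal{P}}$ with associated norm $\|\cdot\|_M$ is a Banach space. Denote by $L^2_{\mathcal{P}}$ the space of equivalence classes of elements of $\mathcal{L}^2_{\mathcal{P}}$, where we consider the equivalence class of an element $X$ to be all those elements $Y \in \mathcal{L}^2_{\mathcal{P}}$ which satisfy $\|X - Y\|_M = 0$.

Lemma B.14. The space of bounded elements of $\mathcal{E}$, which we denote $\bar{\mathcal{E}}$, is dense in the subspace of bounded functions in $L^2_{\mathcal{P}}$.

Proof. This is a classical monotone class theorem proof which explains the requirement to work within spaces of bounded functions. Define

$$\mathcal{C} = \bigl\{H \in \mathcal{P} : H \text{ is bounded},\ \forall\varepsilon > 0\ \exists J \in \bar{\mathcal{E}} : \|H - J\|_M < \varepsilon\bigr\}.$$

It is clear that $\bar{\mathcal{E}} \subset \mathcal{C}$. Thus it also follows that the constant function one is included in $\mathcal{C}$. The fact that $\mathcal{C}$ is a vector space is immediate. It remains to verify that if $H_n \uparrow H$ where $H_n \in \mathcal{C}$ with $H$ bounded that this implies $H \in \mathcal{C}$.

Fix $\varepsilon > 0$. By the bounded convergence theorem for Stieltjes integrals, it follows that $\|H_n - H\|_M \to 0$ as $n \to \infty$; thus we can find $N$ such that for $n \ge N$, $\|H_n - H\|_M < \varepsilon/2$. As $H_N \in \mathcal{C}$, it follows that there exists $J \in \bar{\mathcal{E}}$ such that $\|J - H_N\|_M < \varepsilon/2$. Thus by the triangle inequality $\|H - J\|_M \le \|H - H_N\|_M + \|H_N - J\|_M < \varepsilon$. Hence by the monotone class theorem $\sigma(\bar{\mathcal{E}}) \subset \mathcal{C}$. □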

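Lemma B.15. For $X \in \mathcal{E}$ it follows that

$$\mathbb{E}\Bigl[\Bigl(\int_0^\infty X_r\,dM_r\Bigr)^2\Bigr] = \|X\|_M.$$

Proof. Consider $X = 1_{(s,t]\times A}$ where $A \in \mathcal{F}_s$ and $s < t$. Then

$$\begin{aligned}
\mathbb{E}\Bigl[\Bigl(\int_0^\infty X_r\,dM_r\Bigr)^2\Bigr] &= \mathbb{E}\Bigl[\Bigl(\int_0^\infty 1_{(s,t]}(r)1_A\,dM_r\Bigr)^2\Bigr]\\
&= \mathbb{E}\bigl[(M_t - M_s)^2 1_A\bigr]\\
&= \mathbb{E}\bigl[1_A\bigl(M_t^2 - 2M_tM_s + M_s^2\bigr)\bigr]\\
&= \mathbb{E}\bigl[1_A\bigl(M_t^2 + M_s^2\bigr)\bigr] - 2\,\mathbb{E}\bigl[1_A\,\mathbb{E}[M_tM_s \mid \mathcal{F}_s]\bigr]\\
&= \mathbb{E}\bigl[1_A\bigl(M_t^2 - M_s^2\bigr)\bigr].
\end{aligned}$$

Then from the definition of $\mu_M$ it follows that

$$\mu_M((s,t]\times A) = \mathbb{E}\bigl[1_A(\langle M\rangle_t - \langle M\rangle_s)\bigr].$$

We know $M_t^2 - \langle M\rangle_t$ is a local martingale, so it follows that

$$\mu_M((s,t]\times A) = \mathbb{E}\Bigl[\Bigl(\int_0^\infty 1_{(s,t]}(r)1_A\,dM_r\Bigr)^2\Bigr]$$

and by linearity this extends to functions in $\mathcal{E}$. □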
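The isometry of Lemma B.15 can be checked by simulation in a simple special case. The sketch below assumes $M$ is a standard Brownian motion $B$ (so $\langle B\rangle_t = t$) and takes the hypothetical event $A = \{B_s > 0\} \in \mathcal{F}_s$, for which $\|X\|_M = \mathbb{P}(A)(t - s) = (t - s)/2$. The function name and the sample size are illustrative choices.

```python
import math
import random


def isometry_check(s, t, samples, rng):
    """Monte Carlo estimate of E[(1_A (B_t - B_s))^2] with A = {B_s > 0}."""
    total = 0.0
    for _ in range(samples):
        b_s = rng.gauss(0.0, math.sqrt(s))          # B_s ~ N(0, s)
        incr = rng.gauss(0.0, math.sqrt(t - s))     # B_t - B_s, independent of F_s
        if b_s > 0.0:                               # the indicator 1_A
            total += incr ** 2
    return total / samples


rng = random.Random(2)
s, t = 0.5, 1.5
estimate = isometry_check(s, t, 20000, rng)
exact = 0.5 * (t - s)   # mu_M((s,t] x A) = P(A) * (t - s) for Brownian motion
print(estimate, exact)
```

The estimate fluctuates around the exact value with a standard error of order $1/\sqrt{\text{samples}}$, so only approximate agreement should be expected.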

As a consequence of Lemma B.14 it follows that given any bounded $X \in L^2_{\mathcal{P}}$ we can construct an approximating sequence $X^n \in \mathcal{E}$ such that $\|X^n - X\|_M \to 0$ as $n \to \infty$. Using Lemma B.15 it follows that $\int_0^\infty X^n_s\,dM_s$ is a Cauchy sequence in the $L^2$ sense; thus we can make the following definition.

Definition B.16. For $X \in L^2_{\mathcal{P}}$ we may define the Ito integral in the $L^2$ sense through the isometry

$$\mathbb{E}\Bigl[\Bigl(\int_0^\infty X_r\,dM_r\Bigr)^2\Bigr] = \|X\|_M. \tag{B.6}$$

We must check that this extension of the stochastic integral is well defined. That is, consider another approximating sequence $Y^n \to X$; we must show that this converges to the same limit as the sequence $X^n$ considered previously, but this is immediate from the isometry.

Remark B.17. From the above definition of the stochastic integral in an $L^2$ sense as a limit of approximations $\int_0^\infty X^n_r\,dM_r$, it follows that since convergence in $L^2$ implies convergence in probability we can also define the extension of the stochastic integral as a limit in probability. By a standard result, there exists a subsequence $n_k$ such that $\int_0^\infty X^{n_k}_r\,dM_r$ converges a.s. as $k \to \infty$. It might appear that this would lead to a pathwise extension (i.e. a definition for each $\omega$). However, this a.s. limit is not well defined: different choices of approximating sequence can give rise to limits which differ on (potentially different) null sets. As there are an uncountable number of possible approximating sequences the union of these null sets may not be null and thus the limit is not well defined.

The following theorem finds numerous applications throughout the book, usually to show that the expectation of a particular stochastic integral term is 0.

Theorem B.18. If $X \in L^2_{\mathcal{P}}$ and $M$ is a square integrable martingale then $\int_0^t X_s\,dM_s$ is a martingale.

Proof. Let $X^n \in \mathcal{E}$ be a sequence converging to $X$ in the $\|\cdot\|_M$ norm; then by Lemma B.9 each $\int_0^t X^n_s\,dM_s$ is a martingale. By the Ito isometry $\int_0^t X^n_s\,dM_s$ converges to $\int_0^t X_s\,dM_s$ in $L^2$ and the martingale property is preserved by $L^2$ limits. □


B.2.2 Continuous Integrator

The foregoing arguments cannot be used to extend the definition of the stochastic integral to integrands outside of the class of previsible processes. For example, the previsible processes do not form a dense set in the space of progressively measurable processes so approximation arguments cannot be used to extend the definition to progressively measurable integrands. The approach taken here is based on Chung and Williams [53].

Let $\tilde{\mu}_M$ be a measure on $[0,\infty)\times\Omega$ which is an extension of $\mu_M$ (that is, $\mu_M$ and $\tilde{\mu}_M$ agree on $\mathcal{P}$ and $\tilde{\mu}_M$ is defined on a larger σ-algebra than $\mathcal{P}$).

Given a process $X$ which is $\mathcal{B}\times\mathcal{F}$-measurable, if there is a previsible process $Z$ such that

$$\int_{[0,\infty)\times\Omega}(X - Z)^2\,d\tilde{\mu}_M = 0, \tag{B.7}$$

which, by the usual Lebesgue argument, is equivalent to

$$\tilde{\mu}_M\bigl((t,\omega) : X_t(\omega) \ne Z_t(\omega)\bigr) = 0,$$

then we may define $\int_0^\infty X_s\,dM_s \triangleq \int_0^\infty Z_s\,dM_s$. In general we cannot hope to find such a $Z$ for all $\mathcal{B}\times\mathcal{F}$-measurable $X$. However, in the case where the integrator $M$ is continuous we can find such a previsible $Z$ for all progressively measurable $X$.

Let $\mathcal{N}$ be the set of $\tilde{\mu}_M$ null sets and define $\bar{\mathcal{P}} = \mathcal{P} \vee \mathcal{N}$; then it follows that for $X$ a $\bar{\mathcal{P}}$-measurable process, we can find a process $Z$ in $\mathcal{P}$ such that $\tilde{\mu}_M((t,\omega) : X_t(\omega) \ne Z_t(\omega)) = 0$. Hence (B.7) will hold and consequently we may define $\int_0^\infty X_s\,dM_s \triangleq \int_0^\infty Z_s\,dM_s$. The following theorem is an important application of this result.

Theorem B.19. Let $M$ be a continuous martingale. Then if $X$ is progressively measurable we can define the integral of $X$ with respect to $M$ in the Ito sense through the extension of the isometry

$$\mathbb{E}\Bigl[\Bigl(\int_0^\infty X_s\,dM_s\Bigr)^2\Bigr] = \mathbb{E}\Bigl[\int_0^\infty X_s^2\,d\mu_M\Bigr].$$

Proof. From the foregoing remarks, it is clear that it is sufficient to show that every progressively measurable process $X$ is $\bar{\mathcal{P}}$-measurable, where $\bar{\mathcal{P}} = \mathcal{P} \vee \mathcal{N}$. There are two approaches to establishing this: one is direct via the previsible projection and the other indirect via the optional projection. In either case, the result of Lemma B.21 is established, and the conclusion of the theorem follows. □

Optional Projection Route

We begin with a measurability result which we need in the proof of the main result in this section.

Lemma B.20. If $X$ is progressively measurable and $T$ is a stopping time, then $X_T 1_{\{T<\infty\}}$ is $\mathcal{F}_T$-measurable.

Proof. For fixed $t$ the map $(s,\omega) \mapsto X(s,\omega)$ defined on $[0,t]\times\Omega$ is $\mathcal{B}[0,t]\otimes\mathcal{F}_t$-measurable. Since $T$ is a stopping time, $\omega \mapsto T(\omega)\wedge t$ is $\mathcal{F}_t$-measurable. By composition of functions† it follows that $\omega \mapsto X(T(\omega)\wedge t, \omega)$ is $\mathcal{F}_t$-measurable. Now define $Y = X_T 1_{\{T<\infty\}}$; for any $t$ it is clear that $Y 1_{\{T\le t\}} = X_{T\wedge t}1_{\{T\le t\}}$. Hence on $\{T \le t\}$ it follows that $Y$ is $\mathcal{F}_t$-measurable, which by the definition of $\mathcal{F}_T$ implies that $Y$ is $\mathcal{F}_T$-measurable. □

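Lemma B.21. The set of progressively measurable functions on $[0,\infty)\times\Omega$ is contained in $\bar{\mathcal{P}} = \mathcal{P} \vee \mathcal{N}$.

Proof. First we must show that all optional processes are $\bar{\mathcal{P}}$-measurable. This is straightforward: if $\tau$ is a stopping time we must show that $1_{[0,\tau)}$ is $\bar{\mathcal{P}}$-measurable. But $1_{[0,\tau]}$ is previsible and thus automatically $\bar{\mathcal{P}}$-measurable, hence it is sufficient to establish that $[\tau] \triangleq \{(\tau(\omega),\omega) : \tau(\omega) < \infty,\ \omega \in \Omega\} \in \bar{\mathcal{P}}$. But

$$\mu_M([\tau]) = \mathbb{E}\Bigl[\int_0^\infty 1_{\{\tau(\omega)=s\}}\,d\langle M\rangle_s\Bigr] = \mathbb{E}\bigl[\langle M\rangle_\tau - \langle M\rangle_{\tau-}\bigr] = 0;$$

the final equality follows from the fact that $M_t$ is continuous.

Starting from a progressively measurable process $X$, by Theorem 2.7 we can construct its optional projection ${}^oX$. From (B.4),

$$\mu_M\bigl((t,\omega) : {}^oX_t(\omega) \ne X_t(\omega)\bigr) = \mathbb{E}\Bigl[\int_0^\infty 1_{\{{}^oX_s(\omega)\ne X_s(\omega)\}}\,d\langle M\rangle_s\Bigr].$$

Define

$$\tau_t = \inf\{s \ge 0 : \langle M\rangle_s > t\};$$

since the set $(t,\infty)\times\Omega$ is progressively measurable, and $\langle M\rangle_t$ is continuous and hence progressively measurable, it follows that $\tau_t$ is a stopping time by the Debut theorem (Theorem A.20). Hence,

$$\begin{aligned}
\mu_M\bigl((t,\omega) : {}^oX_t(\omega) \ne X_t(\omega)\bigr) &= \mathbb{E}\Bigl[\int_0^\infty 1_{\{{}^oX_s(\omega)\ne X_s(\omega)\}}\,d\langle M\rangle_s\Bigr]\\
&= \mathbb{E}\Bigl[\int_0^{\langle M\rangle_\infty} 1_{\{{}^oX_{\tau_s}(\omega)\ne X_{\tau_s}(\omega)\}}\,ds\Bigr]\\
&= \mathbb{E}\Bigl[\int_0^\infty 1_{\{\tau_s<\infty\}}1_{\{{}^oX_{\tau_s}(\omega)\ne X_{\tau_s}(\omega)\}}\,ds\Bigr].
\end{aligned}$$

Thus using Fubini’s theorem

$$\begin{aligned}
\mu_M\bigl((t,\omega) : {}^oX_t(\omega) \ne X_t(\omega)\bigr) &= \mathbb{E}\Bigl[\int_0^\infty 1_{\{\tau_s<\infty\}}1_{\{{}^oX_{\tau_s}\ne X_{\tau_s}\}}\,ds\Bigr]\\
&= \int_0^\infty \mathbb{P}\bigl(\tau_s < \infty,\ {}^oX_{\tau_s} \ne X_{\tau_s}\bigr)\,ds.
\end{aligned}$$

From Lemma B.20 it follows that for any stopping time $\tau$, $X_\tau 1_{\{\tau<\infty\}}$ is $\mathcal{F}_\tau$-measurable; thus from the definition of optional projection

$${}^oX_\tau 1_{\{\tau<\infty\}} = \mathbb{E}\bigl[X_\tau 1_{\{\tau<\infty\}} \mid \mathcal{F}_\tau\bigr] = X_\tau 1_{\{\tau<\infty\}} \quad \mathbb{P}\text{-a.s.}$$

Hence $\mu_M((t,\omega) : {}^oX_t(\omega) \ne X_t(\omega)) = 0$. But we have shown that the optional processes are $\bar{\mathcal{P}}$-measurable, and ${}^oX$ is an optional process; thus from the definition of $\bar{\mathcal{P}}$ there exists a previsible process $Z$ such that $\mu_M((t,\omega) : Z_t(\omega) \ne {}^oX_t(\omega)) = 0$. Hence using these two results $\mu_M((t,\omega) : Z_t(\omega) \ne X_t(\omega)) = 0$, which implies that $X$ is $\bar{\mathcal{P}}$-measurable. □

† It is important to realise that this argument depends fundamentally on the progressive measurability of $X$; it is in fact the same argument which is used (e.g. in Rogers and Williams [248, Lemma II.73.11]) to show that for progressively measurable $X$, $X_T$ is $\mathcal{F}_T$-measurable for $T$ an $\mathcal{F}_t$-stopping time.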

Previsible Projection Route

While the previous approach shows that the progressively measurable processes can be viewed as the class of integrands, the argument is not constructive. By considering the previsible projection we can provide a constructive argument. In brief, if $X$ is progressively measurable and $M$ is a continuous martingale then

$$\int_0^\infty X_s\,dM_s = \int_0^\infty {}^pX_s\,dM_s,$$

where ${}^pX$, the previsible projection of $X$, is a previsible process and the integral on the right-hand side is to be understood in the sense of Definition B.16.

Lemma B.22. If $X$ is progressively measurable and $T$ is a previsible time, then $X_T 1_{\{T<\infty\}}$ is $\mathcal{F}_{T-}$-measurable.

Proof. If $T$ is a previsible time then there exists an announcing sequence $T_n \uparrow T$ such that $T_n$ is a stopping time. By Lemma B.20 it follows for each $n$ that $X_{T_n}1_{\{T_n<\infty\}}$ is $\mathcal{F}_{T_n}$-measurable. Recall that

$$\mathcal{F}_{T-} = \bigvee_n \mathcal{F}_{T_n},$$

so if we define random variables $Y^n \triangleq X_{T_n}1_{\{T_n<\infty\}}$ and

$$Y \triangleq \liminf_{n\to\infty} Y^n,$$

then it follows that $Y$ is $\mathcal{F}_{T-}$-measurable. □


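From the Debut theorem,

$$\tau_t \triangleq \inf\{s \ge 0 : \langle M\rangle_s > t\}$$

is an $\mathcal{F}_t$-stopping time. Therefore $\tau_{t-1/n}$ is an increasing sequence of stopping times and their limit is

$$\hat{\tau}_t \triangleq \inf\{s \ge 0 : \langle M\rangle_s \ge t\};$$

therefore $\hat{\tau}_t$ is a previsible time. We can now complete the proof of Lemma B.21 using the definition of the previsible projection.

Proof. Starting from a progressively measurable process $X$, by Theorem A.29 we can construct its previsible projection ${}^pX$. From (B.4),

$$\mu_M\bigl({}^pX_t(\omega) \ne X_t(\omega)\bigr) = \mathbb{E}\Bigl[\int_0^\infty 1_{\{{}^pX_s(\omega)\ne X_s(\omega)\}}\,d\langle M\rangle_s\Bigr].$$

Using the previsible time $\hat{\tau}_t$,

$$\begin{aligned}
\mu_M\bigl({}^pX_t(\omega) \ne X_t(\omega)\bigr) &= \mathbb{E}\Bigl[\int_0^\infty 1_{\{{}^pX_s(\omega)\ne X_s(\omega)\}}\,d\langle M\rangle_s\Bigr]\\
&= \mathbb{E}\Bigl[\int_0^{\langle M\rangle_\infty} 1_{\{{}^pX_{\hat{\tau}_s}(\omega)\ne X_{\hat{\tau}_s}(\omega)\}}\,ds\Bigr]\\
&= \mathbb{E}\Bigl[\int_0^\infty 1_{\{\hat{\tau}_s<\infty\}}1_{\{{}^pX_{\hat{\tau}_s}(\omega)\ne X_{\hat{\tau}_s}(\omega)\}}\,ds\Bigr].
\end{aligned}$$

Thus using Fubini’s theorem

$$\mu_M\bigl((t,\omega) : {}^pX_t(\omega) \ne X_t(\omega)\bigr) = \int_0^\infty \mathbb{P}\bigl(\hat{\tau}_s < \infty,\ {}^pX_{\hat{\tau}_s} \ne X_{\hat{\tau}_s}\bigr)\,ds.$$

From Lemma B.22 it follows that for any previsible time $\tau$, $X_\tau 1_{\{\tau<\infty\}}$ is $\mathcal{F}_{\tau-}$-measurable; thus from the definition of previsible projection

$${}^pX_\tau 1_{\{\tau<\infty\}} = \mathbb{E}\bigl[X_\tau 1_{\{\tau<\infty\}} \mid \mathcal{F}_{\tau-}\bigr] = X_\tau 1_{\{\tau<\infty\}} \quad \mathbb{P}\text{-a.s.}$$

Hence $\mu_M((t,\omega) : {}^pX_t(\omega) \ne X_t(\omega)) = 0$. Therefore $X$ is $\bar{\mathcal{P}}$-measurable. We also see that the previsible process $Z$ in (B.7) is just the previsible projection of $X$. □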

B.2.3 Integration by Parts Formula

The stochastic form of the integration by parts formula leads to Ito’s formula, which is the most important result for practical computations.


Lemma B.23. Let $M$ be a continuous martingale. Then

$$\langle M\rangle_t = M_t^2 - M_0^2 - 2\int_0^t M_s\,dM_s.$$

Proof. Following the argument and notation of the proof of Theorem B.11 define $X^n$ by

$$X^n_s(\omega) \triangleq \sum_{j=0}^{\infty} M_{t_j}(\omega)\,1_{(t_j,t_{j+1}]}(s);$$

while $X^n$ is defined in terms of an infinite number of non-zero terms, it is clear that $1_{[0,t]}(s)X^n_s \in \mathcal{E}$. Therefore using Definition B.8,

$$\begin{aligned}
S^n_t &= \sum_{j=0}^{\infty}\Bigl(M^2_{t_{j+1}} - M^2_{t_j} - 2M_{t_j}\bigl(M_{t_{j+1}} - M_{t_j}\bigr)\Bigr)\\
&= M_t^2 - M_0^2 - 2\int_0^\infty 1_{[0,t]}(s)X^n_s\,dM_s.
\end{aligned}$$

As the process $M$ is continuous, it is clear that for fixed $\omega$, $X^n(\omega) \to M(\omega)$ uniformly on compact subsets of time and therefore by bounded convergence, $\|X^n 1_{[0,t]} - M 1_{[0,t]}\|_M$ tends to zero. Thus by the Ito isometry (B.6) the result follows. □

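Lemma B.24. Let $M$ and $N$ be square integrable martingales; then

$$M_tN_t = M_0N_0 + \int_0^t M_s\,dN_s + \int_0^t N_s\,dM_s + \langle M,N\rangle_t.$$

Proof. Apply the polarization identity

$$\langle M,N\rangle_t = \bigl(\langle M+N\rangle_t - \langle M-N\rangle_t\bigr)/4$$

to the result of Lemma B.23, to give

$$\begin{aligned}
\langle M,N\rangle_t = \frac14\Bigl(&(M_t+N_t)^2 - (M_0+N_0)^2 - 2\int_0^t (M_s+N_s)\,dM_s - 2\int_0^t (M_s+N_s)\,dN_s\\
&- \Bigl[(M_t-N_t)^2 - (M_0-N_0)^2 - 2\int_0^t (M_s-N_s)\,dM_s + 2\int_0^t (M_s-N_s)\,dN_s\Bigr]\Bigr)\\
= M_tN_t &- M_0N_0 - \int_0^t N_s\,dM_s - \int_0^t M_s\,dN_s.
\end{aligned}$$

□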
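The integration by parts formula has an exact discrete-time counterpart, which may help to see where the covariation term comes from. The sketch below verifies the summation-by-parts identity on two arbitrary sequences; the function name `discrete_parts` and the sample values are hypothetical, and the sum of products of increments plays the role of $\langle M,N\rangle_t$.

```python
def discrete_parts(m, n):
    """Return (sum M dN, sum N dM, sum dM dN) along a common grid."""
    last = len(m) - 1
    m_dn = sum(m[k] * (n[k + 1] - n[k]) for k in range(last))
    n_dm = sum(n[k] * (m[k + 1] - m[k]) for k in range(last))
    cov = sum((m[k + 1] - m[k]) * (n[k + 1] - n[k]) for k in range(last))
    return m_dn, n_dm, cov


# Two arbitrary sequences sampled on the same grid (illustrative values).
m = [1.0, 2.0, 0.5, 3.0]
n = [2.0, 1.0, 4.0, 2.5]
m_dn, n_dm, cov = discrete_parts(m, n)

lhs = m[-1] * n[-1]
rhs = m[0] * n[0] + m_dn + n_dm + cov
print(lhs, rhs)  # both equal 7.5
```

The identity holds for any pair of sequences because each product increment splits as $M_k\,\Delta N + N_k\,\Delta M + \Delta M\,\Delta N$; the martingale assumptions in the lemma are needed only to pass to the continuous-time limit.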


B.2.4 Ito’s Formula

Theorem B.25. If $X$ is an $\mathbb{R}^d$-valued semimartingale and $f \in C^2(\mathbb{R}^d)$ then

$$f(X_t) = f(X_0) + \sum_{i=1}^{d}\int_0^t \frac{\partial}{\partial x^i}f(X_s)\,dX^i_s + \frac12\sum_{i,j=1}^{d}\int_0^t \frac{\partial^2}{\partial x^i\partial x^j}f(X_s)\,d\langle X^i, X^j\rangle_s.$$

The continuity condition on $f$ in the statement of Ito’s lemma is important; if it does not hold then the local time of $X$ must be considered (see for example Chapter 7 of Chung and Williams [53] or Section IV.43 of Rogers and Williams [249]).

Proof. We sketch a proof for $d = 1$. The finite variation case is the standard fundamental theorem of calculus for Stieltjes integration. Consider the case of $M$ a martingale.

The proof is carried out by showing it holds for $f(x) = x^k$ for all $k$; by linearity it then holds for all polynomials and by a standard approximation argument for all $f \in C^2(\mathbb{R})$. To establish the result for polynomials proceed by induction. Suppose it holds for functions $f$ and $g$; then by Lemma B.24,

$$\begin{aligned}
d\bigl(f(M_t)g(M_t)\bigr) &= f(M_t)\,dg(M_t) + g(M_t)\,df(M_t) + d\langle f(M_t), g(M_t)\rangle_t\\
&= f(M_t)\bigl(g'(M_t)\,dM_t + \tfrac12 g''(M_t)\,d\langle M\rangle_t\bigr)\\
&\quad + g(M_t)\bigl(f'(M_t)\,dM_t + \tfrac12 f''(M_t)\,d\langle M\rangle_t\bigr) + g'(M_t)f'(M_t)\,d\langle M\rangle_t.
\end{aligned}$$

Since the result clearly holds for $f(x) = x$, it follows that it holds for all polynomials. The extension to $C^2(\mathbb{R})$ functions follows from a standard approximation argument (see e.g. Rogers and Williams [249] for details). □
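As a quick consistency check, it may help to specialise Ito's formula to the simplest non-trivial case. The display below takes $d = 1$, a continuous martingale $M$ and $f(x) = x^2$ (our choice of example), and recovers the statement of Lemma B.23.

```latex
% With f(x) = x^2 we have f'(x) = 2x and f''(x) = 2, so Ito's formula reads
\[
  M_t^2 \;=\; M_0^2 + \int_0^t 2 M_s \, dM_s
        + \frac12 \int_0^t 2 \, d\langle M \rangle_s
  \;=\; M_0^2 + 2\int_0^t M_s \, dM_s + \langle M \rangle_t ,
\]
% which rearranges to the identity of Lemma B.23:
\[
  \langle M \rangle_t \;=\; M_t^2 - M_0^2 - 2\int_0^t M_s \, dM_s .
\]
```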

B.2.5 Localization

The integral may be extended to a larger class of integrands by the procedure of localization. Let $H$ be a progressively measurable process. Define a non-decreasing sequence of stopping times

$$T_n \triangleq \inf_{t\ge 0}\Bigl\{\int_0^t H_s^2\,d\langle M\rangle_s > n\Bigr\}; \tag{B.8}$$

then it is clear that the process $H^{T_n}_t \triangleq H_{t\wedge T_n}$ is in the space $L^2_{\mathcal{P}}$. Thus the stochastic integral $\int_0^\infty H^{T_n}_s\,dM_s$ is defined in the Ito sense of Definition B.16.

Theorem B.26. If for all $t \ge 0$,

$$\mathbb{P}\Bigl(\int_0^t H_s^2\,d\langle M\rangle_s < \infty\Bigr) = 1, \tag{B.9}$$

then we may define the stochastic integral

$$\int_0^\infty H_s\,dM_s \triangleq \lim_{n\to\infty}\int_0^\infty H^{T_n}_s\,dM_s.$$

Proof. Under condition (B.9) the sequence of stopping times $T_n$ defined in (B.8) tends to infinity $\mathbb{P}$-a.s. It is straightforward to verify that this is well defined; that is, different choices of sequence $T_n$ tending to infinity give rise to the same limit. □

This general definition of integral is then a local martingale. We can similarly extend to integrators $M$ which are local martingales by using the minimum of a reducing sequence $R_n$ for the local martingale $M$ and the sequence $T_n$ above.

B.3 Stochastic Calculus

A very useful result about the characterisation of Brownian motion, due to Lévy, can be proved using the Itô calculus.

Theorem B.27. Let $\{B^i_t\}_{t\ge0}$ be continuous local martingales starting from zero for $i = 1, \ldots, n$. Then $B_t = (B^1_t, \ldots, B^n_t)$ is a Brownian motion with respect to $(\Omega, \mathcal{F}, \mathbb{P})$ adapted to the filtration $\mathcal{F}_t$, if and only if

\[
\langle B^i, B^j\rangle_t = \delta_{ij}\, t \qquad \forall i, j \in \{1, \ldots, n\}.
\]

Proof. In these circumstances the statement that $B_t$ is a Brownian motion is by definition equivalent to stating that $B_t - B_s$ is independent of $\mathcal{F}_s$ and is distributed normally with mean zero and covariance matrix $(t-s)I$.

Clearly if $B_t$ is a Brownian motion then the covariation result follows trivially from the definitions. To establish the converse, we assume $\langle B^i, B^j\rangle_t = \delta_{ij} t$ for $i, j \in \{1, \ldots, n\}$ and prove that $B_t$ is a Brownian motion.

Observe that for fixed $\theta \in \mathbb{R}^n$ we can define $M^\theta_t$ by

\[
M^\theta_t = f(B_t, t) \triangleq \exp\left(i\theta^\top B_t + \tfrac12 \|\theta\|^2 t\right).
\]

By application of Itô's formula to $f$ we obtain (in differential form using the Einstein summation convention)

\[
\begin{aligned}
d\left(f(B_t, t)\right) &= \frac{\partial f}{\partial x^j}(B_t, t)\, dB^j_t + \frac{\partial f}{\partial t}(B_t, t)\, dt + \frac12 \frac{\partial^2 f}{\partial x^j \partial x^k}(B_t, t)\, d\langle B^j, B^k\rangle_t \\
&= i\theta_j f(B_t, t)\, dB^j_t + \tfrac12 \|\theta\|^2 f(B_t, t)\, dt - \tfrac12 \theta_j\theta_k\delta_{jk} f(B_t, t)\, dt \\
&= i\theta_j f(B_t, t)\, dB^j_t .
\end{aligned}
\]


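Hence

\[
M^\theta_t = 1 + \int_0^t d\left(f(B_s, s)\right),
\]

and is a sum of stochastic integrals with respect to continuous local martingales and is hence itself a continuous local martingale. But for each $t$, using $|\cdot|$ to denote the complex modulus,

\[
|M^\theta_t| = \exp\left(\tfrac12 \|\theta\|^2 t\right) < \infty.
\]

Hence for any fixed time $t_0$, the process $M^{t_0}_t$ (that is, $M^\theta$ stopped at $t_0$) satisfies

\[
|M^{t_0}_t| \le |M^{t_0}_\infty| < \infty,
\]

and so is a bounded local martingale. Hence $\{M^{t_0}_t, t \ge 0\}$ is a genuine martingale. Thus for $0 \le s < t$ we have

\[
\mathbb{E}\left[\exp\left(i\theta^\top (B_t - B_s)\right) \mid \mathcal{F}_s\right] = \exp\left(-\tfrac12 (t-s)\|\theta\|^2\right) \quad \text{a.s.}
\]

However, this is the characteristic function of a multivariate normal random variable distributed as $N(0, (t-s)I)$. Thus by the Lévy characteristic function theorem $B_t - B_s$ is an $N(0, (t-s)I)$ random variable; since the conditional characteristic function does not depend on $\omega$, the increment is also independent of $\mathcal{F}_s$. □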
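The identity at the heart of this proof can be illustrated numerically in the scalar case. The following is a minimal sketch, not part of the original argument: it samples Gaussian increments directly (so it checks only the forward implication), and the names `empirical_char_fn` and `exact_char_fn`, the seed and the parameter values are all illustrative assumptions.

```python
import cmath
import math
import random


def empirical_char_fn(theta, dt, n_samples, seed=0):
    """Monte Carlo estimate of E[exp(i*theta*(B_t - B_s))] with t - s = dt."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    sd = math.sqrt(dt)         # a Brownian increment over dt is N(0, dt)
    total = 0j
    for _ in range(n_samples):
        increment = rng.gauss(0.0, sd)
        total += cmath.exp(1j * theta * increment)
    return total / n_samples


def exact_char_fn(theta, dt):
    """Characteristic function of N(0, dt) evaluated at theta."""
    return math.exp(-0.5 * dt * theta ** 2)


theta, dt = 1.5, 0.7  # illustrative values
estimate = empirical_char_fn(theta, dt, n_samples=20000)
exact = exact_char_fn(theta, dt)
print("Monte Carlo:", estimate, "exact:", exact)
```

The estimate should agree with $\exp(-\tfrac12 (t-s)\theta^2)$ up to Monte Carlo error of order $n^{-1/2}$.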

B.3.1 Girsanov’s Theorem

Girsanov's theorem for the change of drift underlies many important results. The result has an important converse but this is not used here.

Theorem B.28. Let $M$ be a continuous martingale, and let $Z$ be the associated exponential martingale

\[
Z_t = \exp\left(M_t - \tfrac12 \langle M\rangle_t\right). \tag{B.10}
\]

If $Z$ is a uniformly integrable martingale, then a new measure $\mathbb{Q}$, equivalent to $\mathbb{P}$, may be defined by

\[
\frac{d\mathbb{Q}}{d\mathbb{P}} \triangleq Z_\infty .
\]

Furthermore, if $X$ is a continuous $\mathbb{P}$-local martingale then $X_t - \langle X, M\rangle_t$ is a $\mathbb{Q}$-local martingale.

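Proof. Since $Z$ is a uniformly integrable martingale it follows from Theorem B.1 (martingale convergence) that $Z_t = \mathbb{E}[Z_\infty \mid \mathcal{F}_t]$. Hence $\mathbb{Q}$ constructed thus is a probability measure which is equivalent to $\mathbb{P}$. Now consider $X$, a $\mathbb{P}$-local martingale. Define a sequence of stopping times which tend to infinity via

\[
T_n \triangleq \inf\{t \ge 0 : |X_t| \ge n \text{ or } |\langle X, M\rangle_t| \ge n\}.
\]

Consider the process $Y$ defined via

\[
Y_t \triangleq X^{T_n}_t - \langle X^{T_n}, M\rangle_t .
\]

By Itô's formula applied to (B.10), $dZ_t = Z_t\,dM_t$; a second application of Itô's formula yields

\[
\begin{aligned}
d(Z_tY_t) &= 1_{t\le T_n}\left(Z_t\,dY_t + Y_t\,dZ_t + d\langle Z, Y\rangle_t\right) \\
&= 1_{t\le T_n}\left(Z_t(dX_t - d\langle X, M\rangle_t) + Y_tZ_t\,dM_t + d\langle Z, Y\rangle_t\right) \\
&= 1_{t\le T_n}\left(Z_t(dX_t - d\langle X, M\rangle_t) + (X_t - \langle X, M\rangle_t)Z_t\,dM_t + Z_t\,d\langle X, M\rangle_t\right) \\
&= 1_{t\le T_n}\left((X_t - \langle X, M\rangle_t)Z_t\,dM_t + Z_t\,dX_t\right),
\end{aligned}
\]

where the result $d\langle Z, Y\rangle_t = Z_t\,d\langle X, M\rangle_t$ follows from the Kunita–Watanabe identity; hence $ZY$ is a $\mathbb{P}$-local martingale. But $Z$ is uniformly integrable and $Y$ is bounded (by construction of the stopping time $T_n$), hence $ZY$ is a genuine $\mathbb{P}$-martingale. Hence for $s < t$ and $A \in \mathcal{F}_s$, we have

\[
\mathbb{E}_{\mathbb{Q}}\left[(Y_t - Y_s)1_A\right] = \mathbb{E}\left[Z_\infty (Y_t - Y_s)1_A\right] = \mathbb{E}\left[(Z_tY_t - Z_sY_s)1_A\right] = 0;
\]

hence $Y$ is a $\mathbb{Q}$-martingale. Thus $X_t - \langle X, M\rangle_t$ is a $\mathbb{Q}$-local martingale, since $T_n$ is a reducing sequence such that $(X - \langle X, M\rangle)^{T_n}$ is a $\mathbb{Q}$-martingale, and $T_n \uparrow \infty$ as $n \to \infty$. □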

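Corollary B.29. Let $W_t$ be a $\mathbb{P}$-Brownian motion and define $\mathbb{Q}$ as in Theorem B.28; then $\tilde{W}_t = W_t - \langle W, M\rangle_t$ is a $\mathbb{Q}$-Brownian motion.

Proof. Since $W$ is a Brownian motion it follows that $\langle W, W\rangle_t = t$ for all $t \ge 0$. By Theorem B.28, $\tilde{W}$ is a $\mathbb{Q}$-local martingale. Since $\tilde{W}_t$ is continuous and $\langle \tilde{W}, \tilde{W}\rangle_t = \langle W, W\rangle_t = t$, it follows from Lévy's characterisation of Brownian motion (Theorem B.27) that $\tilde{W}$ is a $\mathbb{Q}$-Brownian motion. □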
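As a sanity check of the corollary at a single fixed time, take the scalar case $M = \mu W$, so that $\langle W, M\rangle_t = \mu t$ and $Z_t = \exp(\mu W_t - \tfrac12\mu^2 t)$. The sketch below, which is an illustration under these assumptions and not part of the proof, evaluates $\mathbb{E}[Z_t]$, $\mathbb{E}[Z_t(W_t - \mu t)]$ and $\mathbb{E}[Z_t(W_t - \mu t)^2]$ by deterministic quadrature against the standard normal density; the helper `gaussian_expectation` and the values of `mu` and `t` are hypothetical.

```python
import math


def gaussian_expectation(g, half_width=12.0, n=24001):
    """Approximate E[g(x)] for x ~ N(0, 1) with the trapezoidal rule."""
    h = 2.0 * half_width / (n - 1)
    total = 0.0
    for k in range(n):
        x = -half_width + k * h
        weight = 0.5 if k in (0, n - 1) else 1.0  # trapezoid end-point weights
        total += weight * g(x) * math.exp(-0.5 * x * x)
    return total * h / math.sqrt(2.0 * math.pi)


mu, t = 0.8, 2.0  # illustrative drift and time


def density(x):
    """Z_t written as a function of x, where W_t = sqrt(t) * x."""
    w = math.sqrt(t) * x
    return math.exp(mu * w - 0.5 * mu * mu * t)


def shifted(x):
    """The shifted process W_t - <W, M>_t = W_t - mu * t."""
    return math.sqrt(t) * x - mu * t


total_mass = gaussian_expectation(density)                          # E[Z_t]
q_mean = gaussian_expectation(lambda x: density(x) * shifted(x))     # mean under Q
q_var = gaussian_expectation(lambda x: density(x) * shifted(x) ** 2)  # variance under Q
print(total_mass, q_mean, q_var)
```

Under $\mathbb{Q}$ the shifted variable should have total mass 1, mean 0 and variance $t$, consistent with $\tilde{W}_t$ being $N(0, t)$.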

The form of Girsanov's theorem in Theorem B.28 is too restrictive for many applications of interest. In particular the requirement that the martingale $Z$ be uniformly integrable and the implied equivalence of $\mathbb{P}$ and $\mathbb{Q}$ on $\mathcal{F}$ rules out even such simple applications as transforming $X_t = \mu t + W_t$ to remove the constant drift. In this case the martingale $Z_t = \exp(\mu W_t - \tfrac12 \mu^2 t)$ is clearly not uniformly integrable. If we consider $A \in \mathcal{F}_\infty$ defined by

\[
A = \left\{ \lim_{t\to\infty} \frac{X_t - \mu t}{t} = 0 \right\}, \tag{B.11}
\]

it is clear that $\mathbb{P}(A) = 1$, yet under a measure $\mathbb{Q}$ under which $X$ has no drift $\mathbb{Q}(A) = 0$. Since equivalent measures have the same null sets it would follow that if this measure which killed the drift were equivalent to $\mathbb{P}$ then $A$ should also be null, a contradiction. Hence on $\mathcal{F}$ the measures $\mathbb{P}$ and $\mathbb{Q}$ cannot be equivalent.
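The failure of uniform integrability in this example can be made explicit. The following short derivation is a sketch that uses only the strong law of large numbers for Brownian motion, $W_t/t \to 0$ $\mathbb{P}$-a.s., and assumes $\mu \neq 0$:

```latex
\[
Z_t = \exp\left( t \left( \mu \frac{W_t}{t} - \frac{\mu^2}{2} \right) \right)
\xrightarrow[t \to \infty]{} 0 \quad \mathbb{P}\text{-a.s.},
\qquad \text{while} \qquad \mathbb{E}[Z_t] = 1 \quad \text{for all } t \ge 0 .
\]
```

A uniformly integrable martingale converges in $L^1$ to its almost sure limit, which would force $\mathbb{E}[Z_\infty] = 1$; here $Z_\infty = 0$, so $Z$ cannot be uniformly integrable.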


If we consider restricting the definition of the measure $\mathbb{Q}$ to $\mathcal{F}_t$ for finite $t$ then the above problem is avoided. In the example given earlier, under $\mathbb{Q}^t$ the process $X$ restricted to $[0, t]$ is a Brownian motion with zero drift. This approach via a family of consistent measures is used in the change of measure approach to filtering, which is described in Chapter 3. Since we have just shown that there does not exist any measure equivalent to $\mathbb{P}$ under which $X$ is a Brownian motion on $[0, \infty)$ it is clear that we cannot, in general, find a measure $\mathbb{Q}$ defined on $\mathcal{F}_\infty$ such that the restriction of $\mathbb{Q}$ to $\mathcal{F}_t$ is $\mathbb{Q}^t$.

Define a set function on $\bigcup_{0\le t<\infty} \mathcal{F}_t$ by

\[
\mathbb{Q}(A) = \mathbb{Q}^t(A), \qquad \forall A \in \mathcal{F}_t, \ \forall t \ge 0. \tag{B.12}
\]

If we have a finite set $A_1, \ldots, A_n$ of elements of $\bigcup_{0\le t<\infty} \mathcal{F}_t$, then we can find $s$ such that $A_i \in \mathcal{F}_s$ for $i = 1, \ldots, n$, and since $\mathbb{Q}^s$ is a probability measure it follows that the set function $\mathbb{Q}$ is finitely additive. It is immediate that $\mathbb{Q}(\emptyset) = 0$ and $\mathbb{Q}(\Omega) = 1$.

It is not obvious whether $\mathbb{Q}$ is countably additive. If $\mathbb{Q}$ is countably additive, then Carathéodory's theorem allows us to extend the definition of $\mathbb{Q}$ to $\sigma\left(\bigcup_{0\le t<\infty} \mathcal{F}_t\right) = \mathcal{F}_\infty$. This can be resolved in special situations by using Tulcea's theorem. The σ-algebras $\mathcal{F}_t$ are all defined on the same space, so the atom condition of Tulcea's theorem is non-trivial (contrast with the case of the product spaces used in the proof of the Daniell–Kolmogorov–Tulcea theorem), which explains why this extension cannot be carried out in general. The following corollary gives an important example where an extension is possible.

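Corollary B.30. Let $\Omega = C([0, \infty), \mathbb{R}^d)$ and let $X_t$ be the canonical process on $\Omega$. Define $\mathcal{F}^o_t = \sigma(X_s : 0 \le s \le t)$. If

\[
Z_t = \exp\left(M_t - \tfrac12 \langle M\rangle_t\right)
\]

is an $\mathcal{F}^o_{t+}$-adapted martingale then there exists a unique measure $\mathbb{Q}$ on $(\Omega, \mathcal{F}^o_\infty)$ such that

\[
\left.\frac{d\mathbb{Q}}{d\mathbb{P}}\right|_{\mathcal{F}^o_{t+}} = Z_t, \qquad \forall t,
\]

and the process $X_t - \langle X, M\rangle_t$ is a $\mathbb{Q}$-local martingale with respect to $\{\mathcal{F}^o_{t+}\}_{t\ge0}$.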

Proof. We apply Theorem B.28 to the stopped process $Z^t$, which is clearly a uniformly integrable martingale (since $Z^t_s = \mathbb{E}[Z^t_t \mid \mathcal{F}^o_{s+}]$). We may thus define a family $\mathbb{Q}^t$ of measures equivalent to $\mathbb{P}$ on $\mathcal{F}^o_{t+}$. It is clear that these measures are consistent; that is, for $s \le t$, $\mathbb{Q}^t$ restricted to $\mathcal{F}^o_{s+}$ is identical to $\mathbb{Q}^s$.

For any sequence of times $t_1 < t_2 < \cdots$ such that $t_k \to \infty$ as $k \to \infty$, since the sample space $\Omega = C([0, \infty), \mathbb{R}^d)$ is a complete separable metric space, regular conditional probabilities in the sense of Definition 2.28 exist as a consequence of Exercise 2.29, and we may denote them $\mathbb{Q}^{t_k}(\cdot \mid \mathcal{F}^o_{t_{k-1}+})$ for $k = 1, 2, \ldots$.

The sequence of σ-algebras $\mathcal{F}^o_{t_k+}$ is clearly increasing. If we consider a sequence $A_k$ of atoms with each $A_k \in \mathcal{F}^o_{t_k+}$ such that $A_1 \supseteq A_2 \supseteq \cdots$, then using the fact that these are the unaugmented σ-algebras on the canonical sample space it follows that $\bigcap_{k=1}^\infty A_k \neq \emptyset$. Therefore, using these regular conditional probabilities as the transition kernels, we may now apply Tulcea's theorem A.11 to construct a measure $\mathbb{Q}$ on $\mathcal{F}^o_\infty$ which is consistent with $\mathbb{Q}^{t_k}$ on $\mathcal{F}^o_{t_k+}$ for each $k$. The consistency condition ensures that the measure $\mathbb{Q}$ thus obtained is independent of the choice of the times $t_k$. □

Corollary B.31. Let $W_t$ be a $\mathbb{P}$-Brownian motion and define $\mathbb{Q}$ as in Corollary B.30; then $\tilde{W}_t = W_t - \langle W, M\rangle_t$ is a $\mathbb{Q}$-Brownian motion with respect to $\mathcal{F}^o_{t+}$.

Proof. As for Corollary B.29. □

B.3.2 Martingale Representation Theorem

The following representation theorem has many uses. The proof given here only establishes the existence of the representation. The results of Clark allow an explicit form to be established (see Nualart [227, Proposition 1.3.14] for details, or Section IV.41 of Rogers and Williams [249] for an elementary account).

Theorem B.32. Let $B$ be an $m$-dimensional Brownian motion and let $\mathcal{F}_t$ be the right continuous enlargement of the σ-algebra generated by $B$ augmented† with the null sets $\mathcal{N}$. Let $T > 0$ be a constant time. If $X$ is a square integrable random variable measurable with respect to the σ-algebra $\mathcal{F}_T$ then there exists a previsible $\nu_s$ such that

\[
X = \mathbb{E}[X] + \int_0^T \nu_s^\top \, dB_s . \tag{B.13}
\]

Proof. To establish the representation (B.13), without loss of generality we may consider the case $\mathbb{E}X = 0$ (in the general case apply the result to $X - \mathbb{E}X$). Define the space

\[
L^2_T = \left\{ H : H \text{ is } \mathcal{F}_t\text{-previsible and } \mathbb{E}\left[\int_0^T \|H_s\|^2 \, ds\right] < \infty \right\}.
\]

Consider the stochastic integral map

\[
J : L^2_T \to L^2(\mathcal{F}_T),
\]

defined by

\[
J(H) = \int_0^T H_s^\top \, dB_s .
\]

† This condition is satisfied automatically if the filtration satisfies the usual conditions.

As a consequence of the Itô isometry theorem, this map is an isometry. Hence the image $V$ under $J$ of the Hilbert space $L^2_T$ is complete and hence a closed subspace of $L^2_0(\mathcal{F}_T) = \{H \in L^2(\mathcal{F}_T) : \mathbb{E}H = 0\}$. The theorem is proved if we can establish that the image is equal to the whole space $L^2_0(\mathcal{F}_T)$, for the image is the space of random variables $X$ which admit a representation of the form (B.13).

Consider the orthogonal complement of $V$ in $L^2_0(\mathcal{F}_T)$. We aim to show that every element of this orthogonal complement is zero. Suppose that $Z$ is in the orthogonal complement of $V$ in $L^2_0(\mathcal{F}_T)$; thus

\[
\mathbb{E}(ZX) = 0 \quad \text{for all } X \in V. \tag{B.14}
\]

Define $Z_t = \mathbb{E}[Z \mid \mathcal{F}_t]$, which is an $L^2$-bounded martingale. We know that the σ-algebra $\mathcal{F}_0$ is trivial by the Blumenthal 0–1 law; therefore

\[
Z_0 = \mathbb{E}[Z \mid \mathcal{F}_0] = \mathbb{E}(Z) = 0 \quad \mathbb{P}\text{-a.s.}
\]

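Let $H \in L^2_T$ and $N_T \triangleq J(H)$, and define $N_t \triangleq \mathbb{E}[N_T \mid \mathcal{F}_t]$ for $0 \le t \le T$. It is clear that $N_T \in V$. Let $S$ be a stopping time such that $S \le T$; then by optional sampling

\[
N_S = \mathbb{E}[N_T \mid \mathcal{F}_S] = \mathbb{E}\left[\left.\int_0^S H_s^\top \, dB_s + \int_S^T H_s^\top \, dB_s \,\right|\, \mathcal{F}_S\right] = J(H 1_{[0,S]}),
\]

so consequently $N_S \in V$. The orthogonality relation (B.14) then implies that $\mathbb{E}(ZN_S) = 0$. Thus using the properties of conditional expectation

\[
0 = \mathbb{E}[ZN_S] = \mathbb{E}\left[\mathbb{E}[ZN_S \mid \mathcal{F}_S]\right] = \mathbb{E}\left[N_S \mathbb{E}[Z \mid \mathcal{F}_S]\right] = \mathbb{E}[Z_SN_S].
\]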

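Since this holds for $S$ a bounded stopping time, and $Z_T$ and $N_T$ are square integrable, it follows that $Z_tN_t$ is a uniformly integrable martingale and hence $\langle Z, N\rangle_t$ is a null process.

Let $\varepsilon_t$ be an element of the set $S_t$ defined in Lemma B.39 where the stochastic process $Y$ is taken to be the Brownian motion $B$. Extending $J$ in the obvious way to $m$-dimensional vector processes, we have that

\[
\varepsilon_t = 1 + J(i\varepsilon r 1_{[0,t]})
\]

for some $r \in L^\infty([0, t], \mathbb{R}^m)$. Using the above, $Z_t\varepsilon_t = Z_t + Z_tJ(i\varepsilon r 1_{[0,t]})$. Both $\{Z_tJ(i\varepsilon r 1_{[0,t]}), t \ge 0\}$ and $\{Z_t, t \ge 0\}$ are martingales and $Z_0 = 0$; hence

\[
\mathbb{E}[\varepsilon_tZ_t] = \mathbb{E}[Z_t] + \mathbb{E}\left[Z_tJ(i\varepsilon r 1_{[0,t]})\right] = \mathbb{E}(Z_0) = 0.
\]

Thus since this holds for all $\varepsilon_t \in S_t$ and the set $S_t$ is total this implies that $Z_t = 0$ $\mathbb{P}$-a.s. □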
Thus since this holds for all εt ∈ St and the set St is total this implies thatZt = 0 P-a.s. ��


Remark B.33. For $X$ a square integrable $\mathcal{F}_t$-adapted martingale this result can be applied to $X_T$, followed by conditioning and use of the martingale property to obtain for any $0 \le t \le T$,

\[
X_t = \mathbb{E}[X_T \mid \mathcal{F}_t] = \mathbb{E}[X_T] + \int_0^{t\wedge T} \nu_s^\top \, dB_s = \mathbb{E}(X_0) + \int_0^t \nu_s^\top \, dB_s .
\]

As the choice of the constant time $T$ was arbitrary, it is clear that this result holds for all $t \ge 0$.

B.3.3 Novikov’s Condition

One of the most useful conditions for checking whether a local martingale of exponential form is a martingale is that due to Novikov.

Theorem B.34. If $Z_t = \exp\left(M_t - \tfrac12 \langle M\rangle_t\right)$ for $M$ a continuous local martingale, then a sufficient condition for $Z$ to be a martingale is that

\[
\mathbb{E}\left[\exp\left(\tfrac12 \langle M\rangle_t\right)\right] < \infty, \qquad 0 \le t < \infty.
\]

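Proof. Define the stopping time

\[
S_b = \inf\{t \ge 0 : M_t - t = b\}
\]

and note that $\mathbb{P}(S_b < \infty) = 1$. Then define

\[
Y_t \triangleq \exp\left(M_t - \tfrac12 t\right); \tag{B.15}
\]

it follows by the optional stopping theorem that $\mathbb{E}[\exp(M_{S_b} - \tfrac12 S_b)] = 1$, which implies $\mathbb{E}[\exp(\tfrac12 S_b)] = e^{-b}$. Consider

\[
N_t \triangleq Y_{t\wedge S_b}, \qquad t \ge 0,
\]

which is also a martingale. Since $\mathbb{P}(S_b < \infty) = 1$ it follows that

\[
N_\infty = \lim_{s\to\infty} N_s = \exp\left(M_{S_b} - \tfrac12 S_b\right).
\]

By Fatou's lemma $N_s$ is a supermartingale with last element. But $\mathbb{E}(N_\infty) = 1 = \mathbb{E}(N_0)$, whence $N$ is a uniformly integrable martingale. So by optional sampling for any stopping time $R$,

\[
\mathbb{E}\left[\exp\left(M_{R\wedge S_b} - \tfrac12 (R \wedge S_b)\right)\right] = 1.
\]

Fix $t \ge 0$ and set $R = \langle M\rangle_t$. It then follows for $b < 0$,

\[
\mathbb{E}\left(1_{S_b < \langle M\rangle_t} \exp\left(b + \tfrac12 S_b\right)\right) + \mathbb{E}\left(1_{S_b \ge \langle M\rangle_t} \exp\left(M_t - \tfrac12 \langle M\rangle_t\right)\right) = 1.
\]

The first expectation is bounded by $e^b\,\mathbb{E}\left[\exp\left(\tfrac12 \langle M\rangle_t\right)\right]$; thus from the condition of the theorem it converges to zero as $b \to -\infty$. The second term converges to $\mathbb{E}(Z_t)$ as a consequence of monotone convergence. Thus $\mathbb{E}(Z_t) = 1$. □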


B.3.4 Stochastic Fubini Theorem

The Fubini theorem of measure theory has a useful extension to stochastic integrals. The form stated here requires a boundedness assumption and as such is not the most general form possible, but is that which is most useful for applications. We assume that all the stochastic integrals are with respect to continuous semimartingales, because this is the framework considered here. To extend the result it is simply necessary to stipulate that a càdlàg version of the stochastic integrals be chosen. For a more general form see Protter [247, Theorem IV.46].

In this theorem we consider a family of processes parametrised by an index $a \in A$, and let $\mu$ be a finite measure on the space $(A, \mathcal{A})$; that is, $\mu(A) < \infty$.

Theorem B.35. Let $X$ be a semimartingale and $\mu$ a finite measure. Let $H^a_t = H(t, a, \omega)$ be a bounded $\mathcal{B}[0, t] \otimes \mathcal{A} \otimes \mathcal{P}$-measurable process and $Z^a_t \triangleq \int_0^t H^a_s \, dX_s$. If we define $H_t \triangleq \int_A H^a_t \, \mu(da)$ then $Y_t = \int_A Z^a_t \, \mu(da)$ is the process given by the stochastic integral $\int_0^t H_s \, dX_s$.

Proof. By stopping we can reduce the case to that of $X \in L^2$. As a consequence of the usual Fubini theorem it suffices to consider $X$ a martingale. The proof proceeds via a monotone class argument. Suppose $H(t, a, \omega) = K(t, \omega)f(a)$ for $f$ bounded $\mathcal{A}$-measurable. Then it follows that

\[
Z^a_t = f(a) \int_0^t K(s, \omega) \, dX_s ,
\]

and hence

\[
\begin{aligned}
\int_A Z^a_t \, \mu(da) &= \int_A f(a) \left(\int_0^t K(s, \omega) \, dX_s\right) \mu(da) \\
&= \int_0^t K(s, \omega) \, dX_s \int_A f(a) \, \mu(da) \\
&= \int_0^t \left(\int_A f(a) \, \mu(da) \, K(s, \omega)\right) dX_s \\
&= \int_0^t H_s \, dX_s .
\end{aligned}
\]

Thus we have established the result in this simple case, and by linearity it extends to the vector space of finite linear combinations of bounded functions of this form. It remains to show the monotone property; that is, suppose that the result holds for $H_n$ and $H_n \to H$. We must show that the result holds for $H$.

Let $Z^a_{n,t} \triangleq \int_0^t H^a_{n,s} \, dX_s$. We are interested in convergence uniformly in $t$; thus note that

\[
\mathbb{E}\left[\sup_t \left|\int_A Z^a_{n,t} \, \mu(da) - \int_A Z^a_t \, \mu(da)\right|\right] \le \mathbb{E}\left[\int_A \sup_t |Z^a_{n,t} - Z^a_t| \, \mu(da)\right].
\]


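We show that the right-hand side tends to zero as $n \to \infty$. By Jensen's inequality and Cauchy–Schwarz we can compute as follows,

\[
\begin{aligned}
\left(\mathbb{E}\left[\int_A \sup_t |Z^a_{n,t} - Z^a_t| \, \mu(da)\right]\right)^2 &\le \mathbb{E}\left[\left(\int_A \sup_t |Z^a_{n,t} - Z^a_t| \, \mu(da)\right)^2\right] \\
&\le \int_A \mu(da) \; \mathbb{E}\left[\int_A \sup_t |Z^a_{n,t} - Z^a_t|^2 \, \mu(da)\right].
\end{aligned}
\]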

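Then an application of the non-stochastic version of Fubini's theorem followed by Doob's $L^2$-inequality implies that

\[
\begin{aligned}
\frac{1}{\mu(A)} \mathbb{E}\left[\left(\int_A \sup_t |Z^a_{n,t} - Z^a_t| \, \mu(da)\right)^2\right] &\le \int_A \mathbb{E}\left[\sup_{s\in[0,T]} |Z^a_{n,s} - Z^a_s|^2\right] \mu(da) \\
&\le 4 \int_A \mathbb{E}\left[(Z^a_{n,\infty} - Z^a_\infty)^2\right] \mu(da) \\
&\le 4 \int_A \mathbb{E}\left[\langle Z^a_n - Z^a\rangle_\infty\right] \mu(da).
\end{aligned}
\]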

Then by the Kunita–Watanabe identity

\[
\frac{1}{\mu(A)} \mathbb{E}\left[\left(\int_A \sup_t |Z^a_{n,t} - Z^a_t| \, \mu(da)\right)^2\right] \le 4 \int_A \mathbb{E}\left(\int_0^\infty (H^a_{n,s} - H^a_s)^2 \, d\langle X\rangle_s\right) \mu(da).
\]

Since $H_n$ increases monotonically to a bounded process $H$ it follows that $H_n$ and $H$ are uniformly bounded; we may apply the dominated convergence theorem to the double integral and expectation and thus the right-hand side converges to zero. Thus

\[
\lim_{n\to\infty} \mathbb{E}\left[\sup_t \left|\int_A Z^a_{n,t} \, \mu(da) - \int_A Z^a_t \, \mu(da)\right|\right] = 0. \tag{B.16}
\]

We may conclude from this that

\[
\int_A \sup_t |Z^a_{n,t} - Z^a_t| \, \mu(da) < \infty \quad \text{a.s.},
\]

as a consequence of which $\int_A |Z^a_t| \, \mu(da) < \infty$ for all $t$ a.s., and thus the integral $\int_A Z^a_t \, \mu(da)$ is defined a.s. for all $t$. Defining $H_{n,t} \triangleq \int_A H^a_{n,t} \, \mu(da)$, we have from (B.16) that $\int_0^t H_{n,s} \, dX_s$ converges in probability uniformly in $t$ to $\int_A Z^a_t \, \mu(da)$. Since a priori the result holds for $H_n$ we have that

\[
\int_0^t H_{n,s} \, dX_s = \int_A Z^a_{n,t} \, \mu(da),
\]

and since by the stochastic form of the dominated convergence theorem $\int_0^t H_{n,s} \, dX_s$ tends to $\int_0^t H_s \, dX_s$ as $n \to \infty$ it follows that

\[
\int_0^t H_s \, dX_s = \int_A Z^a_t \, \mu(da).
\]

□

B.3.5 Burkholder–Davis–Gundy Inequalities

Theorem B.36. If $F : [0, \infty) \to [0, \infty)$ is a continuous increasing function such that $F(0) = 0$, and for every $\alpha > 1$

\[
K_F = \sup_{x\in[0,\infty)} \frac{F(\alpha x)}{F(x)} < \infty,
\]

then there exist constants $c_F$ and $C_F$ such that for every continuous local martingale $M$,

\[
c_F \, \mathbb{E}\left[F\left(\sqrt{\langle M\rangle_\infty}\right)\right] \le \mathbb{E}\left[F\left(\sup_{t\ge0} |M_t|\right)\right] \le C_F \, \mathbb{E}\left[F\left(\sqrt{\langle M\rangle_\infty}\right)\right].
\]

An example of a suitable function $F$ which satisfies the conditions of the theorem is $F(x) = x^p$ for $p > 0$.

Various proofs exist of this result. The proof given follows Burkholder's approach in Chapter II of [36]. The proof requires the following lemma.

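Lemma B.37. Let $X$ and $Y$ be nonnegative real-valued random variables. Let $\beta > 1$, $\delta > 0$, $\varepsilon > 0$ be such that for all $\lambda > 0$,

\[
\mathbb{P}(X > \beta\lambda, Y \le \delta\lambda) \le \varepsilon \, \mathbb{P}(X > \lambda). \tag{B.17}
\]

Let $\gamma$ and $\eta$ be such that $F(\beta\lambda) \le \gamma F(\lambda)$ and $F(\delta^{-1}\lambda) \le \eta F(\lambda)$. If $\gamma\varepsilon < 1$ then

\[
\mathbb{E}[F(X)] \le \frac{\gamma\eta}{1 - \gamma\varepsilon} \, \mathbb{E}[F(Y)].
\]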

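Proof. Assume without loss of generality that $\mathbb{E}[F(X)] < \infty$. It is clear from (B.17) that for $\lambda > 0$,

\[
\begin{aligned}
\mathbb{P}(X > \beta\lambda) &= \mathbb{P}(X > \beta\lambda, Y \le \delta\lambda) + \mathbb{P}(X > \beta\lambda, Y > \delta\lambda) \\
&\le \varepsilon \, \mathbb{P}(X > \lambda) + \mathbb{P}(Y > \delta\lambda).
\end{aligned} \tag{B.18}
\]

Since $F(0) = 0$ by assumption, it follows that

\[
F(x) = \int_0^x dF(\lambda) = \int_0^\infty I_{\{\lambda < x\}} \, dF(\lambda);
\]

thus by Fubini's theorem

\[
\mathbb{E}[F(X)] = \int_0^\infty \mathbb{P}(X > \lambda) \, dF(\lambda).
\]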

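Thus using (B.18) it follows that

\[
\begin{aligned}
\mathbb{E}[F(X/\beta)] &= \int_0^\infty \mathbb{P}(X > \beta\lambda) \, dF(\lambda) \\
&\le \varepsilon \int_0^\infty \mathbb{P}(X > \lambda) \, dF(\lambda) + \int_0^\infty \mathbb{P}(Y > \delta\lambda) \, dF(\lambda) \\
&\le \varepsilon \, \mathbb{E}[F(X)] + \mathbb{E}[F(Y/\delta)];
\end{aligned}
\]

from the conditions on $\eta$ and $\gamma$, which give $F(X) \le \gamma F(X/\beta)$ and $F(Y/\delta) \le \eta F(Y)$, it then follows that

\[
\mathbb{E}[F(X/\beta)] \le \varepsilon\gamma \, \mathbb{E}[F(X/\beta)] + \eta \, \mathbb{E}[F(Y)].
\]

Since we assumed $\mathbb{E}[F(X)] < \infty$, and $\varepsilon\gamma < 1$, it follows that

\[
\mathbb{E}[F(X/\beta)] \le \frac{\eta}{1 - \varepsilon\gamma} \, \mathbb{E}[F(Y)],
\]

and the result follows using the condition on $\gamma$. □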

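We can now prove the Burkholder–Davis–Gundy inequality by using the above lemma.

Proof. Let $\tau = \inf\{u : |M_u| > \lambda\}$, which is an $\mathcal{F}_t$-stopping time. Define $N_t \triangleq (M_{\tau+t} - M_\tau)^2 - (\langle M\rangle_{\tau+t} - \langle M\rangle_\tau)$, which is a continuous $\mathcal{F}_{\tau+t}$-adapted local martingale. Choose $\beta > 1$, $0 < \delta < 1$. On the event defined by $\{\sup_{t\ge0} |M_t| > \beta\lambda, \langle M\rangle_\infty \le \delta^2\lambda^2\}$ the martingale $N_t$ must hit the level $(\beta - 1)^2\lambda^2 - \delta^2\lambda^2$ before it hits $-\delta^2\lambda^2$.

From elementary use of the optional sampling theorem the probability of a martingale started at zero hitting a level $b > 0$ before a level $a < 0$ is given by $-a/(b - a)$; thus

\[
\mathbb{P}\left(\left.\sup_{t\ge0} |M_t| > \beta\lambda, \ \langle M\rangle_\infty \le \delta^2\lambda^2 \,\right|\, \mathcal{F}_\tau\right) \le \delta^2/(\beta - 1)^2 .
\]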

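Hence as $\beta > 1$,

\[
\begin{aligned}
\mathbb{P}\left(\sup_{t\ge0} |M_t| > \beta\lambda, \ \langle M\rangle_\infty \le \delta^2\lambda^2\right) &= \mathbb{P}\left(\sup_{t\ge0} |M_t| > \beta\lambda, \ \langle M\rangle_\infty \le \delta^2\lambda^2, \ \tau < \infty\right) \\
&= \mathbb{E}\left[\mathbb{P}\left(\left.\sup_{t\ge0} |M_t| > \beta\lambda, \ \langle M\rangle_\infty \le \delta^2\lambda^2 \,\right|\, \mathcal{F}_\tau\right) 1_{\tau<\infty}\right] \\
&\le \delta^2 \, \mathbb{P}(\tau < \infty)/(\beta - 1)^2 .
\end{aligned}
\]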

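It is immediate that since $\beta > 1$, $F(\beta\lambda) < K_F F(\lambda)$ and similarly since $\delta < 1$, $F(\lambda/\delta) < K_F F(\lambda)$, so we may take $\gamma = \eta = K_F$. Now we can choose $0 < \delta < 1$ sufficiently small that $\varepsilon = \delta^2/(\beta - 1)^2 < 1/K_F$, so that $\varepsilon\gamma < 1$. Therefore all the conditions of Lemma B.37 are satisfied, whence

\[
\mathbb{E}\left[F\left(\sup_{t\ge0} |M_t|\right)\right] \le C \, \mathbb{E}\left[F\left(\sqrt{\langle M\rangle_\infty}\right)\right]
\]

and the opposite inequality can be established similarly. □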

B.4 Stochastic Differential Equations

Theorem B.38. Let $f : \mathbb{R}^d \to \mathbb{R}^d$ and $\sigma : \mathbb{R}^d \to \mathbb{R}^p$ be Lipschitz functions. That is, there exist positive constants $K_f$ and $K_\sigma$ such that

\[
\|f(x) - f(y)\| \le K_f \|x - y\|, \qquad \|\sigma(x) - \sigma(y)\| \le K_\sigma \|x - y\|,
\]

for all $x, y \in \mathbb{R}^d$.

Given a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ and a filtration $\{\mathcal{F}_t, t \ge 0\}$ which satisfies the usual conditions, let $W$ be an $\mathcal{F}_t$-adapted Brownian motion and let $\zeta$ be an $\mathcal{F}_0$-adapted random variable. Then there exists a unique continuous adapted process $X = \{X_t, t \ge 0\}$ which is a strong solution of the SDE,

\[
X_t = \zeta + \int_0^t f(X_s) \, ds + \int_0^t \sigma(X_s) \, dW_s .
\]

The proof of this theorem can be found as Theorem 10.6 of Chung and Williams [53] and is similar to the proof of Theorem 2.9 of Chapter 5 in Karatzas and Shreve [149].
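The theorem is an existence and uniqueness statement and says nothing about computation. Purely as an illustration, a path of such an SDE can be approximated with the Euler–Maruyama scheme, a standard discretisation that is not discussed in this appendix. The sketch below treats the scalar case $d = p = 1$; the function name `euler_maruyama`, the coefficients and the step count are hypothetical choices.

```python
import math
import random


def euler_maruyama(f, sigma, x0, horizon, n_steps, seed=0):
    """Approximate a path of dX = f(X) dt + sigma(X) dW on [0, horizon] (scalar case)."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    dt = horizon / n_steps
    x = x0
    path = [x0]
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))     # Brownian increment over one step
        x = x + f(x) * dt + sigma(x) * dw      # one Euler-Maruyama update
        path.append(x)
    return path


# Lipschitz drift f(x) = -x. With sigma = 0 the SDE reduces to the ODE x' = -x.
det_path = euler_maruyama(lambda x: -x, lambda x: 0.0, 1.0, 1.0, 1000)
noisy_path = euler_maruyama(lambda x: -x, lambda x: 0.5, 1.0, 1.0, 1000)
print(det_path[-1], math.exp(-1.0), noisy_path[-1])
```

With the noise switched off the scheme reproduces the deterministic solution $e^{-t}$ up to discretisation error, which gives a simple check that the update rule is implemented as intended.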

B.5 Total Sets in L1

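The use of the following density result in stochastic filtering originated in the work of Krylov and Rozovskii.

Lemma B.39. On the filtered probability space $(\Omega, \mathcal{F}, \mathbb{P})$ let $Y$ be a Brownian motion starting from zero adapted to the filtration $\mathcal{Y}_t$; then define the set

\[
S_t = \left\{ \varepsilon_t = \exp\left(i \int_0^t r_s^\top \, dY_s + \frac12 \int_0^t \|r_s\|^2 \, ds\right) : r \in L^\infty([0, t], \mathbb{R}^m) \right\}. \tag{B.19}
\]

Then $S_t$ is a total set in $L^1(\Omega, \mathcal{Y}_t, \mathbb{P})$. That is, if $a \in L^1(\Omega, \mathcal{Y}_t, \mathbb{P})$ and $\mathbb{E}[a\varepsilon_t] = 0$ for all $\varepsilon_t \in S_t$, then $a = 0$ $\mathbb{P}$-a.s. Furthermore each process $\varepsilon$ in the set $S_t$ satisfies an SDE of the form

\[
d\varepsilon_t = i\varepsilon_t r_t^\top \, dY_t ,
\]

for some $r \in L^\infty([0, t], \mathbb{R}^m)$.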


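Proof. We follow the proof in Bensoussan [13, page 83]. Define a set

\[
S'_t = \left\{ \varepsilon_t = \exp\left(i \int_0^t r_s^\top \, dY_s\right) : r \in L^\infty([0, t], \mathbb{R}^m) \right\}.
\]

Let $a$ be a fixed element of $L^1(\Omega, \mathcal{Y}_t, \mathbb{P})$ such that $\mathbb{E}[a\varepsilon_t] = 0$ for all $\varepsilon_t \in S'_t$. This can easily be seen to be equivalent to the statement that $\mathbb{E}[a\varepsilon_t] = 0$ for all $\varepsilon_t \in S_t$, which we assume. To establish the result, we assume that $\mathbb{E}[a\varepsilon_t] = 0$ for all $\varepsilon_t \in S'_t$, and show that $a$ is zero a.s. Take $t_1, t_2, \ldots, t_p \in (0, t)$ with $t_1 < t_2 < \cdots < t_p$; then given $l_1, l_2, \ldots, l_p \in \mathbb{R}^m$, define

\[
\mu_p \triangleq l_p, \quad \mu_{p-1} \triangleq l_p + l_{p-1}, \quad \ldots, \quad \mu_1 \triangleq l_p + \cdots + l_1 .
\]

Adopting the convention that $t_0 = 0$, define a function

\[
r_s = \begin{cases} \mu_h & \text{for } s \in (t_{h-1}, t_h), \ h = 1, \ldots, p, \\ 0 & \text{for } s \in (t_p, t), \end{cases}
\]

whence as $Y_{t_0} = Y_0 = 0$,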

\[
\sum_{h=1}^p l_h^\top Y_{t_h} = \sum_{h=1}^p \mu_h^\top (Y_{t_h} - Y_{t_{h-1}}) = \int_0^t r_s^\top \, dY_s .
\]
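The first equality above is a summation by parts. As a quick illustration in the scalar case $m = 1$, the sketch below checks it on arbitrary numbers; the sample values standing in for $Y_{t_1}, \ldots, Y_{t_p}$ and the coefficients $l_h$ are randomly generated placeholders, not data from the text.

```python
import random

rng = random.Random(1)
p = 5

# Placeholder values for Y_{t_1}, ..., Y_{t_p} and coefficients l_1, ..., l_p.
y = [rng.gauss(0.0, 1.0) for _ in range(p)]
coeffs = [rng.uniform(-1.0, 1.0) for _ in range(p)]

# mu_h = l_h + l_{h+1} + ... + l_p, matching the definition in the proof.
mu = [sum(coeffs[h:]) for h in range(p)]

# Left-hand side: sum over h of l_h * Y_{t_h}.
lhs = sum(coeffs[h] * y[h] for h in range(p))

# Right-hand side: sum over h of mu_h * (Y_{t_h} - Y_{t_{h-1}}), with Y_{t_0} = 0.
y_prev = [0.0] + y[:-1]
rhs = sum(mu[h] * (y[h] - y_prev[h]) for h in range(p))

print(lhs, rhs)
```

The two sides agree up to floating-point rounding, reflecting that each $l_j$ multiplies the telescoping sum $\sum_{h \le j} (Y_{t_h} - Y_{t_{h-1}}) = Y_{t_j}$.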

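Hence for $a \in L^1(\Omega, \mathcal{Y}_t, \mathbb{P})$

\[
\mathbb{E}\left[a \exp\left(i \sum_{h=1}^p l_h^\top Y_{t_h}\right)\right] = \mathbb{E}\left[a \exp\left(i \int_0^t r_s^\top \, dY_s\right)\right] = 0,
\]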

where the second equality follows from the fact that we have assumed $\mathbb{E}[a\varepsilon_t] = 0$ for all $\varepsilon \in S'_t$. By linearity therefore,

\[
\mathbb{E}\left[a \sum_{k=1}^K c_k \exp\left(i \sum_{h=1}^p l_{h,k}^\top Y_{t_h}\right)\right] = 0,
\]

where this holds for all $K$ and for all coefficients $c_1, \ldots, c_K \in \mathbb{C}$, and values $l_{h,k} \in \mathbb{R}^m$. Let $F(x_1, \ldots, x_p)$ be a continuous bounded complex-valued function defined on $(\mathbb{R}^m)^p$. By Weierstrass' approximation theorem, there exists a uniformly bounded sequence of functions of the form

\[
P^{(n)}(x_1, \ldots, x_p) = \sum_{k=1}^{K_n} c^{(n)}_k \exp\left(i \sum_{h=1}^p (l^{(n)}_{h,k})^\top x_h\right)
\]

such that

\[
\lim_{n\to\infty} P^{(n)}(x_1, \ldots, x_p) = F(x_1, \ldots, x_p).
\]

Hence we have $\mathbb{E}[aF(Y_{t_1}, \ldots, Y_{t_p})] = 0$ for every continuous bounded function $F$, and by a further approximation argument, we can take $F$ to be a bounded function, measurable with respect to the σ-algebra $\sigma(Y_{t_1}, \ldots, Y_{t_p})$. Since $t_1, t_2, \ldots, t_p$ were chosen arbitrarily, we obtain that $\mathbb{E}[ab] = 0$, for $b$ any bounded $\mathcal{Y}_t$-measurable function. In particular it gives $\mathbb{E}[a^2 \wedge m] = 0$ for arbitrary $m$; hence $a = 0$ $\mathbb{P}$-a.s. □

The following corollary enables us to use a smaller set of functions in the definition of the set $S_t$; in particular we can consider only bounded continuous functions with any number $p$ of bounded continuous derivatives.

Corollary B.40. Assume the same conditions as in Lemma B.39. Define the set

\[
S^p_t = \left\{ \varepsilon_t = \exp\left(i \int_0^t r_s^\top \, dY_s + \frac12 \int_0^t \|r_s\|^2 \, ds\right) : r \in C^p_b([0, t], \mathbb{R}^m) \right\}, \tag{B.20}
\]

where $p$ is an arbitrary non-negative integer. Then $S^p_t$ is a total set in $L^1(\Omega, \mathcal{Y}_t, \mathbb{P})$. That is, if $a \in L^1(\Omega, \mathcal{Y}_t, \mathbb{P})$ and $\mathbb{E}[a\varepsilon_t] = 0$ for all $\varepsilon_t \in S^p_t$, then $a = 0$ $\mathbb{P}$-a.s. Furthermore each process $\varepsilon$ in the set $S^p_t$ satisfies an SDE of the form

\[
d\varepsilon_t = i\varepsilon_t r_t^\top \, dY_t ,
\]

for some $r \in C^p_b([0, t], \mathbb{R}^m)$.

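Proof. Let us prove the corollary for the case $p = 0$, that is, for $r$ a bounded continuous function. To do this, as a consequence of Lemma B.39, it suffices to show that if $a \in L^1(\Omega, \mathcal{Y}_t, \mathbb{P})$ and $\mathbb{E}[a\varepsilon_t] = 0$ for all $\varepsilon_t \in S^0_t$, then $\mathbb{E}[a\varepsilon_t] = 0$ for all $\varepsilon_t \in S_t$. Pick an arbitrary $\varepsilon_t \in S_t$,

\[
\varepsilon_t = \exp\left(i \int_0^t r_s^\top \, dY_s + \frac12 \int_0^t \|r_s\|^2 \, ds\right), \qquad r \in L^\infty([0, t], \mathbb{R}^m).
\]

First let us note that by the fundamental theorem of calculus, as $r \in L^\infty([0, t], \mathbb{R}^m)$, the function $p : [0, t] \to \mathbb{R}^m$ defined as

\[
p_s = \int_0^s r_u \, du
\]

is continuous and differentiable almost everywhere. Moreover, for almost all $s \in [0, t]$

\[
\frac{dp_s}{ds} = r_s .
\]

Now let $r^n \in C^0_b([0, t], \mathbb{R}^m)$ be defined as

\[
r^n_s \triangleq n\left(p_s - p_{0\vee(s-1/n)}\right), \qquad s \in [0, t].
\]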

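Then $r^n$ is uniformly bounded by the same bound as $r$ and from the above, for almost all $s \in [0, t]$, $\lim_{n\to\infty} r^n_s = r_s$. By the bounded convergence theorem,

\[
\lim_{n\to\infty} \int_0^t \|r^n_s\|^2 \, ds = \int_0^t \|r_s\|^2 \, ds
\]

and also

\[
\lim_{n\to\infty} \mathbb{E}\left[\left(\int_0^t r_s^\top \, dY_s - \int_0^t (r^n_s)^\top \, dY_s\right)^2\right] = 0.
\]

Hence at least for a subsequence $(r^{n_k})_{n_k>0}$, by the Itô isometry

\[
\lim_{k\to\infty} \int_0^t (r^{n_k}_s)^\top \, dY_s = \int_0^t r_s^\top \, dY_s , \qquad \mathbb{P}\text{-a.s.}
\]

and hence, the uniformly bounded sequence

\[
\varepsilon^k_t = \exp\left(i \int_0^t (r^{n_k}_s)^\top \, dY_s + \frac12 \int_0^t \|r^{n_k}_s\|^2 \, ds\right)
\]

converges $\mathbb{P}$-almost surely to $\varepsilon_t$. Then, via another use of the dominated convergence theorem

\[
\mathbb{E}[a\varepsilon_t] = \lim_{k\to\infty} \mathbb{E}[a\varepsilon^k_t] = 0,
\]

since $\varepsilon^k_t \in S^0_t$ for all $k \ge 0$. This completes the proof of the corollary for $p = 0$. For higher values of $p$, one iterates the above procedure. □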

B.6 Limits of Stochastic Integrals

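The following proposition is used in the proof of the Zakai equation.

Proposition B.41. Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space, $\{B_t, \mathcal{F}_t\}$ be a standard $n$-dimensional Brownian motion defined on this space and $\Psi_n$, $\Psi$ be $\mathcal{F}_t$-adapted processes such that $\int_0^t \|\Psi_n\|^2 \, ds < \infty$, $\int_0^t \|\Psi\|^2 \, ds < \infty$, $\mathbb{P}$-a.s. and

\[
\lim_{n\to\infty} \int_0^t \|\Psi_n - \Psi\|^2 \, ds = 0
\]

in probability; then

\[
\lim_{n\to\infty} \sup_{t\in[0,T]} \left|\int_0^t (\Psi_n^\top - \Psi^\top) \, dB_s\right| = 0
\]

in probability.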

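Proof. Given arbitrary $t, \varepsilon, \eta > 0$ we first prove that for an $n$-dimensional process $\varphi$,

\[
\mathbb{P}\left(\sup_{0\le s\le t} \left|\int_0^s \varphi_r^\top \, dB_r\right| \ge \varepsilon\right) \le \mathbb{P}\left(\int_0^t \|\varphi_s\|^2 \, ds > \eta\right) + \frac{4\eta}{\varepsilon^2}. \tag{B.21}
\]

To this end, define

\[
\tau_\eta \triangleq \inf\left\{u : \int_0^u \|\varphi_s\|^2 \, ds > \eta\right\},
\]

and a corresponding stopped version of $\varphi$,

\[
\varphi^\eta_s \triangleq \varphi_s 1_{[0,\tau_\eta]}(s).
\]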

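Then using these definitions

\[
\begin{aligned}
\mathbb{P}\left(\sup_{0\le s\le t} \left|\int_0^s \varphi_r^\top \, dB_r\right| \ge \varepsilon\right) &= \mathbb{P}\left(\tau_\eta < t; \ \sup_{0\le s\le t} \left|\int_0^s \varphi_r^\top \, dB_r\right| \ge \varepsilon\right) \\
&\quad + \mathbb{P}\left(\tau_\eta \ge t; \ \sup_{0\le s\le t} \left|\int_0^s \varphi_r^\top \, dB_r\right| \ge \varepsilon\right) \\
&\le \mathbb{P}(\tau_\eta < t) + \mathbb{P}\left(\sup_{0\le s\le t} \left|\int_0^s (\varphi^\eta_r)^\top \, dB_r\right| \ge \varepsilon\right) \\
&\le \mathbb{P}\left(\int_0^t \|\varphi_s\|^2 \, ds > \eta\right) + \mathbb{P}\left(\sup_{0\le s\le t} \left|\int_0^s (\varphi^\eta_r)^\top \, dB_r\right| \ge \varepsilon\right).
\end{aligned}
\]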

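By Chebychev's inequality and Doob's $L^2$-inequality the second term on the right-hand side can be bounded

\[
\begin{aligned}
\mathbb{P}\left(\sup_{0\le s\le t} \left|\int_0^s (\varphi^\eta_r)^\top \, dB_r\right| \ge \varepsilon\right) &\le \frac{1}{\varepsilon^2} \mathbb{E}\left[\left(\sup_{0\le s\le t} \left|\int_0^s (\varphi^\eta_r)^\top \, dB_r\right|\right)^2\right] \\
&\le \frac{4}{\varepsilon^2} \mathbb{E}\left[\left(\int_0^t (\varphi^\eta_r)^\top \, dB_r\right)^2\right] \\
&\le \frac{4}{\varepsilon^2} \mathbb{E}\left[\int_0^t \|\varphi^\eta_r\|^2 \, dr\right] \le \frac{4\eta}{\varepsilon^2},
\end{aligned}
\]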

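which establishes (B.21). Applying this result with fixed $\varepsilon$ to $\varphi = \Psi_n - \Psi$ and $t = T$ yields

\[
\mathbb{P}\left(\sup_{t\in[0,T]} \left|\int_0^t (\Psi_n^\top - \Psi^\top) \, dB_s\right| \ge \varepsilon\right) \le \mathbb{P}\left(\int_0^T \|\Psi_n - \Psi\|^2 \, ds > \eta\right) + \frac{4\eta}{\varepsilon^2}.
\]

Given arbitrary $\delta > 0$, by choosing $\eta < \delta\varepsilon^2/8$ the second term on the right-hand side is then bounded by $\delta/2$, and with this $\eta$, by the condition of the proposition there exists $N(\eta)$ such that for $n \ge N(\eta)$ the first term is bounded by $\delta/2$. Thus the right-hand side can be bounded by $\delta$. □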


B.7 An Exponential Functional of Brownian motion

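In this section we deduce an explicit expression of a certain exponential functional of Brownian motion which is used in Chapter 6. Let $\{B_t, t \ge 0\}$ be a $d$-dimensional standard Brownian motion. Let $\beta : [0, t] \to \mathbb{R}^d$ be a bounded measurable function, $\Gamma$ a $d \times d$ real matrix and $\delta \in \mathbb{R}^d$. In this section, we compute the following functional of $B$,

\[
I^{\beta,\Gamma,\delta}_t = \mathbb{E}\left[\left.\exp\left(\int_0^t B_s^\top \beta_s \, ds - \frac12 \int_0^t \|\Gamma B_s\|^2 \, ds\right) \,\right|\, B_t = \delta\right]. \tag{B.22}
\]

In (B.22) we use the standard notation

\[
B_s^\top \beta_s = \sum_{i=1}^d B^i_s \beta^i_s , \qquad \|\Gamma B_s\|^2 = \sum_{i=1}^d \left(\sum_{j=1}^d \Gamma^{ij} B^j_s\right)^2, \qquad s \ge 0.
\]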

To obtain a closed formula for (B.22), we use Lévy's diagonalisation procedure, a powerful tool for deriving explicit formulae. Other results and techniques of this kind can be found in Yor [280] and the references contained therein. The orthogonal decomposition of $B_s$ with respect to $B_t$ is

\[
B_s = \frac{s}{t} B_t + \left(B_s - \frac{s}{t} B_t\right), \qquad s \in [0, t],
\]

and using the Fourier decomposition of the Brownian motion (as in Wiener's construction of the Brownian motion)

\[
B_s = \frac{s}{t} B_t + \sum_{k\ge1} \sqrt{\frac{2}{t}} \, \frac{\sin(ks\pi/t)}{k\pi/t} \, \xi_k , \qquad s \in [0, t], \tag{B.23}
\]

where $\{\xi_k; k \ge 1\}$ are standard normal random vectors with independent entries, which are also independent of $B_t$, and the infinite sum has a subsequence of its partial sums which almost surely converges uniformly (see Itô and McKean [135, page 22]), we obtain the following.

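Lemma B.42. Let $\nu \in \mathbb{R}$ and $\mu_k \in \mathbb{R}^d$, $k \ge 1$, be the following constants

\[
\begin{aligned}
\nu^{\beta,\Gamma,\delta}(t) &\triangleq \exp\left(\frac{1}{t} \int_0^t s\,\delta^\top \beta_s \, ds - \frac16 \|\Gamma\delta\|^2 t\right), \\
\mu^{\beta,\Gamma,\delta}_k(t) &\triangleq \int_0^t \frac{\sin(ks\pi/t)}{k\pi/t} \beta_s \, ds + (-1)^k \frac{t^2}{k^2\pi^2} \Gamma^\top\Gamma\delta , \qquad k \ge 1.
\end{aligned}
\]

Then

\[
I^{\beta,\Gamma,\delta}_t = \nu^{\beta,\Gamma,\delta}(t) \, \mathbb{E}\left[\exp\left(\sum_{k\ge1} \left(\sqrt{\frac{2}{t}} \, \xi_k^\top \mu^{\beta,\Gamma,\delta}_k(t) - \frac{t^2}{2k^2\pi^2} \|\Gamma\xi_k\|^2\right)\right)\right]. \tag{B.24}
\]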


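Proof. We have from (B.23),

\[
\int_0^t B_s^\top \beta_s \, ds = \frac{1}{t} \int_0^t s\,\delta^\top \beta_s \, ds + \sum_{k\ge1} \sqrt{\frac{2}{t}} \int_0^t \frac{\sin(ks\pi/t)}{k\pi/t} \, \xi_k^\top \beta_s \, ds \tag{B.25}
\]

and similarly

\[
\begin{aligned}
\int_0^t \|\Gamma B_s\|^2 \, ds &= \frac13 \|\Gamma\delta\|^2 t - 2\sqrt{\frac{2}{t}} \sum_{k\ge1} (-1)^k \frac{t^2}{k^2\pi^2} \, \xi_k^\top \Gamma^\top\Gamma\delta \\
&\quad + \int_0^t \left\|\Gamma\left(\sqrt{\frac{2}{t}} \sum_{k\ge1} \frac{\sin(ks\pi/t)}{k\pi/t} \, \xi_k\right)\right\|^2 ds .
\end{aligned} \tag{B.26}
\]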

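Next using the standard orthonormality results for Fourier series

\[
\begin{aligned}
\int_0^t \left(\sqrt{\frac{2}{t}} \sin\left(\frac{ks\pi}{t}\right)\right)^2 ds &= 1, \qquad \forall k \ge 1, \\
\int_0^t \sin\left(\frac{k_1 s\pi}{t}\right) \sin\left(\frac{k_2 s\pi}{t}\right) ds &= 0, \qquad \forall k_1, k_2 \ge 1, \ k_1 \ne k_2 ,
\end{aligned}
\]

it follows that

\[
\int_0^t \left\|\Gamma\left(\sqrt{\frac{2}{t}} \sum_{k\ge1} \frac{\sin(ks\pi/t)}{k\pi/t} \, \xi_k\right)\right\|^2 ds = \sum_{k\ge1} \|\Gamma\xi_k\|^2 \frac{t^2}{k^2\pi^2}. \tag{B.27}
\]

The identity (B.24) follows immediately from equations (B.25), (B.26) and (B.27). □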
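The deterministic integrals used in this proof are easy to confirm numerically. The following sketch applies a composite midpoint rule to the orthonormality relations, to the coefficient $t^2/(k^2\pi^2)$ appearing in (B.27), and to the integral $\int_0^t s \sin(ks\pi/t)\,ds = -(-1)^k t^2/(k\pi)$ that produces the cross term in (B.26). The helper `integrate`, the time horizon and the chosen indices are illustrative assumptions.

```python
import math


def integrate(g, a, b, n=20000):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (j + 0.5) * h) for j in range(n))


t = 1.7  # illustrative time horizon


def basis(k):
    """The function s -> sqrt(2/t) * sin(k*pi*s/t) from the Fourier expansion."""
    return lambda s: math.sqrt(2.0 / t) * math.sin(k * math.pi * s / t)


# Orthonormality: unit norm and zero cross products.
norm_3 = integrate(lambda s: basis(3)(s) ** 2, 0.0, t)
cross_2_5 = integrate(lambda s: basis(2)(s) * basis(5)(s), 0.0, t)

# Coefficient of ||Gamma xi_k||^2 in (B.27): should equal t^2 / (k^2 pi^2).
k = 4
scaled = integrate(lambda s: (basis(k)(s) / (k * math.pi / t)) ** 2, 0.0, t)
scaled_exact = t ** 2 / (k ** 2 * math.pi ** 2)

# Integral behind the cross term in (B.26).
lin = integrate(lambda s: s * math.sin(k * math.pi * s / t), 0.0, t)
lin_exact = -((-1) ** k) * t ** 2 / (k * math.pi)

print(norm_3, cross_2_5, scaled, scaled_exact, lin, lin_exact)
```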

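Let $P$ be an orthogonal matrix ($PP^\top = P^\top P = I$) and $D$ be a diagonal matrix $D = \operatorname{diag}(\gamma_1, \gamma_2, \ldots, \gamma_d)$ such that $\Gamma^\top\Gamma = P^\top DP$. Obviously $(\gamma_i)_{i=1}^d$ are the eigenvalues of the real symmetric matrix $\Gamma^\top\Gamma$.

Lemma B.43. Let $a^{\beta,\Gamma,\delta}_{i,k}(t)$, for $i = 1, \ldots, d$ and $k \ge 1$, be the following constants

\[
a^{\beta,\Gamma,\delta}_{i,k}(t) = \sum_{j=1}^d P^{ij} \left(\mu^{\beta,\Gamma,\delta}_k(t)\right)^j .
\]

Then

\[
I^{\beta,\Gamma,\delta}_t = \nu^{\beta,\Gamma,\delta}(t) \prod_{i=1}^d \frac{1}{\sqrt{\prod_{k\ge1} \left[\frac{\gamma_i t^2}{k^2\pi^2} + 1\right]}} \exp\left(\sum_{k\ge1} \frac{a^{\beta,\Gamma,\delta}_{i,k}(t)^2}{\left(\frac{t^2\gamma_i}{k^2\pi^2} + 1\right) t}\right). \tag{B.28}
\]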


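Proof. Let $\{\tilde\xi_k, k \ge 1\}$ be the independent identically distributed standard normal random vectors defined by $\tilde\xi_k = P\xi_k$ for any $k \ge 1$. As a consequence of Lemma B.42 we obtain that

\[
I^{\beta,\Gamma,\delta}_t = \nu^{\beta,\Gamma,\delta}(t) \, \mathbb{E}\left[\exp\left(\sum_{k\ge1} \left(\sqrt{\frac{2}{t}} \, \tilde\xi_k^\top P\mu^{\beta,\Gamma,\delta}_k(t) - \frac{t^2}{2k^2\pi^2} \, \tilde\xi_k^\top D\tilde\xi_k\right)\right)\right]. \tag{B.29}
\]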

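Define the σ-algebras

\[
\mathcal{G}_k \triangleq \sigma(\tilde\xi_p, p \ge k) \quad \text{and} \quad \mathcal{G} \triangleq \bigcap_{k\ge1} \mathcal{G}_k .
\]

Now define

\[
\zeta \triangleq \exp\left(\sum_{k\ge1} \left(\sqrt{\frac{2}{t}} \, \tilde\xi_k^\top P\mu^{\beta,\Gamma,\delta}_k(t) - \frac{t^2}{2k^2\pi^2} \, \tilde\xi_k^\top D\tilde\xi_k\right)\right);
\]

using the independence of $\tilde\xi_1, \ldots, \tilde\xi_n, \ldots$ and Kolmogorov's 0–1 Law (see Williams [272, page 46]), we see that

\[
\mathbb{E}[\zeta] = \mathbb{E}\left[\zeta \,\Big|\, \bigcap_{k\ge1} \mathcal{G}_k\right].
\]

Since $\mathcal{G}_k$ is a decreasing sequence of σ-algebras, the Lévy downward theorem (see Williams [272, page 136]) implies that

\[
\mathbb{E}\left[\zeta \,\Big|\, \bigcap_{k\ge1} \mathcal{G}_k\right] = \lim_{k\to\infty} \mathbb{E}[\zeta \mid \mathcal{G}_k].
\]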

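Hence we determine first $\mathbb{E}[\zeta \mid \mathcal{G}_k]$ and then take the limit as $k \to \infty$ to obtain the expectation in (B.29). Hence

\[
\begin{aligned}
\mathbb{E}[\zeta] &= \prod_{k\ge1} \mathbb{E}\left[\exp\left(\sqrt{\frac{2}{t}} \, \tilde\xi_k^\top P\mu^{\beta,\Gamma,\delta}_k(t) - \frac{t^2}{2k^2\pi^2} \, \tilde\xi_k^\top D\tilde\xi_k\right)\right] \\
&= \prod_{k\ge1} \prod_{i=1}^d \frac{1}{\sqrt{2\pi}} \int_{-\infty}^\infty \exp\left(\sqrt{\frac{2}{t}} \, a^{\beta,\Gamma,\delta}_{i,k}(t)\, x - \left(\frac{t^2\gamma_i}{k^2\pi^2} + 1\right) \frac{x^2}{2}\right) dx ,
\end{aligned}
\]

and identity (B.28) follows immediately. □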

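Proposition B.44. Let $f_{\beta,\Gamma}(t)$ be the following constant

\[
f_{\beta,\Gamma}(t) \triangleq \int_0^t \int_0^t \sum_{i=1}^d \frac{\sinh((s-t)\sqrt{\gamma_i}) \sinh(s'\sqrt{\gamma_i})}{2\sqrt{\gamma_i} \sinh(t\sqrt{\gamma_i})} \sum_{j=1}^d P^{ij}\beta^j_s \sum_{j'=1}^d P^{ij'}\beta^{j'}_{s'} \, ds\,ds' ,
\]

and $R_{t,\beta,\Gamma}(\delta)$ be the following second-order polynomial in $\delta$

\[
\begin{aligned}
R_{t,\beta,\Gamma}(\delta) &\triangleq \sum_{i=1}^d \left(\int_0^t \frac{\sinh(s\sqrt{\gamma_i})}{\gamma_i \sinh(t\sqrt{\gamma_i})} \sum_{j=1}^d P^{ij}\beta^j_s \, ds\right) \sum_{j'=1}^d P^{ij'} (\Gamma^\top\Gamma\delta)^{j'} \\
&\quad - \sum_{i=1}^d \frac{\coth(t\sqrt{\gamma_i})}{2\gamma_i\sqrt{\gamma_i}} \left(\sum_{j=1}^d P^{ij} (\Gamma^\top\Gamma\delta)^j\right)^2 .
\end{aligned}
\]

Then

\[
I^{\beta,\Gamma,\delta}_t = \prod_{i=1}^d \sqrt{\frac{t\sqrt{\gamma_i}}{\sinh(t\sqrt{\gamma_i})}} \exp\left(f_{\beta,\Gamma}(t) + R_{t,\beta,\Gamma}(\delta) + \frac{\|\delta\|^2}{2t}\right). \tag{B.30}
\]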

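Proof. Using the classical identity (B.35), the infinite product in the denominator of (B.28) is equal to $\sinh(t\sqrt{\gamma_i})/(t\sqrt{\gamma_i})$. Then we need to expand the argument of the exponential in (B.28). The following argument makes use of the identities (B.32)–(B.34). We have that

\[
a^{\beta,\Gamma,\delta}_{i,k}(t) = \int_0^t \frac{\sin(ks\pi/t)}{k\pi/t} \, c^{\beta,\Gamma}_i(s) \, ds + (-1)^k \frac{t^2}{k^2\pi^2} \, c^{\Gamma,\delta}_i
\]

and

\[
\begin{aligned}
a^{\beta,\Gamma,\delta}_{i,k}(t)^2 &= \int_0^t \int_0^t \frac{\sin(ks\pi/t)}{k\pi/t} \frac{\sin(ks'\pi/t)}{k\pi/t} \, c^{\beta,\Gamma}_i(s) c^{\beta,\Gamma}_i(s') \, ds\,ds' \\
&\quad + 2(-1)^k \frac{t^2}{k^2\pi^2} \, c^{\Gamma,\delta}_i \int_0^t \frac{\sin(ks\pi/t)}{k\pi/t} \, c^{\beta,\Gamma}_i(s) \, ds + \left(\frac{t^2}{k^2\pi^2} \, c^{\Gamma,\delta}_i\right)^2 ,
\end{aligned} \tag{B.31}
\]

where $c^{\beta,\Gamma}_i(s) = \sum_{j=1}^d P^{ij}\beta^j_s$ and $c^{\Gamma,\delta}_i = \sum_{j=1}^d P^{ij} (\Gamma^\top\Gamma\delta)^j$. Next we sum up over $k$ each of the three terms on the right-hand side of (B.31). For the first term we use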


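\[
\begin{aligned}
\sum_{k\ge1} \frac{\sin(ks\pi/t) \sin(ks'\pi/t)}{(k\pi/t)^2 \, t \left(t^2\gamma_i/(k^2\pi^2) + 1\right)} &= \frac{t}{2\pi^2} \sum_{k\ge1} \frac{\cos(k(s-s')\pi/t) - \cos(k(s+s')\pi/t)}{t^2\gamma_i/\pi^2 + k^2} \\
&= \frac{t}{2\pi^2} \left(\frac{\pi}{2t\sqrt{\gamma_i}/\pi}\right) \frac{\cosh((s-t-s')\sqrt{\gamma_i}) - \cosh((s-t+s')\sqrt{\gamma_i})}{\sinh(t\sqrt{\gamma_i})} \\
&= \frac{\sinh((s-t)\sqrt{\gamma_i}) \sinh(s'\sqrt{\gamma_i})}{2\sqrt{\gamma_i} \sinh(t\sqrt{\gamma_i})} ;
\end{aligned}
\]

hence

\[
\begin{aligned}
&\sum_{k\ge1} \frac{\int_0^t \int_0^t \frac{\sin(ks\pi/t)}{k\pi/t} \frac{\sin(ks'\pi/t)}{k\pi/t} \, c^{\beta,\Gamma}_i(s) c^{\beta,\Gamma}_i(s') \, ds\,ds'}{t \left(t^2\gamma_i/(k^2\pi^2) + 1\right)} \\
&\qquad = \int_0^t \int_0^t \frac{\sinh((s-t)\sqrt{\gamma_i}) \sinh(s'\sqrt{\gamma_i})}{2\sqrt{\gamma_i} \sinh(t\sqrt{\gamma_i})} \, c^{\beta,\Gamma}_i(s) c^{\beta,\Gamma}_i(s') \, ds\,ds' .
\end{aligned}
\]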

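For the second term,

\[
\begin{aligned}
\sum_{k\ge1} \frac{(-1)^k \frac{t^2}{k^2\pi^2} \frac{\sin(ks\pi/t)}{k\pi/t}}{t \left(t^2\gamma_i/(k^2\pi^2) + 1\right)} &= \frac{t^2}{\pi^3} \sum_{k\ge1} (-1)^k \frac{\sin(ks\pi/t)}{k \left(t^2\gamma_i/\pi^2 + k^2\right)} \\
&= \frac{t^2}{\pi^3} \left(\frac{\pi}{2t^2\gamma_i/\pi^2} \frac{\sinh(s\sqrt{\gamma_i})}{\sinh(t\sqrt{\gamma_i})} - \frac{s\pi}{2t^3\gamma_i/\pi^2}\right) \\
&= \frac{1}{2\gamma_i} \frac{\sinh(s\sqrt{\gamma_i})}{\sinh(t\sqrt{\gamma_i})} - \frac{s}{2t\gamma_i} ;
\end{aligned}
\]

hence

\[
\begin{aligned}
\sum_{i=1}^d \sum_{k\ge1} \frac{2(-1)^k \frac{t^2}{k^2\pi^2} \, c^{\Gamma,\delta}_i \int_0^t \frac{\sin(ks\pi/t)}{k\pi/t} \, c^{\beta,\Gamma}_i(s) \, ds}{t \left(t^2\gamma_i/(k^2\pi^2) + 1\right)} &= \int_0^t \sum_{i=1}^d \left(\frac{\sinh(s\sqrt{\gamma_i})}{\sinh(t\sqrt{\gamma_i})} - \frac{s}{t}\right) \frac{c^{\beta,\Gamma}_i(s) c^{\Gamma,\delta}_i}{\gamma_i} \, ds \\
&= \int_0^t \sum_{i=1}^d \frac{\sinh(s\sqrt{\gamma_i})}{\sinh(t\sqrt{\gamma_i})} \frac{c^{\beta,\Gamma}_i(s) c^{\Gamma,\delta}_i}{\gamma_i} \, ds - \frac{1}{t} \int_0^t s\,\delta^\top \beta_s \, ds ,
\end{aligned}
\]

since $\sum_{i=1}^d c^{\beta,\Gamma}_i(s) c^{\Gamma,\delta}_i/\gamma_i = \delta^\top \beta_s$. For the last term we get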


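\[
\begin{aligned}
\sum_{k\ge 1} \frac{\bigl(t^2 c^{\Gamma,\delta}_i/(k^2\pi^2)\bigr)^2}{t\, \bigl(t^2\gamma_i/(k^2\pi^2) + 1\bigr)}
&= \frac{t}{\gamma_i} \bigl(c^{\Gamma,\delta}_i\bigr)^2 \sum_{k\ge 1} \left(\frac{1}{k^2\pi^2} - \frac{1}{t^2\gamma_i + k^2\pi^2}\right) \\
&= \frac{t}{\gamma_i} \bigl(c^{\Gamma,\delta}_i\bigr)^2 \left(\frac{1}{6} + \frac{1}{2t^2\gamma_i} - \frac{1}{2t\sqrt{\gamma_i}} \coth(t\sqrt{\gamma_i})\right);
\end{aligned}
\]
then
\[
\sum_{i=1}^d \sum_{k\ge 1} \frac{\bigl(t^2 c^{\Gamma,\delta}_i/(k^2\pi^2)\bigr)^2}{t\, \bigl(t^2\gamma_i/(k^2\pi^2) + 1\bigr)}
= \frac{t\,\|\Gamma\delta\|^2}{6} + \frac{\|\delta\|^2}{2t} - \sum_{i=1}^d \frac{\coth(t\sqrt{\gamma_i})}{2\gamma_i\sqrt{\gamma_i}} \left(\sum_{j=1}^d P_{ij}(\Gamma^\top\Gamma\delta)_j\right)^2,
\]
since $\sum_{i=1}^d \bigl(c^{\Gamma,\delta}_i\bigr)^2/\gamma_i = \|\Gamma\delta\|^2$ and $\sum_{i=1}^d \bigl(c^{\Gamma,\delta}_i\bigr)^2/\gamma_i^2 = \|\delta\|^2$. In the above we used the following classical identities.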

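\[
\sum_{k\ge 1} \frac{\cos kr}{z^2 + k^2} = \frac{\pi}{2z}\, \frac{e^{(r-\pi)z} + e^{-(r-\pi)z}}{e^{\pi z} - e^{-\pi z}} - \frac{1}{2z^2}, \quad \forall r \in (0, 2\pi), \tag{B.32}
\]
\[
\sum_{k\ge 1} (-1)^k \frac{\sin kr}{k\,(z^2 + k^2)} = \frac{\pi}{2z^2}\, \frac{e^{rz} - e^{-rz}}{e^{\pi z} - e^{-\pi z}} - \frac{r}{2z^2}, \quad \forall r \in (-\pi, \pi), \tag{B.33}
\]
\[
\sum_{k\ge 1} \frac{1}{z^2 + k^2\pi^2} = \frac{1}{2z} \left(\coth z - \frac{1}{z}\right), \tag{B.34}
\]
\[
\prod_{k\ge 1} \left[1 + \frac{l^2}{k^2}\right] = \frac{\sinh(\pi l)}{\pi l}, \tag{B.35}
\]
and $\sum_{k\ge 1} 1/k^2 = \pi^2/6$ (for proofs of these identities see, for example, Macrobert [201]). We finally find the closed formula for the Brownian functional (B.22). □

In the one-dimensional case Proposition B.44 takes the following simpler form. This is the form of the result which is used in Chapter 6 to derive the density of $\pi_t$ for the Beneš filter.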

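Corollary B.45. Let $\{B_t,\ t \ge 0\}$ be a standard Brownian motion, $\beta : [0, t] \to \mathbb{R}$ be a bounded measurable function, and $\Gamma \in \mathbb{R}$ be a positive constant. Then
\[
\mathbb{E}\left[\exp\left(\int_0^t B_s\beta_s\, ds - \frac{1}{2}\int_0^t \Gamma^2 B_s^2\, ds\right) \,\middle|\, B_t = \delta\right]
= f^{\beta,\Gamma}(t)\, \exp\left(\left(\int_0^t \frac{\sinh(s\Gamma)}{\sinh(t\Gamma)}\, \beta_s\, ds\right)\delta - \frac{\Gamma\coth(t\Gamma)}{2}\, \delta^2 + \frac{\delta^2}{2t}\right), \tag{B.36}
\]
where
\[
f^{\beta,\Gamma}(t) = \sqrt{\frac{t\Gamma}{\sinh(t\Gamma)}}\, \exp\left(\int_0^t\!\int_0^t \frac{\sinh((s-t)\Gamma)\,\sinh(s'\Gamma)}{2\Gamma\,\sinh(t\Gamma)}\, \beta_s\beta_{s'}\, ds\, ds'\right).
\]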
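As a numerical sanity check of (B.36), the sketch below treats only the special case $\beta \equiv 0$, in which the double integral in $f^{\beta,\Gamma}(t)$ vanishes and the right-hand side reduces to $\sqrt{t\Gamma/\sinh(t\Gamma)}\,\exp\bigl(-\Gamma\coth(t\Gamma)\delta^2/2 + \delta^2/(2t)\bigr)$. It compares this value with a plain Monte Carlo estimate of the conditional expectation, obtained by simulating Brownian bridges pinned at $B_t = \delta$ on a uniform grid. The function names, grid size, sample size and seed are illustrative assumptions, and the estimate carries both sampling and discretisation error.

```python
import math
import random


def closed_form_beta_zero(t, gamma, delta):
    """Right-hand side of (B.36) when beta is identically zero."""
    prefactor = math.sqrt(t * gamma / math.sinh(t * gamma))
    # coth(x) is written as 1 / tanh(x).
    exponent = -gamma * delta ** 2 / (2.0 * math.tanh(t * gamma)) + delta ** 2 / (2.0 * t)
    return prefactor * math.exp(exponent)


def monte_carlo_beta_zero(t, gamma, delta, n_paths=4000, n_steps=100, seed=0):
    """Monte Carlo estimate of E[exp(-0.5 * gamma^2 * int_0^t B_s^2 ds) | B_t = delta]."""
    rng = random.Random(seed)
    dt = t / n_steps
    sqrt_dt = math.sqrt(dt)
    total = 0.0
    for _ in range(n_paths):
        # Unconditioned Brownian path on the grid, starting at 0.
        w = [0.0]
        for _ in range(n_steps):
            w.append(w[-1] + sqrt_dt * rng.gauss(0.0, 1.0))
        # Pin the endpoint: B_s = W_s + (s / t) * (delta - W_t).
        shift = (delta - w[-1]) / n_steps
        b = [w[i] + i * shift for i in range(n_steps + 1)]
        # Trapezoidal rule for the integral of B_s^2 over [0, t].
        integral = dt * (sum(x * x for x in b) - 0.5 * (b[0] ** 2 + b[-1] ** 2))
        total += math.exp(-0.5 * gamma ** 2 * integral)
    return total / n_paths


if __name__ == "__main__":
    # Illustrative parameter values only.
    t, gamma, delta = 1.0, 1.5, 0.8
    print(closed_form_beta_zero(t, gamma, delta))
    print(monte_carlo_beta_zero(t, gamma, delta))
```

The two printed numbers should agree up to Monte Carlo error, which shrinks as `n_paths` and `n_steps` are increased.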

References

1. Robert A. Adams. Sobolev Spaces. Academic Press, Orlando, FL, 2nd edition, 2003.

2. Lakhdar Aggoun and Robert J. Elliott. Measure Theory and Filtering, volume 15 of Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, Cambridge, UK, 2004.

3. Deborah F. Allinger and Sanjoy K. Mitter. New results on the innovations problem for nonlinear filtering. Stochastics, 4(4):339–348, 1980/81.

4. Rami Atar, Frederi Viens, and Ofer Zeitouni. Robustness of Zakai’s equation via Feynman-Kac representations. In Stochastic Analysis, Control, Optimization and Applications, Systems Control Found. Appl., pages 339–352. Birkhäuser Boston, Boston, MA, 1999.

5. Rami Atar and Ofer Zeitouni. Exponential stability for nonlinear filtering. Ann. Inst. H. Poincaré Probab. Statist., 33(6):697–725, 1997.

6. J. E. Baker. Reducing bias and inefficiency in the selection algorithm. In John J. Grefenstette, editor, Proceedings of the Second International Conference on Genetic Algorithms and their Applications, pages 14–21, Mahwah, NJ, 1987. Lawrence Erlbaum.

7. John S. Baras, Gilmer L. Blankenship, and William E. Hopkins Jr. Existence, uniqueness, and asymptotic behaviour of solutions to a class of Zakai equations with unbounded coefficients. IEEE Trans. Automatic Control, AC-28(2):203–214, 1983.

8. Eduardo Bayro-Corrochano and Yiwen Zhang. The motor extended Kalman filter: A geometric approach for rigid motion estimation. J. Math. Imaging Vision, 13(3):205–228, 2000.

9. V. E. Beneš. Exact finite-dimensional filters for certain diffusions with nonlinear drift. Stochastics, 5(1-2):65–92, 1981.

10. V. E. Beneš. New exact nonlinear filters with large Lie algebras. Systems Control Lett., 5(4):217–221, 1985.

11. V. E. Beneš. Nonexistence of strong nonanticipating solutions to stochastic DEs: implications for functional DEs, filtering and control. Stochastic Process. Appl., 5(3):243–263, 1977.

12. A. Bensoussan. On some approximation techniques in nonlinear filtering. In Stochastic Differential Systems, Stochastic Control Theory and Applications (Minneapolis, MN, 1986), volume 10 of IMA Vol. Math. Appl., pages 17–31. Springer, New York, 1988.

13. A. Bensoussan. Stochastic Control of Partially Observable Systems. Cambridge University Press, Cambridge, UK, 1992.

14. A. Bensoussan, R. Glowinski, and A. Rascanu. Approximation of the Zakai equation by splitting up method. SIAM J. Control Optim., 28:1420–1431, 1990.

15. Alain Bensoussan. Nonlinear filtering theory. In Recent advances in stochastic calculus (College Park, MD, 1987), Progr. Automat. Info. Systems, pages 27–64. Springer, New York, 1990.

16. Albert Benveniste. Séparabilité optionnelle, d’après Doob. In Séminaire de Probabilités, X (Univ. Strasbourg), Année universitaire 1974/1975, volume 511 of Lecture Notes in Math., pages 521–531. Springer Verlag, Berlin, 1976.

17. A. T. Bharucha-Reid. Review of Stratonovich, Conditional Markov processes. Mathematical Reviews, (MR0137157), 1963.

18. A. G. Bhatt, G. Kallianpur, and R. L. Karandikar. Uniqueness and robustness of solution of measure-valued equations of nonlinear filtering. Ann. Probab., 23(4):1895–1938, 1995.

19. P. Billingsley. Convergence of Probability Measures. Wiley, New York, 1968.

20. Jean-Michel Bismut and Dominique Michel. Diffusions conditionnelles. II. Générateur conditionnel. Application au filtrage. J. Funct. Anal., 45(2):274–292, 1982.

21. B. Z. Bobrovsky and M. Zakai. Asymptotic a priori estimates for the error in the nonlinear filtering problem. IEEE Trans. Inform. Theory, 28:371–376, 1982.

22. N. Bourbaki. Éléments de Mathématique: Topologie Générale [French]. Hermann, Paris, France, 1958.

23. Leo Breiman. Probability. Classics in Applied Mathematics. SIAM, Philadelphia, PA, 1992.

24. Damiano Brigo, Bernard Hanzon, and François Le Gland. A differential geometric approach to nonlinear filtering: the projection filter. IEEE Trans. Automat. Control, 43(2):247–252, 1998.

25. Damiano Brigo, Bernard Hanzon, and François Le Gland. Approximate nonlinear filtering by projection on exponential manifolds of densities. Bernoulli, 5(3):495–534, 1999.

26. R. W. Brockett. Nonlinear systems and nonlinear estimation theory. In Stochastic Systems: The Mathematics of Filtering and Identification and Applications (Les Arcs, 1980), volume 78 of NATO Adv. Study Inst. Ser. C: Math. Phys. Sci., pages 441–477, Dordrecht-Boston, 1981. Reidel.

27. R. W. Brockett. Nonlinear control theory and differential geometry. In Z. Ciesielski and C. Olech, editors, Proceedings of the International Congress of Mathematicians, pages 1357–1367, Warsaw, 1984. Polish Scientific.

28. R. W. Brockett and J. M. C. Clark. The geometry of the conditional density equation. Analysis and optimisation of stochastic systems. In Proceedings of the International Conference, University of Oxford, Oxford, 1978, pages 299–309, London-New York, 1980. Academic Press.

29. R. S. Bucy. Optimum finite time filters for a special non-stationary class of inputs. Technical Report Internal Report B. B. D. 600, March 31, Johns Hopkins Applied Physics Laboratory, 1959.

30. R. S. Bucy. Nonlinear filtering. IEEE Trans. Automatic Control, AC-10:198, 1965.


31. R. S. Bucy and P. D. Joseph. Filtering for Stochastic Processes with Applications to Guidance. Chelsea, New York, second edition, 1987.

32. A. Budhiraja and G. Kallianpur. Approximations to the solution of the Zakai equation using multiple Wiener and Stratonovich integral expansions. Stochastics Stochastics Rep., 56(3-4):271–315, 1996.

33. A. Budhiraja and G. Kallianpur. The Feynman-Stratonovich semigroup and Stratonovich integral expansions in nonlinear filtering. Appl. Math. Optim., 35(1):91–116, 1997.

34. A. Budhiraja and D. Ocone. Exponential stability in discrete-time filtering for non-ergodic signals. Stochastic Process. Appl., 82(2):245–257, 1999.

35. Amarjit Budhiraja and Harold J. Kushner. Approximation and limit results for nonlinear filters over an infinite time interval. II. Random sampling algorithms. SIAM J. Control Optim., 38(6):1874–1908 (electronic), 2000.

36. D. L. Burkholder. Distribution function inequalities for martingales. Ann. Prob., 1(1):19–42, 1973.

37. Z. Cai, F. Le Gland, and H. Zhang. An adaptive local grid refinement method for nonlinear filtering. Technical Report 2679, INRIA, 1995.

38. J. Carpenter, P. Clifford, and P. Fearnhead. An improved particle filter for non-linear problems. IEE Proceedings – Radar, Sonar and Navigation, 146:2–7, 1999.

39. J. R. Carpenter, P. Clifford, and P. Fearnhead. Sampling strategies for Monte Carlo filters for non-linear systems. IEE Colloquium Digest, 243:6/1–6/3, 1996.

40. M. Chaleyat-Maurel. Robustesse du filtre et calcul des variations stochastique. J. Funct. Anal., 68(1):55–71, 1986.

41. M. Chaleyat-Maurel. Continuity in nonlinear filtering. Some different approaches. In Stochastic Partial Differential Equations and Applications (Trento, 1985), volume 1236 of Lecture Notes in Math., pages 25–39. Springer, Berlin, 1987.

42. M. Chaleyat-Maurel and D. Michel. Des résultats de non existence de filtre de dimension finie. Stochastics, 13(1-2):83–102, 1984.

43. M. Chaleyat-Maurel and D. Michel. Hypoellipticity theorems and conditional laws. Z. Wahrsch. Verw. Gebiete, 65(4):573–597, 1984.

44. M. Chaleyat-Maurel and D. Michel. The support of the law of a filter in $C^\infty$ topology. In Stochastic Differential Systems, Stochastic Control Theory and Applications (Minneapolis, MN, 1986), volume 10 of IMA Vol. Math. Appl., pages 395–407. Springer, New York, 1988.

45. M. Chaleyat-Maurel and D. Michel. The support of the density of a filter in the uncorrelated case. In Stochastic Partial Differential Equations and Applications, II (Trento, 1988), volume 1390 of Lecture Notes in Math., pages 33–41. Springer, Berlin, 1989.

46. M. Chaleyat-Maurel and D. Michel. Support theorems in nonlinear filtering. In New Trends in Nonlinear Control Theory (Nantes, 1988), volume 122 of Lecture Notes in Control and Inform. Sci., pages 396–403. Springer, Berlin, 1989.

47. M. Chaleyat-Maurel and D. Michel. A Stroock Varadhan support theorem in nonlinear filtering theory. Probab. Theory Related Fields, 84(1):119–139, 1990.

48. J. Chen, S. S.-T. Yau, and C.-W. Leung. Finite-dimensional filters with nonlinear drift. IV. Classification of finite-dimensional estimation algebras of maximal rank with state-space dimension 3. SIAM J. Control Optim., 34(1):179–198, 1996.


49. J. Chen, S. S.-T. Yau, and C.-W. Leung. Finite-dimensional filters with nonlinear drift. VIII. Classification of finite-dimensional estimation algebras of maximal rank with state-space dimension 4. SIAM J. Control Optim., 35(4):1132–1141, 1997.

50. W. L. Chiou and S. S.-T. Yau. Finite-dimensional filters with nonlinear drift. II. Brockett’s problem on classification of finite-dimensional estimation algebras. SIAM J. Control Optim., 32(1):297–310, 1994.

51. N. Chopin. Central limit theorem for sequential Monte Carlo methods and its application to Bayesian inference. Annals of Statistics, 32(6):2385–2411, 2004.

52. P.-L. Chow, R. Khasminskii, and R. Liptser. Tracking of signal and its derivatives in Gaussian white noise. Stochastic Process. Appl., 69(2):259–273, 1997.

53. K. L. Chung and R. J. Williams. Introduction to Stochastic Integration. Birkhäuser, Boston, second edition, 1990.

54. B. Cipra. Engineers look to Kalman filtering for guidance. SIAM News, 26(5), 1993.

55. J. M. C. Clark. Conditions for one to one correspondence between an observation process and its innovation. Technical report, Centre for Computing and Automation, Imperial College, London, 1969.

56. J. M. C. Clark. The design of robust approximations to the stochastic differential equations of nonlinear filtering. In J. K. Skwirzynski, editor, Communication Systems and Random Process Theory, volume 25 of Proc. 2nd NATO Advanced Study Inst. Ser. E, Appl. Sci., pages 721–734. Sijthoff & Noordhoff, Alphen aan den Rijn, 1978.

57. J. M. C. Clark, D. L. Ocone, and C. Coumarbatch. Relative entropy and error bounds for filtering of Markov processes. Math. Control Signals Systems, 12(4):346–360, 1999.

58. M. Cohen de Lara. Finite-dimensional filters. II. Invariance group techniques. SIAM J. Control Optim., 35(3):1002–1029, 1997.

59. M. Cohen de Lara. Finite-dimensional filters. Part I: The Wei normal technique. Part II: Invariance group technique. SIAM J. Control Optim., 35(3):980–1029, 1997.

60. D. Crisan. Exact rates of convergence for a branching particle approximation to the solution of the Zakai equation. Ann. Probab., 31(2):693–718, 2003.

61. D. Crisan. Particle approximations for a class of stochastic partial differential equations. Appl. Math. Optim., 54(3):293–314, 2006.

62. D. Crisan, P. Del Moral, and T. Lyons. Interacting particle systems approximations of the Kushner-Stratonovitch equation. Adv. in Appl. Probab., 31(3):819–838, 1999.

63. D. Crisan, J. Gaines, and T. Lyons. Convergence of a branching particle method to the solution of the Zakai equation. SIAM J. Appl. Math., 58(5):1568–1590, 1998.

64. D. Crisan and T. Lyons. Nonlinear filtering and measure-valued processes. Probab. Theory Related Fields, 109(2):217–244, 1997.

65. D. Crisan and T. Lyons. A particle approximation of the solution of the Kushner-Stratonovitch equation. Probab. Theory Related Fields, 115(4):549–578, 1999.

66. D. Crisan and T. Lyons. Minimal entropy approximations and optimal algorithms for the filtering problem. Monte Carlo Methods and Applications, 8(4):343–356, 2002.


67. D. Crisan, P. Del Moral, and T. Lyons. Discrete filtering using branching and interacting particle systems. Markov Processes and Related Fields, 5(3):293–318, 1999.

68. R. W. R. Darling. Geometrically intrinsic nonlinear recursive filters. Technical report, Berkeley Statistics Department, 1998. http://www.stat.berkeley.edu/~darling/GINRF.

69. F. E. Daum. New exact nonlinear filters. In J. C. Spall, editor, Bayesian Analysis of Time Series and Dynamic Models, pages 199–226, New York, 1988. Marcel Dekker.

70. F. E. Daum. New exact nonlinear filters: Theory and applications. Proc. SPIE, 2235:636–649, 1994.

71. M. H. A. Davis. Linear Estimation and Stochastic Control. Chapman and Hall Mathematics Series. Chapman and Hall, London, 1977.

72. M. H. A. Davis. On a multiplicative functional transformation arising in nonlinear filtering theory. Z. Wahrsch. Verw. Gebiete, 54(2):125–139, 1980.

73. M. H. A. Davis. New approach to filtering for nonlinear systems. Proc. IEE-D, 128(5):166–172, 1981.

74. M. H. A. Davis. Pathwise nonlinear filtering. In M. Hazewinkel and J. C. Willems, editors, Stochastic Systems: The Mathematics of Filtering and Identification and Applications, Proc. NATO Advanced Study Inst. Ser. C 78, pages 505–528, Dordrecht-Boston, 1981. Reidel.

75. M. H. A. Davis. A pathwise solution of the equations of nonlinear filtering. Theory Probability Applications [trans. of Teor. Veroyatnost. i Primenen.], 27(1):167–175, 1982.

76. M. H. A. Davis and M. P. Spathopoulos. Pathwise nonlinear filtering for nondegenerate diffusions with noise correlation. SIAM J. Control Optim., 25(2):260–278, 1987.

77. Claude Dellacherie and Paul-André Meyer. Probabilités et potentiel. Chapitres I à IV. [French] [Probability and potential. Chapters I–IV]. Hermann, Paris, 1975.

78. Claude Dellacherie and Paul-André Meyer. Un nouveau théorème de projection et de section [French]. In Séminaire de Probabilités, IX (Seconde Partie, Univ. Strasbourg, Années universitaires 1973/1974 et 1974/1975), pages 239–245. Springer Verlag, New York, 1975.

79. Claude Dellacherie and Paul-André Meyer. Probabilités et potentiel. Chapitres V à VIII. [French] [Probability and potential. Chapters V–VIII] Théorie des martingales. Hermann, Paris, 1980.

80. Giovanni B. Di Masi and Wolfgang J. Runggaldier. An adaptive linear approach to nonlinear filtering. In Applications of Mathematics in Industry and Technology (Siena, 1988), pages 308–316. Teubner, Stuttgart, 1989.

81. J. L. Doob. Stochastic Processes. Wiley, New York, 1963.

82. J. L. Doob. Stochastic process measurability conditions. Annales de l’institut Fourier, 25(3–4):163–176, 1975.

83. Arnaud Doucet, Nando de Freitas, and Neil Gordon. Sequential Monte Carlo Methods in Practice. Stat. Eng. Inf. Sci. Springer, New York, 2001.

84. T. E. Duncan. Likelihood functions for stochastic signals in white noise. Information and Control, 16:303–310, 1970.

85. T. E. Duncan. On the absolute continuity of measures. Ann. Math. Statist., 41:30–38, 1970.


86. T. E. Duncan. On the steady state filtering problem for linear pure delay time systems. In Analysis and control of systems (IRIA Sem., Rocquencourt, 1979), pages 25–42. INRIA, Rocquencourt, 1980.

87. T. E. Duncan. Stochastic filtering in manifolds. In Control Science and Technology for the Progress of Society, Vol. 1 (Kyoto, 1981), pages 553–556. IFAC, Luxembourg, 1982.

88. T. E. Duncan. Explicit solutions for an estimation problem in manifolds associated with Lie groups. In Differential Geometry: The Interface Between Pure and Applied Mathematics (San Antonio, TX, 1986), volume 68 of Contemp. Math., pages 99–109. Amer. Math. Soc., Providence, RI, 1987.

89. T. E. Duncan. An estimation problem in compact Lie groups. Systems Control Lett., 10(4):257–263, 1988.

90. R. J. Elliott and V. Krishnamurthy. Exact finite-dimensional filters for maximum likelihood parameter estimation of continuous-time linear Gaussian systems. SIAM J. Control Optim., 35(6):1908–1923, 1997.

91. R. J. Elliott and J. van der Hoek. A finite-dimensional filter for hybrid observations. IEEE Trans. Automat. Control, 43(5):736–739, 1998.

92. Robert J. Elliott and Michael Kohlmann. Robust filtering for correlated multidimensional observations. Math. Z., 178(4):559–578, 1981.

93. Robert J. Elliott and Michael Kohlmann. The existence of smooth densities for the prediction filtering and smoothing problems. Acta Appl. Math., 14(3):269–286, 1989.

94. Robert J. Elliott and John B. Moore. Zakai equations for Hilbert space valued processes. Stochastic Anal. Appl., 16(4):597–605, 1998.

95. Stewart N. Ethier and Thomas G. Kurtz. Markov Processes: Characterization and Convergence. Wiley, New York, 1986.

96. Marco Ferrante and Wolfgang J. Runggaldier. On necessary conditions for the existence of finite-dimensional filters in discrete time. Systems Control Lett., 14(1):63–69, 1990.

97. W. H. Fleming and E. Pardoux. Optimal control of partially observed diffusions. SIAM J. Control Optim., 20(2):261–285, 1982.

98. Wendell H. Fleming and Sanjoy K. Mitter. Optimal control and nonlinear filtering for nondegenerate diffusion processes. Stochastics, 8(1):63–77, 1982/83.

99. Patrick Florchinger. Malliavin calculus with time dependent coefficients and application to nonlinear filtering. Probab. Theory Related Fields, 86(2):203–223, 1990.

100. Patrick Florchinger and François Le Gland. Time-discretization of the Zakai equation for diffusion processes observed in correlated noise. In Analysis and Optimization of Systems (Antibes, 1990), volume 144 of Lecture Notes in Control and Inform. Sci., pages 228–237. Springer, Berlin, 1990.

101. Patrick Florchinger and François Le Gland. Time-discretization of the Zakai equation for diffusion processes observed in correlated noise. Stochastics Stochastics Rep., 35(4):233–256, 1991.

102. Avner Friedman. Partial Differential Equations of Parabolic Type. Prentice-Hall, Englewood Cliffs, NJ, 1964.

103. P. Frost and T. Kailath. An innovations approach to least-squares estimation. III. IEEE Trans. Autom. Control, AC-16:217–226, 1971.

104. M. Fujisaki, G. Kallianpur, and H. Kunita. Stochastic differential equations for the non linear filtering problem. Osaka J. Math., 9:19–40, 1972.


105. R. K. Getoor. On the construction of kernels. In Séminaire de Probabilités, IX (Seconde Partie, Univ. Strasbourg, Années universitaires 1973/1974 et 1974/1975), volume 465 of Lecture Notes in Math., pages 443–463. Springer Verlag, Berlin, 1975.

106. N. J. Gordon, D. J. Salmond, and A. F. M. Smith. Novel approach to nonlinear/non-Gaussian Bayesian state estimation. IEE Proceedings, Part F, 140(2):107–113, 1993.

107. B. Grigelionis. The theory of nonlinear estimation and semimartingales. Izv. Akad. Nauk UzSSR Ser. Fiz.-Mat. Nauk, (3):17–22, 97, 1981.

108. B. Grigelionis. Stochastic nonlinear filtering equations and semimartingales. In Nonlinear Filtering and Stochastic Control (Cortona, 1981), volume 972 of Lecture Notes in Math., pages 63–99. Springer, Berlin, 1982.

109. B. Grigelionis and R. Mikulevicius. On weak convergence to random processes with boundary conditions. In Nonlinear Filtering and Stochastic Control (Cortona, 1981), volume 972 of Lecture Notes in Math., pages 260–275. Springer, Berlin, 1982.

110. B. Grigelionis and R. Mikulevicius. Stochastic evolution equations and densities of the conditional distributions. In Theory and Application of Random Fields (Bangalore, 1982), volume 49 of Lecture Notes in Control and Inform. Sci., pages 49–88. Springer, Berlin, 1983.

111. B. Grigelionis and R. Mikulyavichyus. Robustness in nonlinear filtering theory. Litovsk. Mat. Sb., 22(4):37–45, 1982.

112. I. Gyöngy. The approximation of stochastic partial differential equations and applications in nonlinear filtering. Comput. Math. Appl., 19(1):47–63, 1990.

113. I. Gyöngy and N. V. Krylov. Stochastic partial differential equations with unbounded coefficients and applications. II. Stochastics Stochastics Rep., 32(3-4):165–180, 1990.

114. I. Gyöngy and N. V. Krylov. On stochastic partial differential equations with unbounded coefficients. In Stochastic partial differential equations and applications (Trento, 1990), volume 268 of Pitman Res. Notes Math. Ser., pages 191–203. Longman Sci. Tech., Harlow, 1992.

115. István Gyöngy. On stochastic partial differential equations. Results on approximations. In Topics in Stochastic Systems: Modelling, Estimation and Adaptive Control, volume 161 of Lecture Notes in Control and Inform. Sci., pages 116–136. Springer, Berlin, 1991.

116. István Gyöngy. Filtering on manifolds. Acta Appl. Math., 35(1-2):165–177, 1994. White noise models and stochastic systems (Enschede, 1992).

117. István Gyöngy. Stochastic partial differential equations on manifolds. II. Nonlinear filtering. Potential Anal., 6(1):39–56, 1997.

118. István Gyöngy and Nicolai Krylov. On the rate of convergence of splitting-up approximations for SPDEs. In Stochastic inequalities and applications, volume 56 of Progr. Probab., pages 301–321. Birkhäuser, 2003.

119. István Gyöngy and Nicolai Krylov. On the splitting-up method and stochastic partial differential equations. Ann. Probab., 31(2):564–591, 2003.

120. J. E. Handschin and D. Q. Mayne. Monte Carlo techniques to estimate the conditional expectation in multi-stage non-linear filtering. Internat. J. Control, 1(9):547–559, 1969.

121. M. Hazewinkel, S. I. Marcus, and H. J. Sussmann. Nonexistence of finite-dimensional filters for conditional statistics of the cubic sensor problem. Systems Control Lett., 3(6):331–340, 1983.


122. M. Hazewinkel, S. I. Marcus, and H. J. Sussmann. Nonexistence of finite-dimensional filters for conditional statistics of the cubic sensor problem. In Filtering and Control of Random Processes (Paris, 1983), volume 61 of Lecture Notes in Control and Inform. Sci., pages 76–103, Berlin, 1984. Springer.

123. Michiel Hazewinkel. Lie algebraic methods in filtering and identification. In VIIIth International Congress on Mathematical Physics (Marseille, 1986), pages 120–137. World Scientific, Singapore, 1987.

124. Michiel Hazewinkel. Lie algebraic method in filtering and identification. In Stochastic Processes in Physics and Engineering (Bielefeld, 1986), volume 42 of Math. Appl., pages 159–176. Reidel, Dordrecht, 1988.

125. Michiel Hazewinkel. Non-Gaussian linear filtering, identification of linear systems, and the symplectic group. In Modeling and Control of Systems in Engineering, Quantum Mechanics, Economics and Biosciences (Sophia-Antipolis, 1988), volume 121 of Lecture Notes in Control and Inform. Sci., pages 299–308. Springer, Berlin, 1989.

126. Michiel Hazewinkel. Non-Gaussian linear filtering, identification of linear systems, and the symplectic group. In Signal Processing, Part II, volume 23 of IMA Vol. Math. Appl., pages 99–113. Springer, New York, 1990.

127. A. J. Heunis. Nonlinear filtering of rare events with large signal-to-noise ratio. J. Appl. Probab., 24(4):929–948, 1987.

128. A. J. Heunis. On the stochastic differential equations of filtering theory. Appl. Math. Comput., 37(3):185–218, 1990.

129. A. J. Heunis. On the stochastic differential equations of filtering theory. Appl. Math. Comput., 39(3, suppl.):3s–36s, 1990.

130. Andrew Heunis. Rates of convergence for an adaptive filtering algorithm driven by stationary dependent data. SIAM J. Control Optim., 32(1):116–139, 1994.

131. Guo-Qing Hu, Stephen S. T. Yau, and Wen-Lin Chiou. Finite-dimensional filters with nonlinear drift. XIII. Classification of finite-dimensional estimation algebras of maximal rank with state space dimension five. Loo-Keng Hua: a great mathematician of the twentieth century. Asian J. Math., 4(4):905–931, 2000.

132. M. Isard and A. Blake. Visual tracking by stochastic propagation of conditional density. In Proceedings of the 4th European Conference on Computer Vision, pages 343–356, New York, 1996. Springer Verlag.

133. M. Isard and A. Blake. Condensation conditional density propagation for visual tracking. Int. J. Computer Vision, 1998.

134. M. Isard and A. Blake. A mixed-state condensation tracker with automatic model switching. In Proceedings of the 6th International Conference on Computer Vision, pages 107–112, 1998.

135. K. Itô and H. P. McKean. Diffusion Processes and Their Sample Paths. Academic Press, New York, 1965.

136. Matthew R. James and François Le Gland. Numerical approximation for nonlinear filtering and finite-time observers. In Applied Stochastic Analysis (New Brunswick, NJ, 1991), volume 177 of Lecture Notes in Control and Inform. Sci., pages 159–175. Springer, Berlin, 1992.

137. T. Kailath. An innovations approach to least-squares estimation. I. Linear filtering in additive white noise. IEEE Trans. Autom. Control, AC-13:646–655, 1968.


138. G. Kallianpur. White noise theory of filtering: Some robustness and consistency results. In Stochastic Differential Systems (Marseille-Luminy, 1984), volume 69 of Lecture Notes in Control and Inform. Sci., pages 217–223. Springer, Berlin, 1985.

139. G. Kallianpur and R. L. Karandikar. The Markov property of the filter in the finitely additive white noise approach to nonlinear filtering. Stochastics, 13(3):177–198, 1984.

140. G. Kallianpur and R. L. Karandikar. Measure-valued equations for the optimum filter in finitely additive nonlinear filtering theory. Z. Wahrsch. Verw. Gebiete, 66(1):1–17, 1984.

141. G. Kallianpur and R. L. Karandikar. A finitely additive white noise approach to nonlinear filtering: A brief survey. In Multivariate Analysis VI (Pittsburgh, PA, 1983), pages 335–344. North-Holland, Amsterdam, 1985.

142. G. Kallianpur and R. L. Karandikar. White noise calculus and nonlinear filtering theory. Ann. Probab., 13(4):1033–1107, 1985.

143. G. Kallianpur and R. L. Karandikar. White Noise Theory of Prediction, Filtering and Smoothing, volume 3 of Stochastics Monographs. Gordon & Breach Science, New York, 1988.

144. G. Kallianpur and C. Striebel. Estimation of stochastic systems: Arbitrary system process with additive white noise observation errors. Ann. Math. Statist., 39(3):785–801, 1968.

145. Gopinath Kallianpur. Stochastic filtering theory, volume 13 of Applications of Mathematics. Springer, New York, 1980.

146. R. E. Kalman. A new approach to linear filtering and prediction problems. J. Basic Eng., 82:35–45, 1960.

147. R. E. Kalman and R. S. Bucy. New results in linear filtering and prediction theory. Trans. ASME, Ser. D, J. Basic Eng., 83:95–108, 1961.

148. Jim Kao, Dawn Flicker, Kayo Ide, and Michael Ghil. Estimating model parameters for an impact-produced shock-wave simulation: Optimal use of partial data with the extended Kalman filter. J. Comput. Phys., 214(2):725–737, 2006.

149. I. Karatzas and S. E. Shreve. Brownian Motion and Stochastic Calculus, volume 113 of Graduate Texts in Mathematics. Springer, New York, second edition, 1991.

150. Genshiro Kitagawa. Non-Gaussian state-space modeling of nonstationary time series. With comments and a reply by the author. J. Amer. Statist. Assoc., 82(400):1032–1063, 1987.

151. P. E. Kloeden and E. Platen. The Numerical Solution of Stochastic Differential Equations. Springer, New York, 1992.

152. A. N. Kolmogorov. Sur l’interpolation et extrapolation des suites stationnaires. C. R. Acad. Sci., 208:2043, 1939.

153. A. N. Kolmogorov. Interpolation and extrapolation. Bulletin de l’académie des sciences de U.S.S.R., Ser. Math., 5:3–14, 1941.

154. Hayri Korezlioglu and Wolfgang J. Runggaldier. Filtering for nonlinear systems driven by nonwhite noises: An approximation scheme. Stochastics Stochastics Rep., 44(1-2):65–102, 1993.

155. M. G. Krein. On a generalization of some investigations of G. Szego, W. M. Smirnov, and A. N. Kolmogorov. Dokl. Akad. Nauk SSSR, 46:91–94, 1945.

156. M. G. Krein. On a problem of extrapolation of A. N. Kolmogorov. Dokl. Akad. Nauk SSSR, 46:306–309, 1945.


157. N. V. Krylov. On Lp-theory of stochastic partial differential equations in thewhole space. SIAM J. Math. Anal., 27(2):313–340, 1996.

158. N. V. Krylov. An analytic approach to SPDEs. In Stochastic Partial Differen-tial Equations: Six Perspectives, number 64 in Math. Surveys Monogr., pages185–242. Amer. Math. Soc., Providence, RI, 1999.

159. N. V. Krylov and B. L. Rozovskiı. The Cauchy problem for linear stochasticpartial differential equations. Izv. Akad. Nauk SSSR Ser. Mat., 41(6):1329–1347, 1448, 1977.

160. N. V. Krylov and B. L. Rozovskii. Conditional distributions of diffusion pro-cesses. Izv. Akad. Nauk SSSR Ser. Mat., 42(2):356–378,470, 1978.

161. N. V. Krylov and B. L. Rozovskiı. Characteristics of second-order degenerateparabolic Ito equations. Trudy Sem. Petrovsk., (8):153–168, 1982.

162. N. V. Krylov and B. L. Rozovskiı. Stochastic partial differential equations anddiffusion processes. Uspekhi Mat. Nauk, 37(6(228)):75–95, 1982.

163. N. V. Krylov and A. Zatezalo. A direct approach to deriving filtering equationsfor diffusion processes. Appl. Math. Optim., 42(3):315–332, 2000.

164. H. Kunita. Stochastic Flows and Stochastic Differential Equations. Number 24in Cambridge Studies in Advanced Mathematics. Cambridge University Press,Cambridge, UK, 1990.

165. Hiroshi Kunita. Cauchy problem for stochastic partial differential equationsarising in nonlinear filtering theory. Systems Control Lett., 1(1):37–41, 1981/82.

166. Hiroshi Kunita. Stochastic partial differential equations connected with non-linear filtering. In Nonlinear Filtering and Stochastic Control (Cortona, 1981),volume 972 of Lecture Notes in Math., pages 100–169. Springer, Berlin, 1982.

167. Hiroshi Kunita. Ergodic properties of nonlinear filtering processes. In SpatialStochastic Processes, volume 19 of Progr. Probab., pages 233–256. BirkhauserBoston, 1991.

168. Hiroshi Kunita. The stability and approximation problems in nonlinear filteringtheory. In Stochastic Analysis, pages 311–330. Academic Press, Boston, 1991.

169. Hans R. Kunsch. Recursive Monte Carlo filters: Algorithms and theoreticalanalysis. Ann. Statist., 33(5):1983–2021, 2005.

170. T. G. Kurtz and D. L. Ocone. Unique characterization of conditional distribu-tions in nonlinear filtering. Ann. Probab., 16(1):80–107, 1988.

171. T. G. Kurtz and J. Xiong. Numerical solutions for a class of SPDEs with ap-plication to filtering. In Stochastics in Finite and Infinite Dimensions, TrendsMath., pages 233–258. Birkhauser Boston, 2001.

172. Thomas G. Kurtz. Martingale problems for conditional distributions of Markovprocesses. Electron. J. Probab., 3:no. 9, 29 pp. (electronic), 1998.

173. Thomas G. Kurtz and Daniel Ocone. A martingale problem for conditionaldistributions and uniqueness for the nonlinear filtering equations. In StochasticDifferential Systems (Marseille-Luminy, 1984), volume 69 of Lecture Notes inControl and Inform. Sci., pages 224–234. Springer, Berlin, 1985.

174. Thomas G. Kurtz and Jie Xiong. Particle representations for a class of non-linear SPDEs. Stochastic Process. Appl., 83(1):103–126, 1999.

175. H. Kushner. On the differential equations satisfied by conditional densities of Markov processes, with applications. SIAM J. Control, 2:106–119, 1964.

176. H. Kushner. Technical Report JA2123, M.I.T Lincoln Laboratory, March 1963.

177. H. J. Kushner. Approximations of nonlinear filters. IEEE Trans. Automat. Control, AC-12:546–556, 1967.


178. H. J. Kushner. Dynamical equations for optimal nonlinear filtering. J. Differential Equations, 3:179–190, 1967.

179. H. J. Kushner. A robust discrete state approximation to the optimal nonlinear filter for a diffusion. Stochastics, 3(2):75–83, 1979.

180. H. J. Kushner. Robustness and convergence of approximations to nonlinear filters for jump-diffusions. Matematica Aplicada e Computacional, 16(2):153–183, 1997.

181. H. J. Kushner and P. Dupuis. Numerical Methods for Stochastic Control Problems in Continuous Time. Number 24 in Applications of Mathematics. Springer, New York, 1992.

182. Harold J. Kushner. Weak Convergence Methods and Singularly Perturbed Stochastic Control and Filtering Problems, volume 3 of Systems & Control: Foundations & Applications. Birkhauser Boston, 1990.

183. Harold J. Kushner and Amarjit S. Budhiraja. A nonlinear filtering algorithm based on an approximation of the conditional distribution. IEEE Trans. Autom. Control, 45(3):580–585, 2000.

184. Harold J. Kushner and Hai Huang. Approximate and limit results for nonlinear filters with wide bandwidth observation noise. Stochastics, 16(1-2):65–96, 1986.

185. S. Kusuoka and D. Stroock. The partial Malliavin calculus and its application to nonlinear filtering. Stochastics, 12(2):83–142, 1984.

186. F. Le Gland. Time discretization of nonlinear filtering equations. In Proceedings of the 28th IEEE-CSS Conference Decision Control, Tampa, FL, pages 2601–2606, 1989.

187. Francois Le Gland. Splitting-up approximation for SPDEs and SDEs with application to nonlinear filtering. In Stochastic Partial Differential Equations and Their Applications (Charlotte, NC, 1991), volume 176 of Lecture Notes in Control and Inform. Sci., pages 177–187. Springer, New York, 1992.

188. Francois Le Gland and Nadia Oudjane. Stability and uniform approximation of nonlinear filters using the Hilbert metric and application to particle filters. Ann. Appl. Probab., 14(1):144–187, 2004.

189. J. Levine. Finite-dimensional realizations of stochastic PDEs and application to filtering. Stochastics Stochastics Rep., 37(1–2):75–103, 1991.

190. Robert Liptser and Ofer Zeitouni. Robust diffusion approximation for nonlinear filtering. J. Math. Systems Estim. Control, 8(1):22 pp. (electronic), 1998.

191. Robert S. Liptser and Wolfgang J. Runggaldier. On diffusion approximations for filtering. Stochastic Process. Appl., 38(2):205–238, 1991.

192. Robert S. Liptser and Albert N. Shiryaev. Statistics of Random Processes. I General Theory, volume 5 of Stochastic Modelling and Applied Probability. Springer, New York, second edition, 2001. Translated from the 1974 Russian original by A. B. Aries.

193. Robert S. Liptser and Albert N. Shiryaev. Statistics of Random Processes. II Applications, volume 6 of Stochastic Modelling and Applied Probability. Springer, New York, second edition, 2001. Translated from the 1974 Russian original by A. B. Aries.

194. S. Lototsky, C. Rao, and B. Rozovskii. Fast nonlinear filter for continuous-discrete time multiple models. In Proceedings of the 35th IEEE Conference on Decision and Control, Kobe, Japan, 1996, volume 4, pages 4060–4064, Madison, WI, 1997. Omnipress.


195. S. V. Lototsky. Optimal filtering of stochastic parabolic equations. In Recent Developments in Stochastic Analysis and Related Topics, pages 330–353. World Scientific, Hackensack, NJ, 2004.

196. S. V. Lototsky. Wiener chaos and nonlinear filtering. Appl. Math. Optim., 54(3):265–291, 2006.

197. Sergey Lototsky, Remigijus Mikulevicius, and Boris L. Rozovskii. Nonlinear filtering revisited: A spectral approach. SIAM J. Control Optim., 35(2):435–461, 1997.

198. Sergey Lototsky and Boris Rozovskii. Stochastic differential equations: A Wiener chaos approach. In From Stochastic Calculus to Mathematical Finance, pages 433–506. Springer, New York, 2006.

199. Sergey V. Lototsky. Nonlinear filtering of diffusion processes in correlated noise: analysis by separation of variables. Appl. Math. Optim., 47(2):167–194, 2003.

200. Vladimir M. Lucic and Andrew J. Heunis. On uniqueness of solutions for the stochastic differential equations of nonlinear filtering. Ann. Appl. Probab., 11(1):182–209, 2001.

201. T. M. Macrobert. Functions of a Complex Variable. St. Martin’s Press, New York, 1954.

202. Michael Mangold, Markus Grotsch, Min Sheng, and Achim Kienle. State estimation of a molten carbonate fuel cell by an extended Kalman filter. In Control and Observer Design for Nonlinear Finite and Infinite Dimensional Systems, volume 322 of Lecture Notes in Control and Inform. Sci., pages 93–109. Springer, New York, 2005.

203. S. J. Maybank. Path integrals and finite-dimensional filters. In Stochastic Partial Differential Equations (Edinburgh, 1994), volume 216 of London Math. Soc. Lecture Note Ser., pages 209–229, Cambridge, UK, 1995. Cambridge University Press.

204. Stephen Maybank. Finite-dimensional filters. Phil. Trans. R. Soc. Lond. A, 354(1710):1099–1123, 1996.

205. Paul-Andre Meyer. Sur un probleme de filtration [French]. In Seminaire de Probabilites, VII (Univ. Strasbourg), Annees universitaire 1971/1972, volume 321 of Lecture Notes in Math., pages 223–247. Springer Verlag, Berlin, 1973.

206. Paul-Andre Meyer. La theorie de la prediction de F. Knight [French]. In Seminaire de Probabilites, X (Univ. Strasbourg), Annees universitaire 1974/1975, volume 511 of Lecture Notes in Math., pages 86–103. Springer Verlag, Berlin, 1976.

207. Dominique Michel. Regularite des lois conditionnelles en theorie du filtrage non-lineaire et calcul des variations stochastique. J. Funct. Anal., 41(1):8–36, 1981.

208. R. Mikulevicius and B. L. Rozovskii. Separation of observations and parameters in nonlinear filtering. In Proceedings of the 32nd IEEE Conference on Decision and Control, Part 2, San Antonio. IEEE Control Systems Society, 1993.

209. R. Mikulevicius and B. L. Rozovskii. Fourier-Hermite expansions for nonlinear filtering. Teor. Veroyatnost. i Primenen., 44(3):675–680, 1999.

210. Sanjoy K. Mitter. Existence and nonexistence of finite-dimensional filters. Rend. Sem. Mat. Univ. Politec. Torino, Special Issue:173–188, 1982.

211. Sanjoy K. Mitter. Geometric theory of nonlinear filtering. In Mathematical Tools and Models for Control, Systems Analysis and Signal Processing, Vol. 3 (Toulouse/Paris, 1981/1982), Travaux Rech. Coop. Programme 567, pages 37–60. CNRS, Paris, 1983.

212. Sanjoy K. Mitter and Nigel J. Newton. A variational approach to nonlinear estimation. SIAM J. Control Optim., 42(5):1813–1833 (electronic), 2003.

213. Sanjoy K. Mitter and Irvin C. Schick. Point estimation, stochastic approximation, and robust Kalman filtering. In Systems, Models and Feedback: Theory and Applications (Capri, 1992), volume 12 of Progr. Systems Control Theory, pages 127–151. Birkhauser Boston, 1992.

214. P. Del Moral. Non-linear filtering: Interacting particle solution. Markov Processes Related Fields, 2:555–580, 1996.

215. P. Del Moral. Non-linear filtering using random particles. Theory Probability Applications, 40(4):690–701, 1996.

216. P. Del Moral. Feynman-Kac formulae. Genealogical and Interacting Particle Systems with Applications. Springer, New York, 2004.

217. P. Del Moral and J. Jacod. The Monte-Carlo method for filtering with discrete-time observations: Central limit theorems. In Numerical Methods and Stochastics (Toronto, ON, 1999), Fields Inst. Commun., 34, pages 29–53. Amer. Math. Soc., Providence, RI, 2002.

218. P. Del Moral and L. Miclo. Branching and interacting particle systems approximations of Feynman-Kac formulae with applications to non-linear filtering. In Seminaire de Probabilites, XXXIV, volume 1729 of Lecture Notes in Math., pages 1–145. Springer, Berlin, 2000.

219. P. Del Moral, J. C. Noyer, G. Rigal, and G. Salut. Traitement particulaire du signal radar : detection, estimation et reconnaissance de cibles aeriennes. Technical Report 92495, LAAS, Décembre 1992.

220. P. Del Moral, G. Rigal, and G. Salut. Estimation et commande optimale non-lineaire : un cadre unifie pour la resolution particulaire. Technical Report 91137, LAAS, 1991.

221. P. Del Moral, G. Rigal, and G. Salut. Filtrage non-lineaire non-gaussien applique au recalage de plates-formes inertielles. Technical Report 92207, LAAS, Juin 1992.

222. R. E. Mortensen. Stochastic optimal control with noisy observations. Internat. J. Control, 1(4):455–464, 1966.

223. Christian Musso, Nadia Oudjane, and Francois Le Gland. Improving regularised particle filters. In Sequential Monte Carlo Methods in Practice, Stat. Eng. Inf. Sci., pages 247–271. Springer, New York, 2001.

224. David E. Newland. Harmonic wavelet analysis. Proc. Roy. Soc. London Ser. A, 443(1917):203–225, 1993.

225. Nigel J. Newton. Observation sampling and quantisation for continuous-time estimators. Stochastic Process. Appl., 87(2):311–337, 2000.

226. Nigel J. Newton. Observations preprocessing and quantization for nonlinear filters. SIAM J. Control Optim., 38(2):482–502 (electronic), 2000.

227. David Nualart. The Malliavin Calculus and Related Topics. Springer, New York, second edition, 2006.

228. D. L. Ocone. Asymptotic stability of Benes filters. Stochastic Anal. Appl., 17(6):1053–1074, 1999.

229. Daniel Ocone. Multiple integral expansions for nonlinear filtering. Stochastics, 10(1):1–30, 1983.


230. Daniel Ocone. Application of Wiener space analysis to nonlinear filtering. In Theory and Applications of Nonlinear Control Systems (Stockholm, 1985), pages 387–400. North-Holland, Amsterdam, 1986.

231. Daniel Ocone. Stochastic calculus of variations for stochastic partial differential equations. J. Funct. Anal., 79(2):288–331, 1988.

232. Daniel Ocone. Entropy inequalities and entropy dynamics in nonlinear filtering of diffusion processes. In Stochastic Analysis, Control, Optimization and Applications, Systems Control Found. Appl., pages 477–496. Birkhauser Boston, 1999.

233. Daniel Ocone and Etienne Pardoux. A Lie algebraic criterion for nonexistence of finite-dimensionally computable filters. In Stochastic Partial Differential Equations and Applications, II (Trento, 1988), volume 1390 of Lecture Notes in Math., pages 197–204. Springer, Berlin, 1989.

234. O. A. Oleĭnik and E. V. Radkevic. Second Order Equations with Nonnegative Characteristic Form. Plenum Press, New York, 1973.

235. Levent Ozbek and Murat Efe. An adaptive extended Kalman filter with application to compartment models. Comm. Statist. Simulation Comput., 33(1):145–158, 2004.

236. E. Pardoux. Equations aux derivees partielles stochastiques non lineaires monotones. PhD thesis, Univ Paris XI, Orsay, 1975.

237. E. Pardoux. Stochastic partial differential equations and filtering of diffusion processes. Stochastics, 3(2):127–167, 1979.

238. E. Pardoux. Filtrage non lineaire et equations aux derivees partielles stochastiques associees. In Ecole d’Ete de Probabilites de Saint-Flour XIX – 1989, volume 1464 of Lecture Notes in Mathematics, pages 67–163. Springer, 1991.

239. P. Parthasarathy. Probability Measures on Metric Spaces. Academic Press, New York, 1967.

240. J. Picard. Efficiency of the extended Kalman filter for nonlinear systems with small noise. SIAM J. Appl. Math., 51(3):843–885, 1991.

241. Jean Picard. Approximation of nonlinear filtering problems and order of convergence. In Filtering and Control of Random Processes (Paris, 1983), volume 61 of Lecture Notes in Control and Inform. Sci., pages 219–236. Springer, Berlin, 1984.

242. Jean Picard. An estimate of the error in time discretization of nonlinear filtering problems. In Theory and Applications of Nonlinear Control Systems (Stockholm, 1985), pages 401–412. North-Holland, Amsterdam, 1986.

243. Jean Picard. Nonlinear filtering of one-dimensional diffusions in the case of a high signal-to-noise ratio. SIAM J. Appl. Math., 46(6):1098–1125, 1986.

244. Michael K. Pitt and Neil Shephard. Filtering via simulation: Auxiliary particle filters. J. Amer. Statist. Assoc., 94(446):590–599, 1999.

245. M. Pontier, C. Stricker, and J. Szpirglas. Sur le theoreme de representation par rapport a l’innovation [French]. In Seminaire de Probabilites, XX (Univ. Strasbourg, Annees universitaires 1984/1985), volume 1204 of Lecture Notes in Math., pages 34–39. Springer Verlag, Berlin, 1986.

246. Yu. V. Prokhorov. Convergence of random processes and limit theorems in probability theory. Theory Probability Applications [Teor. Veroyatnost. i Primenen.], 1(2):157–214, 1956.

247. P. Protter. Stochastic Integration and Differential Equations. Springer, Berlin, second edition, 2003.


248. L. C. G. Rogers and D. Williams. Diffusions, Markov Processes and Martingales: Volume I Foundations. Cambridge University Press, Cambridge, UK, second edition, 2000.

249. L. C. G. Rogers and D. Williams. Diffusions, Markov Processes and Martingales: Volume II Ito Calculus. Cambridge University Press, Cambridge, UK, second edition, 2000.

250. B. L. Rozovskii. Stochastic Evolution Systems. Kluwer, Dordrecht, 1990.

251. D. B. Rubin. A noniterative sampling/importance resampling alternative to the data augmentation algorithm for creating a few imputations when the fraction of missing information is modest: The SIR algorithm (discussion of Tanner and Wong). J. Amer. Statist. Assoc., 82:543–546, 1987.

252. Laurent Saloff-Coste. Aspects of Sobolev-Type Inequalities, volume 289 of London Mathematical Society Lecture Note Series. Cambridge University Press, Cambridge, UK, 2002.

253. G. C. Schmidt. Designing nonlinear filters based on Daum’s theory. J. of Guidance, Control Dynamics, 16(2):371–376, 1993.

254. Carla A.I. Schwartz and Bradley W. Dickinson. Characterizing finite-dimensional filters for the linear innovations of continuous-time random processes. IEEE Trans. Autom. Control, 30(3):312–315, 1985.

255. A. N. Shiryaev. Some new results in the theory of controlled random processes [Russian]. In Transactions of the Fourth Prague Conference on Information Theory, Statistical Decision Functions, Random Processes (Prague, 1965), pages 131–203. Academia Prague, 1967.

256. Elias M. Stein. Singular Integrals and Differentiability Properties of Functions. Number 30 in Princeton Mathematical Series. Princeton University Press, Princeton, NJ, 1970.

257. R. L. Stratonovich. On the theory of optimal non-linear filtration of random functions. Teor. Veroyatnost. i Primenen., 4:223–225, 1959.

258. R. L. Stratonovich. Application of the theory of Markov processes for optimum filtration of signals. Radio Eng. Electron. Phys, 1:1–19, 1960.

259. R. L. Stratonovich. Conditional Markov processes. Theory Probability Applications [translation of Teor. Verojatnost. i Primenen.], 5(2):156–178, 1960.

260. R. L. Stratonovich. Conditional Markov Processes and Their Application to the Theory of Optimal Control, volume 7 of Modern Analytic and Computational Methods in Science and Mathematics. Elsevier, New York, 1968. Translated from the Russian by R. N. and N. B. McDonough for Scripta Technica.

261. D. W. Stroock and S. R. S. Varadhan. Multidimensional Diffusion Processes. Springer, New York, 1979.

262. Daniel W. Stroock. Probability Theory, An Analytic View. Cambridge University Press, Cambridge, UK, 1993.

263. M. Sun and R. Glowinski. Pathwise approximation and simulation for the Zakai filtering equation through operator splitting. Calcolo, 30(3):219–239 (1994), 1993.

264. J. Szpirglas. Sur l’equivalence d’equations differentielles stochastiques a valeurs mesures intervenant dans le filtrage Markovien non lineaire [French]. Ann. Inst. H. Poincare Sect. B (N.S.), 14(1):33–59, 1978.

265. I. Tulcea. Mesures dans les espaces produits [French]. Atti. Accad. Naz. Lincei Rend. Cl. Sci. Fis. Math. Nat., 8(7):208–211, 1949.


266. A. S. Ustunel. Some comments on the filtering of diffusions and the Malliavin calculus. In Stochastic analysis and related topics (Silivri, 1986), volume 1316 of Lecture Notes in Math., pages 247–266. Springer, Berlin, 1988.

267. A. Yu. Veretennikov. On backward filtering equations for SDE systems (direct approach). In Stochastic Partial Differential equations (Edinburgh, 1994), volume 216 of London Math. Soc. Lecture Note Ser., pages 304–311, Cambridge, UK, 1995. Cambridge Univ. Press.

268. D. Whitley. A genetic algorithm tutorial. Statist. Comput., 4:65–85, 1994.

269. Ward Whitt. Stochastic Process Limits. An Introduction to Stochastic-Process Limits and Their Application to Queues. Springer, New York, 2002.

270. N. Wiener. Extrapolation, Interpolation, and Smoothing of Stationary Time Series: With Engineering Applications. MIT Press, Cambridge, MA, 1949.

271. N. Wiener. I Am a Mathematician. Doubleday, Garden City, NY; Victor Gollancz, London, 1956.

272. D. Williams. Probability with Martingales. Cambridge University Press, Cambridge, UK, 1991.

273. W. M. Wonham. Some applications of stochastic differential equations to optimal nonlinear filtering. J. Soc. Indust. Appl. Math. Ser. A Control, 2:347–369, 1965.

274. Xi Wu, Stephen S.-T. Yau, and Guo-Qing Hu. Finite-dimensional filters with nonlinear drift. XII. Linear and constant structure of Wong-matrix. In Stochastic Theory and Control (Lawrence, KS, 2001), volume 280 of Lecture Notes in Control and Inform. Sci., pages 507–518, Berlin, 2002. Springer.

275. T. Yamada and S. Watanabe. On the uniqueness of solutions of stochastic differential equations. J. Math. Kyoto Univ., 11:151–167, 1971.

276. Shing-Tung Yau and Stephen S. T. Yau. Finite-dimensional filters with nonlinear drift. XI. Explicit solution of the generalized Kolmogorov equation in Brockett-Mitter program. Adv. Math., 140(2):156–189, 1998.

277. Stephen S.-T. Yau. Finite-dimensional filters with nonlinear drift. I. A class of filters including both Kalman-Bucy filters and Benes filters. J. Math. Systems Estim. Control, 4(2):181–203, 1994.

278. Stephen S.-T. Yau and Guo-Qing Hu. Finite-dimensional filters with nonlinear drift. X. Explicit solution of DMZ equation. IEEE Trans. Autom. Control, 46(1):142–148, 2001.

279. Marc Yor. Sur les theories du filtrage et de la prediction [French]. In Seminaire de Probabilites, XI (Univ. Strasbourg), Annees universitaire 1975/1976, volume 581 of Lecture Notes in Math., pages 257–297. Springer Verlag, Berlin, 1977.

280. Marc Yor. Some Aspects of Brownian Motion, Part 1: Some Special Functionals (Lectures in Mathematics, ETH, Zurich). Birkhauser Boston, 1992.

281. Moshe Zakai. On the optimal filtering of diffusion processes. Z. Wahrscheinlichkeitstheorie und Verw. Gebiete, 11:230–243, 1969.

282. O. Zeitouni. On the tightness of some error bounds for the nonlinear filtering problem. IEEE Trans. Autom. Control, 29(9):854–857, 1984.

283. O. Zeitouni and B. Z. Bobrovsky. On the reference probability approach to the equations of nonlinear filtering. Stochastics, 19(3):133–149, 1986.

284. Ofer Zeitouni. On the filtering of noise-contaminated signals observed via hard limiters. IEEE Trans. Inform. Theory, 34(5, part 1):1041–1048, 1988.

Author Name Index

A

Adams, R. A. 165, 166
Aggoun, L. 192
Allinger, D. F. 35

B

Baker, J. E. 280
Baras, J. S. 179
Bayro-Corrochano, E. 194
Benes, V. E. 8, 142, 197–199
Bensoussan, A. 8, 9, 95, 104, 196, 356
Bharucha-Reid, A. T. 7
Bhatt, A. G. 126
Billingsley, P. 303
Blake, A. 286
Bobrovsky, B. Z. 196
Bourbaki, N. 27, 296
Breiman, L. 294
Brigo, D. 199, 202
Brockett, R. W. 8
Bucy, R. S. 6, 7, 192
Budhiraja, A. 9
Burkholder, D. L. 353

C

Carpenter, J. 230, 280
Chaleyat-Maurel, M. 8, 9
Chen, J. 8
Chiou, W. L. 8
Chopin, N. 280
Chung, K. L. 329, 330, 332, 338, 343, 355
Cipra, B. 6
Clark, J. M. C. 7, 8, 35, 129, 139, 348
Clifford, P. 230, 280
Cohen de Lara, M. 199
Crisan, D. 230, 249, 279, 281, 285, 286

D

Daniell, P. J. 301
Darling, R. W. R. 199
Daum, F. E. 8, 199
Davis, M. H. A. 7, 149, 250
Del Moral, P. 249, 250, 281, 286
Dellacherie, C. 307, 308, 312, 317, 319, 330
Dickinson, B. W. 8
Dieudonne, J. 32
Doob, J. L. 18, 58, 88, 301, 329
Doucet, A. 285
Duncan, T. E. 7, 9
Dynkin, E. B. 43

E

Efe, M. 194
Elliott, R. J. 9, 192
Ethier, S. N. 298, 303, 305, 330

F

Fearnhead, P. 230, 280
Fleming, W. H. 196
Friedman, A. 101, 103
Frost, P. 7
Fujisaki, M. 7, 34, 45

G

Getoor, R. K. 27, 28
Gordon, N. J. 276, 286
Grigelionis, B. 9
Gyongy, I. 9, 139, 209

H

Halmos, P. R. 32
Handschin, J. E. 286
Hazewinkel, M. 8, 9
Heunis, A. J. 9, 95, 113, 114, 126
Hu, G.-Q. 8

I

Isard, M. 286
Ito, K. 360

J

Jacod, J. 249
Joseph, P. D. 192

K

Kunsch, H. R. 279, 280
Kailath, T. 7
Kallianpur, G. 7, 8, 34, 35, 45, 57
Kalman, R. E. 6
Kao, J. 194
Karandikar, R. L. 8
Karatzas, I. 51, 88, 310, 330, 355
Kitagawa, G. 286
Kloeden, P. E. 251
Kolmogorov, A. N. 5, 13, 31, 32, 301
Krein, M. G. 5
Krylov, N. V. 7, 93, 139, 209, 355
Kunita, H. 7, 9, 34, 45, 182
Kuratowski, K. 27
Kurtz, T. G. 9, 126, 165, 249, 298, 303, 305, 330
Kushner, H. J. 7, 9, 139, 202

L

Levy, P. 344, 362
Le Gland, F. 9
Leung, C. W. 8
Liptser, R. S. 9
Lototsky, S. 202, 204
Lucic, V. M. 95, 113, 114, 126
Lyons, T. J. 230, 249, 250, 281, 285, 286

M

Mangold, M. 194
Marcus, S. I. 8
Maybank, S. J. 8
Mayne, D. Q. 286
McKean, H. P. 360
Meyer, P. A. 27, 45, 307, 308, 312, 317, 319, 330
Michel, D. 8, 9
Miclo, L. 250
Mikulevicius, R. 9, 202
Mitter, S. K. 8, 9, 35
Mortensen, R. E. 7

N

Newton, N. J. 9
Novikov, A. A. 52, 350
Nualart, D. 348

O

Ocone, D. L. 9, 126
Oleĭnik, O. A. 105
Ozbek, L. 194

P

Pardoux, E. 7, 9, 182, 193, 196
Picard, J. 9, 195, 196
Pitt, M. K. 285
Platen, E. 251
Prokhorov, Y. V. 45
Protter, P. 330, 351

R

Radkevic, E. V. 105
Rigal, G. 286
Rogers, L. C. G. 17, 32, 58, 293, 296, 300, 301, 307, 308, 319, 321, 329, 339, 343, 348
Rozovskii, B. L. 7, 9, 93, 176, 177, 182, 202, 355
Rubin, D. B. 286
Runggaldier, W. J. 9

S

Salmond, D. J. 276, 286
Saloff-Coste, L. 166
Salut, G. 286
Schmidt, G. C. 199
Schwartz, C. A. I. 8
Sharpe, M. J. 332
Shephard, N. 285
Shiryaev, A. N. 7, 9
Shreve, S. E. 51, 88, 310, 330, 355
Smith, A. F. M. 276, 286
Stein, E. M. 166
Stratonovich, R. S. 7
Striebel, C. 8, 57
Stroock, D. W. 28, 298
Sussmann, H. J. 8
Szpirglas, J. 125

T

Tsirel’son, B. S. 35
Tulcea, I. 298

V

Varadhan, S. R. S. 298
Veretennikov, A. Y. 249

W

Watanabe, S. 35
Whitley, D. 230, 280
Whitt, W. 303
Wiener, N. 5
Williams, D. 17, 32, 43, 58, 293, 296, 300, 301, 307, 308, 319, 321, 329, 339, 343, 348, 362
Williams, R. J. 329, 330, 332, 338, 343, 355
Wonham, W. M. 7
Wu, X. 8

X

Xiong, J. 165, 249

Y

Yamada, T. 35
Yau, S.-T. 8
Yau, S. S.-T. 8
Yor, M. 28, 360

Z

Zakai, M. 7, 196
Zatezalo, A. 93
Zeitouni, O. 9

Subject Index

A

Announcing sequence 321
Atom 298
Augmented filtration see Observation filtration
Averaging over the characteristics formula 182

B

Benes condition 142, 196
Benes filter 141, 146, 196
  the d-dimensional case 197
Bootstrap filter 276, 286
Borel space 301
Branching algorithm 278
Brownian motion 346
  exponential functional of 360, 361, 363, 365
  Fourier decomposition of 360
  Levy’s characterisation 344, 346
Burkholder–Davis–Gundy inequalities 246, 256, 353

C

Cadlag path 303
Caratheodory extension theorem 300, 347
Change detection filter see Change-detection problem
Change of measure method 49, 52
Change-detection problem 52, 69
Clark’s robustness result see Robust representation formula
Class
  U 96, 97, 100, 107, 109, 110, 113, 118
  U 109, 110
  U ′ 110, 111, 113, 114, 116
  U ′ 116
Condition
  U 97, 102, 107, 110
  U′ 113, 114, 116
  U′′ 114
Conditional distribution
  of Xt 2–3, 191
    approximating sequence 265
    density of 174
    density of the 200
    recurrence formula 261, 264
    unnormalised 58, 173, 175
  regular 294
Conditional expectation 293
Conditional probability
  of a set 294
  regular 32, 294, 296, 347
Convergence determining set 323
Convergence in expectation 322, 324
Cubic sensor 201

D

Debut theorem 307, 314, 339, 341
Daniell–Kolmogorov–Tulcea theorem 301, 302, 347
Density of ρt
  existence of 168
  smoothness of 174
Dual previsible projection 332
Duncan–Mortensen–Zakai equation see Zakai equation

E

Empirical measure 210
Euler method 251
Evanescent set 319
Exponential projection filter 201
Extended Kalman filter 194

F

Feller property 267
Feynman–Kac formula 182
Filtering
  equations 4, 16, 72, 93, 125, 249, 308, see Kushner–Stratonovich equation, Zakai equation
    for inhomogeneous test functions 69
  problem 13, 48
    discrete time 258–259
    the correlated noise case 73–75, 109
Finite difference scheme 207
Finite-dimensional filters 141, 146, 154, 196–199
Fisher information matrix 199
Fokker–Planck equation 206
Fujisaki–Kallianpur–Kunita equation see Kushner–Stratonovich equation

G

Generator of the process X 48, 50, 51, 151, 168, 207, 221
  domain of the 47, 50–51
  maximal 51
Girsanov’s theorem 345, 346
Gronwall’s lemma 78, 79, 81, 88, 172, 325

H

Hermite polynomials 203

I

Importance distribution 285
Importance sampling 273
Indistinguishable processes 319
Infinitesimal generator see Generator of the process X
Innovation
  approach 7, 49, 70–73
  process 33–34
Ito integral see Stochastic integral
Ito isometry 337, 338, 349
Ito’s formula 343

K

Kallianpur–Striebel formula 57, 59, 128
Kalman–Bucy filter 6, 148–154, 191, 192, 199
  1D case 158
  as a Benes filter 142, 148
Kushner–Stratonovich equation 68, 71, 153
  correlated noise case 74
  finite-dimensional 66
  for inhomogeneous test functions 69
  linear case 151
  strong form 179
  uniqueness of solution 110, 116

L

Likelihood function 260
Linear filter see Kalman–Bucy filter
Local martingale 330, 344

M

Markov chain 257
Martingale 329
  representation theorem 348
  uniformly integrable 330, 346
Martingale convergence theorem 318, 329, 345
Martingale problem 47
Martingale representation theorem 35, 38, 44
Measurement noise 1
Monotone class theorem 29, 31, 293, 295, 311, 318, 336
Monte Carlo approximation 210, 216, 222, 230
  convergence of 213, 214, 217
  convergence rate 215, 216
Multinomial resampling see Resampling procedure

Mutation step 273

N

Non-linear filtering see Stochastic filtering
Non-linear filtering problem see Filtering problem
Novikov’s condition 52, 127, 131, 218, 222, 350

O

Observation
  filtration 13–17
    right continuity of the 17, 27, 33–40
    unaugmented 16
  process 1, 3, 16
    discrete time 258
  σ-algebra see Observation filtration
Offspring distribution 224, 252, 274–281
  Bernoulli 280
  binomial 280
  minimal variance 225, 226, 228, 230, 279, 280
  multinomial 275–277
  obtained by residual sampling 277
  Poisson 280
Optional process 320
Optional projection of a process 17–19, 311–317, 338
  kernel for the 27
  without the usual conditions 321

P

Parabolic PDEs
  existence and uniqueness result 100
  maximum principle for 102
  systems of 102
  uniformly 101, 121
Parseval’s equality 204, 205
Particle filter 209, 222–224
  branching algorithm 225
  convergence rates 241, 244, 245, 248
  correction step 222, 230, 250
  discrete time 272–273
    convergence of 281–284
    prediction step 264
    updating step 264
  evolution equation 230
  implementation 250–252
    correction step 251, 252
    evolution step 251
  offspring distribution see Offspring distribution
  path regularity 229
  resampling procedure 250, 252
Particle methods see Particle filter
Path process 259
PDE Methods
  correction step 207
  prediction step 206
π
  the stochastic process 14, 27–32
    cadlag version of 31
πt see Conditional distribution of Xt
Polarization identity 342
Posterior distribution 259
Predictable σ-algebra see Previsible σ-algebra
Predictable process see Previsible process
Predicted conditional probability 259
Previsible σ-algebra 331
Previsible process 321, 331, 338
Previsible projection of a process 317, 321, 340, 341
Prior distribution 259
Projection bien measurable see Optional projection of a process
Projection filter 199
Projective product 261

Q

Q-matrix 51
Quadratic variation 332, 335, 342

R

Reducing sequence 330
Regular grid 207
Regularisation method 167
Regularised measure 167
Resampling procedure 276
Residual sampling 277
ρ see Conditional distribution of Xt, unnormalised
  density of 173, 178
  dual of 165, 180–182, 233, 238
Riccati equation 152, 192
Ring of subsets 331
Robust representation formula 129, 137

S

Sampling with replacement method see Resampling procedure
Selection step 274
Sensor function 4
Separable metric space 296
Sequential Monte Carlo methods see Particle filter
Signal process 1, 3, 16, 47
  discrete time version 257
  filtration associated with the 47
  in discrete time 257
  particular cases 49–52
SIR algorithm 276
Skorohod topology 304–305
Sobolev
  embedding theorem 166
  space 166
Splitting-up algorithm 206
Stochastic differential equation
  strong solution 355
Stochastic filtering 1, 3, 6, 8, 9, see also Filtering problem
Stochastic Fubini’s theorem 351
Stochastic integral 330–341
  limits of 358
  localization 343
  martingale property 337
Stochastic integration by parts 342
Stopping time 306
  announceable 321

T

TBBA see Tree-based branching algorithms
Total sets in L1 355, 357
Transition kernel 257
Tree-based branching algorithms 230, 279
Tulcea’s theorem 298, 303, 347, 348

U

Uniqueness of solution see Kushner–Stratonovich equation, uniqueness of solution, see Zakai equation, uniqueness of solution

Usual conditions 16, 319

W

Weak topology on P(S) 21–27
  metric for 26
Wick polynomials 203
Wiener filter 5–6

Z

Zakai equation 62, 69, 73, 154, 177
  correlated noise case 74, 111
  finite-dimensional 65
  for inhomogeneous test functions 69, 97
  strong form 67, 175–178, 202–203, 206
  uniqueness of solution 107, 109, 114, 182
