Large Deviations

Spring 2012

February 2, 2012

Chapter 1

Introduction

1.1 Outline.

We will examine the theory of large deviations through three concrete examples. We will work them out fully and in the process develop the subject.

The first example is the exit problem. We consider the Dirichlet problem

\[ \frac{\epsilon}{2}\Delta u + b(x)\cdot\nabla u = 0 \]

in some domain $G$ with boundary data $u = f$ on $\delta G$. The vector field is $b = -\nabla V$ for some function $V$. As $\epsilon \to 0$ the limiting behavior of the solution $u = u_\epsilon$ will depend on the behavior of the solutions of the ODE

\[ \frac{dx}{dt} = b(x(t)) \]

The difficult case is when the solutions of the ODE do not exit from $G$. Then large deviation theory provides the answer. Assuming that there is a unique stable equilibrium inside $G$ and all trajectories starting from $x \in G$ converge to it without leaving $G$, one can show that

\[ \lim_{\epsilon\to 0} u_\epsilon(x) = f(y) \]

provided $y$ is the unique minimizer of $V(\cdot)$ on the boundary $\delta G$.

The second example is about the simple random walk in $d$ dimensions. We denote by $S_n = X_1 + \cdots + X_n$ the random walk and by $D_n$ the range of $S_1, \ldots, S_n$. Then $|D_n|$ is the number of distinct sites visited by the random walk. The question is the behavior of

\[ E\big[e^{-\nu|D_n|}\big] \]

for large $n$. The contribution comes mainly from paths that do not visit too many sites. We can insist that the random walk is confined to a ball of radius $R = R(n)$. Then the number of sites visited is at most the volume of (actually the number of lattice points inside) the ball, which is approximately $v(d)R^d$ for large $R$, where $v(d)$ is the volume of the unit ball in $\mathbb{R}^d$. On the other hand confining a random walk to the region for a long time has exponentially small probability $p(n) \simeq \exp[-\lambda_d(R)\,n] = \exp[-n\lambda_d/R^2]$. Here $-\lambda_d$ is the ground state eigenvalue of $\frac{1}{2d}\sum_{i=1}^{d}\frac{\partial^2}{\partial x_i^2}$ with Dirichlet boundary condition in the unit ball.

The contribution from these paths is $\exp[-\nu v(d)R^d - n\lambda_d/R^2]$, and $R = R(n)$ can be chosen to maximize this contribution. One can fashion a proof that establishes this as a lower bound. But to show that the optimal lower bound obtained in this manner is actually a true upper bound requires a theory.
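The choice of $R(n)$ can be made explicit. The following is a minimal sketch, not part of the notes: it minimizes the exponent $\nu v(d)R^d + n\lambda_d/R^2$ by setting its $R$-derivative to zero, which gives $R^{d+2} = 2n\lambda_d/(d\,\nu\,v(d))$. The function names and the numerical value passed for `lam` (standing in for $\lambda_d$, which is not computed here) are assumptions made for illustration.

```python
import math

def unit_ball_volume(d):
    """Volume v(d) of the unit ball in R^d."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1)

def cost(R, n, nu, lam, d):
    """Exponent nu*v(d)*R^d + n*lam/R^2; the best radius minimizes it."""
    return nu * unit_ball_volume(d) * R ** d + n * lam / R ** 2

def optimal_radius(n, nu, lam, d):
    """Closed-form minimizer: d*nu*v(d)*R^(d-1) = 2*n*lam/R^3."""
    return (2.0 * n * lam / (d * nu * unit_ball_volume(d))) ** (1.0 / (d + 2))

# lam = 1.0 is a placeholder for lambda_d, chosen only for illustration.
R_star = optimal_radius(n=1000, nu=1.0, lam=1.0, d=2)
```

In particular $R(n)$ grows like $n^{1/(d+2)}$, so the minimized exponent is of order $n^{d/(d+2)}$.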

The third example that we will consider is the symmetric simple exclusion process. On the periodic $d$-dimensional integer lattice $\mathbb{Z}^d_N$ of size $N^d$, we have $k(N) = \rho N^d$ particles (with at most one particle per site) doing simple random walk independently with rate 1. However jumps to occupied sites are forbidden. The Markov process has the generator

\[ (A_N u)(x_1,\ldots,x_{k(N)}) = \frac{1}{2d}\sum_{i=1}^{k(N)}\sum_{e}\,[1-\eta(x_i+e)]\,[u(x_1,\ldots,x_i+e,\ldots,x_{k(N)}) - u(x_1,\ldots,x_{k(N)})] \]

where $e$ runs over the unit vectors in the $2d$ directions and $\eta(x) = \sum_i \mathbf{1}_{x_i = x}$ is the particle count at $x$, which is either 0 or 1. We do a diffusive rescaling of space and time and consider the random measure $\gamma_N$ on the path space $D[[0,T];\mathbb{T}^d]$,

\[ \gamma_N = \frac{1}{N^d}\sum_{1\le i\le k(N)} \delta_{\frac{x_i(N^2\,\cdot)}{N}} \]

We want to study the behavior as $N \to \infty$. The theory of large deviations is needed even to prove a law of large numbers for $\gamma_N$.

1.2 Supplementary Material.

Large deviation theorems in some generality were first established by Cramér in [2]. He considered deviations from the law of large numbers for sums of independent identically distributed random variables and showed that the rate function was the convex conjugate of the logarithm of the moment generating function of the underlying common distribution. The subject has evolved considerably over time and several texts are now available offering different perspectives. The exit problem was studied by Wentzell and Freidlin in their work [3]. They go on to study in [4] the long time behavior of small random perturbations of dynamical systems, when there are several equilibrium points.

The problem of counting the number of distinct sites comes up in the discussion of a random walk on $\mathbb{Z}^d$ in the presence of randomly located traps. The estimation of the probability of avoiding traps for a long time reduces to the calculation described in the second example. This problem was proposed by Mark Kac [6], along with a similar problem for a Brownian path avoiding traps in $\mathbb{R}^d$. Each trap is a ball of some fixed radius $\delta$, with the centers located randomly as a Poisson point process of intensity $\rho$. Now the role of the number of distinct sites of the random walk is replaced by the volume $|\cup_{0\le s\le t} B(x(s),\delta)|$ of the "Wiener sausage", i.e. the $\delta$-neighborhood of the range of the Brownian path $x(\cdot)$ in $[0,t]$.

The use of large deviation techniques in the study of hydrodynamic scaling limits began with the work of Guo, Papanicolaou and Varadhan in [5], and the results presented here started with the study of non-gradient systems in [16], followed by the Ph.D. thesis of Quastel [7] and subsequent work of Quastel, Rezakhanlou and Varadhan in [8], [9] and [10].

Chapter 2

Small Noise.

2.1 The Exit Problem.

Let $L$ be a second order elliptic operator

\[ (Lu)(x) = \frac{1}{2}\sum_{i,j=1}^{d} a_{i,j}(x)\frac{\partial^2 u}{\partial x_i\partial x_j}(x) + \sum_{j=1}^{d} b_j(x)\frac{\partial u}{\partial x_j}(x) \]

on $\mathbb{R}^d$. The solution of the Dirichlet problem

\[ (Lu)(x) = 0 \ \text{ for } x \in G, \qquad u(y) = f(y) \ \text{ for } y \in \delta G \tag{2.1} \]

can be represented as

\[ u(x) = E_x[f(x(\tau))] \tag{2.2} \]

where $E_x$ is expectation with respect to the diffusion process $P_x$ corresponding to $L$ starting from $x \in G$, $\tau$ is the exit time from the region $G$ and $x(\tau)$ is the exit place on the boundary $\delta G$ of $G$. If $L$ is elliptic and $G$ is bounded then $\tau$ is finite almost surely and in fact its distribution has an exponentially decaying tail under every $P_x$. If $G$ has a regular boundary (the exterior cone condition is sufficient), then the function $u(x)$ defined by (2.2) solves (2.1) and $u(x) \to f(y)$ as $x \in G \to y \in \delta G$.

We are interested in the situation where $L$ depends on a parameter $\epsilon$ that is small. As $\epsilon \to 0$, $L_\epsilon$ degenerates to a first order operator, i.e. a vector field

\[ (Xu)(x) = \sum_{j=1}^{d} b_j(x)\frac{\partial u}{\partial x_j}(x) \]

The behavior of the solution $u_\epsilon(x)$ of

\[ (L_\epsilon u_\epsilon)(x) = 0 \ \text{ for } x \in G, \qquad u_\epsilon(y) = f(y) \ \text{ for } y \in \delta G \]

will depend on the behavior of the solution of the ODE

\[ \frac{dx(t)}{dt} = b(x(t)); \quad x(0) = x \tag{2.3} \]

If $x(t)$ exits cleanly from $G$ at a point $y_0 \in \delta G$, then $u_\epsilon(x) \to f(y_0)$. If the trajectory $x(t)$ touches the boundary and reenters $G$, it is problematic. If the trajectory does not exit $G$, then we have a real problem.

We will concentrate on the following situation. The operator $L_\epsilon$ is given by

\[ L_\epsilon u = \frac{\epsilon}{2}\Delta u + Xu \]

We will assume that every solution of the corresponding ODE (2.3) with $x \in G$ stays in $G$ for ever and as $t \to \infty$, they all converge to a limit $x_0$ which is the unique equilibrium point in $G$, i.e. the only point with $b(x_0) = 0$. In other words $x_0$ is the unique globally stable equilibrium in $G$ and every solution converges to it without leaving $G$. Let $P_{\epsilon,x}$ be the distribution of the solution

\[ x_\epsilon(t) = x + \int_0^t b(x_\epsilon(s))\,ds + \sqrt{\epsilon}\,\beta(t) \tag{2.4} \]

where $\beta(t)$ is the $d$-dimensional Brownian motion. It is clear that while the paths will exit from $G$ almost surely under $P_{\epsilon,x}$, as $\epsilon \to 0$ it will take an increasingly longer time, and in the limit there will be no exit. The behavior of the solution $u_\epsilon$ is far from clear. The problem is to determine when, how and where $x_\epsilon(\cdot)$ will exit from $G$ when $\epsilon \ll 1$ is very small. We will investigate it when $b(x) = -(\nabla V)(x)$ is the gradient flow and $x_0$ is the unique global minimum of a nice function $V(x)$.

The picture that emerges is that a typical path will go quickly near the equilibrium point and stay around it for a long time, making periodic, futile, short-lived attempts to get out. These attempts, although infrequent, are large in number, since the total time it takes to exit is very large. The more serious the attempt, the fewer the number of such attempts. Each individual attempt occurs at a Poisson rate that is tiny. Finally a successful excursion takes place. The point of exit is close to the minimizer $y_0$ of $V(y)$ on the boundary. Assuming it is unique, the path followed near the end is the reverse path of the approach to equilibrium of $x(\cdot)$ starting from $y_0$, and the total time it takes for the exit to take place is of the order $\exp\big[\frac{2(V(y_0)-V(x_0))}{\epsilon}\big]$. Compared to the total time, the durations of individual excursions are tiny and can be considered to be almost instantaneous, and so they are more or less independent. Various excursions take place more or less independently with various tiny rates. Among the excursions that get out, the one that occurs first is the reverse path that exits at $y_0$. Its rate is the highest among those that get out.
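The dynamics (2.4) can be explored numerically. The following is a minimal sketch, assuming a one-dimensional potential and an Euler-Maruyama time discretization, neither of which is part of the notes; the function names, the step count and the interval used for $G$ are illustrative choices.

```python
import math
import random

def simulate_path(x0, grad_V, eps, T, n_steps, rng=None):
    """Euler-Maruyama steps for dx = -V'(x) dt + sqrt(eps) d(beta), in one dimension."""
    rng = rng or random.Random(0)  # fixed seed so runs are reproducible
    dt = T / n_steps
    x = x0
    path = [x]
    for _ in range(n_steps):
        noise = rng.gauss(0.0, 1.0) * math.sqrt(eps * dt)
        x = x - grad_V(x) * dt + noise
        path.append(x)
    return path

def exit_time(path, dt, lo, hi):
    """First grid time at which the path is outside (lo, hi), or None if it stays inside."""
    for k, x in enumerate(path):
        if x <= lo or x >= hi:
            return k * dt
    return None

# Illustration with V(x) = x^2/2 (so V'(x) = x) and G = (-1.5, 1.5).
# With eps = 0 the scheme reduces to the ODE (2.3) and never exits.
ode_path = simulate_path(1.0, lambda x: x, eps=0.0, T=2.0, n_steps=20000)
```

For small positive `eps` one would expect exits to become rare on any fixed time horizon, in line with the heuristic time scale above; the sketch makes no attempt to estimate that scale.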

2.2 Large Deviations of $P_{\epsilon,x}$

The mapping $x(\cdot) \to g(\cdot)$ of

\[ x(t) = x(0) + \int_0^t b(x(s))\,ds + g(t) \]

is clearly a continuous map of $C[0,T] \to C_0[0,T]$. On the other hand the difference of two solutions $x(\cdot)$ and $y(\cdot)$ with the same starting point, corresponding to $g(\cdot)$ and $h(\cdot)$ respectively, satisfies

\[ x(t) - y(t) = \int_0^t [b(x(s)) - b(y(s))]\,ds + g(t) - h(t) \]

and if $b(x)$ is Lipschitz with constant $A$, then $\Delta(t) = \sup_{0\le s\le t}|x(s) - y(s)|$ satisfies

\[ \Delta(t) \le A\int_0^t \Delta(s)\,ds + \sup_{0\le s\le t}|g(s) - h(s)| \]

Applying Gronwall's inequality on any fixed interval $[0,T]$,

\[ \Delta(T) \le c(T)\sup_{0\le s\le T}|g(s) - h(s)| \]

(one may take $c(T) = e^{AT}$), proving that the map from $g(\cdot) \to x(\cdot)$ is continuous. If we denote this continuous map by $\phi = \phi_x$ and the distribution of the scaled Brownian motion $\sqrt{\epsilon}\,\beta(\cdot)$ by $Q_\epsilon$, then $P_{x,\epsilon} = Q_\epsilon\phi_x^{-1}$. The probability $P_{\epsilon,x}(B(f,\delta))$ will be estimated by $Q_\epsilon[B(g,\delta')]$. We will prove two theorems.

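Theorem 2.2.1. The measures $Q_\epsilon$ on $C_0[0,T]$ satisfy: for any closed set $C$ and open set $G$ that are subsets of $C_0[0,T]$,

\[ \limsup_{\epsilon\to 0}\ \epsilon\log Q_\epsilon[C] \le -\inf_{g\in C}\frac{1}{2}\int_0^T [g'(t)]^2\,dt \tag{2.5} \]

\[ \liminf_{\epsilon\to 0}\ \epsilon\log Q_\epsilon[G] \ge -\inf_{g\in G}\frac{1}{2}\int_0^T [g'(t)]^2\,dt \tag{2.6} \]

Theorem 2.2.2. The measures $P_{x,\epsilon}$ on $C[0,T]$ satisfy: for any closed set $C$ and open set $G$ that are subsets of $C[0,T]$,

\[ \limsup_{\epsilon\to 0}\ \epsilon\log P_{x,\epsilon}[C] \le -\inf_{\substack{f\in C\\ f(0)=x}}\frac{1}{2}\int_0^T [f'(t) - b(f(t))]^2\,dt \tag{2.7} \]

\[ \liminf_{\epsilon\to 0}\ \epsilon\log P_{x,\epsilon}[G] \ge -\inf_{\substack{f\in G\\ f(0)=x}}\frac{1}{2}\int_0^T [f'(t) - b(f(t))]^2\,dt \tag{2.8} \]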


In both theorems the infimum is taken over $f$ and $g$ that are absolutely continuous in $t$ and have square integrable derivatives.

We note that Theorem 2.2.2 follows from Theorem 2.2.1. Since $P_{x,\epsilon}(A) = Q_\epsilon(\phi_x^{-1}A)$ and $\phi_x$ is a continuous one-to-one map of $C_0[0,T]$ onto $C_x[0,T]$, we only need to observe that

\[ \inf_{g\in\phi_x^{-1}C}\frac{1}{2}\int_0^T [g'(t)]^2\,dt = \inf_{\substack{f\in C\\ f(0)=x}}\frac{1}{2}\int_0^T [f'(t) - b(f(t))]^2\,dt \]

which is an immediate consequence of the relation: if $f = \phi_x g$, then

\[ \frac{1}{2}\int_0^T [f'(t) - b(f(t))]^2\,dt = \frac{1}{2}\int_0^T [g'(t)]^2\,dt \]

We now turn to the proof of Theorem 2.2.1. This was independently observed in some form by Strassen [12] and Schilder [11].

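Proof. Let us take an integer $N$ and divide the interval $[0,T]$ into $N$ equal parts. For any $f \in C[0,T]$ we denote by $f_N = \pi_N f$ the piecewise linear approximation of $f$ obtained by interpolating linearly over $[\frac{(j-1)T}{N}, \frac{jT}{N}]$ for $j = 1, 2, \ldots, N$. In particular $f_N(\frac{jT}{N}) = f(\frac{jT}{N})$ for $j = 0, 1, \ldots, N$. To prove the upper bound let $\delta > 0$ be arbitrary and $N$ be an integer. Then

\[ Q_\epsilon[C] \le Q_\epsilon[f_N \in C^\delta] + Q_\epsilon[\|\pi_N f - f\| \ge \delta] \]

where $C^\delta = \cup_{f\in C}B(f,\delta)$. Under $Q_\epsilon$, $\{f(\frac{jT}{N})\}_{1\le j\le N}$ is a multivariate Gaussian with density (taking $z_0 = 0$)

\[ \Big[\sqrt{\frac{N}{2\pi\epsilon T}}\,\Big]^{N}\exp\Big[-\frac{N}{2\epsilon T}\sum_{j=1}^{N}[z_j - z_{j-1}]^2\Big] \]

Moreover if $z_j = f(\frac{jT}{N})$,

\[ \frac{N}{T}\sum_{j=1}^{N}[z_j - z_{j-1}]^2 = \int_0^T [f_N'(t)]^2\,dt \]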
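The identity between the discrete sum and the integral of $[f_N'(t)]^2$ is easy to check numerically. The following is a small sketch under that identity; the function name and the test paths are illustrative assumptions, not part of the notes.

```python
def piecewise_linear_action(z, T):
    """(1/2) * integral of f_N'(t)^2 over [0, T] for nodes z[0..N] at times j*T/N.

    Uses (N / T) * sum (z_j - z_{j-1})^2 = integral of f_N'(t)^2 dt.
    """
    N = len(z) - 1
    return N / (2.0 * T) * sum((z[j] - z[j - 1]) ** 2 for j in range(1, N + 1))

# Straight line f(t) = 3t on [0, 2]: (1/2) * 9 * 2 = 9 for every N.
N, T = 10, 2.0
line_nodes = [3.0 * j * T / N for j in range(N + 1)]
line_action = piecewise_linear_action(line_nodes, T)

# f(t) = t^2 on [0, 1]: the interpolated action increases to 2/3 as N grows.
parabola_nodes = [(j / 1000) ** 2 for j in range(1001)]
parabola_action = piecewise_linear_action(parabola_nodes, 1.0)
```

For a path that is not piecewise linear the discrete value is a lower approximation of $\frac{1}{2}\int_0^T [f'(t)]^2 dt$, since interpolation averages the derivative on each subinterval.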

It is now not difficult to show that

\[ \limsup_{\epsilon\to 0}\ \epsilon\log Q_\epsilon[f_N \in C^\delta] \le -\frac{1}{2}\inf_{f\in C^\delta}\int_0^T [f'(t)]^2\,dt \]

A simple estimate on the maximum of Brownian motion provides the estimate

\[ \limsup_{\epsilon\to 0}\ \epsilon\log Q_\epsilon[\|f_N - f\| \ge \delta] \le -\frac{N\delta^2}{2T} \]

If we now let $N \to \infty$ and then $\delta \to 0$, we obtain (2.5). We note that the function

\[ I(f) = \frac{1}{2}\int_0^T [f'(t)]^2\,dt \]

is lower semicontinuous on $C[0,T]$ and the level sets $\{f : I(f) \le \ell\}$ are all compact. This allows us to conclude that for any closed set $C$,

\[ \lim_{\delta\to 0}\ \inf_{f\in C^\delta} I(f) = \inf_{f\in C} I(f) \]

Another elementary but important fact is that the sum of two nonnegative quantities behaves like the maximum if we are only interested in the exponential rate of decay (or growth).

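Now we turn to the lower bound. It suffices to show that for any $f \in C_0[0,T]$ with $\ell = \frac{1}{2}\int_0^T [f'(t)]^2\,dt < \infty$ and $\delta > 0$

\[ \liminf_{\epsilon\to 0}\ \epsilon\log Q_\epsilon[B(f,\delta)] \ge -\ell \]

Since $f$ can be approximated by more regular functions $f_k$ with the corresponding $\ell_k$ approximating $\ell$, we can assume without loss of generality that $f$ is smooth. If we denote by $Q_{f,\epsilon}$ the distribution of $\sqrt{\epsilon}\,\beta(t) - f(t)$, we have

\[
\begin{aligned}
Q_\epsilon[B(f,\delta)] &= Q_{f,\epsilon}[B(0,\delta)] \\
&= \int_{B(0,\delta)} \frac{dQ_{f,\epsilon}}{dQ_\epsilon}\,dQ_\epsilon \\
&= \int_{B(0,\delta)} \exp\Big[-\frac{1}{\epsilon}\int_0^T f'(s)\,dx(s) - \frac{1}{2\epsilon}\int_0^T [f'(t)]^2\,dt\Big]\,dQ_\epsilon \\
&= e^{-\ell/\epsilon}\,Q_\epsilon[B(0,\delta)]\ \frac{1}{Q_\epsilon[B(0,\delta)]}\int_{B(0,\delta)} \exp\Big[-\frac{1}{\epsilon}\int_0^T f'(s)\,dx(s)\Big]\,dQ_\epsilon \\
&\ge e^{-\ell/\epsilon}\,Q_\epsilon[B(0,\delta)]\ \exp\Big[\frac{1}{Q_\epsilon[B(0,\delta)]}\int_{B(0,\delta)} \Big[-\frac{1}{\epsilon}\int_0^T f'(s)\,dx(s)\Big]\,dQ_\epsilon\Big] \\
&= e^{-\ell/\epsilon}\,Q_\epsilon[B(0,\delta)]
\end{aligned}
\]

by Jensen's inequality coupled with symmetry: the set $B(0,\delta)$ and the measure $Q_\epsilon$ are both invariant under $x \to -x$, so the average of the stochastic integral over $B(0,\delta)$ is zero. Since for any $\delta > 0$, $Q_\epsilon[B(0,\delta)] \to 1$ as $\epsilon \to 0$, we are done.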

Remark 2.2.3. We will need local uniformity in $x$ in the statement of our large deviation principle for $P_{\epsilon,x}$. This follows easily from the continuity of the maps $\phi_x$ in $x$.

Remark 2.2.4. This does not quite solve the exit problem. The estimates are good only for a finite $T$, and all estimates only show that the probabilities involved are quite small. The solution to the exit problem is slightly more subtle. The basic idea is that among a bunch of very unlikely things the least unlikely thing is most likely to occur first!


2.3 The Exit Problem.

We start with a lemma that is a variational calculation. Consider any path $h(\cdot)$ that starts from the stable equilibrium $x_0$ and ends at some $x \in G$. Then

Lemma 2.3.1.

\[ \inf_{0<T<\infty}\ \inf_{\substack{h\\ h(0)=x_0\\ h(T)=x}}\ \int_0^T [h'(t) + (\nabla V)(h(t))]^2\,dt = 4[V(x) - V(x_0)] \]

Proof. We look at the ODE $x'(t) = -(\nabla V)(x(t))$, $x(0) = x$ and reverse it between $0$ and $T$, giving a trajectory $h(t) = x(T-t)$ from $x(T)$ to $x$ satisfying $h'(t) = (\nabla V)(h(t))$.

\[
\begin{aligned}
\int_0^T [h'(t) + (\nabla V)(h(t))]^2\,dt &= \int_0^T [h'(t) - (\nabla V)(h(t))]^2\,dt + 4\int_0^T (\nabla V)(h(t))\cdot h'(t)\,dt \\
&= 4[V(x) - V(x(T))]
\end{aligned}
\]

For $T$ large $x(T) \simeq x_0$ and therefore

\[ \inf_{0<T<\infty}\ \inf_{\substack{h\\ h(0)=x_0\\ h(T)=x}}\ \int_0^T [h'(t) + (\nabla V)(h(t))]^2\,dt \le 4[V(x) - V(x_0)] \]

On the other hand for any $h$ with $h(T) = x$ and $h(0) = x_0$,

\[
\begin{aligned}
4[V(x) - V(x_0)] &= 4\int_0^T (\nabla V)(h(t))\cdot h'(t)\,dt \\
&= \int_0^T [h'(t) + (\nabla V)(h(t))]^2\,dt - \int_0^T [h'(t) - (\nabla V)(h(t))]^2\,dt \\
&\le \int_0^T [h'(t) + (\nabla V)(h(t))]^2\,dt
\end{aligned}
\]
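The computation in the proof can be checked on a concrete potential. The following is a minimal sketch, assuming the one-dimensional quadratic potential $V(x) = x^2/2$ with $x_0 = 0$, for which the reversed flow is $h(t) = x e^{t-T}$; the potential, the function name and the midpoint quadrature are illustrative choices rather than part of the notes.

```python
import math

def action_along_reversed_flow(x, T, n=100000):
    """Midpoint-rule value of the integral of [h'(t) + V'(h(t))]^2 over [0, T].

    Assumes V(x) = x^2 / 2, so V'(h) = h, and h(t) = x * exp(t - T),
    which satisfies h'(t) = V'(h(t)) = h(t) and h(T) = x.
    """
    dt = T / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        h = x * math.exp(t - T)
        dh = h                      # h' = V'(h) = h along the reversed flow
        total += (dh + h) ** 2 * dt
    return total

# 4 * [V(x) - V(x0)] = 2 * x^2 = 4.5 for x = 1.5; long horizons approach it.
long_horizon = action_along_reversed_flow(1.5, T=10.0)
short_horizon = action_along_reversed_flow(1.5, T=1.0)
```

The finite-$T$ value equals $4[V(x) - V(x(T))]$, so it sits strictly below $4[V(x) - V(x_0)]$ and increases to it as $T$ grows; the reversed path only reaches $x_0$ in the limit.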

The next lemma says that it is very unlikely that the path stays away from the equilibrium point for too long.

Lemma 2.3.2. Let $U$ be any neighborhood of the equilibrium $x_0$ and

\[ \Lambda(U,T) = \inf_{f(\cdot)\,:\,f(\cdot)\in G\cap U^c}\ \int_0^T [f'(t) + (\nabla V)(f(t))]^2\,dt \]

Then $\liminf_{T\to\infty}\Lambda(U,T) = \infty$.

Proof. Suppose there are paths in $G\cap U^c$ for long periods with bounded rate $I(f) = \frac{1}{2}\int_0^T [f'(t) + (\nabla V)(f(t))]^2\,dt$. Then there have to be arbitrarily long stretches for which the contribution to $I(f)$ is small. Such trajectories are equicontinuous and produce in the limit solutions of $dx(t) + (\nabla V)(x(t))\,dt = 0$ that live in $G\cap U^c$ for ever, which is a contradiction.

Now we state and prove the main theorem.

Theorem 2.3.3. Assume that $V(\cdot)$ on the boundary $\delta G$ achieves its minimum at a unique point $y_0$. Then for any $x \in G$

\[ \lim_{\epsilon\to 0} u_\epsilon(x) = f(y_0) \]

In other words, irrespective of the starting point, exit will take place near $y_0$ with probability nearly 1.

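Proof. Let us fix a neighborhood $N$ of $y_0$ on the boundary. Let $\inf_{y\in\delta G\cap N^c} V(y) = V(y_0) + \theta$ for some $\theta > 0$. Let us take two neighborhoods $U_1, U_2$ of $x_0$ such that $U_1 \subset U_2$ and $V(x) - V(x_0) \le \frac{\theta}{10}$ on $U_2$. Let $\tau$ be the exit time from $G$. We will show that for any $x \in G$

\[ \lim_{\epsilon\to 0} P_{x,\epsilon}[x(\tau) \notin N] = 0 \]

Let us define the following stopping times.

\[
\begin{aligned}
\tau &= \inf\{t : x(t) \notin G\} \\
\tau_1 &= \inf\{t : x(t) \notin U_1^c\} \wedge \tau \\
\tau_2 &= \inf\{t \ge \tau_1 : x(t) \notin U_2\} \\
&\ \ \vdots \\
\tau_{2k+1} &= \inf\{t \ge \tau_{2k} : x(t) \notin U_1^c\} \wedge \tau \\
\tau_{2k+2} &= \inf\{t \ge \tau_{2k+1} : x(t) \notin U_2\}
\end{aligned}
\]

For any $x \in G$, $P_{x,\epsilon}[\tau_1 = \tau] \to 0$ as $\epsilon \to 0$, and the path cannot exit from $G$ between $\tau_{2k+1}$ and $\tau_{2k+2}$. As for $\tau_{2k+1}$, one of three things can happen. Either $\tau > \tau_{2k+1}$ and then $x(\tau_{2k+1}) \in \delta U_1$, or $\tau = \tau_{2k+1}$, in which case either $x(\tau_{2k+1}) = x(\tau) \in N$ or $x(\tau) \in \delta G\cap N^c$. The first event has probability nearly one and the remaining two have probability nearly zero. But one of them has much smaller probability than the other. So the event that has the larger of the two probabilities will happen first. We need to prove only that

\[ \lim_{\epsilon\to 0}\ \frac{\sup_{x\in\delta U_2} P_{x,\epsilon}[\{\tau_1 = \tau\}\cap\{x(\tau)\notin N\}]}{\inf_{x\in\delta U_2} P_{x,\epsilon}[\{\tau_1 = \tau\}\cap\{x(\tau)\in N\}]} = 0 \]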

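Let us look at the numerator first.

\[ a(x,\epsilon) = P_{x,\epsilon}[\{\tau_1 = \tau\}\cap\{x(\tau)\notin N\}] \le P_{x,\epsilon}[\{\tau_1 = \tau\}\cap\{x(\tau)\notin N\}\cap\{\tau_1 \le T\}] + P_{x,\epsilon}[\tau_1 \ge T] \]

By Lemma 2.3.2 the second term on the right can be made super exponentially small, i.e.

\[ \limsup_{T\to\infty}\ \limsup_{\epsilon\to 0}\ \epsilon\log P_{x,\epsilon}[\tau_1 \ge T] = -\infty \]

The first term has an explicit exponential rate and for any $T$,

\[
\begin{aligned}
\limsup_{\epsilon\to 0}\ \epsilon\log\sup_{x\in\delta U_2} P_{x,\epsilon}[\{\tau_1 = \tau\}\cap\{x(\tau)\notin N\}\cap\{\tau_1 \le T\}]
&\le -2\inf_{y\in\delta G\cap N^c}\ \inf_{x\in\delta U_2}\,[V(y) - V(x)] \\
&\le -\frac{3\theta}{2} - 2[V(y_0) - V(x_0)]
\end{aligned}
\]

Therefore

\[ \limsup_{\epsilon\to 0}\ \epsilon\log\sup_{x\in\delta U_2} a(x,\epsilon) \le -\frac{3\theta}{2} - 2[V(y_0) - V(x_0)] \]

On the other hand, for estimating the denominator,

\[
\begin{aligned}
\liminf_{\epsilon\to 0}\ \epsilon\inf_{x\in\delta U_2}\log P_{x,\epsilon}[\{\tau_1 = \tau\}\cap\{x(\tau)\in N\}] &\ge -\sup_{x\in\delta U_2} 2[V(y_0) - V(x)] \\
&\ge -2[V(y_0) - V(x_0)] - \frac{\theta}{5}
\end{aligned}
\]

The numerator goes to 0 a lot faster than the denominator and the ratio therefore goes to 0.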

Remark 2.3.4. It is not important that $b(x) = -(\nabla V)(x)$ for some $V$. If it is not, but $x_0$ is still the unique stable equilibrium in $G$, then for $x \in G$ one can define the "quasi-potential" $V(x)$ by

\[ 4V(x) = \inf_{0<T<\infty}\ \inf_{\substack{h(\cdot)\\ h(0)=x_0\\ h(T)=x}}\ \int_0^T [h'(t) - b(h(t))]^2\,dt \]

and it works just as well.

2.4 General diffusion operators.

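We can have more general operators

\[ L_\epsilon u = \frac{\epsilon}{2}\sum_{i,j=1}^{d} a_{i,j}(x)\frac{\partial^2 u}{\partial x_i\partial x_j} + \sum_{j=1}^{d} b_j(x)\frac{\partial u}{\partial x_j} \]

The rate function will have a different expression.

\[ I(f) = \frac{1}{2}\int_0^T \big\langle a^{-1}(f(t))(f'(t) - b(f(t))),\ (f'(t) - b(f(t)))\big\rangle\,dt \]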

The proof would proceed along the following lines. We will assume that all the coefficients are smooth and in addition $a_{i,j}(x)$ is uniformly elliptic. This provides a choice of the square root $\sigma$ that is smooth as well. The distribution $P_{x,\epsilon}$ is now the distribution of the solution of the SDE

\[ x(t) = x + \sqrt{\epsilon}\int_0^t \sigma(x(s))\cdot d\beta(s) + \int_0^t b(x(s))\,ds \]

which has (almost surely) a uniquely defined solution. We have a large deviation principle for $\sqrt{\epsilon}\,\beta(t)$ with rate function as before

\[ I_0(f) = \frac{1}{2}\int_0^T \|f'(t)\|^2\,dt \]

The map $\beta(\cdot) \to x(\cdot)$ is however not continuous in the usual topology on $C[[0,T];\mathbb{R}^d]$. Given $N$ we can approximate $x(t)$ by $x_N(t)$ which solves

\[ x_N(t) = x + \sqrt{\epsilon}\int_0^t \sigma(x_N(\pi_N(s)))\,d\beta(s) + \int_0^t b(x_N(\pi_N(s)))\,ds \]

where $\pi_N(s) = \frac{[Ns]}{N}$. The coefficients are frozen and updated every $\frac{1}{N}$ units of time. The map $\beta(\cdot) \to x_N(\cdot)$ is continuous and the distribution of $x_N(\cdot)$ satisfies a large deviation principle with rate function

\[
\begin{aligned}
I_N(f) &= \frac{1}{2}\int_0^T \big\|\sigma^{-1}(f(\pi_N(s)))[f'(s) - b(f(\pi_N(s)))]\big\|^2\,ds \\
&= \frac{1}{2}\int_0^T \big\langle a^{-1}(f(\pi_N(s)))[f'(s) - b(f(\pi_N(s)))],\ [f'(s) - b(f(\pi_N(s)))]\big\rangle\,ds
\end{aligned}
\]

The proof is completed (see Theorem 3.3) by proving that for any $\delta > 0$,

\[ \lim_{N\to\infty}\ \limsup_{\epsilon\to 0}\ \epsilon\log P_{x,\epsilon}\Big[\sup_{0\le t\le T}\|x_N(t) - x(t)\| \ge \delta\Big] = -\infty \]

and

\[ I(f) = \inf_{f_N\to f}\ \liminf_{N\to\infty}\ I_N(f_N) \]

where the infimum is taken over all sequences $f_N$ that converge to $f$.

2.5 General Formulation

We will take time out to formulate large deviations in a more abstract setting and establish some basic principles. If we have a sequence of probability distributions $P_n$ on $(\mathcal{X},\mathcal{B})$, a complete separable metric space $\mathcal{X}$ with its Borel $\sigma$-field $\mathcal{B}$, we say that it satisfies a Large Deviation Principle (LDP) with rate $I(x)$ if the following properties hold.

• The function $I(x) \ge 0$ is lower semicontinuous and the level sets $K_\ell = \{x : I(x) \le \ell\}$ are compact for any finite $\ell$.

• For any closed set $C \subset \mathcal{X}$ we have

\[ \limsup_{n\to\infty}\ \frac{1}{n}\log P_n[C] \le -\inf_{x\in C} I(x) \tag{2.9} \]

• For any open set $G \subset \mathcal{X}$ we have

\[ \liminf_{n\to\infty}\ \frac{1}{n}\log P_n[G] \ge -\inf_{x\in G} I(x) \tag{2.10} \]

It is easy to verify the following contraction principle.

Theorem 2.5.1. If $P_n$ satisfies a large deviation principle with rate $I(\cdot)$ on $\mathcal{X}$ and $f : \mathcal{X} \to \mathcal{Y}$ is a continuous map, then $Q_n = P_n f^{-1}$ satisfies a large deviation principle on $\mathcal{Y}$ with rate function

\[ J(y) = \inf_{x\,:\,f(x)=y} I(x) \]
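A toy instance may help fix the formula for $J$. The following sketch is an illustration only and not part of the notes: it takes the rate $I(x) = x^2/2$ on the real line and the continuous map $f(x) = x^2$, for which $J(y) = y/2$ when $y \ge 0$ and $J(y) = +\infty$ otherwise. The grid scan and the function name are assumptions made for the example.

```python
def contracted_rate(I, f, y, grid, tol=1e-9):
    """Approximate J(y) = inf{I(x) : f(x) = y} by scanning candidate points x in grid."""
    values = [I(x) for x in grid if abs(f(x) - y) <= tol]
    return min(values) if values else float("inf")

grid = [k / 100 for k in range(-500, 501)]   # candidate x in [-5, 5]
rate = lambda x: 0.5 * x * x                 # I(x) = x^2 / 2
square = lambda x: x * x                     # f(x) = x^2

J_at_4 = contracted_rate(rate, square, 4.0, grid)          # preimages x = -2, 2
J_at_minus_1 = contracted_rate(rate, square, -1.0, grid)   # empty preimage
```

An empty preimage gives $J(y) = +\infty$, matching the convention that the infimum over the empty set is infinite.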

Another easy consequence of the definition is the following theorem.

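Theorem 2.5.2. Let $P_n$ satisfy LDP on $\mathcal{X}$ with rate $I(\cdot)$ and let $F : \mathcal{X} \to \mathbb{R}$ be a bounded continuous function. Then

\[ \lim_{n\to\infty}\ \frac{1}{n}\log\int \exp[nF(x)]\,dP_n = \sup_x\,[F(x) - I(x)] \]

Proof. We remark that

\[ \lim_{n\to\infty}\ \frac{1}{n}\log[e^{na} + e^{nb}] = \max\{a,b\} \]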
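The remark that a sum of two exponentials behaves like the larger one on the logarithmic scale can be seen numerically. The following is a small sketch with an illustrative function name; it factors out the larger exponent so that the computation does not overflow for large $n$.

```python
import math

def scaled_log_sum(a, b, n):
    """(1/n) * log(exp(n*a) + exp(n*b)), computed stably by factoring out max(a, b)."""
    m = max(a, b)
    return m + math.log(math.exp(n * (a - m)) + math.exp(n * (b - m))) / n

values = [scaled_log_sum(1.0, 0.5, n) for n in (1, 10, 100, 1000)]
```

The correction to $\max\{a,b\}$ is at most $\frac{\log 2}{n}$, with equality when $a = b$.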

For the upper bound, dividing the range of $F$ into a finite number of intervals of size $\frac{1}{k}$, and denoting by $C_{r,k} = \{x : \frac{r-1}{k} \le F(x) \le \frac{r}{k}\}$,

\[ \int e^{nF(x)}\,dP_n \le \sum_r \int_{C_{r,k}} e^{nF(x)}\,dP_n \le \sum_r e^{\frac{nr}{k}}\,P_n[C_{r,k}] \]

Therefore we obtain for any $k$ the bound

\[
\begin{aligned}
\limsup_{n\to\infty}\ \frac{1}{n}\log\int \exp[nF(x)]\,dP_n &\le \sup_r\Big[\frac{r}{k} - \inf_{x\in C_{r,k}} I(x)\Big] \\
&\le \sup_x\,[F(x) - I(x)] + \frac{1}{k}
\end{aligned}
\]

proving the upper bound. The lower bound is local. If we take any $x_0$ with $I(x_0) < \infty$, then in a neighborhood $U$ of $x_0$, $F(x)$ is bounded below by $F(x_0) - \epsilon(U)$, and $P_n(U) \ge \exp[-nI(x_0) + o(n)]$. Since the integrand is nonnegative

\[ \int_{\mathcal{X}} e^{nF(x)}\,dP_n \ge \int_U e^{nF(x)}\,dP_n \ge \exp\big[n(F(x_0) - I(x_0) - \epsilon(U)) + o(n)\big] \]

Since $x_0$ is arbitrary and $U$ can be shrunk to $x_0$ we are done.

Remark 2.5.3. For the upper bound it is enough if $F$ is upper semicontinuous and bounded. The lower bound needs only lower semicontinuity.

There are two components to the large deviation estimate. The lower bound is really a local issue. For any $x \in \mathcal{X}$

\[ \lim_{\delta\to 0}\ \liminf_{n\to\infty}\ \frac{1}{n}\log P_n[B(x,\delta)] \ge -I(x) \tag{2.11} \]

whereas the upper bound is a combination of local estimates

\[ \lim_{\delta\to 0}\ \limsup_{n\to\infty}\ \frac{1}{n}\log P_n[B(x,\delta)] \le -I(x) \tag{2.12} \]

and an exponential tightness estimate: given any $\ell < \infty$ there exists a compact set $K_\ell \subset \mathcal{X}$ so that for any closed $C \subset K_\ell^c$ we have

\[ \limsup_{n\to\infty}\ \frac{1}{n}\log P_n[C] \le -\ell \tag{2.13} \]

(2.11) is easily seen to be equivalent to the lower bound (2.10), and (2.12) and (2.13) are equivalent to the upper bound (2.9).

Often we have a sequence $X_{n,k}$ of random variables with values in $\mathcal{X}$, defined on some $(\Omega,\Sigma,P)$, and for each fixed $k$ we have an LDP for $P_{n,k}$, the distribution of $X_{n,k}$ on $\mathcal{X}$, with rate function $I_k(x)$. As $k \to \infty$, for each $n$, $X_{n,k} \to X_n$. We want to prove LDP for $P_n$, the distribution of $X_n$ on $\mathcal{X}$. This involves interchanging two limits and needs additional estimates. The following "super exponential estimate" is enough. For each fixed $\delta > 0$,

\[ \limsup_{k\to\infty}\ \limsup_{n\to\infty}\ \frac{1}{n}\log P[d(X_{n,k},X_n) \ge \delta] = -\infty \tag{2.14} \]

Theorem 2.5.4. If for each $k$ the distributions $P_{n,k}$ of $X_{n,k}$ satisfy a large deviation principle with a rate function $I_k(x)$ and if (2.14) holds, then

\[ I(x) = \lim_{\delta\to 0}\ \liminf_{k\to\infty}\ \inf_{y\in B(x,\delta)} I_k(y) = \lim_{\delta\to 0}\ \limsup_{k\to\infty}\ \inf_{y\in B(x,\delta)} I_k(y) \]

and $I(\cdot)$ is a rate function and the distribution $P_n$ of $X_n$ satisfies LDP with rate $I(\cdot)$.


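Proof. Let us define $I^+(x) \ge I^-(x)$ as

\[ I^+(x) = \lim_{\delta\to 0}\ \limsup_{k\to\infty}\ \inf_{y\in B(x,\delta)} I_k(y) \]

\[ I^-(x) = \lim_{\delta\to 0}\ \liminf_{k\to\infty}\ \inf_{y\in B(x,\delta)} I_k(y) \]

Step 1. Fix $\ell < \infty$ and let $x_k$ be a sequence with $I_k(x_k) \le \ell$. Let $\delta > 0$ be arbitrary. Then there exists $k_0$ such that for $k \ge k_0$

\[ \limsup_{n\to\infty}\ \frac{1}{n}\log P[d(X_{n,k},X_n) \ge \delta] \le -2\ell \]

Clearly

\[
\begin{aligned}
P[d(X_{n,k_0},x_k) \le 3\delta] &\ge P\big[[d(X_{n,k_0},X_n) \le \delta]\cap[d(X_n,X_{n,k}) \le \delta]\cap[d(X_{n,k},x_k) \le \delta]\big] \\
&\ge P[d(X_{n,k},x_k) \le \delta] - P[d(X_{n,k},X_n) \ge \delta] - P[d(X_{n,k_0},X_n) \ge \delta]
\end{aligned}
\]

or

\[ P[d(X_{n,k},x_k) \le \delta] \le P[d(X_{n,k_0},x_k) \le 3\delta] + P[d(X_{n,k},X_n) \ge \delta] + P[d(X_{n,k_0},X_n) \ge \delta] \]

This implies, for any fixed $k \ge k_0$,

\[
\begin{aligned}
&\limsup_{n\to\infty}\ \frac{1}{n}\log P[d(X_{n,k},x_k) \le \delta] \\
&\quad\le \limsup_{n\to\infty}\ \frac{1}{n}\log\max\big\{P[d(X_{n,k_0},x_k) \le 3\delta],\ P[d(X_{n,k},X_n) \ge \delta],\ P[d(X_{n,k_0},X_n) \ge \delta]\big\}
\end{aligned}
\]

Since $I_k(x_k) \le \ell$ and $\limsup_{n\to\infty}\frac{1}{n}\log P[d(X_{n,k},X_n) \ge \delta] \le -2\ell$ for all $k \ge k_0$,

\[ \inf_{y\in B(x_k,3\delta)} I_{k_0}(y) \le \inf_{y\in B(x_k,\delta)} I_k(y) \le \ell \]

This shows that for any arbitrary $\delta > 0$, there is a sequence $y_k \in B(x_k,3\delta)$ with $I_{k_0}(y_k) \le \ell$, which therefore has a convergent subsequence. By a variant of the diagonalization process we can find a subsequence $x_{k_r}$ such that there is $y_{r,k_r}$ with $d(x_{k_r},y_{r,k_r}) \le 2^{-r}$ and for each $j$, $y_{j,k_r} \to y_j$ as $r \to \infty$. In other words we can assume without loss of generality that for any $\delta > 0$, there is $y_k \in B(x_k,\delta)$ that converges to a limit. It is easy to check now that $x_k$ must be a Cauchy sequence. Since the space is complete it converges.

The next step is to show that $C_\ell = \{x : I^-(x) \le \ell\}$ is compact, i.e. totally bounded and closed. If we denote by $D_{k,\ell} = \{x : I_k(x) \le \ell\}$ then

\[ C_\ell = \cap_{\ell'>\ell}\ \cap_{\delta>0}\ \cap_{k'\ge 1}\ \big[\cup_{k\ge k'} D_{k,\ell'}\big]^\delta \]

It is clear that $C_\ell$ is closed. Since $\cup_{k\ge k'} D_{k,\ell'}$ is totally bounded it follows that so is $C_\ell$.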

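Step 2. Let $C \subset \mathcal{X}$ be closed. If $X_n \in C$, then either $X_{n,k} \in C^\delta$ or $d(X_{n,k},X_n) \ge \delta$. Therefore for any $k$,

\[ \limsup_{n\to\infty}\ \frac{1}{n}\log P_n[C] \le \max\Big\{-\inf_{x\in C^\delta} I_k(x),\ \theta_k\Big\} \]

where $\theta_k \to -\infty$ as $k \to \infty$. Consequently

\[ \limsup_{n\to\infty}\ \frac{1}{n}\log P_n[C] \le -\limsup_{\delta\to 0}\ \limsup_{k\to\infty}\ \inf_{x\in C^\delta} I_k(x) \le -\inf_{x\in C} I^+(x) \]

In particular

\[ \lim_{\delta\to 0}\ \limsup_{n\to\infty}\ \frac{1}{n}\log P_n[B(x,\delta)] \le -I^+(x) \]

Step 3. Let $I^-(x) = \ell < \infty$. Then there are $x_k \in B(x,\delta)$ with $I_k(x_k) \le \ell + \epsilon$, and

\[ P_n[B(x,2\delta)] \ge P_{n,k}[B(x_k,\delta)] - P[d(X_{n,k},X_n) \ge \delta] \]

We choose $k$ large enough so that the second term on the right is negligible compared to the first. We then obtain

\[ \lim_{\delta\to 0}\ \liminf_{n\to\infty}\ \frac{1}{n}\log P_n[B(x,2\delta)] \ge -I^-(x) \]

This proves $I^+(x) = I^-(x)$.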

2.6 Superexponential Estimates.

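We will show that with $\pi_N(s) = \frac{1}{N}[Ns]$, the solutions $x_{N,\epsilon}(\cdot)$ and $x_\epsilon(\cdot)$ of

\[ x_{N,\epsilon}(t) = x + \sqrt{\epsilon}\int_0^t \sigma(x_{N,\epsilon}(\pi_N(s)))\,d\beta(s) + \int_0^t b(x_{N,\epsilon}(\pi_N(s)))\,ds \]

and

\[ x_\epsilon(t) = x + \sqrt{\epsilon}\int_0^t \sigma(x_\epsilon(s))\,d\beta(s) + \int_0^t b(x_\epsilon(s))\,ds \]

satisfy

\[ \limsup_{N\to\infty}\ \limsup_{\epsilon\to 0}\ \epsilon\log P\Big[\sup_{0\le t\le T}\|x_{N,\epsilon}(t) - x_\epsilon(t)\| \ge \delta\Big] = -\infty \tag{2.15} \]

for any $\delta > 0$. Denoting by $Z_{N,\epsilon}(t) = x_{N,\epsilon}(t) - x_\epsilon(t)$, we have

\[ Z_{N,\epsilon}(t) = \sqrt{\epsilon}\int_0^t e_N(s)\,d\beta(s) + \int_0^t g_N(s)\,ds \]

where

\[ \|e_N(s)\| = \|\sigma(x_{N,\epsilon}(\pi_N(s))) - \sigma(x_\epsilon(s))\| \le A\|Z_{N,\epsilon}(s)\| + A\|x_{N,\epsilon}(\pi_N(s)) - x_{N,\epsilon}(s)\| \]

and

\[ \|g_N(s)\| = \|b(x_{N,\epsilon}(\pi_N(s))) - b(x_\epsilon(s))\| \le A\|Z_{N,\epsilon}(s)\| + A\|x_{N,\epsilon}(\pi_N(s)) - x_{N,\epsilon}(s)\| \]

If we define the stopping time $\tau$ as

\[ \tau = \inf\{s : \|x_{N,\epsilon}(\pi_N(s)) - x_{N,\epsilon}(s)\| \ge \eta\}\wedge T \]

then

\[
\begin{aligned}
P\Big[\sup_{0\le t\le T}\|x_{N,\epsilon}(t) - x_\epsilon(t)\| \ge \delta\Big]
&\le P\Big[\sup_{0\le t\le\tau}\|x_{N,\epsilon}(t) - x_\epsilon(t)\| \ge \delta\Big] + P[\tau < T] \\
&\le P\Big[\sup_{0\le t\le\tau}\|x_{N,\epsilon}(t) - x_\epsilon(t)\| \ge \delta\Big] + P\Big[\sup_{0\le s\le T}\|x_{N,\epsilon}(\pi_N(s)) - x_{N,\epsilon}(s)\| \ge \eta\Big] \\
&= \Theta_1 + \Theta_2
\end{aligned}
\]

Let us handle each of the two terms separately. First we need this lemma.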

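Lemma 2.6.1. Let $z(t)$ be a process satisfying

\[ z(t) = \int_0^t e(s)\cdot d\beta(s) + \int_0^t g(s)\,ds \]

with $\|e(s)\| \le B(\eta^2 + \|z(s)\|^2)^{\frac{1}{2}}$, $\|g(s)\| \le A(\eta^2 + \|z(s)\|^2)^{\frac{1}{2}}$ in some interval $0 \le s \le \tau$ where $\tau \le T$ is a stopping time. Then for any $\ell \ge 0$

\[ P\Big[\sup_{0\le s\le\tau}\|z(s)\| \ge \delta\Big] \le \Big[\frac{\eta^2}{\delta^2 + \eta^2}\Big]^{\ell}\,e^{T(2A\ell + 4B^2\ell^2)} \]

Proof. Consider the function

\[ f(x) = (\eta^2 + \|x\|^2)^{\ell} \]

By Ito's formula

\[ df(z(t)) = (\nabla f)(z(t))\cdot dz(t) + \frac{1}{2}\mathrm{Tr}\big[(\nabla^2 f)(z(t))\,e(t)e^*(t)\big]\,dt = a(t)\,dt + dm(t) \]

where $m(t)$ is a martingale and

\[ |a(t)| \le (2A\ell + 4B^2\ell^2)(\eta^2 + \|z(t)\|^2)^{\ell} \]

Therefore

\[ f(z(t))\,e^{-t(2A\ell + 4B^2\ell^2)} \]

is a super-martingale. Since $f(z(0)) = \eta^{2\ell}$ and $f(z(s)) \ge (\delta^2 + \eta^2)^{\ell}$ whenever $\|z(s)\| \ge \delta$, it follows that

\[ P\Big[\sup_{0\le s\le\tau}\|z(s)\| \ge \delta\Big] \le \Big[\frac{\eta^2}{\delta^2 + \eta^2}\Big]^{\ell}\,e^{T(2A\ell + 4B^2\ell^2)} \]

In $0 \le t \le \tau$ we have

\[ \|e_N\| \le 2A[\|Z_{N,\epsilon}\|^2 + \eta^2]^{\frac{1}{2}}; \qquad \|g_N\| \le 2A[\|Z_{N,\epsilon}\|^2 + \eta^2]^{\frac{1}{2}} \]

Applying the lemma with $2A$ and $2\sqrt{\epsilon}A$ replacing $A$ and $B$, we obtain with $\ell = \frac{1}{\epsilon}$

\[ \epsilon\log\Theta_1 \le \log\frac{\eta^2}{\delta^2 + \eta^2} + T[4A + 16A^2] \tag{2.16} \]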

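Now we turn to $\Theta_2$. We will use the following lemma.

Lemma 2.6.2. Let

\[ z(t) = x + \sqrt{\epsilon}\int_0^t e(s)\cdot d\beta(s) + \int_0^t g(s)\,ds \]

where $\|e(s)\|, \|g(s)\|$ are bounded by $C$. Then for any $\eta > 0$,

\[ \limsup_{N\to\infty}\ \limsup_{\epsilon\to 0}\ \epsilon\log P\Big[\sup_{0\le s\le T}\|z(\pi_N(s)) - z(s)\| \ge \eta\Big] = -\infty \]

Proof. We can choose $N$ large enough so that $\frac{C}{N} \le \frac{\eta}{2}$. Then we need only show that

\[ \limsup_{N\to\infty}\ \limsup_{\epsilon\to 0}\ \epsilon\log P\Big[\sup_{0\le t\le\frac{1}{N}}\Big\|\int_0^t e(s)\cdot d\beta(s)\Big\| \ge \frac{\eta}{2\sqrt{\epsilon}}\Big] = -\infty \]

which is an elementary consequence of the following fact. If $e(s)e^*(s) \le CI$,

\[ \exp\Big[\Big\langle\theta,\int_0^t e(s)\cdot d\beta(s)\Big\rangle - \frac{Ct\|\theta\|^2}{2}\Big] \]

is a super-martingale for all $\theta$.

This shows that for any $\eta > 0$,

\[ \limsup_{N\to\infty}\ \limsup_{\epsilon\to 0}\ \epsilon\log\Theta_2 = -\infty \tag{2.17} \]

We conclude by letting $\epsilon \to 0$, then $N \to \infty$ and finally $\eta \to 0$. Estimates (2.16) and (2.17) imply (2.15).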

Finally it is not difficult to show that

\[ \inf_{f_N(\cdot)\to f(\cdot)}\ \liminf_{N\to\infty}\ \int_0^T \big\langle f_N'(t),\ a^{-1}(f_N(\pi_N(t)))f_N'(t)\big\rangle\,dt = \int_0^T \big\langle f'(t),\ a^{-1}(f(t))f'(t)\big\rangle\,dt \]

We have therefore proved the following theorem. Let $a_{i,j}(x)$ be smooth and uniformly elliptic, and let $b(x)$ be smooth and bounded. Then

Theorem 2.6.3. The distribution $P_{\epsilon,x}$ of the diffusion with generator

\[ (L_\epsilon u)(x) = \frac{\epsilon}{2}\sum_{i,j=1}^{d} a_{i,j}(x)\frac{\partial^2 u}{\partial x_i\partial x_j}(x) + \sum_{j=1}^{d} b_j(x)\frac{\partial u}{\partial x_j}(x) \]

satisfies on $C[[0,T],\mathbb{R}^d]$, as $\epsilon \to 0$, a large deviation principle with rate

\[ I(f) = \frac{1}{2}\int_0^T \big\langle a^{-1}(f(t))(f'(t) - b(f(t))),\ (f'(t) - b(f(t)))\big\rangle\,dt \]

if $f(0) = x$ and $f(t)$ is absolutely continuous with a square integrable derivative. Otherwise $I(f) = +\infty$.

Remark 2.6.4. In our case it is easy to show directly that $\cup_N\{f : I_N(f) \le \ell\}$ is totally bounded. From the bounds on $b$ and $a^{-1}$ it is easy to conclude that

\[ \cup_N\{f : I_N(f) \le \ell\} \subset \Big\{f : \int_0^T \|f'(t)\|^2\,dt \le \ell'\Big\} \]

for an $\ell'$ depending on $\ell$ and the bounds on $a^{-1}$ and $b$.

2.7 Short time behavior of diffusions.

Brownian motion on $\mathbb{R}^d$ has the transition density

\[ p(t,x,y) = \exp\Big[-\frac{|x-y|^2}{2t} + o\Big(\frac{1}{t}\Big)\Big] = \exp\Big[-\frac{d(x,y)^2}{2t} + o\Big(\frac{1}{t}\Big)\Big] \]

where $d(x,y)$ is the Euclidean distance. If we replace the Brownian motion with independent components by one with positive definite covariance $A$ then the metric gets replaced by $d(x,y) = \sqrt{\langle A^{-1}(x-y),(x-y)\rangle}$ and a similar formula for $p_A(t,x,y)$ is still valid, as seen by a simple linear change of coordinates. The natural question that arises is whether there is a similar relation between the transition probability density $p_L(t,x,y)$ of the diffusion with generator

\[ (Lu)(x) = \frac{1}{2}\sum_{i,j=1}^{d} a_{i,j}(x)\frac{\partial^2 u}{\partial x_i\partial x_j}(x) + \sum_{j=1}^{d} b_j(x)\frac{\partial u}{\partial x_j}(x) \]

and the geodesic distance $d_L(x,y)$ between $x$ and $y$ in the Riemannian metric $ds^2 = \sum_{i,j} a^{-1}_{i,j}(x)\,dx_i\,dx_j$. We will show that indeed there is. In the special case when $b \equiv 0$, for the generator

\[ L_\epsilon = \frac{\epsilon}{2}\sum_{i,j} a_{i,j}(x)\frac{\partial^2}{\partial x_i\partial x_j} \]

the rate function takes the form

\[ I(f) = \frac{1}{2}\int_0^T \big\langle f'(t),\ a^{-1}(f(t))f'(t)\big\rangle\,dt \]

This is not changed if we add a small first order term.

\[ L_\epsilon = \frac{\epsilon}{2}\sum_{i,j} a_{i,j}(x)\frac{\partial^2}{\partial x_i\partial x_j} + \delta(\epsilon)\sum_i b_i(x)\frac{\partial}{\partial x_i} \]

where $\delta(\epsilon) \to 0$ as $\epsilon \to 0$. Denoting the two measures by $Q_\epsilon$ and $P_\epsilon$, the Radon-Nikodym derivative is

\[ \frac{dQ_\epsilon}{dP_\epsilon} = \exp\Big[\frac{\delta(\epsilon)}{\epsilon}\int_0^T \big\langle a^{-1}(x(s))b(x(s)),\ dx(s)\big\rangle - \frac{\delta(\epsilon)^2}{2\epsilon}\int_0^T \big\langle a^{-1}(x(s))b(x(s)),\ b(x(s))\big\rangle\,ds\Big] \]

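From the boundedness of $a$, $a^{-1}$ and $b$ it is easy to deduce [see exercise at the end] that for any $k$,

\[ \lim_{\epsilon\to 0}\ \epsilon\log E^{P_\epsilon}\Big[\Big[\frac{dQ_\epsilon}{dP_\epsilon}\Big]^k\Big] = 0 \]

and

\[ \lim_{\epsilon\to 0}\ \epsilon\log E^{Q_\epsilon}\Big[\Big[\frac{dP_\epsilon}{dQ_\epsilon}\Big]^k\Big] = 0 \]

We can now estimate by Hölder's inequality

\[ Q_\epsilon(A) = \int_A \frac{dQ_\epsilon}{dP_\epsilon}\,dP_\epsilon \le [P_\epsilon(A)]^{\frac{1}{p}}\,\Big\|\frac{dQ_\epsilon}{dP_\epsilon}\Big\|_{q,P_\epsilon} \]

as well as

\[ P_\epsilon(A) = \int_A \frac{dP_\epsilon}{dQ_\epsilon}\,dQ_\epsilon \le [Q_\epsilon(A)]^{\frac{1}{p}}\,\Big\|\frac{dP_\epsilon}{dQ_\epsilon}\Big\|_{q,Q_\epsilon} \]

By choosing $p > 1$ but arbitrarily close to 1, $P_\epsilon(A)$ and $Q_\epsilon(A)$ are seen to have the same exponential decay rate.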

The process with generator $\epsilon L$ is the same as the process for $L$ slowed down. Therefore the transition probability $p(\epsilon,x,dy)$ is the same as the transition probability $p_\epsilon(1,x,dy)$ of $\epsilon L$. By the contraction principle we can conclude that for $G$ open

\[ \liminf_{\epsilon\to 0}\ \epsilon\log p(\epsilon,x,G) \ge -\inf_{\substack{f\,:\,f(0)=x\\ f(1)\in G}} I(f) \tag{2.18} \]

and for $C$ closed

\[ \limsup_{\epsilon\to 0}\ \epsilon\log p(\epsilon,x,C) \le -\inf_{\substack{f\,:\,f(0)=x\\ f(1)\in C}} I(f) \tag{2.19} \]

Moreover an elementary calculation shows that

\[ \inf_{\substack{f(0)=x\\ f(1)=y}} I(f) = \frac{1}{2}d(x,y)^2 \]

where $d(x,y)$ is the geodesic distance in the metric $ds^2 = \sum_{i,j} a^{-1}_{i,j}(x)\,dx_i\,dx_j$. One can then use the Chapman-Kolmogorov equation

\[ p(t,x,y) = \int p(t_1,x,dz)\,p(t - t_1,z,y) \]

and improve the estimate on $p(t,x,A)$ to an estimate on $p(t,x,y)$ that takes the form

\[ p(t,x,y) = \exp\Big[-\frac{d(x,y)^2}{2t} + o\Big(\frac{1}{t}\Big)\Big] \]

Another way of looking at this is the following: if we have a Riemannian metric $ds^2 = \sum g_{i,j}(x)\,dx_i\,dx_j$ on $\mathbb{R}^d$ where $g_{i,j}(x)$ are smooth, bounded and uniformly positive definite, then the diffusion with generator $\frac{1}{2}\Delta_g$, where $\Delta_g$ is the Laplacian in the metric $g$, has transition probability density that satisfies

\[ p(t,x,y) = \exp\Big[-\frac{d_g(x,y)^2}{2t} + o\Big(\frac{1}{t}\Big)\Big] \tag{2.20} \]

where $d_g(x,y)$ is the geodesic distance between $x$ and $y$ in the metric $g_{i,j}(x)$.
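For Brownian motion the meaning of the $o(\frac{1}{t})$ term can be checked directly. The following is a small sketch for the one-dimensional heat kernel $p(t,x,y) = (2\pi t)^{-1/2}\exp[-(x-y)^2/(2t)]$, which is the Euclidean special case only; the function name is an illustrative assumption.

```python
import math

def t_log_heat_kernel(t, x, y):
    """t * log p(t, x, y) for the one-dimensional Brownian transition density.

    log p = -(x - y)^2 / (2t) - (1/2) * log(2 * pi * t), multiplied through by t.
    """
    return -0.5 * (x - y) ** 2 - 0.5 * t * math.log(2.0 * math.pi * t)

# As t -> 0 the values approach -d(x, y)^2 / 2 = -0.5 for x = 0, y = 1.
values = [t_log_heat_kernel(t, 0.0, 1.0) for t in (1e-2, 1e-4, 1e-8)]
```

The prefactor contributes $-\frac{t}{2}\log(2\pi t)$, which tends to 0 with $t$; this is exactly what the $o(\frac{1}{t})$ in the exponent records.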

2.8 Supplementary material.

The work on small time behavior of diffusions was suggested by a result of Cieselski [1]that if pG(t, x, y) is the fundamental solution of the heat equation ut =

12∆ with Dirichlet

boundary condition on the boundary δG of an open set G and p(t, x, y) the whole spacesolution then

limt→0

pG(t, x, y)

p(t, x, y)= 1

2.8. SUPPLEMENTARY MATERIAL. 25

for all x, y ∈ G if and only if G is essentially convex. Intuitively this says that if a Brownianpath goes from x → y in a short time then it did so in a straight line. The analog fordiffusions would be that the geodesic replaces the straight line. This is a consequence ofthe Large deviation result as is shown in the following exercises.

Exercise. Assuming a PDE estimate of the form

$$\lim_{\delta\to 0}\, \limsup_{t\to 0}\, t \sup_{|x-y|\le\delta} \log p(t, x, y) = \lim_{\delta\to 0}\, \liminf_{t\to 0}\, t \inf_{|x-y|\le\delta} \log p(t, x, y) = 0$$

for the fundamental solution $p(t, x, y)$ of $u_t = Lu$, use (2.18) and (2.19) to prove (2.20).

Exercise. Deduce that the measure $Q_{\epsilon,x,y}$ on the path space $C([0, 1]; R^d)$, starting from $x$ with transition probability

$$q_{\epsilon,x,y}(s, x', t, y') = \frac{p(\epsilon(t - s), x', y')\; p(\epsilon(1 - t), y', y)}{p(\epsilon(1 - s), x', y)}$$

concentrates as $\epsilon \to 0$ on the set of geodesics connecting $x$ and $y$.
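To see what the exercise asserts in the simplest case, the sketch below takes $p$ to be the one-dimensional Brownian kernel, so the geodesic from $x$ to $y$ is the straight segment. In that case the transition density above is Gaussian in $y'$ with mean on the segment joining $x'$ to $y$ and variance proportional to $\epsilon$. This is only the Brownian special case, not a solution of the exercise, and the function names are illustrative.

```python
import math


def p(t, x, y):
    """1-D Brownian transition density."""
    return math.exp(-(x - y) ** 2 / (2.0 * t)) / math.sqrt(2.0 * math.pi * t)


def bridge_density(eps, s, xp, t, yp, y):
    """q_{eps,x,y}(s, x', t, y') as defined in the exercise, with p Brownian."""
    return p(eps * (t - s), xp, yp) * p(eps * (1.0 - t), yp, y) / p(eps * (1.0 - s), xp, y)


def bridge_mean_var(eps, s, xp, t, y):
    """Mean and variance of y' under q: linear interpolation and O(eps) spread."""
    frac = (t - s) / (1.0 - s)
    mean = xp + frac * (y - xp)
    var = eps * (t - s) * (1.0 - t) / (1.0 - s)
    return mean, var


def gaussian(z, mean, var):
    """Density of N(mean, var) at z."""
    return math.exp(-(z - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


if __name__ == "__main__":
    s, xp, t, y = 0.2, 0.3, 0.7, 2.0
    for eps in (1.0, 0.1, 0.01):
        mean, var = bridge_mean_var(eps, s, xp, t, y)
        print(eps, mean, var)
```

The mean does not depend on $\epsilon$ while the variance is linear in $\epsilon$, so as $\epsilon \to 0$ the conditioned path collapses onto the straight line from $x$ to $y$.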

Strassen in [12] used the large deviation estimate to prove a functional form of the law of the iterated logarithm. Let $\beta(t)$ be the one dimensional Brownian motion. Let

$$\beta_\lambda(t) = \frac{\beta(\lambda t)}{\sqrt{\lambda \log\log\lambda}}$$

then on the space $C[0, 1]$, with probability 1, the set $\{\beta_\lambda(\cdot) : \lambda \ge 10\}$ is conditionally compact and the set of limit points as $\lambda \to \infty$ is precisely the set of $f$ such that $f(0) = 0$ and $I(f) = \frac{1}{2}\int_0^1 [f'(t)]^2\, dt \le 1$. The proof is very similar to the proof of the usual law of the iterated logarithm. Due to the slow change in $\lambda$ it is enough to look at $\lambda_n = \rho^n$ for $\rho > 1$. Then

$$P[\beta_{\rho^n}(\cdot) \in B(f, \delta)] \simeq n^{-I(f)}$$

and we now apply the Borel-Cantelli lemma. One half requires $\rho \to 1$ and the other half requires $\rho \to \infty$ to generate near independence. Now using Skorohod imbedding one can deduce a similar result for sums of i.i.d. random variables with an arbitrary common distribution with mean 0 and variance 1.
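The origin of the power $n^{-I(f)}$ can be sketched in a few lines. The sketch assumes that "the large deviation estimate" refers to the estimate for $\sqrt{\epsilon}\,\beta(\cdot)$ with rate function $I(f) = \frac{1}{2}\int_0^1 [f'(t)]^2\, dt$, and it introduces $\epsilon_\lambda$ as shorthand; both are labeled as assumptions in the comments.

```latex
% Assumptions: Brownian scaling, and the large deviation estimate for
% \sqrt{\epsilon}\,\beta(\cdot) with rate function I(f) = \frac12 \int_0^1 [f'(t)]^2 dt.
% \epsilon_\lambda is shorthand introduced only for this sketch.
\begin{align*}
\beta_\lambda(\cdot) &\overset{d}{=} \sqrt{\epsilon_\lambda}\; \beta(\cdot),
  \qquad \epsilon_\lambda = \frac{1}{\log\log\lambda}
  && \text{(since $\beta(\lambda\,\cdot)/\sqrt{\lambda} \overset{d}{=} \beta(\cdot)$)} \\
P[\beta_\lambda(\cdot) \in B(f,\delta)] &\approx \exp\Big[-\frac{I(f)}{\epsilon_\lambda}\Big]
  = (\log\lambda)^{-I(f)}
  && \text{(leading exponential order, small $\delta$)} \\
P[\beta_{\rho^n}(\cdot) \in B(f,\delta)] &\approx (n \log\rho)^{-I(f)} \simeq n^{-I(f)}
  && \text{(taking $\lambda = \rho^n$)}
\end{align*}
% \sum_n n^{-I(f)} converges when I(f) > 1 and diverges when I(f) < 1.
```

When $I(f) > 1$ the series converges, so the first half of Borel-Cantelli rules out such $f$ as limit points; when $I(f) < 1$ it diverges, and the second half, which needs the near independence mentioned above, shows that $f$ is approached infinitely often.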

Bibliography

[1] Ciesielski, Z. Heat conduction and the principle of not feeling the boundary. Bull. Acad. Polon. Sci. Sér. Sci. Math. Astronom. Phys. 14 (1966), 435–440.

[2] Cramér, H. Sur un nouveau théorème-limite de la théorie des probabilités. Actualités Scientifiques et Industrielles 736 (1938), 5–23. Colloque consacré à la théorie des probabilités 3. Hermann, Paris.

[3] Freidlin, Mark I.; Wentzell, A. D. On small random perturbations of dynamical systems. Russian Mathematical Surveys 25 (1970), no. 1, 1–56.

[4] Freidlin, Mark I.; Wentzell, Alexander D. Random perturbations of Hamiltonian systems. Mem. Amer. Math. Soc. 109 (1994), no. 523, viii+82 pp.

[5] Guo, M. Z.; Papanicolaou, G. C.; Varadhan, S. R. S. Nonlinear diffusion limit for a system with nearest neighbor interactions. Comm. Math. Phys. 118 (1988), no. 1, 31–59.

[6] Kac, M.; Luttinger, J. M. Bose-Einstein condensation in the presence of impurities. II. J. Mathematical Phys. 15 (1974), 183–186.

[7] Quastel, Jeremy. Diffusion of color in the simple exclusion process. Comm. Pure Appl. Math. 45 (1992), no. 6, 623–679.

[8] Quastel, Jeremy. Large deviations from a hydrodynamic scaling limit for a nongradient system. Ann. Probab. 23 (1995), no. 2, 724–742.

[9] Quastel, J.; Rezakhanlou, F.; Varadhan, S. R. S. Large deviations for the symmetric simple exclusion process in dimensions d ≥ 3. Probab. Theory Related Fields 113 (1999), no. 1, 1–84.

[10] Rezakhanlou, Fraydoun. Propagation of chaos for symmetric simple exclusions. Comm. Pure Appl. Math. 47 (1994), no. 7, 943–957.

[11] Schilder, M. Some asymptotic formulas for Wiener integrals. Trans. Amer. Math. Soc. 125 (1966), 63–85.


[12] Strassen, V. An invariance principle for the law of the iterated logarithm. Z. Wahrscheinlichkeitstheorie und Verw. Gebiete 3 (1964), 211–226.

[13] Varadhan, S. R. S. Asymptotic probabilities and differential equations. Comm. Pure Appl. Math. 19 (1966), 261–286.

[14] Varadhan, S. R. S. On the behavior of the fundamental solution of the heat equation with variable coefficients. Comm. Pure Appl. Math. 20 (1967), 431–455.

[15] Varadhan, S. R. S. Diffusion processes in a small time interval. Comm. Pure Appl. Math. 20 (1967), 659–685.

[16] Varadhan, S. R. S. Nonlinear diffusion limit for a system with nearest neighbor interactions. II. Asymptotic problems in probability theory: stochastic models and diffusions on fractals (Sanda/Kyoto, 1990), 75–128, Pitman Res. Notes Math. Ser., 283, Longman Sci. Tech., Harlow, 1993.
