Introduction to the ergodic theory - Université Grenoble Alpes · 2017-11-27 · Introduction to...


Introduction to the ergodic theory

Christophe Leuridan

January 2017


Chapter 1

Some basis in measure theory

1.1 Semi-algebras, algebras, σ-fields and monotone classes

Let E be any set. Denote by P(E) its power set, namely the set of all subsets of E.

Definition 1.1. (Semi-algebras, algebras, σ-fields, monotone classes)

A semi-algebra on E is a subset S of P(E) which contains ∅ and E, is closed under finite intersections, and is such that the difference of any two elements can be written as a finite union of pairwise disjoint elements of S.

An algebra on E is a subset of P(E) which contains E and is closed under complement and under finite unions.

A σ-algebra or σ-field on E is a subset of P(E) which contains E and is closed under complement and under countable unions.

A monotone class on E is a subset of P(E) which contains ∅ and E and is closed under non-decreasing countable unions and non-increasing countable intersections.

Example 1.2.

1. The set of all intervals of R is a semi-algebra on R. One can also consider only the intervals [a, b[ and ]−∞, b[ with −∞ < a ≤ b ≤ +∞. The generated σ-algebra is the Borel σ-field on R, denoted by B(R).

2. If 𝒜 is an algebra on X and ℬ an algebra on Y, the set of all Cartesian products A × B with A ∈ 𝒜 and B ∈ ℬ is a semi-algebra on X × Y. When 𝒜 and ℬ are σ-algebras, the σ-algebra 𝒜 ⊗ ℬ is, by definition, the σ-algebra generated by such Cartesian products.

3. Let (E, ℰ) be a measurable space. Call (p_n)_{n∈Z} the canonical projections from E^Z to E. A finite-base cylinder of E^Z is a finite intersection of sets of the form p_n^{-1}(B) with n ∈ Z and B ∈ ℰ. The set of all finite-base cylinders is a semi-algebra. By definition, the σ-algebra ℰ^{⊗Z} is the σ-algebra generated by the finite-base cylinders.
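The semi-algebra axioms for half-open intervals (item 1 above) can be checked by hand; the following Python sketch, which is not part of the original notes, only illustrates them. The names `Interval`, `EMPTY`, `intersect` and `difference` are hypothetical helpers introduced here, and the encoding (a pair (a, b) stands for [a, b[, any pair with a ≥ b stands for ∅) is an assumption of the sketch.

```python
from typing import List, Tuple

# Assumption: the pair (a, b) encodes the half-open interval [a, b[,
# and any pair with a >= b encodes the empty set.
Interval = Tuple[float, float]

EMPTY: Interval = (0.0, 0.0)


def intersect(i: Interval, j: Interval) -> Interval:
    """Intersection of [a, b[ and [c, d[, again an interval of the same form."""
    a, b = max(i[0], j[0]), min(i[1], j[1])
    return (a, b) if a < b else EMPTY


def difference(i: Interval, j: Interval) -> List[Interval]:
    """Write i \\ j as a list of at most two pairwise disjoint intervals."""
    a, b = i
    c, d = j
    if a >= b:            # i is empty
        return []
    if c >= d:            # j is empty, nothing to remove
        return [i]
    # Part of i lying to the left of j, and part lying to the right of j.
    pieces = [(a, min(b, c)), (max(a, d), b)]
    return [(x, y) for (x, y) in pieces if x < y]
```

For instance, `difference((0, 10), (3, 5))` splits [0, 10[ into the two disjoint pieces [0, 3[ and [5, 10[, which is the 'finite union of pairwise disjoint elements' required by Definition 1.1.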

Proposition 1.3. If S is a semi-algebra on E, the set 𝒜 of all finite unions of elements of S is an algebra on E (namely the algebra generated by S). Moreover, every element of 𝒜 can be written as a finite union of pairwise disjoint elements of S.

Theorem 1.4. (Monotone class theorem). Let 𝒜 be an algebra on X. The monotone class generated by 𝒜 is also the σ-field generated by 𝒜.


1.2 Premeasures and measures

Definition 1.5. If 𝒜 is an algebra on E, a premeasure on (E, 𝒜) is a map µ from 𝒜 to [0,+∞] such that µ(∅) = 0 and, for every sequence (A_n)_{n≥1} of pairwise disjoint elements of 𝒜 whose union is still in 𝒜,

\[ \mu\Big(\bigcup_{n\ge 1} A_n\Big) = \sum_{n\ge 1} \mu(A_n). \]

A measure is a premeasure defined on a σ-algebra.

Remark 1.6. Let 𝒜 be an algebra on E, and µ be a map from 𝒜 to [0,+∞] such that µ(E) < +∞. Then µ is a premeasure on (E, 𝒜) if and only if the two conditions below hold:

• for all disjoint A and B in 𝒜, µ(A ∪ B) = µ(A) + µ(B);

• for every non-increasing sequence (B_n)_{n≥1} of elements of 𝒜 having an empty intersection, µ(B_n) → 0 as n → +∞.

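Theorem 1.7. (Carathéodory's theorem). Let µ be a premeasure on (E, 𝒜). For every subset B ⊂ E, set

\[ \mu^*(B) = \inf\Big\{ \sum_{n\ge 1} \mu(A_n) : (A_n)_{n\ge 1} \in \mathcal{A}^\infty,\ B \subset \bigcup_{n\ge 1} A_n \Big\}. \]

Let

\[ \mathcal{B} = \big\{ B \in \mathcal{P}(E) : \forall A \in \mathcal{P}(E),\ \mu^*(A) = \mu^*(A \cap B) + \mu^*(A \cap B^c) \big\}. \]

Then ℬ is a σ-field containing 𝒜, and µ*|ℬ is a measure which coincides with µ on 𝒜.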

Remark 1.8. The map µ* thus defined is an outer measure on (E, P(E)), namely

\[ \mu^*\Big(\bigcup_{n\ge 1} B_n\Big) \le \sum_{n\ge 1} \mu^*(B_n) \]

for every sequence (B_n)_{n≥1} of subsets of E. Moreover, the measure space (E, ℬ, µ*|ℬ) is complete.

In many situations, we will approximate arbitrary measurable subsets by some 'simple' subsets. Let us give a general statement.

Lemma 1.9. Let (X, 𝒳, µ) be a probability space. The map (A, B) ↦ µ(A △ B) from 𝒳² to R+ is a pseudo-metric on 𝒳.

Proposition 1.10. Let (X, 𝒳, µ) be a probability space and 𝒜 be a sub-algebra of 𝒳. Denote by σ(𝒜) the σ-field generated by 𝒜. Then 𝒜 is dense in σ(𝒜) for the pseudo-metric above: for every B ∈ σ(𝒜) and for every ε > 0, there exists A ∈ 𝒜 such that µ(A △ B) ≤ ε.

Proof. One checks that the set of all B ∈ 𝒳 having the property above is a σ-field containing 𝒜. Therefore, it contains σ(𝒜).

We now give examples of situations in which proposition 1.10 is frequently used.

Example 1.11. (Examples of generating algebras)

• The set of all finite unions of intervals of the form [a, b[ with a < b in R is an algebra and generates the σ-field B(R).

• If (ℱ_n)_{n≥0} is a filtration of (X, 𝒳), then ⋃_{n≥0} ℱ_n is an algebra and generates the σ-field ⋁_{n≥0} ℱ_n.

• If (X_1, 𝒳_1) and (X_2, 𝒳_2) are measurable spaces, then the set of all finite unions of 'rectangles' A_1 × A_2 with A_1 ∈ 𝒳_1 and A_2 ∈ 𝒳_2 is an algebra which generates the σ-field 𝒳_1 ⊗ 𝒳_2. Moreover, each finite union of rectangles can be written as a finite disjoint union of rectangles.

1.3 Signed Measures

Let (X,X ) be a measurable space.

Definition 1.12. A signed measure on (X, 𝒳) is a map µ from 𝒳 to R such that µ(∅) = 0 and, for every sequence (A_n)_{n≥1} of pairwise disjoint elements of 𝒳,

\[ \sum_{n\ge 1} |\mu(A_n)| < +\infty \quad\text{and}\quad \mu\Big(\bigcup_{n\ge 1} A_n\Big) = \sum_{n\ge 1} \mu(A_n). \]

The difference of any two finite non-negative measures on (X, 𝒳) is a signed measure on (X, 𝒳). Conversely, we will see that any signed measure on (X, 𝒳) can be written as the difference of two finite non-negative measures, and there is a minimal such decomposition.

Theorem and definition 1.13. Let µ be a signed measure on (X, 𝒳). For every A ∈ 𝒳, set

\[ |\mu|(A) = \sup\Big\{ \sum_{n\ge 1} |\mu(A_n)| : (A_n)_{n\ge 1} \in \mathcal{X}^\infty \text{ pairwise disjoint},\ \bigcup_{n\ge 1} A_n = A \Big\}, \]

\[ \mu^+(A) = \frac{|\mu|(A) + \mu(A)}{2} \quad\text{and}\quad \mu^-(A) = \frac{|\mu|(A) - \mu(A)}{2}. \]

Then |µ|, µ+, µ− are finite non-negative measures on (X, 𝒳), respectively called the variation, the positive part and the negative part of µ, and one has µ = µ+ − µ−. The measures µ+ and µ− are mutually singular. Moreover, if µ1 and µ2 are non-negative measures on (X, 𝒳) such that µ = µ1 − µ2, then µ1 ≥ µ+ and µ2 ≥ µ−.

The first difficult point above is to check that |µ| is finite. This statement relies on the simple fact below.

Lemma 1.14. Given any finite family of real numbers x_1, . . . , x_n, there exists some subset I of [[1, n]] such that

\[ \Big| \sum_{i\in I} x_i \Big| \ge \frac{1}{2} \sum_{i=1}^{n} |x_i|. \]

Remark 1.15. A similar statement holds with complex numbers, in which the best constant is 1/π instead of 1/2.
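One way to see Lemma 1.14 in the real case is to compare the total mass of the positive terms with that of the negative terms: the two masses add up to the sum of the |x_i|, so the larger one is at least half of it. The Python sketch below illustrates this argument; it is an illustration added here, not the author's proof, and the function name `half_mass_subset` is hypothetical.

```python
from typing import List, Sequence


def half_mass_subset(xs: Sequence[float]) -> List[int]:
    """Return 1-based indices I with |sum_{i in I} x_i| >= (1/2) * sum_i |x_i|."""
    positive = [i for i, x in enumerate(xs, start=1) if x > 0]
    negative = [i for i, x in enumerate(xs, start=1) if x < 0]
    positive_mass = sum(xs[i - 1] for i in positive)
    negative_mass = -sum(xs[i - 1] for i in negative)
    # The two masses add up to sum_i |x_i|, so the larger one is at least half of it.
    return positive if positive_mass >= negative_mass else negative
```

With x = (3, −1, 2, −5), the positive terms have mass 5 and the negative terms have mass 6, so the sketch selects the indices of the negative terms.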

Proposition 1.16. The space M(X, 𝒳) of all signed measures on (X, 𝒳) is a Banach space for the norm defined by ||µ|| = |µ|(X) (total variation of µ).

Remark 1.17. Many authors define the total variation norm by ||µ|| = (1/2)|µ|(X). This choice makes the diameter of the set of all probability measures on X equal to 1.


1.4 Probability measures on a metric space

We fix a metric space (X, d). We denote by B(X) its Borel σ-field and by Π(X) the set of all probability measures on (X, B(X)).

Proposition 1.18. Let µ ∈ Π(X). For every Borel set B ∈ B(X),

\[ \mu(B) = \sup\{\mu(F) : F \text{ closed},\ F \subset B\} = \inf\{\mu(O) : O \text{ open},\ O \supset B\}. \]

One says that µ is regular.

Proof. Call 𝒜 the set of all Borel sets in X for which the two equalities above hold.

By writing each closed set F as the intersection of a non-increasing sequence of open sets, namely the sets O_n = {x ∈ X : d(x, F) < 1/n} for n ≥ 1, one sees that 𝒜 contains the closed sets.

Therefore, it suffices to check that 𝒜 is a σ-field. By construction, 𝒜 is closed under complement. Let us check that 𝒜 is closed under countable union. Fix a sequence (B_n)_{n≥1} of sets in 𝒜 and call B their union. Given ε > 0, one can choose for each n ≥ 1 a closed set F_n ⊂ B_n and an open set O_n ⊃ B_n such that µ(F_n) > µ(B_n) − ε/2^n and µ(O_n) < µ(B_n) + ε/2^n. Call F and O the unions of the sequences (F_n)_{n≥1} and (O_n)_{n≥1}. Then F ⊂ B ⊂ O, µ(F) > µ(B) − ε and µ(O) < µ(B) + ε. Moreover, the set O is open but the set F is not necessarily closed. Fortunately, the sets F′_n = F_1 ∪ · · · ∪ F_n are closed and form a non-decreasing sequence whose union is F, and the equality µ(F) = lim µ(F′_n) shows that the inequality µ(F′_n) > µ(B) − ε holds for every large enough n.

The proof is complete.

Lemma 1.19. Let µ ∈ Π(X) and F be a closed subset of X. The functions from X to R defined by f_{F,n} : x ↦ max(1 − n d(x, F), 0) are continuous and bounded and, as n goes to infinity, they tend to 1_F pointwise and in L^p(µ) for every p ∈ [1,+∞[.

Proof. The result follows from the equivalence d(x, F) = 0 ⟺ x ∈ F and from Lebesgue's dominated convergence theorem.

Corollary 1.20. Let µ ∈ Π(X). The set C_b(X) of bounded continuous functions from X to R is dense in L^p(µ) for every p ∈ [1,+∞[.

Proof. The closure of C_b(X) is a closed vector subspace of L^p(µ). By Lemma 1.19, it contains the indicator function of every closed set. By Proposition 1.18, it contains the indicator function of every Borel set. Therefore, it contains every simple function, so it is L^p(µ).

Corollary 1.21. Let (X, d) be a separable complete metric space and C_b(X) be the space of all bounded continuous functions from X to R. Then any finite measure µ on (X, B(X)) is completely determined by the continuous linear form L_µ : f ↦ ∫_X f dµ on C_b(X). In other words, the map µ ↦ L_µ from Π(X) to C_b(X)* is injective. Moreover, for every µ ∈ Π(X), the linear form L_µ is non-negative (L_µ(f) ≥ 0 whenever f ≥ 0 on X), and ||L_µ|| = µ(X).

Proof. For every f ∈ C_b(X), |L_µ(f)| ≤ ||f||_∞ µ(X), and equality holds when f is a constant function. Using the density of C_b(X) in L^1(µ), one sees that if two finite measures on X yield the same linear form, they are the same.


Actually, the map µ ↦ L_µ could be extended into an isometry from the vector space M(X) of all signed measures on (X, d) endowed with the total variation norm to C_b(X)*.

Theorem 1.22. (Riesz - Markov - Kakutani representation theorem) If (X, d) is locally compact, then every non-negative linear form on C_c(X) which sends the constant function 1_X to 1 is the linear form L_µ for some µ ∈ Π(X).

We now introduce the notion of narrow convergence, which underlies the notion of convergence in distribution.

Definition 1.23. Let (µ_n)_{n≥1} ∈ Π(X)^∞ and µ ∈ Π(X). One says that (µ_n)_{n≥1} converges narrowly to µ if L_{µ_n} → L_µ for the weak-star convergence, namely L_{µ_n}(f) → L_µ(f) for every f ∈ C_b(X).

Many authors use the terminology of weak convergence. We prefer the appellation 'narrow convergence' to avoid confusion with the weak topology on the space C_b(X)*. Actually, the topology involved here on C_b(X)* is the weak-star topology!

Theorem 1.24. (Portmanteau theorem) Let (µ_n)_{n≥1} ∈ Π(X)^∞ and µ ∈ Π(X). The following statements are equivalent.

1. (µ_n)_{n≥1} converges narrowly to µ.

2. L_{µ_n}(f) → L_µ(f) for every uniformly continuous bounded function f on (X, d).

3. µ(O) ≤ lim inf µ_n(O) for every open set O in (X, d).

4. µ(F) ≥ lim sup µ_n(F) for every closed set F in (X, d).

5. µ(B) = lim µ_n(B) for every Borel set B in (X, d) such that µ(∂B) = 0.

1.5 Prohorov's Theorem

We keep the notation of the previous section, and fix a subset P ⊂ Π(X) of probability measures on X. Prohorov's theorem relates two notions: tightness and sequential relative compactness.

Definition 1.25. The subset P is tight if for every ε > 0, there exists some compact subset K ⊂ X such that for every µ ∈ P, µ(K) ≥ 1 − ε.

Proposition 1.26. If the metric space (X, d) is complete and separable, then every finite subset of Π(X) is tight.

Proof. It suffices to check the property for a single probability measure µ. Fix ε > 0.

Let (x_n)_{n≥1} be a dense sequence of points in X. For every n ≥ 1 and r > 0, set

\[ F_n(r) = B_f(x_1, r) \cup \dots \cup B_f(x_n, r), \]

where B_f(x, r) denotes the closed ball with centre x and radius r. Then (F_n(r))_{n≥1} is a non-decreasing sequence of closed sets whose union is the whole space X. For each k ≥ 1, we can choose an integer n_k such that µ(F_{n_k}(1/2^k)) > 1 − ε/2^k. The set

\[ K = \bigcap_{k\ge 1} F_{n_k}(1/2^k) \]

is complete and precompact, therefore compact. Moreover, µ(K) > 1 − ε.


Corollary 1.27. If the metric space (X, d) is complete and separable, then for every µ ∈ Π(X) and every B ∈ B(X),

\[ \mu(B) = \sup\{\mu(K) : K \text{ compact},\ K \subset B\}. \]

Proof. Let ε > 0. By the regularity of µ, one can find a closed set F ⊂ B such that µ(F) ≥ µ(B) − ε/2. By the tightness of µ, one can find a compact set K such that µ(K) ≥ 1 − ε/2. Then F ∩ K is a compact subset of B and µ(F ∩ K) ≥ µ(B) − ε.

Denition 1.28. A family P ⊂ Π(X) of probability measures on X is sequentially

relatively compact if every sequence of elements of P admits some subsequence which

converges narrowly to some element of Π(X).

Remark 1.29. The notion of sequential relative compactness is weaker than relative compactness. When (X, d) is separable, the topology of narrow convergence on Π(X) (derived from the weak-star topology on C_b(X)*) is metrizable, for example by the Prohorov metric defined by

\[ d(\mu, \nu) = \inf\{\varepsilon > 0 : \forall B \in \mathcal{B}(X),\ \mu(B) \le \nu(B^\varepsilon) + \varepsilon \text{ and } \nu(B) \le \mu(B^\varepsilon) + \varepsilon\}, \]

where B^ε = {x ∈ X : d(x, B) < ε}. Therefore, these two notions coincide when (X, d) is separable.

Theorem 1.30. (Prohorov's theorem). Let P ⊂ Π(X). If P is tight, then P is

sequentially relatively compact. When (X, d) is complete and separable, the converse

holds.

The direct implication of Prohorov's theorem can be deduced from the Banach - Alaoglu theorem, which yields the weak-star compactness of the closed unit ball of C_b(X)*, and from the Riesz - Markov - Kakutani representation theorem. The subset of non-negative linear forms in the closed unit ball of C_b(X)* is still compact for the weak-star topology and corresponds to the set of all sub-probability measures on (X, d). The tightness assumption prevents the mass from escaping to infinity and ensures that every weak-star limit point of any tight sequence of probability measures is still a probability measure.

Corollary 1.31. If (X, d) is compact, then Π(X) is metrizable and compact for the

topology of the narrow convergence.

Proposition 1.32. If (X, d) is complete and separable, and if D is a countable dense

subset of X, then

1. The family of all balls B(x, 1/n) with x ∈ D and n ≥ 1 is a countable basis of open

sets in X.

2. The family of all products B(x_1, 1/n) × B(x_2, 1/n) with (x_1, x_2) ∈ D² and n ≥ 1 is a countable basis of open sets in X².

3. B(X²) = B(X) ⊗ B(X).

Proof. The proof of the first statement is left to the reader. The second statement follows from the first, from the density of D² in X² and from the fact that B(x_1, 1/n) × B(x_2, 1/n) is the ball B((x_1, x_2), 1/n) for the product distance, which induces the product topology.

Since B(X) ⊗ B(X) contains these balls, it contains every open set in X², so it contains B(X²). Conversely, B(X²) contains all Cartesian products of open sets of X, hence all Cartesian products of Borel sets of X, and therefore B(X) ⊗ B(X). Indeed, given A ∈ B(X), the collections {B ∈ B(X) : A × B ∈ B(X²)} and {B ∈ B(X) : B × A ∈ B(X²)} are σ-fields.


1.6 Herglotz's Theorem

Let T = R/Z. For each k ∈ Z, denote by χ_k the map from T to C defined by χ_k(x + Z) = e^{i2πkx} for every x ∈ R. We endow T with the uniform measure (namely the image of the uniform measure on [0, 1[ by the canonical projection from R to T). Then (χ_k)_{k∈Z} is a Hilbert basis of L²(T).

Definition 1.33. To every signed measure µ on T, we associate its Fourier transform µ̂ from Z to C defined by

\[ \hat\mu(k) = \int_{\mathbf{T}} \overline{\chi_k(t)}\, d\mu(t). \]

Proposition 1.34. The map µ̂ thus defined is bounded and ||µ̂||_∞ ≤ ||µ|| = |µ|(T). Moreover, µ̂(0) = µ(T) and for every k ∈ Z, µ̂(−k) is the complex conjugate of µ̂(k).

Proposition 1.35. The map µ ↦ µ̂ is a continuous injection from the space M(T) of all signed measures on T to the space ℓ^∞(Z).

Proof. The continuity follows from the linearity and the inequality ||µ̂||_∞ ≤ ||µ||. The injectivity follows from the density of the vector space generated by {χ_k : k ∈ Z} in the space C(T) and Corollary 1.21.

Proposition 1.36. Let (µ_n)_{n≥1} be a sequence of probability measures on T. If the sequence (µ̂_n)_{n≥1} converges pointwise to some function ϕ, then (µ_n)_{n≥1} converges narrowly to some probability measure on T whose Fourier transform is ϕ.

Proof. Since Π(T) is compact and metrizable, it suffices to show that the sequence (µ_n)_{n≥1} has only one limit point (for the narrow convergence). Let µ be the limit of some subsequence (µ_n)_{n∈I}. Then for every k ∈ Z,

\[ \hat\mu(k) = \int_{\mathbf{T}} \overline{\chi_k(t)}\, d\mu(t) = \lim_{n\to\infty,\ n\in I} \int_{\mathbf{T}} \overline{\chi_k(t)}\, d\mu_n(t) = \varphi(k). \]

Therefore, µ is the unique probability measure whose Fourier transform is ϕ.

Definition 1.37. Let ϕ be a map from Z to C. One says that ϕ is positive semi-definite if for every d ≥ 1 and (a_1, . . . , a_d) ∈ Z^d, the matrix (ϕ(a_k − a_l))_{1≤k,l≤d} is positive semi-definite.

Many authors use the confusing terminology of 'positive definite function', although the associated matrices are only supposed to be positive semi-definite.

Exercise. By considering the matrices associated with (a_1, a_2) = (0, t), check that the property above implies that ϕ(0) ∈ R+ and |ϕ(t)| ≤ ϕ(0) for every t ∈ Z.

Theorem 1.38. (Herglotz's theorem) Let ϕ be a map from Z to C. Then ϕ is the Fourier transform of some probability measure on T if and only if ϕ is positive semi-definite and ϕ(0) = 1.

Proof. If ϕ = µ̂ with µ ∈ Π(T), then ϕ(0) = µ(T) = 1 and for all integers a and b,

\[ \varphi(a-b) = \int_{\mathbf{T}} \overline{\chi_{a-b}}\, d\mu = \int_{\mathbf{T}} \overline{\chi_a}\,\chi_b\, d\mu = \langle \chi_a | \chi_b \rangle_{L^2(\mu)}, \]

the bracket being taken antilinear in its first argument. Therefore, for every d ≥ 1, (a_1, . . . , a_d) ∈ Z^d and (z_1, . . . , z_d) ∈ C^d,

\[ \sum_{1\le k,l\le d} \varphi(a_k - a_l)\,\overline{z_k}\, z_l = \sum_{1\le k,l\le d} \langle z_k \chi_{a_k} | z_l \chi_{a_l} \rangle_{L^2(\mu)} = \Big\| \sum_{1\le k\le d} z_k \chi_{a_k} \Big\|^2_{L^2(\mu)} \ge 0. \]

The 'only if' part follows.

Conversely, assume that ϕ is positive semi-definite and ϕ(0) = 1. For every n ≥ 1 and t ∈ T, let

\[ f_n(t) = \frac{1}{n} \sum_{1\le k,l\le n} \varphi(k-l)\,\chi_k(t)\,\overline{\chi_l(t)} = \frac{1}{n} \sum_{1\le k,l\le n} \varphi(k-l)\,\chi_{k-l}(t). \]

By assumption and construction, f_n is a non-negative continuous map on T. Grouping together the terms according to the value of the difference m = k − l, we get

\[ f_n(t) = \frac{1}{n} \sum_{-(n-1)\le m\le n-1} (n-|m|)\,\varphi(m)\,\chi_m(t). \]

Calling µ_n the measure with density f_n with respect to the uniform measure, we get for every k ∈ Z,

\[ \hat\mu_n(k) = \frac{1}{n} \sum_{-(n-1)\le m\le n-1} (n-|m|)\,\varphi(m)\,\langle \chi_k | \chi_m \rangle = \frac{n-|k|}{n}\,\varphi(k)\,\mathbf{1}_{|k|\le n-1}. \]

In particular, µ̂_n(0) = ϕ(0) = 1 so µ_n ∈ Π(T). Since µ̂_n → ϕ pointwise, Proposition 1.36 shows that ϕ is the Fourier transform of some probability measure on T.
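The construction of the densities f_n can be tried numerically. The Python sketch below is an illustration added here under stated assumptions: it takes the Fourier transform to be defined by integrating the conjugate of χ_k, and it uses as test case the Dirac mass at a point x0, whose transform is then k ↦ e^{−i2πk x0}. The names `chi`, `fejer_density` and `phi_dirac` are hypothetical.

```python
import cmath
from typing import Callable


def chi(k: int, t: float) -> complex:
    """Character chi_k(t) = exp(i 2 pi k t) of T = R/Z, with t a representative in R."""
    return cmath.exp(2j * cmath.pi * k * t)


def fejer_density(phi: Callable[[int], complex], n: int, t: float) -> complex:
    """f_n(t) = (1/n) * sum over |m| <= n-1 of (n - |m|) * phi(m) * chi_m(t)."""
    total = 0j
    for m in range(-(n - 1), n):
        total += (n - abs(m)) * phi(m) * chi(m, t)
    return total / n


def phi_dirac(x0: float) -> Callable[[int], complex]:
    """Fourier transform of the Dirac mass at x0 (assumed convention: integrate conj(chi_k))."""
    return lambda k: chi(-k, x0)
```

For this choice of ϕ, f_n is the Fejér kernel centred at x0: its values are non-negative up to rounding errors, and f_n(x0) = n, so the mass of µ_n concentrates around x0 as n grows.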

Example 1.39. (Examples of positive semi-definite maps)

If (X_n)_{n∈Z} is a square-integrable real-valued stationary process, its covariance function, defined by c(k) = Cov(X_0, X_k) for every k ∈ Z, is positive semi-definite.

If U is a unitary operator of a Hilbert space H, then for every h ∈ H, the function ϕ from Z to C defined by ϕ(k) = ⟨h, U^k h⟩ is positive semi-definite.

In both examples above, Herglotz's theorem provides a finite measure on T (or equivalently on the unit circle U), called the spectral measure.

1.7 Uniform integrability

Let (X,X , µ) be a probability space.

Theorem and definition 1.40. Let ℱ ⊂ L^1(µ). One says that ℱ is uniformly integrable on (X, 𝒳, µ) if the following equivalent statements hold.

1. ℱ is bounded in L^1(µ) and for every ε > 0, one can find α > 0 such that

\[ \forall A \in \mathcal{X},\quad \mu(A) \le \alpha \implies \forall f \in \mathcal{F},\ \int_A |f|\, d\mu \le \varepsilon. \]

2. For every ε > 0, one can find r > 0 such that for every f ∈ ℱ,

\[ \int_X |f|\,\mathbf{1}_{[|f|\ge r]}\, d\mu \le \varepsilon. \]

3. There exists some continuous convex increasing function Φ from R+ to R+ such that Φ(r)/r → +∞ as r → +∞ and {Φ ∘ |f| : f ∈ ℱ} is bounded in L^1(µ).

4. There exists some measurable function Φ from R+ to R+ such that Φ(r)/r → +∞ as r → +∞ and {Φ ∘ |f| : f ∈ ℱ} is bounded in L^1(µ).

Proof. The implication 3 ⟹ 4 is clear. We prove the implications 1 ⟹ 2 ⟹ 1, (1 and 2) ⟹ 3, and 4 ⟹ 2.

Assume that statement 1 holds. Let M = sup{||f||_1 : f ∈ ℱ}. Then for every f ∈ ℱ and r > 0, µ[|f| ≥ r] ≤ ||f||_1/r ≤ M/r. Given ε > 0, statement 1 provides some α > 0, and r := M/α satisfies the inequality of statement 2.

Assume that statement 2 holds. Let ε > 0. One can find r > 0 such that for every f ∈ ℱ,

\[ \int_X |f|\,\mathbf{1}_{[|f|\ge r]}\, d\mu \le \varepsilon/2. \]

Since |f| ≤ r + |f| 1_{[|f|≥r]}, we deduce that ||f||_1 ≤ r + ε/2 for every f ∈ ℱ. Moreover, if A ∈ 𝒳 and µ(A) ≤ ε/(2r), then decomposing A into A ∩ [|f| < r] and A ∩ [|f| ≥ r] yields

\[ \int_A |f|\, d\mu \le r\,\mu(A) + \int_{[|f|\ge r]} |f|\, d\mu \le \varepsilon. \]

Assume that statements 1 and 2 hold. Let M = sup{||f||_1 : f ∈ ℱ}. One can construct an increasing sequence (r_n)_{n≥0} of real numbers starting at r_0 = 0 such that for every n ≥ 0 and f ∈ ℱ,

\[ \int_X |f|\,\mathbf{1}_{[|f|\ge r_n]}\, d\mu \le M 2^{-n}. \]

Let φ be any continuous increasing map from R+ to R+ such that φ(r_n) = n for every n ≥ 0, and Φ its primitive which vanishes at 0. Then Φ satisfies the required conditions and for every f ∈ ℱ,

\[
\begin{aligned}
\int_X (\Phi \circ |f|)\, d\mu &= \int_0^{+\infty} \phi(r)\,\mu[|f| > r]\, dr \\
&\le \sum_{n=0}^{+\infty} \int_{r_n}^{r_{n+1}} (n+1)\,\mu[|f| > r]\, dr \\
&\le \sum_{n=0}^{+\infty} (n+1) \int_{r_n}^{+\infty} \mu[|f| > r]\, dr \\
&= \sum_{n=0}^{+\infty} (n+1) \int_X (|f| - r_n)_+\, d\mu \\
&\le \sum_{n=0}^{+\infty} (n+1) \int_X |f|\,\mathbf{1}_{[|f|\ge r_n]}\, d\mu \\
&\le \sum_{n=0}^{+\infty} (n+1)\, M 2^{-n} = 4M,
\end{aligned}
\]

so statement 3 holds.

Last, assume that statement 4 holds. Let M = sup{||Φ ∘ |f| ||_1 : f ∈ ℱ} and ε > 0. One can find r_0 > 0 such that r/Φ(r) ≤ ε/M for every r ≥ r_0, so for every f ∈ ℱ,

\[ \int_X |f|\,\mathbf{1}_{[|f|\ge r_0]}\, d\mu \le \frac{\varepsilon}{M} \int_X (\Phi \circ |f|)\,\mathbf{1}_{[|f|\ge r_0]}\, d\mu \le \varepsilon, \]

so statement 2 holds.

Finite subsets of L^1(µ), finite unions of uniformly integrable families, and subsets of uniformly integrable families are uniformly integrable. Let us give less trivial examples.

Proposition 1.41. The following families of functions are uniformly integrable.

1. Any family of functions dominated by some function in L1(µ).

2. Any sequence of elements of L1(µ) which converges in L1(µ).

3. Any family of identically distributed integrable random variables.

4. The closure in L1(µ) of any uniformly integrable family.

5. Any bounded subset of Lp(µ) with p > 1.

6. The convex hull of any uniformly integrable family.

7. The family of all conditional expectations of some integrable function with regard to any family of sub-σ-fields of 𝒳.

Proposition 1.42. Let (f_n)_{n≥1} and f be elements of L^1(µ). The following statements are equivalent.

1. fn → f in L1(µ).

2. fn → f in probability and (fn)n≥1 is uniformly integrable.

3. fn → f in probability and ||fn||1 → ||f ||1.

Proof. We prove the implications 1 ⟹ (2 and 3), 2 ⟹ 1, and 3 ⟹ 2.

If statement 1 holds, then statements 2 and 3 follow from the last proposition and from the inequalities µ[|f_n − f| > ε] ≤ ||f_n − f||_1/ε and | ||f_n||_1 − ||f||_1 | ≤ ||f_n − f||_1.

Assume that statement 2 holds, and fix ε > 0. One can find α > 0 such that for every A ∈ 𝒳,

\[ \mu(A) \le \alpha \implies \forall n \ge 1,\ \int_A |f_n|\, d\mu \le \varepsilon/3. \]

We know that some subsequence (f_n)_{n∈I} converges almost surely to f. By Fatou's lemma, if µ(A) ≤ α, we have also

\[ \int_A |f|\, d\mu \le \liminf_{n\to+\infty,\ n\in I} \int_A |f_n|\, d\mu \le \varepsilon/3. \]

Since f_n → f in probability, one can find an integer N ≥ 1 such that µ[|f_n − f| ≥ ε/3] < α for every n ≥ N. Hence, for every n ≥ N,

\[
\begin{aligned}
\|f_n - f\|_1 &= \int_{[|f_n - f| < \varepsilon/3]} |f_n - f|\, d\mu + \int_{[|f_n - f| \ge \varepsilon/3]} |f_n - f|\, d\mu \\
&\le (\varepsilon/3)\,\mu[|f_n - f| < \varepsilon/3] + \int_{[|f_n - f| \ge \varepsilon/3]} (|f_n| + |f|)\, d\mu \\
&\le \varepsilon,
\end{aligned}
\]

which proves statement 1.

Last, assume that statement 3 holds. Set g_n = |f_n| and g = |f|. For every n ≥ 1, [g − g_n]_+ ≤ g, so ([g − g_n]_+)_{n≥1} is uniformly integrable. Since [g − g_n]_+ → 0 in probability, the implication 2 ⟹ 1 already proved shows that [g − g_n]_+ → 0 in L^1(µ). Since

\[ \|g_n - g\|_1 = \int_X g_n\, d\mu - \int_X g\, d\mu + 2 \int_X [g - g_n]_+\, d\mu = \|f_n\|_1 - \|f\|_1 + 2\,\|[g - g_n]_+\|_1, \]

we get that g_n → g in L^1(µ), so (g_n)_{n≥1} is uniformly integrable, so (f_n)_{n≥1} is uniformly integrable, which yields statement 2.

1.8 Complete measured spaces

Let (X, 𝒳, µ) be a measure space: 𝒳 is a σ-field on X and µ is a non-negative measure on (X, 𝒳).

Definition 1.43. (null sets, negligible sets, complete measures)

• A null set of (X, 𝒳, µ) is a set A ∈ 𝒳 such that µ(A) = 0.

• A negligible set of (X, 𝒳, µ) is a subset of some null set of (X, 𝒳, µ).

• The measure space (X, 𝒳, µ), the σ-field 𝒳 and the measure µ are said to be complete when every negligible set of (X, 𝒳, µ) belongs to 𝒳 (so is a null set).

Proposition 1.44. (properties)

• The set 𝒩 of all negligible sets of (X, 𝒳, µ) is closed under countable union, and every subset of a set in 𝒩 is also in 𝒩.

• The set 𝒳′ = {A ⊂ X : ∃B ∈ 𝒳, A △ B ∈ 𝒩} is a σ-field containing 𝒳. It equals 𝒳 if and only if µ is complete.

• Given A ∈ 𝒳′, the quantity µ′(A) = µ(B), where B is any element of 𝒳 such that A △ B ∈ 𝒩, does not depend on the choice of B, so the map µ′ from 𝒳′ to [0,+∞] is well-defined. Moreover, µ′ is a complete measure on (X, 𝒳′) extending µ.

Definition 1.45. The measure space (X, 𝒳′, µ′) is called the completion of (X, 𝒳, µ).

Example 1.46. (σ-field of Lebesgue-measurable sets)

• One checks that the Lebesgue measure λ on (R, B(R)) is not complete. The completion yields a larger σ-field ℒ, the set of all Lebesgue-measurable subsets of R. This σ-field is strictly included in the set P(R) of all subsets of R.

• Call λ′ the completion of λ. One checks that the measure λ′ ⊗ λ′ is not complete. Hint: if A is any non-Lebesgue-measurable subset of R, then A × {0} is a negligible set for λ′ ⊗ λ′ but does not belong to ℒ ⊗ ℒ.

• Although B(R²) = B(R) ⊗ B(R), its completion for the Lebesgue measure on R² is strictly larger than ℒ ⊗ ℒ.


Proposition 1.47. Let (X, 𝒳) and (Y, 𝒴) be two measurable spaces and f be a map from X to Y. Call 𝒳′ the completion of 𝒳 with regard to µ. Assume that there exists some sequence (B_n)_{n≥1} of elements of 𝒴 such that the map Φ from (Y, 𝒴) to {0, 1}^∞ (endowed with the product σ-field 𝒵) defined by Φ(y) = (1_{B_n}(y))_{n≥1} is bimeasurable.

Then f is measurable from (X, 𝒳′) to (Y, 𝒴) if and only if there exists some measurable map g from (X, 𝒳) to (Y, 𝒴) such that the set [f ≠ g] is negligible.

Proof. Actually, the 'if' part does not require any assumption on the space (Y, 𝒴) and simply follows from the inclusion f^{-1}(B) △ g^{-1}(B) ⊂ [f ≠ g] for every B ∈ 𝒴.

Conversely, assume that f is measurable from (X, 𝒳′) to (Y, 𝒴). For every n ≥ 1, one can find A_n ∈ 𝒳 such that A_n △ f^{-1}(B_n) is a negligible set. The map Ψ from (X, 𝒳) to {0, 1}^∞ defined by Ψ(x) = (1_{A_n}(x))_{n≥1} is measurable, so g := Φ^{-1} ∘ Ψ is 𝒳-measurable.

The sequence (C_n)_{n≥1} defined by C_n = {z ∈ {0, 1}^∞ : z_n = 1} generates 𝒵. Since B_n = Φ^{-1}(C_n) for every n ≥ 1, the sequence (B_n)_{n≥1} generates Φ^{-1}(𝒵) = 𝒴. But for every n ≥ 1, g^{-1}(B_n) = Ψ^{-1}(Φ(B_n)) = Ψ^{-1}(C_n) = A_n. Hence g is 𝒳-measurable and g = f outside the negligible set

\[ N := \bigcup_{n\ge 1} \big( A_n \,\triangle\, f^{-1}(B_n) \big). \]

The proof is complete.

Example 1.48. Examples of measurable spaces satisfying the conditions required are [0, 1[, R, R^d, R^∞.

Chapter 2

Measure-preserving maps

2.1 Morphisms of measure spaces and dynamical systems

Definition 2.1. Let (X, 𝒳, µ) and (F, ℱ, ν) be two measure spaces and Φ a measurable map from (X, 𝒳, µ) to (F, ℱ, ν).

• One says that Φ is a morphism (or a factor map) from (X, 𝒳, µ) to (F, ℱ, ν) if the measure Φ(µ) := µ ∘ Φ^{-1} equals ν.

• One says that Φ is an isomorphism from (X, 𝒳, µ) to (F, ℱ, ν) if Φ is invertible (namely bimeasurable) and Φ(µ) = ν.

• One says that Φ is an isomorphism modulo 0 from (X, 𝒳, µ) to (F, ℱ, ν) if there exist full-measure subsets X′ ⊂ X and F′ ⊂ F such that Φ induces an isomorphism from (X′, 𝒳, µ) to (F′, ℱ, ν).

Remark 2.2. Every morphism of measure spaces is also a morphism for their completions.

Exercise. Let X = {0, 1}^∞, endowed with the product σ-field 𝒳 and the probability measure µ = ⊗_{n≥1} (δ_0 + δ_1)/2. Let F = [0, 1], endowed with the Lebesgue σ-field ℱ and the Lebesgue measure ν. Show that the formula

\[ \Phi((x_n)_{n\ge 1}) = \sum_{n\ge 1} x_n 2^{-n} \]

defines an isomorphism modulo 0 from (X, 𝒳, µ) to (F, ℱ, ν).
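A finite-depth check can help to see why Φ carries the coin-tossing measure to the Lebesgue measure. Each cylinder fixing the first N coordinates has µ-mass 2^{-N}, and Φ maps it onto a dyadic interval [k/2^N, (k+1)/2^N] of Lebesgue measure 2^{-N}. The Python sketch below, added here as an illustration only, lists the left endpoints of these intervals with exact rational arithmetic; `phi_truncated` and `cylinder_left_endpoints` are hypothetical names.

```python
from fractions import Fraction
from itertools import product
from typing import List, Sequence


def phi_truncated(bits: Sequence[int]) -> Fraction:
    """Partial sum x_1/2 + ... + x_N/2^N of the series defining Phi."""
    return sum((Fraction(b, 2 ** n) for n, b in enumerate(bits, start=1)), Fraction(0))


def cylinder_left_endpoints(N: int) -> List[Fraction]:
    """Images of the 2^N words of length N, sorted increasingly."""
    return sorted(phi_truncated(w) for w in product((0, 1), repeat=N))
```

The 2^N words of length N are sent to the 2^N distinct points k/2^N, so the cylinders correspond one-to-one to the dyadic intervals of depth N. This does not replace the proof: the points with two binary expansions still have to be discarded, which is where 'modulo 0' enters.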

Definition 2.3. Let (X, 𝒳, µ) be a measure space, and T a measurable map from (X, 𝒳) to (X, 𝒳). One says that T preserves µ and that µ is invariant by T when the image measure T(µ) := µ ∘ T^{-1} equals µ.

Ergodic theory focuses mainly on measure-preserving maps, namely endomorphisms of probability spaces.

Definition 2.4. Let (X, 𝒳, µ) be a probability space. If T preserves µ and T is invertible modulo 0, then one says that T is an automorphism of (X, 𝒳, µ), and the 4-tuple (X, 𝒳, µ, T) is called a dynamical system.


Definition 2.5. Consider two dynamical systems (X, 𝒳, µ, T) and (F, ℱ, ν, S), and a morphism Φ from (X, 𝒳, µ) to (F, ℱ, ν). If Φ ∘ T = S ∘ Φ µ-almost surely, then one says that

• Φ is a morphism from (X, 𝒳, µ, T) to (F, ℱ, ν, S);

• (F, ℱ, ν) is a factor of (X, 𝒳, µ);

• (X, 𝒳, µ) is an extension of (F, ℱ, ν).

\[
\begin{array}{ccc}
(X, \mathcal{X}, \mu) & \xrightarrow{\ T\ } & (X, \mathcal{X}, \mu) \\
\Phi \downarrow & & \downarrow \Phi \\
(F, \mathcal{F}, \nu) & \xrightarrow{\ S\ } & (F, \mathcal{F}, \nu)
\end{array}
\]

If Φ is also invertible modulo 0, one says that the dynamical systems (X, 𝒳, µ, T) and (F, ℱ, ν, S) are equivalent.

Determining whether two given dynamical systems are equivalent or not is one major question in ergodic theory. An invariant called entropy plays a key role in the study of this problem.

Exercise. Let λ be the uniform measure on I = [0, 1[. Let U be the unit circle of C.

1. Check that the map Φ : x ↦ e^{i2πx} is bimeasurable from (I, B(I)) to (U, B(U)). Hint: show that for every closed subset F of I, Φ(F) is a countable union of closed subsets of U.

2. Fix α ∈ R, and let T_α be the map from I to I defined by T_α(x) = x + α − ⌊x + α⌋. Check that T_α preserves λ. Hint: there is no loss of generality in assuming that α ∈ I.

3. Let R_α be the map from U to U defined by R_α(z) = e^{i2πα} z. Check that R_α preserves the measure ν = Φ(λ) and that R_α ∘ Φ = Φ ∘ T_α.

4. Among the automorphisms T_{1/3}, T_{2/3}, T_{1/5}, T_{2/5}, which ones are equivalent?
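The conjugacy relation of question 3 can be tested numerically before proving it. The Python sketch below is an illustration added here; it assumes that R_α is the rotation z ↦ e^{i2πα} z of the unit circle, and the function names `T`, `Phi` and `R` simply mirror the notation of the exercise.

```python
import cmath
import math


def T(alpha: float, x: float) -> float:
    """T_alpha(x) = x + alpha - floor(x + alpha), a map from [0, 1[ to itself."""
    y = x + alpha
    return y - math.floor(y)


def Phi(x: float) -> complex:
    """Phi(x) = exp(i 2 pi x), from [0, 1[ to the unit circle."""
    return cmath.exp(2j * math.pi * x)


def R(alpha: float, z: complex) -> complex:
    # Assumption: R_alpha is the rotation z -> exp(i 2 pi alpha) * z.
    return cmath.exp(2j * math.pi * alpha) * z
```

Comparing `R(alpha, Phi(x))` with `Phi(T(alpha, x))` on a few sample points gives agreement up to rounding errors, which is what the identity e^{i2π(x+α−⌊x+α⌋)} = e^{i2πα} e^{i2πx} predicts.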

Exercise. (Atoms are not interesting in ergodic theory) Let (X, 𝒳, µ) be a probability space. A set A ∈ 𝒳 is called an atom (with regard to µ) if µ(A) > 0 and for every B ∈ 𝒳, B ⊂ A implies µ(B) = 0 or µ(B) = µ(A).

1. Check that two atoms A_1 and A_2 are almost surely equal (µ(A_1 △ A_2) = 0) or almost surely disjoint (µ(A_1 ∩ A_2) = 0).

2. Let n ≥ 1. Check that µ has at most n atoms with measure ≥ 1/n (therefore, µ has at most countably many atoms).

3. Assume that there exists some sequence (B_n)_{n≥1} of elements of 𝒳 which separates the points of X: for any distinct x_1 and x_2 in X, there exists some n ≥ 1 such that 1_{B_n}(x_1) ≠ 1_{B_n}(x_2).

(a) Check that atoms of (X, 𝒳, µ) are singletons (up to null sets).

(b) Assume that the measure space (X, 𝒳, µ) has atoms. Let T be a measure-preserving map of (X, 𝒳, µ). Check that T induces a permutation on the set of atoms of given measure. Hint: start with atoms of maximal measure.

4. We still assume that the measure space (X, 𝒳, µ) has atoms and consider a measure-preserving map T of (X, 𝒳, µ), but we remove the separability hypothesis. Call X_d (discrete part) the union of the atoms of (X, 𝒳, µ) and X_c its complement in X (continuous part). Let A_1 be an atom of maximal measure.

(a) Check that for every atom A and every B ∈ 𝒳 such that µ(B) < µ(A), µ(A ∩ T^{-1}(B)) = 0.

(b) Show that T induces a permutation on the set of atoms of given measure. Hint: use the result of the next exercise.

Exercise. (Non-atomic measure spaces.) Let (X, 𝒳, µ) be a non-atomic finite measure space. Prove the existence of an increasing map A from [0, µ(X)] to 𝒳 such that µ(A(t)) = t for every t ∈ [0, µ(X)]. Hint: apply Zorn's lemma to the set of increasing maps A from some subset S of [0, µ(X)] to 𝒳 such that µ(A(t)) = t for every t ∈ S.

2.2 Existence of invariant probability measures

Given a measurable map T from (X, 𝒳) to itself, what can be said about the set Π_T of all T-invariant probability measures on (X, 𝒳)? Clearly, Π_T is a convex subset of the set of all signed measures on (X, 𝒳). This set may be empty. We give conditions ensuring the existence of invariant measures.

2.2.1 Continuous transformations on compact metric spaces

Let T be a continuous map from a compact metric space (X, d) to itself.

Theorem 2.6. The set Π_T of T-invariant probability measures is a non-empty compact convex subset of Π(X).

Proof. First, we note that by Corollary 1.21, a probability measure ν ∈ Π(X) is T-invariant if and only if for every f ∈ C(X) = C_b(X),

\[ \int_X (f \circ T)\, d\nu = \int_X f\, d\nu. \]

This characterization and the continuity of T show that Π_T is a closed subset of the compact space Π(X). The convexity is obvious. The non-emptiness relies on the next proposition and on the sequential compactness of Π(X).

Proposition 2.7. Let (µ_n)_{n≥1} ∈ Π(X)^∞. For every n ≥ 1, set

\[ \nu_n = \frac{1}{n} \sum_{k=0}^{n-1} T^k(\mu_n). \]

Then every limit point of the sequence (ν_n)_{n≥1} belongs to Π_T.


Proof. First, we note that the probability measure ν_n gets closer and closer to being T-invariant as n goes to infinity. Indeed,

\[ T(\nu_n) - \nu_n = \frac{1}{n} \sum_{k=0}^{n-1} T^{k+1}(\mu_n) - \frac{1}{n} \sum_{k=0}^{n-1} T^k(\mu_n) = \frac{1}{n} \big( T^n(\mu_n) - \mu_n \big), \]

so T(ν_n) − ν_n is a signed measure whose total variation is at most 2/n. Hence, for every f ∈ C(X),

\[ \Big| \int_X (f \circ T)\, d\nu_n - \int_X f\, d\nu_n \Big| \le \frac{2}{n} \|f\|_\infty. \]

Let ν be a limit point of the sequence (ν_n)_{n≥1}. Taking the limit along a suitable subsequence in the last inequality yields

\[ \int_X (f \circ T)\, d\nu = \int_X f\, d\nu. \]

Since this equality holds for every f ∈ C(X), we get T(ν) = ν.

Theorem 2.8. (Oxtoby's theorem) For every $f \in C(X)$ and $n \ge 1$, set
\[ M_n f := \frac{1}{n}\sum_{k=0}^{n-1} (f \circ T^k). \]
Then the following statements are equivalent.

1. $\Pi_T$ is a singleton (one says that $T$ is uniquely ergodic).

2. For every $f \in C(X)$, $(M_n f)_{n\ge 1}$ converges uniformly to some constant $\ell(f)$.

3. For every $f \in C(X)$, $(M_n f)_{n\ge 1}$ converges pointwise to some constant $\ell(f)$.

Moreover, if $\Pi_T = \{\mu\}$, then $\ell(f) = \int_X f\,d\mu$ for every $f \in C(X)$.

Proof. We prove the implications (2) $\Longrightarrow$ (3) $\Longrightarrow$ (1) $\Longrightarrow$ (2).

The implication (2) $\Longrightarrow$ (3) is obvious.

Assume that (3) holds. Let $\mu \in \Pi_T$. For every $f \in C(X)$, $\|M_n f\|_\infty \le \|f\|_\infty$, so by Lebesgue dominated convergence and the invariance of $\mu$,
\[ \ell(f) = \int_X \lim_{n\to+\infty} M_n f\,d\mu = \lim_{n\to+\infty} \int_X M_n f\,d\mu = \int_X f\,d\mu, \]
so $\mu$ is completely determined by the knowledge of the linear form $\ell$ and (1) holds.

Last, assume that (1) holds. Set $\Pi_T = \{\mu\}$. Let $f \in C(X)$. It suffices to check that for every sequence $(x_n)_{n\ge 1} \in X^\infty$, the sequence $(M_n f(x_n))_{n\ge 1}$ converges to $\int_X f\,d\mu$. To do this, we observe that
\[ M_n f(x_n) = \int_X f\,d\nu_n \quad\text{with}\quad \nu_n = \frac{1}{n}\sum_{k=0}^{n-1} T^k(\delta_{x_n}). \]
Proposition 2.7 and the assumption $\Pi_T = \{\mu\}$ show that the only limit point of the sequence $(\nu_n)_{n\ge 1}$ is $\mu$. By compactness of $\Pi(X)$, $(\nu_n)_{n\ge 1}$ converges to $\mu$, so $(M_n f(x_n))_{n\ge 1}$ converges to $\int_X f\,d\mu$ and (2) holds.


2.2.2 Haar measures on Hausdorff compact groups

Definition 2.9. A topological group is a group $G$ endowed with a topology such that the group operations $(x,y) \mapsto xy$ from $G \times G$ to $G$ (product) and $x \mapsto x^{-1}$ from $G$ to $G$ (taking inverses) are continuous.

Lemma 2.10. Let $G$ be a Hausdorff topological group, and $H$ a closed subgroup of $G$. Then the topological quotient space $G/H$ is Hausdorff.

Proof. The equivalence relation associated to $H$ is defined by
\[ x \mathcal{R} y \iff \exists h \in H,\ y = xh \iff x^{-1}y \in H. \]
Its graph $\Gamma = \{(x,y) \in G^2 : x^{-1}y \in H\}$ is a closed subset of $G^2$, by the continuity of the map $(x,y) \mapsto x^{-1}y$.

Call $p$ the canonical projection from $G$ onto $G/H$. By definition of the quotient topology, a subset of $G/H$ is open if and only if its inverse image by $p$ is open in $G$.

Let $aH$ and $bH$ be two distinct classes in $G/H$. Then $(a,b) \notin \Gamma$. By definition of the product topology, one can find two open sets $U$ and $V$ in $G$ such that $(a,b) \in U \times V \subset G^2 \setminus \Gamma$.

The sets $p^{-1}(p(U))$ and $p^{-1}(p(V))$ are open sets in $G$ since
\[ p^{-1}(p(U)) = \bigcup_{h\in H} Uh \quad\text{and}\quad p^{-1}(p(V)) = \bigcup_{h\in H} Vh. \]
Thus, the sets $p(U)$ and $p(V)$ are open in $G/H$. These sets contain respectively $p(a) = aH$ and $p(b) = bH$, and one checks that they are disjoint (otherwise, $U \times V$ would intersect $\Gamma$).

Lemma 2.11. Let $G$ be a compact group and $f$ be a continuous map from $G$ to $\mathbb{R}$. For every $\varepsilon > 0$, one can find a neighbourhood $V$ of $1_G$ such that for every $g \in V$ and $y \in G$, $|f(gy) - f(y)| \le \varepsilon$.

Proof. Let $\varepsilon > 0$.

Given $x \in G$, the continuity at $1_G$ of the map $g \mapsto f(gx)$ from $G$ to $\mathbb{R}$ yields a neighbourhood $V_x$ of $1_G$ such that for every $g \in V_x$, $|f(gx) - f(x)| \le \varepsilon/2$.

The continuity at $(1_G, 1_G)$ of the map $(g,h) \mapsto gh$ yields two neighbourhoods $V'_x$, $V''_x$ of $1_G$ such that $gh \in V_x$ for every $(g,h) \in V'_x \times V''_x$. The particular case where $g = 1_G$ shows that $V''_x \subset V_x$.

For every $(g,h) \in V'_x \times V''_x$, we get
\[ |f(ghx) - f(hx)| \le |f(ghx) - f(x)| + |f(hx) - f(x)| \le \varepsilon, \]
so $|f(gy) - f(y)| \le \varepsilon$ for every $(g,y) \in V'_x \times V''_x x$.

But $V''_x x$ is a neighbourhood of $x$. As $x$ varies in $G$, these neighbourhoods cover $G$, so by compactness of $G$, one can cover $G$ with finitely many neighbourhoods $(V''_x x)_{x\in F}$. The set
\[ V = \bigcap_{x\in F} V'_x \]
is a neighbourhood of $1_G$ and $|f(gy) - f(y)| \le \varepsilon$ for every $(g,y) \in V \times G$.


Theorem and definition 2.12. Let $G$ be a Hausdorff compact group. There exists a unique left-translation-invariant probability measure on $G$, called the Haar measure (or uniform measure).

Proposition 2.13. Let $G$ be a Hausdorff compact group and $\mu$ its Haar measure. Then $\mu$ is also invariant by right-translations and by the map $\mathrm{inv} : x \mapsto x^{-1}$. Moreover, if $f$ is a continuous homomorphism from $G$ to a topological group $H$, then $f(\mu)$ is the Haar measure on the compact group $f(G)$.

Example 2.14. For example, the Haar measure on the torus T = R/Z is the image of

the uniform measure on I = [0, 1[ by the canonical projection from R to T.

Example 2.15. The Haar measure on $SO_3(\mathbb{R})$ can be described as follows. Choose the first column $u$ uniformly on $S^2$ (the unit sphere of $\mathbb{R}^3$); then choose the second column $v$ uniformly on the circle given by the intersection of the unit sphere and the plane $(\mathbb{R}u)^\perp$; the third column is necessarily $u \wedge v$.

The Haar measure on $SO_3(\mathbb{R})$ is also the law of (the matrix in the canonical basis of) the rotation $\mathrm{Rot}(a,\alpha)$ when one chooses independently the oriented axis $a$ uniformly on $S^2$ and the angle $\alpha$ according to the measure $(1/2)(1-\cos\alpha)\mathbf{1}_{[-\pi,\pi]}(\alpha)\,d\alpha/\pi$. Since $\mathrm{Rot}(-a,-\alpha) = \mathrm{Rot}(a,\alpha)$ and since the map $a \mapsto -a$ preserves the uniform measure on $S^2$, one may also choose the angle $\alpha$ according to the measure $(1-\cos\alpha)\mathbf{1}_{[0,\pi]}(\alpha)\,d\alpha/\pi$.

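Proof. Let $\mathbb{H}$ be the skew-field of all quaternions. Then $\mathbb{H}$ is a 4-dimensional vector space over $\mathbb{R}$. Call $(1, \mathbf{i}, \mathbf{j}, \mathbf{k})$ its canonical basis. We endow $\mathbb{H}$ with the canonical Euclidean norm, and choose on the subspace $\mathbb{H}_p = \mathrm{Vect}\{\mathbf{i}, \mathbf{j}, \mathbf{k}\}$ of all pure quaternions the orientation given by the basis $(\mathbf{i}, \mathbf{j}, \mathbf{k})$. With these conventions, the product of two quaternions has the following geometrical interpretation: for every $(\alpha,\beta) \in \mathbb{R}^2$ and $(u,v) \in \mathbb{H}_p^2$,
\[ (\alpha 1 + u)(\beta 1 + v) = (\alpha\beta - u\cdot v)1 + \alpha v + \beta u + u \wedge v. \]
The set $\mathbb{H}_u$ of all unit quaternions, namely
\[ \mathbb{H}_u = \{\tau 1 + \xi \mathbf{i} + \eta \mathbf{j} + \zeta \mathbf{k} : (\tau,\xi,\eta,\zeta) \in S^3\}, \]
is a compact group (for the topology induced by the norm topology on $\mathbb{H}$). One checks that the Haar measure on $\mathbb{H}_u$ is the uniform measure on $\mathbb{H}_u$, denoted by $\mathrm{Unif}(\mathbb{H}_u)$, defined as the image of the uniform measure on the unit ball of $\mathbb{H}$ by the projection $q \mapsto q/\|q\|$.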

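Call $\mathbb{H}_{u,p} = \mathbb{H}_u \cap \mathbb{H}_p$ the unit sphere of $\mathbb{H}_p$. One checks that $\mathrm{Unif}(\mathbb{H}_u)$ is the image of the measure $(2/\pi)\mathbf{1}_{[0,\pi]}(\theta)\sin^2\theta\,d\theta \otimes \mathrm{Unif}(\mathbb{H}_{u,p})$ by the map $(\theta,a) \mapsto (\cos\theta)1 + (\sin\theta)a$ from $[0,\pi] \times \mathbb{H}_{u,p}$ to $\mathbb{H}_u$.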

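Given a unit quaternion $u$, the linear map $\varphi(u) : q \mapsto uqu^{-1}$ from $\mathbb{H}$ to $\mathbb{H}$ is an isometry which coincides with the identity map on $\mathbb{R}1$ and induces a positive isometry on the space $\mathbb{H}_p$. More precisely, if $u = \cos(\theta)1 + \sin(\theta)a$ with $\theta \in [0,\pi]$ and $a \in \mathbb{H}_{u,p}$, then one checks that $\varphi(u)(a) = a$, whereas $\varphi(u)(b) = \cos(2\theta)b + \sin(2\theta)\,a \wedge b$ for every $b \in \mathbb{H}_p \cap (\mathbb{R}a)^\perp$. Thus, the isometry induced by $\varphi(u)$ on $\mathbb{H}_p$ is the rotation with axis $a$ and angle $2\theta$.

Denote by $\Phi(u)$ the matrix of the endomorphism induced by $\varphi(u)$ on $\mathbb{H}_p$ in the basis $(\mathbf{i}, \mathbf{j}, \mathbf{k})$. Then $\Phi$ is a two-to-one group homomorphism from $\mathbb{H}_u$ onto $SO_3(\mathbb{R})$, and $\mathrm{Ker}\,\Phi = \{1, -1\}$. Hence, $\Phi(\mathrm{Unif}(\mathbb{H}_u))$ is the Haar measure on $SO_3(\mathbb{R})$ by proposition 2.13.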
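The quaternion description above suggests a simple way to simulate the Haar measure on $SO_3(\mathbb{R})$. The following is a minimal sketch, not part of the original notes: it samples a unit quaternion by normalising a Gaussian vector of $\mathbb{R}^4$ (a standard substitute for the projection of the unit ball used in the proof), and the helper names `random_unit_quaternion`, `rotation_matrix` and `det3` are illustrative assumptions. It then checks that the resulting matrix is a rotation, that $q$ and $-q$ give the same rotation, and that the trace equals $1 + 2\cos(2\theta) = 4\tau^2 - 1$.

```python
import math
import random

def random_unit_quaternion(rng):
    """Sample (t, x, y, z) uniformly on S^3 by normalising a Gaussian vector of R^4."""
    while True:
        q = [rng.gauss(0.0, 1.0) for _ in range(4)]
        norm = math.sqrt(sum(c * c for c in q))
        if norm > 1e-12:  # reject a (practically impossible) near-zero draw
            return tuple(c / norm for c in q)

def rotation_matrix(q):
    """Matrix, in the basis (i, j, k), of b -> q b q^{-1} for a unit quaternion q."""
    t, x, y, z = q
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - t * z), 2 * (x * z + t * y)],
        [2 * (x * y + t * z), 1 - 2 * (x * x + z * z), 2 * (y * z - t * x)],
        [2 * (x * z - t * y), 2 * (y * z + t * x), 1 - 2 * (x * x + y * y)],
    ]

def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

rng = random.Random(0)            # fixed seed, for reproducibility only
q = random_unit_quaternion(rng)
R = rotation_matrix(q)
minus_q = tuple(-c for c in q)    # the other preimage of R under the two-to-one map
```

Drawing many such matrices gives an empirical approximation of the Haar measure; the sketch makes no claim about efficiency.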


Exercise. Let $G$ be a Hausdorff compact group and $\mu$ its Haar measure.

1. Check that the measure of any non-empty open subset of $G$ is positive.

2. Check that $\mu * \nu = \mu$ for every $\nu \in \Pi(G)$. By definition, $\mu * \nu$ is the image of $\mu \otimes \nu$ by the map $(x,y) \mapsto xy$ from $G \times G$ to $G$.

Proposition 2.16. The measure $\mu$ is regular and $C(G)$ is dense in $L^1(\mu)$.

The density of $C(G)$ in $L^1(\mu)$ follows from the regularity of $\mu$ and from Urysohn's lemma (given a closed subset $F$ and an open subset $O$ of $G$ such that $F \subset O$, there exists a continuous function $f$ from $G$ to $\mathbb{R}$ such that $\mathbf{1}_F \le f \le \mathbf{1}_O$).

2.2.3 Translations of Hausdorff compact groups

Let $G$ be a Hausdorff compact group and $\mu$ its Haar measure.

Proposition 2.17. Let $a \in G$. Denote by $a^{\mathbb{Z}} = \{a^k : k \in \mathbb{Z}\}$ the subgroup generated by $a$, and by $T_a$ and $R_a$ the maps $x \mapsto ax$ and $x \mapsto xa$ from $G$ to $G$.

If $a^{\mathbb{Z}}$ is dense in $G$, then $T_a$ and $R_a$ are uniquely ergodic.

Otherwise, the dynamical system $(G, \mathcal{B}(G), \mu, R_a)$ is not ergodic (there exists some Borel subset $A$ of $G$ such that $R_a^{-1}(A) = A$ and $0 < \mu(A) < 1$), therefore $R_a$ is not uniquely ergodic. The same results hold with $T_a$.

Proof. Let $H$ be the closure of $a^{\mathbb{Z}}$ in $G$. One checks that $H$ is a closed subgroup of $G$.

If $H = G$, we prove that $\mu$ is the only $T_a$-invariant probability measure on $G$. Let $\nu \in \Pi_{T_a}$. Then for every $f \in C(G)$, the map $g \mapsto \int_G f(gx)\,d\nu(x)$ from $G$ to $\mathbb{R}$ is continuous by lemma 2.11 and constant on $a^{\mathbb{Z}}$ (since $T_{a^k}(\nu) = \nu$ for every $k \in \mathbb{Z}$), so it is constant on $G$. Hence $T_g(\nu) = \nu$ for every $g \in G$, so $\nu = \mu$ by uniqueness of the Haar measure. A similar proof works for $R_a$.

If $H \ne G$, we can fix $b \in G \setminus H$, so $H$ and $bH$ are distinct elements of $G/H$. But $G/H$ is Hausdorff, so one can find two disjoint open sets $U$ and $V$ in $G/H$ containing respectively $H$ and $bH$. Call $p$ the canonical projection from $G$ onto $G/H$. The preimages $p^{-1}(U)$ and $p^{-1}(V)$ are disjoint non-empty open sets in $G$, so their Haar measures are in $]0,1[$. But the sets $p^{-1}(U)$ and $p^{-1}(V)$ belong to $\mathcal{I}_{R_a}$ since $p \circ R_a = p$, hence $R_a$ is not ergodic with regard to $\mu$. Therefore, $R_{a^{-1}} = R_a^{-1}$ is not ergodic with regard to $\mu$. Since the involutive map $\mathrm{inv} : x \mapsto x^{-1}$ preserves $\mu$, we deduce that $T_a = \mathrm{inv} \circ R_{a^{-1}} \circ \mathrm{inv}^{-1}$ is not ergodic with regard to $\mu$.

Last, given any $A \in \mathcal{B}(G)$ such that $0 < \mu(A) < 1$ and $T_a^{-1}(A) = A$, the measure $\mu(\cdot|A)$ is different from $\mu$ and belongs to $\Pi_{T_a}$.
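Combined with Oxtoby's theorem, the first case of the proposition says that the averages $M_n f$ converge uniformly for a translation with dense orbit. The following sketch illustrates this numerically for the rotation $x \mapsto x + \alpha \bmod 1$ of the circle with $\alpha = (\sqrt{5}-1)/2$ and $f(x) = \cos(2\pi x)$, whose integral is $0$. The choice of $\alpha$, of $f$, of the grid of starting points, and the function names are assumptions made for illustration; a finite grid only approximates the supremum over $x$.

```python
import math

def birkhoff_average(f, alpha, x, n):
    """M_n f(x) = (1/n) * sum_{k<n} f(T^k x) for the rotation T(x) = x + alpha mod 1."""
    return sum(f((x + k * alpha) % 1.0) for k in range(n)) / n

def sup_deviation(f, alpha, n, integral, grid=200):
    """Approximate sup_x |M_n f(x) - integral| over a regular grid of starting points."""
    return max(abs(birkhoff_average(f, alpha, j / grid, n) - integral)
               for j in range(grid))

alpha = (math.sqrt(5) - 1) / 2                 # an irrational rotation number


def f(x):
    return math.cos(2 * math.pi * x)           # continuous on the circle, integral 0


# Deviation from the space average for increasing n (expected to shrink like 1/n).
deviations = {n: sup_deviation(f, alpha, n, 0.0) for n in (10, 100, 1000)}
```

For this particular $f$ the geometric sum gives the explicit bound $|M_n f(x)| \le 1/(n\,|\sin \pi\alpha|)$, which is what the computed deviations should reflect.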

Definition 2.18. A topological group is said to be monothetic when some element of the group generates a dense subgroup.

Exercise. (Monothetic groups)

1. Check that a monothetic group is necessarily abelian.

2. Check that the additive groups $\mathbb{T}^d$, $\mathbb{T}^\infty$ and $\prod_{p\in\mathcal{P}} \mathbb{Z}/p\mathbb{Z}$ (endowed with the product topology) are monothetic.


2.3 Basic examples

2.3.1 Shifts

Definition 2.19. Let $\Lambda$ be a countable set. We define the (bilateral) shift $S$ on $\Lambda^{\mathbb{Z}}$ by
\[ S(x)(k) = x(k+1) \quad\text{for every } x \in \Lambda^{\mathbb{Z}} \text{ and } k \in \mathbb{Z}. \]
The map $S$ thus defined can be seen as a time-translation. It is bimeasurable if one endows $\Lambda^{\mathbb{Z}}$ with the product σ-field $\mathcal{P}(\Lambda)^{\otimes\mathbb{Z}}$, with inverse given by $S^{-1}(x)(k) = x(k-1)$.

Every probability measure on $(\Lambda^{\mathbb{Z}}, \mathcal{P}(\Lambda)^{\otimes\mathbb{Z}})$ is fully determined by its restriction to the elementary cylinders, namely to the sets $\{x \in \Lambda^{\mathbb{Z}} : x(t_1) = a_1, \ldots, x(t_d) = a_d\}$ where $d$ is a positive integer, $t_1 < \ldots < t_d$ are instants in $\mathbb{Z}$ and $a_1, \ldots, a_d$ are in $\Lambda$. One can restrict further by assuming that $t_1 < \ldots < t_d$ are consecutive integers $t, \ldots, t+d-1$.

Therefore, a probability measure $\mu$ on $(\Lambda^{\mathbb{Z}}, \mathcal{P}(\Lambda)^{\otimes\mathbb{Z}})$ is invariant if and only if the probabilities $\mu\{x \in \Lambda^{\mathbb{Z}} : x(t) = a_1, \ldots, x(t+d-1) = a_d\}$ do not depend on $t$. Many invariant measures can be considered on $(\Lambda^{\mathbb{Z}}, \mathcal{P}(\Lambda)^{\otimes\mathbb{Z}})$. Any stationary process $(X_n)_{n\in\mathbb{Z}}$ taking values in the set $\Lambda$ provides an invariant measure, namely the law of the sequence $(X_n)_{n\in\mathbb{Z}}$ seen as a random variable with values in $\Lambda^{\mathbb{Z}}$.

The simplest invariant measures, provided by the i.i.d. sequences, are the measures $p^{\otimes\mathbb{Z}}$, where $p$ is any probability measure on $\Lambda$. These measures are given by
\[ \mu\{x \in \Lambda^{\mathbb{Z}} : x(t_1) = a_1, \ldots, x(t_d) = a_d\} = p(a_1) \cdots p(a_d) \]
for every $d \ge 1$, $t_1 < \cdots < t_d$ in $\mathbb{Z}$ and $a_1, \ldots, a_d$ in $\Lambda$. The corresponding dynamical systems are called Bernoulli shifts. Bernoulli shifts are often denoted by $B(p)$.

A larger class comprises all measures provided by stationary Markov chains. Given an invariant probability $\pi$ associated to a transition matrix $(p(a,b))_{(a,b)\in\Lambda^2}$, the law of the corresponding stationary Markov chain is given by
\[ \mu\{x \in \Lambda^{\mathbb{Z}} : x(t) = a_1, \ldots, x(t+d-1) = a_d\} = \pi(a_1)\,p(a_1,a_2) \cdots p(a_{d-1},a_d) \]
for every $d \ge 1$, $t$ in $\mathbb{Z}$ and $a_1, \ldots, a_d$ in $\Lambda$. The corresponding dynamical systems are called Markov shifts.
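The cylinder formula for Markov shifts can be checked on a small example. The sketch below uses a hypothetical two-state chain on $\Lambda = \{0,1\}$ (the transition matrix `P`, the invariant law `pi` and the function name `cylinder` are illustrative choices, not taken from the notes) and exact rational arithmetic. Summing over the first letter of a word tests the invariance of $\pi$, and summing over the last letter tests that the rows of the transition matrix sum to $1$; together these are the consistency relations needed for the formula to define a shift-invariant measure.

```python
from fractions import Fraction as Fr

# Hypothetical two-state chain; each row of P sums to 1.
P = {(0, 0): Fr(1, 2), (0, 1): Fr(1, 2),
     (1, 0): Fr(1, 4), (1, 1): Fr(3, 4)}
pi = {0: Fr(1, 3), 1: Fr(2, 3)}   # invariant law: pi P = pi

def cylinder(word):
    """mu{x : x(t) = a_1, ..., x(t+d-1) = a_d} for word = (a_1, ..., a_d).

    The value does not involve t, which is exactly the shift invariance.
    """
    prob = pi[word[0]]
    for a, b in zip(word, word[1:]):
        prob *= P[(a, b)]
    return prob

word = (1, 0, 0, 1)
left_sum = sum(cylinder((a,) + word) for a in (0, 1))    # marginalise x(t-1)
right_sum = sum(cylinder(word + (b,)) for b in (0, 1))   # marginalise x(t+d)
```

Both sums should equal `cylinder(word)`; a Bernoulli shift $B(p)$ corresponds to the special case where every row of the transition matrix equals $p$.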

Remark 2.20. Unilateral shifts are defined in the same way, replacing $\mathbb{Z}$ with the set of the non-negative integers.

2.3.2 Shifts as factors of dynamical systems

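Let $(X,\mathcal{X},\mu,T)$ be a dynamical system, and $\alpha = \{A_\lambda : \lambda \in \Lambda\}$ be a countable partition of $X$ into measurable sets.

For every $x \in X$, denote by $\alpha(x) \in \Lambda$ the index of the block containing $x$, namely $\alpha(x) = \lambda$ if and only if $x \in A_\lambda$, and set $\Phi(x) = (\alpha(T^k(x)))_{k\in\mathbb{Z}} \in \Lambda^{\mathbb{Z}}$.

Call $S$ the shift operator on $\Lambda^{\mathbb{Z}}$ and $\mathcal{F}$ the completion of $\mathcal{P}(\Lambda)^{\otimes\mathbb{Z}}$ for the measure $\nu := \Phi(\mu)$. Then $\Phi \circ T = S \circ \Phi$, so the shift $S$ is a factor of $T$.

Definition 2.21. The partition $\alpha$ is a generator (with regard to $T$) when $\mathcal{X}$ is the complete σ-field generated by the union of the partitions $T^{-k}\alpha = \{T^{-k}(A_\lambda) : \lambda \in \Lambda\}$ over all $k \in \mathbb{Z}$.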


Proposition 2.22. If $(X,\mathcal{X},\mu)$ is a Lebesgue space, then $\alpha$ is a generator if the union of the partitions $T^{-k}\alpha$ separates the points of $X$. In this case, the dynamical systems $(X,\mathcal{X},\mu,T)$ and $(\Lambda^{\mathbb{Z}}, \mathcal{P}(\Lambda)^{\otimes\mathbb{Z}}, \nu, S)$ are equivalent.

2.3.3 Rotations of the circle

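Set $I = [0,1[$, $\mathbb{T} = \mathbb{R}/\mathbb{Z}$ and $\mathbb{U} = \{u \in \mathbb{C} : |u| = 1\}$. Fix $\alpha \in \mathbb{R}$. The maps $T_\alpha : x \mapsto x + \alpha - \lfloor x + \alpha \rfloor$ from $I$ to $I$, $S_\alpha : t \mapsto t + \alpha$ from $\mathbb{T}$ to $\mathbb{T}$, and $R_\alpha : u \mapsto e^{i2\pi\alpha}u$ from $\mathbb{U}$ to $\mathbb{U}$ are bijective and preserve respectively the uniform measures on $I$, $\mathbb{T}$ and $\mathbb{U}$. The corresponding dynamical systems are equivalent.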

2.3.4 The angle doubling map

Set $I = [0,1[$, $\mathbb{T} = \mathbb{R}/\mathbb{Z}$ and $\mathbb{U} = \{u \in \mathbb{C} : |u| = 1\}$. The maps $T : x \mapsto 2x - \lfloor 2x \rfloor$ from $I$ to $I$, $S : t \mapsto 2t$ from $\mathbb{T}$ to $\mathbb{T}$, and $R : u \mapsto u^2$ from $\mathbb{U}$ to $\mathbb{U}$ preserve respectively the uniform measures on $I$, $\mathbb{T}$ and $\mathbb{U}$. The corresponding dynamical systems are equivalent. These maps are onto (surjective) and two-to-one (each point in the image has exactly two preimages).

Exercise. Show that $T$ is equivalent to a unilateral Bernoulli shift. Hint: use dyadic expansions.
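To make the hint concrete, here is a small sketch (an illustration only, with the hypothetical helper names `doubling` and `dyadic_digits`) showing that the doubling map shifts the dyadic digits of a point. Exact rationals are used because iterating $x \mapsto 2x \bmod 1$ in floating point loses one bit per step and collapses to $0$.

```python
from fractions import Fraction
import math

def doubling(x):
    """T(x) = 2x - floor(2x) on [0, 1)."""
    y = 2 * x
    return y - math.floor(y)

def dyadic_digits(x, n):
    """First n dyadic digits of x in [0, 1): the k-th digit is floor(2 * T^k(x))."""
    digits = []
    for _ in range(n):
        digits.append(math.floor(2 * x))
        x = doubling(x)
    return digits

x = Fraction(3, 7)
digits_of_x = dyadic_digits(x, 6)             # periodic expansion of 3/7
digits_of_Tx = dyadic_digits(doubling(x), 5)  # same digits with the first one dropped
```

The coding $x \mapsto (\lfloor 2T^k(x) \rfloor)_{k \ge 0}$ is the candidate equivalence with the unilateral shift on $\{0,1\}^{\mathbb{N}}$; checking that it carries Lebesgue measure to the product measure is the content of the exercise and is not done here.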

2.3.5 The baker's transformation

The map $T$ from $I^2$ to $I^2$ defined by
\[ T(x_1,x_2) = \begin{cases} (2x_1,\ x_2/2) & \text{if } x_1 < 1/2, \\ (2x_1 - 1,\ (x_2+1)/2) & \text{if } x_1 \ge 1/2 \end{cases} \]
is called the baker's transformation.

Exercise. Check that $T$ preserves the uniform measure and is equivalent to a bilateral Bernoulli shift.

2.4 Construction of measure-preserving maps

2.4.1 Product map

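Proposition 2.23. Let $T_1$ and $T_2$ be measure-preserving maps on $(X_1,\mathcal{X}_1,\mu_1)$ and $(X_2,\mathcal{X}_2,\mu_2)$ respectively. We get a measure-preserving map on $(X_1 \times X_2, \mathcal{X}_1 \otimes \mathcal{X}_2, \mu_1 \otimes \mu_2)$ by setting $T_1 \times T_2((x_1,x_2)) = (T_1(x_1), T_2(x_2))$.

Proof. One checks that the measures $\mu_1 \otimes \mu_2$ and $T_1 \times T_2(\mu_1 \otimes \mu_2)$ coincide on all Cartesian products $A_1 \times A_2$ where $A_1 \in \mathcal{X}_1$ and $A_2 \in \mathcal{X}_2$. These Cartesian products form a class which is stable under intersection and generates $\mathcal{X}_1 \otimes \mathcal{X}_2$.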


2.4.2 Induced transformation

We fix a measure-preserving map $T$ on $(X,\mathcal{X},\mu)$, and $A \in \mathcal{X}$ such that $\mu(A) > 0$.

Theorem 2.24. (Poincaré recurrence theorem, 1890) For $\mu$-almost every $x \in A$, $T^n(x) \in A$ for infinitely many $n \ge 1$.

Proof. First, we check that for $\mu$-almost every $x \in A$, $T^n(x) \in A$ for some $n \ge 1$ (actually, this slightly weaker statement is Poincaré recurrence theorem), namely that
\[ N = A \setminus \bigcup_{n\ge 1} T^{-n}(A) = \{x \in A : \forall n \ge 1,\ T^n(x) \in A^c\} \]
is a null set.

For all integers $m \ge 0$ and $n \ge 1$,
\[ T^{-m}(N) \cap T^{-(m+n)}(N) = T^{-m}\bigl(N \cap T^{-n}(N)\bigr) = T^{-m}(\emptyset) = \emptyset, \]
since for every $x \in N$, $T^n(x) \in A^c$, so $T^n(x) \notin N$.

Therefore, the sets $(T^{-n}(N))_{n\ge 0}$ are pairwise disjoint, so
\[ \sum_{n\ge 0} \mu(T^{-n}(N)) = \mu\Bigl(\bigcup_{n\ge 0} T^{-n}(N)\Bigr) \le 1. \]
Since $\mu(T^{-n}(N)) = \mu(N)$ for every $n \ge 0$, we get $\mu(N) = 0$.

We now deduce the slight refinement stated in theorem 2.24. Let
\[ A' = A \cap \limsup_{n\to\infty} T^{-n}(A) = \{x \in A : T^n(x) \in A \text{ for infinitely many } n \ge 1\}. \]
Since
\[ A \setminus A' = A \cap \bigcup_{n\ge 0} T^{-n}(N), \]
we get $\mu(A \setminus A') = 0$.

Definition 2.25. We define the first return-time in $A$ starting at any $x \in X$ by
\[ r_A(x) = \inf\{n \ge 1 : T^n(x) \in A\}, \]
with the convention $\inf \emptyset = +\infty$.

It is convenient to define the map $r_A$ on the whole space $X$ although we are mainly interested in its restriction to $A$. Poincaré recurrence theorem says that $r_A$ is finite almost everywhere on $A$.

Definition 2.26. The map induced by $T$ on $A$ is defined almost everywhere on $A$ by
\[ T_A(x) = T^{r_A(x)}(x). \]
If one wishes to have an everywhere defined map from $A$ to $A$, one can set arbitrarily $T_A(x) = x$ if $r_A(x) = +\infty$. Since the subset $A'$ is stable by $T_A$, one can also work with the restriction of $T_A$ from $A'$ to $A'$, namely $T_{A'}$.


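Proposition 2.27. The map $r_A$ is a random variable on $(X,\mathcal{X},\mu)$. The map $T_A$ is measurable and preserves the probability $\mu_A = \mu(\cdot|A)$. Moreover,
\[ \int_A r_A\,d\mu_A = \frac{\mu[r_A < +\infty]}{\mu(A)}. \]

Proof. For every integer $n \ge 1$,
\[ [r_A = n] = T^{-1}(A)^c \cap \cdots \cap T^{-(n-1)}(A)^c \cap T^{-n}(A) \in \mathcal{X}. \]
The measurability of $r_A$ follows.

Let $B \in \mathcal{X}$ such that $B \subset A$. Then
\[ T_A^{-1}(B) = \bigcup_{n\ge 1} \bigl(A \cap [r_A = n] \cap T^{-n}(B)\bigr), \]
so
\[ \mu\bigl(T_A^{-1}(B)\bigr) = \sum_{n\ge 1} \mu\bigl(A \cap [r_A = n] \cap T^{-n}(B)\bigr). \]
But for every $n \ge 1$,
\begin{align*}
\mu\bigl(A \cap [r_A = n] \cap T^{-n}(B)\bigr) &= \mu\bigl([r_A = n] \cap T^{-n}(B)\bigr) - \mu\bigl(A^c \cap [r_A = n] \cap T^{-n}(B)\bigr) \\
&= \mu\bigl([r_A = n] \cap T^{-n}(B)\bigr) - \mu\bigl([r_A = n+1] \cap T^{-(n+1)}(B)\bigr),
\end{align*}
because
\[ T^{-1}\bigl(A^c \cap [r_A = n] \cap T^{-n}(B)\bigr) = [r_A = n+1] \cap T^{-(n+1)}(B). \]
Since $\mu\bigl([r_A = n] \cap T^{-n}(B)\bigr) \to 0$ as $n \to \infty$, we get
\begin{align*}
\mu\bigl(T_A^{-1}(B)\bigr) &= \sum_{n\ge 1} \Bigl(\mu\bigl([r_A = n] \cap T^{-n}(B)\bigr) - \mu\bigl([r_A = n+1] \cap T^{-(n+1)}(B)\bigr)\Bigr) \\
&= \mu\bigl([r_A = 1] \cap T^{-1}(B)\bigr) = \mu\bigl(T^{-1}(B)\bigr) = \mu(B),
\end{align*}
so $\mu_A\bigl(T_A^{-1}(B)\bigr) = \mu_A(B)$.

Last, taking $B = A$ above yields for every $n \ge 1$
\[ \mu\bigl(A \cap [r_A = n]\bigr) = \mu[r_A = n] - \mu[r_A = n+1], \]
so
\[ \mu\bigl(A \cap [r_A \ge n]\bigr) = \sum_{k=n}^{+\infty} \bigl(\mu[r_A = k] - \mu[r_A = k+1]\bigr) = \mu[r_A = n]. \]
Therefore,
\[ \int_A r_A\,d\mu = \sum_{n=1}^{+\infty} \mu\bigl(A \cap [r_A \ge n]\bigr) = \sum_{n=1}^{+\infty} \mu[r_A = n] = \mu[r_A < +\infty]. \]
Dividing by $\mu(A)$ yields the result.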

Remark 2.28. If $T$ is ergodic, namely if $\mu(B) \in \{0,1\}$ for every $B \in \mathcal{X}$ such that $\mu(B \,\triangle\, T^{-1}(B)) = 0$, then one checks that $\mu[r_A < +\infty] = 1$, so one retrieves Kac's formula
\[ \int_A r_A\,d\mu_A = \frac{1}{\mu(A)}. \]
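Kac's formula can be verified exactly on a finite system. The sketch below is an illustrative toy example, not taken from the notes: $X = \mathbb{Z}/12\mathbb{Z}$ with the uniform measure, $T(x) = x + 5 \bmod 12$ (a single cycle, hence ergodic), and an arbitrarily chosen set $A$. The names `return_time` and `induced_map` are hypothetical helpers for $r_A$ and $T_A$.

```python
from fractions import Fraction

N = 12
A = {0, 1, 7}                      # arbitrary subset, mu(A) = 3/12

def T(x):
    """Cyclic permutation of Z/12Z; gcd(5, 12) = 1, so there is a single orbit."""
    return (x + 5) % N

def return_time(x):
    """r_A(x): first n >= 1 with T^n(x) in A (always finite here)."""
    n, y = 1, T(x)
    while y not in A:
        n, y = n + 1, T(y)
    return n

def induced_map(x):
    """T_A(x) = T^{r_A(x)}(x)."""
    y = T(x)
    while y not in A:
        y = T(y)
    return y

# Integral of r_A with respect to mu_A = uniform measure on A.
mean_return_time = Fraction(sum(return_time(x) for x in A), len(A))
```

Here the mean return time equals $12/3 = 4 = 1/\mu(A)$, and `induced_map` permutes $A$, so it preserves $\mu_A$, in line with Proposition 2.27.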


2.4.3 Integral transformation

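We fix a measure-preserving map $T$ on $(X,\mathcal{X},\mu)$, and an integrable function $f$ on $(X,\mathcal{X},\mu)$ taking values in $\mathbb{Z}_+^*$.

Let $F = \{(x,k) \in X \times \mathbb{Z} : 1 \le k \le f(x)\}$. Then $F$ is the (disjoint) union of the sets $[f \ge k] \times \{k\}$ over all $k \ge 1$, so $F \in \mathcal{X} \otimes \mathcal{P}(\mathbb{Z}_+^*)$. Let $\mathcal{F}$ be the restriction of $\mathcal{X} \otimes \mathcal{P}(\mathbb{Z}_+^*)$ to $F$. One checks that $\mathcal{F}$ is exactly the set of all subsets of the form
\[ B = \bigcup_{k\ge 1} B_k \times \{k\}, \]
where $B_k \in \mathcal{X}$ and $B_k \subset [f \ge k]$ for every $k \ge 1$. One defines a probability on $\mathcal{F}$ by
\[ \nu(B) = \Bigl(\int_X f\,d\mu\Bigr)^{-1} \sum_{k\ge 1} \mu(B_k). \]

Definition 2.29. The integral transformation associated to $T$ and $f$ is the map $S$ from $(F,\mathcal{F},\nu)$ to itself defined by
\[ S((x,k)) = \begin{cases} (x,\ k+1) & \text{if } k < f(x), \\ (T(x),\ 1) & \text{if } k = f(x). \end{cases} \]

The next figure illustrates how the map $S$ operates on $F$. Each level $[f \ge k] \times \{k\}$ appears as a copy of the subset $[f \ge k]$.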

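[Figure: the tower $F$ above the base $X$. The levels $[f \ge 1]$, $[f \ge 2]$, ... are stacked vertically; above a point $x$, the map $S$ moves upwards one level at a time up to height $f(x)$, then jumps to the bottom level above $T(x)$.]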

Exercise. Check that $S$ preserves $\nu$ and that the map induced by $S$ on the subset $X \times \{1\}$ is canonically isomorphic to $T$.

2.5 Other examples

2.5.1 Gauss' transformation

Gauss' measure is defined on $I = [0,1[$ by
\[ \mu(dx) = \frac{1}{\ln 2} \times \frac{dx}{1+x}. \]
Gauss' transformation can be defined by $T(0) = 0$ and
\[ T(x) = x^{-1} - \lfloor x^{-1} \rfloor \quad\text{for } x \in\, ]0,1[. \]


By construction, $T(x) \in [0,1[$ and $x = 1/(\lfloor x^{-1} \rfloor + T(x))$. The map $T$ is involved when one expands positive numbers into continued fractions.

Exercise. 1. Given two positive integers $a > b > 0$, how do we get the image of the rational number $b/a$?

2. Show that the union of the sets $T^{-n}(\{0\})$ over all integers $n \ge 0$ is exactly $I \cap \mathbb{Q}$, and that $I \setminus \mathbb{Q}$ is stable by $T$.

3. Show that $T$ is a map from $I$ to $I$ and that $T$ preserves $\mu$. Hint: compute $\mu(T^{-1}[0,a])$ for every $a \in [0,1[$.
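The link between Gauss' transformation and continued fractions can be explored with exact rational arithmetic. The sketch below is only an illustration (the function names `gauss` and `continued_fraction` are assumptions): iterating $T$ on a rational of $]0,1[$ reaches $0$ in finitely many steps, and the integer parts $\lfloor 1/T^{n}(x) \rfloor$ met along the way are the partial quotients of $x$.

```python
from fractions import Fraction
import math

def gauss(x):
    """Gauss' transformation: T(0) = 0 and T(x) = 1/x - floor(1/x) for x in ]0, 1[."""
    if x == 0:
        return Fraction(0)
    inv = 1 / x
    return inv - math.floor(inv)

def continued_fraction(x):
    """Partial quotients [a_1, a_2, ...] of a rational x in ]0, 1[.

    a_n = floor(1 / T^{n-1}(x)); the loop stops when the orbit of x reaches 0.
    """
    quotients = []
    while x != 0:
        quotients.append(math.floor(1 / x))
        x = gauss(x)
    return quotients

x = Fraction(5, 13)
quotients = continued_fraction(x)   # 5/13 = 1/(2 + 1/(1 + 1/(1 + 1/2)))
```

On a rational $b/a$ each step replaces the pair $(a,b)$ by $(b, a \bmod b)$, so the loop is Euclid's algorithm in disguise.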

2.5.2 Morphisms of the torus

Fix an integer $d \ge 1$. The additive group $\mathbb{T}^d = (\mathbb{R}/\mathbb{Z})^d$ is canonically isomorphic to $\mathbb{R}^d/\mathbb{Z}^d$. Denote by $\Pi$ the canonical projection from $\mathbb{R}^d$ to $\mathbb{R}^d/\mathbb{Z}^d$. Let $\lambda$ and $\mu = \Pi(\lambda)$ be the uniform measures on $I^d = [0,1[^d$ and $\mathbb{T}^d$ respectively.

Let $A \in M_d(\mathbb{Z})$ such that $\det A \ne 0$, so $A$ is invertible in $M_d(\mathbb{Q})$. Since $A$ has integer entries, the morphism $x \mapsto Ax$ of the additive group $\mathbb{R}^d$ induces a morphism $T_A$ of $\mathbb{T}^d$.

Proposition 2.30. The map $T_A$ is onto, is $|\det A|$ to 1, and preserves $\mu$.

Proof. The statement is proved directly when $A$ is invertible in $M_d(\mathbb{Z})$ (in this case, $|\det A| = 1$) and when $A$ is diagonal. The general case follows, by the existence of a factorization $A = PDQ$, where $P$ and $Q$ are invertible in $M_d(\mathbb{Z})$ and $D$ is diagonal with integer entries.

Alternative proof: using the surjectivity of the linear map associated to $A$, one checks that the probability $T_A(\mu)$ is invariant by translations, so it is the Haar measure on $\mathbb{T}^d$.


Chapter 3

Ergodic theorems and applications

We fix a measure-preserving map $T$ on $(X,\mathcal{X},\mu)$. Given $f \in L^1(\mu)$, we study the convergence of the averages
\[ \frac{1}{n}\sum_{k=0}^{n-1} (f \circ T^k) \]
as $n \to +\infty$. Von Neumann ergodic theorem yields the convergence in $L^2(\mu)$ when $f \in L^2(\mu)$, whereas Birkhoff theorem yields the almost sure convergence. The limit is the conditional expectation of $f$ given the σ-field of $T$-invariant events.

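3.1 The σ-field of T-invariant events

Proposition 3.1. The sets
\[ \mathcal{I}_T = \{A \in \mathcal{X} : T^{-1}(A) = A\} = \{A \in \mathcal{X} : \mathbf{1}_A \circ T = \mathbf{1}_A\}, \]
\[ \mathcal{I}'_T = \{A \in \mathcal{X} : \mu(T^{-1}(A) \,\triangle\, A) = 0\} = \{A \in \mathcal{X} : \mathbf{1}_A \circ T = \mathbf{1}_A \ \mu\text{-a.s.}\} \]
are sub-σ-fields of $\mathcal{X}$.

Definition 3.2. The set $\mathcal{I}_T$ (respectively $\mathcal{I}'_T$) is the σ-field of $T$-invariant (respectively almost surely $T$-invariant) events.

Remark 3.3. Let $A \in \mathcal{X}$. We know that $\mu(T^{-1}(A)) = \mu(A)$ since $T$ preserves $\mu$. Therefore, if $T^{-1}(A) \subset A$ or $T^{-1}(A) \supset A$, then $\mu(T^{-1}(A) \,\triangle\, A) = 0$, so $A \in \mathcal{I}'_T$.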

Lemma 3.4. (construction of an invariant set) Let $A \in \mathcal{X}$. For every $n \ge 0$, the set
\[ B_n = \bigcup_{k\ge n} T^{-k}(A) \]
belongs to $\mathcal{I}'_T$. Moreover, these sets are almost surely equal to the set
\[ B_\infty = \limsup_{n\to+\infty} T^{-n}(A) = \bigcap_{n\ge 0} B_n, \]
which belongs to $\mathcal{I}_T$.


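Proof. For every $n \ge 0$, $B_{n+1} \subset B_n$ and
\[ B_{n+1} = \bigcup_{k\ge n} T^{-(k+1)}(A) = T^{-1}(B_n), \]
so $\mu(B_{n+1}) = \mu(B_n)$ and $T^{-1}(B_n) = B_{n+1} = B_n$ almost surely.

But $B_\infty$ is the intersection of the non-increasing sequence $(B_k)_{k\ge 0}$, hence for every $n \ge 0$,
\[ B_n \,\triangle\, B_\infty = B_n \setminus B_\infty = \bigcup_{k\ge n} (B_k \setminus B_{k+1}), \]
so $\mu(B_n \,\triangle\, B_\infty) = 0$. Moreover,
\[ T^{-1}(B_\infty) = \bigcap_{n\ge 0} T^{-1}(B_n) = \bigcap_{n\ge 0} B_{n+1} = \bigcap_{n\ge 1} B_n = B_\infty. \]
The proof is complete.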

Corollary 3.5. $\mathcal{I}'_T = \{A \in \mathcal{X} : \exists B \in \mathcal{I}_T,\ \mu(A \,\triangle\, B) = 0\} = \mathcal{X} \cap \mathcal{I}_T^\mu$, where $\mathcal{I}_T^\mu$ denotes the completion of $\mathcal{I}_T$ with regard to $\mu$.

Proposition 3.6. Let $f$ be a measurable map from $(X,\mathcal{X})$ to a measurable space $(Y,\mathcal{Y})$.

1. Assume that $\mathcal{Y}$ separates the points of $Y$. Then $f$ is $\mathcal{I}_T$-measurable if and only if $f \circ T = f$.

2. Assume that there exists some sequence $(B_n)_{n\ge 1}$ of elements of $\mathcal{Y}$ which separates the points of $Y$. Then $f$ is $\mathcal{I}'_T$-measurable if and only if $f \circ T = f$ almost surely.

3. Assume that there exists some sequence $(B_n)_{n\ge 1}$ of elements of $\mathcal{Y}$ such that the map from $(Y,\mathcal{Y})$ to $\{0,1\}^\infty$ (endowed with the product σ-field $\mathcal{Z}$) defined by $\Phi(y) = (\mathbf{1}_{B_n}(y))_{n\ge 1}$ is bimeasurable. Then $f$ is $\mathcal{I}'_T$-measurable if and only if $f$ is almost surely equal to some $\mathcal{I}_T$-measurable map from $X$ to $Y$.

3.2 Von Neumann ergodic theorem

We fix a complex Hilbert space $H$, and denote by $L(H)$ the Banach space of all continuous linear operators of $H$, endowed with the norm defined by
\[ \|A\| = \sup\{\|Ax\|/\|x\| : x \in H \setminus \{0\}\}. \]
Given $A \in L(H)$, we denote by $A^*$ its adjoint operator. We denote by $I$ the identity map of $H$.

Von Neumann ergodic theorem is a general result on Hilbert spaces.

Theorem 3.7. (Von Neumann 1932) Let $U \in L(H)$ such that $\|U\| \le 1$. Then

1. $\mathrm{Ker}(U - I) = \{h \in H : \langle h|Uh\rangle = \|h\|^2\} = \mathrm{Ker}(U^* - I)$.

2. $\mathrm{Im}(U - I)$ is a dense subspace of $\mathrm{Ker}(U - I)^\perp$.

3. Let $P$ be the orthogonal projection on $\mathrm{Ker}(U - I)$. Then for every $h \in H$,
\[ \frac{1}{n}\sum_{k=0}^{n-1} U^k h \to Ph \quad\text{as } n \to +\infty. \]


Proof. 1. Let $h \in H$.

If $Uh = h$, then $\langle h|Uh\rangle = \|h\|^2$. Conversely, if $\langle h|Uh\rangle = \|h\|^2$, then $\langle h|Uh\rangle = \|h\| \times \|Uh\| = \|h\|^2$ since
\[ |\langle h|Uh\rangle| \le \|h\| \times \|Uh\| \le \|U\| \times \|h\|^2 \le \|h\|^2. \]
Equality in Cauchy-Schwarz inequality shows that $h$ and $Uh$ are collinear, and the other equalities yield $Uh = h$.

As a result, $Uh = h$ if and only if $\langle h|Uh\rangle = \|h\|^2$. Since $\|U^*\| = \|U\| \le 1$, the same result holds with $U$ replaced by $U^*$. But $\langle h|U^*h\rangle = \langle Uh|h\rangle = \overline{\langle h|Uh\rangle}$ and $\|h\|^2$ is real, so
\[ h \in \mathrm{Ker}(U^* - I) \iff \langle h|U^*h\rangle = \|h\|^2 \iff \langle h|Uh\rangle = \|h\|^2 \iff h \in \mathrm{Ker}(U - I). \]
Point 1 follows.

2. Let $h \in \mathrm{Ker}(U - I)$. Then $(U^* - I)h = 0$, so for every $f \in H$,
\[ \langle h|(U - I)f\rangle = \langle (U^* - I)h|f\rangle = 0. \]
This shows that $\mathrm{Im}(U - I)$ is a subspace of $\mathrm{Ker}(U - I)^\perp$.

But $\mathrm{Ker}(U - I)^\perp$ is a closed subspace of the Hilbert space $H$. To show that $\mathrm{Im}(U - I)$ is dense in $\mathrm{Ker}(U - I)^\perp$, it suffices to show that the orthogonal of $\mathrm{Im}(U - I)$ in $\mathrm{Ker}(U - I)^\perp$, namely $\mathrm{Im}(U - I)^\perp \cap \mathrm{Ker}(U - I)^\perp$, equals $\{0\}$.

Let $h \in \mathrm{Im}(U - I)^\perp \cap \mathrm{Ker}(U - I)^\perp$. Then $h$ is orthogonal to $Uh - h$, so $\langle h|Uh\rangle = \langle h|h\rangle = \|h\|^2$. By point 1, we get $h \in \mathrm{Ker}(U - I)$, so $h = 0$. Point 2 follows.

3. The convergence holds when $h \in \mathrm{Ker}(U - I)$, since the left-hand side equals $h$ in this case.

The convergence also holds when $h \in \mathrm{Im}(U - I)$. Indeed, if $h = (U - I)f$ with $f \in H$, then
\[ \frac{1}{n}\sum_{k=0}^{n-1} U^k h = \frac{1}{n}(U^n f - f) \to 0 = Ph \quad\text{as } n \to +\infty, \]
since $\|U^n f - f\| \le \|U\|^n \|f\| + \|f\| \le 2\|f\|$ for every $n \ge 1$.

By density, one checks that the convergence holds when $h \in \mathrm{Ker}(U - I)^\perp$. Since $H = \mathrm{Ker}(U - I) + \mathrm{Ker}(U - I)^\perp$, we derive by linearity the convergence for every $h \in H$. Point 3 follows.

The proof is complete.
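A finite-dimensional example may help to visualise point 3. The sketch below is an illustration under stated assumptions: $H = \mathbb{R}^3$ (viewed inside $\mathbb{C}^3$), and $U$ is the rotation of angle $\theta = 1$ radian about the third axis, so that $\mathrm{Ker}(U - I)$ is the third axis and $P$ keeps only the third coordinate. The function names `apply_U` and `cesaro_mean` are hypothetical.

```python
import math

def apply_U(h, theta):
    """Rotation of angle theta in the (x, y)-plane, identity on the z-axis: an isometry of R^3."""
    x, y, z = h
    c, s = math.cos(theta), math.sin(theta)
    return (c * x - s * y, s * x + c * y, z)

def cesaro_mean(h, theta, n):
    """(1/n) * sum_{k<n} U^k h, computed by iterating U."""
    total = [0.0, 0.0, 0.0]
    v = h
    for _ in range(n):
        total = [t + c for t, c in zip(total, v)]
        v = apply_U(v, theta)
    return tuple(t / n for t in total)

h = (1.0, 2.0, 3.0)
mean = cesaro_mean(h, 1.0, 10000)
projection = (0.0, 0.0, h[2])      # P h: orthogonal projection on Ker(U - I)
```

The rotating component averages out at rate $O(1/n)$, as in the proof for $h \in \mathrm{Im}(U - I)$, while the fixed component is untouched.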

Remark 3.8. If $\|U\| < 1$ then $\|U^n\| \to 0$ as $n \to +\infty$ and $\mathrm{Ker}(U - I) = \{0\}$, so the conclusion of Von Neumann ergodic theorem trivially holds in this case. Therefore, the interesting case is the case where $\|U\| = 1$.

Let us see what the consequences are for the map $T$. Since $T$ preserves $\mu$, the linear map $f \mapsto f \circ T$ defines an isometry $U_T$ of the Hilbert space $L^2(X,\mathcal{X},\mu)$, so Theorem 3.7 applies to $U_T$. Since $\mathrm{Ker}(U_T - I) = L^2(X,\mathcal{I}'_T,\mu) = L^2(X,\mathcal{I}_T,\mu)$, the projection $P$ is the conditional expectation operator with regard to $\mathcal{I}_T$. This yields the next result.

Corollary 3.9. For every $f \in L^2(\mu)$,
\[ \frac{1}{n}\sum_{k=0}^{n-1} (f \circ T^k) \to E_\mu[f|\mathcal{I}_T] \quad\text{in } L^2(\mu). \]


3.3 Birkhoff ergodic theorem

The purpose of this section is to prove Birkhoff ergodic theorem. Various proofs are available. The simplest one relies on the maximal ergodic theorem.

Theorem 3.10. (Maximal ergodic theorem). Let $f \in L^1(\mu)$ be a real-valued function, and
\[ f^* = \sup_{n\ge 1} \frac{1}{n}\sum_{k=0}^{n-1} (f \circ T^k). \]
Then for every $\lambda \in \mathbb{R}$,
\[ \int_{[f^* > \lambda]} f\,d\mu \ge \lambda\,\mu[f^* > \lambda]. \]

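Proof. Since $(f - \lambda 1)^* = f^* - \lambda 1$, replacing $f$ by $f - \lambda 1$ reduces the proof to the case where $\lambda = 0$.

For every $n \ge 0$, set
\[ S_n = \sum_{k=0}^{n-1} (f \circ T^k) \quad\text{and}\quad M_n = \max(S_0, \ldots, S_n), \]
with the convention $S_0 = 0$, so the equality $S_{n+1} = S_n + f \circ T^n$ still holds for $n = 0$.

Let $n \ge 1$. Since $M_n = \max(S_1, \ldots, S_n)$ on the set $[\max(S_1, \ldots, S_n) > 0]$ whereas $M_n = 0$ on the set $[\max(S_1, \ldots, S_n) \le 0]$, one has everywhere
\[ M_n \ge 0 \quad\text{and}\quad M_n = M_n \mathbf{1}_{[M_n > 0]} = \max(S_1, \ldots, S_n)\,\mathbf{1}_{[M_n > 0]}. \]
But
\begin{align*}
f + (M_n \circ T) &= \max\bigl(f + (S_0 \circ T), \ldots, f + (S_n \circ T)\bigr) \\
&= \max(S_1, \ldots, S_{n+1}) \\
&\ge \max(S_1, \ldots, S_n).
\end{align*}
Thus
\[ f\,\mathbf{1}_{[M_n > 0]} \ge M_n - (M_n \circ T)\mathbf{1}_{[M_n > 0]} \ge M_n - (M_n \circ T). \]
Hence, since $M_n \in L^1(\mu)$, and since $T$ preserves $\mu$,
\[ \int_X f\,\mathbf{1}_{[M_n > 0]}\,d\mu \ge \int_X M_n\,d\mu - \int_X (M_n \circ T)\,d\mu = 0. \]
Noting that $|f\,\mathbf{1}_{[M_n > 0]}| \le |f|$ for every $n \ge 1$ and $\mathbf{1}_{[M_n > 0]} \to \mathbf{1}_{[f^* > 0]}$ as $n \to +\infty$ yields the result by Lebesgue dominated convergence theorem.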
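The maximal inequality can be tested exactly on a finite system. The sketch below is a toy illustration with assumed data: $X = \mathbb{Z}/8\mathbb{Z}$ with the uniform measure, $T(x) = x + 1$, and an arbitrary integer-valued $f$. On a cycle of length $N$ one has $S_{n+N} = S_n + S_N$, so every average $S_n/n$ with $n > N$ is a convex combination of averages with $n \le N$ and the supremum defining $f^*$ is reached for some $n \le N$. The names `maximal_function` and `maximal_inequality_holds` are hypothetical.

```python
from fractions import Fraction

N = 8
f = [3, -5, 2, 2, -7, 1, 4, -1]    # hypothetical integer-valued function on Z/8Z

def T(x):
    return (x + 1) % N

def maximal_function(x):
    """f*(x) = sup_{n>=1} S_n(x)/n; on a cycle of length N it suffices to take n <= N."""
    best, total, y = None, 0, x
    for n in range(1, N + 1):
        total += f[y]
        y = T(y)
        avg = Fraction(total, n)
        if best is None or avg > best:
            best = avg
    return best

def maximal_inequality_holds(lam):
    """Check sum_{f* > lam} f >= lam * #{f* > lam}, i.e. the inequality multiplied by N."""
    E = [x for x in range(N) if maximal_function(x) > lam]
    return sum(f[x] for x in E) >= lam * len(E)

levels = [Fraction(-3), Fraction(-1, 2), Fraction(0), Fraction(1), Fraction(5, 2)]
results = [maximal_inequality_holds(lam) for lam in levels]
```

For $\lambda = 0$ the set $[f^* > 0]$ is $\{0, 2, 3, 5, 6, 7\}$ and the sum of $f$ over it is $11 \ge 0$.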

Theorem 3.11. (Birkhoff 1931) For every $f \in L^1(\mu)$,
\[ \frac{1}{n}\sum_{k=0}^{n-1} (f \circ T^k) \to E_\mu[f|\mathcal{I}_T] \quad\text{a.s. and in } L^1(\mu). \]


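Proof. Let $f \in L^1(\mu)$.

By replacing $f$ with $f - E_\mu[f|\mathcal{I}_T]$, one may assume that $E_\mu[f|\mathcal{I}_T] = 0$.

By taking real and imaginary parts, one may assume that $f$ is a real-valued function, so we may define
\[ \ell = \limsup_{n\to+\infty} \frac{1}{n}\sum_{k=0}^{n-1} (f \circ T^k). \]
Passing to the upper limit in the equality
\[ \frac{1}{n}\sum_{k=0}^{n-1} (f \circ T^{k+1}) = \frac{n+1}{n} \times \frac{1}{n+1}\sum_{k=0}^{n} (f \circ T^k) - \frac{1}{n} f \]
shows that $\ell \circ T = \ell$, so $\ell$ is $\mathcal{I}_T$-measurable.

Let $\varepsilon > 0$. Let us apply the maximal ergodic theorem to $g := (f - \varepsilon 1)\mathbf{1}_{[\ell > \varepsilon]}$ and $\lambda := 0$. Since $[\ell > \varepsilon] \in \mathcal{I}_T$, one has for every $n \ge 1$,
\[ \frac{1}{n}\sum_{k=0}^{n-1} (g \circ T^k) = \Bigl(\frac{1}{n}\sum_{k=0}^{n-1} (f \circ T^k) - \varepsilon 1\Bigr)\mathbf{1}_{[\ell > \varepsilon]}. \]
Using the notations of theorem 3.10, we get $g^* = (f^* - \varepsilon 1)\mathbf{1}_{[\ell > \varepsilon]}$, so
\[ [g^* > 0] = [f^* > \varepsilon] \cap [\ell > \varepsilon] = [\ell > \varepsilon] \quad\text{since } f^* \ge \ell. \]
Hence, theorem 3.10 yields
\[ 0 \le \int_{[g^* > 0]} g\,d\mu = \int_{[\ell > \varepsilon]} (f - \varepsilon 1)\mathbf{1}_{[\ell > \varepsilon]}\,d\mu = \int_{[\ell > \varepsilon]} f\,d\mu - \varepsilon\mu[\ell > \varepsilon] = -\varepsilon\mu[\ell > \varepsilon], \]
since $E_\mu[f|\mathcal{I}_T] = 0$ and $[\ell > \varepsilon] \in \mathcal{I}_T$. Hence $\mu[\ell > \varepsilon] = 0$.

Since this statement holds for every $\varepsilon > 0$, we get that
\[ \limsup_{n\to+\infty} \frac{1}{n}\sum_{k=0}^{n-1} (f \circ T^k) \le 0 \quad\text{almost surely.} \]
Applying the result to $-f$ yields
\[ \liminf_{n\to+\infty} \frac{1}{n}\sum_{k=0}^{n-1} (f \circ T^k) \ge 0 \quad\text{almost surely.} \]
This yields the almost sure convergence.

To prove the convergence in $L^1(\mu)$, one only needs to check that the sequence $(f_n)_{n\ge 1}$ defined by
\[ f_n = \frac{1}{n}\sum_{k=0}^{n-1} (f \circ T^k) \]
is uniformly integrable. The family $(f \circ T^k)_{k\ge 0}$ is uniformly integrable since $f$ is integrable and $T$ preserves $\mu$. But each function $f_n$ lies in the convex hull of the family $(f \circ T^k)_{k\ge 0}$, so $(f_n)_{n\ge 1}$ is uniformly integrable.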


3.4 Ergodic maps

Definition 3.12. One says that $T$ is ergodic if and only if the following equivalent statements hold.

1. The σ-field $\mathcal{I}_T$ contains only events with probability 0 or 1.

2. The σ-field $\mathcal{I}'_T$ contains only events with probability 0 or 1.

3. Every $\mathcal{I}_T$-measurable real random variable on $(X,\mathcal{X},\mu)$ is almost surely constant.

4. Every $\mathcal{I}'_T$-measurable real random variable on $(X,\mathcal{X},\mu)$ is almost surely constant.

Intuitively, $T$ is ergodic if the $T$-orbit of almost every point of $X$ goes throughout $X$. Birkhoff ergodic theorem yields equivalent characterizations of ergodicity.

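Proposition 3.13. The following statements are equivalent.

1. $T$ is ergodic.

2. For every $f \in L^1(\mu)$,
\[ \frac{1}{n}\sum_{k=0}^{n-1} (f \circ T^k) \to \int_X f\,d\mu \quad\text{a.s. and in } L^1(\mu). \]

3. For every $A \in \mathcal{X}$,
\[ \frac{1}{n}\sum_{k=0}^{n-1} \mathbf{1}_{T^{-k}(A)} \to \mu(A) \quad\text{a.s. and in } L^1(\mu). \]

4. For every $A$ and $B$ in $\mathcal{X}$,
\[ \frac{1}{n}\sum_{k=0}^{n-1} \mu(T^{-k}(A) \cap B) \to \mu(A)\mu(B). \]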

Proof. We prove the implications 1 $\Longrightarrow$ 2 $\Longrightarrow$ 3 $\Longrightarrow$ 4 $\Longrightarrow$ 1.

The implication 1 $\Longrightarrow$ 2 follows from Birkhoff ergodic theorem.

Applying point 2 to $f = \mathbf{1}_A$ yields point 3.

Multiplying the convergence in point 3 by $\mathbf{1}_B$ and taking expectations yields point 4, by Lebesgue dominated convergence theorem.

Last, assume that point 4 holds. Let $A \in \mathcal{I}_T$. For every $k \ge 0$, $T^{-k}(A) = A$, so point 4 applied with $B = A$ yields $\mu(A) = \mu(A)^2$, that is $\mu(A) \in \{0,1\}$. Thus, $T$ is ergodic.

Exercise. Prove that $T$ is ergodic if and only if for every $A \in \mathcal{X}$,
\[ \mu(A) > 0 \implies \mu\Bigl(\bigcup_{n\ge 1} T^{-n}(A)\Bigr) = 1. \]

Exercise. Let $A \in \mathcal{X}$ such that $\mu(A) > 0$. Prove that if $T$ is ergodic, then the induced map $T_A$ is ergodic. Is the converse true? Hint: given $B \in \mathcal{I}_{T_A}$, prove that $B = A \cap C$, where
\[ C = \bigcup_{n\ge 0} T^{-n}(B). \]


3.5 Rigid, exact, and strongly mixing maps

Definition 3.14. (Rigidity, exactness and strong mixing)

One says that $T$ is rigid if there exists some sequence $(q_n)_{n\ge 1}$ of integers tending to infinity such that for every $A$ in $\mathcal{X}$,
\[ \mu(A \,\triangle\, T^{-q_n}(A)) \to 0 \quad\text{as } n \to +\infty. \]

One says that $T$ is exact if the σ-field
\[ \mathcal{A}_T := \bigcap_{n\ge 0} T^{-n}\mathcal{X} \]
is trivial, namely contains only events with probability 0 or 1.

One says that $T$ is strongly mixing if for every $A$ and $B$ in $\mathcal{X}$,
\[ \mu(A \cap T^{-n}(B)) \to \mu(A)\mu(B) \quad\text{as } n \to +\infty. \]

Remark 3.15. One checks that IT ⊂ AT , so exactness implies ergodicity.

Proposition 3.16. (Rigidity, exactness and strong mixing)

1. If T is rigid and X is not trivial, then T is not strongly mixing.

2. If T is exact, then T is strongly mixing.

3. If T is strongly mixing, then T is ergodic.

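Proof. For every $A$ in $\mathcal{X}$ and $n \in \mathbb{N}$,
\[ \mu(A \,\triangle\, T^{-n}(A)) = \mu(A) + \mu(T^{-n}(A)) - 2\mu(A \cap T^{-n}(A)) = 2\mu(A) - 2\mu(A \cap T^{-n}(A)). \]
If $T$ is rigid and $0 < \mu(A) < 1$, we have $\liminf \mu(A \,\triangle\, T^{-n}(A)) = 0$, so
\[ \limsup \mu(A \cap T^{-n}(A)) = \mu(A) > \mu(A)^2. \]
Therefore, $T$ cannot be strongly mixing.

Assume now that $T$ is exact. Let $A$ and $B$ in $\mathcal{X}$. For every $n \ge 0$, $\mu(T^{-n}(B)) = \mu(B)$ and $T^{-n}(B) \in T^{-n}\mathcal{X}$, so
\begin{align*}
|\mu(A \cap T^{-n}(B)) - \mu(A)\mu(B)| &= \Bigl| \int_{T^{-n}(B)} \mathbf{1}_A\,d\mu - \mu(A)\mu(T^{-n}(B)) \Bigr| \\
&= \Bigl| \int_{T^{-n}(B)} \bigl(E[\mathbf{1}_A|T^{-n}\mathcal{X}] - \mu(A)\bigr)\,d\mu \Bigr| \\
&\le \int_{T^{-n}(B)} \bigl|E[\mathbf{1}_A|T^{-n}\mathcal{X}] - \mu(A)\bigr|\,d\mu \\
&\le \int_X \bigl|E[\mathbf{1}_A|T^{-n}\mathcal{X}] - \mu(A)\bigr|\,d\mu.
\end{align*}
Since $T^{-1}\mathcal{X} \subset \mathcal{X}$, the sequence $(T^{-n}\mathcal{X})_{n\ge 0}$ is non-increasing. Thus, the backward-martingale convergence theorem applies. Since $\mathcal{A}_T$ is trivial, we get
\[ E[\mathbf{1}_A|T^{-n}\mathcal{X}] \to E[\mathbf{1}_A|\mathcal{A}_T] = \mu(A) \quad\text{almost surely and in } L^1(\mu) \text{ as } n \to +\infty. \]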


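Hence $\mu(A \cap T^{-n}(B)) \to \mu(A)\mu(B)$ as $n \to +\infty$, which shows that $T$ is strongly mixing.

Last, assume that $T$ is strongly mixing. Given $A \in \mathcal{I}_T$, one has $T^{-n}(A) = A$ for every $n \ge 0$, so the strong mixing property applied with $B = A$ yields $\mu(A) = \mu(A \cap A) = \mu(A)^2$, so $\mu(A) \in \{0,1\}$. Therefore, $T$ is ergodic.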

We applied the martingale convergence theorem in the proof of the second statement.An alternative argument is to derive the E[1A|T−nX ] → E[1A|AT ] in L2(µ) from thefollowing general fact.

Proposition 3.17. Let H be an Hilbert space, (Fn)n≥1 be a non-increasing sequence of

closed vector subspaces of H and F their intersection. Then the orthogonal projections

(PFn)n≥1 on the (Fn)n≥1 converge pointwise to the orthogonal projection PF on F .

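Proof. Let $x \in H$. Set $F_0 = H$. The vectors $(P_{F_n}(x) - P_{F_{n+1}}(x))_{n \ge 0}$ are pairwise orthogonal. By Pythagoras' theorem, for every $N \ge 0$,

\[ \sum_{n=0}^{N-1} \|P_{F_n}(x) - P_{F_{n+1}}(x)\|^2 = \|x - P_{F_N}(x)\|^2 = \|x\|^2 - \|P_{F_N}(x)\|^2 \le \|x\|^2. \]

Therefore, the series $\sum_n \|P_{F_n}(x) - P_{F_{n+1}}(x)\|^2$ converges. One checks that $(P_{F_n}(x))_{n \ge 0}$ is a Cauchy sequence. Since each $F_n$ is closed and contains $P_{F_m}(x)$ for every $m \ge n$, the limit $L(x)$ belongs to every $F_n$, hence to $F$. Since $F^\perp$ is closed and contains every $F_n^\perp$, we get $x - L(x) = \lim_{n \to +\infty} (x - P_{F_n}(x)) \in F^\perp$. Hence $L(x) = P_F(x)$. We are done.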

Exercise. Prove that the same conclusion holds when $(F_n)_{n \ge 1}$ is a non-decreasing sequence of closed vector subspaces of $H$ and $F$ is the closure of their union.

Proposition 3.18. Let $\alpha \in \mathbb{R}$, $I = [0, 1[$ and $\lambda$ be the Lebesgue measure on $I$. On $(I, \mathcal{B}(I), \lambda)$, the map $T_\alpha : x \mapsto (x + \alpha) - \lfloor x + \alpha \rfloor$ is ergodic if and only if $\alpha$ is irrational. Moreover, it is rigid.

Proof. For every $k \in \mathbb{Z}$, let $e_k$ be the map from $I$ to $\mathbb{C}$ defined by $e_k(x) = e^{i2\pi kx}$. Then for every $f \in L^2(\lambda)$,

\[ f = \sum_{k \in \mathbb{Z}} c_k(f) e_k \quad \text{in } L^2(\lambda), \]

where

\[ c_k(f) = \langle e_k | f \rangle = \int_0^1 e^{-i2\pi kx} f(x) \, dx. \]

One checks that for every $k \in \mathbb{Z}$, $c_k(f \circ T_\alpha) = e^{i2\pi k\alpha} c_k(f)$.

If $\alpha$ is irrational, then $e^{i2\pi k\alpha} \ne 1$ for every $k \in \mathbb{Z}^*$, so for every $f \in L^2(\lambda)$,

\[ f \circ T_\alpha = f \implies \forall k \in \mathbb{Z}^*, \ c_k(f) = 0 \implies f = c_0(f) \text{ almost surely.} \]

Therefore, $T_\alpha$ is ergodic. Moreover, the group $D = \mathbb{Z} + \alpha\mathbb{Z}$ is dense in $\mathbb{R}$, so one can approach $0$ by some sequence $(x_n)_{n \ge 1}$ of non-zero elements of $D \cap ]-1, 1[$. For every $n \ge 1$, $x_n = p_n + q_n\alpha$ with $p_n \in \mathbb{Z}$ and $q_n \in \mathbb{Z}^*$. By changing the sign of $x_n$ if necessary, one may assume that $q_n \ge 1$. The sequence $(q_n)_{n \ge 1}$ of positive integers tends to infinity (otherwise, one could extract a constant subsequence and get a contradiction). One checks that $f \circ T_\alpha^{q_n} \to f$ in $L^1(\lambda)$ for every $f \in C(I)$ such that $f(1-) = f(0)$. Using the density of these functions in $L^1(\lambda)$, one shows this convergence holds for every $f \in L^1(\lambda)$, and in particular for every indicator of a Borel subset of $I$. Hence, $T_\alpha$ is rigid.

If $\alpha = p/q$ with $p \in \mathbb{Z}$ and $q \in \mathbb{Z}_+^*$, then $T_\alpha^{qn} = \mathrm{Id}_I$ for every $n \ge 0$ and $e_q \circ T_\alpha = e_q$ although $e_q$ is not almost surely constant. Hence $T_\alpha$ is rigid but not ergodic.
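The dichotomy of proposition 3.18 can be observed numerically. The following minimal Python sketch is only an illustration, not part of the proof: the angle $\sqrt{2} - 1$, the test set $[0, 1/2[$, the starting points and the list of integers $q$ (denominators of the continued fraction expansion of $\sqrt{2} - 1$) are arbitrary choices made here, and floating-point arithmetic only approximates the true orbits.

```python
import math

def orbit_point(x0, alpha, k):
    """k-th point of the orbit of x0 under T_alpha, i.e. (x0 + k*alpha) mod 1."""
    return (x0 + k * alpha) % 1.0

def birkhoff_average(f, x0, alpha, n):
    """Average of f over the first n points of the orbit of x0."""
    return sum(f(orbit_point(x0, alpha, k)) for k in range(n)) / n

def circle_distance(t):
    """Distance from t to the nearest integer."""
    return abs(t - round(t))

def indicator_half(x):
    """Indicator function of the set [0, 1/2)."""
    return 1.0 if x < 0.5 else 0.0

alpha_irr = math.sqrt(2) - 1  # illustrative irrational angle

# Irrational angle: the average approaches lambda([0, 1/2)) = 1/2.
avg_irr = birkhoff_average(indicator_half, 0.1, alpha_irr, 30000)

# Rational angle 1/3: the average depends on the starting point.
avg_rat_a = birkhoff_average(indicator_half, 0.1, 1 / 3, 30000)
avg_rat_b = birkhoff_average(indicator_half, 0.2, 1 / 3, 30000)

# Rigidity: along these q, the rotation by q*alpha gets close to the identity.
rigidity_gaps = [circle_distance(q * alpha_irr) for q in (2, 5, 12, 29, 70, 169, 408)]
```

With the rational angle $1/3$, the two starting points give averages $2/3$ and $1/3$, which is consistent with the lack of ergodicity.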


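Exercise. Assume that $\alpha$ is irrational. Check that for every $f \in C(I)$ such that $f(1-) = f(0)$,

\[ \frac{1}{n} \sum_{k=0}^{n-1} f \circ T_\alpha^k \to \int_I f \, d\lambda \quad \text{uniformly as } n \to +\infty. \]

Hint: the family $(e_k)_{k \in \mathbb{Z}}$ is total in $C(I)$.

Exercise. Show that on $(I, \mathcal{B}(I), \lambda)$, the map $T : x \mapsto 2x - \lfloor 2x \rfloor$ is exact.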

3.6 Bernoulli shifts

Proposition 3.19. Unilateral Bernoulli shifts are exact, therefore strongly mixing. Bilateral Bernoulli shifts are strongly mixing but not exact.

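Proof. Let $(E, \mathcal{E}, \pi)$ be any probability space and $I = \mathbb{Z}_+$ or $\mathbb{Z}$. Call $(p_i)_{i \in I}$ the canonical projections from $E^I$ to $E$ and $S$ the shift operator on $(E^I, \mathcal{E}^{\otimes I})$. One checks that $p_i \circ S = p_{i+1}$ for every $i \in I$. By definition,

\[ \mathcal{E}^{\otimes I} = \bigvee_{i \in I} p_i^{-1}\mathcal{E}. \]

On the probability space $(E^I, \mathcal{E}^{\otimes I}, \pi^{\otimes I})$, the random variables $(p_i)_{i \in I}$ (valued in $E$) are independent (and have the same law $\pi$), namely the σ-fields $p_i^{-1}\mathcal{E}$ are independent. In what follows, we write $\mu = \pi^{\otimes I}$.

If $I = \mathbb{Z}_+$, a recursion shows that for every $n \ge 0$,

\[ S^{-n}(\mathcal{E}^{\otimes I}) = \bigvee_{i \ge n} p_i^{-1}\mathcal{E}, \]

so the asymptotic σ-field $\mathcal{A}_S$ is trivial by Kolmogorov zero-one law.

If $I = \mathbb{Z}$, then $S$ is bimeasurable, so $\mathcal{A}_S = \mathcal{E}^{\otimes I}$ is not trivial. Yet, we now prove that $S$ is strongly mixing. Let $A$ and $B$ be in $\mathcal{E}^{\otimes I}$. Given $\varepsilon > 0$, one can find an integer $N \ge 0$ and two sets $C$ and $D$ in

\[ \bigvee_{i \in [[-N,N]]} p_i^{-1}\mathcal{E} \]

such that $\mu(A \triangle C) \le \varepsilon$ and $\mu(B \triangle D) \le \varepsilon$. For every $n \ge 2N + 1$, the intervals $[[-N, N]]$ and $[[n-N, n+N]]$ are disjoint, and

\[ S^{-n}D \in \bigvee_{i \in [[-N,N]]} p_{n+i}^{-1}\mathcal{E} = \bigvee_{j \in [[n-N,n+N]]} p_j^{-1}\mathcal{E}, \]

so $\mu(C \cap S^{-n}D) = \mu(C)\mu(S^{-n}D) = \mu(C)\mu(D)$ by independence of the σ-fields $p_i^{-1}\mathcal{E}$.

Since $(A \cap S^{-n}(B)) \triangle (C \cap S^{-n}(D)) \subset (A \triangle C) \cup (S^{-n}(B) \triangle S^{-n}(D))$, one gets

\[
\begin{aligned}
|\mu(A \cap S^{-n}(B)) - \mu(A)\mu(B)| \le{} & |\mu(A \cap S^{-n}(B)) - \mu(C \cap S^{-n}(D))| \\
& + |\mu(C \cap S^{-n}(D)) - \mu(C)\mu(D)| \\
& + |\mu(C)\mu(D) - \mu(A)\mu(B)| \\
\le{} & 4\varepsilon.
\end{aligned}
\]

The proof is complete.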
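The key step of the proof, namely that events depending on disjoint blocks of coordinates are independent, can be checked exactly on cylinders. The sketch below is a hypothetical illustration for the unilateral shift on $\{0, 1\}^{\mathbb{Z}_+}$: the law `PI`, the cylinders `C` and `D`, and the encoding of a cylinder as a dictionary (position, symbol) are assumptions made here for the example.

```python
from fractions import Fraction

# Hypothetical one-letter law pi on E = {0, 1} (any positive weights summing to 1).
PI = {0: Fraction(1, 3), 1: Fraction(2, 3)}

def measure(cyl):
    """Product measure of the cylinder {x : x_i = s for all (i, s) in cyl}.

    The empty set is encoded as None and has measure 0.
    """
    if cyl is None:
        return Fraction(0)
    m = Fraction(1)
    for symbol in cyl.values():
        m *= PI[symbol]
    return m

def preimage(cyl, n):
    """S^{-n}(cyl): since p_i o S^n = p_{i+n}, every position is shifted by n."""
    return {i + n: s for i, s in cyl.items()}

def intersect(c, d):
    """Intersection of two cylinders, or None when the constraints conflict."""
    merged = dict(c)
    for i, s in d.items():
        if merged.setdefault(i, s) != s:
            return None
    return merged

C = {0: 1, 1: 0}   # x_0 = 1 and x_1 = 0
D = {0: 1, 2: 1}   # x_0 = 1 and x_2 = 1

# mu(C ∩ S^{-n} D) for n = 1, ..., 9
correlations = [measure(intersect(C, preimage(D, n))) for n in range(1, 10)]
```

In this example the constraints overlap for $n = 1$, so the intersection is empty, whereas for every $n \ge 2$ the coordinates involved are disjoint and the measure factorizes exactly.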


The strong law of large numbers can be viewed as a consequence of Birkhoff ergodic theorem and the ergodicity of Bernoulli shifts.

Theorem 3.20. (Strong law of large numbers) Let $(X_n)_{n \ge 1}$ be a sequence of i.i.d. integrable real random variables on some probability space $(\Omega, \mathcal{A}, P)$. Then

\[ \lim_{n \to +\infty} \frac{1}{n} \sum_{k=1}^{n} X_k = m \quad \text{almost surely, where } m = E[X_1]. \]

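Proof. Let $\pi$ be the common law of the random variables $(X_n)_{n \ge 1}$ and $\mu = \pi^{\otimes\infty}$. Call $(p_n)_{n \ge 1}$ the canonical projections and $S$ the Bernoulli shift on $(\mathbb{R}^\infty, \mathcal{B}(\mathbb{R})^{\otimes\infty}, \mu)$. Then the law of $(X_n)_{n \ge 1}$ under $P$ is the same as the law of $(p_n)_{n \ge 1}$ under $\pi^{\otimes\infty}$, so

\[ P\Big[ \lim_{n \to +\infty} \frac{1}{n} \sum_{k=1}^{n} X_k = m \Big] = \mu\Big[ \lim_{n \to +\infty} \frac{1}{n} \sum_{k=1}^{n} p_1 \circ S^{k-1} = \int_{\mathbb{R}^\infty} p_1 \, d\mu \Big] = 1, \]

by Birkhoff ergodic theorem and by the ergodicity of $S$.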
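As a purely numerical illustration of theorem 3.20, one can simulate a long i.i.d. sample and look at the empirical means, which are exactly the Birkhoff averages of $p_1$ along the shift. The sketch below assumes an exponential law with mean `m = 2.0`, a fixed seed and a sample size chosen arbitrarily; a simulation of this kind suggests the convergence but proves nothing.

```python
import random

def empirical_mean(values):
    """Average of a finite sample, i.e. a Birkhoff average of p_1 along the shift."""
    return sum(values) / len(values)

rng = random.Random(0)   # fixed seed so that the run is reproducible
m = 2.0                  # illustrative mean of the common law

# i.i.d. sample with exponential law of mean m (rate 1/m)
path = [rng.expovariate(1 / m) for _ in range(200000)]

# empirical means along a few increasing sample sizes
running_means = [empirical_mean(path[:n]) for n in (100, 10000, 200000)]
```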

3.7 Ergodicity and extremality

Definition 3.21. Let $C$ be a convex subset of some real vector space $E$. Let $c \in C$. One says that $c$ is extreme in $C$ if the following equivalent properties hold.

1. $C \setminus \{c\}$ is convex.

2. For every $(a, b) \in C^2$ and $t \in ]0, 1[$, $c = (1-t)a + tb \implies a = b = c$.

3. For every $(a, b) \in C^2$, $c = (a + b)/2 \implies a = b = c$.

The notion of extreme points is very important in the theory of convex sets. If $C$ is a convex compact subset of a locally convex Hausdorff vector space and $\mathrm{Extr}(C)$ the subset of its extreme points, then the Krein-Milman theorem states that $C$ is the closure of the convex hull of $\mathrm{Extr}(C)$. The Choquet-Bishop-de Leeuw theorem says that any element of $C$ can be written as a generalized convex combination of extreme points, given by some probability measure on $\mathrm{Extr}(C)$.

Recall that the set $\Pi_T$ of all $T$-invariant probability measures on $(X, \mathcal{X})$ is a convex subset of the vector space $\mathcal{M}(X)$ of all signed measures on $(X, \mathcal{X})$.

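Proposition 3.22. Let $\nu \in \Pi_T$ such that $\nu \ll \mu$. If $(X, \mathcal{X}, \mu, T)$ is ergodic, then $\nu = \mu$.

Proof. Let $A \in \mathcal{X}$. Since $(X, \mathcal{X}, \mu, T)$ is ergodic, Birkhoff ergodic theorem applied to $(X, \mathcal{X}, \mu, T)$ yields

\[ \lim_{n \to +\infty} \frac{1}{n} \sum_{k=0}^{n-1} \mathbf{1}_{T^{-k}(A)} = \mu(A) \quad \mu\text{-almost surely.} \]

Since $\nu \ll \mu$, this convergence also holds $\nu$-almost surely. Since the Birkhoff averages on the left-hand side remain in $[0, 1]$, one may integrate with respect to $\nu$ and apply Lebesgue dominated convergence theorem. By $T$-invariance of $\nu$, each average has $\nu$-integral $\nu(A)$, so we get $\nu(A) = \mu(A)$.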

Proposition 3.23. $T$ is ergodic on $(X, \mathcal{X}, \mu)$ if and only if $\mu$ is extreme in $\Pi_T(X)$.


Proof. First, assume that $(X, \mathcal{X}, \mu, T)$ is not ergodic. One can find $A \in \mathcal{I}_T$ such that $0 < \mu(A) < 1$. One checks that the probability measures $\mu(\cdot|A)$ and $\mu(\cdot|A^c)$ are distinct elements of $\Pi_T(X)$, so $\mu = \mu(A)\mu(\cdot|A) + \mu(A^c)\mu(\cdot|A^c)$ is not extreme in $\Pi_T(X)$.

Now, assume that $(X, \mathcal{X}, \mu, T)$ is ergodic. Assume that $\mu = (\mu_1 + \mu_2)/2$ with $\mu_1$ and $\mu_2$ in $\Pi_T$. Then $\mu_1 \le 2\mu$ and $\mu_2 \le 2\mu$, so $\mu_1$ and $\mu_2$ are absolutely continuous with regard to $\mu$. By proposition 3.22, $\mu_1 = \mu_2 = \mu$, which shows that $\mu$ is extreme in $\Pi_T$.

Corollary 3.24. If $\Pi_T(X) = \{\mu\}$, then $T$ is ergodic on $(X, \mathcal{X}, \mu)$.

A measurable map from $(X, \mathcal{X})$ to itself which admits exactly one invariant probability measure is said to be uniquely ergodic.

Proposition 3.25. Let $\mu_1$ and $\mu_2$ be extreme points in $\Pi_T(X)$. Then the measures $\mu_1$ and $\mu_2$ are equal or mutually singular.

Proof. If $\mu_1 \ne \mu_2$, one can find $A \in \mathcal{X}$ such that $\mu_1(A) \ne \mu_2(A)$. For each $i \in \{1, 2\}$, Birkhoff ergodic theorem applied to the ergodic system $(X, \mathcal{X}, \mu_i, T)$ shows that

\[ X_i := \Big\{ x \in X : \lim_{n \to +\infty} \frac{1}{n} \sum_{k=0}^{n-1} \mathbf{1}_{T^{-k}(A)}(x) = \mu_i(A) \Big\} \]

is a full-measure subset for $\mu_i$. By construction, $X_1$ and $X_2$ are disjoint. Hence, $\mu_1$ and $\mu_2$ are mutually singular.

3.8 Decomposition into ergodic components

Throughout this section, $(X, \mathcal{X}, \mu)$ is a standard probability space, namely $X$ is a separable complete metric space, $\mathcal{X}$ is its Borel σ-field $\mathcal{B}(X)$, and $\mu$ is a probability measure on $(X, \mathcal{X})$.

The metric space $X$ admits a countable basis of open sets (for example the set of all balls whose center lies in some countable dense subset, and whose radius is the inverse of a positive integer). Let $\mathcal{A}$ be the algebra generated by this countable basis. Then $\mathcal{A}$ is countable and $\mathcal{X}$ is the σ-field generated by $\mathcal{A}$ (without any completion).

Assuming that $(X, \mathcal{X}, \mu)$ is a standard probability space ensures the existence of conditional probability measures with regard to any sub-σ-field of $\mathcal{X}$.

Theorem and definition 3.26. (Conditional probability measures)

Let $\mathcal{F}$ be a sub-σ-field of $\mathcal{X}$. There exists a family $(\mu_x)_{x \in X}$ of probability measures on $(X, \mathcal{X})$ such that for every $A \in \mathcal{X}$, the random variable $x \mapsto \mu_x(A)$ is a representative of the conditional expectation $E[\mathbf{1}_A | \mathcal{F}]$, namely $x \mapsto \mu_x(A)$ is $\mathcal{F}$-measurable (and integrable), and for every $B \in \mathcal{F}$,

\[ \int_B \mu_x(A) \, d\mu(x) = \int_B \mathbf{1}_A \, d\mu = \mu(A \cap B). \]

Moreover, for every $f \in L^1(X, \mathcal{X}, \mu)$, the map $x \mapsto \int_X f \, d\mu_x$ is defined $\mu$-almost everywhere on $X$ and is a representative of $E[f | \mathcal{F}]$.

Such a family is essentially unique and is called a family of conditional probability measures of $\mu$ given $\mathcal{F}$.


Proof. (Partial proof) The existence is the most difficult part: given a sequence $(A_n)_{n \ge 1}$ of pairwise disjoint Borel subsets of $X$, the equality

\[ E\big[\mathbf{1}_{\bigcup_{n \ge 1} A_n} \big| \mathcal{F}\big] = E\Big[\sum_{n \ge 1} \mathbf{1}_{A_n} \Big| \mathcal{F}\Big] = \sum_{n \ge 1} E[\mathbf{1}_{A_n} | \mathcal{F}] \]

holds in $L^1(\mu)$. If we choose a representative of each conditional expectation, we get an almost sure equality, and the almost sure event on which equality holds depends on the sequence $(A_n)_{n \ge 1}$. Therefore, one has to choose coherently the representatives of the conditional expectations $E[\mathbf{1}_A | \mathcal{F}]$, to make all such equalities true everywhere on $X$.

The essential uniqueness follows from the existence of the generating countable algebra $\mathcal{A}$: if two families $(\mu_x)_{x \in X}$ and $(\nu_x)_{x \in X}$ satisfy the conditions required, then for every $A \in \mathcal{A}$, $\mu_x(A) = \nu_x(A)$ for $\mu$-almost every $x \in X$; hence for $\mu$-almost every $x \in X$, $\mu_x$ and $\nu_x$ coincide on $\mathcal{A}$ (since $\mathcal{A}$ is countable), so $\mu_x = \nu_x$.

Since the last statement holds for every indicator function, it holds for every simple function by linearity, thus for every measurable non-negative function by the monotone convergence theorem for the conditional expectation, hence for every integrable function by difference.

Example 3.27. Let $X = \mathbb{C}$ (identified with $\mathbb{R}^2$), $\mathcal{X} = \mathcal{B}(\mathbb{C})$, and $\mu$ the probability measure with density $z \mapsto (2\pi)^{-1} e^{-|z|^2/2}$. Let $\mathcal{F}$ be the σ-field of all Borel subsets of $\mathbb{C}$ which are invariant by the natural action of the group $SO_2(\mathbb{R})$.

For every $z \in \mathbb{C}$, denote by $\mu_z$ the uniform distribution on the circle $|z|\mathbb{U}$, namely the image of the uniform distribution on $[0, 2\pi[$ by the map $\theta \mapsto |z|e^{i\theta}$ from $[0, 2\pi[$ to $\mathbb{C}$.

Then $(\mu_z)_{z \in \mathbb{C}}$ is a family of conditional probability measures of $\mu$ given $\mathcal{F}$.

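Proof. First, we note that $\mathcal{F} = R^{-1}(\mathcal{B}(\mathbb{R}_+))$, where $R$ is the map $z \mapsto |z|$ from $\mathbb{C}$ to $\mathbb{R}_+$. Indeed, $R$ is Borel and invariant by vectorial rotations, hence $R^{-1}(\mathcal{B}(\mathbb{R}_+)) \subset \mathcal{F}$. Conversely, if $A \in \mathcal{F}$, then $B := A \cap \mathbb{R}_+$ belongs to $\mathcal{B}(\mathbb{R}_+)$, and one checks that $A = R^{-1}(B)$.

One checks that $R(\mu) = \nu$, where $d\nu(r) = r e^{-r^2/2} \, dr$.

Let $A \in \mathcal{B}(\mathbb{C})$. The map $r \mapsto \int_0^{2\pi} \mathbf{1}_A(re^{i\theta}) \, d\theta$ from $\mathbb{R}_+$ to $\mathbb{R}$ is Borel (by Fubini's theorem), so $z \mapsto \mu_z(A)$ is $\mathcal{F}$-measurable (by composition with $R$). Moreover, for every $B \in \mathcal{B}(\mathbb{R}_+)$, integration in polar coordinates yields

\[
\begin{aligned}
\int_{R^{-1}(B)} \mathbf{1}_A(z) \, d\mu(z)
&= \int_{\mathbb{C}} \mathbf{1}_B(|z|) \mathbf{1}_A(z) \, (2\pi)^{-1} e^{-|z|^2/2} \, dz \\
&= \int_{\mathbb{R}_+} \mathbf{1}_B(r) \Big( \int_0^{2\pi} \mathbf{1}_A(re^{i\theta}) \, (2\pi)^{-1} d\theta \Big) d\nu(r) \\
&= \int_{\mathbb{C}} \mathbf{1}_B(|z|) \Big( \int_0^{2\pi} \mathbf{1}_A(|z|e^{i\theta}) \, (2\pi)^{-1} d\theta \Big) d\mu(z) \\
&= \int_{R^{-1}(B)} \mu_z(A) \, d\mu(z).
\end{aligned}
\]

Hence the map $z \mapsto \mu_z(A)$ is a representative of $E[\mathbf{1}_A | \mathcal{F}]$.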

Theorem 3.28. (Decomposition into ergodic components).

Let $T$ be a measure-preserving map on $(X, \mathcal{X}, \mu)$ and $(\mu_x)_{x \in X}$ a family of conditional probability measures of $\mu$ given $\mathcal{I}_T$. Then


1. For $\mu$-almost every $x \in X$, the probability $\mu_x$ is $T$-invariant.

2. For $\mu$-almost every $x \in X$, the system $(X, \mathcal{X}, \mu_x, T)$ is ergodic.

3. The measure $\mu$ can be written as a mixture of the measures $(\mu_x)_{x \in X}$, namely

\[ \mu = \int_X \mu_x \, d\mu(x). \]

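Proof. For each $A \in \mathcal{X}$, set

\[ \ell_A = \limsup_{n \to +\infty} \frac{1}{n} \sum_{k=0}^{n-1} \mathbf{1}_{T^{-k}(A)}. \]

By Birkhoff ergodic theorem, $\ell_A(x) = E[\mathbf{1}_A | \mathcal{I}_T](x) = \mu_x(A)$ for $\mu$-almost every $x \in X$.

Let $\mathcal{A}$ be a countable algebra generating $\mathcal{X}$. Then the sets

\[ X_1 := \bigcap_{A \in \mathcal{A}} \{ x \in X : \ell_A(x) = \mu_x(A) \}, \qquad X_2 := \bigcap_{A \in \mathcal{A}} \{ x \in X : \ell_{T^{-1}(A)}(x) = \mu_x(T^{-1}(A)) \}, \]

\[ X_3 := \bigcap_{A \in \mathcal{A}} \bigcap_{r \in \mathbb{Q}} \{ x \in X : \ell_{[\ell_A \le r]}(x) = \mu_x[\ell_A \le r] \} \]

are almost sure events in $(X, \mathcal{X}, \mu)$.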

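1. Fix $x \in X_1 \cap X_2$. Then for every $A \in \mathcal{A}$,

\[ \mu_x(T^{-1}(A)) = \limsup_{n \to +\infty} \frac{1}{n} \sum_{k=0}^{n-1} \mathbf{1}_{T^{-(k+1)}(A)}(x) = \limsup_{n \to +\infty} \frac{1}{n} \sum_{k=0}^{n-1} \mathbf{1}_{T^{-k}(A)}(x) = \mu_x(A). \]

The probabilities $T(\mu_x)$ and $\mu_x$ are equal since they coincide on $\mathcal{A}$, so $\mu_x$ is $T$-invariant.

2. Fix $x \in X_1 \cap X_2 \cap X_3$. Let $A \in \mathcal{A}$. For every $r \in \mathbb{Q}$, the event $[\ell_A \le r]$ belongs to $\mathcal{I}_T$, so Birkhoff ergodic theorem applied to $\mathbf{1}_{[\ell_A \le r]}$ and to the invariant measure $\mu_x$ yields

\[ \mathbf{1}_{[\ell_A \le r]} = \ell_{[\ell_A \le r]} = \mu_x[\ell_A \le r] \quad \mu_x\text{-almost surely.} \]

Hence $\mu_x[\ell_A \le r] \in \{0, 1\}$ for every $r \in \mathbb{Q}$, so $\ell_A$ is $\mu_x$-almost surely constant. But Birkhoff ergodic theorem applied to $\mathbf{1}_A$ and to the invariant measure $\mu_x$ yields $\ell_A = E_{\mu_x}[\mathbf{1}_A | \mathcal{I}_T]$ $\mu_x$-almost surely. Hence

\[ E_{\mu_x}[\mathbf{1}_A | \mathcal{I}_T] = E_{\mu_x}\big[ E_{\mu_x}[\mathbf{1}_A | \mathcal{I}_T] \big] = E_{\mu_x}[\mathbf{1}_A] = \mu_x(A) \quad \mu_x\text{-almost surely.} \]

This shows that the algebra $\mathcal{A}$ is contained in

\[ \mathcal{M}_x := \{ A \in \mathcal{X} : E_{\mu_x}[\mathbf{1}_A | \mathcal{I}_T] = \mu_x(A) \ \mu_x\text{-almost surely} \}. \]

But the monotone convergence theorem for the conditional expectation shows that $\mathcal{M}_x$ is a monotone class. Therefore $\mathcal{M}_x$ contains $\sigma(\mathcal{A}) = \mathcal{X}$, by the monotone class theorem. In particular, for every $B \in \mathcal{I}_T$, $\mathbf{1}_B = E_{\mu_x}[\mathbf{1}_B | \mathcal{I}_T] = \mu_x(B)$ $\mu_x$-almost surely, so $\mu_x(B) \in \{0, 1\}$. The ergodicity of $(X, \mathcal{X}, \mu_x, T)$ follows.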


3. Let $A \in \mathcal{X}$. Then

\[ \int_X \mu_x(A) \, d\mu(x) = \int_X E_\mu[\mathbf{1}_A | \mathcal{I}_T] \, d\mu = \int_X \mathbf{1}_A \, d\mu = \mu(A). \]

The proof is complete.
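On a finite set, theorem 3.28 reduces to elementary combinatorics, which makes it easy to check by hand or by machine. The sketch below is a toy illustration under assumptions chosen here: $X = \{0, \dots, 5\}$ with the uniform measure, and a hypothetical permutation `T` with cycles $(0\,1\,2)$, $(3\,4)$ and $(5)$. In this setting the invariant sets are the unions of cycles, and $\mu_x$ is $\mu$ conditioned on the cycle of $x$.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical permutation of X = {0,...,5} with cycles (0 1 2), (3 4), (5).
T = {0: 1, 1: 2, 2: 0, 3: 4, 4: 3, 5: 5}
X = sorted(T)
mu = {x: Fraction(1, len(X)) for x in X}   # uniform measure, T-invariant

def orbit(x):
    """Cycle of x under the permutation T."""
    points = [x]
    y = T[x]
    while y != x:
        points.append(y)
        y = T[y]
    return frozenset(points)

def conditional_measure(x):
    """mu_x: the measure mu conditioned on the cycle of x."""
    orb = orbit(x)
    mass = sum(mu[y] for y in orb)
    return {y: (mu[y] / mass if y in orb else Fraction(0)) for y in X}

def measure(m, subset):
    """Measure of a subset of X under the weights m."""
    return sum((m[y] for y in subset), Fraction(0))

def preimage(subset):
    """T^{-1}(subset)."""
    return frozenset(x for x in X if T[x] in subset)

all_subsets = [frozenset(c) for r in range(len(X) + 1) for c in combinations(X, r)]
invariant_sets = [A for A in all_subsets if preimage(A) == A]

# Mixture of the conditional measures, to be compared with mu.
mixture = {y: sum(mu[x] * conditional_measure(x)[y] for x in X) for y in X}
```

The three statements of the theorem then become finite checks: each `conditional_measure(x)` is invariant, gives mass 0 or 1 to every invariant set, and `mixture` coincides with `mu`.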

Remark 3.29. Given $A \in \mathcal{I}_T$, one has $\delta_x(A) = \mathbf{1}_A(x) = E[\mathbf{1}_A | \mathcal{I}_T](x) = \mu_x(A)$ for $\mu$-almost every $x \in X$. But simply interchanging the order of 'for every $A \in \mathcal{I}_T$' and 'for $\mu$-almost every $x \in X$' is not possible since $\mathcal{I}_T$ does not necessarily admit a countable generating π-system.

Let $P$ be the image of the measure $\mu$ by the map $x \mapsto \mu_x$ from $X$ to $\Pi(X)$ (endowed with the Borel σ-field associated to the topology of narrow convergence). Then $P$ is carried by the subset of all extreme points in $\Pi_T$ and

\[ \mu = \int_{\Pi(X)} \pi \, dP(\pi). \]

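Example 3.30. Keep the notations of example 3.27. Let $\alpha \in \mathbb{R} \setminus \mathbb{Q}$ and $R_\alpha$ be the map $z \mapsto e^{i2\pi\alpha}z$ from $\mathbb{C}$ to $\mathbb{C}$. Then the family $(\mu_z)_{z \in \mathbb{C}}$ of probability measures is the ergodic decomposition of $R_\alpha$.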

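Proof. Call again $\mathcal{F}$ the σ-field of all Borel subsets of $\mathbb{C}$ which are invariant by the natural action of $SO_2(\mathbb{R})$. Let $R$ be the map $z \mapsto |z|$ from $\mathbb{C}$ to $\mathbb{R}_+$.

It suffices to check that $\mathcal{F} \subset \mathcal{I}_{R_\alpha} \subset \mathcal{F}^\mu$, where $\mathcal{F}^\mu$ denotes the completion of $\mathcal{F}$ with regard to $\mu$. The first inclusion is immediate, so we check the second one.

For every $r \in \mathbb{R}_+$, denote by $\nu_r$ the uniform measure on $r\mathbb{U}$. The measure $\nu_1$ is the Haar measure on $\mathbb{U}$. We know that $(R_\alpha, \nu_1)$ is ergodic since $\alpha$ is irrational. Hence $(R_\alpha, \nu_r)$ is ergodic for every $r \in \mathbb{R}_+$ (since this dynamical system is equivalent to $(R_\alpha, \nu_1)$ if $r > 0$, and since $\nu_r = \delta_0$ if $r = 0$).

Let $A \in \mathcal{I}_{R_\alpha}$. Then for every $r \in \mathbb{R}_+$, $\nu_r(A) \in \{0, 1\}$. Set $B = \{ r \in \mathbb{R}_+ : \nu_r(A) = 1 \}$. Then $B \in \mathcal{B}(\mathbb{R}_+)$ so $R^{-1}(B) \in \mathcal{F}$. Moreover

\[
\begin{aligned}
\mu(A \triangle R^{-1}(B))
&= \int_{\mathbb{C}} \big| \mathbf{1}_A(z) - \mathbf{1}_B(|z|) \big| \, d\mu(z) \\
&= \int_0^{+\infty} \Big( \int_{r\mathbb{U}} \big| \mathbf{1}_A(z) - \mathbf{1}_B(r) \big| \, d\nu_r(z) \Big) r e^{-r^2/2} \, dr \\
&= \int_B \nu_r(A^c) \, r e^{-r^2/2} \, dr + \int_{B^c} \nu_r(A) \, r e^{-r^2/2} \, dr \\
&= 0.
\end{aligned}
\]

The proof is complete.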

Exercise. Let $\alpha \in \mathbb{Q}$. Find the ergodic decomposition of the following systems.

1. $T_\alpha : x \mapsto (x + \alpha) - \lfloor x + \alpha \rfloor$ from $I = [0, 1[$ endowed with its Borel σ-field and Lebesgue measure to itself.

2. $T : (x, y) \mapsto (x + y - \lfloor x + y \rfloor, y)$ from $I^2$ endowed with its Borel σ-field and Lebesgue measure to itself.

Chapter 4

Elements of spectral theory

We fix a complex Hilbert space $H$, and denote by $L(H)$ the vector space of all continuous linear operators of $H$. This is a complete normed space for the norm defined by

\[ \|A\| = \sup\{ \|Ax\|/\|x\| : x \in H \setminus \{0\} \}. \]

Given $A \in L(H)$, its adjoint operator $A^*$ is defined by $\langle A^*x | y \rangle = \langle x | Ay \rangle$. We denote by $I$ the identity map of $H$.

4.1 Generalities

4.1.1 Spectrum and point spectrum

Let A ∈ L(H).

Definition 4.1. The point spectrum $\sigma_p(A)$ is the set of all $\lambda \in \mathbb{C}$ such that $A - \lambda I$ is not injective, whereas the spectrum $\sigma(A)$ is the set of all $\lambda \in \mathbb{C}$ such that $A - \lambda I$ is not bijective.

Proposition 4.2. Of course, $\sigma_p(A) \subset \sigma(A)$. Moreover, $\sigma(A)$ is always a non-empty compact set contained in the closed disk $\overline{D}(0, \|A\|)$. Yet the point spectrum may be empty.

Let us look at some particular cases.

$A$ is a projector if and only if $A^2 = A$. In this case, $H = \mathrm{Ker}\,A \oplus \mathrm{Ker}(A - I)$. The spectrum and the point spectrum coincide and are contained in $\{0, 1\}$ (with equality unless $A = 0$ or $A = I$).

$A$ is an orthogonal projector if and only if $A^2 = A = A^*$.

$A$ is an isometry if and only if $A^*A = I$. In this case, $AA^*$ is an orthogonal projector, possibly different from $I$; the point spectrum is included in the unit circle $\mathbb{U}$, whereas the spectrum is included in the closed unit disk $\overline{D}(0, 1)$; eigenvectors associated to distinct eigenvalues are orthogonal.

By definition, $A$ is unitary if and only if $A^*A = AA^* = I$. In this case, the spectrum is included in the unit circle $\mathbb{U}$.

By definition, $A$ is self-adjoint if and only if $A^* = A$. In this case, the spectrum is included in the real line $\mathbb{R}$.

By definition, $A$ is normal if and only if $A^*A = AA^*$. In this case the eigenspaces are pairwise orthogonal.


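Proof. Let us prove the last statement. If $A$ is normal, then for every $\lambda \in \mathbb{C}$ and $x \in H$,

\[ \|(A - \lambda I)x\|^2 = \langle x | (A^* - \bar\lambda I)(A - \lambda I)x \rangle = \langle x | (A - \lambda I)(A^* - \bar\lambda I)x \rangle = \|(A^* - \bar\lambda I)x\|^2. \]

Let $\lambda \ne \mu$ be in $\mathbb{C}$ and $x$ and $y$ be in $H$. If $Ax = \lambda x$ and $Ay = \mu y$, we deduce that $A^*x = \bar\lambda x$, so

\[ \mu \langle x | y \rangle = \langle x | Ay \rangle = \langle A^*x | y \rangle = \lambda \langle x | y \rangle. \]

Since $\lambda \ne \mu$, we get $\langle x | y \rangle = 0$.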

4.1.2 Examples

Exercise. Determine the adjoint, the spectrum and the point spectrum of the operators below.

1. $A$ from $H = \ell^2(\mathbb{Z}_+)$ to itself defined by $A(x)(n) = x(n+1)$ for every $n \in \mathbb{Z}_+$.

2. $A$ from $H = \ell^2(\mathbb{Z})$ to itself defined by $A(x)(n) = x(n+1)$ for every $n \in \mathbb{Z}$.

3. $M_b : h \mapsto bh$ from $H = L^2(X, \mathcal{X}, \mu)$ to itself, where $(X, \mathcal{X}, \mu)$ is a measure space and $b$ a bounded measurable map from $X$ to $\mathbb{C}$. Hint: the point spectrum and the spectrum can be characterised with the help of the measure $b(\mu)$ on $\mathbb{C}$.

4.2 Spectral measures associated to a unitary operator

In the whole section, U denotes a unitary operator of some Hilbert space H.

4.2.1 Cyclic subspace and spectral measure associated to a vector

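For every $k \in \mathbb{Z}$, we denote by $\chi_k$ the map from $\mathbb{U}$ to $\mathbb{C}$ defined by $\chi_k(z) = z^k$. For every $n \ge 0$, we denote by $\mathcal{P}_n$ the vector space generated by $\{\chi_k : -n \le k \le n\}$. The space $\mathcal{P} = \bigcup_{n \ge 0} \mathcal{P}_n$ of all trigonometric polynomials on $\mathbb{U}$ can be identified with $\mathbb{C}[Z, Z^{-1}]$ and is dense in the Banach space $(C(\mathbb{U}), \|\cdot\|_\infty)$.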

Definition 4.3. For every $h \in H$, the cyclic space $C(h)$ associated to $h$ is the closure of the vector space spanned by $\{U^k h : k \in \mathbb{Z}\}$. It is the smallest closed subspace of $H$ which contains $h$ and is stable by $U$ and $U^* = U^{-1}$.

Proposition 4.4. For every $h \in H$, the space $C(h)^\perp$ is also stable by $U$ and $U^* = U^{-1}$. Therefore, if $x \in C(h)^\perp$, then $C(x)$ is orthogonal to $C(h)$.

Theorem and definition 4.5. For every $h \in H$, there exists a unique finite measure $\sigma_h$ on $\mathbb{U}$ such that for every $k \in \mathbb{Z}$,

\[ \int_{\mathbb{U}} \chi_k \, d\sigma_h = \langle h, U^k h \rangle. \]

In particular, $\sigma_h(\mathbb{U}) = \|h\|^2$. The measure $\sigma_h$ is called the spectral measure of $U$ associated to $h$.


Proof. The map $k \mapsto \langle h, U^k h \rangle$ from $\mathbb{Z}$ to $\mathbb{C}$ is positive semi-definite. When $\|h\| = 1$, the existence follows from Herglotz's theorem and the uniqueness from the injectivity of the Fourier transform. The general case follows by homogeneity.

Remark 4.6. For every $P \in \mathcal{P}$, $\int_{\mathbb{U}} P \, d\sigma_h = \langle h, P(U)h \rangle$.

Proposition 4.7. The map $h \mapsto \sigma_h$ is continuous from $(H, \|\cdot\|)$ to $\mathcal{M}(\mathbb{U})$ for the narrow convergence.

Proof. This result follows from the density in $C(\mathbb{U})$ of the set $\mathcal{P}$ of all trigonometric polynomials on $\mathbb{U}$.

Lemma 4.8. (important examples)

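1. Let $\lambda \in \mathbb{U}$. Then $Uh = \lambda h$ if and only if $\sigma_h = \|h\|^2 \delta_\lambda$.

2. For any $Q \in \mathcal{P}$, $\sigma_{Q(U)h} = |Q|^2 \cdot \sigma_h$ and $\|Q(U)h\|^2 = \|Q\|^2_{L^2(\sigma_h)}$.

Proof. 1. If $Uh = \lambda h$ with $\lambda \in \mathbb{U}$, then for every $k \in \mathbb{Z}$, $U^k h = \lambda^k h$, so

\[ \int_{\mathbb{U}} \chi_k \, d\sigma_h = \langle h, U^k h \rangle = \lambda^k \|h\|^2. \]

Thus $\sigma_h = \|h\|^2 \delta_\lambda$. Conversely, if $\sigma_h = \|h\|^2 \delta_\lambda$, then $\langle h, Uh \rangle = \lambda \|h\|^2$, so $|\langle h, Uh \rangle| = \|h\| \times \|Uh\|$. The equality case in the Cauchy-Schwarz inequality shows that $h$ and $Uh$ are collinear. Therefore, the equality $\langle h, Uh \rangle = \lambda \|h\|^2$ yields $Uh = \lambda h$.

2. Let

\[ Q = \sum_{k=-n}^{n} a_k Z^k. \]

Then for every $k \in \mathbb{Z}$,

\[
\begin{aligned}
\langle Q(U)h, U^k Q(U)h \rangle
&= \sum_{\ell, m} \overline{a_\ell} a_m \langle U^\ell h, U^{k+m} h \rangle \\
&= \sum_{\ell, m} \overline{a_\ell} a_m \int_{\mathbb{U}} \chi_{k+m-\ell} \, d\sigma_h \\
&= \sum_{\ell, m} \int_{\mathbb{U}} \chi_k \, \overline{a_\ell \chi_\ell} \, a_m \chi_m \, d\sigma_h \\
&= \int_{\mathbb{U}} \chi_k \, \overline{Q} Q \, d\sigma_h.
\end{aligned}
\]

Thus $\sigma_{Q(U)h} = |Q|^2 \sigma_h$. In particular,

\[ \|Q(U)h\|^2 = \sigma_{Q(U)h}(\mathbb{U}) = \int_{\mathbb{U}} |Q|^2 \, d\sigma_h = \|Q\|^2_{L^2(\sigma_h)}. \]

We are done.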
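In finite dimension, spectral measures are finite sums of Dirac masses and the identities of theorem 4.5 and lemma 4.8 can be tested directly. The sketch below is an illustration under assumptions made here: $H = \mathbb{C}^N$ with $N = 8$, $U$ the cyclic shift $(Ux)(n) = x(n+1 \bmod N)$, whose eigenvectors are $v_j(n) = \omega^{jn}/\sqrt{N}$ with $\omega = e^{2i\pi/N}$, and an arbitrary vector `h` and Laurent polynomial `Q`. The inner product is antilinear in the first variable, as in the text.

```python
import cmath

N = 8
OMEGA = cmath.exp(2j * cmath.pi / N)   # primitive N-th root of unity

def inner(x, y):
    """<x|y>, antilinear in the first variable."""
    return sum(a.conjugate() * b for a, b in zip(x, y))

def shift_power(h, k):
    """U^k h for the cyclic shift (U x)(n) = x(n + 1 mod N)."""
    return [h[(n + k) % N] for n in range(N)]

def spectral_weights(h):
    """Weights sigma_h({OMEGA^j}) = |<v_j|h>|^2, with v_j(n) = OMEGA^(j n)/sqrt(N)."""
    weights = []
    for j in range(N):
        coeff = sum(OMEGA ** (-j * n) * h[n] for n in range(N)) / N ** 0.5
        weights.append(abs(coeff) ** 2)
    return weights

def integrate(weights, phi):
    """Integral of phi against the measure sum_j weights[j] * delta_{OMEGA^j}."""
    return sum(w * phi(OMEGA ** j) for j, w in enumerate(weights))

def apply_polynomial(coeffs, h):
    """Q(U)h for the Laurent polynomial Q = sum_k coeffs[k] Z^k."""
    return [sum(a * h[(n + k) % N] for k, a in coeffs.items()) for n in range(N)]

def evaluate(coeffs, z):
    """Value Q(z) of the Laurent polynomial at a point z of the unit circle."""
    return sum(a * z ** k for k, a in coeffs.items())

h = [1, 2j, -1, 0.5, 0, 3, 1 - 1j, 2]   # arbitrary test vector
Q = {-1: 2.0, 0: 1j, 2: -0.5}           # arbitrary Laurent polynomial
weights = spectral_weights(h)
g = apply_polynomial(Q, h)
```

Here `integrate(weights, lambda z: z ** k)` plays the role of $\int_{\mathbb{U}} \chi_k \, d\sigma_h$, and the last identity of lemma 4.8 compares `inner(g, g)` with the integral of $|Q|^2$.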

Corollary 4.9. Let h ∈ H.


1. The linear map $Q \mapsto Q(U)h$ from $\mathcal{P}$ to $H$ can be extended into an isometry $\Phi_h$ from $L^2(\sigma_h)$ to $H$, whose range is exactly $C(h)$.

2. For every $\varphi \in L^2(\sigma_h)$, $\sigma_{\Phi_h(\varphi)} = |\varphi|^2 \sigma_h$. Therefore, the set of all spectral measures associated to some vector in $C(h)$ is exactly the set of all finite measures which are absolutely continuous with regard to $\sigma_h$.

Formally, we have $\Phi_h(\varphi) = \varphi(U)h$ for every $\varphi \in L^2(\sigma_h)$, but $\varphi(U)$ is not well-defined as a continuous linear operator of $H$.

Proof. The equality $\|Q(U)h\| = \|Q\|_{L^2(\sigma_h)}$ and the linearity of the map $Q \mapsto Q(U)h$ show that $Q(U)h$ depends only on the equivalence class of $Q$ in $L^2(\sigma_h)$, and that the quotient map is an isometry. The first statement follows by completeness of $H$ and by density of the set of all trigonometric polynomials in $C(\mathbb{U})$.

We already know that the equality $\sigma_{\Phi_h(\varphi)} = |\varphi|^2 \sigma_h$ holds for every $\varphi \in \mathcal{P}$. Since both sides depend continuously on $\varphi$, this equality extends to every $\varphi \in L^2(\sigma_h)$.

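Corollary 4.10. Let $\lambda \in \mathbb{U}$. Then $\lambda$ is an eigenvalue of $U$ if and only if there exists some $h \in H$ such that $\sigma_h(\{\lambda\}) > 0$.

Proof. The proof relies on corollary 4.9 and on lemma 4.8.

If $\lambda$ is an eigenvalue of $U$ associated to a vector $h$, then $\sigma_h = \|h\|^2 \delta_\lambda$, hence $\sigma_h(\{\lambda\}) = \|h\|^2 > 0$.

Conversely, if $\sigma_h(\{\lambda\}) > 0$, then corollary 4.9 applied to the vector $g = \Phi_h(\mathbf{1}_{\{\lambda\}})$ yields $\|g\|^2 = \|\mathbf{1}_{\{\lambda\}}\|^2_{L^2(\sigma_h)} = \sigma_h(\{\lambda\}) > 0$ and $\sigma_g = |\mathbf{1}_{\{\lambda\}}|^2 \sigma_h = \sigma_h(\{\lambda\}) \delta_\lambda = \|g\|^2 \delta_\lambda$, so $g$ is an eigenvector associated to $\lambda$ by lemma 4.8.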

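Proposition 4.11. Let $h \in H$. For every $\varphi \in L^2(\sigma_h)$, $\Phi_h(\chi_1\varphi) = U(\Phi_h(\varphi))$. Calling $M_{\chi_1}$ the 'multiplication by $\chi_1$' operator on $L^2(\sigma_h)$, we get $\Phi_h \circ M_{\chi_1} = U \circ \Phi_h$, so the diagram

\[
\begin{array}{ccc}
L^2(\sigma_h) & \xrightarrow{\ M_{\chi_1}\ } & L^2(\sigma_h) \\
{\scriptstyle \Phi_h} \downarrow & & \downarrow {\scriptstyle \Phi_h} \\
H & \xrightarrow{\ \ U\ \ } & H
\end{array}
\]

commutes. Therefore, the unitary endomorphism induced by $U$ on $C(h)$ is unitarily equivalent to the endomorphism $M_{\chi_1}$ on $L^2(\sigma_h)$.

Proof. Since both sides of the equality to be proved depend continuously on $\varphi$, it suffices to check the equality when $\varphi \in \mathcal{P}$. In this case, $\chi_1\varphi \in \mathcal{P}$, so

\[ \Phi_h(\chi_1\varphi) = (\chi_1\varphi)(U)h = \chi_1(U)\varphi(U)h = U\Phi_h(\varphi). \]

The proof is complete.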

4.2.2 Spectral type

We now study the dependence of σh with regard to h ∈ H.

Proposition 4.12. Let $h_1$ and $h_2$ in $H$. If $C(h_1) \perp C(h_2)$, then $\sigma_{h_1+h_2} = \sigma_{h_1} + \sigma_{h_2}$.


Proof. Exercise

Corollary 4.13. Let $g$ and $h$ in $H$. If $\sigma_g$ and $\sigma_h$ are mutually singular, then $C(g) \perp C(h)$.

Proof. Exercise. Hint: set $h = h_1 + h_2$ with $h_1 \in C(g)$ and $h_2 \in C(g)^\perp$.

Remark 4.14. Proposition 4.12 can be extended to any sequence $(h_n)_{n \ge 1}$ of vectors whose cyclic subspaces $C(h_n)$ are pairwise orthogonal and such that the series $\sum_n \|h_n\|^2$ converges (so the series $\sum_n h_n$ converges).

The extension above enables us to prove the next statement.

Proposition 4.15. Let $(f_n)_{n \ge 1}$ be any sequence of vectors in $H$ such that the series $\sum_n \|f_n\|$ converges (so the series $\sum_n f_n$ converges). Let $f = \sum_{n \ge 1} f_n$. Then the measure $\sum_{n \ge 1} \sigma_{f_n}$ is finite and

\[ \sigma_f \ll \sum_{n} \sigma_{f_n}. \]

Proof. The finiteness of the measure $\sum_{n \ge 1} \sigma_{f_n}$ follows from the convergence of the series $\sum_n \|f_n\|^2$. We apply a variant of Gram-Schmidt procedure: we set $g_1 = f_1$, and for every $n \ge 2$, we get $g_n$ by subtracting from $f_n$ its orthogonal projection on $C(g_1) + \cdots + C(g_{n-1})$.

For every $n \ge 1$, $f_n - g_n \in C(g_1) + \cdots + C(g_{n-1})$ whereas $g_n$ is orthogonal to $C(g_1) + \cdots + C(g_{n-1})$. Therefore, $f_n = f_n - g_n + g_n \in C(g_1) + \cdots + C(g_n)$ and $C(g_n)$ is orthogonal to $C(g_1) + \cdots + C(g_{n-1})$, which contains $C(f_n - g_n)$. By proposition 4.12, we get $\sigma_{f_n} = \sigma_{f_n - g_n} + \sigma_{g_n}$.

The cyclic subspaces $C(g_n)$ are pairwise orthogonal and $f$ belongs to the closure of their sum, so we can write $f = \sum_{n \ge 1} h_n$ with $h_n \in C(g_n)$ for every $n \ge 1$. The vectors $(h_n)_{n \ge 1}$ are pairwise orthogonal and the series $\sum_n \|h_n\|^2$ converges, so

\[ \sigma_f = \sum_{n \ge 1} \sigma_{h_n}. \]

But for every $n \ge 1$, $\sigma_{h_n} \ll \sigma_{g_n} \ll \sigma_{f_n}$ since $h_n \in C(g_n)$ and $\sigma_{f_n} = \sigma_{f_n - g_n} + \sigma_{g_n}$. The result follows.

Corollary 4.16. Let $\mu$ be a non-negative measure on $\mathbb{U}$ and $D$ be a dense subspace of $H$. If $\sigma_h \ll \mu$ for every $h \in D$, then $\sigma_h \ll \mu$ for every $h \in H$.

Proof. Let $h \in H$. One can approach $h$ by some sequence $(h_n)_{n \ge 1}$ of vectors in $D$. By extracting a suitable subsequence if necessary, one can ensure that the series $\sum_n \|h_n - h\|$ converges. By the triangle inequality, the series $\sum_n \|h_{n+1} - h_n\|$ also converges. Set $f_1 = h_1$ and $f_n = h_n - h_{n-1}$ for every $n \ge 2$. Then proposition 4.15 applies, so

\[ \sigma_h \ll \sum_{n \ge 1} \sigma_{f_n}. \]

But for every $n \ge 1$, $\sigma_{f_n} \ll \mu$ since $f_n \in D$. The result follows.

Similar arguments yield the following important result.


Theorem and definition 4.17. Assume that $H$ is separable. Then there exists $g \in H$ such that for every $h \in H$, $\sigma_h \ll \sigma_g$. One says that $g$ is of maximal spectral type. Moreover, the equivalence class of the measure $\sigma_g$ does not depend on $g$ and is called the spectral type of $U$.

Proof. Let $(f_n)_{n \ge 1}$ be a total family of vectors of $H$. By the variant of Gram-Schmidt procedure used above, we get a sequence $(g_n)_{n \ge 1}$ of unit vectors such that the cyclic subspaces $(C(g_n))_{n \ge 1}$ are pairwise orthogonal and generate a dense subspace of $H$. We fix a sequence $(\alpha_n)_{n \ge 1}$ of non-zero numbers such that the series $\sum_n |\alpha_n|^2$ converges and we set $g = \sum_{n \ge 1} \alpha_n g_n$.

Let $h \in H$. Then $h = \sum_{n \ge 1} h_n$ with $h_n \in C(g_n)$ for every $n \ge 1$, so

\[ \sigma_h = \sum_{n \ge 1} \sigma_{h_n} \ll \sum_{n \ge 1} |\alpha_n|^2 \sigma_{g_n} = \sigma_g. \]

The result follows.

Definition 4.18. Assume that $H$ is separable. One says that $U$ has a discrete spectrum if its spectral measure is discrete.

Proposition 4.19. Assume that $H$ is separable. Then $U$ has a discrete spectrum if and only if the eigenspaces of $U$ span a dense subspace in $H$.

Proof. Exercise.

We now state without proof a result which is analogous to the Frobenius reduction of an endomorphism of a finite-dimensional vector space.

Theorem 4.20. Assume that $H$ is separable. Then there exist countably (possibly finitely) many vectors $h_1, h_2, \ldots$ in $H$ such that

1. The cyclic subspaces $C(h_1), C(h_2), \ldots$ are pairwise orthogonal.

2. The closure of their sum equals $H$.

3. $\sigma_{h_1} \gg \sigma_{h_2} \gg \cdots$.

4. For every $h \in H$, $\sigma_h \ll \sigma_{h_1}$.

Moreover, the measures $\sigma_{h_1}, \sigma_{h_2}, \ldots$ are unique up to equivalence.

Remark 4.21. With the notations above, $U$ is unitarily equivalent to the product of the unitary maps $U_i$, where $U_i$ is the endomorphism $M_{\chi_1}$ on $L^2(\sigma_{h_i})$.

Definition 4.22. One says that $U$ has simple spectrum when $H = C(h)$ for some $h \in H$.

4.3 Spectral theorem

Definition 4.23. A spectral measure on $H$ is a map $P$ from $\mathcal{B}(\mathbb{C})$ to the set of all orthogonal projectors of $H$ such that $P(\mathbb{C}) = I$ and for every sequence $(B_n)_{n \ge 1}$ of pairwise disjoint Borel subsets of $\mathbb{C}$, and $h \in H$,

\[ P\Big( \bigcup_{n \ge 1} B_n \Big) h = \sum_{n \ge 1} P(B_n) h. \]


Remark 4.24. Denote by $H(B)$ the range of $P(B)$ (which equals the kernel of $P(B) - I$). The last part of the definition could be rephrased as follows: for every sequence $(B_n)_{n \ge 1}$ of pairwise disjoint Borel subsets of $\mathbb{C}$, the subspaces $(H(B_n))_{n \ge 1}$ are pairwise orthogonal and

\[ H\Big( \bigcup_{n \ge 1} B_n \Big) = \sum_{n \ge 1} H(B_n). \]

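Proposition 4.25. Let $P$ be a spectral measure on $H$.

1. For every $h \in H$, the map $P_h$ from $\mathcal{B}(\mathbb{C})$ to $\mathbb{C}$ defined by $P_h(B) = \langle h, P(B)h \rangle$ is a non-negative finite measure with mass $\|h\|^2$.

2. For every $f$ and $g$ in $H$, the map $P_{f,g}$ from $\mathcal{B}(\mathbb{C})$ to $\mathbb{C}$ defined by $P_{f,g}(B) = \langle f, P(B)g \rangle$ is a complex measure.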

Proof. Let $B \in \mathcal{B}(\mathbb{C})$.

For every $h \in H$, $h - P(B)h \perp P(B)h$, so $P_h(B) = \langle P(B)h, P(B)h \rangle \in \mathbb{R}_+$.

For every $f$ and $g$ in $H$,

\[ P_{f,g}(B) = \frac{1}{4} \big( P_{f+g}(B) - P_{f-g}(B) - iP_{f+ig}(B) + iP_{f-ig}(B) \big). \]

The σ-additivity properties directly follow from the definition.
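The polarization formula above can be sanity-checked in finite dimension. The sketch below is an illustration under assumptions chosen here: $H = \mathbb{C}^3$, $P(B)$ replaced by the orthogonal projection onto a hypothetical set `coords` of coordinate axes, and arbitrary vectors `f` and `g`; the inner product is antilinear in the first variable.

```python
def inner(x, y):
    """<x|y>, antilinear in the first variable."""
    return sum(a.conjugate() * b for a, b in zip(x, y))

def project(x, coords):
    """Orthogonal projection onto the span of the basis vectors e_i, i in coords."""
    return [v if i in coords else 0 for i, v in enumerate(x)]

def P_fg(f, g, coords):
    """Analogue of P_{f,g}(B) = <f, P(B) g>."""
    return inner(f, project(g, coords))

def P_h(h, coords):
    """Analogue of P_h(B) = <h, P(B) h>."""
    return P_fg(h, h, coords)

def combine(x, y, c):
    """The vector x + c*y."""
    return [a + c * b for a, b in zip(x, y)]

def polarization(f, g, coords):
    """Right-hand side of the polarization formula."""
    return (P_h(combine(f, g, 1), coords) - P_h(combine(f, g, -1), coords)
            - 1j * P_h(combine(f, g, 1j), coords)
            + 1j * P_h(combine(f, g, -1j), coords)) / 4

f = [1 + 2j, -1, 0.5j]   # arbitrary test vectors
g = [2, 1j, 3 - 1j]
coords = {0, 2}          # hypothetical range of the projection
```

The signs in front of the two imaginary terms depend on the convention for the inner product; with the opposite convention they would be exchanged.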

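Definition 4.26. Let $P$ be a spectral measure on $H$ and $\phi : \mathbb{C} \to \mathbb{C}$ be a bounded measurable function. There exists a unique continuous linear operator on $H$, denoted by $\int_{\mathbb{C}} \phi \, dP$, such that for every $f$ and $g$ in $H$,

\[ \Big\langle f \,\Big|\, \Big( \int_{\mathbb{C}} \phi \, dP \Big) g \Big\rangle = \int_{\mathbb{C}} \phi \, dP_{f,g}. \]

The existence and uniqueness follow from Riesz representation theorem and the fact that the map

\[ (f, g) \mapsto \int_{\mathbb{C}} \phi \, dP_{f,g} \]

is sesquilinear and continuous.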

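Proposition 4.27. The integral with regard to $P$ is a morphism of algebras from the space $\mathcal{M}_b(\mathbb{C})$ of all measurable bounded functions on $\mathbb{C}$ to $L(H)$. Moreover, for every $\phi \in \mathcal{M}_b(\mathbb{C})$,

\[ \Big( \int \phi \, dP \Big)^* = \int \overline{\phi} \, dP \quad \text{and} \quad \Big\| \int \phi \, dP \Big\| \le \|\phi\|_\infty. \]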

Definition 4.28. Let $P$ be a spectral measure. The support of $P$ is the set

\[ \mathrm{Supp}(P) = \{ \lambda \in \mathbb{C} : \forall \varepsilon > 0, \ P(B(\lambda, \varepsilon)) \ne 0 \}. \]

Proposition 4.29. The support of $P$ is closed. Moreover, for every $\phi \in \mathcal{M}_b(\mathbb{C})$, the integral of $\phi$ depends only on the restriction of $\phi$ to $\mathrm{Supp}(P)$, so the integral can be defined as soon as $\phi$ is bounded on $\mathrm{Supp}(P)$.


Theorem 4.30. (Spectral theorem) Assume that $H$ is separable. Given a normal operator $A \in L(H)$, there exists a unique spectral measure $P$ on $H$, with compact support, such that

\[ A = \int_{\mathbb{C}} \lambda \, dP(\lambda). \]

The support of $P$ is exactly $\sigma(A)$.

Remark 4.31. For every polynomial function $\phi$,

\[ \phi(A) = \int_{\mathbb{C}} \phi(\lambda) \, dP(\lambda). \]

The formula above enables us to extend nicely the definition of $\phi(A)$ to every bounded Borel function $\phi$ on $\sigma(A)$.

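Remark 4.32. Let $U$ be a unitary operator on $H$. Then the spectral measure is carried by $\mathbb{U}$. Moreover, for every $h \in H$ and $k \in \mathbb{Z}$,

\[ \langle h, U^k h \rangle = \Big\langle h, \Big( \int_{\mathbb{U}} \lambda^k \, dP(\lambda) \Big) h \Big\rangle = \int_{\mathbb{U}} \lambda^k \, dP_h(\lambda). \]

Hence the measure $P_h$ is the spectral measure associated to $h$.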

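Example 4.33. Consider again the operator $M_b$ of subsection 4.1.2. For every $B \in \mathcal{B}(\mathbb{C})$, let $P(B)$ be the orthogonal projection on $\{ h \in L^2(X, \mathcal{X}, \mu) : \mathbf{1}_{b^{-1}(B)} h = h \text{ a.e.} \}$, namely the map $h \mapsto \mathbf{1}_{b^{-1}(B)} h$. Then $P$ is the spectral measure of $M_b$.

Proof. One checks directly that $P$ is a spectral measure. Let $f$ and $g$ in $L^2(X, \mathcal{X}, \mu)$. Then $\overline{f}g \in L^1(X, \mathcal{X}, \mu)$. The corresponding measure $P_{f,g}$ is the image of the complex measure $\overline{f}g\mu$ by $b$ since for every $B \in \mathcal{B}(\mathbb{C})$,

\[ P_{f,g}(B) = \int_X \overline{f} \, \mathbf{1}_{b^{-1}(B)} \, g \, d\mu = (\overline{f}g \cdot \mu)(b^{-1}(B)). \]

Therefore,

\[ \langle f, M_b g \rangle = \int_X \overline{f} \, b \, g \, d\mu = \int_X b(x) \, d(\overline{f}g\mu)(x) = \int_{\mathbb{C}} \lambda \, dP_{f,g}(\lambda). \]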

Chapter 5

Ergodicity and mixing

5.1 Definitions and first characterizations

Let T be a measure-preserving map on a probability space (X,X , µ).

We recall that $T$ is ergodic if and only if for every $A$ and $B$ in $\mathcal{X}$,

\[ \frac{1}{n} \sum_{k=0}^{n-1} \mu(A \cap T^{-k}(B)) \to \mu(A)\mu(B) \quad \text{as } n \to +\infty. \]

Definition 5.1. (Strong mixing and weak mixing)

1. One says that $T$ is strongly mixing if and only if for every $A$ and $B$ in $\mathcal{X}$,

\[ \mu(A \cap T^{-n}(B)) \to \mu(A)\mu(B) \quad \text{as } n \to +\infty. \]

2. One says that $T$ is weakly mixing if and only if for every $A$ and $B$ in $\mathcal{X}$,

\[ \frac{1}{n} \sum_{k=0}^{n-1} |\mu(A \cap T^{-k}(B)) - \mu(A)\mu(B)| \to 0 \quad \text{as } n \to +\infty. \]

Cesàro lemma and the triangle inequality provide the implications

strongly mixing $\implies$ weakly mixing $\implies$ ergodic.

Actually, the next result will show us that weak mixing is equivalent to the convergence $\mu(A \cap T^{-n}(B)) \to \mu(A)\mu(B)$ up to removing a small subset of integers $n$ (possibly depending on $A$ and $B$).
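The second implication cannot be reversed: an irrational rotation is ergodic but not weakly mixing. The sketch below illustrates this numerically for the single set $A = [0, 1/2[$ on the circle, using the elementary fact that two arcs of length $1/2$ whose origins are at circle distance $d$ overlap on a length $1/2 - d$. The angle $\sqrt{2}$, the set $A$ and the horizon `n` are choices made here; checking one set is of course not a proof.

```python
import math

def circle_distance(t):
    """Distance from t to the nearest integer."""
    return abs(t - round(t))

def correlation(k, alpha):
    """mu(A ∩ T^{-k}A) - mu(A)^2 for A = [0, 1/2) and the rotation by alpha.

    T^{-k}A is an arc of length 1/2 shifted by k*alpha, so the overlap
    with A has length 1/2 - dist(k*alpha, Z).
    """
    return (0.5 - circle_distance(k * alpha)) - 0.25

def cesaro(values):
    """Arithmetic mean of a finite list."""
    return sum(values) / len(values)

alpha = math.sqrt(2)   # illustrative irrational angle
n = 20000
corr = [correlation(k, alpha) for k in range(n)]

ergodic_average = cesaro(corr)                          # expected to be close to 0
weak_mixing_average = cesaro([abs(c) for c in corr])    # expected to stay close to 1/8
```

By equidistribution of $(k\alpha)$ modulo 1, the second average tends to $\int_0^1 |1/4 - \mathrm{dist}(t, \mathbb{Z})| \, dt = 1/8$, which is not 0.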

Definition 5.2. Let $I$ be a subset of $\mathbb{Z}_+^*$. The density of $I$ is

\[ \lim_{n \to +\infty} \frac{|I \cap [1, n]|}{n} \]

if the limit exists.

Lemma 5.3. Let $(a_n)_{n \ge 1}$ be a bounded sequence of non-negative real numbers and $(c_n)_{n \ge 1}$ be its Cesàro-means sequence

\[ c_n = \frac{1}{n} \sum_{k=1}^{n} a_k. \]

Then $(c_n)_{n \ge 1}$ converges to $0$ if and only if there exists a subset $I$ of $\mathbb{Z}_+^*$ having density $1$ such that the subsequence $(a_n)_{n \in I}$ converges to $0$.


Proof. If $(c_n)_{n \ge 1}$ converges to $0$, then the non-increasing sequence $(b_n)_{n \ge 1}$ defined by $b_n = \sup\{c_k : k \ge n\}$ also converges to $0$. Set $I = \{ n \ge 1 : a_n \le \sqrt{b_n} \}$. Then for every $n \ge 1$, we have $|I^c \cap [1, n]|/n \le \sqrt{c_n}$ since

\[ n c_n = \sum_{k=1}^{n} a_k \ge \sum_{k \in I^c \cap [1,n]} \sqrt{b_k} \ge \sqrt{b_n} \, |I^c \cap [1, n]| \ge \sqrt{c_n} \, |I^c \cap [1, n]|. \]

Therefore, $I^c$ has density $0$ so $I$ has density $1$. Since $0 \le a_n \le \sqrt{b_n}$ for every $n \in I$, the subsequence $(a_n)_{n \in I}$ converges to $0$. Note that this part of the proof works even if $(a_n)_{n \ge 1}$ is unbounded.

Conversely, assume that we have a subset $I$ of $\mathbb{Z}_+^*$ having density $1$ such that the subsequence $(a_n)_{n \in I}$ converges to $0$. Let $M = \sup\{a_n : n \ge 1\}$. Then for every $n \ge 1$,

\[ c_n \le \frac{1}{n} \sum_{k \in I \cap [1,n]} |a_k| + \frac{M}{n} |I^c \cap [1, n]|. \]

Applying Cesàro lemma to the sequence $(a_n)_{n \in I}$ and using the assumption that $I$ has density $1$ yields the result.
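A simple example shows why a density-one subset is needed in lemma 5.3. In the sketch below, which is only an illustration with a sequence chosen here, $a_n = 1$ when $n$ is a perfect square and $a_n = 0$ otherwise: the Cesàro means tend to $0$, the full sequence does not converge, and the set $I$ of non-squares has density $1$.

```python
import math

def is_square(n):
    """True when n is a perfect square."""
    r = math.isqrt(n)
    return r * r == n

def a(n):
    """Illustrative sequence: 1 on perfect squares, 0 elsewhere."""
    return 1.0 if is_square(n) else 0.0

def cesaro_mean(n):
    """c_n = (a_1 + ... + a_n) / n."""
    return sum(a(k) for k in range(1, n + 1)) / n

def proportion_of_non_squares(n):
    """|I ∩ [1, n]| / n for I = set of non-squares."""
    return sum(1 for k in range(1, n + 1) if not is_square(k)) / n

means = [cesaro_mean(10 ** j) for j in (2, 3, 4)]
proportions = [proportion_of_non_squares(10 ** j) for j in (2, 3, 4)]
```

Since there are exactly 100 perfect squares up to $10^4$, the last Cesàro mean is $0.01$ and the last proportion is $0.99$, although $a_n = 1$ for infinitely many $n$.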

Corollary 5.4. Let $(a_n)_{n \ge 1}$ be a bounded sequence of non-negative real numbers. Then

\[ \frac{1}{n} \sum_{k=1}^{n} a_k \to 0 \text{ as } n \to +\infty \iff \frac{1}{n} \sum_{k=1}^{n} a_k^2 \to 0 \text{ as } n \to +\infty. \]

We introduce the notation

\[ L^2(\mu)_0 = \Big\{ f \in L^2(\mu) : \int_X f \, d\mu = 0 \Big\} = (\mathbb{C}\mathbf{1})^\perp. \]

Proposition 5.5. (Characterization of the strong mixing property)

The following statements are equivalent.

1. The map $T$ is strongly mixing.

2. For every $f$ and $g$ in $L^2(\mu)$, $\langle f | g \circ T^n \rangle \to \langle f | \mathbf{1} \rangle \times \langle \mathbf{1} | g \rangle$ as $n \to +\infty$.

3. For every $f$ in $L^2(\mu)$, $\langle f | f \circ T^n \rangle \to \langle f | \mathbf{1} \rangle \times \langle \mathbf{1} | f \rangle$ as $n \to +\infty$.

4. For every $f$ in $L^2(\mu)_0$, $\langle f | f \circ T^n \rangle \to 0$ as $n \to +\infty$.

Proof. By definition, the strong mixing property says exactly that statement 2 holds for indicator functions, so 2 $\implies$ 1.

Conversely, if 1 holds, then 2 holds when $f$ and $g$ are indicator functions, hence 2 holds when $f$ and $g$ are simple functions (by sesquilinearity), hence 2 holds for every $f$ and $g$ in $L^2(\mu)$ (by density).

Statement 3 is a direct consequence of statement 2.

Conversely, assume that statement 3 holds. Let $f \in L^2(\mu)$ and

\[ F_f = \{ g \in L^2(\mu) : \langle f \circ T^n | g \rangle \to \langle f | \mathbf{1} \rangle \langle \mathbf{1} | g \rangle \text{ as } n \to +\infty \}. \]

5.1. DEFINITIONS AND FIRST CHARACTERIZATIONS 53

One checks that Ff is a closed vector subspace L2(µ) which contains 1, f T k for everyk ≥ 0 and every function orthogonal to the vector space spanned by 1∪f T k : k ≥ 0,so Ff = H, which yields statement 2.

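Statement 4 is a direct consequence of statement 3. The converse follows from the next observation: for every $f \in L^2(\mu)$, $f - \langle 1 \mid f\rangle 1 \in L^2(\mu)_0$, and

$$\begin{aligned}
\big\langle f - \langle 1 \mid f\rangle 1 \,\big|\, (f - \langle 1 \mid f\rangle 1) \circ T^n \big\rangle
&= \big\langle f - \langle 1 \mid f\rangle 1 \,\big|\, f \circ T^n - \langle 1 \mid f\rangle 1 \big\rangle \\
&= \langle f \mid f \circ T^n\rangle - \langle 1 \mid f\rangle\langle f \mid 1\rangle - \overline{\langle 1 \mid f\rangle}\,\langle 1 \mid f \circ T^n\rangle + \overline{\langle 1 \mid f\rangle}\,\langle 1 \mid f\rangle \\
&= \langle f \mid f \circ T^n\rangle - \langle 1 \mid f\rangle\langle f \mid 1\rangle,
\end{aligned}$$

since $T$ preserves $\mu$, so that $\langle 1 \mid f \circ T^n\rangle = \langle 1 \mid f\rangle$.

The proof is complete.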

An analogous proof and corollary 5.4 yield the next result.

Proposition 5.6. (Characterization of the weak mixing property)

The following statements are equivalent.

1. The map T is weakly mixing.

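2. For every $f$ and $g$ in $L^2(\mu)$,

$$\frac{1}{n}\sum_{k=0}^{n-1} \big|\langle f \mid g \circ T^k\rangle - \langle f \mid 1\rangle\langle 1 \mid g\rangle\big| \to 0 \text{ as } n \to +\infty.$$

3. For every $f$ in $L^2(\mu)$,

$$\frac{1}{n}\sum_{k=0}^{n-1} \big|\langle f \mid f \circ T^k\rangle - \langle f \mid 1\rangle\langle 1 \mid f\rangle\big| \to 0 \text{ as } n \to +\infty.$$

4. For every $f$ in $L^2(\mu)_0$,

$$\frac{1}{n}\sum_{k=0}^{n-1} \big|\langle f \mid f \circ T^k\rangle\big|^2 \to 0 \text{ as } n \to +\infty.$$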

We now introduce the isometric operator $U_T$ on $L^2(\mu)$ defined by $U_T f = f \circ T$.

Proposition 5.7. (General properties of the point spectrum)

1. The constant function $1$ is an eigenvector of $U_T$ associated with the eigenvalue 1. Its orthogonal complement $L^2(\mu)_0 = (\mathbf{C}1)^\perp$ in $L^2(\mu)$ is stable under $U_T$.

2. If f is an eigenvector of UT associated to an eigenvalue λ, then |f | is an eigenvector

associated to |λ| and |λ| = 1.

3. T is ergodic if and only if the eigenspace Ker(UT − I) is reduced to the line C1.

4. If T is ergodic, then each eigenvector of UT has a constant modulus and the point

spectrum of UT is a subgroup of U.

Proof. The proof is left as an exercise to the reader.


5.2 Characterizations involving spectral properties of UT

During the whole section, we assume that $T$ is an automorphism of $(X, \mathcal{X}, \mu)$, so the Hilbert space $L^2(\mu)$ is separable and the operator $U_T$ on $L^2(\mu)$ defined by $U_T f = f \circ T$ is unitary. To every $f \in L^2(\mu)$, we associate its spectral measure $\sigma_f$.

When the measure space $(X, \mathcal{X}, \mu)$ is separable, the Hilbert space $L^2(\mu)$ is separable, so we can fix a vector $h_1 \in L^2(\mu)_0$ of maximal spectral type (namely $\sigma_f \ll \sigma_{h_1}$ for every $f \in L^2(\mu)_0$).

By proposition 5.7, the ergodicity of $T$ is equivalent to the equality $\mathrm{Ker}(U_T - I) = \mathbf{C}1$. Therefore, corollary 4.10 yields the following characterization.

Proposition 5.8. (Spectral characterization of ergodicity) The following statements are

equivalent

1. The map T is ergodic.

2. For every $f \in L^2(\mu)_0$, $\sigma_f(\{1\}) = 0$.

3. $\sigma_{h_1}(\{1\}) = 0$ (provided $(X, \mathcal{X}, \mu)$ is separable).

We can also characterize the strong mixing property.

Theorem 5.9. (Spectral characterization of strong mixing property) The following state-

ments are equivalent

1. T is strongly mixing.

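2. For every $f \in L^2(\mu)_0$, $\widehat{\sigma_f}(k) := \int_{U} z^{-k}\,d\sigma_f(z) \to 0$ as $|k| \to +\infty$.

3. $\widehat{\sigma_{h_1}}(k) \to 0$ as $|k| \to +\infty$ (provided $(X, \mathcal{X}, \mu)$ is separable).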

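Proof. Proposition 5.5 shows that $T$ is strongly mixing if and only if for every $f \in L^2(\mu)_0$, $\langle f \mid f \circ T^k\rangle \to 0$ as $k \to +\infty$. But for every $k \in \mathbf{Z}$,

$$\langle f \mid f \circ T^k\rangle = \langle f \mid U_T^k f\rangle = \int_{U} z^k\,d\sigma_f(z) = \widehat{\sigma_f}(-k) = \overline{\widehat{\sigma_f}(k)}.$$

The equivalence 1 $\iff$ 2 follows.

The implication 2 $\Longrightarrow$ 3 is immediate. To prove its converse, it suffices to note that the set of all $\varphi \in L^1(\sigma_{h_1})$ such that $\widehat{\varphi \cdot \sigma_{h_1}}(k) \to 0$ as $|k| \to +\infty$ is a closed vector subspace which contains the characters $\chi_\ell : z \mapsto z^\ell$, since $\widehat{\chi_\ell \cdot \sigma_{h_1}}(k) = \widehat{\sigma_{h_1}}(k - \ell)$ for all integers $k$ and $\ell$, so it is $L^1(\sigma_{h_1})$.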

The next lemma will help us to give a characterization of the weak mixing property.

Lemma 5.10. For every $f \in L^2(\mu)$,

$$\lim_{n\to+\infty} \frac{1}{n}\sum_{k=0}^{n-1} \big|\langle f \mid f \circ T^k\rangle\big|^2 = \sum_{z \in U} \sigma_f(\{z\})^2.$$


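Proof. For every $n \ge 1$,

$$\begin{aligned}
\frac{1}{n}\sum_{k=0}^{n-1} \big|\langle f \mid f \circ T^k\rangle\big|^2
&= \frac{1}{n}\sum_{k=0}^{n-1} \Big|\int_{U} \chi_k\,d\sigma_f\Big|^2 \\
&= \frac{1}{n}\sum_{k=0}^{n-1} \int_{U} z_1^{-k}\,d\sigma_f(z_1) \int_{U} z_2^{k}\,d\sigma_f(z_2) \\
&= \int_{U^2} \frac{1}{n}\sum_{k=0}^{n-1} (z_2/z_1)^k\,d(\sigma_f \otimes \sigma_f)(z_1, z_2).
\end{aligned}$$

But for every $z_1$ and $z_2$ in $U$,

$$\frac{1}{n}\sum_{k=0}^{n-1} (z_2/z_1)^k = 1 \text{ if } z_1 = z_2,$$

whereas

$$\frac{1}{n}\sum_{k=0}^{n-1} (z_2/z_1)^k = \frac{1}{n}\,\frac{1 - (z_2/z_1)^n}{1 - z_2/z_1} \text{ if } z_1 \ne z_2,$$

so

$$\frac{1}{n}\sum_{k=0}^{n-1} (z_2/z_1)^k \to 1_{[z_1 = z_2]} \text{ as } n \to +\infty.$$

Since the modulus of these quantities remains bounded by 1, the Lebesgue dominated convergence theorem applies, so

$$\begin{aligned}
\lim_{n\to+\infty} \frac{1}{n}\sum_{k=0}^{n-1} \big|\langle f \mid f \circ T^k\rangle\big|^2
&= \int_{U^2} 1_{[z_1 = z_2]}\,d(\sigma_f \otimes \sigma_f)(z_1, z_2) \\
&= \int_{U} \sigma_f(\{z_1\})\,d\sigma_f(z_1) \\
&= \sum_{z \in U} \sigma_f(\{z\})^2.
\end{aligned}$$

The proof is complete.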
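The limit in Lemma 5.10 can be checked numerically on a purely atomic measure. The sketch below is a hedged illustration: the measure (three atoms on the unit circle, given by their angle in turns and their mass) is a hypothetical example of ours, and the function names are not from the text. It compares the Cesàro average of the squared moduli of the coefficients $\int_U z^k\,d\sigma(z)$ with the sum of the squared masses.

```python
import cmath
import math

# Hypothetical atomic measure on the unit circle: (angle in turns, mass).
ATOMS = [(0.0, 0.5), (math.sqrt(2), 0.3), (0.5, 0.2)]


def fourier_coefficient(atoms, k: int) -> complex:
    """Integral of z**k against the atomic measure."""
    return sum(mass * cmath.exp(2j * math.pi * k * t) for t, mass in atoms)


def averaged_square_modulus(atoms, n: int) -> float:
    """(1/n) * sum_{k=0}^{n-1} |integral of z**k|^2."""
    return sum(abs(fourier_coefficient(atoms, k)) ** 2 for k in range(n)) / n


def sum_of_squared_atoms(atoms) -> float:
    """Sum over the atoms z of sigma({z})^2."""
    return sum(mass ** 2 for _, mass in atoms)


if __name__ == "__main__":
    for n in (10, 100, 1000, 20000):
        print(n, averaged_square_modulus(ATOMS, n), sum_of_squared_atoms(ATOMS))
```

For a measure without atoms the right-hand side is 0, which is the situation exploited in the characterization of weak mixing.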

Theorem 5.11. (spectral characterization of weak mixing property)

Let T be an automorphism of (X,X , µ). For every f ∈ L2(µ), denote by σf the

spectral measure associated to f . The following statements are equivalent

1. T is weakly mixing.

2. For every f ∈ L2(µ)0, σf has no atom.

3. The endomorphism induced by UT on L2(µ)0 has no eigenvalue.

4. T is ergodic and 1 is the only eigenvalue of UT .

5. σh1 has no atom (provided (X,X , µ) is separable).


Proof. The equivalence between statements 1 and 2 follows from proposition 5.6 and from the last lemma.

The equivalence between statements 2 and 3 follows from corollary 4.10.

The equivalence between statements 3 and 4 follows from proposition 5.7.

The equivalence between statements 2 and 5 is obvious.

5.3 Ergodicity of a Cartesian product

Theorem 5.12. Let T and S be automorphisms of (X,X , µ) and (Y,Y, ν) respectively.

Then T ×S is ergodic if and only if T and S are ergodic and the only common eigenvalue

of UT and US is 1.

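Proof. First, assume that $T \times S$ is ergodic. For every $A \in \mathcal{I}_T$, $A \times Y \in \mathcal{I}_{T\times S}$, so $\mu(A) = (\mu \otimes \nu)(A \times Y) \in \{0, 1\}$. Thus $T$ is ergodic, and the same argument works for $S$. Moreover, if $f$ and $g$ are eigenvectors of $U_T$ and $U_S$ associated with the same eigenvalue $\lambda$, then $|\lambda| = 1$ by ergodicity of $T$ and $(f \otimes \overline{g}) \circ (T \times S) = (\lambda f) \otimes (\overline{\lambda}\,\overline{g}) = f \otimes \overline{g}$, so $f \otimes \overline{g}$ is $\mu \otimes \nu$-almost surely constant, so $f$ is $\mu$-almost surely constant and $\lambda = 1$.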

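Conversely, assume that $T$ and $S$ are ergodic and that the only common eigenvalue of $U_T$ and $U_S$ is 1. One needs to prove that for every $h \in L^2(\mu \otimes \nu)$,

$$\frac{1}{n}\sum_{k=0}^{n-1} h \circ (T^k \times S^k) \to \int_{X\times Y} h\,d(\mu \otimes \nu) \text{ in } L^2(\mu \otimes \nu),$$

so $E[h \mid \mathcal{I}_{T\times S}] = E[h]$ $(\mu \otimes \nu)$-almost surely.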

The set of all functions $f \otimes g$ with $f \in L^2(\mu)$ and $g \in L^2(\nu)$ is total in $L^2(\mu \otimes \nu)$. Therefore it suffices to check the convergence when $h = f \otimes g$ with $f \in L^2(\mu)$ and $g \in L^2(\nu)$. By the ergodicity of $T$ and $S$, the convergence holds when $f$ or $g$ is constant, so we only need to consider the case where $f \in L^2(\mu)_0$ and $g \in L^2(\nu)_0$.

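Let $h = f \otimes g$ with $f \in L^2(\mu)_0$ and $g \in L^2(\nu)_0$. Denote by $\sigma_f$, $\sigma_g$, $\sigma_h$ the spectral measures of $T$, $S$ and $T \times S$ associated with $f$, $g$, $h$ respectively. (Actually, one should write $\sigma_f^T$, $\sigma_g^S$, $\sigma_h^{T\times S}$.) Then for every $n \ge 1$,

$$\begin{aligned}
\Big\|\frac{1}{n}\sum_{k=0}^{n-1} h \circ (T \times S)^k\Big\|_2^2
&= \frac{1}{n^2}\sum_{k=0}^{n-1}\sum_{l=0}^{n-1} \big\langle h \circ (T \times S)^k, h \circ (T \times S)^l\big\rangle \\
&= \int_{U} \frac{1}{n^2}\sum_{k=0}^{n-1}\sum_{l=0}^{n-1} z^{l-k}\,d\sigma_h(z) \\
&= \int_{U} \Big|\frac{1}{n}\sum_{k=0}^{n-1} z^k\Big|^2\,d\sigma_h(z),
\end{aligned}$$

since for every $z \in U$,

$$\Big|\sum_{k=0}^{n-1} z^k\Big|^2 = \sum_{k=0}^{n-1} \overline{z}^{\,k} \times \sum_{l=0}^{n-1} z^l = \sum_{k=0}^{n-1}\sum_{l=0}^{n-1} z^{l-k}.$$

Since

$$\Big|\frac{1}{n}\sum_{k=0}^{n-1} z^k\Big|^2 \le 1 \quad\text{and}\quad \Big|\frac{1}{n}\sum_{k=0}^{n-1} z^k\Big|^2 \to 1_{[z=1]} \text{ as } n \to +\infty,$$

the dominated convergence theorem yields

$$\Big\|\frac{1}{n}\sum_{k=0}^{n-1} h \circ (T^k \times S^k)\Big\|_2^2 \to \sigma_h(\{1\}) \text{ as } n \to +\infty.$$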

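But $\sigma_h = \sigma_f * \sigma_g$, where by definition, the measure $\sigma_f * \sigma_g$ is the image of $\sigma_f \otimes \sigma_g$ by the map $(z_1, z_2) \mapsto z_1 z_2$ from $U^2$ to $U$. Indeed, for every $k \in \mathbf{Z}$,

$$\langle h, h \circ (T \times S)^k\rangle = \langle f, f \circ T^k\rangle \times \langle g, g \circ S^k\rangle = \int_{U}\int_{U} z_1^k z_2^k\,d\sigma_f(z_1)\,d\sigma_g(z_2) = \int_{U} z^k\,d\sigma_h(z).$$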

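Call $A_f$ and $A_g$ the sets of all atoms of the measures $\sigma_f$ and $\sigma_g$, and $A_f^*$ the conjugate of $A_f$. Then

$$\begin{aligned}
\sigma_h(\{1\}) &= \int_{U}\int_{U} 1_{[z_1 z_2 = 1]}\,d\sigma_f(z_1)\,d\sigma_g(z_2) \\
&= \int_{U} \sigma_f(\{\overline{z_2}\})\,d\sigma_g(z_2) = \int_{A_f^*} \sigma_f(\{\overline{z}\})\,d\sigma_g(z) = \sum_{z \in A_f^*} \sigma_f(\{\overline{z}\})\,\sigma_g(\{z\}).
\end{aligned}$$

By corollary 4.10, proposition 5.7 and by hypothesis,

$$A_f^* \cap A_g \subset \sigma_p(U_T)^* \cap \sigma_p(U_S) = \sigma_p(U_T) \cap \sigma_p(U_S) = \{1\}.$$

Hence $\sigma_h(\{1\}) = \sigma_f(\{1\})\,\sigma_g(\{1\}) = 0$, since $T$ is ergodic and $f \in L^2(\mu)_0$ (see proposition 5.8). The proof is complete.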

Theorems 5.12 and 5.11 immediately yield the following result, which shows the interest of the weak mixing property.

Theorem 5.13. Let $T$ be an automorphism of a separable measure space $(X, \mathcal{X}, \mu)$. The following properties are equivalent.

1. T is weakly mixing.

2. T × T is ergodic.

3. T × T is weakly mixing.

Proof. The equivalence (1) $\iff$ (2) follows directly from theorems 5.12 and 5.11.

The implication (3) =⇒ (2) is already known.

Last, if $T$ is weakly mixing, then one checks that the convergence

$$\frac{1}{n}\sum_{k=0}^{n-1} \big|(\mu \otimes \mu)(A \cap (T \times T)^{-k}(B)) - (\mu \otimes \mu)(A)\,(\mu \otimes \mu)(B)\big| \to 0 \text{ as } n \to +\infty$$

holds

• when $A = A_1 \times A_2$ and $B = B_1 \times B_2$, with $A_1, A_2, B_1, B_2$ in $\mathcal{X}$;

• when $A$ and $B$ are (disjoint) finite unions of such Cartesian products;

• in the general case (by density).

This yields the implication (1) $\Longrightarrow$ (3).


5.4 Examples

5.4.1 Translations on Td

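We identify $\mathbf{T}^d$ with the quotient group $\mathbf{R}^d/\mathbf{Z}^d$. We call $\mu$ the Haar measure on this compact group. For every $x \in \mathbf{R}^d$, we denote by $\dot{x}$ the equivalence class of $x$.

Given $k = (k_1, \ldots, k_d) \in \mathbf{Z}^d$ and $x = (x_1, \ldots, x_d) \in \mathbf{R}^d$, the quantity $\exp(i2\pi k \cdot x) = \exp(i2\pi(k_1 x_1 + \cdots + k_d x_d))$ depends only on $\dot{x}$, so one can define a map $e_k : \mathbf{T}^d \to \mathbf{C}$ by $e_k(\dot{x}) = \exp(i2\pi k \cdot x)$. The maps $(e_k)_{k\in\mathbf{Z}^d}$ form a total family in the Banach space $(C(\mathbf{T}^d, \mathbf{C}), \|\cdot\|_\infty)$ and an orthonormal basis of $L^2(\mathbf{T}^d)$.

Let $\alpha = (\alpha_1, \ldots, \alpha_d) \in \mathbf{R}^d$ and call $T_\alpha$ the translation by $\dot{\alpha}$ in $\mathbf{T}^d$. Then $T_\alpha$ is an automorphism of $(\mathbf{T}^d, \mathcal{B}(\mathbf{T}^d), \mu)$. For every $k \in \mathbf{Z}^d$, $e_k \circ T_\alpha = \exp(i2\pi k \cdot \alpha)\,e_k$, so $e_k$ is an eigenvector of the unitary operator $U_{T_\alpha}$ associated with the eigenvalue $\exp(i2\pi k \cdot \alpha)$. These eigenvalues are distinct if and only if the only $k \in \mathbf{Z}^d$ such that $k \cdot \alpha \in \mathbf{Z}$ is 0, and in this case, the associated eigenspaces are lines.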

Set

$$h := \sum_{k \in \mathbf{Z}^d} 2^{-\|k\|_1} e_k,$$

where $\|k\|_1 := |k_1| + \cdots + |k_d|$. The observations above, part 2 of corollary 4.9, remark 4.14 and corollary 4.16 yield the following result.

Proposition 5.14. (Properties of a translation on the torus)

• The vector $h$ is of maximal spectral type. Its spectral measure is

$$\sigma_h := \sum_{k \in \mathbf{Z}^d} 4^{-\|k\|_1}\,\delta_{e^{i2\pi k\cdot\alpha}},$$

so the unitary operator $U_{T_\alpha}$ has discrete spectrum.

• The unitary operator UTα has simple spectrum if and only if the only k ∈ Zd such

that k · α ∈ Z is 0.

• The map Tα is ergodic if and only if the only k ∈ Zd such that k · α ∈ Z is 0.

• The map Tα is not weakly mixing.
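The last point can be observed numerically in dimension $d = 1$. The sketch below is an illustration under our own assumptions (the angle $\alpha = \sqrt{2} - 1$, the starting point and the two observables are hypothetical choices): the Birkhoff averages of $\cos(2\pi x)$ along the rotation tend to the integral 0, whereas $\cos(2\pi(x - y))$ is invariant under $T_\alpha \times T_\alpha$, so its averages stay at the initial value although its integral is also 0. This is consistent with $T_\alpha \times T_\alpha$ failing to be ergodic.

```python
import math

ALPHA = math.sqrt(2) - 1  # floating-point stand-in for an irrational angle


def birkhoff_average(observable, start, n, alpha=ALPHA):
    """Average of observable over the first n points of the orbit of start
    under the product rotation (x, y) -> (x + alpha, y + alpha) mod 1."""
    x0, y0 = start
    total = 0.0
    for k in range(n):
        x = (x0 + k * alpha) % 1.0
        y = (y0 + k * alpha) % 1.0
        total += observable(x, y)
    return total / n


def first_coordinate(x, y):
    """Depends on x only; its integral over the torus is 0."""
    return math.cos(2 * math.pi * x)


def difference(x, y):
    """Invariant under the product rotation; its integral is also 0."""
    return math.cos(2 * math.pi * (x - y))


if __name__ == "__main__":
    start = (0.1, 0.2)
    for n in (100, 1000, 5000):
        print(n, birkhoff_average(first_coordinate, start, n),
              birkhoff_average(difference, start, n))
```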

5.4.2 Continuous automorphisms of Td

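Keep the notations of the previous subsection. Let $A \in GL_d(\mathbf{Z})$ (namely, $A \in M_d(\mathbf{Z})$ and $\det A \in \{-1, 1\}$). The map $T_A$ from $\mathbf{T}^d$ to itself defined by $T_A(\dot{x}) = \dot{(Ax)}$ is an automorphism of $(\mathbf{T}^d, \mathcal{B}(\mathbf{T}^d), \mu)$. Denote by $\nu$ the Haar measure on $U$.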

Lemma 5.15. Let k ∈ Zd.

1. $U_{T_A} e_k = e_{A^\top k}$.

2. The greatest common divisor of the components of $A^\top k$ equals the greatest common divisor of the components of $k$.

3. If $U_{T_A}^n e_k \ne e_k$ for every $n \ge 1$, then $\sigma_{e_k} = \nu$. Otherwise, $\sigma_{e_k}$ is the uniform law on $U_m := \{z \in \mathbf{C} : z^m = 1\}$, where $m$ is the least positive integer such that $U_{T_A}^m e_k = e_k$.

Proof. The proof is left as an exercise to the reader. The first point is proved by direct computation. The last two points come from the pairwise orthogonality of the characters $(e_k)_{k\in\mathbf{Z}^d}$.

Theorem 5.16. If $\mathrm{Sp}(A)$ contains no root of unity, then $T_A$ is strongly mixing. Otherwise, $T_A$ is not ergodic.

Proof. The proof relies on the last lemma and on proposition 4.15. Since $(e_k)_{k\in\mathbf{Z}^d}$ is a Hilbert basis of $L^2(\mu)$, one has for every $f \in L^2(\mu)$

$$\sigma_f \ll \sum_{k \in \mathbf{Z}^d} |\langle e_k, f\rangle|^2\,\sigma_{e_k}.$$

If $\mathrm{Sp}(A)$ contains no root of unity, then for every $f \in L^2(\mu)_0$, $\sigma_f \ll \nu$, so $T_A$ is strongly mixing.

Otherwise, $\mathrm{Sp}(A)$ contains some root of unity, which has a finite order in the group $(U, \times)$. Call $m$ the least possible order. Then 1 is an eigenvalue of $A^m$, and also of $(A^\top)^m$. The kernel of $(A^\top)^m - I$ contains some non-null vector in $\mathbf{Q}^d$, hence (clearing denominators) one can find $k \in \mathbf{Z}^d \setminus \{0\}$ such that $(A^\top)^m k = k$. Since $m$ is minimal, the orbit of $e_k$ has exactly $m$ elements, namely $e_k, e_{A^\top k}, \ldots, e_{(A^\top)^{m-1}k}$. Hence $\sigma_{e_k}(\{1\}) = 1/m > 0$. Since $e_k \in L^2(\mu)_0$, $T_A$ is not ergodic.

Alternative argument: $e_k + e_{A^\top k} + \cdots + e_{(A^\top)^{m-1}k}$ is an invariant function which is not almost surely constant, since it belongs to $L^2(\mu)_0 \setminus \{0\}$. Hence, $T_A$ is not ergodic.
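The dichotomy of Theorem 5.16 can be tested on small matrices. The sketch below is our own illustration, not the author's method: it relies on the observation that $A$ has an eigenvalue of order dividing $m$ if and only if $\det(A^m - I) = 0$, and on the assumption (a standard bound on Euler's totient, $\varphi(m) \ge \sqrt{m/2}$, combined with the fact that the cyclotomic polynomial of such an eigenvalue divides the characteristic polynomial) that only the orders $m \le 2d^2$ need to be examined. The helper names are hypothetical.

```python
from fractions import Fraction


def mat_mul(a, b):
    """Product of two square integer matrices given as lists of rows."""
    d = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(d)) for j in range(d)]
            for i in range(d)]


def det(m):
    """Exact determinant by Gaussian elimination over the rationals."""
    d = len(m)
    rows = [[Fraction(x) for x in row] for row in m]
    result = Fraction(1)
    for c in range(d):
        pivot = next((r for r in range(c, d) if rows[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            rows[c], rows[pivot] = rows[pivot], rows[c]
            result = -result
        result *= rows[c][c]
        for r in range(c + 1, d):
            factor = rows[r][c] / rows[c][c]
            for j in range(c, d):
                rows[r][j] -= factor * rows[c][j]
    return result


def has_root_of_unity_eigenvalue(a):
    """True if det(a^m - I) = 0 for some 1 <= m <= 2 d^2 (assumed sufficient)."""
    d = len(a)
    identity = [[int(i == j) for j in range(d)] for i in range(d)]
    power = identity
    for _ in range(1, 2 * d * d + 1):
        power = mat_mul(power, a)
        shifted = [[power[i][j] - identity[i][j] for j in range(d)]
                   for i in range(d)]
        if det(shifted) == 0:
            return True
    return False


if __name__ == "__main__":
    print(has_root_of_unity_eigenvalue([[2, 1], [1, 1]]))   # hyperbolic example
    print(has_root_of_unity_eigenvalue([[0, -1], [1, 0]]))  # rotation by a quarter turn
```

According to the theorem, a matrix for which the function returns False gives a strongly mixing $T_A$, and a matrix for which it returns True gives a non-ergodic $T_A$.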

5.4.3 Chacon's transformation

Chacon's transformation $T$ is constructed as follows by a recursive procedure called cut-and-stack.

Fix $\ell > 0$ (the value of $\ell$ will be specified later). Define a sequence $(h_n)_{n\ge 0}$ of integers by $h_0 = 1$ and $h_{n+1} = 3h_n + 1$ for every $n \ge 0$. One checks that for every $n \ge 0$,

$$\frac{h_n}{3^n} = \sum_{k=0}^{n} \frac{1}{3^k} = \frac{3}{2}\Big(1 - \frac{1}{3^{n+1}}\Big).$$

For every $n \ge 0$, we define recursively an ordered partition $\mathcal{T}_n = (I_{n,0}, \ldots, I_{n,h_n-1})$ of $[0, h_n\ell/3^n[$ into $h_n$ pairwise disjoint half-closed intervals having the same length $\ell/3^n$ and a map $T_n$ from $D_n = I_{n,0} \cup \cdots \cup I_{n,h_n-2}$ onto $T_n(D_n) = I_{n,1} \cup \cdots \cup I_{n,h_n-1}$ which sends by translation each interval $I_{n,k}$ with $0 \le k \le h_n - 2$ onto the interval $I_{n,k+1}$. The action of $T_n$ is represented by stacking the intervals to make a tower of height $h_n$: the base is the interval $I_{n,0}$ at the bottom, each interval $I_{n,k}$ is at level $k$ and for $0 \le k \le h_n - 2$, $T_n$ maps each point of $I_{n,k}$ to the corresponding point above.

The initial partition is necessarily $\mathcal{T}_0 = (I_{0,0})$ where $I_{0,0} = [0, \ell[$, whereas $T_0$ maps $\emptyset$ onto $\emptyset$.


Let $n \ge 0$. Assume that the partition $\mathcal{T}_n = (I_{n,0}, \ldots, I_{n,h_n-1})$ and the map $T_n$ are constructed as above. Then we cut the tower $\mathcal{T}_n$ into three columns by splitting each interval $I_{n,k}$ into three disjoint half-closed intervals having the same length $\ell/3^{n+1}$. We add an extra interval $[h_n\ell/3^n, h_n\ell/3^n + \ell/3^{n+1}[ \, = [h_n\ell/3^n, h_{n+1}\ell/3^{n+1}[$, called spacer, to get a partition of $[0, h_{n+1}\ell/3^{n+1}[$ into $h_{n+1}$ disjoint half-closed intervals having the same length $\ell/3^{n+1}$. We stack the three columns of $\mathcal{T}_n$ in the natural order, inserting the spacer between the first two and the last one, to get the tower $\mathcal{T}_{n+1}$. More precisely, the spacer is the interval $I_{n+1,2h_n}$, and the first third, the second third and the last third of $I_{n,k}$ are respectively the intervals $I_{n+1,k}$, $I_{n+1,h_n+k}$ and $I_{n+1,2h_n+1+k}$. By construction, the corresponding map $T_{n+1}$ extends $T_n$.
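The cut-and-stack recursion above can be sketched in a few lines. This is only an illustrative sketch: the function name `chacon_towers` and the representation of a tower by the list of left endpoints of its levels, measured in units of $\ell$, are our own conventions.

```python
from fractions import Fraction


def chacon_towers(steps: int):
    """Return [tower_0, ..., tower_steps]; tower_n lists the left endpoints of
    I_{n,0}, ..., I_{n,h_n - 1}, in units of ell (each level has width 3**-n)."""
    tower = [Fraction(0)]
    width = Fraction(1)
    towers = [tower]
    for _ in range(steps):
        h = len(tower)
        third = width / 3
        new = [None] * (3 * h + 1)
        for k, left in enumerate(tower):
            new[k] = left                          # first third of I_{n,k}
            new[h + k] = left + third              # second third of I_{n,k}
            new[2 * h + 1 + k] = left + 2 * third  # last third of I_{n,k}
        new[2 * h] = h * width                     # spacer [h_n/3^n, h_{n+1}/3^{n+1}[
        tower, width = new, third
        towers.append(tower)
    return towers


if __name__ == "__main__":
    for n, tower in enumerate(chacon_towers(2)):
        print(n, len(tower), [str(x) for x in tower])
```

The heights of the towers produced this way satisfy $h_{n+1} = 3h_n + 1$, that is $h_n = (3^{n+1} - 1)/2$.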

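The figure below represents the construction of $\mathcal{T}_0$, $\mathcal{T}_1$ and $\mathcal{T}_2$. The dotted lines indicate the spacers.

[Figure: the towers $\mathcal{T}_0$, $\mathcal{T}_1$ and $\mathcal{T}_2$. The levels are listed below from top to bottom; each arrow of the figure goes from a level to the level just above it.]

$\mathcal{T}_0$: $I_{0,0} = [0, \ell[$.

$\mathcal{T}_1$: $I_{1,3} = [2\ell/3, \ell[$, $I_{1,2} = [\ell, 4\ell/3[$ (spacer added at this step), $I_{1,1} = [\ell/3, 2\ell/3[$, $I_{1,0} = [0, \ell/3[$.

$\mathcal{T}_2$: $I_{2,12} = [8\ell/9, \ell[$, $I_{2,11} = [11\ell/9, 4\ell/3[$, $I_{2,10} = [5\ell/9, 2\ell/3[$, $I_{2,9} = [2\ell/9, \ell/3[$, $I_{2,8} = [4\ell/3, 13\ell/9[$ (spacer added at this step), $I_{2,7} = [7\ell/9, 8\ell/9[$, $I_{2,6} = [10\ell/9, 11\ell/9[$, $I_{2,5} = [4\ell/9, 5\ell/9[$, $I_{2,4} = [\ell/9, 2\ell/9[$, $I_{2,3} = [2\ell/3, 7\ell/9[$, $I_{2,2} = [\ell, 10\ell/9[$, $I_{2,1} = [\ell/3, 4\ell/9[$, $I_{2,0} = [0, \ell/9[$.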

The total length of the intervals $I_{n,0}, \ldots, I_{n,h_n-1}$ is

$$\frac{h_n}{3^n}\,\ell = \sum_{k=0}^{n} \frac{1}{3^k}\,\ell = \frac{3}{2}\Big(1 - \frac{1}{3^{n+1}}\Big)\ell.$$

We choose $\ell = 2/3$, so that $I = [0, 1[$ is the union of the intervals $([0, h_n\ell/3^n[)_{n\ge 0}$. Call $\lambda$ the Lebesgue measure on $I$. Then $\lambda(D_n) = (h_n - 1)\ell/3^n \to 1$ as $n \to +\infty$.

By definition, Chacon's transformation is the map $T$ from $I = [0, 1[$ to itself which extends the maps $(T_n)_{n\ge 0}$. Actually, $T$ is defined on the full-measure subset $D = \bigcup_{n\ge 0} D_n$ and is a bimeasurable map from $D$ to $T(D)$ which preserves the Lebesgue measure $\lambda$ on $I$, since for every $n \ge 0$, the image by $T$ of the Lebesgue measure on $D_n$ is the Lebesgue measure on $T(D_n)$.


Remark 5.17. The cut-and-stack procedure provides a wide class of automorphisms of $(I, \mathcal{B}(I), \lambda)$, by letting the number of sub-towers, the number of spacers and their positions vary at each step. One may also have at each step finitely many towers with different widths.

Theorem 5.18. Chacon's transformation is weakly mixing, but not strongly mixing.

Proof. First, we prove that $T$ is ergodic. Let $A \in \mathcal{I}_T$. For every $n \ge 0$, set $\theta_n = \lambda(A \mid I_{n,0})$. Since $A \in \mathcal{I}_T$ and $T^{-1}$ preserves $\lambda$, one also has for every $k \in [[0, h_n - 1]]$,

$$\lambda(A \cap I_{n,k}) = \lambda(T^k(A \cap I_{n,0})) = \lambda(A \cap I_{n,0}) = \theta_n\lambda(I_{n,0}) = \theta_n\lambda(I_{n,k}).$$

If $n \ge 1$, then summing the equalities above over all $k \in \{0, h_{n-1}, 2h_{n-1} + 1\}$ yields $\lambda(A \cap I_{n-1,0}) = \theta_n\lambda(I_{n-1,0})$, so $\theta_{n-1} = \theta_n$ since $I_{n-1,0}$ is the disjoint union of $I_{n,0}$, $I_{n,h_{n-1}}$ and $I_{n,2h_{n-1}+1}$.

As a result, $\lambda(A \cap I_{n,k}) = \theta_0\lambda(I_{n,k})$ for every $n \ge 0$ and $k \in [[0, h_n - 1]]$. Summing over all $k \in [[0, h_n - 1]]$ and letting $n$ go to infinity yields $\lambda(A) = \theta_0$. If $\lambda(A) > 0$, the probability measures $\lambda$ and $\lambda(\cdot \mid A)$ coincide on the class of all intervals $I_{n,k}$, so they are equal and $\lambda(A) = 1$, since this class is stable under intersection and generates $\mathcal{B}(I)$. The ergodicity follows.

Given $n \ge 0$ and $k \in [[0, h_n - 1]]$, denote by $I'_{n,k}$, $I''_{n,k}$ and $I'''_{n,k}$ the first third, the second third and the last third of the interval $I_{n,k}$. The key observation is that

• for every $x \in I'_{n,k}$, $T^{h_n}(x) = x + \ell/3^{n+1} \in I''_{n,k}$;

• for every $x \in I''_{n,k}$, $T^{h_n+1}(x) = x + \ell/3^{n+1} \in I'''_{n,k}$.

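To prove that $T$ is weakly mixing, it remains to prove that 1 is the only eigenvalue of $U_T$. Let $f$ be an eigenvector associated with the eigenvalue $\zeta$. Since $T$ is ergodic, $|f|$ is almost surely constant, and one may assume that this constant is 1.

For every real number $\alpha$, denote by $T_\alpha$ the map $x \mapsto (x + \alpha) - \lfloor x + \alpha \rfloor$ from $I$ to $I$. Using the density of the trigonometric polynomials in $L^2(\lambda)$, one checks that $\|f \circ T_\alpha - f\|_2 \to 0$ as $\alpha \to 0$.

For every $n \ge 0$, let $A_n = I'_{n,0} \cup \cdots \cup I'_{n,h_n-1}$ and $B_n = I''_{n,0} \cup \cdots \cup I''_{n,h_n-1}$. Then

$$\begin{aligned}
\|f \circ T_{\ell/3^{n+1}} - f\|_2^2 &\ge \int_{A_n} |f \circ T^{h_n} - f|^2\,d\lambda + \int_{B_n} |f \circ T^{h_n+1} - f|^2\,d\lambda \\
&= \lambda(A_n)\,|\zeta^{h_n} - 1|^2 + \lambda(B_n)\,|\zeta^{h_n+1} - 1|^2 \\
&= \frac{h_n}{3^{n+1}}\,\ell \times \big(|\zeta^{h_n} - 1|^2 + |\zeta^{h_n+1} - 1|^2\big).
\end{aligned}$$

Since $(h_n/3^{n+1})\ell \to 1/3$ as $n \to +\infty$, one gets $\zeta^{h_n} \to 1$ and $\zeta^{h_n+1} \to 1$ as $n \to +\infty$, so $\zeta = 1$. This shows that $T$ is weakly mixing.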

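For every $n \ge 1$, the interval $I_{1,0}$ can be written as a disjoint union of levels of the tower $\mathcal{T}_n$, namely

$$I_{1,0} = \bigcup_{k \in K_n} I_{n,k}.$$

Since

$$I_{1,0} \cap T^{-h_n}(I_{1,0}) \supset \bigcup_{k \in K_n} \big(I_{n,k} \cap T^{-h_n}(I_{n,k})\big) \supset \bigcup_{k \in K_n} I'_{n,k},$$

we get

$$\lambda(I_{1,0} \cap T^{-h_n}(I_{1,0})) \ge \sum_{k \in K_n} \lambda(I'_{n,k}) = \frac{1}{3}\,\lambda(I_{1,0}).$$

Since $\lambda(I_{1,0}) < 1/3$, $\lambda(I_{1,0} \cap T^{-n}(I_{1,0}))$ does not tend to $\lambda(I_{1,0})^2$ as $n \to +\infty$. Therefore, $T$ is not strongly mixing.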

5.5 Exercises

Let T be an automorphism of a separable measure space (X,X , µ).

1. Let $n \ge 1$. Give a necessary and sufficient condition on $U_T$ for $T^n$ to be ergodic.

2. Assume that $T$ is ergodic. Let $\tilde{T}$ be the map from $X \times \{-1, 1\}$ to itself defined by $\tilde{T}(x, y) = (T(x), -y)$. Let $\nu$ be the uniform measure on $\{-1, 1\}$. Check that

• $\tilde{T}$ is an automorphism of the product space $(X \times \{-1, 1\}, \mathcal{X} \otimes \mathcal{P}(\{-1, 1\}), \mu \otimes \nu)$;

• $\tilde{T}$ is ergodic if and only if $T^2 := T \circ T$ is ergodic;

• $\tilde{T}^2$ is not ergodic.

3. Check that if $T$ is weakly mixing, then $T^n$ is ergodic for every $n \ge 1$.

4. Show that $T$ is weakly mixing if and only if for every ergodic automorphism $S$ of a separable measure space $(Y, \mathcal{Y}, \nu)$, the Cartesian product $T \times S$ is ergodic.

5. Check that $T^{-1}$ is ergodic if and only if $T$ is, and prove the same statements for the weak mixing and the strong mixing property.


Study of a map induced by the dyadic odometer

An odometer is an instrument that indicates the distance travelled by a vehicle or the consumption of a household (power, gas, water). Informally, the dyadic odometer transformation is the addition of 1 in the set of numbers with infinitely many binary digits.

Notations: given two integers $a$ and $b$, and a positive integer $m$, we denote by $a \operatorname{div} m$ and $a \bmod m$ the quotient and the remainder in the Euclidean division of $a$ by $m$.

Preamble: the group of dyadic integers

Many equivalent definitions of the metric group $(\mathbf{Z}_2, +)$ of dyadic integers can be given. We define it as the completion of the group $(\mathbf{Z}, +)$ endowed with the metric $d_2$ defined on $\mathbf{Z}$ by

$$d_2(x, y) := 2^{-m(x,y)} \text{ where } m(x, y) := \sup\{k \in \mathbf{Z}_+ : x - y \in 2^k\mathbf{Z}\},$$

with the convention $d_2(x, y) = 0$ if $x = y$. Actually, one checks that $d_2$ is a translation-invariant ultra-metric on $\mathbf{Z}$.

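Let $\Sigma := \{0, 1\}^{\mathbf{Z}_+}$. Given any sequence $\xi = (\xi_n)_{n\ge 0}$ in $\Sigma$, the sequence

$$(\Phi_n(\xi))_{n\ge 0} := \Big(\sum_{k=0}^{n} \xi_k 2^k\Big)_{n\ge 0}$$

is a Cauchy sequence in $(\mathbf{Z}, d_2)$ whose limit in $\mathbf{Z}_2$ is denoted by

$$\Phi_\infty(\xi) = \sum_{k=0}^{+\infty} \xi_k 2^k.$$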

Moreover, one checks that

• Φ∞(ξ) is a non-negative integer if and only if ξn = 0 for every large enough n ;

• Φ∞(ξ) is a negative integer if and only if ξn = 1 for every large enough n.

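For example,

$$\sum_{k=0}^{n} 2^k = 2^{n+1} - 1 \underset{n\to+\infty}{\longrightarrow} -1 \text{ in } (\mathbf{Z}, d_2), \text{ so } \sum_{k=0}^{+\infty} 2^k = -1 \text{ in } \mathbf{Z}_2.$$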
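On ordinary integers, the metric $d_2$ is easy to compute. The sketch below is a small illustration (the helper names `valuation_2` and `d2` are ours) which also checks the example above: $d_2(2^{n+1} - 1, -1) = 2^{-(n+1)}$.

```python
def valuation_2(n: int) -> int:
    """Largest k such that 2**k divides the non-zero integer n."""
    if n == 0:
        raise ValueError("the valuation of 0 is infinite")
    # n & -n isolates the lowest set bit of n (also for negative n in Python).
    return (n & -n).bit_length() - 1


def d2(x: int, y: int) -> float:
    """Dyadic distance between two integers, with d2(x, x) = 0."""
    return 0.0 if x == y else 2.0 ** (-valuation_2(x - y))


if __name__ == "__main__":
    for n in range(6):
        # Partial sums 2^0 + ... + 2^n approach -1 for the dyadic metric.
        print(n, d2(2 ** (n + 1) - 1, -1))
```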

One checks that the map $\Phi_\infty$ is a bijection from $\Sigma$ to $\mathbf{Z}_2$. Given $x \in \mathbf{Z}_2$, the sequence of binary digits of $x$ is defined by $(D_n(x))_{n\ge 0} = \Phi_\infty^{-1}(x)$.

One checks that the $d_2$-distance between two distinct elements of $\mathbf{Z}_2$ is given by

$$d_2(x, y) = 2^{-m(x,y)} \text{ where } m(x, y) = \inf\{k \ge 0 : D_k(x) \ne D_k(y)\}.$$

Hence, for every $\ell \ge 0$ and $x \in \mathbf{Z}_2$, the set

$$B_\ell(x) := \{y \in \mathbf{Z}_2 : \forall k \in [[0, \ell - 1]],\ D_k(y) = D_k(x)\}$$

is at the same time the open ball $B(x, 2^{1-\ell})$ and the closed ball $B(x, 2^{-\ell})$. For every $\ell \in \mathbf{Z}_+$, the balls $(B_\ell(r))_{r\in[[0,2^\ell-1]]}$ form a partition of $\mathbf{Z}_2$. Therefore, the complete metric space $(\mathbf{Z}_2, d_2)$ is precompact hence compact.


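Given two elements in $\mathbf{Z}_2$, namely

$$x = \sum_{k=0}^{+\infty} \xi_k 2^k \quad\text{and}\quad y = \sum_{k=0}^{+\infty} \eta_k 2^k,$$

the sum $x + y \in \mathbf{Z}_2$ is given by $x + y = \Phi_\infty((\zeta_n)_{n\ge 0})$, where the sequences $(\zeta_n)_{n\ge 0}$ and $(\gamma_n)_{n\ge 0}$ are defined inductively by

$$\zeta_0 = (\xi_0 + \eta_0) \bmod 2 \quad\text{and}\quad \gamma_0 = (\xi_0 + \eta_0) \operatorname{div} 2$$

and for every $k \ge 1$,

$$\zeta_k = (\xi_k + \eta_k + \gamma_{k-1}) \bmod 2 \quad\text{and}\quad \gamma_k = (\xi_k + \eta_k + \gamma_{k-1}) \operatorname{div} 2.$$

The sequence $(\gamma_k)_{k\ge 0}$ indicates the successive carries in the addition.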
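The addition with carries can be sketched on truncated digit sequences. In the illustration below (function names are ours), a dyadic integer is represented by its first `length` binary digits, least significant first; this truncation is an assumption of the sketch, since an element of $\mathbf{Z}_2$ has infinitely many digits.

```python
def digits(x: int, length: int):
    """First `length` binary digits D_0(x), ..., D_{length-1}(x) of the integer x.
    Negative integers are handled through Python's two's-complement shifts."""
    return [(x >> k) & 1 for k in range(length)]


def add_digits(xi, eta):
    """Digits of x + y computed with the carry recursion, truncated to len(xi)."""
    zeta, carry = [], 0
    for a, b in zip(xi, eta):
        total = a + b + carry
        zeta.append(total % 2)   # zeta_k = (xi_k + eta_k + gamma_{k-1}) mod 2
        carry = total // 2       # gamma_k = (xi_k + eta_k + gamma_{k-1}) div 2
    return zeta


if __name__ == "__main__":
    # -1 has all its digits equal to 1, and adding 1 gives 0.
    print(digits(-1, 8), add_digits(digits(-1, 8), digits(1, 8)))
```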

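One checks that the distance $d_2$ on $\mathbf{Z}_2$ is invariant by translation, so $(\mathbf{Z}_2, d_2)$ is a compact metric additive group. The Borel σ-field and the Haar measure on $\mathbf{Z}_2$ are

$$\mathcal{B}(\mathbf{Z}_2) = \Phi_\infty\Big(\bigotimes_{n\ge 0} \mathcal{P}(\{0, 1\})\Big) \quad\text{and}\quad \mu = \Phi_\infty\Big(\bigotimes_{n\ge 0} \frac{\delta_0 + \delta_1}{2}\Big).$$

Hence $(D_n)_{n\ge 0}$ is a sequence of independent random variables, uniformly distributed on $\{0, 1\}$, on the probability space $(\mathbf{Z}_2, \mathcal{B}(\mathbf{Z}_2), \mu)$. In particular, $\mu(B_\ell(x)) = 2^{-\ell}$ for every $\ell \ge 0$ and $x \in \mathbf{Z}_2$.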

The dyadic odometer

The dyadic odometer is the map $T : x \mapsto x + 1$ from $\mathbf{Z}_2$ to $\mathbf{Z}_2$. We denote by $U_T$ the Koopman operator associated with $T$, defined on $L^2(\mu)$ by $U_T f = f \circ T$. Let $U$ be the set of all complex numbers of modulus 1 and $\Delta = \{p/2^n : n \ge 0 \text{ and } p \in [[0, 2^n - 1]]\}$.

1. Why is $T$ ergodic?

2. Check that $T^{2^n} \to \mathrm{Id}_{\mathbf{Z}_2}$ uniformly on $\mathbf{Z}_2$ as $n \to +\infty$ and deduce that for every $f \in L^2(\mu)$, $U_T^{2^n} f \to f$ as $n \to +\infty$. This shows that the map $T$ is rigid.

3. Given $n \in \mathbf{N}$ and $p \in [[0, 2^n - 1]]$, check that the map

$$f_{n,p} = \sum_{q=0}^{2^n-1} e^{i2\pi pq/2^n}\,1_{B_n(q)}$$

is an eigenvector of $U_T$ and a group morphism from $(\mathbf{Z}_2, +)$ to $(U, \times)$.

4. Check that $f_{n,p}$ depends only on the ratio $p/2^n$, so we can set $\chi_{p/2^n} = f_{n,p}$.

5. Prove that the maps $\chi_{p/2^n}$ thus defined form an orthonormal basis of $L^2(\mu)$.

6. Given $f \in L^2(\mu)$, express $\|U_T^{2^n} f - f\|_2^2$ as a function of the Fourier coefficients $(\langle \chi_r, f\rangle)_{r\in\Delta}$.
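Questions 3 and 4 can be explored numerically on ordinary integers, which form a dense subset of $\mathbf{Z}_2$. The sketch below is only an illustration and not a proof: it uses the fact that, on the ball $B_n(q)$, an integer $x$ satisfies $x \bmod 2^n = q$, and the function name `f` is ours.

```python
import cmath


def f(n: int, p: int, x: int) -> complex:
    """Value of f_{n,p} at an integer x: it only depends on q = x mod 2**n."""
    q = x % (2 ** n)
    return cmath.exp(2j * cmath.pi * p * q / 2 ** n)


if __name__ == "__main__":
    n, p = 3, 5
    eigenvalue = cmath.exp(2j * cmath.pi * p / 2 ** n)
    for x in (-3, 0, 7, 12):
        # Eigenvector relation f(x + 1) = e^{i 2 pi p / 2^n} f(x).
        print(x, abs(f(n, p, x + 1) - eigenvalue * f(n, p, x)))
```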


A not strongly mixing map...

For every integer $\ell \ge 0$ and $x \in \mathbf{Z}_2$, let $N_\ell(x) = \inf\{k \ge \ell : D_k(x) = 0\}$. Note that $N_0(x) = 0$ or $N_0(x) = N_1(x)$.

We introduce the sets

$$A = \{x \in \mathbf{Z}_2 : N_0(x) \text{ is even}\},$$

$$A_1 = \{x \in \mathbf{Z}_2 : N_1(x) \text{ is even}\},$$

$$A_2 = \{x \in \mathbf{Z}_2 : N_0(x) = 0 \text{ and } N_1(x) \text{ is odd or infinite}\}.$$

Note that $\{A_1, A_2\}$ is a partition of $A$.

We call uniform law on $A$ the probability measure $\mu_A = \mu(\cdot \mid A)$ and denote by $T_A$ the $\mu_A$-preserving map induced by $T$ on $A$.

1. Compute the law of $N_m$ and the value $\mu(A)$.

2. Check the inclusions $T(A_1) \subset A$, $T(A_2) \subset A^c$ and $T(A^c) \subset A$. Hence $T_A(x) = T(x)$ if $x \in A_1$, whereas $T_A(x) = T^2(x)$ if $x \in A_2$.

3. Let $B = B_2(0) = [D_0 = D_1 = 0]$.

(a) Let $x \in B$ and $n \ge 1$. Check that $T^{2^{2n}}(x) \in B$ and $T^{2^{2n}}(x) = T_A^{S_n(x)}(x)$, where

$$S_n(x) = \sum_{k=0}^{2^{2n}-1} 1_A(T^k(x)).$$

(b) Let $n \ge 1$ and $s_n = 2^{2n-1} + 2^{2n-3} + \cdots + 2^1$. Prove that

$$\mu[S_n = s_n \mid B] = 1/3 \quad\text{and}\quad \mu[S_n = s_n + 1 \mid B] = 2/3.$$

Hint: consider the partition $(B_{2n}(r))_{r\in[[0,2^{2n}-1]]}$ of $\mathbf{Z}_2$ into $2^{2n}$ balls. Given $x \in \mathbf{Z}_2$, check the following statements.

i. Each ball contains exactly one element among $x, T(x), \ldots, T^{2^{2n}-1}(x)$.

ii. For each $r \in [[0, 2^{2n} - 2]]$, the function $1_A$ is constant on $B_{2n}(r)$.

iii. The exponent $K(x) \in [[0, 2^{2n} - 1]]$ such that $T^{K(x)}(x) \in B_{2n}(2^{2n} - 1)$ depends only on $(D_0(x), \ldots, D_{2n-1}(x))$, whereas $T^{K(x)}(x)$ depends only on $(D_k(x))_{k\ge 2n}$. Furthermore, $N_0(T^{K(x)}(x)) = N_{2n}(x)$.

(c) Deduce that for every $n \ge 1$, $\mu(T_A^{-(s_n+1)}(B) \cap B) \ge (2/3)\mu(B)$ and that $T_A$ is not strongly mixing.


... which is weakly mixing

Keep the notations of the previous part. Our purpose is now to prove that $T_A$ is weakly mixing.

Since $T_A$ is ergodic (because $T$ is ergodic), we only have to check that 1 is the only eigenvalue of the Koopman operator $U_{T_A}$. So we consider a unit eigenfunction $f_A$, and we denote by $\zeta$ a square root of the corresponding eigenvalue, so $U_{T_A} f_A = \zeta^2 f_A$.

We extend $f_A$ into a function $f$ from $\mathbf{Z}_2$ to $\mathbf{C}$ by setting $f(x) = f_A(x)$ if $x \in A$, $f(x) = \zeta^{-1} f_A(T(x))$ if $x \in A^c$.

1. Prove that $|f| = 1$ $\mu$-almost surely.

2. Let $h = 2 \cdot 1_{A_1} + 1_{A_1^c}$. Check that $f \circ T = \zeta^h f$.

3. Deduce that for every $n \ge 1$, $f \circ T^{2^{2n}} = \zeta^{H_n} f$, where

$$H_n = \sum_{k=0}^{2^{2n}-1} h \circ T^k.$$

4. Check that for each $r \in [[0, 2^{2n} - 3]]$, the function $h$ is constant on $B_{2n}(r)$.

5. Prove that

$$\mu[H_n = 2s_n] \to 1/3 \quad\text{and}\quad \mu[H_n = 2s_n + 2] \to 2/3 \text{ as } n \to +\infty.$$

Hint: given $x \in \mathbf{Z}_2$, denote by $K'(x) \in [[0, 2^{2n} - 1]]$ the exponent such that $T^{K'(x)}(x) \in B_{2n}(2^{2n} - 2)$. If $x \notin B_{2n}(2^{2n} - 1)$, check that $K(x) = K'(x) + 1$, and $N_1(T^{K'(x)}(x)) = N_1(T^{K(x)}(x)) = N_{2n}(x)$.

6. Deduce that $\zeta^2 = 1$. Hint: for every $n \ge 1$,

$$\|f \circ T^{2^{2n}} - f\|_2^2 = \int_{\mathbf{Z}_2} |\zeta^{H_n} - 1|^2\,d\mu.$$


Solution - the dyadic odometer

1. Since the subgroup generated by 1, namely $\mathbf{Z}$, is dense in the compact group $(\mathbf{Z}_2, +)$, the translation $T$ is uniquely ergodic, so $(T, \mu)$ is ergodic.

2. Let $n \ge 0$. For every $x \in \mathbf{Z}_2$, $T^{2^n}(x) = x + 2^n$, hence $d_2(x, T^{2^n}(x)) = 2^{-n}$. Thus $T^{2^n} \to \mathrm{Id}_{\mathbf{Z}_2}$ uniformly on $\mathbf{Z}_2$ as $n \to +\infty$.

For every $f \in C(\mathbf{Z}_2)$, $f$ is uniformly continuous by compactness of $\mathbf{Z}_2$, so $U_T^{2^n} f \to f$ uniformly and therefore in $L^2(\mu)$ as $n \to +\infty$.

By equicontinuity of the sequence $(U_T^{2^n})_{n\ge 0}$ and by density of $C(\mathbf{Z}_2)$ in $L^2(\mu)$, we deduce that for every $f \in L^2(\mu)$, $U_T^{2^n} f \to f$ as $n \to +\infty$.

3. Let $n \in \mathbf{N}$ and $p \in [[0, 2^n - 1]]$.

Since $|f_{n,p}| = 1$, one has $\|f_{n,p}\|_2^2 = 1$. Since $T$ is an isometry of $(\mathbf{Z}_2, d_2)$, one has $T^{-1}(B_n(q)) = B_n(q - 1)$ for every $q \in \mathbf{Z}_2$. Noting that $B_n(-1) = B_n(2^n - 1)$ and $e^{i2\pi p\times 0/2^n} = e^{i2\pi p\times 2^n/2^n}$, one gets

$$\begin{aligned}
f_{n,p} \circ T &= \sum_{q=0}^{2^n-1} e^{i2\pi pq/2^n}\,1_{B_n(q)} \circ T \\
&= \sum_{q=0}^{2^n-1} e^{i2\pi pq/2^n}\,1_{B_n(q-1)} \\
&= \sum_{q=-1}^{2^n-2} e^{i2\pi p(q+1)/2^n}\,1_{B_n(q)} = e^{i2\pi p/2^n} f_{n,p}.
\end{aligned}$$

Therefore, $f_{n,p}$ is a unit eigenvector of $U_T$.

Let $x$ and $y$ be in $\mathbf{Z}_2$. Call $r$ the integer in $[[0, 2^n - 1]]$ such that $y \in B_n(r)$. The dyadic expansion of $r$ is given by $D_k(r) = D_k(y)$ if $0 \le k \le n - 1$ and $D_k(r) = 0$ if $k \ge n$. The addition formulas show that the first $n$ digits of $x + y$ are also the first $n$ digits of $x + r = T^r(x)$. Hence

$$f_{n,p}(x + y) = f_{n,p}(T^r(x)) = e^{i2\pi pr/2^n} f_{n,p}(x) = f_{n,p}(x)\,f_{n,p}(y).$$

Hence $f_{n,p}$ is a group morphism from $(\mathbf{Z}_2, +)$ to $(U, \times)$.

4. Let $n \in \mathbf{N}$ and $p \in [[0, 2^n - 1]]$. Then

$$\begin{aligned}
f_{n+1,2p} &= \sum_{q=0}^{2^{n+1}-1} e^{i2\pi pq/2^n}\,1_{B_{n+1}(q)} \\
&= \sum_{q=0}^{2^n-1} \big(e^{i2\pi pq/2^n}\,1_{B_{n+1}(q)} + e^{i2\pi p(q+2^n)/2^n}\,1_{B_{n+1}(q+2^n)}\big) \\
&= \sum_{q=0}^{2^n-1} e^{i2\pi pq/2^n}\big(1_{B_{n+1}(q)} + 1_{B_{n+1}(q+2^n)}\big) \\
&= \sum_{q=0}^{2^n-1} e^{i2\pi pq/2^n}\,1_{B_n(q)} = f_{n,p},
\end{aligned}$$

since for each $q \in [[0, 2^n - 1]]$, $\{B_{n+1}(q), B_{n+1}(q + 2^n)\}$ is a partition of $B_n(q)$.

A recursion shows that $f_{n+k,2^k p} = f_{n,p}$ for every positive integer $k$. We deduce that $f_{n,p}$ depends only on the ratio $p/2^n$ by writing the fraction $p/2^n$ in an irreducible form $p'/2^{n'}$ with $n' \ge 0$ and $p' \in [[0, 2^{n'} - 1]]$.

5. For every $r \in \Delta$, (the class of) $\chi_r$ is a unit eigenvector of the unitary operator $U_T$, associated with the eigenvalue $e^{i2\pi r}$. Since $\Delta \subset [0, 1[$, the eigenvalues $(e^{i2\pi r})_{r\in\Delta}$ are pairwise distinct, so $(\chi_r)_{r\in\Delta}$ are pairwise orthogonal.

Call $F$ the closure of the vector space spanned by $(\chi_r)_{r\in\Delta}$. It suffices to check that $F = L^2(\mu)$ to prove that $(\chi_r)_{r\in\Delta}$ is a Hilbert basis of $L^2(\mu)$.

For every integer $n \ge 0$, the matrix of $(\chi_{p/2^n})_{0\le p\le 2^n-1}$ in $(1_{B_n(q)})_{0\le q\le 2^n-1}$, namely $(e^{i2\pi pq/2^n})_{0\le q,p\le 2^n-1}$, is invertible, so $F$ contains $(1_{B_n(q)})_{0\le q\le 2^n-1}$. Hence, $F$ contains the indicator of each ball of $(\mathbf{Z}_2, d_2)$.

Therefore, $F$ contains $C(\mathbf{Z}_2)$, since for every $f \in C(\mathbf{Z}_2)$, $f$ is uniformly continuous, so $f$ is the uniform limit and also the $\|\cdot\|_2$-limit of the sequence $(f_n)_{n\ge 0}$ given by

$$f_n = \sum_{q=0}^{2^n-1} f(q)\,1_{B_n(q)}.$$

But $C(\mathbf{Z}_2)$ is dense in $L^2(\mu)$, hence $F = L^2(\mu)$.

Alternative argument: the set $\mathcal{S}$ of all balls of $\mathbf{Z}_2$, together with $\emptyset$, is a semi-algebra: it contains $\emptyset$ and $\mathbf{Z}_2$, is closed under intersection, and the difference of any two elements of $\mathcal{S}$ can be written as a finite disjoint union of elements of $\mathcal{S}$. Therefore, the set $\mathcal{A}$ of all finite unions of elements of $\mathcal{S}$ is an algebra and each element of $\mathcal{A}$ can be written as a finite disjoint union of elements of $\mathcal{S}$.

Therefore $F$ contains the indicator function of every element of $\mathcal{A}$. Since the algebra $\mathcal{A}$ generates $\mathcal{B}(\mathbf{Z}_2)$ (because $\mathcal{S}$ is a countable basis of $\mathbf{Z}_2$), $F$ contains the indicator function of every Borel subset, and by linearity, every simple function. Simple functions are dense in $L^2(\mu)$, hence $F = L^2(\mu)$.

Alternative argument: the functions $(\chi_r)_{r\in\Delta}$ are continuous since the sets $B_n(q)$ ($n \ge 1$ and $q \in [[0, 2^n - 1]]$) are clopen. One checks that $\chi_0 = 1$, and that for every $r$ and $r'$ in $\Delta$, $\chi_r\chi_{r'} = \chi_{r+r'-\lfloor r+r'\rfloor}$ and $\overline{\chi_r} = \chi_{-r-\lfloor -r\rfloor}$ (to prove the first equality, write $r$ and $r'$ as fractions with the same denominator $2^n$). Therefore, the vector space generated by $(\chi_r)_{r\in\Delta}$ is a sub-algebra of $C(\mathbf{Z}_2)$ which is stable by conjugation. If $x \ne y$ are in $\mathbf{Z}_2$ and $n = m(x, y)$, then $D_0(x - y) = \ldots = D_{n-1}(x - y) = 0$ and $D_n(x - y) = 1$, so $x - y \in B_{n+1}(2^n)$ and $\chi_{1/2^{n+1}}(x)/\chi_{1/2^{n+1}}(y) = \chi_{1/2^{n+1}}(x - y) = -1$. Therefore this algebra separates the points of the compact space $\mathbf{Z}_2$. By the Stone-Weierstrass theorem, this algebra is dense in $C(\mathbf{Z}_2)$ and therefore in $L^2(\mu)$.

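6. Let $f \in L^2(\mu)$. Since

$$f = \sum_{r\in\Delta} \langle \chi_r, f\rangle\,\chi_r \quad\text{and}\quad U_T^{2^n} f = \sum_{r\in\Delta} \langle \chi_r, f\rangle\,(e^{i2\pi r})^{2^n}\chi_r,$$

one has

$$\|U_T^{2^n} f - f\|_2^2 = \sum_{r\in\Delta} |e^{i2^{n+1}\pi r} - 1|^2\,|\langle \chi_r, f\rangle|^2 = \sum_{r\in\Delta} 4\sin^2(2^n\pi r)\,|\langle \chi_r, f\rangle|^2.$$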


Solution - A not strongly mixing map...

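1. Fix $m \ge 0$. For every $n \ge m$,

$$\mu[N_m \ge n] = \mu[D_m = \ldots = D_{n-1} = 1] = (1/2)^{n-m},$$

so

$$\mu[N_m = n] = \mu[N_m \ge n] - \mu[N_m \ge n + 1] = (1/2)^{n-m+1}$$

and $\mu[N_m = +\infty] = 0$. Last,

$$\mu(A) = \sum_{k=0}^{+\infty} \mu[N_0 = 2k] = \sum_{k=0}^{+\infty} \frac{1}{2^{2k+1}} = \frac{2}{3}.$$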

2. Let $x \in A_1$. Then $N_1(x)$ is even.

• If $D_0(x) = 0$, then $D_0(x + 1) = 1$ and $D_n(x + 1) = D_n(x)$ for every $n \ge 1$, so $N_0(x + 1) = N_1(x + 1) = N_1(x)$.

• If $D_0(x) = 1$, then $D_0(x + 1) = 0$, so $N_0(x + 1) = 0$.

In both cases, $N_0(x + 1)$ is even so $x + 1 \in A$. Hence $T(A_1) \subset A$.

Let $x \in A_2$. Then $D_0(x) = 0$ and $N_1(x)$ is odd or infinite, so $D_0(x + 1) = 1$ and $D_n(x + 1) = D_n(x)$ for every $n \ge 1$. Thus $N_0(x + 1) = N_1(x + 1) = N_1(x)$ is odd or infinite, so $x + 1 \in A^c$. Hence $T(A_2) \subset A^c$.

Let $x \in A^c$. Since $N_0(x) \ne 0$, we have $D_0(x) = 1$, so $D_0(x + 1) = 0$. Thus $N_0(x + 1) = 0$, so $x + 1 \in A$. Hence $T(A^c) \subset A$.

3. (a) Let $x \in B$ and $n \ge 1$. Since $x + 2^{2n}$ and $x$ have the same first $2n$ digits, we have $T^{2^{2n}}(x) \in B$.

In particular $x$ and $T^{2^{2n}}(x)$ are in $A$, so $T^{2^{2n}}(x) = T_A^m(x)$ for some integer $m \ge 1$. The exponent $2^{2n}$ is the $m$-th return time in $A$ from $x$, so $m$ is exactly the number of $k \in [[0, 2^{2n} - 1]]$ such that $T^k(x) \in A$, namely the sum

$$S_n(x) = \sum_{k=0}^{2^{2n}-1} 1_A(T^k(x)).$$

Hence $T^{2^{2n}}(x) = T_A^{S_n(x)}(x)$.

(b) i. Fix $x \in \mathbf{Z}_2$. Let $p \in [[0, 2^{2n} - 1]]$ be the integer such that $x \in B_{2n}(p)$. Then for every $q \in [[0, 2^{2n} - 1]]$, $T^q(x) = x + q \in B_{2n}(p + q) = B_{2n}((p + q) \bmod 2^{2n})$. Since the map $q \mapsto (p + q) \bmod 2^{2n}$ is a permutation of $[[0, 2^{2n} - 1]]$, each one of the balls $(B_{2n}(r))_{r\in[[0,2^{2n}-1]]}$ contains exactly one element among $x, T(x), \ldots, T^{2^{2n}-1}(x)$.

ii. Let $r \in [[0, 2^{2n} - 2]]$. The functions $D_0, \ldots, D_{2n-1}$ are constant on $B_{2n}(r)$ and at least one of them is null on $B_{2n}(r)$. Hence $N_0$ is constant on $B_{2n}(r)$ and at most equal to $2n - 1$. Therefore, $1_A$ is constant on $B_{2n}(r)$, equal to $1_A(r)$.

iii. Given $x \in \mathbf{Z}_2$, the only integer $p \in [[0, 2^{2n} - 1]]$ such that $x \in B_{2n}(p)$ is

$$P(x) := \sum_{k=0}^{2n-1} D_k(x)\,2^k,$$

and the exponent $k$ such that $T^k(x) \in B_{2n}(2^{2n} - 1)$ is

$$K(x) := 2^{2n} - 1 - P(x) = \sum_{k=0}^{2n-1} (1 - D_k(x))\,2^k.$$

The addition formulas show that

$$T^{K(x)}(x) = x + K(x) = \sum_{k=0}^{2n-1} 2^k + \sum_{k=2n}^{+\infty} D_k(x)\,2^k.$$

Furthermore, $N_0(T^{K(x)}(x)) = N_{2n}(x)$.

From the statements i, ii, iii, we get

$$S_n(x) = \sum_{r=0}^{2^{2n}-2} 1_A(r) + 1_{[N_{2n}(x) \text{ is even}]}.$$

Given $r \in [[0, 2^{2n} - 2]]$, a necessary and sufficient condition for $r$ to be in $A$ is that the $2n$-tuple $(D_0(r), \ldots, D_{2n-1}(r))$ begins with $0$, or with $(1, 1, 0)$, or with $(1, 1, 1, 1, 0)$, and so on. The number of such $r$ is exactly $2^{2n-1} + 2^{2n-3} + \cdots + 2 = s_n$. Besides, one checks that $\mu\{x \in \mathbf{Z}_2 : N_{2n}(x) \text{ is even}\} = 2/3$. As a result,

$$\mu[S_n = s_n \mid B] = 1/3 \quad\text{and}\quad \mu[S_n = s_n + 1 \mid B] = 2/3.$$
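The counting argument above can be double-checked by brute force for small $n$. The sketch below is a sanity check only (the function names are ours): for a non-negative integer $r$, $N_0(r)$ is the index of its lowest binary digit equal to 0, and we count the $r \in [[0, 2^{2n} - 2]]$ for which this index is even.

```python
def lowest_zero_index(r: int) -> int:
    """N_0(r) for a non-negative integer r: index of the lowest binary digit equal to 0."""
    k = 0
    while (r >> k) & 1:
        k += 1
    return k


def s(n: int) -> int:
    """s_n = 2^(2n-1) + 2^(2n-3) + ... + 2^1."""
    return sum(2 ** (2 * j - 1) for j in range(1, n + 1))


def count_in_A(n: int) -> int:
    """Number of r in [[0, 2^(2n) - 2]] such that N_0(r) is even."""
    return sum(1 for r in range(2 ** (2 * n) - 1) if lowest_zero_index(r) % 2 == 0)


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, s(n), count_in_A(n))
```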

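(c) Let $n \ge 1$. Since $T_A^{S_n(x)}(x) \in B$ for every $x \in B$, one gets the inclusion

$$\begin{aligned}
T_A^{-(s_n+1)}(B) \cap B &= \{x \in B : T_A^{s_n+1}(x) \in B\} \\
&\supset \{x \in B : S_n(x) = s_n + 1\} \\
&= B \cap [S_n = s_n + 1] \\
&= [D_0 = D_1 = 0] \cap [N_{2n} \text{ is even}].
\end{aligned}$$

The events $B = [D_0 = D_1 = 0]$ and $[N_{2n} \text{ is even}]$ are independent, since $N_{2n}$ is a function of $(D_k)_{k \ge 2n}$. Hence $\mu(T_A^{-(s_n+1)}(B) \cap B) \ge (2/3)\mu(B)$, so

$$\mu_A(T_A^{-(s_n+1)}(B) \cap B) \ge (2/3)\mu_A(B).$$

But $\mu_A(B) = (1/4)/(2/3) = 3/8 < 2/3$, so

$$\limsup_{m \to +\infty} \mu_A(T_A^{-m}(B) \cap B) \ge \limsup_{n \to +\infty} \mu_A(T_A^{-(s_n+1)}(B) \cap B) \ge (2/3)\mu_A(B) > \mu_A(B)^2.$$

Hence $T_A$ is not strongly mixing.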
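The arithmetic behind the last inequality can be made explicit with exact fractions. This is only a small sketch: the values $\mu(B) = 1/4$ and $\mu(A) = 2/3$ are those used in the text, and the variable names are illustrative.

```python
from fractions import Fraction

mu_B = Fraction(1, 4)    # B = [D_0 = D_1 = 0]
mu_A = Fraction(2, 3)    # mu(A), as used in the text
mu_A_B = mu_B / mu_A     # mu_A(B) = mu(B) / mu(A), since B is a subset of A

lower_bound = Fraction(2, 3) * mu_A_B   # lower bound along the times s_n + 1
mixing_limit = mu_A_B ** 2              # limit that strong mixing would impose

print(mu_A_B, lower_bound, mixing_limit)  # 3/8 1/4 9/64
assert lower_bound > mixing_limit
```

Strong mixing would force $\mu_A(T_A^{-m}(B) \cap B) \to \mu_A(B)^2 = 9/64$, which is incompatible with a lower bound of $1/4$ along a subsequence.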

... which is weakly mixing

1. Since $T$ is ergodic and invertible, $T_A$ is ergodic and invertible with inverse $(T^{-1})_A$, so $U_{T_A}$ is unitary. Thus $|\zeta^2| = 1$ and $|f_A| \circ T_A = |f_A \circ T_A| = |f_A|$ $\mu_A$-almost surely, so $|f_A|$ is $\mu_A$-almost surely constant. The constant must be 1 since we assumed that $\|f_A\| = 1$. By definition of $f$, we get that $|f| = 1$ $\mu$-almost surely on $A$, and $A^c \cap [|f| \ne 1] \subset T^{-1}(A \cap [|f| \ne 1])$, so $|f| = 1$ $\mu$-almost surely on $A^c$ too. Hence $|f| = 1$ $\mu$-almost surely.


2. Let $x \in \mathbf{Z}_2$.

If $x \in A_1$, then $T(x) \in A$, so

$$f(T(x)) = f_A(T_A(x)) = \zeta^2 f_A(x) = \zeta^2 f(x).$$

If $x \in A_2$, then $T(x) \in A^c$, so

$$f(T(x)) = \zeta^{-1} f_A(T^2(x)) = \zeta^{-1} f_A(T_A(x)) = \zeta f_A(x) = \zeta f(x).$$

If $x \in A^c$, then $T(x) \in A$, so

$$f(T(x)) = f_A(T(x)) = \zeta f(x).$$

In all cases, $f(T(x)) = \zeta^{h(x)} f(x)$. Hence $f \circ T = \zeta^h f$.

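3. An induction shows that for every $n \ge 1$,

$$f \circ T^n = \zeta^{h + (h \circ T) + \cdots + (h \circ T^{n-1})} f.$$

Hence for every $n \ge 1$,

$$f \circ T^{2^{2n}} = \zeta^{H_n} f, \quad \text{where } H_n = \sum_{k=0}^{2^{2n}-1} h \circ T^k.$$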

4. Let $r \in [[0, 2^{2n}-3]]$. On $B_{2n}(r)$, the functions $D_1, \ldots, D_{2n-1}$ are constant, and at least one is null, so $N'$ is constant and at most equal to $2n-1$. Hence $1_{A_1}$ and $h = 1 + 1_{A_1}$ are constant on $B_{2n}(r)$.

5. Let $x \in \mathbf{Z}_2$. Each one of the balls $(B_{2n}(r))_{r \in [[0, 2^{2n}-1]]}$ contains exactly one element among $x, T(x), \ldots, T^{2^{2n}-1}(x)$. Let $K'(x) \in [[0, 2^{2n}-1]]$ and $K(x) \in [[0, 2^{2n}-1]]$ be the exponents such that $T^{K'(x)}(x) \in B_{2n}(2^{2n}-2)$ and $T^{K(x)}(x) \in B_{2n}(2^{2n}-1)$. Then

$$\begin{aligned}
H_n(x) &= \sum_{r=0}^{2^{2n}-3} h(r) + h(T^{K'(x)}(x)) + h(T^{K(x)}(x)) \\
&= 2^{2n} + \sum_{r=0}^{2^{2n}-3} 1_{A_1}(r) + 1_{A_1}(T^{K'(x)}(x)) + 1_{A_1}(T^{K(x)}(x)).
\end{aligned}$$

The number of $r \in [[0, 2^{2n}-3]]$ which belong to $A_1$ is $2(2^{2n-3} + 2^{2n-5} + \cdots + 2) = 2s_n - 2^{2n}$. Indeed, a necessary and sufficient condition for $r$ to be in $A_1$ is that the $(2n-1)$-tuple $(D_1(r), \ldots, D_{2n-1}(r))$ begins with $(1,0)$, or with $(1,1,1,0)$, or with $(1,1,1,1,1,0)$, and so on, whereas the digit $D_0(r)$ plays no role. Hence

$$H_n(x) = 2s_n + 1_{A_1}(T^{K'(x)}(x)) + 1_{A_1}(T^{K(x)}(x)).$$

As before, denote by $P(x)$ the unique $p \in [[0, 2^{2n}-1]]$ such that $x \in B_{2n}(p)$.

If $x \in B_{2n}(2^{2n}-1)^c$, then $P(x) \le 2^{2n}-2$. Thus $K'(x) = 2^{2n} - 2 - P(x)$ and $K(x) = 2^{2n} - 1 - P(x)$,

$$T^{K'(x)}(x) = \sum_{k=1}^{2n-1} 2^k + \sum_{k=2n}^{+\infty} D_k(x)\, 2^k \quad \text{and} \quad T^{K(x)}(x) = \sum_{k=0}^{2n-1} 2^k + \sum_{k=2n}^{+\infty} D_k(x)\, 2^k.$$


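Hence $N_1(T^{K'(x)}(x)) = N_1(T^{K(x)}(x)) = N_{2n}(x)$, so

$$\forall x \in B_{2n}(2^{2n}-1)^c, \quad H_n(x) = 2s_n + 2 \cdot 1_{[N_{2n}(x) \text{ is even}]}.$$

Since $\mu[N_{2n} \text{ is even}] = 2/3$ and $\mu(B_{2n}(2^{2n}-1)) = 2^{-2n} \to 0$ as $n \to +\infty$, we get

$$\mu[H_n = 2s_n] \to 1/3 \quad \text{and} \quad \mu[H_n = 2s_n + 2] \to 2/3 \quad \text{as } n \to +\infty.$$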
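The formula for $H_n$ can be tested on small cases. The sketch below is not part of the original correction; it assumes that $T(x) = x + 1$, that $A = [N_0 \text{ is even}]$ with $N_k(x)$ the index of the first zero binary digit of $x$ at a position $\ge k$, and that $A_1 = \{x \in A : T(x) \in A\}$, so that $h = 1 + 1_{A_1}$. Only non-negative integers $x$ are tested, and the helper names are illustrative.

```python
def first_zero_index(x: int, start: int = 0) -> int:
    """Index of the first zero binary digit of x at a position >= start (assumed N_start)."""
    k = start
    while (x >> k) & 1:
        k += 1
    return k

def in_A(x: int) -> bool:
    """Assumed definition: x is in A when N_0(x) is even."""
    return first_zero_index(x) % 2 == 0

def h(x: int) -> int:
    """h = 1 + 1_{A_1}, with A_1 = {x in A : x + 1 in A} (assumed)."""
    return 1 + (in_A(x) and in_A(x + 1))

def s(n: int) -> int:
    """s_n = 2^(2n-1) + 2^(2n-3) + ... + 2."""
    return sum(2 ** (2 * j - 1) for j in range(1, n + 1))

def H(n: int, x: int) -> int:
    """H_n(x) = sum of h(x + k) over k in [0, 2^(2n) - 1]."""
    return sum(h(x + k) for k in range(4 ** n))

for n in (1, 2, 3):
    for x in range(500):
        if x % 4 ** n == 4 ** n - 1:
            continue  # x lies in B_2n(2^(2n) - 1), which the formula excludes
        even = first_zero_index(x, 2 * n) % 2 == 0
        assert H(n, x) == 2 * s(n) + 2 * even
```

For instance, with $n = 1$ one finds $H_1(0) = 6 = 2s_1 + 2$ and $H_1(4) = 4 = 2s_1$, in agreement with the parity of $N_2$ at these points.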

6. Let $n \ge 1$. Since $f \circ T^{2^{2n}} = \zeta^{H_n} f$ and $|f| = 1$ $\mu$-almost surely,

$$\begin{aligned}
\|f \circ T^{2^{2n}} - f\|_2^2 &= \int_{\mathbf{Z}_2} |\zeta^{H_n} - 1|^2 \, d\mu \\
&= |\zeta^{2s_n} - 1|^2 \mu[H_n = 2s_n] \\
&\quad + |\zeta^{2s_n+1} - 1|^2 \mu[H_n = 2s_n + 1] \\
&\quad + |\zeta^{2s_n+2} - 1|^2 \mu[H_n = 2s_n + 2].
\end{aligned}$$

As $n \to +\infty$, the left-hand side goes to 0, whereas the right-hand side is the sum of three non-negative terms, so each of them goes to 0. Since $\mu[H_n = 2s_n] \to 1/3$ and $\mu[H_n = 2s_n + 2] \to 2/3$, we get $\zeta^{2s_n} \to 1$ and $\zeta^{2s_n+2} \to 1$, so $\zeta^2 = 1$, by division. This proves that 1 is the only eigenvalue of $U_{T_A}$. Since $T_A$ is ergodic (because $T$ is), $T_A$ is weakly mixing.


Ornstein's criterion for strong mixing

Let $X$ be a separable complete metric space, $\mathcal{X}$ its Borel σ-field and $\mu$ a probability measure on $(X, \mathcal{X})$. Let $T$ be a continuous automorphism of the separable measure space $(X, \mathcal{X}, \mu)$.

We assume that $T^n$ is ergodic for every integer $n \ge 1$, and that we have, for some constant $c \ge 1$ and for every $A$ and $B$ in $\mathcal{X}$,

$$\limsup_{n \to +\infty} \mu(A \cap T^{-n}(B)) \le c\, \mu(A)\mu(B).$$

Our goal is to prove that the dynamical system $(X, \mathcal{X}, \mu, T)$ is strongly mixing.

I. Using Koopman's operator

The goal of this subsection is to prove that the Cartesian square map $T \times T$ on the probability space $(X \times X, \mathcal{X} \otimes \mathcal{X}, \mu \otimes \mu)$ is ergodic. First, we prove that 1 is the only eigenvalue of the unitary operator $U_T : f \mapsto f \circ T$ from $L^2(\mu)$ to $L^2(\mu)$. Suppose, to derive a contradiction, that $U_T$ admits an eigenvalue $\zeta$ different from 1. Let $f \in L^2(\mu)$ be an eigenvector of $U_T$ of eigenvalue $\zeta$, such that $\|f\|_2 = 1$.

1. Check that ζ cannot be a root of 1.

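2. Prove that the measure $f(\mu)$ is the Haar measure on $\mathbf{U}$, where $\mathbf{U}$ denotes the unit circle of $\mathbf{C}$. Hint: use the unique ergodicity of the map $R : z \mapsto \zeta z$ from $\mathbf{U}$ to $\mathbf{U}$.

3. Using a sequence $(q_n)_{n \ge 1}$ of positive integers such that $q_n \to +\infty$ and $\zeta^{q_n} \to 1$ as $n \to +\infty$, obtain a contradiction. Hint: given $\varepsilon \in [0,1]$, set $I_\varepsilon = \{e^{i\theta} : \theta \in [0, 2\pi\varepsilon]\}$ and $A_\varepsilon = B_\varepsilon = f^{-1}(I_\varepsilon)$.

4. Deduce that the Cartesian square map $T \times T$ on $(X \times X, \mathcal{X} \otimes \mathcal{X}, \mu \otimes \mu)$ is ergodic.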

II. Measure-theoretic tools

Given any set $E$, a collection $\mathcal{S}$ of subsets of $E$ is called a semi-algebra on $E$ when

• $\emptyset \in \mathcal{S}$ and $E \in \mathcal{S}$,

• $\mathcal{S}$ is stable under finite intersection,

• for every $A$ and $B$ in $\mathcal{S}$, $A \setminus B$ can be written as a finite union of pairwise disjoint elements of $\mathcal{S}$.

1. Let $\mathcal{S}$ be the collection of all subsets of $X$ that can be written as the intersection of an open subset and a closed subset of $X$. Check that $\mathcal{S}$ is a semi-algebra on $X$.

2. Let $\mathcal{A}$ be the algebra generated by $\mathcal{S}$ and $\mathcal{S}_2$ be the set of all Cartesian products $A \times B$ with $A$ and $B$ in $\mathcal{A}$. Check that $\mathcal{S}_2$ is a semi-algebra on $X^2$.

The next result can be used without proof in the part `End of the proof': if $\mathcal{S}$ is a semi-algebra on $E$, then the set $\mathcal{A}$ of all finite unions of elements of $\mathcal{S}$ is an algebra on $E$, namely the algebra generated by $\mathcal{S}$. Moreover, every element of $\mathcal{A}$ can be written as a disjoint union of finitely many elements of $\mathcal{S}$.


III. End of the proof

Keep the notation of the part `Measure-theoretic tools'. For every $n \ge 0$, call $\nu_n$ the image of $\mu$ by the map $x \mapsto (x, T^n(x))$ from $X$ to $X^2$. Denote by $\Pi_{T \times T}$ the set of all $T \times T$-invariant probability measures on $X^2$.

1. Prove that $\nu_n \in \Pi_{T \times T}$. Hint: one only needs to check that the measures $\nu_n$ and $(T \times T)(\nu_n)$ agree on the rectangles $A \times B$, where $A$ and $B$ are in $\mathcal{X}$.

2. Check that the sequence $(\nu_n)_{n \ge 0}$ is tight. Hint: what are the marginals of $\nu_n$?

3. Let $\nu$ be a limit point of the sequence $(\nu_n)_{n \ge 0}$, namely the limit of some subsequence $(\nu_{q_n})_{n \ge 0}$. By the Portmanteau theorem, we have $\nu(O) \le \liminf \nu_{q_n}(O)$ for every open set $O$ in $X^2$.

(a) Show that $\nu$ is invariant by $T \times T$.

(b) Show that for all open sets $A$ and $B$ in $X$, $\nu(A \times B) \le c(\mu \otimes \mu)(A \times B)$.

(c) Show that the last inequality still holds whenever $A$ and $B$ are in $\mathcal{S}$.

(d) Deduce that $\nu \le c(\mu \otimes \mu)$. Hint: set $\mathcal{M} = \{C \in \mathcal{X} \otimes \mathcal{X} : \nu(C) \le c(\mu \otimes \mu)(C)\}$.

(e) Deduce that $\nu = \mu \otimes \mu$. Hint: use question I.4.

4. Deduce from the previous question that $\nu_n \to \mu \otimes \mu$ narrowly as $n \to +\infty$.

5. Deduce that $(X, \mathcal{X}, \mu, T)$ is strongly mixing. Hint: fix $A$ and $B$ in $\mathcal{X}$ and let $\varepsilon > 0$. Choose two continuous functions $f$ and $g$ from $X$ to $[0,1]$ such that $\|f - 1_A\|_1 \le \varepsilon$ and $\|g - 1_B\|_1 \le \varepsilon$.


Correction

Let $X$ be a separable complete metric space, $\mathcal{X}$ its Borel σ-field and $\mu$ a probability measure on $(X, \mathcal{X})$. Let $T$ be a continuous automorphism of the separable measure space $(X, \mathcal{X}, \mu)$.

We assume that $T^n$ is ergodic for every integer $n \ge 1$, and that we have, for some constant $c \ge 1$ and for every $A$ and $B$ in $\mathcal{X}$,

$$\limsup_{n \to +\infty} \mu(A \cap T^{-n}(B)) \le c\, \mu(A)\mu(B).$$

Our goal is to prove that the dynamical system $(X, \mathcal{X}, \mu, T)$ is strongly mixing.

I. Using Koopman's operator

The goal of this subsection is to prove that the Cartesian square map $T \times T$ on the probability space $(X \times X, \mathcal{X} \otimes \mathcal{X}, \mu \otimes \mu)$ is ergodic. First, we prove that 1 is the only eigenvalue of the unitary operator $U_T : f \mapsto f \circ T$ from $L^2(\mu)$ to $L^2(\mu)$. Suppose, to derive a contradiction, that $U_T$ admits an eigenvalue $\zeta$ different from 1. Let $f \in L^2(\mu)$ be an eigenvector of $U_T$ of eigenvalue $\zeta$, such that $\|f\|_2 = 1$.

1. Since $f$ is an eigenvector of $U_T$ of eigenvalue $\zeta \ne 1$, we have $f \notin \mathbf{C}1$, that is, $f$ is not almost surely constant. If we had $\zeta^n = 1$ for some integer $n \ge 2$, the equality $U_{T^n} f = U_T^n f = \zeta^n f = f$ would contradict the ergodicity of $T^n$. Hence $\zeta$ cannot be a root of 1.

2. The map $R : z \mapsto \zeta z$ from $\mathbf{U}$ to $\mathbf{U}$ is a translation in a compact group. Since $\zeta$ is not a root of 1, the subgroup generated by $\zeta$ is dense in $\mathbf{U}$, so the Haar measure on $\mathbf{U}$ is the unique $R$-invariant probability measure on $\mathbf{U}$. But $R(f(\mu)) = (\zeta f)(\mu) = (f \circ T)(\mu) = f(\mu)$ since $T$ preserves $\mu$. Hence $f(\mu)$ is the Haar measure on $\mathbf{U}$ (denoted by $\nu$ below).

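3. The density of the subgroup generated by $\zeta$ in $\mathbf{U}$ entails the existence of a sequence $(q_n)_{n \ge 1}$ of positive integers such that $q_n \to +\infty$ and $\zeta^{q_n} \to 1$ as $n \to +\infty$. Given $\varepsilon \in [0,1]$, set $I_\varepsilon = \{e^{i\theta} : \theta \in [0, 2\pi\varepsilon]\}$ and $A_\varepsilon = B_\varepsilon = f^{-1}(I_\varepsilon)$. Then $\mu(A_\varepsilon) = \mu(B_\varepsilon) = \nu(I_\varepsilon) = \varepsilon$. But for every $n \ge 1$,

$$T^{-q_n}(B_\varepsilon) = (f \circ T^{q_n})^{-1}(I_\varepsilon) = (\zeta^{q_n} f)^{-1}(I_\varepsilon) = f^{-1}(\zeta^{-q_n} I_\varepsilon),$$

$$A_\varepsilon \cap T^{-q_n}(B_\varepsilon) = f^{-1}(I_\varepsilon \cap \zeta^{-q_n} I_\varepsilon),$$

so

$$\mu(A_\varepsilon \cap T^{-q_n}(B_\varepsilon)) = \nu(I_\varepsilon \cap \zeta^{-q_n} I_\varepsilon) = \int_{\mathbf{U}} 1_{I_\varepsilon}(z)\, 1_{I_\varepsilon}(\zeta^{q_n} z) \, d\nu(z).$$

As $n \to +\infty$, $\zeta^{q_n} \to 1$, so $1_{I_\varepsilon}(\zeta^{q_n} z) \to 1_{I_\varepsilon}(z)$ for $\nu$-almost every $z \in \mathbf{U}$ (the only two exceptions are the extremities of $I_\varepsilon$). Since $0 \le 1_{I_\varepsilon}(z)\, 1_{I_\varepsilon}(\zeta^{q_n} z) \le 1$, Lebesgue's dominated convergence theorem applies, so

$$\limsup_{n \to +\infty} \mu(A_\varepsilon \cap T^{-n}(B_\varepsilon)) \ge \lim_{n \to +\infty} \mu(A_\varepsilon \cap T^{-q_n}(B_\varepsilon)) = \nu(I_\varepsilon) = \varepsilon = \varepsilon^{-1} \mu(A_\varepsilon)\mu(B_\varepsilon).$$

Choosing $0 < \varepsilon < 1/c$ yields a contradiction with the assumption that for every $A$ and $B$ in $\mathcal{X}$, $\limsup_{n \to +\infty} \mu(A \cap T^{-n}(B)) \le c\, \mu(A)\mu(B)$.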

Therefore, 1 is the only eigenvalue of UT .
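The mechanism of this contradiction can be illustrated numerically on the circle. The sketch below is only an illustration under stated assumptions: it identifies $\mathbf{U}$ with $\mathbf{R}/\mathbf{Z}$, takes the hypothetical value $\zeta = e^{2i\pi\theta}$ with $\theta = (\sqrt{5}-1)/2$, uses Fibonacci numbers as the times $q_n$, and restricts to $\varepsilon \le 1/2$ so that the overlap of two arcs has a simple closed form. The function name `overlap` is not from the text.

```python
import math

def overlap(eps: float, t: float) -> float:
    """Normalized Haar measure of the intersection of I_eps with its rotation by 2*pi*t.

    Valid for 0 <= eps <= 1/2: two arcs of length eps on a circle of length 1
    whose starting points are at circular distance d overlap on a length
    max(0, eps - d).
    """
    frac = t % 1.0
    d = min(frac, 1.0 - frac)
    return max(0.0, eps - d)

theta = (math.sqrt(5) - 1) / 2   # hypothetical irrational angle
eps = 0.1

a, b = 1, 2  # consecutive Fibonacci numbers, used here as the times q_n
for _ in range(20):
    q = b
    m = overlap(eps, q * theta)   # nu(I_eps ∩ zeta^{-q} I_eps)
    print(q, m, m / (eps * eps))  # the ratio approaches 1/eps as q grows
    a, b = b, a + b
```

Along these times the measure of the intersection approaches $\varepsilon$ rather than $\varepsilon^2$, so the ratio to $\mu(A_\varepsilon)\mu(B_\varepsilon)$ approaches $1/\varepsilon$, which exceeds any fixed constant $c$ once $\varepsilon < 1/c$.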

4. Since $T$ is ergodic and 1 is the only eigenvalue of $U_T$, the dynamical system $(X, \mathcal{X}, \mu, T)$ is weakly mixing, so its Cartesian square is ergodic.


II. Measure-theoretic tools

Given any set $E$, a collection $\mathcal{S}$ of subsets of $E$ is called a semi-algebra on $E$ when

• $\emptyset \in \mathcal{S}$ and $E \in \mathcal{S}$,

• $\mathcal{S}$ is stable under finite intersection,

• for every $A$ and $B$ in $\mathcal{S}$, $A \setminus B$ can be written as a finite union of pairwise disjoint elements of $\mathcal{S}$.

1. Let $\mathcal{S}$ be the collection of all subsets of $X$ that can be written as the intersection of an open subset and a closed subset of $X$.

Since $X$ is closed in $X$, every open set $O$ in $X$ belongs to $\mathcal{S}$, since $O = O \cap X$.

Since $X$ is open in $X$, every closed set $F$ in $X$ belongs to $\mathcal{S}$, since $F = X \cap F$.

In particular, $\emptyset$ and $X$ belong to $\mathcal{S}$.

Let $A$ and $B$ be two elements of $\mathcal{S}$. Then $A = O_1 \cap F_1$ and $B = O_2 \cap F_2$ where $O_1$ and $O_2$ are open, $F_1$ and $F_2$ are closed.

Therefore $A \cap B = (O_1 \cap O_2) \cap (F_1 \cap F_2)$ belongs to $\mathcal{S}$ since $O_1 \cap O_2$ is open and $F_1 \cap F_2$ is closed, so $\mathcal{S}$ is stable under finite intersection.

Moreover, $B^c = O_2^c \cup F_2^c$ is the disjoint union of $O_2^c$ and $O_2 \cap F_2^c$, so the set

$$A \setminus B = A \cap B^c = (O_1 \cap F_1) \cap (O_2^c \cup F_2^c)$$

is the disjoint union of the subsets $(O_1 \cap F_1) \cap O_2^c$ and $(O_1 \cap F_1) \cap (O_2 \cap F_2^c)$, which are both in $\mathcal{S}$.

Hence $\mathcal{S}$ is a semi-algebra on $X$.
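The decomposition of $A \setminus B$ used above only relies on the axioms of a topology, so it can be tested exhaustively on a small finite topological space. The sketch below uses a hypothetical three-point space (which is not metrizable, but this plays no role here); the names `opens`, `closeds` and `is_semi_algebra` are illustrative.

```python
from itertools import product

X = frozenset({0, 1, 2})
# Hypothetical toy topology on X: a chain of open sets.
opens = [frozenset(), frozenset({0}), frozenset({0, 1}), X]
closeds = [X - O for O in opens]

# S maps each set of the form O ∩ F to one representation (O, F).
S = {}
for O, F in product(opens, closeds):
    S.setdefault(O & F, (O, F))

def is_semi_algebra() -> bool:
    """Check the three axioms, using the two-piece decomposition of A minus B."""
    if frozenset() not in S or X not in S:
        return False
    for A, B in product(S, repeat=2):
        if A & B not in S:
            return False
        O2, F2 = S[B]
        part1 = A - O2           # A ∩ O2^c
        part2 = (A & O2) - F2    # A ∩ O2 ∩ F2^c
        if part1 not in S or part2 not in S:
            return False
        if part1 & part2 or (part1 | part2) != A - B:
            return False
    return True

print(sorted(sorted(e) for e in S), is_semi_algebra())
```

In this toy space the set $\{0, 2\}$ is not of the form $O \cap F$, yet it arises as $X \setminus \{1\}$ and splits into the two pieces $\{2\}$ and $\{0\}$, which shows why the definition asks for finite disjoint unions rather than stability under difference.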

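2. Let $\mathcal{A}$ be the algebra generated by $\mathcal{S}$ and $\mathcal{S}_2$ be the set of all Cartesian products $A \times B$ with $A$ and $B$ in $\mathcal{A}$.

Since $\emptyset$ and $X$ belong to $\mathcal{A}$, $\emptyset = \emptyset \times \emptyset$ and $X^2 = X \times X$ belong to $\mathcal{S}_2$.

Let $A, B, C, D$ be elements of $\mathcal{A}$. Then $(A \times B) \cap (C \times D) = (A \cap C) \times (B \cap D)$ belongs to $\mathcal{S}_2$. Moreover, $(C \times D)^c$ is the disjoint union of $C^c \times X$ and $C \times D^c$, so the set

$$(A \times B) \setminus (C \times D) = (A \times B) \cap (C \times D)^c$$

is the disjoint union of the subsets $(A \cap C^c) \times B$ and $(A \cap C) \times (B \cap D^c)$, which are both in $\mathcal{S}_2$.

Hence $\mathcal{S}_2$ is a semi-algebra on $X^2$.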
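The two set identities used for rectangles can be checked exhaustively on a small finite set. In the sketch below, the algebra is taken to be the full power set of a hypothetical three-point set, and the helper names `subsets`, `rect` and `check` are illustrative.

```python
from itertools import combinations, product

def subsets(universe):
    """All subsets of a finite set, as frozensets."""
    items = list(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def rect(A, B):
    """Cartesian product A x B as a set of pairs."""
    return frozenset(product(A, B))

def check(universe=range(3)) -> bool:
    """Verify the intersection and difference formulas for all rectangles."""
    all_subsets = subsets(universe)
    for A, B, C, D in product(all_subsets, repeat=4):
        left = rect(A - C, B)         # (A ∩ C^c) x B
        right = rect(A & C, B - D)    # (A ∩ C) x (B ∩ D^c)
        if left & right:
            return False
        if left | right != rect(A, B) - rect(C, D):
            return False
        if rect(A, B) & rect(C, D) != rect(A & C, B & D):
            return False
    return True

print(check())
```

The check covers all $8^4$ quadruples of subsets, which is enough to catch any slip in the formulas since they only involve pointwise set operations.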

The next result can be used without proof in the part `End of the proof': if $\mathcal{S}$ is a semi-algebra on $E$, then the set $\mathcal{A}$ of all finite unions of elements of $\mathcal{S}$ is an algebra on $E$, namely the algebra generated by $\mathcal{S}$. Moreover, every element of $\mathcal{A}$ can be written as a disjoint union of finitely many elements of $\mathcal{S}$.


III. End of the proof

Keep the notation of the part `Measure-theoretic tools'. For every $n \ge 0$, call $\nu_n$ the image of $\mu$ by the map $x \mapsto (x, T^n(x))$ from $X$ to $X^2$. Denote by $\Pi_{T \times T}$ the set of all $T \times T$-invariant probability measures on $X^2$.

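1. Let $n \ge 0$. For every $A$ and $B$ in $\mathcal{X}$,

$$\begin{aligned}
(T \times T)(\nu_n)(A \times B) &= \nu_n\big(T^{-1}(A) \times T^{-1}(B)\big) \\
&= \mu\{x \in X : x \in T^{-1}(A) \text{ and } T^n(x) \in T^{-1}(B)\} \\
&= \mu\big(T^{-1}(A) \cap T^{-n-1}(B)\big) \\
&= \mu\big(T^{-1}(A \cap T^{-n}(B))\big) \\
&= \mu\big(A \cap T^{-n}(B)\big) \\
&= \mu\{x \in X : x \in A \text{ and } T^n(x) \in B\} \\
&= \nu_n(A \times B),
\end{aligned}$$

where the fifth equality holds because $T$ preserves $\mu$.

The measures $(T \times T)(\nu_n)$ and $\nu_n$ agree on the rectangles. Since the class of all rectangles is stable under intersection and generates the σ-field $\mathcal{X} \otimes \mathcal{X}$, we have $(T \times T)(\nu_n) = \nu_n$.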

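2. For every $n \ge 0$, the marginals of $\nu_n$ are $\mu$ and $\mu$ since for every $A \in \mathcal{X}$,

$$\nu_n(A \times X) = \mu\big(A \cap T^{-n}(X)\big) = \mu(A),$$

$$\nu_n(X \times A) = \mu\big(X \cap T^{-n}(A)\big) = \mu(T^{-n}(A)) = \mu(A).$$

Since $X$ is a separable complete metric space, $\mu$ is tight. For every $\varepsilon > 0$, one can find a compact subset $K$ of $X$ such that $\mu(K) \ge 1 - \varepsilon$. The set $K^2$ is compact, and the equality $X^2 \setminus K^2 = \big((X \setminus K) \times X\big) \cup \big(X \times (X \setminus K)\big)$ shows that $\nu_n(X^2 \setminus K^2) \le 2\mu(X \setminus K) \le 2\varepsilon$ for every $n \ge 0$. Therefore, the sequence $(\nu_n)_{n \ge 0}$ is tight.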
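The invariance and the marginals of $\nu_n$ only use the fact that $T$ preserves $\mu$, so they can be checked on a finite toy model. The sketch below takes a hypothetical permutation $T$ of a six-point set with the uniform measure (which is $T$-invariant); the function names are illustrative and nothing here depends on mixing properties.

```python
from collections import Counter
from fractions import Fraction

m = 6
T = [2, 0, 1, 4, 5, 3]  # hypothetical permutation of {0, ..., 5}
mu = {x: Fraction(1, m) for x in range(m)}  # uniform measure, preserved by T

def iterate(x: int, n: int) -> int:
    """Return T^n(x)."""
    for _ in range(n):
        x = T[x]
    return x

def nu(n: int) -> dict:
    """Image of mu by x -> (x, T^n(x)), as weights on pairs."""
    out = Counter()
    for x, w in mu.items():
        out[(x, iterate(x, n))] += w
    return dict(out)

def push_forward_TxT(measure: dict) -> dict:
    """Image of a measure on pairs by T x T."""
    out = Counter()
    for (x, y), w in measure.items():
        out[(T[x], T[y])] += w
    return dict(out)

def marginals(measure: dict) -> tuple:
    """First and second marginals of a measure on pairs."""
    first, second = Counter(), Counter()
    for (x, y), w in measure.items():
        first[x] += w
        second[y] += w
    return dict(first), dict(second)

for n in range(5):
    assert push_forward_TxT(nu(n)) == nu(n)   # nu_n is (T x T)-invariant
    assert marginals(nu(n)) == (mu, mu)       # both marginals equal mu
```

Each $\nu_n$ is carried by the graph of $T^n$, which $T \times T$ maps onto itself; this is the finite counterpart of the computation on rectangles.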

3. Let $\nu$ be a limit point of the sequence $(\nu_n)_{n \ge 0}$, namely the limit of some subsequence $(\nu_{q_n})_{n \ge 0}$. By the Portmanteau theorem, we have $\nu(O) \le \liminf \nu_{q_n}(O)$ for every open set $O$ in $X^2$.

(a) Since $T$ is continuous, $T \times T$ is. For every $f \in C_b(X^2)$, $f \circ (T \times T) \in C_b(X^2)$, hence

$$\int_{X^2} f \circ (T \times T) \, d\nu = \lim_n \int_{X^2} f \circ (T \times T) \, d\nu_{q_n} = \lim_n \int_{X^2} f \, d\nu_{q_n} = \int_{X^2} f \, d\nu.$$

Therefore, $\nu$ is invariant by $T \times T$.

(b) Let $A$ and $B$ be two open sets in $X$. Then $A \times B$ is open in $X^2$ so

$$\begin{aligned}
\nu(A \times B) &\le \liminf \nu_{q_n}(A \times B) \le \limsup \nu_n(A \times B) \\
&= \limsup \mu(A \cap T^{-n}(B)) \\
&\le c(\mu \otimes \mu)(A \times B).
\end{aligned}$$

(c) In a metric space, each closed subset can be written as the intersection of some non-increasing sequence of open sets. Therefore, every element of $\mathcal{S}$ can be written as the intersection of some non-increasing sequence of open sets. Let $A$ and $B$ be in $\mathcal{S}$. Consider two non-increasing sequences $(A_n)_{n \ge 0}$ and $(B_n)_{n \ge 0}$ of open sets converging to $A$ and $B$. Then

$$\nu(A \times B) = \lim_n \nu(A_n \times B_n) \le \lim_n c(\mu \otimes \mu)(A_n \times B_n) = c(\mu \otimes \mu)(A \times B).$$


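A better argument works for every $A$ and $B$ in $\mathcal{X}$ and bypasses the use of $\mathcal{S}_2$ in the next question: for every $\varepsilon > 0$, one can find two open sets $A' \supset A$ and $B' \supset B$ such that $\mu(A') \le \mu(A) + \varepsilon$ and $\mu(B') \le \mu(B) + \varepsilon$, so

$$\nu(A \times B) \le \nu(A' \times B') \le c\, \mu(A')\mu(B') \le c(\mu(A) + \varepsilon)(\mu(B) + \varepsilon).$$

Letting $\varepsilon$ go to 0 yields $\nu(A \times B) \le c(\mu \otimes \mu)(A \times B)$.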

(d) The collection $\mathcal{M}$ of all subsets $C \in \mathcal{X} \otimes \mathcal{X}$ such that $\nu(C) \le c(\mu \otimes \mu)(C)$ is stable under finite disjoint union and contains $\mathcal{S}_2$ (respectively the rectangles, if one used the second argument in the previous question), which forms a semi-algebra, so $\mathcal{M}$ contains the algebra generated by $\mathcal{S}_2$ (respectively by the rectangles). But $\mathcal{M}$ is also a monotone class, hence $\mathcal{M}$ contains the σ-field generated by $\mathcal{S}_2$ (respectively by the rectangles), namely $\mathcal{X} \otimes \mathcal{X}$. To check that $\mathcal{S}_2$ generates $\mathcal{B}(X^2) = \mathcal{X} \otimes \mathcal{X}$, fix a countable dense subset $D$ in $X$, and note that the products $B(x, 1/n) \times B(y, 1/n)$ with $(x, y) \in D^2$ and $n \ge 1$ form a countable basis of open sets in $X^2$. Thus $\nu \le c(\mu \otimes \mu)$.

(e) Since $\nu$ is absolutely continuous with respect to $\mu \otimes \mu$ and invariant by $T \times T$, whereas $\mu \otimes \mu$ is ergodic with respect to $T \times T$, we deduce that $\nu = \mu \otimes \mu$.

4. The topology of narrow convergence on $\Pi(X^2)$ is metrizable. Since the sequence $(\nu_n)_{n \ge 0}$ is tight, its closure in the set $\Pi(X^2)$ of all probability measures on $X^2$ is compact. Since $\mu \otimes \mu$ is the only limit point of the sequence $(\nu_n)_{n \ge 0}$, we get that $\nu_n \to \mu \otimes \mu$ narrowly as $n \to +\infty$.

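5. Fix $A$ and $B$ in $\mathcal{X}$ and let $\varepsilon > 0$. By density of $C_b(X)$ in $L^1(\mu)$, one can find two continuous functions $f$ and $g$ from $X$ to $\mathbf{R}$ such that $\|f - 1_A\|_1 \le \varepsilon$ and $\|g - 1_B\|_1 \le \varepsilon$. By truncating $f$ and $g$, one may assume that $f$ and $g$ take values in $[0,1]$. Hence

$$\begin{aligned}
\Big| \int_X f \times (g \circ T^n) \, d\mu - \mu(A \cap T^{-n}(B)) \Big| &\le \|f \times (g \circ T^n) - 1_A \times (1_B \circ T^n)\|_1 \\
&\le \|(f - 1_A) \times (g \circ T^n)\|_1 + \|1_A \times (g \circ T^n - 1_B \circ T^n)\|_1 \\
&\le \|f - 1_A\|_1 + \|(g - 1_B) \circ T^n\|_1 \\
&\le 2\varepsilon
\end{aligned}$$

and

$$\begin{aligned}
\Big| \int_X f \, d\mu \int_X g \, d\mu - \mu(A)\mu(B) \Big| &\le \Big| \int_X (f - 1_A) \, d\mu \times \int_X g \, d\mu \Big| + \Big| \int_X 1_A \, d\mu \times \int_X (g - 1_B) \, d\mu \Big| \\
&\le \|f - 1_A\|_1 + \|g - 1_B\|_1 \\
&\le 2\varepsilon.
\end{aligned}$$

Therefore,

$$\begin{aligned}
|\mu(A \cap T^{-n}(B)) - \mu(A)\mu(B)| &\le 4\varepsilon + \Big| \int_X f \times (g \circ T^n) \, d\mu - \int_X f \, d\mu \int_X g \, d\mu \Big| \\
&= 4\varepsilon + \Big| \int_{X^2} (f \otimes g) \, d\nu_n - \int_{X^2} (f \otimes g) \, d(\mu \otimes \mu) \Big|.
\end{aligned}$$

Since $f \otimes g \in C_b(X^2)$ and $\nu_n \to \mu \otimes \mu$ narrowly, we get $|\mu(A \cap T^{-n}(B)) - \mu(A)\mu(B)| \le 5\varepsilon$ for every large enough $n$. Hence $\mu(A \cap T^{-n}(B)) \to \mu(A)\mu(B)$ as $n \to +\infty$, which shows that $(X, \mathcal{X}, \mu, T)$ is strongly mixing.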
