
MAGIC: Ergodic Theory Lecture 7 - Entropy

Charles Walkden

March 6, 2013


Introduction

A central problem in mathematics is the isomorphism problem: when are two objects in the same class "the same"?

Two mpts T : (X, B, µ) → (X, B, µ) and S : (Y, A, m) → (Y, A, m) are isomorphic if there exists a bimeasurable bijection φ : X → Y such that the diagram

        T
    X ----> X
    |       |
  φ |       | φ
    v       v
    Y ----> Y
        S

commutes up to sets of measure zero (i.e. φ ◦ T = S ◦ φ µ-a.e.) and µ ◦ φ⁻¹ = m (i.e. µ(φ⁻¹B) = m(B) ∀ B ∈ A).

It is natural to look for invariants. To each mpt T we will associate a number, its entropy h(T). If S, T are isomorphic then h(S) = h(T). (Equivalently, if h(S) ≠ h(T) then S, T cannot be isomorphic.)

Throughout, log = log₂.


Information and entropy of a partition

Let (X, B, µ) be a probability space. Suppose we are trying to locate a point x ∈ X using a partition α = {A_j}.

If we know that x ∈ A_j then we have received some information. If A_j is 'big' then we have received a 'small' amount of information; if A_j is 'small' then we have received a 'large' amount of information.

This motivates defining the 'information function' as

    I(α)(x) = ∑_{A∈α} χ_A(x) φ(µ(A))

for an appropriate choice of function φ.


Suppose α and β are two partitions. The join of α and β is the partition

    α ∨ β = {A ∩ B | A ∈ α, B ∈ β}.

[Figure: two partitions α, β and their join α ∨ β.]

α, β are independent if µ(A ∩ B) = µ(A)µ(B) ∀ A ∈ α, B ∈ β.


It is natural to assume that if α, β are independent, then the information obtained by knowing which element of α ∨ β we are in equals the information obtained from α plus the information obtained from β:

    φ(µ(A ∩ B)) = φ(µ(A)µ(B)) = φ(µ(A)) + φ(µ(B)).

This indicates we should take φ(t) = −log t.

Definition
The information function of α is

    I(α)(x) = −∑_{A∈α} χ_A(x) log µ(A).

The entropy of α is the average amount of information:

    H(α) = ∫ I(α) dµ = −∑_{A∈α} µ(A) log µ(A).
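
As a concrete illustration (not part of the lecture), here is a minimal Python sketch computing I(α)(x) and H(α) for a partition of a finite probability space; the space, measure and partition are made-up examples.

```python
import math

# A finite probability space: points 0..5 with the uniform measure,
# and a partition alpha of X given as a list of disjoint sets.
X = range(6)
mu = {x: 1 / 6 for x in X}           # mu({x}) for each point x
alpha = [{0}, {1, 2}, {3, 4, 5}]     # a partition of X

def measure(A):
    """mu(A) for a subset A of X."""
    return sum(mu[x] for x in A)

def information(alpha, x):
    """I(alpha)(x) = -log2 mu(A), where A is the element of alpha containing x."""
    A = next(A for A in alpha if x in A)
    return -math.log2(measure(A))

def entropy(alpha):
    """H(alpha) = -sum_{A in alpha} mu(A) log2 mu(A)."""
    return -sum(measure(A) * math.log2(measure(A)) for A in alpha if measure(A) > 0)

print(information(alpha, 0))  # small set -> large information: log2 6 ~ 2.585
print(information(alpha, 4))  # big set   -> small information: log2 2 = 1.0
print(entropy(alpha))         # average information ~ 1.459
```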


Conditional information & entropy

Conditional information and entropy are useful generalisations of I(α), H(α).

Let A ⊆ B be a sub-σ-algebra. For example: if β is a partition then the set of all unions of elements of β is a σ-algebra (also denoted by β).

How much information do we gain by knowing which element of α we are in, given that we know which element of β we are in?


Recall conditional expectation:

    E(· | A) : L¹(X, B, µ) → L¹(X, A, µ).

E(f | A) is determined by:

1. E(f | A) is A-measurable;
2. ∫_A E(f | A) dµ = ∫_A f dµ ∀ A ∈ A.

E(f | A) is the best A-measurable approximation to f.


Consider the σ-algebra β generated by a partition β. Then

    E(f | β)(x) = ∑_{B∈β} χ_B(x) · (1/µ(B)) ∫_B f dµ.

Let A ∈ B. The conditional probability of A given a sub-σ-algebra A is

    µ(A | A) = E(χ_A | A).

Note:

    µ(A | β)(x) = ∑_{B∈β} χ_B(x) · µ(A ∩ B)/µ(B).
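
Again as an illustration, a sketch of E(f | β) and µ(A | β) on a finite probability space, with β a partition; the space, measure, partition and function are all made-up examples.

```python
# Conditional expectation and conditional probability with respect to the
# sigma-algebra generated by a partition beta of a finite probability space.
X = range(6)
mu = {x: 1 / 6 for x in X}
beta = [{0, 1, 2}, {3, 4, 5}]

def measure(S):
    return sum(mu[x] for x in S)

def cond_expectation(f, beta, x):
    """E(f | beta)(x): the mu-average of f over the beta-cell containing x."""
    B = next(B for B in beta if x in B)
    return sum(f(y) * mu[y] for y in B) / measure(B)

def cond_probability(A, beta, x):
    """mu(A | beta)(x) = mu(A ∩ B) / mu(B), B the beta-cell containing x."""
    B = next(B for B in beta if x in B)
    return measure(A & B) / measure(B)

f = lambda y: y * y
print(cond_expectation(f, beta, 1))       # (0 + 1 + 4)/3 = 5/3
print(cond_probability({2, 3}, beta, 0))  # mu({2})/mu({0,1,2}) = 1/3
```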


Definition
The conditional information of α given A is

    I(α | A)(x) = −∑_{A∈α} χ_A(x) log µ(A | A)(x).

The conditional entropy of α given A is

    H(α | A) = ∫ I(α | A) dµ.

The basic identities:

    I(α ∨ β | γ) = I(α | γ) + I(β | α ∨ γ),
    H(α ∨ β | γ) = H(α | γ) + H(β | α ∨ γ).
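
The second basic identity can be checked numerically on a finite space. A sketch with made-up partitions and a made-up non-uniform measure; cond_entropy implements H(α | γ) for the σ-algebra generated by a partition γ.

```python
import math
from itertools import product

# Check H(alpha v beta | gamma) = H(alpha | gamma) + H(beta | alpha v gamma).
mu = {0: 0.1, 1: 0.2, 2: 0.15, 3: 0.05, 4: 0.3, 5: 0.2}

def m(S):
    return sum(mu[x] for x in S)

def join(alpha, beta):
    """alpha v beta: all nonempty intersections A ∩ B."""
    return [A & B for A, B in product(alpha, beta) if A & B]

def cond_entropy(alpha, gamma):
    """H(alpha | gamma) = -sum_{A,C} mu(A ∩ C) log2( mu(A ∩ C)/mu(C) )."""
    return -sum(m(A & C) * math.log2(m(A & C) / m(C))
                for A in alpha for C in gamma if m(A & C) > 0)

alpha = [{0, 1}, {2, 3}, {4, 5}]
beta = [{0, 2, 4}, {1, 3, 5}]
gamma = [{0, 1, 2}, {3, 4, 5}]

lhs = cond_entropy(join(alpha, beta), gamma)
rhs = cond_entropy(alpha, gamma) + cond_entropy(beta, join(alpha, gamma))
print(lhs, rhs)  # both sides agree up to floating-point rounding
```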


Definition
β is a refinement of α if every element of α is a union of elements of β. Write α ≤ β.

[Figure: a partition α and a refinement β.]

Facts

1. β ≤ γ =⇒ I(α ∨ β | γ) = I(α | γ) and H(α ∨ β | γ) = H(α | γ).
2. β ≤ α =⇒ I(β | γ) ≤ I(α | γ) and H(β | γ) ≤ H(α | γ).
3. β ≤ γ =⇒ H(α | β) ≥ H(α | γ).

Facts 1 and 2 follow from the basic identities; fact 3 follows from Jensen's inequality.


Entropy of an mpt relative to a partition

We can now start to define the entropy h(T) of an mpt T. We first define the entropy of T relative to a partition. We need the following:

Subadditive lemma
Suppose (a_n) ⊂ R is subadditive: a_{n+m} ≤ a_n + a_m. Then lim_{n→∞} a_n/n exists and equals inf_n a_n/n (which could be −∞).
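
A quick numerical illustration (not from the lecture): the made-up sequence a_n = 2n + √n is subadditive, being the sum of an additive term and a subadditive one, and a_n/n converges to inf_n a_n/n = 2 as the lemma predicts.

```python
import math

def a(n):
    """A subadditive sequence: a_{n+m} <= a_n + a_m, since the linear part
    is additive and sqrt(n+m) <= sqrt(n) + sqrt(m)."""
    return 2 * n + math.sqrt(n)

# a_n / n tends to inf_n a_n / n = 2.
for n in [1, 10, 100, 1000, 10000]:
    print(n, a(n) / n)
```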


Let T : (X, B, µ) → (X, B, µ) be an mpt and let α be a finite or countable partition. Define T⁻¹α = {T⁻¹A | A ∈ α}, again a countable partition.

Note that, since T is measure-preserving,

    H(T⁻¹α) = −∑_{A∈α} µ(T⁻¹A) log µ(T⁻¹A) = −∑_{A∈α} µ(A) log µ(A) = H(α).

Define

    H_n(α) = H(⋁_{j=0}^{n−1} T⁻ʲα).


Then

    H_{n+m}(α) = H(⋁_{j=0}^{n+m−1} T⁻ʲα)

               = H(⋁_{j=0}^{n−1} T⁻ʲα) + H(⋁_{j=n}^{n+m−1} T⁻ʲα | ⋁_{j=0}^{n−1} T⁻ʲα)   (basic identity)

               ≤ H(⋁_{j=0}^{n−1} T⁻ʲα) + H(⋁_{j=n}^{n+m−1} T⁻ʲα)

               = H(⋁_{j=0}^{n−1} T⁻ʲα) + H(T⁻ⁿ ⋁_{j=0}^{m−1} T⁻ʲα)

               = H_n(α) + H_m(α),

using H(T⁻¹β) = H(β) repeatedly in the last step. Hence H_n(α) is subadditive.


By the subadditive lemma, we can define

    h_µ(T, α) = lim_{n→∞} (1/n) H(⋁_{j=0}^{n−1} T⁻ʲα)

= the entropy of T relative to α.

Remarks

1. By subadditivity, H_n(α) ≤ nH(α). Hence 0 ≤ h_µ(T, α) ≤ H(α).

2. Using the basic identities and the Increasing Martingale Theorem, we can obtain the following alternative formula for h_µ(T, α):

    h_µ(T, α) = H(α | ⋁_{j=1}^{∞} T⁻ʲα) = lim_{n→∞} H(α | ⋁_{j=1}^{n} T⁻ʲα).

'Entropy = average amount of information from the present, given the past.'
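
This limit can be watched numerically. A sketch, anticipating the Markov shift example below: for a two-state Markov measure (made-up matrix P with stationary vector p), the join ⋁_{j=0}^{n−1} σ⁻ʲα is the partition into length-n cylinders, so H_n(α) can be computed directly from cylinder probabilities.

```python
import math
from itertools import product

# Two-state Markov measure (made-up P); p solves pP = p.
P = [[0.9, 0.1],
     [0.4, 0.6]]
p = [0.8, 0.2]

def mu(word):
    """Markov measure of the cylinder [i_0, ..., i_{n-1}]."""
    prob = p[word[0]]
    for i, j in zip(word, word[1:]):
        prob *= P[i][j]
    return prob

def H_n(n):
    """H of the join of sigma^{-j} alpha, j = 0..n-1: sum over length-n cylinders."""
    return -sum(mu(w) * math.log2(mu(w))
                for w in product(range(2), repeat=n) if mu(w) > 0)

# H_n(alpha)/n decreases towards -sum_{i,j} p_i P_ij log2 P_ij ~ 0.569.
for n in [1, 2, 5, 10, 15]:
    print(n, H_n(n) / n)
```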


Entropy of an mpt

Let T be an mpt of a probability space (X, B, µ). Then the entropy of T is

    h_µ(T) = sup { h_µ(T, α) | α is a finite or countable partition s.t. H(α) < ∞ }.

Potential problem: working from the definitions, this quantity seems impossible to calculate!


Generators and Sinai's Theorem

Let T be an mpt of the probability space (X, B, µ).

Definition
A finite or countable partition α is a generator for T if T is invertible and

    ⋁_{j=−(n−1)}^{n−1} T⁻ʲα ↗ B

(i.e. B is the smallest σ-algebra that contains all elements of all the partitions ⋁_{j=−(n−1)}^{n−1} T⁻ʲα).

We say that α is a strong generator if

    ⋁_{j=0}^{n−1} T⁻ʲα ↗ B.


Remark
To check whether a partition α is a strong generator (resp. generator) it is sufficient to show that it separates µ-a.e. pair of points: for µ-a.e. x, y ∈ X, ∃ n s.t. x, y lie in different elements of the partition ⋁_{j=0}^{n−1} T⁻ʲα (resp. ⋁_{j=−(n−1)}^{n−1} T⁻ʲα).

Recall:

    h_µ(T) = sup h_µ(T, α),

where the supremum is taken over all partitions of finite entropy. Sinai's theorem tells us that this supremum is achieved when α is a generator or a strong generator.


Theorem (Sinai)
Let α be a finite or countable partition with H(α) < ∞. Suppose either:

- T is invertible and α is a generator, or
- α is a strong generator.

Then h_µ(T) = h_µ(T, α).

This allows us to calculate the entropy of many of our favourite examples.


Example: Markov measures for shifts of finite type

We work in the one-sided setting; analogous results hold for the two-sided case.

Let A be an aperiodic k × k matrix with corresponding one-sided shift of finite type Σ⁺_A, and let σ be the shift map. Let P = (P_ij) be a stochastic matrix compatible with A, and let p = (p_1, . . . , p_k) be the unique probability left-eigenvector: pP = p.

Recall that the Markov measure µ_P is defined on cylinder sets by

    µ_P[i_0, . . . , i_{n−1}] = p_{i_0} P_{i_0 i_1} · · · P_{i_{n−2} i_{n−1}}.

Let α = {[1], . . . , [k]} denote the partition of Σ⁺_A into cylinders of length 1.


Easy check: H(α) < ∞.

Easy check: αₙ = ⋁_{j=0}^{n−1} σ⁻ʲα = {[i_0, . . . , i_{n−1}]} = the partition of Σ⁺_A into cylinders of length n.

Hence α is a strong generator, as αₙ separates points, and we can apply Sinai's theorem. Now

    H(⋁_{j=0}^{n−1} σ⁻ʲα)
        = −∑_{i_0,...,i_{n−1}} µ[i_0, . . . , i_{n−1}] log µ[i_0, . . . , i_{n−1}]
        = −∑_{i_0,...,i_{n−1}} p_{i_0} P_{i_0 i_1} · · · P_{i_{n−2} i_{n−1}} log(p_{i_0} P_{i_0 i_1} · · · P_{i_{n−2} i_{n−1}})
        = −∑_i p_i log p_i − (n − 1) ∑_{i,j} p_i P_ij log P_ij   (re-arranging).


Hence

    h_µ(σ) = h_µ(σ, α)                                  (Sinai)
           = lim_{n→∞} (1/n) H(⋁_{j=0}^{n−1} σ⁻ʲα)
           = −∑_{i,j} p_i P_ij log P_ij.

Remark
If µ is the Bernoulli-(p_1, . . . , p_k) measure then

    h_µ(σ) = −∑_i p_i log p_i.

If µ is the Bernoulli-(1/k, . . . , 1/k) measure then

    h_µ(σ) = log k.
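
A sketch of the closed-form computation in Python (illustrative, not from the lecture): stationary() approximates the left-eigenvector p by iteration, a standard method assuming P is stochastic and aperiodic, and the transition matrix is a made-up example.

```python
import math

def stationary(P, iters=500):
    """Approximate the probability left-eigenvector p (pP = p) by iterating
    from the uniform vector; P is assumed stochastic and aperiodic."""
    k = len(P)
    p = [1 / k] * k
    for _ in range(iters):
        p = [sum(p[i] * P[i][j] for i in range(k)) for j in range(k)]
    return p

def markov_entropy(P):
    """h_mu(sigma) = -sum_{i,j} p_i P_ij log2 P_ij."""
    p = stationary(P)
    k = len(P)
    return -sum(p[i] * P[i][j] * math.log2(P[i][j])
                for i in range(k) for j in range(k) if P[i][j] > 0)

P = [[0.9, 0.1],
     [0.4, 0.6]]
print(markov_entropy(P))                  # ~ 0.569

# Bernoulli-(1/k, ..., 1/k): identical rows; the entropy is log2 k.
k = 4
print(markov_entropy([[1 / k] * k] * k))  # 2.0 = log2 4
```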


Example

We can model a language (written in the Roman alphabet) as a shift on 26 symbols (corresponding to the 26 letters of the alphabet) with an appropriate Markov measure.

For English: P_QU should be near 1, as a Q is highly likely to be followed by a U; P_FZ should be near 0, as an F is unlikely to be followed by a Z.

Experimentally, one can estimate h(English) ≈ 1.6. Note that the Bernoulli-(1/26, . . . , 1/26) measure has entropy log 26 ≈ 4.7.

This suggests that there is a lot of redundancy in English (good for error-correcting!). See Shannon's book on information theory.
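
Such estimates can be made from bigram counts. A rough sketch (one simple approach, not Shannon's exact procedure): it plugs empirical letter frequencies into the Markov entropy formula. The sample text is a toy; a meaningful estimate for English would need a large corpus.

```python
import math
from collections import Counter

def bigram_entropy(text):
    """Estimate bits per letter of a text modelled as a Markov chain on
    letters: h = -sum_{i,j} p_i P_ij log2 P_ij, with p_i and P_ij replaced
    by empirical bigram frequencies."""
    letters = [c for c in text.lower() if c.isalpha()]
    pairs = Counter(zip(letters, letters[1:]))
    firsts = Counter(letters[:-1])
    n = len(letters) - 1
    h = 0.0
    for (i, j), count in pairs.items():
        P_ij = count / firsts[i]            # empirical transition probability
        h -= (count / n) * math.log2(P_ij)  # count/n estimates p_i * P_ij
    return h

sample = "the quick brown fox jumps over the lazy dog"
print(bigram_entropy(sample))  # toy value; real English needs a large corpus
```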


Entropy as an invariant

Recall that two mpts T : (X, B, µ) → (X, B, µ), S : (Y, A, m) → (Y, A, m) are (measure-theoretically) isomorphic if there exists a bimeasurable bijection φ : X → Y such that the diagram

        T
    X ----> X
    |       |
  φ |       | φ
    v       v
    Y ----> Y
        S

commutes (up to sets of measure zero) and µ ◦ φ⁻¹ = m.

Entropy is invariant under isomorphism:

Theorem
If T, S are isomorphic then h_µ(T) = h_m(S).


Proof. Let α be a finite or countable partition of Y with Hm(α) < ∞. Then φ−1α = {φ−1A | A ∈ α} is a partition of X.

Note that

Hµ(φ−1α) = −∑_{A∈α} µ(φ−1A) log µ(φ−1A) = −∑_{A∈α} m(A) log m(A) = Hm(α).

More generally,

Hµ(∨_{j=0}^{n−1} T−j(φ−1α)) = Hµ(φ−1 ∨_{j=0}^{n−1} S−jα) = Hm(∨_{j=0}^{n−1} S−jα).

Hence hµ(T, φ−1α) = hm(S, α). Taking the supremum over all such α gives hµ(T) ≥ hm(S); applying the same argument to the inverse isomorphism φ−1 gives the reverse inequality, so hµ(T) = hm(S).
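The theorem is most often used in the contrapositive: if hµ(T) ≠ hm(S), the systems cannot be isomorphic. A minimal sketch, using only the entropy formula −∑ pi log pi for Bernoulli measures (the helper bernoulli_entropy is illustrative, not from the lecture):

from math import log2

def bernoulli_entropy(probs):
    # Entropy of the Bernoulli-(p1, ..., pk) shift: -sum p_i log2 p_i.
    return -sum(p * log2(p) for p in probs if p > 0)

print(bernoulli_entropy([1/2, 1/2]))          # log2(2) = 1
print(bernoulli_entropy([1/3, 1/3, 1/3]))     # log2(3) ≈ 1.585
# Different entropies, so these two shifts cannot be isomorphic.

# By contrast, Bernoulli-(1/4, 1/4, 1/4, 1/4) and
# Bernoulli-(1/2, 1/8, 1/8, 1/8, 1/8) both have entropy 2; by Ornstein's
# theorem (below), their two-sided versions are in fact isomorphic.
print(bernoulli_entropy([1/4] * 4))           # 2
print(bernoulli_entropy([1/2] + [1/8] * 4))   # 2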


Example: the doubling map and the full 2-shift

Let Tx = 2x mod 1 be the doubling map with Lebesgue measure λ. Let σ : Σ2 → Σ2 be the full one-sided 2-shift with the Bernoulli-(1/2, 1/2) measure µ.

Define φ : Σ2 = {(xj)_{j=0}^{∞} | xj ∈ {0, 1}} → [0, 1] by

φ(x0, x1, . . .) = ∑_{j=0}^{∞} xj / 2^{j+1}.

Then

• φσ = Tφ,

• φ is a bijection, except on the countable set of points which have non-unique base 2 expansions,

• λ = µφ−1 (clear on dyadic intervals; it follows for all Borel sets by the Kolmogorov Extension Theorem).
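One can sanity-check the conjugacy φσ = Tφ numerically. This is only a finite-precision sketch: truncating sequences to n symbols is an assumption of the check, and the two sides then agree up to floating-point rounding.

import random

def phi(x):
    # phi maps a 0-1 sequence to sum_j x_j / 2^(j+1) in [0, 1].
    return sum(xj / 2 ** (j + 1) for j, xj in enumerate(x))

def T(t):
    # The doubling map: T t = 2t mod 1.
    return (2 * t) % 1.0

n = 50
x = [random.randint(0, 1) for _ in range(n)]
lhs = phi(x[1:])   # phi(sigma x): the shift drops the first symbol
rhs = T(phi(x))    # T(phi x)
assert abs(lhs - rhs) < 1e-12   # equal up to floating-point rounding

Since φ is an isomorphism, the entropy already computed for the Bernoulli-(1/2, 1/2) shift transfers directly to the doubling map, as the next slide records.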


Hence Tx = 2x mod 1 with Lebesgue measure λ and the full one-sided 2-shift σ with the Bernoulli-(1/2, 1/2) measure µ are isomorphic, and therefore

hλ(T) = log 2 = hµ(σ).


How complete an invariant is entropy?

Given two mpts T : (X, B, µ) → (X, B, µ), S : (Y, A, m) → (Y, A, m) with the same entropy, is it necessarily true that they are isomorphic?

In general, the answer is no. However, for two-sided aperiodic shifts of finite type equipped with a Bernoulli or Markov measure, the answer is yes:

Theorem (Ornstein)

2-sided Bernoulli shifts with the same entropy are isomorphic.

Theorem (Ornstein and Friedman)

2-sided aperiodic Markov shifts with the same entropy are isomorphic.

(The one-sided case is far more subtle.)


Bernoulli systems

Being isomorphic to a Bernoulli shift is a useful and desirable property for a mpt to possess.

Definition. A mpt T of a probability space (X, B, µ) is Bernoulli if it is isomorphic to a shift σ with some Bernoulli-(p1, . . . , pk) measure.

Example

We have already seen that the doubling map with Lebesgue measure is Bernoulli.

In general, a mpt that exhibits some form of ‘hyperbolicity’ is, when equipped with a suitable measure, Bernoulli. For example, hyperbolic toral automorphisms are Bernoulli.
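For hyperbolic toral automorphisms there is also a standard entropy formula, quoted here without proof (it is not derived in these slides): with respect to Haar measure, the entropy is the sum of log|λ| over the eigenvalues λ of the defining integer matrix with |λ| > 1. A sketch for Arnold's cat map:

import numpy as np

# Arnold's cat map: (x, y) -> (2x + y, x + y) mod 1 on the 2-torus.
A = np.array([[2, 1], [1, 1]])
eigenvalues = np.linalg.eigvals(A)
# Entropy w.r.t. Haar measure = sum of log2|lambda| over |lambda| > 1
# (base-2 logs, matching the lecture's convention log = log2).
h = sum(np.log2(abs(lam)) for lam in eigenvalues if abs(lam) > 1)
print(h)   # log2((3 + sqrt(5)) / 2) ≈ 1.3885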


Next lecture

Entropy has been defined in a purely measure-theoretic setting. There is a topological analogue in the setting of continuous transformations of compact metric spaces: topological entropy. We will define this and study the connections between measure-theoretic and topological entropy.
