1 Markov Chains


Markov Chains (1)

A Markov chain is a mathematical model for stochastic systems whose states, discrete or continuous, are governed by transition probabilities.

Suppose the random variable $X_t$ takes values in a state space $\Omega$ that is a countable set. A Markov chain is a process that corresponds to the network:

$$X_0 \to X_1 \to \cdots \to X_{t-1} \to X_t \to X_{t+1} \to \cdots$$


Markov Chains (2)

The current state in a Markov chain depends only on the most recent previous state.

Transition probability:

$$P(X_{t+1} = j \mid X_t = i, X_{t-1} = i_{t-1}, \ldots, X_0 = i_0) = P(X_{t+1} = j \mid X_t = i),$$

where $i_0, \ldots, i_{t-1}, i, j \in \Omega$.

http://en.wikipedia.org/wiki/Markov_chain
http://civs.stat.ucla.edu/MCMC/MCMC_tutorial/Lect1_MCMC_Intro.pdf


An Example of Markov Chains

The state space is $\Omega = \{1, 2, 3, 4, 5\}$, and the chain is $X_0, X_1, \ldots, X_t, \ldots$, where $X_0$ is the initial state, $X_1$ is the state after one step, and so on. $P$ is the transition matrix:

         1     2     3     4     5
   1   0.4   0.6   0.0   0.0   0.0
   2   0.5   0.0   0.5   0.0   0.0
   3   0.0   0.3   0.0   0.7   0.0
   4   0.0   0.0   0.1   0.3   0.6
   5   0.0   0.3   0.0   0.5   0.2
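As a sketch, the five-state chain above can be simulated with NumPy by repeatedly sampling the next state from the row of $P$ indexed by the current state (the helper name `simulate`, the seed, and the step count are illustrative assumptions):

```python
import numpy as np

# Transition matrix from the slide; row k gives the distribution of the
# next state when the current state is k (states are 0-indexed here).
P = np.array([
    [0.4, 0.6, 0.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0, 0.0],
    [0.0, 0.3, 0.0, 0.7, 0.0],
    [0.0, 0.0, 0.1, 0.3, 0.6],
    [0.0, 0.3, 0.0, 0.5, 0.2],
])

def simulate(P, x0, n_steps, rng):
    """Sample one trajectory X_0, ..., X_n of the chain."""
    path = [x0]
    for _ in range(n_steps):
        path.append(rng.choice(len(P), p=P[path[-1]]))
    return path

rng = np.random.default_rng(0)
path = simulate(P, x0=0, n_steps=10, rng=rng)
print([s + 1 for s in path])  # report states as 1..5, as on the slide
```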


Definition (1)

Define the probability of going from state i to state j in n time steps as

$$p_{ij}^{(n)} = P(X_{t+n} = j \mid X_t = i).$$

A state j is accessible from state i if there exists an n such that $p_{ij}^{(n)} > 0$, where $n \in \{0, 1, \ldots\}$.

A state i is said to communicate with state j (denoted $i \leftrightarrow j$) if it is true that both i is accessible from j and j is accessible from i.
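By the Chapman-Kolmogorov equations, $p_{ij}^{(n)}$ is the $(i, j)$ entry of the matrix power $P^n$, so accessibility can be checked numerically. A sketch reusing the five-state matrix from the example slide:

```python
import numpy as np

P = np.array([
    [0.4, 0.6, 0.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0, 0.0],
    [0.0, 0.3, 0.0, 0.7, 0.0],
    [0.0, 0.0, 0.1, 0.3, 0.6],
    [0.0, 0.3, 0.0, 0.5, 0.2],
])

# p_ij^(n) is entry (i, j) of P^n.
P3 = np.linalg.matrix_power(P, 3)
# The only 3-step path from state 1 to state 4 is 1 -> 2 -> 3 -> 4,
# so p_14^(3) = 0.6 * 0.5 * 0.7 = 0.21.
print(P3[0, 3])
```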


Definition (2)

A state i has period $d(i)$ if any return to state i must occur in multiples of $d(i)$ time steps. Formally, the period of a state is defined as

$$d(i) = \gcd\{\, n > 0 : p_{ii}^{(n)} > 0 \,\}.$$

If $d(i) = 1$, then the state is said to be aperiodic; otherwise ($d(i) > 1$), the state is said to be periodic with period $d(i)$.
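A minimal sketch of computing the period straight from this definition, taking the gcd over return times up to a cutoff (the helper name `period`, the cutoff `max_n`, and the two-state example are illustrative assumptions):

```python
import math
import numpy as np

def period(P, i, max_n=50):
    """gcd of all n <= max_n with p_ii^(n) > 0 (0 if no return is seen)."""
    d = 0
    Pn = np.eye(len(P))
    for n in range(1, max_n + 1):
        Pn = Pn @ P  # Pn now holds P^n
        if Pn[i, i] > 1e-12:
            d = math.gcd(d, n)
    return d

# A two-state chain that always swaps states: it returns only at even n.
flip = np.array([[0.0, 1.0],
                 [1.0, 0.0]])
print(period(flip, 0))  # → 2
```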


Definition (3)

A set of states C is a communicating class if every pair of states in C communicates with each other.

Every state in a communicating class must have the same period.

Example:


Definition (4)

A finite Markov chain is said to be irreducible if its state space $\Omega$ is a communicating class; this means that, in an irreducible Markov chain, it is possible to get to any state from any state.

Example:
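Irreducibility of a finite chain can be checked numerically with the standard reachability fact that an n-state chain is irreducible iff every entry of $(I + P)^{n-1}$ is positive. A sketch (the helper name is an assumption; the five-state matrix from the earlier example slide is reused):

```python
import numpy as np

def is_irreducible(P):
    """True iff every state can reach every other state, i.e. all
    entries of (I + P)^(n-1) are strictly positive."""
    n = len(P)
    R = np.linalg.matrix_power(np.eye(n) + P, n - 1)
    return bool((R > 0).all())

P = np.array([
    [0.4, 0.6, 0.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0, 0.0],
    [0.0, 0.3, 0.0, 0.7, 0.0],
    [0.0, 0.0, 0.1, 0.3, 0.6],
    [0.0, 0.3, 0.0, 0.5, 0.2],
])
print(is_irreducible(P))  # the 5-state example chain → True
```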


Definition (5)

A finite-state irreducible Markov chain is said to be ergodic if its states are aperiodic.

Example:


Definition (6)

A state i is said to be transient if, given that we start in state i, there is a non-zero probability that we will never return to i.

Formally, let the random variable $T_i$ be the next return time to state i (the "hitting time"):

$$T_i = \min\{\, n \ge 1 : X_n = i \mid X_0 = i \,\}.$$

Then, state i is transient iff

$$P(T_i < \infty) < 1.$$


Definition (7)

A state i is said to be recurrent or persistent iff $P(T_i < \infty) = 1$.

The mean recurrence time is $m_i = E[T_i]$. State i is positive recurrent if $m_i$ is finite; otherwise, state i is null recurrent.

A state i is said to be ergodic if it is aperiodic and positive recurrent. If all states in a Markov chain are ergodic, then the chain is said to be ergodic.
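A quick simulation sketch of the return time $T_i$ and its mean (the five-state example matrix is reused; the sample size, seed, and step cap are arbitrary assumptions):

```python
import numpy as np

P = np.array([
    [0.4, 0.6, 0.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0, 0.0],
    [0.0, 0.3, 0.0, 0.7, 0.0],
    [0.0, 0.0, 0.1, 0.3, 0.6],
    [0.0, 0.3, 0.0, 0.5, 0.2],
])

def return_time(P, i, rng, max_steps=10_000):
    """One sample of T_i: number of steps until the chain first
    returns to state i, starting from i."""
    x = i
    for n in range(1, max_steps + 1):
        x = rng.choice(len(P), p=P[x])
        if x == i:
            return n
    return None  # no return observed within max_steps

rng = np.random.default_rng(1)
samples = [return_time(P, 0, rng) for _ in range(2000)]
mean_T = np.mean([t for t in samples if t is not None])
print(round(float(mean_T), 2))  # estimates the mean recurrence time m_1
```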


Stationary Distributions

Theorem: If a Markov chain is irreducible and aperiodic, then

$$p_{ij}^{(n)} \to \pi_j \text{ as } n \to \infty, \text{ for all } i, j.$$

Theorem: If a Markov chain is irreducible and aperiodic, then there exists a unique

$$\pi_j = \lim_{n \to \infty} P(X_n = j)$$

and

$$\pi_j = \sum_i \pi_i P_{ij}, \qquad \sum_i \pi_i = 1,$$

where $\pi$ is the stationary distribution.
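The first theorem can be checked numerically: every row of $P^n$ converges to the same vector $\pi$. A sketch with the five-state example matrix from the earlier slide (n = 100 is an arbitrary cutoff):

```python
import numpy as np

P = np.array([
    [0.4, 0.6, 0.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0, 0.0],
    [0.0, 0.3, 0.0, 0.7, 0.0],
    [0.0, 0.0, 0.1, 0.3, 0.6],
    [0.0, 0.3, 0.0, 0.5, 0.2],
])

# Row i of P^n is the distribution of X_n given X_0 = i; for an
# irreducible, aperiodic chain all rows converge to pi.
Pn = np.linalg.matrix_power(P, 100)
print(Pn[0])  # starting from state 1
print(Pn[3])  # starting from state 4: (nearly) the same vector
```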


Definition (8)

A Markov chain is said to be reversible if there is a stationary distribution $\pi$ such that

$$\pi_i P_{ij} = \pi_j P_{ji} \quad \text{for all } i, j.$$

Theorem: if a Markov chain is reversible, then

$$\pi_j = \sum_i \pi_i P_{ij}.$$


An Example of Stationary Distributions

A Markov chain on states $\{1, 2, 3\}$ with transition matrix

$$P = \begin{pmatrix} 0.7 & 0.3 & 0.0 \\ 0.3 & 0.4 & 0.3 \\ 0.0 & 0.3 & 0.7 \end{pmatrix}.$$

The stationary distribution is $\pi = \left(\tfrac{1}{3}, \tfrac{1}{3}, \tfrac{1}{3}\right)$, since

$$\left(\tfrac{1}{3}, \tfrac{1}{3}, \tfrac{1}{3}\right) \begin{pmatrix} 0.7 & 0.3 & 0.0 \\ 0.3 & 0.4 & 0.3 \\ 0.0 & 0.3 & 0.7 \end{pmatrix} = \left(\tfrac{1}{3}, \tfrac{1}{3}, \tfrac{1}{3}\right).$$
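A sketch verifying this numerically: $\pi$ is the left eigenvector of $P$ for eigenvalue 1 (computed here via `numpy.linalg.eig` on $P^{\mathsf T}$), normalized to sum to 1:

```python
import numpy as np

P = np.array([
    [0.7, 0.3, 0.0],
    [0.3, 0.4, 0.3],
    [0.0, 0.3, 0.7],
])

# pi P = pi  <=>  P^T pi^T = pi^T: take the eigenvector of P^T whose
# eigenvalue is closest to 1 and normalize it into a distribution.
vals, vecs = np.linalg.eig(P.T)
k = np.argmin(np.abs(vals - 1.0))
pi = np.real(vecs[:, k])
pi = pi / pi.sum()
print(pi)  # → approximately [1/3, 1/3, 1/3]
```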


Properties of Stationary Distributions

Regardless of the starting point, an irreducible and aperiodic Markov chain will converge to its stationary distribution.

The rate of convergence depends on properties of the transition probability.


Markov Chain Monte Carlo


Markov Chain Monte Carlo

MCMC methods are a class of algorithms for sampling from probability distributions based on constructing a Markov chain that has the desired distribution as its stationary distribution.

The state of the chain after a large number of steps is then used as a sample from the desired distribution.

http://en.wikipedia.org/wiki/MCMC


Metropolis-Hastings Algorithm


Metropolis-Hastings Algorithm (1)

The Metropolis-Hastings algorithm can draw samples from any probability distribution $\pi(x)$, requiring only that a function proportional to the density can be calculated at $x$.

Process in three steps:
Set up a Markov chain; run the chain until stationary; estimate with Monte Carlo methods.

http://en.wikipedia.org/wiki/Metropolis-Hastings_algorithm


Metropolis-Hastings Algorithm (2)

Let $\pi = (\pi_1, \ldots, \pi_n)$ be a probability density (or mass) function (pdf or pmf). $f$ is any function, and we want to estimate

$$I = E_\pi[f] = \sum_{i=1}^{n} f(i)\,\pi_i.$$

Construct the transition matrix $P = [P_{ij}]$ of an irreducible Markov chain with states $1, 2, \ldots, n$, where

$$P_{ij} = \Pr(X_{t+1} = j \mid X_t = i), \qquad X_t \in \{1, 2, \ldots, n\},$$

and $\Pi$ is its unique stationary distribution.


Metropolis-Hastings Algorithm (3)

Run this Markov chain for $t = 1, \ldots, N$ steps and calculate the Monte Carlo sum

$$\hat{I} = \frac{1}{N} \sum_{t=1}^{N} f(X_t);$$

then $\hat{I} \to I$ as $N \to \infty$.

Sheldon M. Ross (1997). Introduction to Probability Models, 7th ed., Proposition 4.3.
http://nlp.stanford.edu/local/talks/mcmc_2004_07_01.ppt
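A sketch of the Monte Carlo sum using the 3-state chain from the stationary-distribution example slide ($f(i) = i$ is an arbitrary test function; the seed and $N$ are assumptions):

```python
import numpy as np

P = np.array([
    [0.7, 0.3, 0.0],
    [0.3, 0.4, 0.3],
    [0.0, 0.3, 0.7],
])
f = lambda state: state + 1  # f(i) = i, with states labeled 1..3

rng = np.random.default_rng(0)
N = 100_000
x, total = 0, 0.0
for _ in range(N):
    x = rng.choice(3, p=P[x])  # one step of the chain
    total += f(x)
I_hat = total / N
# Since pi = (1/3, 1/3, 1/3), the true value is I = (1+2+3)/3 = 2.
print(round(I_hat, 2))
```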


Metropolis-Hastings Algorithm (4)

In order to perform this method for a given distribution $\Pi$, we must construct a Markov chain transition matrix $P$ with $\Pi$ as its stationary distribution, i.e. $\Pi P = \Pi$.

Consider a matrix $P$ made to satisfy the reversibility condition

$$\pi_i P_{ij} = \pi_j P_{ji} \quad \text{for all } i \text{ and } j.$$

This property ensures that

$$\sum_i \pi_i P_{ij} = \sum_i \pi_j P_{ji} = \pi_j \quad \text{for all } j,$$

and hence $\Pi$ is a stationary distribution for $P$.


Metropolis-Hastings Algorithm (5)

Let a proposal matrix $Q = [Q_{ij}]$ be irreducible, where $Q_{ij} = \Pr(X_{t+1} = j \mid X_t = i)$, and the range of $Q$ is equal to the range of $\Pi$.

But $\Pi$ does not have to be the stationary distribution of $Q$.

Process: tweak $Q_{ij}$ to yield $\Pi$:

states from $Q_{ij}$ (not $\pi$) $\;\xrightarrow{\text{tweak}}\;$ states from $P_{ij}$ ($\pi$)


Metropolis-Hastings Algorithm (6)

We assume that $P_{ij}$ has the form

$$P_{ij} = Q_{ij}\,\alpha(i, j) \quad (i \ne j), \qquad P_{ii} = 1 - \sum_{j \ne i} P_{ij},$$

where $\alpha(i, j)$ is called the acceptance probability, i.e. given $X_t = i$, take

$$X_{t+1} = \begin{cases} j & \text{with probability } \alpha(i, j), \\ i & \text{with probability } 1 - \alpha(i, j). \end{cases}$$


Metropolis-Hastings Algorithm (7)

For $\pi_i P_{ij} = \pi_j P_{ji}$ we need

$$\pi_i Q_{ij}\,\alpha(i, j) = \pi_j Q_{ji}\,\alpha(j, i). \qquad (*)$$

WLOG assume $\pi_i Q_{ij} > \pi_j Q_{ji}$ for some pair $i, j$. In order to achieve equality (*), one can introduce a probability $\alpha(i, j) < 1$ on the left-hand side and set $\alpha(j, i) = 1$ on the right-hand side.


Metropolis-Hastings Algorithm (8)

Then

$$\pi_i Q_{ij}\,\alpha(i, j) = \pi_j Q_{ji}\,\alpha(j, i) = \pi_j Q_{ji} \;\Rightarrow\; \alpha(i, j) = \frac{\pi_j Q_{ji}}{\pi_i Q_{ij}}.$$

These arguments imply that the acceptance probability must be

$$\alpha(i, j) = \min\!\left(1,\; \frac{\pi_j Q_{ji}}{\pi_i Q_{ij}}\right).$$


Metropolis-Hastings Algorithm (9)

M-H Algorithm:
Step 1: Choose an irreducible Markov chain transition matrix $Q$ with transition probabilities $Q_{ij}$.
Step 2: Let $t = 0$ and initialize $X_0$ from the states of $Q$.
Step 3 (Proposal Step): Given $X_t = i$, sample $Y = j$ from $Q_{i\cdot}$ (the $i$-th row of $Q$).


Metropolis-Hastings Algorithm (10)

M-H Algorithm (cont.):
Step 4 (Acceptance Step): Generate a random number $U$ from $\mathrm{Unif}(0, 1)$. If $U \le \alpha(i, j)$, set $X_{t+1} = Y = j$; else $X_{t+1} = X_t = i$.
Step 5: $t = t + 1$; repeat Steps 3-5 until convergence.
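The steps above can be sketched as a minimal discrete-state implementation (the target distribution, the uniform symmetric proposal $Q$, the seed, and the chain length are all illustrative assumptions):

```python
import numpy as np

def metropolis_hastings(pi, Q, x0, n_steps, rng):
    """Discrete-state M-H: pi is the target (up to normalization),
    Q[i, j] = Pr(propose j | at i). Returns the sampled trajectory."""
    x = x0
    path = [x0]
    for _ in range(n_steps):
        # Step 3 (proposal): draw Y = j from row Q[x].
        j = rng.choice(len(Q), p=Q[x])
        # Acceptance probability alpha(i, j) = min(1, pi_j Q_ji / (pi_i Q_ij)).
        alpha = min(1.0, (pi[j] * Q[j, x]) / (pi[x] * Q[x, j]))
        # Step 4 (acceptance): accept the proposal with probability alpha.
        if rng.uniform() <= alpha:
            x = j
        path.append(x)
    return path

# Illustrative target on 4 states and a uniform (symmetric) proposal.
pi = np.array([0.1, 0.2, 0.3, 0.4])
Q = np.full((4, 4), 0.25)

rng = np.random.default_rng(0)
path = metropolis_hastings(pi, Q, x0=0, n_steps=200_000, rng=rng)
freq = np.bincount(path, minlength=4) / len(path)
print(freq.round(3))  # empirical state frequencies approach pi
```

With a symmetric proposal the ratio reduces to $\pi_j / \pi_i$, which is the original Metropolis algorithm; note that only ratios of $\pi$ appear, so the normalizing constant is never needed.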