Markov Chains
Markov Chains (1)
A Markov chain is a mathematical model for stochastic systems whose states, discrete or continuous, are governed by transition probabilities.
Suppose the random variable takes values in a state space (Ω) that is a countable set. A Markov chain is a process that moves through this network of states:

  X_0 → X_1 → … → X_{t-1} → X_t → X_{t+1} → …
Markov Chains (2)
The current state of a Markov chain depends only on the most recent previous state; that is, the transition probability satisfies

  P(X_{t+1} = j | X_t = i, X_{t-1} = i_{t-1}, …, X_0 = i_0) = P(X_{t+1} = j | X_t = i),

where i_0, …, i_{t-1}, i, j ∈ Ω.

http://en.wikipedia.org/wiki/Markov_chain
http://civs.stat.ucla.edu/MCMC/MCMC_tutorial/Lect1_MCMC_Intro.pdf
An Example of Markov Chains
Ω = {1, 2, 3, 4, 5}, the chain is X_0, X_1, …, X_t, …, where X_0 is the initial state and so on, and P is the transition matrix:

        1    2    3    4    5
  1   0.4  0.6  0.0  0.0  0.0
  2   0.5  0.0  0.5  0.0  0.0
  3   0.0  0.3  0.0  0.7  0.0
  4   0.0  0.0  0.1  0.3  0.6
  5   0.0  0.3  0.0  0.5  0.2
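The chain above can be simulated directly. The following sketch (not part of the slides) samples each transition from the corresponding row of P:

```python
import random

# The 5-state transition matrix from the example slide: row i gives the
# distribution of the next state when the chain is currently in state i.
P = [
    [0.4, 0.6, 0.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0, 0.0],
    [0.0, 0.3, 0.0, 0.7, 0.0],
    [0.0, 0.0, 0.1, 0.3, 0.6],
    [0.0, 0.3, 0.0, 0.5, 0.2],
]

def step(state, rng):
    """Sample the next state (1-based) from row `state` of P."""
    u = rng.random()
    cum = 0.0
    for j, p in enumerate(P[state - 1], start=1):
        cum += p
        if u < cum:
            return j
    return len(P)  # guard against floating-point round-off

def simulate(x0, n_steps, seed=0):
    """Run the chain for n_steps transitions starting from x0."""
    rng = random.Random(seed)
    path = [x0]
    for _ in range(n_steps):
        path.append(step(path[-1], rng))
    return path

print(simulate(1, 10))
```

The helper names (`step`, `simulate`) are illustrative; only the matrix P comes from the slide.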
Definition (1)
Define the probability of going from state i to state j in n time steps as

  p_{ij}^{(n)} = P(X_{t+n} = j | X_t = i).

A state j is accessible from state i if there is some n ∈ {0, 1, 2, …} such that p_{ij}^{(n)} > 0.
A state i is said to communicate with state j (denoted i ↔ j) if both i is accessible from j and j is accessible from i.
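The n-step probabilities p_{ij}^{(n)} are exactly the entries of the matrix power P^n. A minimal sketch using the 5-state matrix from the example slide (the helper functions are assumptions, not from the slides):

```python
# The 5-state example matrix from the earlier slide.
P = [
    [0.4, 0.6, 0.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0, 0.0],
    [0.0, 0.3, 0.0, 0.7, 0.0],
    [0.0, 0.0, 0.1, 0.3, 0.6],
    [0.0, 0.3, 0.0, 0.5, 0.2],
]

def mat_mul(A, B):
    """Multiply two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def n_step(P, n):
    """Return P^n, whose (i, j) entry is p_ij^(n)."""
    result = [[float(i == j) for j in range(len(P))] for i in range(len(P))]
    for _ in range(n):
        result = mat_mul(result, P)
    return result

P2 = n_step(P, 2)
# p_13^(2) = 0.6 * 0.5 = 0.3 > 0, so state 3 is accessible from state 1.
print(P2[0][2])
```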
Definition (2)
A state i has period d(i) if any return to state i must occur in multiples of d(i) time steps. Formally, the period of a state is defined as

  d(i) = gcd{ n : p_{ii}^{(n)} > 0 }.

If d(i) = 1, the state is said to be aperiodic; otherwise (d(i) > 1), the state is said to be periodic with period d(i).
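The gcd definition can be sketched numerically. This is a heuristic assumption, not part of the slides: taking the gcd over n up to a finite bound is not a proof of the period, but it suffices for small chains such as the 5-state example above.

```python
from math import gcd

# The 5-state example matrix from the earlier slide.
P = [
    [0.4, 0.6, 0.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0, 0.0],
    [0.0, 0.3, 0.0, 0.7, 0.0],
    [0.0, 0.0, 0.1, 0.3, 0.6],
    [0.0, 0.3, 0.0, 0.5, 0.2],
]

def mat_mul(A, B):
    """Multiply two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def period(P, i, max_n=20):
    """gcd of all n <= max_n with p_ii^(n) > 0 (0 if no return seen).

    gcd(0, n) == n, so the first observed return time initializes d.
    """
    d = 0
    Pn = [row[:] for row in P]          # Pn holds P^n, starting at n = 1
    for n in range(1, max_n + 1):
        if Pn[i][i] > 1e-12:
            d = gcd(d, n)
        Pn = mat_mul(Pn, P)
    return d

# State 1 (index 0) has p_11^(1) = 0.4 > 0, so d(1) = 1: aperiodic.
print(period(P, 0))
```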
Definition (3)
A set of states C is a communicating class if every pair of states in C communicates with each other.
Every state in a communicating class must have the same period.
Example:
Definition (4) A finite Markov chain is said to be
irreducible if its state space (Ω) is a communicating class; this means that, in an irreducible Markov chain, it is possible to get to any state from any state.
Example:
Definition (5)
A finite-state irreducible Markov chain is said to be ergodic if all of its states are aperiodic.
Example:
Definition (6)
A state i is said to be transient if, given that we start in state i, there is a non-zero probability that we will never return to i.
Formally, let the random variable T_i be the next return time to state i (the "hitting time"):

  T_i = min{ n ≥ 1 : X_n = i | X_0 = i }.

Then state i is transient iff P(T_i < ∞) < 1.
Definition (7)
A state i is said to be recurrent (or persistent) iff P(T_i < ∞) = 1.
The mean recurrence time is M_i = E[T_i]. State i is positive recurrent if M_i is finite; otherwise, state i is null recurrent.
A state i is said to be ergodic if it is aperiodic and positive recurrent. If all states in a Markov chain are ergodic, then the chain is said to be ergodic.
Stationary Distributions
Theorem: If a Markov chain is irreducible and aperiodic, then

  p_{ij}^{(n)} → π_j as n → ∞, for all i, j.

Theorem: If a Markov chain is irreducible and aperiodic, then π_j = lim_{n→∞} P(X_n = j) exists, and

  π_j = Σ_i π_i P_{ij},  Σ_i π_i = 1,

where π is the stationary distribution.
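The fixed-point equation π = πP suggests a simple way to compute π: power iteration. The sketch below (an assumption, not from the slides) repeatedly applies π ← πP to the 5-state example matrix until the change is negligible:

```python
# The 5-state example matrix from the earlier slide.
P = [
    [0.4, 0.6, 0.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0, 0.0],
    [0.0, 0.3, 0.0, 0.7, 0.0],
    [0.0, 0.0, 0.1, 0.3, 0.6],
    [0.0, 0.3, 0.0, 0.5, 0.2],
]

def stationary(P, tol=1e-12, max_iter=10_000):
    """Power iteration: apply pi <- pi P until the update is below tol."""
    n = len(P)
    pi = [1.0 / n] * n                   # arbitrary starting distribution
    for _ in range(max_iter):
        new = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(new, pi)) < tol:
            return new
        pi = new
    return pi

pi = stationary(P)
print(pi)        # satisfies pi P = pi and sum(pi) = 1
```

Convergence from any starting distribution is exactly the irreducible-and-aperiodic theorem above.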
Definition (8)
A Markov chain is said to be reversible if there is a stationary distribution π such that

  π_i P_{ij} = π_j P_{ji}  for all i, j.

Theorem: If a Markov chain is reversible, then

  π_j = Σ_i π_i P_{ij}.
An Example of Stationary Distributions
A Markov chain on the three states {1, 2, 3} with transition matrix

        0.7  0.3  0.0
  P =   0.3  0.4  0.3
        0.0  0.3  0.7

The stationary distribution is π = (1/3, 1/3, 1/3), since

                      0.7  0.3  0.0
  (1/3, 1/3, 1/3)  ×  0.3  0.4  0.3  =  (1/3, 1/3, 1/3).
                      0.0  0.3  0.7
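This example is easy to check numerically. The sketch below verifies that π = (1/3, 1/3, 1/3) satisfies πP = π, and also (since this P happens to be symmetric) the detailed-balance condition π_i P_{ij} = π_j P_{ji} from Definition (8):

```python
# The 3-state example matrix and its claimed stationary distribution.
P = [
    [0.7, 0.3, 0.0],
    [0.3, 0.4, 0.3],
    [0.0, 0.3, 0.7],
]
pi = [1 / 3, 1 / 3, 1 / 3]

# Check stationarity: pi P should equal pi.
pi_P = [sum(pi[i] * P[i][j] for i in range(3)) for j in range(3)]
print(pi_P)

# Check reversibility: pi_i P_ij == pi_j P_ji for all pairs.
reversible = all(
    abs(pi[i] * P[i][j] - pi[j] * P[j][i]) < 1e-12
    for i in range(3) for j in range(3)
)
print(reversible)
```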
Properties of Stationary Distributions
Regardless of the starting point, an irreducible and aperiodic Markov chain will converge to its stationary distribution.
The rate of convergence depends on properties of the transition probability.
Markov Chain Monte Carlo
Markov chain Monte Carlo (MCMC) methods are a class of algorithms for sampling from probability distributions, based on constructing a Markov chain that has the desired distribution as its stationary distribution.
The state of the chain after a large number of steps is then used as a sample from the desired distribution.
http://en.wikipedia.org/wiki/MCMC
Metropolis-Hastings Algorithm
Metropolis-Hastings Algorithm (1)
The Metropolis-Hastings algorithm can draw samples from any probability distribution π(x), requiring only that a function proportional to the density can be calculated at x.
Process in three steps:
  1. Set up a Markov chain;
  2. Run the chain until stationary;
  3. Estimate with Monte Carlo methods.
http://en.wikipedia.org/wiki/Metropolis-Hastings_algorithm
Metropolis-Hastings Algorithm (2)
Let π = (π_1, …, π_n) be a probability density (or mass) function (pdf or pmf). f is any function, and we want to estimate

  I = E_π[f] = Σ_{i=1}^{n} f(i) π_i.

Construct the transition matrix P = [P_{ij}] of an irreducible Markov chain with states 1, 2, …, n, where

  P_{ij} = Pr(X_{t+1} = j | X_t = i)

and π is its unique stationary distribution.
Metropolis-Hastings Algorithm (3)
Run this Markov chain for t = 1, …, N and calculate the Monte Carlo sum

  Î = (1/N) Σ_{t=1}^{N} f(X_t);

then Î → I as N → ∞.
Sheldon M. Ross (1997). Proposition 4.3. Introduction to Probability Models. 7th ed.
http://nlp.stanford.edu/local/talks/mcmc_2004_07_01.ppt
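The ergodic average Î can be demonstrated on the three-state chain from the stationary-distribution example, where π is uniform. The choice f(i) = i is an illustrative assumption, not from the slides; it gives I = E_π[f] = (1 + 2 + 3)/3 = 2, so the long-run average of f(X_t) along one chain path should approach 2:

```python
import random

# The 3-state chain from the stationary-distribution example (pi uniform).
P = [
    [0.7, 0.3, 0.0],
    [0.3, 0.4, 0.3],
    [0.0, 0.3, 0.7],
]

def estimate(n_steps, seed=0):
    """Monte Carlo sum (1/N) * sum_t f(X_t) with f(i) = i."""
    rng = random.Random(seed)
    x = 0                          # start in state 1 (index 0)
    total = 0.0
    for _ in range(n_steps):
        u, cum = rng.random(), 0.0
        for j, p in enumerate(P[x]):
            cum += p
            if u < cum:
                x = j
                break
        total += x + 1             # f(i) = i; states are 1-based
    return total / n_steps

print(estimate(200_000))           # approaches I = 2 as N grows
```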
Metropolis-Hastings Algorithm (4)
In order to perform this method for a given distribution π, we must construct a Markov chain transition matrix P with π as its stationary distribution, i.e. πP = π.
Suppose the matrix P satisfies the reversibility condition

  π_i P_{ij} = π_j P_{ji}  for all i and j.

This property ensures that

  Σ_i π_i P_{ij} = π_j  for all j,

and hence π is a stationary distribution for P.
Metropolis-Hastings Algorithm (5)
Let a proposal Q = [Q_{ij}] be irreducible, where Q_{ij} = Pr(X_{t+1} = j | X_t = i), and let the range of Q equal the range of π.
But π does not have to be a stationary distribution of Q.
Process: tweak Q to yield P.

  states from Q_{ij} (not π)  --tweak-->  states from P_{ij} (π)
Metropolis-Hastings Algorithm (6)
We assume that P_{ij} has the form

  P_{ij} = Q_{ij} α(i, j)   for i ≠ j,
  P_{ii} = 1 - Σ_{j≠i} P_{ij},

where α(i, j) is called the acceptance probability; i.e., given X_t = i, take

  X_{t+1} = j  with probability α(i, j),
  X_{t+1} = i  with probability 1 - α(i, j).
Metropolis-Hastings Algorithm (7)
For reversibility we need π_i P_{ij} = π_j P_{ji}, i.e.

  π_i Q_{ij} α(i, j) = π_j Q_{ji} α(j, i).   (*)

WLOG, assume that for some i, j we have π_i Q_{ij} > π_j Q_{ji}. In order to achieve equality in (*), one can introduce a probability α(i, j) < 1 on the left-hand side and set α(j, i) = 1 on the right-hand side.
Metropolis-Hastings Algorithm (8)
Then

  π_i Q_{ij} α(i, j) = π_j Q_{ji} α(j, i) = π_j Q_{ji},

so

  α(i, j) = π_j Q_{ji} / (π_i Q_{ij}).

These arguments imply that the acceptance probability must be

  α(i, j) = min( 1, π_j Q_{ji} / (π_i Q_{ij}) ).
Metropolis-Hastings Algorithm (9)
M-H Algorithm:
  Step 1: Choose an irreducible Markov chain transition matrix Q with transition probabilities Q_{ij}.
  Step 2: Let t = 0 and initialize X_0 from the states in Ω.
  Step 3 (Proposal Step): Given X_t = i, sample Y = j from the i-th row of Q (i.e., with probability Q_{ij}).
Metropolis-Hastings Algorithm (10)
M-H Algorithm (cont.):
  Step 4 (Acceptance Step): Generate a random number U from Unif(0, 1).
    If U ≤ α(i, j), set X_{t+1} = Y = j;
    else set X_{t+1} = X_t = i.
  Step 5: t = t + 1; repeat Steps 3-5 until convergence.
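The steps above can be sketched end to end for a discrete target. The target weights and the uniform proposal below are illustrative assumptions, not from the slides. Note that only the unnormalized weights w are needed: since the proposal is symmetric, the acceptance ratio reduces to w[y]/w[x] and the normalizing constant cancels.

```python
import random

# Hypothetical target: pi proportional to w = [1, 2, 3, 4] on 4 states,
# i.e. pi = [0.1, 0.2, 0.3, 0.4] after normalization.
w = [1.0, 2.0, 3.0, 4.0]
n_states = len(w)

def metropolis_hastings(n_samples, seed=0):
    """Run M-H with a uniform (symmetric) proposal; return state frequencies."""
    rng = random.Random(seed)
    x = 0                                  # Step 2: initialize X_0
    counts = [0] * n_states
    for _ in range(n_samples):
        y = rng.randrange(n_states)        # Step 3: propose Y ~ Q (uniform)
        alpha = min(1.0, w[y] / w[x])      # acceptance prob; Q_ij = Q_ji cancels
        if rng.random() <= alpha:          # Step 4: accept or reject
            x = y
        counts[x] += 1                     # Step 5: record state, repeat
    return [c / n_samples for c in counts]

freqs = metropolis_hastings(100_000)
print(freqs)   # should approach [0.1, 0.2, 0.3, 0.4]
```

In practice, early samples are usually discarded as burn-in before the chain is treated as stationary.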