Part 6 Markov Chains

Markov Chains (1)

A Markov chain is a mathematical model for stochastic systems whose states, discrete or continuous, are governed by transition probabilities.

Suppose the random variables take values in a state space $\Omega$ that is a countable set. A Markov chain is a process $X_0, X_1, \ldots$ that corresponds to the network

$$X_0 \to X_1 \to \cdots \to X_{t-1} \to X_t \to X_{t+1} \to \cdots$$

Markov Chains (2)

The current state in a Markov chain depends only on the most recent previous state, i.e. the transition probability satisfies

$$P(X_{t+1} = j \mid X_t = i, X_{t-1} = i_{t-1}, \ldots, X_0 = i_0) = P(X_{t+1} = j \mid X_t = i),$$

where $i_0, \ldots, i_{t-1}, i, j \in \Omega$.

http://en.wikipedia.org/wiki/Markov_chain
http://civs.stat.ucla.edu/MCMC/MCMC_tutorial/Lect1_MCMC_Intro.pdf

An Example of Markov Chains

Let $\Omega = \{1, 2, 3, 4, 5\}$ and consider the chain $X_0, X_1, \ldots, X_t, \ldots$, where $X_0$ is the initial state and so on, and $P$ is the transition matrix

$$P = \begin{pmatrix}
0.4 & 0.6 & 0.0 & 0.0 & 0.0 \\
0.5 & 0.0 & 0.5 & 0.0 & 0.0 \\
0.0 & 0.3 & 0.0 & 0.7 & 0.0 \\
0.0 & 0.0 & 0.1 & 0.3 & 0.6 \\
0.0 & 0.3 & 0.0 & 0.5 & 0.2
\end{pmatrix},$$

with rows and columns indexed by the states $1, 2, 3, 4, 5$. (A simulation sketch follows below.)
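As a minimal R sketch (ours, not from the original slides), this chain can be simulated by drawing each next state from the row of P indexed by the current state:

P <- matrix(c(0.4, 0.6, 0.0, 0.0, 0.0,
              0.5, 0.0, 0.5, 0.0, 0.0,
              0.0, 0.3, 0.0, 0.7, 0.0,
              0.0, 0.0, 0.1, 0.3, 0.6,
              0.0, 0.3, 0.0, 0.5, 0.2), nrow = 5, byrow = TRUE)
T <- 10000
X <- numeric(T); X[1] <- 1                    # X0 = 1 is an arbitrary choice
for(t in 1:(T-1))
  X[t+1] <- sample(1:5, 1, prob = P[X[t], ])  # next state drawn from row X[t] of P
table(X)/T                                    # empirical state frequencies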

Definition (1)

Define the probability of going from state i to state j in n time steps as

$$p_{ij}^{(n)} = P(X_{t+n} = j \mid X_t = i).$$

A state j is accessible from state i if there is some number of time steps n such that $p_{ij}^{(n)} > 0$, where $n = 0, 1, \ldots$

A state i is said to communicate with state j (denoted $i \leftrightarrow j$) if it is true both that i is accessible from j and that j is accessible from i.

Definition (2)

A state i has period $d(i)$ if any return to state i must occur in multiples of $d(i)$ time steps. Formally, the period of a state is defined as

$$d(i) = \gcd\{\, n : p_{ii}^{(n)} > 0 \,\}.$$

If $d(i) = 1$, the state is said to be aperiodic; otherwise ($d(i) > 1$), the state is said to be periodic with period $d(i)$. (A computational sketch follows below.)
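A small R sketch of this definition (ours, and only an approximation, since it checks the diagonal of the matrix powers $P^n$ over a finite horizon), assuming the 5-state matrix P from the simulation sketch above:

gcd <- function(a, b) if(b == 0) a else gcd(b, a %% b)
period <- function(P, i, N = 100){       # gcd over n = 1..N only
  Pn <- diag(nrow(P)); d <- 0
  for(n in 1:N){
    Pn <- Pn %*% P
    if(Pn[i, i] > 0) d <- gcd(d, n)      # gcd(0, n) = n seeds the result
  }
  d
}
period(P, 1)   # 1: state 1 is aperiodic, since P[1, 1] = 0.4 > 0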

Definition (3)

A set of states C is a communicating class if every pair of states in C communicates with each other. Every state in a communicating class must have the same period.

Example: (state diagram in the original slides)

Definition (4)

A finite Markov chain is said to be irreducible if its state space $\Omega$ is a single communicating class; this means that, in an irreducible Markov chain, it is possible to get to any state from any state.

Example: (state diagram in the original slides)

Definition (5)

A finite-state irreducible Markov chain is said to be ergodic if its states are aperiodic.

Example: (state diagram in the original slides)

Definition (6)

A state i is said to be transient if, given that we start in state i, there is a non-zero probability that we will never return to i. Formally, let the random variable $T_i$ be the next return time to state i (the "hitting time"):

$$T_i = \min\{\, n \geq 1 : X_n = i \mid X_0 = i \,\}.$$

Then state i is transient iff

$$P(T_i < \infty) < 1.$$

Definition (7)

A state i is said to be recurrent or persistent iff

$$P(T_i < \infty) = 1.$$

The mean recurrence time is $M_i = E[T_i]$. State i is positive recurrent if $M_i$ is finite; otherwise, state i is null recurrent.

A state i is said to be ergodic if it is aperiodic and positive recurrent. If all states in a Markov chain are ergodic, then the chain is said to be ergodic.

Stationary Distributions

Theorem: If a Markov chain is irreducible and aperiodic, then

$$p_{ij}^{(n)} \to \pi_j \quad \text{as } n \to \infty, \text{ for all } i, j.$$

Theorem: If a Markov chain is irreducible and aperiodic, then

$$\pi_j = \lim_{n \to \infty} P(X_n = j)$$

exists and

$$\pi_j = \sum_i \pi_i P_{ij}, \qquad \sum_j \pi_j = 1,$$

where $\pi$ is the stationary distribution.

Definition (8)

A Markov chain is said to be reversible if there is a stationary distribution $\pi$ such that

$$\pi_i P_{ij} = \pi_j P_{ji} \quad \text{for all } i, j.$$

Theorem: if a Markov chain is reversible, then

$$\pi_j = \sum_i \pi_i P_{ij}.$$

An Example of Stationary Distributions

A Markov chain on the states $\{1, 2, 3\}$ (a state diagram in the original slides, with self-loop probabilities 0.7, 0.4, 0.7 and neighbor-move probabilities 0.3) with transition matrix

$$P = \begin{pmatrix}
0.7 & 0.3 & 0.0 \\
0.3 & 0.4 & 0.3 \\
0.0 & 0.3 & 0.7
\end{pmatrix}.$$

The stationary distribution is $\pi = \left(\frac{1}{3}, \frac{1}{3}, \frac{1}{3}\right)$, since

$$\left(\frac{1}{3}, \frac{1}{3}, \frac{1}{3}\right)
\begin{pmatrix}
0.7 & 0.3 & 0.0 \\
0.3 & 0.4 & 0.3 \\
0.0 & 0.3 & 0.7
\end{pmatrix}
= \left(\frac{1}{3}, \frac{1}{3}, \frac{1}{3}\right).$$
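As a quick numerical check in R (ours, not from the slides), one can verify $\pi P = \pi$ directly and also recover $\pi$ as the normalized left eigenvector of P for eigenvalue 1:

P <- matrix(c(0.7, 0.3, 0.0,
              0.3, 0.4, 0.3,
              0.0, 0.3, 0.7), nrow = 3, byrow = TRUE)
pi0 <- rep(1/3, 3)
pi0 %*% P                     # gives (1/3, 1/3, 1/3): pi P = pi
e <- eigen(t(P))              # left eigenvectors of P
pi_hat <- Re(e$vectors[, 1])  # eigenvalue 1 has the largest modulus
pi_hat/sum(pi_hat)            # normalized: (1/3, 1/3, 1/3)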

Properties of Stationary Distributions

Regardless of the starting point, an irreducible and aperiodic Markov chain will converge to its stationary distribution. The rate of convergence depends on properties of the transition probabilities.

Part 7 Monte Carlo Markov Chains

Applications of MCMC

Simulation. Example: sample $(x, y)$ from

$$f(x, y) = c \binom{n}{x} y^{x+\alpha-1} (1-y)^{n-x+\beta-1}, \quad x = 0, 1, 2, \ldots, n, \; 0 \leq y \leq 1,$$

where $\alpha$ and $\beta$ are known.

Integration: computing integrals in high dimensions. Example:

$$E[g(Y)] = \int_0^1 g(y) f(y)\, dy.$$

Bayesian inference. Example: posterior distributions, posterior means, ...

Monte Carlo Markov Chains

MCMC methods are a class of algorithms for sampling from probability distributions based on constructing a Markov chain that has the desired distribution as its stationary distribution. The state of the chain after a large number of steps is then used as a sample from the desired distribution.

http://en.wikipedia.org/wiki/MCMC

Inversion Method vs. MCMC (1)

Inverse transform sampling, also known as the probability integral transform, is a method of sampling a number at random from any probability distribution given its cumulative distribution function (cdf).

http://en.wikipedia.org/wiki/Inverse_transform_sampling_method

Inversion Method vs. MCMC (2)

If a random variable $X$ has cdf $F$, then $F(X)$ has a uniform distribution on [0, 1]. The inverse transform sampling method works as follows:

1. Generate a random number from the standard uniform distribution; call this $u$.
2. Compute the value $x$ such that $F(x) = u$; call this $x_{\text{chosen}}$.
3. Take $x_{\text{chosen}}$ to be the random number drawn from the distribution described by $F$.

(A sketch follows below.)
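A minimal R illustration of these three steps for a distribution with a closed-form inverse cdf (the exponential target and the rate below are our assumptions, chosen because $F^{-1}$ is explicit):

rate <- 2                            # F(x) = 1 - exp(-rate*x)
u <- runif(10000)                    # step 1: u ~ Unif(0, 1)
x <- -log(1-u)/rate                  # step 2: solve F(x) = u for x
hist(x, freq = FALSE, nclass = 50)   # step 3: x are draws from Exp(rate)
curve(dexp(x, rate), add = TRUE)     # compare with the target density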

Inversion Method vs. MCMC (3)

For a one-dimensional random variable the inversion method works well, but for two or more dimensions it may not. For higher-dimensional random variables, the marginal distributions that the inversion method requires can be difficult and time-consuming to calculate.

Gibbs Sampling

One kind of MCMC method. The point of Gibbs sampling is that, given a multivariate distribution, it is simpler to sample from a conditional distribution than to integrate over a joint distribution.

George Casella and Edward I. George. "Explaining the Gibbs sampler". The American Statistician, 46:167-174, 1992. (Basic summary and many references.)

http://en.wikipedia.org/wiki/Gibbs_sampling

Example 1 (1)

To sample $(x, y)$ from

$$f(x, y) = c \binom{n}{x} y^{x+\alpha-1} (1-y)^{n-x+\beta-1}, \quad x = 0, 1, 2, \ldots, n, \; 0 \leq y \leq 1,$$

where $\alpha$, $\beta$, and $n$ are known and $c$ is a constant. One can see that

$$f(x \mid y) = \frac{f(x, y)}{f(y)} \propto \binom{n}{x} y^{x} (1-y)^{n-x} \;\sim\; \text{Binomial}(n, y),$$

$$f(y \mid x) = \frac{f(x, y)}{f(x)} \propto y^{x+\alpha-1} (1-y)^{n-x+\beta-1} \;\sim\; \text{Beta}(x+\alpha,\; n-x+\beta).$$

Example 1 (2)

Gibbs sampling algorithm:

1. Initial setting: $y^{(0)} \sim \text{Uniform}(0, 1)$ or an arbitrary value in $[0, 1]$, and $x^{(0)} \sim \text{Bin}(n, y^{(0)})$ or an arbitrary value.

2. For $t = 0, \ldots, N$, sample a value $(x^{(t+1)}, y^{(t+1)})$ from

$$y^{(t+1)} \sim \text{Beta}(x^{(t)}+\alpha,\; n-x^{(t)}+\beta),$$
$$x^{(t+1)} \sim \text{Bin}(n, y^{(t+1)}).$$

3. Return $(x^{(N)}, y^{(N)})$. (A one-step sketch follows below.)
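A one-step sketch of this update in R (the function name is ours, not from the slides); iterating it and keeping the final pair implements the algorithm:

gibbs_step <- function(x, n, alpha, beta){
  y <- rbeta(1, x + alpha, n - x + beta)   # y | x ~ Beta(x+alpha, n-x+beta)
  x <- rbinom(1, n, y)                     # x | y ~ Binomial(n, y)
  c(x = x, y = y)
}
# e.g. run 1000 updates from an arbitrary start:
state <- c(x = 0, y = 0.5)
for(t in 1:1000) state <- gibbs_step(state["x"], 16, 5, 7)
state                                      # one draw of (x, y)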

Example 1 (3)

Under regularity conditions, $(x^{(t)}, y^{(t)})$ converges in distribution to $(X, Y)$. How many steps are needed for convergence? Iterate until the running means change by no more than an acceptable error, such as

$$\left| \frac{1}{10} \sum_{i=1}^{10} x^{(t-i)} - \frac{1}{10} \sum_{i=11}^{20} x^{(t-i)} \right| \leq 0.001,$$

or until the number of iterations is large enough, such as $n \geq 1000$.

Example 1 (4)

Inversion method: the marginal

$$f(x) = \int_0^1 f(x, y)\, dy = \binom{n}{x} \frac{B(x+\alpha,\; n-x+\beta)}{B(\alpha, \beta)}, \quad x = 0, 1, \ldots, n,$$

is the Beta-Binomial distribution. The cdf of $x$ is

$$F(x) = \sum_{i=0}^{x} \binom{n}{i} \frac{B(i+\alpha,\; n-i+\beta)}{B(\alpha, \beta)},$$

and a uniform draw on [0, 1] can be inverted through it.

Gibbs sampling by R (1)

N = 1000; num = 16; alpha = 5; beta = 7
tempy <- runif(1); tempx <- rbeta(1, alpha, beta)
j = 0; Forward = 1; Afterward = 0
while((abs(Forward-Afterward) > 0.0001) && (j <= 1000)){
  Forward = Afterward; Afterward = 0
  for(i in 1:N){
    tempy <- rbeta(1, tempx+alpha, num-tempx+beta)
    tempx <- rbinom(1, num, tempy)
    Afterward = Afterward+tempx
  }
  Afterward = Afterward/N; j = j+1
}
sample <- matrix(0, nrow = N, ncol = 2)
for(i in 1:N){
  tempy <- rbeta(1, tempx+alpha, num-tempx+beta)
  tempx <- rbinom(1, num, tempy)

(the loop continues on the next slide)

Gibbs sampling by R (2)

  sample[i, 1] = tempx; sample[i, 2] = tempy
}
sample_Inverse <- rbetabin(N, num, alpha, beta)   # inverse method
write(t(sample), "Sample for Ex1 by R.txt", ncol = 2)
Xhist <- cbind(hist(sample[, 1], nclass = num)$count,
               hist(sample_Inverse, nclass = num)$count)
write(t(Xhist), "Histogram for Ex1 by R.txt", ncol = 2)
prob <- matrix(0, nrow = num+1, ncol = 2)
for(i in 0:num){
  if(i == 0){
    prob[i+1, 2] = mean(pbinom(i, num, sample[, 2]))
    prob[i+1, 1] = gamma(alpha+beta)*gamma(num+beta)
    prob[i+1, 1] = prob[i+1, 1]/(gamma(beta)*gamma(num+beta+alpha))
  }else{

(continues on the next slide)

Gibbs sampling by R (3)

    if(i == 1){
      prob[i+1, 1] = num*alpha/(num-1+alpha+beta)
      for(j in 0:(num-2))
        prob[i+1, 1] = prob[i+1, 1]*(beta+j)/(alpha+beta+j)
    }else
      prob[i+1, 1] = prob[i+1, 1]*(num-i+1)/(i)*(i-1+alpha)/(num-i+beta)
    prob[i+1, 2] = mean((pbinom(i, num, sample[, 2])-pbinom(i-1, num, sample[, 2])))
  }
  if(i != num)
    prob[i+2, 1] = prob[i+1, 1]
}
write(t(prob), "ProbHistogram for Ex1 by R.txt", ncol = 2)

Inversion Method by R (1)

rbetabin <- function(N, size, alpha, beta){
  Usample <- runif(N)
  Pr_0 = gamma(alpha+beta)*gamma(size+beta)/gamma(beta)/gamma(size+beta+alpha)
  Pr = size*alpha/(size-1+alpha+beta)
  for(i in 0:(size-2))
    Pr = Pr*(beta+i)/(alpha+beta+i)
  Pr_Initial = Pr
  sample <- array(0, N)
  CDF <- array(0, (size+1))
  CDF[1] <- Pr_0

(continues on the next slide)

Inversion Method by R (2)

  for(i in 1:size){
    CDF[i+1] = CDF[i]+Pr
    Pr = Pr*(size-i)/(i+1)*(i+alpha)/(size-i-1+beta)
  }
  for(i in 1:N){
    sample[i] = which.min(abs(Usample[i]-CDF))-1
  }
  return(sample)
}

Gibbs sampling by C/C++ (1)-(4) and Inversion Method by C/C++ (1)-(2)

(The C/C++ listings, including the inverse-method routine, appear as images in the original slides and are not reproduced in this transcript.)

Plot Histograms by Maple (1)

Figure 1: 1000 samples with n = 16, α = 5, and β = 7. Blue: inversion method; red: Gibbs sampling.

Plot Histograms by Maple (2)

(Figure in the original slides.)

Probability Histograms by Maple (1)

Figure 2: the blue histogram and the yellow line are the pmf of x; the red histogram is from Gibbs sampling, estimated by

$$\hat{P}(X = x) = \frac{1}{m} \sum_{i=1}^{m} P(X = x \mid Y = y_i).$$
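A sketch of this estimator in R, assuming `sample` and `num` from the Gibbs code above, with the y-draws in column 2:

phat <- sapply(0:num, function(x) mean(dbinom(x, num, sample[, 2])))
barplot(phat, names.arg = 0:num)   # estimated pmf of X, averaged over the y's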

Probability Histograms by Maple (2)

The probability histogram of the blue histogram of Figure 1 will be similar to the blue probability histogram of Figure 2 when the sample size $\to \infty$. The probability histogram of the red histogram of Figure 1 will be similar to the red probability histogram of Figure 2 when the number of iterations $n \to \infty$.

Probability Histograms by Maple (3)

(Figure in the original slides.)

Exercises

Write your own programs similar to the examples presented in this talk, including Example 1 in Genetics and the other examples.

Write programs for the examples mentioned at the reference web pages.

Write programs for other examples that you know.

Bayesian Methods with Monte Carlo Markov Chains III

Henry Horng-Shing Lu
Institute of Statistics
National Chiao Tung University
[email protected]
http://tigpbp.iis.sinica.edu.tw/courses.htm

Part 10 More Examples of Gibbs Sampling

An Example with Three Random Variables (1)

To sample $(x, y, n)$ from the joint distribution

$$f(x, y, n) = c \binom{n}{x} y^{x+\alpha-1} (1-y)^{n-x+\beta-1} \frac{e^{-\lambda} \lambda^{n}}{n!},$$
$$x = 0, 1, 2, \ldots, n, \quad 0 \leq y \leq 1, \quad n = 0, 1, 2, \ldots,$$

where $\alpha$, $\beta$, $\lambda$ are known and $c$ is a constant.

An Example with Three Random Variables (2)

One can see that

$$f(x \mid y, n) \propto \binom{n}{x} y^{x} (1-y)^{n-x} \;\sim\; \text{Binomial}(n, y),$$

$$f(y \mid x, n) \propto y^{x+\alpha-1} (1-y)^{n-x+\beta-1} \;\sim\; \text{Beta}(x+\alpha,\; n-x+\beta),$$

$$f(n \mid x, y) \propto \frac{e^{-\lambda(1-y)} \left[\lambda(1-y)\right]^{n-x}}{(n-x)!}, \quad \text{i.e. } (n - x) \mid x, y \;\sim\; \text{Poisson}(\lambda(1-y)).$$

An Example with Three Random Variables (3)

Gibbs sampling algorithm:

1. Initial setting: $y^{(0)} \sim \text{Unif}(0, 1)$ or an arbitrary value in $[0, 1]$; $n^{(0)}$ from a discrete uniform distribution, or an arbitrary integer; $x^{(0)} \sim \text{Bin}(n^{(0)}, y^{(0)})$; set $t = 0$.

2. Sample a value $(x^{(t+1)}, y^{(t+1)}, n^{(t+1)})$ from

$$y^{(t+1)} \sim \text{Beta}(x^{(t)}+\alpha,\; n^{(t)}-x^{(t)}+\beta),$$
$$n^{(t+1)} - x^{(t)} \sim \text{Poisson}(\lambda(1-y^{(t+1)})),$$
$$x^{(t+1)} \sim \text{Bin}(n^{(t+1)}, y^{(t+1)}).$$

3. Set $t = t+1$ and repeat step 2 until convergence.

An Example with Three Random Variables by R

# 10000 samples with alpha = 2, beta = 4, and lambda = 16
N = 10000; alpha = 2; beta = 4; lambda = 16
sample <- matrix(0, nrow = N, ncol = 3)
tempY <- runif(1); tempN <- 1
tempX <- rbinom(1, tempN, tempY)
j = 0; forward = 1; afterward = 0
while((abs(forward-afterward) > 0.001) && (j <= 1000)){
  forward = afterward; afterward = 0
  for(i in 1:N){
    tempY <- rbeta(1, tempX+alpha, tempN-tempX+beta)
    tempN <- rpois(1, (1-tempY)*lambda)
    tempN = tempN+tempX
    tempX <- rbinom(1, tempN, tempY)
    afterward = afterward+tempX
  }
  afterward = afterward/N; j = j+1
}

An Example with Three Random Variables by R (cont.)

for(i in 1:N){
  tempY <- rbeta(1, tempX+alpha, tempN-tempX+beta)
  tempN <- rpois(1, (1-tempY)*lambda)
  tempN = tempN+tempX
  tempX <- rbinom(1, tempN, tempY)
  sample[i, 1] = tempX
  sample[i, 2] = tempY
  sample[i, 3] = tempN
}

An Example with 3 Random Variables by C (1)-(2)

(C listings in the original slides, producing 10000 samples with α = 2, β = 4, and λ = 16.)

Example 1 in Genetics (1)

Two linked loci with alleles A and a, and B and b:
A, B: dominant.
a, b: recessive.

A double heterozygote AaBb will produce gametes of four types: AB, Ab, aB, ab.

(Diagram in the original slides: each parental chromosome arrangement, AB/ab or Ab/aB, produces its two gamete types with probability 1/2 each.)

Example 1 in Genetics (2)

Probabilities for genotypes in gametes:

           No Recombination   Recombination
  Male           1-r                r
  Female         1-r'               r'

            AB         ab         aB        Ab
  Male    (1-r)/2    (1-r)/2     r/2       r/2
  Female  (1-r')/2   (1-r')/2    r'/2      r'/2

Example 1 in Genetics (3)

Fisher, R. A. and Balmukand, B. (1928). The estimation of linkage from the offspring of selfed heterozygotes. Journal of Genetics, 20, 79-92.

More:
http://en.wikipedia.org/wiki/Genetics
http://www2.isye.gatech.edu/~brani/isyebayes/bank/handout12.pdf

Example 1 in Genetics (4)

Offspring genotype probabilities (FEMALE gamete x MALE gamete):

  FEMALE \ MALE   AB (1-r)/2           ab (1-r)/2           aB r/2           Ab r/2
  AB (1-r')/2     AABB (1-r)(1-r')/4   aABb (1-r)(1-r')/4   aABB r(1-r')/4   AABb r(1-r')/4
  ab (1-r')/2     AaBb (1-r)(1-r')/4   aabb (1-r)(1-r')/4   aaBb r(1-r')/4   Aabb r(1-r')/4
  aB r'/2         AaBB (1-r)r'/4       aabB (1-r)r'/4       aaBB r r'/4      AabB r r'/4
  Ab r'/2         AABb (1-r)r'/4       aAbb (1-r)r'/4       aABb r r'/4      AAbb r r'/4

Example 1 in Genetics (5)

Four distinct phenotypes: A*B*, A*b*, a*B*, and a*b*.
A*: the dominant phenotype from (Aa, AA, aA).
a*: the recessive phenotype from aa.
B*: the dominant phenotype from (Bb, BB, bB).
b*: the recessive phenotype from bb.
A*B*: 9 gametic combinations.
A*b*: 3 gametic combinations.
a*B*: 3 gametic combinations.
a*b*: 1 gametic combination.
Total: 16 combinations.

Example 1 in Genetics (6)

Let $\phi = (1-r)(1-r')$. Then

$$P(A^*B^*) = \frac{2+\phi}{4}, \qquad P(A^*b^*) = P(a^*B^*) = \frac{1-\phi}{4}, \qquad P(a^*b^*) = \frac{\phi}{4}.$$
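These four probabilities sum to 1 for any φ, which is easy to check in R (the values of r and r' below are hypothetical):

r <- 0.2; rp <- 0.3                 # hypothetical recombination rates
phi <- (1-r)*(1-rp)
p <- c(AB = (2+phi)/4, Ab = (1-phi)/4, aB = (1-phi)/4, ab = phi/4)
p; sum(p)                           # sum(p) = 1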

Example 1 in Genetics (7)

Hence, a random sample of n from the offspring of selfed heterozygotes will follow a multinomial distribution:

$$\text{Multinomial}\left(n;\; \frac{2+\phi}{4},\; \frac{1-\phi}{4},\; \frac{1-\phi}{4},\; \frac{\phi}{4}\right).$$

We know that $\phi = (1-r)(1-r')$ with $0 \leq r, r' \leq 1/2$, so $1/4 \leq \phi \leq 1$.

Example 1 in Genetics (8)

Suppose that we observe the data

$$(y_1, y_2, y_3, y_4) = (125, 18, 20, 24),$$

a random sample from

$$\text{Multinomial}\left(n;\; \frac{2+\phi}{4},\; \frac{1-\phi}{4},\; \frac{1-\phi}{4},\; \frac{\phi}{4}\right).$$

Then the probability mass function is

$$f(y \mid \phi) = \frac{n!}{y_1!\, y_2!\, y_3!\, y_4!} \left(\frac{2+\phi}{4}\right)^{y_1} \left(\frac{1-\phi}{4}\right)^{y_2+y_3} \left(\frac{\phi}{4}\right)^{y_4}.$$
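A small R sketch of the corresponding log-likelihood in φ (up to the multinomial coefficient, which does not depend on φ):

y <- c(125, 18, 20, 24)
loglik <- function(phi){
  y[1]*log((2+phi)/4) + (y[2]+y[3])*log((1-phi)/4) + y[4]*log(phi/4)
}
phi <- seq(0.26, 0.99, by = 0.01)
plot(phi, loglik(phi), type = "l")  # maximized near the MLE (about 0.66; see the comparison table later)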

Example 1 in Genetics (9)

How do we estimate $\phi$?

MME (shown last week):
http://en.wikipedia.org/wiki/Method_of_moments_%28statistics%29

MLE (shown last week):
http://en.wikipedia.org/wiki/Maximum_likelihood

Bayesian method:
http://en.wikipedia.org/wiki/Bayesian_method

Example 1 in Genetics (10)

As the value of $\phi$ is between 1/4 and 1, we can assume that the prior distribution of $\phi$ is $\text{Unif}(\frac{1}{4}, 1)$.

The posterior distribution is

$$f(\phi \mid y) = \frac{f(y \mid \phi)\, f(\phi)}{\int f(y \mid \phi)\, f(\phi)\, d\phi}.$$

The integration in the above denominator, $\int f(y \mid \phi)\, f(\phi)\, d\phi$, does not have a closed form.

Example 1 in Genetics (11)

We will consider the mean of the posterior distribution (the posterior mean),

$$E[\phi \mid y] = \int \phi\, f(\phi \mid y)\, d\phi.$$

The Monte Carlo Markov chain method is a good way to estimate $E[\phi \mid y]$ even if $\int f(y \mid \phi)\, f(\phi)\, d\phi$ and the posterior mean do not have closed forms.

Example 1 by R

Direct numerical integration when $\phi \sim U(\frac{1}{4}, 1)$:

> y <- c(125, 18, 20, 24)
> phi <- runif(1000000, 1/4, 1)
> f_phi <- function(phi){((2+phi)/4)^y[1]*((1-phi)/4)^(y[2]+y[3])*(phi/4)^y[4]}
> mean(f_phi(phi)*phi)/mean(f_phi(phi))
[1] 0.573808

We can assume other prior distributions to compare the resulting posterior means: Beta(1,1), Beta(2,2), Beta(2,3), Beta(3,2), Beta(0.5,0.5), Beta(1e-5,1e-5).
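The same Monte Carlo ratio works for the Beta priors by drawing φ from the prior instead of from U(1/4, 1); a sketch for Beta(1, 1), reusing f_phi from above:

phi_b <- rbeta(1000000, 1, 1)                # draw from the Beta(1, 1) prior
mean(f_phi(phi_b)*phi_b)/mean(f_phi(phi_b))  # posterior mean, about 0.574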

Example 1 by C/C++

(C/C++ listing in the original slides. Replace the prior with other distributions, such as Beta(1,1), ..., Beta(1e-5,1e-5).)

Beta Prior

(Figure in the original slides.)

Comparison for Example 1 (1)

  Method                        Estimate
  MME                           0.683616
  MLE                           0.663165
  Bayesian U(1/4, 1)            0.573931
  Bayesian Beta(1, 1)           0.573918
  Bayesian Beta(2, 2)           0.572103
  Bayesian Beta(2, 3)           0.564731
  Bayesian Beta(3, 2)           0.577575
  Bayesian Beta(1/2, 1/2)       0.574928
  Bayesian Beta(1e-5, 1e-5)     0.588925
  Bayesian Beta(1e-7, 1e-7)     shown below

Comparison for Example 1 (2)

  Method                        Estimate
  Bayesian Beta(10, 10)         0.559905
  Bayesian Beta(10^2, 10^2)     0.520366
  Bayesian Beta(10^4, 10^4)     0.500273
  Bayesian Beta(10^5, 10^5)     0.500027
  Bayesian Beta(10^n, 10^n)     approaches 0.5 as n grows

  Bayesian Beta(1e-7, 1e-7): repeated runs give 0.193891, 0.400567, 0.737646, 0.641388; not stationary.

Part 11 Gibbs Sampling Strategy

Sampling Strategy (1)

Strategy I: Run one chain for a long time. After some "burn-in" period, sample points every fixed number of steps. The code examples of Gibbs sampling in the previous lecture use sampling strategy I.

(Diagram: burn-in, then N samples from one chain.)

http://www.cs.technion.ac.il/~cs236372/tirgul09.ps

Sampling Strategy (2)

Strategy II: Run the chain N times, each run for M steps. Each run starts from a different state point. Return the last state in each run.

(Diagram: burn-in for each chain, then N samples, one from the last state of each chain.)

Sampling Strategy (3)

Strategy II by R:

N = 100; num = 16; alpha = 5; beta = 7
sample <- matrix(0, nrow = N, ncol = 2)
for(k in 1:N){
  tempy <- runif(1); tempx <- rbeta(1, alpha, beta)
  j = 0; Forward = 1; Afterward = 0
  while((abs(Forward-Afterward) > 0.001) && (j <= 100)){
    Forward = Afterward; Afterward = 0
    for(i in 1:N){
      tempy <- rbeta(1, tempx+alpha, num-tempx+beta)
      tempx <- rbinom(1, num, tempy)
      Afterward = Afterward+tempx
    }
    Afterward = Afterward/N; j = j+1
  }
  tempy <- rbeta(1, tempx+alpha, num-tempx+beta)
  tempx <- rbinom(1, num, tempy)
  sample[k, 1] = tempx; sample[k, 2] = tempy
}

Sampling Strategy (4)

Strategy II by C/C++: (listing in the original slides.)

Strategy Comparison

Strategy I: Performs "burn-in" only once, saving time. Samples might be correlated (although only weakly).

Strategy II: Better chance of covering the state space, especially if the chain is slow to reach stationarity. It must perform the "burn-in" steps for each chain, spending more time.

Hybrid Strategies (1)

Run several chains and sample a few samples from each. This combines the benefits of both strategies.

(Diagram: burn-in for each chain, then N samples drawn across the chains.)

Hybrid Strategies (2)

Hybrid strategy by R:

tempN <- N; loc <- 1
sample <- matrix(0, nrow = N, ncol = 2)
while(loc != (N+1)){
  tempy <- runif(1); tempx <- rbeta(1, alpha, beta); j = 0
  pN <- floor(runif(1)*(N-loc))+1
  cat(pN, '\n'); Forward = 1; Afterward = 0
  while((abs(Forward-Afterward) > 0.001) && (j <= 100)){
    Forward = Afterward; Afterward = 0
    for(i in 1:N){
      tempy <- rbeta(1, tempx+alpha, num-tempx+beta)
      tempx <- rbinom(1, num, tempy)
      Afterward = Afterward+tempx
    }
    Afterward = Afterward/N; j = j+1
  }
  for(i in loc:(loc+pN-1)){
    tempy <- rbeta(1, tempx+alpha, num-tempx+beta)
    tempx <- rbinom(1, num, tempy)
    sample[i, 1] <- tempx; sample[i, 2] <- tempy
  }
  loc <- i+1
}

Hybrid Strategies (3)

Hybrid strategy by C/C++: (listing in the original slides.)

Part 12 Metropolis-Hastings Algorithm

Metropolis-Hastings Algorithm (1)

Another kind of MCMC method. The Metropolis-Hastings algorithm can draw samples from any probability distribution $\pi(x)$, requiring only that a function proportional to the density can be calculated at $x$.

Process in three steps: set up a Markov chain; run the chain until stationarity; estimate with Monte Carlo methods.

http://en.wikipedia.org/wiki/Metropolis-Hastings_algorithm

Metropolis-Hastings Algorithm (2)

Let $\Pi = (\pi_1, \ldots, \pi_n)$ be a probability density (or mass) function (pdf or pmf), and let $f$ be any function for which we want to estimate

$$I = E_{\Pi}[f] = \sum_{i=1}^{n} f(i)\, \pi_i.$$

Construct the transition matrix $\mathbf{P} = (P_{ij})$ of an irreducible Markov chain with states $1, 2, \ldots, n$, where

$$P_{ij} = \Pr(X_{t+1} = j \mid X_t = i), \quad X_t \in \{1, 2, \ldots, n\},$$

and $\Pi$ is its unique stationary distribution.

Metropolis-Hastings Algorithm (3)

Run this Markov chain for $t = 1, \ldots, N$ and calculate the Monte Carlo sum

$$\hat{I} = \frac{1}{N} \sum_{t=1}^{N} f(X_t);$$

then $\hat{I} \to I$ as $N \to \infty$.

Sheldon M. Ross (1997). Proposition 4.3. Introduction to Probability Models. 7th ed.
http://nlp.stanford.edu/local/talks/mcmc_2004_07_01.ppt

Metropolis-Hastings Algorithm (4)

In order to perform this method for a given distribution $\Pi$, we must construct a Markov chain transition matrix $\mathbf{P}$ with $\Pi$ as its stationary distribution, i.e. $\Pi \mathbf{P} = \Pi$.

Consider a matrix $\mathbf{P}$ made to satisfy the reversibility condition

$$\pi_i P_{ij} = \pi_j P_{ji} \quad \text{for all } i, j.$$

This property ensures that

$$\sum_i \pi_i P_{ij} = \pi_j \quad \text{for all } j,$$

and hence $\Pi$ is a stationary distribution for $\mathbf{P}$.

Metropolis-Hastings Algorithm (5)

Let a proposal $\mathbf{Q} = (Q_{ij})$ be irreducible, where $Q_{ij} = \Pr(X_{t+1} = j \mid X_t = i)$ and the range of $\mathbf{Q}$ is equal to the range of $\Pi$. But $\Pi$ does not have to be the stationary distribution of $\mathbf{Q}$.

Process: tweak $Q_{ij}$ to yield $\Pi$.

(Diagram: states from $Q_{ij}$, not distributed as $\pi$ -> tweak -> states from $P_{ij}$, distributed as $\pi$.)

Metropolis-Hastings Algorithm (6)

We assume that $P_{ij}$ has the form

$$P_{ij} = Q_{ij}\, \alpha(i, j) \quad (i \neq j), \qquad P_{ii} = 1 - \sum_{j \neq i} P_{ij},$$

where $\alpha(i, j)$ is called the acceptance probability; i.e., given $X_t = i$, take

$$X_{t+1} = \begin{cases} j & \text{with probability } \alpha(i, j), \\ i & \text{with probability } 1 - \alpha(i, j). \end{cases}$$

Metropolis-Hastings Algorithm (7)

For $\pi_i P_{ij} = \pi_j P_{ji}$ we need

$$\pi_i Q_{ij}\, \alpha(i, j) = \pi_j Q_{ji}\, \alpha(j, i). \quad (*)$$

WLOG, for some $(i, j)$, $\pi_i Q_{ij} > \pi_j Q_{ji}$. In order to achieve equality (*), one can introduce a probability $\alpha(i, j) < 1$ on the left-hand side and set $\alpha(j, i) = 1$ on the right-hand side.

Metropolis-Hastings Algorithm (8)

Then

$$\pi_i Q_{ij}\, \alpha(i, j) = \pi_j Q_{ji}\, \alpha(j, i) = \pi_j Q_{ji} \;\Rightarrow\; \alpha(i, j) = \frac{\pi_j Q_{ji}}{\pi_i Q_{ij}}.$$

These arguments imply that the acceptance probability must be

$$\alpha(i, j) = \min\left(1,\; \frac{\pi_j Q_{ji}}{\pi_i Q_{ij}}\right).$$

Metropolis-Hastings Algorithm (9)

M-H algorithm:

Step 1: Choose an irreducible Markov chain transition matrix $\mathbf{Q}$ with transition probabilities $Q_{ij}$.
Step 2: Let $t = 0$ and initialize $X_0$ from the states of $\mathbf{Q}$.
Step 3 (Proposal step): Given $X_t = i$, sample $Y = j$ from $Q_{iY}$.

Metropolis-Hastings Algorithm (10)

M-H algorithm (cont.):

Step 4 (Acceptance step): Generate a random number $U$ from $\text{Unif}(0, 1)$. If $U \leq \alpha(i, j)$, set $X_{t+1} = Y = j$; else $X_{t+1} = X_t = i$.
Step 5: Set $t = t+1$ and repeat Steps 3-5 until convergence. (A sketch follows below.)
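A minimal R sketch of Steps 1-5 on a small discrete state space; the target π and the symmetric random-walk proposal are our assumptions for illustration (symmetry makes the Q terms cancel in α):

pi_target <- c(0.1, 0.2, 0.4, 0.2, 0.1)       # assumed target on states 1..5
N <- 100000
X <- numeric(N); X[1] <- 3                    # Step 2: initialize
for(t in 1:(N-1)){
  Y <- X[t] + sample(c(-1, 1), 1)             # Step 3: propose a neighbor
  if(Y < 1 || Y > 5){ X[t+1] <- X[t]; next }  # proposals off the space are rejected
  a <- min(1, pi_target[Y]/pi_target[X[t]])   # Step 4: acceptance probability
  X[t+1] <- if(runif(1) <= a) Y else X[t]
}                                             # Step 5 is the loop itself
table(X)/N                                    # close to pi_target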

Metropolis-Hastings Algorithm (11)

An example of Steps 3-5 (a diagram in the original slides):

1. $Y^{(1)}$ is sampled from $\mathbf{Q}$, and $\alpha(X^{(0)}, Y^{(1)}) = \min\left(1,\; \frac{\pi(Y^{(1)})\, Q(Y^{(1)}, X^{(0)})}{\pi(X^{(0)})\, Q(X^{(0)}, Y^{(1)})}\right)$ accepts, so $X^{(1)} = Y^{(1)}$.
2. $Y^{(2)}$ is sampled from $\mathbf{Q}$, and $\alpha(X^{(1)}, Y^{(2)})$ does not accept, so $X^{(2)} = X^{(1)} = Y^{(1)}$.
3. $Y^{(3)}$ is sampled from $\mathbf{Q}$, and $\alpha(X^{(2)}, Y^{(3)})$ accepts, so $X^{(3)} = Y^{(3)}$; and so on, up to $X^{(N)}$.

In this way, the proposals $Y^{(1)}, Y^{(2)}, Y^{(3)}, \ldots$ drawn from $Q_{ij}$ are tweaked into a chain $X^{(1)}, X^{(2)}, X^{(3)}, \ldots$ with transition matrix $P_{ij}$ and stationary distribution $\pi$.

Metropolis-Hastings Algorithm (12)

We may define a "rejection rate" as the proportion of times t for which $X_{t+1} = X_t$. Clearly, in choosing $\mathbf{Q}$, high rejection rates are to be avoided.

Example (a figure in the original slides): if a proposal $Y$ falls far out in a tail of $\pi$ relative to $X_t$, then $\pi(Y)/\pi(X_t)$ will be small and it is likely that $X_{t+1} = X_t$, so more repetitions of Steps 3-5 are needed.

Example (1)

Simulate a bivariate normal distribution:

$$X \sim N(\mu, \Sigma), \quad \mu = \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix}, \quad \Sigma = \begin{pmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{21} & \sigma_{22} \end{pmatrix},$$

i.e.

$$\pi(x) = \frac{1}{2\pi\, |\Sigma|^{1/2}} \exp\left(-\frac{1}{2} (x-\mu)^{T} \Sigma^{-1} (x-\mu)\right).$$

Example (2)

Metropolis-Hastings algorithm:

1. $X^{(0)} = \mu$ (or an arbitrary point); $i = 0$.
2. Generate $U_1 \sim U(-1, 1)$ and $U_2 \sim U(-1, 1)$ with $U_1$ and $U_2$ independent; let $U^{(i)} = (U_1, U_2)^{T}$.
3. $Y^{(i)} = X^{(i)} + U^{(i)}$.
4. Set

$$X^{(i+1)} = \begin{cases} Y^{(i)} & \text{w.p. } \alpha(X^{(i)}, Y^{(i)}) = \min\left(1,\; \frac{\pi(Y^{(i)})}{\pi(X^{(i)})}\right), \\ X^{(i)} & \text{w.p. } 1 - \alpha(X^{(i)}, Y^{(i)}). \end{cases}$$

(The random-walk proposal is symmetric, so the $Q$ terms cancel in $\alpha$.)

5. Set $i = i+1$ and repeat steps 2-4 until convergence.

Example of M-H Algorithm by R (1)

Pi <- function(x, mu, sigma){   # bivariate normal density; invert sigma via its Cholesky factor
  exp(-0.5*((x-mu)%*%chol2inv(chol(sigma))%*%as.matrix(x-mu)))/(2*pi*sqrt(det(sigma)))
}
N <- 1000; mu <- c(3, 7)
sigma <- matrix(c(1, 0.4, 0.4, 1), nrow = 2)
sample <- matrix(0, nrow = N, ncol = 2)
j = 0; tempX <- mu
while((j < 1000) && (abs(mean(sample[, 1])-mu[1]) > 0.001)){
  for(i in 1:N){
    tempU <- c(runif(1, -1, 1), runif(1, -1, 1))
    tempY <- tempX+tempU
    if(min(c(Pi(tempY, mu, sigma)/Pi(tempX, mu, sigma), 1)) > runif(1)){
      tempX <- tempY; sample[i, ] <- tempY
    }

(continues on the next slide)

Example of M-H Algorithm by R (2)

    else{
      tempX <- tempX; sample[i, ] <- tempX
    }
  }
  j = j+1
}
for(i in 1:N){
  tempU <- c(runif(1, -1, 1), runif(1, -1, 1))
  tempY <- tempX+tempU
  if(min(c(Pi(tempY, mu, sigma)/Pi(tempX, mu, sigma), 1)) > runif(1)){
    tempX <- tempY; sample[i, ] <- tempY
  }else{
    tempX <- tempX; sample[i, ] <- tempX
  }
}

Example of M-H Algorithm by C (1)-(3)

(C listings in the original slides.)

A Figure to Check Simulation Results

Black points are simulated samples; colored points trace the probability density contours. (Figure in the original slides; the plotting code is below.)

plot(sample, xlab = "X1", ylab = "X2")
j = 0
for(i in seq(0.01, 0.3, 0.02)){
  for(x in seq(0, 6, 0.1)){
    for(y in seq(4, 11, 0.1)){
      if(abs(Pi(c(x, y), mu, sigma)-i) < 0.003)
        points(x, y, col = ((j)%%2+2), pch = 19)
    }
  }
  j = j+1
}

Exercises

Write your own programs similar to the examples presented in this talk, including Example 1 in Genetics and the other examples.

Write programs for the examples mentioned at the reference web pages.

Write programs for other examples that you know.