Lecture 3: Discrete Structures and Gibbs Sampler

October 12, 2010

Configuration Space

Let V be a finite set of “sites” and Λ be a set of “common states.” Consider the space

$$S := \Lambda^V = \{\, x = (x_v)_{v \in V} : x_v \in \Lambda \,\}$$

of “configurations.” Here the random sample of interest is the set (X_v : v ∈ V) of Λ-valued random variables X_v, and the target distribution π is a joint probability distribution of the X_v's. The conditional distribution

$$\pi(x_v \mid (x_w)_{w \neq v}) = \frac{\pi(x)}{\sum_{\lambda \in \Lambda} \pi\big((x;\, x_v \leftarrow \lambda)\big)}$$

can be calculated given the configuration (x_w)_{w≠v} restricted to V \ {v}. Here (x; x_v ← λ) denotes the configuration obtained from x by replacing the state x_v with λ at the site v.
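As a concrete illustration (not from the lecture), the conditional distribution above can be computed by brute force when Λ and V are small. The sketch below is R code with a hypothetical unnormalized target `target` on three sites, chosen only for illustration; it enumerates λ ∈ Λ and normalizes over the single coordinate v.

## A minimal sketch: brute-force computation of pi(x_v | (x_w)_{w != v})
## for a toy target on Lambda^V with V = {1,2,3}, Lambda = {-1,+1}.
Lambda <- c(-1, 1)

## Hypothetical unnormalized target: rewards agreement between neighboring sites.
target <- function(x) exp(0.5 * (x[1] * x[2] + x[2] * x[3]))

## pi(lambda | x_{-v}) = target((x; x_v <- lambda)) / sum over lambda' of the same.
conditional <- function(x, v) {
  w <- sapply(Lambda, function(lambda) { y <- x; y[v] <- lambda; target(y) })
  w / sum(w)
}

x <- c(1, -1, 1)
conditional(x, 2)   # distribution of x_2 over Lambda, given x_1 and x_3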

Gibbs Sampler

The Gibbs sampler can be viewed as a special case of the Metropolis–Hastings algorithm in which one takes advantage of univariate conditional distributions. All moves are always accepted in the algorithm.

Gibbs Algorithm (Gibbs Sampler)

1. Suppose that X_t = x at time t.

2. Pick a site v ∈ V, and generate λ from the conditional distribution π(x_v | (x_w)_{w≠v}).

3. Set X_{t+1} = (x; x_v ← λ).

Here the choice of the site v to update can follow a fixed order (a systematic scan) or be random; see the sketch below.
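The following is a minimal sketch of a random-scan Gibbs sampler in R, assuming the state space `Lambda` and the function `conditional(x, v)` from the earlier sketch; neither name comes from the lecture's potts.r code.

## One random-scan Gibbs update: pick a site, redraw it from its conditional.
gibbs_step <- function(x) {
  v <- sample(length(x), 1)                             # random choice of site
  x[v] <- sample(Lambda, 1, prob = conditional(x, v))   # the move is always accepted
  x
}

## Run the chain for tmax steps from an initial configuration x0.
gibbs <- function(x0, tmax) {
  x <- x0
  for (t in seq_len(tmax)) x <- gibbs_step(x)
  x
}

gibbs(c(1, -1, 1), tmax = 1000)

For a systematic scan, replace the random choice of v by a loop over a fixed ordering v_1, . . . , v_n of the sites.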

Dynamics of Gibbs Sampler

Suppose that Λ consists of finitely many common states, for example Λ = {−1, +1} as in the Ising model. Then we can construct the transition probability at each site v ∈ V by

$$P_v(x, y) := \begin{cases} \dfrac{\pi(y)}{\sum_{\lambda \in \Lambda} \pi\big((x;\, x_v \leftarrow \lambda)\big)} & \text{if } (x_w)_{w \neq v} = (y_w)_{w \neq v}; \\[1ex] 0 & \text{otherwise.} \end{cases}$$
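Continuing the earlier three-site toy example (the hypothetical `target` and `Lambda`, not the lecture's code), the kernel P_v can be written out as an explicit matrix over the 2^3 = 8 configurations and checked to be a transition probability:

## Enumerate S = Lambda^3 and build the single-site kernel P_v as a matrix.
S <- as.matrix(expand.grid(Lambda, Lambda, Lambda))   # 8 configurations, one per row

Pv <- function(v) {
  P <- matrix(0, nrow(S), nrow(S))
  for (i in seq_len(nrow(S))) for (j in seq_len(nrow(S))) {
    if (all(S[i, -v] == S[j, -v])) {                  # x and y agree off the site v
      denom <- sum(sapply(Lambda, function(l) { y <- S[i, ]; y[v] <- l; target(y) }))
      P[i, j] <- target(S[j, ]) / denom
    }
  }
  P
}

rowSums(Pv(2))   # all equal to 1: each row is a probability distribution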

Site Update

P_v is a reversible Markov chain with the stationary distribution π:

$$\pi(x) P_v(x, y) = \frac{\pi(x)\,\pi(y)}{\sum_{\lambda \in \Lambda} \pi\big((x;\, x_v \leftarrow \lambda)\big)} = \frac{\pi(y)\,\pi(x)}{\sum_{\lambda \in \Lambda} \pi\big((y;\, y_v \leftarrow \lambda)\big)} = \pi(y) P_v(y, x)$$

if (x_w)_{w≠v} = (y_w)_{w≠v}; the middle equality holds because the two sums range over the same set of configurations when x and y agree off the site v. However, P_v by itself is not irreducible. Let v_1, . . . , v_n be a fixed ordering of V. Define P := P_{v_1} · · · P_{v_n}, and call it the systematic site update. Then P is an irreducible transition probability with the stationary distribution π. Alternatively we can introduce a distribution ρ(v) over V and define P = Σ_{v∈V} ρ(v) P_v, which is called the random site update.
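In the same toy example, both forms of P can be assembled from the single-site kernels P_v and checked numerically to leave π invariant (a sketch, not the lecture's code):

## Stationary distribution of the toy target, as a vector over the 8 configurations.
pi_vec <- apply(S, 1, target)
pi_vec <- pi_vec / sum(pi_vec)

P_sys  <- Pv(1) %*% Pv(2) %*% Pv(3)                           # systematic site update
rho    <- rep(1/3, 3)                                         # uniform distribution over sites
P_rand <- rho[1] * Pv(1) + rho[2] * Pv(2) + rho[3] * Pv(3)    # random site update

## pi P = pi for both updates (up to rounding error).
max(abs(t(P_sys)  %*% pi_vec - pi_vec))
max(abs(t(P_rand) %*% pi_vec - pi_vec))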

Gibbs Model

Given a parameter β > 0, we can define a Gibbs distribution with respect to G by

$$\pi(x) = \frac{1}{z_\beta} \exp(-\beta H(x)), \quad x \in S,$$

where

$$z_\beta = \sum_{x \in S} \exp(-\beta H(x))$$

is the normalizing constant. The parameters β and z_β are respectively called the “inverse temperature” and the “partition function.” The function H(x), called the “Hamiltonian,” is an energy function, and the model is reluctant to remain at a high energy when the temperature T = 1/β is low.
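For a small enough configuration space, z_β can be computed by brute force. The sketch below does this in R for a hypothetical Ising-type energy on a 2 × 2 grid (16 configurations); the names and the choice β = 0.5 are illustrative, not from the lecture.

## Brute-force partition function for a 2 x 2 Ising-type model (J = 1, h = 0).
beta  <- 0.5
edges <- rbind(c(1, 2), c(3, 4), c(1, 3), c(2, 4))           # the four edges of a 2 x 2 grid
H     <- function(x) -sum(x[edges[, 1]] * x[edges[, 2]])     # Hamiltonian

configs <- as.matrix(expand.grid(rep(list(c(-1, 1)), 4)))    # all 2^4 configurations
z_beta  <- sum(exp(-beta * apply(configs, 1, H)))

gibbs_density <- function(x) exp(-beta * H(x)) / z_beta      # pi(x)
sum(apply(configs, 1, gibbs_density))                        # sums to 1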

Hamiltonian Function

Let G = (V, E) be a finite undirected graph, and let Λ = {1, . . . , L} be a common state space. In the language of statistical mechanics, a vertex v ∈ V is called a site, and the site v is said to be a neighbor of w if {v, w} ∈ E. A subset C ⊂ V is called a clique if {v, w} ∈ E whenever v, w ∈ C are distinct. We denote the set of all cliques by 𝒞. Then the Hamiltonian H(x) has the form

$$H(x) = \sum_{C \in \mathcal{C}} W_C(x),$$

where each W_C depends only on the coordinates x_v, v ∈ C.
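As a small illustration (not from the lecture), here is a Hamiltonian built from clique potentials on a three-site path graph, with hypothetical singleton and edge potentials:

## Graph: V = {1,2,3}, E = {{1,2},{2,3}}; cliques are the singletons and the two edges.
J <- 1
h <- 0.2
cliques <- list(1, 2, 3, c(1, 2), c(2, 3))

## Hypothetical potentials: W_{v}(x) = -h x_v on singletons, W_{v,w}(x) = -J x_v x_w on edges.
W <- function(C, x) if (length(C) == 1) -h * x[C] else -J * x[C[1]] * x[C[2]]

## H(x) = sum of W_C(x) over all cliques.
H <- function(x) sum(sapply(cliques, W, x = x))
H(c(1, -1, 1))   # energy of one configuration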

Markov Random Field

Let X = (X_v) be an S-valued random variable. X is an MRF (Markov random field) with respect to G if for every v ∈ V and x ∈ S

1. P(X = x) > 0;

2. P(X_v = x_v | X_w = x_w, w ≠ v) = P(X_v = x_v | X_w = x_w, {w, v} ∈ E).

Hammersley–Clifford theorem. X is an MRF with respect to G if and only if π(x) = P(X = x) is a Gibbs distribution with respect to G. In this sense a Gibbs distribution is also known as a GRF (Gibbs random field).

Conditional Distribution

In the Gibbs distribution, the conditional probability for the site update is given by

$$\pi(x_v \mid (x_w)_{w \neq v}) = \frac{\exp(-\beta H_v(x))}{\sum_{\lambda \in \Lambda} \exp\big(-\beta H_v((x;\, x_v \leftarrow \lambda))\big)}.$$

Here the normalizing constant z_β cancels, and

$$H_v(x) = \sum_{C \in \mathcal{C}:\, v \in C} W_C(x)$$

is only the partial sum over the cliques containing v.
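Reusing the 2 × 2 Ising-type sketch from the Gibbs-model slide (still an illustration, not the lecture's code), the conditional computed from the local energy H_v agrees with the one computed from the full Hamiltonian, because z_β and the cliques not containing v cancel:

## Same toy model as before: 2 x 2 grid, J = 1, h = 0, beta = 0.5.
beta  <- 0.5
edges <- rbind(c(1, 2), c(3, 4), c(1, 3), c(2, 4))
H     <- function(x) -sum(x[edges[, 1]] * x[edges[, 2]])

## Local Hamiltonian: only the edge cliques containing the site v.
neighbors <- function(v) unique(c(edges[edges[, 1] == v, 2], edges[edges[, 2] == v, 1]))
Hv <- function(x, v) -x[v] * sum(x[neighbors(v)])

cond_local <- function(x, v) {
  w <- sapply(c(-1, 1), function(l) { y <- x; y[v] <- l; exp(-beta * Hv(y, v)) })
  w / sum(w)
}
cond_full <- function(x, v) {
  w <- sapply(c(-1, 1), function(l) { y <- x; y[v] <- l; exp(-beta * H(y)) })
  w / sum(w)
}

x <- c(1, -1, -1, 1)
all.equal(cond_local(x, 1), cond_full(x, 1))   # TRUE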

Ising Model

Let G = (V, E) be a graph representing a lattice (or a rectangular grid), and let Λ = {−1, +1} be the states of spin “up” and “down.” The Hamiltonian is given by

$$H(x) = -J \sum_{\{v,w\} \in E} x_v x_w - h \sum_{v \in V} x_v,$$

where J and h represent the strength of the interaction between neighbors and that of the external magnetic field, respectively. The GRF with this Hamiltonian is called an Ising model.
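A short sketch of this Hamiltonian in R, for spins stored as an n × m matrix of ±1 values with free boundary conditions (an assumption; the lecture's potts.r may treat boundaries differently):

## Ising Hamiltonian on a rectangular grid of +1/-1 spins (free boundary).
ising_H <- function(x, J = 1, h = 0) {
  horiz <- sum(x[, -ncol(x)] * x[, -1])    # neighbor pairs along each row
  vert  <- sum(x[-nrow(x), ] * x[-1, ])    # neighbor pairs along each column
  -J * (horiz + vert) - h * sum(x)
}

x <- matrix(sample(c(-1, 1), 16, replace = TRUE), 4, 4)
ising_H(x)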

Attractive Spin System

Assuming J = 1 and h = 0, the Ising model has the following site update:

- Set x_v ← +1 with probability

$$\pi(+1 \mid (x_w)_{w \neq v}) = \left[\, 1 + \exp\Big(-2\beta \sum_{w:\{v,w\} \in E} x_w\Big) \right]^{-1};$$

- Set x_v ← −1 with probability

$$\pi(-1 \mid (x_w)_{w \neq v}) = \left[\, 1 + \exp\Big(2\beta \sum_{w:\{v,w\} \in E} x_w\Big) \right]^{-1}.$$

The chance of the site v taking +1 increases with the number of neighbors that are +1. In this sense the model is called an attractive spin system.
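Below is a minimal sketch (not the lecture's potts.r implementation) of one systematic Gibbs sweep over an n × m grid of ±1 spins, using the update probability above with free boundaries:

## One systematic-scan Gibbs sweep for the Ising model with J = 1, h = 0.
ising_sweep <- function(x, beta) {
  n <- nrow(x); m <- ncol(x)
  for (i in 1:n) for (j in 1:m) {
    s <- 0                                   # sum of the neighboring spins (free boundary)
    if (i > 1) s <- s + x[i - 1, j]
    if (i < n) s <- s + x[i + 1, j]
    if (j > 1) s <- s + x[i, j - 1]
    if (j < m) s <- s + x[i, j + 1]
    p_plus <- 1 / (1 + exp(-2 * beta * s))   # pi(+1 | neighbors)
    x[i, j] <- if (runif(1) < p_plus) 1 else -1
  }
  x
}

x <- matrix(sample(c(-1, 1), 32 * 32, replace = TRUE), 32, 32)
for (t in 1:200) x <- ising_sweep(x, beta = 0.6)
image(x)   # clusters of aligned spins tend to appear when beta exceeds the critical value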

Critical Temperature

The Ising model undergoes a “phase transition” between an ordered and a disordered phase. When the temperature T = 1/β is low, spins are aligned, creating large clusters of aligned states. As the temperature gets higher, randomness takes over. In the Ising model it is known that this phase transition occurs at the critical temperature T_c, and that

$$T_c = \frac{2}{\ln(1 + \sqrt{2})}\, J \approx 2.27\, J$$

when the graph G is a 2-dimensional grid with h = 0.
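The corresponding numerical values for J = 1 are easy to check in R:

## Critical temperature and critical inverse temperature of the 2-d Ising model (J = 1, h = 0).
Tc     <- 2 / log(1 + sqrt(2))   # about 2.269
beta_c <- 1 / Tc                 # about 0.4407, i.e. log(1 + sqrt(2)) / 2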

Potts Model

Let Λ = {1, . . . , M}. Then the Ising model is generalized with the Hamiltonian

$$\tilde{H}(x) = -J \sum_{\{v,w\} \in E} \delta(x_v, x_w) - h \sum_{v \in V} \delta(x_v, f_v),$$

where

$$\delta(\lambda, \lambda') = \begin{cases} 1 & \text{if } \lambda = \lambda'; \\ 0 & \text{if } \lambda \neq \lambda'. \end{cases}$$

Here f is an external reference configuration. This GRF is called a Potts model.
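A sketch of this Hamiltonian in R, again for an n × m grid with free boundaries and a reference configuration f of the same shape (an illustration, not the lecture's potts.r):

## Potts Hamiltonian on a grid of states 1..M with reference configuration f.
potts_H <- function(x, f, J = 1, h = 0) {
  horiz <- sum(x[, -ncol(x)] == x[, -1])    # agreeing neighbor pairs along rows
  vert  <- sum(x[-nrow(x), ] == x[-1, ])    # agreeing neighbor pairs along columns
  -J * (horiz + vert) - h * sum(x == f)
}

x <- matrix(sample(1:3, 16, replace = TRUE), 4, 4)
f <- matrix(1, 4, 4)
potts_H(x, f, J = 1, h = 0.5)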

Potts Models with External Force

An external force at each site is indicated by 0, 1, . . . , M, where the state 0 means no external force at that site. The entire external-force data is represented as a matrix.

> source("potts.r")> s1 = matrix(data=c(1,2,3,3,1,2,2,3,1,1,2,2,1,1,1,2), 4, 4)> potts(m=3, tmax=0, init.state=s1)> potts(m=3, beta=0.9, tmax=10000, ref.state=s1)

An n × m matrix is created by the function matrix() from the vector passed as data. The entries are filled in from the first column to the last column of the matrix. To examine the external force in color, use it as the initial state, as in the call with tmax=0 above.
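For instance, matrix() fills column by column, so the first four entries of the data vector become the first column of s1:

> s1 = matrix(data=c(1,2,3,3,1,2,2,3,1,1,2,2,1,1,1,2), 4, 4)
> s1[,1]
[1] 1 2 3 3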

Behavior Around Critical Temperature

The Ising model is a special case of the Potts model with M = 2, since their respective Hamiltonian functions H(x) and H̃(x̃) satisfy H(x) = 2H̃(x̃) (up to an additive constant, which does not affect the Gibbs distribution) when x_v = +1 or −1 according as x̃_v = 1 or 0. Here we set J = 1/2 and the inverse temperature β = log(1 + √2) ≈ 0.88 for the Potts model. Then we can simulate the phase transition for the Ising model.

> potts(m=2,tmax=1000,beta=0.88)
> potts(m=2,tmax=1000,beta=0.89)

Change the grid size of the model, and observe the behavior around the critical temperature.

> potts(m=2,n=c(100,100), tmax=1000,beta=0.88)

Exploration with R

The initial value init.state and the external force ref.state can be assigned in the form of a matrix. When init.state is set to an integer from 1 to M, that value is assigned to every site at time t = 0.

- Choose a different number M of common states and a different inverse temperature β.

- See what the Gibbs sampler generates with or without an external force.

> potts(m=2,tmax=1000,beta=0.9)
> potts(m=2,tmax=0, init.state=f1)
> potts(m=2,tmax=1000,beta=0.9,ref.state=f1)
> potts(m=5,tmax=1000,beta=1.1)
> potts(m=5,tmax=0, init.state=f2)
> potts(m=5,tmax=1000,beta=1.1,ref.state=f2)
