Lecture 3: Discrete Structures and Gibbs Sampler

October 12, 2010

Configuration Space

Let V be a finite set of “sites” and Λ be a set of “common states.” Consider the space

$$S := \Lambda^V = \{\, x = (x_v)_{v \in V} : x_v \in \Lambda \,\}$$

of “configurations.” Here the random sample of interest is the set (X_v : v ∈ V) of Λ-valued random variables X_v, and the target distribution π is a joint probability distribution of the X_v's. The conditional distribution

$$\pi(x_v \mid (x_w)_{w \neq v}) = \frac{\pi(x)}{\sum_{\lambda \in \Lambda} \pi\big((x;\, x_v \leftarrow \lambda)\big)}$$

can be calculated given the configuration (x_w)_{w≠v} restricted to V \ {v}. Here (x; x_v ← λ) denotes the configuration obtained from x by replacing the state x_v with λ at the site v.
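As a concrete illustration (not from the lecture), the conditional distribution above can be computed by brute force when Λ and V are small. The sketch below is R code with a hypothetical unnormalized target `target` on three sites, chosen only for illustration; it enumerates λ ∈ Λ and normalizes over the single coordinate v.

## A minimal sketch: brute-force computation of pi(x_v | (x_w)_{w != v})
## for a toy target on Lambda^V with V = {1,2,3}, Lambda = {-1,+1}.
Lambda <- c(-1, 1)

## Hypothetical unnormalized target: rewards agreement between neighboring sites.
target <- function(x) exp(0.5 * (x[1] * x[2] + x[2] * x[3]))

## pi(lambda | x_{-v}) = target((x; x_v <- lambda)) / sum over lambda' of the same.
conditional <- function(x, v) {
  w <- sapply(Lambda, function(lambda) { y <- x; y[v] <- lambda; target(y) })
  w / sum(w)
}

x <- c(1, -1, 1)
conditional(x, 2)   # distribution of x_2 over Lambda, given x_1 and x_3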

Gibbs Sampler

The Gibbs sampler can be viewed as a special case of the Metropolis–Hastings algorithm in which one takes advantage of univariate conditional distributions. All moves are always accepted in the algorithm.

Gibbs Algorithm (Gibbs Sampler)

1. Suppose that X_t = x at time t.

2. Pick a site v ∈ V, and generate λ from the conditional distribution π(x_v | (x_w)_{w≠v}).

3. Set X_{t+1} = (x; x_v ← λ).

Here the choice of the site v to update can follow a fixed order (a systematic scan) or be random; see the sketch below.
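The following is a minimal sketch of a random-scan Gibbs sampler in R, assuming the state space `Lambda` and the function `conditional(x, v)` from the earlier sketch; neither name comes from the lecture's potts.r code.

## One random-scan Gibbs update: pick a site, redraw it from its conditional.
gibbs_step <- function(x) {
  v <- sample(length(x), 1)                             # random choice of site
  x[v] <- sample(Lambda, 1, prob = conditional(x, v))   # the move is always accepted
  x
}

## Run the chain for tmax steps from an initial configuration x0.
gibbs <- function(x0, tmax) {
  x <- x0
  for (t in seq_len(tmax)) x <- gibbs_step(x)
  x
}

gibbs(c(1, -1, 1), tmax = 1000)

For a systematic scan, replace the random choice of v by a loop over a fixed ordering v_1, . . . , v_n of the sites.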

Dynamics of Gibbs Sampler

Suppose that Λ consists of finitely many common states, for example Λ = {−1, +1} as in the Ising model. Then we can construct the transition probability at each site v ∈ V by

$$P_v(x, y) := \begin{cases} \dfrac{\pi(y)}{\sum_{\lambda \in \Lambda} \pi\big((x;\, x_v \leftarrow \lambda)\big)} & \text{if } (x_w)_{w \neq v} = (y_w)_{w \neq v}; \\[1ex] 0 & \text{otherwise.} \end{cases}$$
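Continuing the earlier three-site toy example (the hypothetical `target` and `Lambda`, not the lecture's code), the kernel P_v can be written out as an explicit matrix over the 2^3 = 8 configurations and checked to be a transition probability:

## Enumerate S = Lambda^3 and build the single-site kernel P_v as a matrix.
S <- as.matrix(expand.grid(Lambda, Lambda, Lambda))   # 8 configurations, one per row

Pv <- function(v) {
  P <- matrix(0, nrow(S), nrow(S))
  for (i in seq_len(nrow(S))) for (j in seq_len(nrow(S))) {
    if (all(S[i, -v] == S[j, -v])) {                  # x and y agree off the site v
      denom <- sum(sapply(Lambda, function(l) { y <- S[i, ]; y[v] <- l; target(y) }))
      P[i, j] <- target(S[j, ]) / denom
    }
  }
  P
}

rowSums(Pv(2))   # all equal to 1: each row is a probability distribution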

Site Update

P_v is a reversible Markov chain with the stationary distribution π:

$$\pi(x) P_v(x, y) = \frac{\pi(x)\,\pi(y)}{\sum_{\lambda \in \Lambda} \pi\big((x;\, x_v \leftarrow \lambda)\big)} = \frac{\pi(y)\,\pi(x)}{\sum_{\lambda \in \Lambda} \pi\big((y;\, y_v \leftarrow \lambda)\big)} = \pi(y) P_v(y, x)$$

if (x_w)_{w≠v} = (y_w)_{w≠v}; the middle equality holds because the two sums range over the same set of configurations when x and y agree off the site v. However, P_v by itself is not irreducible. Let v_1, . . . , v_n be a fixed ordering of V. Define P := P_{v_1} · · · P_{v_n}, and call it the systematic site update. Then P is an irreducible transition probability with the stationary distribution π. Alternatively we can introduce a distribution ρ(v) over V and define P = Σ_{v∈V} ρ(v) P_v, which is called the random site update.
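In the same toy example, both forms of P can be assembled from the single-site kernels P_v and checked numerically to leave π invariant (a sketch, not the lecture's code):

## Stationary distribution of the toy target, as a vector over the 8 configurations.
pi_vec <- apply(S, 1, target)
pi_vec <- pi_vec / sum(pi_vec)

P_sys  <- Pv(1) %*% Pv(2) %*% Pv(3)                           # systematic site update
rho    <- rep(1/3, 3)                                         # uniform distribution over sites
P_rand <- rho[1] * Pv(1) + rho[2] * Pv(2) + rho[3] * Pv(3)    # random site update

## pi P = pi for both updates (up to rounding error).
max(abs(t(P_sys)  %*% pi_vec - pi_vec))
max(abs(t(P_rand) %*% pi_vec - pi_vec))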

Gibbs Model

Given a parameter β > 0, we can define a Gibbs distribution with respect to G by

$$\pi(x) = \frac{1}{z_\beta} \exp(-\beta H(x)), \quad x \in S,$$

where

$$z_\beta = \sum_{x \in S} \exp(-\beta H(x))$$

is the normalizing constant. The parameters β and z_β are respectively called the “inverse temperature” and the “partition function.” The function H(x), called the “Hamiltonian,” is an energy function, and the model is reluctant to remain at a high energy when the temperature T = 1/β is low.
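For a small enough configuration space, z_β can be computed by brute force. The sketch below does this in R for a hypothetical Ising-type energy on a 2 × 2 grid (16 configurations); the names and the choice β = 0.5 are illustrative, not from the lecture.

## Brute-force partition function for a 2 x 2 Ising-type model (J = 1, h = 0).
beta  <- 0.5
edges <- rbind(c(1, 2), c(3, 4), c(1, 3), c(2, 4))           # the four edges of a 2 x 2 grid
H     <- function(x) -sum(x[edges[, 1]] * x[edges[, 2]])     # Hamiltonian

configs <- as.matrix(expand.grid(rep(list(c(-1, 1)), 4)))    # all 2^4 configurations
z_beta  <- sum(exp(-beta * apply(configs, 1, H)))

gibbs_density <- function(x) exp(-beta * H(x)) / z_beta      # pi(x)
sum(apply(configs, 1, gibbs_density))                        # sums to 1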

Hamiltonian Function

Let G = (V, E) be a finite undirected graph, and let Λ = {1, . . . , L} be a common state space. In the language of statistical mechanics, a vertex v ∈ V is called a site, and the site v is said to be a neighbor of w if {v, w} ∈ E. A subset C ⊂ V is called a clique if {v, w} ∈ E whenever v, w ∈ C are distinct. We denote the set of all cliques by 𝒞. Then the Hamiltonian H(x) has the form

$$H(x) = \sum_{C \in \mathcal{C}} W_C(x),$$

where each W_C depends only on the coordinates x_v, v ∈ C.
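As a small illustration (not from the lecture), here is a Hamiltonian built from clique potentials on a three-site path graph, with hypothetical singleton and edge potentials:

## Graph: V = {1,2,3}, E = {{1,2},{2,3}}; cliques are the singletons and the two edges.
J <- 1
h <- 0.2
cliques <- list(1, 2, 3, c(1, 2), c(2, 3))

## Hypothetical potentials: W_{v}(x) = -h x_v on singletons, W_{v,w}(x) = -J x_v x_w on edges.
W <- function(C, x) if (length(C) == 1) -h * x[C] else -J * x[C[1]] * x[C[2]]

## H(x) = sum of W_C(x) over all cliques.
H <- function(x) sum(sapply(cliques, W, x = x))
H(c(1, -1, 1))   # energy of one configuration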

Markov Random Field

Let X = (X_v) be an S-valued random variable. X is an MRF (Markov random field) with respect to G if for every v ∈ V and x ∈ S

1. P(X = x) > 0;

2. P(X_v = x_v | X_w = x_w, w ≠ v) = P(X_v = x_v | X_w = x_w, {w, v} ∈ E).

Hammersley–Clifford theorem. X is an MRF with respect to G if and only if π(x) = P(X = x) is a Gibbs distribution with respect to G. In this sense a Gibbs distribution is also known as a GRF (Gibbs random field).

Conditional Distribution

In the Gibbs distribution, the conditional probability for the site update is given by

$$\pi(x_v \mid (x_w)_{w \neq v}) = \frac{\exp(-\beta H_v(x))}{\sum_{\lambda \in \Lambda} \exp\big(-\beta H_v((x;\, x_v \leftarrow \lambda))\big)}.$$

Here the normalizing constant z_β cancels, and

$$H_v(x) = \sum_{C \in \mathcal{C}:\, v \in C} W_C(x)$$

is only the partial sum over the cliques containing v.
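Reusing the 2 × 2 Ising-type sketch from the Gibbs-model slide (still an illustration, not the lecture's code), the conditional computed from the local energy H_v agrees with the one computed from the full Hamiltonian, because z_β and the cliques not containing v cancel:

## Same toy model as before: 2 x 2 grid, J = 1, h = 0, beta = 0.5.
beta  <- 0.5
edges <- rbind(c(1, 2), c(3, 4), c(1, 3), c(2, 4))
H     <- function(x) -sum(x[edges[, 1]] * x[edges[, 2]])

## Local Hamiltonian: only the edge cliques containing the site v.
neighbors <- function(v) unique(c(edges[edges[, 1] == v, 2], edges[edges[, 2] == v, 1]))
Hv <- function(x, v) -x[v] * sum(x[neighbors(v)])

cond_local <- function(x, v) {
  w <- sapply(c(-1, 1), function(l) { y <- x; y[v] <- l; exp(-beta * Hv(y, v)) })
  w / sum(w)
}
cond_full <- function(x, v) {
  w <- sapply(c(-1, 1), function(l) { y <- x; y[v] <- l; exp(-beta * H(y)) })
  w / sum(w)
}

x <- c(1, -1, -1, 1)
all.equal(cond_local(x, 1), cond_full(x, 1))   # TRUE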

Ising Model

Let G = (V, E) be a graph representing a lattice (or a rectangular grid), and let Λ = {−1, +1} be the states of spin “up” and “down.” The Hamiltonian is given by

$$H(x) = -J \sum_{\{v,w\} \in E} x_v x_w - h \sum_{v \in V} x_v,$$

where J and h represent the strength of the interaction between neighbors and that of the external magnetic field, respectively. The GRF with this Hamiltonian is called an Ising model.
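A short sketch of this Hamiltonian in R, for spins stored as an n × m matrix of ±1 values with free boundary conditions (an assumption; the lecture's potts.r may treat boundaries differently):

## Ising Hamiltonian on a rectangular grid of +1/-1 spins (free boundary).
ising_H <- function(x, J = 1, h = 0) {
  horiz <- sum(x[, -ncol(x)] * x[, -1])    # neighbor pairs along each row
  vert  <- sum(x[-nrow(x), ] * x[-1, ])    # neighbor pairs along each column
  -J * (horiz + vert) - h * sum(x)
}

x <- matrix(sample(c(-1, 1), 16, replace = TRUE), 4, 4)
ising_H(x)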

Attractive Spin System

Assuming J = 1 and h = 0, the Ising model has the following site update:

- Set x_v ← +1 with probability

$$\pi(+1 \mid (x_w)_{w \neq v}) = \left[\, 1 + \exp\Big(-2\beta \sum_{w:\{v,w\} \in E} x_w\Big) \right]^{-1};$$

- Set x_v ← −1 with probability

$$\pi(-1 \mid (x_w)_{w \neq v}) = \left[\, 1 + \exp\Big(2\beta \sum_{w:\{v,w\} \in E} x_w\Big) \right]^{-1}.$$

The chance of the site v taking +1 increases with the number of neighbors that are +1. In this sense the model is called an attractive spin system.
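Below is a minimal sketch (not the lecture's potts.r implementation) of one systematic Gibbs sweep over an n × m grid of ±1 spins, using the update probability above with free boundaries:

## One systematic-scan Gibbs sweep for the Ising model with J = 1, h = 0.
ising_sweep <- function(x, beta) {
  n <- nrow(x); m <- ncol(x)
  for (i in 1:n) for (j in 1:m) {
    s <- 0                                   # sum of the neighboring spins (free boundary)
    if (i > 1) s <- s + x[i - 1, j]
    if (i < n) s <- s + x[i + 1, j]
    if (j > 1) s <- s + x[i, j - 1]
    if (j < m) s <- s + x[i, j + 1]
    p_plus <- 1 / (1 + exp(-2 * beta * s))   # pi(+1 | neighbors)
    x[i, j] <- if (runif(1) < p_plus) 1 else -1
  }
  x
}

x <- matrix(sample(c(-1, 1), 32 * 32, replace = TRUE), 32, 32)
for (t in 1:200) x <- ising_sweep(x, beta = 0.6)
image(x)   # clusters of aligned spins tend to appear when beta exceeds the critical value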

Critical Temperature

The Ising model undergoes a “phase transition” between an ordered and a disordered phase. When the temperature T = 1/β is low, spins are aligned, creating large clusters of aligned states. As the temperature gets higher, randomness takes over. In the Ising model it is known that this phase transition occurs at the critical temperature T_c, and that

$$T_c = \frac{2}{\ln(1 + \sqrt{2})}\, J \approx 2.27\, J$$

when the graph G is a 2-dimensional grid with h = 0.
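The corresponding numerical values for J = 1 are easy to check in R:

## Critical temperature and critical inverse temperature of the 2-d Ising model (J = 1, h = 0).
Tc     <- 2 / log(1 + sqrt(2))   # about 2.269
beta_c <- 1 / Tc                 # about 0.4407, i.e. log(1 + sqrt(2)) / 2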

Potts Model

Let Λ = {1, . . . , M}. Then the Ising model is generalized with the Hamiltonian

$$\tilde{H}(x) = -J \sum_{\{v,w\} \in E} \delta(x_v, x_w) - h \sum_{v \in V} \delta(x_v, f_v),$$

where

$$\delta(\lambda, \lambda') = \begin{cases} 1 & \text{if } \lambda = \lambda'; \\ 0 & \text{if } \lambda \neq \lambda'. \end{cases}$$

Here f is an external reference configuration. This GRF is called a Potts model.
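A sketch of this Hamiltonian in R, again for an n × m grid with free boundaries and a reference configuration f of the same shape (an illustration, not the lecture's potts.r):

## Potts Hamiltonian on a grid of states 1..M with reference configuration f.
potts_H <- function(x, f, J = 1, h = 0) {
  horiz <- sum(x[, -ncol(x)] == x[, -1])    # agreeing neighbor pairs along rows
  vert  <- sum(x[-nrow(x), ] == x[-1, ])    # agreeing neighbor pairs along columns
  -J * (horiz + vert) - h * sum(x == f)
}

x <- matrix(sample(1:3, 16, replace = TRUE), 4, 4)
f <- matrix(1, 4, 4)
potts_H(x, f, J = 1, h = 0.5)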

Potts Models with External Force

An external force at each site is indicated by 0, 1, . . . , M, where the state 0 means no external force at that site. The entire external-force data is represented as a matrix.

> source("potts.r")> s1 = matrix(data=c(1,2,3,3,1,2,2,3,1,1,2,2,1,1,1,2), 4, 4)> potts(m=3, tmax=0, init.state=s1)> potts(m=3, beta=0.9, tmax=10000, ref.state=s1)

An n × m matrix is created by the function matrix() from the vector passed as data. The entries are filled in from the first column to the last column of the matrix. To examine the external force in color, use it as the initial state, as in the call with tmax=0 above.
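For instance, matrix() fills column by column, so the first four entries of the data vector become the first column of s1:

> s1 = matrix(data=c(1,2,3,3,1,2,2,3,1,1,2,2,1,1,1,2), 4, 4)
> s1[,1]
[1] 1 2 3 3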

Behavior Around Critical Temperature

The Ising model is a special case of the Potts model with M = 2, since their respective Hamiltonian functions H(x) and H̃(x̃) satisfy H(x) = 2H̃(x̃) (up to an additive constant, which does not affect the Gibbs distribution) when x_v = +1 or −1 according as x̃_v = 1 or 0. Here we set J = 1/2 and the inverse temperature β = log(1 + √2) ≈ 0.88 for the Potts model. Then we can simulate the phase transition for the Ising model.

> potts(m=2,tmax=1000,beta=0.88)
> potts(m=2,tmax=1000,beta=0.89)

Change the grid size of the model, and observe the behavior around the critical temperature.

> potts(m=2,n=c(100,100), tmax=1000,beta=0.88)

Exploration with R

The initial value init.state and the external force ref.state can be assigned in the form of a matrix. When init.state is set to an integer from 1 to M, that value is assigned to every site at time t = 0.

- Choose a different number M of common states and a different inverse temperature β.

- See what the Gibbs sampler generates with or without an external force.

> potts(m=2,tmax=1000,beta=0.9)
> potts(m=2,tmax=0, init.state=f1)
> potts(m=2,tmax=1000,beta=0.9,ref.state=f1)
> potts(m=5,tmax=1000,beta=1.1)
> potts(m=5,tmax=0, init.state=f2)
> potts(m=5,tmax=1000,beta=1.1,ref.state=f2)
