Page 1:

Discrete Optimization, Lecture 4 – Part 3

M. Pawan Kumar

pawan.kumar@ecp.fr

Slides available online: http://cvn.ecp.fr/personnel/pawan/

Page 2:

Recap

Page 3:

Loopy Belief Propagation

Initialize all messages to 1

Repeat: in some order of edges, update messages

Mab;k = Σi ψa(li) ψab(li,lk) Πn≠b Mna;i

Until convergence: rate of change in messages < threshold

Convergence is not guaranteed!
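The update loop above can be sketched in Python as follows. This is a minimal illustration, not the lecture's implementation: the function name `loopy_bp`, the dictionary layout of the potentials, and the rescaling of each message to sum to 1 (a common trick to avoid numerical underflow that leaves the normalized beliefs unchanged) are all assumptions.

```python
import numpy as np

def loopy_bp(psi_unary, psi_pair, max_sweeps=100, threshold=1e-9):
    """Sum-product loopy BP on a pairwise MRF (illustrative sketch).

    psi_unary[a]     : 1-D array holding psi_a(l_i)
    psi_pair[(a, b)] : 2-D array holding psi_ab(l_i, l_k), one entry per edge
    Returns (list of node beliefs, converged flag).
    """
    n = len(psi_unary)
    nbrs = {a: set() for a in range(n)}
    for a, b in psi_pair:
        nbrs[a].add(b)
        nbrs[b].add(a)

    def pot(a, b):  # psi_ab indexed as [label of a, label of b]
        return psi_pair[(a, b)] if (a, b) in psi_pair else psi_pair[(b, a)].T

    # M[(a, b)][k] is the message from V_a to V_b for label l_k, initialized to 1.
    M = {(a, b): np.ones(len(psi_unary[b])) for a in nbrs for b in nbrs[a]}

    converged = False
    for _ in range(max_sweeps):
        change = 0.0
        for (a, b) in M:
            prod = np.array(psi_unary[a], dtype=float)
            for c in nbrs[a] - {b}:        # product over neighbors n != b
                prod *= M[(c, a)]
            new = pot(a, b).T @ prod       # sum over labels i of V_a
            new /= new.sum()               # rescaling: an assumption, see lead-in
            change = max(change, np.abs(new - M[(a, b)]).max())
            M[(a, b)] = new
        if change < threshold:
            converged = True
            break

    beliefs = []
    for a in range(n):
        b_a = np.array(psi_unary[a], dtype=float)
        for c in nbrs[a]:
            b_a *= M[(c, a)]
        beliefs.append(b_a / b_a.sum())
    return beliefs, converged
```

On a tree the returned beliefs are the exact marginals; on a graph with loops the `converged` flag may stay `False`, which is the "not guaranteed" case.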

Page 4:

Loopy Belief Propagation

B'a(i) = ψa(li) Πn Mna;i

B'ab(i,j) = ψa(li) ψb(lj) ψab(li,lj) Πn≠b Mna;i Πn≠a Mnb;j

Normalize to compute beliefs Ba(i), Bab(i,j)

At convergence, Σj Bab(i,j) = Ba(i)

Page 5:

Outline

• Free Energy

• Mean-Field Approximation

• Bethe Approximation

• Kikuchi Approximation

Yedidia, Freeman and Weiss, 2000

Page 6:

Exponential Family

P(v) = exp{-Σa Σi θa;i Ia;i(va) - Σa,b Σi,k θab;ik Iab;ik(va,vb) - A(θ)}

A(θ) : log Z

Probability P(v) = Πa ψa(va) Π(a,b) ψab(va,vb) / Z

ψa(li) : exp(-θa(i))    ψab(li,lk) : exp(-θab(i,k))

Page 7:

Exponential Family

Energy Q(v) = Σa θa(va) + Σa,b θab(va,vb)

Probability P(v) = Πa ψa(va) Π(a,b) ψab(va,vb) / Z = exp(-Q(v)) / Z
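For a very small model, the energy, the partition function Z and the probabilities can be computed by brute-force enumeration. The sketch below assumes a hypothetical toy model with two binary variables; the helper names `energy` and `exact_distribution` are illustrative and do not come from the lecture.

```python
import itertools
import numpy as np

def energy(v, theta_unary, theta_pair):
    """Q(v) = sum_a theta_a(v_a) + sum_(a,b) theta_ab(v_a, v_b)."""
    q = sum(theta_unary[a][v[a]] for a in range(len(v)))
    q += sum(t[v[a], v[b]] for (a, b), t in theta_pair.items())
    return q

def exact_distribution(theta_unary, theta_pair):
    """Enumerate all labelings; only feasible for very small models."""
    label_ranges = [range(len(t)) for t in theta_unary]
    labelings = list(itertools.product(*label_ranges))
    weights = np.array([np.exp(-energy(v, theta_unary, theta_pair))
                        for v in labelings])
    Z = weights.sum()                      # partition function
    return labelings, weights / Z, Z       # P(v) = exp(-Q(v)) / Z

# Hypothetical toy model: two binary variables that prefer to agree.
theta_unary = [np.array([0.0, 1.0]), np.array([0.5, 0.0])]
theta_pair = {(0, 1): np.array([[0.0, 2.0], [2.0, 0.0]])}
labelings, P, Z = exact_distribution(theta_unary, theta_pair)
```

The enumeration costs time exponential in the number of variables, which is why the rest of the lecture looks for a simpler distribution B(v).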

Page 8:

Exponential Family

Probability P(v) = Πa ψa(va) Π(a,b) ψab(va,vb) / Z = exp(-Q(v)) / Z

Approximate probability distribution B(v)

Minimize KL divergence between B(v) and P(v)

B(v) has a simpler form than P(v)

Page 9:

Kullback-Leibler Divergence

D = Σv B(v) log (B(v) / P(v))

Page 10:

Kullback-Leibler Divergence

D = Σv B(v) log B(v) - Σv B(v) log P(v)

Page 11:

Kullback-Leibler Divergence

D = Σv B(v) log B(v) + Σv B(v) Q(v) - (-log Z)

-log Z is the Helmholtz free energy, constant with respect to B

Page 12:

Kullback-Leibler Divergence

Σv B(v) log B(v) + Σv B(v) Q(v)

First term: negative entropy, -S(B)

Page 13:

Kullback-Leibler Divergence

Σv B(v) log B(v) + Σv B(v) Q(v)

Second term: average energy, U(B)

Page 14:

Kullback-Leibler Divergence

Σv B(v) log B(v) + Σv B(v) Q(v)

Gibbs free energy
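The decomposition D = (Gibbs free energy) - (Helmholtz free energy) can be checked numerically. The sketch below uses four hypothetical energy values and an arbitrary distribution B; the function names are illustrative.

```python
import numpy as np

def gibbs_free_energy(B, Q):
    """sum_v B(v) log B(v) + sum_v B(v) Q(v), with 0 log 0 taken as 0."""
    B, Q = np.asarray(B, float), np.asarray(Q, float)
    nz = B > 0
    return float(np.sum(B[nz] * np.log(B[nz])) + np.sum(B * Q))

def kl_divergence(B, P):
    """D = sum_v B(v) log(B(v) / P(v))."""
    B, P = np.asarray(B, float), np.asarray(P, float)
    nz = B > 0
    return float(np.sum(B[nz] * np.log(B[nz] / P[nz])))

Q = np.array([0.5, 2.0, 3.5, 1.0])   # hypothetical energies of four labelings
Z = np.exp(-Q).sum()
P = np.exp(-Q) / Z
B = np.array([0.4, 0.1, 0.1, 0.4])   # an arbitrary approximating distribution

D = kl_divergence(B, P)
G = gibbs_free_energy(B, Q)
helmholtz = -np.log(Z)
# D equals G - helmholtz, and G attains its minimum, helmholtz, at B = P.
```

Because the Helmholtz term does not depend on B, minimizing D over B is the same as minimizing the Gibbs free energy.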

Page 15:

Outline

• Free Energy

• Mean-Field Approximation

• Bethe Approximation

• Kikuchi Approximation

Page 16:

Simpler Distribution

One-node marginals Ba(i)

Joint probability B(v) = Πa Ba(va)

Page 17:

Average Energy

Σv B(v) Q(v)

Page 18:

Average Energy

Σv B(v) (Σa θa(va) + Σa,b θab(va,vb)) *

* = Simplify on board

Page 19:

Average Energy

Σa Σi Ba(i)θa(i) + Σa,b Σi,k Ba(i)Bb(k)θab(i,k)
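The simplification left for the board can be reconstructed as follows. This is a sketch that relies only on the factorized form B(v) = Πa Ba(va) and on each Ba summing to 1, so that every variable not appearing in a term sums out to 1.

```latex
\begin{align*}
\sum_v B(v)\,Q(v)
  &= \sum_a \sum_v B(v)\,\theta_a(v_a)
   + \sum_{(a,b)} \sum_v B(v)\,\theta_{ab}(v_a,v_b) \\
  &= \sum_a \sum_i B_a(i)\,\theta_a(i) \prod_{c \neq a} \sum_{v_c} B_c(v_c)
   + \sum_{(a,b)} \sum_{i,k} B_a(i)\,B_b(k)\,\theta_{ab}(i,k)
     \prod_{c \neq a,b} \sum_{v_c} B_c(v_c) \\
  &= \sum_a \sum_i B_a(i)\,\theta_a(i)
   + \sum_{(a,b)} \sum_{i,k} B_a(i)\,B_b(k)\,\theta_{ab}(i,k)
\end{align*}
```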

Page 20:

Negative Entropy

Σv B(v) log(B(v)) *

Page 21:

Negative Entropy

Σa Σi Ba(i)log(Ba(i))

Page 22:

Mean-Field Free Energy

Σa Σi Ba(i)θa(i) + Σa,b Σi,k Ba(i)Bb(k)θab(i,k)

+ Σa Σi Ba(i)log(Ba(i))

Page 23:

Optimization Problem

minB Σa Σi Ba(i)θa(i) + Σa,b Σi,k Ba(i)Bb(k)θab(i,k) + Σa Σi Ba(i)log(Ba(i))

s.t. Σi Ba(i) = 1 for every variable Va *

Page 24:

KKT Condition

log(Ba(i)) = -θa(i) - Σb Σk Bb(k)θab(i,k) + λa - 1

Ba(i) = exp(-θa(i) - Σb Σk Bb(k)θab(i,k)) / Za
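One way to arrive at this condition is sketched below. The sign convention chosen for the multiplier λa is an assumption, picked so that the result matches the slide; F_MF denotes the mean-field free energy and the sum over b runs over the neighbors of Va.

```latex
\begin{align*}
\mathcal{L} &= F_{\mathrm{MF}}(B) - \sum_a \lambda_a \Big( \sum_i B_a(i) - 1 \Big) \\
\frac{\partial \mathcal{L}}{\partial B_a(i)}
  &= \theta_a(i) + \sum_b \sum_k B_b(k)\,\theta_{ab}(i,k) + \log B_a(i) + 1 - \lambda_a = 0 \\
\Rightarrow\quad \log B_a(i)
  &= -\theta_a(i) - \sum_b \sum_k B_b(k)\,\theta_{ab}(i,k) + \lambda_a - 1 \\
Z_a &= \exp(1 - \lambda_a)
   = \sum_i \exp\Big( -\theta_a(i) - \sum_b \sum_k B_b(k)\,\theta_{ab}(i,k) \Big)
\end{align*}
```

The last line follows from the constraint Σi Ba(i) = 1, which fixes λa.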

Page 25:

Optimization

Initialize Ba (random, uniform, domain knowledge)

Set all random variables to unprocessed

Repeat:

Pick an unprocessed random variable Va

Update Ba(i) = exp(-θa(i) - Σb Σk Bb(k)θab(i,k)) / Za

If Ba changes, set neighbors to unprocessed

Until convergence

Convergence is guaranteed!

Tutorial: Jaakkola, 2000 (one of several)
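The procedure on this slide can be sketched in Python as follows. The function names, the uniform initialization, the tolerance used to decide whether Ba "changes", and the cap on the number of updates are assumptions made for illustration.

```python
import numpy as np

def mean_field(theta_unary, theta_pair, max_updates=10000, tol=1e-10):
    """Coordinate-wise mean-field updates on a pairwise MRF (sketch)."""
    n = len(theta_unary)
    nbrs = {a: set() for a in range(n)}
    for a, b in theta_pair:
        nbrs[a].add(b)
        nbrs[b].add(a)

    def theta(a, b):  # theta_ab indexed as [label of a, label of b]
        return theta_pair[(a, b)] if (a, b) in theta_pair else theta_pair[(b, a)].T

    B = [np.full(len(t), 1.0 / len(t)) for t in theta_unary]  # uniform start
    unprocessed = set(range(n))
    for _ in range(max_updates):
        if not unprocessed:
            break
        a = unprocessed.pop()
        field = np.array(theta_unary[a], dtype=float)
        for b in nbrs[a]:
            field += theta(a, b) @ B[b]        # sum_k B_b(k) theta_ab(i, k)
        new = np.exp(-(field - field.min()))   # shift for numerical stability
        new /= new.sum()                       # division by Z_a
        if np.abs(new - B[a]).max() > tol:
            unprocessed |= nbrs[a]             # neighbors must be revisited
        B[a] = new
    return B

def mean_field_free_energy(B, theta_unary, theta_pair):
    """Average energy plus negative entropy under B(v) = prod_a B_a(v_a)."""
    f = sum(np.sum(b * (t + np.log(b))) for b, t in zip(B, theta_unary))
    f += sum(B[a] @ t @ B[b] for (a, b), t in theta_pair.items())
    return float(f)
```

Each update minimizes the mean-field free energy exactly with respect to one Ba while the others stay fixed, so the free energy never increases from one update to the next.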

Page 26:

Outline

• Free Energy

• Mean-Field Approximation

• Bethe Approximation

• Kikuchi Approximation

Page 27:

Simpler Distribution

One-node marginals Ba(i)

Two-node marginals Bab(i,k)

Joint probability hard to write down

But not for trees

Page 28:

Simpler Distribution

One-node marginals Ba(i)

Two-node marginals Bab(i,k)

For trees: B(v) = Π(a,b) Bab(va,vb) / Πa Ba(va)^(n(a)-1)

n(a) = number of neighbors of Va

Pearl, 1988
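The tree factorization can be verified numerically on a small chain V0 - V1 - V2, which is a tree with n(0) = n(2) = 1 and n(1) = 2. The potentials below are random and purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
L = 3  # labels per variable
u = [rng.random(L) + 0.1 for _ in range(3)]                      # psi_a
p01, p12 = rng.random((L, L)) + 0.1, rng.random((L, L)) + 0.1    # psi_ab

# Exact joint P(v0, v1, v2) by brute force.
P = np.einsum('i,j,k,ij,jk->ijk', u[0], u[1], u[2], p01, p12)
P /= P.sum()

# Exact one-node and two-node marginals.
B1 = P.sum(axis=(0, 2))
B01, B12 = P.sum(axis=2), P.sum(axis=0)

# Only V1 has n(a) - 1 = 1, so only B1 appears in the denominator.
B = np.einsum('ij,jk->ijk', B01, B12) / B1[None, :, None]
# B reproduces P entry by entry.
```

On a graph with loops the same expression is generally not equal to P, and need not even be normalized, which is why the joint is "hard to write down" in that case.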

Page 29:

Average Energy

Σv B(v) Q(v)

Page 30:

Average Energy

Σv B(v) (Σa θa(va) + Σa,b θab(va,vb)) *

Page 31:

Average Energy

Σa Σi Ba(i)θa(i) + Σa,b Σi,k Bab(i,k)θab(i,k) *

Page 32:

Average Energy

-Σa (n(a)-1)Σi Ba(i)θa(i)

+ Σa,b Σi,k Bab(i,k)(θa(i)+θb(k)+θab(i,k))

n(a) = number of neighbors of Va

Page 33:

Negative Entropy

Σv B(v) log(B(v)) *

Page 34:

Negative Entropy

-Σa (n(a)-1)Σi Ba(i)log(Ba(i))

+ Σa,b Σi,k Bab(i,k)log(Bab(i,k))

Exact for tree

Approximate for general MRF

Page 35:

Bethe Free Energy

-Σa (n(a)-1) Σi Ba(i)(θa(i)+log(Ba(i)))

+ Σa,b Σi,k Bab(i,k)(θa(i)+θb(k)+θab(i,k)+log(Bab(i,k)))

Exact for tree

Approximate for general MRF
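A direct transcription of the Bethe free energy into Python is sketched below. The function name and data layout are assumptions, and the beliefs are assumed to be strictly positive so that the logarithms are finite.

```python
import numpy as np

def bethe_free_energy(theta_unary, theta_pair, node_beliefs, edge_beliefs):
    """Bethe free energy of a pairwise MRF (sketch, strictly positive beliefs).

    theta_pair and edge_beliefs are dicts keyed by the same edges (a, b),
    with arrays indexed as [label of a, label of b].
    """
    n_nbrs = [0] * len(theta_unary)
    for a, b in theta_pair:
        n_nbrs[a] += 1
        n_nbrs[b] += 1

    F = 0.0
    # Node terms: -(n(a) - 1) sum_i B_a(i) (theta_a(i) + log B_a(i))
    for a, Ba in enumerate(node_beliefs):
        F -= (n_nbrs[a] - 1) * np.sum(Ba * (theta_unary[a] + np.log(Ba)))
    # Edge terms: sum_{i,k} B_ab(i,k) (theta_a + theta_b + theta_ab + log B_ab)
    for (a, b), t in theta_pair.items():
        Bab = edge_beliefs[(a, b)]
        local = theta_unary[a][:, None] + theta_unary[b][None, :] + t
        F += np.sum(Bab * (local + np.log(Bab)))
    return float(F)
```

On a tree, evaluating this function at the exact marginals gives -log Z, which is one way to see the "exact for tree" claim. On a graph with loops the value is only an approximation.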

Page 36:

Optimization Problem

minB -Σa (n(a)-1) Σi Ba(i)(θa(i)+log(Ba(i))) + Σa,b Σi,k Bab(i,k)(θa(i)+θb(k)+θab(i,k)+log(Bab(i,k)))

s.t. Σk Bab(i,k) = Ba(i)

Σi,k Bab(i,k) = 1

Σi Ba(i) = 1 *

Page 37:

KKT Condition

log(Bab(i,k)) = -(θa(i)+θb(k)+θab(i,k)) + λab(k) + λba(i) + μab - 1

λab(k) = log(Mab;k)

Page 38:

Optimization

BP tries to optimize Bethe free energy

But it may not converge

Convergent alternatives exist

Yuille and Rangarajan, 2003

Page 39:

Outline

• Free Energy

• Mean-Field Approximation

• Bethe Approximation

• Kikuchi Approximation

Page 40:

Local Free Energy

[Figure: 2×2 grid MRF over V1, V2, V3, V4]

Cluster of variables c

Gc = Σvc Bc(vc)(log(Bc(vc)) + Σd ⊆ c θd(vd))

G12 = Σv1,v2 B12(v1,v2)(log(B12(v1,v2)) + θ1(v1) + θ2(v2) + θ12(v1,v2))

Page 41:

Local Free Energy

[Figure: 2×2 grid MRF over V1, V2, V3, V4]

Cluster of variables c

Gc = Σvc Bc(vc)(log(Bc(vc)) + Σd ⊆ c θd(vd))

G1 = Σv1 B1(v1)(log(B1(v1)) + θ1(v1))

Page 42:

Local Free Energy

[Figure: 2×2 grid MRF over V1, V2, V3, V4]

Cluster of variables c

Gc = Σvc Bc(vc)(log(Bc(vc)) + Σd ⊆ c θd(vd))

G1234 = Σv1,v2,v3,v4 B1234(v1,v2,v3,v4)(log(B1234(v1,v2,v3,v4)) + θ1(v1) + θ2(v2) + θ3(v3) + θ4(v4) + θ12(v1,v2) + θ13(v1,v3) + θ24(v2,v4) + θ34(v3,v4))

Page 43:

Sum of Local Free Energies

[Figure: 2×2 grid MRF over V1, V2, V3, V4]

Sum of free energies of all pairwise clusters

G12 + G13 + G24 + G34

Overcounts G1, G2, G3, G4 once!

Page 44:

Sum of Local Free Energies

[Figure: 2×2 grid MRF over V1, V2, V3, V4]

Sum of free energies of all pairwise clusters

G12 + G13 + G24 + G34 - G1 - G2 - G3 - G4

Page 45:

Sum of Local Free Energies

[Figure: 2×2 grid MRF over V1, V2, V3, V4]

Sum of free energies of all pairwise clusters

G12 + G13 + G24 + G34 - G1 - G2 - G3 - G4

Bethe Approximation!
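The link to the Bethe free energy can be made explicit with the local free energies Gab and Ga defined on the previous slides. The general form below, with each single-variable term subtracted n(a) - 1 times, is a sketch of the step that the 2×2 grid illustrates.

```latex
\begin{align*}
\sum_{(a,b)} G_{ab} - \sum_a \big(n(a)-1\big)\, G_a
  &= \sum_{(a,b)} \sum_{i,k} B_{ab}(i,k)
     \big( \theta_a(i) + \theta_b(k) + \theta_{ab}(i,k) + \log B_{ab}(i,k) \big) \\
  &\quad - \sum_a \big(n(a)-1\big) \sum_i B_a(i)
     \big( \theta_a(i) + \log B_a(i) \big)
\end{align*}
```

On the 2×2 grid every variable has n(a) = 2 neighbors, so each Ga is subtracted exactly once, giving G12 + G13 + G24 + G34 - G1 - G2 - G3 - G4.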

Page 46:

Kikuchi Approximations

[Figure: 2×2 grid MRF over V1, V2, V3, V4]

Use bigger clusters

G1234

Page 47:

Kikuchi Approximations

[Figure: 2×3 grid MRF, rows V4 V5 V6 and V1 V2 V3]

Use bigger clusters

G1245 + G2356 - G25

Derive message passing using KKT conditions!

Page 48:

Generalized Belief Propagation

[Figure: 2×3 grid MRF, rows V4 V5 V6 and V1 V2 V3]

Use bigger clusters

G1245 + G2356 - G25

Derive message passing using KKT conditions!
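The coefficients in G1245 + G2356 - G25 follow a simple overcounting rule. The rule is not stated on the slides; it is the standard one from the cluster variation method, where each region's counting number is 1 minus the sum of the counting numbers of all regions that strictly contain it. The function name `counting_numbers` is illustrative.

```python
def counting_numbers(regions):
    """Counting number c_r = 1 - sum of c_s over regions s strictly containing r.

    regions: iterable of frozensets of variable indices (the chosen clusters
    together with their intersections).
    """
    ordered = sorted(set(regions), key=len, reverse=True)  # largest first
    c = {}
    for r in ordered:
        # Every strict superset of r is larger, so it has already been processed.
        c[r] = 1 - sum(c[s] for s in c if r < s)
    return c

# Kikuchi example from the slide: clusters {1,2,4,5}, {2,3,5,6}, overlap {2,5}.
kikuchi = counting_numbers([frozenset({1, 2, 4, 5}),
                            frozenset({2, 3, 5, 6}),
                            frozenset({2, 5})])

# Bethe example on the 2x2 grid: four edges and four single variables.
edges = [frozenset(e) for e in [(1, 2), (1, 3), (2, 4), (3, 4)]]
nodes = [frozenset({a}) for a in (1, 2, 3, 4)]
bethe = counting_numbers(edges + nodes)
```

The two large clusters get coefficient +1 and their overlap {2,5} gets -1. In the Bethe case each edge gets +1 and each single variable gets -1, matching the earlier sum of local free energies.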