Approximate Inference in Graphical Models using LP...
Transcript of Approximate Inference in Graphical Models using LP...
Approximate Inference in Graphical Models using LP Relaxations
David Sontag�
Based on joint work with Tommi Jaakkola, Amir Globerson, Talya Meltzer, and Yair Weiss
Small scale validation cont’d• We can also directly examine changes due to OR1 knock-out
Preliminary validation• The !-phage (bacteria infecting virus)!"#$%&'#($)*#+,-./
0#,12(#'
3-142(#
5&1-6#
7&8.6,
OR3 OR1OR2cI cro
cI RNAp2
Arkin et al. 1998
• We can find the equilibrium of the game (binding frequencies)as a function of overall protein concentrations.
50
10!2
100
102
0
0.2
0.4
0.6
0.8
1
Binding in OR
3
frepressor
/fRNA
Bin
din
g F
req
ue
ncy (
tim
e!
ave
rag
e)
RepressorRNA!polymerase
(a) OR3
10!2
100
102
0
0.2
0.4
0.6
0.8
1
Binding in OR
2
frepressor
/fRNA
Bin
din
g F
req
ue
ncy (
tim
e!
ave
rag
e)
RepressorRNA!polymerase
(b) OR2
10!2
100
102
0
0.2
0.4
0.6
0.8
1
Binding in OR
1
frepressor
/fRNA
Bin
din
g F
req
ue
ncy (
tim
e!
ave
rag
e)
RepressorRNA!polymerase
(c) OR1
Figure 3: Predicted protein binding to sites OR3, OR2, and mutated OR1 for increasing amounts of cI2.
became su!ciently high do we find cI2 at the mutatedOR1 as well. Note, however, that cI2 inhibits transcrip-tion at OR3 prior to occupying OR1. Thus the bindingat the mutated OR1 could not be observed without in-terventions.
7 Discussion
We believe the game theoretic approach provides a com-pelling causal abstraction of biological systems with re-source constraints. The model is complete with prov-ably convergent algorithms for finding equilibria on agenome-wide scale.
The results from the small scale application are en-couraging. Our model successfully reproduces knownbehavior of the !!switch on the basis of molecularlevel competition and resource constraints, without theneed to assume protein-protein interactions between cI2dimers and cI2 and RNA-polymerase. Even in the con-text of this well-known sub-system, however, few quan-titative experimental results are available about bind-ing. Proper validation and use of our model thereforerelies on estimating the game parameters from availableprotein-DNA binding data (in progress). Once the gameparameters are known, the model provides valid pre-dictions for a number of possible perturbations to thesystem, including changing nuclear concentrations andknock-outs.
Acknowledgments
This work was supported in part by NIH grant GM68762and by NSF ITR grant 0428715. Luis Perez-Breva is a“Fundacion Rafael del Pino” Fellow.
References
[1] Adam Arkin, John Ross, and Harley H. McAdams.Stochastic kinetic analysis of developmental path-way bifurcation in phage !-infected excherichia colicells. Genetics, 149:1633–1648, August 1998.
[2] Kenneth J. Arrow and Gerard Debreu. Existence ofan equilibrium for a competitive economy. Econo-metrica, 22(3):265–290, July 1954.
[3] Z. Bar-Joseph, G. Gerber, T. Lee, N. Rinaldi,J. Yoo, B. Gordon F. Robert, E. Fraenkel,T. Jaakkola, R. Young, and D. Gi"ord. Compu-tational discovery of gene modules and regulatorynetworks. Nature Biotechnology, 21(11):1337–1342,2003.
[4] Otto G. Berg, Robert B. Winter, and Peter H. vonHippel. Di"usion- driven mechanisms of proteintranslocation on nucleic acids. 1. models and theory.Biochemistry, 20(24):6929–48, November 1981.
[5] Drew Fudenberg and Jean Tirole. Game Theory.The MIT Press, 1991.
10
• Predictions are again qualitatively correct
52
MAP in Undirected Graphical Models
θih(xi, xh)
θjk(xj, xk) θij(xi, xj)
xi
xj
xk
xh G=(V,E)
Protein backbone
Side-chains
θij(xi, xj) xj
xi
Real-world problems:
Protein design
Stereo vision
Find most likely assignment:
How to solve MAP?
MAP is known to be NP-hard (e.g., MAP on binary MRFs is equivalent to Max-Cut)
Real-world MAP problems are not necessarily as hard as theoretical worst case
How to solve MAP?
New toolkit: Message-passing algorithms based on linear programming relaxations�(Schlensinger ’76, Kolmogorov & Wainwright ‘05, Vontobel & Koetter ‘06, Johnson et al. ’07, Komodakis et al. ‘07, Globerson & Jaakkola ’08…)
Solves exactly when LP relaxation is tight:�trees, binary submodular MRFs, and matchings
In practice, we seldom have these structures
By tightening the relaxation (problem specific), we can solve hard real-world problems, exactly
We can formulate the MAP problem as a linear program
MAP as a linear program
The marginal polytope constrains the to be marginals of some distribution:
Very many constraints! Obj.
xmap
Vertices correspond to assignments
where the variables are defined over edges.
Partial pairwise�consistency }
Tightening the LP
Such that�
Relaxation
MAP Objective
1 2 3
4 5 6
7 8 9
Partial pairwise�consistency }
Tightening the LP
Such that�
Relaxation
MAP Objective
1 2 3
4 5 6
7 8 9
Tightening the LP
Such that�
Relaxation
MAP Objective
… …
Can we efficiently solve the LP?
What clusters to add?
How do we avoid re-solving?
Great! But…
Might be “lucky”�and solve earlier
Our solution
Can we efficiently solve the LP? We work in one of the dual LPs (Globerson & Jaakkola ‘07)
Dual can be solved by an efficient message-passing algorithm Corresponds to coordinate-descent algorithm
What cluster to add next? We propose a greedy bound minimization algorithm Add clusters with guaranteed improvement – upper bound gets tighter
How do we avoid re-solving? “Warm start” of new messages using the old messages
Dual
Primal MAP Obj.
Dual algorithm
Iteration
2. Decode assignment from messages
MAP
Dual
Objective
1. Run message-passing
4. Warm start: initialize new cluster messages 3. Choose a cluster to add to relaxation
Is gap (dual obj – assignment val) small?
Integer solution
Messages
Done! No.
Same objective value
What cluster to add next?
Iteration
MAP
Dual
Objective
Full decrease due to cluster One full iteration with cluster One outgoing message from cluster
xi
zj
zk
zi
zkxk
zi
xi
xj zj,c! zj,c!
zi,c! zi,c!
zk,c!
Coarsened cluster consistency
Each new cluster requires adding a large number of LP variables and constraints
Is it possible to use just a subset of these constraints?
We give a new class of sparse cluster constraints, enforcing consistency on coarsened variables
(Sontag, Globerson, Jaakkola, NIPS ‘08)
µijk(xi, xj , xk)
Experiments: Protein design
Given protein’s 3D shape, choose amino-acids giving the most stable structure
Each state corresponds to a choice of amino-acid and side-chain angle MRFs have 41-180 variables, each variable with 95-158 states Hard to solve
Very large treewidth Many small cycles (20,000 triangles) and frustration
θih(xi, xh)
θjk(xj, xk) θik(xi, xk)
xi
xk
xj
xh G=(V,E)
Protein backbone
Side-chains
(MRFs from Yanover, Meltzer, Weiss ‘06)
Primal LP, pairwise, is large
(Yanover, Meltzer, Weiss, JMLR ‘06)
CPLEX can only run on 3:�must move to dual!
Pairwise constraints solve only 2 of the 97 proteins
Iteratively tightening relaxation with triplets, we exactly solve 96 of the 97 proteins (!!!)
Using the coarsened clusters, average time to solve 15 largest proteins is 1.5 hours
Bound criterion finds the right constraints: Only 5 to 735 triplets needed to be added per problem
Protein design results
Coarsening clusters really helps
0 1 2 3 4 5160
180
200
220
240
260
Hours
Ob
ject
ive
This paper
Sontag et al. UAI ’08
Primal (best decoding)
Dual
Sparse constraints (NIPS ’08)
Original (UAI ’08)
1000 1200 1400 1600 18000
5
10
15
20
25
30
35
Iteration Number
Tim
e (S
econds)
This paper
Sontag et al. UAI ’08
Sparse constraints (NIPS ’08)
Original (UAI ’08)
Related Work
Similar ideas can be done directly in the primal Selection criteria of constraint violation instead of bound minimization
(Sontag & Jaakkola ’08)
Can also be applied to marginals Guidance by bound on partition function rather than MAP value Similar to region-pursuit algorithm for generalized BP (Welling
UAI ’04)
Conclusions & Future Work
New toolkit of message-passing algorithms based on dual LP relaxations� +�Iterative tightening of LP relaxation� =�Ability to solve interesting real world-problems
More generally, when can we expect these MAP inference techniques to be successful?
How should we do learning with approximate inference – in particular, with LP relaxations?