Overview of AIC-PRAiSE: (Lifted First-Order) Probabilistic Reasoning As Symbolic Evaluation
Propositional factor graphs
• Simply a set of potential functions on random variables
[Figure: factor graph over the random variables "john smokes", "john and mary are friends" and "mary smokes", with the factors Φ1, Φ2 and Φ3]
Φ1(john and mary are friends, john smokes, mary smokes)
Φ2(john smokes)
Φ3(mary smokes)
P(john and mary are friends, john smokes, mary smokes) = α Φ1(john and mary are friends, john smokes, mary smokes) Φ2(john smokes) Φ3(mary smokes)
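A minimal Python sketch of the factorization above, assuming boolean random variables. The numbers returned by phi1, phi2 and phi3 are hypothetical, chosen only for illustration; α is computed as the reciprocal of the sum of the unnormalized products over all assignments.

```python
from itertools import product

# Hypothetical potential values, for illustration only.
def phi1(friends, john_smokes, mary_smokes):
    # Friends are assumed more likely to share the same smoking habit.
    if friends and john_smokes == mary_smokes:
        return 3.0
    return 1.0

def phi2(john_smokes):
    return 0.3 if john_smokes else 0.7

def phi3(mary_smokes):
    return 0.2 if mary_smokes else 0.8

def unnormalized(friends, john_smokes, mary_smokes):
    """Product of all potential functions for one assignment."""
    return (phi1(friends, john_smokes, mary_smokes)
            * phi2(john_smokes)
            * phi3(mary_smokes))

# Assignments are (friends, john_smokes, mary_smokes) triples.
assignments = list(product([True, False], repeat=3))
alpha = 1.0 / sum(unnormalized(*a) for a in assignments)
joint = {a: alpha * unnormalized(*a) for a in assignments}
```

With these made-up potentials, joint maps each of the 8 assignments to its probability, and the values sum to 1.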
Propositional factor graphs
• Bayesian nets can be represented as factor graphs
• Potential functions correspond to conditional probabilities
[Figure: the same model drawn as a Bayesian network and as a factor graph, over the variables alarm, earthquake and burglary; the factor graph has the factors P(alarm|earthquake, burglary), P(earthquake) and P(burglary)]
P(alarm, earthquake, burglary) = α P(alarm|earthquake, burglary) P(earthquake) P(burglary)
Belief Propagation (BP)
[Figure: factor graph over alarm, earthquake and burglary]
P(alarm) = α sum_{earthquake, burglary} P(alarm|earthquake, burglary) P(earthquake) P(burglary) = ?
Belief Propagation (BP)
P(alarm) = α sum_{earthquake, burglary} P(alarm|earthquake, burglary) P(earthquake) P(burglary)
= α sum_{earthquake} P(earthquake) (sum_{burglary} P(burglary) P(alarm|earthquake, burglary))
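The rearrangement above can be checked numerically. The sketch below assumes boolean variables and hypothetical conditional probability values (none of them come from the slides); it compares the flat double sum with the nested form in which the sum over burglary is pushed inside the sum over earthquake.

```python
# Hypothetical probabilities, for illustration only.
P_EARTHQUAKE = {True: 0.01, False: 0.99}
P_BURGLARY = {True: 0.02, False: 0.98}
# P(alarm = true | earthquake, burglary), keyed by (earthquake, burglary).
P_ALARM_TRUE = {
    (True, True): 0.95,
    (True, False): 0.3,
    (False, True): 0.9,
    (False, False): 0.001,
}
VALUES = (True, False)

def p_alarm_given(alarm, earthquake, burglary):
    p = P_ALARM_TRUE[(earthquake, burglary)]
    return p if alarm else 1.0 - p

def marginal_flat(alarm):
    """Sum over all (earthquake, burglary) pairs at once."""
    return sum(
        p_alarm_given(alarm, e, b) * P_EARTHQUAKE[e] * P_BURGLARY[b]
        for e in VALUES
        for b in VALUES
    )

def marginal_nested(alarm):
    """Same quantity, with the sum over burglary pushed inward."""
    return sum(
        P_EARTHQUAKE[e]
        * sum(P_BURGLARY[b] * p_alarm_given(alarm, e, b) for b in VALUES)
        for e in VALUES
    )
```

Both functions return the same value for each value of alarm. Since the factors here are already conditional probabilities, α is 1 in this particular case.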
Belief Propagation (BP)
From factor to variable:
μ(V ← F) = sum_{V' in (args of F) - {V}} F(args of F) * prod_{V' in (args of F) - {V}} μ(F ← V')
From variable to factor:
μ(F ← V) = prod_{F' in neighbors(V) - {F}} μ(V ← F')
belief(V) = α prod_{F in neighbors(V)} μ(V ← F)
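The two message definitions and the belief formula can be sketched as mutually recursive Python functions. This is a minimal illustration, not the PRAiSE implementation: it assumes boolean variables, a tree-shaped factor graph (so the recursion terminates), and hypothetical probability values for the alarm model. The names FACTORS, msg_to_variable and msg_to_factor are introduced here for illustration.

```python
from itertools import product
from math import prod

VALUES = (True, False)

# Hypothetical numbers, for illustration only.
def alarm_factor(alarm, earthquake, burglary):
    p = {(True, True): 0.95, (True, False): 0.3,
         (False, True): 0.9, (False, False): 0.001}[(earthquake, burglary)]
    return p if alarm else 1.0 - p

# Each factor is (argument names, function of those arguments).
FACTORS = {
    "f_alarm": (("alarm", "earthquake", "burglary"), alarm_factor),
    "f_earthquake": (("earthquake",), lambda e: 0.01 if e else 0.99),
    "f_burglary": (("burglary",), lambda b: 0.02 if b else 0.98),
}

def neighbors(variable):
    return [name for name, (args, _) in FACTORS.items() if variable in args]

def msg_to_variable(variable, factor):
    """Message to `variable` from `factor`: sum out the factor's other arguments."""
    args, fn = FACTORS[factor]
    others = [a for a in args if a != variable]
    incoming = {o: msg_to_factor(factor, o) for o in others}
    message = {}
    for value in VALUES:
        total = 0.0
        for combo in product(VALUES, repeat=len(others)):
            assignment = dict(zip(others, combo))
            assignment[variable] = value
            weight = prod(incoming[o][assignment[o]] for o in others)
            total += fn(*(assignment[a] for a in args)) * weight
        message[value] = total
    return message

def msg_to_factor(factor, variable):
    """Message to `factor` from `variable`: product over the variable's other factors."""
    others = [f for f in neighbors(variable) if f != factor]
    incoming = [msg_to_variable(variable, f) for f in others]
    return {v: prod(m[v] for m in incoming) for v in VALUES}

def belief(variable):
    incoming = [msg_to_variable(variable, f) for f in neighbors(variable)]
    unnormalized = {v: prod(m[v] for m in incoming) for v in VALUES}
    z = sum(unnormalized.values())
    return {v: unnormalized[v] / z for v in VALUES}
```

With these made-up numbers, belief("alarm")[True] agrees with the marginal obtained by summing the joint over earthquake and burglary directly.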
A Simple Example
• The probability of an epidemic happening is 10%
• For each person in a population of 1,000 people, the probability of that person getting sick is 40% if there is an epidemic and 1% if there is not an epidemic
• Query: given that three people are sick and everybody else is not, what is the probability of an epidemic?
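Using a Bayesian Net to Solve it
[Figure: Bayesian network with parent node epidemic and children sick1, sick2, sick3, ..., sick999, sick1000; evidence: sick1 = true, sick2 = true, sick3 = true, and all the others (..., sick999, sick1000) = false]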
Making it more logic-like
We will use a first-order logic predicate notation instead of indices, because we want to write logic-like rules with them:
sick1 = sick(person1)
sick2 = sick(person2)
...
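Using a Bayesian Net to Solve it
[Figure: the same Bayesian network in predicate notation: epidemic with children sick(person1), sick(person2), sick(person3), ..., sick(person999), sick(person1000); sick(person1), sick(person2) and sick(person3) = true, all the others = false]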
Lots of messages have the same values and are derived from essentially the same computation. We could instead compute each of them only once and then exponentiate.
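As a rough sketch of that idea for the epidemic example: only two distinct messages reach epidemic (one from a sick person, one from a healthy person), so each can be computed once and raised to the number of people it applies to. The code works in log space, an implementation choice made here to avoid underflow, so exponentiation becomes multiplication by a count. Function names are illustrative.

```python
import math

N_PEOPLE = 1000
N_SICK = 3
PRIOR = {True: 0.1, False: 0.9}        # P(epidemic)
P_SICK = {True: 0.4, False: 0.01}      # P(sick | epidemic)
VALUES = (True, False)

def log_message(is_sick):
    """Log of the message on epidemic sent by one person with known status."""
    return {e: math.log(P_SICK[e] if is_sick else 1.0 - P_SICK[e])
            for e in VALUES}

def normalize(log_belief):
    top = max(log_belief.values())
    weights = {e: math.exp(v - top) for e, v in log_belief.items()}
    z = sum(weights.values())
    return {e: w / z for e, w in weights.items()}

def posterior_one_message_per_person():
    """Computes 1000 messages, most of them identical."""
    log_belief = {e: math.log(PRIOR[e]) for e in VALUES}
    for person in range(N_PEOPLE):
        message = log_message(is_sick=person < N_SICK)
        for e in VALUES:
            log_belief[e] += message[e]
    return normalize(log_belief)

def posterior_compute_once():
    """Computes the two distinct messages once and 'exponentiates'."""
    sick, healthy = log_message(True), log_message(False)
    log_belief = {
        e: math.log(PRIOR[e]) + N_SICK * sick[e] + (N_PEOPLE - N_SICK) * healthy[e]
        for e in VALUES
    }
    return normalize(log_belief)
```

Both functions return the same posterior. With only 3 sick people out of 1,000, the posterior probability of an epidemic is vanishingly small, because 997 healthy people are far more likely without an epidemic (0.99 each) than with one (0.6 each).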
Using a Bayesian Net to Solve it
[Figure: factor graph with epidemic connected to sick(person1), sick(person2), sick(person3), ..., sick(person999), sick(person1000)]
But a regular graphical model inference algorithm will compute all the repeated messages
We could write an algorithm for this specific model, but what if we don’t know the model in advance because we are writing an inference engine, or the model is going to be learned?
An Algebraic View
Representing concepts with mathematical expressions
[Figure: factor graph fragment with the nodes epidemic and sick(person1)]
• Random variables: [ epidemic ] (its value is simply epidemic, like X and x in statistics) and [ sick(person1) ]
• Factors: [ if epidemic then 0.1 else 0.9 ] and [ if epidemic then if sick(person1) then 0.4 else 0.6 else if sick(person1) then 0.01 else 0.99 ]
• Messages: message to [ epidemic ] from [ if epidemic then 0.1 else 0.9 ], and message to [ if epidemic then if sick(person1) then 0.4 else 0.6 else if sick(person1) then 0.01 else 0.99 ] from [ sick(person1) ]
An Algebraic View
We can represent the set of factors with a single expression:
{ [ if epidemic then 0.1 else 0.9 ] }
union { [ if epidemic then if sick(person1) then 0.4 else 0.6 else if sick(person1) then 0.01 else 0.99 ] }
union … union { [ if epidemic then if sick(person1000) then 0.4 else 0.6 else if sick(person1000) then 0.01 else 0.99 ] }
union { [ if sick(person1) then 1 else 0 ], [ if sick(person2) then 1 else 0 ], [ if sick(person3) then 1 else 0 ] }
union { [ if sick(person4) then 0 else 1 ], …, [ if sick(person1000) then 0 else 1 ] }
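One way to picture factors as expressions is a small expression tree. The sketch below is an assumption made for illustration (nested tuples for if-then-else, strings for atoms), not the representation PRAiSE uses. It builds the extensional set of factors for the 1,000-person model.

```python
def ite(condition, then, otherwise):
    """Build the expression 'if condition then ... else ...'."""
    return ("if", condition, then, otherwise)

def render(expr):
    """Print an expression in the notation used on the slides."""
    if isinstance(expr, tuple):
        _, condition, then, otherwise = expr
        return f"if {condition} then {render(then)} else {render(otherwise)}"
    return str(expr)

def evaluate(expr, world):
    """Value of a factor in a world that maps atom names to booleans."""
    while isinstance(expr, tuple):
        _, condition, then, otherwise = expr
        expr = then if world[condition] else otherwise
    return expr

def sickness_factor(person):
    sick = f"sick({person})"
    return ite("epidemic", ite(sick, 0.4, 0.6), ite(sick, 0.01, 0.99))

people = [f"person{i}" for i in range(1, 1001)]
factors = (
    [ite("epidemic", 0.1, 0.9)]                        # prior on epidemic
    + [sickness_factor(p) for p in people]             # one factor per person
    + [ite(f"sick({p})", 1, 0) for p in people[:3]]    # evidence: sick
    + [ite(f"sick({p})", 0, 1) for p in people[3:]]    # evidence: not sick
)
```

Written out this way the set holds 2,001 separate factor expressions, which is the redundancy an intensional description avoids.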
An Algebraic View
Now that we can denote factor graph objects and quantities with mathematical expressions, we can write:
belief(epidemic) = prod_{F in neighbors([epidemic])} message to [epidemic] from F
We then compute:
neighbors([epidemic]) =
{ [ if epidemic then 0.1 else 0.9 ] }
union { [ if epidemic then if sick(person1) then 0.4 else 0.6 else if sick(person1) then 0.01 else 0.99 ] }
union … union { [ if epidemic then if sick(person1000) then 0.4 else 0.6 else if sick(person1000) then 0.01 else 0.99 ] }
And plug it back:
An Algebraic View
belief(epidemic) =
prod_{F in { [ if epidemic then 0.1 else 0.9 ] }
union { [ if epidemic then if sick(person1) then 0.4 else 0.6 else if sick(person1) then 0.01 else 0.99 ] }
union … union { [ if epidemic then if sick(person1000) then 0.4 else 0.6 else if sick(person1000) then 0.01 else 0.99 ] } }
message to [epidemic] from F
An Algebraic View
belief(epidemic) =
message to [epidemic] from [ if epidemic then 0.1 else 0.9 ]
* message to [epidemic] from [ if epidemic then if sick(person1) then 0.4 else 0.6 else if sick(person1) then 0.01 else 0.99 ]
* ...
* message to [epidemic] from [ if epidemic then if sick(person1000) then 0.4 else 0.6 else if sick(person1000) then 0.01 else 0.99 ]
[Figure: epidemic receiving messages from the factors for sick(person1), ..., sick(person1000)]
Doesn’t change anything, really; we still need to compute messages from 1000 nodes!
Intensional Representation
We now introduce an intensional way of representing the set of factors in the model:
{ [ if epidemic then 0.1 else 0.9 ] }
union {{ (on X in People) [ if epidemic then if sick(X) then 0.4 else 0.6 else if sick(X) then 0.01 else 0.99 ] }}
union {{ (on X in People) [ if sick(X) then 1 else 0 ] | X = person1 or X = person2 or X = person3 }}
union {{ (on X in People) [ if sick(X) then 0 else 1 ] | X != person1 and X != person2 and X != person3 }}
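The intensional sets above can be mimicked with a template plus a condition on X. This is a hedged sketch: the class name IntensionalSet and its ground method are hypothetical, introduced only to show that the description stays the same size however large People is, while grounding it grows with the population.

```python
from dataclasses import dataclass
from typing import Callable

def ite(condition, then, otherwise):
    """Build the expression 'if condition then ... else ...'."""
    return ("if", condition, then, otherwise)

@dataclass(frozen=True)
class IntensionalSet:
    """Stands for {{ (on X in People) template(X) | condition(X) }}."""
    template: Callable[[str], tuple]
    condition: Callable[[str], bool]

    def ground(self, people):
        """Expand into one factor per individual satisfying the condition."""
        return [self.template(x) for x in people if self.condition(x)]

SICK_PEOPLE = {"person1", "person2", "person3"}

prior = ite("epidemic", 0.1, 0.9)
model = [
    IntensionalSet(
        lambda x: ite("epidemic",
                      ite(f"sick({x})", 0.4, 0.6),
                      ite(f"sick({x})", 0.01, 0.99)),
        lambda x: True,
    ),
    IntensionalSet(lambda x: ite(f"sick({x})", 1, 0),
                   lambda x: x in SICK_PEOPLE),
    IntensionalSet(lambda x: ite(f"sick({x})", 0, 1),
                   lambda x: x not in SICK_PEOPLE),
]

def ground_model(people):
    factors = [prior]
    for intensional_set in model:
        factors.extend(intensional_set.ground(people))
    return factors
```

Here model always has three entries (plus the prior), whereas ground_model returns 2,001 factors for 1,000 people. Lifted inference aims to work on the former without producing the latter.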
Intensional Representation
Intensional version:
belief(epidemic) = prod_{F in neighbors([epidemic])} message to [epidemic] from F
We then compute:
neighbors([epidemic]) =
{ [ if epidemic then 0.1 else 0.9 ] }
union {{ (on X in People) [ if epidemic then if sick(X) then 0.4 else 0.6 else if sick(X) then 0.01 else 0.99 ] }}
And plug it back:
Intensional Representation
belief(epidemic)
= prod_{F in { [ if epidemic then 0.1 else 0.9 ] } union {{ (on X in People) [ if epidemic then if sick(X) then 0.4 else 0.6 else if sick(X) then 0.01 else 0.99 ] }} } message to [epidemic] from F
= message to [epidemic] from [ if epidemic then 0.1 else 0.9 ]
* prod_{F in {{ (on X in People) [ if epidemic then if sick(X) then 0.4 else 0.6 else if sick(X) then 0.01 else 0.99 ] }} } message to [epidemic] from F
Intensional Representation
belief(epidemic)
= message to [epidemic] from [ if epidemic then 0.1 else 0.9 ]
* prod_{F in {{ (on X in People) [ if epidemic then if sick(X) then 0.4 else 0.6 else if sick(X) then 0.01 else 0.99 ] }} } message to [epidemic] from F
= message to [epidemic] from [ if epidemic then 0.1 else 0.9 ]
* prod_{X in People} message to [epidemic] from [ if epidemic then if sick(X) then 0.4 else 0.6 else if sick(X) then 0.01 else 0.99 ]
Intensional Representation
belief(epidemic) =
message to [epidemic] from [ if epidemic then 0.1 else 0.9 ]
* prod_{X in People} message to [epidemic] from [ if epidemic then if sick(X) then 0.4 else 0.6 else if sick(X) then 0.01 else 0.99 ]
[Figure: epidemic receiving one message from [ if epidemic then 0.1 else 0.9 ] and, through prod_X, the messages from [ if epidemic then if sick(X) then 0.4 else 0.6 else if sick(X) then 0.01 else 0.99 ]; prod_X stands for 1000 nodes sending 1000 messages, which are multiplied]
Does it make things better? How do I compute that expression?
Symbolic Evaluation
• Symbolic evaluation is about evaluating expressions even if we don’t know the value of everything in them:
• 1 + 2 + 3 → 6
• X + 2 + 3 + 0*Y → X + 5
• 3 in { 1, 2, 3 } → true
• [ sick(X) ] in { [sick(john)], [sick(mary)] } → X = john or X = mary
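A toy Python sketch of these examples, under stated assumptions: a sum is given as a list whose items are numbers, symbol names, or (coefficient, symbol) pairs, and results are returned as strings. The functions simplify_sum and membership are hypothetical helpers written for this illustration, far simpler than a real symbolic evaluator.

```python
def simplify_sum(terms):
    """Fold the known numbers of a sum and keep the unknown symbols."""
    constant = 0
    symbolic = []
    for term in terms:
        if isinstance(term, (int, float)):
            constant += term
        elif isinstance(term, tuple):
            coefficient, symbol = term
            if coefficient != 0:          # 0*Y vanishes whatever Y is
                symbolic.append(f"{coefficient}*{symbol}")
        else:
            symbolic.append(term)
    parts = list(symbolic)
    if constant != 0 or not parts:
        parts.append(str(constant))
    return " + ".join(parts)

def membership(argument, known_arguments):
    """[p(argument)] in { [p(a1)], [p(a2)], ... } for an unknown argument.

    The answer is not true or false but a condition on the unknown.
    """
    return " or ".join(f"{argument} = {a}" for a in known_arguments)

print(simplify_sum([1, 2, 3]))              # 6
print(simplify_sum(["X", 2, 3, (0, "Y")]))  # X + 5
print(membership("X", ["john", "mary"]))    # X = john or X = mary
```

The point of the last example is that the result of a symbolic evaluation may itself be an expression that still mentions the unknown.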
Symbolic Evaluation
Externalizing if-then-else constructs:
f(if Condition then A else B) = if Condition then f(A) else f(B)
Example:
income(X) := salary(X) + 2
salary(X) := if X = bob then 7 else 1
income(bob) = salary(bob) + 2 = 7 + 2 = 9
income(Y) = salary(Y) + 2
= (if Y = bob then 7 else 1) + 2
= if Y = bob then 7 + 2 else 1 + 2
= if Y = bob then 9 else 3
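The externalization step can be sketched in Python as follows. The tuple encoding of if-then-else and the convention that names starting with an uppercase letter are unknown logical variables are assumptions made for this illustration.

```python
def is_conditional(expr):
    return isinstance(expr, tuple) and expr[0] == "if"

def add(a, b):
    """Evaluate a + b, moving any if-then-else to the outside."""
    if is_conditional(a):
        _, condition, then, otherwise = a
        return ("if", condition, add(then, b), add(otherwise, b))
    if is_conditional(b):
        _, condition, then, otherwise = b
        return ("if", condition, add(a, then), add(a, otherwise))
    return a + b

def salary(x):
    # Assumption: uppercase initial means x is an unknown logical variable.
    if x[0].isupper():
        return ("if", f"{x} = bob", 7, 1)
    return 7 if x == "bob" else 1

def income(x):
    return add(salary(x), 2)

print(income("bob"))  # 9
print(income("Y"))    # ('if', 'Y = bob', 9, 3)
```

For a known individual the result is a number; for the unknown Y the result is the conditional expression "if Y = bob then 9 else 3", matching the derivation above.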
Symbolic Evaluation
Case analysis:
sum_{X in Set} if Condition(X) then A else B
= sum_{X in Set : Condition(X)} A + sum_{X in Set : not Condition(X)} B
= (if A and B are constants in X) A * |{ X in Set : Condition(X) }| + B * |{ X in Set : not Condition(X) }|
(analogous for product and exponentiation)
Example:
sum_X income(X)
= sum_X if X = bob then 9 else 3
= sum_{X = bob} 9 + sum_{X != bob} 3
= 9 + 3 * |{X in People : X != bob}|
= 9 + 3 * (|People| - 1)
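A short Python check of the case-analysis rule, assuming a population of 1,000 people that includes bob. The helper names sum_by_cases and product_by_cases are hypothetical; they need only the two constant values and the sizes of the two cases, not the individuals themselves.

```python
def sum_by_cases(then_value, else_value, count_true, domain_size):
    """sum_X (if Condition(X) then A else B), with A and B constant in X."""
    return then_value * count_true + else_value * (domain_size - count_true)

def product_by_cases(then_value, else_value, count_true, domain_size):
    """The analogous rule for products uses exponentiation."""
    return then_value ** count_true * else_value ** (domain_size - count_true)

def income(x):
    return 9 if x == "bob" else 3

people = ["bob"] + [f"person{i}" for i in range(2, 1001)]

lifted = sum_by_cases(9, 3, count_true=1, domain_size=len(people))
grounded = sum(income(x) for x in people)
print(lifted, grounded)  # 3006 3006
```

The lifted computation costs the same whether People has a thousand or a billion elements, while the grounded sum visits every individual.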
A Schematic View
[Figure, in five steps: a node f receives prod_X of the messages from g(X), which in turn depends on h(X), z(X), ...
1. Nothing is evaluated yet: f, prod_X, g(X), h(X), z(X), ...
2. z(X) evaluates to: if X = bob then 1 else 2
3. h(X) evaluates to: if X = bob then 5 else 6
4. g(X) evaluates to: if X = bob then 7 else 5
5. prod_X evaluates to: 7 * 5^(|X| - 1)]
Symbolic Evaluation to the Rescue
prod_{X in People} message to [epidemic] from [ if epidemic then if sick(X) then 0.4 else 0.6 else if sick(X) then 0.01 else 0.99 ] = ?
• We symbolically solve
message to [epidemic] from [ if epidemic then if sick(X) then 0.4 else 0.6 else if sick(X) then 0.01 else 0.99 ]
• It has to be solved symbolically, because it contains a free variable X.
Symbolic Evaluation to the Rescue
message to [epidemic] from [ if epidemic then if sick(X) then 0.4 else 0.6 else if sick(X) then 0.01 else 0.99 ]
= some function of X
...
= another function of X
...
= a function of neighbors([sick(X)])
For the first time, an expression actually depends on the value of X:
neighbors([sick(X)]) =
if X = person1 or X = person2 or X = person3
then ... union { [ if sick(X) then 1 else 0 ] }
else ... union { [ if sick(X) then 0 else 1 ] }
[Figure: sick(X) connected to [ if epidemic then if sick(X) then 0.4 else 0.6 else if sick(X) then 0.01 else 0.99 ] and to a yet undetermined factor, marked "?"]
Symbolic Evaluation to the Rescue
message to [epidemic] from [ if epidemic then if sick(X) then 0.4 else 0.6 else if sick(X) then 0.01 else 0.99 ]
= if X = person1 or X = person2 or X = person3
then < some message on epidemic >
else < some other message on epidemic >
Symbolic Evaluation to the Rescue
prod_X message to [epidemic] from [ if epidemic then if sick(X) then 0.4 else 0.6 else if sick(X) then 0.01 else 0.99 ]
= prod_X if X = person1 or X = person2 or X = person3 then < some message on epidemic > else < some other message on epidemic >
= prod_{X = person1 or X = person2 or X = person3} < some message on epidemic > * prod_{X != person1 and X != person2 and X != person3} < some other message on epidemic >
= < some message on epidemic > ^ |{X = person1 or X = person2 or X = person3}| * < some other message on epidemic > ^ |{X != person1 and X != person2 and X != person3}|
= < some message on epidemic > ^ 3 * < some other message on epidemic > ^ 997
= < yet another message on epidemic >
An Algebraic View
belief(epidemic)
= message to [epidemic] from [ if epidemic then 0.1 else 0.9 ] * prod_{X in People} message to [epidemic] from [ if epidemic then if sick(X) then 0.4 else 0.6 else if sick(X) then 0.01 else 0.99 ]
= < prior message on epidemic > * < yet another message on epidemic >
= < value of belief of epidemic >
[Figure: epidemic receiving the message from [ if epidemic then 0.1 else 0.9 ] and, through prod_X, the messages from [ if epidemic then if sick(X) then 0.4 else 0.6 else if sick(X) then 0.01 else 0.99 ]; prod_X stands for 1000 nodes sending 1000 messages, which are multiplied]
We computed the exact belief without considering all the individuals
Lifted Inference
• Lifted inference is about performing inference from intensional representations without unnecessarily considering individual random variables
Why this is progress
• Lifted inference algorithms to date have been very loosely described
• Notions such as “parfactor” not clear
• How many neighbors does this parfactor have?
• Are the smoker nodes really separate?
[Figure: a parfactor connected to the nodes smoker(X), friends(X,Y) and smoker(Y)]
Symbolic Evaluation to the Rescue
{ U, V, a }
= (set normalization)
if U = V then { U, a }
else if U = a then { V, a }
else if V = a then { U, a }
else { U, V, a }
f(X) = f(Y), f injective
=
X = Y
Symbolic Evaluation to the Rescue
neighbors([ if smoker(X) then if smoker(Y) then friends(X,Y) ... ])
= { [friends(X,Y)], [smoker(X)], [smoker(Y)] }
= (set normalization)
if [smoker(X)] = [smoker(Y)]
then { [friends(X,Y)], [smoker(X)] }
else { [friends(X,Y)], [smoker(X)], [smoker(Y)] }
= (equality on injective function)
if X = Y
then { [friends(X,Y)], [smoker(X)] }
else { [friends(X,Y)], [smoker(X)], [smoker(Y)] }
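The case split above can be illustrated in Python. For known individuals, an ordinary set collapses duplicates by itself; for unknown X and Y, the best we can return is a conditional on X = Y. The tuple encodings and function names below are assumptions made for this sketch.

```python
def random_variable(predicate, *arguments):
    """[predicate(arguments)]: equal exactly when predicate and arguments are equal."""
    return (predicate, arguments)

def neighbors_ground(x, y):
    """Neighbors of the friends/smoker factor for known individuals x and y."""
    return {
        random_variable("friends", x, y),
        random_variable("smoker", x),
        random_variable("smoker", y),   # merges with smoker(x) when x == y
    }

def neighbors_symbolic(x, y):
    """Same set for unknown x and y, as a conditional on x = y."""
    merged = [f"[friends({x},{y})]", f"[smoker({x})]"]
    distinct = merged + [f"[smoker({y})]"]
    return ("if", f"{x} = {y}", merged, distinct)

print(len(neighbors_ground("ann", "ann")))  # 2
print(len(neighbors_ground("ann", "bob")))  # 3
print(neighbors_symbolic("X", "Y")[1])      # X = Y
```

The answer to "how many neighbors does this parfactor have?" is therefore itself an if-then-else: two when X = Y, three otherwise.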
Cardinalities can also split
| { X in People : X != bob and X != Neighbor } |
= if Neighbor = bob then |People| - 1 else |People| - 2
This was also very awkward for algorithms so far, but with symbolic evaluation it is dealt with just like any other if-then-else.
Symbolically computing cardinalities of sets is a useful sub-routine for a lot of things other than probabilistic inference!
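A small Python check of the cardinality split, assuming that both bob and Neighbor belong to People. The function names are illustrative; count_by_cases uses only the size of the population, whereas count_by_enumeration visits every individual.

```python
def count_by_enumeration(people, neighbor):
    """| { X in People : X != bob and X != Neighbor } | by brute force."""
    return sum(1 for x in people if x != "bob" and x != neighbor)

def count_by_cases(population_size, neighbor):
    """Same cardinality by case analysis on Neighbor = bob.

    Assumes bob and neighbor are both members of People.
    """
    if neighbor == "bob":
        return population_size - 1
    return population_size - 2

people = ["bob"] + [f"person{i}" for i in range(2, 1001)]
for neighbor in ("bob", "person2", "person1000"):
    assert count_by_cases(len(people), neighbor) == count_by_enumeration(people, neighbor)
```

When Neighbor is unknown, a symbolic evaluator would return the whole conditional expression rather than picking one branch.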
Conclusion
• An approach to Lifted inference with a clear and formal representation
• Lifted algorithm based on straightforward math manipulations; good for current state, essential for future extensions
• Get compilation, short-circuiting, etc. for free from the symbolic evaluation base
• Meta-level gives you lots of opportunities
• Lifted computation, really, not only for probabilistic inference.