Chapter02b


Transcript of Chapter02b

Page 1: Chapter02b

Dana Nau and Vikas Shivashankar: Lecture slides for Automated Planning and Acting.  Updated 2/5/15

Backward  Search  ●  Forward search starts at the initial state

Ø  Chooses an action that’s applicable Ø  Computes state transition s′ = γ (s,a)

●  Backward search starts at the goal Ø  Chooses an action that’s relevant

•  A possible “last action” before the goal Ø  Computes inverse state transition g′ = γ –1(g,a)

•  g′ = properties a state s′ should satisfy in order for γ (s′,a) to satisfy g ●  Why would we want to do this? ●  One possibility: sometimes has a lower branching factor

Ø  Forward: 10 applicable actions •  for each robot, two move actions and three load actions  

Ø  Backward: g = {loc(r1)=d3} Ø  2 relevant actions: move(r1,d1,d3),  move(r1,d2,d3)  

•  Can eliminate move(r1,d2,d3); it requires a rigid condition that’s false  

[Figure: loading dock with locations d1, d2, d3; robot r1 and container c1; robot r2 and containers c2–c6]

Page 2: Chapter02b


Relevance  ●  Idea: a is relevant for g if a could be the last action of a plan that achieves g ●  Definition:

Ø  Let g = {g1, …, gk} be a goal. An action a is relevant for g if
     1. eff(a) makes at least one gi true, i.e., eff(a) ∩ g ≠ ∅
     2. eff(a) doesn't make any gi false
          ▸  ∀ x, c, c′, if eff(a) contains (x,c) and g contains x = c′ then c = c′
     3. pre(a) doesn't require any gi to be false unless eff(a) makes gi true
          ▸  ∀ x, c, c′, if (x,c) ∈ pre(a) and (x,c′) ∈ g – eff(a) then c = c′

●  What actions are relevant for loc(c1)=r2  ?

[Figure: loading dock with locations d1, d2, d3; robot r1 and container c1; robot r2 and containers c2–c6]
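A minimal Python sketch of this relevance test (my own, not from the slides), with the goal, preconditions, and effects all encoded as dicts mapping state variables to values; the move actions below are simplified stand-ins for page 1's example and omit the rigid adjacency conditions the slides mention:

def is_relevant(action, goal):
    """The slide's three conditions, for actions and goals given as dicts of state-variable assignments."""
    pre, eff = action["pre"], action["eff"]
    # 1. eff(a) makes at least one goal assignment true:  eff(a) ∩ g ≠ ∅
    if not any(eff.get(var) == val for var, val in goal.items()):
        return False
    # 2. eff(a) doesn't make any goal assignment false
    if any(var in eff and eff[var] != val for var, val in goal.items()):
        return False
    # 3. pre(a) doesn't conflict with the goal assignments that eff(a) leaves unachieved (g - eff(a))
    remaining = {var: val for var, val in goal.items() if eff.get(var) != val}
    return all(pre.get(var, val) == val for var, val in remaining.items())

goal = {"loc(r1)": "d3"}
move_r1_d1_d3 = {"pre": {"loc(r1)": "d1"}, "eff": {"loc(r1)": "d3"}}
move_r1_d3_d1 = {"pre": {"loc(r1)": "d3"}, "eff": {"loc(r1)": "d1"}}
print(is_relevant(move_r1_d1_d3, goal), is_relevant(move_r1_d3_d1, goal))   # True False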

Page 3: Chapter02b


Inverse State Transitions

●  If a is relevant for achieving g, then γ−1(g,a) = pre(a) ∪ (g – eff(a))

●  If a isn’t relevant for g, then γ–1(g,a) is undefined

●  Example: Ø  g = {loc(c1)=r2} Ø  What is γ –1(g,  load(r2,c1,d3))? Ø  What is γ –1(g,  load(r1,c1,d1))?

[Figure: loading dock with locations d1, d2, d3; robot r1 and container c1; robot r2 and containers c2–c6]
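The inverse transition can be computed directly from the formula above. This is again a sketch using the same dict encoding as the previous example; the preconditions and effects of the load action are a hypothetical encoding of load(r2,c1,d3), labeled as such:

def gamma_inverse(action, goal):
    """γ⁻¹(g,a) = pre(a) ∪ (g − eff(a)), assuming a is relevant for g (so pre and g − eff don't conflict)."""
    pre, eff = action["pre"], action["eff"]
    new_goal = {var: val for var, val in goal.items() if eff.get(var) != val}   # g − eff(a)
    new_goal.update(pre)                                                        # ∪ pre(a)
    return new_goal

# Hypothetical encoding of load(r2,c1,d3): r2 at d3 picks up c1, which is sitting at d3
load_r2_c1_d3 = {"pre": {"loc(r2)": "d3", "loc(c1)": "d3", "cargo(r2)": "nil"},
                 "eff": {"loc(c1)": "r2", "cargo(r2)": "c1"}}
print(gamma_inverse(load_r2_c1_d3, {"loc(c1)": "r2"}))
# -> {'loc(r2)': 'd3', 'loc(c1)': 'd3', 'cargo(r2)': 'nil'}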

Page 4: Chapter02b


Backward Search
For cycle checking:
●  After line 1, put
      Solved = {g}
●  After line 6, put
      if g′ ∈ Solved then return failure
      Solved ← Solved ∪ {g′}
●  More powerful: if ∃ g ∈ Solved s.t. g ⊆ g′ then return failure

●  Sound and complete Ø  If a planning problem is solvable

then at least one of Backward-search’s nondeterministic execution traces will find a solution

[Figure: backward search space; from the goal g, relevant actions a1, a2, a3 regress to subgoals g1, g2, g3, and further actions a4, a5 to g4, g5, working back toward s0]
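Putting the pieces together, here is a depth-first sketch of backward search with the subset-based cycle check (one nondeterministic trace explored at a time). It is my own rendering, not the book's pseudocode; relevant and gamma_inv are compact versions of the helpers sketched on the previous two pages, and the two move actions are simplified encodings for illustration:

def relevant(a, g):
    eff, pre = a["eff"], a["pre"]
    if not any(eff.get(v) == c for v, c in g.items()):
        return False                                               # eff(a) ∩ g = ∅
    if any(v in eff and eff[v] != c for v, c in g.items()):
        return False                                               # eff(a) would undo part of g
    rest = {v: c for v, c in g.items() if eff.get(v) != c}
    return all(pre.get(v, c) == c for v, c in rest.items())        # pre(a) consistent with g − eff(a)

def gamma_inv(a, g):
    new_g = {v: c for v, c in g.items() if a["eff"].get(v) != c}
    new_g.update(a["pre"])
    return new_g

def backward_search(s0, g, actions, solved=None):
    """One nondeterministic trace at a time, via depth-first search, with the subset cycle check."""
    if all(s0.get(v) == c for v, c in g.items()):
        return []                                                  # s0 already satisfies g
    solved = list(solved or [])
    if any(all(g.get(v) == c for v, c in old.items()) for old in solved):
        return None                                                # an earlier goal on this path is a subset of g
    solved.append(g)
    for a in actions:
        if relevant(a, g):
            plan = backward_search(s0, gamma_inv(a, g), actions, solved)
            if plan is not None:
                return plan + [a["name"]]                          # actions come out in execution order
    return None

actions = [{"name": "move(r1,d1,d3)", "pre": {"loc(r1)": "d1"}, "eff": {"loc(r1)": "d3"}},
           {"name": "move(r1,d2,d1)", "pre": {"loc(r1)": "d2"}, "eff": {"loc(r1)": "d1"}}]
print(backward_search({"loc(r1)": "d2"}, {"loc(r1)": "d3"}, actions))
# -> ['move(r1,d2,d1)', 'move(r1,d1,d3)']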

Page 5: Chapter02b


Branching Factor
●  Our motivation for Backward-search was to focus the search
Ø  But as written, it doesn't really accomplish that
●  Solve this by lifting
Ø  Leave y uninstantiated

[Figure: for g = {loc(r1)=d3}, grounded backward search branches over move(r1,d1,d3), move(r1,d2,d3), move(r1,d4,d3), …, move(r1,d20,d3); lifted backward search uses the single partially instantiated action move(r1,y,d3)]

[Figure: loading dock with locations d1, d2, d3; robot r1 and container c1; robot r2 and containers c2–c6]

Page 6: Chapter02b


Lifted Backward Search  ●  Like Backward-search but more complicated

Ø  Have to keep track of what values were substituted for which parameters Ø  But it has a much smaller branching factor

●  I won’t discuss the details

Page 7: Chapter02b


Plan-Space Planning  ●  Another approach

Ø  formulate planning as a constraint satisfaction problem Ø  use constraint-satisfaction techniques to produce solutions that are more

flexible than ordinary plans •  E.g., plans in which the actions are partially ordered •  Postpone ordering decisions until the plan is being executed

▸  the actor may have a better idea about which ordering is best ●  First step toward planning concurrent execution of actions (Chapter 4)

Outline: •  Basic idea •  Open goals •  Threats •  The PSP algorithm •  Long example •  Comments

Page 8: Chapter02b


Plan-Space Planning - Basic Idea  ●  Backward search from the goal ●  Each node of the search space is a partial plan, π

•  A set of partially-instantiated actions •  Constraints on the actions

Ø  Keep making refinements, until we have a solution

●  Types of constraints: Ø  precedence constraints

indicated by solid arcs Ø  binding constraints

•  inequality constraints, e.g., z ≠ x or w ≠ p1 Ø  causal links:

•  indicated by dashed arcs •  use effect e of action a to establish precondition p of action b

●  How to tell we have a solution: no more flaws in the plan Ø  Two kinds of flaws …

foo(x) Pre: … Eff: loc(x)=p1

bar(x) Pre: loc(x)=p1 Eff: …

baz(z) Pre: loc(z)=p2 Eff: …

z ≠ x  
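To make the node structure concrete, here is a minimal illustrative Python container for a partial plan π holding the ingredients listed above (class and field names are my own, not PSP's); open_goals shows how one kind of flaw can be read directly off the structure:

from dataclasses import dataclass, field

@dataclass
class PartialPlan:
    actions: dict = field(default_factory=dict)        # step name -> partially instantiated action
    orderings: set = field(default_factory=set)        # (a, b): a must precede b
    bindings: set = field(default_factory=set)         # binding constraints, e.g. ("z", "!=", "x")
    causal_links: set = field(default_factory=set)     # (producer, precondition, consumer)

    def open_goals(self, preconditions):
        """Preconditions not yet supported by a causal link (one of the two kinds of flaws)."""
        supported = {(consumer, p) for (_, p, consumer) in self.causal_links}
        return [(step, p) for step, pres in preconditions.items()
                for p in pres if (step, p) not in supported]

# The slide's example: foo(x) establishes loc(x)=p1 for bar(x); baz(z)'s precondition is still open
plan = PartialPlan(actions={"foo": "foo(x)", "bar": "bar(x)", "baz": "baz(z)"},
                   orderings={("foo", "bar")},
                   bindings={("z", "!=", "x")},
                   causal_links={("foo", "loc(x)=p1", "bar")})
print(plan.open_goals({"bar": ["loc(x)=p1"], "baz": ["loc(z)=p2"]}))   # -> [('baz', 'loc(z)=p2')]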

Page 9: Chapter02b


Flaws:    1.  Open  Goals  

●  A precondition p of an action b is an open goal if there is no causal link for p

●  Resolve the flaw by creating a causal link
Ø  Find an action a (either already in π, or can add it to π) that can establish p
     •  can precede b
     •  can have p as an effect
Ø  Do substitutions on variables to make a assert p
     •  e.g., replace x with y
Ø  Add an ordering constraint a ≺ b
Ø  Create a causal link from a to p

Pre: loc(y)=p1

Pre: loc(y)=p1

bar(y)

foo(y) bar(y)

substitute y for x

Eff: loc(y)=p1

foo(x)

Eff: loc(x)=p1

Page 10: Chapter02b


Flaws:  2.  Threats
●  Suppose we have a causal link from action a to precondition p of action b
●  Action c threatens the link if c may affect p and may come between a and b
Ø  c is a threat even if it makes p true rather than false
     •  Causal link means a, not c, is supposed to establish p for b
     •  The plan in which c establishes p will be generated on another path in the search space

●  Three possible ways to resolve the flaw:
Ø  Require c ≺ a
Ø  Require b ≺ c
Ø  Constrain variable(s) to prevent c from affecting p

[Figure, in state-variable notation: a causal link from foo(y)'s effect loc(y)=p1 to bar(y)'s precondition loc(y)=p1, threatened by clobber(z) with effect loc(z)=p2]
[The same figure in classical notation: the link is on loc(y,p1), and clobber(z) has effect ¬loc(z,p1)]
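A tiny standalone sketch (illustrative only, not PSP's code) that enumerates the three resolvers for a given threat; the tuple encodings of the constraints are my own:

def threat_resolvers(c, link, separation=None):
    """The slide's three ways to resolve 'c threatens the causal link (a, p, b)'.
    separation is an optional binding constraint (e.g. ("z", "!=", "y")) that would
    keep c's effect from affecting p; pass None when no such constraint exists."""
    a, p, b = link
    resolvers = [("order", c, "precedes", a),      # demotion: require c ≺ a
                 ("order", b, "precedes", c)]      # promotion: require b ≺ c
    if separation is not None:
        resolvers.append(("bind",) + separation)   # constrain variables so c cannot affect p
    return resolvers

# The slide's example: clobber(z) threatens the link foo(y) -loc(y)=p1-> bar(y); z ≠ y separates them
print(threat_resolvers("clobber(z)", ("foo(y)", "loc(y)=p1", "bar(y)"), ("z", "!=", "y")))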

Page 11: Chapter02b


PSP  Algorithm  

●  Initial plan is always {Start, Finish} with Start ≺ Finish Ø  Start has no preconditions; effects are the initial state s0 Ø  Finish has no effects; its precondition is the goal g

●  PSP is sound and complete Ø  It returns a partially ordered solution π such that any

total ordering of π will achieve g Ø  In some environments, could execute actions in parallel

Start  

Finish  

Eff: s0

Pre: g

Page 12: Chapter02b


move(c, y, z) pre: pos(c)=y, clear(c)=T, clear(z)=T eff: pos(c)←z, clear(y)←T, clear(z)←F  Range(c) = Containers; Range(y) = Range(z) = Container ∪ pallets

Finish  

   

[Figure: pallets p1–p4; initial stacks d on a at p3 and c on b at p4; goal stacks a on d and b on c]

Start  

pos(a)=d   pos(b)=c  


●  Finish has two open goals: pos(a)=d, pos(b)=c  

Example  

clear(p1)=T  clear(p2)=T  clear(p3)=F  clear(p4)=F    clear(a)=F    clear(b)=F        clear(c)=T    clear(d)=T        pos(a)=p3      pos(b)=p4      pos(c)=b      pos(d)=a  

         

Page 13: Chapter02b


move(c, y, z) pre: pos(c)=y, clear(c)=T, clear(z)=T eff: pos(c)←z, clear(y)←T, clear(z)←F  Range(c) = Containers; Range(y) = Range(z) = Container ∪ pallets

move(a,y1,d)  

Start  

Finish  

clear(a)=T   pos(a)=y1  

pos(a)=d   pos(b)=c  

clear(d)=T  

Example  ●  For each open goal, add a new action

Ø  Every new action a must have Start ≺ a, a ≺ Finish      

[Figure: pallets p1–p4; initial stacks d on a at p3 and c on b at p4; goal stacks a on d and b on c]

clear(b)=T  pos(b)=y2   clear(c)=T  

move(b,y2,c)  

clear(p1)=T  clear(p2)=T  clear(p3)=F  clear(p4)=F    clear(a)=F    clear(b)=F        clear(c)=T    clear(d)=T        pos(a)=p3      pos(b)=p4      pos(c)=b      pos(d)=a  

         


Page 14: Chapter02b


move(c, y, z) pre: pos(c)=y, clear(c)=T, clear(z)=T eff: pos(c)←z, clear(y)←T, clear(z)←F  Range(c) = Containers; Range(y) = Range(z) = Container ∪ pallets

         

move(a,p3,d)  

Start  

Finish  

clear(a)=T   clear(b)=T  pos(b)=p4  pos(a)=p3  

pos(a)=d   pos(b)=c  

Example  ●  Resolve four more open goals: bind y1=p3, y2=p4

clear(c)=T  clear(d)=T  

   

[Figure: pallets p1–p4; initial stacks d on a at p3 and c on b at p4; goal stacks a on d and b on c]

move(b,p4,c)  

clear(p1)=T  clear(p2)=T  clear(p3)=F  clear(p4)=F    clear(a)=F    clear(b)=F        clear(c)=T    clear(d)=T        pos(a)=p3      pos(b)=p4      pos(c)=b      pos(d)=a  


Page 15: Chapter02b


move(c, y, z) pre: pos(c)=y, clear(c)=T, clear(z)=T eff: pos(c)←z, clear(y)←T, clear(z)←F  Range(c) = Containers; Range(y) = Range(z) = Container ∪ pallets

move(a,p3,d)  

clear(d)=T   pos(b)=p4  

Start  

Finish  

clear(a)=T   clear(b)=T  

Example  ●  1st threat requires z3≠d ●  2nd threat has two resolvers:

Ø  move(b,p4,c) ≺ move(x3,a,z3) Ø  z3≠c  

pos(a)=p3   clear(c)=T  

clear(x3)=T  clear(z3)=T   pos(x3)=a  

pos(a)=d   pos(b)=c  

   

[Figure: pallets p1–p4; initial stacks d on a at p3 and c on b at p4; goal stacks a on d and b on c]

move(x3,a,z3)  

move(b,p4,c)  

clear(p1)=T  clear(p2)=T  clear(p3)=F  clear(p4)=F    clear(a)=F    clear(b)=F        clear(c)=T    clear(d)=T        pos(a)=p3      pos(b)=p4      pos(c)=b      pos(d)=a  


Page 16: Chapter02b


move(c, y, z) pre: pos(c)=y, clear(c)=T, clear(z)=T eff: pos(c)←z, clear(y)←T, clear(z)←F  Range(c) = Containers; Range(y) = Range(z) = Container ∪ pallets

move(a,p3,d)  

clear(d)=T   pos(b)=p4  

Start  

Finish  

clear(a)=T   clear(b)=T  

Example  

●  Threats resolved  

pos(a)=p3   clear(c)=T  

clear(x3)=T  clear(z3)=T   pos(x3)=a  

pos(a)=d   pos(b)=c  

   

[Figure: pallets p1–p4; initial stacks d on a at p3 and c on b at p4; goal stacks a on d and b on c]

move(x3,a,z3)  

move(b,p4,c)  

z3≠c  z3≠d    

clear(p1)=T  clear(p2)=T  clear(p3)=F  clear(p4)=F    clear(a)=F    clear(b)=F        clear(c)=T    clear(d)=T        pos(a)=p3      pos(b)=p4      pos(c)=b      pos(d)=a  


Page 17: Chapter02b


move(c, y, z) pre: pos(c)=y, clear(c)=T, clear(z)=T eff: pos(c)←z, clear(y)←T, clear(z)←F  Range(c) = Containers; Range(y) = Range(z) = Container ∪ pallets

move(a,p3,d)  

clear(d)=T   pos(b)=p4  

Start  

Finish  

clear(a)=T   clear(b)=T  

move(b,p4,c)  

Example
●  1st threat has two resolvers:
Ø  An ordering constraint, and z4≠d
●  2nd threat has three resolvers:
Ø  Two ordering constraints, and z4≠a
●  3rd threat has one: z4≠c

pos(a)=p3   clear(c)=T  

move(x4,b,z4)  

clear(x3)=T   clear(x4)=T  clear(z3)=T   clear(z4)=T  pos(x4)=b  pos(x3)=a  

pos(a)=d   pos(b)=c  

Start  

   

[Figure: pallets p1–p4; initial stacks d on a at p3 and c on b at p4; goal stacks a on d and b on c]

move(x3,a,z3)  

z3≠c  z3≠d    


Page 18: Chapter02b


move(c, y, z) pre: pos(c)=y, clear(c)=T, clear(z)=T eff: pos(c)←z, clear(y)←T, clear(z)←F  Range(c) = Containers; Range(y) = Range(z) = Container ∪ pallets

move(x3,a,z3)  

move(a,p3,d)  

clear(d)=T   pos(b)=p4  

Start  

Finish  

clear(a)=T   clear(b)=T  

move(b,p4,c)  

Example  ●  Resolve the three threats using the binding constraints

pos(a)=p3   clear(c)=T  

move(x4,b,z4)  

clear(x3)=T   clear(x4)=T  clear(z3)=T   clear(z4)=T  pos(x4)=b  pos(x3)=a  

pos(a)=d   pos(b)=c  

   

[Figure: pallets p1–p4; initial stacks d on a at p3 and c on b at p4; goal stacks a on d and b on c]
z4≠a  z4≠c  z4≠d

z3≠c  z3≠d    


Page 19: Chapter02b


p1≠c  p1≠d    

move(c, y, z) pre: pos(c)=y, clear(c)=T, clear(z)=T eff: pos(c)←z, clear(y)←T, clear(z)←F  Range(c) = Containers; Range(y) = Range(z) = Container ∪ pallets

move(a,p3,d)  

clear(d)=T   pos(b)=p4  

Start  

Finish  

clear(a)=T   clear(b)=T  

move(b,p4,c)  

Example  ●  Resolve five open goals

Ø  Bind x3=d, x4=c, z3=p1

pos(a)=p3   clear(c)=T  

move(d,a,p1)   move(c,b,z4)  

clear(d)=T   clear(c)=T  clear(p1)=T   clear(z4)=T  pos(c)=b  pos(d)=a  

pos(a)=d   pos(b)=c  

   

[Figure: pallets p1–p4; initial stacks d on a at p3 and c on b at p4; goal stacks a on d and b on c]
z4≠a  z4≠c  z4≠d


Page 20: Chapter02b


move(c, y, z) pre: pos(c)=y, clear(c)=T, clear(z)=T eff: pos(c)←z, clear(y)←T, clear(z)←F  Range(c) = Containers; Range(y) = Range(z) = Container ∪ pallets

move(a,p3,d)  

clear(d)=T   pos(b)=p4  

Start  

Finish  

clear(a)=T   clear(b)=T  

move(b,p4,c)  

Example  ●  Threatened causal link ●  Resolvers:

Ø  move(d,a,p1) ≺ move(c,b,z4) Ø  z4≠p1  

pos(a)=p3   clear(c)=T  

move(d,a,p1)   move(c,b,z4)  

clear(d)=T   clear(c)=T  clear(p1)=T   clear(z4)=T  pos(c)=b  pos(d)=a  

pos(a)=d   pos(b)=c  

   

[Figure: pallets p1–p4; initial stacks d on a at p3 and c on b at p4; goal stacks a on d and b on c]
z4≠a  z4≠c  z4≠d


Page 21: Chapter02b


move(c, y, z) pre: pos(c)=y, clear(c)=T, clear(z)=T eff: pos(c)←z, clear(y)←T, clear(z)←F  Range(c) = Containers; Range(y) = Range(z) = Container ∪ pallets

move(a,p3,d)  

clear(d)=T   pos(b)=p4  

Start  

Finish  

clear(a)=T   clear(b)=T  

move(b,p4,c)  

Example  ●  Threat resolved

pos(a)=p3   clear(c)=T  

move(d,a,p1)   move(c,b,z4)  

clear(d)=T   clear(c)=T  clear(p1)=T   clear(z4)=T  pos(c)=b  pos(d)=a  

pos(a)=d   pos(b)=c  

   

[Figure: pallets p1–p4; initial stacks d on a at p3 and c on b at p4; goal stacks a on d and b on c]
z4≠a  z4≠c  z4≠d  z4≠p1


Page 22: Chapter02b


move(c, y, z) pre: pos(c)=y, clear(c)=T, clear(z)=T eff: pos(c)←z, clear(y)←T, clear(z)←F  Range(c) = Containers; Range(y) = Range(z) = Container ∪ pallets

move(a,p3,d)  

clear(d)=T   pos(b)=p4  

Start  

Finish  

clear(a)=T   clear(b)=T  

move(b,p4,c)  

Example  ●  Resolve open goal Ø  bind z4=p2

●  No more flaws, so we’re done!

pos(a)=p3   clear(c)=T  

move(d,a,p1)   move(c,b,p2)  

clear(d)=T   clear(c)=T  clear(p1)=T   clear(p2)=T  pos(c)=b  pos(d)=a  

pos(a)=d   pos(b)=c  

   

[Figure: pallets p1–p4; initial stacks d on a at p3 and c on b at p4; goal stacks a on d and b on c]
p2≠a  p2≠c  p2≠d  p2≠p1


Page 23: Chapter02b


move(a,p3,d)  

Start  

Finish  

move(b,p4,c)  

Example  ●  PSP returns this solution:

move(d,a,p1)   move(c,b,p2)

   

[Figure: pallets p1–p4; initial stacks d on a at p3 and c on b at p4; goal stacks a on d and b on c]


Page 24: Chapter02b


move(a,p3,d)  

clear(d)=T   pos(b)=p4  

Start  

Finish  

clear(a)=T   clear(b)=T  

move(b,p4,c)  

Example  ●  Go back to the last threat ●  Resolvers:

Ø  move(d,a,p1) ≺ move(c,b,z4) Ø  z4≠p1  

pos(a)=p3   clear(c)=T  

move(d,a,p1)   move(c,b,z4)  

clear(d)=T   clear(c)=T  clear(p1)=T   clear(z4)=T  pos(c)=b  pos(d)=a  

pos(a)=d   pos(b)=c  

   

[Figure: pallets p1–p4; initial stacks d on a at p3 and c on b at p4; goal stacks a on d and b on c]
z4≠a  z4≠c  z4≠d


Page 25: Chapter02b


move(a,p3,d)  

clear(d)=T   pos(b)=p4  

Start  

Finish  

clear(a)=T   clear(b)=T  

move(b,p4,c)  

Example  ●  Threat resolved

pos(a)=p3   clear(c)=T  

move(d,a,p1)   move(c,b,z4)  

clear(d)=T   clear(c)=T  clear(p1)=T   clear(z4)=T  pos(c)=b  pos(d)=a  

pos(a)=d   pos(b)=c  

   

[Figure: pallets p1–p4; initial stacks d on a at p3 and c on b at p4; goal stacks a on d and b on c]
z4≠a  z4≠c  z4≠d


Page 26: Chapter02b


move(a,p3,d)  

clear(d)=T   pos(b)=p4  

Start  

Finish  

clear(a)=T   clear(b)=T  

move(b,p4,c)  

Example  ●  Resolve open goal Ø  bind z4=p2

●  No more flaws, so we’re done

pos(a)=p3   clear(c)=T  

move(d,a,p1)   move(c,b,p2)

clear(d)=T   clear(c)=T  clear(p1)=T   clear(p2)=T  pos(c)=b  pos(d)=a  

pos(a)=d   pos(b)=c  

   

[Figure: pallets p1–p4; initial stacks d on a at p3 and c on b at p4; goal stacks a on d and b on c]


Page 27: Chapter02b


move(a,p3,d)  

Start  

Finish  

move(b,p4,c)  

Example  ●  Same solution as before,

but with another ordering constraint

move(d,a,p1)   move(c,b,p2)

   

[Figure: pallets p1–p4; initial stacks d on a at p3 and c on b at p4; goal stacks a on d and b on c]


Page 28: Chapter02b


Node-Selection Heuristics  ●  Analogy to constraint-satisfaction problems

Ø  Resolving a flaw in PSP ≈ assigning a value to a variable in a CSP ●  What flaw to work on next?

Ø  Fewest Alternatives First (FAF): the flaw with the fewest resolvers ≈ Minimum Remaining Values (MRV) heuristic for CSPs

●  To resolve the flaw, which resolver to try first? Ø  Least Constraining Resolver (LCR): the resolver that rules out the fewest

resolvers for the other flaws ≈ Least Constraining Value (LCV) heuristic for CSPs

●  In PSP, introducing a new action introduces new flaws to resolve Ø  The plan can get arbitrarily large; want it to be as small as possible

•  Not like CSPs, where the search tree always has a fixed depth ●  Avoid introducing new actions unless necessary ●  To choose between actions a and b, estimate distance from s0 to Pre(a) and Pre(b)

Ø  We’ll discuss some heuristics for that later
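As a concrete illustration (my own sketch, not code from any planner), Fewest Alternatives First is just a min over resolver counts; the resolvers_for function is assumed to be supplied by the planner:

def select_flaw(flaws, resolvers_for):
    """Fewest Alternatives First: pick the flaw with the fewest resolvers (like MRV in CSPs)."""
    return min(flaws, key=lambda flaw: len(resolvers_for(flaw)))

# The example on the next page: three threats with 2, 3, and 1 resolvers; FAF picks the third
resolver_counts = {"threat-1": 2, "threat-2": 3, "threat-3": 1}
print(select_flaw(resolver_counts, lambda f: range(resolver_counts[f])))    # -> threat-3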

Page 29: Chapter02b


move(a,p3,d)  

clear(d)=T   pos(b)=p4  

Start  

Finish  

clear(a)=T   clear(b)=T  

move(b,p4,c)  

Example  ●  Example of Fewest Alternatives First:

Ø  1st threat has two resolvers: an ordering constraint, and z4≠d Ø  2nd threat has three resolvers: 2 ordering constraints, and z4≠a Ø  3rd threat has one resolver: z4≠c  

●  So resolve the 3rd threat first

pos(a)=p3   clear(c)=T  

move(x4,b,z4)  

clear(x3)=T   clear(x4)=T  clear(z3)=T   clear(z4)=T  pos(x4)=b  pos(x3)=a  

pos(a)=d   pos(b)=c  

Start  

move(x3,a,z3)  

z3≠c  z3≠d    

Page 30: Chapter02b


Discussion  ●  Problem: how to prune infinitely long paths in the search space?

Ø  Loop detection is based on recognizing states or goals we’ve seen before

Ø  In a partially ordered plan, we don’t know the states

●  Can we prune a path if we see the same action more than once?

Ø  No. Sometimes we might need the same action several times in different states of the world

Ø  Example on next slide

[Figure: a path … s -act1-> s′ -act2-> … -act1-> …; the same action act1 is needed again in a different state]

Page 31: Chapter02b


Example  ●  3-digit binary counter d3  d2  d1

s0 = {d3=0,  d2=0,  d1=0}, i.e., 0 0 0

g = {d3=1,  d2=1,  d1=1}, i.e., 1 1 1

●  Actions to increment the counter

•  incr-xx0-to-xx1   Pre: d1=0    Eff: d1=1
•  incr-x01-to-x10   Pre: d2=0, d1=1    Eff: d2=1, d1=0
•  incr-011-to-100   Pre: d3=0, d2=1, d1=1    Eff: d3=1, d2=0, d1=0

●  Plan:                       d3  d2  d1
      s0:                       0    0    0
      incr-xx0-to-xx1  →  0    0    1
      incr-x01-to-x10  →  0    1    0
      incr-xx0-to-xx1  →  0    1    1
      incr-011-to-100  →  1    0    0
      incr-xx0-to-xx1  →  1    0    1
      incr-x01-to-x10  →  1    1    0
      incr-xx0-to-xx1  →  1    1    1
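A quick check (plain Python, using only the action definitions above) that simulates the three increment actions and confirms the 7-step plan reaches 1 1 1, with incr-xx0-to-xx1 applied four times in four different states:

actions = {
    "incr-xx0-to-xx1": ({"d1": 0},                   {"d1": 1}),
    "incr-x01-to-x10": ({"d2": 0, "d1": 1},          {"d2": 1, "d1": 0}),
    "incr-011-to-100": ({"d3": 0, "d2": 1, "d1": 1}, {"d3": 1, "d2": 0, "d1": 0}),
}

def apply_action(state, name):
    pre, eff = actions[name]
    assert all(state[v] == c for v, c in pre.items()), f"{name} not applicable"
    return {**state, **eff}

state = {"d3": 0, "d2": 0, "d1": 0}
plan = ["incr-xx0-to-xx1", "incr-x01-to-x10", "incr-xx0-to-xx1",
        "incr-011-to-100", "incr-xx0-to-xx1", "incr-x01-to-x10", "incr-xx0-to-xx1"]
for name in plan:
    state = apply_action(state, name)
    print(name, "->", state["d3"], state["d2"], state["d1"])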

Page 32: Chapter02b


A  Weak  Pruning  Technique  

●  Can prune all partial plans of n or more actions, where n = |{all possible states}| Ø  This doesn’t help very much

●  I’m not sure whether there’s a good pruning technique for plan-space planning

Page 33: Chapter02b


Planning with Control Rules
Motivation:
●  Given a state s and an action a
●  Sometimes domain-specific tests can tell us we don't want to use a, e.g.,
Ø  a doesn't lead to a solution
Ø  or a is dominated

•  there’s a better solution along some other path

Ø  or a doesn’t lead to a solution that’s acceptable according to domain-specific criteria

●  In such cases we can prune s (remove it from Act)

●  Approach: Ø  Write logical formulas giving conditions that states must satisfy Ø  Prune states that don’t satisfy the formulas

Page 34: Chapter02b


Quick  Review  of  First  Order  Logic  First Order Logic (FOL): ●  Syntax:

Ø  atomic formulas (or atoms) •  predicate symbol with arguments, e.g., clear(c) •  include ‘=’ as a binary predicate symbol, e.g., loc(r1)=d1

Ø  logical connectives (∨, ∧, ¬, ⇒, ⇔), quantifiers (∀, ∃), punctuation •  e.g., (loc(r1)=d1  ∧ ∀c clear(c)) ⇒ ¬∃c loc(c)=r1  

●  First Order Theory T: Ø  “Logical” axioms and inference rules – encode logical reasoning in general Ø  Additional “nonlogical” axioms – talk about a particular domain Ø  Theorems: produced by applying the axioms and rules of inference

●  Model: a set of objects, functions, relations that the symbols refer to Ø  For our purposes, a model is a state of the world s Ø  In order for s to be a model, all theorems of T must be true in s Ø  s ⊨ loc(r1)=d1 read “s satisfies loc(r1)=d1” or “s entails loc(r1)=d1”

•  means that r1 is at d1 in the state s

Page 35: Chapter02b


Linear  Temporal  Logic  ●  Modal logic: FOL plus modal operators

to express concepts that would be difficult to express within FOL ●  Linear Temporal Logic (LTL):

Ø  Purpose: to express a limited notion of time •  Infinite sequence 〈0, 1, 2, …〉 of time instants •  Infinite sequence M = 〈s0, s1, …〉 of states of the world

Ø  Modal operators to refer to states in M:
     X f   "next f" - f is true in the next state
     F f   "future f" - f either is true now or in some future state, e.g., F loc(a)=b
     G f   "globally f" - f is true now and in all future states
     f1 U f2   "f1 until f2" - f2 is true now or in a future state, and f1 is true until then
Ø  Propositional constant symbols True and False
     •  Instead of T and F, to avoid confusion with the F operator
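To make the operators concrete, here is a tiny evaluator (my own sketch, not from the slides) for X, F, G, and U over a finite sequence of states, treating the last state as repeating forever, which is the same trick the slides use for planning a few pages later. Formulas are nested tuples; atoms are Python predicates on a single state.

def holds(M, i, f):
    """Does formula f hold at position i of the sequence M (last state repeats forever)?"""
    i = min(i, len(M) - 1)                 # positions past the end refer to the repeated last state
    op = f[0]
    if op == "atom":
        return f[1](M[i])
    if op == "not":
        return not holds(M, i, f[1])
    if op == "imp":                        # f1 ⇒ f2
        return (not holds(M, i, f[1])) or holds(M, i, f[2])
    if op == "X":                          # next
        return holds(M, i + 1, f[1])
    if op == "F":                          # true now or in some future state
        return any(holds(M, j, f[1]) for j in range(i, len(M)))
    if op == "G":                          # true now and in all future states
        return all(holds(M, j, f[1]) for j in range(i, len(M)))
    if op == "U":                          # f1 holds until f2 becomes true
        return any(holds(M, j, f[2]) and all(holds(M, k, f[1]) for k in range(i, j))
                   for j in range(i, len(M)))
    raise ValueError(op)

# M ⊨ G (clear(b) ⇒ (clear(b) U loc(a)=b)), checked on a toy 3-state sequence
M = [{"clear(b)": True,  "loc(a)": "c"},
     {"clear(b)": True,  "loc(a)": "b"},
     {"clear(b)": False, "loc(a)": "b"}]
clear_b = ("atom", lambda s: s["clear(b)"])
a_on_b  = ("atom", lambda s: s["loc(a)"] == "b")
print(holds(M, 0, ("G", ("imp", clear_b, ("U", clear_b, a_on_b)))))   # True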

Page 36: Chapter02b


Linear Temporal Logic (continued)  ●  Quantifiers cause problems with computability

Ø  Suppose f(x) is true for infinitely many values of x Ø  Problem evaluating truth of ∀x f(x) and ∃x f(x)

●  Bounded quantifiers Ø  Let g(x) be such that {x | g(x) is true} is finite and easily computed

∀[x: g(x)] f(x) ▸  means ∀x (g(x) ⇒ f(x)) ▸  expands into f(x1) ∧ f(x2) ∧ … ∧ f(xn)

∃[x: g(x)] f(x) ▸  means ∃x (g(x) ∧ f(x)) ▸  expands into f(x1) ∨   f(x2) ∨ … ∨ f(xn)

Page 37: Chapter02b


Notation  ●  We can use state-variable assignments as logical propositions in LTL formulas

Ø  G (∀[x: clear(x)=T] final(x)=T ⇒ X(clear(x)=T ∨ ∃[y: loc(y)=x] final(y)=T))

●  For Boolean state variables, simpler to write them as logical propositions •  Instead of clear(x)=T, just write clear(x) •  Instead of clear(x)=F,  write ¬clear(x)

Ø  G (∀[x: clear(x)] final(x) ⇒ X(clear(x) ∨ ∃[y: loc(y)=x] final(y)))

Page 38: Chapter02b


pickup(x) pre: loc(x)=floor, clear(x), holding=nil eff: loc(x)=crane, ¬clear(x), holding=x


stack(x,y) pre: holding=x, clear(y)   eff: holding=nil, ¬clear(y), loc(x)=y, clear(x)  

●  The "container stacking" domain Ø  Based on a classical planning domain called the "blocks world"

unstack(x,y) pre: loc(x)=y, clear(x), holding=nil   eff: loc(x)=crane, ¬clear(x), holding=x, clear(y)  

putdown(x) pre: holding=x eff: holding=nil, loc(x)=floor, clear(x)  

Example   clear(e),  loc(e)=d,  loc(d)=floor,  clear(c),  loc(c)=a,  loc(a)=floor,  clear(b),  loc(b)=floor,  holding=nil  

[Figures: left, e on d, c on a, b on the floor; right, the same state after pickup(b), with b held by the crane]

   clear(e),  loc(e)=d,  loc(d)=floor,      clear(c),  loc(c)=a,  loc(a)=floor,  ¬clear(b),  loc(b)=crane,  holding=b  

Page 39: Chapter02b


Models  for  Planning  with  LTL  ●  A model is a pair M = (M, si)

Ø  M = 〈s0, s1, …〉 is a sequence of states Ø  si is the i’th state in M,

●  For planning, we also have a goal g = {g1, …, gn} Ø  To reason about it, add a modal operator called “Goal”

•  Not part of ordinary LTL, but I’ll call it LTL anyway

Ø  In an LTL formula, use “Goal(gi)” to refer to part of g •  ((M,si), g) ⊨ Goal(gi) iff g ⊨ gi

●  Planning problem: Ø  Initial state s0, a goal g, control formula f Ø  Find a plan π = 〈a1, …, an〉 that generates a sequence of states

M = 〈s0, s1, …sn〉 such that M ⊨ f and sn ⊨ g •  That’s not quite correct •  Do you know why?

Page 40: Chapter02b


Models  for  Planning  with  LTL  ●  M needs to be an infinite sequence ●  Kluge: assume that the final state repeats infinitely after the plan ends

●  Planning problem: Ø  Initial state s0, a goal g, control formula f Ø  Find a plan π = 〈a1, …, an〉 that generates a sequence of states

M = 〈s0, s1, …, sn, sn, sn, …〉 such that M ⊨ f and sn ⊨ g

Page 41: Chapter02b


Examples  ●  Suppose M = 〈s0, s1, …〉

●  (M,s0) ⊨ XX loc(a)=b

Ø  a is on b in state s2

●  Abbreviation: can omit the state, it defaults to s0 Ø  M ⊨ XX loc(a)=b          means (M,s0) ⊨ XX loc(a)=b

●  Since loc(a)=b has no modal operators Ø  (M,s2) ⊨ loc(a)=b is equivalent to s2 ⊨ loc(a)=b

●  M ⊨ G holding  ≠ c  Ø  in every state in M, we aren’t holding c

●  M ⊨ G (clear(b) ⇒ (clear(b) U loc(a)=b)) Ø  whenever we enter a state in which b is clear,

b remains clear until a is on b

Page 42: Chapter02b


TLPlan  ●  Nondeterministic forward search

Ø  s = current state, f = control formula, g = goal

●  If s satisfies g then we’re done ●  Otherwise, think about what kind of plan we need

Ø  It must generate a sequence of states M = 〈s, s+, s++, …〉 that satisfies f ●  Compute a formula f + such that

(M,s) ⊨ f iff (M,s+) ⊨ f + ●  Fail if f + = FALSE  

Ø  No matter what s+ is, (M,s+) can’t satisfy f +

●  Fail if no applicable actions

●  Otherwise, nondeterministically choose one, compute s+, and call TLPlan with s+ and f +

TLPlan(s, f, g)
    if s satisfies g then return ⟨⟩
    f + ← Progress(f, s)
    if f + = False then return failure
    A ← {actions applicable to s}
    if A is empty then return failure
    nondeterministically choose a ∈ A
    π+ ← TLPlan(γ(s,a), f +, g)
    if π+ ≠ failure then return a.π+
    return failure

Page 43: Chapter02b


Progression  ●  Procedure Progress(f,s)

◆  Case:
     1. f contains no temporal ops :  f + ← True if s ⊨ f, False otherwise
     2. f = f1 ∧ f2 :  f + ← Progress(f1, s) ∧ Progress(f2, s)
     3. f = f1 ∨ f2 :  f + ← Progress(f1, s) ∨ Progress(f2, s)
     4. f = ¬f1 :  f + ← ¬Progress(f1, s)
     5. f = X f1 :  f + ← f1
     6. f = F f1 :  f + ← Progress(f1, s) ∨ f
     7. f = G f1 :  f + ← Progress(f1, s) ∧ f
     8. f = f1 U f2 :  f + ← Progress(f2, s) ∨ (Progress(f1, s) ∧ f)
     9. f = ∀[x:g(x)] h(x) :  f + ← Progress(h(x1), s) ∧ … ∧ Progress(h(xn), s)
     10. f = ∃[x:g(x)] h(x) :  f + ← Progress(h(x1), s) ∨ … ∨ Progress(h(xn), s)
◆  simplify f + and return it  (False ∧ h = False,  True ∧ h = h,  ¬False = True, etc.)

Compute the formula f + that M + must satisfy
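Putting the two slides together, here is a rough Python rendering of Progress and a depth-first stand-in for the nondeterministic TLPlan (an illustrative sketch, not the actual TLPlan implementation). Formulas are nested tuples such as ("G", f1) or ("atom", predicate); states and goals are dicts of state-variable assignments; γ(s,a) just applies an action's effects. All of these encodings are my own choices.

TRUE, FALSE = True, False

def progress(f, s):
    # M,s ⊨ f  iff  M,s+ ⊨ progress(f, s)   (the slide's cases, simplifying as we go)
    if f is TRUE or f is FALSE:
        return f
    op = f[0]
    if op == "atom":                                  # no temporal operators
        return TRUE if f[1](s) else FALSE
    if op == "not":
        g = progress(f[1], s)
        return FALSE if g is TRUE else TRUE if g is FALSE else ("not", g)
    if op == "and":
        return _and(progress(f[1], s), progress(f[2], s))
    if op == "or":
        return _or(progress(f[1], s), progress(f[2], s))
    if op == "X":
        return f[1]
    if op == "F":
        return _or(progress(f[1], s), f)
    if op == "G":
        return _and(progress(f[1], s), f)
    if op == "U":                                     # f = ("U", f1, f2)
        return _or(progress(f[2], s), _and(progress(f[1], s), f))
    if op in ("forall", "exists"):                    # bounded quantifier: f = (op, domain_fn, body_fn)
        parts = [progress(f[2](x), s) for x in f[1](s)]
        combine, out = (_and, TRUE) if op == "forall" else (_or, FALSE)
        for p in parts:
            out = combine(out, p)
        return out
    raise ValueError(op)

def _and(a, b):                                       # simplify: False ∧ h = False, True ∧ h = h, ...
    if a is FALSE or b is FALSE: return FALSE
    if a is TRUE: return b
    if b is TRUE: return a
    return ("and", a, b)

def _or(a, b):
    if a is TRUE or b is TRUE: return TRUE
    if a is FALSE: return b
    if b is FALSE: return a
    return ("or", a, b)

def tlplan(s, f, g, actions):
    """Depth-first version of the slide's TLPlan (no cycle checking, like the pseudocode)."""
    if all(s.get(v) == c for v, c in g.items()):
        return []
    f_plus = progress(f, s)
    if f_plus is FALSE:
        return None                                   # no successor state can satisfy the control formula
    for a in actions:
        if all(s.get(v) == c for v, c in a["pre"].items()):       # a is applicable in s
            plan = tlplan({**s, **a["eff"]}, f_plus, g, actions)  # γ(s,a) = s updated by eff(a)
            if plan is not None:
                return [a["name"]] + plan
    return None

In this encoding a control formula such as G holding≠c would be written ("G", ("atom", lambda s: s.get("holding") != "c")); progressing it in a state where c is not being held leaves the formula unchanged, and progressing it in a state where c is held yields False, which is exactly the point at which tlplan prunes the branch.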

Page 44: Chapter02b


Progressing  ordinary  formulas  ●  Procedure Progress(f,s)

◆  Case:  1. f contains no temporal ops : f + ← True if s ⊨ f, False  otherwise  2. f = f1 ∧ f2 : f + ← Progress(f1, s) ∧ Progress(f2, s) 3. f = f1 ∨ f2 : f + ← Progress(f1, s) ∨ Progress(f2, s) 4. f =¬ f1 : f + ← ¬Progress(f1, s) 5. f = X f1 : f + ← f1 6. f = F f1 : f + ← Progress(f1, s) ∨ f 7. f = G f1 : f + ← Progress(f1, s) ∧ f 8. f = f1 U f2 : f + ← Progress(f2, s) ∨ (Progress(f1, s) ∧ f) 9. f = ∀[x:g(x)] h(x) : f + ← Progress(h(x1), s) ∧ … ∧ Progress(h(xn), s)

10. f = ∃ [x:g(x)] h(x) : f + ← Progress(h(x1), s) ∨ … ∨ Progress(h(xn), s)

◆  simplify f + and return it

●  f = loc(a)=b ◆  if a is currently on b, then True (every possible M + is OK) ◆  otherwise False (there is no M + that’s OK)

Example:

Page 45: Chapter02b


●  f = XX loc(a)=b
Ø  two states from now, a must be on b
Ø  f + = X loc(a)=b
●  f = X loc(a)=b
◆  in the next state, a must be on b
◆  f + = loc(a)=b

Progressing  X  ●  Procedure Progress(f,s)

◆  Case:  1. f contains no temporal ops : f + ← True if s ⊨ f, False  otherwise  2. f = f1 ∧ f2 : f + ← Progress(f1, s) ∧ Progress(f2, s) 3. f = f1 ∨ f2 : f + ← Progress(f1, s) ∨ Progress(f2, s) 4. f =¬ f1 : f + ← ¬Progress(f1, s) 5. f = X f1 : f + ← f1 6. f = F f1 : f + ← Progress(f1, s) ∨ f 7. f = G f1 : f + ← Progress(f1, s) ∧ f 8. f = f1 U f2 : f + ← Progress(f2, s) ∨ (Progress(f1, s) ∧ f) 9. f = ∀[x:g(x)] h(x) : f + ← Progress(h(x1), s) ∧ … ∧ Progress(h(xn), s)

10. f = ∃ [x:g(x)] h(x) : f + ← Progress(h(x1), s) ∨ … ∨ Progress(h(xn), s)

◆  simplify f + and return it

Examples:

Page 46: Chapter02b


Progressing  ∧  

●  f = clear(c) ∧ X loc(a)=c Ø  c must be clear now, and a must be on c in the next state

●  f + = Progress(clear(c), s) ∧ Progress(X loc(a)=c, s) = True ∧ loc(a)=c = loc(a)=c

●  Procedure Progress(f,s) ◆  Case:  

1. f contains no temporal ops : f + ← True if s ⊨ f, False  otherwise  2. f = f1 ∧ f2 : f + ← Progress(f1, s) ∧ Progress(f2, s) 3. f = f1 ∨ f2 : f + ← Progress(f1, s) ∨ Progress(f2, s) 4. f =¬ f1 : f + ← ¬Progress(f1, s) 5. f = X f1 : f + ← f1 6. f = F f1 : f + ← Progress(f1, s) ∨ f 7. f = G f1 : f + ← Progress(f1, s) ∧ f 8. f = f1 U f2 : f + ← Progress(f2, s) ∨ (Progress(f1, s) ∧ f) 9. f = ∀[x:g(x)] h(x) : f + ← Progress(h(x1), s) ∧ … ∧ Progress(h(xn), s)

10. f = ∃ [x:g(x)] h(x) : f + ← Progress(h(x1), s) ∨ … ∨ Progress(h(xn), s)

◆  simplify f + and return it

Example:

[Figure: a on the floor, c on b; a and c are clear]

Page 47: Chapter02b


Progressing G

●  f = G loc(a)=c Ø  a must be on c now and must stay there in the future

●  f + = Progress(loc(a)=c, s) ∧ f = False ∧ G loc(a)=c = False  

●  Procedure Progress(f,s) ◆  Case:  

1. f contains no temporal ops : f + ← True if s ⊨ f, False  otherwise  2. f = f1 ∧ f2 : f + ← Progress(f1, s) ∧ Progress(f2, s) 3. f = f1 ∨ f2 : f + ← Progress(f1, s) ∨ Progress(f2, s) 4. f =¬ f1 : f + ← ¬Progress(f1, s) 5. f = X f1 : f + ← f1 6. f = F f1 : f + ← Progress(f1, s) ∨ f 7. f = G f1 : f + ← Progress(f1, s) ∧ f 8. f = f1 U f2 : f + ← Progress(f2, s) ∨ (Progress(f1, s) ∧ f) 9. f = ∀[x:g(x)] h(x) : f + ← Progress(h(x1), s) ∧ … ∧ Progress(h(xn), s)

10. f = ∃ [x:g(x)] h(x) : f + ← Progress(h(x1), s) ∨ … ∨ Progress(h(xn), s)

◆  simplify f + and return it

Example:

[Figure: a on the floor, c on b; a and c are clear]

Page 48: Chapter02b


Progressing U

●  f = loc(a)=b U clear(c) Ø  c must be clear, or a must be on b and stay there until c is clear

●  f + = Progress(clear(c), s) ∨ [Progress(loc(a)=b, s) ∧ f ] = True ∨ [ False ∧ (loc(a)=b U clear(c)) ] = True

●  Procedure Progress(f,s) ◆  Case:  

1. f contains no temporal ops : f + ← True if s ⊨ f, False  otherwise  2. f = f1 ∧ f2 : f + ← Progress(f1, s) ∧ Progress(f2, s) 3. f = f1 ∨ f2 : f + ← Progress(f1, s) ∨ Progress(f2, s) 4. f =¬ f1 : f + ← ¬Progress(f1, s) 5. f = X f1 : f + ← f1 6. f = F f1 : f + ← Progress(f1, s) ∨ f 7. f = G f1 : f + ← Progress(f1, s) ∧ f 8. f = f1 U f2 : f + ← Progress(f2, s) ∨ (Progress(f1, s) ∧ f) 9. f = ∀[x:g(x)] h(x) : f + ← Progress(h(x1), s) ∧ … ∧ Progress(h(xn), s)

10. f = ∃ [x:g(x)] h(x) : f + ← Progress(h(x1), s) ∨ … ∨ Progress(h(xn), s)

◆  simplify f + and return it

[Figure: a on the floor, c on b; a and c are clear]

Page 49: Chapter02b


●  Procedure Progress(f,s) ◆  Case:  

1. f contains no temporal ops : f + ← True if s ⊨ f, False  otherwise  2. f = f1 ∧ f2 : f + ← Progress(f1, s) ∧ Progress(f2, s) 3. f = f1 ∨ f2 : f + ← Progress(f1, s) ∨ Progress(f2, s) 4. f =¬ f1 : f + ← ¬Progress(f1, s) 5. f = X f1 : f + ← f1 6. f = F f1 : f + ← Progress(f1, s) ∨ f 7. f = G f1 : f + ← Progress(f1, s) ∧ f 8. f = f1 U f2 : f + ← Progress(f2, s) ∨ (Progress(f1, s) ∧ f) 9. f = ∀[x:g(x)] h(x) : f + ← Progress(h(x1), s) ∧ … ∧ Progress(h(xn), s)

10. f = ∃ [x:g(x)] h(x) : f + ← Progress(h(x1), s) ∨ … ∨ Progress(h(xn), s)

◆  simplify f + and return it

Progressing  ∀  

●  f = ∀[x: clear(x)] X loc(x)=floor Ø  {x | clear(x)} = {a,  c}

●  f + = Progress(X loc(a)=floor, s) ∧ Progress(X loc(c)=floor, s) = loc(a)=floor ∧ loc(c)=floor  

Example:

xi is the i’th element of {x | s ⊨ g(x)}

[Figure: a on the floor, c on b; a and c are clear]

Page 50: Chapter02b


●  Procedure Progress(f,s) ◆  Case:  

1. f contains no temporal ops : f + ← True if s ⊨ f, False  otherwise  2. f = f1 ∧ f2 : f + ← Progress(f1, s) ∧ Progress(f2, s) 3. f = f1 ∨ f2 : f + ← Progress(f1, s) ∨ Progress(f2, s) 4. f =¬ f1 : f + ← ¬Progress(f1, s) 5. f = X f1 : f + ← f1 6. f = F f1 : f + ← Progress(f1, s) ∨ f 7. f = G f1 : f + ← Progress(f1, s) ∧ f 8. f = f1 U f2 : f + ← Progress(f2, s) ∨ (Progress(f1, s) ∧ f) 9. f = ∀[x:g(x)] h(x) : f + ← Progress(h(x1), s) ∧ … ∧ Progress(h(xn), s)

10. f = ∃ [x:g(x)] h(x) : f + ← Progress(h(x1), s) ∨ … ∨ Progress(h(xn), s)

◆  simplify f + and return it

Progressing  ∃  

●  f = ∃[x: clear(x)] X loc(x)=floor Ø  {x | clear(x)} = {a,  c}

●  f + = Progress(X loc(a)=floor, s) ∨ Progress(X loc(c)=floor, s) = loc(a)=floor ∨ loc(c)=floor

Example:

xi is the i’th element of {x | s ⊨ g(x)}

[Figure: a on the floor, c on b; a and c are clear]

Page 51: Chapter02b


TLPlan  ●  Nondeterministic forward search

Ø  s = current state, f = control formula, g = goal

●  If s satisfies g then we’re done ●  Otherwise, think about what kind of plan we need

Ø  It must generate a sequence of states M = 〈s, s+, s++, …〉 that satisfies f ●  Compute a formula f + such that

(M,s) ⊨ f iff (M,s+) ⊨ f + ●  Fail if f + = FALSE  

Ø  No matter what s+ is, (M,s+) can’t satisfy f +

●  Fail if no applicable actions

●  Otherwise, nondeterministically choose one, compute s+, and call TLPlan with s+ and f +

TLPlan  (s, f, g) if s satisfies g then return ⟨ ⟩ f + ← Progress  (f, s) if f + = False then return failure   A ← {actions applicable to s} if A is empty then return failure   nondeterministically choose a ∈ A π+ ← TLPlan  (γ (s,a), f +, g) if π+ ≠ failure then return π.π+   return failure

Page 52: Chapter02b


Example  Planning  Problem  ●  s = {loc(a)=floor, loc(b)=floor, clear(a), clear(c), loc(c)=b} ●  g = {loc(b)=a}

●  f = G ∀[x: clear(x)] (loc(x)≠floor ∨ ∃[y: Goal(loc(x)=y)] ∨ X holding≠x)

Ø  never pick up a clear container from the floor unless it needs to be elsewhere

●  Run the TLPlan algorithm ●  Compute f +

Ø  Return failure  if f + = FALSE

●  Two applicable actions: pickup(a) and unstack(c,b) Ø  Which one to use?

●  Try using pickup(a)

Ø  Call TLPlan recursively with γ (s, pickup(a)) and f +

●  If TLPlan returns failure, then try unstack(c,b)

[Figure: s0 has a and b on the floor with c on b; the goal g has b on a]

TLPlan  (s, f, g) if s satisfies g then return ⟨ ⟩ f + ← Progress  (f, s) if f + = False then return failure   A ← {actions applicable to s} if A is empty then return failure   nondeterministically choose a ∈ A π+ ← TLPlan  (γ (s,a), f +, g) if π+ ≠ failure then return π.π+   return failure

Page 53: Chapter02b


Example  Planning  Problem  ●  s = {loc(a)=floor, loc(b)=floor, clear(a), clear(c), loc(c)=b} ●  g = {loc(b)=a}

●  f = G ∀[x: clear(x)] (loc(x)≠floor ∨ ∃[y: Goal(loc(x)=y)] ∨ X holding≠x)

●  f + = Progress(G f1, s) = Progress(f1, s) ∧ f
        = Progress(∀[x: clear(x)] h(x), s) ∧ f
        = Progress(h(a) ∧ h(c), s) ∧ f
        = Progress(h(a), s) ∧ Progress(h(c), s) ∧ f
     •  Progress(h(a), s) = Progress(loc(a)≠floor ∨ ∃[y: Goal(loc(a)=y)] ∨ X holding≠a, s) = False ∨ False ∨ holding≠a  =  holding≠a
     •  Progress(h(c), s) = Progress(loc(c)≠floor ∨ ∃[y: Goal(loc(c)=y)] ∨ X holding≠c, s) = False ∨ True ∨ holding≠c  =  True

●  f + = holding≠a   ∧ True ∧ f = holding≠a   ∧ f

●  Two applicable actions: pickup(a) and unstack(c,b) Ø  s1 = γ (s, pickup(a)): Progress(f +, s1) = False ⇒ backtrack Ø  s2 = γ (s, unstack(c,b)): Progress(f +, s2) = f ⇒ keep going

[Figure: s0 has a and b on the floor with c on b; the goal g has b on a]

h(x)

f1

Page 54: Chapter02b


Container-Stacking Problems  ●  Define an inferred state variable final(x) ∈ Booleans, where x is a container

•  Never directly changed by any planning operator •  Produced by logical inference from the other state variables

●  Want final(x) to mean x is at the top of a stack that we’re finished moving Ø  Neither x nor the containers below x will ever need to be moved

●  Axioms to support this:
Ø  final(x) ⇔ clear(x) ∧ ¬Goal(holding=x) ∧ finalbelow(x)
Ø  finalbelow(x) ⇔
        (loc(x)=floor ∧ ¬∃[y: Goal(loc(x)=y)])
        ∨ ∃[y: loc(x)=y] [ ¬Goal(loc(x)=floor) ∧ ¬Goal(holding=y) ∧ ¬Goal(clear(y))
              ∧ ∀[z: Goal(loc(x)=z)] (z=y) ∧ ∀[z: Goal(loc(z)=y)] (z=x) ∧ finalbelow(y) ]

Ø  nonfinal(x) ⇔ clear(x) ∧ ¬final(x)  
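A direct transcription of these axioms into Python (a sketch under my own encoding: states and goals are dicts, and containers currently held by the crane are treated as not finalbelow, which the axioms leave implicit):

def final(x, s, g):
    """x is clear, the goal never requires holding x, and everything from x down can stay put."""
    return x in s["clear"] and g.get("holding") != x and finalbelow(x, s, g)

def finalbelow(x, s, g):
    goal_loc = g.get("loc", {})
    here = s["loc"][x]
    if here == "crane":
        return False                                     # held containers are not final (assumption)
    if here == "floor":
        return x not in goal_loc                         # on the floor and the goal doesn't move it
    y = here                                             # x sits on container y
    return (goal_loc.get(x) != "floor"                   # ¬Goal(loc(x)=floor)
            and g.get("holding") != y                    # ¬Goal(holding=y)
            and y not in g.get("clear", set())           # ¬Goal(clear(y))
            and (x not in goal_loc or goal_loc[x] == y)  # ∀[z: Goal(loc(x)=z)] z = y
            and all(z == x for z, tgt in goal_loc.items() if tgt == y)   # ∀[z: Goal(loc(z)=y)] z = x
            and finalbelow(y, s, g))

def nonfinal(x, s, g):
    return x in s["clear"] and not final(x, s, g)

# Goal borrowed from the five-container example a few slides later: a on b on c, d on e
s = {"loc": {"e": "a", "a": "floor", "d": "c", "c": "b", "b": "floor"}, "clear": {"e", "d"}, "holding": None}
g = {"loc": {"a": "b", "b": "c", "d": "e"}, "clear": set()}
print(final("e", s, g), final("d", s, g))                # -> False False (both stacks still need rearranging)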

Page 55: Chapter02b


Control  Rules  Try TLPlan with three different control formulas:

(1) If x is final, only put a container y onto x if it will make y final:

Ø  G ∀[x: clear(x)] (final(x) ⇒ X [clear(x) ∨ ∃[y: loc(y)=x] final(y)])

(2) Like (1), but also says never to put anything onto a container that isn’t final: Ø  G ∀[x: clear(x)] [

(final(x) ⇒ X [clear(x) ∨ ∃[y: loc(y)=x] final(y)]) ∧ (nonfinal(x) ⇒ X ¬∃[y: loc(y)=x])]

(3) Like (2), but also says never to pick up a nonfinal container from the floor

unless you can put it where it will be final: Ø  G ∀[x: clear(x)] [

(final(x) ⇒ X [clear(x) ∨ ∃[y: loc(y)=x] final(y)]) ∧ (nonfinal(x) ⇒ X ¬∃[y: loc(y)=x]) ∧ (onfloor(x) ∧ ∃[y: Goal(loc(x)=y)] ¬final(y) ⇒ X¬holding(x))]

Page 56: Chapter02b


Container  Stacking  

Page 57: Chapter02b


Container  Stacking  

Page 58: Chapter02b


Domain-Specific Planning Algorithms

●  Sometimes we can write highly efficient planning algorithms for a specific class of problems Ø  Use special properties of that class

●  For container-stacking problems with n containers, we can easily get a solution of length O(n) Ø  Move all containers to the floor, then build up stacks from the bottom

●  With additional domain-specific knowledge, can do even better …

[Figure: s0 has stacks e on a and d on c on b; g has stacks a on b on c and d on e]

Page 59: Chapter02b


Container-Stacking Algorithm

●  The algorithm generates the following sequence of actions:
Ø  ⟨move(e,a,floor),  move(d,c,e),  move(c,b,floor),  move(b,floor,c),  move(a,floor,b)⟩

[Figure: the goal g, with a on b on c and d on e]

●  c needs moving if
◆  s contains loc(c)=d and g contains loc(c)=e, where e≠d
◆  s contains loc(c)=d and g contains loc(b)=d, where b≠c
◆  s contains loc(c)=d and d needs moving

loop
    if ∃ a clear container c that needs moving
         & we can move c to a position d where c won't need moving
    then move c to d
    else if ∃ a clear container c that needs moving
    then move c to any clear pallet
    else if the goal is satisfied then return success
    else return failure
repeat

[Figure: the initial state s0, with e on a and d on c on b]
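A runnable sketch of the loop above (my own rendering; containers, pallets, and the goal are encoded as dicts, and it makes no attempt at the O(n³) guarantees discussed on the next slide). On the s0 and g shown here it reproduces the five-move plan listed at the top of this slide:

def needs_moving(c, loc, goal_loc):
    """The slide's three conditions; loc and goal_loc map each container to its position."""
    here = loc[c]
    if goal_loc.get(c, here) != here:                                     # the goal puts c somewhere else
        return True
    if any(other != c and goal_loc.get(other) == here for other in loc):  # the goal puts another container here
        return True
    return here in loc and needs_moving(here, loc, goal_loc)              # whatever c sits on needs moving

def stack_containers(loc, goal_loc, pallets):
    """Rough sketch of the loop above; returns a list of (container, destination) moves or None."""
    loc, plan = dict(loc), []
    clear = lambda pos: all(loc[c] != pos for c in loc)                   # nothing is sitting on pos

    def good_destination(c):
        d = goal_loc.get(c)
        ok = d is not None and clear(d) and (d in pallets or not needs_moving(d, loc, goal_loc))
        return d if ok else None

    while True:
        movable = [c for c in loc if clear(c) and needs_moving(c, loc, goal_loc)]
        placeable = [(c, good_destination(c)) for c in movable if good_destination(c)]
        if placeable:
            c, d = placeable[0]                                           # c won't need moving again
        elif movable:
            buried = [c for c in movable if loc[c] in loc]                # prefer unstacking a container
            c, d = (buried or movable)[0], next(p for p in pallets if clear(p))
        elif all(loc[c] == goal_loc.get(c, loc[c]) for c in loc):
            return plan                                                   # goal satisfied
        else:
            return None                                                   # failure
        loc[c] = d
        plan.append((c, d))

# The five-container example: initial stacks e-on-a and d-on-c-on-b; goal a-on-b-on-c and d-on-e
loc0 = {"e": "a", "a": "p1", "d": "c", "c": "b", "b": "p2"}
goal = {"a": "b", "b": "c", "d": "e"}
print(stack_containers(loc0, goal, ["p1", "p2", "p3", "p4", "p5"]))
# -> [('e', 'p3'), ('d', 'e'), ('c', 'p4'), ('b', 'c'), ('a', 'b')]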

Page 60: Chapter02b


Properties of the Algorithm

●  Sound, complete, guaranteed to terminate on all container-stacking problems

●  Runs in time O(n3) Ø  Can be modified (Slaney & Thiébaux) to run in time O(n)

●  Often finds optimal (shortest) solutions ●  But sometimes only near-optimal

Ø  For container-stacking problems, PLAN-LENGTH is NP-complete

●  I think what TLPlan does (with its 3rd control rule) is roughly similar to this algorithm

Page 61: Chapter02b


Using Deterministic Domain Models
●  For planning with deterministic domain models, we made some assumptions that aren't necessarily true:
     Assumption                      Problem
Ø  Static world                     The world may change dynamically
Ø  Perfect information           We almost never have all of the information
Ø  Instantaneous actions       Actions take time; there may be time constraints
Ø  Correct predictions           Action models usually are just approximations
Ø  Determinism                    Action model may just be the "nominal case"
Ø  Flat search space             There may be further lower-level refinements

●  If enough of the assumptions are approximately true, the plans may still be useful

Ø  But can't just take a plan π and start executing it
Ø  Need to monitor π's execution, detect problems as they occur, and recover from them

Page 62: Chapter02b


Acting and Planning

●  Interaction is roughly as follows
Ø  loop
     •  from the planner, get the latest plan or partial plan
     •  perform one or more actions, monitoring the current state
          ▸  if problems occur, replan while performing some preplanned recovery actions
●  Performance could involve lower-level refinement rather than direct execution
Ø  The next chapter contains lots of details

[Figure: Planning, Acting, and Performance]

Page 63: Chapter02b


[Figure: planning stage and acting stage; during search the planner predicts s → s′]

Acting and Planning
●  What kind of information should the planner provide?
Ø  Depends on the planning domain and the actor
●  Some possibilities
Ø  Complete plan, as in the algorithms we've discussed
     •  But usually for a subproblem
          ▸  example on next slide
Ø  Partial plan
     •  e.g., receding horizon
Ø  Several partial plans, with relative evaluations of each
     •  e.g., game-tree search

[Figure: the overall problem decomposed into subproblems sub1, sub2, sub3]

Page 64: Chapter02b


Example  ●  Killzone 2

Ø  “First-person shooter” game ●  Special-purpose AI planner

Ø  Plans enemy actions at the squad level
•  Subproblems; solution plans are maybe 4–6 actions long

Ø  Different planning algorithm than what we’ve discussed so far, but it uses a deterministic domain model

Ø  Quickly generates a plan that would work if nothing interferes Ø  Replans several times per second as the world changes

●  Why it worked: Ø  Don’t want to get the best possible plan Ø  Need actions that appear believable and consistent to human users Ø  Need them very quickly