Chapter02b
Transcript of Chapter02b
Dana Nau and Vikas Shivashankar: Lecture slides for Automated Planning and Acting. Updated 2/5/15
Backward Search
● Forward search starts at the initial state
  Ø Chooses an action that’s applicable
  Ø Computes state transition s′ = γ(s,a)
● Backward search starts at the goal
  Ø Chooses an action that’s relevant
    • A possible “last action” before the goal
  Ø Computes inverse state transition g′ = γ⁻¹(g,a)
    • g′ = properties a state s′ should satisfy in order for γ(s′,a) to satisfy g
● Why would we want to do this?
● One possibility: sometimes has a lower branching factor
  Ø Forward: 10 applicable actions
    • for each robot, two move actions and three load actions
  Ø Backward: g = {loc(r1)=d3}
  Ø 2 relevant actions: move(r1,d1,d3), move(r1,d2,d3)
    • Can eliminate move(r1,d2,d3); it requires a rigid condition that’s false
[Figure: locations d1, d2, d3; robots r1, r2; containers c1 to c6]
Relevance
● Idea: a is relevant for g if a could be the last action of a plan that achieves g
● Definition:
  Ø Let g = {g1, …, gk} be a goal. An action a is relevant for g if
    1. eff(a) makes at least one gi true, i.e., eff(a) ∩ g ≠ ∅
    2. eff(a) doesn’t make any gi false
      ▸ ∀ x, c, c′, if eff(a) contains (x,c) and g contains x = c′ then c = c′
    3. pre(a) doesn’t require any gi to be false unless eff(a) makes gi true
      ▸ ∀ x, c, c′, if (x,c) ∈ pre(a) and (x,c′) ∈ g – eff(a) then c = c′
● What actions are relevant for loc(c1)=r2?
Inverse State Transitions
● If a is relevant for achieving g, then γ⁻¹(g,a) = pre(a) ∪ (g – eff(a))
● If a isn’t relevant for g, then γ⁻¹(g,a) is undefined
● Example:
  Ø g = {loc(c1)=r2}
  Ø What is γ⁻¹(g, load(r2,c1,d3))?
  Ø What is γ⁻¹(g, load(r1,c1,d1))?
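The relevance test and the inverse state transition can be sketched in Python as follows. This is only an illustration: the slides do not give the definition of load, so the `load` helper below (its preconditions, effects, and the `cargo` state variable) is a hypothetical stand-in, and the `Action` class is likewise an assumption.

```python
from dataclasses import dataclass
from typing import Dict, Optional

Assignment = Dict[str, str]  # state variable -> value


@dataclass
class Action:
    name: str
    pre: Assignment
    eff: Assignment


def relevant(a: Action, g: Assignment) -> bool:
    """Check the three relevance conditions for action a and goal g."""
    # 1. eff(a) makes at least one goal assignment true.
    makes_true = any(a.eff.get(x) == v for x, v in g.items())
    # 2. eff(a) makes no goal assignment false.
    makes_false = any(x in a.eff and a.eff[x] != v for x, v in g.items())
    # 3. pre(a) must agree with every goal assignment in g - eff(a).
    rest = {x: v for x, v in g.items() if a.eff.get(x) != v}
    pre_conflict = any(x in a.pre and a.pre[x] != v for x, v in rest.items())
    return makes_true and not makes_false and not pre_conflict


def gamma_inverse(g: Assignment, a: Action) -> Optional[Assignment]:
    """Return pre(a) U (g - eff(a)), or None where it is undefined."""
    if not relevant(a, g):
        return None
    g_prime = {x: v for x, v in g.items() if a.eff.get(x) != v}
    g_prime.update(a.pre)
    return g_prime


def load(r: str, c: str, l: str) -> Action:
    """Hypothetical load operator: robot r picks up container c at location l."""
    return Action(
        name=f"load({r},{c},{l})",
        pre={f"loc({r})": l, f"loc({c})": l, f"cargo({r})": "nil"},
        eff={f"loc({c})": r, f"cargo({r})": c},
    )


g = {"loc(c1)": "r2"}
print(gamma_inverse(g, load("r2", "c1", "d3")))
print(gamma_inverse(g, load("r1", "c1", "d1")))
```

Under this assumed operator, load(r2,c1,d3) is relevant and the new goal is just its preconditions, while load(r1,c1,d1) sets loc(c1) to r1, which contradicts the goal, so the inverse transition is undefined.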
Backward Search
For cycle checking:
● After line 1, put
    Solved = {g}
● After line 6, put
    if g′ ∈ Solved then return failure
    Solved ← Solved ∪ {g′}
● More powerful:
    if ∃g ∈ Solved s.t. g ⊆ g′ then return failure
● Sound and complete
  Ø If a planning problem is solvable then at least one of Backward-search’s nondeterministic execution traces will find a solution
[Figure: search from goal g back toward s0 through subgoals g1 to g5, using actions a1 to a5]
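A minimal sketch of backward search with the stronger subset-based cycle check, written as a deterministic depth-first search. The Backward-search pseudocode itself is not reproduced in this transcript, so the structure below is an assumption, as are the tuple encoding of actions and the one-robot move domain. Solved is kept per search path, which corresponds to one nondeterministic execution trace.

```python
from typing import Dict, List, Optional, Tuple

Assignment = Dict[str, str]
Action = Tuple[str, Assignment, Assignment]  # (name, pre, eff)


def gamma_inverse(g: Assignment, a: Action) -> Optional[Assignment]:
    """Return pre(a) U (g - eff(a)) if a is relevant for g, else None."""
    _, pre, eff = a
    if not any(eff.get(x) == v for x, v in g.items()):
        return None  # a achieves no part of g
    if any(x in eff and eff[x] != v for x, v in g.items()):
        return None  # a makes some part of g false
    rest = {x: v for x, v in g.items() if x not in eff}
    if any(x in pre and pre[x] != v for x, v in rest.items()):
        return None  # pre(a) contradicts a goal that a does not achieve
    return {**rest, **pre}


def backward_search(actions: List[Action], s0: Assignment, g: Assignment,
                    solved: Optional[List[Assignment]] = None) -> Optional[List[str]]:
    """Depth-first backward search; returns a list of action names or None."""
    solved = [g] if solved is None else solved
    if all(s0.get(x) == v for x, v in g.items()):
        return []  # s0 satisfies g
    for a in actions:
        g2 = gamma_inverse(g, a)
        if g2 is None:
            continue
        # Cycle check: prune g2 if an earlier goal on this path is a subset of it.
        if any(old.items() <= g2.items() for old in solved):
            continue
        plan = backward_search(actions, s0, g2, solved + [g2])
        if plan is not None:
            return plan + [a[0]]
    return None


# Hypothetical domain: robot r1 may move between any two of three locations.
LOCS = ["d3", "d2", "d1"]
ACTIONS = [(f"move(r1,{l},{m})", {"loc(r1)": l}, {"loc(r1)": m})
           for l in LOCS for m in LOCS if l != m]

plan = backward_search(ACTIONS, {"loc(r1)": "d1"}, {"loc(r1)": "d3"})
print(plan)
```

The action order is chosen so that the search first regresses d3 to d2 and then tries to regress d2 back to d3; the cycle check rejects that branch, and without it this depth-first version would recurse forever.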
Branching Factor
● Our motivation for Backward-search was to focus the search
  Ø But as written, it doesn’t really accomplish that
● Solve this by lifting
  Ø Leave y uninstantiated
[Figure: for g = {loc(r1)=d3}, the ground search branches on move(r1,d1,d3), move(r1,d2,d3), move(r1,d4,d3), …, move(r1,d20,d3); the lifted search has the single branch move(r1,y,d3)]
Lifted Backward Search
● Like Backward-search but more complicated
  Ø Have to keep track of what values were substituted for which parameters
  Ø But it has a much smaller branching factor
● I won’t discuss the details
Plan-Space Planning
● Another approach
  Ø formulate planning as a constraint satisfaction problem
  Ø use constraint-satisfaction techniques to produce solutions that are more flexible than ordinary plans
    • E.g., plans in which the actions are partially ordered
    • Postpone ordering decisions until the plan is being executed
      ▸ the actor may have a better idea about which ordering is best
● First step toward planning concurrent execution of actions (Chapter 4)
Outline:
  • Basic idea
  • Open goals
  • Threats
  • The PSP algorithm
  • Long example
  • Comments
Plan-Space Planning - Basic Idea
● Backward search from the goal
● Each node of the search space is a partial plan, π
  • A set of partially-instantiated actions
  • Constraints on the actions
  Ø Keep making refinements, until we have a solution
● Types of constraints:
  Ø precedence constraints
    • indicated by solid arcs
  Ø binding constraints
    • inequality constraints, e.g., z ≠ x or w ≠ p1
  Ø causal links:
    • indicated by dashed arcs
    • use effect e of action a to establish precondition p of action b
● How to tell we have a solution: no more flaws in the plan
  Ø Two kinds of flaws …
[Figure: foo(x) (Pre: …; Eff: loc(x)=p1), bar(x) (Pre: loc(x)=p1; Eff: …), baz(z) (Pre: loc(z)=p2; Eff: …), with binding constraint z ≠ x]
Flaws: 1. Open Goals
● A precondition p of an action b is an open goal if there is no causal link for p
● Resolve the flaw by creating a causal link
  Ø Find an action a (either already in π, or can add it to π) that can establish p
    • can precede b
    • can have p as an effect
  Ø Do substitutions on variables to make a assert p
    • e.g., replace x with y
  Ø Add an ordering constraint a ≺ b
  Ø Create a causal link from a to p
[Figure: before, foo(x) (Eff: loc(x)=p1) and bar(y) (Pre: loc(y)=p1); after substituting y for x, foo(y) (Eff: loc(y)=p1) has a causal link to bar(y) (Pre: loc(y)=p1)]
Flaws: 2. Threats
● Suppose we have a causal link from action a to precondition p of action b
● Action c threatens the link if c may affect p and may come between a and b
  Ø c is a threat even if it makes p true rather than false
    • Causal link means a, not c, is supposed to establish p for b
    • The plan in which c establishes p will be generated on another path in the search space
● Three possible ways to resolve the flaw:
  Ø Require c ≺ a
  Ø Require b ≺ c
  Ø Constrain variable(s) to prevent c from affecting p
[Figure, state variables: foo(y) (Eff: loc(y)=p1) linked to bar(y) (Pre: loc(y)=p1); clobber(z) (Eff: loc(z)=p2)]
[Figure, classical: foo(y) (Eff: loc(y,p1)) linked to bar(y) (Pre: loc(y,p1)); clobber(z) (Eff: ¬loc(z,p1))]
PSP Algorithm
● Initial plan is always {Start, Finish} with Start ≺ Finish
  Ø Start has no preconditions; effects are the initial state s0
  Ø Finish has no effects; its precondition is the goal g
● PSP is sound and complete
  Ø It returns a partially ordered solution π such that any total ordering of π will achieve g
  Ø In some environments, could execute actions in parallel
[Figure: Start (Eff: s0) ≺ Finish (Pre: g)]
Example
move(c, y, z)
  pre: pos(c)=y, clear(c)=T, clear(z)=T
  eff: pos(c)←z, clear(y)←T, clear(z)←F
  Range(c) = Containers; Range(y) = Range(z) = Containers ∪ Pallets
Initial state (effects of Start): clear(p1)=T, clear(p2)=T, clear(p3)=F, clear(p4)=F, clear(a)=F, clear(b)=F, clear(c)=T, clear(d)=T, pos(a)=p3, pos(b)=p4, pos(c)=b, pos(d)=a
● Finish has two open goals: pos(a)=d, pos(b)=c
[Figure: initial state (d on a on p3, c on b on p4, p1 and p2 empty) and goal (a on d, b on c); partial plan Start ≺ Finish]
Example
● For each open goal, add a new action
  Ø Every new action a must have Start ≺ a, a ≺ Finish
[Figure: partial plan with Start; move(a,y1,d) (pre: pos(a)=y1, clear(a)=T, clear(d)=T); move(b,y2,c) (pre: pos(b)=y2, clear(b)=T, clear(c)=T); Finish (pre: pos(a)=d, pos(b)=c)]
Example
● Resolve four more open goals: bind y1=p3, y2=p4
[Figure: partial plan with Start; move(a,p3,d) (pre: pos(a)=p3, clear(a)=T, clear(d)=T); move(b,p4,c) (pre: pos(b)=p4, clear(b)=T, clear(c)=T); Finish (pre: pos(a)=d, pos(b)=c)]
Example
● 1st threat requires z3≠d
● 2nd threat has two resolvers:
  Ø move(b,p4,c) ≺ move(x3,a,z3)
  Ø z3≠c
[Figure: partial plan with new action move(x3,a,z3) (pre: pos(x3)=a, clear(x3)=T, clear(z3)=T), plus move(a,p3,d) and move(b,p4,c) between Start and Finish]
Example
● Threats resolved
[Figure: same partial plan, now with binding constraints z3≠c, z3≠d]
Example
● 1st threat has two resolvers:
  Ø An ordering constraint, and z4≠d
● 2nd threat has three resolvers:
  Ø Two ordering constraints, and z4≠a
● 3rd threat has one: z4≠c
[Figure: partial plan with new action move(x4,b,z4) (pre: pos(x4)=b, clear(x4)=T, clear(z4)=T), plus move(x3,a,z3), move(a,p3,d) and move(b,p4,c); binding constraints z3≠c, z3≠d]
Example
● Resolve the three threats using the binding constraints
[Figure: same partial plan, now with binding constraints z3≠c, z3≠d, z4≠a, z4≠c, z4≠d]
Example
● Resolve five open goals
  Ø Bind x3=d, x4=c, z3=p1
[Figure: partial plan with move(d,a,p1) (pre: pos(d)=a, clear(d)=T, clear(p1)=T); move(c,b,z4) (pre: pos(c)=b, clear(c)=T, clear(z4)=T); move(a,p3,d); move(b,p4,c); binding constraints p1≠c, p1≠d, z4≠a, z4≠c, z4≠d]
Example
● Threatened causal link
● Resolvers:
  Ø move(d,a,p1) ≺ move(c,b,z4)
  Ø z4≠p1
[Figure: partial plan with move(d,a,p1), move(c,b,z4), move(a,p3,d), move(b,p4,c); binding constraints z4≠a, z4≠c, z4≠d]
Example
● Threat resolved
[Figure: same partial plan, now with binding constraints z4≠a, z4≠c, z4≠d, z4≠p1]
Example
● Resolve open goal
  Ø bind z4=p2
● No more flaws, so we’re done!
[Figure: final plan with move(d,a,p1), move(c,b,p2), move(a,p3,d), move(b,p4,c); binding constraints p2≠a, p2≠c, p2≠d, p2≠p1]
Example
● PSP returns this solution:
[Figure: Start; move(d,a,p1) and move(c,b,p2); move(a,p3,d) and move(b,p4,c); Finish]
Example
● Go back to the last threat
● Resolvers:
  Ø move(d,a,p1) ≺ move(c,b,z4)
  Ø z4≠p1
[Figure: partial plan with move(d,a,p1), move(c,b,z4), move(a,p3,d), move(b,p4,c); binding constraints z4≠a, z4≠c, z4≠d]
Example
● Threat resolved
[Figure: same partial plan, this time resolved with the ordering constraint rather than z4≠p1; binding constraints z4≠a, z4≠c, z4≠d]
Example
● Resolve open goal
  Ø bind z4=p2
● No more flaws, so we’re done
[Figure: final plan with move(d,a,p1), move(c,b,p2), move(a,p3,d), move(b,p4,c)]
Example
● Same solution as before, but with another ordering constraint
[Figure: Start; move(d,a,p1) and move(c,b,p2); move(a,p3,d) and move(b,p4,c); Finish]
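The claim that any total ordering of a PSP solution achieves the goal can be checked by brute force on this example. The sketch below is an illustration, not part of PSP: it assumes the causal links induce the orderings move(d,a,p1) ≺ move(a,p3,d) and move(c,b,p2) ≺ move(b,p4,c), enumerates every consistent total order, and executes each one from the initial state. The dictionary encoding and the helper names are assumptions.

```python
from itertools import permutations


def move(c, y, z):
    """Ground instance of the move(c, y, z) operator from the example."""
    return {"name": f"move({c},{y},{z})",
            "pre": {f"pos({c})": y, f"clear({c})": True, f"clear({z})": True},
            "eff": {f"pos({c})": z, f"clear({y})": True, f"clear({z})": False}}


S0 = {"clear(p1)": True, "clear(p2)": True, "clear(p3)": False, "clear(p4)": False,
      "clear(a)": False, "clear(b)": False, "clear(c)": True, "clear(d)": True,
      "pos(a)": "p3", "pos(b)": "p4", "pos(c)": "b", "pos(d)": "a"}
GOAL = {"pos(a)": "d", "pos(b)": "c"}

ACTIONS = [move("d", "a", "p1"), move("c", "b", "p2"),
           move("a", "p3", "d"), move("b", "p4", "c")]
# Orderings assumed to follow from the causal links for clear(a) and clear(b).
BASE = [("move(d,a,p1)", "move(a,p3,d)"), ("move(c,b,p2)", "move(b,p4,c)")]


def achieves_goal(order):
    """Execute the actions in order from S0; fail on any unmet precondition."""
    state = dict(S0)
    for act in order:
        if any(state.get(x) != v for x, v in act["pre"].items()):
            return False
        state.update(act["eff"])
    return all(state.get(x) == v for x, v in GOAL.items())


def check(actions, before):
    """Return (all linearizations achieve GOAL, number of linearizations)."""
    orders = []
    for order in permutations(actions):
        pos = {a["name"]: i for i, a in enumerate(order)}
        if all(pos[x] < pos[y] for x, y in before):
            orders.append(order)
    return all(achieves_goal(o) for o in orders), len(orders)


print(check(ACTIONS, BASE))
print(check(ACTIONS, BASE + [("move(d,a,p1)", "move(c,b,p2)")]))
```

The first call covers the solution found with z4≠p1, the second the solution with the extra ordering constraint; the extra constraint leaves fewer total orderings, which is the sense in which the first plan is more flexible.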
Node-Selection Heuristics
● Analogy to constraint-satisfaction problems
  Ø Resolving a flaw in PSP ≈ assigning a value to a variable in a CSP
● What flaw to work on next?
  Ø Fewest Alternatives First (FAF): the flaw with the fewest resolvers
    ≈ Minimum Remaining Values (MRV) heuristic for CSPs
● To resolve the flaw, which resolver to try first?
  Ø Least Constraining Resolver (LCR): the resolver that rules out the fewest resolvers for the other flaws
    ≈ Least Constraining Value (LCV) heuristic for CSPs
● In PSP, introducing a new action introduces new flaws to resolve
  Ø The plan can get arbitrarily large; want it to be as small as possible
    • Not like CSPs, where the search tree always has a fixed depth
● Avoid introducing new actions unless necessary
● To choose between actions a and b, estimate distance from s0 to Pre(a) and Pre(b)
  Ø We’ll discuss some heuristics for that later
Example
● Example of Fewest Alternatives First:
  Ø 1st threat has two resolvers: an ordering constraint, and z4≠d
  Ø 2nd threat has three resolvers: 2 ordering constraints, and z4≠a
  Ø 3rd threat has one resolver: z4≠c
● So resolve the 3rd threat first
[Figure: partial plan with move(x3,a,z3), move(x4,b,z4), move(a,p3,d), move(b,p4,c); binding constraints z3≠c, z3≠d]
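Fewest Alternatives First amounts to taking a minimum over resolver counts. A minimal sketch on the three threats above; the dictionary layout and the placeholder names for the ordering constraints (which the slide does not spell out) are assumptions.

```python
from typing import Dict, List

# Flaw -> its resolvers. Ordering constraints are unnamed on the slide,
# so generic placeholder labels are used here.
flaws: Dict[str, List[str]] = {
    "1st threat": ["ordering constraint", "z4≠d"],
    "2nd threat": ["ordering constraint 1", "ordering constraint 2", "z4≠a"],
    "3rd threat": ["z4≠c"],
}


def fewest_alternatives_first(flaws: Dict[str, List[str]]) -> str:
    """Pick the flaw with the fewest resolvers (ties go to the first listed)."""
    return min(flaws, key=lambda flaw: len(flaws[flaw]))


print(fewest_alternatives_first(flaws))
```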
Discussion
● Problem: how to prune infinitely long paths in the search space?
  Ø Loop detection is based on recognizing states or goals we’ve seen before
  Ø In a partially ordered plan, we don’t know the states
● Can we prune a path if we see the same action more than once?
  Ø No. Sometimes we might need the same action several times in different states of the world
  Ø Example on next slide
[Figure: a looping path … s, s′, s … with actions act1, act2, act1 …]
Example
● 3-digit binary counter d3 d2 d1
    s0 = {d3=0, d2=0, d1=0}, i.e., 0 0 0
    g = {d3=1, d2=1, d1=1}, i.e., 1 1 1
● Actions to increment the counter
  • incr-xx0-to-xx1   Pre: d1=0   Eff: d1=1
  • incr-x01-to-x10   Pre: d2=0, d1=1   Eff: d2=1, d1=0
  • incr-011-to-100   Pre: d3=0, d2=1, d1=1   Eff: d3=1, d2=0, d1=0
● Plan:
                          d3 d2 d1
    s0:                    0  0  0
    incr-xx0-to-xx1   →    0  0  1
    incr-x01-to-x10   →    0  1  0
    incr-xx0-to-xx1   →    0  1  1
    incr-011-to-100   →    1  0  0
    incr-xx0-to-xx1   →    1  0  1
    incr-x01-to-x10   →    1  1  0
    incr-xx0-to-xx1   →    1  1  1
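The counter plan can be replayed mechanically to confirm that it is executable and that the same action is needed several times. A small sketch, with the dictionary encoding of the three actions as an assumption:

```python
from collections import Counter

# name -> (preconditions, effects), copied from the action definitions above
ACTIONS = {
    "incr-xx0-to-xx1": ({"d1": 0}, {"d1": 1}),
    "incr-x01-to-x10": ({"d2": 0, "d1": 1}, {"d2": 1, "d1": 0}),
    "incr-011-to-100": ({"d3": 0, "d2": 1, "d1": 1}, {"d3": 1, "d2": 0, "d1": 0}),
}

PLAN = [
    "incr-xx0-to-xx1", "incr-x01-to-x10", "incr-xx0-to-xx1", "incr-011-to-100",
    "incr-xx0-to-xx1", "incr-x01-to-x10", "incr-xx0-to-xx1",
]


def execute(state, plan):
    """Apply each action in turn; raise if a precondition does not hold."""
    state = dict(state)
    trace = [dict(state)]
    for name in plan:
        pre, eff = ACTIONS[name]
        if any(state[x] != v for x, v in pre.items()):
            raise ValueError(f"{name} is not applicable in {state}")
        state.update(eff)
        trace.append(dict(state))
    return trace


trace = execute({"d3": 0, "d2": 0, "d1": 0}, PLAN)
uses = Counter(PLAN)
for s in trace:
    print(s["d3"], s["d2"], s["d1"])
print(uses["incr-xx0-to-xx1"])
```

Here incr-xx0-to-xx1 appears four times, each time in a different state, so pruning a path on a repeated action would discard the only solution.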
A Weak Pruning Technique
● Can prune all partial plans of n or more actions, where n = |{all possible states}|
  Ø This doesn’t help very much
● I’m not sure whether there’s a good pruning technique for plan-space planning
Planning with Control Rules
Motivation:
● Given a state s and an action a
● Sometimes domain-specific tests can tell us we don’t want to use a, e.g.,
  Ø a doesn’t lead to a solution
  Ø or a is dominated
    • there’s a better solution along some other path
  Ø or a doesn’t lead to a solution that’s acceptable according to domain-specific criteria
● In such cases we can prune s (remove it from Act)
● Approach:
  Ø Write logical formulas giving conditions that states must satisfy
  Ø Prune states that don’t satisfy the formulas
Quick Review of First Order Logic
First Order Logic (FOL):
● Syntax:
  Ø atomic formulas (or atoms)
    • predicate symbol with arguments, e.g., clear(c)
    • include ‘=’ as a binary predicate symbol, e.g., loc(r1)=d1
  Ø logical connectives (∨, ∧, ¬, ⇒, ⇔), quantifiers (∀, ∃), punctuation
    • e.g., (loc(r1)=d1 ∧ ∀c clear(c)) ⇒ ¬∃c loc(c)=r1
● First Order Theory T:
  Ø “Logical” axioms and inference rules – encode logical reasoning in general
  Ø Additional “nonlogical” axioms – talk about a particular domain
  Ø Theorems: produced by applying the axioms and rules of inference
● Model: a set of objects, functions, relations that the symbols refer to
  Ø For our purposes, a model is a state of the world s
  Ø In order for s to be a model, all theorems of T must be true in s
  Ø s ⊨ loc(r1)=d1 read “s satisfies loc(r1)=d1” or “s entails loc(r1)=d1”
    • means that r1 is at d1 in the state s
Linear Temporal Logic
● Modal logic: FOL plus modal operators to express concepts that would be difficult to express within FOL
● Linear Temporal Logic (LTL):
  Ø Purpose: to express a limited notion of time
    • Infinite sequence 〈0, 1, 2, …〉 of time instants
    • Infinite sequence M = 〈s0, s1, …〉 of states of the world
  Ø Modal operators to refer to states in M:
      X f       “next f” - f is true in the next state, e.g., X loc(a)=b
      F f       “future f” - f either is true now or in some future state
      G f       “globally f” - f is true now and in all future states
      f1 U f2   “f1 until f2” - f2 is true now or in a future state, and f1 is true until then
  Ø Propositional constant symbols True and False
    • Instead of T and F, to avoid confusion with the F operator
Linear Temporal Logic (continued)
● Quantifiers cause problems with computability
  Ø Suppose f(x) is true for infinitely many values of x
  Ø Problem evaluating truth of ∀x f(x) and ∃x f(x)
● Bounded quantifiers
  Ø Let g(x) be such that {x | g(x) is true} is finite and easily computed
      ∀[x: g(x)] f(x)
        ▸ means ∀x (g(x) ⇒ f(x))
        ▸ expands into f(x1) ∧ f(x2) ∧ … ∧ f(xn)
      ∃[x: g(x)] f(x)
        ▸ means ∃x (g(x) ∧ f(x))
        ▸ expands into f(x1) ∨ f(x2) ∨ … ∨ f(xn)
Notation
● We can use state-variable assignments as logical propositions in LTL formulas
  Ø G (∀[x: clear(x)=T] final(x)=T ⇒ X(clear(x)=T ∨ ∃[y: loc(y)=x] final(y)=T))
● For Boolean state variables, simpler to write them as logical propositions
  • Instead of clear(x)=T, just write clear(x)
  • Instead of clear(x)=F, write ¬clear(x)
  Ø G (∀[x: clear(x)] final(x) ⇒ X(clear(x) ∨ ∃[y: loc(y)=x] final(y)))
Example
● The “container stacking” domain
  Ø Based on a classical planning domain called the “blocks world”
pickup(x)
  pre: loc(x)=floor, clear(x), holding=nil
  eff: loc(x)=crane, ¬clear(x), holding=x
putdown(x)
  pre: holding=x
  eff: holding=nil, loc(x)=floor, clear(x)
unstack(x,y)
  pre: loc(x)=y, clear(x), holding=nil
  eff: loc(x)=crane, ¬clear(x), holding=x, clear(y)
stack(x,y)
  pre: holding=x, clear(y)
  eff: holding=nil, ¬clear(y), loc(x)=y, clear(x)
[Figure: two example states of containers a, b, c, d, e]
  Ø b on the floor: clear(e), loc(e)=d, loc(d)=floor, clear(c), loc(c)=a, loc(a)=floor, clear(b), loc(b)=floor, holding=nil
  Ø b held by the crane: clear(e), loc(e)=d, loc(d)=floor, clear(c), loc(c)=a, loc(a)=floor, ¬clear(b), loc(b)=crane, holding=b
Models for Planning with LTL
● A model is a pair M = (M, si)
  Ø M = 〈s0, s1, …〉 is a sequence of states
  Ø si is the i’th state in M
● For planning, we also have a goal g = {g1, …, gn}
  Ø To reason about it, add a modal operator called “Goal”
    • Not part of ordinary LTL, but I’ll call it LTL anyway
  Ø In an LTL formula, use “Goal(gi)” to refer to part of g
    • ((M,si), g) ⊨ Goal(gi) iff g ⊨ gi
● Planning problem:
  Ø Initial state s0, a goal g, control formula f
  Ø Find a plan π = 〈a1, …, an〉 that generates a sequence of states M = 〈s0, s1, …, sn〉 such that M ⊨ f and sn ⊨ g
    • That’s not quite correct
    • Do you know why?
Models for Planning with LTL
● M needs to be an infinite sequence
● Kluge: assume that the final state repeats infinitely after the plan ends
● Planning problem:
  Ø Initial state s0, a goal g, control formula f
  Ø Find a plan π = 〈a1, …, an〉 that generates a sequence of states M = 〈s0, s1, …, sn, sn, sn, …〉 such that M ⊨ f and sn ⊨ g
Examples
● Suppose M = 〈s0, s1, …〉
● (M,s2) ⊨ XX loc(a)=b
  Ø a is on b in state s4
● Abbreviation: can omit the state, it defaults to s0
  Ø M ⊨ XX loc(a)=b means (M,s0) ⊨ XX loc(a)=b
● Since loc(a)=b has no modal operators
  Ø (M,s2) ⊨ loc(a)=b is equivalent to s2 ⊨ loc(a)=b
● M ⊨ G holding ≠ c
  Ø in every state in M, we aren’t holding c
● M ⊨ G (clear(b) ⇒ (clear(b) U loc(a)=b))
  Ø whenever we enter a state in which b is clear, b remains clear until a is on b
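Because the final state repeats forever, these satisfaction relations can be checked on the finite prefix 〈s0, …, sn〉 alone: every position past n looks exactly like position n. The sketch below is one way to do that, assuming formulas are encoded as nested tuples and states as dictionaries; the encoding and the five-state sequence are illustrative assumptions, not part of the slides.

```python
from typing import Dict, List


def holds(f, M: List[Dict[str, str]], i: int = 0) -> bool:
    """Does (M, s_i) satisfy f, where M's last state repeats forever?"""
    last = len(M) - 1
    i = min(i, last)  # every position beyond the end behaves like the last state
    if isinstance(f, bool):
        return f
    op = f[0]
    if op == "atom":  # ("atom", variable, value)
        return M[i].get(f[1]) == f[2]
    if op == "not":
        return not holds(f[1], M, i)
    if op == "and":
        return holds(f[1], M, i) and holds(f[2], M, i)
    if op == "or":
        return holds(f[1], M, i) or holds(f[2], M, i)
    if op == "X":
        return holds(f[1], M, i + 1)
    if op == "F":
        return any(holds(f[1], M, j) for j in range(i, last + 1))
    if op == "G":
        return all(holds(f[1], M, j) for j in range(i, last + 1))
    if op == "U":  # f[2] holds at some j >= i and f[1] holds from i up to j
        return any(holds(f[2], M, j) and all(holds(f[1], M, k) for k in range(i, j))
                   for j in range(i, last + 1))
    raise ValueError(f"unknown operator {op!r}")


# Hypothetical state sequence: a is picked up and ends on b in state s4.
M = [
    {"loc(a)": "floor", "holding": "nil"},
    {"loc(a)": "floor", "holding": "nil"},
    {"loc(a)": "crane", "holding": "a"},
    {"loc(a)": "crane", "holding": "a"},
    {"loc(a)": "b", "holding": "nil"},
]
on_ab = ("atom", "loc(a)", "b")
print(holds(("X", ("X", on_ab)), M, 2))                     # looks at s4
print(holds(("G", ("not", ("atom", "holding", "c"))), M))   # never holding c
```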
TLPlan
● Nondeterministic forward search
  Ø s = current state, f = control formula, g = goal
● If s satisfies g then we’re done
● Otherwise, think about what kind of plan we need
  Ø It must generate a sequence of states M = 〈s, s+, s++, …〉 that satisfies f
● Compute a formula f+ such that (M,s) ⊨ f iff (M,s+) ⊨ f+
● Fail if f+ = False
  Ø No matter what s+ is, (M,s+) can’t satisfy f+
● Fail if no applicable actions
● Otherwise, nondeterministically choose one, compute s+, and call TLPlan with s+ and f+
TLPlan (s, f, g)
  if s satisfies g then return ⟨ ⟩
  f+ ← Progress (f, s)
  if f+ = False then return failure
  A ← {actions applicable to s}
  if A is empty then return failure
  nondeterministically choose a ∈ A
  π+ ← TLPlan (γ(s,a), f+, g)
  if π+ ≠ failure then return a.π+
  return failure
Progression
● Procedure Progress(f,s): compute the formula f+ that M+ must satisfy
  ◆ Case:
     1. f contains no temporal ops : f+ ← True if s ⊨ f, False otherwise
     2. f = f1 ∧ f2 : f+ ← Progress(f1, s) ∧ Progress(f2, s)
     3. f = f1 ∨ f2 : f+ ← Progress(f1, s) ∨ Progress(f2, s)
     4. f = ¬f1 : f+ ← ¬Progress(f1, s)
     5. f = X f1 : f+ ← f1
     6. f = F f1 : f+ ← Progress(f1, s) ∨ f
     7. f = G f1 : f+ ← Progress(f1, s) ∧ f
     8. f = f1 U f2 : f+ ← Progress(f2, s) ∨ (Progress(f1, s) ∧ f)
     9. f = ∀[x:g(x)] h(x) : f+ ← Progress(h(x1), s) ∧ … ∧ Progress(h(xn), s)
    10. f = ∃[x:g(x)] h(x) : f+ ← Progress(h(x1), s) ∨ … ∨ Progress(h(xn), s)
  ◆ simplify f+ and return it
     • False ∧ h = False, True ∧ h = h, ¬False = True, etc.
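The ten cases translate fairly directly into code. The following is a sketch under stated assumptions: formulas are nested tuples, a bounded quantifier carries two Python callables (one returning the values x with s ⊨ g(x), one building h(x)), and case 1 is folded into the structural recursion, since atoms progress to True or False and the simplifier collapses connectives over constants. None of these encoding choices come from the slides.

```python
def simplify(f):
    """Apply False ∧ h = False, True ∧ h = h, ¬False = True, and so on."""
    if isinstance(f, bool):
        return f
    op = f[0]
    if op == "not" and isinstance(f[1], bool):
        return not f[1]
    if op == "and":
        a, b = f[1], f[2]
        if a is False or b is False:
            return False
        if a is True:
            return b
        if b is True:
            return a
    if op == "or":
        a, b = f[1], f[2]
        if a is True or b is True:
            return True
        if a is False:
            return b
        if b is False:
            return a
    return f


def progress(f, s):
    """Return f+ such that (M, s) satisfies f iff (M, s+) satisfies f+."""
    if isinstance(f, bool):
        return f
    op = f[0]
    if op == "atom":                       # case 1, at the leaves
        return s.get(f[1]) == f[2]
    if op == "not":                        # case 4
        return simplify(("not", progress(f[1], s)))
    if op in ("and", "or"):                # cases 2 and 3
        return simplify((op, progress(f[1], s), progress(f[2], s)))
    if op == "X":                          # case 5
        return f[1]
    if op == "F":                          # case 6
        return simplify(("or", progress(f[1], s), f))
    if op == "G":                          # case 7
        return simplify(("and", progress(f[1], s), f))
    if op == "U":                          # case 8
        keep = simplify(("and", progress(f[1], s), f))
        return simplify(("or", progress(f[2], s), keep))
    if op in ("forall", "exists"):         # cases 9 and 10
        join, result = ("and", True) if op == "forall" else ("or", False)
        for x in f[1](s):                  # f[1](s): values x with s satisfying g(x)
            result = simplify((join, result, progress(f[2](x), s)))
        return result
    raise ValueError(f"unknown operator {op!r}")


# State used in the later examples: c on b, with a and b on the floor.
s = {"loc(a)": "floor", "loc(b)": "floor", "loc(c)": "b",
     "clear(a)": True, "clear(b)": False, "clear(c)": True}
clear_set = lambda st: [x for x in "abc" if st.get(f"clear({x})")]
on_floor_next = lambda x: ("X", ("atom", f"loc({x})", "floor"))

print(progress(("G", ("atom", "loc(a)", "c")), s))
print(progress(("forall", clear_set, on_floor_next), s))
```

Progressing G loc(a)=c in this state gives False, and progressing ∀[x: clear(x)] X loc(x)=floor gives the conjunction loc(a)=floor ∧ loc(c)=floor.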
Progressing ordinary formulas
● Procedure Progress(f,s): case 1 applies
Example:
● f = loc(a)=b
  ◆ if a is currently on b, then True (every possible M+ is OK)
  ◆ otherwise False (there is no M+ that’s OK)
Progressing X
● Procedure Progress(f,s): case 5 applies
Examples:
● f = X loc(a)=b
  ◆ in the next state, a must be on b
  ◆ f+ = loc(a)=b
● f = XX loc(a)=b
  ◆ two states from now, a must be on b
  ◆ f+ = X loc(a)=b
Progressing ∧
● Procedure Progress(f,s): case 2 applies
Example:
● f = clear(c) ∧ X loc(a)=c
  Ø c must be clear now, and a must be on c in the next state
● f+ = Progress(clear(c), s) ∧ Progress(X loc(a)=c, s)
     = True ∧ loc(a)=c
     = loc(a)=c
[Figure: current state s of containers a, b, c]
Progressing G
● Procedure Progress(f,s): case 7 applies
Example:
● f = G loc(a)=c
  Ø a must be on c now and must stay there in the future
● f+ = Progress(loc(a)=c, s) ∧ f
     = False ∧ G loc(a)=c
     = False
[Figure: current state s of containers a, b, c]
Progressing U
● Procedure Progress(f,s): case 8 applies
Example:
● f = loc(a)=b U clear(c)
  Ø c must be clear, or a must be on b and stay there until c is clear
● f+ = Progress(clear(c), s) ∨ [Progress(loc(a)=b, s) ∧ f]
     = True ∨ [False ∧ (loc(a)=b U clear(c))]
     = True
[Figure: current state s of containers a, b, c]
Progressing ∀
● Procedure Progress(f,s): case 9 applies
  Ø xi is the i’th element of {x | s ⊨ g(x)}
Example:
● f = ∀[x: clear(x)] X loc(x)=floor
  Ø {x | clear(x)} = {a, c}
● f+ = Progress(X loc(a)=floor, s) ∧ Progress(X loc(c)=floor, s)
     = loc(a)=floor ∧ loc(c)=floor
[Figure: current state s of containers a, b, c]
Progressing ∃
● Procedure Progress(f,s): case 10 applies
  Ø xi is the i’th element of {x | s ⊨ g(x)}
Example:
● f = ∃[x: clear(x)] X loc(x)=floor
  Ø {x | clear(x)} = {a, c}
● f+ = Progress(X loc(a)=floor, s) ∨ Progress(X loc(c)=floor, s)
     = loc(a)=floor ∨ loc(c)=floor
[Figure: current state s of containers a, b, c]
51 Dana Nau and Vikas Shivashankar: Lecture slides for Automated Planning and Ac0ng Updated 2/5/15
TLPlan
● Nondeterministic forward search
Ø s = current state, f = control formula, g = goal
● If s satisfies g then we’re done
● Otherwise, think about what kind of plan we need
Ø It must generate a sequence of states M = ⟨s, s⁺, s⁺⁺, …⟩ that satisfies f
● Compute a formula f⁺ such that (M,s) ⊨ f iff (M,s⁺) ⊨ f⁺
● Fail if f⁺ = False
Ø No matter what s⁺ is, (M,s⁺) can’t satisfy f⁺
● Fail if no applicable actions
● Otherwise, nondeterministically choose one, compute s⁺, and call TLPlan with s⁺ and f⁺
TLPlan (s, f, g)
    if s satisfies g then return ⟨ ⟩
    f⁺ ← Progress (f, s)
    if f⁺ = False then return failure
    A ← {actions applicable to s}
    if A is empty then return failure
    nondeterministically choose a ∈ A
    π⁺ ← TLPlan (γ (s,a), f⁺, g)
    if π⁺ ≠ failure then return a.π⁺
    return failure
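A minimal Python sketch of this procedure follows, with several assumptions that are not in the slides: the nondeterministic choice is replaced by depth-first backtracking over the actions, a depth bound is added so the search terminates, failure is represented by None, and the progression function is passed in as a parameter. The toy counter domain and the names tlplan, toy_progress, and ALWAYS_NONNEG are hypothetical.

```python
def tlplan(s, f, is_goal, actions, progress, depth=10):
    """Return a plan (list of action names), or None on failure."""
    if is_goal(s):
        return []                        # empty plan
    if depth == 0:                       # assumption: depth bound, not in the pseudocode
        return None
    f_plus = progress(f, s)
    if f_plus is False:                  # no continuation from s can satisfy f
        return None
    for name, step in actions:           # backtracking stands in for the nondeterministic choice
        s_next = step(s)                 # None means "not applicable in s"
        if s_next is None:
            continue
        rest = tlplan(s_next, f_plus, is_goal, actions, progress, depth - 1)
        if rest is not None:
            return [name] + rest
    return None


# Toy domain: the state is an integer counter and the control formula says
# "the counter is always non-negative".
ALWAYS_NONNEG = "G x>=0"


def toy_progress(f, x):
    # Progressing G(x >= 0): violated now, or the same obligation carries over.
    return f if x >= 0 else False


actions = [("dec", lambda x: x - 1), ("inc", lambda x: x + 1)]
plan = tlplan(0, ALWAYS_NONNEG, lambda x: x == 2, actions, toy_progress, depth=3)
print(plan)  # ['inc', 'inc']
```

Here the "dec" branch is tried first and pruned as soon as the progressed formula becomes False, which is the role the control formula plays in TLPlan.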
Example Planning Problem
● s = {loc(a)=floor, loc(b)=floor, clear(a), clear(c), loc(c)=b}
● g = {loc(b)=a}
● f = G ∀[x: clear(x)] (loc(x)≠floor ∨ ∃[y: Goal(loc(x)=y)] ∨ X holding≠x)
Ø never pick up a clear container from the floor unless it needs to be elsewhere
● Run the TLPlan algorithm
● Compute f⁺
Ø Return failure if f⁺ = False
● Two applicable actions: pickup(a) and unstack(c,b)
Ø Which one to use?
● Try using pickup(a)
Ø Call TLPlan recursively with γ (s, pickup(a)) and f⁺
● If TLPlan returns failure, then try unstack(c,b)
[Figure: initial state s0 (a on the floor, c on b) and goal g (b on a)]
Example Planning Problem
● s = {loc(a)=floor, loc(b)=floor, clear(a), clear(c), loc(c)=b}
● g = {loc(b)=a}
● f = G ∀[x: clear(x)] (loc(x)≠floor ∨ ∃[y: Goal(loc(x)=y)] ∨ X holding≠x)
Ø Write f = G f1, where f1 = ∀[x: clear(x)] h(x) and h(x) = (loc(x)≠floor ∨ ∃[y: Goal(loc(x)=y)] ∨ X holding≠x)
● f⁺ = Progress(G f1, s)
     = Progress(f1, s) ∧ f
     = Progress(∀[x: clear(x)] h(x), s) ∧ f
     = Progress(h(a) ∧ h(c), s) ∧ f
     = Progress(h(a), s) ∧ Progress(h(c), s) ∧ f
• Progress(h(a), s) = Progress(loc(a)≠floor ∨ ∃[y: Goal(loc(a)=y)] ∨ X holding≠a, s) = False ∨ False ∨ holding≠a = holding≠a
• Progress(h(c), s) = Progress(loc(c)≠floor ∨ ∃[y: Goal(loc(c)=y)] ∨ X holding≠c, s) = True ∨ False ∨ holding≠c = True
● f⁺ = holding≠a ∧ True ∧ f = holding≠a ∧ f
● Two applicable actions: pickup(a) and unstack(c,b)
Ø s1 = γ (s, pickup(a)): Progress(f⁺, s1) = False ⇒ backtrack
Ø s2 = γ (s, unstack(c,b)): Progress(f⁺, s2) = f ⇒ keep going
[Figure: initial state s0 (a on the floor, c on b) and goal g (b on a)]
Container-Stacking Problems
● Define an inferred state variable final(x) ∈ Booleans, where x is a container
• Never directly changed by any planning operator
• Produced by logical inference from the other state variables
● Want final(x) to mean x is at the top of a stack that we’re finished moving
Ø Neither x nor the containers below x will ever need to be moved
● Axioms to support this:
Ø final(x) ⇔ clear(x) ∧ ¬Goal(holding=x) ∧ finalbelow(x)
Ø finalbelow(x) ⇔
    (loc(x)=floor ∧ ¬∃[y: Goal(loc(x)=y)])
    ∨ ∃[y: loc(x)=y] [ ¬Goal(loc(x)=floor) ∧ ¬Goal(holding=y) ∧ ¬Goal(clear(y))
        ∧ ∀[z: Goal(loc(x)=z)] (z=y) ∧ ∀[z: Goal(loc(z)=y)] (z=x) ∧ finalbelow(y) ]
Ø nonfinal(x) ⇔ clear(x) ∧ ¬final(x)
Control Rules
Try TLPlan with three different control formulas:
(1) If x is final, only put a container y onto x if it will make y final:
Ø G ∀[x: clear(x)] (final(x) ⇒ X [clear(x) ∨ ∃[y: loc(y)=x] final(y)])
(2) Like (1), but also says never to put anything onto a container that isn’t final:
Ø G ∀[x: clear(x)] [
    (final(x) ⇒ X [clear(x) ∨ ∃[y: loc(y)=x] final(y)])
    ∧ (nonfinal(x) ⇒ X ¬∃[y: loc(y)=x]) ]
(3) Like (2), but also says never to pick up a nonfinal container from the floor unless you can put it where it will be final:
Ø G ∀[x: clear(x)] [
    (final(x) ⇒ X [clear(x) ∨ ∃[y: loc(y)=x] final(y)])
    ∧ (nonfinal(x) ⇒ X ¬∃[y: loc(y)=x])
    ∧ (onfloor(x) ∧ ∃[y: Goal(loc(x)=y)] ¬final(y) ⇒ X ¬holding(x)) ]
Domain-Specific Planning Algorithms
● Sometimes we can write highly efficient planning algorithms for a specific class of problems
Ø Use special properties of that class
● For container-stacking problems with n containers, we can easily get a solution of length O(n)
Ø Move all containers to the floor, then build up stacks from the bottom
● With additional domain-specific knowledge, can do even better …
[Figure: initial state s0 (e on a; d on c on b) and goal g (a on b on c; d on e)]
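The O(n) strategy (move everything to the floor, then build the stacks from the bottom) can be sketched as follows. This is an illustration under assumptions that are not in the slides: a state is a dict mapping each container to what it sits on, the floor has unlimited room, containers missing from the goal are treated as belonging on the floor, and the function name floor_then_build is hypothetical. The s0 and goal dictionaries are read off the figure, so they are an assumption as well.

```python
FLOOR = "floor"


def floor_then_build(loc, goal):
    """Return a list of (container, from, to) moves, or None if the goal is cyclic."""
    loc = dict(loc)
    plan = []

    # Phase 1: unstack. Repeatedly move every clear container that is not
    # already on the floor down to the floor.
    while True:
        tops = sorted(c for c in set(loc) - set(loc.values()) if loc[c] != FLOOR)
        if not tops:
            break
        for c in tops:
            plan.append((c, loc[c], FLOOR))
            loc[c] = FLOOR

    # Phase 2: build up. A container is placed once its goal support is in
    # its final position.
    placed = {c for c in loc if goal.get(c, FLOOR) == FLOOR}
    pending = sorted(set(loc) - placed)
    while pending:
        ready = [c for c in pending if goal[c] in placed]
        if not ready:
            return None                  # inconsistent (cyclic) goal
        c = ready[0]
        plan.append((c, FLOOR, goal[c]))
        loc[c] = goal[c]
        placed.add(c)
        pending.remove(c)
    return plan


s0 = {"a": FLOOR, "e": "a", "b": FLOOR, "c": "b", "d": "c"}
goal = {"a": "b", "b": "c", "c": FLOOR, "d": "e", "e": FLOOR}
naive_plan = floor_then_build(s0, goal)
print(len(naive_plan))  # 6
```

Each container is moved at most twice, so the plan has at most 2n moves; on this instance it uses 6.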
Container-Stacking Algorithm
● c needs moving if any of the following holds:
◆ s contains loc(c)=d and g contains loc(c)=e, where e≠d
◆ s contains loc(c)=d and g contains loc(b)=d, where b≠c
◆ s contains loc(c)=d and d needs moving

loop
    if ∃ a clear container c that needs moving
       & we can move c to a position d where c won’t need moving
    then move c to d
    else if ∃ a clear container c that needs moving
    then move c to any clear pallet
    else if the goal is satisfied
    then return success
    else return failure
repeat

● The algorithm generates the following sequence of actions:
Ø ⟨move(e,a,floor), move(d,c,e), move(c,b,floor), move(b,floor,c), move(a,floor,b)⟩
[Figure: initial state s0 (e on a; d on c on b) and goal g (a on b on c; d on e)]
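The loop above can be sketched in Python. This is a hedged illustration rather than the authors' code, and it rests on several assumptions: the floor stands in for "any clear pallet" and has unlimited room, so the second and third needs-moving conditions are not applied to it; ties are broken alphabetically; and containers already on the floor are skipped in the fallback step to avoid no-op moves. The names needs_moving and stack_containers, and the goal dictionary (including loc(c)=floor and loc(e)=floor), are hypothetical and chosen to match the figure.

```python
FLOOR = "floor"


def needs_moving(c, loc, goal):
    """The three 'needs moving' conditions, with the floor treated specially."""
    d = loc[c]
    if c in goal and goal[c] != d:       # c is supposed to be somewhere else
        return True
    if d == FLOOR:                       # assumption: the floor never has to be vacated
        return False
    if any(b != c and target == d for b, target in goal.items()):
        return True                      # some other container belongs on d
    return needs_moving(d, loc, goal)    # whatever c sits on must move first


def stack_containers(loc, goal):
    """Return a list of (container, from, to) moves, or None on failure."""
    loc = dict(loc)
    plan = []
    while True:
        clear = sorted(set(loc) - set(loc.values()))
        movable = [c for c in clear if needs_moving(c, loc, goal)]
        move = None
        # Preferred step: move c to a position where it won't need moving again.
        for c in movable:
            for d in [x for x in clear if x != c] + [FLOOR]:
                if d != loc[c] and not needs_moving(c, {**loc, c: d}, goal):
                    move = (c, loc[c], d)
                    break
            if move:
                break
        # Fallback step: park a container that needs moving on the floor.
        if move is None:
            for c in movable:
                if loc[c] != FLOOR:
                    move = (c, loc[c], FLOOR)
                    break
        if move is None:
            solved = all(loc[c] == target for c, target in goal.items())
            return plan if solved else None
        plan.append(move)
        loc[move[0]] = move[2]


s0 = {"a": FLOOR, "e": "a", "b": FLOOR, "c": "b", "d": "c"}
goal = {"a": "b", "b": "c", "c": FLOOR, "d": "e", "e": FLOOR}
plan = stack_containers(s0, goal)
print(plan)
# [('e', 'a', 'floor'), ('d', 'c', 'e'), ('c', 'b', 'floor'),
#  ('b', 'floor', 'c'), ('a', 'floor', 'b')]
```

Under these assumptions the sketch reproduces the five-move sequence listed above.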
Properties of the Algorithm
● Sound, complete, guaranteed to terminate on all container-stacking problems
● Runs in time O(n³)
Ø Can be modified (Slaney & Thiébaux) to run in time O(n)
● Often finds optimal (shortest) solutions
● But sometimes only near-optimal
Ø For container-stacking problems, PLAN-LENGTH is NP-complete
● I think what TLPlan does (with its 3rd control rule) is roughly similar to this algorithm
Using Deterministic Domain Models
● For planning with deterministic domain models, we made some assumptions that aren’t necessarily true

Assumption                 Problem
Ø Static world             The world may change dynamically
Ø Perfect information      We almost never have all of the information
Ø Instantaneous actions    Actions take time; there may be time constraints
Ø Correct predictions      Action models usually are just approximations
Ø Determinism              Action model may just be the “nominal case”
Ø Flat search space        There may be further lower-level refinements

● If enough of the assumptions are approximately true, the plans may still be useful
Ø But can’t just take a plan π and start executing it
Ø Need to monitor π’s execution, detect problems as they occur, recover from them
Acting and Planning
● Interaction is roughly as follows
Ø loop
• from the planner, get the latest plan or partial plan
• perform one or more actions, monitoring the current state
▸ if problems occur, replan while performing some preplanned recovery actions
● Performance could involve lower-level refinement rather than direct execution
Ø The next chapter contains lots of details
[Figure: interaction among Planning, Acting, and Performance]
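The interaction loop can be sketched as a small Python function. This is only a sketch under assumptions: "monitoring" is reduced to comparing the observed state with the state the deterministic model predicts, "replanning" calls the planner again from the observed state, and preplanned recovery actions are omitted. The names act_and_plan, planner, predict, and execute, and the one-dimensional corridor domain with a single slip, are hypothetical.

```python
def act_and_plan(state, is_goal, planner, predict, execute, max_steps=50):
    """Interleave acting and planning; return (final state, actions executed, replans)."""
    plan = list(planner(state))
    executed, replans = [], 0
    for _ in range(max_steps):
        if is_goal(state):
            break
        if not plan:                      # plan exhausted before reaching the goal
            plan = list(planner(state))
            replans += 1
            if not plan:
                break
        action = plan.pop(0)
        expected = predict(state, action)  # what the deterministic model says
        state = execute(state, action)     # what actually happened
        executed.append(action)
        if state != expected:              # monitoring detected a problem
            plan = list(planner(state))
            replans += 1
    return state, executed, replans


# Toy corridor: the actor walks right from cell 0 to cell 3. The model is
# deterministic, but the first attempt to leave cell 1 slips and does nothing.
slipped = {"done": False}


def execute(pos, action):
    if pos == 1 and not slipped["done"]:
        slipped["done"] = True
        return pos
    return pos + 1


final, executed, replans = act_and_plan(
    0,
    is_goal=lambda pos: pos == 3,
    planner=lambda pos: ["right"] * (3 - pos),
    predict=lambda pos, action: pos + 1,
    execute=execute,
)
print(final, len(executed), replans)  # 3 4 1
```

The actor reaches the goal after four actions and one replan, even though the model's prediction was wrong once.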
Acting and Planning
● What kind of information should the planner provide?
Ø Depends on the planning domain and the actor
● Some possibilities
Ø Complete plan, as in the algorithms we’ve discussed
• But usually for a subproblem
▸ example on next slide
Ø Partial plan
• e.g., receding horizon
Ø Several partial plans, with relative evaluations of each
• e.g., game-tree search
[Figure: planning stage (Predict s → s′, Search) and acting stage]
[Figure: overall problem decomposed into subproblems sub1, sub2, sub3]
Example
● Killzone 2
Ø “First-person shooter” game
● Special-purpose AI planner
Ø Plans enemy actions at the squad level
• Subproblems; solution plans are maybe 4–6 actions long
Ø Different planning algorithm than what we’ve discussed so far, but it uses a deterministic domain model
Ø Quickly generates a plan that would work if nothing interferes
Ø Replans several times per second as the world changes
● Why it worked:
Ø Don’t want to get the best possible plan
Ø Need actions that appear believable and consistent to human users
Ø Need them very quickly