Cost Based Satisficing Search Considered Harmful
William Cushing, J. Benton, Subbarao Kambhampati
Performance Bug: ε-Cost ‘Trap’
High cost variance: ε = $0.01 / $100.00
  Board / Fly
  Load / Drive
  Labor / Precious Material
  Mode Switch / Machine Operation
Search depth:
  0^-1 (heuristic-error) = ∞
  ε^-1 (heuristic-error) = huge
Optimal: cost = $1000.00, size = 100,000
Runner-up: cost = $1000.10, size = 20
Trillions of nodes expanded: when does depth 20 get exhausted?
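A minimal sketch of the trap, not taken from the slides: a hypothetical toy graph in which one expensive edge reaches the goal directly, while a chain of cheap edges never reaches it. Costs are integer cents (1 and 10,000, matching ε = $0.01 / $100.00). The graph and all function names are illustrative assumptions. With h = 0, ordering the queue by cost expands roughly 1/ε chain nodes before the goal; ordering by size (every edge counts as 1) does not.

```python
import heapq
from itertools import count

GOAL = "goal"

def successors(state, eps_cost, big_cost):
    """Hypothetical toy graph: a cheap chain 0 -> 1 -> 2 -> ... plus one big edge 0 -> GOAL."""
    if state == GOAL:
        return []
    edges = [(state + 1, eps_cost)]
    if state == 0:
        edges.append((GOAL, big_cost))
    return edges

def expansions_until_goal(eps_cost, big_cost, size_based=False):
    """Uniform-cost search (h = 0); returns the number of nodes popped, goal included."""
    tie = count()                      # FIFO tie-breaking among equal keys
    frontier = [(0, next(tie), 0)]     # (g, tie-breaker, state)
    closed = set()
    expanded = 0
    while frontier:
        g, _, state = heapq.heappop(frontier)
        if state in closed:
            continue
        closed.add(state)
        expanded += 1
        if state == GOAL:
            return expanded
        for child, cost in successors(state, eps_cost, big_cost):
            step = 1 if size_based else cost   # size-based: every edge counts 1
            heapq.heappush(frontier, (g + step, next(tie), child))
    return None

by_cost = expansions_until_goal(1, 10_000)
by_size = expansions_until_goal(1, 10_000, size_based=True)
print(by_cost, by_size)
```

The cost-ordered run has to exhaust every chain node cheaper than the goal edge; the size-ordered run sees the goal at depth 1.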
Outline
Inevitability of ε-cost Traps
  Cycle Trap
  Branching Trap
  Travel Domain
If Cost is Bad, then what?
  Surrogate Search
  Simple First: Size
  Then: Cost-Sensitive Size-Based Search
Cycle Trap
Effective search graph: g' = f = g + h
Edge weights = changes in f
  0 = ideal
  - = over-estimated earlier, or under-estimating now
  + = under-estimated earlier, or over-estimating now
Simple subgraph
  Heuristic plateau
  1 choice: which way?
[Figure: simple subgraph with edge weights 1, 1, 1, 1]
Cycle Trap
[Figure: subgraph with edge weights 2, 2, 0]
Even providing a heuristic perfect for all but 1 edge…
Cost-based search fails
Reversible operators are one way in which heuristic penalty can end up being bounded from above
  “Unbounded f along unbounded paths”, to have completeness, also forces a heuristic upper bound
Fantastically over-estimating (weighting) could help, but:
  Suppose the right edge actually costs 1 - ε
  Then both directions would have identical heuristic value
  Weighting would be fruitless
Branching Trap
x = # of 1-cost children
y = # of ε-cost children
d/2 + dε/2 = C
d = 2C/(1+ε)
x + y^(1/ε) = ways to spend 1
(x + y^(1/ε))^C = ways to spend C
(x+y)^d = # of paths at same depth
(x+y)^(2C/(1+ε)) << (x + y^(1/ε))^C
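To get a feel for the gap between the two counts, the formulas can be evaluated in log10. This is a sketch under stated assumptions: the function names and the sample values (x = 2, y = 2, C = 10, 1/ε = 100) are illustrative and do not come from the slides, and 1/ε is passed as an integer so that y^(1/ε) stays an exact Python integer.

```python
import math

def log10_paths_at_depth(x, y, C, inv_eps):
    """log10 of (x+y)^d, with d = 2C/(1+eps) and eps = 1/inv_eps."""
    eps = 1 / inv_eps
    d = 2 * C / (1 + eps)
    return d * math.log10(x + y)

def log10_ways_to_spend(x, y, C, inv_eps):
    """log10 of (x + y^(1/eps))^C; y**inv_eps is an exact (possibly huge) integer."""
    return C * math.log10(x + y ** inv_eps)

# Illustrative values only.
same_depth = log10_paths_at_depth(2, 2, 10, 100)
same_cost = log10_ways_to_spend(2, 2, 10, 100)
print(f"paths at same depth: about 10^{same_depth:.1f}")
print(f"ways to spend C:     about 10^{same_cost:.1f}")
```

Working in logarithms keeps the comparison readable even when the raw counts have hundreds of digits.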
Travel
[Figure: map with locations A, B, 1, 2 and R]
Straight Fly = 10,000 cents
Diag. Fly = 7,000 cents
Board/Debark = 1 cent
Various Solutions:
  Cheapest Plan
  Fastest Plan
  Smallest Plan
Travel – Decent Start
[Figure: map with locations A, B, 1, 2 and R]
g = 1 fly + 4 board + 1 debark
h = 2 fly + 4 debark + 1 board
f ~ 3 fly
Travel – Begin Backtracking
[Figure: map with locations A, B, 1, 2 and R]
g = 2 fly + 4 board + 1 debark
h = 2 fly + 4 debark + 1 board
f ~ 4 fly
Travel – Backtracking
[Figure: map with locations A, B, 1, 2 and R]
g = 1 fly + 4 board + 3 debark
h = 2 fly + 3 debark + 2 board
Fly 1-2-B
Then teleport passengers
Travel – Backtracking
[Figure: map with locations A, B, 1, 2 and R]
g = 1 fly + 6 board + 3 debark
h = 2 fly + 4 debark + 1 board
8 people:
  2^4 · 3^4 = 1296
  (1+0)^8 = 1, (1+1)^8 = 256, (1+2)^8 = 6561, (1+4)^8 = 390625
Travel Calculations
4 planes located in 5 cities
  5^4 = 625 plane assignments
4k passengers, located in 9 places
  9^(4k) passenger assignments globally
Cheap subspace
  Product over each city: (1 + city-local planes)^(city-local passengers)
  e.g., (1+2)^4 (1+1)^4 = 1296
Stop exploring
  Large evaluation
  Exhaustion of possibilities
Cost-based search exhausts cheap subspaces
  Eventually
  Assuming an upper bound on the heuristic
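The counts on this slide can be checked with a few lines of arithmetic. This is a sketch; the function names and the (planes, passengers) pair encoding of a city are assumptions made for illustration.

```python
from math import prod

def plane_assignments(num_planes, num_cities):
    """Each plane sits in one of the cities."""
    return num_cities ** num_planes

def cheap_subspace_size(cities):
    """cities: one (local_planes, local_passengers) pair per city.

    Each passenger is either on the ground or aboard one of the planes in
    the same city, so a city contributes (1 + planes) ** passengers states.
    """
    return prod((1 + planes) ** passengers for planes, passengers in cities)

print(plane_assignments(4, 5))                 # 5^4
print(cheap_subspace_size([(2, 4), (1, 4)]))   # (1+2)^4 (1+1)^4
print(cheap_subspace_size([(4, 8)]))           # (1+4)^8
```

All of these states differ only by board/debark actions costing 1 cent each, which is why a cost-ordered search visits them before it considers another flight.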
Outline
Inevitability of ε-cost Traps
  Cycle Trap
  Branching Trap
  Travel Domain
If Cost is Bad, then what?
  Surrogate Search
  Simple First: Size
  Then: Cost-Sensitive Size-Based Search
Surrogate Search
Replace ill-behaved Objective with a well-behaved Evaluation
  Tradeoff: Trap Defense versus Quality Focus
Evaluation Function: “Go no further”
  Force ε ~ 1
  Make g and f grow fast enough: in o(size)
  Normalize costs for hybrid methods
Heuristic: “Go this way”
  Calculate h in the same units as g
Retain true Objective
  branch-and-bound
  duplicates elimination + re-expansion
Re-expansion of duplicates should be done carefully
  Can wait till future iterations, cache heuristics, use path-max, …
Size-based Search
Replace ill-behaved Objective with a well-behaved Evaluation
  Pure Size
Evaluation Function: “Go no further”
  Force ε = 1
Heuristic: “Go this way”
  Replace cost metric with size metric in relaxed problem
Retain true Objective, for pruning
  Resolve heuristic with real objective
  branch-and-bound: gcost + hcost >= best-known-cost
  duplicates: new.gcost >= old.gcost
  Re-expand better cost paths discovered
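The scheme on this slide can be sketched as a best-first search whose queue is ordered by size while cost is tracked only for pruning. This is an illustrative sketch, not the authors' implementation: the function signature, the default h_cost of 0 (assumed admissible), and the toy graph are all assumptions.

```python
import heapq
from itertools import count

def size_based_search(start, is_goal, successors, h_size, h_cost=lambda s: 0):
    """Order the queue by g_size + h_size; prune with the real objective (cost).

    successors(state) returns (child, edge_cost) pairs.
    Returns (best_cost, best_plan), or (None, None) if no plan is found.
    """
    tie = count()
    frontier = [(h_size(start), next(tie), 0, 0, start, [start])]
    best_g_cost = {start: 0}
    best_cost, best_plan = None, None
    while frontier:
        _, _, g_size, g_cost, state, path = heapq.heappop(frontier)
        # Stale entry: a cheaper path to this state was queued later.
        if g_cost > best_g_cost.get(state, float("inf")):
            continue
        # Branch-and-bound: gcost + hcost >= best-known-cost.
        if best_cost is not None and g_cost + h_cost(state) >= best_cost:
            continue
        if is_goal(state):
            best_cost, best_plan = g_cost, path
            continue
        for child, edge_cost in successors(state):
            new_g_cost = g_cost + edge_cost
            # Duplicates: prune when new.gcost >= old.gcost, else re-expand.
            if new_g_cost >= best_g_cost.get(child, float("inf")):
                continue
            best_g_cost[child] = new_g_cost
            entry = (g_size + 1 + h_size(child), next(tie),
                     g_size + 1, new_g_cost, child, path + [child])
            heapq.heappush(frontier, entry)
    return best_cost, best_plan

# Hypothetical toy graph: a short expensive plan and a longer cheap one.
graph = {"A": [("B", 1), ("G", 100)], "B": [("C", 1)], "C": [("G", 1)], "G": []}
result = size_based_search("A", lambda s: s == "G", lambda s: graph[s], lambda s: 0)
print(result)
```

On the toy graph the one-step plan is found first; the cheaper three-step plan then replaces it because the duplicate check allows re-expansion along better-cost paths.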
Cost-sensitive Size-Based Heuristic
Replace ill-behaved Objective with a well-behaved Evaluation
Evaluation Function: “Go no further”
Heuristic: “Go this way”
  Estimate cheapest/best, but calculate size
  sum/max/… propagation of real objective for heuristic
  make minimization choices with respect to real objective
  Last minute change: recalculate value of minimization choices by surrogate
Retain true Objective, for pruning
  Calculate relaxed solution’s cost, also
  Faster than totally resolving heuristic
  branch-and-bound: gcost + hcost >= best-known-cost
  If heuristic is inadmissible, force it to be admissible eventually
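A heavily simplified sketch of the idea above: choose among alternatives by real cost, then report the size of what was chosen. Real planners propagate these values over a relaxed planning graph; here the relaxed problem is reduced to a hypothetical table of independent subgoals, each with alternative achievers given as (cost, size) pairs. The input format, names and numbers are all assumptions.

```python
def cost_sensitive_size_heuristic(achievers_by_subgoal):
    """achievers_by_subgoal: {subgoal: [(cost, size), ...]} (hypothetical input).

    Returns (h_size, h_cost) using sum propagation over subgoals.
    """
    h_size = 0
    h_cost = 0
    for options in achievers_by_subgoal.values():
        # Minimise with respect to the real objective (cost); ties broken by size.
        cost, size = min(options)
        h_cost += cost   # relaxed solution's cost, kept for pruning
        h_size += size   # same choices, re-valued by the surrogate
    return h_size, h_cost

# Illustrative numbers: a 1-step achiever costing 7,000 versus a 3-step one costing 3.
relaxed = {"at-B": [(7000, 1), (3, 3)], "debarked": [(1, 1)]}
h_size, h_cost = cost_sensitive_size_heuristic(relaxed)
print(h_size, h_cost)
```

A pure size heuristic would have preferred the 1-step achiever here; the cost-sensitive variant follows the cheap choice and reports its size. Summing over subgoals can overestimate, which is the inadmissible case the slide says must eventually be forced admissible.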
Results – LAMA
LAMA
  Greedy best-first: bad plans
  (iterative) WA*: no plan, time out
LAMA-size
  Greedy best-first: same bad plans
  (iterative) WA*: direct plans, time out
  Better cost! … but no rendezvous
Expected Result:
  Only one kind of object
  Costs not widely varying
Portfolio approach possible
Results – SapaReplan
WA*-cost
  Weight 5: one bad plan, time out
  Weight 2: no plan, memory out
WA*-size
  Weight 1-2: better plans, memory out
Quality-sensitive evaluation function: cost+size
Conclusion
ε-cost traps are inevitable
  Typical:
    Large variation in cost
    Large cheap subspaces
    Upper-bounded heuristics
    Large plateaus in objective
Cost-based systematic approaches are susceptible
  Even with all kinds of search enhancements: LAMA
  Because search depth is “unbounded” by cost-based evaluation function
    ε^-1 (h-error) ~ 0^-1 (h-error)
    That is, search depth is bounded only by duplicate checking
Force good behavior: Evaluation ≠ Objective
  Force ε ~ 1
  Quality Focus versus Trap Defense
Simplest surrogate: Size-based Search
  Force ε = 1
  Performs surprisingly well
  Despite total lack of Quality Focus
Easy variation: Cost-sensitive Size-based Heuristic
  Still force ε = 1
  Recalculate heuristic by surrogate
  Performs yet better
Conclusion (Polemic)
Lessons best learnt and then forgotten:
  goto is how computers work efficiently
  A* is how search works efficiently
Both are indispensable
Both are best-possible
  In just the right context
Both are fragile
  If the context changes
Go enthusiasts: joseki
If size doesn’t work…
Speed Everything Up
Reduce All Memory Consumption
Improve anytime approach: Iterated, Portfolio, Multi-Queue
Guess (search over) upper bounds
Decrease weights
Delay duplicate detection
Delay re-expansion
Delay heuristic computation
Exploit external memory
Use symbolic methods
Learn better heuristics: from search, from inference
Precompute/Memoize anything slow: the heuristic
Impose hierarchy (state/task abstraction)
Accept knowledge (LTL)
Use more hardware: (multi-)core/processor/computer, GPU
Related Work: The Best Approach?
The Best Surrogate? The Best Approach Over All?
Improve Exploitation
  (Dynamic) Heuristic Weighting (Pohl, Thayer+Ruml)
  Real-time A* (Korf)
  Beam search (Zhou)
  Quality-sensitive probing/lookahead (Benton et al, PROBE)
Improve Exploration
  Path-max, A** (Dechter+Pearl)
  Multi-queue approaches (Thayer+Ruml, Richter+Westphal, Helmert)
  Iterated search (Richter+Westphal)
  Portfolio methods (Rintanen, Streeter)
  Breadth-first search [as a serious contender] (Edelkamp)
Directly Address Heuristic Error
  h_cea, h_ff, h_lama, h_vhpop, h_lpg, h_crikey, h_sapa, …
  Pattern Databases (Culbertson+Schaeffer, Edelkamp)
  Limited Discrepancy Search (Ginsberg)
  Negative Result: “How Good is Almost Perfect?” (Helmert+Röger)
‘See’ the Structure (remove the traps)
  Factored Planning (Brafman+Domshalak)
  Direct Symmetry Reductions (Korf, Long+Fox)
  Symbolic Methods, Indirect Symmetry Reduction (Edelkamp)
Related Fields
Reinforcement Learning: Exploration/Exploitation
Markov Decision Processes: Off-policy/On-policy
  Reward Shaping, Potential Field Methods (Path-search)
  Prioritized Value Iteration
Decision Theory: Heuristic Errors
  “Decision-Theoretic Search” (?)
  k-armed Bandit Problems (UCB)
Game-tree Search: Traps, Huge Spaces
  Without traps, game-tree pathology (Pearl)
  Upper Confidence Bounds on Trees (UCT)
  Quiescent Search
  Proof-number search (Allis?)
Machine Learning: Really Huge Spaces
  Surrogate Loss Functions
  Continuous/Differentiable relaxations of 0/1
Probabilistic Reasoning: Extreme Values are Dangerous
  that 0/1 is bad is well known
  but also ε is numerically unstable
What isn’t closely related?
Typical Puzzles: Rubik’s Cube, Sliding Tiles, …
Prove Optimality/Small Problems
Tightly Bounded Memory: IDDFS, IDA*, SMA*
Unbounded Memory, but:
  Delayed/Relaxed Duplicate Detection (Zhou, Korf)
  External Memory (Edelkamp, Korf)
More than one problem:
  D*, D*-Lite, Lifelong Planning A* (Koenig)
  Case-based planning
  Learned heuristics
State-space isn’t a blackbox:
  Bidirectional/Perimeter Search
  Randomly expanding trees for continuous path planning in low dimensions
  Waypoint/abstraction methods
  Any-angle path planning (Koenig)
State-space is far from a blackbox:
  Explanation Based Learning
  Theorem Proving (Clause/Constraint Learning)
  Forward Checking (Unit Propagation)
Planning isn’t (only) State-space search (Kambhampati)
Engineering:
  Subroutine speedup via Precomputation/Memoization
  Python vs C
  Priority Queue implementation (bucket heaps!)
Quotes
“… if in some problem instance we were to allow B to skip even one node that is expanded by A, one could immediately present an infinite set of instances when B grossly outperforms A. (This is normally done by appending to the node skipped a variety of trees with negligible costs and very low h.)” Rina Dechter, Judea Pearl
“I strongly advise that you do not make road movement free (zero-cost). This confuses pathfinding algorithms such as A*, …” Amit Patel
“Then we could choose an h somewhat larger than the one defined by (3). The algorithm would no longer be admissible, but it might be more desirable, from a heuristic point of view, than any admissible algorithm.” Peter Hart, Nils Nilsson, Bertram Raphael
Roughly: ‘… inordinate amount of time selecting among equally meritorious options’ – Ira Pohl