Automated Planning
[AIMA 4G] Chapter 11
COSC1127/1125
Artificial Intelligence
Semester 2, 2021
Prof. Sebastian Sardina
* Some slides based on those from Hector Geffner and James Harland.
Wominjeka!
Week 6
Planning
Prof. Sebastian Sardina
Say something about the course so far at:
6226 5651 @ menti.com
—
Wominjeka! Welcome!
I acknowledge, and invite you all to acknowledge, the Traditional Owners (Wurundjeri people of the Kulin Nations) of the land on which we will conduct this session for AI21. I recognise their continuing connection to land, waters and culture, as well as the challenges and injustices that continue up to our times, unfortunately.
IMPORTANT DISCLAIMER
This pack of slides is NOT core study
material. The core material is the book and any specific resources given for the week.
The slides in this presentation were prepared only to support the instructor during the lectorial, and only some of them were used in the session as needed.
This pack of slides does not represent the complete material of the week, only a few bits here and there around what was discussed in the lectorials. Most of the week's content will NOT show up in this pack of slides.
Please refer to the book (and any resources specifically given) for a comprehensive coverage of the week material.
I have Pacman dreams…
Some news…
Project 2: marking on the way!
be patient, too many things…
Final Project Contest coming soon!
First, build your team! #144
Check description here.
Full spec coming later this week
THE coming in Week 9
Mock THE in Week 7
In Google form: check access!
Theory, no coding
Working on it today…
Project 2: Multiagent
Marking results on the way…
Average was very very high…
Project was super accessible.
So, avoid “bargaining” marks.
You earn marks, you don’t lose marks
25 points for correctness + 10 for development.
no. of commits (15+) in groups and individually
type of commits, balanced commits,
quality of merges, messages, etc.
Will have to meet some groups (unfortunately).
Some great threads…
Some great repos…
Many and meaningful commits.
Balanced commits from different members.
Good comments for each commit.
Full history to initial commit; no orphan commits.
Many and meaningful branches.
Good merges via Pull Requests.
Side effects of taking AI…
Learn Python or get better at it.
Learn better development practices.
Learn about LaTeX.
Improve teamwork skills.
Make friends and future colleagues!
Final Contest: Capture the Flag
Worth 45% = 10% + 15% + 10% + 10%:
Preliminary ranking + Final ranking + Report + Video
Group project: 3 per group default.
Will run a “continuous” tournament:
you will be able to submit your agent and see how it performs, then fix, resubmit, etc.
final submission (code): week 12
Inter-University Contest: top teams to play other uni’s top teams (e.g., Melbourne University)
LET THE SHOW BEGIN!
START NOW!!
First step: register team
Questions?
Today: AI Planning
AI Search meets AI Knowledge Representation
All this will be assessed in THE
Monday Sept 20th @ 9am – 9pm (12 hrs!).
No repeats or exceptions after it closes.
Make yourself available now for ~3hrs (if sharp).
Don’t take risks by leaving it too late.
Recommended to submit by 6pm at most.
Topics Week 1 – 8 (included).
Content from Book & Tutorials; no Pacman!
Answers entered ONLINE in a Google Form/Quiz:
Can use paper & pen for your notes.
Make sure you are logged into your RMIT Google Account (not your private account!)
You can submit once (as if handing in on paper)
No other submission allowed (e.g., email)
Will provide a “Sample” Exercise for dry-run.
No negative marking will be used! 🙂
AI Automated Planning…
International Conference on Automated Planning and Scheduling
“Automated planning and scheduling, sometimes denoted as simply AI planning, is a branch of artificial intelligence that concerns the realization of strategies or action sequences, typically for execution by intelligent agents, autonomous robots and unmanned vehicles. Unlike classical control and classification problems, the solutions are complex and must be discovered and optimized in multidimensional space.” – Wikipedia
AI Planning = Search + Knowledge Representation
(for dynamic systems)
Plus external material (reading and videos) provided in post #133 for:
PDDL language
GraphPlan technique
with focus on its use as a domain-independent heuristic
Planning and Scheduling in Space
Planning for automation
What is the added value at RMIT?
Logic Programming in action @ ICAPS’21
Motivation
How to develop systems or ’agents’
that can make autonomous complex sequential decisions on their own?
Settings where autonomy required
Robotics.
Video-Games.
Aerospace.
Manufacturing.
Smart houses.
Companion toys.
Mars Rover.
Military operations planning.
Air Traffic Control.
Deep Space One space probe.
Hubble Space Telescope.
Factory and part assembly.
Job-shop scheduling.
AI Planning used in all of them!!
Autonomous Behavior in AI:
The Control Problem
The key problem is to select the action to do next.
Three approaches to this problem:
Programming-based: Specify control by hand.
Learning-based: Learn control from experience.
Model-based: Specify problem by hand, derive control automatically.
The approaches are orthogonal, though; each has its own successes and limitations . . .
the control problem
Solution 1: Programming Approach
Control specified by programmer; e.g.:
don’t move into a cell if not known to be safe (no Wumpus or Pit)
sense presence of Wumpus or Pits nearby if not known
pick up gold if presence of gold detected in cell
. . .
Advantage: domain-knowledge easy to express.
Disadvantage: cannot deal with situations not anticipated by programmer.
Example of elevator in SARL.
Agent-oriented programming: Week 12!!
Agent Programming: 2 SARL Examples
Elevator Control
Agents in City
Solution 2: Learning-based Approach
Unsupervised (Reinforcement Learning):
penalize agent each time that it ’dies’ from Wumpus or Pit
reward agent each time it’s able to pick up the gold, . . .
Supervised (Classification):
learn to classify actions into good or bad from info provided by teacher
Evolutionary:
from pool of controllers: try them, select the ones that do best, and mutate and recombine keeping best
Advantage: does not require much knowledge in principle
Disadvantage: in practice though, right features needed, incomplete information is problematic, and unsupervised learning is slow . . .
We will cover it later…
Solution 3: Model-based Approach
Specify model for problem:
actions, initial situation, goals, and sensors
Let a solver compute controller automatically!
(Diagram: the model (actions, sensors, initial situation, goals) goes into a SOLVER, which outputs a Controller; the Controller sends actions to the World and receives observations back.)
Advantage: flexible, clear, and domain-independent.
Disadvantage: need a model; computationally intractable.
Model-based approach to intelligent behavior: AI Planning.
Motivation
Planning is a form of general problem solving:
Idea: problems described at high-level and solved automatically.
Objective: facilitate modeling & generality with (small) penalty in performance.
(Diagram: a Problem, described in a Language, goes into a Planner, which outputs a Solution plan.)
The “simple case”: Classical Planning
Finite set of world states.
Fully observable.
Deterministic actions.
Static environment.
Goals are explicit states.
Actions have no duration.
Planning is offline.
(Diagram: Actions + Goals go into the PLANNER, which outputs a Plan, a totally-ordered sequence of actions, executed in the World.)
The Blocks World Example
Actuators
pick a block, put down a block, drop a block, unstack a block, etc..
Sensors
sense table, sense on a block
Performance measure
has the goal tower been built?
Wumpus World Example
Performance measure
gold +1000, death -1000
-1 per step, -10 for using the arrow
Environment
Squares adjacent to wumpus are smelly
Squares adjacent to pit are breezy
Glitter iff gold is in the same square
Shooting kills wumpus if facing it
Shooting uses up the only arrow
Grabbing picks up gold if in same square
Releasing drops the gold in same square
Actuators
Left turn, Right turn,
Forward, Grab, Release, Shoot
Sensors
Breeze, Glitter, Smell
The Planning Problem
Initial state: current state of the world.
Goal: state of the world we want to achieve.
Operators: actions which change the world.
Planning problem: Is there a sequence of actions leading from the initial state to a goal state?
Use logic to represent states & operators !!
Blocks World: Predicates
Blocks stacked on top of one another on a table.
Robot arm used to move blocks.
Only the top block can be moved.
On(x, y) Block x is directly on top of block y
Clear(x) Block x has nothing on top of it
Holding(x) The robot arm is holding block x
HandEmpty The robot arm is empty
OnTable(w) Block w is on the table
Blocks World: State
HandEmpty ∧ OnTable(d) ∧ OnTable(c) ∧
On(b, d) ∧ On(a, b) ∧ Clear(a) ∧ Clear(c)
Blocks World: Operators
pickUp(W) Pick up block W and hold it
W must be clear, gripper must be empty.
putDown(W) Place block W down on table.
W must be in gripper.
stack(U, V) Place block U on top of block V
Gripper must be holding U and V must be clear.
unstack(U, V) Remove block U from block V
U must be clear, U must be on V, gripper must be empty
Logical Representation of States
∀x [ (¬∃y On(y,x) ) ⇔ Clear(x) ]
x is clear if there is no block y such that y is on top of x
∀y [ OnTable(y) ⇔ ¬∃x On(y,x) ]
A block is on the table iff it is not on any other block
Empty ⇔ ∀y(¬Holding(y))
The hand is empty iff it is not gripping any block
Logical Representation of Actions
X’ is the truth value of predicate X in the “next” state
∀x pickUp(x):
Empty∧Clear(x) —> Holding’(x) ∧ ¬Empty’
∀x,y unstack(x,y):
On(x,y) ∧ Empty ∧Clear(x) —>
¬On’(x,y) ∧ Holding’(x) ∧ Clear’(y) ∧ ¬Empty’
“Logical” Representation of Actions
A : Preconditions + Effects means
`Operator/action A can execute when Preconditions are true & produces Effects after its execution’
pickUp(x):
Precond: Empty∧ Clear(x)
Effects: Holding(x) ∧ ¬Empty
putDown(x):
Precond: Holding(x)
Effects: [Empty ∧ ¬Holding(x) ∧ OnTable(x) ∧ Clear(x)]
stack(x,y):
Precond: Holding(x) ∧ Clear(y)
Effects: (On(x,y) ∧ Empty ∧ ¬Holding(x) ∧ Clear(x))
unstack(x,y):
Precond: On(x,y) ∧ Empty ∧Clear(x)
Effects: ¬On(x,y) ∧ Holding(x) ∧ Clear(y) ∧ ¬Empty
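As a rough sketch, the four operators above can be encoded as precondition/add/delete lists in Python. The encoding and function names are illustrative only (not from the course materials); negative effects become delete-list entries.

```python
# Sketch: the four Blocks World operators as precondition/add/delete lists.
# Atoms are plain strings; negative effects go into the delete list.

def pick_up(x):
    return {"pre": {"Empty", f"Clear({x})"},
            "add": {f"Holding({x})"},
            "del": {"Empty"}}

def put_down(x):
    return {"pre": {f"Holding({x})"},
            "add": {"Empty", f"OnTable({x})", f"Clear({x})"},
            "del": {f"Holding({x})"}}

def stack(x, y):
    # Faithful to the slide; a fuller model would also delete Clear(y).
    return {"pre": {f"Holding({x})", f"Clear({y})"},
            "add": {f"On({x},{y})", "Empty", f"Clear({x})"},
            "del": {f"Holding({x})"}}

def unstack(x, y):
    return {"pre": {f"On({x},{y})", "Empty", f"Clear({x})"},
            "add": {f"Holding({x})", f"Clear({y})"},
            "del": {f"On({x},{y})", "Empty"}}
```

Note how, unlike the ∀-quantified logical encoding, nothing needs to be said about the blocks not touched by the action.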
The Frame Problem/Axioms
Actions specify what changes, but how do we specify what doesn’t change?
∀x pickUp(x):
Empty∧Clear(x) —> Holding’(x) ∧ ¬Empty’
… but what about all other blocks different from x?
∀x pickUp(x):
∀y. y ≠ x ∧ OnTable(y) —> OnTable’(y)
∀y. y ≠ x ∧ ¬OnTable(y) —> ¬OnTable’(y)
∀y,z. y ≠ x ∧ On(y, z) —> On’(y, z)
∀y. y ≠ x ∧ Clear(y) —> Clear’(y)
∀y,c. Color(y, c) —> Color’(y, c)
….
Blocks World
(Figure: block b on block a, block e on block d, block c alone on the table.)
Empty ∧ OnTable(a) ∧ OnTable(c) ∧ OnTable(d) ∧
On(b, a) ∧ On(e, d) ∧ Clear(b) ∧ Clear(c) ∧ Clear(e)
∀x ∀y [Unstack(x,y) ∧ On(x,y) ∧ Empty ∧ Clear(x) ⇒
Holding(x) ∧ Clear(y) ∧ ¬Empty]
Execute: Unstack(e,d)
Execute Unstack(e,d)
Before: Empty ∧ OnTable(a) ∧ OnTable(c) ∧ OnTable(d) ∧
On(b, a) ∧ On(e, d) ∧ Clear(b) ∧ Clear(c) ∧ Clear(e)
Unstack(e,d) : On(e,d) ∧ Empty ∧ Clear(e) ⇒
Holding(e) ∧ Clear(d) ∧ ¬Empty
After: Holding(e) ∧ OnTable(a) ∧ OnTable(c) ∧ OnTable(d) ∧
On(b, a) ∧ Clear(b) ∧ Clear(d) ∧ Clear(c)
(Figure: the arm holds e; b remains on a; c and d are on the table.)
Models, Languages, and Solvers
Planning is a form of general problem solving:
Idea: problems described at high-level and solved automatically.
Goal: facilitate modeling with small penalty in performance.
Many models & many solution forms: uncertainty, cost,.
Models described in suitable planning languages (Strips, PDDL, PPDDL, . . . ).
PLANNER
Controller / Plan
Model of problem
Problem Solving & Classical Planning
Three components in modeling and problem solving:
languages for describing problems conveniently.
models for making sense of classes of problems.
algorithms for solving the models.
In classical planning, the mathematical model is fixed and corresponds to the (deterministic) state model:
state space S
initial state s0 ∈ S & goal states SG ⊆ S
actions A(s) applicable in each s
transition function s’ = f(a, s), for all a ∈ A(s)
Languages include STRIPS, ADL, PDDL, etc
Like search!
Can you please go and fill in the CES for AI?
The invite should be in your email.
Don’t rate 3, it is dropped…
Don’t just ignore it if you are active and like the course: we want all voices!
NOT YET!! 🙂
Classical Planning
(Diagram: a State Model, described in a Language, goes into the PLANNER, whose Algorithm outputs a Sequential Plan.)
Planning Languages
Planning languages provide a convenient, declarative way for describing state models.
They capture fragments of predicate logic + extensions for expressing:
the set of available actions;
the conditions under which they are applicable; and
the effects they produce.
In addition, they must accommodate information about the initial situation & the goal situation.
The STRIPS Language
STanford Research Institute Problem Solver.
Used for the control of the Shakey robot.
Representation of states: fluents
Representation of actions:
preconditions
positive effects
negative effects
Everything not changed stays the same: frame axioms not needed!
STRIPS is one of the oldest, simplest, and most used planning languages,
even if not very flexible.
The STRIPS Language (cont.)
A problem in STRIPS is a tuple P = ⟨F, O, I, G⟩:
F stands for the set of all atoms (boolean vars).
O stands for the set of all operators (actions).
I ⊆ F stands for the initial situation.
G ⊆ F stands for the goal situation.
Operators o ∈ O are represented by o = ⟨Pre(o), Add(o), Del(o)⟩:
The Precondition list Pre(o) ⊆ F
The Add list Add(o) ⊆ F
The Delete list Del(o) ⊆ F
Example: Blocks World in PDDL
(define (domain BLOCKS)
(:requirements :strips) …
(:action pick_up
:parameters (?x)
:precondition (and (clear ?x) (OnTable ?x) (handempty))
:effect (and (not (OnTable ?x)) (not (clear ?x)) (not (handempty))
(holding ?x)))
(:action put_down
:parameters (?x)
:precondition (holding ?x)
:effect (and (not (holding ?x))(clear ?x)(handempty)(OnTable ?x)))
(:action stack
:parameters (?x ?y)
:precondition (and (holding ?x) (clear ?y))
:effect (and (not (holding ?x))(not (clear ?y))
(clear ?x)(handempty)(on ?x ?y))) …
(define (problem BLOCKS_6_1)
(:domain BLOCKS)
(:objects F D C E B A)
(:init (clear A) (clear B) … (OnTable B) … (handempty))
(:goal (and(on E F) (on F C) (on C B) (on B A) (on A D))))
Planning Domain Description Language
Example: 15-Puzzle in PDDL
(define (domain tile)
(:requirements :strips :typing :equality)
(:types tile position)
(:constants blank - tile)
(:predicates (at ?t - tile ?x - position ?y - position)
(inc ?p - position ?pp - position)
(dec ?p - position ?pp - position))
(:action move-up
:parameters (?t - tile ?px - position ?py - position ?bx - position
?by - position)
:precondition (and (= ?px ?bx) (dec ?by ?py) (not (= ?t blank))
…)
:effect (and (not (at blank ?bx ?by)) (not (at ?t ?px ?py))
(at blank ?px ?py) (at ?t ?bx ?by)))
…
(define (domain eight_tile) ..
(:constants t1 t2 t3 t4 t5 t6 t7 t8 - tile p1 p2 p3 - position)
(:timeless (inc p1 p2) (inc p2 p3) (dec p3 p2) (dec p2 p1)))
(define (situation eight_standard)
(:domain eight_tile)
(:init (at blank p1 p1) (at t1 p2 p1) (at t2 p3 p1) (at t3 p1 p2) ..)
(:goal (and (at t8 p1 p1) (at t7 p2 p1) (at t6 p3 p1) ..)
PDDL in Visual Studio Code!
http://planning.domains/
Modeling challenge!
Consider a “house domain”.
One of the actions is to turn lights on and off by toggling light switches. Complete the STRIPS operator toggle(x) for toggling switch x. The precondition is to be near the switch being toggled. There is a predicate LightOn(x) encoding that the light for switch x is on.
toggle(x):
Precond: CloseTo(x)
Add list:
Delete list:
From Languages to Models
A STRIPS problem P = ⟨F, O, I, G⟩ determines a state model S(P):
The states s ∈ S are collections of atoms from F.
The initial state s0 is I.
The goal states s are such that G ⊆ s.
The actions a in A(s) are ops in O s.t. Prec(a) ⊆ s.
The next state is s’ = s − Del(a) + Add(a).
Solution of P is solution of S(P)
How to solve S(P)?
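A minimal sketch of this state model in Python may help; the encoding (atoms as strings, operators as triples) is illustrative only:

```python
# Sketch: the state model S(P) induced by a STRIPS problem P.
# States are frozensets of atoms; an operator is a (pre, add, delete) triple.

def applicable(state, op):
    pre, add, delete = op
    return pre <= state                  # Prec(a) ⊆ s

def apply_op(state, op):
    pre, add, delete = op
    return (state - delete) | add        # s' = s - Del(a) + Add(a)

def is_goal(state, goal):
    return goal <= state                 # goal states are those with G ⊆ s

# Tiny example: pick up block a (operator as on the earlier slide).
s0 = frozenset({"Empty", "Clear(a)", "OnTable(a)"})
pick_up_a = (frozenset({"Empty", "Clear(a)"}),   # Pre
             frozenset({"Holding(a)"}),          # Add
             frozenset({"Empty"}))               # Del

s1 = apply_op(s0, pick_up_a)   # → {"Clear(a)", "OnTable(a)", "Holding(a)"}
```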
Planners: Solve S(P)
Start with initial state.
Apply all possible actions to initial state.
Apply all possible actions to generated states.
Apply all ….
…
Get tree of all possible outcomes.
See if one of the states is the goal state.
Nice idea, but exponential…
Simple!
Planners need heuristics!
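The blind procedure above is just breadth-first forward search over S(P). A self-contained sketch follows (toy string encoding, not a real planner; the operator set is hypothetical):

```python
from collections import deque

# Sketch: blind breadth-first forward search over the state model S(P).
# ops maps an action name to a (pre, add, delete) triple of frozensets.

def bfs_plan(init, goal, ops):
    start = frozenset(init)
    frontier = deque([(start, [])])
    seen = {start}
    while frontier:
        state, plan = frontier.popleft()
        if goal <= state:                     # goal reached: G ⊆ s
            return plan
        for name, (pre, add, delete) in ops.items():
            if pre <= state:                  # applicable: Prec(a) ⊆ s
                nxt = (state - delete) | add  # s' = s - Del(a) + Add(a)
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, plan + [name]))
    return None                               # no plan exists

ops = {
    "pickUp(a)":  (frozenset({"Empty", "Clear(a)", "OnTable(a)"}),
                   frozenset({"Holding(a)"}),
                   frozenset({"Empty", "OnTable(a)"})),
    "stack(a,b)": (frozenset({"Holding(a)", "Clear(b)"}),
                   frozenset({"On(a,b)", "Empty", "Clear(a)"}),
                   frozenset({"Holding(a)", "Clear(b)"})),
}
plan = bfs_plan({"Empty", "Clear(a)", "Clear(b)", "OnTable(a)", "OnTable(b)"},
                frozenset({"On(a,b)"}), ops)  # → ["pickUp(a)", "stack(a,b)"]
```

On real domains the state tree grows exponentially, which is exactly why the heuristics discussed next are needed.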
Planning Graphs — Video Slides
Computation: how to solve S(P)?
Key issue: exploit 2 roles of language:
specification: concise model description.
computation: reveal useful heuristic info.
Explicitly searches graph associated with model S(P) with heuristic h(s) that estimates cost from s to goal
Key idea: Heuristic h extracted automatically from problem P.
mainstream approach; huge spaces!
AI Planning = Search + Knowledge Representation
(for dynamic systems)
Planners and Heuristics
How do we find a good heuristic?
… without knowing the problem at hand!
Answer: Relax the problem & solve it.
Simple relaxations used in Planning:
ignore certain atoms.
ignore delete lists.
Graphplan.
Optimal solution to a relaxed problem gives an admissible heuristic for the original one…
8-puzzle as a Planning Problem
(Figure: an 8-puzzle configuration with numbered tiles and one blank; tile t sits at position s1.)
Action: Slide(t, s1, s2) – move t from s1 to s2
Precondition: At(t,s1) ∧ Tile(t) ∧ Blank(s2) ∧ Adjacent(s1,s2)
Effect: At(t,s2) ∧ Blank(s1) ∧ ¬At(t,s1) ∧ ¬ Blank(s2)
Proposal 1 (ignore Blank(·)): yields the Manhattan-distance heuristic.
Proposal 2 (ignore Blank(·) and Adjacent(·,·)): yields the misplaced-tiles heuristic.
Planning Graphs for heuristic extraction
Graphplan-based Heuristics
Build reachability graph P0, A0, P1, A1, . . .
P0 = {p ∈ Init}
Ai = {a ∈ O | Prec(a) ⊆ Pi}
Pi+1 = Pi ∪ {p ∈ Add(a) | a ∈ Ai}
Graph implicitly represents max heuristic:
hmax(s) = min i such that G ⊆ Pi
(Figure: layered reachability graph P0 → A0 → P1 → A1 → … → Pk.)
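The layered construction above can be sketched directly; delete lists are ignored, matching the relaxation the graph uses (illustrative encoding, hypothetical operator set):

```python
# Sketch: hmax from the delete-free reachability graph P0, A0, P1, ...
# ops maps a name to a (pre, add) pair of frozensets; delete lists are
# ignored, which is exactly the relaxation the graph construction uses.

def h_max(state, goal, ops):
    level, i = set(state), 0
    while True:
        if goal <= level:                 # min i such that G ⊆ Pi
            return i
        nxt = set(level)
        for pre, add in ops.values():
            if pre <= level:              # Ai: operators with Prec ⊆ Pi
                nxt |= add                # Pi+1 = Pi ∪ Add-effects of Ai
        if nxt == level:                  # levelled off: goal unreachable
            return float("inf")
        level, i = nxt, i + 1

ops = {
    "pickUp(a)":  (frozenset({"Empty", "Clear(a)", "OnTable(a)"}),
                   frozenset({"Holding(a)"})),
    "stack(a,b)": (frozenset({"Holding(a)", "Clear(b)"}),
                   frozenset({"On(a,b)", "Empty", "Clear(a)"})),
}
s0 = {"Empty", "Clear(a)", "Clear(b)", "OnTable(a)", "OnTable(b)"}
h = h_max(s0, frozenset({"On(a,b)"}), ops)   # → 2
```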
Planning Graph
(Figure: planning graph for a “cake” example, with proposition levels P0, P1, P2 and action levels A0, A1. Propositions include Have(eggs), Have(water), Have(flour), Have(playdo), Have(cake), Happy, Satisfied; actions include bake, playdo, play and eat(cake). Persistence actions carry propositions across levels; precondition links and causal links connect propositions and actions.)
Planning Graph
Efficient way to create a representation of a planning problem that can be used to:
Achieve better heuristic estimates.
Directly construct plans.
Termination? YES!
PG are monotonically increasing or decreasing:
Literals increase monotonically
Actions increase monotonically
Mutexes decrease monotonically
Since there is a finite number of actions and literals, every PG will eventually level off
With n levels, a actions, and l literals, the PG can be built in O(n(a + l)²).
Graphplan…
Check these slides on Planning Graphs
Check how to have and eat the cake!
Planning graph is an input to the planner.
can be “precompiled” offline
Goal is impossible if it is not in the final level.
Mostly used to extract heuristic value.
Can be used to generate (“approx”) plans.
First introduced in the GRAPHPLAN planner, but many intricacies and variations …
Computational Approaches to Planning
STRIPS algorithm (70’s): total-order planning, searching backward from the Goal in a depth-first manner.
Partial Order (POCL) Planning (80’s): work on any subgoal, resolve threats.
Graphplan (1995 – . . . ): build graph containing all possible parallel plans up to certain length; then extract plan by searching graph backward from Goal
SatPlan (1996 – . . . ): map planning problem into SAT problem; use state-of-the-art SAT solver
Heuristic Search Planning (1996 – . . . ): search state space S(P) with heuristic function h extracted from problem P.
Model Checking Planning (1998 – . . . ): search state space S(P) with ‘symbolic’ BrFS where sets of states represented by formulas implemented by BDDs
State of the Art in Classical Planning
Fast and significant progress since Graphplan (1995)
Empirical methodology
standard PDDL language
planners and benchmarks available
ICAPS planning competitions (every 2 years). . .
Novel formulations and ideas
Graphplan, SAT, HSP, Model checking, . . .
Large problems solved
Some Planners
HSP: Heuristic Search Planning:
https://code.google.com/p/hsp-planners/
Fast Downward & LAMA (and other variations):
http://www.fast-downward.org/
SATPLAN:
http://www.cs.rochester.edu/~kautz/satplan/
GRAPHPLAN:
http://www.cs.cmu.edu/~avrim/graphplan.html
International Planning Competition (IPC):
http://ipc.icaps-conference.org/
Beyond Classical Planning
Uncertainty: what if we do not know the whole state?
Non-deterministic actions: pick block may fail.
Actions with duration: takes time to drive.
Actions with cost: find optimal plan.
Parallel actions: grab table from both sides.
Sensing information: sensors can give you info.
Conditional effects: effects depend on the state in which the action is executed.
Control knowledge: encode domain expert knowledge.
…
Do you want to investigate any of these problems? Come and talk to me! (… and get a good grade in AI!)
Conclusions
AI Planning = a form of general problem solving.
Key components:
models, languages, algorithms.
Classical Planning:
complete + deterministic.
Reachability analysis on a state model graph.
Difference with search (BFS, DFS):
convenient KR representation of graph (e.g., PDDL).
automatic heuristic extraction.
AI Planning = Search + KR (for action)