程序代写 PowerPoint Presentation

PowerPoint Presentation

Classical Planning
Non-Forward Search

Copyright By PowCoder代写 加微信 powcoder

6CCS3AIP – Artificial Intelligence Planning
Dr Tommy Thompson

FACULTY OF NATURAL & MATHEMATICAL SCIENCES
DEPARTMENT OF INFORMATICS

Hello and welcome to this new segment on non-forward search. In this segment we’re going to look at alternative approaches to searching through the state space. Namely, how else can we search the state space beyond the canonical form we decided on at the start of the module.

Classical Planning: Material Overview
Classical Planning: The fundamentals.

Improving Search
Heuristic Design (RPG/LAMA)
Dual Openlist Search

Optimal Planning
SAS+ Planning
Pattern Databases

Non-Forward Search
SAT Planning
POP Planning
HTN Planning

Planning Under Uncertainty

FACULTY OF NATURAL & MATHEMATICAL SCIENCES
DEPARTMENT OF INFORMATICS

Returning to our overview back at the introduction, we’ve now cleared through a variety of approaches not just in heuristic search, but how we can optimise using specialisms of the planning approach or finding means to generate useful admissible heuristics. But now, let’s go back to the beginning when we discussed planning fundamentals…

Fundamentals: Forward Search
Definition of the problem – PDDL

Forward Search
Breadth/Depth First

Introducing Heuristics

FACULTY OF NATURAL & MATHEMATICAL SCIENCES
DEPARTMENT OF INFORMATICS

Going back to my introductory chapter, we talked about how we have the problem definition in PDDL and from that we enumerate all the possible states we can visit. We generate the initial state and from there roll out with new states based on the state transition system we have established. And depending on whether we use a simple algorithm, such as breath or depth first, or we’re using a heuristic or cost driven mechanism, be it A*, or even something like Enforced Hill Climbing, we are reliant on the idea that we continue to move forward through the search space until we find the goal and then construct the path back to the initial state.

This is – perhaps obviously – what we call forward search – we’re always starting at the initial state and moving forward to the goal.
So, with this in mind, what if we could search the state space in a slightly different way? What if we formulated the problem space differently as well? Or even question some of the assumptions we make in how this works.
This is largely what we’re discussing in this segment and each of these chapters is but a broad overview of these topics, rather than going into them in too much detail.

The emphasis here is making sure you’re aware that there have been alternative approaches established to how search works, so this really just a taster of subject you could get into much more detail through further reading.

Construct a graph that encodes constraints on possible plans.

The planning graph constrains search to a valid plan.
Finds shortest plans (i.e. shortest makespan).
Will search backwards from the end of the graph back to the initial state.

Graphs can be built fairly quickly.
Graph construction is in polynomial time, but planning can be exptime.
Will terminate with failure if there is no plan.

Originally developed for Graphplan planner, is the basis for the RPG heuristic.
A. Blum and M. Furst, “Fast Planning Through Planning Graph Analysis” [pdf], Artificial Intelligence, 90:281–300 (1997).
onBT {T}
onCA {T}
clearB {T}
clearC {T}
C to T from A

C to B from A

B to C from T

FACULTY OF NATURAL & MATHEMATICAL SCIENCES
DEPARTMENT OF INFORMATICS

So for our first foray into alternative approaches, let’s look at something that will actually look quite familiar to you when you see it in action, given it’s a combination of things that we’ve actually already covered.

Graphplan generates a planning graph of fact and action layers, whereby we roll out all of the facts that are true on the current layer, then all of the actions we can achieve from the previous fact layer.

Now if you have already looked at our chapter on the relaxed planning graph heuristic which is used in the FF, this will look very familiar to you. And that’s because RPG is based on GraphPlan albeit with simplifications to the process.
One of the reasons for this, is that graphs can be built in polynomial but the search process for an actual plan can be exponential time, hence the delete relaxations and the focus on action sets rather than a totally ordered set of actions helps make the RPG heuristic find a solution faster.

However, the process of finding a plan in GraphPlan is slightly different and one very important feature is that it also captures the mutex relations that exist between facts in the action layer, such that we ensure actions that appear in later action layers can only occur at that time, provided the preconditions are not mutex.

Graphplan Example: Rocket Domain
X is at location Y
: rocket R has fuel
: X is in rocket R

: load X (onto R) at location L (X and R must be at L)
: unload X (from R) at location L (R must be at L)
: move rocket R from X to Y(R must be at L and have fuel)

Graph representation:
Solid black lines: preconditions/effects
Dotted red lines: negated preconditions/effects

FACULTY OF NATURAL & MATHEMATICAL SCIENCES
DEPARTMENT OF INFORMATICS

So to go through this in detail, let’s work through an example. This is the rocket domain, where we have a rocket we’re trying to move a rocket around with objects being stored within it and for the rocket to have fuel in order for us to move.

Nothing here is terribly new or complicated, but it’s worth acknowledging here that one of the interesting elements of this domain is that upon moving the rocket, it’s out of fuel. And we don’t have means to refuel it. Hence it’s important that we load any and all materials we need onto the rocket before we set it off to its destination.

Now we’re going to roll out the planning graph in a manner similar to that we have done before, but this time, we’re going to actively maintain references to the preconditions and effects that arise from executing these actions.
In addition, we’re also going to use the dotted red lines to track what facts are negated by action execution. This is important for us to acknowledge what mutex relations will exitst.

Building a Simple Planning Graph
S0 A0 S1

FACULTY OF NATURAL & MATHEMATICAL SCIENCES
DEPARTMENT OF INFORMATICS

So let’s start with the initial state, where we have packages A and B at location L. The rocket is at location L and we have fuel in the rocket. The goal is to have both packages at location P, which the rocket can fly to with a single action.
Much like what we saw before with the RPG heuristic, we setup our initial proposition or fact layer S0 and then find what actions are valid for us to execute in action layer A0 – including the no-op, or doing nothing – plus we are tracking the preconditions required by each of these facts to enable us to execute each of these actions.

This then results in the subsequent fact layer S1, which shows all the facts that could be true, as a result of the execution of those actions or just doing nothing and skipping a timestep.

Building a Simple Planning Graph
S0 A0 S1

FACULTY OF NATURAL & MATHEMATICAL SCIENCES
DEPARTMENT OF INFORMATICS

But this time, let’s also maintain a record of the effects of actions, what action created a fact or deleted it, which is going to be useful for us later on during the planning process.
So here, we can clearly see that loading the packages onto the rocket delete the facts that A and B are at L, but also that moving the rocket from L to P deletes the fact that it’s at L, but also that it has any fuel after the journey is completed.

Building a Simple Planning Graph
S0 A0 S1 A1 S2

unload A P
unload B P

FACULTY OF NATURAL & MATHEMATICAL SCIENCES
DEPARTMENT OF INFORMATICS

So as we roll it out again, we’re seeing how all of the preconditions and effects are being monitored throughout the graph and we see that the goal facts, of the packages A and B being at location P are now visible within the latter fact layer. So we have – much like we have done with the RPG heuristic – generated action layers based on available facts, then created new fact layers based on the new facts that can emerge, with the no-op maintaining the state as it currently is, plus in future actions layers we consider all of the possible actions that can now be executed in order to solve this problem and this continues on until those goal facts.

But all of this is pretty much the same as the RPG heuristic, except we’re paying attention to the delete effects of actions, whereas in RPG we simply ignored them thanks to the delete relaxation.
So what is important to recognise here, is this will not result in a valid plan yet. Let’s discuss why…

Valid Plans
A valid plan from graphplan is one in which:
Actions at the same level don’t interfere with one another.
Each action’s preconditions are true at that point in the plan.
Goals are satisfied at the end of the graph.

This is where mutexes kick in:
Two actions or literals are mutex if no valid plan could contain both.

FACULTY OF NATURAL & MATHEMATICAL SCIENCES
DEPARTMENT OF INFORMATICS

In order for a valid plan to emerge in Graph Plan, we are reliant on a much stricter set of rules than we were when we using this for RPG.

When generating a valid plan, none of the actions at the same level can interfere with one another.
This includes whether they both change the same fact on the subsequent fact layer
Or if their preconditions contradict.
And of course if the goal facts appear on the final fact layer.

Hence, for the example we’ve made thus far, this doesn’t generate a valid plan, because even if we consider actions that could executed in parallel, we could only load A and B onto L in the first action layer, then move the Rocket to P on the second action layer. We still need to unload the items – which of course could be done in parallel – on a third action layer.
Why is this? Well because if we consider our first action layer, we could load the packages and move the rocket on that first layers, meaning that in the second layer, we could then unload all packages. What we have is the relaxed plan solution but we need to reinforce the mutex relations that exist between actions and facts in order for graph plan to find the actual solution.

Different types of mutex relations can appear in a planning graph.

Interference
Two actions that interfere with one another, where the effect of one negates the precondition of another.

Competing Needs
Two actions where any of their preconditions are mutex with one another.

Inconsistent Support
Two literals that are mutex given all ways of creating both are mutex.

FACULTY OF NATURAL & MATHEMATICAL SCIENCES
DEPARTMENT OF INFORMATICS

There are three types of mutexes that can emerge – and don’t worry we’re going to explore the final graph shortly.

First of all we have interference mutexes, these are where two actions interfere with one another. Given the effect of one action will negate the precondition of another one. This is one of the most common things to spot in our earlier action layers.
Secondly, we have what are known as competing needs, where two actions become mutex because their preconditions are mutex with one another. We’ll definitely see some of these when we go back and revisit the second action layer, given we’re now in a position where the fact layer now permits contradictory facts to exist.
And lastly, we have inconsistent support mutexes, where two literals are mutex because all ways in which we can create them in the fact layer are also mutex.

Using Mutexes
We extend the planning graph and acknowledge mutexes in fact/action layers.

Action Layers
Include all actions (including no-op) that have all their preconditions satisfied and are not mutex.
Mark as mutex all action pairs that are incompatible

Fact Layers
Generate all propositions that are the effect of an action in preceding fact layer.
Mark as mutex all pairs of propositions that can only be generated by mutex action pairs.

FACULTY OF NATURAL & MATHEMATICAL SCIENCES
DEPARTMENT OF INFORMATICS

So now, let’s consider how this will roll out with the mutexes being adopted.
If we consider our action layers, we are only going to include all actions – including the no-op of course – that have all their actions satisfied, but are also not mutex.
We need to mark all actions that are incompatible as mutex in the action layer.

Meanwhile, we will also generate all facts in the subsequent fact layer, but mark any and all pairs of facts that could only be generated by mutex action pairs.

Finding Mutexes
S0 A0 S1 A1 S2 A2 S3

unload A P
unload B P

FACULTY OF NATURAL & MATHEMATICAL SCIENCES
DEPARTMENT OF INFORMATICS

So with this in mind, the let us return to the original example, only this I have actually rolled this out a little farther to a third fact layer, from which we could potentially build a solution from.
Why? Because we’re actually able to handle the mutexes that have emerged. Let’s talk about where some of these mutexes are and how from this we are able to start building our final plan from this graph.

Finding Mutexes
S0 A0 S1 A1 S2 A2 S3

unload A P
unload B P

Interference
(load A L) and (load B L)
Are mutex with (move L P)

FACULTY OF NATURAL & MATHEMATICAL SCIENCES
DEPARTMENT OF INFORMATICS

First things first, even in the first action layer, we have interference.
The two load actions both have a mutex relationship with the move action.
While the two load actions themselves could be executed concurrently, they can’t be executed on the same action layer as move, because the effect of the move action – changing (at R L) to (at R P) – is mutex.
Note that we could actually still apply both load actions in this action layer, they’re fine they don’t contradict each other, but the move causes complications.

And if you look farther into the planning graph, you’ll notice that these mutexes will persist as we move through and in fact will get worse over time as more facts are added and more actions are available.

Finding Mutexes
S0 A0 S1 A1 S2 A2 S3

unload A P
unload B P

Competing Needs
(move L P) and (move P L)
have mutex preconditions

FACULTY OF NATURAL & MATHEMATICAL SCIENCES
DEPARTMENT OF INFORMATICS

Secondly, we begin to see competing need mutexes appear and a good example of this is the second action layer, given we now have a situation where both moving the rocket from L to P and vice versa are available on the same fact layer. This is of course mutex, but unlike the interference mutex, where the action of one impedes the effect of another – although to be fair that is true – in this case the preconditions are mutex.
Why are they mutex? Because the in the previous fact layer, the fact that the rocket is at L and P are mutex, hence these actions are also mutex in turn. These facts appear as mutex, because the action (move L P) has a mutex action pair with the no-op. This allows us to very quickly catch all of the mutex facts that appear in the second fact layer.

Basic Graphplan Algorithm
Grow the planning graph (PG) until all goals are reachable and none are pairwise mutex.
(If PG levels off [reaches a steady state] first, fail).

Search backwards from the max fact layer for a valid plan.

If none found, add a level to the PG and try again.

FACULTY OF NATURAL & MATHEMATICAL SCIENCES
DEPARTMENT OF INFORMATICS

What we have to do is continually generate new fact and action layers and continue to do so until we reach a point that the goal facts are not mutex with one another.
In the event we reach a point where there is no change happening in the fact layers and the goal facts haven’t emerged, we just bail out and declare it as a failure.

However, in the event the goal facts are eventually reached without the mutex relations, we will then plan backwards from that point to see if a complete plan can be made without mutex actions.
If it can be found, great, we have a plan! If not, we continue to generate more fact layers and then try again. Given it’s possible that the goal facts are not mutex, but the actions to create them still are.

Plan Generation
Backward chain on the planning graph.
Complete all goals at one level before going back

At level i, pick a non-mutex subset of actions that achieve the goals at level i+1.
The preconditions of these actions become the goals at level i.

Build the action subset by iterating over goals, choosing an action that has the goal as an effect.
Use an action that was already selected if possible. Do forward checking on remaining goals.

FACULTY OF NATURAL & MATHEMATICAL SCIENCES
DEPARTMENT OF INFORMATICS

The actual plan generation process is pretty much what we saw with the RPG heuristic: iterate backwards finding actions that at first satisfy the goals at the final fact layer, then add those preconditions to the goal layer as we iterate backwards. Again we’re capable of picking any non-mutex subset of actions. Hence in this example, we can safely load or unload both packages at the same time. Given, when we execute it, we’re just going to pick one of them in an arbitrary fashion, given it doesn’t matter which one we pick. Provided they are both executed before the next set of actions, then everything will be valid.

Creating planning graphs is a polynomial time process.

Finding solutions however, is still potentially exponential.

Mutexes across facts and actions help ensure we find complete and sound plans.

Generate graphs forward, then plan backwards.

FACULTY OF NATURAL & MATHEMATICAL SCIENCES
DEPARTMENT OF INFORMATICS

So to wrap up, GraphPlan is an interesting approach that enables us to generate a search space within polynomial time that ensures we have all goals available to us.
However, the search time can still prove demanding as we iterate back through the graph. This example is of course rather trivial and fails to really grasp the potential complexity of what we could be dealing with in this case.

And it’s an interesting decision on our part to introduce this now after having shown you the RPG heuristic, given this is the basis from which lat

程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com