PowerPoint Presentation
Classical Planning
Relaxed Planning & RPG Heuristic
Copyright By PowCoder代写 加微信 powcoder
6CCS3AIP – Artificial Intelligence Planning
Dr Tommy Thompson
FACULTY OF NATURAL & MATHEMATICAL SCIENCES
DEPARTMENT OF INFORMATICS
Hello and welcome to this second chapter on classical planning. In this chapter we’re going to look at how to create domain independent heuristics and one example of this through relaxed planning.
Domain Independent Heuristics
Building heuristics that can adapt to different domains/problems.
Not reliant on specific information about the problem.
We analyse aspects of the search and planning process to find potential heuristics.
FACULTY OF NATURAL & MATHEMATICAL SCIENCES
DEPARTMENT OF INFORMATICS
So heuristics for problems in planning are important, because the range of planning problems is very diverse. But we might not know what is a good heuristic to a problem at the time we’re establishing what the problem is. Plus given we typically encode planning domains in the likes of STRIPS and PDDL, we don’t provide the heuristic as part of that problem. So can we establish a means to create the heuristic in a way that is independent of the actual domain itself, but is instead arises from an analysis of the problem itself.
The first of these is the focus of today’s chapter: relaxed planning and how this can used to calculate optimal heuristics.
Domain Independent Heuristics
STRIPS planning (Fikes & Nilsson, 1971) uses number of goals satisfied.
Number of variables from sgoal satisfied in scurrent
Goal-Counting Heuristic: More satisfied goals, then closer to the goal?
Uninformative about the problem.
Ignores a lot of the problem structure.
Fikes, R.E. and Nilsson, N.J., 1971. STRIPS: A new approach to the application of theorem proving to problem solving.
Artificial intelligence, 2(3-4), pp.189-208.
FACULTY OF NATURAL & MATHEMATICAL SCIENCES
DEPARTMENT OF INFORMATICS
If we consider early planning applications, even the likes of STRIPS was reliant on domain independent information, given much like PDDL it is reliant on a notation that users adopt to input the planning problems they are trying to solve.
In fact, STRIPS relies on what is known as the Goal-Counting heuristic. It looks at the current state and counts how many of the goals listed in the goal state are currently satisfied.
So if we look at say the examples on the right of this screen, then we would count how many of the packages are currently in the correct location in driver log, how many cubes are positioned correctly atop one another in blocks world. But the problem is, that’s actually not a very good heuristic. It doesn’t tell us an awful lot about how far away we are from the current state to the goal, which is what we need from a good heuristic. If we consider that driver log problem, the number of states where the goal-counting heuristic would be zero is significant, given there is a lot of legwork that needs to be completed in order to get even one package in the correct destination, never mind all three. Hence goal-counting is ignoring a huge amount of the problem structure and not really telling us anything interesting or useful.
Relaxed Planning
Establishing valid solutions on planning problems is hard.
Calculating whether a planning instance has any valid solutions exists within PSPACE-complete and optimal planning is NP-Hard to compute (Bylander, 1994).
Need a polynomial time heuristic – given we’re going to use it a lot!
Relaxed Plan Heuristics reliant on solving the ‘relaxed problem’.
Estimate cost using ‘simpler’ version of existing problem.
Solving the relaxed problem provides a lower-bound on optimal plan length.
Bylander, T., 1994. The computational complexity of propositional STRIPS planning.
Artificial Intelligence, 69(1-2), pp.165-204.
FACULTY OF NATURAL & MATHEMATICAL SCIENCES
DEPARTMENT OF INFORMATICS
Now ideally we want a much more complex heuristic that gives us richer information about the problem. But one of the reasons this is a headache is that planning is highly expensive to compute. Most planning problems exist within PSPACE-complete of computational complexity while trying to actually solve a problem is typically NP-Hard. So we need to find a heuristic that is informative, but also can be calculated in polynomial time so as to minimse overhead.
One great way to achieve this is relaxed planning: whereby we attempt to solve a simpler or relaxed version of the problem and by solving the relaxed problem, we can use the solution to help construct a heuristic value. So let’s walk through how that’s going to work.
Delete Relaxation
For any given action, we have two types of effects:
Add Effects
New information added to the world state.
Delete Effects
Information removed from the world state.
Delete relaxation heuristics ignore delete effects when the state-transition model is applied.
Σ = (S,A,γ)
s ∈ S, a ∈ A
γ(s,a) s’
FACULTY OF NATURAL & MATHEMATICAL SCIENCES
DEPARTMENT OF INFORMATICS
Relaxed planning is based on the principle of delete relaxation. So if we consider the typical actions that we have in a planning problems, the effects can be considered to be one of two types:
Add effects, where we add new information to the state.
Delete effects, where we remove information from the world state.
So if we consider the action that is happening in Blocksworld example to the right, as C moves onto A, we add the effect that C is on A. But the delete effect removes the fact that C is on the Table and A is clear.
Deletion relaxation is a principle where, during the state transition system that we could use for a typical problem space, we ignore the delete effects when the state transition function gamma is applied.
This means, that the state space becomes broader and more actions are applicable in each state, because information that shouldn’t be true, is now also true during play.
Delete Relaxation
Delete Relaxation
Estimate cost to the goal by removing negative effects of actions.
i.e. any PDDL effect that removes a fact is no longer considered.
Example: FreeCell Solitaire
Free cells, tableau positions remain available after moving cards onto them.
Cards remain movable and remain valid targets for other cards after moving cards on top of them.
FACULTY OF NATURAL & MATHEMATICAL SCIENCES
DEPARTMENT OF INFORMATICS
So if we consider for example a game of freecell solitaire, where cards can either moving into free cells on the board or into tableau positions – meaning the decks of decreasing numbers and alternating red/black colours – if we remove the delete effects, then the fact that a cell is free still holds, even after we put a card on top of it. Similarly, because cards are still free when we put cards on top of them, it means other cards can still be put atop them as well.
This sounds anarchic, but what it does is it removes so many of the hard constraints of the puzzle and makes it a lot easier to solve.
Relaxed Planning
Relaxed planning used to denote planning using delete relaxation.
Resulting in relaxed plans.
How does this work as a heuristic?
For evaluating current state scurr, calculate relaxed plan.
h(Scurr) = cost (#actions) of the relaxed plan.
Σ = (S,A,γ)
s ∈ S, a ∈ A
γ(s,a) s’
FACULTY OF NATURAL & MATHEMATICAL SCIENCES
DEPARTMENT OF INFORMATICS
So the process of finding a solution to a problem that uses delete relaxation, is known as Relaxed Planning, which in turn generates relaxed plans.
We can then adopt this as a heuristic? We do this by evaluating any given state in our planning problem by solving the relaxed problem and generating a relaxed plan from that state.
We can then use the cost of the relaxed plan as a heuristic value for that state.
And provided we have generated the optimal relaxed plan, then this will provide us with an admissible heuristic.
So let’s take a look at how this can be used in one specific case…
Relaxed Planning Graph Heuristic
Relaxed Planning Graph (RPG) heuristic creates a simple layered approach to exploring the state space.
Inspired by the GraphPlan system
GraphPlan covered later in this module.
For reference, see (Blum & Furst, 1995)
The relaxed solution plan
Where each Oi is the set of actions selected in parallel at time step i, and m is the number of the first fact layer containing all goals
h(s) := ∑ |Oi|
Blum, A. L., & Furst, M. L. (1995). Fast planning through planning graph analysis.
In Proceedings of the 14th International Joint Conference on Artificial Intelligence (IJCAI95), pp. 1636
FACULTY OF NATURAL & MATHEMATICAL SCIENCES
DEPARTMENT OF INFORMATICS
Let us consider the relaxed planning graph or RPG Heuristic. This is an approach utilised in the FF planner from 2001 that is reliant on the principles of GraphPlan, a system whereby we generate a flat layered approach to search which we will see in a minute. We’ll also be talking more about GraphPlan later in this module, so this is something of a preview of the complexities we’ll find in GraphPlan later on. The RPG heuristic generates relaxed plans that are sets of actions that can be executed in parallel at a given timestep. We then count how many sets of these actions exist and this gives us our heuristic value.
Let’s walk through an example and start to get a grasp on how this works.
Building a Relaxed Planning Graph
FACULTY OF NATURAL & MATHEMATICAL SCIENCES
DEPARTMENT OF INFORMATICS
So let’s start with this situation, where we’re going to calculate the RPG value for this blocksworld example. I have C on A, A and B on the table and the goal state is to have A on B on C.
In order for the relaxed planning graph to be constructed, it builds what is known as a fact layer: this is going to include all of the information that is true at this point in time with respect to the problem.
Building a Relaxed Planning Graph
onBT {T}
onCA {T}
clearB {T}
clearC {T}
FACULTY OF NATURAL & MATHEMATICAL SCIENCES
DEPARTMENT OF INFORMATICS
So our first fact layer, layer 0 only has the facts that are true of the current state. However, for the sake of this example, I have also provided all the facts that aren’t true, these are listed in red and without the {T} on the right hand side. You usually wouldn’t have these in the initial fact layer given the closed world assumption, but given all of these will become true momentarily, I wanted to share them with you
Building a Relaxed Planning Graph
onBT {T}
onCA {T}
clearB {T}
clearC {T}
C to T from A
C to B from A
B to C from T
FACULTY OF NATURAL & MATHEMATICAL SCIENCES
DEPARTMENT OF INFORMATICS
Next we also have the first action layer, and this is all of the valid actions that can exist at this point in time. So here
We then generate the first fact layer, including all three actions we can make from the initial fact layer:
Moving C on to the table T
Moving C onto B
Moving B onto C.
Building a Relaxed Planning Graph
onBT {T}
onCA {T}
clearB {T}
clearC {T}
onBT {T}
onCA {T}
clearA {T}
clearB {T}
clearC {T}
C to T from A
C to B from A
B to C from T
C to A from T
C to B from T
B to A from T
B to T from C
A to B from T
A to C from T
FACULTY OF NATURAL & MATHEMATICAL SCIENCES
DEPARTMENT OF INFORMATICS
So now that we have all of these actions, we can generate the second fact layer. Now as mentioned previously, we’re now using delete relaxation, so we apply the add effects to the new fact layer, but we do not apply the delete effects.
Hence this new fact layer has made several facts now true, such as B now being atop C, C being on top of A and C being atop B. Given that the delete effects are not applied, all of these facts are true in the fact layer. Despite the fact that it’s impossible for all three of them to be true at the same time.
Now with the second fact layer constructed, we can build the second action layer. This action layer contains all of the new actions we can now achieve that were not possible in the previous fact layer.
Building a Relaxed Planning Graph
onBT {T}
onCA {T}
clearB {T}
clearC {T}
onBT {T}
onCA {T}
clearA {T}
clearB {T}
clearC {T}
C to T from A
C to B from A
B to C from T
C to A from T
C to B from T
B to A from T
B to T from C
A to B from T
A to C from T
onBT {T}
onCA {T}
clearA {T}
clearB {T}
clearC {T}
FACULTY OF NATURAL & MATHEMATICAL SCIENCES
DEPARTMENT OF INFORMATICS
This of course now generates our final fact layer, but what’s important is that now all three facts that we expect to see in the goal state, A on B, B on C are now all true. Hence we’ve reached a fact layer where all facts pertaining to the goal are true and from this, we can now work backwards to create a solution.
Extracting a Solution
To get a solution, work backwards through the RPG.
At each fact layer f(n), we have goals to achieve g(n).
We start with g(n) containing the problem goals.
For each fact in g(n):
If it was in f(n-1), add it to g(n-1)
Otherwise, choose an action from a(n), and add its preconditions to g(n-1)
Stop when at g(0).
onBT {T}
onCA {T}
clearA {T}
clearB {T}
clearC {T}
FACULTY OF NATURAL & MATHEMATICAL SCIENCES
DEPARTMENT OF INFORMATICS
Given all the facts are true, the process of extracting the relaxed plan solution, is to work backwards from the goal layer g, creating new goal layers back to the initial layer, figuring out the number of sets of actions required to achieve it.
We look at the preceding layer and whether a fact at layer n existed in layer n-1.
If it did, we add it g(n-1). If it doesn’t, we choose an action from the intermediate fact layer. And add its preconditions to the g(n-1).
This might sound confusing, but it’s pretty straightforward when we see it in practice.
Extracting a Solution
G(n) = onAB, onBC
G(n-1) = onBC
onBT {T}
onCA {T}
clearB {T}
clearC {T}
onBT {T}
onCA {T}
clearA {T}
clearB {T}
clearC {T}
C to T from A
C to B from A
B to C from T
C to A from T
C to B from T
B to A from T
B to T from C
A to B from T
A to C from T
onBT {T}
onCA {T}
clearA {T}
clearB {T}
clearC {T}
FACULTY OF NATURAL & MATHEMATICAL SCIENCES
DEPARTMENT OF INFORMATICS
So if we look at the goal layer, the two facts we need are A on B and B on C.
If we then look at the preceding fact layer, B on C is satisfied, but A on B is not. So we need to find the action that achieves it, in this case move A from the table onto B.
Extracting a Solution
G(n-1) = onBC, clearA
G(n-2) = ???
onBT {T}
onCA {T}
clearB {T}
clearC {T}
onBT {T}
onCA {T}
clearA {T}
clearB {T}
clearC {T}
C to T from A
C to B from A
B to C from T
C to A from T
C to B from T
B to A from T
B to T from C
A to B from T
A to C from T
onBT {T}
onCA {T}
clearA {T}
clearB {T}
clearC {T}
FACULTY OF NATURAL & MATHEMATICAL SCIENCES
DEPARTMENT OF INFORMATICS
Next we have to factor not just the last remaining goal fact, B on C, but also the precondition to move A onto B action, which is that B is clear. Given we added that action to the relaxed plan solution.
Extracting a Solution
G(n-1) = onBC, clearA
G(n-2) = clearB
onBT {T}
onCA {T}
clearB {T}
clearC {T}
onBT {T}
onCA {T}
clearA {T}
clearB {T}
clearC {T}
C on T from A
C on B from A
B on C from T
C on A from T
C on B from T
B on A from T
B on T from C
A on B from T
A on C from T
onBT {T}
onCA {T}
clearA {T}
clearB {T}
clearC {T}
Final relaxed solution:
Put C on T from A, Put B to C from T, Put A to B from T
FACULTY OF NATURAL & MATHEMATICAL SCIENCES
DEPARTMENT OF INFORMATICS
So we identify the two actions that achieve onBC, which is moving B onto C. As well as moving C onto the Table.
This satisfies both of these facts in G(n-1). The preconditions for moving C onto the Table (that C is clear) are already satisfied in fact layer G(n-1). The only outstanding fact is whether B is clear, which is satisfied in G(n-2). This means we have found a relaxed solution, with two actions set, putting C on T and B on C, and the second set comprised of only A on B. Funnily enough, this is also the actual solution to the problem.
But what this means, is that we now have a RPG heuristic of length 3, given there are 3 actions in the relaxed plan.
A good way to come up with heuristics: solve the simplified version of the problem.
Delete Relaxation – discarding all delete effects – provides simplification that can be explored using greedy search in polynomial time.
Delete Relaxation always creates simplifications: problems always easier to solve than the original.
RPG layers provide admissible heuristic to relaxed problems.
But the relaxed plan length is not.
FACULTY OF NATURAL & MATHEMATICAL SCIENCES
DEPARTMENT OF INFORMATICS
In summary, one good way to establish a domain-independent heuristic is solve the relaxed version of the problem.
The delete relaxation provides a nice simplification, given it means that many actions can become possible despite many of the issues that arise with contradictory information in the fact layers.
One thing to recognise is that delete relaxations always create a simplification of a problem, they never make a problem harder and in turn it makes it easer to solve than the original.
Use of the RPG method will find optimal solutions to relaxed problems given the number of layers is an admissible heuristic and in turn provide us with an admissible domain independent heuristic.
Classical Planning
Relaxed Planning & RPG Heuristic
6CCS3AIP – Artificial Intelligence Planning
Dr Tommy Thompson
FACULTY OF NATURAL & MATHEMATICAL SCIENCES
DEPARTMENT OF INFORMATICS
As we wrap up this chapter, this is but one example of how we can approach this problem. In the next chapter, we’re going to look at how we can use the RPG heuristic in a planning system in order to achieve plans. But also the problems that still arise along the way. Thanks for watching and we’ll see you in the next chapter.
/docProps/thumbnail.jpeg
程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com