

Classical Planning
Optimal Planning


Cost Partitioning
6CCS3AIP – Artificial Intelligence Planning
Dr Tommy Thompson

FACULTY OF NATURAL & MATHEMATICAL SCIENCES
DEPARTMENT OF INFORMATICS

Hello and welcome to this chapter on Optimal Planning. In this chapter we're going to discuss another approach for optimal planning, referred to as cost partitioning, and how it can help us to achieve optimal classical planning.

Recap: The Perils of Expressivity
By having such an expressive formalism, state spaces can begin to explode.

How do we circumvent this?
Use heuristics to guide the search?
Find landmarks to give us sub-goals to solve?
Use local search (see EHC)?

Often sacrificing completeness of search in an effort to find a solution quickly.
Hence Best-First Search fallback in EHC.

What if we instead sacrifice expressiveness for efficiency?

Image Credit: A hierarchical neuronal network for planning behavior
PNAS November 25, 1997 94 (24) 13293-13298


Why are we doing this? Well as discussed in a previous chapter, we’re trying to find ways to optimise the planning process.
And this can arise in different formalisms or approaches to how we have traditionally done things.

Recap: SAS+ Planning
Optimising planning by analysing the problem.

SAS+ Planning re-encodes existing problem into new formulation.

Use of DTGs helps us visualise the assignment of variables to values during search/execution.


First we had SAS+ planning, where we revised the problem space into domain state variables that allow us to capture the mutual exclusion relations that exist between specific propositions, and used that to streamline the state space into something more accessible.

Recap: Pattern Databases
Generate heuristics of sub-problems.

Target Patterns: partial specification of the goal state.

Establish a Pattern Database (PDB)
Set of all permutations of the target pattern.

Calculate pattern cost.
Compute distance to target pattern.
Minimum number of actions to achieve target pattern.

Target Pattern


And in our previous session we discussed Pattern Databases, a process whereby heuristics for problems are calculated by devising the cost of solving sub-problems of the actual problem. By creating a database of target patterns, we search backwards from the goal state of the 'abstract' state space. The abstract state space allows us to search through the state space considering only specific variables, hence entire collections of states are reduced to just one state, given we're only interested in the values that appear in the pattern. In the case of these diagrams, we're only interested in the positions of tiles 1-4 as part of the 8-puzzle problem.

We calculate the distance to the goal from a given pattern by counting the number of actions it would take to move through the abstract space; this then becomes the heuristic for that pattern.
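As a minimal sketch of the distance computation described above, the snippet below runs a breadth-first search backwards from the target pattern over a tiny, purely hypothetical abstract state graph; the unit-cost goal distances it produces are the pattern-database heuristic values.

```python
from collections import deque

# A hypothetical abstract state graph: abstract_space[s] is the set of
# abstract states reachable from s by one (unit-cost) action.
abstract_space = {
    "A": {"B"}, "B": {"A", "C"}, "C": {"B", "goal"}, "goal": set(),
}

def pdb_distances(goal, space):
    """Unit-cost distance from every abstract state to the goal state."""
    # Search backwards: first build the reversed graph.
    rev = {s: set() for s in space}
    for s, succs in space.items():
        for t in succs:
            rev[t].add(s)
    dist = {goal: 0}
    queue = deque([goal])
    while queue:
        s = queue.popleft()
        for p in rev[s]:
            if p not in dist:
                dist[p] = dist[s] + 1
                queue.append(p)
    return dist

print(pdb_distances("goal", abstract_space)["A"])  # 3: A -> B -> C -> goal
```

In a real pattern database this search would run once per pattern and the resulting distances would be stored in a lookup table indexed by abstract state.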

Combining Heuristic Information
If we want to achieve optimal planning, we need to ensure admissible heuristics are provided.

But given planning is hard (NP-hard even in the simplest case), we can't expect any one heuristic to always work well in a given task.

Also, no one heuristic can really grasp the complexity of larger domains.

So can we find a way to combine information from different heuristics?


So let’s take a moment to really think about what we’re doing here in this optimal planning phase of the module. We really want to find ways to improve our search process, and while SAS+ planning opted to move us into a different formalism, such that we can capture so many important facets of the problem, we still need to have useful and powerful admissible heuristics.

And we tried to address this in pattern databases, given these abstractions of the state space are giving us these nice heuristics that tell us how to solve subsets of the larger problem. But we also tried some other approaches in the earlier segments on classical planning: we had the relaxed planning graph heuristic, as well as the use of landmarks in LM-Count, to help us make smart decisions. But in each case, we can't expect any one of these approaches to work well in a given task. There is the reality that these heuristics may just not work as effectively in one domain compared to another. Let's face it: BlocksWorld can become increasingly complicated, but it is also not as complex as some of the logistics domains. So one approach that happens to work well for BlocksWorld may not work well with, say, the logistics domain.

Plus, in the case of some domains, the complexity is such that no one heuristic could really encapsulate all of this knowledge in a way that ensures optimal planning.

This is all a very long-winded way of saying, could we find a better way to exploit the complementary strengths of each heuristic such that we can better search the state space?

If we have multiple abstraction heuristics for the state space, how do we know which is best for a given problem?

Difficult to determine, especially when using domain independent heuristics.

Even more difficult to determine on a per-domain and per-problem basis.

Image Credit:
A Comparison of Cost Partitioning Algorithms for Optimal Classical Planning
J. Seipp, T. Keller and M. Helmert, ICAPS 2017


If we take a moment to return to the idea of abstractions which we have previously explored, we can see that a given abstraction function creates a new abstract state space.
Now in the pattern databases we used the number of actions from abstract states to the goal as the heuristic value. But of course our heuristic function can be any calculation with respect to that state space, so the heuristic value can differ depending on the abstraction.

So for example, consider this simple state space with five states with transitions and action costs identified, and each action is highlighted by a given colour. We then apply abstraction heuristics to that problem and it generates two unique abstractions of the state space.
In each case the topology and structure of that abstract space is slightly different. But it also results in two heuristics, which for now produce the same heuristic value for the initial state s1 as we head towards goal state s5.

Now there's a good chance that the heuristic value from different states will change, and in this case we can actually see that. If we consider for example the state evaluation for s2, using h1 that value is still 5, but for h2 it's now 4. Plus for s3, h1 is 1 and h2 is 4. So clearly these two heuristics are capturing different pieces of information about this problem. What they are, we don't know, this is a rather theoretical example, but depending on the state, we're getting different values based on the abstraction heuristic.

And with that in mind, could we find a way to blend these together in such a way that it proves useful to us?

What if we tried to combine or alternate between these two heuristics?
hadd or hmax?


Now we could work to find ways to combine or use these together. So we could consider hadd where you could add the two heuristics together. Or we could try taking the best of the two heuristics at any point in time, which is what we would denote as hmax.

Now, let's look at how well this works, and does not, in a given situation. If we consider evaluating hmax from state s3, then based on each of these abstraction heuristics, we take the larger of the two estimates, which here comes from h2 with a value of 4, the more informative of the two admissible estimates. Let's not worry here whether that's actually the right one, but it gives us a result we can prioritise. Similarly, if we apply hmax to s1, we get the value of 5, given both heuristics have a value of 5 as we saw on the previous slide. This is actually quite nice, given it's always going to prove admissible, provided h1 and h2 are admissible of course, even in the case they both give the same answer. This would enable us to flip between heuristics based on the values provided by each. But it means we are, at any given time, only relying on the information found in one heuristic.

But if we were to try and combine the two heuristics, can we use this in an admissible way? Both h1 and h2 are admissible individually, but simply summing them together isn't going to work. If we look at hadd for s3, it gives us an admissible value of 5, which is fine. But if we try that from the initial state, we see that it's going to cost 5+5, which is 10, meaning that hadd is not admissible in this instance. Recall that earlier in the module I discussed the idea of ensuring heuristics are additive, such that the sum of h1 + h2 is still admissible, and in this case it is not.

So we’re in a situation where we can use them individually, which is admissible but isn’t capitalising on their individual properties. But combining them might not work given we can’t expect all summations of abstractions to always prove to be additive for any given problem or domain.

And so with that in mind, again, I ask the question: if we have multiple heuristics for a problem, can we determine what is a good way to blend them together?
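The trade-off just described can be sketched in a few lines of Python, using the numbers from the running example (the true optimal cost of 8 from s1 is an assumption read off the diagrams): taking the maximum of admissible estimates is always admissible, while summing them can overshoot.

```python
def h_max(estimates):
    """Take the best (largest) of several admissible estimates.
    This is always admissible: the largest lower bound is still a lower bound."""
    return max(estimates)

def h_add(estimates):
    """Sum the estimates. Only admissible if the heuristics are additive."""
    return sum(estimates)

true_cost_s1 = 8       # assumed optimal plan cost from s1 in the example
h1_s1, h2_s1 = 5, 5    # values from the two abstraction heuristics

print(h_max([h1_s1, h2_s1]))  # 5: at most 8, so admissible
print(h_add([h1_s1, h2_s1]))  # 10: exceeds 8, so inadmissible here
```

The whole point of cost partitioning, introduced next, is to recover the informativeness of the sum without losing admissibility.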

Cost Partitioning
Distribute the cost of each action across identical planning tasks.

Combine arbitrary admissible heuristics into one single heuristic.

However, the partitioning is designed such that the sum of the heuristics is additive (and therefore admissible).


The answer to this approach is one which is given away by the title of this chapter and that is cost partitioning.
Here we are hoping to split the action costs – or more appropriately, partition them – among the heuristics in such a way that the total cost implied by the heuristic does not exceed the original action cost, thus ensuring it remains admissible.

So in essence, we’re trying to ensure our suite of available heuristics is additive, such that we can ensure the summation of these heuristics is still admissible.

But how do we do that? Well, there are a couple of approaches that we’ll discuss in this chapter and while it isn’t exhaustive, it gives you an indication of techniques that are being employed in the field.

Optimal Cost Partitioning
The highest heuristic value for a given state amongst all cost partitionings.

Pro: Fast to compute (polynomial time)
Con: Still too expensive in practice


So let’s consider optimal cost partitioning, whereby we aim to find the best heuristic values for a given state among all of the cost partitionings that we have available.
Now that might sound a little weird, but we're essentially ensuring that if we have found the best values in one heuristic, we tweak the other heuristic such that the sum is still additive.

So returning to our example, here we have the original h1 and h2 at 5 each, and the values for s1 are 5 in each case.

Now when we look first at h1, we can see the optimal cost is 5, given we take the black action, which has a cost of four, followed by the green action with a cost of one.
This results in 5, which is fine, but when we now consider h2, we can't achieve the same goal using the same path. Instead we need to take the red and then the blue action to get 5.

But the two actions that ensure a path to the goal across both abstractions are black and blue, so let's redistribute their costs such that we get an admissible heuristic each time that is also additive for both h1 and h2.
So in this case, we assign the full cost of the black action to h1 and thus give it a value of 0 in h2, while we redistribute the blue action's cost of four across both h1 and h2, giving it a value of 1 in h1 and 3 in h2. This results in h1 being 5, h2 being 3, and an additive, admissible partitioned heuristic value of 8.

So this requires us to analyse and recompute costs across all of the actions in each abstraction. It works and is relatively fast to compute, and by fast we mean it runs in polynomial time, but it still isn't all that feasible to run on larger problems, given rolling this out on even small example state spaces is going to be quite time-consuming.
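The partition just described can be checked mechanically. In this sketch the action costs (black 4, blue 4, green 1, red 1) and the partition itself are assumptions read off the running example, not computed; the code only verifies that the partition is valid and that the combined estimate stays within the true cost.

```python
# Original action costs in the example state space (assumed values).
original = {"black": 4, "blue": 4, "green": 1, "red": 1}

# The partition from the slide: black goes fully to h1,
# blue is split 1 / 3 between h1 and h2.
partition = {
    "h1": {"black": 4, "blue": 1, "green": 1, "red": 0},
    "h2": {"black": 0, "blue": 3, "green": 0, "red": 0},
}

# Validity check: no action is assigned more total cost across the
# heuristics than it had originally.
for action, cost in original.items():
    assert sum(p[action] for p in partition.values()) <= cost

# Under the partitioned costs, h1's path is black + green = 5 and
# h2's path now contributes blue = 3, for a combined estimate of 8.
h1_value = partition["h1"]["black"] + partition["h1"]["green"]
h2_value = partition["h2"]["blue"]
print(h1_value + h2_value)  # 8
```

Because every action's partitioned costs sum to at most its original cost, the sum of the two heuristic values is guaranteed admissible, which is exactly the property optimal cost partitioning maximises.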

Post Hoc Optimisation
Compute weight factors to be applied to costs relevant to each heuristic.

If not relevant to the heuristic, set cost to 0.


Next up, we have post-hoc optimisation, whereby we isolate the relevant actions in each abstraction and, once we have those, find a set of weights to apply to each heuristic, such that the summation of their calculations is additive and, in turn, admissible.

So, if we consider the example here, it's clear that the black and blue actions are relevant in both cases, given they get us to the goal. Meanwhile in h1 the red action is not relevant, given it keeps us in the same abstract state, and the same applies to the green action in h2. So it's clear that in h1 the black, blue and green actions are relevant, while in h2 the black, blue and red actions are relevant.

So next we calculate a weighting for each heuristic, w1 and w2, each of which is between 0 and 1 and together they sum to 1, such that we then multiply the cost of each relevant action by the weight; if it isn't a relevant action, we set it to 0.

So if we return to h1 under post-hoc optimisation, we have calculated a weighting of 0.25, which means the black and blue actions now cost 1 each, and in h2, with a weighting of 0.75, they cost 3 each. Meanwhile the red action in h1 is now 0, given it isn't relevant to the heuristic calculation, and the same applies for the green action in h2. Now this is but one configuration, and we could set weights w1 and w2 differently, but this gives us whole numbers for the costly actions, which is easier to work with, and still retains the additive feature we so desire.
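A small sketch of post-hoc optimisation, using the weights from the example (w1 = 0.25, w2 = 0.75); the action costs and relevance sets are assumptions read off the diagrams.

```python
# Assumed original costs and per-heuristic relevance sets.
costs = {"black": 4, "blue": 4, "green": 1, "red": 1}
relevant = {"h1": {"black", "blue", "green"}, "h2": {"black", "blue", "red"}}
weights = {"h1": 0.25, "h2": 0.75}

def posthoc_cost(action, h):
    """Weighted cost if the action is relevant to heuristic h, else 0."""
    return weights[h] * costs[action] if action in relevant[h] else 0.0

# Because the weights sum to 1, the total cost assigned to any action
# across all heuristics never exceeds its original cost: the resulting
# heuristics are additive.
for a in costs:
    assert sum(posthoc_cost(a, h) for h in weights) <= costs[a]

print(posthoc_cost("black", "h1"))  # 1.0 (0.25 * 4)
print(posthoc_cost("black", "h2"))  # 3.0 (0.75 * 4)
print(posthoc_cost("red", "h1"))    # 0.0 (not relevant to h1)
```

Different weight vectors give different partitions; choosing the best weights is what the "optimisation" part of the name refers to.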

Greedy Zero-One Cost
Orders heuristics in a particular sequence.

We then use the full costs for the first relevant heuristic.


Next up is actually a really simple approach, known as greedy zero-one cost partitioning, and the process is pretty straightforward. We order our heuristics into a particular sequence and then use the full cost of a given action in the first heuristic where that action is relevant to reaching the goal. If the same action then appears in subsequent heuristics, we just set that value to zero. Plus, if the action is not relevant, it also gets set to zero.

So returning to our examples, the black, blue and green actions are relevant in this context. Hence we keep them as they originally were, with costs of 4, 4 and 1 respectively. Clearly, as we've seen in the previous example on post-hoc optimisation, the red action in h1 isn't relevant, so we scrub it and set it to zero.

Meanwhile, if we look at h2, the green action is again not relevant, so we set it to zero, but all the other actions are relevant. However, we have already used the black and blue actions in h1, so we also set those to a value of zero. Hence the only action that retains its cost value is the red one. This actually means that the heuristic value is 0 for h2. Not terribly informative, but it still gives us an additive set of heuristics.

But it's also a little wasteful, given, if we think about it, we stripped any useful information from h2 because we already used the costs of relevant actions in h1. What's interesting here is that the full cost of the blue action in h1 isn't actually needed, given we don't use it to calculate the heuristic estimate. Hence we have a slightly improved version of this approach called saturated cost partitioning.
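The greedy zero-one scheme is simple enough to sketch directly: walk the heuristics in a fixed order and give each action's full cost to the first heuristic where it is relevant. The costs and relevance sets below are assumptions from the running example.

```python
# Assumed original costs and per-heuristic relevance sets.
costs = {"black": 4, "blue": 4, "green": 1, "red": 1}
relevant = {"h1": {"black", "blue", "green"}, "h2": {"black", "blue", "red"}}
order = ["h1", "h2"]  # the chosen heuristic ordering

partition = {h: {} for h in order}
used = set()
for h in order:
    for action, cost in costs.items():
        if action in relevant[h] and action not in used:
            # First heuristic where this action is relevant: full cost.
            partition[h][action] = cost
            used.add(action)
        else:
            # Already consumed earlier in the order, or not relevant: zero.
            partition[h][action] = 0

print(partition["h1"])  # black 4, blue 4, green 1, red 0
print(partition["h2"])  # everything 0 except red, which keeps its cost
```

Note how the ordering matters: had h2 come first, it would have claimed black and blue instead, so different orderings give different (all admissible) combined heuristics.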

Saturated Cost Partitioning
Orders heuristics in a particular sequence.

Use minimum costs to preserve estimate.
Use remaining cost for other heuristics.


So this time around with saturated cost partitioning, we do the same thing as with greedy zero-one, but this time we set the costs of relevant actions to the minimum cost that preserves the heuristic estimate. Then we use whatever cost is left over for the relevant actions in the other heuristics.

So if we revisit this example from the greedy zero-one partitioning, the blue action is reduced to a cost of one in h1, bringing it in line with the green action, and this means that for the blue action in h2, we can use the remaining cost value as part of the heuristic calculation.

This now means that with saturated cost partitioning we have a cost of 8, which is not only closer to the actual value, and thus a little more informative, but is also still admissible.
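The saturation step above can be sketched numerically. The saturated costs for h1 are read off the example rather than computed (computing them in general requires the goal distances inside the abstraction), so treat them as assumptions.

```python
# Assumed original costs in the example.
costs = {"black": 4, "blue": 4, "green": 1, "red": 1}

# Assumed saturated costs for h1: the minimum per-action costs that
# still preserve h1's estimate of 5 (black + green). Blue only needs 1.
saturated_h1 = {"black": 4, "blue": 1, "green": 1, "red": 0}

# Whatever h1 did not consume remains available for later heuristics.
remaining = {a: costs[a] - saturated_h1[a] for a in costs}

h1_value = saturated_h1["black"] + saturated_h1["green"]  # 5, preserved
h2_value = remaining["blue"]                              # 3, handed to h2
print(h1_value + h2_value)  # 8, additive and still admissible
```

Compared with greedy zero-one, nothing about h1's estimate changes, but the leftover cost of blue now makes h2 informative instead of trivially 0.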

Uniform Cost Partitioning
Distribute costs uniformly to all relevant actions.

Distribution across all heuristics.


And lastly, we can consider one more approach, known as uniform cost partitioning, which is to consider all of the relevant actions in each heuristic and then divide their cost uniformly over the number of heuristics that those relevant actions appear in.

So, if we look at h1, then in this instance the red action is not relevant, but the black, blue and green actions are.
Meanwhile, over in h2, the green action is not relevant, but the red, black and blue actions are. If we consider the intersection of these two sets, the black and blue actions are relevant in both, so we divide their total cost by the number of heuristics that use them, in this case dividing by 2. This means that in both h1 and h2 the black and blue actions now cost 2 each.

However, the cost of the red action in h1 is set to zero because it is not a relevant action, but the green action, given it is only relevant in h1, keeps its cost of 1.
And we can see the exact same thing with h2, only this time red is relevant and green is not. So their costs are set accordingly.
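Uniform cost partitioning reduces to a one-line rule: divide each action's cost by the number of heuristics in which it is relevant. As before, the costs and relevance sets are assumptions from the running example.

```python
# Assumed original costs and per-heuristic relevance sets.
costs = {"black": 4, "blue": 4, "green": 1, "red": 1}
relevant = {"h1": {"black", "blue", "green"}, "h2": {"black", "blue", "red"}}

def uniform_cost(action, h):
    """Cost divided evenly among the heuristics the action is relevant in."""
    if action not in relevant[h]:
        return 0.0
    n = sum(1 for rel in relevant.values() if action in rel)
    return costs[action] / n

print(uniform_cost("black", "h1"))  # 2.0: relevant in both heuristics
print(uniform_cost("green", "h1"))  # 1.0: relevant only in h1, keeps its cost
print(uniform_cost("red", "h1"))    # 0.0: not relevant in h1
```

Since each action's shares sum to exactly its original cost, the resulting heuristics are additive without needing to order them, at the price of ignoring how useful each share actually is to each heuristic.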

In Summary
Cost partitioning is one method to allow for us to combine multiple admissible heuristics.

While useful in optimal planning, not something we will then pursue in the rest of the module.


So as we wrap up, it's important to acknowledge that in order to achieve optimal classical planning, we need good heuristics, and that sometimes we need several good heuristics to be used at the same time.
And of course that isn't an easy process, given that if we do have useful heuristics we need to ensure they can be combined, and if they're not going to be additive then that's not useful.

So cost partitioning is a process whereby we can go about circumventing this problem: by partitioning action costs among the heuristics, the sum of their estimates remains additive and therefore admissible.
