PowerPoint Presentation
Classical Planning
Non-Forward Search
Copyright By PowCoder代写 加微信 powcoder
Partial Order Planning
6CCS3AIP – Artificial Intelligence Planning
Dr Tommy Thompson
FACULTY OF NATURAL & MATHEMATICAL SCIENCES
DEPARTMENT OF INFORMATICS
Hello and welcome to this fifth chapter on classical planning. In this chapter we’re going to look at something that seems simple on paper, what about using multiple heuristics at the same time when searching?
Forward Search
Searching forward from the initial to goal state can be problematic due to the nature of the problem space.
We may wind up creating redundant steps in our plans.
This could be the fault of the heuristic.
Or is a reflection of a corner of the search space we fall into.
We may find ourself in a situation where we’re caught in a loop.
Heuristic search should avoid this, but uninformed search is vulnerable.
Depending on the algorithm, we can be at the mercy of the state space to actually find the solution.
FACULTY OF NATURAL & MATHEMATICAL SCIENCES
DEPARTMENT OF INFORMATICS
If we consider how forward search works, we’re pushing forward from our initial state and seeking that all important goal state. Now, depending on the nature of the search space, this can be pretty problematic given the state space will differ from problem to problem. Hence there is a good chance that we are at the mercy of the state space, meaning that we can potentially get caught out by things like redundant states that we have to push through, we might find loops in the state space and yes in many instances the heuristic or the search algorithm will help us avoid this. We know that uninformed search algorithms aren’t suited to these states spaces, so we use heuristic search but even then, as we’ve seen throughout this module we’ve seen that there are situations where even that doesn’t work out well for us.
So, what about just going the other direction?
Regression Search
Searching backwards from the goal state.
Instead of aiming of searching for a state that holds all goals, we search for assignments of the goal facts.
Find actions with add effects that create goal facts.
Then add (unsatisfied) preconditions as subgoals to achieve.
FACULTY OF NATURAL & MATHEMATICAL SCIENCES
DEPARTMENT OF INFORMATICS
And this brings us to backward search or regression search, where we’re going to go backwards from the goal state to try and find the solution.
In fact, on that note, it’s worth noting that backwards search is typically referred to as regression search, given you’re attempting to regress facts in the goal state to previous states in the state space. And in conjunction with this, you will sometimes see Forward Search referred to as Progession Planning, given – as the name implies – you’re in the process of progressing from the initial state to a more advanced or desirable outcome. So yeah, I actually tend to refer to it as Regression Search personally, but it is something to acknowledge that it is referred to in literature in different ways.
The process should sound familiar to you given we’ve already discussed some of this with the RPG Heuristic and then GraphPlan. The idea is that we start from the goal state and start trying to find ways to assign the facts in the goal state by looking at the add effects of actions. If this creates these goals, then we consider this a valid regression and then consider the unsatisfied preconditions of those actions as new subgoals we also need to satisfy.
Regression Planning: More Formally
The set of actions applicable in are the operators that are relevant and consistent.
Namely, for which and
Operators that add something in the state, and don’t delete anything in the state.
The state that follows the application of is such that:
FACULTY OF NATURAL & MATHEMATICAL SCIENCES
DEPARTMENT OF INFORMATICS
A little more formally, the actions we are looking for are those that are relevant and consistent in the context of the current problem.
This means we consider all possible actions and assess whether they result in new facts being added to the state, while also not removing information from the state. These are considered to be the appropriate actions we can consider and considering we’re starting from the goal, this can be problematic given it could mean we avoid actions that are temporarily destroying goal facts only to replace them later.
But ultimately, the state transition is then achieved by subtracting the add effects of the valid action, but also adding the preconditions, such that we then attempt to solve them next.
Why Search Backwards
But, why is searching backwards any different?
States spaces when searching forwards vs backwards are not symmetric.
Forward Search:
One initial state.
Apply action a to state s? One valid unique successor s’.
Backward Search
A set of goal states.
Multiple states where a can be applied to reach s’.
FACULTY OF NATURAL & MATHEMATICAL SCIENCES
DEPARTMENT OF INFORMATICS
So let’s take a moment – before I get into the how this works and where it’s applied – to discuss why you would even bother. Why is searching backwards any different than searching forwards.
And the real takeaway from this, is that the search spaces when you search in either direction are not symmetric with one another. And by that what I mean is if I was to search forwards in say a blocks world example and find the goal, and then search backwards from the goal to the initial state, those two search spaces are not mirrors of one another. Those search spaces will look quite different.
Now why is that?
Well first of all, if you consider that forward search is all about starting in one state and finding any state that satisfies the goal conditions. But the thing is that the branching factor can be quite large, given if we are in one state s, we apply an action a, we get a successor s’ – this is known, this is our state transition system – but if there are a lot of actions, we can find ourself in a lot of unique states.
Now conversely, backward search isn’t really handling one individual state, it’s handling a set of goal states. So our search space isn’t one unique world state, it’s a set of world states. Why? Because there may be numerous states that satisfy the goal conditions, but when searching backwards we’re only interested in how those facts could be reached. In addition, there is also the notion of how actions results in successor states. As I mentioned, if you apply action a in state s, you reach state s’. But going backwards, if you look at state s’ and ask ourselves if we had just applied action a, how many different states could we have been in that resulted in s’? It’s not as straight forward.
In fact, if you consider the diagram at the top, you can see that there is one action that at state s3 that results in state s9. But there s9 is a state that can be reached from s3, s5, s6 and s10. It might be that in each case, the same action resulted in s9. I mean we don’t know, it’s a flimsy wee diagram, but it helps serve my point.
Regression Search: STRIPS
The STanford Research Institute Problem Solver
Arguably the first planning system dating back to 1971
Used regression search from the goal state.
But… it was incomplete.
FACULTY OF NATURAL & MATHEMATICAL SCIENCES
DEPARTMENT OF INFORMATICS
Now it’s worth mentioning that, STRIPS, arguably the first real AI planning system, uses regression search.
The way that this works is that STRIPS only considers the preconditions of the last operator added to the plan.
So it works backwards from the goal state, and then once an action is added, it recursively calls the same algorithm to see whether that actions preconditions can be satisfied. Only in the event that this is satisfied can it then continue on with the remaining planning process.
This actually results in it being incomplete, since if it cannot satisfy one given precondition based on a valid action it has explored, it bottoms out and assumes the planning process is impossible. It can’t find solutions for some problems and struggles to find optimal solutions for others, notably the in Blocksworld, which I have covered previously in the series. Why? Well it’s to do with conflicting goals that impact one another. But let’s take a look at an example of how regression search might actually work.
(unload MH)
(at mydvd amazon)
(at truck amazon)
(at driver home)
(path home amazon)
(link amazon london)
(link london myhouse)
Regression Search (Example)
FACULTY OF NATURAL & MATHEMATICAL SCIENCES
DEPARTMENT OF INFORMATICS
So let’s look at a very quick example. Say that we have this situation where I wanted a DVD delivered to my house by Amazon.
In the initial state, we had the DVD at amazon, so someone needed to get in a truck, load the dvd onto the truck and then drive it to me.
So if we start with the initial state, you’ll notice that only MyDvD, which is M is set to have the value of H, which is at my home.
You’ll notice that where the Truck is and where the Driver is, are not relevant here. Because those are not critical goal facts.
Now the only action that will create the desired goal fact is to unload M from T at location H. Given that this now ensures that the DVD is at my home, at H, it requires that we solve the precondition, which is that the truck is also at my H in order for the unload to happen.
(unload MH)
(drive LH)
(at mydvd amazon)
(at truck amazon)
(at driver home)
(path home amazon)
(link amazon london)
(link london myhouse)
Regression Search (Example)
FACULTY OF NATURAL & MATHEMATICAL SCIENCES
DEPARTMENT OF INFORMATICS
Now let’s consider what actions could have been employed in order for this to happen?
Loading MyDVD onto the Truck at H (home)
Loading MyDVD onto the Truck at L (London)
Loading MyDVD onto the Truck at A (Amazon)
Driving the Truck from L to H (London to my Home)
Now as mentioned, it’s important to factor in whether or not an action is consistent. In this case, the first action, loading MyDVD onto the truck, isn’t a good action here. Why? Because yes it will satisfy the precondition to have MyDVD on the truck, but it also deletes a goal fact of MyDVD being at H. So it’s not applicable, we ignore it.
Driving from L to H makes a lot of sense here. But you’ll notice that also the other two load actions for loading MyDVD onto the truck. Now we’re only interested in whether the add effects satisfy our current subgoal, which is to have MyDVD on the Truck. So these are valid paths of the state space to explore, even though it leads to the possibility in that the truck is in one of two different places in each of theses scenarios.
(unload MH)
(drive LH)
(at mydvd amazon)
(at truck amazon)
(at driver home)
(path home amazon)
(link amazon london)
(link london myhouse)
(drive LH)
(drive LH)
Regression Search (Example)
FACULTY OF NATURAL & MATHEMATICAL SCIENCES
DEPARTMENT OF INFORMATICS
So let’s quickly roll out the top two state sets above, and again we’re seeing the need to satisfy two preconditions:
In the first instance, it’s that the driver is in the truck and we know that would happen if the driver has driven the truck somewhere, as we already saw with the first drive action from L to H. Now this gives us a more concrete state, but it doesn’t address the fact that somehow the drive between L and H happens before loading M onto T at L?
Meanwhile the second rollout sees us also add a drive action. Which means we can now confirm the driver is in T. But we still don’t really know where the Truck is at this point. It could be either at A or at L given we’ve driven away from H to L.
And of course let’s consider the other state I didn’t expand yet…
(unload MH)
(drive LH)
(at mydvd amazon)
(at truck amazon)
(at driver home)
(path home amazon)
(link amazon london)
(link london myhouse)
(board DH)
(board DL)
(board DD)
(drive LH)
(drive LH)
(drive AL)
Regression Search (Example)
FACULTY OF NATURAL & MATHEMATICAL SCIENCES
DEPARTMENT OF INFORMATICS
And it’s much of the same process here, with varying valid and consistent actions that allow us to enumerate possible situations from the goal state.
But the key thing, is we’re actually getting closer to a valid initial state, given that Drive AL driving from Amazon to London is a state where M is in T, T is at A and D is driving T. So we only have to get M loaded onto T, for D to board T at A which in turn requires D to walk from their home to Amazon. At which point the plan is satisfied. So it’s a different way in which to do it, but we are able to find a good solution by rolling out in this direction.
But it is possible that we could have found other valid actions throughout this search graph. In fact we found actions such as the driver boarding the truck and the package being loaded onto the truck that are useful, but just not at that point in time. Hence, it would useful if we could establish that certain actions are good to have in the plan, but just don’t commit to them entirely.
Partial Order Planning: Searching in Plan Space
Planning in plan space, using a lifted framework.
Initial plan, initial state, goal state.
Pick an unresolved goal.
Select an action to achieve it (or the initial state)
Add to the plan
Add a causal link and add its preconditions as open goals.
Find an unbound parameter and select a binding.
Find a threat, resolve by promotion or demotion.
FACULTY OF NATURAL & MATHEMATICAL SCIENCES
DEPARTMENT OF INFORMATICS
And this brings us to the idea of Partial Order Planning, where we’re going to play within the plan space – a sort of lifted representation of the state model – and find actions we want to apply, but not dictate the order immediately. In fact, we’ll generate a partial order plan, after which the job is then to try and find the optimal solution that ensures their orderings are valid.
So we begin with our initial and goal state and our set of actions. We can then aim to find an action that addresses an unresolved goal much like before, and if it is satisfying what we’re trying to do, we add it to the plan, we add its preconditions as subgoals. But we also add what is known as a causal link.
A causal link is a special identifier that acknowledges which actions meet which preconditions of other actions. This helps us better identify the ordering of actions in the final plan, given the link can now tell us that ideally a group of actions should appear at some before the other action. Ideally, we don’t want a lot of causal links to emerge, given that this means the number of possible orderings of actions is going to be much more tightly constrained.
We can then also consider bindings: which is ensuring that two actions use the same parameter, thus ensuring some continuity. So for example loading a package and unloading a package at different points in the plan both use the same package or the same truck.
And lastly, there are threats: threats are when the search process discovers that an existing ordering of actions will break bindings and causal links. You can then resolve this by either promoting an action, meaning you move it to after the connection it threatens, or demotion, which moves it behind it.
Link LH Link AL At TA at DVDA
Ignoring the Driver
Unload ?P ?L ?T
POP Example
At ?T ?L In ?P ?T
FACULTY OF NATURAL & MATHEMATICAL SCIENCES
DEPARTMENT OF INFORMATICS
So let’s consider this example where we start with the initial and goal state, and then we consider how to achieve the goal fact of having the DVD at H.
We could consider using the unload action, given that would achieve the desired effect, we then need to consider the preconditions involved and what assignment to all of these variables is required.
Link LH Link AL At TA at DVDA
Ignoring the Driver
Unload DVD H T
POP Example
At T H In DVD T
FACULTY OF NATURAL & MATHEMATICAL SCIENCES
DEPARTMENT OF INFORMATICS
We can quickly fill in the possible values for this, with Truck T being used and the DVD being at location H (my house). We can try a variety of combinations, but this is the one that works.
This not only sets up our preconditions, but also sets the new goal for us to solve is getting the truck to H and the DVD in the truck.
Link LH Link AL At TA at DVDA
Ignoring the Driver
Unload DVD H T
POP Example
At T H In DVD T
Load DVD T ?L
At T ?L At DVD ?L
FACULTY OF NATURAL & MATHEMATICAL SCIENCES
DEPARTMENT OF INFORMATICS
So next we consider how to satisfy loading the DVD onto the truck at L, but that doesn’t work. Because the facts, given there is an inconsistency within some of the facts of where T is. If we assigned the variable L to be H, then it violates the goal we’ve already solved. And doesn’t give us useful information of how the truck got to location H in the first place. So let’s scrap that for now.
Link LH Link AL At TA at DVDA
Ignoring the Driver
Unload DVD H T
POP Example
At T H In DVD T
Drive T L H
Link LH At TL
FACULTY OF NATURAL & MATHEMATICAL SCIENCES
DEPARTMENT OF INFORMATICS
We can establish that driving from L to H will get us to the correct destination and also, it generates a causal link in saying that Driving from T to H is required before we unload the DVD. Plus there’s also a causal link dependency on the drive action, because in order for it to work, we know that Link LH needs to be a valid fact. So we know at some point before executing this action, we have to be at L in order to reach H. Now we still have two facts we need to resolve, how did the DVD get in the truck in the first place, and how did the truck get to location L before it drove to H.
Link LH Link AL At TA at DVDA
Ignoring the Driver
Unload DVD H T
POP Example
At T H In DVD T
Drive T L H
Link LH At TL
Drive T A L
Link AL At TA
FACULTY OF NATURAL & MATHEMATICAL SCIENCES
DEPARTMENT OF INFORMATICS
So now, having tackled how the truck reached L, we’ve achieved a lot of useful information: we now know that Driving T from A to L can satisfy the need for T to be at L before it drives on to H.
But also, we create three new causal links. We establish that Drive T from A to L is required before we then drive it from L to H. But also, we link the preconditions of Driving from A to L to the initial state. So we know that driving T from A to L not only has to happen before we drive to H, but it also needs to happen after the initial state.
This is coming along nicely, but, we still don’t know about the DVD being in the truck. Perhaps we can revisit that action from before?
Link LH Link AL At TA at DVDA
Ignoring the Driver
Unload DVD H T
POP Example
At T H In DVD T
Drive T L H
Link LH At TL
Drive T A L
Link AL At TA
Load DVD T ?L
At T ?L At DVD ?L
L = H, A or L?
FACULTY OF NATURAL & MATHEMATICAL SCIENCES
DEPARTMENT OF INFORMATICS
So if we reconsider the load action, we now have a range of values that the location can be set to inside the load action. It could A, or L or H.
So let’s find the best one that fits in. We know going through this example the answer is A, but what happens when we a
程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com