程序代写 ICAPS 2013. - PowCoder代写

PowerPoint Presentation

Improving LPRPG-P Using Distance-Cost Pairs
” Searching for Good Solutions in Goal-Dense Search Spaces.” A. J. Coles and A. I. Coles. ICAPS 2013.

Distance to go, cost to go

Distance to go, cost to go
In theory, what is the ‘distance to go’?
(i.e. minimum number of actions to call something a goal state)

Distance to go, cost to go
In theory, what is the cost to go for this state?
(Equivalent to asking ‘which forgo actions must be applied?’)

Distance to go, cost to go
d(S) = 0 → by calling this a goal state right now, and applying no more actions

c(S) = 0 → by applying actions to meet all the goals, so paying no preference costs

What we want: distance–cost pairs:
4 actions will get down to cost 20
8 actions will get down to cost 10
10 actions will get down to cost 7…

Building block: the relaxed plan

Start by pretending all goals are hard goals

What are these actions for?
To meet the actual hard goals (if any)
To meet some preference

Record this when building the relaxed plan.

Annotations
‘Colour code’ with the reason each action is in the plan (H = hard goals)

Distance-cost pairs for previous slide
5 actions = cost 3
7 actions = cost 2
11 actions = cost 1
12 actions = cost 0

The question on none of your minds

Why did you not have the distance cost pair 9:2 (meet the hard goals and p0)?
Why did you not have the distance cost pair 10:1 (meet the hard goals and p0,p1)?
Is even better than the 11:1 on the previous slide

Answer: trying all combinations of preferences would give many more distance-cost pairs, but take ages. Compromise: use a greedy algorithm. Start with just the hard goals and then add preferences one at a time:
Greedy by length: add the preference that requires the fewest actions to be added to the relaxed plan in order to satisfy it (minimise distance to go);
Greedy by cost: add the preference that reduces the cost of the resulting relaxed plan by the most (i.e. the most expensive one that is still violated).

How do we use these?
Now we have a trade-off in search:
Want to explore good states that are close to satisfying all the preferences (high distance, low cost);
But also prioritise reaching better states quickly (low distance, high-er cost).
Solution, dual open-list search:
One open-list sorted by h(all) the length of the relaxed plan to satisfy all preferences;
One open-list sorted by h(CS代考加微信: powcoder QQ: 1823890830 Email: powcoder@163.com

Related Posts