CS代写 We got the numbers

We got the numbers

We got the numbers
How to plan with numbers

“The old get old
And the young get stronger
May take a week
And it may take longer
They got the guns
But we got the numbers”
Five to One, The Doors

Who needs numbers?
Most real planning problems involve resources – money, materials, people, components…
Some values are discrete and it is reasonable to name them (people, components), but others are continuous and can only be measured with numbers (materials, perhaps money?)

PDDL would be too limited to be of practical use if it did not support modelling numeric fluents (fluents = variables with changing values, propositional fluents = variables with Boolean values, numeric fluents = fluents with numeric values – real valued)

In domain, declare numeric fluents like predicates:
(:functions (functionName ?parameter1 – type …) … )
Then these fluents can be used in preconditions and effects
In preconditions, we can only use built-in predicates on numeric expressions: < > <= >= =
Numeric expressions are built using fluents, constants and standard arithmetic operators: + * – / Note that the are no other operators or functions (no trig, no exponentiation, no mod, no logs)
Effects can modify numeric fluents using: increase decrease assign scale-up scale-down Note that few planners (if any) handle scale-up or scale-down effects

Some quick examples
Loading packages into trucks uses space and adds weight:
(:action load
:parameters (?t – truck ?p – package ?loc – location)
:precondition (and (at ?t ?loc) (at ?p ?loc)
(<= (volume ?p) (empty_capacity ?t))) :effect (and (not (at ?p ?loc)) (in ?p ?t) (decrease (empty_capacity ?t) (volume ?p)) (increase (weight ?t) (weight ?p)))) Assumes we have declared: (:functions (volume ?p – package) (empty_capacity ?t – truck) (weight ?o – locatable)) Note use of prefix operator notation – all operators are used in this format in PDDL Semantic curio PDDL demands that there be no ambiguity about the state, so simultaneous actions have to be non-interfering (eg we cannot simultaneously check a condition and delete it with two different actions) To consider whether this makes sense, think about two people acting simultaneously, one to shut a door and the other to walk through the doorway However, commuting numeric effects are allowed to be simultaneous Imagine loading to packages into a truck simultaneously – they both affect the capacity and the weight, but these effects do not impact on one another How bad is adding numbers? Adding numbers to a planning problem takes us into a whole new world of difficulty This is equivalent to: find x, y and z (all positive integers) such that: x3 + y3 = z3 proved there is no plan for this problem in 1840s, though Fermat claimed to have a proof in 1630s In 1993-1995 completed the last steps in a several hundred year search for a proof that there is no such plan for any similar problem (except for the quadratic case) More generally – it is undecidable whether there is a solution for an arbitrary Diophantine equation (multi-variate polynomial to be solved in integers) (:action inc :parameter (?n – name) :effect (increase (value ?n) 1)) Initially: (= (value x) 1) (= (value y) 1) (= (value z) 1) Goal: (= (+ (* (value x) (* (value x) (value x))) (* (value y) (* (value y) (value y)))) (* (value z) (* (value z) (value z)))) Metric-FF ( ) Hoffmann was one of the first researchers to tackle the use of numeric fluents in modern planners, using a generalization of his relaxed plan heuristic to account for numeric effects The broad idea is to see numeric effects that change values in a direction that makes goals or preconditions “more likely” to become true as positive effects, and those that make it “less likely” as negative effects For example: note that loading a truck decreases empty_capacity and there is a precondition requiring empty_capacity to be larger than volumes of packages – this effect makes it “less likely” that we can load another package, so it is relaxed out A Modified Presentation We are going to consider a slightly different presentation of the ideas behind Metric-FF – easier to follow, less technically complex and basically equivalent in operation First step: for all relevant numeric fluents in a problem, we will record the reachable range of values – this will be an interval [x,y] where x is the lower bound and y the upper bound on the possible value of this fluent In the initial state, a fluent with value x has reachable range [x,x] A special case is initially undefined fluents – we will ignore those here, but correct treatment of undefined fluents is a detail that implementers have to consider (:action buy :parameters (?g – good) :precondition (and (>= (cash) (cost ?g)) (available ?g))
:effect (and (not (available ?g)) (decrease (cash) (cost ?g)) (increase (happiness) (satisfaction ?g))))

(:action goto
:parameters (?f ?t – location)
:precondition (and (connected ?f ?t) (at ?f))
:effect (and (not (at ?f)) (at ?t)))

(:action request
:parameters (?g – good ?loc – location)
:precondition (and (at ?loc) (vendor ?g ?loc))
:effect (and (not (vendor ?g ?loc)) (available ?g)))

Initially: (at home) (vendor bread bakery) (vendor coffee café) (vendor chocolate shop)
(= (satisfaction bread) 1) (= (satisfaction coffee) 2) (= (satisfaction chocolate) 5)
(= (cost bread) 2) (= (cost coffee) 1) (= (cost chocolate) 3) (= (cash) 4) (= (happiness) 0)
(connected home bakery) (connected bakery cafe) (connected café shop)
Goal: (>= (happiness) 6)
Notice that locations are connected in order: Home -> Bakery -> Cafe -> Shop

Reachability Analysis
At each action layer, identify all applicable actions and then apply all effects together (for propositions, this is just as before, but for numbers this means gathering all effects and applying them in conjunction)

We maintain a record of all reachable values (subsets of truth values reachable for propositions, ranges of numeric values for numeric fluents)

Decrease effects reduce lower bound and increase effects increase upper bound

(happiness)

(at bakery)

(available bread)

(available coffee)

(available chocolate)

(vendor bread bakery)

(vendor coffee café)

(vendor chocolate shop)

Initial state

buy action
(= (satisfaction bread) 1) (= (satisfaction coffee) 2) (= (satisfaction chocolate) 5)
(= (cost bread) 2) (= (cost coffee) 1) (= (cost chocolate) 3)

(chocolate)

buy actions (bread, coffee)

buy actions (bread, coffee, chocolate)
Action layers list only the actions (in abbreviated form) with relevant effects at each layer
Buying bread and coffee both reduce cash by a total of 3
Buying bread, coffee and chocolate reduce cash lower bound by a further 6 (combined cost)
Buying bread, coffee and chocolate increase total happiness by a combined 8
Buying bread costs 2 and gives 1 happiness
Buying bread and coffee increase happiness by a total of 3

Resolving effects
An effect (increase(x)) will increase the upper bound for fluent x by the maximum value of (decrease will reduce the lower bound by the maximum value of )
The maximum value of an expression is found as follows:
Max(A + B) = Max(A) + Max(B)
Max(A * B) = Max(A) * Max(B)
Max(A – B ) = Max(A) – Min(B)
Max(A/B) = Max(A)/Min(B) if Min(B) > 0, Max(A)/Max(B) if Max(B) < 0, undefined if B = [0,0] and ∞ otherwise Max(variable [x,y]) = y Max(number n) = n (assign(x)) increases the upper bound to the maximum of its current value and the maximum value of and decreases the lower bound similarly
Multiple effects on the same variable can be evaluated in any order (the effects commute) – the right hand side expressions are always evaluated using the values from the preceding fact layer (so that the order of updates to arrive at the new values in the next layer is not important)

Evaluating Conditions
A condition can be checked using similar interval arithmetic:
A < B iff Min(A) < Max(B) A = B iff (not (Min(A) > Max(B) or Max(A) < Min(B))) This says, A=B iff the intervals for A and B overlap Note that this relaxation can be quite weak for equality – for example, if we have an action: (:action increase :effect (increase (x) 2)) Then from an initial state with (x = 0) we can reach x in [0,2] after one step, which appears to satisfy a goal (x = 1), even though that goal cannot be satisfied with this action and initial state Extracting a relaxed plan As usual, if we need a proposition to have a particular value (true or false) then we can add the earliest achieving action to the relaxed plan If we need a numeric precondition or goal to be satisfied, things are trickier! We can take the first level at which the condition is satisfied and then proceed as follows: First, rewrite the expression as a standard comparison (A > B) -> (A – B > 0); (A >= B) -> (A – B >= 0) and (A = B) -> (A >= B) AND (B >= A) and then rewrite the two expressions as above
We now consider conditions of the form A > 0 or A >= 0, and we focus on A: rewrite A as a sum of terms: t1 + t2 + t3 … to that each ti has no addition/subtraction in it (a ti could be negative and can include multiplication or division – this is simply arithmetic to split compound expressions) Eg: (3x – y(u + 2v/w)) becomes 3x + (– yu) + (– 2yv/w) etc
Now evaluate the range of each term in this sum [a,b] (remember that –[a,b] = [-b,-a] and a number, n, has range [n,n])
Ideally, we want the cheapest plan to achieve the condition – there are various options here, but best is to ask what the largest value of our expression (A) was in the preceding layer – say it was x – and then find achievers at the current layer that increase x by enough to satisfy A > 0 (or A>= 0) – we can look for the relative contributions of each variable to the expression and request the biggest contribution from the most significant contributor first, and then keep adding in more contributions until we achieve the target. Then we push back to the preceding layer any outstanding requirement after these contributors have done their work.

(happiness)

(at bakery)

(available bread)

(available coffee)

(available chocolate)

(vendor bread bakery)

(vendor coffee café)

(vendor chocolate shop)

buy action

(chocolate)

buy actions (bread, coffee)

buy actions (bread, coffee, chocolate)
❶ Recall: goal is (happiness) >= 6

First true here

Maximum at previous layer is 4, so we need at least 2 more

Biggest contributor is buying chocolate (adds 5) so take that action and propagate back goal (happiness) >= 1
Because 6 (target) – 5 (chocolate) leaves 1 to satisfy
❶Buy chocolate
❸ Require (happiness) >= 1 which is satisfied at the preceding layer, so push it back
❹ Request chocolate
❷ We need enough cash to satisfy the buy precondition, but that is satisfied in the initial state, so no additional actions are required for this
❹ The propositional precondition of Buy Chocolate is (available chocolate) and this is satisfied by Request Chocolate
❻ goto café shop
❼ goto bakery café
❺ buy bread
❽ Request bread
❾ goto home bakery
❺ To achieve (happiness) >= 1 there is only one achiever: buying bread
❻-❾ Propositional preconditions use achievers in usual way
❿The final relaxed plan contains 7 actions including buying bread and chocolate

Metric-FF reports
C:\…\ff -i 127 -o shoppingdom.pddl -f shop1.pddl

ff: parsing domain file
domain ‘SHOPPING’ defined
ff: parsing problem file
problem ‘SHOP1’ defined

no metric specified. plan length assumed.

checking for cyclic := effects — OK.

ff: search configuration is EHC, if that fails then best-first on 1*g(s) + 5*h(s) where
metric is plan length
selecting at step 4:
BUY CHOCOLATE
selecting at step 3:
REQUEST CHOCOLATE SHOP
selecting at step 2:
GOTO CAFE SHOP
selecting at step 1:
REQUEST BREAD BAKERY
GOTO BAKERY CAFE
selecting at step 0:
GOTO HOME BAKERY

Cueing down from goal distance: 7 into depth [1]
Here are these 7 actions in the relaxed plan

Does it work
Yes… and no
This heuristic helps to guide the search with numeric conditions and effects and it also identifies helpful actions
Unfortunately, sometimes the apparently helpful actions are not actually helpful!

Metric-FF final plan
FF sees it is helpful to request bread, but if it buys the bread, it will not have enough cash left to buy the chocolate and the goal is no longer reachable
Therefore, after requesting the bread it then identifies that it needs to request and buy coffee and chocolate
ff: found legal plan as follows

step 0: GOTO HOME BAKERY
1: REQUEST BREAD BAKERY
2: GOTO BAKERY CAFE
3: REQUEST COFFEE CAFE
4: BUY COFFEE
5: GOTO CAFE SHOP
6: REQUEST CHOCOLATE SHOP
7: BUY CHOCOLATE

A problem with incremental bounds
Building the bounds by incrementally applying actions repeatedly at each layer has an important cost
Suppose there is an action to increase (X) by 1 and initially (X) is 0 and the goal is for (X) >= 1000000 – then the reachability graph requires 1000000 layers, even though it is trivial!
An alternative, is to let the bound on (X) go to [0,∞] as soon as (X) can increase
In relaxed plan extraction, we then count the number of repetitions of the increasing action required to achieve the goal
This has a propagating effect if the increasing action also has a numeric precondition (eg if increasing (X) by 1 costs $1 and we can work a day to earn a dollar, then to get (X) >= 1000000 requires 1000000 repetitions of the increase, which will require $1000000 and this, in turn, requires 1000000 days of work)
If the number of applications is bounded by some limited resource, this can be used to bound the maximum value of a variable

If we modify the buy action (no cost and no consumption!) AND change bread to offer satisfaction of 2

Then, with Metric-FF incremental application approach, we get an initial relaxed plan of 7 steps
(:action buy
:parameters (?g – good)
:precondition (and (>= (cash) (cost ?g)) (available ?g))
:effect (and (increase (happiness) (satisfaction ?g))))

selecting at step 3:
BUY COFFEE
selecting at step 2:
REQUEST COFFEE CAFE
selecting at step 1:
REQUEST BREAD BAKERY
GOTO BAKERY CAFE
selecting at step 0:
GOTO HOME BAKERY

Example (POPF)
POPF uses the alternative approach (extend to infinity) so gets the following relaxed plan:
Relaxed plan for state 1, h: 5 :
(goto home bakery), start
(request bread bakery), start
(buy bread), start
(buy bread), start
(buy bread), start

A Counter Intuitive Outcome
Slightly weirdly, FF ends up with this plan (you have to track the successive relaxed plans to see why FF switches track from bread and coffee to chocolate)
ff: found legal plan as follows

step 0: GOTO HOME BAKERY
1: GOTO BAKERY CAFE
2: GOTO CAFE SHOP
3: REQUEST CHOCOLATE SHOP
4: BUY CHOCOLATE
5: BUY CHOCOLATE
POPF comes up with final plan:
;;;; Solution Found
; States evaluated: 6
; Cost: 0.004
0.00000: (goto home bakery)
0.00100: (request bread bakery)
0.00200: (buy bread)
0.00300: (buy bread)
0.00400: (buy bread)

But it’s not always good
On the other hand, if we leave bread with satisfaction 1, POPF buys 6 loaves, while FF still buys 2 bars of chocolate – these two approaches have wins and losses when comparing performance

Traps for the unwary
Consider a domain consisting of these actions:

Suppose initially x = 0, y = 1 and the goal is x >= 4
The POPF approach spots that 4 applications of increaseX will achieve the goal, while FF spots that it is helpful to increase y, but goes for increaseY 3 times (so y = 4) and then increaseX to finish the job
The best plan is to increaseY (y = 2) and then increaseX twice

But, if we start with y = 0, the POPF approach runs into a problem: increaseX has no effect when y = 0 and in order to see the impact of increaseY we need to reapply increaseX – so we need to check, at each layer, whether the right-hand side expressions in effects have changed, and reapply actions that are affected (POPF does not correctly implement this)
(:action increaseX
:effect (increase (x) (y)))

(:action increase
:effect (increase (y) 1))

Some final thoughts
One of the hardest parts of managing numbers through this relaxation is in finding a good relaxed plan: in general, given a goal [complex expression involving multiple variables] > 0, working out which target values for the variables in the expression will be most efficient is hard
Later, we will consider whether, at least for some cases, there are opportunities to tackle this problem

We have looked at the interval-relaxation, in which we relax number values to reachable intervals, but there are alternatives:
a set-relaxation in which we enumerate the reachable values for each numeric variable,
multi-interval in which we represent the reachable range with a union of intervals and
an abstraction-set-relaxation, in which we enumerate reachable values until the set reaches a limiting size and then switch to an interval or some abstraction (eg “all values reachable”)

All these make sense in particular cases, so part of the challenge is perhaps being able to identify which relaxations work best for which problems and automatically deploy the best for a given domain and problem

/docProps/thumbnail.jpeg

程序代写 CS代考加微信: powcoder QQ: 1823890830 Email: powcoder@163.com

Related Posts