
7CCSMAMS: Agents & Multi-Agent Systems

Embedded agents

Some slides adapted from slides made available by M. Wooldridge, S. Parsons & T. Payne http://www.cs.ox.ac.uk/people/michael.wooldridge/pubs/imas/IMAS2e


Today
Information about the exam
Agent environments
Specifying environment behaviour
Specifying embedded agent behaviour
Specifying what it means for an embedded agent to be successful

Exam
A 100% multiple-choice, 2-hour exam in January.
Answer all of around 17 questions, each worth 4-6%.
It covers all topics of the module: anything in the lectures is assessable; anything not in the lectures is not.
It tests understanding of concepts and techniques, including through application to example cases.
It is not primarily a memory test, though you do need to remember material in order to understand it.


Preparing for the exam
Use the materials provided: the lecture slides, tutorial exercise solutions, and past exam papers.
This is the first year in which the exam is multiple choice.
The past exam papers still act as a guide to the style and level of questions and can be used for revision.
I have uploaded to KEATS an example of a multiple-choice exam paper for an equivalent-level module (that I also taught), so you can understand the structure and format.
In the last week of term, you will be given a mock exam paper using the same format as your January exam.


Motivating example for today

Consider a Mars rover travelling to a location to take a rock sample.
It is embedded / situated in an environment: the surface of Mars.
In specifying its reasoning, we need to understand what it can rely on about the environment and what it cannot.


Characteristics of agent environments
An agent is a computer system capable of flexible autonomous action in some multi-agent environment in order to meet its design objectives.
It runs a continuous sense-decide-act loop.

[Diagram: the agent's output (actions) feeds into the environment; the environment's state feeds back as input to the agent.]


Environments: Accessible & inaccessible

An accessible environment is one in which the agent can obtain complete, accurate, up-to-date information about the environment’s state (in particular, we care about the parts of the state that are relevant to the agent).

Most moderately complex environments (including, for example, the everyday physical world and the Internet) are inaccessible.

The more accessible an environment is, the simpler it is to build agents to operate in it.


[Example images for accessible vs inaccessible environments: MichaelMaggs (CC BY-SA 3.0 license); Oil Industry News (public domain)]


Environments: Deterministic & non-deterministic

A deterministic environment is one in which any action has a single guaranteed effect: there is no uncertainty about the state that will result from performing an action.

The physical world can, to all intents and purposes, be regarded as non-deterministic.

Non-deterministic environments present greater problems for the agent designer.


Environments: Episodic & non-episodic

In an episodic environment, the agent’s performance is divided into a number of discrete episodes, with no link between the agent’s performance in different episodes.

Episodic environments are simpler from the agent developer’s perspective because the agent can decide what action to perform based only on the current episode; it need not reason about the interactions between this and future episodes.


[Example images for episodic vs non-episodic environments: (CC BY-SA 3.0 license); Presidencia de la República Mexicana (CC BY-SA 2.0 license)]


Environments: Static & dynamic

A static environment is one that can be assumed to remain unchanged except by the performance of actions by the agent.

A dynamic environment is one that has other processes operating on it, and which hence changes in ways beyond the agent’s control.

Other processes can interfere with the agent’s actions.

The physical world and the Internet are highly dynamic environments.


Environments: Discrete & continuous

An environment is discrete if there are a fixed, finite number of actions and percepts in it.

Discrete environments could in principle be handled by a kind of “lookup table”.


[Example images for discrete vs continuous environments: Matt (CC BY-NC-ND 2.0 license); (CC BY 2.0 license)]


Functions and sets (revision)

The symbol ∃ means ‘there exists’.
The symbol ∀ means ‘for all’.

Given sets A and B, a function from A to B associates each element of A with exactly one element of B.
If f is a function from A to B, we write f : A → B.
If f associates x ∈ A with y ∈ B, then we write f(x) = y.


Functions and sets (revision)

A = { x | x has the property P } is the set of all x such that the property P holds for x.
{ x | 2y = x for some positive integer y } = { 2, 4, 6, 8, 10, 12, … }
{ p | p > 0 and np is even for every integer n > 0 } = { 2, 4, 6, 8, 10, 12, … }

x ∈ A means x is an element of the set A.
∅ is the empty set, i.e. the set with no elements in it.

2^A is the power set of A, i.e. the set of all subsets of A.
The Cartesian product of sets A and B is the set
A × B = { (x, y) | x ∈ A and y ∈ B }
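The two constructions above can be explored concretely. A small sketch (not from the slides) using Python's standard library, where subsets are represented as tuples:

```python
from itertools import chain, combinations, product

def power_set(A):
    """All subsets of A (as tuples), from the empty set up to A itself."""
    return list(chain.from_iterable(combinations(A, k) for k in range(len(A) + 1)))

A = ['a', 'b']
B = [1, 2, 3]
print(power_set(A))         # [(), ('a',), ('b',), ('a', 'b')]
print(list(product(A, B)))  # the 6 pairs of A x B
```

Note that |2^A| = 2^|A| and |A × B| = |A| · |B|, which the output confirms for these small sets.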


Exercise

1. Are the following statements true or false?
∃x such that 4 < x < 8 and x is odd
∀x such that 4 < x < 8, x is odd
2. A = { a, b }. What is 2^A?
3. Let A = {1, 2} and B = {a, b, c}. What is A × B?
4. Does the following define a function? Let X = {1, 2, 3, 4, 5} and Y = { x | x is an integer }. f : X → Y. For n ∈ X, f(n) = 2n if n is odd.
5. Does the following define a function? Let A = {a, b, c} and B = {1, 2, 3}. f : A → B. f(a) = 2, f(b) = 2, f(a) = 3, f(c) = 2.

Embedded agent

An agent is a computer system capable of flexible autonomous action in some multi-agent environment in order to meet its design objectives.
It runs a continuous sense-decide-act loop.

[Diagram: the agent's output (actions) feeds into the environment; the environment's state feeds back as input to the agent.]

Agent and environment interaction

Assume the environment may be in any of a finite set E of discrete, instantaneous states:
E = {e, e', …}
Agents are assumed to have a repertoire of possible actions available to them, which transform the state of the environment:
Ac = {α, α', …}
Idea: the environment starts in some state, and the agent chooses an action to perform in each state; the action causes the environment to change, leading to a new state, and so on.

Agent runs

A run, r, of an agent in an environment is a sequence of interleaved environment states and actions:
r : e0 -α0-> e1 -α1-> e2 -α2-> e3 -α3-> … -αu-1-> eu

Possible runs

Let:
R be the set of all such possible finite sequences (over E and Ac)
R^Ac be the subset of these that end with an action
R^E be the subset of these that end with an environment state

Environment behaviour

A state transformer function represents the behaviour of the environment:
τ : R^Ac → 2^E
This function takes a run that ends with an action and returns a subset of E (that is, a set of environment states).
So the environment is history dependent and non-deterministic.
If τ(r) = ∅, then there are no possible successor states to r. In this case, we say that the system has ended its run.
An environment Env is then a triple Env = 〈E, e0, τ〉 where:
E is a set of environment states,
e0 ∈ E is the initial state, and
τ is a state transformer function.

Environment behaviour: example

Consider the environment Env1 = 〈E, e0, τ〉 defined as follows:
E = {e0, e1, e2, e3, e4, e5}
τ((e0, α0)) = {e1, e2}
τ((e0, α1)) = {e3, e4, e5}
For all other x, τ(x) = ∅

Exercise

Consider the environment Env1 = 〈E, e0, τ〉 defined as follows:
E = {e0, e1, e2, e3}
τ((e0, α0)) = {e1, e2}
τ((e0, α1)) = {e3, e1}
τ((e0, α0, e1, α1)) = {e3}
τ((e0, α1, e1, α1)) = {e3}
For all other x, τ(x) = ∅
What are all the possible ended runs that could occur in this environment?

Agent behaviour

Agent behaviour can be viewed as a function which maps runs to actions:
Ag : R^E → Ac
This function takes a run that ends with an environment state and returns exactly one action. (We’re assuming here that agents are deterministic.)
An agent makes a decision about what action to perform based on the history of the system that it has witnessed to date.
Let AG be the set of all agents.

A system is a pair (Ag, Env) containing an agent and an environment.
Any system will have associated with it a set of possible terminated runs; we denote the set of runs of agent Ag in environment Env by R(Ag, Env).

Exercise

Consider the environment Env = 〈E, e0, τ〉 in each of the following cases. For each, give the value of R(Ag, Env).
Case 1.
E = {e0, e1, e2, e3, e4, e5}
τ((e0, α0)) = {e1, e2}, τ((e0, α1)) = {e3, e4, e5}
For all other x, τ(x) = ∅
Ag(e0) = α0
Case 2.
E = {e0, e1, e2, e3}
τ((e0, α0)) = {e1, e2}, τ((e0, α1)) = {e3, e1}, τ((e0, α0, e1, α1)) = {e3}, τ((e0, α1, e1, α1)) = {e3}
For all other x, τ(x) = ∅
Ag((e0)) = α0, Ag((e0, α0, e1)) = α1

Agent behaviour

Two agents Ag1 and Ag2 are called behaviourally equivalent with respect to environment Env iff R(Ag1, Env) = R(Ag2, Env).
If this is true for any environment Env, they are simply called behaviourally equivalent.

Formally, a sequence (e0, α0, e1, α1, e2, …) represents a run of an agent Ag in environment Env = 〈E, e0, τ〉 iff:
1. e0 is the initial state of Env
2. α0 = Ag(e0); and
3. for u > 0,
eu ∈ τ((e0, α0, …, αu-1))
αu = Ag((e0, α0, …, eu))

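The exercises above can be checked mechanically. A hedged sketch (the names and the run-termination convention are our assumptions, not from the slides): τ for Case 2 as a lookup table, and an enumeration of R(Ag, Env) in which a run ends at its last state once τ of the extended run is empty.

```python
def runs(env_tau, e0, agent):
    """Enumerate R(Ag, Env): all terminated runs of the agent in the environment.
    Convention: a run ends at its last state when tau(run + chosen action) = {}."""
    results = []
    frontier = [(e0,)]              # partial runs ending in a state
    while frontier:
        r = frontier.pop()
        a = agent(r)                # the agent decides on the whole history
        successors = env_tau(r + (a,))
        if not successors:          # tau = {} : no successor states, run ended
            results.append(r)
        else:
            for e in successors:
                frontier.append(r + (a, e))
    return results

# Exercise Case 2 from the slides:
def tau(r):
    table = {
        ('e0', 'a0'): {'e1', 'e2'},
        ('e0', 'a1'): {'e3', 'e1'},
        ('e0', 'a0', 'e1', 'a1'): {'e3'},
        ('e0', 'a1', 'e1', 'a1'): {'e3'},
    }
    return table.get(r, set())

def agent(r):
    # Ag((e0)) = a0, Ag((e0, a0, e1)) = a1; elsewhere the choice is
    # irrelevant because tau is empty for every continuation.
    return {('e0',): 'a0', ('e0', 'a0', 'e1'): 'a1'}.get(r, 'a0')

print(sorted(runs(tau, 'e0', agent)))
# R(Ag, Env) = { (e0, a0, e1, a1, e3), (e0, a0, e2) }
```

Whether a terminated run also records the final, successor-less action is a modelling choice; here we drop it and end at the last state.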

Purely reactive agents

Some agents decide what to do without reference to their history: they base their decision making entirely on the present, with no reference at all to the past.
We call such agents purely reactive (or simple reflex agents):
action : E → Ac
A thermostat is a purely reactive agent:
action(e) = off if e = temperature OK; on otherwise


Perception

Now introduce a perception subsystem:

[Diagram: inside the agent, a see component observes the environment and passes percepts to an action component, which acts on the environment.]


Perception

The perception function see maps an environment state to a percept:
see : E → Per
The action-selection function action maps a percept to an action:
action : Per → Ac
where:
Per is a non-empty set of percepts that the agent can obtain through its sensors
see describes the process of perception
action defines decisions based on a percept


Exercise
A purely reactive agent is defined by the function AgR : E → Ac. A standard agent is defined by the function AgS : R^E → Ac. Prove or refute the following statements.
1. For every purely reactive agent, there is a behaviourally equivalent standard agent.
2. For every standard agent, there is a behaviourally equivalent purely reactive agent.
Hints:
To prove that two sets A and B are equal, it is sufficient to show that A ⊆ B and B ⊆ A.
To refute a statement, it is sufficient to identify a counterexample and show that the statement cannot hold for it.


Agents with state

We now consider agents that maintain state:

[Diagram: inside the agent, see passes percepts to a next component, which updates the internal state; the action component selects actions based on that state.]


Agents with state

An internal data structure state is typically used to record information about the environment state and history.
Let I be the set of all internal states of the agent.
The perception function see maps an environment state to a percept:
see : E → Per
The function next maps an internal state and a percept to an internal state:
next : I × Per → I
The action-selection function action maps an internal state to an action:
action : I → Ac


Control loop

1. The agent starts in some initial internal state i0.
2. It observes its environment state e, and generates a percept see(e).
3. The internal state of the agent is then updated via the next function, becoming next(i0, see(e)).
4. The action selected by the agent is action(next(i0, see(e))).
5. Go to step 2.
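The loop above can be sketched in code. A minimal example (all names and the thermostat behaviour are assumptions for illustration, not from the slides) of a state-based agent whose internal state is a short history of percepts:

```python
TARGET = 20.0

def see(e):
    """Perception: map an environment state (a temperature) to a percept."""
    return 'temp_ok' if e >= TARGET else 'temp_low'

def next_state(i, per):
    """State update next(i, per): remember the three most recent percepts."""
    return (i + (per,))[-3:]

def action(i):
    """Decision: heat whenever the most recent percept reports low temperature."""
    return 'heater_on' if i and i[-1] == 'temp_low' else 'heater_off'

# The loop from the slide: i <- next(i, see(e)); perform action(i); repeat.
i = ()                          # initial internal state i0
for e in [18.5, 19.0, 21.0]:    # a stream of observed environment states
    i = next_state(i, see(e))
    print(action(i))
# prints heater_on, heater_on, heater_off
```

Because the internal state records history, this agent is not purely reactive: action could, for example, be changed to require several consecutive low readings before switching the heater on.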


Tasks for agents

We build agents in order to carry out tasks for us.
The task must be specified by us…
But we want to tell agents what to do without telling them how to do it.


Utility functions over states

One possibility: associate utilities with individual states; the task of the agent is then to bring about states that maximise utility.
A task specification is a function
u : E → ℝ
which associates a real number with every environment state.
Example: a trader agent might use u(e) = p(e), where p(e) is the profit in state e.
Example: a Mars rover agent might use u(e) = r(e) − d(e), where r(e) is the weight of collected rocks in state e and d(e) is the distance from the mother ship in state e.
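The Mars rover example can be made concrete. A hedged sketch, assuming (our choice, not the slides') that each state is modelled as a pair (weight of collected rocks, distance from the mother ship):

```python
def u(e):
    """Utility over states for the rover example: u(e) = r(e) - d(e)."""
    rocks, distance = e
    return rocks - distance

print(u((10.0, 3.5)))  # 6.5
```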


Utility functions over states

But what is the value of a run…
the minimum utility of a state on the run?
the maximum utility of a state on the run?
the sum of the utilities of the states on the run?
the average?
Disadvantage: it is difficult to specify a long-term view when assigning utilities to individual states.


Utility functions over runs

Assign a utility not to individual states, but to runs themselves:
u : R → ℝ
Such an approach takes an inherently long-term view.
Example: a trading agent might use u(r) = (p(r) − l(r)) × fp(r), where fp(r) is the profit in the final state of r, p(r) is the number of states in which the agent is in profit, and l(r) is the number of states in which it is not.

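The trading-agent utility over runs can be sketched directly, assuming (our representation, not the slides') that each state in the run is reduced to the agent's profit in that state:

```python
def utility(run_profits):
    """u(r) = (p(r) - l(r)) * fp(r) over a run given as per-state profits."""
    p = sum(1 for x in run_profits if x > 0)   # p(r): states in profit
    l = len(run_profits) - p                   # l(r): states not in profit
    fp = run_profits[-1]                       # fp(r): profit in the final state
    return (p - l) * fp

print(utility([-5, 10, 20, 15]))  # (3 - 1) * 15 = 30
```

Note the long-term character: the same final profit is worth more when the agent spent most of the run in profit.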

Expected utility & optimal agents

Write P(r | Ag, Env) to denote the probability that run r occurs when agent Ag is placed in environment Env.
Note:
Σ_{r ∈ R(Ag, Env)} P(r | Ag, Env) = 1

The expected utility of a particular run is the probability that the run will occur times the utility of the run: u(r)P(r | Ag, Env).
Then the optimal agent Agopt in an environment Env is the one that maximises expected utility over all possible runs:
Agopt = arg max_{Ag ∈ AG} Σ_{r ∈ R(Ag, Env)} u(r)P(r | Ag, Env)


Exercise
In rock-paper-scissors, each of 2 players simultaneously selects one of 3 moves: rock (R), paper (P), or scissors (S), i.e. Ac = {R, P, S}. R beats S; S beats P; P beats R; else a draw. The utility of winning is 1, losing is −1, drawing is 0.
In a round, possible environment states are {e0, RR, RS, RP, SR, SS, SP, PR, PS, PP}. e0 is the state before play. XY means our agent played X, its opponent played Y.
Our opponent plays P, S and R with probabilities 0.4, 0.5, and 0.1 respectively. So the probability that τ((e0,α)) ∈ {PP,RP,SP} is 0.4, τ((e0,α)) ∈ {PS,RS,SS} is 0.5, τ((e0,α)) ∈ {PR,RR,SR} is 0.1.
We can define a stochastic agent with Ag : R^E → 2^(Ac × [0,1]), where (α, p) ∈ Ag(r) means p is the probability of performing α next in run r. The following hold:
for all r ∈ R^E: if (α, p1) ∈ Ag(r) and (α, p2) ∈ Ag(r), then p1 = p2 (each action occurs at most once)
Σ_{(α, p) ∈ Ag(r)} p = 1 (the probabilities of the possible actions sum to 1).
Specify an optimal stochastic agent for this problem.
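A starting point for this exercise (a sketch of the scoring machinery, not the required answer) is to compute the expected utility of each pure move against the opponent's fixed mixed strategy:

```python
OPPONENT = {'P': 0.4, 'S': 0.5, 'R': 0.1}      # opponent's move probabilities
BEATS = {('R', 'S'), ('S', 'P'), ('P', 'R')}   # (winner, loser) pairs

def u(ours, theirs):
    """Utility of one round: +1 win, -1 loss, 0 draw."""
    if (ours, theirs) in BEATS:
        return 1
    if (theirs, ours) in BEATS:
        return -1
    return 0

for move in 'RPS':
    eu = sum(p * u(move, opp) for opp, p in OPPONENT.items())
    print(move, round(eu, 2))
# R 0.1, P -0.4, S 0.3
```

A utility-maximising stochastic agent then shifts its probability mass toward the best-scoring move.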


Bounded optimal agents

Some agents cannot be implemented on some computers.
(A function Ag : R^E → Ac may need more memory than is available to implement.)

Write AGm to denote the agents that can be implemented on machine (computer) m:
AGm = { Ag | Ag ∈ AG and Ag can be implemented on m }

We can replace the earlier equation with the following, which defines the bounded optimal agent Agbopt:
Agbopt = arg max_{Ag ∈ AGm} Σ_{r ∈ R(Ag, Env)} u(r)P(r | Ag, Env)


Difficulties with utility-based approaches

Where do the numbers come from?
We don’t think in terms of utilities!
It is hard to formulate tasks in these terms.


Difficulties with utility-based approaches

It is not obvious how to generate the numbers.
If we are measuring utility based on something inherently numeric, it is easy:
state where I have £20: utility = 20
state where I have £300: utility = 300
But otherwise, how do we decide what utility to assign?
State where I’m at home on Saturday night watching TV on my own eating a nice dinner: utility = 1.7? 39? 72?
State where I’m out at a party with all my friends: utility = 2.5? 60? 89?


Difficulties with utility-based approaches

We don’t think in terms of utilities.
When I’m considering whether to stay in on my own, eat a nice dinner and watch TV, or to go out to a party with my friends, I don’t think “Staying in would give me utility 17 but going out will give me utility 29, so I’m going to go out”.


Difficulties with utility-based approaches

It is hard to formulate tasks in these terms.
Say we have a robot and we want it to collect samples on Mars. How do we define its utility function?
utility = number of samples it has?
utility = number of different types of samples it has?
utility = weight of samples it has?
utility = number of samples − energy left?
It is not straightforward to define the utility function so that we get the kind of behaviour we want.


Predicate task specifications

A special case of assigning utilities to runs is to assign 0 (false) or 1 (true).

If a run is assigned 1, then the agent succeeds on that run; otherwise it fails.

Call these predicate task specifications.

Denote a predicate task specification by Ψ.
Thus Ψ : R → {0, 1}


Task environments

A task environment is a pair 〈Env, Ψ〉 where:
Env is an environment, and
Ψ : R → {0, 1} is a predicate over runs.

Let TE be the set of all task environments.

A task environment specifies:
the properties of the system the agent will inhabit, and
the criteria by which the agent will be judged to have either failed or succeeded.


Predicate task specifications

Write RΨ(Ag, Env) to denote the set of all runs of the agent Ag in environment Env that satisfy Ψ:
RΨ(Ag, Env) = { r | r ∈ R(Ag, Env) and Ψ(r) = 1 }

We then say that an agent Ag succeeds in task environment 〈Env, Ψ〉 if:
RΨ(Ag, Env) = R(Ag, Env)


Probability of success

Let P(r | Ag, Env) denote the probability that run r occurs if agent Ag is placed in environment Env.

Then the probability P(Ψ | Ag, Env) that Ψ is satisfied by Ag in Env is simply:
P(Ψ | Ag, Env) = Σ_{r ∈ RΨ(Ag, Env)} P(r | Ag, Env)
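This sum is straightforward to compute once the run probabilities are known. A sketch with assumed (illustrative) run probabilities:

```python
# Assumed run probabilities for a small system (illustrative values only).
run_probs = {
    ('e0', 'a0', 'e2'): 0.3,
    ('e0', 'a0', 'e1', 'a1', 'e3'): 0.7,
}

def psi(r):
    """Predicate task specification: succeed iff e3 occurs on the run."""
    return 1 if 'e3' in r else 0

# P(Psi | Ag, Env): summed probability of the runs that satisfy Psi.
p_success = sum(p for r, p in run_probs.items() if psi(r) == 1)
print(p_success)  # 0.7
```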


Achievement and maintenance tasks

The two most common types of tasks are achievement tasks and maintenance tasks:

Achievement tasks are those of the form “achieve state of affairs φ”.

Maintenance tasks are those of the form “maintain state of affairs Ψ”.


Achievement tasks

An achievement task is specified by a set G of “good” or “goal” states: G ⊆ E. The agent succeeds if it is guaranteed to bring about at least one of these states (we do not care which one; they are all considered equally good).

Let G ⊆ E be the set of states in which the achievement goal is realised. An agent Ag succeeds if and only if:
∀r ∈ R(Ag, Env): ∃ei ∈ { ex | ex is a state that appears in r } such that ei ∈ G


Maintenance tasks

A maintenance task is specified by a set B of “bad” states: B ⊆ E. The agent succeeds if it is guaranteed never to bring about any of these states, i.e. it avoids all states in B and never performs actions which result in any state in B occurring.

Let B ⊆ E be the set of bad states that we wish to avoid. An agent Ag succeeds if and only if:
∀r ∈ R(Ag, Env): there is no ei ∈ { ex | ex is a state that appears in r } such that ei ∈ B
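Both success conditions are easy to check for a single run. A sketch, assuming (our encoding) that a run is an interleaved tuple (state, action, state, …), so states occupy the even positions:

```python
def achieves(run, G):
    """Achievement condition: some goal state from G appears on the run."""
    return any(e in G for e in run[::2])

def maintains(run, B):
    """Maintenance condition: no bad state from B appears on the run."""
    return not any(e in B for e in run[::2])

r = ('e0', 'a0', 'e1', 'a1', 'e3')
print(achieves(r, {'e3'}), maintains(r, {'e2'}))  # True True
```

An agent succeeds at the task only if the relevant condition holds for every run in R(Ag, Env), not just one.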


Next week

Next week we look further into how we design agents that can autonomously fulfil their goals, the possible architectures of these agents, and how nature can inspire this design task.
After that, we move on to focus on multi-agent systems.
