7CCSMAMS: Agents & Multi-Agent Systems
Embedded agents
Some slides adapted from slides made available by M. Wooldridge, S. Parsons & T. Payne http://www.cs.ox.ac.uk/people/michael.wooldridge/pubs/imas/IMAS2e
Today
Information about the exam
Agent environments
Specifying environment behaviour
Specifying embedded agent behaviour
Specifying what it means for an embedded agent to be successful
Slide ‹#› out of 77
Exam
100% multiple choice 2 hour exam in January
Answer all of around 17 questions, each worth 4-6%
Covers all topics of the module: anything in lectures is assessable, anything not in the lectures is not
Tests understanding of concepts and techniques, including through application to example cases
It is not primarily a memory test, though you need to remember to understand
Preparing for the exam
Use the materials provided: the lecture slides, tutorial exercise solutions, and past exam papers
This is the first year in which the exam is multiple choice
The past exam papers still act as a guide on the style and level of questions and can be used to revise
I have uploaded to KEATS an example of a multiple choice exam paper for an equivalent level module (that I also taught), so you can understand the structure and format
In the last week of term, you will be given a mock exam paper using the same format as your January exam
Motivating example for today:
Consider a robot travelling to a location to take a rock sample.
It is embedded / situated in an environment: the surface of Mars.
In specifying its reasoning, we need to understand what about the environment it can rely on and what it cannot.
Characteristics of agent environments
An agent is a computer system capable of flexible autonomous action in some multi-agent environment in order to meet its design objectives.
Continuous sense-decide-act loop.
AGENT
ENVIRONMENT
input
output
An accessible environment is one in which the agent can obtain complete, accurate, up-to-date information about the environment’s state (in particular we care about the parts of the state that are relevant to the agent).
Most moderately complex environments (including, for example, the everyday physical world and the Internet) are inaccessible.
The more accessible an environment is, the simpler it is to build agents to operate in it.
Environments: Accessible & inaccessible
Environments: Accessible & inaccessible
Image: MichaelMaggs (CC BY-SA 3.0 license)
Image: Oil Industry News (public domain)
A deterministic environment is one in which any action has a single guaranteed effect — there is no uncertainty about the state that will result from performing an action
The physical world can to all intents and purposes be regarded as non-deterministic
Non-deterministic environments present greater problems for the agent designer
Environments: Deterministic & non-deterministic
Environments: Deterministic & non-deterministic
In an episodic environment, the performance of an agent is dependent on a number of discrete episodes, with no link between the performance of the agent in different episodes.
Episodic environments are simpler from the agent developer’s perspective because the agent can decide what action to perform based only on the current episode — it need not reason about the interactions between this and future episodes.
Environments: Episodic & non-episodic
Environments: Episodic & non-episodic
Image: (CC BY-SA 3.0 license)
Image: Presidencia de la República Mexicana (CC BY-SA 2.0 license)
A static environment is one that can be assumed to remain unchanged except by the performance of actions by the agent.
A dynamic environment is one that has other processes operating on it, and which hence changes in ways beyond the agent’s control.
Other processes can interfere with the agent’s actions.
The physical world and the internet are highly dynamic environments
Environments: Static & dynamic
Environments: Static & dynamic
An environment is discrete if there are a fixed, finite number of actions and percepts in it.
Discrete environments could in principle be handled by a kind of “lookup table”.
Environments: Discrete & continuous
Environments: Discrete & continuous
Image: Matt (CC BY-NC-ND 2.0 license)
Image: (CC BY 2.0 license)
The symbol ∃ means ‘there exists’.
The symbol ∀ means ‘for all’.
Given sets A and B, a function from A to B associates each element of A with exactly one element of B.
If f is a function from A to B, we write f : A → B
If f associates x ∈ A with y ∈ B, then we write f(x) = y
Functions and sets (revision)
A = { x | x has the property P }
is the set of all x such that the property P holds for x.
{ x | 2y = x for some positive integer y } = { 2, 4, 6, 8, 10, 12, … }
{ p | p > 0, ∀n > 0 such that n is an integer, np is even } = { 2, 4, 6, 8, 10, 12, … }
x ∈ A means x is an element of the set A.
∅ is the empty set, i.e. the set with no elements in it.
2A is the power set of A, i.e. the set of all subsets of A.
The Cartesian product of sets A and B is the set
A × B = { (x, y) | x ∈ A and y ∈ B }
Functions and sets (revision)
Exercise
1. Are the following statements true or false?
∃x such that 4 < x < 8 and x is odd
∀x such that 4 < x < 8, x is odd
2. A = { a, b }. What is 2A?
3. Let A = {1, 2} and B = {a, b, c}. What is A x B ?
4. Does the following define a function?
Let X = {1, 2, 3, 4, 5} and Y = {x | x is an integer}.
f : X → Y. For n ∈ X, f(n) = 2n if n is odd.
5. Does the following define a function?
Let A = {a, b, c} and B = {1, 2, 3}
f : A → B. f(a) = 2, f(b) = 2, f(a) = 3, f(c) = 2
Embedded agent
An agent is a computer system capable of flexible autonomous action in some multi-agent environment in order to meet its design objectives.
Continuous sense-decide-act loop.
AGENT
ENVIRONMENT
input
output
Assume the environment may be in any of a finite set E of discrete, instantaneous states:
E = {e, e', . . . }
Agents are assumed to have a repertoire of possible actions available to them, which transform the state of the environment:
Ac = {α, α', . . . }
Idea: the environment starts in some state; in each state the agent chooses an action to perform, which causes the environment to change, leading to a new state, and so on.
Agent and environment interaction
A run, r, of an agent in an environment is a sequence of interleaved environment states and actions:
r : e0 —α0→ e1 —α1→ e2 —α2→ e3 —α3→ … —αu−1→ eu
Agent runs
Let:
R be the set of all such possible finite sequences (over E and Ac)
RAc be the subset of these that end with an action
RE be the subset of these that end with an environment state
Possible runs
A state transformer function represents behaviour of the environment:
τ : RAc → 2E
This function takes a run that ends with an action and returns a subset of E (so a set of environment states).
So the environment is history-dependent and non-deterministic.
If τ(r)=∅, then there are no possible successor states to r. In this case, we say that the system has ended its run
An environment Env is then a triple Env =〈E,e0,τ〉 where: E is a set of environment states, e0 ∈ E is the initial state, and τ is a state transformer function.
Environment behaviour
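The triple 〈E, e0, τ〉 can be sketched directly in Python. This is a hypothetical encoding, not from the lecture: states and actions are strings, and τ is a table lookup that defaults to the empty set (the run has ended).

```python
# Hypothetical encoding of Env = <E, e0, tau>: states/actions are strings,
# and tau maps a run ending in an action (a tuple) to the set of possible
# successor states; an empty set means the run has ended.
E = {"e0", "e1", "e2", "e3", "e4", "e5"}
e0 = "e0"

TAU_TABLE = {
    ("e0", "a0"): {"e1", "e2"},
    ("e0", "a1"): {"e3", "e4", "e5"},
}

def tau(run):
    """State transformer: returns set() for any run not in the table."""
    return TAU_TABLE.get(tuple(run), set())

env = (E, e0, tau)
```

Representing τ as a finite table works only for history-dependent environments with finitely many relevant runs, which is exactly the setting of the examples that follow.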
Example
Consider the environment Env1=〈E,e0,τ〉 defined as follows:
E = {e0 , e1, e2 , e3, e4, e5}
τ((e0, α0)) = {e1, e2}
τ((e0, α1)) = {e3, e4, e5}
For all other x, τ(x) = ∅
Environment behaviour
Consider the environment Env1=〈E,e0,τ〉 defined as follows:
E = {e0, e1, e2, e3}
τ((e0, α0)) = {e1, e2}
τ((e0, α1)) = {e3, e1}
τ((e0, α0, e1, α1)) = {e3}
τ((e0, α1, e1, α1)) = {e3}
For all other x, τ(x) = ∅
Exercise
What are all the possible ended runs that could occur in this environment?
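One way to check an answer to this kind of question is to search the run tree mechanically. The sketch below (an assumed encoding: runs as tuples, and a run treated as ended when no action has any successor state) enumerates the ended runs of the exercise environment.

```python
Ac = ("a0", "a1")  # the two actions, alpha0 and alpha1

TAU_TABLE = {
    ("e0", "a0"): {"e1", "e2"},
    ("e0", "a1"): {"e3", "e1"},
    ("e0", "a0", "e1", "a1"): {"e3"},
    ("e0", "a1", "e1", "a1"): {"e3"},
}

def tau(run):
    return TAU_TABLE.get(run, set())

def ended_runs(run=("e0",)):
    """Yield every run that can no longer be extended by any action."""
    extensions = [(a, tau(run + (a,))) for a in Ac]
    if all(not succs for _, succs in extensions):
        yield run                       # no action has a successor: run has ended
        return
    for a, succs in extensions:
        for e in sorted(succs):
            yield from ended_runs(run + (a, e))

runs = sorted(ended_runs())
```

Note the sketch glosses over one subtlety: it reports the ended run up to its final state, rather than including a final action whose τ-value is ∅.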
Agent behaviour can be viewed as a function which maps runs to actions:
Ag : RE → Ac
This function takes a run that ends with an environment state and returns exactly one action. (We’re assuming here that agents are deterministic.)
An agent makes a decision about what action to perform based on the history of the system that it has witnessed to date. Let AG be the set of all agents.
Agent behaviour
A system is a pair (Ag, Env) containing an agent and an environment
Any system will have associated with it a set of possible terminated runs; we denote the set of runs of agent Ag in environment Env by R(Ag, Env)
Agent behaviour
Exercise
Consider the environment Env = 〈E, e0, τ〉 in each of the following cases. For each, give the value of R(Ag, Env)
Case 1.
E = {e0, e1, e2, e3, e4, e5}
τ((e0, α0)) = {e1, e2}, τ((e0, α1)) = {e3, e4, e5}
For all other x, τ(x) = ∅
Ag((e0)) = α0
Case 2.
E = {e0, e1, e2, e3}
τ((e0, α0)) = {e1, e2}, τ((e0, α1)) = {e3, e1},
τ((e0, α0, e1, α1)) = {e3}, τ((e0, α1, e1, α1)) = {e3}
For all other x, τ(x) = ∅
Ag((e0)) = α0, Ag((e0, α0, e1)) = α1
Two agents Ag1 and Ag2 are called behaviourally equivalent with respect to environment Env iff
R(Ag1, Env) = R(Ag2, Env)
If this is true for every environment Env, they are simply called behaviourally equivalent.
Agent behaviour
Formally, a sequence
(e0 , α0, e1, α1, e2 , ...)
represents a run of an agent Ag in environment Env =〈E,e0,τ〉 iff:
1. e0 is the initial state of Env
2. α0 =Ag(e0); and
3. for u > 0,
eu ∈ τ((e0, α0, …, αu-1))
αu = Ag((e0, α0, …, eu))
Agent behaviour
Some agents decide what to do without reference to their history — they base their decision making entirely on the present, with no reference at all to the past
We call such agents purely reactive (or simple reflex agents):
action : E ⟶ Ac
A thermostat is a purely reactive agent
action(e) =
  off  if e = temperature OK
  on   otherwise
Purely reactive agents
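The thermostat can be written down directly as a purely reactive agent; a one-line sketch:

```python
# Purely reactive agent: the chosen action depends only on the current
# environment state e, never on the history of the run.
def action(e):
    return "off" if e == "temperature OK" else "on"
```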
AGENT
ENVIRONMENT
input
output
Purely reactive agents
Now introduce perception subsystem:
Environment
Agent
see
action
Perception
The perception function see maps an environment state to a percept:
see : E → Per
The action-selection function action maps a percept to an action:
action : Per → Ac
Where
Per is a non-empty set of percepts that the agent can obtain through its sensors
see describes the process of perception
action defines decisions based on a percept
Perception
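Continuing the thermostat example, a sketch of the see/action decomposition. The percept names and temperature thresholds here are illustrative assumptions; the point is that see collapses many environment states into a small percept set Per.

```python
def see(temp_celsius):
    """Perception: environment state (a temperature reading) -> percept."""
    return "temperature OK" if 18.0 <= temp_celsius <= 22.0 else "not OK"

def action(percept):
    """Decision: percept -> action."""
    return "off" if percept == "temperature OK" else "on"

def step(e):
    # One sense-decide pass: the agent never sees e directly, only see(e).
    return action(see(e))
```

Because see maps many distinct states (19.0, 20.5, 21.3, …) to the same percept, the agent cannot distinguish them, which is one way inaccessibility enters the formal picture.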
Exercise
A purely reactive agent is defined by the function AgR: E ⟶ Ac. A standard agent is defined by the function AgS: RE ⟶ Ac. Prove or refute the following statements.
For every purely reactive agent, there is a behaviourally equivalent standard agent
For every standard agent, there is a behaviourally equivalent purely reactive agent.
Hints:
To prove that two sets A and B are equal, it is sufficient to show that A ⊆ B and B ⊆ A.
To refute a statement, it is sufficient to identify a counter example and show that the statement cannot hold for this counter example.
We now consider agents that maintain state:
Environment
Agent
see
action
next
state
Agents with state
An internal data structure, state, is typically used to record information about the environment state and history.
Let I be the set of all internal states of the agent.
The perception function see maps an environment state to a percept:
see : E → Per
The function next is introduced, which maps an internal state and a percept to an internal state:
next : I × Per → I
The action-selection function action maps an internal state to an action:
action : I → Ac
Agents with state
1. Agent starts in some initial internal state i0
2. Observes its environment state e, and generates a percept see(e)
3. Internal state of the agent is then updated via the next function, becoming next(i0, see(e))
4. The action selected by the agent is action(next(i0, see(e)))
5. Goto 2
Control loop
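The loop above can be sketched as follows. The dirt-sensing agent is an invented illustration; only the see/next/action structure comes from the slides.

```python
def see(e):
    return "dirty" if e else "clean"          # percept from environment state

def next_state(i, per):
    return i + 1 if per == "dirty" else i     # internal state counts "dirty" percepts

def action(i):
    return "clean_up" if i > 0 else "wait"    # decide from internal state only

def control_loop(env_states, i0=0):
    """Run the sense -> update -> act cycle over a finite stream of states."""
    i, chosen = i0, []
    for e in env_states:
        i = next_state(i, see(e))   # steps 2-3: perceive, then update state
        chosen.append(action(i))    # step 4: select action; loop is step 5
    return chosen
```

Unlike a purely reactive agent, this one keeps acting on past percepts: it continues to choose clean_up even when the current state is clean.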
We build agents in order to carry out tasks for us
The task must be specified by us…
But we want to tell agents what to do without telling them how to do it
Tasks for agents
One possibility: associate utilities with individual states — the task of the agent is then to bring about states that maximise utility
A task specification is a function
u : E → ℝ
which associates a real number with every environment state
Example: trader agent might use u(e) = p(e) where p(e) is profit in state e.
Example: Mars rover agent might use u(e) = r(e) – d(e) where r(e) is weight of collected rocks in state e and d(e) is distance from mother ship in state e
Utility functions over states
But what is the value of a run…
minimum utility of state on run?
maximum utility of state on run?
sum of utilities of states on run?
average?
Disadvantage: difficult to specify a long term view when assigning utilities to individual states
Utility functions over states
Assign a utility not to individual states, but to runs themselves:
u : R → ℝ
Such an approach takes an inherently long term view.
Example: a trading agent might use u(r) = (p(r) − l(r)) × fp(r), where fp(r) is the profit in the final state of r, p(r) is the number of states in which the agent is in profit, and l(r) is the number of states in which it is not.
Utility functions over runs
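The trading-agent utility can be computed once we fix a representation for runs; here, as an assumption, a run is reduced to the sequence of the agent's profit in each state it passes through.

```python
def u(profits):
    """u(r) = (p(r) - l(r)) * fp(r), for a run given as profit-per-state."""
    p = sum(1 for x in profits if x > 0)   # p(r): states in which the agent is in profit
    l = len(profits) - p                   # l(r): states in which it is not
    fp = profits[-1]                       # fp(r): profit in the final state of r
    return (p - l) * fp
```

This illustrates the long-term view: the same final profit is rewarded more if the agent spent most of the run in profit.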
Write P(r | Ag, Env) to denote probability that run r occurs when agent Ag is placed in environment Env
Note: Σr∈R(Ag, Env) P(r | Ag, Env) = 1
Expected utility of a particular run is the probability that the run will occur times the utility of the run: u(r)P(r | Ag, Env)
Then the optimal agent Agopt in an environment Env is the one that maximizes expected utility over all possible runs:
Agopt = arg maxAg∈AG Σr∈R(Ag, Env) u(r)P(r | Ag, Env)
Expected utility & optimal agents
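When AG and the run sets are finite, Agopt can be found by direct enumeration. The agents, runs, utilities and probabilities below are made-up illustrative data, not from the lecture.

```python
def expected_utility(ag, runs_of, u, prob_of):
    """Sum of u(r) * P(r | Ag, Env) over R(Ag, Env)."""
    return sum(u[r] * prob_of[ag][r] for r in runs_of[ag])

def optimal_agent(agents, runs_of, u, prob_of):
    # arg max over the candidate set AG.
    return max(agents, key=lambda ag: expected_utility(ag, runs_of, u, prob_of))

# Illustrative data: agent "b" reaches the high-utility run half the time.
runs_of = {"a": ["r1"], "b": ["r2", "r3"]}
u = {"r1": 1.0, "r2": 0.0, "r3": 3.0}
prob_of = {"a": {"r1": 1.0}, "b": {"r2": 0.5, "r3": 0.5}}
best = optimal_agent(["a", "b"], runs_of, u, prob_of)
```

Here agent "a" earns expected utility 1.0 with certainty, but "b" wins with 0.5 × 0 + 0.5 × 3 = 1.5: the optimal agent need not be the one with the safest runs.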
Exercise
In rock-paper-scissors, each of 2 players simultaneously selects one of 3 moves: rock (R), paper (P), or scissors (S), i.e. Ac = {R, P, S}. R beats S; S beats P; P beats R; else a draw. The utility of winning is 1, losing is −1, drawing is 0.
In a round, possible environment states are {e0, RR, RS, RP, SR, SS, SP, PR, PS, PP}. e0 is the state before play. XY means our agent played X, its opponent played Y.
Our opponent plays P, S and R with probabilities 0.4, 0.5, and 0.1 respectively. So, for any action α, the probability that the state resulting from τ((e0, α)) is in {PP, RP, SP} is 0.4, in {PS, RS, SS} is 0.5, and in {PR, RR, SR} is 0.1.
We can define a stochastic agent with Ag : RE → 2Ac×[0,1], where (α, p) ∈ Ag(r) means p is the probability of performing α next in run r. The following hold:
for all r ∈ RE: if (αi, pi) ∈ Ag(r) and (αi, pj) ∈ Ag(r), then pi = pj (each action occurs at most once)
Σ(αi, pi)∈Ag(r) pi = 1 (the probabilities of the possible actions sum to 1).
Specify an optimal stochastic agent for this problem.
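A useful first step for this exercise is to compute the expected utility of each pure move against the stated opponent mixture. The dictionary encoding below is an assumption.

```python
opponent = {"P": 0.4, "S": 0.5, "R": 0.1}    # opponent's move probabilities
beats = {"R": "S", "S": "P", "P": "R"}        # beats[x] is the move x defeats
beaten_by = {v: k for k, v in beats.items()}  # the move that defeats x

def eu(move):
    """Expected utility of always playing `move`: win +1, lose -1, draw 0."""
    return opponent[beats[move]] - opponent[beaten_by[move]]

eus = {m: eu(m) for m in "RPS"}
```

An optimal stochastic agent can then be read off from these numbers by putting all probability mass on the best pure move(s).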
Some agents cannot be implemented on some computers
(A function Ag : RE → Ac may need more than available memory to implement)
Write AGm to denote the agents that can be implemented on machine (computer) m:
AGm = { Ag|Ag ∈ AG and Ag can be implemented on m}
We can replace the earlier equation with the following, which defines the bounded optimal agent Agbopt:
Agbopt = arg maxAg∈AGm Σr∈R(Ag, Env) u(r)P(r | Ag, Env)
Bounded optimal agents
Where do the numbers come from?
We don’t think in terms of utilities!
Hard to formulate tasks in these terms
Difficulties with utility-based approaches
Not obvious how to generate the numbers
If we’re measuring utility based on something inherently numeric:
State where I have £20, utility = 20
State where I have £300, utility = 300
But otherwise, how do we decide what utility to assign:
State where I’m at home on Saturday night watching tv on my own eating a nice dinner, utility = 1.7? 39? 72?
State where I’m out at a party with all my friends, utility = 2.5? 60? 89?
Difficulties with utility-based approaches
We don’t think in terms of utilities
When I’m considering whether to stay in on my own and eat a nice dinner and watch tv, or to go out to a party with my friends, I don’t think “Staying in would give me utility 17 but going out will give me utility 29, so I’m going to go out”
Difficulties with utility-based approaches
Hard to formulate tasks in these terms
Say we have a robot and we want it to collect samples on Mars.
How do we define its utility function?
utility = number of samples it has?
utility = number of different types of samples it has?
utility = weight of samples it has?
utility = number of samples – energy left?
Not straightforward to define the utility function so that we get the kind of behaviour that we want.
Difficulties with utility-based approaches
A special case of assigning utilities to runs is to assign 0 (false) or 1 (true)
If a run is assigned 1, then the agent succeeds on that run, otherwise it fails
Call these predicate task specifications
Denote predicate task specification by Ψ.
Thus Ψ : R → {0, 1}
Predicate task specifications
A task environment is a pair 〈Env, Ψ〉 where
Env is an environment, and
Ψ : R → {0, 1} is a predicate over runs.
Let TE be the set of all task environments.
A task environment specifies:
the properties of the system the agent will inhabit
the criteria by which an agent will be judged to have either failed or succeeded
Task environments
Write RΨ (Ag, Env) to denote set of all runs of the agent Ag in environment Env that satisfy Ψ:
RΨ(Ag, Env) = {r | r ∈ R(Ag, Env) and Ψ(r) = 1}
We then say that an agent Ag succeeds in task environment 〈Env, Ψ〉 if
RΨ(Ag, Env) = R(Ag, Env)
Predicate task specifications
Let P(r | Ag, Env) denote probability that run r occurs if agent Ag is placed in environment Env
Then the probability P(Ψ | Ag, Env) that Ψ is satisfied by Ag in Env would simply be:
P(Ψ | Ag, Env) = Σr∈RΨ(Ag, Env) P(r | Ag, Env)
Probability of success
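This sum can be sketched directly; runs as tuples, Ψ as a predicate, and prob as the distribution P(· | Ag, Env) are all assumed encodings.

```python
def p_success(runs, psi, prob):
    """P(Psi | Ag, Env) = sum of P(r | Ag, Env) over the runs satisfying Psi."""
    return sum(prob[r] for r in runs if psi(r))

# Illustrative data: two possible runs; "success" means ending in e1.
runs = [("e0", "a0", "e1"), ("e0", "a0", "e2")]
prob = {runs[0]: 0.7, runs[1]: 0.3}
psi = lambda r: r[-1] == "e1"
```

An agent that succeeds outright in the task environment is the special case where this probability is 1.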
Two most common types of tasks are achievement tasks and maintenance tasks:
Achievement tasks are those of the form “achieve state of affairs φ”
Maintenance tasks are those of the form “maintain state of affairs Ψ”
Achievement and maintenance tasks
An achievement task is specified by a set G of “good” or “goal” states: G ⊆ E. The agent succeeds if it is guaranteed to bring about at least one of these states (we do not care which one — they are all considered equally good).
Let G ⊆ E be the set of states in which the achievement goal is realised. An agent Ag succeeds if and only if
∀r ∈ R(Ag, Env): ∃ei ∈ {ex | ex is a state that appears in r} such that ei ∈ G.
Achievement tasks
A maintenance task is specified by a set B of “bad” states: B ⊆ E. The agent succeeds if it is guaranteed never to bring about any of these states, i.e. it manages to avoid all states in B: it never performs an action that results in any state in B occurring.
Let B ⊆ E be the set of bad states that we wish to avoid. An agent Ag succeeds if and only if
∀r ∈ R(Ag, Env): ¬∃ei ∈ {ex | ex is a state that appears in r} such that ei ∈ B.
Maintenance tasks
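Both task types reduce to simple checks over the states appearing in a run. A sketch, assuming runs are encoded as tuples interleaving states and actions (so states sit at even positions):

```python
def states_of(run):
    return set(run[::2])                       # states at even indices of the run

def achieves(run, G):
    """Achievement predicate: some state of the run lies in the goal set G."""
    return not states_of(run).isdisjoint(G)

def maintains(run, B):
    """Maintenance predicate: no state of the run lies in the bad set B."""
    return states_of(run).isdisjoint(B)

def succeeds(runs, psi):
    """Ag succeeds iff every possible run in R(Ag, Env) satisfies the predicate."""
    return all(psi(r) for r in runs)
```

Note the duality: maintaining avoidance of B is exactly never achieving B.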
Next week we look further into how we design agents that can autonomously fulfil their goals, the possible architectures of these agents, and how nature can inspire this design task.
After that, we move on to focus on multi-agent systems.
Next week