
Agent-based Systems
Paolo Turrini
www.dcs.warwick.ac.uk/~pturrini · p.turrini@warwick.ac.uk

Extensive Games: Predicting the Future
Paolo Turrini Strategies Agent-based Systems

Plan for Today
We have seen one-shot games, where players play one action each. Now we look at games where time matters, with individual actions being played in sequence.
These are called extensive games.
Today we focus on the basic model for this kind of scenario:
• modelling extensive games of perfect information
• translation from the extensive into the normal form
• Zermelo's Theorem (again!): existence of pure Nash equilibria
• new solution concept: subgame-perfect equilibria
• famous examples: ultimatum game and centipede game
This material is also covered in Chapter 4 of the Essentials.
K. Leyton-Brown and Y. Shoham.
Essentials of Game Theory: A Concise, Multidisciplinary Introduction.
Morgan & Claypool Publishers, 2008. Chapter 4.

The Ultimatum Game
Player 1 chooses a division of a given amount of money.
Player 2 accepts this division or rejects it (in which case both get nothing).
                    1
     50:50        70:30        90:10
       2            2            2
   Acc   Rej    Acc   Rej    Acc   Rej
  (5,5) (0,0)  (7,3) (0,0)  (9,1) (0,0)

Remark: Possibly the most famous game used to study fairness in humans.

Strategic Games in Extensive Form
An extensive-form game is a tuple ⟨N, A, H, Z, i, A, σ, u⟩, where:
• N = {1, …, n} is a finite set of players;
• A is a (single) set of actions;
• H is a set of choice nodes (non-leaf nodes of the tree);
• Z is a set of outcome nodes (leaf nodes of the tree);
• i : H → N is the turn function, fixing whose turn it is when;
• A : H → 2^A is the action function, fixing the playable actions;
• σ : H × A → H ∪ Z is the (injective) successor function; and
• u = (u1, …, un) is a profile of utility functions ui : Z → R.
Must be finite. Must have exactly one root h0 ∈ H such that h0 ≠ σ(h, a) for all h ∈ H and a ∈ A. Must have A(h) ≠ {} for all nodes h ∈ H.
Notice: Requiring σ to be injective ensures every node has (at most) one parent (so the descendants of h0 really form a tree).
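The tuple above can be written down directly as data. Here is a minimal Python sketch (all names are illustrative, not from the slides), encoding the Ultimatum Game from the previous slide:

```python
# A minimal encoding of the tuple <N, A, H, Z, i, A, sigma, u>,
# using the Ultimatum Game as data. Names are illustrative.

N = [1, 2]                                    # players
H = ["root", "h50", "h70", "h90"]             # choice nodes
Z = ["z50a", "z50r", "z70a", "z70r", "z90a", "z90r"]  # outcome nodes

turn = {"root": 1, "h50": 2, "h70": 2, "h90": 2}      # i : H -> N
actions = {"root": ["50:50", "70:30", "90:10"],       # A : H -> 2^A
           "h50": ["Acc", "Rej"], "h70": ["Acc", "Rej"], "h90": ["Acc", "Rej"]}
successor = {("root", "50:50"): "h50", ("root", "70:30"): "h70",
             ("root", "90:10"): "h90",
             ("h50", "Acc"): "z50a", ("h50", "Rej"): "z50r",
             ("h70", "Acc"): "z70a", ("h70", "Rej"): "z70r",
             ("h90", "Acc"): "z90a", ("h90", "Rej"): "z90r"}  # sigma, injective
utility = {"z50a": (5, 5), "z50r": (0, 0),
           "z70a": (7, 3), "z70r": (0, 0),
           "z90a": (9, 1), "z90r": (0, 0)}    # u : Z -> R^n
```

Injectivity of σ can be checked mechanically: no two (node, action) pairs map to the same successor.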

Ultimatum Game: Let’s play!
Player 1 chooses a division of a given amount of money.
Player 2 accepts this division or rejects it (in which case both get nothing).
                    1
     50:50        70:30        90:10
       2            2            2
   Acc   Rej    Acc   Rej    Acc   Rej
  (5,5) (0,0)  (7,3) (0,0)  (9,1) (0,0)

Pure Strategies
Notation: Write Hi := {h ∈ H | i(h) = i} for the set of choice nodes in which it is player i’s turn to choose an action.
A pure strategy for player i maps nodes h ∈ Hi to actions in A(h). Thus, it is a function αi : Hi → A that respects αi(h) ∈ A(h).
Given a profile α = (α1, . . . , αn) of pure strategies, the outcome of the game is the outcome node computed by this program:
h ← h0
while h ∉ Z do
    h ← σ(h, α_{i(h)}(h))
return h
Notice: A strategy describes what to do for every choice node where it would be your turn, even those you may never actually reach.
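The outcome program above translates into a short loop. A Python sketch on the Ultimatum Game (encoding names are illustrative):

```python
# Outcome of a pure-strategy profile: follow the successor function
# from the root until an outcome node is reached.
# Ultimatum Game data (names illustrative):
turn = {"root": 1, "h50": 2, "h70": 2, "h90": 2}
successor = {("root", "50:50"): "h50", ("root", "70:30"): "h70",
             ("root", "90:10"): "h90",
             ("h50", "Acc"): "z50a", ("h50", "Rej"): "z50r",
             ("h70", "Acc"): "z70a", ("h70", "Rej"): "z70r",
             ("h90", "Acc"): "z90a", ("h90", "Rej"): "z90r"}
Z = {"z50a", "z50r", "z70a", "z70r", "z90a", "z90r"}

def outcome(strategy):
    """strategy[i] maps each of player i's choice nodes to an action."""
    h = "root"                                    # h <- h0
    while h not in Z:                             # while h not in Z
        h = successor[(h, strategy[turn[h]][h])]  # h <- sigma(h, alpha_i(h)(h))
    return h

# Player 1 offers 90:10; Player 2 accepts everywhere:
profile = {1: {"root": "90:10"},
           2: {"h50": "Acc", "h70": "Acc", "h90": "Acc"}}
print(outcome(profile))  # -> z90a
```

Note that Player 2's strategy specifies an action at h50 and h70 even though the play never reaches them.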

Translation to Normal Form
Every extensive-form game can be translated into a normal-form game. Translating ⟨N, A, H, Z, i, A, σ, u⟩ to a normal-form game ⟨N′, A′, u′⟩:
• N′ = N, i.e., the set of players stays the same;
• A′ = A′1 × ··· × A′n, with A′i = {αi : Hi → A | αi(h) ∈ A(h)}, i.e., the set of action profiles in the normal-form game is the set of pure-strategy profiles in the extensive game;
• u′ = (u′1, …, u′n), with u′i : α ↦ ui(out(α)), where out(α) is the outcome of the extensive game under pure-strategy profile α.
Thus, the full machinery developed for normal-form games (such as mixed strategies, Nash equilibria, other solution concepts) is available.
So why use the extensive form at all? Because it is (often) a more compact and more intuitive form of representation.
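The translation can be carried out mechanically: enumerate each player's pure strategies and tabulate the utility of each resulting outcome. A Python sketch for the Ultimatum Game (names illustrative):

```python
from itertools import product

# Translating the Ultimatum Game into its normal form: each pure
# strategy of player i picks one action at every node owned by i.
turn = {"root": 1, "h50": 2, "h70": 2, "h90": 2}
actions = {"root": ["50:50", "70:30", "90:10"],
           "h50": ["Acc", "Rej"], "h70": ["Acc", "Rej"], "h90": ["Acc", "Rej"]}
successor = {("root", "50:50"): "h50", ("root", "70:30"): "h70",
             ("root", "90:10"): "h90",
             ("h50", "Acc"): "z50a", ("h50", "Rej"): "z50r",
             ("h70", "Acc"): "z70a", ("h70", "Rej"): "z70r",
             ("h90", "Acc"): "z90a", ("h90", "Rej"): "z90r"}
utility = {"z50a": (5, 5), "z50r": (0, 0), "z70a": (7, 3),
           "z70r": (0, 0), "z90a": (9, 1), "z90r": (0, 0)}

def pure_strategies(player):
    """All maps from the player's choice nodes to playable actions."""
    nodes = [h for h in turn if turn[h] == player]
    return [dict(zip(nodes, choice))
            for choice in product(*(actions[h] for h in nodes))]

def out(profile):
    h = "root"
    while h not in utility:       # outcome nodes carry utilities
        h = successor[(h, profile[turn[h]][h])]
    return h

S1, S2 = pure_strategies(1), pure_strategies(2)
payoff = {(tuple(s1.values()), tuple(s2.values())):
          utility[out({1: s1, 2: s2})]
          for s1 in S1 for s2 in S2}
print(len(S1), len(S2), len(payoff))  # 3 8 24: a 3 x 8 payoff matrix
```

Player 1 has 3 pure strategies and Player 2 has 2 × 2 × 2 = 8, so the induced normal form is a 3 × 8 matrix.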

Exercise: Translation to Normal Form
Recall the Ultimatum Game:
                    1
     50:50        70:30        90:10
       2            2            2
   Acc   Rej    Acc   Rej    Acc   Rej
  (5,5) (0,0)  (7,3) (0,0)  (9,1) (0,0)

Sketch the normal form. How many matrix cells?

Translation from Normal Form
Can we also translate from normal-form to extensive-form games? No! At least not in all cases. So the normal form is more general.
Exercise: Explain why it doesn’t work for the Prisoner’s Dilemma.
           C            D
  C   (−10, −10)   (−25, 0)
  D   (0, −25)     (−20, −20)

Existence of Pure Nash Equilibria
Theorem (Zermelo, 1913: revisited in modern fashion)
Every (finite) extensive-form game has at least one pure Nash equilibrium.
(Zermelo did not know Nash equilibria, nor did he use the technique below!)
Proof.
Work your way up, from the "lowest" choice nodes to the root. Label each h ∈ H with an action a* ∈ A(h) and a vector (u_1^h, …, u_n^h):
• Find (one of) the best action(s) for the selected player i* = i(h):
      a* ∈ argmax_{a ∈ A(h)} u_{i*}^{σ(h,a)}
• Compute the utility labels u_i^h for node h for all agents i ∈ N:
      u_i^h := u_i^{σ(h,a*)}    (where u_i^z := u_i(z) for any z ∈ Z)
This process is well-defined and terminates. And by construction, the resulting
assignment {h ↦ a*} of nodes to pure strategies is a NE.
This method for solving a game is called backwards induction.
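The labelling procedure can be sketched recursively. A minimal Python version, run on the Ultimatum Game (encoding names are illustrative):

```python
# Backwards induction: label every choice node with a best action a*
# for the player to move, and with the utility vector of the subtree
# reached via a*. Ultimatum Game data (names illustrative):
turn = {"root": 1, "h50": 2, "h70": 2, "h90": 2}
actions = {"root": ["50:50", "70:30", "90:10"],
           "h50": ["Acc", "Rej"], "h70": ["Acc", "Rej"], "h90": ["Acc", "Rej"]}
successor = {("root", "50:50"): "h50", ("root", "70:30"): "h70",
             ("root", "90:10"): "h90",
             ("h50", "Acc"): "z50a", ("h50", "Rej"): "z50r",
             ("h70", "Acc"): "z70a", ("h70", "Rej"): "z70r",
             ("h90", "Acc"): "z90a", ("h90", "Rej"): "z90r"}
utility = {"z50a": (5, 5), "z50r": (0, 0), "z70a": (7, 3),
           "z70r": (0, 0), "z90a": (9, 1), "z90r": (0, 0)}

def backwards_induction(h, chosen):
    """Return the utility vector labelling node h; record a* in `chosen`."""
    if h in utility:                    # leaf: u^z = u(z)
        return utility[h]
    i = turn[h] - 1                     # index of the player to move at h
    labels = {a: backwards_induction(successor[(h, a)], chosen)
              for a in actions[h]}
    best = max(labels, key=lambda a: labels[a][i])  # a* in argmax over A(h)
    chosen[h] = best
    return labels[best]                 # u^h := u^{sigma(h, a*)}

chosen = {}
value = backwards_induction("root", chosen)
print(chosen["root"], value)  # 90:10 (9, 1): Player 2 accepts everywhere
```

Ties in the argmax are broken arbitrarily here (max keeps the first maximiser), which is why backwards induction finds *a* pure NE rather than all of them.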

Historical Note: Relevance to Chess
Of course, Zermelo did not phrase his result quite like that: extensive games and Nash equilibria were introduced much later than 1913.
The title of Zermelo’s paper mentions chess (das Schachspiel) …
Using essentially the same argument we have seen (backwards induction), it is easy to see that chess must be determined: either White has a winning strategy, or Black has one, or both players can force a draw.
Of course, the existence of such a strategy does not mean that anyone knows what it actually looks like (the game tree is too big).
Still, the basic idea of backwards induction is at the bottom of any chess-playing program (and the same is true for similar games).

Example: Backwards Induction
Here is the (only!) Nash equilibrium (90:10, Acc-Acc-Acc) you will find by applying backwards induction to the Ultimatum Game:

                    1
     50:50        70:30        90:10
       2            2            2
   Acc   Rej    Acc   Rej    Acc   Rej
  (5,5) (0,0)  (7,3) (0,0)  (9,1) (0,0)

Exercise: Is this the only pure Nash equilibrium for this game?

Noncredible Threats
There are several other Nash equilibria, such as (50:50, Acc-Rej-Rej):

                    1
     50:50        70:30        90:10
       2            2            2
   Acc   Rej    Acc   Rej    Acc   Rej
  (5,5) (0,0)  (7,3) (0,0)  (9,1) (0,0)

Indeed, no player has an incentive to unilaterally change her strategy. Nevertheless, this does not seem a reasonable solution for the game: Player 2's threats to reject are not credible.
Example: In the hypothetical situation where the righthand subgame is reached, to reject would be a strictly dominated strategy for Player 2.

Subgame-Perfect Equilibrium
Every internal node h ∈ H induces a subgame in the natural manner.
A strategy profile s is a subgame-perfect equilibrium of an extensive game G0 if, for every (not necessarily proper) subgame G of G0, the restriction of s to G is a Nash equilibrium.
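This definition can be checked directly: test whether the restriction of a profile is a NE in each subgame. A Python sketch on the Ultimatum Game, applied to the noncredible profile (50:50, Acc-Rej-Rej) from the previous slide (names illustrative):

```python
from itertools import product

# Subgame-perfection check: a profile is an SPE iff its restriction
# to every subgame is a Nash equilibrium. Ultimatum Game data:
turn = {"root": 1, "h50": 2, "h70": 2, "h90": 2}
actions = {"root": ["50:50", "70:30", "90:10"],
           "h50": ["Acc", "Rej"], "h70": ["Acc", "Rej"], "h90": ["Acc", "Rej"]}
successor = {("root", "50:50"): "h50", ("root", "70:30"): "h70",
             ("root", "90:10"): "h90",
             ("h50", "Acc"): "z50a", ("h50", "Rej"): "z50r",
             ("h70", "Acc"): "z70a", ("h70", "Rej"): "z70r",
             ("h90", "Acc"): "z90a", ("h90", "Rej"): "z90r"}
utility = {"z50a": (5, 5), "z50r": (0, 0), "z70a": (7, 3),
           "z70r": (0, 0), "z90a": (9, 1), "z90r": (0, 0)}

def outcome(h, profile):
    while h not in utility:
        h = successor[(h, profile[turn[h]][h])]
    return h

def is_nash_in_subgame(root, profile):
    """No player can gain by unilaterally deviating in the subgame at root."""
    for player in (1, 2):
        nodes = [h for h in turn if turn[h] == player]
        base = utility[outcome(root, profile)][player - 1]
        for choice in product(*(actions[h] for h in nodes)):
            deviation = dict(profile)
            deviation[player] = dict(zip(nodes, choice))
            if utility[outcome(root, deviation)][player - 1] > base:
                return False
    return True

noncredible = {1: {"root": "50:50"},
               2: {"h50": "Acc", "h70": "Rej", "h90": "Rej"}}
print(is_nash_in_subgame("root", noncredible))  # True: it is a NE
print(is_nash_in_subgame("h90", noncredible))   # False: so not an SPE
```

The check in the 90:10 subgame fails because Player 2 would gain 1 > 0 by switching to Acc there, which is exactly the noncredible threat.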
Theorem (Selten, 1965)
Every (finite) extensive-form game has at least one subgame-perfect equilibrium.
Proof.
This is what we showed when we proved Zermelo's Theorem.
Notice: Selten (1965) introduced the concept of SPE for a more specific family of games and did not quite state the theorem above, but these ideas are clearly implicit in that paper.
R. Selten.
Spieltheoretische Behandlung eines Oligopolmodells mit Nachfrageträgheit.
Zeitschrift für die Gesamte Staatswissenschaft, 121(2):301–324, 1965.

Let’s Play: Centipede Game
We start in the choice node on the left. At each step, the player whose turn it is can choose between going right and going down:
1 --right--> 2 --right--> 1 --right--> 2 --right--> (30,30)
|            |            |            |
down         down         down         down
|            |            |            |
(5,5)        (0,15)       (35,10)      (25,35)
Strategies (for each player):
• down-down
• down-right
• right-down
• right-right
Rules: You play and then receive your payoff.
Variant 1: Two volunteers play in full public view, step by step.
Variant 2: Everyone must play, specifying their full strategy on a form. We randomly pick two. The first name gets revealed, must play, and receives their payoff.
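Backwards induction gives a sharp prediction for this game. A minimal Python sketch of the centipede tree above (encoding is illustrative):

```python
# Backwards induction on the centipede game: four decision nodes
# (players 1, 2, 1, 2); "right" continues, "down" stops the game.
down_payoffs = [(5, 5), (0, 15), (35, 10), (25, 35)]  # payoff if stop at node k
final = (30, 30)                                      # payoff if no one stops
players = [1, 2, 1, 2]                                # who moves at node k

def solve(k):
    """Return (utility vector, chosen action) at node k."""
    if k == len(players):
        return final, None
    mover = players[k] - 1                 # index of the player to move
    cont, _ = solve(k + 1)                 # value of continuing right
    if down_payoffs[k][mover] >= cont[mover]:
        return down_payoffs[k], "down"     # stopping is (weakly) better
    return cont, "right"

value, first_move = solve(0)
print(value, first_move)  # (5, 5) down: every player stops at once
```

Working from the right: Player 2 prefers 35 over 30, so stops; anticipating this, Player 1 prefers 35 over 25; and so on back to the root, where Player 1 takes (5, 5) immediately, even though (30, 30) would make both far better off.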

Discussion
It appears that humans rarely play their SPE strategies.
And even when they do, this can result in counterintuitive effects:
Suppose you play your SPE strategy but your opponent doesn’t.
Then you are committed to continuing to play a strategy that you devised on the basis of an assumption (full rationality of your opponent) that just turned out to be wrong …

Summary
This has been an introduction to extensive games, where we (for the first time) model the sequential nature of most real games:
definition of the formal model
• pure strategies as functions from choice nodes to actions
• translation into normal form is always possible
• translation from normal form into extensive form is not
• noncredible threats call for a new solution concept: SPE
• subgame-perfect equilibrium = NE in every subgame
• backwards induction shows: SPE and NE always exist
What next? Games with limited foresight