Algorithmic Game Theory and Applications: Sample Solutions for Coursework 2
1. Recall that a Nash equilibrium in an extensive game is subgame per- fect nash equilibrium (SPNE) if it is also a Nash equilibrium in every subgame of the original game. Formally, a subgame, is a game defined by a subtree, Tv of the original game tree, T, such that the subtree Tv, rooted at a node v, has the property that for every decendent u of v in the game tree (including v itself), every node in the same information set as u is also in the subtree Tv.
(a) [7 points] Give an example of a pure NE which is not a SPNE, for a finite extensive form game of perfect information.
Solution: In the 2-player finite extensive form game of per- fect information depicted in Figure 1, the pure strategy profile (B,a), where player 1 plays action B, and player 2 plays action a, both with probability 1, is a pure , but is not a subgame-perfect pure Nash equilibrium, because in the sub- game rooted at the only node controlled by player 2, the only Nash equilibrium is for player 2 to choose b.
Copyright By PowCoder代写 加微信 powcoder
Figure 1: Exercise 1(a): (B, a) is an NE of this game, but it is not a SPNE
(b) [10 points] Show that every finite extensive game of perfect infor- mation where there are no chance nodes and where no player has the same payoffs at any two distinct terminal nodes has a unique pure-strategy SPNE.
Solution: Consider the proof of Kuhn’s theorem, which shows the existence of a pure SPNE for any finite extensive form game of perfect information. Recall that the proof also yields an algorithm for computing a pure SPNE, by “backward induction”. Assuming that there are no chance nodes, and no player has the same payoffs at any two distinct terminal nodes, we wish to show that there must be a unique pure SPNE. Suppose s = (s1, . . . , sn) and s′ = (s′1, . . . , s′n) are two different pure SPNEs for such a game. Let v be a node of the game tree belonging to some player i such that player i’s choices of actions in s and s′ differ. i.e., such that si(v) ̸= s′i(v), and such that v is as close as possible to the leaves of the game tree, so that no descendants v′ of v, belonging to some player j, satisfies sj(v′) ̸= sj(v).
Thus, since there are no chance nodes in the game tree, the strate- gies s and s′ in the subgame rooted at v yield distinct plays πs and πs′ , ending at distinct leaves, and since the leaves all yield distinct payoffs to player i, without loss of generality ui(πs) > ui(πs′). But then s′ is not an SPNE, because it would improve player i′ payoff in the subgame rooted at node v to unilaterally switch its strategy in s′, letting s′i(v) := si(v) (and keeping everything else the same). This contradicts the assumption that s and s′ are two distinct SPNEs, so there must be a unique SPNE.
(c) [8 points] Give an example of a finite extensive form game that contains a pure but does not contain any subgame- perfect pure . Justify your answer.
Solution: In the example EFG given in Figure 2, (BC,a) is a pure NE, but it is not a SPNE. In fact, the subgame rooted at the unique node controlled by player 2 is simply a version of the “matching pennies” game, for which the unique equilibrium is the one where both players play a (1/2, 1/2) mixed strategy on their actions. Thus the only SPNE in the game depicted in Figure 2 is described as follows: player 1 plays action B at the root (with probability 1), and also plays a 1/2 − 1/2 mix of actions C and D in its other information set. Player 2 plays a 1/2 − 1/2 mix of
Figure 2: Exercise 1(c): a pure NE, but no pure SPNE; only a mixed SPNE
actions a and b in the only node (and the only information set) that it controls.
2. Consider the finite extensive form game of imperfect information de- picted in Figure 3.
(a) [3 points] Does this game satisfy “perfect recall”? Explain your answer.
Yes. This game satisfies perfect recall. The reason is as follows: the only non-trivial information set (i.e., the only information set containing more than one node of the game tree), is that informa- tion set belonging to player 2 which contains 2 nodes. Note that, for both nodes in that information set, the prior “visible history” for player 2, is exactly the same: namely, in order to reach either node, at the only prior information set belonging to the same player, player 2 must have chosen action “a”. Consequently, this game satisfies perfect recall.
The following observations are not needed in the solution itself, but inform the subsequent questions:
Hence, note that, as mentioned in lectures (due to a result of Kuhn [1953]), in such an extensive form game of perfect recall, it suffices to consider only “behavior strategies” (because any possible distribution on the “leaves” of the game tree that can be induced by any “mixed” strategies for any player can be induced by “behavior strategies”).
Figure 3: Question 2
Recall that a “behavior strategy” is a strategy in which each player chooses, for each information set that belongs to it, a prob- ability distribution on actions available at that information set. (Note that a pure strategy, which is a function that maps each information set (deterministically) to one action available at that information set is by definition a specific kind of behavior strat- egy.)
(b) [12 points] Compute all SPNEs in this game, describing each of them in terms of “behavior strategies”.
Let us first compute all NEs of the subgame whose root is arrived at after the action sequence B,a, at a node controlled by player 1.
This is a 2 × 2 game, where each player has two pure strategies. Namely, player 1 has pure strategies C and D, and player 2 has pure strategies c and d.
If we convert this subgame to normal form, it looks like this:
cd C (7, 7) (0, 0) D (0, 0) (7, 7)
There are obviously two pure in this subgame, namely, (C, c) and (D, d) (i.e., when both players play C and c, respectively, with probability 1; and likewise when both players play D and d respective, with probability 1).
In both of these pure NEs, the payoff to both players is 7. There is also a unique third fully mixed (behavior) in this subgame. Namely, in this fully mixed NE both players play both of their actions in this subgame with probability 1/2 each. It is easy to check that this constitutes a (and an easy way to compute this NE would be to use the useful corollary to Nash’s theorem to set up suitable equations, as usual).
Note that in this third, fully mixed, NE, the expected payoff for both players is 7/2 = 3.5.
There are no other NEs in this subgame.
Now, in order to determine all SPNEs in the entire extensive form
game, we can use “backward induction” based on the NEs in this subgame, as follows:
If the NE in the subgame is either of the pure NEs (C, c) or (D, d), then the payoff to both players (including player 2) in that NE is 7.
So, suppose that in some SPNE, either one of these two pure NEs is being played in that subgame. Then at the node controlled by player 2 just above that subgame, namely the node which constitutes a singleton information set for player 2, player 2 would have a choice between playing action b and getting a payoff of 5, versus playing action a and getting an expected payoff of 7. Therefore, in any such SPNE, player 2 will play a with probability 1 in its singleton information set. And therefore, player 1 will play B at the root of the game tree in any such SPNE, because its payoff from playing A is 6, whereas its payoff from playing B is 7.
Hence, to summarize, the following are subgame-perfect NEs in behavior strategies (in fact, pure subgame-perfect NEs) for this game:
• Player 1 plays action B and action C from its information sets (both with probability 1). Player 2 plays action a and action c, from its information sets (both with probability 1).
• Player 1 plays action B and action D from its information sets (both with probability 1). Player 2 plays action a and action d, from its information sets (both with probability 1).
The only thing left to consider is what happens in an SPNE where, in the subgame reached after actions B followed by a, the NE that is played is the unique fully mixed NE which gives both players expected payoff 7/2 = 3.5.
Note that in this case, the choice faced by player 2 at its single- ton information set just above that subgame is between playing action b and getting payoff 5, versus playing action a and getting expected payoff 3.5. In that case, in any such SPNE, player 2 must play action b with probability 1 at that singleton informa- tion set.
Hence, at the root, in any such SPNE, player 1 will play action A with probability 1, to get a payoff of 6 (instead of a payoff of 5, if it had chosen action B).
In other words, a third SPNE in this game is:
• Player 1 plays action A with probability 1 from its first infor- mation set, and plays both actions C and D with probability
1/2 from its second information set. Player 2 plays action b from its first (singleton) information set (with probability 1), and plays both actions c and d with probability 1/2 from its second information set.
It is easy to see that these are all the SPNEs in this game, because every SPNE must also constitute an NE in every subgame, and we have enumerated, starting from the bottom 2×2 subgame, all the NEs that could arise, and all the (unique) SPNEs in the rest of the game that these give rise to.
(c) [6 points] Are there any other Nash equilibria other than SPNEs you have computed? Explain your answer.
Yes, there are other NEs in this game, besides the SPNEs we have computed.
For example, consider the following pure NE:
• Player 1 plays actions A and C, each with probability 1. Player 2 plays actions b and d, each with probability 1.
To see that this is an NE of the game, note that neither player could improve its own payoff by unilaterally changing its own strategy. Specifically, in this NE, both players get payoff 6.
Any unilateral change of strategy by player 1, to a different pure strategy (assuming player 2 does not change its strategy) can at most earn payoff 5 for player 1.
Likewise, any unilateral change of strategy by player 2, to a dif- ferent pure strategy (assuming player 1 does not change its strat- egy), will not change the payoff for player 2 at all (its payoff will remain as 6).
Hence, this is a . However, this is clearly NOT a subgame-perfect , because it is not an NE in the 2 × 2 subgame rooted at the node reached by the sequence of actions B followed by a. Specifically, as we have seen above (C, d) is not an NE in that subgame (because it earns both players a payoff of 0 in that subgame, and both players could improve their payoff to 7 in that subgame by unilaterally changing their strategy).
Hence, there are certainly NEs in this game which are not SPNEs.
(d) [4 points] Which of the equilibria in this game (pure and mixed) are “credible” and “sequentially rational” (specifically, which ones do not involve “non-credible threats”). Explain.
This question is unfortunately not phrased very formally, because the notion of a “non-credible threat” is not formally defined, and moreover even formalizing the notion of “sequentially rational” is actually rather tricky, and involves defining various concepts like a “system of beliefs” for the players, which then leads to tricky notions regarding “consistency of beliefs”, and ultimately “sequential equilibrium”, none of which we covered in lectures (and you are not expected to know).
So, the question should be interpreted as an informal question. Therefore, I do not expect a detailed solution to this question based on formal definitions of sequential rationality. Any rea- sonable answer which argues that the SPNEs in this game are “sequentially rational” and that there are indeed other NEs in this game which are NOT “credible” or “sequentially rational”, under any reasonable definition of “sequential rationality” (see below, for further discussion of this) will get full credit. Intuitively, in all of the SPNEs in this game (identified in our solution to 2(b)), both players never play a distribution on ac- tions at any information set which they wouldn’t do if the game actually reached that information set with positive probability. This includes the SPNE which involves fully mixed strategies for both players in the lower 2 × 2 subgame (where both player’s ex- pected payoff in that lower subgame is 3.5), and hence in which that lower subgame is actually reached with probability 0. In- deed, since each SPNE is also an NE in the lower subgame, we know that the strategies the players play in that subgame would continue to be their strategies, even if that subgame was reached with positive probability.
On the other hand, there are other NEs in this game which are not SPNEs and which are not sequentially rational.
Consider for example, the following NE:
• Player 1 plays A and C, both with probability 1. Player 2 plays b and c, both with probability 1.
Let us call this pure profile b′. It is easy to see that b′ constitutes a 8
, because neither player can unilaterally deviate from its strategy in b′ and strictly improve its own payoff in the entire game. Note however that b′ is clearly not an SPNE, because in the subgame rooted at player 2’s top node (which is reached with probability 0 under the profile b′), it would be strictly better for player 2 to play a at the root of that subgame.
Indeed, this is the reason why this NE is not “sequentially ratio- nal”, because if somehow, there was positive probability of reach- ing the top node belonging to player 2, then player 2 (knowing the profile b′, and assuming it will be played in the bottom 2 × 2 subgame) will want to switch its action at that node from b to a. So, this NE is not “sequentially rational”.
However, although it is intuitively clear that this NE is not “se- quentially rational”, there isn’t really any “non-credible threat” involved in this NE, depending on how you interpret the term “non-credible threat”, because player 2 does not really gain any- thing in this circumstance by “threatening” to play b, because if it had played a, and if player 1 had played B in response to that, then (the strategies remaining C and c in the lower 2 × 2 game), both players would increase their own payoff from 6 to 7. For completeness, I will now elaborate on the concepts needed for defining sequential rationality formally. (But this is not expected in solutions to this question.)
Specifically, a system of beliefs is a mapping μ : V → [0, 1] from the nodes of the game tree to probabilities, such that, for all players i and all information sets Ii,j belonging to player i, we have v∈Ii,j μ(v) = 1. In other words, the probabilities that μ associates with the nodes in each information set sum to 1. In this sense, we can view the probability distribution that μ imposes on Ii,j as player i’s “belief” regarding what the probability is of being in any particular node of that information set, when it is confronted with the task of making a choice at that information set.
Now, we say that the pair (b, μ) (called an assessment), consisting of a behavior strategy profile b, together with a system of beliefs μ, is sequentially rational, if at every information set Ii,j, the ex- pected payoff to player i, starting in the nodes of the information set Ii,j, according to the probabilities specified in the belief μ, if all players play thereafter according to the behavior strategies
b, the expected payoff to player i is as high as it would be if it (after reaching Ii,j according to the probabilities μ) unilaterally switched its own strategy bi to a different strategy b′i. (Although I haven’t described this in mathematical notation, it can be fully formalized, using the definitions of conditional probability.) However, not all belief systems μ “make sense” with respect to a particular behavior strategy profile b = (b1, . . . , bn) for the play- ers. We need some notion of “consistency” of beliefs, in particular with respect to a particular strategy profile.
Let us first say that a system of beliefs μ is suitable for a behav- ior profile b, if for any information sets Ii,j that is reached with positive probability under profile b, it is the case that for any node v ∈ Ii,j, μ(v) is precisely the conditional probability (under profile b) of reaching node v, conditioned on reaching information set Ii,j.
Let us confine ourselves to only those systes of belief μ which are suitable for a profile b.
We will then say that a b∗ is sequentially ra- tional if there exists a suitable system of beliefs μ∗ with respect to b∗, such that (b,μ∗) is sequentially rational.
Now it is not difficult to check that all three SPNEs in the game given in Figure 1 are sequentially rational, whereas there are NEs, like the one mentioned above, which are not sequentially rational, according to the definition we have just given.
For those who want to know more: there is in fact a more refined notion of it should mean for a belief system to be “consistent” with a strategy profile b, including with respect to information sets that are not reached with positive probability under the pro- file b. What “consistency” condition should we impose on be- liefs regarding information sets that are not reached with positive probability under profile b?
One notion of consistency, defined by Kreps and Wilson, is the following. A behavior strategy is called “fully mixed” if it places positive probability on every action available at every informa- tion set. Consider an infinite sequence ⟨bk | k ∈ N⟩, of fully mixed behavior strategy profiles bk, and let μk be the unique suitable belief system for bk. μk is uniquely determined, because under bk every information set is reached with positive probability, and hence the suitable beliefs are determined by the respective con-
4,6,7 4,5,6
2, 4, 8 1, 8, 9
Figure 4: Question 3
ditional probabilities of reaching each node in each information set. Then an assessment pair (b,μ) is said to be consisent if (b, μ) = limk→∞(bk, μk), for some sequence of fully mixed strate- gies ⟨bk | k ∈ N⟩. A sequential equilibrium (again defined by Kreps and Wilson) is defined to be an assessment (b, μ) which is both consistent and sequentially rational. Every extensive form game of perfect recall has a sequential equilibrium.
3. Consider the atomic network congestion game, with three players, de- scribed by the directed graph in Figure 2.
In this game, every player i (for i = 1, 2, 3) needs to choose a directed path from the source s to the target t. Thus, every player i’s set of possible actions (i.e., its set of pure strategies) is the set of all possible directed paths from s to t.
Each edge e is labeled with a sequence of three numbers (c1,c2,c3). Given a profile π = (π1,π2,π3) of pure strategies (i.e., s-t-paths) for all three players, the cost to player i of each directed edge, e, that is contained in player i’s path πi, is ck, where k is the total number of players that have chosen edge e in their path. The total cost to player i, in the given profile π, is the sum of the costs of all the edges in its path πi from s to t. Each player of course wants to minimize its own total cost.
(a) [10 points] Compute a pure strategy NE in this atomic network 11
congestion game. Explain why what you have computed is a pure NE.
In this game, a pure NE consists of the three players playing the following three paths:
s → v1 → t s → v2 → t s → v3 → t
The costs to the three players in this NE are: 6, 5, and 3, respec- tively.
It is not difficult to check that this is an NE, by exhaustively examining all possible unilateral deviations by each of the three players, and seeing
程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com