代写代考 Tutorial 7: solution sketches

Tutorial 7: solution sketches
1. The Bellman optimality equations are as follows:
x5 = max{x1, x2}
x3 = max{x2, x4}

Copyright By PowCoder代写 加微信 powcoder

x2 = 2×1/5 + x4/5 + 2×6/5
x1 =x2/6+x4/6+x5/6+x6/2
Our goal is to compute the unique minimal (least non-negative) solu- tion vector p∗ = (p∗1, . . . , p∗6), which gives the optimal probabilities of reaching s6 starting from each state. It is clear that p∗4 = 0, so that at s3 the node s2 is always chosen, giving p∗3 = p∗2. Also, since p∗6 = 1, it only remains to solve for p∗1, p∗2, and p∗5. From the optimality conditions we see that the equations governing these are as follows:
p∗5 = max{p∗1, p∗2}
p∗2 = 2p∗1/5 + 2/5
p∗1 = p∗2/6 + p∗5/6 + 1/2
We can (as shown on the lecture slides) solve this by computing the unique optimal solution for the following linear programming problem:
Minimize: x1 + x2 + x5 Subject to:
x2 ≥ (2/5)x1 + (2/5)
x1 ≥ (1/6)x2 + (1/6)x5 + (1/2)
In this small example, we can also avoid using linear programming, by enumerating all possible cases of the max equation. There are two cases to consider: (i) max{p∗1, p∗2} = p∗2 and (ii) max{p∗1, p∗2} = p∗1. In both of these cases we know the value of p∗5, so we can calculate the rest.

In case (i), the equations reduce to
p∗2 = 2p∗1/5 + 2/5 p∗1 = p∗2/3 + 1/2
These can be solved to get p∗1 = 19/26 and p∗2 = 18/26. This contradicts our assuption that p∗2 = max{p∗1, p∗2}.
In case (ii), the equations reduce to
p∗2 = 2p∗1/5 + 2/5
p∗1 = p∗1/6 + p∗6/6 + 1/2
which gives us p∗1 = 17/23 and p∗2 = 16/23. This gives us the full solu-
tion to the original problem: p∗ = (p∗1, . . . p∗6) = (17/23, 16/23, 16/23, 0, 17/23, 1). Player 1s optimal strategy is to choose s2 when at node s3, and to choose
s1 when at node s5.
2. As we are working with a congestion game, we can find a pure by starting at any pure strategy profile, and iteratively improving it until we can’t. To get a concrete starting point, let’s say all players take the route s → v3 → t. Then we can do iterative improvements for example1 as follows:
(i) Player1switchestos→v2 →v1 →t (ii) Player2switchestos→v2 →v1 →t
(iii) Player 3 switches to s → v1 → t. (iv) Player 2 switches to s → v1 → t
At (iv) no further improvements can be made, so we reached the fol- lowing NE:
Player1: s→v2 →v1 →t Player2: s→v1 →t Player3: s→v1 →t
Note that in the above sequence we weren’t done at stage (iii), even though every player had switched once. Other starting points will take
1at many stages there’s more than one option on who improves and how 2

through other sequences of steps, and they might end up in a different NE, although it turns out that in this game all pure Nash equilibria send twoplayersviatheroutes→v1 →tandonevias→v2 →v1 →t, differing only in which player chooses the path s → v2 → v1 → t.

程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com