程序代写 Algorithmic Game Theory and Applications

Algorithmic Game Theory and Applications
Lecture 11:
Games of Perfect Information

Copyright By PowCoder代写 加微信 powcoder

AGTA: Lecture 11

finite games of perfect information
Recall, a perfect information (PI) game has only 1 node per information set.
Theorem([Kuhn’53]) Every finite n-person extensive PI-game, G, has a NE, in fact, a subgame-perfect NE (SPNE), in pure strategies.
I.e., some pure profile, s∗ = (s∗1, . . . , s∗n), is a SPNE.
Before proving this, we need some definitions.
For a game G with game tree T, and for w ∈ T, define the subtree Tw ⊆ T , by:
Tw ={w′ ∈T |w′ =ww′′ forw′′ ∈Σ∗} Since the game tree is finite, we can just associate
payoffs to the leaves.
Thus, the subtee Tw, in an obvious way, defines a
“subgame”, Gw, which is also a PI-game.
Let the depth of a node w in T be its length |w| (as a string). Let the depth of a tree T be the maximum depth of any node in T. Let the depth of a game G be the depth of its game tree.
AGTA: Lecture 11

Proof We prove by induction on the depth of a subgame Gw that it has a pure strategy SPNE, sw = (sw1 ,…,swn). Then s∗ := sε.
Base case, depth 0: In this case we are at a leaf w. there is nothing to show: each player i gets payoff ui(w), and the strategies in the SPNE s∗ are “empty” (it doesn’t matter which player’s node w is, since there are no actions to take.)
Inductive step: Suppose depth of Gw is k + 1. Let Act(w) = {a′1, . . . , a′r} be the set of actions available at the root of Gw. The subtrees Twa′j, for j =
1,…,r, each define a PI-subgame Gwa′j, of depth ≤ k.
proof of Kuhn’s theorem (easy)
Thus, by induction, each game Gwa′j has a pure strategySPNE,swa′j =(swa′j,…,swa′j).
To define sw = (sw1 ,…,swn), there are two cases to
consider ……
AGTA: Lecture 11

1. pl(w) = 0, i.e., the root node, w, of Tw is a chance node (belongs to “nature”).
Let the strategy swi for player i be just the obvious
swa′, of its pure strategies in a′∈Act(w) i
each of the subgames. (Explanation of “union” of disjoint strategy functions.)
Claim: sw = (sw1 ,…,swn) is a pure SPNE of Gw. Suppose not. Then some player i could improve its expected payoff by switching to a different pure strategy in one of the subgames. But that violates the inductive hypothesis on that subgame.
2. pl(w) = i > 0. In this case the root node, w,
of Tw belongs to player i. For a ∈ Act(w), let
hwa(swa) be the expected payoff to player i in the i
subgame Gwa. ′ ′
For some a′, hwa (swa ) = maxa∈Act(w) hwa(swa).
For players i′ ̸= i, define sw = 􏰋 swa.
i′ a∈Act(w) i′ Fori,definesw =(􏰋 swa)∪{w􏰎→a′}.
i a∈Act(w) i
Claim: sw = (sw1 ,…,swn) is a pure SPNE of Gw. The proof is basically the same argument as the previous case. Even for the root player, i, to (unilaterally) improve its payoff, it would need to do so in some (strict) subgame.
AGTA: Lecture 11

The proof yields an EASY “bottom up” algorithm for computing a pure SPNE in a finite PI-game:
We inductively “attach” to the root of every subtree Tw, a SPNE sw for the game Gw, together with the expected payoff vector hw := (hw1 (sw), . . . , hwn (sw)).
1. Initially: Attach to each leaf w the empty profile sw = (∅,…,∅), & payoff vector hw := (u1(w), . . . , un(w)).
2. While (there is an unattached node w
all of whose children are attached)
• if (pl(w) = 0) then
sw := (sw1 ,…,swn),
where sw := 􏰋 swa ; i a∈Act(w) i
easy “bottom-up” algorithm for SPNE’s in finite PI-games
note, hw is given by:
hw(sw):=􏰂 q (a)∗hwa(swa);
i a∈Act(w) w i else if (pl(w) = i > 0) then
Let sw := (sw1 ,…,swn), & hw := hwa′, where sw :=􏰋 swa,fori′ ̸=i,
i′ a∈Act(w) i′
sw := (􏰋 swa) 􏰋{w 􏰎→ a′},
i a∈Act(w) i
and where a′ ∈ Act(w) such that
hwa′(swa′) = maxa∈Act(w) hwa(swa) ii
AGTA: Lecture 11

consequences for zero-sum finite PI-games
Recall that, by the Minimax Theorem, for every finite zero-sum game Γ, there is a value v∗ such that for any NE (x∗1,x∗2) of Γ, v∗ = U(x∗1,x∗2), and
max min U(x1, x2) = v∗ = min max U(x1, x2) x1∈X1 x2∈X2 x2∈X2 x1∈X1
But it follows from Kuhn’s theorem that for extensive PI-games G there is in fact a pure NE (in fact, SPNE) (s∗1,s∗2) such that v∗ = u(s∗1,s∗2) := h(s∗1,s∗2), and thus that in fact
max min u(s1, s2) = v∗ = min max u(s1, s2) s1∈S1 s2∈S2 s2∈S2 s1∈S1
Definition We say a finite zero-sum game Γ is determined or has a value, precisely if
max min u(s1, s2) = min max u(s1, s2) s1∈S1 s2∈S2 s2∈S2 s1∈S1
It thus follows from Kuhn’s theorem that:
Proposition ([Zermelo’1912]) Every finite zero-sum PI-game, G, is determined (has a value).
Moreover, the value & a pure minimax profile can be computed “efficiently” from G.
AGTA: Lecture 11

Chess is a finite PI-game (after 50 moves with no piece taken, it ends in a draw). In fact, it’s a win-lose-draw PI-game: there are no chance nodes and the only possible payoffs u(s1, s2) are 1, −1, and 0.
Proposition([Zermelo’1912]) In Chess, either 1. White has a “winning strategy”, or
2. Black has a “winning strategy”, or
3. Both players have strategies to force a draw.
A “winning strategy”, e.g., for White (Player 1) is a pure strategy s∗1 that guarantees value u(s∗1, s2) = 1.
Question: Which one is the right answer?? Problem: The tree is far too big!!
Even with ∼ 200 depth & ∼ 5 moves per node: 5200 nodes!
Despite having an “efficient” algorithm to compute the value v∗ given the tree, we can’t even look at the whole tree! We need algorithms that don’t look at the whole tree.
AGTA: Lecture 11

There’s > 50 years of research on chess & other game playing programs, (Shannon, Turing, . . .). Heuristic game-tree search procedures are now very refined. See any AI text (e.g., [Russel-Norvig’02]).
We can do a recursive “top-down” search in Depth- First style. If additionally we have a function Eval(w) that heuristically “evaluates” a node’s “goodness” score, we can use Eval(w) to stop the search at, e.g., any desired depth.
While searching “top-down”, we can “prune out” irrelevant subtrees using α-β-pruning. Idea: while searching the minmax tree, maintain two values:
α- “maximizer can assure a score of at least α”;
β- “minimizer can assure a score of at most β”;
50 years of game-tree search
Player II:
AGTA: Lecture 11

minmax search with α-β-pruning
Assume, for simplicity, that players alternate moves, and root belongs to Player 1 (maximizer). Assume −1 ≤ Eval(w) ≤ +1. Score −1 (+1) means player 1 definitely loses (wins). Start the search by calling: MaxVal(ε, −1, +1);
MaxVal(w, α, β)
If depth(w) ≥ MaxDepth then return Eval(w). Else
for each a ∈ Act(w)
α := max{α, MinVal(wa, α, β)}; if α ≥ β, then return β
MinVal(w, α, β)
If depth(w) ≥ MaxDepth, then return Eval(w). Else
for each a ∈ Act(w)
β := min{β, MaxVal(wa, α, β)}; if β ≤ α, then return α
Heuristic game-tree search is a world of research onto
itself. Please see, e.g., the references in AI texts. AGTA: Lecture 11

Boolean circuits can be viewed as a zero-sum PI- game, between AND and OR: OR the maximizer, AND the minimizer. (negations “pushed down” to bottom.) This is a win-lose PI-game: no chance nodes and the only possible payoffs u(s1, s2) are 1 and −1.
boolean circuits
as finite extensive PI-games
Player II:
−1 −1 1 −1 −1 1
Note: game tree can be exponentially bigger, but efficient bottom-up algorithm works directly on circuit.
AGTA: Lecture 11

For a (possibly infinite) zero-sum 2-player PI-game, we would like to similarly define the game to be “determined” if
max min u(s1, s2) = min max u(s1, s2) s1∈S1 s2∈S2 s2∈S2 s1∈S1
But, for infinite games max & min may not exist! Instead, let’s define an (infinite) zero-sum game to be determined (have a value) if:
sup inf u(s1, s2) = inf sup u(s1, s2) s1∈S1 s2∈S2 s2∈S2 s1∈S1
In the simple setting of infinite win-lose PI-games (2 players, zero-sum, no chance nodes, and only payoffs are 1 and −1), this definition says a game is determined precisely when one player or the other has a winning strategy: a strategy s∗1 ∈ S1 such that for any s2 ∈ S2, u(s∗1, s2) = 1 (and vice versa for player 2).
Question: Is every win-lose PI-game determined? Answer: No …….
Ok, let’s try to generalize to infinite zero-sum games
AGTA: Lecture 11

determinacy and its boundaries
For win-lose PI-games, we can define the payoff function by providing the set Y = u−1(1) ⊆ ΨT,
of complete plays on which player 1 wins (player 2 necessarily wins on all other plays).
If, additionally, we assume that players alternate moves, we can specify such a game as G = ⟨T, Y ⟩.
Fact Even for the binary tree T = {L,R}∗, there are sets Y ⊆ ΨT , such that the win-lose PI-game G = ⟨T, Y ⟩ is not determined.
(This fact is not very easy to show. Its proof uses the “axiom of choice”. See, e.g., [Mycielski, Ch. 3 of Handbook of GT,1992].)
Fortunately, large classes of win-lose PI-games are determined. In particular:
Theorem([D. A. Martin’75]) Whenever Y is a so called “Borel set”, the game ⟨Σ∗, Y ⟩ is determined.
(This is a deep theorem, with deep connections to logic and set theory. The theorem holds even when the action alphabet Σ is an infinite set. )
AGTA: Lecture 11

food for thought: win-lose games on finite graphs
Instead of a tree, we have a finite directed graph:
Player II:
• Starting at “Start”, does Player I have a strategy to “force” the play to reach the “Goal”?
• Convince yourself that this is a (possibly infinite) win-lose PI-game.
• Is this game determined for all finite graphs?
• If so, how would you compute a winning strategy
for Player 1?
AGTA: Lecture 11

程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com