Algorithmic Game Theory and Applications
Lecture 4:
2-player zero-sum games, and the Minimax Theorem
AGTA: Lecture 4
2-person zero-sum games
A finite 2-person zero-sum (2p-zs) strategic game Γ is a strategic game where:
• For players i ∈ {1, 2}, the payoff functions ui : S → R are such that for all s = (s1, s2) ∈ S,
u1(s) + u2(s) = 0, i.e., u1(s) = −u2(s).
ui(s1, s2) can conveniently be viewed as an (m1 × m2) payoff matrix Ai, where:

Ai = [ ui(1,1)    ……   ui(1,m2)  ]
     [    .                .     ]
     [    .                .     ]
     [ ui(m1,1)   ……   ui(m1,m2) ]
Note, A2 = −A1. Thus we may assume only one function u(s1, s2) is given, as one matrix, A = A1. Player 1 wants to maximize u(s), while Player 2 wants to minimize it (i.e., to maximize its negative).
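The zero-sum condition A2 = −A1 is easy to check numerically. A minimal sketch (not part of the lecture), assuming numpy and using matching pennies as an assumed example matrix:

```python
import numpy as np

# Matching pennies (an assumed example): Player 1 wins +1 on a match.
A1 = np.array([[ 1, -1],
               [-1,  1]])
A2 = -A1  # in a zero-sum game, Player 2's payoffs are the negation

# u1(s) + u2(s) = 0 for every pure profile s = (s1, s2)
assert np.all(A1 + A2 == 0)
```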
matrices and vectors
As just noted, a 2p-zs game can be described by an m1 × m2 matrix:
A = [ a1,1     ……   a1,m2  ]
    [   .      ai,j    .   ]
    [ am1,1    ……   am1,m2 ]

where ai,j = u(i, j).
For any (n1 × n2)-matrix A we’ll either use ai,j or (A)i,j to denote the entry in the i’th row and j’th column of A.
For (n1 × n2) matrices A and B, let A ≥ B denote that ai,j ≥ bi,j for all i, j. Let A > B denote that ai,j > bi,j for all i, j.
For a matrix A, let A ≥ 0 denote that every entry is ≥ 0. Likewise, let A > 0 mean every entry is > 0.
Recall matrix multiplication: given (n1 × n2)-matrix A and (n2 × n3)-matrix B, the product AB is an (n1 × n3)-matrix C, where
ci,j = Σ_{k=1}^{n2} ai,k ∗ bk,j
Fact: matrix multiplication is “associative”: i.e.,
(AB)C = A(BC)
(Note: for the multiplications to be defined, the dimensions of the matrices A, B, and C need to be “consistent”: (n1 × n2), (n2 × n3), and (n3 × n4), respectively.)
Fact: For matrices A, B , C , of appropriate dimensions, if A ≥ B, and C ≥ 0, then
AC ≥ BC, and likewise, CA ≥ CB. (C’s dimensions might be different in each case.)
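Both facts can be spot-checked numerically. A small sketch (an illustration, not part of the lecture), assuming numpy and randomly generated test matrices:

```python
import numpy as np

rng = np.random.default_rng(0)   # assumed random test matrices
A = rng.integers(0, 5, size=(2, 3))
B = rng.integers(0, 5, size=(3, 4))
C = rng.integers(0, 5, size=(4, 2))

# Associativity: (AB)C = A(BC), with dimensions (2x3), (3x4), (4x2)
assert np.array_equal((A @ B) @ C, A @ (B @ C))

# Monotonicity: if P >= Q entrywise and R >= 0, then PR >= QR and SP >= SQ
P = rng.integers(0, 5, size=(2, 3))
Q = P - rng.integers(0, 3, size=(2, 3))   # guarantees P >= Q entrywise
R = rng.integers(0, 4, size=(3, 2))       # R >= 0, multiplies on the right
S = rng.integers(0, 4, size=(4, 2))       # S >= 0, multiplies on the left
assert np.all(P @ R >= Q @ R) and np.all(S @ P >= S @ Q)
```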
more on matrices and vectors

For an (n1 × n2) matrix B, let BT denote the (n2 × n1) transpose matrix, where (BT)i,j := (B)j,i.

We can view a column vector y, with entries y(1), …, y(m), as an (m × 1)-matrix. Then yT is a (1 × m)-matrix, i.e., a row vector.
Typically, we think of “vectors” as column vectors and explicitly transpose them if we need to. We’ll call a length m vector an m-vector.
Multiplying an (n1 × n2)-matrix A by an n2-vector y is just a special case of matrix multiplication: Ay is an n1-vector.
Likewise, pre-multiplying A by an n1-row vector yT is also just a special case of matrix multiplication: yT A is an n2-row vector.
For a column (row) vector y, we use (y)j to denote the entry (y)j,1 (respectively, (y)1,j).
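These shape conventions are easy to verify in code. A minimal sketch (not part of the lecture), assuming numpy, where the concrete matrix is an arbitrary assumed example:

```python
import numpy as np

B = np.array([[1, 2, 3],
              [4, 5, 6]])           # a (2 x 3)-matrix
assert B.T.shape == (3, 2)          # transpose swaps the dimensions
assert B.T[2, 0] == B[0, 2]         # (B^T)_{i,j} = (B)_{j,i}

y = np.array([[1], [2], [3]])       # a column 3-vector, i.e., a (3 x 1)-matrix
assert (B @ y).shape == (2, 1)      # By is a 2-vector
assert (y.T @ B.T).shape == (1, 2)  # y^T pre-multiplies as a row vector
```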
A matrix view of zero-sum games
Suppose we have a 2p-zs game given by an (m1 × m2)-matrix A.
Suppose Player 1 chooses a mixed strategy x1, and Player 2 chooses mixed strategy x2 (assume x1 and x2 are given by column vectors). Consider the product
(x1)T A x2

If you do the calculation,

(x1)T A x2 = Σ_{i=1}^{m1} Σ_{j=1}^{m2} (x1(i) ∗ x2(j)) ∗ ai,j

But note that (x1(i) ∗ x2(j)) is precisely the probability of the pure combination s = (i, j). Thus, for the mixed profile x = (x1, x2),

(x1)T A x2 = U1(x) = −U2(x)
where U1(x) is the expected payoff (which Player 1 is trying to maximize, and Player 2 is trying to minimize).
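The identity between the matrix product and the explicit sum over pure profiles can be checked directly. A small sketch (an illustration, not part of the lecture), assuming numpy and the matching-pennies matrix as an assumed example:

```python
import numpy as np

A = np.array([[ 1, -1],
              [-1,  1]])            # assumed example payoff matrix
x1 = np.array([0.75, 0.25])         # Player 1's mixed strategy
x2 = np.array([0.5, 0.5])           # Player 2's mixed strategy

# x1^T A x2 as a single matrix product
u = x1 @ A @ x2

# the same value, summed explicitly over pure profiles (i, j)
u_explicit = sum(x1[i] * x2[j] * A[i, j]
                 for i in range(2) for j in range(2))
assert abs(u - u_explicit) < 1e-12
```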
minmaximizing strategies

Suppose Player 1 chooses a mixed strategy x∗1 ∈ X1, by trying to maximize the “worst that can happen”. The worst that can happen would be for Player 2 to choose x2 which minimizes (x∗1)T A x2.
Definition: x∗1 ∈ X1 is a minmaximizer for Player 1 if

min_{x2 ∈ X2} (x∗1)T A x2 = max_{x1 ∈ X1} min_{x2 ∈ X2} (x1)T A x2

Similarly, x∗2 ∈ X2 is a maxminimizer for Player 2 if

max_{x1 ∈ X1} (x1)T A x∗2 = min_{x2 ∈ X2} max_{x1 ∈ X1} (x1)T A x2

Note that

min_{x2 ∈ X2} (x∗1)T A x2 ≤ (x∗1)T A x∗2 ≤ max_{x1 ∈ X1} (x1)T A x∗2
Amazingly, von Neumann (1928) showed equality holds!
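A minmaximizer can be approximated by brute force, using the fact that the inner minimum over X2 is always attained at a pure strategy. A crude sketch (not part of the lecture), assuming numpy and matching pennies as an assumed example:

```python
import numpy as np

A = np.array([[ 1, -1],
              [-1,  1]])   # matching pennies (assumed example)

def worst_case(x1, A):
    # The inner minimum over x2 in X2 is attained at a pure strategy,
    # so Player 1's guaranteed payoff is min_j ((x1)^T A)_j.
    return np.min(x1 @ A)

# Crude grid search over Player 1's mixed strategies (p, 1 - p)
best_p, best_val = max(
    ((p, worst_case(np.array([p, 1 - p]), A)) for p in np.linspace(0, 1, 101)),
    key=lambda t: t[1])
# The minmaximizer mixes uniformly and guarantees value 0
assert abs(best_p - 0.5) < 1e-9 and abs(best_val) < 1e-9
```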
The Minimax Theorem
Theorem (von Neumann): Let a 2p-zs game Γ be given by an (m1 × m2)-matrix A of real numbers. There exists a unique value v∗ ∈ R such that there exists x∗ = (x∗1, x∗2) ∈ X such that:

1. ((x∗1)T A)j ≥ v∗, for j = 1, …, m2.
2. (Ax∗2)j ≤ v∗, for j = 1, …, m1.
3. And (thus) v∗ = (x∗1)T A x∗2, and

max_{x1 ∈ X1} min_{x2 ∈ X2} (x1)T A x2 = v∗ = min_{x2 ∈ X2} max_{x1 ∈ X1} (x1)T A x2

4. In fact, the above conditions all hold precisely when x∗ = (x∗1, x∗2) is any Nash equilibrium.
Equivalently, they hold precisely when x∗1 is any minmaximizer and x∗2 is any maxminimizer.
some remarks
(1.) says x∗1 guarantees Player 1 at least expected profit v∗, and
(2.) says x∗2 guarantees Player 2 at most expected “loss” v∗.
We call any such x∗ = (x∗1, x∗2) a minimax profile. We call the unique v∗ the minimax value of the game Γ.
It is obvious that the maximum profit that Player 1 can guarantee for itself should be ≤ the minimum loss that Player 2 can guarantee for itself, i.e., that
max_{x1 ∈ X1} min_{x2 ∈ X2} (x1)T A x2 ≤ min_{x2 ∈ X2} max_{x1 ∈ X1} (x1)T A x2
What is not obvious at all is why these two values should be the same!
The Minimax Theorem follows directly from Nash’s Theorem (but historically, it predates Nash).
Proof: Let x∗ = (x∗1,x∗2) ∈ X be a NE of the 2-player zero-sum game Γ, with matrix A.
Let v∗ := (x∗1)T Ax∗2 = U1(x∗) = −U2(x∗).
Since x∗1 and x∗2 are “best responses” to each other,
we know that for i ∈ {1, 2}, and for every pure strategy πi,j of player i,

Ui(x∗−i; πi,j) ≤ Ui(x∗)
1. U1(x∗−1; π1,j) = (Ax∗2)j. Thus,
(Ax∗2)j ≤ v∗ = U1(x∗) for all j = 1,…,m1.
2. U2(x∗−2; π2,j) = −((x∗1)T A)j. Thus, ((x∗1)T A)j ≥ v∗ = −U2(x∗)
for all j = 1,…,m2.
Proof of the Minimax Theorem
3. max_{x1 ∈ X1} (x1)T A x∗2 ≤ v∗, because (x1)T A x∗2 is a
“weighted average” of the (Ax∗2)j’s.
Similarly, v∗ ≤ min_{x2 ∈ X2} (x∗1)T A x2, because (x∗1)T A x2 is a “weighted average” of the ((x∗1)T A)j’s. Thus

max_{x1 ∈ X1} (x1)T A x∗2 ≤ v∗ ≤ min_{x2 ∈ X2} (x∗1)T A x2
We earlier noted the opposite inequalities, so,
min_{x2 ∈ X2} max_{x1 ∈ X1} (x1)T A x2 = v∗ = max_{x1 ∈ X1} min_{x2 ∈ X2} (x1)T A x2
4. We didn’t assume anything about the particular NE we chose. So, for every NE x∗, letting v′ = (x∗1)T A x∗2,

max_{x1 ∈ X1} min_{x2 ∈ X2} (x1)T A x2 = v′ = v∗ = min_{x2 ∈ X2} max_{x1 ∈ X1} (x1)T A x2

Moreover, if x∗ = (x∗1, x∗2) satisfies conditions (1) and (2) for some v∗, then x∗ must be a Nash equilibrium.
Q.E.D. (Minimax Theorem)
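As a numerical sanity check on the theorem (an illustrative sketch, not part of the lecture), both sides of the minimax equality can be approximated by grid search, again exploiting the fact that each inner optimum is attained at a pure strategy. Matching pennies is an assumed example; numpy is assumed:

```python
import numpy as np

A = np.array([[ 1, -1],
              [-1,  1]])   # matching pennies (assumed example)
grid = [np.array([p, 1 - p]) for p in np.linspace(0, 1, 201)]

# Inner optima are attained at pure strategies, so one grid per player suffices.
maxmin = max(min(x1 @ A) for x1 in grid)   # Player 1's best guarantee
minmax = min(max(A @ x2) for x2 in grid)   # Player 2's best guarantee
assert abs(maxmin - minmax) < 1e-9         # equal, as the theorem asserts
```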
remarks and food for thought

• Thus, for 2-player zero-sum games, Nash equilibria and minimax profiles are the same thing.
• Let us note here
Useful Corollary for Minimax: In a minimax profile x∗ = (x∗1, x∗2),
1. if x∗2(j) > 0 then ((x∗1)T A)j = (x∗1)T A x∗2 = v∗.
2. if x∗1(j) > 0 then (Ax∗2)j = (x∗1)T A x∗2 = v∗.

This is an immediate consequence of the Useful Corollary for Nash Equilibria.
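The corollary says that every pure strategy in a player's support earns exactly v∗ against the opponent's minimax strategy. A small sketch checking this (not part of the lecture), assuming numpy and the known minimax profile of matching pennies as an assumed example:

```python
import numpy as np

A = np.array([[ 1, -1],
              [-1,  1]])           # matching pennies (assumed example)
x1 = np.array([0.5, 0.5])          # the known minimax profile: both mix uniformly
x2 = np.array([0.5, 0.5])
v = x1 @ A @ x2                    # minimax value, here 0

# Every pure strategy in a support achieves exactly v against the
# opponent's minimax strategy.
for j in range(2):
    if x2[j] > 0:
        assert abs((x1 @ A)[j] - v) < 1e-12
    if x1[j] > 0:
        assert abs((A @ x2)[j] - v) < 1e-12
```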
• If you were playing a 2-player zero-sum game (say, as player 1) would you always play a minmaximizer strategy?
• What if you were convinced your opponent is an idiot?
• Notice, we have no clue yet how to compute the minimax value and a minimax profile.
That is about to change.
minimax as an optimization problem
Consider the following “optimization problem”:
Maximize v
Subject to constraints:
((x1)T A)j ≥ v, for j = 1, …, m2
x1(1) + … + x1(m1) = 1
x1(j) ≥ 0, for j = 1, …, m1
It follows from the minimax theorem that an optimal solution (x∗1,v∗) would give precisely the minimax value v∗, and a minmaximizer x∗1 for Player 1.
We are optimizing a “linear objective”,
under “linear constraints” (or “linear inequalities”).
That’s what Linear Programming is. Fortunately, we have good algorithms for it. Next time, we start Linear Programming.
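As a preview of how this LP might be solved in practice, here is a sketch using a generic LP solver (assuming scipy is available; matching pennies is an assumed example, not from the lecture). The solver minimizes, so we negate v in the objective:

```python
import numpy as np
from scipy.optimize import linprog

A = np.array([[ 1, -1],
              [-1,  1]])            # matching pennies (assumed example)
m1, m2 = A.shape

# Variables: (x1(1), ..., x1(m1), v); linprog minimizes, so objective is -v.
c = np.zeros(m1 + 1)
c[-1] = -1.0
# Inequalities: v - ((x1)^T A)_j <= 0, one row per column j of A
A_ub = np.hstack([-A.T, np.ones((m2, 1))])
b_ub = np.zeros(m2)
# Equality: x1 is a probability distribution
A_eq = np.array([[1.0] * m1 + [0.0]])
b_eq = np.array([1.0])
bounds = [(0, None)] * m1 + [(None, None)]   # v unrestricted in sign

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
x1_star, v_star = res.x[:m1], res.x[-1]
assert abs(v_star) < 1e-6           # matching pennies has minimax value 0
```

For matching pennies the solver returns the uniform minmaximizer x∗1 = (0.5, 0.5) with value v∗ = 0.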