Game Playing and Adversarial Search
Chapter 5, Sections 1–5
AIMA Slides © and ; Chapter 5, Sections 1–5 1
♢ Perfect play
♢ Resource limits
♢ α–β pruning
♢ Games of chance
Games vs. search problems
“Unpredictable” opponent ⇒ solution is a contingency plan
Time limits ⇒ unlikely to find goal, must approximate
Plan of attack:
• algorithm for perfect play ( , 1944)
• finite horizon, approximate evaluation (Zuse, 1945; Shannon, 1950; Samuel,
• pruning to reduce costs (McCarthy, 1956)
Types of games
Representing a game as a search problem
We can formally define a strategic two-player game by:
• initial state
• terminal test (i.e. win / lose / draw)
• utility function (i.e. numeric reward for outcome)
chess: +1, 0, -1
poker: cash won or lost
In a zero-sum game with 2 players:
each player’s utility for a state is equal and opposite to the other’s
Perfect play for deterministic, perfect-information games
Idea: choose move to position with highest minimax value
= best achievable payoff against best play
E.g., 2-ply game:
Minimax algorithm
function Minimax-Decision(game) returns an operator
for each op in Operators[game] do
Value[op]←Minimax-Value(Apply(op, game), game)
return the op with the highest Value[op]
function Minimax-Value(state, game) returns a utility value
if Terminal-Test[game](state) then
return Utility[game](state)
else if MAX is to move in state then
return the highest Minimax-Value of Successors(state)
else
return the lowest Minimax-Value of Successors(state)
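The pseudocode above can be sketched in Python. This toy version is my own illustration, not the book’s code: it represents the game tree as nested lists, with leaves holding MAX’s utility.

```python
class TreeGame:
    """Toy game: a state is either a list of successor states or a
    terminal utility (a number, from MAX's point of view)."""
    def is_terminal(self, state):
        return not isinstance(state, list)
    def utility(self, state):
        return state
    def successors(self, state):
        return state

def minimax_value(state, game, maximizing=True):
    """Minimax-Value: highest child value at MAX nodes, lowest at MIN nodes."""
    if game.is_terminal(state):
        return game.utility(state)
    values = [minimax_value(s, game, not maximizing)
              for s in game.successors(state)]
    return max(values) if maximizing else min(values)

def minimax_decision(state, game):
    """Minimax-Decision: pick the move (child index) with the best value for MAX."""
    moves = game.successors(state)
    return max(range(len(moves)),
               key=lambda i: minimax_value(moves[i], game, maximizing=False))

# The classic 2-ply tree: MAX chooses a branch, then MIN picks a leaf
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
game = TreeGame()
print(minimax_value(tree, game), minimax_decision(tree, game))  # 3 0
```

MIN forces the minimum within each branch (3, 2, 2), so MAX’s best move is the first branch, with value 3.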
Properties of minimax
Complete??
Optimal??
Time complexity??
Space complexity??
Properties of minimax
Complete?? Yes, if tree is finite (chess has specific rules for this)
Optimal?? Yes, against an optimal opponent. Otherwise??
Time complexity?? O(b^m)
Space complexity?? O(bm) (depth-first exploration)
For chess, b ≈ 35, m ≈ 100 for “reasonable” games
⇒ exact solution completely infeasible
Resource limits
Suppose we have 100 seconds and can explore 10^4 nodes/second
⇒ 10^6 nodes per move
Standard approach:
• cutoff test
e.g., depth limit (perhaps add quiescence search)
• evaluation function
= estimated desirability of position
Evaluation functions
[Figure: two example chess positions — one where White is slightly better, one where Black is winning]
For chess, typically linear weighted sum of features
Eval(s) = w1f1(s) + w2f2(s) + . . . + wnfn(s)
e.g., w1 = 9 with
f1(s) = (number of white queens) – (number of black queens)
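A minimal sketch of such a linear weighted evaluation (the piece letters and weights are the conventional material values, used here as an assumption — the slides only specify the queen weight w1 = 9):

```python
# Eval(s) = w1*f1(s) + ... + wn*fn(s), with material-count features
# fi(s) = (number of white pieces of type i) - (number of black pieces)
WEIGHTS = {"Q": 9, "R": 5, "B": 3, "N": 3, "P": 1}

def material_eval(white_counts, black_counts):
    """white_counts/black_counts map piece letters to counts for each side."""
    return sum(w * (white_counts.get(p, 0) - black_counts.get(p, 0))
               for p, w in WEIGHTS.items())

# White has an extra queen, Black an extra rook: 9 - 5 = +4 for White
print(material_eval({"Q": 2, "R": 1}, {"Q": 1, "R": 2}))  # 4
```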
Digression: Exact values don’t matter
[Figure: the same 2-ply tree with leaf values 1, 2, 2, 4 and, after a monotonic transformation, 1, 20, 20, 400 — MAX makes the same choice in both]
Behaviour is preserved under any monotonic transformation of Eval
Only the order matters:
payoff in deterministic games acts as an ordinal utility function
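This invariance is easy to check directly. The small Python demo below (my own illustration) relabels the leaf values 1, 2, 4 as 1, 20, 400 — an order-preserving mapping — and confirms MAX’s move choice is unchanged:

```python
def minimax_choice(branches):
    """MAX's move in a 2-ply game: pick the branch whose minimum leaf is largest."""
    mins = [min(b) for b in branches]
    return mins.index(max(mins))

original = [[1, 2], [2, 4]]                 # leaf values as on the slide
transform = {1: 1, 2: 20, 4: 400}           # an order-preserving relabeling
relabeled = [[transform[v] for v in b] for b in original]
print(minimax_choice(original), minimax_choice(relabeled))  # same move: 1 1
```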
Cutting off search
Minimax-Cutoff is identical to Minimax-Value except
1. Terminal? is replaced by Cutoff?
2. Utility is replaced by Eval
Does it work in practice?
b^m = 10^6, b = 35 ⇒ m = 4
4-ply lookahead is a hopeless chess player!
4-ply ≈ human novice
8-ply ≈ typical PC, human master
12-ply ≈ IBM’s Deep Blue, Kasparov
α–β pruning example

[Figure: a sequence of slides stepping through α–β pruning on the 2-ply minimax tree, pruning subtrees as the α and β bounds tighten]
Properties of α–β
Pruning does not affect final result
Good move ordering improves effectiveness of pruning
With “perfect ordering,” time complexity = O(b^(m/2))
⇒ doubles depth of search
⇒ can easily reach depth 8 and play good chess
A simple example of the value of reasoning about which computations are
relevant (a form of metareasoning)
Why is it called α–β?
α is the best value (to max) found so far off the current path
If V is worse than α, max will avoid it ⇒ prune that branch
Define β similarly for min
The α–β algorithm
Basically Minimax + keep track of α, β + prune
function Max-Value(state, game,α,β) returns the minimax value of state
inputs: state, current state in game
game, game description
α, the best score for max along the path to state
β, the best score for min along the path to state
if Cutoff-Test(state) then return Eval(state)
for each s in Successors(state) do
α←Max(α,Min-Value(s, game,α,β))
if α ≥ β then return β
return α
function Min-Value(state, game,α,β) returns the minimax value of state
if Cutoff-Test(state) then return Eval(state)
for each s in Successors(state) do
β←Min(β,Max-Value(s, game,α,β))
if β ≤ α then return α
return β
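A compact Python sketch of the same algorithm (my own illustration, operating directly on nested-list trees rather than a game object, and using Eval = Utility at the leaves):

```python
import math

def alphabeta(node, maximizing=True, alpha=-math.inf, beta=math.inf):
    """Minimax with alpha-beta pruning over an explicit tree:
    leaves are numbers, internal nodes are lists of children."""
    if not isinstance(node, list):        # cutoff/terminal: return Eval(state)
        return node
    if maximizing:
        for child in node:
            alpha = max(alpha, alphabeta(child, False, alpha, beta))
            if alpha >= beta:
                return beta               # prune: MIN will never allow this line
        return alpha
    for child in node:
        beta = min(beta, alphabeta(child, True, alpha, beta))
        if beta <= alpha:
            return alpha                  # prune: MAX already has something better
    return beta

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(alphabeta(tree))  # 3 — same value as plain minimax, fewer leaves visited
```

On this tree the second branch is cut off after its first leaf (2 ≤ α = 3), and the third after its last; the root value is unchanged.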
Deterministic games in practice
Checkers: Chinook ended the 40-year reign of the human world champion in 1994. It used an endgame database defining perfect play for all
positions involving 8 or fewer pieces on the board, a total of 443,748,401,247
positions.
Chess: Deep Blue defeated the human world champion in a six-game match
in 1997. Deep Blue searches 200 million positions per second, uses a very
sophisticated evaluation function, and undisclosed methods for extending
some lines of search up to 40 ply.
Othello: human champions refuse to compete against computers, who are too good.
Go: human champions refuse to compete against computers, who are too
bad. In go, b > 300, so most programs use random moves initially, along
with pattern knowledge bases to suggest plausible moves.
Nondeterministic games
E.g., in backgammon, the dice rolls determine the legal moves
Simplified example with coin-flipping instead of dice-rolling:
Algorithm for nondeterministic games
Expectiminimax gives perfect play
Just like Minimax, except we must also handle chance nodes:
if state is a chance node then
return the probability-weighted average of ExpectiMinimax-Value over Successors(state)
A version of α–β pruning is possible
but only if the leaf values are bounded. Why??
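A minimal Python sketch of expectiminimax (my own illustration; the tree encoding with tagged tuples is an assumption, not from the slides). The coin-flip example mirrors the simplified game above: a fair chance node sits between MAX’s choice and MIN’s replies.

```python
def expectiminimax(node):
    """node is ('max'|'min'|'chance'|'leaf', payload).
    For chance nodes the payload is a list of (probability, child) pairs."""
    kind, payload = node
    if kind == "leaf":
        return payload
    if kind == "max":
        return max(expectiminimax(c) for c in payload)
    if kind == "min":
        return min(expectiminimax(c) for c in payload)
    # chance node: probability-weighted average of child values
    return sum(p * expectiminimax(c) for p, c in payload)

# MAX chooses between a fair coin flip (MIN replies after the flip)
# and a sure payoff of 1.5
root = ("max", [
    ("chance", [(0.5, ("min", [("leaf", 2), ("leaf", 4)])),
                (0.5, ("min", [("leaf", 0), ("leaf", 6)]))]),
    ("leaf", 1.5),
])
print(expectiminimax(root))  # max(0.5*2 + 0.5*0, 1.5) = 1.5
```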
Nondeterministic games in practice
Dice rolls increase b: 21 possible rolls with 2 dice
Backgammon ≈ 20 legal moves (can be 6,000 with 1-1 roll)
depth 4 = 20 × (21 × 20)^3 ≈ 1.2 × 10^9
As depth increases, probability of reaching a given node shrinks
⇒ value of lookahead is diminished
α–β pruning is much less effective
TD-Gammon uses depth-2 search + very good Eval
≈ world-champion level
Digression: Exact values DO matter
Behaviour is preserved only by positive linear transformation of Eval
Hence Eval should be proportional to the expected payoff
Games illustrate several important points about AI
♢ perfection is unattainable ⇒ must approximate and make trade-offs
♢ uncertainty limits the value of look-ahead
♢ can programs learn for themselves as they play? (stay tuned…)
Examples of skills expected:
♢ Demonstrate operation of game search algorithms
♢ Discuss and evaluate the properties of game search algorithms
♢ Design suitable evaluation functions for a game
♢ Explain how to search in nondeterministic games