
CSE 3521: Adversarial Search
[Many slides are adapted from CS188: Intro to AI at UC Berkeley and from previous offerings of CSE 3521 at OSU.]

Game Playing Agents
Build an agent to play:
Tic-Tac-Toe
Checkers
Chess
Go
Poker

Games have served for centuries as benchmarks of human intelligence
Recent game-playing programs have successfully challenged human supremacy

Zero-Sum Games
Agents have opposite utilities (values on outcomes)
Can be modeled with a single value that one agent maximizes and the other minimizes
Adversarial, pure competition

General Games
Agents have independent utilities (values on outcomes)
Cooperation, indifference, competition, and more are all possible

Types

Many kinds of games!

Axes:
Deterministic or stochastic?
One, two, or more players?
Zero sum?
Perfect information (can you see the state)?

Want algorithms for calculating a strategy (i.e., agent function, policy) which recommends a move (action) from each state
Game-playing agents


Two-player games
In many games someone wins, someone loses

These types of games are adversarial

Player A will try to maximize the chance of A winning
Player B will try to maximize the chance of B winning

Completely adversarial: zero-sum game

Zero-sum game: the total payoff to all players is the same for every outcome. In chess there is one winner or a draw, so the possible payoff pairs are {(1, 0), (0, 1), (½, ½)}.


Can We Use Search?

Tic-Tac-Toe
X goes first; trade X/O between games

What strategies do you use?
How do you choose your first move?
What would a bad first move be?
Do you look ahead?

Tic-Tac-Toe
Two-player

Turn-taking

Fully observable

Zero-sum

Time-constrained game?

Adversarial Search: Game Tree

(Figure: a game tree whose levels alternate Agent, Opponent, Agent)
You choose the moves in red; the opponent chooses the moves in blue
You don't have complete control

Adversarial Search: Game Tree
Agent generates a new Game Tree for each turn

Successor function
Next move
Agent is called MAX
Opponent is called MIN
Assume MAX plays the first move (initial state)

Terminal Test
Is a state a win/loss/draw for MAX?

All states are fully observable

Uncertainty is caused by the action of another

MAX wants MIN to lose (and vice versa)

Two-player game tree

Adversarial Search: Game Tree
No plan exists that guarantees success for either player

At each turn, the choice of which action to perform must be made within a specified time limit

The state space is enormous for difficult games (Chess, Go, etc.)
Only a tiny fraction of this space can be explored within the time limit
The branching factor and the depth of the terminal states are large

Chess
Number of states: ~10^40
Branching factor: ~35
Number of total moves in a game: ~100
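These numbers imply a game tree far too large to search exhaustively. A quick back-of-the-envelope check (using the slide's rough estimates b ≈ 35 and m ≈ 100):

```python
# Rough size of the full chess game tree: b^m nodes,
# using the slide's estimates b ~ 35 (branching factor)
# and m ~ 100 (total moves in a game).
b, m = 35, 100
tree_size = b ** m

# Python integers are arbitrary precision, so we can count digits exactly.
print(len(str(tree_size)))  # 35^100 has 155 decimal digits
```

Even at billions of nodes per second, a 155-digit node count is hopeless, which is why the later slides limit the search depth.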

Adversarial Search: Game Tree
(Figure: a partial Tic-Tac-Toe game tree alternating MAX's play and MIN's play, ending in a terminal state that is a win for MAX; levels alternate between MAX nodes and MIN nodes. The diagram does not show all symmetries.)

Adversarial Search: Game Tree
Agent’s turn

Build a game tree from the current state to a maximal depth h (called the horizon) so that it is feasible within the time limit

Evaluate the states at the leaf nodes
How?

(Figure: a game tree with levels labeled turn 1, turn 2, turn 3; the agent chooses the moves in red, the opponent in blue.)
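The procedure on this slide—expand the tree to a horizon h, then evaluate the leaf states—can be sketched as follows. This is a minimal illustration, not the course's reference code; the `successors` and `evaluate` callbacks and the toy tree encoding are assumptions made for the example.

```python
def depth_limited_value(state, depth, maximizing, successors, evaluate):
    """Minimax value of `state`, searching only `depth` plies ahead.

    `successors(state)` returns the list of next states (empty if terminal);
    `evaluate(state)` is a heuristic score from MAX's point of view.
    Both are supplied by the caller -- they are placeholders here.
    """
    children = successors(state)
    if depth == 0 or not children:            # horizon reached, or terminal state
        return evaluate(state)                # heuristic stands in for true utility
    values = [depth_limited_value(c, depth - 1, not maximizing,
                                  successors, evaluate)
              for c in children]
    return max(values) if maximizing else min(values)


# Toy tree: each "state" is a (heuristic value, children) pair.
tree = (0, [(0, [(3, []), (12, [])]),   # MIN would pick 3 here
            (0, [(8, []), (2, [])])])   # MIN would pick 2 here
succ = lambda s: s[1]
ev = lambda s: s[0]
print(depth_limited_value(tree, 2, True, succ, ev))  # MAX gets max(3, 2) = 3
```

Raising the horizon `depth` trades more computation for better estimates, which is exactly the time-limit tradeoff the slide describes.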

Evaluation Function
Evaluation function e(s): state s → scalar
A utility with respect to the agent

A heuristic function that estimates how favorable state s is for MAX (agent)
e(s) > 0: s is favorable to MAX
e(s) < 0: s is favorable to MIN
e(s) = 0: s is neutral

If the agent is playing X, which states are favorable to MAX?

Evaluation Function: Example Tic-Tac-Toe
A possible heuristic function:
e(s) = # rows/columns/diagonals open for MAX – # rows/columns/diagonals open for MIN
Examples: 8 – 8 = 0, 6 – 4 = 2, 3 – 3 = 0

Evaluation Function
Usually a weighted sum of "features":
e(s) = Σ_{i=1}^{n} w_i f_i(s)

What features can you define for chess?
Number of pieces of each type
Number of possible moves
Number of squares controlled

Evaluation functions
Evaluation functions score non-terminals in depth-limited search
Ideal function: returns the actual minimax value of the position
In practice: typically a weighted linear sum of feature functions
e.g., f1(s) = (num white queens – num black queens), etc.

Adversarial Search (Minimax)
Deterministic, zero-sum games: Tic-Tac-Toe, chess, checkers
One player maximizes the result; the other minimizes it

Minimax search:
A state-space search tree
Players alternate turns
Compute each node's minimax value: the best achievable utility against a rational (optimal) adversary
Assumes the opponent is using the same game tree and evaluation function

(Figure: a MAX node over two MIN nodes with terminal values {8, 2} and {5, 6}; the MIN values are 2 and 5, so the MAX value is 5. Terminal values are part of the game; minimax values are computed recursively.)

Example
(Figure: a MAX node over two MIN nodes with terminal values {10, 100} and {2, 20}; the MIN values are 10 and 2, so the MAX value is 10.)
You assume the opponent will act rationally

Example: Tic-Tac-Toe at horizon h = 2
(Figure: leaf evaluations such as 6 – 5 = 1, 5 – 6 = –1, 5 – 5 = 0, and so on; the backed-up MIN values are –1, –2, 1, 1, so MAX's best move is one with value 1.)

Example: Continued
(Figure: the search continued from the resulting position, with new leaf evaluations backed up in the same way.)

Minimax implementation (recursively using DFS)

def max-value(state):
    initialize v = –∞
    for each successor of state:
        v = max(v, min-value(successor))
    return v

def min-value(state):
    initialize v = +∞
    for each successor of state:
        v = min(v, max-value(successor))
    return v

Minimax implementation (recursively using DFS)

def value(state):
    if the state is a terminal state: return the state's utility
    if the next agent is MAX: return max-value(state)
    if the next agent is MIN: return min-value(state)

def max-value(state):
    initialize v = –∞
    for each successor of state:
        v = max(v, value(successor))
    return v

def min-value(state):
    initialize v = +∞
    for each successor of state:
        v = min(v, value(successor))
    return v

Pick an order for the successors, for two reasons: sequential processing and pruning

Minimax example
MAX chooses among actions a1, a2, a3; MIN then moves to a terminal state
Leaf values: a1 → {4, 10, 5}, a2 → {3, 1, 16}, a3 → {6, 2, 11}
Searching in depth-first manner, the MIN values are 4, 1, 2
So the root's minimax value is 4, and MAX's best move is a1

Minimax efficiency
How efficient is minimax?
Just like (exhaustive) DFS: "for each successor"
Time: O(b^m)
Space: O(bm)
Example: For chess, b ≈ 35, m ≈ 100
Exact solution is completely infeasible
But do we need to explore the whole tree?

Resource limits
The search process: we don't know the values of the middle nodes before reaching the terminal states
Problem: in realistic games, we cannot search to the leaves!
Solution: depth-limited search
Search only to a limited depth in the tree
Replace terminal utilities with an evaluation function for non-terminal positions (like heuristics)
Example: Suppose we have 100 seconds and can explore 10K nodes/sec
So we can check 1M nodes per move, which reaches about depth 8: a decent chess program
Guarantee of optimal play is gone
More plies make a BIG difference
(Figure: a depth-limited tree whose unexplored subtrees are marked "?"; the evaluation values –1, –2, 4, 9 back up to MIN values –2 and 4, giving MAX value 4.)
Remember: we perform the entire one round of minimax search mainly to determine just the next step!

Depth matters
Evaluation functions are always imperfect
The deeper in the tree the evaluation function is buried, the less the quality of the evaluation function matters
An important example of the tradeoff between the complexity of evaluation functions and the complexity of computation
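The minimax pseudocode in these slides translates almost directly into Python. The tree encoding below (numbers for terminals, lists for internal nodes) is an assumption made for this sketch; running it on the slides' three-action example reproduces the root value 4.

```python
def value(state, maximizing):
    """Minimax value via depth-first search.

    A terminal state is encoded as a number (its utility);
    an internal node is a list of successor states.
    """
    if isinstance(state, (int, float)):  # terminal state: return its utility
        return state
    if maximizing:                       # MAX's turn: best successor value
        return max(value(s, False) for s in state)
    return min(value(s, True) for s in state)  # MIN's turn: worst for MAX


# The slides' example: MAX chooses a1/a2/a3, then MIN picks a leaf.
tree = [[4, 10, 5],   # a1 -> MIN value 4
        [3, 1, 16],   # a2 -> MIN value 1
        [6, 2, 11]]   # a3 -> MIN value 2
print(value(tree, True))  # 4, so MAX's best move is a1
```

To recover the actual move rather than just the value, apply `value` to each of the root's successors and take the argmax, as the slides note: the whole search is run mainly to choose the next step.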