FIT3080 – Artificial Intelligence
Adversarial Search Algorithms
www.monash.edu.au
Assignment 1 is out!
Modelling and solving problems using search
4 parts (mix of conceptual and practical tasks)
Check Moodle for the specification
Due date: September 7, 11:55pm (Melbourne)
Announcements
Informed tree/graph search
Expand ‘most promising’ nodes first,
Using evaluation function: f(n) = g(n) + h(n)
What are they
Properties & guarantees
Effectiveness
Heuristics
Bounded suboptimal search
Route planning
Route planning
Coordination problems
This week: games!
Tic-Tac-Toe
Games with an adversary
Man vs. Machine!
This week: games!
Chinook vs. Tinsley (1993/4). Watch Jonathan Schaeffer's wonderful account:
This week: games!
Deep Blue vs. Kasparov (1997)
A short recap from the BBC:
This week: games!
Go: AlphaGo vs. Lee Sedol (2016)
There’s a wonderful documentary about this match:
Types of games
Perfect information (deterministic, fully observable):
Chess, Go, noughts-and-crosses (tic-tac-toe), draughts (checkers), etc.
Imperfect information (stochastic and/or partially observable):
Poker, Texas hold'em (a variant of poker), bridge, backgammon, etc.
n players (e.g., n = 2 for chess)
Sequential (vs. simultaneous)
Zero-sum (vs. non-zero-sum) games
Our focus: 2-player, sequential, perfect-information, zero-sum games
Game Trees
Start state: the initial configuration of the board
Players are MAX and MIN
A position favourable to MAX has a value > 0 (winning: often +∞)
A position favourable to MIN has a value < 0 (winning: often −∞)
Each player chooses the most promising move for them
Each move adds new nodes to the frontier
The set of nodes generated for one player is called a ply
Each node has an associated utility for each player
Leaves of the tree are called terminal nodes
Game Tree (2-player, Deterministic, Turns)
Solving Game Trees
A solution is a game-tree where we can label the root node with an outcome (win, loss, draw, or a utility value)
Each move available to the opponent is a subproblem (another game)
All subproblems need to be solved before the parent node can be considered solved
A strategy for MAX (resp. MIN) is a solution-tree which contains exactly one successor for each MAX (resp. MIN) node.
A winning strategy for MAX (resp. MIN) is a solution-tree in which every terminal node is a WIN for MAX (resp. a LOSS for MIN), or, with numeric utilities, achieves the maximum utility value.
Goal: find a winning strategy for MAX
For all nodes representing a game situation where it is MIN’s move next, show that MAX can win from every position to which MIN might move
For all nodes representing a game situation where it is MAX’s move next, show that MAX can win from at least one position to which MAX might move
Solving Game Trees
Unpredictable opponent:
Must specify a move for every possible opponent reply
Time limits:
not all games can be searched to the end
find a good first move
Games versus Search Problems
If MAX were to choose among tip nodes (or available nodes), she would take the node with the largest value
If MIN were to choose among tip nodes (or available nodes), she would take the node with the smallest value
Choose a move to the position with the highest minimax value: best achievable payoff against best play
E.g., 2-ply game:
Search ideas
Minimax Algorithm
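The minimax pseudocode itself appears as a figure on this slide; as a rough illustration, here is a minimal Python sketch, assuming a hypothetical Game interface with is_terminal(state), utility(state) (terminal utility from MAX's point of view), actions(state) and result(state, action):

```python
# Minimal minimax sketch. Assumes a hypothetical Game interface providing
# is_terminal(state), utility(state) (from MAX's point of view),
# actions(state) and result(state, action).

def minimax_value(game, state, maximizing):
    """Return the backed-up minimax value of `state`."""
    if game.is_terminal(state):
        return game.utility(state)
    values = [minimax_value(game, game.result(state, a), not maximizing)
              for a in game.actions(state)]
    return max(values) if maximizing else min(values)


def minimax_decision(game, state):
    """MAX picks the move leading to the position with the highest minimax value."""
    return max(game.actions(state),
               key=lambda a: minimax_value(game, game.result(state, a), maximizing=False))
```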
Minimax Example: Tic-Tac-Toe
Evaluation function:
Eval(s) = (# rows, columns, diagonals available to MAX) − (# rows, columns, diagonals available to MIN)
[Figure: example board positions with their evaluations, e.g., 6 − 5 = 1, 5 − 5 = 0, 4 − 5 = −1, 5 − 4 = 1, 6 − 4 = 2]
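A minimal sketch of this evaluation function, assuming (as an illustration, not from the slides) that the board is a 3×3 list of lists containing 'X' for MAX, 'O' for MIN, or None, and that a line is "available" to a player if the opponent has no piece on it:

```python
# Sketch of the slide's tic-tac-toe evaluation:
# open lines for MAX ('X') minus open lines for MIN ('O').
# Assumes board is a 3x3 list of lists containing 'X', 'O' or None.

LINES = ([[(r, c) for c in range(3)] for r in range(3)] +               # rows
         [[(r, c) for r in range(3)] for c in range(3)] +               # columns
         [[(i, i) for i in range(3)], [(i, 2 - i) for i in range(3)]])  # diagonals

def open_lines(board, player):
    """Count rows/columns/diagonals with no opponent piece on them."""
    opponent = 'O' if player == 'X' else 'X'
    return sum(all(board[r][c] != opponent for r, c in line) for line in LINES)

def evaluate(board):
    """Eval(s) for MAX: on an empty board every line is open to both players, so 8 - 8 = 0."""
    return open_lines(board, 'X') - open_lines(board, 'O')
```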
Properties of Minimax
Complete? Yes (if the tree is finite)
Optimal? Yes (against an optimal opponent)
Time complexity? O(b^m) (depth-first exploration of all paths to depth m)
Space complexity? O(bm) (depth-first exploration)
For chess, b ≈ 35, m ≈ 100 for “reasonable” games → exact solution completely infeasible
We can improve performance by eliminating symmetric moves
Pruning the Search Space
We can improve performance by eliminating redundant nodes
Pruning the Search Space
α-value of a MAX node: the current largest final backed-up value of its successors
The α-value is a lower bound on the MAX node’s backed-up value
β-value of a MIN node: the current smallest final backed-up value of its successors
The β-value is an upper bound on the MIN node’s backed-up value
Definitions: α and β Values
Rules for discontinuing the search:
α cut-off: search can be discontinued below any MIN node having a β-value ≤ α-value of any of its MAX node ancestors
● The final backed-up value of this MIN node is set to its β-value
β cut-off: search can be discontinued below any MAX node having an α-value ≥ β-value of any of its MIN node ancestors
● The final backed-up value of this MAX node is set to its α-value
α-β Procedure
● All the successors of the start node are given final backed-up values
● The best first move is that which creates the successor with the highest backed-up value
Termination Condition
Minimax Algorithm (again)
The α-β Algorithm (I)
α = the value of the best (i.e., highest-value) choice we have found so far at any choice point along the path for MAX.
β = the value of the best (i.e., lowest-value) choice we have found so far at any choice point along the path for MIN.
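The full α-β pseudocode is shown as a figure in the slides; the following is a minimal Python sketch of the idea, reusing the same hypothetical Game interface as the minimax sketch above. The two cut tests correspond to the α and β cut-off rules given earlier:

```python
import math

def alphabeta(game, state, alpha=-math.inf, beta=math.inf, maximizing=True):
    """Minimax value of `state` with alpha-beta pruning (same hypothetical Game interface)."""
    if game.is_terminal(state):
        return game.utility(state)
    if maximizing:
        value = -math.inf
        for a in game.actions(state):
            value = max(value, alphabeta(game, game.result(state, a), alpha, beta, False))
            alpha = max(alpha, value)
            if alpha >= beta:      # beta cut-off: a MIN ancestor already has a better option
                break
        return value
    else:
        value = math.inf
        for a in game.actions(state):
            value = min(value, alphabeta(game, game.result(state, a), alpha, beta, True))
            beta = min(beta, value)
            if beta <= alpha:      # alpha cut-off: a MAX ancestor already has a better option
                break
        return value
```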
[Figure: α-β pruning example, showing an α cut below a MIN node with β = 2]
α-β Pruning – Example
α-β Pruning – Large Example
[Figure: a large game tree of alternating MAX and MIN nodes; the backed-up value of the start node is +1, with leaf values ranging from −5 to +5]
[Figure: part of the large example, showing β cuts below MAX nodes with α = +2 and MIN nodes with β values of −3, +3, +5 and +2]
α-β Pruning – Part of Large Example
α-β Pruning – Part of Large Example (continued)
Does the order of the moves matter?
The effectiveness of the αβ algorithm depends on the order in which states are examined
With perfect ordering:
● time complexity = O(b^(m/2)) = O((√b)^m)
i.e., depth of search can be doubled
Adding dynamic ordering schemes brings us close to the theoretical limit
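As a sketch of a simple static ordering scheme (an illustration, not a scheme prescribed by the slides), successors can be sorted by a heuristic evaluation before α-β recurses into them; eval_fn is a hypothetical static evaluation function:

```python
def ordered_actions(game, state, eval_fn, maximizing):
    """Order moves so that alpha-beta examines the most promising successors first."""
    return sorted(game.actions(state),
                  key=lambda a: eval_fn(game.result(state, a)),
                  reverse=maximizing)   # best-for-MAX first when maximizing
```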
Move Ordering
Suppose we have 100 secs per move, and we explore 10^4 nodes/sec → 10^6 nodes per move
Still might not reach a terminal node!
Idea: treat the frontier nodes as terminal states
Heuristics for Games
Some ideas for generating these.
From experience:
Material values in chess (1 = pawn, 3 = bishop, ..., 9 = queen)
State features:
f1(s) = (# of white queens) – (# of black queens)
Combinations of features, e.g., a linear weighted sum (see the sketch after this list):
Eval(s) = w1 f1(s) + w2 f2(s) + ... + wn fn(s)
From preprocessing:
End-game databases
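A minimal sketch of such a linear weighted evaluation; the feature names in the usage comment are hypothetical examples, not taken from the slides:

```python
def linear_eval(state, features, weights):
    """Eval(s) = w1*f1(s) + w2*f2(s) + ... + wn*fn(s)."""
    return sum(w * f(state) for w, f in zip(weights, features))

# Hypothetical usage with made-up chess features:
# features = [queen_difference, rook_difference, pawn_difference]
# weights  = [9, 5, 1]
# score = linear_eval(state, features, weights)
```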
Evaluation Functions
Depth-limited search (fixed cost)
Depth-first iterative deepening (anytime)
Beam search (forward pruning)
Don’t search at all! (use table lookups)
How do we search?
Potentially misleading evaluations
Search considerations
Starting from a fixed board position: (state, player)
1) Run AB up to some limit (DFS or DFID)
2) Treat the frontier nodes s’ as terminal states
3) Compute a score using EVAL(s’, p)
4) Back up the values
5) Pick the best move
6) Repeat from the new board position
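Putting the steps together, a minimal sketch of a depth-limited α-β move chooser, assuming the same hypothetical Game interface as before and an eval_fn that scores frontier nodes (the EVAL of the steps above):

```python
import math

def h_alphabeta(game, state, depth, eval_fn,
                alpha=-math.inf, beta=math.inf, maximizing=True):
    """Depth-limited alpha-beta: frontier nodes (depth == 0) are scored with eval_fn."""
    if game.is_terminal(state):
        return game.utility(state)
    if depth == 0:
        return eval_fn(state)              # steps 2-3: treat the frontier node as terminal
    best = -math.inf if maximizing else math.inf
    for a in game.actions(state):
        v = h_alphabeta(game, game.result(state, a), depth - 1, eval_fn,
                        alpha, beta, not maximizing)
        if maximizing:
            best = max(best, v)
            alpha = max(alpha, best)
        else:
            best = min(best, v)
            beta = min(beta, best)
        if alpha >= beta:                  # prune
            break
    return best


def choose_move(game, state, depth, eval_fn):
    """Steps 4-5: back the heuristic values up one ply and pick MAX's best move."""
    return max(game.actions(state),
               key=lambda a: h_alphabeta(game, game.result(state, a), depth - 1,
                                         eval_fn, maximizing=False))
```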
Heuristic AB Tree Search
Types of Search
Type 1: wide, but shallow
Type 2: narrow, but deep
Historically, practitioners preferred Type 1 searches.
More recently, Type 2 approaches (and Type 1/2 combinations) have emerged.
Use sampling to obtain estimates of utility at a given state
Starting from a given position (s, p)
“Roll out” the game until a terminal state
A fixed policy determines the moves at each ply (fast!)
Repeat many times to obtain an “average” utility
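A minimal rollout sketch, assuming the same hypothetical Game interface as earlier and a random default policy; the "average" utility is simply the mean terminal utility over many rollouts:

```python
import random

def rollout(game, state):
    """Roll out one game from `state` to a terminal state using a fast fixed (random) policy."""
    while not game.is_terminal(state):
        state = game.result(state, random.choice(game.actions(state)))
    return game.utility(state)

def estimate_utility(game, state, n_rollouts=1000):
    """Average terminal utility over many rollouts: a sampled estimate of the state's value."""
    return sum(rollout(game, state) for _ in range(n_rollouts)) / n_rollouts
```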
Monte Carlo Simulation
Four main ingredients:
1) Selection: pick a node from the frontier
2) Expand: generate successors to extend the frontier
3) Simulate games from the newly added node
4) Propagate the results
Monte Carlo Tree Search
Instead of always maximising average utility, we can bias the choices: exploration vs exploitation
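One standard way to implement this bias is the UCB1 rule used by UCT (a common MCTS variant, named here as a swapped-in standard technique, not something the slides prescribe); a minimal sketch, where Node is a hypothetical structure holding the statistics gathered from simulations:

```python
import math
from dataclasses import dataclass, field

@dataclass
class Node:
    """Hypothetical MCTS tree node: statistics accumulated from simulations."""
    visits: int = 0
    total_utility: float = 0.0
    children: list = field(default_factory=list)

def ucb1(parent, child, c=math.sqrt(2)):
    """Exploitation (average utility) plus an exploration bonus for rarely visited children."""
    if child.visits == 0:
        return math.inf                    # always try unvisited children first
    exploit = child.total_utility / child.visits
    explore = c * math.sqrt(math.log(parent.visits) / child.visits)
    return exploit + explore

def select_child(parent, c=math.sqrt(2)):
    """Selection step biased by UCB1 rather than by average utility alone."""
    return max(parent.children, key=lambda ch: ucb1(parent, ch, c))
```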
Monte Carlo Tree Search
Deterministic Games in Practice
Checkers: Chinook defeated the world champion in an abbreviated game in 1990. It uses αβ search combined with a pre-computed database defining perfect play for 39 trillion endgame positions.
Chess: Deep Blue defeated the human world champion in a six-game match in 1997. Deep Blue searches 30 billion positions per move (200 million per second), normally searching to depth 14, and extending the search up to depth 40 for promising options. Heuristics reduce the effective branching factor (EBF) to about 3.
Othello: In 1997, a computer defeated the world champion 6-0. Humans are no match for computers.
Go: b ~ 19 x 19 = 361, which is too large for αβ. In 2016, AlphaGo, which uses Deep Learning, beat the world champion 4-1.
AlphaGo also used Monte Carlo Tree Search (MCTS)
Summary: Adversarial Search
Games illustrate important points about AI
Perfection is unattainable → must approximate
This forces us to think about which problems are worth thinking about
The basic ideas discussed today generalise:
● >2 players → more plies and more costs
● Stochastic or hidden information → probabilistic reasoning (expectimax)
Propositional logic
Inference
Resolution
Satisfiability
First-order logic
Logic and agents
Next week – Logical Agents