CS计算机代考程序代写 AI algorithm PowerPoint Presentation

PowerPoint Presentation

CSE 3521: Adversarial Search
[Many slides are adapted from the UC Berkeley. CS188 Intro to AI at UC Berkeley and previous CSE 3521 course at OSU.]

Game tree pruning
Ignore portions of search tree that make no difference to final choice

Minimax example

12
8
5
2

14
4
6

Alpha-Beta Pruning

MAX
MIN
Terminal
 3
3
12
8
2
3
2

14
 14
5
 5

2
 2

2
3
Note: Only showed MIN pruning here

Minimax pruning

12
8
5
2

Can We Do Better?

0
5
-3
2
5
-2
3
2
-3
0
3
3
-5
0
1
-3
5
0
1
-5
5
3
2
-3
5

Alpha-Beta Pruning

MAX
MIN
Terminal
≤ 3
3
Beta

Alpha-Beta Pruning

MAX
MIN
Terminal
 3
3
3
12
8
Final score
≤ 3
Alpha

Alpha-Beta Pruning

MAX
MIN
Terminal
 3
 2
3
12
8
2

3 > 2 !!!
No “final” score
But fine to tell MAX “2”
3
2

Alpha-Beta Pruning

MAX
MIN
Terminal
 3
3
12
8
2
3
2

14
 14
3 > 14 ?

Alpha-Beta Pruning

MAX
MIN
Terminal
 3
3
12
8
2
3
2

14
 14
5
 5

3 > 5 ?

Example

b = 2

2
The beta value of a MIN
node is an upper bound on
the final backed-up value.
It can never increase

Example
The beta value of a MIN
node is an upper bound on
the final backed-up value.
It can never increase

1
b = 1

Example
a = 1
The alpha value of a MAX
node is a lower bound on
the final backed-up value.
It can never decrease

1
b = 1

Example
a = 1

1
b = 1

-1
b = -1

Example
a = 1

1
b = 1

-1
b = -1
Search can be discontinued below
any MIN node whose beta value is
less than or equal to the alpha value
of one of its MAX ancestors

Alpha-Beta Pruning

3
-1

 Pruning
 -1

 3

This part of the tree can’t have any effect on the value that will be backed up to the root

Alpha-Beta Pruning (Algorithm)
Explore the game tree to depth h (horizon) in depth-first manner
Back up alpha and beta values whenever possible
Prune branches that can’t lead to changing the final decision
Update the alpha/beta value of the parent of a node N when the search below N has been completed or discontinued

Discontinue the search below a MAX node N if its alpha value is  the beta value of a MIN ancestor of N
Discontinue the search below a MIN node N if its beta value is  the alpha value of a MAX ancestor of N

Alpha-Beta implementation (recursively)
def min-value(state , α, β):
initialize v = +∞
for each successor of state:
v = min(v, value(successor, α, β))
if v ≤ α return v
β = min(β, v)
return v
def max-value(state, α, β):
initialize v = -∞
for each successor of state:
v = max(v, value(successor, α, β))
if v ≥ β return v
α = max(α, v)
return v

α: MAX’s best option on path to root so far: the lowest score MAX can possibly get
β: MIN’s best option on path to root so far: the highest score MIN can possibly get
In the beginning, α= -∞ and β= +∞

Pick an order for two reasons: sequential processor and pruning
20

Alpha-Beta Pruning

MAX
MIN
Terminal
≤ 3
3
Beta
def min-value(state , α= -∞, β= +∞):
initialize v = +∞
for each successor of state:
v = min(v, value(successor, α, β))
if v ≤ α return v
β = min(β, v)
return v
def max-value(state, α= -∞, β= +∞):
initialize v = -∞
for each successor of state:
v = max(v, value(successor, α, β))
if v ≥ β return v
α = max(α, v)
return v

Alpha-Beta Pruning

MAX
MIN
Terminal
 3
3
3
12
8
Final score
≤ 3
Alpha
def min-value(state , α= -∞, β= 3):
initialize v = +∞
for each successor of state:
v = min(v, value(successor, α, β))
if v ≤ α return v
β = min(β, v)
return v
def max-value(state, α= -∞, β= +∞):
initialize v = -∞
for each successor of state:
v = max(v, value(successor, α, β))
if v ≥ β return v
α = max(α, v)
return v

Alpha-Beta Pruning

MAX
MIN
Terminal
 3
 2
3
12
8
2

3 > 2 !!!
No “final” score
But fine to tell MAX “2”
3
2
def min-value(state , α= 3, β= +∞):
initialize v = +∞
for each successor of state:
v = min(v, value(successor, α, β))
if v ≤ α return v
β = min(β, v)
return v
def max-value(state, α= 3, β= +∞):
initialize v = -∞
for each successor of state:
v = max(v, value(successor, α, β))
if v ≥ β return v
α = max(α, v)
return v

Alpha-Beta Pruning

MAX
MIN
Terminal
 3
3
12
8
2
3
2

14
 14
3 > 14 ?

def min-value(state , α= 3, β= +∞):
initialize v = +∞
for each successor of state:
v = min(v, value(successor, α, β))
if v ≤ α return v
β = min(β, v)
return v
def max-value(state, α= 3, β= +∞):
initialize v = -∞
for each successor of state:
v = max(v, value(successor, α, β))
if v ≥ β return v
α = max(α, v)
return v

Alpha-Beta Pruning

MAX
MIN
Terminal
 3
3
12
8
2
3
2

14
 14
5
 5

3 > 5 ?

def min-value(state , α= 3, β= 14):
initialize v = +∞
for each successor of state:
v = min(v, value(successor, α, β))
if v ≤ α return v
β = min(β, v)
return v
def max-value(state, α= 3, β= +∞):
initialize v = -∞
for each successor of state:
v = max(v, value(successor, α, β))
if v ≥ β return v
α = max(α, v)
return v

Alpha-Beta Pruning

MAX
MIN
Terminal
 3
3
12
8
2
3
2

14
 14
5
 5

2
 2

2
3
Note: Only showed MIN pruning here
def min-value(state , α= 3, β= 5):
initialize v = +∞
for each successor of state:
v = min(v, value(successor, α, β))
if v ≤ α return v
β = min(β, v)
return v
def max-value(state, α= 3, β= +∞):
initialize v = -∞
for each successor of state:
v = max(v, value(successor, α, β))
if v ≥ β return v
α = max(α, v)
return v

Another Example

0
5
-3
2
5
-2
3
2
-3
0
3
3
-5
0
1
-3
5
0
1
-5
5
3
2
-3
5

Example

0
5
-3
2
5
-2
3
2
-3
0
3
3
-5
0
1
-3
5
0
1
-5
5
3
2
-3
5
0

Example

0
5
-3
2
5
-2
3
2
-3
0
3
3
-5
0
1
-3
5
0
1
-5
5
3
2
-3
5
0
0

Example

0
5
-3
2
5
-2
3
2
-3
0
3
3
-5
0
1
-3
5
0
1
-5
5
3
2
-3
5
0
0
-3

Example

0
5
-3
2
5
-2
3
2
-3
0
3
3
-5
0
1
-3
5
0
1
-5
5
3
2
-3
5
0
0
-3

Example

0
5
-3
2
5
-2
3
2
-3
0
3
3
-5
0
1
-3
5
0
1
-5
5
3
2
-3
5
0
0
0
-3

Example

0
5
-3
2
5
-2
3
2
-3
0
3
3
-5
0
1
-3
5
0
1
-5
5
3
2
-3
5
0
0
0
-3
3
3

Example

0
5
-3
2
5
-2
3
2
-3
0
3
3
-5
0
1
-3
5
0
1
-5
5
3
2
-3
5
0
0
0
-3
3
3

Example

0
5
-3
2
5
-2
3
2
-3
0
3
3
-5
0
1
-3
5
0
1
-5
5
3
2
-3
5
0
0
0
0
-3
3
3
0

Example

0
5
-3
2
5
-2
3
2
-3
0
3
3
-5
0
1
-3
5
0
1
-5
5
3
2
-3
5
0
0
0
0
-3
3
3
0
5

Example

0
5
-3
2
5
-2
3
2
-3
0
3
3
-5
0
1
-3
5
0
1
-5
5
3
2
-3
5
0
0
0
0
-3
3
3
0
2
2

Example

0
5
-3
2
5
-2
3
2
-3
0
3
3
-5
0
1
-3
5
0
1
-5
5
3
2
-3
5
0
0
0
0
-3
3
3
0
2
2

Example

0
5
-3
2
5
-2
3
2
-3
0
3
3
-5
0
1
-3
5
0
1
-5
5
3
2
-3
5
0
0
0
0
-3
3
3
0
2
2
2
2

Example

0
5
-3
2
5
-2
3
2
-3
0
3
3
-5
0
1
-3
5
0
1
-5
5
3
2
-3
5
0
0
0
0
-3
3
3
0
2
2
2
2

Example

0
5
-3
2
5
-2
3
2
-3
0
3
3
-5
0
1
-3
5
0
1
-5
5
3
2
-3
5
0
0
0
0
-3
3
3
0
2
2
2
2
0

Example

0
5
-3
2
5
-2
3
2
-3
0
3
3
-5
0
1
-3
5
0
1
-5
5
3
2
-3
5
0
0
0
0
-3
3
3
0
2
2
2
2
5
0

Example

0
5
-3
2
5
-2
3
2
-3
0
3
3
-5
0
1
-3
5
0
1
-5
5
3
2
-3
5
0
0
0
0
-3
3
3
0
2
2
2
2
1
1
0

Example

0
5
-3
2
5
-2
3
2
-3
0
3
3
-5
0
1
-3
5
0
1
-5
5
3
2
-3
5
0
0
0
0
-3
3
3
0
2
2
2
2
1
1
-3

Example

0
5
-3
2
5
-2
3
2
-3
0
3
3
-5
0
1
-3
5
0
1
-5
5
3
2
-3
5
0
0
0
0
-3
3
3
0
2
2
2
2
1
1
-3

Example

0
5
-3
2
5
-2
3
2
-3
0
3
3
-5
0
1
-3
5
0
1
-5
5
3
2
-3
5
0
0
0
0
-3
3
3
0
2
2
2
2
1
1
-3

1
1
0

Example

0
5
-3
2
5
-2
3
2
-3
0
3
3
-5
0
1
-3
5
0
1
-5
5
3
2
-3
5
0
0
0
0
-3
3
3
0
2
2
2
2
1
1
-3

1
1
-5
0

Example

0
5
-3
2
5
-2
3
2
-3
0
3
3
-5
0
1
-3
5
0
1
-5
5
3
2
-3
5
0
0
0
0
-3
3
3
0
2
2
2
2
1
1
-3

1
1
-5
0

Example

0
5
-3
2
5
-2
3
2
-3
0
3
3
-5
0
1
-3
5
0
1
-5
5
3
2
-3
5
0
0
0
0
-3
3
3
0
2
2
2
2
1
1
-3

1
1
-5
-5
-5
0

Example

0
5
-3
2
5
-2
3
2
-3
0
3
3
-5
0
1
-3
5
0
1
-5
5
3
2
-3
5
0
0
0
0
-3
3
3
0
2
2
2
2
1
1
-3

1
1
-5
-5
-5
0

Example

0
5
-3
2
5
-2
3
2
-3
0
3
3
-5
0
1
-3
5
0
1
-5
5
3
2
-3
5
0
0
0
0
-3
3
3
0
2
2
2
2
1
1
-3

1
1
-5
-5
-5
0
1

Example

0
5
-3
2
5
-2
3
2
-3
0
3
3
-5
0
1
-3
5
0
1
-5
5
3
2
-3
5
0
0
0
0
-3
3
3
0
2
2
2
2
1
1
-3

1
1
-5
-5
-5
2
2
2
2
1
1

Example

0
5
-3
2
5
-2
3
2
-3
0
3
3
-5
0
1
-3
5
0
1
-5
5
3
2
-3
5
0
0
0
0
-3
3
3
0
2
2
2
2
1
1
-3

1
1
-5
-5
-5
1
2
2
2
2
1

How much do we gain?
Consider these two cases:

3
a = 3
-1
b=-1
(4)

3
a = 3
4
b=4
-1

Alpha-Beta Pruning properties
This pruning has no effect on minimax value computed for the root!

Values of intermediate nodes might be wrong
Important: children of the root may have the wrong values

Good child ordering improves effectiveness of pruning

With “perfect ordering”:
Time complexity drops to O(bm/2) [Knuth and Moore 1975]
Doubles solvable depth!
Full search of, e.g., chess, is still hopeless…

This is a simple example of meta-reasoning (computing about what to compute)

10
10
0
max
min

Exercise-1 (adversarial)

Answer Exercise-1 (adversarial)

4
8
8

Exercise-1 (Alpha-Beta)

Answer Exercise-1 (Alpha-Beta)

4
8
8

Exercise-2 (adversarial)

Answer Exercise-2 (adversarial)

10
100
2
20
2
10
10

Exercise-2 (Alpha-Beta)

Answer Exercise-2 (Alpha-Beta)

10
100
2
-∞
2
10
10

Uncertain Outcomes

Worst Case vs. Average Case

10
10
9
100
max
min

Idea: Uncertain outcomes controlled by chance, thus not adversary!

At least 10, or otherwise just rather dead; or just 100, or otherwise rather dead; that’s yet something different than minimax or expectimax
65

Reminder: Probabilities
A random variable represents an event whose outcome is unknown
A probability distribution is an assignment of weights to outcomes

Example: Traffic on freeway
Random variable: T = whether there’s traffic
Outcomes: T in {none, light, heavy}
Distribution: P(T=none) = 0.25, P(T=light) = 0.50, P(T=heavy) = 0.25

Some laws of probability (more later):
Probabilities are always non-negative
Probabilities over all possible outcomes sum to one

As we get more evidence, probabilities may change:
P(T=heavy) = 0.25, P(T=heavy | Hour=8am) = 0.60
We’ll talk about methods for reasoning and updating probabilities later
0.25
0.50
0.25

The expected value of a function of a random variable is the average, weighted by the probability distribution over outcomes

Example: How long to get to the airport?

Reminder: Expectations
0.25
0.50
0.25
Probability:
20 min
30 min
60 min
Time:

35 min
x
x
x
+
+

Expectimax Search
Why wouldn’t we know what the result of an action will be?
Explicit randomness: flip a coin or rolling dice
Unpredictable opponents: the ghosts respond randomly
Actions can fail: when moving a robot, wheels might slip

Values should now reflect average-case (expectimax) outcomes, not worst-case (minimax) outcomes

Expectimax search: compute the average score under optimal play
Max nodes as in minimax search
Chance nodes are like min nodes, but the outcome is uncertain
Calculate their expected utilities
I.e., take weighted average (expectation) of children

10
4
5
7

max
chance
10
10
9
100

Expectimax pseudocode

def value(state):
if the state is a terminal state: return the state’s utility
if the next agent is MAX: return max-value(state)
if the next agent is EXP: return exp-value(state)
def exp-value(state):
initialize v = 0
for each successor of state:
p = probability(successor)
v += p * value(successor)
return v

def max-value(state):
initialize v = -∞
for each successor of state:
v = max(v, value(successor))
return v

Pick an order for two reasons: sequential processor and pruning
69

Expectimax pseudocode

def exp-value(state):
initialize v = 0
for each successor of state:
p = probability(successor)
v += p * value(successor)
return v

5
7

8
24
-12
1/2
1/3
1/6
v = (1/2) (8) + (1/3) (24) + (1/6) (-12) = 10

Expectimax example

12
9
6
0

15
4
6

Expectimax pruning?

12
9

Depth-Limited Expectimax

…
…
492
362
…

400
300
Estimate of true expectimax value (which would require a lot of work to compute)

Expectimax: Coin Flip
Expected value = P(heads) * MIN(heads) +P(tails) * MIN(tails)
74

Evaluation function meaning

A1 is best move
A2 is best move
2 outcomes with probabilities {.9, .1}

Modeling Assumptions

The dangers of optimism and pessimism
Dangerous Optimism
Assuming chance when the world is adversarial
Dangerous Pessimism
Assuming the worst case when it’s not likely

Uncertainty everywhere
Not just for games of chance!
I’m sick: will I sneeze this minute?
Email contains “FREE!”: is it spam?
Tooth hurts: have cavity?
60 min enough to get to the airport?
Robot rotated wheel three times; how far did it advance?
Safe to cross street? (Look both ways!)

Sources of uncertainty in random variables:
Inherently random process (dice, etc.)
Insufficient or weak evidence
Ignorance of underlying processes
Unmodeled variables
The world’s just noisy – it doesn’t behave according to plan!

Compared to fuzzy logic, which has degrees of truth, rather than just degrees of belief (Talk about next)

/docProps/thumbnail.jpeg

Related Posts