程序代写代做代考 scheme data structure AI algorithm Excel PII: 0004-3702(75)90019-3

PII: 0004-3702(75)90019-3

ARTIFICIAL INTELLIGENCE 293

An Analysis of Alpha-Beta Priming’

Donald E. Knuth and Ronald W. Moore

Computer Science Department, Stanferd University,
Stanford, Calif. 94305, U.S.A.

Recommended by U. Montanari

ABSTRACT
The alpha-beta technique for searching game trees is analyzed, in an attempt to provide some
insight into its behavior. The first portion o f this paper is an expository presentation o f the
method together with a proof o f its correctness and a historical ch’scussion. The alpha-beta
procedure is shown to be optimal in a certain sense, and bounds are obtained for its running
time with various kinds o f random data.

Put one pound of Alpha Beta Prunes
in a jar or dish that has a cover.

Pour one quart o f boiling water over prunes.
The longer prunes soak, the plumper they get

Alpha Beta Acme Markets, Inc.,
La Habra, California

Computer programs for playing games like e, hess typically choos~ their
moves by seaxching a large tree of potential continuations. A technique
called “alpha-beta pruning” is generally ~ used to speed up such search
processes without loss of information, The purpose of this paper is to
analyze the alpha-beta procedure in order to obtain some quantitative
estimates of its performance characteristics.

i This research was supported in part by the National Science Foundation under grant
number GJ 36473X and by the Office of Naval Research under contract NR 044-402.

Artificial Intelligence 6 (1975), 293-326
Copyright © 1975 by North-Holland Publishing Company

294 D.E. KNUTH AND R. W. MOORE

Section 1 defines the basic concepts associated with game trees. Section 2
presents the alpha-beta method together with a related technique which is
similar, but not as powerful, because it fails to make “deep cutoffs”. The
correctness of both methods is demonstrated, and Section 3 gives examples
and further development of the algorithms. Several suggestions for applying
the method in practice appear in Section 4, and the history of alpha-beta
pruning is discussed in Section 5.

Section 6 begins the quantitative analysis, byderiving lower bounds on
the amount of searching needed by alpha-beta and by any algorithm which
solves the same general problem. Section 7 derives upper bounds, primarily by
considering the case of random trees when no deep cutoffs are made. It is
shown that the procedure is reasonably efficient even under these weak
assumptions. Section 8 shows how to introduce some of the deep cutoffs into
the analysis; and Section 9 shows that the efficiency improves when there are
dependencies between successive moves. This paper is essentially self-
contained, except for a few mathematical resultsquoted i n the later sec-
tions.

1. Games and Position Values

The two-person games we are dealing with can be characterized by a set of
“positions”, and by a set of rules for moving from one position to ~,nother,
the players moving alternately. We assume that no infinite sequence of
positions is allowed by the rules, 2 and that there are only finitely many legal
moves from every position. It follows from the “infinity lemma” (see [11,
Section 2.3,4.3]) that for every position p there is a number N(p) such that no
game starting a t p lasts longer than N(p) moves.

I fp is a position from which there are no legal moves, there is an integer-
valued function f(p) which represents the value of this position to the player
whose turn it is to play from p; the value to the other player is assuraed to be

– – – f ( p ) .

If p is a position from which there are d legal moves Pl, • •., Pd, where
d > 1, the problem is to choose the “best” move. We assume that the best
move is one which achieves the greatest possible value when the game ends,
if the opponent also chooses moves which are best for h im. Let F(p) be the
greatest possible value achievable from position p against the optimal
defensive strategy, from the standpoint of the player who is moving from that

2 Strictly speaking, chess does not satisfy this condition, since its rules for repeated
positions only give the players the option to request a draw; in certain circumstances;
if neither player actually does ask for a draw,: the game can go on forever. But this techni-
cality is of no practical importance, since computer chess programs only look finitely many
moves ahead. I r i s possible to deal with infinite games by assigning appropriate values to
repeated positions, but such questionsare beyond the scope of this paper.
Artificial Intelligence 6 f1975), 293-326

AN ANALYSIS OF ALPHA-BETA PRUNING 295

position. Since the value.(to this player) after moving to position Pt will be
-F(pl), we have

~f(p) if d = 0, (1)
F(p) = (max(-F(pl),…, i f d > 0.

This formula serves to define F(p) for al! positions p, by induction on the
length of the longest game playable from p.

In most discussions of game-playing, a slightly different formalism is used;
the two players are named Max and Min, where all values are given from
Max’s viewpoint. Thus, if p is a terminal position with Max to move, its
value is f(p) as before, but if p is a terminal position with Min to move its
value is

gO’) = -f(P) . (2)

Max will try to maximize the final value, and Min will try to minimize it.
There are now two functions corresponding to( l) , namely

F(p) V = ~f(P) if d = 0,
[max(G(pl) , . . . , G(pd)) if d > 0, (3)

which is the best value Max can guarantee starting at position p, and

fg(p) if d = 0,
G(p) = [.min(F(pl),…, F(Pd)) if d > 0, (4)

which is the best that Min can be sure of achieving. As before, we assume
that Pl , . •., Pa are the legal moves from position p. It is easy to prove by
induction that the two definitions of F in (1) and (3) are identical, and that

ffi – F 0 , ) (5)

for all p. Thus the two approaches are equivalent.
Sometimes it is easier to reason about game-playing by using the “mini-

max” framework of (3) and (4) instead of the “negmax” approach of eq. (1);
the reason is that we are sometimes less confused if we consistently evaluate
the game positions from one player’s standpoint. On the other hand, formula-
tion (1) is advantageous when we’re trying to prove things about games,
because we don’t have to deal with two (or sometimes even four or eight) sep-
arate cases when we want to establish our results. Eq. (I) is analogous to
the “NOR” operation which arises in circuit design; two levels of NOR logic
are equivalent to a level of ANDs followed by a level of OR~.

The function F(p) is the maximum final value that can be achieved if both
players play optimally; but we should remark that this reflects a rather
conservative strategy that won’t always be best against poor players or
against the nonoptimal players we encounter in the real world. For example,
suppose that there are two moves, to positions p~ and P2, where p~ assures a
draw (value 0) but cannot possibly win, while P2 give a chance of either
victory or defeat depending on whether or not the opponent overlooks a

Artificial Intelligence 6 (1975), 293-326

296 D.E. KNUTH AND R. W. MOORE

rather subtle winning move. We may be better off gambling on the move to
P2, which is our only chance to win, unless we are convinced of our opponent’s
competence. Indeed, humans seem to beat chess-playing programs by adopting
such a strategy.

2. Development of the Algorithm

The following algorithm (expressed in an ad-hoc ALGOL-like l~ngnage)
clearly computes F(p), by following definition (1):

integer procedmre F (position p):
begin integer m, i, t, d;

determine the successor positions P i , – . . , P~;
if d = 0 then F := f (p) else
begin m :ffi -Qo;

for i :– 1 step 1 until d do
beg i . t :ffi

i f t > m then m := t;
end;

F :-” m;
end;

end.

Here Qo denotes a value that is greater than or equal to ]f(P)l for all terminal
positions of the game, hence – u3 is less than or equal to +F(p) for all p.
This algorithm is a “brute force” search through all possible continuations;
the infinity lemma assures us that the algorithm will terminate in finitely
many steps.

It is possible to improve on the brute-force search by using a “branch-and-
bound” technique [14], ignoring moves which are incapable of being better
than moves which are already known. For example, i f F(pi) = -10, then
F(p) >i 10, and we don’t have to know the exact Value ofF(p2) if we can
deduce that F(p2) >I – 10 (i.e., that -F(pz) ~ 10). Thus if P~t is a legal
move from P2 such that F(Pzl)( <~ 10, w e n e e d n o t bother to explore any other moves from Pz. In game-playing terminology, a move to Pz can be "refuted" (relative tothe alternative move Pt) ff the opposing player can make a reply to Pz that is at least as good as his best reply to Pl- Once a move has been refutedi we need not search for the best possible refutation. This line of reasoning leads to a computational technique that avoids much of the computation done by F. We shall define FI as a procedure on two parameters p and bound, and our goal is to achieve the following Conditions: FI (p, bound)- F(p) if F(p) < bound, Fl(p, bound) ~ bound i fF(p) >~rbound. (6)

Artificial Intelligence6 (1975), 293.326

AN ANALYSIS OF ALPHA-BETA PRUNING 297

These relations do not fully define F1, but they are sufficiently powerful to
calculate F(p) for any starting position p because they imply that

Fl(p, oo) = F(p). (7)
The following algorithm corresponds to this branch-and-bound idea.

integer procedure FI (positioRp, integer bound):
begin integer m, i, t, d;

determine the successor positions Pl , . •., Pj”
i f d = 0 then F l := f(p) else
begin m :– -oo ;

for i := 1 step 1 nntil d do
begin t := -F l (p t , : m ) ;

i f t > m t h e n m : = t;
if m >i bound then go to done;

end;
done: FI : = m;
end;

end.

We can prove that this procedure satisfies (6) by arguing as follows: At the
beginning of the tth iteration of the for loop, we have the “invariant”
condition

m = m a x ( – F ( p l ) , . . . , – F ( p H ) ) (8)
just as in procedure F. (The max operation over an empty set is conventionally
defined to be -oo.) For if ,F(pt ) is >m, then F l ( p i , – m ) = F(p~), by
condition (6) and induction on the length of the game following p; therefore
(8) will hold on the next iteration. And i f m a x ( – F ( p l ) , . . . , -F(pi)) >~bound
for any i, then F(p) >t bound. It follows that condition (6) holds for all p.

The procedure can be improved further if we introduce both lower and
uppe. r bounds; this idea, which is called alpha-beta pruning, is a significant
extension to the one-sided branch-and-bound method. (Unfortunately it
doesn’t apply to all branch-and-bound algorithms, it works only when a
game tree is being explored.) We define a procedure F2 of three parameters p,
alpha, and beta, for alpha < beta, satisfying the following conditions analogous to (6): F2(p, alpha, beta) <~ alpha if F(p) ~ alpha, F2(p, alpha, beta) - F(p) if alpha < F(p) < beta, "- (9) F2(p, alpha, beta) >~ beta if F(p) >/ beta.

Again, these conditions do not fully specify F2, but they imply that
F2(p, – oo, oo) – F(p). ( I O)

It turns out that this improved algorithm looks only a little different from the
others, when it is expressed in a programming language:

Artificial Intelligence 6 (1975), 293-326

298 D . E . KNUTH AND R. W. MOORE

integer procedure F2 (position p, integer alpha, integer beta):
begin integer m, i, t, d;

determine the successor positions p 1, • •., Pa;
i f d = 0 then F 2 : = f ( p ) else
begin m := alpha;

for i : – 1 step 1 until d do
begin t : = – F2(pl, — beta, – m ) ;

i f t > m then m := t;
if m >1 beta then go to done;

end;
done: F2 : = m;
end;

end;

To prove the validity of F2, we proceed as we did with FI. The invariant
relation analogous to (8) is now

m = m a x ( a l p h a , – F ( P i ) , . . . , -F(p~-I)) (11)
and m < beta. If --F(p~ >i beta, then -F2(pl , -be ta , – m ) will also be
>~beta, and i fm < - F ( p i ) < beta, then -F2(p t , -be ta , - m ) = -F(pt ) ; so the proof goes through as before, establishing (9) by induction. Now that we have found two improvements of the minimax procedure, it is natural to ask whether still further improvement is possible. Is there an "alpha-beta-gamma" p~'ocedure F3, which makes use say of the second- largest value found so far, or some other gimmick ? Section 6 below shows that the answer is no, or at least that there is a reasonable sense in which procedure F2 is optimum. 3. Examples and Refinements As an example of these procedures, consider thetree in Fig, 1, which repre- sents a position that has three successors, each of which has three successors, etc., until we get t o 34 = 81 positions~possible after four moves; and these 81 positions have been assigned ' , random',fvalues according to the first I]1 digits of n. Fig, 1 shows the F values computed from t he f ' s ; thus, the root node at the top of the tree has an effective value of 2 after best play by both sides. Fig. 2 shows the same situation as it is evaluated by procedure FI Cp, oo). Note that only 36 of the 81 terminal positions are examined, and that one of the nodes at level 2 now has the "approximate" vaiue 3 instead of its true value 7; but this approximation does not of course affect the value at the top. Fig: 3 shows the same situation as it is evaluated by the full alpha-beta pruning procedure. F2(p, - o o , + oo) will always examine the same nodes as Fl(p, oo) until the fourth level of lookaheadis reached, in any game tree; Artificial lntell~ence 6 (1975), 293-326 AN ANAEYSIS OF ALPHA-BETA PRUNING 299 this is a consequence of the theory developed below. On levels 4, 5 , . . . , however, procedure F2 is occasionally able to make "deep cutoffs" which FI is incapable of finding. A comparison of Fig. 3 with Fig. 2 ~hows that there are five deep cutoffs in this example. 2 11/\ / iX / iX / /X / i X / N I N / ~ \ / ~ \ - 1 - 1 - 2 - 3 - 7 - 2 - 4 - 2 - 3 - 2 - 0 - 2 - 1 -1 - 3 - 3 - 0 - 2 - 0 - 4 - 4 - 0 - 1 - 0 - 2 - 0 - 8 0 0 0 0 0 0 0 0 0 • 0 0 0 0 • 0 0 0 0 0 0 0 0 0 0 • • FiG. I . Complete ev~tluatioa o f a game tree. 2 -2 2 . i . / \ /e,.,. / \..,.,. _ ~ / ! i ! i " - / i N , , , , . , - 1 - 1 - 2 - 3 , , -4 , - 2 - o - 2 I ~\ I :~ , - - . . - o - l - o , , • e / t e / t • i ~ • • • - . - • • e • * • /~, ~',~,,~,,,,,,~,,,,, ~'h,;:~ ~ ~ ~ ' 3141 263358 846 3279502 0974944 230781640 F=G. 2. Tit=' ::,une tree of Fig. 1 evaluated with procedure FI (branch-and-bound strategy). . . 1 t / \ ,. '":"-. / \ " ' - 2 / i \ 2 ~ . , 4 / , - , , , I\ ;~', ;,",', I~\ , ~\ ,,, --1 .1 - 2 - 2 r t - 2 I . 2 - 0 - 2 s i / : ~ e ' ~ , - 0 - 4 - 4 - 2 - 1 - 0 i* t , O O O O | ~ 0 ~ % O O O t : \ o o e o O o ~ , , /~ ~,, /~ ~ ,~, ;~,, /~ :t,, /~,, ~, /~ ~, ;~,, ,~,, A ~, ,~ ,A,,o~,~ ~ , p,, ~ ~ ;~,, t~,, ;~,, 3141 265358 846 3 2 9 5 0 2 ,1~ I I , , i t 2 781640 ]FIG. 3. The game tree of Fig. 1 evaluated with procedure F2 (~dpha-beta strategy). All of these illustrations present the results in terms of the "negamax" model of Section 1; if the reader prefers to see it in "minimax" terms, it is sufficient to ignore all the minus signs in Figs. 1-3. The procedures of Section 2 can readily be converted to the minimax conventions, for example by replac'- ing F2 l:y the following two procedures: Artificial Intelligence 6 (1975), 293-326 300 D.E. KNUTH AND R. W. MOORE integer procedure F2 (position p, integer alpha, integer beta): begin integer m, i, t, d; determine the successor positions Pl , • • . , P~; if d = 0 then F2 :-- f (p) else beg~ m := alpha; for i : - 1 step 1 until d do begin t : = G2(ps, m, beta); if t > m then m := t;
if m >I beta then go to done;

end;
done: F2 : – m;
end;

end;

integer procedure G2 (position p, integer alpha, integer beta);
begin integer m, i, t, d;

determine the successor p o s i t i o n s p l , . . . , Pd;
if d = 0 then G2 := g(p) else
begin m : – beta;

for i := 1 step 1 until d do
begin t : – F2(p~, alpha, m);

if t < m then m := t; i f m ~ alpha then go to done; end; done: F2 := m; end; end. It is a simple but instructive exercise to prove that G2(p, alpha, beta) always equals - F2(p, -beta , - alp/~), The above procedures have made use of a magic routine that determines the successors Pl , - •., PJ of a given position p. If we want to be more explicit about +how positions are represented, it is natural to use the format o f linked records: When p is a reference to a r e ~ r d denoting a position, let first(p) be a reference to the first successor of that position, or A (a null reference) i f the position is terminal. Similarly if q references a successor p+ o f p, let next(q) be a referenceto the next successor P++I, or A if i - d. Finally let generate(p)be a procedure that creates the records for P t , . . . , PJ, sets their next fields, and makes first(p) point to Pl (or to A if d = 0). Then the alpha-beta pruning method takes the following more explicit form. integer procedure F2 (tel(position) p, integer alpha, integer beta): begin integerm, t; ref (position) q; generate(p); q := first(p); Artificial Intelligence 6 (1975), 293-326 AN ANALYSIS OF ALPHA-BETA PRUNING 301 i f q = A then F2 : = f ( p ) else begin m := alpha; while q ~ A and m < beta do begin t : - - F 2 ( q , - b e t a , - m ) ; i f t > m then m := t;
q : – next(q);

end;
F2 : = m ;

end;
end.

It is interesting to convert this recursive procedure to an iterative (non-
recursive) form by a sequence of mechanical transformations, and to apply
simple optimizations which preserve program correctness (see [13]). The
resulting procedure is surprisingly simple, but not as easy to prove correct as
the recursive form:

integer procedure alphabeta (gel (position) p);
begin integer I; ¢.omment level of recursion;

integer array a [ -2 :L] ; comment stack for recursion, where
all – 2], a[! – 1], all], all + 1] denote respectively
alpha, – beta, m, – t in procedure F2;

ref (position) array r[0:L + 1]; comment another stack for
recursion, where rill and r[l + 1] denote respectively
p and q in F2;

1 : = 0; a [ – 2 ] := a [ – l ] : – – o o ; r[0] : = p ;
1:2: generate (rill);

r [ / + 1] : – f i r s t ( r i l l ) ;
i f r[l + 1 ] = A then a[l] : = f(r[l]) else
begin a[l] := a [ / – 2];

loop: 1 := 1 + I; go to F2;
resume: if – a l l + 1 ] > al l] then

begin all] := – a l l + 1];
i f all + 1] ~ a[! – 1] then go to done;

end;
r[I + 1] := next(r[1 + 1]);
i f r[l + 1 ] 😛 A then go to loop;

end;
done: l : = I – 1; if 1 t> 0 then go to resume;
alphabeta : – a[0];

end.

This procedure alphabeta(p) will compute the same value as F2(p, – oo, + oo);
we must choose L large enough so that the level of recursion never exceeds L.

Artifldal Intelligence 6 (197~, 293-326

302 D.E. KNUTH AND R, W. MOORE

4. Applications

When a computer is playing a complex game, it will rarely be able to :;earch all
possibilities until truly terminal positions are reached; even the alpha-beta
technique won’t be last enough to solve the game of chess:! But we can still
use the above procedures, if the routine that generates all moves is modified
so that sufficiently deep positions are considered to be terminal. For example,
if we wish to look six moves ahead (three for each player), we can pretend
that the positions reached at level 6 have no successors. To compute f at
such artificially-terminal positions, we must of course use our best guess
about the value, hoping that a sufficiently deep search will ameliorate the
inaccuracy of our guess. (Most of the time will be spent in evaluating these
guessed values for f , unless the determination of legal moves is especially
difficult, so some quickly-computed estimate is needed.)

Instead of searching to a fixed depth, it is also possible to carry some lines
further, e.g., to play out all sequences of captures. An interesting approach
was suggested by Floyd in 1965 [6]), although it has apparently not yet been
tried in large-scale experiments. Each move in Floyd’s scheme is assigned a
“likelihood” according to the following general plan: A forced move has
“likelihood” of 1, while very implausible moves (like queen sacrifices in
chess) get 0.01 or so. In chess a “recapture” has “likelihood” greater than ~;
and the best strategic choice out of 20 or 30 possibilities gets a “likelihood”
of about 0.1, while the worst choices get say 0.02. When the product of all
“likelihoods” leading to a position becomes less than a given threshold
(say 10-s), we consider that position to be terminal and estimate its value
without further searching. Under this scheme, the “most likely” branches of
the tree are given the most attention.

Whatever method is used to produce a tree of reasonable size, the alpha-
beta procedure can be somewhat improved if we have an idea what the value
of the initial position will be. Instead of calling F 2 ~ , , o0, .+ ~) , we can
try F2(p, a, b) where we expect the value to be greater than a and less than b.
For example, i f F2(p, 0, 4) is used instead of F2(p, -10 , +10) in Fig. 3, the
rightmost ” – 4 ” on level 3, and t h e ” 4 ” below it, do not need to be con-
sidered. If our expectation is fulfilled, we may have pruned off more of the
tree; on theother hand if the value turns out to be low, say F2(p,a, b) ffi v,
where v ~< a, we can use F2(p, - co, v)to deduce thecorrect value. This idea has been used in some versions of Greenblatt's chess program [8]. 5. History -- Before we begin to make quantitative analyses of alpha-beta's effectiveness, let us look briefly at its historical development. The early history is somewhat obscure, became it is based on undocumented recollections and because Artificial Intelligence 6 (1975), 293--326 AN ANALYSIS OF ALPHA-BETA PRUNING 303 some people have confused procedure F1 with the stronger procedure F2; therefore the following account is based on the best information now avail- able to the authors. McCarthy [15] thought of the method during the Dartmouth Summer Research Conference on Artificial Intelligence in 1956, when Bernstein described an early chess program [3] which didn't use any sort of alpha-beta. McCarthy "criticized it on the spot for this [reason], but Bernstein was not convinced. No formal specification of the algorithm was given at that time." I t is plausible that McCarthy's remarks at that conference led to the use of alpha-beta pruning in game-playing programs of the .late 1950s. Samuel has stated that the idea was present in his checker-playing programs, but he did not allude to it in his classic article [21] because he felt that the other aspects of his program were more significant. The first published discussion of a method for game tree pruning appeared in Newell, Shaw and Simon's description [16] of their early chess program. However, they illustrate only the "one-sided" technique used in procedure F1 above, so it is not clear whether they made use of "deep cutoffs". McCarthy coined the name "alpha-beta" when he first wrote a LISp program embodying the technique. His original approach was somewhat more elaborate than the method described above, since he assumed the existence of two functicns "optimistic value(p)" and "'pessimistic value(p)'" which were to be upper and lower bounds on the value of a position. McCarthy's form o f alpha-beta searching was equivalent to replacing the abort: body of procedure F2 by i f optimistic value(p) <~ alpha then F2 : = alpha else ifpessimistic value(p) >>. beta then F2 := beta
else begin 1 and I – j is even; otherwise (i.e.,
when l , j is odd, hence at = I) it is of type 3. It is easy to establish the
following facts by induction on the computation, i.e., by showing that they
are invariant assertions:

(1) A type i position pis examined by calling F2(p, – ~ , + oo). If it is not
terminal, its successor positionpl is of type 1, and F(p) = – F ( p 0 # +oo.
The other succesror positions p , . . . , Pd are of type 2, and they are all
examined by caning F2(pi, –o~, F(pl)).

(2) A type 2 position p is examined by calling F2(p, – c o , beta), where
– Qo < beta <<. F(p). If it is not terminal, its successor position Pi is of type 3, and F(p)ffi -F (p l ) ; hence, by the mechanism of procedure F2 as defined in Section 2, the ot~zer successors P 2 , . . . , Pd are not examined. (3) A type 3 position p is examined by calling F2(p, alpha, + oo) where + oo > alpha >t F(p). I f it is not terminal, each of its successor positions Pl
is of type 2 and they are all examined by calling F2(pi, – o o , -alpha).

Artificial Intelligence 6 (1975), 293-326

306 V.E. KNUTH AND R. W. MOORE

It follows by induction on I that every critical position is examined.

COROLLARY 1. I f every position on levels O; I . . . . , 1 — 1 o f a game tree
satisfying the conditions o f Theorem 1 has exaetly d successors, for some
f i xed constant d, then the alpha-beta procedure examines exactly

d tll2j “t” d rl/21 – 1 (13t
positions on level I.

Proof. There a r e d i-|/2J sequences at • . . at, with I ~< al ~ d for all i, such that at = I for all odd values of / ; there are dr ~/21 such sequences with a| ffi 1 for all even values of i; and we subtract I for the sequence I . . . I which was counted twice. This corollary was first derived by Levin in 1961, but no proof was apparently ever written down at the time. In fact, the informal memo [I0] by Har t and Edwards justifies the result by saying: , 'For a convincing personal proof using the new heuristic hand waving technique, see the author of this theorem. '~ A proof was later published by Slagle and Dixon [25]. However, none of these authors pointed out that t h e valueof the root position must not equal + oo. Although this is a rare occurrence innontrivial games, since it means that the root position is a forced win or loss, it is a necessary hypothesis for both the theorem and the corollary, since the number of positions examined on level I will be d t|/2J when the root value is +co, and it will be d r~/21 when the root value is - co . Roughly s ' peaking, we gain a factor of 2 when the root value is ~ oo. The characterization of perfect alpha-beta pruning in terms of critical positions allows us to extend Corollary 1 to a much more general class of game trees, having any desired probability distribution of legal moves on each level• COROLLARY 2. Let a random game tree be generated in such a way that each position on leve l j has probability ql o f being nontermi~tal, and has an average o f dj successors. Then the expected number o f position s on level l i s do dt . . . d t , l ; and the expected number o f positions on level I e x a m ~ d by alpha-beta technique under the assumptions o f Theorem l is doqldzq3 . . . dt-2q,-t ~ qodxq2d3 . . • qi.2d|-t : qoqt . . . qt-i " ' " I even; ~14~ doqld~q~ . . . qr-2dl-t ~ qodiq2d3 . . . dl-2q~-x - q0qt:- 'J ql-t l odd. ( ') • (M re precisely. :theassumpUons underlying this random branclfing process are that level j :+ 1 of the: tree is formed £tom level j as follows: Each position p on l e v e l j is assigned a probability ~str ibution . where ra(p)::~ theprobab~ty t ha tp will have d successors; these
• distributions may be d:fiTerent for:different positions p, but leach must satisfy
to(P) = i – qj, and each must have the mean vaiucrt(p) + 2r~(p) + . . .

Artif~ial lnteJligenc¢ 6 (1975)~ 7293,-326

AN ANALYSIS OF ALPHA-BETA PRUNING 307

= dj. The number of successor positions for p is chosen at random from this
distribution, independently of the number of successors of other positions on
level j.)

Proof If x is the expected number of positions of a certain type on levelA
then xd~ is the expected number of successors of these positions, and xqj is
the expected number of “number 1” successors. It follows as in Corollary 1
that (14) is the expected number of critical positions on level l; for example,
qo q l . . – q H is the expected number of positions on level ! whose identifying
coordinates are all l ‘s.

Note that (14) reduces to (13) when qj = 1 and dl – d for 0 ~< j < / . Intuitively we might think that alpha-beta pruning would be most effective when perfect-ordering assumption (12) holds; i.e., when the first successor of every position is the best possible move. But this is not always the case: Fig. 4 shows two game trees which are identical except for the left-to-right ordering of successor positions; alpha-beta search will investigate more of the left-hand tree than the right-hand tree, although the left-hand tree has its positions perfectly ordered at every branch. 4 4 A A 2 i f " "~3 3 A , , - 2 -1 - I - 2 FIG. 4. Perfect ordering is not always best. Thus the truly optimum order of game trees traversal isn't obvious. On the other hand it is po~ible to show that there always exists an order for pro- cessing the tree so that alpha-beta examines as few of the terminal positions as possible; no algorithm can do better. This can be demonstrated by strengthening the technique used to prove Theorem I, as we shall see. Tt~oe~M 2. Alpha, beta pruning is optimum in the following sense: Given any game tree and any algorithm which computes the value o f the root positim~, there is a way to permute the tree (by reordering successor positions i f necessary) so that every terminal position examined by the alpha-beta method under this permutation is examined by the given algorithm. Furthermore i f the value of the root is not +_ oo, the aipha-bet~ procedure examines precisely the positions which are critical under this permutation. (It is assumed that all terminal positions have independent values, or Artificial Intelligence6 (1975), 293-326 308 D.E. KNUrH AND R. W, MOORE equivalently that the algorithm has no knowledge about dependencies between the values of terminal positions.) An equivalent result has been obtained by G. M. Adelson-Velskiy [l, Appendix l]; a somewhat simpler proof will be presented here. Proof. The following functions F~ and F~ Oefine the best possible bounds on the value of any position p, based on the terminal positions examined by the given algorithm: " ( , ifp is terminal and not examined, Fj(p) = ~f(p) i fp is terminal and examined, (15) [max( - F~(p0, • . . , -F~(pd)) otherwise; + ( , ifp is terminal and not examined, F~(p) -- ~f(p) i fp is terminal and examined, (16) [max(-F~(p), . . . , -F~p~)) otherwise. Note that Fz(p) < F~(p)for all p. By independently varying the values at unexamined terminal positions below p, we can make F(p) assume any given value between F~p) and F.(p), but we can never go beyond these limits. When p is the root position we must therefore have F~(p) -- F~(p) = F(p). Assume that the root value is not _+ co. We will show how to permute the tree so that every critical terminal position (according to the new numbering of positions) is examined by the given al~orithm and that precisely the critical positions are examined by the alpha-beta procedure F2. The critical positions will be classified as type 1, 2, or 3 as in the proof of Theorem 1, the root being type I. "Une following facts can be proved by induction : (1) A type I positionp has Ft(p) = ~(p) = F(p) # _+ co, and it is examined during the alpha-beta procedure by cailingF2(p, - co, + co). Ifp is terminal, it must be examined by the given algorithm, since Fdp) # - co. I f it is not terminal, let j and k be such that F~(p)= -F.(pj) and F.(p)--- -Fg(pt). Then by (15) and (16) we have ,,), • • hence ~(pj) = Fz(Pt)and wemay assume that j~=k. By pe.rmuting the successor posi!i0ns we may assume in fact that j - k _ ~1. Posit~on Pi (after permutation) is Of ~ 1 ; the other s U ~ o r positions pc ' . . . ,p~ are of type 2, anti'they are allexaminedby calfing P2(pt, - c o , '~F(p0). (2) A type 2 position phaS F~(p) ~ > – c o , and it is examined during the
a l p h a ‘ b e t a p r ~ u r e by calling F2(p, , c o , beta), wheie’–oo < beta ~ F~(p). I f p ~term~ai, rit m n s t ~ ex~ined by ~ given algorithm. Other~v. !selet J b e ~t/ch ~ a t ~ ( p ) = / ~ F . ( p j ) , ~ d ~ m u ~ the i SucceSs0r positions • if necessary So that j = L position pk (after ~rmutation)is of type 3and is examined by calling F2(px, -be ta ,+ co). Since F . ( p O - -F~p) <~-beta, this call returns a value ~ - b e t a ; hence the other successors P2 , - . . ,Pd Artificial Intelligence 6(1975); 293-326 AN ANALYSIS OF ALPHA-BErA PRUNING 309 (which are not critical positions) are not examined by the alpha-beta method, nor are their descendants. (3) A type 3 position p has F,.(p) < co, and it is examined during the alpha.beta procedure by calling F2(p, alpha, + co), where F~(p) <~ alpha < oo. I fp is terminal, it must be exam/ned by the given algorithm. Otherwise all its sl:ccessor positions p~ are of type 2, and they are all examined by calling F2(p~, - co, -alpha). (There is no need to permute them, the ordering makes absolutely no difference here.) A similar argument can be given when the root value is + co (treating it as a type 2 position) or - c o (type 3). A surprising corollary of this proof is that the ordering of successors to type 3 positions in an optimally-ordered tree has absolutely no effect on the behavior o f alpha-beta pruning. Type 1 positions constitute the so-called "principal variation", corresponding to the best strategy by both players. The alternative responses to moves on the principal variation are of type 2. Type 3 positions occur when the best move is made from a type 2 position, and the successors of type 3 positions are again of type 2. Hence about half of the critical positions of a perfectly ordered game tree are of type 3, and current game-playing algorithms are probably wasting nearly half of the time they now spend trying to put successor moves in order. Let us say that a game tree is uniform of degree d and height h if every position on levels 0~ 1 , . . . , h - i has exactly d successors, and if every position on level h is terminal. For example, Fig. 1 is a uniform tree of height 4 and degree 3, but the trees of Fig. 4 are not uniform. Since all permutations of a uniform tree are uniform, Theorem 2 implies the following generalization of Corollary 1. COROLLARY 3. Any algorithm which evaluates a uniform game tree of height h and degree d must evaluate at least d rh/zl + d th/z' - 1 (17) terminal positions. The aiFha-beta procedure achieves this lower bound, i f the best move is considered first at each position of types 1 and 2. 7. Uniform Trees Without Deep Cutoffs Now that we have determined the best case of alpha-beta pruning, let's be more pessimistic and try to look at the worst that can happen. Given any finite tree, it is possible to find a sequence of values for the terminal positions so that the alpha-beta procedure will examine every node of the tree, without making any cutoffs unless the tree branches are permutco. (To see this, arrange the values so that whenever F2(p, alpha, beta) is called, the condition -alpha > F(pj) > F(pz) > . . . > F(p~) > -be ta is satisfied.) On the other

Artbqclal lntelflgence 6 (1975), 293-326

310 D.E. KNUTH AND R. W. MOORE

hand, there are game trees with distinct terminal values forwhich thealpha-
beta procedure will always find some cutoffs no matter how the branches
are permuted, as shownin Fig. 5. (Procedure FI does not enjoy this property.)

/ \ . . l \ .
! \ ./,,. !\ . / \
/ \ / \ / \ / \ / \ / \ / ‘ \ / \

al a 2 b l b 2 a 3 a 4 b3 b4 a 5 a 6 b 5 be a7 a 8 b7 b 8

FXG. 5. I f max(ab . . . . as )< min(bh . . . ,be) , the alpha-beta procedure will always find at least two cutoffs, no matter how we permute t i c branches of this game tree. Since game-playing programs usually use some sort of ordering strategy in connection with alpha-beta pruning, these facts about the worst case are of tittle or no practical significance. A more useful upper bound relevant to the behavior we may expect in practice can be based on the assumption of random data. Feller, Gaschnig and GiUogly have recently undertaken a study [7] of the average number of terminal positions examined when the alpha- beta procedure is applied to a uniform tree of degree d and height h, giving independent random values to the terminal positions on level h. They have obtained formulas by which this average number can be computed, in roughly d s st~,s, and their theoretically-predicted results Were only Slightly higher than empiricaUy, observed data obtained from a modified chess-playing program. Unfortunately the formulas turn out to be extremely complicated, even for this reasonably simple theoretical model, so t h a t t h e asymptotic behavior for large d and/or h seems to defy analysis. Since we are looking for upper bounds ~yway, it is natural to consider the behavior of the We.aker procedure F L ~ S method is weaker since it doesn't find any "deep cutoffs" ; but it is m ~ better than complete mir~i- niaxing, and Figs. 1-3 indicatethat deep cutoffs probably have only a second-order effect/on the efficiency. Furthermore, procedure FI has the great virtue that its analysis is much simpler than that of the full alpha-beta procedure F2. Onthe other hand, the analysis of F l i sby no means as easy as it looks, and the mathematics turnsou~ to be extremely interesting. I n fact, the Artificial Intelligence6 (1975), 293.37,,6 AN ANALYSIS OF ALPHA-BETA PRUNING 3ll authors' first analysis was found to be incorrect, although several competent people had checked it without seeing any mistakes. Since the error is quite instructive, we shall present our original (but fallacious) analysis here, challenging the reader to "find the bug"; then we shall study how to fix things up. With this understanding, let us conside~ the following problem: A uniform game tree of degree d and height h is constructed with random values attached to its d ~ terminal positions. What is the expected number of terminal positions examined when procedure FI is applied to this tree? The answer to this problem w~H be denoted by T(d, h). Since the search procedure depends only on the relative order of the d h terminal values, not on their magnitudes, and since there is zero probability that two different terminal positions get the same value, we may assume that the respective values assigned to the terminal positions are p~rmutations of {1, 2, .... , dh}, each permutation occurring with probability 1/(dh)!. From this observation it is clear that the d ~ values of positions on each level I are also in random order, for 0 ~< l < h. Although procedure Fl does not always compute the exact F values at every position, it is not difficult tO r' verify that the decisions F1 makes a~aut ,'atoffs depend entirely on the F values (not on the approximate values Fli~p)); so we may conclude that the expected number of positions examined on level I is T(d, l) for 0 ~< l ~< h. This justifies restricting attention to a single level h when we count the number of positions examined. In order to simplify the notation, let us consider first the case of ternary trees, d = 3; the general case will follow easily once this one is understood. Our first step is to classify the positions of the tree into types A, B, C as follows: The root position is type A. The first successor of every nonterminal position is type A. The second successor of every nonterminal position is type B. The third successor of every nonterminal position is type C. 1 1 1 1 314 3/5 1 9/14 9/20 I i \ I I \ I I \ ¥11 V12 Y13 Y21 Y22 Y23 Y31 Y32 Y33 Fzo. 6. Part of a uniform ternary tree. Artificial Intelligence 6 (1975). 293-326 312 D.E. KNUTH AND R. W. MOORE Fig. 6 shows the local "environment" of typical A, B, C positions, as they appear below a nonterminal position p which may be of any type. The F-values of these three positions are xl, x2, x3, respectively, and their descendants have respective F-vahles Y11,--.,Y33. Our assumptions guarantee that Yll,- .., Y33 are in random order, no matter what level of the tree we are studying; hence the values x~ = m a x ( - Y l t , --Yl2, -Y13), • • . , x3 ~-" m a x ( - y ~ l , --Y32, --733) are also in random order. If position p is examined by calling Fl(p, bored), then position A will be examined by the subsequent call FI(A, + o0), by definition ofF1 (see Section 2). Eventually the valu~ xt will be returned; and if - x l < bound, position B will be examined by calling FI(B, xt). Eventually the value x2 will be returned; or, if x2 >i xi, any value ~ >t Xl may be returned. If max( -x l , -x~) < bound, position C will be examined by calling FI(C, min(xl, x2)). Note that - m a x ( - x l , -~2) -min (x l , x2); the precise value of x~ is not involved when C is called. This argument shows that all three successors of an A position are always examined (s,:nce the corresponding bound is +00). Each B position will examine its first successor, but (since i',s bound is xt - - min(.vl 1, Yt 2, Yl 3)) it will examine the second successor if and only if -Y2t < -min(y~ ~, Y12, Yi3), i.eo, if and only if the values satisfy min(ylt, YI2, Yt3) < Y21- This happens with probability ¼, since the y's are randomly ordered and since the relation min(yl 1, Yl 2, Yl 3) > Y2~ obviously holds with probability ¼.Similarly
the third successor of a B position is evaluated if and only if the values
satisfy min(yll, YI2, YI3) <: min(y2t, Y22), and this has probability ~. The probability that the second successor of a C position is evaluated is the probability that max(min(yll, YI2, Y13), min0'21, )22, Y23)) < Y31, and this occurs ~ of the time; the third successor is examined with probability ~o. (A general formula for ithese probabilities is derived below.) Let A~, B,, C~ be the expected number of positions examined n levels below an A, B, or C position examined by p r ~ d u r e F1 in a random game tree. Our discussion proves that A o - B o f C o = l ; A,,+l = A,,% B,, + C,; (18) B.+l = A. + ¼S, + IC.;.i: c.+i = A , + + and T(3, h) - As is the answer to our problem when d -- 3. The solution to these simultaneous linear rec~lrrences can be studied in many ways, and for our purposes the use of generating functions is most convenient. Let Artificial Intelligence 6(1975), 293-326 AN ANALYSIS OF ALPHA-BETA PRUNING 313 A(z) = ~ A, z ~, B(z) = #~o B~ z", C(z) = E C, z', n~-O n~>O

so that (18) is equivalent to
A(z) – 1 = zA(z) + .zB(z) + zC(z),
B ( z ) – 1 = zA(z) + ~}zB(z)+ ~}zC(z),
C(z) – 1 = zA(z) + ,-~4zB(z) + 2-~zC(z).

By Cramer’s rule, A(z) – U(z)/V(z), where

U(z) = de 1 ~ z – 1 ]z ,

V(z) det ¼z-I ] z
. ~-g4z ~f-6z– 1

(19)

(2o)

¢1 c2 C3
+ + , (21)

A(z) = 1 – rlz 1 – – r 2 z 1 – – r 3 z

c, = -r,U(1/rt)/r'(l/r~). (22)
Consequently A(z) = ~.)o(cl(rlz)” + cz(rzz)” + ea(raz)’), and we have

A, = + c2 z +
by equating coeflicie,Rs ofz,. Ifwe number the roots so that {rl [ > It21 >~ Ir3[
(and the theorem of Perron [17] assures us that this can be done), we have
asymptotically

A, .., clr ~. (23)

Numerical calculation gives rt – 2.533911, ci—- 1.162125; thus, the alpha-
beta procedure without deep cutoffs in a random ternary tree w/ll examine
about as many nodes as in a tree of the’ same height with average degree
2,534 instead of 3. (It is worthwhile to note that (23) -redicts about 48
positions to be examined on: the fourth level, while on:~; 35 occurred in
Fig. 2; the reason for this discrepancy is chiefly that the one-digit values in
Fig. 2 are nonrandom because of frequent equalities.)

Elementary manipulation of determinants shows that the equation z 3 V(l/z)
= 0 is the same as

1 – z 1 z /
det I ¼-z = 0 ;

hence r s is the largest eigenvalue of the matrix
Artificial Intelligence 6 (1975), 293-326

where

are polynomials in z. If the equation z 3 V(l/z) = 0 has distinct roots rl, r z, r3,
there will be a partial fraction expansion of the form .-

314 D . E . KNUTH AND R. W. MOORE

We might have deduced this directly from eq. (18), i f we had known enough
matrix theory to calcolat¢ the constant cl by matrix-theoretic means instead
of function-theoretio means.

This solves the case d .– 3. For general dwe find similarly that the expected
number of terminal positions examined by the alpha-beta procedure without
deep cutoffs, in a random uniform game tree of degree d and height h, is
asymptotically

T(d , h) ,.. co(d) ro(d) ~’ (24)
for fixed d as h ~ 00, where re(d) is the largest eigenvalue of a certain d x d

M d – –

matrix
rP l l P12 ” ‘ ” Ptd~

P21 P22 – – – P24
i •

Pdt Pd2 ” ‘ – Pdd

(25)

and where co(d) is an approl ~riate consent. The general matrix element p~j
in (25) is the probability that

max (min(Ylt,. . . , Y~d)) < rain Y~ (26) l ~ k < l 1~/¢< / in a sequence of ( i - l)d + ( j - 1) independent identically distributed random variables YI t , . . . , Ylcj-t~- When i - 1 o r j - 1, the probability in (26) is 1, since the rain over an empty set is + 00 and the max is -Qo. When i , j > 1 we can evaluate the
probability in several ways, of which the simplest seems to be combinatorial:
For (26) to hold, the minimum of all the Y’s must b~ Yt,tl for some kl < i, and this occurs with probability (i - l )d / ( ( ! - l ) d + j - I ) ; remov- ing Yk, t, .... ,Yti4 from consideration, the minimum of the remaining Y's must be Y~a,2 for s o m e k2 < i , and this occurs with probability ( i - 2 ) d / ( ( i - 2 ) d + j - 1); and so on. Therefore (26) occurs with pro- bability _ , / . ( i - 1)d ( i - 2)d d ~ . . . . a e B Pu = (i - Dd + j ' - I (i - 2)d + j - - 1 d + i - I - - l / ( i - I t 2 1 - 1 ) / d ) . (27) This explicit formula allows us to calculate to(d)numerically for small d without much difficulty, and to calculate c o ( d ) f o r small d with somewhat more difficulty using (22). Arti~:intelliocnce:6 (1975), 293-326 AN ANALYSIS OF ALPHA-BETA PRUNILNG 315 The form of (27) isn't very convenient for asymptotic calculations; there is a much simpler expression which yields an excellent approximation: LEMMA 1. When 0 <~ x <. 1 and k is a positive integer, k - ' , ~ < ( k - l + x ) k - 1 <~ kX/r(1 + x). (a8) (Note that 0.885603 < F(l + ac) ~< l for 0 ~< x ~< 1, with the minimum value occurring at x = 0.461632; hence the simple formula k x is always within about 11 ~o of the exact val!ue of the binomial coefficient.) Proof. When 0 ~ x <~ I and t > – 1 we have
(1 + t) ~’ ~< 1 + tx, (29) since the function f (x ) = (1 + t)X/(l + tx) satisfies f (0) = f ( l ) = 1, and s ince f"(x) = ((ln(l + t) - t/(1 + tx)) z + t2/(l + tx)2)f(x) > O.

Using (29) for t ~- 1, ½, ~ , . . . yields

l + x l % x l + ~ x | , < ~ < - - - - ~ < . . . 2 ~ 2 ~ (t)~ ~< l i m ( l + x ) ( 2 + x ) ( m + x ) 1 1 m-.® 1 ~ " ' ' m (m + 1)" = 1-(1 + x) and the kth term of this series of inequalities is k - 1 For trees of height 2, deep cutoffs are impossible, and procedures F I and F2 have an identical effect. How many or" the d 2 positions at level 2 are examined ? Our analysis gives an exact answer for this case, and Lemma 1 can be used to give a good approximate result which we may state as a theorem. THEOR~ 3. The expected number of terminal positions examined by the alpha-beta procedure on level 2 of a random uniform game tree of degree d is T(d, 2) = E p . O0) l~l , j~d where the Pi~ are defined in (27). We have dZ d 2 C, l ~ <~ T(d, 2) <~ C2 log d (31) for certain positive constants C1 and (?2. Proof. Eq. (30) follows from our previous remark3, and from Lemma 1 we know that C S(d) <<. T(d, 2) g S(d), where C ~ n.885603 = i n f o ~ t F(l + x) and Artificial Intelligence 6 (t975), 293-326 316 v . E . KNUTH AND X. W. MOORE l e~t,j~d d d--I = d + 1 k Now for k=d'we have k- lJdfexp(-- t In d/d)f f i l - t In d/d+O((Iog d/d)Z), hence for x / d g k ~ min(j~,f6),
(32)

min(fs,f6) < max(mm(fi, f2), min(fs,f,)). Each of these two events has probability {, but they are not independent. A i • I A ( IB QA ( l i b OA (lib QA (liB tl #2 f3 f4 vs fe f7 f8 FIG. 7. A tree which reveals the fallacious reasoning. When the fallacy is stated in these terms, the error is quite plain, but the dependence was much harder to s~; in the diagrams we had been drawing for ourselves. For example, when we argued using Fig. 6 that the second successor of a B position is examined with probability ¼, we neglected to consider that, when p is itself of type B or C, the B node in Fig. 6 is entered only when rain(y11, Y12, Yt3) is less than the bound at p; so rain(y, t, Yt2, Yl 3) is some- what smaller than a random value would be. What we should have computed is the probability that Yzt > min(ytl, Yl2, Y~3) given that position B is not
cut off. And unfortunately this can depend in a very complicated way on the
ancestors ofp.

To make matters worse, our error is in the wrong direction, it doesn’t even
provide an upper bound for alptm-~eta searching; it yields only a lower
bound on an upper bound (i.e., nothing). In order to get information relevant
to the behavior of procedure F2 on random data, we need at least an upper
bound on the behavior of procedure F1.

A correct analysis o f the binatry case (d 2 ) involves the solution of
recurrences

A.+t — An + B.¢°)’

Bi,+) ,= Aa + ~tBCt÷ t) for k ~ 0, 03)

A o = °) = Boo ” = ]3I, 2) = – ” = l ,

where the Pt are appropriate probabilities. Fo r example, Po = ~; PoP. is the
probability that (32) holds;andpoptp2 is the probability that fifteen indepen-
dent random variables satisfy

As>f13Aft,t,
f13 A f i* < (f , Afro) V (f i t Aft2), (34) (fg^ fto)V ( f i t Af t2 )>( ( f l ^ f , ) V (f3 ^ f ,)) A ((fs A f¢) v (f’n A fs)),
Artificial Intelligence 6 (1975), 293-326

318 D . E . KNUTH AND g. W. MOORE

writing v for max and ^ for min. These probabilities can be computed
exactly by evaluating appropriate integrals, bug the formulas are complicated
and it is easier to look for upper bounds. We can at least show easily that the
probability in (34) is g} , since the fLrSt and third conditions are independent,
and they each hold with probability ~. Thus we obtain an upper bound if we
set Po – -P , ffi P4 ffi . – – ffi ~ and Pt = Pa = . – . — 1; this is equivalent to

A o – B o = 1~
= An + en,

en+~ = A. + }A,.

the recurrence

(35)

Similarly in the case of degree 3, we obtain an upper bound on the average
number of nodes examined without deep cutoffs by solving the recurrence

A o – B o f f i C o – – – 1 ,

A.+I = An + Bn + C., (36)
S.+~ = A . + ~}A. + ~jAn.
C.+, = A. + ~2~A. + z~oAn,

in place of (18). This is equivalent to

A.+I — A. + (I + ¼ + t + 1 + ~ + ~o)An-t
and for general degree d we get the recurrence

An+l ffi An + $~An-l, (37)

where Ao — 1, Al — d, and

Sd ~” 2~l~d PlJ” (38)

This gives a valid upper bound on the behavior of procedure F1, because it is
equivalent to setting bound,– + ooat certain positions (and this operation
never decreases the number of positions examined). Furthermore we can
solve (37) explicitly, to obtain an asymptotic upper bound on T(d, h) ofthe
form ct(d)r~(d) s, wherethe growth ratio is

rl(d) ffi ~/(Sd + ¼) + ½. (39)

Unfortunately it turns out that Sa is o f order d2/log d, by Theorem 3; so
(39) isof order d/~/logd, while an upper boundofbrder d/log dis desired.

Another way to get an upper bound reties On amore detailed analysis of
the structural behaviorof procedure F1, as in the following theorem.

Tt~om~ 4. The expocted number of tertninal positions examined by :he
alpha-beta procedure without deep cutoffs, in a random uniform gamet ree of
degree d and height h, satisfies

T(d, h) < c*(d)r*(d) h, (40) where r*(d) is the largest eigenvalue -of the matrix Artificial Intellt'gence 6(1975), 293-326 ANANALYSIS OF ALPHA-BETA PRUNING g~ /Pdl ~ P d 2 and c*(d) is an appropriate constant. (The p~ in (41) are th~ same as in (25),) O a ~ ~/P2a ~Pdd 319 (40 Proof. Assign coordinates at • . . a~ to the positions of the tree as in Section 6. For 1 1> 1, it is easy to prove by induction that position a t . . . a~ has
bound = min{F(a, . . . a:_lk) i I ~< k < al} when it is examined by procedure F ! ; hence it is examined if and only if at . . • a H is examined and - min F(al . . .a~_lk) < rain F(al . . .a~-2k) or I = 1, (42) 1 <~k 2 holds with
probability p~j, where i = at- i and ] = a~, because of definition (26); hence if
the P~ were independent we would have a ~ . . . ah examined with probability
P,~a3 polo3 • . . Po,-za,, and this is precisely equivalent to the analysis leading
to (24). However, the P~ aren’t independent, as we have observed in (32) and
(34).

Condition P~ is a function of the terminal values

f ( a x , . , a l -2jk a~+l . . . as),
where j < a H o r j = al- l and k < az. Hence Pi is independent of PI, P 2 , . . . , Pi-2. (This generalizes an observation we made about (34).) Let x be the probability that position a l . . . ah is examine/t, and assume for convenience in notation that h is odd. Then by the partial independence of the P{s, we have - X < Pala~ [ ' ~ 4 " " " PaJ,.2ah. t" X < p ,~ , ~a4as" Pah',oh; hence x < Jpol.: p.2o3 .Poh_, and the theorem follows by choo~.~ing c*(d) large enough. ~.T~.w.w..we are ready to establish the correct asymptotic growth rate of the branchmg facLoz for procedure F l. T,~ORm 5. The expected number T(d, h) of terminal positions examined by the alpha-beta procedure without deep cutoffs, in a random uniform game tree of degree d and height h, has a branc,~ing factor lira T(d, h) l/h - r(d) (43) Artificial Intelligence 6 (1975), 293-326 320 D.e. KlqUa~ Am3 x. w. MOORe which satisfies • d d ~ cS~ .o~ , ~< r(d~ <~ C, log. ~. . . (44) for certain positive constants Cs and G . Proof We have T(d. hi + h2) <~ T(d, hi) T(d, h2); ' (45) since the right-hand side of (45) is the number of positions that would be examined by FI if bound were set to + oo for all positions at height hi. Furthermore the arguments above prove that • lim inf T(d, h) >~ ro(d), lim sup T(d, h) <~ rt(d), r*(d). h'* ~ h-* eo By a standard argument about subadditive functions (see, e.g,, [20, P¢oblem 1.98]) it follows that the limit (43) exists. To prove the lower bound in (44) weshalt show that ro(d) >i C3 d]log d.
The largest eigenvalue of a matrix with positive entriesp~l is known to be
>~min~(~jp~j), according to the theory of Perron [19]; see [26, Section 2.1]
for a modem account of this theory, 4 Therefore by Lemma 1,

• ( Z i ‘ u 1’~’~ ” r e (d )~ C rain – ‘ )
l–z

C m i n ( ‘, . I i .~
• 2 ~ , . ~ , k l – i j .

1 d – t d – 1
– C l . d . ~ / , z > C -J’~d-‘

where C = 0.885603 = info~=~t F(1 + x), since d “ila ffi exp(- lnd/d) >
I – in did. . . . . • ….. – ,
To get the upper bound in (44), we shall prove that r*(d) < C4 d/log d, using a rather curious matrix norm. If s and t are positive real numbers with 1 I - + = i , (46) then all eigenvalues ~ of a matrix A with entries :a~ l satisfy To prove this, let 24x =' ~ , where is a non~ero vector; b y H~lder's in- equality [9, Section 2.7], pt(E i" = ( E / X a,sxjl'~ l~" x.t , I.-,.~,~ I I r " ' a* * I t ~ 1 t#t • We are indebted to Dr-L H. Wilkinson for suggesting this proof of the lower bound. Arafldal Intetllger.ee 6 (197b':), 293-326 AN ANALYSIS OF ALPHA-BETA PRUNING 321 - - ( ~ ( ' ~ la~jl)s/')I/*/~ lX~il) t/s and (47) follows. If we let s - t = 2, inequality (47) yields r*(d) = O(d/x/log d), while if s or t ~ oo the upper bound is merely O(d). Therefore some care is necessary in selecting the best s a n d t; for our purposes we choose s = f (d) and t = f(d)/(f(d) -1) , wheref(d)= ½ In d/ln In d. Then \ 1 -,~ i~ .d\ l -,Q'~d < (v/d d'/''F ( d - / d ) G ~ ' v/d-'('i-')/2d)s") '/'" (48) The inner sum is 9(d) = l/(1 - d -f/td) = (4d/in d)(l + O(ln In d/In d)), so dg(d) "/* = d s(a)-x/2 exp(½ I n 4 In d/In l n d + In In d + O(1)). Hence the right-hand side o f (48) is exp(ln d - In In d + In 4 + O((ln In d)2/ln d)); we have proved that r*(d) ~< (4d/In d)(l + O((ln in d)Z/ln d)) as d --~ ~ . 8 9 10 11 12 13 14 15 16 t l l TASx~ 1. Bounds for the branching factor in a random tree when no deep cutoffs are performed i l l i d ro(d) 2 1.847 3 2.534 4 3.142 5 3.701 6 4.226 7 4.724 5.203 5.664 6.!12 6.547 6.972 7.388 7.795 8,195 8:589 t I i i i rl(d) r*(d) 1.884 ! .912 2.666 2.722 3.397 3.473 4.095 4.186 4.767 4.871 5.421 5.532 6.059 6.176 6.684 6.805 7.298 7.420 7.902 8.024 8.498 87618 9.086 9'203 9.668 9.781 10.243 10'350 10.813 10.913 i i i I d 17 18 19 2O 21 22 23 24 23 26 27 28 29 30 31 i | i ro(d) 8.976 9.358 9.734 10.106 10.473 10.836 11.194 11.550 11.901 12.250 12.595 12.937 13.277 13.614 13.948 I l l l . I I I I u l , • i , L i i rl(tO r*(d) 11 378 11.470 11.938 12.021 12.494 12.567 13.045 13.108 13.593 13.644 14d37 14.176 14.678 14.704 15.215 15.228 15.750 ".5.748 16.282 16.265 16.811 16.778 17.337 17., ]8 ! 7,861 17.796 18.383 18.300 18.903 18.802 i i i i i Table 1 shows the various bounds we have obtained on r(d), namely the lower bound to(d) and the upper bounds rl(d) and r*(d). We have proved that to(d) and r*(d) grow as d/log d, and that rl(d) grows ~ as d/v/log d; but the table shows that rl(d) is actually a better bound for d ~ 24. Artificial Intelligence 6 (1975L 293-326 24 322 D. E. KNUTH AND R. W. MOORE 8. Discussion of the Model The theoretical model we have studied gives us an upper bound on the actual behavior obtained in practice. It is an upper bound for four separate reasons: (a) the deep cutoffs are not considered; (5) the ordering of successor positions is random; (c) the terminal positions are assumed to have distinct values; (d) the terminal values are assumed to be independent of each other. Each of these conditions makes our model pessimistic; for example, it is usually possible in practice to make plausible guesses that some moves will be better than others. Furthermore, the large number of equal terminal values in typical games helps to provide additional cutoifs. The effect of assumption (d) is less clear, and it will be studied in Section 9. In spite of all these pessimistic assumptions, the results of our calculations show that alpha-beta pruning will be reasonably efficient. Let us now try to estimate the effect of deep cutoffs vs no deep cutoffs. One way to study this is in terms of the best case: Under 'ideal ordering of Successor positions, what is the analogue for procedure F1 of the theory developed in Section 67 It is not difficult to see that the positions ai . . • at examined by FI in the best case are precisely those with no two non-l's in a row, i.e., those for which a~ > 1 implies a~+t = 1.

In the ternary case under best ordering, we obtain the recurrence

Ao = Bo = Co = 1,

An+l – An + Bn + Cn, (49)
B.+I = A.,

C,+t = An,
hence An÷i – A, -I- 2A._i. For general d the corresponding recurrence is

A o = 1, A1 – d, An+ 2 = An+t + ( d – 1)An. (50)
The solution to this recurrence is

1
An – x/(4d 3) ((x/(d – ¼) ~+ ½).+2 _ (_x/(d -i) + ½)”+2); (51)

so the growth rate or effective branching factor is x/(d – I) + ½, not much
higher than the value x/d obtained for the full method includingdeep cutoffs.
This result tends to support the rcontention that deep cutoffs have only a
second, order effect, although we must admit that poor ordering of successor
moves will make deep cutoffs increasingly valuable.

9. Dependent Terminal Values
Our model gives independent values to all the terminal positions, but such
independence doesEt happen very often i n real games. For example, i f f (p)
Artificial Intelligenee: 6 (L975), 293-,326

AN ANALYSIS OF ALPHA=BETA PRUNING 32:3

is based on the piece count in a chess game, all the positions following a
blunder will tend to have low scores for the player who loses his men.

In this section we shall try to account for such dependencies by considering
a total dependency model, which ha~ the following property for all non-
terminal positions p: For each i and A all of the terminal successors of p~
either have greater value than all terminal successors of p j, or they all have
lesser value, This model is equivalent to assigning a permutation of {0, 1 , . . . ,
d – 1 } to the moves at every position, and then l~sing the concatenation of all
move numbers leading to a terminal position as that position’s value,
considered as a radix-d number. For example, Fig. 8 shows a uniform
ternary game tree of height 3 constructed in this way.

. / 0 , \ / 1 ~ o , _/ ~. ,~)_

201 .021 210 201 102. 201 102. 021. 120.
/ ~ I I \ I \ I . ~ \ 102 1(~0101/ [ ~122121 120/ I ~021 0~0022/ I ~20’i 2~00~202/ ~ zL~1222220

110 112 111 .012 010 011 002 O0Q 001 210 212 211
Fz(~. 8. A tree with “totally dependent” values,

Another way to look at this model is to imagine assigning the values
0, 1 , . . . , d ~ – ! in d-ary notation to the terminal positions, and then to apply
a random permutation to the branches emanating from every nonterminal
position. It follows that the F value at the root of a ternary tree is always
– ( 0 2 0 2 . . . 20)s i fh is odd, + ( 2 0 2 0 . . . 20)s i fh is even.

THEOREM 6. The expected number o f terminal positions examined by the
alpha-beta procedure, in a random totally dependent uniform game tree of
degree d and height h, is

d -//~,~,~rht21 dt~12j Hi+ — ~ , . – + Ha i _ Hi ) + H i , (52)

where Ha = 1 + ½ + . . . + l/d.

Proof. As in our other proofs, we divide the positions of the tree into a
finite number of classes or types for which recurrence relations can be given.
In this case we use three types, somewhat as in our proof of Theorems I and 2.

A type 1 position p is examined by calling F2(p, alpha, beta) where all
terminal descendants q of p have alpha < +f(q) < beta; here the + or - sign is used according as p is an even or an odd number of kvcls from the bottom of the tree. I f p is nonterminal, its successors ere assigned a definite ranking; let us say that p~ is relevant ifF(pi) < F(pj) for all 1 ~. j < i. Then Artificial Intelligence 6 (1975), 293-326 324 D.E. KNUTH.AND R. W. MOORE all of the relevant successors ofp axe examhted by calling F2(p~, -beta, - m ) where F(pi) lies be tween-be ta and - m , hence the relevant p~ are again of type 1. The irrelevant p~ are examined by calling F2(pa, -beta, -m) where F(pi) > – m , and we shall call them type 2.

A type 2 position p is examined by calling F2(p, alpha, beta) where all
terminal descendants q of p have -If(q) > beta. I f p is nonterminal, its first
successor Pt is classified as type 3, and it is examined by calling F2(pl, -beta,
-alpha). This procedure call eventually returns a Value ~< -beta, causing an immediate cutoff. A type 3 position p is examined by calling F2(p, alpha, betd) where all terminal descendants q of p have- l - f (q)< alpha. If p i s nonterminal, all its successors are classified type 2, and they ar~ examined by calling F2(pi, -beta, -alpha); they aH return values >~ -alpha.

Let An, Bn, C, be the expected number of terminal positions examined in a
random totally dependent uniform tree of degree d and height h, when the
root is of type 1, 2, or 3 respectively. The above argument shows that the
following recurrence relations hold:

A o – Bo = Co = 1,
A.+I = An + (½A. + ½On) + ( An + en) + . . .

+((I/d)An + ( (a” l)td)B,)
-//dAn+ ( d – Hd)e,, (53)

B.+I
Cn+l = dB~.

Consequently B, — d tn/2j, and As has the value stated in (52).

COROLLARY 4. When d >1 3, the average number of positions examined by
alpha.beta search under the assumption of totallydependent terminal values is
bounded by a constantS times the optimum number of positions specified in
Corollary 3.

Proof. The grO~:dl of (52) as h –, oo is order d s/2. The stated constant is
approxitaately

( d – Hd)(l + t td)/2(d. 1t2).

(Whenld= 2 the growth rate of (52) is order (~)~ instead o f ~/2s,)

Incidentally, we can also analyzeprocedureFlunder ~the same assumptions;
the restriction Of deep cutoffs leads to the recurrence

Ao — 1, AI – 1, A~+2 –HdA~+I + (d – Hd)A,, (54)

and the corresponding growth rate is of order (~/(d – Ha + ¼H 2) + ½Hd) h.
So againthe branching:factor is approximately ~/d for large d.

sThis “constant” depends on the degree d, but not onthe height ,~,
Artillcial haell~ence 6 0975), 293-.326

AN ANALYSIS OF ALPHA-BETA P~UNING 325

The authors of [7] have suggested another model to account for depen-
dencies between positions: Each branch O.e., each arc) of the uniform game
tree is assigned a random number b:tween 0 and 1, and the values of terminal
positions are taken to be the sums of all values on the branches above. If we
apply the naive approach of Section 7 ‘!o the analysis of this model without
deep cutoffs, the probabdity needed in place of eq. (26) is the probability that

max (Xk + min(Y~l , . . . , Ykd)) < X~ + rain Yl~, (55) l<~k

Related Posts