62
M. Campbell et al. / Artificial Intelligence 134 (2002) 57–83
Table 1
Comparison of hardware and software searches
Software search
Hardware search
Host processor, C code
Chess chip, state machines
Explores tree near root
Explores tree near leaves
No quiescence search
Complex quiescence search
Complex recursive extensions
Mostly local extensions
Transposition table
No transposition table
Uses hardware search as
Uses on-chip static evaluation function
dynamic evaluation function
Flexible
Hardwired, limited configurability
previous research in such systems [8,13], integrating a large scale parallel search with
the selective search mechanisms in Deep Blue created a new set of challenges. The
parallel search and its interaction with the selective search is described in Section 6.
3. The chess chip
The chip used in Deep Blue is described in detail in [14]. This section will give a brief
overview of the chip. Details of the functionality that is implemented will be described in
the later sections on the hardware search (Section 5) and evaluation function (Section 7).
The chess chip divides into three parts: the move generator, the evaluation function, and
the search control. We will examine each of these in turn, followed by a brief description
of the on-chip support for external circuitry.
3.1. Move generation
The move generator in the Deep Blue chip was based
7
on the Deep Thought move
generator chip [12,13,17], which was in turn based on the move generator of the Belle
chess machine [7]. The Deep Blue chip has a number of additional functions, including
the generation of checking and check evasion moves, as well as allowing the generation of
certain kinds of attacking moves, which permits improved quiescence searching. The chip
also supports several search extensions, including singular extensions [2].
The move generator is implemented as an 8
× 8 array of combinatorial logic, which
is effectively a silicon chessboard. A hardwired finite state machine controls move
generation. The move generator, although it generates only one move at a time, implicitly
computes all the possible moves and selects one via an arbitration network. Computing all
the moves simultaneously is one way to get minimum latency while generating moves in a
reasonable order.
7
The Deep Blue move generator is actually a superset of the Deep Thought move generator.
M. Campbell et al. / Artificial Intelligence 134 (2002) 57–83
63
A reasonable move ordering, preferably as close to best-first as possible, is an important
consideration for efficient search in game trees. The chess chip uses an ordering that
has worked well in practice, first generating captures (ordered from low-valued pieces
capturing high-valued pieces to high-valued capturing low-valued), followed by non-
capture moves (ordered by centrality). After a move has been examined, a mechanism
exists for masking it out and generating the next move in sequence.
3.2. Evaluation function
The evaluation function implemented in the Deep Blue chip is composed of a “fast
evaluation” and a “slow evaluation” [7]. This is a standard technique to skip computing
an expensive full evaluation when an approximation is good enough. The fast evaluation,
which computes a score for a chess position in a single clock cycle, contains all the easily
computed major evaluation terms with high values. The most significant part of the fast
evaluation is the “piece placement” value, i.e., the sum of the basic piece values with
square-based location adjustments. Positional features that can be computed quickly, such
as “pawn can run”, are also part of the fast evaluation. The slow evaluation scans the chess
board one column at a time, computing values for chess concepts such as square control,
pins, X-rays, king safety, pawn structure, passed pawns, ray control, outposts, pawn
majority, rook on the 7th, blockade, restraint, color complex, trapped pieces, development,
and so on. The features recognized in both the slow and fast evaluation functions have
programmable weights, allowing their relative importance to be easily adjusted.
3.3. Search control
The search control portion of the chip uses a number of state machines to implement
null-window alpha-beta search. The advantage of null-window search is that it eliminates
the need for a value stack, simplifying the hardware design.
8
The disadvantage is that in
some cases it is necessary to do multiple searches, for example when an exact score is
needed.
Another limitation on the hardware search is the lack of a transposition table, which is
known to improve search efficiency significantly in many cases. The effect of this limitation
is lessened by the fact that the upper levels of the Deep Blue search are in software and
have access to a transposition table.
The search requires a move stack to keep track of moves that have been explored so
far at each level of the search tree. The move stack in Deep Blue II includes a repetition
detector, which was not included in Deep Blue I. This detector contained a 32-entry circular
buffer of the last 32 ply. Using a content-addressable memory algorithm [13], the repetition
detector maintains the numbers of pieces displaced in each of the last 32 positions with
respect to the current board position. When the number of pieces displaced equals zero, we
have a repeated position. If the number of pieces displaced equals one, the hardware can
recognize the presence of a legal move that would lead to repetition, and bound the score
8
The alpha-beta algorithm normally maintains two values, alpha and beta, on a stack.