




Adversarial Search


Outline

  • Optimal decisions

  • α-β pruning

  • Imperfect, real-time decisions



Games vs. search problems

  • "Unpredictable" opponent  specifying a move for every possible opponent reply

  • Time limits ⇒ unlikely to find goal, must approximate



Game tree (2-player, deterministic, turns)



Minimax

  • Perfect play for deterministic games

  • Idea: choose move to position with highest minimax value = best achievable payoff against best play

  • E.g., 2-ply game:



Minimax algorithm
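
A minimal Python sketch of the minimax computation, assuming a hypothetical game object with methods successors(state) (yielding (move, state) pairs), terminal_test(state), and utility(state); none of these names come from the slides:

    def minimax_decision(state, game):
        """Pick MAX's move leading to the successor with the highest minimax value."""
        move, _ = max(game.successors(state),
                      key=lambda ms: min_value(ms[1], game))
        return move

    def max_value(state, game):
        if game.terminal_test(state):
            return game.utility(state)
        return max(min_value(s, game) for _, s in game.successors(state))

    def min_value(state, game):
        if game.terminal_test(state):
            return game.utility(state)
        return min(max_value(s, game) for _, s in game.successors(state))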



Properties of minimax

  • Complete? Yes (if tree is finite)

  • Optimal? Yes (against an optimal opponent)

  • Time complexity? O(b^m)

  • Space complexity? O(bm) (depth-first exploration)

  • For chess, b ≈ 35, m ≈ 100 for "reasonable" games ⇒ exact solution completely infeasible



α-β pruning example




Properties of α-β

  • Pruning does not affect final result

  • Good move ordering improves effectiveness of pruning

  • With "perfect ordering," time complexity = O(bm/2)

    •  doubles depth of search
  • A simple example of the value of reasoning about which computations are relevant (a form of metareasoning)



Why is it called α-β?

  • α is the value of the best (i.e., highest-value) choice found so far at any choice point along the path for max

  • If v is worse than α, max will avoid it

    • ⇒ prune that branch
  • Define β similarly for min



The α-β algorithm
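
A sketch in the same style as the minimax code above, using the same hypothetical game interface; α holds the best value found so far for MAX along the path, β the best for MIN:

    def alphabeta_decision(state, game):
        """Like minimax_decision, but prunes branches that cannot change the answer."""
        alpha, beta = float("-inf"), float("inf")
        best_move, best_val = None, float("-inf")
        for move, s in game.successors(state):
            v = min_value(s, game, alpha, beta)
            if v > best_val:
                best_move, best_val = move, v
            alpha = max(alpha, best_val)
        return best_move

    def max_value(state, game, alpha, beta):
        if game.terminal_test(state):
            return game.utility(state)
        v = float("-inf")
        for _, s in game.successors(state):
            v = max(v, min_value(s, game, alpha, beta))
            if v >= beta:              # MIN above will never let play reach here: prune
                return v
            alpha = max(alpha, v)
        return v

    def min_value(state, game, alpha, beta):
        if game.terminal_test(state):
            return game.utility(state)
        v = float("inf")
        for _, s in game.successors(state):
            v = min(v, max_value(s, game, alpha, beta))
            if v <= alpha:             # MAX above already has a choice at least this good: prune
                return v
            beta = min(beta, v)
        return v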




Resource limits

  • Suppose we have 100 secs, explore 10^4 nodes/sec ⇒ 10^6 nodes per move (see the arithmetic check after this list)

  • Standard approach:

  • cutoff test:

    • e.g., depth limit (perhaps add quiescence search)
  • evaluation function

    • = estimated desirability of position


Evaluation functions

  • For chess, typically linear weighted sum of features

  • Eval(s) = w₁f₁(s) + w₂f₂(s) + … + wₙfₙ(s)

  • e.g., w₁ = 9 with

  • f₁(s) = (number of white queens) – (number of black queens), etc.
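
A toy sketch of such an evaluation function; the piece-counting helper and the board representation (state.pieces with .color and .kind) are assumptions for illustration, not something from the slides:

    # Material weights in pawn units (queen = 9, rook = 5, ..., pawn = 1)
    WEIGHTS = {"queen": 9, "rook": 5, "bishop": 3, "knight": 3, "pawn": 1}

    def count(state, color, piece):
        """Hypothetical helper: number of pieces of the given color and type in state."""
        return sum(1 for p in state.pieces if p.color == color and p.kind == piece)

    def eval_fn(state):
        """Eval(s) = sum of w_i * f_i(s); here each f_i is a material difference."""
        return sum(w * (count(state, "white", piece) - count(state, "black", piece))
                   for piece, w in WEIGHTS.items())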



Cutting off search

  • MinimaxCutoff is identical to MinimaxValue except (sketched below):

    • Terminal? is replaced by Cutoff?
    • Utility is replaced by Eval
  • Does it work in practice?

  • b^m = 10^6, b = 35 ⇒ m ≈ 4 (since 35^4 ≈ 1.5 × 10^6)

  • 4-ply lookahead is a hopeless chess player!

    • 4-ply ≈ human novice
    • 8-ply ≈ typical PC, human master (?????)
    • 12-ply ≈ Deep Blue, Kasparov (?????)
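
A sketch of that substitution, reusing the minimax shape from earlier with a depth counter, a hypothetical eval_fn (such as the one above), and a cutoff test in place of the terminal test:

    def cutoff_test(state, game, depth, limit):
        """Cutoff? replaces Terminal?: stop at terminal states or at the depth limit."""
        return depth >= limit or game.terminal_test(state)

    def max_value(state, game, depth, limit):
        if cutoff_test(state, game, depth, limit):
            return eval_fn(state)                      # Eval replaces Utility
        return max(min_value(s, game, depth + 1, limit)
                   for _, s in game.successors(state))

    def min_value(state, game, depth, limit):
        if cutoff_test(state, game, depth, limit):
            return eval_fn(state)
        return min(max_value(s, game, depth + 1, limit)
                   for _, s in game.successors(state))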


Deterministic games in practice

  • Checkers: Chinook ended the 40-year reign of human world champion Marion Tinsley in 1994.

    • Used a precomputed endgame database defining perfect play for all positions involving 8 or fewer pieces on the board, a total of 444 billion positions.
  • Chess: Deep Blue defeated human world champion Garry Kasparov in a six-game match in 1997.

    • Deep Blue searches 200 million positions per second, uses very sophisticated evaluation, and undisclosed methods for extending some lines of search up to 40 ply.
  • Othello: human champions refuse to compete against computers, who are too good.

  • Go: human champions refuse to compete against computers, who are too bad.

    • In go, b > 300, so most programs use pattern knowledge bases to suggest plausible moves.


Summary

  • Games are fun to work on!

  • They illustrate several important points about AI

  • perfection is unattainable ⇒ must approximate

  • good idea to think about what to think about


