PacMan: The Game, Areas of Intelligence, the Classical Ghost Algorithm, and Basic Neuro-Evolution






PacMan

  • PacMan

  • Basic Neuro-Evolution

    • The Model
    • The ANN
    • Offline Learning
    • Online Learning
  • Neuro-Evolution of Augmenting Topologies

    • The Model
    • The ANN
    • Training Algorithm
    • Experiments
    • Discussions
  • Conclusions

  • References



Classical PacMan released by Namco (Japan) in 1980


  • Single player predator/prey game

  • Player plays as the PacMan:

    • Navigates through a 2D maze
    • Eats pellets
    • Avoids ‘Ghosts’
  • Ghosts played by the computer

    • Usually 4 in number
    • Chase the PacMan and try to catch it
  • Winning / Losing

    • PacMan wins if it has eaten all pellets or survived for 180 sec
    • PacMan loses if caught by a Ghost
  • Special Power-pills

    • Allow PacMan to eat the Ghosts for a short period of time
    • Eaten Ghosts respawn


Programming the PacMan to replace human player


    • Heavily researched field
    • Gives insight into AI, but doesn't add value to the game engine
  • Making the Ghosts more intelligent

    • Strives to:
      • Make the Ghosts more efficient killers
      • Incorporate teamwork
      • Minimize the score of the PacMan
      • Make the Ghosts learn & adapt
      • Make the game more 'interesting'
    • Adds value to the game
    • Scarcely researched field
    • Currently our topic of interest


Two modes of play:


    • Attack: Chase down Pacman
    • Scatter: Break away when Pacman has a power-pill
  • Decide target cell on reaching intersection:

    • Attack target
      • Red Ghost: Current position of Pacman
      • Pink Ghost: Cell 4 positions ahead of Pacman
      • Blue Ghost: Cell 2 positions ahead of Pacman
      • Orange Ghost: Targets Pacman when far away, retires to its corner when nearby
    • Scatter target
      • Pseudo-random behaviour
  • The Ghost then moves so as to minimize the distance to its target cell (see the sketch below)
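
As a rough illustration of the rule above, the following Python sketch picks a target cell for each Ghost. The (x, y) grid coordinates, the helper names, and the distance threshold used for the Orange Ghost are illustrative assumptions, not values from the original game.

```python
# Sketch of classical Ghost targeting on an (x, y) grid.
# The 'nearby' threshold for the Orange Ghost is an illustrative assumption.
from typing import Tuple

Cell = Tuple[int, int]

def cells_ahead(pacman: Cell, pacman_dir: Cell, n: int) -> Cell:
    """Cell n positions ahead of PacMan along its current direction."""
    return (pacman[0] + n * pacman_dir[0], pacman[1] + n * pacman_dir[1])

def manhattan(a: Cell, b: Cell) -> int:
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def attack_target(ghost: str, ghost_pos: Cell, pacman: Cell,
                  pacman_dir: Cell, home_corner: Cell) -> Cell:
    """Target cell chosen by a Ghost in Attack mode on reaching an intersection."""
    if ghost == "red":        # chase PacMan's current position
        return pacman
    if ghost == "pink":       # aim 4 cells ahead of PacMan
        return cells_ahead(pacman, pacman_dir, 4)
    if ghost == "blue":       # aim 2 cells ahead of PacMan
        return cells_ahead(pacman, pacman_dir, 2)
    if ghost == "orange":     # chase when far away, retire to corner when nearby
        return pacman if manhattan(ghost_pos, pacman) > 8 else home_corner
    raise ValueError(f"unknown ghost: {ghost}")

# The Ghost then takes the legal move that minimizes the distance to its target.
```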



We will discuss two approaches for programming intelligence, learnability & adaptability into the Ghosts:


  • Local optimization using basic Neuro-Evolution

  • Global optimization using Neuro-Evolution of Augmenting Topologies (NEAT)



Game field modeled as a grid; each square can contain any of:


    • Wall block
    • Pellet
    • Pacman
    • Ghost(s)
  • Power-pills not incorporated to keep things simple

  • Ghosts have only local info about the grid

  • Game proceeds in cycles of 2 steps (see the code sketch at the end of this slide):

    • Gather information about environment
    • Make a move by processing the information
  • Each Ghost controlled independently by a dedicated ANN

  • ANNs trained by Evolutionary Algorithm (GA):

    • Offline (Train & Deploy)
    • Online (Learn as you go)
  • Adjusting the connection weights is the only means of training the ANNs
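
The sketch below shows one way the grid model and the two-step sense/act cycle might be organized in code. The names (CellContent, GhostController, sense, act), the 4-neighbour sensing, and the Ghost object's position/move accessors are illustrative assumptions, not details taken from the source.

```python
# Minimal sketch of the grid model and the two-step game cycle.
# Class and method names here are illustrative only.
from enum import Flag, auto

class CellContent(Flag):
    EMPTY = 0
    WALL = auto()
    PELLET = auto()
    PACMAN = auto()
    GHOST = auto()            # a cell may hold more than one Ghost

class GhostController:
    """One controller (ANN) per Ghost; it sees only local information."""

    def sense(self, grid, position):
        """Step 1: gather local information around `position`.
        Assumes the maze border is walled, so neighbours always exist."""
        x, y = position
        return [grid[y + dy][x + dx]                    # 4 neighbouring cells
                for dx, dy in ((0, -1), (1, 0), (0, 1), (-1, 0))]

    def act(self, observation):
        """Step 2: process the observation and return a move."""
        raise NotImplementedError   # provided by the evolved ANN

def game_cycle(grid, ghosts):
    """One cycle: every Ghost senses its surroundings, then makes a move."""
    for ghost, controller in ghosts:
        observation = controller.sense(grid, ghost.position)
        ghost.move(controller.act(observation))
```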



A 4-5-4 feed-forward neural controller composed of sigmoid neurons is employed to manage each Ghost's motion.

  • Inputs:

    • Using their sensors, Ghosts inspect the environment from their own point of view and decide their next action.
    • Each Ghost receives input information from its environment expressed in the neural network’s input array of dimension 4
  • Outputs

    • Four scores, one for each of the four movement directions
    • The direction with the maximum score is selected, as sketched below
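
A minimal NumPy sketch of the 4-5-4 sigmoid controller follows. The two-layer dense layout with biases is an assumption, since the source fixes only the 4-5-4 shape and the sigmoid units, and the exact encoding of the 4 inputs is left abstract here.

```python
# Sketch of the 4-5-4 feed-forward sigmoid controller.
# Dense layers with biases are an assumption; the source fixes only the shape.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class GhostANN:
    def __init__(self, w1, b1, w2, b2):
        self.w1, self.b1 = w1, b1       # layer 1: 4 inputs -> 5 hidden units
        self.w2, self.b2 = w2, b2       # layer 2: 5 hidden units -> 4 outputs

    def forward(self, inputs):
        """inputs: 4-element observation vector; returns 4 direction scores."""
        hidden = sigmoid(self.w1 @ np.asarray(inputs) + self.b1)
        return sigmoid(self.w2 @ hidden + self.b2)

    def choose_direction(self, inputs):
        """Pick the direction (e.g. 0=up, 1=right, 2=down, 3=left) with the highest score."""
        return int(np.argmax(self.forward(inputs)))
```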






An off-line evolutionary learning approach is used to produce some 'good' (in terms of performance) initial behaviors for the online learning mechanism.


  • The neural networks that determine the behavior of the Ghosts are themselves evolved.

  • The evolving process is limited to the connection weights of the neural network.

  • The evolutionary procedure is based on a genetic algorithm.

  • Each Ghost has a genome that encodes the connection weights of its neural network.

  • A population of neural networks (Ghosts) is initialized randomly with initial uniformly distributed random connection weights that lie within [-5, 5].
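
In code, this random initialization might look like the sketch below. The population size and the flat genome layout (all weights and biases of the 4-5-4 network concatenated) are illustrative assumptions.

```python
# Sketch of population initialization: each genome is a flat vector of
# connection weights drawn uniformly from [-5, 5]. The population size and
# the genome length (weights + biases of a 4-5-4 network) are assumptions.
import numpy as np

GENOME_LENGTH = 4 * 5 + 5 + 5 * 4 + 4    # 54 parameters for a 4-5-4 net
POPULATION_SIZE = 50                     # not specified in the source

def init_population(rng: np.random.Generator) -> np.ndarray:
    """Return an array of shape (POPULATION_SIZE, GENOME_LENGTH)."""
    return rng.uniform(-5.0, 5.0, size=(POPULATION_SIZE, GENOME_LENGTH))

population = init_population(np.random.default_rng(0))
```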



At each generation:


  • Step 1

  • Every Ghost in the population is cloned 4 times.

  • These 4 clones are placed in the PacMan game field and play N games, each one for an evaluation period of t simulation steps.

  • The outcome of each of these games is the time taken to kill PacMan, t_k.



Step 2


  • Each Ghost is evaluated on each game; its fitness value is the expected value E{f} over the N games, where f is a function of the kill time t_k (shorter kill times yield higher fitness).

  • The fitness function f thus promotes PacMan-killing behaviors capable of achieving high performance.



Step 3


  • A pure elitism selection method is used, where only the 10% best-fit solutions determine the members of the intermediate population and, therefore, are able to breed.

  • Step 4

  • Each parent produces an equal number of cloned offspring to replace the solutions not selected by elitism.



Step 5


  • Mutation occurs in each gene (connection weight) of each offspring’s genome with a small probability. A uniform random distribution is used to define the mutated value of the connection weight.

  • The algorithm terminates when a predetermined number of generations g is reached (e.g. g = 1000), and the best-fit Ghost's connection weights are saved.
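
Putting Steps 1 to 5 together, one generation of the offline GA could be sketched as follows. Here evaluate_ghost stands in for cloning a Ghost 4 times and averaging f over the N games, and the mutation probability and range are illustrative values rather than figures from the source.

```python
# Sketch of one generation of the offline GA (Steps 1-5).
# `evaluate_ghost(genome)` is assumed to clone the Ghost 4 times, play N games
# and return E{f}; mutation probability/range below are illustrative only.
import numpy as np

ELITE_FRACTION = 0.10         # only the 10% best-fit solutions breed
MUTATION_PROB = 0.02          # small per-gene mutation probability (assumed)

def evolve_one_generation(population, evaluate_ghost, rng):
    # Steps 1-2: evaluate every genome.
    fitness = np.array([evaluate_ghost(genome) for genome in population])

    # Step 3: pure elitism, keeping only the top 10% as parents.
    n_elite = max(1, int(ELITE_FRACTION * len(population)))
    parents = population[np.argsort(fitness)[::-1][:n_elite]]

    # Step 4: each parent clones an equal number of offspring to replace
    # the non-selected solutions.
    clones_per_parent = (len(population) - n_elite) // n_elite
    offspring = np.repeat(parents, clones_per_parent, axis=0)

    # Step 5: mutate each gene with a small probability; the mutated value
    # is drawn from a uniform distribution.
    mutate = rng.random(offspring.shape) < MUTATION_PROB
    offspring[mutate] = rng.uniform(-5.0, 5.0, size=int(mutate.sum()))

    return np.concatenate([parents, offspring]), fitness.max()
```

An outer loop would simply repeat this for g generations (e.g. g = 1000) and save the weights of the best-fit Ghost.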



Pros:


  • Computationally efficient – Minimum in-game computations

  • Can be tailor-made for specific maps

  • Cons:

  • Cannot adapt to changing maps

  • May overfit to training player's characteristics



This learning approach is based on the idea of Ghosts that learn while they are playing against Pacman.


  • In other words, Ghosts that react to any player's behavior and learn from the player's strategy, instead of being predictable and uninteresting.

  • A further objective of this approach is to keep the game's interest high for as long as it is played.



Beginning from any initial group of homogeneous offline trained (OLT) Ghosts, the OLL mechanism attempts to transform them into a group of heterogeneous Ghosts that are interesting to play against.


  • An OLT Ghost is cloned 4 times and its clones are placed in the PacMan game field to play against a selected type of PacMan player.



At each generation:


  • Step 1

  • Each Ghost is evaluated every t simulation steps while the game is played. The fitness function is:

  • f = d_1 - d_t

  • where d_i is the distance between the Ghost and PacMan at the i-th simulation step.

  • This fitness function promotes Ghosts that move towards PacMan within the evaluation period of t simulation steps (sketched below).
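
This fitness is straightforward to express in code. In the sketch below, distance() is assumed to be whatever distance measure the game uses (e.g. maze or Manhattan distance); it is not specified in the source.

```python
# Sketch of the online fitness f = d_1 - d_t: a Ghost that ends the
# evaluation period closer to PacMan than it started receives positive fitness.

def online_fitness(ghost_positions, pacman_positions, distance):
    """Positions are sampled at simulation steps 1..t; `distance` is the
    game's distance measure (an assumption, e.g. maze or Manhattan distance)."""
    d_first = distance(ghost_positions[0], pacman_positions[0])
    d_last = distance(ghost_positions[-1], pacman_positions[-1])
    return d_first - d_last
```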



Step 2


  • A pure elitism selection method is used where only the fittest solution is able to breed. The best-fit parent clones an offspring.

  • Step 3

  • Mutation occurs in each gene (connection weight) of each offspring’s genome with a probability that is inversely proportional to the entropy of the group of ghosts.



Step 4


  • The cloned offspring is briefly evaluated in offline mode, that is, by replacing the worst-fit member of the population and playing a short offline game (i.e. without visualization of the actions) of t simulation steps.

  • The fitness values of the mutated offspring and the worst-fit Ghost are compared and the better one is kept for the next generation.
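
Steps 2 to 4 of the online loop might be sketched as below. The entropy-based mutation probability is passed in as a parameter because the source does not spell out how the group entropy is computed, and evaluate_offline stands in for the short offline evaluation game.

```python
# Sketch of online-learning Steps 2-4: breed from the fittest Ghost, mutate,
# evaluate the offspring in a short offline game, and keep it only if it beats
# the current worst-fit Ghost. `mutation_prob` is expected to be set inversely
# proportional to the group's entropy; `evaluate_offline` is assumed.
import numpy as np

def online_generation(genomes, fitness, mutation_prob, evaluate_offline, rng):
    best = int(np.argmax(fitness))
    worst = int(np.argmin(fitness))

    # Step 2: pure elitism, only the fittest solution breeds (by cloning).
    offspring = genomes[best].copy()

    # Step 3: per-gene mutation with the entropy-dependent probability.
    mutate = rng.random(offspring.shape) < mutation_prob
    offspring[mutate] = rng.uniform(-5.0, 5.0, size=int(mutate.sum()))

    # Step 4: short offline evaluation in place of the worst-fit member;
    # the better of the two is kept for the next generation.
    offspring_fitness = evaluate_offline(offspring)
    if offspring_fitness > fitness[worst]:
        genomes[worst] = offspring
        fitness[worst] = offspring_fitness
    return genomes, fitness
```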



Pros:


  • Adapts easily to varying maps and player mindsets

  • Highly generalized

  • Cons:

  • Slow due to intensive computations during run-time

  • May take some time to re-train on new maps



Takes into account team work and strategic formations


  • Operates on global data

  • Has three modes of operation:

    • Chasing (pursuing Pacman)
    • Fleeing (evading Pacman when Pacman has consumed a power-pill)
    • Returning (returning back to the hideout to be restored)
  • Optimizes the team of Ghosts as a whole

  • Each Ghost controlled independently by a dedicated ANN

  • ANNs trained by Evolutionary Algorithm (GA)

  • ANN training affects:

    • Weights of the edges
    • Interconnection of the perceptrons (ANN topology)
  • Ghosts trained in real-time:

    • Training proceeds in parallel with the game
    • Adapts the Ghosts over short time slices
  • Ghosts classified according to their distance from Pacman

    • Each distance class has a dedicated ANN population which evolves genetically
    • Multiple populations aid in heterogeneous strategy development


Each ANN represents a Ghost


  • Input:

    • Current status of Ghost
    • Current status of closest Ghost
    • Current status of closest Ghost to Pacman
    • Distances to objects of interest (Pacman, Ghost, Powerpill, Pellet, Intersection, etc)
    • Distances between Pacman & objects of interest (Ghost, Powerpill, Pellet, Intersection, etc)
  • Output:

    • Score of a cell
    • Applied 4 times, once for each adjacent cell
    • Cell with the maximum score selected for making the move (see the sketch at the end of this slide)
  • Connections:

    • Minimally connected
    • Evolves through NEAT
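
Unlike the 4-5-4 controller, each NEAT-evolved network emits a single score per cell. The sketch below assumes the network exposes an activate() method returning that score; features_for_cell and the game-state accessors are placeholders for the real input encoding and game engine.

```python
# Sketch of move selection with a NEAT-evolved cell scorer: the network is
# applied once per adjacent cell and the highest-scoring cell is taken.
# `network.activate`, `features_for_cell`, and the game-state accessors are
# all assumptions standing in for the real input encoding and game engine.

def choose_move(network, ghost, game_state, features_for_cell):
    best_cell, best_score = None, float("-inf")
    for cell in game_state.adjacent_cells(ghost.position):   # up to 4 cells
        score = network.activate(features_for_cell(ghost, cell, game_state))
        if score > best_score:
            best_cell, best_score = cell, score
    return best_cell
```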


Initialize:


    • A number of random neural network populations is generated, each corresponding to a class of Ghosts grouped by their distance to Pacman.
    • The game is divided into time slices of a small number of moves; G_n represents the state of the game at the beginning of time slice n.


Algorithm:


    • Mark a Ghost for learning during the current time slice, beginning at G_n.
    • Look ahead (using models of the other Ghosts and Pacman) and, through simulated play, store the game state expected at the beginning of the next slice, eG_(n+1). This is the starting state for the NEAT simulation runs.
    • The fitness of a candidate Ghost strategy is determined by evaluating the game state expected to be reached when that strategy plays in place of the marked Ghost, eG_(n+2); that is, the end state is evaluated. Various fitness schemes are considered.
    • In parallel with the actual game, run NEAT until the actual game reaches G_(n+1).
    • The best individual from the simulations is substituted into the game, replacing the marked Ghost.
    • Repeat the process for the next Ghost in turn (a sketch of this loop follows).
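
The time-sliced loop above can be summarized in the Python-style sketch below. Here simulate (look-ahead using models of the other Ghosts and PacMan), run_neat, evaluate_state and the game/Ghost accessors are placeholders, and the loop is written sequentially even though the source runs the NEAT search in parallel with the live game.

```python
# Sketch of the time-sliced NEAT training loop. `simulate`, `run_neat`,
# `evaluate_state` and the game/Ghost accessors are placeholders; the real
# system runs the NEAT search in parallel with the live game.

def neat_time_slice_loop(game, ghosts, populations, slice_length,
                         simulate, run_neat, evaluate_state):
    n = 0
    while not game.over():
        marked = ghosts[n % len(ghosts)]       # mark one Ghost per time slice
        g_n = game.state()                     # current state G_n

        # Expected state at the start of the next slice, eG_(n+1),
        # obtained by simulated play; this seeds the NEAT runs.
        eg_next = simulate(g_n, slice_length)

        def fitness(candidate):
            # Evaluate the end state eG_(n+2) reached when the candidate
            # strategy plays in place of the marked Ghost.
            eg_end = simulate(eg_next, slice_length,
                              replace={marked: candidate})
            return evaluate_state(eg_end)

        # Run NEAT on the marked Ghost's distance-class population while the
        # real game advances one time slice (shown here sequentially).
        best = run_neat(populations[marked.distance_class], fitness)
        game.advance(slice_length)             # real game reaches G_(n+1)

        marked.controller = best               # substitute the best individual
        n += 1
```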




Improvement over Classical AI


  • Ghosts tend to form clusters, reducing effectiveness



Less efficient than Experiment 1


  • Ghosts tend to oscillate in dispersed locations



Teamwork improved


  • Ghosts committing ‘suicide’!



Kill rate significantly increased




Uses high-level global data about the state of the game


  • Reduces computational lag by looking ahead and employing parallelism

  • Encourages the system to learn short-term strategies rather than generalized long-term ones

  • Lays out basic fitness schemes, opening up the possibility of many more

  • Demonstrates complex team behaviours



PacMan serves as a good test-bed for programming intelligent agents


  • Generalized strategies are applicable to a vast class of predator/prey games

  • Programming Ghosts gives good insight into efficient team strategies




  • [1] G. N. Yannakakis and J. Hallam, "Evolving Opponents for Interesting Interactive Computer Games," in Proceedings of the 8th International Conference on the Simulation of Adaptive Behavior (SAB'04): From Animals to Animats 8, pp. 499-508, Los Angeles, CA, USA, July 13-17, 2004. The MIT Press.

  • [2] M. Wittkamp, L. Barone, and P. Hingston, "Using NEAT for Continuous Adaptation and Teamwork Formation in Pacman," in 2008 IEEE Symposium on Computational Intelligence and Games (CIG '08).

  • [3] K. O. Stanley and R. Miikkulainen, "Evolving Neural Networks through Augmenting Topologies," Evolutionary Computation, vol. 10, no. 2, 2002. The MIT Press Journals.


