Work presented at EvoAPPs 2020, included in the Special Session "Soft Computing applied to Games".
Awarded the *** BEST PAPER AWARD *** of the Conference. :D
The abstract is:
General Videogame Playing is one of the hottest topics in the research field of AI in videogames. It aims at the implementation of algorithms or autonomous agents able to play a set of unknown games efficiently, receiving just the set of rules to play in real time. Thus, this work presents the implementation of eight approaches based on the main techniques applied in the literature to face this problem, including two different hybrid implementations combining Monte Carlo Tree Search and Genetic Algorithms. They have been created within the General Video Game Artificial Intelligence (GVGAI) Competition platform. Then, the algorithms have been tested in a set of 20 games from that competition, analyzing their performance. According to the obtained results, we can conclude that the proposed hybrid approaches are the best, and they would be a very competitive entry for the competition.
4. Autonomous agents (bots):
▪ Usual approach
- Creating an agent specialized in playing a specific game
- It is normally trained to improve its performance on that game
UnrealBot in Unreal game
A.M. Mora et al.: Evolving Bot AI in Unreal™. EvoApplications (1) 2010: 171-180
5. Autonomous agents (bots):
▪ GGP approach
- Creating an agent able to play several (previously unknown) games efficiently
- Source code cannot be changed and the agent is not trained in advance
- It only knows the game rules
6. GGP aims to implement a human-like agent/bot, i.e. one able to learn (or adapt its behaviour) to play different games.
8. Based on a framework which uses:
▪ (Video) Game Description Language
- Proposed by Stanford University (GGP project)
- Representation of games
- Each game is composed of a set of rules
- Tree structure
- Human readable
9. Files to define a game:
▪ Game description
- Types of sprites (objects)
- Types of elements in the map
- Types of interactions
- Winning/losing rules
Sokoban game
10. Files to define a game:
▪ Level definition
- 2D matrix of symbols
- Positions of elements

Symbol legend (Zelda game):
w → wall, A → avatar, + → key, G → gate, 1 → enemy
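A level definition like the one above can be read straightforwardly. The sketch below is hypothetical (not the GVGAI framework's API): it parses a VGDL-style level string, using the Zelda symbol legend from this slide, into element positions.

```python
# Hypothetical sketch, not GVGAI code: map each symbol in a 2D level
# matrix to the grid positions where that element appears.
LEGEND = {"w": "wall", "A": "avatar", "+": "key", "G": "gate", "1": "enemy"}

def parse_level(level_text):
    """Return {element_type: [(row, col), ...]} from a level string."""
    positions = {}
    for row, line in enumerate(level_text.splitlines()):
        for col, symbol in enumerate(line):
            if symbol in LEGEND:
                positions.setdefault(LEGEND[symbol], []).append((row, col))
    return positions

level = "wwww\nwA+w\nw1Gw\nwwww"
print(parse_level(level)["avatar"])  # [(1, 1)]
```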
11. http://www.gvgai.net/
▪ Unknown games
- Game description files are provided to the agent at runtime
- It must interpret them and make its decisions
▪ Objectives
- Spend at most 40 ms per action decision
- Play until 2000 game ticks or a win/loss
- Reach the highest score in every level of every game
13. Very successful method in this domain
▪ Tree exploration technique which analyses the most promising nodes
- Selection: applied recursively until a leaf is reached
- Expansion: one or more nodes are created
- Simulation: one random simulated game is played
- Backpropagation: the result of the game is backpropagated
- Repeat N times
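The four steps can be sketched on a toy game (add 1 or 2 per move; reaching exactly 10 wins, overshooting loses). This is an illustrative stand-in, not the GVGAI agents' code; selection here is greedy on the mean score, in the spirit of the Greedy agent described later.

```python
import random

ACTIONS = (1, 2)
TARGET = 10

class Node:
    def __init__(self, state, parent=None):
        self.state, self.parent = state, parent
        self.children = {}            # action -> child Node
        self.visits, self.value = 0, 0.0

def select(node):
    # Selection: descend while the node is fully expanded and non-terminal.
    while node.state < TARGET and len(node.children) == len(ACTIONS):
        node = max(node.children.values(),
                   key=lambda c: c.value / c.visits)  # greedy on mean score
    return node

def expand(node):
    # Expansion: create one child for a not-yet-tried action.
    untried = [a for a in ACTIONS if a not in node.children]
    if node.state >= TARGET or not untried:
        return node
    action = random.choice(untried)
    node.children[action] = Node(node.state + action, parent=node)
    return node.children[action]

def simulate(state):
    # Simulation: play random moves until the game ends.
    while state < TARGET:
        state += random.choice(ACTIONS)
    return 1.0 if state == TARGET else -1.0

def backpropagate(node, result):
    # Backpropagation: push the playout result up to the root.
    while node is not None:
        node.visits += 1
        node.value += result
        node = node.parent

def mcts(root_state, iterations=200):
    root = Node(root_state)
    for _ in range(iterations):       # "Repeat N times"
        leaf = expand(select(root))
        backpropagate(leaf, simulate(leaf.state))
    # Recommend the most-visited action from the root.
    return max(root.children, key=lambda a: root.children[a].visits)

print(mcts(9))  # from 9, only +1 reaches 10 exactly -> 1
```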
14. Genetic Algorithms have been considered
▪ Algorithm inspired by the natural evolution of species
- Individuals: solutions (vectors)
- Evaluation: fitness function (objective)
- Selection: parents to be combined, based on the fitness value
- Crossover: recombination of parents; generates offspring
- Mutation: random change in individuals
- Replacement: which individuals survive
Repeat N times
16. ▪ Simple
- (Agent) Random: selects a random action from the available ones
- (Agent) One Step Ahead: evaluates all available actions in the present state and selects the one with the best result
▪ MCTS
(A node in the tree is a state of the game – children of a node are the states resulting from applying an action – simulation runs until a certain depth – the score for terminal nodes is a large positive number for a win and a large negative number for a loss)
- (Agent) Greedy: the score of a node is its cumulative score; the best one is chosen in the selection stage
- (Agent) UCT: the selection and propagation methods have been changed to use Upper Confidence bounds for Trees
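The UCT selection rule can be sketched as the usual UCB1-for-trees formula, balancing exploitation against exploration. The constant `c` below defaults to the textbook √2; the value actually used is on the parameter slide.

```python
import math

# Sketch of the UCT selection value for one child of a tree node.
def uct_value(child_value, child_visits, parent_visits, c=math.sqrt(2)):
    if child_visits == 0:
        return float("inf")          # unvisited children are tried first
    exploitation = child_value / child_visits
    exploration = c * math.sqrt(math.log(parent_visits) / child_visits)
    return exploitation + exploration

# With parent_visits == 1 the exploration term vanishes (log 1 == 0):
print(uct_value(3.0, 3, 1))  # 1.0
```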
17. ▪ Genetic Algorithms
(An individual is a sequence of actions up to a given depth from a state – initialized with random actions – fitness is the score obtained after simulating the actions – binary tournament selection – uniform crossover – mutation changes an action)
- (Agent) Depth 7 (AG_D7): individuals are sequences of actions up to depth 7
- (Agent) Depth 10 (AG_D10): individuals are sequences of actions up to depth 10
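A minimal sketch of this representation, assuming a toy stand-in for the GVGAI forward model (here, action 3 simply scores one point per step). The replacement scheme is simplified to drop-the-worst; the actual agents replace the tournament loser (or, with probability REP_PROBABILITY, the winner).

```python
import random

ACTIONS = [0, 1, 2, 3]          # toy action set (e.g. up/down/left/right)
SIMULATION_DEPTH = 7            # AG_D7; the AG_D10 agent uses 10

def fitness(individual):
    # Toy forward model: pretend action 3 scores one point per step.
    return sum(1 for a in individual if a == 3)

def binary_tournament(population):
    a, b = random.sample(population, 2)
    return a if fitness(a) >= fitness(b) else b

def uniform_crossover(p1, p2):
    # Each gene comes from either parent with equal probability.
    return [random.choice(pair) for pair in zip(p1, p2)]

def mutate(individual, prob):
    # With prob = 1 / SIMULATION_DEPTH, one action changes on average.
    return [random.choice(ACTIONS) if random.random() < prob else a
            for a in individual]

random.seed(0)
population = [[random.choice(ACTIONS) for _ in range(SIMULATION_DEPTH)]
              for _ in range(5)]                  # POPULATION_ACTION = 5
start_best = max(map(fitness, population))
for _ in range(30):
    child = mutate(uniform_crossover(binary_tournament(population),
                                     binary_tournament(population)),
                   1 / SIMULATION_DEPTH)
    population.remove(min(population, key=fitness))  # simplified replacement
    population.append(child)
best = max(population, key=fitness)
```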
[Figure: GA population of N individuals, each a sequence of D actions, with crossover and mutation operators]
18. ▪ Hybrid (MCTS + GA)
- Collaborative (SEQ): the algorithms are run alternately (sequentially), one per game step
- Competitive (PAR): both algorithms are run for the same game step (each having half the time for the decision); the best result is chosen
(Agent) MCTS_AG_SEQ
(Agent) MCTS_AG_PAR
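The two schemes can be sketched as follows, assuming each base algorithm exposes a `decide(state, budget_ms) -> (action, expected_score)` interface. The two `decide` functions below are toy placeholders, not the paper's MCTS and GA implementations.

```python
# Toy placeholders standing in for the real MCTS and GA decision routines.
def mcts_decide(state, budget_ms):
    return "action_from_mcts", 5.0

def ga_decide(state, budget_ms):
    return "action_from_ga", 7.0

def hybrid_seq(state, step, budget_ms=40):
    # Collaborative (SEQ): alternate the algorithms between game steps,
    # each getting the full 40 ms budget on its turn.
    decide = mcts_decide if step % 2 == 0 else ga_decide
    action, _ = decide(state, budget_ms)
    return action

def hybrid_par(state, budget_ms=40):
    # Competitive (PAR): run both in the same game step with half the
    # budget each, then keep the action with the better expected score.
    candidates = [mcts_decide(state, budget_ms / 2),
                  ga_decide(state, budget_ms / 2)]
    return max(candidates, key=lambda c: c[1])[0]

print(hybrid_seq(None, 0), hybrid_par(None))
# action_from_mcts action_from_ga
```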
20. Parameter configuration:

MCTS:
- Scanning depth: 10
- Balancing constant in UCT: 2

GA:
- FITNESS_WEIGHT (depth weighting constant): 0.90
- SIMULATION_DEPTH (number of actions per individual): 7 or 10
- POPULATION_ACTION (size of the population for every action): 5
- REP_PROBABILITY (probability of replacing the winner of a tournament by the offspring instead of the loser): 0.1
- MUT_PROB (mutation probability): 1 / SIMULATION_DEPTH
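The same parameters expressed as a configuration sketch (names follow the slide; this is not the actual competition entry's configuration file):

```python
# Parameter values from the table above, as plain Python constants.
MCTS_CONFIG = {
    "scanning_depth": 10,
    "uct_balancing_constant": 2,       # value as listed on the slide
}

SIMULATION_DEPTH = 7                   # 7 for AG_D7, 10 for AG_D10
GA_CONFIG = {
    "fitness_weight": 0.90,            # depth weighting constant
    "simulation_depth": SIMULATION_DEPTH,
    "population_action": 5,            # population size for every action
    "rep_probability": 0.1,            # replace-the-winner probability
    "mut_prob": 1 / SIMULATION_DEPTH,  # mutation probability, 1/7 ≈ 0.1429
}
```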
21. Simulation:
- 8 agents
- 10 games (CIG15 GVGAI Competition validation set)
- 40 ms per decision (running time for each algorithm)
- 10 matches per game
- GAMES:
(G1) Camel Race, (G2) Dig Dug, (G3) Firestorms,
(G4) Infection, (G5) Firecaster, (G6) Overload,
(G7) Pac-Man, (G8) Seaquest,
(G9) Whack A Mole, (G10) Eggomania
22. Results (average score):
- Hybrid approaches are the best
- MCTS_AG_PAR is the most robust in almost all the games
- AG agents also perform very well (better with smaller depth)
[Table: average score per game; best and second-best scores highlighted]
23. Ranking distribution (10 matches x 10 games):
- Hybrid approaches are the best
- MCTS_AG_PAR is the most robust in almost all the games
- AG agents also perform very well (better with higher depth)
24. Statistical tests (MCTS_AG_PAR as control method):
- No statistical difference between the best methods (p-value > 0.05)

i  Algorithm       z=(R0-Ri)/SE  p-value   Holm/Hochberg/Hommel  Holland   Rom       Finner    Li
7  Random          3             0.000618  0.007142              0.007300  0.007512  0.007300  0.023529
6  One Step Ahead  3.286335      0.001015  0.008333              0.008512  0.008764  0.014548  0.023529
5  MCTS GREEDY     2.327820      0.019921  0.010000              0.010206  0.010515  0.021742  0.023529
4  MCTS UCT        1.962672      0.049684  0.012500              0.012741  0.013109  0.028885  0.023529
3  AG_D10          1.962672      0.049684  0.016666              0.016952  0.016666  0.035975  0.023529
2  AG_D7           1.049801      0.293809  0.025000              0.025320  0.025000  0.043013  0.023529
1  MCTS_AG_SEQ     0.593366      0.552936  0.050000              0.050000  0.050000  0.050000  0.050000
25. Simulation:
- 8 agents
- 10 games (CIG14 GVGAI Competition validation set)
- 40 ms per decision (running time for each algorithm)
- 10 matches per game
- GAMES:
(G1) Alien, (G2) Boulder Dash, (G3) Butterflies,
(G4) Chase, (G5) Frogs, (G6) Missile Command,
(G7) Portals, (G8) Sokoban,
(G9) Survive Zombies, (G10) Zelda
26. Results (best score):
- More variability in the results
- Hybrid methods are still the best overall
- MCTS_AG_PAR seems a bit better
- AG agents still perform very well (better with smaller depth)

Best score per game (best and second-best highlighted on the original slide):

Game  Random  One Step Ahead  MCTS UCT  MCTS GREEDY  AG_D7  AG_D10  MCTS_AG_SEQ  MCTS_AG_PAR
G1    66      45              74        78           80     80      80           84
G2    3       2               22        6            7      5       26           22
G3    52      26              72        16           62     74      64           40
G4    7       1               7         3            7      7       7            7
G5    0.2     1               1         0.2          1      1       1            1
G6    0.1     0.3             2         2            8      5       2            5
G7    0       0               0         0            1      1       1            1
G8    0       0               0         0            1      0       0            1
G9    16      4               32        10           34     34      45           43
G10   7       1               4         6            8      8       8            8
29. - 8 different GGP agents have been implemented
- 20 different games from the GVGAI Competition have been used for testing and comparing the agents
- Hybrid approaches (MCTS+GA) have obtained the best results
- Results are robust; however, no single approach is good for all the games
▪ Future Work
- Try to improve the MCTS+GA agents (better operators in the GA)
- Investigate other hybrid approaches with MCTS
- Compare with other agents from the state of the art