Parametric Action Pre-Selection for MCTS in Real-Time Strategy Games
Abdessamed Ouessai, Mohammed Salem, and Antonio M. Mora
University of Mascara, Algeria
University of Granada, Spain
VI CoSECiVi-2020
Overview
→ Introduction
→ RTS Games & AI
→ Monte Carlo Tree Search
→ Parametric Action Pre-Selection
→ Experiments & Results
→ Conclusion & Future Work
Introduction
→ First game AI research domain: Classic board games
→ Evolution of board games is constrained by physics
→ Video games represent an unconstrained medium
→ Real-Time Strategy sub-genre concretized abstract board games (Warfare)
→ RTS Games are an evolution of abstract board games
→ More concrete | More challenging for humans | More complex for AI
RTS Games & AI
→ Multiplayer, zero-sum, non-deterministic game with imperfect information.
→ Top-down perspective. Recognizable mouse and keyboard-based UI.
→ General strategy: Gather, Build & Train, Confront (Resources, Structures, Units)
→ Victory condition: Destruction of the opponent's forces
RTS Games & AI
→ What does an RTS game-playing AI have to deal with?
→ Real-Time Aspect: Short decision cycles (~50/s). Simultaneous moves for different units. Durative actions (> one decision cycle)
→ Uncertainty: Non-determinism. Partial observability (opponent & environment)
→ Complexity: Large topographic environments. Exponential growth of the decision/state spaces
Approximate estimates:
                   Chess       Go           StarCraft
Branching Factor   36          180          $10^{50}$
State Space        $10^{47}$   $10^{171}$   $10^{1685}$
RTS Games & AI
→ Notable developments:
→ Scripts: Portfolio Greedy Search (Churchill et al., 2013), Puppet Search (Barriga et al., 2015)
→ Learning: Bayesian Models (Synnaeve et al., 2011), AlphaStar (Vinyals et al., 2019)
→ Planning: NaïveMCTS (Ontañón, 2013), AHTN (Ontañón and Buro, 2015), CCG (Kantharaju et al., 2018)
→ Evaluation: CNN (Stanescu et al., 2016), (Barriga et al., 2019)
→ Competitions:
→ IEEE CoG (StarCraft & µRTS), AAAI AIIDE (StarCraft), SSCAIT
→ RTS AI Testbeds:
→ ORTS – Wargus – BWAPI (SC) – SparCraft – SC2LE – ELF – DeepRTS – µRTS.
Monte Carlo Tree Search
→ An iterative, anytime, sampling-based search framework
→ Main components:
→ Tree Policy
→ Default Policy
→ Popular variant:
→ UCT (UCB1 as Tree Policy)
→ Popular application:
→ Go (AlphaGo)
→ Downside:
→ Scalability issues
[Figure: one MCTS iteration: (1) Selection and (2) Expansion, driven by the Tree Policy; (3) Simulation, driven by the Default Policy; (4) Backpropagation of the reward]
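The four phases and the UCB1 tree policy can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the toy game (three moves, each adding 1 or 2, reward 1 only when the total reaches 6), the `Node` class, and all function names are assumptions made for the example.

```python
import math
import random

# Toy single-player game (an assumption for illustration only):
# a state is (total, moves_made); three moves, each adds 1 or 2.
ACTIONS = (1, 2)
MAX_MOVES = 3

def is_terminal(state):
    return state[1] == MAX_MOVES

def step(state, action):
    return (state[0] + action, state[1] + 1)

def reward(state):
    return 1.0 if state[0] == 6 else 0.0

class Node:
    def __init__(self, state, parent=None, action=None):
        self.state = state
        self.parent = parent
        self.action = action          # action that led to this node
        self.children = []
        self.visits = 0
        self.total_reward = 0.0

def ucb1(child, parent_visits, c=math.sqrt(2)):
    """UCB1 value: mean reward plus an exploration bonus."""
    if child.visits == 0:
        return math.inf
    mean = child.total_reward / child.visits
    return mean + c * math.sqrt(math.log(parent_visits) / child.visits)

def tree_policy(node):
    """(1) Selection via UCB1, then (2) Expansion of one untried action."""
    while not is_terminal(node.state):
        tried = {ch.action for ch in node.children}
        untried = [a for a in ACTIONS if a not in tried]
        if untried:
            child = Node(step(node.state, untried[0]), parent=node, action=untried[0])
            node.children.append(child)
            return child
        node = max(node.children, key=lambda ch: ucb1(ch, node.visits))
    return node

def default_policy(state, rng):
    """(3) Simulation: uniformly random playout to a terminal state."""
    while not is_terminal(state):
        state = step(state, rng.choice(ACTIONS))
    return reward(state)

def backpropagate(node, r):
    """(4) Backpropagation: update statistics up to the root."""
    while node is not None:
        node.visits += 1
        node.total_reward += r
        node = node.parent

def uct_search(root_state, iterations, seed=0):
    rng = random.Random(seed)
    root = Node(root_state)
    for _ in range(iterations):
        leaf = tree_policy(root)
        backpropagate(leaf, default_policy(leaf.state, rng))
    # Anytime behaviour: return the most visited root action so far.
    return max(root.children, key=lambda ch: ch.visits).action
```

Calling `uct_search((0, 0), 500)` returns the root action with the most visits. In a real RTS setting the per-node action set is combinatorial, which is the scalability issue noted above.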
Monte Carlo Tree Search
→ Proposed solutions to enhance MCTS scalability:
CMAB
→ Selection phase framed as a Combinatorial Multi-Armed Bandit problem
→ NaïveMCTS is based on a CMAB formulation and a naïve assumption
→ A player action $\alpha_t = (a_1, a_2, a_3, \dots, a_n)$ assigns a unit-action $a_i$, with value $v_i$, to each unit $u_i$
→ The naïve assumption: $\sum_{i=1}^{n} v_i = V(\alpha_t)$
→ Successfully adapts MCTS to combinatorial decision spaces (ex. RTS Games)
→ Downside: The algorithm is still affected by the dimensionality of the decision space.
Abstraction
→ Search the decision space induced by expert-authored scripts instead of the original decision space
→ Downsides: (1) Sacrifices tactical performance. (2) Performance depends on scripts
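A simplified sketch of how a CMAB sampler can exploit the naïve assumption: each unit keeps its own local bandit, a global bandit tracks the player actions sampled so far, and every observed reward updates both. The class name, the ε-greedy settings, and the toy reward are assumptions for illustration; this is not the NaïveMCTS code itself.

```python
import random

class NaiveSampler:
    """Simplified naive sampling over a combinatorial action space."""

    def __init__(self, unit_actions, eps_explore=0.4, eps_local=0.3, seed=0):
        self.unit_actions = unit_actions          # one list of unit-actions per unit
        self.eps_explore = eps_explore            # chance of building a new player action
        self.eps_local = eps_local                # chance of a random pick inside a local bandit
        self.rng = random.Random(seed)
        self.local = [{} for _ in unit_actions]   # per unit: action -> [count, total reward]
        self.global_stats = {}                    # player action -> [count, total reward]

    @staticmethod
    def _mean(stat):
        return stat[1] / stat[0]

    def _pick_local(self, i):
        stats = self.local[i]
        if not stats or self.rng.random() < self.eps_local:
            return self.rng.choice(self.unit_actions[i])
        return max(stats, key=lambda a: self._mean(stats[a]))

    def best(self):
        """Player action with the highest mean reward seen so far."""
        return max(self.global_stats, key=lambda pa: self._mean(self.global_stats[pa]))

    def sample(self):
        # Explore: assemble a player action from the independent local bandits.
        if not self.global_stats or self.rng.random() < self.eps_explore:
            return tuple(self._pick_local(i) for i in range(len(self.unit_actions)))
        # Exploit: reuse the best player action seen so far.
        return self.best()

    def update(self, player_action, reward):
        # The same reward updates the global bandit and every local bandit.
        entry = self.global_stats.setdefault(player_action, [0, 0.0])
        entry[0] += 1
        entry[1] += reward
        for stats, action in zip(self.local, player_action):
            entry = stats.setdefault(action, [0, 0.0])
            entry[0] += 1
            entry[1] += reward

def toy_reward(player_action):
    """Hypothetical additive reward: one point per unit that attacks."""
    return float(sum(a == "attack" for a in player_action))
```

With three units and three unit-actions each there are 27 player actions, yet each local bandit only has to rank three options. The joint space still grows with every added unit and action, which is the dimensionality downside listed above.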
Monte Carlo Tree Search
→ Our proposition:
→ A multi-stage parametric action pre-selection scheme to control the decision space and its granularity
→ Combine abstraction with CMAB (NaïveMCTS) using small-scale parametric scripts (heuristics)
→ Define a strategy as a collection of heuristics and parameters
Parametric Action Pre-Selection
→ Expert-authored scripts usually encode a deterministic strategy using a limited portion of the decision space
→ How to generate novel strategies that can better exploit the available actions?
→ How to preserve low-level tactical performance?
→ A strategy is a combination of heuristics
[Figure: the Worker Rush strategy as a combination of the Direct offense, Harvest, and Train heuristics]
→ Heuristic: A parametric single-goal procedure for controlling a sub-group of units
→ Single unit: $h \in H : S \times U \times A^{l} \times R_h \to A^{k}$, with $k \le l$
→ $S$: States, $U$: Units, $A$: Unit-Actions, $R_h$: Parameters
→ Group of units: applied to each member
→ In expert-authored scripts, $k = 1$ and $R_h = 1$
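One way to read the heuristic signature is as a filter: given a state, a unit, that unit's $l$ legal unit-actions, and some parameters, return $k \le l$ of them. The sketch below is a hypothetical attack heuristic; the `UnitAction` type, the state layout, and the meaning given to `max_targets` are assumptions for illustration, not the authors' definitions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class UnitAction:
    kind: str        # e.g. "attack", "move", "idle"
    target: tuple    # (x, y) grid cell the action is aimed at

def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def attack_heuristic(state, unit, actions, max_targets=1):
    """h(s, u, A^l, R_h) -> A^k with k <= l.

    Keeps the attack actions aimed at the `max_targets` closest targets.
    When the unit has no attack available, all actions are kept unchanged.
    """
    unit_pos = state["positions"][unit]
    attacks = [a for a in actions if a.kind == "attack"]
    if not attacks:
        return list(actions)
    attacks.sort(key=lambda a: manhattan(unit_pos, a.target))
    return attacks[:max_targets]
```

With `max_targets=1` the heuristic behaves like a script ($k = 1$, a single fixed decision). Larger values leave several candidates for the search to choose from.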
Parametric Action Pre-Selection
→ Action Pre-Selection: Downsizing the decision space by selecting a subset of actions satisfying a certain criterion (strategy), prior to planning
→ When $k > 1$, the final decision is made by a search approach (ex. MCTS)
→ A unit partitioning 𝑑 ∈ D determines unit groups (manually or automatically)
→ Each unit group is associated with a heuristic. Heuristics’ output defines the search space
[Figure: Original Actions → Action Pre-Selection (Partitioning, Heuristics, Parameters) → Pre-Selected Actions → Planning (MCTS)]
Parametric Action Pre-Selection
→ The general algorithm:
→ Pre-selected actions are refined over successive phases
→ Parametric Action Pre-Selection: $T(s, U, A_0, x_1, \dots, x_n)$ with $x_i(A_{i-1}, d_i, H_i, \theta_i)$
→ A strategy can be expressed as: $\sigma = (d_1, \dots, d_n, H_1, \dots, H_n, \theta_1, \dots, \theta_n)$
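[Figure: the pre-selection pipeline $T$. Given the game state $s$ and units $U$, each phase $x_i$ takes the previous action set $A_{i-1}$, a partitioning $d_i$ into groups $g_1, \dots, g_{m_i}$, heuristics $H_i = \{h_1, \dots, h_{m_i}\}$, and parameters $\theta_i$, and outputs $A_i$. The final set $A_n$ is passed to Search, then Execution.]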
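The multi-phase refinement can be sketched as a fold over phases, where each phase partitions the units, applies one heuristic per group, and hands the reduced action sets to the next phase. All names below (`run_phase`, `pre_select`, the example heuristics, partitionings, and data) are hypothetical and chosen only to make the structure concrete.

```python
from math import prod

def run_phase(state, actions_by_unit, partition, heuristics, params):
    """One phase x_i(A_{i-1}, d_i, H_i, theta_i): returns the refined action sets A_i."""
    groups = partition(state, actions_by_unit)      # d_i: group name -> list of units
    refined = dict(actions_by_unit)                 # ungrouped units keep their actions
    for group, units in groups.items():
        heuristic = heuristics[group]               # H_i: one heuristic per group
        theta = params.get(group, {})               # theta_i: parameters for that heuristic
        for unit in units:
            refined[unit] = heuristic(state, unit, actions_by_unit[unit], **theta)
    return refined

def pre_select(state, initial_actions, phases):
    """T(s, U, A_0, x_1, ..., x_n): apply the phases in order."""
    actions = initial_actions
    for partition, heuristics, params in phases:
        actions = run_phase(state, actions, partition, heuristics, params)
    return actions

def space_size(actions_by_unit):
    """Number of player actions: product of per-unit action counts."""
    return prod(len(a) for a in actions_by_unit.values())

# Hypothetical heuristics and partitionings for the example.
def keep_kinds(state, unit, actions, kinds=()):
    kept = [a for a in actions if a in kinds]
    return kept or list(actions)

def cap(state, unit, actions, k=1):
    return list(actions)[:k]

def by_type(state, actions_by_unit):
    groups = {}
    for unit in actions_by_unit:
        groups.setdefault(state["type"][unit], []).append(unit)
    return groups

def everyone(state, actions_by_unit):
    return {"all": list(actions_by_unit)}

STATE = {"type": {"w1": "worker", "s1": "soldier"}}
A0 = {
    "w1": ["idle", "move", "harvest", "return", "attack"],
    "s1": ["idle", "move", "attack"],
}
PHASES = [
    (by_type,
     {"worker": keep_kinds, "soldier": keep_kinds},
     {"worker": {"kinds": ("harvest", "return", "move")},
      "soldier": {"kinds": ("attack", "move")}}),
    (everyone, {"all": cap}, {"all": {"k": 2}}),
]
```

In this toy setup `pre_select(STATE, A0, PHASES)` shrinks the decision space from 15 player actions to 4, and the search then only has to consider the surviving combinations.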
Parametric Action Pre-Selection
→ Proposed implementation: ParaMCTS
→ A 2-phase action pre-selection process using NaïveMCTS for search
→ Inspired by the macro- and micro-management task decomposition
→ 47 parameters govern the behaviour of ParaMCTS, tuned manually
→ NaïveMCTS enhancement: Inactive player-action pruning (previous study)
Phase-1 ($x_1$)
Groups       Heuristics   Parameters
Harvesters   <Harvest>    maxU, buildMode, pf, …
Offense      <Attack>     maxU, offMode, maxTargets, pf, …
Defense      <Defend>     maxU, defMode, defPerimeter, pf, …
Structures   <Train>      maxU, trainMode, …
Phase-2 ($x_2$)
Groups       Heuristics             Parameters
Front-Line   <Front-Line Tactics>   maxU, waitDuration, …
Back         <Back Tactics>         waitDuration, …
Search: NaïveMCTS
Experiments & Results
→ How can MCTS benefit from the downsized decision space?
→ Should we increase the playout duration, the maximum search depth, or both? By how much?
→ How does the performance of ParaMCTS compare to state-of-the-art agents?
→ Experimental setting:
→ Computation budget: 100 ms per game cycle, Maps: basesWorkers 8 × 8, 16 × 16, 32 × 32
→ Tested maximum search depths: {10, 15, 20, 30, 50}. Tested playout durations: {100, 150, 200, 300, 500}
Testbed: µRTS (or microRTS)
→ A lightweight, AI research-focused RTS simulator
→ Open source, written in Java by Santiago Ontañón
→ Includes a forward model and many baseline agents
→ Subject of a yearly AI competition as part of IEEE CoG
Experiments & Results
→ Experiment 1: Two 120-iteration round-robin tournaments
1) Between ParaMCTS variants with a fixed playout duration (100 cycles) and different max search depths
2) Between ParaMCTS variants with a fixed max search depth (10) and different playout durations
→ Total matches: 4800 in each map. Score = Wins + Draws / 2, normalized.
→ Results:
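The scoring rule for Experiment 1 can be written as a one-line function. The slides do not say how the score is normalized; dividing by the number of matches played is an assumption made here so that the result falls in [0, 1].

```python
def normalized_score(wins, draws, matches):
    """Score = Wins + Draws / 2, divided by matches played (assumed normalization)."""
    if matches <= 0:
        raise ValueError("matches must be positive")
    if wins < 0 or draws < 0 or wins + draws > matches:
        raise ValueError("wins and draws must be non-negative and sum to at most matches")
    return (wins + draws / 2) / matches
```

For example, an agent with 60 wins and 20 draws over 100 matches scores (60 + 10) / 100 = 0.7 under this reading.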
Experiments & Results
→ Experiment 2: Maximum search depth and playout duration combinations
→ 100 matches between each ParaMCTS(search depth, playout duration) variant and MixedBot
→ Sides switched after 50 matches. ParaMCTS implements a similar strategy to MixedBot
→ Total matches: 2500 in each map
→ Results:
Experiments & Results
→ Experiment 3: Vs. state-of-the-art.
→ 100-iteration round-robin tournament
→ Participants:
→ ParaMCTS
→ MixedBot
→ Izanagi
→ Droplet
→ NaïveMCTS*
→ NaïveMCTS
→ Total Matches: 3000 in each map
→ 11.9 to 19.1 overall margin
[Participant annotations: top-ranking agents from 2019's µRTS competition; same hyperparameters as ParaMCTS; using best hyperparameters]
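The reported match totals can be reproduced with simple counting. The reading below is an assumption, not something the slides state: each round-robin iteration plays every pair of agents twice (once per starting side), and the 4800 figure for Experiment 1 covers both of its tournaments.

```python
from math import comb

def round_robin_matches(participants, iterations):
    """Matches when every pair meets twice per iteration, once per side (assumed)."""
    return comb(participants, 2) * 2 * iterations

# Experiment 1: two tournaments, 5 variants each, 120 iterations.
exp1_total = 2 * round_robin_matches(5, 120)

# Experiment 2: 5 search depths x 5 playout durations, 100 matches each vs. MixedBot.
exp2_total = 5 * 5 * 100

# Experiment 3: 6 participants, 100 iterations.
exp3_total = round_robin_matches(6, 100)
```

Under these assumptions the three totals come out to 4800, 2500, and 3000 per map, which matches the figures on the slides.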
Conclusion & Future Work
→ Parametric action pre-selection describes a general action/state abstraction framework, applicable to any game with similar characteristics to RTS games
→ Using heuristics instead of scripts grants greater flexibility
→ A proposed implementation, ParaMCTS, significantly outperformed state-of-the-art agents, using manually tuned parameters
→ Recovered computation budget is better used for deeper search
Future Work
→ ParaMCTS parameter optimization for different objectives (maps, opponents, …)
→ Dynamic parameter adaptation through RL
→ Heuristic/partitioning discovery
→ Difficulty adjustment given adequate heuristics and parameters
Thank You
abdessamed.ouessai@univ-mascara.dz
salem@univ-mascara.dz
amorag@ugr.es

Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
 
SAMASTIPUR CALL GIRL 7857803690 LOW PRICE ESCORT SERVICE
SAMASTIPUR CALL GIRL 7857803690  LOW PRICE  ESCORT SERVICESAMASTIPUR CALL GIRL 7857803690  LOW PRICE  ESCORT SERVICE
SAMASTIPUR CALL GIRL 7857803690 LOW PRICE ESCORT SERVICE
 

CoSECiVi 2020 - Parametric Action Pre-Selection for MCTS in Real-Time Strategy Games

  • 1. Parametric Action Pre-Selection for MCTS in Real-Time Strategy Games
      Abdessamed Ouessai, Mohammed Salem, and Antonio M. Mora
      University of Mascara, Algeria; University of Granada, Spain
      VI CoSECiVi-2020
  • 2. Overview
      → Introduction
      → RTS Games & AI
      → Monte Carlo Tree Search
      → Parametric Action Pre-Selection
      → Experiments & Results
      → Conclusion & Future Work
  • 4. Introduction
      → First game AI research domain: Classic board games
      → Evolution of board games is constrained by physics
      → Video games represent an unconstrained medium
      → Real-Time Strategy sub-genre concretized abstract board games (Warfare)
      → RTS Games are an evolution of abstract board games
      → ++ Concrete | ++ Challenging for humans | ++ Complex for AI
  • 6. RTS Games & AI
      → Multiplayer, zero-sum, non-deterministic game with imperfect information.
      → Top-down perspective. Recognizable mouse and keyboard-based UI.
      [Diagram] General strategy: Gather, Build & Train, Confront (labels: Units, Structures, Resources). Victory condition: destruction of the opponent's forces.
  • 7. RTS Games & AI
      → What does an RTS game-playing AI have to deal with?
      → Real-time aspect: short decision cycles (~50/s); simultaneous moves for different units; durative actions (> one decision cycle)
      → Uncertainty: non-determinism; partial observability (opponent & environment)
      → Complexity: exponential growth of the decision/state spaces; large topographic environments
      Approximate estimates:
                            Chess     Go        StarCraft
        Branching factor    36        180       10^50
        State space         10^47     10^171    10^1685
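To make the "exponential growth of the decision space" concrete, here is a minimal sketch. It assumes every unit picks one unit-action independently, so the number of joint player-actions is the product of the per-unit action counts. The unit and action counts are illustrative and do not come from the slides.

```python
from math import prod


def joint_action_count(actions_per_unit):
    """Number of joint player-actions when each unit independently picks one action.

    actions_per_unit: iterable with the number of legal unit-actions of each unit.
    """
    return prod(actions_per_unit)


# Hypothetical example: 10 units with 5 legal unit-actions each.
print(joint_action_count([5] * 10))  # 5**10 = 9765625
```

Adding a single extra unit with 5 actions multiplies the count by 5, which is why the branching factor grows so quickly with army size.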
  • 8. RTS Games & AI
      → Notable developments:
          → Scripts: Portfolio Greedy Search (Churchill et al., 2013), Puppet Search (Barriga et al., 2015)
          → Learning: Bayesian Models (Synnaeve et al., 2011), AlphaStar (Vinyals et al., 2019)
          → Planning: NaïveMCTS (Ontañón, 2013), AHTN (Ontañón and Buro, 2015), CCG (Kantharaju et al., 2018)
          → Evaluation: CNN (Stanescu et al., 2016), (Barriga et al., 2019)
      → Competitions: IEEE CoG (StarCraft & µRTS), AAAI AIIDE (StarCraft), SSCAIT
      → RTS AI testbeds: ORTS, Wargus, BWAPI (SC), SparCraft, SC2LE, ELF, DeepRTS, µRTS
  • 10. Monte Carlo Tree Search
      → An iterative, anytime, sampling-based search framework
      → Main components: Tree Policy, Default Policy
      → Popular variant: UCT (UCB1 as Tree Policy)
      → Popular application: Go (AlphaGo)
      → Downside: Scalability issues
      [Figure] The four MCTS phases: (1) Selection, (2) Expansion, (3) Simulation, (4) Backpropagation (labels: Tree Policy, Default Policy, Reward)
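As a rough illustration of the UCT tree policy mentioned above, the sketch below scores children with the standard UCB1 formula (mean reward plus an exploration bonus). The function names, the tuple layout, and the default exploration constant are assumptions made for this example; they are not taken from the slides or from µRTS.

```python
import math


def ucb1(mean_reward, child_visits, parent_visits, c=math.sqrt(2)):
    """UCB1 score: exploitation term plus exploration bonus.

    Unvisited children get an infinite score so they are tried first.
    """
    if child_visits == 0:
        return math.inf
    return mean_reward + c * math.sqrt(math.log(parent_visits) / child_visits)


def select_child(children, parent_visits):
    """Pick the child with the highest UCB1 score.

    children: list of (name, total_reward, visits) tuples (hypothetical layout).
    """
    def score(child):
        _, total_reward, visits = child
        mean = total_reward / visits if visits else 0.0
        return ucb1(mean, visits, parent_visits)

    return max(children, key=score)[0]
```

During the selection phase, `select_child` would be applied repeatedly from the root until a node with unexpanded children is reached.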
  • 11. Monte Carlo Tree Search
      → Proposed solutions to enhance MCTS scalability:
      CMAB
          → Selection phase framed as a Combinatorial Multi-Armed Bandit problem
          → NaïveMCTS is based on a CMAB formulation and a naïve assumption
          → A player-action $\alpha_t = (a_1, a_2, a_3, \dots, a_n)$ assigns one unit-action to each of the units $u_1, u_2, u_3, \dots, u_n$, with values $v_1, v_2, v_3, \dots, v_n$
          → The naïve assumption: $V(\alpha_t) = \sum_{i=1}^{n} v_i$
          → Successfully adapts MCTS to combinatorial decision spaces (e.g. RTS games)
          → Downside: The algorithm is still affected by the dimensionality of the decision space.
      Abstraction
          → Search the decision space induced by expert-authored scripts instead of the original decision space
          → Downsides: (1) Sacrifices tactical performance. (2) Performance depends on scripts
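The naïve assumption can be sketched as follows: if the value of a player-action is approximated by a sum of per-unit values, each unit's best action can be estimated separately. This is a simplified illustration only. It credits the whole observed reward to every unit-action in the sampled player-action, and it omits the exploration policy and the global bandit that NaïveMCTS also uses. The class and method names are hypothetical.

```python
from collections import defaultdict


class LocalArmStats:
    """Per-unit running averages, one 'local bandit' per (unit, unit-action) pair."""

    def __init__(self):
        self.total = defaultdict(float)
        self.count = defaultdict(int)

    def update(self, player_action, reward):
        """player_action: dict mapping unit -> chosen unit-action."""
        for unit, unit_action in player_action.items():
            key = (unit, unit_action)
            self.total[key] += reward
            self.count[key] += 1

    def mean(self, unit, unit_action):
        key = (unit, unit_action)
        return self.total[key] / self.count[key] if self.count[key] else 0.0

    def greedy_player_action(self, legal_actions):
        """Combine each unit's best-looking action into one player-action."""
        return {
            unit: max(actions, key=lambda a: self.mean(unit, a))
            for unit, actions in legal_actions.items()
        }
```

The cost of `greedy_player_action` grows with the sum of the per-unit action counts rather than with their product, which is the practical benefit of the assumption.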
  • 12. Monte Carlo Tree Search
      → Our proposition:
          → A multi-stage parametric action pre-selection scheme to control the decision space and its granularity
          → Combine abstraction with CMAB (NaïveMCTS) using small-scale parametric scripts (heuristics)
          → Define a strategy as a collection of heuristics and parameters
  • 14. Parametric Action Pre-Selection
      → Expert-authored scripts usually encode a deterministic strategy using a limited portion of the decision space
      → How to generate novel strategies that can better exploit the available actions?
      → How to preserve low-level tactical performance?
      → A strategy is a combination of heuristics
      [Diagram] Worker Rush strategy: Direct offense heuristic, Harvest heuristic, Train heuristic
      → Heuristic: A parametric single-goal procedure for controlling a sub-group of units
      → Single unit: $h \in H : S \times U \times A^l \times R_h \to A^k$, $k \le l$
      → $S$: States, $U$: Units, $A$: Unit-Actions, $R_h$: Parameters
      → Group of units: applied to each member
      → In expert-authored scripts, $k = 1$ and $R_h = 1$
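One way to read the heuristic signature is as a function that receives a state, a unit, that unit's l candidate actions, and a parameter set, and returns k ≤ l of those actions. The sketch below is a hypothetical attack heuristic written under that reading. The state layout, the `"attack:<target>"` action strings, and the meaning given to `maxTargets` are assumptions for illustration, not the authors' implementation.

```python
from typing import Dict, List, Sequence


def attack_heuristic(state: dict, unit: str, actions: Sequence[str],
                     params: Dict[str, float]) -> List[str]:
    """Keep at most `maxTargets` attack actions, nearest targets first.

    Falls back to the first available action if the unit cannot attack,
    so the result is never empty when `actions` is non-empty.
    """
    attacks = [a for a in actions if a.startswith("attack:")]
    # Sort by distance from this unit to the target named in the action.
    attacks.sort(key=lambda a: state["distance"][unit][a.split(":")[1]])
    k = int(params.get("maxTargets", 1))
    return attacks[:k] or list(actions[:1])


# Usage with a toy state: enemy e2 is closer to u1 than enemy e1.
state = {"distance": {"u1": {"e1": 3, "e2": 1}}}
candidates = ["move:north", "attack:e1", "attack:e2"]
print(attack_heuristic(state, "u1", candidates, {"maxTargets": 1}))  # ['attack:e2']
```

With `maxTargets` set to 1 the heuristic behaves like a deterministic script (k = 1). Larger values leave several actions for the search to choose from.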
  • 15. Parametric Action Pre-Selection
      → Action Pre-Selection: Downsizing the decision space by selecting a subset of actions satisfying a certain criterion (strategy), prior to planning
      → When $k > 1$, the final decision is made by a search approach (e.g. MCTS)
      → A unit partitioning $d \in D$ determines unit groups (manually or automatically)
      → Each unit group is associated with a heuristic. The heuristics' output defines the search space
      [Diagram] Original Actions → Action Pre-Selection (Partitioning, Heuristics, Parameters) → Pre-Selected Actions → Planning (MCTS)
  • 16. Parametric Action Pre-Selection
      → The general algorithm:
      → Pre-selected actions are refined over successive phases
      → Parametric Action Pre-Selection: $T(s, U, A_0, x_1, \dots, x_n)$ with $x_i(A_{i-1}, d_i, H_i, \theta_i)$
      → A strategy can be expressed as: $\sigma = (d_1, \dots, d_n, H_1, \dots, H_n, \theta_1, \dots, \theta_n)$
      [Diagram] Given the game state $s$ and units $U$, each phase $x_i$ partitions the units with $d_i$ into groups $g_1, \dots, g_{m_i}$, applies the heuristics $H_i = (h_1, \dots, h_{m_i})$ with parameters $\theta_i$, and maps $A_{i-1}$ to $A_i$. The final set $A_n$ is passed to Search, then Execution.
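The multi-phase scheme can be sketched as a loop in which each phase partitions the units into groups and lets the group's heuristic filter each member's current action list. This is a minimal sketch under assumed data structures (dicts keyed by unit and group names). `pre_select`, `keep_first_k`, and `single_group` are hypothetical names, and the toy heuristic simply truncates the action list.

```python
def pre_select(state, units, initial_actions, phases):
    """Refine per-unit action lists over successive phases.

    initial_actions: dict unit -> list of unit-actions (the A_0 of the slide).
    phases: list of (partition, heuristics, params) triples, one per phase, where
      partition(state, units) -> dict group -> list of units,
      heuristics: dict group -> callable(state, unit, actions, theta) -> actions,
      params: dict group -> parameter dict (theta).
    Units that a partition leaves out keep their current actions.
    """
    current = {u: list(initial_actions[u]) for u in units}
    for partition, heuristics, params in phases:
        for group, members in partition(state, units).items():
            heuristic = heuristics[group]
            theta = params.get(group, {})
            for unit in members:
                current[unit] = heuristic(state, unit, current[unit], theta)
    return current


def keep_first_k(state, unit, actions, theta):
    """Toy heuristic: keep the first k candidate actions."""
    return list(actions[: theta["k"]])


def single_group(state, units):
    """Toy partitioning: every unit belongs to one group."""
    return {"all": list(units)}


phases = [
    (single_group, {"all": keep_first_k}, {"all": {"k": 2}}),
    (single_group, {"all": keep_first_k}, {"all": {"k": 1}}),
]
result = pre_select({}, ["u1", "u2"], {"u1": ["a", "b", "c"], "u2": ["x"]}, phases)
print(result)  # {'u1': ['a'], 'u2': ['x']}
```

The returned dict plays the role of the final action set handed to the search. A search such as MCTS would then only branch over the actions that survived every phase.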
  • 17. Parametric Action Pre-Selection
      → Proposed implementation: ParaMCTS
      → A 2-phase action pre-selection process using NaïveMCTS for search
      → Inspired by the macro- and micro-management task decomposition
      → 47 parameters govern the behaviour of ParaMCTS, tuned manually
      → NaïveMCTS enhancement: Inactive player-action pruning (previous study)
      Phase-1 ($x_1$):
        Groups        Heuristics    Parameters
        Harvesters    <Harvest>     maxU, buildMode, pf, …
        Offense       <Attack>      maxU, offMode, maxTargets, pf, …
        Defense       <Defend>      maxU, defMode, defPerimeter, pf, …
        Structures    <Train>       maxU, trainMode, …
      Phase-2 ($x_2$):
        Groups        Heuristics              Parameters
        Front-Line    <Front-Line Tactics>    maxU, waitDuration, …
        Back          <Back Tactics>          waitDuration, …
      [Diagram] Phase-1 ($x_1$) → Phase-2 ($x_2$) → NaïveMCTS
  • 19. Experiments & Results
      → How can MCTS benefit from the downsized decision space?
      → Should we increase the playout duration, the maximum search depth, or both? By how much?
      → How does the performance of ParaMCTS compare to state-of-the-art agents?
      → Experiment settings:
          → Computation budget: 100 ms per game cycle. Maps: basesWorkers 8 × 8, 16 × 16, 32 × 32
          → Tested maximum search depths: {10, 15, 20, 30, 50}. Tested playout durations: {100, 150, 200, 300, 500}
      → Testbed: µRTS (or microRTS)
          → A lightweight, AI research-focused RTS simulator
          → Open source, written in Java by Santiago Ontañón
          → Includes a forward model and many baseline agents
          → Subject of a yearly AI competition as part of IEEE CoG
  • 20. Experiments & Results
      → Experiment 1: Two 120-iteration round-robin tournaments
          1) Between ParaMCTS variants with a fixed playout duration (100 cycles) and different maximum search depths
          2) Between ParaMCTS variants with a fixed maximum search depth (10) and different playout durations
      → Total matches: 4800 in each map. Score = Wins + (Draws / 2), normalized.
      → Results: [figure]
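The scoring rule can be written out as a small function. The slide does not say how the score is normalized. Dividing by the number of matches played, so that the result falls in [0, 1], is an assumption made here for illustration.

```python
def normalized_score(wins, draws, matches):
    """Score = wins + draws / 2, divided by matches played (assumed normalization)."""
    if matches <= 0:
        raise ValueError("matches must be positive")
    if wins < 0 or draws < 0 or wins + draws > matches:
        raise ValueError("wins and draws must be non-negative and sum to at most matches")
    return (wins + draws / 2) / matches


# Hypothetical record: 60 wins, 20 draws, 20 losses out of 100 matches.
print(normalized_score(60, 20, 100))  # 0.7
```

Under this convention a draw is worth half a win, so an agent that draws every match scores 0.5.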
  • 21. Experiments & Results
      → Experiment 2: Maximum search depth and playout duration combinations
      → 100 matches between each ParaMCTS(search depth, playout duration) variant and MixedBot
      → Sides switched after 50 matches. ParaMCTS implements a similar strategy to MixedBot
      → Total matches: 2500 in each map
      → Results: [figure]
  • 22. Experiments & Results
      → Experiment 3: Vs. state-of-the-art
      → 100-iteration round-robin tournament
      → Participants: ParaMCTS, MixedBot, Izanagi, Droplet, NaïveMCTS*, NaïveMCTS
      → Participant annotations: Top ranking agents from 2019's µRTS competition; Same hyperparameters as ParaMCTS; Using best hyperparameters
      → Total matches: 3000 in each map
      → 11.9 to 19.1 overall margin
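The per-map match totals quoted for the three experiments can be checked with a little arithmetic. The sketch assumes that one round-robin iteration plays every ordered pair of agents once, so each pairing is played from both sides. That convention is an assumption, although it reproduces the reported totals.

```python
def round_robin_matches(n_agents, iterations):
    """Matches in a round-robin where each ordered pair meets once per iteration."""
    return n_agents * (n_agents - 1) * iterations


# Experiment 1: two tournaments, 5 ParaMCTS variants each, 120 iterations.
print(2 * round_robin_matches(5, 120))  # 4800

# Experiment 2: 5 search depths x 5 playout durations, 100 matches vs. MixedBot each.
print(5 * 5 * 100)  # 2500

# Experiment 3: 6 participants, 100 iterations.
print(round_robin_matches(6, 100))  # 3000
```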
  • 24. Conclusion & Future Work
      → Parametric action pre-selection describes a general action/state abstraction framework, applicable to any game with similar characteristics to RTS games
      → Using heuristics instead of scripts grants greater flexibility
      → A proposed implementation, ParaMCTS, significantly outperformed state-of-the-art agents, using manually tuned parameters
      → Recovered computation budget is better used for deeper search
      Future Work
      → ParaMCTS parameter optimization for different objectives (maps, opponents, …)
      → Dynamic parameter adaptation through RL
      → Heuristic/partitioning discovery
      → Difficulty adjustment given adequate heuristics and parameters