2. Outline of MctsAi
MctsAi is a sample fighting game AI for the FightingICE platform. It implements UCB applied to trees (UCT) [1], a typical Monte-Carlo Tree Search (MCTS) algorithm [2].
[1] L. Kocsis and C. Szepesvári, “Bandit Based Monte-Carlo Planning”
[2] R. Coulom, “Efficient Selectivity and Backup Operators in Monte-Carlo Tree Search”
3. UCT
Repeat Selection → Expansion → Playout → Backpropagation until the predefined maximum time length or the maximum number of playouts is reached.
Use the UCB1 value in Selection.
Finally, select the action associated with the child node of the root that has the maximum number of visits.
(Figure: the four UCT phases: Selection, Expansion, Playout, Backpropagation)
4. Upper Confidence Bound (UCB1) [3]
UCB1(i) = X_i + C * sqrt(2 * ln(N_i^p) / N_i)

UCB1(i): UCB1 value of node i
X_i: average evaluation value of node i
C: balancing parameter (empirically set to 3 in the sample AI)
N_i^p: number of visits to the parent node of node i
N_i: number of visits to node i
This favors selecting less-visited nodes with high evaluation values.
[3] P. Auer, N. Cesa-Bianchi, and P. Fischer, “Finite-time Analysis of the Multiarmed Bandit Problem”
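The UCB1 computation above can be sketched in Python (a sketch only; the actual MctsAi is written in Java for FightingICE):

```python
import math

def ucb1(avg_eval: float, visits: int, parent_visits: int, c: float = 3.0) -> float:
    """UCB1(i) = X_i + C * sqrt(2 * ln(N_i^p) / N_i), with C = 3 as in the sample AI."""
    if visits == 0:
        # Unvisited nodes are handled separately in MctsAi: they receive
        # a very large random initial UCB1 value instead of this formula.
        return float("inf")
    return avg_eval + c * math.sqrt(2.0 * math.log(parent_visits) / visits)
```

For example, a node with average evaluation value 0.5 and 4 visits under a parent with 17 visits gets 0.5 + 3·√(2 ln 17 / 4) ≈ 4.07.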
5. MctsAi Procedure
1. Expand all adjacent child nodes of the root node at once
2. Repeat iterations of Selection, Expansion, Playout, and Backpropagation for 16.5 ms (a value also set empirically)
3. Select an action to perform
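The three-step procedure above can be sketched as follows. The Node structure, action names, and the random playout are toy stand-ins (the real MctsAi simulates the fighting game and uses UCB1 in selection):

```python
import random
import time

class Node:
    def __init__(self, action=None, parent=None):
        self.action, self.parent = action, parent
        self.children, self.visits, self.total = [], 0, 0.0

ACTIONS = ["A", "B", "C"]  # illustrative action set

def playout(node):
    return random.random()  # stand-in for simulating the game from this node

def backpropagate(node, reward):
    while node is not None:
        node.visits += 1
        node.total += reward
        node = node.parent

def select(node):
    # Simplified selection: try unvisited children first, otherwise follow
    # the child with the best average value (the real AI uses UCB1 here).
    while node.children:
        unvisited = [c for c in node.children if c.visits == 0]
        node = random.choice(unvisited) if unvisited else max(
            node.children, key=lambda c: c.total / c.visits)
    return node

def decide(budget_ms=16.5):
    root = Node()
    root.children = [Node(a, root) for a in ACTIONS]  # 1. expand all root children at once
    deadline = time.monotonic() + budget_ms / 1000.0
    while time.monotonic() < deadline:                # 2. iterate within the time budget
        leaf = select(root)
        backpropagate(leaf, playout(leaf))
    return max(root.children, key=lambda c: c.visits).action  # 3. most-visited child
```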
6. 1 Expansion of all adjacent child nodes from the root node
Assign a very large random value to non-visited nodes as their initial UCB1 value.
(Figure: the root node with its newly expanded children; each node shows its UCB1 value, average evaluation value, and number of visits. The unvisited children have 0 visits, an undefined (NaN) average evaluation value, and a very large random UCB1 value.)
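One way to realize this initial value; the magnitude used here is an assumption (it only needs to dominate any UCB1 value a visited node can reach), not the exact constant in MctsAi:

```python
import random

def initial_ucb1() -> float:
    """Very large random UCB1 value for an unvisited node, so that all
    unvisited siblings are tried (in random order) before any node is
    revisited. The 1e9 scale is an assumption, not MctsAi's constant."""
    return 1e9 * (1.0 + random.random())
```

Because every unvisited child gets a different huge value, Selection effectively visits the unvisited children one by one in a random order.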
7. 2.1 Selection
Select the node with the highest UCB1 value at each level, all the way down to a leaf node.
(Figure: two selection examples. With visited children: from a root with 17 visits, children with average evaluation values 0.3, 2.5, and 0.5 and visit counts 3, 10, and 4 have UCB1 values of about 4.42, 4.76, and 4.07, so the child with UCB1 4.76 is selected. With unvisited children: their very large random initial UCB1 values dominate, so unvisited nodes are selected first.)
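Selection can be sketched with a minimal node structure (the field names are assumptions, not MctsAi's actual ones):

```python
import math
import random
from dataclasses import dataclass, field

@dataclass
class Node:
    avg: float = 0.0                      # average evaluation value X_i
    visits: int = 0                       # number of visits N_i
    children: list = field(default_factory=list)

def ucb1(child: Node, parent_visits: int, c: float = 3.0) -> float:
    if child.visits == 0:
        return 1e9 + random.random()      # large random initial value for unvisited nodes
    return child.avg + c * math.sqrt(2.0 * math.log(parent_visits) / child.visits)

def select_leaf(root: Node) -> Node:
    """Follow the child with the highest UCB1 value at each level down to a leaf."""
    node = root
    while node.children:
        node = max(node.children, key=lambda ch: ucb1(ch, node.visits))
    return node
```

With the values from the example (root with 17 visits, children (0.3, 3), (2.5, 10), (0.5, 4)), the child with UCB1 ≈ 4.76 is chosen.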
8. 2.2 Expansion
If a leaf node at depth level 1 that has 10 visits is reached, expand all of its child nodes at once.
(Figure: the selected depth-1 leaf, with 10 visits and average evaluation value 2.5, is expanded; all of its child nodes are created at once as unvisited nodes.)
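The expansion rule can be sketched like this; the threshold of 10 visits at depth 1 follows the slide, while the node structure is an illustrative assumption:

```python
class Node:
    def __init__(self, depth=0):
        self.depth = depth
        self.visits = 0
        self.children = []

EXPAND_THRESHOLD = 10  # visits required before a depth-1 leaf is expanded

def maybe_expand(leaf: Node, n_actions: int) -> bool:
    """Expand all children of a depth-1 leaf at once when it has enough visits."""
    if leaf.depth == 1 and not leaf.children and leaf.visits >= EXPAND_THRESHOLD:
        leaf.children = [Node(depth=leaf.depth + 1) for _ in range(n_actions)]
        return True
    return False
```

The newly created children are unvisited, so on later iterations their large random initial UCB1 values make Selection try each of them before revisiting.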