2. Contents
Ⅰ. What is AlphaGo?
- Go Machine
Ⅱ. Background
- Overview
- MCTS (Monte Carlo Tree Search)
- CNN (Convolutional Neural Network)
Ⅲ. Components
- Policy Networks
- Value Networks
- Searching with policy and value networks
Ⅳ. Conclusion
4. What is AlphaGo? – Go Machine
- AlphaGo is a computer program developed by Google DeepMind to play the board game Go.
- It was the first computer Go program to beat a professional human Go player without
handicaps.
5. What is AlphaGo? – Go Machine
- For chess, IBM's Deep Blue beat the world chess champion using brute-force search.
- A Go board is 19 × 19, and each point can be in one of 3 states (black, white, or
empty). The number of board configurations is therefore 3^361 ≈ 10^172.
- There are about 250 reasonable moves in each position, and a Go game ends after about
150 moves on average, so the search tree has breadth ≈ 250 and depth ≈ 150.
- It is impossible to search all of these cases with current technology.
- The key question is how to reduce the depth and breadth of the search tree.
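The search-space arithmetic on this slide can be checked directly; a quick sketch of the numbers (board configurations and game paths) in log scale:

```python
# Rough Go search-space arithmetic from the slide: a 19x19 board with
# 3 states per point, ~250 moves per turn, ~150 turns per game.
import math

points = 19 * 19                        # 361 intersections
configurations = 3 ** points            # every black/white/empty assignment
print(math.log10(configurations))       # ~172.2, i.e. 3^361 is about 10^172

game_paths = math.log10(250) * 150      # breadth^depth = 250^150 game paths
print(game_paths)                       # ~359.7, i.e. about 10^360 games
```

Even the smaller of these numbers dwarfs what brute-force search can enumerate, which is why the tree must be pruned in both depth and breadth.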
7. Background – Overview
1. MCTS (Monte Carlo Tree Search)
- It is used by many AI Go programs.
2. CNN (Convolutional Neural Networks)
- Policy Networks
- Value Networks
8. Background – MCTS
- MCTS is efficient when it is impossible to explore all paths exhaustively.
- Selection : select the most promising path from the root to a leaf.
- Expansion : if the game is not over, create one or more child nodes and choose one
of them.
- Simulation : play the game from the chosen node until it ends.
- Backpropagation : update the statistics on the path from the root to the chosen node
using the simulation result.
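The four steps above can be sketched as a minimal UCT-style implementation. The game here is a hypothetical stand-in, not Go: two players alternately take 1 or 2 stones from a pile, and whoever takes the last stone wins.

```python
# Minimal UCT-flavoured MCTS following the four steps on the slide.
# The take-away game and the Node/mcts names are illustrative stand-ins.
import math
import random

def moves(state):                      # legal moves: take 1 or 2 stones
    return [m for m in (1, 2) if m <= state]

class Node:
    def __init__(self, state, player, parent=None, move=None):
        self.state, self.player = state, player    # player to move here
        self.parent, self.move = parent, move
        self.children, self.wins, self.visits = [], 0, 0
        self.untried = moves(state)

def mcts(root_state, iters=3000):
    root = Node(root_state, player=1)
    for _ in range(iters):
        node = root
        # Selection: descend by the UCB1 score while fully expanded.
        while not node.untried and node.children:
            node = max(node.children, key=lambda c: c.wins / c.visits +
                       math.sqrt(2 * math.log(node.visits) / c.visits))
        # Expansion: add one unexplored child if the game is not over.
        if node.untried:
            m = node.untried.pop()
            child = Node(node.state - m, -node.player, node, m)
            node.children.append(child)
            node = child
        # Simulation: random rollout until the game ends.
        state, player = node.state, node.player
        winner = -node.player if state == 0 else None
        while winner is None:
            state -= random.choice(moves(state))
            if state == 0:
                winner = player            # mover who emptied the pile wins
            player = -player
        # Backpropagation: update stats from the chosen node up to the root.
        while node is not None:
            node.visits += 1
            node.wins += (winner == -node.player)  # credit the mover into node
            node = node.parent
    return max(root.children, key=lambda c: c.visits).move
```

From a pile of 4 stones the optimal move is to take 1 (leaving a multiple of 3), and with enough iterations the search concentrates its visits on that move.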
9. Background – CNN
- Convolution layer : extracts meaningful features (feature maps) from the input image.
- Sub-sampling layer : applies max-pooling to the feature maps to reduce their resolution.
- Fully-connected layer : performs the final classification from the feature maps.
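The three layer types can be traced through a toy forward pass; only the shapes matter here, since the weights are random rather than trained:

```python
# Toy forward pass through the three layer types on the slide
# (random weights, shapes only -- not a trained network).
import numpy as np

rng = np.random.default_rng(0)
image = rng.standard_normal((8, 8))      # tiny grayscale input image

# Convolution layer: slide a 3x3 filter over the image -> 6x6 feature map.
kernel = rng.standard_normal((3, 3))
fmap = np.array([[np.sum(image[i:i+3, j:j+3] * kernel)
                  for j in range(6)] for i in range(6)])

# Sub-sampling layer: 2x2 max-pooling -> 3x3 map.
pooled = fmap.reshape(3, 2, 3, 2).max(axis=(1, 3))

# Fully-connected layer: flatten and map to 2 class scores.
w = rng.standard_normal((2, 9))
logits = w @ pooled.ravel()
print(fmap.shape, pooled.shape, logits.shape)   # (6, 6) (3, 3) (2,)
```

Each stage shrinks the spatial dimensions while (in a real network) increasing the abstraction of the extracted features.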
11. Components – Notation
- s : state of the board
- a : next action
- v(s) : value function of a board state
- P(a|s) : probability distribution over possible moves a in position s
- P_σ : policy network trained with supervised learning
- P_π : fast rollout policy used to rapidly sample actions during rollouts
- P_ρ : policy network trained with reinforcement learning
- v_θ : value network that predicts the winner of games
12. Components - Policy Networks
- Decrease the breadth of the search tree.
- Convolutional neural networks for choosing the next action.
- Estimate the move distribution P(a|s).
- Trained by supervised learning and reinforcement learning.
1. SL (Supervised Learning) Policy Network
- Learns from human experts using 30 million positions from the KGS Go Server.
2. RL (Reinforcement Learning) Policy Network
- Initialized to the SL policy network.
- Learns from games of self-play with the RL policy network.
- The RL policy network won more than 80% of games against the SL policy network.
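What "estimating P(a|s)" means can be sketched in a few lines: turn per-move scores into a probability distribution and sample the next action. The scores here are made up; a real policy network computes them from the board state with convolutional layers.

```python
# Sketch of P(a|s): softmax over per-move scores, then sample a move.
# The scores are illustrative, not the output of a real network.
import numpy as np

rng = np.random.default_rng(0)
scores = np.array([2.0, 0.5, 0.1, -1.0])   # one score per legal move

probs = np.exp(scores - scores.max())      # numerically stable softmax
probs /= probs.sum()                       # -> P(a|s), sums to 1

move = rng.choice(len(scores), p=probs)    # sample the next action a
print(probs.round(3), move)
```

Reducing breadth then amounts to concentrating the search on the few moves with high P(a|s) instead of all ~250 legal ones.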
13. Components - Value Networks
- Decrease the depth of the search tree.
- A convolutional neural network for predicting the outcome
from position s.
- Estimates the value function v^p(s).
- Trained by reinforcement learning.
1. Reinforcement Learning
- Trained on self-play games generated by the RL policy networks.
- This avoids overfitting to the KGS data sets.
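The training signal for a value network can be sketched as a regression: nudge the predicted value v(s) toward the observed self-play outcome z. A linear model stands in for the real convolutional network, and the features are made-up numbers.

```python
# Sketch of the value-network training target: minimize (v(s) - z)^2.
# A linear model and toy features stand in for the real network.
import numpy as np

w = np.zeros(4)                          # toy "network" weights
s = np.array([0.5, -0.2, 0.8, 0.1])      # toy features for position s
z = 1.0                                  # self-play outcome: +1 = win

for _ in range(100):                     # gradient descent on (v - z)^2
    v = w @ s
    w -= 0.1 * 2 * (v - z) * s
print(w @ s)                             # prediction approaches z
```

At search time a single forward pass of the trained network replaces playing the game out to the end, which is how the value network cuts the depth of the tree.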
14. Components – Searching with policy and value networks
- Q : MCTS action value
- u(P) : a bonus that depends on a stored prior probability P and is inversely
proportional to the visit count.
- Selection : select the edge with the maximum Q + u(P) value at steps 1..L-1.
- Expansion : expand the nodes at step L, using P_σ to set prior probabilities.
- Evaluation : evaluate the win rate using v_θ and a random rollout from the leaf.
- Backup : update Q and the visit counts of all traversed edges.
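The selection rule and the leaf evaluation can be illustrated with made-up numbers. The exploration constant c_puct and the mixing weight lam are assumptions for this sketch, as are all the Q, P, and N values:

```python
# Sketch of selection by Q + u(P) and of mixing v_theta with a rollout.
# c_puct, lam, and all the edge statistics are illustrative assumptions.
import math

def u(P, N, N_parent, c_puct=5.0):
    # Bonus grows with the prior P and decays with the visit count N.
    return c_puct * P * math.sqrt(N_parent) / (1 + N)

edges = [            # (Q, prior P, visit count N) for each candidate move
    (0.52, 0.30, 40),
    (0.48, 0.50, 10),
    (0.50, 0.20, 5),
]
N_parent = sum(N for _, _, N in edges)
best = max(range(len(edges)),
           key=lambda i: edges[i][0] + u(edges[i][1], edges[i][2], N_parent))

# Evaluation: mix the value network v_theta with the rollout result z.
lam, v_theta, z = 0.5, 0.62, 1.0        # z = +1 : the rollout was won
V_leaf = (1 - lam) * v_theta + lam * z
print(best, V_leaf)
```

Note how the bonus u(P) steers early visits toward moves the policy network likes, while Q takes over as visit counts grow.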
16. Conclusion
- The single-machine AlphaGo is many dan ranks stronger than any previous Go program,
winning 494 out of 495 games (99.8%) against other Go programs.
- The distributed version of AlphaGo won the match 5 games to 0 against Fan Hui, the
European Go champion.