SlideShare ist ein Scribd-Unternehmen logo
1 von 31
Downloaden Sie, um offline zu lesen
From Alpha Go to
Alpha Zero
Google London

March 2018
Juantomás García
• Data Solutions Manager @ OpenSistemas
• GDE (Google Developer Expert) for cloud
Others
• Co-Author of the first Spanish free software book “La Pastilla
Roja”
• President of Hispalinux (Spanish Linux User Group)
• Organizer of the Machine Learning Spain and GDG Cloud Madrid.
Who I am
• People interested in Machine Learning
• Wants to know more about what’s is Alpha Go
• With a good technical background.
Who are the Audience
• I love Machine Learning.
• There are a lot of takeaways from this project.
• I wish to divulge it
Why I did this presentation
• Alpha Go: the epic project
• AlphaGo Zero: re-evolution version
• Alpha Zero: Looking for general solutions
• DIY: Alpha Zero Connect 4
• Takeaways
Outline
A brief introduction
• Deep Blue was about brute force
• They were emulating how humans play chess
A brief introduction
• A very huge Search Space
Chess -> Opening 20 possible moves
Go -> Opening 361 possible moves
Alpha Go Main Concepts
• Policy Neural Network
“To decide which are the most sensible moves in
a particular board position”.
Alpha Go Main Concepts
• Value Neural Network
“How great is a particular board arrangements”.
“How likely you are to win the game with this
position”.
Alpha Go Main Concepts
Alpha Go First Approach: SL
• Just train both networks using human games.
• Just old and ordinary supervised learning.
• With this: AlphaGo just play with like a weak
human.
• It like the approach of deep blue: just emulating
human chess players
Alpha Go First Approach: SL
Alpha Go Second Approach: RL
• Improve SL version starting playing again itself.
• With Reinforcement Learning is able to play well
against state of the art go playing programs
• These programs are using MCTS
Alpha Go Second Approach: RL
Alpha Go Second Approach: RL
• It is not 2 NN vs Monte Carlo Tree Search
• Is a better MCTS thanks to the NNs.
Alpha Go Second Approach: RL
• Optimal Value Function V*(s)
“Determine the outcome of the game from every
board position (s is the state)”.
Brute force solution is impossible:
Chess: 35 ** 80
Go: 250 ** 150
Alpha Go Second Approach: RL
• Two solutions for reduce the effective search
space:
Truncate the tree subtree search: V(s) like V*(s)
Reducing the breadth of the search with the
policy: P(a|s)
We MCTS rollout the moves choose by the policy
function and evaluate with the optimal value
function.
AlphaGo: The Match
AlphaGo Zero: Re-Evolution version
• Just trained with Reinforcement Learning
• Choose the less out different moves: u(s,a)
• Just one neural network for policy and value.
• Every time a search is done the neural network is
retrained.
AlphaGo Zero: Re-Evolution version
• Human games was noisy and not reliable.
• Don’t use rollouts for predict who will win.
AlphaGo Zero: Re-Evolution version
AlphaGo Zero: Re-Evolution version
Alpha Zero: New Challenges
AlphaGo Zero VS AlphaZero:
• Binary outcome (win / loss) × expected outcome
(including
• 3 draws or potentially other outcomes)
• Board positions transformed before passing to neural
networks (by randomly selected rotation or redirection) × no
data augmentation
• Games generated by the best player from previous iterations
(margin of 55 %) × continual update using the latest
parameters (without the evaluation and selection steps)
• Hyper-parameters tuned by Bayesian optimisation × reused
the same hyper-parameters without game-specific tuning
Alpha Zero
Alpha Zero: DYI
https://medium.com/applied-data-science/how-to-build-your-own-alphazero-ai-using-python-and-keras-7f664945c188
Takeaways
RL is more than Atari Games and GO
Takeaways
AI discovery new ways to play.
Think about new projects like proteins fold.
Takeaways
We’re living awesome times.
Sharing AI papers, tools, models, etc. More
than any time before.
Takeaways
As Ms Fei Fei said: “It’s about democratizing AI”
Takeaways
Watch this Documentary Film about Alpha Go:
Thank You

Weitere ähnliche Inhalte

Was ist angesagt?

AlphaGo 알고리즘 요약
AlphaGo 알고리즘 요약AlphaGo 알고리즘 요약
AlphaGo 알고리즘 요약Jooyoul Lee
 
알파고 (바둑 인공지능)의 작동 원리
알파고 (바둑 인공지능)의 작동 원리알파고 (바둑 인공지능)의 작동 원리
알파고 (바둑 인공지능)의 작동 원리Shane (Seungwhan) Moon
 
전리품 분배 시스템 기획 배상욱
전리품 분배 시스템 기획 배상욱전리품 분배 시스템 기획 배상욱
전리품 분배 시스템 기획 배상욱SwooBae
 
게임기획 포트폴리오 애니팡역기획 배상욱
게임기획 포트폴리오 애니팡역기획 배상욱게임기획 포트폴리오 애니팡역기획 배상욱
게임기획 포트폴리오 애니팡역기획 배상욱SwooBae
 
알파고 해부하기 3부
알파고 해부하기 3부알파고 해부하기 3부
알파고 해부하기 3부Donghun Lee
 
Hangman game - AI Powered and in Python
Hangman game - AI Powered and in PythonHangman game - AI Powered and in Python
Hangman game - AI Powered and in Pythonaiclub_slides
 
알파고 풀어보기 / Alpha Technical Review
알파고 풀어보기 / Alpha Technical Review알파고 풀어보기 / Alpha Technical Review
알파고 풀어보기 / Alpha Technical Review상은 박
 
Habitica Design Challenge Finalist for Octalysis - Ivan Milev
Habitica Design Challenge Finalist for Octalysis - Ivan Milev  Habitica Design Challenge Finalist for Octalysis - Ivan Milev
Habitica Design Challenge Finalist for Octalysis - Ivan Milev Yu-kai Chou
 
게임 기획 튜토리얼 (2015 개정판)
게임 기획 튜토리얼 (2015 개정판)게임 기획 튜토리얼 (2015 개정판)
게임 기획 튜토리얼 (2015 개정판)Lee Sangkyoon (Kay)
 
게임별 유저 성향 분석
게임별 유저 성향 분석게임별 유저 성향 분석
게임별 유저 성향 분석ACE Trader
 
The Rise and Rise of Idle Games
The Rise and Rise of Idle GamesThe Rise and Rise of Idle Games
The Rise and Rise of Idle GamesAnthony Pecorella
 
AlphaGo: Mastering the Game of Go with Deep Neural Networks and Tree Search
AlphaGo: Mastering the Game of Go with Deep Neural Networks and Tree SearchAlphaGo: Mastering the Game of Go with Deep Neural Networks and Tree Search
AlphaGo: Mastering the Game of Go with Deep Neural Networks and Tree SearchKarel Ha
 
2013 Gould Backer Zone Defense
2013 Gould Backer Zone Defense2013 Gould Backer Zone Defense
2013 Gould Backer Zone Defensedockj
 
ML Zoomcamp 1.4 - CRISP-DM
ML Zoomcamp 1.4 - CRISP-DMML Zoomcamp 1.4 - CRISP-DM
ML Zoomcamp 1.4 - CRISP-DMAlexey Grigorev
 
알파고 해부하기 2부
알파고 해부하기 2부알파고 해부하기 2부
알파고 해부하기 2부Donghun Lee
 
DiscoRank: optimizing discoverability on SoundCloud
DiscoRank: optimizing discoverability on SoundCloudDiscoRank: optimizing discoverability on SoundCloud
DiscoRank: optimizing discoverability on SoundCloudAmélie Anglade
 
A brief overview of Reinforcement Learning applied to games
A brief overview of Reinforcement Learning applied to gamesA brief overview of Reinforcement Learning applied to games
A brief overview of Reinforcement Learning applied to gamesThomas da Silva Paula
 
강화 학습 기초 Reinforcement Learning an introduction
강화 학습 기초 Reinforcement Learning an introduction강화 학습 기초 Reinforcement Learning an introduction
강화 학습 기초 Reinforcement Learning an introductionTaehoon Kim
 
Dueling network architectures for deep reinforcement learning
Dueling network architectures for deep reinforcement learningDueling network architectures for deep reinforcement learning
Dueling network architectures for deep reinforcement learningTaehoon Kim
 

Was ist angesagt? (20)

AlphaGo 알고리즘 요약
AlphaGo 알고리즘 요약AlphaGo 알고리즘 요약
AlphaGo 알고리즘 요약
 
Understanding AlphaGo
Understanding AlphaGoUnderstanding AlphaGo
Understanding AlphaGo
 
알파고 (바둑 인공지능)의 작동 원리
알파고 (바둑 인공지능)의 작동 원리알파고 (바둑 인공지능)의 작동 원리
알파고 (바둑 인공지능)의 작동 원리
 
전리품 분배 시스템 기획 배상욱
전리품 분배 시스템 기획 배상욱전리품 분배 시스템 기획 배상욱
전리품 분배 시스템 기획 배상욱
 
게임기획 포트폴리오 애니팡역기획 배상욱
게임기획 포트폴리오 애니팡역기획 배상욱게임기획 포트폴리오 애니팡역기획 배상욱
게임기획 포트폴리오 애니팡역기획 배상욱
 
알파고 해부하기 3부
알파고 해부하기 3부알파고 해부하기 3부
알파고 해부하기 3부
 
Hangman game - AI Powered and in Python
Hangman game - AI Powered and in PythonHangman game - AI Powered and in Python
Hangman game - AI Powered and in Python
 
알파고 풀어보기 / Alpha Technical Review
알파고 풀어보기 / Alpha Technical Review알파고 풀어보기 / Alpha Technical Review
알파고 풀어보기 / Alpha Technical Review
 
Habitica Design Challenge Finalist for Octalysis - Ivan Milev
Habitica Design Challenge Finalist for Octalysis - Ivan Milev  Habitica Design Challenge Finalist for Octalysis - Ivan Milev
Habitica Design Challenge Finalist for Octalysis - Ivan Milev
 
게임 기획 튜토리얼 (2015 개정판)
게임 기획 튜토리얼 (2015 개정판)게임 기획 튜토리얼 (2015 개정판)
게임 기획 튜토리얼 (2015 개정판)
 
게임별 유저 성향 분석
게임별 유저 성향 분석게임별 유저 성향 분석
게임별 유저 성향 분석
 
The Rise and Rise of Idle Games
The Rise and Rise of Idle GamesThe Rise and Rise of Idle Games
The Rise and Rise of Idle Games
 
AlphaGo: Mastering the Game of Go with Deep Neural Networks and Tree Search
AlphaGo: Mastering the Game of Go with Deep Neural Networks and Tree SearchAlphaGo: Mastering the Game of Go with Deep Neural Networks and Tree Search
AlphaGo: Mastering the Game of Go with Deep Neural Networks and Tree Search
 
2013 Gould Backer Zone Defense
2013 Gould Backer Zone Defense2013 Gould Backer Zone Defense
2013 Gould Backer Zone Defense
 
ML Zoomcamp 1.4 - CRISP-DM
ML Zoomcamp 1.4 - CRISP-DMML Zoomcamp 1.4 - CRISP-DM
ML Zoomcamp 1.4 - CRISP-DM
 
알파고 해부하기 2부
알파고 해부하기 2부알파고 해부하기 2부
알파고 해부하기 2부
 
DiscoRank: optimizing discoverability on SoundCloud
DiscoRank: optimizing discoverability on SoundCloudDiscoRank: optimizing discoverability on SoundCloud
DiscoRank: optimizing discoverability on SoundCloud
 
A brief overview of Reinforcement Learning applied to games
A brief overview of Reinforcement Learning applied to gamesA brief overview of Reinforcement Learning applied to games
A brief overview of Reinforcement Learning applied to games
 
강화 학습 기초 Reinforcement Learning an introduction
강화 학습 기초 Reinforcement Learning an introduction강화 학습 기초 Reinforcement Learning an introduction
강화 학습 기초 Reinforcement Learning an introduction
 
Dueling network architectures for deep reinforcement learning
Dueling network architectures for deep reinforcement learningDueling network architectures for deep reinforcement learning
Dueling network architectures for deep reinforcement learning
 

Ähnlich wie Alpha zero - London 2018

Adversarial search with Game Playing
Adversarial search with Game PlayingAdversarial search with Game Playing
Adversarial search with Game PlayingAman Patel
 
Chakrabarti alpha go analysis
Chakrabarti alpha go analysisChakrabarti alpha go analysis
Chakrabarti alpha go analysisDave Selinger
 
A Presentation on the Paper: Mastering the game of Go with deep neural networ...
A Presentation on the Paper: Mastering the game of Go with deep neural networ...A Presentation on the Paper: Mastering the game of Go with deep neural networ...
A Presentation on the Paper: Mastering the game of Go with deep neural networ...AdityaSuryavamshi
 
How DeepMind Mastered The Game Of Go
How DeepMind Mastered The Game Of GoHow DeepMind Mastered The Game Of Go
How DeepMind Mastered The Game Of GoTim Riser
 
J-Fall 2017 - AI Self-learning Game Playing
J-Fall 2017 - AI Self-learning Game PlayingJ-Fall 2017 - AI Self-learning Game Playing
J-Fall 2017 - AI Self-learning Game PlayingRichard Abbuhl
 
AlphaGo: An AI Go player based on deep neural networks and monte carlo tree s...
AlphaGo: An AI Go player based on deep neural networks and monte carlo tree s...AlphaGo: An AI Go player based on deep neural networks and monte carlo tree s...
AlphaGo: An AI Go player based on deep neural networks and monte carlo tree s...Michael Jongho Moon
 
Implementation and analysis of search algorithms in single player connect fou...
Implementation and analysis of search algorithms in single player connect fou...Implementation and analysis of search algorithms in single player connect fou...
Implementation and analysis of search algorithms in single player connect fou...Anmol Rajpurohit
 
Devoxx 2017 - AI Self-learning Game Playing
Devoxx 2017 - AI Self-learning Game PlayingDevoxx 2017 - AI Self-learning Game Playing
Devoxx 2017 - AI Self-learning Game PlayingRichard Abbuhl
 
AlphaGo zero
AlphaGo zeroAlphaGo zero
AlphaGo zeroDong Guo
 
Alpha go 16110226_김영우
Alpha go 16110226_김영우Alpha go 16110226_김영우
Alpha go 16110226_김영우영우 김
 
IaGo: an Othello AI inspired by AlphaGo
IaGo: an Othello AI inspired by AlphaGoIaGo: an Othello AI inspired by AlphaGo
IaGo: an Othello AI inspired by AlphaGoShion Honda
 
TensorFlow London 11: Pierre Harvey Richemond 'Trends and Developments in Rei...
TensorFlow London 11: Pierre Harvey Richemond 'Trends and Developments in Rei...TensorFlow London 11: Pierre Harvey Richemond 'Trends and Developments in Rei...
TensorFlow London 11: Pierre Harvey Richemond 'Trends and Developments in Rei...Seldon
 
[1312.5602] Playing Atari with Deep Reinforcement Learning
[1312.5602] Playing Atari with Deep Reinforcement Learning[1312.5602] Playing Atari with Deep Reinforcement Learning
[1312.5602] Playing Atari with Deep Reinforcement LearningSeung Jae Lee
 
chess-algorithms-theory-and-practice_ver2017.pdf
chess-algorithms-theory-and-practice_ver2017.pdfchess-algorithms-theory-and-practice_ver2017.pdf
chess-algorithms-theory-and-practice_ver2017.pdfrajdipdas12
 
Deep learning to the rescue - solving long standing problems of recommender ...
Deep learning to the rescue - solving long standing problems of recommender ...Deep learning to the rescue - solving long standing problems of recommender ...
Deep learning to the rescue - solving long standing problems of recommender ...Balázs Hidasi
 
21CSC206T_UNIT3.pptx.pdf ARITIFICIAL INTELLIGENCE
21CSC206T_UNIT3.pptx.pdf ARITIFICIAL INTELLIGENCE21CSC206T_UNIT3.pptx.pdf ARITIFICIAL INTELLIGENCE
21CSC206T_UNIT3.pptx.pdf ARITIFICIAL INTELLIGENCEudayvanand
 
Mastering the game of go with deep neural networks and tree search
Mastering the game of go with deep neural networks and tree searchMastering the game of go with deep neural networks and tree search
Mastering the game of go with deep neural networks and tree searchSanFengChang
 

Ähnlich wie Alpha zero - London 2018 (20)

From alpha go to alpha zero TLP innova 2018
From alpha go to alpha zero  TLP innova 2018From alpha go to alpha zero  TLP innova 2018
From alpha go to alpha zero TLP innova 2018
 
Adversarial search with Game Playing
Adversarial search with Game PlayingAdversarial search with Game Playing
Adversarial search with Game Playing
 
Chakrabarti alpha go analysis
Chakrabarti alpha go analysisChakrabarti alpha go analysis
Chakrabarti alpha go analysis
 
Games
GamesGames
Games
 
A Presentation on the Paper: Mastering the game of Go with deep neural networ...
A Presentation on the Paper: Mastering the game of Go with deep neural networ...A Presentation on the Paper: Mastering the game of Go with deep neural networ...
A Presentation on the Paper: Mastering the game of Go with deep neural networ...
 
How DeepMind Mastered The Game Of Go
How DeepMind Mastered The Game Of GoHow DeepMind Mastered The Game Of Go
How DeepMind Mastered The Game Of Go
 
J-Fall 2017 - AI Self-learning Game Playing
J-Fall 2017 - AI Self-learning Game PlayingJ-Fall 2017 - AI Self-learning Game Playing
J-Fall 2017 - AI Self-learning Game Playing
 
Games.4
Games.4Games.4
Games.4
 
AlphaGo: An AI Go player based on deep neural networks and monte carlo tree s...
AlphaGo: An AI Go player based on deep neural networks and monte carlo tree s...AlphaGo: An AI Go player based on deep neural networks and monte carlo tree s...
AlphaGo: An AI Go player based on deep neural networks and monte carlo tree s...
 
Implementation and analysis of search algorithms in single player connect fou...
Implementation and analysis of search algorithms in single player connect fou...Implementation and analysis of search algorithms in single player connect fou...
Implementation and analysis of search algorithms in single player connect fou...
 
Devoxx 2017 - AI Self-learning Game Playing
Devoxx 2017 - AI Self-learning Game PlayingDevoxx 2017 - AI Self-learning Game Playing
Devoxx 2017 - AI Self-learning Game Playing
 
AlphaGo zero
AlphaGo zeroAlphaGo zero
AlphaGo zero
 
Alpha go 16110226_김영우
Alpha go 16110226_김영우Alpha go 16110226_김영우
Alpha go 16110226_김영우
 
IaGo: an Othello AI inspired by AlphaGo
IaGo: an Othello AI inspired by AlphaGoIaGo: an Othello AI inspired by AlphaGo
IaGo: an Othello AI inspired by AlphaGo
 
TensorFlow London 11: Pierre Harvey Richemond 'Trends and Developments in Rei...
TensorFlow London 11: Pierre Harvey Richemond 'Trends and Developments in Rei...TensorFlow London 11: Pierre Harvey Richemond 'Trends and Developments in Rei...
TensorFlow London 11: Pierre Harvey Richemond 'Trends and Developments in Rei...
 
[1312.5602] Playing Atari with Deep Reinforcement Learning
[1312.5602] Playing Atari with Deep Reinforcement Learning[1312.5602] Playing Atari with Deep Reinforcement Learning
[1312.5602] Playing Atari with Deep Reinforcement Learning
 
chess-algorithms-theory-and-practice_ver2017.pdf
chess-algorithms-theory-and-practice_ver2017.pdfchess-algorithms-theory-and-practice_ver2017.pdf
chess-algorithms-theory-and-practice_ver2017.pdf
 
Deep learning to the rescue - solving long standing problems of recommender ...
Deep learning to the rescue - solving long standing problems of recommender ...Deep learning to the rescue - solving long standing problems of recommender ...
Deep learning to the rescue - solving long standing problems of recommender ...
 
21CSC206T_UNIT3.pptx.pdf ARITIFICIAL INTELLIGENCE
21CSC206T_UNIT3.pptx.pdf ARITIFICIAL INTELLIGENCE21CSC206T_UNIT3.pptx.pdf ARITIFICIAL INTELLIGENCE
21CSC206T_UNIT3.pptx.pdf ARITIFICIAL INTELLIGENCE
 
Mastering the game of go with deep neural networks and tree search
Mastering the game of go with deep neural networks and tree searchMastering the game of go with deep neural networks and tree search
Mastering the game of go with deep neural networks and tree search
 

Mehr von Juantomás García Molina

#AbadIA machine learning pipelines commit conf 2019
#AbadIA   machine learning pipelines commit conf 2019#AbadIA   machine learning pipelines commit conf 2019
#AbadIA machine learning pipelines commit conf 2019Juantomás García Molina
 
AbadIA: the abbey of the crime AI - GDG Cloud London 2018
AbadIA:  the abbey of the crime AI - GDG Cloud London 2018AbadIA:  the abbey of the crime AI - GDG Cloud London 2018
AbadIA: the abbey of the crime AI - GDG Cloud London 2018Juantomás García Molina
 
#AbadIA: the abbey of the crime AI - IO18 extended madrid 2018
#AbadIA:  the abbey of the crime AI - IO18 extended madrid 2018#AbadIA:  the abbey of the crime AI - IO18 extended madrid 2018
#AbadIA: the abbey of the crime AI - IO18 extended madrid 2018Juantomás García Molina
 
#AbadIA: the abbey of the crime AI - IBM meetup Madrid 2018
#AbadIA: the abbey of the crime AI - IBM meetup Madrid 2018#AbadIA: the abbey of the crime AI - IBM meetup Madrid 2018
#AbadIA: the abbey of the crime AI - IBM meetup Madrid 2018Juantomás García Molina
 
AbadIA: the abbey of the crime AI - Vaas Madrid 2018
AbadIA: the abbey of the crime AI - Vaas Madrid 2018AbadIA: the abbey of the crime AI - Vaas Madrid 2018
AbadIA: the abbey of the crime AI - Vaas Madrid 2018Juantomás García Molina
 
From Alpha Go to Alpha Zero - Vaas Madrid 2018
From Alpha Go to Alpha Zero -  Vaas Madrid 2018From Alpha Go to Alpha Zero -  Vaas Madrid 2018
From Alpha Go to Alpha Zero - Vaas Madrid 2018Juantomás García Molina
 
Codemotion madrid 2017 Arquitectura kappa 2.0
Codemotion madrid 2017  Arquitectura kappa 2.0Codemotion madrid 2017  Arquitectura kappa 2.0
Codemotion madrid 2017 Arquitectura kappa 2.0Juantomás García Molina
 
Meetup big data developers 2017 madrid - spark real use cases
Meetup big data developers 2017 madrid - spark real use casesMeetup big data developers 2017 madrid - spark real use cases
Meetup big data developers 2017 madrid - spark real use casesJuantomás García Molina
 
Gdg cloud london 2017 kappa architecture 2.0 copia
Gdg cloud london 2017   kappa architecture 2.0 copiaGdg cloud london 2017   kappa architecture 2.0 copia
Gdg cloud london 2017 kappa architecture 2.0 copiaJuantomás García Molina
 
Datascience lab 2017 odessa kappa architecture 2.0
Datascience lab 2017 odessa   kappa architecture 2.0Datascience lab 2017 odessa   kappa architecture 2.0
Datascience lab 2017 odessa kappa architecture 2.0Juantomás García Molina
 
Databeers madrid 2017 - Paas pigeons as a service
Databeers madrid 2017 - Paas pigeons as a serviceDatabeers madrid 2017 - Paas pigeons as a service
Databeers madrid 2017 - Paas pigeons as a serviceJuantomás García Molina
 
How to create a personal knowledge graph IBM Meetup Big Data Madrid 2017
How to create a personal knowledge graph IBM Meetup Big Data Madrid 2017How to create a personal knowledge graph IBM Meetup Big Data Madrid 2017
How to create a personal knowledge graph IBM Meetup Big Data Madrid 2017Juantomás García Molina
 
Librecon 2016 bilbao: kappa architecture IoT of the cars
Librecon 2016 bilbao:   kappa architecture IoT of the carsLibrecon 2016 bilbao:   kappa architecture IoT of the cars
Librecon 2016 bilbao: kappa architecture IoT of the carsJuantomás García Molina
 

Mehr von Juantomás García Molina (20)

#AbadIA machine learning pipelines commit conf 2019
#AbadIA   machine learning pipelines commit conf 2019#AbadIA   machine learning pipelines commit conf 2019
#AbadIA machine learning pipelines commit conf 2019
 
AbadIA - sphere it krakow 2019
AbadIA -   sphere it krakow 2019AbadIA -   sphere it krakow 2019
AbadIA - sphere it krakow 2019
 
AbadIA ING Direct - Madrid 2019
AbadIA ING Direct - Madrid 2019AbadIA ING Direct - Madrid 2019
AbadIA ING Direct - Madrid 2019
 
AbadIA US Secret Tour - Pittsburgh'19
AbadIA US Secret Tour - Pittsburgh'19AbadIA US Secret Tour - Pittsburgh'19
AbadIA US Secret Tour - Pittsburgh'19
 
AbadIA: the abbey of the crime AI - GDG Cloud London 2018
AbadIA:  the abbey of the crime AI - GDG Cloud London 2018AbadIA:  the abbey of the crime AI - GDG Cloud London 2018
AbadIA: the abbey of the crime AI - GDG Cloud London 2018
 
#AbadIA: the abbey of the crime AI - IO18 extended madrid 2018
#AbadIA:  the abbey of the crime AI - IO18 extended madrid 2018#AbadIA:  the abbey of the crime AI - IO18 extended madrid 2018
#AbadIA: the abbey of the crime AI - IO18 extended madrid 2018
 
#AbadIA: the abbey of the crime AI - IBM meetup Madrid 2018
#AbadIA: the abbey of the crime AI - IBM meetup Madrid 2018#AbadIA: the abbey of the crime AI - IBM meetup Madrid 2018
#AbadIA: the abbey of the crime AI - IBM meetup Madrid 2018
 
AbadIA: the abbey of the crime AI - Vaas Madrid 2018
AbadIA: the abbey of the crime AI - Vaas Madrid 2018AbadIA: the abbey of the crime AI - Vaas Madrid 2018
AbadIA: the abbey of the crime AI - Vaas Madrid 2018
 
From Alpha Go to Alpha Zero - Vaas Madrid 2018
From Alpha Go to Alpha Zero -  Vaas Madrid 2018From Alpha Go to Alpha Zero -  Vaas Madrid 2018
From Alpha Go to Alpha Zero - Vaas Madrid 2018
 
Codemotion madrid 2017 Arquitectura kappa 2.0
Codemotion madrid 2017  Arquitectura kappa 2.0Codemotion madrid 2017  Arquitectura kappa 2.0
Codemotion madrid 2017 Arquitectura kappa 2.0
 
JBCN barcelona 2017 kappa architecture 2.0
JBCN barcelona 2017 kappa architecture 2.0JBCN barcelona 2017 kappa architecture 2.0
JBCN barcelona 2017 kappa architecture 2.0
 
Meetup big data developers 2017 madrid - spark real use cases
Meetup big data developers 2017 madrid - spark real use casesMeetup big data developers 2017 madrid - spark real use cases
Meetup big data developers 2017 madrid - spark real use cases
 
Gdg cloud madrid 2017 - GDG kick off metuup
Gdg cloud madrid 2017  - GDG kick off metuupGdg cloud madrid 2017  - GDG kick off metuup
Gdg cloud madrid 2017 - GDG kick off metuup
 
Scalaua 2017 kyev kappa architecture 2.0
Scalaua 2017 kyev   kappa architecture 2.0Scalaua 2017 kyev   kappa architecture 2.0
Scalaua 2017 kyev kappa architecture 2.0
 
Icea 2017 big data - recursos humanos
Icea 2017   big data - recursos humanosIcea 2017   big data - recursos humanos
Icea 2017 big data - recursos humanos
 
Gdg cloud london 2017 kappa architecture 2.0 copia
Gdg cloud london 2017   kappa architecture 2.0 copiaGdg cloud london 2017   kappa architecture 2.0 copia
Gdg cloud london 2017 kappa architecture 2.0 copia
 
Datascience lab 2017 odessa kappa architecture 2.0
Datascience lab 2017 odessa   kappa architecture 2.0Datascience lab 2017 odessa   kappa architecture 2.0
Datascience lab 2017 odessa kappa architecture 2.0
 
Databeers madrid 2017 - Paas pigeons as a service
Databeers madrid 2017 - Paas pigeons as a serviceDatabeers madrid 2017 - Paas pigeons as a service
Databeers madrid 2017 - Paas pigeons as a service
 
How to create a personal knowledge graph IBM Meetup Big Data Madrid 2017
How to create a personal knowledge graph IBM Meetup Big Data Madrid 2017How to create a personal knowledge graph IBM Meetup Big Data Madrid 2017
How to create a personal knowledge graph IBM Meetup Big Data Madrid 2017
 
Librecon 2016 bilbao: kappa architecture IoT of the cars
Librecon 2016 bilbao:   kappa architecture IoT of the carsLibrecon 2016 bilbao:   kappa architecture IoT of the cars
Librecon 2016 bilbao: kappa architecture IoT of the cars
 

Kürzlich hochgeladen

Thermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - VThermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - VDineshKumar4165
 
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXssuser89054b
 
PE 459 LECTURE 2- natural gas basic concepts and properties
PE 459 LECTURE 2- natural gas basic concepts and propertiesPE 459 LECTURE 2- natural gas basic concepts and properties
PE 459 LECTURE 2- natural gas basic concepts and propertiessarkmank1
 
Double Revolving field theory-how the rotor develops torque
Double Revolving field theory-how the rotor develops torqueDouble Revolving field theory-how the rotor develops torque
Double Revolving field theory-how the rotor develops torqueBhangaleSonal
 
Generative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPTGenerative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPTbhaskargani46
 
Engineering Drawing focus on projection of planes
Engineering Drawing focus on projection of planesEngineering Drawing focus on projection of planes
Engineering Drawing focus on projection of planesRAJNEESHKUMAR341697
 
Hostel management system project report..pdf
Hostel management system project report..pdfHostel management system project report..pdf
Hostel management system project report..pdfKamal Acharya
 
A Study of Urban Area Plan for Pabna Municipality
A Study of Urban Area Plan for Pabna MunicipalityA Study of Urban Area Plan for Pabna Municipality
A Study of Urban Area Plan for Pabna MunicipalityMorshed Ahmed Rahath
 
Thermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.pptThermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.pptDineshKumar4165
 
1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf
1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf
1_Introduction + EAM Vocabulary + how to navigate in EAM.pdfAldoGarca30
 
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...Arindam Chakraborty, Ph.D., P.E. (CA, TX)
 
Moment Distribution Method For Btech Civil
Moment Distribution Method For Btech CivilMoment Distribution Method For Btech Civil
Moment Distribution Method For Btech CivilVinayVitekari
 
Unleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leapUnleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leapRishantSharmaFr
 
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptx
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptxHOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptx
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptxSCMS School of Architecture
 
Design For Accessibility: Getting it right from the start
Design For Accessibility: Getting it right from the startDesign For Accessibility: Getting it right from the start
Design For Accessibility: Getting it right from the startQuintin Balsdon
 
DeepFakes presentation : brief idea of DeepFakes
DeepFakes presentation : brief idea of DeepFakesDeepFakes presentation : brief idea of DeepFakes
DeepFakes presentation : brief idea of DeepFakesMayuraD1
 
Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...
Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...
Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...drmkjayanthikannan
 
Computer Networks Basics of Network Devices
Computer Networks  Basics of Network DevicesComputer Networks  Basics of Network Devices
Computer Networks Basics of Network DevicesChandrakantDivate1
 

Kürzlich hochgeladen (20)

Thermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - VThermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - V
 
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
 
PE 459 LECTURE 2- natural gas basic concepts and properties
PE 459 LECTURE 2- natural gas basic concepts and propertiesPE 459 LECTURE 2- natural gas basic concepts and properties
PE 459 LECTURE 2- natural gas basic concepts and properties
 
FEA Based Level 3 Assessment of Deformed Tanks with Fluid Induced Loads
FEA Based Level 3 Assessment of Deformed Tanks with Fluid Induced LoadsFEA Based Level 3 Assessment of Deformed Tanks with Fluid Induced Loads
FEA Based Level 3 Assessment of Deformed Tanks with Fluid Induced Loads
 
Double Revolving field theory-how the rotor develops torque
Double Revolving field theory-how the rotor develops torqueDouble Revolving field theory-how the rotor develops torque
Double Revolving field theory-how the rotor develops torque
 
Generative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPTGenerative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPT
 
Engineering Drawing focus on projection of planes
Engineering Drawing focus on projection of planesEngineering Drawing focus on projection of planes
Engineering Drawing focus on projection of planes
 
Hostel management system project report..pdf
Hostel management system project report..pdfHostel management system project report..pdf
Hostel management system project report..pdf
 
A Study of Urban Area Plan for Pabna Municipality
A Study of Urban Area Plan for Pabna MunicipalityA Study of Urban Area Plan for Pabna Municipality
A Study of Urban Area Plan for Pabna Municipality
 
Thermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.pptThermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.ppt
 
1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf
1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf
1_Introduction + EAM Vocabulary + how to navigate in EAM.pdf
 
Cara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak Hamil
Cara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak HamilCara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak Hamil
Cara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak Hamil
 
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
 
Moment Distribution Method For Btech Civil
Moment Distribution Method For Btech CivilMoment Distribution Method For Btech Civil
Moment Distribution Method For Btech Civil
 
Unleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leapUnleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leap
 
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptx
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptxHOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptx
HOA1&2 - Module 3 - PREHISTORCI ARCHITECTURE OF KERALA.pptx
 
Design For Accessibility: Getting it right from the start
Design For Accessibility: Getting it right from the startDesign For Accessibility: Getting it right from the start
Design For Accessibility: Getting it right from the start
 
DeepFakes presentation : brief idea of DeepFakes
DeepFakes presentation : brief idea of DeepFakesDeepFakes presentation : brief idea of DeepFakes
DeepFakes presentation : brief idea of DeepFakes
 
Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...
Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...
Unit 4_Part 1 CSE2001 Exception Handling and Function Template and Class Temp...
 
Computer Networks Basics of Network Devices
Computer Networks  Basics of Network DevicesComputer Networks  Basics of Network Devices
Computer Networks Basics of Network Devices
 

Alpha zero - London 2018

  • 1. From Alpha Go to Alpha Zero Google London March 2018
  • 2. Juantomás García • Data Solutions Manager @ OpenSistemas • GDE (Google Developer Expert) for cloud Others • Co-Author of the first Spanish free software book “La Pastilla Roja” • President of Hispalinux (Spanish Linux User Group) • Organizer of the Machine Learning Spain and GDG Cloud Madrid. Who I am
  • 3. • People interested in Machine Learning • Wants to know more about what’s is Alpha Go • With a good technical background. Who are the Audience
  • 4. • I love Machine Learning. • There are a lot of takeaways from this project. • I wish to divulge it Why I did this presentation
  • 5. • Alpha Go: the epic project • AlphaGo Zero: re-evolution version • Alpha Zero: Looking for general solutions • DIY: Alpha Zero Connect 4 • Takeaways Outline
  • 6. A brief introduction • Deep Blue was about brute force • They were emulating how humans play chess
  • 7. A brief introduction • A very huge Search Space Chess -> Opening 20 possible moves Go -> Opening 361 possible moves
  • 8. Alpha Go Main Concepts • Policy Neural Network “To decide which are the most sensible moves in a particular board position”.
  • 9. Alpha Go Main Concepts • Value Neural Network “How great is a particular board arrangements”. “How likely you are to win the game with this position”.
  • 10. Alpha Go Main Concepts
  • 11. Alpha Go First Approach: SL • Just train both networks using human games. • Just old and ordinary supervised learning. • With this: AlphaGo just play with like a weak human. • It like the approach of deep blue: just emulating human chess players
  • 12. Alpha Go First Approach: SL
  • 13. Alpha Go Second Approach: RL • Improve SL version starting playing again itself. • With Reinforcement Learning is able to play well against state of the art go playing programs • These programs are using MCTS
  • 14. Alpha Go Second Approach: RL
  • 15. Alpha Go Second Approach: RL • It is not 2 NN vs Monte Carlo Tree Search • Is a better MCTS thanks to the NNs.
  • 16. Alpha Go Second Approach: RL • Optimal Value Function V*(s) “Determine the outcome of the game from every board position (s is the state)”. Brute force solution is impossible: Chess: 35 ** 80 Go: 250 ** 150
  • 17. Alpha Go Second Approach: RL • Two solutions for reduce the effective search space: Truncate the tree subtree search: V(s) like V*(s) Reducing the breadth of the search with the policy: P(a|s) We MCTS rollout the moves choose by the policy function and evaluate with the optimal value function.
  • 19. AlphaGo Zero: Re-Evolution version • Just trained with Reinforcement Learning • Choose the less out different moves: u(s,a) • Just one neural network for policy and value. • Every time a search is done the neural network is retrained.
  • 20. AlphaGo Zero: Re-Evolution version • Human games was noisy and not reliable. • Don’t use rollouts for predict who will win.
  • 23. Alpha Zero: New Challenges AlphaGo Zero VS AlphaZero: • Binary outcome (win / loss) × expected outcome (including • 3 draws or potentially other outcomes) • Board positions transformed before passing to neural networks (by randomly selected rotation or redirection) × no data augmentation • Games generated by the best player from previous iterations (margin of 55 %) × continual update using the latest parameters (without the evaluation and selection steps) • Hyper-parameters tuned by Bayesian optimisation × reused the same hyper-parameters without game-specific tuning
  • 26. Takeaways RL is more than Atari Games and GO
  • 27. Takeaways AI discovery new ways to play. Think about new projects like proteins fold.
  • 28. Takeaways We’re living awesome times. Sharing AI papers, tools, models, etc. More than any time before.
  • 29. Takeaways As Ms Fei Fei said: “It’s about democratizing AI”
  • 30. Takeaways Watch this Documentary Film about Alpha Go: