© 2016. SNU CSE Biointelligence Lab., http://bi.snu.ac.kr
Introduction of Reinforcement Learning
곽동현
Biointelligence Lab, Seoul National University
Background
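• Approximating the Q function of conventional reinforcement learning with a DNN or CNN has recently become an active research direction, led by Google DeepMind.
• Recent work has shown remarkable results, playing Atari 2600 games and Go better than humans, and the approach is now also being applied to 3D games and robot control problems.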
What is AI? ML?
https://www.linkedin.com/pulse/deep-dive-venture-landscape-ai-ajit-nazre-rahul-garg-nazre
Various Fields with ML
https://www.linkedin.com/pulse/how-exceed-your-goals-2016-dr-travis-bradberry-1
Various Algorithms in ML
Function Approximation
http://arxiv.org/pdf/1411.4555.pdf
https://people.mpi-inf.mpg.de/~kkim/supres/supres.htm
What is Deep Learning?
Machine Learning
• Supervised Learning :
y = f(x)
• Unsupervised Learning :
x ~ p(x) , x = f(x)
• Reinforcement Learning :
??
Agent-Environment Interaction
• Objective : Maximize the expected sum of future rewards
• Algorithms
1) Planning : Dynamic Programming Based
2) Reinforcement Learning : Machine Learning Based
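The interaction loop and the objective above can be sketched in a few lines of Python. This is only an illustration and not part of the original slides: the `CorridorEnv` class, its reward values, the two example policies, and the discount factor `gamma` are all assumptions made for the example (the slide speaks of a sum of future rewards; discounting is one common way to define it).

```python
import random


class CorridorEnv:
    """Hypothetical toy environment: positions 0..4, goal at position 4."""

    def __init__(self, length=5):
        self.length = length
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # action is -1 (left) or +1 (right); the agent cannot leave the corridor
        self.state = min(max(self.state + action, 0), self.length - 1)
        done = self.state == self.length - 1
        reward = 1.0 if done else -0.1  # assumed reward scheme
        return self.state, reward, done


def run_episode(env, policy, gamma=0.9, max_steps=100):
    """Run one episode and return the discounted sum of rewards."""
    state = env.reset()
    discounted_return, discount = 0.0, 1.0
    for _ in range(max_steps):
        action = policy(state)                  # the agent acts
        state, reward, done = env.step(action)  # the environment responds
        discounted_return += discount * reward
        discount *= gamma
        if done:
            break
    return discounted_return


def random_policy(state):
    return random.choice([-1, 1])


def always_right(state):
    return 1


random.seed(0)
print("random policy:", run_episode(CorridorEnv(), random_policy))
print("always right :", run_episode(CorridorEnv(), always_right))
```

Planning and reinforcement learning both look for a policy that makes this return as large as possible. Planning assumes the environment's dynamics are known, whereas reinforcement learning has to estimate what it needs from sampled interaction such as `run_episode`.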
Example of Supervised Learning
Polynomial Curve Fitting
Trendline in Microsoft Excel 2007
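As a rough sketch of what a polynomial trendline computes, the following Python fits polynomial coefficients by least squares. It is not taken from the slides or from Excel: the function names `fit_polynomial` and `predict` and the sample data are made up for illustration, and the normal equations are solved with a small hand-written Gaussian elimination so that the example needs no external libraries.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit; returns coefficients [w0, w1, ..., w_degree]."""
    n = degree + 1
    # Normal equations A w = b with A[i][j] = sum(x^(i+j)) and b[i] = sum(y * x^i)
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]

    # Gaussian elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]

    # Back substitution
    w = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(a[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (b[i] - tail) / a[i][i]
    return w


def predict(w, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** i for i, c in enumerate(w))


# Made-up data lying exactly on y = 1 + 2x + 3x^2
xs = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
ys = [1 + 2 * x + 3 * x * x for x in xs]
w = fit_polynomial(xs, ys, degree=2)
print([round(c, 6) for c in w])
print(round(predict(w, 4.0), 6))
```

This is supervised learning in the sense of y = f(x): the pairs (x, y) are given, and the coefficients of f are chosen to minimize the squared error on them.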
Example of Unsupervised Learning
Clustering
http://www.frankichamaki.com/data-driven-market-segmentation-more-effective-marketing-to-segments-using-ai/
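The slide does not name a particular clustering algorithm, so the following is only one possible illustration: a minimal k-means sketch in Python. The function name `kmeans`, the sample points, and the initialization (evenly spaced picks from the data, chosen here to keep the example deterministic; random initialization is more common) are all assumptions.

```python
def kmeans(points, k, iterations=20):
    """Plain k-means on 2-D points; returns (centroids, labels)."""
    # Assumed deterministic initialisation: k evenly spaced data points
    centroids = [points[i * len(points) // k] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid
        for i, (x, y) in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda c: (x - centroids[c][0]) ** 2 + (y - centroids[c][1]) ** 2,
            )
        # Update step: each centroid moves to the mean of its members
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
    return centroids, labels


# Made-up data: two well-separated groups of three points each
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.3)]
centroids, labels = kmeans(points, k=2)
print(labels)
print(centroids)
```

No labels are supplied here, which is what makes the task unsupervised: the grouping is inferred from the structure of the points alone.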
Example of Reinforcement Learning
Videos
• A crawling robot: a Q-learning example
https://www.youtube.com/watch?v=2iNrJx6IDEo
• Deep Reinforcement Learning for Robotic Manipulation
https://youtu.be/ZhsEKTo7V04?t=1m27s
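For readers who want to see the Q-learning update rule in code, here is a minimal tabular sketch in Python. It is not the code behind either video: the five-state line world, the reward of 1 at the goal, and the hyperparameters `alpha`, `gamma`, and `epsilon` are assumptions chosen to keep the example small.

```python
import random

N_STATES = 5          # states 0..4 on a line; state 4 is the goal
ACTIONS = (-1, +1)    # index 0 = move left, index 1 = move right


def step(state, action_index):
    """Assumed toy dynamics: deterministic move, reward 1 only at the goal."""
    next_state = min(max(state + ACTIONS[action_index], 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy selection with random tie-breaking."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    best = max(q[state])
    return rng.choice([a for a, v in enumerate(q[state]) if v == best])


def q_learning(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # Q table: q[state][action]
    for _ in range(episodes):
        state = 0
        for _ in range(1000):  # step cap per episode
            action = choose_action(q, state, epsilon, rng)
            next_state, reward, done = step(state, action)
            # Q-learning update: move Q(s, a) toward r + gamma * max_a' Q(s', a')
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


q = q_learning()
greedy = ["left" if row[0] > row[1] else "right" for row in q[:-1]]
print(greedy)
for s, row in enumerate(q):
    print(s, [round(v, 3) for v in row])
```

The deep reinforcement learning work mentioned in the Background slide replaces the table `q` with a neural network that approximates the Q function, which is what makes large state spaces such as images tractable.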
THANK YOU

Editor's notes

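  1. In the classical taxonomy of AI, ML was originally one small part. ML was one of several ways of implementing AI, and deep learning sits inside ML as a very small portion of it. The trend has since shifted, and a large share of the areas proposed in AI are now studied through ML. As a result the field is moving toward something close to AI = ML, but many people still work only on classical AI, so saying this out loud could get you in trouble.
  2. Machine learning is a discipline that grew out of this vast range of fields. When you first study it, it therefore feels disorganized and hard to follow, so learning from good textbooks and seminars is essential early on.
  3. Each algorithm finds the decision boundary in a different way.
  4. In this way, a function f that produces an output for incoming data is found through learning rather than through explicit implementation.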