© 2019 The MathWorks, Inc.
How to Train your Robot
with Deep Reinforcement Learning
Lucas García, PhD
Senior Application Engineer
MathWorks
@mathinking
Did you know that more neurons get activated in your brain when you walk than when you play a game of chess?
How would you build an AI that could walk?
Credit: Tom Buehler / MIT CSAIL
Credit: Erico Guizzo/IEEE Spectrum
Thanks to: Aditya Baru, Sebastian Castro, Brian Douglas, John Glass, Carlos Sanchis, Emmanouil Tzorakoleftherakis and others.
The goal of control
A walking robot – the traditional way
[Diagram: Observations (Camera Data, Sensors) → Feature Extraction → State Estimation → Control System (Balance, Leg & Trunk Trajectories, Motor Control) → Motor Commands]
A walking robot – the alternative approach
[Diagram: the Feature Extraction, State Estimation, and Control System blocks are replaced by a single block. Observations (Camera Data, Sensors) → Black Box Controller → Motor Commands]
What is Reinforcement Learning?
Reinforcement learning is learning what to do—how to map
situations to actions—so as to maximize a numerical reward signal.
The learner is not told which actions to take, but instead must
discover which actions yield the most reward by trying them.
Sutton and Barto, Reinforcement Learning: An Introduction
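The "discover which actions yield the most reward by trying them" idea can be illustrated with a toy example. The sketch below is not from the talk: it assumes a hypothetical three-action problem with noisy rewards and an epsilon-greedy rule (the names `run_bandit`, `true_means`, and `epsilon` are illustrative choices).

```python
import random

def run_bandit(true_means, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy trial and error on a toy multi-armed bandit (illustrative)."""
    rng = random.Random(seed)
    n_actions = len(true_means)
    estimates = [0.0] * n_actions   # running estimate of each action's reward
    counts = [0] * n_actions        # how often each action has been tried
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a random action.
            action = rng.randrange(n_actions)
        else:
            # Exploit: pick the action that currently looks best.
            action = max(range(n_actions), key=lambda a: estimates[a])
        # The learner only sees a noisy numerical reward, never the "right" action.
        reward = rng.gauss(true_means[action], 1.0)
        counts[action] += 1
        # Incremental mean update of the reward estimate for this action.
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates, counts

estimates, counts = run_bandit([0.1, 1.5, 0.5])
most_tried = max(range(len(counts)), key=lambda a: counts[a])
print("action tried most often:", most_tried)
```

The learner is never told which action is best; it only accumulates reward estimates and gradually concentrates its choices on the action that pays off.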
Reinforcement Learning Applications
video games
autonomous vehicles
robotics controls
Some Reinforcement Learning Terminology
Reinforcement Learning Workflow
Environment
▪ Everything outside of an agent
$X, Y, Z, \psi, \theta, \phi$
$q_{R1} \dots q_{RN}$
$q_{L1} \dots q_{LN}$
+ derivatives
$F_R, F_L$
$\tau_{R1} \dots \tau_{RN}$
$\tau_{L1} \dots \tau_{LN}$
Environment - Simulink
Reinforcement Learning Workflow
Reward
A function that outputs a scalar number that represents the "goodness" of
an agent being in a particular state and taking a particular action.
Crafting the Reward
$r_t = -\,50\,(z - z_0)^2$
$r_t = +\,25\,\frac{T_s}{T_f}$
$r_t = +\,v_x$
$r_t = -\,3y^2$
$r_t = -\,0.02 \sum_{i=1}^{N} \left(\tau_{Ri}^2 + \tau_{Li}^2\right)$
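One way to read the reward terms on this slide is as contributions that are summed into a single scalar per time step. The sketch below makes that assumption explicit. It also assumes the survival bonus is 25·Ts/Tf (sample time over episode duration), that `z0` is the desired height, and that the torque lists hold one entry per joint; the function and argument names are hypothetical.

```python
def walking_reward(v_x, y, z, z0, torques_right, torques_left, Ts, Tf):
    """Sum of the slide's reward terms for one time step (assumed to be additive)."""
    forward = v_x                      # reward forward velocity
    drift = -3.0 * y ** 2              # penalize lateral drift
    height = -50.0 * (z - z0) ** 2     # penalize deviation from the desired height
    alive = 25.0 * Ts / Tf             # small bonus for each step without falling (assumed Ts/Tf)
    effort = -0.02 * sum(              # penalize squared joint torques on both legs
        tr ** 2 + tl ** 2 for tr, tl in zip(torques_right, torques_left)
    )
    return forward + drift + height + alive + effort

# Walking straight at 1 m/s, at the desired height, with zero torque:
r = walking_reward(1.0, 0.0, 0.5, 0.5, [0.0, 0.0, 0.0], [0.0, 0.0, 0.0], Ts=0.025, Tf=10.0)
print(r)
```

Each coefficient trades off one behavior against the others, which is why the slide title speaks of crafting the reward.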
Reinforcement Learning Workflow
The Agent
▪ Policy: function that maps observations to actions
▪ Reinforcement Learning Algorithm: optimization method used to find the optimal policy
The Policy
Tells the agent which actions to take given the current state
▪ reward: the instantaneous benefit of being in a state and taking a specific action
▪ value: the total reward an agent expects to receive from a state and onwards into the future
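The difference between reward and value can be made concrete with a short calculation. The sketch below assumes a discount factor `gamma` (not mentioned on the slide) so that rewards further in the future count slightly less; `discounted_return` is a hypothetical helper name.

```python
def discounted_return(rewards, gamma=0.99):
    """Total (discounted) reward collected from the first step onwards."""
    total = 0.0
    # Work backwards so each step adds its own reward plus the discounted future.
    for r in reversed(rewards):
        total = r + gamma * total
    return total

rewards = [1.0, 1.0, 1.0]                     # instantaneous rewards along one trajectory
print(discounted_return(rewards, gamma=0.5))  # 1 + 0.5 * (1 + 0.5 * 1) = 1.75
```

A value function estimates the expectation of this quantity over the trajectories the agent is likely to follow from a given state.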
It’s not feasible to try every possible action!
The Policy – Actor-Critic
▪ Actor: chooses an action given the current state
▪ Critic: predicts the value of that state and action
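As a rough illustration of how the two roles interact, the sketch below uses linear functions in place of the deep networks discussed in the talk. It assumes a deterministic actor, a critic that scores state-action pairs, a one-step temporal-difference update for the critic, and a gradient step on the actor that follows the critic's preference. All names, the discount factor, and the learning rates are illustrative and are not taken from the slides.

```python
def actor(actor_w, state):
    """Linear actor: maps a state to one action value per weight row."""
    return [sum(w * s for w, s in zip(row, state)) for row in actor_w]

def critic(critic_w, state, action):
    """Linear critic: predicts the value of taking `action` in `state`."""
    features = list(state) + list(action)
    return sum(w * f for w, f in zip(critic_w, features))

def td_error(critic_w, actor_w, state, action, reward, next_state, gamma=0.99):
    """One-step temporal-difference error: target value minus current prediction."""
    next_action = actor(actor_w, next_state)
    target = reward + gamma * critic(critic_w, next_state, next_action)
    return target - critic(critic_w, state, action)

def critic_update(critic_w, state, action, delta, lr=0.01):
    """Move the critic's prediction toward the TD target (semi-gradient step)."""
    features = list(state) + list(action)
    return [w + lr * delta * f for w, f in zip(critic_w, features)]

def actor_update(actor_w, critic_w, state, lr=0.01):
    """Nudge the actor toward actions the critic rates higher."""
    # For a linear critic, dQ/da_i is the critic weight on action component i.
    dq_da = critic_w[len(state):]
    return [[w + lr * g * s for w, s in zip(row, state)]
            for row, g in zip(actor_w, dq_da)]

state, next_state = [1.0, 0.0], [0.0, 1.0]
actor_w, critic_w = [[0.5, 0.0]], [0.0, 0.0, 0.0]
action = actor(actor_w, state)
delta = td_error(critic_w, actor_w, state, action, 1.0, next_state)
critic_w = critic_update(critic_w, state, action, delta)
actor_w = actor_update(actor_w, critic_w, state)
```

In practice both functions would be neural networks trained on many transitions; the structure of the two updates stays the same.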
Reinforcement Learning Workflow
Training our Deep Reinforcement Learning Agent
▪ Accelerate training by running simulations in parallel on multicore computers, clusters or the cloud
▪ Train on the GPU when using Deep Neural Networks for Actor or Critic representations
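The idea of running simulations in parallel can be sketched generically. This is not the MATLAB workflow from the talk: it assumes a hypothetical `simulate_episode` function that stands in for one simulation run, and it uses a thread pool only to keep the example self-contained. CPU-bound simulations would normally be spread over separate processes or machines.

```python
import random
from concurrent.futures import ThreadPoolExecutor

def simulate_episode(seed, steps=100):
    """Stand-in for one simulation run: returns the episode's total reward."""
    rng = random.Random(seed)          # independent random stream per episode
    return sum(rng.uniform(-1.0, 1.0) for _ in range(steps))

def collect_returns(seeds, max_workers=4):
    """Run several independent episodes concurrently and gather their returns."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves the order of `seeds` in the results.
        return list(pool.map(simulate_episode, seeds))

returns = collect_returns(range(8))
print(len(returns), "episodes collected")
```

Because each episode is independent, the collected results can be pooled into one batch for the learning algorithm once all workers finish.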
Reinforcement Learning Workflow
Deploy policy to the target hardware
Automatically generate C/C++ or CUDA code to run the policy on an embedded system
Key takeaways
▪ Reinforcement Learning can solve complicated problems
▪ Deep Neural Networks can handle continuous or high-dimensional state and action spaces
▪ MATLAB and Simulink provide a complete workflow for Deep Reinforcement Learning
Can’t wait to play with it? Visit our booth!
Code: github.com/mathworks/msra-walking-robot
Download MATLAB: mathworks.com/matlab-bigth19
Credit: DLR / MathWorks
Learn more
What will Your Next AI look like?