Deep Learning &
Reinforcement Learning MILA
Summer School Highlights
Natalia Díaz Rodríguez, PhD 26th June-5th July 2017
Montreal, Quebec
Learning to Learn - Nando de Freitas
• What is the intrinsic motivation that brings us
here? Learning, and the satisfaction of gaining
knowledge
• From the Bengio brothers (’92) to
github.com/deepmind/learning-to-learn
• 1 single network: optimiser & optimizee
• Generalize: learning to learn X by doing
Y (unsupervised learning via supervised learning)
(Task-oriented) Language grounding
Related: language grounding
Rich Sutton: TD-learning
NdF’17
Does not scale to large numbers of actions
Related: Satinder Singh’s RL Talk
Language grounding via
instruction-guided RL
Automatic differentiation: the new trend across all
DL frameworks
• Matt Johnson’s great tutorial on Automatic
Differentiation
• IDEA: checkpointing and less config
boilerplate code
• Becoming standard:
• TensorFlow Eager
• PyTorch taping
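To make the "taping" idea concrete, here is a minimal sketch of reverse-mode automatic differentiation on scalars. It is not the TensorFlow or PyTorch implementation; the `Var` class and the `sin` and `backward` helpers are hypothetical names chosen for illustration. Each operation records its inputs and local derivatives, and the backward pass replays that record in reverse.

```python
import math


class Var:
    """Scalar that remembers how it was computed (a tiny 'tape')."""

    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents  # tuple of (parent Var, local derivative)
        self.grad = 0.0

    def __add__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))


def sin(x):
    # d/dx sin(x) = cos(x)
    return Var(math.sin(x.value), ((x, math.cos(x.value)),))


def backward(output):
    """Accumulate d(output)/d(node) into node.grad for every recorded node."""
    order, seen = [], set()

    def visit(node):
        if id(node) in seen:
            return
        seen.add(id(node))
        for parent, _ in node.parents:
            visit(parent)
        order.append(node)

    visit(output)
    output.grad = 1.0
    # Reverse topological order: a node is finished before its parents.
    for node in reversed(order):
        for parent, local in node.parents:
            parent.grad += node.grad * local


x = Var(2.0)
y = Var(3.0)
z = x * y + sin(x)  # dz/dx = y + cos(x), dz/dy = x
backward(z)
```

After `backward(z)`, `x.grad` and `y.grad` hold the two partial derivatives without any symbolic manipulation or finite differences.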
Graphical models and DL: a powerful
combination - Matt Johnson (Google)
GANs are sexy
CycleGAN Zhu’17
GANs state-of-the-art
• Applications: image generation, attribute morphing, image inpainting…
• State-of-the-art
• BEGAN*, CycleGAN (draw a bag and find a real one)
• Unsupervised Pixel-Level Domain Adaptation with Generative
Adversarial Networks, Bousmalis’16 (CVPR’17, code to appear): an
unsupervised GAN-based architecture that learns a transformation
between two domains without using corresponding pairs.
• Advantages it claims over previous state-of-the-art approaches:
• Decoupling from the task-specific architecture
• Generalization across label spaces
• Training stability
• Data augmentation
* BEGAN: a fast and stable boundary equilibrium enforcing method, paired with a loss derived from the Wasserstein distance, for
training auto-encoder based GANs
CycleGAN
k-NN is still one of the most commonly used
quantitative measures for unsupervised
evaluation
Bousmalis’16
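As an illustration of k-NN as a quantitative measure for learned representations, the following sketch computes leave-one-out k-NN accuracy on a handful of embeddings. The toy 2-D points, the labels and the `knn_accuracy` helper are made up for this example and do not come from any of the cited papers.

```python
from collections import Counter


def knn_accuracy(embeddings, labels, k=3):
    """Leave-one-out k-NN accuracy: classify each point from its k nearest others."""
    correct = 0
    for i, query in enumerate(embeddings):
        distances = []
        for j, other in enumerate(embeddings):
            if i == j:
                continue  # never let a point vote for itself
            squared = sum((a - b) ** 2 for a, b in zip(query, other))
            distances.append((squared, labels[j]))
        distances.sort(key=lambda pair: pair[0])
        votes = Counter(label for _, label in distances[:k])
        if votes.most_common(1)[0][0] == labels[i]:
            correct += 1
    return correct / len(embeddings)


# Toy embeddings: two well-separated clusters (illustrative data only).
embeddings = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
              (1.0, 1.0), (1.1, 1.0), (1.0, 1.1), (1.1, 1.1)]
labels = ["a"] * 4 + ["b"] * 4
accuracy = knn_accuracy(embeddings, labels, k=3)
```

A high score means that points with the same label sit close together in the embedding space, which is why the measure is popular when no task-specific classifier is available.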
GANs help semi-supervised and unsupervised
learning as well as domain randomisation
• CVAE-GAN: fine-grained category image
generation.
CVAE-GAN: Fine-Grained Image Generation through Asymmetric Training, Bao’17
GAN mode collapse: the generator fails to produce a varied
distribution of data
CVAE-GAN: Fine-Grained Image Generation through Asymmetric Training, Bao’17
One/Few-shot learning
• Extending siamese networks to one-shot learning: Siamese
Neural Networks for One-shot Image Recognition.
One Shot Learning with Siamese Networks in PyTorch – Harshvardhan Gupta – Medium
• Black-Box Data-efficient Policy Search for Robotics,
Mouret’17 (Gaussian process regression for policy
optimisation using model-based policy search). 5
episodes are enough to learn the whole dynamics of
the arm from scratch.
• If you can’t predict
reward, predict a
relative ordering
rank (same vs
different)
• Siamese network:
optimize all rankings
simultaneously
• Natural language embedding into
multidimensional space really helps learning
(humans ALWAYS learn language)
• Physics and bodies provide essential
consistency for understanding intelligence, and
facilitate transfer and continuous learning
• Solving many tasks helps: sometimes many
tasks are essential to learn at all [Learning more
things at once often helps performance in RL.
Intentional unintentional agents]
• Reporting failure cases is also important!
Take Home
Messages
[NdF]
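The "predict a relative ordering" message can be sketched with a pairwise ranking loss. This is a simplified stand-in, not the model from the talk: the two branches of the "siamese" pair share one linear scoring function, the loss is the logistic loss on the score difference, and all pairs are optimised together in each step. The items, feature values and function names are hypothetical.

```python
import math


def score(weights, features):
    """Shared scoring function applied to both members of a pair."""
    return sum(w * x for w, x in zip(weights, features))


def train_ranker(pairs, dim, epochs=200, lr=0.1):
    """Gradient descent on the mean of log(1 + exp(-(s_better - s_worse)))."""
    weights = [0.0] * dim
    for _ in range(epochs):
        grad = [0.0] * dim
        for better, worse in pairs:
            margin = score(weights, better) - score(weights, worse)
            # Derivative of log(1 + exp(-margin)) with respect to margin.
            coeff = -1.0 / (1.0 + math.exp(margin))
            for i in range(dim):
                grad[i] += coeff * (better[i] - worse[i])
        weights = [w - lr * g / len(pairs) for w, g in zip(weights, grad)]
    return weights


# Illustrative items listed from worst to best; only the ordering is used.
items = [(0.0, 1.0), (1.0, 0.5), (2.0, 0.0)]
pairs = [(items[j], items[i])
         for i in range(len(items)) for j in range(i + 1, len(items))]
weights = train_ranker(pairs, dim=2)
ranked = sorted(items, key=lambda item: score(weights, item), reverse=True)
```

No absolute reward value is ever supplied: the model only sees which item of each pair should rank higher.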
• TD-learning is back & hot (dating back to the
first games won by the TD-Gammon AI)*
• Only 1 reward at the end
• No feedback along the way
• New venue: Int’l conference on RL and
decision making (RLDM) https://groups.google.com/forum/#!forum/rldm-list
* See the unsupervised representation learning talk by R. Sutton and the latest
DeepMind work (Mnih’17 evolution of UNREAL)
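A minimal sketch of tabular TD(0) in the setting described above: a single reward at the end and no feedback along the way. The deterministic five-state chain, the step size and the discount factor are assumptions chosen so the result is easy to check by hand; they are not taken from the talks.

```python
def td0_chain(num_states=5, episodes=200, alpha=0.5, gamma=0.9):
    """TD(0) value estimation on a chain that always moves one state to the right.

    The only reward (1.0) arrives when the last state is left, ending the episode.
    """
    values = [0.0] * num_states
    for _ in range(episodes):
        for state in range(num_states):
            last = state == num_states - 1
            reward = 1.0 if last else 0.0
            next_value = 0.0 if last else values[state + 1]
            # TD error: bootstrapped one-step target minus the current estimate.
            td_error = reward + gamma * next_value - values[state]
            values[state] += alpha * td_error
    return values


values = td0_chain()
```

Even though only the final transition is rewarded, bootstrapping propagates the signal backwards, so the estimate for a state that is n steps before the rewarded transition approaches `gamma ** n`.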
Take Home
Messages
• Domain randomization: used to transfer
learning from simulation to real life without
domain adaptation (OpenAI; NVIDIA cube
pose estimation: distractors and varied
backgrounds, lights and virtual elements,
transferring to real images).
• Learning by demonstration and few-shot
learning: the most data-efficient learning
algorithms for semi-supervised learning
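Domain randomization, as summarised above, amounts to sampling the nuisance factors of each simulated scene at random so that the real world looks like just another variation. The sketch below only shows the sampling step; every parameter name and range is a hypothetical placeholder, not a value used by OpenAI or NVIDIA.

```python
import random


def sample_scene(rng):
    """Draw one randomized simulation configuration (illustrative parameters)."""
    return {
        "light_intensity": rng.uniform(0.2, 2.0),
        "background_id": rng.randrange(100),      # index into a texture pool
        "num_distractors": rng.randint(0, 10),    # random objects added to the scene
        "camera_jitter_deg": rng.uniform(-5.0, 5.0),
    }


rng = random.Random(0)  # seeded so the generated dataset is reproducible
scenes = [sample_scene(rng) for _ in range(1000)]
```

Each sampled dictionary would then be handed to the simulator's renderer to produce one training image.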
Take Home
Messages
• Regularizing NNs by penalising confident
output distributions [Pereyra’17].
• Additional objectives (similar to UNREAL):
RL with Unsupervised Auxiliary Tasks
[Jaderberg’17]
• Generating grounded rewards
automatically [Littman, Topcu et al.’17].
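The confidence penalty in the first bullet can be sketched as the usual negative log-likelihood minus a weighted entropy term, so that low-entropy (over-confident) predictions are penalised. The weight `beta=0.1` and the function names are illustrative assumptions, not values from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    largest = max(logits)
    exps = [math.exp(z - largest) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def confidence_penalised_loss(logits, target, beta=0.1):
    """Negative log-likelihood minus beta times the output entropy."""
    probs = softmax(logits)
    nll = -math.log(probs[target])
    return nll - beta * entropy(probs)


confident = confidence_penalised_loss([8.0, 0.0, 0.0], target=0)
uncertain = confidence_penalised_loss([1.0, 0.0, 0.0], target=0)
```

With `beta=0` the function reduces to plain cross-entropy; a larger `beta` rewards predictions that keep some probability mass on the other classes.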
Take Home Papers
*Reinforcement Learning with Unsupervised Auxiliary Tasks - Implementation: https://github.com/miyosuda/unreal
**Option: a generalisation of a single-step action that may span more than 1 timestep and can be used as a
standard action. The policy over options mu selects option o in state s with probability mu(s,o). We can derive a policy over options
Pi_omega that maximises the expected discounted (via regrets) sum of rewards.
•Two parallel DeepMind works: Relational Networks and Visual Interaction
Networks (philosophically similar works using abstract logic to reason
about the world).
•Dealing with sparse rewards:
•Reward shaping: Off-Policy Reward Shaping with Ensembles:
https://arxiv.org/abs/1502.03248 and Expressing Arbitrary Reward Functions
as Potential-Based Advice:
https://www.aaai.org/ocs/index.php/AAAI/AAAI15/paper/viewFile/9893/9923
•http://papers.nips.cc/paper/6538-safe-and-efficient-off-policy-reinf
 https://ai.vub.ac.be/sites/default/files/PID3130853.pdf
•Reinforcement Learning from Demonstration through Shaping
•Non-Markovian Rewards Expressed in LTL: Guiding Search Via
Reward Shaping. A. Camacho, et al. (RLDM), June 2017
•https://arxiv.org/pdf/1706.10295.pdf
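For the reward-shaping entries above, a minimal sketch of the classic potential-based form may help: the shaped reward adds `gamma * potential(next_state) - potential(state)` to the environment reward. The 1-D states, the goal position and the distance-based potential are assumptions made for this example; the cited papers use more elaborate constructions.

```python
def potential(state, goal=10):
    """Illustrative potential: negative distance to a goal on a 1-D line."""
    return -abs(goal - state)


def shaped_reward(reward, state, next_state, gamma=0.99):
    """Environment reward plus the potential-based shaping term."""
    return reward + gamma * potential(next_state) - potential(state)


gamma = 0.99
trajectory = [0, 1, 2, 1, 2, 3]  # visited states, with zero environment reward

# Discounted sum of the shaping terms along the trajectory.
bonus = sum(
    (gamma ** t) * shaped_reward(0.0, s, s_next, gamma)
    for t, (s, s_next) in enumerate(zip(trajectory, trajectory[1:]))
)

# The sum telescopes: only the first and last potentials remain.
steps = len(trajectory) - 1
expected = (gamma ** steps) * potential(trajectory[-1]) - potential(trajectory[0])
```

Because the intermediate potentials cancel, the shaping term gives dense feedback at every step without changing which behaviour is ultimately best, which is what makes it attractive for sparse-reward problems.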
Take Home Papers
•GANs:
•Allan Ma (Guelph): state-of-the-art GAN implementation +
evaluation.
•GANs used to perform domain adaptation (useful
ideas to go from simulated robots to
real-world robots)
•LANGUAGE GROUNDING AND VISUAL/DIALOG
HYBRID SYSTEMS (Ideas for PARL.AI grant call):
End-to-end optimization of goal-driven and visually
grounded dialogue systems
Take Home Papers
• Dex-Net grasping dataset (10K 3D models to acquire
force-closure grasps, for the ABB YuMi)
• ROS service for grasp planning. Dex-Net as a Service: Fall
2017. HTTP web API to create new databases with custom 3D
models and compute grasp robustness metrics.
• Google robot farm dataset: many robot arms for grasping,
pushing, etc. 800,000 grasp attempts (6-14 robotic
manipulators)
• Using Baxter:
• Pinto and Gupta Baxter dataset (40k grasping experiences).
CNNs predict lifting success or resistance to grasp perturbations
caused by an adversary*.
• Oberlin’15: autonomously collecting object scans
Take Home Datasets
*Lerrel Pinto and Abhinav Gupta. Supersizing self-supervision: Learning to grasp from 50k tries and 700 robot hours. In Proc. IEEE
Int. Conf. Robotics and Automation (ICRA), 2016.
Lerrel Pinto, James Davidson, and Abhinav Gupta. Supervision via competition: Robot adversaries for learning tasks. arXiv preprint
arXiv:1610.01685, 2016.
Food for thought
• Is AI = DL + RL? (Hado van Hasselt)
• Does the brain do backpropagation?
• Even if the brain does not do back-propagation the way
ANNs do, there is no mathematical obstacle that
rules it out
• CNNs and LSTMs: successful ubiquitous AI
models inspired by the human brain
• :( Neuroscience is still far removed from the AI community
Keyword Summary
• GANs as data augmentation
(CycleGAN, BEGAN,…)
• Autoregressive models (PixelGAN)
• Embedding language and vision
representations
•End-to-end
•Self-supervision
•Learning by:
•Imitation*, cloning, demonstration and by predicting the
future (natural learning)
•One-shot learning
•Reward shaping and other myriad signals
•TD-learning
•Options framework
* E.g. Imitating Driver Behavior with Generative Adversarial Networks https://arxiv.org/pdf/1701.06699.pdf
Grants and competitions
• https://nips.cc/Conferences/2017/CompetitionTrack
Learning to run
Papers right out of the oven
[PDF] End-to-End Learning of Semantic Grasping
E Jang, S Vijaynarasimhan, P Pastor, J Ibarz, S Levine - arXiv preprint arXiv: …, 2017
Abstract: We consider the task of semantic robotic grasping, in which a robot picks up an
object of a user-specified class using only monocular images. Inspired by the two-stream
hypothesis of visual reasoning, we present a semantic grasping framework that learns object
[PDF] Imitation from Observation: Learning to Imitate Behaviors from Raw Video via
Context Translation
YX Liu, A Gupta, P Abbeel, S Levine - arXiv preprint arXiv:1707.03374, 2017
Abstract: Imitation learning is an effective approach for autonomous systems to acquire
control policies when an explicit reward function is unavailable, using supervision provided
as demonstrations from an expert, typically a human operator. However, standard imitation
[PDF] Vision-Based Multi-Task Manipulation for Inexpensive Robots Using End-To-End
Learning from Demonstration
R Rahmatizadeh, P Abolghasemi, L Bölöni, S Levine - arXiv preprint arXiv: …, 2017
Papers right out of the oven
Limitations:
• Requires a substantial number of demonstrations to learn the
translation model.
• Requires observations of demonstrations from multiple
contexts in order to learn to translate between them.
Insights:
• Training an end-to-end model from scratch for each task may
be inefficient in practice
• Combining our method with higher level representations
proposed in prior work would likely lead to more efficient
training (Sermanet et al., 2017).
• Challenge: Domain shift: combine multiple tasks from different
contexts into a single model
Papers right out of the oven
• REINFORCEMENT LEARNING WITH
UNSUPERVISED AUXILIARY TASKS
(UNREAL, and its extension Mnih’17)
• Auxiliary control and reward prediction
tasks in deep RL double data efficiency
& robustness to hyperparameter settings.
• Successor to A3C, improving on its learning speed and
robustness (over 87% of human scores)
• Slides
• TensorFlow Session
• Github Project Tutorial
• TensorFlow Installation Notes
• Theano Session Tutorial
RESOURCES
Thank you!
natalia.diaz@ensta-paristech.fr
@NataliaDiazRodr
www.linkedin.com/in/nataliadr
Appendix
AI safety
Using relational properties in our priors?
•Neural-symbolic (Knowledge Graph) learning
and reasoning
Relational Networks (Santoro’17) and Visual Interaction Networks (Watters’17)
Philosophically similar models using abstract logic to reason about the world
Interpreting unsupervised representations
•Understanding intermediate layers using linear
classifier probes. Alain and Bengio’16
https://arxiv.org/pdf/1610.01644.pdf
•Explaining the Unexplained: A CLass-Enhanced
Attentive Response (CLEAR) Approach to
Understanding Deep Neural Networks, Kumar et
al.’17. https://arxiv.org/pdf/1704.04133.pdf
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...Sérgio Sacani
 
Zoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdfZoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdfSumit Kumar yadav
 
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...Sérgio Sacani
 
Forensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdfForensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdfrohankumarsinghrore1
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTSérgio Sacani
 
Botany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdfBotany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdfSumit Kumar yadav
 
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bAsymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bSérgio Sacani
 
GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)Areesha Ahmad
 
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...anilsa9823
 
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPirithiRaju
 
Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )aarthirajkumar25
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSarthak Sekhar Mondal
 
Botany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdfBotany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdfSumit Kumar yadav
 
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.Nitya salvi
 
Chromatin Structure | EUCHROMATIN | HETEROCHROMATIN
Chromatin Structure | EUCHROMATIN | HETEROCHROMATINChromatin Structure | EUCHROMATIN | HETEROCHROMATIN
Chromatin Structure | EUCHROMATIN | HETEROCHROMATINsankalpkumarsahoo174
 
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bSérgio Sacani
 
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service 🪡
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service  🪡CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service  🪡
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service 🪡anilsa9823
 
Biopesticide (2).pptx .This slides helps to know the different types of biop...
Biopesticide (2).pptx  .This slides helps to know the different types of biop...Biopesticide (2).pptx  .This slides helps to know the different types of biop...
Biopesticide (2).pptx .This slides helps to know the different types of biop...RohitNehra6
 
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Lokesh Kothari
 

Kürzlich hochgeladen (20)

Nanoparticles synthesis and characterization​ ​
Nanoparticles synthesis and characterization​  ​Nanoparticles synthesis and characterization​  ​
Nanoparticles synthesis and characterization​ ​
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
 
Zoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdfZoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdf
 
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
 
Forensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdfForensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdf
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOST
 
Botany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdfBotany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdf
 
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bAsymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
 
GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)
 
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
 
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
 
Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
 
Botany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdfBotany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdf
 
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
 
Chromatin Structure | EUCHROMATIN | HETEROCHROMATIN
Chromatin Structure | EUCHROMATIN | HETEROCHROMATINChromatin Structure | EUCHROMATIN | HETEROCHROMATIN
Chromatin Structure | EUCHROMATIN | HETEROCHROMATIN
 
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
 
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service 🪡
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service  🪡CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service  🪡
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service 🪡
 
Biopesticide (2).pptx .This slides helps to know the different types of biop...
Biopesticide (2).pptx  .This slides helps to know the different types of biop...Biopesticide (2).pptx  .This slides helps to know the different types of biop...
Biopesticide (2).pptx .This slides helps to know the different types of biop...
 
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
 

MILA DL & RL summer school highlights

  • 1. Deep Learning & Reinforcement Learning MILA Summer School Highlights Natalia Díaz Rodríguez, PhD 26th June-5th July 2017 Montreal, Quebec
  • 2. Learning to Learn - Nando de Freitas
      • What is our intrinsic motivation for being here? Learning, and the satisfaction of gaining knowledge
      • From the Bengio brothers (’92) to GitHub.com/deepmind/learning-to-learn
      • One single network: optimiser & optimizee
      • Generalize: learning to learn X by doing Y (unsupervised by supervised learning)
  • 12. Does not scale to large numbers of actions
  • 16. Automatic differentiation: the new trend across all DL frameworks
      • Matt Johnson's great tutorial on automatic differentiation
      • IDEA: checkpointing and less config boilerplate code
      • Becoming standard:
          • TensorFlow Eager
          • PyTorch taping
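To make the "taping" idea concrete, here is a minimal sketch of tape-based reverse-mode automatic differentiation. It is only an illustration: the `Tape`, `Var` and `backward` names are hypothetical, and this is not how TensorFlow Eager or PyTorch are implemented internally. Each operation is recorded on a tape as it runs, and the gradient is obtained by replaying the tape in reverse.

```python
import math

class Tape:
    """Records each operation so gradients can be replayed in reverse."""
    def __init__(self):
        self.ops = []  # list of (output Var, [(input Var, local gradient)])

class Var:
    """A scalar value attached to a tape (hypothetical helper class)."""
    def __init__(self, value, tape):
        self.value = value
        self.grad = 0.0
        self.tape = tape

    def _record(self, value, parents):
        out = Var(value, self.tape)
        self.tape.ops.append((out, parents))
        return out

    def __add__(self, other):
        return self._record(self.value + other.value,
                            [(self, 1.0), (other, 1.0)])

    def __mul__(self, other):
        return self._record(self.value * other.value,
                            [(self, other.value), (other, self.value)])

    def sin(self):
        return self._record(math.sin(self.value),
                            [(self, math.cos(self.value))])

def backward(output):
    """Walk the tape backwards, accumulating d(output)/d(var) into var.grad."""
    output.grad = 1.0
    for out, parents in reversed(output.tape.ops):
        for parent, local_grad in parents:
            parent.grad += out.grad * local_grad

tape = Tape()
x = Var(2.0, tape)
y = Var(3.0, tape)
z = x * y + x.sin()   # z = x*y + sin(x)
backward(z)
# Analytically: dz/dx = y + cos(x) and dz/dy = x
```

Because the tape is built while the program executes, ordinary Python control flow needs no separate graph definition, which is one way to read the "less config boilerplate" point above.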
  • 17. Graphical models and DL: a powerful combination - Matt Johnson (Google)
  • 19. GANs state-of-the-art
      • Applications: image generation, attribute morphing, image inpainting…
      • State-of-the-art:
          • BEGAN*, CycleGAN (draw a bag and find a real one)
          • Unsupervised Pixel-Level Domain Adaptation with Generative Adversarial Networks, Bousmalis’16 (unsupervised GAN-based architecture able to learn a transformation without using corresponding pairs from the two domains; code to appear, CVPR’17).
      • The best state-of-the-art approach, improving on:
          • Decoupling from the task-specific architecture
          • Generalization across label spaces
          • Achieving training stability
          • Data augmentation
      * BEGAN: fast and stable; a new boundary equilibrium enforcing method paired with a loss derived from the Wasserstein distance for training auto-encoder-based GANs.
      [Figure: CycleGAN]
  • 21. KNN is still one of the most frequently used quantitative measures for unsupervised evaluation (Bousmalis’16)
  • 22. GANs help semi-supervised and unsupervised learning, as well as domain randomisation
  • 23. GANs mode collapse: inability to generate a varied distribution of data
      • CVAE-GAN: fine-grained category image generation (CVAE-GAN: Fine-Grained Image Generation through Asymmetric Training, Bao’17)
  • 24. CVAE-GAN: Fine-Grained Image Generation through Asymmetric Training, Bao’17
  • 25. One/Few-shot learning
      • Extending siamese networks with one-shot learning: Siamese Neural Networks for One-shot Image Recognition. See also: One Shot Learning with Siamese Networks in PyTorch (Harshvardhan Gupta, Medium).
      • Black-Box Data-efficient Policy Search for Robotics, Mouret’17 (Gaussian process regression for policy optimisation using model-based policy search). 5 episodes are enough to learn the whole dynamics of the arm from scratch.
  • 27.
      • If you can’t predict the reward, predict a relative ordering (rank), e.g. same vs different
      • Siamese network: optimize all rankings simultaneously
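The ranking idea above can be sketched as follows. This is a minimal illustration under stated assumptions, not the method from the talk: a single linear scoring function is shared by both items of a pair (the "siamese" part), a logistic pairwise loss is used, and the toy data and the names `score`, `pairwise_loss_and_grad` and `train` are all hypothetical.

```python
import math

def score(w, x):
    """Shared scoring function applied to both items of a pair."""
    return sum(wi * xi for wi, xi in zip(w, x))

def pairwise_loss_and_grad(w, better, worse):
    """Logistic ranking loss -log(sigmoid(score(better) - score(worse)))."""
    margin = score(w, better) - score(w, worse)
    p = 1.0 / (1.0 + math.exp(-margin))   # P(better is ranked above worse)
    loss = -math.log(p)
    grad = [-(1.0 - p) * (b - c) for b, c in zip(better, worse)]
    return loss, grad

def train(pairs, dim, lr=0.5, epochs=200):
    """Batch gradient descent: every ranking pair contributes to each update."""
    w = [0.0] * dim
    for _ in range(epochs):
        total = [0.0] * dim
        for better, worse in pairs:
            _, g = pairwise_loss_and_grad(w, better, worse)
            total = [t + gi for t, gi in zip(total, g)]
        w = [wi - lr * t / len(pairs) for wi, t in zip(w, total)]
    return w

# Hypothetical toy data: the hidden ordering follows the first feature.
items = [[0.1, 0.5], [0.4, 0.4], [0.7, 0.6], [0.9, 0.5]]
pairs = [(items[j], items[i]) for i in range(4) for j in range(4) if j > i]
w = train(pairs, dim=2)
ranked = sorted(items, key=lambda x: score(w, x))  # lowest to highest score
```

No absolute reward value is ever predicted; only the relative order of the two items in each pair enters the loss.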
  • 28. Take Home Messages [NdF]
      • Embedding natural language into a multidimensional space really helps learning (humans ALWAYS learn language)
      • Physics and bodies provide essential consistency for understanding intelligence, and facilitate transfer and continuous learning
      • Solving many tasks helps: sometimes many tasks are essential to learn at all [Learning more things at once often helps performance in RL. Intentional unintentional agents]
      • Reporting failure cases is also important!
  • 31. Take Home Messages
      • TD-learning is back & hot (since TD-Gammon, the first AI-won game)*
          • Only 1 reward at the end
          • No feedback along the way
      • New venue: Int’l conference on RL and decision making https://groups.google.com/forum/#!forum/rldm-list
      * See the unsupervised representation learning talk by R. Sutton and the latest DeepMind work (Mnih’17, evolution of UNREAL)
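As an illustration of learning with only one reward at the end, here is a minimal tabular TD(0) sketch. The environment is an assumption made for the example (a deterministic chain of states that always moves right and pays a reward of 1 on leaving the last state), and the function name `td0_chain` is hypothetical.

```python
def td0_chain(n_states=5, gamma=0.9, alpha=0.5, episodes=200):
    """Tabular TD(0) value estimation on a deterministic left-to-right chain."""
    V = [0.0] * (n_states + 1)   # the extra entry is the terminal state (value 0)
    for _ in range(episodes):
        for s in range(n_states):
            s_next = s + 1
            # Single reward at the end of the episode, no feedback before that.
            r = 1.0 if s_next == n_states else 0.0
            td_error = r + gamma * V[s_next] - V[s]
            V[s] += alpha * td_error   # bootstrap from the next state's estimate
    return V[:n_states]

values = td0_chain()
# The estimates approach gamma ** (steps remaining before the reward).
```

The final reward propagates backwards by one state per episode through the bootstrapped targets, which is how TD methods cope with the absence of intermediate feedback.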
  • 32. Take Home Messages
      • Domain randomization: used to transfer learning from simulation to real life without domain adaptation (OpenAI, NVIDIA cube pose estimation: distractors and different backgrounds, lights, virtual elements to real images).
      • Learning by demonstration and few-shot learning: the most data-efficient learning algorithms for semi-supervised learning
  • 33. Take Home Papers
      • Regularizing NNs by penalising confident output distributions [Pereyra’17].
      • Additional objectives (similar to UNREAL): RL with Unsupervised Auxiliary Tasks [Jaderberg’17]
      • Generating grounded rewards automatically [Littman, Topcu et al. ’17].
      * Reinforcement Learning with Unsupervised Auxiliary Tasks - implementation: https://github.com/miyosuda/unreal
      ** Option: a generalisation of a single-step action that may span more than one timestep and can be used like a standard action. The policy mu over options selects option o in state s with probability mu(s,o). We can derive a policy over options Pi_omega that maximises the expected discounted (via regrets) sum of rewards.
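The confidence penalty in the first paper above can be sketched as the usual negative log-likelihood minus a weighted entropy term, so that low-entropy (over-confident) predictions cost more. This is a minimal sketch: the function names and the default `beta` value are assumptions chosen for illustration, not taken from the paper's code.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy of a discrete distribution (in nats)."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def confidence_penalized_loss(logits, target, beta=0.1):
    """Negative log-likelihood minus beta times the output entropy."""
    probs = softmax(logits)
    nll = -math.log(probs[target])
    return nll - beta * entropy(probs)

uniform_loss = confidence_penalized_loss([0.0, 0.0, 0.0], target=0)
peaked_loss = confidence_penalized_loss([5.0, 0.0, 0.0], target=0)
```

With `beta=0` this reduces to plain cross-entropy; larger values of `beta` reward keeping some probability mass on the other classes.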
  • 34. Take Home Papers
      • DeepMind, 2 parallel works: Relational Networks and Visual Interaction Networks (philosophically similar works using abstract logic to reason about the world).
      • Dealing with sparse rewards:
          • Reward shaping: Off-Policy Reward Shaping with Ensembles: https://arxiv.org/abs/1502.03248 and Expressing Arbitrary Reward Functions as Potential-Based Advice: https://www.aaai.org/ocs/index.php/AAAI/AAAI15/paper/viewFile/9893/9923
          • http://papers.nips.cc/paper/6538-safe-and-efficient-off-policy-reinf
          • https://ai.vub.ac.be/sites/default/files/PID3130853.pdf
          • Reinforcement Learning from Demonstration through Shaping
          • Non-Markovian Rewards Expressed in LTL: Guiding Search Via Reward Shaping. A. Camacho, et al. (RLDM), June 2017
          • https://arxiv.org/pdf/1706.10295.pdf
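For readers new to reward shaping, here is a minimal sketch of the classic potential-based form, in which the term gamma * Phi(s') - Phi(s) is added to a sparse reward. The potential function, the 1-d line world and all names below are hypothetical choices for illustration and are not taken from the papers listed above.

```python
def potential(state, goal=4):
    """Hypothetical potential: negative distance to the goal on a 1-d line."""
    return -abs(goal - state)

def shaped_reward(r, s, s_next, gamma=0.9):
    """Sparse reward plus the potential-based shaping term."""
    return r + gamma * potential(s_next) - potential(s)

gamma = 0.9
trajectory = [0, 1, 2, 3, 4]     # states visited while walking to the goal
sparse = [0.0, 0.0, 0.0, 1.0]    # reward only on the final transition
shaped = [shaped_reward(r, s, s2, gamma)
          for r, s, s2 in zip(sparse, trajectory, trajectory[1:])]

# The discounted shaping terms telescope to gamma**T * Phi(s_T) - Phi(s_0),
# which depends only on the start and end states, not on the path taken.
bonus = sum(gamma ** t * (sh - sp)
            for t, (sh, sp) in enumerate(zip(shaped, sparse)))
```

Every step now carries a learning signal, while the extra return collected along a trajectory depends only on its endpoints.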
  • 35. Take Home Papers
      • GANs:
          • Allan Ma (Guelph): state-of-the-art GAN implementation + evaluation.
          • GANs used to perform domain adaptation (useful ideas to go from simulated robots to real-world robots)
      • Language grounding and visual/dialog hybrid systems (ideas for PARL.AI grant call): End-to-end optimization of goal-driven and visually grounded dialogue systems
  • 36. Take Home Datasets
      • Dex-Net grasping dataset (10K 3D models to acquire force closure grasps, for the ABB YuMi)
      • ROS service for grasp planning. Dex-Net as a Service: Fall 2017. HTTP web API to create new databases with custom 3D models and compute grasp robustness metrics.
      • Google robot farm dataset: many robot arms for grasping, pushing, etc. 800,000 grasp attempts (6-14 robotic manipulators)
      • Using Baxter:
          • Pinto and Gupta Baxter dataset (40k grasping experiences). CNNs predict lifting successes or resistance to grasp perturbations caused by an adversary*.
          • Oberlin’15: autonomously collecting object scans
      * Lerrel Pinto and Abhinav Gupta. Supersizing self-supervision: Learning to grasp from 50k tries and 700 robot hours. In Proc. IEEE Int. Conf. Robotics and Automation (ICRA), 2016.
        Lerrel Pinto, James Davidson, and Abhinav Gupta. Supervision via competition: Robot adversaries for learning tasks. arXiv preprint arXiv:1610.01685, 2016.
  • 38. Food for thought
      • Is AI = DL + RL? (Hado van Hasselt)
      • Does the brain do backpropagation?
          • Even if the brain does not do back-propagation as ANNs do, there is no mathematical obstacle that proves it could not
      • CNNs and LSTMs: successful, ubiquitous AI models inspired by the human brain
      • :( Neuroscience is still far removed from the AI community
  • 39. Keyword Summary
      • GANs as data augmentation (CycleGAN, BEGAN,…)
      • Autoregressive models (PixelGAN)
      • Embedding language and vision representations
  • 40. Keyword Summary
      • End-to-end
      • Self-supervision
      • Learning by:
          • Imitation*, cloning, demonstration and by predicting the future (natural learning)
          • One-shot learning
          • Reward shaping and a myriad of other signals
      • TD-learning
      • Options framework
      * E.g. Imitating Driver Behavior with Generative Adversarial Networks https://arxiv.org/pdf/1701.06699.pdf
  • 41. Grants and competitions
      • https://nips.cc/Conferences/2017/CompetitionTrack (Learning to run)
  • 42. Papers right out of the oven
      • [PDF] End-to-End Learning of Semantic Grasping. E Jang, S Vijaynarasimhan, P Pastor, J Ibarz, S Levine - arXiv preprint arXiv: …, 2017. Abstract: We consider the task of semantic robotic grasping, in which a robot picks up an object of a user-specified class using only monocular images. Inspired by the two-stream hypothesis of visual reasoning, we present a semantic grasping framework that learns object
      • [PDF] Imitation from Observation: Learning to Imitate Behaviors from Raw Video via Context Translation. YX Liu, A Gupta, P Abbeel, S Levine - arXiv preprint arXiv:1707.03374, 2017. Abstract: Imitation learning is an effective approach for autonomous systems to acquire control policies when an explicit reward function is unavailable, using supervision provided as demonstrations from an expert, typically a human operator. However, standard imitation
      • [PDF] Vision-Based Multi-Task Manipulation for Inexpensive Robots Using End-To-End Learning from Demonstration. R Rahmatizadeh, P Abolghasemi, L Bölöni, S Levine - arXiv preprint arXiv: …, 2017
  • 45. Papers right out of the oven
      • Limitations:
          • Requires a substantial number of demonstrations to learn the translation model.
          • Requires observations of demonstrations from multiple contexts in order to learn to translate between them.
      • Insights:
          • Training an end-to-end model from scratch for each task may be inefficient in practice
          • Combining our method with higher level representations proposed in prior work would likely lead to more efficient training (Sermanet et al., 2017).
          • Challenge: domain shift: combine multiple tasks from different contexts into a single model
  • 47. Papers right out of the oven
      • Reinforcement Learning with Unsupervised Auxiliary Tasks (UNREAL and extension Mnih’17)
          • Auxiliary control and reward prediction tasks in deep RL double data efficiency & robustness to hyperparameter settings.
          • Successor to A3C in learning speed and robustness (over 87% of human scores)
  • 48. RESOURCES
      • Slides
      • TensorFlow Session
      • Github Project Tutorial
      • TensorFlow Installation Notes
      • Theano Session Tutorial
  • 62. Using relational properties in our priors?
      • Neural-symbolic (Knowledge Graph) learning and reasoning
      • Relational Networks (Santoro’17) and Visual Interaction Networks (Watters’17): philosophically similar models using abstract logic to reason about the world
  • 63. Interpreting unsupervised representations
      • Understanding intermediate layers using linear classifier probes. Alain and Bengio’16 https://arxiv.org/pdf/1610.01644.pdf
      • Explaining the Unexplained: A CLass-Enhanced Attentive Response (CLEAR) Approach to Understanding Deep Neural Networks, Kumar et al. ’17. https://arxiv.org/pdf/1704.04133.pdf
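The linear-probe idea can be sketched as follows: keep an intermediate representation frozen and fit only a linear classifier on top of it, then read the probe's accuracy as a measure of how linearly separable the classes are at that layer. This is a toy illustration under stated assumptions. The `frozen_features` function is a hypothetical stand-in for a real network layer, the data is made up, and the bias term is omitted for brevity.

```python
import math

def frozen_features(x):
    """Hypothetical stand-in for an intermediate layer; it is never updated."""
    return [max(0.0, x[0] - x[1]), max(0.0, x[1] - x[0])]

def train_probe(features, labels, lr=1.0, epochs=100):
    """Fit a logistic-regression probe (no bias) on the fixed features."""
    w = [0.0] * len(features[0])
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for h, y in zip(features, labels):
            logit = sum(wi * hi for wi, hi in zip(w, h))
            p = 1.0 / (1.0 + math.exp(-logit))
            for i, hi in enumerate(h):
                grad[i] += (p - y) * hi
        w = [wi - lr * g / len(features) for wi, g in zip(w, grad)]
    return w

def probe_accuracy(w, features, labels):
    """Fraction of examples the linear probe classifies correctly."""
    correct = 0
    for h, y in zip(features, labels):
        pred = 1 if sum(wi * hi for wi, hi in zip(w, h)) > 0 else 0
        correct += int(pred == y)
    return correct / len(labels)

# Hypothetical toy task: is the first input coordinate larger than the second?
inputs = [(0.2, 0.8), (0.9, 0.1), (0.6, 0.3), (0.1, 0.5), (0.7, 0.9), (0.8, 0.2)]
labels = [1 if a > b else 0 for a, b in inputs]
features = [frozen_features(x) for x in inputs]
w = train_probe(features, labels)
accuracy = probe_accuracy(w, features, labels)
```

Only the probe weights `w` are trained; comparing probe accuracy across layers is what gives the per-layer picture described in the paper.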