Deep Learning: Advances Of The Last Year

Tyantov Eduard

Neural Networks
1. Multilayer perceptron, 1960
2. Backpropagation algorithm, 1974
3. Convolutional neural network, 1990s

Convolution
[Figure: a color (RGB) input image with 3 channels is passed through convolutions to produce output feature maps]
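The convolution pictured on this slide can be sketched in a few lines of plain Python. This is a minimal illustration, not how any framework implements it: the function name `conv2d`, the toy 4x4 image, and the two 3x3 kernels are all assumptions made up for the example. It shows how every output feature map sums over all 3 input channels.

```python
def conv2d(image, kernels):
    """'Valid' convolution (cross-correlation, as in most DL libraries).

    image:   [channels][height][width]
    kernels: [out_maps][channels][k][k]
    returns: [out_maps][height - k + 1][width - k + 1]
    """
    channels, height, width = len(image), len(image[0]), len(image[0][0])
    k = len(kernels[0][0])
    out_h, out_w = height - k + 1, width - k + 1
    feature_maps = []
    for kernel in kernels:
        fmap = [[0.0] * out_w for _ in range(out_h)]
        for y in range(out_h):
            for x in range(out_w):
                total = 0.0
                # Each output value mixes information from all input channels.
                for c in range(channels):
                    for dy in range(k):
                        for dx in range(k):
                            total += image[c][y + dy][x + dx] * kernel[c][dy][dx]
                fmap[y][x] = total
        feature_maps.append(fmap)
    return feature_maps


# Toy RGB image (hypothetical data): channel c is filled with the value c + 1.
rgb = [[[float(c + 1)] * 4 for _ in range(4)] for c in range(3)]
# Two hypothetical 3x3 kernels: one sums all channels, one looks only at red.
sum_all = [[[1.0] * 3 for _ in range(3)] for _ in range(3)]
red_only = [[[1.0] * 3 for _ in range(3)]] + [
    [[0.0] * 3 for _ in range(3)] for _ in range(2)
]
maps = conv2d(rgb, [sum_all, red_only])
```

Two kernels give two feature maps, each 2x2 for a 4x4 input and a 3x3 kernel. A real network learns the kernel values by backpropagation instead of fixing them by hand.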

Network Architecture
VGG-network

#1 OCR: Google Maps & Street View
Results
– State-of-the-art: 84.2% accuracy on French Street Name Signs dataset
– Deployed to recognize street numbers and addresses
– Expanded to detect business store-front signs

#2 Visual reasoning
Results
– Super-human performance on CLEVR: 95.5% accuracy (previous best: 68.5%)
– Potential in B2B: “is the store shelf properly merchandised?”

#3 Pix2Code
Overview
– Generating Code from a Graphical User Interface Screenshot
– Experimental, 77% accuracy

#4 SketchRNN: Teaching Machines to Draw
Overview
• Dataset of vector drawings obtained from “Quick, Draw!”, an online game (70K examples)
• The model learned how to draw simple concepts

#4 SketchRNN: Examples
Arithmetic on latent representation
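Arithmetic on a latent representation amounts to adding, subtracting, and interpolating the vectors that the encoder produces. The sketch below uses hypothetical helper names (`analogy`, `interpolate`) and tiny hand-written vectors. In SketchRNN the vectors would come from the trained encoder and would be decoded back into drawings, which is not shown here.

```python
def analogy(z_a, z_b, z_c):
    """Apply the change from a to b onto c: z_c + (z_b - z_a)."""
    return [c + (b - a) for a, b, c in zip(z_a, z_b, z_c)]


def interpolate(z_start, z_end, t):
    """Linear blend between two latent vectors, with t in [0, 1]."""
    return [(1.0 - t) * s + t * e for s, e in zip(z_start, z_end)]


# Hypothetical 2-D latent codes; real codes have many more dimensions.
shifted = analogy([1.0, 0.0], [1.0, 1.0], [0.0, 0.0])
halfway = interpolate([0.0, 0.0], [2.0, 4.0], 0.5)
```

Decoding `halfway` with a trained decoder would be expected to give a drawing that blends the two source sketches.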

#5 What is a GAN?
Overview
– Generative Adversarial Networks – two neural networks contesting with each other
– Primarily applied to modeling natural images
– Architecture
• Generator creates fake pictures and tries to fool Discriminator
• Discriminator distinguishes generated from real
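The contest between the two networks can be written as a pair of loss functions. This is a sketch under assumptions: it uses the common binary cross-entropy formulation with the non-saturating generator loss, and it takes the Discriminator's output probabilities as plain lists instead of running real networks. The function names are made up for the example.

```python
import math


def discriminator_loss(d_real, d_fake):
    """D is rewarded for scoring real images near 1 and generated ones near 0.

    d_real, d_fake: D's probabilities of "real" for real / generated images.
    """
    real_term = sum(-math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(-math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def generator_loss(d_fake):
    """G is rewarded when D scores its generated images near 1 (D is fooled)."""
    return sum(-math.log(p) for p in d_fake) / len(d_fake)


# A confident, correct D has a low loss; a fooled D gives G a low loss.
sharp_d = discriminator_loss(d_real=[0.9, 0.8], d_fake=[0.1, 0.2])
unsure_d = discriminator_loss(d_real=[0.5, 0.5], d_fake=[0.5, 0.5])
g_when_fooling = generator_loss([0.9])
g_when_caught = generator_loss([0.1])
```

Training alternates gradient steps on the two losses, so an improvement in one network raises the other network's loss.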

#5 GAN: Architecture
G generates pictures from numbers (latent representation)
[Figure: generated bedrooms]

#7 Face aging with conditional GAN
Overview
– Trained on IMDB dataset (age known)
– Extra parameters, e.g. age, can be added to the latent space
– During evaluation, the age component is changed
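Adding age to the latent space can be pictured as appending a condition vector to the random latent code. The sketch below is an assumption about the mechanics, not the paper's exact setup: the age bins in `AGE_GROUPS`, the one-hot encoding, and the 8-dimensional latent code are all hypothetical.

```python
import random

# Hypothetical age bins; the actual model may use different categories.
AGE_GROUPS = ["0-18", "19-29", "30-39", "40-49", "50-59", "60+"]


def one_hot(index, size):
    return [1.0 if i == index else 0.0 for i in range(size)]


def conditioned_latent(z, age_group):
    """Concatenate the identity code z with a one-hot age condition."""
    return z + one_hot(AGE_GROUPS.index(age_group), len(AGE_GROUPS))


rng = random.Random(0)
z = [rng.gauss(0.0, 1.0) for _ in range(8)]  # identity part, kept fixed

# At evaluation time only the age component changes; z stays the same.
young = conditioned_latent(z, "19-29")
old = conditioned_latent(z, "60+")
```

Feeding `young` and `old` to the same trained generator would be expected to yield the same face at two different ages.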

#8 Generative Adversarial Text to Image Synthesis
Overview
– Generating realistic images from text
– G outputs fake images based on text; D distinguishes fake from real (image, text) pairs

#9 Professional-Level Photographs
Aspects
– Dataset:
• professional photos
• negative examples were generated by applying a combination of image filters, degrading their appearance
– Model: GAN, Generator tries to fix negative examples
– The AI browses thousands of panoramas to pick and enhance shots

#10 Pix2pix
Overview
– Image-to-image translation
– Generator learns to transfer between domains

#11 Pix2pix: cat examples
Demo

#12 CycleGAN: tasks

#12 CycleGAN
Overview
– Unpaired Image-to-Image Translation (like Style Transfer)
– Two G + D pairs: one translates from the first domain to the second, the other translates back
– Cycle consistency
– More powerful and controllable than Style Transfer (object transfigurations, …)
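Cycle consistency means that translating an image to the other domain and back should return the original image. A minimal sketch of that loss term follows, assuming an L1 distance and using toy functions on pixel lists in place of trained generators. The names `brighten`, `darken`, and `identity` are stand-ins invented for the example.

```python
def l1_distance(a, b):
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def cycle_consistency_loss(x, y, g_xy, g_yx):
    """x -> Y -> X should return x, and y -> X -> Y should return y."""
    forward = l1_distance(g_yx(g_xy(x)), x)
    backward = l1_distance(g_xy(g_yx(y)), y)
    return forward + backward


# Toy stand-ins for the two generators (assumptions, not real networks).
def brighten(img):
    return [p + 0.2 for p in img]


def darken(img):
    return [p - 0.2 for p in img]


def identity(img):
    return list(img)


x = [0.1, 0.5]  # sample from domain X
y = [0.4, 0.9]  # sample from domain Y
consistent = cycle_consistency_loss(x, y, brighten, darken)
inconsistent = cycle_consistency_loss(x, y, brighten, identity)
```

In training this term is added to the two adversarial losses, which is what allows learning from unpaired images.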

#13 Molecule development in oncology
Overview
– AAE model learns underlying distribution of molecular fingerprints
– The output was used to screen 72M compounds and select candidates with potential anti-cancer properties
– Result: 69 new molecules (half are already used to fight cancer)

#14 Adversarial attacks
Overview
– Adversarial examples: inputs formed by applying small but intentionally worst-case perturbations
– Would-be problem for future ML systems, now in active research
– Potential usage for us:
• robust defense for Antispam
• adversarial Captcha
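A small worst-case perturbation can be demonstrated with the fast gradient sign method (FGSM) on a tiny logistic-regression classifier. This is an illustrative sketch: the weights, the input, and `epsilon` are made-up values, and attacks on real systems target deep networks instead of a three-weight model.

```python
import math


def predict(weights, bias, x):
    """Probability of class 1 under a logistic-regression model."""
    logit = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


def fgsm(weights, bias, x, label, epsilon):
    """Move every input feature by epsilon in the direction that raises the loss.

    For cross-entropy loss the gradient with respect to x is (p - label) * w.
    """
    p = predict(weights, bias, x)
    grad = [(p - label) * w for w in weights]
    signs = [(g > 0) - (g < 0) for g in grad]
    return [xi + epsilon * s for xi, s in zip(x, signs)]


# Hypothetical model and input, correctly classified as class 1.
weights, bias = [2.0, -3.0, 1.0], 0.0
x = [0.3, -0.2, 0.1]
x_adv = fgsm(weights, bias, x, label=1, epsilon=0.25)

p_clean = predict(weights, bias, x)
p_adv = predict(weights, bias, x_adv)
```

No feature moves by more than `epsilon`, yet the prediction crosses the 0.5 decision boundary.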

#14 Adversarial attacks: examples IRL
Impersonation to fool face recognition
Perturbations in the shape of the text “LOVE HATE” on road signs to fool self-driving cars

#15 WaveNet: A Generative Model for Raw Audio
Results
– WaveNets are able to generate speech which mimics any human voice
– Text To Speech: State of the art, reduced the gap with human performance by over 50%
– Can be used to synthesize audio signals such as music
[Audio samples: text-to-speech, piano]

#16 Lip Reading
Results
– Trained on TV dataset
– Model can operate on visual, audio or both
– Beats a professional lip reader from BBC

#17 Obama Lip Sync
By training on a large amount of video of the same person, we can create believable video from audio with convincing lip sync

#18 Google Neural Machine Translation
Overview
– Architecture: recurrent NN + attention + a lot of tricks
– GNMT reduces translation errors by 55%-85% on several major language pairs
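The attention component can be illustrated with a simplified dot-product version: the decoder scores every encoder state against its current query, turns the scores into weights with a softmax, and reads a weighted average. The scoring function here is an assumption chosen for brevity and is not claimed to be the one GNMT uses. The vectors are toy values.

```python
import math


def softmax(scores):
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, encoder_states):
    """Return attention weights and the weighted sum of encoder states."""
    scores = [sum(q * h for q, h in zip(query, state)) for state in encoder_states]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    context = [
        sum(w * state[i] for w, state in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return weights, context


# Three hypothetical encoder states (one per source word) and a decoder query.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, context = attend([2.0, 0.0], encoder_states)
```

The second source position gets the smallest weight because it is orthogonal to the query.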

#19 Negotiations. Deal or no deal?
How
– Supervised on human dialogues and then finetuned on AI vs AI
– Bot simulates a future conversation to the end and picks the message with the highest expected value (EV)
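Picking the highest-EV message can be sketched as a Monte Carlo estimate: simulate the rest of the dialogue many times for each candidate message, average the final rewards, and keep the best candidate. Everything concrete below is hypothetical, including the candidate messages, their rewards and acceptance probabilities, and the toy simulator that stands in for the dialogue model.

```python
import random


def expected_value(candidate, simulate, n_rollouts, rng):
    """Average final reward over simulated continuations of the dialogue."""
    return sum(simulate(candidate, rng) for _ in range(n_rollouts)) / n_rollouts


def pick_message(candidates, simulate, n_rollouts=200, seed=0):
    rng = random.Random(seed)
    return max(
        candidates,
        key=lambda c: expected_value(c, simulate, n_rollouts, rng),
    )


# Hypothetical candidates: (reward if the deal is accepted, acceptance chance).
OFFERS = {
    "demand everything": (10, 0.1),
    "split evenly": (5, 0.8),
    "give everything": (0, 1.0),
}


def toy_simulate(candidate, rng):
    """Stand-in for rolling the conversation out to the end."""
    reward, p_accept = OFFERS[candidate]
    return reward if rng.random() < p_accept else 0


best = pick_message(list(OFFERS), toy_simulate)
```

The greedy demand pays more when accepted but is rarely accepted, so the even split has the higher expected value.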
Results
– There were cases where agents initially feigned interest in a valueless item
– A step toward creating bots that can reason, converse, and negotiate

#19 Negotiations: out of control
Bob: I can i i everything else
Alice: balls have zero to me to me to me to me to me to me to me to me to
Bob: you i everything else
Alice: balls have a ball to me to me to me to me to me to me to me to me

#20 What is RL?
Task
• Learn how to behave successfully to achieve a goal while interacting with an external environment
• Learn via experience!
Examples
• Game playing
• Traffic control
• Robotics
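Learning via experience can be shown with tabular Q-learning on a tiny corridor: the agent starts at the left end and is rewarded only for reaching the right end. This is a sketch with assumed details. The 5-state environment and the hyperparameters are invented for the example, and the systems on the following slides use deep networks instead of a table.

```python
import random

N_STATES, GOAL = 5, 4
ACTIONS = (-1, 1)  # step left, step right


def step(state, action):
    """Environment: move along the corridor; reward 1 on reaching the goal."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    return nxt, (1.0 if nxt == GOAL else 0.0), nxt == GOAL


def train(episodes=300, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon (or on ties), otherwise be greedy.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            nxt, reward, done = step(state, ACTIONS[a])
            # Move the estimate toward reward + discounted best future value.
            target = reward + (0.0 if done else gamma * max(q[nxt]))
            q[state][a] += alpha * (target - q[state][a])
            state = nxt
    return q


q = train()
policy = [ACTIONS[0 if left > right else 1] for left, right in q[:GOAL]]
```

The agent is never told that moving right is correct. The preference emerges from the reward signal alone.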

#21 RL: advance in games
Atari 2600 games – RL surpasses human performance in a majority of games (2015)
Results
• Using auxiliary tasks (pixel and reward prediction), training is 10x faster
• On Atari games – 9x human performance
• 87% of human performance on Labyrinth

#22 Mastering the game of Go
“We showed AlphaGo a large number of strong amateur games to help it develop its own understanding of what
reasonable human play looks like. Then we had it play against different versions of itself thousands of times, each
time learning from its mistakes and incrementally improving until it became immensely strong, through a process
known as reinforcement learning.”
Overview
– DeepMind’s AlphaGo beats all the human champions
– Was awarded a professional 9-dan rank
– … and will retire

#23 Dota
Overview
– Bot has learned entirely via self-play
– Beats the top pros at 1v1 during The International 2017

#24 Robots that Learn
One-shot learning
The system can learn a behavior from a single demonstration delivered within a simulator, then reproduce that behavior in different setups in reality.

#25 Locomotion Behaviors in Rich Environments
Overview
– Trained: simulated bodies on a diverse set of challenging terrains and obstacles
– Simple reward function: progress
– Result: rich environments promote complex behavior

#26 Learning from Human Preferences
Overview
– No goal system
– Algorithm can infer what humans want by being told which of two behaviors is better
– 900 bits of human feedback
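Learning from pairwise comparisons can be sketched as fitting a reward function so that the preferred option in each pair gets the higher reward. The example assumes a Bradley-Terry style preference model and a linear reward over two hand-made features (progress, energy used). The real system learns a neural-network reward from video clips, and the data here are invented.

```python
import math


def dot(w, features):
    return sum(wi * fi for wi, fi in zip(w, features))


def preference_probability(reward_a, reward_b):
    """Modelled probability that a human prefers a over b."""
    return 1.0 / (1.0 + math.exp(reward_b - reward_a))


def fit_reward(comparisons, n_features, lr=0.5, epochs=200):
    """comparisons: list of (features_of_preferred, features_of_rejected)."""
    w = [0.0] * n_features
    for _ in range(epochs):
        for better, worse in comparisons:
            p = preference_probability(dot(w, better), dot(w, worse))
            # Gradient ascent on the log-likelihood of the human's choice.
            for i in range(n_features):
                w[i] += lr * (1.0 - p) * (better[i] - worse[i])
    return w


# Hypothetical feedback: the human always picks the clip with more progress.
comparisons = [
    ([1.0, 0.2], [0.2, 0.3]),
    ([0.8, 0.9], [0.1, 0.1]),
    ([0.6, 0.5], [0.3, 0.8]),
]
w = fit_reward(comparisons, n_features=2)
high = dot(w, [0.9, 0.5])
low = dot(w, [0.1, 0.5])
```

The fitted reward then replaces a hand-written goal when training the RL agent. Each comparison contributes at most one bit, which is why the feedback budget can be counted in bits.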

#26 Learning from Human Preferences: failure case
– Goal: grasp items
– Result: algorithm tricked evaluators (it only appeared to grasp)

#27 Production: Cooling a Data Center
Overview
– Data: historical, collected by thousands of sensors (temperatures, power, pump speeds, setpoints, …)
– Ensemble of deep neural networks

#28 One Model To Learn Them All
Overview
– One model to learn 8 tasks of multiple domains (image, text, speech)
Results
– Metrics are similar to state-of-the-art on large tasks, better on small
– Transfer learning across domains

#29 Training Imagenet in 1 hour
Overview
– 29 hours on 8 GPUs -> 1 hour on a 256-GPU cluster
– 90% scaling efficiency (minibatch of 8192)
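The numbers on this slide can be checked with a little arithmetic, together with a sketch of the learning-rate recipe commonly used to make such large minibatches train well: scale the learning rate linearly with the minibatch size and ramp it up gradually over the first epochs. The base learning rate of 0.1 per 256 images and the 5 warmup epochs are assumptions not stated on the slide, and the later learning-rate decay steps are omitted.

```python
def learning_rate(epoch, minibatch_size, base_lr=0.1, base_batch=256,
                  warmup_epochs=5):
    """Linear scaling rule with gradual warmup (assumed hyperparameters)."""
    target = base_lr * minibatch_size / base_batch
    if epoch < warmup_epochs:
        # Ramp linearly from the small-batch rate up to the scaled rate.
        return base_lr + (target - base_lr) * epoch / warmup_epochs
    return target


# Scaling efficiency implied by the slide: speedup divided by extra hardware.
speedup = 29 / 1          # 29 hours -> 1 hour
gpu_factor = 256 / 8      # 8 GPUs -> 256 GPUs
scaling_efficiency = speedup / gpu_factor

schedule = [learning_rate(epoch, 8192) for epoch in range(7)]
```

With these assumptions the scaled rate for a minibatch of 8192 is 3.2, and 29x speedup on 32x the GPUs works out to roughly 90% efficiency, which matches the slide.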

#30 Self-driving cars
Overview
– Google’s Waymo has launched a beta program
• 3 million miles driven
– Intel’s Mobileye & BMW & Audi – fully autonomous cars in 2021

#31 Health
Overview
– Data Science Bowl 2017 fights cancer ($1m in prizes)
– DeepMind focuses on health problems
• Assisting Pathologists in Detecting Cancer with Deep Learning
[Figure: a closeup of a lymph node biopsy]

#32 Investments & Teams
Investments
– OpenAI: $1 billion by Musk & Co
– China to become World Leader in AI by 2030, to build domestic industry worth almost $150 billion
Teams
Number of employees (approximate; universities not included):
– Facebook’s FAIR – 80
– Google’s DeepMind – 250
– Google Brain – 30
– OpenAI – 10
– Baidu – 1300
– Microsoft Research AI – 100
