10. #1 OCR: Google Maps & Street View
Results
– State-of-the-art: 84.2% accuracy on French Street Name Signs dataset
– Deployed to recognize street numbers and addresses
– Expanded to detect business store-front signs
11. #2 Visual reasoning
Results
– Super-human performance on CLEVR: 95.5% accuracy (previous best: 68.5%)
– Potential in B2B: “Is the store shelf properly merchandised?”
13. #4 SketchRNN: Teaching Machines to Draw
Overview
• Dataset of vector drawings obtained from “Quick, Draw!”
– an online game (70K examples)
• The model learned how to draw simple concepts
16. #5 What is a GAN?
Overview
– Generative Adversarial Networks – two neural networks contesting with each other
– Primarily applied to modeling natural images
– Architecture
• Generator creates fake pictures and tries to fool Discriminator
• Discriminator distinguishes generated from real
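The generator/discriminator contest above can be written as two opposing loss functions. This is a minimal NumPy sketch, not a training loop: the logits are made-up discriminator outputs, and the non-saturating generator loss is one common choice, assumed here for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def discriminator_loss(d_real_logits, d_fake_logits):
    """Binary cross-entropy: real images are labelled 1, generated images 0."""
    real_term = -np.log(sigmoid(d_real_logits))
    fake_term = -np.log(1.0 - sigmoid(d_fake_logits))
    return float(np.mean(real_term) + np.mean(fake_term))

def generator_loss(d_fake_logits):
    """Non-saturating loss: G is rewarded when D labels its fakes as real."""
    return float(np.mean(-np.log(sigmoid(d_fake_logits))))

# Hypothetical discriminator outputs (logits) for a batch of three images.
confident_d = discriminator_loss(np.array([4.0, 5.0, 3.0]),
                                 np.array([-4.0, -5.0, -3.0]))
fooled_d = discriminator_loss(np.zeros(3), np.zeros(3))  # D outputs 0.5 everywhere

print("D loss when D separates real from fake:", confident_d)
print("D loss when D cannot tell them apart:  ", fooled_d)
```

When the discriminator cannot tell the two apart, its loss sits at 2·ln 2; the generator's loss falls as the discriminator's logits on fakes rise.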
17. #5 GAN: Architecture
G generates pictures from numbers (latent representation)
Generated Bedrooms
19. #7 Face aging with conditional GAN
Overview
– Trained on IMDB dataset (age known)
– One can add parameters to the latent space, e.g. age
– During evaluation the age component is changed
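Conditioning on age can be pictured as appending an age code to the latent vector. In this sketch the age bins, the latent size of 100 and the one-hot encoding are all assumptions made for illustration; the generator itself is omitted.

```python
import numpy as np

# Hypothetical age bins; the slide does not specify the encoding.
AGE_GROUPS = ["0-18", "19-29", "30-39", "40-49", "50-59", "60+"]

def conditioned_latent(z, age_group):
    """Concatenate the latent vector z with a one-hot age code."""
    one_hot = np.zeros(len(AGE_GROUPS))
    one_hot[AGE_GROUPS.index(age_group)] = 1.0
    return np.concatenate([z, one_hot])

rng = np.random.default_rng(0)
z = rng.standard_normal(100)           # identity-related part stays fixed
young = conditioned_latent(z, "19-29")
old = conditioned_latent(z, "60+")     # only the age component changes

print(young.shape, old.shape)
```

Feeding `young` and `old` to the same generator would then, ideally, render the same face at two different ages.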
20. #8 Generative Adversarial Text to Image Synthesis
Overview
– Generating realistic images from text
– G outputs fake images based on text; D distinguishes fake from real image+text pairs
21. #9 Professional-Level Photographs
Aspects
– Dataset:
• professional photos
• negative examples were generated by applying a combination of image filters, degrading their appearance
– Model: GAN, Generator tries to fix negative examples
– The AI travels through thousands of panoramas to pick and enhance shots
25. #12 CycleGAN
Overview
– Unpaired Image-to-Image Translation (like Style Transfer)
– Two G + D pairs: one translates from the first domain to the second, the other translates back
– Cycle consistency
– More powerful and controllable than Style Transfer (object transfigurations, …)
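Cycle consistency means that translating an image to the other domain and back should return the original. The sketch below measures that round-trip error with an L1 distance; the toy functions `G`, `F_exact` and `F_bad` are hypothetical stand-ins for the two generators, not real networks.

```python
import numpy as np

def cycle_consistency_loss(x, y, G, F):
    """L1 reconstruction error after a round trip through both generators."""
    forward_cycle = np.mean(np.abs(F(G(x)) - x))    # X -> Y -> X
    backward_cycle = np.mean(np.abs(G(F(y)) - y))   # Y -> X -> Y
    return float(forward_cycle + backward_cycle)

# Toy stand-ins for the generators (assumptions for illustration only).
G = lambda x: 2.0 * x + 1.0            # "translate" domain X to domain Y
F_exact = lambda y: (y - 1.0) / 2.0    # exact inverse of G
F_bad = lambda y: y / 2.0              # imperfect inverse

x = np.array([0.0, 1.0, 2.0])
y = np.array([1.0, 3.0, 5.0])

loss_exact = cycle_consistency_loss(x, y, G, F_exact)
loss_bad = cycle_consistency_loss(x, y, G, F_bad)
print(loss_exact, loss_bad)
```

In training, this term is added to the two adversarial losses, so the generators are pushed to produce translations that can be undone.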
27. #13 Molecule development in oncology
Overview
– AAE model learns underlying distribution of molecular fingerprints
– The output was used to screen 72M compounds and select candidates with potential anti-cancer properties
– Result: 69 new molecules (half are already being used to fight cancer)
28. #14 Adversarial attacks
Overview
– Adversarial examples—inputs formed by applying small but intentionally worst-case perturbations
– A potential problem for future ML systems; currently an active research area
– Potential usage for us:
• robust defense for Antispam
• adversarial Captcha
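A small worst-case perturbation can be illustrated with the fast gradient sign method (FGSM) on a logistic-regression classifier. The weights, the input and `epsilon` below are made-up values; a real attack would target a trained deep network.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm_perturb(x, y, w, b, epsilon):
    """Move x by epsilon per feature in the direction that increases the loss."""
    p = sigmoid(w @ x + b)
    grad_x = (p - y) * w               # gradient of cross-entropy loss w.r.t. x
    return x + epsilon * np.sign(grad_x)

# Hypothetical trained classifier and input (assumptions for illustration).
w = np.array([2.0, -3.0, 1.0])
b = 0.0
x = np.array([0.3, -0.2, 0.1])         # classified as class 1
y = 1.0

x_adv = fgsm_perturb(x, y, w, b, epsilon=0.3)
p_clean = sigmoid(w @ x + b)
p_adv = sigmoid(w @ x_adv + b)
print("clean score:", p_clean, "adversarial score:", p_adv)
```

No feature moves by more than `epsilon`, yet the predicted class flips. Adversarial training defends against this by adding such examples to the training set.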
29. #14 Adversarial attacks: examples IRL
Impersonations to fool face recognition
Perturbations in the shape of the text “LOVE HATE” to fool self-driving cars
31. #15 WaveNet: A Generative Model for Raw Audio
Results
– WaveNets are able to generate speech which mimics any human voice
– Text To Speech: State of the art, reduced the gap with human performance by over 50%
– Can be used to synthesize audio signals such as music
Audio samples: text, piano
32. #16 Lip Reading
Results
– Trained on TV dataset
– Model can operate on visual, audio or both
– Beats a professional lip reader from BBC
33. #17 Obama Lip Sync
By training on a large amount of video of the same person, we can create believable video from audio with convincing lip sync
35. #18 Google Neural Machine Translation
Overview
– Architecture: recurrent NN + attention + a lot of tricks
– GNMT reduces translation errors by 55%-85% on several major language pairs
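The attention component of the architecture lets the decoder weight encoder states by relevance at each output step. The sketch below uses dot-product scoring, which is an assumption made for simplicity (the slide does not say which scoring function GNMT uses); the vectors are toy values.

```python
import numpy as np

def attention(query, encoder_states):
    """Return softmax attention weights and the weighted context vector."""
    scores = encoder_states @ query             # one score per source position
    scores = scores - scores.max()              # for numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()
    context = weights @ encoder_states          # weighted sum of encoder states
    return weights, context

# Toy encoder states for a 3-token source sentence (hypothetical values).
encoder_states = np.array([[1.0, 0.0],
                           [0.0, 1.0],
                           [1.0, 1.0]])
query = np.array([0.0, 5.0])                    # current decoder state

weights, context = attention(query, encoder_states)
print("weights:", weights)
print("context:", context)
```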
36. #19 Negotiations. Deal or no deal?
How
– Trained with supervision on human dialogues, then fine-tuned on AI vs AI play
– Bot simulates a future conversation to the end and picks the reply with the highest expected value (EV)
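Picking the reply with the highest expected value can be sketched as follows: for every candidate reply, simulate many complete conversations and average the final reward. Here `simulate_rollout` is a hypothetical stand-in that draws a noisy reward; a real system would sample the rest of the dialogue from the learned model. The candidate utterances and their rewards are invented.

```python
import random

def simulate_rollout(candidate, rng):
    """Hypothetical stand-in for playing the dialogue out to the end.

    Returns the reward of the final deal for one simulated conversation.
    """
    noise = candidate["noise"]
    return candidate["base_reward"] + rng.uniform(-noise, noise)

def pick_by_expected_value(candidates, n_rollouts=200, seed=0):
    """Average the reward over many rollouts and keep the best candidate."""
    rng = random.Random(seed)
    best, best_ev = None, float("-inf")
    for cand in candidates:
        total = sum(simulate_rollout(cand, rng) for _ in range(n_rollouts))
        ev = total / n_rollouts
        if ev > best_ev:
            best, best_ev = cand, ev
    return best, best_ev

# Invented candidate replies with made-up reward statistics.
candidates = [
    {"utterance": "I want the books and the hat", "base_reward": 6.0, "noise": 3.0},
    {"utterance": "You can have everything", "base_reward": 1.0, "noise": 0.5},
    {"utterance": "I take the ball, you take the rest", "base_reward": 4.0, "noise": 1.0},
]

best, ev = pick_by_expected_value(candidates)
print(best["utterance"], round(ev, 2))
```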
Results
– There were cases where agents initially feigned interest in a valueless item
– A step toward creating bots that can reason, converse, and negotiate
37. #19 Negotiations: out of control
Bob: I can i i everything else
Alice: balls have zero to me to me to me to me to me to me to me to me to
Bob: you i everything else
Alice: balls have a ball to me to me to me to me to me to me to me to me
39. #20 What is RL?
Task
• Learn how to behave successfully to achieve a goal while interacting with an external environment
• Learn via experience!
Examples
• Game playing
• Traffic control
• Robotics
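Learning via experience can be shown with tabular Q-learning on a toy problem. The five-state corridor, the reward of 1 at the right end and all hyperparameters are assumptions chosen for illustration; the systems on the following slides use deep networks instead of a table.

```python
import numpy as np

N_STATES, GOAL = 5, 4      # corridor of 5 cells; the goal is the rightmost one
LEFT, RIGHT = 0, 1

def step(state, action):
    """Environment: move one cell; reward 1.0 only when the goal is reached."""
    if action == LEFT:
        next_state = max(0, state - 1)
    else:
        next_state = min(GOAL, state + 1)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward, next_state == GOAL

def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = np.random.default_rng(seed)
    q = np.zeros((N_STATES, 2))                 # Q-table: value of each (state, action)
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:          # explore
                action = int(rng.integers(2))
            else:                               # exploit, breaking ties at random
                best = np.flatnonzero(q[state] == q[state].max())
                action = int(rng.choice(best))
            next_state, reward, done = step(state, action)
            target = reward + (0.0 if done else gamma * q[next_state].max())
            q[state, action] += alpha * (target - q[state, action])
            state = next_state
    return q

q = train()
policy = q.argmax(axis=1)
print("greedy action per state (1 = right):", policy[:GOAL])
```

The agent is never told that moving right is correct; it discovers this from the rewards it collects.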
40. #21 RL: advance in games
Atari 2600 games – RL surpasses human performance in a majority of games (2015)
Results
• Using auxiliary tasks (pixel and reward prediction), training is 10x faster
• On Atari games: 9x human performance
• 87% of human performance on Labyrinth
41. #22 Mastering the game of Go
“We showed AlphaGo a large number of strong amateur games to help it develop its own understanding of what reasonable human play looks like. Then we had it play against different versions of itself thousands of times, each time learning from its mistakes and incrementally improving until it became immensely strong, through a process known as reinforcement learning.”
Overview
– DeepMind’s AlphaGo beats all the human champions
– Was awarded a professional 9-dan rank
– … and will retire
42. #23 Dota
Overview
– Bot has learned entirely via self-play
– Beats the top pros at 1v1 during The International 2017
43. #24 Robots that Learn
One-shot learning
The system can learn a behavior from a single demonstration delivered within a simulator, then reproduce that behavior in different setups in reality.
44. #25 Locomotion Behaviors in Rich Environments
Overview
– Trained: simulated bodies on a diverse set of challenging terrains and obstacles
– Simple reward function: progress
– Result: rich environments promote complex behavior
45. #26 Learning from Human Preferences
Overview
– No hand-written goal function
– Algorithm can infer what humans want by being told which of two behaviors is better
900 bits of human feedback
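Inferring a reward from pairwise comparisons can be sketched with a Bradley-Terry style model: the probability that behavior A is preferred over B is a sigmoid of the difference in their predicted rewards. The linear reward model, the two-dimensional features and the synthetic "human" labels below are all assumptions made for illustration.

```python
import numpy as np

def fit_reward(diffs, labels, lr=0.5, steps=500):
    """Fit a linear reward w so that sigmoid(w . (a - b)) matches the labels."""
    w = np.zeros(diffs.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(diffs @ w)))           # P(a preferred over b)
        w -= lr * diffs.T @ (p - labels) / len(labels)   # cross-entropy gradient step
    return w

# Synthetic comparisons: each behavior is summarized by 2 features, and the
# hidden "human" preference depends only on feature 0 (hypothetical setup).
rng = np.random.default_rng(0)
true_w = np.array([1.0, 0.0])
a = rng.standard_normal((200, 2))
b = rng.standard_normal((200, 2))
labels = (a @ true_w > b @ true_w).astype(float)         # 1.0 if a was preferred

w = fit_reward(a - b, labels)
predicted = ((a - b) @ w > 0).astype(float)
accuracy = float(np.mean(predicted == labels))
print("learned reward weights:", w, "agreement with labels:", accuracy)
```

Each comparison carries at most one bit of information, which is why the amount of human feedback can be counted in bits.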
46. #26 Learning from Human Preferences: failure case
– Goal: grasp items
– Result: algorithm tricked evaluators (it only appeared to grasp)
47. #27 Production: Cooling a Data Center
Overview
– Data: historical, collected by thousands of sensors (temperatures, power, pump speeds, setpoints, …)
– Ensemble of deep neural networks
49. #28 One Model To Learn Them All
Overview
– One model to learn 8 tasks of multiple domains (image, text, speech)
Results
– Metrics are similar to state-of-the-art on large tasks, better on small
– Transfer learning across domains
50. #29 Training Imagenet in 1 hour
Overview
– 29 hours on 8 GPUs -> 1 hour on a 256-GPU cluster
– 90% scaling efficiency (minibatch of 8192)
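The 90% figure follows from the numbers above: the measured speedup divided by the ideal speedup. A small worked check, where the helper name `scaling_efficiency` is made up for this example:

```python
def scaling_efficiency(base_hours, base_gpus, new_hours, new_gpus):
    """Measured speedup as a fraction of the ideal (linear) speedup."""
    speedup = base_hours / new_hours          # 29x faster
    ideal_speedup = new_gpus / base_gpus      # 32x more GPUs
    return speedup / ideal_speedup

eff = scaling_efficiency(base_hours=29, base_gpus=8, new_hours=1, new_gpus=256)
per_gpu_batch = 8192 // 256                   # images per GPU per step

print(f"scaling efficiency: {eff:.1%}")
print("per-GPU minibatch:", per_gpu_batch)
```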
52. #30 Self-driving cars
Overview
– Google’s Waymo has launched a beta program
• 3 million miles driven
– Intel’s Mobileye & BMW & Audi – fully autonomous cars in 2021
53. #31 Health
Overview
– Data Science Bowl 2017 fights cancer ($1m in prizes)
– DeepMind focuses on Health problems
• Assisting Pathologists in Detecting Cancer with Deep Learning
A closeup of a lymph node biopsy
54. #32 Investments & Teams
Investments
– OpenAI: $1 billion by Musk & Co
– China to become World Leader in AI by 2030, to build domestic industry worth almost $150 billion
Teams
Number of employees (approximate; universities not included):
– Facebook’s FAIR – 80
– Google’s DeepMind – 250
– Google Brain – 30
– OpenAI – 10
– Baidu – 1300
– Microsoft Research AI – 100