Deep Learning: a Next Step?
Kyunghyun Cho
New York University
Center for Data Science, and
Courant Institute of Mathematical Sciences
Naver Labs
Awesomeness everywhere!
[Diagram: Awesome ConvNet, Awesome LM, Awesome ASR, Awesome RoboArm Controller, Awesome Q&A, Awesome Auto-Driver, Awesome Program Interpreter, Awesome Meta-Learner, Awesome Atari Player]
What we want is…
[Diagram: Awesome ConvNet, Awesome LM, Awesome ASR, Awesome RoboArm Controller, Awesome Q&A, Awesome Auto-Driver, Awesome Memory]
• One system with
many modules
• Modules interact with
each other to solve a task
• Knowledge sharing across tasks via
shared modules
• Some trainable, others fixed
Paradigm shift
• One neural network per task
• One neural network per function
• Multiple networks cooperate to
solve many higher-level tasks
• Mixture of trainable networks
and fixed modules
Examples
• Q&A system
1. Receives a question via
awesome LM+ASR
2. Retrieves relevant info from
awesome memory
3. Generates a response via
awesome LM
• Autonomous driving
1. Senses the environment with
awesome ConvNet+ASR
2. Plans a route with
awesome memory
3. Controls a car via awesome
robot arm controller
But simple composition of neural networks may not work! Why not?
Learning to use an NN module
• Why not?
• Target tasks are often unknown at
training time
• Input/output cannot be defined
well a priori
• The amount of learning signal
differs vastly across tasks
• Rich information captured by the NN
module must be passed along
• The internals of the NN module must allow
external manipulation
Good news: NNs are transparent!
Hidden activations of a recurrent language model
• NNs are not black boxes.
• We can observe every single bit
inside a neural net.
Bad news: NNs are not easy to understand!
• Humans are not good with high-dimensional
vectors
• Distributed representation
• exponential combinations of hidden units
Learning to use an NN module
• Neural nets are good at interpreting
high-dimensional input
• Neural nets are also good at
predicting high-dimensional output
• Internal representation learned by a
neural network is well structured
• Neural nets can be trained with an
arbitrary objective
(My Rejected NSF Proposal, 2016)
Learning to use an NN module
1. Query-Efficient Imitation Learning
2. Trainable Decoding
• Real-time Neural Machine Translation
• Trainable Greedy Decoding
3. Neural Query Reformulation
4. Non-Parametric Neural Machine Translation
Query-Efficient Imitation Learning
Jiakai Zhang & K Cho. Query-Efficient Imitation Learning for End-to-End
Autonomous Driving. AAAI 2017.
Imitation Learning
• A learner directly interacts with the world
• A supervisor augments the reward signal from
the world
• Advantages over supervised and reinforcement learning
• Match between training and test
• Strong learning signal
• Disadvantages
• Where do we get the supervisor???
(Ross et al., 2011; Daume III et al., 2007; and more…)
• Supervisors are expensive
• As the learner gets better, less
intervention from the supervisor
• Learner learns from difficult examples
• Questions:
1. Where do we get the safety net?
2. What is the impact on the
learner’s performance?
SafeDAgger: Query-Efficient Imitation Learning
(Zhang&Cho, AAAI 2017; Laskey et al., ICRA 2016)
SafeDAgger: Query-Efficient Imitation Learning
1. Learner observes the world
2. SafetyNet observes the learner
3. SafetyNet predicts whether the
learner will fail
4. If no, the learner continues
5. If yes,
1. the supervisor intervenes
2. the learner imitates the
supervisor’s behaviour
Reminds us of the value function from RL!
SafeDAgger: Learning
1. Initial labelled data sets: and
2. Train the policy using
3. Train the safety net using
1. Target for the safety net given
4. Collect additional data
1. Let drive, but the expert intervenes when
2. Collect data:
5. Data aggregation:
6. Go to 2
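The loop above can be sketched on a toy problem. This is only an illustration under loud assumptions, not the setup from the paper: the world is a one-dimensional state sampled at random, `expert` is a made-up reference policy, the learner is a linear fit, and the safety net is a nearest-neighbour classifier trained on both initial data sets. All function names are hypothetical.

```python
import random


def expert(x):
    """Reference policy (the supervisor). Toy assumption: steer against x**3."""
    return -(x ** 3)


def fit_policy(data):
    """Least-squares fit of a linear learner a = w * x to (state, action) pairs."""
    num = sum(x * a for x, a in data)
    den = sum(x * x for x, _ in data)
    w = num / den if den else 0.0
    return lambda x: w * x


def fit_safety_net(policy, data, tau):
    """1-nearest-neighbour classifier predicting whether the learner is safe.

    Target for each state: True when the learner's action is within tau of
    the expert's action, False otherwise.
    """
    labelled = [(x, abs(policy(x) - a) <= tau) for x, a in data]

    def is_safe(x):
        return min(labelled, key=lambda p: abs(p[0] - x))[1]

    return is_safe


def safedagger(iterations=3, steps=200, tau=0.1, seed=0):
    rng = random.Random(seed)

    def sample(n):
        return [(x, expert(x)) for x in (rng.uniform(-1, 1) for _ in range(n))]

    d_policy, d_safe = sample(50), sample(50)          # 1. initial data sets
    queries = []
    for _ in range(iterations):                        # 6. go to 2
        policy = fit_policy(d_policy)                  # 2. train the policy
        is_safe = fit_safety_net(policy, d_safe + d_policy, tau)  # 3.
        new = []
        for _ in range(steps):                         # 4. learner drives
            x = rng.uniform(-1, 1)
            if not is_safe(x):                         # expert intervenes
                new.append((x, expert(x)))             # and labels the state
        queries.append(len(new))
        d_policy = d_policy + new                      # 5. data aggregation
    return policy, is_safe, d_policy, queries
```

`queries` records how many times the expert was asked in each iteration, which is the quantity the query-efficient variant tries to keep small.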
After 1st iteration
SafeDAgger in Action
Trainable Decoding of
Neural Machine Translation
Jiatao Gu, Graham Neubig, K Cho and Victor Li. Learning to Translate in Real-time
with Neural Machine Translation. EACL 2017.
Jiatao Gu, K Cho and Victor Li. Trainable Greedy Decoding for Neural Machine
Translation. EMNLP 2017.
Trainable Decoding
Motivation
• Many decoding objectives unknown while training
• Lack of target training examples
• Arbitrary (non-differentiable) decoding objectives
• Sample-“in”efficiency of RL algorithms
Our Approach
• Train NMT with supervised learning
• Train a decoding module on top
(1) Real-Time Translation
Decoding
1. Start with a pretrained NMT
2. A simultaneous decoder intercepts and
interprets the incoming signal
3. The simultaneous decoder forces the
pretrained model to either
1. output a target symbol, or
2. wait for a next source symbol
Learning
1. Trade-off between delay and quality
2. Stochastic policy gradient (REINFORCE)
(Gu, Neubig, Cho & Li, EACL 2017)
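The wait-or-output decision can be sketched as a small control loop. Everything here is a stand-in: `toy_step` replaces the pretrained NMT with a word-for-word lexicon, and `fixed_lag_policy` is a hand-written heuristic in place of the policy that would be learned with REINFORCE. The action names READ and WRITE are chosen for this sketch.

```python
TOY_LEXICON = {"ich": "i", "sehe": "see", "eine": "a", "katze": "cat"}


def toy_step(prefix, output):
    """Stand-in for a pretrained NMT step: translate the next unread prefix word."""
    i = len(output)
    return TOY_LEXICON[prefix[i]] if i < len(prefix) else None


def fixed_lag_policy(lag):
    """Heuristic stand-in for the learned policy: stay `lag` (>= 1) words ahead."""
    def policy(prefix, output):
        return "WRITE" if len(prefix) - len(output) >= lag else "READ"
    return policy


def simultaneous_decode(source, step_fn, policy, max_len=50):
    read, output, actions = 0, [], []
    while len(output) < max_len:
        prefix = source[:read]
        # Once the source is exhausted, only WRITE is possible.
        action = "WRITE" if read == len(source) else policy(prefix, output)
        if action == "READ":
            read += 1                     # wait for the next source symbol
        else:
            token = step_fn(prefix, output)
            if token is None:             # nothing left to emit
                break
            output.append(token)          # output a target symbol
        actions.append(action)
    return output, actions
```

A larger lag delays the first WRITE, which is the delay side of the delay/quality trade-off; in this toy the quality side is invisible because the lexicon never needs future context.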
(1) Real-Time Translation
(2) Trainable Greedy Decoding
Decoding
1. Start with a pretrained NMT
2. A trainable decoder intercepts and
interprets the incoming signal
3. The trainable decoder sends an
altering signal back to the
pretrained model
Learning
1. Deterministic policy gradient
2. Maximize any arbitrary objective
(Gu, Cho & Li, 2017)
(2) Trainable Greedy Decoding
Models
1. Actor
• Input: prev. hid. state, prev. symbol, and
context from the attention model
• Output: additive bias for hid. state
• Example:
2. Critic
• Input: a sequence of the hidden states from the decoder
• Output: a predicted return
• In our case, the critic estimates the full return rather than
Q at each time step
(Gu, Cho & Li, 2017)
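A minimal sketch of the actor's role, assuming a one-hidden-layer network with a tanh nonlinearity. The slide only states the actor's inputs and output; the architecture, the weight names `w_in` and `w_out`, and the helper functions are assumptions made for illustration.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def actor_bias(h_prev, e_prev, context, w_in, w_out):
    """Map (prev. hidden state, prev. symbol embedding, context) to a bias."""
    x = h_prev + e_prev + context              # list concatenation
    hidden = [math.tanh(s) for s in matvec(w_in, x)]
    return matvec(w_out, hidden)               # same size as the hidden state


def altered_state(h, bias):
    """Add the actor's bias to the pretrained decoder's hidden state."""
    return [a + b for a, b in zip(h, bias)]
```

With `w_out` set to zero the bias vanishes and the pretrained model decodes exactly as before, so the actor can start from the unmodified behaviour.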
(2) Trainable Greedy Decoding
(Gu, Cho & Li, EMNLP 2017)
Learning
1) Generate translation given a source sentence with noise
and
2) Train the critic to minimize
3) Generate multiple translations with noise
4) Critic-aware actor learning: newly proposed
where
Inference: simply throw away the critic and use the actor
(2) Trainable Greedy Decoding
• The trainable decoder does improve the target decoding objective
• Training is quite unstable without the critic-aware actor learning algorithm
• More work is definitely needed for further improvement
Toward End-to-End Q&A
Rodrigo Nogueira & K Cho. Task-Oriented Query Reformulation with Reinforcement
Learning. EMNLP 2017.
Dunn et al. SearchQA: A New Q&A Dataset Augmented with Context from a Search
Engine. arXiv 2017.
End-to-End Question-Answering
Neural Query Reformulator
Machine Comprehension
Trainable
Fixed
(Black box)
Neural Query Reformulator
Neural Query Reformulator
1. Reads an original query q0
2. Augment/reformulate q0
Learning
1. Hard RL problem: partial observability
due to the black box search engine
2. Policy gradient to maximize recall@K
(Nogueira & Cho, 2017)
Code and data available at https://github.com/nyu-dl/QueryReformulator
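The reward in this setup can be illustrated with a toy black box. The corpus, the term-overlap ranking and the function names below are invented for this sketch and are not taken from the released code; only the recall@K definition is standard.

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of relevant documents found among the top-k retrieved ones."""
    if not relevant:
        return 0.0
    hits = sum(1 for doc in retrieved[:k] if doc in relevant)
    return hits / len(relevant)


TOY_CORPUS = {
    "d1": "neural machine translation with attention",
    "d2": "query reformulation with reinforcement learning",
    "d3": "reinforcement learning for atari games",
    "d4": "cooking pasta at home",
}


def black_box_search(query_terms):
    """Toy search engine: rank documents by term overlap, ties by document id."""
    def score(doc_id):
        return len(set(query_terms) & set(TOY_CORPUS[doc_id].split()))
    return sorted(TOY_CORPUS, key=lambda d: (-score(d), d))


relevant = {"d2", "d3"}
q0 = ["reformulation"]                          # original query
q1 = q0 + ["reinforcement", "learning"]         # reformulated query
reward_q0 = recall_at_k(black_box_search(q0), relevant, k=2)
reward_q1 = recall_at_k(black_box_search(q1), relevant, k=2)
```

The reformulator only ever sees the ranked list and the resulting reward, never the engine's internals, which is where the partial observability comes from.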
SearchQA: new dataset
for machine comprehension
(Dunn et al., 2017)
Data available at https://github.com/nyu-dl/SearchQA
(Q, A)
(Q, A, { S1, S2, . . . , SN } )
Retrieve
Crawl
Search
SearchQA
1. Realistic, noisy context from Google
2. Multiple snippets per question
3. Large-scale data (140k q-a-c tuples)
And, Google did it!
• A pretrained, black-box Q&A
model
• Query reformulation with RL
• Tested on SearchQA
(Buck et al., 2017)
Few more relevant research directions
• Communicating neural networks
• Neural nets talk to each other to solve a problem
• Sukhbaatar & Fergus (2015), Foerster et al. (2016), Evtimova et al. (2017), Lewis et al. (2017),
…
• Multimodal processing
• Image captioning, zero-shot retrieval, …
• Cho et al. (2015, review paper)
• Planning, program synthesis
• How do the modules compose with each other to solve a task?
• Neural programmer interpreter [Reed et al., 2016; Cai et al., 2017]
• Forward modelling [Henaff et al., 2017; Sutton, 1991 Dyna; optimal control…]
• Mixture of experts [Google], progressive networks [Google DeepMind]
Paradigm Shift: modular, life-long learning
[Diagram: several neural networks connected to the environment, users/experts, a search engine and a database]
Neural Machine Translation
Multilingual, Character-Level, Non-parametric
Machine Translation
• [Allen 1987 IEEE 1st ICNN]
• 3310 En-Es pairs constructed on 31
En, 40 Es words, max 10/11 word
sentence; 33 used as test set
• Binary encoding of words – 50
inputs, 66 outputs; 1 or 3 hidden
150-unit layers. Ave WER: 1.3
words
• [Chrisman 1992 Connection Science]
• Dual-ported RAAM architecture
[Pollack 1990 Artificial Intelligence]
applied to corpus of 216 parallel pairs
of simple En-Es sentences:
• Split 50/50 as train/test, 75% of
sentences correctly translated!
Brief resurrection in 1997: Spain
Modern neural machine translation
[Figure: three translation pipelines from source sentence to target sentence: Neural Net and SMT (Schwenk et al. 2006); SMT and Neural Net (Devlin et al. 2014); a single Neural Network (Neural MT)]
1 Year
WMT 2017: news translation task
A better single-pair translation system has
never been “the” goal of neural MT
Continuous representation:
Interlingua 2.0?
What if we can project sentences in multiple languages into a single vector space?
What does NMT do?
Encoder
• Project a source sentence into a
set of continuous vectors
Decoder+Attention
• Decode a target sentence from a
set of “source” continuous
vectors
What is this “continuous vector space”?
• Similar sentences are near each other
in this vector space
• Multiple dimensions of similarity are
encoded simultaneously
(Sutskever et al., 2014)
• (Trainable) near-bijective mapping
between the continuous vector space
and the sentence space
• Stripped of hard linguistic symbols
What is this “continuous vector space”?
(Firat et al., 2016; Luong et al., 2015; Dong et al., 2015)
• Can this continuous vector space be shared across multiple languages?
Multi-way, multilingual machine translation (1)
Language-agnostic
Continuous Vector
Space
• One encoder per source language
• One decoder per target language
• Attention/alignment shared across
all the language pairs
• Only bilingual parallel
corpora necessary
• No multi-way parallel corpus needed
(Firat et al., 2016)
Multi-way, multilingual machine translation (2)
• Neural nets are like lego
• Build one encoder per source
• Build one decoder per target
• Build one attention mechanism
• Given a sentence pair
(Firat et al., 2016)
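The lego analogy amounts to simple bookkeeping: the number of modules grows with the number of languages, not with the number of language pairs. The class below is a hypothetical sketch in which strings stand in for the actual networks.

```python
class MultiWayNMT:
    """Registry of per-language encoders/decoders and one shared attention."""

    def __init__(self, sources, targets):
        # Placeholder strings stand in for trainable networks.
        self.encoders = {s: f"encoder[{s}]" for s in sources}
        self.decoders = {t: f"decoder[{t}]" for t in targets}
        self.attention = "shared-attention"

    def modules_for(self, src, tgt):
        """Modules activated (and updated) for one bilingual sentence pair."""
        return self.encoders[src], self.attention, self.decoders[tgt]

    def num_modules(self):
        return len(self.encoders) + len(self.decoders) + 1

    def num_directions(self):
        return sum(1 for s in self.encoders for t in self.decoders if s != t)


model = MultiWayNMT(["en", "fr", "de"], ["en", "fr", "de"])
```

With three languages on each side, seven modules cover six translation directions, and each bilingual sentence pair touches only its own encoder, its own decoder and the shared attention.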
Multi-way, multilingual machine translation (3)
Language-agnostic Continuous Vector Space
• Sentence-level positive language transfer
• Helps low-resource language pairs
• Why?
1. Better structural constraint on the
continuous vector space
2. Regularization
• Real-valued vector-based interlingua?
(Firat et al., 2016)
Beyond languages: multimodal translation
• Does the source have to be a “sentence”?
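[Figure: attention-based image captioning. A convolutional neural network produces annotation vectors h_j; an attention mechanism assigns attention weights a_j with Σ_j a_j = 1; the recurrent state z_i generates word samples u_i, e.g. f = (a, man, is, jumping, into, a, lake, .)]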
(Xu et al., 2015)
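The constraint that the attention weights sum to one is typically obtained with a softmax over per-annotation scores. The sketch below assumes the scores are already given (in a real model they come from a small network over the annotation vector and the recurrent state); function names are chosen for this example.

```python
import math


def attention_weights(scores):
    """Softmax over scores, so the weights are positive and sum to 1."""
    m = max(scores)                              # subtract max for stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def context_vector(weights, annotations):
    """Weighted sum of annotation vectors (one vector per image region)."""
    dim = len(annotations[0])
    return [sum(w * h[d] for w, h in zip(weights, annotations))
            for d in range(dim)]
```

Because the decoder only sees the weighted sum, the annotation vectors can come from a sentence encoder or from a convolutional network over an image without changing the decoder.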
Beyond languages: multimodal translation
(Caglayan et al., 2016; Elliott & Kadar, 2017)
What is a sentence?
Is a sentence a sequence of phrases, words, morphemes or characters?
What is a sentence to a neural net?
• Each word/symbol: one-hot vector
• Prior-less encoding
• Permutation invariant
• Sentence
• To us: a sequence of words
• To NN: a sequence of one-hot vectors
• What does it mean?
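To make the one-hot view concrete, here is a minimal sketch with hypothetical helper names. Index assignment follows first appearance, which is an arbitrary choice: any other assignment carries exactly the same (absent) prior knowledge.

```python
def build_vocab(sentences):
    """Assign each distinct word an integer index in order of first appearance."""
    vocab = {}
    for sentence in sentences:
        for word in sentence:
            vocab.setdefault(word, len(vocab))
    return vocab


def one_hot(index, size):
    return [1 if i == index else 0 for i in range(size)]


def encode(sentence, vocab):
    """A sentence becomes a sequence of one-hot vectors."""
    return [one_hot(vocab[w], len(vocab)) for w in sentence]


vocab = build_vocab([["a", "man", "is", "jumping"]])
vectors = encode(["a", "man"], vocab)
```

Every pair of distinct one-hot vectors is orthogonal and equally far apart, so the encoding says nothing about which words are related.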
Why not words?
• Inefficient handling of various morphological variants
• Sub-optimal segmentation/tokenization
• “Etxaberria”, “Etxazarra”, “Etxaguren”, “Etxarren”: four independent vectors
• Lack of generalization to novel/rare morphological variants
• For instance, in Arabic => “and to his vehicle”
• One vector for compound words?
• “kolmi/vaihe/kilo/watti/tunti/mittari” => one vector?
• “kolme” => one vector?
• Spelling issues
• See Workshop on Processing Historical Language or Universal Dependencies
• Good segmentation/tokenization needed for each language
• So, no, words don’t look like the units we want to work with…
Then, what should we do…?
• Original: 고양이가 침대 위에 누워있습니다
• Word-level modelling:
(고양이가, 침대, 위에, 누워있습니다)
• Subword-level modelling (Sennrich et al., 2015; Wu et al., 2016)
(고양이, 가, 침대, 위, 에, 누워, 있습니, 다)
• Character-level modelling with segmentation
(Wang et al., 2015; Luong & Manning, 2016; Costa-Jussa & Fonollosa, 2016)
((ㄱ,ㅗ,ㅇ,ㅑ,ㅇ,ㅣ,ㄱ,ㅏ), (ㅊ,ㅣ,ㅁ,ㄷ,ㅐ), (ㅇ,ㅟ,ㅇ,ㅔ),
(ㄴ,ㅜ,ㅇ,ㅝ,ㅇ,ㅣ,ㅆ,ㅅ,ㅡ,ㅂ,ㄴ,ㅣ,ㄷ,ㅏ))
• Fully character-level modelling (Chung et al., 2016; Lee et al., 2017)
(ㄱ,ㅗ,ㅇ,ㅑ,ㅇ,ㅣ,ㄱ,ㅏ,_,ㅊ,ㅣ,ㅁ,ㄷ,ㅐ,_,ㅇ,ㅟ,ㅇ,ㅔ,_,ㄴ,ㅜ,ㅇ,ㅝ,ㅇ,ㅣ,ㅆ,ㅅ,ㅡ,ㅂ,ㄴ,ㅣ,ㄷ,ㅏ)
Character-level translation
• Source: subword-level representation
• Target: character-level representation
• The decoder implicitly learned word-like units automatically!
(Chung et al., 2016)
Fully Character-level translation
• Source: character-level representation
• Target: character-level representation
• Efficient modelling with
a convolutional-recurrent encoder
• Works as well as, or better than,
subword-level translation
(Lee et al., 2017)
• More robust to errors
• Better handles rare tokens
• Rare tokens are not necessarily rare!
Character-level Multilingual Translation
• When symbols are shared across multiple languages, why not share a
single encoder/decoder for them?
1. Language transfer at all levels: letters, words, phrases, sentences, …
2. Intra-sentence code-switching without any specific data
(Lee et al., 2017; Johnson et al., 2016; Ha et al., 2016)
Non-parametric
neural machine translation
Bridging question-answering, information retrieval and machine translation
Parametric ML: Learning as Compression
• What does learning do?
• Parametric machine learning: data compression + pattern matching
[Diagram: training data is compressed into a neural network by learning; the neural network is then used for inference]
Non-Parametric NMT (1)
• Bring the whole training corpus together with a model
• Retrieve a small subset of examples using a fast search engine
• Let NMT figure out how to fuse
1. the current sentence, and
2. the retrieved translation pairs
Non-Parametric NMT (2)
• Apache Lucene: search engine
• A key-value memory network
[Gulcehre et al., 2017; Miller et al., 2016]
for storing retrieved pairs
• Similar to larger-context NMT
• [Wang et al., 2017;
Jean et al., 2017]
• Similar to NMT with external
knowledge
• [Ahn et al., 2016;
Bahdanau et al., 2017]
Non-Parametric NMT (3)
• When retrieved pairs are similar, huge
improvement!
• Otherwise, fall back to a normal NMT
• More consistency in style and vocabulary choice
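The retrieval step and the fallback can be sketched without a real search engine. Here token-level Jaccard similarity stands in for Apache Lucene, and the threshold `min_similarity`, the function names and the example pairs are all assumptions made for illustration.

```python
def jaccard(a, b):
    """Token-set overlap between two sentences, between 0 and 1."""
    a, b = set(a.split()), set(b.split())
    return len(a & b) / len(a | b) if a | b else 0.0


def retrieve(source, memory, k=2, min_similarity=0.3):
    """Return up to k (source, target) pairs similar enough to `source`.

    An empty result means: fall back to normal NMT.
    """
    scored = [(jaccard(source, src), src, tgt) for src, tgt in memory]
    scored = [item for item in scored if item[0] >= min_similarity]
    scored.sort(key=lambda item: -item[0])
    return [(src, tgt) for _, src, tgt in scored[:k]]


memory = [
    ("the cat sleeps on the bed", "die Katze schläft auf dem Bett"),
    ("the dog runs in the park", "der Hund läuft im Park"),
    ("stock prices fell sharply", "die Aktienkurse fielen stark"),
]
similar = retrieve("the cat sleeps on the sofa", memory)
nothing = retrieve("quarterly earnings report", memory)
```

The retrieved pairs would then be stored in the key-value memory for the translation model to attend over; that fusion step is not shown here.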
Other advances in neural machine translation
• Discourse-level machine translation
• [Jean et al., 2017; DCU, 2017]
• Better decoding strategies
• Learning-to-search [Wiseman & Rush, 2016]
• Reinforcement learning [MRT, 2016; Ranzato et al., 2015; Bahdanau et al., 2015]
• Trainable decoding [Gu et al., 2017]
• Alternative decoding cost [Li et al., 2016; Li et al., 2017]
• Linguistics-guided neural machine translation
• Learning to parse and translate [Eriguchi et al., 2017; Aharoni & Goldberg, 2017; Luong
et al., 2016]
• Syntax-aware neural machine translation [Nadejde et al., 2017]
Paradigm Shift: modular, life-long learning
Acknowledgement
• TenCent, eBay, Google, NVIDIA,
Facebook and NYU for generously
supporting my research and lab!
• Some of this work was sponsored
through industrial projects with
Samsung and NVIDIA!

Recsys 2016 tutorial: Lessons learned from building real-life recommender sys...Recsys 2016 tutorial: Lessons learned from building real-life recommender sys...
Recsys 2016 tutorial: Lessons learned from building real-life recommender sys...Xavier Amatriain
 
Deep Learning in Recommender Systems - RecSys Summer School 2017
Deep Learning in Recommender Systems - RecSys Summer School 2017Deep Learning in Recommender Systems - RecSys Summer School 2017
Deep Learning in Recommender Systems - RecSys Summer School 2017Balázs Hidasi
 

Ähnlich wie Deep Learning, Where Are You Going? (20)

Deep Domain
Deep DomainDeep Domain
Deep Domain
 
Beyond the Symbols: A 30-minute Overview of NLP
Beyond the Symbols: A 30-minute Overview of NLPBeyond the Symbols: A 30-minute Overview of NLP
Beyond the Symbols: A 30-minute Overview of NLP
 
OWF14 - Big Data : The State of Machine Learning in 2014
OWF14 - Big Data : The State of Machine  Learning in 2014OWF14 - Big Data : The State of Machine  Learning in 2014
OWF14 - Big Data : The State of Machine Learning in 2014
 
Introduction to deep learning
Introduction to deep learningIntroduction to deep learning
Introduction to deep learning
 
Improving neural question generation using answer separation
Improving neural question generation using answer separationImproving neural question generation using answer separation
Improving neural question generation using answer separation
 
Deep Learning & NLP: Graphs to the Rescue!
Deep Learning & NLP: Graphs to the Rescue!Deep Learning & NLP: Graphs to the Rescue!
Deep Learning & NLP: Graphs to the Rescue!
 
Deep learning with tensorflow
Deep learning with tensorflowDeep learning with tensorflow
Deep learning with tensorflow
 
MILA DL & RL summer school highlights
MILA DL & RL summer school highlights MILA DL & RL summer school highlights
MILA DL & RL summer school highlights
 
Deep learning introduction
Deep learning introductionDeep learning introduction
Deep learning introduction
 
Invoice 2 Vec: Creating AI to Read Documents - Mark Landry - H2O AI World Lon...
Invoice 2 Vec: Creating AI to Read Documents - Mark Landry - H2O AI World Lon...Invoice 2 Vec: Creating AI to Read Documents - Mark Landry - H2O AI World Lon...
Invoice 2 Vec: Creating AI to Read Documents - Mark Landry - H2O AI World Lon...
 
Naver learning to rank question answer pairs using hrde-ltc
Naver learning to rank question answer pairs using hrde-ltcNaver learning to rank question answer pairs using hrde-ltc
Naver learning to rank question answer pairs using hrde-ltc
 
Tomáš Mikolov - Distributed Representations for NLP
Tomáš Mikolov - Distributed Representations for NLPTomáš Mikolov - Distributed Representations for NLP
Tomáš Mikolov - Distributed Representations for NLP
 
Training at AI Frontiers 2018 - Lukasz Kaiser: Sequence to Sequence Learning ...
Training at AI Frontiers 2018 - Lukasz Kaiser: Sequence to Sequence Learning ...Training at AI Frontiers 2018 - Lukasz Kaiser: Sequence to Sequence Learning ...
Training at AI Frontiers 2018 - Lukasz Kaiser: Sequence to Sequence Learning ...
 
NLP and Deep Learning for non_experts
NLP and Deep Learning for non_expertsNLP and Deep Learning for non_experts
NLP and Deep Learning for non_experts
 
Enriching Solr with Deep Learning for a Question Answering System - Sanket Sh...
Enriching Solr with Deep Learning for a Question Answering System - Sanket Sh...Enriching Solr with Deep Learning for a Question Answering System - Sanket Sh...
Enriching Solr with Deep Learning for a Question Answering System - Sanket Sh...
 
Deep Learning with CNTK
Deep Learning with CNTKDeep Learning with CNTK
Deep Learning with CNTK
 
DEF CON 24 - Clarence Chio - machine duping 101
DEF CON 24 - Clarence Chio - machine duping 101DEF CON 24 - Clarence Chio - machine duping 101
DEF CON 24 - Clarence Chio - machine duping 101
 
GPT-2: Language Models are Unsupervised Multitask Learners
GPT-2: Language Models are Unsupervised Multitask LearnersGPT-2: Language Models are Unsupervised Multitask Learners
GPT-2: Language Models are Unsupervised Multitask Learners
 
Recsys 2016 tutorial: Lessons learned from building real-life recommender sys...
Recsys 2016 tutorial: Lessons learned from building real-life recommender sys...Recsys 2016 tutorial: Lessons learned from building real-life recommender sys...
Recsys 2016 tutorial: Lessons learned from building real-life recommender sys...
 
Deep Learning in Recommender Systems - RecSys Summer School 2017
Deep Learning in Recommender Systems - RecSys Summer School 2017Deep Learning in Recommender Systems - RecSys Summer School 2017
Deep Learning in Recommender Systems - RecSys Summer School 2017
 

Mehr von NAVER Engineering

디자인 시스템에 직방 ZUIX
디자인 시스템에 직방 ZUIX디자인 시스템에 직방 ZUIX
디자인 시스템에 직방 ZUIXNAVER Engineering
 
진화하는 디자인 시스템(걸음마 편)
진화하는 디자인 시스템(걸음마 편)진화하는 디자인 시스템(걸음마 편)
진화하는 디자인 시스템(걸음마 편)NAVER Engineering
 
서비스 운영을 위한 디자인시스템 프로젝트
서비스 운영을 위한 디자인시스템 프로젝트서비스 운영을 위한 디자인시스템 프로젝트
서비스 운영을 위한 디자인시스템 프로젝트NAVER Engineering
 
BPL(Banksalad Product Language) 무야호
BPL(Banksalad Product Language) 무야호BPL(Banksalad Product Language) 무야호
BPL(Banksalad Product Language) 무야호NAVER Engineering
 
이번 생에 디자인 시스템은 처음이라
이번 생에 디자인 시스템은 처음이라이번 생에 디자인 시스템은 처음이라
이번 생에 디자인 시스템은 처음이라NAVER Engineering
 
날고 있는 여러 비행기 넘나 들며 정비하기
날고 있는 여러 비행기 넘나 들며 정비하기날고 있는 여러 비행기 넘나 들며 정비하기
날고 있는 여러 비행기 넘나 들며 정비하기NAVER Engineering
 
쏘카프레임 구축 배경과 과정
 쏘카프레임 구축 배경과 과정 쏘카프레임 구축 배경과 과정
쏘카프레임 구축 배경과 과정NAVER Engineering
 
플랫폼 디자이너 없이 디자인 시스템을 구축하는 프로덕트 디자이너의 우당탕탕 고통 연대기
플랫폼 디자이너 없이 디자인 시스템을 구축하는 프로덕트 디자이너의 우당탕탕 고통 연대기플랫폼 디자이너 없이 디자인 시스템을 구축하는 프로덕트 디자이너의 우당탕탕 고통 연대기
플랫폼 디자이너 없이 디자인 시스템을 구축하는 프로덕트 디자이너의 우당탕탕 고통 연대기NAVER Engineering
 
200820 NAVER TECH CONCERT 15_Code Review is Horse(코드리뷰는 말이야)(feat.Latte)
200820 NAVER TECH CONCERT 15_Code Review is Horse(코드리뷰는 말이야)(feat.Latte)200820 NAVER TECH CONCERT 15_Code Review is Horse(코드리뷰는 말이야)(feat.Latte)
200820 NAVER TECH CONCERT 15_Code Review is Horse(코드리뷰는 말이야)(feat.Latte)NAVER Engineering
 
200819 NAVER TECH CONCERT 03_화려한 코루틴이 내 앱을 감싸네! 코루틴으로 작성해보는 깔끔한 비동기 코드
200819 NAVER TECH CONCERT 03_화려한 코루틴이 내 앱을 감싸네! 코루틴으로 작성해보는 깔끔한 비동기 코드200819 NAVER TECH CONCERT 03_화려한 코루틴이 내 앱을 감싸네! 코루틴으로 작성해보는 깔끔한 비동기 코드
200819 NAVER TECH CONCERT 03_화려한 코루틴이 내 앱을 감싸네! 코루틴으로 작성해보는 깔끔한 비동기 코드NAVER Engineering
 
200819 NAVER TECH CONCERT 10_맥북에서도 아이맥프로에서 빌드하는 것처럼 빌드 속도 빠르게 하기
200819 NAVER TECH CONCERT 10_맥북에서도 아이맥프로에서 빌드하는 것처럼 빌드 속도 빠르게 하기200819 NAVER TECH CONCERT 10_맥북에서도 아이맥프로에서 빌드하는 것처럼 빌드 속도 빠르게 하기
200819 NAVER TECH CONCERT 10_맥북에서도 아이맥프로에서 빌드하는 것처럼 빌드 속도 빠르게 하기NAVER Engineering
 
200819 NAVER TECH CONCERT 08_성능을 고민하는 슬기로운 개발자 생활
200819 NAVER TECH CONCERT 08_성능을 고민하는 슬기로운 개발자 생활200819 NAVER TECH CONCERT 08_성능을 고민하는 슬기로운 개발자 생활
200819 NAVER TECH CONCERT 08_성능을 고민하는 슬기로운 개발자 생활NAVER Engineering
 
200819 NAVER TECH CONCERT 05_모르면 손해보는 Android 디버깅/분석 꿀팁 대방출
200819 NAVER TECH CONCERT 05_모르면 손해보는 Android 디버깅/분석 꿀팁 대방출200819 NAVER TECH CONCERT 05_모르면 손해보는 Android 디버깅/분석 꿀팁 대방출
200819 NAVER TECH CONCERT 05_모르면 손해보는 Android 디버깅/분석 꿀팁 대방출NAVER Engineering
 
200819 NAVER TECH CONCERT 09_Case.xcodeproj - 좋은 동료로 거듭나기 위한 노하우
200819 NAVER TECH CONCERT 09_Case.xcodeproj - 좋은 동료로 거듭나기 위한 노하우200819 NAVER TECH CONCERT 09_Case.xcodeproj - 좋은 동료로 거듭나기 위한 노하우
200819 NAVER TECH CONCERT 09_Case.xcodeproj - 좋은 동료로 거듭나기 위한 노하우NAVER Engineering
 
200820 NAVER TECH CONCERT 14_야 너두 할 수 있어. 비전공자, COBOL 개발자를 거쳐 네이버에서 FE 개발하게 된...
200820 NAVER TECH CONCERT 14_야 너두 할 수 있어. 비전공자, COBOL 개발자를 거쳐 네이버에서 FE 개발하게 된...200820 NAVER TECH CONCERT 14_야 너두 할 수 있어. 비전공자, COBOL 개발자를 거쳐 네이버에서 FE 개발하게 된...
200820 NAVER TECH CONCERT 14_야 너두 할 수 있어. 비전공자, COBOL 개발자를 거쳐 네이버에서 FE 개발하게 된...NAVER Engineering
 
200820 NAVER TECH CONCERT 13_네이버에서 오픈 소스 개발을 통해 성장하는 방법
200820 NAVER TECH CONCERT 13_네이버에서 오픈 소스 개발을 통해 성장하는 방법200820 NAVER TECH CONCERT 13_네이버에서 오픈 소스 개발을 통해 성장하는 방법
200820 NAVER TECH CONCERT 13_네이버에서 오픈 소스 개발을 통해 성장하는 방법NAVER Engineering
 
200820 NAVER TECH CONCERT 12_상반기 네이버 인턴을 돌아보며
200820 NAVER TECH CONCERT 12_상반기 네이버 인턴을 돌아보며200820 NAVER TECH CONCERT 12_상반기 네이버 인턴을 돌아보며
200820 NAVER TECH CONCERT 12_상반기 네이버 인턴을 돌아보며NAVER Engineering
 
200820 NAVER TECH CONCERT 11_빠르게 성장하는 슈퍼루키로 거듭나기
200820 NAVER TECH CONCERT 11_빠르게 성장하는 슈퍼루키로 거듭나기200820 NAVER TECH CONCERT 11_빠르게 성장하는 슈퍼루키로 거듭나기
200820 NAVER TECH CONCERT 11_빠르게 성장하는 슈퍼루키로 거듭나기NAVER Engineering
 
200819 NAVER TECH CONCERT 07_신입 iOS 개발자 개발업무 적응기
200819 NAVER TECH CONCERT 07_신입 iOS 개발자 개발업무 적응기200819 NAVER TECH CONCERT 07_신입 iOS 개발자 개발업무 적응기
200819 NAVER TECH CONCERT 07_신입 iOS 개발자 개발업무 적응기NAVER Engineering
 

Mehr von NAVER Engineering (20)

React vac pattern
React vac patternReact vac pattern
React vac pattern
 
디자인 시스템에 직방 ZUIX
디자인 시스템에 직방 ZUIX디자인 시스템에 직방 ZUIX
디자인 시스템에 직방 ZUIX
 
진화하는 디자인 시스템(걸음마 편)
진화하는 디자인 시스템(걸음마 편)진화하는 디자인 시스템(걸음마 편)
진화하는 디자인 시스템(걸음마 편)
 
서비스 운영을 위한 디자인시스템 프로젝트
서비스 운영을 위한 디자인시스템 프로젝트서비스 운영을 위한 디자인시스템 프로젝트
서비스 운영을 위한 디자인시스템 프로젝트
 
BPL(Banksalad Product Language) 무야호
BPL(Banksalad Product Language) 무야호BPL(Banksalad Product Language) 무야호
BPL(Banksalad Product Language) 무야호
 
이번 생에 디자인 시스템은 처음이라
이번 생에 디자인 시스템은 처음이라이번 생에 디자인 시스템은 처음이라
이번 생에 디자인 시스템은 처음이라
 
날고 있는 여러 비행기 넘나 들며 정비하기
날고 있는 여러 비행기 넘나 들며 정비하기날고 있는 여러 비행기 넘나 들며 정비하기
날고 있는 여러 비행기 넘나 들며 정비하기
 
쏘카프레임 구축 배경과 과정
 쏘카프레임 구축 배경과 과정 쏘카프레임 구축 배경과 과정
쏘카프레임 구축 배경과 과정
 
플랫폼 디자이너 없이 디자인 시스템을 구축하는 프로덕트 디자이너의 우당탕탕 고통 연대기
플랫폼 디자이너 없이 디자인 시스템을 구축하는 프로덕트 디자이너의 우당탕탕 고통 연대기플랫폼 디자이너 없이 디자인 시스템을 구축하는 프로덕트 디자이너의 우당탕탕 고통 연대기
플랫폼 디자이너 없이 디자인 시스템을 구축하는 프로덕트 디자이너의 우당탕탕 고통 연대기
 
200820 NAVER TECH CONCERT 15_Code Review is Horse(코드리뷰는 말이야)(feat.Latte)
200820 NAVER TECH CONCERT 15_Code Review is Horse(코드리뷰는 말이야)(feat.Latte)200820 NAVER TECH CONCERT 15_Code Review is Horse(코드리뷰는 말이야)(feat.Latte)
200820 NAVER TECH CONCERT 15_Code Review is Horse(코드리뷰는 말이야)(feat.Latte)
 
200819 NAVER TECH CONCERT 03_화려한 코루틴이 내 앱을 감싸네! 코루틴으로 작성해보는 깔끔한 비동기 코드
200819 NAVER TECH CONCERT 03_화려한 코루틴이 내 앱을 감싸네! 코루틴으로 작성해보는 깔끔한 비동기 코드200819 NAVER TECH CONCERT 03_화려한 코루틴이 내 앱을 감싸네! 코루틴으로 작성해보는 깔끔한 비동기 코드
200819 NAVER TECH CONCERT 03_화려한 코루틴이 내 앱을 감싸네! 코루틴으로 작성해보는 깔끔한 비동기 코드
 
200819 NAVER TECH CONCERT 10_맥북에서도 아이맥프로에서 빌드하는 것처럼 빌드 속도 빠르게 하기
200819 NAVER TECH CONCERT 10_맥북에서도 아이맥프로에서 빌드하는 것처럼 빌드 속도 빠르게 하기200819 NAVER TECH CONCERT 10_맥북에서도 아이맥프로에서 빌드하는 것처럼 빌드 속도 빠르게 하기
200819 NAVER TECH CONCERT 10_맥북에서도 아이맥프로에서 빌드하는 것처럼 빌드 속도 빠르게 하기
 
200819 NAVER TECH CONCERT 08_성능을 고민하는 슬기로운 개발자 생활
200819 NAVER TECH CONCERT 08_성능을 고민하는 슬기로운 개발자 생활200819 NAVER TECH CONCERT 08_성능을 고민하는 슬기로운 개발자 생활
200819 NAVER TECH CONCERT 08_성능을 고민하는 슬기로운 개발자 생활
 
200819 NAVER TECH CONCERT 05_모르면 손해보는 Android 디버깅/분석 꿀팁 대방출
200819 NAVER TECH CONCERT 05_모르면 손해보는 Android 디버깅/분석 꿀팁 대방출200819 NAVER TECH CONCERT 05_모르면 손해보는 Android 디버깅/분석 꿀팁 대방출
200819 NAVER TECH CONCERT 05_모르면 손해보는 Android 디버깅/분석 꿀팁 대방출
 
200819 NAVER TECH CONCERT 09_Case.xcodeproj - 좋은 동료로 거듭나기 위한 노하우
200819 NAVER TECH CONCERT 09_Case.xcodeproj - 좋은 동료로 거듭나기 위한 노하우200819 NAVER TECH CONCERT 09_Case.xcodeproj - 좋은 동료로 거듭나기 위한 노하우
200819 NAVER TECH CONCERT 09_Case.xcodeproj - 좋은 동료로 거듭나기 위한 노하우
 
200820 NAVER TECH CONCERT 14_야 너두 할 수 있어. 비전공자, COBOL 개발자를 거쳐 네이버에서 FE 개발하게 된...
200820 NAVER TECH CONCERT 14_야 너두 할 수 있어. 비전공자, COBOL 개발자를 거쳐 네이버에서 FE 개발하게 된...200820 NAVER TECH CONCERT 14_야 너두 할 수 있어. 비전공자, COBOL 개발자를 거쳐 네이버에서 FE 개발하게 된...
200820 NAVER TECH CONCERT 14_야 너두 할 수 있어. 비전공자, COBOL 개발자를 거쳐 네이버에서 FE 개발하게 된...
 
200820 NAVER TECH CONCERT 13_네이버에서 오픈 소스 개발을 통해 성장하는 방법
200820 NAVER TECH CONCERT 13_네이버에서 오픈 소스 개발을 통해 성장하는 방법200820 NAVER TECH CONCERT 13_네이버에서 오픈 소스 개발을 통해 성장하는 방법
200820 NAVER TECH CONCERT 13_네이버에서 오픈 소스 개발을 통해 성장하는 방법
 
200820 NAVER TECH CONCERT 12_상반기 네이버 인턴을 돌아보며
200820 NAVER TECH CONCERT 12_상반기 네이버 인턴을 돌아보며200820 NAVER TECH CONCERT 12_상반기 네이버 인턴을 돌아보며
200820 NAVER TECH CONCERT 12_상반기 네이버 인턴을 돌아보며
 
200820 NAVER TECH CONCERT 11_빠르게 성장하는 슈퍼루키로 거듭나기
200820 NAVER TECH CONCERT 11_빠르게 성장하는 슈퍼루키로 거듭나기200820 NAVER TECH CONCERT 11_빠르게 성장하는 슈퍼루키로 거듭나기
200820 NAVER TECH CONCERT 11_빠르게 성장하는 슈퍼루키로 거듭나기
 
200819 NAVER TECH CONCERT 07_신입 iOS 개발자 개발업무 적응기
200819 NAVER TECH CONCERT 07_신입 iOS 개발자 개발업무 적응기200819 NAVER TECH CONCERT 07_신입 iOS 개발자 개발업무 적응기
200819 NAVER TECH CONCERT 07_신입 iOS 개발자 개발업무 적응기
 

Kürzlich hochgeladen

How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 

Kürzlich hochgeladen (20)

How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 

Deep Learning, Where Are You Going?

  • 1. Deep Learning: a Next Step? Kyunghyun Cho New York University Center for Data Science, and Courant Institute of Mathematical Sciences Naver Labs
  • 3. What we want is… [Diagram: Awesome ConvNet, LM, ASR, RoboArm Controller, Q&A, Auto-Driver, and Memory modules] • One system with many modules • Modules interact with each other to solve a task • Knowledge sharing across tasks via shared modules • Some trainable, others fixed
  • 4. Paradigm shift • One neural network per task • One neural network per function • Multiple networks cooperate to solve many higher-level tasks • Mixture of trainable networks and fixed modules
  • 5. Examples • Q&A system 1. Receives a question via awesome LM+ASR 2. Retrieves relevant info from awesome memory 3. Generates a response via awesome LM • Autonomous driving 1. Senses the environment with awesome ConvNet+ASR 2. Plans a route with awesome memory 3. Controls a car via awesome robot arm controller But simple composition of neural networks may not work! Why not?
  • 6. Learning to use an NN module • Why not? • Target tasks are often unknown at training time • Input/output cannot be defined well a priori • The amount of learning signal differs vastly across tasks • Rich information captured by the NN module must be passed along • The internals of the NN module must allow external manipulation
  • 7. Good news: NN’s are transparent! Hidden activations of a recurrent language model • NN’s are not black boxes. • We can observe every single bit inside a neural net. Bad news: NN’s are not easy to understand! • Humans are not good with high-dimensional vectors • Distributed representation • exponential combinations of hidden units
  • 8. Learning to use an NN module • Neural nets are good at interpreting high-dimensional input • Neural nets are also good at predicting high-dimensional output • Internal representation learned by a neural network is well structured • Neural nets can be trained with an arbitrary objective (My Rejected NSF Proposal, 2016)
  • 9. Learning to use an NN module 1. Query-Efficient Imitation Learning 2. Trainable Decoding • Real-time Neural Machine Translation • Trainable Greedy Decoding 3. Neural Query Reformulation 4. Non-Parametric Neural Machine Translation
  • 10. Query-Efficient Imitation Learning Jiakai Zhang & K Cho. Query-Efficient Imitation Learning for End-to-End Autonomous Driving. AAAI 2017.
  • 11. Imitation Learning • A learner directly interacts with the world • A supervisor augments reward signal from the world • Advantages over supervised and • Match between training and test • Strong learning signal • Disadvantages • Where do we get the supervisor??? (Ross et al., 2011; Daume III et al., 2007; and more…)
  • 12. SafeDAgger: Query-Efficient Imitation Learning • Supervisors are expensive • As the learner gets better, less intervention from the supervisor is needed • Learner learns from difficult examples • Questions: 1. Where do we get the safety net? 2. What is the impact on the learner’s performance? (Zhang & Cho, AAAI 2017; Laskey et al., ICRA 2016)
  • 13. SafeDAgger: Query-Efficient Imitation Learning 1. Learner observes the world 2. SafetyNet observes the learner 3. SafetyNet predicts whether the learner will fail 4. If no, the learner continues 5. If yes, 1. the supervisor intervenes 2. the learner imitates the supervisor’s behaviour Reminds us of the value function from RL!
  • 14. SafeDAgger: Learning 1. Initial labelled data sets: and 2. Train the policy using 3. Train the safety net using 1. Target for the safety net given 4. Collect additional data 1. Let drive, but the expert intervenes when 2. Collect data: 5. Data aggregation: 6. Go to 2
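The SafeDAgger procedure on slides 13 and 14 can be sketched roughly as below. This is a toy illustration under stated assumptions, not the authors' implementation: the 1-D "world", the `expert` function, the linear learner, the threshold `tau`, and the rule that the safety net's target is "the learner's action is within `tau` of the expert's" are all hypothetical stand-ins (the transcript elides the actual formulas), and a single data set is used where the slide mentions more than one.

```python
import random

def expert(state):
    # Hypothetical supervisor: steer back toward the lane centre.
    return -state

class LinearPolicy:
    """Learner: action = w * state, fitted by least squares."""
    def __init__(self):
        self.w = 0.0

    def act(self, state):
        return self.w * state

    def fit(self, data):
        den = sum(s * s for s, _ in data)
        self.w = sum(s * a for s, a in data) / den if den else 0.0

class SafetyNet:
    """Toy safety net: the learner is deemed safe when |state| <= limit."""
    def __init__(self):
        self.limit = 0.0

    def is_safe(self, state):
        return abs(state) <= self.limit

    def fit(self, data, policy, tau):
        # Assumed target: safe iff the learner is within tau of the expert.
        safe = [abs(s) for s, a in data if abs(policy.act(s) - a) < tau]
        self.limit = max(safe, default=0.0)

def safedagger(iterations=3, rollout_steps=50, tau=0.1, seed=0):
    rng = random.Random(seed)
    # 1. Initial labelled data collected from the expert alone.
    data = [(s, expert(s)) for s in (rng.uniform(-1, 1) for _ in range(5))]
    policy, safety_net = LinearPolicy(), SafetyNet()
    queries_per_iteration = []
    for _ in range(iterations):
        policy.fit(data)                   # 2. train the policy
        safety_net.fit(data, policy, tau)  # 3. train the safety net
        new_data = []
        for _ in range(rollout_steps):     # 4. let the learner drive
            state = rng.uniform(-1, 1)
            if not safety_net.is_safe(state):
                # Predicted failure: the expert intervenes and labels the state.
                new_data.append((state, expert(state)))
        queries_per_iteration.append(len(new_data))
        data.extend(new_data)              # 5. data aggregation, then repeat
    return policy, safety_net, queries_per_iteration
```

The point of the gating step is that the expert is only queried in states the safety net flags, so `queries_per_iteration` counts how often the supervisor was actually needed.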
  • 18. Trainable Decoding of Neural Machine Translation Jiatao Gu, Graham Neubig, K Cho and Victor Li. Learning to Translate in Real-time with Neural Machine Translation. EACL 2017. Jiatao Gu, K Cho and Victor Li. Trainable Greedy Decoding for Neural Machine Translation. EMNLP 2017.
  • 19. Trainable Decoding Motivation • Many decoding objectives unknown while training • Lack of target training examples • Arbitrary (non-differentiable) decoding objectives • Sample-“in”efficiency of RL algorithms Our Approach • Train NMT with supervised learning • Train a decoding module on top
  • 20. (1) Real-Time Translation Decoding 1. Start with a pretrained NMT 2. A simultaneous decoder intercepts and interprets the incoming signal 3. The simultaneous decoder forces the pretrained model to either 1. output a target symbol, or 2. wait for the next source symbol Learning 1. Trade-off between delay and quality 2. Stochastic policy gradient (REINFORCE) (Gu, Neubig, Cho & Li, EACL 2017)
  • 22. (2) Trainable Greedy Decoding Decoding 1. Start with a pretrained NMT 2. A trainable decoder intercepts and interprets the incoming signal 3. The trainable decoder sends an altering signal back to the pretrained model Learning 1. Deterministic policy gradient 2. Maximize any arbitrary objective (Gu, Cho & Li, 2017)
  • 23. (2) Trainable Greedy Decoding Models 1. Actor • Input: prev. hid. state, prev. symbol, and context from the attention model • Output: additive bias for hid. state • Example: 2. Critic • Input: a sequence of the hidden states from the decoder • Output: a predicted return • In our case, the critic estimates the full return rather than Q at each time step (Gu, Cho & Li, 2017)
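The read/write control loop described on slide 20 might look like the sketch below. It only illustrates the interleaving of "wait for a source symbol" and "output a target symbol"; the names `toy_nmt`, `lag_policy` and `simultaneous_decode` are hypothetical, the "pretrained NMT" is a word-by-word upper-casing stub, and the hand-written lag rule stands in for the policy that the slide says is learned with REINFORCE.

```python
EOS = "</s>"

def toy_nmt(seen_source, target_so_far, source_finished):
    """Stand-in for a pretrained NMT model: proposes the next target symbol."""
    i = len(target_so_far)
    if i < len(seen_source):
        return seen_source[i].upper()
    # Nothing left to translate: stop if the source is over, else stay silent.
    return EOS if source_finished else None

def lag_policy(lag):
    """Hand-written stand-in for the learned agent: stay `lag` symbols behind."""
    def policy(n_read, n_written, source_finished):
        if source_finished or n_read - n_written >= lag:
            return "WRITE"
        return "WAIT"
    return policy

def simultaneous_decode(source, policy, model=toy_nmt):
    n_read, target, delays = 0, [], []
    while True:
        finished = n_read == len(source)
        action = policy(n_read, len(target), finished)
        if action == "WAIT" and not finished:
            n_read += 1                    # wait for the next source symbol
            continue
        symbol = model(source[:n_read], target, finished)
        if symbol == EOS:
            break
        if symbol is None:                 # model cannot commit yet: read more
            n_read += 1
            continue
        target.append(symbol)              # output a target symbol
        delays.append(n_read)              # source symbols seen when it was emitted
    return target, delays
```

For example, `simultaneous_decode(["a", "b", "c"], lag_policy(2))` emits each word after reading further into the source than `lag_policy(1)` does; the `delays` list is one simple way to expose the delay side of the delay/quality trade-off.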
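As a rough illustration of the actor on slide 23, the sketch below wraps a frozen toy recurrent decoder and lets an actor add a bias to its hidden state before each greedy step. Everything here is an assumption made for illustration: the layer sizes, the random "pretrained" weights, the linear form of the actor, and the choice to add the bias to the previous hidden state just before the decoder update. The critic and the training procedure are not shown.

```python
import math
import random

HIDDEN, EMB, CTX, VOCAB = 4, 3, 2, 5   # hypothetical toy sizes

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def rand_matrix(rows, cols, rng, scale=0.5):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]

def make_params(rng):
    # Stand-in for a pretrained, frozen decoder.
    return {
        "W": rand_matrix(HIDDEN, HIDDEN + EMB + CTX, rng),  # recurrent update
        "V": rand_matrix(VOCAB, HIDDEN, rng),               # output layer
        "emb": rand_matrix(VOCAB, EMB, rng),                # symbol embeddings
    }

def greedy_decode(params, contexts, actor_weights=None):
    W, V, emb = params["W"], params["V"], params["emb"]
    h = [0.0] * HIDDEN
    y = 0                                   # index of the start symbol
    output = []
    for c in contexts:                      # one attention context per step
        if actor_weights is not None:
            # Actor: (prev. hidden state, prev. symbol, context) -> additive bias.
            z = matvec(actor_weights, h + emb[y] + c)
            h = [a + b for a, b in zip(h, z)]
        # Frozen decoder update followed by a greedy (argmax) choice.
        h = [math.tanh(a) for a in matvec(W, h + emb[y] + c)]
        logits = matvec(V, h)
        y = max(range(VOCAB), key=logits.__getitem__)
        output.append(y)
    return output

rng = random.Random(0)
params = make_params(rng)
contexts = [[rng.uniform(-1, 1) for _ in range(CTX)] for _ in range(6)]
baseline = greedy_decode(params, contexts)
zero_actor = [[0.0] * (HIDDEN + EMB + CTX) for _ in range(HIDDEN)]
```

An actor whose weights are all zero leaves the pretrained decoder's greedy output unchanged, which is a convenient starting point: training then only has to learn deviations that improve the chosen decoding objective.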
  • 24. (2) Trainable Greedy Decoding (Gu, Cho & Li, EMNLP 2017) Learning 1) Generate translation given a source sentence with noise and 2) Train the critic to minimize 3) Generate multiple translations with noise 4) Critic-aware actor learning: newly proposed where Inference: simply throw away the critic and use the actor
  • 25. (2) Trainable Greedy Decoding • The trainable decoder does improve the target decoding objective • Training is quite unstable without the critic-aware actor learning algorithm • More work is definitely needed for further improvement
  • 26. Toward End-to-End Q&A Rodrigo Nogueira & K Cho. Task-Oriented Query Reformulation with Reinforcement Learning. EMNLP 2017. Dunn et al. SearchQA: A New Q&A Dataset Augmented with Context from a Search Engine. arXiv 2017.
  • 27. End-to-End Question-Answering Neural Query Reformulator Machine Comprehension Trainable Fixed (Black box)
  • 28. Neural Query Reformulator Neural Query Reformulator 1. Reads an original query q0 2. Augment/reformulate q0 Learning 1. Hard RL problem: partial observability due to the black box search engine 2. Policy gradient to maximize recall@K (Nogueira & Cho, 2017) Code and data available at https://github.com/nyu-dl/QueryReformulator
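The reward named on slide 28, recall@K, can be computed as in the sketch below. The tiny document collection, the term-overlap `search` function standing in for the black-box search engine, and the example queries are all made up for illustration; the reformulator itself and its policy-gradient training are not shown.

```python
# Hypothetical toy collection standing in for the search engine's index.
DOCS = {
    "d1": "neural machine translation with attention",
    "d2": "statistical phrase based translation",
    "d3": "imitation learning for autonomous driving",
    "d4": "attention models for image captioning",
}

def search(query_terms):
    """Stand-in black-box search engine: rank documents by term overlap."""
    def score(doc_id):
        words = set(DOCS[doc_id].split())
        return sum(term in words for term in query_terms)
    # Higher score first; ties broken by document id for determinism.
    return sorted(DOCS, key=lambda d: (-score(d), d))

def recall_at_k(retrieved, relevant, k):
    """Fraction of the relevant documents found among the top-k results."""
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & set(relevant)) / len(relevant)

relevant = {"d1", "d4"}
q0 = ["translation"]                        # original query
q1 = q0 + ["attention", "models"]           # a hypothetical reformulation
reward_q0 = recall_at_k(search(q0), relevant, k=2)
reward_q1 = recall_at_k(search(q1), relevant, k=2)
```

Because the reformulator only ever sees the ranked list returned by `search`, the reward is a function of the search engine's output rather than of anything differentiable, which is why the slide frames learning as a reinforcement learning problem.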
  • 29. SearchQA: new dataset for machine comprehension (Dunn et al., 2017) Data available at https://github.com/nyu-dl/SearchQA (Q, A) (Q, A, { S1, S2, . . . , SN } ) Retrieve Crawl Search SearchQA 1. Realistic, noisy context from Google 2. Multiple snippets per question 3. Large-scale data (140k q-a-c tuples)
  • 30. And, Google did it! • A pretrained, black-box Q&A model • Query reformulation with RL • Tested on SearchQA (Buck et al., 2017) https arxiv.org abs
  • 31. Few more relevant research directions • Communicating neural networks • Neural nets talk to each other to solve a problem • Sukhbaatar & Fergus (2015), Foerster et al. (2016), Evtimova et al. (2017), Lewis et al. (2017), … • Multimodal processing • Image captioning, zero-shot retrieval, … • Cho et al. (2015, review paper) • Planning, program synthesis • How do the modules compose with each other to solve a task? • Neural programmer interpreter [Reed et al., 2016; Cai et al., 2017] • Forward modelling [Henaff et al., 2017; Sutton, 1991 Dyna; optimal control…] • Mixture of experts [Google], progressive networks [Google DeepMind]
  • 32. Paradigm Shift: modular, life-long learning Neural Network Environment Users/Experts Search Engine Neural Network Database Neural Network Neural Network
  • 33. Neural Machine Translation Multilingual, Character-Level, Non-parametric Machine Translation
  • 35. • [Allen 1987 IEEE 1st ICNN] • 3310 En-Es pairs constructed on 31 En, 40 Es words, max 10/11 word sentence; 33 used as test set • Binary encoding of words – 50 inputs, 66 outputs; 1 or 3 hidden 150-unit layers. Ave WER: 1.3 words • [Chrisman 1992 Connection Science] • Dual-ported RAAM architecture [Pollack 1990 Artificial Intelligence] applied to corpus of 216 parallel pairs of simple En-Es sentences: • Split 50/50 as train/test, 75% of sentences correctly translated!
  • 36. Brief resurrection in 1997: Spain
  • 37. Modern neural machine translation [Figure: three pipelines from Source Sentence to Target Sentence: Neural Net + SMT (Schwenk et al. 2006); SMT + Neural Net (Devlin et al. 2014); a single Neural Network (Neural MT)]
  • 39. WMT 2017: news translation task
  • 40. A better single-pair translation system has never been “the” goal of neural MT
  • 41. Continuous representation: Interlingua 2.0? What if we can project sentences in multiple languages into a single vector space?
  • 42. What does NMT do? Encoder • Project a source sentence into a set of continuous vectors Decoder+Attention • Decode a target sentence from a set of “source” continuous vectors
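A minimal sketch of the decoder-side attention step described above, under simplifying assumptions: the scoring function here is a plain dot product (attention-based NMT systems often use a small learned network instead), and the function names are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(decoder_state, source_vectors):
    """Weight each source vector by its similarity to the decoder state.

    Returns the attention weights and the weighted sum (context vector).
    """
    scores = [sum(d * h for d, h in zip(decoder_state, vec))
              for vec in source_vectors]
    weights = softmax(scores)
    dim = len(source_vectors[0])
    context = [sum(w * vec[i] for w, vec in zip(weights, source_vectors))
               for i in range(dim)]
    return weights, context
```

The encoder supplies `source_vectors` (one per source position); at every target step the decoder recomputes the weights, so it can focus on different source positions for different target words.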
  • 43. What is this “continuous vector space”? • Similar sentences are near each other in this vector space • Multiple dimensions of similarity are encoded simultaneously (Sutskever et al., 2014)
  • 44. What is this “continuous vector space”? • Similar sentences are near each other in this vector space • Multiple dimensions of similarity are encoded simultaneously • (Trainable) near-bijective mapping between the continuous vector space and the sentence space • Stripped of hard linguistic symbols
  • 45. What is this “continuous vector space”? (Firat et al., 2016; Luong et al., 2015; Dong et al., 2015) • Can this continuous vector space be shared across multiple languages?
  • 46. Multi-way, multilingual machine translation (1) Language-agnostic Continuous Vector Space • One encoder per source language • One decoder per target language • Attention/alignment shared across all the language pairs • Only bilingual parallel corpora necessary • No multi-way parallel corpus needed (Firat et al., 2016)
  • 47. Multi-way, multilingual machine translation (2) • Neural nets are like Lego • Build one encoder per source • Build one decoder per target • Build one attention mechanism • Given a sentence pair (Firat et al., 2016)
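The Lego-style composition can be illustrated with a small registry. This is only a structural sketch: the modules are placeholder strings rather than trainable networks, and the class and method names are invented for the example.

```python
from dataclasses import dataclass, field

@dataclass
class MultiWayNMT:
    """Per-language encoders and decoders around a single shared attention."""
    encoders: dict = field(default_factory=dict)  # source language -> encoder
    decoders: dict = field(default_factory=dict)  # target language -> decoder
    attention: str = "shared-attention"

    def add_pair(self, src, tgt):
        """Register a bilingual corpus; create modules only if they are new."""
        self.encoders.setdefault(src, f"encoder[{src}]")
        self.decoders.setdefault(tgt, f"decoder[{tgt}]")

    def route(self, src, tgt):
        """Modules that would be used for one (source, target) sentence pair."""
        return (self.encoders[src], self.attention, self.decoders[tgt])

    def num_modules(self):
        return len(self.encoders) + len(self.decoders) + 1
```

With N source and M target languages the module count grows as N + M + 1 rather than N × M, and `route` can combine an encoder and a decoder that never appeared together in any parallel corpus.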
  • 48. Multi-way, multilingual machine translation (3) Language-agnostic Continuous Vector Space • Sentence-level positive language transfer • Helps low-resource language pairs • Why? 1. Better structural constraint on the continuous vector space 2. Regularization • Real-valued vector-based interlingua? (Firat et al., 2016)
  • 49. Beyond languages: multimodal translation • Does the source have to be a “sentence”? [Figure: a convolutional neural network produces annotation vectors h_j; an attention mechanism assigns attention weights a_j with Σ_j a_j = 1; the recurrent state z_i yields the sampled word u_i; output f = (a, man, is, jumping, into, a, lake, .)] (Xu et al., 2015)
  • 50. Beyond languages: multimodal translation (Caglayan et al., 2016; Elliott & Kadar, 2017)
  • 51. What is a sentence? Is a sentence a sequence of phrases, words, morphemes or characters?
  • 52. What is a sentence to a neural net? • Each word/symbol: one-hot vector • Prior-less encoding • Permutation invariant • Sentence • To us: a sequence of words • To NN: a sequence of one-hot vectors • What does it mean?
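A small sketch of what "prior-less" and "permutation invariant" mean for one-hot vectors. The four-word vocabulary is made up for illustration.

```python
def one_hot(index, vocab_size):
    """Vector of zeros with a single 1 at the given index."""
    vec = [0] * vocab_size
    vec[index] = 1
    return vec

def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def encode(vocab):
    """Map every word in the vocabulary to its one-hot vector."""
    return {w: one_hot(i, len(vocab)) for i, w in enumerate(vocab)}

vectors = encode(["cat", "cats", "dog", "bed"])
shuffled = encode(["bed", "dog", "cats", "cat"])  # same words, different indices
```

Every pair of distinct words sits at the same distance, so "cat" is no closer to "cats" than to "bed", and reassigning the indices changes nothing: any notion of similarity has to be learned by the network.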
  • 53. Why not words? • Inefficient handling of various morphological variants • Sub-optimal segmentation/tokenization • “Etxaberria”, “Etxazarra”, “Etxaguren”, “Etxarren”: four independent vectors • Lack of generalization to novel/rare morphological variants • For instance, in Arabic => “and to his vehicle” • One vector for compound words? • “kolmi/vaihe/kilo/watti/tunti/mittari” => one vector? • “kolme” => one vector? • Spelling issues • See Workshop on Processing Historical Language or Universal Dependencies • Good segmentation/tokenization needed for each language • So, no, words don’t look like the units we want to work with…
  • 54. Then, what should we do…? • Original: 고양이가 침대 위에 누워있습니다 (“The cat is lying on the bed”) • Word-level modelling: (고양이가, 침대, 위에, 누워있습니다) • Subword-level modelling (Sennrich et al., 2015; Wu et al., 2016) (고양이, 가, 침대, 위, 에, 누워, 있습니, 다) • Character-level modelling with segmentation (Wang et al., 2015; Luong & Manning, 2016; Costa-Jussa & Fonollosa, 2016) ((ㄱ,ㅗ,ㅇ,ㅑ,ㅇ,ㅇ,ㅣ,ㄱ,ㅏ), (ㅊ,ㅣ,ㅁ,ㄷ,ㅐ), (ㅇ,ㅟ,ㅇ,ㅔ), (ㄴ,ㅜ,ㅇ,ㅝ,ㅇ,ㅣ,ㅆ,ㅅ,ㅡ,ㅂ,ㄴ,ㅣ,ㄷ,ㅏ)) • Fully character-level modelling (Chung et al., 2016; Lee et al., 2017) (ㄱ,ㅗ,ㅇ,ㅑ,ㅇ,ㅇ,ㅣ,ㄱ,ㅏ,_,ㅊ,ㅣ,ㅁ,ㄷ,ㅐ,_,ㅇ,ㅟ,ㅇ,ㅔ,_,ㄴ,ㅜ,ㅇ,ㅝ,ㅇ,ㅣ,ㅆ,ㅅ,ㅡ,ㅂ,ㄴ,ㅣ,ㄷ,ㅏ)
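Three of the four granularities above can be reproduced with plain string handling; subword-level modelling is left out because it needs a learned segmentation model. The sketch below decomposes precomposed Hangul syllables arithmetically from their Unicode code points. The function names are hypothetical, and whitespace splitting stands in for a real tokenizer.

```python
# Compatibility jamo in the order used by the Unicode Hangul syllable block.
CHOSEONG = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"
JUNGSEONG = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"
JONGSEONG = ["", "ㄱ", "ㄲ", "ㄳ", "ㄴ", "ㄵ", "ㄶ", "ㄷ", "ㄹ", "ㄺ", "ㄻ", "ㄼ",
             "ㄽ", "ㄾ", "ㄿ", "ㅀ", "ㅁ", "ㅂ", "ㅄ", "ㅅ", "ㅆ", "ㅇ", "ㅈ", "ㅊ",
             "ㅋ", "ㅌ", "ㅍ", "ㅎ"]

def to_jamo(text):
    """Split each Hangul syllable into its letters; pass other characters through."""
    out = []
    for ch in text:
        offset = ord(ch) - 0xAC00
        if 0 <= offset < 11172:  # precomposed Hangul syllable range
            lead, rest = divmod(offset, 588)   # 588 = 21 vowels * 28 finals
            vowel, tail = divmod(rest, 28)
            out.append(CHOSEONG[lead])
            out.append(JUNGSEONG[vowel])
            if tail:
                out.append(JONGSEONG[tail])
        else:
            out.append(ch)
    return out

def word_level(sentence):
    return sentence.split()

def char_level_with_segmentation(sentence):
    return [to_jamo(word) for word in sentence.split()]

def fully_char_level(sentence):
    # Spaces become an explicit "_" symbol in one flat sequence.
    return to_jamo(sentence.replace(" ", "_"))
```

For example, `fully_char_level("침대 위에")` yields one flat sequence with `_` marking the space, while `char_level_with_segmentation` keeps one list of letters per word.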
  • 55. Character-level translation • Source: subword-level representation • Target: character-level representation • The decoder implicitly learned word-like units automatically! (Chung et al., 2017)
  • 56. Fully Character-level translation • Source: character-level representation • Target: character-level representation • Efficient modelling with a convolutional-recurrent encoder • Works as well as, or better than, subword-level translation (Lee et al., 2017)
  • 57. (Lee et al., 2017) • More robust to errors • Better handles rare tokens • Rare tokens are not necessarily rare!
  • 58. Character-level Multilingual Translation • When symbols are shared across multiple languages, why not share a single encoder/decoder for them? 1. Language transfer at all levels: letters, words, phrases, sentences, … 2. Intra-sentence code-switching without any specific data (Lee et al., 2017; Johnson et al., 2016; Ha et al., 2016)
  • 59. Non-parametric neural machine translation Bridging question-answering, information retrieval and machine translation
  • 60. Parametric ML: Learning as Compression • What does learning do? • Parametric machine learning: data compression + pattern matching Neural Network Training Data learning Neural Network Inference
  • 61. Non-Parametric NMT (1) • Bring the whole training corpus together with a model • Retrieve a small subset of examples using a fast search engine • Let NMT figure out how to fuse 1. the current sentence, and 2. the retrieved translation pairs
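The retrieval step can be sketched as follows. A word-overlap (Jaccard) ranking stands in for the fast search engine here, purely as an assumption for illustration; the function names and the toy corpus in the usage note are made up.

```python
def jaccard(a, b):
    """Word-overlap similarity between two whitespace-tokenized sentences."""
    a_set, b_set = set(a.split()), set(b.split())
    union = a_set | b_set
    return len(a_set & b_set) / len(union) if union else 0.0

def retrieve(source, corpus, k=2):
    """Return the k training pairs whose source side is most similar to `source`.

    corpus is a list of (source_sentence, target_sentence) pairs.
    """
    ranked = sorted(corpus, key=lambda pair: jaccard(source, pair[0]), reverse=True)
    return ranked[:k]
```

Given a corpus containing ("a cat sleeps on the bed", "un chat dort sur le lit"), the query "the cat sleeps on the bed" ranks that pair first; the translation model then receives both the query and the retrieved pairs.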
  • 62. Non-Parametric NMT (2) • Apache Lucene: search engine • A key-value memory network [Gulcehre et al., 2017; Miller et al., 2016] for storing retrieved pairs • Similar to larger-context NMT • [Wang et al., 2017; Jean et al., 2017] • Similar to NMT with external knowledge • [Ahn et al., 2016; Bahdanau et al., 2017]
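A rough sketch of reading from a key-value memory, assuming each entry pairs a key vector with a target token taken from a retrieved translation pair. The dot-product scoring and the function name are assumptions for illustration, not the configuration used in the cited work.

```python
import math
from collections import defaultdict

def read_key_value_memory(query, memory):
    """Attend over keys and accumulate the attention mass per stored token.

    memory is a list of (key_vector, value_token) entries.
    Returns a dict mapping each token to its total attention weight.
    """
    scores = [sum(q * k for q, k in zip(query, key)) for key, _ in memory]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    dist = defaultdict(float)
    for e, (_, token) in zip(exps, memory):
        dist[token] += e / total
    return dict(dist)
```

Keys decide where to look and values decide what is read, so tokens whose keys match the current decoder state receive most of the probability mass.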
  • 63. Non-Parametric NMT (3) • When retrieved pairs are similar, huge improvement! • Otherwise, fall back to normal NMT • More consistency in style and vocabulary choice
  • 64. Other advances in neural machine translation • Discourse-level machine translation • [Jean et al., 2017; DCU, 2017] • Better decoding strategies • Learning-to-search [Wiseman & Rush, 2016] • Reinforcement learning [MRT, 2016; Ranzato et al., 2015; Bahdanau et al., 2015] • Trainable decoding [Gu et al., 2017] • Alternative decoding cost [Li et al., 2016; Li et al., 2017] • Linguistics-guided neural machine translation • Learning to parse and translate [Eriguchi et al., 2017; Rohee & Goldberg, 2017; Luong et al., 2016] • Syntax-aware neural machine translation [Nadejde et al., 2017]
  • 65. Paradigm Shift: modular, life-long learning [Diagram: Search Engine, Database, Neural Networks] Acknowledgement • Thanks to TenCent, eBay, Google, NVIDIA, Facebook and NYU for generously supporting my research and lab! • Some of the work was sponsored through industrial projects with Samsung and NVIDIA!