Introduction to
Sequence to Sequence Model
2017.03.16 Seminar
Presenter : Hyemin Ahn
Recurrent Neural Networks : For what?
• Humans remember and use patterns in sequences.
  • Try ‘a b c d e f g…’
  • But how about ‘z y x w v u t s…’?
• The idea behind RNNs is to make use of sequential information.
• Let’s learn the pattern of a sequence, and then utilize it (estimate, generate, etc.)!
• But HOW?
Recurrent Neural Networks : Typical RNNs
[Figure: a typical RNN cell. The hidden state is fed back through a one-step delay, mapping the input to the output.]
• RNNs are called “RECURRENT” because they perform the same task for every element of a sequence, with the output depending on the previous computations.
• RNNs have a “memory” which captures information about what has been calculated so far.
• The hidden state $h_t$ captures some information about the sequence.
• If we use $f = \tanh$, the vanishing/exploding gradient problem can occur.
• To overcome this, we use LSTM/GRU.
[Figure: an RNN cell with input $x_t$, hidden state $h_t$, output $y_t$, and weight matrices $U$, $W$, $V$.]
$h_t = f(U x_t + W h_{t-1} + b)$
$y_t = V h_t + c$
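To make the update above concrete, here is a minimal NumPy sketch of one vanilla RNN step (an illustration added here, not part of the original slides); the parameter names `U`, `W`, `V`, `b`, `c` mirror the symbols in the equations, and the sizes in the toy usage are arbitrary.

```python
import numpy as np

def rnn_step(x_t, h_prev, U, W, V, b, c):
    """One vanilla RNN step: h_t = tanh(U x_t + W h_{t-1} + b), y_t = V h_t + c."""
    h_t = np.tanh(U @ x_t + W @ h_prev + b)  # new hidden state
    y_t = V @ h_t + c                        # output at this time step
    return h_t, y_t

# Toy usage (hypothetical sizes): 3-dim inputs, 4-dim hidden state, 2-dim outputs.
rng = np.random.default_rng(0)
U, W, V = rng.normal(size=(4, 3)), rng.normal(size=(4, 4)), rng.normal(size=(2, 4))
b, c = np.zeros(4), np.zeros(2)
h = np.zeros(4)
for x in rng.normal(size=(5, 3)):  # a length-5 input sequence
    h, y = rnn_step(x, h, U, W, V, b, c)
```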
Recurrent Neural Networks : LSTM
• Let’s think about a machine that guesses the dinner menu from the things in a shopping bag.
Umm… Carbonara!
Recurrent Neural Networks : LSTM
[Figure: an LSTM cell with input $x_t$ and hidden state $h_t$. The cell state $C_t$ is the internal memory unit, like a conveyor belt.]
Recurrent Neural Networks : LSTM
[Figure: the LSTM cell forgets some memories, removing information from the cell state $C_t$.]
Recurrent Neural Networks : LSTM
[Figure: the forget step, which removes some memories from the cell state $C_t$.]
LSTM learns (1) how to forget a memory when $h_{t-1}$ and the new input $x_t$ are given, and (2) how to add a new memory given $h_{t-1}$ and $x_t$.
Recurrent Neural Networks : LSTM
[Figure: the insert step, which adds some new memories to the cell state $C_t$.]
Recurrent Neural Networks : LSTM
[Figure: the complete LSTM cell, with cell state $C_t$, hidden state $h_t$, input $x_t$, and output $y_t$.]
Recurrent Neural Networks : LSTM
Figures from http://colah.github.io/posts/2015-08-Understanding-LSTMs/
Recurrent Neural Networks : GRU
$f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f)$
$i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i)$
$o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o)$
$\tilde{C}_t = \tanh(W_C \cdot [h_{t-1}, x_t] + b_C)$
$C_t = f_t * C_{t-1} + i_t * \tilde{C}_t$
$h_t = o_t * \tanh(C_t)$
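As a rough sketch of the LSTM equations above (added for illustration; not code from the original presentation), one LSTM step in NumPy might look like this, with each weight matrix acting on the concatenation $[h_{t-1}, x_t]$:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, C_prev, W_f, b_f, W_i, b_i, W_o, b_o, W_C, b_C):
    """One LSTM step following the equations above."""
    hx = np.concatenate([h_prev, x_t])   # [h_{t-1}, x_t]
    f_t = sigmoid(W_f @ hx + b_f)        # forget gate
    i_t = sigmoid(W_i @ hx + b_i)        # input gate
    o_t = sigmoid(W_o @ hx + b_o)        # output gate
    C_tilde = np.tanh(W_C @ hx + b_C)    # candidate cell state
    C_t = f_t * C_prev + i_t * C_tilde   # forget old memories, insert new ones
    h_t = o_t * np.tanh(C_t)             # new hidden state
    return h_t, C_t
```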
Maybe we can simplify this structure and make it more efficient!
GRU
$z_t = \sigma(W_z \cdot [h_{t-1}, x_t] + b_z)$
$r_t = \sigma(W_r \cdot [h_{t-1}, x_t] + b_r)$
$\tilde{h}_t = \tanh(W_h \cdot [r_t * h_{t-1}, x_t] + b_h)$
$h_t = (1 - z_t) * h_{t-1} + z_t * \tilde{h}_t$
Figures from http://colah.github.io/posts/2015-08-Understanding-LSTMs/
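Likewise, a minimal sketch of one GRU step under the equations above (again an added illustration, not from the slides):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(x_t, h_prev, W_z, b_z, W_r, b_r, W_h, b_h):
    """One GRU step following the equations above."""
    hx = np.concatenate([h_prev, x_t])          # [h_{t-1}, x_t]
    z_t = sigmoid(W_z @ hx + b_z)               # update gate
    r_t = sigmoid(W_r @ hx + b_r)               # reset gate
    h_tilde = np.tanh(W_h @ np.concatenate([r_t * h_prev, x_t]) + b_h)  # candidate state
    h_t = (1.0 - z_t) * h_prev + z_t * h_tilde  # mix previous state and candidate
    return h_t
```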
Sequence to Sequence Model: What is it?
[Figure: a sequence-to-sequence model. An LSTM/GRU encoder reads the input sequence and produces hidden states $h_e(1), \dots, h_e(5)$; an LSTM/GRU decoder then generates the output sequence with hidden states $h_d(1), \dots, h_d(T_e)$. The example depicts a “Western food to Korean food” transition.]
Sequence to Sequence Model: Implementation
• The simplest way to implement a sequence-to-sequence model is to just pass the last hidden state of the encoder, $h_T$, to the first GRU cell of the decoder!
• However, this method gets weaker when the decoder needs to generate longer sequences.
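As an added, hedged illustration of this simplest setup (not the code linked on the last slide), a PyTorch-style sketch might look like the following: a GRU encoder whose final hidden state $h_T$ initializes a GRU decoder. The class and parameter names (`SimpleSeq2Seq`, `emb_dim`, `hid_dim`, the toy vocabulary sizes) are hypothetical.

```python
import torch
import torch.nn as nn

class SimpleSeq2Seq(nn.Module):
    """Minimal encoder-decoder: the encoder's last hidden state initializes the decoder."""
    def __init__(self, src_vocab, tgt_vocab, emb_dim=32, hid_dim=64):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb_dim)
        self.tgt_emb = nn.Embedding(tgt_vocab, emb_dim)
        self.encoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.decoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, tgt_vocab)

    def forward(self, src_tokens, tgt_tokens):
        # Encode the source sequence; h_T summarizes the whole input.
        _, h_T = self.encoder(self.src_emb(src_tokens))
        # Decode, conditioned only on that single summary vector.
        dec_states, _ = self.decoder(self.tgt_emb(tgt_tokens), h_T)
        return self.out(dec_states)  # per-step scores over the target vocabulary

# Toy usage with random token ids.
model = SimpleSeq2Seq(src_vocab=100, tgt_vocab=120)
src = torch.randint(0, 100, (2, 7))  # batch of 2 source sequences, length 7
tgt = torch.randint(0, 120, (2, 5))  # batch of 2 target sequences, length 5
logits = model(src, tgt)             # shape (2, 5, 120)
```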
Sequence to Sequence Model: Attention Decoder
[Figure: a bidirectional GRU encoder feeding an attention GRU decoder through context vectors $c_t$.]
• For each GRU cell constituting the decoder, let’s pass the encoder’s information differently!
$h_i = [\overrightarrow{h}_i ; \overleftarrow{h}_i]$
$c_i = \sum_{j=1}^{T_x} \alpha_{ij} h_j$
$s_i = f(s_{i-1}, y_{i-1}, c_i) = (1 - z_i) * s_{i-1} + z_i * \tilde{s}_i$
$z_i = \sigma(W_z y_{i-1} + U_z s_{i-1})$
$r_i = \sigma(W_r y_{i-1} + U_r s_{i-1})$
$\tilde{s}_i = \tanh(W y_{i-1} + U (r_i * s_{i-1}) + C c_i)$
$\alpha_{ij} = \dfrac{\exp(e_{ij})}{\sum_{k=1}^{T_x} \exp(e_{ik})}$
$e_{ij} = v_a^{T} \tanh(W_a s_{i-1} + U_a h_j)$
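Purely as an illustration of these equations (not the authors' implementation), here is a NumPy sketch of one attention step: scoring every encoder state $h_j$ against the previous decoder state $s_{i-1}$ with $e_{ij}$, softmax-normalizing into $\alpha_{ij}$, and forming the context $c_i = \sum_j \alpha_{ij} h_j$. The function name and toy dimensions are made up.

```python
import numpy as np

def attention_context(s_prev, H, W_a, U_a, v_a):
    """Bahdanau-style attention: returns the context vector c_i and the weights alpha_i.

    s_prev : previous decoder state s_{i-1}, shape (d_s,)
    H      : encoder states h_1..h_{T_x} stacked as rows, shape (T_x, d_h)
    """
    # e_ij = v_a^T tanh(W_a s_{i-1} + U_a h_j), computed for all j at once.
    scores = np.tanh(s_prev @ W_a.T + H @ U_a.T) @ v_a  # shape (T_x,)
    # alpha_ij = softmax of the scores over the encoder positions j.
    alpha = np.exp(scores - scores.max())
    alpha = alpha / alpha.sum()
    # c_i = sum_j alpha_ij h_j
    c_i = alpha @ H
    return c_i, alpha

# Toy usage (hypothetical sizes).
rng = np.random.default_rng(0)
H = rng.normal(size=(6, 8))                       # T_x = 6 encoder states of dim 8
s_prev = rng.normal(size=4)                       # previous decoder state of dim 4
W_a, U_a, v_a = rng.normal(size=(5, 4)), rng.normal(size=(5, 8)), rng.normal(size=5)
c_i, alpha = attention_context(s_prev, H, W_a, U_a, v_a)  # alpha sums to 1
```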
Sequence to Sequence Model: Example codes
Codes Here @ Github