Computational Linguistics
Week 10
Neural Sequence Modeling
Mark Chang

Outline
•  Recurrent Neural Networks
•  Long Short-Term Memory
•  Neural Turing Machine
•  Applications

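Recurrent Neural Networks

Short-term memory
Example: 白日依山盡,黃河入海流, read one character at a time: 白, 白日, 白日依, 白日依山, ...
A feedforward neuron n (inputs x1, x2 and bias b with weights W1, W2, Wb; output y) handles each character on its own: n(白), n(日).

Recurrent Neural Network
A recurrent neuron also takes its own previous output as input: n(白), n(n(白), 日), n(n(n(白), 日), 依).

From neural networks to deep learning
[Figure: Feedforward Neural Network, Recurrent Neural Network, Long Short Term Memory, Neural Turing Machine]
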
Recurrent Neural Network
$$n_{in,t} = w_c x_t + w_p n_{out,t-1} + w_b$$
$$n_{out,t} = \frac{1}{1 + e^{-n_{in,t}}}$$
The previous time step's n_out is fed back into the current time step's n_in.
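
The recurrence can be sketched in a few lines of Python. This is a minimal illustration for a single scalar neuron; the function name `rnn_forward`, the initial state `n_out_init = 0`, and the weight values are assumptions made for the example, not part of the slides.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def rnn_forward(xs, w_c, w_p, w_b, n_out_init=0.0):
    """Apply n_in,t = w_c*x_t + w_p*n_out,t-1 + w_b and n_out,t = sigmoid(n_in,t)."""
    n_out = n_out_init          # assumed starting value for n_out before t = 0
    outputs = []
    for x_t in xs:
        n_in = w_c * x_t + w_p * n_out + w_b
        n_out = sigmoid(n_in)   # fed back into the next time step
        outputs.append(n_out)
    return outputs

# Hypothetical weights and inputs, chosen only for illustration.
outputs = rnn_forward([1.0, 0.0, 1.0], w_c=0.5, w_p=-1.0, w_b=0.1)
```

Each entry of `outputs` depends on all earlier inputs through the fed-back value, which is the short-term memory the slides describe.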
  
Recurrent Neural Network
[Figure: the network unrolled through time, with inputs x0, x1, ..., xt-1, xt and outputs y0, y1, ..., yt-1, yt]
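Backward Propagation Through Time
t = 0
$$\delta_{in,0} = \frac{\partial J}{\partial n_{out,0}} \frac{\partial n_{out,0}}{\partial n_{in,0}} = \delta_{out,0} \frac{\partial n_{out,0}}{\partial n_{in,0}}$$
t = 1
$$\begin{aligned}
\delta_{in,0} &= \frac{\partial J}{\partial n_{out,1}} \frac{\partial n_{out,1}}{\partial n_{in,1}} \frac{\partial n_{in,1}}{\partial n_{out,0}} \frac{\partial n_{out,0}}{\partial n_{in,0}} \\
&= \delta_{out,1} \frac{\partial n_{out,1}}{\partial n_{in,1}} \frac{\partial n_{in,1}}{\partial n_{out,0}} \frac{\partial n_{out,0}}{\partial n_{in,0}} \\
&= \delta_{in,1} \frac{\partial n_{in,1}}{\partial n_{out,0}} \frac{\partial n_{out,0}}{\partial n_{in,0}} \\
&= \delta_{out,0} \frac{\partial n_{out,0}}{\partial n_{in,0}}
\end{aligned}$$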
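Backward Propagation Through Time
$$\delta_{in,s} = \begin{cases}
\dfrac{\partial J}{\partial n_{out,s}} \dfrac{\partial n_{out,s}}{\partial n_{in,s}} & \text{if } s = t \\[2ex]
\delta_{in,s+1} \dfrac{\partial n_{in,s+1}}{\partial n_{out,s}} \dfrac{\partial n_{out,s}}{\partial n_{in,s}} & \text{otherwise}
\end{cases}$$
http://cpmarkchang.logdown.com/posts/278457-neural-network-recurrent-neural-network
[Figure: the error signal flows backward through the unrolled network, starting from $\delta_{in,t} = \frac{\partial J}{\partial n_{out,t}} \frac{\partial n_{out,t}}{\partial n_{in,t}}$ and passing from $\delta_{in,s+1}$ to $\delta_{in,s} = \delta_{in,s+1} \frac{\partial n_{in,s+1}}{\partial n_{out,s}} \frac{\partial n_{out,s}}{\partial n_{in,s}}$]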
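
The backward recursion can be sketched for the scalar neuron defined earlier, where ∂n_in,s+1/∂n_out,s = w_p and ∂n_out,s/∂n_in,s = n_out,s(1 − n_out,s). The loss J = ½(n_out,t − target)² applied only at the last step, the zero initial state, and all function names are assumptions made for this illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(xs, w_c, w_p, w_b):
    n_out, outs = 0.0, []
    for x in xs:
        n_out = sigmoid(w_c * x + w_p * n_out + w_b)
        outs.append(n_out)
    return outs

def loss(xs, target, w_c, w_p, w_b):
    # Assumed loss: squared error on the final output only.
    return 0.5 * (forward(xs, w_c, w_p, w_b)[-1] - target) ** 2

def bptt_deltas(xs, target, w_c, w_p, w_b):
    """Return all n_out,s and the deltas delta_in,s for s = t down to 0."""
    outs = forward(xs, w_c, w_p, w_b)
    t = len(xs) - 1
    deltas = [0.0] * len(xs)
    for s in range(t, -1, -1):
        local = outs[s] * (1.0 - outs[s])            # d n_out,s / d n_in,s
        if s == t:
            deltas[s] = (outs[s] - target) * local   # dJ/dn_out,t times local
        else:
            deltas[s] = deltas[s + 1] * w_p * local  # d n_in,s+1 / d n_out,s = w_p
    return outs, deltas

def grad_w_p(xs, target, w_c, w_p, w_b):
    """dJ/dw_p = sum over s of delta_in,s * n_out,s-1 (with n_out,-1 = 0)."""
    outs, deltas = bptt_deltas(xs, target, w_c, w_p, w_b)
    previous = [0.0] + outs[:-1]
    return sum(d * p for d, p in zip(deltas, previous))

xs, target = [1.0, 0.5, -1.0, 2.0], 0.3
analytic = grad_w_p(xs, target, 0.7, -0.4, 0.1)
eps = 1e-6
numeric = (loss(xs, target, 0.7, -0.4 + eps, 0.1)
           - loss(xs, target, 0.7, -0.4 - eps, 0.1)) / (2 * eps)
```

Comparing `analytic` with the finite-difference estimate `numeric` is a quick way to check that the recursion has been implemented consistently.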
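Deep RNN
[Figure: stacked recurrent layers unrolled through time, with inputs x0, x1, ..., xt-1, xt and outputs y0, y1, ..., yt-1, yt]

Bi-Directional RNN
[Figure: two recurrent layers, each reading the inputs x0, x1, ..., xt-1, xt, combined into the outputs y0, y1, ..., yt-1, yt]
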
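Long Short-Term Memory

Vanishing Gradient Problem
$$\delta_{in,0} = \delta_{out,t} \frac{\partial n_{out,t}}{\partial n_{in,t}} \frac{\partial n_{in,t}}{\partial n_{out,t-1}} \cdots \frac{\partial n_{in,1}}{\partial n_{out,0}} \frac{\partial n_{out,0}}{\partial n_{in,0}}$$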
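
A small numerical sketch shows why this product shrinks. For the sigmoid neuron used earlier, each factor ∂n_out,s/∂n_in,s = n_out,s(1 − n_out,s) is at most 0.25, and each factor ∂n_in,s+1/∂n_out,s equals w_p. The function name `chain_factor` and the weight values are assumptions made for the example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def chain_factor(xs, w_c, w_p, w_b):
    """Product of all chain-rule factors that multiply delta_out,t to give delta_in,0."""
    n_out = 0.0
    factor = 1.0
    for s, x in enumerate(xs):
        n_out = sigmoid(w_c * x + w_p * n_out + w_b)
        local = n_out * (1.0 - n_out)      # sigmoid derivative, never above 0.25
        # One w_p factor links each pair of consecutive time steps.
        factor *= local if s == 0 else w_p * local
    return factor

# Hypothetical weights; the sequence length is the only thing that varies.
factors = {length: chain_factor([1.0] * length, w_c=0.5, w_p=1.0, w_b=0.0)
           for length in (1, 5, 20)}
```

With |w_p| ≤ 1 the factor for a sequence of length L is bounded by 0.25^L, so the error signal reaching the earliest time steps becomes negligible for long sequences.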
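Long Short-Term Memory
[Figure: memory cell with input x_t and output y_t; stored value m (m_in,t, m_out,t, m_out,t-1); units k (k_out) and n (n_out); bias b; signals C_in, C_write, C_forget, C_read, C_out]

Long Short-Term Memory
Input value: C_in
Read gate: C_read
Forget gate: C_forget
Write gate: C_write
Output value: C_out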
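Long Short-Term Memory
•  Write gate C_write: controls whether the memory can be written
$$C_{write} = \mathrm{sigmoid}(w_{cw,x} x_t + w_{cw,y} y_{t-1} + w_{cw,b})$$
$$k_{out} = \mathrm{sigmoid}(w_{k,x} x_t + w_{k,b})$$
$$m_{in,t} = k_{out} C_{write}$$

Long Short-Term Memory
•  Forget gate C_forget: controls whether the previous value is kept
$$C_{forget} = \mathrm{sigmoid}(w_{cf,x} x_t + w_{cf,y} y_{t-1} + w_{cf,b})$$
$$m_{out,t} = m_{in,t} + C_{forget} m_{out,t-1}$$

Long Short-Term Memory
•  Read gate C_read: controls whether the memory can be read
$$n_{out} = \mathrm{sigmoid}(m_{out,t})$$
$$C_{read} = \mathrm{sigmoid}(w_{cr,x} x_t + w_{cr,y} y_{t-1} + w_{cr,b})$$
$$C_{out} = n_{out} C_{read}$$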
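
The three gates can be combined into one time step of the cell. This is a minimal scalar sketch under stated assumptions: every gate reads the previous output y_{t-1}, the cell output C_out is what gets fed back as y, and the dictionary keys and the function name `lstm_step` are hypothetical names chosen for the example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x_t, y_prev, m_prev, w):
    """One step of the slide's memory cell; returns (C_out, m_out,t)."""
    # Write gate: how much of the candidate value k_out enters the memory.
    c_write = sigmoid(w["cw_x"] * x_t + w["cw_y"] * y_prev + w["cw_b"])
    k_out = sigmoid(w["k_x"] * x_t + w["k_b"])
    m_in = k_out * c_write
    # Forget gate: how much of the previous memory value is kept.
    c_forget = sigmoid(w["cf_x"] * x_t + w["cf_y"] * y_prev + w["cf_b"])
    m_out = m_in + c_forget * m_prev
    # Read gate: how much of the squashed memory value is exposed.
    n_out = sigmoid(m_out)
    c_read = sigmoid(w["cr_x"] * x_t + w["cr_y"] * y_prev + w["cr_b"])
    c_out = n_out * c_read
    return c_out, m_out

# With all weights zero every gate is 0.5, which makes the arithmetic easy to follow.
zero_weights = {name: 0.0 for name in (
    "cw_x", "cw_y", "cw_b", "k_x", "k_b",
    "cf_x", "cf_y", "cf_b", "cr_x", "cr_y", "cr_b")}
c_out, m_out = lstm_step(x_t=1.0, y_prev=0.0, m_prev=0.0, w=zero_weights)
```

In the zero-weight case m_out,t = 0.5 · 0.5 + 0.5 · 0 = 0.25 and C_out = sigmoid(0.25) · 0.5.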
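Training: Backward Propagation
http://www.felixgers.de/papers/phd.pdf
$$m_{out,t} = m_{in,t} + C_{forget} m_{out,t-1}, \qquad m_{in,t} = k_{out} C_{write}$$
$$\begin{aligned}
\frac{\partial m_{out,t}}{\partial w_{k,x}} &= \frac{\partial m_{in,t}}{\partial w_{k,x}} + C_{forget} \frac{\partial m_{out,t-1}}{\partial w_{k,x}} \\
&= C_{write} \frac{\partial k_{out}}{\partial w_{k,x}} + C_{forget} \frac{\partial m_{out,t-1}}{\partial w_{k,x}}
\end{aligned}$$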
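
The recursion above lets the derivative be carried forward alongside the memory value. The sketch below assumes, as the formula does, that C_write and C_forget are treated as constants with respect to w_{k,x}; here they are simply supplied as per-step numbers. The function name and all values are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def memory_and_grad(xs, c_writes, c_forgets, w_kx, w_kb):
    """Return m_out,t and d m_out,t / d w_k,x after the last step."""
    m, dm = 0.0, 0.0
    for x, c_write, c_forget in zip(xs, c_writes, c_forgets):
        k_out = sigmoid(w_kx * x + w_kb)
        dk = k_out * (1.0 - k_out) * x       # d k_out / d w_k,x
        dm = c_write * dk + c_forget * dm    # recursion from the slide
        m = k_out * c_write + c_forget * m   # m_out,t = m_in,t + C_forget * m_out,t-1
    return m, dm

# Hypothetical inputs and gate activations.
xs = [1.0, -0.5, 2.0]
c_writes = [0.9, 0.2, 0.6]
c_forgets = [0.5, 0.8, 0.3]
m_value, analytic = memory_and_grad(xs, c_writes, c_forgets, w_kx=0.4, w_kb=-0.1)

eps = 1e-6
numeric = (memory_and_grad(xs, c_writes, c_forgets, 0.4 + eps, -0.1)[0]
           - memory_and_grad(xs, c_writes, c_forgets, 0.4 - eps, -0.1)[0]) / (2 * eps)
```

The finite-difference value `numeric` should agree with `analytic` as long as the gates are held fixed, which is the assumption under which the recursion is exact.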
Long Short-Term Memory
https://class.coursera.org/neuralnets-2012-001/lecture/95

Neural Turing Machine

Neural Turing Machine
[Figure: a controller with Input and Output, connected through a Read/Write Head to the Memory]

Memory
[Figure: memory drawn as a grid. Memory addresses 0, 1, ..., i, ..., n index the memory blocks; positions 0, ..., j, ..., m run along the block length]
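Read Operation
Head Location: w = (0.9, 0.1, 0, ..., 0) over addresses 0, 1, ..., i, ..., n
Memory: M
Read Vector: r
$$\sum_i w(i) = 1, \quad 0 \le w(i) \le 1, \ \forall i$$
$$r \leftarrow \sum_i w(i) M(i)$$
$$\begin{bmatrix} r_0 \\ r_1 \\ r_2 \end{bmatrix} = \begin{bmatrix} 1 \times 0.9 + 2 \times 0.1 \\ 1 \times 0.9 + 1 \times 0.1 \\ 2 \times 0.9 + 4 \times 0.1 \end{bmatrix} = \begin{bmatrix} 1.1 \\ 1.0 \\ 2.2 \end{bmatrix}$$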
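
The read operation is a weighted sum of memory blocks. The sketch below stores each block M(i) as a Python list and reproduces the numbers from the slide; only the first three blocks are modelled, and the function name `read` is an assumption.

```python
def read(memory, w):
    """r = sum_i w(i) * M(i), where memory[i] is the block at address i."""
    block_length = len(memory[0])
    return [sum(w_i * block[j] for w_i, block in zip(w, memory))
            for j in range(block_length)]

# Blocks at addresses 0, 1, 2 as they appear in the slide's worked example.
M = [[1, 1, 2], [2, 1, 4], [3, 2, 1]]
w = [0.9, 0.1, 0.0]
r = read(M, w)   # approximately [1.1, 1.0, 2.2]
```

Because w sums to 1, the read vector is a blend of blocks rather than a hard lookup, which keeps the operation differentiable.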
Erase Operation
Head Location: w = (0.9, 0.1, 0, ..., 0) over addresses 0, 1, ..., i, ..., n
Erase Vector: e = (1, 0, 1)
Memory: M
$$0 \le e(j) \le 1, \ \forall j$$
$$M(i) \leftarrow (1 - w(i)\,e)\,M(i)$$
$$M = \begin{bmatrix} 1(1 - 0.9) & 2(1 - 0.1) & 3 & \dots \\ 1 & 1 & 2 & \dots \\ 2(1 - 0.9) & 4(1 - 0.1) & 1 & \dots \end{bmatrix} = \begin{bmatrix} 0.1 & 1.8 & 3 & \dots \\ 1 & 1 & 2 & \dots \\ 0.2 & 3.6 & 1 & \dots \end{bmatrix}$$
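
The erase step can be sketched as an element-wise scaling of each block. The example reuses the first three blocks and the values from the slide; the function name `erase` and the list-of-blocks layout are assumptions.

```python
def erase(memory, w, e):
    """M(i) <- (1 - w(i) * e) * M(i), applied element-wise to every block."""
    return [[m_ij * (1.0 - w_i * e_j) for m_ij, e_j in zip(block, e)]
            for w_i, block in zip(w, memory)]

M = [[1, 1, 2], [2, 1, 4], [3, 2, 1]]   # memory[i] is the block at address i
w = [0.9, 0.1, 0.0]
e = [1, 0, 1]
erased = erase(M, w, e)
# erased is approximately [[0.1, 1, 0.2], [1.8, 1, 3.6], [3, 2, 1]]
```

Positions where e(j) = 0 are left untouched, and blocks where w(i) = 0 are left untouched, so erasing is confined to the addressed blocks and the selected positions.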
Add Operation
Head Location: w = (0.9, 0.1, 0, ..., 0) over addresses 0, 1, ..., i, ..., n
Add Vector: a = (1, 1, 0)
Memory: M
$$M(i) \leftarrow M(i) + w(i)\,a$$
$$M = \begin{bmatrix} 0.1 + 0.9 & 1.8 + 0.1 & 3 & \dots \\ 1.0 + 0.9 & 1.0 + 0.1 & 2 & \dots \\ 0.2 & 3.6 & 1 & \dots \end{bmatrix} = \begin{bmatrix} 1.0 & 1.9 & 3 & \dots \\ 1.9 & 1.1 & 2 & \dots \\ 0.2 & 3.6 & 1 & \dots \end{bmatrix}$$
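
The add step can be sketched the same way, starting from the memory left by the erase example. The function name `add` and the list-of-blocks layout are assumptions made for the illustration.

```python
def add(memory, w, a):
    """M(i) <- M(i) + w(i) * a, applied to every block."""
    return [[m_ij + w_i * a_j for m_ij, a_j in zip(block, a)]
            for w_i, block in zip(w, memory)]

# Memory after the erase step; memory[i] is the block at address i.
M_erased = [[0.1, 1.0, 0.2], [1.8, 1.0, 3.6], [3.0, 2.0, 1.0]]
w = [0.9, 0.1, 0.0]
a = [1, 1, 0]
written = add(M_erased, w, a)
# written is approximately [[1.0, 1.9, 0.2], [1.9, 1.1, 3.6], [3, 2, 1]]
```

An erase followed by an add forms one write: the erase clears the selected positions in proportion to w, and the add deposits new content in the same proportion.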
Controller
[Figure: the controller, connected to Input, Output, Read Vector r, Head Location w, Add Vector a, and Erase Vector e]
Addressing Mechanisms
Memory Key: k
Content Addressing Parameter: β
Interpolation Parameter: g
Convolutional Shift Parameter: s
Sharpening Parameter: γ
[Figure: the addressing pipeline. Previous state: head location w_{t-1} = (0.9, 0.1, 0, 0, 0, 0) and memory M. Controller outputs: memory key k, β = 50, g = 0.5, s = (0, 0, 1), γ = 50. Stages: Content Addressing, Interpolation, Convolutional Shift, Sharpening. Head location along the way: (0, 0, 1, 0, 0, 0), (.45, .05, .50, 0, 0, 0), and finally (0, 0, 0, 1, 0, 0)]

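Content Addressing
Memory Key: k
Memory: M
Head Location: w
$$K[u, v] = \frac{u \cdot v}{|u| \cdot |v|}$$
$$w(i) \leftarrow \frac{e^{\beta K[k, M(i)]}}{\sum_j e^{\beta K[k, M(j)]}}$$
[Figure: head location w for three settings. β = 50: (0, 0, 1, 0, 0, 0); β = 5: (.15, .10, .47, .08, .13, .17); β = 0: (.16, .16, .16, .16, .16, .16)]
Find the locations in memory M whose content is close to k.
Parameter β: adjusts how concentrated the weighting is.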
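
Content addressing is a softmax over cosine similarities, scaled by β. The sketch below uses a three-block memory and a key equal to the third block; both are illustrative assumptions rather than the six-block memory in the figure, and the function names are hypothetical.

```python
import math

def cosine(u, v):
    """K[u, v] = (u . v) / (|u| |v|)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def content_addressing(memory, key, beta):
    """w(i) = exp(beta * K[k, M(i)]) / sum_j exp(beta * K[k, M(j)])."""
    scores = [beta * cosine(key, block) for block in memory]
    top = max(scores)                      # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [x / total for x in exps]

M = [[1, 1, 2], [2, 1, 4], [3, 2, 1]]     # memory[i] is the block at address i
k = [3, 2, 1]                              # hypothetical key, identical to block 2
w_sharp = content_addressing(M, k, beta=50)
w_soft = content_addressing(M, k, beta=5)
w_flat = content_addressing(M, k, beta=0)
```

A large β puts almost all the weight on the best-matching block, while β = 0 ignores the key and spreads the weight evenly, matching the behaviour shown on the slide.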
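Interpolation
$$w_t \leftarrow g\,w_t + (1 - g)\,w_{t-1}$$
[Figure: w_{t-1} = (0.9, 0.1, 0, 0, 0, 0) and content-addressed w_t = (0, 0, 1, 0, 0, 0). g = 1: (0, 0, 1, 0, 0, 0); g = 0.5: (.45, .05, .50, 0, 0, 0); g = 0: (0.9, 0.1, 0, 0, 0, 0)]
Combine the head location w_t with the location from the previous time step, w_{t-1}.
Parameter g: adjusts the ratio between the current location and the previous one.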
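
Interpolation is an element-wise blend of two weightings. The sketch reproduces the three cases from the figure; the function name `interpolate` is an assumption.

```python
def interpolate(w_content, w_prev, g):
    """w_t <- g * w_t + (1 - g) * w_{t-1}, element-wise."""
    return [g * c + (1.0 - g) * p for c, p in zip(w_content, w_prev)]

w_content = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0]   # result of content addressing
w_prev = [0.9, 0.1, 0.0, 0.0, 0.0, 0.0]      # head location at the previous step
w_half = interpolate(w_content, w_prev, g=0.5)   # approximately [.45, .05, .50, 0, 0, 0]
w_new_only = interpolate(w_content, w_prev, g=1.0)
w_old_only = interpolate(w_content, w_prev, g=0.0)
```

With g = 1 the head ignores where it was, and with g = 0 it ignores the content match, which allows the later shift step to move on from the previous location.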
Convolutional Shift
$$w(i) \leftarrow \sum_j w(j)\,s(i - j)$$
$$w(i) \leftarrow w(i-1)\,s(1) + w(i)\,s(0) + w(i+1)\,s(-1)$$
[Figure: w = (.45, .05, .50, 0, 0, 0) shifted with s = (1, 0, 0), s = (0, 0, 1), and s = (.5, 0, .5), where the entries of s are s(−1), s(0), s(1) and they weight the neighbours w(i+1), w(i), w(i−1); the last case gives (.025, .475, .025, .25, 0, .225)]
Shift the values in w.
Parameter s: adjusts the direction of the shift.
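
The shift can be sketched as a circular convolution. Representing s as a dictionary from offset to weight and wrapping indices around the ends of w are choices made for this example; the wrap-around is consistent with the (.025, .475, .025, .25, 0, .225) result on the slide.

```python
def shift(w, s):
    """w(i) <- sum over offsets of w(i - offset) * s(offset), with circular indexing."""
    n = len(w)
    return [sum(w[(i - offset) % n] * weight for offset, weight in s.items())
            for i in range(n)]

w = [0.45, 0.05, 0.50, 0.0, 0.0, 0.0]
spread = shift(w, {-1: 0.5, 0: 0.0, 1: 0.5})   # approximately [.025, .475, .025, .25, 0, .225]
right = shift(w, {-1: 0.0, 0: 0.0, 1: 1.0})    # every value moves one address up
left = shift(w, {-1: 1.0, 0: 0.0, 1: 0.0})     # every value moves one address down
```

Putting all of s on a single offset moves the head by exactly one address, while splitting it across offsets blurs the weighting, which is what the later sharpening step corrects.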
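Sharpening
$$w(i) \leftarrow \frac{w(i)^{\gamma}}{\sum_j w(j)^{\gamma}}$$
[Figure: w = (0, .45, .05, .50, 0, 0) sharpened with three settings. γ = 50: (0, 0, 0, 1, 0, 0); γ = 5: (0, .37, 0, .62, 0, 0); γ = 0: (.16, .16, .16, .16, .16, .16)]
Make the values in w more concentrated (or more spread out).
Parameter γ: adjusts how concentrated the weighting is.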
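
Sharpening raises each weight to the power γ and renormalises. The sketch below reproduces the three cases from the figure; the function name `sharpen` is an assumption, and the γ = 0 case relies on Python evaluating `0.0 ** 0` as 1.0.

```python
def sharpen(w, gamma):
    """w(i) <- w(i)^gamma / sum_j w(j)^gamma."""
    powered = [w_i ** gamma for w_i in w]
    total = sum(powered)
    return [p / total for p in powered]

w = [0.0, 0.45, 0.05, 0.50, 0.0, 0.0]
w_gamma_50 = sharpen(w, 50)   # nearly all weight on address 3
w_gamma_5 = sharpen(w, 5)     # roughly (0, .37, 0, .63, 0, 0)
w_gamma_0 = sharpen(w, 0)     # uniform, 1/6 at every address
```

Large values of γ undo the blurring introduced by the convolutional shift, so the head ends up focused on a single address.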
Experiment: Repeat Copy
https://github.com/fumin/ntm

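Evolution of Recurrent Neural Network
Recurrent Neural Network: short-term memory
Long Short Term Memory: can control reading from and writing to the memory
Neural Turing Machine: can control the position of the memory read/write head more flexibly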
Applications

Machine Translation
http://arxiv.org/pdf/1409.3215.pdf
A B C -> W X Y Z

Chinese Word Segmentation
http://arxiv.org/pdf/1602.04874v1.pdf

Chinese Poetry Generation
http://emnlp2014.org/papers/pdf/EMNLP2014074.pdf

Image Caption Generation
http://arxiv.org/pdf/1411.4555v2.pdf

Visual Question Answering
http://arxiv.org/pdf/1505.00468v6.pdf

Further Reading
•  The Unreasonable Effectiveness of Recurrent Neural Networks
–  http://karpathy.github.io/2015/05/21/rnn-effectiveness/
•  Understanding LSTM Networks
–  http://colah.github.io/posts/2015-08-Understanding-LSTMs/
•  Recurrent Neural Networks
–  http://cpmarkchang.logdown.com/posts/278457-neural-network-recurrent-neural-network
•  Neural Turing Machine
–  http://cpmarkchang.logdown.com/posts/279710-neural-network-neural-turing-machine

 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 

Computational Linguistics week 10

  • 1. Computational Linguistics, Week 10: Neural Sequence Modeling. Mark Chang
  • 2. Outline • Recurrent Neural Networks • Long Short-Term Memory • Neural Turing Machine • Applications
  • 6. Recurrent Neural Network. Input 白 gives n(白); input 日 gives n(n(白), 日); input 依 gives n(n(n(白), 日), 依)
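  • 7. From Neural Networks to Deep Learning: Feedforward Neural Network, Recurrent Neural Network, Long Short-Term Memory, Neural Turing Machine
  • 8. Recurrent Neural Network. $n_{in,t} = w_c x_t + w_p n_{out,t-1} + w_b$, $n_{out,t} = \frac{1}{1 + e^{-n_{in,t}}}$. The output $n_{out}$ of the previous time step is fed back into $n_{in}$ of the current time step.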
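The recurrence on this slide can be sketched in a few lines of plain Python. This is only an illustration for a single neuron with scalar inputs; the function name `rnn_forward`, the initial state of 0.0, and the weight values are assumptions, not taken from the slides.

```python
import math


def sigmoid(z):
    """Logistic function 1 / (1 + e^-z)."""
    return 1.0 / (1.0 + math.exp(-z))


def rnn_forward(xs, w_c, w_p, w_b, n_out_init=0.0):
    """Run the single-neuron recurrence over a sequence of scalar inputs."""
    n_out = n_out_init  # assumed initial state; the slide does not specify one
    outputs = []
    for x_t in xs:
        n_in = w_c * x_t + w_p * n_out + w_b  # n_in,t uses the previous n_out
        n_out = sigmoid(n_in)                 # n_out,t
        outputs.append(n_out)
    return outputs


# Hypothetical weights and inputs, chosen only for illustration.
outputs = rnn_forward([1.0, 0.0, 1.0], w_c=0.5, w_p=-1.0, w_b=0.1)
```

Because each output depends on the previous one, the same input value (1.0 at the first and third step) generally produces different outputs.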
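  • 10. Recurrent Neural Network (unrolled through time). Inputs: $x_0, x_1, \dots, x_{t-1}, x_t$. Outputs: $y_0, y_1, \dots, y_{t-1}, y_t$.
  • 11. Backward Propagation Through Time. For $t = 0$: $\delta_{in,0} = \frac{\partial J}{\partial n_{out,0}} \frac{\partial n_{out,0}}{\partial n_{in,0}} = \delta_{out,0} \frac{\partial n_{out,0}}{\partial n_{in,0}}$. For $t = 1$: $\delta_{in,0} = \frac{\partial J}{\partial n_{out,1}} \frac{\partial n_{out,1}}{\partial n_{in,1}} \frac{\partial n_{in,1}}{\partial n_{out,0}} \frac{\partial n_{out,0}}{\partial n_{in,0}} = \delta_{out,1} \frac{\partial n_{out,1}}{\partial n_{in,1}} \frac{\partial n_{in,1}}{\partial n_{out,0}} \frac{\partial n_{out,0}}{\partial n_{in,0}} = \delta_{in,1} \frac{\partial n_{in,1}}{\partial n_{out,0}} \frac{\partial n_{out,0}}{\partial n_{in,0}} = \delta_{out,0} \frac{\partial n_{out,0}}{\partial n_{in,0}}$
  • 12. Backward Propagation Through Time. $\delta_{in,s} = \begin{cases} \frac{\partial J}{\partial n_{out,s}} \frac{\partial n_{out,s}}{\partial n_{in,s}} & \text{if } s = t \\ \delta_{in,s+1} \frac{\partial n_{in,s+1}}{\partial n_{out,s}} \frac{\partial n_{out,s}}{\partial n_{in,s}} & \text{otherwise} \end{cases}$ http://cpmarkchang.logdown.com/posts/278457-neural-network-recurrent-neural-network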
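A minimal sketch of this recursion for the single-neuron network, assuming a squared-error loss on the last output only and an initial state of 0.0 (both are assumptions; the slides leave J and the initial state unspecified). The helper names and the numbers are hypothetical. For the sigmoid, $\partial n_{out,s} / \partial n_{in,s} = n_{out,s}(1 - n_{out,s})$, and $\partial n_{in,s+1} / \partial n_{out,s} = w_p$.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(xs, w_c, w_p, w_b):
    """Return n_out for every time step (initial state 0.0 is assumed)."""
    n_outs, prev = [], 0.0
    for x in xs:
        prev = sigmoid(w_c * x + w_p * prev + w_b)
        n_outs.append(prev)
    return n_outs


def loss(xs, target, w_c, w_p, w_b):
    """Assumed loss: J = 0.5 * (n_out,t - target)^2 on the last step only."""
    return 0.5 * (forward(xs, w_c, w_p, w_b)[-1] - target) ** 2


def bptt_deltas(xs, target, w_c, w_p, w_b):
    """Compute delta_in,s for s = t down to 0 with the recursive rule."""
    n_outs = forward(xs, w_c, w_p, w_b)
    t = len(xs) - 1
    deltas = [0.0] * len(xs)
    # Case s = t: dJ/dn_out,t * dn_out,t/dn_in,t
    deltas[t] = (n_outs[t] - target) * n_outs[t] * (1.0 - n_outs[t])
    for s in range(t - 1, -1, -1):
        # Otherwise: delta_in,s+1 * dn_in,s+1/dn_out,s * dn_out,s/dn_in,s
        deltas[s] = deltas[s + 1] * w_p * n_outs[s] * (1.0 - n_outs[s])
    return deltas


def grad_w_c(xs, target, w_c, w_p, w_b):
    """dJ/dw_c: sum over time of delta_in,s * x_s (the weight is shared)."""
    deltas = bptt_deltas(xs, target, w_c, w_p, w_b)
    return sum(d * x for d, x in zip(deltas, xs))


xs, target = [1.0, 0.5, -0.3], 1.0  # hypothetical data
analytic = grad_w_c(xs, target, w_c=0.8, w_p=0.5, w_b=-0.2)
```

Comparing `analytic` against a finite-difference estimate of `loss` is a quick way to check that the recursion is implemented correctly.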
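  • 13. Deep RNN. Inputs: $x_0, x_1, \dots, x_{t-1}, x_t$. Outputs: $y_0, y_1, \dots, y_{t-1}, y_t$.
  • 14. Bi-Directional RNN. The inputs $x_0, x_1, \dots, x_{t-1}, x_t$ are read in both directions to produce the outputs $y_0, y_1, \dots, y_{t-1}, y_t$.
  • 16. Vanishing Gradient Problem. $\delta_{in,0} = \delta_{out,t} \frac{\partial n_{out,t}}{\partial n_{in,t}} \frac{\partial n_{in,t}}{\partial n_{out,t-1}} \cdots \frac{\partial n_{in,1}}{\partial n_{out,0}} \frac{\partial n_{out,0}}{\partial n_{in,0}}$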
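The product above shrinks quickly because the sigmoid derivative $n_{out}(1 - n_{out})$ never exceeds 0.25. The sketch below, assuming a single-neuron recurrence with a sigmoid activation and hypothetical weights, multiplies the per-step factors $w_p \, n_{out,s}(1 - n_{out,s})$ and shows how their magnitude decays with the number of steps.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def chain_factor(num_steps, w_p, x=0.0, w_c=1.0, w_b=0.0):
    """Product of dn_in,s+1/dn_out,s * dn_out,s/dn_in,s over num_steps steps.

    A constant input x and an initial state of 0.0 are assumed for simplicity.
    """
    n_out, factor = 0.0, 1.0
    for _ in range(num_steps):
        n_out = sigmoid(w_c * x + w_p * n_out + w_b)
        factor *= w_p * n_out * (1.0 - n_out)  # each term is at most |w_p| / 4
    return factor


# With w_p = 1.0 every factor is at most 0.25, so the product decays
# at least as fast as 0.25 ** num_steps.
factors = [abs(chain_factor(t, w_p=1.0)) for t in (1, 5, 10, 20)]
```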
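  • 17. Long Short-Term Memory [diagram: memory cell $m$ between input $x_t$ and output $y_t$, with the gates $C_{write}$, $C_{forget}$, $C_{read}$ and the signals $C_{in}$, $k_{out}$, $m_{in,t}$, $m_{out,t-1}$, $m_{out,t}$, $n_{out}$, $C_{out}$]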
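  • 18. Long Short-Term Memory. Input value: $C_{in}$. Read gate: $C_{read}$. Forget gate: $C_{forget}$. Write gate: $C_{write}$. Output value: $C_{out}$.
  • 19. Long Short-Term Memory • Write gate $C_{write}$: controls whether the memory can be written. $C_{write} = \mathrm{sigmoid}(w_{cw,x} x_t + w_{cw,y} y_{t-1} + w_{cw,b})$, $k_{out} = \mathrm{sigmoid}(w_{k,x} x_t + w_{k,b})$, $m_{in,t} = k_{out} C_{write}$
  • 20. Long Short-Term Memory • Forget gate $C_{forget}$: controls whether the previous value is kept. $C_{forget} = \mathrm{sigmoid}(w_{cf,x} x_t + w_{cf,y} y_{t-1} + w_{cf,b})$, $m_{out,t} = m_{in,t} + C_{forget} m_{out,t-1}$
  • 21. Long Short-Term Memory • Read gate $C_{read}$: controls whether the memory can be read. $n_{out} = \mathrm{sigmoid}(m_{out,t})$, $C_{read} = \mathrm{sigmoid}(w_{cr,x} x_t + w_{cr,y} y_{t-1} + w_{cr,b})$, $C_{out} = C_{read} n_{out}$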
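Putting the write, forget, and read gates together, one time step of this scalar memory cell might look like the sketch below. Several things here are assumptions: the dictionary keys for the weights are hypothetical names, the forget gate is taken to read $y_{t-1}$ like the other two gates, and the cell output $y_t$ is taken to be $C_{out}$.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x_t, y_prev, m_prev, w):
    """One step of the scalar memory cell; returns (y_t, m_out_t)."""
    # Write gate and candidate value.
    c_write = sigmoid(w["cw_x"] * x_t + w["cw_y"] * y_prev + w["cw_b"])
    k_out = sigmoid(w["k_x"] * x_t + w["k_b"])
    m_in = k_out * c_write
    # Forget gate scales the previous memory content (y_prev is assumed here).
    c_forget = sigmoid(w["cf_x"] * x_t + w["cf_y"] * y_prev + w["cf_b"])
    m_out = m_in + c_forget * m_prev
    # Read gate scales the squashed memory content.
    n_out = sigmoid(m_out)
    c_read = sigmoid(w["cr_x"] * x_t + w["cr_y"] * y_prev + w["cr_b"])
    c_out = c_read * n_out
    return c_out, m_out  # y_t = C_out is an assumption


# With all weights at zero every gate equals 0.5, which makes the
# arithmetic easy to follow by hand.
zero_w = {name: 0.0 for name in (
    "cw_x", "cw_y", "cw_b", "k_x", "k_b",
    "cf_x", "cf_y", "cf_b", "cr_x", "cr_y", "cr_b")}
y_t, m_t = lstm_step(x_t=1.0, y_prev=0.0, m_prev=1.0, w=zero_w)
```

In this zero-weight case $m_{in,t} = 0.5 \cdot 0.5 = 0.25$ and $m_{out,t} = 0.25 + 0.5 \cdot 1.0 = 0.75$.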
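  • 22. Training: Backward Propagation. http://www.felixgers.de/papers/phd.pdf Given $m_{out,t} = m_{in,t} + C_{forget} m_{out,t-1}$ and $m_{in,t} = k_{out} C_{write}$: $\frac{\partial m_{out,t}}{\partial w_{k,x}} = \frac{\partial m_{in,t}}{\partial w_{k,x}} + C_{forget} \frac{\partial m_{out,t-1}}{\partial w_{k,x}} = C_{write} \frac{\partial k_{out}}{\partial w_{k,x}} + C_{forget} \frac{\partial m_{out,t-1}}{\partial w_{k,x}}$
  • 23. Long Short-Term Memory. https://class.coursera.org/neuralnets-2012-001/lecture/95
  • 25. Neural Turing Machine [diagram: Input, Output, Controller, Read/Write Head, Memory]
  • 26. Memory. Memory addresses run over 0, 1, …, i, …, n; each address holds one memory block, and positions within a block run over 0, …, j, …, m (the block length).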
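  • 27. Read Operation. Memory: $M$. Head location: $w$. Read vector: $r$. $r \leftarrow \sum_i w(i) M(i)$, with $\sum_i w(i) = 1$ and $0 \le w(i) \le 1, \forall i$. Example with $w = (0.9, 0.1, 0, 0, 0, \dots)$, $M(0) = (1, 1, 2)^T$, $M(1) = (2, 1, 4)^T$: $r = \begin{pmatrix} r_0 \\ r_1 \\ r_2 \end{pmatrix} = \begin{pmatrix} 1 \cdot 0.9 + 2 \cdot 0.1 \\ 1 \cdot 0.9 + 1 \cdot 0.1 \\ 2 \cdot 0.9 + 4 \cdot 0.1 \end{pmatrix} = \begin{pmatrix} 1.1 \\ 1.0 \\ 2.2 \end{pmatrix}$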
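The read operation is a weighted sum of memory blocks. A small sketch in plain Python, where `memory[i]` stands for the block $M(i)$; the function name `ntm_read` is hypothetical, only three addresses are modelled, and the third block $(3, 2, 1)$ is inferred from the matrix rows on the erase slide.

```python
def ntm_read(memory, w):
    """Return r = sum_i w(i) * M(i), where memory[i] is the block M(i)."""
    block_length = len(memory[0])
    return [
        sum(w[i] * memory[i][j] for i in range(len(memory)))
        for j in range(block_length)
    ]


memory = [
    [1.0, 1.0, 2.0],  # M(0)
    [2.0, 1.0, 4.0],  # M(1)
    [3.0, 2.0, 1.0],  # M(2), inferred; its weight is 0 so it does not matter
]
w = [0.9, 0.1, 0.0]  # head location, sums to 1
r = ntm_read(memory, w)  # approximately [1.1, 1.0, 2.2]
```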
  • 28. Erase Operation. Memory: $M$. Head location: $w$. Erase vector: $e$, with $0 \le e(j) \le 1, \forall j$. $M(i) \leftarrow (1 - w(i) e) M(i)$. Example with $w = (0.9, 0.1, 0, 0, 0, \dots)$ and $e = (1, 0, 1)^T$: $M = \begin{pmatrix} 1(1 - 0.9) & 2(1 - 0.1) & 3 & \dots \\ 1 & 1 & 2 & \dots \\ 2(1 - 0.9) & 4(1 - 0.1) & 1 & \dots \end{pmatrix} = \begin{pmatrix} 0.1 & 1.8 & 3 & \dots \\ 1 & 1 & 2 & \dots \\ 0.2 & 3.6 & 1 & \dots \end{pmatrix}$
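A sketch of the erase step, with the product between $(1 - w(i) e)$ and $M(i)$ taken elementwise. The function name `ntm_erase` is hypothetical, only the first three addresses of the example are modelled, and the erase vector $e = (1, 0, 1)$ is inferred from which rows of the example matrix change.

```python
def ntm_erase(memory, w, e):
    """Return new memory with M(i)[j] scaled by (1 - w(i) * e(j))."""
    return [
        [(1.0 - w[i] * e[j]) * memory[i][j] for j in range(len(e))]
        for i in range(len(memory))
    ]


memory = [
    [1.0, 1.0, 2.0],  # M(0)
    [2.0, 1.0, 4.0],  # M(1)
    [3.0, 2.0, 1.0],  # M(2)
]
w = [0.9, 0.1, 0.0]   # head location
e = [1.0, 0.0, 1.0]   # erase vector: erase positions 0 and 2 of each block
erased = ntm_erase(memory, w, e)
# erased is approximately [[0.1, 1.0, 0.2], [1.8, 1.0, 3.6], [3.0, 2.0, 1.0]]
```

Blocks are stored per address here, so each inner list corresponds to one column of the matrix written on the slide.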
  • 29. Add Operation. Memory: $M$. Head location: $w$. Add vector: $a$. $M(i) \leftarrow M(i) + w(i) a$. Example with $w = (0.9, 0.1, 0, 0, 0, \dots)$ and $a = (1, 1, 0)^T$: $M = \begin{pmatrix} 0.1 + 0.9 & 1.8 + 0.1 & 3 & \dots \\ 1.0 + 0.9 & 1.0 + 0.1 & 2 & \dots \\ 0.2 & 3.6 & 1 & \dots \end{pmatrix} = \begin{pmatrix} 1.0 & 1.9 & 3 & \dots \\ 1.9 & 1.1 & 2 & \dots \\ 0.2 & 3.6 & 1 & \dots \end{pmatrix}$
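A sketch of the add step applied to the memory left by the erase example. The function name `ntm_add` is hypothetical and only the first three addresses are modelled.

```python
def ntm_add(memory, w, a):
    """Return new memory with w(i) * a added to every block M(i)."""
    return [
        [memory[i][j] + w[i] * a[j] for j in range(len(a))]
        for i in range(len(memory))
    ]


# Memory after the erase example, stored one block per address.
memory = [
    [0.1, 1.0, 0.2],  # M(0)
    [1.8, 1.0, 3.6],  # M(1)
    [3.0, 2.0, 1.0],  # M(2)
]
w = [0.9, 0.1, 0.0]   # head location
a = [1.0, 1.0, 0.0]   # add vector
updated = ntm_add(memory, w, a)
# updated is approximately [[1.0, 1.9, 0.2], [1.9, 1.1, 3.6], [3.0, 2.0, 1.0]]
```

Erase followed by add is how a write is carried out: the erase vector clears selected positions and the add vector deposits new content at the addresses the head attends to.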
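  • 30. Controller. Inputs: the external input and the read vector $r$. Outputs: the external output, the add vector $a$, the erase vector $e$, and the addressing parameters: memory key $k$, content addressing parameter $\beta$, interpolation parameter $g$, convolutional shift parameter $s$, and sharpening parameter $\gamma$. The addressing mechanisms turn these parameters into the head location $w$.
  • 31. Addressing overview. Previous state: head location $w_{t-1} = (0.9, 0.1, 0, 0, 0, 0)$ and memory $M$ (six blocks of length 3). Controller outputs: memory key $k$, $\beta = 50$, $g = 0.5$, $s = (0, 0, 1)$, $\gamma = 50$. Steps: content addressing, then interpolation (giving $(0.45, 0.05, 0.50, 0, 0, 0)$), then convolutional shift, then sharpening, which yields the new head location $w = (0, 0, 0, 1, 0, 0)$.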
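  • 32. Content Addressing. Memory: $M$. Memory key: $k$. Head location: $w$. $K[u, v] = \frac{u \cdot v}{|u| \cdot |v|}$, $w(i) \leftarrow \frac{e^{\beta K[k, M(i)]}}{\sum_j e^{\beta K[k, M(j)]}}$. Finds the locations in memory $M$ whose content is similar to $k$. The parameter $\beta$ adjusts how concentrated the weighting is: in the example, $\beta = 50$ gives a one-hot weighting, $\beta = 5$ gives $(0.15, 0.10, 0.47, 0.08, 0.13, 0.17)$, and $\beta = 0$ gives the uniform weighting $(0.16, 0.16, 0.16, 0.16, 0.16, 0.16)$.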
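Content addressing is a softmax over cosine similarities scaled by $\beta$. The sketch below uses a small hypothetical memory and key (not the values from the slide) and subtracts the maximum score before exponentiating, a common numerical-stability trick that does not change the result. It assumes no block or key is the zero vector.

```python
import math


def cosine(u, v):
    """Cosine similarity K[u, v]; assumes neither vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def content_addressing(memory, key, beta):
    """Softmax over beta * K[key, M(i)] across all addresses i."""
    scores = [beta * cosine(key, block) for block in memory]
    top = max(scores)  # subtracting the max leaves the softmax unchanged
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [x / total for x in exps]


# Hypothetical memory and key, chosen so that M(2) matches the key exactly.
memory = [[1.0, 1.0, 2.0], [2.0, 1.0, 4.0], [3.0, 2.0, 1.0]]
key = [3.0, 2.0, 1.0]

w_uniform = content_addressing(memory, key, beta=0.0)   # every address equal
w_soft = content_addressing(memory, key, beta=5.0)      # mild preference
w_sharp = content_addressing(memory, key, beta=50.0)    # nearly one-hot
```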
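  • 33. Interpolation. $w_t \leftarrow g w_t + (1 - g) w_{t-1}$. Combines the head location $w_t$ with the location $w_{t-1}$ from the previous time step. The parameter $g$ adjusts the ratio between the current and the previous location. Example with $w_t = (0, 0, 1, 0, 0, 0)$ and $w_{t-1} = (0.9, 0.1, 0, 0, 0, 0)$: $g = 1$ gives $(0, 0, 1, 0, 0, 0)$, $g = 0.5$ gives $(0.45, 0.05, 0.50, 0, 0, 0)$, and $g = 0$ gives $(0.9, 0.1, 0, 0, 0, 0)$.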
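The interpolation step is a per-address convex combination. A short sketch, with the function name `interpolate` being hypothetical and the content weighting $(0, 0, 1, 0, 0, 0)$ inferred from the $g = 0.5$ result on the slide:

```python
def interpolate(w_content, w_prev, g):
    """Blend the content-based weighting with the previous head location."""
    return [g * wc + (1.0 - g) * wp for wc, wp in zip(w_content, w_prev)]


w_prev = [0.9, 0.1, 0.0, 0.0, 0.0, 0.0]     # head location at t - 1
w_content = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0]  # result of content addressing

w_half = interpolate(w_content, w_prev, g=0.5)  # [0.45, 0.05, 0.5, 0, 0, 0]
w_new = interpolate(w_content, w_prev, g=1.0)   # content weighting only
w_old = interpolate(w_content, w_prev, g=0.0)   # previous location only
```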
  • 34. Convolutional Shift. $w(i) \leftarrow \sum_j w(j) s(i - j)$, that is, $w(i) \leftarrow w(i-1) s(1) + w(i) s(0) + w(i+1) s(-1)$. Shifts the values in $w$. The parameter $s$, defined over the offsets $-1, 0, 1$, adjusts the shift direction. Example with $w = (0.45, 0.05, 0.50, 0, 0, 0)$: a one-hot $s$ moves all values one position to the left or to the right, and $s = (0.5, 0, 0.5)$ gives $(0.025, 0.475, 0.025, 0.25, 0, 0.225)$.
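A sketch of the shift as a circular convolution, with indices wrapping around the ends of the weighting (the wrap-around is consistent with the $s = (0.5, 0, 0.5)$ example, where weight from address 0 appears at the last address). Representing $s$ as a dictionary from offset to weight is a choice made here for readability, and `circular_shift` is a hypothetical name.

```python
def circular_shift(w, s):
    """Return w'(i) = sum_k w(i - k) * s(k), with indices taken modulo len(w).

    s maps an integer offset k (here -1, 0, 1) to its weight s(k).
    """
    n = len(w)
    return [
        sum(w[(i - k) % n] * s_k for k, s_k in s.items())
        for i in range(n)
    ]


w = [0.45, 0.05, 0.50, 0.0, 0.0, 0.0]

# All mass on offset +1: every value moves one address to the right.
shifted_right = circular_shift(w, {-1: 0.0, 0: 0.0, 1: 1.0})

# Mass split between offsets -1 and +1: values spread to both neighbours.
blurred = circular_shift(w, {-1: 0.5, 0: 0.0, 1: 0.5})
# blurred is approximately [0.025, 0.475, 0.025, 0.25, 0.0, 0.225]
```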
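  • 35. Sharpening. $w(i) \leftarrow \frac{w(i)^{\gamma}}{\sum_j w(j)^{\gamma}}$. Makes the values in $w$ more concentrated (or more spread out). The parameter $\gamma$ adjusts the degree of concentration. Example with $w = (0, 0.45, 0.05, 0.50, 0, 0)$: $\gamma = 50$ gives $(0, 0, 0, 1, 0, 0)$, $\gamma = 5$ gives $(0, 0.37, 0, 0.62, 0, 0)$, and $\gamma = 0$ gives $(0.16, 0.16, 0.16, 0.16, 0.16, 0.16)$.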
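Sharpening raises each weight to the power $\gamma$ and renormalizes. A minimal sketch (the function name `sharpen` is hypothetical); note that Python evaluates `0.0 ** 0` as `1.0`, which is what makes $\gamma = 0$ produce a uniform weighting even at addresses whose weight is zero.

```python
def sharpen(w, gamma):
    """Return w(i)^gamma / sum_j w(j)^gamma."""
    powered = [x ** gamma for x in w]  # 0.0 ** 0 evaluates to 1.0 in Python
    total = sum(powered)
    return [p / total for p in powered]


w = [0.0, 0.45, 0.05, 0.50, 0.0, 0.0]

w_flat = sharpen(w, gamma=0)     # uniform: every address gets 1/6
w_mid = sharpen(w, gamma=5)      # roughly 0.37 and 0.63 at addresses 1 and 3
w_peaked = sharpen(w, gamma=50)  # almost all weight at address 3
```

A large $\gamma$ counteracts the blurring that the convolutional shift can introduce.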
  • 36. Experiment: Repeat Copy. https://github.com/fumin/ntm
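  • 37. Evolution of Recurrent Neural Network. Recurrent Neural Network: short-term memory. Long Short-Term Memory: reading and writing of the memory can be controlled. Neural Turing Machine: the position of the memory read/write head can be controlled more flexibly.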
  • 39. Machine Translation. http://arxiv.org/pdf/1409.3215.pdf A B C -> W X Y Z
  • 40. Chinese Word Segmentation. http://arxiv.org/pdf/1602.04874v1.pdf
  • 41. Chinese Poetry Generation. http://emnlp2014.org/papers/pdf/EMNLP2014074.pdf
  • 42. Image Caption Generation. http://arxiv.org/pdf/1411.4555v2.pdf
  • 43. Visual Question Answering. http://arxiv.org/pdf/1505.00468v6.pdf
  • 44. Further Reading • The Unreasonable Effectiveness of Recurrent Neural Networks: http://karpathy.github.io/2015/05/21/rnn-effectiveness/ • Understanding LSTM Networks: http://colah.github.io/posts/2015-08-Understanding-LSTMs/ • Recurrent Neural Networks: http://cpmarkchang.logdown.com/posts/278457-neural-network-recurrent-neural-network • Neural Turing Machine: http://cpmarkchang.logdown.com/posts/279710-neural-network-neural-turing-machine