Computational Linguistics week 10

Neural Sequence Modeling

Mark Chang

Outline

•  Recurrent Neural Networks
•  Long Short-Term Memory
•  Neural Turing Machine
•  Applications

Recurrent Neural Networks

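Short-Term Memory
Reading the line 白日依山盡，黃河入海流 ("The white sun sets behind the mountains; the Yellow River flows into the sea") one character at a time: 白, 白日, 白日依, 白日依山, ...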

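Short-Term Memory
Each character passes through the neuron n on its own: 白 → n(白), 日 → n(日)
[Figure: two copies of the neuron n, each with inputs x1 and x2, bias input b, weights W1, W2 and Wb, and output y]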

Recurrent Neural Network

白 → n(白)
日 → n(n(白), 日)
依 → n(n(n(白), 日), 依)

From Neural Networks to Deep Learning
•  Feedforward Neural Network
•  Recurrent Neural Network
•  Long Short-Term Memory
•  Neural Turing Machine

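Recurrent Neural Network

$$n_{in,t} = w_c x_t + w_p n_{out,t-1} + w_b$$
$$n_{out,t} = \frac{1}{1 + e^{-n_{in,t}}}$$
The n_out of the previous time step is fed back into the n_in of the current time step.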
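
The recurrence above can be sketched for a single scalar neuron as follows. This is only an illustration: the weight values, the input sequence, and the zero initial state `n_out_init` are assumptions, not values from the slides.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def rnn_forward(xs, w_c, w_p, w_b, n_out_init=0.0):
    """Apply the single-neuron recurrence to xs and return every n_out,t."""
    n_out = n_out_init  # assumed initial state, not given on the slide
    outputs = []
    for x_t in xs:
        n_in = w_c * x_t + w_p * n_out + w_b  # n_in,t uses n_out,t-1
        n_out = sigmoid(n_in)                 # n_out,t
        outputs.append(n_out)
    return outputs


# Illustrative weights. The input 0.0 appears at t = 1 and t = 2, yet the
# outputs differ because the state still carries the earlier input 1.0.
outputs = rnn_forward([1.0, 0.0, 0.0], w_c=1.0, w_p=2.0, w_b=0.0)
print(outputs)
```

Setting `w_p=0.0` removes the feedback term, and the neuron then behaves like a feedforward unit with no memory of earlier inputs.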

Recurrent Neural Network
[Figure: recurrent network with inputs x0, x1, x2, ..., xt and outputs y0, y1, y2, ..., yt]

Recurrent Neural Network
[Figure: recurrent network with inputs x0, x1, ..., xt-1, xt and outputs y0, y1, ..., yt-1, yt]

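Backward Propagation Through Time
t = 0:
$$\delta_{in,0} = \frac{\partial J}{\partial n_{out,0}} \frac{\partial n_{out,0}}{\partial n_{in,0}} = \delta_{out,0} \frac{\partial n_{out,0}}{\partial n_{in,0}}$$
t = 1:
$$\begin{aligned}
\delta_{in,0} &= \frac{\partial J}{\partial n_{out,1}} \frac{\partial n_{out,1}}{\partial n_{in,1}} \frac{\partial n_{in,1}}{\partial n_{out,0}} \frac{\partial n_{out,0}}{\partial n_{in,0}} \\
&= \delta_{out,1} \frac{\partial n_{out,1}}{\partial n_{in,1}} \frac{\partial n_{in,1}}{\partial n_{out,0}} \frac{\partial n_{out,0}}{\partial n_{in,0}} \\
&= \delta_{in,1} \frac{\partial n_{in,1}}{\partial n_{out,0}} \frac{\partial n_{out,0}}{\partial n_{in,0}} \\
&= \delta_{out,0} \frac{\partial n_{out,0}}{\partial n_{in,0}}
\end{aligned}$$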

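Backward Propagation Through Time
$$\delta_{in,s} = \begin{cases}
\dfrac{\partial J}{\partial n_{out,s}} \dfrac{\partial n_{out,s}}{\partial n_{in,s}} & \text{if } s = t \\[2ex]
\delta_{in,s+1} \dfrac{\partial n_{in,s+1}}{\partial n_{out,s}} \dfrac{\partial n_{out,s}}{\partial n_{in,s}} & \text{otherwise}
\end{cases}$$
http://cpmarkchang.logdown.com/posts/278457-neural-network-recurrent-neural-network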

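[Figure: the error δ_in,s+1 at time step s+1 is propagated back to δ_in,s at time step s]
$$\delta_{in,s} = \delta_{in,s+1} \frac{\partial n_{in,s+1}}{\partial n_{out,s}} \frac{\partial n_{out,s}}{\partial n_{in,s}}$$
$$\delta_{in,t} = \frac{\partial J}{\partial n_{out,t}} \frac{\partial n_{out,t}}{\partial n_{in,t}}$$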
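
A minimal sketch of this backward recursion for the scalar neuron with n_in,t = w_c x_t + w_p n_out,t-1 + w_b, checked against a numerical gradient. The squared-error loss J on the last output, the zero initial state, and all numeric values are assumptions made for the example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(xs, w_c, w_p, w_b):
    """Return n_out,s for every step, starting from an assumed state of 0."""
    n_out, n_outs = 0.0, []
    for x in xs:
        n_out = sigmoid(w_c * x + w_p * n_out + w_b)
        n_outs.append(n_out)
    return n_outs


def loss(xs, target, w_c, w_p, w_b):
    # Assumed loss: J = 0.5 * (n_out,t - target)^2 on the last step only.
    return 0.5 * (forward(xs, w_c, w_p, w_b)[-1] - target) ** 2


def bptt(xs, target, w_c, w_p, w_b):
    """Return (dJ/dw_c, dJ/dw_p, dJ/dw_b) using the delta_in,s recursion."""
    n_outs = forward(xs, w_c, w_p, w_b)
    t = len(xs) - 1
    grad_c = grad_p = grad_b = 0.0
    delta_in = 0.0
    for s in range(t, -1, -1):
        d_out_d_in = n_outs[s] * (1.0 - n_outs[s])   # dn_out,s / dn_in,s
        if s == t:
            delta_in = (n_outs[t] - target) * d_out_d_in
        else:
            # dn_in,s+1 / dn_out,s equals w_p for this neuron.
            delta_in = delta_in * w_p * d_out_d_in
        n_out_prev = n_outs[s - 1] if s > 0 else 0.0
        grad_c += delta_in * xs[s]
        grad_p += delta_in * n_out_prev
        grad_b += delta_in
    return grad_c, grad_p, grad_b


xs, target = [1.0, 0.5, -1.0], 1.0
w_c, w_p, w_b = 0.5, 1.5, -0.2
grads = bptt(xs, target, w_c, w_p, w_b)

# Central-difference check of dJ/dw_p.
eps = 1e-6
numeric_p = (loss(xs, target, w_c, w_p + eps, w_b)
             - loss(xs, target, w_c, w_p - eps, w_b)) / (2 * eps)
print(grads[1], numeric_p)
```

The weights are shared across time steps, so each gradient is the sum over s of δ_in,s times the direct input to n_in,s at that step.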

Deep RNN

[Figure: deep recurrent network with inputs x0, x1, ..., xt-1, xt and outputs y0, y1, ..., yt-1, yt]

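Bi-Directional RNN

[Figure: bi-directional recurrent network; each input x0, x1, ..., xt-1, xt is drawn twice, once per direction, and the outputs are y0, y1, ..., yt-1, yt]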

Long Short-Term Memory

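Vanishing Gradient Problem
$$\delta_{in,0} = \delta_{out,t} \frac{\partial n_{out,t}}{\partial n_{in,t}} \frac{\partial n_{in,t}}{\partial n_{out,t-1}} \cdots \frac{\partial n_{in,1}}{\partial n_{out,0}} \frac{\partial n_{out,0}}{\partial n_{in,0}}$$
[Figure: the error δ_out,t at the last time step propagated back to δ_in,0 at the first time step]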
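
To see why this product vanishes, note that for a sigmoid neuron each factor ∂n_out,s/∂n_in,s is at most 0.25. The sketch below computes the ratio δ_in,0 / δ_out,t for sequences of increasing length, assuming the scalar neuron n_in,t = w_c x_t + w_p n_out,t-1 + w_b with illustrative weights and a constant input.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def delta_ratio(xs, w_c, w_p, w_b):
    """Return delta_in,0 / delta_out,t for the input sequence xs."""
    n_out, slopes = 0.0, []
    for x in xs:
        n_out = sigmoid(w_c * x + w_p * n_out + w_b)
        slopes.append(n_out * (1.0 - n_out))  # dn_out,s / dn_in,s <= 0.25
    ratio = slopes[-1]                        # dn_out,t / dn_in,t
    for s in range(len(xs) - 2, -1, -1):
        # dn_in,s+1 / dn_out,s = w_p, then dn_out,s / dn_in,s.
        ratio *= w_p * slopes[s]
    return ratio


# Illustrative setting: constant input 1.0 and all weights equal to 1.0.
ratios = {T: delta_ratio([1.0] * T, w_c=1.0, w_p=1.0, w_b=0.0)
          for T in (1, 5, 10, 20)}
for T, ratio in ratios.items():
    print(T, ratio)
```

With |w_p| ≤ 1 the ratio shrinks at least as fast as 0.25 per step, so the first time steps receive almost no error signal from a distant output.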

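Long Short-Term Memory

[Figure: LSTM unit with input x_t and output y_t. The input value Cin passes through k (output kout) and the write gate Cwrite to give min,t; the memory cell m combines min,t with mout,t-1 through the forget gate Cforget to give mout,t; mout,t passes through n (output nout) and the read gate Cread to give the output value Cout]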

Long Short-Term Memory

•  Cin: input value
•  Cread: read gate
•  Cforget: forget gate
•  Cwrite: write gate
•  Cout: output value

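Long Short-Term Memory

•  Write gate Cwrite: controls whether the memory can be written to

$$C_{write} = \mathrm{sigmoid}(w_{cw,x} x_t + w_{cw,y} y_{t-1} + w_{cw,b})$$
$$k_{out} = \mathrm{sigmoid}(w_{k,x} x_t + w_{k,b})$$
$$m_{in,t} = k_{out} C_{write}$$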

Long Short-Term Memory
•  Forget gate Cforget: controls whether the previous value is kept

$$C_{forget} = \mathrm{sigmoid}(w_{cf,x} x_t + w_{cf,y} y_{t-1} + w_{cf,b})$$
$$m_{out,t} = m_{in,t} + C_{forget} m_{out,t-1}$$

Long Short-Term Memory
•  Read gate Cread: controls whether the memory can be read

$$n_{out} = \mathrm{sigmoid}(m_{out,t})$$
$$C_{read} = \mathrm{sigmoid}(w_{cr,x} x_t + w_{cr,y} y_{t-1} + w_{cr,b})$$
$$C_{out} = C_{read} n_{out}$$
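
The write, forget, and read gate equations can be combined into one time step, sketched below for scalar values. Two things are assumed here rather than taken from the slides: every gate reads the previous output y_{t-1}, and the unit output y_t equals Cout. The weight names in the dictionary are hypothetical labels for the w symbols.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


WEIGHT_NAMES = ["cw_x", "cw_y", "cw_b", "k_x", "k_b",
                "cf_x", "cf_y", "cf_b", "cr_x", "cr_y", "cr_b"]


def lstm_step(x_t, y_prev, m_prev, w):
    """One scalar LSTM step; returns (y_t, m_out,t)."""
    c_write = sigmoid(w["cw_x"] * x_t + w["cw_y"] * y_prev + w["cw_b"])
    k_out = sigmoid(w["k_x"] * x_t + w["k_b"])
    m_in = k_out * c_write                     # value allowed into memory
    c_forget = sigmoid(w["cf_x"] * x_t + w["cf_y"] * y_prev + w["cf_b"])
    m_out = m_in + c_forget * m_prev           # new memory content
    n_out = sigmoid(m_out)
    c_read = sigmoid(w["cr_x"] * x_t + w["cr_y"] * y_prev + w["cr_b"])
    c_out = c_read * n_out                     # assumed to be y_t
    return c_out, m_out


# All weights zero: every gate is half open.
zero = dict.fromkeys(WEIGHT_NAMES, 0.0)
y_zero, m_zero = lstm_step(x_t=1.0, y_prev=0.0, m_prev=1.0, w=zero)

# Write gate closed and forget gate fully open: the memory is held unchanged.
hold = dict(zero, cw_b=-20.0, cf_b=20.0)
y_hold, m_hold = lstm_step(x_t=1.0, y_prev=0.0, m_prev=1.0, w=hold)
print(m_zero, m_hold)
```

Because the memory update is additive, holding Cforget near 1 carries m_out forward across many steps without repeated squashing.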

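Training: Backward Propagation

http://www.felixgers.de/papers/phd.pdf

$$m_{out,t} = m_{in,t} + C_{forget} m_{out,t-1}, \qquad m_{in,t} = k_{out} C_{write}$$
$$\begin{aligned}
\frac{\partial m_{out,t}}{\partial w_{k,x}} &= \frac{\partial m_{in,t}}{\partial w_{k,x}} + C_{forget} \frac{\partial m_{out,t-1}}{\partial w_{k,x}} \\
&= C_{write} \frac{\partial k_{out}}{\partial w_{k,x}} + C_{forget} \frac{\partial m_{out,t-1}}{\partial w_{k,x}}
\end{aligned}$$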
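
This recurrence lets the derivative be accumulated forward in time alongside the memory itself. The sketch below treats the gate activations Cwrite and Cforget as given constants at each step, as the formula does, and compares the result with a numerical derivative. The input and gate sequences are made-up values for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def run(xs, c_writes, c_forgets, w_kx, w_kb):
    """Return (m_out,t, dm_out,t/dw_k,x) after the last time step."""
    m_out, dm_dw = 0.0, 0.0  # assumed initial memory and derivative
    for x, c_write, c_forget in zip(xs, c_writes, c_forgets):
        k_out = sigmoid(w_kx * x + w_kb)
        dk_dw = k_out * (1.0 - k_out) * x          # dk_out / dw_k,x
        # dm_out,t = C_write * dk_out + C_forget * dm_out,t-1
        dm_dw = c_write * dk_dw + c_forget * dm_dw
        m_out = k_out * c_write + c_forget * m_out
    return m_out, dm_dw


# Hypothetical inputs and gate activations for three time steps.
xs = [1.0, -0.5, 2.0]
c_writes = [0.9, 0.2, 0.6]
c_forgets = [0.5, 0.8, 0.7]
w_kx, w_kb = 0.3, -0.1

m_out, analytic = run(xs, c_writes, c_forgets, w_kx, w_kb)

eps = 1e-6
numeric = (run(xs, c_writes, c_forgets, w_kx + eps, w_kb)[0]
           - run(xs, c_writes, c_forgets, w_kx - eps, w_kb)[0]) / (2 * eps)
print(analytic, numeric)
```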

Long Short-Term Memory

https://class.coursera.org/neuralnets-2012-001/lecture/95

Neural Turing Machine

[Figure: a controller receives the input, produces the output, and accesses the memory through a read/write head]

Memory
[Figure: the memory is a grid of memory blocks. Memory addresses run 0, 1, ..., i, ..., n; positions within a block run 0, ..., j, ..., m, the block length]

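Read Operation
Memory: M
Head location: w, with
$$\sum_i w(i) = 1, \qquad 0 \le w(i) \le 1, \ \forall i$$
Read vector: r
$$r \leftarrow \sum_i w(i) M(i)$$
Example with w = (0.9, 0.1, 0, ..., 0) over addresses 0, 1, ..., i, ..., n, M(0) = (1, 1, 2) and M(1) = (2, 1, 4):
$$\begin{bmatrix} r_0 \\ r_1 \\ r_2 \end{bmatrix} = \begin{bmatrix} 1 \cdot 0.9 + 2 \cdot 0.1 \\ 1 \cdot 0.9 + 1 \cdot 0.1 \\ 2 \cdot 0.9 + 4 \cdot 0.1 \end{bmatrix} = \begin{bmatrix} 1.1 \\ 1.0 \\ 2.2 \end{bmatrix}$$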
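
The read operation is a weighted sum of memory blocks, which a few lines of code make concrete. This sketch stores the memory as a list of blocks, so `memory[i]` is M(i); that layout and the function name `read` are choices made for the example. Only the first three addresses of the slide's memory are used.

```python
def read(memory, w):
    """Return r = sum_i w(i) * M(i), where memory[i] is the block M(i)."""
    block_length = len(memory[0])
    return [sum(w_i * block[j] for w_i, block in zip(w, memory))
            for j in range(block_length)]


# First three memory blocks and head weights from the slide's example.
memory = [[1, 1, 2], [2, 1, 4], [3, 2, 1]]
w = [0.9, 0.1, 0.0]

r = read(memory, w)
print(r)  # approximately [1.1, 1.0, 2.2], up to floating-point rounding
```

Because w is a soft distribution over addresses rather than a single index, the read is differentiable with respect to both w and M.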

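Erase Operation
Memory: M
Head location: w
Erase vector: e, with
$$0 \le e(j) \le 1, \ \forall j$$
$$M(i) \leftarrow (1 - w(i) e) M(i)$$
Example with w = (0.9, 0.1, 0, ..., 0) and e = (1, 0, 1), where column i of M is the block at address i:
$$M = \begin{bmatrix} 1(1 - 0.9) & 2(1 - 0.1) & 3 & \cdots \\ 1 & 1 & 2 & \cdots \\ 2(1 - 0.9) & 4(1 - 0.1) & 1 & \cdots \end{bmatrix} = \begin{bmatrix} 0.1 & 1.8 & 3 & \cdots \\ 1 & 1 & 2 & \cdots \\ 0.2 & 3.6 & 1 & \cdots \end{bmatrix}$$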
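
A short sketch of the erase operation, applied to the first three addresses of the slide's example. The list-of-blocks layout (`memory[i]` is M(i)) and the function name `erase` are assumptions for illustration; the multiplication is elementwise within each block.

```python
def erase(memory, w, e):
    """Return the memory after M(i) <- (1 - w(i) * e) * M(i), elementwise."""
    return [[(1.0 - w_i * e_j) * m_ij for e_j, m_ij in zip(e, block)]
            for w_i, block in zip(w, memory)]


memory = [[1, 1, 2], [2, 1, 4], [3, 2, 1]]  # blocks M(0), M(1), M(2)
w = [0.9, 0.1, 0.0]                         # head location
e = [1, 0, 1]                               # erase positions 0 and 2

erased = erase(memory, w, e)
for block in erased:
    print(block)
```

Position 1 of every block is untouched because e(1) = 0, and block M(2) is untouched because w(2) = 0.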

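Add Operation
Memory: M
Head location: w
Add vector: a
$$M(i) \leftarrow M(i) + w(i) a$$
Example with w = (0.9, 0.1, 0, ..., 0) and a = (1, 1, 0), applied to the memory after the erase step, where column i of M is the block at address i:
$$M = \begin{bmatrix} 0.1 + 0.9 & 1.8 + 0.1 & 3 & \cdots \\ 1.0 + 0.9 & 1.0 + 0.1 & 2 & \cdots \\ 0.2 & 3.6 & 1 & \cdots \end{bmatrix} = \begin{bmatrix} 1.0 & 1.9 & 3 & \cdots \\ 1.9 & 1.1 & 2 & \cdots \\ 0.2 & 3.6 & 1 & \cdots \end{bmatrix}$$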
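
The add operation can be sketched the same way. The starting memory below is the slide's memory after its erase step, restricted to the first three addresses; the list-of-blocks layout (`memory[i]` is M(i)) and the function name `add` are assumptions for illustration.

```python
def add(memory, w, a):
    """Return the memory after M(i) <- M(i) + w(i) * a."""
    return [[m_ij + w_i * a_j for a_j, m_ij in zip(a, block)]
            for w_i, block in zip(w, memory)]


# Blocks M(0), M(1), M(2) as they stand after the erase example.
memory = [[0.1, 1.0, 0.2], [1.8, 1.0, 3.6], [3.0, 2.0, 1.0]]
w = [0.9, 0.1, 0.0]   # head location
a = [1, 1, 0]         # add to positions 0 and 1

updated = add(memory, w, a)
for block in updated:
    print(block)
```

An erase followed by an add at the same head location together form one write to memory.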

Controller
[Figure: the controller, connected to the input, the output, the read vector r, the head location w, the add vector a, and the erase vector e]
Addressing Mechanisms
Controller outputs: memory key k, content addressing parameter β, interpolation parameter g, convolutional shift parameter s, sharpening parameter γ
Previous state: head location w_{t-1}, memory M
Stages: Content Addressing → Interpolation → Convolutional Shift → Sharpening → head location w

Example with k = (2, 3, 1), β = 50, g = 0.5, s = (0, 0, 1), γ = 50 and w_{t-1} = (0.9, 0.1, 0, 0, 0, 0):
•  after content addressing: (0, 0, 1, 0, 0, 0)
•  after interpolation: (.45, .05, .50, 0, 0, 0)
•  after convolutional shift: (0, .45, .05, .50, 0, 0)
•  after sharpening: (0, 0, 0, 1, 0, 0)

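Content Addressing
Memory: M
Memory key: k = (2, 3, 1)
Head location: w
$$K[u, v] = \frac{u \cdot v}{|u| \cdot |v|}$$
$$w(i) \leftarrow \frac{e^{\beta K[k, M(i)]}}{\sum_j e^{\beta K[k, M(j)]}}$$
[Figure: memory M with six blocks, the first three being M(0) = (1, 1, 2), M(1) = (2, 1, 4), M(2) = (3, 2, 1), compared against the key k]
•  β = 50: w = (0, 0, 1, 0, 0, 0)
•  β = 5: w = (.15, .10, .47, .08, .13, .17)
•  β = 0: w = (.16, .16, .16, .16, .16, .16)
Finds the locations in memory M whose content is similar to k.
Parameter β: adjusts how concentrated the weights are.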
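
Content addressing is a softmax over cosine similarities, sketched below. Only the first three memory blocks of the slide's example are used, and the key is taken as it appears in the slide text, so the printed weights are not expected to reproduce the slide's six-block figures. The function names are chosen for this example.

```python
import math


def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)  # assumes neither vector is all zeros


def content_addressing(memory, key, beta):
    """Return w(i) = softmax_i(beta * K[key, M(i)])."""
    scores = [beta * cosine_similarity(key, block) for block in memory]
    top = max(scores)  # subtracting the max leaves the softmax unchanged
    exps = [math.exp(score - top) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


memory = [[1, 1, 2], [2, 1, 4], [3, 2, 1]]  # blocks M(0), M(1), M(2)
key = [2, 3, 1]

w_sharp = content_addressing(memory, key, beta=50)   # nearly one-hot
w_medium = content_addressing(memory, key, beta=5)
w_flat = content_addressing(memory, key, beta=0)     # uniform
print(w_sharp, w_medium, w_flat)
```

Larger β puts almost all weight on the best-matching block, here M(2), while β = 0 ignores the content entirely.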

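Interpolation
$$w_t \leftarrow g w_t + (1 - g) w_{t-1}$$
Example with content-addressed w_t = (0, 0, 1, 0, 0, 0) and w_{t-1} = (0.9, 0.1, 0, 0, 0, 0):
•  g = 1: w_t = (0, 0, 1, 0, 0, 0)
•  g = 0.5: w_t = (.45, .05, .50, 0, 0, 0)
•  g = 0: w_t = (0.9, 0.1, 0, 0, 0, 0)
Combines the head location w_t with the location w_{t-1} from the previous time step.
Parameter g: adjusts the ratio between the current and the previous location.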
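
A minimal sketch of the interpolation step using the slide's vectors. The function name `interpolate` and the argument names are chosen for this example.

```python
def interpolate(w_content, w_prev, g):
    """Return g * w_content + (1 - g) * w_prev, elementwise."""
    return [g * wc + (1.0 - g) * wp for wc, wp in zip(w_content, w_prev)]


w_content = [0, 0, 1, 0, 0, 0]       # result of content addressing
w_prev = [0.9, 0.1, 0, 0, 0, 0]      # head location at the previous step

w_half = interpolate(w_content, w_prev, g=0.5)
print(w_half)  # approximately [0.45, 0.05, 0.5, 0.0, 0.0, 0.0]
```

With g = 1 the head follows the content lookup only, and with g = 0 it stays where it was, which lets the later shift step move it relative to the previous position.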

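Convolutional Shift
The shift vector s is indexed by the offsets -1, 0, 1, with entries s_{-1}, s_0, s_1 applied to the neighbours w_{i+1}, w_i, w_{i-1}.
$$w(i) \leftarrow \sum_j w(j) s(i - j)$$
$$w(i) \leftarrow w(i-1) s(1) + w(i) s(0) + w(i+1) s(-1)$$
Example with w = (.45, .05, .50, 0, 0, 0) and s = (.5, 0, .5): the result is (.025, .475, .025, .25, 0, .225).
[Figure: the same w under two further settings of s whose entries are 0 and 1]
Shifts the values inside w.
Parameter s: adjusts the direction of the shift.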
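
The shift can be sketched as a circular convolution over the offsets -1, 0, 1. Wrapping around at the ends of w is an assumption, although it matches the slide's result, where the last entry .225 comes from the first entry of w. The dictionary representation of s is a choice made for this example.

```python
def shift(w, s):
    """Return w(i) <- sum_d w(i - d) * s(d), with indices wrapping around.

    s maps each offset d in {-1, 0, 1} to its weight s(d).
    """
    n = len(w)
    return [sum(w[(i - d) % n] * s_d for d, s_d in s.items())
            for i in range(n)]


w = [0.45, 0.05, 0.50, 0.0, 0.0, 0.0]

# s(-1) = 0.5, s(0) = 0, s(1) = 0.5: spread each weight to both neighbours.
blurred = shift(w, {-1: 0.5, 0: 0.0, 1: 0.5})
print(blurred)  # approximately [0.025, 0.475, 0.025, 0.25, 0.0, 0.225]

# s(1) = 1: move every weight one address to the right.
moved = shift(w, {-1: 0.0, 0: 0.0, 1: 1.0})
print(moved)    # approximately [0.0, 0.45, 0.05, 0.5, 0.0, 0.0]
```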

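Sharpening
$$w(i) \leftarrow \frac{w(i)^{\gamma}}{\sum_j w(j)^{\gamma}}$$
Example with w = (0, .45, .05, .50, 0, 0):
•  γ = 50: w = (0, 0, 0, 1, 0, 0)
•  γ = 5: w = (0, .37, 0, .62, 0, 0)
•  γ = 0: w = (.16, .16, .16, .16, .16, .16)
Makes the values in w more concentrated (or more spread out).
Parameter γ: adjusts how concentrated the weights are.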
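
A minimal sketch of sharpening, applied to the slide's vector w = (0, .45, .05, .50, 0, 0). It relies on Python evaluating `0.0 ** 0` as 1.0, which is what makes the case with exponent 0 uniform. The function name `sharpen` is chosen for this example.

```python
def sharpen(w, gamma):
    """Return w(i)^gamma / sum_j w(j)^gamma."""
    powered = [w_i ** gamma for w_i in w]
    total = sum(powered)
    return [value / total for value in powered]


w = [0.0, 0.45, 0.05, 0.50, 0.0, 0.0]

w_sharp = sharpen(w, 50)   # almost all weight on address 3
w_medium = sharpen(w, 5)   # roughly 0.37 at address 1, 0.63 at address 3
w_flat = sharpen(w, 0)     # uniform, 1/6 at every address
print(w_sharp, w_medium, w_flat)
```

Sharpening counteracts the blurring that a soft shift introduces, so the head can stay focused on a single address over many steps.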

Experiment: Repeat Copy

https://github.com/fumin/ntm

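Evolution of Recurrent Neural Networks

•  Recurrent Neural Network: short-term memory
•  Long Short-Term Memory: can control reading from and writing to memory
•  Neural Turing Machine: can control the position of the memory read/write head more flexibly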

Machine Translation

http://arxiv.org/pdf/1409.3215.pdf

[Figure: the input sequence A B C is mapped to the output sequence W X Y Z]

Chinese Word Segmentation

http://arxiv.org/pdf/1602.04874v1.pdf

Chinese Poetry Generation

http://emnlp2014.org/papers/pdf/EMNLP2014074.pdf

Image Caption Generation


Visual Question Answering


Further Reading

•  The Unreasonable Effectiveness of Recurrent Neural Networks
   –  http://karpathy.github.io/2015/05/21/rnn-effectiveness/
•  Understanding LSTM Networks
   –  http://colah.github.io/posts/2015-08-Understanding-LSTMs/
•  Recurrent Neural Networks
   –  http://cpmarkchang.logdown.com/posts/278457-neural-network-recurrent-neural-network
•  Neural Turing Machine
   –  http://cpmarkchang.logdown.com/posts/279710-neural-network-neural-turing-machine
