2. This is a brief summary of the paper
“Neural Turing Machines”
http://arxiv.org/abs/1410.5401
Written by
A. Graves
G. Wayne
I. Danihelka
Google DeepMind, London UK
3. “Neural Turing Machines” are, in a single phrase,
neural networks that can be coupled to external memory.
The combined system is analogous to
a Turing Machine.
5. Neural Network
・A Neural Network (NN) learns from a large amount of observational data.
(Each data point is a pair of [External Input, External Output].)
Neural Network
6. Recurrent Neural Network
・A Recurrent Neural Network (RNN) introduces directed cycles into the NN,
which work as a sort of internal memory.
(Current states are determined by previous states and the External Input.)
[Figure: Recurrent Neural Network with a directed cycle]
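The statement that the current state is determined by the previous state and the external input can be sketched as a single recurrent update. This is a minimal illustration assuming a plain tanh RNN; the names `rnn_step`, `W_h`, `W_x`, and `b` and all sizes are hypothetical and do not come from the paper.

```python
import numpy as np

def rnn_step(h_prev, x, W_h, W_x, b):
    """One step of a plain tanh RNN (assumed form): the new state
    depends on the previous state h_prev and the external input x."""
    return np.tanh(W_h @ h_prev + W_x @ x + b)

rng = np.random.default_rng(0)
hidden_size, input_size = 4, 3  # illustrative sizes only
W_h = rng.normal(scale=0.5, size=(hidden_size, hidden_size))
W_x = rng.normal(scale=0.5, size=(hidden_size, input_size))
b = np.zeros(hidden_size)

# The directed cycle: h is fed back into the next step.
h = np.zeros(hidden_size)
for x in rng.normal(size=(5, input_size)):
    h = rnn_step(h, x, W_h, W_x, b)
```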
7. Neural Turing Machine
・A “Neural Turing Machine” is an NN that can be coupled
to external memory.
(The Controller is an NN whose parameters govern the coupling to external memory.)
[Figure: Controller coupled to External Memory]
8. How to access external memories
・Read/Write heads use weights to access external memory.
・Weights are determined by the parameters of the controller.
・Parameters are learned from a large amount of external I/O data.
[Figure: The Controller (an NN with parameters for adjusting the weights) takes the External Input, produces the External Output, and reaches the External Memory through a Read head and a Write head by weighted access. The External Memory is an N × M matrix: N locations, each holding a vector of size M. The Write head uses e to erase vectors and a to add new vectors.]
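The weighted access can be sketched in a few lines of NumPy. This follows the read and write rules of the original paper (a read is a weighted sum of memory rows; a write is an erase followed by an add), but the function names and the toy sizes are assumptions made for illustration.

```python
import numpy as np

def read(memory, w):
    """Read vector r = sum_i w[i] * memory[i], a weighted sum of the N rows."""
    return w @ memory

def write(memory, w, erase, add):
    """Erase then add, each scaled per location by the weighting w.
    erase (e) has entries in [0, 1]; add (a) is the new content."""
    memory = memory * (1.0 - np.outer(w, erase))
    return memory + np.outer(w, add)

N, M = 8, 4                      # N locations, vectors of size M
memory = np.zeros((N, M))
w = np.zeros(N)
w[2] = 1.0                       # weighting focused entirely on location 2

memory = write(memory, w, erase=np.ones(M), add=np.array([1.0, 0.0, 1.0, 1.0]))
r = read(memory, w)              # reads back the vector stored at location 2
```

Because every operation is a smooth function of the weighting, the whole access path is differentiable, which is what allows the controller parameters to be learned from I/O data.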
9. How to update weights
Content Addressing:
Weight adjustment based on the content at each location.
Interpolation:
Determines how much of the previous weight state is kept.
Convolutional Shift and Sharpening:
Weight adjustment based on the location in memory.
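The weight update steps can be sketched as a small pipeline. The formulas follow the original paper (cosine similarity with key strength for content addressing, a gate g for interpolation, a circular convolution for the shift, and an exponent gamma for sharpening). The function names, the restriction to shift offsets (-1, 0, +1), and the toy memory are assumptions for illustration.

```python
import numpy as np

def content_weights(memory, key, beta):
    """Softmax over cosine similarity between the key and each memory row."""
    eps = 1e-8
    sim = memory @ key / (np.linalg.norm(memory, axis=1) * np.linalg.norm(key) + eps)
    z = np.exp(beta * sim - np.max(beta * sim))
    return z / z.sum()

def interpolate(w_content, w_prev, g):
    """Gate g in [0, 1]: g=1 uses only content, g=0 keeps the previous weights."""
    return g * w_content + (1.0 - g) * w_prev

def shift(w, s):
    """Circular convolution; s holds the weights for offsets (-1, 0, +1) (assumed range)."""
    offsets = (-1, 0, 1)
    return sum(s_k * np.roll(w, k) for s_k, k in zip(s, offsets))

def sharpen(w, gamma):
    """Raise to the power gamma >= 1 and renormalise to counter blurring."""
    w = w ** gamma
    return w / w.sum()

def address(memory, w_prev, key, beta, g, s, gamma):
    w = content_weights(memory, key, beta)
    w = interpolate(w, w_prev, g)
    w = shift(w, s)
    return sharpen(w, gamma)

memory = np.eye(5)                         # toy memory: 5 distinct rows
w_prev = np.array([0.0, 1.0, 0.0, 0.0, 0.0])

# Location-based: ignore content (g=0) and move the focus one step forward.
w_shift = address(memory, w_prev, key=memory[3], beta=50.0, g=0.0,
                  s=(0.0, 0.0, 1.0), gamma=2.0)

# Content-based: look up the row matching the key, with no shift.
w_content = address(memory, w_prev, key=memory[3], beta=50.0, g=1.0,
                    s=(0.0, 1.0, 0.0), gamma=2.0)
```

In this toy setting the first call moves the focus from location 1 to location 2, and the second call jumps to location 3, the row that matches the key.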
12. Result of copy algorithm
・ NTM learns some form of copy algorithm.
・ NTM performs better than LSTM (a kind of RNN).
・ Even the NTM copy algorithm makes some mistakes
on long sequences (as indicated by the red arrow).
NTM
・ Outputs are supposed to be a copy of targets.
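To make "outputs are supposed to be a copy of targets" concrete, here is a sketch of how one training example for the copy task could be built. The paper uses sequences of random 8-bit vectors followed by a delimiter; the exact tensor layout below (an extra input channel for the delimiter, blank inputs during the answer phase) and the name `make_copy_example` are assumptions, not the authors' code.

```python
import numpy as np

def make_copy_example(seq_len, width=8, rng=None):
    """Build (inputs, targets) for one copy-task example (assumed layout)."""
    if rng is None:
        rng = np.random.default_rng()
    seq = rng.integers(0, 2, size=(seq_len, width)).astype(float)

    # Presentation phase, one delimiter step, then seq_len blank steps.
    inputs = np.zeros((2 * seq_len + 1, width + 1))
    inputs[:seq_len, :width] = seq
    inputs[seq_len, width] = 1.0          # delimiter flag on an extra channel

    # The network should reproduce the sequence after the delimiter.
    targets = np.zeros((2 * seq_len + 1, width))
    targets[seq_len + 1:] = seq
    return inputs, targets

inputs, targets = make_copy_example(5, rng=np.random.default_rng(0))
```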
13. Result of copy algorithm
LSTM
・ Outputs are supposed to be a copy of targets.
14. How NTM uses an external memory for copy algorithm
・ All weights focus on a single location.
・ Read locations exactly match the write locations.
[Figure panels: External Inputs/Outputs; Adds/Reads (vectors to memory); Write/Read Weightings]
16. How NTM uses an external memory for repeat copy algorithm
・ All weights focus on a single location.
・ The read head repeatedly revisits the locations written by the write head.
17. Result of repeat copy algorithm
・ NTM learns some form of repeat copy algorithm.
21. Results of Dynamical N-grams
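・ NTM predicts the next bit almost as well as the optimal estimator.
Optimal: P(B = 1 | N1, N0, c) = (N1 + 1/2) / (N1 + N0 + 1)
(c is the previous five-bit context; N1 and N0 are the numbers of 1s and 0s observed so far after the context c)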
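The optimal estimator from the original paper, (N1 + 1/2) / (N1 + N0 + 1), can be sketched as a running count per context. The function names and the way counts are stored are assumptions for illustration; the five-bit context length matches the paper.

```python
from collections import defaultdict

def optimal_prob_one(n1, n0):
    """Probability that the next bit is 1, given counts seen after this context."""
    return (n1 + 0.5) / (n1 + n0 + 1.0)

def predict_sequence(bits, context_len=5):
    """Predict each bit from the preceding context, updating counts as we go."""
    counts = defaultdict(lambda: [0, 0])   # context -> [N0, N1]
    preds = []
    for t in range(context_len, len(bits)):
        context = tuple(bits[t - context_len:t])
        n0, n1 = counts[context]
        preds.append(optimal_prob_one(n1, n0))
        counts[context][bits[t]] += 1      # record what actually followed
    return preds

preds = predict_sequence([1] * 8)
```

With no observations the estimator returns 1/2, and it moves toward the empirical frequency as more bits are seen after the same context.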
23. Results of Priority Sort
・Write head writes to locations according to a linear function of priority
・Read head reads from locations in increasing order.
25. Conclusion
・“Neural Turing Machines” are, in a single phrase, neural networks
that can be coupled to external memory.
・ We see this capability to use external memory through the
application to copy, repeat copy, associative recall, dynamical N-grams,
and priority sort.
・ Readers who want more detail than this summary gives are referred to
the original paper (http://arxiv.org/abs/1410.5401).