SlideShare ist ein Scribd-Unternehmen logo
1 von 68
Downloaden Sie, um offline zu lesen
Convolutional Neural Networks
and Natural Language Processing
Thomas Delteil – github.com/thomasdelteil – linkedin.com/in/thomasdelteil
Applied Scientist @ AWS Deep Engine
Goals
§ Explain what convolutions are
§ Show how to handle textual data
§ Analyze a reference neural network
architecture for text classification
§ Demonstrate how to train and deploy a CNN for
Natural Language Processing
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Convolutions
And where to find them
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
2012 - ImageNet Classification with Deep Convolutional Neural Networks
ImageNet classification with Deep Convolutional Neural Networks, Alex Krizhevsky, Ilya Sutskever, Geoffrey E.
Hinton, Advances in Neural Information Processing Systems, 2012
AlexNet architecture
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
ImageNet competition
Classify images among 1000 classes:
AlexNet Top-5 error-rate, 25% => 16%!
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Actual photo of the reaction from the computer vision community*
*might just be a stock photo
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
I told you
so!
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
What made Convolutional Neural Networks viable?
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
GPUs!
- Nvidia V100, float16 Ops:
~ 120 TFLOPS, 5000+ cuda cores
- #1 Super computer 2005 ~135 TFLOPS
Source: Mathworks
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Sea/Land segmentation via satellite images
DeepUNet: A Deep Fully Convolutional Network for Pixel-level Sea-Land Segmentation, Ruirui Li et al, 2017
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Automatic Galaxy classication
Deep Galaxy: Classification of Galaxies based on Deep Convolutional Neural Networks , Nour Eldeen M. Khalifa, 2017
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Medical Imaging, MRI, X-ray, surgical cameras
Review of MRI-based Brain Tumor Image Segmentation Using Deep Learning Methods, Ali Isn et al. 2016
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
What is a convolution ?
It is the cross-channel sum of the element-wise
multiplication of a convolutional filter (kernel/mask)
computed over a sliding window on an input tensor
given a certain stride and padding, plus a bias term.
The result is called a feature map.
2 2 1
3 1 -1
4 3 2
1 -1
-1 0
Input matrix (3x3)
no padding
1 channel
Kernel (2x2)
Stride 1
Bias = 2
Feature map (2x2)
-1 2
0 1
1*2 –1*2 –1*3 + 0*1 + 2 = – 1
1*2 –1*2 –1*1 + 0*-1 + 2. = 2
1*3 –1*1 –1*4 + 0*3 + 2 = 0
1*1 – (-1)*1 –1*3 + 0*2 + 2 = 1
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
What is a convolution ? Padding
Source: Machine Learning guru - Neural Networks CNN
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
What is a convolution ? Stride = 2
Source: Machine Learning guru - Neural Networks CNN
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
What is a convolution ? Multi Channel
1 convolutional filter
(3)x(3x3)
Source: Machine Learning guru - Neural Networks CNN
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
What is a convolution ? Multi Channel
source: Convolutional Neural Networks on the iphone with vggnet
N: Number of input channels
W:Width of the kernel
H: Height of the kernel
M: Number of output channels
Kernel size = ! ∗ # ∗ $
#Params = % ∗ ! ∗ # ∗ $ + %
256 convolutions of kernel (3,3) on 256 input channels
256*256*3*3 = ~0.5M
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Easily parallelizable
Convolution computations are:
- Independent (across filters and within
filter)
- Simple (multiplication and sums)
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Why does it work?
Sharpening filter
Laplacian filter
Sobel x-axis filter
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Why does it work?
- Detect patterns at larger and larger scale by stacking convolution
layers on top of each others to grow the receptive field
- Applicable to spatially correlated data
Source: AlexNet first 96 (55x55) filters learned represented in RGB
space (3 input channels)
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Growing receptive field
Source: ML Review, A guide to receptive field arithmetic
Deeper in the
network
Visualize convolutions
http://scs.ryerson.ca/~aharley/vis/conv/flat.html
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Visualize convolutions
Source: Neural Network 3D Simulation
(warning flashing lights)
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
State of the art networks are getting deeper and more complex
Source: Inception v3
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
input
Learn Data Science – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
High number of parameters => Requires a lot of data to train
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Advanced type of convolutions
Source: An introduction to different types of convolutions
Transposed Convolutions
(deconvolution)
EnhanceNet
Dilated Convolutions
WaveNet
Depth-wise separable
Convolutions
MobileNet
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
On to Natural Language
Processing
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
NLP
Machine
translation
OCR
Q&A
Sentiment
Analysis
Speech
Recognition
TTS
Topic
Modelling
Information
Retrieval
Natural
Language
Understanding
Document
Classification
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
NLP Domains
8.4PB
of information per second
as of 2020
source: business2comunity, 2016
70%
of companies
use customer feedback
Source: business2comunity, 2016
£1.3Tvalue of company
data
source: IDC, 2014
10%
of organizations expect to
commercialise their data by 2020
source: Gartner, 2016
NLP Industry Facts
Source: Ticary, What is natural language processing Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Convolutions and Natural Language Processing
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Data Representation
?
source: Ossama Abdel-Hamid, Abdel-rahman Mohamed, Hui Jiang, Li Deng, Gerald Penn,and Dong Yu,. Classification Convolutional Neural Networks for
Speech Recognition, IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2014
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Encoding Data word-level
- Word-level embedding (word2vec). Word -> N-dimensional vector
Source: Convolutional Neural Networks for Sentence Classification,Yoon Kim, 2014
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
N
time
different
embeddings
V A N C O U V E R N L P …
_ 0 0 0 0 0 0 0 0 0 1 0 0 0
- 0 0 0 0 0 0 0 0 0 0 0 0 0
. 0 0 0 0 0 0 0 0 0 0 0 0 0
A 0 1 0 0 0 0 0 0 0 0 0 0 0
B 0 0 0 0 0 0 0 0 0 0 0 0 0
C 0 0 0 1 0 0 0 0 0 0 0 0 0
D 0 0 0 0 0 0 0 0 0 0 0 0 0
E 0 0 0 0 0 0 0 1 0 0 0 0 0
F 0 0 0 0 0 0 0 0 0 0 0 0 0
G 0 0 0 0 0 0 0 0 0 0 0 0 0
H 0 0 0 0 0 0 0 0 0 0 0 0 0
I 0 0 0 0 0 0 0 0 0 0 0 0 0
J 0 0 0 0 0 0 0 0 0 0 0 0 0
K 0 0 0 0 0 0 0 0 0 0 0 0 0
L 0 0 0 0 0 0 0 0 0 0 0 1 0
M 0 0 0 0 0 0 0 0 0 0 0 0 0
N 0 0 1 0 0 0 0 0 0 0 1 0 0
O 0 0 0 0 1 0 0 0 0 0 0 0 0
P 0 0 0 0 0 0 0 0 0 0 0 0 1
Q 0 0 0 0 0 0 0 0 0 0 0 0 0
R 0 0 0 0 0 0 0 0 1 0 0 0 0
S 0 0 0 0 0 0 0 0 0 0 0 0 0
T 0 0 0 0 0 0 0 0 0 0 0 0 0
U 0 0 0 0 0 1 0 0 0 0 0 0 0
V 1 0 0 0 0 0 1 0 0 0 0 0 0
W 0 0 0 0 0 0 0 0 0 0 0 0 0
X 0 0 0 0 0 0 0 0 0 0 0 0 0
Y 0 0 0 0 0 0 0 0 0 0 0 0 0
Z 0 0 0 0 0 0 0 0 0 0 0 0 0
Encoding Data – Character-level
- One-hot encoding
- Alphabet
- Sparse representation
- Character embedding
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Text classification, N categories
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Text classification, N categories
Neural
Network
- Fiction: 0%
- Biography: 6%
…
- Play: 80%
…
- Documentation: 0%
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
source: Xiang Zhang, Junbo Zhao, Yann LeCun. Character-level Convolutional Networks for Text Classification. NIPS 2015
Visualization with Netro
Deep Neural Network: Crepe Model
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Visualization with Netron
Intuition: convolutions act similarly as n-grams
V A N C O U V E R … 1013
_ 0 0 0 0 0 0 0 0 0 1
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
…
- 0 0 0 0 0 0 0 0 0 0 …
. 0 0 0 0 0 0 0 0 0 0 …
A 0 1 0 0 0 0 0 0 0 0 …
B 0 0 0 0 0 0 0 0 0 0 …
C 0 0 0 1 0 0 0 0 0 0 …
D 0 0 0 0 0 0 0 0 0 0 …
E 0 0 0 0 0 0 0 1 0 0 …
F 0 0 0 0 0 0 0 0 0 0 …
G 0 0 0 0 0 0 0 0 0 0 …
H 0 0 0 0 0 0 0 0 0 0 …
I 0 0 0 0 0 0 0 0 0 0 …
J 0 0 0 0 0 0 0 0 0 0 …
K 0 0 0 0 0 0 0 0 0 0 …
L 0 0 0 0 0 0 0 0 0 0 …
M 0 0 0 0 0 0 0 0 0 0 …
N 0 0 1 0 0 0 0 0 0 0 …
O 0 0 0 0 1 0 0 0 0 0 …
P 0 0 0 0 0 0 0 0 0 0 …
Q 0 0 0 0 0 0 0 0 0 0 …
R 0 0 0 0 0 0 0 0 1 0 …
S 0 0 0 0 0 0 0 0 0 0 …
T 0 0 0 0 0 0 0 0 0 0 …
U 0 0 0 0 0 1 0 0 0 0 …
V 1 0 0 0 0 0 1 0 0 0 …
W 0 0 0 0 0 0 0 0 0 0 …
X 0 0 0 0 0 0 0 0 0 0 …
Y 0 0 0 0 0 0 0 0 0 0 …
Z 0 0 0 0 0 0 0 0 0 0 …
0 1 2 3 4 … … … … … … … … 1007
0 6.4 1.1 3.2 0.3 -0.4 … … … … … … … … …
1 -2.1 0.2 -3.4 … … … … … … … … … … …
… … … … … … … … … … … … … … …
… … … … … … … … … … … … … … …
… … … … … … … … … … … … … … …
254 … … … … … … … … … … … … … …
255 1.2 3.4 -1 1.2 3.2 … … … … … … … … …
x 256
69x1014x1 = ~70k
1x1008x256 = ~256k
x 1008
Temporal Convolution (256 69*7/1)
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
1x1008x256 = ~256k
1x1008x256 = ~ 256k
Activation Function: Rectified Linear Unit (ReLU)
! " = $
", " ≥ 0
0, " < 0
0 1 2 3 4 5 … 1007
0 6.4 1.1 3.2 0.3 -0.4 0.2 … …
… … … … … … … … …
255 1.2 3.4 -1 1.2 3.2 2.8 … …
0 1 2 3 4 5 … 1007
0 6.4 1.1 3.2 0.3 0 0.2 … …
… … … … … … … … …
255 1.2 3.4 0 1.2 3.2 2.8 … …
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
0 1 2 3 4 5 … 1007
0 6.4 1.1 3.2 0.3 0 0.2 … …
… … … … … … … … …
255 1.2 3.4 0 1.2 3.2 2.8 … …
0 1 … 335
0 6.4 0.3 … …
… … … … …
255 3.4 3.2 … …
1x1008x256 = ~256k
1x336x256 = ~86k
x 336
x 256
Down-sampling: Max-Pooling (256 1*3/3)
source : Stanford's CS231n
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Fast forward…
1x336x256 = ~86k <- after 1 convolution layer (69*7/1) and 1 max pooling (3x1/3)
1x330x256 = ~85k <- after 1 convolution layer (1*7/1)
1x110x256 = ~28k <- 1 max-pooling (1*3/3)
3x102x256 = ~26k <- 4 convolutions layers (1*3/1)
1x34x256 = ~9k <- 1 max-pooling (1*3/3)
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
0 1 2 3 4 5 6 7 8 … 33
0 6.4 0.1 … … … … … … … … …
1 2.1 24.9 … … … … … … … … …
… … … … … … … … … … … …
255 … … … … … … … … … … 9.9
0
0 6.4
1 0.1
… …
34 2.1
35 24.9
… …
… …
… …
8703 9.9 8704x1x1 = ~9k
1x34x256 = ~9k
x 256
Flattening Layer
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
0
0 6.4
1 0.1
… …
8703 9.9
8704x1x1 = ~9k
0
1
k
1023
x 1024
1024x1x1 = ~1k
!" # = %
&'(
)*(+
,"& ∗ .& + 0"
0
0 8.7
1 -2.1
… …
1023 32.1
Fully Connected / Dense layer (1024)
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
0
0 8.7
1 0
… …
… …
… …
… …
… …
… …
… …
1023 32.1
DROP OUT
1024x1x1 = ~1k
0
1
k
1023
x 1024
1024x1x1 = ~1k
!" # = %
&'(
)*(+
,"& ∗ .& + 0"
0
0 9.2
1 5.3
… …
1023 0.1
ignored
Dropout (p=0.5) + Fully Connected Layer (1024)
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
0
0 6.4
1 0.1
… …
… …
… …
… …
… …
… …
… …
1023 9.9
1024x1x1 = ~1k
0
…
N-1
x N
Nx1x1 = N
0
0 2.7
1 0.1
… …
… …
N-1 12.5
ignored
Softmax
0
0 0.1
1 0.01
… …
… …
N-1 0.8
Nx1x1 = N
!"#$%&' ( ) =
+,-
∑/01
234 +
,/
Output: Dropout + Dense + Softmax for N categories
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Text classification, N categories
Neural
Network
- Fiction: 0%
- Biography: 6%
…
- Play: 6%
…
- Documentation: 80%
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
How to train the network? Backward propagation!
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Backward propagation – Efficient Gradient Descent
- Fiction: 0%
- Biography: 6% 0%
…
- Play: 6% 100%
…
- Documentation: 80% 0%
- Fiction: 0%
- Biography: 6%
…
- Play: 6%
…
- Documentation: 80%
Update the weights of the convolutional masks and fully
connected units so that the error will be minimized next time
Neural
Network
!"# = !"# − &.
()
(*+,
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Learning Rate ! : How much to update the weights for every batch of documents?
Training Parameters: Learning Rate
Source:Towards data Science: Gradient descent in a nutshell
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Training parameters: Batch Size
Batch size: How many examples to learn from in one step?
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Training parameters: Number of epochs
Number of epochs: How many times should we feed the network the entire training set?
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Jupyter notebook demo – Crepe in Apache MXNet/Gluon
https://github.com/ThomasDelteil/CNN_NLP_MXNet
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Results
Traditional approaches
Word-level CNN
Character-level CNN
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
For images
For text
Humans to rephrase the examples
Synonyms
Similar semantic meaning
Data Augmentation
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Data Augmentation
The quick brown fox jumps over the lazy dog
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Data Augmentation
The quick brown fox jumps over the lazy dog
fast
swift
speedy
idle
indolent
slothful
hound
pup
mutt
leaps
springs
bounds
hops
hazel
brunette
chestnut
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Data Augmentation
The quick brown fox jumps over the lazy dog
fast
swift
speedy
idle
indolent
slothful
hound
pup
mutt
leaps
springs
bounds
hops
hazel
brunette
chestnut
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Data Augmentation
The quick brown fox jumps over the lazy dog
fast
swift
speedy
idle
indolent
slothful
hound
pup
mutt
leaps
springs
bounds
hops
The swift brunette fox leaps over the slothful pup
hazel
brunette
chestnut
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
You need a large dataset
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
…A very large dataset!
Live Demo – Classification of product category for Amazon Reviews
https://thomasdelteil.github.io/CNN_NLP_MXNet/
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
- Develop model using a Jupyter notebook
- Train model on GPU instance
- Package model behind web API in a Docker container, e.g using MXNet Model Server
- Upload container to container registry
- Deploy container to an elastic container service
- Enjoy quick and linear scaling
- Put the API behind a load balancer with SSL termination
- Enjoy J
Workflow and Operationalization
Elastic
Container
Service
GPU instance Container
Registry
Auto-scaling Load
Balancer
Container
HTTPS request
“Loved this
book”
HTTPS response
{
“prediction” : {
“book”: 0.99
}
}
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Advanced use-cases for
Convolutions and NLP
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
CNN + LSTM: Spatially and Temporally Deep Neural Networks
- CNN for feature extraction
- LSTM for temporal representation
Applications:
- Video (CNN for frames, LSTM to
combine them temporally)
- Text tasks
- Audio (Language detection)
Source: Combining CNN and RNN for spoken language detection
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Advanced use-case: Speech Generation WaveNet
Source: DeepMind Wavenet generative model raw audio
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
WaveNet: Dilated Causal Convolution
Source: DeepMind Wavenet generative model raw audio
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
WaveNet: Dilated Causal Convolution
Source: DeepMind Wavenet generative model raw audio
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Summary
§ Learned about convolutions
§ Applied them to textual data
§ Studied the crepe architecture from
Zhang et al. in details
§ Learned about advanced use cases and
operationalization
Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
Thank you!
Connect here
github.com/thomasdelteil
linkedin.com/in/thomasdelteil
tdelteil@amazon.com
Photos credits: https://pexels.com and https://unsplash.com/

Weitere ähnliche Inhalte

Was ist angesagt?

Introduction to Transformers for NLP - Olga Petrova
Introduction to Transformers for NLP - Olga PetrovaIntroduction to Transformers for NLP - Olga Petrova
Introduction to Transformers for NLP - Olga PetrovaAlexey Grigorev
 
NLP State of the Art | BERT
NLP State of the Art | BERTNLP State of the Art | BERT
NLP State of the Art | BERTshaurya uppal
 
Transformer Introduction (Seminar Material)
Transformer Introduction (Seminar Material)Transformer Introduction (Seminar Material)
Transformer Introduction (Seminar Material)Yuta Niki
 
Deep Learning for Natural Language Processing: Word Embeddings
Deep Learning for Natural Language Processing: Word EmbeddingsDeep Learning for Natural Language Processing: Word Embeddings
Deep Learning for Natural Language Processing: Word EmbeddingsRoelof Pieters
 
An introduction to the Transformers architecture and BERT
An introduction to the Transformers architecture and BERTAn introduction to the Transformers architecture and BERT
An introduction to the Transformers architecture and BERTSuman Debnath
 
Recurrent Neural Networks, LSTM and GRU
Recurrent Neural Networks, LSTM and GRURecurrent Neural Networks, LSTM and GRU
Recurrent Neural Networks, LSTM and GRUananth
 
A Simple Introduction to Word Embeddings
A Simple Introduction to Word EmbeddingsA Simple Introduction to Word Embeddings
A Simple Introduction to Word EmbeddingsBhaskar Mitra
 
Word Embeddings - Introduction
Word Embeddings - IntroductionWord Embeddings - Introduction
Word Embeddings - IntroductionChristian Perone
 
Sequence Modelling with Deep Learning
Sequence Modelling with Deep LearningSequence Modelling with Deep Learning
Sequence Modelling with Deep LearningNatasha Latysheva
 
Introduction to Named Entity Recognition
Introduction to Named Entity RecognitionIntroduction to Named Entity Recognition
Introduction to Named Entity RecognitionTomer Lieber
 
Long Short Term Memory (Neural Networks)
Long Short Term Memory (Neural Networks)Long Short Term Memory (Neural Networks)
Long Short Term Memory (Neural Networks)Olusola Amusan
 
Introduction to Diffusion Models
Introduction to Diffusion ModelsIntroduction to Diffusion Models
Introduction to Diffusion ModelsSangwoo Mo
 
RoFormer: Enhanced Transformer with Rotary Position Embedding
RoFormer: Enhanced Transformer with Rotary Position EmbeddingRoFormer: Enhanced Transformer with Rotary Position Embedding
RoFormer: Enhanced Transformer with Rotary Position Embeddingtaeseon ryu
 

Was ist angesagt? (20)

Introduction to Transformers for NLP - Olga Petrova
Introduction to Transformers for NLP - Olga PetrovaIntroduction to Transformers for NLP - Olga Petrova
Introduction to Transformers for NLP - Olga Petrova
 
Tutorial on word2vec
Tutorial on word2vecTutorial on word2vec
Tutorial on word2vec
 
NLP State of the Art | BERT
NLP State of the Art | BERTNLP State of the Art | BERT
NLP State of the Art | BERT
 
Transformer Introduction (Seminar Material)
Transformer Introduction (Seminar Material)Transformer Introduction (Seminar Material)
Transformer Introduction (Seminar Material)
 
Deep Learning for Natural Language Processing: Word Embeddings
Deep Learning for Natural Language Processing: Word EmbeddingsDeep Learning for Natural Language Processing: Word Embeddings
Deep Learning for Natural Language Processing: Word Embeddings
 
An introduction to the Transformers architecture and BERT
An introduction to the Transformers architecture and BERTAn introduction to the Transformers architecture and BERT
An introduction to the Transformers architecture and BERT
 
LSTM Basics
LSTM BasicsLSTM Basics
LSTM Basics
 
Recurrent Neural Networks, LSTM and GRU
Recurrent Neural Networks, LSTM and GRURecurrent Neural Networks, LSTM and GRU
Recurrent Neural Networks, LSTM and GRU
 
[Paper review] BERT
[Paper review] BERT[Paper review] BERT
[Paper review] BERT
 
A Simple Introduction to Word Embeddings
A Simple Introduction to Word EmbeddingsA Simple Introduction to Word Embeddings
A Simple Introduction to Word Embeddings
 
Word Embeddings - Introduction
Word Embeddings - IntroductionWord Embeddings - Introduction
Word Embeddings - Introduction
 
Word2Vec
Word2VecWord2Vec
Word2Vec
 
Bert
BertBert
Bert
 
Sequence Modelling with Deep Learning
Sequence Modelling with Deep LearningSequence Modelling with Deep Learning
Sequence Modelling with Deep Learning
 
Transformers in 2021
Transformers in 2021Transformers in 2021
Transformers in 2021
 
Introduction to Named Entity Recognition
Introduction to Named Entity RecognitionIntroduction to Named Entity Recognition
Introduction to Named Entity Recognition
 
Long Short Term Memory (Neural Networks)
Long Short Term Memory (Neural Networks)Long Short Term Memory (Neural Networks)
Long Short Term Memory (Neural Networks)
 
LSTM Tutorial
LSTM TutorialLSTM Tutorial
LSTM Tutorial
 
Introduction to Diffusion Models
Introduction to Diffusion ModelsIntroduction to Diffusion Models
Introduction to Diffusion Models
 
RoFormer: Enhanced Transformer with Rotary Position Embedding
RoFormer: Enhanced Transformer with Rotary Position EmbeddingRoFormer: Enhanced Transformer with Rotary Position Embedding
RoFormer: Enhanced Transformer with Rotary Position Embedding
 

Ähnlich wie Convolutional Neural Networks and Natural Language Processing

The Web of Data: do we actually understand what we built?
The Web of Data: do we actually understand what we built?The Web of Data: do we actually understand what we built?
The Web of Data: do we actually understand what we built?Frank van Harmelen
 
Introduction to Semantic Web for GIS Practitioners
Introduction to Semantic Web for GIS PractitionersIntroduction to Semantic Web for GIS Practitioners
Introduction to Semantic Web for GIS PractitionersEmanuele Della Valle
 
Deep Content Learning in Traffic Prediction and Text Classification
Deep Content Learning in Traffic Prediction and Text ClassificationDeep Content Learning in Traffic Prediction and Text Classification
Deep Content Learning in Traffic Prediction and Text ClassificationHPCC Systems
 
Deep Learning for IoT : is there a shallow end of the pool?
Deep Learning for IoT : is there a shallow end of the pool?Deep Learning for IoT : is there a shallow end of the pool?
Deep Learning for IoT : is there a shallow end of the pool?Venu Vasudevan
 
The Empirical Turn in Knowledge Representation
The Empirical Turn in Knowledge RepresentationThe Empirical Turn in Knowledge Representation
The Empirical Turn in Knowledge RepresentationFrank van Harmelen
 
building intelligent systems with large scale deep learning
building intelligent systems with large scale deep learningbuilding intelligent systems with large scale deep learning
building intelligent systems with large scale deep learningmustafa sarac
 
Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS G...
Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS G...Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS G...
Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS G...MaRS Discovery District
 
Intro to deep learning
Intro to deep learning Intro to deep learning
Intro to deep learning David Voyles
 
OK festival Lightning Talk - Collaborative Open Geospatial Data
OK festival Lightning Talk - Collaborative Open Geospatial DataOK festival Lightning Talk - Collaborative Open Geospatial Data
OK festival Lightning Talk - Collaborative Open Geospatial DataAndrew Turner
 
Deep learning tutorial 9/2019
Deep learning tutorial 9/2019Deep learning tutorial 9/2019
Deep learning tutorial 9/2019Amr Rashed
 
Deep Learning Tutorial
Deep Learning TutorialDeep Learning Tutorial
Deep Learning TutorialAmr Rashed
 
Nautral Langauge Processing - Basics / Non Technical
Nautral Langauge Processing - Basics / Non Technical Nautral Langauge Processing - Basics / Non Technical
Nautral Langauge Processing - Basics / Non Technical Dhruv Gohil
 
Small, Medium and Big Data
Small, Medium and Big DataSmall, Medium and Big Data
Small, Medium and Big DataPierre De Wilde
 
BDW16 London - Chris von Csefalvay, Helioserv - Cats and What They Tell us Ab...
BDW16 London - Chris von Csefalvay, Helioserv - Cats and What They Tell us Ab...BDW16 London - Chris von Csefalvay, Helioserv - Cats and What They Tell us Ab...
BDW16 London - Chris von Csefalvay, Helioserv - Cats and What They Tell us Ab...Big Data Week
 
Introduction to deep learning
Introduction to deep learningIntroduction to deep learning
Introduction to deep learningAmr Rashed
 
Deep Learning and Watson Studio
Deep Learning and Watson StudioDeep Learning and Watson Studio
Deep Learning and Watson StudioSasha Lazarevic
 
Chatbots in 2017 -- Ithaca Talk Dec 6
Chatbots in 2017 -- Ithaca Talk Dec 6Chatbots in 2017 -- Ithaca Talk Dec 6
Chatbots in 2017 -- Ithaca Talk Dec 6Paul Houle
 

Ähnlich wie Convolutional Neural Networks and Natural Language Processing (20)

The Web of Data: do we actually understand what we built?
The Web of Data: do we actually understand what we built?The Web of Data: do we actually understand what we built?
The Web of Data: do we actually understand what we built?
 
Introduction to Semantic Web for GIS Practitioners
Introduction to Semantic Web for GIS PractitionersIntroduction to Semantic Web for GIS Practitioners
Introduction to Semantic Web for GIS Practitioners
 
Novi sad ai event 1-2018
Novi sad ai event 1-2018Novi sad ai event 1-2018
Novi sad ai event 1-2018
 
Deep Content Learning in Traffic Prediction and Text Classification
Deep Content Learning in Traffic Prediction and Text ClassificationDeep Content Learning in Traffic Prediction and Text Classification
Deep Content Learning in Traffic Prediction and Text Classification
 
Deep Learning for IoT : is there a shallow end of the pool?
Deep Learning for IoT : is there a shallow end of the pool?Deep Learning for IoT : is there a shallow end of the pool?
Deep Learning for IoT : is there a shallow end of the pool?
 
The Empirical Turn in Knowledge Representation
The Empirical Turn in Knowledge RepresentationThe Empirical Turn in Knowledge Representation
The Empirical Turn in Knowledge Representation
 
Distributed deep learning_over_spark_20_nov_2014_ver_2.8
Distributed deep learning_over_spark_20_nov_2014_ver_2.8Distributed deep learning_over_spark_20_nov_2014_ver_2.8
Distributed deep learning_over_spark_20_nov_2014_ver_2.8
 
building intelligent systems with large scale deep learning
building intelligent systems with large scale deep learningbuilding intelligent systems with large scale deep learning
building intelligent systems with large scale deep learning
 
Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS G...
Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS G...Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS G...
Deep Learning: Changing the Playing Field of Artificial Intelligence - MaRS G...
 
Intro to deep learning
Intro to deep learning Intro to deep learning
Intro to deep learning
 
OK festival Lightning Talk - Collaborative Open Geospatial Data
OK festival Lightning Talk - Collaborative Open Geospatial DataOK festival Lightning Talk - Collaborative Open Geospatial Data
OK festival Lightning Talk - Collaborative Open Geospatial Data
 
Deep learning tutorial 9/2019
Deep learning tutorial 9/2019Deep learning tutorial 9/2019
Deep learning tutorial 9/2019
 
Deep Learning Tutorial
Deep Learning TutorialDeep Learning Tutorial
Deep Learning Tutorial
 
Nautral Langauge Processing - Basics / Non Technical
Nautral Langauge Processing - Basics / Non Technical Nautral Langauge Processing - Basics / Non Technical
Nautral Langauge Processing - Basics / Non Technical
 
Small, Medium and Big Data
Small, Medium and Big DataSmall, Medium and Big Data
Small, Medium and Big Data
 
BDW16 London - Chris von Csefalvay, Helioserv - Cats and What They Tell us Ab...
BDW16 London - Chris von Csefalvay, Helioserv - Cats and What They Tell us Ab...BDW16 London - Chris von Csefalvay, Helioserv - Cats and What They Tell us Ab...
BDW16 London - Chris von Csefalvay, Helioserv - Cats and What They Tell us Ab...
 
Introduction to deep learning
Introduction to deep learningIntroduction to deep learning
Introduction to deep learning
 
Deep Learning and Watson Studio
Deep Learning and Watson StudioDeep Learning and Watson Studio
Deep Learning and Watson Studio
 
Query Understanding
Query UnderstandingQuery Understanding
Query Understanding
 
Chatbots in 2017 -- Ithaca Talk Dec 6
Chatbots in 2017 -- Ithaca Talk Dec 6Chatbots in 2017 -- Ithaca Talk Dec 6
Chatbots in 2017 -- Ithaca Talk Dec 6
 

Kürzlich hochgeladen

Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfLars Albertsson
 
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...Pooja Nehwal
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Callshivangimorya083
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightDelhi Call girls
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxMohammedJunaid861692
 
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...shivangimorya083
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxolyaivanovalion
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxolyaivanovalion
 
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Delhi Call girls
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxolyaivanovalion
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxolyaivanovalion
 
ALSO dropshipping via API with DroFx.pptx
ALSO dropshipping via API with DroFx.pptxALSO dropshipping via API with DroFx.pptx
ALSO dropshipping via API with DroFx.pptxolyaivanovalion
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxolyaivanovalion
 
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfadriantubila
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxfirstjob4
 
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service OnlineCALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Onlineanilsa9823
 
Call Girls 🫤 Dwarka ➡️ 9711199171 ➡️ Delhi 🫦 Two shot with one girl
Call Girls 🫤 Dwarka ➡️ 9711199171 ➡️ Delhi 🫦 Two shot with one girlCall Girls 🫤 Dwarka ➡️ 9711199171 ➡️ Delhi 🫦 Two shot with one girl
Call Girls 🫤 Dwarka ➡️ 9711199171 ➡️ Delhi 🫦 Two shot with one girlkumarajju5765
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...amitlee9823
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxolyaivanovalion
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxolyaivanovalion
 

Kürzlich hochgeladen (20)

Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdf
 
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
 
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptx
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptx
 
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFx
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptx
 
ALSO dropshipping via API with DroFx.pptx
ALSO dropshipping via API with DroFx.pptxALSO dropshipping via API with DroFx.pptx
ALSO dropshipping via API with DroFx.pptx
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptx
 
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service OnlineCALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
 
Call Girls 🫤 Dwarka ➡️ 9711199171 ➡️ Delhi 🫦 Two shot with one girl
Call Girls 🫤 Dwarka ➡️ 9711199171 ➡️ Delhi 🫦 Two shot with one girlCall Girls 🫤 Dwarka ➡️ 9711199171 ➡️ Delhi 🫦 Two shot with one girl
Call Girls 🫤 Dwarka ➡️ 9711199171 ➡️ Delhi 🫦 Two shot with one girl
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptx
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptx
 

Convolutional Neural Networks and Natural Language Processing

  • 1. Convolutional Neural Networks and Natural Language Processing Thomas Delteil – github.com/thomasdelteil – linkedin.com/in/thomasdelteil Applied Scientist @ AWS Deep Engine
  • 2. Goals § Explain what convolutions are § Show how to handle textual data § Analyze a reference neural network architecture for text classification § Demonstrate how to train and deploy a CNN for Natural Language Processing Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 3. Convolutions And where to find them Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 4. 2012 - ImageNet Classification with Deep Convolutional Neural Networks ImageNet classification with Deep Convolutional Neural Networks, Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton, Advances in Neural Information Processing Systems, 2012 AlexNet architecture Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 5. ImageNet competition Classify images among 1000 classes: AlexNet Top-5 error-rate, 25% => 16%! Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 6. Actual photo of the reaction from the computer vision community* *might just be a stock photo Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 7. I told you so! Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 8. What made Convolutional Neural Networks viable? Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 9. GPUs! - Nvidia V100, float16 Ops: ~ 120 TFLOPS, 5000+ cuda cores - #1 Super computer 2005 ~135 TFLOPS Source: Mathworks Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 10. Sea/Land segmentation via satellite images DeepUNet: A Deep Fully Convolutional Network for Pixel-level Sea-Land Segmentation, Ruirui Li et al, 2017 Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 11. Automatic Galaxy classication Deep Galaxy: Classification of Galaxies based on Deep Convolutional Neural Networks , Nour Eldeen M. Khalifa, 2017 Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 12. Medical Imaging, MRI, X-ray, surgical cameras Review of MRI-based Brain Tumor Image Segmentation Using Deep Learning Methods, Ali Isn et al. 2016 Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 13. What is a convolution ? It is the cross-channel sum of the element-wise multiplication of a convolutional filter (kernel/mask) computed over a sliding window on an input tensor given a certain stride and padding, plus a bias term. The result is called a feature map. 2 2 1 3 1 -1 4 3 2 1 -1 -1 0 Input matrix (3x3) no padding 1 channel Kernel (2x2) Stride 1 Bias = 2 Feature map (2x2) -1 2 0 1 1*2 –1*2 –1*3 + 0*1 + 2 = – 1 1*2 –1*2 –1*1 + 0*-1 + 2. = 2 1*3 –1*1 –1*4 + 0*3 + 2 = 0 1*1 – (-1)*1 –1*3 + 0*2 + 2 = 1 Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 14. What is a convolution ? Padding Source: Machine Learning guru - Neural Networks CNN Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 15. What is a convolution ? Stride = 2 Source: Machine Learning guru - Neural Networks CNN Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 16. What is a convolution ? Multi Channel 1 convolutional filter (3)x(3x3) Source: Machine Learning guru - Neural Networks CNN Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 17. What is a convolution ? Multi Channel source: Convolutional Neural Networks on the iphone with vggnet N: Number of input channels W:Width of the kernel H: Height of the kernel M: Number of output channels Kernel size = ! ∗ # ∗ $ #Params = % ∗ ! ∗ # ∗ $ + % 256 convolutions of kernel (3,3) on 256 input channels 256*256*3*3 = ~0.5M Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 18. Easily parallelizable Convolution computations are: - Independent (across filters and within filter) - Simple (multiplication and sums) Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 19. Why does it work? Sharpening filter Laplacian filter Sobel x-axis filter Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 20. Why does it work? - Detect patterns at larger and larger scale by stacking convolution layers on top of each others to grow the receptive field - Applicable to spatially correlated data Source: AlexNet first 96 (55x55) filters learned represented in RGB space (3 input channels) Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 21. Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil Growing receptive field Source: ML Review, A guide to receptive field arithmetic Deeper in the network
  • 22. Visualize convolutions http://scs.ryerson.ca/~aharley/vis/conv/flat.html Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 23. Visualize convolutions Source: Neural Network 3D Simulation (warning flashing lights) Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 24. State of the art networks are getting deeper and more complex Source: Inception v3 Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil input
  • 25. Learn Data Science – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil High number of parameters => Requires a lot of data to train Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 26. Advanced type of convolutions Source: An introduction to different types of convolutions Transposed Convolutions (deconvolution) EnhanceNet Dilated Convolutions WaveNet Depth-wise separable Convolutions MobileNet Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 27. On to Natural Language Processing Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 28. NLP Machine translation OCR Q&A Sentiment Analysis Speech Recognition TTS Topic Modelling Information Retrieval Natural Language Understanding Document Classification Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil NLP Domains
  • 29. 8.4PB of information per second as of 2020 source: business2comunity, 2016 70% of companies use customer feedback Source: business2comunity, 2016 £1.3Tvalue of company data source: IDC, 2014 10% of organizations expect to commercialise their data by 2020 source: Gartner, 2016 NLP Industry Facts Source: Ticary, What is natural language processing Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 30. Convolutions and Natural Language Processing Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 31. Data Representation ? source: Ossama Abdel-Hamid, Abdel-rahman Mohamed, Hui Jiang, Li Deng, Gerald Penn,and Dong Yu,. Classification Convolutional Neural Networks for Speech Recognition, IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2014 Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 32. Encoding Data word-level - Word-level embedding (word2vec). Word -> N-dimensional vector Source: Convolutional Neural Networks for Sentence Classification,Yoon Kim, 2014 Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil N time different embeddings
  • 33. V A N C O U V E R N L P … _ 0 0 0 0 0 0 0 0 0 1 0 0 0 - 0 0 0 0 0 0 0 0 0 0 0 0 0 . 0 0 0 0 0 0 0 0 0 0 0 0 0 A 0 1 0 0 0 0 0 0 0 0 0 0 0 B 0 0 0 0 0 0 0 0 0 0 0 0 0 C 0 0 0 1 0 0 0 0 0 0 0 0 0 D 0 0 0 0 0 0 0 0 0 0 0 0 0 E 0 0 0 0 0 0 0 1 0 0 0 0 0 F 0 0 0 0 0 0 0 0 0 0 0 0 0 G 0 0 0 0 0 0 0 0 0 0 0 0 0 H 0 0 0 0 0 0 0 0 0 0 0 0 0 I 0 0 0 0 0 0 0 0 0 0 0 0 0 J 0 0 0 0 0 0 0 0 0 0 0 0 0 K 0 0 0 0 0 0 0 0 0 0 0 0 0 L 0 0 0 0 0 0 0 0 0 0 0 1 0 M 0 0 0 0 0 0 0 0 0 0 0 0 0 N 0 0 1 0 0 0 0 0 0 0 1 0 0 O 0 0 0 0 1 0 0 0 0 0 0 0 0 P 0 0 0 0 0 0 0 0 0 0 0 0 1 Q 0 0 0 0 0 0 0 0 0 0 0 0 0 R 0 0 0 0 0 0 0 0 1 0 0 0 0 S 0 0 0 0 0 0 0 0 0 0 0 0 0 T 0 0 0 0 0 0 0 0 0 0 0 0 0 U 0 0 0 0 0 1 0 0 0 0 0 0 0 V 1 0 0 0 0 0 1 0 0 0 0 0 0 W 0 0 0 0 0 0 0 0 0 0 0 0 0 X 0 0 0 0 0 0 0 0 0 0 0 0 0 Y 0 0 0 0 0 0 0 0 0 0 0 0 0 Z 0 0 0 0 0 0 0 0 0 0 0 0 0 Encoding Data – Character-level - One-hot encoding - Alphabet - Sparse representation - Character embedding Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 34. Text classification, N categories Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 35. Text classification, N categories Neural Network - Fiction: 0% - Biography: 6% … - Play: 80% … - Documentation: 0% Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 36. source: Xiang Zhang, Junbo Zhao, Yann LeCun. Character-level Convolutional Networks for Text Classification. NIPS 2015 Visualization with Netro Deep Neural Network: Crepe Model Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil Visualization with Netron Intuition: convolutions act similarly as n-grams
  • 37. V A N C O U V E R … 1013 _ 0 0 0 0 0 0 0 0 0 1 … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … - 0 0 0 0 0 0 0 0 0 0 … . 0 0 0 0 0 0 0 0 0 0 … A 0 1 0 0 0 0 0 0 0 0 … B 0 0 0 0 0 0 0 0 0 0 … C 0 0 0 1 0 0 0 0 0 0 … D 0 0 0 0 0 0 0 0 0 0 … E 0 0 0 0 0 0 0 1 0 0 … F 0 0 0 0 0 0 0 0 0 0 … G 0 0 0 0 0 0 0 0 0 0 … H 0 0 0 0 0 0 0 0 0 0 … I 0 0 0 0 0 0 0 0 0 0 … J 0 0 0 0 0 0 0 0 0 0 … K 0 0 0 0 0 0 0 0 0 0 … L 0 0 0 0 0 0 0 0 0 0 … M 0 0 0 0 0 0 0 0 0 0 … N 0 0 1 0 0 0 0 0 0 0 … O 0 0 0 0 1 0 0 0 0 0 … P 0 0 0 0 0 0 0 0 0 0 … Q 0 0 0 0 0 0 0 0 0 0 … R 0 0 0 0 0 0 0 0 1 0 … S 0 0 0 0 0 0 0 0 0 0 … T 0 0 0 0 0 0 0 0 0 0 … U 0 0 0 0 0 1 0 0 0 0 … V 1 0 0 0 0 0 1 0 0 0 … W 0 0 0 0 0 0 0 0 0 0 … X 0 0 0 0 0 0 0 0 0 0 … Y 0 0 0 0 0 0 0 0 0 0 … Z 0 0 0 0 0 0 0 0 0 0 … 0 1 2 3 4 … … … … … … … … 1007 0 6.4 1.1 3.2 0.3 -0.4 … … … … … … … … … 1 -2.1 0.2 -3.4 … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … … 254 … … … … … … … … … … … … … … 255 1.2 3.4 -1 1.2 3.2 … … … … … … … … … x 256 69x1014x1 = ~70k 1x1008x256 = ~256k x 1008 Temporal Convolution (256 69*7/1) Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 38. 1x1008x256 = ~256k 1x1008x256 = ~ 256k Activation Function: Rectified Linear Unit (ReLU) ! " = $ ", " ≥ 0 0, " < 0 0 1 2 3 4 5 … 1007 0 6.4 1.1 3.2 0.3 -0.4 0.2 … … … … … … … … … … … 255 1.2 3.4 -1 1.2 3.2 2.8 … … 0 1 2 3 4 5 … 1007 0 6.4 1.1 3.2 0.3 0 0.2 … … … … … … … … … … … 255 1.2 3.4 0 1.2 3.2 2.8 … … Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 39. 0 1 2 3 4 5 … 1007 0 6.4 1.1 3.2 0.3 0 0.2 … … … … … … … … … … … 255 1.2 3.4 0 1.2 3.2 2.8 … … 0 1 … 335 0 6.4 0.3 … … … … … … … 255 3.4 3.2 … … 1x1008x256 = ~256k 1x336x256 = ~86k x 336 x 256 Down-sampling: Max-Pooling (256 1*3/3) source : Stanford's CS231n Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 40. Fast forward… 1x336x256 = ~86k <- after 1 convolution layer (69*7/1) and 1 max pooling (3x1/3) 1x330x256 = ~85k <- after 1 convolution layer (1*7/1) 1x110x256 = ~28k <- 1 max-pooling (1*3/3) 3x102x256 = ~26k <- 4 convolutions layers (1*3/1) 1x34x256 = ~9k <- 1 max-pooling (1*3/3) Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 41. 0 1 2 3 4 5 6 7 8 … 33 0 6.4 0.1 … … … … … … … … … 1 2.1 24.9 … … … … … … … … … … … … … … … … … … … … … 255 … … … … … … … … … … 9.9 0 0 6.4 1 0.1 … … 34 2.1 35 24.9 … … … … … … 8703 9.9 8704x1x1 = ~9k 1x34x256 = ~9k x 256 Flattening Layer Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 42. 0 0 6.4 1 0.1 … … 8703 9.9 8704x1x1 = ~9k 0 1 k 1023 x 1024 1024x1x1 = ~1k !" # = % &'( )*(+ ,"& ∗ .& + 0" 0 0 8.7 1 -2.1 … … 1023 32.1 Fully Connected / Dense layer (1024) Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 43. 0 0 8.7 1 0 … … … … … … … … … … … … … … 1023 32.1 DROP OUT 1024x1x1 = ~1k 0 1 k 1023 x 1024 1024x1x1 = ~1k !" # = % &'( )*(+ ,"& ∗ .& + 0" 0 0 9.2 1 5.3 … … 1023 0.1 ignored Dropout (p=0.5) + Fully Connected Layer (1024) Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 44. 0 0 6.4 1 0.1 … … … … … … … … … … … … … … 1023 9.9 1024x1x1 = ~1k 0 … N-1 x N Nx1x1 = N 0 0 2.7 1 0.1 … … … … N-1 12.5 ignored Softmax 0 0 0.1 1 0.01 … … … … N-1 0.8 Nx1x1 = N !"#$%&' ( ) = +,- ∑/01 234 + ,/ Output: Dropout + Dense + Softmax for N categories Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 45. Text classification, N categories Neural Network - Fiction: 0% - Biography: 6% … - Play: 6% … - Documentation: 80% Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 46. How to train the network? Backward propagation! Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 47. Backward propagation – Efficient Gradient Descent - Fiction: 0% - Biography: 6% 0% … - Play: 6% 100% … - Documentation: 80% 0% - Fiction: 0% - Biography: 6% … - Play: 6% … - Documentation: 80% Update the weights of the convolutional masks and fully connected units so that the error will be minimized next time Neural Network !"# = !"# − &. () (*+, Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 48. Learning Rate ! : How much to update the weights for every batch of documents? Training Parameters: Learning Rate Source:Towards data Science: Gradient descent in a nutshell Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 49. Training parameters: Batch Size Batch size: How many examples to learn from in one step? Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 50. Training parameters: Number of epochs Number of epochs: How many times should we feed the network the entire training set? Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 51. Jupyter notebook demo – Crepe in Apache MXNet/Gluon https://github.com/ThomasDelteil/CNN_NLP_MXNet Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 52. Results Traditional approaches Word-level CNN Character-level CNN Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 53. For images For text Humans to rephrase the examples Synonyms Similar semantic meaning Data Augmentation Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 54. Data Augmentation The quick brown fox jumps over the lazy dog Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 55. Data Augmentation The quick brown fox jumps over the lazy dog fast swift speedy idle indolent slothful hound pup mutt leaps springs bounds hops hazel brunette chestnut Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 56. Data Augmentation The quick brown fox jumps over the lazy dog fast swift speedy idle indolent slothful hound pup mutt leaps springs bounds hops hazel brunette chestnut Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 57. Data Augmentation The quick brown fox jumps over the lazy dog fast swift speedy idle indolent slothful hound pup mutt leaps springs bounds hops The swift brunette fox leaps over the slothful pup hazel brunette chestnut Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 58. You need a large dataset Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 59. …A very large dataset!
  • 60. Live Demo – Classification of product category for Amazon Reviews https://thomasdelteil.github.io/CNN_NLP_MXNet/ Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 61. - Develop model using a Jupyter notebook - Train model on GPU instance - Package model behind web API in a Docker container, e.g using MXNet Model Server - Upload container to container registry - Deploy container to an elastic container service - Enjoy quick and linear scaling - Put the API behind a load balancer with SSL termination - Enjoy J Workflow and Operationalization Elastic Container Service GPU instance Container Registry Auto-scaling Load Balancer Container HTTPS request “Loved this book” HTTPS response { “prediction” : { “book”: 0.99 } } Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 62. Advanced use-cases for Convolutions and NLP Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 63. CNN + LSTM: Spatially and Temporally Deep Neural Networks - CNN for feature extraction - LSTM for temporal representation Applications: - Video (CNN for frames, LSTM to combine them temporally) - Text tasks - Audio (Language detection) Source: Combining CNN and RNN for spoken language detection Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 64. Advanced use-case: Speech Generation WaveNet Source: DeepMind Wavenet generative model raw audio Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 65. WaveNet: Dilated Causal Convolution Source: DeepMind Wavenet generative model raw audio Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 66. WaveNet: Dilated Causal Convolution Source: DeepMind Wavenet generative model raw audio Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil
  • 67. Summary § Learned about convolutions § Applied them to textual data § Studied the crepe architecture from Zhang et al. in details § Learned about advanced use cases and operationalization Learn Data Science, Vancouver – Deep Learning and NLP - CNNs and NLP - Thomas Delteil - github.com/thomasdelteil - linkedin.com/in/thomasdelteil