Metta Innovations - Introduction to Deep Learning Applied to Video Analytics
- 1. © Metta Innovations 2017
Jun 2017
An Introduction to Deep Learning
applied to Video Analytics
Aurélio Figueiredo
aurelio@mettainnovations.com.br
- 2. © Metta Innovations 2017
Metta Innovations
Extensive know-how in solutions applied to the Oil & Gas industry:
. Computational Geophysics
. Machine Learning / Deep Learning
. Automatic Seismic Interpretation
. ...
Autoencoder, EAGE 2016*
*Minimal Similarity Accumulation Attribute Using Dimensionality Reduction with Feature Extraction. In 78th European Association of Geoscientists and Engineers (EAGE) Conference and Exhibition, 2016.
- 3. © Metta Innovations 2017
Metta Innovations
Expertise:
. Visualization
. Virtual and augmented reality
. High Performance Computing (HPC)
. Computer Vision
. Big Data
. Deep Learning
- 4. © Metta Innovations 2017
Agenda
What is Deep Learning?
Why Deep Learning?
Open Source Libraries
How to Use CNTK?
Code sample with CNTK
Examples and Applications
- 5. © Metta Innovations 2017
What is Deep Learning?
Deep learning (also called deep structured learning or hierarchical learning):
1. A subfield of machine learning inspired by the structure and function of the brain
2. Uses artificial neural networks (ANNs) with many hidden layers
3. Suitable when the target function is very complex and the datasets are very large
- 6. © Metta Innovations 2017
What is Deep Learning?
Deep Learning is the state-of-the-art approach to:
1. Computer Vision
2. Speech Recognition & Natural Language Processing
3. Audio Recognition & Machine Translation
4. Social Network Filtering
5. Bioinformatics
In these fields, deep learning models have produced results comparable to, and in some cases superior to, those of human experts.
- 7. © Metta Innovations 2017
Why Deep Learning?
Automatic feature extraction replaces manual feature engineering:
1. You don’t have to figure out the features ahead of time
2. You can use the same neural network approach for many different problems
Open source is predominant in deep learning.
- 8. © Metta Innovations 2017
Why Deep Learning?
ImageNet Large Scale Visual Recognition Challenge (ILSVRC*):
1. Evaluate algorithms for object detection at large scale
2. Measure progress of computer vision for large scale image
*http://www.image-net.org/challenges/LSVRC/
ImageNet Classification top-5 error (%)
Year  Entry        Top-5 error (%)
2010  NEC America  28.2
2011  Xerox        25.8
2012  AlexNet      16.4
2013  Clarifai     11.7
2014  VGG           7.3
2014  GoogLeNet     6.7
2015  MS ResNet     3.6
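As a rough illustration of the metric reported above: a prediction counts as correct under top-5 error when the true label is among the five highest-scoring classes. The sketch below is not the official ILSVRC evaluation code; the function name and the toy scores are hypothetical.

```python
def top5_error(scores_per_image, true_labels):
    """Fraction of images whose true label is NOT among the 5 highest-scoring classes."""
    misses = 0
    for scores, label in zip(scores_per_image, true_labels):
        # Class indices sorted by descending score; keep the first five.
        top5 = sorted(range(len(scores)), key=lambda c: scores[c], reverse=True)[:5]
        if label not in top5:
            misses += 1
    return misses / len(true_labels)


# Toy data (hypothetical): 3 images, 8 classes each.
scores = [
    [0.05, 0.30, 0.10, 0.20, 0.15, 0.08, 0.07, 0.05],  # true class 1 has the top score
    [0.25, 0.20, 0.18, 0.15, 0.12, 0.06, 0.03, 0.01],  # true class 4 is ranked fifth
    [0.25, 0.20, 0.18, 0.15, 0.12, 0.06, 0.03, 0.01],  # true class 6 is ranked seventh
]
labels = [1, 4, 6]

error = top5_error(scores, labels)
print(f"top-5 error: {error:.1%}")
```

Only the third image misses, so one of the three toy predictions counts as an error.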
- 10. © Metta Innovations 2017
Caffe
Developed by Berkeley Vision and Learning Center
Extended by Facebook (Caffe2)
Mainly focused on computer vision applications
Platforms:
● Linux
● macOS
● Windows
Supported programming languages:
● Python
● C++
● MATLAB
- 11. © Metta Innovations 2017
Microsoft Cognitive Toolkit
Developed by Microsoft Research - Released Jan/2016
Platforms:
● Windows
● Linux
Supported programming languages:
● Python (versions supported are 2.7, 3.4, and 3.5)
● C++
● C#/.NET Managed
- 12. © Metta Innovations 2017
MXNet
Developed by the DMLC team (used on Amazon AWS)
Platforms:
● Linux
● macOS
● Windows
● Android/iOS
● AWS
Supported programming languages:
● Python
● C++
● Scala
● R
● Perl
● MATLAB
- 13. © Metta Innovations 2017
TensorFlow
Developed by Google - Released Nov/2015
Platforms:
● Linux
● macOS
● Windows
● Android (new!)
Supported programming languages:
● Python
● C/C++
● Java
● Go
● R
- 14. © Metta Innovations 2017
Theano
Developed by the University of Montreal
Platforms:
● Linux
● macOS
● Windows
Supported programming languages:
● Python
- 15. © Metta Innovations 2017
Torch
Developed by R. Collobert, K. Kavukcuoglu, C. Farabet - Released Oct/2002
Platforms:
● Linux
● Mac OS X
● Windows
● Android
Supported programming languages:
● Lua
● LuaJIT
● C
● C++/OpenCL
- 16. © Metta Innovations 2017
Libraries Comparison
Single GPU - Benchmarking State-of-the-Art Deep Learning Software*
            FCN-8  AlexNet  ResNet-50  LSTM-64
CNTK        0.037  0.040    0.207      0.122
Caffe       0.038  0.026    0.307      ---
TensorFlow  0.063  ---      ---        0.144
Torch       0.048  0.033    0.188      0.194
Seconds per minibatch on G1080 (G980) GPU. Lower is better. November 2016.
*Benchmarking State-of-the-Art Deep Learning Software:
http://dlbench.comp.hkbu.edu.hk
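To read the table at a glance, the short sketch below picks the fastest library for each network. The numbers are copied from the table above, entries shown as "---" are treated as not reported, and the helper name is hypothetical.

```python
# Seconds per minibatch, copied from the table above (None = not reported, "---").
seconds_per_minibatch = {
    "CNTK":       {"FCN-8": 0.037, "AlexNet": 0.040, "ResNet-50": 0.207, "LSTM-64": 0.122},
    "Caffe":      {"FCN-8": 0.038, "AlexNet": 0.026, "ResNet-50": 0.307, "LSTM-64": None},
    "TensorFlow": {"FCN-8": 0.063, "AlexNet": None,  "ResNet-50": None,  "LSTM-64": 0.144},
    "Torch":      {"FCN-8": 0.048, "AlexNet": 0.033, "ResNet-50": 0.188, "LSTM-64": 0.194},
}


def fastest_per_network(table):
    """Return {network: (library, seconds)} for the lowest reported time per network."""
    best = {}
    for library, timings in table.items():
        for network, seconds in timings.items():
            if seconds is None:
                continue  # skip entries that were not reported
            if network not in best or seconds < best[network][1]:
                best[network] = (library, seconds)
    return best


fastest = fastest_per_network(seconds_per_minibatch)
for network, (library, seconds) in fastest.items():
    print(f"{network}: {library} ({seconds:.3f} s per minibatch)")
```

No single library wins on every network in this table, which is worth keeping in mind when choosing one for a specific workload.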
- 17. © Metta Innovations 2017
Libraries Comparison
Multiple GPUs - Benchmarking State-of-the-Art Deep Learning Software*
[Bar chart: speed comparison (samples / second) of CNTK, Theano, TensorFlow, Torch and Caffe on 1 GPU, 1 x 4 GPUs and 2 x 4 GPUs (8 GPUs). Higher is better. December 2015.]
*Benchmarking State-of-the-Art Deep Learning Software:
http://dlbench.comp.hkbu.edu.hk
- 19. © Metta Innovations 2017
How to Use CNTK
The Microsoft Cognitive Toolkit (CNTK) 2.0:
1. Arbitrary neural networks are expressed through building blocks
2. Simple blocks are composed into complex computational networks
3. Supports the relevant network types and applications
4. LEGO-like composability allows CNTK to support a wide range of networks and applications
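The composability idea can be sketched in plain Python: each block is a function, and a network is a chain of blocks. This is only an illustration of the concept, not CNTK's API; the names `dense`, `relu` and `sequential` and all weights are hypothetical.

```python
def dense(weights, bias):
    """Block: affine map y = W x + b, with W given as a list of rows."""
    def apply(x):
        return [sum(w * xi for w, xi in zip(row, x)) + b
                for row, b in zip(weights, bias)]
    return apply


def relu(x):
    """Block: element-wise rectified linear activation."""
    return [max(0.0, v) for v in x]


def sequential(*blocks):
    """Compose blocks left to right into a single larger block."""
    def apply(x):
        for block in blocks:
            x = block(x)
        return x
    return apply


# A hidden layer is itself a block, so it can be reused inside a bigger network.
hidden = sequential(dense([[1.0, -1.0], [0.5, 0.5]], [0.0, -1.0]), relu)
model = sequential(hidden, dense([[2.0, 1.0]], [0.1]))

output = model([3.0, 1.0])
print(output)
```

Because `sequential` returns a block of the same kind it accepts, composed networks nest freely, which is the "LEGO-like" property the slide describes.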
- 20. © Metta Innovations 2017
How to Use CNTK
The 3 Main Steps to Train Instances
Creating / Training the Network Model (3 steps):
1. Reader - reads and prepares the dataset used for training
2. Network Model - defines and configures the network topology
3. Trainer - chooses the criteria used to train the nodes
Training Data -> Reader -> Network -> Trainer & Evaluator -> Trained Model
Reader:
● Minibatch Source
● Deserializer (task specific)
● Automatic Randomization
● Train / Test Samples
Network:
● Model Function
● Criterion Function
● Engine Config. (GPU/CPU)
● Padding
Trainer & Evaluator:
● Stochastic Gradient Descent (Momentum, Adam, ...)
● Mini batching
● Evaluates the error
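The three steps can be sketched end to end in plain Python. This is a minimal illustration under stated assumptions, not CNTK code: the model is a single linear unit, the criterion is mean squared error, the trainer is stochastic gradient descent with momentum, and every name and hyperparameter below is hypothetical.

```python
import random


def minibatch_reader(samples, batch_size, epochs, seed=0):
    """Reader: reshuffles the samples every epoch and yields minibatches."""
    rng = random.Random(seed)
    data = list(samples)
    for _ in range(epochs):
        rng.shuffle(data)
        for i in range(0, len(data), batch_size):
            yield data[i:i + batch_size]


def model(params, x):
    """Network model: a single linear unit, y = w * x + b."""
    return params["w"] * x + params["b"]


def criterion(params, batch):
    """Criterion: mean squared error plus its gradients with respect to w and b."""
    n = len(batch)
    residuals = [(model(params, x) - y, x) for x, y in batch]
    loss = sum(r * r for r, _ in residuals) / n
    grads = {
        "w": sum(2.0 * r * x for r, x in residuals) / n,
        "b": sum(2.0 * r for r, _ in residuals) / n,
    }
    return loss, grads


def train(params, reader, learning_rate=0.02, momentum=0.9):
    """Trainer: stochastic gradient descent with momentum over the reader's minibatches."""
    velocity = {name: 0.0 for name in params}
    for batch in reader:
        _, grads = criterion(params, batch)
        for name in params:
            velocity[name] = momentum * velocity[name] - learning_rate * grads[name]
            params[name] += velocity[name]
    return params


# Toy training data drawn from y = 2x + 1 (hypothetical).
samples = [(k / 10.0, 2.0 * k / 10.0 + 1.0) for k in range(-10, 11)]
reader = minibatch_reader(samples, batch_size=7, epochs=300)
params = train({"w": 0.0, "b": 0.0}, reader)
final_loss, _ = criterion(params, samples)
print(params, final_loss)
```

Keeping the reader, the model and the trainer as separate functions mirrors the separation on the slide: any one of them can be swapped (a different data source, topology or optimizer) without touching the other two.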
- 21. © Metta Innovations 2017
How to Use CNTK
Network Model
Model Function:
● Transforms input features into predictions
● Defines the model structure and initialization
Criterion Function:
● Compares output features with labels
● Measures the training loss and additional metrics
● Defines the training and evaluation criteria
● Provides gradients according to the training criteria
[Diagram: the input x passes through the layers (W1, b1), (W2, b2) and (Wout, bout) with activation functions; the output goes through softmax and is compared with the label y using cross entropy.]
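The softmax and cross-entropy criterion named on this slide can be written out in a few lines. The sketch below is plain Python rather than CNTK code; the function names and the example logits are hypothetical. It also returns the gradient with respect to the logits, which for this criterion is the predicted probabilities minus the one-hot label.

```python
import math


def softmax(logits):
    """Turn raw output scores into probabilities that sum to 1."""
    shift = max(logits)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy_with_softmax(logits, label):
    """Return the loss -log p[label] and its gradient with respect to the logits."""
    probs = softmax(logits)
    loss = -math.log(probs[label])
    grad = [p - (1.0 if i == label else 0.0) for i, p in enumerate(probs)]
    return loss, grad


logits = [2.0, 1.0, 0.1]  # hypothetical network outputs for 3 classes
probs = softmax(logits)
loss, grad = cross_entropy_with_softmax(logits, label=0)
print(probs, loss, grad)
```

The gradient entry for the true class is negative and the others are positive, so a gradient descent step raises the score of the correct class and lowers the rest.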
- 24. © Metta Innovations 2017
How to Use CNTK
Create / Train the Network Model:
1. Reader - reads and prepares the dataset used for training
2. Network Model - defines and configures the network topology
3. Trainer - chooses the criteria used to train the nodes
Basic autoencoder:
● MNIST handwritten digits data
● Each image is 28x28 = 784 pixels
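To make the shapes concrete, here is a minimal, untrained autoencoder forward pass in plain Python: 784 input pixels are encoded into a smaller code and decoded back to 784 values, and the reconstruction error is what training would minimize. This is a sketch, not the code from the slides; the code size of 32, the activations, the initialization and the random stand-in image are all assumptions.

```python
import math
import random

INPUT_DIM = 28 * 28  # 784 pixels per MNIST image
CODE_DIM = 32        # size of the compressed code (assumed, not from the slides)


def init_layer(n_in, n_out, rng):
    """Create small random weights (n_out rows of n_in values) and zero biases."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(layer, x, activation):
    """Apply activation(W x + b) for one layer."""
    weights, bias = layer
    return [activation(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, bias)]


def relu(v):
    return max(0.0, v)


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def autoencode(encoder, decoder, pixels):
    """Encode 784 pixels into a short code, then decode back to 784 values."""
    code = dense(encoder, pixels, relu)
    reconstruction = dense(decoder, code, sigmoid)
    return code, reconstruction


def reconstruction_error(pixels, reconstruction):
    """Mean squared error between the input and its reconstruction."""
    return sum((p - r) ** 2 for p, r in zip(pixels, reconstruction)) / len(pixels)


rng = random.Random(0)
encoder = init_layer(INPUT_DIM, CODE_DIM, rng)
decoder = init_layer(CODE_DIM, INPUT_DIM, rng)

# Random stand-in for one digit image with pixel values scaled to [0, 1].
image = [rng.random() for _ in range(INPUT_DIM)]
code, reconstruction = autoencode(encoder, decoder, image)
error = reconstruction_error(image, reconstruction)
print(len(code), len(reconstruction), error)
```

Training would adjust the encoder and decoder weights to reduce this error over many MNIST images, forcing the short code to keep the information needed to redraw each digit.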