Metta Innovations - Introduction to Deep Learning Applied to Video Analytics

© Metta Innovations 2017
Jun 2017
An Introduction to Deep Learning
applied to Video Analytics
Aurélio Figueiredo
aurelio@mettainnovations.com.br

Metta Innovations
Great know-how in solutions applied to the Oil & Gas industry:
. Computational Geophysics
. Machine Learning / Deep Learning
. Automatic Seismic Interpretation
. ...
Autoencoder (EAGE 2016*)
*Minimal Similarity Accumulation Attribute Using Dimensionality Reduction with Feature Extraction. In 78th European Association of Geoscientists and Engineers (EAGE) Conference and Exhibition 2016.

Metta Innovations
Expertise:
. Visualization
. Virtual and augmented reality
. High Performance Computing (HPC)
. Computer Vision
. Big Data
. Deep Learning

Agenda
What is Deep Learning?
Why Deep Learning?
Open Source Libraries & Examples and Applications
How to Use CNTK?
Code sample with CNTK technology

What is Deep Learning?
Deep learning (also called deep structured learning or hierarchical learning):
1. A subfield of machine learning inspired by the structure and function of the brain
2. Uses artificial neural networks (ANNs) with many hidden layers
3. Suitable when the target function is very complex and the dataset is very large
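
To make "many hidden layers" concrete, here is a minimal sketch of a forward pass through a small fully connected network in plain Python. The layer sizes, the ReLU activation, and the helper names (`dense`, `init_layer`, `forward`) are illustrative assumptions, not something taken from the slides.

```python
import math
import random

def dense(inputs, weights, biases):
    """One fully connected layer: `weights` has one row per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def relu(values):
    """Element-wise rectified linear activation."""
    return [max(0.0, v) for v in values]

def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (an illustrative choice)."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out

def forward(x, layers):
    """Apply ReLU after every hidden layer; leave the output layer linear."""
    for i, (weights, biases) in enumerate(layers):
        x = dense(x, weights, biases)
        if i < len(layers) - 1:
            x = relu(x)
    return x

rng = random.Random(0)
sizes = [4, 8, 8, 8, 3]  # 4 inputs, three hidden layers of 8 units, 3 outputs
layers = [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]
output = forward([0.5, -1.0, 2.0, 0.1], layers)
print(len(output))  # one value per output unit
```

Adding depth here only means adding entries to `sizes`; the forward pass itself does not change.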

What is Deep Learning?
Deep Learning is the state-of-the-art approach to:
1. Computer Vision
2. Speech Recognition & Natural Language Processing
3. Audio Recognition & Machine Translation
4. Social Network Filtering
5. Bioinformatics
In these fields, deep networks have produced results comparable to, and in some cases superior to, those of human experts.

Why Deep Learning?
Automatically extracting features instead of manual extraction in feature engineering:
1. You don’t have to figure out the features ahead of time
2. We can use the same neural net approach for many different problems
Open source is predominant in deep learning.

Why Deep Learning?
ImageNet Large Scale Visual Recognition Challenge (ILSVRC*):
1. Evaluate algorithms for object detection at large scale
2. Measure progress of computer vision for large scale image
*http://www.image-net.org/challenges/LSVRC/
ImageNet Classification top-5 error (%)
Year  Entry        Top-5 error (%)
2010  NEC America  28.2
2011  Xerox        25.8
2012  AlexNet      16.4
2013  Clarifai     11.7
2014  VGG          7.3
2014  GoogLeNet    6.7
2015  MS ResNet    3.6
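
The top-5 error reported above counts a prediction as correct when the true label appears among the five highest-scoring classes. A minimal sketch of that metric follows; the function name `top5_error` and the toy scores are illustrative assumptions, not part of the ILSVRC tooling.

```python
def top5_error(scores_per_image, true_labels, k=5):
    """Percentage of images whose true label is not among the k best scores."""
    errors = 0
    for scores, label in zip(scores_per_image, true_labels):
        # Class indices sorted from highest to lowest score.
        ranked = sorted(range(len(scores)), key=lambda c: scores[c], reverse=True)
        if label not in ranked[:k]:
            errors += 1
    return 100.0 * errors / len(true_labels)

# Toy example with 6 classes and 2 images.
scores = [
    [0.10, 0.20, 0.30, 0.15, 0.05, 0.20],  # true class 4 has the lowest score
    [0.50, 0.10, 0.10, 0.10, 0.10, 0.10],  # true class 0 has the highest score
]
labels = [4, 0]
error = top5_error(scores, labels)
print(error)  # 50.0
```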

Open Source Libraries

Caffe
Developed by Berkeley Vision and Learning Center
Extended by Facebook (Caffe2)
Mainly focusing on computer vision applications
Platforms:
● Linux
● macOS
● Windows
Supported programming languages:
● Python
● C++
● MATLAB

Microsoft Cognitive Toolkit
Developed by Microsoft Research - Released Jan/2016
Platforms:
● Windows
● Linux
Supported programming languages:
● Python (versions supported are 2.7, 3.4, and 3.5)
● C++
● C#/.NET Managed

MXNet
Developed by the DMLC team (used on Amazon AWS)
Platforms:
● Linux
● macOS
● Windows
● Android/iOS
● AWS
Supported programming languages:
● Python
● C++
● Scala
● R
● Perl
● MATLAB

TensorFlow
Developed by Google - Released Nov/2015
Platforms:
● Linux
● macOS
● Windows
● Android (new!)
Supported programming languages:
● Python
● C/C++
● Java
● Go
● R

Theano
Developed by the University of Montreal
Platforms:
● Linux
● macOS
● Windows
Supported programming languages:
● Python

Torch
Developed by R. Collobert, K. Kavukcuoglu, C. Farabet - Released Oct/2002
Platforms:
● Linux
● Mac OS X
● Windows
● Android
Supported programming languages:
● Lua
● LuaJIT
● C
● C++/OpenCL

Libraries Comparison
Single GPU - Benchmarking State-of-the-Art Deep Learning Software*:
            FCN-8  AlexNet  ResNet-50  LSTM-64
CNTK        0.037  0.040    0.207      0.122
Caffe       0.038  0.026    0.307      ---
TensorFlow  0.063  ---      ---        0.144
Torch       0.048  0.033    0.188      0.194
Seconds per minibatch on G1080 (G980) GPU. Lower is better. November 2016.
*http://dlbench.comp.hkbu.edu.hk

Libraries Comparison
Multiple GPUs - Benchmarking State-of-the-Art Deep Learning Software
[Bar chart: CNTK, Theano, TensorFlow, Torch, and Caffe on 1 GPU, 1 x 4 GPUs, and 2 x 4 GPUs (8 GPUs); vertical axis from 0 to 80000 samples / second]
Speed comparison (samples / second). Higher is better. December 2015.

Using Microsoft Cognitive Toolkit

How to Use CNTK
The Microsoft Cognitive Toolkit (CNTK) 2.0:
1. Arbitrary neural networks are expressed through building blocks
2. Simple blocks are composed into complex computational networks
3. Supports the relevant network types and applications
4. LEGO-like composability allows CNTK to support a wide range of networks and applications
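
The composability idea can be sketched without any framework: each block is a function, and a combinator chains blocks into a larger block that can itself be reused. This is plain Python for illustration only, not the CNTK API; `sequential`, `scale`, and `shift` are hypothetical names introduced here.

```python
def sequential(*blocks):
    """Compose blocks left to right into a single new block."""
    def composed(x):
        for block in blocks:
            x = block(x)
        return x
    return composed

def scale(factor):
    """Block that multiplies every element by a constant."""
    return lambda xs: [factor * x for x in xs]

def shift(offset):
    """Block that adds a constant to every element."""
    return lambda xs: [x + offset for x in xs]

def relu(xs):
    """Block that clips negative values to zero."""
    return [max(0.0, x) for x in xs]

# A small block built from three simple ones...
hidden = sequential(scale(2.0), shift(-1.0), relu)
# ...is itself a block, so it can be stacked again.
model = sequential(hidden, hidden)

result = model([0.0, 1.0, 2.0])
print(result)  # [0.0, 1.0, 5.0]
```

Because a composed block has the same shape as a simple one (a function from values to values), larger networks are built by nesting, which is the LEGO-like property the slide describes.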

How to Use CNTK
The 3 Main Steps to Train Instances
Creating / Training the Network Model (3 steps):
1. Reader - reads and prepares the dataset used for training
2. Network Model - defines and configures the network topology
3. Trainer - chooses the criteria used to train the nodes
Training Data -> Reader -> Network -> Trainer & Evaluator -> Trained Model
Reader:
● Minibatch Source
● Deserializer (task specific)
● Automatic Randomization
● Train / Test Samples
Network:
● Model Function
● Criterion Function
● Engine Config. (GPU/CPU)
● Padding
Trainer & Evaluator:
● Stochastic Gradient Descent (Momentum, Adam, ...)
● Mini batching
● Evaluates the error
How to Use CNTK
Network Model
Model Function:
● Transforms Input Features into Predictions
● Defines Model Structure and Initialization
Criterion Function:
● Compares Output Features with Labels
● Measures Training Loss & Additional Metrics
● Defines Training & Evaluation Criteria
● Provides Gradients according to the Training Criteria
[Diagram: input x passes through the layers (W1, b1), (W2, b2), and (Wout, bout) with activation functions, followed by Softmax; Cross Entropy compares the result with the label y]
How to Use CNTK
Basic autoencoder:
● MNIST handwritten digits data
● Each image 28x28 = 784 pixels
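
As a rough illustration of the shapes involved, the sketch below runs one untrained forward pass of a basic autoencoder: 784 input pixels are compressed to a short code and expanded back to 784 values, and the reconstruction error is measured. It is plain Python, not the CNTK code shown in the talk; the code size of 32, the sigmoid activations, and the random stand-in image are assumptions made for the example.

```python
import math
import random

INPUT_DIM = 28 * 28  # 784 pixels per MNIST image
ENCODING_DIM = 32    # size of the compressed code (assumed for illustration)

def dense(inputs, weights, biases):
    """Fully connected layer: one weight row per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def sigmoid(values):
    """Squash each value into the range (0, 1)."""
    return [1.0 / (1.0 + math.exp(-v)) for v in values]

def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (untrained)."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out

rng = random.Random(0)
encoder = init_layer(INPUT_DIM, ENCODING_DIM, rng)
decoder = init_layer(ENCODING_DIM, INPUT_DIM, rng)

# Stand-in for one normalized image; real MNIST pixels would be loaded here.
image = [rng.random() for _ in range(INPUT_DIM)]

code = sigmoid(dense(image, *encoder))           # 784 -> 32
reconstruction = sigmoid(dense(code, *decoder))  # 32 -> 784

# Mean squared reconstruction error, the quantity training would minimize.
mse = sum((r - p) ** 2 for r, p in zip(reconstruction, image)) / INPUT_DIM
print(len(code), len(reconstruction))  # 32 784
```

The label for each sample is the input image itself, so no annotated data is needed to train this kind of network.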

Call to Action
https://notebooks.azure.com/

Applications & Examples

Microsoft Build 2017 - Workplace Safety

NEC - Face Recognition

Amazon Go

Metta Intelligent Market

NVIDIA DriveNet Demo

Metta Objects Detection

We are hiring
careers@mettainnovations.com.br
Join the team

Contact Us
Aurélio Figueiredo
aurelio@mettainnovations.com.br

Thank you!
