SlideShare ist ein Scribd-Unternehmen logo
1 von 31
Downloaden Sie, um offline zu lesen
Copyright © 2017 MathWorks, Inc 1
Girish Venkataramani, Avinash Nehemiah
May 2017
Deep Learning and Vision Algorithm
Development in MATLAB Targeting
Embedded GPUs
Copyright © 2017 MathWorks, Inc 2
Design Deep
Learning & Vision
Algorithms
Talk Outline
High Performance
Embedded
Implementation
Highlights
• Manage large image sets
• Automate image labeling
• Easy access to models
• Pre-built training
frameworks
Highlights
• Automate compilation of
MATLAB to CUDA
• 14x faster than pyCaffe
60% faster than C++ Caffe
3x faster than TensorFlow
Accelerate and Scale
Training
Highlights
• Acceleration with GPUs
• Scale to clusters
Copyright © 2017 MathWorks, Inc 3
Let’s Use Object Detection as an Example
TRUCK
SUV
CAR
In our example we’ll use deep learning for object detection.
Copyright © 2017 MathWorks, Inc 5
Transfer Learning Workflow
Transfer Learning
Images
New
Classifier
Learn New
Weights
Modify
Network
Structure
Load
Reference
NetworkLabels
Training Data
Labels: Car, Truck,
Large Truck, SUV, Van
Alexnet, VGG-16,
VGG-19, GoogLeNet
Copyright © 2017 MathWorks, Inc 6
Manage Large Sets of Images
Transfer Learning
Images
New
Classifier
Learn New
Weights
Modify
Network
Structure
Load
Reference
NetworkLabels
Handle Large Sets of Images
Easily manage large sets of images
- Single line of code to access images
- Operates on disk, database, big-data file system
imageData = imageDataStore(‘vehicles’)
Easily manage large sets of images
- Single line of code to access images
- Operates on disk, database, big-data file system
Organize Images in Folders
(~ 10,000 images , 5 folders)
Copyright © 2017 MathWorks, Inc 7
Automate Ground Truth Labeling
Transfer Learning
Images
New
Classifier
Learn New
Weights
Modify
Network
Structure
Load
Reference
NetworkLabels
Ground Truth Labeling
Copyright © 2017 MathWorks, Inc 8
Automate Ground Truth Labeling
Automate Ground Truth Labeling
Copyright © 2017 MathWorks, Inc 9
Access Reference Models in MATLAB
Transfer Learning
Images
New
Classifier
Learn New
Weights
Modify
Network
Structure
Load
Reference
NetworkLabels
Easily Load Reference Networks
Access Models with 1-line of MATLAB Code
Net1 = alexnet
Net2 = vgg16
Net3 = vgg19
Copyright © 2017 MathWorks, Inc 10
Access Reference Models in MATLAB
Easily manage large sets of images
- Single line of code to access images
- Operates on disk, database, big-data file system
1. Reference Models
2. Model Importer
3. Tutorials
Copyright © 2017 MathWorks, Inc 11
Modify Network Structure
Transfer Learning
Images
New
Classifier
Learn New
Weights
Modify
Network
Structure
Load
Reference
NetworkLabels
Simple MATLAB API to modify layers:
layers(23) = fullyConnectedLayer(5, 'Name','fc8');
layers(25) = classificationLayer('Name',‘VehicleClassifier')
Copyright © 2017 MathWorks, Inc 12
Training Object Detectors
Transfer Learning
Images
New
Classifier
Learn New
Weights
Modify
Network
Structure
Load
Reference
NetworkLabels
Train Any Network
trainNetwork(datastore, layers, options)
Pre-built Frameworks for Computer Vision
• Deep Learning: R-CNN, Fast R-CNN, Faster R-CNN
• Machine Learning: ACF, Cascade Object Detectors
Copyright © 2017 MathWorks, Inc 13
Visualizing and Debugging Intermediate Results
Filters
…
Activations
Deep Dream
Training Accuracy
Visualization
Deep Dream
Layer Activations Feature Visualization
• Many options for visualizations and debugging
• Examples to get started
Copyright © 2017 MathWorks, Inc 14
Real World Systems Use More Than
Deep Learning
Deep learning vehicle detector performance degraded with environmental effects (fog, etc. )
Fog Removal
Challenge: Deep learning frameworks do not include “classical” computer vision
Solution: Convert MATLAB code with deep learning and computer vision to embedded implementation
Copyright © 2017 MathWorks, Inc 15
Talk Outline
Design Deep
Learning & Vision
Algorithms
High Performance
Embedded
Implementation
Accelerate and Scale
Training
Can you solve “real” problems for production
systems with MATLAB?
Copyright © 2017 MathWorks, Inc 16
Single code change
trainingOptions(‘sgdm’,…
‘ExecutionEnvironment’,’CPU’)
Accelerate and Scale Computing
Multi-core CPU
‘ExecutionEnvironment’,’GPU’)
GPU
‘ExecutionEnvironment’,’multi-GPU’) Multiple
GPU
‘ExecutionEnvironment’,’parallel’)
Cluster/
Cloud
Copyright © 2017 MathWorks, Inc 17
After Many Iterations to Find The Best Model
Copyright © 2017 MathWorks, Inc 18
Talk Outline
Design Deep
Learning & Vision
Algorithms
High Performance
Embedded
Implementation
Accelerate and Scale
Training
Can you create high performance implementation
from MATLAB code ?
Copyright © 2017 MathWorks, Inc 19
Presenting the MATLAB to CUDA parallelizing compiler
Why?
• Alexnet inference using MATLAB solution is
• ~14x faster than pyCaffe and 50% faster than C++-Caffe
• ~ 4x faster and ~3x less memory-use than TensorFlow
Copyright © 2017 MathWorks, Inc 20
Sample Generated CUDA Code
MATLAB source code Auto-generated CUDA code
Copyright © 2017 MathWorks, Inc 21
MATLAB to CUDA compiler flow
Control-flow graph
Intermediate representation
(CFG – IR)
Front-end
Parallel loop creation
Library function mapping
CUDA kernel creation
cudaMemcpy
minimization
Shared memory synthesis
CUDA code emission
….
Traditional compiler
optimizations
….
(×) cublas-gemm
() cuSolver calls
fft cuFFT calls
nnet cuDNN calls
Library function mapping
Parallel loop creation
Identify loop-nests that will
become CUDA kernels
…
.
CUDA kernel creation
Convert loop to CUDA kernel
Thread/blocks inferred from loop dims
cudaMemcpy
minimization
Shared memory synthesis
Perform Use-def analysis.
cudaMalloc GPU vars, insert memcpy
Infer data locality. Map to shared
memory. Synthesize shared memory
access
CUDA kernel
optimizations
Copyright © 2017 MathWorks, Inc 22
MATLAB to CUDA compiler:
Creating large parallel loops!
Control-flow graph
Intermediate representation
(CFG – IR)
Front-end
Scalarization
Loop perfectization
Loop interchange
Loop fusion
Scalar replacement
Library function mapping
CUDA code emission
….
Traditional compiler
optimizations
…
.
Loop
optimizations
Scalarization
Loop fusion
Scalar replacement
Parallel loop creation
CUDA kernel creation
cudaMemcpy
minimization
Shared memory synthesis
CUDA kernel
optimizations
Copyright © 2017 MathWorks, Inc 23
MATLAB to CUDA compiler:
Creating large parallel loops!
Control-flow graph
Intermediate representation
(CFG – IR)
Front-end
Scalarization
Loop perfectization
Loop interchange
Loop fusion
Scalar replacement
Library function mapping
CUDA code emission
….
Traditional compiler
optimizations
…
.
Loop
optimizations
2 kernels (size N), 20*N bytes
1 kernel (size N), 16*N bytes
Scalarization
Loop fusion
Scalar replacement
Parallel loop creation
CUDA kernel creation
cudaMemcpy
minimization
Shared memory synthesis
CUDA kernel
optimizations
Copyright © 2017 MathWorks, Inc 24
cudaMemcpy minimization
A(:) = ….
C(:) = ….
for i = 1:N
….
gB = kernel1(gA);
gA = kernel2(gB);
if (some_condition)
gC = kernel3(gA, gB);
end
….
end
…. = C;
cudaMemcpy
*definitely* needed
cudaMemcpy
*not* needed
cudaMemcpy
*may be* needed
Observations
• Equivalent to Partial redundancy elimination (PRE)
• Dynamic strategy – track memory location with a
status flag per variable
• Use-Def to determine where to insert memcpy
A(:) = …
A_isDirtyOnCpu = true;
…
for i = 1:N
if (A_isDirtyOnCpu)
cudaMemcpy(gA, A);
A_isDirtyOnCpu = false;
end
gB = kernel1(gA);
gA = kernel2(gB);
if (somecondition)
gC = kernel3(gA, gB);
C_isDirtyOnGpu = true;
end
…
end
…
if (C_isDirtyOnGpu)
cudaMemcpy(C, gC);
C_isDirtyOnGpu = false;
end
… = C;
Assume gA, gB and gC are mapped to GPU memory Generated (pseudo) code
Copyright © 2017 MathWorks, Inc 25
Example: Compiling fog-rectification algorithm
Copyright © 2017 MathWorks, Inc 26
MATLAB to CUDA compilation of computer vision
applications
Distance
transform
Fog removal
SURF feature
extraction
Ray tracing
Stereo disparity
Copyright © 2017 MathWorks, Inc 27
Deep learning prediction performance: Alexnet
Framerate(Fps)
Batch Size
CPU Intel(R) Xeon(R) CPU E5-1650 v3 @ 3.50 GHz
GPU Tesla K40c
0
200
400
600
800
1000
1200
1400
1 16 32 64
Py-Caffe
TensorFlow
Copyright © 2017 MathWorks, Inc 28
Deep learning prediction performance: Alexnet
0
1
2
3
4
5
6
7
8
9
CPU resident memory GPU peak memory (nvidia-smi)
Memoryusage(GB)
Batch Size
1 16 32 64
CPU Intel(R) Xeon(R) CPU E5-1650 v3 @ 3.50 GHz
GPU Tesla K40c
Py-Caffe
MATLABtoCUDAcompiler
TensorFlow
MATLABonCPU+GPU
C++-Caffe
Copyright © 2017 MathWorks, Inc 29
Deep learning prediction performance: Alexnet
Jetson (Tegra) TX1
0
50
100
150
200
250
1 16 32 64 128
Framerate(Fps)
Batch Size
C++-Caffe
MATLAB to CUDA
compiler
Copyright © 2017 MathWorks, Inc 30
Create CNNs with MATLAB,
Deploy with MATLAB to CUDA compiler
Alexnet YOLO
People detection Lane detection
~20 Fps (K40c)
~30 Fps
(Tegra X1)
~66 Fps
(Tegra X1)
(K40c)
Copyright © 2017 MathWorks, Inc 31
Conclusions
Design Deep
Learning & Vision
Algorithm
Accelerate and Scale
Training
Deep learning design is
easy in MATLAB
Managing datasets and
scaling up training is easy
in MATLAB
MATLAB to CUDA compiler
10x – 14x faster than pyCaffe
1.3x – 4x faster than TensorFlow
1.07 – 1.6x faster than C++ Caffe
High Performance
Embedded
Implementation
Copyright © 2017 MathWorks, Inc 32
What next?
www.mathworks.com/matlab-cuda-beta
MATLAB to CUDA compiler:
Sign up for our beta program
Try deep learning in MATLAB
Visit our booth and see our demos
Booth #: 808

Weitere ähnliche Inhalte

Was ist angesagt?

Backend.AI Technical Introduction (19.09 / 2019 Autumn)
Backend.AI Technical Introduction (19.09 / 2019 Autumn)Backend.AI Technical Introduction (19.09 / 2019 Autumn)
Backend.AI Technical Introduction (19.09 / 2019 Autumn)Lablup Inc.
 
JMI Techtalk: 한재근 - How to use GPU for developing AI
JMI Techtalk: 한재근 - How to use GPU for developing AIJMI Techtalk: 한재근 - How to use GPU for developing AI
JMI Techtalk: 한재근 - How to use GPU for developing AILablup Inc.
 
ML6 talk at Nexxworks Bootcamp
ML6 talk at Nexxworks BootcampML6 talk at Nexxworks Bootcamp
ML6 talk at Nexxworks BootcampKarel Dumon
 
NVIDIA深度學習教育機構 (DLI): Object detection with jetson
NVIDIA深度學習教育機構 (DLI): Object detection with jetsonNVIDIA深度學習教育機構 (DLI): Object detection with jetson
NVIDIA深度學習教育機構 (DLI): Object detection with jetsonNVIDIA Taiwan
 
"Deep Learning Beyond Cats and Cars: Developing a Real-life DNN-based Embedde...
"Deep Learning Beyond Cats and Cars: Developing a Real-life DNN-based Embedde..."Deep Learning Beyond Cats and Cars: Developing a Real-life DNN-based Embedde...
"Deep Learning Beyond Cats and Cars: Developing a Real-life DNN-based Embedde...Edge AI and Vision Alliance
 
Nexxworks bootcamp ML6 (27/09/2017)
Nexxworks bootcamp ML6 (27/09/2017)Nexxworks bootcamp ML6 (27/09/2017)
Nexxworks bootcamp ML6 (27/09/2017)Karel Dumon
 
Machine learning at scale with Google Cloud Platform
Machine learning at scale with Google Cloud PlatformMachine learning at scale with Google Cloud Platform
Machine learning at scale with Google Cloud PlatformMatthias Feys
 
SigOpt at GTC - Tuning the Untunable
SigOpt at GTC - Tuning the UntunableSigOpt at GTC - Tuning the Untunable
SigOpt at GTC - Tuning the UntunableSigOpt
 
"Blending Cloud and Edge Machine Learning to Deliver Real-time Video Monitori...
"Blending Cloud and Edge Machine Learning to Deliver Real-time Video Monitori..."Blending Cloud and Edge Machine Learning to Deliver Real-time Video Monitori...
"Blending Cloud and Edge Machine Learning to Deliver Real-time Video Monitori...Edge AI and Vision Alliance
 
Time-Evolving Graph Processing On Commodity Clusters
Time-Evolving Graph Processing On Commodity ClustersTime-Evolving Graph Processing On Commodity Clusters
Time-Evolving Graph Processing On Commodity ClustersJen Aman
 
"Deploying Deep Learning Models on Embedded Processors for Autonomous Systems...
"Deploying Deep Learning Models on Embedded Processors for Autonomous Systems..."Deploying Deep Learning Models on Embedded Processors for Autonomous Systems...
"Deploying Deep Learning Models on Embedded Processors for Autonomous Systems...Edge AI and Vision Alliance
 
Accumulo and the Convergence of Machine Learning, Big Data, and Supercomputing
Accumulo and the Convergence of Machine Learning, Big Data, and SupercomputingAccumulo and the Convergence of Machine Learning, Big Data, and Supercomputing
Accumulo and the Convergence of Machine Learning, Big Data, and SupercomputingAccumulo Summit
 
Hardware Acceleration of SVM Training for Real-time Embedded Systems: An Over...
Hardware Acceleration of SVM Training for Real-time Embedded Systems: An Over...Hardware Acceleration of SVM Training for Real-time Embedded Systems: An Over...
Hardware Acceleration of SVM Training for Real-time Embedded Systems: An Over...Ilham Amezzane
 
"Approaches for Energy Efficient Implementation of Deep Neural Networks," a P...
"Approaches for Energy Efficient Implementation of Deep Neural Networks," a P..."Approaches for Energy Efficient Implementation of Deep Neural Networks," a P...
"Approaches for Energy Efficient Implementation of Deep Neural Networks," a P...Edge AI and Vision Alliance
 
"Implementing the TensorFlow Deep Learning Framework on Qualcomm’s Low-power ...
"Implementing the TensorFlow Deep Learning Framework on Qualcomm’s Low-power ..."Implementing the TensorFlow Deep Learning Framework on Qualcomm’s Low-power ...
"Implementing the TensorFlow Deep Learning Framework on Qualcomm’s Low-power ...Edge AI and Vision Alliance
 
Introduction to multi gpu deep learning with DIGITS 2 - Mike Wang
Introduction to multi gpu deep learning with DIGITS 2 - Mike WangIntroduction to multi gpu deep learning with DIGITS 2 - Mike Wang
Introduction to multi gpu deep learning with DIGITS 2 - Mike WangPAPIs.io
 
TinyML as-a-Service
TinyML as-a-ServiceTinyML as-a-Service
TinyML as-a-ServiceHiroshi Doyu
 
From Hours to Minutes: The Journey of Optimizing Mask-RCNN and BERT Using MXNet
From Hours to Minutes: The Journey of Optimizing Mask-RCNN and BERT Using MXNetFrom Hours to Minutes: The Journey of Optimizing Mask-RCNN and BERT Using MXNet
From Hours to Minutes: The Journey of Optimizing Mask-RCNN and BERT Using MXNetEric Haibin Lin
 
GTC Taiwan 2017 企業端深度學習與人工智慧應用
GTC Taiwan 2017 企業端深度學習與人工智慧應用GTC Taiwan 2017 企業端深度學習與人工智慧應用
GTC Taiwan 2017 企業端深度學習與人工智慧應用NVIDIA Taiwan
 

Was ist angesagt? (20)

Backend.AI Technical Introduction (19.09 / 2019 Autumn)
Backend.AI Technical Introduction (19.09 / 2019 Autumn)Backend.AI Technical Introduction (19.09 / 2019 Autumn)
Backend.AI Technical Introduction (19.09 / 2019 Autumn)
 
JMI Techtalk: 한재근 - How to use GPU for developing AI
JMI Techtalk: 한재근 - How to use GPU for developing AIJMI Techtalk: 한재근 - How to use GPU for developing AI
JMI Techtalk: 한재근 - How to use GPU for developing AI
 
ML6 talk at Nexxworks Bootcamp
ML6 talk at Nexxworks BootcampML6 talk at Nexxworks Bootcamp
ML6 talk at Nexxworks Bootcamp
 
NVIDIA深度學習教育機構 (DLI): Object detection with jetson
NVIDIA深度學習教育機構 (DLI): Object detection with jetsonNVIDIA深度學習教育機構 (DLI): Object detection with jetson
NVIDIA深度學習教育機構 (DLI): Object detection with jetson
 
"Deep Learning Beyond Cats and Cars: Developing a Real-life DNN-based Embedde...
"Deep Learning Beyond Cats and Cars: Developing a Real-life DNN-based Embedde..."Deep Learning Beyond Cats and Cars: Developing a Real-life DNN-based Embedde...
"Deep Learning Beyond Cats and Cars: Developing a Real-life DNN-based Embedde...
 
Nexxworks bootcamp ML6 (27/09/2017)
Nexxworks bootcamp ML6 (27/09/2017)Nexxworks bootcamp ML6 (27/09/2017)
Nexxworks bootcamp ML6 (27/09/2017)
 
Machine learning at scale with Google Cloud Platform
Machine learning at scale with Google Cloud PlatformMachine learning at scale with Google Cloud Platform
Machine learning at scale with Google Cloud Platform
 
SigOpt at GTC - Tuning the Untunable
SigOpt at GTC - Tuning the UntunableSigOpt at GTC - Tuning the Untunable
SigOpt at GTC - Tuning the Untunable
 
Hadoop + GPU
Hadoop + GPUHadoop + GPU
Hadoop + GPU
 
"Blending Cloud and Edge Machine Learning to Deliver Real-time Video Monitori...
"Blending Cloud and Edge Machine Learning to Deliver Real-time Video Monitori..."Blending Cloud and Edge Machine Learning to Deliver Real-time Video Monitori...
"Blending Cloud and Edge Machine Learning to Deliver Real-time Video Monitori...
 
Time-Evolving Graph Processing On Commodity Clusters
Time-Evolving Graph Processing On Commodity ClustersTime-Evolving Graph Processing On Commodity Clusters
Time-Evolving Graph Processing On Commodity Clusters
 
"Deploying Deep Learning Models on Embedded Processors for Autonomous Systems...
"Deploying Deep Learning Models on Embedded Processors for Autonomous Systems..."Deploying Deep Learning Models on Embedded Processors for Autonomous Systems...
"Deploying Deep Learning Models on Embedded Processors for Autonomous Systems...
 
Accumulo and the Convergence of Machine Learning, Big Data, and Supercomputing
Accumulo and the Convergence of Machine Learning, Big Data, and SupercomputingAccumulo and the Convergence of Machine Learning, Big Data, and Supercomputing
Accumulo and the Convergence of Machine Learning, Big Data, and Supercomputing
 
Hardware Acceleration of SVM Training for Real-time Embedded Systems: An Over...
Hardware Acceleration of SVM Training for Real-time Embedded Systems: An Over...Hardware Acceleration of SVM Training for Real-time Embedded Systems: An Over...
Hardware Acceleration of SVM Training for Real-time Embedded Systems: An Over...
 
"Approaches for Energy Efficient Implementation of Deep Neural Networks," a P...
"Approaches for Energy Efficient Implementation of Deep Neural Networks," a P..."Approaches for Energy Efficient Implementation of Deep Neural Networks," a P...
"Approaches for Energy Efficient Implementation of Deep Neural Networks," a P...
 
"Implementing the TensorFlow Deep Learning Framework on Qualcomm’s Low-power ...
"Implementing the TensorFlow Deep Learning Framework on Qualcomm’s Low-power ..."Implementing the TensorFlow Deep Learning Framework on Qualcomm’s Low-power ...
"Implementing the TensorFlow Deep Learning Framework on Qualcomm’s Low-power ...
 
Introduction to multi gpu deep learning with DIGITS 2 - Mike Wang
Introduction to multi gpu deep learning with DIGITS 2 - Mike WangIntroduction to multi gpu deep learning with DIGITS 2 - Mike Wang
Introduction to multi gpu deep learning with DIGITS 2 - Mike Wang
 
TinyML as-a-Service
TinyML as-a-ServiceTinyML as-a-Service
TinyML as-a-Service
 
From Hours to Minutes: The Journey of Optimizing Mask-RCNN and BERT Using MXNet
From Hours to Minutes: The Journey of Optimizing Mask-RCNN and BERT Using MXNetFrom Hours to Minutes: The Journey of Optimizing Mask-RCNN and BERT Using MXNet
From Hours to Minutes: The Journey of Optimizing Mask-RCNN and BERT Using MXNet
 
GTC Taiwan 2017 企業端深度學習與人工智慧應用
GTC Taiwan 2017 企業端深度學習與人工智慧應用GTC Taiwan 2017 企業端深度學習與人工智慧應用
GTC Taiwan 2017 企業端深度學習與人工智慧應用
 

Andere mochten auch

Introduction to Digital Image Processing Using MATLAB
Introduction to Digital Image Processing Using MATLABIntroduction to Digital Image Processing Using MATLAB
Introduction to Digital Image Processing Using MATLABRay Phan
 
Image Processing Basics
Image Processing BasicsImage Processing Basics
Image Processing BasicsNam Le
 
8085 Paper Presentation slides,ppt,microprocessor 8085 ,guide, instruction set
8085 Paper Presentation slides,ppt,microprocessor 8085 ,guide, instruction set8085 Paper Presentation slides,ppt,microprocessor 8085 ,guide, instruction set
8085 Paper Presentation slides,ppt,microprocessor 8085 ,guide, instruction setSaumitra Rukmangad
 
8085 microprocessor architecture ppt
8085 microprocessor architecture ppt8085 microprocessor architecture ppt
8085 microprocessor architecture pptParvesh Gautam
 
Digital Image Processing
Digital Image ProcessingDigital Image Processing
Digital Image ProcessingSahil Biswas
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine LearningRahul Jain
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine LearningLior Rokach
 
Introduction to Big Data/Machine Learning
Introduction to Big Data/Machine LearningIntroduction to Big Data/Machine Learning
Introduction to Big Data/Machine LearningLars Marius Garshol
 

Andere mochten auch (9)

Introduction to Digital Image Processing Using MATLAB
Introduction to Digital Image Processing Using MATLABIntroduction to Digital Image Processing Using MATLAB
Introduction to Digital Image Processing Using MATLAB
 
Image Processing Basics
Image Processing BasicsImage Processing Basics
Image Processing Basics
 
8085 Paper Presentation slides,ppt,microprocessor 8085 ,guide, instruction set
8085 Paper Presentation slides,ppt,microprocessor 8085 ,guide, instruction set8085 Paper Presentation slides,ppt,microprocessor 8085 ,guide, instruction set
8085 Paper Presentation slides,ppt,microprocessor 8085 ,guide, instruction set
 
8085 microprocessor architecture ppt
8085 microprocessor architecture ppt8085 microprocessor architecture ppt
8085 microprocessor architecture ppt
 
Digital Image Processing
Digital Image ProcessingDigital Image Processing
Digital Image Processing
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine Learning
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine Learning
 
Introduction to Big Data/Machine Learning
Introduction to Big Data/Machine LearningIntroduction to Big Data/Machine Learning
Introduction to Big Data/Machine Learning
 
IoT architecture
IoT architectureIoT architecture
IoT architecture
 

Ähnlich wie "Deep Learning and Vision Algorithm Development in MATLAB Targeting Embedded GPUs," a Presentation from MathWorks

A Gentle Introduction to GPU Computing by Armen Donigian
A Gentle Introduction to GPU Computing by Armen DonigianA Gentle Introduction to GPU Computing by Armen Donigian
A Gentle Introduction to GPU Computing by Armen DonigianData Con LA
 
Accelerated Machine Learning with RAPIDS and MLflow, Nvidia/RAPIDS
Accelerated Machine Learning with RAPIDS and MLflow, Nvidia/RAPIDSAccelerated Machine Learning with RAPIDS and MLflow, Nvidia/RAPIDS
Accelerated Machine Learning with RAPIDS and MLflow, Nvidia/RAPIDSDatabricks
 
AI On the Edge: Model Compression
AI On the Edge: Model CompressionAI On the Edge: Model Compression
AI On the Edge: Model CompressionApache MXNet
 
ICGIS 2018 - Cloud-powered Machine Learnings on Geospactial Services (Channy ...
ICGIS 2018 - Cloud-powered Machine Learnings on Geospactial Services (Channy ...ICGIS 2018 - Cloud-powered Machine Learnings on Geospactial Services (Channy ...
ICGIS 2018 - Cloud-powered Machine Learnings on Geospactial Services (Channy ...Channy Yun
 
OpenACC Monthly Highlights: October2020
OpenACC Monthly Highlights: October2020OpenACC Monthly Highlights: October2020
OpenACC Monthly Highlights: October2020OpenACC
 
Heterogeneous Data Mining with Spark
Heterogeneous Data Mining with SparkHeterogeneous Data Mining with Spark
Heterogeneous Data Mining with SparkKNIMESlides
 
RAPIDS cuGraph – Accelerating all your Graph needs
RAPIDS cuGraph – Accelerating all your Graph needsRAPIDS cuGraph – Accelerating all your Graph needs
RAPIDS cuGraph – Accelerating all your Graph needsConnected Data World
 
Introduction to GPUs for Machine Learning
Introduction to GPUs for Machine LearningIntroduction to GPUs for Machine Learning
Introduction to GPUs for Machine LearningSri Ambati
 
Time series modeling workd AMLD 2018 Lausanne
Time series modeling workd AMLD 2018 LausanneTime series modeling workd AMLD 2018 Lausanne
Time series modeling workd AMLD 2018 LausanneSunil Mallya
 
TensorFlow 16: Building a Data Science Platform
TensorFlow 16: Building a Data Science Platform TensorFlow 16: Building a Data Science Platform
TensorFlow 16: Building a Data Science Platform Seldon
 
Debugging and Performance tricks for MXNet Gluon
Debugging and Performance tricks for MXNet GluonDebugging and Performance tricks for MXNet Gluon
Debugging and Performance tricks for MXNet GluonApache MXNet
 
Getting Cloudy with Remote Graphics and GPU Compute Using G2 instances (CPN21...
Getting Cloudy with Remote Graphics and GPU Compute Using G2 instances (CPN21...Getting Cloudy with Remote Graphics and GPU Compute Using G2 instances (CPN21...
Getting Cloudy with Remote Graphics and GPU Compute Using G2 instances (CPN21...Amazon Web Services
 
Boston hug-2012-07
Boston hug-2012-07Boston hug-2012-07
Boston hug-2012-07Ted Dunning
 
Distributed deep learning optimizations
Distributed deep learning optimizationsDistributed deep learning optimizations
Distributed deep learning optimizationsgeetachauhan
 
Distributed deep learning optimizations for Finance
Distributed deep learning optimizations for FinanceDistributed deep learning optimizations for Finance
Distributed deep learning optimizations for Financegeetachauhan
 
Developing and Deploying Deep Learning Based Computer Vision Systems - Alka N...
Developing and Deploying Deep Learning Based Computer Vision Systems - Alka N...Developing and Deploying Deep Learning Based Computer Vision Systems - Alka N...
Developing and Deploying Deep Learning Based Computer Vision Systems - Alka N...CodeOps Technologies LLP
 
Deep learning and streaming in Apache Spark 2.2 by Matei Zaharia
Deep learning and streaming in Apache Spark 2.2 by Matei ZahariaDeep learning and streaming in Apache Spark 2.2 by Matei Zaharia
Deep learning and streaming in Apache Spark 2.2 by Matei ZahariaGoDataDriven
 
Update on the Mont-Blanc Project for ARM-based HPC
Update on the Mont-Blanc Project for ARM-based HPCUpdate on the Mont-Blanc Project for ARM-based HPC
Update on the Mont-Blanc Project for ARM-based HPCinside-BigData.com
 
"Performing Multiple Perceptual Tasks With a Single Deep Neural Network," a P...
"Performing Multiple Perceptual Tasks With a Single Deep Neural Network," a P..."Performing Multiple Perceptual Tasks With a Single Deep Neural Network," a P...
"Performing Multiple Perceptual Tasks With a Single Deep Neural Network," a P...Edge AI and Vision Alliance
 

Ähnlich wie "Deep Learning and Vision Algorithm Development in MATLAB Targeting Embedded GPUs," a Presentation from MathWorks (20)

A Gentle Introduction to GPU Computing by Armen Donigian
A Gentle Introduction to GPU Computing by Armen DonigianA Gentle Introduction to GPU Computing by Armen Donigian
A Gentle Introduction to GPU Computing by Armen Donigian
 
Accelerated Machine Learning with RAPIDS and MLflow, Nvidia/RAPIDS
Accelerated Machine Learning with RAPIDS and MLflow, Nvidia/RAPIDSAccelerated Machine Learning with RAPIDS and MLflow, Nvidia/RAPIDS
Accelerated Machine Learning with RAPIDS and MLflow, Nvidia/RAPIDS
 
AI On the Edge: Model Compression
AI On the Edge: Model CompressionAI On the Edge: Model Compression
AI On the Edge: Model Compression
 
ICGIS 2018 - Cloud-powered Machine Learnings on Geospactial Services (Channy ...
ICGIS 2018 - Cloud-powered Machine Learnings on Geospactial Services (Channy ...ICGIS 2018 - Cloud-powered Machine Learnings on Geospactial Services (Channy ...
ICGIS 2018 - Cloud-powered Machine Learnings on Geospactial Services (Channy ...
 
OpenACC Monthly Highlights: October2020
OpenACC Monthly Highlights: October2020OpenACC Monthly Highlights: October2020
OpenACC Monthly Highlights: October2020
 
Heterogeneous Data Mining with Spark
Heterogeneous Data Mining with SparkHeterogeneous Data Mining with Spark
Heterogeneous Data Mining with Spark
 
RAPIDS cuGraph – Accelerating all your Graph needs
RAPIDS cuGraph – Accelerating all your Graph needsRAPIDS cuGraph – Accelerating all your Graph needs
RAPIDS cuGraph – Accelerating all your Graph needs
 
Introduction to GPUs for Machine Learning
Introduction to GPUs for Machine LearningIntroduction to GPUs for Machine Learning
Introduction to GPUs for Machine Learning
 
Big Data Analytics With MATLAB
Big Data Analytics With MATLABBig Data Analytics With MATLAB
Big Data Analytics With MATLAB
 
Time series modeling workd AMLD 2018 Lausanne
Time series modeling workd AMLD 2018 LausanneTime series modeling workd AMLD 2018 Lausanne
Time series modeling workd AMLD 2018 Lausanne
 
TensorFlow 16: Building a Data Science Platform
TensorFlow 16: Building a Data Science Platform TensorFlow 16: Building a Data Science Platform
TensorFlow 16: Building a Data Science Platform
 
Debugging and Performance tricks for MXNet Gluon
Debugging and Performance tricks for MXNet GluonDebugging and Performance tricks for MXNet Gluon
Debugging and Performance tricks for MXNet Gluon
 
Getting Cloudy with Remote Graphics and GPU Compute Using G2 instances (CPN21...
Getting Cloudy with Remote Graphics and GPU Compute Using G2 instances (CPN21...Getting Cloudy with Remote Graphics and GPU Compute Using G2 instances (CPN21...
Getting Cloudy with Remote Graphics and GPU Compute Using G2 instances (CPN21...
 
Boston hug-2012-07
Boston hug-2012-07Boston hug-2012-07
Boston hug-2012-07
 
Distributed deep learning optimizations
Distributed deep learning optimizationsDistributed deep learning optimizations
Distributed deep learning optimizations
 
Distributed deep learning optimizations for Finance
Distributed deep learning optimizations for FinanceDistributed deep learning optimizations for Finance
Distributed deep learning optimizations for Finance
 
Developing and Deploying Deep Learning Based Computer Vision Systems - Alka N...
Developing and Deploying Deep Learning Based Computer Vision Systems - Alka N...Developing and Deploying Deep Learning Based Computer Vision Systems - Alka N...
Developing and Deploying Deep Learning Based Computer Vision Systems - Alka N...
 
Deep learning and streaming in Apache Spark 2.2 by Matei Zaharia
Deep learning and streaming in Apache Spark 2.2 by Matei ZahariaDeep learning and streaming in Apache Spark 2.2 by Matei Zaharia
Deep learning and streaming in Apache Spark 2.2 by Matei Zaharia
 
Update on the Mont-Blanc Project for ARM-based HPC
Update on the Mont-Blanc Project for ARM-based HPCUpdate on the Mont-Blanc Project for ARM-based HPC
Update on the Mont-Blanc Project for ARM-based HPC
 
"Performing Multiple Perceptual Tasks With a Single Deep Neural Network," a P...
"Performing Multiple Perceptual Tasks With a Single Deep Neural Network," a P..."Performing Multiple Perceptual Tasks With a Single Deep Neural Network," a P...
"Performing Multiple Perceptual Tasks With a Single Deep Neural Network," a P...
 

Mehr von Edge AI and Vision Alliance

“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...Edge AI and Vision Alliance
 
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...Edge AI and Vision Alliance
 
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...Edge AI and Vision Alliance
 
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...Edge AI and Vision Alliance
 
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...Edge AI and Vision Alliance
 
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...Edge AI and Vision Alliance
 
“Vision-language Representations for Robotics,” a Presentation from the Unive...
“Vision-language Representations for Robotics,” a Presentation from the Unive...“Vision-language Representations for Robotics,” a Presentation from the Unive...
“Vision-language Representations for Robotics,” a Presentation from the Unive...Edge AI and Vision Alliance
 
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsightsEdge AI and Vision Alliance
 
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...Edge AI and Vision Alliance
 
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...Edge AI and Vision Alliance
 
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...Edge AI and Vision Alliance
 
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...Edge AI and Vision Alliance
 
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...Edge AI and Vision Alliance
 
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...Edge AI and Vision Alliance
 
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...Edge AI and Vision Alliance
 
“Updating the Edge ML Development Process,” a Presentation from Samsara
“Updating the Edge ML Development Process,” a Presentation from Samsara“Updating the Edge ML Development Process,” a Presentation from Samsara
“Updating the Edge ML Development Process,” a Presentation from SamsaraEdge AI and Vision Alliance
 
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...Edge AI and Vision Alliance
 
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...Edge AI and Vision Alliance
 
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...Edge AI and Vision Alliance
 
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...Edge AI and Vision Alliance
 

Mehr von Edge AI and Vision Alliance (20)

“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
 
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
 
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
 
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
 
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
 
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
 
“Vision-language Representations for Robotics,” a Presentation from the Unive...
“Vision-language Representations for Robotics,” a Presentation from the Unive...“Vision-language Representations for Robotics,” a Presentation from the Unive...
“Vision-language Representations for Robotics,” a Presentation from the Unive...
 
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
 
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
 
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
 
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
 
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
 
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
 
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
 
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
 
“Updating the Edge ML Development Process,” a Presentation from Samsara
“Updating the Edge ML Development Process,” a Presentation from Samsara“Updating the Edge ML Development Process,” a Presentation from Samsara
“Updating the Edge ML Development Process,” a Presentation from Samsara
 
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
 
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
 
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
 
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
 

Kürzlich hochgeladen

[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdfSandro Moreira
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Jeffrey Haguewood
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWERMadyBayot
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...apidays
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024The Digital Insurer
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Victor Rentea
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...Zilliz
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistandanishmna97
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxRustici Software
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...apidays
 
Cyberprint. Dark Pink Apt Group [EN].pdf
Cyberprint. Dark Pink Apt Group [EN].pdfCyberprint. Dark Pink Apt Group [EN].pdf
Cyberprint. Dark Pink Apt Group [EN].pdfOverkill Security
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Zilliz
 

Kürzlich hochgeladen (20)

[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
Cyberprint. Dark Pink Apt Group [EN].pdf
Cyberprint. Dark Pink Apt Group [EN].pdfCyberprint. Dark Pink Apt Group [EN].pdf
Cyberprint. Dark Pink Apt Group [EN].pdf
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 

"Deep Learning and Vision Algorithm Development in MATLAB Targeting Embedded GPUs," a Presentation from MathWorks

  • 1. Copyright © 2017 MathWorks, Inc 1 Girish Venkataramani, Avinash Nehemiah May 2017 Deep Learning and Vision Algorithm Development in MATLAB Targeting Embedded GPUs
  • 2. Copyright © 2017 MathWorks, Inc 2 Design Deep Learning & Vision Algorithms Talk Outline High Performance Embedded Implementation Highlights • Manage large image sets • Automate image labeling • Easy access to models • Pre-built training frameworks Highlights • Automate compilation of MATLAB to CUDA • 14x faster than pyCaffe 60% faster than C++ Caffe 3x faster than TensorFlow Accelerate and Scale Training Highlights • Acceleration with GPUs • Scale to clusters
  • 3. Copyright © 2017 MathWorks, Inc 3 Let’s Use Object Detection as an Example TRUCK SUV CAR In our example we’ll use deep learning for object detection.
  • 4. Copyright © 2017 MathWorks, Inc 5 Transfer Learning Workflow Transfer Learning Images New Classifier Learn New Weights Modify Network Structure Load Reference NetworkLabels Training Data Labels: Car, Truck, Large Truck, SUV, Van Alexnet, VGG-16, VGG-19, GoogLeNet
  • 5. Copyright © 2017 MathWorks, Inc 6 Manage Large Sets of Images Transfer Learning Images New Classifier Learn New Weights Modify Network Structure Load Reference NetworkLabels Handle Large Sets of Images Easily manage large sets of images - Single line of code to access images - Operates on disk, database, big-data file system imageData = imageDataStore(‘vehicles’) Easily manage large sets of images - Single line of code to access images - Operates on disk, database, big-data file system Organize Images in Folders (~ 10,000 images , 5 folders)
  • 6. Copyright © 2017 MathWorks, Inc 7 Automate Ground Truth Labeling Transfer Learning Images New Classifier Learn New Weights Modify Network Structure Load Reference NetworkLabels Ground Truth Labeling
  • 7. Copyright © 2017 MathWorks, Inc 8 Automate Ground Truth Labeling Automate Ground Truth Labeling
  • 8. Copyright © 2017 MathWorks, Inc 9 Access Reference Models in MATLAB Transfer Learning Images New Classifier Learn New Weights Modify Network Structure Load Reference NetworkLabels Easily Load Reference Networks Access Models with 1-line of MATLAB Code Net1 = alexnet Net2 = vgg16 Net3 = vgg19
  • 9. Copyright © 2017 MathWorks, Inc 10 Access Reference Models in MATLAB Easily manage large sets of images - Single line of code to access images - Operates on disk, database, big-data file system 1. Reference Models 2. Model Importer 3. Tutorials
  • 10. Copyright © 2017 MathWorks, Inc 11 Modify Network Structure Transfer Learning Images New Classifier Learn New Weights Modify Network Structure Load Reference NetworkLabels Simple MATLAB API to modify layers: layers(23) = fullyConnectedLayer(5, 'Name','fc8'); layers(25) = classificationLayer('Name',‘VehicleClassifier')
  • 11. Copyright © 2017 MathWorks, Inc 12 Training Object Detectors Transfer Learning Images New Classifier Learn New Weights Modify Network Structure Load Reference NetworkLabels Train Any Network trainNetwork(datastore, layers, options) Pre-built Frameworks for Computer Vision • Deep Learning: R-CNN, Fast R-CNN, Faster R-CNN • Machine Learning: ACF, Cascade Object Detectors
  • 12. Copyright © 2017 MathWorks, Inc 13 Visualizing and Debugging Intermediate Results Filters … Activations Deep Dream Training Accuracy Visualization Deep Dream Layer Activations Feature Visualization • Many options for visualizations and debugging • Examples to get started
  • 13. Copyright © 2017 MathWorks, Inc 14 Real World Systems Use More Than Deep Learning Deep learning vehicle detector performance degraded with environmental effects (fog, etc. ) Fog Removal Challenge: Deep learning frameworks do not include “classical” computer vision Solution: Convert MATLAB code with deep learning and computer vision to embedded implementation
  • 14. Copyright © 2017 MathWorks, Inc 15 Talk Outline Design Deep Learning & Vision Algorithms High Performance Embedded Implementation Accelerate and Scale Training Can you solve “real” problems for production systems with MATLAB?
  • 15. Copyright © 2017 MathWorks, Inc 16 Single code change trainingOptions(‘sgdm’,… ‘ExecutionEnvironment’,’CPU’) Accelerate and Scale Computing Multi-core CPU ‘ExecutionEnvironment’,’GPU’) GPU ‘ExecutionEnvironment’,’multi-GPU’) Multiple GPU ‘ExecutionEnvironment’,’parallel’) Cluster/ Cloud
  • 16. Copyright © 2017 MathWorks, Inc 17 After Many Iterations to Find The Best Model
  • 17. Copyright © 2017 MathWorks, Inc 18 Talk Outline Design Deep Learning & Vision Algorithms High Performance Embedded Implementation Accelerate and Scale Training Can you create high performance implementation from MATLAB code ?
  • 18. Copyright © 2017 MathWorks, Inc 19 Presenting the MATLAB to CUDA parallelizing compiler Why? • Alexnet inference using MATLAB solution is • ~14x faster than pyCaffe and 50% faster than C++-Caffe • ~ 4x faster and ~3x less memory-use than TensorFlow
  • 19. Copyright © 2017 MathWorks, Inc 20 Sample Generated CUDA Code MATLAB source code Auto-generated CUDA code
  • 20. Copyright © 2017 MathWorks, Inc 21 MATLAB to CUDA compiler flow Control-flow graph Intermediate representation (CFG – IR) Front-end Parallel loop creation Library function mapping CUDA kernel creation cudaMemcpy minimization Shared memory synthesis CUDA code emission …. Traditional compiler optimizations …. (×) cublas-gemm () cuSolver calls fft cuFFT calls nnet cuDNN calls Library function mapping Parallel loop creation Identify loop-nests that will become CUDA kernels … . CUDA kernel creation Convert loop to CUDA kernel Thread/blocks inferred from loop dims cudaMemcpy minimization Shared memory synthesis Perform Use-def analysis. cudaMalloc GPU vars, insert memcpy Infer data locality. Map to shared memory. Synthesize shared memory access CUDA kernel optimizations
  • 21. Copyright © 2017 MathWorks, Inc 22 MATLAB to CUDA compiler: Creating large parallel loops! Control-flow graph Intermediate representation (CFG – IR) Front-end Scalarization Loop perfectization Loop interchange Loop fusion Scalar replacement Library function mapping CUDA code emission …. Traditional compiler optimizations … . Loop optimizations Scalarization Loop fusion Scalar replacement Parallel loop creation CUDA kernel creation cudaMemcpy minimization Shared memory synthesis CUDA kernel optimizations
  • 22. Copyright © 2017 MathWorks, Inc 23 MATLAB to CUDA compiler: Creating large parallel loops! Control-flow graph Intermediate representation (CFG – IR) Front-end Scalarization Loop perfectization Loop interchange Loop fusion Scalar replacement Library function mapping CUDA code emission …. Traditional compiler optimizations … . Loop optimizations 2 kernels (size N), 20*N bytes 1 kernel (size N), 16*N bytes Scalarization Loop fusion Scalar replacement Parallel loop creation CUDA kernel creation cudaMemcpy minimization Shared memory synthesis CUDA kernel optimizations
  • 23. Copyright © 2017 MathWorks, Inc 24 cudaMemcpy minimization A(:) = …. C(:) = …. for i = 1:N …. gB = kernel1(gA); gA = kernel2(gB); if (some_condition) gC = kernel3(gA, gB); end …. end …. = C; cudaMemcpy *definitely* needed cudaMemcpy *not* needed cudaMemcpy *may be* needed Observations • Equivalent to Partial redundancy elimination (PRE) • Dynamic strategy – track memory location with a status flag per variable • Use-Def to determine where to insert memcpy A(:) = … A_isDirtyOnCpu = true; … for i = 1:N if (A_isDirtyOnCpu) cudaMemcpy(gA, A); A_isDirtyOnCpu = false; end gB = kernel1(gA); gA = kernel2(gB); if (somecondition) gC = kernel3(gA, gB); C_isDirtyOnGpu = true; end … end … if (C_isDirtyOnGpu) cudaMemcpy(C, gC); C_isDirtyOnGpu = false; end … = C; Assume gA, gB and gC are mapped to GPU memory Generated (pseudo) code
  • 24. Copyright © 2017 MathWorks, Inc 25 Example: Compiling fog-rectification algorithm
  • 25. Copyright © 2017 MathWorks, Inc 26 MATLAB to CUDA compilation of computer vision applications Distance transform Fog removal SURF feature extraction Ray tracing Stereo disparity
  • 26. Copyright © 2017 MathWorks, Inc 27 Deep learning prediction performance: Alexnet Framerate(Fps) Batch Size CPU Intel(R) Xeon(R) CPU E5-1650 v3 @ 3.50 GHz GPU Tesla K40c 0 200 400 600 800 1000 1200 1400 1 16 32 64 Py-Caffe TensorFlow
  • 27. Copyright © 2017 MathWorks, Inc 28 Deep learning prediction performance: Alexnet 0 1 2 3 4 5 6 7 8 9 CPU resident memory GPU peak memory (nvidia-smi) Memoryusage(GB) Batch Size 1 16 32 64 CPU Intel(R) Xeon(R) CPU E5-1650 v3 @ 3.50 GHz GPU Tesla K40c Py-Caffe MATLABtoCUDAcompiler TensorFlow MATLABonCPU+GPU C++-Caffe
  • 28. Copyright © 2017 MathWorks, Inc 29 Deep learning prediction performance: Alexnet Jetson (Tegra) TX1 0 50 100 150 200 250 1 16 32 64 128 Framerate(Fps) Batch Size C++-Caffe MATLAB to CUDA compiler
  • 29. Copyright © 2017 MathWorks, Inc 30 Create CNNs with MATLAB, Deploy with MATLAB to CUDA compiler Alexnet YOLO People detection Lane detection ~20 Fps (K40c) ~30 Fps (Tegra X1) ~66 Fps (Tegra X1) (K40c)
  • 30. Copyright © 2017 MathWorks, Inc 31 Conclusions Design Deep Learning & Vision Algorithm Accelerate and Scale Training Deep learning design is easy in MATLAB Managing datasets and scaling up training is easy in MATLAB MATLAB to CUDA compiler 10x – 14x faster than pyCaffe 1.3x – 4x faster than TensorFlow 1.07 – 1.6x faster than C++ Caffe High Performance Embedded Implementation
  • 31. Copyright © 2017 MathWorks, Inc 32 What next? www.mathworks.com/matlab-cuda-beta MATLAB to CUDA compiler: Sign up for our beta program Try deep learning in MATLAB Visit our booth and see our demos Booth #: 808