Recent Trends in DNN Compression
October 12th, 2018
Kaushalya Madhawa
Murata Laboratory
Tokyo Tech
Back then…
• Size of commonly used DNNs
– AlexNet: 240MB
– VGG 16: 552MB
– Inception V3: 109MB
• Running models on the cloud has its own disadvantages
– Network latency
– Privacy
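To put the listed sizes in perspective, the sketch below converts them to approximate parameter counts. It assumes weights are stored as 32-bit floats and that 1 MB is 10^6 bytes; neither assumption is stated on the slide, so the results are rough estimates only.

```python
# Assumption (not from the slides): each weight is a 32-bit float (4 bytes),
# and 1 MB = 10**6 bytes.
BYTES_PER_PARAM = 4

# Sizes in MB, as listed on the slide.
model_size_mb = {"AlexNet": 240, "VGG 16": 552, "Inception V3": 109}


def approx_params_millions(size_mb, bytes_per_param=BYTES_PER_PARAM):
    """Rough parameter count, in millions, implied by a model file size in MB."""
    # (size_mb * 1e6 bytes) / bytes_per_param / 1e6 simplifies to this ratio.
    return size_mb / bytes_per_param


params = {name: approx_params_millions(mb) for name, mb in model_size_mb.items()}
for name, millions in params.items():
    print(f"{name}: ~{millions:.0f}M parameters")
```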
DNN Compression
• Can we achieve the same accuracy with smaller models?
• There are several approaches to obtain smaller models
– Compressing pre-trained networks
• Deep Compression (Han+, 2016)
– Designing compact models
• SqueezeNet (Iandola+, 2016)
• MobileNets (Howard+, 2017)
Deep Compression (Han+, ICLR 2016)
• One of the first papers to introduce model compression
• Requires specific custom hardware to take advantage of the compression at inference time
• Sparsity doesn’t always translate to reduced inference time
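The last point can be illustrated with a small sketch of unstructured magnitude pruning. It is not the Deep Compression pipeline itself: the helper names, the 32x64 matrix, and the 90% sparsity level are all hypothetical choices for illustration. The pruned matrix keeps its shape, so a dense kernel still performs one multiplication per stored entry.

```python
import random


def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude entries of a 2-D weight list (unstructured pruning)."""
    flat = sorted(abs(w) for row in weights for w in row)
    n_prune = int(len(flat) * sparsity)
    if n_prune == 0:
        return [row[:] for row in weights]
    threshold = flat[n_prune - 1]
    return [[0.0 if abs(w) <= threshold else w for w in row] for row in weights]


def dense_matvec_mults(weights):
    """Multiplications a dense matrix-vector kernel performs: one per stored entry."""
    return sum(len(row) for row in weights)


random.seed(0)
# Hypothetical 32x64 layer with random weights.
W = [[random.gauss(0, 1) for _ in range(64)] for _ in range(32)]
W_pruned = magnitude_prune(W, sparsity=0.9)

n_total = sum(len(row) for row in W_pruned)
n_zero = sum(1 for row in W_pruned for w in row if w == 0.0)
print(f"fraction of zero weights: {n_zero / n_total:.2f}")
print("dense multiplications before pruning:", dense_matvec_mults(W))
print("dense multiplications after pruning: ", dense_matvec_mults(W_pruned))
```

Turning the zeros into a speed-up requires a sparse storage format and a kernel or hardware that skips them, which is the dependency the slide points to.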
Compact Models
• Designing networks with fewer parameters
• SqueezeNet: AlexNet-level accuracy with 50x fewer parameters
• MobileNets: depthwise separable convolutions
Figure: Fire module (SqueezeNet)
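As a rough illustration of why depthwise separable convolutions shrink a model, the sketch below counts the weights in a standard convolution and in its depthwise separable counterpart. The layer shape (3x3 kernel, 128 input channels, 256 output channels) is a hypothetical example, and biases are ignored.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights in a standard k x k convolution (biases ignored)."""
    return kernel * kernel * c_in * c_out


def depthwise_separable_params(kernel, c_in, c_out):
    """Weights in a depthwise k x k convolution followed by a 1x1 pointwise convolution."""
    depthwise = kernel * kernel * c_in  # one k x k filter per input channel
    pointwise = c_in * c_out            # 1x1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer shape, chosen only for illustration.
k, c_in, c_out = 3, 128, 256
standard = standard_conv_params(k, c_in, c_out)
separable = depthwise_separable_params(k, c_in, c_out)
print(f"standard: {standard}, separable: {separable}, ratio: {standard / separable:.2f}x")
```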
Designing compact models by hand requires a lot of expertise and time!
State-of-the-art (SOTA) in 2018
• Mobile devices
– More memory
– Dedicated hardware to run ML models
– Deep learning frameworks
• Models
– Directly optimize models for the resource constraint (e.g., size)
– More focus on latency
– Optimize for multiple objectives
SOTA in 2018: Devices
• Storage: <128MB • Storage: <512MB
• Neural Engine: dedicated hardware for ML algorithms
• CoreML / TF-Lite
SOTA in 2018: Models
• Model compression
– Structured pruning is used to reduce latency
• Designing compact models
– Neural architecture search finds models that satisfy the resource restrictions
– In addition to accuracy, latency or model size is also incorporated into the objective
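A minimal sketch of structured pruning follows, assuming a simple L1-norm criterion for ranking output rows; the slides do not name a criterion, and the helper and the toy matrix are hypothetical. Whole rows are removed, so the result is a smaller dense matrix and the multiplication count actually drops.

```python
def prune_rows_by_l1(weights, keep_ratio):
    """Structured pruning sketch: keep the output rows with the largest L1 norm."""
    n_keep = max(1, int(len(weights) * keep_ratio))
    ranked = sorted(
        range(len(weights)),
        key=lambda i: sum(abs(w) for w in weights[i]),
        reverse=True,
    )
    kept = sorted(ranked[:n_keep])  # preserve the original row order
    return [weights[i][:] for i in kept], kept


# Toy 4x3 weight matrix; each row stands for one output channel.
W = [
    [0.9, -1.2, 0.4],
    [0.01, 0.02, -0.01],
    [-0.7, 0.8, 1.1],
    [0.05, -0.03, 0.0],
]
W_small, kept = prune_rows_by_l1(W, keep_ratio=0.5)
mults_before = sum(len(row) for row in W)
mults_after = sum(len(row) for row in W_small)
print("kept rows:", kept)
print("multiplications per input:", mults_before, "->", mults_after)
```

In a real network, dropping an output channel also means dropping the matching input channel of the next layer; that bookkeeping is omitted here.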
Neural Architecture Search
• Automates the design of neural network models
• NAS (Zoph and Le, 2017): accuracy is used as the reward in a reinforcement learning model
• DPP-Net (Dong+, 2018): a multi-objective architecture search that optimizes for both accuracy and inference time
MnasNet (Tan+, 2018)
• Neural Architecture Search for mobile devices
• Optimized for both accuracy and latency
• Multiple Pareto-optimal solutions are found in a single architecture search
• Latency is measured directly on a mobile phone
• Able to find models that run 1.5x faster than MobileNet v2
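[Figure: search loop. The Controller samples models from the search space; the Trainer reports accuracy and mobile phones report latency; both are combined into a multi-objective reward that is fed back to the Controller.]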
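$$
\underset{m}{\text{maximize}} \quad ACC(m) \times \left[ \frac{LAT(m)}{T} \right]^{w},
\qquad
w = \begin{cases} \alpha, & \text{if } LAT(m) \le T \\ \beta, & \text{otherwise} \end{cases}
$$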
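The objective above can be written as a short function. This is a sketch only: the function name `mnas_reward` is hypothetical, and the accuracy, latency, and exponent values in the usage lines are illustrative assumptions, not numbers taken from the slides.

```python
def mnas_reward(acc, lat, target, alpha, beta):
    """ACC(m) * (LAT(m) / T) ** w, with w = alpha if LAT(m) <= T, else beta."""
    w = alpha if lat <= target else beta
    return acc * (lat / target) ** w


# Illustrative values only (assumptions, not from the slides).
T = 80.0  # target latency in ms

# alpha = 0, beta = -1: no bonus below the target, sharp penalty above it.
on_target = mnas_reward(0.75, 80.0, T, alpha=0.0, beta=-1.0)
too_slow = mnas_reward(0.75, 160.0, T, alpha=0.0, beta=-1.0)

# Equal negative exponents: latency trades off smoothly against accuracy.
slow_soft = mnas_reward(0.75, 160.0, T, alpha=-0.07, beta=-0.07)
fast_soft = mnas_reward(0.75, 40.0, T, alpha=-0.07, beta=-0.07)

print(on_target, too_slow, slow_soft, fast_soft)
```

Choosing different exponent pairs changes how strictly the latency target is enforced, so the same search can be steered toward different accuracy and latency trade-offs.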
MnasNet
Model Name        Model Size   Top-1 Accuracy   Top-5 Accuracy   TF Lite Performance
MnasNet_0.50_224  8.5 Mb       68.03%           87.79%           37 ms
MnasNet_0.75_224  12 Mb        71.72%           90.17%           61 ms
MnasNet_1.3_224   24 Mb        75.24%           92.55%           152 ms
SqueezeNet        5.0 Mb       49.0%            72.9%            224 ms
ResNet_V2_101     178.3 Mb     76.8%            93.6%            1880 ms
Inception_V3      95.3 Mb      77.9%            93.8%            1433 ms
Source: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/lite/g3doc/models.md
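One way to read the table is to ask which models are Pareto-optimal in top-1 accuracy versus TF Lite latency, meaning no other listed model is at least as accurate and at least as fast. The sketch below applies that check to the table's numbers; the `dominates` helper is a hypothetical name, and model size is left out of the comparison for simplicity.

```python
# (name, top-1 accuracy in %, TF Lite latency in ms), copied from the table.
models = [
    ("MnasNet_0.50_224", 68.03, 37),
    ("MnasNet_0.75_224", 71.72, 61),
    ("MnasNet_1.3_224", 75.24, 152),
    ("SqueezeNet", 49.0, 224),
    ("ResNet_V2_101", 76.8, 1880),
    ("Inception_V3", 77.9, 1433),
]


def dominates(a, b):
    """True if a is at least as good as b on both axes and strictly better on one."""
    _, acc_a, lat_a = a
    _, acc_b, lat_b = b
    return acc_a >= acc_b and lat_a <= lat_b and (acc_a > acc_b or lat_a < lat_b)


# A model is on the Pareto front if no other model dominates it.
pareto_front = [m[0] for m in models if not any(dominates(o, m) for o in models)]
print(pareto_front)
```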
Summary
• Mobile devices have become more capable of running DNN models
• Unstructured pruning is out of fashion
• Accuracy and platform-dependent restrictions are incorporated into multi-objective model search
References
• Dong, Jin-Dong, et al. "DPP-Net: Device-aware Progressive Search for Pareto-optimal Neural Architectures." arXiv preprint arXiv:1806.08198 (2018).
• Han, Song, Huizi Mao, and William J. Dally. "Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding." arXiv preprint arXiv:1510.00149 (2015).
• Tan, Mingxing, et al. "MnasNet: Platform-Aware Neural Architecture Search for Mobile." arXiv preprint arXiv:1807.11626 (2018).
• Zoph, Barret, and Quoc V. Le. "Neural Architecture Search with Reinforcement Learning." arXiv preprint arXiv:1611.01578 (2016).

Weitere ähnliche Inhalte

Was ist angesagt?

Andy Davidson Automation Presentation from UKNOF 31
Andy Davidson Automation Presentation from UKNOF 31Andy Davidson Automation Presentation from UKNOF 31
Andy Davidson Automation Presentation from UKNOF 31Nicole White
 
Distributed DNN training: Infrastructure, challenges, and lessons learned
Distributed DNN training: Infrastructure, challenges, and lessons learnedDistributed DNN training: Infrastructure, challenges, and lessons learned
Distributed DNN training: Infrastructure, challenges, and lessons learnedWee Hyong Tok
 
Discover the OVH Dedicated Cloud Webinar
Discover the OVH Dedicated Cloud WebinarDiscover the OVH Dedicated Cloud Webinar
Discover the OVH Dedicated Cloud WebinarOVHcloud
 
The Fabric of the Future
The Fabric of the FutureThe Fabric of the Future
The Fabric of the FutureUniFabric
 
Running JVM in Docker
Running JVM in DockerRunning JVM in Docker
Running JVM in DockerUri Savelchev
 
Desktop Private Cloud
Desktop Private CloudDesktop Private Cloud
Desktop Private CloudPaul Morse
 
Deep Learning Computer Build
Deep Learning Computer BuildDeep Learning Computer Build
Deep Learning Computer BuildPetteriTeikariPhD
 
An Introduction to Deep Learning (May 2018)
An Introduction to Deep Learning (May 2018)An Introduction to Deep Learning (May 2018)
An Introduction to Deep Learning (May 2018)Julien SIMON
 
Taking High Performance Computing to the Cloud: Windows HPC and
Taking High Performance Computing to the Cloud: Windows HPC and Taking High Performance Computing to the Cloud: Windows HPC and
Taking High Performance Computing to the Cloud: Windows HPC and Saptak Sen
 
Deep Dive on Deep Learning (June 2018)
Deep Dive on Deep Learning (June 2018)Deep Dive on Deep Learning (June 2018)
Deep Dive on Deep Learning (June 2018)Julien SIMON
 
Adam Dagnall: Advanced S3 compatible storage integration in CloudStack
Adam Dagnall: Advanced S3 compatible storage integration in CloudStackAdam Dagnall: Advanced S3 compatible storage integration in CloudStack
Adam Dagnall: Advanced S3 compatible storage integration in CloudStackShapeBlue
 
Getting it Right: OpenStack Private Cloud Storage
Getting it Right: OpenStack Private Cloud StorageGetting it Right: OpenStack Private Cloud Storage
Getting it Right: OpenStack Private Cloud StorageNetApp
 
Gain Storage Control with SIOC and Take Performance Control with QoS from Sol...
Gain Storage Control with SIOC and Take Performance Control with QoS from Sol...Gain Storage Control with SIOC and Take Performance Control with QoS from Sol...
Gain Storage Control with SIOC and Take Performance Control with QoS from Sol...NetApp
 
(CMP303) ResearchCloud: CfnCluster and Internet2 for Enterprise HPC
(CMP303) ResearchCloud: CfnCluster and Internet2 for Enterprise HPC(CMP303) ResearchCloud: CfnCluster and Internet2 for Enterprise HPC
(CMP303) ResearchCloud: CfnCluster and Internet2 for Enterprise HPCAmazon Web Services
 
MySQL: Scale Through Consolidation Webinar
MySQL: Scale Through Consolidation Webinar MySQL: Scale Through Consolidation Webinar
MySQL: Scale Through Consolidation Webinar NetApp
 
HybridAzureCloud
HybridAzureCloudHybridAzureCloud
HybridAzureCloudChris Condo
 
ITLC Ha Noi : Openstack From Atlanta to Ha Noi - Storage
ITLC Ha Noi : Openstack From Atlanta to Ha Noi - Storage   ITLC Ha Noi : Openstack From Atlanta to Ha Noi - Storage
ITLC Ha Noi : Openstack From Atlanta to Ha Noi - Storage Lê Văn Duy
 
Virtualization & Global CyberSoft
Virtualization & Global CyberSoftVirtualization & Global CyberSoft
Virtualization & Global CyberSoftHieu Le Trung
 
"Lessons Learned from Bringing Mobile and Embedded Vision Products to Market,...
"Lessons Learned from Bringing Mobile and Embedded Vision Products to Market,..."Lessons Learned from Bringing Mobile and Embedded Vision Products to Market,...
"Lessons Learned from Bringing Mobile and Embedded Vision Products to Market,...Edge AI and Vision Alliance
 
2016 NDC - 모바일 게임 서버 엔진 개발 후기
2016 NDC - 모바일 게임 서버 엔진 개발 후기2016 NDC - 모바일 게임 서버 엔진 개발 후기
2016 NDC - 모바일 게임 서버 엔진 개발 후기iFunFactory Inc.
 

Was ist angesagt? (20)

Andy Davidson Automation Presentation from UKNOF 31
Andy Davidson Automation Presentation from UKNOF 31Andy Davidson Automation Presentation from UKNOF 31
Andy Davidson Automation Presentation from UKNOF 31
 
Distributed DNN training: Infrastructure, challenges, and lessons learned
Distributed DNN training: Infrastructure, challenges, and lessons learnedDistributed DNN training: Infrastructure, challenges, and lessons learned
Distributed DNN training: Infrastructure, challenges, and lessons learned
 
Discover the OVH Dedicated Cloud Webinar
Discover the OVH Dedicated Cloud WebinarDiscover the OVH Dedicated Cloud Webinar
Discover the OVH Dedicated Cloud Webinar
 
The Fabric of the Future
The Fabric of the FutureThe Fabric of the Future
The Fabric of the Future
 
Running JVM in Docker
Running JVM in DockerRunning JVM in Docker
Running JVM in Docker
 
Desktop Private Cloud
Desktop Private CloudDesktop Private Cloud
Desktop Private Cloud
 
Deep Learning Computer Build
Deep Learning Computer BuildDeep Learning Computer Build
Deep Learning Computer Build
 
An Introduction to Deep Learning (May 2018)
An Introduction to Deep Learning (May 2018)An Introduction to Deep Learning (May 2018)
An Introduction to Deep Learning (May 2018)
 
Taking High Performance Computing to the Cloud: Windows HPC and
Taking High Performance Computing to the Cloud: Windows HPC and Taking High Performance Computing to the Cloud: Windows HPC and
Taking High Performance Computing to the Cloud: Windows HPC and
 
Deep Dive on Deep Learning (June 2018)
Deep Dive on Deep Learning (June 2018)Deep Dive on Deep Learning (June 2018)
Deep Dive on Deep Learning (June 2018)
 
Adam Dagnall: Advanced S3 compatible storage integration in CloudStack
Adam Dagnall: Advanced S3 compatible storage integration in CloudStackAdam Dagnall: Advanced S3 compatible storage integration in CloudStack
Adam Dagnall: Advanced S3 compatible storage integration in CloudStack
 
Getting it Right: OpenStack Private Cloud Storage
Getting it Right: OpenStack Private Cloud StorageGetting it Right: OpenStack Private Cloud Storage
Getting it Right: OpenStack Private Cloud Storage
 
Gain Storage Control with SIOC and Take Performance Control with QoS from Sol...
Gain Storage Control with SIOC and Take Performance Control with QoS from Sol...Gain Storage Control with SIOC and Take Performance Control with QoS from Sol...
Gain Storage Control with SIOC and Take Performance Control with QoS from Sol...
 
(CMP303) ResearchCloud: CfnCluster and Internet2 for Enterprise HPC
(CMP303) ResearchCloud: CfnCluster and Internet2 for Enterprise HPC(CMP303) ResearchCloud: CfnCluster and Internet2 for Enterprise HPC
(CMP303) ResearchCloud: CfnCluster and Internet2 for Enterprise HPC
 
MySQL: Scale Through Consolidation Webinar
MySQL: Scale Through Consolidation Webinar MySQL: Scale Through Consolidation Webinar
MySQL: Scale Through Consolidation Webinar
 
HybridAzureCloud
HybridAzureCloudHybridAzureCloud
HybridAzureCloud
 
ITLC Ha Noi : Openstack From Atlanta to Ha Noi - Storage
ITLC Ha Noi : Openstack From Atlanta to Ha Noi - Storage   ITLC Ha Noi : Openstack From Atlanta to Ha Noi - Storage
ITLC Ha Noi : Openstack From Atlanta to Ha Noi - Storage
 
Virtualization & Global CyberSoft
Virtualization & Global CyberSoftVirtualization & Global CyberSoft
Virtualization & Global CyberSoft
 
"Lessons Learned from Bringing Mobile and Embedded Vision Products to Market,...
"Lessons Learned from Bringing Mobile and Embedded Vision Products to Market,..."Lessons Learned from Bringing Mobile and Embedded Vision Products to Market,...
"Lessons Learned from Bringing Mobile and Embedded Vision Products to Market,...
 
2016 NDC - 모바일 게임 서버 엔진 개발 후기
2016 NDC - 모바일 게임 서버 엔진 개발 후기2016 NDC - 모바일 게임 서버 엔진 개발 후기
2016 NDC - 모바일 게임 서버 엔진 개발 후기
 

Ähnlich wie Trends in DNN compression

Big Memory for HPC
Big Memory for HPCBig Memory for HPC
Big Memory for HPCMemVerge
 
“A Practical Guide to Implementing ML on Embedded Devices,” a Presentation fr...
“A Practical Guide to Implementing ML on Embedded Devices,” a Presentation fr...“A Practical Guide to Implementing ML on Embedded Devices,” a Presentation fr...
“A Practical Guide to Implementing ML on Embedded Devices,” a Presentation fr...Edge AI and Vision Alliance
 
HKG18-312 - CMSIS-NN
HKG18-312 - CMSIS-NNHKG18-312 - CMSIS-NN
HKG18-312 - CMSIS-NNLinaro
 
Red hat Storage Day LA - Designing Ceph Clusters Using Intel-Based Hardware
Red hat Storage Day LA - Designing Ceph Clusters Using Intel-Based HardwareRed hat Storage Day LA - Designing Ceph Clusters Using Intel-Based Hardware
Red hat Storage Day LA - Designing Ceph Clusters Using Intel-Based HardwareRed_Hat_Storage
 
Choosing the right parallel compute architecture
Choosing the right parallel compute architecture Choosing the right parallel compute architecture
Choosing the right parallel compute architecture corehard_by
 
Deep learning on mobile
Deep learning on mobileDeep learning on mobile
Deep learning on mobileAnirudh Koul
 
Accelerating Deep Learning Inference 
on Mobile Systems
Accelerating Deep Learning Inference 
on Mobile SystemsAccelerating Deep Learning Inference 
on Mobile Systems
Accelerating Deep Learning Inference 
on Mobile SystemsDarian Frajberg
 
Netflix oss season 2 episode 1 - meetup Lightning talks
Netflix oss   season 2 episode 1 - meetup Lightning talksNetflix oss   season 2 episode 1 - meetup Lightning talks
Netflix oss season 2 episode 1 - meetup Lightning talksRuslan Meshenberg
 
Hyper-Convergence: Worth the Hype?
Hyper-Convergence: Worth the Hype?Hyper-Convergence: Worth the Hype?
Hyper-Convergence: Worth the Hype?Brian Anderson
 
Ceph Day Shanghai - SSD/NVM Technology Boosting Ceph Performance
Ceph Day Shanghai - SSD/NVM Technology Boosting Ceph Performance Ceph Day Shanghai - SSD/NVM Technology Boosting Ceph Performance
Ceph Day Shanghai - SSD/NVM Technology Boosting Ceph Performance Ceph Community
 
CPLEX Optimization Studio, Modeling, Theory, Best Practices and Case Studies
CPLEX Optimization Studio, Modeling, Theory, Best Practices and Case StudiesCPLEX Optimization Studio, Modeling, Theory, Best Practices and Case Studies
CPLEX Optimization Studio, Modeling, Theory, Best Practices and Case Studiesoptimizatiodirectdirect
 
AWS Sydney Summit 2013 - Big Data Analytics
AWS Sydney Summit 2013 - Big Data AnalyticsAWS Sydney Summit 2013 - Big Data Analytics
AWS Sydney Summit 2013 - Big Data AnalyticsAmazon Web Services
 
Data Lake and the rise of the microservices
Data Lake and the rise of the microservicesData Lake and the rise of the microservices
Data Lake and the rise of the microservicesBigstep
 
Spark and Deep Learning frameworks with distributed workloads
Spark and Deep Learning frameworks with distributed workloadsSpark and Deep Learning frameworks with distributed workloads
Spark and Deep Learning frameworks with distributed workloadsS N
 
SpringPeople - Introduction to Cloud Computing
SpringPeople - Introduction to Cloud ComputingSpringPeople - Introduction to Cloud Computing
SpringPeople - Introduction to Cloud ComputingSpringPeople
 
Lessons Learned Replatforming A Large Machine Learning Application To Apache ...
Lessons Learned Replatforming A Large Machine Learning Application To Apache ...Lessons Learned Replatforming A Large Machine Learning Application To Apache ...
Lessons Learned Replatforming A Large Machine Learning Application To Apache ...Databricks
 
Webinar: OpenEBS - Still Free and now FASTEST Kubernetes storage
Webinar: OpenEBS - Still Free and now FASTEST Kubernetes storageWebinar: OpenEBS - Still Free and now FASTEST Kubernetes storage
Webinar: OpenEBS - Still Free and now FASTEST Kubernetes storageMayaData Inc
 

Ähnlich wie Trends in DNN compression (20)

Big Memory for HPC
Big Memory for HPCBig Memory for HPC
Big Memory for HPC
 
Large scalecplex
Large scalecplexLarge scalecplex
Large scalecplex
 
“A Practical Guide to Implementing ML on Embedded Devices,” a Presentation fr...
“A Practical Guide to Implementing ML on Embedded Devices,” a Presentation fr...“A Practical Guide to Implementing ML on Embedded Devices,” a Presentation fr...
“A Practical Guide to Implementing ML on Embedded Devices,” a Presentation fr...
 
HKG18-312 - CMSIS-NN
HKG18-312 - CMSIS-NNHKG18-312 - CMSIS-NN
HKG18-312 - CMSIS-NN
 
Red hat Storage Day LA - Designing Ceph Clusters Using Intel-Based Hardware
Red hat Storage Day LA - Designing Ceph Clusters Using Intel-Based HardwareRed hat Storage Day LA - Designing Ceph Clusters Using Intel-Based Hardware
Red hat Storage Day LA - Designing Ceph Clusters Using Intel-Based Hardware
 
Choosing the right parallel compute architecture
Choosing the right parallel compute architecture Choosing the right parallel compute architecture
Choosing the right parallel compute architecture
 
Deep learning on mobile
Deep learning on mobileDeep learning on mobile
Deep learning on mobile
 
Accelerating Deep Learning Inference 
on Mobile Systems
Accelerating Deep Learning Inference 
on Mobile SystemsAccelerating Deep Learning Inference 
on Mobile Systems
Accelerating Deep Learning Inference 
on Mobile Systems
 
Netflix oss season 2 episode 1 - meetup Lightning talks
Netflix oss   season 2 episode 1 - meetup Lightning talksNetflix oss   season 2 episode 1 - meetup Lightning talks
Netflix oss season 2 episode 1 - meetup Lightning talks
 
Hyper-Convergence: Worth the Hype?
Hyper-Convergence: Worth the Hype?Hyper-Convergence: Worth the Hype?
Hyper-Convergence: Worth the Hype?
 
Ceph Day Shanghai - SSD/NVM Technology Boosting Ceph Performance
Ceph Day Shanghai - SSD/NVM Technology Boosting Ceph Performance Ceph Day Shanghai - SSD/NVM Technology Boosting Ceph Performance
Ceph Day Shanghai - SSD/NVM Technology Boosting Ceph Performance
 
CPLEX Optimization Studio, Modeling, Theory, Best Practices and Case Studies
CPLEX Optimization Studio, Modeling, Theory, Best Practices and Case StudiesCPLEX Optimization Studio, Modeling, Theory, Best Practices and Case Studies
CPLEX Optimization Studio, Modeling, Theory, Best Practices and Case Studies
 
AWS Sydney Summit 2013 - Big Data Analytics
AWS Sydney Summit 2013 - Big Data AnalyticsAWS Sydney Summit 2013 - Big Data Analytics
AWS Sydney Summit 2013 - Big Data Analytics
 
Data Lake and the rise of the microservices
Data Lake and the rise of the microservicesData Lake and the rise of the microservices
Data Lake and the rise of the microservices
 
Spark and Deep Learning frameworks with distributed workloads
Spark and Deep Learning frameworks with distributed workloadsSpark and Deep Learning frameworks with distributed workloads
Spark and Deep Learning frameworks with distributed workloads
 
SpringPeople - Introduction to Cloud Computing
SpringPeople - Introduction to Cloud ComputingSpringPeople - Introduction to Cloud Computing
SpringPeople - Introduction to Cloud Computing
 
Lessons Learned Replatforming A Large Machine Learning Application To Apache ...
Lessons Learned Replatforming A Large Machine Learning Application To Apache ...Lessons Learned Replatforming A Large Machine Learning Application To Apache ...
Lessons Learned Replatforming A Large Machine Learning Application To Apache ...
 
Open power ddl and lms
Open power ddl and lmsOpen power ddl and lms
Open power ddl and lms
 
Os Lamothe
Os LamotheOs Lamothe
Os Lamothe
 
Webinar: OpenEBS - Still Free and now FASTEST Kubernetes storage
Webinar: OpenEBS - Still Free and now FASTEST Kubernetes storageWebinar: OpenEBS - Still Free and now FASTEST Kubernetes storage
Webinar: OpenEBS - Still Free and now FASTEST Kubernetes storage
 

Mehr von Kaushalya Madhawa

On the limitations of representing functions on sets
On the limitations of representing functions on setsOn the limitations of representing functions on sets
On the limitations of representing functions on setsKaushalya Madhawa
 
Graphs for Visual Understanding
Graphs for Visual UnderstandingGraphs for Visual Understanding
Graphs for Visual UnderstandingKaushalya Madhawa
 
Robustness of compressed CNNs
Robustness of compressed CNNsRobustness of compressed CNNs
Robustness of compressed CNNsKaushalya Madhawa
 
Pruning convolutional neural networks for resource efficient inference
Pruning convolutional neural networks for resource efficient inferencePruning convolutional neural networks for resource efficient inference
Pruning convolutional neural networks for resource efficient inferenceKaushalya Madhawa
 
ABRA: Approximating Betweenness Centrality in Static and Dynamic Graphs with ...
ABRA: Approximating Betweenness Centrality in Static and Dynamic Graphs with ...ABRA: Approximating Betweenness Centrality in Static and Dynamic Graphs with ...
ABRA: Approximating Betweenness Centrality in Static and Dynamic Graphs with ...Kaushalya Madhawa
 
Opportunities in Higher Education & Career Guidance
Opportunities in Higher Education & Career GuidanceOpportunities in Higher Education & Career Guidance
Opportunities in Higher Education & Career GuidanceKaushalya Madhawa
 
Automatic generation of event summaries using microblog streams
Automatic generation of event summaries using microblog streamsAutomatic generation of event summaries using microblog streams
Automatic generation of event summaries using microblog streamsKaushalya Madhawa
 
Understanding social connections
Understanding social connectionsUnderstanding social connections
Understanding social connectionsKaushalya Madhawa
 
Leveraging mobile network big data for urban planning
Leveraging mobile network big data for urban planningLeveraging mobile network big data for urban planning
Leveraging mobile network big data for urban planningKaushalya Madhawa
 

Mehr von Kaushalya Madhawa (9)

On the limitations of representing functions on sets
On the limitations of representing functions on setsOn the limitations of representing functions on sets
On the limitations of representing functions on sets
 
Graphs for Visual Understanding
Graphs for Visual UnderstandingGraphs for Visual Understanding
Graphs for Visual Understanding
 
Robustness of compressed CNNs
Robustness of compressed CNNsRobustness of compressed CNNs
Robustness of compressed CNNs
 
Pruning convolutional neural networks for resource efficient inference
Pruning convolutional neural networks for resource efficient inferencePruning convolutional neural networks for resource efficient inference
Pruning convolutional neural networks for resource efficient inference
 
ABRA: Approximating Betweenness Centrality in Static and Dynamic Graphs with ...
ABRA: Approximating Betweenness Centrality in Static and Dynamic Graphs with ...ABRA: Approximating Betweenness Centrality in Static and Dynamic Graphs with ...
ABRA: Approximating Betweenness Centrality in Static and Dynamic Graphs with ...
 
Opportunities in Higher Education & Career Guidance
Opportunities in Higher Education & Career GuidanceOpportunities in Higher Education & Career Guidance
Opportunities in Higher Education & Career Guidance
 
Automatic generation of event summaries using microblog streams
Automatic generation of event summaries using microblog streamsAutomatic generation of event summaries using microblog streams
Automatic generation of event summaries using microblog streams
 
Understanding social connections
Understanding social connectionsUnderstanding social connections
Understanding social connections
 
Leveraging mobile network big data for urban planning
Leveraging mobile network big data for urban planningLeveraging mobile network big data for urban planning
Leveraging mobile network big data for urban planning
 

Kürzlich hochgeladen

Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 217djon017
 
Machine learning classification ppt.ppt
Machine learning classification  ppt.pptMachine learning classification  ppt.ppt
Machine learning classification ppt.pptamreenkhanum0307
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...Boston Institute of Analytics
 
RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.natarajan8993
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdfHuman37
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...dajasot375
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPramod Kumar Srivastava
 
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...GQ Research
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfgstagge
 
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...
Statistics, Data Analysis, and Decision Modeling, 5th edition by James R. Eva...ssuserf63bd7
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDRafezzaman
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一F sss
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsVICTOR MAESTRE RAMIREZ
 
Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Seán Kennedy
 
Defining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryDefining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryJeremy Anderson
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Seán Kennedy
 
Heart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectHeart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectBoston Institute of Analytics
 
DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfDBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfJohn Sterrett
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Cathrine Wilhelmsen
 

Kürzlich hochgeladen (20)

Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2
 
Machine learning classification ppt.ppt
Machine learning classification  ppt.pptMachine learning classification  ppt.ppt
Machine learning classification ppt.ppt
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
 
RABBIT: A CLI tool for identifying bots based on their GitHub events.

Trends in DNN compression

  • 1. Recent Trends in DNN Compression. October 12th, 2018. Kaushalya Madhawa, Murata Laboratory, Tokyo Tech
  • 2. Back then… • Size of commonly used DNNs: AlexNet 240 MB, VGG 16 552 MB, Inception V3 109 MB • Running models on the cloud has its own disadvantages: network latency and privacy
  • 3. DNN Compression • Can we achieve the same accuracy with smaller models? • There are several approaches to obtaining smaller models • Compressing pre-trained networks: Deep Compression (Han+, 2016) • Designing compact models: SqueezeNet (Iandola+, 2016), MobileNets (Howard+, 2017)
  • 4. Deep Compression (Han+, ICLR 2016) • One of the first papers to introduce model compression • Requires specific custom hardware to exploit the compression at inference time • Sparsity doesn't always translate to reduced inference time
  • 7. Compact Models • Designing networks with fewer parameters • SqueezeNet: AlexNet-level accuracy with 50x fewer parameters • MobileNets: depthwise separable convolutions • Figure: Fire module (SqueezeNet) • Designing such models by hand requires a lot of expertise and a lot of time!
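The parameter savings behind depthwise separable convolutions can be checked with simple arithmetic. The sketch below is illustrative only: the function names and the layer sizes (3x3 kernel, 128 input channels, 256 output channels) are assumptions chosen for the example, and biases are ignored.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (biases ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """Weights in a depthwise k x k convolution followed by a 1x1 pointwise convolution."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1x1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, chosen only for illustration.
k, c_in, c_out = 3, 128, 256
std = standard_conv_params(k, c_in, c_out)        # 294912
sep = depthwise_separable_params(k, c_in, c_out)  # 33920
print(f"standard: {std}, separable: {sep}, reduction: {std / sep:.1f}x")
```

For this layer the separable form needs roughly 8.7 times fewer weights; the exact ratio depends on the kernel size and the channel counts.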
  • 9. State-of-the-art (SOTA) in 2018 • Mobile devices: more memory; dedicated hardware to run ML models • Deep learning frameworks • Models: directly optimize models for the resource constraint (e.g., size); more focus on latency; optimize for multiple objectives
  • 10. SOTA in 2018: Devices • Storage: <128MB • Storage: <512MB • Neural Engine: dedicated hardware for ML algorithms • CoreML / TF-Lite
  • 11. SOTA in 2018: Models • Model compression: structured pruning is used to reduce latency • Designing compact models: neural architecture search finds models that satisfy the resource restrictions • In addition to accuracy, latency or model size is also incorporated into the objective
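The difference between unstructured and structured pruning can be sketched on a toy weight matrix in which each row stands for one filter. This is a minimal illustration, not the method of any paper cited in the slides: the L1-norm ranking, the threshold, and the weight values are all assumptions.

```python
def unstructured_prune(weights, threshold):
    """Zero individual weights with |w| < threshold; the matrix keeps its shape."""
    return [[w if abs(w) >= threshold else 0.0 for w in row] for row in weights]


def structured_prune(weights, keep):
    """Keep the `keep` rows (filters) with the largest L1 norm; returns a smaller dense matrix."""
    ranked = sorted(
        range(len(weights)),
        key=lambda i: sum(abs(w) for w in weights[i]),
        reverse=True,
    )
    kept = sorted(ranked[:keep])  # preserve the original filter order
    return [weights[i] for i in kept]


# Toy 4-filter layer; the values are made up for illustration.
W = [
    [0.9, -0.8, 0.7],
    [0.05, 0.02, -0.6],
    [0.01, -0.03, 0.04],
    [-0.5, 0.4, 0.02],
]

sparse_W = unstructured_prune(W, threshold=0.1)  # still 4 x 3, with scattered zeros
small_W = structured_prune(W, keep=2)            # dense 2 x 3 matrix

print(len(sparse_W), len(small_W))
```

The unstructured result has the same shape as the original, so a dense kernel does the same amount of work unless the hardware or library exploits sparsity. The structured result is simply a smaller dense layer, which is why it tends to translate into lower latency on ordinary hardware.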
  • 12. Neural Architecture Search • Automates the design of neural network models • NasNet (Zoph and Le, 2017): accuracy is used as the reward in a reinforcement learning model • DPP-Net (Dong+, 2018): a multi-objective architecture search that optimizes for both accuracy and inference time
  • 13. MnasNet (Tan+, 2018) • Neural architecture search for mobile devices • Optimized for both accuracy and latency • Multiple Pareto-optimal solutions are found in a single architecture search • Latency is measured directly on a mobile phone • Able to find models that run 1.5x faster than MobileNet v2
      [Figure labels: Controller; sample models from search space; Trainer (accuracy); Mobile phones (latency); multi-objective reward]
      Objective:
      $$\underset{m}{\text{maximize}} \quad ACC(m) \times \left[ \frac{LAT(m)}{T} \right]^{w}, \qquad w = \begin{cases} \alpha, & \text{if } LAT(m) \le T \\ \beta, & \text{otherwise} \end{cases}$$
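The multi-objective reward on the MnasNet slide, ACC(m) × [LAT(m)/T]^w, can be written as a small function. This is a sketch under stated assumptions: the function name, the accuracy and latency inputs, and the α and β values used below are illustrative placeholders and do not come from the slides.

```python
def mnas_reward(acc, lat_ms, target_ms, alpha, beta):
    """ACC(m) * (LAT(m) / T) ** w, with w = alpha if LAT(m) <= T, else beta."""
    w = alpha if lat_ms <= target_ms else beta
    return acc * (lat_ms / target_ms) ** w


# Illustrative exponents (assumption): alpha = 0 gives no bonus for being
# faster than the target, beta = -1 penalizes models that exceed it.
ALPHA, BETA = 0.0, -1.0
TARGET_MS = 80.0  # hypothetical latency target T

print(mnas_reward(0.75, 40.0, TARGET_MS, ALPHA, BETA))   # under target: reward equals accuracy
print(mnas_reward(0.75, 160.0, TARGET_MS, ALPHA, BETA))  # 2x over target: reward is halved
```

With negative exponents, a model that exceeds the target latency T is penalized in proportion to how far it overshoots, so the controller is steered toward architectures that trade accuracy against measured latency.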
  • 14. MnasNet
      Model Name         Model Size   Top-1 Accuracy   Top-5 Accuracy   TF Lite Performance
      MnasNet_0.50_224   8.5 Mb       68.03%           87.79%           37 ms
      MnasNet_0.75_224   12 Mb        71.72%           90.17%           61 ms
      MnasNet_1.3_224    24 Mb        75.24%           92.55%           152 ms
      SqueezeNet         5.0 Mb       49.0%            72.9%            224 ms
      ResNet_V2_101      178.3 Mb     76.8%            93.6%            1880 ms
      Inception_V3       95.3 Mb      77.9%            93.8%            1433 ms
      Source: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/lite/g3doc/models.md
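To make the notion of Pareto-optimality concrete, the sketch below applies it to the Top-1 accuracy and TF Lite latency figures reported on the MnasNet slide. The `dominates` and `pareto_front` helpers are hypothetical names written for this example and are not part of any library or of the MnasNet search itself.

```python
# (name, top-1 accuracy in %, TF Lite latency in ms), as reported on the slide
models = [
    ("MnasNet_0.50_224", 68.03, 37),
    ("MnasNet_0.75_224", 71.72, 61),
    ("MnasNet_1.3_224", 75.24, 152),
    ("SqueezeNet", 49.0, 224),
    ("ResNet_V2_101", 76.8, 1880),
    ("Inception_V3", 77.9, 1433),
]


def dominates(a, b):
    """a dominates b if it is no worse on both axes and strictly better on at least one."""
    _, acc_a, lat_a = a
    _, acc_b, lat_b = b
    return acc_a >= acc_b and lat_a <= lat_b and (acc_a > acc_b or lat_a < lat_b)


def pareto_front(candidates):
    """Return the candidates that no other candidate dominates."""
    return [m for m in candidates if not any(dominates(o, m) for o in candidates)]


front = [name for name, _, _ in pareto_front(models)]
print(front)
# ['MnasNet_0.50_224', 'MnasNet_0.75_224', 'MnasNet_1.3_224', 'Inception_V3']
```

On these two axes SqueezeNet is dominated by MnasNet_0.50_224 (more accurate and faster), and ResNet_V2_101 is dominated by Inception_V3. Model size is left out here, so the front would change if it were added as a third objective.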
  • 16. Summary • Mobile devices have become more capable of running DNN models • Unstructured pruning is out of fashion • Accuracy and platform-dependent restrictions are incorporated into multi-objective model search
  • 17. References
      • Dong, Jin-Dong, et al. "DPP-Net: Device-aware Progressive Search for Pareto-optimal Neural Architectures." arXiv preprint arXiv:1806.08198 (2018).
      • Han, Song, Huizi Mao, and William J. Dally. "Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding." arXiv preprint arXiv:1510.00149 (2015).
      • Tan, Mingxing, et al. "MnasNet: Platform-Aware Neural Architecture Search for Mobile." arXiv preprint arXiv:1807.11626 (2018).
      • Zoph, Barret, and Quoc V. Le. "Neural Architecture Search with Reinforcement Learning." arXiv preprint arXiv:1611.01578 (2016).