SlideShare ist ein Scribd-Unternehmen logo
1 von 33
Downloaden Sie, um offline zu lesen
© 2018, Amazon Web Services, Inc. or Its Affiliates. All rights reserved.
Julien Simon, Principal AI/ML Evangelist, Amazon Web Services
@julsimon
Simone Mangiante, Research and Standards Specialist, Vodafone
Simone.Mangiante@Vodafone.com
Machine Learning Inference at the Edge
Agenda
• Deep Learning at the Edge?
• Apache MXNet
• Predicting in the Cloud or at the Edge?
• Case study: Driver Monitoring by Vodafone
• New services: AWS Greengrass ML and AWS DeepLens
• Resources
Deep Learning at the Edge?
Use Cases
Voice/sound
recognition
Collision
avoidance
Image
recognition
Anomaly
detection
More
!
Smart
Agriculture
Predictive
maintenance
Self-driving
cars
Video
surveillance
Robotics
Deep Learning challenges at the Edge
• Resource-constrained devices
• CPU, memory, storage, power consumption.
• Network connectivity
• Availability, cost, bandwidth, latency.
• On-device prediction may be the only option.
• Deployment
• Updating code and models on a fleet of devices
is not easy.
Deep Learning wishlist at the Edge
• Rely on cloud-based services for seamless training and deployment.
• Have the option to use cloud-based prediction.
• Be able to run device-based prediction with good performance.
• Support different technical environments (CPUs, languages).
Apache MXNet
Apache MXNet: Open Source library for Deep Learning
Programmable Portable High Performance
Near linear scaling
across hundreds of
GPUs
Highly efficient
models for
mobile
and IoT
Simple syntax,
multiple
languages
Most Open Best On AWS
Optimized for
Deep Learning on AWS
Accepted into the
Apache Incubator
Apache MXNet for IoT
1. Flexible experimentation in the Cloud.
2. Scalable training in the Cloud.
3. Good prediction performance at the Edge.
4. Prediction in the Cloud or at the Edge.
1 - Flexible experimentation in the Cloud
• API for Python, R, Perl, Matlab, Scala, C++.
• Gluon
• Imperative programming aka ‘define-by-run’.
• Inspect, debug and modify models during training.
• Extensive model zoo
• Pre-trained computer vision models.
• DenseNet, SqueezeNet for resource-constrained devices.
2 - Scalable training in the Cloud
Amazon SageMaker AWS Deep Learning AMI
Amazon EC2 c5 p3
3 - Good prediction performance at the Edge
• MXNet is written in C++.
• Gluon networks can be ‘hybridized’ for additional speed.
• Two libraries boost performance on CPU-only devices
• Fast implementation of math primitives
• Hardware-specific instructions, e.g. Intel AVX or ARM NEON
• Intel Math Kernel Library https://software.intel.com/en-us/mkl
• NNPACK https://github.com/Maratyszcza/NNPACK
• Mixed precision training
• Use float16 instead of float32 for weights and activations
• Almost 2x reduction in model size, no loss of accuracy, faster inference
• https://devblogs.nvidia.com/parallelforall/mixed-precision-training-deep-neural-networks/
4 - Predicting in the Cloud or at the Edge
• Cloud-based: invoke a Lambda function with AWS IoT.
• Cloud-based: invoke a SageMaker endpoint with HTTP.
• Device-based: bring your own code and model.
• Device-based: deploy your code and model with AWS Greengrass.
Invoking a Lambda function with AWS IoT
• Train a model in SageMaker (or bring your own).
• Host it in S3 (or embed it in a Lambda function).
• Write a Lambda function performing prediction.
• Invoke it through AWS IoT.
Best when
Devices can support neither HTTP nor local
inference (e.g. Arduino).
Costs must be kept as low as possible.
Requirements
Network is available and reliable
(MQTT is less demanding than HTTP).
Devices are provisioned in AWS IoT (certificate, keys).
https://aws.amazon.com/blogs/compute/seamlessly-scale-predictions-with-aws-lambda-and-mxnet/
Invoking a SageMaker endpoint with HTTP
• Train a model in SageMaker (or bring your own).
• Deploy it to a prediction endpoint.
• Invoke the HTTP endpoint from your devices.
Best when
Devices are not powerful enough for local inference.
Models can’t be easily deployed to devices.
Additional cloud-based data is required for prediction.
Prediction activity must be centralized.
Requirements
Network is available and reliable.
Devices support HTTP.
Bring your own code and model
• Train a model in SageMaker (or bring your own).
• Bring your own application code.
• Provision devices at manufacturing time
(or use your own update mechanism).
Best when
You don’t want to or can’t rely on cloud services
(no network connectivity?)
Requirements
Devices are powerful enough for local inference.
Models don’t need to be updated, if ever.
DIY!
Deploy your code and model with AWS Greengrass
• Train a model in SageMaker (or bring your own).
• Write a Lambda function performing prediction.
• Add both as resources in your Greengrass group.
• Let Greengrass handle deployment and updates.
Best when
You want the same programming model in the Cloud
and at the Edge.
Code and models need to be updated, even if
network connectivity is infrequent or unreliable.
One device in the group should be able to perform
prediction on behalf on other devices.
Requirements
Devices are powerful enough to run Greengrass
(XXX HW requirements)
Devices are provisioned in AWS IoT (certificate, keys).
Distributing ML at the edge
Dr Simone Mangiante
simone.mangiante@vodafone.com
27 March 2018
Distributed, low latency machine
learning
Increasing amount of in-vehicle
data (40+ TB/h)
Desire for more mobile-driven
services in cars harnessing
advanced network functions
Needs to be developer friendly
ML at the edge: challenges and benefits
Challenges from use cases, benefits from deployment at the telco network edge
Reduce in-vehicle bill-of-material by
offloading computing resources
Optimise service cost with shared resources
Seamlessly upgrade in a cloud-based
environment
Evolve at the pace of technology leveraging
the latest advances
Real-time alerts that enhance safety
What is Mobile/Multi-access Edge Computing?
MEC is a network architectural concept
that enables cloud-computing capabilities
at the edge of the network
MEC offers applications and content
providers cloud-computing capabilities at
the edge of the network
CloudMobile Edge CloudRadio Site
• MEC software (i.e. MEC Platform) runs as a
VNF on a cloud-based edge infrastructure
• 3rd party applications can be deployed on
MEC platforms
• MEC platform can expose network APIs to
applications
4G/5G Services Edge Cloud Infrastructure
VNF
Edge Cloud Services & API
Edge Application
MEC Platform
Virtual
Network
Functions
Edge DeploymentEdge Deployment
The rationale for edge computing goes beyond latency
Latency argument is sometimes ‘overhyped’
Mobile Core network
MEC driver monitoring proof of concept
Saguna vEdge
AWS Greengrass
Driver Monitoring App
Vodafone 4G/LTE
MEC Edge Cloud
Internet
IPsec tunnel IPsec tunnel
1
2
3
4 5
1 • Camera device
(Raspberry Pi)
• Connected via cellular
radio
• Video streamed to Driver
Monitoring App
• Filters traffic based on
traffic rules: e.g. pass-
through to Internet,
mirror, redirect to local
application
• Can ‘chain’
applications
2 • Camera device is in
Greengrass Group
• Runs in a VM
• It’s a ‘MEC application’
• Receives traffic from radio
access as per configured
traffic rules
3 4 • Edge app
• Business logic
• Includes neural
network
5 • Training
MQTT
messages
Notifications/results
Model updates
Local breakout
Driver monitoring application
Convolutional Neural Network
• Two architectures explored:
Inception, MobileNet
• MobileNet chosen for its speed
advantage and ability to run on a
light platform (as for a PoC)
Training data
• Source: Kaggle.com
• Dataset: State Farm Distracted
Driver Detection
Business logic
• Classifies a series of sequential video
frames
• Then makes decision on whether
“distracted driving” detected
• For PoC, sends message to Raspberry Pi
for local web server and displaying via
browser on monitor
Stream video
Receive alerts
In-Vehicle
AWS IoT SDK
MEC Edge Cloud
Receive video
Send alerts
Perform inference
using ML model
AWS Greengrass core
Create and train
the driver
monitoring model
AWS Cloud
AWS SageMaker
Developed by
The future is exciting.
Ready?
ML Inference using AWS Greengrass
Deploy model
and Lambda
Send
insightsPredict and take
actions locally
on device
AWS Cloud
PREVIEW
AVAILABLE
AWS Greengrass ML
PREVIEW
AVAILABLE
AWS DeepLens
A new way to learn
Custom built for Deep Learning
Broad Framework Support
Deploy models from Amazon SageMaker
Integrated with AWS
Fully programmable with AWS Lambda
AWS DeepLens
W o r l d ’ s f i r s t D e e p L e a r n i n g e n a b l e d v i d e o c a m e r a f o r d e v e l o p e r s
AWS DeepLens
Object detection with AWS DeepLens
Resources
Resources
https://mxnet.incubator.apache.org
http://gluon.mxnet.io
https://aws.amazon.com/sagemaker (free tier available)
An overview of Amazon SageMaker: https://www.youtube.com/watch?v=ym7NEYEx9x4
https://github.com/awslabs/amazon-sagemaker-examples
https://aws.amazon.com/greengrass (free tier available)
https://aws.amazon.com/deeplens
https://medium.com/@julsimon
© 2018, Amazon Web Services, Inc. or Its Affiliates. All rights reserved.
Thank you!
Julien Simon, Principal AI/ML Evangelist, Amazon Web Services
@julsimon
Simone Mangiante, Research and Standards Specialist, Vodafone
Simone.Mangiante@Vodafone.com

Weitere ähnliche Inhalte

Was ist angesagt?

Was ist angesagt? (20)

A Thorough Comparison of Delta Lake, Iceberg and Hudi
A Thorough Comparison of Delta Lake, Iceberg and HudiA Thorough Comparison of Delta Lake, Iceberg and Hudi
A Thorough Comparison of Delta Lake, Iceberg and Hudi
 
InfluxDB IOx Tech Talks: Query Engine Design and the Rust-Based DataFusion in...
InfluxDB IOx Tech Talks: Query Engine Design and the Rust-Based DataFusion in...InfluxDB IOx Tech Talks: Query Engine Design and the Rust-Based DataFusion in...
InfluxDB IOx Tech Talks: Query Engine Design and the Rust-Based DataFusion in...
 
Apache Kafka vs RabbitMQ: Fit For Purpose / Decision Tree
Apache Kafka vs RabbitMQ: Fit For Purpose / Decision TreeApache Kafka vs RabbitMQ: Fit For Purpose / Decision Tree
Apache Kafka vs RabbitMQ: Fit For Purpose / Decision Tree
 
Flink vs. Spark
Flink vs. SparkFlink vs. Spark
Flink vs. Spark
 
Apache Flink Crash Course by Slim Baltagi and Srini Palthepu
Apache Flink Crash Course by Slim Baltagi and Srini PalthepuApache Flink Crash Course by Slim Baltagi and Srini Palthepu
Apache Flink Crash Course by Slim Baltagi and Srini Palthepu
 
Google File System
Google File SystemGoogle File System
Google File System
 
Eventing Things - A Netflix Original! (Nitin Sharma, Netflix) Kafka Summit SF...
Eventing Things - A Netflix Original! (Nitin Sharma, Netflix) Kafka Summit SF...Eventing Things - A Netflix Original! (Nitin Sharma, Netflix) Kafka Summit SF...
Eventing Things - A Netflix Original! (Nitin Sharma, Netflix) Kafka Summit SF...
 
Building real time analytics applications using pinot : A LinkedIn case study
Building real time analytics applications using pinot : A LinkedIn case studyBuilding real time analytics applications using pinot : A LinkedIn case study
Building real time analytics applications using pinot : A LinkedIn case study
 
Kafka streams windowing behind the curtain
Kafka streams windowing behind the curtain Kafka streams windowing behind the curtain
Kafka streams windowing behind the curtain
 
Introduction to Stream Processing
Introduction to Stream ProcessingIntroduction to Stream Processing
Introduction to Stream Processing
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
 
Apache Kafka in the Airline, Aviation and Travel Industry
Apache Kafka in the Airline, Aviation and Travel IndustryApache Kafka in the Airline, Aviation and Travel Industry
Apache Kafka in the Airline, Aviation and Travel Industry
 
Apache Kafka and the Data Mesh | Michael Noll, Confluent
Apache Kafka and the Data Mesh | Michael Noll, ConfluentApache Kafka and the Data Mesh | Michael Noll, Confluent
Apache Kafka and the Data Mesh | Michael Noll, Confluent
 
Confluent Enterprise Datasheet
Confluent Enterprise DatasheetConfluent Enterprise Datasheet
Confluent Enterprise Datasheet
 
Spark graphx
Spark graphxSpark graphx
Spark graphx
 
Deep Dive into Building Streaming Applications with Apache Pulsar
Deep Dive into Building Streaming Applications with Apache Pulsar Deep Dive into Building Streaming Applications with Apache Pulsar
Deep Dive into Building Streaming Applications with Apache Pulsar
 
Scaling Data and ML with Apache Spark and Feast
Scaling Data and ML with Apache Spark and FeastScaling Data and ML with Apache Spark and Feast
Scaling Data and ML with Apache Spark and Feast
 
Design Principles for a High-Performance Smalltalk
Design Principles for a High-Performance SmalltalkDesign Principles for a High-Performance Smalltalk
Design Principles for a High-Performance Smalltalk
 
Apache Flink @ NYC Flink Meetup
Apache Flink @ NYC Flink MeetupApache Flink @ NYC Flink Meetup
Apache Flink @ NYC Flink Meetup
 
Functional Smalltalk
Functional SmalltalkFunctional Smalltalk
Functional Smalltalk
 

Ähnlich wie Machine Learning Inference at the Edge

Survey on cloud simulator
Survey on cloud simulatorSurvey on cloud simulator
Survey on cloud simulator
Habibur Rahman
 
08 sdn system intelligence short public beijing sdn conference - 130828
08 sdn system intelligence   short public beijing sdn conference - 13082808 sdn system intelligence   short public beijing sdn conference - 130828
08 sdn system intelligence short public beijing sdn conference - 130828
Mason Mei
 

Ähnlich wie Machine Learning Inference at the Edge (20)

Machine Learning inference at the Edge
Machine Learning inference at the EdgeMachine Learning inference at the Edge
Machine Learning inference at the Edge
 
DataPalooza: ML & IoT Workshop
DataPalooza: ML & IoT WorkshopDataPalooza: ML & IoT Workshop
DataPalooza: ML & IoT Workshop
 
Deep Learning at the Edge
Deep Learning at the EdgeDeep Learning at the Edge
Deep Learning at the Edge
 
Datapalooza: A Music Festival Themed ML & IoT Workshop
Datapalooza: A Music Festival Themed ML & IoT WorkshopDatapalooza: A Music Festival Themed ML & IoT Workshop
Datapalooza: A Music Festival Themed ML & IoT Workshop
 
AIM410R Deep Learning Applications with TensorFlow, featuring Mobileye (Decem...
AIM410R Deep Learning Applications with TensorFlow, featuring Mobileye (Decem...AIM410R Deep Learning Applications with TensorFlow, featuring Mobileye (Decem...
AIM410R Deep Learning Applications with TensorFlow, featuring Mobileye (Decem...
 
Enabling Deep Learning in IoT Applications with Apache MXNet
Enabling Deep Learning in IoT Applications with Apache MXNetEnabling Deep Learning in IoT Applications with Apache MXNet
Enabling Deep Learning in IoT Applications with Apache MXNet
 
Neev cloud services with AWS
Neev cloud services with AWSNeev cloud services with AWS
Neev cloud services with AWS
 
“Accelerating Newer ML Models Using the Qualcomm AI Stack,” a Presentation fr...
“Accelerating Newer ML Models Using the Qualcomm AI Stack,” a Presentation fr...“Accelerating Newer ML Models Using the Qualcomm AI Stack,” a Presentation fr...
“Accelerating Newer ML Models Using the Qualcomm AI Stack,” a Presentation fr...
 
Cloud computing
Cloud computingCloud computing
Cloud computing
 
Introduction to Cloud Computing
Introduction to Cloud ComputingIntroduction to Cloud Computing
Introduction to Cloud Computing
 
A Complete Guide Cloud Computing
A Complete Guide Cloud ComputingA Complete Guide Cloud Computing
A Complete Guide Cloud Computing
 
Cloud and its job oppertunities
Cloud and its job oppertunitiesCloud and its job oppertunities
Cloud and its job oppertunities
 
Survey on cloud simulator
Survey on cloud simulatorSurvey on cloud simulator
Survey on cloud simulator
 
How do you deliver your applications to the cloud?
How do you deliver your applications to the cloud?How do you deliver your applications to the cloud?
How do you deliver your applications to the cloud?
 
Cloud ppt
Cloud pptCloud ppt
Cloud ppt
 
Lightening the burden of cloud resources administration: from VMs to Functions
Lightening the burden of cloud resources administration: from VMs to FunctionsLightening the burden of cloud resources administration: from VMs to Functions
Lightening the burden of cloud resources administration: from VMs to Functions
 
08 sdn system intelligence short public beijing sdn conference - 130828
08 sdn system intelligence   short public beijing sdn conference - 13082808 sdn system intelligence   short public beijing sdn conference - 130828
08 sdn system intelligence short public beijing sdn conference - 130828
 
在小學有效運用雲端電腦以促進電子學習(第一節筆記)
在小學有效運用雲端電腦以促進電子學習(第一節筆記)在小學有效運用雲端電腦以促進電子學習(第一節筆記)
在小學有效運用雲端電腦以促進電子學習(第一節筆記)
 
cc.pptx
cc.pptxcc.pptx
cc.pptx
 
201908 Overview of Automated ML
201908 Overview of Automated ML201908 Overview of Automated ML
201908 Overview of Automated ML
 

Mehr von Julien SIMON

Mehr von Julien SIMON (20)

An introduction to computer vision with Hugging Face
An introduction to computer vision with Hugging FaceAn introduction to computer vision with Hugging Face
An introduction to computer vision with Hugging Face
 
Reinventing Deep Learning
 with Hugging Face Transformers
Reinventing Deep Learning
 with Hugging Face TransformersReinventing Deep Learning
 with Hugging Face Transformers
Reinventing Deep Learning
 with Hugging Face Transformers
 
Building NLP applications with Transformers
Building NLP applications with TransformersBuilding NLP applications with Transformers
Building NLP applications with Transformers
 
Building Machine Learning Models Automatically (June 2020)
Building Machine Learning Models Automatically (June 2020)Building Machine Learning Models Automatically (June 2020)
Building Machine Learning Models Automatically (June 2020)
 
Starting your AI/ML project right (May 2020)
Starting your AI/ML project right (May 2020)Starting your AI/ML project right (May 2020)
Starting your AI/ML project right (May 2020)
 
Scale Machine Learning from zero to millions of users (April 2020)
Scale Machine Learning from zero to millions of users (April 2020)Scale Machine Learning from zero to millions of users (April 2020)
Scale Machine Learning from zero to millions of users (April 2020)
 
An Introduction to Generative Adversarial Networks (April 2020)
An Introduction to Generative Adversarial Networks (April 2020)An Introduction to Generative Adversarial Networks (April 2020)
An Introduction to Generative Adversarial Networks (April 2020)
 
AIM410R1 Deep learning applications with TensorFlow, featuring Fannie Mae (De...
AIM410R1 Deep learning applications with TensorFlow, featuring Fannie Mae (De...AIM410R1 Deep learning applications with TensorFlow, featuring Fannie Mae (De...
AIM410R1 Deep learning applications with TensorFlow, featuring Fannie Mae (De...
 
AIM361 Optimizing machine learning models with Amazon SageMaker (December 2019)
AIM361 Optimizing machine learning models with Amazon SageMaker (December 2019)AIM361 Optimizing machine learning models with Amazon SageMaker (December 2019)
AIM361 Optimizing machine learning models with Amazon SageMaker (December 2019)
 
A pragmatic introduction to natural language processing models (October 2019)
A pragmatic introduction to natural language processing models (October 2019)A pragmatic introduction to natural language processing models (October 2019)
A pragmatic introduction to natural language processing models (October 2019)
 
Building smart applications with AWS AI services (October 2019)
Building smart applications with AWS AI services (October 2019)Building smart applications with AWS AI services (October 2019)
Building smart applications with AWS AI services (October 2019)
 
Build, train and deploy ML models with SageMaker (October 2019)
Build, train and deploy ML models with SageMaker (October 2019)Build, train and deploy ML models with SageMaker (October 2019)
Build, train and deploy ML models with SageMaker (October 2019)
 
The Future of AI (September 2019)
The Future of AI (September 2019)The Future of AI (September 2019)
The Future of AI (September 2019)
 
Building Machine Learning Inference Pipelines at Scale (July 2019)
Building Machine Learning Inference Pipelines at Scale (July 2019)Building Machine Learning Inference Pipelines at Scale (July 2019)
Building Machine Learning Inference Pipelines at Scale (July 2019)
 
Train and Deploy Machine Learning Workloads with AWS Container Services (July...
Train and Deploy Machine Learning Workloads with AWS Container Services (July...Train and Deploy Machine Learning Workloads with AWS Container Services (July...
Train and Deploy Machine Learning Workloads with AWS Container Services (July...
 
Optimize your Machine Learning Workloads on AWS (July 2019)
Optimize your Machine Learning Workloads on AWS (July 2019)Optimize your Machine Learning Workloads on AWS (July 2019)
Optimize your Machine Learning Workloads on AWS (July 2019)
 
Deep Learning on Amazon Sagemaker (July 2019)
Deep Learning on Amazon Sagemaker (July 2019)Deep Learning on Amazon Sagemaker (July 2019)
Deep Learning on Amazon Sagemaker (July 2019)
 
Automate your Amazon SageMaker Workflows (July 2019)
Automate your Amazon SageMaker Workflows (July 2019)Automate your Amazon SageMaker Workflows (July 2019)
Automate your Amazon SageMaker Workflows (July 2019)
 
Build, train and deploy ML models with Amazon SageMaker (May 2019)
Build, train and deploy ML models with Amazon SageMaker (May 2019)Build, train and deploy ML models with Amazon SageMaker (May 2019)
Build, train and deploy ML models with Amazon SageMaker (May 2019)
 
Build, train and deploy Machine Learning models on Amazon SageMaker (May 2019)
Build, train and deploy Machine Learning models on Amazon SageMaker (May 2019)Build, train and deploy Machine Learning models on Amazon SageMaker (May 2019)
Build, train and deploy Machine Learning models on Amazon SageMaker (May 2019)
 

Kürzlich hochgeladen

Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 

Kürzlich hochgeladen (20)

Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 

Machine Learning Inference at the Edge

  • 1. © 2018, Amazon Web Services, Inc. or Its Affiliates. All rights reserved. Julien Simon, Principal AI/ML Evangelist, Amazon Web Services @julsimon Simone Mangiante, Research and Standards Specialist, Vodafone Simone.Mangiante@Vodafone.com Machine Learning Inference at the Edge
  • 2. Agenda • Deep Learning at the Edge? • Apache MXNet • Predicting in the Cloud or at the Edge? • Case study: Driver Monitoring by Vodafone • New services: AWS Greengrass ML and AWS DeepLens • Resources
  • 3. Deep Learning at the Edge?
  • 5. Deep Learning challenges at the Edge • Resource-constrained devices • CPU, memory, storage, power consumption. • Network connectivity • Availability, cost, bandwidth, latency. • On-device prediction may be the only option. • Deployment • Updating code and models on a fleet of devices is not easy.
  • 6. Deep Learning wishlist at the Edge • Rely on cloud-based services for seamless training and deployment. • Have the option to use cloud-based prediction. • Be able to run device-based prediction with good performance. • Support different technical environments (CPUs, languages).
  • 8. Apache MXNet: Open Source library for Deep Learning Programmable Portable High Performance Near linear scaling across hundreds of GPUs Highly efficient models for mobile and IoT Simple syntax, multiple languages Most Open Best On AWS Optimized for Deep Learning on AWS Accepted into the Apache Incubator
  • 9. Apache MXNet for IoT 1. Flexible experimentation in the Cloud. 2. Scalable training in the Cloud. 3. Good prediction performance at the Edge. 4. Prediction in the Cloud or at the Edge.
  • 10. 1 - Flexible experimentation in the Cloud • API for Python, R, Perl, Matlab, Scala, C++. • Gluon • Imperative programming aka ‘define-by-run’. • Inspect, debug and modify models during training. • Extensive model zoo • Pre-trained computer vision models. • DenseNet, SqueezeNet for resource-constrained devices.
  • 11. 2 - Scalable training in the Cloud Amazon SageMaker AWS Deep Learning AMI Amazon EC2 c5 p3
  • 12. 3 - Good prediction performance at the Edge • MXNet is written in C++. • Gluon networks can be ‘hybridized’ for additional speed. • Two libraries boost performance on CPU-only devices • Fast implementation of math primitives • Hardware-specific instructions, e.g. Intel AVX or ARM NEON • Intel Math Kernel Library https://software.intel.com/en-us/mkl • NNPACK https://github.com/Maratyszcza/NNPACK • Mixed precision training • Use float16 instead of float32 for weights and activations • Almost 2x reduction in model size, no loss of accuracy, faster inference • https://devblogs.nvidia.com/parallelforall/mixed-precision-training-deep-neural-networks/
  • 13. 4 - Predicting in the Cloud or at the Edge • Cloud-based: invoke a Lambda function with AWS IoT. • Cloud-based: invoke a SageMaker endpoint with HTTP. • Device-based: bring your own code and model. • Device-based: deploy your code and model with AWS Greengrass.
  • 14. Invoking a Lambda function with AWS IoT • Train a model in SageMaker (or bring your own). • Host it in S3 (or embed it in a Lambda function). • Write a Lambda function performing prediction. • Invoke it through AWS IoT. Best when Devices can support neither HTTP nor local inference (e.g. Arduino). Costs must be kept as low as possible. Requirements Network is available and reliable (MQTT is less demanding than HTTP). Devices are provisioned in AWS IoT (certificate, keys). https://aws.amazon.com/blogs/compute/seamlessly-scale-predictions-with-aws-lambda-and-mxnet/
  • 15. Invoking a SageMaker endpoint with HTTP • Train a model in SageMaker (or bring your own). • Deploy it to a prediction endpoint. • Invoke the HTTP endpoint from your devices. Best when Devices are not powerful enough for local inference. Models can’t be easily deployed to devices. Additional cloud-based data is required for prediction. Prediction activity must be centralized. Requirements Network is available and reliable. Devices support HTTP.
  • 16. Bring your own code and model • Train a model in SageMaker (or bring your own). • Bring your own application code. • Provision devices at manufacturing time (or use your own update mechanism). Best when You don’t want to or can’t rely on cloud services (no network connectivity?) Requirements Devices are powerful enough for local inference. Models don’t need to be updated, if ever. DIY!
  • 17. Deploy your code and model with AWS Greengrass • Train a model in SageMaker (or bring your own). • Write a Lambda function performing prediction. • Add both as resources in your Greengrass group. • Let Greengrass handle deployment and updates. Best when You want the same programming model in the Cloud and at the Edge. Code and models need to be updated, even if network connectivity is infrequent or unreliable. One device in the group should be able to perform prediction on behalf on other devices. Requirements Devices are powerful enough to run Greengrass (XXX HW requirements) Devices are provisioned in AWS IoT (certificate, keys).
  • 18. Distributing ML at the edge Dr Simone Mangiante simone.mangiante@vodafone.com 27 March 2018
  • 19. Distributed, low latency machine learning Increasing amount of in-vehicle data (40+ TB/h) Desire for more mobile-driven services in cars harnessing advanced network functions Needs to be developer friendly ML at the edge: challenges and benefits Challenges from use cases, benefits from deployment at the telco network edge Reduce in-vehicle bill-of-material by offloading computing resources Optimise service cost with shared resources Seamlessly upgrade in a cloud-based environment Evolve at the pace of technology leveraging the latest advances Real-time alerts that enhance safety
  • 20. What is Mobile/Multi-access Edge Computing? MEC is a network architectural concept that enables cloud-computing capabilities at the edge of the network MEC offers applications and content providers cloud-computing capabilities at the edge of the network CloudMobile Edge CloudRadio Site • MEC software (i.e. MEC Platform) runs as a VNF on a cloud-based edge infrastructure • 3rd party applications can be deployed on MEC platforms • MEC platform can expose network APIs to applications 4G/5G Services Edge Cloud Infrastructure VNF Edge Cloud Services & API Edge Application MEC Platform Virtual Network Functions Edge DeploymentEdge Deployment
  • 21. The rationale for edge computing goes beyond latency Latency argument is sometimes ‘overhyped’
  • 22. Mobile Core network MEC driver monitoring proof of concept Saguna vEdge AWS Greengrass Driver Monitoring App Vodafone 4G/LTE MEC Edge Cloud Internet IPsec tunnel IPsec tunnel 1 2 3 4 5 1 • Camera device (Raspberry Pi) • Connected via cellular radio • Video streamed to Driver Monitoring App • Filters traffic based on traffic rules: e.g. pass- through to Internet, mirror, redirect to local application • Can ‘chain’ applications 2 • Camera device is in Greengrass Group • Runs in a VM • It’s a ‘MEC application’ • Receives traffic from radio access as per configured traffic rules 3 4 • Edge app • Business logic • Includes neural network 5 • Training MQTT messages Notifications/results Model updates Local breakout
  • 23. Driver monitoring application Convolutional Neural Network • Two architectures explored: Inception, MobileNet • MobileNet chosen for its speed advantage and ability to run on a light platform (as for a PoC) Training data • Source: Kaggle.com • Dataset: State Farm Distracted Driver Detection Business logic • Classifies a series of sequential video frames • Then makes decision on whether “distracted driving” detected • For PoC, sends message to Raspberry Pi for local web server and displaying via browser on monitor Stream video Receive alerts In-Vehicle AWS IoT SDK MEC Edge Cloud Receive video Send alerts Perform inference using ML model AWS Greengrass core Create and train the driver monitoring model AWS Cloud AWS SageMaker Developed by
  • 24. The future is exciting. Ready?
  • 25. ML Inference using AWS Greengrass Deploy model and Lambda Send insightsPredict and take actions locally on device AWS Cloud PREVIEW AVAILABLE
  • 28. A new way to learn Custom built for Deep Learning Broad Framework Support Deploy models from Amazon SageMaker Integrated with AWS Fully programmable with AWS Lambda AWS DeepLens W o r l d ’ s f i r s t D e e p L e a r n i n g e n a b l e d v i d e o c a m e r a f o r d e v e l o p e r s
  • 30. Object detection with AWS DeepLens
  • 32. Resources https://mxnet.incubator.apache.org http://gluon.mxnet.io https://aws.amazon.com/sagemaker (free tier available) An overview of Amazon SageMaker: https://www.youtube.com/watch?v=ym7NEYEx9x4 https://github.com/awslabs/amazon-sagemaker-examples https://aws.amazon.com/greengrass (free tier available) https://aws.amazon.com/deeplens https://medium.com/@julsimon
  • 33. © 2018, Amazon Web Services, Inc. or Its Affiliates. All rights reserved. Thank you! Julien Simon, Principal AI/ML Evangelist, Amazon Web Services @julsimon Simone Mangiante, Research and Standards Specialist, Vodafone Simone.Mangiante@Vodafone.com