© 2017 MapR Technologies, MapR Confidential
State of the Art Robot Predictive
Maintenance with Real-time
Sensor Data
Mateusz Dymczyk, Software Engineer @ h2o.ai
Mathieu Dumoulin, Data Engineer @ MapR
Strata New York 2017
State of the Art Robot Predictive
Maintenance with Real-time
Sensor Data Part 2
Mateusz Dymczyk, Software Engineer @ h2o.ai
Mathieu Dumoulin, Data Engineer @ MapR
Strata New York 2017
Mateusz Dymczyk and Mathieu Dumoulin
• Data Engineer @ MapR
Technologies
• Previously data scientist, DS manager, and search, NLP and ML
engineer in Canada and Japan
• Software Engineer @ H2O.ai
• Previously ML/NLP @ Fujitsu
Laboratories and en-japan inc
• $907B/year investment until 2020 [1]
• 1.6M operational industrial robots in the world in 2015 [2]
• 2.6M by 2020 [1]
[1] What Everyone Must Know About Industry 4.0, Forbes, June 2016
[2] International Federation of Robotics (IFR) study, World Robotics 2016
Source: PwC 2016 Global Industry 4.0 Survey
Industry 4.0 is Now
Industry 4.0 systems [1]:
1. Interoperable
2. Information transparency
3. Technical assistance
4. Decentralized decision making
Predictive Maintenance for Industrial Robots
Primary goal: Reduce unplanned downtime
Robot Actuator Failure Prediction PoC
Model 6-axis industrial robot
LPMS-B2
Wireless movement sensor
PoC Goal: Predict potential actuator failure in real-time (within 3s)
Success criteria
• Detect correct robot state
(Normal/Failure) within 3s
• Recall > precision
• Improve over time once a
“MVP” model is working
Photo: Ambient Intelligence Blog
Need for Scale: Deploy to a Real Factory
Tesla Factory photo by Paul Sakuma/AP
Don’t Reinvent the Wheel
• We have limited time and budget for this PoC
• Tools > assembly of existing software > coding
• The state of the art is often OSS anyway!
Video of solution in action 2m
PoC Building Blocks
People: 2 engineers from LP-RESEARCH, an ML engineer and a data engineer
Effort: 2 months part-time
Experimental Setup
Experimental Setup: Normal State
Experimental Setup: Failure State
Anomaly Detection for Predictive
Maintenance
Machine Learning Project Flow
Problem evaluation & definition → Data preparation → Explore and Analyze → Choose Algorithm → Build Model → Evaluate Model → Put into production
Problem Definition
1. Starting Point
– Classification problem
– Time series data
• Linear Acceleration X, Y, Z axis
– No labeled data at first
• Accumulate over time
2. Machine Learning goal/metrics
– Recall vs. Precision
3. Additional Requirements
– Detect state within 3 seconds
[Figure: robot states, labeled "Normal State (OK!)" and "PREDICT FAILURE"]
Data Source: Movement Sensor
• Real-time, on-device calculation of
linear acceleration
– Data centered around 0
– Measurements [-1,1]
• Data output rates of up to 400Hz
• Very sensitive
www.lp-research.com
LPMS-B2
Sensor Data Preparation
200ms window
Ref: 21 Great Articles and Tutorials on Time Series
• Feature selection (3 of 27 features)
• Windowing
– Window size: 200ms
– Sensor data rate: 100Hz
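The windowing step above can be sketched as follows. At 100 Hz, a 200 ms window holds 20 samples; with the three linear-acceleration axes kept after feature selection, each window flattens to a 60-value feature vector. The non-overlapping layout, the function name `make_windows` and the synthetic stream are assumptions for illustration, not taken from the project code.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[float, float, float]  # (LinAccX, LinAccY, LinAccZ)


def make_windows(samples: Sequence[Sample],
                 rate_hz: int = 100,
                 window_ms: int = 200) -> List[List[float]]:
    """Split a stream of 3-axis samples into flat, non-overlapping windows."""
    size = rate_hz * window_ms // 1000  # samples per window: 100 Hz * 0.2 s = 20
    windows = []
    # A trailing partial window, if any, is dropped.
    for start in range(0, len(samples) - size + 1, size):
        flat = [value for sample in samples[start:start + size] for value in sample]
        windows.append(flat)
    return windows


# One second of synthetic data (hypothetical values): 100 samples.
stream = [(0.01 * i, 0.0, -0.01 * i) for i in range(100)]
windows = make_windows(stream)
print(len(windows), len(windows[0]))
```

With these settings a 3-second detection budget spans 15 consecutive windows, which leaves room to require several anomalous windows before raising an alert.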
Modeling for Anomaly Detection
• Unlabeled data -> unsupervised learning
• Training data consists only of data
during “normal state” runs
– Only train on normal op. data
• Conclusion: anomaly detection
• Possible algorithms:
– HMM
– Autoencoders
– LSTM autoencoders
– KNN, Local Outlier Factor
Anomaly Detection
Get Ted Dunning’s Anomaly Detection Book
Anomaly!
First Model: Autoencoders
• A kind of neural network
used for unsupervised
learning of efficient codings
• Requires a training pass to
learn a representation of
“normal” data
• Anomalous data will have a
large reconstruction error
compared to normal data
Längkvist, Martin, Lars Karlsson, and Amy Loutfi. "A review of unsupervised feature learning and
deep learning for time-series modeling." Pattern Recognition Letters 42 (2014): 11-24.
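The reconstruction-error idea can be illustrated without a neural network. The sketch below swaps in a linear model (a PCA projection computed with NumPy's SVD) as a stand-in for the autoencoder: it learns a one-dimensional code from "normal" data only, then measures how badly a point is restored after an encode/decode round trip. The synthetic data, the code size and the anomalous point are all assumptions for illustration; the PoC itself used a neural autoencoder.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "normal" data (assumption): 3 channels driven by one latent signal.
latent = rng.normal(size=(500, 1))
mixing = np.array([[1.0, 0.5, -0.5]])
normal = latent @ mixing + 0.01 * rng.normal(size=(500, 3))

# "Training pass": learn a 1-dimensional code from normal data only.
mean = normal.mean(axis=0)
_, _, vt = np.linalg.svd(normal - mean, full_matrices=False)
components = vt[:1]  # shared encoder/decoder weights, shape (1, 3)


def reconstruction_error(x: np.ndarray) -> np.ndarray:
    """Per-row mean squared error between x and its encode/decode round trip."""
    code = (x - mean) @ components.T      # encode to 1 value per row
    restored = code @ components + mean   # decode back to 3 channels
    return ((x - restored) ** 2).mean(axis=1)


normal_error = reconstruction_error(normal).mean()
anomaly = np.array([[1.0, -1.0, 1.0]])  # breaks the correlation seen in training
anomaly_error = reconstruction_error(anomaly)[0]
print(normal_error, anomaly_error)
```

The anomalous point lies almost entirely outside the direction the model learned, so very little of it survives the round trip and its error is far larger than the average error on normal data.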
Experimental Setup: Training the Model
Performance Evaluation
• Evaluation dataset
– Captured from a preprogrammed “pre-failure” operation
mode
– 1x full movement cycle of (“pre-failure”) labeled data
• Normal 90% Failure 10%
• Performance measures:
– MSE during training
– TPR/FPR on the test dataset
Note: For an example with code: https://machinelearningmastery.com
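As a minimal sketch of the TPR/FPR measures listed above, the following computes both rates from per-window labels. The label vectors are hypothetical and only mirror the 90%/10% class split of the evaluation dataset; they are not the PoC's results.

```python
from typing import Sequence, Tuple


def tpr_fpr(actual: Sequence[bool], predicted: Sequence[bool]) -> Tuple[float, float]:
    """Return (true positive rate, false positive rate); True means "failure"."""
    tp = sum(a and p for a, p in zip(actual, predicted))
    fn = sum(a and not p for a, p in zip(actual, predicted))
    fp = sum((not a) and p for a, p in zip(actual, predicted))
    tn = sum((not a) and (not p) for a, p in zip(actual, predicted))
    tpr = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return tpr, fpr


# Hypothetical labels: 18 normal windows and 2 failure windows (90% / 10%).
actual = [False] * 18 + [True] * 2
# A detector that flags both failures and raises 3 false alarms.
predicted = [True] * 3 + [False] * 15 + [True] * 2
tpr, fpr = tpr_fpr(actual, predicted)

# A detector that never raises an alert is 90% accurate but catches nothing.
lazy_tpr, lazy_fpr = tpr_fpr(actual, [False] * 20)
print(tpr, fpr, lazy_tpr, lazy_fpr)
```

With a 90/10 split, plain accuracy rewards the detector that never fires, which is why the rates are tracked separately and why the success criteria favor recall (TPR) over precision.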
ML – Results
Note: Time window: 200ms, Threshold: 2SD
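One plausible reading of "Threshold: 2SD" is that a window is flagged when its reconstruction error exceeds the mean error on normal-state data by more than two standard deviations. The deck does not spell out the rule, so treat this as an assumption; the error values below are made up for illustration.

```python
from statistics import mean, stdev
from typing import Sequence


def fit_threshold(normal_errors: Sequence[float], n_sd: float = 2.0) -> float:
    """Threshold = mean + n_sd * sample standard deviation of normal-state errors."""
    return mean(normal_errors) + n_sd * stdev(normal_errors)


def is_anomaly(error: float, threshold: float) -> bool:
    """Flag a window whose reconstruction error exceeds the threshold."""
    return error > threshold


# Hypothetical reconstruction errors (MSE) from normal-state windows.
normal_errors = [0.010, 0.012, 0.009, 0.011, 0.013, 0.010, 0.008, 0.011]
threshold = fit_threshold(normal_errors)
print(is_anomaly(0.011, threshold), is_anomaly(0.050, threshold))
```

Lowering `n_sd` trades more false alarms for fewer missed failures, which matches the stated preference for recall over precision.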
Experimental Setup: Real-time Predictions
Next Step: Long Short Term Memory (LSTM)
• Deep learning architecture in
the RNN family that
remembers values over arbitrary intervals [1].
• Overcomes known RNN issues
– limited memory
– instability
• Especially used for image, text
and speech applications
… and time series data
Ref: “Understanding LSTM Networks” by Christopher Olah (2015)
[1] Long Short-Term Memory, Hochreiter and Schmidhuber (1997)
RNN
LSTM
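To make the "remembers arbitrary intervals" point concrete, here is a scalar sketch of a single LSTM cell step using the standard gate equations. The weights are hand-picked rather than trained (an assumption for illustration): the forget gate is held near 1 and the input gate near 0, so a value stored in the cell state survives many steps of unrelated input.

```python
import math
from typing import Dict, Tuple

Gate = Tuple[float, float, float]  # (input weight, recurrent weight, bias)


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x: float, h: float, c: float, w: Dict[str, Gate]) -> Tuple[float, float]:
    """One step of a scalar LSTM cell; returns the new (hidden, cell) state."""
    def pre(gate: str) -> float:
        w_x, w_h, b = w[gate]
        return w_x * x + w_h * h + b

    f = sigmoid(pre("forget"))        # how much of the old cell state to keep
    i = sigmoid(pre("input"))         # how much of the candidate to write
    o = sigmoid(pre("output"))        # how much of the cell state to expose
    g = math.tanh(pre("candidate"))   # candidate value computed from the input
    c_new = f * c + i * g
    h_new = o * math.tanh(c_new)
    return h_new, c_new


# Hand-picked weights: forget gate saturated open, input gate saturated shut.
weights = {
    "forget": (0.0, 0.0, 10.0),
    "input": (0.0, 0.0, -10.0),
    "output": (0.0, 0.0, 10.0),
    "candidate": (1.0, 0.0, 0.0),
}
h, c = 0.0, 1.0  # start with the value 1.0 stored in the cell state
for x in [0.5, -0.3, 0.8] * 10:  # 30 steps of unrelated input
    h, c = lstm_step(x, h, c, weights)
print(c)
```

In a trained network the gates depend on the input and the previous hidden state, so the cell learns when to keep, overwrite or expose its memory instead of using fixed values as here.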
Implementation: Keras with TensorFlow Backend
• Similar design to
Autoencoder
• Encoder and decoder
are separate
• Model implemented with
Keras in Python but
executed by H2O Deep
Water
LSTM and H2O: Deep Water
• Keras model is trained through H2O
– Fast data ingest, missing value handling, ignoring
columns, etc. 2.5m/100 epoch
– MOJO output (binary model representation)
• Usable from any JVM language
• Just like H2O POJO!
• Prediction service infrastructure is reused
LSTM Results
[Figure: LSTM results for LinAccX, LinAccY and LinAccZ]
Conclusion
What We Didn’t Talk About (Much)
Security: System and Data
Reliability and Scalability
Machine learning logistics
Integration in a Factory
• Clever assembly of existing enterprise
software can do it with surprisingly small
time, effort and complexity
• H2O and MapR offer a fast path to value for
production ML
• LSTM doesn’t easily beat Autoencoders
without significant effort and expertise
• Converged platforms reduce complexity
Advanced Predictive Maintenance
Poster by J. Howard Miller (1943)
New: Machine Learning Logistics
Model Management in the Real World
O’Reilly book by Ellen Friedman & Ted Dunning © Sept 2017
Get free pdf copy of book courtesy of MapR:
https://mapr.com/ebook/machine-learning-logistics/
Visit MapR booth for free book signings & booth theater
presentations by the authors
Wed schedule:
Book signing: afternoon break 3:35 – 4:20 pm
Booth presentation by Ted Dunning: 3:00 – 3:30 pm
Thur schedule:
Book signing: morning break 10:45 – 11:20 am
Booth presentation by Ellen Friedman: 3:00 – 3:30 pm
Q&A
ENGAGE WITH US
mateusz@h2o.ai
mathieu.dumoulin@mapr.com
PROJECT GITHUB:
github.com/mdymczyk/iot-pipeline
Our thanks to:
LP RESEARCH
www.lp-research.com
contact: Klaus Peterson
klaus@lp-research.com
Thank you to LP-RESEARCH!
Hardware design and production
Expertise in Motion sensors
Gyroscope
Accelerometer
Magnetometer
Sensor fusion algorithm
development
Multi-platform application
development
See all our products: https://www.lp-research.com/products/
LPMS-B2, LPMS-CU2, LPMS-CANAL2, LPMS-USBAL2
OEM also available!

Optimizing Business Potential: A Guide to Outsourcing Engineering Services in...
 
Webinar - IA generativa e grafi Neo4j: RAG time!
Webinar - IA generativa e grafi Neo4j: RAG time!Webinar - IA generativa e grafi Neo4j: RAG time!
Webinar - IA generativa e grafi Neo4j: RAG time!
 
OpenChain Webinar: Universal CVSS Calculator
OpenChain Webinar: Universal CVSS CalculatorOpenChain Webinar: Universal CVSS Calculator
OpenChain Webinar: Universal CVSS Calculator
 
Your Vision, Our Expertise: TECUNIQUE's Tailored Software Teams
Your Vision, Our Expertise: TECUNIQUE's Tailored Software TeamsYour Vision, Our Expertise: TECUNIQUE's Tailored Software Teams
Your Vision, Our Expertise: TECUNIQUE's Tailored Software Teams
 
Top Software Development Trends in 2024
Top Software Development Trends in  2024Top Software Development Trends in  2024
Top Software Development Trends in 2024
 
Big Data Bellevue Meetup | Enhancing Python Data Loading in the Cloud for AI/ML
Big Data Bellevue Meetup | Enhancing Python Data Loading in the Cloud for AI/MLBig Data Bellevue Meetup | Enhancing Python Data Loading in the Cloud for AI/ML
Big Data Bellevue Meetup | Enhancing Python Data Loading in the Cloud for AI/ML
 
ARM Talk @ Rejekts - Will ARM be the new Mainstream in our Data Centers_.pdf
ARM Talk @ Rejekts - Will ARM be the new Mainstream in our Data Centers_.pdfARM Talk @ Rejekts - Will ARM be the new Mainstream in our Data Centers_.pdf
ARM Talk @ Rejekts - Will ARM be the new Mainstream in our Data Centers_.pdf
 
JS-Experts - Cybersecurity for Generative AI
JS-Experts - Cybersecurity for Generative AIJS-Experts - Cybersecurity for Generative AI
JS-Experts - Cybersecurity for Generative AI
 
Growing Oxen: channel operators and retries
Growing Oxen: channel operators and retriesGrowing Oxen: channel operators and retries
Growing Oxen: channel operators and retries
 
Kawika Technologies pvt ltd Software Development Company in Trivandrum
Kawika Technologies pvt ltd Software Development Company in TrivandrumKawika Technologies pvt ltd Software Development Company in Trivandrum
Kawika Technologies pvt ltd Software Development Company in Trivandrum
 
AI Embracing Every Shade of Human Beauty
AI Embracing Every Shade of Human BeautyAI Embracing Every Shade of Human Beauty
AI Embracing Every Shade of Human Beauty
 
About .NET 8 and a first glimpse into .NET9
About .NET 8 and a first glimpse into .NET9About .NET 8 and a first glimpse into .NET9
About .NET 8 and a first glimpse into .NET9
 

State of the Art Robot Predictive Maintenance with Real-time Sensor Data

  • 1. State of the Art Robot Predictive Maintenance with Real-time Sensor Data. Mateusz Dymczyk, Software Engineer @ h2o.ai; Mathieu Dumoulin, Data Engineer @ MapR. Strata New York 2017. © 2017 MapR Technologies, MapR Confidential.
  • 2. State of the Art Robot Predictive Maintenance with Real-time Sensor Data, Part 2. Mateusz Dymczyk, Software Engineer @ h2o.ai; Mathieu Dumoulin, Data Engineer @ MapR. Strata New York 2017.
  • 3. Mateusz Dymczyk and Mathieu Dumoulin. Mathieu Dumoulin: Data Engineer @ MapR Technologies; previously data scientist, DS manager, search, NLP and ML engineering in Canada and in Japan. Mateusz Dymczyk: Software Engineer @ H2O.ai; previously ML/NLP @ Fujitsu Laboratories and en-japan inc.
  • 4. Industry 4.0 is Now • $907B/year investment until 2020 [1] • 1.6M operational industrial robots in the world in 2015 [2] • 2.6M by 2020 [1]. Industry 4.0 systems [1]: 1. Interoperable 2. Information transparency 3. Technical assistance 4. Decentralized decision making. [1] What Everyone Must Know About Industry 4.0, Forbes, June 2016. [2] International Federation of Robotics (IFR) study, World Robotics 2016. Source: PwC 2016 Global Industry 4.0 Survey.
  • 5. Predictive Maintenance for Industrial Robots. Primary goal: Reduce unplanned downtime
  • 6. Robot Actuator Failure Prediction PoC. Model 6-axis industrial robot; LPMS-B2 wireless movement sensor. PoC goal: Predict potential actuator failure in real time (within 3s)
  • 7. Success criteria • Detect correct robot state (Normal/Failure) within 3s • Recall > precision • Improve over time once an “MVP” model is working. Photo: Ambient Intelligence Blog
  • 8. Need for Scale: Deploy to a Real Factory. Tesla Factory photo by Paul Sakuma/AP
  • 9. Don’t Reinvent the Wheel • We have limited time and budget for this PoC • Tools > assembly of existing software > coding • The state of the art is often OSS anyway!
  • 10. Video of the solution in action (2 min)
  • 11. PoC Building Blocks. People: 2 engineers from LP-RESEARCH, an ML engineer and a data engineer. Effort: 2 months part-time
  • 12. Experimental Setup
  • 13. Experimental Setup: Normal State
  • 14. Experimental Setup: Failure State
  • 15. Anomaly Detection for Predictive Maintenance
  • 16. Machine Learning Project Flow: Problem evaluation & definition → Data preparation → Explore and Analyze → Choose Algorithm → Build Model → Evaluate Model → Put into production
  • 17. Problem Definition 1. Starting Point – Classification problem – Time series data • Linear Acceleration X, Y, Z axis – No labeled data at first • Accumulate over time 2. Machine Learning goal/metrics – Recall vs. Precision 3. Additional Requirements – Detect state within 3 seconds. Diagram labels: Normal State (OK!), PREDICT FAILURE
  • 18. Data Source: Movement Sensor (LPMS-B2, www.lp-research.com) • Real-time, on-device calculation of linear acceleration – Data centered around 0 – Measurements in [-1,1] • Data output rates of up to 400Hz • Very sensitive
  • 19. Sensor Data Preparation • Feature selection (3 / 27 features) • Windowing – Window size: 200ms – Sensor data rate: 100Hz. Figure: 200ms window. Ref: 21 Great Articles and Tutorials on Time Series
  • 20. Modeling for Anomaly Detection • Unlabeled data -> unsupervised learning • Training data consists only of data during “normal state” runs – Only train on normal op. data • Conclusion: anomaly detection • Possible algorithms: – HMM – Autoencoders – LSTM autoencoders – KNN, Local Outlier Factor. Get Ted Dunning’s Anomaly Detection book
  • 21. First Model: Autoencoders • A kind of neural network used for unsupervised learning of efficient codings • Requires a training pass to learn a representation of “normal” data • Anomalous data will have a large reconstruction error compared to normal data. Ref: Längkvist, Martin, Lars Karlsson, and Amy Loutfi. "A review of unsupervised feature learning and deep learning for time-series modeling." Pattern Recognition Letters 42 (2014): 11-24.
  • 22. Experimental Setup: Training the Model
  • 23. Performance Evaluation • Evaluation dataset – Captured from a preprogrammed “pre-failure” operation mode – 1x full movement cycle of labeled (“pre-failure”) data • Normal 90%, Failure 10% • Performance measures: – MSE during training – TPR/FPR on the test dataset. Note: for an example with code, see https://machinelearningmastery.com
  • 24. ML Results. Note: time window: 200ms, threshold: 2SD
  • 25. Experimental Setup: Real-time Predictions
  • 26. Next Step: Long Short Term Memory (LSTM) • Deep learning architecture in the RNN family that remembers values over arbitrary intervals [1] • Overcomes known RNN issues – limited memory – instability • Especially used for image, text and speech applications … and time series data. Diagram: RNN vs. LSTM. Ref: “Understanding LSTM Networks” by Christopher Olah (2015). [1] Long Short-Term Memory, Hochreiter and Schmidhuber (1997)
  • 27. Implementation: Keras with TensorFlow Backend • Similar design to Autoencoder • Encoder and decoder are separate • Model implemented with Keras in Python but executed by H2O Deep Water
  • 28. LSTM and H2O: Deep Water • Keras model is trained through H2O – Fast data ingest, missing value handling, ignoring columns, etc. (2.5 min per 100 epochs) – MOJO output (binary model representation) • Usable from any JVM language • Just like H2O POJO! • Prediction service infrastructure is reused
  • 29. LSTM Results: plots for LinAccX, LinAccY and LinAccZ
  • 30. Conclusion
  • 31. What We Didn’t Talk About (Much): Security (system and data); Reliability and scalability; Machine learning logistics; Integration in a factory
  • 32. Advanced Predictive Maintenance • Clever assembly of existing enterprise software can do it with surprisingly little time, effort and complexity • H2O and MapR offer a fast path to value for production ML • LSTM doesn’t easily beat Autoencoders without significant effort and expertise • Converged platforms reduce complexity. Poster by J. Howard Miller (1943)
  • 33. New: Machine Learning Logistics: Model Management in the Real World. O’Reilly book by Ellen Friedman & Ted Dunning, © Sept 2017. Get a free PDF copy of the book courtesy of MapR: https://mapr.com/ebook/machine-learning-logistics/ Visit the MapR booth for free book signings & booth theater presentations by the authors. Wed schedule: book signing, afternoon break 3:35 – 4:20 pm; booth presentation by Ted Dunning, 3:00 – 3:30 pm. Thur schedule: book signing, morning break 10:45 – 11:20 am; booth presentation by Ellen Friedman, 3:00 – 3:30 pm
  • 34. Q&A. ENGAGE WITH US: mateusz@h2o.ai, mathieu.dumoulin@mapr.com. PROJECT GITHUB: github.com/mdymczyk/iot-pipeline. Our thanks to LP RESEARCH (www.lp-research.com), contact: Klaus Peterson, klaus@lp-research.com
  • 35. Thank you to LP-RESEARCH! Hardware design and production. Expertise in motion sensors: gyroscope, accelerometer, magnetometer. Sensor fusion algorithm development. Multi-platform application development. See all our products: https://www.lp-research.com/products/ LPMS-B2, LPMS-CU2, LPMS-CANAL2, LPMS-USBAL2. OEM also available!
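
The data preparation on slide 19 (200ms windows at a 100Hz data rate) and the "Threshold: 2SD" note on slide 24 can be illustrated with a small Python sketch. It is not taken from the project repository. It assumes non-overlapping windows, and it reads "2SD" as "mean plus two standard deviations of the reconstruction error measured on normal-state data"; the slides do not confirm either choice. The function names `make_windows`, `fit_threshold` and `is_anomaly` are hypothetical.

```python
import statistics

SAMPLE_RATE_HZ = 100  # sensor data rate quoted on slide 19
WINDOW_MS = 200       # window size quoted on slide 19
SAMPLES_PER_WINDOW = SAMPLE_RATE_HZ * WINDOW_MS // 1000  # 20 readings per window


def make_windows(samples, size=SAMPLES_PER_WINDOW):
    """Split (x, y, z) linear acceleration readings into flat feature vectors.

    Assumption: windows do not overlap and a trailing partial window is dropped.
    """
    windows = []
    for start in range(0, len(samples) - size + 1, size):
        chunk = samples[start:start + size]
        # Flatten [(x, y, z), ...] into [x0, y0, z0, x1, y1, z1, ...]
        windows.append([value for reading in chunk for value in reading])
    return windows


def fit_threshold(normal_errors, n_sd=2.0):
    """Derive an alert threshold from reconstruction errors on normal data.

    Assumption: threshold = mean + n_sd * standard deviation.
    """
    mean = statistics.mean(normal_errors)
    sd = statistics.pstdev(normal_errors)
    return mean + n_sd * sd


def is_anomaly(error, threshold):
    """Flag a window whose reconstruction error exceeds the threshold."""
    return error > threshold
```

With these numbers each window holds 20 readings, so a model consuming all three axes sees 60 values per window, and a prediction can be produced every 200ms, well inside the 3s requirement.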

Editor's notes

  1. Industry 4.0 is all about the digitization of the factory. Sensors everywhere. All this data opens up new opportunities for automation, cost savings, higher productivity and higher quality. What makes a system Industry 4.0? Interoperability: machines, devices, sensors and people that connect and communicate with one another. Information transparency: the systems create a virtual copy of the physical world through sensor data in order to contextualize information. Technical assistance: both the ability of the systems to support humans in making decisions and solving problems and the ability to assist humans with tasks that are too difficult or unsafe for humans. Decentralized decision-making: the ability of cyber-physical systems to make simple decisions on their own and become as autonomous as possible. Our talk will focus on data & analytics for improving the operational efficiency of factories with lots of industrial robots. We combine smart sensors, DB analytics (ML), cloud computing and AR to power a real-world, state of the art predictive analytics system.
  2. Order parts predictively. Increased factory efficiency. Robots operate at peak efficiency.
  3. We have a business goal, a robot and a sensor to work with. We are gonna have to data science the shit out of this. [9] https://www.quora.com/Which-type-of-Sensors-use-in-industrial-robots Interesting: no motion sensors! That’s the justification here.
  4. Based on known real-world requirements of state of the art Japanese car-parts manufacturers. Recall is more important than precision: a missed failure means unplanned downtime, which is exactly what the system is meant to prevent. False alarms still matter, because too many of them increase costs and make the system very hard to trust, but precision can initially be quite low and the system can still be useful. The models can then be improved over time.
  5. Scale with the number of sensors, robots and factories. GB a day quickly become many GB per hour or even per minute. This is comfortably handled on moderately sized clusters (5-25 nodes) using the current big data platforms used by Strata attendees.
  6. Working software over complex implementations that never get done
  7. When there is no anomaly, a green mark is displayed, as you have just seen.
  8. 20m mark
  9. 1) What do we even want?! 2) Data gathering; feature selection, extraction, engineering and data transformation 3) Pick all potential algorithms 4) Build a model using your library/tool of choice 5) Evaluate according to previously defined metrics 6) If not good enough, then try a different approach, different features or different method parameters 7) Otherwise extract the model and put it into production!
  10. Simplifications: data is centered around 0; data is scaled to [-1,1]; no missing values.
  11. Mention why we are doing it with machine learning at all! No hand-written rules: the model automatically learns the best parameters for each application without new coding, and it does not rely on supervised techniques. Especially good when we don’t know what we are looking for: machines can break in a variety of ways.
  12. Peeking: an ML modeling mistake where some of the data used to train a model includes information about the answer.
  13. Briefly mention what a “large” reconstruction error means. Discuss thresholds and why SD is a good choice.
  14. MSE measures the average of the squares of the errors or deviations, that is, the differences between the estimator and what is estimated. The MSE is a measure of the quality of an estimator: it is always non-negative, and values closer to zero are better. Note: RMSE is popular too. It is the square root of MSE, so it is expressed in the same units as the data, and it ranks models in the same order.
  15. Circle back to slide
  16. RNNs can remember their former inputs and operate over a sequence of vectors. Good for time series? Training with Back Propagation Through Time is unstable [1]. The effective limit of RNNs is 5-10 discrete time steps [2]. “Works slightly better (than RNN) in practice, owing to its more powerful update equation and some appealing backpropagation dynamics” (Andrej Karpathy). “I mentioned the remarkable results people are achieving with RNNs. Essentially all of these are achieved using LSTMs. They really work a lot better for most tasks!” (C. Olah). Text and speech: Google Translate, Apple’s Siri, Amazon’s Alexa.
  17. LSTM can learn in excess of 1000 discrete time steps. The algorithm is local in space and time. Computational complexity per time step and weight is O(1). Keras has an implementation of the LSTM layer. Lots of examples are available (TODO: 1, 2, 3).
  18. Keras and TF still need expertise unavailable to most engineers, but it’s a huge step in the right direction. Prediction throughput is slower and will need more engineering to make it work properly.
  19. Mention Convergence
  20. Have a clear plan for production. Data Science + Data Engineering = Win. Effort: data engineering > data science.
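
The evaluation measures mentioned in the notes and on slide 23 (MSE during training, TPR/FPR on the test set, recall vs. precision) can be written out in a few lines of Python. This is an illustrative sketch only, not the authors' evaluation code. It assumes the positive class is "Failure" (label 1), and it returns 0.0 when a denominator is empty, which is just a convention chosen here. All function names are hypothetical.

```python
import math


def mse(actual, reconstructed):
    """Mean squared error between an input window and its reconstruction."""
    if len(actual) != len(reconstructed):
        raise ValueError("inputs must have the same length")
    return sum((a - r) ** 2 for a, r in zip(actual, reconstructed)) / len(actual)


def rmse(actual, reconstructed):
    """Root mean squared error: the square root of MSE, in the data's units."""
    return math.sqrt(mse(actual, reconstructed))


def _ratio(numerator, denominator):
    # Convention for this sketch: an empty denominator yields 0.0.
    return numerator / denominator if denominator else 0.0


def evaluate(labels, predictions):
    """Compute TPR (recall), FPR and precision; 1 = Failure, 0 = Normal."""
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    tn = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 0)
    return {
        "tpr": _ratio(tp, tp + fn),        # recall: share of failures caught
        "fpr": _ratio(fp, fp + tn),        # share of normal windows flagged
        "precision": _ratio(tp, tp + fp),  # share of alerts that were real
    }
```

For a dataset with roughly 90% normal and 10% failure windows, as on slide 23, TPR and FPR are more informative than plain accuracy, because a model that never raises an alert would already be about 90% accurate.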