Demystifying Machine Learning for Manufacturing: Data Science for all
1. Infosys Confidential1 |1 Internet of Manufacturing Midwest 2018
Demystifying Machine Learning for Manufacturing: Data Science for All
Jeff Kavanaugh
June 7, 2018
2. Today’s discussion
Research
• Industry 4.0 maturity index and framework
• Future proof: learning and communication through data science and critical thinking
Machine Learning and Analytics Background
IIoT and AI in Practice
• Industrial IoT through facilities energy management
• Water treatment plant
• Automotive OEM predictive churn
4. Industry 4.0: Beyond the POC… the time to scale is upon us
Industry 4.0 integrates the physical and virtual worlds through technology enablers, which brings the fungibility and speed of software to manufacturing operations. The potential value created by Industry 4.0 vastly exceeds the low-single-digit cost savings that many manufacturers pursue today (acatech, Infosys, BCG, McKinsey, et al).
Disruptive technology enablers for Industry 4.0 are at a tipping point
*McKinsey, acatech, Infosys, BCG research
• 100x: disruptive digital innovation is 100x faster than physical disruption*
• 25B connected things forecasted to ship by 2020*
• 250M connected vehicles are forecasted to have some form of wireless network connection by 2020*
• $421B in cost and efficiency gains per annum
• $907B in annual digital investments
• $493B in digital revenue gains per annum
Industry 4.0 is changing manufacturing
But are we ready?....
5. Industry 4.0: Global study conducted on operations efficiency as a driver for competitiveness
• Industry 4.0 announced at Hannover Messe 2011, but systematic implementation still only 18%
• Current speed of implementation places the 2022 goal of 46% at risk
• Reason: data hurdles and piecemeal POC approach – unclear path
Approach to overcome barriers:
1. Evaluate your digital maturity
2. Proof of concepts to demonstrate business value, then scaled action
3. Set clear targets
4. Prioritize measures that will bring the most value to business
5. Demonstrate courage, persistence
Dimensions: Maintenance Efficiency, Information Efficiency, Energy Efficiency, Service Efficiency
• Vast majority (82%) of companies are aware of the high potential in implementing Industry 4.0 concepts
• 46% want to implement Industry 4.0 solutions systematically for enhanced asset efficiency by 2022
• Only 30% have implemented data-driven or intelligent services
[Chart: Industry 4.0 implementation status, 2017 vs. 2022 (directional findings). Legend: Potential recognized, Systematically implemented, Partly implemented, No awareness]
Source: Infosys and Institute for Industrial Management (FIR) at RWTH Aachen study conducted in 2015 and updated in 2017.
Sample size: 433 executives across industrial manufacturing sectors from China, France, Germany, UK and USA
OPERATIONS EFFICIENCY: The opportunity
Performance Efficiency, Engineering Efficiency
[Chart values: 15%, 4%, 32%, 18%, 35%, 32%, 18%, 46%]
6. Humans still matter! Industry 4.0 maturity is about more than the technology, and poor reasoning skills are constraining progress
A survey by PayScale Inc., an online pay and benefits researcher, showed 50% of employers complain that college graduates they hire aren’t ready for the workplace.
Their No. 1 complaint? Poor critical-reasoning skills
Source: UTD research study December 2017 and PayScale Inc., 2016
7. Industry 4.0 maturity drives significant efficiency improvement, and analytics is a fundamental requirement
Maturity stages, from near-term to long-term:
• Computerization: e.g. CNC milling machine, but not connected
• Business applications connected to each other
• Up-to-date digital model (Digital Shadow) to show what’s happening now
• Big Data analytics to understand root causes
• Advanced analytics for simulation and identification of most likely scenarios
• Automated decision making and actions
Source: Industrie 4.0 Maturity Index, acatech study supported by Infosys, 2017
8. Machine Learning is an important component in Industry 4.0 analytics
Offerings: Applied Machine Learning, Computer Vision, Unstructured Text Analytics, Other AI Offerings
Deep Neural Networks
Data Analytics Cognitive Time & AI Infrastructure
Applied Machine Learning decision flow:
• Predict categories? Yes: Labeled data? Yes: Classification. No: Clustering / Anomaly Detection
• Predict categories? No: Predicting values? Yes: Regression. No: Dimension Reduction
Other capabilities: High-Fidelity Speech Synthesis, Video Analysis, Image Insights / Comparison, Named Entity Extraction, Chat Bots, Knowledge Management, Language Translation, Text Extraction
9. Machine Learning involves solving business problems using 25+ algorithms segmented into 4 groups
http://scikit-learn.org/stable/tutorial/machine_learning_map/index.html
Solving a Machine Learning (ML) problem depends on finding the right algorithms for the business problem
Different algorithms are better suited for different types of data and different problems
We have found the Python scikit-learn flowchart useful for selecting ML algorithms specific to the business problem and available data
Flowchart annotations:
• Yes/No; quality pass/fail
• Addresses too much sensor data!
• Old Faithful (describes relationships)
• Groups into similar characteristics
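The yes/no questions from the preceding decision flow (predict categories? labeled data? predicting values?) can be written down as a small function. This is only a sketch of that selection logic: the function name and parameter names are hypothetical, and the real scikit-learn flowchart has further branches (sample size, data type) that are not modeled here.

```python
def choose_algorithm_family(predict_category: bool,
                            labeled_data: bool = False,
                            predict_value: bool = False) -> str:
    """Walk the yes/no questions and return one of the four algorithm groups."""
    if predict_category:
        # Categories with known labels: classification; without labels: clustering.
        return "classification" if labeled_data else "clustering / anomaly detection"
    if predict_value:
        # Predicting a continuous quantity.
        return "regression"
    # Neither categories nor values: reduce the number of variables.
    return "dimensionality reduction"


# Example: quality pass/fail with historical inspection results (labels) available.
print(choose_algorithm_family(predict_category=True, labeled_data=True))
```

A pass/fail quality question with labeled history lands in classification, while unlabeled sensor feeds land in clustering / anomaly detection.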
10. Machine Learning platforms (tools) have a large library of algorithms, designed to address different types of business problems
Classification
Identify category to which an object belongs
Applications: spam detection, image recognition, quality P/F
Algorithms: Support Vector Machine (SVM), Stochastic Gradient Descent (SGD) Classifier, k-Neighbors Classification, Random Forest / Decision Trees
Regression
Predicts a continuous-valued attribute associated with an object
Applications: forecasting, pricing determination
Algorithms: Stochastic Gradient Descent (SGD) Regressor, Lasso, Ridge Regression, Elastic Net
Clustering and Anomaly Detection
Automatic grouping of similar objects into sets
Applications: visualization, sensor feeds, efficiency improvement
Algorithms: K-Means Clustering, Gaussian Mixture Models, Mean Shift, Spectral Clustering
Dimensionality Reduction
Reduce the number of random variables to consider
Applications: customer feed (Twitter) segmentation, grouping data
Algorithms: Randomized Principal Component Analysis (PCA), Kernel Approximation, Locally Linear Embedding, Spectral Embedding
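To make the classification group concrete, here is a minimal hand-written k-nearest-neighbors classifier for a quality pass/fail decision. It uses only the Python standard library rather than any particular ML platform, and the inspection readings (two made-up sensor features per part) are hypothetical illustration data, not results from the talk.

```python
from collections import Counter
from math import dist


def knn_predict(train, query, k=3):
    """Return the majority label among the k training points closest to query.

    train is a list of (features, label) pairs; features are numeric tuples.
    """
    nearest = sorted(train, key=lambda row: dist(row[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical labeled parts: (temperature deviation, vibration level) -> result
train = [
    ((0.1, 0.2), "pass"), ((0.2, 0.1), "pass"), ((0.3, 0.3), "pass"),
    ((1.1, 0.9), "fail"), ((0.9, 1.2), "fail"), ((1.3, 1.0), "fail"),
]

print(knn_predict(train, (0.2, 0.25)))  # a part close to the known-good cluster
print(knn_predict(train, (1.0, 1.0)))   # a part close to the known-bad cluster
```

The same labeled data could be fed to any of the listed classifiers; k-neighbors is used here only because it fits in a few lines.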
11. Machine learning techniques are organized by ability to learn
Traditional Data Analysis
Flow: Raw Data → Analyze and Write Rules → Use Model / Improve
“If” and “Else” decisions designed by humans, coupled with functions (e.g. Excel functions), to process data or adjust to user input
Changing the task might require a rewrite of the model
Supervised Machine Learning
Flow: Raw Data → Features / Labels → Train and Evaluate → Trained Model → Deploy / Improve
The training data one feeds to the algorithm includes the desired solutions, called labels
Based on learning, the algorithm provides outputs for new real-time inputs
Examples: Classification, Regression
Unsupervised Machine Learning
Flow: Raw Data → Algorithms → Cluster / Anomalies → Use in supervised learning
Only the input data is known, and no known output data is given to the algorithm. These techniques are usually harder to understand and evaluate
Examples: Clustering (identifying topics in a set of blog posts, segmenting customers into groups with similar preferences, detecting abnormal access patterns to a website)
Reinforcement Machine Learning
Flow: Goal → Initialize Agent → Environment → Action → Reward / Penalty
The learning system, called an Agent, can observe the Environment, select and perform Actions, and get Rewards in return, or Penalties in the form of negative rewards
Examples: DeepMind’s AlphaGo, walking robot, automated trader
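The contrast between traditional rule writing and supervised learning can be sketched in a few lines. In the first function a person picks the threshold; in the second, the threshold is derived from labeled examples. The temperature readings, the labels, and the 80.0 cutoff are all hypothetical, and a one-threshold "decision stump" stands in for a real model.

```python
def rule_based(temp_c: float) -> str:
    """Traditional approach: a human chooses the cutoff and writes the rule."""
    return "alarm" if temp_c > 80.0 else "ok"


def learn_threshold(samples):
    """Supervised approach: pick the cutoff that misclassifies the fewest labels.

    samples is a list of (value, label) pairs with label "ok" or "alarm".
    Candidate cutoffs are the midpoints between neighboring sorted values.
    """
    values = sorted(v for v, _ in samples)
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]

    def errors(cutoff):
        return sum((v > cutoff) != (label == "alarm") for v, label in samples)

    return min(candidates, key=errors)


# Hypothetical labeled history: bearing temperature and what the operator recorded.
labeled = [(61.0, "ok"), (66.0, "ok"), (70.0, "ok"),
           (74.0, "alarm"), (79.0, "alarm"), (85.0, "alarm")]

threshold = learn_threshold(labeled)
print(threshold)  # cutoff learned from the labels instead of chosen by hand
```

If the labeled history changes, the learned cutoff moves with it, whereas the hand-written rule has to be edited.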
12. Common machine learning use cases in a manufacturing context
Order to Cash | Core Manufacturing | Procure to Pay | Record to Report
Demand estimation - order quantities | Predictive maintenance | Contracts analysis for named entity | FP&A forecasting
Anomaly detection: credit risk | Tech support / knowledge base | Commodity price forecasting | Real-time monitoring of foreign exchange
Order entry automation | In-line quality inspection | Consistent supplier terms |
Automated inventory stocking for service truck | Defect root cause and corrective action | Long tail spend analytics |
Simplification and automation of manual services billing | Production planning and scheduling | |
Demand forecasting using sales pipeline | | |
13. Industry 4.0: Illustrative Case Studies
Industrial Examples
14. Energy matters! Industrial IoT aids energy optimization in Infosys campuses
Business Need: Sustainability initiative at Infosys and implementation using IIoT solution
Large Campuses: 80 million+ square feet
Assets Managed:
• Chillers
• HVAC
• Generators
• Elevators
• Sewage treatment plants
• Solar power plants
Solution Approach: Central Command Center, Demand Management, Digital Twin and Optimum Operating Conditions, Predictive Maintenance
Business Benefit:
• 46% reduction in per-capita energy consumption over 8 years
• $100 million savings over 3 years
15. Visibility
[Diagram: path of development through Visibility, Transparency, Predictability, and Adaptability; vertical axis: maturity level / business value]
Centralized command center for real-time visibility
• Real-time data acquisition
• Visibility to key operating parameters
16. Transparency and the Digital Twin
Analyzing performance – As Designed vs As Installed vs As Operated
Plot of Critical Performance Parameters
• Condenser Water Delta (leaving temp – entering temp)
• Chiller Water Delta (leaving temp – entering temp)
• Evaporator Small Temp Diff (Ref. Sat temp – Chiller Water leaving temp)
• Condenser Small Temp Diff (Ref. Sat temp – Condenser Water leaving temp)
• Chiller Working Hours
Digital Model complements physical assets
Study operating conditions, trends and performance
17. Predictability
Steps: Data Collection, Data Cleansing, Correlation Analysis, Exploratory Data Analysis, Event Detection, Prognostics
• Identification of key performance indicators
• Exploratory analysis and visualization of data
• Event detection – Hotelling’s T-squared and quartile-based method
• Prognostics – ARIMA model with xreg variable
• Knowledge model development
Implemented advanced analytics on chiller data for event detection and prognostics
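One common reading of a "quartile-based method" for event detection is the interquartile-range (IQR) rule: flag any reading that falls more than k times the IQR outside the first or third quartile. The sketch below assumes that interpretation; the chiller readings and the k = 1.5 multiplier are hypothetical, and the multivariate Hotelling's T-squared test named on the slide is not shown.

```python
from statistics import quantiles


def quartile_outliers(values, k=1.5):
    """Return (index, value) pairs lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)  # quartile cut points of the sample
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [(i, v) for i, v in enumerate(values) if v < low or v > high]


# Hypothetical condenser water delta readings (degrees C), one per interval.
readings = [5.1, 5.0, 5.2, 4.9, 5.1, 5.0, 9.8, 5.2, 5.0, 4.8]

print(quartile_outliers(readings))  # flags the 9.8 reading at index 6
```

A univariate rule like this looks at one parameter at a time, which is why a multivariate statistic is useful alongside it when several chiller parameters move together.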
18. Example: greenfield waste water treatment plant’s pumping station
• Plant: State-of-the-art waste water treatment plant in Europe
• Three operating scenarios:
1) Average flow
2) Average + Industry peak flow
3) Peak flow (heavy rain, flood)
• Three different design solutions for the pumping station to be analyzed
Case 1: 4 big pumps and 3 small pumps (original design requirement from the Client)
Case 2: 3 big pumps and 3 small pumps
Case 3: 2 big pumps and 3 small pumps
FOCUS: The strict environmental permit must be fulfilled, which means that effluent from the pumping station to the environment is not acceptable
TARGET: To find the optimal design solution that fulfils the required availability and safety with minimum lifecycle cost
EVALUATE: To create a RAMS simulation model of the different design alternatives with different operation and maintenance scenarios
RESULTS: To find the design solution that fulfils the required availability and safety with minimum lifecycle cost
19. Moving from a traditional RAMS* to AI-based RAMS design enabled faster decision-making with more accuracy
[Diagram: RAMS Design Process with the Drishti 4.0 Operational Excellence AI Platform]
System Design and Realization Processes
• Integrated Operation
• Defined design solutions: design basis, specification, objectives and requirements
• Supplier-specific Work Packages; selected WP suppliers
• Supply management
• Technical Performance & Availability; Operability & Maintainability; Safety & O&M Cost
Application of Prognostics and Health Management
• Instrumented process; automated identification and data capture
• Healthy Baseline, In-Situ Monitoring, Identify parameter
• Anomaly? No: continue monitoring. Yes: Alarm, Parameter Isolation, Failure Definition
• Data-Driven Models, Physics of Failures Model, RAMS database
• Remaining Useful Life Estimation
Maintenance
• Identification of critical RCM positions
• Definition of Maintenance Categories for RCM positions
• Development and Planning of CBM, TBM and CM tasks
• Optimization of the plant-specific maintenance service program to achieve required availability and safety with minimum costs
* RAMS = Reliability, Availability, Maintainability, Safety
20. Visualization was an important tool for making decisions and interventions based on analytics recommendations
Plant performance KPIs: RAMS and Risk Index
Downtime insights: troubled asset, reason for failures, cost savings
21. Condition-based monitoring used AI to recommend pump maintenance and proactive interventions
Results of RAMS analyses of three design solution cases with current maintenance service program (without CBM = Condition Based Maintenance) and with Drishti 4.0* operational excellence AI platform (= with CBM)
Pumping station operational time: 30 a
Design case | Maintenance service program | Reliability and Availability requirements are fulfilled | Total Life Cycle costs (€)
Case 1: 4 big and 3 small pumps | Without CBM | YES | 2,307,358
Case 1: 4 big and 3 small pumps | With CBM | YES | 2,107,910
Case 2: 3 big and 3 small pumps | Without CBM | YES | 1,743,175
Case 2: 3 big and 3 small pumps | With CBM | YES | 1,568,232
Case 3: 2 big and 3 small pumps | Without CBM | NO | 1,486,811
Case 3: 2 big and 3 small pumps | With CBM | NO | 1,308,584
Case 1 without CBM: the current design solution
Case 2 with CBM: recommended design solution
1) No environmental risks
2) LCC costs are 740k€ less than the original design solution
RAMS design savings: 565k€
With CBM, LCC savings: 175k€
Case 3: not acceptable because of violation of environmental permit
* Dhristri 4.0: dhristri.com
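The rounded savings quoted for the recommended solution can be reproduced from the life cycle cost figures on this slide. The sketch below only restates that arithmetic; the dictionary layout and variable names are illustrative.

```python
# Total life cycle costs in euros, copied from the results on this slide.
LCC = {
    ("Case 1", "without CBM"): 2_307_358,  # the current (original) design solution
    ("Case 1", "with CBM"): 2_107_910,
    ("Case 2", "without CBM"): 1_743_175,
    ("Case 2", "with CBM"): 1_568_232,     # the recommended design solution
}

baseline = LCC[("Case 1", "without CBM")]

# Savings from the design change alone (Case 1 to Case 2, both without CBM).
design_savings = baseline - LCC[("Case 2", "without CBM")]

# Additional savings from condition-based maintenance within Case 2.
cbm_savings = LCC[("Case 2", "without CBM")] - LCC[("Case 2", "with CBM")]

# Total savings of the recommended solution versus the original design.
total_savings = baseline - LCC[("Case 2", "with CBM")]

print(design_savings, cbm_savings, total_savings)
```

The exact differences are 564,183 €, 174,943 € and 739,126 €, which the slide rounds to roughly 565k€, 175k€ and 740k€.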
22. Machine Learning in action: Churn Prediction
Major Auto Manufacturer
23. The customer digital services cycle can be defined in the shape of a funnel, and at each stage, there is churn. How do we reduce churn?
Funnel stages:
• Purchased vehicle
• Enrolled in trial (1 year)
• Converted to paid subscription
• Renewed paid subscription
• At each stage of the funnel, we lose customers
– What can we do to reduce churn at each stage?
• By using customer and vehicle data across each stage, we can use machine learning to predict a customer’s likelihood to churn (i.e. customer does not progress to the next stage of the funnel)
Typical cloud connected features: Find my car, Remote Climate Start, Remote Lock/Unlock
Are there usage trends or customer behaviors that can predict a customer’s likelihood of churning?
24. Initial subscriptions present the biggest opportunity for improvement
Luxury Brand (current annual sales: ~300,000)
• 13% (33,600 vehicles): enrolled in trial and converted to paid
• 64% (243,200 vehicles): enrolled in trial but did not convert to paid subscription
• 23% (64,400 vehicles): did not enroll in trial
Mass Market Brand, select models (annual sales, projected 2019: ~1,500,000)
• 9% (136,800 vehicles): enrolled in trial and converted to paid
• 46% (699,200 vehicles): enrolled in trial but did not convert to paid subscription
• 45% (684,000 vehicles): did not enroll in trial
Increasing initial paid subscriptions is the largest improvement opportunity
25. We gathered all relevant usage and subscription data
• We gathered data for new vehicles that were sold and enrolled in a trial during one month of 2016
– These vehicles were up for renewal in the following year; total vehicles in sample: ~24,000
[Timeline: Sep, Oct, Nov, Dec, Jan, Feb, Mar, Apr, May, Jun, Jul, Aug, Sep; Vehicle Sale at the start, Service Renewal at the end]
• Next, we collected all usage and sales data for these vehicles for the month before the renewal (~35,000 total commands)
– What specific commands were used by each vehicle? e.g. Remote Lock, Remote Start, Vehicle Finder, etc.
– What year / model was the vehicle?
– What was usage on the weekday vs the weekend for each vehicle?
– What metropolitan area was the selling dealer in?
– Did the customer subscribe to paid services?
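The questions above amount to turning a raw command log into one feature row per vehicle. A minimal sketch of that aggregation follows; the log layout (VIN, command name, date), the command names, and the feature names `weekday_uses` / `weekend_uses` are assumptions for illustration, since the talk does not describe the actual data schema.

```python
from collections import Counter, defaultdict
from datetime import date


def build_features(commands):
    """Aggregate (vin, command, day) records into per-vehicle usage counts."""
    features = defaultdict(Counter)
    for vin, command, day in commands:
        row = features[vin]
        row[command] += 1  # how often each specific command was used
        # date.weekday() returns 5 for Saturday and 6 for Sunday.
        row["weekend_uses" if day.weekday() >= 5 else "weekday_uses"] += 1
    return {vin: dict(row) for vin, row in features.items()}


# Hypothetical command log for the month before renewal.
log = [
    ("SN0001", "lock", date(2017, 8, 5)),          # Saturday
    ("SN0001", "finder", date(2017, 8, 7)),        # Monday
    ("SN0001", "lock", date(2017, 8, 8)),          # Tuesday
    ("SN0002", "remote_start", date(2017, 8, 6)),  # Sunday
]

features = build_features(log)
print(features["SN0001"])
```

Vehicle attributes such as model year or dealer metropolitan area would be joined onto these rows, with the paid-subscription outcome as the label.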
[Timeline labels: Trial Period, Usage Analysis; years 2016 and 2017]
After getting the right data, we can build a model to answer the overarching question:
Which customers will subscribe?
26. After a number of tuning iterations, the model enables churn prediction on an individual basis
• The classification model was tuned over multiple iterations using Microsoft Azure, in order to find the ideal level of accuracy and resiliency measured with the Receiver Operating Characteristic (ROC)
– Certain data was removed from the model to improve model quality
The tuned model enables us to determine the churn probability for each customer
Infosys Churn Model POC, 2018
ROC chart annotations:
• This curve would indicate we could predict every single customer perfectly (impossible!)
• This straight line would indicate we are guessing randomly
• This is the accuracy of our model using limited data; it is expected to improve as we add more data points, e.g. demographics
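The ROC comparison can be summarized in one number, the area under the curve (AUC): 1.0 corresponds to the perfect curve and 0.5 to the random-guess diagonal. The sketch below computes AUC directly as the share of (renewed, not renewed) customer pairs that the model ranks correctly. The labels and scores are made up for illustration, and this is not the Azure tooling used in the POC.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise ranking.

    labels are 1 (positive, e.g. renewed) or 0 (negative); scores are the
    model's predicted probabilities. Ties between a positive and a negative
    count as half a correct ranking.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical outcomes and scores for four customers.
print(roc_auc([1, 1, 0, 0], [0.9, 0.8, 0.3, 0.1]))  # every pair ranked correctly
print(roc_auc([1, 0, 1, 0], [0.9, 0.8, 0.4, 0.1]))  # one of four pairs misranked
```

A model that gives every customer the same score lands at 0.5, matching the straight line on the chart.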
VIN: SN0001
Model: SUV MODEL A
Weekday uses: 48
Weekend uses: 18
Remote: 0
Status: 0
Finder: 31
Lock: 35
Renewal probability: 73.9%
Over the course of 100+ iterations, the machine learning algorithm uses the training set to build a decision tree based on the input data. Sample decision branches:
• Is weekend usage greater than weekday usage?
• Was the car purchased in an area with extreme weather?
Data removed: repeat customers, extraneous data fields (e.g. VIN)
27. We can now choose customers to address, to maximize profitability
Net profit (in $000s) per year, 1-year-old vehicles only
Probability of renewal (%) | Campaign conversion rate 2% | 3% | 4% | 5%
5 | 16 | 38 | 59 | 81
10 | 12 | 49 | 87 | 125
15 | (9) | 44 | 96 | 149
20 | (36) | 30 | 96 | 161
25 | (59) | 13 | 85 | 158
30 | (80) | (3) | 74 | 150
35 | (95) | (15) | 65 | 145
40 | (109) | (26) | 57 | 141
Example scenario:
• Incentives are 4% effective
• Customers < 20% likely to renew
• Profit in first year: $96k
• This is maximum cumulative profit for the scenario
Table annotations:
• Campaign conversion rate: how effective customer incentives are (“change their minds”)
• Probability of renewal: we choose the retention (renewal) probability of customers to address
• Cell values: additional net profit per year (000s)
Based on incentive effectiveness, we can maximize value by choosing the targets
This is relevant for many manufacturing scenarios involving diminishing returns
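The net profit figures on this slide can be searched programmatically for the best renewal-probability cutoff at a given campaign conversion rate. The sketch below hard-codes the table values (in $000s, with parenthesized figures entered as negatives); the dictionary layout and function name are illustrative, not part of the original model.

```python
# Net profit in $000s: renewal-probability cutoff (%) -> {conversion rate (%): profit}
PROFIT_K = {
    5:  {2: 16,   3: 38,  4: 59, 5: 81},
    10: {2: 12,   3: 49,  4: 87, 5: 125},
    15: {2: -9,   3: 44,  4: 96, 5: 149},
    20: {2: -36,  3: 30,  4: 96, 5: 161},
    25: {2: -59,  3: 13,  4: 85, 5: 158},
    30: {2: -80,  3: -3,  4: 74, 5: 150},
    35: {2: -95,  3: -15, 4: 65, 5: 145},
    40: {2: -109, 3: -26, 4: 57, 5: 141},
}


def best_cutoffs(conversion_rate):
    """Return (max profit, list of cutoffs achieving it) for one conversion rate."""
    column = {cutoff: row[conversion_rate] for cutoff, row in PROFIT_K.items()}
    best = max(column.values())
    return best, [cutoff for cutoff, value in column.items() if value == best]


print(best_cutoffs(4))  # incentives that are 4% effective
```

At a 4% conversion rate the maximum of $96k is reached at both the 15% and 20% cutoffs, and profit falls beyond that, which is the diminishing-returns pattern the slide describes.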
[Chart: ~22,000 customers (VINs) ordered from 0% likely to renew to 100% likely to renew, with segments marked for less than 10% and less than 25% likely to renew]
• We can choose which customers to reach out to
• This enables better efficiency of resources
VIN Scored Probabilities
SN00001 0.121978149
SN00002 0.48944521
SN00003 0.054602593
SN00004 0.196847379
SN00005 0.128807783
Actual model output (VIN data is anonymized)
Sample implementation using machine learning output*
* Using Azure Machine Learning Studio
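Once each VIN has a score, choosing whom to reach out to is a simple filter. The sketch below uses the five sample rows shown on this slide and assumes the scored probability is the likelihood of renewal; the function name and the cutoff values passed to it are illustrative.

```python
# Sample scored output from the slide (anonymized VIN -> scored probability).
# Assumption: the score is the customer's probability of renewing.
scored = {
    "SN00001": 0.121978149,
    "SN00002": 0.48944521,
    "SN00003": 0.054602593,
    "SN00004": 0.196847379,
    "SN00005": 0.128807783,
}


def select_targets(scores, cutoff):
    """Return the VINs whose score is below the cutoff, in sorted order."""
    return sorted(vin for vin, p in scores.items() if p < cutoff)


print(select_targets(scored, 0.25))  # customers less than 25% likely to renew
print(select_targets(scored, 0.10))  # customers less than 10% likely to renew
```

Changing the cutoff trades campaign reach against cost, which is the choice the net profit table is meant to inform.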
28. Continuing the conversation….
Jeff Kavanaugh
Partner, Manufacturing
Infosys Consulting
jeff_kavanaugh@infosys.com
Adjunct Professor
University of Texas at Dallas
jeff_kavanaugh@utdallas.edu
@jeffkav
www.infosysconsultinginsights.com
http://bit.ly/2qzanfr
www.infosys.design/plantio
Foundational skills for learning in the age of AI (Amazon, B&N)
https://www.infosys.com/age-of-ai/