Concept Neurons –
Handling Drift Issues for
Real-Time Industrial Data Mining
Luis Moreira-Matias .::. luis.matias@neclab.eu
Intelligent Transport Systems Group
Social Solutions Research division
NEC Laboratories Europe, Heidelberg, DE
Joao Gama, Joao Mendes-Moreira
University of Porto and LIAAD INESC-TEC, Portugal
Riva Del Garda, Italy @ ECML/PKDD .::. September, 2016
Outline
● Problem Overview (Real-Time Industrial DM)
● Notes on Concept Drift phenomenon
● Concept Neurons
● Case Studies
● Experiments
● Final Remarks
3 © NEC Corporation 20152016
Problem Overview
Increasing interest in Analytics/Data Science in recent years has pushed software engineers into the game!
Trends
● Tons of new “Data Scientist” roles filled by people with a fundamentally programming background;
● Real-Time Data Processing;
● Off-the-shelf libraries -> Offline Machine Learning temptation!!
Unrealistic assumptions lead to suboptimal results!
Notes on Concept Drift phenomenon
$\exists X : p_t(y \mid X) \neq p_{t+1}(y \mid X)$
This guy still does a fair job under drift... but for how long?
How much are you losing by relying on an inaccurate model?
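The question above can be made concrete with a toy stream. The following is a minimal sketch, not taken from the slides: the synthetic data, the drift point `drift_at` and the helper names are all assumptions chosen for illustration. It trains a model once and measures how its error changes after $p_t(y \mid X)$ shifts.

```python
import random


def make_stream(n, drift_at, seed=0):
    """Yield (x, y) pairs; the mapping from x to y changes at index drift_at."""
    rng = random.Random(seed)
    for t in range(n):
        x = rng.uniform(0.0, 1.0)
        slope = 2.0 if t < drift_at else -1.0  # p_t(y|X) changes here
        yield x, slope * x + rng.gauss(0.0, 0.05)


def fit_slope(pairs):
    """Least-squares slope of a line through the origin."""
    sxy = sum(x * y for x, y in pairs)
    sxx = sum(x * x for x, _ in pairs)
    return sxy / sxx


def mean_abs_error(slope, pairs):
    return sum(abs(y - slope * x) for x, y in pairs) / len(pairs)


stream = list(make_stream(n=400, drift_at=200))
static_slope = fit_slope(stream[:200])  # trained once, never updated

err_before = mean_abs_error(static_slope, stream[:200])
err_after = mean_abs_error(static_slope, stream[200:])
print(f"MAE before drift: {err_before:.3f}, after drift: {err_after:.3f}")
```

The gap between the two errors is the cost of relying on a model that was accurate only for the old concept.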
● Real-Time Data Mining must cope with Concept Drift!
● Adaptive/Online Learning schemes are not yet popular among off-the-shelf libraries;
● Different types of drift require different drift handling mechanisms;
Image taken from: Gama, J., Žliobaitė, I., Bifet, A., Pechenizkiy, M., & Bouchachia, A. (2014). A survey on concept drift adaptation. ACM Computing Surveys (CSUR), 46(4), 44.
● How can we adapt existing off-the-shelf algorithms to resist drift without a large empirical/fundamental effort?
Business Value of Real-Time DM
Examples of the value of being able to react to drift:
● Transportation
  ● Highway Congestion Prediction (for traffic control purposes);
  ● Travel Time Prediction (for navigational purposes);
● Recommendation Systems
  ● Retail (highly popular new product);
  ● Media (highly popular new movie);
● Communications
  ● Security Failures (new virus signature);
  ● Fraud Detection (bank transactions / mobile phone carriers);
Concept Neuron – Base Idea
● Base Real-Time DM Learning Schema
[Diagram: learning schema with components Data (X), Feedback (y), Memory, Loss Estimation, Model Learning, Change Detection, Prediction and Alarm, labeled A to F.]
Image adapted from: Gama, J., Žliobaitė, I., Bifet, A., Pechenizkiy, M., & Bouchachia, A. (2014). A survey on concept drift adaptation. ACM Computing Surveys (CSUR), 46(4), 44.
Asynchronous Concept Neuron (ACN)
● Assumptions
● Our samples are generated by a distribution which drifts constantly but somewhat slowly;
● An offline learning model is trained periodically on the most recent (windowed) data;
● Example of possible base learners:
● Multilayer Perceptron (with and without kernels);
● Least Squares (with and without kernels, l1/l2 norms);
● Multivariate Adaptive Regression Splines (MARS);
● Time Series Analysis (e.g. ARIMA, Exponential Smoothing);
[Diagram: ACN learning schema with components Data (X), Feedback (y), Windowed Memory, Loss Estimation, Offline Model Learning, Model Update and Prediction, labeled A to E.]
● Modus Operandi
1. An offline model is periodically trained on the samples that arrived within a given window T;
2. Loss is estimated for each arriving sample;
3. The model is updated in the direction opposite to the loss gradient;
● Additional Assumption (implied):
● Convergence is still possible, at a slower rate, when the updates are done incrementally with a sufficiently small learning rate and an adequate learner;
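The three steps above can be sketched as follows. This is an illustrative reading rather than the authors' implementation: the class name `AsyncConceptNeuron`, the one-dimensional linear least-squares base learner, the squared loss, and the parameters `window`, `retrain_every` and `lr` are all assumptions made for the example.

```python
from collections import deque


class AsyncConceptNeuron:
    """Hypothetical ACN sketch: periodic offline refit plus per-sample gradient steps."""

    def __init__(self, window=50, retrain_every=50, lr=0.05):
        self.memory = deque(maxlen=window)  # windowed memory of (x, y) pairs
        self.retrain_every = retrain_every
        self.lr = lr
        self.w = 0.0
        self.b = 0.0
        self.seen = 0

    def predict(self, x):
        return self.w * x + self.b

    def _fit_offline(self):
        # Step 1: closed-form least squares on the samples in the window.
        n = len(self.memory)
        mean_x = sum(x for x, _ in self.memory) / n
        mean_y = sum(y for _, y in self.memory) / n
        sxx = sum((x - mean_x) ** 2 for x, _ in self.memory)
        sxy = sum((x - mean_x) * (y - mean_y) for x, y in self.memory)
        if sxx > 0:
            self.w = sxy / sxx
            self.b = mean_y - self.w * mean_x

    def update(self, x, y):
        # Step 2: estimate the loss of the current model on the new sample.
        error = self.predict(x) - y
        loss = 0.5 * error ** 2
        self.memory.append((x, y))
        self.seen += 1
        if self.seen % self.retrain_every == 0:
            self._fit_offline()
        else:
            # Step 3: move against the gradient of the squared loss.
            self.w -= self.lr * error * x
            self.b -= self.lr * error
        return loss


# Synthetic stream whose slope drifts slowly from 1.0 to about 1.5.
acn = AsyncConceptNeuron()
losses = []
for t in range(500):
    x = (t % 10) / 10.0
    y = (1.0 + 0.001 * t) * x + 0.5
    losses.append(acn.update(x, y))
```

The cheap gradient steps keep the model close to the drifting concept between refits, so the expensive offline fit only has to run once per window instead of once per sample.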
Synchronous Concept Neuron (SCN)
● Assumptions
● Our samples are generated by a distribution which may drift recurrently (however, it is stationary for some periods of time);
● These drift events are usually limited in time; if not, the model needs to be fully re-trained;
● An (online) learning model is trained periodically on the most recent (windowed) data;
● Example of possible base learners:
● Any type!
[Diagram: SCN learning schema with components Data (X), Feedback (y), Windowed Memory, Loss Estimation, Model Learning, Change Detection, Prediction and Prediction Corrections, labeled A to E.]
● Modus Operandi
1. A model is periodically trained on the samples that arrived within a given window T;
2. If a drift alarm is triggered, the most recent residuals are used to update the model’s predictions directly (greedy);
3. If a novel drift is detected on the base model outputs, the learning rate is increased;
4. If no drift is detected for some number of periods (a hyperparameter), the greedy updates are turned off.
● Additional Assumption (implied):
● As the drift typically re-occurs at a fast rate but is limited in time, we can rely on our model while simply updating its outputs to guarantee no divergence.
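One possible reading of the four steps above is sketched below. Everything concrete in it is an assumption made for illustration: the Page-Hinkley test as change detector, the window mean as a stand-in base model, the class names, and the parameters `patience`, `lr` and `lr_boost`. The slides do not specify these choices.

```python
from collections import deque


class PageHinkley:
    """Page-Hinkley test for an upward shift in the mean of a signal."""

    def __init__(self, delta=0.05, threshold=2.0):
        self.delta = delta
        self.threshold = threshold
        self.reset()

    def reset(self):
        self.n = 0
        self.mean = 0.0
        self.cum = 0.0
        self.min_cum = 0.0

    def add(self, value):
        self.n += 1
        self.mean += (value - self.mean) / self.n
        self.cum += value - self.mean - self.delta
        self.min_cum = min(self.min_cum, self.cum)
        if self.cum - self.min_cum > self.threshold:
            self.reset()
            return True
        return False


class SyncConceptNeuron:
    """Hypothetical SCN sketch: base model plus greedy output corrections."""

    def __init__(self, window=20, patience=30, lr=0.3, lr_boost=2.0):
        self.memory = deque(maxlen=window)
        self.detector = PageHinkley()
        self.patience = patience
        self.base_lr = lr
        self.lr = lr
        self.lr_boost = lr_boost
        self.base = None        # stand-in base model: the window mean
        self.correction = 0.0   # additive correction applied to the output
        self.greedy = False
        self.quiet = 0
        self.seen = 0

    def predict(self):
        if self.base is None:
            return 0.0
        return self.base + (self.correction if self.greedy else 0.0)

    def update(self, y):
        alarm = False
        if self.base is not None:
            residual = y - self.predict()
            # Change detection runs on the base model's absolute error.
            alarm = self.detector.add(abs(y - self.base))
            if alarm:
                if self.greedy:  # step 3: a new drift while already correcting
                    self.lr = min(1.0, self.lr * self.lr_boost)
                self.greedy = True
                self.quiet = 0
            else:
                self.quiet += 1
                if self.greedy and self.quiet >= self.patience:  # step 4
                    self.greedy = False
                    self.correction = 0.0
                    self.lr = self.base_lr
            if self.greedy:  # step 2: residual-driven output correction
                self.correction += self.lr * residual
        self.memory.append(y)
        self.seen += 1
        if self.seen % self.memory.maxlen == 0:  # step 1: periodic retraining
            self.base = sum(self.memory) / len(self.memory)
            self.correction = 0.0
        return alarm


# A stationary level of 10 followed by a short burst at 15.
scn = SyncConceptNeuron()
stream = [10.0] * 40 + [15.0] * 10
alarms = [t for t, y in enumerate(stream) if scn.update(y)]
```

In this sketch the base model is left untouched during the burst; only its output is shifted, which matches the assumption that the drift is fast but short-lived.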
Case Study A – Taxi Demand Forecasting
4 Key Variables:
1) The expected price for a service over time;
2) The distance to each stand (i.e. cost);
3) The number of taxis already parked at a stand;
4) The demand predicted for a given stand;
Real-World Case Study A: Portuguese Taxi Operator
Case Study: PORTUGAL (EMEA)
Porto is Portugal’s second largest city;
1.3 million inhabitants;
Two taxi fleets = 700 taxi vehicles;
Data acquired using one fleet of roughly 450 taxi vehicles;
Floating Car Data (FCD) feed from August 2011 to April 2012;
1 sample per vehicle every 15 seconds;
Total: 1 million logged trips;
Aggregation: demand counts per 30 minutes;
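The aggregation step can be illustrated with a small sketch. The record layout (a stand identifier plus a pickup timestamp) and the function name `demand_counts` are assumptions; the slides do not describe the raw data schema.

```python
from collections import Counter
from datetime import datetime


def demand_counts(pickups, bin_minutes=30):
    """Count pickups per (stand, time bin); bin_minutes should divide 60."""
    counts = Counter()
    for stand_id, ts in pickups:
        minute = (ts.minute // bin_minutes) * bin_minutes
        bin_start = ts.replace(minute=minute, second=0, microsecond=0)
        counts[(stand_id, bin_start)] += 1
    return counts


# Hypothetical pickup events: (stand identifier, timestamp).
pickups = [
    ("A", datetime(2011, 8, 1, 8, 5)),
    ("A", datetime(2011, 8, 1, 8, 29, 59)),
    ("A", datetime(2011, 8, 1, 8, 30)),
    ("B", datetime(2011, 8, 1, 8, 45)),
]
counts = demand_counts(pickups)
```

Each stand then yields one time series of counts, which is the series whose next term is predicted.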
Real-World Case Study B - Predict Traffic Incidents
▌Is traffic congestion a necessary evil?
▌Re-routing
▌Reversible Lanes
▌Earlier Dispatching of Safety/Clearance Personnel
▌Dynamic Speed Control
Real-World Case Study B: Japanese Highway Operator
Case Study: JAPAN (APAC)
▌Data collected from 106 sensors deployed along 20 km of freeway;
▌Period studied: 3 non-consecutive weeks;
▌Sample: the number of vehicles (flow) passing a sensor per 15 minutes;
Experimental Setup
▌Statistical independence is assumed both for (A) the demand at the stands and for (B) the flow at each sensor;
▌Test sets
A) Last 4 weeks;
B) Last 5 days of each of the 3 weeks;
▌Task: Predict the next term of the series
A) Demand count for each stand;
B) Flow count for each sensor;
B) Congestion prediction;
▌In A), we compared a fully sample-by-sample retrained ARIMA with GLS (ARIGLS) vs. ACN (base learner: ARIGLS);
Main idea: reduce the computational cost of running full GLS for each new sample;
▌Evaluation: sMAPE
▌In B), we compared two fully sample-by-sample retrained models, ARIMA and ETS (w/ GLS), vs. SCN (base learner: online weighting ensemble of ARIMA and ETS);
Main idea: check how SCN handles bursty/recurring drifts;
▌Evaluation: RMSE, MAE (flow count prediction)
▌Evaluation: Precision, Recall (congestion prediction)
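For reference, the evaluation metrics named above can be computed as in the sketch below. The sMAPE variant shown (absolute values in the denominator, zero-denominator terms counted as zero, result in percent) is an assumption, since several definitions are in use and the slides do not state which one was applied.

```python
import math


def smape(actual, predicted):
    """Symmetric MAPE in percent; terms with a zero denominator count as 0."""
    total = 0.0
    for a, p in zip(actual, predicted):
        denom = abs(a) + abs(p)
        total += 0.0 if denom == 0 else 2.0 * abs(a - p) / denom
    return 100.0 * total / len(actual)


def mae(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def precision_recall(actual, predicted):
    """Precision and recall for binary labels (1 = congestion/incident)."""
    tp = sum(1 for a, p in zip(actual, predicted) if a and p)
    fp = sum(1 for a, p in zip(actual, predicted) if not a and p)
    fn = sum(1 for a, p in zip(actual, predicted) if a and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

Precision and recall are reported separately because, for incident prediction, missing an event (low recall) and raising false alarms (low precision) carry different operational costs.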
Results A: Portuguese Taxi Operator
Case Study: PORTUGAL (EMEA)
▌Avg. Training/Runtime of ARIGLS per prediction: 99.77s;
▌Avg. Training/Runtime of ACN per prediction: 32.44s
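As a quick check of what these two averages imply (plain arithmetic on the figures above, not an additional result):

```latex
\[
\frac{99.77\,\mathrm{s}}{32.44\,\mathrm{s}} \approx 3.08,
\qquad
1 - \frac{32.44}{99.77} \approx 0.675
\]
```

That is, ACN needs roughly a third of the time per prediction, a reduction of about 67%.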
Results B: Japanese Highway Operator
▌Results for SCN (Drift3Flow):
RMSE/MAE -5% (flow/occupancy forecasting)
PRECISION +2% / RECALL +30% (incident prediction)
+200 INCIDENTS PREDICTED
Final Remarks
● Huge market demand for Data Science;
● Great real-time data processing abilities;
● Complex DM problems;
Bias towards using the same off-the-shelf technique to solve ALL problems;
● Real-time DM must cope with Concept Drift;
● Concept Neurons can operate on top of the most used supervised learning algorithms;
● We demonstrated that they can guarantee convergence, lower computational effort and higher generalization;
● We showed their added business value in two applications in the transportation domain;
Some Additional References (feel free to ask for more... )
luis.moreira.matias@gmail.com
▌ Moreira-Matias L., Cats O., Gama J., Mendes-Moreira J. and Sousa J.F. “An Online Learning Approach to Eliminate Bus
Bunching in Real-Time.” Applied Soft Computing. Vol. 47. pp. 460-482. 2016
▌ Moreira-Matias L., Gama J., Ferreira M., Mendes-Moreira J. and Damas L. “Time-evolving O-D Matrix Estimation using
high-speed GPS data streams.” Expert Systems with Applications. Vol. 44. pp. 275-268. 2016.
▌ Khiary, J., Moreira-Matias, L., Cerqueira, V., Cats, O. “Automated Setting of Bus Schedule Coverage using Unsupervised
Machine Learning”. Advances in Knowledge Discovery and Data Mining - 20th Pacific-Asia Conference, PAKDD. pp. 552-564.
Springer. (2016)
▌ Moreira-Matias, L., Alesiani, F. “Drift3Flow: Freeway-Incident Prediction using Real-Time Learning.” 18th International IEEE
Conference on Intelligent Transportation Systems (ITSC). pp. 566-571. 2015.
▌ Moreira-Matias, L., Gama, J., Ferreira, M., Mendes-Moreira, J., Damas, L. “Predicting Taxi-Passenger Demand Using Streaming
Data”. IEEE Transactions on Intelligent Transportation Systems. Vol.14, no.3. pp.1393-1402. 2013.
▌ Moreira-Matias L., Mendes-Moreira J., Sousa J.F. and Gama J. “Improving Mass Transit Operations by using AVL based
Systems: A Survey”. IEEE Transactions on Intelligent Transportation Systems. Vol. 16, no. 4. pp. 1636-1653. 2015.
▌ Mendes-Moreira J., Moreira-Matias L., Gama J. and Sousa J.F. “Validating the Coverage of Bus Schedules: A Machine Learning
Approach”. Information Sciences. Vol. 293, no. 1. pp. 299-313. 2015.
▌ Moreira-Matias, L., Gama, J., Ferreira, M., Mendes-Moreira, J., Damas, L. “On Predicting the Taxi-Passenger Demand: A Real-
Time Approach”. Progress in Artificial Intelligence. LNCS 8154. Springer. pp. 54-65. 2013.
▌ Moreira-Matias, L., Gama, J., Mendes-Moreira, J., Freire de Sousa, J. “An Incremental Probabilistic Model to Predict Bus
Bunching in Real-Time”. Advances in Intelligent Data Analysis XIII. LNCS vol. 8819. pp. 227-238. Springer. 2014.
▌ Moreira-Matias, L., Gama, J., Ferreira, M., Mendes-Moreira, J., Damas, L. “Online Predictive Model for Taxi Services”. Advances
in Intelligent Data Analysis XI. LNCS vol. 7619. pp. 230-240. Springer. 2012.
▌ Moreira-Matias, L., Fernandes, R., Gama, J., Ferreira, M., Mendes-Moreira, J., Damas, L., “An Online Recommendation System
for the Taxi Stand choice Problem” (Poster) IEEE Vehicular Network Conference (IEEE VNC). pp. 173-180. 2012.
Thank you for your time!
luis.matias@neclab.eu