A DEEP LEARNING USE CASE FOR WATER END USE DETECTION
Motivation
Urban water supply
• We need good demand management policies to achieve sustainable development.
• Adding a new water source implies:
• Higher costs.
• Environmental damage.
• Poorer quality.
• “the largest, least expensive, and most
environmentally sound source of water […] is
the water currently being wasted in every
sector of our economy”.[1]
[1] Gleick, P. et al. (2003). Waste Not, Want Not: The Potential for
Urban Water Conservation
Motivation
End uses of water
• Residential use accounts for 70% of total water consumption.
• A good understanding of the demand and
its characterization could be very useful to
create good management policies.
• Several problems can be addressed using
AI techniques:
• Final use classification (dishwasher,
toilet, irrigation, taps).
• Water demand forecasting.
Motivation
The problem
• Installing a meter on each water device is
very expensive and intrusive.
• To overcome this problem, it is possible to install a single precision meter at the home's main water connection.
• Predictive models can read these meters
and make predictions:
• End use: Classification problem.
• Forecasting: Regression problem.
Motivation
Data Source
• Since 2008, Canal de Isabel II has monitored a sample of 300 homes spread over the region of Madrid.
• 15 million hours monitored over 9 years.
• 35 million events.
• The sample is stratified and spread along
different geographical areas of the region
to be considered representative of the
domestic users of Madrid.
• The goal is the study of patterns of
consumption and end uses of urban water.
Motivation
Project information
PROJECT TITLE: Pattern Recognition in Residential End Uses of Water
RESEARCH LINE: Assurance of the balance (availability / demand)
CLIENT: Canal de Isabel II
CONSORTIUM:
• Exeleria: Preprocessing tasks
• Treelogic: Machine Learning tasks
GOAL: Developing an automatic system for identifying the end uses of water in domestic applications, from the signals registered by water meters, using advanced machine learning techniques such as artificial neural networks (ANN) or other statistical methods.
Starting Point
Hardware Infrastructure
[Figure: water meter and datalogger]
Starting Point
• Data was labeled by operators (experts)
who classify water use events using
specialized software.
• This task involves a considerable number of man-hours.
• It takes an operator 1 hour to analyse a two-week period of data from each installation.
Starting Point
8 types of events
• SHOWERS (INCLUDING BATHTUBS)
• DISHWASHER
• WASHING MACHINE
• CISTERNS
• LEAKS
• FAUCETS
• POOL
• IRRIGATION
Previous analysis and visualization
Pulse to Flow
BASELINE INFORMATION
• Date
• Number of pulses

DATE                   COUNTER (number of accumulated pulses)
01/06/2008 0:47:35     31542
01/06/2008 0:48:13     31543
01/06/2008 0:48:55     31544
01/06/2008 0:49:38     31545
01/06/2008 1:20:29     31546
01/06/2008 1:20:46     31547
01/06/2008 1:21:03     31548
01/06/2008 1:21:20     31549
…                      …
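The conversion from accumulated pulses to flow can be sketched as follows. This is a minimal illustration, not the project's actual preprocessing: it assumes each pulse represents a fixed volume (1 litre here, one of the two meter resolutions mentioned in the results), that dates are written day/month/year, and the names `parse_reading` and `pulses_to_flow` are hypothetical.

```python
from datetime import datetime

# Assumption: one pulse equals a fixed volume. 1.0 litre is illustrative only.
LITRES_PER_PULSE = 1.0


def parse_reading(line):
    """Parse 'dd/mm/yyyy h:mm:ss counter' into (datetime, int). Date order is assumed."""
    date, time, counter = line.split()
    return datetime.strptime(f"{date} {time}", "%d/%m/%Y %H:%M:%S"), int(counter)


def pulses_to_flow(readings, litres_per_pulse=LITRES_PER_PULSE):
    """Return (timestamp, litres per minute) for each interval between readings."""
    flows = []
    for (t0, c0), (t1, c1) in zip(readings, readings[1:]):
        seconds = (t1 - t0).total_seconds()
        litres = (c1 - c0) * litres_per_pulse
        flows.append((t1, 60.0 * litres / seconds))
    return flows


readings = [parse_reading(line) for line in [
    "01/06/2008 0:47:35 31542",
    "01/06/2008 0:48:13 31543",
    "01/06/2008 0:48:55 31544",
]]
flows = pulses_to_flow(readings)
```

The first interval lasts 38 seconds for one pulse, so under the 1 litre assumption the average flow over that interval is 60/38 litres per minute.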
Previous analysis and visualization
Episodes
• An episode is a period of time in which the flow is non-zero, bounded by two zero-flow instants.
• An episode may consist of one or more events.
• An event belongs to exactly one episode.
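The episode definition above maps naturally onto a scan for maximal runs of non-zero flow. The sketch below is one possible reading of it, assuming the flow is already a regularly sampled sequence; the function name `split_episodes` is hypothetical, and a run that is still open at the end of the data is kept as an episode.

```python
def split_episodes(flow):
    """Return (start, end) index pairs, end exclusive, of maximal non-zero runs."""
    episodes = []
    start = None
    for i, q in enumerate(flow):
        if q > 0 and start is None:
            start = i                      # flow leaves zero: an episode begins
        elif q == 0 and start is not None:
            episodes.append((start, i))    # flow returns to zero: the episode ends
            start = None
    if start is not None:                  # data ended while flow was still non-zero
        episodes.append((start, len(flow)))
    return episodes


flow = [0, 0, 3, 3, 0, 5, 8, 5, 0, 0, 2, 0]
episodes = split_episodes(flow)
```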
Previous analysis and visualization
Events
• An event is an elementary unit of consumption that lasts long enough for its instantaneous flow to be clearly differentiated from the rest.
• A particular domestic use may consist of one or more events.
• One or several events that overlap in time form an episode.
[Figure: flow (Q) over time (T). 3 domestic uses which involve 4 events: faucets (1 event), cisterns (1 event), washing machine cycles 1 and 2 (2 events).]
[Figure: flow (Q) over time (T). The same 3 domestic uses and 4 events, grouped into 3 episodes (episode 1, episode 2, episode 3).]
Previous analysis and visualization
Events identification
• When an episode consists of more than one event, the events overlap.
• Graphically, the events are "stacked" on top of one another like a ladder.
• How do we discriminate events?
o It is the same event if…
⁻ The flow rate stays constant or the change is not significant.
o It is a different event if…
⁻ There is a significant change in the flow rate.
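The rule above can be illustrated with a simple threshold on the change in flow between consecutive samples. This is only a sketch: the slides do not say how "significant" was defined, so the `threshold` parameter and the function name `flow_change_points` are assumptions. A positive step suggests a new event stacking on top of the current one, a negative step suggests an event ending.

```python
def flow_change_points(flow, threshold):
    """Return (index, delta) wherever flow changes by more than `threshold`."""
    changes = []
    for i in range(1, len(flow)):
        delta = flow[i] - flow[i - 1]
        if abs(delta) > threshold:
            changes.append((i, delta))
    return changes


# One episode: a steady use around 6 L/min with a second use stacked on top.
episode = [6.0, 6.1, 5.9, 14.0, 14.2, 13.9, 6.0, 6.1]
changes = flow_change_points(episode, threshold=2.0)
```

Small fluctuations such as 6.0 to 6.1 stay below the threshold and are treated as the same event.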
Approach
Feature Extraction
37 features were extracted from every event: duration, volume, maximum flow, initial gradient, …
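As an illustration of what such per-event features might look like, the sketch below computes the four that are named on the slide. The exact definitions used in the project are not given, so everything here is an assumption: flow is taken in litres per minute, time in seconds, volume is a trapezoidal integral, and the initial gradient is the slope between the first two samples. The function name `extract_features` is hypothetical.

```python
def extract_features(times_s, flow_lpm):
    """Compute four illustrative features for one event (assumed definitions)."""
    duration = times_s[-1] - times_s[0]
    # Trapezoidal integration; divide by 60 to convert L/min * s into litres.
    volume = sum(
        (t1 - t0) * (q0 + q1) / 2.0 / 60.0
        for t0, t1, q0, q1 in zip(times_s, times_s[1:], flow_lpm, flow_lpm[1:])
    )
    max_flow = max(flow_lpm)
    initial_gradient = (flow_lpm[1] - flow_lpm[0]) / (times_s[1] - times_s[0])
    return {
        "duration_s": duration,
        "volume_l": volume,
        "max_flow_lpm": max_flow,
        "initial_gradient": initial_gradient,
    }


features = extract_features([0, 10, 20, 30], [0.0, 6.0, 6.0, 0.0])
```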
Approach
Deep Neural Networks
• Deep Learning (DL) is a major
breakthrough in artificial intelligence with a
high potential for predictive applications.
• It has been recognized as one of ten
breakthrough technologies according to
MIT Technology Review.
• DL has gone from being considered an
academic field to being applied in
engineering thanks to frameworks like
TensorFlow or CNTK.
• Very powerful: they can solve very complex tasks.
• They require a large amount of data.
• Long training times: complex tasks require specialized hardware.
• Slow classifiers.
Approach
Speedup (SDAs)
• A disadvantage of the backpropagation algorithm is that training is fast in the last layers (near the output) but very slow in the layers far from the output.
• If we don't have enough training data to perform a high number of backpropagation iterations, we only train the layers near the output.
• If we can initialize the neural network with useful weights in the first layers, the training procedure will speed up.
• If that initialization is unsupervised, we can use unlabeled data.
Approach
Speedup (SDAs)
• Imagine a neural network that has one hidden layer and the same number of neurons in the input as in the output.
• We add noise to the input and we train the network to recover the original input.
• The network will learn to generalize because it will receive different noisy inputs with the same target output.
• The network will learn to identify useful features of the image.
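The denoising idea described above can be sketched with plain NumPy. This is a toy illustration under stated assumptions, not the model used in the project: the data is synthetic, the hidden layer uses tanh, the loss is mean squared error, and the hyperparameters (`noise_std`, `lr`, `epochs`) and the function name are hypothetical.

```python
import numpy as np


def train_denoising_autoencoder(X, n_hidden, noise_std=0.1, lr=0.1, epochs=300, seed=0):
    """Train a one-hidden-layer autoencoder to rebuild clean X from noisy X."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1, b1 = rng.normal(0, 0.1, (d, n_hidden)), np.zeros(n_hidden)
    W2, b2 = rng.normal(0, 0.1, (n_hidden, d)), np.zeros(d)
    losses = []
    for _ in range(epochs):
        X_noisy = X + rng.normal(0, noise_std, X.shape)  # corrupt the input
        H = np.tanh(X_noisy @ W1 + b1)                   # encode
        X_hat = H @ W2 + b2                              # decode
        err = X_hat - X                                  # target is the clean input
        losses.append(float(np.mean(err ** 2)))
        # Backpropagate the mean squared error.
        g_out = 2.0 * err / (n * d)
        g_H = (g_out @ W2.T) * (1.0 - H ** 2)
        W2 -= lr * (H.T @ g_out)
        b2 -= lr * g_out.sum(axis=0)
        W1 -= lr * (X_noisy.T @ g_H)
        b1 -= lr * g_H.sum(axis=0)
    return (W1, b1, W2, b2), losses


# Synthetic data with 3 underlying factors spread over 8 standardized features.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3)) @ rng.normal(size=(3, 8))
X = (X - X.mean(axis=0)) / X.std(axis=0)
params, losses = train_denoising_autoencoder(X, n_hidden=3)
```

Because the target is always the clean input while the corrupted input changes every epoch, the hidden layer is pushed towards features that survive the noise.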
Approach
Speedup (SDAs)
• How can I initialize an MLP using autoencoders?
• By stacking them.
• We can remove the decoding layer and attach another autoencoder to the output.
• A single autoencoder can only find basic useful weights.
• The idea of autoencoders in Deep Learning is to train several autoencoders sequentially, using the hidden layer of one as the input of the next.
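A greedy layer-wise version of this stacking procedure could look like the sketch below. It is an assumption-laden toy: each autoencoder is a small tanh network trained with noise on its input, only the encoder half is kept, and the names `train_dae` and `pretrain_stack` as well as all hyperparameters are hypothetical. The returned weights would then initialize the first layers of an MLP before supervised fine-tuning.

```python
import numpy as np


def train_dae(X, n_hidden, noise_std=0.1, lr=0.1, epochs=100, seed=0):
    """Train one denoising autoencoder and return only its encoder (W, b)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1, b1 = rng.normal(0, 0.1, (d, n_hidden)), np.zeros(n_hidden)
    W2, b2 = rng.normal(0, 0.1, (n_hidden, d)), np.zeros(d)
    for _ in range(epochs):
        X_noisy = X + rng.normal(0, noise_std, X.shape)
        H = np.tanh(X_noisy @ W1 + b1)
        g_out = 2.0 * (H @ W2 + b2 - X) / (n * d)
        g_H = (g_out @ W2.T) * (1.0 - H ** 2)
        W2 -= lr * (H.T @ g_out)
        b2 -= lr * g_out.sum(axis=0)
        W1 -= lr * (X_noisy.T @ g_H)
        b1 -= lr * g_H.sum(axis=0)
    return W1, b1  # the decoding layer is discarded


def pretrain_stack(X, layer_sizes, **train_kwargs):
    """Train autoencoders one after another; each sees the previous hidden layer."""
    encoders = []
    inputs = X
    for n_hidden in layer_sizes:
        W, b = train_dae(inputs, n_hidden, **train_kwargs)
        encoders.append((W, b))
        inputs = np.tanh(inputs @ W + b)  # hidden activations feed the next autoencoder
    return encoders


rng = np.random.default_rng(2)
X = rng.normal(size=(100, 6))
encoders = pretrain_stack(X, layer_sizes=[4, 3])
```

Since no labels are used anywhere in this step, it can run on unlabeled data.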
Results
Benchmark

ACCURACY                 1 L meters    0.1 L meters
Deep Neural Networks     81.78%        91.19%
SVMs                     67.41%        84.78%
Results
Accuracy comparison of Deep Neural Networks and SVMs in every water use
What else can Deep Learning do for water supply companies…?
What else…?
Time Series
• Water supply companies are also interested in:
• Water demand forecasting.
• Weather or quantitative precipitation forecast:
o Volume of water in reservoirs.
o Alert systems.
• Time series forecasting.
What else…?
RNN
• Traditional NNs assume that inputs are independent of each other.
• RNNs incorporate memory that contains the essence of what has happened previously.
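The notion of memory can be made concrete with the standard simple-RNN recurrence, in which the hidden state at each step depends on both the current input and the previous hidden state. The sketch below uses hand-picked toy weights purely for illustration; the function name `rnn_forward` is hypothetical.

```python
import numpy as np


def rnn_forward(xs, Wx, Wh, b):
    """Run a simple RNN over a sequence and return the hidden state at each step."""
    h = np.zeros(Wh.shape[0])
    states = []
    for x in xs:
        h = np.tanh(Wx @ x + Wh @ h + b)  # new state mixes input and previous state
        states.append(h)
    return np.array(states)


# Toy weights: the same input is fed twice in a row.
Wx = 0.5 * np.eye(2)
Wh = 0.5 * np.eye(2)
b = np.zeros(2)
x = np.array([1.0, 0.0])
states = rnn_forward([x, x], Wx, Wh, b)
```

Although both timesteps receive the same input, the two hidden states differ, because the second one also carries what happened at the first.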
What else…?
LSTM
• A variant of RNN, capable of learning long-term dependencies.
• Internal architecture more complex than the simple RNN architecture.
• Most widely used type of RNN.
Southwestern precipitation forecast
Monthly precipitation (mm.), Southwestern mountain region (1932-1966) [**]
DATA SET: 420 rows, 2 columns (number of month, precipitation)
[**] https://datamarket.com/data/set/22ls/monthly-precipitation-mm-southwestern-mountain-region-1932-1966
Solution
• LSTM network
o Input – 20 timesteps, 1 feature
o Hidden Layer 1 – 20 LSTM units
o Output – 1 neuron
• MSE: 16.94
[Figure: monthly precipitation (mm.) and prediction (last 63 months)]
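To feed a network with an input of 20 timesteps and 1 feature, the monthly series has to be cut into sliding windows, each paired with the value that follows it. The sketch below shows one common way to do that, together with the MSE metric. It does not reproduce the dataset or the reported result: the series is a synthetic placeholder with 420 values, and the names `make_windows` and `mse` are hypothetical.

```python
import numpy as np


def make_windows(series, timesteps=20):
    """Return X with shape (samples, timesteps, 1) and y with the next value."""
    series = np.asarray(series, dtype=float)
    X = np.stack([series[i:i + timesteps] for i in range(len(series) - timesteps)])
    y = series[timesteps:]
    return X[..., np.newaxis], y


def mse(y_true, y_pred):
    """Mean squared error between two equally sized sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))


# Synthetic placeholder with the same length as the dataset (420 monthly rows).
series = np.arange(420, dtype=float)
X, y = make_windows(series, timesteps=20)
```

With 420 rows and 20 timesteps per window, this yields 400 training pairs.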
CONCLUSIONS
• Data science can help us UNDERSTAND water demand and its characterization.
• Deep Learning models can achieve very good results in terms of ACCURACY when they are trained using large enough datasets.
• This METHODOLOGY is currently in use for processing data from the "Panel for residential consumption patterns assessment and end-uses monitoring" project of Canal de Isabel II in Madrid.
• It could be very USEFUL to create good management policies.
THANKS!
Roberto Díaz, LEADER OF THE DATA SCIENCE RESEARCH
José Antonio Sánchez, SENIOR R&D ENGINEER
Contact
Parque Tecnológico de
Asturias
Parcela 30
E33428 Llanera
Asturias
ESPAÑA
Avda. Manoteras, 38
Oficina D614
E28050 Madrid
ESPAÑA
T +34 902 286 386
central@treelogic.com
www.Treelogic.com

A Deep Learning use case for water end use detection by Roberto Díaz and José Antonio Sánchez at Big Data Spain 2017

  • 1.
  • 4. Motivation Urban water supply • We need good demand management policies to achieve a good sustainable development. • Adding a new water source imply: • Higher costs. • Environmental damage. • Poorer quality. • “the largest, least expensive, and most environmentally sound source of water […] is the water currently being wasted in every sector of our economy”.[1] [1] Gleick, P. et al. (2003). Waste Not, Want Not: The Potential for Urban Water Conservation
  • 5. Motivation End uses of water • Residential use of water -> The 70% of total water consumption. • A good understanding of the demand and its characterization could be very useful to create good management policies. • Several problems can be addressed using AI techniques: • Final use classification (dishwasher, toilet, irrigation, taps). • Water demand forecasting.
  • 6. Motivation The problem • Installing a meter on each water device is very expensive and intrusive. • To overcome this problem, it is possible to install a unique precision meter at the home main water connection. • Predictive models can read these meters and make predictions: • End use: Classification problem. • Forecasting: Regression problem.
  • 7. Motivation Data Source • Canal de Isabel II monitors since 2008 a sample of 300 homes spread over the region of Madrid. • 15 million hours monitored for 9 years. • 35 million of events. • The sample is stratified and spread along different geographical areas of the region to be considered representative of the domestic users of Madrid. • The goal is the study of patterns of consumption and end uses of urban water.
  • 8. Motivation Project information 7 PROJECT TITLE Pattern Recognition in Residential End Uses of Water RESEARCH LINE Assurance of the balance (availability / demand) CLIENT Canal de Isabel II CONSORTIU M Exeleria: Preprocessing tasks Treelogic: Machine Learning tasks GOAL Developing an automatic system for identifying the end uses of water in the domestic applications, from the signals registered by water meters, using advanced techniques of machine learning, such as artificial neural networks (ANN) or other statistical methods
  • 11. Starting Point • Data was labeled by operators (experts) who classify water use events using specialized software. • This task involves a considerable amount of man-hours. • 1 hour of an operator to analyse a two- week period of data from each installation.
  • 12. Starting Point 8 type of events SHOWERS (INCLUDING BATHTUBS) DISHWASHERWASHING MACHINECISTERNS LEAKS FAUCETS POOL IRRIGATION
  • 14. Previous analysis and visualization Pulse to Flow 1 DATE COUNTER (Number of accumulated pulses) 01/06/2008 0:47:35 31542 01/06/2008 0:48:13 31543 01/06/2008 0:48:55 31544 01/06/2008 0:49:38 31545 01/06/2008 1:20:29 31546 01/06/2008 1:20:46 31547 01/06/2008 1:21:03 31548 01/06/2008 1:21:20 31549 ………………… ………………… BASELINE INFORMATION • Date • Number of pulse
  • 19. Previous analysis and visualization Episodes • An episode is a period of time during which the flow is nonzero, bounded by two zero-flow instants. • An episode may consist of one or more events. • An event belongs to exactly one episode.
  • 20. Previous analysis and visualization Events • An event is an elementary unit of consumption that lasts long enough for its instantaneous flow to be clearly differentiated from the rest. • A particular domestic use may consist of one or more events. • One or several events that overlap in time form an episode.
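The episode definition above can be sketched as a scan over a sampled flow series. This is an illustrative sketch only: it assumes the flow is available as a regularly sampled list of non-negative values, and `split_episodes` is a hypothetical helper name.

```python
def split_episodes(flow):
    """Return (start, end) index pairs, end exclusive, of nonzero-flow runs."""
    episodes = []
    start = None
    for i, q in enumerate(flow):
        if q > 0 and start is None:
            start = i  # flow leaves zero: an episode begins
        elif q <= 0 and start is not None:
            episodes.append((start, i))  # flow returns to zero: episode ends
            start = None
    if start is not None:
        # The series ended while water was still flowing.
        episodes.append((start, len(flow)))
    return episodes


flow = [0, 0, 3, 3, 5, 0, 0, 2, 2, 0]
episodes = split_episodes(flow)  # [(2, 5), (7, 9)]
```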
  • 21. 3 domestic uses which involve 4 events: FAUCETS (1 event), CISTERNS (1 event), WASHING MACHINE CYCLE 1 and WASHING MACHINE CYCLE 2 (2 events). [Figure axes: Q, T]
  • 22. 3 domestic uses which involve 4 events and 3 episodes: FAUCETS (1 event), CISTERNS (1 event), WASHING MACHINE CYCLE 1 and WASHING MACHINE CYCLE 2, grouped into EPISODE 1, EPISODE 2 and EPISODE 3. [Figure axes: Q, T]
  • 23. Previous analysis and visualization Events identification • When an episode consists of more than one event, the events overlap. • Graphically, the events are "stacked" on top of each other like a ladder. • How do we discriminate events? o It is the same event if… ⁻ The flow rate stays constant or the change is not significant. o It is a different event if… ⁻ There is a significant change in the flow rate.
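The discrimination rule above can be sketched as a search for significant steps in the flow rate inside one episode. The slides do not say what counts as "significant", so the threshold of 0.5 flow units below is purely an assumption, and `significant_steps` is a hypothetical helper name.

```python
def significant_steps(flow, threshold=0.5):
    """Return (index, delta) for every sample-to-sample change above threshold.

    A positive delta suggests a new event stacking on top of the current flow;
    a negative delta suggests an event ending.
    """
    steps = []
    for i in range(1, len(flow)):
        delta = flow[i] - flow[i - 1]
        if abs(delta) > threshold:
            steps.append((i, delta))
    return steps


# One episode: a first event at about 6 units, a second one stacked on top.
episode = [0.0, 6.0, 6.1, 6.0, 10.0, 10.1, 6.0, 6.0, 0.0]
steps = significant_steps(episode)
```

In this toy episode the rises at indices 1 and 4 mark the starts of two events, and the drops at indices 6 and 8 mark their ends, while the small fluctuations around 6.0 and 10.0 are treated as the same event.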
  • 27. Approach Feature Extraction 2 • 37 features were extracted from every event: duration, volume, maximum flow, initial gradient, …
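As an illustration, the four features named on the slide could be computed from one event's sampled flow as below. The remaining features are not listed in the slides and are not reproduced here. The exact definitions are assumptions: volume is taken as the trapezoidal integral of flow, and the initial gradient as the slope between the first two samples. `extract_features` and its dictionary keys are hypothetical names.

```python
def extract_features(times, flows):
    """Compute a few per-event features.

    times: sample times in seconds; flows: flow in litres per minute.
    Both sequences must have the same length (at least 2).
    """
    duration = times[-1] - times[0]
    # Trapezoidal integration; dividing by 60 converts (L/min * s) to litres.
    volume = sum(
        (flows[i] + flows[i - 1]) / 2.0 * (times[i] - times[i - 1]) / 60.0
        for i in range(1, len(times))
    )
    max_flow = max(flows)
    # Assumed definition: slope between the first two samples.
    initial_gradient = (flows[1] - flows[0]) / (times[1] - times[0])
    return {
        "duration_s": duration,
        "volume_l": volume,
        "max_flow_lpm": max_flow,
        "initial_gradient": initial_gradient,
    }


times = [0, 10, 20, 30]
flows = [0.0, 6.0, 6.0, 0.0]
features = extract_features(times, flows)
```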
  • 28. Approach Deep Neural Networks • Deep Learning (DL) is a major breakthrough in artificial intelligence with high potential for predictive applications. • MIT Technology Review has recognized it as one of its ten breakthrough technologies. • DL has gone from being considered an academic field to being applied in engineering thanks to frameworks like TensorFlow or CNTK. • Deep neural networks are very powerful and can solve very complex tasks. • They require a large amount of data. • Training times are long, and complex tasks require specialized hardware. • They are slow classifiers.
  • 30. Approach Speedup (SDAs) • A disadvantage of the backpropagation algorithm is that training is fast in the last layers (near the output) but very slow in layers far from the output. • If we don’t have enough training data to perform a large number of backpropagation iterations, we effectively train only the layers near the output. • If we can initialize the first layers of the neural network with useful weights, the training procedure will speed up. • If that initialization is unsupervised, we can use unlabeled data.
  • 31. Approach Speedup (SDAs) • Imagine a neural network that has one hidden layer and the same number of neurons in the output as in the input. • We add noise to the input and train the network to recover the original input. • The network will learn to generalize because it will receive different noisy inputs that map to the same output. • The network will learn to identify useful features of the input.
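In symbols, one common way to write this denoising autoencoder is shown below. It is a sketch only: the slides do not state which noise model, activation, or loss was used, so the corruption distribution q, the activation σ, and the squared-error loss are assumptions. The clean input x is corrupted into x̃, encoded into h, decoded into x̂, and the parameters are chosen to reconstruct the clean x.

```latex
\tilde{x} \sim q(\tilde{x} \mid x), \qquad
h = \sigma(W \tilde{x} + b), \qquad
\hat{x} = \sigma(W' h + b'), \qquad
\min_{W,\, b,\, W',\, b'} \; \frac{1}{N} \sum_{n=1}^{N} \lVert x_n - \hat{x}_n \rVert^2
```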
  • 32. Approach Speedup (SDAs) • How can I initialize an MLP using autoencoders? • By stacking them. • We can remove the decoding layer and attach another autoencoder to the output. • A single autoencoder can only find basic useful weights. • The idea of autoencoders in Deep Learning is to train several autoencoders sequentially, using the hidden layer of each one as the input of the next.
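The stacking procedure can be sketched with NumPy as follows. This is a toy illustration under stated assumptions, not the project's code: Gaussian input noise, a sigmoid encoder, a linear decoder, plain full-batch gradient descent, random stand-in data with 37 columns, and hidden sizes of 16 and 8 are all choices made for the example. `train_dae` and `pretrain_stack` are hypothetical names.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_dae(X, n_hidden, noise_std=0.1, lr=0.5, epochs=200, seed=0):
    """Train one denoising autoencoder and return its encoder (W, b)."""
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    W1 = rng.normal(0.0, 0.1, (n_features, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.1, (n_hidden, n_features))
    b2 = np.zeros(n_features)
    for _ in range(epochs):
        X_noisy = X + rng.normal(0.0, noise_std, X.shape)  # corrupt the input
        H = sigmoid(X_noisy @ W1 + b1)                     # encode
        X_hat = H @ W2 + b2                                # decode (linear)
        err = X_hat - X                                    # target is the CLEAN input
        # Gradients of the mean squared reconstruction error.
        d_out = 2.0 * err / err.size
        dW2 = H.T @ d_out
        db2 = d_out.sum(axis=0)
        d_hidden = (d_out @ W2.T) * H * (1.0 - H)
        dW1 = X_noisy.T @ d_hidden
        db1 = d_hidden.sum(axis=0)
        W1 -= lr * dW1
        b1 -= lr * db1
        W2 -= lr * dW2
        b2 -= lr * db2
    return W1, b1  # the decoder is discarded after pretraining


def pretrain_stack(X, layer_sizes):
    """Greedy layer-wise pretraining: each DAE is trained on the previous codes."""
    params = []
    inputs = X
    for k, n_hidden in enumerate(layer_sizes):
        W, b = train_dae(inputs, n_hidden, seed=k)
        params.append((W, b))
        inputs = sigmoid(inputs @ W + b)  # hidden layer feeds the next autoencoder
    return params


rng = np.random.default_rng(42)
X = rng.random((200, 37))  # random stand-in for 37 features per event
params = pretrain_stack(X, [16, 8])
```

The returned encoder weights would then initialize the first layers of the MLP before supervised fine-tuning with backpropagation on the labeled events.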
  • 34. Results Benchmark • ACCURACY OF DEEP NEURAL NETWORKS: 81.78% with 1 L meters, 91.19% with 0.1 L meters • ACCURACY OF SVMs: 67.41% with 1 L meters, 84.78% with 0.1 L meters
  • 35. Results Accuracy comparison of Deep Neural Networks and SVMs in every water use
  • 36. What else can Deep Learning do for water supply companies…?
  • 37. What else…? Time Series • Water supply companies are also interested in: • Water demand forecasting. • Weather or quantitative precipitation forecasting: o Volume of water in reservoirs. o Alert systems. • Time series forecasting.
  • 38. What else…? RNN 3 • Traditional NNs assume that inputs are independent of each other. • RNNs incorporate a memory that captures the essence of what has happened previously.
  • 39. What else…? LSTM 3 • A variant of RNN capable of learning long-term dependencies. • Its internal architecture is more complex than that of a simple RNN. • The most widely used type of RNN.
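The "memory" of an RNN can be illustrated with a minimal forward pass of a simple (Elman-style) recurrent layer. This is a sketch with random, untrained weights and a hidden size of 4 chosen only for the example. `rnn_forward` is a hypothetical name.

```python
import numpy as np


def rnn_forward(xs, W_x, W_h, b):
    """Run a simple RNN over a sequence and return all hidden states."""
    h = np.zeros(W_h.shape[0])  # initial memory is empty
    states = []
    for x in xs:
        # The new state depends on the current input AND the previous state.
        h = np.tanh(W_x @ x + W_h @ h + b)
        states.append(h)
    return np.array(states)


rng = np.random.default_rng(0)
W_x = rng.normal(size=(4, 1))
W_h = rng.normal(size=(4, 4))
b = np.zeros(4)

# Only the first input is nonzero, yet later states still carry its trace.
seq = np.array([[1.0], [0.0], [0.0]])
states = rnn_forward(seq, W_x, W_h, b)
```

With an all-zero sequence every state stays at zero, so the nonzero states at steps 2 and 3 above come entirely from what the network remembers of step 1.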
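For reference, the more complex internal architecture can be written out with the standard LSTM cell equations. The notation is the usual textbook one, not taken from the slides: x_t is the input, h_t the hidden state, c_t the cell state, σ the logistic sigmoid, and ⊙ the element-wise product.

```latex
\begin{aligned}
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) && \text{(forget gate)} \\
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) && \text{(input gate)} \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) && \text{(output gate)} \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) && \text{(candidate cell state)} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
```

The additive update of c_t is what lets information persist across many time steps.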
  • 40. Southwestern precipitation forecast • DATA SET: Monthly precipitation (mm), Southwestern mountain region (1932-1966) [**] • 420 ROWS • 2 COLUMNS (number of month, precipitation) [**] https://datamarket.com/data/set/22ls/monthly-precipitation-mm-southwestern-mountain-region-1932-1966
  • 41. Solution 4 • LSTM network o Input: 20 timesteps, 1 feature o Hidden Layer 1: 20 LSTM o Output: 1 neuron • MSE: 16.94
  • 42. Monthly precipitation (mm.) · Prediction (last 63 months)
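The data preparation implied by the two slides above (windows of 20 timesteps with 1 feature, and a prediction over the last 63 months) can be sketched as follows. This is an illustration only: a synthetic sine series stands in for the 420 precipitation values, the chronological hold-out is an assumption about how the 63 months were separated, and `make_windows` and `mse` are hypothetical helper names. The LSTM itself is not reproduced here.

```python
import numpy as np


def make_windows(series, timesteps=20):
    """Build (samples, timesteps, 1) inputs and next-value targets."""
    X = np.array([series[i:i + timesteps] for i in range(len(series) - timesteps)])
    y = np.array(series[timesteps:])
    return X.reshape(-1, timesteps, 1), y


def mse(y_true, y_pred):
    """Mean squared error, the metric reported on the Solution slide."""
    return float(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2))


series = np.sin(np.arange(420) / 6.0)  # synthetic stand-in for 420 monthly rows
X, y = make_windows(series)

n_test = 63  # last 63 months held out, in chronological order
X_train, y_train = X[:-n_test], y[:-n_test]
X_test, y_test = X[-n_test:], y[-n_test:]
```

Each row of `X_train` has the shape an LSTM input layer with 20 timesteps and 1 feature expects, and `mse(y_test, predictions)` would score a model's forecasts on the held-out months.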
  • 43. CONCLUSIONS • 01. Data science can help us UNDERSTAND water demand and its characterization. • 02. Deep Learning models can achieve very good results in terms of ACCURACY when they are trained using large enough datasets. • 03. This METHODOLOGY is currently in use for processing data from the Panel for residential consumption patterns assessment and end-uses monitoring project of Canal de Isabel II in Madrid. • 04. It could be very USEFUL to create good management policies.
  • 44. THANKS! • Roberto Díaz, LEADER OF THE DATA SCIENCE RESEARCH • José Antonio Sánchez, SENIOR R&D ENGINEER
  • 45. Contact • Parque Tecnológico de Asturias, Parcela 30, E33428 Llanera, Asturias, Spain • Avda. Manoteras, 38, Oficina D614, E28050 Madrid, Spain • T +34 902 286 386 • central@treelogic.com • www.Treelogic.com