Data Science Case Studies: The Internet of Things: Implications for the Enterprise

The Internet of Things: Implications for the Enterprise
The Internet of Things (IoT) is already a reality, but extracting value from it is still in its infancy. This session analyzes the implications of IoT for the enterprise, with examples from work we have done.

Rashmi Raghu is a Principal Data Scientist at Pivotal with a focus on the Internet of Things and applications in the energy sector. Her work has spanned diverse industry problems, from uncovering patterns & anomalies in massive datasets to predictive maintenance. She holds a Ph.D. in Mechanical Engineering with a minor in Management Science & Engineering from Stanford University. Her doctoral work focused on the development of novel computational models of the cardiovascular system to aid disease research. Prior to that, she obtained Master’s and Bachelor’s degrees in Engineering Science from the University of Auckland, New Zealand.

Published in: Data & Analytics

  1. Internet of Things: Implications for the Enterprise. Rashmi Raghu, Ph.D., Principal Data Scientist. © 2015 Pivotal Software, Inc. All rights reserved.
  2. Billions of Data Points. Gene Sequencing: cost to sequence one genome has fallen from $100M in 2001 to $10K in 2011 to $1K in 2014. Smart Grids: reading smart meters every 15 minutes is 3000x more data intensive. Social Media: Facebook uploads 250 million photos each day. Oil Exploration: oil rigs generate 25000 data points per second. Also shown: Stock Market, Video Surveillance, Medical Imaging, Mobile Sensors.
  3. Implications for the Enterprise. Organizational: vision, preparedness, execution. Technical: data quality & completeness, heterogeneity of data sources, technology architecture.
  4. Implications for the Enterprise. Organizational: vision, preparedness, execution. Technical: data quality & completeness, heterogeneity of data sources, technology architecture. Issues in any of these have implications for data science approaches and their effectiveness.
  5. Case Studies. Oil Drilling: Predictive Maintenance. Telecommunications: Customer Micro-segmentation.
  6. Case Studies. Oil Drilling: Predictive Maintenance. Telecommunications: Customer Micro-segmentation.
  7. Data: The New Oil. Oil & gas exploration and production activities generate large amounts of data from sensors. What opportunities exist for data-driven approaches to improve operations? Image: Drilling into the San Andreas Fault at Parkfield, California. Credit: Stephen H. Hickman, USGS. *http://blog.pivotal.io/pivotal/case-studies-2/data-as-the-new-oil-producing-value-for-the-oil-gas-industry
  8. Data: The New Oil. Oil & gas exploration and production activities generate large amounts of data from sensors. What opportunities exist for data-driven approaches to improve operations? Predictive maintenance: predict equipment function and failure. Motivation: failure costs estimated at $150,000/incident (billions annually)*. Goals: early warning system; insights into prominent features impacting operation and failure; reduction of non-productive drill time; reduced incidents. Image: Drilling into the San Andreas Fault at Parkfield, California. Credit: Stephen H. Hickman, USGS. *http://blog.pivotal.io/pivotal/case-studies-2/data-as-the-new-oil-producing-value-for-the-oil-gas-industry
  9. Predictive Maintenance for Drilling Operations. Stages: Integrating & Cleansing; Feature Building; Modeling.
  10. Primary Data Sources (stage: Integrating & Cleansing; Feature Building; Modeling). Integrated Data is built from two primary data sources. Operator Data (~ thousands of records): failure details, component details, drill bit details. Drill Rig Sensor Data (~ billions of records): Rate of Penetration (ROP), RPM, Weight on Bit (WOB), …
  11. Primary Data Sources: Challenges. Operator Data (~ thousands of records): failure details, component details, drill bit details. Drill Rig Sensor Data (~ billions of records): Rate of Penetration (ROP), RPM, Weight on Bit (WOB), … Challenges: failure instances are not clearly labeled; labels may be embedded in reports or comments. Implications: dependent variable generation also becomes a machine learning exercise; accuracy of failure prediction is impacted by accuracy of failure label derivation.
  12. Primary Data Sources: Challenges. Example records (Well ID, Depth, Comment, Event flag): (1, 1000, "equipment not responding", 1); (2, 2000, "TOOH to bit. rubber pieces seen", 1). Dependent variable generation is itself a machine learning exercise. A text analytics pipeline is needed to convert failure reports or comments to event flags.
  13. Complex Feature Set Across Data Sources. A failure occurred at the end of this run. Taking a window of time prior to failure, what features could we extract (e.g. variance of RPM, max bit position velocity)? Plot panels: Bit position, RPM, ROP, WOB.
  14. Complex Feature Set Across Data Sources. Drill Rig Sensor data: Depth, Rate of Penetration, Torque, Weight on Bit, RPM, … Operator data: drill bit details, component details etc., failure events, … Features on Time Windows: mean, median, standard deviation, range, skewness, … These yield the Final Set of Features on Time Windows. Leverage GPDB / HAWQ (+ MADlib, PL/X) for fast computation of hundreds of features over time windows within billions of rows (or more) of time-series data.
  15. Predictive Maintenance App Pipeline. Diagram elements: Ingest; Data Lake; initial data cleansing filters; Models (Elastic Net Regression, Cox Proportional Hazards Regression, Decision Trees); wells with failure scores and early warning indicators; Business Levers (Early Warning System, Rig Operator Dashboard); Domain Knowledge; Oil Rig Operator; feedback loop for continuous model improvement. Technologies: HAWQ, GPDB, PL/X, MADlib, R, Python, C, Java, Perl, Spark + MLlib.
  16. Case Studies. Oil Drilling: Predictive Maintenance. Telecommunications: Customer Micro-segmentation.
  17. State of Data at Telco Company. Customer Segments: Multi-Gadget Families, Affluent Matures, Thrifty Families, High Tech Singles, Budget Singles, Seniors. New Data Sources: Internet Deep Packet Inspection, TV Consumption (Linear), Video On Demand Consumption.
  18. Understanding Subscriber Behavior. Native Services (Video On Demand, TV, Internet): What is the level of engagement with the client’s products (TV, VOD, Internet)? Internet Devices: What are the patterns of device usage behavior? OTT (Over The Top) Services: What is the level of OTT engagement, by segment and by bandwidth?
  19. Newly Identified Behavior-Based Segments. Subscribers divide into: Moderates; OTT & Data Heavyweights; Portable OTT Entertainment Seekers (iPhone Heavy, Android Heavy, iPad Heavy); In-Home OTT Entertainment Seekers; In-Home Native Content Seekers (VOD Heavy, TV Heavy).
  20. Cross Behavior-based and Existing Segments. New Behavior-Based Segments: Moderates; OTT & Data Heavyweights; In-Home OTT Entertainment Seekers; Portable OTT Entertainment Seekers - iPhone Heavy; Portable OTT Entertainment Seekers - Android Heavy; Portable OTT Entertainment Seekers - iPad Heavy; In-Home Native Content Seekers - VOD Heavy; In-Home Native Content Seekers - TV Heavy. Existing Segments: Multi-Gadget Families, Affluent Matures, Thrifty Families, Budget Singles, High Tech Singles, Seniors. Result: Customized Micro-Segments!
  21. Heterogeneous Data Sources. Prevalence of new data sources was limited but increasing: rich usage data was available on only a subset of the subscribers, which leads to limited applicability of micro-segments. Lack of data may be alleviated by expanding data science efforts: leverage the micro-segmentation model to score a different subset of subscribers (on whom we have limited data). New Data Sources: Internet Deep Packet Inspection, TV Consumption (Linear), Video On Demand Consumption.
  22. Driving New Business Value. Upsell and Cross-Sell; New Product Offerings; Data Monetization.
  23. Implications for the Enterprise. Organizational: vision, preparedness, execution. Technical / Data: data quality & completeness, heterogeneity of data sources, technology architecture. Data quality & completeness: data capture mechanisms can have a lasting impact on the ability to solve a business problem. Heterogeneity of data sources: the existence of legacy systems & devices may limit the applicability of new models unless it is taken into account ahead of time; feedback can spur upgrading of equipment wherever possible.
  24. Implications for the Enterprise. Creating value from IoT requires organizational and technical alignment. Impacts of these considerations on data science efforts and outcomes are non-trivial. Specific impacts of data issues include: longer time to realization of value; model accuracy issues; limited applicability of results; and more …
  25. For further information, check out … Pivotal Blog: http://blog.pivotal.io. Pivotal Data Science Blog: http://blog.pivotal.io/data-science-pivotal. Pivotal Data Product Info, Docs and Downloads: http://pivotal.io/big-data. Oil & Gas Use Case Webinar: video at https://www.youtube.com/watch?v=dhT-tjHCr9E; slides at http://www.slideshare.net/Pivotal/data-as-thenewoil. Blogs: Oil & Gas Use Case at http://blog.pivotal.io/pivotal/case-studies-2/data-as-the-new-oil-producing-value-for-the-oil-gas-industry; Time Series Analysis at http://blog.pivotal.io/tag/time-series-analysis.
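
Slides 11 and 12 note that failure labels may be buried in operator comments and that a text analytics pipeline is needed to turn them into event flags. The deck does not describe that pipeline, so the following is only a minimal sketch of the idea, assuming a hand-written keyword list (`FAILURE_PATTERNS` is hypothetical) in place of a trained text model. The first two records come from the slide; the third is an invented non-failure example.

```python
import re

# Hypothetical keyword patterns. A production pipeline would more likely
# learn these from labeled reports than hard-code them.
FAILURE_PATTERNS = [
    r"not responding",
    r"rubber pieces",
    r"\bfail(ed|ure)?\b",
]
_FAILURE_RE = re.compile("|".join(FAILURE_PATTERNS), re.IGNORECASE)


def event_flag(comment: str) -> int:
    """Return 1 if the comment matches any failure pattern, else 0."""
    return 1 if _FAILURE_RE.search(comment) else 0


records = [
    {"well_id": 1, "depth": 1000, "comment": "equipment not responding"},
    {"well_id": 2, "depth": 2000, "comment": "TOOH to bit. rubber pieces seen"},
    # Invented example of a comment that should not be flagged.
    {"well_id": 3, "depth": 1500, "comment": "routine connection, no issues"},
]

# Derive the dependent variable (event flag) from the free-text comment.
for record in records:
    record["event_flag"] = event_flag(record["comment"])
    print(record["well_id"], record["event_flag"])
```

As slide 11 points out, any error in this labeling step carries through to the failure-prediction model, so a rule list like this would need to be validated against known failures before use.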

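Slides 13 and 14 describe taking a window of time prior to a failure and computing summary features (mean, median, standard deviation, range, skewness) per sensor. The deck computes these in GPDB / HAWQ with MADlib over billions of rows; the sketch below only illustrates the calculation itself on a small in-memory series. The function names, the RPM readings, and the failure time are all assumptions made for illustration.

```python
import statistics


def window_features(values):
    """Summary statistics for one sensor over one time window."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    # Population skewness: third central moment divided by std cubed.
    if std > 0:
        skew = sum((v - mean) ** 3 for v in values) / (len(values) * std ** 3)
    else:
        skew = 0.0
    return {
        "mean": mean,
        "median": statistics.median(values),
        "std": std,
        "range": max(values) - min(values),
        "skewness": skew,
    }


def features_before_failure(readings, failure_time, window):
    """Compute features over readings with failure_time - window <= t < failure_time.

    readings is a list of (timestamp, value) pairs. An empty window raises
    statistics.StatisticsError, which the caller should handle.
    """
    in_window = [v for t, v in readings if failure_time - window <= t < failure_time]
    return window_features(in_window)


# Invented RPM readings, one per second; failure assumed at t = 10.
rpm_values = [120.0, 120.0, 120.0, 120.0, 120.0, 120.0, 125.0, 130.0, 135.0, 140.0]
rpm = list(enumerate(rpm_values))

features = features_before_failure(rpm, failure_time=10, window=5)
print(features)
```

In practice the same function would be applied to every sensor (ROP, WOB, torque, and so on) and to several window lengths, which is how the feature count grows into the hundreds mentioned on slide 14.
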
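
Slide 20 crosses the new behavior-based segments with the existing segments to obtain micro-segments. One plausible reading, assumed here and not confirmed by the deck, is a Cartesian product of the two label sets, with each subscriber assigned the pair of labels they carry. The segment names are taken from the slide; the subscribers and the `micro_segment` helper are hypothetical.

```python
from collections import Counter
from itertools import product

BEHAVIOR_SEGMENTS = [
    "Moderates",
    "OTT & Data Heavyweights",
    "In-Home OTT Entertainment Seekers",
    "Portable OTT Entertainment Seekers - iPhone Heavy",
    "Portable OTT Entertainment Seekers - Android Heavy",
    "Portable OTT Entertainment Seekers - iPad Heavy",
    "In-Home Native Content Seekers - VOD Heavy",
    "In-Home Native Content Seekers - TV Heavy",
]
EXISTING_SEGMENTS = [
    "Multi-Gadget Families",
    "Affluent Matures",
    "Thrifty Families",
    "Budget Singles",
    "High Tech Singles",
    "Seniors",
]


def micro_segment(existing: str, behavior: str) -> str:
    """Label for the micro-segment formed by one existing and one behavior segment."""
    return f"{existing} / {behavior}"


# Every possible micro-segment under the Cartesian-product assumption.
ALL_MICRO_SEGMENTS = [
    micro_segment(e, b) for e, b in product(EXISTING_SEGMENTS, BEHAVIOR_SEGMENTS)
]

# Invented subscribers, each already carrying both segment labels.
subscribers = [
    {"id": "s1", "existing": "Seniors", "behavior": "In-Home Native Content Seekers - TV Heavy"},
    {"id": "s2", "existing": "Seniors", "behavior": "In-Home Native Content Seekers - TV Heavy"},
    {"id": "s3", "existing": "High Tech Singles", "behavior": "OTT & Data Heavyweights"},
]

counts = Counter(micro_segment(s["existing"], s["behavior"]) for s in subscribers)
print(len(ALL_MICRO_SEGMENTS))
print(counts.most_common())
```

Counting subscribers per micro-segment, as in the last step, is a simple way to see which combinations are populated enough to act on, which matters given slide 21's point that rich usage data exists for only a subset of subscribers.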