Making Sense of IoT Data
w/ Big Data + Data Science
Charles Cai
- The views expressed here are my own and not those of my employer
Making Sense of IoT w/ Big Data + Data Science
- IoT = Big Data
- In this talk we discuss the latest developments in Big Data, Machine Learning and Data Science, and recent IoT use cases in healthcare (new drug trials), geospatial mapping, disaster relief, retail and insurance, covering the life cycle of IoT data analytics: capturing, storing, cleansing, analysing, predicting and maintaining…
- There will be 30~50 billion Internet-connected devices in 5 years
- How IoT can drive innovation in various industries
- IoT = Big Data: how the open source big data ecosystem supports IoT-driven business cases
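The life cycle named above (capturing, storing, cleansing, analysing, predicting) can be sketched in a few lines of Python. This is only an illustrative sketch: the readings, the plausible value range and the naive trend forecast are all assumptions made for the example, not anything taken from the talk.

```python
from statistics import mean

# Hypothetical captured readings from one sensor. None marks a dropped packet
# and 999.0 an obviously faulty value (both are made-up illustration data).
captured = [20.0, 21.0, None, 999.0, 22.0, 23.0]


def store(readings):
    """Stand-in for a durable store: here just an in-memory copy."""
    return list(readings)


def cleanse(readings, low, high):
    """Drop missing values and values outside the plausible range."""
    return [r for r in readings if r is not None and low <= r <= high]


def analyse(readings):
    """Descriptive summary of the cleansed readings."""
    return {
        "count": len(readings),
        "mean": mean(readings),
        "min": min(readings),
        "max": max(readings),
    }


def predict_next(readings):
    """Naive forecast: last value plus the average step between readings."""
    steps = [b - a for a, b in zip(readings, readings[1:])]
    return readings[-1] + mean(steps)


stored = store(captured)
clean = cleanse(stored, low=-40.0, high=85.0)  # assumed valid range
summary = analyse(clean)
forecast = predict_next(clean)
```

In a real deployment each function would be replaced by a dedicated system (a message queue, a distributed store, a model), but the hand-off between stages stays the same.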
Making Sense of IoT Data with Big Data + Data Science
Big Data Week Conference 2015
Charles Cai
Big Data + Data Science
Leading Oil and Gas Trading Company
- Innovating with Disruptive Technologies
Data Center Operation System
Data Operation System 2.0
Data Science Maturity Model D - I - K - W
Crowdsourcing
MOOC / OSH / OSS
Data Science Maturity Model
Big Data DevOps / Data Scientist Shortage
Operating BDA: Microservices
Graph Database / Graph Computing
Open Source Hardware / Software
Data – Information – Knowledge - Wisdom
The Power of Crowdsourcing
Intro
- Bio
- #FO #FICC: Investment Banking Front Office: FX/Commodities
- #ETRM: Energy Trading & Risk Management
- #entrepreneur #innovator #disruptor
- Voted as one of the UK’s Top 50 Data Leaders & Influencers
- Twitter: @caidong
- #big-data #IoT #data-science #MOOC #Mobile #Cloud #UX
- LinkedIn: http://uk.linkedin.com/in/charlescai/en
Where are we with Big Data Analytics?
By Thomas Davenport – Harvard Business Review
BI vs DS: from Descriptive to Prescriptive – Ironside
The Ironside Group Quantifiable ROI
- Use Case: Parkinson’s Disease New Drug Trial
- There is no cure for Parkinson’s disease
- New medicine trials are an extremely slow process (daily doses x8!)
- Traditional feedback from patients is infrequent
- Wireless-enabled wearable device + IMU sensors
- Classification of wearer activities: sitting, standing, walking, running, sleeping…
- Detect patterns of Parkinson’s disease symptoms
- Predicting deterioration / improvement speed
- New trial medicine effectiveness
- Sensor data at 10 Hz sampling = 1 GB / day / patient
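The figure of 1 GB per patient per day at 10 Hz can be unpacked with simple arithmetic. The sketch below assumes continuous 24-hour sampling and a decimal gigabyte (10^9 bytes); the trial size used at the end is a hypothetical example, not a number from the talk.

```python
SAMPLE_RATE_HZ = 10               # from the slide
SECONDS_PER_DAY = 24 * 60 * 60    # assumes the device samples around the clock
BYTES_PER_DAY = 10**9             # 1 GB/day/patient, assuming decimal gigabytes

# Number of sampling instants per patient per day.
samples_per_day = SAMPLE_RATE_HZ * SECONDS_PER_DAY

# Implied payload per sampling instant (all sensor channels plus overhead).
bytes_per_sample = BYTES_PER_DAY / samples_per_day


def trial_storage_tb(patients, days):
    """Raw storage in terabytes for a trial of the given size."""
    return patients * days * BYTES_PER_DAY / 10**12


# Hypothetical trial: 100 patients followed for 90 days.
example_tb = trial_storage_tb(patients=100, days=90)
```

Under these assumptions there are 864,000 sampling instants per day, so each one carries a little over 1 KB, and even a modest trial reaches several terabytes of raw data.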
IoT = Big Data
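Classifying wearer activity from IMU data, as listed in the use case above, is commonly approached by cutting the signal into fixed windows and computing features per window. The sketch below is a deliberately simple stand-in that uses one feature (the spread of acceleration magnitude) and two thresholds; the threshold values, the class names and the sample data are assumptions for illustration, and a real trial would use a trained model rather than hand-set cut-offs.

```python
import math
from statistics import pstdev


def magnitude(sample):
    """Length of a 3-axis accelerometer vector (ax, ay, az), in g."""
    ax, ay, az = sample
    return math.sqrt(ax * ax + ay * ay + az * az)


def windows(samples, size):
    """Split samples into consecutive, non-overlapping windows of `size`."""
    for start in range(0, len(samples) - size + 1, size):
        yield samples[start:start + size]


def classify_window(window, still_max=0.05, walk_max=0.5):
    """Label a window by how much the acceleration magnitude varies.

    still_max and walk_max are hypothetical thresholds, not calibrated values.
    """
    spread = pstdev([magnitude(s) for s in window])
    if spread < still_max:
        return "still"
    if spread < walk_max:
        return "walking"
    return "running"


# Made-up data: 2 seconds each at 10 Hz.
resting = [(0.0, 0.0, 1.0)] * 20
walking = [(0.0, 0.0, 0.8), (0.0, 0.0, 1.2)] * 10
running = [(0.0, 0.0, 0.2), (0.0, 0.0, 2.2)] * 10

labels = [classify_window(w) for w in windows(resting + walking + running, 20)]
```

The same windowing step would feed richer features (frequency content, gyroscope axes) into a proper classifier when detecting symptom patterns.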
Open Source Hardware
Same Data Science disciplines for vertical industries
Open Source Data Science Toolbox
Hadoop / Mesos
Distributed Storage + Scalable Computation
Open Source Big Data / Data Science Platform
COTS Apps (Excel, Tableau, Qlik...)
Statistical Time Series Analysis
Wider Big Data Analytics eco-systems
•  Shell/APIs: HDFS, Hive, Spark, HBase, Sqoop, JDBC/ODBC
•  Languages: Julia, Python, R, Scala
- Developed on:
- Operated by:
NLTK: Natural Language
Distributed Time Series / Geospatial / Graph Databases
GIT Repo
Data Products
WebSocket
Drag + Drop (CZML/GeoJSON)
Web Browser (collaboration)
Export to CSV/Excel
Geospatial data
Time Series data
Public Data
Market data
Real-time Streaming
Open Gov Data
JDBC via Phoenix
HDFS
Hive/Pig w/ Geospatial
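The platform diagram mentions GeoJSON as a drag-and-drop format for geospatial data. As a minimal sketch of what one sensor reading could look like in that format, the snippet below builds a GeoJSON `Feature` with a `Point` geometry. The device id, coordinates, timestamp and value are invented for illustration, and the property names are assumptions; only the `type` / `geometry` / `properties` layout and the longitude-first coordinate order come from the GeoJSON format itself.

```python
import json


def reading_to_feature(device_id, lon, lat, timestamp, value):
    """Wrap one sensor reading as a GeoJSON Feature (longitude comes first)."""
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": {
            "device_id": device_id,   # hypothetical property names
            "timestamp": timestamp,
            "value": value,
        },
    }


# Invented example reading located in central London.
feature = reading_to_feature(
    "sensor-001", -0.1276, 51.5072, "2015-11-25T10:00:00Z", 12.5
)
collection = {"type": "FeatureCollection", "features": [feature]}

# Serialised form, ready to be written to a .geojson file or sent to a browser.
payload = json.dumps(collection)
```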
… here’s Microservice and Data Center Operation System related:
Key Sub-systems in Modern Big Data Analytics Stack
Data Analytics
Streaming
Graph Computing
Machine Learning
…
Data Science Maturity Map – where we are, where we can go
Data - Information - Knowledge - Wisdom / Intelligence
“Note: The current version focuses mainly around data / machine
learning - a new version for cross industry use cases with more
coverage on IoT, container, data flow etc… is being developed – ETA Dec
2015 / Jan 2016.
Please follow Twitter: @caidong to receive the latest version soon”
From Classic to Modern Architecture
Full Text Search Natural Language Process
CCTV / Voice
Computer Vision + Q&A
Deep Learning (CNN/RNN)
RDBMS / DW KV + GraphDB + BD DW
Business Intelligence Big Data, Machine Learning
Lightweight Container + Microservices + API Harvesting
n-tier architecture
Semantic Search
Keyword Search
Named Entity Extraction Q&A N-Grams
Faceted Search Geospatial Search
Tables Primary Keys Foreign Keys Node / Vertex Label Edge / Relationship Properties
Colours Shapes Complex Shapes Textiles Accessories Context
What happened? What’s happening? Predictive Analytics Prescriptive Analysis “Make the trend!”
Database App Server Web Front Cloud Distributed and Fault Tolerant “Data Centre as One Computer”
Unstructured
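The slide above contrasts the relational vocabulary (tables, primary keys, foreign keys) with the graph vocabulary (node / vertex, label, edge / relationship, properties). A small sketch of the same data in both shapes may make the mapping concrete. Plain Python dictionaries and lists stand in for a real RDBMS and graph database here, and the device and reading records are invented for illustration.

```python
# Relational view: two "tables"; readings point at devices via a foreign key.
devices = {1: {"name": "wearable-A"}, 2: {"name": "wearable-B"}}  # primary key -> row
readings = [
    {"id": 10, "device_id": 1, "value": 0.98},
    {"id": 11, "device_id": 1, "value": 1.02},
    {"id": 12, "device_id": 2, "value": 0.75},
]


def readings_for_device(device_id):
    """Join by scanning the foreign key column."""
    return [r["id"] for r in readings if r["device_id"] == device_id]


# Graph view: every row becomes a labelled node with properties,
# and every foreign key becomes an explicit, named edge.
nodes = {
    f"device:{pk}": {"label": "Device", "properties": row}
    for pk, row in devices.items()
}
nodes.update({
    f"reading:{r['id']}": {"label": "Reading", "properties": {"value": r["value"]}}
    for r in readings
})
edges = [
    (f"device:{r['device_id']}", "RECORDED", f"reading:{r['id']}")
    for r in readings
]


def neighbours(node_id, relationship):
    """Follow outgoing edges of one relationship type from a node."""
    return [dst for src, rel, dst in edges if src == node_id and rel == relationship]
```

Both `readings_for_device(1)` and `neighbours("device:1", "RECORDED")` answer the same question; the graph form simply stores the relationship as data instead of recomputing it with a join.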
- Working with HR Training team
- VTA Training Sessions
- Big Data Bootcamp
- Lunch and Learn KT Sessions
Big Data Technology is evolving so fast… here’s Hadoop related:
Big data ELT with Apache Sqoop
BI vs Data Science
Data Scientist Career Path
MOOC and Machine Learning
Machine Learning with Apache Spark
Map Reduce 101
Big Data Security: Kerberos/Knox/Sentry
Deep Learning and Use Cases
Time Series and Geospatial Big Data Analytics with Impala
HBase: Distributed Key-value BigTable
Distributed Time Series DB: OpenTSDB
Machine Learning with Hadoop and R
Advanced Machine
Bayesian Network
Big Data / Data Science Learning Resource: free e-Books
Data Jujitsu: The Art of Turning Data into Product
Data Mining Algorithms In R
A Programmer's Guide to Data Mining
Data Mining and Analysis: Fundamental Concepts and Algorithms
Mining of Massive Datasets
The School of Data Handbook
Theory and Applications for Advanced Text Mining
An Introduction to Data Science
All e-Books below
$169.99 (=$240 savings)
Master your Compliance Big Data - Resources: MOOC
- Coursera
- EdX
- Udacity
- iTunes U
Big Data / Data Science Virtual Machines + Containers
Big Data / Data Science Certifications: EMC, Cloudera, …
CCP: Data Scientists:
- elite level
- real-world designing and developing
- production-ready data science solution
- peer-evaluated for accuracy, scalability, and robustness
EMC Data Science Associate:
- Data Analytics Lifecycle
- Analyzing / exploring data w/ R
- Statistics modelling, theory and advanced methods
- Advanced technology & tools
- Operationalizing
Data Science Crowdsourcing : Hackathons, Startups
BI vs DS: from Descriptive to Prescriptive - Gartner
Gartner – Analytical Difficulty by Value
BI vs DS: from Descriptive to Prescriptive - IBM
BI vs DS: from Descriptive to Prescriptive - SAP
SAP – Analytics Maturity by Competitive Advantage
