Enable Advanced Analytics with Hadoop and
an Enterprise Data Hub
3
Agenda
• Business Problem
• Current Challenges
• Agile Analytics
• Case Studies
4
Business Problem
5
From BI to Advanced Analytics
[Chart axes: Time, Data Size; Facts to Interpretations]
• What happened? When? And where?
• What will happen?
• How and why did it happen?
• How can we do better?
6
Advanced Analytics that Saves Us Money
• Customer churn analysis model
• Integrated customer support and services
• Fraud detection
7
Advanced Analytics that Makes Us Money
• Product recommendation engines
• Location-based real-time offers
• Target-based pricing strategy
8
Analytic Opportunities
[Chart labels: Marketing, Sales, Operations; Total Market, Known Market, Customers; axes: t, value ($)]
9
Enterprise Pressures - Questions
[Chart labels: Marketing, Sales, Operations; Total Market, Known Market, Customers; axes: t, value ($)]
“We want to know what our customers do online and in our stores. How can we combine data from separate analytics silos to understand & serve them better?”
“How can we reduce stock-outs & ensure products are in the right stores at the right time? Can we combine data from our carriers with in-store historical data from thousands of stores?”
“Theft, or ‘shrinkage’, in our stores is on the increase – can we combine POS data with video surveillance to reduce it without impacting customer service negatively?”
10
Enterprise Pressures - Questions
Data Products
11
Current Challenges
12
Data Product Value
[Chart: data products 1 to 8 plotted by value against cost to implement (in time, budget, people, tools); axis ticks at $500K and $1M; labels: operational data, sensor data, Multi-source – Fuzzy Value]
13
Data Product + Risk
[Chart: data products 1 to 8 plotted by value against cost to implement (in time, budget, people, tools); axis ticks at $500K and $1M; risks rated low, medium, high; labels: sensor data, Known Value, Single-Source, Multi-source – Fuzzy Value]
14
Impact of Status Quo
Executives: “We don’t have the information we need to answer key business questions.”
DBA/DW Admins: “I need to make sure the DW is secure & compliant for the mission critical reports.”
Data Scientists: “I’m sick of waiting for my data, I’m going to make my own copy.”
15
What if?
[Chart: data products 1 to 8 plotted by value against cost to implement (in time, budget, people, tools); axis ticks at $500K and $1M; risks rated low, medium, high]
16
Agile Analytics
17
Traditional Advanced Analytics Process
[Diagram: Time-to-Insight across Project Definition, Data Preparation, Exploratory Analytics, Operational Analytics. Steps shown: Problem ID; Data Access Request & Discovery; Data Sampling; Data Transformation; Model Creation; Model Evaluation; Deploy Model]
18
Analytics Process with EDH
[Diagram: Time-to-Insight across Project Definition, Data Preparation, Exploratory Analytics, Operational Analytics. Steps shown: Problem ID; Data Access Request & Discovery; Data Sampling; Data Transformation; Model Creation; Model Evaluation; Deploy Model]
19
Analytics Process with EDH
[Diagram: Time-to-Insight across Project Definition, Data Preparation, Exploratory Analytics, Operational Analytics. Steps shown: Problem ID; Data Access Request & Discovery; Data Sampling; Data Transformation; Model Creation; Model Evaluation; Deploy Model]
20
Analytics Process with EDH
[Diagram: Project Definition, Data Preparation, Exploratory Analytics, Operational Analytics. Steps shown: Problem ID; Data Access Request & Discovery; Data Sampling; Data Transformation; Model Creation; Model Evaluation; Deploy Model]
Deliver Insights Sooner
21
Issues
[Chart labels: Marketing, Sales, Operations; Market Data, System, Information; axis: value ($)]
22
Step 1: Collect All Data
[Diagram labels: Marketing; Market Data, System, Information]
STORAGE FOR ANY TYPE OF DATA
UNIFIED, ELASTIC, RESILIENT, SECURE
23
Step 2: Create Derived Datasets
[Diagram labels: Marketing; BATCH PROCESSING; 3RD PARTY APPS; Data Set 1, Data Set 2]
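As a rough illustration of a derived dataset, the Python sketch below combines two hypothetical source feeds, point-of-sale records and carrier deliveries, into one per-store stock position. The record layouts, field names, and the `derive_stock_position` helper are assumptions made for this example and are not taken from the deck; in an EDH this kind of join would typically run as a batch job over far larger inputs.

```python
from collections import defaultdict

# Hypothetical raw records from two separate silos (illustrative data only).
pos_sales = [
    {"store": "S1", "sku": "A", "units": 3},
    {"store": "S1", "sku": "A", "units": 2},
    {"store": "S2", "sku": "A", "units": 4},
]
carrier_deliveries = [
    {"store": "S1", "sku": "A", "units": 10},
    {"store": "S2", "sku": "A", "units": 4},
]


def total_units(records):
    """Sum units per (store, sku) pair."""
    totals = defaultdict(int)
    for record in records:
        totals[(record["store"], record["sku"])] += record["units"]
    return totals


def derive_stock_position(sales, deliveries):
    """Join the two feeds into one derived dataset, one row per (store, sku)."""
    sold = total_units(sales)
    delivered = total_units(deliveries)
    rows = []
    for store, sku in sorted(set(sold) | set(delivered)):
        units_in = delivered.get((store, sku), 0)
        units_out = sold.get((store, sku), 0)
        rows.append({
            "store": store,
            "sku": sku,
            "delivered": units_in,
            "sold": units_out,
            "remaining": units_in - units_out,
        })
    return rows


stock_position = derive_stock_position(pos_sales, carrier_deliveries)
for row in stock_position:
    print(row)
```

Each row of `stock_position` describes one store and SKU; rows with a low `remaining` value are candidates for the stock-out question raised under Enterprise Pressures.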
25
Step 3: Data Analysts
[Diagram labels: Marketing; Data Set 1, Data Set 2; ANALYTIC SQL; SEARCH ENGINE]
26
Step 4: Analytics
[Diagram labels: Marketing; Data Set 1, Data Set 2; MACHINE LEARNING; STREAM PROCESSING; 3RD PARTY APPS; Clustering, Recommender, Regression]
27
Step 4 Cont: Analytics + Data Together
[Diagram labels: Data Set 1, Data Set 2; Old Way: SAS/R, JDBC-SELECT 10%; MACHINE LEARNING; SAS+/R+ (ORYX); ALGORITHM]
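One way to see why pulling a 10% sample out to a separate tool can limit analysis: rare events become scarcer still in the sample. The Python sketch below uses synthetic data only; the record count, the 0.1% event rate, and the helper names are assumptions chosen for illustration, not figures from the deck.

```python
import random


def make_transactions(n, event_rate, seed):
    """Generate n synthetic records; True marks a rare event (e.g. fraud)."""
    rng = random.Random(seed)
    return [rng.random() < event_rate for _ in range(n)]


def sample_fraction(records, fraction, seed):
    """Draw a simple random sample without replacement."""
    rng = random.Random(seed)
    k = int(len(records) * fraction)
    return rng.sample(records, k)


# Assumed sizes, for illustration only.
transactions = make_transactions(n=100_000, event_rate=0.001, seed=42)
sample = sample_fraction(transactions, fraction=0.10, seed=7)

full_positives = sum(transactions)
sample_positives = sum(sample)

print("records in full data:", len(transactions))
print("records in 10% sample:", len(sample))
print("rare events in full data:", full_positives)
print("rare events in 10% sample:", sample_positives)
```

On average the sample holds about a tenth of the rare-event examples, which is the kind of loss that running the algorithm next to the full data set is meant to avoid.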
28
Cloudera EDH for Analytics
[Diagram labels: BATCH PROCESSING; ANALYTIC SQL; SEARCH ENGINE; MACHINE LEARNING; STREAM PROCESSING; 3RD PARTY APPS; WORKLOAD MANAGEMENT; DATA MANAGEMENT; SYSTEM MANAGEMENT; Filesystem; Online NoSQL]
STORAGE FOR ANY TYPE OF DATA
UNIFIED, ELASTIC, RESILIENT, SECURE
29
Business Value Delivered
Executives
• Acquire necessary information sooner to make critical business decisions
DBA/DW Admins
• Support both reporting and analytics needs
• Save resources with shared security and management
Data Scientists
• Acquire data necessary for projects
• Develop analysis/models with better lift faster
• Share data sets to empower others
30
Case Studies
31
Monsanto can automate data-driven R&D decisions to reduce time to market from years to months.
Ask Bigger Questions: How do we feed the world?
32
Monsanto feeds our growing, global population
The Challenge:
• 1,000+ research scientists developing products in silos
• Data processing bottleneck slows development
• Time to market for new product is 5-10 years
The Solution:
• Cloudera Enterprise + Search + Impala: PB-scale platform for single view of all R&D data
• Integration: Exadata, spatial awareness & visualization
• Scientists directly access CDH; Navigator offers auditing & access control
Monsanto can automate data-driven R&D decisions to reduce time to market to months from years.
33
Patterns and Predictions analyzes mobile data and social networking text for real-time identification of risk factors.
Ask Bigger Questions: How can we prevent veteran suicide?
34
Patterns and Predictions aids suicide prevention
The Challenge:
• Suicide rates among veterans are roughly double the rate among general US adults
• Military efforts struggle to understand risk factors
The Solution:
• Suicide risk predictive solution built on Cloudera + Attivio
• Analyzes veterans’ mobile & social data for real-time identification of risk factors
• Integrating Cloudera Search + Impala to simplify environment
The Durkheim Project predicts suicide risk with statistical significance (65%+ accuracy).
Thank You!
35


Editor's Notes

  1. The executives' requirement to answer key business questions while operational reports keep running has strained the relationship between the Data Scientists and the DBA/DW Admins. The DBA/DW Admins are forced to choose between keeping the DW secure and compliant and giving Data Scientists access to the data they need when they need it.
  2. A common misconception is that data science work centers on model development. While model development is crucial, most of the time and effort goes into data preparation. In the traditional world of analytics, data is moved around a great deal, which is time-consuming and limits what data scientists can do.
  3. Coming back to how this solution affects the three groups of people in an enterprise who are closest to the data, we quickly see the following. For the Data Scientist: he can acquire the data needed for a project very quickly, without creating rogue data marts; because he can now use all the data very quickly, he can develop models with much better lift; and once he has the insights, he can share the data set to empower other users. For the DW administrator: he can now support the mission-critical reports running in his DW while also meeting the data scientists' need for data, and he saves time and resources because all the data is in one centralized location with unified security and management. For the Executive: she can finally get the overall reports she needs on a regular basis while still gaining a competitive edge, whether by decreasing costs and risks or by increasing revenue, with the insights gained from using all the data.