Looking to analyze your Big Data assets to unlock real business benefits today? Are you sick of all the theory, hype and hoopla?
View these slides from Actian and Yellowfin’s "Big Data Analytics with Hadoop" Webinar to discover how we’re making Big Data Analytics fast and easy.
Hold on as we go from data in Hadoop to dashboard in just 40 minutes.
Learn how to combine Hadoop with the most advanced Big Data technologies, and the world’s easiest BI solution, to quickly generate real business value from Big Data Analytics.
Watch as we use live CDR data stored in Hadoop – quickly connecting, preparing, optimizing and analyzing this data in a tangible real-world use case from the telecommunications industry – to easily deliver actionable insights to anyone, anywhere, anytime.
To learn more about Yellowfin, and to try its intuitive Business Intelligence platform today, go here: http://www.yellowfinbi.com
To learn more about Actian, and its next generation suite of Big Data technologies, go here: http://www.actian.com/
Making Big Data Analytics with Hadoop fast & easy (webinar slides)
1. December 16, 2013
Making Big Data Analytics Fast and Easy
Using Actian, Yellowfin and Hadoop
John Ryan, Marketing Manager APAC, Actian Corporation
Ryan Templeton, Snr Solutions Architect, Actian Corporation
Ivan Seow, Snr Technical Consultant, Yellowfin
3. Take Action on Big Data
Making BI Easy
Fastest Data Prep Engine
Fastest Hadoop Loader
Fastest Single Node Database
Fastest MPP Database
Huge library of Analytical Functions
4. Take Action on Big Data
Making BI Easy
Ranked #1 BI Vendor: Dresner Global BI Study 2012 & 13
#1 Dashboard Vendor: BARC BI Survey 12
#1 Enterprise Reporting Vendor: BARC BI Survey 13
Gartner: ‘Vendor to Consider’
Fastest Data Prep Engine
Fastest Hadoop Loader
Fastest Single Node Database
Fastest MPP Database
Huge library of Analytical Functions
5. Today’s Agenda
1. Big Data Analytics with Hadoop
2. Making Analytics in Hadoop Fast & Easy
3. Customer Example (Telecom)
4. Demo: From Data to Dashboard
• Making Hadoop Fast and Easy
• Making BI Fast and Easy
5. Summary
7. 73%
Expect to have HDFS in production
Based on 263 respondents
TDWI Best Practices Report – Q2 2013
8. 71%
Big Data Source for Analytics
Most Likely to Benefit from Hadoop
Based on 263 respondents
TDWI Best Practices Report – Q2 2013
9. Why is analytics inside Hadoop so hard and slow?
HDFS is a file system, not a database
Need a Data Scientist
Queries not standard SQL, only resemble SQL
MapReduce inefficient for analytic queries
11. Actian Big Data Analytic Platform
Accelerating Big Data 2.0
From DATA to VALUE: Connect, Prepare, Optimize, Analyze
Diagram labels: Big Data Storage, Enterprise Applications, DW, Business Intelligence
Advanced technology platform:
Multiple deployment options: On-premise, Cloud, Hybrid, Embedded
Industry leading: Scale, Performance, Complexity, Cost (price/performance), Time to Value
16. Industry Leading Performance
Process Hadoop Data Faster: Dataflow vs PIG (MapReduce)
Analyze Data Faster: Database Benchmarks
DBT-3@1TB: Run times
TPC-H QphH@1TB Benchmarks (non-clustered)
19. Customer Use Case
Tier two telecom provider
Planning for large growth with minimal staff impact
Business demands deeper insights
20. IT Challenges
Collect, manage, process CDR data in Hadoop
Swamped with data. Network switch dumps 200 MB/min during peak times. Hundreds of thousands of records per drop. 170 columns.
Users are domain experts, not data scientists
Too hard to analyze. Raw data must first be distilled and enriched to gain insight
21. What the business was asking for
Fastest time to decision: Speed up processing by an order of magnitude
Increased granularity of analysis: Without increasing processing times or bogging down backend
Proactive analysis, not reactive: Enable trend analysis and predictive capabilities
Answer real business questions: e.g. visual insight for near real-time customer and vendor performance, determine routing performance optimization, etc.
Scale for future growth: Extensible for future capabilities and scalable growth
22. Specific Business Questions - CDR Analysis
Answer Service Rate (ASR & Adjusted ASR)
• Calls completed vs. route attempts (vendor performance)
• Calls completed vs. call attempts (customer satisfaction)
Opportunity Monitor
• Calculate profit/loss per call due to routing path chosen
Post Dial Delay (PDD)
• Annoying delay until path through network selected
Analysis of near real time quality measures
• Call duration, jitter and packet loss
Trends and correlations of above metrics
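The two completion ratios listed under Answer Service Rate can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation shown in the webinar: the `CallSummary` fields and the sample counts are hypothetical, and because the slide does not say which ratio is ASR and which is Adjusted ASR, the sketch names each ratio by what it measures.

```python
from dataclasses import dataclass


@dataclass
class CallSummary:
    """Hypothetical aggregate of CDR counts for one reporting window."""
    call_attempts: int    # calls placed by customers
    route_attempts: int   # routing tries across vendors (one call may try several routes)
    calls_completed: int  # calls that were answered


def completion_ratio(completed: int, attempts: int) -> float:
    """Return completed / attempts, or 0.0 when there were no attempts."""
    if attempts == 0:
        return 0.0
    return completed / attempts


# Sample numbers are invented for illustration only.
summary = CallSummary(call_attempts=1000, route_attempts=1600, calls_completed=800)

# Calls completed vs. route attempts (vendor performance).
vendor_ratio = completion_ratio(summary.calls_completed, summary.route_attempts)

# Calls completed vs. call attempts (customer satisfaction).
customer_ratio = completion_ratio(summary.calls_completed, summary.call_attempts)

print(f"vendor: {vendor_ratio:.0%}, customer: {customer_ratio:.0%}")
```

A call that fails on one vendor route and completes on another counts as one call attempt but two route attempts, which is why the two ratios can differ.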
23. CDR Workflow Overview
CONNECT
TRANSFORM
Filter data; logical functions
Extract failed routing attempts
Split flow for separate processing rules
Meta-node encapsulates processing
PARALLEL DATA LOAD
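The "extract failed routing attempts" and "split flow" steps in this workflow amount to routing each record down one of two branches. The Python sketch below shows that idea only; it does not use the Dataflow tooling from the demo, and the `route_status` field, the `"FAILED"` value and the sample records are assumptions made for illustration.

```python
def split_by_route_outcome(records):
    """Split CDR records into failed routing attempts and everything else.

    Assumes each record is a dict with a hypothetical 'route_status' key.
    """
    failed, other = [], []
    for record in records:
        if record.get("route_status") == "FAILED":
            failed.append(record)
        else:
            other.append(record)
    return failed, other


# Invented sample records; real CDRs in the use case carry about 170 columns.
cdrs = [
    {"call_id": 1, "route_status": "FAILED"},
    {"call_id": 2, "route_status": "COMPLETED"},
    {"call_id": 3, "route_status": "FAILED"},
]

failed_attempts, remaining = split_by_route_outcome(cdrs)
print(len(failed_attempts), "failed,", len(remaining), "other")
```

Each branch can then have its own processing rules applied before the results are loaded.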
24. Data processing – Execution Plan
Compiled to a set of physical graphs
Phase 1, Phase 2
Reader → FilterRows → DeriveFields → Group(partial) → Repartition → Group(final) → Writer
(operator chain shown four times in the diagram)
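The operator chain on this slide follows a common two-stage aggregation pattern: each stream aggregates its own rows, the partial results are repartitioned by key, and a final aggregation merges them. The Python sketch below imitates that pattern in a single process. It is not Dataflow code; the field names (`area_code`, `duration_s`), the filter rule and the sample rows are all hypothetical.

```python
import zlib
from collections import Counter, defaultdict


def run_partition(rows):
    """FilterRows -> DeriveFields -> Group(partial) for one input partition."""
    partial = Counter()
    for row in rows:
        if row["duration_s"] <= 0:        # FilterRows: drop zero-length calls (assumed rule)
            continue
        minutes = row["duration_s"] / 60  # DeriveFields: seconds to minutes
        partial[row["area_code"]] += minutes  # Group(partial): per-partition subtotal
    return partial


def repartition(partials, n_workers):
    """Send every subtotal for a given key to the same worker bucket."""
    buckets = [defaultdict(list) for _ in range(n_workers)]
    for partial in partials:
        for key, subtotal in partial.items():
            worker = zlib.crc32(key.encode()) % n_workers  # stable hash of the key
            buckets[worker][key].append(subtotal)
    return buckets


def group_final(bucket):
    """Group(final): merge the subtotals that arrived for each key."""
    return {key: sum(subtotals) for key, subtotals in bucket.items()}


# Two invented input partitions standing in for parallel Reader streams.
partitions = [
    [{"area_code": "212", "duration_s": 120}, {"area_code": "415", "duration_s": 60}],
    [{"area_code": "212", "duration_s": 60}, {"area_code": "415", "duration_s": 0}],
]

partials = [run_partition(p) for p in partitions]
totals = {}
for bucket in repartition(partials, n_workers=2):
    totals.update(group_final(bucket))  # Writer: collect each worker's output

print(totals)
```

Aggregating before the repartition step means only one subtotal per key per partition crosses between workers, rather than every raw row.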
26. Customer Takeaways – Actionable Insights
FAST
Processing streaming CDR data in seconds
27. Customer Takeaways – Analysis
Deeper Analysis
Visibility at the Area Code and Exchange level
28. Customer Takeaways – Cost Savings
20,000 updates made to routing tables during first week of collecting data
29. Customer Takeaways – Scalability
8.9 Billion rows of data collected during first 6 months
30. Solution Architecture
Components: OSS/BSS, Hadoop, Paraccel Dataflow, Vectorwise (very fast reporting database), Yellowfin BI
Data flow labels: Collection, Extraction, Cleansing, Enrichment, Aggregation, Data Retention, Analysis, Mining
Execution labels: Clustered Execution, Parallel Loading
End Users (Desktop & Mobile Devices)
• Dashboard
• Ad Hoc
• Statistics
• Data Mining
• Analytics
31. Summary – Take Action on Big Data
Accelerating Big Data 2.0
From DATA to VALUE: Connect, Prepare, Optimize, Analyze
Diagram labels: Big Data Storage, Enterprise Applications, DW, Business Intelligence
Advanced technology platform:
Multiple deployment options: On-premise, Cloud, Hybrid, Embedded
Industry leading: Scale, Performance, Complexity, Cost (price/performance), Time to Value