This document provides an overview of the IBM Business Intelligence Pattern with BLU Acceleration: a pre-configured deployment for a predictable, high-performance analytics solution. Through in-memory acceleration technologies such as Cognos Dynamic Cubes and DB2 BLU, it delivers order-of-magnitude improvements in performance, storage savings, and time to value; typical performance improvements range from 8x to 25x over traditional approaches. The pattern offers a simple, streamlined path to fast analytics results.
2. Please note
IBM’s statements regarding its plans, directions, and intent are subject to
change or withdrawal without notice at IBM’s sole discretion.
Information regarding potential future products is intended to outline our general
product direction and it should not be relied on in making a purchasing decision.
The information mentioned regarding potential future products is not a
commitment, promise, or legal obligation to deliver any material, code or
functionality. Information about potential future products may not be
incorporated into any contract. The development, release, and timing of any
future features or functionality described for our products remains at our sole
discretion.
Performance is based on measurements and projections using standard IBM
benchmarks in a controlled environment. The actual throughput or performance
that any user will experience will vary depending upon many factors, including
considerations such as the amount of multiprogramming in the user’s job
stream, the I/O configuration, the storage configuration, and the workload
processed. Therefore, no assurance can be given that an individual user will
achieve results similar to those stated here.
3. Agenda
Market Problem Today
New Markets/Opportunities Possible
What is the “IBM Business Intelligence Pattern with BLU
Acceleration”?
Performance Overview
Architecture
4. Evolving Business Requirements Challenge the Status Quo
[Diagram: converging pressures on the status quo]
• Lead times for hardware & software platforms
• Increasingly independent knowledge workers
• Exploding data volumes, exponential demand
• Integrated systems, self service, big data, business analytics
• Recognizing the power of knowledge
Interactive Exploration: Transform Information to Innovation
5. Interactive Exploration - It's all about getting more data faster!
[Chart: user expectation of interactive response time (Unacceptable, Tolerable, Satisfactory, Good!) against request volume, complexity & concurrency]
System response time is directly correlated with users' propensity to experiment, explore and discover
6. Data Volume & System Complexity Lead to Risk & Unpredictable TCO
• Complex custom infrastructure: unpredictable time to value
• Traditional deployment practices: variable results
• Multiple approaches: multiple iterations to achieve performance
[Diagram: multi-terabyte data volumes and environment complexity - many query strategies that may force content rewrites, DBA database & hardware tuning, and a variety of middleware with independent configurations]
7. In-Memory Acceleration & Patterns of Expertise Provide Agility and Predictability
• Expert integrated systems: predictable time to value
• Pattern-encoded deployment: repeatable results
• Simple, streamlined approach: fast path to performance
[Diagram: simplified in-memory columnar acceleration with Dynamic Cubes; streamlined, fit-for-purpose performance; pattern deployment on expert integrated systems]
8. IBM Business Intelligence Pattern with BLU Acceleration
Pre-configured deployment for predictable, high-performance analytics solution delivery
9. Fast on Fast
Tailored for volume, concurrency and complexity - choose a system that learns, grows and keeps getting faster!
• Layers of in-memory acceleration
  • Results caching - at the speed of memory! More use = more results in-memory (frequent requests)
  • Dynamic Cubes - prime the system for the workloads you can predict (expected requests)
  • Memory-exploiting columnar database - acceleration for every combination & permutation (inevitable requests)
• Evolutionary innovation
  • Parallel vector processing - greater query & user concurrency
  • Data skipping - less I/O
  • Active compression - reduce time spent decompressing data
• Memory-exploiting, not memory-bound!
  • Not all in-memory solutions are created equal
  • Dynamic Cubes and BLU leverage SSD and disk to ensure stable, continuous operation
[Chart: average acceleration of database queries for reporting1 - faster DB query*]
1. Based on internal testing comparing DB2 10.1 traditional row store vs. DB2 10.5 with BLU Acceleration. SQL queries for 20 different reports and dashboards were run in isolation against the
database to measure database response time. Full report generation time would include data transfer and processing by the BI server. Performance gains will vary by workload and system
specifications.
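The "results caching" layer - more use means more results answered from memory - behaves like a memoized report function. The following sketch is for illustration only: the report function, its arguments and the call counter are invented, not Cognos APIs.

```python
# Minimal sketch of a results cache: identical requests are answered
# from memory instead of recomputing. CALLS["db"] stands in for the
# number of expensive trips to the database.
import functools

CALLS = {"db": 0}

@functools.lru_cache(maxsize=1024)
def run_report(report_id, quarter):
    CALLS["db"] += 1          # pretend this is an expensive database query
    return f"results for {report_id}/{quarter}"

run_report("revenue", "Q1")   # first request hits the database
run_report("revenue", "Q1")   # repeat request is served from memory
print(CALLS["db"])            # → 1
```

The more distinct reports users run, the more of the workload ends up served at memory speed, which is the effect the slide describes.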
10. Rich - Pattern-based Deployment for Agility
• Low-touch optimization with instrumented self-tuning
  • Automated query performance tuning: create objects, schedule & load, auto-mapping to models
• Streamlined workflows
  • Built-in data landing zone: import data from anywhere to the in-memory columnar repository
  • Simplified administration: integration of data movement scheduling with Cognos Administration
• Built-in expertise
  • Memory optimization: programmatic allocation of cores and memory
  • Automated management of the data source and Business Intelligence layers
[Diagram: Request, Select, Go]
11. Simple - Economics & Agility
• Pattern-based deployment for agility
  • Complete stack: OS, middleware, database, Business Intelligence
  • Load data and go!
• Purpose-built integration
  • Reduced skill thresholds
  • Automated deployment
  • Pattern-specific product extensions
• Expert Integrated System support
  • Deploy to PureApplication System for fastest time to value
1 person + 1 hour = 1 fully deployed stack
12. Industry-Specific Use Cases

Retail
  Use case: Household and market-basket analysis
  Solution attributes: Exploratory analysis of billions of rows per month, with millions of customers and product SKUs

Insurance
  Use case: Claims analysis
  Solution attributes: In-depth dimensional analysis of millions of customers, policies and itemized claims

Manufacturing & Logistics
  Use case: Parts supply and location identification
  Solution attributes: Millions of parts, thousands of locations, hundreds of thousands of processes

Life Science
  Use case: Large standardized data sets cross-referenced by patients and practitioners
  Solution attributes: Millions of rows of "aggregator" data cross-referenced by attribute sets

Cross-Industry Use Cases

Self-service acceleration
  Use case: Pockets of advanced analysts impacting data warehouse performance
  Solution attributes: Self-contained data acceleration layer; agility of deployment; re-establish connection with single trusted data

Hub & spoke approach to distributed IT or replication hosting
  Use case: Local telecom limitations require replica infrastructure; data privacy requirements necessitate isolated tenants
  Solution attributes: Agility and standardization of deployment; self-contained data acceleration layer

Replacement for aging MOLAP infrastructure
  Solution attributes: Robust OLAP functionality; faster cube load times, larger volumes; synchronized with single trusted data

New deployments
  Use case: Reduce risk and cost of deployment; reduce the skill and experience threshold to adopt BA
  Solution attributes: Prescriptive pattern-based deployment; available in general-purpose and specialized varieties; time to value
13. Cognos Dynamic Cubes: Goals
• Provide a high-performance OLAP solution accessing terabytes of data in the data warehouse
• Provide an aggregate-aware solution
  • Routing to database summary/aggregate tables
  • Routing to in-memory aggregate values
• Provide an aggregate advisor to assist with selection of database/memory aggregates
• Data cached and shared amongst all users
• Provide compelling features
  • Parent/child (recursive) hierarchies
  • Multiple hierarchies per dimension
  • Hidden measures
  • Virtual cubes
  • Relative time
  • Dimensional (member) security
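The aggregate-aware routing described above can be sketched as a small rule: send a query to a pre-built summary table only when that table's grouping dimensions cover the query's, otherwise fall back to the detail fact table. The table names and structures below are hypothetical illustrations, not Cognos internals.

```python
# Illustrative sketch of aggregate-aware query routing: pick the
# smallest covering summary table, else fall back to the fact table.
# AGGREGATES and FACT_TABLE are invented for this example.

AGGREGATES = [
    {"table": "agg_sales_by_month_region", "dims": {"month", "region"}},
    {"table": "agg_sales_by_month", "dims": {"month"}},
]
FACT_TABLE = "fact_sales"

def route(query_dims):
    """Pick the smallest aggregate whose dimensions cover the query."""
    candidates = [a for a in AGGREGATES if set(query_dims) <= a["dims"]]
    if candidates:
        # Fewer dimensions usually means a smaller, faster table.
        return min(candidates, key=lambda a: len(a["dims"]))["table"]
    return FACT_TABLE  # no aggregate covers the query

print(route({"month"}))            # agg_sales_by_month
print(route({"month", "region"}))  # agg_sales_by_month_region
print(route({"customer"}))         # fact_sales
```

The aggregate advisor mentioned above plays the complementary role: it suggests which summary tables (the AGGREGATES list here) are worth building for a given workload.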
14. Initial Query
[Diagram: DQM query processor comprising a result set cache, MDX engine, expression cache, security layer, dynamic cube with data cache and member cache, and a DQM aggregate cache]
On the initial query, the engine:
• Searches the aggregate cache for an exact match
• Issues SQL queries to obtain member information
• Issues SQL queries to obtain aggregate data
• Issues SQL queries to obtain fact and summary data
15. Subsequent Query
[Diagram: the same DQM components as the initial query]
On subsequent queries, the engine:
• Searches the aggregate cache for an exact match
• Issues SQL queries to obtain fact and summary data
16. What is BLU Acceleration?
• New innovative technology for analytic queries
  • Columnar storage
  • New run-time engine with vector (aka SIMD) processing, deep multi-core optimizations and cache-aware memory management - it can run more work at the same time
  • "Active compression" - unique encoding for further storage reduction beyond DB2 10 levels, plus run-time processing without decompression - analytic queries with filters and calculations don't wait for data to decompress
• "Revolution through Evolution"
  • Built directly into the DB2 kernel
  • BLU tables can coexist with traditional row tables in the same schema, tablespaces and bufferpools
  • Query any combination of BLU and row data
• Memory-optimized (not "in-memory") - this is really important: the system keeps running even if it fills up memory, whereas other solutions in the market are "memory-bound"
• Value: order-of-magnitude benefits in performance, storage savings and time to value
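Two of the ideas above - columnar pages carrying min/max synopses ("data skipping") and evaluating a predicate over a whole column of values at once - can be sketched as follows. The page size, data and synopsis format are invented for illustration; real BLU synopses are maintained internally by DB2.

```python
# Sketch of columnar data skipping: each page records a min/max
# synopsis, and a scan skips pages the synopsis rules out entirely.

PAGE_SIZE = 4

def build_pages(column):
    """Split a column into pages and record a min/max synopsis per page."""
    pages = []
    for i in range(0, len(column), PAGE_SIZE):
        chunk = column[i:i + PAGE_SIZE]
        pages.append({"min": min(chunk), "max": max(chunk), "values": chunk})
    return pages

def scan_greater_than(pages, threshold):
    """Return matching values, skipping pages whose synopsis rules them out."""
    matches, pages_read = [], 0
    for page in pages:
        if page["max"] <= threshold:
            continue                      # whole page skipped: no I/O
        pages_read += 1
        matches.extend(v for v in page["values"] if v > threshold)
    return matches, pages_read

pages = build_pages([1, 2, 3, 4, 10, 11, 12, 13, 5, 6, 7, 8])
matches, pages_read = scan_greater_than(pages, 9)
print(matches, pages_read)   # [10, 11, 12, 13] - only 1 of 3 pages read
```

The "less I/O" claim on the earlier slide is exactly this effect: pages whose value range cannot satisfy the filter are never read at all.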
17. How fast is it? ... Current DB2 10.5 Results

Customer Workload        Speedup over DB2 10.1
Analytic ISV             37.4x
Large European Bank      21.8x
BI Vendor (Simple)       124x
BI Vendor (Complex)      6.1x
Manufacturer             9.2x
Investment Bank          36.9x

8x-25x improvement is common
“It was amazing to see the faster query times compared to the performance
results with our row-organized tables. The performance of four of our
queries improved by over 100-fold! The best outcome was a query that
finished 137x faster by using BLU Acceleration.”
- Kent Collins, Database Solutions Architect, BNSF Railway
1. Based on internal testing comparing DB2 10.1 traditional row store vs. DB2 10.5 with BLU Acceleration. SQL queries for 20 different reports and dashboards were run in isolation against the
database to measure database response time. Full report generation time would include data transfer and processing by the BI server. Performance gains will vary by workload and system
specifications.
18. Significant Storage Savings
• ~2x-3x storage reduction vs. DB2 10.1 adaptive compression (comparing all objects: tables, indexes, etc.)
• New advanced compression techniques
• Fewer storage objects required
19. DB2 10.5 & Cognos BI Dynamic Cubes
[Diagram: a report is served through the result set cache, member cache, query data cache and aggregate cache, backed by the database]
Cube start-up
• Member cache filled with queries to data warehouse dimension tables
• Aggregate cache filled with queries to the data warehouse (or database aggregates, if defined)
Report processing
• Waterfall lookup for data, in descending order, until all data is provided:
  1. Result set cache
  2. Query data cache
  3. Aggregate cache
  4. Database aggregate
  5. Data warehouse
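The waterfall lookup above amounts to consulting each layer in order and stopping at the first hit. In this sketch the layer contents and keys are invented for illustration; the real layers are populated by the cube engine.

```python
# Sketch of the five-layer waterfall lookup: walk the layers top-down
# and answer from the first one that holds the requested data.

LAYERS = [
    ("result set cache", {}),
    ("query data cache", {}),
    ("aggregate cache", {"revenue/Q1": 1200}),
    ("database aggregate", {"revenue/Q1": 1200, "revenue/Q2": 900}),
    ("data warehouse", {"revenue/Q1": 1200, "revenue/Q2": 900,
                        "cost/Q1": 700}),
]

def lookup(key):
    """Walk the layers top-down; return the value and the layer that hit."""
    for name, store in LAYERS:
        if key in store:
            return store[key], name
    raise KeyError(key)

print(lookup("revenue/Q1"))  # served from the aggregate cache
print(lookup("cost/Q1"))     # falls all the way to the data warehouse
```

Only requests that miss every cache layer reach the data warehouse, which is why the upper layers dominate response time once the cube is warm.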
20. Cognos BI 10.2 Dynamic Cubes Ad-hoc Reports with DB2 10.5 BLU Acceleration
Result: report workload elapsed time was 24x faster on DB2 10.5 with BLU Acceleration than on DB2 10.1.
Test configuration:
• Server: POWER7+ 780, 64 cores @ 4.4GHz, 1TB RAM
• Cognos/DB2 client LPAR: 32 cores, 512GB RAM
• DB2 server LPAR: 32 cores, 512GB RAM
• Storage: V7000 with 1.6TB SSD and 4TB HDD
• Operating system: AIX 7.1 TL2 SP2
• DB2 versions: DB2 10.1 FP2 Enterprise Server Edition; DB2 10.5 Advanced Enterprise Server Edition
• Cognos Business Intelligence 10.2.1
"Our BI solution at Taikang Life is built on a Cognos/DB2 solution. In order to ensure reports run fast and meet our service level commitments to the business, we have to perform pre-aggregation each night in the database. While our end users experience fast report times, this batch work has become a challenge because of limited and shrinking batch windows and an ever-increasing database size, because we want to analyze more data. With BLU Acceleration, we've been able to reduce the time spent on pre-aggregation by 30x - from one hour to two minutes! BLU Acceleration is truly amazing." - Yong Zhou, BI Manager
21. DB2 with BLU Acceleration: Summary
Breakthrough technology
• Combines and extends leading technologies
• Over 25 patents filed and pending
• Leverages years of IBM R&D spanning 10 laboratories in 7 countries worldwide
Typical experience
• 8x-25x performance gains
• 10x storage savings vs. uncompressed data with indexes
• Simple to implement and use
Super analytics, super easy: order-of-magnitude improvements in consumability, speed and storage savings
23. Lifecycle of Business Intelligence Pattern with BLU Acceleration
• Fully functioning self-service environments can be deployed in minutes
• Exploration and discovery are faster with layers of acceleration
• Closed-loop automation creates and populates aggregates
• Closed-loop automation maps aggregates to the model instantly
• A self-contained acceleration layer minimizes impact on the warehouse and provides a landing zone for operational data
24. IBM Business Intelligence Pattern with BLU Acceleration - Architecture
[Diagram: data sources (Source 1 ... Source N) feed data loading tools into the pattern components. A Data Accelerator runs DB2 BLU with a metadata store (~500GB RAM, ~30 cores); an Analytics Engine runs Cognos BI with LDAP and a content store (~200GB RAM, ~30 cores). Users reach the stack over the network through an HTTP server (ELB service); administrators manage it through the PureApp console.]
25. Data Flows Between All Components (incl. ETL)
[Diagram: ETL design for the core star schema drives ETL jobs that load the data warehouse; ETL/DDL scripts build aggregate tables and in-database update jobs. Cube publish pushes cubes, virtual cubes and in-memory aggregates into the in-memory layer; the design and aggregate advisor tools update the model for aggregates. In-memory tools support report & act. Flows are distinguished as design, data write and data read.]
26. Deployment Characteristics
• Space and CPU are both highly dependent on two main factors:
  • Report & model complexity
  • Data volumes
• Both are hard to model ahead of time, so there are no hard and fast rules. However, based on real-world experiments, we suggest the following allocation sizes as a starting point on an IBM PureApplication System box.

Deployment       Cores   RAM      Uncompressed DB size
Small (eg: dev)  12      100GB    200GB
Medium           32      512GB    1TB
Large            64      1024GB   2TB

*Examples provided for education only, in the context of IBM PureApplication System Power Mini 32 and 64. The pattern is capable of leveraging more RAM.
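The sizing table above can be treated as a simple lookup: pick the smallest suggested deployment whose uncompressed-database capacity covers your data. The thresholds come straight from the table; the helper function itself is illustrative, not an IBM sizing tool.

```python
# Sizing lookup based on the slide's table: (name, cores, RAM in GB,
# maximum uncompressed database size in GB).

SIZINGS = [
    ("Small",  12,  100,  200),
    ("Medium", 32,  512, 1000),
    ("Large",  64, 1024, 2000),
]

def suggest(db_size_gb):
    """Return the first (smallest) sizing whose capacity covers the DB."""
    for name, cores, ram_gb, capacity_gb in SIZINGS:
        if db_size_gb <= capacity_gb:
            return name, cores, ram_gb
    raise ValueError("larger than the documented configurations")

print(suggest(150))   # Small: 12 cores, 100GB RAM
print(suggest(800))   # Medium: 32 cores, 512GB RAM
```

As the slide notes, report and model complexity also matter, so these starting points should be validated against the actual workload.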
27. Other Consolidation Scenarios
[Diagram: an IBM PureApplication System / pattern-enabled environment hosts other patterns (app servers, other middleware hosting, real-time analytics) alongside IBM BI with BLU Acceleration - Cognos BI for reporting/analysis and dashboards, plus a DB2 BLU data warehouse - with export-and-explore flows between them]
31. Communities
• On-line communities, User Groups, Technical Forums, Blogs, Social
networks, and more
o Find the community that interests you …
• Information Management bit.ly/InfoMgmtCommunity
• Business Analytics bit.ly/AnalyticsCommunity
• Enterprise Content Management bit.ly/ECMCommunity
• IBM Champions
o Recognizing individuals who have made the most outstanding contributions to
Information Management, Business Analytics, and Enterprise Content
Management communities
•
ibm.com/champion
32. Related IOD Sessions
Wed. 2-5: Modeling, Deploying and Optimizing New Features of IBM Cognos Dynamic Cubes v10.2.1 (Session Number 1872)
Wed. 3-5:45: IBM Cognos Dynamic Cubes Super Session (Session Number 1963)
33. Thank You
Your feedback is important!
• Access the Conference Agenda Builder to
complete your session surveys
o Any web or mobile browser at
http://iod13surveys.com/surveys.html
o Any Agenda Builder kiosk onsite