Building Enterprise OLAP on Hadoop for the Financial Services Industry, followed by a use case from CPIC (a Fortune 500 insurance company) on replacing legacy IBM Cognos OLAP with the Kyligence platform
1. Building Enterprise OLAP on Hadoop for Financial Services Industry
Luke Han
luke@kyligence.io | @lukehq
Co-founder & CEO of Kyligence
Creator & VP of Apache Kylin
Microsoft Regional Director & MVP
2. About Kyligence
• Formed by creators of Apache Kylin in 2016
• Offers Enterprise and Cloud versions of Apache Kylin
• Funding from Redpoint, Cisco, CBC and Shunwei
• Member of Microsoft Accelerator Shanghai 2017
• Dual HQ in Silicon Valley & Shanghai, China
Kyligence booth: #855
3. Transition to Big Data…
What about your traditional data warehouse?
What about your existing OLAP/BI applications?
4. Data Warehouse/OLAP in the Financial Services Industry
o The industry that relies most heavily on DW/OLAP applications
o Thousands of applications built on top of the EDW
o Experienced analysts with decade-long expertise
…in data…but not in technologies
6. But you are asked to…
o Migrate or rebuild existing OLAP/BI apps on Big Data
o Deliver better performance…just because you have Big Data now
o Train yourself to learn MR/Spark/ML…and AI
8. Apache Kylin: Bring OLAP back to Big Data
o MOLAP on Hadoop
o Simplified Data Modeling
o Optimized for aggregation queries
o ANSI SQL
o Native on Hadoop
o On-Prem & In the Cloud
[Diagram: architecture layers labeled Presentation, Visualization, OLAP, Data Mart, Data Lake, Data Source; SQL engines Hive, Impala, Spark SQL, Drill; execution engines MapReduce … Spark]
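To make "MOLAP" and "optimized for aggregation queries" concrete, here is a minimal Python sketch of the general pre-aggregation idea: sum a measure for every combination of dimensions ahead of time, so a GROUP BY becomes a lookup. The fact rows, dimension names, and the functions build_cube and query are all hypothetical, chosen for illustration. This is not Kylin's actual cube build, which runs on the cluster and prunes combinations.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical fact rows: (region, product, year, premium). Illustrative data only.
FACTS = [
    ("East", "Auto", 2016, 100.0),
    ("East", "Life", 2016, 250.0),
    ("West", "Auto", 2016, 80.0),
    ("East", "Auto", 2017, 120.0),
    ("West", "Life", 2017, 300.0),
]
DIMENSIONS = ("region", "product", "year")


def build_cube(facts, dimensions):
    """Precompute SUM(measure) for every subset of dimensions (one cuboid each)."""
    cube = {}
    positions = range(len(dimensions))
    for size in range(len(dimensions) + 1):
        for combo in combinations(positions, size):
            cuboid = defaultdict(float)
            for row in facts:
                key = tuple(row[i] for i in combo)
                cuboid[key] += row[-1]  # the measure is the last column
            cube[tuple(dimensions[i] for i in combo)] = dict(cuboid)
    return cube


def query(cube, group_by):
    """Answer a GROUP BY by looking up the matching precomputed cuboid."""
    return cube[tuple(d for d in DIMENSIONS if d in group_by)]


cube = build_cube(FACTS, DIMENSIONS)
print(len(cube))                  # 2^3 = 8 cuboids for 3 dimensions
print(query(cube, {"region"}))    # totals per region
print(query(cube, set()))         # grand total
```

The trade-off is build time and storage (the number of cuboids grows as 2^n with n dimensions) in exchange for query latency that no longer depends on the number of raw rows.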
9. Kylin vs Hive: Star-Schema Benchmark
Apache Kylin (KAP) vs. Apache Hive: response time in seconds by data volume (lower is better)
Scale Factor 2: Apache Kylin 0.17, Apache Hive 142.42
Scale Factor 10: Apache Kylin 0.17, Apache Hive 161.66
Scale Factor 20: Apache Kylin 0.18, Apache Hive 189.17
* Based on 4 Nodes, 16 Core CPU, 96 GB Memory per node
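The gap in the benchmark is easier to read as a ratio. The short Python sketch below assumes the chart values pair up as 0.17, 0.17, 0.18 seconds for Kylin and 142.42, 161.66, 189.17 seconds for Hive at scale factors 2, 10, and 20. The helper name speedups is hypothetical.

```python
# Response times in seconds, transcribed from the benchmark slide
# (pairing of values to scale factors is an assumption from the chart order).
KYLIN = {2: 0.17, 10: 0.17, 20: 0.18}
HIVE = {2: 142.42, 10: 161.66, 20: 189.17}


def speedups(fast, slow):
    """Return slow/fast response-time ratios keyed by scale factor."""
    return {sf: slow[sf] / fast[sf] for sf in sorted(fast)}


ratios = speedups(KYLIN, HIVE)
for sf, ratio in ratios.items():
    print(f"scale factor {sf}: Hive/Kylin = {ratio:.0f}x")
```

Under that assumption the ratio is in the high hundreds at scale factor 2 and grows with data volume, because Hive's response time rises while Kylin's stays nearly flat.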
10. Global Users
FSI
• ABC
• CCB
• CMB
• CPIC
• Citic Bank
• China UnionPay
• HUATAI Securities
• GUOTAI Securities
• Lufax
Telecom
• China Mobile
• China Telecom
• China Unicom
• AT&T
Internet
• eBay
• Yahoo! Japan
• Baidu
• Meituan
• NetEase
• Expedia
• JD.com
• VIP.com
• 360
• Toutiao
Others
• MachineZone
• Glispa
• Inovex
• Adobe
• iFLYTEK
500+ use cases in production globally
Manufacturing
• SAIC
• HUAWEI
• Lenovo
• OPPO
• XIAOMI
• VIVO
Data collected from public information and the Kylin community
15. TPC-DS
[Chart: KAP: TPC-DS, per-query results for queries 1 to 99, vertical axis from 0 to 250]
• Hive: 33 queries unsupported or timed out
• KAP: all 99 queries supported
• Query routing between SQL-on-Hadoop and Apache Kylin
23. CPIC: China Pacific Insurance (Group) Co., LTD
• Global Fortune 500 insurance company
• One of the top 2 insurance companies in China
• $40+ billion revenue
• 8+ million customers
• 97,000+ employees
24. Challenges
• Legacy IBM Cognos + DB2 solution can’t support Big Data scenarios
• Long wait times (minutes to hours for reporting)
• Low concurrency (100,000+ employees!)
• High cost
25. Journey of Kyligence Analytics Platform
• No changes on Hadoop side
• No additional engineers required
• Most of the work done by analysts
2016.12 ~ 2017.01
KAP POC: Performance Testing
• Query Latency
• Concurrency
KAP POC: Compatibility
• Cognos Connection
• Cognos Syntax
2017.01 ~ 2017.03
Development
• Fixed Reports
• Flexible Reports
2017.03 ~ 2017.05
Go live
• All dataset aggregation and testing
• Fixed Reports released
2017.05 ~ 2017.06
26. KAP + Cognos: Deployment
[Diagram: three tiers labeled Reporting & Dashboard, OLAP & Data Mart, Big Data Platform; components labeled Dynamic Report, JDBC, Fixed Report, ODBC, KAP Query Server]
27. Benefits after Adopting Kyligence
• One-stop BI platform that generates complex reports
• Over 90% of queries return within 3 seconds (including high-dimensional queries)
• Seamless integration with IBM Cognos, no changes on the front end
• 2 KAP cubes replaced 2000+ IBM Cognos cubes
• Cost reduced significantly by adopting open source technology
28. Customer Quote
“Kyligence enables us to find valuable insights faster
from every insurance policy within seconds. Kyligence’s
platform allows us to achieve more with less. Our lean
management system has improved significantly”
-- Minchen Wu, Deputy GM of IT, CPIC
29. Fusion Big Data Platform
• Open: Connect to Teradata/Greenplum and IBM Cognos/Saiku…
• Flexible: Self-service for end users
• Efficient: Speeds up the PC and mobile analytics experience
China Construction Bank (CCB):
2nd Largest Bank in the World
“Apache Kylin is the last piece of the puzzle for
serving data asset management
between the legacy DW and new Big Data.”
-- Zhi Zhu, Vice Senior Manager of Tech Dept, CCB
30. Enterprise OLAP on Hadoop
Speed Up Mission Critical Analytics
Booth #855
luke@kyligence.io
http://kyligence.io