Kyligence Leverages Alluxio to Accelerate OLAP in the Cloud
Alluxio, Inc.
Beijing Meetup - Jan 2018
1.
Using Alluxio to Accelerate OLAP Analytics in the Cloud
Shaofeng Shi, Senior Architect, Kyligence
shaofeng.shi@kyligence.io
2.
Confidential, all rights reserved ©Kyligence Inc. http://kyligence.io
Agenda
• Apache Kylin and Kyligence Inc.
• Kyligence Analytics Platform
• KAP in the Cloud
• Alluxio + KAP
• Summary
3.
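Apache Kylin: a world-leading big data analytics technology (OLAP-on-Hadoop)
• Top-level project at the world's largest open-source software foundation
  • Apache Kylin is the first Apache top-level open-source project from China; its core developers and contributors are all in China
• Industry recognition
  • Won InfoWorld's "Best Open Source Big Data Tools" award two years in a row, alongside Spark and TensorFlow
• User recognition
  • More than 500 leading enterprises worldwide use Kylin as their big data analytics platform
• Ecosystem and community
  • An active community with many users and developers, and a broad network of open-source and commercial partners
• Technical strengths
  • Precomputation, parallel computing, columnar storage, and other optimizations deliver a real-time analytics platform with sub-second response at high concurrency on massive data
"Working with the Apache Kylin team to take Kylin through incubation to a top-level project has been very exciting for me. Kylin is certainly exciting technically, but it is just as exciting that Kylin represents the growing participation of Asian countries, especially China, in the open-source community." (translated)
Ted Dunning, Vice President, Apache Incubator
[Chart: SQL query latency for a top-users query at a financial institution, 690 million rows covering 15 years of data: SparkSQL 44, Kylin 0.32]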
4.
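Kyligence = Kylin + Intelligence
[Diagram labels: building a leading global open-source community; enterprise products; professional services; management and automation; cloud computing; industry solutions]
• Founded by the original Apache Kylin team
  • Holds 50% of the Apache Kylin PMC
  • Contributed more than 90% of Kylin's source code
• Enterprise products built around Kylin
  • KAP: enterprise OLAP platform
  • Kyligence Cloud: cloud computing + big data + intelligent operations
• Full-range professional services from the original vendor
  • Product support and certification training
  • Platform implementation and architecture consulting
  • Silicon Valley, Shanghai, and global service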
5.
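About us
• 2014.11: Apache Kylin joins the Apache Incubator and is officially open-sourced
• 2015.11: Graduates to an Apache top-level project
• 2016.3: Kyligence is founded and receives several million in angel investment from Redpoint Ventures
• 2016.8: Releases Kyligence Analytics Platform, an enterprise-grade intelligent big data solution
• 2016.9: Wins InfoWorld's best open-source big data tools award for the second time
• 2017.4: Closes Series A funding (USD 8 million), led by CBC Capital and Shunwei Capital, with Redpoint China participating
• 2017.5: Kyligence US subsidiary is established
• 2017.8: Kyligence becomes an AWS Technical Partner
• 2017.9: Kyligence Robot is released, supporting online intelligent optimization of Apache Kylin
• 2017.12: Kyligence Cloud is released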
6.
Kyligence Analytics Platform
7.
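Kyligence Analytics Platform: Building a Bridge Between Users and Big Data
• High performance: sub-second query latency meets the timeliness requirements of interactive analysis; highly optimized for mission-critical scenarios
• High concurrency: scales linearly to meet the surging demand for data analysis in the big data era; supports internet-scale online services
• Easy to use: standard SQL access lowers the technical barrier and hides complex technical interfaces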
8.
Kyligence Analytics Platform (KAP) Architecture
• Online service: Kyligence RoBot, an online self-service platform that gives DevOps system monitoring, cube optimization, and SQL tuning
• KAP (commercial): KyAnalyzer (Agile BI), KyStudio (Data Model Designer), KyManager (Administrator Tool), KyStorage (Columnar Storage), Security (Cell Level ACL)
• Kylin (open source): Apache Kylin (Open Source OLAP on Hadoop)
• On-demand deployment: On-Premises, On-Hybrid, On-Cloud
9.
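Apache Kylin to KAP
• Emerging Hadoop technology: distributed computing framework; scale-out architecture; poor SQL query performance
• Traditional DW products: classic OLAP theory; scale-up architecture; limited cube capacity, performance, and concurrency
• Apache Kylin: OLAP precomputation + Hadoop computing framework
• KAP: Kylin + flexible queries + detail-level queries + intelligent optimization + enterprise-grade security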
10.
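Trading space for time: a brief introduction to how cubes work
• Data cube: a multidimensional analysis technique. Results are precomputed and stored in the space mapped by combinations of dimension values; at query time, the answer is obtained quickly by further processing the cube.
• Dimensional model: a data modeling approach used in building data warehouses, in which data is modeled as fact tables and dimension tables; the star schema is the most widely used form.
• Aggregation, classification, and sorting are done in advance.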
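To make the space-for-time trade concrete, here is a minimal Python sketch of cube precomputation. It is a toy illustration, not Kylin's or KAP's implementation: the function names, the sample rows, and the choice of SUM as the only measure are all assumptions made for this example.

```python
from collections import defaultdict
from itertools import combinations


def build_cube(rows, dimensions, measure):
    """Precompute SUM(measure) for every subset of dimensions (every cuboid)."""
    cube = {}
    for size in range(len(dimensions) + 1):
        for dims in combinations(dimensions, size):
            cuboid = defaultdict(int)
            for row in rows:
                # The key is the tuple of dimension values for this cuboid.
                key = tuple(row[d] for d in dims)
                cuboid[key] += row[measure]
            cube[dims] = dict(cuboid)
    return cube


def query(cube, group_by):
    """Answer a GROUP BY with a dictionary lookup instead of a table scan.

    group_by must list dimensions in the same order used to build the cube.
    """
    return cube[tuple(group_by)]


# Hypothetical fact rows, used only for illustration.
rows = [
    {"year": 2016, "region": "CN", "sales": 10},
    {"year": 2016, "region": "US", "sales": 20},
    {"year": 2017, "region": "CN", "sales": 30},
]
cube = build_cube(rows, ("year", "region"), "sales")
by_region = query(cube, ["region"])
```

With n dimensions the full cube holds 2^n cuboids, which is the storage cost paid up front so that each query becomes a lookup.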
11.
Cube precomputation is the core technical idea behind KAP
12.
Leveraging Hadoop's powerful parallel computing
[Diagram: KyAnalyzer, BI tools, web apps… query Kyligence Analytics Platform through ANSI SQL; the platform sits on KyStorage and MapReduce/Spark/Streaming…]
13.
Precomputation keeps query performance stable
14.
KAP: ultra-high performance, ultra-high concurrency
On a standard benchmark dataset, KAP delivers sub-second query response, more than 100x faster than Hive
15.
Typical Kylin/KAP users worldwide
• Internet: eBay, Yahoo! Japan, Baidu Maps, Meituan-Dianping, NetEase, Expedia, JD.com, Vipshop, 360, Toutiao
• Finance: China Pacific Insurance, Citibank, UnionPay, Huatai Securities, Guotai Securities, Lufax, JPMorgan
• Telecom: China Mobile, China Telecom, China Unicom, AT&T
• Manufacturing: SAIC Motor, Huawei, Lenovo, OPPO, Xiaomi, VIVO
• Other: MachineZone, Inovex, Glispa, Adobe, iFLYTEK
Data from public sources and the Kylin community
16.
Kyligence in the Cloud
17.
Kyligence is a partner of Azure and AWS
18.
KAP has landed on Azure
KAP is onboarded on Azure global and Mooncake (Azure China)
19.
Kyligence Cloud: one-click PaaS deployment with multi-cloud support
20.
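Kyligence Cloud: solving the hard problems of big data in the cloud
• One-click deployment: KAP and Hadoop are deployed within minutes
• Elastic scaling: compute resources scale dynamically with actual usage, for high scalability
• Cloud native: S3 as storage, auto scaling, deployment from CloudFormation templates
• Cost savings: read/write separation and on-demand start/stop effectively reduce operating costs
• Seamless BI integration: from Hadoop to KAP to BI tools, an end-to-end solution on the AWS cloud
• Easy operations: a fully managed site simplifies operations so you can focus on your business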
21.
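Challenges for Hadoop & KAP in the cloud
• Local disks become volatile; HDFS is no longer the reliable storage for Hadoop/Spark that it is on premises
  • When a VM is deleted, its data is erased with it, leaving HDFS with missing blocks
  • Local disks are expensive
• An architecture that separates compute from storage
  • AWS S3, Azure Blob Store, and similar services are more reliable and cheaper, and suit big data workloads
  • Separating compute from storage makes the architecture truly scalable; AWS EMR and Azure HDInsight support S3 and WASB as Hadoop file storage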
22.
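Challenges for Hadoop & KAP in the cloud (cont.)
• S3 and Blob Store differ greatly from HDFS
  • Performance depends heavily on network bandwidth
  • Eventual consistency
  • Metadata operations are slow
• KAP's approach in the cloud
  • Interim solution: use HDFS for computation and S3 for backup
  • Better solution: a transparent, fast caching layer on top of S3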
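The idea of a transparent caching layer on top of a slow object store can be sketched as a read-through cache. The Python below is a toy model under stated assumptions, not Alluxio's design: the class name, the LRU eviction policy, and the in-memory dictionary standing in for S3 are all hypothetical.

```python
from collections import OrderedDict


class ReadThroughCache:
    """Serve reads from memory when possible; fall back to the slow store."""

    def __init__(self, backing_read, capacity=2):
        self._backing_read = backing_read  # callable: path -> bytes
        self._capacity = capacity          # max number of cached objects
        self._cache = OrderedDict()
        self.hits = 0
        self.misses = 0

    def read(self, path):
        if path in self._cache:
            # Cache hit: no network round trip; mark as most recently used.
            self._cache.move_to_end(path)
            self.hits += 1
            return self._cache[path]
        # Cache miss: fetch from the backing store and keep a copy.
        data = self._backing_read(path)
        self.misses += 1
        self._cache[path] = data
        if len(self._cache) > self._capacity:
            self._cache.popitem(last=False)  # evict least recently used
        return data


# A dictionary stands in for the remote object store in this sketch.
fake_s3 = {"/kylin/a": b"A", "/kylin/b": b"B", "/kylin/c": b"C"}
cache = ReadThroughCache(fake_s3.__getitem__, capacity=2)
first = cache.read("/kylin/a")   # miss, fetched from the store
second = cache.read("/kylin/a")  # hit, served from memory
```

Callers use the same `read(path)` call whether or not the data is cached, which is what "transparent" means here.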
23.
KAP + Alluxio
24.
Alluxio
25.
Highlights of Alluxio
• Memory-speed virtual distributed storage system
• Spark/MapReduce can run on Alluxio just as on any other file system
• Supports most cloud storage services, such as S3, GCS, and WASB
26.
KAP + Alluxio architecture
• Alluxio mounts the S3 bucket as its under file system
• KAP uses Alluxio as its file system in place of S3
• Transparent to applications, with almost no code changes
27.
Deploying Alluxio with EMR
• The Alluxio master runs on the EMR master node and Alluxio workers run on the core nodes; installation is done through a bootstrap action
• https://github.com/shaofengshi/emr-bootstrap-alluxio
28.
Configuring KAP to use Alluxio
• Configure Alluxio
  • Copy alluxio-core-client-runtime-<version>.jar to KAP's spark directory
  • Copy alluxio-site.properties to spark/conf
• Use S3 as the file system for writes
  • kylin.env.hdfs-working-dir=s3://mybucket/kylin
• Use Alluxio as the file system for reads
  • kylin.storage.columnar.file-system=alluxio://<master-node>:19998/
• The read/write separation switch does not need to be turned on
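The two properties above can be generated per environment rather than edited by hand. The following Python sketch only renders the two lines shown on the slide; the helper names and the `working_subdir` parameter are assumptions for illustration, and the bucket and host values are placeholders.

```python
def kap_alluxio_properties(bucket, master_host, port=19998, working_subdir="kylin"):
    """Return the two KAP properties from the slide: write to S3, read via Alluxio."""
    return {
        # Writes (cube build output) go directly to S3.
        "kylin.env.hdfs-working-dir": f"s3://{bucket}/{working_subdir}",
        # Reads (queries) go through Alluxio, which caches the S3 data.
        "kylin.storage.columnar.file-system": f"alluxio://{master_host}:{port}/",
    }


def render_properties(props):
    """Render a dict as key=value lines, as in a .properties file."""
    return "\n".join(f"{key}={value}" for key, value in props.items())


# Placeholder values; substitute your own bucket and Alluxio master host.
text = render_properties(kap_alluxio_properties("mybucket", "master-node"))
```

Where the rendered lines are placed (for example, which properties file KAP reads) depends on the deployment and is not covered by this sketch.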
29.
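Problems encountered using Alluxio with KAP
• Performance dropped instead of improving
  • With Alluxio deployed on a separate cluster, queries were actually slower
  • Fix: deploy Alluxio on the same cluster as Spark
• New files not found in Alluxio
  • After a new file is written to S3, it cannot be found through Alluxio
  • Fix: run a recursive ls on the parent directory to trigger a metadata sync between Alluxio and S3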
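The recursive-ls workaround could be scripted roughly as follows. This is a sketch, not the authors' tooling: it assumes the `alluxio` command-line tool is on the PATH and that `alluxio fs ls -R <path>` is the recursive listing command in the installed version (check your version's CLI reference). The function names and the example path are hypothetical.

```python
import posixpath
import subprocess


def parent_dir(path):
    """Return the parent directory of an Alluxio path ('/' for top-level entries)."""
    return posixpath.dirname(path.rstrip("/")) or "/"


def build_ls_command(alluxio_path, alluxio_bin="alluxio"):
    """Build the recursive listing command for an absolute Alluxio path."""
    if not alluxio_path.startswith("/"):
        raise ValueError("expected an absolute Alluxio path")
    return [alluxio_bin, "fs", "ls", "-R", alluxio_path]


def refresh_parent(new_file_path, run=subprocess.run):
    """List the parent of a newly written file so Alluxio reloads its metadata."""
    cmd = build_ls_command(parent_dir(new_file_path))
    run(cmd, check=True)  # raises CalledProcessError if the listing fails
    return cmd


# Example: the command that would be run after a new segment file lands in S3.
cmd = build_ls_command(parent_dir("/kylin/cube/seg1/part-0"))
```

Passing `run` as a parameter keeps the function testable without a live Alluxio cluster.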
30.
Problems encountered using Alluxio with KAP (cont.)
• Little Azure-related documentation
  • Many configuration and jar-conflict issues
  • Fix: use a newer version of the Azure Storage Java lib, and use an HDInsight script action to install and uninstall automatically
  • Automated install script: https://github.com/shaofengshi/hdinsight-scriptaction-alluxio
31.
SSB Benchmark – S3 vs Alluxio
• https://github.com/Kyligence/ssb-kylin
• Raw data: 91 million rows; cube size: 20 GB
32.
SSB Benchmark – S3 vs Alluxio
• On average, KAP query latency on Alluxio is reduced to ¼ of the latency on S3
33.
User profile – WASB vs local HDFS vs Alluxio
• User behavior data, 200 million rows
• Cube size: 15 GB
34.
User profile – WASB vs local HDFS vs Alluxio
• Alluxio delivers performance close to local HDFS, which is 3 to 4x faster than WASB
35.
Summary
36.
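Summary
• Conclusion
  • Alluxio helps KAP transparently accelerate OLAP queries in the cloud, with performance close to that of local data
• Next steps
  • Alluxio provides a unified data namespace, helping KAP connect to and manage different data tiers
  • Use tiered storage to cache larger volumes of data
  • Analyze Alluxio data usage to measure and optimize KAP's storage usage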
37.
Try Kyligence Cloud free for 90 days
• Kyligence Cloud is a managed Apache Kylin service that offers elastic enterprise OLAP on Hadoop in the cloud.
• Supports Azure and AWS in global and China regions.
• Console: https://cloud.kyligence.io
38.
THANK YOU
Website: http://kyligence.io
Email: info@kyligence.io
Twitter: @Kyligence